I have a string that consists of repeated words. I want to replace a substring 'OK' located between 'L3' and 'L4'. Below you can find my code:
$search = "/(?<=L3).*(OK).*(?=L4)/";
$replace = "REPLACEMENT";
$subject = "'L1' => ('Vanessa', 'Prague', 'OK'), 'L2' => ('Alex', 'Paris', 'OK'), 'L3' => ('Paul', 'Paris', 'OK'), 'L4' => ('John', 'Madrid', 'OK')";
$str = preg_replace($search, $replace, $subject);
If I use that pattern with preg_match, it finds the correct substring (the third 'OK'). However, when I apply that pattern with preg_replace, it replaces the substring that matches the full pattern instead of the parenthesized subpattern.
So could you please advise what I should change in my code? I know that there are plenty of similar questions about regex, but as far as I understand my pattern is correct and I'm only confused by the preg_replace function.
It is true that your regex matches at a place in the string that is preceded with L3, then matches 0+ chars other than linebreak symbols up to the last OK substring that is still followed by L4, and then matches any 0+ chars up to the place followed with L4. However, preg_replace always replaces the whole match, not just a capturing group. See your regex demo.
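To see the difference between the whole match and the captured group, here is a minimal sketch that runs the original pattern from the question through preg_match and preg_replace (the variable names $m and $replaced are only illustrative):

```php
<?php
$subject = "'L1' => ('Vanessa', 'Prague', 'OK'), 'L2' => ('Alex', 'Paris', 'OK'), 'L3' => ('Paul', 'Paris', 'OK'), 'L4' => ('John', 'Madrid', 'OK')";
$search  = "/(?<=L3).*(OK).*(?=L4)/";

// $m[0] holds the whole match, $m[1] holds Group 1 only
preg_match($search, $subject, $m);
var_dump($m[0]); // everything between L3 and L4
var_dump($m[1]); // just the captured OK

// preg_replace substitutes $m[0], so the text around OK is lost as well
$replaced = preg_replace($search, "REPLACEMENT", $subject);
echo $replaced, "\n";
```

The captured group looks right in preg_match, but the replacement swallows everything between L3 and L4.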
A possible solution is to use 2 capturing groups around the subpatterns before and after the OK, and use backreferences in the replacement pattern:
$search = "/(L3.*?)OK(.*?L4)/";
$replace = "REPLACEMENT";
$subject = "'L1' => ('Vanessa', 'Prague', 'OK'), 'L2' => ('Alex', 'Paris', 'OK'), 'L3' => ('Paul', 'Paris', 'OK'), 'L4' => ('John', 'Madrid', 'OK')";
$str = preg_replace($search, '$1'.$replace.'$2', $subject);
echo $str; // => 'L1' => ('Vanessa', 'Prague', 'OK'), 'L2' => ('Alex', 'Paris', 'OK'), 'L3' => ('Paul', 'Paris', 'REPLACEMENT'), 'L4' => ('John', 'Madrid', 'OK')
See the PHP demo
If there cannot be any L3.5 in between L3 and L4, the (L3.*?)OK(.*?L4) pattern is safe to use. It will match and capture L3 and then 0+ chars other than a linebreak up to the first OK, then will match OK, and then will match and capture 0+ chars up to the first L4.
If L4 may be missing from the string, use a (?:(?!L4).)* tempered greedy token that matches any symbol other than a linebreak symbol that does not start an L4 sequence, and keep only the $1 backreference in the replacement:
'~(L3(?:(?!L4).)*)OK~'
See the regex demo
NOTE: If you want to make the regexps safer, add ' around L# inside the patterns.
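As a rough sketch of the tempered greedy token variant combined with the note above (the quotes around L3 and L4 inside the pattern and the ${1} replacement string are my assumptions about how you would wire it up):

```php
<?php
$subject = "'L1' => ('Vanessa', 'Prague', 'OK'), 'L2' => ('Alex', 'Paris', 'OK'), 'L3' => ('Paul', 'Paris', 'OK'), 'L4' => ('John', 'Madrid', 'OK')";

// Group 1: 'L3' plus any chars that do not start an 'L4' sequence,
// backtracking to the last OK found before 'L4' (or before the end of the line)
$search = "~('L3'(?:(?!'L4').)*)OK~";

// ${1} puts the captured prefix back; only OK itself is replaced
$result = preg_replace($search, '${1}REPLACEMENT', $subject);
echo $result, "\n";
```

Since nothing after OK is consumed, no second backreference is needed here.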
I have a problem with this string:
[NAME: abc] [EMAIL: email#gm.com] [TIMEFRAME: 3 weeks] [BUDGET: 1000 dollars] [MESSAGE: bla bla bla]
I would like to convert it into an array in the form:
array(
'NAME' => 'abc',
'EMAIL' => 'email#gm.com',
'TIMEFRAME' => '3 weeks',
'BUDGET' => '1000 dollars',
'MESSAGE' => 'bla bla bla' );
I tried to do something like this:
$content = str_replace(array('[', ']'), '', '[NAME: abc] [EMAIL: email#gm.com] [TIMEFRAME: 3 weeks] [BUDGET: 1000 dollars] [MESSAGE: bla bla bla]');
preg_match_all('/[A-Z]+\:/', $content, $inputs);
I managed to pull out the "keys", but I do not know how to pull out their "values". Any ideas?
Thank you in advance for your help and I apologize for my English.
You may use the following regex:
'~\[(\w+):\s*([^][]*)]~'
See the regex demo.
Details
\[ - a [ char
(\w+) - Group 1: 1+ letters, digits or _
: - a colon
\s* - 0+ whitespaces
([^][]*) - Group 2: 0+ chars other than [ and ]
] - a ] char.
See the PHP demo:
$s = "[NAME: abc] [EMAIL: cde] [TIMEFRAME: efg] [BUDGET: hij] [MESSAGE: klm]";
if (preg_match_all('~\[(\w+):\s*([^][]*)]~', $s, $m)) {
array_shift($m); // Removes whole match values from array
print_r(array_combine($m[0], $m[1])); // Build the result with keys (Group 1) and values (Group 2)
}
I have an input string like this:
"Day":June 8-10-2012,"Location":US,"City":Newyork
I need to match 3 value substrings:
June 8-10-2012
US
Newyork
I don't need the labels.
Per my comment above, if this is JSON, you should definitely use those functions as they are more suited for this.
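However, you can use the following regex. The g flag is only meaningful in the online tester; PHP has no g modifier, so use preg_match_all to collect every match:
/:([a-zA-Z0-9\s-]*)/
<?php
preg_match_all('/:([a-zA-Z0-9\s-]*)/', '"Day":June 8-10-2012,"Location":US,"City":Newyork', $matches);
print_r($matches[1]); // Group 1 values, without the leading colon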
The regex demo is here:
https://regex101.com/r/BbwVQ5/1
Here are a couple of simple ways:
Code: (Demo)
$string = '"Day":June 8-10-2012,"Location":US,"City":Newyork';
var_export(preg_match_all('/:\K[^,]+/', $string, $out) ? $out[0] : 'fail');
echo "\n\n";
var_export(preg_split('/,?"[^"]+":/', $string, 0, PREG_SPLIT_NO_EMPTY));
Output:
array (
0 => 'June 8-10-2012',
1 => 'US',
2 => 'Newyork',
)
array (
0 => 'June 8-10-2012',
1 => 'US',
2 => 'Newyork',
)
Pattern #1 Demo: \K restarts the match after : so that a positive lookbehind can be avoided (saving "steps" / improving pattern efficiency). By matching all following characters that are not a comma, a capture group can be avoided as well (again saving "steps").
Pattern #2 Demo: ,? makes the leading comma optional, and "[^"]+" matches the double-quoted "key" that the string is split on. The substring to split on covers the full "key" and ends on the following : colon.
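For comparison, here is a small sketch that puts the lookbehind version next to the \K version on the same input (the lookbehind pattern and the variable names are mine, only there to illustrate that both return the same substrings):

```php
<?php
$string = '"Day":June 8-10-2012,"Location":US,"City":Newyork';

// Lookbehind: assert that a colon precedes the match
preg_match_all('/(?<=:)[^,]+/', $string, $withLookbehind);

// \K: consume the colon, then drop it from the reported match
preg_match_all('/:\K[^,]+/', $string, $withK);

var_dump($withLookbehind[0] === $withK[0]);
print_r($withK[0]);
```

Both patterns assume the values never contain a comma or a colon.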
I want to filter strings that I have in a CSV file, and I'm looking for a correct regexp that matches these strings:
PLP_LES_HALLES.VOLUME_POMPE
Newyork:Flow(m3/h)
In fact, the string should not contain any characters like : ç & é # ! ? “ ' ³ = + etc.
I tried this one :
([a-zA-Z0-9_:.(\/)]*) but when I tested it, I figured out that it matches everything. Kindly help me to find the correct one.
Here is my code to test:
while (($line = fgetcsv($handle, 1024, ";")) !== FALSE) {
$total = count( $line );
$keys = array('mesure', 'timestamp', 'value');
$args=array(
'mesure' => array('filter' => FILTER_VALIDATE_REGEXP,
'options' => array('regexp' => '([a-zA-Z0-9_:.(\/)]*)')),
'timestamp' => array( 'filter' => FILTER_VALIDATE_INT,
'options' => array('min_range' => 20000000000000, 'length' => 14)),
'value' => FILTER_VALIDATE_FLOAT);
$testing = filter_var_array(array_combine($keys, $line), $args);
var_dump($testing);
}
EDIT
These strings should not match:
PLP_LES_HALLéS.VOLUME_POMPE
PLP_LES_HàLLES.VOLUME_POMPE
Newyork:Flow(m³/h)
To sum up, all strings that have any characters from the list ç & é # ! ? “ ' ³ = + etc. should not match.
Your regex is not anchored, so it does not have to match the whole string, and the outer parentheses end up being used as the regex delimiters, which is ambiguous. It is recommended to use more common symbols as regex delimiters:
'/^[a-zA-Z0-9_:.()\/]*$/'
 ^^                   ^^
The ^ will match the start of the string, and $ will match its end, requiring a whole string match.
Also, [a-zA-Z0-9_] can be written as \w to shorten the pattern (do this only without the u modifier; with it, \w also matches Unicode letters such as é):
'/^[\w:.()\/]*$/'
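A quick way to check the anchored pattern against the sample strings from the question is a sketch like this one, using filter_var with FILTER_VALIDATE_REGEXP as in your code (the $samples and $results arrays are only for illustration):

```php
<?php
$pattern = '/^[a-zA-Z0-9_:.()\/]*$/';
$options = ['options' => ['regexp' => $pattern]];

$samples = [
    'PLP_LES_HALLES.VOLUME_POMPE',
    'Newyork:Flow(m3/h)',
    'PLP_LES_HALLéS.VOLUME_POMPE',
    'Newyork:Flow(m³/h)',
];

$results = [];
foreach ($samples as $sample) {
    // filter_var returns the string on success and false on failure
    $results[$sample] = filter_var($sample, FILTER_VALIDATE_REGEXP, $options) !== false;
}
var_export($results);
```

Note that * also accepts an empty string; replace it with + if empty values should be rejected.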
$string = '/start info#example.com';
$pattern = '/{command} {name}#{domain}';
How can I get an array of params in PHP, like in the example below?
['command' => 'start', 'name' => 'info', 'domain' => 'example.com']
and
$string = '/start info#example.com';
$pattern = '/{command} {email}';
['command' => 'start', 'email' => 'info#example.com']
and
$string = '/start info#example.com';
$pattern = '{command} {email}';
['command' => '/start', 'email' => 'info#example.com']
If it's a single-line string, you can use preg_match and a regular expression such as this:
preg_match('/^\/(?P<command>\w+)\s(?P<name>[^#]+)\#(?P<domain>.+?)$/', '/start info#example.com', $match );
But depending on the variation in the data, you may have to adjust the regex a bit. This outputs
command [1-6] start
name [7-11] info
domain [12-23] example.com
but it will also have the numeric index in the array.
https://regex101.com/r/jN8gP7/1
Just to break this down a bit, in English.
The leading ^ is the start of the line, then a literal /, then a named capture of \w+ (any of a-z A-Z 0-9 _), then a whitespace \s, then a named capture of anything but the # sign, [^#]+, then the # sign itself, then a named capture of anything, .+?, up to the end $.
This will capture anything in this format,
(abc123_ ) space (anything but #)#(anything)
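If the numeric indexes get in the way, one possible sketch is to keep only the string keys of the match array (the array_filter step and the $params name are my addition, not something the pattern itself requires):

```php
<?php
$pattern = '/^\/(?P<command>\w+)\s(?P<name>[^#]+)\#(?P<domain>.+?)$/';

$params = [];
if (preg_match($pattern, '/start info#example.com', $match)) {
    // Named groups have string keys; numbered groups have integer keys
    $params = array_filter($match, 'is_string', ARRAY_FILTER_USE_KEY);
}
print_r($params);
```

This leaves only the command, name and domain entries.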
Given the following code:
$regex = '/(http\:\/\/|https\:\/\/)([a-z0-9-\.\/\?\=\+_]*)/i';
$text = preg_split($regex, $note, -1, PREG_SPLIT_DELIM_CAPTURE);
It returns an array such as:
array (size=4)
0 => string '...' (length=X)
1 => string 'https://' (length=8)
2 => string 'duckduckgo.com/?q=how+much+wood+could+a+wood-chuck+chuck+if+a+wood-chuck+could+chuck+wood' (length=89)
3 => string '...' (length=X)
I would prefer it if the returned array had size=3, with one single URL. Is this possible?
Sure that can be done, just remove those extra matching groups from your regex. Try following code:
$regex = '#(https?://[a-z0-9/.?=+_-]*)#i';
$text = preg_split($regex, $note, -1, PREG_SPLIT_DELIM_CAPTURE);
Now the resulting array will have 3 elements instead of 4.
Besides removing the extra grouping, I have also simplified your regex, since most of the special characters don't need to be escaped inside a character class.
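Here is a small sketch of the single-group split. The $note value is made up for the demo, and the pattern assumes / is kept inside the character class so that the path part of the URL stays in the same element:

```php
<?php
// Hypothetical input; the real $note comes from your application
$note = 'Search here: https://duckduckgo.com/?q=wood-chuck and tell me.';

// One capturing group around the whole URL, # as delimiter so / needs no escaping
$regex = '#(https?://[a-z0-9/.?=+_-]*)#i';
$text  = preg_split($regex, $note, -1, PREG_SPLIT_DELIM_CAPTURE);

var_export($text);
```

With PREG_SPLIT_DELIM_CAPTURE every capturing group adds one element per match, so a single group gives text, URL, text.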