PHP regex - weird issue with multiple group capture

The code:
$pattern = '~(/(?P<lang>en|ru))?/foo(/(?P<bar>bar))?~';
preg_match($pattern, '/foo', $matches);
var_dump($matches);
/*output:
array(1) {
[0] =>
string(4) "/foo"
}*/
preg_match($pattern, '/foo/bar', $matches);
var_dump($matches);
/*output:
array(7) {
[0] =>
string(8) "/foo/bar"
[1] =>
string(0) ""
'lang' =>
string(0) ""
[2] =>
string(0) ""
[3] =>
string(4) "/bar"
'bar' =>
string(3) "bar"
[4] =>
string(3) "bar"
}*/
The question: why the hell does it capture <lang> in the second preg_match call and how do I fix it?
P.S. I tried this regex on https://www.regex101.com and there it captures correctly, but on my machine with PHP7, it does not. I get the feeling that regex101 filters the output.

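As others have said, that's simply how regex capture groups work: a group that does not take part in the match still occupies its slot, so preg_match fills it with an empty string to keep the numbering of the later groups aligned. Only trailing unmatched groups are dropped, which is why the first call returns just [0]. It's fairly universal to regexes, as far as I know. It even has parallels in programming in general, such as how a Java method declared to return a String has to return a String (unless it throws an error).
In PHP, use array_filter on $matches to remove the empty entries. On PHP 7.2 and later, the PREG_UNMATCHED_AS_NULL flag reports such groups as null instead, which lets you tell them apart from a group that matched an empty string.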
Also, I suggest using non-capturing groups (?:) to cut the clutter:
(?:/(?P<lang>en|ru))?/foo(?:/(?P<bar>bar))?
Or split it into 2 regexes: (?:/(?P<lang>en|ru)) and /foo(?:/(?P<bar>bar)).
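A minimal sketch of the two suggestions combined, assuming you only want the groups that actually matched. The explicit callback is my own choice: a bare array_filter($matches) would also work here, but it drops every value PHP treats as falsy, including a captured "0".

```php
<?php
// Non-capturing wrappers leave only the two named groups (and their numeric twins).
$pattern = '~(?:/(?P<lang>en|ru))?/foo(?:/(?P<bar>bar))?~';

preg_match($pattern, '/foo/bar', $matches);

// An optional group that did not take part in the match is reported as "",
// so strip exactly those entries; array_filter keeps the original keys.
$matches = array_filter($matches, function ($value) {
    return $value !== '';
});

var_dump($matches);
```

For '/foo/bar' this leaves only [0], 'bar' and [2]; the 'lang' entries are gone.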

Related

Regular expression - repeat groups

I have text:
<b>Title1:</b><br/><b>Title2:</b> Value1<br/><b>Title3:</b> Value2<br/><b>Title4:</b> Value3<br/>Value4<b>Title5:</b> Value5<br/>
What regex would give me:
[0] => <b>Title1:</b><br/>
[1] => <b>Title2:</b> Value1<br/>
[2] => <b>Title3:</b> Value2<br/>
[3] => <b>Title4:</b> Value3<br/>Value4
[4] => <b>Title5:</b> Value5<br/>
My attempt is not working:
<b>(.*?)</b>(.*?)
You can use preg_split() with a lookahead:
<?php
$split = preg_split( '/(?=<b>Title\d+:)/', '<b>Title1:</b><br/><b>Title2:</b> Value1<br/><b>Title3:</b> Value2<br/><b>Title4:</b> Value3<br/>Value4<b>Title5:</b> Value5<br/>' );
array_shift( $split );
var_dump( $split );
Output:
array(5) {
[0]=>
string(19) "<b>Title1:</b><br/>"
[1]=>
string(26) "<b>Title2:</b> Value1<br/>"
[2]=>
string(26) "<b>Title3:</b> Value2<br/>"
[3]=>
string(32) "<b>Title4:</b> Value3<br/>Value4"
[4]=>
string(26) "<b>Title5:</b> Value5<br/>"
}
Your regex was close, you need:
<b>(.*?)<\/b>(.*?)(?=<b>|$)
https://regex101.com/r/dk67IK/1
A resource like this can be very useful in troubleshooting regex: https://regex101.com/
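As a rough sketch, the lookahead pattern above can be run with preg_match_all. The variable names and the choice of ~ as the delimiter (so that / needs no escaping) are my own assumptions, not part of the answer.

```php
<?php
$html = '<b>Title1:</b><br/><b>Title2:</b> Value1<br/><b>Title3:</b> Value2<br/>'
      . '<b>Title4:</b> Value3<br/>Value4<b>Title5:</b> Value5<br/>';

// "~" as delimiter means the "/" in </b> needs no escaping.
// The lookahead ends each chunk just before the next <b> or at the end of input.
preg_match_all('~<b>(.*?)</b>(.*?)(?=<b>|$)~', $html, $matches);

// $matches[0] holds the full chunks, $matches[1] the titles, $matches[2] the rest.
print_r($matches[0]);
```

Unlike the preg_split() approach, this also hands you the title and the trailing value of each chunk as separate groups.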
It looks like the / in </b> is not escaped in <b>(.*?)</b>(.*?). When / is also used as the pattern delimiter, it has to be written as \/.
<b>(.*?)<\/b>(.*?) should stop the error from being thrown and get you close to the result, though you'll need to work with it a bit more to get exactly the results you want.
<b>(.*?)<\/b>(.*?)<br\/> should be a bit closer, I think, as it looks like you want to include the break tags.

php preg_split to find all words in a string is not working

I am using preg_split to split a string into words.
However, it is not working for a particular string that is fetched from a mysql text column.
If I manually assign the string to a variable it will work correctly but not when the string is fetched from the database.
Here is the simple code I am using:
//The failing string. When manually assigned like this it works correctly
$string = "<p><strong>Iden is lesz lehetoseg a foproba és a koncert napjan ebedet kerni a MUPA-ban. Ára 1000-1200 Ft körül várható. Azoknak, akik még nem jártak a MUPA-ban ingyenes bejarasi lehetoseget biztositunk. Tovabba segitunk a pesti szallas megszervezeseben is, ha igenyt tartotok ra.</strong></p>";
$string = strip_tags(trim($string));
$words = preg_split('/\PL+/u', $string, null, PREG_SPLIT_NO_EMPTY);
Here is what the preg_split returns when called on the string from the database:
array(1) { [0]=> string(269) "Iden is lesz lehetoseg a foproba és a koncert napjan ebedet kerni a MUPA-ban. Ára 1000-1200 Ft körül várható. Azoknak, akik még nem jártak a MUPA-ban ingyenes bejarasi lehetoseget biztositunk. Tovabba segitunk a pesti szallas megszervezeseben is, ha igenyt tartotok ra." }
Does anyone know what is causing preg_split to fail for this string?
Thanks
I tested your code with a string from the database and got the same error. Changing the regular expression solves it. Use this expression:
$words = preg_split('/\s+/', $string, -1, PREG_SPLIT_NO_EMPTY);
//var_dump result
array(42) {
[0]=>
string(4) "Iden"
[1]=>
string(2) "is"
[2]=>
string(4) "lesz"
[3]=>
string(9) "lehetoseg"
...
}
UPDATE:
The /u modifier is for UTF-8. Maybe your database is not using UTF-8, which would explain why the expression did not work.
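One way to test the encoding hypothesis is to check what preg_split reports for a string that is not valid UTF-8. This is only a sketch: $fromDb is a hypothetical stand-in for the database value, and the asker's dump shows a one-element array rather than false, so encoding may not be the whole story.

```php
<?php
// Hypothetical input: "és" with é stored as the single byte 0xE9
// (as in ISO-8859-1/ISO-8859-2), which is not valid UTF-8.
$fromDb = "\xE9s a koncert";

$words = preg_split('/\PL+/u', $fromDb, -1, PREG_SPLIT_NO_EMPTY);

if ($words === false && preg_last_error() === PREG_BAD_UTF8_ERROR) {
    // With /u, PCRE refuses subjects that are not valid UTF-8.
    echo "Subject is not valid UTF-8; check the connection and column charset.\n";
} else {
    var_dump($words);
}
```

If this branch fires for your real data, converting the string to UTF-8 (or fixing the connection charset) before splitting is the likely fix.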
You don't need a regex for this; explode will do the job (note that punctuation stays attached to the words):
$string = "<p><strong>Iden is lesz lehetoseg a foproba és a koncert napjan ebedet kerni a MUPA-ban. Ára 1000-1200 Ft körül várható. Azoknak, akik még nem jártak a MUPA-ban ingyenes bejarasi lehetoseget biztositunk. Tovabba segitunk a pesti szallas megszervezeseben is, ha igenyt tartotok ra.</strong></p>";
$string = strip_tags(trim($string));
$words = explode(" ", $string);
print_r($words);
Output:
Array
(
[0] => Iden
[1] => is
[2] => lesz
[3] => lehetoseg
[4] => a
[5] => foproba
[6] => és
[7] => a
[8] => koncert
...

PHP String Extraction [duplicate]

I'm weak with regex and need help. I have to extract every string that matches my pattern into an array. See the problem below:
The string
<?php
$alert_types = array(
'warning' => array('', __l("Warning!") ),
'error' => array('alert-error', __l("Error!") ),
'success' => array('alert-success', __l("Success!") ),
'info' => array('alert-info', __l("For your information.") ),
);?>
The preg_match code
preg_match("/.*[_][_][l][\(]['\"](.*)['\"][\)].*/", $content, $matches);
I'm only getting the first match, which is Warning!. I'm expecting $matches to have the following values:
Warning!, Error!, Success!, For your information.
Actually, I'm using file_get_contents($file) to get the string.
Can anyone help me solve this? Thank you in advance.
preg_match() only finds the first match in the string. Use preg_match_all() to get all matches.
preg_match_all("/.*__l\(['\"](.*?)['\"]\).*/", $content, $matches);
$matches[1] will contain an array of the strings you're looking for.
BTW, you don't need all those single-character bracket expressions such as [_] or [l]. Just put the character itself into the regexp.
var_dump($matches);
array(2) {
[0]=>
array(4) {
[0]=>
string(45) " 'warning' => array('', __l("Warning!") ),"
[1]=>
string(52) " 'error' => array('alert-error', __l("Error!") ),"
[2]=>
string(58) " 'success' => array('alert-success', __l("Success!") ),"
[3]=>
string(65) " 'info' => array('alert-info', __l("For your information.") ),"
}
[1]=>
array(4) {
[0]=>
string(8) "Warning!"
[1]=>
string(6) "Error!"
[2]=>
string(8) "Success!"
[3]=>
string(21) "For your information."
}
}
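A slightly tighter variant, offered as a sketch rather than as the answer's own pattern: dropping the leading and trailing .* keeps $matches[0] short, and a backreference makes sure the closing quote is the same kind as the opening one. The nowdoc stands in for file_get_contents($file).

```php
<?php
// Stand-in for file_get_contents($file); a nowdoc keeps the text literal.
$content = <<<'EOT'
$alert_types = array(
    'warning' => array('', __l("Warning!") ),
    'error' => array('alert-error', __l("Error!") ),
    'success' => array('alert-success', __l("Success!") ),
    'info' => array('alert-info', __l("For your information.") ),
);
EOT;

// ([\'"]) remembers the opening quote and \1 requires the same one to close.
preg_match_all('/__l\(\s*([\'"])(.*?)\1\s*\)/', $content, $matches);

// Group 2 holds the strings passed to __l().
print_r($matches[2]);
```

Note that the strings end up in $matches[2] here, because group 1 is now the quote character.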

PHP - Preg_match_all optional match

I'm having problems matching the [*], which is sometimes there and sometimes not. Does anyone have suggestions?
$name = 'hello $this->row[today1][] dfh fgh df $this->row[test1] ,how good $this->row[test2][] is $this->row[today2][*] is monday';
echo $name."\n";
preg_match_all( '/\$this->row[.*?][*]/', $name, $match );
var_dump( $match );
output:
hello $this->row[test] ,how good $this->row[test2] is $this->row[today][*] is monday
array (
0 =>
array (
0 => '$this->row[today1][*]',
1 => '$this->row[test1] ,how good $this->row[test2][*]',
2 => '$this->row[today2][*]',
),
)
Now the [0][1] match takes in too much, because it keeps matching until the next '[]' instead of ending at '$this->row[test]'. I'm guessing the [*]/ adds a wildcard. Somehow I need to check whether the next character is [ before matching the []. Anyone?
Thanks
[, ] and * are special metacharacters in regex, so you need to escape them. You also need to make the last [] optional, as per your question.
With these changes, the following should work:
$name = 'hello $this->row[today1][] dfh fgh df $this->row[test1] ,how good $this->row[test2][] is $this->row[today2][*] is monday';
echo $name."\n";
preg_match_all( '/\$this->row\[.*?\](?:\[.*?\])?/', $name, $match );
var_dump( $match );
OUTPUT:
array(1) {
[0]=>
array(4) {
[0]=>
string(20) "$this->row[today1][]"
[1]=>
string(17) "$this->row[test1]"
[2]=>
string(19) "$this->row[test2][]"
[3]=>
string(21) "$this->row[today2][*]"
}
}
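If the optional part should only ever be a literal [*], a stricter variant is possible. This is a sketch under that assumption, with a made-up input string; the negated class [^\]]* is my substitution for .*? so that a bracket pair can never run past its own closing ].

```php
<?php
// Hypothetical input in which some occurrences carry a trailing [*].
$name = 'a $this->row[today1][*] b $this->row[test1] c $this->row[today2][*]';

// \[[^\]]*\] matches one bracket pair without crossing a "]";
// (?:\[\*\])? then accepts a literal [*] only if it follows immediately.
preg_match_all('/\$this->row\[[^\]]*\](?:\[\*\])?/', $name, $match);

print_r($match[0]);
```

The middle occurrence is matched as '$this->row[test1]' on its own, since no [*] follows it directly.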

regular expression, PHP

I need to parse a string like this: {data type="subject"} using regular expressions in PHP.
I've got this:
$template = '/{([\w]+)\s([\w]+)="([\w]+)"}/';
but nothing happens.
Can anyone help me with that?
Everything looks fine.
Your pattern:
<?php
$s = '{data type="subject"}';
$template = '/{([\w]+)\s([\w]+)="([\w]+)"}/';
preg_match($template, $s, $matches);
var_dump($matches);
Result is:
array(4) {
[0] =>
string(21) "{data type="subject"}"
[1] =>
string(4) "data"
[2] =>
string(4) "type"
[3] =>
string(7) "subject"
}
Show your full example, please.
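Escaping the curly braces is the safer choice, because { and } delimit quantifiers such as {2,3}. PCRE treats a { that does not start a valid quantifier as a literal, which is why the unescaped pattern above also matches. Try something like this: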
'/\{(\w+)\s(\w+)="(\w+)"\}/'
Edit: fixed a small mistake. I've tried it, and it works fine.
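For illustration, the escaped pattern can also use named groups so the result is easier to read. The group names tag, attr and value are hypothetical, chosen only for this sketch.

```php
<?php
$s = '{data type="subject"}';

// Named groups make the result self-describing; the names are illustrative only.
$template = '/\{(?P<tag>\w+)\s+(?P<attr>\w+)="(?P<value>\w+)"\}/';

if (preg_match($template, $s, $m)) {
    echo $m['tag'], ' ', $m['attr'], ' ', $m['value'], "\n";
}
```

If preg_match still returns 0 on your real input, the input itself probably differs from the sample, for example by containing characters that \w does not cover.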
