PHP preg_match_all not finding first match - php

I am trying to find all matches in a string. For some reason, if a match is at the very start of the string, that particular match is not returned. Does it have something to do with index 0? I am also using PREG_OFFSET_CAPTURE to get the offsets along with the matches. Below are the working and non-working versions of the code.
$text = '[QUOTE]I wonder why[QUOTE]PHP[IMG]hates me[/IMG][/QUOTE][/QUOTE][URL="http://www.bing.com"]Click me![QUOTE]........[/QUOTE]Ok Bai![/URL]';
preg_match_all('#\[QUOTE\]#', $text, $matches, PREG_OFFSET_CAPTURE, PREG_PATTERN_ORDER);
print_r($matches);
The result of which is:
Array ( [0] => Array ( [0] => Array ( [0] => [QUOTE] [1] => 19 ) [1] => Array ( [0] => [QUOTE] [1] => 100 ) ) )
As you can see it only found two matches. If I add a character to the start of the string it will then find all three.
$text = 'a[QUOTE]I wonder why[QUOTE]PHP[IMG]hates me[/IMG][/QUOTE][/QUOTE][URL="http://www.bing.com"]Click me![QUOTE]........[/QUOTE]Ok Bai![/URL]';
preg_match_all('#\[QUOTE\]#', $text, $matches, PREG_OFFSET_CAPTURE, PREG_PATTERN_ORDER);
print_r($matches);
The result of which is:
Array ( [0] => Array ( [0] => Array ( [0] => [QUOTE] [1] => 1 ) [1] => Array ( [0] => [QUOTE] [1] => 20 ) [2] => Array ( [0] => [QUOTE] [1] => 101 ) ) )
All three matches. If anyone can help me figure out if my REGEX needs to be modified or if there is some quirk I'm unaware of it would be much appreciated. I've tried this same thing utilizing Python and the re library and it returns all my matches. I also utilized this http://www.regextester.com/ and it reports it as working in both scenarios and matching everything as it should. My only guess is something to do with the PREG_OFFSET_CAPTURE finding a match at position 0 and the 0 causing some issue.
Thanks in advance for any assistance!

The correct way to add multiple flags is with a pipe |, so:
preg_match_all('#\[QUOTE\]#', $text, $matches, PREG_OFFSET_CAPTURE | PREG_PATTERN_ORDER);
Because you wrote a comma before PREG_PATTERN_ORDER, it is passed as the fifth parameter, 'offset' (the position in the string at which the search starts). Since PREG_PATTERN_ORDER == 1, the search starts at the second character and misses the match at position 0.

The problem is in your function call:
preg_match_all('#\[QUOTE\]#', $text, $matches, PREG_OFFSET_CAPTURE, PREG_PATTERN_ORDER);
The fifth parameter is the offset, not another flag.
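To make the difference concrete, here is a minimal sketch comparing the two calls. The sample string is a shortened version of the one in the question, and the variable names $withFlags and $withOffset are only illustrative.

```php
<?php
// Shortened sample string; the first [QUOTE] sits at byte offset 0.
$text = '[QUOTE]I wonder why[QUOTE]PHP[/QUOTE][/QUOTE]';
$pattern = '#\[QUOTE\]#';

// Flags combined with | in the fourth argument; the offset keeps its default of 0.
preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE | PREG_PATTERN_ORDER);
$withFlags = array_column($matches[0], 1);

// PREG_PATTERN_ORDER passed as the fifth argument is read as offset 1,
// so the search starts at the second character and skips the match at 0.
preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE, PREG_PATTERN_ORDER);
$withOffset = array_column($matches[0], 1);

print_r($withFlags);   // offsets 0 and 19
print_r($withOffset);  // offset 19 only
```

Note that the reported offsets are still relative to the start of the full string, which is why the second call reports 19 rather than 18.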

Related

regex repeated asterisk pattern matching

If I do the regex matching
preg_match('/^[*]{2}((?:[^*]|[*][^*]*[*])+?)[*]{2}(?![*]{2})/s', "**A** **B**", $matches);
I get the result for $matches I want of
Array ( [0] => **A** [1] => A )
but I am not sure how to modify the regex to yield the same result in $matches from the input text without the space in the middle, that is, "**A****B**".
It looks like the regex matching
preg_match('/^[*]{2}((?:[^*]|[*][^*]*[*])+?)[*]{2}/s', "**A****B**", $matches);
yields the result for $matches I want of
Array ( [0] => **A** [1] => A )
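As a quick check, the sketch below runs the second pattern (the one without the trailing lookahead) against both inputs from the post. The $results array is just an illustrative container.

```php
<?php
// Pattern from the post with the trailing negative lookahead removed.
$pattern = '/^[*]{2}((?:[^*]|[*][^*]*[*])+?)[*]{2}/s';

$results = [];
foreach (['**A** **B**', '**A****B**'] as $input) {
    // The lazy +? stops at the first closing ** it can reach.
    preg_match($pattern, $input, $matches);
    $results[$input] = $matches;
}

// Both inputs give the full match **A** and the captured group A.
print_r($results);
```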

How to get a particular string using preg_replace?

I want to get a particular value from a string in PHP. This is the string and what I have tried:
$string = 'users://data01=[1,2]/data02=[2,3]/*';
preg_replace('/(.*)\[(.*)\](.*)\[(.*)\](.*)/', '$2', $str);
I want to get the value of data01, that is, [1,2].
How can I achieve this using preg_replace?
preg_replace() is the wrong tool for this. I have used preg_match_all() instead, in case you need the other item later, and trimmed your regex down so that it captures only the part of the string you are looking for.
$string = 'users://data01=[1,2]/data02=[2,3]/*';
preg_match_all('/\[([0-9,]+)\]/',$string,$match);
print_r($match);
/*
print_r($match) output:
Array
(
[0] => Array
(
[0] => [1,2]
[1] => [2,3]
)
[1] => Array
(
[0] => 1,2
[1] => 2,3
)
)
*/
echo "Your match: " . $match[1][0];
This gives you both the full matched pattern and the captured characters, so you can have [1,2] or just 1,2.
preg_replace is used to replace by regular expression!
I think you want to use preg_match_all() to get each data attribute from the string.
The regex you want is:
$string = 'users://data01=[1,2]/data02=[2,3]/*';
preg_match_all('#data[0-9]{2}=(\[[0-9,]+\])#',$string,$matches);
print_r($matches);
Array
(
[0] => Array
(
[0] => data01=[1,2]
[1] => data02=[2,3]
)
[1] => Array
(
[0] => [1,2]
[1] => [2,3]
)
)
I have tested this as working.
preg_replace is for replacing stuff. preg_match is for extracting stuff.
So you want:
preg_match('/(.*?)\[(.*?)\](.*?)\[(.*?)\](.*)/', $str, $match);
var_dump($match);
See what you get, and work from there.
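If only one named item is needed, a single preg_match() call anchored on the key name may be enough. The following is a minimal sketch; extractField() is a hypothetical helper made up for illustration, not something from the answers above.

```php
<?php
$string = 'users://data01=[1,2]/data02=[2,3]/*';

// Hypothetical helper: returns the bracketed value that follows "<name>=",
// or null when that key does not appear in the input.
function extractField(string $input, string $name): ?string
{
    // preg_quote() keeps any special characters in the key name literal.
    $pattern = '/' . preg_quote($name, '/') . '=(\[[0-9,]+\])/';
    return preg_match($pattern, $input, $m) === 1 ? $m[1] : null;
}

echo extractField($string, 'data01'), "\n"; // [1,2]
echo extractField($string, 'data02'), "\n"; // [2,3]
```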

Finding the number of occurrences of a string inside another string using regex in PHP?

I want to find the number of occurrences of a substring (pattern based) inside another string.
For example:
$mystring = "|graboard='KERALA'||graboarded='KUSAT'||graboard='MG'";
I want to find the number of graboards present in $mystring.
I am using a regex for this, but how do I get the number of occurrences?
If you must use a regex, preg_match_all() returns the number of matches.
Use preg_match_all:
$mystring = "|graboard='KERALA'||graboarded='KUSAT'||graboard='MG'";
preg_match_all("/(graboard)='(.+?)'/i", $mystring, $matches);
print_r($matches);
will yield:
Array
(
[0] => Array
(
[0] => graboard='KERALA'
[1] => graboard='MG'
)
[1] => Array
(
[0] => graboard
[1] => graboard
)
[2] => Array
(
[0] => KERALA
[1] => MG
)
)
You can then use count($matches[1]) to get the number of occurrences. This is only a basic example, so the regex may need to be adjusted to suit your needs.
Just use preg_match_all():
// The string.
$mystring="|graboard='KERALA'||graboarded='KUSAT'||graboard='MG'";
// The `preg_match_all()`.
preg_match_all('/graboard/is', $mystring, $matches);
// Echo the count of `$matches` generated by `preg_match_all()`.
echo count($matches[0]);
// Dumping the content of `$matches` for verification.
echo '<pre>';
print_r($matches);
echo '</pre>';
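Since preg_match_all() itself returns the number of full matches, the count can also be taken directly from the return value. A small sketch, with illustrative variable names, showing how the choice of pattern changes the count for this string:

```php
<?php
$mystring = "|graboard='KERALA'||graboarded='KUSAT'||graboard='MG'";

// Matches every occurrence of the text, including the one inside "graboarded".
$anyPrefix = preg_match_all('/graboard/', $mystring);

// Requiring the "=" right after the name excludes "graboarded=".
$exactKey = preg_match_all('/graboard=/', $mystring);

// For a plain literal substring no regex is needed at all.
$literal = substr_count($mystring, 'graboard');

echo $anyPrefix, ' ', $exactKey, ' ', $literal, "\n"; // 3 2 3
```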

Nested pattern matching with preg_match_all (Regex and PHP)

I'm working with text data that contains special flags in the form of "{X}" or "{XX}" where X could be any alphanumeric character. Special meaning is assigned to these flags when they are adjacent or when they are separated. I need a regex which will match adjacent flags AND separate each flag in the group.
For Example, given the following input:
{B}{R}: Target player loses 1 life.
{W}{G}{U}: Target player gains 5 life.
The output should be approximately:
("{B}{R}",
"{W}{G}{U}")
("{B}",
"{R}")
("{W}",
"{G}",
"{U}")
My PHP code is returning the adjacents array properly, but the split array contains only the last matching flag in each group:
$input = '{B}{R}: Target player loses 1 life.
{W}{G}{U}: Target player gains 5 life.';
$pattern = '#((\{[a-zA-Z0-9]{1,2}})+)#';
preg_match_all($pattern, $input, $results);
print_r($results);
Output:
Array
(
[0] => Array
(
[0] => {B}{R}
[1] => {W}{G}{U}
)
[1] => Array
(
[0] => {B}{R}
[1] => {W}{G}{U}
)
[2] => Array
(
[0] => {R}
[1] => {U}
)
)
Thanks for any help!
unset($results[1]);
foreach($results[0] AS $match){
preg_match_all('/\{[a-zA-Z0-9]{1,2}}/', $match, $r);
$results[] = $r[0];
}
That's the only way I know of to create the data structure you require. A preg_split would work as well, though:
unset($results[1]);
foreach($results[0] AS $match)
$results[] = preg_split('/(?<=})(?=\{)/', $match);
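Put together as a standalone script, the two-pass idea might look like the sketch below. It assumes the same input as the question; the names $flag, $groups and $split are only illustrative, and a non-capturing group is used so that no extra sub-arrays are produced.

```php
<?php
$input = "{B}{R}: Target player loses 1 life.\n{W}{G}{U}: Target player gains 5 life.";

// A single flag: one or two alphanumeric characters in braces.
$flag = '\{[a-zA-Z0-9]{1,2}\}';

// First pass: find each run of adjacent flags.
preg_match_all('#(?:' . $flag . ')+#', $input, $m);
$groups = $m[0];

// Second pass: break every run into its individual flags.
$split = [];
foreach ($groups as $group) {
    preg_match_all('#' . $flag . '#', $group, $parts);
    $split[] = $parts[0];
}

print_r($groups); // {B}{R} and {W}{G}{U}
print_r($split);  // [{B}, {R}] and [{W}, {G}, {U}]
```

A repeated capturing group only keeps its last iteration, which is why the second pass is needed rather than a single pattern.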

Regex, get multiple occurrences

I would like to know how to get multiple occurrences from a regex.
$str = "Some validations <IF TEST>firstValue</IF> in <IF OK>secondValue</IF> end of string.";
$do = preg_match("/<IF(.*)>.*<\/IF>/i", $str, $matches);
This is what I've done so far. It works if there is only one <IF>, but with more than one it doesn't return the right values. Here is the result:
Array ( [0] => <IF TEST>firstValue</IF> in <IF OK>secondValue</IF> [1] => TEST>firstValue</IF> in <IF OK )
I need to get the "TEST" and the "OK" values.
EDIT: I've made the suggested modifications, thanks a lot, it works fine! However, I am now trying to add an ELSEIF branch and can't get it to work well. Here is what I've done:
$do = preg_match_all("~<IF([^<>]+)>([^<>]+)(</IF>|<ELSEIF([^<>]+)>([^<>]+)</IF>)~", $str, $matches, PREG_SET_ORDER);
and the results is
Array
(
[0] => Array
(
[0] => <IF TEST>firstValue<ELSEIF TEST1>secondValue</IF>
[1] => TEST
[2] => firstValue
[3] => <ELSEIF TEST1>secondValue</IF>
[4] => TEST1
[5] => secondValue
)
[1] => Array
(
[0] => <IF OK>thirdValue</IF>
[1] => OK
[2] => thirdValue
[3] => </IF>
)
)
Is there a way to make my array cleaner? It has many useless elements, like [0][4].
You should make the regex more specific. The .* that you are using should either be less greedy, or better yet disallow other angle brackets:
~<IF([^<>]+)>([^<>]+)</IF>~i
More importantly, you should use preg_match_all, not just preg_match.
preg_match_all("~<IF([^<>]+)>([^<>]+)</IF>~i", $str, $matches, PREG_SET_ORDER);
That'll give you a nested array like:
[0] => Array
(
[0] => <IF TEST>firstValue</IF>
[1] => TEST
[2] => firstValue
)
[1] => Array
(
[0] => <IF OK>secondValue</IF>
[1] => OK
[2] => secondValue
)
The answers pointing out that you should use preg_match_all are correct.
But there is another problem: the .* is greedy by default. This will cause it to match both tags in a single match, so you need to make the star non-greedy (i.e. lazy):
/<IF(.*?)>.*?<\/IF>/i
Use this code:
$string = "Some validations <IF TEST>firstValue</IF> in <IF OK>secondValue</IF> end of string.";
$regex = "/<IF (.*?)>.*?<\/IF>/i";
preg_match_all($regex, $string, $matches);
print_r($matches[1]);
Your regex is good, but you have to make it non-greedy by adding the ? character, and use the preg_match_all() function.
Use a non-greedy match .*? and preg_match_all for this purpose.
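The effect of the ? can be seen by running both variants side by side. A minimal sketch using the string from the question; $greedy and $lazy are illustrative names.

```php
<?php
$str = "Some validations <IF TEST>firstValue</IF> in <IF OK>secondValue</IF> end of string.";

// Greedy: .* runs as far as it can, so both tags end up in a single match.
preg_match_all('/<IF (.*)>.*<\/IF>/i', $str, $greedy);

// Lazy: .*? stops at the first > and the first </IF>, one match per tag.
preg_match_all('/<IF (.*?)>.*?<\/IF>/i', $str, $lazy);

print_r($greedy[1]); // one element: TEST>firstValue</IF> in <IF OK
print_r($lazy[1]);   // two elements: TEST and OK
```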
