multiple patterns with preg_match - php

My situation is: I'm processing an array word by word. What I'm hoping to do and
working on, is to capture a certain word. But for that I need to test two patterns or more with preg-match.
This is my code :
function search_array($array)
{
$pattern = '[A-Z]{1,3}[0-9]{1,3}[A-Z]{1,2}[0-9]{1,2}[A-Z]?';
$pattern2 = '[A-Z]{1,7}[0-9]{1,2}';
$patterns = array($pattern, $pattern2);
$regex = '/(' .implode('|', $patterns) .')/i';
foreach ($array as $str) {
if (preg_match ($regex, $str, $m)){
$matches[] = $m[1];
return $matches[0];
}
}
}
Example of array I could have :
Array ( [0] => X [1] => XXXXXXX [2] => XXX [3] => XXXX [4] => ABC01DC4 )
Array ( [0] => X [1] => XXXXXXX [2] => XXX [3] => ABCDEF4 [4] => XXXX [5] => XX )
Words I would like to catch :
-In the first array : ABC01DC4
-In the second array : ABCDEF4
The problem is not the pattern itself, it's the syntax to use multiple pattern in the same pregmatch

Your code worked with me, and I didn't find any problem with the code or the REGEX. Furthermore, the description you provided is not enough to understand your needs.
However, I have guessed one problem after observing your code, which is, you didn't use any anchor(^...$) to perform matching the whole string. Your regex can find match for these inputs: %ABC01DC4V or ABCDEF4EE. So change this line with your code:
$regex = '/^(' .implode('|', $patterns) .')$/i';
-+- -+-

Related

Flatten array of regular expressions

I have an array of regular expressions -$toks:
Array
(
[0] => /(?=\D*\d)/
[1] => /\b(waiting)\b/i
[2] => /^(\w+)/
[3] => /\b(responce)\b/i
[4] => /\b(from)\b/i
[5] => /\|/
[6] => /\b(to)\b/i
)
When I'm trying to flatten it:
$patterns_flattened = implode('|', $toks);
I get a regex:
/(?=\D*\d)/|/\b(waiting)\b/i|/^(\w+)/|/\b(responce)\b/i|/\b(from)\b/i|/\|/|/\b(to)\b/i
When I'm trying to:
if (preg_match('/'. $patterns_flattened .'/', 'I'm waiting for a response from', $matches)) {
print_r($matches);
}
I get an error:
Warning: preg_match(): Unknown modifier '(' in ...index.php on line
Where is my mistake?
Thanks.
You need to remove the opening and closing slashes, like this:
$toks = [
'(?=\D*\d)',
'\b(waiting)\b',
'^(\w+)',
'\b(response)\b',
'\b(from)\b',
'\|',
'\b(to)\b',
];
And then, I think you'll want to use preg_match_all instead of preg_match:
$patterns_flattened = implode('|', $toks);
if (preg_match_all("/$patterns_flattened/i", "I'm waiting for a response from", $matches)) {
print_r($matches[0]);
}
If you get the first element instead of all elements, it'll return the whole matches of each regex:
Array
(
[0] => I
[1] => waiting
[2] => response
[3] => from
)
Try it on 3v41.org
<?php
$data = Array
(
0 => '/(?=\D*\d)/',
1 => '/\b(waiting)\b/i',
2 => '/^(\w+)/',
3 => '/\b(responce)\b/i',
4 => '/\b(from)\b/i',
5 => '/\|/',
6 => '/\b(to)\b/i/'
);
$patterns_flattened = implode('|', $data);
$regex = str_replace("/i",'',$patterns_flattened);
$regex = str_replace('/','',$regex);
if (preg_match_all( '/'.$regex.'/', "I'm waiting for a responce from", $matches)) {
echo '<pre>';
print_r($matches[0]);
}
You have to remove the slashes from your regex and also the i parameter in order to make it work. That was the reason it was breaking.
A really nice tool to actually validate your regex is this :
https://regexr.com/
I always use that when i have to make a bigger than usual regular expression.
The output of the above code is :
Array
(
[0] => I
[1] => waiting
[2] => responce
[3] => from
)
There are a few adjustments to make with your $tok array.
To remove the error, you need to remove the pattern delimiters and pattern modifiers from each array element.
None of the capture grouping is necessary, in fact, it will lead to a higher step count and create unnecessary output array bloat.
Whatever your intention is with (?=\D*\d), it needs a rethink. If there is a number anywhere in your input string, you are potentially going to generate lots of empty elements which surely can't have any benefit for your project. Look at what happens when I put a space then 1 after from in your input string.
Here is my recommendation: (PHP Demo)
$toks = [
'\bwaiting\b',
'^\w+',
'\bresponse\b',
'\bfrom\b',
'\|',
'\bto\b',
];
$pattern = '/' . implode('|', $toks) . '/i';
var_export(preg_match_all($pattern, "I'm waiting for a response from", $out) ? $out[0] : null);
Output:
array (
0 => 'I',
1 => 'waiting',
2 => 'response',
3 => 'from',
)

How to split negative and positive values in string using php?

I have a string variable in php, with look like "0+1.65+0.002-23.9", and I want to split in their individual values.
Ex:
0
1.65
0.002
-23.9
I Try to do with:
$keys = preg_split("/^[+-]?\\d+(\\.\\d+)?$/", $data);
but not work I expected.
Can anyone help me out? Thanks a lot in advance.
Like this:
$yourstring = "0+1.65+0.002-23.9";
$regex = '~\+|(?=-)~';
$splits = preg_split($regex, $yourstring);
print_r($splits);
Output (see live php demo):
[0] => 0
[1] => 1.65
[2] => 0.002
[3] => -23.9
Explanation
Our regex is +|(?=-). We will split on whatever it matches
It matches +, OR |...
the lookahead (?=-) matches a position where the next character is a -, allowing us to keep the -
Then we split!
Option 2 if you decide you also want to keep the + Character
(?=[+-])
This regex is one lookahead that asserts that the next position is either a plus or a minus. For my sense of esthetics it's quite a nice solution to look at. :)
Output (see online demo):
[0] => 0
[1] => +1.65
[2] => +0.002
[3] => -23.9
Reference
Lookahead and Lookbehind Zero-Length Assertions
Mastering Lookahead and Lookbehind
You could try this
$data = ' 0 1.65 0.002 -23.9';
$t = str_replace( array(' ', ' -'), array(',',',-'), trim($data) );
$ta = explode(',', $t);
print_r($ta);
Which gives you an array containing each field like so:
Array
(
[0] => 0
[1] => 1.65
[2] => 0.002
[3] => -23.9
)
RE: Your comment: The originals values are in a string variable only separated for a sign possitive or negative
$data = ' 0+1.65+0.002-23.9 ';
$t = str_replace( array('-', '+'), array(',-',',+'), trim($data) );
$ta = explode(',', $t);
print_r($ta);
which gives a similiar answer but with the correct inputs and outputs
Array
(
[0] => 0
[1] => +1.65
[2] => +0.002
[3] => -23.9
)

Strange behavior of preg_match_all php

I have a very long string of html. From this string I want to parse pairs of rus and eng names of cities. Example of this string is:
$html = '
Абакан
Хакасия республика
Абан
Красноярский край
Абатский
Тюменская область
';
My code is:
$subject = $this->html;
$pattern = '/<a href="([\/a-zA-Z0-9-"]*)">([а-яА-Я]*)/';
preg_match_all($pattern, $subject, $matches);
For trying I use regexer . You can see it here http://regexr.com/399co
On the test used global modifier - /g
Because of in PHP we can't use /g modifier I use preg_match_all function. But result of preg_match_all is very strange:
Array
(
[0] => Array
(
[0] => <a href="/forecasts5000/russia/republic-khakassia/abakan">Абакан
[1] => <a href="/forecasts5000/russia/krasnoyarsk-territory/aban">Абан
[2] => <a href="/forecasts5000/russia/tyumen-area/abatskij">Аба�
[3] => <a href="/forecasts5000/russia/arkhangelsk-area/abramovskij-ma">Аб�
)
[1] => Array
(
[0] => /forecasts5000/russia/republic-khakassia/abakan
[1] => /forecasts5000/russia/krasnoyarsk-territory/aban
[2] => /forecasts5000/russia/tyumen-area/abatskij
[3] => /forecasts5000/russia/arkhangelsk-area/abramovskij-ma
)
[2] => Array
(
[0] => Абакан
[1] => Абан
[2] => Аба�
[3] => Аб�
)
)
First of all - it found only first match (but I need to get array with all matches)
The second - result is very strange for me. I want to get the next result:
pairs of /forecasts5000/russia/republic-khakassia/abakan and Абакан
What do I do wrong?
Element 0 of the result is an array of each of the full matches of the regexp. Element 1 is an array of all the matches for capture group 1, element 2 contains capture group 2, and so on.
You can invert this by using the PREG_SET_ORDER flag. Then element 0 will contain all the results from the first match, element 1 will contain all the results from the second match, and so on. Within each of these, [0] will be the full match, and the remaining elements will be the capture groups.
If you use this option, you can then get the information you want with:
foreach ($matches as $match) {
$url = $match[1];
$text = $match[2];
// Do something with $url and $text
}
You can also use T-Regx library which has separate methods for each case :)
pattern('<a href="([/a-zA-Z0-9-"]*)">([а-яА-Я]*)')
->match($this->html)
->forEach(function (Match $match) {
$match = $match->text();
$group = $match->group(1);
echo "Match $match with group $group"
});
I also has automatic delimiters

capturing group under capturing group?

Is possible to capturing group under capturing group so i can have an array like that
regex = (asd1).(lol1),(asd2).(asd2)
string = asd1.lol1,asd2.lol2
return_array[0]=>group[0]='asd1';
return_array[0]=>group[1]='lol1';
return_array[1]=>group[0]='asd2';
return_array[1]=>group[1]='lol2';
While using regular expressions can get what you want, you could also use strtok() to iterate through what seems to simply be comma separated sets:
$results = array();
$str = 'asd1.lol1,asd2.lol2';
$token = strtok($str, ',');
while ($token !== false) {
$results[] = explode('.', $token, 2);
$token = strtok(',');
}
Output:
Array
(
[0] => Array
(
[0] => asd1
[1] => lol1
)
[1] => Array
(
[0] => asd2
[1] => lol2
)
)
With regular expressions your pattern needs to only include the two terms surrounding a period, i.e.:
$pattern = '/(?<=^|,)(\w+)\.(\w+)/';
preg_match_all($pattern, $str, $result, PREG_SET_ORDER);
The (?<=^|,) is a look-behind assertion; it makes sure to only match what comes after if preceded by either the start of your search string or a comma, but it doesn't "consume" anything.
Output:
Array
(
[0] => Array
(
[0] => asd1.lol1
[1] => asd1
[2] => lol1
)
[1] => Array
(
[0] => asd2.lol2
[1] => asd2
[2] => lol2
)
)
You're probably looking for preg_match_all.
$regex = '/^((\w+)\.(\w+)),((\w+)\.(\w+))$/';
$string = 'asd1.lol1,asd2.lol2';
preg_match_all($regex, $string, $matches);
This function will create a 2-dimensional array, where the first dimension represents the matched groups (i.e. the parentheses, 0 contains the whole matched string though) and each have subarrays to all the matched lines (only 1 in this case).
[0] => ("asd1.lol1,asd2.lol2") // a view of $matches
[1] => ("asd1.lol1")
[2] => ("asd1")
[3] => ("lol1")
[4] => ("asd2.lol2")
[5] => ("asd2")
[6] => ("lol2")
Your best bet to have groups is to process the first dimension of the array that you want and to then process them further, i.e. get "asd1.lol1" from 1 and 4 and then process these further into asd1 and lol1.
You wouldn't need as many parentheses in your first run:
$regex = '/^(\w+\.\w+),(\w+\.\w+)$/';
will yield:
[0] => ("asd1.lol1,asd2.lol2")
[1] => ("asd1.lol1")
[2] => ("asd2.lol2")
Then you can split the array in 1 and 2 into more granular values.
Flags can be set to preg_match_all to order the output differently. Particularly, PREG_SET_ORDER allows you to have all matched instances in the same subarray. This is of little importance if you're only processing one string, but if you're matching a pattern in a text, it might be more convenient to have all info about one match in $matches[0], and so forth.
Note that if you're just separating a string by comma and then by any periods, you might not need regular expressions and could conveniently use explode() as so:
$string = 'asd1.lol1,asd2.lol2';
$matches = explode(',', $string);
foreach($matches as &$match) {
$match = explode('.', $match);
}
This will give you exactly what you want, but do note that you don't have as much control over the process as with regular expressions – for instance, asd1.lol1.lmao,asd2.lol2.rofl.hehe will also work and they'll produce bigger arrays than you may want. You can check with count() on the size of the subarray and handle the cases when the array isn't of the appropriate size, though. I still believe that's more comfortable than using regular expressions.

Nested pattern matching with preg_match_all (Regex and PHP)

I'm working with text data that contains special flags in the form of "{X}" or "{XX}" where X could be any alphanumeric character. Special meaning is assigned to these flags when they are adjacent or when they are separated. I need a regex which will match adjacent flags AND separate each flag in the group.
For Example, given the following input:
{B}{R}: Target player loses 1 life.
{W}{G}{U}: Target player gains 5 life.
The output should be approximate:
("{B}{R}",
"{W}{G}{U}")
("{B}",
"{R}")
("{W}",
"{G}",
"{U}")
My PHP code is returning the adjacents array properly, but the split array contains only the last matching flag in each group:
$input = '{B}{R}: Target player loses 1 life.
{W}{G}{U}: Target player gains 5 life.';
$pattern = '#((\{[a-zA-Z0-9]{1,2}})+)#';
preg_match_all($pattern, $input, $results);
print_r($results);
Output:
Array
(
[0] => Array
(
[0] => {B}{R}
[1] => {W}{G}{U}
)
[1] => Array
(
[0] => {B}{R}
[1] => {W}{G}{U}
)
[2] => Array
(
[0] => {R}
[1] => {U}
)
)
Thanks for any help!
unset($results[1]);
foreach($results[0] AS $match){
preg_match_all('/\{[a-zA-Z0-9]{1,2}}/', $match, $r);
$results[] = $r[0];
}
That's the only way I know of to create your Required datastructure. Though, a preg_split would work as well:
unset($results[1]);
foreach($results[0] AS $match)
$results[] = preg_split('/(?<=})(?=\{)/', $match);

Categories