PHP preg_match matches - php

I'm trying to get all the matches from this string:
$string = '[RAND_15]d4trg[RAND_23]';
with preg_match like this:
$match = array();
preg_match('#\[RAND_.*]#', $string, $match);
but after that the $match array looks like this:
Array ( [0] => [RAND_15]d4trg[RAND_23] )
What should I do to get both occurrences as 2 separate elements in the $match array? I would like to get a result like this:
$match[0] = [RAND_15];
$match[1] = [RAND_23];

Use ...
$match = array();
preg_match_all('#\[RAND_.*?]#', $string, $match);
... instead. The ? after the quantifier makes it lazy, so it matches the shortest possible substring. Without it, the quantifier is greedy and tries to cover as much as possible, and technically [RAND_15]d4trg[RAND_23] does match the pattern.
Another way is to restrict the set of characters to match with a negated character class:
$match = array();
preg_match_all('#\[RAND_[^]]*]#', $string, $match);
This way we don't have to turn the quantifier into a lazy one, as the [^]] character class stops matching at the first ] symbol.
Either way, to catch all the matches you should use preg_match_all instead of preg_match. The full matches then end up in $match[0].
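To see the difference side by side, here is a minimal sketch that runs the greedy, lazy and negated-class patterns against the string from the question. The variable names $greedy, $lazy and $negated are only illustrative. By default preg_match_all fills its third argument with a nested array, so the full matches are read from index 0.

```php
<?php
$string = '[RAND_15]d4trg[RAND_23]';

// Greedy: .* runs to the last ] in the string, so there is one long match.
preg_match_all('#\[RAND_.*]#', $string, $greedy);

// Lazy: .*? stops at the first ] it can, so each token matches separately.
preg_match_all('#\[RAND_.*?]#', $string, $lazy);

// Negated class: [^]]* cannot cross a ], so no lazy quantifier is needed.
preg_match_all('#\[RAND_[^]]*]#', $string, $negated);

print_r($greedy[0]);  // one element: the whole string
print_r($lazy[0]);    // two elements: [RAND_15] and [RAND_23]
print_r($negated[0]); // same two elements as the lazy version
```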

Related

Find a pattern in a string

I am trying to detect a string inside the following pattern: [url('example')] in order to replace the value.
I thought of using a regex to get the strings inside the square brackets and then another to get the text inside the parentheses, but I am not sure if that's the best way to do it.
//detect all strings inside brackets
preg_match_all("/\[([^\]]*)\]/", $text, $matches);
//loop through results to get the string inside the parentheses
preg_match('#\((.*?)\)#', $match, $matches);
To match the string between the parentheses, you could use a single pattern that returns it as the whole match:
\[url\(\K[^()]+(?=\)])
The pattern matches:
\[url\( Match [url(
\K Reset the start of the reported match, so what was matched so far is left out of the result
[^()]+ Match 1+ chars other than ( and )
(?=\)]) Positive lookahead, assert )] to the right
For example
$re = "/\[url\(\K[^()]+(?=\)])/";
$text = "[url('example')]";
if (preg_match($re, $text, $match)) {
var_dump($match[0]);
}
Output
string(9) "'example'"
Another option could be using a capture group. You can place the ' inside or outside the group to capture the value:
\[url\(([^()]+)\)]
For example
$re = "/\[url\(([^()]+)\)]/";
$text = "[url('example')]";
if (preg_match($re, $text, $match)) {
var_dump($match[1]);
}
Output
string(9) "'example'"
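As a rough illustration of placing the ' outside the group, the sketch below captures the bare value and then swaps it with preg_replace, since the question mentions replacing it. The second pattern and the 'replacement' value are assumptions made for this example, not part of the answer above.

```php
<?php
$text = "[url('example')]";

// The quotes sit outside the group, so group 1 holds only the bare value.
$re = "/\[url\('([^']+)'\)]/";

if (preg_match($re, $text, $match)) {
    var_dump($match[1]); // string(7) "example"
}

// Replacing the value: keep the wrapper in groups 1 and 2, swap the middle.
// 'replacement' is a placeholder value chosen for this example.
$result = preg_replace("/(\[url\(')[^']+('\)])/", '${1}replacement${2}', $text);
echo $result, "\n"; // [url('replacement')]
```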

How to split repeating pattern?

I have a string that repeats a pattern. I have a regular expression that matches the pattern, but I would like to split them instead.
$target = 'a1v33a33v55a2v43';
I would like to split them into a1v33, a33v55, and a2v43. Basically, I want to split the string into an array of ['a1v33', 'a33v55', 'a2v43'].
I've tried the following code, but it only matches the pattern. How can I split them instead?
$target = 'a1v33a33v55a2v43';
$pattern = '/(a[0-9]+v[0-9]+)*$/im';
preg_match($pattern, $target, $match);
echo '<pre>';
print_r($match);
You can use preg_split too:
$result = preg_split('~(?=a)~i', $target, -1, PREG_SPLIT_NO_EMPTY);
Use preg_match_all with '/a[0-9]+v[0-9]+/i':
$target = 'a1v33a33v55a2v43';
$pattern = '/a[0-9]+v[0-9]+/i';
preg_match_all($pattern, $target, $match);
print_r($match);
The /(a[0-9]+v[0-9]+)*$/im pattern matches zero or more consecutive substrings of the form a[0-9]+v[0-9]+ up to the end of the line/string ($), so the whole string comes back as a single match and the capturing group keeps only its last repetition. Once the quantified grouping and the end-of-line/string anchor are removed, the pattern matches the individual tokens.
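The three approaches can be compared on the string from the question with a short sketch like the one below. The variable names $single, $all and $split are only illustrative.

```php
<?php
$target = 'a1v33a33v55a2v43';

// Original pattern: the repeated group consumes the whole string in one match,
// and the capture group keeps only its last repetition.
preg_match('/(a[0-9]+v[0-9]+)*$/im', $target, $single);
print_r($single); // [0] => a1v33a33v55a2v43, [1] => a2v43

// Without the outer quantifier and anchor, each token is its own match.
preg_match_all('/a[0-9]+v[0-9]+/i', $target, $all);
print_r($all[0]); // a1v33, a33v55, a2v43

// Splitting before every "a" gives the same list for this input.
$split = preg_split('~(?=a)~i', $target, -1, PREG_SPLIT_NO_EMPTY);
print_r($split);
```

The preg_split version only looks at the letter a, so it suits input that is already known to follow the pattern. The preg_match_all version also skips anything that does not match.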

Find words from the array in the text received through file_get_contents

I fetch a remote page:
$page = file_get_contents ('http://sayt.ru/');
There is an array of words:
$word = array ("word", "second");
How do I count how many times the words from the array occur in the text of the page?
I started digging in this direction:
$matches = array ();
$count_words = preg_match_all ('/'. $word. '/ i',$page, $matches);
But that is clearly the wrong direction, because the count is always zero. Also, preg_match_all searches for one word here, not the entire array.
You have to either check for each word in the array, or use a regexp like this:
$searchWords = array_map(function($w){ return preg_quote($w,'/'); }, $word);
$search = implode('|', $searchWords);
$count_words = preg_match_all('/\b(?:'.$search.')\b/i', $page, $matches);
I added a few modifications for better results: every word is escaped with preg_quote so it cannot break the expression, and word boundaries (\b) make each word match only as a whole word, not as part of another word such as swords.
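A self-contained sketch of the same idea follows. A hard-coded string stands in for the file_get_contents result so it runs without network access. The sample text and the $perWord tally are assumptions added for illustration.

```php
<?php
// Hypothetical page text standing in for the file_get_contents() result.
$page = 'A word, a second word, and some swords. Second thoughts.';
$word = array('word', 'second');

// Escape each word so regex metacharacters in it are treated literally.
$searchWords = array_map(function ($w) {
    return preg_quote($w, '/');
}, $word);

// Total number of whole-word, case-insensitive hits across all words.
$search = implode('|', $searchWords);
$count_words = preg_match_all('/\b(?:' . $search . ')\b/i', $page, $matches);
echo $count_words, "\n"; // 4 ("swords" is not counted)

// Per-word tally, lowercased so "Second" and "second" count together.
$perWord = array_count_values(array_map('strtolower', $matches[0]));
print_r($perWord); // word => 2, second => 2
```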

Regexp in php: how do I filter dynamic strings like abc/123/...?

I am trying to filter out all characters before the first / sign. I have strings like
ABC/123/...
and I am trying to filter out ABC, 123 and ... into separate strings. I have almost succeeded in parsing the first letters before the / sign, except that the / sign is part of the match, which I don't want.
<?php
$string = "ABC/123/...";
$pattern = '/.*?\//';
preg_match($pattern, $string, $matches, PREG_OFFSET_CAPTURE);
print_r($matches);
?>
The letters before the first / can differ both in length and characters, so a string could also look like EEEE/1111/aaaa.
If you are trying to split the string using / as the delimiter, you can use explode.
$array = explode("/", $string);
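And if you are looking only for the first element, you can use array_shift. It takes its argument by reference, so pass it a variable rather than the result of explode directly.
$parts = explode("/", $string);
$first = array_shift($parts);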
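For comparison, here is a small sketch of three ways to get the pieces without the / ending up in the result. The strstr call and the ~^[^/]+~ pattern are alternatives added for illustration, not something the answer above prescribes.

```php
<?php
$string = "ABC/123/...";

// All parts, split on the delimiter.
$parts = explode("/", $string);
print_r($parts); // ABC, 123, ...

// Only the part before the first slash: strstr() with true as the third
// argument returns everything before the needle.
$first = strstr($string, "/", true);
echo $first, "\n"; // ABC

// Regex alternative: match only non-slash characters from the start,
// so the "/" can never be part of the match.
preg_match('~^[^/]+~', $string, $m);
echo $m[0], "\n"; // ABC
```

Note that strstr returns false when the string contains no / at all, so check for that if the input may lack the delimiter.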

Return the CSS values from CSS attributes using regex

I have the following regex:
$string = 'font-size:12em;';
$pattern = '#:[A-Za-z0-9.]+;#i';
preg_match($pattern, $string, $matches);
$matches returns:
Array ( [0] => :12em; )
However, why are the : and ; returned? How can I get it to leave out those characters and only return the CSS value 12em?
Because the first element in that array is the whole match. Use a capturing group and read the second element, $matches[1] (or use lookarounds).
Example:
preg_match('/:\s*(\w[^;}]*?)\s*[;}]/', $string, $matches);
print $matches[1];
Note that things like these will not work in all cases. Comments and more complicated statements could break it.
Example:
/* foo: bar; */
foo: url("bar?q=:x;");
Use this pattern instead:
#(?<=:)[A-Za-z0-9.]+(?=;)#i
The explanation is that (?<=) and (?=) are lookbehind and lookahead groups, respectively. They only assert that the : and ; are there without consuming them, so those characters are not part of your match.
Edit: to handle % signs and other characters:
#(?<=:)[^;]+#i
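The capturing-group and lookaround approaches can be compared with a sketch like this one. The first pattern is the question's own pattern with a group added, and the 'width: 50%;' input is a made-up example to show the broader [^;]+ form.

```php
<?php
$string = 'font-size:12em;';

// Capturing group: the value is in element 1, the whole match in element 0.
preg_match('#:([A-Za-z0-9.]+);#i', $string, $byGroup);
echo $byGroup[1], "\n"; // 12em

// Lookarounds: : and ; are asserted but not consumed, so element 0 is the value.
preg_match('#(?<=:)[A-Za-z0-9.]+(?=;)#i', $string, $byLookaround);
echo $byLookaround[0], "\n"; // 12em

// The broader [^;]+ form also accepts %, spaces and so on; trim() removes
// the leading space that follows the colon in this made-up input.
preg_match('#(?<=:)[^;]+#i', 'width: 50%;', $broad);
echo trim($broad[0]), "\n"; // 50%
```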
