How to split repeating pattern? - php

I have a string that repeats a pattern. I have a regular expression that matches the pattern, but I would like to split them instead.
$target = 'a1v33a33v55a2v43';
I would like to split them into a1v33, a33v55, and a2v43. Basically, I want to split the string into an array of ['a1v33', 'a33v55', 'a2v43'].
I've tried the following code, but it only matches the pattern. How can I split them instead?
$target = 'a1v33a33v55a2v43';
$pattern = '/(a[0-9]+v[0-9]+)*$/im';
preg_match($pattern, $target, $match);
echo '<pre>';
print_r($match);

You can use preg_split too:
$result = preg_split('~(?=a)~i', $target, -1, PREG_SPLIT_NO_EMPTY);
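As a quick check of the preg_split approach, here is a minimal sketch that runs it on the string from the question. It assumes, as in the example data, that the letter a only ever appears at the start of a token; the variable name $parts is just illustrative.

```php
<?php
$target = 'a1v33a33v55a2v43';

// Split at every zero-width position that is followed by "a".
// PREG_SPLIT_NO_EMPTY drops any empty piece produced at the start of the string.
$parts = preg_split('~(?=a)~i', $target, -1, PREG_SPLIT_NO_EMPTY);

print_r($parts);
```

If the input may contain a stray a elsewhere, a stricter lookahead such as (?=a[0-9]+v[0-9]+) would probably be a safer choice.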

Use preg_match_all with '/a[0-9]+v[0-9]+/i':
$target = 'a1v33a33v55a2v43';
$pattern = '/a[0-9]+v[0-9]+/i';
preg_match_all($pattern, $target, $match);
print_r($match);
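The /(a[0-9]+v[0-9]+)*$/im pattern matches zero or more consecutive occurrences of a[0-9]+v[0-9]+ up to the end of the line/string ($). Because a repeated capturing group keeps only its last iteration, preg_match returns the whole string plus the final token. Once the quantified group and the end-of-line/string anchor are removed, preg_match_all can match the individual tokens.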
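To make the difference concrete, the sketch below runs both patterns on the same input. The variable names $single and $all are only illustrative.

```php
<?php
$target = 'a1v33a33v55a2v43';

// Original pattern: one overall match; the repeated group keeps only its
// last iteration, so index 1 holds just the final token.
preg_match('/(a[0-9]+v[0-9]+)*$/im', $target, $single);

// Simplified pattern: preg_match_all collects every token in $all[0].
preg_match_all('/a[0-9]+v[0-9]+/i', $target, $all);

print_r($single);
print_r($all[0]);
```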

Related

Perform word match in a String until another word is found in the String

I'm using preg_match_all to perform all match in a String
$match_this = '/cola/';
$sentence = 'cola this cola is a nice cola';
if (preg_match_all($match_this, $sentence, $matches)) {
    echo 'match found' . '<br>';
    print_r($matches[0]);
}
But I want the matching to stop when the word nice is encountered, so that the $matches array doesn't store any matches found after it.
How can the code be modified to do this?
'cola' can occur multiple times before 'nice'. This is just
an example sentence. Again, 'cola' and 'nice' are just example words.
The words to match and where to stop are randomly picked from a
database. This code is for a word game.
You could use positive lookahead:
$match_this = '\bcola\d\b';
$until = '\bnice\b';
$sentence = 'cola1 this cola2 is a nice cola3';
if (preg_match_all("/$match_this(?=.*$until)/", $sentence, $matches)) {
    print_r($matches[0]);
}
Output:
Array
(
[0] => cola1
[1] => cola2
)
I've added a number at the end of each cola to show that it matches only the ones that come before the word nice.
I've also added word boundaries around the words to match.
Finally the code is:
$match_this = '\bcola\b';
$until = '\bnice\b';
$sentence = 'cola this cola is a nice cola';
if (preg_match_all("/$match_this(?=.*$until)/", $sentence, $matches)) {
    print_r($matches[0]);
}
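Since the question says the words come from a database, it is probably worth escaping them before building the pattern. The sketch below wraps the lookahead approach in a helper; the function name matchesBefore is hypothetical, and it assumes the words start and end with word characters so that \b behaves as expected.

```php
<?php
// Hypothetical helper: return every occurrence of $word that still has
// $stop somewhere after it on the same line.
function matchesBefore(string $word, string $stop, string $sentence): array
{
    // preg_quote keeps characters such as "." or "?" in the words literal.
    $w = preg_quote($word, '/');
    $s = preg_quote($stop, '/');

    preg_match_all('/\b' . $w . '\b(?=.*\b' . $s . '\b)/', $sentence, $m);

    return $m[0];
}

print_r(matchesBefore('cola', 'nice', 'cola this cola is a nice cola'));
```

Without preg_quote, a stop word such as n.ce would also match nice, because the dot would act as a wildcard.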
First get the offset of nice, then run the preg match on the substring before it.
$sentence = 'cola this cola is a nice cola';
$match_this = '/nice/';
if (preg_match($match_this, $sentence, $matches, PREG_OFFSET_CAPTURE)) {
    $niceOffset = $matches[0][1];
    $match_this = '/cola/';
    if (preg_match_all($match_this, substr($sentence, 0, $niceOffset), $matches2, PREG_OFFSET_CAPTURE)) {
        var_dump($matches2);
    }
}
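The two answers behave differently when the stop word occurs more than once: the lookahead keeps every match that has a stop word anywhere after it, while the offset approach cuts at the first stop word. A small sketch of the difference, using a made-up sentence and illustrative variable names:

```php
<?php
$sentence = 'cola nice cola nice cola';

// Lookahead variant: keeps each "cola" that has a "nice" somewhere after it.
preg_match_all('/\bcola\b(?=.*\bnice\b)/', $sentence, $viaLookahead);

// Offset variant: cut the sentence at the first "nice", then match.
$head = $sentence;
if (preg_match('/\bnice\b/', $sentence, $stop, PREG_OFFSET_CAPTURE)) {
    $head = substr($sentence, 0, $stop[0][1]);
}
preg_match_all('/\bcola\b/', $head, $viaOffset);

print_r($viaLookahead[0]);
print_r($viaOffset[0]);
```

For a game that should stop at the first stop word, the offset variant seems to be the closer fit.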

Using preg_match to match this specific pattern

I have a preg_match matching for specific patterns, but it's just not matching the pattern I'm trying to match. What am I doing wrong?
<?php
$string = "tell me about cats";
preg_match("~\b(?:tell me about|you know(?: of| about)?|what do you think(?: of| about)?|(?:what|who) is|(?:whats|whos)) ((?:[a-z]+ ){1,2})$~", $string, $match);
print_r($match);
?>
Expected Result:
array(0 => tell me about 1 => cats)
Actual Result:
array()
There is a mandatory space inside the last group. Because nothing follows cats in the input, that space cannot match and the entire regex fails:
((?:[a-z]+ ){1,2})
          ^
          here
Also, the first part has no capturing group (it is non-capturing because of (?:..)). Make it a capturing group and make the space optional using ? (if you want to capture at most two words):
\b(tell me about|you know(?: of| about)?|what do you think(?: of| about)?|(?:what|who) is|(?:whats|whos)) ((?:[a-z]+ ?){1,2})$
PHP Code
$string = "tell me about cats";
preg_match("~\b(tell me about|you know(?: of| about)?|what do you think(?: of| about)?|(?:what|who) is|(?:whats|whos)) ((?:[a-z]+ ?){1,2})$~", $string, $match);
print_r($match);
Note: $match[1] and $match[2] will contain your result. $match[0] always holds the entire text matched by the regex.
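As a rough check, the corrected pattern can be run against a few inputs. The extra sample sentences below are made up for illustration and are not from the question.

```php
<?php
$pattern = '~\b(tell me about|you know(?: of| about)?|what do you think(?: of| about)?'
    . '|(?:what|who) is|(?:whats|whos)) ((?:[a-z]+ ?){1,2})$~';

foreach (['tell me about cats', 'who is alan turing', 'hello there'] as $input) {
    if (preg_match($pattern, $input, $m)) {
        // $m[1] is the question phrase, $m[2] the one or two trailing words.
        printf("%s => [%s] [%s]\n", $input, $m[1], $m[2]);
    } else {
        printf("%s => no match\n", $input);
    }
}
```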

Regex replace recursive with one pattern

I want to replace
$array[key][key]...[key]
with
$array['key']['key']...['key']
I only managed to add quotes to the first key of the array:
\$([a-zA-Z0-9]+)\[([a-zA-Z_-]+[0-9]*)\] replace to \$\1\[\'\2\3\'\]
You can use a regex that matches consecutively rather than recursively:
$re = '/(\$\w+|(?!^)\G)\[([^]]*)\]/';
$str = "\$array[key][key][key]";
$subst = "$1['$2']";
$result = preg_replace($re, $subst, $str);
echo $result;
The regex (\$\w+|(?!^)\G)\[([^]]*)\] matches every square-bracketed substring (with \[([^]]*)\], capturing its contents into Group 2) that either comes right after a '$' followed by word characters (the \$\w+ part) or immediately follows the previous match (thanks to (?!^)\G).
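The effect of \G is easiest to see on input that also contains brackets which do not belong to a variable. A minimal sketch, with a made-up input string:

```php
<?php
$re = '/(\$\w+|(?!^)\G)\[([^]]*)\]/';
$str = '$array[key][other][x] and plain[text]';

// Only brackets that directly continue a $variable chain are quoted;
// "plain[text]" is left alone because \G does not match there.
$result = preg_replace($re, '$1[\'$2\']', $str);

echo $result, "\n";
```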
Shouldn't need anything fancy, just get the stuff you need then
replace in a callback.
Untested:
$new_input = preg_replace_callback(
    '/(?i)\$[a-z]+\K(?:\[[^\[\]]*\])+/',
    function ($matches) {
        return preg_replace('/(\[)|(\])/', "$1'$2", $matches[0]);
    },
    $input
);
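Since the snippet above is marked as untested, here is a runnable sketch of the same callback idea. It swaps the inner preg_replace for str_replace, which is a substitution of mine rather than the answerer's code, and it uses a sample $input.

```php
<?php
$input = '$array[key][key][key]';

// \K drops the variable name from the match, so the callback only
// receives (and replaces) the chain of bracketed keys.
$new_input = preg_replace_callback(
    '/(?i)\$[a-z]+\K(?:\[[^\[\]]*\])+/',
    function (array $matches): string {
        // Turn "[" into "['" and "]" into "']".
        return str_replace(['[', ']'], ["['", "']"], $matches[0]);
    },
    $input
);

echo $new_input, "\n";
```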

PHP preg_match matches

I'm trying to get all the matches from string:
$string = '[RAND_15]d4trg[RAND_23]';
with preg_match like this:
$match = array();
preg_match('#\[RAND_.*]#', $string, $match);
but after that $match array looks like this:
Array ( [0] => [RAND_15]d4trg[RAND_23] )
What should I do to get both occurrences as 2 separate elements in $match array? I would like to get result like this:
$match[0] = [RAND_15];
$match[1] = [RAND_23];
Use ...
$match = array();
preg_match_all('#\[RAND_.*?]#', $string, $match);
... instead. The ? after the quantifier makes it lazy, so it matches the shortest possible substring. Without it the quantifier is greedy and covers as much as it can, and technically, [RAND_15]d4trg[RAND_23] does match the pattern.
Another way is to restrict the set of characters to match with a negated character class:
$match = array();
preg_match_all('#\[RAND_[^]]*]#', $string, $match);
This way we won't have to turn the quantifier into a lazy one, as the [^]] character class will stop matching at the first ] symbol.
Still, to catch all the matches you should use preg_match_all instead of preg_match.
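The three variants can be compared side by side. Note that preg_match_all puts the full matches in index 0 of the result, so the two occurrences end up in $match[0][0] and $match[0][1] rather than $match[0] and $match[1]. The variable names below are illustrative.

```php
<?php
$string = '[RAND_15]d4trg[RAND_23]';

// Greedy: .* runs to the last "]" and yields one long match.
preg_match_all('#\[RAND_.*]#', $string, $greedy);

// Lazy: .*? stops at the first "]" each time.
preg_match_all('#\[RAND_.*?]#', $string, $lazy);

// Negated class: [^]]* cannot cross a "]" at all.
preg_match_all('#\[RAND_[^]]*]#', $string, $negated);

print_r($greedy[0]);
print_r($lazy[0]);
print_r($negated[0]);
```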

How to match full words?

I use a simple preg_match_all to find the occurrence of a list of words in a text.
$pattern = '/(word1|word2|word3)/';
$num_found = preg_match_all( $pattern, $string, $matches );
But this also matches substrings of longer words, such as abcword123. I need it to find word1, word2 and word3 only when they occur as full words. Note that this doesn't always mean that they're separated by spaces on both sides; it could be a comma, semicolon, period, exclamation mark, question mark, or other punctuation.
If $string is expected to be exactly "word1", "word2" or "word3", then in_array is the simpler choice. Regexes are powerful, but they cost more CPU than a plain comparison, so avoid them when a simple lookup does the job:
$words = array ("word1", "word2", "word3" );
$found = in_array ($string, $words);
See PHP: in_array - Manual for more information on in_array.
If you want to do the same whole-string check with a regex, try:
$pattern = '/^(word1|word2|word3)$/';
$num_found = preg_match_all( $pattern, $string, $matches );
And if you want to find the words inside a longer text such as "this statement has word1 in it", then use \b:
$pattern = '/\b(word1|word2|word3)\b/';
$num_found = preg_match_all( $pattern, $string, $matches );
For more, see PHP: Escape sequences - Manual and search for \b.
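The three options in this answer do different things, which a short sketch can show. The sample text and variable names are made up for illustration.

```php
<?php
$words = ['word1', 'word2', 'word3'];
$text = 'abcword123, word1; word2! xword3 word3?';

// Exact comparison: true only when the whole string is one of the words.
$exactHit = in_array('word2', $words, true);
$exactMiss = in_array($text, $words, true);

// Anchored regex: the same whole-string check, so the longer text fails.
$anchored = preg_match('/^(word1|word2|word3)$/', $text);

// Word boundaries: finds the words inside the text, whatever punctuation
// surrounds them, and skips abcword123 and xword3.
$numFound = preg_match_all('/\b(word1|word2|word3)\b/', $text, $matches);

var_dump($exactHit, $exactMiss, $anchored, $numFound);
print_r($matches[0]);
```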
Try:
$pattern = '/\b(word1|word2|word3)\b/';
$num_found = preg_match_all( $pattern, $string, $matches );
You can use \b to match word boundaries. So you want to use /\b(word1|word2|word3)\b/ as your regex.
