PHP regex match before and after specific word - php

I have a string with data that looks like this:
$string = '
foo=bar
badge_name_foo=foo
bar_badge_name=bar
bar=baz
';
I want to match all *_badge_name and badge_name_* strings.
The regex I'm using is this:
preg_match_all('~(?:(\w+)_)?badge_name(?:_(\w+))?~', $string, $matches, PREG_SET_ORDER);
The result is:
Array
(
[0] => Array
(
[0] => badge_name_foo
[1] =>
[2] => foo
)
[1] => Array
(
[0] => bar_badge_name
[1] => bar
)
)
The *_badge_name case works fine, but for badge_name_* there is always an empty value at index 1. How can I get rid of it with preg_match_all?
The expected result is:
Array
(
[0] => Array
(
[0] => badge_name_foo
[1] => foo
)
[1] => Array
(
[0] => bar_badge_name
[1] => bar
)
)

It seems you need the branch reset feature:
Alternatives inside a branch reset group share the same capturing groups. The syntax is (?|regex) where (?| opens the group and regex is any regular expression. If you don't use any alternation or capturing groups inside the branch reset group, then its special function doesn't come into play. It then acts as a non-capturing group.
Use
(?|(\w+)_badge_name|badge_name_(\w+))
^^^
See the regex demo.
PHP demo:
$re = '/(?|(\w+)_badge_name|badge_name_(\w+))/';
$str = 'foo=bar
badge_name_foo=foo
bar_badge_name=bar
bar=baz';
preg_match_all($re, $str, $matches);
print_r($matches);
Result:
Array
(
[0] => Array
(
[0] => badge_name_foo
[1] => bar_badge_name
)
[1] => Array
(
[0] => foo
[1] => bar
)
)
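To get the per-match layout the question asks for, the same branch reset pattern can be combined with PREG_SET_ORDER. A minimal sketch, reusing the sample data from the question (the variable name $sets is only illustrative):

```php
<?php
// Branch reset: both alternatives store their capture in group 1.
$re = '/(?|(\w+)_badge_name|badge_name_(\w+))/';
$str = 'foo=bar
badge_name_foo=foo
bar_badge_name=bar
bar=baz';

// PREG_SET_ORDER groups the results per match instead of per capture group.
preg_match_all($re, $str, $sets, PREG_SET_ORDER);

foreach ($sets as $set) {
    // $set[0] is the full match, $set[1] the captured prefix or suffix.
    echo $set[0], ' => ', $set[1], "\n";
}
```

Each entry of $sets then holds exactly two elements, so no empty placeholder for an unused group appears.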

Related

Retrieving text outside square brackets in PHP

I need some way of capturing the text outside square brackets. So for example, the following string:
My [ground]name[test]Jhon[random]petor [shorts].
I'm using the preg_match_all expression below, but the result is not what I expected:
preg_match_all("/\[[^\]]*\]/", $text, $matches);
It gives me the text that is inside the square brackets.
Result :
Array (
[0] => [ground]
[1] => [test]
[2] => [random]
[3] => [shorts]
)
Expect Output:
Array (
[0] => [My]
[1] => [name]
[2] => [Jhon]
[3] => [petor]
)
Any help would be great.
You can extend the pattern by adding \K, which discards what has been matched so far, and then use an alternation to match one or more word characters. The bracket branch then produces empty matches, which array_filter removes.
\[[^][]+]\K|\w+
See a regex demo
$re = '/\[[^][]+]\K|\w+/';
$str = 'My [ground]name[test]Jhon[random]petor [shorts].';
preg_match_all($re, $str, $matches);
print_r(array_values(array_filter($matches[0])));
Output
Array
(
[0] => My
[1] => name
[2] => Jhon
[3] => petor
)
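One caveat with the filtering step: array_filter without a callback drops every falsy value, which includes the string "0". A minimal sketch with a strict callback, assuming a made-up input string that contains a "0" word:

```php
<?php
$re = '/\[[^][]+]\K|\w+/';
// Hypothetical input, chosen only to include the word "0".
$str = 'Item 0 [ground]name[test]';

preg_match_all($re, $str, $matches);

// Remove only the empty strings left behind by the \K branch;
// a plain array_filter($matches[0]) would also discard "0".
$words = array_values(array_filter($matches[0], function ($m) {
    return $m !== '';
}));

print_r($words);
```

If the input can never contain a bare "0", the shorter array_filter($matches[0]) from the answer behaves the same.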

PHP - Preg Match All - WordPress multiple shortcodes with multiple parameters

I'm trying to find a regex capable of capturing the content of shortcodes produced in WordPress.
My short codes have the following structure:
[shortcode name param1="value1" param2="value2" param3="value3"]
The number of parameters is variable.
I need to capture the shortcode name, the parameter name and its value.
The closest result I have achieved is with this:
/(?:\[(.*?)|\G(?!^))(?=[^][]*])\h+([^\s=]+)="([^\s"]+)"/
If I have the following content in the same string:
[specs product="test" category="body"]
[pricelist keyword="216"]
[specs product="test2" category="network"]
I get this:
0=>array(
0=>[specs product="test"
1=> category="body"
2=>[pricelist keyword="216"
3=>[specs product="test2"
4=> category="network")
1=>array(
0=>specs
1=>
2=>pricelist
3=>specs
4=>)
2=>array(
0=>product
1=>category
2=>keyword
3=>product
4=>category)
3=>array(
0=>test
1=>body
2=>216
3=>test2
4=>network)
)
I have tried different regex patterns, but I always end up with the same issue: if a shortcode has more than one parameter, the shortcode name is not captured for the additional parameters.
Do you have any idea how I could achieve this?
Thanks
Laurent
You could make use of the \G anchor with 3 capture groups, where group 1 is the name of the shortcode, and groups 2 and 3 are the key and value of each pair.
Then you can remove the first entry of the array (the full matches) and filter the empty strings out of the remaining entries for groups 1, 2 and 3.
This is a slightly updated pattern:
(?:\[(?=[^][]*])(\w+)|\G(?!^))\h+(\w+)="([^"]+)"
Regex demo | Php demo
Example
// Group 1: shortcode name, group 2: parameter name, group 3: parameter value
$pattern = '/(?:\[(?=[^][]*])(\w+)|\G(?!^))\h+(\w+)="([^"]+)"/';
$strings = [
    '[specs product="test" category="body"]',
    '[pricelist keyword="216"]',
    '[specs product="test2" category="network" key="value"]'
];
foreach ($strings as $s) {
    if (preg_match_all($pattern, $s, $matches)) {
        // Drop the full matches, keep only the capture groups
        unset($matches[0]);
        // Remove the empty strings that group 1 gets on \G continuation matches
        $matches = array_map('array_filter', $matches);
        print_r($matches);
    }
}
Output
Array
(
[1] => Array
(
[0] => specs
)
[2] => Array
(
[0] => product
[1] => category
)
[3] => Array
(
[0] => test
[1] => body
)
)
Array
(
[1] => Array
(
[0] => pricelist
)
[2] => Array
(
[0] => keyword
)
[3] => Array
(
[0] => 216
)
)
Array
(
[1] => Array
(
[0] => specs
)
[2] => Array
(
[0] => product
[1] => category
[2] => key
)
[3] => Array
(
[0] => test2
[1] => network
[2] => value
)
)
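The same pattern can also be run once over content that holds several shortcodes. The following is a rough sketch, not part of the original answer: it assumes the three shortcodes from the question separated by newlines, and the $shortcodes structure (with its 'name' and 'params' keys) is a hypothetical layout chosen for illustration.

```php
<?php
$pattern = '/(?:\[(?=[^][]*])(\w+)|\G(?!^))\h+(\w+)="([^"]+)"/';
$content = '[specs product="test" category="body"]
[pricelist keyword="216"]
[specs product="test2" category="network"]';

$shortcodes = [];
preg_match_all($pattern, $content, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // Group 1 is non-empty only on the match that opens a new shortcode.
    if ($m[1] !== '') {
        $shortcodes[] = ['name' => $m[1], 'params' => []];
    }
    // Continuation matches (via \G) attach to the most recent shortcode.
    $shortcodes[count($shortcodes) - 1]['params'][$m[2]] = $m[3];
}

print_r($shortcodes);
```

Because \G can only continue directly after the previous match, the first match of every shortcode always carries the name, so the last element of $shortcodes exists before a parameter is attached to it.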

Preg_match_all behaving weird

I am new to PHP. I have the code below, and I want to find all keywords enclosed between
'<#' and '#>'
Sample code:
<?php
$subject = "askdbvbaldjbvasdblasdbvl<#2134#>cbkdbskbkabdvb<#213aca4#>";
$pattern = "/(?<=\<\#)(.*?)(?=\#\>)/";
preg_match_all($pattern, $subject, $matches);
echo '<pre>',print_r($matches,true),'</pre>';
?>
Now I am expecting an array of values like:
Array
(
[0] => Array
(
[0] => 2134
[1] => 213aca4
)
)
But I am getting output like:
Array
(
[0] => Array
(
[0] => 2134
[1] => 213aca4
)
[1] => Array
(
[0] => 2134
[1] => 213aca4
)
)
Can anyone tell me why I am getting the second array and how I can get rid of it?
The second array contains the sub-match (the captured group), because your pattern uses a capture group.
Simply remove the parentheses from your regex:
$pattern = "/(?<=\<\#).*?(?=\#\>)/";
Also, the escapes are unnecessary here, since <, # and > are not special characters in this pattern:
$pattern = "/(?<=<#).*?(?=#>)/";
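As a quick check, here is the unescaped pattern run against the subject string from the question; the $keywords variable is an added name for illustration:

```php
<?php
$subject = "askdbvbaldjbvasdblasdbvl<#2134#>cbkdbskbkabdvb<#213aca4#>";

// No capture group: only the full matches (index 0) are returned.
$pattern = '/(?<=<#).*?(?=#>)/';
preg_match_all($pattern, $subject, $matches);

print_r($matches);

// The flat list of keywords is the first (and only) inner array.
$keywords = $matches[0];
```

The lookbehind and lookahead assert the delimiters without consuming them, which is why the delimiters do not appear in the matches.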

preg_match_all into simple array

I have preg_match_all function:
preg_match_all('#<h2>(.*?)</h2>#is', $source, $output, PREG_SET_ORDER);
It works as intended, but the problem is that every item appears twice, inside a large multidimensional array. For example, here it matched all 11 items as intended, but each one shows up twice:
Array
(
[0] => Array
(
[0] => <h2>10. <em>Cruel</em> by St. Vincent</h2>
[1] => 10. <em>Cruel</em> by St. Vincent
)
[1] => Array
(
[0] => <h2>9. <em>Robot Rock</em> by Daft Punk</h2>
[1] => 9. <em>Robot Rock</em> by Daft Punk
)
[2] => Array
(
[0] => <h2>8. <em>Seven Nation Army</em> by the White Stripes</h2>
[1] => 8. <em>Seven Nation Army</em> by the White Stripes
)
[3] => Array
(
[0] => <h2>7. <em>Do You Want To</em> by Franz Ferdinand</h2>
[1] => 7. <em>Do You Want To</em> by Franz Ferdinand
)
[4] => Array
(
[0] => <h2>6. <em>Teenage Dream</em> by Katie Perry</h2>
[1] => 6. <em>Teenage Dream</em> by Katie Perry
)
[5] => Array
(
[0] => <h2>5. <em>Crazy</em> by Gnarls Barkley</h2>
[1] => 5. <em>Crazy</em> by Gnarls Barkley
)
[6] => Array
(
[0] => <h2>4. <em>Kids</em> by MGMT</h2>
[1] => 4. <em>Kids</em> by MGMT
)
[7] => Array
(
[0] => <h2>3. <em>Bad Romance</em> by Lady Gaga</h2>
[1] => 3. <em>Bad Romance</em> by Lady Gaga
)
[8] => Array
(
[0] => <h2>2. <em>Pumped Up Kicks</em> by Foster the People</h2>
[1] => 2. <em>Pumped Up Kicks</em> by Foster the People
)
[9] => Array
(
[0] => <h2>1. <em>Paradise</em> by Coldplay</h2>
[1] => 1. <em>Paradise</em> by Coldplay
)
[10] => Array
(
[0] => <h2>Song That Get Stuck In Your Head YouTube Playlist</h2>
[1] => Song That Get Stuck In Your Head YouTube Playlist
)
)
How can I convert this array into a simple one without the duplicated items? Thank you very much.
You will always get a multidimensional array back. However, you can get close to what you want like this:
if (preg_match_all('#<h2>(.*?)</h2>#is', $source, $output, PREG_PATTERN_ORDER))
$matches = $output[0]; // reduce the multi-dimensional array to the array of full matches only
And if you don't want the submatch at all, use a non-capturing group:
if (preg_match_all('#<h2>(?:.*?)</h2>#is', $source, $output, PREG_PATTERN_ORDER))
$matches = $output[0]; // reduce the multi-dimensional array to the array of full matches only
Note that this call to preg_match_all is using PREG_PATTERN_ORDER instead of PREG_SET_ORDER:
PREG_PATTERN_ORDER Orders results so that $matches[0] is an array of
full pattern matches, $matches[1] is an array of strings matched by
the first parenthesized subpattern, and so on.
PREG_SET_ORDER Orders results so that $matches[0] is an array of first
set of matches, $matches[1] is an array of second set of matches, and
so on.
See: http://php.net/manual/en/function.preg-match-all.php
Use
#<h2>(?:.*?)</h2>#is
as your regex. If you use a non-capturing group (which is what ?: signifies), the group's contents won't show up as a separate entry in the array.
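If the text inside the tags is what is wanted rather than the full match, one option is to keep the capture group and pull out column 1 of the PREG_SET_ORDER result with array_column. A small sketch, with a shortened, hypothetical $source standing in for the real page:

```php
<?php
// Hypothetical sample input; the real $source comes from the question's page.
$source = '<h2>2. <em>Pumped Up Kicks</em> by Foster the People</h2>
<p>text</p>
<h2>1. <em>Paradise</em> by Coldplay</h2>';

preg_match_all('#<h2>(.*?)</h2>#is', $source, $output, PREG_SET_ORDER);

// Take element 1 (the first capture group) from every match set.
$titles = array_column($output, 1);

print_r($titles);
```

With PREG_PATTERN_ORDER the equivalent flat list is simply $output[1].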

PHP/Regex - Grab everything between { and }?

I have some text strings like this
{hello|hi}{there|you}
I want to count the instances of {..anything..}, so in the example above, I would want to return:
hello|hi
there|you
in the matches array created by preg_match_all()
Right now my code looks like:
preg_match_all('/{(.*?)}/', $text,$text_pieces);
And $text_pieces contains:
Array ( [0] => Array ( [0] => {hello|hi} [1] => {there|you} ) [1] => Array ( [0] => hello|hi [1] => there|you ) )
All I need is this:
[0] => hello|hi [1] => there|you
preg_match_all cannot omit the full-text matches, only subpattern matches, so the only solution is to set $text_pieces to $text_pieces[1] after the function call:
if(preg_match_all('/{(.*?)}/', $text,$text_pieces))
{
$text_pieces = $text_pieces[1];
}
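The reassignment can be wrapped in a small helper so callers always receive a flat array. This is only a sketch: the function name braced_pieces is hypothetical, and the pattern uses [^{}]* instead of the question's .*? (a swapped-in variant that cannot run across a closing brace).

```php
<?php
// Hypothetical helper: returns the text inside each {...} group as a flat list.
function braced_pieces(string $text): array
{
    // Braces are escaped for clarity; group 1 holds the inner text.
    if (preg_match_all('/\{([^{}]*)\}/', $text, $m)) {
        return $m[1];
    }
    return []; // no groups found
}

$pieces = braced_pieces('{hello|hi}{there|you}');
print_r($pieces);

// Counting the instances is then a plain count() on the flat array.
echo count($pieces), "\n";
```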
