Array named capture using PHP regex - php

If named capture matches multiple times, is it possible to retrieve all matches?
Example
<?php
$string = 'TextToMatch [some][random][tags] SomeMoreMatches';
$pattern = "!(TextToMatch )(?P<tags>\[.+?\])+( SomeMoreMatches)!";
preg_match($pattern, $string, $matches);
print_r($matches);
Which results in
Array
(
[0] => TextToMatch [some][random][tags] SomeMoreMatches
[1] => TextToMatch
[tags] => [tags]
[2] => [tags]
[3] => SomeMoreMatches
)
Is it possible to get something like
Array
(
[0] => TextToMatch [some][random][tags] SomeMoreMatches
[1] => TextToMatch
[tags] => Array
(
[0] => [some]
[1] => [random]
[2] => [tags]
)
[2] => Array
(
[0] => [some]
[1] => [random]
[2] => [tags]
)
[3] => SomeMoreMatches
)
using only preg_match?
I am aware that I can explode the tags, but I wonder whether this can be done with preg_match (or a similar function) only.
Other example
$input = "Some text [many][more][other][tags][here] and maybe some text here?";
Desirable output
Array
(
[0] => Some text [many][more][other][tags][here] and maybe some text here?
[1] => Some text
[tags] => Array
(
[0] => [many]
[1] => [more]
[2] => [other]
[3] => [tags]
[4] => [here]
)
[2] => Array
(
[0] => [many]
[1] => [more]
[2] => [other]
[3] => [tags]
[4] => [here]
)
[3] => and maybe some text here?
)

You need to use preg_match_all and modify the regular expression:
preg_match_all('/(?P<tags>\[.+?\])/', $string, $matches);
Remove the + after the closing ) so that the pattern describes a single tag; preg_match_all then performs a global search and returns every match.
If you need the specific output that you posted, try:
$string = '[some][random][tags]';
$pattern = "/(?P<tags>\[.+?\])/";
preg_match_all($pattern, $string, $matches);
$matches = [
implode($matches['tags']), end($matches['tags'])
] + $matches;
print_r($matches);
You get:
Array
(
[0] => [some][random][tags]
[1] => [tags]
[tags] => Array
(
[0] => [some]
[1] => [random]
[2] => [tags]
)
)

Since you stated in your comments that you are not actually interested in the leading substring before the set of tags, and that you don't necessarily need the named capture group (I never use them), you only need three steps: remove the leading text, split the string on the space after the set of tags, then split the set of tags into individual tags.
Code: (Demo)
// strstr() trims off the leading substring; the limit of 2 tells explode() to stop after making 2 elements
$split = explode(' ', strstr($input, '['), 2);
var_export($split);
Produces:
array (
0 => '[many][more][other][tags][here]',
1 => 'and maybe some text here?',
)
Then the most direct and clean way to split those square-bracketed tags is to use the zero-width position between each closing bracket (]) and each opening bracket ([). Since only regex can isolate these specific positions as delimiters, I'll suggest preg_split().
// \K releases/forgets the previously matched character(s), so the split happens after each ] without removing it
$split[0] = preg_split('~]\K~', $split[0], -1, PREG_SPLIT_NO_EMPTY);
var_export($split);
This is the final output:
array (
0 =>
array (
0 => '[many]',
1 => '[more]',
2 => '[other]',
3 => '[tags]',
4 => '[here]',
),
1 => 'and maybe some text here?',
)

No. As Wiktor stated (1, 2), it is not possible using only preg_match.
Solution that just works
<?php
$string = 'TextToMatch [some][random][tags] SomeMoreMatches';
$pattern = "!(TextToMatch )(?P<tags>\[.+?\]+)( SomeMoreMatches)!";
preg_match($pattern, $string, $matches);
// Strip the outer brackets, split on "][", then re-wrap each tag in brackets.
$matches[2] = $matches["tags"] = array_map(function($s){return "[$s]";}, explode("][", substr($matches["tags"],1,-1)));
print_r($matches);
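A repeated capture group keeps only its last iteration, which is why preg_match alone returns just [tags]. Below is a minimal sketch of a two-pass workaround, assuming the whole run of tags can be captured as one group first. The helper name matchWithTags and the rewritten pattern are illustrative and not taken from the answers above.

```php
<?php
// Hypothetical helper: capture the whole run of tags once, then split it
// with a second, global pass.
function matchWithTags(string $pattern, string $subject): ?array
{
    if (!preg_match($pattern, $subject, $m)) {
        return null; // no match at all
    }
    // preg_match_all returns every individual [tag] inside the captured run.
    preg_match_all('/\[[^\]]+\]/', $m['tags'], $tags);
    // Overwrite both the named and the numbered entry with the list of tags.
    $m['tags'] = $m[2] = $tags[0];
    return $m;
}

// The named group now wraps the repetition instead of being repeated itself.
$pattern = '!(TextToMatch )(?P<tags>(?:\[[^\]]+\])+)( SomeMoreMatches)!';
$result = matchWithTags($pattern, 'TextToMatch [some][random][tags] SomeMoreMatches');
print_r($result);
```

The design choice is to let the first pattern validate the overall shape of the string and leave the per-tag extraction to the second call.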

Related

remove BBcode tags from whole match

How can I get an array with all parts of the string?
$str = "This is some a text with [b]Bold[/b] and [i]Italic[/i] elements inside";
preg_match_all("/.*(\[.+\]).*/isU",$str,$matches);
print_r($matches);
I obtain only:
Array (
[0] => Array
(
[0] => This is a text with [b]
[1] => Bold[/b]
[2] => and [i]
[3] => Italic[/i]
)
[1] => Array
(
[0] => [b]
[1] => [/b]
[2] => [i]
[3] => [/i]
)
)
without the "elements inside" text at the end.
How about preg_replace?
$new_str = preg_replace("/(\[.*?\])/", "", $str);
http://www.phpliveregex.com/p/fKH
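If the goal is to keep every piece of the string (text and tags) rather than strip the tags, one possible approach is preg_split with PREG_SPLIT_DELIM_CAPTURE. This is a sketch under the assumption that tags never contain a closing bracket; it is not part of the answer above.

```php
<?php
$str = "This is some a text with [b]Bold[/b] and [i]Italic[/i] elements inside";

// Split on each [tag]; the capture group makes preg_split keep the tags
// themselves in the result, and NO_EMPTY drops empty pieces.
$parts = preg_split(
    '/(\[[^\]]+\])/',
    $str,
    -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);
print_r($parts);

// For comparison: removing the tags entirely, as suggested above.
$plain = preg_replace('/\[[^\]]+\]/', '', $str);
echo $plain, PHP_EOL;
```

Unlike the original preg_match_all pattern, the trailing " elements inside" text is kept because splitting never discards the text between delimiters.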

Unexpected preg_match result from pattern with "?:"

I tried this pattern
(?:(\d+)\/|)reports\/(\d+)-([\w-]+).html
with this string (preg_match with modifiers "Axu")
reports/683868-derger-gergewrger.html
and I expected this match result (https://regex101.com/r/kX6yZ5/1):
[1] => 683868
[2] => derger-gergewrger
But I get this:
[1] =>
[2] => 683868
[3] => derger-gergewrger
Why? Where does the empty value at index 1 come from, given that a "?:" group should not capture?
I have two cases:
"reports/683868-derger-gergewrger.html"
"757/reports/683868-derger-gergewrger.html"
In the first case I need two captures, but in the second case I need three.
You can use:
preg_match('~(?:\d+/)?reports/(\d+)-([\w-]+)\.html~',
'reports/683868-derger-gergewrger.html', $m);
print_r($m);
Array
(
[0] => reports/683868-derger-gergewrger.html
[1] => 683868
[2] => derger-gergewrger
)
EDIT: You probably want this behavior:
$s = '757/reports/683868-derger-gergewrger.html';
preg_match('~(?|(\d+)/reports/(\d+)-([\w-]+)\.html|reports/(\d+)-([\w-]+)\.html)~',
$s, $m); print_r($m);
Array
(
[0] => 757/reports/683868-derger-gergewrger.html
[1] => 757
[2] => 683868
[3] => derger-gergewrger
)
and:
$s = 'reports/683868-derger-gergewrger.html';
preg_match('~(?|(\d+)/reports/(\d+)-([\w-]+)\.html|reports/(\d+)-([\w-]+)\.html)~',
$s, $m); print_r($m);
Array
(
[0] => reports/683868-derger-gergewrger.html
[1] => 683868
[2] => derger-gergewrger
)
(?|...) is a branch reset group. The group itself does not capture, and capture groups declared within each alternative of this construct start numbering from the same index.
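To see where the empty entry in the question comes from, the sketch below runs both pattern shapes against the same subject. The (\d+) inside (?:...) is still capture group 1; when the empty alternative is taken it stays unmatched, and preg_match reports it as an empty string because later groups did match. The variable names are illustrative.

```php
<?php
$subject = 'reports/683868-derger-gergewrger.html';

// Question's shape: (\d+) is capture group 1 even though it sits inside (?:...).
preg_match('~(?:(\d+)/|)reports/(\d+)-([\w-]+)\.html~A', $subject, $plain);

// Branch reset: group numbering restarts at 1 in each alternative, so the
// second alternative fills groups 1 and 2 and leaves no gap.
preg_match(
    '~(?|(\d+)/reports/(\d+)-([\w-]+)\.html|reports/(\d+)-([\w-]+)\.html)~A',
    $subject,
    $reset
);

var_export($plain);
var_export($reset);
```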

How to parse page numbers like print preview does using Regex

If you want to print arbitrary pages on windows/office you can define it like in the picture:
So, this will print pages: 1,2,3,6,7,8
Now, I'm trying to do the same thing using regex
<?php
$str = "1-4,6,7,8";
preg_match('/((\d+-\d+)|(\d+)),((\d+-\d+)|(\d+))/',$str,$out);
print_r($out);
?>
and it prints
Array ( [0] => 1-4,6 [1] => 1-4 [2] => 1-4 [3] => [4] => 6 [5] => [6] => 6 )
but what I want is the following
Array ( [0] => 1-4 [1] => 6 [2] => 7 [3] => 8 )
How can I do this?
Here is the fiddle
Check this regexp pattern, please
$str = "1-4,6,7,8";
preg_match_all('/((\d+-\d+)|(\d+)),?/',$str,$out); // preg_match_all collects every match; preg_match stops at the first one
print_r($out);
Or, better, use the explode function:
$str = "1-4,6,7,8";
$out = explode(',', $str);
print_r($out);
Use this:
$str = "1-4,6,7,8";
preg_match_all('/(\d+(?:-\d+)?),?/', $str, $out);
print_r($out);
output:
Array
(
[0] => Array
(
[0] => 1-4,
[1] => 6,
[2] => 7,
[3] => 8
)
[1] => Array
(
[0] => 1-4
[1] => 6
[2] => 7
[3] => 8
)
)
This should do the trick:
(\d+)-?(\d*)?(,(?!$))?
Matches 1 or more numbers (mandatory).
Optional match for hyphen.
Optional match for second set of digits.
Optional comma after each set of numbers but does not allow comma on the end.
DEMO
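Once the ranges and single pages are matched, a print dialog would still need to expand them into individual page numbers. Here is a minimal sketch of that step; the helper name expandPages is hypothetical and input validation is left out.

```php
<?php
// Hypothetical helper: expand a page list such as "1-4,6,7,8" into page numbers.
function expandPages(string $spec): array
{
    // Group 1 is the first page; group 2 (optional) is the end of a range.
    preg_match_all('/(\d+)(?:-(\d+))?/', $spec, $matches, PREG_SET_ORDER);

    $pages = [];
    foreach ($matches as $m) {
        $start = (int) $m[1];
        // A single page has no second number; treat it as a one-page range.
        $end = isset($m[2]) ? (int) $m[2] : $start;
        foreach (range($start, $end) as $page) {
            $pages[] = $page;
        }
    }
    return $pages;
}

print_r(expandPages('1-4,6,7,8'));
```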

Parsing attributes in PHP using regular expressions

Consider that I have the string,
$string = 'tag2 display="users" limit="5"';
Using the preg_match_all function, I need to get the output
Required output
Array
(
[0] => Array
(
[0] => tag2
[1] => tag2
[2] =>
)
[1] => Array
(
[0] => display="users"
[1] => display
[2] => users
)
[2] => Array
(
[0] => limit="5"
[1] => limit
[2] => 5
)
)
I tried using this pattern '/([^=\s]+)="([^"]+)"/', but it does not recognize a parameter with no value (in this case tag2). Instead it gives this output:
What I am getting
Array
(
[0] => Array
(
[0] => display="users"
[1] => display
[2] => users
)
[1] => Array
(
[0] => limit="5"
[1] => limit
[2] => 5
)
)
What pattern would produce the required output?
EDIT 1: I also need to get attributes that are not wrapped in quotes, e.g. attr=val. Sorry for not mentioning this before.
Try this:
<?php
$string = 'tag2 display="users" limit="5"';
preg_match_all('/([^=\s]+)(="([^"]+)")?/', $string, $res);
foreach ($res[0] as $r => $v) {
$o[] = array($res[0][$r], $res[1][$r], $res[3][$r]);
}
print_r($o);
?>
It outputs:
Array
(
[0] => Array
(
[0] => tag2
[1] => tag2
[2] =>
)
[1] => Array
(
[0] => display="users"
[1] => display
[2] => users
)
[2] => Array
(
[0] => limit="5"
[1] => limit
[2] => 5
)
)
I don't think a single call can give you exactly what you're looking for, but this is pretty close:
$string = 'tag2 display="users" limit=5';
preg_match_all('/([^=\s]+)(?:="?([^"]+)"?|())?/', $string, $res, PREG_SET_ORDER);
print_r($res);
Output:
Array
(
[0] => Array
(
[0] => tag2
[1] => tag2
[2] =>
[3] =>
)
[1] => Array
(
[0] => display="users"
[1] => display
[2] => users
)
[2] => Array
(
[0] => limit=5
[1] => limit
[2] => 5
)
)
As you can see, the first element has no value. I worked around that by offering an empty match, so this builds the array you were asking for, but with an additional entry on the empty attribute.
However, the main point is the PREG_SET_ORDER flag of preg_match_all. Maybe you can live with this output already.
Maybe you're interested in this little snippet that parses all sorts of attribute styles. <div class="hello" id=foobar style='display:none'> is valid html(5), not pretty, I know…
<?php
$string = '<tag2 display="users" limit="5">';
$attributes = array();
$pattern = "/\s+(?<name>[a-z0-9-]+)=(((?<quotes>['\"])(?<value>.*?)\k<quotes>)|(?<value2>[^'\" ]+))/i";
preg_match_all($pattern, $string, $matches, PREG_SET_ORDER);
foreach ($matches as $match) {
$attributes[$match['name']] = $match['value'] ?: $match['value2'];
}
var_dump($attributes);
will give you
$attributes = array(
'display' => 'users',
'limit' => '5',
);
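The snippet above skips names without a value such as tag2. Below is a sketch that also keeps those, assuming a valueless attribute should map to null. The function name parseAttributes and the pattern are illustrative; a branch reset group (?|...) keeps the value in group 2 regardless of the quoting style.

```php
<?php
// Hypothetical helper: parse name, name="value", name='value' and name=value.
function parseAttributes(string $input): array
{
    // A nowdoc avoids having to escape the quotes inside the pattern.
    $pattern = <<<'REGEX'
~([^=\s]+)(?:=(?|"([^"]*)"|'([^']*)'|([^\s"']+)))?~
REGEX;

    preg_match_all($pattern, $input, $matches, PREG_SET_ORDER);

    $attributes = [];
    foreach ($matches as $m) {
        // Group 2 is absent when the name has no "=value" part.
        $attributes[$m[1]] = $m[2] ?? null;
    }
    return $attributes;
}

var_export(parseAttributes('tag2 display="users" limit=5'));
```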

How can I split a list with multiple delimiters?

Basically, I want to enter text into a textarea and then use the values from it. For example:
variable1:variable2#variable3
variable1:variable2#variable3
variable1:variable2#variable3
I know I could use explode to make each line into an array, and then use a foreach loop to use each line separately, but how would I separate the three variables to use?
Besides preg_split:
$line = 'variable11:variable12#variable13';
print_r(preg_split('/[:#]/', $line));
/*
Array
(
[0] => variable11
[1] => variable12
[2] => variable13
)
*/
you could do a preg_match_all:
$text = 'variable11:variable12#variable13
variable21:variable22#variable23
variable31:variable32#variable33';
preg_match_all('/([^\r\n:]+):([^\r\n#]+)#(.*)\s*/', $text, $matches, PREG_SET_ORDER);
print_r($matches);
/*
Array
(
[0] => Array
(
[0] => variable11:variable12#variable13
[1] => variable11
[2] => variable12
[3] => variable13
)
[1] => Array
(
[0] => variable21:variable22#variable23
[1] => variable21
[2] => variable22
[3] => variable23
)
[2] => Array
(
[0] => variable31:variable32#variable33
[1] => variable31
[2] => variable32
[3] => variable33
)
)
*/
Try preg_split: http://php.net/manual/en/function.preg-split.php
If necessary, you could also make several calls to explode:
http://jp.php.net/manual/en/function.explode.php
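Putting the two steps together for textarea input, here is a minimal sketch: split the text into lines first, then split each line on either delimiter. The sample text and variable names are assumptions for illustration.

```php
<?php
// Sample textarea content with mixed line endings (assumed for illustration).
$text = "variable11:variable12#variable13\n"
      . "variable21:variable22#variable23\r\n"
      . "variable31:variable32#variable33";

$rows = [];
// \R matches any line break (\n, \r\n or \r); NO_EMPTY skips blank lines.
foreach (preg_split('/\R/', $text, -1, PREG_SPLIT_NO_EMPTY) as $line) {
    // The character class splits on either ":" or "#".
    $rows[] = preg_split('/[:#]/', $line);
}

print_r($rows);
```

Each row can then be destructured, for example with [$a, $b, $c] = $rows[0];.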
