What's wrong with this regex expression? - php

I want to preg_match the following string:
{{{/foo:bar/a/0/b}}}
This is my regex (which doesn't work, and I don't understand why):
|{{{\/([[:alpha:]][[:alnum:]\_]*\:[[:alpha:]][[:alnum:]\_]*)(?:\/([[:alnum:]\_]*))+}}}|Uism
Expected result:
Array (
[0] => Array
(
[0] => {{{/foo:bar/a/0/b}}}
)
[1] => Array
(
[0] => foo:bar
)
[2] => Array
(
[0] => a
)
[3] => Array
(
[0] => 0
)
[4] => Array
(
[0] => b
)
)
The result I get:
Array (
[0] => Array
(
[0] => {{{/foo:bar/a/0/b}}}
)
[1] => Array
(
[0] => foo:bar
)
[2] => Array
(
[0] => b
)
)
I only get the last element back. So what's wrong with it?

You're repeating the second capturing group:
(?:
\/
(
[[:alnum:]\_]*
)
)+
On each repetition of the outer non-capturing group, the contents of the inner capturing group are overwritten, which is why only the last match is preserved. This is the standard behavior in PCRE, which PHP uses, and in most other regex engines.
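A minimal sketch of this behavior. The pattern is a simplified stand-in (it uses \w in place of the POSIX classes and escapes the braces), which is an assumption made for brevity, not the asker's exact regex:

```php
<?php
$subject = '{{{/foo:bar/a/0/b}}}';

// The inner group (\w*) sits inside a repeated group, so each pass
// through "/a", "/0" and "/b" overwrites what it captured before.
preg_match('~\{\{\{/(\w+:\w+)(?:/(\w*))+\}\}\}~', $subject, $m);

print_r($m); // $m[2] ends up holding only the last segment
```

However many segments the input has, the result contains a single slot for group 2.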

(?=(^.*$)|(?:\/(.*?)(?:\/|})))
Try this. See the demo:
http://regex101.com/r/lS5tT3/3

Each subsequent match of the same capture group will overwrite the previous one; that's why you end up with just b.
What I would suggest in this case is to match the whole block first and then use a simpler explode() to dig out the inner data; use this expression:
|{{{\/([[:alpha:]][[:alnum:]\_]*\:[[:alpha:]][[:alnum:]\_]*(?:\/[[:alnum:]\_]*)+)}}}|U
Then, with the resulting $matches array (third argument to preg_match()):
$data = explode('/', $matches[1]);
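Put together, the two steps might look like the sketch below. The pattern is a lightly simplified variant of the one above (braces escaped, \w instead of [[:alnum:]\_], no U modifier), and the variable names are only illustrative:

```php
<?php
$subject = '{{{/foo:bar/a/0/b}}}';

// Capture the whole path in one group instead of repeating a group.
$pattern = '~\{\{\{/([[:alpha:]]\w*:[[:alpha:]]\w*(?:/\w*)+)\}\}\}~';

$parts = [];
if (preg_match($pattern, $subject, $matches)) {
    // $matches[1] is "foo:bar/a/0/b"; split it on the slashes.
    $parts = explode('/', $matches[1]);
}

print_r($parts);
```

This works for any number of segments, because the regex only validates the block and explode() does the splitting.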

Your pattern is overkill for something that should be quite simple. If the block always has exactly three single-character segments, as in your example, this is enough:
$rex = "#[{]{3}/(\w+:\w+)/(\w)/(\d)/(\w)[}]{3}#";
$str = "{{{/foo:bar/a/0/b}}}";
preg_match($rex, $str, $res);
Result:
Array
(
[0] => {{{/foo:bar/a/0/b}}}
[1] => foo:bar
[2] => a
[3] => 0
[4] => b
)

Related

How can I convert a string with square brackets into an array?

I got this from the name attribute of a field
name="field[a][2][b][0][c][1][field_name]"
after serializing the form, I got this:
array('field[a][2][b][0][c][1][field_name]'=>'value')
and I need to convert that into the following array:
$field = array (
    'a' => array (
        2 => array (
            'b' => array (
                0 => array (
                    'c' => array (
                        1 => array (
                            'field_name' => 'value'
                        )
                    )
                )
            )
        )
    )
);
Do I need some sort of foreach loop, or can PHP recognize this string as an array?
If you want the result nested, use parse_str().
$text = "field[a][2][b][0][c][1][field_name]=value";
parse_str($text, $result);
print_r($result);
Output:
Array
(
[field] => Array
(
[a] => Array
(
[2] => Array
(
[b] => Array
(
[0] => Array
(
[c] => Array
(
[1] => Array
(
[field_name] => value
)
)
)
)
)
)
)
)
See https://3v4l.org/7nmFT
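If the data is already in the serialized array form shown in the question, one possible way to feed it to parse_str() is to turn it back into a query string first. This is a sketch under that assumption; $serialized is a name introduced here for illustration:

```php
<?php
// The serialized form data from the question: one flat key.
$serialized = array('field[a][2][b][0][c][1][field_name]' => 'value');

// http_build_query() URL-encodes the key; parse_str() decodes it again
// and expands the bracket syntax into nested arrays.
parse_str(http_build_query($serialized), $result);

print_r($result['field']);
```

Several serialized fields can be passed in the same array; parse_str() merges keys that share a prefix into one nested structure.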
You can get values in brackets with the regular expression, and then reduce it to the array that you want:
$key = 'field[a][2][b][0][c][1][field_name]';
$value = 'value';
$matches = array();
preg_match_all('/\[([^[]+)\]/', $key, $matches);
$keys = array_reverse($matches[1]);
$result = array_reduce($keys, function ($array, $item) {
return array($item => $array);
}, $value);
Explanation
In the regular expression \[([^[]+)\]:
([^[]+) matches any character except an opening bracket, one or more times, and puts it into the capturing group (I hope you will not have nested brackets);
\[...\] matches the literal brackets around it.
The preg_match_all function should populate the $matches array with the following data:
Array
(
[0] => Array
(
[0] => [a]
[1] => [2]
[2] => [b]
[3] => [0]
[4] => [c]
[5] => [1]
[6] => [field_name]
)
[1] => Array
(
[0] => a
[1] => 2
[2] => b
[3] => 0
[4] => c
[5] => 1
[6] => field_name
)
)
$matches[0] holds the full matches and $matches[1] holds the values of our first and only capturing group. We are interested only in the capturing group values.
Then with the array_reduce function we can simply go through keys in the reverse order, and sequentially wrap our value into an array.
You could use the explode function, but it needs a delimiter between the values. For example, "Hello World" split on a space gives Array ( [0] => Hello [1] => World ). In your case the result would be Array ( [0] => field[a][2][b][0][c][1][field_name] ), unless $name has a space between its characters or words.
$name = "field[a][2][b][0][c][1][field_name]";
$name_array = explode(" ", $name);
print_r($name_array);
When testing your problem on my PHP server with this test code
<form method="post">
<input type="text" name="field[a][2][b][0][c][1][field_name]">
<input type="submit" value="OK">
</form>
<pre>
<?php
if (isset($_POST['field']))
print_r($_POST['field']);
?>
</pre>
I got the following response (after entering the word "hello" into the text box and clicking the "OK" button):
Array ( [a] => Array ( [2] => Array ( [b] =>
Array ( [0] => Array ( [c] => Array ( [1] =>
Array ( [field_name] => hello ) ) ) ) ) ) )
Admittedly, it was formatted nicer, but I am posting from my smartphone, so, please, forgive me for not formatting it again manually.
So, to me it is not quite clear why OP needs extra code to solve his/her problem.
Here is your solution:
https://codebrace.com/editor/b06588218
I have used a regex match twice, to match the variable name and the array keys.
/\[(.+?)\]/
matches each key inside the brackets, where "?" makes the match lazy,
while the following regex
/^[^[]+/ matches the variable name.
I have used variable variables to create a variable from the string extracted by the regex above.
Result:
Array
(
[a] => Array
(
[2] => Array
(
[b] => Array
(
[0] => Array
(
[c] => Array
(
[1] => Array
(
[field_name] => value
)
)
)
)
)
)
)
I hope this helps

Array named capture using PHP regex

If named capture matches multiple times, is it possible to retrieve all matches?
Example
<?php
$string = 'TextToMatch [some][random][tags] SomeMoreMatches';
$pattern = "!(TextToMatch )(?P<tags>\[.+?\])+( SomeMoreMatches)!";
preg_match($pattern, $string, $matches);
print_r($matches);
Which results in
Array
(
[0] => TextToMatch [some][random][tags] SomeMoreMatches
[1] => TextToMatch
[tags] => [tags]
[2] => [tags]
[3] => SomeMoreMatches
)
Is it possible to get something like
Array
(
[0] => TextToMatch [some][random][tags] SomeMoreMatches
[1] => TextToMatch
[tags] => Array
(
[0] => [some]
[1] => [random]
[2] => [tags]
)
[2] => Array
(
[0] => [some]
[1] => [random]
[2] => [tags]
)
[3] => SomeMoreMatches
)
using only preg_match?
I am aware that I can explode the tags, but I wonder if I can do this with preg_match (or a similar function) only.
Other example
$input = "Some text [many][more][other][tags][here] and maybe some text here?";
Desirable output
Array
(
[0] => Some text [many][more][other][tags][here] and maybe some text here?
[1] => Some text
[tags] => Array
(
[0] => [many]
[1] => [more]
[2] => [other]
[3] => [tags]
[4] => [here]
)
[2] => Array
(
[0] => [many]
[1] => [more]
[2] => [other]
[3] => [tags]
[4] => [here]
)
[3] => and maybe some text here?
)
You need to use preg_match_all and modify the regexp:
preg_match_all('/(?P<tags>\[.+?\])/', $string, $matches);
Just remove the + after the ) so that the pattern matches a single tag, and let preg_match_all do the global search.
If you need the specific answer that you posted, try with:
$string = '[some][random][tags]';
$pattern = "/(?P<tags>\[.+?\])/";
preg_match_all($pattern, $string, $matches);
$matches = [
implode($matches['tags']), end($matches['tags'])
] + $matches;
print_r($matches);
You get:
Array
(
[0] => [some][random][tags]
[1] => [tags]
[tags] => Array
(
[0] => [some]
[1] => [random]
[2] => [tags]
)
)
Since you stated in your comments that you are not actually interested in the leading substring before the set of tags, and because you stated that you don't necessarily need the named capture group (I never use them), you really only need to remove the first bit, split the string on the space after the set of tags, then split each tag in the set of tags.
Code: (Demo)
// strstr() trims off the leading substring; the limit of 2 tells explode() to stop after making 2 elements
$split = explode(' ', strstr($input, '['), 2);
var_export($split);
Produces:
array (
0 => '[many][more][other][tags][here]',
1 => 'and maybe some text here?',
)
Then the most direct/clean way to split those square bracketed tags, is to use the zero-width position between each closing bracket (]) and each opening bracket ([). Since only regex can isolate these specific positions as delimiters, I'll suggest preg_split().
// \K releases/forgets the previously matched character(s)
$split[0] = preg_split('~]\K~', $split[0], -1, PREG_SPLIT_NO_EMPTY);
var_export($split);
This is the final output:
array (
0 =>
array (
0 => '[many]',
1 => '[more]',
2 => '[other]',
3 => '[tags]',
4 => '[here]',
),
1 => 'and maybe some text here?',
)
No. As Wiktor stated (1, 2), this is not possible using only preg_match.
Solution that just works
<?php
$string = 'TextToMatch [some][random][tags] SomeMoreMatches';
$pattern = "!(TextToMatch )(?P<tags>\[.+?\]+)( SomeMoreMatches)!";
preg_match($pattern, $string, $matches);
$matches[2] = $matches["tags"] = array_map(function($s){return "[$s]";}, explode("][", substr($matches["tags"],1,-1)));
print_r($matches);
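A variation on the same two-step idea, sketched with preg_match_all instead of explode(). The inner pattern \[[^\]]+\] and the name $tagMatches are assumptions introduced here; it expects tags that contain no nested brackets:

```php
<?php
$string = 'TextToMatch [some][random][tags] SomeMoreMatches';

// Step 1: capture the whole run of tags as one string.
$pattern = '!(TextToMatch )(?P<tags>(?:\[[^\]]+\])+)( SomeMoreMatches)!';

if (preg_match($pattern, $string, $matches)) {
    // Step 2: pull the individual tags out of that run.
    preg_match_all('/\[[^\]]+\]/', $matches['tags'], $tagMatches);
    $matches['tags'] = $matches[2] = $tagMatches[0];
}

print_r($matches);
```

The result has the shape asked for in the question, at the cost of a second regex call.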

Unexpected preg_match result from pattern with "?:"

I tried this pattern
(?:(\d+)\/|)reports\/(\d+)-([\w-]+).html
with this string (preg_match with modifiers "Axu")
reports/683868-derger-gergewrger.html
and I expected this result (https://regex101.com/r/kX6yZ5/1):
[1] => 683868
[2] => derger-gergewrger
But I get this:
[1] =>
[2] => 683868
[3] => derger-gergewrger
Why? Where does the empty value in [1] come from? The group is marked with "?:", so it should not capture.
I have two cases:
"reports/683868-derger-gergewrger.html"
"757/reports/683868-derger-gergewrger.html"
In the first case I need two captures, but in the second case I need three.
You can use:
preg_match('~(?:\d+/)?reports/(\d+)-([\w-]+)\.html~',
'reports/683868-derger-gergewrger.html', $m);
print_r($m);
Array
(
[0] => reports/683868-derger-gergewrger.html
[1] => 683868
[2] => derger-gergewrger
)
EDIT: You probably want this behavior:
$s = '757/reports/683868-derger-gergewrger.html';
preg_match('~(?|(\d+)/reports/(\d+)-([\w-]+)\.html|reports/(\d+)-([\w-]+)\.html)~',
$s, $m); print_r($m);
Array
(
[0] => 757/reports/683868-derger-gergewrger.html
[1] => 757
[2] => 683868
[3] => derger-gergewrger
)
and:
$s = 'reports/683868-derger-gergewrger.html';
preg_match('~(?|(\d+)/reports/(\d+)-([\w-]+)\.html|reports/(\d+)-([\w-]+)\.html)~',
$s, $m); print_r($m);
Array
(
[0] => reports/683868-derger-gergewrger.html
[1] => 683868
[2] => derger-gergewrger
)
(?|...) is a branch reset group, which is itself non-capturing. The capturing groups declared within each alternative of this construct start numbering from the same index.
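As for why the original pattern produced an empty [1], here is a small sketch. It reuses the pattern from the question, with the dot before "html" escaped and without the "Axu" modifiers:

```php
<?php
$subject = 'reports/683868-derger-gergewrger.html';

// (?:...) itself does not capture, but the (\d+) inside it is still
// capturing group 1. When the empty alternative is taken, group 1
// stays unmatched and PHP reports it as an empty string.
preg_match('~(?:(\d+)/|)reports/(\d+)-([\w-]+)\.html~', $subject, $m);

print_r($m);
```

Group numbers are assigned by counting opening parentheses of capturing groups from left to right, whether or not a given group takes part in the match.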

Find all patterns in a string php

This is my string.
$str = '"additional_details":" {"mode_of_transport":"air"}"}],"additional_details":"{"mode_of_transport":"air"}"}],"additional_details":"{"mode_of_transport":"air"}"}],';
I want to find all patterns that start with "{" and end with "}".
I am trying this:
preg_match_all( '/"(\{.*\})"/', $str, $matches );
print_r($matches);
It gives me an output of:
Array
(
[0] => Array
(
[0] => "{"mode_of_transport":"air"}"}],"additional_details":"{"mode_of_transport":"air"}"}],"additional_details":"{"mode_of_transport":"air"}"
)
[1] => Array
(
[0] => {"mode_of_transport":"air"}"}],"additional_details":"{"mode_of_transport":"air"}"}],"additional_details":"{"mode_of_transport":"air"}
)
)
See the array key 1. It gives all matches in one key and other details too.
I want an array of all matches. Like
Array
(
[0] => Array
(
[0] => "{"mode_of_transport":"air"}"}],"additional_details":"{"mode_of_transport":"air"}"}],"additional_details":"{"mode_of_transport":"air"}"
)
[1] => Array
(
[0] => {"mode_of_transport":"air"},
[1] => {"mode_of_transport":"air"},
[2] => {"mode_of_transport":"air"}
)
)
What should I change in my pattern?
Thanks
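Your .* is greedy, so it runs on to the last } in the string. You can use a negated character class instead, which stops at the first closing brace: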
preg_match_all( '/({[^}]*})/', $str, $matches );
print_r($matches[1]);
Array
(
[0] => {"mode_of_transport":"air"}
[1] => {"mode_of_transport":"air"}
[2] => {"mode_of_transport":"air"}
)
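To compare the options side by side, here is a short sketch on a shortened sample string (the sample and the variable names are made up for illustration). A lazy quantifier is shown as a third option alongside the negated class:

```php
<?php
$str = '"d":"{"m":"air"}"}],"d":"{"m":"sea"}"}],';

// Greedy: .* runs to the last "}" in the string, giving one long match.
preg_match_all('/\{.*\}/', $str, $greedy);

// Negated class: [^}]* cannot cross a "}", so each object matches alone.
preg_match_all('/\{[^}]*\}/', $str, $negated);

// Lazy: .*? stops at the first "}" it can, with the same result here.
preg_match_all('/\{.*?\}/', $str, $lazy);

print_r($negated[0]);
```

Neither the negated class nor the lazy version handles nested braces; for real JSON, json_decode() is the safer tool.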

preg_match_all to get all occurrences of a string

I am trying to find the offsets of all occurrences with preg_match_all
e.g.
$haystack = 'aaaab';
$needle = 'aa';
preg_match_all('/' . $needle . '/', $haystack, $matches, PREG_OFFSET_CAPTURE);
$matches is
Array
(
[0] => Array
(
[0] => Array
(
[0] => aa
[1] => 0
)
[1] => Array
(
[0] => aa
[1] => 2
)
)
)
It returns offset of first and second group of aa ("aa" "aa" "b") from the haystack, while I am expecting it to return "aa" starting at index 1 as well.
Array
(
[0] => Array
(
[0] => Array
(
[0] => aa
[1] => 0
)
[1] => Array
(
[0] => aa
[1] => 1
)
[2] => Array
(
[0] => aa
[1] => 2
)
)
)
Is there a way I can fix the regex or use some other function (which accepts regex) to get this done?
PS: I know strpos which can do this, but I have few more things to search for hence will go with preg_match_all.
You'll need to change your needle expression to use an assertion. This will prevent the 2nd a from being eaten by the regular expression engine:
$needle = 'a(?=a)';
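A related sketch that keeps the whole needle in the result: wrap the needle in a capturing group inside the lookahead. The use of preg_quote() assumes the needle is a literal string; drop it if the needle is itself a regex fragment:

```php
<?php
$haystack = 'aaaab';
$needle = 'aa';

// The lookahead consumes nothing, so the engine retries at every
// position; the group inside it still captures the needle text.
$pattern = '/(?=(' . preg_quote($needle, '/') . '))/';
preg_match_all($pattern, $haystack, $matches, PREG_OFFSET_CAPTURE);

// Group 1 holds [text, offset] pairs for every overlapping occurrence.
$offsets = array_column($matches[1], 1);
print_r($offsets);
```

Here $matches[0] contains only empty strings, because the overall match is zero-width; the useful data is in $matches[1].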
