PHP regex - get content with square brackets escaped inside square brackets

This one is hard. I have a solution using strpos, but it's ugly, so I need help doing it with a preg_* function such as preg_match. I have a string like the one below:
$text="Some
[parameter=value\[anoter value|subparam=20] with or
[parameter|value\]anoter value|subparam=21|nothing] or
[parameter=value\[anoter value\]|subparam=22] ";
... I would like to get the following result:
array (
0 => '=value[anoter value|subparam=20',
1 => '|value]anoter value|subparam=21|nothing',
2 => '=value[anoter value]|subparam=22',
)
In other words, I know my parameter name: [parameter---get this section---]. Everything after 'parameter' can vary, and it can contain escaped characters: brackets, square brackets, parentheses, ampersands.
Thanks!

Use \K to discard the previously matched characters.
\[parameter\K(?:\\[\]\[]|[^\[\]])*
$re = "~\\[parameter\\K(?:\\\\[\\]\\[]|[^\\[\\]])*~m";
$str = "Some \n[parameter=value\[anoter value|subparam=20] with or \n[parameter|value\]anoter value|subparam=21|nothing] or \n[parameter=value\[anoter value\]|subparam=22] \";\nfoo bar";
preg_match_all($re, $str, $matches);
print_r($matches);
Output:
Array
(
[0] => Array
(
[0] => =value\[anoter value|subparam=20
[1] => |value\]anoter value|subparam=21|nothing
[2] => =value\[anoter value\]|subparam=22
)
)
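To see what \K changes, here is a minimal sketch that compares it with a plain capturing group. The shortened sample string and the variable names $withK and $withGroup are made up for illustration; both variants should return the same substrings, just in different slots of the match array.

```php
<?php
// Sample input; single quotes keep the backslashes literal.
$str = 'Some [parameter=value\[x|a=20] or [parameter|value\]y|a=21]';

// Variant 1: \K resets the start of the match, so the text lands in index 0.
preg_match_all('~\[parameter\K(?:\\\\[\]\[]|[^\[\]])*~', $str, $withK);

// Variant 2: a capturing group, so the text lands in index 1.
preg_match_all('~\[parameter((?:\\\\[\]\[]|[^\[\]])*)~', $str, $withGroup);

print_r($withK[0]);
var_dump($withK[0] === $withGroup[1]);
```

With \K there is no need to pick a sub-array afterwards, which is why the answer reads its results from $matches[0].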

Even once you have extracted the substrings you are interested in, you still need to remove the backslashes from the escaped square brackets in a second step. Here is the full solution:
$pattern = '~\[\w+\K[^]\\\]*(?:(?:\\\.)+[^]\\\]*)*+(?=])~s';
if (preg_match_all($pattern, $str, $m)) {
    $result = array_map(function ($item) {
        return strtr($item, array('\]' => ']', '\[' => '['));
    }, $m[0]);
}
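Since the question says other characters (parentheses, ampersands) can be escaped too, the two steps could be wrapped in one function that strips the backslash from any escaped character. This is only a sketch: extractParameter is a hypothetical helper name, and it assumes PHP 7+ and that every backslash escapes exactly the one character that follows it.

```php
<?php
// Hypothetical helper: returns the unescaped text between "[$name" and the closing "]".
function extractParameter(string $text, string $name): array
{
    // [^]\\]* matches plain characters; \\. matches a backslash plus the character it escapes.
    $pattern = '~\[' . preg_quote($name, '~') . '\K[^]\\\\]*(?:\\\\.[^]\\\\]*)*(?=])~s';
    if (!preg_match_all($pattern, $text, $m)) {
        return [];
    }
    // Second step: drop the backslash in front of every escaped character.
    return array_map(function ($item) {
        return preg_replace('~\\\\(.)~s', '$1', $item);
    }, $m[0]);
}

$text = 'Some [parameter=value\[anoter value|subparam=20] with or '
      . '[parameter|value\]anoter value|subparam=21|nothing]';
print_r(extractParameter($text, 'parameter'));
```

Passing the name through preg_quote keeps the pattern safe if the parameter name ever contains regex metacharacters.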

Related

Match a pattern by ignoring different brackets

I have a string and I would like to know the first position of a pattern, but it should only be found if it is not enclosed in brackets.
Example String: "This is a (first) test with the first hit"
I want to know the position of the second first => 32. To match it, the (first) must be ignored, because it's enclosed in brackets.
Unfortunately, round brackets ( ) are not the only ones I have to ignore; I have to ignore square brackets [ ] and curly braces { } too.
I tried this:
preg_match(
'/^(.*?)(first)/',
"This is a (first) test with the first hit",
$matches
);
$result = strlen( $matches[1] );
It works fine, but the result is the position of the first match (11).
So I need to change the .*?.
I tried to replace it with .(?:\(.*?\))*? in the hope that all characters inside the brackets would be ignored. But this does not match the brackets.
And I can't use negative lookarounds such as '/(?<!\()first(?!\))/', since I have three different bracket types, and each opening bracket has to be paired with its own closing bracket.
You can match all 3 formats that you don't want using a group with an alternation, and make use of (*SKIP)(*FAIL) to discard those matches. Then match first between word boundaries \b:
(?:\(first\)|\[first]|{first})(*SKIP)(*FAIL)|\bfirst\b
Example code
$strings = [
"This is a (first) test with the first hit",
"This is a (first] test with the first hit"
];
foreach ($strings as $str) {
preg_match(
'/(?:\(first\)|\[first]|{first})(*SKIP)(*FAIL)|\bfirst\b/',
$str,
$matches,
PREG_OFFSET_CAPTURE);
print_r($matches);
}
Output
Array
(
[0] => Array
(
[0] => first
[1] => 32
)
)
Array
(
[0] => Array
(
[0] => first
[1] => 11
)
)
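If the bracketed text can contain more than just the search word, the same (*SKIP)(*FAIL) idea can be extended to skip whole bracketed runs. The sketch below is one possible way to do that; findOutsideBrackets is a hypothetical helper name, and it assumes the brackets are not nested.

```php
<?php
// Hypothetical helper: offset of the first $word outside (...), [...] or {...}, or -1 if none.
function findOutsideBrackets(string $haystack, string $word): int
{
    // Correctly paired bracket runs are matched first and then thrown away.
    $skip = '\([^()]*\)|\[[^\[\]]*\]|\{[^{}]*\}';
    $pattern = '/(?:' . $skip . ')(*SKIP)(*FAIL)|\b' . preg_quote($word, '/') . '\b/';
    if (preg_match($pattern, $haystack, $m, PREG_OFFSET_CAPTURE)) {
        return $m[0][1];
    }
    return -1;
}

echo findOutsideBrackets('This is a (my first) test with the first hit', 'first'), "\n";
echo findOutsideBrackets('This is a (first] test with the first hit', 'first'), "\n";
```

As in the answer, a mismatched pair such as (first] is not treated as enclosing, so the word inside it is still found.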

Split string into array by regular expression without bracket

My string variable contains "[100][200][300][400]" data.
This variable should be split into an array without the brackets.
Currently I can split it into the $matches array with a regular expression, but the brackets appear in the array.
I have used the expression below:
preg_match_all('/\[.*?\]/', $string, $matches);
Why not try this:
<?php
$str = " [100][200][300][400] ";
$str = explode("][", trim($str, "[] "));
print_r($str);
exit;
This code may help:
$str = "[100][200][300][400]";
$str = explode("][", trim($str, '[]'));
Results in:
Array(
[0] => 100
[1] => 200
[2] => 300
[3] => 400
)
Use parentheses (a capturing group) to choose which part of the match you want returned; the captured values end up in $matches[1].
preg_match_all('/\[(.*?)\]/', $string , $matches)
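A short sketch of where the values end up with that pattern, using the string from the question:

```php
<?php
$string = '[100][200][300][400]';

// $matches[0] holds the full matches (with brackets),
// $matches[1] holds only what the group (.*?) captured.
preg_match_all('/\[(.*?)\]/', $string, $matches);

print_r($matches[1]);
```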

How to split a string into an array using a given regex expression

I am trying to explode / preg_split a string so that I get an array of all the values that are enclosed in ( ). I've tried the following code, but I always get an array of empty values. I have tried many things but I can't seem to get it right.
Could anyone spot what I am missing to get my desired output?
$pattern = "/^\(.*\)$/";
$string = "(y3,x3),(r4,t4)";
$output = preg_split($pattern, $string);
print_r($output);
Current output Array ( [0] => [1] => )
Desired output Array ( [0] => "(y3,x3)," [1] => "(r4,t4)" )
With preg_split() your regex should be matching the delimiters within the string to split the string into an array. Your regex is currently matching the values, and for that, you can use preg_match_all(), like so:
$pattern = "/\(.*?\)/";
$string = "(y3,x3),(r4,t4)";
preg_match_all($pattern, $string, $output);
print_r($output[0]);
This outputs:
Array
(
[0] => (y3,x3)
[1] => (r4,t4)
)
If you want to use preg_split(), you would want to match the , between ),(, but without consuming the parentheses, like so:
$pattern = "/(?<=\)),(?=\()/";
$string = "(y3,x3),(r4,t4)";
$output = preg_split($pattern, $string);
print_r($output);
This uses a positive lookbehind and a positive lookahead to find the , between the two parenthesized groups, and splits on it. It also outputs the same as the above.
You can also use a simple regex like \B,\B to split the string, which may perform slightly better because it avoids the lookahead and lookbehind.
\B matches where there is no word boundary, so here it will match only the , between ) and (, because the other commas have a word character next to them.
Here is a working example:
http://regex101.com/r/cV7bO7/1
$pattern = "/\B,\B/";
$string = "(y3,x3),(r4,t4),(r5,t5)";
$result = preg_split($pattern, $string);
$result will contain:
Array
(
[0] => (y3,x3)
[1] => (r4,t4)
[2] => (r5,t5)
)
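One caveat worth checking before choosing between the two patterns: \B,\B relies on the values next to the inner commas being word characters. The sketch below uses a made-up input, '(+,-),(r4,t4)', where that assumption does not hold, to show how the two approaches differ.

```php
<?php
$string = '(+,-),(r4,t4)';

// Splits only on a comma that sits between ")" and "(".
$byLookaround = preg_split('/(?<=\)),(?=\()/', $string);

// Splits on any comma that has non-word characters on both sides,
// which here includes the comma inside the first group.
$byNonBoundary = preg_split('/\B,\B/', $string);

print_r($byLookaround);
print_r($byNonBoundary);
```

For input like the question's, where the values are alphanumeric, both patterns give the same result.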

Match rest of string with regex

I have a string like this
ch:keyword
ch:test
ch:some_text
I need a regular expression which will match all of the strings, however, it must not match the following:
ch: (ch: is followed by a space, or any number of spaces)
ch: (ch: is followed by nothing)
I am able to deduce the length of the string with the 'ch:' in it.
Any help would be appreciated; I am using PHP's preg_match()
Edit: I have tried this:
preg_match("/^ch:[A-Za-z_0-9]/", $str, $matches)
However, this only matches 1 character after the string. I tried putting a * after the closing square bracket, but this matches spaces, which I don't want.
preg_match('/^ch:(\S+)/', $string, $matches);
print_r($matches);
\S+ is for matching 1 or more non-space characters. This should work for you.
Try this regular expression:
^ch:\S.*$
$str = <<<TEXT
ch:keyword
ch:test
ch:
ch:some_text
ch: red
TEXT;
preg_match_all('|ch\:(\S+)|', $str, $matches);
echo '<pre>'; print_r($matches); echo '</pre>';
Output:
Array
(
[0] => Array
(
[0] => ch:keyword
[1] => ch:test
[2] => ch:some_text
)
[1] => Array
(
[0] => keyword
[1] => test
[2] => some_text
)
)
Try using this:
// PCRE does not accept an unbounded lookbehind such as (?<! +); (?<! ) has the same effect here.
preg_match('/(?<! )ch:[^ ].*/', $str);
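If the input is one multi-line string and every value has to occupy a whole line, the answers above could be combined with the m modifier. This is a sketch only; the last sample line, "not ch:inline", is an invented case to show the effect of anchoring.

```php
<?php
$str = "ch:keyword\nch:test\nch:\nch:some_text\nch: red\nnot ch:inline";

// With the m modifier, ^ and $ match at the start and end of each line;
// \S+ requires at least one non-space character right after "ch:".
preg_match_all('/^ch:(\S+)$/m', $str, $matches);

print_r($matches[1]);
```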

Alone, a part of a regex works. With a part added before it and an other after it, it stops working

UPDATE: I'm making progress, but this is hard!
The test text will be valid[REGEX_EMAIL|REGEX_PASSWORD|REGEX_TEST].
(The real life text is required|valid[REGEX_EMAIL]|confirmed[emailconfirmation]|correct[not in|emailconfirmation|email confirmation].)
([^|]+) saves REGEX_EMAIL, REGEX_PASSWORD and REGEX_TEST in an array.
^[^[]+\[ matches valid[
\] matches ]
^[^[]+\[ + ([^|]+) + \] doesn't save REGEX_EMAIL, REGEX_PASSWORD and REGEX_TEST in an array.
How to solve?
Why is it important to try to do everything with a single regular expression? It becomes much easier if you extract the two parts first and then split the strings on | using explode:
$s = 'valid[REGEX_EMAIL|REGEX_PASSWORD|REGEX_TEST]';
$matches = array();
preg_match('/^([^[]++)\[([^]]++)\]$/', $s, $matches);
$left = explode('|', $matches[1]);
$right = explode('|', $matches[2]);
print_r($left);
print_r($right);
Output:
Array
(
[0] => valid
)
Array
(
[0] => REGEX_EMAIL
[1] => REGEX_PASSWORD
[2] => REGEX_TEST
)
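The same split-then-explode idea might be extended to the "real life" rule string from the question. The sketch below is one possible approach, assuming brackets are never nested and rule names are unique; the $parsed array layout is made up for illustration.

```php
<?php
// Rule string taken from the question's "real life" example.
$rules = 'required|valid[REGEX_EMAIL]|confirmed[emailconfirmation]'
       . '|correct[not in|emailconfirmation|email confirmation]';

$parsed = [];
// Split on | only when it is not inside [...], i.e. when no ] follows before the next bracket.
foreach (preg_split('/\|(?![^\[\]]*\])/', $rules) as $rule) {
    // Rule name, then an optional [argument list].
    preg_match('/^([^\[]+)(?:\[([^\]]*)\])?$/', $rule, $m);
    $parsed[$m[1]] = isset($m[2]) ? explode('|', $m[2]) : [];
}

print_r($parsed);
```

Each rule name becomes a key, and its bracketed arguments (if any) become the list stored under that key.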
