What's wrong with this RegEx? - php

I wrote this RegEx: '/\[\.{2}([^\.].+)\]/'
And it is supposed to match patterns like this: [..Class,Method,Parameter]
It works until I have a pattern like this: [..Class1,Method1,Para1][..Class2,Method2,Para2]
I tried to make the RegEx lazy by putting a ? behind the +: '/\[\.{2}([^\.].+?)\]/', but it didn't help. Any suggestions?

I believe you wanted to use [^\.]+ rather than [^\.].+. Note that .+ is a greedily quantified dot pattern that matches any one or more chars other than line break chars, and thus matches across both ] and [.
Match any 1 or more chars other than ] with [^]] rather than using [^\.]:
\[\.{2}([^]]+)]
See this regex demo
Details
\[ - a [ char
\.{2} - two dot chars
([^]]+) - Group 1: one or more chars other than ] (no need to escape ] when it is the first char in a character class)
] - a closing bracket (no need to escape ] when it is outside a character class).
PHP demo:
$str = '[..Class,Method,Parameter] [..Class1,Method1,Para1][..Class2,Method2,Para2]';
preg_match_all('/\[\.{2}([^]]+)]/', $str, $matches);
print_r($matches[0]);
Results:
Array
(
[0] => [..Class,Method,Parameter]
[1] => [..Class1,Method1,Para1]
[2] => [..Class2,Method2,Para2]
)
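To see the difference between the two patterns side by side, here is a minimal sketch that runs the original greedy pattern and the negated-character-class pattern on the two-block input from the question. The variable names $greedy and $fixed are only illustrative.

```php
<?php
$str = '[..Class1,Method1,Para1][..Class2,Method2,Para2]';

// Original pattern: the greedy .+ runs on to the last ] in the string.
preg_match_all('/\[\.{2}([^\.].+)\]/', $str, $greedy);

// Suggested pattern: [^]]+ cannot cross a closing bracket.
preg_match_all('/\[\.{2}([^]]+)]/', $str, $fixed);

print_r($greedy[1]); // one capture spanning both blocks
print_r($fixed[1]);  // one capture per block
```

The greedy version returns a single capture that swallows "][..", while the negated class returns one capture per bracketed block.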

Related

How can I extract values that have opening and closing brackets with regular expression?

I am trying to extract [[String]] with regular expression. Notice how a bracket opens [ and it needs to close ]. So you would receive the following matches:
[[String]]
[String]
String
If I use \[[^\]]+\] it will just find the first closing bracket it comes across without taking into consideration that a new one has opened in between and it needs the second close. Is this at all possible with regular expression?
Note: This type can either be String, [String] or [[String]] so you don't know upfront how many brackets there will be.
You can use the following PCRE compliant regex:
(?=((\[(?:\w++|(?2))*])|\b\w+))
See the regex demo. Details:
(?= - start of a positive lookahead (necessary to match overlapping strings):
( - start of Capturing group 1 (it will hold the "matches"):
(\[(?:\w++|(?2))*]) - Group 2 (technical, used for recursing): [, then zero or more occurrences of one or more word chars or the whole Group 2 pattern recursed, and then a ] char
| - or
\b\w+ - a word boundary (necessary since all overlapping matches are being searched for) and one or more word chars
) - end of Group 1
) - end of the lookahead.
See the PHP demo:
$s = "[[String]]";
if (preg_match_all('~(?=((\[(?:\w++|(?2))*])|\b\w+))~', $s, $m)){
print_r($m[1]);
}
Output:
Array
(
[0] => [[String]]
[1] => [String]
[2] => String
)

How to remove text inside brackets and parentheses at the same time with any whitespace before if present?

I am trying to clean a string in PHP using the following code, but I am not sure how to get rid of the text inside brackets and parentheses at the same time with any whitespace before if present.
The code I am using is:
$string = "Deadpool 2 [Region 4](Blu-ray)";
echo preg_replace("/\[[^)]+\]/","",$string);
The output I'm getting is:
Deadpool [](Blu-ray)
However, the desired output is:
Deadpool 2
Using the solutions from this and this questions, it is not clear how to remove both one type of matches and the other one while also removing the optional whitespace before them.
There are four main points here:
String between parentheses can be matched with \([^()]*\)
String between square brackets can be matched with \[[^][]*] (or \[[^\]\[]*\] if you prefer to escape literal [ and ], in PCRE, it is stylistic, but in some other regex flavors, it might be a must)
You need alternation to match either this or that pattern and account for any whitespaces before these patterns
Since after removing these strings you may get leading and trailing spaces, you need to trim the string.
You may use
$string = "Deadpool 2 [Region 4](Blu-ray)";
echo trim(preg_replace("/\s*(?:\[[^][]*]|\([^()]*\))/","", $string));
See the regex demo and a PHP demo.
The \[[^][]*] part matches strings between [ and ] having no other [ and ] inside and \([^()]*\) matches strings between ( and ) having no other parentheses inside. trim removes leading/trailing whitespace.
Regex explanation:
\s* - 0+ whitespaces
(?: - start of a non-capturing group:
\[[^][]*] - [, zero or more chars other than [ and ] (note you may keep these brackets inside a character class unescaped in a PCRE pattern if ] is right after initial [, in JS, you would have to escape ] by all means, [^\][]*)
| - or (an alternation operator)
\([^()]*\) - (, any 0+ chars other than ( and ) and a )
) - end of the non-capturing group.
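As a rough illustration of why both the alternation and the trim() call matter, the pattern can be wrapped in a small helper and run on a few inputs. The function name stripBracketed and the second and third sample titles are assumptions made up for this sketch, not part of the question.

```php
<?php
// Hypothetical helper: removes [..] and (..) groups plus any whitespace before them.
function stripBracketed(string $title): string
{
    $cleaned = preg_replace('/\s*(?:\[[^][]*]|\([^()]*\))/', '', $title);
    // A group at the start of the string leaves a leading space behind, hence trim().
    return trim($cleaned);
}

$titles = [
    'Deadpool 2 [Region 4](Blu-ray)',
    'Alien (1979) [Uncut]',
    '[Import] Heat',
];
foreach ($titles as $title) {
    echo stripBracketed($title), "\n";
}
```

The last title shows the case where trim() does real work: without it the result would start with a space.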
Based on just the one sample input there are some simpler approaches.
$string = "Deadpool 2 [Region 4](Blu-ray)";
var_export(preg_replace("~ [[(].*~", "", $string));
echo "\n";
var_export(strstr($string, ' [', true));
Output:
'Deadpool 2'
'Deadpool 2'
These assume that the unwanted substring begins with a space followed by an opening square bracket.
The strstr() technique requires that the space-bracket sequence exists in the string; otherwise it returns false.
If the unwanted substring marker is not consistently included, then you can use:
var_export(explode(' [', $string, 2)[0]);
This will put the unwanted substring in explode's output array at [1] and the wanted substring in [0].
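The difference between the two non-regex approaches only shows up when the marker is missing. A minimal sketch, assuming a second input without the " [" sequence (that input and the $results variable are made up for illustration):

```php
<?php
// First input is from the question; the second is a hypothetical one without " [".
$inputs = ['Deadpool 2 [Region 4](Blu-ray)', 'Deadpool 2'];

$results = [];
foreach ($inputs as $string) {
    // strstr() with the third argument true returns the part before the needle,
    // or false when the needle is not found.
    $viaStrstr = strstr($string, ' [', true);
    // explode() always returns at least one element, so index 0 is safe.
    $viaExplode = explode(' [', $string, 2)[0];
    $results[] = [$viaStrstr, $viaExplode];
}
var_export($results);
```

For the second input strstr() gives false while explode() still gives the whole string, which is why explode() is the safer choice when the marker is optional.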

Another regex: square brackets

I have something like: word[val1|val2|val3] . Need a regex to capture both: word and val1|val2|val3
Another sample: leader[77]
Result: Need a regex to capture both: leader and 77
This is what I have so far: ^(.*\[)(.*) and it gives me: array[0]=word[val1|val2|val3];
array[1]=word[
array[2]=val1|val2|val3]
array[1] is needed but without [
array[2] is needed but without ]
Any ideas? - Thank you
For either one you can use \w*(\[.*\])
\w* match any word character [a-zA-Z0-9_]
Quantifier: * Between zero and unlimited times
\[ matches the character [ literally
.* matches any character (except newline)
\] matches the character ] literally
EDIT: I kept hammering away to get rid of the brackets and came up with (\w*)\[([^][]*)]
EDIT: Which I now see Wiktor suggested in comments before I got back with mine.
You can use
([^][]+)\[([^][]*)]
Here is the regex demo
Explanation:
([^][]+) - Group 1 matching one or more chars other than ] and [
\[ - a literal [
([^][]*) - Group 2 capturing 0+ chars other than [ and ]
] - a literal ].
See IDEONE demo:
$re = '~([^][]+)\[([^][]*)]~';
$str = "word[val1|val2|val3]";
preg_match($re, $str, $matches);
echo $matches[1]. "\n" . $matches[2];
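The same pattern can be checked against both samples from the question in one go. This is just a sketch; the $pairs array is an illustrative name.

```php
<?php
$re = '~([^][]+)\[([^][]*)]~';

$pairs = [];
foreach (['word[val1|val2|val3]', 'leader[77]'] as $str) {
    if (preg_match($re, $str, $m)) {
        // $m[1] is the text before the bracket, $m[2] the text inside it.
        $pairs[] = [$m[1], $m[2]];
    }
}
print_r($pairs);
```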

find a string mapped between two string using php

I know this question has been asked many times before, and I have read most of the answers, but I still have an issue with it.
I will have a string in which substrings are wrapped in [[[ and ]]]. I don't know the position of these substrings, nor how many times they occur.
for example :
$string = '[[[this is a string]]] and this is some other part. [[[this is another]]]and etc.';
Now, could somebody help me learn how to find this is a string and this is another?
Thanks in Advance
You need to use preg_match_all(), and you also need to be sure to escape the square brackets since they are special characters.
$string = '[[[this is a string]]] and this is some other part. [[[this is another]]]and etc.';
preg_match_all('/\[\[\[([^\]]*)\]\]\]/', $string, $matches);
print_r($matches);
Regex logic:
\[\[\[([^\]]*)\]\]\]
Debuggex Demo
Output:
Array
(
[0] => Array
(
[0] => [[[this is a string]]]
[1] => [[[this is another]]]
)
[1] => Array
(
[0] => this is a string
[1] => this is another
)
)
Here is a method using lookbehinds and lookaheads:
$string = '[[[this is a string]]] and this is some other part. [[[this is another]]]and etc.';
preg_match_all('/(?<=\[{3}).*?(?=\]{3})/', $string, $m);
print_r($m);
This outputs the following:
Array
(
[0] => Array
(
[0] => this is a string
[1] => this is another
)
)
Here is the explanation of the REGEX:
(?<= \[{3} ) .*? (?= \]{3} )
1 2 3 4 5 6 7
1. (?<= Positive lookbehind - This combination of (?<= ... ) tells REGEX to make sure that whatever is in the parentheses has to appear directly before whatever it is we are trying to match. It will check to see if it's there, but won't include it in the matches.
2. \[{3} This says to look for an opening square brace '[', three times in a row {3}. The only thing is that the square brace is a special character in REGEX, so we have to escape it with a backslash \. [ becomes \[.
3. ) Closing parenthesis ) for the lookbehind (Item #1)
4. .*? This tells REGEX to match any character ., any number of times *, but as few times as possible ? (lazy), stopping as soon as the next part of the regular expression can match. In this case, the next part is a lookahead for three closing square braces.
5. (?= Positive lookahead - The combination of (?= ... ) tells REGEX to make sure that whatever is in the parentheses has to be directly in front (ahead) of what we are currently matching. It will check to see if it's there, but won't include it as part of our match.
6. \]{3} This looks for a closing square brace ], three times in a row {3} and as with item #2, must be escaped with a backslash \.
7. ) Closing parenthesis ) for the lookahead (Item #5)
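One practical difference between the capture-group pattern and the lookaround pattern is how they treat a stray ] inside the wrapped text. The sketch below uses a made-up input containing such a bracket; the variable names are illustrative only.

```php
<?php
// Hypothetical input: the first wrapped substring contains a single ].
$string = '[[[a]b]]] and [[[c]]]';

// Capture-group pattern: [^\]]* refuses any ] inside the content,
// so the first substring is not matched at all.
preg_match_all('/\[\[\[([^\]]*)\]\]\]/', $string, $byClass);

// Lookaround pattern: the lazy .*? stops only at the first ]]] sequence.
preg_match_all('/(?<=\[{3}).*?(?=\]{3})/', $string, $byLookaround);

print_r($byClass[1]);
print_r($byLookaround[0]);
```

If the wrapped text can never contain ], both patterns behave the same; otherwise the lazy version is the more forgiving one.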

Find a pattern in string

I have a string like this:
{param1}{param2}{param3}....{myparam paramvalue}{paramn}
How can I get the paramvalue of myparam?
Simple regex:
/\{([^ ]+?) ([^}]+?)\}/
([^ ]+?) - matches one or more characters other than a space, lazily, and puts them in a subpattern
([^}]+?) - matches one or more characters other than }, lazily, and puts them in a subpattern.
use it with preg_match() function
OR
The other simple regex:
preg_match('/([a-z0-9]+?) ([a-z0-9]+?)\}/', $str, $matches);
([a-z0-9]+?) - a-z 0-9 at least one time not greedy
([a-z0-9]+?) - a-z 0-9 at least one time not greedy
Output:
Array ( [0] => myparam paramvalue} [1] => myparam [2] => paramvalue )
Demo
To specifically get that parameter value, you first have to match the left part:
/\{myparam/
Followed by at least one space:
/\{myparam\s+/
Capture characters until a closing curly brace is found:
/\{myparam\s+([^}]+)\}/
The expression [^}]+ is a negative character set, indicated by the ^ just after the opening bracket; it means "match all characters except these".
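Put together, the three steps can be sketched as a small function. The name getParamValue and the shortened sample string are assumptions for illustration; preg_quote() is used so that a parameter name containing regex metacharacters would still be matched literally.

```php
<?php
// Hypothetical helper: returns the value in "{name value}", or null if it is absent.
function getParamValue(string $input, string $name): ?string
{
    // \{name, then at least one whitespace char, then everything up to the closing }.
    $pattern = '/\{' . preg_quote($name, '/') . '\s+([^}]+)\}/';
    return preg_match($pattern, $input, $m) ? $m[1] : null;
}

// Sample input modelled on the question's string.
$str = '{param1}{param2}{param3}{myparam paramvalue}{paramn}';

var_dump(getParamValue($str, 'myparam'));
var_dump(getParamValue($str, 'missing'));
```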
Try with this regex:
/\{\w+\s+(\w+)\}/
if (preg_match('/\{' . preg_quote('myparam', '/') . ' ([^\}]+)\}/', $input, $matches)) {
echo "myparam=".$matches[1];
} else {
echo "myparam not found";
}
in preg_match, '{' and '}' can act as quantifier delimiters, so it is safest to escape them
the preg_quote may not be necessary, as long as "myparam" will never have any special regex chars
the (cryptic) part ([^}]+)} matches one or more chars not being a '}', followed by '}'
the parentheses make that match available in the third arg to preg_match, $matches in this case
You can try this one as well:
.+?\s+([^}]+)
EDIT
Explanation:
.+? means match any character one or more times, but lazily: it will prefer to match as few characters as it can.
\s+ means it will match white-spaces one or more time.
([^}]+) means match everything except `}`(close bracket) one or more time and capture group.
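Note that this pattern does not mention myparam at all, so it captures whatever follows the first whitespace in the string. A quick sketch with two made-up inputs shows when that matters:

```php
<?php
$pattern = '/.+?\s+([^}]+)/';

// Only one parameter contains a space: the capture is the wanted value.
preg_match($pattern, '{param1}{myparam paramvalue}{paramn}', $first);

// Hypothetical input where an earlier parameter also has a value.
preg_match($pattern, '{a b}{myparam paramvalue}', $second);

echo $first[1], "\n";
echo $second[1], "\n";
```

The second call captures b rather than paramvalue, so this pattern is only suitable when myparam is the only parameter that carries a value.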
