Another regex: square brackets - php

I have something like word[val1|val2|val3] and need a regex that captures both word and val1|val2|val3.
Another sample: leader[77]
Expected result: capture both leader and 77.
This is what I have so far: ^(.*\[)(.*) and it gives me:
array[0]=word[val1|val2|val3]
array[1]=word[
array[2]=val1|val2|val3]
array[1] is needed but without [
array[2] is needed but without ]
Any ideas? Thank you

For either sample you can use \w*(\[.*\]), which captures the bracketed part with the brackets still attached:
\w matches any word character [a-zA-Z0-9_]
Quantifier: * Between zero and unlimited times
\[ matches the character [ literally
.* matches any character (except newline)
\] matches the character ] literally
EDIT: I kept hammering away to get rid of the brackets and came up with (\w*)\[([^][]*)]
EDIT: Which I now see Wiktor suggested in comments before I got back with mine.

You can use
([^][]+)\[([^][]*)]
Here is the regex demo
Explanation:
([^][]+) - Group 1 matching one or more chars other than ] and [
\[ - a literal [
([^][]*) - Group 2 capturing 0+ chars other than [ and ]
] - a literal ].
See IDEONE demo:
$re = '~([^][]+)\[([^][]*)]~';
$str = "word[val1|val2|val3]";
preg_match($re, $str, $matches);
echo $matches[1]. "\n" . $matches[2];
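As a rough sketch, the same pattern can be wrapped in a small helper that handles both samples from the question and reports a failed match. The function name splitBracketed and the added ^ and $ anchors are assumptions for illustration, not part of the answer above.

```php
<?php
// Hypothetical helper: split "name[inner]" into its two parts.
// Returns [name, inner] on success, or null when the input has another shape.
function splitBracketed(string $input): ?array
{
    // Same pattern as above, anchored so the whole string must match (an assumption).
    if (preg_match('~^([^][]+)\[([^][]*)]$~', $input, $m) !== 1) {
        return null;
    }
    return [$m[1], $m[2]];
}

var_export(splitBracketed('word[val1|val2|val3]'));
echo "\n";
var_export(splitBracketed('leader[77]'));
echo "\n";
var_export(splitBracketed('no brackets here'));
echo "\n";
```

Checking the return value of preg_match before reading $matches avoids undefined-index notices when the input has no brackets.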


PHP preg_replace - Replace text part but not too soon

Using:
$text = preg_replace("/\[\[(.*?)SPACE(.*?)\]\]/im",'$2',$text);
to clean the text and keep only wordtwo from
$text = '..text.. [[wordoneSPACE**wordtwo**]] ..moretext..';
but it fails if the text has a [[ earlier:
$text = '.. [[ ..text(not to cut).. [[wordoneSPACE**wordtwo**]] ..moretext..';
How can I limit the match to only the [[...]] that contains the SPACE word?
If there can be no [ and ] inside the [[...]] you may use
$text = preg_replace("/\[\[([^][]*)SPACE([^][]*)]]/i",'$2',$text);
See the regex demo. [^][] negated character class will only match a char other than [ and ] and won't cross the [[...]] border.
Otherwise, use a tempered greedy token:
$text = preg_replace("/\[\[((?:(?!\[{2}).)*?)SPACE(.*?)]]/is",'$2',$text);
See this regex demo.
The (?:(?!\[{2}).)*? pattern matches any char, zero or more times but as few as possible, as long as that char does not start a [[ sequence, so the match cannot cross the next [[ opening.
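To see the difference on the failing input from the question, here is a minimal comparison of the original lazy pattern with the two suggested replacements. The variable names are only for illustration.

```php
<?php
$text = '.. [[ ..text(not to cut).. [[wordoneSPACE**wordtwo**]] ..moretext..';

// Original lazy pattern: the match starts at the first [[ and runs up to SPACE.
$lazy = preg_replace('/\[\[(.*?)SPACE(.*?)\]\]/im', '$2', $text);

// Negated character class: neither group can contain [ or ].
$negated = preg_replace('/\[\[([^][]*)SPACE([^][]*)]]/i', '$2', $text);

// Tempered greedy token: group 1 may contain single brackets, but never a [[ pair.
$tempered = preg_replace('/\[\[((?:(?!\[{2}).)*?)SPACE(.*?)]]/is', '$2', $text);

echo $lazy, "\n";      // .. **wordtwo** ..moretext..
echo $negated, "\n";   // .. [[ ..text(not to cut).. **wordtwo** ..moretext..
echo $tempered, "\n";  // .. [[ ..text(not to cut).. **wordtwo** ..moretext..
```

The lazy quantifier only limits how far the match extends to the right; it does not stop the match from starting at the earlier [[, which is why the first result loses the text that should not be cut.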
Another option might be using possessive quantifiers.
In the first group you could use a negated character class to match any char except square brackets and S, and allow an S only when it is not followed by PACE.
\[\[([^][S]++(?:S(?!PACE)|[^][S]+)*+)SPACE([^][]++)\]\]
In parts
\[\[ Match [[
( Capture group 1
[^][S]++ Match 1+ times any char except S, ] or [
(?: Non capturing group
S(?!PACE) Match either an S not followed by PACE
| Or
[^][S]+ Match 1+ times any char except S, ] or [
)*+ Close group and repeat 0+ times
) Close group 1
SPACE Match literally
( Capture group 2
[^][]++ Match 1+ times any char except ] or [
) Close group
\]\] Match ]]
Regex demo
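A short sketch of the possessive pattern in use, assuming / as the delimiter and two made-up sample strings (the second one is not from the question and only shows that a capital S inside group 1 is tolerated):

```php
<?php
// Pattern from the answer above, wrapped in / delimiters.
$re = '/\[\[([^][S]++(?:S(?!PACE)|[^][S]+)*+)SPACE([^][]++)\]\]/';

$samples = [
    '.. [[ ..text(not to cut).. [[wordoneSPACE**wordtwo**]] ..moretext..',
    '[[a Small StepSPACEvalue]]',  // capital S in group 1, not followed by PACE
];

$results = [];
foreach ($samples as $sample) {
    // Keep only group 2, as in the question.
    $results[] = preg_replace($re, '$2', $sample);
}

print_r($results);
// [0] => .. [[ ..text(not to cut).. **wordtwo** ..moretext..
// [1] => value
```

One thing to be aware of: as written, group 1 has to begin with at least one char other than S, so an input such as [[SomeSPACEx]] would be left unchanged.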

How to remove text inside brackets and parentheses at the same time with any whitespace before if present?

I am trying to clean a string in PHP using the following code, but I am not sure how to get rid of the text inside brackets and parentheses at the same time with any whitespace before if present.
The code I am using is:
$string = "Deadpool 2 [Region 4](Blu-ray)";
echo preg_replace("/\[[^)]+\]/","",$string);
The output I'm getting is:
Deadpool 2 (Blu-ray)
However, the desired output is:
Deadpool 2
Using the solutions from similar questions, it is not clear how to remove both kinds of matches while also removing the optional whitespace before them.
There are four main points here:
String between parentheses can be matched with \([^()]*\)
String between square brackets can be matched with \[[^][]*] (or \[[^\]\[]*\] if you prefer to escape literal [ and ], in PCRE, it is stylistic, but in some other regex flavors, it might be a must)
You need alternation to match either this or that pattern and account for any whitespaces before these patterns
Since after removing these strings you may get leading and trailing spaces, you need to trim the string.
You may use
$string = "Deadpool 2 [Region 4](Blu-ray)";
echo trim(preg_replace("/\s*(?:\[[^][]*]|\([^()]*\))/","", $string));
See the regex demo and a PHP demo.
The \[[^][]*] part matches strings between [ and ] having no other [ and ] inside and \([^()]*\) matches strings between ( and ) having no other parentheses inside. trim removes leading/trailing whitespace.
Explanation:
\s* - 0+ whitespaces
(?: - start of a non-capturing group:
\[[^][]*] - [, zero or more chars other than [ and ], and then a ] (note that in a PCRE pattern you may keep these brackets unescaped inside a character class if ] comes right after the initial [ or [^; in JS, you would have to escape ], as in [^\][]*)
| - or (an alternation operator)
\([^()]*\) - (, any 0+ chars other than ( and ) and a )
) - end of the non-capturing group.
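For reuse, the replacement can be wrapped in a function. This is only a sketch: the name stripBracketed and the second sample string are made up for illustration.

```php
<?php
// Hypothetical helper name; the pattern is the one explained above.
function stripBracketed(string $title): string
{
    // Remove [..] and (..) groups together with any whitespace before them,
    // then trim whatever whitespace is left at either end.
    return trim(preg_replace('/\s*(?:\[[^][]*]|\([^()]*\))/', '', $title));
}

echo stripBracketed('Deadpool 2 [Region 4](Blu-ray)'), "\n";  // Deadpool 2
echo stripBracketed('[Import] Alien (1979) [4K]'), "\n";      // Alien
```

The second call shows why trim is still needed: removing a leading [Import] leaves a space at the start of the string.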
Based on just the one sample input there are some simpler approaches.
$string = "Deadpool 2 [Region 4](Blu-ray)";
var_export(preg_replace("~ [[(].*~", "", $string));
echo "\n";
var_export(strstr($string, ' [', true));
Output:
'Deadpool 2'
'Deadpool 2'
These assume that the unwanted substring begins with a space followed by an opening square bracket (the regex also accepts an opening parenthesis).
The strstr() technique requires that the space-bracket sequence exists in the string; otherwise it returns false.
If the unwanted substring marker is not consistently included, then you can use:
var_export(explode(' [', $string, 2)[0]);
This will put the unwanted substring in explode's output array at [1] and the wanted substring in [0].
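A quick side-by-side of the two non-regex calls, with and without the marker present. The second input string is an assumed example of a title that has nothing to strip.

```php
<?php
$withMarker    = 'Deadpool 2 [Region 4](Blu-ray)';
$withoutMarker = 'Deadpool 2';

// strstr() with $before_needle = true returns the part before ' [' ...
var_export(strstr($withMarker, ' [', true));      // 'Deadpool 2'
echo "\n";
// ... but returns false when ' [' does not occur at all.
var_export(strstr($withoutMarker, ' [', true));   // false
echo "\n";

// explode() always returns at least one element, so [0] is safe either way.
var_export(explode(' [', $withMarker, 2)[0]);     // 'Deadpool 2'
echo "\n";
var_export(explode(' [', $withoutMarker, 2)[0]);  // 'Deadpool 2'
echo "\n";
```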

What's wrong with this RegEx?

I wrote this RegEx: '/\[\.{2}([^\.].+)\]/'
And it is supposed to match patterns like this: [..Class,Method,Parameter]
It works until I have a pattern like this: [..Class1,Method1,Para1][..Class2,Method2,Para2]
I tried to make the RegEx lazy by putting a ? behind the +: '/\[\.{2}([^\.].+?)\]/' but it didn't help. Any suggestions?
I believe you wanted to use [^\.]+ rather than [^\.].+. Note that .+ is a greedily quantified dot pattern that matches any 1 or more chars other than line break chars, and thus matches across both ] and [.
Match any 1 or more chars other than ] with [^]]+ rather than using [^\.]:
\[\.{2}([^]]+)]
See this regex demo
Details
\[ - a [ char
\.{2} - two dot chars
([^]]+) - Group 1: one or more chars other than ] (no need to escape ] when it is the first char in a character class)
] - a closing bracket (no need to escape ] when it is outside a character class).
PHP demo:
$str = '[..Class,Method,Parameter] [..Class1,Method1,Para1][..Class2,Method2,Para2]';
preg_match_all('/\[\.{2}([^]]+)]/', $str, $matches);
print_r($matches[0]);
Results:
Array
(
[0] => [..Class,Method,Parameter]
[1] => [..Class1,Method1,Para1]
[2] => [..Class2,Method2,Para2]
)
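To make the greedy behaviour visible, this small sketch runs the original pattern and the negated-class pattern on the two-tag input from the question. The variable names are illustrative only.

```php
<?php
$str = '[..Class1,Method1,Para1][..Class2,Method2,Para2]';

// Greedy .+ runs to the last ] in the string, so both tags end up in one match.
$greedyCount = preg_match_all('/\[\.{2}([^\.].+)\]/', $str, $greedy);

// [^]]+ cannot step over a ], so every tag is matched on its own.
$negatedCount = preg_match_all('/\[\.{2}([^]]+)]/', $str, $negated);

echo $greedyCount, "\n";   // 1
print_r($greedy[1]);       // one element: Class1,Method1,Para1][..Class2,Method2,Para2
echo $negatedCount, "\n";  // 2
print_r($negated[1]);      // two elements: Class1,Method1,Para1 and Class2,Method2,Para2
```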

PHP Regex find + alter + replace within each tag

I am new to regex, if that is indeed what I need.
The string might include :
[name* your-name ]
[email* your-main-email some_thing]
etc
Amateur logic :
Search string for '['
get substring between this and next ']'
extract hyphenated word (probably find first word between first and next space)
Replace substring with hyphenated word
Repeat with all remaining tags
To hopefully produce :
[your-name]
[your-main-email]
etc
Or am I off target with method?
Many thanks
Try this code
$str = '[name* your-name ] [email* your-main-email some_thing]';
$str = preg_replace("/\[[^\s]+\s+([^\]\s]+)\s+[^\]]*\]/", "[$1]", $str);
echo $str;
Regex explanation:
/ Delimiter
\[ Match the opening square bracket
[^\s]+ Match one or more non-whitespace characters
\s+ Match one or more whitespace characters
( Start capturing group
[^\]\s]+ Match one or more characters that are neither whitespace nor ]
) End capturing group
\s+ Match one or more whitespace characters
[^\]]* Match zero or more characters that are not ]
\] Match the closing square bracket
/ Delimiter
Edit
To also replace tags where the trailing space is missing, e.g. [name* your-name], use the following regex:
/\[[^\s]+\s+([^\]\s]+)[^\]]*\]/
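A small comparison of the two patterns, assuming an input that mixes tags with and without the trailing space (the combined test string is made up from the samples in the question):

```php
<?php
$str = '[name* your-name] [email* your-main-email some_thing] [name* your-name ]';

// First pattern: requires whitespace after the captured word,
// so the first tag (no trailing space) is left untouched.
$strict = preg_replace('/\[[^\s]+\s+([^\]\s]+)\s+[^\]]*\]/', '[$1]', $str);

// Pattern from the edit: whatever follows the captured word is optional.
$relaxed = preg_replace('/\[[^\s]+\s+([^\]\s]+)[^\]]*\]/', '[$1]', $str);

echo $strict, "\n";   // [name* your-name] [your-main-email] [your-name]
echo $relaxed, "\n";  // [your-name] [your-main-email] [your-name]
```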

Regex Replace [ with -> depending on a condition

I'm looking for a regex that replaces every "[" with "->", but only when it is not followed by "]".
At the same time it should remove every "]", but only when it is not next to a "[".
So in other words, "test[hi][]" will become "test->hi[]".
Thank you ;)
I really have no clue how to do that ;)
I've made the assumptions that what exists between the brackets follows PHP variable naming conventions (i.e. letters, digits, underscores) and that your code is valid (e.g. no $test['five]).
echo preg_replace('/\[[\'"]?(\w+)[\'"]?\]/', '->\1', $input);
This should handle:
test[one]
test['two']
test["three"]
But not:
test[$four]
No regexp needed!
strtr($str, array('[]'=>'[]','['=>'->',']'=>''))
$ cat 1.php
<?php
echo strtr('[hi][]', array('[]'=>'[]','['=>'->',']'=>''));
$ php 1.php
->hi[]
Replace matches of the regex \[(\w+)\] with -> followed by match group 1.
This should do. It uses
\[ # match a [
( # match group
[^\]]+ # match everything but a ] one or more times
) # close match group
\] # match ]
to match anything between brackets
$replaced = preg_replace("/\[([^\]]+)\]/", "->$1", $string);
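The answers above differ mainly in how they treat quotes inside the brackets. As a rough side-by-side, the sketch below runs three of them on the sample from the question and on a quoted key. The function name bracketsToArrows and the array keys are made up for illustration.

```php
<?php
// Hypothetical helper that applies three of the suggested approaches to one input.
function bracketsToArrows(string $input): array
{
    return [
        // Optional quotes around a \w+ key; the quotes are dropped.
        'word'  => preg_replace('/\[[\'"]?(\w+)[\'"]?\]/', '->\1', $input),
        // strtr() tries the longest key first and never rescans replaced text,
        // so the '[]' => '[]' entry shields empty brackets from the other rules.
        'strtr' => strtr($input, ['[]' => '[]', '[' => '->', ']' => '']),
        // Anything except ] between the brackets; quotes are kept.
        'any'   => preg_replace('/\[([^\]]+)\]/', '->$1', $input),
    ];
}

print_r(bracketsToArrows('test[hi][]'));
// word, strtr and any all give: test->hi[]

print_r(bracketsToArrows("test['two'][]"));
// word gives test->two[], while strtr and any give test->'two'[]
```

Which variant fits depends on whether quoted keys can occur and whether the quotes should survive the conversion.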
