Regex/PHP Replace any repeating (but flexible) word group - php

How can I match "Any Group" when it is repeated as "ANY GROUP" or "ANYGROUP"?
$string = "Foo Bar (Any Group - ANY GROUP Baz)
Foo Bar (Any Group - ANYGROUP Baz)";
so that both lines return as "Foo Bar (Any Group - Baz)"?
The separator would always be -
This post extends "Regex/PHP Replace any repeating word group".
The pattern below matches "Any Group - ANY GROUP", but not when the group is repeated without a space ("ANYGROUP"):
$result = preg_replace(
'%
( # Match and capture
(?: # the following:...
[\w/()]{1,30} # 1-30 "word" characters
[^\w/()]+ # 1 or more non-word characters
){1,4} # 1 to 4 times
) # End of capturing group 1
([ -]*) # Match any number of intervening characters (space/dash)
\1 # Match the same as the first group
%ix', # Case-insensitive, verbose regex
'\1\2', $subject);

This is ugly (as I said it would be), but it should work:
$result = preg_replace(
'/((\b\w+)\s+) # One repeated word
\s*-\s*
\2
|
((\b\w+)\s+(\w+)\s+) # Two repeated words
\s*-\s*
\4\s*\5
|
((\b\w+)\s+(\w+)\s+(\w+)\s+) # Three
\s*-\s*
\7\s*\8\s*\9
|
((\b\w+)\s+(\w+)\s+(\w+)\s+(\w+)\s+) # Four
\s*-\s*
\11\s*\12\s*\13\s*\14\b/ix',
'\1\3\6\10-', $subject);

A solution that handles groups of up to six words:
$result = preg_replace(
'/
(\(\s*)
(([^\s-]+)
\s*?([^\s-]*)
\s*?([^\s-]*)
\s*?([^\s-]*)
\s*?([^\s-]*)
\s*?([^\s-]*))
(\s*\-\s*)
\3\s*\4\s*\5\s*\6\s*\7\s*\8\s*
/ix',
'\1\2\9',
$string);
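As a quick check, the six-word pattern above can be wrapped in a function and run against both variants from the question. This is only a sketch: the function name collapseRepeatedGroup is made up for illustration, and the pattern is copied from the answer with comments added.

```php
<?php
// Hypothetical helper (the name is an assumption, not part of the answer).
function collapseRepeatedGroup(string $text): string
{
    $pattern = '/
        (\(\s*)            # 1: opening parenthesis
        (([^\s-]+)         # 2: the whole group, 3: its first word
        \s*?([^\s-]*)      # 4 to 8: up to five more optional words
        \s*?([^\s-]*)
        \s*?([^\s-]*)
        \s*?([^\s-]*)
        \s*?([^\s-]*))
        (\s*\-\s*)         # 9: the separator
        \3\s*\4\s*\5\s*\6\s*\7\s*\8\s*   # the same words again, spaces optional
        /ix';

    // Keep the parenthesis, the first occurrence and the separator.
    return preg_replace($pattern, '\1\2\9', $text);
}

echo collapseRepeatedGroup("Foo Bar (Any Group - ANY GROUP Baz)"), "\n"; // Foo Bar (Any Group - Baz)
echo collapseRepeatedGroup("Foo Bar (Any Group - ANYGROUP Baz)"), "\n";  // Foo Bar (Any Group - Baz)
```

Because every word is captured separately and the backreferences are joined with \s*, the repetition is found whether or not the words are separated by spaces the second time.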

Related

How to do preg_replace that only matches particular conditions?

I am struggling to write a preg_replace command that achieves what I need.
Essentially I have the following array (all the items follow one of these four patterns):
$array = array('Dogs/Cats', 'Dogs/Cats/Mice', 'ANIMALS/SPECIES Dogs/Cats/Mice', '(Animals/Species) Dogs/Cats/Mice' );
I need to be able to get the following result:
Dogs/Cats = Dogs or Cats
Dogs/Cats/Mice = Dogs or Cats or Mice
ANIMALS/SPECIES Dogs/Cats/Mice = ANIMALS/SPECIES Dogs or Cats or Mice
(Animals/Species) Dogs/Cats/Mice = (Animals/Species) Dogs or Cats or Mice
So basically replace slashes in anything that isn't capital letters or brackets.
I am starting to grasp it but still need some guidance:
preg_replace('/(\(.*\)|[A-Z]\W[A-Z])[\W\s\/]/', '$1 or', $array);
As you can see this recognises the first patterns but I don't know where to go from there
Thanks!
You might use the \G anchor to assert the position at the end of the previous match, and \K to forget what was matched so far, so that only the / is part of the match.
You could optionally match ANIMALS/SPECIES or (Animals/Species) at the start.
(?:^(?:\(\w+/\w+\)\h+|[A-Z]+/[A-Z]+\h+)?|\G(?!^))\w+\K/
Explanation
(?: Non capturing group
^ Assert start of string
(?: Non capturing group, match either
\(\w+/\w+\)\h+ Match between (....) 1+ word chars with a / between ending with 1+ horizontal whitespace chars
| Or
[A-Z]+/[A-Z]+\h+ Match 1+ times [A-Z], / and again 1+ times [A-Z], followed by 1+ horizontal whitespace chars
)? Close non capturing group and make it optional
| Or
\G(?!^) Assert position at the end of the previous match, but not at the start of the string
)\w+ Close non capturing group and match 1+ times a word char
\K/ Forget what was matched, and match a /
In the replacement, use " or " (a space, the word or, and a space).
For example:
$array = array('Dogs/Cats', 'Dogs/Cats/Mice', 'ANIMALS/SPECIES Dogs/Cats/Mice', '(Animals/Species) Dogs/Cats/Mice');
$re = '~(?:^(?:\(\w+/\w+\)\h+|[A-Z]+/[A-Z]+\h+)?|\G(?!^))\w+\K/~';
$array = preg_replace($re, " or ", $array);
print_r($array);
Result:
Array
(
[0] => Dogs or Cats
[1] => Dogs or Cats or Mice
[2] => ANIMALS/SPECIES Dogs or Cats or Mice
[3] => (Animals/Species) Dogs or Cats or Mice
)
Given the way you present your problem and your example strings, doing:
$result = preg_replace('~(?:\S+ )?[^/]*+\K.~', ' or ', $array);
looks sufficient. In other words, you only have to check whether there is a space somewhere, consume the beginning of the string up to and including it, and discard that part from the match result using \K.
But to avoid future disappointments, it is sometimes useful to put yourself in the shoes of the Devil to consider more complex cases and ask embarrassing questions:
What if a category, a subcategory or an item contains a space?
~
(?:^
(?:
\( [^)]* \)
|
\p{Lu}+ (?> [ ] \p{Lu}+ \b )*
(?> / \p{Lu}+ (?> [ ] \p{Lu}+ \b )* )*
)
[ ]
)?
[^/]*+ \K .
~xu
In the same way, to deal with hyphens, single quotes or whatever, you can replace [ ] with [^\pL/] (a class that excludes letters and the slash) or something more specific.
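To see how the verbose pattern behaves when names contain spaces, here is a small sketch. The pattern is the one above with comments added; the sample strings are invented for illustration and are not from the question.

```php
<?php
// Verbose pattern from the answer, unchanged apart from indentation and comments.
$pattern = '~
    (?:^
        (?:
            \( [^)]* \)                               # a "(...)" prefix
          |
            \p{Lu}+ (?> [ ] \p{Lu}+ \b )*             # UPPERCASE words separated by spaces
            (?> / \p{Lu}+ (?> [ ] \p{Lu}+ \b )* )*    # further "/UPPERCASE ..." parts
        )
        [ ]
    )?
    [^/]*+ \K .
~xu';

// Hypothetical inputs with spaces inside categories and items.
$items = [
    'Dogs/Cats',
    'WILD ANIMALS/SPECIES Big Dogs/Cats',
    '(Farm Animals/Species) Dogs/Barn Cats/Mice',
];

print_r(preg_replace($pattern, ' or ', $items));
// [0] => Dogs or Cats
// [1] => WILD ANIMALS/SPECIES Big Dogs or Cats
// [2] => (Farm Animals/Species) Dogs or Barn Cats or Mice
```

The \b after each uppercase word is what stops "Big" from being swallowed into the uppercase prefix: \p{Lu}+ matches the "B", but there is no word boundary between "B" and "i", so that repetition is rejected.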

Split address street name house number and room number

I need to split the address Main Str. 202-52 into
street=Main Str.
house No.=202
room No.=52
I tried to use this:
$data['address'] = "Main Str. 202-52";
$data['street'] = explode(" ", $data['address']);
$data['building'] = explode("-", $data['street'][0]);
This works when the street name is a single word. How can I split an address where the street name consists of several words?
I tried $data['street'] = preg_split('/[0-9]/', $data['address']); but that only gives me the street name.
You may use a regular expression like
/^(.*)\s(\d+)\W+(\d+)$/
if you need everything up to the last whitespace in Group 1, the following digits in Group 2 and the final digits in Group 3. \W+ matches 1+ chars other than word chars, so it matches - and more. If the separator is always a -, just use the hyphen instead of \W+.
A PHP example:
$s = "Main Str. 202-52";
if (preg_match('~^(.*)\s(\d+)\W+(\d+)$~', $s, $m)) {
echo $m[1] . "\n"; // Main Str.
echo $m[2] . "\n"; // 202
echo $m[3]; // 52
}
Pattern details:
^ - start of string
(.*) - Group 1 capturing any 0+ chars other than line break chars, as many as possible, up to the last...
\s - whitespace, followed with...
(\d+) - Group 2: one or more digits
\W+ - 1+ non-word chars
(\d+) - Group 3: one or more digits
$ - end of string.
Also, note that if the last part can be optional, wrap \W+(\d+) in an optional non-capturing group, (?:...)?, i.e. (?:\W+(\d+))?.
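A minimal sketch of that optional variant, assuming the room number may be missing. The function name splitAddress and the array keys are made up for illustration:

```php
<?php
// Hypothetical helper: returns null when the address has no house number.
function splitAddress(string $address): ?array
{
    // Same pattern as above, with the room part made optional.
    if (!preg_match('~^(.*)\s(\d+)(?:\W+(\d+))?$~', $address, $m)) {
        return null;
    }

    return [
        'street'   => $m[1],
        'building' => $m[2],
        // preg_match() drops trailing groups that did not participate,
        // so $m[3] may not exist.
        'room'     => $m[3] ?? null,
    ];
}

print_r(splitAddress('Main Str. 202-52'));   // street: Main Str., building: 202, room: 52
print_r(splitAddress('Long Main Str. 202')); // street: Long Main Str., building: 202, room: null
```

The ?? fallback matters because an unmatched trailing group is simply absent from $m unless the PREG_UNMATCHED_AS_NULL flag is passed.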

split a string which consists decimals instead of integer

I split a string '3(1-5)' like this:
$pattern = '/^(\d+)\((\d+)\-(\d+)\)$/';
preg_match($pattern, $string, $matches);
But I need to do the same thing for decimals, i.e. '3.5(1.5-4.5)'.
And what do I have to do, if the user writes '3,5(1,5-4,5)'?
Output of '3.5(1.5-4.5)' should be:
$matches[1] = 3.5
$matches[2] = 1.5
$matches[3] = 4.5
You can use the following regular expression.
$pattern = '/^(\d+(?:[.,]\d+)?)\(((?1))-((?1))\)$/';
The first capturing group ( ... ) matches the following pattern:
( # group and capture to \1:
\d+ # digits (0-9) (1 or more times)
(?: # group, but do not capture (optional):
[.,] # any character of: '.', ','
\d+ # digits (0-9) (1 or more times)
)? # end of grouping
) # end of \1
Afterwards we look for an opening parenthesis, then reuse the first subpattern via (?1) (capturing the result in group 2), followed by a hyphen (-), then reuse the first subpattern again (captured in group 3), followed by a closing parenthesis.
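If the captured values are needed as numbers, one possible follow-up is to normalise a decimal comma to a dot before casting. This is a sketch under that assumption; the function name parseRange is invented here, and the arrow function requires PHP 7.4 or later.

```php
<?php
// Hypothetical helper: returns the three numbers as floats, or null on no match.
function parseRange(string $input): ?array
{
    $pattern = '/^(\d+(?:[.,]\d+)?)\(((?1))-((?1))\)$/';
    if (!preg_match($pattern, $input, $m)) {
        return null;
    }

    // Drop the full match, then turn "3,5" or "3.5" into the float 3.5.
    return array_map(
        fn(string $n): float => (float) str_replace(',', '.', $n),
        array_slice($m, 1)
    );
}

var_dump(parseRange('3.5(1.5-4.5)')); // floats 3.5, 1.5, 4.5
var_dump(parseRange('3,5(1,5-4,5)')); // floats 3.5, 1.5, 4.5
var_dump(parseRange('3(1-5)'));       // floats 3, 1, 5
```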
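This pattern should also help, provided every number has at least two digits (it matches '3.5(1.5-4.5)' and '3,5(1,5-4,5)' but not '3(1-5)'):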
^(\d+\.?\,?\d+)\((\d+\,?\.?\d+)\-(\d+\.?\,?\d+)\)$

Preg_match/Preg_replace in php for matching pattern and replacing it in php

I want to replace the values in a string with XXX.
input:
insert into employees values('shrenik', 555, NULL)
output:
insert into employees values('XXX', XXX, NULL)
I tried this: ([0-9]|\'.*\')
I want to match insert into first, then skip everything up to the (. The input and the output I need are shown above.
Thanks in advance.
You can use this:
$sql = 'insert into employees values(\'shrenik\', 555, NULL)';
$pattern = '~(?:\binsert into [^(]*\(|\G(?<!^),(?:\s*+NULL,)*)\s*+\K(\')?(?(1)[^\']*\'|(?!NULL\b)[^\s,)]*)~i';
$sql = preg_replace($pattern, '$1XXX$1', $sql);
pattern details
~ # pattern delimiter
(?: # non capturing group: where the pattern is allowed to start
\binsert into [^(]*\( # after "insert into" until the opening parenthesis
| # OR
\G(?<!^), # after a previous match if there is a comma
(?:\s*+NULL,)* # skip NULL values
)
\s*+ # zero or more spaces
\K # reset all that was matched before from match result
(')? # optional capture group 1 with single quote
(?(1) # IF capture group 1 exists:
[^']*' # THEN matches all characters except ' followed by a literal '
| # ELSE
(?!NULL\b)[^\s,)]* # matches any characters except whitespace, comma and ), unless the value is NULL
) # ENDIF
~i # closing pattern delimiter, case-insensitive

Regex/PHP Replace any repeating word group

How can I match the repeated group in
$string = "Foo Bar (Any Group - ANY GROUP Baz)";
so that it returns "Foo Bar (Any Group - Baz)"?
Is this possible without brute force, as in "Replace repeating strings in a string"?
Edit:
* The group could consist of 1-4 words while each word could match [A-Za-z0-9\/\(\)]{1,30}
* The separator would always be -
Leaving the space out of the list of allowed "word" characters, the following works for your example:
$result = preg_replace(
'%
( # Match and capture
(?: # the following:...
[\w/()]{1,30} # 1-30 "word" characters
[^\w/()]+ # 1 or more non-word characters
){1,4} # 1 to 4 times
) # End of capturing group 1
([ -]*) # Match any number of intervening characters (space/dash)
\1 # Match the same as the first group
%ix', # Case-insensitive, verbose regex
'\1\2', $subject);
