regex how i write this pattern? - php

$string = "anyWord Hello A 1 *** .";
preg_match('/(.*?) Hello (A|B) (1|0) (if(g2 == B)then|else).*/i',$string,$match);
// g1 g2 g3 -->|
print_r($match);
What should I do?

Try this:
preg_match("/(.*?) Hello ([AB]) (?|(0)|(1) (\*\*\*))/i",$string,$match);
The (?|...|...) structure is a branch reset group: the subpatterns in each alternative are numbered from the same starting index instead of continuing across alternatives. Without it, match 3 would be either 0 or empty, and matches 4 and 5 would be 1 and *** or empty. The branch reset combines them, so match 3 is 0 or 1, and match 4 is either empty or ***.
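A minimal sketch of how the numbering works out in practice, run against the question's string and against a second, made-up string that contains 0 instead of 1. The second string and the variable names $withOne and $withZero are assumptions for illustration only.

```php
<?php
// Branch reset: both alternatives inside (?|...) start numbering at group 3.
$pattern = '/(.*?) Hello ([AB]) (?|(0)|(1) (\*\*\*))/i';

// String from the question: the "1 ***" alternative matches.
preg_match($pattern, 'anyWord Hello A 1 *** .', $withOne);
// $withOne[3] is "1" and $withOne[4] is "***".

// Hypothetical second string: the "0" alternative matches.
preg_match($pattern, 'anyWord Hello B 0 .', $withZero);
// $withZero[3] is "0"; group 4 did not take part, so it is not set.

print_r($withOne);
print_r($withZero);
```

Either way, the digit lands in index 3, so the calling code does not have to check two different group numbers.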

Related

Regex match from right to left

I want to match something from right to left; below is one such example.
100abababab3x3x3xx1000morewords
I want to match the text between the last xx and the ab immediately before it, and get 3x3x3.
I tried something like the following, but it matches ababab3x3x3:
preg_match('/ab(.*?)xx/',$text,$totmat);
Note: please don't recommend strrev.
The example above is just for illustration; all I want to do is match from right to left.
I'm not sure this is the most optimized way, but it will work if you combine a positive lookahead (?=) with a positive lookbehind (?<=). See regex
<?php
$re = '/\w+(?<=ab)(.*?)(?=xx)/m';
$str = '100abababab3x3x3xx1000morewords';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
// Print the first capture group of the first match
echo $matches[0][1];
DEMO: https://3v4l.org/db69N
$str = '100abababab3x3x3xx1000morewords';
preg_match('/ab((?:(?!ab).)*)xx/', $str, $m);
print_r($m);
Output:
Array
(
[0] => ab3x3x3xx
[1] => 3x3x3
)
Explanation:
ab : literally ab
( : start group 1
(?: : start non-capturing group
(?!ab) : negative lookahead, make sure we don't have ab
. : any character but newline
)* : end non-capturing group, may appear 0 or more times
) : end group 1
xx : literally xx
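To see that the lookahead really keeps ab out of the capture, here is a small sketch that runs the same pattern with preg_match_all over a made-up string containing two separate ab...xx stretches. The input string and the variable names are assumptions for illustration.

```php
<?php
// Hypothetical input with two separate "ab ... xx" stretches.
$text = 'ab1xxabab2xx';

// (?:(?!ab).)* consumes one character at a time, but only while the
// next two characters are not "ab", so the capture can never contain "ab".
preg_match_all('/ab((?:(?!ab).)*)xx/', $text, $found);

print_r($found[1]); // the two captures: "1" and "2"
```

The second capture starts after the last ab of "abab", which is the "immediately previous ab" behaviour the question asks for.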
There are approaches other than regex to this kind of problem that can be close to twice as fast in computing time.
Here, for example:
$str = "100abababab3x3x3xx1000morewords";
$result = explode("ab", explode("xx", $str)[0]);
var_dump(end($result));
The first call to explode splits the string in two around the "xx" characters. We're only interested in the left part (index 0).
The second call to explode splits that part on the characters ab. We're only interested in what follows the last occurrence of ab, so var_dump(end($result)); prints the expected result.
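Note that explode("xx", $str)[0] keeps the part before the first xx. When the string may contain several xx and the last one is wanted, a sketch based on strrpos is one option. The helper name lastEnclosed and its parameters are made up for this example.

```php
<?php
// Hypothetical helper: returns the text between the last $close and the
// last $open that precedes it, or null when either marker is missing.
function lastEnclosed(string $str, string $open, string $close): ?string
{
    // Position of the last closing marker.
    $end = strrpos($str, $close);
    if ($end === false) {
        return null;
    }
    // Position of the last opening marker before that closing marker.
    $start = strrpos(substr($str, 0, $end), $open);
    if ($start === false) {
        return null;
    }
    $from = $start + strlen($open);
    return substr($str, $from, $end - $from);
}

var_dump(lastEnclosed('100abababab3x3x3xx1000morewords', 'ab', 'xx'));
```

No string reversal is involved; strrpos simply searches from the right.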

don't match string in brackets php regex

I've been trying to use preg_replace() in PHP to replace a string. I want to match and replace every 's' in this string, but I only came up with a solution that matches the 's' between 'b' and 'c' or the 's' between > and <. Is there any way to use a negative lookbehind not just for the character '>' but for a whole string? I don't want to replace anything inside the brackets.
<text size:3>s<text size:3>absc
<text size:3>xxetxx<text size:3>sometehing
Edit:
I just want to get the 's' in >s< and in bsc. Then, when I change the search string, for example from 's' to 'te', it should replace 'te' in xtex and sometehing. So I'm looking for a regular expression that avoids replacing anything inside <....>
You can use this pattern:
$pattern = '/((<[^>]*>)*)([^s]*)s/';
$replace = '\1\3■'; # ■ = your replacement string
$result = preg_replace( $pattern, $replace, $str );
regex101 demo
Pattern explanation:
( # group 1:
(<[^>]*>)* # group 2: zero-or-more <...>
)
([^s]*) # group 3: zero-or-more not “s”
s # literally “s”
If you want a case-insensitive match, add an “i” at the end of the pattern:
$pattern = '/((<[^>]*>)*)([^s]*)s/i';
Edit: Replacement explanation
The search pattern has 3 groups, each surrounded by round brackets. In the replacement string we can refer to a group with the syntax \1, where 1 is the group number.
So the replacement string in the example means: keep group 1 as it is, keep group 3 as it is, and replace “s” with the desired replacement. We don't need group 2 because it is included in group 1 (a repeated group only retains its last repetition, which is why the outer group 1 is there).
In the demo string:
abs<text size:3>ssss<text size:3><img src="img"><text size:3>absc
└┘╵└───────────┘╵╵╵╵└───────────────────────────────────────┘└┘╵╵
└─┘└────────────┘╵╵╵└──────────────────────────────────────────┘
 1       2       345                     6
Pattern matches:
       group 1    group 3    s
       ---------  ---------  ---------
1 >    0          1          1
2 >    1          0          1
3 >    0          0          1
4 >    0          0          1
5 >    0          0          1
6 >    3          1          1
The last “c” is not matched, so it is not replaced.
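One caveat with the pattern above: [^s]* is free to run into a following tag, so on the question's second sample line it reaches the s of size. A hedged alternative is to let the regex consume each <...> first and discard it with the PCRE verbs (*SKIP)(*F), so that only text outside tags can match. The function name replaceOutsideTags and its parameters are assumptions for illustration, and the sketch assumes that tags never contain a literal > inside an attribute value.

```php
<?php
// Hypothetical helper: replace $needle with $replacement, but only
// outside of <...> tags. Assumes no ">" inside attribute values.
function replaceOutsideTags(string $needle, string $replacement, string $str): string
{
    // <[^>]*>(*SKIP)(*F) matches a whole tag, then forces a failure and
    // tells the engine not to retry inside it; the second alternative
    // is the literal text we actually want to replace.
    $pattern = '~<[^>]*>(*SKIP)(*F)|' . preg_quote($needle, '~') . '~';

    // A callback keeps "$" or "\" in $replacement from being read as
    // backreferences.
    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($replacement): string {
            return $replacement;
        },
        $str
    );

    // preg_replace_callback returns null on error; fall back to the input.
    return $result ?? $str;
}

echo replaceOutsideTags('s', 'te', '<text size:3>s<text size:3>absc'), "\n";
echo replaceOutsideTags('te', 's', '<text size:3>xxetxx<text size:3>sometehing'), "\n";
```

Because the needle is passed through preg_quote, the same helper works when the search string changes from 's' to 'te', as described in the question's edit.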
Use preg_match_all to find every letter s, and pass the flag PREG_OFFSET_CAPTURE to get their offsets.
The regular expression $pat contains a negative lookbehind and a negative lookahead so that the s inside the <text size:3> bracket expression is not matched.
In this example I replace s with the string 5. Change it to the character you want to substitute (assigning to a string offset like this only writes a single character):
<?php
$s = " <text size:3>s<text size:3>absc";
$pat = "/(?<!\<text )s(?!ize:3\>)/";
preg_match_all($pat, $s, $matches, PREG_OFFSET_CAPTURE);
foreach ($matches[0] as $match) {
    $s[$match[1]] = "5";
}
print_r(htmlspecialchars($s));
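Assigning to a string offset, as in the loop above, writes exactly one character, so it cannot insert a longer replacement. A minimal sketch of one way around that, using the same lookaround pattern: walk the matches from last to first and splice with substr_replace, so the earlier offsets stay valid. The replacement "te" is an arbitrary example value.

```php
<?php
$s = " <text size:3>s<text size:3>absc";
$pat = '/(?<!<text )s(?!ize:3>)/';
$replacement = "te"; // arbitrary example; may be longer than one character

preg_match_all($pat, $s, $matches, PREG_OFFSET_CAPTURE);

// Go backwards: replacing from the end leaves the earlier offsets untouched.
foreach (array_reverse($matches[0]) as [$text, $offset]) {
    $s = substr_replace($s, $replacement, $offset, strlen($text));
}

echo $s, "\n";
```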

RegEx Challenge: Capture all the numbers in a specific row

Assume we have this text:
...
settingsA=9, 4.2
settingsB=3, 1.5, 9, 2, 4, 6
settingsC=8, 3, 2.5, 1
...
The question is how can I capture all the numbers that are in specific row using a single step?
Single step means:
single regex pattern.
single operation (no loops or splits, etc.)
all matches are captured in one array.
Let's say I want to capture all the numbers that are present in the row that starts with settingsB=. The final result should look like this:
3
1.5
9
2
4
6
My failed attempts:
<?php
$subject =
"settingsA=9, 4.2
settingsB=3, 1.5, 9, 2, 4, 6
settingsC=8, 3, 2.5, 1";
$pattern = '/([\d\.]+)(, )?/'; // FAILED!
$pattern = '/(?:settingsB=)(?:([\d\.]+)(?:, )?)/'; // FAILED!
$pattern = '/(?:settingsB=)(?:([\d\.]+)(?:, )?)+/'; // FAILED!
$pattern = '/(?<=^settingsB=|, )([\d+\.]+)/'; // FAILED!
preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER);
if ($matches) {
print_r($matches);
}
?>
UPDATE 1: Unfortunately, @Saleem's example uses multiple steps instead of a single step. I'm not saying that his example is bad (it actually works), but I want to know whether there is another way to do it, and how. Any ideas?
UPDATE 2: @bobble bubble provided a perfect solution for this challenge.
You can use the \G anchor to glue matches to the end of a previous match. This pattern, which also uses \K to reset the start of the match just before the desired part, works with the PCRE regex flavor.
(?:settingsB *=|\G(?!^) *,) *\K[\d.]+
(?: opens a non-capturing group for alternation
settingsB *= matches settingsB, followed by any amount of space, followed by a literal =
|\G(?!^) or continues where the previous match ended, but not at the start of the string
 *, and matches a comma preceded by optional space
) ends the alternation (non-capturing group)
 *\K resets the start of the match after optional space
[\d.]+ matches one or more digits and periods.
If the sequence contains tabs or newlines, use \s (any whitespace character) instead of a space.
See demo at regex101 or PHP demo at eval.in
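For reference, a short sketch of that pattern with preg_match_all on the question's sample text. Only the variable names are chosen here; the pattern is the one given above.

```php
<?php
$subject = "settingsA=9, 4.2\nsettingsB=3, 1.5, 9, 2, 4, 6\nsettingsC=8, 3, 2.5, 1";

// The first alternative finds the row; \G(?!^) then chains each further
// match to the end of the previous one, so the chain breaks at the newline.
$pattern = '/(?:settingsB *=|\G(?!^) *,) *\K[\d.]+/';

preg_match_all($pattern, $subject, $matches);

// Thanks to \K, the full matches in $matches[0] are the bare numbers.
print_r($matches[0]);
```

This is a single pattern and a single call, and all six numbers end up in one array, which is what the question asks for.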
or this more compatible pattern with use of a capturing group instead of \K which should work in any regex flavor that supports the \G anchor (Java, .NET, Ruby...)
Here is a Python solution; I will post a PHP regex later. Python and PHP regexes are quite similar, however.
(?<=settingsB=)(\d+(?:\.\d+)?(?:, )?)+
Python:
import re
subject = """
...
settingsA=9, 4.2
settingsB=3, 1.5, 9, 2, 4, 6
settingsC=8, 3, 2.5, 1
...
"""
rx = re.compile(r"(?<=settingsB=)(\d+(?:\.\d+)?(?:, )?)+", re.IGNORECASE)
result = rx.search(subject)
if result:
    numString = result.group()
    for n in [f.strip() for f in numString.split(',')]:
        print(n)
PHP
$subject =
"settingsA=9, 4.2
settingsB=3, 1.5, 9, 2, 4, 6
settingsC=8, 3, 2.5, 1";
$pattern = '/(?<=settingsB=)(\d+(?:\.\d+)?(?:, )?)+/i';
preg_match($pattern, $subject, $matches);
if ($matches) {
    $num = explode(",", $matches[0]);
    for ($i = 0; $i < count($num); $i++) {
        print(trim($num[$i]) . "\n");
    }
}
Output:
3
1.5
9
2
4
6

Get all matched groups PREG PHP flavor

Pattern : '/x(?: (\d))+/i'
String : x 1 2 3 4 5
Returned : 1 Match Position[11-13] '5'
I want to catch all possible repetitions, or does it return 1 result per group?
I want the following :
Desired Output:
MATCH 1
1. [4-5] `1`
2. [6-7] `2`
3. [8-9] `3`
4. [10-11] `4`
5. [12-13] `5`
I was able to achieve this just by copy-pasting the group, but that is not what I want. I want dynamic group capturing.
Pattern: x(?: (\d))(?: (\d))(?: (\d))(?: (\d))(?: (\d))
You cannot use one group to capture multiple texts and then access all of them with PCRE; a repeated group only keeps its last capture. Instead, you can either match the whole substring with \d+(?:\s+\d+)* and then split it on whitespace:
$str = "x 1 2 3 4 5";
$re2 = '~\d+(?:\s+\d+)*~';
if (preg_match($re2, $str, $match2)) {
    print_r(preg_split("/\\s+/", $match2[0]));
}
Alternatively, use a \G based regex to return multiple matches:
(?:x|(?!^)\G)\s*\K\d+
See demo
Here is a PHP demo:
$str = "x 1 2 3 4 5";
$re1 = '~(?:x|(?!^)\G)\s*\K\d+~';
preg_match_all($re1, $str, $matches);
var_dump($matches);
Here, (?:x|(?!^)\G) acts as a leading boundary (the whitespace and digits are matched only after x or right after a previous successful match). When the digits are reached, all the characters matched so far are dropped from the reported match by the \K operator.
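A small side-by-side sketch of the two behaviours on the question's string: the repeated group keeps only its last capture, while the \G pattern returns one match per digit. PREG_OFFSET_CAPTURE is added here to show each match's byte offset in $str; the variable names are chosen for this example.

```php
<?php
$str = 'x 1 2 3 4 5';

// A repeated capture group is overwritten on every iteration,
// so only the final digit survives in group 1.
preg_match('/x(?: (\d))+/i', $str, $single);

// The \G version yields one match per digit; PREG_OFFSET_CAPTURE
// adds the byte offset of each match.
preg_match_all('~(?:x|(?!^)\G)\s*\K\d+~', $str, $all, PREG_OFFSET_CAPTURE);

echo $single[1], "\n";
foreach ($all[0] as [$digit, $offset]) {
    echo "$offset: $digit\n";
}
```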

How to split phone numbers in string with spaces use php

I have different strings containing phone numbers, like this:
New order to car wash #663. Customer number is 7962555443. Thank you.
or
New order to car wash #663. Customer number is 50414. Thank you, bye.
or
New order to car wash #663. A phone number to connect with the customer is 905488739038.
I need this:
New order to car wash #663. Customer number is 7 9 6 2 5 5 5 4 4 3. Thank you.
or
New order to car wash #663. Customer number is 5 0 4 1 4. Thank you, bye.
or
New order to car wash #663. A phone number to connect with the customer is 9 0 5 4 8 8 7 3 9 0 3 8.
I need to space out the digits of numbers that contain more than 3 characters.
preg_replace alone, without any callback function, is sufficient.
preg_replace('~#\d+(*SKIP)(*F)|(?<=\d)(?=\d)~', ' ', $str);
DEMO
#\d+(*SKIP)(*F) matches and discards all the numbers that start with #.
| OR
(?<=\d)(?=\d) then, in the remaining string, matches the boundary that exists between two digits.
Replacing each matched boundary with a space gives you the desired output.
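A quick sketch of this replacement on the first sample sentence. It also includes a made-up string, 'Room 12', to show that this pattern spaces out every run of digits regardless of its length, since it has no minimum-length check. Variable names are chosen for this example.

```php
<?php
$pattern = '~#\d+(*SKIP)(*F)|(?<=\d)(?=\d)~';

// Sample sentence from the question: "#663" is skipped, the long number is split.
$str = 'New order to car wash #663. Customer number is 7962555443. Thank you.';
$spaced = preg_replace($pattern, ' ', $str);
echo $spaced, "\n";

// Hypothetical input: a two-digit number is split as well.
$short = preg_replace($pattern, ' ', 'Room 12');
echo $short, "\n";
```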
You could use a callback for this:
$str = preg_replace_callback('~\b\d{4,}~',
    function($m) {
        return implode(' ', str_split($m[0]));
    }, $str);
eval.in
This can also be done with the \G anchor. Replace with the matched digit + space: "$0 "
I need to separate numbers contains more than 3 symbols.
$str = preg_replace('~\b\d(?=\d{3})|\G\d\B~', "$0 ", $str);
\b\d matches a word boundary \b followed by a digit (\d is short for [0-9])
(?=\d{3}) uses a lookahead to check that the next 3 characters after the first \d are digits too
|\G\d\B OR matches a digit at \G, the end of the previous match, followed by \B, a non-word-boundary
See test at regex101 or eval.in
As an alternative, you could also replace the first digit after a \h horizontal space: \h\d|\G\d\B
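A short sketch of the \G replacement on two of the question's sample sentences. Only the $samples and $out variable names are introduced here.

```php
<?php
$samples = [
    'New order to car wash #663. Customer number is 7962555443. Thank you.',
    'New order to car wash #663. Customer number is 50414. Thank you, bye.',
];

$out = [];
foreach ($samples as $str) {
    // The first alternative starts a number of 4+ digits; \G\d\B then
    // continues digit by digit and stops before the final digit.
    $out[] = preg_replace('~\b\d(?=\d{3})|\G\d\B~', '$0 ', $str);
}

print_r($out);
```

The three-digit #663 is left alone because the lookahead (?=\d{3}) fails on it, and the last digit of each number gets no trailing space because \B does not hold before the period.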
Try this.
$phonenumber="7962555443";
$formatted = implode(' ',str_split($phonenumber));
You can use implode; you just need to call str_split first, which converts the string to an array:
$number="905488739038";
$formatted = implode(' ',str_split($number));
echo $formatted;
Output:
9 0 5 4 8 8 7 3 9 0 3 8
Ref: http://www.php.net/manual/en/function.str-split.php
You may try this regex as well:
((?:(?:\d)\s?){4,})
It will capture all numbers of length four or more. You also need an additional step to remove the spaces from matches like this:
7 9 6 2 5 5 5 4 4 3
Demo
