Pattern : '/x(?: (\d))+/i'
String : x 1 2 3 4 5
Returned : 1 Match Position[11-13] '5'
I want to catch all possible repetitions, or does it return 1 result per group?
I want the following :
Desired Output:
MATCH 1
1. [4-5] `1`
2. [6-7] `2`
3. [8-9] `3`
4. [10-11] `4`
5. [12-13] `5`
I was able to achieve this by copy-pasting the group (pattern below), but that is not what I want. I want the capturing to be dynamic, for any number of repetitions.
Pattern: x(?: (\d))(?: (\d))(?: (\d))(?: (\d))(?: (\d))
You cannot use one group to capture multiple substrings and then access all of them with PCRE: a repeated capturing group keeps only the text from its last iteration. Instead, you can either match the whole substring with \d+(?:\s+\d+)* and then split it on whitespace:
$str = "x 1 2 3 4 5";
$re2 = '~\d+(?:\s+\d+)*~';
if (preg_match($re2, $str, $match2)) {
print_r(preg_split("/\\s+/", $match2[0]));
}
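To see why a single repeated group is not enough, here is a minimal check that assumes the pattern and sample string from the question. After the match, group 1 holds only the last digit:

```php
<?php
// A repeated capturing group is overwritten on every iteration,
// so only the final repetition is left in $m[1].
$str = "x 1 2 3 4 5";
preg_match('/x(?: (\d))+/i', $str, $m);

echo $m[0], "\n"; // the whole match
echo $m[1], "\n"; // the last repetition only
```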
Alternatively, use a \G based regex to return multiple matches:
(?:x|(?!^)\G)\s*\K\d+
Here is a PHP demo:
$str = "x 1 2 3 4 5";
$re1 = '~(?:x|(?!^)\G)\s*\K\d+~';
preg_match_all($re1, $str, $matches);
var_dump($matches);
Here, (?:x|(?!^)\G) acts as a leading boundary: whitespace and digits are matched only after x or directly after the previous successful match. The (?!^) part keeps \G from matching at the very start of the string. Once the digits are reached, everything matched so far is dropped from the result by the \K operator.
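If the positions are needed as well, the same \G pattern can be combined with PREG_OFFSET_CAPTURE. This is only a sketch on the sample string; the variable names $digits and $offsets are mine, and the offsets are zero-based byte offsets:

```php
<?php
// Collect every digit run after "x" together with its byte offset.
$str = "x 1 2 3 4 5";
$re  = '~(?:x|(?!^)\G)\s*\K\d+~';

preg_match_all($re, $str, $matches, PREG_OFFSET_CAPTURE);

// With PREG_OFFSET_CAPTURE each entry is [matched text, offset].
$digits  = array_column($matches[0], 0);
$offsets = array_column($matches[0], 1);

foreach ($matches[0] as $i => [$text, $offset]) {
    printf("%d. [%d] %s\n", $i + 1, $offset, $text);
}
```

Because of \K, the reported offset is where the digits start, not where the leading x or whitespace starts.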
How can I remove all numbers except those that are part of an alphanumeric run? For example, if I have a string like this:
Abs_1234abcd_636950806858590746.lands
so that it becomes this:
Abs_1234abcd_.lands
It is probably done like this
Find (?i)(?<![a-z\d])\d+(?![a-z\d])
Replace with nothing.
Explained:
It's important to note that the class [a-z\d] within the assertions
contains a digit. Without it, part of "abc901234def" could still match.
(?i) # Case insensitive
(?<! [a-z\d] ) # Behind, not a letter nor digit
\d+ # Many digits
(?! [a-z\d] ) # Ahead, not a letter nor digit
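A quick way to check the pattern above is to run it on the sample string and on the "abc901234def" case. The variable names are mine, and the last call drops \d from both classes only to show the partial match that the note warns about:

```php
<?php
// Remove digit runs that are not attached to a letter or another digit.
$re = '/(?<![a-z\d])\d+(?![a-z\d])/i';

$cleaned   = preg_replace($re, '', 'Abs_1234abcd_636950806858590746.lands');
$untouched = preg_replace($re, '', 'abc901234def');

// Same idea without \d in the classes: part of the digit run matches.
preg_match('/(?<![a-z])\d+(?![a-z])/i', 'abc901234def', $m);
$partial = $m[0];

echo $cleaned, "\n";
echo $untouched, "\n";
echo $partial, "\n";
```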
Note - a speedier version exists (?i)\d(?<!\d[a-z\d])\d*(?![a-z\d])
Regex1: (?i)\d(?<!\d[a-z\d])\d*(?![a-z\d])
Completed iterations: 50 / 50 ( x 1000 )
Matches found per iteration: 2
Elapsed Time: 0.53 s, 530.56 ms, 530564 µs
Matches per sec: 188,478
Regex2: (?i)(?<![a-z\d])\d+(?![a-z\d])
Completed iterations: 50 / 50 ( x 1000 )
Matches found per iteration: 2
Elapsed Time: 0.91 s, 909.58 ms, 909577 µs
Matches per sec: 109,941
In this specific example, we can simply use _ as a left boundary and . as the right boundary, collect our digits, and replace:
$re = '/(.+[_])[0-9]+(\..+)/m';
$str = 'Abs_1234abcd_636950806858590746.lands';
$subst = '$1$2';
$result = preg_replace($re, $subst, $str);
echo $result;
For your example data, you could also match a character that is either a non-word character or an underscore, using the character class [\W_], and then discard what was matched using \K.
Then match 1+ digits that you want to replace with an empty string, and assert that what follows on the right is again a non-word character or an underscore.
[\W_]\K\d+(?=[\W_])
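Applied with preg_replace, this could look roughly as follows (a sketch on the sample string from the question only):

```php
<?php
$str = 'Abs_1234abcd_636950806858590746.lands';

// \K drops the leading boundary character from the match,
// so only the digits themselves are replaced.
$result = preg_replace('/[\W_]\K\d+(?=[\W_])/', '', $str);

echo $result, "\n";
```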
I want to match something from right to left. Below is one such example.
100abababab3x3x3xx1000morewords
I want to match the text between the last xx and the ab immediately before it, and get 3x3x3.
I tried something like the code below, but it matches ababab3x3x3:
preg_match('/ab(.*?)xx/',$text,$totmat);
Note: please don't recommend strrev.
The example above is just for illustration; all I want to do is match from right to left.
I am not sure whether this is the most optimized way, but it will work for you if you combine a positive lookbehind (?<=) with a positive lookahead (?=).
<?php
$re = '/\w+(?<=ab)(.*?)(?=xx)/m';
$str = '100abababab3x3x3xx1000morewords';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
// Print the first capture group of the first match
echo $matches[0][1];
DEMO: https://3v4l.org/db69N
$str = '100abababab3x3x3xx1000morewords';
preg_match('/ab((?:(?!ab).)*)xx/', $str, $m);
print_r($m);
Output:
Array
(
[0] => ab3x3x3xx
[1] => 3x3x3
)
Explanation:
ab : literally ab
( : start group 1
(?: : start non capture group
(?!ab) : negative lookahead, make sure we don't have ab
. : any character but newline
)* : end group, may appear 0 or more times
) : end group 1
xx : literally xx
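To make the difference from the lazy attempt in the question visible, here is a small side-by-side sketch on the same string (the variable names $lazy and $tempered are mine):

```php
<?php
$str = '100abababab3x3x3xx1000morewords';

// Lazy quantifier: the match still starts at the first "ab" in the string.
preg_match('/ab(.*?)xx/', $str, $lazy);

// Tempered dot: the captured part may not contain another "ab",
// so the match has to start at the "ab" closest to "xx".
preg_match('/ab((?:(?!ab).)*)xx/', $str, $tempered);

echo $lazy[1], "\n";
echo $tempered[1], "\n";
```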
There are non-regex approaches to this kind of problem that would be close to twice as fast in computing time.
Here for example :
$str = "100abababab3x3x3xx1000morewords";
$result = explode("ab", explode("xx", $str)[0]);
var_dump(end($result));
The first explode call splits the string in two around the "xx" characters. We are only interested in the left part (index 0).
The second explode call splits that part on the characters ab. We are only interested in what follows the last occurrence of ab, so var_dump(end($result)); prints the expected result.
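If the string can contain more than one xx and the last one is the one that matters, a variant based on strrpos might be closer to what the question asks. This is only a sketch; between_last is a hypothetical helper name, not a built-in function:

```php
<?php
// Hypothetical helper: returns the text between the last $right
// and the closest $left before it, or null if either is missing.
function between_last(string $s, string $left, string $right): ?string
{
    $end = strrpos($s, $right);          // position of the last $right
    if ($end === false) {
        return null;
    }
    $head  = substr($s, 0, $end);        // everything before it
    $start = strrpos($head, $left);      // last $left in that part
    if ($start === false) {
        return null;
    }
    return substr($head, $start + strlen($left));
}

var_dump(between_last('100abababab3x3x3xx1000morewords', 'ab', 'xx'));
```

Both searches run from the right, so no reversal of the string is needed.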
This is an example:
$str="this is string 1 / 4w";
$str=preg_replace(?); var_dump($str);
I want to capture 1 / 4w in this string and move this portion to the beginning of the string.
Result: 1/4W this is string
Just give me the variable that contains the capture.
The last portion 1 / 4W may be different.
e.g. 1 / 4w can be 1/ 16W , 1 /2W , 1W , or 2w
The character W may be an upper case or a lower case.
Use capture groups if you want to capture substrings:
$str = "this is string 1 / 4w"; // "1 / 4w" can be 1/ 16W, 1 /2W, 1W, 2w
$str = preg_replace('~^(.*?)(\d+(?:\s*/\s*\d+)?w)~i', "$2 $1", $str);
var_dump($str);
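If the moved part should also be normalized to the 1/4W form shown in the question's expected result, preg_replace_callback is one option. The sketch below assumes the value always sits at the end of the string; move_suffix_to_front is a hypothetical name:

```php
<?php
// Hypothetical helper: moves a trailing value such as "1 / 4w" to the front,
// removing its inner whitespace and upper-casing the letter.
function move_suffix_to_front(string $str): string
{
    $re = '~^(.*?)\s*(\d+(?:\s*/\s*\d+)?w)\s*$~i';

    return preg_replace_callback($re, function (array $m): string {
        $suffix = strtoupper(preg_replace('/\s+/', '', $m[2]));
        return $suffix . ' ' . $m[1];
    }, $str);
}

var_dump(move_suffix_to_front("this is string 1 / 4w"));
```

A string without such a trailing value is returned unchanged, since the pattern does not match it.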
Without seeing more varied sample inputs, it seems as though there are no digits in the first substring. For this reason, I use a negated character class to capture the first substring, leave out the delimiting space, and then capture the rest of the string as the second substring. This makes my pattern very efficient (6x faster than Toto's and with no lingering white-space characters).
Code:
$str="this is string 1 / 4w";
$str=preg_replace('/([^\d]+) (.*)/',"$2 $1",$str);
var_export($str);
Output:
'1 / 4w this is string'
I've been trying to use preg_replace() in PHP to replace part of a string. I want to match and replace every 's' in this string, but I only came up with a solution that matches an 's' between 'b' and 'c', or an 's' between > and <. Is there any way I can use a negative lookbehind not just for the character '>' but for a whole string? I don't want to replace anything inside the angle brackets.
<text size:3>s<text size:3>absc
<text size:3>xxetxx<text size:3>sometehing
edit:
I only want the 's' in >s< and in bsc. Likewise, if I change the search string from 's' to 'te', I want to replace the 'te' in xxetxx and sometehing. So I am looking for a regular expression that avoids replacing anything inside <....>
You can use this pattern:
$pattern = '/((<[^>]*>)*)([^s]*)s/';
$replace = '\1\3■'; # ■ = your replacement string
$result = preg_replace( $pattern, $replace, $str );
Pattern explanation:
( # group 1:
(<[^>]*>)* # group 2: zero-or-more <...>
)
([^s]*) # group 3: zero-or-more not “s”
s # literally “s”
If you want to match case-insensitively, add an “i” at the end of the pattern:
$pattern = '/((<[^>]*>)*)([^s]*)s/i';
Edit: Replacement explanation
In the search pattern there are 3 groups enclosed in parentheses. In the replacement string a group is referenced with the syntax \1, where 1 is the group number.
So the replacement string in the example means: put group 1 back unchanged, put group 3 back unchanged, and replace “s” with the desired replacement. Group 2 is not needed because it is contained in group 1 (a repeated group retains only its last iteration, so group 2 alone would not hold all the tags).
In the demo string:
abs<text size:3>ssss<text size:3><img src="img"><text size:3>absc
└┘╵└───────────┘╵╵╵╵└───────────────────────────────────────┘└┘╵╵
└─┘└────────────┘╵╵╵└──────────────────────────────────────────┘
1 2 345 6
Pattern matches:
group 1 group 3 s
--------- --------- ---------
1 > 0 1 1
2 > 1 0 1
3 > 0 0 1
4 > 0 0 1
5 > 0 0 1
6 > 3 1 1
The last “c” is not matched, so it is not replaced.
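As an alternative sketch, and not part of the pattern above, the PCRE backtracking verbs (*SKIP)(*FAIL) can be used to step over every <...> block and replace only the “s” characters outside of them. The replacement “te” and the variable names are just examples:

```php
<?php
// Any <...> block is matched first and then discarded by (*SKIP)(*FAIL);
// only an "s" outside such a block reaches the second alternative.
$re = '/<[^>]*>(*SKIP)(*FAIL)|s/';

$result = preg_replace($re, 'te', '<text size:3>s<text size:3>absc');
$mixed  = preg_replace($re, 'te', 'ab<text size:3>s');

echo $result, "\n";
echo $mixed, "\n";
```

This version does not depend on what comes before a tag, so plain text followed by a tag is handled too.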
Use preg_match_all to get all the s letters and use it with flag PREG_OFFSET_CAPTURE to get the indices.
The regular expression $pat uses a negative lookbehind and a negative lookahead so that the s inside the bracketed <text size:3> tag is not matched.
In this example I replace s with the string 5. Change to the string you want to substitute:
<?php
$s = " <text size:3>s<text size:3>absc";
$pat = "/(?<!\<text )s(?!ize:3\>)/";
preg_match_all($pat, $s, $matches, PREG_OFFSET_CAPTURE);
foreach ($matches[0] as $match) {
$s[$match[1]] = "5";
}
print_r(htmlspecialchars($s));
$string = "anyWord Hello A 1 *** .";
preg_match('/(.*?) Hello (A|B) (1|0) (if(g2 == B)then|else).*/i',$string,$match);
// g1 g2 g3 -->|
print_r($match);
What should I do?
Try this:
preg_match("/(.*?) Hello ([AB]) (?|(0)|(1) (\*\*\*))/i",$string,$match);
The (?|...|...) structure is a branch reset group: each alternative starts numbering its capturing groups from the same number. Without it, group 3 would be either 0 or empty, and groups 4 and 5 would be 1 and *** or empty. With it, group 3 is 0 or 1, and group 4 is either empty or ***.
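A short check of the group numbering, as a sketch only. The second input string is made up for illustration, and $m[4] ?? null is used because a group that did not take part in the match may be missing from the result array:

```php
<?php
$re = '/(.*?) Hello ([AB]) (?|(0)|(1) (\*\*\*))/i';

// Second branch: group 3 is "1" and group 4 is "***".
preg_match($re, 'anyWord Hello A 1 *** .', $a);

// First branch: group 3 is "0" and group 4 did not participate.
preg_match($re, 'anyWord Hello B 0 .', $b);

var_dump($a[3], $a[4]);
var_dump($b[3], $b[4] ?? null);
```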