I have been struggling with this regex problem for two hours and still can't solve it.
Text: what2 when2 not3 not 2 not 2018 not2not
Expected: what-what when-when not3 not 2 not 2018 not2not
I want to match every word that consists of letters followed by the digit 2 at the end of the word, and then replace [text]2 with [text]-[text].
Here is my script:
$str = 'what2 when2 not3 not 2 not 2018 not2not';
echo preg_replace('/[a-z]+2/i', "$0-$0", $str);
//result: what2-what2 when2-when2 not3 not 2 not 2018 not2-not2not
//expected: what-what when-when not3 not 2 not 2018 not2not
My mistakes are:
My regex still matches not2not, which shouldn't be included.
I can't remove the 2 from the matched text ($0). I tried $1 and $2 but still couldn't solve the problem.
Did I miss anything? I'm very bad at regex, but I always want to learn it.
Thanks for any advice.
Change your preg_replace function to:
echo preg_replace('/([a-z]+)2/i', "$1-$1", $str);
$0 refers to the entire match. If you want just the word without the trailing 2, put capturing parentheses around it, as in /([a-z]+)2/i, and then use $1 to reference that capture, in other words the word without the 2 at the end. This returns:
what-what when-when not3 not 2 not 2018 not-notnot
The final not-notnot appears because you're not checking for a space or the end of the string, so the pattern matches the not2 in not2not. To fix that, check for a word boundary after the 2 by changing the pattern to /([a-z]+)2\b/i. The \b matches at a position between a word character and a non-word character (such as whitespace) and also at the end of the string, so strings like 'what2 yes2' are handled correctly.
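As a minimal sketch, the word-boundary version can be run against the string from the question. The variable name $result is only for illustration:

```php
<?php
$str = 'what2 when2 not3 not 2 not 2018 not2not';

// ([a-z]+) captures the letters, 2 is the literal digit,
// and \b requires a word boundary right after the 2.
$result = preg_replace('/([a-z]+)2\b/i', '$1-$1', $str);

echo $result, "\n";
// what-what when-when not3 not 2 not 2018 not2not
```

The replacement is written in single quotes here so that PHP never tries to interpolate $1 as a variable.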
I suggest this solution:
$str = 'what2 when2 not3 not 2 not 2018 not2not';
echo preg_replace('/([a-z]+)2\s/i', "$1-$1 ", $str);
// OUTPUT: what-what when-when not3 not 2 not 2018 not2not
//expected: what-what when-when not3 not 2 not 2018 not2not
The \s makes the pattern match only a 2 at the end of a word, not a 2 in the middle of one.
Without it, the last replacement (not2not) would be wrong. Because the \s consumes the whitespace, you have to add the space back in the replacement ("$1-$1 ").
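One thing worth checking with the \s approach is a word at the very end of the string, where no whitespace follows. The sketch below uses a made-up input ('yes2' is added for illustration) and compares it with a lookahead variant, (?=\s|$), which is my own suggestion rather than part of the answer above:

```php
<?php
// Hypothetical input: the last word ends in 2 and nothing follows it.
$str = 'what2 when2 not2not yes2';

// Consuming \s: the trailing "yes2" has no whitespace after it, so it is left alone.
$withSpace = preg_replace('/([a-z]+)2\s/i', '$1-$1 ', $str);

// Lookahead: asserts whitespace or end of string without consuming it,
// so the replacement does not need to add the space back.
$withLookahead = preg_replace('/([a-z]+)2(?=\s|$)/i', '$1-$1', $str);

echo $withSpace, "\n";     // what-what when-when not2not yes2
echo $withLookahead, "\n"; // what-what when-when not2not yes-yes
```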
I use this to check whether the string contains a 06 number. A 06 number is always 10 digits long.
$string = "This is Henk 0612345678";
$number = preg_replace('/[^0-9.]+/', '', $string);
echo $number;
This works well, but when the string is
$string = "This is 12Henk 0612345678";
the number is 120612345678.
I don't want the 12 in it, and the 12 is not always the same in the string.
How can I check only for a 10-digit number?
Thanks
This could help you
/([ \w]+)(06[0-9.+]{8})/
The first () captures the entire string before the 06, and the second () captures the number: 06 followed by 8 more characters from the class [0-9.+].
The solution does not cover the case where a 06 comes before the number sequence
Rather than replacing everything that's not what you want, try searching for what you do want with preg_match.
That makes it a lot easier to be specific:
The number always starts 06, so you can hard-code that in your regex
That's followed by exactly 8 more digits, which you can specify as [0-9]{8} or \d{8} ("d" for "digit")
To avoid matching longer numbers, you can surround that with \b, which matches a word boundary
Put it together, and you get:
preg_match('/\b06\d{8}\b/', $string, $matched_parts);
This doesn't change $string, but gives you an array in $matched_parts containing the matched parts; see the preg_match documentation for a detailed explanation.
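Here is a small sketch of that call applied to both strings from the question, plus one string without a number. The array names $strings and $found are just for illustration:

```php
<?php
$strings = [
    'This is Henk 0612345678',
    'This is 12Henk 0612345678',
    'No number here',
];

$found = [];
foreach ($strings as $string) {
    // preg_match returns 1 on a match; $matched_parts[0] holds the full match.
    if (preg_match('/\b06\d{8}\b/', $string, $matched_parts) === 1) {
        $found[] = $matched_parts[0];
    } else {
        $found[] = null;
    }
}

var_dump($found);
// The first two entries are "0612345678", the third is NULL.
```

The leading 12 in 12Henk is ignored because the pattern only accepts a run that starts with 06 and is exactly 10 digits long between word boundaries.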
I have a question regarding one case.
I'm using php preg_replace() function to format data from this string:
12:05 Place1 12:40 14:00 16:30 Place2 "Test" 29 January
I need the output to look like this:
<li>29 January - Place1 - 12:05</li>
<li>29 January - Place2 "Test" - 12:40 14:00 16:30</li>
My regular expression:
/(\d[0-9]:\d[0-9]).+?(\D+).+?(\d[0-9]\s(January|February|March))/
I'm currently using something like this:
$text = '12:05 Place1 12:40 14:00 16:30 Place2 "Test" 29 January';
$data = preg_replace("/(\d[0-9]:\d[0-9]).+?(\D+).+?(\d[0-9]\s(January|February|March))/", "<li>$3 - $2 - $1</li>", $text);
echo $data;
The problem is that it shows only the first match:
29 January - Place1 - 12:05
Does anyone know how to solve this?
Thanks :)
This is not possible with a single regex, because you want to use a match from the end of the string before the PCRE engine has reached it. You need to do it in two steps:
1. Use preg_match() with PREG_OFFSET_CAPTURE to capture the date from the end of the string into another variable. Use the offset it provides to truncate the original string.
2. Run preg_match_all() on the truncated string to get the times and places as an array, then iterate over the array, combining each match with the date variable to build the list. '/((?:\s*+\d\d:\d\d)++)\s++((?:.(?!\d\d:\d\d\s))++)\s*+/' is a suitable pattern.
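The two steps could be sketched roughly as follows. The date pattern, the variable names, and the assumption that the date is spelled "29 January" and sits at the very end of the string are mine; only the second pattern comes from the answer:

```php
<?php
$text = '12:05 Place1 12:40 14:00 16:30 Place2 "Test" 29 January';

$items = [];

// Step 1: find the trailing date and its byte offset.
// With PREG_OFFSET_CAPTURE, $m[0] is [matched text, offset].
if (preg_match('/\d{1,2}\s(?:January|February|March)\s*$/', $text, $m, PREG_OFFSET_CAPTURE)) {
    $date = trim($m[0][0]);
    $schedule = substr($text, 0, $m[0][1]); // everything before the date

    // Step 2: group 1 is a run of times, group 2 is the place that follows them.
    preg_match_all(
        '/((?:\s*+\d\d:\d\d)++)\s++((?:.(?!\d\d:\d\d\s))++)\s*+/',
        $schedule,
        $rows,
        PREG_SET_ORDER
    );

    foreach ($rows as $row) {
        // trim() drops the whitespace that can trail the last place name.
        $items[] = sprintf('<li>%s - %s - %s</li>', $date, trim($row[2]), trim($row[1]));
    }
}

echo implode("\n", $items), "\n";
// <li>29 January - Place1 - 12:05</li>
// <li>29 January - Place2 "Test" - 12:40 14:00 16:30</li>
```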
I need a PHP regex that matches text that is not preceded by the word "Total" or "maximum" (case-insensitive) in the text below.
[1]
[1m]
[1mk][1mks]
[1mark]
[1marks]
(1mk)
12mk
12 mark
13 mark
[Total: 15]
Total: 16 mark
Total 1 mark
Total 12 mark
Total: 9 mark
Total: 10 mark
[Total: 11 marks] Total 6 mark
maximum 5 marks
maximum:5 marks
Note: This text is all on one long line.
The regex should match the following
[1]
[1m]
[1mk][1mks]
[1mark]
[1marks]
(1mk)
12mk
12 mark
13 mark
I have tried this one, but it's not working:
/(?<!Total\:\s|Total\s|maximum\s|maximum\:\s)[\[|\(]?([0-9]{1,2})(\s|(?=marks|mark|mks|mk|m|\]))?(\]|marks|mark|mks|mk|m)[\]|\)]?/i
EDIT
https://www.debuggex.com/r/yNNN_B3iQmGyYWoz
EDIT2
e.g. '12 mark' should be returned only if it is not part of "Total[:]\s+12 mark" or "maximum[:]\s+12 mark"
Try this: (?:\[?\b(?:Total|maximum):?\s?\d+\s?[^ ]+(*SKIP)(*FAIL))|(\d++\s?[^ )\]]*)
(Use ignore case.)
Explanation
Part 1
(?:\[? Non capturing group that may have a [
\b Boundary
(?:Total|maximum) non capturing group matching either literal
:?\s?\d+\s? Maybe a : maybe a space, some digits, maybe another space.
[^ ]+ A bunch of non spaces.
(*SKIP)(*FAIL))| Plot twist: Anything matching Part 1 FAILS
Part 2
This is captured, for real.
\d++\s? digits, maybe followed by a space.
[^ )\]]* And maybe stuff that's not a space, ), or ].
The PHP should look something like this:
preg_match_all(
'/(?:\[?\b(?:Total|maximum):?\s?\d+\s?[^ ]+(*SKIP)(*FAIL))|(\d++\s?[^ )\]]*)/i',
"YOUR STRING",
$matches
);
print_r($matches[0]);
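To see the (*SKIP)(*FAIL) behaviour on concrete data, here is a minimal sketch. The subject is a shortened, made-up sample in the style of the question's one-line input, not the full text:

```php
<?php
// Hypothetical shortened sample of the one-line input.
$subject = '[1m] 12 mark Total: 16 mark maximum 5 marks (1mk)';

$pattern = '/(?:\[?\b(?:Total|maximum):?\s?\d+\s?[^ ]+(*SKIP)(*FAIL))|(\d++\s?[^ )\]]*)/i';

preg_match_all($pattern, $subject, $matches);

print_r($matches[0]);
// Contains: 1m, 12 mark, 1mk
```

Note that the matches start at the digits, so surrounding brackets and parentheses such as [ ] and ( ) are not part of the returned text.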
Actually, I would go for a two-step solution. First, clean up the unwanted words by replacing them using this regexp:
(Total:?\s?|maximum:?\s?)
After that, matching the content you really need is easy:
\[?\(?([0-9]{1,2}\s?marks?|[0-9]{1,2}\s?mk?s?)\)?\]?
No idea how to use debuggex.com but I tested all regular expressions in pspad so it definitely works.
I want to check whether a string contains a letter or digit repeated three or more times in a row and replace each such run with a single letter or digit. For example:
IIIII havvvvve a bigggg tesssssttttt tomorrow soooo iiii 2222551111 haveeee to do this rightttttt
To became like this
I have a big test tomorrow so i 2551 have to do this right.
How can this be done with preg_replace ?
Regex:
([A-Za-z0-9])\1\1+
This matches a letter or digit repeated three or more times in a row and captures the first occurrence. The whole match is then replaced with the character in group 1.
Replacement string:
\1
DEMO
<?php
$text = 'IIIII havvvvve a bigggg tesssssttttt tomorrow soooo iiii 2222551111 haveeee to do this rightttttt';
$pattern = '~([A-Za-z0-9])\1\1+~';
echo preg_replace($pattern,'\1',$text);
?>
Output:
I have a big test tomorrow so i 2551 have to do this right
([A-Za-z0-9])(\1{2,})?
Try this and replace with $1.
See the demo:
http://regex101.com/r/sA7pZ0/27
I am trying to use a regexp to find the link that appears just before the textABCXYZ123 string in the HTML below.
lorem ispum...<strong>FIRSTlink </strong><br>
1 points| Saved Jan 08, 2014 at 00:49 <span class=notes_box>ANOTHERLINK</span>.
... more text........... more text........
... more text.......<strong>other link </strong><br>
1 points| Saved Jan 08, 2014 at 00:49 <span class=notes_box>ANOTHERLINK</span>.
... more text........... more text........
<strong>somewhere to go </strong><br>
1 points| Saved Jan 08, 2014 at 00:49 <span class=notes_box>textABCXYZ123</span>
...
... more text..........<strong>other link </strong><br>
1 points| Saved Jan 08, 2014 at 00:49 <span class=notes_box>ANOTHERLINK</span>.
... more text........... more text........
There are many links, and I need to capture the one that appears just before the textABCXYZ123 string. I tried the regex below, but it returns the first link instead of the one I want:
$find_string = 'ABCXYZ123';
preg_match('#href="(.*)".*text'.$find_string.'#sU',$html,$match);
// so the final result is "http://www.site.com/link/123", which is the first link
Can someone show me how to capture the link just before my string textABCXYZ123? P.S. I know about XPath and Simple HTML DOM, but I would like to match with a regexp. Thanks for any input.
You could maybe try the regex:
href="([^"]*)">(?=(?:(?!href).)*textABCXYZ123)
Like so?
$find_string = 'ABCXYZ123';
preg_match('~href="([^"]*)">(?=(?:(?!href).)*text'.$find_string.')~sU',$html,$match);
The first part is href="([^"]*)"> and shouldn't be too hard to understand. It matches href=" and then any number of non-quote characters, followed by quotes and >.
(?=(?:(?!href).)*textABCXYZ123) is a positive lookahead. (A positive lookahead has the format (?= ... ).) It only allows a match if the text ahead matches what is inside it, without consuming that text.
For instance, a(?=.*b) matches an a only if it is followed by any number of characters and then a b (in other words, it matches an a as long as there is a b somewhere after it).
So, href="([^"]*)"> will match only if there is (?:(?!href).)*textABCXYZ123 somewhere ahead.
(?:(?!href).)* is a modified .*, because the negative lookahead (format (?! ... )) makes sure no href is matched. You could say it's the opposite of a positive lookahead:
a(?!.*b) matches any a as long as it is not followed by a b.
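A short sketch of the lookahead approach on a cut-down sample is below. The anchor tags and the URLs are hypothetical (the snippet in the question does not show them), and I use only the s modifier plus preg_quote() here rather than the sU flags from the answer:

```php
<?php
// Hypothetical HTML: three links, the second one precedes the target text.
$html = implode("\n", [
    '<strong><a href="http://www.site.com/link/123">FIRSTlink</a></strong><br>',
    '<span class=notes_box>ANOTHERLINK</span>',
    '<strong><a href="http://www.site.com/link/456">somewhere to go</a></strong><br>',
    '<span class=notes_box>textABCXYZ123</span>',
    '<strong><a href="http://www.site.com/link/789">other link</a></strong><br>',
    '<span class=notes_box>ANOTHERLINK</span>',
]);

$find_string = 'ABCXYZ123';

// (?:(?!href).)* walks forward one character at a time but refuses to step
// onto another "href", so the target text must come before the next link.
$pattern = '~href="([^"]*)">(?=(?:(?!href).)*text' . preg_quote($find_string, '~') . ')~s';

$link = preg_match($pattern, $html, $match) === 1 ? $match[1] : null;

echo $link, "\n";
// http://www.site.com/link/456
```

The s modifier lets the dot match newlines, which matters because the link and the target text are on different lines.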
(?s)href=[^<]+</a>(?!.*(href).*(textABCXYZ123))(?=.*(textABCXYZ123))
You could also try this. Let me know if you want an explanation.