Can someone tell me why this does not work? https://regex101.com/r/hJ5zN6/11
Test string:
[test][dzspgb_container][dzspgb_row][dzspgb_row_part part="1.4"][dzspgb_element text="whwaha" type_element="text"][/dzspgb_element][dzspgb_element text="test" type_element="text"][/dzspgb_element][/dzspgb_row_part][dzspgb_row_part part="1.4"][/dzspgb_row_part][dzspgb_row_part part="1.4"][/dzspgb_row_part][dzspgb_row_part part="1.4"][/dzspgb_row_part][/dzspgb_row][dzspgb_container]test second[/dzspgb_container][/dzspgb_container][/thisbreaks]
Test regex:
\[dzspgb_container(.*?)](.*?)\[\/dzspgb_container\](?!\s*\[\/)
If we remove [/thisbreaks] from the string, it will work.
It's because of the negative lookahead assertion at the end. I suggest removing that lookahead and using a greedy regex pattern like the one below.
\[dzspgb_container(.*?)](.*)\[\/dzspgb_container\]
DEMO
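(?!\s*\[\/) asserts that the match is not followed by zero or more whitespace characters and then the two characters [/. In your test string every closing [/dzspgb_container] is followed by another closing tag, so the assertion always fails.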
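To see the difference in practice, here is a minimal PHP sketch that runs both patterns. The helper name matches_container and the shortened subject string are assumptions made for illustration; they are not part of the original post.

```php
<?php
// Hypothetical helper: true when $pattern finds a container block in $subject.
function matches_container(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

// Shortened stand-in for the test string above (an assumption, for brevity).
$subject = '[test][dzspgb_container]a[dzspgb_container]b[/dzspgb_container][/dzspgb_container][/thisbreaks]';

// Original idea: lazy body plus a lookahead that rejects a following "[/".
$withLookahead = '~\[dzspgb_container(.*?)](.*?)\[/dzspgb_container\](?!\s*\[/)~';
// Suggested fix: greedy body, no lookahead.
$greedy = '~\[dzspgb_container(.*?)](.*)\[/dzspgb_container\]~';

var_dump(matches_container($withLookahead, $subject)); // bool(false): each closing tag is followed by "[/"
var_dump(matches_container($greedy, $subject));        // bool(true)
```

Removing [/thisbreaks] from $subject lets the lookahead version match again, because the last closing tag is then followed by the end of the string.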
I am trying to match any string that is not exactly one of a few given words.
My pattern:
(?!(^ignoreme$)|(^ignoreme2$))
I am looking for:
ignoreme - no
ignoreme2 - no
ignoremex - match
ignorem - match
gnoreme - match
ignoreme22 - match
But it returns many empty matches. How can I do this? Thanks.
https://regex101.com/r/u4EsNv/1
You may use this corrected regex:
^(?!ignoreme2?$).*$
Updated RegEx Demo
RegEx Details:
^: Start
(?!ignoreme2?$): Negative lookahead that fails the match when ignoreme or ignoreme2 is all that lies ahead up to the end.
.*: Match 0 or more of any character
$: End
Note that the regex (?!(^ignoreme$)|(^ignoreme2$)) still matches in the first 2 invalid cases because you have put ^ inside the negative lookahead expressions, not outside. With no anchor outside, the regex engine is free to start matching after the 1st character, where the lookahead assertion is satisfied. (You can see that in the regex101 highlighted matches.)
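As a quick check, the corrected pattern can be run over the six sample words from the question. This is only a sketch; the helper name is_allowed is an assumption for illustration.

```php
<?php
// Hypothetical helper: true when $word is anything other than "ignoreme" or "ignoreme2".
function is_allowed(string $word): bool
{
    // ^ sits outside the lookahead, so the check runs once, at the start of the string.
    return preg_match('/^(?!ignoreme2?$).*$/', $word) === 1;
}

// Sample words taken from the question.
foreach (['ignoreme', 'ignoreme2', 'ignoremex', 'ignorem', 'gnoreme', 'ignoreme22'] as $word) {
    echo $word, ' - ', is_allowed($word) ? 'match' : 'no', PHP_EOL;
}
```

The first two words print "no" and the other four print "match", which is the result asked for in the question.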
OK, this regex will match strings like 2aa, a2, 2aaaaaa, aaaa2, aaa2aaaa, 2222a2222-2222-aaaa... in short, a mix of alphanumeric characters in a sequence:
preg_match("/(?:\d+[a-z]|[a-z]+\d)[a-z\d]*/i")
Now I want to exclude something, but I'm stuck. Something like this doesn't work:
preg_match("/(?!1920x1200|1920x1080)(?:\d+[a-z]|[a-z]+\d)[a-z\d]*/i")
for example the string aaaaa222aaa1920x1200bbbbb1234556789 is still matched but it shouldn't because it contains 1920x1200
any help is appreciated :)
I'm using the regex found here for matching alphanumeric sequences: Regex: match only letters WITH numbers
regex test: https://regex101.com/r/vU9aU9/1
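Your negative lookahead should have .* in front to allow for 0 or more characters before the disallowed text. Also use anchors in your regex.
The regex below takes a related route: it uses the PCRE verbs (*SKIP)(*F) to throw away any line that contains 1920x1200, and only then looks for the alphanumeric sequence: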
preg_match('/^.*?1920x1200.*$(*SKIP)(*F)|(?:\d+[a-z]|[a-z]+\d)[a-z\d]*/im')
RegEx Demo
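The first suggestion in this answer (a negative lookahead with .* in front, plus an anchor) can also be written out directly. The sketch below is one possible reading of that advice, not the (*SKIP)(*F) pattern shown above; the helper name first_alnum_run is an assumption, and it blocks both resolutions mentioned in the question.

```php
<?php
// Hypothetical helper: returns the first letter/digit run, or null when the
// subject contains a blocked resolution anywhere.
function first_alnum_run(string $subject): ?string
{
    // ^(?!.*1920x(?:1200|1080)) : checked once at the start; the .* lets the
    //                             blocked text sit anywhere in the subject.
    // .*?                       : skip lazily to the first alphanumeric run.
    $pattern = '/^(?!.*1920x(?:1200|1080)).*?((?:\d+[a-z]|[a-z]+\d)[a-z\d]*)/i';

    return preg_match($pattern, $subject, $m) === 1 ? $m[1] : null;
}

var_dump(first_alnum_run('aaaaa222aaa1920x1200bbbbb1234556789')); // NULL
var_dump(first_alnum_run('abc 2aa def'));                         // string(3) "2aa"
```

Because ^ anchors the lookahead at the start, the engine cannot step past the blocked text and match further along, which is what happened with the unanchored attempt in the question.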
I'm trying to capture words in a string like:
1vTvFpU
KOoy6Cc
With regex pattern:
\b(?=(?:.*?[a-z]){1,})[A-Za-z0-9\/\-_.]{7,7}\b
But I have a problem because it also matches words like:
FDSFDFI
WEWEFDP
RRRRRRR
In a string:
FDSFDFI sdfdfdf
WEWEFDP traliii
RRRRRRR sdfdfdf
What am I doing wrong?
I suggest using \S* instead of .* inside the lookahead. With .*? inside the lookahead, it checks for at least one lower-case letter anywhere in the rest of the line, not just within the word.
\b(?=(?:\S*?[a-z]))[A-Za-z0-9\/\-_.]{7}\b
{7,7} is equal to {7}
DEMO
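A small PHP sketch of the \S*? version, run against the sample lines from the question. The helper name seven_char_tokens is an assumption for illustration.

```php
<?php
// Hypothetical helper: 7-character tokens that contain at least one lower-case letter.
function seven_char_tokens(string $text): array
{
    // \S*? keeps the lookahead inside the current run of non-space characters;
    // .*? would look as far as the end of the line.
    preg_match_all('~\b(?=\S*?[a-z])[A-Za-z0-9/\-_.]{7}\b~', $text, $m);

    return $m[0];
}

print_r(seven_char_tokens("FDSFDFI sdfdfdf\n1vTvFpU KOoy6Cc"));
// FDSFDFI is skipped; sdfdfdf, 1vTvFpU and KOoy6Cc are returned.
```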
No need to use a lookahead to do that, character classes suffice:
[^\Wa-z]*+\w+
Then check the string length in PHP (for example with array_filter).
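One way this could look, as a sketch: the helper name seven_char_words is an assumption, and note that \w only covers letters, digits and underscore, so the /, - and . characters allowed in the question's character class are not handled here.

```php
<?php
// Hypothetical helper: words of exactly 7 characters with at least one lower-case letter.
function seven_char_words(string $text): array
{
    // [^\Wa-z]*+ possessively eats upper-case letters, digits and underscores,
    // so the following \w+ can only begin on a lower-case letter.
    preg_match_all('/[^\Wa-z]*+\w+/', $text, $m);

    // The length check is done in PHP rather than in the regex.
    return array_values(array_filter($m[0], function (string $word): bool {
        return strlen($word) === 7;
    }));
}

print_r(seven_char_words("FDSFDFI sdfdfdf\n1vTvFpU KOoy6Cc ab1"));
// FDSFDFI never matches; ab1 matches but is dropped by the length filter.
```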
I want to write a regex with assertions to extract the number 55 from string unknownstring/55.1, here is my regex
$str = 'unknownstring/55.1';
preg_match('/(?<=\/)\d+(?=\.1)$/', $str, $match);
So basically I am trying to say: give me the number that comes after a slash and is followed by a dot and the number 1, with no characters after that. But the regex does not match. When I removed the $ sign from the end, it matched. That condition is essential, though: I need the match to be at the end of the string, because the unknownstring part can contain similar text, e.g. unknow/545.1nstring/55.1. Perhaps I could use preg_match_all and take the last match, but I want to understand why the first regex does not work. Where is my mistake?
Thanks
Use anchor $ inside lookahead:
(?<=\/)\d+(?=\.1$)
RegEx Demo
You cannot use $ outside the positive lookahead because a lookahead consumes no characters: after \d+ the engine is still right after the number, which is NOT at the end of input, since .1 follows it.
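A short PHP sketch of the corrected pattern, tried on both strings from the question. The helper name trailing_number is an assumption for illustration.

```php
<?php
// Hypothetical helper: the number after a slash that is followed by ".1" at the very end.
function trailing_number(string $str): ?string
{
    // $ sits inside the lookahead: ".1" must be the last thing in the string,
    // but it is only checked, never consumed.
    return preg_match('~(?<=/)\d+(?=\.1$)~', $str, $m) === 1 ? $m[0] : null;
}

var_dump(trailing_number('unknownstring/55.1'));       // string(2) "55"
var_dump(trailing_number('unknow/545.1nstring/55.1')); // string(2) "55"
```

In the second call 545 is rejected because its ".1" is followed by more text, so only the final 55 qualifies.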
I'm using capturing groups in regular expressions for the first time and I'm wondering what my problem is, as I assume that the regex engine looks through the string left-to-right.
I'm trying to convert an UpperCamelCase string into a hyphened-lowercase-string, so for example:
HelloWorldThisIsATest => hello-world-this-is-a-test
My precondition is an alphabetic string, so I don't need to worry about numbers or other characters. Here is what I tried:
mb_strtolower(preg_replace('/([A-Za-z])([A-Z])/', '$1-$2', "HelloWorldThisIsATest"));
The result:
hello-world-this-is-atest
This is almost what I want, except there should be a hyphen between a and test. I've already included A-Z in my first capturing group so I would assume that the engine sees AT and hyphenates that.
What am I doing wrong?
The Reason your Regex will Not Work: Overlapping Matches
Your regex matches sA in IsATest, allowing you to insert a - between the s and the A
In order to insert a - between the A and the T, the regex would have to match AT.
This is impossible because the A has already been consumed as part of sA. Matches found in a single pass cannot overlap.
Is all hope lost? No! This is a perfect situation for lookarounds.
Do it in Two Easy Lines
Here's the easy way to do it with regex:
$regex = '~(?<=[a-zA-Z])(?=[A-Z])~';
echo strtolower(preg_replace($regex,"-","HelloWorldThisIsATest"));
See the output at the bottom of the php demo:
Output: hello-world-this-is-a-test
Will add explanation in a moment. :)
The regex doesn't match any characters. Rather, it targets positions in the string: the positions just before each upper-case letter that follows another letter. To do so, it uses a lookbehind and a lookahead:
The (?<=[a-zA-Z]) lookbehind asserts that what precedes the current position is a letter
The (?=[A-Z]) lookahead asserts that what follows the current position is an upper-case letter.
We just replace these positions with a -, and convert the lot to lowercase.
If you look carefully at this regex101 screen, you can see lines between the words, where the regex matches.
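Wrapped in a function, the same idea might look like the sketch below. The name camel_to_hyphen is an assumption for illustration.

```php
<?php
// Hypothetical helper wrapping the zero-width replacement described above.
function camel_to_hyphen(string $name): string
{
    // Matches the empty position between any letter and a following upper-case letter.
    $regex = '~(?<=[a-zA-Z])(?=[A-Z])~';

    // Insert "-" at each such position, then lower-case the result.
    return strtolower(preg_replace($regex, '-', $name));
}

echo camel_to_hyphen('HelloWorldThisIsATest'), PHP_EOL; // hello-world-this-is-a-test
echo camel_to_hyphen('HTMLTest'), PHP_EOL;              // h-t-m-l-test
```

The second call shows a side effect of this pattern: a run of capitals such as HTML is split letter by letter.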
Reference
Lookahead and Lookbehind Zero-Length Assertions
Mastering Lookahead and Lookbehind
I've separated the two regular expressions for simplicity:
preg_replace(array('/([a-z])([A-Z])/', '/([A-Z]+)([A-Z])/'), '$1-$2', $string);
It processes the string twice to find:
lowercase -> uppercase boundaries
multiple uppercase letters followed by another uppercase letter
This will have the following behaviour:
ThisIsHTMLTest -> This-Is-HTML-Test
ThisIsATest -> This-Is-A-Test
Alternatively, use a look-ahead assertion (this has the effect that the capital letter checked at the end of one match can be reused at the start of the next):
preg_replace('/([A-Z]+|[a-z]+)(?=[A-Z])/', '$1-', $string);
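Both variants from this answer can be compared side by side. The function names hyphenate_two_pass and hyphenate_lookahead are assumptions for illustration; the patterns are the ones given above.

```php
<?php
// Hypothetical wrapper around the two-pattern version.
function hyphenate_two_pass(string $s): string
{
    // Pass 1: lower -> upper boundaries. Pass 2: the last capital of an upper-case run.
    return preg_replace(['/([a-z])([A-Z])/', '/([A-Z]+)([A-Z])/'], '$1-$2', $s);
}

// Hypothetical wrapper around the look-ahead version.
function hyphenate_lookahead(string $s): string
{
    // The capital in the look-ahead is not consumed, so the next match can start on it.
    return preg_replace('/([A-Z]+|[a-z]+)(?=[A-Z])/', '$1-', $s);
}

foreach (['ThisIsHTMLTest', 'ThisIsATest'] as $input) {
    echo hyphenate_two_pass($input), ' / ', hyphenate_lookahead($input), PHP_EOL;
}
```

For these two inputs both functions give the same result, This-Is-HTML-Test and This-Is-A-Test.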
To fix the interesting use case Jack mentioned in your comments (avoid splitting of abbreviations), I went with zx81's route of using lookahead and lookbehinds.
(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])
You can split it in two for the explanation:
First part
(?<= look behind to see if there is:
[a-z] any character of: 'a' to 'z'
) end of look-behind
(?= look ahead to see if there is:
[A-Z] any character of: 'A' to 'Z'
) end of look-ahead
(TL;DR: Match between strings of the CamelCase Pattern.)
Second part
(?<= look behind to see if there is:
[A-Z] any character of: 'A' to 'Z'
) end of look-behind
(?= look ahead to see if there is:
[A-Z] any character of: 'A' to 'Z'
[a-z] any character of: 'a' to 'z'
) end of look-ahead
(TL;DR: Special case, match between abbreviation and CamelCase pattern)
So your code would then be:
mb_strtolower(preg_replace('/(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])/', '-', "HelloWorldThisIsATest"));
Demo of matches
Demo of code
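For completeness, a sketch that wraps the final pattern in a function and tries it on an input with an abbreviation. The name camel_to_hyphen_keep_abbr is an assumption; strtolower is used here because the input is assumed to be plain ASCII letters, while the answer above uses mb_strtolower.

```php
<?php
// Hypothetical helper: hyphenate CamelCase while keeping abbreviations together.
function camel_to_hyphen_keep_abbr(string $name): string
{
    // First alternative: lower-case letter followed by a capital.
    // Second alternative: capital followed by "capital + lower-case", i.e. the
    // boundary between an abbreviation and the next word.
    $regex = '/(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])/';

    return strtolower(preg_replace($regex, '-', $name));
}

echo camel_to_hyphen_keep_abbr('HelloWorldThisIsATest'), PHP_EOL; // hello-world-this-is-a-test
echo camel_to_hyphen_keep_abbr('ThisIsHTMLTest'), PHP_EOL;        // this-is-html-test
```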