php regexp: can't exclude one element - php

I am trying to set up a fairly complex regexp, but there is one element I can't get onto the no-match list.
My regular expression is:
1234567-8_abc((?!_ABC|_DEFGHI)[\w]?)*(\.ios|\.and)
What I have to exclude is:
1234567-8_abc.ios
1234567-8_abc_DEFGHI.ios
1234567-8_abc_ABC.ios
Instead, what I have to include is:
1234567-8_abc_1UP.ios
1234567-8_abc_FI.ios
1234567-8_abc_gmg.ios
1234567-8_abc_1UP.and
1234567-8_abc_FI.and
1234567-8_abc_gmg.and
1234567-8_abc_ddd.and
1234567-8_abc_qwert.ios
1234567-8_abc_88.ios
Well, I can't exclude the first option (1234567-8_abc.ios).
I tried it here.
How can I achieve this?
Thank you!

You can use this pattern:
1234567-8_abc_[^_.]++(?<!_ABC|_DEFGHI)\.(?:ios|and)
Note: I assume that each substring between _ and .ios doesn't contain a dot or an underscore.
The possessive quantifier ++ makes the pattern fail faster, with as few backtracking steps as possible.
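To check the pattern against the file names from the question, something like the following sketch could be used. The ~ delimiters, the ^...$ anchors and the helper name matchesName are my own additions for the test, not part of the answer.

```php
<?php
// Pattern from the answer, wrapped in ~ delimiters and anchored (both assumptions).
$pattern = '~^1234567-8_abc_[^_.]++(?<!_ABC|_DEFGHI)\.(?:ios|and)$~';

$shouldMatch = [
    '1234567-8_abc_1UP.ios',
    '1234567-8_abc_FI.and',
    '1234567-8_abc_gmg.ios',
    '1234567-8_abc_qwert.ios',
    '1234567-8_abc_88.ios',
];
$shouldFail = [
    '1234567-8_abc.ios',
    '1234567-8_abc_DEFGHI.ios',
    '1234567-8_abc_ABC.ios',
];

// Hypothetical helper: true when the whole name satisfies the pattern.
function matchesName(string $pattern, string $name): bool
{
    return preg_match($pattern, $name) === 1;
}

foreach (array_merge($shouldMatch, $shouldFail) as $name) {
    echo $name, ' => ', matchesName($pattern, $name) ? 'match' : 'no match', PHP_EOL;
}
```

The lookbehind runs only after [^_.]++ has consumed the whole segment, so it sees the complete _ABC or _DEFGHI suffix right before the dot.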

This regex matches your examples in PHP:
1234567-8_abc_((?!ABC|DEFGHI)[\w]?)*(\.ios|\.and)

Add a negative lookahead like below,
1234567-8_abc(?!_ABC|_DEFGHI)\w+(\.ios|\.and)
(?!_ABC|_DEFGHI) is a negative lookahead asserting that the string following _abc is not _ABC or _DEFGHI. \w+ then requires one or more word characters before .ios or .and, so the pattern won't match the string 1234567-8_abc.ios.

1234567-8_abc(?:(?!_ABC|_DEFGHI)\w)+(\.ios|\.and)
Try this. Your regex left \w after 1234567-8_abc optional; I just made it compulsory. See demo.
http://regex101.com/r/bB8jY7/1
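The difference between the optional and the compulsory \w can be seen in a small sketch like this one. The ~ delimiters and ^...$ anchors are assumptions added so the patterns can be run with preg_match.

```php
<?php
// Pattern from the question: [\w]? lets the group match an empty string.
$original = '~^1234567-8_abc((?!_ABC|_DEFGHI)[\w]?)*(\.ios|\.and)$~';
// Pattern from the answer above: at least one \w is required after _abc.
$fixed = '~^1234567-8_abc(?:(?!_ABC|_DEFGHI)\w)+(\.ios|\.and)$~';

$name = '1234567-8_abc.ios';

$originalHit = preg_match($original, $name) === 1;
$fixedHit = preg_match($fixed, $name) === 1;

var_dump($originalHit); // the optional version accepts the bare name
var_dump($fixedHit);    // the compulsory version rejects it
```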

Related

Non greedy match does not work

I want to implement a non-greedy match using the .*? pattern. However, I came across a sample string which shows that the non-greedy match does not work. Here is the code and the sample string:
preg_match_all('/\<w:t.*?\>\<w:p\>/', '<w:t xml:space="preserve"></w:t></w:r><w:r><w:rPr><w:b/></w:rPr><w:t xml:space="preserve">Text 1 </w:t></w:r><w:r><w:rPr><w:b/><w:u w:val="single"/><w:color w:val="ff0000"/></w:rPr><w:t xml:space="preserve"></w:t></w:r><w:r><w:rPr><w:b/><w:u w:val="single"/><w:color w:val="ff0000"/><w:i/></w:rPr><w:t xml:space="preserve">Text 2</w:t></w:r><w:r><w:t xml:space="preserve"></w:t></w:r><w:r><w:t xml:space="preserve"></w:t></w:r><w:r><w:t xml:space="preserve"></w:t></w:r></w:p></w:t></w:r></w:p><w:p w:rsidRDefault="004D3323" w:rsidP="003F03B1"><w:r><w:t><w:p>', $match);
But if I print_r the $match variable, I see that this pattern matches the whole string. What I want is to match only strings such as:
"<w:t><w:p>" and "<w:t any text may go here><w:p>"
So, what did I do wrong and how can I fix it? Thanks!
Use this regex instead:
<w:t[^>]*><w:p>
[^>]* allows all characters except >
see https://regex101.com/r/nuMzTk/1
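The reason .*? appears not to work is that a lazy quantifier only controls how far a match extends, not where it starts: the engine still begins at the leftmost <w:t and grows until ><w:p> is found. A rough sketch with a shortened sample string (the string is made up for illustration):

```php
<?php
// Shortened, hypothetical sample: only the last <w:t> is directly followed by <w:p>.
$xml = '<w:t xml:space="preserve">Text 1</w:t><w:r><w:t><w:p>';

// Lazy .*? starts at the first <w:t and expands until ><w:p> appears.
preg_match_all('/<w:t.*?><w:p>/', $xml, $lazy);

// [^>]* cannot cross a >, so the match has to stay inside a single tag.
preg_match_all('/<w:t[^>]*><w:p>/', $xml, $negated);

print_r($lazy[0]);    // one match spanning the whole sample string
print_r($negated[0]); // one match: <w:t><w:p>
```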

Regular expression R22.5 and R22

I need a little help!
I have the string "R22.5 and R22",
and I need to find only the standalone R22.
I tried "\bR\d{2,2}\b", but it does not help, because this regular expression returns two matches (R22 and R22).
How can I make the regular expression match only R22, without the fractional part (.xx)?
Thanks!
You can use a negative lookahead assertion to check if a dot and a digit don't follow:
\bR\d{2}\b(?!\.\d)
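A quick way to see the effect is to compare match offsets with and without the lookahead. This is only a sketch; the variable names are mine.

```php
<?php
$text = 'R22.5 and R22';

// Without the lookahead both occurrences match, because \b also sits between "22" and ".".
preg_match_all('/\bR\d{2}\b/', $text, $plain, PREG_OFFSET_CAPTURE);

// With (?!\.\d) the first R22 is rejected, since ".5" follows it.
preg_match_all('/\bR\d{2}\b(?!\.\d)/', $text, $strict, PREG_OFFSET_CAPTURE);

// Each entry is [matched text, offset]; keep only the offsets.
$plainOffsets = array_column($plain[0], 1);
$strictOffsets = array_column($strict[0], 1);

print_r($plainOffsets);  // offsets 0 and 10
print_r($strictOffsets); // offset 10 only
```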

regex using preg_match on a delimiter-based string

I have a string that looks like this:
Text58||INPUT 6~~Text67||INPUT 7~~Text68||INPUT 8~~CR_Exp_Date||INPUT 9~~Text60||INPUT 10~~Text63||INPUT 14~~Combo_Box65||Ship~~Text66||INPUT 15~~First_Name||INPUT 18~~Middle_Name||INPUT 19~~Last_Name||INPUT 20~~Suffix||INPUT 21~~Country||INPUT 22~~Mailing_Address||INPUT 23~~City||INPUT 24~~State||INPUT 25~~Zip_Code||INPUT 26~~
I am trying to extract First_Name||INPUT 18.
I tried (?=First_Name[||]).*?(?<=[~~][$])
but couldn't come up with anything else. Does anyone know what I am doing wrong?
Try this:
First_Name\|\|.*?(?=~~)
First_Name should not be in a lookahead, you want to include it in the match.
To match a pair of |, you should include them in the regexp, with \ to escape them.
~~ should be in a positive lookahead, not lookbehind. And they don't need to be in brackets.
If you don't want to include First_Name|| in the match (why did you say you did in the question?), you can use a positive lookbehind:
(?<=First_Name\|\|).*?(?=~~)
It seems like you got lookbehind and lookahead backwards in your attempt.
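Both variants can be tried with preg_match, roughly like this. The input is a shortened copy of the string from the question, and the variable names are placeholders.

```php
<?php
// Shortened version of the string from the question.
$data = 'Text66||INPUT 15~~First_Name||INPUT 18~~Middle_Name||INPUT 19~~';

// Key included in the match; the ~~ terminator is only looked at, not consumed.
preg_match('/First_Name\|\|.*?(?=~~)/', $data, $withKey);

// Key moved into a lookbehind, so only the value is returned.
preg_match('/(?<=First_Name\|\|).*?(?=~~)/', $data, $valueOnly);

echo $withKey[0], PHP_EOL;   // First_Name||INPUT 18
echo $valueOnly[0], PHP_EOL; // INPUT 18
```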

using preg_match_all to find patterns, don't include pattern delimiter in matches

I'm matching patterns with a regex as in
$Structure = 'C:N:X:A:V:T:J:N:G:T:N:N:C:J:N:C:A:J:N:.:';
preg_match_all('/(T:|G:|L:|D:).*?(G:|i:|X:|\.:)/', $Structure, $arr, PREG_SET_ORDER);
the results I get are
T:J:N:G: , T:N:N:C:J:N:C:A:J:N:.:
How can I modify the pattern so that the delimiter (G:|i:|X:|.:) of the match is not included in the result, but will be used in the next search? In other words, make the result look as below:
T:J:N: , G:T:N:N:C:J:N:C:A:J:N:
instead?
Is this possible?
Thanks
Yes, instead of making your 2nd capturing group consume the input, turn it into a positive lookahead:
/(T:|G:|L:|D:).*?(?=(?:G:|i:|X:|\.:))/
Now, instead of matching (and consuming) the delimiter, this:
(?=(?:G:|i:|X:|\.:))
States that the regex must assert that the delimiter is present from the current point forward, i.e. a positive lookahead.
This results in:
"T:J:N:, G:T:N:N:C:J:N:C:A:J:N:"
It is possible with lookaheads, using the following syntax:
(?=G:|i:|X:|\.:)
That will not consume the piece that matches the regex.
On a side note, the delimiter means the slashes that you have enclosing your regex and not the capturing group you have.
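Put together with the string from the question, the lookahead version could be run as in this sketch. I switched the first group to a non-capturing one and dropped PREG_SET_ORDER to keep the output flat; both are my choices, not part of the answers.

```php
<?php
$structure = 'C:N:X:A:V:T:J:N:G:T:N:N:C:J:N:C:A:J:N:.:';

// The terminator is only asserted by the lookahead, so it stays available
// as the starting token of the next match.
preg_match_all('/(?:T:|G:|L:|D:).*?(?=G:|i:|X:|\.:)/', $structure, $found);

print_r($found[0]); // T:J:N: and G:T:N:N:C:J:N:C:A:J:N:
```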

problem with regular expressions php

I am matching this URL:
/any_string/any_string/any_number
with this regular expression:
/(\w+).(\w+).(\d+)/
It works, but I need this url:
/specific_string/any_string/any_string/any_number
And I don't know how to get it. Thanks.
/(specific_string).(\w+).(\w+).(\d+)/
Though note that the .s in your regular expression technically match any character and not just the /
/(specific_string)\/(\w+)\/(\w+)\/(\d+)/
This will have it match only slashes.
This one will match the second url:
"/(\w+)\/(\w+)\/(\w+)\/(\d+)/"
/\/specific_string\/(\w+).(\w+).(\d+)/
Just insert the specific_string in the regexp:
/specific_string\/(\w+)\/(\w+)\/(\d+)/
Another variant with the outer delimiters changed to avoid extraneous escaping:
preg_match("#/FIXED_STRING/(\w+)/(\w+)/(\d+)#", $_SERVER["REQUEST_URI"], $matches);
I would use something like this:
"/\/specific_string\/([^\/]+)\/([^\/]+)\/(\d+)/"
I use [^\/]+ because that will match anything that is not a slash. \w+ will work almost all the time, but this will also work if there is an unexpected character in the path somewhere. Also note that my regex requires the leading slash.
If you want to get a little more complicated, the following regex will match both of the patterns you provided:
"/^(?:\/specific_string)*\/([^\/]+)\/([^\/]+)\/(\d+)$/"
This will match:
"/any_string/any_string/any_number"
"/specific_string/any_string/any_string/any_number"
but it will not match
"/some_other_string/any_string/any_string/any_number"

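The last pattern can be checked against the three paths with a sketch like the one below. I used # as the delimiter so the slashes need no escaping, and 42 stands in for any_number; the variable names are mine.

```php
<?php
// Same pattern as in the answer, written with # delimiters instead of /.
$pattern = '#^(?:/specific_string)*/([^/]+)/([^/]+)/(\d+)$#';

$paths = [
    '/any_string/any_string/42',
    '/specific_string/any_string/any_string/42',
    '/some_other_string/any_string/any_string/42',
];

$results = [];
foreach ($paths as $path) {
    // preg_match returns 1 on a match and 0 otherwise; $m holds the captured groups.
    $results[$path] = preg_match($pattern, $path, $m);
    echo $path, ' => ', $results[$path] ? implode(', ', array_slice($m, 1)) : 'no match', PHP_EOL;
}
```

The third path fails because [^/]+ cannot cross a slash, so the final segment would have to be all digits one position too early.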