Exclude a certain match in a capturing group in regex - php

I have a regex capturing group and I want to exclude a number if it matches a certain pattern also.
This is my capturing group:
https://regex101.com/r/zL1tL8/1
if \n is followed by a number and character like "1st", "2nd", "4dffgsd", "3sf" then it should stop the match BEFORE the number.
0-9 is important in the capturing group.
So far I have this pattern [0-9][a-zA-Z]+ to match a number followed by characters. How do I apply this to the capturing group as a condition?
Update:
https://regex101.com/r/zL1tL8/4
Line 1 is wrong.
It should not match a number followed by characters

You'll want to use a negative lookahead to "stop" the match if something after matches your pattern. So, something like this might work:
(\\n(?![0-9][a-zA-Z]))
See it in use here: https://regex101.com/r/zL1tL8/2
Here's a page with some more info on lookahead and lookbehind: http://www.rexegg.com/regex-lookarounds.html

Related

Using Regex to detect if string exists

I need to use PHP's preg_match() and Regex to detect the following conditions:
If a URL path is one of the following:
products/new items
new items/products
new items/products/brand name
Do something...
I can't seem to figure out how to check if the a string exists before or after the word products. The closest I can get is:
if (preg_match("([a-zA-Z0-9_ ]+)\/products\/([a-zA-Z0-9_ ]+)", $url_path)) {
// Do something
Would anyone know a way to check if the first part of the string exists within the one regex line?
You could use an alternation with an optional group for the last item making the / part of the optional group.
If you are only looking for a match, you can omit the capturing groups.
(?:[a-zA-Z0-9_ ]+/products(?:/[a-zA-Z0-9_ ]+)?|products/[a-zA-Z0-9_ ]+)
Explanation
(?: Non catpuring group
[a-zA-Z0-9_ ]+/products Match 1+ times what is listed in the character class, / followed by products
(?:/[a-zA-Z0-9_ ]+)? Optionally match / and what is listed in the character class
| Or
products/[a-zA-Z0-9_ ]+ Match products/, match 1+ times what is listed
) Close group
Regex demo
Note that [a-zA-Z0-9_ ]+ might be shortened to [\w ]+
You can use alternation
([\w ]+)\/products|products\/([\w ]+)
Regex Demo
Note:- I am not sure how you're using the matched values, if you don't need back reference to any specific values then you can avoid capturing group, i.e.
[\w ]+\/products|products\/[\w ]+

REGEX - match words that contain letters repeating next to each other

im looking for a regex that matches words that repeat a letter(s) more than once and that are next to each other.
Here's an example:
This is an exxxmaple oooonnnnllllyyyyy!
By far I havent found anything that can exactly match:
exxxmaple and oooonnnnllllyyyyy
I need to find it and place them in an array, like this:
preg_match_all('/\b(???)\b/', $str, $arr) );
Can somebody explain what regexp i have to use?
You can use a very simple regex like
\S*(\w)(?=\1+)\S*
See how the regex matches at http://regex101.com/r/rF3pR7/3
\S matches anything other than a space
* quantifier, zero or more occurance of \S
(\w) matches a single character, captures in \1
(?=\1+) postive look ahead. Asserts that the captrued character is followed by itsef \1
+ quantifiers, one or more occurence of the repeated character
\S* matches anything other than space
EDIT
If the repeating must be more than once, a slight modification of the regex would do the trick
\S*(\w)(?=\1{2,})\S*
for example http://regex101.com/r/rF3pR7/5
Use this if you want discard words like apple etc .
\b\w*(\w)(?=\1\1+)\w*\b
or
\b(?=[^\s]*(\w)\1\1+)\w+\b
Try this.See demo.
http://regex101.com/r/kP8uF5/20
http://regex101.com/r/kP8uF5/21
You can use this pattern:
\b\w*?(\w)\1{2}\w*
The \w class and the word-boundary \b limit the search to words. Note that the word boundary can be removed, however, it reduces the number of steps to obtain a match (as the lazy quantifier). Note too, that if you are looking for words (in the common meaning), you need to remove the word boundary and to use [a-zA-Z] instead of \w.
(\w)\1{2} checks if a repeated character is present. A word character is captured in group 1 and must be followed with the content of the capture group (the backreference \1).

PHP RegEx get first letter after set of characters

I have some text with heading string and set of letters.
I need to get first one-digit number after set of string characters.
Example text:
ABC105001
ABC205001
ABC305001
ABCD105001
ABCD205001
ABCD305001
My RegEx:
^(\D*)(\d{1})(?=\d*$)
Link: http://www.regexr.com/390gv
As you cans see, RegEx works ok, but it captures first groups in results also. I need to get only this integer and when I try to put ?= in first group like this: ^(?=\D*)(\d{1})(?=\d*$) , Regex doesn't work.
Any ideas?
Thanks in advance.
(?=..) is a lookahead that means followed by and checks the string on the right of the current position.
(?<=...) is a lookbehind that means preceded by and checks the string on the left of the current position.
What is interesting with these two features, is the fact that contents matched inside them are not parts of the whole match result. The only problem is that a lookbehind can't match variable length content.
A way to avoid the problem is to use the \K feature that remove all on the left from match result:
^[A-Z]+\K\d(?=\d*$)
You're trying to use a positive lookahead when really you want to use non-capturing groups.
The one match you want will work with this regex:
^(?:\D*\d{1})(\d*)$
The (?: string will start a non-capturing group. This will not come back in matches.
So, if you used preg_match(';^(?:\D*\d{1})(\d*)$;', $string, $matches) to find your match, $matches[1] would be the string for which you're looking. (This is because $matches[0] will always be the full match from preg_match.)
try:
^(?:\D*)(\d{1})(?=\d*$) // (?: is the beginning of a no capture group

Phone no contain this patteren AABBCC e.g 112233

I want to check if phone no contains this pattern AABBCC
Where A[0-9],B[0-9],C[0,9] They should be different e.g 112233,553322,887766
Let Us Suppose
I Have a phone no 03334112233
It will say yes pattern matched.
PHP Code but It Is For Exact String
$str = 'aabbaabbccaass'; //or whatever
if (preg_match('/(?!.*?aabbcc)^.*$/', $str))
echo "accepted\n";
else
echo "rejected\n";
Problem i don't know how to do if string is for numbers
Possible Duplicate
but it does not contain answer and exact detail.
Edited :
I want to match the last 6 characters of the string in this pattern AABBCC e.g 03329112233
To match number with AABBCC format, you can use this pattern:
(?:(\d)\1(?!\1)){2}(\d)\2
example of use:
if (preg_match('/(?:(\d)\1(?!\1)){2}(\d)\2/', $str)
echo "rejected\n";
else
echo "accepted\n";
But if you have other tests to do (for example that there is only digits), it can be more flexible to use it in this way:
if (preg_match('/(?!.*(?:(\d)\1(?!\1)){2}(\d)\2)^\d+$/', $str)
echo "accepted\n";
else
echo "rejected\n";
pattern details:
(?: # open a non capturing group that describes a repeated digit
(\d) # capture the first digit with group 1
\1 # a backreference to group 1 (the same digit thus)
(?!\1) # check with a negative lookahead that the same digit doesn't follow
){2} # repeat the group two times
(\d)\2 # same thing for digits 5 & 6 (the lookahead isn't needed here)
Note that the digit in the capture group change at each repetition of the non capturing group (because the negative lookahead forces it).
Notice: if you want to reject numbers that contains, for example, 111122 or 112222 or 111111, you only need to remove the negative lookahead.
if you want to reject numbers with the format 112211 or 448844, you must change the pattern like this: (\d)\1(?!\d{0,2}\1)(\d)\2(?!\2)(\d)\3
As I understand, you only want to match the last 6 characters of the string, if they are digits, and of 3 all different digit pairs. Would also use a lookahead and some pattern like this:
(?>((\d)\2)(?!.*\1)){3}$
\2 checks for an equivalent of 2nd capturing group, which is one digit (shorthand \d)
using a negative lookahead to check, if not followed by .* any amount of any characters, followed by equivalent of 1st capturing group (which contains 2 equal digits).
{3} 3 repitions at $ end of string.
Test on regex101.com, Regex FAQ
Your regex should be like this:
^((\d)\2){3}$
It is simpler and also works.
You can use capturing groups and backreferences like this:
if (preg_match('/(?!.*(.)\1(.)\2(.)\3)^.*$/', $str))
The (.) will match any single character and assign it to a group. The first instance is assigned to group 1, the second to group 2 and so on. Later in the pattern, the backreference \1 will match exactly what was previously captured in group the first group, \2 will match what was captured in the second group, etc.
You probably will also want to use \d to match any single digit (it's only necessary to use this outside of the lookahead) and a {n,m} quantifier to match between n and m digits. For example, the following will match any sequence of 7 to 10 digits that does not contain a subsequence like AABBCC:
if (preg_match('/(?!.*(.)\1(.)\2(.)\3)^\d{7,10}$/', $str))

Lookahead, lookbehind condition in regular expression

The following example is about using lookahead assertion as a condition. I found it in the PHP manual at: http://www.php.net/manual/en/regexp.reference.conditional.php
(?(?=[^a-z]*[a-z])
\d{2}-[a-z]{3}-\d{2} | \d{2}-\d{2}-\d{2} )
Here's the description about this regex:
The condition is a positive lookahead assertion that matches an optional sequence of non-letters followed by a letter. In other words, it tests for the presence of at least one letter in the subject. If a letter is found, the subject is matched against the first alternative; otherwise it is matched against the second. This pattern matches strings in one of the two forms dd-aaa-dd or dd-dd-dd, where aaa are letters and dd are digits.
Could anyone tell me why we use lookahead assertion as the condition in this example? Why don't we use lookbehind assertion? I get confused when they're used as conditions like this because I don't know how do they match the subject string. Thanks in advance!
In this case we're using a lookahead assertion to decide which regex to use. It looks like it's deciding between matching dates of the form 01-Jan-12 and 01-01-12. The lookahead assertion sees if there are any letters within what we're trying to match and if so uses the \d{2}-[a-z]{3}-\d{2} to try and match 01-Jan-12 if not it uses \d{2}-\d{2}-\d{2} to try and match 01-01-12.

Categories