Regex repeating letters - php

How can I prevent a user from entering a word with repeating letters? I already have the case for special characters covered.
I have tried this, and it works for the special characters allowed in the text:
^(?!.*([ \-])\1)\w[a-zA-z0-9 \-]*$
For example, this input is rejected because of the double hyphen (--):
3 My Address--
This is what I am trying to do for the letters: (?!.*([a-z])\1{4}). It does not work, though; it breaks the regex:
(?!.*([ \-])\1)(?!.*([a-z])\1{4})\w[a-zA-z0-9 \-]*$
It should prevent any letter from being entered 4 times in a row. This is for an address, and as it stands I can still enter:
3 My Adddddddddd

You need to use \2 backreference in the second lookahead, and mind using [a-zA-Z], not [a-zA-z] in the consuming part:
^(?!.*([ -])\1)(?!.*([A-Za-z])\2{3})\w[a-zA-Z0-9 -]*$
The first capturing group is ([ -]) in the first lookahead, the second lookahead contains the second group, thus, \2 is necessary.
As you want to filter out matches with at least 4 identical consecutive letters, you need ([A-Za-z])\2{3}, not {4}.
Also, if you plan to match a digit at the beginning, consider replacing \w with \d.
Regex details
^ - start of string
(?!.*([ -])\1) - no two identical consecutive spaces or hyphens allowed in the string
(?!.*([A-Za-z])\2{3}) - no four identical consecutive letters allowed in the string
\w - the first char should be a letter, digit or _
[a-zA-Z0-9 -]* - 0+ letters, digits, spaces or hyphens
$ - end of string.
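A minimal PHP sketch of how the pattern above might be applied. The constant and function names are hypothetical, chosen only for illustration; the pattern itself is the one from the answer.

```php
<?php
// Pattern from the answer: no doubled space/hyphen, no letter repeated 4+ times in a row.
const ADDRESS_PATTERN = '/^(?!.*([ -])\1)(?!.*([A-Za-z])\2{3})\w[a-zA-Z0-9 -]*$/';

// Hypothetical helper: true when the input passes the pattern.
function isValidAddress(string $input): bool
{
    return preg_match(ADDRESS_PATTERN, $input) === 1;
}

foreach (['3 My Address', '3 My Address--', '3 My Adddddddddd'] as $sample) {
    echo $sample, ' => ', isValidAddress($sample) ? 'accepted' : 'rejected', PHP_EOL;
}
```

Note that three identical letters in a row (for example "Addd") still pass, since the second lookahead only rejects four or more.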

Related

how to develop regex for a number with php

I want to build a regex with PHP for a number such as '123 2345 7890'. The first 3 characters should be digits followed by a space, the next 4 characters should be digits followed by a space, and the last 4 characters should be digits as well. So far I have this, but it does not enforce the format I want. Can anyone please help me sort it out?
preg_match("/^([0-9]{3})([0-9]{4})([0-9]{4}).*$/", $new_password)
Your pattern does not match spaces, and the .* at the end matches any run of characters except newlines, so anything may follow the digits.
You could use \h+ to match 1 or more horizontal whitespace chars, and match optional horizontal whitespace chars at the end with \h*.
Or just match a literal space instead.
If you don't need the capture groups for later processing, you could omit them.
^\d{3}\h+\d{4}\h+\d{4}\h*$
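As a rough illustration, the pattern above could be wrapped like this in PHP. The function name is an assumption made for the example, not something from the question.

```php
<?php
// Hypothetical helper; the pattern is the one suggested in the answer.
function isGroupedNumber(string $input): bool
{
    // 3 digits, whitespace, 4 digits, whitespace, 4 digits, optional trailing whitespace.
    return preg_match('/^\d{3}\h+\d{4}\h+\d{4}\h*$/', $input) === 1;
}

foreach (['123 2345 7890', '12323457890', '123 2345 7890abc'] as $sample) {
    echo $sample, ' => ', isGroupedNumber($sample) ? 'match' : 'no match', PHP_EOL;
}
```

Because \h matches any horizontal whitespace, tabs and repeated spaces between the groups are accepted as well; replace \h+ with a single literal space if that is too permissive.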

Update a regex that matches twitter like mentions to allow for dots

I have already found helpful answers for a regex that matches twitter like username mentions in this answer and this answer
(?<=^|(?<=[^a-zA-Z0-9-_\.]))#([A-Za-z]+[A-Za-z0-9_]+)
(?<=^|(?<=[^a-zA-Z0-9-_\.]))#([A-Za-z]+[A-Za-z0-9-_]+)
However, I need to update this regex to also include usernames that have dots.
One or more dots are allowed in a username.
The username must not start or end with a dot.
No two consecutive dots are allowed.
Example of a matched string:
#valid.user.name
^^^^^^^^^^^^^^^^
Examples of non-matched strings:
#.user.name // starts with a dot
#user.name. // ends with a dot
#user..name // has two consecutive dots
You can use this refactored regex:
(?<=[^\w.-]|^)#([A-Za-z]+(?:\.\w+)*)$
RegEx Details:
(?<=[^\w.-]|^): Lookbehind to assert that we have start of line or any non-word, non-dot, non-hyphen character before current position
#: Match literal #
(: Start capture group
[A-Za-z]+: Match 1+ ASCII letters
(?:\.\w+)*: Match 0 or more instances of a dot followed by 1+ word characters
): End capture group
$: End
The (?<=^|(?<=[^a-zA-Z0-9-_\.])) is a positive lookbehind that requires a match to be at the start of the string or right after a character other than an alphanumeric, -, _ or .; you may write it in a more compact way as (?<![\w.-]), a negative lookbehind.
Next, ([A-Za-z]+[A-Za-z0-9_]+) captures 1+ ASCII letters and then 1+ ASCII letters, digits or underscores. You seem to want to make sure the first char is a letter, after which any number of sequences of . and 1+ word chars are allowed; that is, you may use [A-Za-z]\w*(?:\.\w+)*.
As you do not want to match it if there is a . right after the expected match, you need to set a lookahead that will require a space or end of string, (?!\S).
So, combining it, you can use
'~(?<![\w.-])#([A-Za-z]\w*(?:\.\w+)*)(?!\S)~'
Details
(?<![\w.-]) - no letters, digits, _, . and - immediately to the left of the current location are allowed
# - a # char
([A-Za-z]\w*(?:\.\w+)*) - Group 1:
    [A-Za-z] - an ASCII letter
    \w* - 0+ letters, digits, _
    (?:\.\w+)* - 0+ sequences of
        \. - dot
        \w+ - 1+ letters, digits, _
(?!\S) - whitespace or end of string are required immediately to the right of the current location.
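A short PHP sketch of extracting mentions with the combined pattern above. The function name is hypothetical; only the regex comes from the answer.

```php
<?php
// Hypothetical helper: returns the captured usernames (without the leading #).
function extractMentions(string $text): array
{
    preg_match_all('~(?<![\w.-])#([A-Za-z]\w*(?:\.\w+)*)(?!\S)~', $text, $matches);
    return $matches[1]; // group 1 captures, one entry per match
}

print_r(extractMentions('ping #valid.user.name and #user..name or #.user.name'));
```

With the sample text, only valid.user.name is returned: the (?!\S) lookahead rejects the name with consecutive dots, and the name starting with a dot fails at the first letter check.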
EDIT: Simpler version (same result)
^#[a-zA-Z](\.?[\w-]+)*$
Original
Another one:
^#[a-zA-Z][a-zA-Z_-]?(\.?[\w\d-]+){0,}$
^# starts with #
[a-zA-Z] first char
[a-zA-Z_-]? match a-zA-Z_- 0 or 1 time
( start group
\.? match . (optional)
[\w\d-]+ match a-zA-Z0-9-_ 1 or more times
) end group
{0,} repeat group 0 to infinite times
$ end
Tests
valid:
#validusername
#valid.user.name
#valid-user-name
#valid_user-name
#valid-user123_name
#a.valid-user123_name
not valid:
#-invalid.user
#_invalid.user
#1notvalid-user_123name33
#.user.name
#user.name.
#user..name
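The test lists above can be checked mechanically. Below is a small sketch that runs the "simpler version" of the pattern over both lists; the constant and function names are assumptions made for this example.

```php
<?php
// The "simpler version" from the answer above.
const MENTION_PATTERN = '/^#[a-zA-Z](\.?[\w-]+)*$/';

// Hypothetical helper: true when the whole string is a valid mention.
function isValidMention(string $input): bool
{
    return preg_match(MENTION_PATTERN, $input) === 1;
}

$samples = ['#valid.user.name', '#valid_user-name', '#.user.name', '#user.name.', '#user..name'];
foreach ($samples as $sample) {
    echo $sample, ' => ', isValidMention($sample) ? 'valid' : 'not valid', PHP_EOL;
}
```

The nested quantifier in (\.?[\w-]+)* can backtrack heavily on long non-matching input, so it may be worth limiting the input length before matching.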

Input field validation constraints using regular expression

I am working on a Symfony (2.8) project, where the registration form needs some input validation.
I need to set following constraints on the Subdomain name input field:
1. Should contain only alphanumeric characters
2. First character can not be a number
3. No white spaces
I am using annotations for this task.
Here is the Assert statement I am using:
@Assert\Regex(pattern="/^[a-zA-Z][a-zA-Z0-9]\s+$/", message="Subdomain name must start with a letter and can only have alphanumeric characters with no spaces", groups={"registration"})
When I enter any simple string, e.g. svits, it still shows the error message "Subdomain name must start with a letter and can only have alphanumeric characters with no spaces"
Any suggestions would be appreciated.
You are very close with your regex, just add a quantifier and remove \s:
/^[a-zA-Z][a-zA-Z0-9]+$/
Your pattern does not work because:
The [a-zA-Z0-9] only matches 1 alphanumeric character. To match 0 or more, add the * quantifier (zero or more occurrences of the quantified subpattern), or + (as in Toto's answer) to match one or more occurrences (which only matches words of 2+ characters).
Since your third requirement forbids the usage of whitespaces in the input string, remove \s+ from your pattern as it requires 1 or more whitespace symbols at the end of the string.
So, my suggestion is
pattern="/^[a-zA-Z][a-zA-Z0-9]*$/"
                              ^
to match words of 1+ characters as full strings that start with a letter and may be followed by 0+ alphanumeric symbols.
To allow whitespaces in any place of the string but the start, put the \s into the second [...] (character class):
pattern="/^[a-zA-Z][a-zA-Z0-9\s]*$/"
                             ^^ ^
If you do not want to allow more than 1 whitespace in a row (no 2+ consecutive whitespaces), use
pattern="/^[a-zA-Z][a-zA-Z0-9]*(?:\s[a-zA-Z0-9]+)*$/"
                               ^^^^^^^^^^^^^^^^^^^
The (?:\s[a-zA-Z0-9]+)* will match 0+ sequences of a single whitespace followed with 1+ alphanumerics.
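To compare the variants discussed above outside of Symfony, here is a plain PHP sketch. The array keys and the helper name are hypothetical labels for this example; the patterns are the ones from the question and answer.

```php
<?php
// Patterns discussed above, keyed by hypothetical labels.
$patterns = [
    'original'       => '/^[a-zA-Z][a-zA-Z0-9]\s+$/',                      // from the question
    'alnum_only'     => '/^[a-zA-Z][a-zA-Z0-9]*$/',                        // suggested fix
    'any_whitespace' => '/^[a-zA-Z][a-zA-Z0-9\s]*$/',                      // whitespace allowed after the first char
    'single_spaces'  => '/^[a-zA-Z][a-zA-Z0-9]*(?:\s[a-zA-Z0-9]+)*$/',     // no consecutive whitespace
];

// Hypothetical helper: true when $input matches $pattern.
function matchesPattern(string $pattern, string $input): bool
{
    return preg_match($pattern, $input) === 1;
}

foreach ($patterns as $label => $pattern) {
    echo $label, ': svits => ', matchesPattern($pattern, 'svits') ? 'ok' : 'error', PHP_EOL;
}
```

The original pattern rejects svits because it demands exactly two characters followed by whitespace, which is why the validation message always appeared.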

Password Regular expression with four criteria

I am trying to write a regular expression in PHP to ensure a password meets the following criteria:
It should be at least 8 characters long
It should include at least one special character
It should include at least one capital letter.
I have written the following expression:
$pattern=([a-zA-Z\W+0-9]{8,})
However, it doesn't seem to work as per the listed criteria. Could I get another pair of eyes to aid me please?
Your regex - ([a-zA-Z\W+0-9]{8,}) - actually searches for a substring in a larger text that is at least 8 characters long, but also allowing any English letters, non-word characters (other than [a-zA-Z0-9_]), and digits, so it does not enforce 2 of your requirements. They can be set with look-aheads.
Here is a fixed regex:
^(?=.*\W.*)(?=.*[A-Z].*).{8,}$
Actually, you can replace [A-Z] with \p{Lu} if you want to also match/allow non-English letters. You can also consider using \p{S} instead of \W, or further refine your definition of a special character by adding symbols or character classes, e.g. [\p{P}\p{S}] (this will also include all Unicode punctuation).
An enhanced regex version:
^(?=.*[\p{S}\p{P}].*)(?=.*\p{Lu}.*).{8,}$
A human-readable explanation:
^ - Beginning of a string
(?=.*\W.*) - Requirement to have at least 1 non-word character
OR (?=.*[\p{S}\p{P}].*) - At least 1 Unicode symbol or punctuation character
(?=.*[A-Z].*) - Requirement to have at least 1 uppercase English letter
OR (?=.*\p{Lu}.*) - At least 1 Unicode uppercase letter
.{8,} - Requirement to have at least 8 symbols
$ - End of string
Sample code:
if (preg_match('/^(?=.*\W.*)(?=.*[A-Z].*).{8,}$/u', $password)) {
    // PASS
} else {
    // FAIL
}
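For comparison, both the basic and the enhanced pattern can be wrapped in small functions. This is only a sketch; the function names are hypothetical and not part of the answer.

```php
<?php
// Hypothetical helpers; both patterns are the ones given in the answer.
function passesBasicPolicy(string $password): bool
{
    // At least one non-word char, one ASCII uppercase letter, 8+ chars overall.
    return preg_match('/^(?=.*\W.*)(?=.*[A-Z].*).{8,}$/u', $password) === 1;
}

function passesUnicodePolicy(string $password): bool
{
    // At least one Unicode symbol/punctuation char, one Unicode uppercase letter, 8+ chars.
    return preg_match('/^(?=.*[\p{S}\p{P}].*)(?=.*\p{Lu}.*).{8,}$/u', $password) === 1;
}

foreach (['Passw0rd!', 'password!', 'Password1'] as $sample) {
    echo $sample, ' => ', passesBasicPolicy($sample) ? 'PASS' : 'FAIL', PHP_EOL;
}
```

One difference worth noting: an underscore counts as punctuation for the enhanced pattern, but it is a word character, so it does not satisfy \W in the basic one.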
Using positive lookahead ?= we make sure that all password requirements are met.
Requirements for strong password:
At least 8 chars long
At least 1 Capital Letter
At least 1 Special Character
Regex:
^((?=[\S]{8})(?:.*)(?=[A-Z]{1})(?:.*)(?=[\p{S}])(?:.*))$
PHP implementation:
if (preg_match('/^((?=[\S]{8})(?:.*)(?=[A-Z]{1})(?:.*)(?=[\p{S}])(?:.*))$/u', $password)) {
# Strong Password
} else {
# Weak Password
}
Examples:
12345678 - WEAK
1234%fff - WEAK
1234_44A - WEAK
133333A$ - STRONG
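The examples above can be reproduced with a short script. One caveat that shows up when testing: because each lookahead in this pattern is evaluated after a consuming (?:.*) group rather than from the start of the string, the symbol has to appear after an uppercase letter. The second function below is an assumed order-independent variant, added here only for comparison; both function names are hypothetical.

```php
<?php
// Pattern from the answer above (hypothetical helper name).
function isStrongChained(string $password): bool
{
    return preg_match('/^((?=[\S]{8})(?:.*)(?=[A-Z]{1})(?:.*)(?=[\p{S}])(?:.*))$/u', $password) === 1;
}

// Assumed variant, not from the answer: every lookahead starts at the beginning of the string.
function isStrongAnchored(string $password): bool
{
    return preg_match('/^(?=\S{8})(?=.*[A-Z])(?=.*\p{S}).*$/u', $password) === 1;
}

foreach (['12345678', '1234%fff', '1234_44A', '133333A$', '$133333A'] as $sample) {
    echo $sample, ' => chained: ', isStrongChained($sample) ? 'STRONG' : 'WEAK',
        ', anchored: ', isStrongAnchored($sample) ? 'STRONG' : 'WEAK', PHP_EOL;
}
```

With the chained pattern, $133333A is reported as weak even though it meets all three stated requirements, while the anchored variant accepts it.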
Regex Explanation:
^ assert position at start of the string
1st Capturing group ((?=[\S]{8})(?:.*)(?=[A-Z]{1})(?:.*)(?=[\p{S}])(?:.*))
(?=[\S]{8}) Positive Lookahead - Assert that the regex below can be matched
[\S]{8} match a single character present in the list below
Quantifier: {8} Exactly 8 times
\S match any kind of visible character [\P{Z}\H\V]
(?:.*) Non-capturing group
.* matches any character (except newline) [unicode]
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
(?=[A-Z]{1}) Positive Lookahead - Assert that the regex below can be matched
[A-Z]{1} match a single character present in the list below
Quantifier: {1} Exactly 1 time (meaningless quantifier)
A-Z a single character in the range between A and Z (case sensitive)
(?:.*) Non-capturing group
.* matches any character (except newline) [unicode]
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
(?=[\p{S}]) Positive Lookahead - Assert that the regex below can be matched
[\p{S}] match a single character present in the list below
\p{S} matches math symbols, currency signs, dingbats, box-drawing characters, etc
(?:.*) Non-capturing group
.* matches any character (except newline) [unicode]
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
$ assert position at end of the string
u modifier: unicode: In PHP, pattern and subject strings are treated as UTF-8. Also causes escape sequences to match unicode characters
Demo:
https://regex101.com/r/hE2dD2/1

php regex - find uppercase string with number and spaces in text

I want to write a PHP regular expression to find an uppercase string, which can also contain one number and spaces, in a text.
For example, from the text "some text to contain EXAM PL E 7STRING uppercase word" I want to get the string EXAM PL E 7STRING.
The found string should start and end with an uppercase letter, but in the middle, besides uppercase letters, it can also contain (but not necessarily) one number and spaces. So the regex should match any of these patterns:
1) EXAMPLESTRING - just uppercase string
2) EXAMP4LESTRING - with number
3) EXAMPLES TRING - with space
4) EXAM PL E STRING - with more than one spaces
5) EXAMP LE4STRING - with number and space
6) EXAMP LE 4ST RI NG - with number and spaces
and the total length of the string should be 4 letters or more.
I wrote this regex, '/[A-Z]{1,}([A-Z\s]{2,}|\d?)[A-Z]{1,}/', which can find the first 4 patterns, but I cannot figure out how to match the last 2 patterns as well.
Thanks
There is a neat trick called a lookahead. It just checks what is following after the current position. That can be used to check for multiple conditions:
'/(?<![A-Z])(?=(?:[A-Z][\s\d]*){3}[A-Z])(?!(?:[A-Z\s]*\d){2})[A-Z][A-Z\s\d]*[A-Z]/'
The first lookaround is actually a lookbehind and checks that there is no previous uppercase letter. This is just a little speedup for strings that would fail the match anyway. The second lookaround (a lookahead) checks that there are at least four letters. The third one checks that there are no two digits. The rest just matches then a string of the allowed characters, starting and ending with an uppercase letter.
Note that in the case of two digits this will not match at all (instead of matching everything up to the second digit). If you do want to match in such a case, you could incorporate the "1 digit" rule into the actual match instead:
'/(?<![A-Z])(?=(?:[A-Z][\s\d]*){3}[A-Z])[A-Z][A-Z\s]*\d?[A-Z\s]*[A-Z]/'
EDIT:
As Ωmega pointed out, this will cause problems if there are fewer than four letters before the second digit, but more after that. This is actually quite tough, because the assertion needs to be that there are at least four letters before the second digit. Since we do not know where the first digit occurs in those four letters, we have to check for all possible positions. For this I would do away with the lookaheads altogether, and simply provide the three different alternatives. (I will keep the lookbehind as an optimization for non-matching parts.)
'/(?<![A-Z])[A-Z]\s*(?:\d\s*[A-Z]\s*[A-Z]|[A-Z]\s*\d\s*[A-Z]|[A-Z]\s*[A-Z][A-Z\s]*\d?)[A-Z\s]*[A-Z]/'
Or here with added comments:
'/
(?<!                 # negative lookbehind
    [A-Z]            # current position is not preceded by a letter
)                    # end of lookbehind
[A-Z]                # match has to start with uppercase letter
\s*                  # optional spaces after first letter
(?:                  # subpattern for possible digit positions
    \d\s*[A-Z]\s*[A-Z]
                     # digit comes after first letter, we need two more letters before last one
  |                  # OR
    [A-Z]\s*\d\s*[A-Z]
                     # digit comes after second letter, we need one more letter before last one
  |                  # OR
    [A-Z]\s*[A-Z][A-Z\s]*\d?
                     # digit comes after third letter, or later, or not at all
)                    # end of subpattern for possible digit positions
[A-Z\s]*             # arbitrary amount of further letters and whitespace
[A-Z]                # match has to end with uppercase letter
/x'
That gives the same result on Ωmega's lengthy test input.
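As a quick check, the alternation-based pattern can be run against the sample sentence from the question. The constant and function names here are hypothetical; the regex is the single-line version given above.

```php
<?php
// Final alternation-based pattern from the answer above.
const UPPERCASE_RUN = '/(?<![A-Z])[A-Z]\s*(?:\d\s*[A-Z]\s*[A-Z]|[A-Z]\s*\d\s*[A-Z]|[A-Z]\s*[A-Z][A-Z\s]*\d?)[A-Z\s]*[A-Z]/';

// Hypothetical helper: returns the first matching run, or null if there is none.
function findUppercaseRun(string $text): ?string
{
    return preg_match(UPPERCASE_RUN, $text, $m) === 1 ? $m[0] : null;
}

echo findUppercaseRun('some text to contain EXAM PL E 7STRING uppercase word'), PHP_EOL;
// prints "EXAM PL E 7STRING"
```

The trailing [A-Z] forces the match to end on a letter, so the space before "uppercase" is given back during backtracking and does not end up in the result.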
I suggest using this regex pattern:
[A-Z][ ]*(\d)?(?(1)(?:[ ]*[A-Z]){3,}|[A-Z][ ]*(\d)?(?(2)(?:[ ]*[A-Z]){2,}|[A-Z][ ]*(\d)?(?(3)(?:[ ]*[A-Z]){2,}|[A-Z][ ]*(?:\d|(?:[ ]*[A-Z])+[ ]*\d?))))(?:[ ]*[A-Z])*
Or, without the conditional groups:
[A-Z][ ]*(?:\d(?:[ ]*[A-Z]){2}|[A-Z][ ]*\d[ ]*[A-Z]|(?:[A-Z][ ]*){2,}\d?)[A-Z ]*[A-Z]
