Password Regular expression with four criteria - php

I am trying to write a regular expression in PHP to ensure a password meets the following criteria:
It should be at least 8 characters long
It should include at least one special character
It should include at least one capital letter.
I have written the following expression:
$pattern=([a-zA-Z\W+0-9]{8,})
However, it doesn't seem to work as per the listed criteria. Could I get another pair of eyes to aid me please?

Your regex - ([a-zA-Z\W+0-9]{8,}) - searches a larger text for a substring of at least 8 characters made up of any mix of English letters, non-word characters (anything other than [a-zA-Z0-9_], which already covers the +), and digits. It only checks the length, so it does not enforce 2 of your requirements. They can be set with look-aheads.
Here is a fixed regex:
^(?=.*\W.*)(?=.*[A-Z].*).{8,}$
You can also replace [A-Z] with \p{Lu} if you want to match/allow non-English uppercase letters. Consider using \p{S} instead of \W, or refine your definition of a special character by adding symbols or character classes, e.g. [\p{P}\p{S}] (this will also include all Unicode punctuation).
An enhanced regex version:
^(?=.*[\p{S}\p{P}].*)(?=.*\p{Lu}.*).{8,}$
A human-readable explanation:
^ - Beginning of a string
(?=.*\W.*) - Requirement to have at least 1 non-word character
OR (?=.*[\p{S}\p{P}].*) - At least 1 Unicode special or punctuation symbol
(?=.*[A-Z].*) - Requirement to have at least 1 uppercase English letter
OR (?=.*\p{Lu}.*) - At least 1 Unicode uppercase letter
.{8,} - Requirement to have at least 8 characters
$ - End of string
See Demo 1 and Demo 2 (Enhanced regex)
Sample code:
if (preg_match('/^(?=.*\W.*)(?=.*[A-Z].*).{8,}$/u', $password)) {
// PASS
}
else {
# FAIL
}
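A quick way to sanity-check this pattern is to run it over a handful of candidate passwords. The sketch below assumes a hypothetical helper name, meetsPolicy, and made-up sample strings; the redundant trailing .* inside each lookahead is dropped, which does not change what is matched.

```php
<?php
// Hypothetical helper wrapping the lookahead-based pattern from this answer.
function meetsPolicy(string $password): bool
{
    // (?=.*\W)     at least one non-word character
    // (?=.*[A-Z])  at least one uppercase ASCII letter
    // .{8,}        at least 8 characters in total
    return preg_match('/^(?=.*\W)(?=.*[A-Z]).{8,}$/u', $password) === 1;
}

// Made-up samples; each one violates a different rule except the last.
foreach (['12345678', 'Abcdefgh', 'abcdefg!', 'Abcdef!', 'Abcdefg!'] as $candidate) {
    echo $candidate, ' => ', meetsPolicy($candidate) ? 'PASS' : 'FAIL', PHP_EOL;
}
```

Of these samples, only the last one satisfies all three rules at once.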

Using positive lookahead ?= we make sure that all password requirements are met.
Requirements for strong password:
At least 8 chars long
At least 1 Capital Letter
At least 1 Special Character
Regex:
^((?=[\S]{8})(?:.*)(?=[A-Z]{1})(?:.*)(?=[\p{S}])(?:.*))$
PHP implementation:
if (preg_match('/^((?=[\S]{8})(?:.*)(?=[A-Z]{1})(?:.*)(?=[\p{S}])(?:.*))$/u', $password)) {
# Strong Password
} else {
# Weak Password
}
Examples:
12345678 - WEAK
1234%fff - WEAK
1234_44A - WEAK
133333A$ - STRONG
Regex Explanation:
^ assert position at start of the string
1st Capturing group ((?=[\S]{8})(?:.*)(?=[A-Z]{1})(?:.*)(?=[\p{S}])(?:.*))
(?=[\S]{8}) Positive Lookahead - Assert that the regex below can be matched
[\S]{8} match a single character present in the list below
Quantifier: {8} Exactly 8 times
\S match any kind of visible character [\P{Z}\H\V]
(?:.*) Non-capturing group
.* matches any character (except newline) [unicode]
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
(?=[A-Z]{1}) Positive Lookahead - Assert that the regex below can be matched
[A-Z]{1} match a single character present in the list below
Quantifier: {1} Exactly 1 time (meaningless quantifier)
A-Z a single character in the range between A and Z (case sensitive)
(?:.*) Non-capturing group
.* matches any character (except newline) [unicode]
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
(?=[\p{S}]) Positive Lookahead - Assert that the regex below can be matched
[\p{S}] match a single character present in the list below
\p{S} matches math symbols, currency signs, dingbats, box-drawing characters, etc
(?:.*) Non-capturing group
.* matches any character (except newline) [unicode]
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
$ assert position at end of the string
u modifier: unicode: Pattern and subject strings are treated as UTF-8. Also causes escape sequences to match unicode characters
Demo:
https://regex101.com/r/hE2dD2/1
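One thing worth checking with this pattern is whether the order of the characters matters. Its lookaheads are chained one after another rather than all anchored at the start, so the uppercase letter apparently has to appear before the symbol. The sketch below compares it with a hypothetical variant (not part of the answer) whose lookaheads all start at position 0; the sample passwords are made up.

```php
<?php
// Pattern from the answer above, copied unchanged.
$sequential = '/^((?=[\S]{8})(?:.*)(?=[A-Z]{1})(?:.*)(?=[\p{S}])(?:.*))$/u';
// Hypothetical alternative: every lookahead is evaluated from the start of the string.
$anchored = '/^(?=.*[A-Z])(?=.*\p{S}).{8,}$/u';

// Same characters, different order: uppercase before symbol, then symbol before uppercase.
foreach (['133333A$', '$A333333'] as $candidate) {
    printf(
        "%s sequential=%d anchored=%d\n",
        $candidate,
        preg_match($sequential, $candidate),
        preg_match($anchored, $candidate)
    );
}
```

If the order of the characters should not matter, the anchored form is the safer choice.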

Related

PHP Regex: Remove words not equal exactly 3 characters

An excellent, very close answer is given at "Remove words less than 3 chars" (with a demo), where the regex
\b([a-z]{1,2})\b
removes all words shorter than 3 characters.
How can I invert this, so that it removes all words that are NOT exactly 3 characters long?
We can match words of exactly 3 characters with
\b([a-z]{3})\b
but how do I tell the regex to remove all other words, those whose length is not 3?
In the regex demo referenced above, only the word 'and' should remain.
Use alternatives to match either 1-2 or 4+ letters.
\b(?:[a-z]{1,2}|[a-z]{4,})\b
Another variation with a negative lookbehind asserting not 3 chars to the left
\b[a-z]+\b(?<!\b[a-z][a-z][a-z]\b)
Regex demo
Or with a skip fail approach for 3 chars a-z:
\b[a-z]{3}\b(*SKIP)(*F)|\b[a-z]+\b
Regex demo
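To compare the three patterns above, they can be run through preg_replace on the same input. This is only a sketch: the helper name removeMatches, the sample sentence and the added i flag are assumptions, and the leftover whitespace is collapsed in a second step.

```php
<?php
// Hypothetical helper: delete every match of $pattern, then tidy up the whitespace.
function removeMatches(string $pattern, string $text): string
{
    $stripped = preg_replace($pattern, '', $text);
    return trim(preg_replace('/\s+/', ' ', $stripped));
}

// The three patterns from this answer; the i flag is an assumption.
$patterns = [
    'alternation' => '/\b(?:[a-z]{1,2}|[a-z]{4,})\b/i',
    'lookbehind'  => '/\b[a-z]+\b(?<!\b[a-z][a-z][a-z]\b)/i',
    'skip-fail'   => '/\b[a-z]{3}\b(*SKIP)(*F)|\b[a-z]+\b/i',
];

foreach ($patterns as $name => $pattern) {
    echo $name, ': ', removeMatches($pattern, 'a quick fox and the lazy dog go home'), PHP_EOL;
}
```

All three should leave only the three-letter words, "fox and the dog", for this input.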
I think maybe:
\b(?![a-z]{3}\b)[a-z]+\b
Matching:
\b - A word-boundary.
(?![a-z]{3}\b) - A negative lookahead to avoid three-letter words.
[a-z]+\b - Any word of 1+ letters (greedy), up to a word boundary.
Another trick is to use a capture group to match what you want:
\b(?:[a-z]{3}|([a-z]+))\b
\b - A word-boundary
(?:[a-z]{3}|([a-z]+)) - A capture group nested inside an alternation: the first alternative consumes three-letter words without capturing them, the second captures any other word of 1+ letters (greedy).
\b - A word-boundary
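With the capture-group trick, the replacement has to look at whether group 1 took part in the match. A minimal sketch with preg_replace_callback follows; the function name dropCapturedWords and the i flag are assumptions.

```php
<?php
// Hypothetical helper: removes every word that ended up in capture group 1.
function dropCapturedWords(string $text): string
{
    $result = preg_replace_callback(
        '/\b(?:[a-z]{3}|([a-z]+))\b/i',
        function (array $m): string {
            // Group 1 is only set when the second alternative matched,
            // i.e. when the word is not exactly three letters long.
            return ($m[1] ?? '') !== '' ? '' : $m[0];
        },
        $text
    );
    // Collapse the whitespace left behind by the removed words.
    return trim(preg_replace('/\s+/', ' ', $result));
}

echo dropCapturedWords('a quick fox and the lazy dog go home'), PHP_EOL;
```

Three-letter words are matched by the first alternative and returned unchanged, so only the captured words disappear.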
With an optional group of letters with at least 2 characters and a possessive quantifier:
\b[a-z]{1,2}+(?:[a-z]{2,})?\b
demo
This approach is based on a calculation trick and on backtracking.
In other words: 2 + x = 3 with x > 1 has no solution.
If I had written \b[a-z]{1,2}(?:[a-z]{2,})?\b (with or without the last \b, it doesn't matter), then at the start of a three-letter word [a-z]{1,2} would first consume the first two letters. Since one more character is needed before the last word boundary can succeed, the regex engine would have no choice but to backtrack into the {1,2} quantifier. After one backtracking step, [a-z]{1,2} would consume only one character and (?:[a-z]{2,})?\b could succeed. Making this quantifier possessive forbids that backtracking step. For a three-letter word, [a-z]{1,2}+ takes 2 characters and [a-z]{2,} needs at least 2 more letters, so the pattern fails.
Use the word boundary and force to fail with the possessive quantifier:
\b(?:[a-z]{3}\b)?+[a-z]+
demo
This one also relies on an impossible assertion: three letters followed by a word boundary cannot be followed by another letter.
Again, with a three-letter word, once the three letters are consumed by [a-z]{3}, the possessive quantifier ?+ forbids backtracking and [a-z]+ makes the pattern fail.
Force to fail with 3 letters and skip them using a backtracking control verb:
\b[a-z]{3}\b(*SKIP)^|[a-z]+
demo
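The three patterns in this answer can be checked side by side with preg_match_all. The helper name matchedWords and the sample sentence below are assumptions made for illustration.

```php
<?php
// Hypothetical helper: returns every full match of $pattern in $text.
function matchedWords(string $pattern, string $text): array
{
    preg_match_all($pattern, $text, $m);
    return $m[0];
}

// Patterns copied from this answer.
$patterns = [
    '/\b[a-z]{1,2}+(?:[a-z]{2,})?\b/',   // possessive {1,2}+
    '/\b(?:[a-z]{3}\b)?+[a-z]+/',        // possessive optional group
    '/\b[a-z]{3}\b(*SKIP)^|[a-z]+/',     // backtracking control verb
];

foreach ($patterns as $pattern) {
    echo $pattern, ' => ', implode(', ', matchedWords($pattern, 'a quick fox and the lazy dog go home')), PHP_EOL;
}
```

Each pattern should report the same list of words to remove, leaving the three-letter words unmatched.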

Regex repeating letters

How can I prevent a user from entering a word with repeating letters? I already handle the case for special characters.
I have tried this and it works for the special characters allowed in the text.
^(?!.*([ \-])\1)\w[a-zA-z0-9 \-]*$
3 My Address--
is rejected (because of the --).
This is what I am trying to do for the letters: (?!.*([a-z])\1{4}). It does not work; it breaks the regex.
(?!.*([ \-])\1)(?!.*([a-z])\1{4})\w[a-zA-z0-9 \-]*$
It should reject any letter repeated 4 times in a row. This is for an address field and, as it stands, I can still enter:
3 My Adddddddddd
You need to use \2 backreference in the second lookahead, and mind using [a-zA-Z], not [a-zA-z] in the consuming part:
^(?!.*([ -])\1)(?!.*([A-Za-z])\2{3})\w[a-zA-Z0-9 -]*$
See the regex demo.
The first capturing group is ([ -]) in the first lookahead, the second lookahead contains the second group, thus, \2 is necessary.
As you want to filter out matches with at least 4 identical consecutive letters, you need ([A-Za-z])\2{3}, not {4}.
Also, if you plan to match a digit at the beginning, consider replacing \w with \d.
Regex details
^ - start of string
(?!.*([ -])\1) - no two identical consecutive spaces or hyphens allowed in the string
(?!.*([A-Za-z])\2{3}) - no four identical consecutive letters allowed in the string
\w - the first char should be a letter, digit or _
[a-zA-Z0-9 -]* - 0+ letters, digits, spaces or hyphens
$ - end of string.
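As a rough check of the corrected pattern, the sketch below wraps it in a hypothetical isValidAddress helper and feeds it the two inputs from the question plus two made-up ones.

```php
<?php
// Hypothetical helper around the corrected pattern from this answer.
function isValidAddress(string $address): bool
{
    // \1 refers to the space/hyphen group, \2 to the letter group.
    $pattern = '/^(?!.*([ -])\1)(?!.*([A-Za-z])\2{3})\w[a-zA-Z0-9 -]*$/';
    return preg_match($pattern, $address) === 1;
}

foreach (['3 My Address', '3 My Address--', '3 My Adddddddddd', '3 My Addd'] as $address) {
    echo $address, ' => ', isValidAddress($address) ? 'accepted' : 'rejected', PHP_EOL;
}
```

Note that the backreference is case-sensitive, so only runs of the same letter in the same case count as repeats.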

Update a regex that matches twitter like mentions to allow for dots

I have already found helpful answers for a regex that matches twitter like username mentions in this answer and this answer
(?<=^|(?<=[^a-zA-Z0-9-_\.]))#([A-Za-z]+[A-Za-z0-9_]+)
(?<=^|(?<=[^a-zA-Z0-9-_\.]))#([A-Za-z]+[A-Za-z0-9-_]+)
However, I need to update this regex to also include usernames that have dots.
One or more dots are allowed in a username.
The username must not start or end with a dot.
No two consecutive dots are allowed.
Example of a matched string:
#valid.user.name
^^^^^^^^^^^^^^^^
Examples of non-matched strings:
#.user.name // starts with a dot
#user.name. // ends with a dot
#user..name // has two consecutive dots
You can use this refactored regex:
(?<=[^\w.-]|^)#([A-Za-z]+(?:\.\w+)*)$
RegEx Demo
RegEx Details:
(?<=[^\w.-]|^): Lookbehind to assert that we have start of line or any non-word, non-dot, non-hyphen character before current position
#: Match a literal #
(: Start capture group
[A-Za-z]+: Match 1+ ASCII letters
(?:\.\w+)*: Match 0 or more instances of a dot followed by 1+ word characters
): End capture group
$: End
The (?<=^|(?<=[^a-zA-Z0-9-_\.])) is a positive lookbehind that requires a match to be at the start of the string or right after a char other than an alphanumeric, -, _ or .; you may write it in a more compact way as (?<![\w.-]), a negative lookbehind.
Next, ([A-Za-z]+[A-Za-z0-9_]+) captures 1+ ASCII letters and then 1+ ASCII letters, digits or/and underscores. You seem to want the first char to be a letter, followed by any number of sequences of . and 1+ word chars, that is, you may use [A-Za-z]\w*(?:\.\w+)*.
As you do not want to match it if there is a . right after the expected match, you need to set a lookahead that will require a space or end of string, (?!\S).
So, combining it, you can use
'~(?<![\w.-])#([A-Za-z]\w*(?:\.\w+)*)(?!\S)~'
See the regex demo
Details
(?<![\w.-]) - no letters, digits, _, . and - immediately to the left of the current location are allowed
# - a # char
([A-Za-z]\w*(?:\.\w+)*) - Group 1:
[A-Za-z] - an ASCII letter
\w* - 0+ letters, digits, _
(?:\.\w+)* - 0+ sequences of
\. - dot
\w+ - 1+ letters, digits, _
(?!\S) - whitespace or end of string are required immediately to the right of the current location.
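To see the combined pattern at work on running text, here is a small sketch using preg_match_all. The function name extractMentions and the sample sentence are assumptions.

```php
<?php
// Hypothetical helper: returns the usernames (group 1) of all mentions in $text.
function extractMentions(string $text): array
{
    preg_match_all('~(?<![\w.-])#([A-Za-z]\w*(?:\.\w+)*)(?!\S)~', $text, $m);
    return $m[1];
}

$text = 'hi #valid.user.name and #other but not mail#me or #user.name. or #user..name';
print_r(extractMentions($text));
```

Because of (?!\S), a mention directly followed by punctuation such as a comma is not matched; whether that is desirable depends on the input you expect.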
EDIT: Simpler version (same result)
^#[a-zA-Z](\.?[\w-]+)*$
Original
Another one:
^#[a-zA-Z][a-zA-Z_-]?(\.?[\w\d-]+){0,}$
^# starts with #
[a-zA-Z] first char
[a-zA-Z_-]? match a-zA-Z_- 0 or 1 time
( start group
\.? match . (optional)
[\w\d-]+ match a-zA-Z0-9-_ 1 or more times
) end group
{0,} repeat group 0 to infinite times
$ end
Tests
valid:
#validusername
#valid.user.name
#valid-user-name
#valid_user-name
#valid-user123_name
#a.valid-user123_name
not valid:
#-invalid.user
#_invalid.user
#1notvalid-user_123name33
#.user.name
#user.name.
#user..name
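The test lists above can be replayed against the simpler version of the pattern. The sketch assumes a hypothetical helper name, isMention.

```php
<?php
// Hypothetical helper around the "simpler version" of the pattern.
function isMention(string $candidate): bool
{
    return preg_match('/^#[a-zA-Z](\.?[\w-]+)*$/', $candidate) === 1;
}

$valid = ['#validusername', '#valid.user.name', '#valid-user-name',
          '#valid_user-name', '#valid-user123_name', '#a.valid-user123_name'];
$invalid = ['#-invalid.user', '#_invalid.user', '#1notvalid-user_123name33',
            '#.user.name', '#user.name.', '#user..name'];

foreach (array_merge($valid, $invalid) as $candidate) {
    echo $candidate, ' => ', isMention($candidate) ? 'valid' : 'not valid', PHP_EOL;
}
```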

Building a complex regex with "conditions"

I'm trying to build a complex regex with the following constraints:
1. My string can only be composed of:
"Regular" alphanumeric characters : a-zA-Z0-9
4 special characters : space . _ -
2. Length has to be between 3 and 25
So far it's quite easy but then it gets complicated :
3. There cannot be 2 consecutive special characters, unless the 1st one is a space and the 2nd one isn't a space. Logical consequence : there cannot be 3 consecutive special characters
4. The string cannot start or end with a space
I'm especially struggling with 3.
Any help/hint would be much appreciated.
Examples:
" lkjsdi1SD" => FALSE (starts with a space)
"-lkjsdi1SD" => TRUE
"lkjsd -i1SD " => FALSE (ends with a space)
".Dg5 -lkjsdi1SD" => TRUE
"jhv5675gjjvghHJHvg655775vfFVHFJFf445576JHFFfhd12" => FALSE (too long)
"jhv  12" => FALSE (two consecutive spaces)
"as" => FALSE (too short)
"a r" => TRUE
I suggest using:
^ # Start of string
(?=.{3,25}$) # The total string length is from 3 to 25
[._-]? # An optional . _ or - (? means "match 1 or 0 times")
[a-zA-Z0-9]+ # one or more alphanumeric symbols
(?: # Zero or more sequences of:
(?:[._-]|[ ][._-]?) # one . _ or - OR a space followed with an optional . _ or -
[a-zA-Z0-9]+ # one or more alphanumerics
)* # (here * defines zero or more times)
[._-]? # one optional . _ or -
$ # End of string
See the inline description for each part (I used the /x VERBOSE, or free-spacing, modifier to enable comments, which helps keep long patterns readable).
See the regex demo
More pattern details
^ - start of string anchor: the regex engine only looks for the whole pattern at the string start. Thus, if there is a space at the start, no match is returned, because [a-zA-Z0-9]+, the first obligatory subpattern, requires an alphanumeric, and [._-]? (a character class matching ., _ or -, with the ? quantifier matching one or zero occurrences) only allows 1 of these 3 characters before the first alphanumeric.
(?=.{3,25}$) is a positive lookahead anchored at the start, that requires at least 3 and at most 25 characters other than a newline (. matches any char other than a LF if /s modifier is not defined) from start till end ($ is the string end anchor that matches at the end of string or before the final char that is a newline character, replace with \z if you want to disallow matching a string with a newline symbol at the end).
The {3,25} is a limiting quantifier that allows matching min to max amount of characters conforming to the subpattern quantified. Note that a lookahead does not consume the text, i.e. the regex engine returns to the place where it starts matching the lookahead pattern with the true or false result, and if true, goes on matching the rest of the pattern.
[._-]? - an optional single char, one of the defined chars in the character class (see explanation above)
[a-zA-Z0-9]+ - one or more (I wrote "1+") characters (the + quantifier matches 1 or more occurrences) that are in the ranges defined in the character class.
(?:(?:[._-]|[ ][._-]?)[a-zA-Z0-9]+)* - a non-capturing group used only for grouping subpatterns (to match them consecutively); it matches zero or more (as the * stands after it) sequences of (?:[._-]|[ ][._-]?)[a-zA-Z0-9]+:
(?:[._-]|[ ][._-]?) - either a ., _, or -, OR (due to the | alternation operator) a space followed with an optional ., _, or -. I put the space into a character class [ ] because I used the /x VERBOSE modifier to introduce newline formatting and comments into the pattern; you may use a regular space if you do not use the /x modifier.
[a-zA-Z0-9]+ - 1 or more (due to +) alphanumerics.
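The examples from the question can be replayed against the verbose pattern to confirm it behaves as described. In the sketch below the function name matchesRules is hypothetical; the double-space example is written with two literal spaces.

```php
<?php
// Hypothetical helper using the free-spacing pattern from this answer.
function matchesRules(string $input): bool
{
    $pattern = '/
        ^
        (?=.{3,25}$)             # total length from 3 to 25
        [._-]?                   # optional leading . _ or -
        [a-zA-Z0-9]+             # one or more alphanumerics
        (?:
            (?:[._-]|[ ][._-]?)  # one special, or a space plus an optional special
            [a-zA-Z0-9]+         # one or more alphanumerics
        )*
        [._-]?                   # optional trailing . _ or -
        $
    /x';
    return preg_match($pattern, $input) === 1;
}

// Inputs and expected results taken from the question.
$examples = [
    ' lkjsdi1SD'      => false,
    '-lkjsdi1SD'      => true,
    'lkjsd -i1SD '    => false,
    '.Dg5 -lkjsdi1SD' => true,
    'jhv5675gjjvghHJHvg655775vfFVHFJFf445576JHFFfhd12' => false,
    'jhv  12'         => false,
    'as'              => false,
    'a r'             => true,
];

foreach ($examples as $input => $expected) {
    printf("'%s' => %s\n", $input, matchesRules($input) === $expected ? 'as expected' : 'MISMATCH');
}
```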
Try using this:
^(?:[a-zA-Z0-9]|[._-](?![ ._-]))(?:[a-zA-Z0-9 ]|[._-](?![ ._-])){1,23}[a-zA-Z0-9._-]$
The part [._-](?![ ._-]) means "match [._-] if it's not followed by [ ._-]".
In general, you can look into lookarounds.
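One caveat when trying this pattern: the plain space inside [a-zA-Z0-9 ] carries no lookahead, so two consecutive spaces appear to be accepted, which the question rules out. The sketch below compares the pattern as posted with a hypothetical tweak (not from the answer) that gives the space its own (?![ ]) check.

```php
<?php
// Pattern from the answer above, unchanged.
$original = '/^(?:[a-zA-Z0-9]|[._-](?![ ._-]))(?:[a-zA-Z0-9 ]|[._-](?![ ._-])){1,23}[a-zA-Z0-9._-]$/';
// Hypothetical tweak: a space may not be followed by another space.
$tweaked = '/^(?:[a-zA-Z0-9]|[._-](?![ ._-]))(?:[a-zA-Z0-9]|[ ](?![ ])|[._-](?![ ._-])){1,23}[a-zA-Z0-9._-]$/';

// One space, two spaces, and a space followed by a special character.
foreach (['jhv 12', 'jhv  12', '.Dg5 -lkjsdi1SD'] as $input) {
    printf(
        "'%s' original=%d tweaked=%d\n",
        $input,
        preg_match($original, $input),
        preg_match($tweaked, $input)
    );
}
```

The two patterns should only disagree on the input with the double space.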

Regex for password grading

I'm working on a regular expression that grades the quality of a password. The idea is that a password is considered mediocre if it contains EXACTLY 1 uppercase character OR at least 6 uppercase characters. The password itself should be at least 8 characters long.
Desired behavior:
Aaaaaaaa -> match
AAAAAAaa -> match
AAaaaaaa -> no match
I tried something like this:
(?=.*[A-Z]{1,1}|(?=.*[A-Z]{6,})).{8,}
This doesn't do the trick because it also matches AAaaaaaa. The problem is the first positive lookahead, which also allows 2-5 uppercase characters, but I couldn't figure out how to avoid that.
You should restrict the first lookahead to only require 1 uppercase letter in the whole string. Just define the full string pattern as any non-uppercase letter(s) followed with 1 uppercase one, and then any number of non-uppercase letter characters are allowed.
If you plan to require 6 uppercase letters in a row, use
/^(?=[^A-Z]*[A-Z][^A-Z]*$|.*[A-Z]{6,}).{8,}$/
     ^^^^^^^^^^^^^^^^^^^^
See this regex demo
If these 6 uppercase letters can be scattered around the string, use
/^(?=[^A-Z]*[A-Z][^A-Z]*$|(?:[^A-Z]*[A-Z]){6,}).{8,}$/
     ^^^^^^^^^^^^^^^^^^^^
Where (?:[^A-Z]*[A-Z]){6,} searches for at least 6 occurrences of 0+ non-uppercase letter characters followed with an uppercase letter. See this regex demo.
If you need to support Unicode, add /u modifier at the end of the regex, and replace [A-Z] with \p{Lu}, [^A-Z] with \P{Lu}.
Also, it is recommended to use \A instead of ^ and \z instead of $ since it is password validation.
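Putting these suggestions together, a minimal sketch of the "scattered" variant with \A and \z might look like this. The function name isMediocre is hypothetical; the first three inputs come from the question, the others are made up.

```php
<?php
// Hypothetical helper: true when the password is graded "mediocre".
function isMediocre(string $password): bool
{
    // \A and \z anchor at the very start and very end of the subject,
    // so a trailing newline is not silently tolerated the way it is with $.
    $pattern = '/\A(?=[^A-Z]*[A-Z][^A-Z]*\z|(?:[^A-Z]*[A-Z]){6,}).{8,}\z/';
    return preg_match($pattern, $password) === 1;
}

foreach (['Aaaaaaaa', 'AAAAAAaa', 'AAaaaaaa', 'aAaAaAaAaAaA', 'Aaaaaaa'] as $candidate) {
    echo $candidate, ' => ', isMediocre($candidate) ? 'match' : 'no match', PHP_EOL;
}
```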
Another regex that is based on the logic suggested by bobble bubble:
^ # String start
(?=.{8,}$) # 8 or more characters other than a newline
(?:
[^A-Z]*[A-Z][^A-Z]* # a string with 1 uppercase letter only
| # or...
(?:[^A-Z]*[A-Z]){6} # 6 occ. of 0+ chars other than uppercase letters followed with 1 uppercase letter
.* # 0+ chars other than a newline
)
$ # string end
See the regex demo and as 1 line:
/^(?=.{8,}$)(?:[^A-Z\n]*[A-Z][^A-Z\n]*|(?:[^A-Z\n]*[A-Z]){6}.*)$/
See this demo.
Your .*[A-Z] will also consume uppers. Use exclusion between upper letters.
^(?=.{8})(?:([^A-Z]*[A-Z]){6}.*|(?1)[^A-Z]*$)
It checks if there is at least 6 or exactly 1 upper surrounded by non-upper.
(?=.{8}) The lookahead at start checks for at least 8 characters
(?1) is a reference to ([^A-Z]*[A-Z])
More explanation and demo at regex101
Ok. I think this is not a suitable criterion for strong passwords, but let's concentrate on the question anyway: count the capital letters in a string.
I think this is really not easy to do using regexp. Does it need to be one? Then you should remove the php tag from the question...
PHP solution: I would just filter for the capital letters and count them afterwards.
$caps = strlen(preg_replace('/[^A-Z]+/', '', $pass)); // number of uppercase chars in `$pass`
$mediocre = strlen($pass) < 8 || $caps > 5 || $caps < 2;
// mediocre if $pass is shorter than 8 chars or does not contain between 2 and 5 uppercase characters
