Regular expression - avoid the repetition of the sequence of the same letters - php

I'm trying to validate the password entered by a user on a PHP website.
The password must:
have at least 8 characters
have at most 20 characters
accept letters, numbers, and common special characters like . (dot) # + $ - _ !
Up to this point I've been able to figure out the right expression, but now I want to add another rule: a user can't write the same sequence of characters more than once.
That is, leaving aside a single letter repeated twice, if the user writes the same string (3 or more characters) more than once, the password should not match.
For example:
abcde not valid - should be at least 8 characters
abcde1234 valid
abcd1abcd1 not valid due to repetition of the string abcd1
More examples (updated):
abababab not valid - the string "ab" is repeated 2 or more times
aaaaaaaa not valid - the string aaa is repeated more than once
helloworld valid - even if there is the letter "l" repeated two times
Any suggestions?
I don't know if it's possible to write a correct RegExp for this; maybe I'm trying to do something impossible.
Before giving up on the idea, I wanted the opinion of someone who knows more about RegExp than I do.
Thanks in advance

^(?!.*?(.+)\1)[\w#+$!.-]{8,20}$
seems to work well: http://regex101.com/r/cU9lD0/1
The tricky part is ^(?!.*?(.+)\1) which reads "start of input, when not followed by: something, then some substring, then that substring once again".
That being said, all "password validation" is quite a pointless enterprise, which actually stops people from using really good passwords.
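For reference, here is a minimal PHP sketch of the same negative-lookahead idea, not a definitive implementation. Note that (.+)\1 also rejects a single doubled letter such as the "ll" in helloworld, so this sketch assumes the 3-character threshold stated in the question and uses (.{3,})\1 instead. It only detects a repeat that immediately follows the first occurrence, and it applies the {8,20} quantifier directly to the character class so the 20-character maximum holds. The function name isAcceptablePassword is hypothetical.

```php
<?php
// Hypothetical helper; the 3-character threshold is an assumption taken
// from the question's wording, not from the answer's pattern.
function isAcceptablePassword(string $password): bool
{
    // (?!.*(.{3,})\1)  : fail if any run of 3+ characters is immediately repeated
    // [\w#+$!.-]{8,20} : allowed characters, 8 to 20 of them in total
    $pattern = '/^(?!.*(.{3,})\1)[\w#+$!.-]{8,20}$/';

    return preg_match($pattern, $password) === 1;
}

var_dump(isAcceptablePassword('abcde1234'));  // bool(true)
var_dump(isAcceptablePassword('abcd1abcd1')); // bool(false)
var_dump(isAcceptablePassword('helloworld')); // bool(true)
```

With this threshold, abababab is still rejected because "abab" is followed by "abab", and aaaaaaaa because "aaa" is followed by "aaa".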

Related

Regex match word with characters before and after the value PHP

I've been searching all day for a regular expression that finds a specific word even if there are other characters in front of or behind it.
It will be used for a bad-words filter. It must find exact matches, but also matches with other characters around the word.
It has to search through an array of bad words.
For example:
stupid - must match
123stupid - must match
stupid123 - must match
123stupid456 - must match
stupi - must not match (because the bad word is not fully inserted)
All I've found so far either matches only the exact word (stupid but not 123stupid) or also matches part of the word (stupi).
Can anyone help me?
You can simply use
preg_match('/stupid/', $string)
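Since the question mentions an array of bad words, the same unanchored match could be built from a list. This is a rough sketch under a few assumptions: the function name containsBadWord is hypothetical, the match is made case-insensitive (the question does not say either way), and each word is passed through preg_quote so that any regex metacharacters in it are treated literally.

```php
<?php
// Hypothetical helper: returns true if any word from $badWords occurs
// anywhere inside $text, with or without other characters around it.
function containsBadWord(string $text, array $badWords): bool
{
    if ($badWords === []) {
        return false; // an empty alternation would match every string
    }

    // Escape each word for use inside a /.../ delimited pattern.
    $escaped = array_map(
        static function (string $word): string {
            return preg_quote($word, '/');
        },
        $badWords
    );

    // Assumption: case-insensitive matching via the i modifier.
    $pattern = '/' . implode('|', $escaped) . '/i';

    return preg_match($pattern, $text) === 1;
}

var_dump(containsBadWord('123stupid456', ['stupid', 'dumb'])); // bool(true)
var_dump(containsBadWord('stupi', ['stupid', 'dumb']));        // bool(false)
```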

Regex for first name and last name in form

I needed a regex to validate whether the first and last name were provided correctly. This is what I came up with:
preg_match('/^[\p{L}]{4,25}[\s][\p{L}]{4,25}$/u', Form::post('name'))
This one works if string contains:
word (4-25 chars long and utf8 chars allowed)
space
word (4-25 chars long and utf8 chars allowed)
which is fine, but it seems too complex for my script.
Is there a way to convert that regex so it meets the same conditions but uses a kind of "global" character range instead, something like this:
(word space word){8,50}
Optionally, it could also allow a second space and a third word, in case some foreign person wants to use my site.
Any help will be appreciated :)
Aside from the fact that name validation is a bad idea in and of itself (see Falsehoods programmers believe about names), and that your regex can be simplified syntactically to
/^\pL{4,25}\s\pL{4,25}$/u
yes, it is possible, but ugly. You would need to use a positive lookahead assertion to make sure that there is only one space, and that it's neither at the end nor at the start of the string:
/^(?=\S+\s\S+$)[\pL\s]{8,50}$/u
If you want to allow more spaces/words, you can use
/^(?=\S+(?:\s\S+)+$)[\pL\s]{8,50}$/u
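To see how the two lookahead variants behave, here is a small sketch that runs both patterns from this answer against a few sample names. The constant names, the function name matchesName, and the sample names are made up for illustration.

```php
<?php
// Patterns copied from the answer; the names are hypothetical.
const TWO_WORD_NAME   = '/^(?=\S+\s\S+$)[\pL\s]{8,50}$/u';
const MULTI_WORD_NAME = '/^(?=\S+(?:\s\S+)+$)[\pL\s]{8,50}$/u';

function matchesName(string $pattern, string $name): bool
{
    return preg_match($pattern, $name) === 1;
}

var_dump(matchesName(TWO_WORD_NAME, 'John Smith'));          // bool(true)
var_dump(matchesName(TWO_WORD_NAME, 'Juan Carlos Pérez'));   // bool(false): two spaces
var_dump(matchesName(MULTI_WORD_NAME, 'Juan Carlos Pérez')); // bool(true)
var_dump(matchesName(MULTI_WORD_NAME, 'John  Smith'));       // bool(false): double space
```

The lookahead handles the placement of the spaces, while the character class with {8,50} handles the overall length and the allowed characters.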

PHP: Validate string containing numbers, separated by hyphen (possible by preg_match)?

I’m trying to validate a string of digits in which each group of four digits is separated by a hyphen, for example 1111-2222-3333-4444
I’m trying to do some kind of validating so I can guarantee that this format is being used (with 16 digits, three hyphens and nothing else). I’ve this preg_match where it checks for digits only but I need to accept hyphens and this format.
preg_match('/^[0-9]{1,}$/', $validatenumbers)
I’ve tried to do it with regex but unfortunately it isn’t my strongest side so I haven’t been able to correctly validate the numbers.
It is important that it is in PHP and not Javascript because of the ability to “turn off” javascript in a browser.
preg_match("/^([0-9]{4}-){3}[0-9]{4}$/", $input);
([0-9]{4}-){3} Matches exactly 3 groups of 4 digits followed by a hyphen. That is terminated by another group [0-9]{4} (4 digits without a hyphen).
preg_match('/^[0-9]{4}\-[0-9]{4}\-[0-9]{4}\-[0-9]{4}$/',$numbers);
I think that should work.
This looks like a credit card number. If that's the case, you should use a Luhn checksum instead of a simple regex.
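If it is a card number, a Luhn check could look roughly like the sketch below. It assumes hyphens are the only separators and strips them before checking; the function name passesLuhn is hypothetical. It complements the format regex and does not replace it.

```php
<?php
// Hypothetical helper: Luhn checksum over the digits of $number.
// Assumption: hyphens are the only separators and may be discarded.
function passesLuhn(string $number): bool
{
    $digits = str_replace('-', '', $number);
    if ($digits === '' || !ctype_digit($digits)) {
        return false;
    }

    $sum = 0;
    $double = false; // the rightmost digit is not doubled
    for ($i = strlen($digits) - 1; $i >= 0; $i--) {
        $d = (int) $digits[$i];
        if ($double) {
            $d *= 2;
            if ($d > 9) {
                $d -= 9; // same as adding the two digits of the product
            }
        }
        $sum += $d;
        $double = !$double;
    }

    return $sum % 10 === 0;
}

var_dump(passesLuhn('4111-1111-1111-1111')); // bool(true)
var_dump(passesLuhn('4111-1111-1111-1112')); // bool(false)
```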
Try:
if (preg_match('#^\d{4}-\d{4}-\d{4}-\d{4}$#', $string)) {}
If you need to match that exact format, the pattern would be '~^\d{4}-\d{4}-\d{4}-\d{4}$~', or you can write it more generally like this: '/^(\d+-)*\d+$/' (this would match 11, 11-11111, and so on).

Editing a regex that isn't mine, not sure how to adjust it for needs

I have a regex that was written for me for passwords:
~^[a-z0-9!##\$%\^&\*\(\)]{8,16}$~i
It's supposed to match strings of alphanumerics and symbols of 8-16 characters. Now I need to remove the min and max length requirement as I need to split the error messages for user friendliness - I tried to just take out the {8,16} portion but then it breaks it. How would I do this? Thanks ahead of time.
I take it you're doing separate checks for too-long or too-short strings, and this regex is only making sure there are no invalid characters. This should do it:
~^[a-z0-9!##$%^&*()]+$~i
+ means one or more, * means zero or more; it probably doesn't matter which one you use.
I got rid of some unnecessary backslashes, too; none of those characters has any special meaning in a character class (inside the square brackets, that is).
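As a sketch of how the split checks might fit together (assuming the 8-16 limits from the original pattern are kept as separate length tests), something like the following would give one message per problem. The function name passwordErrors and the message strings are hypothetical; the character pattern is the one from this answer.

```php
<?php
// Hypothetical helper: returns a list of human-readable problems,
// or an empty array if the password passes every check.
function passwordErrors(string $password): array
{
    $errors = [];

    // Length is checked separately so each case gets its own message.
    $length = strlen($password);
    if ($length < 8) {
        $errors[] = 'Password is too short';
    } elseif ($length > 16) {
        $errors[] = 'Password is too long';
    }

    // Character check only; skipped for an empty string, which has
    // no characters to complain about.
    if ($password !== '' && preg_match('~^[a-z0-9!##$%^&*()]+$~i', $password) !== 1) {
        $errors[] = 'Password contains invalid characters';
    }

    return $errors;
}

print_r(passwordErrors('abc'));        // one entry: too short
print_r(passwordErrors('abc def gh')); // one entry: invalid characters
```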

Help with password complexity regex

I'm using the following regex to validate password complexity:
/^.*(?=.{6,12})(?=.*[0-9]{2})(?=.*[A-Z]{2})(?=.*[a-z]{2}).*$/
In a nutshell: 2 lowercase, 2 uppercase, 2 numbers, min length is 6 and max length is 12.
It works perfectly, except for the maximum length, when I'm using a minimum length as well.
For example:
/^.*(?=.{6,})(?=.*[0-9]{2})(?=.*[A-Z]{2})(?=.*[a-z]{2}).*$/
This correctly requires a minimum length of 6!
And this:
/^.*(?=.{,12})(?=.*[0-9]{2})(?=.*[A-Z]{2})(?=.*[a-z]{2}).*$/
Correctly requires a maximum length of 12.
However, when I pair them together as in the first example, it just doesn't work!!
What gives? Thanks!
You want:
/^(?=.{6,12}$)...
What you're doing is saying: find me any sequence of characters that is followed by:
6-12 characters
another sequence of characters that is followed by 2 digits
another sequence of characters that is followed by 2 uppercase letters
another sequence of characters that is followed by 2 lowercase letters
And all that is followed by yet another sequence of characters. That's why the maximum length isn't working: 30 characters followed by 00AAaa and another 30 characters will pass.
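The 30 + 00AAaa + 30 case can be checked directly. In this sketch the first pattern is the one from the question; the second moves the length lookahead to the start and anchors it with $, keeping the remaining lookaheads from the question unchanged (that completion of the elided part is an assumption made here for the demo).

```php
<?php
// Pattern from the question: the length lookahead is not anchored.
$unanchored = '/^.*(?=.{6,12})(?=.*[0-9]{2})(?=.*[A-Z]{2})(?=.*[a-z]{2}).*$/';

// Assumed completion: length lookahead anchored at both ends.
$anchored = '/^(?=.{6,12}$)(?=.*[0-9]{2})(?=.*[A-Z]{2})(?=.*[a-z]{2}).*$/';

// 66 characters: far over the intended maximum of 12.
$tooLong = str_repeat('x', 30) . '00AAaa' . str_repeat('x', 30);

$unanchoredResult = preg_match($unanchored, $tooLong);  // 1: wrongly accepted
$anchoredResult   = preg_match($anchored, $tooLong);    // 0: rejected
$shortResult      = preg_match($anchored, 'ab12CDef');  // 1: 8 chars, all rules met

var_dump($unanchoredResult, $anchoredResult, $shortResult);
```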
Also, your pattern forces the two digits to be adjacent. To be less strict and require only that at least two digits appear anywhere in the string:
/^(?=.{6,12}$)(?=(.*?\d){2})(?=(.*?[A-Z]){2})(?=(.*?[a-z]){2})/
Lastly you'll note that I'm using non-greedy expressions (.*?). That will avoid a lot of backtracking and for this kind of validation is what you should generally use. The difference between:
(.*\d){2}
and
(.*?\d){2}
is that the first will grab all the characters with .* and then look for a digit. It won't find one because it will be at the end of the string, so it will backtrack one character and then look for a digit. If it's not a digit, it will keep backtracking until it finds one. After it does, it will match that whole expression a second time, which will trigger even more backtracking.
That's what greedy wildcards mean.
The second version will pass zero characters to .*? and look for a digit. If it's not a digit, .*? will grab another character and then look for a digit, and so on. Particularly on long search strings this can be orders of magnitude faster. On a short password it almost certainly won't make a difference, but it's a good habit to know how the regex matcher works and to write the best regex you can.
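A quick way to see the difference is to capture what each variant consumes before its first successful digit. This sketch only shows which digit each one reaches; it does not measure speed, and the sample string is made up.

```php
<?php
$subject = 'ab1cd2ef';

// Greedy: .* runs to the end, then backtracks to the LAST digit.
preg_match('/(.*\d)/', $subject, $greedy);

// Lazy: .*? grows one character at a time and stops at the FIRST digit.
preg_match('/(.*?\d)/', $subject, $lazy);

var_dump($greedy[1]); // string(6) "ab1cd2"
var_dump($lazy[1]);   // string(3) "ab1"
```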
That being said, this is probably an example of being too clever for your own good. If a password is rejected as not satisfying those conditions, how do you determine which one failed in order to give feedback to the user about what to fix? A programmatic solution is, in practice, probably preferable.
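One possible shape for such a programmatic check is sketched below: each rule is tested on its own, so the caller knows exactly which ones failed. The function name complexityProblems and the message strings are hypothetical; the rules (length 6-12, two digits, two uppercase, two lowercase, anywhere in the string) are the ones from the question.

```php
<?php
// Hypothetical helper: returns the list of rules the password breaks,
// or an empty array if it satisfies all of them.
function complexityProblems(string $password): array
{
    $problems = [];

    $length = strlen($password);
    if ($length < 6 || $length > 12) {
        $problems[] = 'length must be 6-12 characters';
    }

    // Each pattern matches one character of a class; preg_match_all
    // returns how many times it matched.
    $rules = [
        '/\d/'    => 'needs at least 2 digits',
        '/[A-Z]/' => 'needs at least 2 uppercase letters',
        '/[a-z]/' => 'needs at least 2 lowercase letters',
    ];
    foreach ($rules as $pattern => $message) {
        if (preg_match_all($pattern, $password) < 2) {
            $problems[] = $message;
        }
    }

    return $problems;
}

print_r(complexityProblems('abcdef')); // digits and uppercase messages
```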
