Check if a string contains only English letters, digits and symbols - PHP

I want to check whether a string contains only English letters, digits and symbols. I tried the code below, but it only works when all the characters are in a different language.
if(strlen($string) != mb_strlen($string, 'utf-8'))
{
echo "No English words ";
}
else {
echo "only english words";
}
For example
1. hellow hi 123#!##!##()### -- true
2. ព្រាប សុវ ok yes ### - false
3. this is good 123 - true
4. ព្រាប -- false
P.S.: my question is not a duplicate, because the other questions only cover letters and digits; mine covers symbols too.

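Would checking that the string is just printable ASCII work? If so, you can use this regex (the class [ -~] matches one printable ASCII character, so it is anchored and repeated to test the whole string):
^[ -~]+$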
http://www.catonmat.net/blog/my-favorite-regex/
If you need non-ASCII characters as well, then you can use the Wikipedia page to find the specific Unicode ranges that you need:
https://en.wikipedia.org/wiki/List_of_Unicode_characters#Control_codes
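Note that [ -~] on its own matches a single printable ASCII character, so a whole-string test needs the class to be repeated and anchored. A minimal sketch in Python, for illustration only (the helper name is hypothetical; the same character class should carry over to PHP's preg_match):

```python
import re

# Space (0x20) through tilde (0x7E) is the printable ASCII range.
PRINTABLE_ASCII = re.compile(r"[ -~]+")

def is_printable_ascii(text: str) -> bool:
    """Hypothetical helper: True when every character is printable ASCII."""
    # fullmatch only succeeds if the pattern covers the entire string.
    return PRINTABLE_ASCII.fullmatch(text) is not None

print(is_printable_ascii("this is good 123"))  # True
print(is_printable_ascii("ព្រាប ok yes"))        # False
```

An empty string is rejected here because of the + quantifier; use * instead if empty input should pass.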

Related

How can I validate a password based on my rules? [duplicate]

My password strength criteria are as below:
8 characters length
2 letters in Upper Case
1 Special Character (!@#$&*)
2 numerals (0-9)
3 letters in Lower Case
Can somebody please give me a regex for this? All conditions must be met by the password.
You can do these checks using positive lookahead assertions:
^(?=.*[A-Z].*[A-Z])(?=.*[!@#$&*])(?=.*[0-9].*[0-9])(?=.*[a-z].*[a-z].*[a-z]).{8}$
Rubular link
Explanation:
^ Start anchor
(?=.*[A-Z].*[A-Z]) Ensure string has two uppercase letters.
(?=.*[!@#$&*]) Ensure string has one special character.
(?=.*[0-9].*[0-9]) Ensure string has two digits.
(?=.*[a-z].*[a-z].*[a-z]) Ensure string has three lowercase letters.
.{8} Ensure string is of length 8.
$ End anchor.
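The lookahead pattern above can be tried out as follows. This is only a sketch in Python: the function name is made up, and the special-character set ! @ # $ & * is assumed from the question.

```python
import re

# Each lookahead is checked from the start of the string without consuming input.
STRICT = re.compile(
    r"^(?=.*[A-Z].*[A-Z])"        # two uppercase letters
    r"(?=.*[!@#$&*])"             # one special character (assumed set)
    r"(?=.*[0-9].*[0-9])"         # two digits
    r"(?=.*[a-z].*[a-z].*[a-z])"  # three lowercase letters
    r".{8}$"                      # exactly eight characters
)

def is_valid(password: str) -> bool:
    """Hypothetical helper wrapping the pattern."""
    return STRICT.match(password) is not None

print(is_valid("ABc1d2e!"))   # True
print(is_valid("ABc1d2e!x"))  # False: nine characters
```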
You should also consider changing some of your rules:
Add more special characters, e.g. %, ^, (, ), -, _, +, and period. I'm adding all the special characters above the number keys on a US keyboard that you missed. Escape the ones regex uses.
Make the password 8 or more characters, not just a fixed length of 8.
With the above improvements, and for more flexibility and readability, I would modify the regex to:
^(?=(.*[a-z]){3,})(?=(.*[A-Z]){2,})(?=(.*[0-9]){2,})(?=(.*[!@#$%^&*()\-__+.]){1,}).{8,}$
Basic Explanation
(?=(.*RULE){MIN_OCCURRENCES,})
Each rule block has the form (?=(){}). The rule and the number of occurrences can then be easily specified and tested separately, before being combined
Detailed Explanation
^ start anchor
(?=(.*[a-z]){3,}) lowercase letters. {3,} indicates that you want at least 3 of this group
(?=(.*[A-Z]){2,}) uppercase letters. {2,} indicates that you want at least 2 of this group
(?=(.*[0-9]){2,}) numbers. {2,} indicates that you want at least 2 of this group
(?=(.*[!@#$%^&*()\-__+.]){1,}) all the special characters in the [] fields. The ones used by regex are escaped by using the \ or the character itself. {1,} is redundant, but good practice, in case you change that to more than 1 in the future. It also keeps all the groups consistent
.{8,} indicates that you want 8 or more characters
$ end anchor
And lastly, for testing purposes, here is a Rubular link with the above regex
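Because every rule block has the same shape, the pattern can also be assembled from (character class, minimum) pairs. A rough sketch in Python; the helper names rule and build_password_regex are invented for this illustration, and the special-character set is an assumption based on the regex above:

```python
import re

def rule(char_class: str, minimum: int) -> str:
    """Hypothetical helper: build one (?=(.*RULE){MIN,}) block."""
    return rf"(?=(?:.*{char_class}){{{minimum},}})"

def build_password_regex(min_length: int = 8):
    """Combine the rule blocks and the overall length check."""
    blocks = [
        rule("[a-z]", 3),                   # lowercase letters
        rule("[A-Z]", 2),                   # uppercase letters
        rule("[0-9]", 2),                   # digits
        rule(r"[!@#$%^&*()\-_+.]", 1),      # special characters (assumed set)
    ]
    return re.compile("^" + "".join(blocks) + rf".{{{min_length},}}$")

pattern = build_password_regex()
print(rule("[a-z]", 3))                   # (?=(?:.*[a-z]){3,})
print(bool(pattern.match("abcDE12!")))    # True
print(bool(pattern.match("abcDE12x")))    # False: no special character
```

Each block can be tested on its own before it is added to the list, which is the main point of the rule-block layout.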
The answers given above are perfect, but I suggest using multiple smaller regexes rather than one big one.
Splitting the long regex has some advantages:
easier to write and read
easier to debug
easier to add or remove part of the regex
Generally, this approach keeps the code easily maintainable.
Having said that, here is a piece of code I wrote in Swift as an example:
import Foundation

struct RegExp {
    /**
     Check password complexity.
     - parameter password: password to test
     - parameter length: minimum password length
     - parameter patternsToEscape: patterns that the password must not contain
     - parameter caseSensitivity: whether the password must contain both uppercase and lowercase letters
     - parameter numericDigits: whether the password must contain numeric digits
     - returns: true if the password is valid, false otherwise
     */
    static func checkPasswordComplexity(password: String, length: Int, patternsToEscape: [String], caseSensitivity: Bool, numericDigits: Bool) -> Bool {
        if password.count < length {
            return false
        }
        if caseSensitivity {
            let hasUpperCase = RegExp.matchesForRegexInText("[A-Z]", text: password).count > 0
            if !hasUpperCase {
                return false
            }
            let hasLowerCase = RegExp.matchesForRegexInText("[a-z]", text: password).count > 0
            if !hasLowerCase {
                return false
            }
        }
        if numericDigits {
            let hasNumbers = RegExp.matchesForRegexInText("\\d", text: password).count > 0
            if !hasNumbers {
                return false
            }
        }
        if patternsToEscape.count > 0 {
            // Forbidden patterns are compared against the lowercased password.
            let passwordLowerCase = password.lowercased()
            for pattern in patternsToEscape {
                let hasMatchesWithPattern = RegExp.matchesForRegexInText(pattern, text: passwordLowerCase).count > 0
                if hasMatchesWithPattern {
                    return false
                }
            }
        }
        return true
    }

    /// Returns every substring of `text` matched by `pattern` (empty if the pattern is invalid).
    static func matchesForRegexInText(_ pattern: String, text: String) -> [String] {
        do {
            let regex = try NSRegularExpression(pattern: pattern, options: [])
            let nsString = text as NSString
            let results = regex.matches(in: text,
                                        options: [], range: NSRange(location: 0, length: nsString.length))
            return results.map { nsString.substring(with: $0.range) }
        } catch {
            print("invalid regex: \(error.localizedDescription)")
            return []
        }
    }
}
You can use zero-length positive look-aheads to specify each of your constraints separately:
(?=.{8,})(?=.*\p{Lu}.*\p{Lu})(?=.*[!@#$&*])(?=.*[0-9])(?=.*\p{Ll}.*\p{Ll})
If your regex engine doesn't support the \p notation and pure ASCII is enough, then you can replace \p{Lu} with [A-Z] and \p{Ll} with [a-z].
Unfortunately, none of the above regexes worked for me.
The basic rules for a strong password are:
Should contain at least one capital letter
Should contain at least one lowercase letter
Should contain at least one number
Should contain at least one special character
And a minimum length
So, the best regex would be
^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[!@#\$%\^&\*]).{8,}$
The above regex has a minimum length of 8. You can change it from {8,} to {any_number,}
Modification in rules?
Let's say you want a minimum of x lowercase letters, y capital letters and z digits, with a total minimum length of w. Then try the regex below
^(?=(?:.*[a-z]){x,})(?=(?:.*[A-Z]){y,})(?=(?:.*[0-9]){z,})(?=.*[!@#\$%\^&\*]).{w,}$
Note: replace x, y, z and w in the regex with numbers. Each quantified group counts its characters anywhere in the string; writing [a-z]{x,} instead would require x lowercase letters in a row.
Edit: Updated regex answer
Edit2: Added modification
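When a minimum count is added to a rule, it matters where the quantifier is placed: directly on a character class it asks for consecutive characters, while on a group containing .* it counts characters anywhere in the string. A small Python sketch of the difference (the pattern names and the sample string are only for illustration):

```python
import re

# Quantifier on the class: needs two lowercase letters next to each other.
consecutive = re.compile(r"^(?=.*[a-z]{2,}).{4,}$")
# Quantifier on the group: two lowercase letters anywhere in the string.
anywhere = re.compile(r"^(?=(?:.*[a-z]){2,}).{4,}$")

sample = "a1b2"  # two lowercase letters, but not adjacent
print(bool(consecutive.match(sample)))  # False
print(bool(anywhere.match(sample)))     # True
```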
I would suggest adding a negative lookahead to exclude common passwords:
(?!.*pass|.*word|.*1234|.*qwer|.*asdf)
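A sketch of how this exclusion could be combined with a length check in Python. The name BLOCKLIST, the 8-character minimum and the re.IGNORECASE flag are assumptions added for the example, not part of the suggestion itself:

```python
import re

# The negative lookahead fails as soon as any listed fragment occurs anywhere.
BLOCKLIST = re.compile(
    r"^(?!.*pass|.*word|.*1234|.*qwer|.*asdf).{8,}$",
    re.IGNORECASE,  # assumption: also reject "Pass", "WORD", ...
)

print(bool(BLOCKLIST.match("MyPassword1")))  # False: contains "pass"
print(bool(BLOCKLIST.match("Tr0ub4dor&3")))  # True
```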
import re
RegexLength = re.compile(r'^\S{8,}$')
RegexDigit = re.compile(r'\d')
RegexLower = re.compile(r'[a-z]')
RegexUpper = re.compile(r'[A-Z]')
def IsStrongPW(password):
    # Every individual check has to find a match.
    if RegexLength.search(password) is None or RegexDigit.search(password) is None or RegexUpper.search(password) is None or RegexLower.search(password) is None:
        return False
    else:
        return True
while True:
    userpw = input("please input your password to check: \n")
    if userpw == "exit":
        break
    else:
        print(IsStrongPW(userpw))
codaddict's solution works fine, but this one is a bit more efficient (Python syntax):
import re
password = re.compile(r"""(?#!py password Rev:20160831_2100)
    # Validate password: 2 upper, 1 special, 2 digits, 8+ chars.
    ^                        # Anchor to start of string.
    (?=(?:[^A-Z]*[A-Z]){2})  # At least two uppercase.
    (?=[^!@#$&*]*[!@#$&*])   # At least one "special".
    (?=(?:[^0-9]*[0-9]){2})  # At least two digits.
    .{8,}                    # Password length is 8 or more.
    $                        # Anchor to end of string.
    """, re.VERBOSE)
The negated character classes consume everything up to the desired character in a single step, requiring zero backtracking. (The dot star solution works just fine, but does require some backtracking.) Of course with short target strings such as passwords, this efficiency improvement will be negligible.
For PHP, this works fine!
if (preg_match("/^(?=(?:[^A-Z]*[A-Z]){2})(?=(?:[^0-9]*[0-9]){2}).{8,}$/", 'CaSu4Li8')) {
    return true;
} else {
    return false;
}
In this case the result is true.
Thanks to @ridgerunner.
Another solution:
import re
passwordRegex = re.compile(r'''(
    ^(?=.*[A-Z].*[A-Z])        # at least two capital letters
    (?=.*[!@#$&*])             # at least one of these special characters
    (?=.*[0-9].*[0-9])         # at least two numeric digits
    (?=.*[a-z].*[a-z].*[a-z])  # at least three lower case letters
    .{8,}                      # at least 8 characters in total
    $
    )''', re.VERBOSE)
def userInputPasswordCheck():
    print('Enter a potential password:')
    while True:
        m = input()
        mo = passwordRegex.search(m)
        if not mo:
            print('''
Your password should have at least one special character,
two digits, two uppercase and three lowercase characters. Length: 8+ characters.
Enter another password:''')
        else:
            print('Password is strong')
            return
userInputPasswordCheck()
Password must meet at least 3 out of the following 4 complexity rules:
at least 1 uppercase character (A-Z)
at least 1 lowercase character (a-z)
at least 1 digit (0-9)
at least 1 special character (do not forget to treat space as a special character too)
In addition, it must have:
at least 10 characters
at most 128 characters
not more than 2 identical characters in a row (e.g., 111 not allowed)
^(?!.*(.)\1{2})((?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])|(?=.*[a-z])(?=.*[A-Z])(?=.*[^a-zA-Z0-9])|(?=.*[A-Z])(?=.*[0-9])(?=.*[^a-zA-Z0-9])|(?=.*[a-z])(?=.*[0-9])(?=.*[^a-zA-Z0-9])).{10,128}$
Broken down:
(?!.*(.)\1{2}) no character repeated three times in a row
(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9]) lowercase, uppercase and digit
(?=.*[a-z])(?=.*[A-Z])(?=.*[^a-zA-Z0-9]) lowercase, uppercase and special
(?=.*[A-Z])(?=.*[0-9])(?=.*[^a-zA-Z0-9]) uppercase, digit and special
(?=.*[a-z])(?=.*[0-9])(?=.*[^a-zA-Z0-9]) lowercase, digit and special
.{10,128} between 10 and 128 characters
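A "3 out of 4" policy like this is often easier to read when the character classes are counted separately instead of being enumerated as alternatives. A minimal sketch in Python, assuming the limits quoted above (10 to 128 characters); the function name is hypothetical:

```python
import re

def meets_policy(password: str) -> bool:
    """Hypothetical helper: length, no triple repeats, 3 of 4 classes."""
    if not 10 <= len(password) <= 128:
        return False
    # Reject any character that appears three or more times in a row.
    if re.search(r"(.)\1{2}", password):
        return False
    classes = [
        re.search(r"[A-Z]", password),
        re.search(r"[a-z]", password),
        re.search(r"[0-9]", password),
        re.search(r"[^A-Za-z0-9]", password),  # anything else, space included
    ]
    # At least three of the four classes must be present.
    return sum(m is not None for m in classes) >= 3

print(meets_policy("Abcdefgh12"))  # True
print(meets_policy("Abc111defg"))  # False: "111"
```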

Allow only English letters and numbers in php

I'm trying to create a filter that allows users to use only English letters (lowercase and uppercase) and numbers. How can I do that? (ANSI)
(I'm not trying to sanitize, only to tell whether a string contains non-English letters.)
That filter should give me a clean database with only English usernames, without multibyte and UTF-8 characters.
Also, can anyone explain why echo strlen(À) outputs '2'? That means two bytes, right? Weren't UTF-8 characters supposed to take a single byte?
Thanks
You should use regular expressions to see if a string matches a pattern. This one is pretty simple:
if (preg_match('/^[a-zA-Z0-9]+$/', $username)) {
echo 'Username is valid';
} else {
echo 'Username is NOT valid';
}
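And the reason why strlen('À') equals 2 is that strlen counts bytes, not characters, and À takes two bytes in UTF-8. To count characters, use mb_strlen('À', 'UTF-8'), or (note that utf8_decode is deprecated as of PHP 8.2):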
echo strlen(utf8_decode('À'));
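The allow-list check from this answer can be sketched in Python as follows, for illustration only (the function name is hypothetical). One detail worth knowing: in both PCRE and Python, $ by default also matches just before a trailing newline, so a whole-string match such as fullmatch is slightly stricter.

```python
import re

def is_valid_username(username: str) -> bool:
    """Hypothetical helper: only ASCII letters and digits, at least one."""
    # fullmatch requires the whole string to match, so no ^ or $ is needed.
    return re.fullmatch(r"[A-Za-z0-9]+", username) is not None

print(is_valid_username("User123"))   # True
print(is_valid_username("\u00c0lice"))  # False: starts with À
```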
This is how you check whether a string contains only letters of the English alphabet and digits:
if (!preg_match('/[^A-Za-z0-9]/', $string)) {
    // string contains only English letters and digits (an empty string also passes)
}
The other question:
strlen(À)
will not return 2. Maybe you meant
strlen('À')
According to the PHP manual, strlen returns
The length of the string on success, and 0 if the string is empty.
That length is counted in bytes, so the character is counted as two, because your encoding (UTF-8) stores À in two bytes.
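The byte versus character distinction is easy to see in any language that exposes both. A short Python sketch (the helper name is made up; À is written as an escape so that it is a single code point, U+00C0):

```python
def lengths(text: str):
    """Hypothetical helper: (number of characters, number of UTF-8 bytes)."""
    return len(text), len(text.encode("utf-8"))

print(lengths("A"))       # (1, 1): ASCII takes one byte in UTF-8
print(lengths("\u00c0"))  # (1, 2): À takes two bytes, which is what strlen counts
print(lengths("\u1796"))  # (1, 3): a Khmer letter takes three bytes
```

UTF-8 is a variable-width encoding: only the ASCII range is stored in a single byte.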

What percentage of the characters in a string are non-english?

Is there a simple way in PHP to tell what percentage of the characters in a string are non-English?
What I'm trying to achieve is detecting non-English items in a list based on a description. The percentage is used to account for the special characters that might be present in an English text too. E.g. having less than 5% non-English characters would not necessarily mean that the text is not in English, but having 95% non-English characters would.
Well, there is no direct way of doing it, but mb_strlen might help.
Here is an example
$string="string with utf-8 chars åèä - doo-bee doo-bee dooh";
$utf = mb_strlen($string, 'utf-8') ;
echo $utf ;
echo "<br />";
$all = strlen($string);
echo $all ;
echo "<br />";
$non_eng = $all - $utf ;
echo $non_eng ;
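This gives 3 for the example string, and with the total length you can calculate the percentage. Note that the difference is really the number of extra bytes: it equals the number of non-English characters only when each of them takes exactly two bytes in UTF-8, as å, è and ä do.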
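The byte difference over-counts characters that need three or four bytes in UTF-8, so counting per character is more robust. A minimal sketch in Python, treating every code point above 127 as "non-English" (that threshold and the function name are assumptions for this example):

```python
def non_ascii_percentage(text: str) -> float:
    """Hypothetical helper: share of characters outside ASCII, in percent."""
    if not text:
        return 0.0
    non_ascii = sum(1 for ch in text if ord(ch) > 127)
    return 100.0 * non_ascii / len(text)

khmer = "\u1796"  # one character, three bytes in UTF-8
print(len(khmer.encode("utf-8")) - len(khmer))  # 2: the byte difference counts it twice
print(non_ascii_percentage("abc\u00e5"))        # 25.0
```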
As you know, the English alphabet has 26 letters, without any diacritical marks (i.e. accents).
You can either:
1) store a list of letters in both upper and lower case, digits and any other characters you would like to accept as 'English' in an array,
2) or take a shortcut like this: $az = range('a', 'z'); which returns all 26 letters. Do the same for capital letters and digits, and merge everything into one big array.
Then iterate through each character in your text, comparing it against your array of English characters and tallying hits and misses as you go.
Then you can work out the percentage of English characters in your document as follows:
100 / total number of characters in the document * hits (the total number of English characters found)
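The tally described above could look roughly like this in Python. Which characters count as "English" is a choice; here ASCII letters, digits, punctuation and the space are assumed, and the names ENGLISH and english_percentage are invented for the sketch:

```python
import string

# Allow-list of characters accepted as "English" (an assumption for this example).
ENGLISH = set(string.ascii_letters + string.digits + string.punctuation + " ")

def english_percentage(text: str) -> float:
    """100 / total number of characters * hits, as in the formula above."""
    if not text:
        return 0.0
    hits = sum(1 for ch in text if ch in ENGLISH)
    return 100 / len(text) * hits

print(english_percentage("ab\u00e5\u00e8"))  # 50.0
```

Using a set keeps each membership test cheap, so the text is scanned only once.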

Difference between PHP regex and JavaScript regex

Hi, I want to use the PHP regexes below in the Spry JavaScript framework, but they don't work there and Spry doesn't let the user type anything.
1)"/^[\d]+$/"
2)"/^([\x{600}-\x{6FF}]+\s)*[\x{600}-\x{6FF}]+$/u"
3)"/^([\x{600}-\x{6FF}]+\d*\s)*[\x{600}-\x{6FF}]+\d*$/u"
Please help me convert them for use in the Spry framework.
1) /^[\d]+$/
2) /^([\u0600-\u06FF]+\s)*[\u0600-\u06FF]+$/
3) /^([\u0600-\u06FF]+\d*\s)*[\u0600-\u06FF]+\d*$/
The /u modifier is not supported, since JavaScript regexes only support Unicode in terms of code points. \x{???} (a Unicode code point) should be written \u???? in a JavaScript regex (always 4 digits, zero-padded). (Engines that implement ES2015 also accept a u flag and \u{...} escapes; the patterns here do not need them.)
In these cases, the following applies to the rest of the regex:
\s in JavaScript is Unicode-aware
\d isn't, which means only ASCII digits (0-9) are accepted.
This means we specifically have to allow "foreign" numerals, e.g. Persian (code points 06F0-06F9):
1) /^[\d\u06F0-\u06F9]+$/
2) /^([\u0600-\u06FF]+\s)*[\u0600-\u06FF]+$/
3) /^([\u0600-\u06FF]+[\d\u06F0-\u06F9]*\s)*[\u0600-\u06FF]+[\d\u06F0-\u06F9]*$/
(Remove \d if ASCII digits shouldn't be accepted)
I'm not sure what the brackets are supposed to be doing in example 1; originally it could have been written:
1) /^\d+$/
But to add the Persian numerals we do need them, see above.
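How \d and \s treat non-ASCII input differs between regex engines, so spelling out the accepted ranges keeps a pattern portable. As an illustration only, here are patterns 1 and 2 with explicit ranges in Python (the constant names are made up; the ranges are the ones used above):

```python
import re

PERSIAN = "\u0600-\u06FF"    # Arabic block, as in the patterns above
DIGITS = "0-9\u06F0-\u06F9"  # ASCII digits plus Persian digits, spelled out

NUMBER = re.compile(f"[{DIGITS}]+")
FULL_NAME = re.compile(rf"([{PERSIAN}]+\s)*[{PERSIAN}]+")

# fullmatch plays the role of the ^...$ anchors.
print(bool(NUMBER.fullmatch("12\u06f3")))  # True: mixed ASCII and Persian digits
print(bool(FULL_NAME.fullmatch("\u0633\u0644\u0627\u0645 \u062f\u0646\u06cc\u0627")))  # True
print(bool(FULL_NAME.fullmatch("hello")))  # False
```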
Update
Spry character masking, however, only wants a regex to be applied on each entered character - i.e., we can't actually do pattern matching, it's just a "list" of accepted characters in all places, in which case:
1 ) /[\u06F0-\u06F9\d]/ // match 0-9 and Persian numerals
2 and 3) /[\u0600-\u06FF\d\s]/ // match all Persian characters (includes numerals), 0-9 and space
Once again, remove \d if you don't want to accept 0-9.
Update 2
Now... using regex for validation with Spry:
var checkFullName = function(value, options)
{
    // Match with the by now well-known regex:
    if (value.match(/^([\u0600-\u06FF]+\s)*[\u0600-\u06FF]+$/))
    {
        return true;
    }
    return false;
}
var sprytextfield =
    new Spry.Widget.ValidationTextField(
        "sprytextfield",
        "custom",
        { validation: checkFullName, validateOn: ["blur", "change"] }
    );
A similar custom function can be made for case 3.
See examples from Adobe labs
Are you passing them in as strings or as regex objects? Try removing the " characters from around the regex.
The 'u' flag is a little more tricky. You may need to explain what the 2nd and 3rd regexes are trying to do.

PHP / regular expression to check if a string contains a word of a certain length

I need to check whether a received string contains any words that are more than 20 characters in length. For example, the input string:
hi there asssssssssssssssssssskkkkkkkk how are you doing ?
would return true.
Could somebody please help me out with a regexp to check for this? I'm using PHP.
Thanks in advance.
/\w{20}/
You can test if the string contains a match of the following pattern:
[A-Za-z]{20}
The construct [A-Za-z] creates a character class that matches ASCII uppercase and lowercase letters. The {20} is a finite repetition syntax. It's enough to check if there's a match that contains 20 letters, because if there's a word that contains more, it contains at least 20.
References
regular-expressions.info/Character Classes and Finite Repetition
PHP snippet
Here's an example usage:
$strings = array(
"hey what the (##$&*!#^#*&^#!#*^##*##*&^#!*#!",
"now this one is just waaaaaaaaaaaaaaaaaaay too long",
"12345678901234567890123 that's not a word, is it???",
"LOLOLOLOLOLOLOLOLOLOLOL that's just unacceptable!",
"one-two-three-four-five-six-seven-eight-nine-ten",
"goaaaa...............aaaaaaaaaalll!!!!!!!!!!!!!!",
"there is absolutely nothing here"
);
foreach ($strings as $str) {
echo $str."\n".preg_match('/[a-zA-Z]{20}/', $str)."\n";
}
This prints (as seen on ideone.com):
hey what the (##$&*!#^#*&^#!#*^##*##*&^#!*#!
0
now this one is just waaaaaaaaaaaaaaaaaaay too long
1
12345678901234567890123 that's not a word, is it???
0
LOLOLOLOLOLOLOLOLOLOLOL that's just unacceptable!
1
one-two-three-four-five-six-seven-eight-nine-ten
0
goaaaa...............aaaaaaaaaalll!!!!!!!!!!!!!!
0
there is absolutely nothing here
0
As specified in the pattern, preg_match is true when there's a "word" (as defined by a sequence of letters) that is at least 20 characters long.
If this definition of a "word" is not adequate, then simply change the pattern to, e.g., \S{20}, that is, any sequence of 20 non-whitespace characters; now all but the last string are matches (as seen on ideone.com).
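For comparison, the same two definitions of a "word" can be sketched in Python (the function names and sample strings are made up for this illustration):

```python
import re

LONG_WORD = re.compile(r"[A-Za-z]{20}")  # 20 ASCII letters in a row
LONG_TOKEN = re.compile(r"\S{20}")       # 20 non-whitespace characters in a row

def has_long_word(text: str) -> bool:
    """Hypothetical helper: True if a run of 20 or more letters occurs."""
    return LONG_WORD.search(text) is not None

def has_long_token(text: str) -> bool:
    """Hypothetical helper: True if a run of 20 or more non-space characters occurs."""
    return LONG_TOKEN.search(text) is not None

hyphenated = "one-two-three-four-five-six-seven-eight-nine-ten"
print(has_long_word(hyphenated))   # False: hyphens break the letter runs
print(has_long_token(hyphenated))  # True
```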
I think the strlen function is what you are looking for. You can do something like this (note that it measures the whole input rather than individual words):
if (strlen($input) > 20) {
    echo "input is more than 20 characters";
}
