Difference between PHP regex and JavaScript regex - php

Hi, I want to use the PHP regexes below in the Spry JavaScript framework, but they don't work there, and Spry doesn't let the user type anything into the field.
1)"/^[\d]+$/"
2)"/^([\x{600}-\x{6FF}]+\s)*[\x{600}-\x{6FF}]+$/u"
3)"/^([\x{600}-\x{6FF}]+\d*\s)*[\x{600}-\x{6FF}]+\d*$/u"
Please help me convert them for use in the Spry framework.

1) /^[\d]+$/
2) /^([\u0600-\u06FF]+\s)*[\u0600-\u06FF]+$/
3) /^([\u0600-\u06FF]+\d*\s)*[\u0600-\u06FF]+\d*$/
The /u modifier has no direct equivalent in older JavaScript engines, whose regexes only handle Unicode as 16-bit code units (ES2015 later added a u flag and the \u{...} escape). There, \x{???} (a Unicode codepoint) should be written \u???? (always 4 hex digits, zero-padded).
In these cases, the following applies to the rest of the regex:
\s in JavaScript matches Unicode whitespace
\d doesn't match Unicode digits, which means only ASCII digits (0-9) are accepted.
This means we specifically have to allow "foreign" numerals, e.g. Persian (codepoints 06F0-06F9):
1) /^[\d\u06F0-\u06F9]+$/
2) /^([\u0600-\u06FF]+\s)*[\u0600-\u06FF]+$/
3) /^([\u0600-\u06FF]+[\d\u06F0-\u06F9]*\s)*[\u0600-\u06FF]+[\d\u06F0-\u06F9]*$/
(Remove \d if ASCII digits shouldn't be accepted)
Not sure what the brackets are supposed to be doing in example 1; originally it could have been written:
1) /^\d+$/
But to add the Persian numerals, we need them, see above.
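The difference between \d and \s described above can be checked with a small sketch. The constant names are only illustrative.

```javascript
// \d matches ASCII digits only, so Persian digits need their own range.
const asciiOnlyDigits = /^\d+$/;
const withPersianDigits = /^[\d\u06F0-\u06F9]+$/;

const persian123 = "\u06F1\u06F2\u06F3"; // Persian digits 1, 2, 3

console.log(asciiOnlyDigits.test("123"));        // ASCII digits are accepted
console.log(asciiOnlyDigits.test(persian123));   // Persian digits are rejected
console.log(withPersianDigits.test(persian123)); // accepted once the range is added

// \s, in contrast, already covers Unicode whitespace such as the no-break space.
console.log(/^\s$/.test("\u00A0"));
```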
Update
Spry character masking, however, only wants a regex to be applied on each entered character - i.e., we can't actually do pattern matching, it's just a "list" of accepted characters in all places, in which case:
1 ) /[\u06F0-\u06F9\d]/ // match 0-9 and Persian numerals
2 and 3) /[\u0600-\u06FF\d\s]/ // match all Persian characters (includes numerals), 0-9 and space
Once again, remove \d if you don't want to accept 0-9.
Update 2
Now... using regex for validation with Spry:
var checkFullName = function(value, options)
{
  // Match with the by now well-known regex:
  if (value.match(/^([\u0600-\u06FF]+\s)*[\u0600-\u06FF]+$/))
  {
    return true;
  }
  return false;
}
var sprytextfield =
  new Spry.Widget.ValidationTextField(
    "sprytextfield",
    "custom",
    { validation: checkFullName, validateOn: ["blur", "change"] }
  );
A similar custom function can be made for case 3.
See examples from Adobe labs
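For case 3, such a function might look like the sketch below. The name checkNameWithDigits is hypothetical; the (value, options) signature simply mirrors checkFullName, and nothing here calls Spry itself.

```javascript
// Hypothetical validation callback for case 3: Persian words that may end in digits.
// "options" is unused and only kept to mirror the checkFullName signature.
function checkNameWithDigits(value, options) {
  const pattern =
    /^([\u0600-\u06FF]+[\d\u06F0-\u06F9]*\s)*[\u0600-\u06FF]+[\d\u06F0-\u06F9]*$/;
  return pattern.test(value);
}

// "\u0633\u0644\u0627\u0645" is the Persian word for "hello".
console.log(checkNameWithDigits("\u0633\u0644\u0627\u0645" + "12"));
console.log(checkNameWithDigits("hello12"));
```

It could be passed as the validation option in the same way as checkFullName.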

Are you passing them in as strings or as regex objects? Try removing the " characters from around the regex.
The 'u' flag is a little more tricky. You may need to explain what the 2nd and 3rd regexes are trying to do.

Related

plus-minus (±) sign in regex

I need to produce a regex pattern that verifies UTC offsets. These are typically formatted as UTC+05:30 or UTC-01:00. It seemed simple enough to match as follows (being permissive for spaces):
^UTC[ ]?[+\-±][ ]?[01][0-9]:[034][05]$
[Note: I updated this pattern based on feedback from @Barmar]
There is a corner case in which the code is written UTC±00:00. However, the plus-minus sign is throwing things off. Using PHP for example:
echo preg_match("/^±$/","±");
echo preg_match("/^[±]$/","±");
echo preg_match("/^[\±]$/","±");
Will return true for the first match but false on the other two.
So my question is, does the ± require special handling in Regex? I can't find any reference to this symbol in the docs. Thx.
It looks like @Barmar probably solved the first issue you were having (matching the UTC string). However, to explain what you were seeing with:
preg_match("/^±$/","±"); // true
preg_match("/^[±]$/","±"); // false
preg_match("/^[\±]$/","±"); // false
The ± character is two bytes long in UTF-8, so preg_match is interpreting it as two characters. In order to match in the way you expect, you have to use the /u modifier. This tells preg_match to treat your pattern as UTF-8, which will interpret ± as a single character instead of two characters.
preg_match("/^[±]$/u","±"); // true
And to include an example that matches your UTC sample:
// with the /u modifier (works as expected)
preg_match("/^UTC[ ]?[+\-±][ ]?[01][0-9]:[034][05]$/u", "UTC±05:30"); // true
// without the /u modifier (does not match)
preg_match("/^UTC[ ]?[+\-±][ ]?[01][0-9]:[034][05]$/", "UTC±05:30"); // false
You mustn't put the - between two characters inside [], that makes it create a range (like when you write [0-9]) rather than matching the - character literally.
You should put the - at the beginning or end, or escape it.
^UTC[ ]?[+\-±][ ]?[01][0-9]:[034][05]$
Also, you don't put | inside [] character sets. That's used inside () to create alternative patterns.
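For comparison, here is a sketch of the same pattern in JavaScript. JavaScript strings are sequences of UTF-16 code units and ± (U+00B1) is a single unit, so the character class works there without any extra flag. The function name isUtcOffset is only illustrative.

```javascript
// The "-" is escaped inside the class so it cannot be read as a range.
const utcOffset = /^UTC ?[+\-\u00B1] ?[01][0-9]:[034][05]$/;

function isUtcOffset(text) {
  return utcOffset.test(text);
}

console.log(isUtcOffset("UTC+05:30"));
console.log(isUtcOffset("UTC\u00B100:00")); // UTC±00:00
console.log(isUtcOffset("GMT+05:30"));
```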

PHP preg_replace: find string part not starting with an exclamation point

I am working on some very messy Excel sheets, and trying to use PHP to find clues..
I have a MySQL database with all formulas from an excel document, and as usual, the cellnames from the current sheet do not have a "sheetname!" in front of it. To make it searchable (and find dead-routes in the formulas) I like to replace all formulas in the database with their sheetname as prefix.
Example:
=+(sheet_factory_costs!A17/sheet_employees!D23)+T12+W12
The database contains the name of the current sheet, and I like to change the formula above with that sheetname (let's call it "sheet_turnover").
=+(sheet_factory_costs!A17 / sheet_employees!D23)+sheet_turnover!T12+sheet_turnover!W12
I try this in PHP with preg_replace, and I think I need the following rules:
Find one or two letters, directly followed by a number. This is always a cell address within formulas.
When there is a ! in the position before, there is already a sheet name. So I am only looking for the letters and numbers NOT preceded by an exclamation point.
The problem seems to be that the ! is also a special sign within patterns. Even if I try to escape it, it does not work:
$newformula =
preg_replace('/(?<\!)[A-Z]{1,2}[0-9]/',
'lala',
$oldformula);
(lala is my temporary marker to see if it is selecting the right cell addresses)
(and yes, the lala is only placed over the first number, but that's no issue right now)
(and yes, all Excel $..$.. (permanent) markers have already been replaced. No need to build that in the formula)
Your negative lookbehind is corrupt, you need to define it as (?<!!). However, you also need to use either a word boundary before it, or a (?<![A-Z]) lookbehind to make sure you have no other letters before the [A-Z]{1,2}.
So, you may use
'~\b(?<!!)[A-Z]{1,2}[0-9]~'
See the regex demo. Replace with sheet_turnover!$0 where $0 is the whole match value.
Details
\b - a word boundary (it is necessary, or name!AA11 would still get matched)
(?<!!) - no ! immediately to the left of the current location
[A-Z]{1,2} - 1 or 2 letters
[0-9] - a digit.
Another approach is match and skip "wrong" contexts and then match and keep the "right" ones:
'~\w+![A-Z]{1,2}[0-9](*SKIP)(*F)|\b[A-Z]{1,2}[0-9]~'
See this regex demo.
Here, the \w+![A-Z]{1,2}[0-9](*SKIP)(*F)| part matches 1 or more word chars, then a !, then 1 or 2 uppercase ASCII letters and then a digit; (*SKIP)(*F) then discards that match and makes the engine continue looking for matches after the end of the discarded text.
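The lookbehind variant also carries over to JavaScript engines that support lookbehind (ES2018 and later); the (*SKIP)(*F) verbs are PCRE-only and do not. A minimal sketch, where addSheetPrefix is a hypothetical helper name:

```javascript
// Prefix every cell reference that does not already have a sheet name.
// "$&" in the replacement string stands for the whole match.
function addSheetPrefix(formula, sheetName) {
  return formula.replace(/\b(?<!!)[A-Z]{1,2}[0-9]/g, sheetName + "!$&");
}

const oldFormula = "=+(sheet_factory_costs!A17/sheet_employees!D23)+T12+W12";
console.log(addSheetPrefix(oldFormula, "sheet_turnover"));
```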

Regex for password ver [duplicate]

I need a regular expression with condition:
min 6 characters, max 50 characters
must contain 1 letter
must contain 1 number
may contain special characters like !@#$%^&*()_+
Currently I have pattern: (?!^[0-9]*$)(?!^[a-zA-Z]*$)^([a-zA-Z0-9]{6,50})$
However it doesn't allow special characters, does anybody have a good regex for that?
Thanks
Perhaps a single regex could be used, but that makes it hard to give the user feedback for which rule they aren't following. A more traditional approach like this gives you feedback that you can use in the UI to tell the user what pwd rule is not being met:
function checkPwd(str) {
  if (str.length < 6) {
    return("too_short");
  } else if (str.length > 50) {
    return("too_long");
  } else if (str.search(/\d/) == -1) {
    return("no_num");
  } else if (str.search(/[a-zA-Z]/) == -1) {
    return("no_letter");
  } else if (str.search(/[^a-zA-Z0-9\!\@\#\$\%\^\&\*\(\)\_\+]/) != -1) {
    return("bad_char");
  }
  return("ok");
}
Following jfriend00's answer, I wrote this fiddle to test his solution, with a few small changes to make it more visual:
http://jsfiddle.net/9RB49/1/
and this is the code:
checkPwd = function() {
  var str = document.getElementById('pass').value;
  if (str.length < 6) {
    alert("too_short");
    return("too_short");
  } else if (str.length > 50) {
    alert("too_long");
    return("too_long");
  } else if (str.search(/\d/) == -1) {
    alert("no_num");
    return("no_num");
  } else if (str.search(/[a-zA-Z]/) == -1) {
    alert("no_letter");
    return("no_letter");
  } else if (str.search(/[^a-zA-Z0-9\!\@\#\$\%\^\&\*\(\)\_\+\.\,\;\:]/) != -1) {
    alert("bad_char");
    return("bad_char");
  }
  alert("oukey!!");
  return("ok");
}
By the way, it's working like a charm! ;)
Best regards, and thanks to jfriend00 of course!
Check a password of 8 to 15 characters that contains only letters, numeric digits and underscores, and whose first character is a letter:
/^[A-Za-z]\w{7,14}$/
Check a password of 6 to 20 characters that contains at least one numeric digit, one uppercase and one lowercase letter:
/^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{6,20}$/
Check a password of 7 to 15 characters that contains at least one numeric digit and a special character:
/^(?=.*[0-9])(?=.*[!@#$%^&*])[a-zA-Z0-9!@#$%^&*]{7,15}$/
Check a password of 8 to 15 characters that contains at least one lowercase letter, one uppercase letter, one numeric digit, and one special character:
/^(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[^a-zA-Z0-9])(?!.*\s).{8,15}$/
I hope this will help someone. For more, please check this article and this site: regexr.com
A more self-contained regex to match these (common) password requirements is:
^(?=.*[A-Za-z])(?=.*\d).{6,50}$
The nice touch here is that you don't have to hard-code symbols such as $ @ # etc.
The lookaheads require at least one letter and one digit, and the final . accepts any other character, symbols included. (Inside a string literal, for example in Java, write the backslash as \\d.)
Min and Max number of characters requirement
The final part of the regex, {6,50}, is the min and max number of characters: if the password has fewer than 6 or more than 50 characters, the regex does not match.
I have a regex, but it's a bit tricky.
^(?:(?<Numbers>[0-9]{1})|(?<Alpha>[a-zA-Z]{1})|(?<Special>[^a-zA-Z0-9]{1})){6,50}$
Let me explain it and how to check if the tested password is correct:
There are three named groups in the regex.
1) "Numbers": will match a single number in the string.
2) "Alpha": will match a single character from "a" to "z" or "A" to "Z"
3) "Special": will match a single character not being "Alpha" or "Numbers"
Those three named groups are combined in an alternation, and {6,50} tells the regex engine to match at least 6 of the groups mentioned above, but not more than 50.
To ensure a correct password is entered, you have to check that there is a match and, after that, that each group was captured as often as required. I'm a C# developer and don't know how it works in JavaScript, but in C# you would check:
match.Groups["Numbers"].Captures.Count >= 1
Note that JavaScript keeps only the last capture of a repeated group, so there you would have to count each character class separately. Good luck!
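JavaScript has no counterpart to the C# Captures collection, since a repeated group only retains its last capture. A rough equivalent, assuming the same three character classes, is to count the matches of each class separately. The function names below are hypothetical.

```javascript
// Count how many characters fall into each class.
function countClasses(password) {
  const count = (re) => (password.match(re) || []).length;
  return {
    numbers: count(/[0-9]/g),
    alpha: count(/[a-zA-Z]/g),
    special: count(/[^a-zA-Z0-9]/g),
  };
}

// 6 to 50 characters, at least one letter and at least one number.
function isValidPassword(password) {
  const counts = countClasses(password);
  return password.length >= 6 && password.length <= 50 &&
    counts.numbers >= 1 && counts.alpha >= 1;
}

console.log(countClasses("abc123!"));
console.log(isValidPassword("abc123!"));
```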
I use this
export const validatePassword = password => {
const re = /^(?=.*[A-Za-z])(?=.*\d)[a-zA-Z0-9!@#$%^&*()~¥=_+}{":;'?/>.<,`\-\|\[\]]{6,50}$/
return re.test(password)
}
DEMO https://jsfiddle.net/ssuryar/bjuhkt09/
The function is triggered on keyup.
HTML
<form>
<input type="text" name="testpwd" id="testpwd" class="form-control" onkeyup="checksPassword(this.value)"/>
<input type="submit" value="Submit" /><br />
<span class="error_message spassword_error" style="display: none;">Enter 8 to 20 chars with at least 1 number, lower, upper & special (@#$%&!-_) char.</span>
</form>
Script
// Requires jQuery for the $(...) calls.
function checksPassword(password){
  // The first lookahead anchors the 8-20 length; "-" is escaped so "!-_" is not read as a range.
  var pattern = /^(?=.{8,20}$)(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%&!\-_]).*$/;
  if(!pattern.test(password)) {
    $(".spassword_error").show();
  } else {
    $(".spassword_error").hide();
  }
}
International UTF-8
None of the solutions here allow international letters, e.g. éÉöÖæÆóÓúÚáÁ; they mainly focus on the English alphabet.
The following regex uses Unicode property escapes to recognise upper and lower case letters and thus allows international characters:
// Require a lowercase letter, an uppercase letter and a digit or one of @#$!%*?&; length must be 6 to 50
const pwdFilter = /^(?=.*\p{Ll})(?=.*\p{Lu})(?=.*[\d@#$!%*?&])[\p{L}\d@#$!%*?&]{6,50}$/u
if (!pwdFilter.test(pwd)) {
  // Show error that password has to be adjusted to match criteria
}
This regex
/^(?=.*\p{Ll})(?=.*\p{Lu})(?=.*[\d@#$!%*?&])[\p{L}\d@#$!%*?&]{6,50}$/u
checks that the password contains a lowercase letter, an uppercase letter, and a digit or one of @#$!%*?&. It also limits the length to a minimum of 6 and a maximum of 50. Note that emojis such as 😀🇺🇸🇪🇸🧑‍💻 are not in the allowed character set, and many of them consist of more than one code point.
The u at the end enables Unicode mode, which \p{...} requires. The g and m flags are left out: with g, repeated calls to test() on the same regex object resume from lastIndex and can give alternating results.
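A small sketch of what the Unicode property escapes and the u flag do in practice. It assumes an engine with ES2018 support, and the names are illustrative.

```javascript
// \p{Ll} and \p{Lu} match lowercase and uppercase letters of any script.
const hasLower = /\p{Ll}/u;
const hasUpper = /\p{Lu}/u;

console.log(hasLower.test("\u00E9")); // é is a lowercase letter
console.log(hasUpper.test("\u00D6")); // Ö is an uppercase letter
console.log(/[a-z]/.test("\u00E9"));  // the ASCII-only class misses é

// String length counts UTF-16 code units, so an emoji can count as two.
// Spreading the string iterates by code point instead.
function codePointLength(text) {
  return [...text].length;
}

console.log("\u{1F600}".length);
console.log(codePointLength("\u{1F600}"));
```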
First, we should make the assumption that passwords are always hashed (right? always hashed, right?). That means we should not specify the exact characters allowed (as per the 4th bullet). Rather, any characters should be accepted, and then validate on minimum length and complexity (must contain a letter and a number, for example). And since it will definitely be hashed, we have no concerns over a max length, and should be able to eliminate that as a requirement.
I agree that often this won't be done as a single regex but rather a series of small regex to validate against because we may want to indicate to the user what they need to update, rather than just rejecting outright as an invalid password. Here's some options:
As discussed above - 1 number, 1 letter (upper or lower case) and min 8 char. Added a second option that disallows leading/trailing spaces (avoid potential issues with pasting with extra white space, for example).
^(?=.*\d)(?=.*[a-zA-Z]).{8,}$
^(?=.*\d)(?=.*[a-zA-Z])\S.{6,}\S$
Lastly, if you want to require 1 number and both 1 uppercase and 1 lowercase letter, something like this would work (with or without allowing leading/trailing spaces)
^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{8,}$
^(?=.*\d)(?=.*[a-z])(?=.*[A-Z])\S.{6,}\S$
Lastly as requested in the original post (again, don't do this, please try and push back on the requirements!!) - 1 number, 1 letter (upper or lower case), 1 special char (in list) and min 8 char, max 50 char. Both with/without allowing leading/trailing spaces, note the min/max change to account for the 2 non-whitespace characters specified.
^(?=.*\d)(?=.*[a-zA-Z])(?=.*[!@#$%^&*()_+]).{8,50}$
^(?=.*\d)(?=.*[a-zA-Z])(?=.*[!@#$%^&*()_+])\S.{6,48}\S$
Bonus - separated out is pretty simple, just test against each of the following and show the appropriate error in turn:
/^.{8,}$/ // at least 8 char; ( /^.{8,50}$/ if you must add a max)
/[A-Za-z]/ // one letter
/[A-Z]/ // (optional) - one uppercase letter
/[a-z]/ // (optional) - one lowercase letter
/\d/ // one number
/^\S+.*\S+$/ // (optional) first and last character are non-whitespace)
Note, in these regexes, the char set for a letter is the standard English 26 character alphabet without any accented characters. But my hope is this has enough variations so folks can adapt from here as needed.
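The separated-out approach could be wired together roughly as follows. The rule list, the messages and the function name passwordProblems are assumptions made for illustration.

```javascript
// Each rule pairs one small regex with the message to show when it fails.
const passwordRules = [
  { pattern: /^.{8,}$/, message: "at least 8 characters" },
  { pattern: /[A-Za-z]/, message: "at least one letter" },
  { pattern: /\d/, message: "at least one number" },
];

// Returns the messages of all rules the password does not satisfy.
function passwordProblems(password) {
  return passwordRules
    .filter((rule) => !rule.pattern.test(password))
    .map((rule) => rule.message);
}

console.log(passwordProblems("abc"));
console.log(passwordProblems("abcdefg1"));
```

An empty array means every rule passed, and the optional rules can be added to the list in the same way.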
// stricter password regex; the password must have:
// at least 8 chars
// at least one number
// at least one lowercase and one uppercase letter
// at least one special character
const PASSWORD_REGEX_3 = /^(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[!@#$%^&*]).{8,}$/;

How can I validate a password based on my rules? [duplicate]

My password strength criteria is as below :
8 characters length
2 letters in Upper Case
1 Special Character (!@#$&*)
2 numerals (0-9)
3 letters in Lower Case
Can somebody please give me regex for same. All conditions must be met by password .
You can do these checks using positive look ahead assertions:
^(?=.*[A-Z].*[A-Z])(?=.*[!@#$&*])(?=.*[0-9].*[0-9])(?=.*[a-z].*[a-z].*[a-z]).{8}$
Rubular link
Explanation:
^ Start anchor
(?=.*[A-Z].*[A-Z]) Ensure string has two uppercase letters.
(?=.*[!@#$&*]) Ensure string has one special character.
(?=.*[0-9].*[0-9]) Ensure string has two digits.
(?=.*[a-z].*[a-z].*[a-z]) Ensure string has three lowercase letters.
.{8} Ensure string is of length 8.
$ End anchor.
You should also consider changing some of your rules to:
Add more special characters i.e. %, ^, (, ), -, _, +, and period. I'm adding all the special characters that you missed above the number signs in US keyboards. Escape the ones regex uses.
Make the password 8 or more characters. Not just a static number 8.
With the above improvements, and for more flexibility and readability, I would modify the regex to.
^(?=(.*[a-z]){3,})(?=(.*[A-Z]){2,})(?=(.*[0-9]){2,})(?=(.*[!@#$%^&*()\-__+.]){1,}).{8,}$
Basic Explanation
(?=(.*RULE){MIN_OCCURANCES,})
Each rule block is shown by (?=(){}). The rule and number of occurrences can then be easily specified and tested separately, before getting combined
Detailed Explanation
^ start anchor
(?=(.*[a-z]){3,}) lowercase letters. {3,} indicates that you want 3 of this group
(?=(.*[A-Z]){2,}) uppercase letters. {2,} indicates that you want 2 of this group
(?=(.*[0-9]){2,}) numbers. {2,} indicates that you want 2 of this group
(?=(.*[!@#$%^&*()\-__+.]){1,}) all the special characters in the [] fields. The ones used by regex are escaped by using the \ or the character itself. {1,} is redundant, but good practice, in case you change that to more than 1 in the future. Also keeps all the groups consistent
.{8,} indicates that you want 8 or more characters
$ end anchor
And lastly, for testing purposes here is a Rubular link with the above regex
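Because every rule block has the same (?=(.*RULE){MIN,}) shape, the pattern can also be assembled from a list of rules. The sketch below is one possible way to do that; buildPasswordRegex and the rule objects are hypothetical names, and a non-capturing group (?:...) is used in place of the capturing one.

```javascript
// Build "^(?=(?:.*RULE){MIN,})....{minLength,}$" from a list of rules.
function buildPasswordRegex(rules, minLength) {
  const lookaheads = rules
    .map(({ charClass, min }) => `(?=(?:.*${charClass}){${min},})`)
    .join("");
  return new RegExp(`^${lookaheads}.{${minLength},}$`);
}

const strongPassword = buildPasswordRegex(
  [
    { charClass: "[a-z]", min: 3 },
    { charClass: "[A-Z]", min: 2 },
    { charClass: "[0-9]", min: 2 },
    { charClass: "[!@#$%^&*()\\-_+.]", min: 1 },
  ],
  8
);

console.log(strongPassword.test("ABcde12!"));
console.log(strongPassword.test("Abcde12!")); // only one uppercase letter
```

Keeping the rules as data makes it easy to test each one on its own before combining them.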
The answers given above are fine, but I suggest using multiple smaller regexes rather than one big one.
Splitting the long regex has some advantages:
easier to write and read
easier to debug
easier to add or remove parts of the regex
Generally, this approach keeps the code easy to maintain.
Having said that, here is a piece of code that I wrote in Swift as an example:
struct RegExp {
/**
Check password complexity
- parameter password: password to test
- parameter length: password min length
- parameter patternsToEscape: patterns that password must not contains
- parameter caseSensitivty: specify if password must conforms case sensitivity or not
- parameter numericDigits: specify if password must conforms contains numeric digits or not
- returns: boolean that describes if password is valid or not
*/
static func checkPasswordComplexity(password password: String, length: Int, patternsToEscape: [String], caseSensitivty: Bool, numericDigits: Bool) -> Bool {
if (password.characters.count < length) {
return false
}
if caseSensitivty {
let hasUpperCase = RegExp.matchesForRegexInText("[A-Z]", text: password).count > 0
if !hasUpperCase {
return false
}
let hasLowerCase = RegExp.matchesForRegexInText("[a-z]", text: password).count > 0
if !hasLowerCase {
return false
}
}
if numericDigits {
let hasNumbers = RegExp.matchesForRegexInText("\\d", text: password).count > 0
if !hasNumbers {
return false
}
}
if patternsToEscape.count > 0 {
let passwordLowerCase = password.lowercaseString
for pattern in patternsToEscape {
let hasMatchesWithPattern = RegExp.matchesForRegexInText(pattern, text: passwordLowerCase).count > 0
if hasMatchesWithPattern {
return false
}
}
}
return true
}
static func matchesForRegexInText(regex: String, text: String) -> [String] {
do {
let regex = try NSRegularExpression(pattern: regex, options: [])
let nsString = text as NSString
let results = regex.matchesInString(text,
options: [], range: NSMakeRange(0, nsString.length))
return results.map { nsString.substringWithRange($0.range)}
} catch let error as NSError {
print("invalid regex: \(error.localizedDescription)")
return []
}
}
}
You can use zero-length positive look-aheads to specify each of your constraints separately:
(?=.{8,})(?=.*\p{Lu}.*\p{Lu})(?=.*[!@#$&*])(?=.*[0-9])(?=.*\p{Ll}.*\p{Ll})
If your regex engine doesn't support the \p notation and pure ASCII is enough, then you can replace \p{Lu} with [A-Z] and \p{Ll} with [a-z].
Unfortunately, none of the regexes above worked for me.
A strong password's basic rules are
Should contain at least a capital letter
Should contain at least a small letter
Should contain at least a number
Should contain at least a special character
And minimum length
So, the best regex would be
^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[!@#\$%\^&\*]).{8,}$
The above regex has a minimum length of 8. You can change it from {8,} to {any_number,}
Modification in rules?
Let's say you want a minimum of x lowercase letters, y capital letters and z digits, with a total minimum length of w. Then try the regex below
^(?=(?:.*[a-z]){x,})(?=(?:.*[A-Z]){y,})(?=(?:.*[0-9]){z,})(?=.*[!@#\$%\^&\*]).{w,}$
Note: Replace x, y, z, w in the regex with numbers. Each group is repeated as a whole so that the letters or digits do not have to be consecutive.
Edit: Updated regex answer
Edit2: Added modification
I would suggest adding
(?!.*pass|.*word|.*1234|.*qwer|.*asdf) exclude common passwords
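As a sketch of how this negative lookahead combines with the other checks, it can sit directly after the ^ anchor. The surrounding rules below (lowercase, uppercase, digit, 8+ characters) are just an example set, and the match is case-sensitive as in the original fragment.

```javascript
// Reject passwords containing a common fragment, then apply the usual lookaheads.
const strongAndUncommon =
  /^(?!.*(?:pass|word|1234|qwer|asdf))(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9]).{8,}$/;

console.log(strongAndUncommon.test("Tr0ubador9"));
console.log(strongAndUncommon.test("Password1"));  // contains "word"
console.log(strongAndUncommon.test("myPass1234")); // contains "1234"
```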
import re
RegexLength=re.compile(r'^\S{8,}$')
RegexDigit=re.compile(r'\d')
RegexLower=re.compile(r'[a-z]')
RegexUpper=re.compile(r'[A-Z]')
def IsStrongPW(password):
    if RegexLength.search(password) is None or RegexDigit.search(password) is None or RegexUpper.search(password) is None or RegexLower.search(password) is None:
        return False
    else:
        return True
while True:
    userpw=input("please input your password to check: \n")
    if userpw == "exit":
        break
    else:
        print(IsStrongPW(userpw))
codaddict's solution works fine, but this one is a bit more efficient: (Python syntax)
password = re.compile(r"""(?#!py password Rev:20160831_2100)
    # Validate password: 2 upper, 1 special, 2 digit, 1 lower, 8 chars.
    ^                        # Anchor to start of string.
    (?=(?:[^A-Z]*[A-Z]){2})  # At least two uppercase.
    (?=[^!@#$&*]*[!@#$&*])   # At least one "special".
    (?=(?:[^0-9]*[0-9]){2})  # At least two digit.
    (?=[^a-z]*[a-z])         # At least one lowercase.
    .{8,}                    # Password length is 8 or more.
    $                        # Anchor to end of string.
    """, re.VERBOSE)
The negated character classes consume everything up to the desired character in a single step, requiring zero backtracking. (The dot star solution works just fine, but does require some backtracking.) Of course with short target strings such as passwords, this efficiency improvement will be negligible.
For PHP, this works fine!
if(preg_match("/^(?=(?:[^A-Z]*[A-Z]){2})(?=(?:[^0-9]*[0-9]){2}).{8,}$/",
    'CaSu4Li8')){
    return true;
}else{
    return false;
}
In this case the result is true.
Thanks to @ridgerunner
Another solution:
import re
passwordRegex = re.compile(r'''(
    ^(?=.*[A-Z].*[A-Z])          # at least two capital letters
    (?=.*[!@#$&*])               # at least one of these special characters
    (?=.*[0-9].*[0-9])           # at least two numeric digits
    (?=.*[a-z].*[a-z].*[a-z])    # at least three lower case letters
    .{8,}                        # at least 8 characters in total
    $
    )''', re.VERBOSE)
def userInputPasswordCheck():
    print('Enter a potential password:')
    while True:
        m = input()
        mo = passwordRegex.search(m)
        if (not mo):
            print('''
Your password should have at least one special character,
two digits, two uppercase and three lowercase characters. Length: 8+ characters.
Enter another password:''')
        else:
            print('Password is strong')
            return
userInputPasswordCheck()
Password must meet at least 3 out of the following 4 complexity rules,
[at least 1 uppercase character (A-Z)
at least 1 lowercase character (a-z)
at least 1 digit (0-9)
at least 1 special character — do not forget to treat space as special characters too]
at least 10 characters
at most 128 characters
not more than 2 identical characters in a row (e.g., 111 not allowed)
'^(?!.*(.)\1{2})((?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])|(?=.*[a-z])(?=.*[A-Z])(?=.*[^a-zA-Z0-9])|(?=.*[A-Z])(?=.*[0-9])(?=.*[^a-zA-Z0-9])|(?=.*[a-z])(?=.*[0-9])(?=.*[^a-zA-Z0-9])).{10,128}$'
(?!.*(.)\1{2})
(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])
(?=.*[a-z])(?=.*[A-Z])(?=.*[^a-zA-Z0-9])
(?=.*[A-Z])(?=.*[0-9])(?=.*[^a-zA-Z0-9])
(?=.*[a-z])(?=.*[0-9])(?=.*[^a-zA-Z0-9])
.{10,128}
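The "3 out of 4" rule is easier to read when the character classes are counted in code instead of being enumerated as alternatives. A minimal sketch, where meetsPolicy is a hypothetical name:

```javascript
// 10 to 128 characters, at least 3 of 4 character classes,
// and no character repeated three times in a row.
function meetsPolicy(password) {
  const classes = [/[a-z]/, /[A-Z]/, /[0-9]/, /[^a-zA-Z0-9]/];
  const matchedClasses = classes.filter((re) => re.test(password)).length;
  const lengthOk = password.length >= 10 && password.length <= 128;
  const noTripleRepeat = !/(.)\1{2}/.test(password);
  return matchedClasses >= 3 && lengthOk && noTripleRepeat;
}

console.log(meetsPolicy("Abcdefgh12"));
console.log(meetsPolicy("Aaa111bcde")); // "111" repeats three times
```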

How can I test if an input field contains foreign characters?

I have an input field in a form. Upon pushing submit, I want to validate to make sure the user entered non-latin characters only, so any foreign language characters, like Chinese among many others. Or at the very least test to make sure it does not contain any latin characters.
Could I use a regular expression for this? What would be the best approach for this?
I am validating in both javaScript and in PHP. What solutions can I use to check for foreign characters in the input field in both programming languages?
In PHP, you can check the Unicode script property Latin. That's probably closest to what you want.
So if preg_match('/\p{Latin}/u', $subject) returns 1, then there is at least one Latin character in your $subject. See also this reference.
Older JavaScript engines don't support this; there you'd have to construct the valid Unicode ranges manually. Engines that implement ES2018 accept /\p{Script=Latin}/u.
In Javascript, at least, you can use hex codes inside character range expressions:
var rlatins = /[\u0000-\u007f]/;
You can then test to see if there are any latin characters in a string like this:
if (rlatins.test(someString)) {
alert("ROMANI ITE DOMUM");
}
You're trying to check if all letters are not Latin, but you do accept accented letters.
A simple solution is to validate the string using the regex (this is useful if you have a validation plugin):
/^[^a-z]+$/i
^...$ - Match from start to end
[^...] - characters that are not
a-z - A through Z,
+ - with at least one character
/i - ignoring case (could also be written /^[^a-zA-Z]+$/ )
Another option is simply to look for a letter:
/[a-z]/i
This regex will match if the string contains a letter, so you can reject the input when it matches.
In JavaScript you can check that easily with if:
var s = "שלום עולם";
if (s.match(/^[^a-z]+$/i)) {
}
or
if (!s.match(/[a-z]/i))
PHP has a different syntax, and server-side validation in PHP cannot be bypassed the way client-side JavaScript can, but the regular expressions are the same.
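In JavaScript engines that support ES2018 Unicode property escapes, the Latin script can also be tested directly, which catches accented Latin letters as well. This is a sketch under that assumption; isNonLatinOnly is a hypothetical name.

```javascript
// Matches any character whose Unicode script is Latin (including é, ñ, ...).
const hasLatin = /\p{Script=Latin}/u;

// True when the text is non-empty and contains no Latin-script characters.
// Digits, spaces and punctuation are not Latin-script, so they are allowed.
function isNonLatinOnly(text) {
  return text.length > 0 && !hasLatin.test(text);
}

console.log(isNonLatinOnly("\u05E9\u05DC\u05D5\u05DD")); // Hebrew "shalom"
console.log(isNonLatinOnly("h\u00E9llo"));               // "héllo"
```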
