I'm trying to create a filter that allows users to use only English letters (lowercase and uppercase) and numbers. How can I do that? (ANSI)
(I'm not trying to sanitize, only to tell whether a string contains non-English letters.)
That filter should give me a clean database with only English usernames, without multibyte and UTF-8 characters.
And can anyone explain to me why echo strlen(À) outputs '2'? It means two bytes, right? Weren't UTF-8 characters supposed to be a single byte?
Thanks
You should use regular expressions to see if a string matches a pattern. This one is pretty simple:
if (preg_match('/^[a-zA-Z0-9]+$/', $username)) {
echo 'Username is valid';
} else {
echo 'Username is NOT valid';
}
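One detail worth knowing about the pattern above: in PCRE, $ also matches just before a trailing newline, so "John42\n" would pass. A minimal sketch of a stricter variant follows, assuming that a trailing newline should be rejected; the function name isEnglishAlnum is made up for illustration and is not from the thread.

```php
<?php
// Hypothetical helper (name is an assumption, not from the thread).
function isEnglishAlnum(string $s): bool
{
    // \A and \z anchor to the true start and end of the string,
    // so a trailing "\n" is rejected too.
    return preg_match('/\A[a-zA-Z0-9]+\z/', $s) === 1;
}

var_dump(isEnglishAlnum('John42'));       // bool(true)
var_dump(isEnglishAlnum("John42\n"));     // bool(false)
var_dump(isEnglishAlnum("\xC3\x80lex"));  // bool(false): "Àlex" as UTF-8 bytes
```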
The reason strlen('À') equals 2 is that strlen counts bytes, not characters, and À takes two bytes in UTF-8. Try using:
echo strlen(utf8_decode('À'));
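Note that utf8_decode is deprecated as of PHP 8.2. A rough alternative, assuming the mbstring extension is available and PHP 7.0 or later (for the "\u{...}" escape), is to count characters with mb_strlen:

```php
<?php
// "\u{00C0}" is À; writing it as an escape keeps the example
// independent of how this file is saved.
$s = "\u{00C0}";

echo strlen($s), "\n";              // 2: number of bytes in UTF-8
echo mb_strlen($s, 'UTF-8'), "\n";  // 1: number of characters
```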
This is how you check whether a string contains only letters from the English alphabet and digits.
if (!preg_match('/[^A-Za-z0-9]/', $string)) {
    // string contains only English letters and digits
}
The other question:
strlen(À)
will not return 2. Maybe you meant
strlen('À')
strlen returns
"The length of the string on success, and 0 if the string is empty."
(taken from the PHP manual). So that character is counted as two bytes, probably due to your encoding.
I want to check whether a string contains only English letters, digits and symbols. I tried the code below, but it works only when all the characters are in a different language.
if(strlen($string) != mb_strlen($string, 'utf-8'))
{
echo "No English words ";
}
else {
echo "only english words";
}
For example
1. hellow hi 123#!##!##()### -- true
2. ព្រាប សុវ ok yes ### - false
3. this is good 123 - true
4. ព្រាប -- false
P.S.: my question is not a duplicate because other questions only cover alphabets and symbols; mine covers symbols too.
Would determining if a string is just printable ASCII work? If so you can use this regex:
[ -~]
http://www.catonmat.net/blog/my-favorite-regex/
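As a sketch of how that range could be applied to the examples in the question (the function name isPrintableAscii is an assumption for illustration): anchoring the class to the whole string rejects any byte outside space to tilde, which includes every byte of a multibyte UTF-8 character.

```php
<?php
// Hypothetical helper (name is an assumption, not from the thread).
function isPrintableAscii(string $s): bool
{
    // [ -~] is the range from space (0x20) to tilde (0x7E).
    // Without the u modifier the match is byte-wise, so every byte
    // of a multibyte UTF-8 character falls outside the range.
    return preg_match('/\A[ -~]*\z/', $s) === 1;
}

var_dump(isPrintableAscii('this is good 123'));  // bool(true)
var_dump(isPrintableAscii('ព្រាប សុវ ok yes'));   // bool(false)
```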
If you need non-ASCII characters as well, then you can use the Wikipedia page to get the specific Unicode ranges that you need:
https://en.wikipedia.org/wiki/List_of_Unicode_characters#Control_codes
I'm new here and happy to be on Stack Overflow. I have been following this website for a long time.
I have an input/text field (e.g. "Insert your name").
The script runs the following checks when the user sends data:
1) Check whether the field is empty.
2) Check whether the field exceeds the maximum number of characters allowed.
3) Check whether the field fails a preg_match regex (e.g. it contains numbers instead of only letters, without symbols and accents).
The question is: why, if I enter the characters "w/", does the script not perform the check? It seems like the string bypasses the checks.
Hello to all, and sorry if I'm late with the answer (and also for not posting the code ^^).
Now, about my problem: I found that it occurs ONLY when I work offline (I use EasyPHP 5.3.6.1). The regex tested online is OK.
This is the code I use to obtain what I described above:
if (!preg_match('/^[a-zA-Z]+[ ]?[a-zA-Z]+$/', $name)) {
echo "Error";
}
As you can see, this code matches:
A string that starts (and ends) with only letters;
A string with only 0 or 1 space (for people who have two names, e.g. Anna Maria);
...right?!
(Please correct me if I am wrong)
Thanks to all!
Wart
My reading of the requirements is
Only letters (upper or lower) can be provided.
Something must be provided (i.e. a non-zero length string).
There is a maximum length.
The code below checks this in a very simple manner and just echoes any errors it finds; you probably want to do something more useful when you detect an error.
<?php
$max = 10;
$string = 'w/';
// Check only letters; the regex searches for anything that isn't a plain letter
if (preg_match('/[^a-zA-Z]/', $string)){
echo 'only letters are allowed';
}
// Check a value is provided
$len = strlen($string);
if ($len == 0) {
echo 'you must provide a value';
}
// Check the string is not too long
if ($len > $max) {
echo 'the value cannot be longer than ' . $max;
}
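If echoing is not enough, the same three checks could be wrapped in a function that returns the error messages instead. This is only a sketch; the function name validateName and the default limit of 10 are assumptions carried over from the snippet above.

```php
<?php
// Hypothetical wrapper around the three checks (name is an assumption).
function validateName(string $value, int $max = 10): array
{
    $errors = [];

    // Check a value is provided
    if ($value === '') {
        $errors[] = 'you must provide a value';
    }
    // Check only letters; the regex searches for anything that isn't a plain letter
    if (preg_match('/[^a-zA-Z]/', $value)) {
        $errors[] = 'only letters are allowed';
    }
    // Check the string is not too long
    if (strlen($value) > $max) {
        $errors[] = 'the value cannot be longer than ' . $max;
    }

    return $errors;
}

print_r(validateName('w/'));  // one error: only letters are allowed
```

An empty array means the value passed all three checks.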
You can also try this:
if (preg_match('/^[a-z0-9]{1,12}$/i', $subject)) {
    // match
}
The above matches similarly to @genesis' post, but only if $subject is between 1 and 12 characters long (inclusive). The regex is also case-insensitive.
It works fine.
<?php
$string = '\with';
if (preg_match('~[^0-9a-z]~i', $string)){
echo "only a-Z0-9 is allowed"; //true
}
http://sandbox.phpcode.eu/g/18535/3
You have to make sure you don't put user input into your regex, because that would mean you'll probably check something wrong.
What is the easiest or best way in PHP to validate true or false that a string only contains characters that can be typed using a standard US or UK keyboard with the keyboard language set to UK or US English?
To be a little more specific, I mean using a single key depression with or without using the shift key.
I think the characters are the following: 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz~`!@#$%^&*()_-+={[}]|\:;"'<,>.?/£ and Space
You can cover every ASCII character by [ -~] (i.e. range from space to tilde). Then just add £ too and there you go (you might need to add other characters as well, such as ± and §, but for that, have a look at the US and UK keyboard layouts).
Something like:
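if(preg_match('#^[ -~£±§]*$#u', $string)) {
    // valid (the u modifier makes the multibyte £, ± and § count as single characters; the file must be saved as UTF-8)
}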
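The following regular expression may be of use to you:
/^([a-zA-Z0-9!"#$%&'()*+,\-.\/:;<=>?@[\\\]^_`{|}~\t ])*$/
Use this as (inside a single-quoted PHP string the backslashes before ] must be doubled, otherwise the character class ends early):
$result = (bool)preg_match('/^([a-zA-Z0-9!"#$%&\'()*+,\-.\/:;<=>?@[\\\\\]^_`{|}~\t ])*$/', $input);
Or create a reusable function from this code:
function testUsUkKeyboard($input)
{
    // No m modifier: with it, ^ and $ match per line, so one valid line would pass the whole input
    return (bool)preg_match('/^([a-zA-Z0-9!"#$%&\'()*+,\-.\/:;<=>?@[\\\\\]^_`{|}~\t ])*$/', $input);
}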
It is easier to check whether unwanted characters exist than to check that they do not, so first you need a list of the characters that are not allowed. You can get these from the byte range 128 - 255, whereas 0 - 127 is the regular (ASCII) key set.
To create the invalid array you can do:
$chars = range(128,255);
The above array contains the codes of all the extended (non-ASCII) characters.
Then check it against the string in question. People say use regex, but I don't really think that's needed:
$string = "testing a plain string";
for($s=0;$s<strlen($string);$s++)
{
if(in_array(ord($string[$s]),$chars))
{
//Invalid
}
}
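For comparison, the same byte-range test can be written as a single regex call. This is a sketch only; the function name hasNonAsciiByte is an assumption for illustration.

```php
<?php
// Hypothetical helper (name is an assumption, not from the thread).
function hasNonAsciiByte(string $s): bool
{
    // \x80-\xFF covers byte values 128 to 255. There is no u modifier,
    // so the pattern is applied byte by byte.
    return preg_match('/[\x80-\xFF]/', $s) === 1;
}

var_dump(hasNonAsciiByte('testing a plain string'));  // bool(false)
var_dump(hasNonAsciiByte("caf\xC3\xA9"));             // bool(true): "café" in UTF-8
```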
I need to check whether a received string contains any words that are more than 20 characters in length. For example, the input string:
hi there asssssssssssssssssssskkkkkkkk how are you doing ?
would return true.
Could somebody please help me out with a regexp to check for this? I'm using PHP.
Thanks in advance.
/\w{20}/
You can test if the string contains a match of the following pattern:
[A-Za-z]{20}
The construct [A-Za-z] creates a character class that matches ASCII uppercase and lowercase letters. The {20} is a finite repetition syntax. It's enough to check if there's a match that contains 20 letters, because if there's a word that contains more, it contains at least 20.
References
regular-expressions.info/Character Classes and Finite Repetition
PHP snippet
Here's an example usage:
$strings = array(
"hey what the (##$&*!#^#*&^#!#*^##*##*&^#!*#!",
"now this one is just waaaaaaaaaaaaaaaaaaay too long",
"12345678901234567890123 that's not a word, is it???",
"LOLOLOLOLOLOLOLOLOLOLOL that's just unacceptable!",
"one-two-three-four-five-six-seven-eight-nine-ten",
"goaaaa...............aaaaaaaaaalll!!!!!!!!!!!!!!",
"there is absolutely nothing here"
);
foreach ($strings as $str) {
echo $str."\n".preg_match('/[a-zA-Z]{20}/', $str)."\n";
}
This prints (as seen on ideone.com):
hey what the (##$&*!#^#*&^#!#*^##*##*&^#!*#!
0
now this one is just waaaaaaaaaaaaaaaaaaay too long
1
12345678901234567890123 that's not a word, is it???
0
LOLOLOLOLOLOLOLOLOLOLOL that's just unacceptable!
1
one-two-three-four-five-six-seven-eight-nine-ten
0
goaaaa...............aaaaaaaaaalll!!!!!!!!!!!!!!
0
there is absolutely nothing here
0
As specified in the pattern, preg_match is true when there's a "word" (as defined by a sequence of letters) that is at least 20 characters long.
If this definition of a "word" is not adequate, then simply change the pattern to, e.g. \S{20}. That is, any sequence of 20 non-whitespace characters; now all but the last string is a match (as seen on ideone.com).
I think the strlen function is what you are looking for. You can do something like this:
if (strlen($input) > 20) {
echo "input is more than 20 characters";
}
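The check above measures the whole input. To apply strlen per word, as the question asks, one possible sketch is to split on whitespace first. The function name hasLongWord is an assumption, and strlen counts bytes, so this is only accurate for single-byte text.

```php
<?php
// Hypothetical helper (name is an assumption, not from the thread).
function hasLongWord(string $input, int $limit = 20): bool
{
    // Split on runs of whitespace and drop empty pieces.
    $words = preg_split('/\s+/', $input, -1, PREG_SPLIT_NO_EMPTY);

    foreach ($words as $word) {
        if (strlen($word) > $limit) {
            return true;  // found a word longer than $limit bytes
        }
    }
    return false;
}

var_dump(hasLongWord('hi ' . str_repeat('a', 21) . ' there'));  // bool(true)
var_dump(hasLongWord('there is absolutely nothing here'));      // bool(false)
```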
I want to be able to detect (using regular expressions) whether a string contains Hebrew characters, in both UTF-8 and iso8859-8, in PHP. Thanks!
Here's a map of the iso8859-8 character set. The range E0 - FA appears to be reserved for Hebrew. You could check for those characters in a character class:
[\xE0-\xFA]
For UTF-8, the Unicode range reserved for Hebrew appears to be U+0591 to U+05F4. PCRE has no \u escape, so write the code points as \x{...} and add the u modifier:
[\x{0591}-\x{05F4}]
Here's an example of a regex match in PHP:
echo preg_match('/[\x{0591}-\x{05F4}]/u', $string);
Well, if your PHP file is encoded in UTF-8, as it should be when it contains Hebrew, you should use the following regex:
$string="אבהג";
echo preg_match("/\p{Hebrew}/u", $string);
// output: 1
Here's a small function to check whether the first character in a string is in hebrew:
function IsStringStartsWithHebrew($string)
{
return (strlen($string) > 1 && //minimum of chars for hebrew encoding
ord($string[0]) == 215 && //first byte is 110-10111
ord($string[1]) >= 144 && ord($string[1]) <= 170 //hebrew range in the second byte.
);
}
good luck :)
First, such a string would be completely useless: a mix of two different character sets?
Both the Hebrew characters in iso8859-8 and each byte of a multibyte sequence in UTF-8 have a value ord($char) > 127. So what I would do is find all bytes with a value greater than 127, and then check whether they make sense as iso8859-8, or whether they would make more sense as a UTF-8 sequence...
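One way that idea might be sketched is shown below. It is a heuristic, not a reliable encoding detector: it assumes that anything which is valid UTF-8 should be read as UTF-8, and only falls back to the iso8859-8 byte range otherwise. The function name detectHebrewEncoding and its return values are assumptions for illustration.

```php
<?php
// Hypothetical helper (name and return values are assumptions).
// Returns 'UTF-8' or 'ISO-8859-8' when Hebrew is found, null otherwise.
function detectHebrewEncoding(string $s): ?string
{
    // An empty pattern with the u modifier only matches when the
    // subject is valid UTF-8; on invalid input preg_match returns false.
    if (preg_match('//u', $s) === 1) {
        return preg_match('/\p{Hebrew}/u', $s) === 1 ? 'UTF-8' : null;
    }

    // Not valid UTF-8: look for bytes in the iso8859-8 Hebrew range.
    return preg_match('/[\xE0-\xFA]/', $s) === 1 ? 'ISO-8859-8' : null;
}

var_dump(detectHebrewEncoding("\xD7\x90\xD7\x91"));  // string(5) "UTF-8"
var_dump(detectHebrewEncoding("\xE0\xE1\xE2"));      // string(10) "ISO-8859-8"
var_dump(detectHebrewEncoding('hello'));             // NULL
```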
function is_hebrew($string)
{
return preg_match("/\p{Hebrew}/u", $string);
}