Why is this regex playing up? - php

I've written this function:
function contain_special($string){
# -- Check For Any Special Chars --
if(preg_match('/[^a-z0-9]/',$string)){
# - Special Chars Were Found -
return true;
}//end of special chars found
else{
# - String Does Not Contain Special Chars -
return false;
}//end of else - does not contain special chars
}//end of function
To check if a string contains special chars.
The function is supposed to ignore alphanumeric chars and look for special chars. If found, return true, else, return false.
Now all works well when testing it with most special chars:
$text="sdfs-df";
var_dump(contain_special($text));//returns true because "-" was found
BUT, when I have a $ that is not in a certain position of the string, the function fails to pick it up:
$text="sdfsdf$";//this works
$text="sdf$sdf";//this does not work
$text="$sdfsdf";//this works
Any ideas on what I'm doing wrong here?

Take a look at echo $text. You may not be using the string you think you are. Literal dollar signs often need to be escaped in double-quoted strings so that you're not using the variable, $sdfsdf, for example.
I'd recommend just using single quotes here.
http://php.net/manual/en/language.types.string.php

^ anchors the start of the string. Inside a character class, as in [^a-z0-9], it negates the class instead (you do not need the negation, though; you can just swap true and false).
$ anchors the end of the string.
* means zero or more times.
So I think the regex you want is:
/^[a-z0-9]*$/
Does that work?

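A dollar sign inside double quotes starts a variable substitution whenever it is followed by a character that can validly start a variable name. If the next character cannot start a variable name (or the dollar sign is the last character), it stays a literal dollar sign.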
This explains why $text="sdfsdf$" works and $text="sdf$sdf" does not.
Your last example may work if you have a variable named $sdfsdf.
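A minimal sketch of the quoting difference, reusing contain_special() from the question (shortened to a single return). The string literals are picked for illustration only; the unescaped double-quoted "sdf$sdf" case is left out because reading an undefined variable raises a warning on recent PHP versions.

```php
<?php
// Same check as in the question: true if any character is not a-z or 0-9.
function contain_special($string) {
    return preg_match('/[^a-z0-9]/', $string) === 1;
}

// Single quotes: no variable interpolation, so the dollar sign is literal.
var_dump(contain_special('sdf$sdf'));   // bool(true)

// Double quotes with an escaped dollar sign produce the same string.
var_dump(contain_special("sdf\$sdf"));  // bool(true)

// A dollar sign at the very end cannot start a variable name,
// so it stays literal even inside double quotes.
var_dump(contain_special("sdfsdf$"));   // bool(true)

// No special characters at all.
var_dump(contain_special('sdfsdf'));    // bool(false)
```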

Related

PHP, Regex - how to disallow non-alphanumeric characters

I'm trying to make a regex that would allow input including at least one digit and at least one letter (no matter if upper or lower case) AND NOTHING ELSE. Here's what I've come up with:
<?php
if (preg_match('/(?=.*[a-z]+)(?=.*[0-9]+)([^\W])/i',$code)) {
echo "=)";
} else {
echo "=(";
}
?>
While it gives false if I use only digits or only letters, it gives true if I add $ or # or any other non-alphanumeric sign. Now, I tried putting ^\W into class brackets with both a-z and 0-9, tried to use something like ?=.*[^\W] or ?>! but I just can't get it work. Typing in non-alphanums still results in true. Halp meeee
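You need to use anchors so that it matches against the entire string. Also note that \w (and therefore [^\W]) matches the underscore as well, so spell out the allowed characters explicitly:
/^(?=.*[a-z])(?=.*[0-9])[a-z0-9]+$/i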
Since you are using php, why even use regex at all. You can use ctype_alnum()
http://php.net/manual/en/function.ctype-alnum.php
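Note that ctype_alnum() on its own only covers the "nothing else" part of the question, not the "at least one letter and at least one digit" part. A rough sketch of both variants follows; the function names are made up for this example, and the D modifier is added so that $ does not accept a trailing newline.

```php
<?php
// Hypothetical helper: true only if $code consists of ASCII letters and
// digits and contains at least one letter and at least one digit.
function is_mixed_alnum(string $code): bool {
    return preg_match('/^(?=.*[a-z])(?=.*[0-9])[a-z0-9]+$/iD', $code) === 1;
}

// Same rule without lookaheads: ctype_alnum() rejects anything that is
// not alphanumeric, and two small patterns check for a letter and a digit.
function is_mixed_alnum_ctype(string $code): bool {
    return ctype_alnum($code)
        && preg_match('/[a-z]/i', $code) === 1
        && preg_match('/[0-9]/', $code) === 1;
}

var_dump(is_mixed_alnum('abc123'));        // bool(true)
var_dump(is_mixed_alnum('abc'));           // bool(false): no digit
var_dump(is_mixed_alnum('abc$1'));         // bool(false): contains $
var_dump(is_mixed_alnum_ctype('abc123'));  // bool(true)
```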

How can I find Alphanumeric in a string

$foo = "username122";
// Pseudocode: which pattern should go here?
if (preg_match('<pattern matching only alphanumeric strings>', $foo)) {
return true;
}
$foo should contain only alphanumeric characters, no special characters (=\*-[( etc.)
ctype_alnum() function will do you dandy :)
Use a regular expression matching on alphanumeric characters only from beginning to end:
/^[A-Za-z0-9]*$/
For example:
$testRegex = "/^[A-Za-z0-9]*$/";
$testString = "abc123";
if (preg_match($testRegex, $testString)) {
// yes, the string is entirely alphanumeric
} else {
// no, the string is not entirely alphanumeric
}
if(preg_match('~[^a-z0-9 ]~i', $foo)) {
//DO SOMETHING HERE;
}
Your regular expression for something like this won't change between PHP and JavaScript. In JavaScript it's an object, whereas in PHP it's a string, but the pattern is still the same:
/^[a-z0-9]+$/i
Where ^ represents the start of the string and $ represents the end of the string. Next, a-z matches any letter, and 0-9 matches any number. The + states that the previous pattern could be repeated one or more times. The i modifier makes the a-z portion case-insensitive, so upper and lower case are matched.
Testing in JavaScript:
/^[a-z0-9]+$/i.test("Foo123"); // true
Testing in PHP:
preg_match("/^[a-z0-9]+$/i", "Foo123"); // 1
With PHP, you have the option of using POSIX character classes, such as :alnum:. Please note that this won't work in JavaScript:
preg_match("/^[[:alnum:]]+$/i", "Foo123"); // 1
There's actually a much easier method for testing in PHP, using the Ctype functions, specifically the ctype_alnum function, which will return a boolean value stating whether all characters in a given string are alphanumeric or not:
ctype_alnum("Foo123"); // true
ctype_alnum("Foo!23"); // false

regex with special characters?

I am looking for a regex that can match special characters like / \ . ' "
In short, I would like a regex that can match the following:
may contain lowercase
may contain uppercase
may contain a number
may contain space
may contain / \ . ' "
I am making a PHP script to check whether a certain string consists of the above or not, like a validation check.
The regular expression you are looking for is
^[a-z A-Z0-9\/\\.'"]+$
Remember if you are using PHP you need to use \ to escape the backslashes and the quotation mark you use to encapsulate the string.
In PHP using preg_match it should look like this:
preg_match("/^[a-z A-Z0-9\\/\\\\.'\"]+$/",$value);
This is a good place to find the regular expressions you might want to use.
http://regexpal.com/
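You can always escape them by putting a \ in front of the special characters. Anchor the pattern so the whole string is checked, and use single quotes so PHP does not swallow the backslashes.
try this:
preg_match('/^[A-Za-z0-9 \/\\\\.\'"]+$/', ...)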
NikoRoberts is 100% correct.
I would only add the following suggestion: when creating a PHP regex pattern string, always use single quotes. Far fewer characters need escaping: only the single quote and the backslash itself (and the backslash only strictly needs escaping when it precedes a single quote or another backslash, or appears at the end of the string).
When dealing with backslash soup, it helps to print out the (interpreted) regex string. This shows you exactly what is being presented to the regex engine.
Also, a "number" might have an optional sign? Yes? Here is my solution (in the form of a tested script):
<?php // test.php 20110311_1400
$data_good = 'abcdefghijklmnopqrstuvwxyzABCDE'.
'FGHIJKLMNOPQRSTUVWXYZ0123456789+- /\\.\'"';
$data_bad = 'abcABC012~!###$%^&*()';
$re = '%^[a-zA-Z0-9+\- /\\\\.\'"]*$%';
echo($re ."\n");
if (preg_match($re, $data_good)) {
echo("CORRECT: Good data matches.\n");
} else {
echo("ERROR! Good data does NOT match.\n");
}
if (preg_match($re, $data_bad)) {
echo("ERROR! Bad data matches.\n");
} else {
echo("CORRECT: Bad data does NOT match.\n");
}
?>
The following regex will match a single character that fits the description you gave:
[a-zA-Z0-9\ \\\/\.\'\"]
If your point is to ensure that ONLY characters in this range are used in your string, then you can use the negation of this, which would be:
[^a-zA-Z0-9\ \\\/\.\'\"]
In the second case, you could use your regex to find the bad stuff (that you don't want to be included), and if it didn't find anything then your string pattern must be kosher, because I'm assuming that if you find one character that is not in the proper range, then your string is not valid.
so to put it in PHP syntax:
$regex = '~[^a-zA-Z0-9 \\\\/.\'"]~';
if (preg_match($regex, ...)) {
// handle the bad stuff
}
Edit 1:
I had completely ignored the fact that backslashes are special in PHP string literals and that the pattern needs delimiters. The code above is the corrected version: it is single-quoted and uses ~ as the delimiter, so only the backslash and the single quote need escaping. If you prefer a double-quoted string, the equivalent is:
$regex = "~[^a-zA-Z0-9 \\\\/.'\"]~";
If that doesn't work, echo the string to see exactly what the regex engine receives; that makes it easy to work out which backslashes still need to be escaped.
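As a rough sketch of the "look for the bad stuff" approach wrapped in a function (the function name is invented for this example, and ~ is used as the delimiter so the forward slash needs no escaping):

```php
<?php
// Hypothetical helper: true if $value contains any character outside the
// allowed set (letters, digits, space, slash, backslash, dot, both quotes).
function has_disallowed_chars(string $value): bool {
    // Single-quoted PHP string: \\\\ reaches the regex engine as \\,
    // which matches one literal backslash inside the character class.
    return preg_match('~[^a-zA-Z0-9 /\\\\.\'"]~', $value) === 1;
}

var_dump(has_disallowed_chars('a/b\\c.d "e" \'f\''));  // bool(false)
var_dump(has_disallowed_chars('abc#'));                // bool(true)
```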

How to check if a string is in an array?

I basically need a function to check whether a string's characters (each character) is in an array.
My code isn't working so far, but here it is anyway,
$allowedChars = array("a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"," ","A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z"," ","0","1","2","3","4","5","6","7","8","9"," ","#",".","-","_","+"," ");
$input = "Test";
$input = str_split($input);
if (in_array($input,$allowedChars)) {echo "Yep, found.";}else {echo "Sigh, not found...";}
I want it to say 'Yep, found.' if one of the letters in $input is found in $allowedChars. Simple enough, right? Well, that doesn't work, and I haven't found a function that will search a string's individual characters for a value in an array.
By the way, I want it to be just those array's values, I'm not looking for fancy html_strip_entities or whatever it is, I want to use that exact array for the allowed characters.
You really should look into regex and the preg_match function: http://php.net/manual/en/function.preg-match.php
But, this should make your specific request work:
$allowedChars = array("a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"," ","A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z"," ","0","1","2","3","4","5","6","7","8","9"," ","#",".","-","_","+"," ");
$input = "Test";
$input = str_split($input);
$message = "Sigh, not found...";
foreach($input as $letter) {
if (in_array($letter, $allowedChars)) {
$message = "Yep, found.";
break;
}
}
echo $message;
Are you familiar with regular expressions at all? It's sort of the more accepted way of doing what you're trying to do, unless I'm missing something here.
Take a look at preg_match(): http://php.net/manual/en/function.preg-match.php
To address your example, here's some sample code (UPDATED TO ADDRESS ISSUES IN COMMENTS):
$subject = "Hello, this is a string";
$pattern = '/^[a-zA-Z0-9 #._+-]*$/'; // include all the symbols you want to match here
if (preg_match($pattern, $subject))
echo "Yep, matches";
else
echo "Doesn't match :(";
A little explanation of the regex: the '^' matches the beginning of the string, the '[a-zA-Z0-9 #._+-]' part means "any character in this set", the '*' after it means "zero or more of the last thing", and finally the '$' at the end matches the end of the string.
A somewhat different approach:
$allowedChars = array("a","b","c","d","e");
$char_buff = str_split("Test"); // explode() cannot split on an empty delimiter
$foundTheseOnes = array_intersect($char_buff, $allowedChars);
if(!empty($foundTheseOnes)) {
echo 'Yep, something was found. Let\'s find out what: <br />';
print_r($foundTheseOnes);
}
Validating the characters in a string is most appropriately done with string functions. preg_match() is the most direct/elegant method for this task.
Code:
$input="Test Test Test Test";
if(preg_match('/^[\w +.#_-]*$/',$input)){
echo "Input string does not contain any disallowed characters";
}else{
echo "Input contains one or more disallowed characters";
}
// output: Input string does not contain any disallowed characters
Pattern Explanation:
/ # start pattern
^ # start matching from start of string
[\w +.#_-] # match: a-z, A-Z, 0-9, underscore, space, plus, dot, hash sign, hyphen
* # zero or more occurrences
$ # match until end of string
/ # end pattern
Significant points:
The ^ and $ anchors are crucial to ensure that the entire string is validated versus just a substring of the string.
The \w (a.k.a. "any word character" -> a shorthand character class) is the easy way to write: [a-zA-Z0-9_]
The . dot character loses its "match anything (almost)" meaning and becomes literal when it is written inside of a character class. No escaping slash is necessary.
The hyphen inside of a character class can be written without an escaping slash (\-) so long as it is positioned at the start or end of the character class. If the hyphen is not at the start/end and it is not escaped, it will create a range of characters between the characters on either side of it. Like it or not, [.-z] will not match a hyphen symbol because it does not exist "between" the dot character and the lowercase letter z on the ASCII table.
The * that follows the character class is the "quantifier". The asterisk means "0 or more" of the preceding character class. In this case, this means that preg_match() will allow an empty string. If you want to deny an empty string, you can use + which means "1 or more" of the preceding character class. Finally, you can be far more specific about string length by using a number or numbers in a curly bracketed expression.
{8} would mean the string must be exactly 8 characters long.
{4,} would mean the string must be at least 4 characters long.
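{0,10} would mean the string length must be between 0 and 10 characters (write the 0 explicitly; older PCRE versions treat {,10} as literal text rather than a quantifier).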
{5,9} would mean the string length must be between 5 and 9 characters.
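As an illustration of the length quantifiers, here is a small sketch that reuses the character class from the pattern above and limits the length to 5 to 9 characters. The function name and the length limits are just examples, and the D modifier is added so that $ does not accept a trailing newline.

```php
<?php
// Hypothetical helper: only allowed characters, and 5 to 9 characters long.
function is_valid_input(string $input): bool {
    return preg_match('/^[\w +.#-]{5,9}$/D', $input) === 1;
}

var_dump(is_valid_input('Test'));            // bool(false): too short
var_dump(is_valid_input('Test #1'));         // bool(true)
var_dump(is_valid_input('Test Test Test'));  // bool(false): too long
var_dump(is_valid_input('Test@123'));        // bool(false): @ is not allowed
```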
All of that advice aside, if you absolutely must use your array of characters AND you wanted to use a loop to check individual characters against your validation array (and I certainly don't recommend it), then the goal should be to reduce the number of array elements involved so as to reduce total iterations.
Your $allowedChars array has multiple elements that contain the space character, but only one is necessary. You should prepare the array using array_unique() or a similar technique.
str_split($input) will run the chance of generating an array with duplicate elements. For example, if $input="Test Test Test Test"; then the resultant array from str_split() will have 19 elements, 14 of which will require redundant validation checks.
You could probably eliminate redundancies from str_split() by calling count_chars($input,3) and feeding that to str_split() or alternatively you could call str_split() then array_unique() before performing the iterative process.
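If the array really has to be used, one possible sketch is to de-duplicate the input characters and then let array_diff() report anything that is not allowed. The function name is invented for this example, and the allowed list is built with str_split() from the same characters as the question's array.

```php
<?php
// Hypothetical helper: true if every character of $input is in $allowedChars.
function only_allowed_chars(string $input, array $allowedChars): bool {
    if ($input === '') {
        return true; // nothing to validate
    }
    // Check each distinct character only once.
    $unique = array_unique(str_split($input));
    // array_diff() returns the characters that are missing from $allowedChars.
    return count(array_diff($unique, $allowedChars)) === 0;
}

$allowedChars = str_split(
    'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 #.-_+'
);

var_dump(only_allowed_chars('Test Test Test Test', $allowedChars));  // bool(true)
var_dump(only_allowed_chars('Test!', $allowedChars));                // bool(false)
```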
Because you're just validating a string, see preg_match() and other PCRE functions for handling this instead.
Alternatively, you can use strcspn() to do...
$check = "abcde...."; // fill in the rest of the characters
$test = "Test";
echo ((strcspn($test, $check) === strlen($test)) ? "Sigh, not found..." : 'Yep, found.');
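The strcspn() line above answers "does at least one allowed character occur?". If the goal is instead that every character must be allowed, the sibling function strspn() can be used in a similar way. A small sketch, with an invented function name and the allowed characters taken from the question's array:

```php
<?php
// Hypothetical helper: true if $test consists only of characters in $check.
function all_chars_allowed(string $test, string $check): bool {
    // strspn() returns the length of the leading run of $test that is made
    // up entirely of characters from $check.
    return strspn($test, $check) === strlen($test);
}

$check = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 #.-_+';

var_dump(all_chars_allowed('Test', $check));   // bool(true)
var_dump(all_chars_allowed('Test!', $check));  // bool(false)
```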

PHP Regular Expression. Check if String contains ONLY letters

In PHP, how do I check if a String contains only letters? I want to write an if statement that will return false if there is (white space, number, symbol) or anything else other than a-z and A-Z.
My string must contain ONLY letters.
I thought I could do it this way, but I'm doing it wrong:
if( ereg("[a-zA-Z]+", $myString))
return true;
else
return false;
How do I find out if myString contains only letters?
Yeah this works fine. Thanks
if(myString.matches("^[a-zA-Z]+$"))
Never heard of ereg, but I'd guess that it will match on substrings.
In that case, you want to include anchors on either end of your regexp so as to force a match on the whole string:
"^[a-zA-Z]+$"
Also, you could simplify your function to read
return ereg("^[a-zA-Z]+$", $myString);
because the if to return true or false from what's already a boolean is redundant.
Alternatively, you could match on any character that's not a letter, and return the complement of the result:
return !ereg("[^a-zA-Z]", $myString);
Note the ^ at the beginning of the character set, which inverts it. Also note that you no longer need the + after it, as a single "bad" character will cause a match.
Finally... this advice is for Java because you have a Java tag on your question. But the $ in $myString makes it look like you're dealing with, maybe Perl or PHP? Some clarification might help.
Your code looks like PHP. It would return true if the string has a letter in it. To make sure the string has only letters you need to use the start and end anchors:
In Java you can make use of the matches method of the String class:
boolean hasOnlyLetters(String str) {
return str.matches("^[a-zA-Z]+$");
}
In PHP, the ereg function was deprecated in PHP 5.3 and removed in PHP 7. Use preg_match as the replacement. The PHP equivalent of the above function is:
function hasOnlyLetters($str) {
// preg_match returns 1 on a match and 0 otherwise; compare to get a boolean
return preg_match('/^[a-z]+$/i', $str) === 1;
}
I'm going to be different and use Character.isLetter definition of what is a letter.
if (myString.matches("\\p{javaLetter}*"))
Note that this matches more than just [A-Za-z]*.
A character is considered to be a letter if its general category type, provided by Character.getType(ch), is any of the following: UPPERCASE_LETTER, LOWERCASE_LETTER, TITLECASE_LETTER, MODIFIER_LETTER, OTHER_LETTER
Not all letters have case. Many characters are letters but are neither uppercase nor lowercase nor titlecase.
The \p{javaXXX} character classes are defined in the Pattern API.
Alternatively, try checking if it contains anything other than letters: [^A-Za-z]
The easiest way to do a "is ALL characters of a given type" is to check if ANY character is NOT of the type.
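So if \W denoted a non-letter, you would just check for one of those. (In most regex flavors \W means "non-word character", and digits and the underscore still count as word characters, so [^A-Za-z] is the safer check when only letters are allowed.)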
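For the PHP side of the question, the two styles discussed above (anchored match versus searching for a single bad character) could be sketched as follows. The function names are invented for this example, and the Unicode variant assumes the input is valid UTF-8.

```php
<?php
// Hypothetical helper: anchored match, ASCII letters only, at least one.
function has_only_ascii_letters(string $s): bool {
    return preg_match('/^[a-zA-Z]+$/D', $s) === 1;
}

// Hypothetical helper: same idea for any Unicode letter (\p{L}), assuming
// $s is valid UTF-8; the u modifier switches the pattern to UTF-8 mode.
function has_only_unicode_letters(string $s): bool {
    return preg_match('/^\p{L}+$/Du', $s) === 1;
}

// Hypothetical helper: complement style, true if any non-letter is found.
function has_non_letter(string $s): bool {
    return preg_match('/[^a-zA-Z]/', $s) === 1;
}

var_dump(has_only_ascii_letters('Hello'));         // bool(true)
var_dump(has_only_ascii_letters('Hello World'));   // bool(false): space
var_dump(has_only_unicode_letters("\u{DC}ber"));   // bool(true)
var_dump(has_non_letter('abc1'));                  // bool(true)
```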
