Check string for high ascii, punctuation, other weird characters - php

I've got a string called $ID coming in from a different page and hitting base64_decode($enc); and want to check it for any weird characters. $ID when decoded should only contain letters, numbers, underscores and dashes.
I've had a bit of a look at preg_replace('/[\x80-\xFF]/', '', $string); which cuts out some weird characters---which is helpful---but I can still see sometimes that # signs and brackets and stuff still make it in.
Is there a way I can lower the ascii test? Or how else do I cut out everything except letters, numbers, underscores and dashes?
Any help at pointing me in the right direction is wonderful and thanks!
$enc = $_GET["key"];
$ID= base64_decode($enc);
if (empty($enc)) { echo "key is empty"; } else {
echo "string ok<br>";
$check = preg_replace('/[\x80-\xFF]/', '', $ID);
echo $check;
// i can see this step is helping cut junk out, do more tests from here
}

Typing a caret after the opening square bracket negates the character class, so you can do:
$check = preg_replace('/[^A-Za-z0-9_-]/', '', $ID);

You can use this replacement:
$check = preg_replace('~[^[:word:]-]+~', '', $ID);
The [:word:] character class contains letters, digits and the underscore.
To make the string lowercase, use strtolower()
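If the goal is to reject a bad key rather than silently clean it, the same whitelist can be used with preg_match(). The following is only a sketch under assumptions: decodeId() is a hypothetical helper name, and it assumes that an ID containing any other character should be refused outright.

```php
<?php
// Hypothetical helper: returns the decoded ID, or null if it is missing or malformed.
function decodeId(?string $enc): ?string
{
    if ($enc === null || $enc === '') {
        return null;
    }
    // Strict mode makes base64_decode() return false for input outside the base64 alphabet.
    $id = base64_decode($enc, true);
    if ($id === false) {
        return null;
    }
    // Accept only letters, digits, underscores and dashes; \A and \z anchor the whole string.
    return preg_match('/\A[A-Za-z0-9_-]+\z/', $id) === 1 ? $id : null;
}

var_dump(decodeId(base64_encode('user_42-a'))); // the decoded string "user_42-a"
var_dump(decodeId(base64_encode('user#42')));   // NULL, because of the #
```

In the question's code this would be called as decodeId($_GET['key'] ?? null).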

Related

Allow apostrophe when using ctype_alpha?

I know there's are other ways to do it, but I'm playing with validating a name field using ctype_alpha but allowing spaces, hyphens, and apostrophes.
Per another post on Stack, I was able to add the spaces no problem, but I'm thinking I have the syntax wrong for replacing multiple characters.
What I've used so far that works for validating that only letters and spaces are allowed:
if (ctype_alpha(str_replace(' ', '', $name)) === false) {
echo'Name must contain letters and spaces only';
exit;
}
This removes any spaces before checking that the string is letters only. I was looking to simply add to this to also allow hyphens and apostrophes.
What I've tried for adding hyphens and/or apostrophes (doesn't work):
if (ctype_alpha(str_replace(' ', '', '-', '', $name)) === false) {
echo'Name must contain letters and spaces only';
exit;
}
My guess is that adding a second string in the str_replace function is not proper syntax, but being a PHP newb, I'm having a hard time figuring out how to phrase my searches to find the correct syntax.
Also, am I correct in saying that '\w' will cover my apostrophes once I figure out the correct syntax for the str_replace function?
Genuinely appreciate the help guys. You're all invaluable and I try hard not to abuse it.
The proper syntax, as stated in the manual, is to pass an array of search strings:
if (ctype_alpha(str_replace(array(' ', '', '-'), '', $name)) === false) {
echo 'Name must contain letters, spaces and hyphens only';
exit;
}
With apostrophe:
if (ctype_alpha(str_replace(array(' ', '', '-',"'"), '', $name)) === false) {
echo 'Name must contain letters, spaces, hyphens and apostrophes only';
exit;
}
'\w' will not cover the apostrophe, i.e. the single quote character. It is also regex syntax, which str_replace() and ctype_alpha() do not interpret. Per the manual:
\w Any word character (letter, number, underscore)
As for the syntax in the OP's code, the primary issue is needing to have an array of characters for the first parameter of str_replace() in order to replace multiple characters.
In addition to enclosing a single quote in double quotes ("'"), PHP permits escaping the single quote character with a backslash and then enclosing it in single quotes, as the following snippet indicates:
<?php
$name = "Kate O'Henry-Smith";
$arrDelChars = [' ','\'','-'];
if ( ctype_alpha( str_replace( $arrDelChars, '', $name ) ) === false ) {
exit( "Name must contain letters, spaces, apostrophes and hyphens only\n" );
}
print_r($name);
str_replace() removes every character listed in the array from $name by replacing it with the empty string. Note that specifying the empty string in the array is needless, since searching for an empty string changes nothing. The resulting string, not $name itself, becomes the argument to ctype_alpha(), which returns true. Consequently, the if-condition evaluates to false and no error message is displayed. It is a neat trick for letting ctype_alpha() validate $name, so to speak.
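The strip-then-check idea can be wrapped in a small function so the allowed punctuation lives in one place. This is a sketch only: isPlainName() is a hypothetical name, and the list of allowed separators is an assumption taken from the question.

```php
<?php
// Hypothetical helper: true if $name has only ASCII letters, spaces, hyphens and apostrophes.
function isPlainName(string $name): bool
{
    // Remove the allowed separators, then check that only letters remain.
    $letters = str_replace([' ', '-', "'"], '', $name);
    // ctype_alpha('') is false, so a name made only of separators is rejected.
    return ctype_alpha($letters);
}

var_dump(isPlainName("Kate O'Henry-Smith")); // bool(true)
var_dump(isPlainName('K4te'));               // bool(false)
```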

PHP letters and spaces only validation

I'm validating my contact form using PHP and I've used the following code:
if (ctype_alpha($name) === false) {
$errors[] = 'Name must only contain letters!';
}
This code works fine, but it over-validates and doesn't allow spaces. I've tried ctype_alpha_s, and that comes up with a fatal error.
Any help would be greatly appreciated
Regex is arguably overkill for such a simple task, and is likely to be slower; consider using native string functions:
if (ctype_alpha(str_replace(' ', '', $name)) === false) {
$errors[] = 'Name must contain letters and spaces only';
}
This will strip spaces prior to running the alpha check. If tabs and new lines are an issue you could consider using this instead:
str_replace(array("\n", "\t", ' '), '', $name);
ctype_alpha only checks for the letters [A-Za-z].
If you want to use it for your purpose, you will first have to remove the spaces from your string and then apply ctype_alpha.
But I would go for preg_match to check for validation. You can do something like this:
if ( !preg_match ("/^[a-zA-Z\s]+$/",$name)) {
$errors[] = "Name must only contain letters!";
}
One for the UTF-8 world that will match spaces and letters from any language.
if (!preg_match('/^[\p{L} ]+$/u', $name)){
$errors[] = 'Name must contain letters and spaces only!';
}
Explanation:
[] => character class definition
\p{L} => matches any kind of letter character from any language
Space after the \p{L} => matches a literal space
+ => Quantifier: matches one or more times (greedy)
/u => Unicode modifier. The pattern and subject strings are treated as UTF-8, and
escape sequences such as \p{L} match Unicode characters
This will also match names like Björk Guðmundsdóttir as noted in a comment by Anthony Hatzopoulos above.
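To see the Unicode pattern in action, here is a minimal sketch. The function name isUnicodeName() is hypothetical, and allowing apostrophes and hyphens as well as spaces is an assumption added for illustration. The input is assumed to be UTF-8.

```php
<?php
// Hypothetical helper: letters from any script plus space, apostrophe and hyphen.
function isUnicodeName(string $name): bool
{
    // Under /u, preg_match() returns false for invalid UTF-8 input,
    // so compare against 1 instead of relying on truthiness.
    return preg_match('/\A[\p{L} \'-]+\z/u', $name) === 1;
}

var_dump(isUnicodeName('Björk Guðmundsdóttir')); // bool(true)
var_dump(isUnicodeName('R2D2'));                 // bool(false)
```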

php validate string with preg_match

I am trying to verify in PHP with preg_match that an input string contains only "a-z, A-Z, -, _ ,0-9" characters. If it contains just these, then validate.
I tried to search on Google but I could not find anything useful.
Can anybody help?
Thank you !
Use the pattern '/^[A-Za-z0-9_-]*$/', if an empty string is also valid. Otherwise '/^[A-Za-z0-9_-]+$/'
So:
$yourString = "blahblah";
if (preg_match('/^[A-Za-z0-9_-]*$/', $yourString)) {
#your string is good
}
Also, note that the '-' goes last in the character class so that it is read as a literal '-' and not as a range operator like the hyphen in A-Z.
$data = 'abc123-_';
echo preg_match('/^[\w-]+$/', $data); // matches, outputs 1
$data = 'abc..';
echo preg_match('/^[\w-]+$/', $data); // does not match, outputs 0
You can use preg_replace($pattern, $replacement, $subject):
if (preg_replace('/[A-Za-z0-9_-]/', '', $string)) {
echo "Invalid character detected in the string";
}
The idea is to remove every valid character; if the result is NOT empty, the string contains an invalid character and the code runs.
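A detail that is easy to miss with character classes: inside [...] the | character is not alternation but a literal pipe, so a class written as [\w|\-] also accepts |. A minimal check (the variable names are just for illustration):

```php
<?php
$input = 'abc|def';

// The pipe inside the class is a literal character, so this matches.
$withPipe = preg_match('/^[\w|\-]+$/', $input);

// Without the pipe, only word characters and the hyphen are accepted.
$withoutPipe = preg_match('/^[\w-]+$/', $input);

var_dump($withPipe);    // int(1)
var_dump($withoutPipe); // int(0)
```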

PHP remove everything except letters and a hyphen (-)

I'm making a form that asks for the user's first and last name, and I don't want them entering
$heil4
I would like them to enter
Sheila
I know how to filter out everything except letters, but I'm aware that some names can have
Sheila-McDonald
So how would I remove everything from a string apart from letters and a hyphen?
Simply use
$s = preg_replace("/[^a-z-]/i", "", $s);
or if you want to convert some non-ASCII characters to ASCII, such as Jean-Rémy to Jean-Remy (the exact transliteration depends on the iconv implementation and locale), then use
$s = preg_replace("/[^a-z-]/i", "", iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $s));
Instead of replacing with nothing, have some fun: map the digits and symbols back to the letters they stand for, so that a name written mostly in numbers gets decoded ;p
$name = '$h3il4-McD0nald';
$find = array(0,1,3,4,5,6,7,'$');
$replace = array('o','l','e','a','s','g','t','s');
$name = str_replace($find,$replace,$name);
//Sheila-McDonald
echo ucfirst(preg_replace('/[^a-z-]/i', '', $name));
$new = preg_replace('#[^A-Z-]#iu', '', $data);
but instead of removing characters (and thus modifying the user's input), it is better to validate it
and show an error if the input is not valid. This way the user knows that the value he entered is exactly the value you have
if(!preg_match('#^[A-Z-]+$#iu', $data)) echo 'invalid';
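Use this to strip out everything except letters (including the accented Latin letters in the range À-ÿ), whitespace and dash punctuation. The u modifier is needed so that À-ÿ and \p{Pd} are matched as Unicode characters rather than as bytes.
$strtochange= preg_replace("/[^\s\p{Pd}a-zA-ZÀ-ÿ]/u",'',$strtochange);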
Note: this will turn $heil4 into heil.
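A variant of the same idea, sketched under assumptions: it swaps the explicit a-zA-ZÀ-ÿ ranges for \p{L} (any Unicode letter), which is a substitution and not what the answer above uses. stripNameChars() is a hypothetical name, and the input is assumed to be UTF-8.

```php
<?php
// Hypothetical helper: keep letters, whitespace and dash punctuation; drop the rest.
function stripNameChars(string $s): string
{
    // preg_replace() returns null on error (e.g. invalid UTF-8 under /u),
    // so fall back to an empty string in that case.
    return preg_replace('/[^\s\p{Pd}\p{L}]/u', '', $s) ?? '';
}

echo stripNameChars('$heil4'), "\n";         // heil
echo stripNameChars('Jean-Rémy 3rd!'), "\n"; // Jean-Rémy rd
```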

PHP preg_replace replacing numbers along with special chars

I have the following PHP code to remove special characters from a variable;
<?php
$name = "my%^$##name8";
$patterns = array( '/\s+/' => '_', '/&/' => 'and', '/[^[:alpha:]]+/' => '_');
$name2 = preg_replace(array_keys($patterns), array_values($patterns), trim($name));
echo $name2;
?>
But along with the special chars, the numbers are also getting replaced with underscores. I want to keep the numbers in the result. How can I fix this?
Your third pattern, /[^[:alpha:]]+/ is replacing everything that's not a letter with an underscore. So add numbers to it, like /[^[:alpha:]0-9]+/
Replace '/[^[:alpha:]]+/' with '/[^[:alpha:][:digit:]]+/'. The original is replacing anything that's not an alphabetic character. Adding [:digit:] means it will replace anything that's not a letter or a number, so your numbers will be preserved as well.
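Putting the corrected third pattern back into the question's code gives something like the sketch below. Wrapping it in a function and the name cleanName() are assumptions made so that the result is easy to check.

```php
<?php
// Hypothetical wrapper around the question's code, using the corrected third pattern.
function cleanName(string $name): string
{
    $patterns = [
        '/\s+/' => '_',                            // runs of whitespace become one underscore
        '/&/' => 'and',                            // ampersand becomes the word "and"
        '/[^[:alpha:][:digit:]]+/' => '_',         // anything else that is not a letter or digit
    ];
    // Patterns are applied in order, each to the result of the previous one.
    return (string) preg_replace(array_keys($patterns), array_values($patterns), trim($name));
}

echo cleanName('my%^$##name8'), "\n";  // my_name8
echo cleanName('Tom & Jerry 2'), "\n"; // Tom_and_Jerry_2
```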
