Completely confused about this simple regex in preg_match - php

I'm trying to validate some input, but I'm getting reverse results (or maybe I have completely missed the use of the preg_match function?)
This is my code:
$check_firstname = "##";
if(preg_match('/[^A-Za-z0-9_]/', $check_firstname)) echo "valid"; else echo "invalid";
I would have thought that this regex in preg_match would allow only alphanumeric characters, but this input gives an output of valid! If the input (in this case, check_firstname) is NOT alphanumeric, why is preg_match finding a match? I've already checked the documentation but I don't understand what's happening.
My real requirement is to allow the dash character (-) along with alphanumeric characters in the user input, but I don't see how to get anywhere when my basic test seems not to be working.
Edit
Thank you for your responses! I get the problem with the caret now. However, I forgot to mention that apart from dashes, I need to allow spaces as well.

You're using the ^ symbol, which means "not" when it's the first character inside brackets. The \s matches any whitespace character, including tabs and spaces. Note: the hyphen needs to go last so it isn't treated as part of a range.
if (preg_match('/^[A-Z0-9\s-]+$/i', $check_firstname)) {
    echo "valid";
} else {
    echo "invalid";
}

You simply put the caret (^) in the wrong position :)
[^...] matches any character not found between the square brackets.
^ alone means the beginning of the string.
[^A-Za-z0-9_] has no anchors, so the regex matches as soon as any single character in the string is not alphanumeric or an underscore. That is why "##" produces a match.
To allow only alphanumeric characters and a dash, you will use:
^[A-Za-z0-9-]+$
The ^ (start of string) and $ (end of string) anchors, together with + (one or more times), ensure that the whole string is checked.
Note that this regex doesn't allow underscores (_). If you need to allow them, you can use the regex in Casimir's answer.
To allow spaces as well:
^[A-Za-z0-9\s-]+$
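A minimal sketch of how the anchored pattern above could be wrapped for reuse. The function name isValidFirstName is made up for illustration and is not part of the question:

```php
<?php
// Hypothetical helper: true when $name is non-empty and contains only
// letters, digits, whitespace, and hyphens.
function isValidFirstName(string $name): bool
{
    // ^ and $ anchor the match to the whole string; the hyphen sits last
    // in the class so it is read literally rather than as a range.
    return preg_match('/^[A-Za-z0-9\s-]+$/', $name) === 1;
}

var_dump(isValidFirstName('Mary-Ann Smith')); // bool(true)
var_dump(isValidFirstName('##'));             // bool(false)
var_dump(isValidFirstName(''));               // bool(false), + needs one character
```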

Try this:
if(preg_match('/^[A-Za-z0-9_]+$/', $check_firstname)===1){ echo 'Valid'; } else { echo 'Invalid'; }
The ^ character was simply in the wrong place. I also added + and the $ symbol so that the match has to run to the end of the string.
You should also use a strict (type and value) comparison, as this function can return 3 different values:
(int)1 = Match found
(int)0 = Match not found
(bool)false = Error
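To illustrate why the strict comparison matters, here is a small sketch that tells the three return values apart. The helper name describeMatch and the deliberately broken third pattern are assumptions made for the example:

```php
<?php
// Hypothetical helper: maps preg_match's three return values to a label.
function describeMatch(string $pattern, string $subject): string
{
    // @ suppresses the warning PHP raises when a pattern fails to compile.
    $result = @preg_match($pattern, $subject);
    if ($result === false) {
        return 'error';
    }
    return $result === 1 ? 'valid' : 'invalid';
}

echo describeMatch('/^[A-Za-z0-9_]+$/', 'john_doe'), "\n"; // valid
echo describeMatch('/^[A-Za-z0-9_]+$/', '##'), "\n";       // invalid
echo describeMatch('/^[A-Za-z0-9_+$/', 'john_doe'), "\n";  // error (class never closed)
```

With a loose check such as if (!preg_match(...)), the error case and the no-match case would be indistinguishable.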

You could use this:
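if(preg_match('/^[\w -]+$/', $check_firstname)) echo "valid"; else echo "invalid";
The hyphen goes last in the class; a hyphen placed directly after \w and before another character can fail to compile as an invalid range on newer PHP versions.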

Why not stick with your regex and change the interpretation?
if(preg_match('/[^A-Za-z0-9_]/', $check_firstname)) echo "invalid"; else echo "valid";
If the regex matches, you know that $check_firstname contains at least one character that is not a letter, a digit, or an underscore, and that makes it invalid.
qed

Related

PHP Regular Expression special characters

I am having difficulty validating the file name format below. I need the regular expression to accept this input:
$pattern = '/^\w[\w\s\.\%\-\(\)\[\]]*$/u';
$file_name = "(00)filename.jpg";
if(preg_match($pattern,$file_name)){
echo "Pattern matched";
}else {
echo "Pattern not matched";
}
I have tried several ways. But, the main problem is do not write the own pregmatch, instead need to modify the existing one which accepts the brackets().
So this should match (00)filename.jpg and does not, because your regex requires the string to ^\w start with a word-character. You can add optional parenthesized \w+ to the start:
^(?:\(\w+\)|\w)[-\w\s.%()[\]]*$
Also, put the hyphen - at the start or end of the character class (or escape it); otherwise it expresses a range. The closing ] must be escaped inside the character class as well.
But possibly, you just want to check:
if there's at least one word-character in the string.
the string consists only of [-\w\s.%()[\]]
If so, use a lookahead to check for the \w:
^(?=.*?\w)[-\w\s.%()[\]]+$
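A quick way to compare the original pattern with the two suggestions is to run them side by side. This is only a sketch: the variable names are invented, and the /u flag is carried over from the pattern in the question.

```php
<?php
// Patterns copied from the question and the answer; names are hypothetical.
$original  = '/^\w[\w\s\.\%\-\(\)\[\]]*$/u';
$prefixed  = '/^(?:\(\w+\)|\w)[-\w\s.%()[\]]*$/u';
$lookahead = '/^(?=.*?\w)[-\w\s.%()[\]]+$/u';

$file_name = '(00)filename.jpg';

var_dump(preg_match($original, $file_name));  // int(0): must start with \w
var_dump(preg_match($prefixed, $file_name));  // int(1)
var_dump(preg_match($lookahead, $file_name)); // int(1)

// The lookahead variant only requires one word character somewhere.
var_dump(preg_match($prefixed, '(filename.jpg'));  // int(0)
var_dump(preg_match($lookahead, '(filename.jpg')); // int(1)
var_dump(preg_match($lookahead, '().'));           // int(0): no word character
```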

Using Preg Match to check if string contains an underscore

I am trying to check if a string contains an underscore. Can anyone explain what's wrong with the following code?
$str = '12_322';
if (preg_match('/^[1-9]+[_]$/', $str)) {
echo 'contains number underscore';
}
In your regex, [_]$ means the underscore must be at the end of the string. That's why it does not match your input.
If you only want to check for an underscore anywhere in the string, then:
if (preg_match('/_/', $str)) {
If you want to check that the string consists only of digits and underscores, then:
if (preg_match('/^[1-9_]+$/', $str)) { // 1-9 as in your pattern (note that this excludes 0)
But for your sample input 12_322, this one can be handy too:
if (preg_match('/^[1-9]+_[1-9]+$/', $str)) {
You need to take $ out, since the underscore is not the last character in your input. You can try this regex:
'/^[1-9]+_/'
PS: Underscore is not a special regex character and hence doesn't need to be in a character class.
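The patterns discussed above can be compared against the sample input in one go. A small sketch; the $patterns array and its keys are invented for the example:

```php
<?php
$str = '12_322';

// Hypothetical labels mapped to the patterns from the question and answers.
$patterns = [
    'original'         => '/^[1-9]+[_]$/',     // underscore must be last
    'anywhere'         => '/_/',
    'digits_or_unders' => '/^[1-9_]+$/',
    'digits_us_digits' => '/^[1-9]+_[1-9]+$/',
    'prefix_only'      => '/^[1-9]+_/',
];

$results = [];
foreach ($patterns as $label => $pattern) {
    $results[$label] = preg_match($pattern, $str);
    echo $label, ': ', $results[$label], "\n";
}
// Only "original" prints 0; the other four print 1.

// [1-9] does not include 0, so an input containing a zero is rejected.
var_dump(preg_match('/^[1-9_]+$/', '10_20')); // int(0)
```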

REGEXP not catching some names correctly if certain values are at certain positions in the string

I have the following regex meant to test against valid name formats:
^[a-zA-Z]+(([\'\,\.\- ][a-zA-Z ])?[a-zA-Z]*)*$
it seems to work fine with all the expected odd name possibilities, including the following:
o'Bannon
Smith, Jr.
Double-barreled
I'm having a problem when I plug this into my PHP code. If the first character is a number it passes through as valid.
If the last character is a space, comma, full-stop or other special allowed character, it's failing as invalid.
My PHP code is :
$v = 'Tested Value';
$value = (filter_var($v, FILTER_VALIDATE_REGEXP,array("options"=>array("regexp"=>"^[a-zA-Z]+(([\'\,\.\-,\ ][a-zA-Z ])?[a-zA-Z]*)*$^"))));
if (strlen($value) <2 && strlen($v) !=0) {
return "not valid";
}
What am I doing wrong here?
^[a-zA-Z]+(([\'\,\.\-,\ ][a-zA-Z ])?[a-zA-Z]*)*$^
The carets (^) at the beginning and end of the regex are being interpreted as regex delimiters, not as anchors. The regex isn't really matching the digits at the beginning of the string; it's skipping over them so it can start matching at the first letter it finds. You can use almost any ASCII punctuation character as the regex delimiter, but most people use # or ~, which are relatively uncommon and have no special meaning in regexes.
As for not allowing punctuation at the end, that's how the regex is written. Specifically, [\'\,\.\- ][a-zA-Z ] requires that each apostrophe, comma, period or hyphen be followed by a letter or a space. If you really want to allow any of those characters at the end, it's pretty simple:
~^(?:[a-z]+[',. -]*)+$~i
Of course, that's not a particularly good regex for validating names, but I have nothing better to offer; it's a job for which regexes are particularly ill-suited. And do you really want to be the one to tell your users their own names are invalid?
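The delimiter effect can be reproduced with a much simpler pattern. In this sketch the helper matchesRegexp and the letters-only test patterns are made up to isolate the behaviour; only the last pattern is the one suggested above.

```php
<?php
// Hypothetical helper wrapping filter_var(); true when $value matches.
function matchesRegexp(string $value, string $regexp): bool
{
    $options = ['options' => ['regexp' => $regexp]];
    return filter_var($value, FILTER_VALIDATE_REGEXP, $options) !== false;
}

// Simplified letters-only patterns, used only to show the delimiter effect.
$caretDelimited = '^[a-zA-Z]+$^';  // ^ ... ^ are delimiters: no start anchor left
$tildeDelimited = '~^[a-zA-Z]+$~'; // ~ ... ~ are delimiters: ^ is an anchor

var_dump(matchesRegexp('1Smith', $caretDelimited)); // bool(true), digit skipped
var_dump(matchesRegexp('1Smith', $tildeDelimited)); // bool(false)
var_dump(matchesRegexp('Smith', $tildeDelimited));  // bool(true)

// The pattern suggested above, which allows trailing punctuation.
$namePattern = '~^(?:[a-z]+[\',. -]*)+$~i';
var_dump(matchesRegexp('Smith, Jr.', $namePattern)); // bool(true)
var_dump(matchesRegexp('1Smith', $namePattern));     // bool(false)
```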
Your regex is way too complex
/^[a-z]+[',. a-z-]*$/i
should do the same thing

How to include hypens and apostrophes for a PHP Regex?

I'm writing a password regex in PHP that should return false for any string that has at least one character that is not:
a lowercase letter a-z
an uppercase letter A-Z
a number 0-9
a whitespace " *"
a punctuation symbol :,.!().?";
So far I have this:
<?php
$password = 'azAZ0 giggles 9*":,.!() .?";';
$regex1 = '#^[a-zA-Z0-9" *":,.!().?";\']+$#i';
if (preg_match($regex1, $password)) {
echo "A match was found.";
} else {
echo "A match was not found.";
}
?>
Does this seem to be working as I intend it to, or do you see any glaring errors?
And what should I add to the regex so that it also accepts:
a hyphen -
Your regex is pretty close to the target, but not totally correct.
I would use this one:
$regex1 = '/^[a-z0-9 :,.!().?";\'-]+$/i';
Points of interest:
Moved the hyphen to the end of the list, so that it won't be mistaken for a character range delimiter
Included an apostrophe by escaping it with a backslash, as per PHP's string escaping rules
Removed the A-Z part since the regex includes the case-insensitive modifier
Replaced * (which in this context means "a space or an asterisk") with just a space -- if you want to also allow tabs and newlines as part of the password (unlikely), replace it with \s
You simply need to escape ' using \. Try this
$regex1 = '#^[a-zA-Z0-9" *":,.!-().?";\']+$#i';
And you already seem to have - in the regex.
Within a character class (denoted by square brackets in regex), a minus - between two characters introduces a range: [A-Z].
You have !-(, which is read as the range from ! to ( rather than as a literal hyphen, and therefore does not do what you think. Solution:
Move the - to the start or the end of the character class: [-A-Z...] / [A-Z...-]
Escape the -: [A-Z\-...]
The other question you ask is "How do I get a single quote into a PHP string?" and really has nothing to do with regex. But "escape it" is the answer, of course.
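The difference between a range and a literal hyphen is easy to check in isolation. A minimal sketch with single-character classes; the variable names are invented:

```php
<?php
$rangeClass   = '/^[!-(]$/';  // hyphen between ! and ( forms a range
$literalClass = '/^[!(-]$/';  // hyphen last: a literal character
$escapedClass = '/^[!\-(]$/'; // escaped hyphen: also literal

var_dump(preg_match($rangeClass, '#'));   // int(1): # lies between ! and (
var_dump(preg_match($rangeClass, '-'));   // int(0): - is outside that range
var_dump(preg_match($literalClass, '-')); // int(1)
var_dump(preg_match($literalClass, '#')); // int(0)
var_dump(preg_match($escapedClass, '-')); // int(1)
```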

Why doesn't this regular expression work with spaces?

How do I make the following regular expression accept only the symbols I want it to accept as well as spaces?
if(!preg_match('/^[A-Z0-9\/\'&,.-]*$/', $line))
{
die();
}
else
{
//execute the rest of the validation script
}
I want the user to only be able to enter A-Z, 0-9, forward slashes, apostrophes, ampersands, commas, periods, and hyphens into a given text field $line.
It currently will accept something along the lines of HAM-BURGER which is perfect, it should accept that. I run into an issue when the user wants to type HAM BURGER (<- note the space).
If I remove the ^ from the beginning and/or the $ from the end it will succeed if the user types in anything. My attempted remedy to this was to make the * into a + but then it will accept anything as long as the user puts in at least one of the acceptable characters.
Add the space to the character class:
if(!preg_match('/^[A-Z0-9\/\'&,. -]*$/', $line))
Yes, it's that simple.
Note that the space has to be inserted before the - because the hyphen is a metacharacter in a character class (unless it's the first or last character in said character class). Another option is to escape the hyphen like:
if(!preg_match('/^[A-Z0-9\/\'&,.\- ]*$/', $line))
The regex explained:
^ and $ are start and end of string anchors. They tell the regex engine that it has to match the whole string rather than just part of it.
[...] is a character class.
* is the zero-or-more repetition operator. This means it will accept an empty string. You can change it to + (one-or-more) so it rejects the empty string.
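Putting the pieces together, a sketch of the check with the space added and a switch between * and +. The function isAllowedLine and its $allowEmpty parameter are hypothetical, not part of the question:

```php
<?php
// Hypothetical helper: true when $line contains only A-Z, 0-9, /, ', &,
// comma, period, space, and hyphen.
function isAllowedLine(string $line, bool $allowEmpty = false): bool
{
    // * accepts the empty string, + requires at least one character.
    $quantifier = $allowEmpty ? '*' : '+';
    $pattern = '/^[A-Z0-9\/\'&,. -]' . $quantifier . '$/';
    return preg_match($pattern, $line) === 1;
}

var_dump(isAllowedLine('HAM-BURGER')); // bool(true)
var_dump(isAllowedLine('HAM BURGER')); // bool(true)
var_dump(isAllowedLine('HAM_BURGER')); // bool(false), underscore not allowed
var_dump(isAllowedLine(''));           // bool(false)
var_dump(isAllowedLine('', true));     // bool(true)
```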
This is a good reference for RegEx, though specifically for Perl:
http://www.cs.tut.fi/~jkorpela/perl/regexp.html
