regex validation not parsing string in if PHP - php

Here is the deal.
I am supposed to validate a Canadian postal code (capital letter, digit, capital letter, digit, capital letter, digit; for example A1B2C3 or G2V3V4), if one was entered.
I have this code (PHP):
//Create new SESSION variable to store a warning
$_SESSION['msg'] = "";
//IF empty do nothing, IF NOT empty parse, IF NOT match regex put message in msg
if(!preg_match('^([A-Z][0-9][A-Z][0-9][A-Z][0-9])?$^', $_POST['txtPostalCode']) && $_POST['txtPostalCode'] != "")
{
$_SESSION['msg'] .= "Warning invalide Postal Code";
}
Then the code goes on to display $_SESSION['msg'].
The problem is that whatever I enter in $_POST['txtPostalCode'], the value NEVER seems to be checked against the regex.

You made the entire capturing group optional:
^([A-Z][0-9][A-Z][0-9][A-Z][0-9])?$^
                                 ^
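It's also not a good idea to use a regex metacharacter as your delimiter: here PHP treats the ^ at each end as the delimiter, so it no longer anchors the start of the string. Try this regex, which matches an uppercase letter followed by a digit, three times: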
/^((?:[A-Z][0-9]){3})$/
You don't need to make the capturing group optional because you handle the logic for when the user doesn't submit a code with the && $_POST['txtPostalCode'] != "" part of the if statement.
Finally, since you're not even using the matches from this regex, you don't need the capturing group:
/^(?:[A-Z][0-9]){3}$/
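As a rough sketch of how the corrected check could look in context, the snippet below wraps the pattern in a small function. The function name postal_code_warning is made up for illustration, and returning the message instead of writing to $_SESSION is only an assumption that keeps the example self-contained.

```php
<?php
// Hypothetical helper: returns a warning message, or '' when the input is acceptable.
function postal_code_warning(string $input): string
{
    // An empty field is allowed, so skip the regex entirely.
    if ($input === '') {
        return '';
    }
    // Letter-digit pair repeated three times, anchored at both ends.
    if (preg_match('/^(?:[A-Z][0-9]){3}$/', $input) !== 1) {
        return 'Warning: invalid postal code';
    }
    return '';
}

var_dump(postal_code_warning(''));        // empty input: no warning
var_dump(postal_code_warning('G2V3V4'));  // valid shape: no warning
var_dump(postal_code_warning('g2v3v4'));  // lowercase letters: warning
```

Comparing the return value with 1 also treats a preg_match() error (false) as a failed validation.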

Your regex will match invalid postal codes.
A quick Google search for "canadian postal code regex" brought up
^[ABCEGHJKLMNPRSTVXY]{1}\d{1}[A-Z]{1} *\d{1}[A-Z]{1}\d{1}$
You may also want to put your $_POST['txtPostalCode'] != "" condition first since there's no point in executing a regex if the value is empty to begin with.
Edit: As pointed out by the comments, the quantifiers are redundant:
^[ABCEGHJKLMNPRSTVXY]\d[A-Z] *\d[A-Z]\d$
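A minimal sketch of this stricter check, with the empty test placed first as suggested. The function name is_canadian_postal_code and the $postalCode variable (standing in for $_POST['txtPostalCode']) are assumptions made for the example.

```php
<?php
// Hypothetical helper; the pattern is the one quoted in the answer above.
function is_canadian_postal_code(string $code): bool
{
    return preg_match('/^[ABCEGHJKLMNPRSTVXY]\d[A-Z] *\d[A-Z]\d$/', $code) === 1;
}

$postalCode = 'G2V 3V4'; // stand-in for $_POST['txtPostalCode']
$msg = '';

// Cheap emptiness test first; the regex only runs when there is something to check.
if ($postalCode !== '' && !is_canadian_postal_code($postalCode)) {
    $msg .= 'Warning: invalid postal code';
}

var_dump($msg);
```

Unlike the simpler letter-digit pattern, this one rejects first letters that are not in the listed set and tolerates spaces between the two halves.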

Related

Regular expressions in preg_match pattern not matching string

I've read the PHPManual RegEx Intro, but am confused on how to structure the pattern for preg_match. I am checking that the username on the login form is all lower case alphabet between 2 and 5 characters in length.
Pattern 1: Initially, I used a character class followed by a repetition quantifier:
if (preg_match("[a-z]{2,5}",$_POST['ULusername'])) {
$formmessage = 'Hello, ' . $_POST['ULusername'];
} else {
$formmessage = 'Enter username.';
}
The output was always "Enter username."
Pattern 2: I then thought perhaps I needed delimiters:
if (preg_match("/[a-z]{2,5}/",$_POST['ULusername'])) {
$formmessage = 'Hello, ' . $_POST['ULusername'];
} else {
$formmessage = 'Enter username.';
}
But the output was still always "Enter username."
Pattern 3: Finally, I tried delimiters with the begin/end anchors:
if (preg_match("#^([a-z]{2,5})$#",$_POST['ULusername'])) {
$formmessage = 'Hello, ' . $_POST['ULusername'];
} else {
$formmessage = 'Enter username.';
}
This gave me the desired output.
Why does the third pattern work, but not the first two?
The first one fails because it has no proper delimiters: PHP reads [ and ] as the delimiters and then rejects {2,5} as an unknown modifier.
In the second one, the problem is in the logic. /[a-z]{2,5}/ only checks that the input contains two to five consecutive lowercase letters somewhere; nothing limits the length or content of the whole input. Try it with ABcdEF and you'll see what happens.
In the third one, you grouped [a-z]{2,5} with () and anchored it, so the whole string must consist of that group. According to my tests the grouping doesn't affect the result: #^([a-z]{2,5})$# behaves the same as #^[a-z]{2,5}$#, so you can drop the ().
For more information, you can refer to tutorials about regular expressions:
http://www.rexegg.com/regex-quickstart.html
https://www.regular-expressions.info/refcapture.html
The first pattern returns false, an indication that an error occurred (here, no delimiter in pattern).
The second and third patterns are valid regex patterns but they do not match the same set of strings. Using "/[a-z]{2,5}/" you'd have a match whenever $_POST['ULusername'] contains at least two consecutive lowercase characters. However, it does not care if the length of the whole string is greater than 5.
The last pattern has both delimiters and start and end anchors, so only lowercase strings of length 2 to 5 will match.
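To see the difference between the three patterns side by side, here is a small sketch. The helper describe_match is hypothetical; it only translates the three possible return values of preg_match() (1, 0, false) into words.

```php
<?php
// Hypothetical helper that reports what preg_match() returns for a pattern.
function describe_match(string $pattern, string $subject): string
{
    // @ hides the warning raised by the malformed first pattern.
    $result = @preg_match($pattern, $subject);
    if ($result === false) {
        return 'error';
    }
    return $result === 1 ? 'match' : 'no match';
}

$patterns = ['[a-z]{2,5}', '/[a-z]{2,5}/', '#^[a-z]{2,5}$#'];
$subjects = ['abc', 'ABcdEF', 'abcdefgh'];

foreach ($patterns as $pattern) {
    foreach ($subjects as $subject) {
        echo $pattern, ' on ', $subject, ': ', describe_match($pattern, $subject), PHP_EOL;
    }
}
```

Checking for false explicitly matters here: in an if condition, false and 0 both count as "no match", which hides the fact that the first pattern never ran at all.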

Regex to block multiple items

I have a form on a site that is collecting user details. There was a fool submitting the form with a name like "Barneycok" from different IP addresses, so I learned how to block that name from going through on the form.
I learned a little regex, just enough to write this little piece:
if (preg_match('/\b(\w*arneycok)\b/', $FirstName)){
$error.= "<li><font face=arial><strong>Sorry, an error occured. Please try again later.</strong><br><br>";
$errors=1;
}
The code has worked perfectly and I never got that name coming through anymore. However, recently, someone is entering a string of numbers on the name field.
The string looks like this:
123456789
123498568
123477698
12346897w
If you notice, the first 4 characters are constant throughout.
So how do I add that in my regex above so that if the name starts with "1234", it will simply match that regex and give the user the error code?
Your help will be greatly appreciated.
Jaime
This will match a $FirstName that starts with 1234. To match a specific word like Barneycok in either case, use (b|B)arneycok.
Regex: ^\s*1234|\b(?:b|B)arneycok\b
1. ^\s*1234 matches 1234 at the start, optionally preceded by whitespace
2. | works like an OR condition
3. \b(?:b|B)arneycok\b matches the whole word barneycok or Barneycok
Try this code snippet:
if (preg_match('/^1234|\b(?:b|B)arneycok\b/i', $FirstName))
{
$error.= "<li><font face=arial><strong>Sorry, an error occured. Please try again later.</strong><br><br>";
$errors = 1;
}
The following regex will work.
^1234.*
For the sake of providing the best possible pattern to protect your site, I'd like to offer this:
/^\s*1234|barneycok/i
This will match a string that has 1234 as its first non-whitespace characters as well as a string that contains the substring barneycok (case-insensitively).
You will notice that the pattern:
omits the leading word boundary (letting it catch abarneycok),
doesn't bother with a non-capturing group with a pipe between B and b (because it is pointless when using the i flag)
omits the trailing word boundary (letting it catch barneycoka)
uses the i flag so that bArNeYcOk is caught.
You can implement the pattern with:
if(preg_match('/^\s*1234|barneycok/i',$FirstName)){
$error.="<li><font face=arial><strong>Sorry, an error occurred. Please try again later.</strong><br><br>";
$errors=1;
}
On SO, it is important that the very best answers are posted and awarded the green tick, because sub-optimal answers run the risk of performing poorly for the OP as well as teaching future SO readers bad practices and sloppy coding habits. I hope you find this refinement helpful and informative.
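A quick way to try the combined pattern against a few sample names is sketched below. The function name is_blocked_name and the sample inputs are made up for the example; only the pattern itself comes from the answer above.

```php
<?php
// Hypothetical helper wrapping the pattern /^\s*1234|barneycok/i.
function is_blocked_name(string $firstName): bool
{
    return preg_match('/^\s*1234|barneycok/i', $firstName) === 1;
}

// Sample inputs chosen for illustration only.
$samples = ['123456789', ' 12346897w', 'BarneyCok', 'Jaime', '91234'];

foreach ($samples as $name) {
    echo $name, ' => ', is_blocked_name($name) ? 'blocked' : 'allowed', PHP_EOL;
}
```

Because ^ binds only to the first alternative, 1234 must appear at the start (after optional whitespace), while barneycok is caught anywhere in the string.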

How to use preg_match properly to accept characters including new lines?

I've tried to use a series of questions to construct a preg_match if statement to check a string and make sure it includes characters that are accepted to pass through the system.
I've got the following if statement;
if ( !preg_match("~[A-Za-z0-9-_=+,.:;/\!?%^&*()##\"\'£\$€ ]~", $data['text']) ) {}
I'm using ~ as a separator in the string and want to ensure that the above characters are accepted in whatever string is passed through.
I've had to escape " and ' quotes and the $ sign to ensure it doesn't break the statement.
It seems to work however the following doesn't work.
Hello, this is a single line. Don't you agree?
This is also another line, see?
After some trial and error, it seemed the comma was also causing the string check to fail but it's in the preg_match rule too.
How can I accept these characters A-Za-z0-9-_=+,.:;/\!?%^&*()##\"\'£\$€ as well as multiple lines (like blank lines, spaces, etc.)?
EDIT
Just an update as to what I enter in the textarea and what data is actually returned.
I entered the following in the textarea;
Testing 123
Testing 123
The following was returned using print_r;
Testing 123\r\n\r\nTesting 1231
I had another look at your question and there are actually two issues:
1. The regex has no character that accepts every kind of whitespace. Update it to ~[A-Za-z0-9-_=+,.:;/\!?%^&*()##\"\'£\$€\s]+~; here I replaced the space character at the end with \s so that it accepts any whitespace (tabs, newlines, etc.).
2. You forgot to unescape the input string from $data['text'], which you can do using stripcslashes.
After fixing those you still need to validate your input. To do so, you can use preg_replace to strip every valid character, which leaves a new string containing only the invalid characters, if there are any. Then you only need to check whether that string is empty; if so, the input is valid.
Here is what I used to test this:
<?php
$data = 'Testing 123\r\n\r\nTesting 1231';
$unescaped_data = stripcslashes($data);
$impureData = preg_replace("~[A-Za-z0-9-_=+,.:;/\!?%^&*()##\"\'£\$€\s]+~",
'', $unescaped_data);
if (0 == strlen($impureData)) {
echo 'TRUE';
}
else {
echo 'FALSE';
}
print_r("
====
$data
====
$impureData
====
");
And I get this result:
TRUE
====
Testing 123\r\n\r\nTesting 1231
====
====
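An alternative to stripping valid characters is to search for the first character outside the allowed set with a negated character class. The sketch below is one way to do that under several assumptions: the function name has_only_allowed_chars is hypothetical, the allow-list is an illustrative subset (it leaves out the backslash), the source file is saved as UTF-8, and the u modifier is used so that £ and € are treated as single characters.

```php
<?php
// Hypothetical helper: true when $text contains only characters from the allow-list.
function has_only_allowed_chars(string $text): bool
{
    // [^...] matches any single character NOT in the list; \s covers spaces, tabs, \r and \n.
    // The hyphen is placed last so it is taken literally.
    $disallowed = '~[^A-Za-z0-9_=+,.:;/!?%^&*()#"\'£$€\s-]~u';

    // 0 means no disallowed character was found; 1 (found) or false (error) both fail.
    return preg_match($disallowed, $text) === 0;
}

$text = "Hello, this is a single line. Don't you agree?\r\n\r\nThis is also another line, see?";
var_dump(has_only_allowed_chars($text));
var_dump(has_only_allowed_chars('bad <tag>'));
```

With the u modifier, a subject that is not valid UTF-8 makes preg_match() return false, which this sketch treats as invalid input.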

what is the proper way of using an if condition with RegEx

I have been trying to validate a form where the input is the first and last name using regex in PHP. All I need the regex to do is check to make sure that there are no numbers. This is what I have right now:
if (preg_match('/\A\b[^0-9]*\W[^0-9]*\b\Z/sm', $name)) {
# Successful match
$nameError = "";
echo $name;
} else {
# Match attempt failed
$nameError = "No Numbers";
}
The $name variable holds First and last name. I have been trying to make this work and I have not been able to get the input to match the regex. Am I using this correctly or do I need to input it in another way. Thank you for your help
If the name is a first name and surname, the condition depends on the country. For example, in Poland it would be:
preg_match('/[a-z]+ [a-z]+/i',$name);
It means that any name containing two alphabetic parts separated by a space is accepted. If you want the first letter of each part to be uppercase, change it to:
preg_match('/[A-Z][a-z]+ [A-Z][a-z]+/',$name);
preg_match returns 1 if $name matches the regular expression that you provide in the first argument.
So your usage of this function is okay; you should check your expression.
http://pl1.php.net/preg_match
preg_match() returns 1 if the pattern matches given subject, 0 if it does not, or FALSE if an error occurred.
You can always check your regex with an online checker, for example:
http://www.solmetra.com/scripts/regex/
If you just want two words separated by one space, this will do what you want: if (preg_match('/^[A-Za-z]+ [A-Za-z]+$/', $name))
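If the only requirement is the one stated in the question (no digits anywhere in the name), a simpler check is to look for a digit and reject the name when one is found. This is a minimal sketch; the function name name_error is made up, and it deliberately does not enforce the two-word structure discussed above.

```php
<?php
// Hypothetical helper: returns '' when the name is acceptable, else an error message.
function name_error(string $name): string
{
    // preg_match() returns 1 as soon as it finds any ASCII digit in the string.
    return preg_match('/[0-9]/', $name) === 1 ? 'No Numbers' : '';
}

var_dump(name_error('John Smith'));
var_dump(name_error('J0hn Smith'));
```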
Thank you all for your replies, I found the answer in the most obvious place though and it didn't have anything to do with the regex. I forgot to setup the variables correctly for using them on the same page as the form. Stupid mistake. Anyway, thank you again.

How to validate a regex with PHP

I want to be able to validate a user's inputted regex, to check whether it's valid or not. The first thing I found was PHP's filter_var with the FILTER_VALIDATE_REGEXP constant, but that doesn't do what I want: it requires a regex to be passed in the options and validates a value against it, whereas I'm not matching against anything; I just want to check the regex's validity.
So how do I validate a user's inputted regex (one that is matched against nothing)?
Example of validating, in simple words:
$user_inputted_regex = $_POST['regex']; // e.g. /([a-z]+)\..*([0-9]{2})/i
if(is_valid_regex($user_inputted_regex))
{
// The regex was valid
}
else
{
// The regex was invalid
}
Examples of validation:
/[[0-9]/i // invalid
//(.*)/ // invalid
/(.*)-(.*)-(.*)/ // valid
/([a-z]+)-([0-9_]+)/i // valid
Here's an idea (demo):
function is_valid_regex($pattern)
{
return is_int(@preg_match($pattern, ''));
}
preg_match() returns the number of times pattern matches. That will be
either 0 times (no match) or 1 time because preg_match() will stop
searching after the first match.
preg_match() returns FALSE if an error occurred.
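As for finding out why the pattern isn't valid, preg_last_error() returns only a coarse error code; the detailed reason (for example an unterminated character class) is in the warning that preg_match() raises.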
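One possible way to get at that warning text, instead of suppressing it with @, is to install a temporary error handler around the call. This is only a sketch: the function name regex_error is hypothetical, and the exact wording of the message depends on the PHP and PCRE versions in use.

```php
<?php
// Hypothetical helper: returns null when the pattern compiles, else the warning text.
function regex_error(string $pattern): ?string
{
    $message = null;

    // Temporarily capture the warning that preg_match() raises for a bad pattern.
    set_error_handler(function (int $severity, string $text) use (&$message): bool {
        $message = $text;
        return true; // handled; skip PHP's default error handler
    });
    $result = preg_match($pattern, '');
    restore_error_handler();

    if ($result === false) {
        return $message ?? 'unknown error';
    }
    return null;
}

foreach (['/(.*)-(.*)-(.*)/', '//(.*)/', '/[a-z/'] as $pattern) {
    $error = regex_error($pattern);
    echo $pattern, ' => ', $error === null ? 'valid' : 'invalid: ' . $error, PHP_EOL;
}
```

The handler is restored immediately after the call so that unrelated warnings elsewhere in the script are not swallowed.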
You would need to write your own function to validate a regex. You can validate it so far as to say whether it contains illegal characters or bad form, but there is no way to test that it is a working expression. For that you would need to create a solution.
But then you do realize there really is no such thing as an invalid regex. A regex is performance based. It either matches or it doesn't and that is dependent upon the subject of the test--even if the expression or its results are seemingly meaningless.
In other words, you can only test a regular expression for valid syntax...and that can be nearly anything!
