PHP regexp for a valid regexp pattern - php

Is there a regexp to check whether a string is a valid PHP regexp?
In the back office of my application, the administrator defines the data types and can define a pattern as a regexp, for example /^[A-Z][a-zA-Z]+[a-z]$/. In the front office, I use this pattern to validate user entries.
In my code I use the PHP preg_match function:
preg_match($pattern, $user_entries);
The first argument must be a valid PHP regexp. How can I be sure that $pattern is a valid regexp, since it is user input from my back office?
Any ideas?

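Execute it and catch any warnings.
// Note: track_errors and $php_errormsg were deprecated in PHP 7.2 and removed in PHP 8.0.
$track_errors = ini_get('track_errors');
ini_set('track_errors', 'on');
$php_errormsg = '';
@preg_match($regex, 'dummy'); // @ suppresses the warning; its text lands in $php_errormsg
$error = $php_errormsg;
ini_set('track_errors', $track_errors); // restore the previous setting
if ($error) {
    // do something. $error contains the PHP warning thrown by the regex
}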
If you just want to know if the regex fails or not you can simply use preg_match($regex, 'dummy') === false - that won't give you an error message though.
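On PHP 8.0 and later, track_errors and $php_errormsg are no longer available, so the warning text has to be captured some other way. Here is a minimal sketch that does it with a temporary error handler, assuming that is acceptable in your application. The helper name regex_compile_error is hypothetical.

```php
<?php
// Hypothetical helper: returns null when $pattern compiles,
// otherwise the text of the warning raised by preg_match().
function regex_compile_error(string $pattern): ?string
{
    $message = null;

    // Temporarily route warnings into $message instead of the normal output.
    set_error_handler(function (int $severity, string $text) use (&$message): bool {
        $message = $text;
        return true; // mark the warning as handled
    }, E_WARNING);

    try {
        $result = preg_match($pattern, '');
    } finally {
        // Always put the previous handler back, even if something throws.
        restore_error_handler();
    }

    if ($result !== false) {
        return null; // the pattern compiled and ran
    }
    return $message ?? 'unknown error';
}
```

Called with /^[A-Z][a-zA-Z]+[a-z]$/ it returns null; called with a broken pattern such as /[blah/ it returns the warning text, which can be shown to the administrator in the back office.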

As a work-around, you could just try to use the regex and see if an error occurs:
function is_regex($pattern)
{
    return @preg_match($pattern, "") !== false;
}
The function preg_match() returns false on error, and an int when it executes without error.
Background: I don't know whether regular expressions themselves form a regular grammar, i.e. whether it's even possible in principle to verify a regex with a regex. The generic approach is to start parsing and check whether an error occurs, which is what the workaround does.

Technically, any expression can be a valid regular expression...
So, the validity of a regular expression will depend on the rules you want to respect.
I would:
Identify the rules your regex must follow
Use a preg_match of your own, or some combination of substr to validate the pattern

You could use T-Regx library:
<?php
if (pattern('invalid {{')->valid()) {
    // the pattern is valid
}
https://t-regx.com/docs/is-valid

Related

Check if text contains url, email and phone number with php and regex

I have a text, for example: $descrizione = "Tel.+39.1234.567899 asd.test@testwebsite.com
www.testwebsite.com" and I would like to obtain three different variables with:
"+39.1234.567899"
"asd.test@testwebsite.com"
"www.testwebsite.com".
To check whether the text contains an email address I use a regex, and I wrote this code:
$regex = '/[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})/';
if (preg_match($regex, $descrizione, $email_is)) {
    for ($e = 0; $e < count($email_is); $e++) {
        if (strpos($email_is[$e], "@") !== false) {
            $linkEmail = $email_is[$e];
        }
    }
}
Now I would like to find the website URL, so I tried to write:
$regex = '/[-a-zA-Z0-9@:%_\+.~#?&//=]{2,256}\.[a-z]{2,4}\b(\/[-a-zA-Z0-9@:%_\+.~#?&//=]*)?/gi';
if( preg_match($regex, $descrizione, $matches)){
$linkWebsite = $matches[0];
}
but preg_match returns false. I checked the regex on the website http://regexr.com/ and it's correct there, so I don't understand why it always returns false. Where is the problem? I also tried "/[-a-zA-Z0-9@:%_\+.~#?&//=]{2,256}\.[a-z]{2,4}\b(\/[-a-zA-Z0-9@:%_\+.~#?&//=]*)?/" but I have the same problem, and I tried to check for errors with try/catch but it doesn't report any.
Finally, I would like to find the phone number, but I don't know how to write the regex.
Is there someone who can help me, please?
Your regex fails because it's faulty. You've escaped the slashes (/) with slashes. You should use backslashes:
[-a-zA-Z0-9@:%_\+.~#?&\/=]{2,256}\.[a-z]{2,4}\b(\/[-a-zA-Z0-9@:%_\+.~#?&\/=]*)?
Here at regex101.
Since regexr uses JS regex it doesn't complain, but if you try it at regex101 with PHP selected you'll easily detect such errors.
About regex for phone numbers: search! E.g. https://stackoverflow.com/search?q=%5Bregex%5D+phone+number
I have found the solution; I hope that this can help someone.
preg_match returns only the first result, not all the results it finds.
So if I check the regex using a website like regex101, it returns the correct result with all matches, but if I use the same regex in PHP, it returns only one.
The regex option "g" (global = don't return after the first match) corresponds to the function preg_match_all. PHP has no g modifier, so it must also be removed from the pattern.
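A minimal sketch of the preg_match_all approach, assuming the sample text from the question with the address written as asd.test@testwebsite.com. The variable names are illustrative. Because the URL pattern's character class also accepts "@", it matches the email address as well, so the sketch filters those candidates out.

```php
<?php
// Sample text from the question, with a newline between the two parts.
$descrizione = "Tel.+39.1234.567899 asd.test@testwebsite.com\nwww.testwebsite.com";

// Email pattern from the question.
$emailRegex = '/[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})/';

// URL pattern with the slashes escaped by backslashes and the "g" flag dropped.
$urlRegex = '/[-a-zA-Z0-9@:%_\+.~#?&\/=]{2,256}\.[a-z]{2,4}\b(\/[-a-zA-Z0-9@:%_\+.~#?&\/=]*)?/i';

// preg_match_all() collects every match; index 0 holds the full matches.
preg_match_all($emailRegex, $descrizione, $emailMatches);
preg_match_all($urlRegex, $descrizione, $urlMatches);

$emails = $emailMatches[0];

// Keep only the URL candidates that contain no "@".
$urls = array_values(array_filter($urlMatches[0], function ($candidate) {
    return strpos($candidate, '@') === false;
}));

print_r($emails);
print_r($urls);
```

With this sample text, $emails should hold the single address and $urls the single host name. The phone number is not covered here.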

Can't use regular expression properly in PHP

I am new to regular expressions and I was doing some form validation using them. But the problem is that most of the regular expressions look like this:
^(?=.{8})(?=.*[A-Z])(?=.*[a-z])(?=.*\d.*\d.*\d)(?=.*[^a-zA-Z\d].*[^a-zA-Z\d].*[^a-zA-Z\d])[-+%#a-zA-Z\d]+$
This is the one I am using for password validation. For other form validation I found a lot of such expressions here. Now the problem is when I use them in my code as follows:
if(preg_match('^(?=.{8})(?=.*[A-Z])(?=.*[a-z])(?=.*\d.*\d.*\d)(?=.*[^a-zA-Z\d].*[^a-zA-Z\d].*[^a-zA-Z\d])[-+%##a-zA-Z\d]+$', $password))
I get at least one error. Most of the time it shows an error such as "No ending delimiter" or "Unknown modifier".
You don't have a delimiter around your expression.
Try this:
$pattern = '/^(?=.{8})(?=.*[A-Z])(?=.*[a-z])(?=.*\d.*\d.*\d)(?=.*[^a-zA-Z\d].*[^a-zA-Z\d].*[^a-zA-Z\d])[-+%#a-zA-Z\d]+$/';
preg_match ($pattern, $password);
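To illustrate the delimiter rule, here is a short sketch comparing the same kind of expression with and without delimiters. The patterns and subjects are made up for the example.

```php
<?php
// No delimiters: "^" is taken as the opening delimiter and no closing "^" exists.
$withoutDelimiters = @preg_match('^[a-z]+$', 'abc');

// Slash delimiters around the same expression.
$withSlashes = preg_match('/^[a-z]+$/', 'abc');

// Other non-alphanumeric characters work too; "~" avoids escaping the slashes in a URL.
$withTildes = preg_match('~^https?://~', 'https://example.com');

// Modifiers such as "i" go after the closing delimiter.
$caseInsensitive = preg_match('#^[a-z]+$#i', 'ABC');

var_dump($withoutDelimiters); // bool(false)
var_dump($withSlashes);       // int(1)
var_dump($withTildes);        // int(1)
var_dump($caseInsensitive);   // int(1)
```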
Direct answer: You have no delimiters on your expression. PCRE grabs the first character, ^, assumes it's the delimiter, and throws the "No ending delimiter" error when it doesn't find a closing ^. If another ^ appears later in the pattern, it is taken as the closing delimiter and whatever follows is read as modifiers, which produces the "Unknown modifier" error.
Indirect answer: As Andy-Lester commented, your regex is over-complex and pretty much unreadable to anyone who isn't a regex guru. I use the following, which is more readable and more maintainable.
// Runs inside a validation function, hence the return statement.
$req_regex = array(
    '/[A-Z]/',    // uppercase
    '/[a-z]/',    // lowercase
    '/[^A-Za-z]/' // non-alpha
);
foreach ($req_regex as $regex) {
    if (!preg_match($regex, $password)) {
        return NULL; // a requirement is not met
    }
}
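The same piecemeal idea can also cover the counting rules of the original expression (three digits, three non-alphanumeric characters), since preg_match_all returns the number of matches. This is a sketch under assumptions: the function name password_problems and the message texts are hypothetical, and the original pattern's restriction on allowed characters is left out.

```php
<?php
// Hypothetical helper: returns the list of unmet requirements
// (an empty array means the password passes every check).
function password_problems(string $password): array
{
    $problems = [];

    if (strlen($password) < 8) {
        $problems[] = 'at least 8 characters';
    }
    if (!preg_match('/[A-Z]/', $password)) {
        $problems[] = 'an uppercase letter';
    }
    if (!preg_match('/[a-z]/', $password)) {
        $problems[] = 'a lowercase letter';
    }
    // preg_match_all() returns the number of matches, so it can count occurrences.
    if (preg_match_all('/\d/', $password) < 3) {
        $problems[] = 'three digits';
    }
    if (preg_match_all('/[^a-zA-Z\d]/', $password) < 3) {
        $problems[] = 'three non-alphanumeric characters';
    }

    return $problems;
}
```

Returning the list of failed rules, rather than a single yes/no, also lets the form tell the user which requirement is missing.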
The problem with the expression you have given is that you do not have the delimiters around the expression.
For complex regular expressions it is best to build them up piecemeal. I have found the add-on for Firefox (https://addons.mozilla.org/en-us/firefox/addon/rext/) useful.

Test if a string is regex

Is there a good way to test whether a string is a regex or a normal string in PHP?
Ideally I want to write a function to run a string through, that returns true or false.
I had a look at preg_last_error():
<?php
preg_match('/[a-z]/', 'test');
var_dump(preg_last_error());
preg_match('invalid regex', 'test');
var_dump(preg_last_error());
?>
Obviously the first one is not an error and the second one is, but preg_last_error() returns int 0 both times.
Any ideas?
The simplest way to test if a string is a regex is:
if( preg_match("/^\/.+\/[a-z]*$/i",$regex))
This will tell you if a string has a good chance of being intended as a regex. However, there are many strings that would pass that check but fail as a regex. Unescaped slashes in the middle, unknown modifiers at the end, mismatched parentheses etc. could all cause problems.
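A small sketch of that gap, using a made-up candidate string: it passes the shape check but fails to compile because its character class is never closed.

```php
<?php
$candidate = '/[blah/'; // looks like a regex, but "[" is never closed

// Shape check: slash, something, slash, optional modifier letters.
$looksLikeRegex = preg_match("/^\/.+\/[a-z]*$/i", $candidate) === 1;

// Compile check: preg_match() returns false when the pattern cannot be compiled.
$compiles = @preg_match($candidate, '') !== false;

var_dump($looksLikeRegex); // bool(true)
var_dump($compiles);       // bool(false)
```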
The reason preg_last_error returned 0 is that the "invalid regex" failure is none of the following:
PREG_INTERNAL_ERROR (an internal error)
PREG_BACKTRACK_LIMIT_ERROR (excessively forcing backtracking)
PREG_RECURSION_LIMIT_ERROR (excessively recursing)
PREG_BAD_UTF8_ERROR (badly formatted UTF-8)
PREG_BAD_UTF8_OFFSET_ERROR (offset to the middle of a UTF-8 character)
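For contrast, here is a sketch of a failure that preg_last_error() does report: a subject that is not valid UTF-8 matched with the u modifier. The byte string is made up for the example.

```php
<?php
// A valid pattern and subject: no runtime error is recorded.
preg_match('/[a-z]/', 'test');
$afterSuccess = preg_last_error();

// The "u" modifier requires a valid UTF-8 subject; "\xff" is not valid UTF-8.
$result = preg_match('/./u', "\xff");
$afterBadUtf8 = preg_last_error();

var_dump($afterSuccess === PREG_NO_ERROR);        // bool(true)
var_dump($result === false);                      // bool(true)
var_dump($afterBadUtf8 === PREG_BAD_UTF8_ERROR);  // bool(true)
```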
Here is a good answer how to:
https://stackoverflow.com/a/12941133/2519073
if (@preg_match($yourPattern, '') === false) {
    // pattern is broken
} else {
    // pattern is real
}
The only easy way to test if a regex is valid in PHP is to use it and check if a warning is thrown.
ini_set('track_errors', 'on'); // removed in PHP 8.0, along with $php_errormsg
$php_errormsg = '';
@preg_match('/[blah/', '');
if ($php_errormsg) echo 'regex is invalid';
However, using arbitrary user input as a regex is a bad idea. There were security holes (buffer overflow => remote code execution) in the PCRE engine before and it might be possible to create specially crafted long regexes which require lots of cpu/memory to compile/execute.
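If user-supplied patterns cannot be avoided, some of the risk can be reduced by capping the pattern length and lowering pcre.backtrack_limit while the pattern runs. This is only a sketch and not a complete defence: the function name guarded_match and both limits are arbitrary assumptions, and how effective the backtrack limit is depends on the PCRE configuration.

```php
<?php
// Hypothetical helper: runs a user-supplied pattern under tighter limits.
// Returns true/false for match/no match, or null when the pattern is
// rejected, invalid, or hits a runtime limit.
function guarded_match(
    string $pattern,
    string $subject,
    int $maxPatternLength = 200,   // assumed cap, adjust to your needs
    int $backtrackLimit = 10000    // assumed cap, adjust to your needs
): ?bool {
    if (strlen($pattern) > $maxPatternLength) {
        return null; // refuse very long patterns outright
    }

    // Lower the backtrack limit for this call only.
    $previous = ini_set('pcre.backtrack_limit', (string) $backtrackLimit);

    $result = @preg_match($pattern, $subject);

    if ($previous !== false) {
        ini_set('pcre.backtrack_limit', $previous); // restore the old limit
    }

    return $result === false ? null : $result === 1;
}
```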
Why not just use...another regex? Three lines, no @ kludges or anything:
// Test this string
$str = "/^[A-Za-z ]+$/";
// Compare it to a regex pattern that simulates any regex
$regex = "/^\/[\s\S]+\/$/";
// Will it blend?
echo (preg_match($regex, $str) ? "TRUE" : "FALSE");
Or, in function form, even more pretty:
public static function isRegex($str0) {
$regex = "/^\/[\s\S]+\/$/";
return preg_match($regex, $str0);
}
This doesn't test validity, but the question asks "Is there a good way to test whether a string is a regex or a normal string in PHP?" and it does do that.

How to validate a regex with PHP

I want to be able to validate a user's inputted regex, to check whether it's valid or not. The first thing I found was PHP's filter_var with the FILTER_VALIDATE_REGEXP constant, but that doesn't do what I want: it needs a regex passed in the options, whereas I'm not matching against anything, just checking the regex's validity.
But you get the idea: how do I validate a user's inputted regex (one that matches against nothing)?
Example of validating, in simple words:
$user_inputted_regex = $_POST['regex']; // e.g. /([a-z]+)\..*([0-9]{2})/i
if(is_valid_regex($user_inputted_regex))
{
// The regex was valid
}
else
{
// The regex was invalid
}
Examples of validation:
/[[0-9]/i // invalid
//(.*)/ // invalid
/(.*)-(.*)-(.*)/ // valid
/([a-z]+)-([0-9_]+)/i // valid
Here's an idea (demo):
function is_valid_regex($pattern)
{
    return is_int(@preg_match($pattern, ''));
}
preg_match() returns the number of times pattern matches. That will be either 0 times (no match) or 1 time because preg_match() will stop searching after the first match.
preg_match() returns FALSE if an error occurred.
And to get the reason why the pattern isn't valid, use preg_last_error.
You would need to write your own function to validate a regex. You can validate it so far as to say whether it contains illegal characters or bad form, but there is no way to test that it is a working expression. For that you would need to create a solution.
But then you do realize there really is no such thing as an invalid regex. A regex is performance based. It either matches or it doesn't and that is dependent upon the subject of the test--even if the expression or its results are seemingly meaningless.
In other words, you can only test a regular expression for valid syntax...and that can be nearly anything!

PHP Regex Functions

I am trying to validate form input data using PHP's preg_match function. I am a little confused about how to use it. If I want to validate, say, an alphanumeric string, I would use ^[0-9a-zA-Z ]+$ as the first parameter and the string we're validating as the second one. But how would I use preg_match to tell if it's valid or not? Would I do this:
if(preg_match("^[0-9a-zA-Z ]+$", $_POST['display_name'])){
"String is valid";
} else {
"String is not valid";
}
Or the other way around? I am currently using an if (!preg_match(...)) statement, but it's returning false for some reason... I know this probably has an easy answer, but I cannot figure it out.
A FALSE return from preg_match indicates an error.
You need to delimit your regex (see the leading and trailing /); you can use other characters too.
if (preg_match("/^[0-9a-zA-Z ]+$/", $_POST['display_name'])) {
You need to add delimiters to your pattern, like this:
preg_match("/^[0-9a-zA-Z ]+$/", $_POST['display_name'])
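Putting the pieces together, here is a sketch of a check that tells the three possible return values apart: 1 for a match, 0 for no match, and false for an error. The function name is_valid_display_name is hypothetical.

```php
<?php
// Hypothetical helper: true when $name contains only letters, digits and spaces.
function is_valid_display_name(string $name): bool
{
    $result = preg_match('/^[0-9a-zA-Z ]+$/', $name);

    if ($result === false) {
        // false means the pattern itself failed, not that the input was rejected
        throw new RuntimeException('The validation pattern could not be executed');
    }

    return $result === 1; // 1 = match, 0 = no match
}

echo is_valid_display_name('John Doe 42') ? "String is valid" : "String is not valid";
```

Comparing with === keeps an error (false) from being silently treated the same as "no match" (0).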
