PHP preg_match to accept new line
I want to pass every post/string through PHP's preg_match function. I want to accept all alphanumerics and some special characters. Please help me edit my pattern to allow newlines, since users fill in a textarea and press Enter. The following pattern does not allow a new line.
Please also tell me whether the following special characters are handled properly:
/_:,.?#;-
if (preg_match("/^[0-9a-zA-Z \/_:,.?#;-]+$/", $string)) {
    echo 'good';
} else {
    echo 'bad';
}
You were almost there!
The DOTALL modifier mentioned by others is irrelevant to your regex.
To allow new lines, we just add \r\n to your character class. Your code becomes:
if (preg_match("/^[\r\n0-9a-zA-Z \/_:,.?#;-]+$/", $string)) {
    echo 'good';
} else {
    echo 'bad';
}
Note that this test and the regex can be written in a tidier way:
echo (preg_match("~^[\r\n\w /:,.?#;-]+$~",$string))? "***Good!***" : "Bad!";
See the result of the online demo at the bottom.
\w matches letters, digits and underscores, so we can get rid of them in the character class
Changing the delimiter to a ~ allows you to use a / slash without escaping it (you need to escape delimiters)
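As a minimal sketch of the tidier pattern in use, the check can be wrapped in a small function. The name is_allowed_text is hypothetical (it is not from the question), and the pattern is written in single quotes so that PCRE itself interprets \r, \n and \w.

```php
<?php
// Hypothetical helper: true when $string contains only the allowed characters.
function is_allowed_text(string $string): bool
{
    // ~ delimiters avoid escaping the slash; the trailing - is a literal hyphen.
    // \r and \n admit line breaks typed into a textarea; \w covers letters, digits, underscore.
    return preg_match('~^[\r\n\w /:,.?#;-]+$~', $string) === 1;
}

var_dump(is_allowed_text("First line\r\nSecond line?")); // newline inside the text
var_dump(is_allowed_text("<script>"));                   // angle brackets are not in the class
```

Comparing against 1 matters because preg_match returns 0 for no match and false on error.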
It's always safe to add a backslash before any non-alphanumeric character, so:
/^[0-9a-zA-Z \/\_\:\,\.\?\#\;\-]+$/
You can also use POSIX character classes:
/^[[:alnum:] \/\_\:\,\.\?\#\;\-]+$/
As for the newlines:
/^[[:alnum:] \r\n\/\_\:\,\.\?\#\;\-]+$/
To write that pattern as a PHP string literal, double the backslashes before r and n (it is also easier and safer to use single quotes):
'/^[[:alnum:] \\r\\n\/\_\:\,\.\?\#\;\-]+$/'
You can use an alternation to factor in the newlines:
/^(?:[0-9a-zA-Z \/_:,.?#;-]|\r?\n)+$/
Btw, you can shorten the expression a bit by replacing [A-Za-z0-9_] with \w (which already covers digits, so a separate \d is not needed):
/^(?:[\w \/:,.?#;-]|\r?\n)+$/
So:
if (preg_match('/^(?:[\w \/:,.?#;-]|\r?\n)+$/', $string)) {
echo "good";
} else {
echo "bad";
}
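To illustrate what the \r?\n alternation accepts, here is a small sketch. The variable names are made up for the example. It tries a Windows-style line break, a Unix-style one, and a lone carriage return, which this variant does not allow.

```php
<?php
// Alternation variant: ordinary characters from the class, or a line break
// written as \n or \r\n. A bare \r on its own is not accepted.
$pattern = '/^(?:[\w \/:,.?#;-]|\r?\n)+$/';

$samples = [
    'windows' => "line one\r\nline two",
    'unix'    => "line one\nline two",
    'lone_cr' => "line one\rline two",
];

$results = [];
foreach ($samples as $label => $text) {
    $results[$label] = preg_match($pattern, $text) === 1;
    echo $label, ': ', $results[$label] ? 'good' : 'bad', "\n";
}
```

Whether a lone carriage return should be rejected depends on the input you expect. Adding \r to the character class, as in the first answer, would accept it.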
I am trying to strip away all non-allowed characters from a string using regex. Here is my current php code
$input = "👮";
$pattern = "[a-zA-Z0-9_ !##$%^&*();\\\/|<>\"'+\-.,:?=]";
$message = preg_replace($pattern,"",$input);
if (empty($message)) {
echo "The string is empty";
}
else {
echo $message;
}
When I run this, the emoji gets printed, but I want it to print "The string is empty".
When I put my regex into http://regexr.com/ it shows that the emoji does not match, yet when I run the code the emoji is printed. Any suggestions?
This pattern should do the trick :
$filteredString = preg_replace('/([^-\p{L}\x00-\x7F]+)/u', '', $rawString);
Some of these sequences are uncommon, so here is what they mean:
\p{L} matches any kind of letter from any language
\x00-\x7F matches any single character in the ASCII range (code points 0 to 127)
the u modifier makes PCRE treat the pattern and subject strings as UTF-8
Your pattern is incorrect. If you want to strip away all the characters that are not in the list provided, then you have to use a negating character class: [^...]. Also, currently, [ and ] are being used as delimiters, which means, the pattern isn't seen as a character class.
The pattern should be:
$pattern = "~[^a-zA-Z0-9_ !##$%^&*();\\\/|<>\"'+.,:?=-]~";
This should now strip away the emoji and print your message.
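A minimal sketch of the negated-class approach follows. It deliberately uses a shorter allow-list than the question's, and the function name strip_disallowed is hypothetical. The u modifier is an addition here so that the emoji is treated as a single UTF-8 character.

```php
<?php
// Hypothetical helper: remove every character that is not on the allow-list.
function strip_disallowed(string $input): string
{
    // ~ acts as the delimiter, [^...] negates the class, the trailing - is a literal hyphen.
    // preg_replace returns null on error (for example invalid UTF-8 with the u modifier),
    // so fall back to an empty string in that case.
    return preg_replace('~[^a-zA-Z0-9_ !.,:?=-]~u', '', $input) ?? '';
}

$message = strip_disallowed("👮");
if ($message === '') {
    echo "The string is empty";
} else {
    echo $message;
}
```

The strict comparison with '' is used instead of empty() because empty("0") is also true in PHP, which would misreport a message consisting of a single zero.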
$pattern = '/\\\p\\\/';
if (preg_match($pattern, "\p\")) {
echo "Correct";
} else {
echo "Incorrect";
}
I don't understand the first \\\p.
Why does \\p not work?
Your pattern is wrong. The pattern \\p\\ matches the string \p\, but \\\p\\\ doesn't match anything.
If you want to match the string \\p\\, your pattern should be \\\\p\\\\.
Note that "\p\" is not a valid string:
The final \" escapes the quote, so that the string is not terminated
The \p is not a recognized escape sequence, so PHP keeps the backslash as-is, but relying on that makes the string harder to read
If you want to say \p\ in a string, you have to write it like this: "\\p\\"
To match \p\, use:
$regex = '~\\\\p\\\\~';
echo (preg_match($regex,"\\p\\")) ? "Matches" : "Doesn't Match";
See the output at the bottom of the online demo.
The problem here is that both strings and regular expressions use escape characters and they need to be doubled in order to effect the intended behaviour.
So, in this case you need four backslashes in the regular expression and two of them in the search string:
if (preg_match('/\\\\p\\\\/', '\\p\\')) {
echo "Hurray!\n";
}
The reason why '/\\\p\\\/' works is because \p and \/ have no special meaning in a single quoted string and so the backslash is printed verbatim. In other words, PHP corrects your string to have the correct meaning; that said, you should use the correct number of escape characters.
Btw, "\\p\" is just plain wrong and will cause a parse error; I'm going to assume that this was a typo.
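One way to see what is going on is to print the pattern string after PHP has processed the literal. This is a small sketch with made-up variable names. It compares the fully escaped literal with the one from the question.

```php
<?php
// In a single-quoted literal PHP turns \\ into one backslash and leaves
// sequences such as \p and \/ untouched.
$explicit = '/\\\\p\\\\/'; // every backslash doubled
$lenient  = '/\\\p\\\/';   // the literal from the question

echo $explicit, "\n"; // the pattern exactly as the regex engine receives it
echo $lenient, "\n";

$subject = '\\p\\'; // three characters: backslash, p, backslash
var_dump($explicit === $lenient);
var_dump(preg_match($explicit, $subject));
```

Both literals produce the same seven-character pattern, which is why the question's version happens to work even though it is harder to reason about.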
I'm trying to make a regex that would allow input including at least one digit and at least one letter (no matter if upper or lower case) AND NOTHING ELSE. Here's what I've come up with:
<?php
if (preg_match('/(?=.*[a-z]+)(?=.*[0-9]+)([^\W])/i',$code)) {
echo "=)";
} else {
echo "=(";
}
?>
While it gives false if I use only digits or only letters, it gives true if I add $ or # or any other non-alphanumeric sign. I tried putting ^\W into class brackets with both a-z and 0-9, and tried something like ?=.*[^\W] or ?>!, but I just can't get it to work. Typing in non-alphanumerics still results in true. Halp meeee
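You need to use anchors so that the pattern is tested against the entire string. Also note that \w admits the underscore, so use an explicit class (together with your i flag) if nothing but letters and digits is allowed:
^(?=.*[a-z])(?=.*[0-9])[a-z0-9]+$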
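Since you are using PHP, you may not need a regex for the character check at all. You can use ctype_alnum(), which reports whether a string consists solely of letters and digits. It does not by itself guarantee at least one of each, so that still needs a separate check.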
http://php.net/manual/en/function.ctype-alnum.php
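A minimal sketch of the ctype_alnum() route follows. It assumes the ctype extension is available (it is enabled by default). The function name has_letter_and_digit_only is hypothetical, and the two small preg_match calls cover the "at least one letter" and "at least one digit" requirements.

```php
<?php
// Hypothetical helper: only letters and digits, with at least one of each.
function has_letter_and_digit_only(string $code): bool
{
    return ctype_alnum($code)                    // nothing but letters and digits
        && preg_match('/[a-z]/i', $code) === 1   // at least one letter, any case
        && preg_match('/[0-9]/', $code) === 1;   // at least one digit
}

foreach (['abc123', 'abc', '123', 'abc$123'] as $code) {
    echo $code, ': ', has_letter_and_digit_only($code) ? '=)' : '=(', "\n";
}
```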
I am looking for a regex that allows special characters like / \ . ' "
In short, I would like a regex that matches strings made up of the following:
may contain lowercase
may contain uppercase
may contain a number
may contain space
may contain / \ . ' "
I am writing a PHP script that checks whether a given string consists of the above or not, as a validation check.
The regular expression you are looking for is
^[a-z A-Z0-9\/\\.'"]+$
Remember if you are using PHP you need to use \ to escape the backslashes and the quotation mark you use to encapsulate the string.
In PHP using preg_match it should look like this:
preg_match("/^[a-z A-Z0-9\\/\\\\.'\"]+$/",$value);
This is a good place to find the regular expressions you might want to use.
http://regexpal.com/
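You can always escape special characters by putting a \ in front of them. Inside a double-quoted PHP string, a literal backslash in the class has to be written as four backslashes.
Try this:
preg_match("/^[A-Za-z0-9 \/\\\\.'\"]+$/", $value)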
NikoRoberts is 100% correct.
I would only add the following suggestion: when creating a PHP regex pattern string, always use single quotes. There are far fewer characters that need to be escaped (only the single quote and the backslash itself, and the backslash only when it comes directly before a single quote or another backslash, or at the end of the string).
When dealing with backslash soup, it helps to print out the (interpreted) regex string. This shows you exactly what is being presented to the regex engine.
Also, a "number" might have an optional sign? Yes? Here is my solution (in the form of a tested script):
<?php // test.php 20110311_1400
$data_good = 'abcdefghijklmnopqrstuvwxyzABCDE'.
'FGHIJKLMNOPQRSTUVWXYZ0123456789+- /\\.\'"';
$data_bad = 'abcABC012~!###$%^&*()';
$re = '%^[a-zA-Z0-9+\- /\\\\.\'"]*$%';
echo($re ."\n");
if (preg_match($re, $data_good)) {
echo("CORRECT: Good data matches.\n");
} else {
echo("ERROR! Good data does NOT match.\n");
}
if (preg_match($re, $data_bad)) {
echo("ERROR! Bad data matches.\n");
} else {
echo("CORRECT: Bad data does NOT match.\n");
}
?>
The following regex will match a single character that fits the description you gave:
[a-zA-Z0-9\ \\\/\.\'\"]
If your goal is to ensure that ONLY characters from this set are used in your string, you can use the negation of the class:
[^a-zA-Z0-9\ \\\/\.\'\"]
In this second case, you use the regex to find the bad stuff (the characters you don't want). If it finds nothing, your string is valid, because a single character outside the allowed set is enough to make the string invalid.
So, to put it in PHP syntax:
$regex = "~[^a-zA-Z0-9\ \\\/\.\'\"]~";
if (preg_match($regex, $string)) {
    // handle the bad stuff
}
Edit 1:
I completely ignored the fact that backslashes are special in PHP double-quoted strings, so here is a correction to the above code:
$regex = "~[^a-zA-Z0-9\\ \\\\\\/\\.\\'\\\"]~";
If that doesn't work it shouldn't take too much for someone to debug how many of the backslashes need to be escaped with a backslash, and what other characters need also to be escaped....
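The "look for the bad stuff" idea can be sketched as follows. The function name contains_disallowed is hypothetical, and the pattern is written in single quotes with ~ delimiters so that fewer backslashes are needed than in the double-quoted version.

```php
<?php
// Hypothetical helper: true when $value holds any character outside the allowed set.
function contains_disallowed(string $value): bool
{
    // In this single-quoted literal, \\\\ reaches the regex engine as \\ (one literal
    // backslash) and \' is a plain single quote. The slash needs no escaping because
    // the delimiter is ~.
    $regex = '~[^a-zA-Z0-9 \\\\/.\'"]~';
    return preg_match($regex, $value) === 1;
}

var_dump(contains_disallowed('path/to\\file.txt')); // only allowed characters
var_dump(contains_disallowed('50% off'));           // % is not allowed
```

Note that an empty string contains no disallowed character, so a separate emptiness check is needed if empty input should be rejected.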
I want a regular expression to validate a nickname: 6 to 36 characters, it should contain at least one letter. Other allowed characters: 0-9 and underscores.
This is what I have now:
if(!preg_match('/^.*(?=\d{0,})(?=[a-zA-Z]{1,})(?=[a-zA-Z0-9_]{6,36}).*$/i', $value)){
echo 'bad';
}
else{
echo 'good';
}
This seems to work, but when I validate these strings, for example:
11111111111a > is not valid, but it should be
aaaaaaa!aaaa > is valid, but it shouldn't be
Any ideas for improving this regexp?
I would actually split your task into two regex:
to find out whether it's a valid word: /^\w{6,36}$/i
to find out whether it contains a letter /[a-z]/i
I think it's much simpler this way.
Try this:
'/^(?=.*[a-z])\w{6,36}$/i'
Here are some of the problems with your original regex:
/^.*(?=\d{0,})(?=[a-zA-Z]{1,})(?=[a-zA-Z0-9_]{6,36}).*$/i
(?=\d{0,}): What is this for??? This is always true and doesn't do anything!
(?=[a-zA-Z]{1,}): You don't need the {1,} part, you just need to find one letter, and i flag also allows you to omit A-Z
/^.*: You're matching these outside of the lookaround; it should be inside
(?=[a-zA-Z0-9_]{6,36}).*$: this means that as long as there are between 6-36 \w characters, everything else in the rest of the string matches! The string can be 100 characters long mostly containing illegal characters and it will still match!
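The lookahead pattern suggested above can be checked against the two strings from the question. This is only a sketch, and the function name is_valid_nickname is hypothetical.

```php
<?php
// Hypothetical helper: 6 to 36 word characters, at least one of them a letter.
function is_valid_nickname(string $value): bool
{
    // (?=.*[a-z]) looks ahead from the start for a letter anywhere in the string;
    // \w{6,36} between ^ and $ then constrains both the length and the character set.
    return preg_match('/^(?=.*[a-z])\w{6,36}$/i', $value) === 1;
}

foreach (['11111111111a', 'aaaaaaa!aaaa', '1234567', 'abc_1'] as $nick) {
    echo $nick, ' > ', is_valid_nickname($nick) ? 'good' : 'bad', "\n";
}
```

Keeping the lookahead anchored at ^ and letting \w{6,36} consume the whole string is what prevents illegal characters from slipping through, which is the main flaw of the original pattern.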
You can do it easily using two calls to preg_match as:
if( preg_match('/^[a-z0-9_]{6,36}$/i',$input) && preg_match('/[a-z]/i',$input)) {
// good
} else {
// bad
}