preg_match('~^[a-zA-Z0-9!##$%^*()-_+=.]+$~', $string)
This is the pattern I used in my code. I wanted to tell users that they may only use these characters. The problem is that it works for some characters and not for others. For example, it rejects a string like "john&john" but accepts "test<>", even though I didn't put '<' or '>' in the pattern!
Inside a character class most of these characters are already literal, so the backslashes are optional for them. The one that matters is the hyphen: unescaped between ) and _ it defines a range, which is why < and > got through. Escaping everything is a safe way to rule that out:
^[a-zA-Z0-9\!\#\#\$\%\^\*\(\)\-\_\+\=\.]+$
https://regex101.com/r/kH7hD8/1
I always test my regexps with a tool like https://regex101.com/
You must escape some special characters in your regexp:
^[a-zA-Z0-9!##\$%\^\*\(\)\-_\+=\.]+$
This will work:
preg_match('~^[a-zA-Z0-9!##$%^*()_+=.-]+$~', $string)
The problem is the unescaped hyphen in the middle of the character class, which acts as a range operator. Use this regex, where \w covers letters, digits and the underscore and the hyphen comes last:
preg_match('~^[\w!##$%^*()+=.-]+$~', $string)
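To see the range effect concretely, here is a small sketch that runs the original pattern and the hyphen-last pattern from this thread against a few inputs. The test strings other than "john&john" and "test<>" are my own, chosen only for illustration.

```php
<?php
// Original class: ")-_" is read as a range from ")" (0x29) to "_" (0x5F),
// which happens to contain "<", ">" and many other characters.
$original = '~^[a-zA-Z0-9!##$%^*()-_+=.]+$~';

// Hyphen moved to the end of the class, so it is a literal "-".
$fixed = '~^[a-zA-Z0-9!##$%^*()_+=.-]+$~';

foreach (['john&john', 'test<>', 'a-b_c'] as $input) {
    printf(
        "%-10s original=%d fixed=%d\n",
        $input,
        preg_match($original, $input),
        preg_match($fixed, $input)
    );
}
```

"john&john" fails with both patterns because & (0x26) is neither listed nor inside the accidental range, while "test<>" passes only with the original pattern.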
I have a string that contains normal characters, whitespace and newline characters between <div> and </div>.
This regular expression doesn't work: /<div>(.*)<\/div>/. That is because .* doesn't match newline characters. How can I make it match them?
You need to use the DOTALL modifier (/s).
'/<div>(.*)<\/div>/s'
This might not give you exactly what you want, because .* is greedy and runs to the last </div> in the string. You might instead try a non-greedy match:
'/<div>(.*?)<\/div>/s'
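A quick sketch of the difference, using a made-up two-div string (the sample data is an assumption for illustration only):

```php
<?php
$html = "<div>first\nblock</div><div>second\nblock</div>";

// No /s: the dot cannot cross "\n", so neither <div> matches.
$noDotall = preg_match('/<div>(.*)<\/div>/', $html);

// /s with a greedy .* runs to the LAST </div> in the string.
preg_match('/<div>(.*)<\/div>/s', $html, $greedy);

// /s with a lazy .*? stops at the FIRST </div>.
preg_match('/<div>(.*?)<\/div>/s', $html, $lazy);

var_dump($noDotall);   // match count without /s
var_dump($greedy[1]);  // spans both divs
var_dump($lazy[1]);    // only the first div's content
```

The greedy capture swallows the closing tag of the first div and the opening tag of the second, which is usually not what you want when there are several divs.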
You could also solve this by matching everything except '<' if there aren't other tags:
'/<div>([^<]*)<\/div>/'
Another observation is that you don't need to use / as your regular expression delimiter. Using another character means that you don't have to escape the / in </div>, which improves readability. This applies to all the above regular expressions. Here's how it would look if you use '#' instead of '/':
'#<div>([^<]*)</div>#'
However all these solutions can fail due to nested divs, extra whitespace, HTML comments and various other things. HTML is too complicated to parse with Regex, so you should consider using an HTML parser instead.
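For comparison, here is a minimal sketch of the parser route using PHP's bundled DOM extension. It assumes ext-dom is enabled (it is by default in most builds), and the nested sample markup is invented for the example.

```php
<?php
$html = "<div>outer\n<div>inner</div>\ntext</div>";

$doc = new DOMDocument();
// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

// Every <div>, nested or not, in document order.
$divs = $doc->getElementsByTagName('div');

foreach ($divs as $div) {
    echo trim($div->textContent), "\n---\n";
}
```

Nested divs, which trip up every regex above, come back as separate nodes here, and textContent gives you the text with newlines intact.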
To match all characters, you can use this trick:
%\<div\>([\s\S]*)\</div\>%
You can also use the (?s) mode modifier. For example,
/(?s)<div>(.*?)<\/div>/
There shouldn't be any problem with just doing:
(.|\n)
This matches either any character except newline or a newline, so every character. It solved it for me, at least.
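As far as I can tell, the approaches mentioned in this thread are interchangeable for this input. The sketch below runs four of them against the same made-up string and collects the captures so you can compare them; I used # as the delimiter and made the quantifiers lazy throughout.

```php
<?php
$subject = "<div>line one\nline two</div>";

// Four ways to let the capture cross a newline.
$variants = [
    'modifier /s'  => '#<div>(.*?)</div>#s',
    'inline (?s)'  => '#(?s)<div>(.*?)</div>#',
    '[\s\S] class' => '#<div>([\s\S]*?)</div>#',
    '(?:.|\n)'     => '#<div>((?:.|\n)*?)</div>#',
];

$captures = [];
foreach ($variants as $label => $pattern) {
    // Store the first capture group, or null when the pattern fails.
    $captures[$label] = preg_match($pattern, $subject, $m) ? $m[1] : null;
    printf("%-13s %s\n", $label, json_encode($captures[$label]));
}
```

The alternation form tends to be the slowest of the four on long inputs, so the /s modifier is the one I would reach for first.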
An option would be:
'/<div>((?:\n|.)*)<\/div>/i'
At each position this matches either a newline or whatever the dot matches.
There is usually a flag in the regular expression compiler to tell it that dot should match newline characters.
I'm working in PHP and want to set some rules for a submitted text field. I want to allow letters, numbers, spaces, and the symbols # ' , -
This is what I have:
/^(a-z,0-9+# )+$/i
That seems to work but when I add the ' or - symbols I get errors.
Almost there. What you're looking for is called a character class, which is written with square brackets. For example:
/^[-a-z0-9+#,' ]+$/i
To include a literal hyphen, put it first or last in the class, or escape it.
Edit
As you want to include the single quote and you're using PHP where regular expressions must be represented as strings, be careful with how you quote the pattern. In this case, you can use either of
$pattern = "/^[-a-z0-9+#,' ]+\$/i"; // or
$pattern = '/^[-a-z0-9+#,\' ]+$/i';
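A short check that the two quoting styles really produce the same pattern string. The variable names and sample inputs are mine, for illustration only.

```php
<?php
// Double quotes: the single quote needs no escape, and \$ is a literal "$".
$double = "/^[-a-z0-9+#,' ]+\$/i";

// Single quotes: only the embedded single quote needs a backslash.
$single = '/^[-a-z0-9+#,\' ]+$/i';

var_dump($double === $single);                        // same pattern either way
var_dump(preg_match($single, "O'Neil-Smith, #4"));    // all characters allowed
var_dump(preg_match($single, 'a&b'));                 // "&" is not in the class
```

Single quotes are usually the lower-risk choice for patterns, since PHP leaves $ and most backslash sequences alone inside them.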
You should use a character class: [a-zA-Z0-9 #',-]
Note that - should be placed first or last, or be escaped; otherwise it is treated as denoting a range and you will get errors.
I want to allow letters, numbers, spaces, and the symbols #, ', , and -.
Use this regex...
/^[-a-zA-Z\d ',#]+\z/
Note the \z. If you use $, you are also allowing a trailing \n.
Be sure to escape the ' if you are using ' as your string delimiter.
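The trailing-newline point is easy to miss, so here is a small sketch comparing the two anchors. The variable names and the "hello" input are just examples.

```php
<?php
// Same character class, different end anchors.
$withDollar = '/^[-a-zA-Z\d \',#]+$/';
$withZ      = '/^[-a-zA-Z\d \',#]+\z/';

$input = "hello\n"; // valid text followed by a stray newline

var_dump(preg_match($withDollar, $input)); // $ also matches just before a final "\n"
var_dump(preg_match($withZ, $input));      // \z only matches at the very end
var_dump(preg_match($withZ, 'hello'));     // clean input still passes
```

If the value comes from a form field and goes into a header, a log line or a file name, that stray newline is worth rejecting.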
Please use /^[a-z,0-9+\#\-,\s]+$/i
Use this regex:
/^[-a-z0-9,# ']+$/i
Need Some Help With Regex:
I want to replace
[url=http://youtube.com]YouTube.com[/url]
with
YouTube.com
using this regex:
preg_replace("/[url=(.*?)](.*?)[/url]/is", '$2', $text);
Why does this give me:
Warning: preg_replace() [function.preg-replace]: Unknown modifier 'r' in C:\Programa\wamp\www\func.php on line 18
You should escape special characters in your regular expression:
preg_replace('/\[url=(.*?)](.*?)\[\/url]/is', '$2', $text);
I have escaped the [ characters (they specify the start of a character class) and the / character (it specifies the boundaries of the regular expression.)
Alternatively (for the / character) you can use another boundary character:
preg_replace('#\[url=(.*?)](.*?)\[/url]#is', '$2', $text);
You still have to escape the [ characters, though.
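Here is the corrected call as a runnable sketch, with # as the delimiter and both brackets escaped. The surrounding sentence in $text is made up for the example.

```php
<?php
$text = 'Visit [url=http://youtube.com]YouTube.com[/url] today.';

// \[ and \] are literal brackets; with # as the delimiter,
// the slash in [/url] needs no escape.
$result = preg_replace('#\[url=(.*?)\](.*?)\[/url\]#is', '$2', $text);

echo $result, "\n"; // prints "Visit YouTube.com today."
```

Here $1 holds the URL and $2 the link text, so switching the replacement to '$1' would keep the address instead.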
PHP is interpreting the '/' in /url as being the end of the regex and the start of the regex options. Insert a '\' before it to make it a literal '/'.
You need to escape the '['s in the same way (otherwise they will be interpreted as introducing a character class).
preg_replace("/\[url=(.*?)](.*?)\[\/url]/is", '$2', $text);
The slash (because it is the delimiter here) and the square brackets are special characters in this regex, so you will need to escape them:
\/ \[ \]
The second '/' in a regex string ends the regex, so you need to escape it. Also, preg_replace will interpret '[url=(.*?)]' as a character class, so you need to escape the square brackets as well.
preg_replace('/\[url=(.*?)\](.*?)\[\/url\]/is', '$2', $text);
You seem to be just starting out with regular expressions. If that is the case - or maybe even if it isn't - you will find the Regex Coach to be a very helpful tool. It provides a sandbox for us to test our pattern matches and our replace strings too. If you had been using that it would have highlighted the need to escape the special characters.