Regex escape escape characters in PHP - php

So I have this regex that works on regex101.com
(?:[^\#\\S\\+]*)
It matches the first from first#second.
Whenever I try to use my regex with PHP's preg_replace I don't get the result I expect.
So far I tried it via preg_quote():
\(\?\:\[\^\\#\\S\\\+\]\*\)
And tried it with escaping the original \\ with 4 \'s:
\(\?\:\[\^\\#\\\\S\\\\\+\]\*\)
Still no success. Am I doing something fundamentally wrong?
I'm just using:
preg_replace("/$regex/", "", $string);
All my other regexes that don't need so many escape chars work perfectly that way.

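When you use (?:[^\#\\S\\+]*) in a preg_match in PHP, in either a single- or double-quoted string literal, PHP collapses \\S to \S before the regex engine sees it, so it is parsed as the non-whitespace shorthand rather than as a literal backslash followed by S. The class therefore reads [^\#\S\+], and since [^\S] is equal to \s, it can only match whitespace.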
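A minimal sketch of that collapsing step, using the pattern from the question. The variable names ($single, $double, $m, $m2) and the second sample string are only illustrative:

```php
<?php
// The same pattern text written as a single- and a double-quoted literal.
$single = '(?:[^\#\\S\\+]*)';
$double = "(?:[^\#\\S\\+]*)";

// Both literals produce the same 14-character string: (?:[^\#\S\+]*)
echo $single, PHP_EOL;
var_dump($single === $double);

// The class excludes every non-whitespace character, so on "first#second"
// the only possible match is the empty string at offset 0.
preg_match('/' . $single . '/', 'first#second', $m);
var_dump($m[0]);

// On a string that starts with whitespace, only that whitespace is matched.
preg_match('/' . $single . '/', "  \tfirst", $m2);
var_dump($m2[0]);
```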
The preg_quote() function is only meant to turn an arbitrary string into a literal for use inside a regex: it escapes all characters that are special regex metacharacters or operators (like (, ), [, etc.), so you should not apply it to a whole pattern here.
While you could use a regex to match 1+ chars other than whitespace and # from the start of a string like preg_match('~^[^#\s]+~', $s, $match), you can just explode your input string with # and get the 0th item.
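Both options can be sketched as follows, assuming the input is the first#second sample from the question (the variable names are illustrative):

```php
<?php
$s = 'first#second';

// Option 1: split on "#" and take the 0th item.
$viaExplode = explode('#', $s)[0];

// Option 2: match 1+ chars other than "#" and whitespace from the start.
// preg_match() returns 1 on a match and fills $match.
$viaRegex = preg_match('~^[^#\s]+~', $s, $match) ? $match[0] : null;

echo $viaExplode, PHP_EOL; // first
echo $viaRegex, PHP_EOL;   // first
```

Note that the two differ on input containing whitespace before the #: explode() keeps it, while the regex stops at the first whitespace character.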

Related

PHP preg_replace not working as intended

I am trying to replace /admin and \admin from the following two strings:
F:\dev\htdocs\cms\admin
http://localhost/cms/admin
Using the following regular expression in preg_replace:
/[\/\\][a-zA-Z0-9_-]*$/i
1) From the first string it just replaces admin, whereas it should replace \admin
2) From the second string it replaces everything except http:, whereas it should replace only /admin
I have checked this expression on http://regexpal.com/ and it works perfectly there, but not in PHP.
Any idea?
Note that the last part of each string, admin, is not fixed; it can be any user-selected value, and that's why I have used [a-zA-Z0-9_-]* in the regular expression.
The regular expression itself should be /[\/\\][a-zA-Z0-9_-]*$/i, but since backslashes must also be escaped in PHP string declarations, each of the two backslashes in the regex's \\ has to be written as \\ in the string -- 4 backslashes in total to match one literal backslash.
From the PHP manual:
Single and double quoted PHP strings have special meaning of backslash. Thus if \ has to be matched with a regular expression \\, then "\\\\" or '\\\\' must be used in PHP code.
So, your preg_replace() statement should look like:
echo preg_replace('/[\/\\\\][a-zA-Z0-9_-]*$/i', '', $str);
Your regex can be improved as follows:
echo preg_replace('~[/\\\\][\w-]*$~', '', $str);
Here, \w matches the ASCII characters [A-Za-z0-9_]. You can also avoid having to escape the forward slash / by using a different delimiter -- I've used ~ above.
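As a quick check, here is the improved pattern applied to the two strings from the question. This is only a sketch; $inputs and $results are illustrative names:

```php
<?php
// Single-quoted literals: "\d", "\h", "\c" and "\a" are NOT escape
// sequences here, so the Windows path is stored exactly as written.
$inputs = [
    'F:\dev\htdocs\cms\admin',
    'http://localhost/cms/admin',
];

$results = [];
foreach ($inputs as $str) {
    // '\\\\' in the PHP string becomes \\ in the regex, i.e. one literal backslash.
    $results[] = preg_replace('~[/\\\\][\w-]*$~', '', $str);
}

print_r($results);
// F:\dev\htdocs\cms
// http://localhost/cms
```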
/[\/\\\][a-zA-Z0-9_-]*$/i

Regex match with special character

I have the following regex, which works fine for matching a normal string against an array:
preg_grep( "/^". $name . "$/i", $values);
However, it's not working for strings that contain special characters, like "Entertainment (General)".
I found a related thread; however, it's for JavaScript, and it didn't help.
Use the preg_quote function to escape any special characters that might be in the string:
preg_grep( "/^". preg_quote($name, '/') . "$/i", $values);
From the documentation:
preg_quote() puts a backslash in front of every character that is part of the regular expression syntax. This is useful if you have a run-time string that you need to match in some text and the string may contain special regex characters.
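To see the difference, here is a small sketch comparing the unquoted and quoted versions. The contents of $values are made up for illustration:

```php
<?php
// Hypothetical list of category names.
$values = [
    'Entertainment (General)',
    'entertainment (general)',
    'Entertainment General',
    'Sports',
];
$name = 'Entertainment (General)';

// Without quoting, the parentheses form a capture group, so the pattern
// actually means "Entertainment General".
$unquoted = preg_grep('/^' . $name . '$/i', $values);

// With preg_quote() the parentheses are matched literally. The second
// argument also escapes the "/" delimiter should it occur in $name.
$quoted = preg_grep('/^' . preg_quote($name, '/') . '$/i', $values);

print_r($unquoted); // only key 2
print_r($quoted);   // keys 0 and 1
```

preg_grep() preserves the keys of the input array, which is why the results are indexed 2 and 0, 1 rather than starting from 0.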

Getting regular expression

How can I extract https://domain.com/gamer?hid=.115f12756a8641 from the string below, i.e. the value of url?
rrth:'http://www.google.co',cctp:'323',url:'https://domain.com/gamer?hid=.115f12756a8641',rrth:'https://another.com'
P.S.: I am new to regular expressions and still learning. The string above seems to be consistently formatted, so there must be some sort of shortcut.
If your input string is called $str:
preg_match('/url:\'(.*?)\'/', $str, $matches);
$url = $matches[1];
(.*?) captures everything between url:' and ' and can later be retrieved with $matches[1].
The ? is particularly important. It makes the repetition ungreedy, otherwise it would consume everything until the very last '.
If your actual input string contains multiple url:'...' sections, use preg_match_all instead. $matches[1] will then be an array of all required values.
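A sketch of the preg_match_all variant. The string is the one from the question with a second url:'...' section appended; that second URL is an invented example value:

```php
<?php
// Question's string plus one hypothetical extra url:'...' section.
$str = "rrth:'http://www.google.co',cctp:'323',"
     . "url:'https://domain.com/gamer?hid=.115f12756a8641',"
     . "rrth:'https://another.com',url:'https://example.org/two'";

// The lazy (.*?) stops at the first closing quote after each url:'
preg_match_all('/url:\'(.*?)\'/', $str, $matches);

// $matches[0] holds the full matches, $matches[1] the captured URLs.
print_r($matches[1]);
```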
Simple regex:
preg_match('/url\s*\:\s*\'([^\']+)/i',$theString,$match);
echo $match[1];//should be the url
How it works:
/url\s*\:\s*: matches url + [any number of spaces] + : (colon) + [any number of spaces]. But we don't need to capture this part; that's where the second part comes in.
\'([^\']+)/i: matches ', then the parentheses (()) create a group that will be stored separately in the $match array. What will be matched is [^']+: any character except the apostrophe (the [] create a character class, the ^ means: exclude these chars). So this class will match every character up to the point where it reaches the closing/delimiting apostrophe.
/i: in case the string might contain URL:'http://www.foo.bar', I've added that i, which is the case-insensitive flag.
That's about it. A regex tutorial or reference is a good place to get a better understanding of regexes.
Note: I've had to escape the single quotes because the PHP pattern string uses single quotes as its string delimiters: "/url\s*\:\s*'([^']+)/i" works just as well. If you don't know whether you'll be dealing with single or double quotes, you could replace the quotes with another char class:
preg_match('/url\s*\:\s*[\'"]([^\'"]+)/i',$string,$match);
Obviously, in that scenario, you'll have to escape the delimiters you've used for the pattern string...
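A small sketch of the quote-agnostic pattern in use. Both sample inputs are made-up values, and $found is an illustrative name:

```php
<?php
// The pattern lives in a single-quoted PHP string, so only the single
// quotes inside it need a backslash.
$pattern = '/url\s*\:\s*[\'"]([^\'"]+)/i';

// Hypothetical inputs: one single-quoted, one double-quoted with
// upper-case key and extra spaces around the colon.
$samples = [
    "url:'https://domain.com/a'",
    'URL : "https://domain.com/b"',
];

$found = [];
foreach ($samples as $sample) {
    if (preg_match($pattern, $sample, $match)) {
        $found[] = $match[1]; // group 1 is the URL without the quotes
    }
}

print_r($found);
```

One caveat of this shortcut: because the class excludes both quote characters, a value that legitimately contains the other kind of quote would be cut short.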

How to use regex to match this html tag?

I can't seem to figure out what I'm doing wrong...
I'm trying to find matches of
<cite>stuffhere</cite>
Is this right?
preg_match_all('<cite>(.*?)</cite>/ms', $str, $matches)
Add the opening delimiter and escape the / inside the pattern:
preg_match_all('/<cite>(.*?)<\/cite>/ms', $str, $matches);
Your confusion is not your fault; PHP is notoriously weird in this area.
In most programming languages, you create a regex object one of two ways. If the language supports regexes as a first-class language element, you can use a regex literal:
var re = /<b>"\w+"<\/b>/; // JavaScript
Here, the forward-slash (/) is the regex delimiter; if you want to match a literal /, you have to escape it with a backslash: \/.
In other languages, you have to write the regex in the form of a string literal, which you then pass to a constructor or a factory method:
Pattern p = Pattern.compile("<b>\"\\w+\"</b>"); // Java
The forward-slash doesn't need to be escaped, but both the double-quote (") and backslash (\) do, because of their special meanings in string literals.
But PHP is unique: it doesn't support regex literals, so you have to write the regex as a string, but the string has to look like a regex literal! That is, it has to have string delimiters (quotes) and regex delimiters. For example:
$re = '/<b>"\w+"<\/b>/';
It isn't all bad; as you can see, you can use PHP's single-quoted strings instead of double-quoted, so you don't have to escape all backslashes and double-quotes. You can also choose different regex delimiters, so you don't have to escape (for example) literal forward-slashes in your regex:
$re = '~<cite>(.*?)</cite>~s';
The modifiers ('s' for single-line, 'i' for ignore-case, etc.) go after the trailing regex delimiter, as in Perl or JavaScript. Almost any ASCII punctuation character can be used as a regex delimiter; ~ and # are popular choices.
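For instance, the ~-delimited pattern could be used like this. The $html sample is invented for illustration:

```php
<?php
// Hypothetical input; the second <cite> spans two lines.
$html = "<cite>first</cite>\n<cite>multi\nline</cite>";

// "~" as the delimiter means the "/" in </cite> needs no escaping.
// The "s" modifier lets "." match newlines as well.
$re = '~<cite>(.*?)</cite>~s';

preg_match_all($re, $html, $matches);

// $matches[1] holds the text between each pair of tags.
print_r($matches[1]);
```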
You should use an HTML Parser to parse html, or you will end up with unexpected errors. However, this is what your regex should be:
'#<cite>(.*?)</cite>#s'
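As a rough sketch of the parser route, assuming PHP's bundled DOM extension is available, something like the following collects the text of every cite element. The $html value is a made-up sample:

```php
<?php
// Hypothetical fragment; the second <cite> contains nested markup.
$html = '<p>See <cite>stuffhere</cite> and <cite>more <b>stuff</b></cite>.</p>';

// Collect parser warnings internally instead of emitting them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

$cites = [];
foreach ($doc->getElementsByTagName('cite') as $node) {
    // textContent returns the element's text with nested tags stripped.
    $cites[] = $node->textContent;
}

print_r($cites);
```

Unlike the regex, this copes with attributes on the tag and with nested elements without any changes.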
Try this:
preg_match_all('/<cite>(.*?)<\/cite>/ms', $str, $matches);

Why does this PHP regex give me error?

Need Some Help With Regex:
I want to replace
[url=http://youtube.com]YouTube.com[/url]
with
YouTube.com
the regex
preg_replace("/[url=(.*?)](.*?)[/url]/is", '$2', $text);
why does this give me:
Warning: preg_replace() [function.preg-replace]: Unknown modifier 'r' in C:\Programa\wamp\www\func.php on line 18
You should escape special characters in your regular expression:
preg_replace('/\[url=(.*?)](.*?)\[\/url]/is', '$2', $text);
I have escaped the [ characters (they specify the start of a character class) and the / character (it specifies the boundaries of the regular expression.)
Alternatively (for the / character) you can use another boundary character:
preg_replace('#\[url=(.*?)](.*?)\[/url]#is', '$2', $text);
You still have to escape the [ characters, though.
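PHP is interpreting the '/' in /url as being the end of the regex and the start of the regex options, so the rest (url]/is) is read as modifiers: u is a valid modifier, and r is the first one PHP rejects, hence the warning. Insert a '\' before the '/' to make it a literal '/'.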
You need to escape the '['s in the same way (otherwise they will be interpreted as introducing a character class).
preg_replace("/\[url=(.*?)](.*?)\[\/url]/is", '$2', $text);
Both the slashes and the square brackets are special here (the slash as the pattern delimiter, the square brackets as regex metacharacters), so you will need to escape them:
\/ \[ \]
The 2nd '/' in a regex string ends the regex. You need to escape it. Also, preg_replace will interpret the '[url=(.*?)]' as a character class, so you need to escape those as well.
preg_replace('/\[url=(.*?)\](.*?)\[\/url\]/is', '$2', $text);
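Put together, a minimal runnable sketch looks like this. The surrounding sentence in $text is invented; the [url] tag is the one from the question:

```php
<?php
// Hypothetical text containing the BBCode tag from the question.
$text = 'Visit [url=http://youtube.com]YouTube.com[/url] today';

// "#" as the delimiter avoids escaping the "/" in [/url];
// the square brackets are escaped so they are matched literally.
$result = preg_replace('#\[url=(.*?)\](.*?)\[/url\]#is', '$2', $text);

echo $result, PHP_EOL; // Visit YouTube.com today
```

Here $2 refers to the second capture group, i.e. the link text between the opening and closing tags.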
You seem to be just starting out with regular expressions. If that is the case - or maybe even if it isn't - you will find the Regex Coach to be a very helpful tool. It provides a sandbox for us to test our pattern matches and our replace strings too. If you had been using that it would have highlighted the need to escape the special characters.
