php regex is not escaped - php

What is a regex to find any text that has 'abc' but does not have a '\' before it? It should match 'jfdgabc' but not 'asd\abc'. Basically, the match must not be escaped.

Use:
(?<!\\)abc
This is a negative lookbehind. Basically this is saying: find me the string "abc" that is not preceded by a backslash.
The one problem with this arises if you want to allow escaping of backslashes. For example, with:
123\\abcdef
(i.e. the backslash itself is escaped) it gets a little trickier.
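One way to handle the escaped-backslash case is to allow any even number of backslashes in front of the match. The following is only a minimal sketch: the helper name containsUnescaped is made up for illustration, and it assumes a backslash escapes exactly the one character after it.

```php
<?php
// Hypothetical helper (name is illustrative): reports whether $needle occurs
// with an even number of backslashes (zero included) directly in front of it.
function containsUnescaped(string $haystack, string $needle): bool
{
    // Regex seen by PCRE: (?<!\\)(?:\\\\)*needle
    // Each regex-level backslash is doubled again for the PHP string literal.
    $pattern = '/(?<!\\\\)(?:\\\\\\\\)*' . preg_quote($needle, '/') . '/';
    return preg_match($pattern, $haystack) === 1;
}

var_dump(containsUnescaped('jfdgabc', 'abc'));        // no backslash before abc
var_dump(containsUnescaped('asd\\abc', 'abc'));       // subject is asd\abc
var_dump(containsUnescaped('123\\\\abcdef', 'abc'));  // subject is 123\\abcdef
```

The lookbehind pins the start of the backslash run, so an odd run (an escaped abc) cannot be matched by starting in the middle of it.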

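$str = 'jfdg\abc';
// Four backslashes in the PHP literal reach the regex engine as \\, i.e. one literal backslash.
var_dump(preg_match('#(?<!\\\\)abc#', $str)); // no match here: abc is preceded by a backslash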

Try the regex:
(?<!\\)abc
It matches abc only if it's not preceded by a \.
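A quick check of that lookbehind against the two strings from the question, as a sketch (the variable names are illustrative only):

```php
<?php
// Regex seen by PCRE: (?<!\\)abc  (four backslashes in the PHP literal)
$pattern = '/(?<!\\\\)abc/';

$plain   = preg_match($pattern, 'jfdgabc');   // no backslash before abc
$escaped = preg_match($pattern, 'asd\\abc');  // the subject is asd\abc

var_dump($plain, $escaped);
```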

Related

Regex escape escape characters in PHP

So I have this regex that works on regex101.com
(?:[^\#\\S\\+]*)
It matches the first from first#second.
Whenever I try to use my regex with PHP's preg_replace I don't get the result I expect.
So far I tried it via preg_quote():
\(\?\:\[\^\\#\\S\\\+\]\*\)
And tried it with escaping the original \\ with 4 \'s:
\(\?\:\[\^\\#\\\\S\\\\\+\]\*\)
Still no success. Am I doing something fundamentally wrong?
I'm just using:
preg_replace("/$regex/", "", $string);
All my other regexes that don't need so many escape chars work perfectly that way.
When you use (?:[^\#\\S\\+]*) in preg_match in PHP, in either a single- or double-quoted string literal, the \\S reaches the regex engine as \S, the non-whitespace pattern. [^\S] is equal to \s, i.e. it matches whitespace.
The preg_quote() function is only meant to turn an arbitrary string into a literal for use inside a regex: it just escapes all chars that are special regex metacharacters / operators (like (, ), [, etc.), thus you should not use it here.
While you could use a regex to match 1+ chars other than whitespace and # from the start of a string like preg_match('~^[^#\s]+~', $s, $match), you can just explode your input string with # and get the 0th item.
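Both routes from the paragraph above can be sketched side by side; the sample string is the one from the question, and the variable names are illustrative:

```php
<?php
$s = 'first#second';

// Regex route: one or more chars other than '#' and whitespace, from the start.
preg_match('~^[^#\s]+~', $s, $match);

// Plain string route: split on '#' and take the 0th item.
$first = explode('#', $s)[0];

// [^\S] is a double negative, so it behaves like \s and matches a space.
$matchesSpace = preg_match('/^[^\S]$/', ' ');

var_dump($match[0], $first, $matchesSpace);
```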

regex searching for a backslash

Why is it that, when searching for a backslash in a regex, you need to write the backslash four times?
Example:
$pattern = '/\\\\/';
$string = 'to\m';
preg_match( $pattern, $string, $matches );
echo "<pre>";
print_r($matches);
echo "</pre>";
Returns:
Array
(
[0] => \
)
Because there are two levels of parsing being done, once by PHP, and a second time by the regular expression engine:
The intended target: \
Well I need to put that in a string without it escaping the character after it: "\\", PHP sees \
Now I need to feed that into a regex: "\\\\" PHP sees \\, regex engine sees \
The function preg_quote() removes a layer of confusion by escaping all regular expression metacharacters for you, e.g.:
$foo = preg_quote("c:\\some\\path\\or_whatever", '/'); // second argument also escapes the / delimiter
preg_match("/$foo/", $bar);
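A self-contained version of the same idea, as a sketch; the subject string is invented for the example:

```php
<?php
$path   = 'c:\\some\\path\\or_whatever';  // the string c:\some\path\or_whatever
$quoted = preg_quote($path, '/');         // second argument also escapes the delimiter

// The quoted path now matches itself literally inside a larger subject.
$found = preg_match('/' . $quoted . '/', 'opened c:\\some\\path\\or_whatever today');

// Without quoting, sequences such as \s in the path would be read as regex escapes.
var_dump($found);
```

Only the PHP level of escaping is written by hand here; preg_quote() supplies the regex level.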
edit
You seem to be thinking of this as "units of \\", which doesn't seem like an accurate depiction of what is happening. For a better example let's use a different character that is also significant in both PHP and regular expressions, $.
Intended target: $
Escaping for a PHP string: "\$", the literal string seen by PHP is $
Escaping for a PHP string to be interpreted as a literal $ in a regular expression:
"\\\$", PHP sees the literal string \$, the regular expression sees the literal string $
Illustrated with different styles of braces representing different levels of escaping:
0: $ $
1: \$ [\$]
2: \\\$ [{\\}{\$}]
0: \ \
1: \\ [\\]
2: \\\\ [{\\}{\\}]
0: \\server\$c\Windows
1: [\\][\\]server[\\][\$]c[\\]Windows
2: [{\\}{\\}][{\\}{\\}]server[{\\}{\\}][{\\}{\$}]c[{\\}{\\}]Windows
Which also illustrates why dealing with Windows paths sucks butts.
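The $ levels above can be checked directly. This is a sketch with made-up variable names and a made-up subject string:

```php
<?php
$level1 = "\$";    // PHP string holds: $
$level2 = "\\\$";  // PHP string holds: \$  (the regex sees a literal $)

$subject = 'cost: $5';

preg_match('/' . $level2 . '/', $subject, $escaped);
preg_match('/' . $level1 . '/', $subject, $anchor);

// $escaped[0] is the dollar sign; $anchor[0] is empty, because an unescaped $
// is the end-of-string anchor and matches a zero-width position.
var_dump(strlen($level1), strlen($level2), $escaped[0], $anchor[0]);
```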
This is because the backslash has a special meaning in both a php string and a regular expression, so you must escape it twice:
To match a single backslash, the pure regex should be:
/\\/
If it was:
/\/
, the backslash would be escaping the forward slash, leading to an invalid regex matching a single forward slash, but missing its ending slash.
Then, this pure regex is put into a php string, and each backslash is again escaped:
'/\\\\/'
Because a backslash is a special character, you need to escape it twice. So \\ for the first backslash, and \\ for the second.
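The two layers can be observed by inspecting the string PHP actually builds. A sketch follows; the nowdoc form is an alternative not mentioned in the answers, shown because a nowdoc does no escape processing at the PHP level:

```php
<?php
$pattern = '/\\\\/';          // PHP string holds /\\/ : four characters
$length  = strlen($pattern);

preg_match($pattern, 'to\m', $matches);  // $matches[0] is a single backslash

// A nowdoc performs no escape processing, so the pure regex is written as-is.
$nowdoc = <<<'RE'
/\\/
RE;

// '/\/' holds /\/ : the backslash escapes the closing delimiter, so compilation
// fails; preg_match() emits a warning (suppressed here with @) and returns false.
$broken = @preg_match('/\/', 'a/b');

var_dump($length, $matches[0], $nowdoc === $pattern, $broken);
```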

PHP preg_replace backslash

I have double backslashes '\\' in my string that need to be converted into single backslashes '\'. I've tried several combinations and end up with the whole string disappearing when I use echo, or more backslashes are added to the string by accident. This regex thing is making me go bonkers...lol...
I tried this amongst other failed attempts:
$pattern = '[\\]';
$replacement = '/\/';
?>
<td width="100%"> <?php echo preg_replace($pattern, $replacement,$q[$i]);?></td>
I do apologise if this is a foolish issue and I appreciate any pointers.
Use stripslashes() - it does exactly what you're looking for.
<td width="100%"> <?php echo stripslashes($q[$i]);?></td>
Use stripslashes instead. Also, in your regex, you are searching for single backslashes and your replacement is incorrect. \\{2} should search for double backslashes and \ should replace them with singles, although I haven't tested this.
Just to explain further, the pattern [\\] matches any character in a set comprised of a single backslash. In php, you should also delimit your regex with forward slashes: /[\\]/
Your replacement, which is (without delimiters) \, is not a regular expression for matching a single backslash. The regex for matching a single backslash is \\. Note the escaping. This said, the replacement term needs to be a string, not a regex (with the exception of backreferences).
EDIT: Sven claims below that stripslashes removes all backslashes. This is simply not true, and I will explain why below.
If a string contains 2 backslashes, the first one will be considered an escaping backslash and will be removed. This can be seen at http://www.phpfiddle.org/main/code/3yn-2ut. The fact that any backslashes remain at all by itself contradicts the claim that stripslashes removes all backslashes.
Just to clarify, this string declaration is invalid: $x = "\";, since the backslash escapes the second quote. This string "\\" contains one backslash. In the process of unquoting this string, this backslash will be removed. This "\\\\" string contains two backslashes. When unquoting, the first will be considered an escaping backslash, and will be removed.
Use preg_replace to turn double backslash into single backslash:
preg_replace('/\\\\{2}/', '\\', $str)
The \ in the first parameter needs to be escaped twice, once for string and once more for regex, just like CodeAngry says.
In the second parameter it only gets escaped once for string.
Make sense?
Never use a regular expression if the string you are looking for is constant, as is the case with "Every instance of double backslash".
Use str_replace() for this task. It is a very easy function that replaces every occurrence of a string with another.
In your case: str_replace('\\\\', '\\', $var).
The double backslash actually translates into four backslashes, because inside any quotes (single or double), a single backslash is the start of an escape sequence for the following character. If you want one literal backslash, you have to write two of them. If you want two backslashes, you have to write four of them.
I do not like the suggestion of stripslashes(). This will of course "decode" your double backslash into one single backslash. But it will also remove all single backslashes in the whole string. If there were none - fine, otherwise things will fail now.
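The three approaches discussed in this thread can be compared on one input. This is a sketch with an invented sample path; the regex version doubles the replacement backslash, which preg_replace() also reads as one literal backslash:

```php
<?php
$str = 'C:\\\\temp\\\\file';   // the string C:\\temp\\file (doubled backslashes)

// Constant search string: str_replace needs only the PHP level of escaping.
$viaStr = str_replace('\\\\', '\\', $str);

// Regex route: pattern \\{2} at the regex level, doubled again for PHP.
$viaRegex = preg_replace('/\\\\{2}/', '\\\\', $str);

// stripslashes() halves doubled backslashes but also drops single ones.
$doubled = stripslashes($str);        // doubled input becomes C:\temp\file
$single  = stripslashes('C:\\temp');  // single backslash is removed entirely

var_dump($viaStr, $viaRegex, $doubled, $single);
```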
$pattern = '[\\]'; // wrong
$pattern = '[\\\\]'; // right
escape \ as \\ and escape \\ as \\\\ because \\] means escaped ].
Use htmlentities function to convert your slashes to html entities then using str_replace or preg_match to change them with new entity

matching either nothing (beginning of string) or any character but a \

To use a simplified example, I have:
$str = "Hello :special_text:! Look, I can write \:special_text:";
$pattern = /*???*/":special_text:";
$res = preg_replace($pattern, 'world', $str);
$res = str_replace("/:", ":", $res);
$res === "Hello world! Look, I can write :special_text:"; // => true
In other words, I'd like to be able to "escape" something that I'm writing.
I think that I have something almost working (using [^:]? as the first part of pattern), but I don't think that works if $str === ":special_text:", in that ^ doesn't match [^:]?.
You can use a negative lookbehind:
(?<!\\):special_text:
This says "replace a :special_text: that isn't preceded by a backslash".
In your second step, the str_replace, it looks like you want to replace \: with :.
Also, don't forget that if you use backslashes in PHP strings you need to escape them once more (if you want a literal \ you need to use PHP \\, and to get a literal \\ you need to use PHP \\\\):
$pattern = '#(?<!\\\\):([^:]+):#';
Here the # is just a regex delimiter.
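Putting the lookbehind together with the question's example gives something like the following sketch. The str_replace step uses '\\:' as the search string, on the assumption that the "/:" in the question was meant to be a backslash:

```php
<?php
$str = 'Hello :special_text:! Look, I can write \:special_text:';

// Regex seen by PCRE: (?<!\\):special_text:
$res = preg_replace('/(?<!\\\\):special_text:/', 'world', $str);

// Then drop the escaping backslash from the occurrences that were skipped.
$res = str_replace('\\:', ':', $res);

// The lookbehind also succeeds at the very start of the string.
$atStart = preg_replace('/(?<!\\\\):special_text:/', 'world', ':special_text:');

var_dump($res, $atStart);
```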
$pattern = "/[^\\\\]*:special_text:/";
-or-
$pattern = "/(?<!\\\\):special_text:/";
The other answers don't take into account the need to super-escape the backslashes in this situation. It's a little crazy.
To match a literal backslash, one has to write \\\\ as the regex string because the regular expression must be \\, and each backslash must be expressed as \\ inside a string literal. In regexes that feature backslashes repeatedly, this leads to lots of repeated backslashes and makes the resulting strings difficult to understand.
Something like this should do it: /[^\\]\:([a-z]+)\:/i
You can use RegexPal to test your regex against possible strings in realtime.

Why does this PHP regex give me error?

Need Some Help With Regex:
I want to replace
[url=http://youtube.com]YouTube.com[/url]
with
YouTube.com
the regex
preg_replace("/[url=(.*?)](.*?)[/url]/is", '$2', $text);
why does this give me:
Warning: preg_replace() [function.preg-replace]: Unknown modifier 'r' in C:\Programa\wamp\www\func.php on line 18
You should escape special characters in your regular expression:
preg_replace('/\[url=(.*?)](.*?)\[\/url]/is', '$2', $text);
I have escaped the [ characters (they specify the start of a character class) and the / character (it specifies the boundaries of the regular expression.)
Alternatively (for the / character) you can use another boundary character:
preg_replace('#\[url=(.*?)](.*?)\[/url]#is', '$2', $text);
You still have to escape the [ characters, though.
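A runnable sketch of the fix, with an invented sample sentence around the tag from the question; the failing call is included only to show what the unescaped pattern returns:

```php
<?php
$text = 'Visit [url=http://youtube.com]YouTube.com[/url] today';

// '#' as the delimiter means the '/' in [/url] needs no escaping.
$out = preg_replace('#\[url=(.*?)\](.*?)\[/url\]#is', '$2', $text);

// With '/' as delimiter and nothing escaped, the pattern ends at the '/' in
// [/url], the rest is read as modifiers, and preg_replace() returns null.
// The warning is suppressed with @ for the demonstration.
$bad = @preg_replace('/[url=(.*?)](.*?)[/url]/is', '$2', $text);

var_dump($out, $bad);
```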
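PHP is interpreting the '/' in /url as being the end of the regex and the start of the regex options. That is why the warning names 'r': the 'u' of url is accepted as a modifier and the 'r' after it is rejected. Insert a '\' before the '/' to make it a literal '/'.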
You need to escape the '['s in the same way (otherwise they will be interpreted as introducing a character class).
preg_replace("/\[url=(.*?)](.*?)\[\/url]/is", '$2', $text);
Both the slashes and square brackets are special characters in regex, you will need to escape them:
\/ \[ \]
The 2nd '/' in a regex string ends the regex. You need to escape it. Also, preg_replace will interpret the '[url=(.*?)]' as a character class, so you need to escape those as well.
preg_replace('/\[url=(.*?)\](.*?)\[\/url\]/is', '$2', $text);
You seem to be just starting out with regular expressions. If that is the case - or maybe even if it isn't - you will find the Regex Coach to be a very helpful tool. It provides a sandbox for us to test our pattern matches and our replace strings too. If you had been using that it would have highlighted the need to escape the special characters.
