I am trying to find and replace binary values in strings:
$str = '('.chr(0x00).chr(0x91).')' ;
$str = preg_replace("/\x00\x09/",'-',$str) ;
But I am getting "Warning: preg_replace(): Null byte in regex" error message.
How to work on binary values in Regex/PHP?
It is because you are using double quotes " around your regex pattern, which make the php engine parse the chars \x00 and \x09.
If you use single quotes instead, it will work:
$str = '(' . chr(0x00) . chr(0x91) . ')' ;
$str = preg_replace('/\x00\x09/', '-', $str) ;
But your regex seems also not to be correct, if I understood your question correctly. If you want to replace the chars \x00 and \x91 with a dash -, you must put them into brackets []:
$str = preg_replace('/[\x00\x91]/', '-', $str) ;
Related
So, I'm doing some manipulation on lat/long pairs, and I need to turn this:
39.1889375383777,-94.48019109594397
into:
39.1889375383777 -94.48019109594397
I can't use str_replace, unless I want to have an array of 10 search and 10 replace strings, so I was hoping to use preg_replace:
$query1 = preg_replace( "/([0-9-]),([0-9-])/", "\1 \2", $query );
The problem is that the "-" gets lost:
39.1889375383777 94.48019109594397
Note, that I have a string containing a list of these, trying to do all at once:
[[39.1889375383777,-94.48019109594397],[39.18425796890108,-94.28288005131176],[39.41972019529712,-94.19956344733345],[39.41412315915102,-94.41932608390658],[39.34785744845041,-94.4893603307242],[39.1889375383777,-94.48019109594397]]
I managed to make this work with preg_replace_callback:
$str = preg_replace_callback( "/([0-9-]),([0-9-])/",
function ($matches) {return $matches[1] . " " . $matches[2];},
$query
);
But still not sure why the simpler preg_match didn't work?
Your main issue is that "\1 \2" define a "\x1\x20\x2" string, where the first character is a SOH char and the third one is STX char (see the ASCII table). To define backreferences, you need to use a literal backslash, "\\", or, better, use $n notation, and better inside a single-quoted string literal.
You can also use a solution without backreferences:
preg_replace('~(?<=\d),(?=-?\d)~', ' ', $str)
Details:
(?<=\d) - a location that is immediately preceded with a digit
, - a comma
(?=-?\d) - a location that is immediately followed with an optional - and a digit.
See the PHP demo:
$str = '[[39.1889375383777,-94.48019109594397],[39.18425796890108,-94.28288005131176],[39.41972019529712,-94.19956344733345],[39.41412315915102,-94.41932608390658],[39.34785744845041,-94.4893603307242],[39.1889375383777,-94.48019109594397]]';
echo preg_replace('~(?<=\d),(?=-?\d)~', ' ', $str);
// => [[39.1889375383777 -94.48019109594397],[39.18425796890108 -94.28288005131176],[39.41972019529712 -94.19956344733345],[39.41412315915102 -94.41932608390658],[39.34785744845041 -94.4893603307242],[39.1889375383777 -94.48019109594397]]
preg_replace does not return desired result when I use it on string fetched from database.
$result = DB::connection("connection")->select("my query");
foreach($result as $row){
//prints run-d.m.c.
print($row->artist . "\n");
//should print run.d.m.c
//prints run-d.m.c
print(preg_replace("/-/", ".", $row->artist) . "\n");
}
This occurs only when i try to replace - (dash). I can replace any other character.
However if I try this regex on simple string it works as expected:
$str = "run-d.m.c";
//prints run.d.m.c
print(preg_replace("/-/", ".", $str) . "\n");
What am I missing here?
It turns out you have Unicode dashes in your strings. To match all Unicode dashes, use
/[\p{Pd}\xAD]/u
See the regex demo
The \p{Pd} matches any hyphen in the Unicode Character Category 'Punctuation, Dash' but a soft hyphen, \xAD, hence it should be combined with \p{Pd} in a character class.
The /u modifier makes the pattern Unicode aware and makes the regex engine treat the input string as Unicode code point sequence, not a byte sequence.
I need to replace all double quotes in any (variable) given string.
For example:
$text = 'data-caption="hello"world">';
$pattern = '/data-caption="[[\s\S]*?"|(")]*?">/';
$output = preg_replace($pattern, '"', $text);
should result in:
"hello"world"
(The above pattern is my attempt at getting it to work)
The problem is that I don't now in advance if and how many double quotes are going to be in the string.
How can i replace the " with quot; ?
You may match strings between data-caption=" and "> and then replace all " inside that match with " using a mere str_replace:
$text = 'data-caption="<element attribute1="wert" attribute2="wert">Name</element>">';
$pattern = '/data-caption="\K.*?(?=">)/';
$output = preg_replace_callback($pattern, function($m) {
return str_replace('"', '"', $m[0]);
}, $text);
print_r($output);
// => data-caption="<element attribute1="wert" attribute2="wert">Name</element>">
See the PHP demo
Details
data-caption=" - starting delimiter
\K - match reset operator
.*? - any 0+ chars other than line break chars, as few as possible
(?=">) - a positive lookahead that requires the "> substring immediately to the right of the current location.
The match is passed to the anonymous function inside preg_replace_callback (accessible via $m[0]) and that is where it is possible to replace all " symbols in a convenient way.
preg_replace does not return desired result when I use it on string fetched from database.
$result = DB::connection("connection")->select("my query");
foreach($result as $row){
//prints run-d.m.c.
print($row->artist . "\n");
//should print run.d.m.c
//prints run-d.m.c
print(preg_replace("/-/", ".", $row->artist) . "\n");
}
This occurs only when i try to replace - (dash). I can replace any other character.
However if I try this regex on simple string it works as expected:
$str = "run-d.m.c";
//prints run.d.m.c
print(preg_replace("/-/", ".", $str) . "\n");
What am I missing here?
It turns out you have Unicode dashes in your strings. To match all Unicode dashes, use
/[\p{Pd}\xAD]/u
See the regex demo
The \p{Pd} matches any hyphen in the Unicode Character Category 'Punctuation, Dash' but a soft hyphen, \xAD, hence it should be combined with \p{Pd} in a character class.
The /u modifier makes the pattern Unicode aware and makes the regex engine treat the input string as Unicode code point sequence, not a byte sequence.
I use the RegExr website to test my regular expressions. On that website it can easily find the character "\" with the RegEx string /\\/g. But when I use it in php it throws back:
Warning: preg_replace(): No ending delimiter '/'
My Code
$str = "0123456789 _+-.,!##$%^&*();\/|<>";
echo preg_replace('/\\/', '', $str);
Why doesn't PHP like to escape "\"?
When using it in the regexp use \\ to use it in the replacement, use \\\\ will turn into \\ that will be interpreted as a single backslash.
Use it like this:
<?php
$str = "0123456789 _+-.,!##$%^&*();\/|<>";
echo preg_replace('/\\\\/', '', $str);
Output:
0123456789 _+-.,!##$%^&*();/|<>