I have to replace newlines (\n) with & in a string so that the received data can be parsed into an array with parse_str(). The problem is that when I put \n in single quotes, the newline somehow appears to be replaced with a space:
str_ireplace(array('&', '+', '\n'), array('', '', '&'), $response)
"id=1 name=name gender=gender age=age friends=friends"
But when I put \n in double quotes then it works just fine:
str_ireplace(array('&', '+', "\n"), array('', '', '&'), $response)
"id=1&name=name&gender=gender&age=age&friends=friends"
Why is that so?
Because only the escape sequences \' and \\ have a meaning in single-quoted strings.
See the documentation:
To specify a literal single quote, escape it with a backslash (\). To specify a literal backslash, double it (\\). All other instances of backslash will be treated as a literal backslash: this means that the other escape sequences you might be used to, such as \r or \n, will be output literally as specified rather than having any special meaning.
Update:
Another difference is that PHP only substitutes variables inside double-quoted strings (and heredoc). Therefore you can consider processing of single-quoted strings to be faster in general (but maybe not measurably faster).
Btw you don't necessarily need to use str_ireplace as &, + and \n have no upper or lower case version. There is just one version, so str_replace would be enough.
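A minimal sketch of the difference, assuming a made-up $response whose fields are separated by real line feeds (the field names are only illustrative). It shows that '\n' is two characters while "\n" is one, and that the double-quoted version produces a string parse_str() can read:

```php
<?php
// Hypothetical response body: key=value pairs separated by real line feeds.
$response = "id=1\nname=name\ngender=gender";

var_dump(strlen('\n')); // int(2): a backslash followed by the letter n
var_dump(strlen("\n")); // int(1): a single line feed character

// Strip any existing & and +, then turn each line feed into &.
$query = str_replace(array('&', '+', "\n"), array('', '', '&'), $response);
echo $query, "\n"; // id=1&name=name&gender=gender

// Parse the query-string-like result into an array.
parse_str($query, $fields);
print_r($fields);
```

str_replace is used here instead of str_ireplace because none of the search characters have a case.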
I'm using PHP 7.1.11
As mentioned in the PHP manual
To specify a literal single quote in a string which is already
enclosed in a pair of single quotes, escape it with a backslash (\).
To specify a literal backslash, double it (\\). All other instances of
backslash will be treated as a literal backslash: this means that the
other escape sequences you might be used to, such as \r or \n, will be
output literally as specified rather than having any special meaning.
I don't understand the above paragraph, which has left me with the following doubts:
Can only single quotes be specified in a string(using the escape sequence character \') already enclosed in single quotes?
Consider the below sentence
To specify a literal backslash, double it (\\).
Actually, I can specify a literal backslash simply by writing a single \ in a string that is already enclosed in single quotes, so why does the manual say that I have to use a double backslash (\\) to specify it?
Now consider the below sentence
All other instances of
backslash will be treated as a literal backslash: this means that the
other escape sequences you might be used to, such as \r or \n, will be
output literally as specified rather than having any special meaning.
Does this mean that no escape sequence other than the single quote (\') can be used in a single-quoted string? Will sequences like \r, \t and \n be printed as is, like plain text, in a browser?
Someone please clarify all of my above doubts.
Thanks.
Regarding the first paragraph: it is about escaping a backslash that comes right before an escaped single quote:
echo '\\\''; // Output: \'
All other backslashes are output as is, so there is no way to write a newline character with an escape sequence in a single-quoted string:
echo '\n\ \''; // Output: \n\ '
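As a small illustration of why the manual still tells you to double the backslash, here is a sketch with made-up values. A lone backslash usually works, but directly before the closing quote (or before another backslash) it has to be doubled:

```php
<?php
$a = 'C:\temp';   // lone backslash before "t": kept literally
$b = 'C:\\temp';  // doubled backslash: collapses to a single backslash
var_dump($a === $b);      // bool(true)

// A trailing backslash has to be doubled; 'C:\' would escape the closing quote.
$c = 'C:\\';
var_dump(strlen($c));     // int(3)

// \' is the only other sequence that single quotes recognise.
$d = 'It\'s';
var_dump($d);             // string(4) "It's"

// \n stays as backslash + n; it is not turned into a line feed.
var_dump('\n' === "\\n"); // bool(true)
```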
To match a literal backslash, many people and the PHP manual say: Always triple escape it, like this \\\\
Note:
Single and double quoted PHP strings have special meaning of backslash. Thus if \ has to be matched with a regular expression \\, then "\\\\" or '\\\\' must be used in PHP code.
Here is an example string: \test
$test = "\\test"; // outputs \test;
// WON'T WORK: pattern in double-quotes double-escaped backslash
#echo preg_replace("~\\\t~", '', $test); #output -> \test
// WORKS: pattern in double-quotes with triple-escaped backslash
#echo preg_replace("~\\\\t~", '', $test); #output -> est
// WORKS: pattern in single-quotes with double-escaped backslash
#echo preg_replace('~\\\t~', '', $test); #output -> est
// WORKS: pattern in double-quotes with double-escaped backslash inside a character class
#echo preg_replace("~[\\\]t~", '', $test); #output -> est
// WORKS: pattern in single-quotes with double-escaped backslash inside a character class
#echo preg_replace('~[\\\]t~', '', $test); #output -> est
Conclusion:
If the pattern is single-quoted, a backslash has to be double-escaped \\\ to match a literal \
If the pattern is double-quoted, it depends on whether
the backslash is inside a character class, where it must be at least double-escaped \\\
outside a character class, where it has to be triple-escaped \\\\
Who can show me a case where a double-escaped backslash in a single-quoted pattern, e.g. '~\\\~', would match anything different from a triple-escaped backslash in a double-quoted pattern, e.g. "~\\\\~", or would fail?
When/why/in what scenario would it be wrong to use a double-escaped \ in a single-quoted pattern e.g. '~\\\~' for matching a literal backslash?
If there's no answer to this question, I would continue to always use a double-escaped backslash \\\ in a single-quoted PHP regex pattern to match a literal \ because there's possibly nothing wrong with it.
A backslash character (\) is treated as an escape character by both PHP's parser and the regular expression engine (PCRE). If you write a single backslash, the PHP parser may treat it as an escape character. If you write two backslashes, the PHP parser interprets them as one literal backslash, but the regular expression engine then picks that backslash up as an escape character. To avoid this, you generally need to write four backslashes; in some cases three also work, depending on how you quote the pattern.
To understand the difference between the two types of quoting patterns, consider the following two var_dump() statements:
var_dump('~\\\~');
var_dump("~\\\\~");
Output:
string(4) "~\\~"
string(4) "~\\~"
The escape sequence \~ has no special meaning in PHP when it's used in a single-quoted string. Three backslashes do also work because the PHP parser doesn't know about the escape sequence \~. So \\ will become \ but \~ will remain as \~.
Which one should you use:
For clarity, I'd always use ~\\\\~ when I want to match a literal backslash. The other one works too, but I think ~\\\\~ is clearer.
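A short sketch confirming that the two spellings end up as the same pattern and match a literal backslash; the subject string 'a\b' is just an example value:

```php
<?php
$single = '~\\\~';   // three backslashes, single quotes
$double = "~\\\\~";  // four backslashes, double quotes

// Both literals produce the same four-character string: ~\\~
var_dump($single === $double);               // bool(true)

// The regex engine reads \\ as "one literal backslash".
var_dump(preg_match($single, 'a\b'));        // int(1)
var_dump(preg_replace($double, '/', 'a\b')); // string(3) "a/b"
```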
There is no difference between the actual escaping of the backslash in either single or double quoted strings in PHP, as long as you do it correctly. The reason why you're getting a WON'T WORK on your first example is, as pointed out in the comments, that it expands \t to the tab character.
When you're using just three backslashes, the last one in your single quoted string will be interpreted as \~, which as far as single quoted strings go, will be left as it is (since it does not match a valid escape sequence). It is however just a coincidence that this will be parsed as you expect in this case, and not have some sort of side effect (e.g. \\\' would not behave the same way).
The reason for all the escaping is that the regular expression also needs backslashes escaped in certain situations, as they have special meaning there as well. This leads to the large number of backslashes after each other, such as \\\\ (which takes eight backslashes for the markdown parser, as it yet again adds another level of escaping).
Hopefully that clears it up, as you seem to be confused regarding the handling of backslashes in single/double quoted strings more than the behaviour in the regular expression itself (which will be the same regardless of " or ', as long as you escape things correctly).
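To make the two layers visible, the following sketch runs the four spellings from the question against the same subject and records whether each matches. The labels are only for display; the interesting row is the double-quoted pattern with three backslashes, where PHP turns \t into a tab before the regex engine ever sees it:

```php
<?php
$subject = '\test'; // a literal backslash followed by "test"

$patterns = array(
    'single quotes, 3 backslashes' => '~\\\t~',
    'single quotes, 4 backslashes' => '~\\\\t~',
    'double quotes, 3 backslashes' => "~\\\t~",   // becomes ~ \ TAB ~
    'double quotes, 4 backslashes' => "~\\\\t~",
);

$results = array();
foreach ($patterns as $label => $pattern) {
    // strlen() shows what PHP handed to the regex engine: 5 characters, or 4 with the tab.
    $results[$label] = preg_match($pattern, $subject);
    printf("%s: pattern length %d, matched %d\n", $label, strlen($pattern), $results[$label]);
}
```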
What's the regular expression to find \"?
I think it's this: '/\\"/' but I need to use it on a really large dataset so need to make sure this is correct.
I need to replace it with " so my code is : $data = preg_replace('/\\"/', '"', $data)
Is that correct?
For matching backslashes you need to 'double-escape' them, so you have four \ at the end:
$data = preg_replace('/\\\\"/', '"', $data);
Why you need four backslashes: PHP parses the string \\" as \", and the regex engine interprets this as ", since in a regex you don't need to escape ". So it won't match \". \\\\" is parsed by PHP as \\", which the regex engine interprets as \".
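The two-step parsing can be checked directly. This is a sketch with a made-up $data value; echoing the pattern shows what the regex engine actually receives:

```php
<?php
$data = 'say \"hi\"'; // hypothetical input containing literal backslash + quote pairs

echo '/\\"/', "\n";   // prints /\"/  : the regex engine sees an escaped quote
echo '/\\\\"/', "\n"; // prints /\\"/ : the regex engine sees an escaped backslash, then a quote

$twoSlashes  = preg_replace('/\\"/', '"', $data);
$fourSlashes = preg_replace('/\\\\"/', '"', $data);

echo $twoSlashes, "\n";  // say \"hi\"  (only the quotes matched, so nothing visibly changes)
echo $fourSlashes, "\n"; // say "hi"
```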
A backslash does not need to be escaped in either a single-quoted string or a regular expression, unless the following character is a character that can be escaped (such as the backslash itself).
A double quote does not need to be escaped and cannot be escaped in a single-quoted string. In a regular expression it doesn't have to be either, but it can be.
That means \\ in both a single-quoted string and a regular expression becomes \, while \" in a single-quoted string remains \", while in a regular expression it becomes ".
However, in PHP you can only create a regular expression from a string, so you have to escape twice.
In other words...
Original string     String processed    Regexp processed
'/\"/'              /\"/                "
'/\\"/'             /\"/                "
'/\\\"/'            /\\"/               \"
'/\\\\"/'           /\\"/               \"
'/\\\\\"/'          /\\\"/              \"
'/\\\\\\"/'         /\\\"/              \"
'/\\\\\\\"/'        /\\\\"/             \\"
Bonus backslash
In a double-quoted string, of course, the " does need to be escaped, so...
"/\"/"              /"/                 "
"/\\"/"             syntax error
"/\\\"/"            /\"/                "
"/\\\\"/"           syntax error
"/\\\\\"/"          /\\"/               \"
"/\\\\\\"/"         syntax error
"/\\\\\\\"/"        /\\\"/              \"
"/\\\\\\\\"/"       syntax error
"/\\\\\\\\\"/"      /\\\\"/             \\"
I think you should probably go for preg_replace("/\\\\\\\"/", "\"", $data) just to be on the safe (or rather, confusing) side.
As long as you mean the literal string \", matching for those characters in a regular expression requires:
\\"
So, you'd use /\\\\"/ as the pattern parameter in a preg_* function.
(You only need to escape the backslash. Since PHP treats the backslash as a special character in both single- and double-quoted strings, you need to escape it twice.)
Is this all you need to match? If so, I'd recommend just using str_replace():
$string = str_replace('\\"', '"', $string);
For a simple search/replace of literal characters like this, an iterative string function like str_replace() will be faster than a regular expression.
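A quick sketch, using a made-up input string, showing that both approaches give the same result for this literal replacement. It makes no timing claim; it only checks equivalence:

```php
<?php
$string = 'He said \"yes\" and left'; // hypothetical input

// Regex version: four backslashes in the PHP literal, \\" for the regex engine.
$viaRegex = preg_replace('/\\\\"/', '"', $string);

// Plain string version: only PHP's own escaping applies, so two backslashes suffice.
$viaString = str_replace('\\"', '"', $string);

var_dump($viaRegex === $viaString); // bool(true)
echo $viaString, "\n";              // He said "yes" and left
```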
This one is correct:
preg_replace('/\\\"/', '"', $data);
http://sandbox.phpcode.eu/g/1283c.php
In PHP, backslashes have special meaning. You can therefore represent a literal backslash as either of the following: \\\ or \\\\. The alternative method is to use a character class: [\\].
Refer to the section labeled "Note" here:
http://www.php.net/manual/en/regexp.reference.escape.php
Would this not work just as well for your data?
str_replace('\\"','"',$data);
$result = preg_replace('/\\\\"/i', '"', $subject);
How can the characters "\n \t \r" be replaced with '-'?
echo preg_replace('/\s/','-','\n\t\n\r\n'); // outputs '\n\t\n\r\n', but it should be '-----'
Edit: I have dynamic content in real app like:
preg_replace('/\s/','-',$_Request['content']);
Can I fix it by adding "" around the variable?
preg_replace('/\s/','-',"$_Request['content']");
Edit2:
How can a string be converted from the format 'str' to the format "str"?
Thanks
Well, two things. First, the problem is the single quotes around your subject string. Escape sequences (\n, \t, \r, etc.) are not processed inside single quotes.
Second, don't use a regex for this. There's no need for the complexity of a regex.
Either use str_replace:
echo str_replace(array("\r", "\n", "\t", "\v"), '-', "\r\n\t\r\v\n\t");
Or strtr:
echo strtr("\r\n\t\r\v\n\t", "\r\n\t\v", '----');
Edit: Ahh, now I see what you're getting at. You have a string with a literal \r\n\t\r\v\n\t in it, and want to replace those sequences. Well, you can do that via regex:
$regex = '/(\s|\\\\[rntv]{1})/';
$string = preg_replace($regex, '-', $_GET['content']);
Basically, it matches any space character, and any literal \ followed by either r, n, t or v...
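A sketch of that pattern applied to a hypothetical $content value that mixes real whitespace with literal backslash sequences, standing in for the request data ({1} is dropped here because it is redundant):

```php
<?php
// Literal \n and \r (single quotes), a real tab (double quotes) and a real space.
$content = 'a\nb' . "\t" . 'c\rd e';

// \s matches real whitespace; \\\\ is one literal backslash for the regex engine.
$regex = '/(\s|\\\\[rntv])/';

$result = preg_replace($regex, '-', $content);
echo $result, "\n"; // a-b-c-d-e
```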
If you are looking to replace the actual whitespace characters, you need to enclose the input string in double quotes (") so PHP converts the escape sequences for you:
echo preg_replace('/\s/', '-', "\n\t\n\r\n");
Else if the escape sequences occur literally (i.e. you see \n\t\n\r\n instead of line feed, tab, line feed, carriage return, line feed), you need to replace by the following character class (and keep single quotes (') on the input string):
echo preg_replace('/\\\\[rnt]/', '-', '\n\t\n\r\n');
You ought to be passing content through $_POST instead of $_GET; I don't know how PHP handles tabs, newlines and returns in GET variables.
You are using 's instead of "s. You should change your code to:
echo preg_replace('/\s/','-',"\n\t\n\r\n");
See here: single-quoted and double-quoted.
http://www.php.net/manual/en/language.types.string.php
There's also a string method for that:
echo strtr($str, "\r\n\t\v ", "-----");
If you want to replace line breaks but retain spaces, then remove the trailing space from the second argument and the fifth - from the third.
Since you seemingly want literal \r and \n converted, you need to use a map (or even a regex) like:
echo strtr($str, array('\\r'=>"\r", '\\n'=>"\n", '\t'=>"\t", ' '=>"␣"));
// single quoted strings escaped twice for illustration
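For the "convert 'str' to "str"" part of the question, one possible sketch is a strtr() map from the literal two-character sequences to the real control characters. The $raw value and the choice of which sequences to map are assumptions for illustration:

```php
<?php
$raw = 'line1\nline2\tend'; // literal backslash sequences, as if typed by a user

// Keys are single-quoted (backslash + letter); values are double-quoted (real characters).
$map = array('\r' => "\r", '\n' => "\n", '\t' => "\t");
$converted = strtr($raw, $map);

var_dump(strlen($raw));       // int(17)
var_dump(strlen($converted)); // int(15): each two-character sequence became one character

// Now \s can see real whitespace.
echo preg_replace('/\s/', '-', $converted), "\n"; // line1-line2-end
```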
Try:
echo preg_replace('/\s/','-',"\n\t\n\r\n");
Note the double quotes on the string.
If you enclose a string with single quotes, special characters lose their special meaning:
echo preg_replace('/\s/','-',"\n\t\n\r\n");
//remove line breaks
function safeEmail($string) {
    return preg_replace( '((?:\n|\r|\t|%0A|%0D|%08|%09)+)i' , '', $string );
}
/*** example usage 1 ***/
$from = 'HTML Email\r\t\n';
/*** example usage 2 ***/
$from = "HTML Email\r\t\n";
if(strlen($from) < 100)
{
    $from = safeEmail($from);
    echo $from;
}
Example 1 returns HTML Email\r\t\n while
example 2 returns HTML Email
What's with the quotes?
As per the PHP Documentation
Unlike the double-quoted and heredoc syntaxes, variables and escape sequences for special characters will not be expanded when they occur in single quoted strings.
In other words, double quoted strings expand variables and escape sequences for special characters. Single quoted strings don't.
So in example1, with the single quoted string, the string is exactly as you see it. Slashes and all.
But in example2, rather than ending with the string \r\t\n, it ends with a carriage return, a tab and then a new line. In other words the escape sequences for special characters are expanded.
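The two cases can be compared side by side. This sketch reuses the safeEmail() function from the question unchanged and only adds var_dump() calls to show the string lengths:

```php
<?php
// Function from the question: strips line feeds, carriage returns, tabs
// and a few URL-encoded control characters.
function safeEmail($string) {
    return preg_replace('((?:\n|\r|\t|%0A|%0D|%08|%09)+)i', '', $string);
}

// Single quotes: the input holds backslash + letter pairs, which the regex does not target.
$single = safeEmail('HTML Email\r\t\n');
var_dump($single); // string(16) "HTML Email\r\t\n"

// Double quotes: the input holds real control characters, which are removed.
$double = safeEmail("HTML Email\r\t\n");
var_dump($double); // string(10) "HTML Email"
```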
With single quotes in PHP, special characters such as \n, \r and \t don't work as expected.
According to the docs:
To specify a literal single quote, escape it with a backslash (\). To specify a literal
backslash, double it (\\). All other instances of backslash will be treated as a literal
backslash: this means that the other escape sequences you might be used to, such as \r or
\n, will be output literally as specified rather than having any special meaning.