my code is not working ? and i dont want to use str_replace , for there maybe more slashes than 3 to be replaced. how can i do the job using preg_replace?
my code here like this:
<?php
$str='<li>
<span class=\"highlight\">Color</span>
Can\\\'t find the exact color shown on the model pictures? Just leave a message (eg: color as shown in the first picture...) when you place order.
Please note that colors on your computer monitor may differ slightly from actual product colors depending on your monitor settings.
</li>';
$str=preg_replace("#\\+#","\\",$str);
echo $str;
There is merit in the other answers, but to me it looks like what you're actually trying to accomplish is something very different. In the php code \\\' is not three slashes followed by an apostrophe, it's one escaped slash followed by an escaped apostrophe, and in the rendered output, that's exactly what you see—a slash followed by an apostrophe (with no need to escape them in the rendered html). It's important to realize that the escape character is not actually part of the string; it's merely a way to help you represent a character that normally has very different meaning in within php—in this case, an apostrophe normally terminates a string literal. What looks like 4 characters in php is actually only 2 characters in the string.
If this is the extent of your code, there's no need for string manipulation or regular expressions. What you actually need is just this:
<?php
$str='<li>
<span class="highlight">Color</span>
Can\'t find the exact color shown on the model pictures? Just leave a message (eg: color as shown in the first picture...) when you place order.
Please note that colors on your computer monitor may differ slightly from actual product colors depending on your monitor settings.
</li>';
echo $str;
?>
Only one escape character is needed here for the apostrophe, and in the rendered HTML you will see no slashes at all.
Further Reading:
Escape sequences
The root of this problem is actually in how it was written into your database and likely to be caused by magic_quotes_gpc; this was used in older versions and a really bad idea.
The best fix
This requires a few steps:
Fix the script that puts the HTML inside your database by disabling magic_quotes_gpc.
Write a script that reads all existing database entries, applies stripslashes() and saves the changes.
Fix the presentation part (though, that may need no changes at all.
Alternative patch
Use stripslashes() before you present the HTML.
use this pattern
preg_replace('#\\+#', '\\', $text);
This replaces two or more \ symbols preceding an ' symbol with \'
$theConvertedString = preg_replace("/\\{2,}'/", "\'", $theSourceString);
Ideally, you shouldn't have code causing this issue in the first place so I would have a look at why you have \\' in your code to begin with. If you've manually put it in your variables, take it out. Often, this also happens with multiple calls to addslashes() or mysql_real_escape_string() or a cheap hosting providers' automatic transformation of all POST request variables to escape slashes, combined with your server side PHP code to do the same.
Related
I am writing php code that generates html that contains links to documents via their DOI. The links should point to https://doi.org/ followed by the DOI of the document.
As the results is a url, I thought I could simply use php's esc_url() function like in
echo '' . esc_url('https://doi.org/' . $doi)) . '';
as this is what one is supposed to use in text nodes, attribute nodes or anywhere else. Unfortunately things apparenty aren't that easy...
The problem is that DOIs can contain all sorts of special characters that are apparently not handled correctly by esc_url(). A nice example of such a DOI is
10.1002/(SICI)1521-3978(199806)46:4/5<493::AID-PROP493>3.0.CO;2-P
which is supposed to link to
https://doi.org/10.1002/(SICI)1521-3978(199806)46:4/5<493::AID-PROP493>3.0.CO;2-P
With $doi equal to this DOI the above code however produces a link that is displayed and links to https://doi.org/10.1002/(SICI)1521-3978(199806)46:4/5493::AID-PROP4933.0.CO;2-P.
This leads me to the question: If esc_url() is obviously not one-size-fits-all no-brained solution to escaping urls, then what should I use? For this case I can get the result I want with
esc_url(htmlspecialchars('https://doi.org/' . $doi))
but is this really the right way™ of doing it? Does this have any other unwanted side effects? If not, then why does esc_url() not also escape < and >? Would esc_html() be better than htmlspecialchars()? If so, should I nest it into a esc_url()?
I am aware that there are many articles on escaping urls in php on stackoverflow, but I couldn't find one that addresses the issues of < and > signs.
I'm no PHP expert, but I do know about DOIs and SICIs can be really annoying.
URL-encoding and HTML encoding are separate things, so it makes sense to think about them separately. You must escape the angle-brackets to make correct HTML. As for the URL-escaping, you should also do this because there are other characters that might break URLs (such as the # character, which also pops up from time to time).
So I would recommend:
'https://doi.org/' . htmlspecialcharacters(urlencode($doi))
Which will give you:
Click here
Note the order of function application, and the fact that you don't want to encode the https://doi.org resolver!
To the above "dipshit decision" comment... it's certainly inconvenient. But SICIs were around before DOIs and it's one of those annoying things we've had to live with ever since!
I'm using php to look at an XML file that has a URL in it. The URLs look something like this:
https://site.com/bacon_report?Id=1&report=1¤tDimension=2¶m=1
When I echo out the URLs, the "¤" shows up as "¤" (AKA #164, A4 or currency symbol) and the links don't work. This happens even though there isn't a closing semicolon for it. What is the cleanest way to make "¤" display literally?
Funny enough I ran into the same problem just now and I found this answer. However, I found another solution which might even be better!
Simply put the variable at the beginning of your query string, and you will avoid the ¤ completely.
Do:
https://site.com/bacon_report?currentDimension=2&Id=1&report=1¶m=1
instead of:
https://site.com/bacon_report?Id=1&report=1¤tDimension=2¶m=1
Use the php function urlencode:
urlencode("https://site.com/bacon_report?Id=1&report=1¤tDimension=2¶m=1"
will output
https%3A%2F%2Fsite.com%2Fbacon_report%3FId%3D1%26report%3D1%26currentDimension%3D2%26param%3D1
The problem here is escaping - you need to escape the "&" characters. In XML all special characters like <, >, ', " and & should be escaped.
Escape it properly as
https://example.com/bacon_report?Id=1&report=1¤tDimension=2¶m=1
..just like in HTML:
WRONG - no escaping
CORRECT - correct escape sequence
So - the cleanest way to show "¤" in HTML/XML is to properly escape the ampersand, and render it as "¤".
I think that in this case it is best to use htmlentities because with urlencode you get
https%3A%2F%2Fexample.com%2Fbacon_report%3FId%3D1%26report%3D1%26currentDimension%3D2%26param%3D1
and when applying urldecode, you will still have the ¤ symbol
where as with htmlentities the url comes out clean.
https://example.com/bacon_report?Id=1&report=1¤tDimension=2¶m=1
I came across this issue while working on technical documentation (in Markdown which gets converted to HTML).
To solve the issue I used a zero-width space character which I copied and pasted from between these brackets (). That way it appears that there is no space and can include the below without any issues:
/search?query=1¤tLonLat=-74.600291,40.360869
I'm using php to look at an XML file that has a URL in it. The URLs look something like this:
https://site.com/bacon_report?Id=1&report=1¤tDimension=2¶m=1
When I echo out the URLs, the "¤" shows up as "¤" (AKA #164, A4 or currency symbol) and the links don't work. This happens even though there isn't a closing semicolon for it. What is the cleanest way to make "¤" display literally?
Funny enough I ran into the same problem just now and I found this answer. However, I found another solution which might even be better!
Simply put the variable at the beginning of your query string, and you will avoid the ¤ completely.
Do:
https://site.com/bacon_report?currentDimension=2&Id=1&report=1¶m=1
instead of:
https://site.com/bacon_report?Id=1&report=1¤tDimension=2¶m=1
Use the php function urlencode:
urlencode("https://site.com/bacon_report?Id=1&report=1¤tDimension=2¶m=1"
will output
https%3A%2F%2Fsite.com%2Fbacon_report%3FId%3D1%26report%3D1%26currentDimension%3D2%26param%3D1
The problem here is escaping - you need to escape the "&" characters. In XML all special characters like <, >, ', " and & should be escaped.
Escape it properly as
https://example.com/bacon_report?Id=1&report=1¤tDimension=2¶m=1
..just like in HTML:
WRONG - no escaping
CORRECT - correct escape sequence
So - the cleanest way to show "¤" in HTML/XML is to properly escape the ampersand, and render it as "¤".
I think that in this case it is best to use htmlentities because with urlencode you get
https%3A%2F%2Fexample.com%2Fbacon_report%3FId%3D1%26report%3D1%26currentDimension%3D2%26param%3D1
and when applying urldecode, you will still have the ¤ symbol
where as with htmlentities the url comes out clean.
https://example.com/bacon_report?Id=1&report=1¤tDimension=2¶m=1
I came across this issue while working on technical documentation (in Markdown which gets converted to HTML).
To solve the issue I used a zero-width space character which I copied and pasted from between these brackets (). That way it appears that there is no space and can include the below without any issues:
/search?query=1¤tLonLat=-74.600291,40.360869
My PHP application adds backslashes to quotes in many locations with addslashes(). Unfortunately, this adding has produced output akin to the following.
Every time I refresh my page, this string increases in number of back slashes.
don't
don\'t
don\\'t
don\\\\'t
don\\\\\\\\'t
and so on.
I want to write a function that deletes all these extra back slashes. I have tried
str_replace($text, "\\\\", "");
to no avail.
You may just use:
stripslashes($text);
$text=preg_replace('/\134+/',"\134",$text);
What happens if your actual value happens to include \\ legitimately? For instance, if your content were referring to a Windows-style share reference - \\192.168.1.1\foo. If you blindly strip out any double slashes, you'll strip out ones that are meant to be there, as well.
The "right" answer is really to not add slashes to things that don't need them. You should know whether a given value is already escaped or not (after all, you're the one controlling where every value comes from, and what it's being used for) and only add slashes when you need them.
This has been driving be crazy, but I can't seem to find an answer. We run a technical knowledge base that will sometimes include Windows samba paths for mapping to network drives.
For example: \\servername\sharename
When we include paths that have two backslashes followed by each other, they are not escaped properly when running 'addslashes'. My expected results would be "\\\\servername\\sharename", however it returns "\\servername\\sharename". Obviously, when running 'stripslashes' later on, the double backslash prefix is only a single slash. I've also tried using a str_replace("\\", "\", $variable); however it returns "\servername\sharename" when I would expect "\\servername\sharename".
So with addslashes, it ignores the first set of double-backslashes and with str_replace it changes the double-backslashes into a single, encoded backslash.
We need to run addslashes and stripslashes for database insertion; using pg_escape_string won't work in our specific case.
This is running on PHP 5.3.1 on Apache.
EDIT: Example Code
$variable = 'In the box labeled Folder type: \\servername\sharename';
echo addslashes($variable);
This returns: In the box labeled Folder type: \\servername\\sharename
EDIT: Example Code #2
$variable = 'In the box labeled Folder type: \\servername\sharename';
echo str_replace('\\', '\', $variable);
This returns: In the box labeled Folder type: \servername\sharename
I'd also like to state that using a single quotes or double-quotes does not give me different results (as you would expect). Using either or both give me the same exact results.
Does anyone have any suggestions on what I can possibly do?
I think I know where is a problem. Just try to run this one:
echo addslashes('\\servername\sharename');
And this one
echo addslashes('\\\\servername\sharename');
PHP escapes double slashes even with single quotes, because it is used to escape single quote.
Ran a test on the problem you described, and the only way I could get the behavior you desired was to couple a conditional with a regex and anticipate the double slashes at the start.
$str = '\\servername\sharename';
if(substr($str,0,1) == '\\'){
//String starts with double backslashes, let's append an escape one.
//Exclaimation used for demonstration purposes.
$str = '\\'.$str;
echo addslashes(preg_replace('#\\\\\\\\#', '!',$str ));
}
This outputs:
!servername\\sharename
While this may not be an outright answer, it does work and illustrates a difference in how the escape character is treated by these two constructs. If used, the ! could easily be replaced with the desired characters using another regex.
This is not a problem with addslashes, it is a problem with the way you are assigning the string to your variable.
$variable = 'In the box labeled Folder type: \\servername\sharename';
echo $variable;
This returns: In the box labeled Folder type: \servername\sharename
This is because the double backslash is interpreted as an escaped backslash. Use this assignment instead.
$variable = 'In the box labeled Folder type: \\\\servername\\sharename';
I've determined, with more testing, that it indeed is with how PHP is handling hard-coded strings. Since hard-coded strings are not what I'm interested in (I was just using them for testing/this example), I created a form with a single text box and a submit button. addslashes would correctly escape the POST'ed data this way.
Doing even more research, I determined that the issue I was experiencing was with how PostgreSQL accepts escaped data. Upon inserting data into a PostgreSQL database, it will remove any escape characters it is given when it actually places the data in the table. Therefore, stripslashes is not required to remove escape characters when pulling the data back out.
This problem stemmed from code migration from PHP 4.1 (with Magic Quotes on) to PHP 5.3 (with Magic Quotes deprecated). In the existing system (PHP4), I don't think we were aware that Magic Quotes were on. Therefore, all POST data was being escaped already and then we were escaping that data again with addslashes before inserting. When it got inserted into PostgreSQL, it would strip one set of slashes and leave the other, therefore requiring us to stripslashes on the way out. Now, with Magic Quotes off, we escape with addslashes but are not required to use stripslashes on the way out.
It was very hard to organize and determine exactly where the problem lay, so I know this answer is a little off to my original question. I do, however, thank everyone who contributed. Having other people sound off on their ideas always helps to make you think on avenues you may not have on your own.