I'm using php to look at an XML file that has a URL in it. The URLs look something like this:
https://site.com/bacon_report?Id=1&report=1&currentDimension=2&param=1
When I echo out the URLs, the "&curren" shows up as "¤" (AKA #164, A4 or currency symbol) and the links don't work. This happens even though there isn't a closing semicolon for it. What is the cleanest way to make "&curren" display literally?
Funny enough I ran into the same problem just now and I found this answer. However, I found another solution which might even be better!
Simply put the variable at the beginning of your query string, and you will avoid the "&curren" completely.
Do:
https://site.com/bacon_report?currentDimension=2&Id=1&report=1&param=1
instead of:
https://site.com/bacon_report?Id=1&report=1&currentDimension=2&param=1
Use the php function urlencode:
urlencode("https://site.com/bacon_report?Id=1&report=1&currentDimension=2&param=1");
will output
https%3A%2F%2Fsite.com%2Fbacon_report%3FId%3D1%26report%3D1%26currentDimension%3D2%26param%3D1
The problem here is escaping - you need to escape the "&" characters. In XML all special characters like <, >, ', " and & should be escaped.
Escape it properly as
https://example.com/bacon_report?Id=1&amp;report=1&amp;currentDimension=2&amp;param=1
..just like in HTML:
WRONG - no escaping
CORRECT - correct escape sequence
So - the cleanest way to show "&curren" in HTML/XML is to properly escape the ampersand, and render it as "&amp;curren".
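As a minimal sketch of that escaping in PHP, assuming the URL has already been read out of the XML into a plain string (the variable names here are illustrative, not from the question), htmlspecialchars() turns each "&" into "&amp;" before the value is echoed into markup:

```php
<?php
// Illustrative URL based on the one in the question.
$url = 'https://example.com/bacon_report?Id=1&report=1&currentDimension=2&param=1';

// htmlspecialchars() escapes &, <, > and (with ENT_QUOTES) both quote characters.
$escaped = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');

// The browser decodes &amp; back to & and no longer sees a "&curren" entity.
echo '<a href="' . $escaped . '">' . $escaped . '</a>', "\n";
```

The escaped form is only for output into HTML or XML; the unescaped $url is still the value to use for redirects or HTTP requests.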
I think that in this case it is best to use htmlentities because with urlencode you get
https%3A%2F%2Fexample.com%2Fbacon_report%3FId%3D1%26report%3D1%26currentDimension%3D2%26param%3D1
and when applying urldecode, you will still have the ¤ symbol
whereas with htmlentities the URL comes out clean when rendered:
https://example.com/bacon_report?Id=1&report=1&currentDimension=2&param=1
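A rough comparison of the two approaches, as a sketch with illustrative variable names: a urlencode()/urldecode() round trip hands back the original string with its raw ampersands, while htmlentities() produces markup-safe output.

```php
<?php
$url = 'https://example.com/bacon_report?Id=1&report=1&currentDimension=2&param=1';

// Encoding and decoding again returns the original string, raw "&" included,
// so the browser would still interpret "&curren" when it is echoed into HTML.
$roundTrip = urldecode(urlencode($url));
var_dump($roundTrip === $url);

// htmlentities() rewrites each "&" as "&amp;", which the browser renders as a literal "&".
$entities = htmlentities($url, ENT_QUOTES, 'UTF-8');
echo $entities, "\n";
```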
I came across this issue while working on technical documentation (in Markdown which gets converted to HTML).
To solve the issue I used a zero-width space character which I copied and pasted from between these brackets (). That way it appears that there is no space and can include the below without any issues:
/search?query=1&currentLonLat=-74.600291,40.360869
My code is not working, and I don't want to use str_replace, because there may be more than 3 slashes to replace. How can I do the job using preg_replace?
My code looks like this:
<?php
$str='<li>
<span class=\"highlight\">Color</span>
Can\\\'t find the exact color shown on the model pictures? Just leave a message (eg: color as shown in the first picture...) when you place order.
Please note that colors on your computer monitor may differ slightly from actual product colors depending on your monitor settings.
</li>';
$str=preg_replace("#\\+#","\\",$str);
echo $str;
There is merit in the other answers, but to me it looks like what you're actually trying to accomplish is something very different. In the PHP code, \\\' is not three slashes followed by an apostrophe; it's one escaped slash followed by an escaped apostrophe, and in the rendered output that's exactly what you see: a slash followed by an apostrophe (with no need to escape them in the rendered HTML). It's important to realize that the escape character is not actually part of the string; it's merely a way to represent a character that normally has a very different meaning within PHP. In this case, an apostrophe normally terminates a string literal. What looks like 4 characters in PHP is actually only 2 characters in the string.
If this is the extent of your code, there's no need for string manipulation or regular expressions. What you actually need is just this:
<?php
$str='<li>
<span class="highlight">Color</span>
Can\'t find the exact color shown on the model pictures? Just leave a message (eg: color as shown in the first picture...) when you place order.
Please note that colors on your computer monitor may differ slightly from actual product colors depending on your monitor settings.
</li>';
echo $str;
?>
Only one escape character is needed here for the apostrophe, and in the rendered HTML you will see no slashes at all.
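To check the character counts yourself, here is a small sketch (the variable names are illustrative) that compares the two single-quoted literals with strlen():

```php
<?php
// Source text \\\' is an escaped backslash plus an escaped apostrophe:
// the resulting string contains one backslash and one apostrophe.
$withSlash = 'Can\\\'t';

// Source text \' is just an escaped apostrophe: no backslash in the string.
$withoutSlash = 'Can\'t';

echo $withSlash, ' has ', strlen($withSlash), " characters\n";
echo $withoutSlash, ' has ', strlen($withoutSlash), " characters\n";
```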
Further Reading:
Escape sequences
The root of this problem is actually in how the text was written into your database, and it is likely caused by magic_quotes_gpc; this setting was used in older PHP versions and was a really bad idea.
The best fix
This requires a few steps:
Fix the script that puts the HTML inside your database by disabling magic_quotes_gpc.
Write a script that reads all existing database entries, applies stripslashes() and saves the changes.
Fix the presentation part (though that may need no changes at all).
Alternative patch
Use stripslashes() before you present the HTML.
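A sketch of the alternative patch, assuming a value that was slash-escaped once before being stored ($stored is a made-up sample, not real data from the question):

```php
<?php
// Sample value as it might come back from the database: every quote
// carries one extra backslash that was added when it was saved.
$stored = 'Can\\\'t find <span class=\"highlight\">Color</span>';

// stripslashes() removes one level of backslash escaping.
$clean = stripslashes($stored);

echo $clean, "\n";
```

If the data was escaped more than once, a single call removes only one level, which is why fixing the stored data is the better long-term option.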
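Use this pattern (a literal backslash has to be escaped once for the PHP string and once more for the regex, hence four backslashes):
preg_replace('#\\\\+#', '\\\\', $text);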
This replaces two or more \ symbols preceding an ' symbol with \'
$theConvertedString = preg_replace("/\\\\{2,}'/", "\\\\'", $theSourceString);
Ideally, you shouldn't have code causing this issue in the first place, so I would look at why you have \\' in your code to begin with. If you've manually put it in your variables, take it out. Often, this also happens with multiple calls to addslashes() or mysql_real_escape_string(), or when a cheap hosting provider's automatic escaping of all POST request variables is combined with server-side PHP code that does the same.
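The layers of escaping are easy to get wrong, so here is a hedged sketch of how a run of backslashes can be collapsed or removed with preg_replace(). The sample input is invented for illustration:

```php
<?php
// Six source backslashes make three real ones; \' adds the apostrophe.
// String value: Can + three backslashes + 't
$input = 'Can\\\\\\\'t';

// In a single-quoted literal, '\\\\' is two characters (\\), which the
// regex engine reads as one literal backslash; "+" then matches the whole run.
$collapsed = preg_replace('/\\\\+/', '\\\\', $input); // keep a single backslash
$removed   = preg_replace('/\\\\+/', '', $input);     // drop the backslashes entirely

echo $collapsed, "\n";
echo $removed, "\n";
```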
It's a pretty silly question, sorry. There is a big and rather complex system that has a bug and I managed to track it down to this piece
return str_replace('%2F', '/', rawurlencode(str_replace('%20', ' ', $key)));
There is a comment explaining why slashes are replaced - to preserve path structure, e.g. encoded1/encoded2/etc. However there is no explanation whatsoever why %20 is replaced with space and that part is the direct cause of a bug. I am tempted to just remove str_replace() but it looks like it was placed there for some reason and I have a feeling that I'll break something else by doing this. Has anyone encountered anything similar? Perhaps it's a dirty fix for some PHP bug? Any guesses and insights are highly appreciated!
Doing so would prevent %20 (an encoded space) from being double encoded to %2520. However, it only guards against double-encoded spaces; other special characters would still get double encoded.
This is a sign of bad code; strings that are passed into this function shouldn't be allowed to have encoded characters in the first place.
I would recommend creating unit tests that cover all referencing code and then refactor this function to remove the str_replace() to make sure it doesn't break the tests.
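As a starting point for such tests, here is a sketch that wraps the expression from the question in a function (the name encodeKey and the sample keys are assumptions) and records what the current code does with a few inputs:

```php
<?php
// Hypothetical wrapper around the expression quoted in the question.
function encodeKey(string $key): string
{
    return str_replace('%2F', '/', rawurlencode(str_replace('%20', ' ', $key)));
}

// Characterization cases: input => what the current code returns.
$cases = [
    'dir/file name'   => 'dir/file%20name', // plain input is encoded once
    'dir/file%20name' => 'dir/file%20name', // a pre-encoded space is not double encoded
    'a%26b'           => 'a%2526b',         // any other pre-encoded character is
];

foreach ($cases as $input => $expected) {
    $actual = encodeKey($input);
    echo $input, ' => ', $actual, $actual === $expected ? " (ok)\n" : " (unexpected)\n";
}
```

Pinning down the current behaviour like this makes it visible which callers rely on the %20 special case before the inner str_replace() is removed.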
First thing that jumps to mind is as a mitigation technique against double encoding.
Not that I would recommend doing such a thing this way, as it would get real messy real quickly (and one would already wonder why only that entity, perhaps 'they' never experienced issues with any others... yet).
It could be the result of a misunderstanding of rawurlencode() vs urlencode()
urlencode() replaces spaces with + signs
If the original author thought that rawurlencode() did the same thing, they would be attempting to pre-encode the spaces so they don't get turned into +s
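The difference between the two functions can be seen in a short sketch (the sample key is made up):

```php
<?php
$key = 'my file+name';

// urlencode() follows form encoding: a space becomes "+".
$form = urlencode($key);

// rawurlencode() follows RFC 3986: a space becomes "%20".
$raw = rawurlencode($key);

echo $form, "\n";
echo $raw, "\n";
```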
I need to replace characters in a string with their HTML coding.
Ex. The "quick" brown fox, jumps over the lazy (dog).
I need to replace the quotation marks with &quot; and the brackets with &#40; and &#41;.
I have tried str_replace, but I can only get 1 character to be replaced. Is there a way to replace multiple characters using str_replace? Or is there a better way to do this?
Thanks!
I suggest using the function htmlentities().
Have a look at the Manual.
PHP has a number of functions to deal with this sort of thing:
Firstly, htmlentities() and htmlspecialchars().
But as you already found out, they won't deal with ( and ) characters, because these are not characters that ever need to be rendered as entities in HTML. I guess the question is why you want to convert these specific characters to entities? I can't really see a good reason for doing it.
If you really do need to do it, str_replace() will do multiple string replacements, using arrays in both the search and replace parameters:
$output = str_replace(array('(',')'), array('&#40;','&#41;'), $input);
You can also use the strtr() function in a similar way:
$conversions = array('('=>'&#40;', ')'=>'&#41;');
$output = strtr($input, $conversions);
Either of these would do the trick for you. Again, I don't know why you'd want to though, because there's nothing special about ( and ) brackets in this context.
While you're looking into the above, you might also want to look up get_html_translation_table(), which returns an array of entity conversions as used in htmlentities() or htmlspecialchars(), in a format suitable for use with strtr(). You could load that array and add the extra characters to it before running the conversion; this would allow you to convert all the normal entity characters at the same time.
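A sketch of that approach, assuming the htmlspecialchars() table is enough for the regular conversions (pass HTML_ENTITIES instead for the full htmlentities() set):

```php
<?php
// Start from the table htmlspecialchars() uses: &, <, >, " and (with ENT_QUOTES) '.
$table = get_html_translation_table(HTML_SPECIALCHARS, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// Add the extra characters the question asks for.
$table['('] = '&#40;';
$table[')'] = '&#41;';

$input  = 'The "quick" brown fox, jumps over the lazy (dog).';
$output = strtr($input, $table);

echo $output, "\n";
```

Because strtr() with an array makes a single pass over the input, the "&" inside an entity it has just produced is not escaped a second time.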
I would point out that if you serve your page with the UTF-8 character set, you won't need to convert any characters to entities (except for the HTML reserved characters <, > and &). This may be an alternative solution for you.
You also asked in a separate comment about converting line feeds. These can be converted with PHP's nl2br() function, but could also be done using str_replace() or strtr(), so could be added to a conversion array with everything else.
I'm having a lot of difficulty matching an image url with spaces.
I need to make this
http://site.com/site.com/files/images/img 2 (5).jpg
into a div like this:
.replace(/(http:\/\/([^\s]+\.(jpg|png|gif)))/ig, "<div style=\"background: url($1)\"></div>")
Here's the thread about that:
regex matching image url with spaces
Now I've decided to first make the spaces into entities so that the above regex will work.
But I'm really having a lot of difficulty doing so.
Something like this:
.replace(/http:\/\/(.*)\/([^\<\>?:;]*?) ([^\<\>?:;]*)(\.(jpe?g|png|gif))/ig, "http://$1/$2%20$3$4")
Replaces one space, but all the rest are still spaces.
I need to write a regex that says, make all spaces between http:// and an image extension (png|jpg|gif) into %20.
At this point, frankly not sure if it's even possible. Any help is appreciated, thanks.
Trying Paolo's escape:
.escape(/http:\/\/(.*)\/([^\<\>?:;]*?) ([^\<\>?:;]*)(\.(jpe?g|png|gif))/)
Another way I can do this is to escape serverside in PHP, and in PHP I can directly mess with the file name without having to match it in regex.
But as far as I know something like htmlentities do not apply to spaces. Any hints in this direction would be great as well.
Try the escape function:
>>> escape("test you");
test%20you
If you want to control the replacement character but don't want to use a regular expression, a simple...
$destName = str_replace(' ', '-', $sourceName);
..would probably be the more efficient solution.
Let's say you have the string variable urlWithSpaces, which is set to a URL that contains spaces.
Simply go:
urlWithoutSpaces = escape(urlWithSpaces);
What about urlencode() - that may do what you want.
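One caveat worth checking before using urlencode() on the PHP side: it turns spaces into "+", which is not what a file path needs. The sketch below (variable names are illustrative) encodes only the file name part of the URL from the question and compares the candidates:

```php
<?php
$url = 'http://site.com/site.com/files/images/img 2 (5).jpg';

// Split off the file name so that "http://" and the path separators stay intact.
$pos  = strrpos($url, '/');
$base = substr($url, 0, $pos + 1);
$file = substr($url, $pos + 1);

$withUrlencode    = $base . urlencode($file);            // spaces become "+"
$withRawurlencode = $base . rawurlencode($file);         // spaces become "%20", brackets are encoded too
$spacesOnly       = $base . str_replace(' ', '%20', $file); // only the spaces change

echo $withUrlencode, "\n", $withRawurlencode, "\n", $spacesOnly, "\n";
```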
On the JS side you should be using encodeURI(), with escape() only as a fallback. The reason to use encodeURI() is that it uses UTF-8 for encoding, while escape() uses ISO Latin. The same problem applies to decoding.
var encode = (typeof encodeURI === 'function') ? encodeURI : escape;
alert(encode('image name.png'));
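For the original goal of converting every space between http:// and the image extension, a server-side PHP sketch could use preg_replace_callback(). The pattern and sample text are assumptions, loosely adapted from the regex in the question. The match stops at the first image extension it finds, so text with several unrelated URLs would need a stricter pattern.

```php
<?php
$html = 'Photo: http://site.com/site.com/files/images/img 2 (5).jpg and more text';

// Match from "http://" up to the first .jpg/.jpeg/.png/.gif, spaces allowed in between.
$pattern = '#http://[^<>?:;]*?\.(?:jpe?g|png|gif)#i';

// Replace every space inside each matched URL, not just the first one.
$result = preg_replace_callback($pattern, function (array $m): string {
    return str_replace(' ', '%20', $m[0]);
}, $html);

echo $result, "\n";
```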