I'm using php to look at an XML file that has a URL in it. The URLs look something like this:
https://site.com/bacon_report?Id=1&report=1¤tDimension=2¶m=1
When I echo out the URLs, the "¤" shows up as "¤" (AKA #164, A4 or currency symbol) and the links don't work. This happens even though there isn't a closing semicolon for it. What is the cleanest way to make "¤" display literally?
Funny enough I ran into the same problem just now and I found this answer. However, I found another solution which might even be better!
Simply put the variable at the beginning of your query string, and you will avoid the ¤ completely.
Do:
https://site.com/bacon_report?currentDimension=2&Id=1&report=1¶m=1
instead of:
https://site.com/bacon_report?Id=1&report=1¤tDimension=2¶m=1
Use the php function urlencode:
urlencode("https://site.com/bacon_report?Id=1&report=1¤tDimension=2¶m=1"
will output
https%3A%2F%2Fsite.com%2Fbacon_report%3FId%3D1%26report%3D1%26currentDimension%3D2%26param%3D1
The problem here is escaping - you need to escape the "&" characters. In XML all special characters like <, >, ', " and & should be escaped.
Escape it properly as
https://example.com/bacon_report?Id=1&report=1¤tDimension=2¶m=1
..just like in HTML:
WRONG - no escaping
CORRECT - correct escape sequence
So - the cleanest way to show "¤" in HTML/XML is to properly escape the ampersand, and render it as "¤".
I think that in this case it is best to use htmlentities because with urlencode you get
https%3A%2F%2Fexample.com%2Fbacon_report%3FId%3D1%26report%3D1%26currentDimension%3D2%26param%3D1
and when applying urldecode, you will still have the ¤ symbol
where as with htmlentities the url comes out clean.
https://example.com/bacon_report?Id=1&report=1¤tDimension=2¶m=1
I came across this issue while working on technical documentation (in Markdown which gets converted to HTML).
To solve the issue I used a zero-width space character which I copied and pasted from between these brackets (). That way it appears that there is no space and can include the below without any issues:
/search?query=1¤tLonLat=-74.600291,40.360869
Related
I want to create hyperlinks to pages on my site but the page address have double quotes within them?
Eg:
the above just links to mysite.com/search.php?q= as I would expect as it is written.
The API returning results allows phrase searches by placing them in double quotes.
Is there a way to escape these within the href tag?
Simple solution: use altenative quotes:
<a href='mysite.com/search.php?q="sales+manager"&l=usa'></a>
This will work fine (the browser will make sure the URL gets properly formatted when a user clicks it), but you should really be urlencoding special characters because there's a whole bunch of stuff that you're not allowed to use in URLs, and some stuff that has a different meaning (in a URL, spaces become +, for instance, so you can't drop in a + and get it to stay that once you parse it. URL magic!).
Have a look at urlencode and use that when generating the link URL server side. This will turn things like spaces into %20, double quotes into %22, etc., and is how you send literal string data from a client to a server.
Yo must encode the quotes "
mysite.com/search.php?q="sales+manager"&l=usa
Is there a way to escape these within the href tag?
Yes, with the escape character. \.
Although, the current state of your code would produce:
effectively breaking the href since you are breaking the string.
What you want, is just:
...
you are considering whether the characters need to be escaped in order to work in your HTML.
however, you should also consider whether they need to be escaped in order to be sound URLs.
to work in your HTML you may do
<a href="mysite.com/search.php?q='sales+manager'&l=usa">.
however, the ' character cannot be in a URL.
"Uniform Resource Locators may only contain the displayable characters in the standard ASCII character set. Nondisplayable characters or characters in the extended ASCII set (128 through 255) are specially encoded."
See here for a list of URL escape codes.
perhaps you want to retain the quotes in the get-request of your URL. in that case, you might want:
<a href="mysite.com/search.php?q=%22sales+manager%22&l=usa">
I'm using php to look at an XML file that has a URL in it. The URLs look something like this:
https://site.com/bacon_report?Id=1&report=1¤tDimension=2¶m=1
When I echo out the URLs, the "¤" shows up as "¤" (AKA #164, A4 or currency symbol) and the links don't work. This happens even though there isn't a closing semicolon for it. What is the cleanest way to make "¤" display literally?
Funny enough I ran into the same problem just now and I found this answer. However, I found another solution which might even be better!
Simply put the variable at the beginning of your query string, and you will avoid the ¤ completely.
Do:
https://site.com/bacon_report?currentDimension=2&Id=1&report=1¶m=1
instead of:
https://site.com/bacon_report?Id=1&report=1¤tDimension=2¶m=1
Use the php function urlencode:
urlencode("https://site.com/bacon_report?Id=1&report=1¤tDimension=2¶m=1"
will output
https%3A%2F%2Fsite.com%2Fbacon_report%3FId%3D1%26report%3D1%26currentDimension%3D2%26param%3D1
The problem here is escaping - you need to escape the "&" characters. In XML all special characters like <, >, ', " and & should be escaped.
Escape it properly as
https://example.com/bacon_report?Id=1&report=1¤tDimension=2¶m=1
..just like in HTML:
WRONG - no escaping
CORRECT - correct escape sequence
So - the cleanest way to show "¤" in HTML/XML is to properly escape the ampersand, and render it as "¤".
I think that in this case it is best to use htmlentities because with urlencode you get
https%3A%2F%2Fexample.com%2Fbacon_report%3FId%3D1%26report%3D1%26currentDimension%3D2%26param%3D1
and when applying urldecode, you will still have the ¤ symbol
where as with htmlentities the url comes out clean.
https://example.com/bacon_report?Id=1&report=1¤tDimension=2¶m=1
I came across this issue while working on technical documentation (in Markdown which gets converted to HTML).
To solve the issue I used a zero-width space character which I copied and pasted from between these brackets (). That way it appears that there is no space and can include the below without any issues:
/search?query=1¤tLonLat=-74.600291,40.360869
my code is not working ? and i dont want to use str_replace , for there maybe more slashes than 3 to be replaced. how can i do the job using preg_replace?
my code here like this:
<?php
$str='<li>
<span class=\"highlight\">Color</span>
Can\\\'t find the exact color shown on the model pictures? Just leave a message (eg: color as shown in the first picture...) when you place order.
Please note that colors on your computer monitor may differ slightly from actual product colors depending on your monitor settings.
</li>';
$str=preg_replace("#\\+#","\\",$str);
echo $str;
There is merit in the other answers, but to me it looks like what you're actually trying to accomplish is something very different. In the php code \\\' is not three slashes followed by an apostrophe, it's one escaped slash followed by an escaped apostrophe, and in the rendered output, that's exactly what you see—a slash followed by an apostrophe (with no need to escape them in the rendered html). It's important to realize that the escape character is not actually part of the string; it's merely a way to help you represent a character that normally has very different meaning in within php—in this case, an apostrophe normally terminates a string literal. What looks like 4 characters in php is actually only 2 characters in the string.
If this is the extent of your code, there's no need for string manipulation or regular expressions. What you actually need is just this:
<?php
$str='<li>
<span class="highlight">Color</span>
Can\'t find the exact color shown on the model pictures? Just leave a message (eg: color as shown in the first picture...) when you place order.
Please note that colors on your computer monitor may differ slightly from actual product colors depending on your monitor settings.
</li>';
echo $str;
?>
Only one escape character is needed here for the apostrophe, and in the rendered HTML you will see no slashes at all.
Further Reading:
Escape sequences
The root of this problem is actually in how it was written into your database and likely to be caused by magic_quotes_gpc; this was used in older versions and a really bad idea.
The best fix
This requires a few steps:
Fix the script that puts the HTML inside your database by disabling magic_quotes_gpc.
Write a script that reads all existing database entries, applies stripslashes() and saves the changes.
Fix the presentation part (though, that may need no changes at all.
Alternative patch
Use stripslashes() before you present the HTML.
use this pattern
preg_replace('#\\+#', '\\', $text);
This replaces two or more \ symbols preceding an ' symbol with \'
$theConvertedString = preg_replace("/\\{2,}'/", "\'", $theSourceString);
Ideally, you shouldn't have code causing this issue in the first place so I would have a look at why you have \\' in your code to begin with. If you've manually put it in your variables, take it out. Often, this also happens with multiple calls to addslashes() or mysql_real_escape_string() or a cheap hosting providers' automatic transformation of all POST request variables to escape slashes, combined with your server side PHP code to do the same.
It's a pretty silly question, sorry. There is a big and rather complex system that has a bug and I managed to track it down to this piece
return str_replace('%2F', '/', rawurlencode(str_replace('%20', ' ', $key)));
There is a comment explaining why slashes are replaced - to preserve path structure, e.g. encoded1/encoded2/etc. However there is no explanation whatsoever why %20 is replaced with space and that part is the direct cause of a bug. I am tempted to just remove str_replace() but it looks like it was placed there for some reason and I have a feeling that I'll break something else by doing this. Has anyone encountered anything similar? Perhaps it's a dirty fix for some PHP bug? Any guesses and insights are highly appreciated!
Doing so would prevent %20 (encoded space) from being encoded to %2F20. However, it only serves to prevent double escaped spaces; other special characters would still get double encoded.
This is a sign of bad code; strings that are passed into this function shouldn't be allowed to have encoded characters in the first place.
I would recommend creating unit tests that cover all referencing code and then refactor this function to remove the str_replace() to make sure it doesn't break the tests.
First thing that jumps to mind is as a mitigation technique against double encoding.
Not that I would recommend doing such a thing this way, as it would get real messy real quickly (and one would already wonder why only that entity, perhaps 'they' never experienced issues with any others... yet).
It could be the result of a misunderstanding of rawurlencode() vs urlencode()
urlencode() replaces spaces with + signs
If the original author thought that rawurlencode() did the same thing, they would be attempting to pre-encode the spaces so they don't get turned into +s
I'm having a lot of difficulty matching an image url with spaces.
I need to make this
http://site.com/site.com/files/images/img 2 (5).jpg
into a div like this:
.replace(/(http:\/\/([^\s]+\.(jpg|png|gif)))/ig, "<div style=\"background: url($1)\"></div>")
Here's the thread about that:
regex matching image url with spaces
Now I've decided to first make the spaces into entities so that the above regex will work.
But I'm really having a lot of difficulty doing so.
Something like this:
.replace(/http:\/\/(.*)\/([^\<\>?:;]*?) ([^\<\>?:;]*)(\.(jpe?g|png|gif))/ig, "http://$1/$2%20$3$4")
Replaces one space, but all the rest are still spaces.
I need to write a regex that says, make all spaces between http:// and an image extension (png|jpg|gif) into %20.
At this point, frankly not sure if it's even possible. Any help is appreciated, thanks.
Trying Paolo's escape:
.escape(/http:\/\/(.*)\/([^\<\>?:;]*?) ([^\<\>?:;]*)(\.(jpe?g|png|gif))/)
Another way I can do this is to escape serverside in PHP, and in PHP I can directly mess with the file name without having to match it in regex.
But as far as I know something like htmlentities do not apply to spaces. Any hints in this direction would be great as well.
Try the escape function:
>>> escape("test you");
test%20you
If you want to control the replacement character but don't want to use a regular expression, a simple...
$destName = str_replace(' ', '-', $sourceName);
..would probably be the more efficient solution.
Lets say you have the string variable urlWithSpaces which is set to a URL which contains spaces.
Simply go:
urlWithoutSpaces = escape(urlWithSpaces);
What about urlencode() - that may do what you want.
On the JS side you should be using encodeURI(), and escape() only as a fallback. The reason to use encodeURI() is that it uses UTF-8 for encoding, while escape() uses ISO Latin. Same problems applies for decoding.
encodeURI = encodeURI || escape;
alert(encodeURI('image name.png'));