unknown characters %252B in url - php

I have a page with links pulled from an RSS feed. For example:
Broken link:
http://news.asiaone.com/News/Latest%252BNews/Singapore/Story/A1Story20121220-390687.html
Working link:
http://news.asiaone.com/News/Latest%2BNews/Singapore/Story/A1Story20121220-390687.html
I realised it works if I change %252B to %2B. I'm using PHP. Is there a way to detect and correct this on the fly?

The URL has been double encoded. %25 is the escape sequence for "%", so a regular %2B got escaped again to %252B.
You can urldecode() the value once to undo the extra layer, but it is better to avoid double-encoding it in the first place, if possible.

Use urldecode():
echo urldecode("http://news.asiaone.com/News/Latest%252BNews/Singapore/Story/A1Story20121220-390687.html");
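To detect and correct this on the fly, one option is to look for %25 followed by two hex digits, which is what a percent-escape looks like after it has been escaped a second time. Below is a minimal sketch, assuming the feed only ever double-encodes; the function name fixDoubleEncoding is made up for this example.

```php
<?php
// Hypothetical helper: undo one level of double percent-encoding.
function fixDoubleEncoding(string $url): string
{
    // "%25" followed by two hex digits means the "%" of an escape sequence
    // was itself escaped (e.g. %252B). Dropping the "25" restores %2B.
    $fixed = preg_replace('/%25([0-9A-Fa-f]{2})/', '%$1', $url);

    // preg_replace() returns null on error; fall back to the input in that case.
    return $fixed ?? $url;
}

$broken = 'http://news.asiaone.com/News/Latest%252BNews/Singapore/Story/A1Story20121220-390687.html';
echo fixDoubleEncoding($broken), "\n";
// http://news.asiaone.com/News/Latest%2BNews/Singapore/Story/A1Story20121220-390687.html
```

A URL that is already correctly encoded passes through unchanged. The caveat is that a URL which legitimately contains the literal text "%2B" (correctly encoded as %252B) would also be rewritten, so this is only safe when you know the source double-encodes.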

Related

Purpose of using esc_url

I don't understand why we need to use esc_url if I myself am the one who wrote the URL, like:
echo get_template_directory_uri() . '/someText';
The /someText part is hardcoded, and I know it's clean and safe because I wrote it. Under what circumstances would this be unsafe? How would bad guys do bad things if I don't use esc_url in this case? Do they hack into the server? And if they can hack into the server, surely esc_url won't stop them anyway?
I have referred to https://stackoverflow.com/a/30583251/19507498 , but it only explains how to use the function, not why we need it.
The purpose of this function is to replace spaces and special characters with their URL-encoded counterparts. For example, a space will be replaced with %20. This is needed because spaces and some other special characters, like umlauts or ß, are not allowed in URLs.
EDIT:
Furthermore, ? and & have special meanings in URLs, so they need to be encoded when they are meant literally.

PHP file_get_contents() Chinese character ERROR CODE

I use file_get_contents() to download a JSON document. There are some Chinese characters in the URL. When I print the URL it looks fine, but when I run the program, the URL I pass to the function comes out garbled. I know this because the URL points to a JSON endpoint backed by a MySQL query, and in the MySQL console I can see that the URL has turned into garbled characters. I have tried many ways of converting the URL string to UTF-8, GB2312, etc., but none of them works. I hope I can get help here, thanks.
It's very difficult to understand your question. I think I understood the first part of it:
I use file_get_contents() to download a JSON. There're some Chinese
characters in the URL, I tried to print the URL out, it's OK. But when
I ran the program, the URL I put in the function became error code.
So you are trying to access a URL containing Chinese characters using file_get_contents().
The answer to this is:
You need to encode the part of the URL containing Chinese characters using urlencode() or rawurlencode().
The main difference between urlencode() and rawurlencode() is that urlencode() converts spaces to +, while rawurlencode() converts spaces to %20.
urlencode() is used for query parameters, for example ?q=my+search+key; in every other case, use rawurlencode().
Example:
$test = 'http://www.example.com/'.rawurlencode('以怎么下载').'.html';
print_r($test);
// $html = file_get_contents($test);
// output:
// http://www.example.com/%E4%BB%A5%E6%80%8E%E4%B9%88%E4%B8%8B%E8%BD%BD.html
I hope it solves your problem.
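The split between the two functions can be sketched as follows. This is only an illustration: the helper name buildSearchUrl, the host and the sample values are assumptions, not part of the original question.

```php
<?php
// Hypothetical helper: build a URL from a raw path segment and a raw query value.
function buildSearchUrl(string $base, string $segment, string $query): string
{
    // Path segment: rawurlencode(), so a space becomes %20.
    // Query value: urlencode(), so a space becomes +.
    return $base . '/' . rawurlencode($segment) . '?q=' . urlencode($query);
}

echo buildSearchUrl('http://www.example.com', 'my file.html', 'my search key'), "\n";
// http://www.example.com/my%20file.html?q=my+search+key
```

Only the individual pieces are encoded, never the whole URL; encoding the full string would also escape the :, / and ? that give the URL its structure.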

Escaped symbol on get parameter

I have an encrypted string that contains some symbols: qwfKOEK==dwk&f
What if I need to pass this string as a parameter:
www.example.php?string=qwfKOEK==dwk&f
$_GET["string"]
But I can't get the full string back because the symbols break it up.
Is there any way to escape the symbols?
I tried html_entity_decode, but it doesn't seem to work. Is there any way to escape the symbols so that $_GET returns the original string?
A URL value needs to be URL-encoded using urlencode() or rawurlencode().
The two follow slightly different encoding standards: urlencode() encodes a space as +, as in form submissions, while rawurlencode() follows RFC 3986 and encodes it as %20. The rawurlencode() variant is generally preferred.
If you put this string into a GET parameter unencoded, it will not arrive intact: the value is cut off at the & sign, which starts a new parameter. You could instead pass it through a POST parameter, or change the encryption technique so its output avoids these characters. I hope this solves the problem. Also, html_entity_decode() does not help here, since it handles HTML entities, not URL encoding.
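As a rough illustration of the encoding approach, the sketch below builds the query string with http_build_query() and then parses it back the way PHP does when it fills $_GET. The host and script name are placeholders.

```php
<?php
$value = 'qwfKOEK==dwk&f';

// http_build_query() URL-encodes each value: "=" becomes %3D and "&" becomes %26.
$url = 'http://www.example.com/page.php?' . http_build_query(['string' => $value]);
echo $url, "\n";
// http://www.example.com/page.php?string=qwfKOEK%3D%3Ddwk%26f

// Simulate the receiving side: parse_str() decodes the query string like $_GET.
parse_str(parse_url($url, PHP_URL_QUERY), $params);
echo $params['string'], "\n";
// qwfKOEK==dwk&f
```

On the receiving page no manual decoding is needed; $_GET["string"] already holds the decoded value.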

Removing %2520 and other nonstandard characters from URL in obj c

I am getting a URL from server and trying to load the URL in webview. The issue is that the url which I am getting contains non standard characters. The URL is:
https//p-r3.test.abc.com:443%2Ftablet%2Fjsp%2Fgift%2Fipad%2Fgifter%2FgitGiftList.jsp%3FregId%3D74500002%26filterBy%3DviewAll%26pageId%3DourGifty%26sort%3Dcategory%26groupBy%3Dcategory%26view%3Dlist%26categoryId%3D%26addCat%3Dcat100540004&title=re%20-&imgurl=https%3A%2F%2Fm-r3-testy.tr.com%3A443%2Ftablet%2Fimages%2Ft_Full.jpg%3Fwid%3D300%26hei%3D300.
I need to remove sequences like %2520, %2F, %3D and other non-standard characters from the URL. Does anyone know how to remove this encoding?
Any help would be appreciated.
Thanks
%2520 is simply a double-encoded space. Encode it once and you get %20, encode it twice and you get %2520. It's not "non-standard", it's just poorly coded. In theory, there's no reason why you can't just replace %2520 with a space, but for all I know the server-side code is expecting the double-encoded string.
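The effect is easy to reproduce. Here is a small sketch in PHP (the language used elsewhere on this page, not the Objective-C of the question) showing how a space turns into %20 and then %2520:

```php
<?php
$once  = rawurlencode(' ');    // "%20"
$twice = rawurlencode($once);  // "%2520": the "%" itself was escaped to "%25"

echo $once, ' ', $twice, "\n";
// %20 %2520

// Decoding must be applied as many times as encoding was.
echo rawurldecode($twice), "\n";
// %20
var_dump(rawurldecode(rawurldecode($twice)) === ' ');
// bool(true)
```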
Found the answer. I am removing the encoding using a built-in iOS method:
abc = [def stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
and then loading abc in the web view. It is working fine.
Thanks all for the responses.
You seem to have an urlencode() too many, or an urldecode() too few, in the code processing the URL server side.
To avoid multiple encoding, remove any existing encoding first, then encode once:
_pdfUrl = [ _pdfUrl stringByRemovingPercentEncoding];
_pdfUrl = [_pdfUrl stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding];

Is it possible to output the '&' sign in an xml file?

I have a php which generates an xml file and prints it on screen.
Amongst other variables, it prints an Image link.
The problem is that if this Image link has an '&' character in it, I get an xml error because it isn't encoded properly.
So I solved it by replacing the & sign with &amp;.
At least I thought it was solved, but now the link to the image looks like this, for example:
www.domain.com/phones &amp; equipment/img1.jpg
which causes a 404 file not found.
The real path is
www.domain.com/phones & equipment/img1.jpg
So how can I solve this then?
I would prefer not to change the folder names, I simply didn't know this when I created the folders.
Thanks
If it's a URL, you might want to URL encode it instead:
www.domain.com/phones%20%26%20equipment/img1.jpg
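Try this (note that + only stands for a space inside a query string, so in a path the %20 form above is the safer choice):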
www.domain.com/phones+%26+equipment/img1.jpg
You should URL-encode the link using PHP's urlencode() function. The code for '&' is "%26".
Additionally, if you check the RFCs on URLs/URIs, you will see that the space character is not valid and shouldn't appear in a request URI. Having your URL like www.domain.com/phones-and-equipment/img1.jpg would also be more beneficial from an SEO standpoint.
ADDENDUM: For example, check page 2, section "Unsafe" of RFC 1738 and see why non-printable and non-US-ASCII characters are not safe.
Use html_entity_decode to convert &amp; back into an &
see http://php.net/manual/en/function.html-entity-decode.php
The correct representation of the ampersand in an XML file is &amp;. If that isn't working, the problem is with the code that is reading the XML file and dereferencing the URI.
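The two layers (URL encoding for the path, entity escaping for the XML) can be combined. The following is a minimal sketch, assuming the link is assembled from a host and a raw file path; the helper name xmlSafeImageUrl, the http scheme and the image element name are made up for this example.

```php
<?php
// Hypothetical helper: URL-encode a raw path, then escape the result for XML.
function xmlSafeImageUrl(string $host, string $path): string
{
    // Encode each path segment separately so the "/" separators survive.
    $segments = array_map('rawurlencode', explode('/', $path));
    $url = 'http://' . $host . '/' . implode('/', $segments);

    // Escape for XML output; any "&" still present (e.g. in a query string)
    // would become &amp; here.
    return htmlspecialchars($url, ENT_XML1 | ENT_QUOTES, 'UTF-8');
}

echo '<image>' . xmlSafeImageUrl('www.domain.com', 'phones & equipment/img1.jpg') . '</image>', "\n";
// <image>http://www.domain.com/phones%20%26%20equipment/img1.jpg</image>
```

Because the & in the folder name is already turned into %26 by the URL encoding, nothing is left for the XML escaping to change in this particular path, and the folder names on disk can stay as they are.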
