php URL encoding error

I have the following line in my code:
$lib = simplexml_load_file("http://www.goodreads.com/book/title.xml?key=MYAPIKEY&title=" . $current_book,null,true);
When the request is sent, the URL looks like this:
http://www.goodreads.com/book/title.xml?key=MYAPIKEY&title=Dashing%2BThrough%2Bthe%2BSnow%2BMary%2BHiggins%2BClark%0A
When this happens, the book is not found and an error is returned.
However, if I type this into my browser:
http://www.goodreads.com/book/title.xml?key=MYAPIKEY&title=Dashing+Through+the+Snow+Mary+Higgins+Clark
then I get a valid XML response, and I can use it to complete my code.
So, how do I send this without the + getting changed to %2B?
And what is the %0A at the end of the URL?

Your URL seems to be encoded twice.
Since the code you show does no encoding at all (which, by the way, would be the proper place to do it), the value in $current_book must already have been encoded before it reaches this line.
The error is not in this code but somewhere earlier.
You could correct it like this:
$lib = simplexml_load_file("http://www.goodreads.com/book/title.xml?key=MYAPIKEY&title=" . trim(urldecode($current_book)),null,true);
However, that is only a workaround for an existing error; you should fix the earlier encoding.
Also, the %0A you mention is a URL-encoded newline character, which is the reason for the trim().
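To make the double encoding visible, here is a minimal sketch. It assumes the raw title comes from somewhere like a line read from a file (hence the trailing newline); the variable names are illustrative and not taken from the asker's code.

```php
<?php
// Hypothetical raw title, e.g. one line read from a file (note the trailing newline).
$rawTitle = "Dashing Through the Snow Mary Higgins Clark\n";

// Encode exactly once, after trimming, at the point where the URL is built.
$once = urlencode(trim($rawTitle));

// Encoding an already-encoded value turns every "+" into "%2B".
$twice = urlencode($once);

// Skipping trim() leaves the newline in place, which encodes to "%0A".
$untrimmed = urlencode($rawTitle);

$url = "http://www.goodreads.com/book/title.xml?key=MYAPIKEY&title=" . $once;

echo $once, "\n";
echo $twice, "\n";
echo $untrimmed, "\n";
```

The first value matches the URL that works in the browser; the second and third reproduce the %2B and %0A seen in the failing request.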

Related

php url string converts "&section=" to "§ion", which does not yield cURL response

I save a php string as
$url = "http://example.com/index.php?q=board/ajax_call&section=get_messages";
The url when printed to screen displays as
"http://example.com/index.php?q=board/ajax_call§ion=get_messages" because "&sect" gets automatically converted to the special character "§".
How can I prevent this so that I can call the correct URL using cURL?

Your Problem
The problem is that &sect is interpreted by the browser as the HTML entity for §.* So, &section displays as §ion.
The Solution
If you're going to print the URL itself, you need to escape the & and turn it into &amp;. You can do this automatically using htmlentities(). Sample code:
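<?php
$url = "http://example.com/index.php?q=board/ajax_call&section=get_messages";
echo "Without htmlentities(): " . $url . "\n";
// raw output: http://example.com/index.php?q=board/ajax_call&section=get_messages
// a browser displays: http://example.com/index.php?q=board/ajax_call§ion=get_messages
echo "With htmlentities(): " . htmlentities($url) . "\n";
// raw output: http://example.com/index.php?q=board/ajax_call&amp;section=get_messages
// a browser displays: http://example.com/index.php?q=board/ajax_call&section=get_messages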
A Note About Security
Note that using htmlentities() here is a good idea for lots of other reasons. What if somebody used this URL?
http://example.com/index.php?q=board/ajax_call&section=get_messages<script src="http://evilsite/evil.js"></script>
If you just dumped it out onto the screen, you have just included an evil JavaScript. Congratulations! You just hacked your user and, probably, got your own site hacked. This is a real problem called XSS (Cross-Site Scripting). But if you call htmlentities() first, you get:
http://example.com/index.php?q=board/ajax_call&amp;section=get_messages&lt;script src=&quot;http://evilsite/evil.js&quot;&gt;&lt;/script&gt;
That's safe and won't actually run the evil script.
* Technically, the HTML entity is &sect;, with the semicolon, but nearly all browsers will treat it as an HTML entity with or without the semicolon. See this answer for a good explanation.

Change the & to &amp;.
(See w3c markup validator ampersand (&) error for a bit more information.)

Issue with encoding on receiver end when using xmlrpc_encode_request

I have an issue with charsets and how they are encoded in a request I send. I have a test case where I want the code to end up with the exact same md5 hash on both sides, while the character is still 'å', obviously (so not converted into some broken character or just '?').
The source input is UTF-8 and contains a Norwegian character, for example "båt".
This input will then be sent to an API that wants data to be latin1 / ISO-8859-1.
One goal is also to avoid having to add utf8_decode to the receiving end.
So this is the very simplified code of what I've sent until now:
$password_send = 'båt';
echo "Test 1: " . md5( $password_send ) . "\n";
$params = array('password' => $password_send);
$request = xmlrpc_encode_request($module, $params);
And this is how the receiving end treats it. It basically just converts it into an md5 hash and sends it to another method. No other conversion of the incoming data has been done.
$_hash = md5( $password_receive );
echo "Test 2: {$_hash}\n";
Member::updatePassword($member_id, $_hash);
I need $_hash to end up as 7e2cdd98fccee62723784a815a2ecdcb when 'båt' is sent, since this is the md5 hash that 'båt' resolves to when the password 'båt' is saved on the site itself (and not through the API).
When I send 'båt' in the API request, the receiver end ends up with fd9cac747daca144726dc579c32f48ae, which is wrong. When I check the md5() of 'båt' before I send it, it is also displayed as fd9cac747daca144726dc579c32f48ae.
I guess this is expected, since I don't use utf8_decode yet. But if I change what I send, like so: $password_send = utf8_decode('båt');
then it still doesn't end up with the correct hash on the receiver end; it ends up with b865deb1e3b0891a41c5444c00893a0f.
However, if I also add utf8_decode on the receiver end, like so: $_hash = md5( utf8_decode($password_receive) ), then it ends up with the hash I need: 7e2cdd98fccee62723784a815a2ecdcb
But this seems very wrong: having to do utf8_decode on both sides. And while this hash is now correct on the receiving end too, the issue is that I don't want to change any code on the receiving end. It also doesn't work to do utf8_decode twice before I send the value, because then I end up with the hash c2d1fbc45e123f65edd74401ef58dd6a on the receiving end (which is the equivalent of doing md5('b?t')). It only worked when I did utf8_decode once before sending and once on the receiver end.
So I started to realize that xmlrpc_encode_request is probably the culprit, in that it may do some conversion of its own. First I checked what a var_dump of $request said in the cases where the $password_send value has NOT been utf8_decoded. And that is:
<string>bÃ¥t</string>
When I do utf8_decode on the value $password_send before it's made into an xmlrpc request, then it is:
<string>båt</string>
Then I read the documentation on xmlrpc_encode_request. And I've tried various combinations of output_options, but none of them seems to work. In every scenario I still have to do utf8_decode in the code on the input data on the receiver end to end up with the exact same md5 that I need.
I realize this might be somewhat confusing. I would really really appreciate it if someone is able to help me out here. By giving me some pointer on what I should do or try. Because I've gotten completely lost on this issue now :(

The problem seems to be the escaping done by the xmlrpc_encode_request function. I had the same problem, albeit with the Czech "ě" character. I believe it might be a bug in PHP; however, I found a simple workaround.
Just turn off escaping of non-print and non-ASCII characters.
echo xmlrpc_encode_request('test', 'å'); //Ã¥ - incorrect
echo xmlrpc_encode_request('test', 'å', ['escaping' => 'markup']); //å - correct
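The hash mismatch in the question comes from md5() working on bytes rather than characters. The following sketch, which assumes the iconv extension is available and does not use xmlrpc at all, shows three byte representations of "båt" and that each one hashes differently. The variable names are illustrative.

```php
<?php
// "båt" as UTF-8 bytes: "å" is the two bytes 0xC3 0xA5.
$utf8 = "b\xC3\xA5t";

// The same word as ISO-8859-1 bytes: "å" is the single byte 0xE5.
$latin1 = iconv('UTF-8', 'ISO-8859-1', $utf8);

// Treating the UTF-8 bytes as if they were ISO-8859-1 and converting them
// to UTF-8 again yields the double-encoded form that displays as "bÃ¥t".
$double = iconv('ISO-8859-1', 'UTF-8', $utf8);

// md5() hashes bytes, not characters, so each representation hashes differently.
foreach (['utf8' => $utf8, 'latin1' => $latin1, 'double' => $double] as $name => $bytes) {
    printf("%-7s %-14s %s\n", $name, bin2hex($bytes), md5($bytes));
}
```

Comparing the hex dumps on both sides of the API call is usually the quickest way to see which of these representations the receiver actually gets.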

weird issue with url encoding

Searching through my application's uncaught exception logs (js -> php -> vb6 dll), I noticed a weird error:
file: /displaywords_GET.php?GreekWord=%E1%ED%E8%F1%F9%F0%EF%EC%DE%ED%E1%F2&selectedRes=1 # <b>Source:</b> mydll<br/><b>Description:</b> Invalid procedure call or argument # Variables:
# Array
(
[GreekWord] => ανθρωπομήνας
[selectedRes] => 1
)
So the exception in the .dll occurs for the given parameters. I tested it myself in the app by entering the specific word, and the error did not occur. Then I tested by entering the encoded URL directly in the address bar, and the error was reproduced. To see whether something is wrong with the encoding, I ran this in JavaScript:
encodeURIComponent("ανθρωπομήνας")
and the result is:
%CE%B1%CE%BD%CE%B8%CF%81%CF%89%CF%80%CE%BF%CE%BC%CE%AE%CE%BD%CE%B1%CF%82
which is very different from the GET parameter above in the PHP log. Then I tried to decode the URL GET parameter as seen in the PHP log with:
decodeURIComponent("%E1%ED%E8%F1%F9%F0%EF%EC%DE%ED%E1%F2")
and JavaScript says: malformed URI sequence. Why is this happening? Obviously the application crashes because this particular URL parameter is malformed.
Now, my problem is: how can I tell whether an encoded string is a proper one or a corrupted one? (Though I'm not sure why PHP seems to decode it more or less correctly in the logs when JavaScript says it's malformed.)
Thanks in advance!

%E1%ED... is the URL-encoding of the string's bytes in the ISO-8859-7 character set. You need to convert the string to UTF-8 before URL-encoding it, since JavaScript's decodeURIComponent only accepts percent-encoded UTF-8.
// Assumes this source file is saved as ISO-8859-7, so $word holds ISO-8859-7 bytes.
$word = 'ανθρωπομήνας';
var_dump(urlencode($word)); // %E1%ED%E8%F1%F9...
$utf8word = iconv('ISO-8859-7', 'UTF-8', $word);
var_dump(urlencode($utf8word)); // %CE%B1%CE%BD...
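As for telling a proper value from a corrupted one, one possible check is whether the decoded bytes are valid UTF-8. The sketch below uses a hypothetical helper, isUtf8Encoded(), and assumes the bad values are ISO-8859-7 as described above; it relies only on PCRE and iconv.

```php
<?php
// Hypothetical helper: true if the percent-encoded value decodes to valid UTF-8.
function isUtf8Encoded(string $encoded): bool
{
    // With the /u modifier, preg_match() returns false for an invalid UTF-8 subject.
    return preg_match('//u', rawurldecode($encoded)) === 1;
}

// The parameter exactly as it appears in the log.
$fromLog = '%E1%ED%E8%F1%F9%F0%EF%EC%DE%ED%E1%F2';

var_dump(isUtf8Encoded($fromLog));

// Assumption: the bytes are ISO-8859-7, so they can be converted and re-encoded.
$repaired = rawurlencode(iconv('ISO-8859-7', 'UTF-8', rawurldecode($fromLog)));
echo $repaired, "\n";
var_dump(isUtf8Encoded($repaired));
```

A check like this only says the bytes are not UTF-8; it cannot prove which legacy charset they came from, so the ISO-8859-7 conversion remains a guess based on the application's language.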

Encoding problem in PHP while making a webservice call

I have yet another encoding-related problem with PHP!
So the story is:
I read a URL from a file (ISO-8859). I can't change the encoding of this file, for various reasons I won't discuss here.
I use that URL to make a call to a REST web service.
The URL happens to contain the symbol "è", which is converted to � when it is loaded by the PHP engine.
As a result the web service returns an unexpected result, because what it gets is actually the word "perch�" instead of "perchè".
I tried to force PHP to work with ISO-8859 by doing:
ini_set('default_charset', "ISO-8859");
The problem is that it still doesn't work, and the web service doesn't answer properly. I am sure that the web service works, because I pasted the URL by hand into a browser and received the expected data.

You can convert data from one character set into another using iconv().
Your REST web service is most likely expecting UTF-8 data, so you would have to do something like this:
$data = iconv("iso-8859-1", "utf-8", $data);
before sending the request.
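A minimal sketch of that conversion applied to a query value follows. It assumes the file is ISO-8859-1 and that the service expects percent-encoded UTF-8; the endpoint and parameter name are placeholders, not taken from the question.

```php
<?php
// Hypothetical value as read from the ISO-8859-1 file: "è" is the single byte 0xE8.
$fromFile = "perch\xE8";

// Convert to UTF-8 first (assuming that is what the web service expects)...
$utf8 = iconv('ISO-8859-1', 'UTF-8', $fromFile);

// ...then percent-encode the value before placing it in the query string.
// The endpoint and parameter name are placeholders.
$url = 'http://example.com/service?word=' . rawurlencode($utf8);

echo $url, "\n";
```

Encoding only the parameter value, rather than the whole URL, keeps the scheme, slashes and separators intact.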

Why doesn't jQuery.parseJSON() work on all servers?

Hey there, I have an Arabic contact script that uses Ajax to retrieve a response from the server after filling the form.
On some Apache servers, jQuery.parseJSON() throws an invalid JSON exception for the same JSON it parses perfectly on other servers. This exception is thrown only in Chrome and IE.
The JSON content gets encoded using PHP's json_encode() function. I tried sending the correct header with the JSON data and setting the charset to UTF-8, but that didn't help.
This is one of the JSON responses I try to parse (I removed the second part of it because it's long):
{"pageTitle":"\u062e\u0637\u0623 \u0639\u0646\u062f \u0627\u0644\u0625\u0631\u0633\u0627\u0644 !"}
Note: The language of this data is Arabic; that's why it looks like this after being encoded with PHP's json_encode().
You can try to make a request in the examples given below and look at the full response data using Firebug or the WebKit developer tools. The response passes JSONLint!
Finally, I have two URLs using the same version of the script; try to browse them using Chrome or IE to see the error in the broken example.
The working example : http://namodg.com/n/
The broken example: http://www.mt-is.co.cc/my/call-me/
Updated: To clarify further, I would like to note that I managed to work around this by using the old eval() to parse the content. I released another version with this fix; it looks like this:
// Parse the JSON data
try
{
    // Use jQuery's default parser
    data = $.parseJSON(data);
}
catch(e)
{
    /*
     * Fix a bug where strange unicode chars in the json data makes the jQuery
     * parseJSON() throw an error (only on some servers), by using the old eval() - slower though!
     */
    data = eval( "(" + data + ")" );
}
I still want to know if this is a bug in jQuery's parseJSON() method, so that I can report it to them.

Found the problem! It was very hard to notice, but I saw something funny about that opening brace... there seemed to be a couple of little dots near it. I used this JavaScript bookmarklet to find out what it was:
javascript:window.location='http://www.google.com/search?q=u+'+('000'+prompt('String?').charCodeAt(prompt('Index?')).toString(16)).slice(-4)
I got the results page. Guess what the problem is! There is an invisible character, repeated twice actually, at the beginning of your output. The zero width non-breaking space is also called the Unicode byte order mark (BOM). It is the reason why jQuery is rejecting your otherwise valid JSON and why pasting the JSON into JSONLint mysteriously works (depending on how you do it).
One way to get this unwanted character into your output is to save your PHP files using Windows Notepad in UTF-8 mode! If this is what you are doing, get another text editor such as Notepad++. Resave all your PHP files without the BOM to fix your problem.
Step 1: Set up Notepad++ to encode files in UTF-8 without BOM by default.
Step 2: Open each existing PHP file, change the Encoding setting, and resave it.
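To confirm that a BOM is the culprit before resaving files, a small check like the following can help. This is only a sketch: stripUtf8Bom() is a hypothetical helper, and the sample payload is made up rather than taken from the broken site.

```php
<?php
// Hypothetical helper: remove any leading UTF-8 byte order marks (EF BB BF).
function stripUtf8Bom(string $text): string
{
    $bom = "\xEF\xBB\xBF";
    while (strncmp($text, $bom, 3) === 0) {
        $text = substr($text, 3);
    }
    return $text;
}

// A made-up response polluted by two BOMs, as described above.
$polluted = "\xEF\xBB\xBF\xEF\xBB\xBF" . '{"pageTitle":"ok"}';

// Strict parsers reject the polluted text but accept the cleaned text.
var_dump(json_decode($polluted));
var_dump(json_decode(stripUtf8Bom($polluted)));
```

Stripping the BOM at output time only hides the symptom; resaving the PHP files without a BOM, as in the steps above, removes the cause.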

You should try using json2.js (it's on https://github.com/douglascrockford/JSON-js)
Even John Resig (creator of jQuery) says you should:
This version of JSON.js is highly recommended. If you're still using the old version, please please upgrade (this one, undoubtedly, cause less issues than the previous one).
http://ejohn.org/blog/the-state-of-json/

I don't see anything related to parseJSON()
The only difference I see is that in the working example a session cookie is set (I guess it is needed for the "captcha", the mathematical calculation), while in the other example no session cookie is set. So maybe the comparison of the calculation result fails without the session cookie.
