Encoding problem in PHP while making a webservice call - php

I have yet another encoding-related problem with PHP!
So the story is:
I read a URL from a file (ISO-8859). I can't change the encoding of this file for various reasons that I won't discuss here.
I use that URL to make a call to a REST web service.
The URL happens to contain the symbol "è", which is converted to � when it is loaded by the PHP engine.
As a result the web service returns an unexpected result, because what it gets is actually the word "perch�" instead of "perchè".
I tried to force PHP to work with ISO-8859 by doing:
ini_set('default_charset', "ISO-8859");
The problem is that it still doesn't work and the web service doesn't answer properly. I am sure that the web service itself works: when I pasted the URL into a browser by hand, I received the expected data.

You can convert data from one character set into another using iconv().
Your REST web service is most likely expecting UTF-8 data, so you would have to do something like this:
$data = iconv("iso-8859-1", "utf-8", $data);
before sending the request.
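To see why the conversion matters, here is a small Node.js sketch (JavaScript rather than PHP, purely to show the bytes involved). It assumes the web service expects UTF-8 percent-encoding, as guessed above; the word "perchè" comes from the question and the variable names are illustrative only.

```javascript
// The same word as ISO-8859-1 bytes and as UTF-8 bytes.
const latin1Bytes = Buffer.from('perchè', 'latin1'); // è becomes the single byte 0xE8
const utf8Bytes = Buffer.from('perchè', 'utf8');     // è becomes the two bytes 0xC3 0xA8

// Reading Latin-1 bytes as if they were UTF-8 yields U+FFFD, the "�" from the question.
const misread = latin1Bytes.toString('utf8');

// Decoding with the correct charset first and then percent-encoding gives
// the UTF-8 form that a UTF-8 web service would expect in a URL.
const word = latin1Bytes.toString('latin1');
const encoded = encodeURIComponent(word);

console.log(latin1Bytes.toString('hex')); // 7065726368e8
console.log(utf8Bytes.toString('hex'));   // 7065726368c3a8
console.log(misread);                     // perch�
console.log(encoded);                     // perch%C3%A8
```

The iconv() call above plays the role of the "decode with the correct charset" step; if the service really expected ISO-8859-1 instead, the conversion would not be needed.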

Related

Weird encoding issue in XML-RPC call

I'm retrieving a list of partners from Odoo 9 (on Ubuntu 14.04, English locale) via XML-RPC, using PHP and ripcord.
Some names contain one or more diacritics:
Pièr
Frère Pièr
All those names have been entered from a single computer running Windows 8.1 using one version of Chrome.
The strange thing is that I get a list where some diacritics are correct while others have encoding problems, like:
Pi�r
Fr�re Pièr
The same diacritic within a single string may be encoded correctly in one place and incorrectly in another.
In subsequent calls the result is always the same.
If I edit the string, the result can change, giving:
Frère Pi�r
Frère Pièr
Fr�re Pi�r...
I need to output JSON, so I need to encode this in UTF-8, but that is currently impossible because I don't have a clue what encoding the original text is in (it seems to have no encoding at all!).
Any idea?
I found out that the incoming array was in the "Latin1" charset.
I solved it by normalizing the array generated from the XML-RPC output, recursively applying a multibyte conversion function:
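// given an XML-RPC output array named $arr_output...
function descramble_diacritics(&$entry, $key) {
    // convert only strings, so integers, booleans and nulls keep their types
    if (is_string($entry)) {
        $entry = mb_convert_encoding($entry, 'UTF-8', 'Latin1');
    }
}
array_walk_recursive($arr_output, 'descramble_diacritics');
// allow cross-origin requests and declare the response as JSON
header('Access-Control-Allow-Origin: *');
header('Content-Type: application/json');
echo json_encode($arr_output);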

Strange encoding behaviour in AFNetworking

I'm using AFNetworking in an iOS project and so far everything has gone OK. Now I have a PHP script that is supposed to get some info and return some JSON. Both the info the script is given and the JSON it is supposed to return contain Latin characters, mainly ã and õ.
The thing is that when I receive the JSON back in my iOS app, the characters come encoded in what I think is NSNonLossyASCIIStringEncoding. I think the encoding is not UTF-8 because of what I see back in the app:
[jsonManager GET:myURL parameters:sendingData success:^(AFHTTPRequestOperation *op, id responseObject) {
    NSLog(@"%d", op.responseStringEncoding);
    NSLog(@"%d", op.responseSerializer.stringEncoding);
    NSLog(@"%@", op.responseString);
    NSLog(@"%@", [[NSString alloc] initWithData:op.responseData encoding:NSNonLossyASCIIStringEncoding]);
} failure:^(AFHTTPRequestOperation *op, NSError *error) {
    NSLog(@"%@", op.responseString);
}];
Only the last NSLog (in the success case) outputs the response string as it was supposed to be. The third log outputs \u00e3 in place of every ã.
The first log confirms that the encoding used was NSUTF8StringEncoding.
The second log states that responseSerializer.stringEncoding is NSNonLossyASCIIStringEncoding, because I set it that way before making the request. It made no difference, and I don't know why either...
The really strange thing is that if I invoke the script from a browser, I can see that the output is encoded as UTF-8.
What is wrong here?
Thank you.
It sounds like your server is using different encoding types depending on the client or some header.
NSJSONSerialization strictly implements RFC 4627, which states:
JSON text SHALL be encoded in Unicode. The default encoding is
UTF-8.
JSON is always Unicode-encoded, so my guess is that your server isn't following the spec.
Instead of using your browser, try to replicate the behavior using cURL or a Chrome plug-in like Advanced REST Client. One place to start is how your server handles the Accept, User-Agent and Content-Type headers.
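One thing worth checking before blaming the server: PHP's json_encode() writes non-ASCII characters as \uXXXX escapes by default, and that is still valid JSON. The sketch below (plain JavaScript, with a made-up "name" field and value) only shows that a standard JSON parser turns such an escape back into the real character, so a raw response string containing \u00e3 is not by itself a sign of a wrong encoding.

```javascript
// Hypothetical response body, as json_encode() would emit it by default:
// the "ã" is written as the ASCII-only escape \u00e3.
const wire = '{"name":"Jo\\u00e3o"}';
const parsed = JSON.parse(wire);

// The same data with the character sent as raw UTF-8 instead of an escape.
const raw = JSON.parse('{"name":"João"}');

console.log(parsed.name);              // João
console.log(parsed.name === raw.name); // true
```

If the parsed responseObject already contains the right characters, only the raw responseString looks odd, and nothing needs fixing.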

How to ensure variables submitted in UTF8 using jquery $.post

I have been struggling with this for three days now; this is what I have got, and I cannot understand why I am seeing this behavior.
My problem is that I have a Spanish MySQL database with the character set and collation defined as utf8_general_ci. When I query the database in delete.php like this: "DELETE FROM countryNames WHERE country = '$name'"
the specified row does not get deleted. I set the variable $name in delete.php from a POST variable: $name=$_POST['data']. Usually $name contains Spanish characters, e.g. español, México, etc. The delete.php file is called from main.php. If I send a POST request from main.php with $.post("delete.php", {data:filename}); the query does not delete the entry (although the 'filename' string is in UTF-8), but if I create a form in main.php and post my data variable through it, the query works! The big question for me is: why do I have to submit a form for the query to work? What I'm seeing is that my database rejects the value if it comes from a jQuery POST call but accepts it when it comes from a submitted form. (I make no code change for the query to work; I just post the value by submitting the form.)
First of all, to see what charset is used for requests, install something like Firebug and check the 'Content-Type' header of your request/response. It will look something like 'application/json; charset=...'. This should be charset=utf-8 in your case.
My guess is that it worked when posting a form because of x-www-form-urlencoded: non-alphanumeric characters are percent-encoded on the client side and decoded again on the server. That's the difference from posting the data directly.
This means that somewhere a wrong encoding is at work. By default PHP treats strings as plain byte sequences, agnostic to their encoding, so I would tend to rule it out as the source of the error. jQuery.post also uses UTF-8 by default... so my suspect is the filename variable. Are you sure it is in UTF-8? Where and how do you retrieve it?
You should probably also ensure that the actual HTML page is sent as UTF-8 and not, let's say, ISO-8859-1. Have a look at this article for a thorough explanation of how to get it right.
Guys, this was a Mac problem! I just tested it with Windows as my server and now everything works fine. So beware when you are using a Mac as a server with MySQL set to UTF-8 as charset and collation. I guess the Mac stores folder and file names in some different encoding and not plain UTF-8.
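A plausible explanation, assuming the file names came from an HFS+ volume: that file system stores names in a decomposed Unicode form (close to NFD), so "ñ" comes back as "n" plus a combining tilde. Both forms are valid UTF-8, but they are different byte sequences and will not match a value stored in composed form. The JavaScript sketch below uses made-up values to show the mismatch and how normalizing to NFC removes it.

```javascript
// Hypothetical values: a name as typed by a user (composed form, NFC) and the
// same name as a Mac file listing might return it (decomposed form, NFD).
const typed = 'espa\u00f1ol';     // "ñ" as the single code point U+00F1
const fromDisk = 'espan\u0303ol'; // "n" followed by the combining tilde U+0303

console.log(typed === fromDisk);                  // false
console.log(typed === fromDisk.normalize('NFC')); // true

// The difference is visible in the percent-encoded bytes sent to the server.
console.log(encodeURIComponent(typed));    // espa%C3%B1ol
console.log(encodeURIComponent(fromDisk)); // espan%CC%83ol
```

If this is the cause, normalizing the file name once before posting it (or before comparing it in the query) should make the behavior the same on every platform.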
Your answer might be here: How to set encoding in .getJSON JQuery
As it says there, use $.ajax instead of $.post and you can set the encoding.
Or, as the second answer there says, use $.ajaxSetup to set the encoding accordingly.
Use .serialize()! I think it will work. More info: http://api.jquery.com/serialize/

PHP's rawurlencode is not equal to JavaScripts escape! Why?

I noticed that when I use urlencode or rawurlencode in PHP to encode the simple character § (section sign), I get the following result: "%C2%A7".
But when I use escape in JavaScript to encode that character, I get only "%A7".
Because of this I have encoding problems when sending and receiving data between the server running PHP and the JavaScript client fetching the data via Ajax/jQuery.
I want to be able to write any type of text I want. For this I encode the text with escape on the client and send it to the backend PHP script. When I retrieve it, on the PHP side I take the data from MySQL, run rawurlencode on it and send it back.
Both sides work in UTF-8 mode. The jQuery ajax function is called with "contentType: application/x-www-form-urlencoded:charset=UTF-8", the MySQL server is set to UTF-8 for both client and server, and the PHP script starts its output with header( "application/x-www-form-urlencoded:charset=UTF-8");
Why is PHP producing that %C2 thing, which generates the character Â on the JavaScript side?
Could somebody help?
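I had the same problem a while ago and found this solution:
function rawurlencode (str) {
    // JavaScript port of PHP's rawurlencode(): encodeURIComponent() emits UTF-8
    // percent-escapes but leaves ! ' ( ) * untouched, so those are escaped by hand
    str = (str + '').toString();
    return encodeURIComponent(str).replace(/!/g, '%21').replace(/'/g, '%27').replace(/\(/g, '%28').
        replace(/\)/g, '%29').replace(/\*/g, '%2A');
}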
The code is taken from here - http://phpjs.org/functions/rawurlencode:501
Hope it helps.
It's clearly a charset issue:
[adrian@cheops3:~]> php -r 'echo rawurlencode(utf8_encode("§"));'
%C2%A7
[adrian@cheops3:~]> php -r 'echo rawurlencode("§");'
%A7
(the terminal is obviously not running in utf8 mode)
If you have a literal § in your PHP code ensure that the php file is saved as UTF8.
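The JavaScript side of the mismatch can be reproduced in any console. This short sketch compares the legacy escape()/unescape() pair with encodeURIComponent()/decodeURIComponent(); it assumes the PHP side is sending UTF-8, as in the first command above.

```javascript
const s = '§'; // U+00A7

// escape() writes the code point value, not UTF-8 bytes.
console.log(escape(s));             // %A7
// encodeURIComponent() writes the UTF-8 bytes, like PHP's rawurlencode() on UTF-8 input.
console.log(encodeURIComponent(s)); // %C2%A7

// Decoding PHP's output with the matching function round-trips cleanly...
console.log(decodeURIComponent('%C2%A7')); // §
// ...while unescape() treats each %XX as a separate character, which is
// where a stray "Â" in front of the "§" comes from.
console.log(unescape('%C2%A7'));           // Â§
```

Using encodeURIComponent() and decodeURIComponent() on the client keeps both sides on UTF-8 percent-encoding.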

Why doesn't jQuery.parseJSON() work on all servers?

Hey there, I have an Arabic contact script that uses Ajax to retrieve a response from the server after the form is filled in.
On some Apache servers, jQuery.parseJSON() throws an invalid JSON exception for the same JSON it parses perfectly on other servers. This exception is thrown only in Chrome and IE.
The JSON content is encoded using PHP's json_encode() function. I tried sending the correct header with the JSON data and setting the charset to UTF-8, but that didn't help.
This is one of the JSON responses I try to parse (I removed the second part of it because it's long):
{"pageTitle":"\u062e\u0637\u0623 \u0639\u0646\u062f \u0627\u0644\u0625\u0631\u0633\u0627\u0644 !"}
Note: The language of this data is Arabic; that's why it looks like this after being encoded with PHP's json_encode().
You can try making a request in the examples given below and look at the full response data using Firebug or the WebKit developer tools. The response passes JSONLint!
Finally, I have two URLs using the same version of the script; try browsing them with Chrome or IE to see the error in the broken example.
The working example : http://namodg.com/n/
The broken example: http://www.mt-is.co.cc/my/call-me/
Updated: To clarify further, I would like to note that I managed to work around this by using the old eval() to parse the content. I released another version with this fix, which looks like this:
// Parse the JSON data
try
{
    // Use jquery's default parser
    data = $.parseJSON(data);
}
catch(e)
{
    /*
     * Fix a bug where strange unicode chars in the json data makes the jQuery
     * parseJSON() throw an error (only on some servers), by using the old eval() - slower though!
     */
    data = eval( "(" + data + ")" );
}
I still want to know if this is a bug in jquery's parseJSON() method, so that I can report it to them.
Found the problem! It was very hard to notice, but I saw something funny about that opening brace... there seemed to be a couple of little dots near it. I used this JavaScript bookmarklet to find out what it was:
javascript:window.location='http://www.google.com/search?q=u+'+('000'+prompt('String?').charCodeAt(prompt('Index?')).toString(16)).slice(-4)
I got the results page. Guess what the problem is! There is an invisible character, repeated twice actually, at the beginning of your output. The zero width non-breaking space is also called the Unicode byte order mark (BOM). It is the reason why jQuery is rejecting your otherwise valid JSON and why pasting the JSON into JSONLint mysteriously works (depending on how you do it).
One way to get this unwanted character into your output is to save your PHP files using Windows Notepad in UTF-8 mode! If this is what you are doing, get another text editor such as Notepad++. Resave all your PHP files without the BOM to fix your problem.
Step 1: Set up Notepad++ to encode files in UTF-8 without BOM by default.
Step 2: Open each existing PHP file, change the Encoding setting, and resave it.
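Until the PHP files are resaved, the BOM can also be stripped on the client instead of falling back to eval(). The sketch below is a minimal illustration in plain JavaScript; the response body is made up to mimic the broken output (two BOM characters before the JSON), and parseJsonIgnoringBom is a hypothetical helper name, not part of jQuery.

```javascript
// Hypothetical response body: two byte order marks (U+FEFF) followed by valid JSON.
const body = '\uFEFF\uFEFF{"pageTitle":"\\u062e\\u0637\\u0623"}';

function parseJsonIgnoringBom(text) {
  // Remove any leading U+FEFF characters; the JSON grammar does not
  // accept them as whitespace, so a strict parser rejects the input.
  return JSON.parse(text.replace(/^\uFEFF+/, ''));
}

// A strict parse of the untouched body fails...
let strictFailed = false;
try {
  JSON.parse(body);
} catch (e) {
  strictFailed = e instanceof SyntaxError;
}

// ...while the cleaned body parses normally.
const data = parseJsonIgnoringBom(body);
console.log(strictFailed);          // true
console.log(data.pageTitle.length); // 3
```

This is only a workaround: the stray characters are still sent with every response, so removing the BOM from the source files remains the real fix.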
You should try using json2.js (it's on https://github.com/douglascrockford/JSON-js)
Even John Resig (creator of jQuery) says you should:
This version of JSON.js is highly recommended. If you're still using the old version, please please upgrade (this one, undoubtedly, cause less issues than the previous one).
http://ejohn.org/blog/the-state-of-json/
I don't see anything related to parseJSON()
The only difference I see is that in the working example a session cookie is set (I guess it is needed for the "captcha", the mathematical calculation), while in the other example no session cookie is set. So maybe the comparison of the calculation result fails without the session cookie.
