The setup is a text input field in an HTML form, autocompleted with jQuery.autocomplete against a server response (e.g. a JSON list of city names). The whole package works well... except that the client gets no data back from the server when typing accented characters (éèà...). Like many others, I seem to be facing a character-encoding issue, but I cannot figure out where and how to solve it despite many tries (iconv, utf8_encode, urldecode...) and readings like this one for example.
So I need some help/hints to understand where to act (before rewriting the jQuery autocomplete code...?)
EDIT: might be also a jQuery accent folding issue, I'll try also that way.
Configuration:
server: Apache 2.2 (Debian Lenny)
php: 5.3.3, compiled from source (so the JSON_UNESCAPED_UNICODE option is not available for json_encode)
mysql: 5.1.49, with MySQL charset UTF-8 Unicode (utf8)
class: a modified PFBC 2.x version used for building the PHP form
meta
The website is mostly for French users, so it was all designed with ISO-8859-1 (a bad initial choice, I guess):
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
jQuery autocomplete code (applied to the city input field)
// DEBUG Testing (tested w/ and w/o the $charset_attr: no change)
$charset_attr = 'contentType: "application/x-www-form-urlencoded;charset=ISO-8859-1"';
echo 'jQuery("#' . $this->attributes["id"] . '").autocomplete({source:"' . $this->xhr_path . '", minLength:2, ' . $charset_attr . '});';
The code generated for that input field matches what is expected above.
Converting MySQL rows to UTF-8 using this function:
I convert the array returned by MySQL to UTF-8 before sending the JSON back to the client. I also wrote and tested other conversion functions, but it changes nothing, so I guess the problem is not there.
$encoded_arr = utf8json($returnData);
echo json_encode($encoded_arr);
flush();
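The conversion function itself isn't shown above; as a minimal sketch, one way such a utf8json() helper could look, assuming the MySQL rows come back as ISO-8859-1:
// Hypothetical sketch of a utf8json()-style helper: recursively converts an
// ISO-8859-1 result array to UTF-8 so json_encode() can handle it.
function utf8json(array $data) {
    array_walk_recursive($data, function (&$value) {
        if (is_string($value)) {
            $value = mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
        }
    });
    return $data;
}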
Encoding control 1 (client side)
An embedded check in the HTML form to see which character encoding is actually passed to jQuery.autocomplete:
jQuery(document).ready(function() {
<?php
$test_str ="foobar";
$check_encoding = "'" . mb_detect_encoding($test_str) . "'";
?>
alert('Check charset server encoding: ' + <?php echo $check_encoding;?> ); // output : ASCII
});
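Note that this check is not conclusive: mb_detect_encoding() reports ASCII for any string containing only ASCII bytes, whatever the file encoding. A more telling test, as a sketch with an accented string and an explicit, strict detection order:
// "foobar" contains only ASCII bytes, so it is always detected as ASCII.
// An accented test string with a strict detection order is more revealing.
$test_str = "féàè";
echo mb_detect_encoding($test_str, 'UTF-8, ISO-8859-1', true);
// "UTF-8" if the PHP file is saved as UTF-8, "ISO-8859-1" if it is saved as Latin-1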
Encoding control 2 (server side)
$inputData = isset($_GET['term']) ? htmlspecialchars($_GET['term'], ENT_COMPAT, 'UTF-8') : NULL;
$encoding_get = mb_detect_encoding($_GET['term']);
$encoding_data = mb_detect_encoding($inputData);
$utf8converted = @iconv(strtolower($encoding_get), 'utf-8', $inputData);
$checkconversion = mb_detect_encoding($utf8converted);
Sending lowercase unaccented characters (e, a...), everything is detected as ASCII.
Sending lowercase accented characters (é, è, à...), everything is detected as UTF-8.
So I'm lost: the server receives the proper character string and produces a JSON response (tested without ajax), but it looks like the client does not receive or interpret it properly.
For those facing the same kind of ...%$# issue, here is what I did to solve my case:
Checked the character encoding at each node (e.g. client, Apache server, MySQL server), using mb_detect_encoding on the server side.
This finally pointed out the problem node: in my case, UTF-8 characters were being passed to the MySQL server instead of Latin ISO-8859-1, so the MySQL server did not return the expected answers. I could not detect or debug this by POSTing data directly to the server script via the URL, so I had to log the input and output to a file, checking the incoming character encoding and the MySQL server output.
Changed the ajax request to POST instead of GET.
Solved by converting the $_POST data to ISO-8859-1 with mb_convert_encoding before sending the MySQL query, as well described here.
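A minimal sketch of that last step, with illustrative table/column names (not from the original code):
// The ajax POST arrives as UTF-8, but the database is queried in ISO-8859-1,
// so convert the search term before building the query.
$term = isset($_POST['term']) ? $_POST['term'] : '';
$term = mb_convert_encoding($term, 'ISO-8859-1', 'UTF-8');

// hypothetical lookup against a city table (escape before use)
$sql = sprintf("SELECT name FROM cities WHERE name LIKE '%s%%'",
               mysql_real_escape_string($term));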
Related
I'm retrieving a list of partners from Odoo 9 on Ubuntu 14.04 (English) via XML-RPC, using PHP and ripcord.
Some names contain one or more diacritics:
Pièr
Frère Pièr
All those names have been entered from a single computer running Windows 8.1 using one version of Chrome.
The strange fact is that I get a list where some diacritics are correct and others have encoding problems, like:
Pi�r
Fr�re Pièr
The same diacritic in the same string may or may not be correctly encoded.
In subsequent calls the result is always the same.
If I edit the string, the results can change, giving:
Frère Pi�r
Frère Pièr
Fr�re Pi�r...
I need to output JSON, and thus I need to encode this as UTF-8, but that is currently impossible since I have no clue what encoding the original text is in (and it seems to have no consistent encoding at all!)
Any idea?
I found out that the incoming array was in the charset "Latin1" (ISO-8859-1).
I solved it by normalizing the array generated from the XML-RPC output, recursively applying a multibyte conversion function:
// given an XML-RPC output named $arr_output...
function descramble_diacritics(&$entry, $key) {
$entry = mb_convert_encoding($entry, 'UTF-8', 'Latin1');
}
array_walk_recursive($arr_output, 'descramble_diacritics');
header('Access-Control-Allow-Origin: *');
header('Content-Type: application/json');
echo json_encode($arr_output);
I'm having some trouble with my $_POST/$_REQUEST data; it still appears to be utf8_encode'd.
I am sending conventional ajax post requests, in these conditions:
oXhr.setRequestHeader("Content-type", "application/x-www-form-urlencoded; charset=utf-8");
JS file saved in UTF-8 (no BOM) format
meta tags set up in the HTML <head> tag
PHP files saved in UTF-8 (no BOM) format as well
encodeURIComponent is used but I tried without and it gives the same result
OK, so everything seems fine: the database is also in UTF-8 and receives the data that way, and the pages display correctly.
But when I receive the character "º" for example (through $_REQUEST or $_POST), its binary representation is 11000010 10111010, while the binary representation of "º" hardcoded in PHP (UTF-8...) is 10111010 only.
What is going on? I don't know whether this is a good thing or not... for instance, if I use "#º#" as a delimiter for PHP's explode function, it does not get detected, and that is actually the problem that led me here.
Any help will be as usual greatly appreciated, thank you so much for your time.
Best rgds.
EDIT1: checking against mb_check_encoding
if (mb_check_encoding($_REQUEST[$i], 'UTF-8')) {
raise("$_REQUEST is encoded properly in utf8 at index " . $i);
} else {
raise(false);
}
The encoding was confirmed; the message was raised properly.
Single-byte UTF-8 characters do not have bit 7 (the eighth bit) set, so 10111010 on its own is not valid UTF-8; your file is probably encoded in ISO-8859-1.
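To see the difference concretely, a short PHP sketch (assuming the mbstring extension is available):
// "º" as UTF-8 is two bytes (0xC2 0xBA); in ISO-8859-1 it is one byte (0xBA).
$utf8  = "\xC2\xBA";   // º encoded as UTF-8
$latin = "\xBA";       // º encoded as ISO-8859-1

var_dump(bin2hex($utf8));                      // string(4) "c2ba"
var_dump(bin2hex($latin));                     // string(2) "ba"
var_dump(mb_check_encoding($latin, 'UTF-8'));  // bool(false): a lone 0xBA is not valid UTF-8
var_dump(bin2hex(mb_convert_encoding($latin, 'UTF-8', 'ISO-8859-1'))); // string(4) "c2ba"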
I have been struggling with this for three days now, and I cannot understand why I am seeing this behavior.
My problem is that I have a Spanish MySQL database with the character set and collation defined as utf8 / utf8_general_ci. When I query the database in delete.php like this: "DELETE FROM countryNames WHERE country = '$name'",
the specified row does not get deleted. I set the variable $name in delete.php from a POST variable, $name = $_POST['data']. Mostly $name gets values with Spanish characters, e.g. español, México, etc. delete.php gets called from main.php. If I send a POST from main.php with $.post("delete.php", {data: filename});, the query does not delete the entry (although the filename string is in UTF-8), but if I create a form and then post my data variable from main.php, the query works!! The big question to me is: why do I have to submit a form for the query to work? What I'm seeing is that my database rejects the value when it comes from a jQuery post call but accepts it when it comes from a submitted form. (I make no code change for the query to work; I just post the value by submitting the form.)
First of all, to see what charset is used for requests, install something like Firebug and check the 'Content-Type' header of your request/response. It will look something like 'application/json; charset=...'. This should be charset=utf-8 in your case.
My guess as to why it worked when posting a form: probably because of x-www-form-urlencoded, where non-alphanumeric characters are additionally encoded on the client side and decoded again on the server; that is the difference from posting the data directly.
This means that somewhere there is a wrong encoding at work. PHP treats your strings agnostic to its encoding by default, so I would tend to rule it out as the source of the error. jQuery.post also uses UTF-8 by default... so my suspect is the filename variable. Are you sure it is in UTF-8? Where and how do you retrieve it?
You should probably also ensure that the actual HTML page is also sent as UTF-8 and not, let's say iso-8859-1. Have a look at this article for a thorough explanation on how to get it right.
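On that last point, a minimal sketch of sending the page itself as UTF-8 from PHP (to be placed before any output):
// Declare the response encoding in the HTTP header; the <meta> tag in the
// generated HTML should then state the same charset.
header('Content-Type: text/html; charset=utf-8');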
Guys, this was a Mac problem!! I just tested it with Windows as my server and now everything works fine. So beware when you are using a Mac as a server with MySQL using UTF-8 as charset and collation. I guess the Mac stores folder and file names in some encoding other than UTF-8.
Your answer might be here: How to set encoding in .getJSON JQuery
As it says there, use $.ajax instead of $.post so you can set the encoding.
Or, as the second answer there says, use $.ajaxSetup to set the encoding accordingly.
Use .serialize() ! I think it will work. More info: http://api.jquery.com/serialize/
We have a PHP server that sends a JSON string in UTF-8 encoding.
I'm responsible for the iPhone app that gets the data.
I want to be sure that everything on my side is correct:
// after I download the data stream:
NSString *content = [[NSString alloc] initWithData:self.m_dataToParse encoding:NSUTF8StringEncoding];
// here the data is shown correctly in the console
NSLog(@"%@", content);
SBJsonParser *_parser = [[SBJsonParser alloc] init];
NSDictionary *jsonContentDictionary = [_parser objectWithData:self.m_dataToParse];
// here, when I print the values of the array in the array, I see \u454 \u545 \u4545 format. Any ideas why?
for (id key in jsonContentDictionary)
{
    NSLog(@"key:%@, value:%@", key, [jsonContentDictionary objectForKey:key]);
}
I'm using the latest version of the JSON library:
https://github.com/stig/json-framework/
Is the problem on the iPhone side (the JSON parser?) or on the PHP server?
Just to be clear again:
1. On the console, before JSON parsing, the string looks OK.
2. After JSON parsing, the values of the array in the array are in the format \u545 \u453 \u545.
Thanks in advance.
Your code is correct.
A possible reason for the issue, which you should investigate with your content provider (the server that sends you the JSON), is the following: even if the whole JSON string is correctly encoded as UTF-8 (remember: JSON text is a sequence of characters, so an encoding must be specified), it may happen that some or all of the text content (that is, the values of the individual objects contained in the JSON message) was originally encoded in another format, typically HTML encoding (ISO-8859), especially when particular characters are used (e.g. Cyrillic or Asian ones). The JSON framework decodes all data as UTF-8 by default, but if there is a coding mismatch between the UTF-8 characters and the ISO-8859 ones (just to stay with the example), then the only way to represent them is the \uXXXX format. This happens quite often, especially when PHP scripts extract the info from HTML pages, which are usually encoded in ISO-8859. Consider also that iOS is not able to convert the whole set of ISO-8859 characters to Unicode (e.g. Cyrillic).
So possible solutions are:
- do a content encoding of the texts server side (ISO-8859 --> UTF-8); a rough sketch follows below,
- or, if this is not possible, it's up to you to recognize the \uXXXX sequences coming from your content provider and replace them with the corresponding UTF-8 characters.
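A minimal PHP-side sketch of the first option, assuming the offending values really are ISO-8859-1 ($rows stands for whatever array gets json_encode'd):
// Convert every string value from ISO-8859-1 to UTF-8 before json_encode(),
// so the parser on the iOS side receives consistent UTF-8 text.
// (json_encode() may still emit \uXXXX escapes, but those are valid JSON and
// decode back to the right characters.)
array_walk_recursive($rows, function (&$value) {
    if (is_string($value)) {
        $value = mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
    }
});
header('Content-Type: application/json; charset=utf-8');
echo json_encode($rows);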
I realized that when I use urlencode or rawurlencode in PHP to encode the simple character § (paragraph sign), I get the following result: "%C2%A7".
But when I use escape in JavaScript to encode that character, I get only "%A7".
In this case I have encoding problems when sending/receiving data between the server running PHP and the JavaScript client trying to fetch the data via ajax/jQuery.
I want to be able to write any kind of text. For this I escape/encode the text and send it to the backend PHP script. When I retrieve it, on the PHP side I take the data from MySQL, apply rawurlencode, and send it back.
Both sides work in UTF-8 mode: the jQuery ajax function is called with "contentType: application/x-www-form-urlencoded:charset=UTF-8", the MySQL server is set to UTF-8 for both client and server, and the PHP script starts its output with header("application/x-www-form-urlencoded:charset=UTF-8");.
Why is PHP producing that %C2 thing, which generates the character  on the JavaScript side?
Could somebody help?
I had the same problem a while ago and found the solution :
function rawurlencode(str) {
    str = (str + '').toString();
    return encodeURIComponent(str)
        .replace(/!/g, '%21')
        .replace(/'/g, '%27')
        .replace(/\(/g, '%28')
        .replace(/\)/g, '%29')
        .replace(/\*/g, '%2A');
}
The code is taken from here - http://phpjs.org/functions/rawurlencode:501
Hope it helps.
It's clearly a charset issue:
[adrian@cheops3:~]> php -r 'echo rawurlencode(utf8_encode("§"));'
%C2%A7
[adrian@cheops3:~]> php -r 'echo rawurlencode("§");'
%A7
(the terminal is obviously not running in utf8 mode)
If you have a literal § in your PHP code, ensure that the PHP file is saved as UTF-8.
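If the incoming form of the data cannot be controlled, one way to normalize it on the PHP side is a sketch like this, assuming anything that is not valid UTF-8 is ISO-8859-1 (the $_POST key is illustrative):
// Accept both %C2%A7-style (UTF-8) and %A7-style (ISO-8859-1) input and
// normalize everything to UTF-8 before using or storing it.
$value = isset($_POST['text']) ? $_POST['text'] : '';
if (!mb_check_encoding($value, 'UTF-8')) {
    $value = mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
}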