I have been struggling with this for three days now and this is what i have got and i cannot understand why i am seeing this behavior.
my problem is that i have a MySql spanish db with char set and collation defined as utf8_general_ci. when i query the data base in delete.php like this "DELETE FROM countryNames WHERE country = '$name'"
the specified row doesnot get deleted. i am setting the variable $name in delete.php through a post variable $name=$_post['data'] . mostly $name gets the value in spanish characters e.g español, México etc. the delete.php file gets called from main.php.if i send a post message from main.php $.post("delete.php", {data:filename}); , the query doesnot deletes the entry (although the 'filename' string is in utf8) but if i create a form and then post my data variable in main.php, the query works!! the big question to me is why do i have to submit a form for the query to work? what im seeing is my database rejects the value if it comes from a jquery post call but accepts it when its from a submitted form. (i make no code change for the query to work. just post the value by submiting the form)
First of all, to see what charset ìs used for requests, install something like Firebug and check the 'Content-Type' header of your request/response. It will look something like 'application/json; charset=...'. This should be charset=utf-8 in your case.
My guess why it worked when posting a form is probably because of x-www-form-urlencoded - non-alphanumeric characters are additionally encoded on the client side and again decoded on the server, that's the difference to posting the data directly.
This means that somewhere there is a wrong encoding at work. PHP treats your strings agnostic to its encoding by default, so I would tend to rule it out as the source of the error. jQuery.post also uses UTF-8 by default... so my suspect is the filename variable. Are you sure it is in UTF-8? Where and how do you retrieve it?
You should probably also ensure that the actual HTML page is also sent as UTF-8 and not, let's say iso-8859-1. Have a look at this article for a thorough explanation on how to get it right.
guys this was a Mac problem!! i just tested it on windows as my server and now everything works fine. So beware when u r using Mac as a server with MySql having UTF8 as charset and collation. I guess the Mac stores the folder and file name in some different encoding and not UTF-8.
You answer might be here: How to set encoding in .getJSON JQuery
As it says there, use $.ajax instead of $.post and you can set encoding.
OR, as it says in the 2nd answer use $.ajaxSetup to set the encoding accordingly.
Use .serialize() ! I think it will work. More info: http://api.jquery.com/serialize/
Related
I am currently moving blog posts from wordpress to drupal. however after moving it
some of the text is not being displayed correctly.
wordpress is displaying :
When it hasn’t (html code is <h2>When it hasn’t</h2>)
Drupal is displaying :
When it hasn’t (html code is <h2>When it hasn’t</h2>)
In the wordpress and drupal db the value is correct. The source is the same.
<h2>When it hasn’t</h2>
I did a search and found many options. None of them helped.
Below are the ones I have done and checked.
1) I double checked that utf-8 is the character encoing in drupal and wp.
I also made a simple test.php file to check nothing else was coming in the way
and it still did not display correctly.
2) I made sure when we take a mysqldump and upload to drupal utf-8
is used.
3) I also made sure the .php file is in utf-8 when saved.
4) I changed the encoding type in chrome for every option available and nothing
displayed it correctly.
5) I also used php functions to recode it but they did not work.
$value2="<h2>When it hasn’t</h2>";
$out = recode_string('..utf-8', $value2);
//output - When it hasnt
$out2= mb_convert_encoding($value2,'UTF-8', "UTF-8");
// output - When it hasn’t
$out3= #iconv('UTF-8', 'utf-8', $value2);
// output - When it hasn’t
I have ran out of options now and I am stuck. Please help
You say the text in both databases is correct, but actually this doesn't mean too much: to viewing the content of a record you must use some client, and quite a few transformations may happen depending on how the text is rendered so you can read it.
So only two things matters:
the encoding of the column
the encoding of the HTML page returned by Drupal
Since your page outputs ’ (in CP1252 is xE2x80x99) for ’ (Unicode U+2019, UTF-8 is 0xE28099) I guess the column is indeed UTF-8, however there's someone between the database and the browser who thinks the text is CP1252. This is what you have to check:
If using MySQL, the connection encoding must be UTF-8 so that what you have in your PHP script is UTF-8 text. You can use SET NAMES 'UTF-8'. Note that if you don't need the Unicode set, you can even use CP1252: the only important thing is that you know the encoding, since PHP strings are just byte arrays.
Explicitely define the response encoding in the HTTP Content-Type header. I mean, configure Drupal to call header('Content-Type: text/html; charset=utf-8');
If the HTTP response encoding is different than the one used for the text retrieved from the db, transcode the query result accordingly
When navigating to a URL like this:
http://example.com/user?u=ヴィックサ
I notice that Chrome encodes the characters as:
http://example.com/user?u=%E3%83%B4%E3%82%A3%E3%83%83%E3%82%AF%E3%82%B5
And everything works serer-side.
However, in IE I get this error from my code:
The user you are trying to find (?????) does not exist.
Note the five question marks. For some reason the PHP never gets to see the parameter.
What could be causing this, and is there any way to fix it?
Sadly it seems what you want to do is not going to work for the current generation of IE
The accepted answer for this question UTF-8 Encoding issue in IE query parameters says that you need to encode the characters yourself rather than relying on the browser as support varies from browser to browser, and maybe even device to device
<a href='/path/to/page/?u=<?=urlencode('ヴィックサ')?>'>View User</a>
Also I presume you are setting utf8 headers from the webserver? you didn't say, if not, in php
header('Content-Type: text/html; charset=utf-8');
I have updated this question to make it clearer:
I am sending the word "Über" as the value in a php header as a response to an ajax request like so:
header('title: ' . 'Über');
The word "Über" comes from the database with the columns set as utf8_general_ci and when I echo out the word from the database I see it correctly.
The problem occurs in javascript when I do:
var title = request.getHeader('title');
I then do:
console.log(title);
In Chrome the value of title is correct (Über), but in Safari and Firefox it is converted to "Ãber".
I think that the problem might be that I'm sending the value in the header rather than normal response string.
Any help much appreciated.
Quentin was right. Sending raw unicode in the header caused this problem. I changed to a normal ajax request and it fixed things. Thanks everyone for your help.
I'm having some troubles with my $_POST/$_REQUEST datas, they appear to be utf8_encoded still.
I am sending conventional ajax post requests, in these conditions:
oXhr.setRequestHeader("Content-type", "application/x-www-form-urlencoded; charset=utf-8");
js file saved under utf8-nobom format
meta-tags in html <header> tag setup
php files saved under utf-8-nobom format as well
encodeURIComponent is used but I tried without and it gives the same result
Ok, so everything is fine: the database is also in utf8, and receives it this way, pages show well.
But when I'm receiving the character "º" for example (through $_REQUEST or $_POST), its binary represention is 11000010 10111010, while "º" hardcoded in php (utf8...) binary representation is 10111010 only.
wtf? I just don't know whether it is a good thing or not... for instance if I use "#º#" as a delimiter of the explode php function, it won't get detected and this is actually the problem which lead me here.
Any help will be as usual greatly appreciated, thank you so much for your time.
Best rgds.
EDIT1: checking against mb_check_encoding
if (mb_check_encoding($_REQUEST[$i], 'UTF-8')) {
raise("$_REQUEST is encoded properly in utf8 at index " . $i);
} else {
raise(false);
}
The encoding got confirmed, I had the message raised up properly.
Single byte utf-8 characters do not have bit 7(the eight bit) set so 10111010 is not utf-8, your file is probably encoded in ISO-8859-1.
I have a webform and an input if you put in Latté and POST it using JSON...
$.ajax({
type: "POST",
url: "http://"+document.domain+"/includes/rpc.php",
data: {method:"add_item",item:item},
dataType: "json",
timeout: 10000,
success:......
item will be the value Latté Latté is posted and the responding JSON is Latt\u00e9 which the browser interprets as Latté. Effectively this script is a WYSIWYG editor so what you type in you get back. Anyhue if I refresh the text is pulled out of mysql and comes out as Latté?. So I am guessing that MYSQL is not the correct collation?
Some more information - the query to edit the DB is
UPDATE menu_items SET description = 'Latté' WHERE item_id = '742'
the JSON reply is
{"description":"Latt\u00e9","id":"#recordsArray_742"}
To be precise, the collation is the way in which the strings are compared and sorted. It has a relationship with the character set, but your problem is an character set problem, not a collation problem.
A character set is a set of symbols and encodings. A collation is a set of rules for comparing characters in a character set.
The first thing that you've to know is which character set you're using. Are you using UTF-8 or LATIN1 or others?
After that I'd try to output the correct header from the PHP script generating the JSON string. For example for UTF-8:
header('Content-type: application/json; charset=utf-8');
This could already solve your problem, if it doesn't we have to look deeper on how do you connect to the DB and how you manipulate your data. Let me know in case which MySQL libraries or framework are you using to connect to the db, and post the relevant source code.
It could be MySQL's collation.
It could also be HTTP encoding.
Or you could be using string functions in PHP which are not multi-byte-safe.
This kind of error can happen at many points through your tool chain.
Either the database collation is wrong or the DBAL you are using (PDO?).
Use utf8_encode() around your fetched values in your PHP backend before you output it