I have a data encoding problem. One column in my database contains accented characters, and my API returns that column from a SQL Server query made through PDO in PHP. As soon as I convert the result to JSON with json_encode, the JSON comes back NULL. When I var_dump the value, the accented letters appear as '�', and the JSON is empty.
I know it's an encoding problem and that I need to convert to UTF-8, but I haven't managed to do this conversion in PHP. Can anyone help me?
Are you specifying the header for the right charset?
header('Content-type: text/html; charset=utf-8');
Note also that your columns and tables should use a UTF-8 collation such as utf8_unicode_ci.
Finally, your connection to the database should also be set accordingly, with charset=utf8.
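The empty JSON in the question is consistent with json_encode rejecting bytes that are not valid UTF-8. The sketch below reproduces that with a hypothetical value and converts it before encoding. It assumes the mbstring extension is available and that the source encoding is ISO-8859-1; a SQL Server column may well use a different code page, so the source encoding passed to mb_convert_encoding() is an assumption to adjust.

```php
<?php
// Hypothetical value as it might arrive from the driver: "café" in ISO-8859-1,
// where é is the single byte 0xE9, which is not valid UTF-8 on its own.
$raw = "caf\xE9";

$failed    = json_encode($raw);    // false on current PHP versions: invalid UTF-8 input
$errorCode = json_last_error();    // JSON_ERROR_UTF8
$reason    = json_last_error_msg();

// Convert from the ASSUMED source encoding to UTF-8, then encode.
$utf8 = mb_convert_encoding($raw, 'UTF-8', 'ISO-8859-1');
$json = json_encode($utf8);        // "caf\u00e9"

var_dump($failed);
echo $reason, PHP_EOL;
echo $json, PHP_EOL;
```

Checking json_last_error() right after a failed call is usually the quickest way to confirm that encoding, rather than the data structure, is the cause.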
Related
In my own opinion, I have now tried everything there is to try on this encoding problem and looked through a lot of answered questions, but nothing worked for me, so here I go.
I have a MySQL database with a Users table. This table has a "firstname" column whose collation is set to utf8_general_ci (as are all varchar columns). I have inserted a row where the firstname column is set to "Løw", with the Scandinavian special character "ø".
I now use the php-ActiveRecord library, where the connection string includes ";charset=utf8", to retrieve the row and then output the user as JSON, like so:
$user = User::find($ID);
$userArr = $user->to_array();
header('Content-Type: application/json; charset=utf-8');
print(json_encode($userArr));
Now the weird things start. The firstname is now NOT "Løw" as displayed in the MySQL database, but "L\u00f8w". I then tried to see whether this was also the case without the json_encode function, like so:
$user = User::find($ID);
$userArr = $user->to_array();
header('Content-Type: text/plain; charset=utf-8');
print_r($userArr);
But here the output was correct: firstname was "Løw". I then tried to encode the fields in the array to UTF-8, since everybody told me it should work if the strings were UTF-8, like so:
$return[] = array_map('utf8_encode', $userArr);
print_r(json_encode($return));
But this gave me "L\u00c3\u00b8w", so that didn't work. Since I was out of ideas, I then tried to utf8_decode it:
$return[] = array_map('utf8_decode', $userArr);
print_r(json_encode($return));
But that made the string come back as "null". I then tried to check what encoding my variables had when they came out of the database, like so:
header('Content-Type: text/plain; charset=utf-8');
print(mb_detect_encoding($userArr['firstname']));
But this returned UTF-8.
So as you can hopefully see, I have tried everything and I still don't know why json_encode changes the "ø" character to "\u00f8". Please help, I don't want to write my own json_encode method.
OK, I found an answer pretty quickly, but I'll let other Scandinavian people know, since I couldn't find anything on the subject.
I solved the problem by adding the following flag to the json_encode call:
print(json_encode($userArr,JSON_UNESCAPED_UNICODE));
This tells the method NOT to escape Unicode characters (I think), or as it says in the PHP docs:
JSON_UNESCAPED_UNICODE (integer)
Encode multibyte Unicode characters literally (default is to escape as
\uXXXX). Available since PHP 5.4.0.
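As a minimal sketch of what the flag changes: "\u00f8" is a standard JSON escape for ø, so the default output was already valid JSON and decodes back to the same string. The example also reproduces the "L\u00c3\u00b8w" result from the question by converting an already-UTF-8 string as if it were ISO-8859-1, which is what utf8_encode does. It assumes the mbstring extension; the variable names are illustrative only.

```php
<?php
// "Løw", written with an escape so the example does not depend on the file's encoding.
$name = "L\u{00F8}w";

$escaped   = json_encode($name);                          // "L\u00f8w"
$unescaped = json_encode($name, JSON_UNESCAPED_UNICODE);  // "Løw"

// Both forms decode to the same PHP string.
$same = json_decode($escaped) === json_decode($unescaped);

// Treating UTF-8 bytes as ISO-8859-1 and converting again double-encodes them.
$double = json_encode(mb_convert_encoding($name, 'UTF-8', 'ISO-8859-1')); // "L\u00c3\u00b8w"

echo $escaped, PHP_EOL, $unescaped, PHP_EOL, $double, PHP_EOL;
var_dump($same);
```

So the flag only affects how the JSON text looks on the wire; a conforming JSON parser gives the client "Løw" either way.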
In my database, I have some content like this
അവരുടെ മനസ്സുകളില്.
But when I try to fetch the content using PHP and display it in the browser, it shows only question marks, like ????????? ???????. ??????.
I tried to set the content type header like this
header('Content-Type: text/html; charset=utf-8');
But it doesn't work.
How can I solve this? Any help would be greatly appreciated.
It does not work because the data was already broken when it was fetched, and you are only setting the display encoding, which is too late. Simply ensure the correct encoding on the connection, either by using a proper method like mysqli_set_charset() or by running the query SET NAMES UTF8 right after you connect to the DB.
Depending on the method you are using to connect to the DB, you should be specifying the charset.
With PDO, you can specify the charset in the DSN passed to PDO::__construct(), such as: charset=utf8 (MySQL's name for the character set is utf8, without a hyphen).
Otherwise you have mysqli::set_charset() for MySQLi, or, god forbid you're still using the mysql_* functions, there's mysql_set_charset().
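A rough sketch of the PDO variant follows. The helper name mysqlDsn, the host, the database name, and the credentials are all hypothetical. It defaults to utf8mb4, MySQL's full 4-byte UTF-8 character set, which is an assumption; utf8 as suggested above also covers the Malayalam text in the question.

```php
<?php
// Hypothetical helper: builds a pdo_mysql DSN that names the connection charset.
function mysqlDsn(string $host, string $dbname, string $charset = 'utf8mb4'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

$dsn = mysqlDsn('localhost', 'example_db');
echo $dsn, PHP_EOL; // mysql:host=localhost;dbname=example_db;charset=utf8mb4

// Connecting is left commented out because it needs a real server and credentials:
// $pdo = new PDO($dsn, 'example_user', 'example_password');
//
// Rough mysqli equivalent:
// $mysqli = new mysqli('localhost', 'example_user', 'example_password', 'example_db');
// $mysqli->set_charset('utf8mb4');
```

Setting the charset at connection time means every query on that connection is covered, with no separate SET NAMES statement to remember.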
I have a MySQL DB, encoded in UTF8, where some records have 'ā's in them (in case it doesn't show up right in SO, that's an 'a' with a line above it).
There is a PHP script that is getting the records, putting them in an array, and json_encoding them. No matter whether the script is being invoked by ajax or the webpage, the 'ā's show up as question marks. Where is the problem, and how do I fix it?
Thanks,
Jamie McClymont
EDIT: Forgot to mention that the 'ā's show up fine in PHPMyAdmin
For the text to print correctly, you need to set the charset of both the MySQL connection and the page.
For the connection, the following query will work:
set names utf8
Run this query right after connecting.
If the charset is still incorrect, try adding
header('Content-Type: application/json; charset=utf-8');
assuming you're outputting JSON.
http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html
I have a web form with an input. If you put in Latté and POST it using JSON...
$.ajax({
type: "POST",
url: "http://"+document.domain+"/includes/rpc.php",
data: {method:"add_item",item:item},
dataType: "json",
timeout: 10000,
success:......
item will have the value Latté. Latté is posted, and the JSON response contains Latt\u00e9, which the browser interprets as Latté. Effectively this script is a WYSIWYG editor, so what you type in is what you get back. Anyhow, if I refresh, the text is pulled out of MySQL and comes out as Latté?. So I am guessing that MySQL does not have the correct collation?
Some more information: the query to edit the DB is
UPDATE menu_items SET description = 'Latté' WHERE item_id = '742'
the JSON reply is
{"description":"Latt\u00e9","id":"#recordsArray_742"}
To be precise, the collation is the way in which strings are compared and sorted. It is related to the character set, but your problem is a character set problem, not a collation problem.
A character set is a set of symbols and encodings. A collation is a set of rules for comparing characters in a character set.
The first thing you have to know is which character set you're using. Are you using UTF-8, LATIN1, or something else?
After that I'd try to output the correct header from the PHP script generating the JSON string. For example for UTF-8:
header('Content-type: application/json; charset=utf-8');
This could already solve your problem. If it doesn't, we have to look deeper at how you connect to the DB and how you manipulate your data. In that case, let me know which MySQL library or framework you are using to connect to the DB, and post the relevant source code.
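To make the character set point concrete, here is a small sketch showing that the same character é is stored as different bytes in UTF-8 and in LATIN1 (ISO-8859-1), independent of any collation. It assumes the mbstring extension; the variable names are illustrative.

```php
<?php
$utf8   = "\u{00E9}";                                         // é encoded as UTF-8
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');  // é encoded as LATIN1

echo bin2hex($utf8), PHP_EOL;    // c3a9 (two bytes)
echo bin2hex($latin1), PHP_EOL;  // e9   (one byte)

// The LATIN1 byte on its own is not valid UTF-8, which is why text gets
// mangled when one side of a connection assumes the wrong character set.
$latin1IsValidUtf8 = mb_check_encoding($latin1, 'UTF-8');
var_dump($latin1IsValidUtf8);    // bool(false)
```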
It could be MySQL's collation.
It could also be HTTP encoding.
Or you could be using string functions in PHP which are not multi-byte-safe.
This kind of error can happen at many points through your tool chain.
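Either the database collation is wrong or the DBAL you are using (PDO?) is misconfigured.
If the fetched values are actually ISO-8859-1, wrap them in utf8_encode() in your PHP backend before you output them. (utf8_encode() is deprecated as of PHP 8.2; mb_convert_encoding() is the replacement.)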
I'm having encoding problems in my webpage, and it is driving me crazy. Let me try to explain.
I have a meta tag defining utf8 as charset.
I'm including the scripts as utf8 too (<script type="text/javascript" src="..." charset="utf8"></script>).
In .php files, I declare header('Content-Type: text/html; charset=utf8');
In my database (PostgreSQL), I ran the query show lc_collate; and the result was en_US.UTF-8
I'm using AJAX
When I try to save the field value "name" as "áéíóú", I get the value "áéÃóú" in the record set (using phpPgAdmin to view results).
What am I doing wrong? Is there a way to fix it without using decode/encode? Does anyone have a good reference about these issues?
Thank you all!
Maybe the client encoding is not set correctly? PostgreSQL automatically converts between the character encoding on the client and the encoding in the database. For this to work it needs to know what encoding the client is using. Safest is to set this when you open your connection using:
SET CLIENT_ENCODING TO 'UTF8';
For details see the docs
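The garbled value in the question has the typical shape of UTF-8 bytes being read as a single-byte encoding. The sketch below reproduces it, assuming the mbstring extension and assuming ISO-8859-1 as the mistaken client encoding (the actual one in the asker's setup is not known).

```php
<?php
// What the user typed: áéíóú, written with escapes so the file's encoding does not matter.
$typed = "\u{00E1}\u{00E9}\u{00ED}\u{00F3}\u{00FA}";

// Simulate a layer that believes the UTF-8 bytes are ISO-8859-1 and converts them to UTF-8.
$misread = mb_convert_encoding($typed, 'UTF-8', 'ISO-8859-1');

echo $misread, PHP_EOL;                   // each letter becomes "Ã" plus a second character
echo mb_strlen($misread, 'UTF-8'), PHP_EOL; // 10 characters instead of 5

// Reversing the mistaken conversion recovers the original text.
$repaired = mb_convert_encoding($misread, 'ISO-8859-1', 'UTF-8');
var_dump($repaired === $typed);           // bool(true)
```

The pair produced for í ends in a soft hyphen (U+00AD), which is usually invisible, so that letter shows up as a lone "Ã". Telling PostgreSQL the real client encoding, as described above, avoids the mistaken conversion in the first place.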
You might be storing the data as ISO-8859-1?
Try encoding to base64 and decoding on the other end.