php ActiveRecord and json_encode æøå encoding issue - php

In my own opinion, I have tried everything there is on this encoding problem and looked through a lot of answered questions, but nothing worked for me, so here I go.
I have a MySQL database with a Users table. This table has a "firstname" column whose collation is set to utf8_general_ci (as are all varchar columns). I have inserted a row where the firstname column is set to "Løw", with the Scandinavian special character "ø".
I now use the php-ActiveRecord library, with ";charset=utf8" in the connection string, to retrieve the row and then output the user as JSON, like so:
$user = User::find($ID);
$userArr = $user->to_array();
header('Content-Type: application/json; charset=utf-8');
print(json_encode($userArr));
Now the weird things start. The firstname is now NOT "Løw" as displayed in the MySQL database, but "L\u00f8w". I then tried to see if this was also the case without the json_encode function, like so:
$user = User::find($ID);
$userArr = $user->to_array();
header('Content-Type: text/plain; charset=utf-8');
print_r($userArr);
But here the output was correct: firstname was "Løw". I then tried to encode the fields in the array to UTF-8, since everybody told me that it should work if the strings were UTF-8, like so:
$return[] = array_map('utf8_encode', $userArr);
print_r(json_encode($return));
But this gave me "L\u00c3\u00b8w", so that didn't work. Since I was out of ideas, I then tried to utf8_decode it:
$return[] = array_map('utf8_decode', $userArr);
print_r(json_encode($return));
But that made the string return as "null". I then tried to check what encoding my variables had when they came out of the database, like so:
header('Content-Type: text/plain; charset=utf-8');
print(mb_detect_encoding($userArr['firstname']));
But this returned UTF-8.
So as you can hopefully see, I have tried everything and I still don't know why json_encode changes the "ø" character to "\u00f8". Please help, I don't want to write my own json_encode method.

OK, I found an answer pretty quickly, but I'll let other Scandinavian people know, since I couldn't find anything on the subject.
I solved the problem by adding the following flag to the json_encode call:
print(json_encode($userArr,JSON_UNESCAPED_UNICODE));
This tells the function NOT to escape Unicode characters, or as the PHP documentation says:
JSON_UNESCAPED_UNICODE (integer)
Encode multibyte Unicode characters literally (default is to escape as
\uXXXX). Available since PHP 5.4.0.
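For comparison, here is a minimal sketch of the two output forms. It assumes PHP 7.0 or later (for the "\u{...}" string escape, used here so the example does not depend on how the file itself is saved); the variable names are made up for illustration. Note that the escaped form is still valid JSON and decodes back to the same string, so the flag mostly matters when a human or a non-conforming client reads the raw output.

```php
<?php
// "Løw", written with an escape so the file's own encoding does not matter.
$name = "L\u{00f8}w";

// Default behaviour: non-ASCII characters are escaped as \uXXXX.
$escaped = json_encode($name);

// With the flag: the UTF-8 bytes are written out literally.
$literal = json_encode($name, JSON_UNESCAPED_UNICODE);

echo $escaped, "\n"; // "L\u00f8w"
echo $literal, "\n"; // "Løw"

// Both forms decode back to the identical PHP string.
$sameAfterDecode = json_decode($escaped) === json_decode($literal);
var_dump($sameAfterDecode); // bool(true)
```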

Related

Return JSON NULL

I have a data encoding problem. One of the columns in my database contains accented characters, and my API returns that column from a SQL Server query made through PDO in PHP. As soon as I get the result I convert it to JSON with json_encode, but the JSON comes back NULL. When I var_dump the data, the accented letters appear as '�', and the JSON is empty.
I know it's the encoding and that I need to convert to UTF-8, but I'm not able to do this conversion in PHP. Can anyone help me?
Are you specifying the header for the right charset?
header('Content-type: text/html; charset=utf-8');
Note also that your columns and tables should be utf8_unicode_ci.
And finally, your connection to the database should be set accordingly, with charset=utf8.

json_encode() returns null for special characters

I know that there are a lot of questions on this. I have tried many things but I couldn't fix it. Perhaps I failed to apply the solutions because of my limited knowledge?
I select data from a MySQL database and use the json_encode() function on it. This works, except for data that contains special (Turkish) characters. For those values, JSON returns null. How can I fix this?
Here is my simple php code:
<?php
require('init.php');
$sql="SELECT * FROM tablename;";
$result=mysqli_query($con,$sql);
$response=array();
while($row=mysqli_fetch_array($result)){
array_push($response,array("X"=>$row["x"],"Y"=>$row["y"]));
}
echo json_encode($response);
?>
In MySQL, the columns are set to utf8-turkish. I have tried things like setting headers, calling some functions, recreating the PHP files in UTF-8 encoding, etc., but none of it worked.
Using the mysqli_set_charset function in my connection script solved the problem. I couldn't find this in similar questions. Thanks to user "RiggsFolly".
On the receiving side, json_decode($string, TRUE); turns Unicode escape sequences (like \u00d6) back into UTF-8 characters. The TRUE argument only makes it return associative arrays instead of objects.
You need utf8 (or utf8mb4) in the connection, in the table, and in the html. See this.
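To check whether the connection charset is the culprit, it can help to look at what json_encode reports. The sketch below is an illustration only: it fakes a database value by hand-writing "Türkçe" as single-byte Latin-5 bytes (an assumption about what a mis-configured connection might deliver) and assumes the mbstring extension is loaded. On current PHP versions the whole call returns false for such input; the null seen in the question corresponds to older versions or to the JSON_PARTIAL_OUTPUT_ON_ERROR flag.

```php
<?php
// Simulated value from a connection that is NOT set to UTF-8:
// "Türkçe" with ü (0xFC) and ç (0xE7) as single bytes.
$fromDb = "T\xFCrk\xE7e";

// json_encode only accepts UTF-8 strings, so this fails.
$json = json_encode(array("X" => $fromDb));
var_dump($json);                   // bool(false)
echo json_last_error_msg(), "\n";  // human-readable reason for the failure

// Confirms that the bytes are not valid UTF-8.
$isUtf8 = mb_check_encoding($fromDb, 'UTF-8');
var_dump($isUtf8);                 // bool(false)

// With this flag the bad string is replaced by null instead of failing.
$partial = json_encode(array("X" => $fromDb), JSON_PARTIAL_OUTPUT_ON_ERROR);
echo $partial, "\n";               // {"X":null}
```

If this check fails on real rows, setting the connection charset (for example with mysqli_set_charset, as in the accepted fix above) is the place to look.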

PHP/MySQL encoding problems. � instead of certain characters

I have come across some problems when inputting certain characters into my MySQL database using PHP. What I am doing is submitting user-entered text to a database. I cannot figure out what I need to change to allow any kind of character to be put into the database and printed back out through PHP as it's supposed to.
My MySQL collation is: latin1_swedish_ci
Just before I send the text to the database from my form I use mysql_real_escape_string() on the data.
Example below
this text:
�People are just as happy as they make up their minds to be.�
� Abraham Lincoln
is supposed to look like this:
“People are just as happy as they make up their minds to be.”
― Abraham Lincoln
As mentioned by others, you need to convert to UTF8 from end to end if you want to support "special" characters. This means your web page, PHP, mysql connection and mysql table. The web page is fairly simple, just use the meta tag for UTF8. Ideally your headers would say UTF8 also.
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
Set your PHP to use UTF8. Things would probably work anyway, but it's a good measure to do this:
mb_internal_encoding('UTF-8');
mb_http_output('UTF-8');
mb_http_input('UTF-8');
For mysql, you want to convert your table to UTF8, no need to export/import.
ALTER TABLE table_name CONVERT TO CHARACTER SET utf8
You can, and should, configure mysql to default utf8. But you can also run the query:
SET NAMES UTF8
as the first query after establishing a connection and that will "convert" your database connection to UTF8.
That should solve all your character display problems.
The likeliest cause of the problem is that the database connection is set to latin1 but you are feeding it text encoded in UTF-8. The simplest way to solve this is to convert your input into what the client expects:
$quote = iconv("UTF-8", "WINDOWS-1252//TRANSLIT", $quote);
(What MySQL calls latin1 is windows-1252 in the rest of the world.) Note that many characters, such as the quotation dash U+2015 that you use there, cannot be represented in this encoding and will be converted into something else. Ideally you should change the column encoding to utf8.
An alternative solution: set the database connection to utf8. It doesn't matter how the columns are encoded: MySQL internally converts text from the connection encoding into the storage encoding, you can keep the columns as latin1 if you want to. (If you do, the quotation dash U+2015 will be turned into a question mark ? because it's not in latin1)
How to set the connection encoding depends on what library you are using: if you use the deprecated MySQL library it's mysql_set_charset, if MySQLi it's mysqli_set_charset, if PDO add charset=utf8 to the DSN.
If you do this, you'll have to set the page encoding to UTF-8 with the Content-Type header.
Otherwise you would have the same problem with the browser, feeding it text encoded in UTF-8 when it's expecting something else:
header("Content-Type: text/html; charset=utf-8");
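As a rough sketch of the PDO variant: the helper name buildMysqlDsn, the host, the database name and the credentials below are all hypothetical, and the utf8mb4 default is an assumption (plain utf8 works the same way in the DSN).

```php
<?php
// Hypothetical helper: builds a PDO MySQL DSN that pins the connection charset.
function buildMysqlDsn(string $host, string $dbname, string $charset = 'utf8mb4'): string
{
    return "mysql:host={$host};dbname={$dbname};charset={$charset}";
}

$dsn = buildMysqlDsn('localhost', 'quotes');
echo $dsn, "\n"; // mysql:host=localhost;dbname=quotes;charset=utf8mb4

// Usage (placeholder credentials; needs a running MySQL server):
// $pdo = new PDO($dsn, 'user', 'secret');
//
// MySQLi equivalent, called right after connecting:
// mysqli_set_charset($link, 'utf8mb4');
```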
The solutions provided are helpful if starting from scratch. Putting all possible connections to UTF-8 is indeed the safest. UTF-8 is the most used charset on the net for a variety of reasons.
Some suggestions and a word of warning:
copy the tables you want to sanitize with a unique prefix (tmp_)
although your DB connection is forced to utf8, check your General Settings collation and change it to utf8_bin if that has not been done yet
you need to run this on the local server
the funny-character error is mostly due to mixing LATIN1 with UTF-8 configurations. This solution is designed for that. It could work with character sets other than LATIN1, but I haven't checked this
check these tmp_tables extensively before copying back to the original
Build the two arrays needed for the magic:
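$chars = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES, "UTF-8");
$LATIN1 = $UTF8 = array();
// foreach replaces the original each() loop; each() was removed in PHP 8
foreach ($chars as $key => $val) {
    $UTF8[] = $key;   // the literal UTF-8 character
    $LATIN1[] = $val; // its HTML entity, which is what gets replaced
}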
Now build up the routines you need: (tables->)rows->fields and at each field call
$row[$field] = mysql_real_escape_string(str_replace($LATIN1 , $UTF8 , $row[$field]));
$q[] = "$field = '{$row[$field]}'";
Finally build up and send the query:
mysql_query("UPDATE $table SET " . implode(" , " , $q) . " WHERE id = '{$row['id']}' LIMIT 1");
Change the MySQL collation to utf8_unicode_ci or utf8_general_ci, including the table and the database.
You will need to set your database to UTF-8, yes. There are many ways to do it: by changing the config file, via phpMyAdmin, or by calling a PHP function (sorry, memory blank) right before the inserts and updates.
Unfortunately, I think you will have to re-enter any data you entered before.
One thing you also need to know, from personal experience: make sure all related tables have the same collation, or you won't be able to JOIN them.
As a reference: http://dev.mysql.com/doc/refman/5.6/en/charset-syntax.html
Also, it can be an Apache setting. We've experienced the same issue on a 'free-hosting' server as well as on my brother's server. Once we switched to another server, all the characters came out fine. Verify your Apache settings; sorry, but I can't shed more light on Apache's config.
Get rid of everything else; you just need to follow these three points and every problem with special-language characters will be resolved.
1- Define the collation of your table as utf8_general_ci.
2- Define <meta http-equiv="content-type" content="text/html; charset=utf-8"> in the HTML, right after the opening head tag.
3- Call mysql_set_charset('utf8',$link_identifier); in the file where you connect to the database, right after selecting the database with mysql_select_db. This will allow you to store and retrieve data properly in whatever language it is.
If your text has been encoded and decoded with the wrong encoding and so the mojibake is actually "solidified" into unicode characters, then the solutions mentioned so far won't work. I ended up having success with the ftfy Python package to automatically detect/fix mojibake:
https://github.com/LuminosoInsight/python-ftfy
https://pypi.org/project/ftfy/
https://ftfy.readthedocs.io/en/latest/
>>> import ftfy
>>> print(ftfy.fix_encoding("(à¸‡'âŒ£')à¸‡"))
(ง'⌣')ง
Hopefully this helps people who are in a similar situation.

Another charset problem with php and MySQL

I'm having a problem with some characters like 'í' or 'ñ' in a web project with PHP and MySQL.
The database table uses the UTF-8 charset and the web page is ISO-8859-1 (latin-1). At first glance everything is handled OK, but a problem appears when I use PHP's json_encode function.
When I get a query result, let's say, this row:
| ID | VALUE |
--------------------
| 1 | Línea |
I got the following (correct) array in PHP:
Array("ID"=>"1","VALUE"=>"Línea");
So far, so good. But when I apply json_encode:
$result = json_encode($result);
//$result is {"id":"1","value":"L"}
Then I tried some encoding/decoding, but I couldn't get the right result.
First I tried to decode the UTF-8 chars as follows:
$result['value'] = utf8_decode($result['value']);
//and I get $result['value'] is "L?a"
Then I tried with mb functions:
$result['value'] = mb_convert_encoding($result['value'],"ISO-8859-1","UTF-8");
//and I get that $result['value'] is "Lnea"
I don't really know why json_encode is breaking my string, and I can't figure out what else to try. I would appreciate any help :)
Thanks!
The documentation for json_encode states that the function will only work on UTF-8 data. If it's not working for you, it means that your data is not UTF-8.
To understand what's going wrong, you need to know what your connection character set is. Is it UTF-8? Something else? Use SET NAMES utf8 (MySQL spells the name without a hyphen) and see if it makes any difference.
Assuming the connection character set is indeed UTF-8, json_encode should work just fine. Then, you still have the final issue of converting the encoded data to ISO-8859-1. For example:
// assume any strings in $result are UTF-8 encoded
$json = json_encode($result);
$output = mb_convert_encoding($json, 'ISO-8859-1', 'UTF-8');
echo $output;
If it still doesn't work, it means that your UTF-8 strings contain characters not available in the ISO-8859-1 character set. There's nothing you can do about that.
Update:
When debugging complex character set conversions like this, you can use file_put_contents to write intermediate results to a file which you can inspect with a hex editor. This will help confirm that the output of a particular step of the process is correct or not.
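A small sketch of that kind of inspection, using bin2hex for a quick look without leaving PHP. The sample string, the variable names and the dump file name are made up for illustration, and the mbstring extension is assumed to be available.

```php
<?php
// "Línea" as UTF-8 (escape used so the file's own encoding does not matter).
$utf8 = "L\u{00ed}nea";

// The same text converted to ISO-8859-1, as the page in the question expects.
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

// In UTF-8 the "í" takes two bytes (c3 ad); in ISO-8859-1 it takes one (ed).
echo bin2hex($utf8), "\n";   // 4cc3ad6e6561
echo bin2hex($latin1), "\n"; // 4ced6e6561

// json_encode on the UTF-8 string escapes the character, so the JSON is pure ASCII.
$json = json_encode(array("value" => $utf8));
echo $json, "\n";            // {"value":"L\u00ednea"}

// Writing an intermediate result to a file for a hex editor
// (hypothetical file name in the system temp directory).
file_put_contents(sys_get_temp_dir() . '/step1_utf8.bin', $utf8);
```

If the first hex dump of a value coming from the database shows a lone ed byte rather than c3 ad, the connection is delivering ISO-8859-1, which is the case json_encode cannot handle.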

How to read Asiatic characters (Japanese, Chinese) after json_encode in PHP

I've read every post on the topic, but I don't think I've found an answer to my question, and it's driving me crazy.
I have a couple of PHP files: one stores data in a MySQL database, and another reads that data. I get data from all over the world, and it seems that I manage to store Asian characters correctly, but when I try to read the data I can't get those characters back.
Like many other users, I get ?? instead of the correct chars.
At the top of my PHP files I have:
header('Content-Type: application/json; charset=utf-8');
then
mysql_query("SET CHARACTER SET utf8", $link);
mysql_query("SET NAMES 'utf8'", $link);
then
$fab[] = array_map('utf8_encode', $array);
Here, if I print_r($fab), I lose the Asian chars :-(
Then when I do:
$json_string = json_encode($fab); // original
What I get is "??".
What is the correct way to get the right chars back? The JSON string is then passed to an iPhone client.
Any suggestion or help would be much appreciated.
Thank you anyway,
Fabrizio
Seems like you're double encoding it? If the data you get from MySQL is already UTF-8 encoded, what's the point of $fab[] = array_map('utf8_encode', $array); then?
I had a similar thing two days ago, when I was accepting UTF-8 data from an ExtJS form and it got messed up. It was because I used utf8_encode on the data I received from the script (which was already in UTF-8), so I broke it by double encoding. Maybe it's the same in your case.
The problem was what Tseng said: double encoding on the array. I thought I had done the right test, but I simply hadn't.
So the only code I need is:
while($obj = mysql_fetch_object($rs)) {
$arr[] = $obj;
}
$json_string = json_encode($arr);
echo ($json_string);
Again Tseng, thanx.
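The double-encoding effect can be reproduced in a few lines. This is a sketch, not the asker's code: it uses mb_convert_encoding from ISO-8859-1 to UTF-8, which performs the same conversion as utf8_encode (itself deprecated since PHP 8.2), and assumes the mbstring extension is loaded. The sample character is "ø" from the first question on this page.

```php
<?php
// "ø" that is already valid UTF-8: two bytes, c3 b8.
$original = "\u{00f8}";

// Encoding it "to UTF-8" again treats each byte as a separate Latin-1 character.
$double = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

echo bin2hex($original), "\n";     // c3b8
echo bin2hex($double), "\n";       // c383c2b8

// json_encode then faithfully escapes the two wrong characters.
$json = json_encode($double);
echo $json, "\n";                  // "\u00c3\u00b8"

// Undoing the extra step once restores the original bytes.
$repaired = mb_convert_encoding($double, 'ISO-8859-1', 'UTF-8');
var_dump($repaired === $original); // bool(true)
```

The "\u00c3\u00b8" output is the same pattern reported in the first question after utf8_encode was applied to data that was already UTF-8.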
