I had an issue with json_encode. I was getting:
PHP Warning: json_encode(): Invalid UTF-8 sequence in argument in /var/www/html/web/example.php on line 500
Then I set magic_quotes_gpc = 0 in php.ini (it was 1 before) and the json_encode error stopped appearing.
Now I have started getting the same error again, even though magic_quotes_gpc is still 0 in php.ini. I am using PHP 5.3.
I found many answers that say to convert the input to UTF-8. But I cannot do that, because I use json_encode in many places and changing all of them is not possible.
I would like to fix the root issue so that I don't need to change the json_encode code.
In MySQL, the result for
SHOW VARIABLES LIKE 'character_set%';
is
character_set_client latin1
character_set_connection latin1
character_set_database latin1
character_set_filesystem binary
character_set_results
character_set_server latin1
character_set_system utf8
character_sets_dir /usr/share/mysql/charsets/
What is the reason for the json_encode error?
I am using Zend Server, and I see this json_encode error in the Zend Server logs.
Another thing I noticed: even though I see the error in the server logs, the array is converted to JSON properly.
There is no error in converting the array to JSON, so why do I see the error in Zend Server?
json_encode() needs valid UTF-8 data as input.
You are feeding it invalid data.
From what you describe, you probably are getting latin1 data from your database connection, which will cause json_encode() to choke.
Set the database's connection in your script to UTF-8. How to do that depends on the database library you are using.
Here is a list of ways to switch to UTF-8 in the most common libraries:
UTF-8 all the way through
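For example, with the extensions common on PHP 5.3, switching the connection to UTF-8 looks roughly like this (host and credentials are placeholders; a sketch, not a drop-in fix):

```php
// Old mysql extension: set the connection charset right after connecting
$link = mysql_connect('localhost', 'user', 'pass');
mysql_set_charset('utf8', $link);

// mysqli equivalent
$mysqli = new mysqli('localhost', 'user', 'pass', 'mydb');
$mysqli->set_charset('utf8');

// PDO: declare the charset in the DSN (honored since PHP 5.3.6)
$pdo = new PDO('mysql:host=localhost;dbname=mydb;charset=utf8', 'user', 'pass');
```

Whichever library you use, the call has to run once per connection, before any query whose results you feed to json_encode().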
Try adding the instruction below right after the connection is established:
mysql_set_charset('utf8');
And to strip any remaining invalid sequences from a result value (converting from UTF-8 to UTF-8 drops invalid bytes):
mb_convert_encoding($result, 'UTF-8', 'UTF-8');
Firstly, I don't have the option to use the latest SQLSRV drivers on my host, so I am stuck with an ODBC connection.
$connection_string = 'DRIVER={SQL Server};SERVER=111.111.111.111;DATABASE=MY_DATABASE';
$user = 'name';
$pass = 'pass';
$connection = odbc_connect( $connection_string, $user, $pass, SQL_CUR_USE_ODBC );
The collation of that database is Slovak_CI_AI. If I set my PHP header to utf-8, the output data looks messed up; the encoding is wrong.
If I put 'Slovak_CI_AI' as the charset in my PHP header, the data displays fine, but that is probably a no-go, because I need to work with the data in WordPress, which fails to process it if it contains special/non-English characters (those strings look broken to WP).
I've tried many conversions with mb_convert_encoding, iconv or utf8_decode, but no luck. WordPress uses utf-8.
I can't find any solution for this.
Update: I've tried adding CHARSET=UTF8 to my odbc connection string, but no luck. Also I found out the character set for texts in the database is cp1250. I've tried setting cp1250 as a charset to my PHP header, output is fine but WordPress still fails once it encounters a special character. I've tried converting those strings from cp1250 to utf-8 with iconv, but no luck as well - strings have wrong encoding on output and WordPress fails as well.
This whole encoding thing still feels chaotic to me, but I somehow managed to do it. It works when:
odbc connection string contains charset=cp1250
PHP header character set is set to utf-8
I convert all problematic strings from cp1250 to utf-8 with iconv
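Put together, that working combination looks roughly like this (server address and credentials are the question's placeholders; the table and column names below are hypothetical):

```php
// 1. Ask the ODBC driver for the database's actual code page (cp1250)
$connection_string = 'DRIVER={SQL Server};SERVER=111.111.111.111;DATABASE=MY_DATABASE;CHARSET=cp1250';
$connection = odbc_connect($connection_string, $user, $pass, SQL_CUR_USE_ODBC);

// 2. Tell the browser the page will be UTF-8
header('Content-Type: text/html; charset=utf-8');

// 3. Transcode each fetched string from cp1250 to UTF-8 before use
$result = odbc_exec($connection, 'SELECT name FROM my_table');
while ($row = odbc_fetch_array($result)) {
    $name = iconv('CP1250', 'UTF-8', $row['name']);
    // $name is now safe to hand to WordPress
}
```

The key point is that the conversion to UTF-8 happens once, in PHP, at the boundary between the database and everything else.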
When I moved from php mysql shared hosting to my own VPS I've found that code which outputs user names in UTF8 from mysql database outputs ?�??????� instead of 鬼神❗. My page has utf-8 encoding, and I have default_charset = "UTF-8" in php.ini, and header('Content-Type: text/html; charset=utf-8'); in my php file, as well as <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> in html part of it.
My database has collation utf8_bin, and the table has the same. On both the previous and current hosting, in phpmyadmin I see 鬼神❗ for this database record. When I create an ANSI text file in Notepad++, paste 鬼神❗ into it and select Encoding->Encode in UTF-8 from the menu, I see 鬼神❗, so I suppose it is a correctly encoded UTF-8 string.
Ok, and then I added
init_connect='SET collation_connection = utf8_general_bin'
init_connect='SET NAMES utf8'
character-set-server=utf8
collation-server=utf8_general_bin
skip-character-set-client-handshake
to my.cnf, and now my page shows 鬼神❗ instead of ?�??????�. This is the same output I get in phpmyadmin on both hostings, so I'm on the right track. And still, somehow, on my old hosting the same php script returns a utf-8 web page with the name 鬼神❗, while on the new hosting it shows 鬼神❗. It looks like the string is twice utf-8 encoded: I get a utf-8 string, I give it as an ansi string to Notepad++, and it encodes it into the correct utf-8 string.
However, when I try utf8_encode() I get й¬ÑзÒÑвÑâ, and utf8_decode() returns ?�???????. mb_convert_encoding($name,"UTF-8","ISO-8859-1"); and iconv( "ISO-8859-1","UTF-8", $name); return the same results.
So how could I reproduce the same conversion Notepad++ does?
See answer below.
The solution was simple, yet not obvious to me, as I never saw my.cnf on that shared hosting: it seems that server had settings as follows
init_connect='SET collation_connection = cp1252'
init_connect='SET NAMES cp1252'
character-set-server=cp1252
So, to do no harm to other code on my new server, I have to place mysql_query("SET NAMES CP1252"); at the top of each php script that works with utf8 strings.
The trick here is that the script gets the string as-is (ansi) and outputs it, and since the browser is told the page is in utf-8 encoding, it just renders my strings as utf-8.
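In other words, each affected script starts like this (a sketch with placeholder credentials, using the old mysql extension as in the question):

```php
// Match the old server's connection charset so MySQL hands the stored
// bytes through untranslated instead of transcoding them:
$link = mysql_connect('localhost', 'user', 'pass');
mysql_query("SET NAMES CP1252", $link);

// The stored bytes are really UTF-8; declaring utf-8 to the browser
// makes it decode the pass-through bytes correctly:
header('Content-Type: text/html; charset=utf-8');
```

This works because the data was stored as raw UTF-8 bytes over a cp1252 connection; a long-term fix would be to re-encode the data and use utf8 end to end.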
I am facing a strange issue when extracting data from a MySQL database and inserting it into a CSV file. In the database, the field value is the following:
K Secure Connection 1 año 1 PC
When I echo it before writing it to the CSV file, I get the same as the above in my terminal.
I use the following code to write content to the CSV file:
fwrite($this->fileHandle, utf8_encode($lineContent . PHP_EOL));
Yet, when I open the CSV with LibreOffice Calc (and specify UTF-8 as the encoding format), the following is displayed:
K Secure Connection 1 año 1 PC
I have no idea why this happens. Can someone explain how to solve this?
REM:
SELECT @@character_set_database;
returns
latin1
REM 2:
`var_dump($lineContent, bin2hex($lineContent))`
gives
string(39) "Kaspersky Secure Connection 1 año 1 PC"
string(78) "4b6173706572736b792053656375726520436f6e6e656374696f6e20312061c3b16f2031205043"
The var_dump shows that the string is already encoded in UTF-8. Using utf8_encode on it will garble it (the function attempts a conversion from Latin-1 to UTF-8). You're therefore actually writing "año" encoded in UTF-8 into your file, which is then "correctly" picked up by LibreOffice.
Simply don't utf8_encode.
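If the same code path must handle both Latin-1 and already-UTF-8 input, a guard with mb_check_encoding avoids the double encoding; to_utf8 here is a hypothetical helper name, not a built-in:

```php
// Transcode only when the string is NOT already valid UTF-8;
// utf8_encode() on UTF-8 input is what turns "año" into "año".
function to_utf8($s) {
    if (mb_check_encoding($s, 'UTF-8')) {
        return $s; // already UTF-8, leave it untouched
    }
    return utf8_encode($s); // otherwise assume Latin-1
}
```

With that in place, fwrite($this->fileHandle, to_utf8($lineContent) . PHP_EOL); writes correct UTF-8 regardless of which encoding the line arrived in.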
I would try to open the CSV file with another editor, just to make sure the problem is not with the office suite...
You may be double-encoding the content if it is already in UTF-8 format.
I also prefer to always work with UTF-8, so I get the data from the database already in UTF-8 and no further conversion is needed. For that, I run this query right after opening the SQL connection:
"set names 'utf8'"
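With mysqli, for instance, that looks like this (placeholder credentials; note that set_charset() is generally preferable to a raw SET NAMES, because it also informs the client library used for escaping):

```php
$mysqli = new mysqli('localhost', 'user', 'pass', 'mydb');

// Either run the query from the answer...
$mysqli->query("set names 'utf8'");

// ...or, better, use the dedicated call so mysqli_real_escape_string()
// also knows the connection charset:
$mysqli->set_charset('utf8');
```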
I am importing data from an XML file, and it appears it uses latin1_swedish_ci, which is causing me lots of issues with php and mysql (PDO).
I'm getting a lot of these errors:
General error: 1267 Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='
I am wondering how I can convert them to proper UTF-8 to store in my database.
I've tried doing this:
$game['human_name'] = iconv(mb_detect_encoding($game['human_name'], mb_detect_order(), true), "UTF-8", $game['human_name']);
Which I found here: PHP: Convert any string to UTF-8 without knowing the original character set, or at least try
I still seem to get the same error, though.
Don't use any conversion functions -- get utf8 specified throughout the processing.
Make sure the data is encoded utf8.
When using PDO (for MySQL, the charset belongs in the DSN; this requires PHP 5.3.6 or later):
$db = new PDO('mysql:host=host;dbname=db;charset=utf8', $user, $pwd);
Table definitions (or at least columns):
CHARACTER SET utf8
Web page output:
<meta ... charset=UTF-8>
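Those three pieces together, as a sketch (the table name games is a placeholder based on your $game array, and the DSN charset option requires PHP 5.3.6+):

```php
// Connection: declare utf8 in the DSN so PDO talks utf8 to MySQL
$db = new PDO('mysql:host=host;dbname=db;charset=utf8', $user, $pwd);

// Table/columns: convert the stored data once, which also removes the
// "Illegal mix of collations" mismatch in comparisons
$db->exec('ALTER TABLE games CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci');

// Output: declare the encoding to the browser as well
header('Content-Type: text/html; charset=UTF-8');
```

Once every layer agrees on utf8, the iconv/mb_detect_encoding guesswork becomes unnecessary.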
this is really doing my nut.....
all relevant PHP Output scripts set headers (in this case only one file - the main php script):
header("Content-type: text/html; charset=utf-8");
HTML meta is set in head:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
all Mysql tables and related columns set to:
utf8_unicode_ci Unicode (multilingual), case-insensitive
I have been writing a class to do some translation. When the class writes to a file using fopen, fputs, etc., everything works great: the correct chars appear in my output files (which are written as php arrays and saved to the filesystem as .php or .htm files). eval() brings back the .htm files correctly, as does just including the .php files when I want to use them. All good.
Prob is when I am trying to create translation entries to my DB. My DB connection class has the following line added directly after the initial connection:
mysql_query("SET NAMES utf8, character_set_results = 'utf8', character_set_client = 'utf8', character_set_connection = 'utf8', character_set_database = 'utf8', character_set_server = 'utf8'");
Instead of seeing the correct chars, I get the usual crud you would expect from using the wrong charset in the DB. E.g.:
Propriétés
instead of:
propriétés
don't even get me started on Russian, Japanese, etc chars! But then using UTF8 should not make any single language charset an issue...
What have I missed? I know it's not the PHP, as the site shows the correct chars from the included translation .php or .htm files; it's only when I am dealing with the MySQL DB that I have these issues. PHPMyAdmin shows the entries with the wrong chars, so I assume it happens when the PHP "writes" to MySQL. I have checked similar questions here on Stack, but none of the answers (all of which were accepted) give me any clues...
Also, anyone have thoughts on speed difference using include $filename vs eval(file_get_contents($filename)).
You say that you are seeing "the usual crud you would expect using the wrong charset". But that crud is in fact created by using utf8_encode() on an already-UTF8 string, so chances are that you are not using the "wrong encoding" anywhere, but encoding into UTF8 more times than you should.
You may take a look into a library I made to fix that kind of problems:
https://stackoverflow.com/a/3521340/290221
Here is all you need to make sure you have a good display of those chars :
/* HTTP charset */
header("Content-Type:text/html; charset=UTF-8");
/* Set MySQL communication encoding */
mysql_set_charset("UTF8");
You also need to set the DB encoding to the correct one, as well as each table's encoding AND each field's encoding.
Last but not least, your php file's encoding should also match.
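For that last point, here is a quick way to verify a source file really is saved as UTF-8 (file_is_utf8 is a hypothetical helper, not part of PHP):

```php
// Returns true when the file's bytes form valid UTF-8; a file saved as
// Latin-1/cp1252 containing accented chars will return false.
function file_is_utf8($path) {
    return mb_check_encoding(file_get_contents($path), 'UTF-8');
}
```

Run it as file_is_utf8(__FILE__) in any script whose hard-coded strings come out garbled.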
There is mysql_set_charset('utf8'); in the mysql extension for that. Run it right after connecting, before any other query.