I want to save Chinese characters in a MySQL database. The charset is set to UTF8 when connecting to the database, and the field's charset is utf8 with collation utf8_general_ci.
But instead of the word, it shows squares. I use SQLyog.
One more thing: if I run a query and echo the word in the browser, it shows the correct Chinese word.
So I am wondering why it shows the correct word in the browser while in the database it looks like squares, and vice versa.
I am afraid that exporting or importing in the future could cause some data loss.
Thanks
Your data might be stored correctly in the DB, but read wrongly by sqlyog.
I haven't used SQLyog, but this problem might be caused by the way SQLyog connects to MySQL. Look for parameters in the SQLyog connection settings that are related to the character set and make sure they are also utf8.
I had a similar problem when I had to insert Latin characters into the database. I used mb_convert_encoding($str, 'utf8', 'HTML-ENTITIES') and it got stored correctly in the database. When I had to show it in the HTML page, I just set the page encoding to utf-8.
According to the official MySQL manual the collation used defines the order of records when sorting alphabetically:
http://dev.mysql.com/doc/refman/5.0/en/charset-general.html
However: I have a PHP script (UTF-8), and when I save some foreign characters in my MySQL database, they are saved all weird (first row). This is when the collation I choose is latin1_swedish_ci. When I change the collation to utf8_unicode_ci, all is good (second row).
When saving this data everything is exactly the same except for the collation.
So how about that "collation is used solely for sorting records"?
Can someone clarify this for me? :-) Thanks in advance!
It appears that the charset of your connection is not set correctly, so the conversion from the programming language's charset to the database's is not correct.
You should set the charset in your connection; then both will work fine.
As pointed out in the comments, here is a little explanation of how things work.
When you have not set the character set in your connection, the server assumes it to be the same as the collation of the database. When data is received in another encoding, the data is written nevertheless, just with wrong or different characters than it had in the encoding used by the script.
As long as nothing changes, the script gets back the same data it has written, and everything appears to be fine.
However, when either the connection encoding or the database encoding is changed at this point, the already stored data gets converted to the new encoding. The problem here is that the source data is not in the encoding that is assumed when converting.
The encodings involved here share the ASCII set with the same bits; that's why ASCII characters don't get messed up. Only special characters do.
So you have to set your connection encoding in order to avoid producing the mess that you are already in.
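The mechanism described above can be reproduced without a database. What follows is only a minimal sketch in Python (used here instead of PHP so the example stays self-contained), assuming the script sends UTF-8 bytes, the server treats the connection as latin1, and the column is utf8. The function name `store_with_wrong_connection` is hypothetical, and Python's "latin-1" codec merely stands in for MySQL's latin1.

```python
def store_with_wrong_connection(text: str) -> bytes:
    """Simulate storing UTF-8 text over a connection the server believes is latin1.

    Assumption: the column is utf8, so the server converts the supposed
    latin1 input to UTF-8 before writing it.
    """
    sent = text.encode("utf-8")        # bytes the script actually sends
    misread = sent.decode("latin-1")   # server reads every byte as one latin1 character
    return misread.encode("utf-8")     # server converts that to the column charset


# ASCII has the same bytes in both encodings, so it survives untouched.
ascii_stored = store_with_wrong_connection("abc")
print(ascii_stored == b"abc")

# A non-ASCII character is stored as two unrelated characters.
special_stored = store_with_wrong_connection("é")
print(special_stored.decode("utf-8"))
```

The second value is what a tool with a correctly configured connection would display, which is why the data looks broken there while the original script still sees its own text.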
Now, what can you do about the data you already have?
You can make a dump of your database using mysqldump with the --skip-set-charset option. Then you get a plain-text file. In this plain-text file, replace all occurrences of the actual database charset with the one the data is really in (the one you had in your script when you wrote the data).
Then save the file and make sure your editor does not do any conversion (I recommend vim).
Then import that file and you will get a database with data in the correct encoding. After that you can change the encoding however you like, and as long as your connection charset is also set, you will be fine from now on.
Also make sure that the MySQL server has the charsets installed, but it should have them already.
This is only my approach; I have cleaned up a lot of messed-up installations like that. Most of them at some point had garbled characters in their projects (after switching servers, updating, or restoring a backup...).
It turns out that setting the connection charset is something that is very often forgotten.
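The "replace the declared charset in the dump" step could also be scripted instead of done by hand in an editor. The following is a rough sketch in Python, not a tested migration tool. It assumes the declarations in the dump look like `CHARSET=latin1` or `CHARACTER SET latin1`, which may not cover every dump, and the function name `relabel_dump_charset` is hypothetical. It works on raw bytes so that, like the editor advice above, no conversion of the data itself takes place.

```python
import re


def relabel_dump_charset(dump: bytes, declared: str, actual: str) -> bytes:
    """Rewrite charset declarations in a dump without touching the data bytes.

    Assumption: only "CHARSET=<name>" and "CHARACTER SET <name>" are handled;
    COLLATE clauses would need separate treatment.
    """
    pattern = re.compile(
        rb"(CHARSET=|CHARACTER SET )" + re.escape(declared.encode("ascii")) + rb"\b"
    )
    # Keep the matched keyword, swap only the charset name.
    return pattern.sub(lambda m: m.group(1) + actual.encode("ascii"), dump)


# Tiny made-up dump fragment: the table claims latin1, the data bytes are UTF-8.
sample = (
    b"CREATE TABLE t (name varchar(20)) DEFAULT CHARSET=latin1;\n"
    b"INSERT INTO t VALUES ('\xc3\xa9');\n"
)
fixed = relabel_dump_charset(sample, "latin1", "utf8")
print(fixed.decode("utf-8"))
```

For a real dump file, reading and writing it in binary mode (`open(path, "rb")` / `open(path, "wb")`) would keep the same no-conversion guarantee.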
I recently had a problem importing a latin1_swedish database into a new one. Someone had made a Latin1 database to store Latin2 characters. It was all working until I made a database dump and wanted to import it into another database.
It's really complicated. In the end I corrected the SQL dump into a proper ISO-8859-2 encoded file with all characters displaying correctly. Still, the import into tables with Latin2 encoding didn't work; all special characters were lost (maybe it's a phpMyAdmin bug?).
Converting the file to UTF-8 encoding and changing the table encoding to utf8_general_ci imported everything correctly.
Next, the whole PHP site uses and displays ISO-8859-2 characters (it's an old phpBB forum).
While connecting to the database I use the "SET NAMES latin2" command to change the encoding.
To my surprise, the page displays as proper ISO-8859-2.
If the table is UTF-8 and SET NAMES is latin2, does the MySQL connection convert characters into ISO-8859-2 before returning them?
(I didn't know if I should write it all or not. Edit it if I put in too much unneeded info.)
SET NAMES effectively sets how the data is translated before being stored or after being recalled, prior to presenting it to the client. For storage, the character set definition of the column is the ultimate determining factor (if it differs from the table and database character set definitions). See this informative blog post about encoding in MySQL.
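To illustrate the conversion on the way out, here is a small model in Python of a utf8 column being read over a latin2 connection. It is only a sketch of the behaviour described above, not MySQL's actual code path; the function name `fetch_over_connection` and the sample word are made up for the example.

```python
def fetch_over_connection(stored: bytes, column_charset: str, connection_charset: str) -> bytes:
    """Model the server converting column data to the connection charset before returning it."""
    return stored.decode(column_charset).encode(connection_charset)


# The column holds UTF-8 bytes for a Polish word (hypothetical sample data).
stored = "żółw".encode("utf-8")

# After "SET NAMES latin2" the client receives ISO-8859-2 bytes instead.
as_latin2 = fetch_over_connection(stored, "utf-8", "iso-8859-2")

print(len(stored), len(as_latin2))
print(as_latin2.decode("iso-8859-2"))
```

Under this model an ISO-8859-2 page displays the text correctly even though the table itself is UTF-8, which matches what the question observed.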
I have a MySQL database with all tables collated as 'utf8_unicode_ci'.
Also, all data I wrote to the database with PHP was encoded in utf8.
But I forgot to set the MySQL connection encoding to utf8, so it probably defaulted to ISO-8859.
For a long time this was not a problem. Although special characters were displayed wrong in tools like phpMyAdmin, the data was correct when loading it into my PHP application, as long as I kept using the wrong connection encoding.
But now I need to use my database from another application that (correctly) does not use ISO-8859 as the connection encoding and therefore gets broken special characters.
Now I want to convert my database so I can use the right connection encoding.
I already tried this:
mysql wrong connection encoding
But it does not help me. The closest I got to a solution was 'utf8_decode(utf8_decode($data))'.
But this breaks fields that start with a special character.
Additional Information:
So what might happen is the following:
My application sends some utf8 Data to the database.
MySQL gets the data but thinks (due to the connection encoding) that it is not utf8, and converts it to fit the 'utf8_unicode_ci' collation.
When my PHP application reads the data from the database, MySQL seems to undo the previous conversion, so everything looks fine again from my PHP app.
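That round trip can be modelled in a few lines. The sketch below is in Python and uses its "latin-1" codec as a stand-in for the wrong connection encoding. This is an approximation: MySQL's latin1 is closer to Windows-1252, so bytes in the 0x80-0x9F range would map differently than shown here. The function names are hypothetical.

```python
def write_via_wrong_connection(utf8_bytes: bytes) -> bytes:
    """Server treats incoming UTF-8 bytes as latin1 and converts them to the utf8 column."""
    return utf8_bytes.decode("latin-1").encode("utf-8")


def read_via_connection(stored: bytes, connection_charset: str) -> bytes:
    """Server converts the utf8 column data to whatever the connection claims to want."""
    return stored.decode("utf-8").encode(connection_charset)


original = "Zürich".encode("utf-8")
stored = write_via_wrong_connection(original)

# Same wrong connection on read: the second conversion undoes the first.
old_app_sees = read_via_connection(stored, "latin-1")
print(old_app_sees == original)

# Correct connection on read: the double-encoded text becomes visible.
new_app_sees = read_via_connection(stored, "utf-8").decode("utf-8")
print(new_app_sees)
```

In this model, repairing a stored value means applying the "old app" read once and writing the result back over a correctly configured connection.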
We imported a website from another server to our server. The code and database are 100% the same.
But the text on the website seems to have the wrong encoding.
Example:
In the database the word "Australië" is "AustraliĂŤ", while on the website it is shown as Australi??.
I can fix the ?? by adding mysql_set_charset("utf8",$this->db); after the database connection.
But then it is shown as in the database, as "AustraliĂŤ", which is incorrect. I tried different encodings in Apache, after the database connection, and in meta tags.
The easiest way would be to change the data in the database, but there is too much data in it to do this.
Does anyone have a solution for this problem? I have been searching and trying a lot of things for hours.
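One way to narrow this down is to check which single-byte charset turns the UTF-8 bytes of "Australië" into exactly the garbled string from the question. The following Python sketch tries a few candidates; the candidate list is an assumption, and a match only hints at which charset was involved somewhere in the chain, it does not prove it.

```python
def misread(text: str, actual: str, assumed: str) -> str:
    """Encode text in its real charset, then decode the bytes as if they were another one."""
    return text.encode(actual).decode(assumed, errors="replace")


correct = "Australië"
garbled = "AustraliĂŤ"  # as it appears in the database, copied from the question

# Assumed shortlist of charsets that are commonly confused with UTF-8.
candidates = ["latin-1", "cp1252", "iso-8859-2", "cp1250"]

matches = [c for c in candidates if misread(correct, "utf-8", c) == garbled]
print(matches)
```

If the matching charset is a Latin2 variant rather than latin1, that suggests looking at the old server's default or connection character set before changing anything in the data.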
You could try to:
set the MySQL connection collation to utf8_general_ci in the database
run SET NAMES 'utf8' and SET COLLATION_CONNECTION=utf8_unicode_ci in your PHP files
make sure all your PHP files are saved with UTF-8 encoding and do not feature a BOM
make sure the cells in your table are utf8_general_ci
make sure that MySQL charset is UTF-8 Unicode (utf8)
This is what I have. With this setup I see all characters in the database (phpMyAdmin) as they really appear on the website itself.
I encountered a similar issue when I had a mismatch of encodings, i.e. I was saving data to a UTF-8 database from an ISO-8859-1 encoded site...
Hope this helps you.
I have been using PHP + MySQL (phpMyAdmin) to build websites with Chinese content (utf-8) for a long time.
When submitting forms, and also when generating PHP output from the database, the Chinese words display well. But when I look at the database, sometimes they are normal Chinese characters and sometimes they are not (they become strange strings). That made me notice that the way MySQL handles and stores input data is not always utf-8.
Some experts on the web mentioned that MySQL used to record input data as latin1; nevertheless, I note that the existing charset in phpMyAdmin is utf-8...
Is there any solid way to detect the encoding of the Chinese characters that appear in a phpMyAdmin table cell?
Also, apart from declaring it in the header of the page, is there any method to make sure the data entered into the database is utf-8 and not something else?
Thank you.
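As a rough aid for the detection question, the raw bytes of a cell (for example as obtained with MySQL's HEX() function) can be checked outside the database. The Python sketch below shows a strict UTF-8 validity check plus a heuristic for the "UTF-8 stored through a latin1 connection" case. Both function names are hypothetical, and the second check is only a heuristic that can misfire on legitimate text.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes form well-formed UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def looks_double_encoded(data: bytes) -> bool:
    """Heuristic: undoing one latin1 layer yields valid UTF-8 with non-ASCII bytes."""
    try:
        inner = data.decode("utf-8").encode("latin-1")
    except (UnicodeDecodeError, UnicodeEncodeError):
        return False
    return is_valid_utf8(inner) and any(b >= 0x80 for b in inner)


clean = "中文".encode("utf-8")
double = clean.decode("latin-1").encode("utf-8")  # simulated wrong-connection write
gbk = "中文".encode("gbk")                         # a different encoding entirely

for label, data in [("clean", clean), ("double", double), ("gbk", gbk)]:
    print(label, is_valid_utf8(data), looks_double_encoded(data))
```

A value that is valid UTF-8 and not flagged as double encoded is the state you want; anything else points at the connection charset used when the row was written.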
The biggest problem that people encounter in this regard is that they don't tell MySQL that they're sending/expecting UTF-8 encoded data when connecting to the database, so MySQL thinks it's supposed to handle latin1 encoded data and converts it accordingly. Issue the command SET NAMES utf8 after connecting to the db or use mysql_set_charset.
In my case, it was just because of htmlentities(). The solution is to change echo htmlentities($email_db); to echo htmlentities($email_db, ENT_COMPAT, 'UTF-8');