imported database dump from latin1 db to utf8 database - php

I did a mysqldump of a database on MySQL v4.0.21 and imported it onto a new server running MySQL v5.0.45, using iconv to convert the dump from latin1 to utf8.
It was latin1 on the old server and it's utf8 on the new server, so I ran this on the dump: iconv -f latin1 -t UTF-8 quickwebcms_2010-03-01.sql
It ran successfully, then I imported it onto the new server.
Now it displays question (?) marks (example: College?s) and  (example: College’s) when it prints out some of the data in my PHP application.
I exported the table these characters show up in and did a find and replace all within TextMate, then imported it back into the new database, and it uploads some of the fields as null, so the find and replace may have messed something up in the process. I saved the table CSV as UTF-8 without BOM, and as plain UTF-8, and it still does the same thing.
Any help as to why this might be happening is appreciated.

If the content of your tables is all OK (and in UTF-8) and you still have "bad" characters in your web application, make sure your MySQL connection is using the UTF-8 charset in your PHP script. Even if your databases and tables are in UTF-8, MySQL uses latin1 connections by default (at least in my shared server config). So you have to tell MySQL to send content in UTF-8. Otherwise it will convert it on the fly to latin1, producing "bad" characters in UTF-8 web pages.
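To make the on-the-fly conversion concrete, here is a minimal sketch in plain PHP (no database needed). It assumes PHP 7+ with mbstring, and it uses ISO-8859-1 as a stand-in for MySQL's latin1; MySQL's latin1 is closer to Windows-1252, so the exact bytes can differ for characters such as the curly apostrophe. The function name toLatin1 is made up for this illustration.

```php
<?php
// Sketch: what a UTF-8 -> latin1 conversion does to text.
// Assumption: ISO-8859-1 stands in for MySQL's latin1.

// Make the replacement character explicit so the result does not depend on php.ini.
mb_substitute_character(0x3F); // "?"

function toLatin1(string $utf8): string
{
    // Characters that do not exist in the target charset are replaced by "?".
    return mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
}

$curly  = "College\u{2019}s"; // U+2019 RIGHT SINGLE QUOTATION MARK, not in ISO-8859-1
$accent = "caf\u{00E9}";      // U+00E9 exists in ISO-8859-1 as the single byte 0xE9

echo toLatin1($curly), "\n";  // College?s

// The converted "café" is valid latin1 but no longer valid UTF-8,
// so a UTF-8 page shows a replacement symbol for its last byte.
var_dump(mb_check_encoding(toLatin1($accent), 'UTF-8')); // bool(false)
```

Both symptoms from the question show up here: a literal question mark where the character could not be converted, and an invalid byte where it could.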
Use mysql_set_charset if available otherwise you can set it with a SQL query (always use mysql_set_charset if available):
if (function_exists('mysql_set_charset')) {
    mysql_set_charset('utf8', $conn);
} else {
    if (mysql_query("SET character_set_results = 'utf8', character_set_client = 'utf8', character_set_connection = 'utf8', character_set_database = 'utf8', character_set_server = 'utf8'", $conn) === false) {
        // Error! Do something...
    }
}
Also make sure your (X)HTML markup uses UTF-8 too:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

IIRC, mysqldump produces UTF-8 output by default, no matter what the database's encoding is. This user comment in the MySQL manual seems to confirm it:
I am just using default character sets - normally latin1. However, the dump produced by mysqldump is, perhaps surprisingly, in utf8. This seems fine, but leads to trouble with the --skip-opt option to mysqldump, which turns off --set-charset but leaves the dump in utf8.
Perhaps the fact that mysqldump uses utf8 by default, and the importance of the --set-charset option should be more prominently documented (see the documentation for the --default-character-set attribute for the current mention of the use of utf8)
Try skipping the iconv step, might work straight away.

You may be better off loading the data onto the new server as latin1, then running ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci on each table (or use a script of some sort to do it for you).
Or you could convert first, then dump.
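The "script of some sort" could be as small as the following sketch, which only builds the statements and prints them so you can review them before running anything. The function name buildConvertStatements and the table names are hypothetical. It also assumes the tables really hold latin1 data; if UTF-8 bytes were stored in latin1 columns, CONVERT TO would encode them a second time.

```php
<?php
// Hypothetical helper: build one ALTER TABLE ... CONVERT TO statement per table.
function buildConvertStatements(array $tables, string $charset = 'utf8', string $collation = 'utf8_unicode_ci'): array
{
    $statements = [];
    foreach ($tables as $table) {
        // Quote the identifier with backticks, doubling any backtick inside the name.
        $quoted = '`' . str_replace('`', '``', $table) . '`';
        $statements[] = "ALTER TABLE $quoted CONVERT TO CHARACTER SET $charset COLLATE $collation;";
    }
    return $statements;
}

// Placeholder table names; in practice the list would come from SHOW TABLES.
foreach (buildConvertStatements(['pages', 'users']) as $sql) {
    echo $sql, "\n";
}
```

Take a backup before running the generated statements.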

Related

How to solve Mysql Database collation? [duplicate]

I have my database properly set to UTF-8 and am dealing with a database containing Japanese characters. If I do SELECT *... from the mysql command line, I properly see the Japanese characters. When pulling data out of the database and displaying it on a webpage, I see it properly.
However, when viewing the table data in phpMyAdmin, I just see garbage text. ie.
ç§ã¯æ—¥æœ¬æ–™ç†ãŒå¥½ãã§ã™ã€‚日本料ç†ã‚...
How can I get phpMyAdmin to display the characters in Japanese?
The character encoding on the HTML page is set to UTF-8.
Edit:
I have tried an export of my database and opened up the .sql file in geany. The characters are still garbled even though the encoding is set to UTF-8. (However, doing a mysqldump of the database also shows garbled characters).
The character set is set correctly for the database and all tables ('latin' is not found anywhere in the file)
CREATE DATABASE `japanese` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
I have added the lines to my.cnf and restarted mysql but there is no change. I am using Zend Framework to insert data into the database.
I am going to open a bounty for this question as I really want to figure this out.
Unfortunately for you, phpMyAdmin is one of the first PHP applications that talks to MySQL about charsets correctly. Your problem is most likely due to the fact that the database does not hold correct UTF-8 strings in the first place.
For phpMyAdmin to display the characters correctly, the data must be stored correctly in the database. However, converting the database to the correct charset often breaks web apps that are not aware of the charset-related features MySQL provides.
May I ask: is MySQL > version 4.1? What web app is the database for? phpBB? Was the database migrated from an older version of the web app, or an older version of MySQL?
My suggestion is not to bother if the web app you are using is too old and unsupported. Only convert the database to real UTF-8 if you are sure the web app can read it correctly.
Edit:
Your MySQL is > 4.1, which means it's charset-aware. What are the charset and collation settings for your database? I am pretty sure you are using latin1, which is MySQL's name for a Windows-1252-like single-byte charset, and that your UTF-8 text is being stored in it byte by byte.
For charset-insensitive clients (i.e. mysql-cli and php-mod-mysql), characters are displayed correctly because they are transferred to and from the database as raw bytes. phpMyAdmin reads those bytes and displays each one as a latin1 character; that's the garbage text you see.
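The mechanism can be reproduced without a database. The sketch below assumes PHP 7+ with mbstring and uses ISO-8859-1 as a stand-in for MySQL's latin1; the function names are made up for the illustration.

```php
<?php
// Sketch: UTF-8 bytes misread as a single-byte charset, and the reverse step.
// Assumption: ISO-8859-1 stands in for MySQL's latin1.

function misreadAsLatin1(string $utf8Bytes): string
{
    // Treat every byte as one latin1 character and re-encode it as UTF-8.
    return mb_convert_encoding($utf8Bytes, 'UTF-8', 'ISO-8859-1');
}

function undoMisread(string $garbled): string
{
    // Map each character back to the single byte it came from.
    return mb_convert_encoding($garbled, 'ISO-8859-1', 'UTF-8');
}

$text    = "\u{65E5}\u{672C}";      // two Japanese characters, 3 bytes each in UTF-8
$garbled = misreadAsLatin1($text);  // six unrelated characters

echo strlen($text), " bytes become ", mb_strlen($garbled, 'UTF-8'), " characters\n";
echo undoMisread($garbled) === $text ? "round trip ok\n" : "round trip failed\n";
```

Because the misreading is reversible byte for byte, nothing is lost as long as the data is read back the same way it was written, which is why the old clients still show correct text.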
Countless hours were spent on this years ago (2005?) in many parts of Asia, when MySQL 4.0 became obsolete. There is a standard way to deal with your problem and the garbled data:
Back up your database as .sql
Open it in a UTF-8 capable text editor and make sure the text looks correct.
Look for the charset/collation latin1_general_ci and replace latin1 with utf8.
Save as a new .sql file; do not overwrite your backup.
Import the new file. The data will now look correct in phpMyAdmin, and the Japanese in your web app will turn into question marks. That's normal.
For a PHP web app that relies on php-mod-mysql, insert mysql_query("SET NAMES UTF8"); after mysql_connect(). The question marks will then be gone.
Add the following configuration to my.ini for mysql-cli:
# CLIENT SECTION
[mysql]
default-character-set=utf8
# SERVER SECTION
[mysqld]
default-character-set=utf8
For more information about charset on MySQL, please refer to manual:
http://dev.mysql.com/doc/refman/5.0/en/charset-server.html
Note that I assume your web app uses php-mod-mysql to connect to the database (hence the mysql_connect() function), since php-mod-mysql is the only extension I can think of that still triggers the problem TO THIS DAY.
phpMyAdmin uses php-mod-mysqli to connect to MySQL. I never learned how to use it because I switched to frameworks* to develop my PHP projects. I strongly encourage you to do that too.
Many frameworks, e.g. CodeIgniter and Zend, use mysqli or PDO to connect to databases. The mod-mysql functions are considered obsolete because of performance and scalability issues. Also, you do not want to tie your project to a specific type of database.
If you're using PDO don't forget to initiate it with UTF8:
$con = new PDO('mysql:host=' . $server . ';dbname=' . $db . ';charset=UTF8', $user, $pass, array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"));
(just spent 5 hours to figure this out, hope it will save someone precious time...)
I did a little more googling and came across this page
The command doesn't seem to make sense but I tried it anyway:
In the file /usr/share/phpmyadmin/libraries/dbi/mysqli.dbi.lib.php at the end of function PMA_DBI_connect() just before the return statement I added:
mysqli_query($link, "SET SESSION CHARACTER_SET_RESULTS =latin1;");
mysqli_query($link, "SET SESSION CHARACTER_SET_CLIENT =latin1;");
And it works! I now see Japanese characters in phpMyAdmin. WTF? Why does this work?
I had the same problem,
Set all text/varchar collations in phpMyAdmin to utf-8 and in php files add this:
mysql_set_charset("utf8", $your_connection_name);
This solved it for me.
The solution can be as easy as this:
find the phpMyAdmin connection function/method
after the database is connected, add $db_conect->set_charset('utf8');
phpMyAdmin doesn't follow the MySQL connection settings because it defines its own collation in the phpMyAdmin config file.
So if we don't want to, or can't, access the server parameters, we should just force MySQL to send results in a different encoding that is compatible with the client, i.e. phpMyAdmin.
For example, if both the MySQL connection collation and the MySQL charset are utf8 but phpMyAdmin is ISO, we should just run this before any SELECT query sent to MySQL via phpMyAdmin:
SET SESSION CHARACTER_SET_RESULTS =latin1;
Here is how I restored the data without loss when going from latin1 to utf8:
/**
 * Fixes data that was inserted into latin1 tables over a utf8-encoded connection.
 *
 * DO NOT execute "SET NAMES UTF8" after mysql_connect.
 * Your encoding should be the same as when you first inserted the data.
 * In my case I inserted all my utf8 data into LATIN1 tables.
 * The data in the tables looked like ДЕТСКИÐ.
 * My page presented the data correctly without the "SET NAMES UTF8" query,
 * but phpMyAdmin did not.
 * So this is a hack to convert your data to the correct UTF8 format.
 * Execute this code just ONCE!
 * Don't forget to make a backup first!
 */
public function fixIncorrectUtf8DataInsertedByLatinEncoding() {
    // mysql_query("SET NAMES LATIN1") or die(mysql_error()); # uncomment this if you already set UTF8 names somewhere

    // get all tables in the database
    $tables = array();
    $query = mysql_query("SHOW TABLES");
    while ($t = mysql_fetch_row($query)) {
        $tables[] = $t[0];
    }
    // you need to list the tables explicitly if not all tables in your database use the latin1 charset
    // $tables = array('mytable1', 'mytable2', 'mytable3'); # uncomment this if you want to set explicit tables

    // duplicate the tables, and copy all data from the original tables to the new tables with the correct encoding
    // the hack is that the data is read back byte for byte over the latin1 connection and inserted again as utf8
    foreach ($tables as $table) {
        $temptable = $table . '_temp';
        mysql_query("CREATE TABLE `$temptable` LIKE `$table`") or die(mysql_error());
        mysql_query("ALTER TABLE `$temptable` CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci") or die(mysql_error());
        $query = mysql_query("SELECT * FROM `$table`") or die(mysql_error());
        mysql_query("SET NAMES UTF8") or die(mysql_error());
        while ($row = mysql_fetch_row($query)) {
            // escape every value, and keep NULLs as NULL instead of turning them into empty strings
            $values = array();
            foreach ($row as $value) {
                $values[] = ($value === null) ? 'NULL' : "'" . mysql_real_escape_string($value) . "'";
            }
            mysql_query("INSERT INTO `$temptable` VALUES(" . implode(', ', $values) . ")") or die(mysql_error());
        }
        mysql_query("SET NAMES LATIN1") or die(mysql_error());
    }

    // drop the old tables and rename the temporary tables
    // this should work; if it does not, comment out these lines
    // and rename the tables manually with phpMyAdmin
    foreach ($tables as $table) {
        $temptable = $table . '_temp';
        mysql_query("DROP TABLE `$table`") or die(mysql_error());
        mysql_query("ALTER TABLE `$temptable` RENAME `$table`") or die(mysql_error());
    }

    // now your data should be correct
    // change the database character set
    mysql_query("ALTER DATABASE DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci") or die(mysql_error());
    // now you can use "SET NAMES UTF8" in your project and MySQL will use the corrected data
}
Change latin1_swedish_ci to utf8_general_ci in phpmyadmin->table_name->field_name
First, from the client do
mysql> SHOW VARIABLES LIKE 'character_set%';
This will give you something like
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
where you can inspect the general settings for the client, connection, database
Then you should also inspect the columns from which you are retrieving data with
SHOW CREATE TABLE TableName
and inspect the charset and collation of the CHAR fields (people usually do not set them explicitly, but it is possible to give CHAR[(length)] [CHARACTER SET charset_name] [COLLATE collation_name] in CREATE TABLE or in ALTER TABLE foo ADD COLUMN bar CHAR ...)
I believe that I have listed all the relevant settings on the MySQL side.
If you are still lost, read the fine docs and perhaps this question, which might shed some light (especially on how I thought I had got it right by looking only at the mysql client on the first go).
1- Open file:
C:\wamp\bin\mysql\mysql5.5.24\my.ini
2- Look for [mysqld] entry and append:
character-set-server = utf8
skip-character-set-client-handshake
The whole view should look like:
[mysqld]
port=3306
character-set-server = utf8
skip-character-set-client-handshake
3- Restart MySQL service!
It's really simple to get multi-language text showing in phpMyAdmin. If you see garbage data in phpMyAdmin, click your database, go to the Operations tab, find the Collation section and set it to utf8_general_ci. After that the garbage data should show correctly. A simple and easy trick.
The function and file names don't match those in newer versions of phpMyAdmin. Here is how to fix in the newer PHPMyAdmins:
Find file:
phpmyadmin/libraries/DatabaseInterface.php
In function: public function query
Right after the opening { add this:
if ($link != null) {
    mysqli_query($link, "SET SESSION CHARACTER_SET_RESULTS =latin1;");
    mysqli_query($link, "SET SESSION CHARACTER_SET_CLIENT =latin1;");
}
That's it. Works like a charm.
I had exactly the same problem. Database charset is utf-8 and collation is utf8_unicode_ci. I was able to see Unicode text in my webapp but the phpMyAdmin and sqldump results were garbled.
It turned out that the problem was in the way my web application was connecting to MySQL. I was missing the encoding flag.
After I fixed it, I was able to see Greek characters correctly in both phpMyAdmin and sqldump but lost all my previous entries.
Just comment out these lines in libraries/database_interface.lib.php, as shown:
if (! empty($GLOBALS['collation_connection'])) {
    // PMA_DBI_query("SET CHARACTER SET 'utf8';", $link, PMA_DBI_QUERY_STORE);
    // PMA_DBI_query("SET collation_connection = '" .
    //     PMA_sqlAddslashes($GLOBALS['collation_connection']) . "';", $link, PMA_DBI_QUERY_STORE);
} else {
    // PMA_DBI_query("SET NAMES 'utf8' COLLATE 'utf8_general_ci';", $link, PMA_DBI_QUERY_STORE);
}
If you store data in utf8 without declaring the charset, you do not need phpMyAdmin to convert the connection again. This will work.
Easier solution for wamp is:
go to phpMyAdmin,
click localhost,
select latin1_bin for Server connection collation,
then start to create database and table
Add:
mysql_query("SET NAMES UTF8");
below:
mysql_select_db(/*your_database_name*/);
It works for me,
mysqli_query($con, "SET character_set_results = 'utf8', character_set_client = 'utf8', character_set_connection = 'utf8', character_set_database = 'utf8', character_set_server = 'utf8'");
ALTER TABLE table_name CONVERT to CHARACTER SET utf8;
*IMPORTANT: Back-up first, execute after

How to make MySQL character_set_connection work with utf16?

This query works fine:
set character_set_client = utf8
Same goes for utf8mb4, big5, dec8, cp850, hp8, koi8r, latin1, latin2, swe7, ascii, ujis, sjis, hebrew, etc.
However, when I tried set character_set_client = utf16 or set character_set_client = utf32, they don't work:
#1231 - Variable 'character_set_client' can't be set to the value of 'utf16'
#1231 - Variable 'character_set_client' can't be set to the value of 'utf32'
Why don't the commands work?
How can we make MySQL character_set_client work with utf16/32?
You can't.
MySQL docs only stated ucs2 cannot be used:
That was the 5.0 doc link. 5.5 says:
ucs2, utf16, and utf32 cannot be used as a client character set
and 5.6 adds utf16le. Essentially, MySQL expects queries to be in an ASCII-compatible encoding; each doc version here lists the ASCII-incompatible encodings that version of MySQL knows about.
Is there any particular reason you prefer to use UTF-16? It's generally a bad choice for anything other than talking to other UTF-16 environments (Win32 API, Java etc).
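What "ASCII-compatible" means can be seen by encoding the same statement both ways. This is only an illustration in plain PHP with mbstring; it does not talk to MySQL, and the helper hasNulByte is made up for the example.

```php
<?php
// Sketch: the same SQL text as UTF-8 bytes and as UTF-16LE bytes.
function hasNulByte(string $bytes): bool
{
    return strpos($bytes, "\0") !== false;
}

$sql   = 'SELECT 1';
$utf8  = $sql; // pure ASCII text is byte-identical in UTF-8
$utf16 = mb_convert_encoding($sql, 'UTF-16LE', 'UTF-8');

echo bin2hex($utf8), "\n";  // 53454c4543542031
echo bin2hex($utf16), "\n"; // 530045004c0045004300540020003100
```

In UTF-16 every ASCII character is followed by a zero byte, so a parser that scans for ASCII keywords and quotes byte by byte cannot read the statement.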

Php + Mysql (UTF-8 ) some characters are still bug

Well, I have a PHP script that takes nicknames from the Steam web API and inserts them into a MySQL DB. Many of them contain rare Russian and Greek characters. I set PHP to UTF-8 in php.ini and in all the PHP files with
mb_internal_encoding('utf-8');
My PDO connector is configured to handle utf8
$connection = new PDO('mysql:host=localhost;dbname=d2bd;mysql:charset=utf8mb4', 'root', '');
$connection->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
$connection->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$connection->setAttribute(PDO::ATTR_PERSISTENT, true);
$connection->setAttribute(PDO::MYSQL_ATTR_INIT_COMMAND, "SET NAMES 'utf8mb4' COLLATE 'utf8mb4_unicode_ci'");
my mysql db is properly configured with utf8mb4
character_set_client utf8mb4
character_set_connection utf8mb4
character_set_database utf8mb4
character_set_filesystem binary
character_set_results utf8mb4
character_set_server utf8mb4
character_set_system utf8
character_sets_dir C:\xampp\mysql\share\charsets\
collation_connection utf8mb4_unicode_ci
collation_database utf8mb4_unicode_ci
collation_server utf8mb4_unicode_ci
completion_type NO_CHAIN
concurrent_insert AUTO
connect_timeout 10
core_file OFF
In a few words: I take the input from the web API and encode it with utf8_encode(). Then I insert it into the DB. The problem is that some characters are not encoded well, and when I read them back from the database they are all broken.
Example 1:
1.Input -> Перуанский чертовски
2.Encode -> ÐеÑÑанÑкий ÑеÑÑовÑки
3.Insert into DB
4.Select from DB -> Ð?еÑ?Ñ?анÑкий Ñ?еÑ?Ñ?овÑкÐ
5.Decode
6.Output -> �?е�?�?анский �?е�?�?овск�
Example 2:
1.Input -> $ |/| 1 ↓_ € ♥ J
2.Encode -> $ |/| 1 â_ ⬠⥠J
3.Insert into DB
4.Select from DB -> 1 â??_ â?¬ â?¥ J
5.Decode
6.Output -> 1 �??_ �?� �?� J
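The "Encode" step in both examples looks like double encoding: utf8_encode() always treats its input as ISO-8859-1, so running it on text that is already UTF-8 encodes every byte a second time. The sketch below assumes the API already returns UTF-8, which the examples suggest but do not prove. The function name toUtf8Once is hypothetical, and mb_convert_encoding() is used in place of utf8_encode(), which is deprecated in recent PHP versions.

```php
<?php
// Sketch: convert to UTF-8 only when the input is not already valid UTF-8.
// Assumption: anything that is not valid UTF-8 is ISO-8859-1.
function toUtf8Once(string $input): string
{
    if (mb_check_encoding($input, 'UTF-8')) {
        return $input; // already UTF-8: converting again would double-encode it
    }
    return mb_convert_encoding($input, 'UTF-8', 'ISO-8859-1');
}

$nickname      = "\u{041F}\u{0435}\u{0440}\u{0443}"; // four Cyrillic letters, already UTF-8
$doubleEncoded = mb_convert_encoding($nickname, 'UTF-8', 'ISO-8859-1'); // what a blind re-encode does

echo strlen($nickname), " bytes vs ", strlen($doubleEncoded), " bytes when encoded twice\n"; // 8 bytes vs 16 bytes
```

The check is a heuristic: a few ISO-8859-1 byte sequences happen to be valid UTF-8 as well, so the more reliable fix is to know the encoding of the source and skip the encode step entirely when it is UTF-8.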
Checklist for Problems with character/charset/collation
Including mysql, mysqli, PDO
Content
DISCLAIMER
My inserts into my DB don't work properly! What can I do?
Change Charset and Collation of a Database or Table
Set the encoding of your script files
Set the charset of your page with php or meta tag
What's the difference between UTF8 and UTF8mb4?
Answer to this specific Question
Further Information/Additional Links
Side Notes
1. DISCLAIMER
This answer is meant to cover more than this one question, so that more people can quickly find a complete, bundled answer!
!Important Notice!
If you change something in your database, always make sure you have a backup first! Check it 2 times, or 3!
I'm open to improvements and comments, such as error corrections.
In addition, I apologize if the grammar is not perfect :D
If you get stuck on a question like this:
Php + Mysql (UTF-8, utf8mb4) some characters are still bug
How to convert an entire MySQL database characterset and collation to UTF-8?
“Incorrect string value” when trying to insert UTF-8 into MySQL
Change MySQL default character set to UTF-8 in my.cnf?
Using utf8mb4 with php and mysql
PDO + MySQL and broken UTF-8 encoding
Error in insertion data in php Mysql
PHP PDO: charset, set names?
SET NAMES utf8 in MySQL?
PHP mysql charset utf8 problems
UTF-8 all the way through
Manipulating utf8mb4 data from MySQL with PHP
ERROR 1115 (42000) : Unknown character set: 'utf8mb4' in mysql
...then maybe my answer will help you!
2. My inserts into my DB don't work properly! What can I do?
If your inserts don't work properly and the inserted data looks something like this in your database, there can be various reasons!
Examples:
??????????
𫗮𫗮𫗮𫗮
�??_ �?�
â_ ⬠⥠J
Here is a little checklist you can go through to check that everything is as it should be!
(After the checklist there is some extra information for mysql, mysqli and PDO.)
Checklist:
Make sure the default character set is set on tables, client, server & text fields
If NOT See Point 3
Make sure your database connection's character set is set
IF NOT See Point mysql/PDO
Make sure that, if you're displaying data, the charset of the document is set!
IF NOT See Point 5
Make sure your script files are saved with the right charset!
IF NOT See Point 4
Make sure you set your character set and your charset!
IF NOT See Point mysql/PDO
Make sure your forms accept utf8!
IF NOT See Point 5
Make sure you have set the connection encoding
IF NOT See Point mysql/pdo
Make sure you have set the server character encoding correctly
IF NOT See Point mysql/pdo
...
You have to be sure you're using utf8/utf8mb4 everywhere!
mysql:
-mysql_query("SET NAMES 'utf8'"); Run SET NAMES once per connection, before your other queries. If a MySQL driver does not provide its own mechanism for setting the charset, you have to use SET NAMES!
-mysql_query("SET CHARACTER SET utf8 "); Sets the character set to utf8
-mysql_set_charset('utf8'); Sets your charset to utf8
-utf8mb4 needs MySQL 5.5.3 or later; an older server answers with ERROR 1115 (42000): Unknown character set
-character_set_server=utf8 sets the server character set
PDO:
-$dbh->exec("set names utf8"); If you're using PDO you can use this line to SET NAMES
-$dbh = new PDO("mysql:host=$host;dbname=$db;charset=utf8"); This line sets the charset, but you need PHP 5.3.6 or higher
-$dbh = new PDO($dsn, $user, $pass, array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES 'utf8mb4' COLLATE 'utf8mb4_unicode_ci'")); You can also run SET NAMES this way; the option has to be passed to the constructor, because setting it with setAttribute() after connecting has no effect
-mb_internal_encoding('UTF-8'); sets the internal encoding used by PHP's mbstring functions
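One detail worth checking in your own code is the spelling of the DSN: the key is charset, so a DSN containing mysql:charset=utf8mb4 does not set it. A minimal sketch of a DSN builder follows; the function name mysqlDsn is hypothetical and the host and database names are just examples. It only builds the string, so it runs without a database.

```php
<?php
// Hypothetical helper: build a PDO MySQL DSN with the charset in the right place.
function mysqlDsn(string $host, string $dbname, string $charset = 'utf8mb4'): string
{
    // "charset" is a plain DSN key, separated from the others by semicolons.
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

$dsn = mysqlDsn('localhost', 'd2bd');
echo $dsn, "\n"; // mysql:host=localhost;dbname=d2bd;charset=utf8mb4

// Connecting needs a running server, so it is left commented out here:
// $pdo = new PDO($dsn, $user, $pass, [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);
```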
3. Change Charset and Collation of a Database or Table
If you have to change the charset or collation of a database or table you can use these lines of code:
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
4. Set the encoding of your script files
You may have to check that your script (PHP) files are saved with the right charset!
For this I would recommend Notepad++!
Once you have opened your file in Notepad++, go to the 'Encoding' menu and change the charset
5. Set the charset of your page with php or meta tag
To display data in utf8/utf8mb4 you have to be sure your site is set to the right charset!
You can set the charset in 3 ways, like this:
//PHP
ini_set("default_charset", "UTF-8");
header('Content-type: text/html; charset=UTF-8');
//HTML
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Also to accept utf8 in your form use:
<form accept-charset="UTF-8">
6. What's the difference between UTF8 and UTF8mb4?
UTF8:
-utf8 only supports characters of up to 3 bytes
-...(many more)
UTF8MB4:
-utf8mb4 supports characters of up to 4 bytes
-...(many more)
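To check whether a given string actually needs utf8mb4, you can look for code points above U+FFFF, which are the ones that take 4 bytes in UTF-8. This is a minimal sketch for PHP 7+; the function name needsUtf8mb4 is made up for the example.

```php
<?php
// Sketch: detect characters that need 4 bytes in UTF-8 (code points above U+FFFF).
function needsUtf8mb4(string $utf8): bool
{
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    return preg_match('/[\x{10000}-\x{10FFFF}]/u', $utf8) === 1;
}

$euro  = "\u{20AC}";  // EURO SIGN, 3 bytes in UTF-8
$emoji = "\u{1F600}"; // GRINNING FACE, 4 bytes in UTF-8

echo strlen($euro), ' ', strlen($emoji), "\n"; // 3 4
var_dump(needsUtf8mb4($euro));                 // bool(false)
var_dump(needsUtf8mb4($emoji));                // bool(true)
```

Russian, Greek and most symbols fit into 3 bytes; emoji and some rare CJK characters are the usual reason to switch to utf8mb4.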
7. Answer to this specific Question
I think this should work since you're using PDO:
(After you have created a PDO object! If you're using a PHP version lower than 5.3.6)
$dbh->exec("set names utf8");
Otherwise try one of these:
ini_set("default_charset", "UTF-8");
header('Content-type: text/html; charset=UTF-8');
UPDATE:
To change the collation or charset of a database or table use this:
ALTER DATABASE databasename CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
8. Further Information/Additional Links
default character set
character set
mysql_set_charset
error_reporting
pdo
mysql
mysqli
9. Side Notes
9.1 Error Reporting
If errors are not displayed, use this code snippet:
<?php
error_reporting(E_ALL);
ini_set("display_errors", 1);
?>
9.2 Unicode
To avoid mistakes, you really have to understand utf8!
9.3 One word to mysql, mysqli and PDO
My Personal ranking is:
PDO
mysqli
mysql
I would recommend using PDO or mysqli, because they have many advantages over mysql!
I changed the collation of the tables from SQLyog, but it seems that it's broken. When i changed them directly from a sql query it worked.

Cant insert utf8 characters on mysql (with utf8 collation, charset and nameset)

I'm facing a really stressful problem here. I have everything in UTF-8, and all my DBs and tables are utf8_general_ci, but when I insert or update from a single PHP script all I see are symbols, whereas if I edit in phpMyAdmin the words are shown correctly. I found that I can make it work by running utf8_decode() on my strings in PHP, but I'm not planning to do that because it's a mess and it should work without it :S
Here is a basic code im using to test this:
<?php
$conn = mysql_connect("localhost", "root", "root")
    or die("Error");
mysql_select_db("mydb", $conn)
    or die("Error");
mysql_query("UPDATE `mydb`.`Clients` SET `name` = '" . utf8_decode("Araña") . "' WHERE `Clients`.`id` =25;", $conn)
    or die(mysql_error());
mysql_close($conn);
echo "Success.";
?>
This is what I get if I don't decode with PHP's utf8_decode function:
instead of Araña, I get: AraÃ±a
I've run into the same issue many times. Sometimes it's because the type of database link I'm selecting from isn't the same type that I'm using for inserting and other times, it's from file data into a database.
For the latter case, mysql_set_charset('utf8',$link); is the magic answer.
Place the call to mysql_set_charset just after you select your database via mysql_select_db.
#ref http://php.net/manual/en/function.mysql-set-charset.php
"AraÃ±a" IS UTF-8. The characters "Ã±" represent the two bytes into which the Spanish ñ is encoded in UTF-8. Whatever you're reading it back with is not handling the UTF-8 and is displaying it as (it appears) ISO-8859-1.
That DDL you mentioned has to do with the collation, not the character set. The correct statement would be:
ALTER TABLE Clients CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
You still need to make sure the client library (libmysql or whatever driver PHP is using) is not transcoding the data back to ISO-8859. mysql_set_charset('utf8') will explicitly set the client encoding to UTF-8. Alternatively, you can send a SET NAMES UTF8; right after you connect to the database. To do that implicitly, you can change the my.cnf [client] block to have utf-8 as the client character encoding (and /etc/init.d/mysql reload to apply). Either way, make sure the client doesn't mangle the results it's pulling.
[client]
default-character-set=utf8
You do not need to use utf8_decode if you're using mbstring. The following php.ini configuration should ensure UTF-8 support on the PHP side:
mbstring.internal_encoding = utf-8
mbstring.http_output = utf-8
mbstring.func_overload = 6
Finally, when you display the results in HTML, verify that the page's encoding is explicitly UTF-8.

PHP MySQL database strange characters

I'm trying to output product information stored in a MySQL database, but it's writing out some strange characters, like a diamond with a question mark inside of it.
I think it may be an encoding/UTF8 issue, but I've specified the encoding I want:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Is this right? What should I check for?
If only the data that's coming from database has strange characters in it, be sure that the MySQL connection is also in UTF8 by using:
mysql_query("SET NAMES UTF8");
before any other queries. Otherwise, if the characters appear also in 'handwritten' files, make sure that the files are saved as UTF-8 in your editor. You can also try setting the charset header through PHP:
header('Content-type: text/html; charset=UTF-8');
Also make sure that all fields in the tables you are querying are set as some UTF-8 variant, for example utf8_general_ci.
I assume you want the result to be in utf8:
save your PHP script UTF-8 encoded
make sure your HTTP header (or a meta tag) says that the output is utf8
all tables in MySQL should be utf8
last but not least, the connection between client and server should be utf8. (This can be handled somewhere in the php.ini settings or by running the following query against the DB: SET character_set_results = 'utf8', character_set_client = 'utf8', character_set_connection = 'utf8', character_set_database = 'utf8', character_set_server = 'utf8')
If you follow all 4 points you should never ever have any problem with broken encodings.
The last time I had that trouble, the solution was similar to what Tatu Ulmanen said, but slightly different...
So if his solution does not work, try replacing
mysql_query("SET NAMES UTF8");
with
mysql_query("SET NAMES latin1");
I say this because the default characterset in MySql is latin1, and that is what is used most of the time....
hope that helps...
Seconding what Tatu says.
This is good background reading on encoding: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
