How to make MySQL character_set_connection work with utf16? - php

This query works fine:
set character_set_client = utf8
Same goes for utf8mb4, big5, dec8, cp850, hp8, koi8r, latin1, latin2, swe7, ascii, ujis, sjis, hebrew, etc.
However, when I tried set character_set_client = utf16 or set character_set_client = utf32, they don't work:
#1231 - Variable 'character_set_client' can't be set to the value of 'utf16'
#1231 - Variable 'character_set_client' can't be set to the value of 'utf32'
Why don't the commands work?
How can we make MySQL character_set_client work with utf16/32?

You can't.
The MySQL 5.0 documentation only stated that ucs2 cannot be used as a client character set. The 5.5 documentation says:
ucs2, utf16, and utf32 cannot be used as a client character set
and 5.6 adds utf16le. Essentially, MySQL expects queries to be in an ASCII-compatible encoding; each version of the docs lists the ASCII-incompatible encodings that version of MySQL knows about.
Is there any particular reason you prefer to use UTF-16? It's generally a bad choice for anything other than talking to other UTF-16 environments (Win32 API, Java etc).
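If the data really does arrive as UTF-16 (for example from a Win32 API), one option is to convert it to UTF-8 in PHP and keep the client character set at utf8/utf8mb4. A minimal sketch, assuming the mbstring extension is loaded and the input is UTF-16LE; the helper name utf16leToUtf8 is made up for illustration:

```php
<?php
// Hypothetical helper: convert a UTF-16LE byte string to UTF-8
// before it is placed into a query or bound as a parameter.
function utf16leToUtf8(string $utf16): string
{
    return mb_convert_encoding($utf16, 'UTF-8', 'UTF-16LE');
}

// "Araña" as UTF-16LE: every character takes two bytes, low byte first.
$fromUtf16Source = "A\x00r\x00a\x00\xF1\x00a\x00";
$utf8 = utf16leToUtf8($fromUtf16Source);

echo bin2hex($utf8), "\n"; // 417261c3b161
```

The connection then stays in an ASCII-compatible encoding, which is what the server expects for the query text.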

Related

Php + Mysql (UTF-8 ) some characters are still bug

I have a PHP script that takes nicknames from the Steam web API and inserts them into a MySQL database. Many of them contain rare Russian and Greek characters. I set PHP to UTF-8 in php.ini and in all the PHP files with
mb_internal_encoding('utf-8');
My PDO connector is configured to handle utf8
$connection = new PDO('mysql:host=localhost;dbname=d2bd;mysql:charset=utf8mb4', 'root', '');
$connection->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
$connection->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$connection->setAttribute(PDO::ATTR_PERSISTENT, true);
$connection->setAttribute(PDO::MYSQL_ATTR_INIT_COMMAND, "SET NAMES 'utf8mb4' COLLATE 'utf8mb4_unicode_ci'");
my mysql db is properly configured with utf8mb4
character_set_client utf8mb4
character_set_connection utf8mb4
character_set_database utf8mb4
character_set_filesystem binary
character_set_results utf8mb4
character_set_server utf8mb4
character_set_system utf8
character_sets_dir C:\xampp\mysql\share\charsets\
collation_connection utf8mb4_unicode_ci
collation_database utf8mb4_unicode_ci
collation_server utf8mb4_unicode_ci
completion_type NO_CHAIN
concurrent_insert AUTO
connect_timeout 10
core_file OFF
In short, I take the input from the web API and encode it with utf8_encode(). Then I insert it into the database. The problem is that some characters are not encoded correctly, and when I read them back from the database they are all garbled.
Example 1:
1.Input -> Перуанский чертовски
2.Encode -> ÐеÑÑанÑкий ÑеÑÑовÑки
3.Insert into DB
4.Select from DB -> Ð?еÑ?Ñ?анÑкий Ñ?еÑ?Ñ?овÑкÐ
5.Decode
6.Output -> �?е�?�?анский �?е�?�?овск�
Example 2:
1.Input -> $ |/| 1 ↓_ € ♥ J
2.Encode -> $ |/| 1 â_ ⬠⥠J
3.Insert into DB
4.Select from DB -> 1 â??_ â?¬ â?¥ J
5.Decode
6.Output -> 1 �??_ �?� �?� J
Checklist for Problems with character/charset/collation
Including mysql, mysqli, PDO
Content
DISCLAIMER
My inserts into my DB don't work properly! What can I do?
Change Charset and Collation of a Database or Table
Set the encoding of your script files
Set the charset of your page with php or meta tag
What's the difference between UTF8 and UTF8mb4?
Answer to this specific Question
Further Information/Additional Links
Side Notes
1. DISCLAIMER
This answer is not only meant to answer this question. It is deliberately a bit more extensive, so that more people can find a bundled, good answer faster!
!Important Notice!
If you change something in your database, always make sure you have a backup of your database! Check it 2 times, or 3!
I'm open to improvements and comments, such as error corrections.
In addition, I apologize if the grammar is not perfect :D
If you get stuck on a question like this:
Php + Mysql (UTF-8, utf8mb4) some characters are still bug
How to convert an entire MySQL database characterset and collation to UTF-8?
“Incorrect string value” when trying to insert UTF-8 into MySQL
Change MySQL default character set to UTF-8 in my.cnf?
Using utf8mb4 with php and mysql
PDO + MySQL and broken UTF-8 encoding
Error in insertion data in php Mysql
PHP PDO: charset, set names?
SET NAMES utf8 in MySQL?
PHP mysql charset utf8 problems
UTF-8 all the way through
Manipulating utf8mb4 data from MySQL with PHP
ERROR 1115 (42000) : Unknown character set: 'utf8mb4' in mysql
...then my answer may help you!
2. My inserts into my DB don't work properly! What can I do?
If your inserts don't work properly and your inserted data looks something like this in your database, there can be various reasons!
Examples:
??????????
𫗮𫗮𫗮𫗮
�??_ �?�
â_ ⬠⥠J
Here is a little checklist you can go through to check that everything is how it should be!
(After the checklist there is some extra information for mysql, mysqli and PDO)
Checklist:
Make sure default character sets are set on tables, client, server & text fields
IF NOT See Point 3
Make sure your database connection's character set is set
IF NOT See Point mysql/PDO
Make sure that, if you're displaying data, the charset of the document is set!
IF NOT See Point 5
Make sure your script files are saved with the right charset!
IF NOT See Point 4
Make sure you set your character set and your charset!
IF NOT See Point mysql/PDO
Make sure your forms accept utf8!
IF NOT See Point 5
Make sure you have set the connection encoding
IF NOT See Point mysql/PDO
Make sure you have set the server character encoding correctly
IF NOT See Point mysql/PDO
...
You have to be sure you're using utf8/utf8mb4 everywhere!
mysql:
-mysql_query("SET NAMES 'utf8'"); Run SET NAMES once after connecting, before your other queries (the setting lasts for the connection). If a MySQL driver doesn't provide a mechanism to set the charset, you have to use SET NAMES!
-mysql_query("SET CHARACTER SET utf8"); Sets the character set to utf8
-mysql_set_charset('utf8'); Sets the connection charset to utf8
-mysql API driver doesn't support utf8mb4 (ERROR 1115 (42000))
-character_set_server=utf8 sets the server character set
PDO:
-$dbh->exec("set names utf8"); If you're using PDO you can use this line to SET NAMES
-$dbh = new PDO("mysql:host=$host;dbname=$db;charset=utf8"); This line sets the charset, but you need PHP 5.3.6 or higher
-$dbh = new PDO($dsn, $user, $pass, array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES 'utf8mb4' COLLATE 'utf8mb4_unicode_ci'")); You can also run SET NAMES this way. The option has to be passed to the constructor, because the init command only runs while the connection is being opened
-mb_internal_encoding('UTF-8'); sets PHP's internal encoding when you use PDO
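The PDO options above can be combined into one connection helper. This is only a sketch under a few assumptions: PHP 5.3.6 or newer with pdo_mysql loaded, and the function names utf8mb4Dsn and connectUtf8mb4 are invented for illustration. Note that the charset is a plain DSN key ("charset=utf8mb4"), not prefixed with "mysql:" a second time.

```php
<?php
// Build a pdo_mysql DSN with the charset given as a regular DSN key.
function utf8mb4Dsn(string $host, string $dbname): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=utf8mb4', $host, $dbname);
}

// Open a connection. Driver options go into the constructor, because
// MYSQL_ATTR_INIT_COMMAND is only executed while connecting.
function connectUtf8mb4(string $host, string $dbname, string $user, string $pass): PDO
{
    return new PDO(utf8mb4Dsn($host, $dbname), $user, $pass, [
        PDO::ATTR_ERRMODE            => PDO::ERRMODE_EXCEPTION,
        PDO::ATTR_EMULATE_PREPARES   => false,
        PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci",
    ]);
}

echo utf8mb4Dsn('localhost', 'd2bd'), "\n";
// mysql:host=localhost;dbname=d2bd;charset=utf8mb4
```

A call such as connectUtf8mb4('localhost', 'd2bd', 'root', '') would then replace the separate setAttribute() calls; it is not executed here because it needs a running server.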
3. Change Charset and Collation of a Database or Table
If you have to change the charset or collation of a database or table you can use these lines of code:
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
4. Set the encoding of your script files
You may have to check that your script (PHP) files are saved with the right charset!
For this I would recommend Notepad++!
Once you have opened your file in Notepad++, go to the 'Encoding' menu and change the charset
5. Set the charset of your page with php or meta tag
To display data in utf8/utf8mb4 you have to be sure your site is set to the right charset!
You can set the charset in 3 ways like this:
//PHP
ini_set("default_charset", "UTF-8");
header('Content-type: text/html; charset=UTF-8');
//HTML
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Also to accept utf8 in your form use:
<form accept-charset="UTF-8">
6. What's the difference between UTF8 and UTF8mb4?
UTF8:
-utf8 only supports symbols of up to 3 bytes
-...(many more)
UTF8MB4:
-utf8mb4 also supports symbols of 4 bytes
-...(many more)
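To see the 3-byte versus 4-byte difference in practice, the byte length of a character can be checked in PHP. A small sketch, assuming PHP 7 or newer (for the "\u{...}" escapes) and the PCRE Unicode mode; needsUtf8mb4 is a hypothetical helper name:

```php
<?php
// True if the string contains a code point above U+FFFF, i.e. one that
// takes 4 bytes in UTF-8 and therefore needs utf8mb4 in MySQL.
function needsUtf8mb4(string $s): bool
{
    return preg_match('/[\x{10000}-\x{10FFFF}]/u', $s) === 1;
}

$euro  = "\u{20AC}";  // euro sign, 3 bytes in UTF-8
$emoji = "\u{1F600}"; // emoji, 4 bytes in UTF-8

echo strlen($euro), ' ', strlen($emoji), "\n"; // 3 4
var_dump(needsUtf8mb4($euro));  // bool(false)
var_dump(needsUtf8mb4($emoji)); // bool(true)
```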
7. Answer to this specific Question
I think this should work since you're using PDO:
(After you have created the PDO object, if you're using a PHP version lower than 5.3.6)
$dbh->exec("set names utf8");
Otherwise try one of these:
ini_set("default_charset", "UTF-8");
header('Content-type: text/html; charset=UTF-8');
UPDATE:
To change the collation or charset of a database or table use this:
ALTER DATABASE databasename CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
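One more thing that may be worth checking, given that the question runs the API data through utf8_encode(): that function assumes its input is ISO-8859-1, so applying it to text that is already UTF-8 double-encodes it and produces exactly the kind of output shown in "2.Encode". A hedged sketch of a guard, assuming mbstring is available and that any non-UTF-8 input really is ISO-8859-1; ensureUtf8 is a made-up name:

```php
<?php
// Hypothetical helper: convert only when the input is not already valid UTF-8.
// Assumption: anything that is not valid UTF-8 is ISO-8859-1.
function ensureUtf8(string $s): string
{
    if (mb_check_encoding($s, 'UTF-8')) {
        return $s; // already UTF-8, leave it alone
    }
    return mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1');
}

$alreadyUtf8 = "\u{2665}";  // heart symbol as UTF-8 (E2 99 A5)
$latin1      = "Ara\xF1a"; // "Araña" as ISO-8859-1

echo bin2hex(ensureUtf8($alreadyUtf8)), "\n"; // e299a5
echo bin2hex(ensureUtf8($latin1)), "\n";      // 417261c3b161
```

If the web API already returns UTF-8, the simplest fix is to drop the encode/decode steps entirely.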
8. Further Information/Additional Links
default character set
character set
mysql_set_charset
error_reporting
pdo
mysql
mysqli
9. Side Notes
9.1 Error Reporting
If errors are not displayed, use this code snippet:
<?php
error_reporting(E_ALL);
ini_set("display_errors", 1);
?>
9.2 Unicode
To avoid mistakes, you really have to understand utf8!
9.3 One word to mysql, mysqli and PDO
My Personal ranking is:
PDO
mysqli
mysql
I would recommend that you use PDO or mysqli, because they have many benefits over mysql!
I changed the collation of the tables from SQLyog, but it seems that it's broken. When I changed them directly with an SQL query it worked.

Character issues when inserting into MySQL DB

Having an issue with strange characters showing up when inserting into a database, have tried tirelessly to figure out the issue but I am out of ideas...
Basically if I insert this data like so (this is just testing):
$valy = "…industry's favorite </em><strong><em>party of the year</em></strong><em>, </em><a href='http://www.unitingagainstlungcancer.org/getinvolved/strolling-supper-with-blues-news'><span class='s1'><em>Joan's…";
$valy = mysql_real_escape_string($valy);
$query = "INSERT INTO test_table (data) VALUES ('".$valy."')";
mysql_query($query,$dbhandle);
this will end up in the database (notice the A characters):
"...industry's favorite party of the year, http://www.unitingagainstlungcancer.org/getinvolved/strolling-supper-with-blues-news'>Joan's..."
I have tried to line up all the character settings:
php default_charset = utf-8
mysql table & row charset = utf-8
Mysql instance variables:
character set client utf8
(Global value) latin1
character set connection utf8
(Global value) latin1
character set database latin1
character set filesystem binary
character set results utf8
(Global value) latin1
character set server latin1
character set system utf8
What could this issue be?
One thing you may be missing is the step where you set up the connection. There you should also set the encoding to utf8.
Example:
$link = mysql_connect('localhost', 'user', 'password');
mysql_set_charset('utf8',$link);
However, don't use the mysql extension, it's deprecated: http://php.net/manual/en/function.mysql-set-charset.php
Try to set the mysql connection to UTF-8 as well:
mysql_query("SET NAMES 'utf8'");

Cant insert utf8 characters on mysql (with utf8 collation, charset and nameset)

I'm facing a really frustrating problem here. I have everything in UTF-8, and all my databases and tables use utf8_general_ci, but when I try to insert or update from a single PHP script, all I see are symbols. If I edit in phpMyAdmin, the words are shown correctly. I found that if I run the utf8_decode() function on my strings in PHP I can make it work, but I'm not planning to do that because it's a mess and it should work without it.
Here is a basic code im using to test this:
<?php
$conn=mysql_connect("localhost","root","root")
or die("Error");
mysql_select_db("mydb",$conn) or
die("Error");
mysql_query("UPDATE `mydb`.`Clients` SET `name` = '".utf8_decode("Araña")."' WHERE `Clients`.`id` =25;",
$conn) or die(mysql_error());
mysql_close($conn);
echo "Success.";
?>
This is what i get if i dont decode utf8 with php utf8_decode function:
Instead of Araña, I get: AraÃ±a
I've run into the same issue many times. Sometimes it's because the type of database link I'm selecting from isn't the same type that I'm using for inserting, and other times it's when loading file data into a database.
For the latter case, mysql_set_charset('utf8',$link); is the magic answer.
Place the call to mysql_set_charset just after you select your database via mysql_select_db.
#ref http://php.net/manual/en/function.mysql-set-charset.php
"AraÃ±a" IS UTF-8. The characters "Ã±" represent the two bytes (C3 B1) into which the Spanish ñ is encoded in UTF-8. Whatever you're reading it back with is not handling the UTF-8 and is displaying it as (it appears) ISO-8859-1.
That DDL you mentioned has to do with the collation, not the character set. The correct statement would be:
ALTER TABLE Clients CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
You still need to make sure the client library (libmysql or whatever driver PHP is using) is not transcoding the data back to ISO-8859. mysql_set_charset('utf8') will explicitly set the client encoding to UTF-8. Alternatively, you can send a SET NAMES UTF8; right after you connect to the database. To do that implicitly, you can change the my.cnf [client] block to have utf-8 as the client character encoding (and /etc/init.d/mysql reload to apply). Either way, make sure the client doesn't mangle the results it's pulling.
[client]
default-character-set=utf8
You do not need to use utf8_decode if you're using mbstrings. The following php.ini configuration should ensure UTF-8 support on the PHP side:
mbstring.internal_encoding = utf-8
mbstring.http_output = utf-8
mbstring.func_overload = 6
Finally, when you display the results in HTML, verify that the page's encoding is explicitly UTF-8.
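The point made above, that the garbled name is simply correct UTF-8 being read as ISO-8859-1, can be reproduced without a database. A short sketch, assuming the mbstring extension is loaded; the byte escapes are used so the result does not depend on how the script file itself is saved:

```php
<?php
$utf8 = "Ara\xC3\xB1a"; // "Araña" encoded as UTF-8 (ñ = C3 B1)

// Misread the UTF-8 bytes as ISO-8859-1 and re-encode them as UTF-8.
// This is what a latin1 connection or viewer effectively does.
$mojibake = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

// Reverse the misreading to get the original bytes back.
$repaired = mb_convert_encoding($mojibake, 'ISO-8859-1', 'UTF-8');

echo bin2hex($mojibake), "\n"; // 417261c383c2b161
var_dump($repaired === $utf8); // bool(true)
```

Because the round trip is lossless here, the stored data is usually intact and only the connection or display encoding needs fixing.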

imported database dump from latin1 db to utf8 database

I used iconv to convert from latin1 to utf8 when I did an mysql dump of a database from mysql v4.0.21, and imported it onto a new server mysql v5.0.45
It was latin1 on the old server, it's utf8 on the new server, so I ran this on the mysql dump: iconv -f latin1 -t UTF-8 quickwebcms_2010-03-01.sql
It ran successful, then I imported it onto the new server.
Now it displays question (?) marks (example: College?s) and  (example: College’s) when it prints out some of the data in my PHP application.
I exported the table these characters show up in and did a find and replace all within TextMate, then imported it back into the new database, and it uploads some of the fields as null, so the find and replace may have messed something up in the process. I saved the table CSV as UTF-8 without BOM, and as plain UTF-8, and it still does the same thing.
Any help as to why this might be happening is appreciated.
If the content of your tables is all OK (and in UTF-8) and you still have "bad" characters in your web application, make sure your MySQL connection is using the UTF-8 charset in your PHP script. Even if your databases and tables are in UTF-8, MySQL uses latin1 connections by default (at least in my shared server config). So you have to tell MySQL to send content in UTF-8. Otherwise it will convert it on the fly to latin1, producing "bad" characters in UTF-8 webpages.
Use mysql_set_charset if available; otherwise you can set it with an SQL query (always use mysql_set_charset if available):
if (function_exists('mysql_set_charset'))
    mysql_set_charset('utf8', $conn);
else
{
    if (mysql_query("SET character_set_results = 'utf8', character_set_client = 'utf8', character_set_connection = 'utf8', character_set_database = 'utf8', character_set_server = 'utf8'", $conn) === false)
    {
        //Error! Do something...
    }
}
Also make sure your (X)HTML markup uses UTF-8 too:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
IIRC, mysqldump produces UTF-8 output by default, no matter what the database's encoding is. This user comment in the mySQL manual seems to confirm it:
I am just using default character sets - normally latin1. However, the dump produced by mysqldump is, perhaps surprisingly, in utf8. This seems fine, but leads to trouble with the --skip-opt option to mysqldump, which turns off --set-charset but leaves the dump in utf8.
Perhaps the fact that mysqldump uses utf8 by default, and the importance of the --set-charset option should be more prominently documented (see the documentation for the --default-character-set attribute for the current mention of the use of utf8)
Try skipping the iconv step, might work straight away.
You may be better off loading the data onto the new server as latin1, then running the appropriate ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci on each table (or use a script of some sort to do it for you).
Or you could convert first, then dump.
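The "script of some sort" for the per-table conversion could look roughly like the following. It only builds the statements and prints them, so they can be reviewed before being run against a backed-up database. The function name and the table names are placeholders, not taken from the original dump:

```php
<?php
// Build one ALTER TABLE ... CONVERT TO statement per table name.
// Backticks inside a table name are doubled so the identifier stays quoted.
function buildConvertStatements(array $tables, string $charset = 'utf8', string $collation = 'utf8_unicode_ci'): array
{
    $statements = [];
    foreach ($tables as $table) {
        $quoted = '`' . str_replace('`', '``', $table) . '`';
        $statements[] = "ALTER TABLE {$quoted} CONVERT TO CHARACTER SET {$charset} COLLATE {$collation};";
    }
    return $statements;
}

// Placeholder table names for illustration only.
foreach (buildConvertStatements(['pages', 'users']) as $sql) {
    echo $sql, "\n";
}
```

In a real run the table list would come from SHOW TABLES rather than a hard-coded array.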

PHP MySQL database strange characters

I'm trying to output product information stored in a MySQL database, but it's writing out some strange characters, like a diamond with a question mark inside of it.
I think it may be an encoding/UTF8 issue, but I've specified the encoding I want:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Is this right? What should I check for?
If only the data that's coming from database has strange characters in it, be sure that the MySQL connection is also in UTF8 by using:
mysql_query("SET NAMES UTF8");
before any other queries. Otherwise, if the characters appear also in 'handwritten' files, make sure that the files are saved as UTF-8 in your editor. You can also try setting the charset header through PHP:
header('Content-type: text/html; charset=UTF-8');
Also make sure that all fields in the tables you are querying are set as some UTF-8 variant, for example utf8_general_ci.
I assume you want the result to be in utf8:
save your PHP script UTF-8 encoded
make sure your HTTP header (or some meta tag) says that the output is utf8
all tables in MySQL should be utf8
last but not least, the connection between client and server should be utf8. (This could be handled somewhere in the php.ini settings or by running the following query against the db: SET character_set_results = 'utf8', character_set_client = 'utf8', character_set_connection = 'utf8', character_set_database = 'utf8', character_set_server = 'utf8')
If you follow all 4 points you should never have any problem with broken encodings.
The last time I had that trouble, the solution was similar to what Tatu Ulmanen said, but slightly different...
So if his solution does not work, try replacing
mysql_query("SET NAMES UTF8");
with
mysql_query("SET NAMES latin1");
I say this because the default character set in MySQL is latin1, and that is what is used most of the time....
hope that helps...
Seconding what Tatu says.
This is good background reading on encoding: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
