MySQL and PHP: UTF-8 with Cyrillic characters [duplicate] - php

This question already has answers here:
UTF-8 all the way through
(13 answers)
Closed 3 years ago.
I'm trying to insert a Cyrillic value in the MySQL table, but there is a problem with encoding.
Php:
<?php
$servername = "localhost";
$username = "a";
$password = "b";
$dbname = "c";
$conn = new mysqli($servername, $username, $password, $dbname);
mysql_query("SET NAMES 'utf8';");
mysql_query("SET CHARACTER SET 'utf8';");
mysql_query("SET SESSION collation_connection = 'utf8_general_ci';");
if ($conn->connect_error) {
die("Connection failed: " . $conn->connect_error);
}
$sql = "UPDATE `c`.`mainp` SET `search` = 'test тест' WHERE `mainp`.`id` =1;";
if ($conn->query($sql) === TRUE) {
}
$conn->close();
?>
MySQL:
| id | search |
| 1 | test ав |
Note: PHP file is utf-8, database collation utf8_general_ci

You are mixing APIs here: mysql_* and mysqli_* don't mix. You should stick with mysqli_* (as it seems you are anyway), since the mysql_* functions are deprecated and were removed entirely in PHP 7.
Your actual issue is a charset problem somewhere. Here are a few pointers that can help you get the right charset for your application. They cover most of the general problems one can face when developing a PHP/MySQL application.
ALL attributes throughout your application must be set to UTF-8
Save the document as UTF-8 w/o BOM (If you're using Notepad++, it's Format -> Convert to UTF-8 w/o BOM)
The header in both PHP and HTML should be set to UTF-8
HTML (inside <head></head> tags):
<meta charset="UTF-8">
PHP (at the top of your file, before any output):
header('Content-Type: text/html; charset=utf-8');
Upon connecting to the database, set the charset to UTF-8 for your connection object (directly after connecting), like this:
mysqli_set_charset($conn, "utf8"); /* Procedural approach */
$conn->set_charset("utf8"); /* Object-oriented approach */
This is for mysqli_*, there are similar ones for mysql_* and PDO (see bottom of this answer).
Also make sure your database and tables are set to UTF-8, you can do that like this:
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
(Data that was already stored with a broken encoding won't be repaired by this, so you'll need to do it with a clean database, or fix the broken characters in the data afterwards.)
If you're using json_encode(), you might need to apply the JSON_UNESCAPED_UNICODE flag, otherwise it will escape non-ASCII characters as \uXXXX sequences.
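To illustrate the json_encode() point, here is a minimal sketch. The sample array is only an illustration (the string is borrowed from the question), not part of the original answer.

```php
<?php
// Sample data; any non-ASCII string behaves the same way.
$data = ['search' => 'test тест'];

// Default behaviour: non-ASCII characters are escaped as \uXXXX sequences.
$escaped = json_encode($data);

// With the flag: characters are written out as raw UTF-8.
$readable = json_encode($data, JSON_UNESCAPED_UNICODE);

echo $escaped, PHP_EOL;
echo $readable, PHP_EOL;
```

Both strings decode back to the same array, so the flag mostly matters when the JSON is read by people or by a consumer that does not unescape it.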
Remember that EVERYTHING in your entire pipeline of code needs to be set to UTF-8, otherwise you might experience broken characters in your application.
In addition to this list, there may be functions that have a specific parameter for specifying a charset. The manual will tell you about this (an example is htmlspecialchars()).
There are also special functions for multibyte characters. For example, strtolower() won't lowercase multibyte characters; for that you'll have to use mb_strtolower(), see this live demo.
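A small sketch of the two points above, assuming the mbstring extension is loaded. The sample strings are made up for illustration.

```php
<?php
$upper = 'ТЕСТ';

// mb_strtolower() is multibyte-aware; the encoding is passed explicitly
// instead of relying on mb_internal_encoding().
$lower = mb_strtolower($upper, 'UTF-8');

// htmlspecialchars() takes the charset as its third parameter.
$safe = htmlspecialchars('<b>' . $lower . '</b>', ENT_QUOTES, 'UTF-8');

echo $lower, PHP_EOL;
echo $safe, PHP_EOL;
```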
Note 1: Notice that it's written in some places as utf-8 (with a dash) and in others as utf8 (without it). It's important that you know when to use which, as they usually aren't interchangeable. For example, HTML and PHP want utf-8, but MySQL doesn't.
Note 2: In MySQL, "charset" and "collation" are not the same thing, see Difference between Encoding and collation?. Both should be set to utf-8 though; generally the collation should be either utf8_general_ci or utf8_unicode_ci, see UTF-8: General? Bin? Unicode?.
Note 3: If you're using emojis, MySQL needs to be set to the utf8mb4 charset instead of the standard utf8, both in the database and on the connection. HTML and PHP will just use UTF-8.
Setting UTF-8 with mysql_ and PDO
PDO: This is done in the DSN of your object. Note the charset attribute:
$pdo = new PDO("mysql:host=localhost;dbname=database;charset=utf8", "user", "pass");
mysql_: This is done very similar to mysqli_*, but it doesn't take the connection-object as the first argument.
mysql_set_charset('utf8');

Solution:
Replace mysql_query("SET NAMES 'utf8';"); with $mysqli->set_charset('utf8'); (called on your mysqli connection object).
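Putting that together, a corrected version of the question's script could look roughly like this. It is only a sketch: the table and column names are taken from the question, while the two function names and the use of a prepared statement are my own additions, not something the answers above prescribe.

```php
<?php
// Hypothetical helper: connects with mysqli only and sets the connection charset.
function connectUtf8(string $host, string $user, string $pass, string $db): mysqli
{
    $conn = new mysqli($host, $user, $pass, $db);
    if ($conn->connect_error) {
        die('Connection failed: ' . $conn->connect_error);
    }
    // Replaces the three mysql_query("SET ...") calls from the question.
    $conn->set_charset('utf8');
    return $conn;
}

// Hypothetical helper: updates one row of the question's `mainp` table.
function updateSearch(mysqli $conn, int $id, string $value): bool
{
    // A prepared statement keeps the Cyrillic text out of the SQL string itself.
    $stmt = $conn->prepare('UPDATE `mainp` SET `search` = ? WHERE `id` = ?');
    if ($stmt === false) {
        return false;
    }
    $stmt->bind_param('si', $value, $id);
    $ok = $stmt->execute();
    $stmt->close();
    return $ok;
}
```

Usage would be along the lines of $conn = connectUtf8('localhost', 'a', 'b', 'c'); updateSearch($conn, 1, 'test тест'); $conn->close();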

Related

Php + Mysql (UTF-8 ) some characters are still bug

I have a PHP script that takes nicknames from the Steam web API and inserts them into a MySQL DB. Many of them contain rare Russian and Greek characters. I set PHP to UTF-8 in php.ini and in all the PHP files with
mb_internal_encoding('utf-8');
My PDO connector is configured to handle utf8
$connection = new PDO('mysql:host=localhost;dbname=d2bd;mysql:charset=utf8mb4', 'root', '');
$connection->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
$connection->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$connection->setAttribute(PDO::ATTR_PERSISTENT, true);
$connection->setAttribute(PDO::MYSQL_ATTR_INIT_COMMAND, "SET NAMES 'utf8mb4' COLLATE 'utf8mb4_unicode_ci'");
my mysql db is properly configured with utf8mb4
character_set_client utf8mb4
character_set_connection utf8mb4
character_set_database utf8mb4
character_set_filesystem binary
character_set_results utf8mb4
character_set_server utf8mb4
character_set_system utf8
character_sets_dir C:\xampp\mysql\share\charsets\
collation_connection utf8mb4_unicode_ci
collation_database utf8mb4_unicode_ci
collation_server utf8mb4_unicode_ci
completion_type NO_CHAIN
concurrent_insert AUTO
connect_timeout 10
core_file OFF
In short, I take the input from the web API and encode it with utf8_encode(). Then I insert it into the DB. The problem is that some characters are not encoded correctly, and when I read them back from the database they are all broken.
Example 1:
1.Input -> Перуанский чертовски
2.Encode -> ÐеÑÑанÑкий ÑеÑÑовÑки
3.Insert into DB
4.Select from DB -> Ð?еÑ?Ñ?анÑкий Ñ?еÑ?Ñ?овÑкÐ
5.Decode
6.Output -> �?е�?�?анский �?е�?�?овск�
Example 2:
1.Input -> $ |/| 1 ↓_ € ♥ J
2.Encode -> $ |/| 1 â_ ⬠⥠J
3.Insert into DB
4.Select from DB -> 1 â??_ â?¬ â?¥ J
5.Decode
6.Output -> 1 �??_ �?� �?� J
Checklist for problems with character/charset/collation
Including mysql, mysqli, PDO
Content
1. DISCLAIMER
2. My inserts into my DB don't work properly! What can I do?
3. Change Charset and Collation of a Database or Table
4. Set the encoding of your script files
5. Set the charset of your page with php or meta tag
6. What's the difference between UTF8 and UTF8mb4?
7. Answer to this specific Question
8. Further Information/Additional Links
9. Side Notes
1. DISCLAIMER
This answer is meant to do more than answer this one question; it is deliberately a bit more extensive, so that more people can quickly find a good, bundled answer!
!Important Notice!
If you change something in your database, always make sure you have a backup of it! Check it 2 times, or 3!
I'm open to improvements and comments, such as error corrections.
In addition, I apologize if the grammar is not perfect :D
If you get stuck on a question like this:
Php + Mysql (UTF-8, utf8mb4) some characters are still bug
How to convert an entire MySQL database characterset and collation to UTF-8?
“Incorrect string value” when trying to insert UTF-8 into MySQL
Change MySQL default character set to UTF-8 in my.cnf?
Using utf8mb4 with php and mysql
PDO + MySQL and broken UTF-8 encoding
Error in insertion data in php Mysql
PHP PDO: charset, set names?
SET NAMES utf8 in MySQL?
PHP mysql charset utf8 problems
UTF-8 all the way through
Manipulating utf8mb4 data from MySQL with PHP
ERROR 1115 (42000) : Unknown character set: 'utf8mb4' in mysql
...then maybe my answer will help you!
2. My inserts into my DB don't work properly! What can I do?
If your inserts don't work properly and the inserted data looks something like this in your database, there can be various reasons!
Examples:
??????????
𫗮𫗮𫗮𫗮
�??_ �?�
â_ ⬠⥠J
Here is a little checklist you can go through to check that everything is as it should be!
(After the checklist there is some extra information for mysql, mysqli and PDO.)
Checklist:
-Make sure the default character set is set on tables, client, server and text fields (if not, see point 3)
-Make sure your database connection's character set is set (if not, see the mysql/PDO notes)
-Make sure that, if you're displaying data, the charset of the document is set! (if not, see point 5)
-Make sure your script files are saved with the right charset! (if not, see point 4)
-Make sure you have set your character set! (if not, see the mysql/PDO notes)
-Make sure your forms accept utf8! (if not, see point 5)
-Make sure you have set the connection encoding (if not, see the mysql/PDO notes)
-Make sure you have set the server character encoding correctly (if not, see the mysql/PDO notes)
...
You have to be sure you're using utf8/utf8mb4 everywhere!
mysql:
-mysql_query("SET NAMES 'utf8'"); Run SET NAMES once, right after connecting and before your other queries. If a MySQL driver doesn't provide a mechanism to set the charset, you have to use SET NAMES!
-mysql_query("SET CHARACTER SET utf8"); Sets the character set to utf8
-mysql_set_charset('utf8'); Sets your connection charset to utf8
-utf8mb4 needs MySQL 5.5.3 or newer; an older server or client library fails with ERROR 1115 (42000): Unknown character set
-character_set_server=utf8 sets the server character set (in my.cnf)
PDO:
-$dbh->exec("set names utf8"); If you're using PDO, you can use this line to run SET NAMES
-$dbh = new PDO("mysql:host=$host;dbname=$db;charset=utf8"); This line sets the charset, but you need PHP 5.3.6 or higher
-$dbh = new PDO($dsn, $user, $pass, array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES 'utf8mb4' COLLATE 'utf8mb4_unicode_ci'")); You can also run SET NAMES this way. The option has to be passed to the constructor; setting it with setAttribute() after connecting has no effect
-mb_internal_encoding('UTF-8'); sets the default encoding for the mb_* string functions (it does not change the PDO connection itself)
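The PDO notes above could be sketched like this. The helper names utf8Dsn() and connectPdo() are hypothetical, and the choice of utf8mb4 as the default is an assumption based on the question's setup. Note that the DSN key is written as plain charset=..., separated by a semicolon.

```php
<?php
// Hypothetical helper: builds a MySQL DSN that carries the charset itself.
function utf8Dsn(string $host, string $db, string $charset = 'utf8mb4'): string
{
    return "mysql:host={$host};dbname={$db};charset={$charset}";
}

// Hypothetical helper: opens the connection; driver options are passed
// to the constructor so they apply before the connection is made.
function connectPdo(string $host, string $db, string $user, string $pass): PDO
{
    return new PDO(utf8Dsn($host, $db), $user, $pass, [
        PDO::ATTR_ERRMODE          => PDO::ERRMODE_EXCEPTION,
        PDO::ATTR_EMULATE_PREPARES => false,
    ]);
}
```

For the question's database this would be called as connectPdo('localhost', 'd2bd', 'root', '').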
3. Change Charset and Collation of a Database or Table
If you have to change the charset or collation of a database or table you can use these lines of code:
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
4. Set the encoding of your script files
You may have to check that your script (PHP) files are saved with the right charset!
For this I would recommend Notepad++!
Once you have opened your file in Notepad++, go to the 'Encoding' menu and change the charset.
5. Set the charset of your page with php or meta tag
To display data in utf8/utf8mb4, you have to be sure your site is set to the right charset!
You can set the charset in 3 ways, like this:
//PHP
ini_set("default_charset", "UTF-8");
header('Content-type: text/html; charset=UTF-8');
//HTML
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Also to accept utf8 in your form use:
<form accept-charset="UTF-8">
6. What's the difference between UTF8 and UTF8mb4?
UTF8:
-utf8 (an alias of utf8mb3) only supports characters of up to 3 bytes
-...(many more)
UTF8MB4:
-utf8mb4 also supports characters of 4 bytes (for example emojis)
-...(many more)
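To make the byte counts concrete, here is a short sketch. The helper needsUtf8mb4() is a hypothetical name; it simply checks for code points above U+FFFF, which are the ones that take 4 bytes in UTF-8.

```php
<?php
// Hypothetical helper: true if the string contains a code point above U+FFFF,
// i.e. a character that needs 4 bytes in UTF-8 and therefore utf8mb4 in MySQL.
function needsUtf8mb4(string $text): bool
{
    return preg_match('/[\x{10000}-\x{10FFFF}]/u', $text) === 1;
}

$cyrillic = "\u{0442}";   // Cyrillic letter, 2 bytes
$euro     = "\u{20AC}";   // euro sign, 3 bytes
$emoji    = "\u{1F600}";  // emoji, 4 bytes

// strlen() counts bytes, not characters.
printf("%d %d %d\n", strlen($cyrillic), strlen($euro), strlen($emoji));
var_dump(needsUtf8mb4($cyrillic . $euro), needsUtf8mb4($emoji));
```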
7. Answer to this specific Question
I think this should work, since you're using PDO:
(Run this after you have created the PDO object, if you're using a PHP version lower than 5.3.6.)
$dbh->exec("set names utf8");
Otherwise try one of these:
ini_set("default_charset", "UTF-8");
header('Content-type: text/html; charset=UTF-8');
UPDATE:
To change the collation or charset of a database or table use this:
ALTER DATABASE databasename CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
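One more thing that may be worth checking for this question: the "Encode" step in the examples looks like double encoding. utf8_encode() treats its input as ISO-8859-1, so calling it on text that is already UTF-8 produces exactly this kind of output. The sketch below reproduces the effect with mb_convert_encoding(), which performs the same conversion; the guard function toUtf8() is a hypothetical helper, and it assumes any non-UTF-8 input is ISO-8859-1.

```php
<?php
$original = 'Перуанский'; // already valid UTF-8, 20 bytes

// Same conversion utf8_encode() performs: read the bytes as ISO-8859-1
// and re-encode each of them as UTF-8.
$double = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

// Hypothetical guard: only convert input that is not valid UTF-8 already.
function toUtf8(string $raw): string
{
    return mb_check_encoding($raw, 'UTF-8')
        ? $raw
        : mb_convert_encoding($raw, 'UTF-8', 'ISO-8859-1');
}

var_dump(strlen($original), strlen($double));
var_dump(toUtf8($original) === $original);
```

If the web API already returns UTF-8, dropping the encode step entirely is probably the simplest fix.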
8. Further Information/Additional Links
default character set
character set
mysql_set_charset
error_reporting
pdo
mysql
mysqli
9. Side Notes
9.1 Error Reporting
If errors are not being displayed, use this code snippet:
<?php
error_reporting(E_ALL);
ini_set("display_errors", 1);
?>
9.2 Unicode
To avoid mistakes, you really have to understand UTF-8!
9.3 One word to mysql, mysqli and PDO
My Personal ranking is:
PDO
mysqli
mysql
I would recommend using PDO or mysqli, because they have many advantages over mysql!
I changed the collation of the tables from SQLyog, but that seems to be broken. When I changed them directly with an SQL query, it worked.

MySQL & PHP special character issue [duplicate]

This question already has answers here:
UTF-8 all the way through
(13 answers)
Closed 8 years ago.
I am using MySQL version 5.5.
When I try to insert the ‘ special character into the database, it is automatically converted into ’ .
I changed the database character set to utf8 and character_set_connection to utf8, but I am unable to get the expected result.
How can I solve this issue?
Kindly help with this.
You need to check how you are sending the data.
If the character set in the database is utf-8, you need to send the data in that encoding too.
If your data is ISO-8859-1, try encoding it first (utf8_encode() assumes ISO-8859-1 input), like this:
$sql = "INSERT INTO tablex(field) VALUES('".utf8_encode($mydata)."')";
It is important to make sure that every part of your connection is using utf8, otherwise you will run into problems.
Below we will create a utf8 connection to the database, perform set names which is vitally important and then write using a utf8_encode method.
mysql_connect("host", "user", "pass");
mysql_query("SET character_set_results=utf8");
mysql_set_charset('utf8');
mb_internal_encoding('UTF-8');
mysql_select_db("my_db");
mysql_query("set names 'utf8'");
$sql = "INSERT INTO `table`(`foo`) VALUES('".utf8_encode($bar)."')";

Converting html entities to utf-8 and inserting them into a mysql database

I am trying to convert a string from HTML-ENTITIES to UTF-8 and then save the encoded string in my database. The html entities are greek letters and look for example like this: νω
Now I tried thousands of different ways, starting from just using utf8_encode or html_entity_decode until now I came across the function mb_convert_encoding().
Now the really weird thing is that when converting my string and then outputting it, it is correctly encoded to utf-8, but when inserting this string into my database I end up getting something like: ξÏνω.
This is the code for the encoding:
header('Content-Type: text/html; charset=utf-8');
mb_internal_encoding('utf-8');
......
while($arr = $select->fetch_array(MYSQLI_ASSOC))
{
$text = $arr["greek"];
$result = mb_convert_encoding($text, 'UTF-8', 'HTML-ENTITIES');
$mysqli->query("UPDATE some SET greek = '".$result."'");
}
When outputting my query and then manually doing a sql query in phpmyadmin it works fine, so it doesnt seem to be a problem of my db. There must be some problem when transferring the encoded string to my database...
As you see in your script, you are instructing the browser to use UTF8. That is the first step.
However your database needs the same thing and also the encoding/collation on the tables need to be UTF8 too.
You can either recreate your tables using utf8_general_ci or utf8_unicode_ci as the collation, or convert the existing tables (see here)
You need to also make sure that your database connection i.e. php code to mysql is using UTF8. If you are using PDO there are plenty of articles that show how to do that. The simplest way is to do:
$mysqli->query('SET NAMES utf8');
NOTE The change you will make now is final. If you change the connection encoding to your database, you could affect existing data.
EDIT You can do the following to set the connection
$mysqli = new mysqli($host, $user, $pass, $db);
if (!$mysqli->set_charset("utf8")) {
die(sprintf("Error loading character set utf8: %s\n", $mysqli->error));
}
$mysqli->close();
Links of interest:
Whether to use "SET NAMES"
Execute the SET NAMES 'utf8' query prior to any others.

Set character set using MySQLi

I'm fetching data in Arabic from MySQL tables with MySQLi. So I usually use this in procedural style:
mysql_query("SET NAMES 'utf8'");
mysql_query('SET CHARACTER SET utf8');
Now I am using the OOP style so I am trying to see if there is something I could set rather than the above?
I only found this in PHP manual so I did it, but what about setting names to UTF8?
$mysqli->set_charset("utf8");
It's the same:
$mysqli->query("SET NAMES 'utf8'");
From the manual:
This is the preferred way to change the charset. Using mysqli::query()
to execute SET NAMES .. is not recommended.
$mysqli->set_charset("utf8"); is just enough, let the mysqli db driver do the thing for you.
You should use
$mysqli->set_charset("utf8");
Don't use SET NAMES or SET CHARACTER SET explicitly when using MySQLi, and certainly don't use both, as the question asker here originally did. Here is why this is a bad idea:
Calling SET CHARACTER SET utf8 after SET NAMES utf8 actually just undoes some of the work that SET NAMES did.
The PHP manual explicitly warns us to use mysqli_set_charset, not SET NAMES:
This is the preferred way to change the charset. Using mysqli_query() to set it (such as SET NAMES utf8) is not recommended. See the MySQL character set concepts section for more information.
Under the hood, mysqli_set_charset is just a wrapper for mysql_set_character_set from the MySQL C API (or its mysqlnd equivalent). Looking at the docs for that function, we can see the difference between it and SET NAMES:
This function works like the SET NAMES statement, but also sets the value of mysql->charset, and thus affects the character set used by mysql_real_escape_string()
In other words, $mysqli->set_charset('foo') will do everything SET NAMES foo does and also ensure that mysqli_real_escape_string respects the new encoding. Admittedly, if you're only using encodings like Latin 1 and UTF 8 that strictly extend ASCII (that is, which encode all ASCII strings exactly as they would be encoded in ASCII), then using SET NAMES instead of set_charset won't break anything. However, if you're using more unusual encodings like GBK, then you could end up garbling your strings or even introducing SQL injection vulnerabilities that bypass mysqli_real_escape_string.
It's thus good practice to only use set_charset, not SET NAMES. There's nothing that SET NAMES does that set_charset doesn't, and set_charset avoids the risk of mysqli_real_escape_string behaving incorrectly (or even insecurely).
You can use this. It's really nice for setting the mysqli character set:
$dbhost = "localhost";
$dbuser = "root";
$dbpass = "dbpass";
$dbname = "dbname";
$conn = mysqli_connect($dbhost,$dbuser,$dbpass,$dbname);
mysqli_query($conn,"SET CHARACTER SET 'utf8'");
mysqli_query($conn,"SET SESSION collation_connection ='utf8_unicode_ci'");
For procedural style lovers.
// Create connection
$conn = mysqli_connect($servername, $username, $password, $dbname);
/* change character set to utf8 */
mysqli_set_charset($conn,"utf8");
mysqli_query($conn,"SET NAMES 'latin5'");
Here $conn is your connection object.

UTF-8 problems PHP/MySQL

I've always used ISO-8859-1 encoding, but I'm now going over to UTF-8.
Unfortunately I can't get it to work.
My MySQL DB is UTF-8, my PHP document is encoded in UTF-8, I set a UTF-8 charset, but it still doesn't work.
(it is special characters like æ/ø/å that don't work)
Hope you guys can help!
Make sure the connection to your database is also using this character set:
$conn = mysql_connect($server, $username, $password);
mysql_set_charset("UTF8", $conn);
According to the documentation of mysql_set_charset at php.net:
Note:
This is the preferred way to change the charset. Using mysql_query() to execute
SET NAMES .. is not recommended.
See also: http://nl3.php.net/manual/en/function.mysql-set-charset.php
Check the character set of your current connection with:
echo mysql_client_encoding($conn);
See also: http://nl3.php.net/manual/en/function.mysql-client-encoding.php
If you have done these things and then add unusual characters to your table, you will see that they are displayed correctly.
Remember to set connection encoding to utf8 as well.
In ext\mysqli do
$mysqli->set_charset("utf8")
In ext\mysql do
mysql_set_charset("utf8")
With other db extensions you might have to run query like
SET NAMES 'utf8'
Some more details about connection encoding in MySQL
As others point out, making sure your source code is utf-8 encoded also helps. Pay special attention to not having BOM (Byte Order Mark) - it would be sent to browser before any code is executed, so using headers or sessions would become impossible.
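If you are unsure whether a file was saved with a BOM, a check along these lines can help. Both function names are hypothetical helpers, not part of any library.

```php
<?php
// Hypothetical helper: true if the given bytes start with the UTF-8 BOM (EF BB BF).
function hasUtf8Bom(string $bytes): bool
{
    return strncmp($bytes, "\xEF\xBB\xBF", 3) === 0;
}

// Hypothetical helper: returns the content with a leading BOM removed.
function stripUtf8Bom(string $bytes): string
{
    return hasUtf8Bom($bytes) ? substr($bytes, 3) : $bytes;
}
```

For example, hasUtf8Bom(file_get_contents('index.php')) would tell you whether that script (the file name is only an example) starts with a BOM.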
After connecting to db, run query SET NAMES UTF8
$db = new db(...);
$db->query('SET NAMES utf8');
and add this tag to header
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Are you getting this error: MySQL SELECT UNION "Illegal mix of collations"? Then just set your entire MySQL setup to UTF-8:
SET character_set_connection = utf8;
Try this after connecting to mysql:
mysql_query("SET NAMES 'utf8'");
And encode PHP document in UTF-8 without BOM.
I had the same problem, but now it's resolved. Here is the solution:
1st: update your table
ALTER TABLE tbl_name
DEFAULT CHARACTER SET utf8
COLLATE utf8_general_ci;
2nd:
add this in the head section of the HTML code:
Regards
Saleha A.Latif
Nowadays PDO is the recommended way to use mysql. With that you should use the connection string to set encoding. For example: "mysql:host=$host;dbname=$db;charset=utf8"
