I've run into an interesting behavior with PDO that relates to UTF-8 encoding issues.
When inserting data I need to declare SET NAMES utf8 for the data to be stored correctly. OK, so this is fine. BUT!
When selecting data (and fetching results) I specifically can't use SET NAMES utf8, or the characters get scrambled. If I do use SET NAMES utf8 for selects, I need to utf8_decode the result set (that is, translate it to ISO-8859-1) to see the characters correctly. This would suggest that my page is being interpreted as ISO-8859-1. However, if I reduce my page to a single-page app (it's an AngularJS / PHP Slim / PDO setup), the behavior stays the same: SET NAMES utf8 in the PDO init for $http.post, and explicitly no SET NAMES utf8 for select / $http.get. Also, if I die() and dump the data right after fetching the results from the UTF-8 collated table, I need to decode them to see the correct data.
I made this hack in my DB adapter constructor to get around it, but I would rather solve the whole issue since it has made me curious:
$init = array();
if ($write) {
    $init = array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8");
}
$pdo = new PDO('mysql:host=' . DB_HOST . ';dbname=' . DB_NAME, DB_USER, DB_PASS, $init);
Some facts:
all associated files are UTF-8.
all (search, view, insert) pages are correctly served as "text/html; charset=UTF-8" (and of course I reduced all functionality to one page and could still replicate this scenario).
headers shown in the debugger report UTF-8.
PHP detects strings (insert and search) as UTF-8 when echoed in the controller from request parameters BEFORE the SQL insert/select.
PHP ini default charset is UTF-8.
MySQL database and table charset is utf8 and the collation is utf8_general_ci.
editor is Atom.
environment is Ubuntu 14.
Has anyone else encountered similar behavior? Could this have something to do with the headers sent by Angular (XHR), since they are the defaults? Then again, the request and its params are interpreted correctly as UTF-8 on the server side, so that seems far-fetched. There is a chance this has some mystical link to the environment, since it's not a fresh VM but a local dev machine with its own tweaks.
Related
For the last couple of hours I have tried hard but failed to come up with a solution or to understand why this is happening.
I have a table with collation utf8_general_ci, and its column/field collation is also set to utf8_general_ci.
I can see the Thai data properly in phpMyAdmin, but when I fetch this table's data and show it using PHP code, it shows ???????
Click to have a look at the page.
On this page I have written one Thai word directly, which displays properly, but the same word/text shows as ??? when I fetch it from the database and display it.
I am using
<meta http-equiv="Content-Type" content="text/html; charset=tis-620">
currently, and when I tried
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
I found that both the direct text and the text from the database display as ?????
It may be possible that your MySQL connection itself is not using utf8. From the MySQL manual:
SET NAMES indicates what character set the client will use to send SQL statements
to the server. Thus, SET NAMES 'cp1251' tells the server, “future incoming
messages from this client are in character set cp1251.” It also specifies the
character set that the server should use for sending results back to the client.
(For example, it indicates what character set to use for column values if you use
a SELECT statement.)
Depending on how you are connecting to the database (mysql_connect/mysqli_connect or PDO), the steps are a little different. If you are using mysql_connect() or mysqli_connect(), you will need to run a query such as mysql_query("SET NAMES utf8"); after connecting. In theory you can use the same step with PHP PDO, but you can alternatively set the init command during PDO object construction. Here's an example from a database interaction class of mine.
$dsn = 'mysql:host=' . $database_detail['dbhost'] . ';dbname=' . $database_detail['dbname'];
$this->dbh = new PDO($dsn, $database_detail['dbuser'],
    $database_detail['dbpass'], array(
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
        PDO::ATTR_EMULATE_PREPARES => FALSE,
        PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES utf8'
    ));
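As a complement to the init command above: on PHP 5.3.6 and later the connection charset can also be given in the DSN itself, while earlier versions ignore that parameter. The sketch below only shows how such a DSN could be assembled. The helper name `build_mysql_dsn`, the host and the schema name are assumptions made for illustration, not part of the original class.

```php
<?php
// Hypothetical helper, not part of the original answer: it only assembles
// the DSN string, so it can be checked without a running MySQL server.
function build_mysql_dsn($host, $dbname, $charset = 'utf8')
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

$dsn = build_mysql_dsn('localhost', 'example_db'); // placeholder host and schema
echo $dsn, "\n";

// The DSN would then be passed to PDO together with the init command, e.g.:
//   new PDO($dsn, $user, $pass, array(
//       PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
//       PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES utf8',
//   ));
```

Setting both the DSN charset and the init command is redundant on current PHP versions, but it keeps the connection charset explicit on older ones.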
I have this problem with Spanish sometimes, and even though I set the content type in the HTML it still displays with the wrong encoding.
What works for me is saving the PHP document itself in the correct encoding; that fixes the issue.
You see, when your document gets served by Apache, or whatever you're using, it is sent in the encoding the file was saved in, but your PHP code is dynamically getting the content from the database.
So I would do the following:
Write the Thai word in HTML and get it to display correctly. Try doing this by saving the document with different encoding options. This is normally an option in the save dialog. Once you have it displaying correctly, try fetching it from the database.
Of course, you could force Apache to serve documents in the encoding you want, but this assumes that you have access to those settings.
I hope that helps.
I am using PHP 5.3.3 and MySQL 5.1.61. The column in question is using UTF-8 encoding and the PHP file is encoded in UTF-8 without BOM.
When doing a MySQLi query with a ² character in SQLyog on Windows, the query executes properly and the correct search result displays.
If I do this same exact query in PHP, it will execute but will show 0 affected_rows.
Here's what I tried:
Using LIKE instead of =
Changing the encoding of the PHP file to ANSI, UTF-8 without BOM, and UTF-8
Doing 'SET NAMES utf-8' and 'latin1' before running the query
Did header('Content-Type: text/html; charset=UTF-8'); in PHP
Escaping using MySQLi::real_escape_string
Doing a filter_var($String, FILTER_SANITIZE_STRING)
Tried a MySQLi stmt bind
The only way I could get it to work properly is if I swapped the ² for a % and changed = to LIKE in PHP.
How can I get it to query properly in PHP when using the ²?
You should be able to get the query to work by ensuring the following:
Prepping PHP for UTF-8
You first need to make sure the PHP pages that will be issuing these queries are served as UTF-8 encoded pages. This will ensure that any UTF-8 output coming from the database is displayed properly. In Firefox, you can check to see if this is the case by visiting the page you're interested in and using the View Page Info menu item. When you do so, you should see UTF-8 as the value for the page's Encoding. If the page isn't being served as UTF-8, you can do so one of two ways. Either you can set the encoding in a call to header(), like this:
header('Content-Type: text/html; charset=UTF-8');
Or, you can use a meta tag in your page's head block:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
Prepping MySQL for UTF-8
Next up, you need to make sure the database is set up to use the UTF-8 encoding. This can be set at the server, database, table, or column levels. If you're on a shared host, you probably can only control the table and column levels of your hierarchy. If you have control of the server or database, you can check to see what character encoding they are using by issuing these two commands:
SHOW VARIABLES LIKE 'character_set_server';
SHOW VARIABLES LIKE 'character_set_database';
Changing the database level encoding can be done using a command like this:
(CREATE | ALTER) DATABASE ... DEFAULT CHARACTER SET utf8;
To see what character encoding a table uses, simply do:
SHOW CREATE TABLE myTable;
Similarly, here's how to change a table-level encoding:
(CREATE | ALTER) TABLE ... DEFAULT CHARACTER SET utf8;
I recommend setting the encoding as high as you possibly can in the hierarchy. This way, you don't have to remember to manually set it for new tables. Now, if your character encoding for a table is not already set to UTF-8, you can attempt to convert it using an alter statement like this:
ALTER TABLE ... CONVERT TO CHARACTER SET utf8;
Be very careful about using this statement! If you already have UTF-8 values in your tables, they may become corrupted when you attempt to convert. There are some ways to get around this, however.
Forcing MySQLi to Use UTF-8
Finally, right after you connect to your database and before you run any queries, make sure you tell MySQLi that you are using the UTF-8 encoding. Here's how:
$db = new mysqli(DB_HOST, DB_USERNAME, DB_PASSWORD, DB_NAME);
// Change the character set to UTF-8 (have to do it early)
if (! $db->set_charset("utf8"))
{
    printf("Error loading character set utf8: %s\n", $db->error);
}
Once you do that, everything should hopefully work as expected. The only characters you need to worry about encoding are the big 5 for HTML: <, >, ', ", and &. You can handle that using the htmlspecialchars() function.
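A minimal sketch of that escaping step, assuming the value has already been fetched over a UTF-8 connection. The sample string is made up for illustration. Passing `ENT_QUOTES` and `'UTF-8'` explicitly makes the call escape both quote styles and interpret the input as UTF-8 regardless of the PHP version's defaults.

```php
<?php
// Hypothetical value standing in for a column fetched from the database.
$fromDatabase = "Tom & Jerry's <b>10 m²</b>";

// Escape the five HTML-special characters; multibyte characters such as
// the superscript two pass through untouched.
$safe = htmlspecialchars($fromDatabase, ENT_QUOTES, 'UTF-8');

echo $safe, "\n";
```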
If you want to read more (and get links to additional resources), feel free to check out the articles I wrote about this process. There are two parts: Unicode and the Web: Part 1, and Unicode and the Web: Part 2. Good luck!
I've read most of the questions and answers about this situation, but I can't fix my character problem. My database's default character set is utf8 and all the tables' collation is utf8_general_ci. I'm sure that all of the settings are utf8 and utf8_general_ci, because I've checked them countless times. The problem is that after posting a value from a form, it doesn't appear in the database the way I want, and if I edit the data from phpMyAdmin, it again doesn't show what I want when I fetch it.
The DB connection works; I edited it as suggested in earlier answers about this situation, but my script is still buggy when it comes to special characters.
The DB Connect Code is :
try {
    $db = new PDO("mysql:host={$db_server};dbname={$db_name};charset=utf-8", $db_user, $db_password, array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"));
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
} catch (PDOException $e) {
    // exc
}
Would be glad if you can help
Regards
For those who get the same error even though you are sure about the things that @zerkms mentioned:
"You need to have the same encoding in: 1) the page/form, 2) the table and column (if any) charset (not collation), 3) the DB connection. That's it. If all three of them are UTF-8, then it should work. If it doesn't, you're missing something and need to re-check each of them."
If you are still having the problem like I did, check whether your form-processing code contains a filter that you forgot about. I had a security filter that sanitized inputs, and it only handled ANSI encoding. So check everything and then it will be fine.
Thanks to all who replied.
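The original filter is not shown, so the following is only a guess at the kind of sanitizer that can cause this: a byte-oriented pattern that keeps printable ASCII and silently strips every multibyte character. Both function names are hypothetical, and the second function is just one possible UTF-8-aware alternative (it assumes the mbstring extension is available).

```php
<?php
// Hypothetical ASCII-only sanitizer: every byte outside printable ASCII is
// dropped, which destroys all multibyte UTF-8 characters.
function ascii_only_filter($input)
{
    return preg_replace('/[^\x20-\x7E]/', '', $input);
}

// Hypothetical UTF-8-aware variant: reject invalid UTF-8, then strip only
// control characters. The /u modifier makes PCRE treat the subject as UTF-8.
function utf8_safe_filter($input)
{
    if (!mb_check_encoding($input, 'UTF-8')) {
        return '';
    }
    return preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/u', '', $input);
}

$name = "Çağrı";
echo ascii_only_filter($name), "\n"; // only the ASCII letters survive
echo utf8_safe_filter($name), "\n";  // the name is kept intact
```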
Somewhere in your toolchain, something is not using utf8. phpMyAdmin is well known for this type of issue, but I cannot help you much there as I much prefer the command line or scripts for working with a database. If the output is fine everywhere except in phpMyAdmin, I can refer you to this post that offers a lot of tips relating to phpMyAdmin.
Oh, and you can specify the utf8 encoding in your instantiation call to PDO:
$con = new PDO('mysql:host=' . $server . ';dbname=' . $db . ';charset=UTF8', $user, $pass, array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"));
The most basic helpful thing you can do when setting up MySQL is adding this to your /etc/my.cnf file:
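[mysqld]
# Old servers only: these two options were removed from [mysqld] in MySQL 5.5,
# where character-set-server and collation-server below take their place.
# default-character-set=utf8
# default-collation=utf8_general_ci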
character-set-server=utf8
collation-server=utf8_general_ci
init-connect='SET NAMES utf8'
[client]
default-character-set=utf8
Editors can also play tricks on you. Some editors, when not configured properly, can switch encodings and resave a file in the wrong one, causing garbled text when it is re-opened as UTF-8. All decent IDEs and editors can be configured to handle UTF-8.
Hope this helps, good luck.
You set the charset of the connection object to UTF-8. That's good, and other settings in the database should not have any impact then.
I would check that your pages (the one that inserts and the one that displays) are both correctly encoded. There are two things to check. First, check that your page is stored UTF-8 encoded (without BOM); this is the job of your editor/IDE. Then check that you declared the encoding correctly with something like:
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
Of course, if you have already inserted data into your DB from a wrongly encoded page, the data in your DB is invalid and cannot be displayed properly, even on a correctly encoded page.
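To make that last point concrete, here is a small sketch of what such "double encoding" looks like at the byte level. It assumes the mbstring extension, and assumes the wrong encoding involved was ISO-8859-1, which is only one common case. The repair step at the end works only if every affected value was mangled in exactly this way.

```php
<?php
// "é" as correct UTF-8 (two bytes), written with escapes so the example
// does not depend on how this file itself is saved.
$correct = "\xC3\xA9";

// Simulate the mistake: the UTF-8 bytes are treated as ISO-8859-1 text and
// converted to UTF-8 a second time.
$mangled = mb_convert_encoding($correct, 'UTF-8', 'ISO-8859-1');

echo bin2hex($correct), "\n"; // c3a9
echo bin2hex($mangled), "\n"; // c383c2a9, which a UTF-8 page renders as "Ã©"

// Reversing the extra step recovers the original bytes.
$repaired = mb_convert_encoding($mangled, 'ISO-8859-1', 'UTF-8');
var_dump($repaired === $correct); // bool(true)
```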
what's up? :-)
I have one problem and I hope you can help me with it.
A friend of mine has a simple, solid HTML website, and I implemented a little PHP CRUD system for articles. The problem I came across is storing and retrieving Cyrillic characters in a MySQL database.
What I want to achieve is the following:
In the main navigation there are some separate sections, whose names, IDs and item order I want to store in MySQL, and then pull the names and output each one as a link. The names are supposed to be in Cyrillic characters.
The problem comes when I use the PHP mysql_fetch_assoc function to display names that were inserted into the database row with Cyrillic characters. The collation of the row is utf8_general_ci, and I end up with ????? instead of the original characters. If I submit Cyrillic characters to MySQL via a form, it shows something like this: У.
How can I solve this? Thanks in advance! :-)
Make sure you call this after connecting to the database.
mysql_query("SET NAMES UTF8");
Also make sure that the HTML file has its charset meta tag set to UTF-8, or send a header before any output.
header("Content-Type: text/html; charset=utf-8");
I had the same problem until I changed the collation of the column in my table to 'utf8_bin'.
If it's really mysql_fetch_assoc messing up, you should try:
mysql_set_charset()
From the docs:
Note:
This is the preferred way to change
the charset. Using mysql_query() to
execute SET NAMES .. is not
recommended.
Also make sure your files are saved as UTF-8, and check iconv_set_encoding / iconv_get_encoding.
For anyone having more complex issues with legacy project upgrades from versions before PHP 5.6 and MySQL 5.1 to PHP 7 and the latest MySQL/Percona/MariaDB, etc.:
If the project uses utf8_encode($value), you can try removing that function from the value being prepared and use the accepted answer to set UTF-8 encoding for all input.
--- OR ---
Try replacing utf8_encode($value) with mb_convert_encoding($value, 'utf-8')
PDO USERS
If you are using PDO, here are two ways to set utf8:
$options = [
    \PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES utf8'
];
new \PDO($dsn, $username, $passwd, $options);
--- OR ---
$dsn = 'mysql:host=localhost;charset=utf8';
new \PDO($dsn, $username, $passwd);
I can confirm that mb_convert_encoding($value, 'utf-8') works for Cyrillic and umlauts when writing to a SQL table that uses utf8_unicode_ci.
I've a MySQL table that has a UTF-8 charset and upon attempting to insert to it via a PHP form, the database gives the following error:
PDOStatement::execute():
SQLSTATE[HY000]: General error: 1366
Incorrect string value: '\xE8' for
column ...
The character in question is 'è', yet I don't see why this should be a problem considering the database and table are set to UTF-8.
Edit
I've tried directly from the mysql terminal and have the same problem.
Your database might be set to UTF-8, but the database connection also needs to be set to UTF-8. You should do that with a SET NAMES utf8 statement. You can use the driver_options in PDO to have it execute that as soon as you connect:
$handle = new PDO("mysql:host=localhost;dbname=dbname",
    'username', 'password',
    array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"));
Have a look at the following two links for more detailed information about making sure your entire site uses UTF-8 appropriately:
UTF-8 all the way through…
UTF8, PHP and MySQL
E8 is greater than 7F, the largest value a one-byte UTF-8 character can have: http://en.wikipedia.org/wiki/UTF-8
It seems your connection is not set to UTF-8 but to some other 8-bit encoding like ISO Latin. If you set the database to UTF-8, you only change the character set the database uses internally; connections may be on a different default value (latin1 for older MySQL versions), so you should try to send an initial SET CHARACTER SET utf8 after connecting to the database. If you have access to my.cnf you can also set the correct default value there, but keep in mind that changing the default may break any other sites/apps running on the same host.
Before passing the value to MySQL you can use the following code:
$val = mb_check_encoding($val, 'UTF-8') ? $val : utf8_encode($val);
It converts the string to UTF-8 if it isn't already, which is handy if it's a matter of only one field.
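The same idea can be sketched without utf8_encode(), which is deprecated as of PHP 8.2. The function name `to_utf8` and the default source encoding are assumptions: the code guesses that any value that is not valid UTF-8 is ISO-8859-1, which matches the `\xE8` byte from the error message but will not be right for every data source. It requires the mbstring extension.

```php
<?php
// Hypothetical helper: return the value unchanged when it is already valid
// UTF-8, otherwise convert it from an assumed single-byte source encoding.
function to_utf8($value, $assumedSource = 'ISO-8859-1')
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }
    return mb_convert_encoding($value, 'UTF-8', $assumedSource);
}

$latin1 = "\xE8";     // "è" in ISO-8859-1, the byte MySQL rejected
$utf8   = "\xC3\xA8"; // "è" in UTF-8

var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)
echo bin2hex(to_utf8($latin1)), "\n";          // c3a8
echo bin2hex(to_utf8($utf8)), "\n";            // c3a8 (left unchanged)
```

Checking first and converting second avoids re-encoding values that are already UTF-8, which is what produces the doubled-up characters seen in the other threads.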