Wrong charset in mysql [duplicate] - php

This question already has answers here:
UTF-8 all the way through
(13 answers)
Closed 8 years ago.
I have a page with forms to add or edit a entry to a mysql-database. Everything works fine, the form shows everything right, the output on the page is right.
Only when I open the mysql-database (phpMyAdmin) the entries are shown wrong. Some german letters are wrong.
For example André instead of André. Or groß instead of groß.
The Output on the page works fine. Also the edit-function in an HTML-form works fine.
I just don't understand why everything is written wrong in the database.
All my PHP-Page are in "utf-8".

Access this link:
http://dev.mysql.com/doc/refman/5.0/en/charset-applications.html
When you create a database, you can set CHARACTER SET by utf8.
Example:
CREATE DATABASE tablename
DEFAULT CHARACTER SET utf8
DEFAULT COLLATE utf8_general_ci;
Make sure your database has UTF8 char set as well
You can convert you existing table like this:
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;

Add Header to the top of your php code. This will set encoding of content.
header('Content-Type: text/html; charset=utf-8');
You can use any encoding, which is set in your database instead ofutf-8.

Related

PHP MYSQL Collation Speical Characters XML->PHP->MYSQL

I am trying to import data from an XML file into a MYSQL DB using PHP. I am able to get the code to work just fine but when I look at the data in the DB there are special characters. For example, when I look at the XML in my browser it shows up as "outdoors in good weater..." but in the DB it appears to as "outdoors in good weather…".
I've cycled through all the different types of collation for that field in my DB but it does not seem to help much. Sometimes it shows up with the characters mentioned above and others as ???.
I have also tried to sync up the data with the following code in my PHP
$mysqli->query("SET NAMES 'utf8' COLLATE 'utf8_general_ci'");
But, again I have had no luck.
Thank you for reading this and for your help!
Akshay
You need to change the character set to UTF-8, along with your collation:
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
What you are seeing is a Unicode ellipsis character (…) being converted into another character set, which is probably Latin1. That is why it looks garbled.

Webpage outputting question marks in place of unicode characters, despite character sets and collation being correct?

When I fetch a record from my database, where the database, the table, and the row are all set to utf8_unicode_ci, I recieve a question boxed in a diagonal square in place of the correct unicode character; this is despite me also setting the HTML encoding on the page with:
<meta charset="utf8">
I have a suspicion however it is to do with MySQL/PHP though because when I print_r the output the question marks are still displaying while a manually entered degree symbol (the symbol I should be seeing) works fine.
This SQL query also did nothing:
SET NAMES utf8;
Any ideas? I've checked every end of my setup.
utf8_unicode_ci is the collation, you need the character set as utf8 as example:
CREATE TABLE someTable DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_unicode_ci;
As adrienne states in their answer here:
make sure that all of the following are true:
The DB connection is using UTF-8
The DB tables are using UTF-8
The individual columns in the DB tables are using UTF-8
The data is actually stored properly in the UTF-8 encoding inside the database (often not the case if you've imported from bad sources,
or changed table or column collations)
The web page is requesting UTF-8
Apache is serving UTF-8

Page with UTF-8 encoding sends data to MySQL with UTF-8 encoding but entry is scrambled

I realize there's a dozen similar questions, but none of the solutions suggested there work in this case.
I have a PHP variable on a page, initialized as:
$hometeam="Крылья Советов"; //Cyrrilic string
When I print it out on the page, it prints out correctly. So echo $hometeam displays the string Крылья Советов, as it should.
The content meta tag in the header is set as follows:
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
And, at the very beginning of the page, I have the following (as suggested in one of the solutions found in my search):
ini_set('default_charset', 'utf-8');
So that should be all good.
The MySQL table I'm trying to save this to, and the column in question, have utf8_bin as their encoding. When I go to phpMyAdmin and manually enter Крылья Советов, it saves properly in the field.
However, when I try to save it through a query on the page, using the following basic query:
mysql_query("insert into tablename (round,hometeam) values ('1','$hometeam') ");
The mysql entry looks like this:
c390c5a1c391e282acc391e280b9c390c2bbc391c592c391c28f20c390c2a1c390c2bec390c2b2c390c2b5c391e2809ac390c2bec390c2b2
So what's going on here? If everything is ok on the page, and everything is ok with MySQL itself, where is the issue? Is there something I should add to the query itself to make it keep the string UTF-8 encoded?
Note that I have set mysql_set_charset('utf8'); after connecting to the database (at the top of the page).
EDIT: Running the query SHOW VARIABLES LIKE "%character_set%" gives the following:
Variable_name Value
character_set_client utf8
character_set_connection utf8
character_set_database latin1
character_set_filesystem binary
character_set_results utf8
character_set_server latin1
character_set_system utf8
character_sets_dir /usr/share/mysql/charsets/
Seems like there could be something here, since there are 2 latin1's in that list. What do you think?
Also, when I type a Cyrillic string directly into phpMyAdmin, it appears fine at first (it displays correctly after I save it). But reloading the table, it displays in HEX like the inserted ones. I apologize for the misinformation regarding this in the question. As it turns out, this should mean the problem is with phpMyAdmin or the database itself.
EDIT #2: this is what show create table tablename returns:
CREATE TABLE `tablename` ( `id` int(11) NOT NULL AUTO_INCREMENT, `round` int(11), `hometeam` varchar(32) COLLATE utf8_bin NOT NULL, `competition` varchar(32) CHARACTER SET latin1 NOT NULL DEFAULT 'Russia', PRIMARY KEY (`id`)) ENGINE=MyISAM AUTO_INCREMENT=119 DEFAULT CHARSET=utf8 COLLATE=utf8_bin
Do you get this hex string in phpMyAdmin? I suppose when you SELECT the inserted value by e.g. PHP or the MySQL console client, you would be given the expected cyrillic UTF8 string.
If so, it's a configuration issue with phpMyAdmin, see e.g. here: http://theyouri.blogspot.ch/2010/12/phpmyadmin-collated-db-in-utf8bin-shows.html
phpMyAdmin collated db in utf8_bin shows hex data instead of UTF8 text
$cfg['DisplayBinaryAsHex'] = false;
Moreover, please don't use mysql_query that way, since you're totally open to SQL injections. I'm also not sure if you really want to use utf8_bin, see e.g. this discussion: utf8_bin vs. utf_unicode_ci or this: UTF-8: General? Bin? Unicode?
EDIT There's something weird going on. If you translate the given hex string to UTF8 characters, you get this: "ÐšÑ€Ñ‹Ð»ÑŒÑ Ð¡Ð¾Ð²ÐµÑ‚Ð¾Ð²" (see e.g. http://software.hixie.ch/utilities/cgi/unicode-decoder/utf8-decoder). If you utf8_decode this, you get the desired "Крылья Советов". So, it seems that it's at least utf8 encoded twice (besides the problem that it somewhere shows up as hex characters).
Could you please provide the complete script? Do you utf8_encode your string anywhere? If your script is this and only this (besides a valid, opened MySQL connection):
<?php
$hometeam="Крылья Советов"; //Cyrrilic string
// open mysql connection here
mysql_set_charset('utf8');
mysql_query("INSERT INTO tablename (round, hometeam) VALUES ('1', '$hometeam')");
$result = mysql_query("SELECT * FROM tablename WHERE round = '1'");
$row = mysql_fetch_assoc($result);
echo $row['hometeam'];
?>
And you call the page, what is the result (in the page source of the browser, not what is displayed in the browser)?
Also, please check what happens if you change the collation to utf8_unicode_ci, as suggested in another answer here. That at least covers phpMyAdmin issues when displaying binary data and is propably anyway what you'll want (since you probably want ORDER BY clauses to perform as expected, see discussions in the SO questions I linked above).
EDIT2 Perhaps you could also provide some snippets like SHOW CREATE TABLE tablename or SHOW VARIABLES LIKE "%character_set%". Might help.
Also, when I type a Cyrillic string directly into phpMyAdmin, it
appears fine at first (it displays correctly after I save it). But
reloading the table, it displays in HEX like the inserted ones.
This almost certainly looks like there is a problem in your table! Run show create table tablename. I bet there is latin1 instead of utf8, because you have it set as the default in the character_set_database variable.
To change this, run the following commmand:
ALTER TABLE tbl_name CONVERT TO CHARACTER SET charset_name;
This will convert all your varchar fields to utf8. But be careful with the records you already have in the table, as they are already malformed, if you converted them to UTF8 they will stay malformed. Maybe the best idea is to create the database again, just add the following commands at the end of table definition:
CREATE TABLE `tablename` (
....
) ENGINE=<whatever you use> DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci
1) Try to save the entry to the database with the PhpMyAdmin and then also look at the result in PhpMyAdmin. Does it look OK? If yes, database is created and set up properly.
2) Try to use utf8_general_ci instead. This shouldn't matter, but give it a try.
3) Tune all necessary settings on the PHP side - follow this post:
http://blog.loftdigital.com/blog/php-utf-8-cheatsheet . Especially try this trick:
echo htmlentities($hometeam, ENT_QUOTES, 'UTF-8')
As I saw in the comments, you don't seam to be able to update your database configuration isn't it?
I guess you have a misconfiguration of the encoding because I saw that in the official documentation MySQL Documentation
I can propose you a PHP solution. Because of a lot of encoding problem you can transform the string before inserting it inside database. You have to find a common language to talk between PHP and the database.
The one I tried in an other project consist in transform string using url_encode($string) and url_decode($string).

UTF-8 problem when saving to mysql

My website is using charset iso 8859 1
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
when user post chinese character, it will be saved into the database as & # 2 0 3 2 0 ; & # 2 2 9 0 9 ; which will output as the chinese character when retrieved.
I need to set my website to UTF-8
when user post chinese character, it will be saved as some funky character in the mysql, and when retrieved, some characters are correct but some are wrong.
my question is, after i set to UTF-8, how to i make mysql save the text as # 2 0 3 2 0 ; & # 2 2 9 0 9 ; instead of funky chars.
i tried to use htmlentities. it's not working correctly.
my script is using
$message = htmlentities(strip_tags(mysql_real_escape_string($_POST['message']),'<img><vid>'));
when it's saved to mysql it is like this ä½ å¥½
when it's retrieved to be display is like this ä½ å¥½ <---- is the funky character that is stored in mysql previously
i been seeing lots of encoding related issues
the owner should google a bit before posting
general check list:
mysql table schema to use utf-8
mysql clients connection to use utf-8 mysql --default-character-set=utf8
php mysqli_set_charset to utf-8
html encoding to utf-8
putty, emac clients... to be in utf-8
You should follow ajreal's advice on setting your encodings to UTF-8.
However, from the sound of it you may already have data stored in the database which will have to be converted.
If your website is uniformly iso-8859-1 then most likely Chinese characters are stored as HTML character entities, which means that data is not not stored mis-encoded and converting the character sets should not cause problems. If you carry out the instructions and find that characters appear incorrectly afterwards, it might be because text is stored mis-encoded, in which case there are steps that can be taken to remedy the situation.
Character sets for an existing column may be converted using syntax like
ALTER TABLE TableName MODIFY ColumnName COLUMN_TYPE CHARACTER SET utf8 [NOT NULL]
where COLUMN_TYPE is one of CHAR(n), VARCHAR(n), TEXT and the square brackets indicate that NOT NULL is optional.
Edit
"my question is, after i set to UTF-8, how to i make mysql save the text as # 2 0 3 2 0 ; & # 2 2 9 0 9 ; instead of funky chars."
This might be best tackled in your scripting language rather than in MySQL. If using PHP you might be able to use htmlentities() for this purpose.
If you are trying to go to UTF8 after you are already using another encoding, try these steps:
Run ALTER TABLE tablename CONVERT TO CHARACTER SET UTF8 on your tables
Configure MySQL to default to UTF8 or just run the SET NAMES UTF8 query once you establish a database connection. You only need to run it once per connection since it sets the connection to UTF8.
Output a UTF8 header before delivering content to the browser. header('Content-Type: text/html; charset=utf-8');
Include the UTF8 content-type meta tag on your page.
meta http-equiv="Content-Type" content="text/html; charset=utf-8"
If you do all those steps, everything should work fine.
User N before your values. Like:
mysql_query ("insert into ".$mysql_table_prefix."links ( url, title) values ( N'$url ', N'$title');
Try to tell mysql to use utf before executing any query. You can put this query in any file that is used in all files. replace "utf8_swedish_ci" with your collation
$sql='SET NAMES "utf8" COLLATE "utf8_swedish_ci"';
mysql_query($sql);

Database charset conversion

I've moved database from one host to another. I've used PMA to export and bigdump to import. The whole database have latin2 charset set everywhere where it's possible. However in database, special chars (polish ąęłó, etc.) are broken. When I used SELECT i see "bushes" - "Ä�" insetad of "ą". Then I've set document encoding to utf-8... And the characters are good. How to fix this? Can it be done using CONVERT in query? I don't want to export/import database again, because it has over 200MB. What's wrong?
Every PHP/MySQL query solution will save me.
Sorry if you can't understand this, because I'm still learning english though.
If a table contains the wrong kind of charset (let's say utf-8 has slipped into latin1 column varhcar(255)):
ALTER TABLE tablename MODIFY colummname BINARY(255);
ALTER TABLE tablename MODIFY colummname VARCHAR(255) CHARSET utf8;
ALTER TABLE tablename MODIFY colummname VARCHAR(255) CHARSET latin1;
See also: http://dev.mysql.com/doc/refman/4.1/en/charset-conversion.html
However, it is more likely you just have a wrong character set in your default connection. What does a SET NAMES latin1; before selecting result in?

Categories