My database is latin1_swedish_ci, but all the tables that contain foreign characters (German, Turkish...) are utf8_general_ci.
Before the upgrade to php 5.6, I used
mysql_query("SET CHARACTER SET utf8;");
mysql_query("SET NAMES utf8");
before mysql_query() and everything was displayed correctly in my page (<meta http-equiv="content-type" content="text/html;charset=UTF-8" /> in page header).
After the conversion of all mysql_query(...) to mysqli_query(id,...) and running under php 5.6, all the foreign languages are now scrambled with ? and �. Switching back to php 5.4 does not help. phpMyAdmin displays the mysql database (which has not changed) correctly.
I have looked around for a solution but nothing works... am I missing something?
What do I need to change in my code to work properly?
After searching again and again...
MAMP MySQL not recognizing my.cnf values in OSX
Does MySQL included with MAMP not include a config file?
http://www.toptal.com/php/a-utf-8-primer-for-php-and-mysql
Here is the solution.
In my php scripts, I had to add a charset query after connecting to database.
$con = mysqli_connect($host, $user, $password, $db);
mysqli_set_charset($con,"utf8");
The previous charset & names I used with mysql_query() up to php 5.4 were not enough anymore.
mysqli_query($con,"SET CHARACTER SET utf8;");
mysqli_query($con,"SET NAMES utf8");
On my local server, I also had to recompile mysql after adding a my.cnf file containing the following lines :
[client]
default-character-set=utf8
[mysql]
default-character-set=utf8
[mysqld]
default-storage-engine = InnoDB
character-set-client-handshake
collation-server=utf8_general_ci
character-set-server=utf8
[mysqld_safe]
default-character-set=utf8
I also had to add the utf8 charset to MAMP by editing MAMP/Library/share/charsets/index.xml and adding the following lines:
<charset name="utf8">
<family>Unicode</family>
<description>UTF8 Unicode</description>
<alias>utf8</alias>
<collation name="utf8_general_ci" id="33">
<flag>primary</flag>
<flag>compiled</flag>
</collation>
<collation name="utf8_bin" id="83">
<flag>binary</flag>
<flag>compiled</flag>
</collation>
</charset>
On my web server, the 2 steps above were not necessary.
These gibberish characters are the result of a browser or other user-visible software package rendering utf-8 characters as if they were ASCII or Latin-1.
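A minimal Python sketch of that effect, assuming the page bytes are UTF-8 and the reader decodes them as Latin-1. The sample word is only an illustration and does not come from the question.

```python
# UTF-8 bytes for a word with one non-ASCII letter.
original = "München"
utf8_bytes = original.encode("utf-8")

# A reader that wrongly assumes Latin-1 turns the two bytes of "ü"
# into two separate characters.
misread = utf8_bytes.decode("latin-1")
print(misread)  # MÃ¼nchen

# Decoding with the character set the bytes were written in
# gives the text back unchanged.
roundtrip = utf8_bytes.decode("utf-8")
print(roundtrip)
```

The bytes never change in this sketch; only the character set used to interpret them does, which is why the stored data can be fine while the page looks scrambled.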
EDIT In the Chrome browser, you can view the encoding with which your browser is rendering your page. Click the chrome menu (three little horizontal lines) in the upper right corner. Click More Tools> Click Encoding> Then you will see a choice of character sets. Try choosing a different one.
Try putting this line into your HTML documents' <HEAD> sections.
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
That may force the browser to use the correct character set.
END edit
After your switch to mysqli, did you keep the SET CHARACTER SET and SET NAMES queries as you opened up your mysql session? If not, put them back and see if it helps.
It's possible your database is working correctly but php is telling browsers you're using Latin-1. In fact that's very likely because phpmyadmin is working properly.
Try doing the things suggested here: Setting PHP default encoding to utf-8?
There's a lot of conceptual confusion around character sets and collations. When you say
My database is latin1_swedish_ci
you mean that the default character set for a newly defined table is latin1 and the default collation is case-insensitive Swedish. These are only defaults. What counts for data storage is the character set used for each column. You say that's utf8. The collation (utf8_general_ci) only counts for searching and matching.
I had the same problem.
Not sure if this is helpful, but as a noob, there is no way I was able to follow or implement the complexities of the answer to this issue.
First, I added this to my html page:
<meta charset="utf-8"/>.
Then, I went into PHPMyAdmin, clicked on the column in which I saw Diamond-circled Question Marks in my browser, and simply reset the encoding to UTF-8 at the database level.
I hope it's helpful. Certainly was easy for me.
Related
After migration from PHP 5.3 to PHP 5.6 I have encoding problem. My MySQL database is latin1 and my PHP files are in windows-1251. Now everything is displayed like "ñëåäíèòå àäðåñè" or "�����".
It should display Cyrillic text such as "кирилица". I've tried mysqli_set_charset but it didn't solve my problem.
First, let's see what you have in the table. Do SELECT col, HEX(col)... to see how these are encoded. Here is the HEX that should be there if it is correctly utf8-encoded:
ñëå --> C3B1C3ABC3A5; кир --> D0BAD0B8D180
If you don't get those, then the problem was on inserting, and we may (or may not) be able to repair the data. If you have C390C2BAC390C2B8C391E282AC for the Cyrillic, then you have "double encoding", and it will take some work to 'fix'.
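The hex strings above can be reproduced outside MySQL. This is a small Python sketch, assuming that what MySQL calls latin1 behaves like cp1252 for these bytes; utf8_hex is a hypothetical helper name, not a MySQL or PHP function.

```python
def utf8_hex(text: str) -> str:
    """Return the UTF-8 bytes of text as uppercase hex, similar to MySQL's HEX()."""
    return text.encode("utf-8").hex().upper()

# Correctly encoded samples.
print(utf8_hex("ñëå"))  # C3B1C3ABC3A5
print(utf8_hex("кир"))  # D0BAD0B8D180

# Double encoding: the UTF-8 bytes are misread as cp1252 characters,
# and those characters are then encoded to UTF-8 a second time.
double_encoded = "кир".encode("utf-8").decode("cp1252")
print(utf8_hex(double_encoded))  # C390C2BAC390C2B8C391E282AC
```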
utf8 needs to be established in about 4 places.
The column(s) in the database -- Use SHOW CREATE TABLE to verify that they are explicitly set to utf8, or defaulted from the table definition. (It is not enough to change the database default.)
The connection between the client and the server. See SET NAMES utf8.
The bytes you have in the client, which must themselves be UTF-8-encoded. (This is probably the case.)
If you are displaying the text in a web page, check the <meta> tag.
Halfer is right. Change both your PHP and MySQL encoding to UTF-8. Start with PHP by putting these lines at the top of your PHP pages:
mb_internal_encoding("UTF-8");
mb_http_output("UTF-8");
If you leave out the "UTF-8" argument and print the return value of these functions, they will show you your current PHP encoding, probably windows-1251.
Also note that with MySQL you need to change the character encoding on the column in the table as well as on the table and on the database, because the defaults will remain latin1 and any new fields you add would be latin1 unless you check them carefully.
If you are trying to save Cyrillic text to the database, you will need a character set in the database that can hold Cyrillic, rather than latin1.
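The garbled sample in the question is consistent with windows-1251 bytes being read as Latin-1. The Python sketch below reverses that misreading. It only shows where the odd characters come from and is not a repair procedure for the database.

```python
# The garbled string as quoted in the question.
garbled = "ñëåäíèòå àäðåñè"

# Map each character back to its Latin-1 byte, then read those bytes
# as windows-1251, the encoding the PHP files are said to use.
recovered = garbled.encode("latin-1").decode("cp1251")
print(recovered)  # следните адреси

# The forward direction: windows-1251 bytes misread as Latin-1.
misread_again = recovered.encode("cp1251").decode("latin-1")
print(misread_again)
```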
 is being inserted from textareas with empty space at beginning into MySQL table, despite having the database and table set to collation of utf8_general_ci.
I have <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> in the head of the page doing the inserts. I haven't experienced this problem with other databases/tables on the same MySQL installation, and these are set to the default collation of latin1_swedish_ci.
I can resolve the issue with mysql_query("SET NAMES 'utf8'"); after the MySQL connection is established, however I'd like to know why this happens despite setting the table to utf-8 and having the charset to utf-8 in the page.
You need to properly initialize the connection with utf8 unless you can set a default somehow, which means SET NAMES is mandatory.
If you're using some 1990s style mysql_query based application that opens and closes connections at random points in your code base, without any proper framework, you might have a hard time tracking these all down.
Your PHP/HTML files should be saved with UTF-8 encoding, and your MySQL database and tables should use a utf8 collation as well. You can then try test inserts directly from your PHP file, without submitted HTML form data, to see whether it is a PHP problem or an HTML/form/browser problem.
I didn't notice it before, but the textareas contained Removing that fixes the issue and I don't need to use mysql_query("SET NAMES 'utf8'");
I created an Arabic website using PHP 5.3, MySQL 5 and PHPMyAdmin 3.4.
On every page, I use the "utf-8" character set. I include the following line on every page:
http://jsfiddle.net/Hh7mk/
The website works fine offline (on local server (localhost)). Even after I edited and inserted new Arabic writings into the database.
The problem is when the website is online. All Arabic fonts are displayed properly, but after I edited or inserted new Arabic writings into the database online, the new writings are displayed as question marks.
My settings (online):
PHPMyAdmin MySQL Connection Collation : utf8_general_ci.
PHPMyAdmin MySQL charset : UTF-8 Unicode (utf8).
The database and tables collation : utf8_general_ci.
In the connection file, I have included mysqli query SET NAMES 'utf8' and SET CHARACTER SET utf8.
I also have tried to change the collation to "cp1256_general_ci", and the pages character set to "windows-1256", but the fonts still show as question marks.
Why does the Arabic text show as question marks after I insert or edit it online? How can I fix this?
Thank you in advance
Have you tried using mysql_set_charset
mysql_set_charset("utf8");
It seems you are doing everything right, but did you set all this up before inserting the data? Try cleaning up the database and inserting from scratch.
The ASCII ? character (code 63) is what MySQL substitutes when it converts a character into a character set that cannot represent it. Having your strings turned into ? means that somewhere in your store/retrieve process the text passes through a bottleneck that only a limited character set (such as latin1 or plain ASCII) can get through, so the Arabic characters are lost. My bet is on MySQL. The best way to find out is to point your local server's code (a local phpMyAdmin will do) at your production server's MySQL. If the problem remains, the production MySQL's configuration is at fault (considering that your local phpMyAdmin works fine with your local MySQL). I can only give you ideas on how to investigate the problem. You need to find the problem itself on your own.
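A sketch of how a lossy conversion produces question marks. Python's errors="replace" is used here as a stand-in for a database converting text into a character set without Arabic letters; the sample word is an assumption chosen for illustration.

```python
arabic = "مرحبا"  # illustrative sample text

# Converting to Latin-1, which has no Arabic letters, replaces every
# character that cannot be represented with "?" (byte 63).
lossy = arabic.encode("latin-1", errors="replace")
print(lossy)

# A UTF-8 path keeps every character intact.
intact = arabic.encode("utf-8").decode("utf-8")
print(intact == arabic)  # True
```

Once the characters have been replaced this way, the original text cannot be recovered from the stored value, so the affected rows have to be inserted again after the conversion step is fixed.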
I know there are hundreds of questions about UTF-8 woes but I tried all the approaches I could find, none of them helped.
The facts:
I'm trying to read a string that contains a é from my MySQL database and display it on a PHP page. Actually, it does display as é (but the font does not recognize it as such and thus another default font is used). The troubles arose when I wanted to convert this string to a filename using PHP functions for string replacement. PHP does not recognize this as the é character at all.
Here's a quick rundown of what I'm doing:
1) The String is stored in a MySQL database. The MySQL server settings are:
MySQL connection collation utf8_unicode_ci
MySQL charset: UTF-8 Unicode (utf8)
The database itself is set to collation utf8_unicode_ci (MyISAM storage engine, not changeable due to shared server)
The actual table is set to collation utf8_unicode_ci (InnoDB storage engine)
The é shows up correctly in phpMyAdmin. The data is inserted into the DB via a Java program but I have also tried this with manually entered data (entered in phpMyAdmin).
2) The PHP default_charset is not set (NO VALUE), I'm on a shared server and placing a manual override php.ini did not seem to work. Using ini_set("default_charset", 'utf-8'); works but has no effect on the problem I have.
3) Before I run the actual select query I query SET NAMES 'utf8'. The query itself is irrelevant but for testing I chose a simple SELECT title FROM items WHERE item_id = 1
4) The PHP file itself is encoded UTF-8. I have set the correct charset for the html with <meta http-equiv="content-type" content="text/html; charset=utf-8" />
5) To test the problem I used htmlentities on the returned string (Astérix), checking the source code it is converted to Astérix which is not correct of course. Accordingly, the string shows up as Astérix in the browser.
What possible reason could there be for this? To me it seems like I set everything that can be set to UTF-8.
http://php.net/manual/en/ref.mbstring.php - look at multibyte string functions.
I am trying to debug a nasty utf-8 problem, and do not know where to start.
A page contains the word 'categorieÃ«n', which should be categorieën. Clearly something is wrong with the UTF-8. This happens with all these multibyte characters. I have scanned the gazillion topics here on UTF-8, but they mostly cover the basics, not this situation where everything appears to be configured and set correctly, but clearly is not.
The pages are served by Drupal, from a MySQL database.
The database was migrated (not by me) by SQL dumping and importing through phpMyAdmin. There is a good chance something went wrong there, because there was no problem before, and because the problem occurs only on older, imported items. Editing these items or inserting new ones, and fixing the wrongly encoded characters by hand, fixes the problem, though I cannot see a difference in the database.
Content re-edited through Drupal does not have this problem.
When I read that text using MySQL on the CLI, I get the correct ë character, both for articles that render "correct" characters and for those that render "incorrect" ones.
The tables have collation utf8_general_ci
Headers appear to be sent with correct encoding: Vary Accept-Encoding and Content-Type text/html; charset=utf-8
HTML head contains a <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
The HTTP headers tell me there is a Varnish proxy in between. Could that cause UTF-8 conversion or breakage?
Content is served gzipped, which is normal in Drupal, and I have never seen this UTF-8 issue with gzipping, but you never know.
It appears the import is the culprit and I would like to know
a) what went wrong.
b) why I cannot see a difference in the mysql cli client between "wrong" and "correct" characters
c) how to fix the database, or where to start looking and learning on how to fix it.
The dump file was probably output as UTF-8, but interpreted as latin1 during import.
The Ã«, the latin1 two-byte representation of UTF-8's ë, is physically in your tables as UTF-8 data.
Seeing as you have a mix of intact and broken data, this will be tough to fix in a general way, but usually, this dirty workaround* will work well:
UPDATE table SET column = REPLACE(column, "Ã«", "ë");
Unless you are working with languages other than Dutch, the range of broken characters should be extremely limited and you might be able to fix it with a small number of such statements.
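The same idea can be tried outside the database first. This is a hedged Python sketch, assuming the damage is exactly one round of UTF-8 bytes read as Latin-1; repair is a hypothetical helper name, and rows that are already intact are left as they are.

```python
def repair(text: str) -> str:
    """Undo one round of UTF-8-read-as-Latin-1 damage; return other text unchanged."""
    try:
        # Turn the mojibake characters back into bytes, then decode them properly.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Intact text does not survive this round trip, so keep it as it is.
        return text

print(repair("categorieÃ«n"))  # categorieën
print(repair("categorieën"))   # categorieën (already intact, unchanged)
print(repair("plain ascii"))   # plain ascii
```

The try/except is what makes a mix of intact and broken rows tolerable: intact non-ASCII text usually fails the round trip and is returned as is. It is not a guarantee, since an intact string could by coincidence form valid UTF-8, so check the results against a backup.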
Related questions with the same problem:
Detecting utf8 broken characters in MySQL
I need help fixing Broken UTF8 encoding
* (of course, don't forget to make backups before running anything like this!)
Nothing should have gone wrong in exporting and importing a Drupal dump, unless the person doing it somehow managed to set the export to something other than UTF-8. We export and import dumps a lot and have never run into such a problem.
Hopefully Pekka's answer will help you resolve the issue if it is in the DB, but you could also check whether the data shown on the web page is being run through some PHP functions that aren't multibyte-friendly.
Here are some equivalents of normal functions in mb: http://php.net/manual/en/ref.mbstring.php
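To illustrate what "not multibyte-friendly" means, here is a Python sketch that contrasts counting and cutting by bytes with counting and cutting by characters. It is an analogue of PHP's strlen/substr versus mb_strlen/mb_substr, not PHP itself.

```python
word = "categorieën"
raw = word.encode("utf-8")

# Characters versus bytes: "ë" takes two bytes in UTF-8.
print(len(word), len(raw))  # 11 12

# Cutting after 10 bytes splits "ë" in half and leaves an invalid sequence,
# which is shown as the replacement character when decoded.
cut_by_bytes = raw[:10].decode("utf-8", errors="replace")
print(cut_by_bytes)

# Cutting after 10 characters keeps "ë" whole.
cut_by_chars = word[:10]
print(cut_by_chars)  # categorieë
```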
ps. If you have recently moved your site to another server (so it's not just a db import), you should check what headers your site is sending out with a tool such as http://www.webconfs.com/http-header-check.php
Make sure the last row has UTF8 in it.
You mention that the import might be the problem. In that case it's possible that during import the connection between the client and the MySQL server wasn't using UTF-8. I've had this problem a couple of times in the past, so I'd like to share with you these MySQL settings (in my.cnf):
Under the server settings add these:
# UTF 8
default-character-set=utf8
character-set-server=utf8
collation-server=utf8_general_ci
skip-character-set-client-handshake
And under the client settings add:
default-character-set=utf8
This might save you some headache the next time.
To be absolutely sure you have utf8 from start to end:
- source code files in utf8 without BOM
- database with utf8 collation
- database tables with utf8 collation
- database connection in utf8 (query it with 'SET CHARSET UTF8')
- pages header set to utf8 (the ajax ones too)
- meta tag to set page in utf8