Double UTF-8 encoding, but why? PHP/MySQL

I have some forms that insert data into a MySQL database, and for some reason the characters get double UTF-8 encoded. You don't see it on the front-end of my website, but you do in the back-end: if I look at the data in phpMyAdmin, it's double encoded.
Also, to display data entered from phpMyAdmin I have to utf8_encode() it.
If I use utf8_decode() on my data before I put it into my database, it works, but then I'd have to use utf8_encode() again to display my data properly, and I would like to find a better solution than rewriting most of my code.
The characters I'm dealing with are the Danish æ, ø and å.
I have every setting I can find in php.ini set to UTF-8, everything I can find in phpMyAdmin set to utf8, and the HTML meta tag set to UTF-8, and I still have this error to deal with.
So my question is: does anyone know why this happens, or how I could fix it?
Update: After running the MySQL code Jako suggested, data coming from the front-end is properly encoded in the database, but I still need to run utf8_encode() to display the data properly on the front-end. Any ideas?
Update 2: Even after running the code from the answer to this question I still had problems: the encoding was UTF-8 only on and off, and I suspect phpMyAdmin of resetting it somehow. I found a new way of doing things that works flawlessly, described in my answer below.
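For reference, the double encoding described above can be reproduced without a database. This is a minimal sketch, assuming the mbstring extension is enabled; the variable names are made up for illustration. It treats bytes that are already UTF-8 as ISO-8859-1 and converts them to UTF-8 a second time, which is roughly what happens when the connection charset is latin1 or when utf8_encode() is applied to text that is already UTF-8.

```php
<?php
// Sketch only: assumes ext-mbstring is available.
// "æøå" written as explicit UTF-8 bytes, so the result does not depend on how this file is saved.
$original = "\xC3\xA6\xC3\xB8\xC3\xA5";

// Misread the UTF-8 bytes as ISO-8859-1 and encode them as UTF-8 again.
$doubleEncoded = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

echo bin2hex($original), PHP_EOL;      // c3a6c3b8c3a5
echo bin2hex($doubleEncoded), PHP_EOL; // c383c2a6c383c2b8c383c2a5

// Undoing one layer (the utf8_decode() workaround) restores the original bytes.
$repaired = mb_convert_encoding($doubleEncoded, 'ISO-8859-1', 'UTF-8');
var_dump($repaired === $original);     // bool(true)
```

Every two-byte character turns into four bytes, which is why the data looks wrong in phpMyAdmin but right again once the same mistake is reversed on the way out.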

I ran into similar issues in the past and this did the trick for me.
mysql_query("SET names UTF8");

Okay, I found the answer! I searched a bit more (I had been doing so for hours) and found this question: Whether to use “SET NAMES”
Using the answer from that question, I ran these queries:
mysql_query("SET character_set_client = UTF8");
mysql_query("SET character_set_results = UTF8");
mysql_query("SET character_set_connection = UTF8");
mysql_query("SET names UTF8");
That's it, it works fine now in all my PHP scripts. Thanks again to Jako for pointing me in the right direction. ;)
Update: Since the encoding was on and off and the settings didn't stick, I found a new solution that works. I added mysql_set_charset() to my connection script:
mysql_set_charset("UTF8");
It gives me the right data from the database every time and inserts the right data as well, and it takes only one line of code.
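A note for anyone on a newer PHP: the mysql_* functions were removed in PHP 7, so the same idea has to be expressed through mysqli or PDO. The sketch below is not the code from this thread. buildUtf8Dsn() is a hypothetical helper, and the host, database name and credentials are placeholders. It shows the charset being fixed in the PDO DSN, so that every connection starts out in the right encoding.

```php
<?php
// Hypothetical helper (not part of PHP or PDO): builds a MySQL DSN with the connection charset baked in.
// utf8mb4 is an assumed default here; pass 'utf8' to match the settings used elsewhere in this thread.
function buildUtf8Dsn(string $host, string $dbName, string $charset = 'utf8mb4'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbName, $charset);
}

$dsn = buildUtf8Dsn('localhost', 'example_db'); // placeholder host and database name
echo $dsn, PHP_EOL; // mysql:host=localhost;dbname=example_db;charset=utf8mb4

// Connecting needs a real server, so these lines are left commented out:
// $pdo = new PDO($dsn, 'example_user', 'example_password');
// The mysqli equivalent of mysql_set_charset() would be:
// $db = new mysqli('localhost', 'example_user', 'example_password', 'example_db');
// $db->set_charset('utf8mb4');
```

Setting the charset through the driver rather than a raw SET NAMES query also lets the client library know the encoding, which matters for escaping.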

Related

Character Encoding - I don't understand

Okay, I don't really get character encoding at all and I'm doing research on it, but I was hoping someone could explain to me what's going on here.
In my products table, I have descriptions. In phpMyadmin, it looks like this:
I've set the encoding (just today) to utf8_general_ci. Good.
Now in the pLongDescription on one of my products, I have this:
You see that ’ there? That's some dodgy apostrophe that Word or something similar uses. It continually creeps into my life. I can't even type it on my keyboard in anything other than Word. I much prefer to use ' instead.
Anyway, I would have thought that with utf8_general_ci set, it wouldn't be a problem. If I output this normally from the database through PHP, I get this:
However, If I use utf8_encode($pDescription) I get this:
Neither of them is perfect. On one hand, I've got a bunch of horrible errors. On the other, I've got bad grammar and spelling because it has dropped the apostrophes from the description.
What is happening here and how can I fix it?
Try this:
$pDescription = mb_convert_encoding ($pDescription, "UTF-8", "UTF-8");
Don't forget to activate this extension (extension=php_mbstring.dll) in your php.ini and restart your web server.
Hope that Helps :)
mysql_query('SET CHARACTER SET "utf8"');
mysql_query('SET NAMES "utf8"');
ini_set('default_charset', 'UTF-8');
Use this :) It helped me.
Put it right after the MySQL connection is made.
And if you are outputting HTML, don't forget to set the charset:
<meta charset="utf-8">
You are mixing up concepts. What you are seeing is the "collation", which determines how data is compared and sorted. The encoding is set at table level, with DEFAULT CHARSET=utf8 (or your preferred charset) when you create the table. (It can also be set as a global default in my.ini/my.cnf.)
Then you need to tell each connection that it is a utf8 connection by invoking SET NAMES utf8 (or mysql_set_charset('utf8', $link);) first thing on every connection you make.
From then on, when you read data and output it with PHP, a utf8 byte stream is sent to the browser. Assuming you have instructed the browser to decode it as utf8 by setting the relevant Content-Type header or similar, this will work.
If you have any hard-coded text in your PHP files that uses extended characters, the PHP file must also be saved as utf8 so that it matches the output from the DB (otherwise you will render mixed encodings on your pages).
There are more things to consider; one suggested read is this article:
http://www.toptal.com/php/a-utf-8-primer-for-php-and-mysql
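As for the apostrophe that goes missing when utf8_encode() is used, one possible explanation can be sketched in a few lines. It is an assumption that the stored text holds the Word apostrophe as the single Windows-1252 byte 0x92, since the question does not show the raw bytes. The sample string is made up, and mbstring is assumed to be enabled. utf8_encode() treats its input as ISO-8859-1, where 0x92 is an invisible control character, so the apostrophe appears to vanish.

```php
<?php
// Sketch only: assumes ext-mbstring, and assumes the database value contains a Windows-1252 apostrophe.
$stored = "It\x92s"; // hypothetical stored value; 0x92 is the curly apostrophe in Windows-1252

// A lone 0x92 byte is not valid UTF-8, which is why raw output shows an error glyph.
var_dump(mb_check_encoding($stored, 'UTF-8')); // bool(false)

// What utf8_encode() effectively does: read the bytes as ISO-8859-1.
$asLatin1 = mb_convert_encoding($stored, 'UTF-8', 'ISO-8859-1');
// Reading the bytes as Windows-1252 instead maps 0x92 to U+2019 (the curly apostrophe).
$asCp1252 = mb_convert_encoding($stored, 'UTF-8', 'Windows-1252');

echo bin2hex($asLatin1), PHP_EOL; // 4974c29273   (U+0092, an invisible control character)
echo bin2hex($asCp1252), PHP_EOL; // 4974e2809973 (U+2019, the visible apostrophe)
```

If the data really is Windows-1252, converting it once with the correct source encoding and storing the result as UTF-8 is usually cleaner than converting on every page view.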

Downgrade from mysql 5.5 to 5.1, utf-8 general_ci

I am using WAMP on localhost, and its MySQL version is 5.5. After I finished my website I wanted to upload it to my shared hosting, which runs MySQL 5.1, but I can't insert Arabic letters anymore.
When I insert any field in Arabic, it gets stored as weird characters: "Ø¶ÙˆÙˆØ¹".
It was working fine on my PC, but not online.
The database is MyISAM by default, but all tables are InnoDB with utf8_general_ci.
This is also the same database I used on my machine (InnoDB by default); I imported it into my new database on the shared hosting.
So far I have tried these things after making the connection:
mysql_set_charset('utf8');
and
mysql_query("SET NAMES 'utf8'");
mysql_query("SET CHARACTER SET utf8");
mysql_query("SET COLLATION_CONNECTION = 'utf8_unicode_ci'");
What more can I do?
Provided you are using mysql_set_charset('utf8'); and not getting any errors when inserting data, you are giving the database UTF-8 correctly. On the other hand, your tables could be defined in some other charset, such as Windows-1256 (labeled 'cp1256' by MySQL), but that doesn't seem likely given the output you see, which is UTF-8 decoded as Windows-1252.
So the possibility I see is that you are echoing data from the database and seeing these strings. You need to tell the browser that the data is UTF-8 as well, before sending any output:
<?php
header("Content-Type: text/html; charset=utf-8");
If you are already doing this or this doesn't help, your data was probably incorrectly converted in the import process.
If this is the case, and you can reimport, ensure that when exporting the data comes out as utf-8, and when importing the exported data, it is treated as utf-8.
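To see what "UTF-8 decoded as Windows-1252" looks like in practice, here is a small sketch, assuming the mbstring extension is enabled and using a short Arabic string as sample input. It simulates a reader that interprets UTF-8 bytes as Windows-1252, then shows that the damage is reversible as long as the bytes themselves were never altered.

```php
<?php
// Sketch only: assumes ext-mbstring. The bytes below are the UTF-8 encoding of the Arabic text "ضووع".
$stored = "\xD8\xB6\xD9\x88\xD9\x88\xD8\xB9";

// Simulate a browser (or import tool) that reads those bytes as Windows-1252.
$misread = mb_convert_encoding($stored, 'UTF-8', 'Windows-1252');
echo $misread, PHP_EOL; // prints Ø¶ÙˆÙˆØ¹

// The stored bytes are intact, so reversing the wrong interpretation recovers them.
$recovered = mb_convert_encoding($misread, 'Windows-1252', 'UTF-8');
var_dump($recovered === $stored); // bool(true)
```

If the garbage only shows up on screen, the header above should be enough. If the same garbage is visible in the raw table data, the mis-decoding happened during the import and the data needs to be reimported or converted back.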
This happened to me once. I was using htmlentities(), but when I switched to htmlspecialchars($string, ENT_QUOTES, 'UTF-8'); the problem was solved.
I hope your problem turns out to be the same one.

MySQL, PHP, JavaScript UTF-8 Problem with swedish letters (Everything Tested - Nothing Works)

Okay, so I know this has been discussed millions of times. And I've seen billions of attempts to solve the problem. And in most cases they have. But in my case, something still doesn't want to work.
I have ALL my files encoded as UTF-8 (none missed, checked several times). And I have my whole database, tables and everything set to utf8_general_ci (I have tried utf8_unicode_ci and utf8_bin without result). But I still just get the "?-cube" everywhere instead of my Swedish letters.
I've made some tests and have come to the conclusion that it's the jQuery load and post calls that are responsible for the problem. If I load my PHP file "load-folders.php" directly, I get the correct åäöÅÄÖ. But even though the main file "index.php" is saved as UTF-8 AND has enctype utf8, when it calls the jQuery load('load-folders.php') I still get the faulty letters. And yes, jquery.js is also UTF-8. So are my stylesheets and everything else. I really don't get it. Is there anything else that might cause a problem when using jQuery to load one file into another? The standard is supposed to be UTF-8, so it should work.
Anyone out there who is able to help me - I love you.
Thanks in advance!
I had a problem exactly like this. The only thing that worked was:
mysql_query("SET NAMES utf8");
Write it just before your mysql query, like this:
mysql_query("SET NAMES utf8");
$q = mysql_query("SELECT * FROM ...") or die('Error: ' . mysql_error());
You can read comments over here: http://php.net/manual/en/function.mysql-client-encoding.php

Character encoding issues: MySQL 5.0 + PHP 5.2

I have a MySQL database with an InnoDB table containing utf8_general_ci varchar fields. When I fetch them through PHP (via PEAR::MDB2) and try to output them (via Smarty), I get ??? symbols. I would like to know how to fix that problem, which is most likely caused by PHP.
Good information to know:
It is a new version of the site I'm working on, the old version had the same problem even though it didn't use Smarty nor MDB2, so they are most likely not the cause. The old programmer used htmlentities() to remedy the problem, but I'm trying to avoid that.
The character encoding of all my files (template, source, etc.) is UTF-8 without BOM.
When I display a page, all accented characters (the ones in the templates, not the ones coming from MySQL) are shown correctly and the encoding in the browser is UTF-8. If I manually switch it over to ISO-8859-1, then the characters from MySQL are output correctly, but not the others.
Basically, it seems that PHP or MySQL transforms the UTF-8 data contained within the database to ISO-8859-1 at some point during the query/fetch process, and that is what I want to fix.
I've done a lot of searching but haven't found any solution, and I'm hoping the problem lies in a setting somewhere. I'd like to avoid having to use htmlentities() or utf8_encode(), however that might be the only way to go until PHP6 shows up.
Thank you for your input on this!
You need to execute a few queries to tell it to use UTF-8 for the connection (the default is indeed Latin-1). Here's what I use:
SET CHARACTER SET "utf8";
SET character_set_database = "utf8";
SET character_set_connection = "utf8";
SET character_set_server = "utf8";
I know some of these seem overkill, but they have been tested and do seem to work quite well...
My guess is the data wasn't utf-8-encoded when it hit the database.

Will changing collation affect my database?

I'm trying to track down a bug with some random characters appearing when saving data to our database. So far my travels have indicated that it's a character encoding issue.
I've swapped the collation on the dev to utf8_general_ci and it doesn't seem to have made a difference to the system, but I'm still unsure as to the full implications of changing collation.
I have been poking around in here, http://dev.mysql.com/doc/refman/5.0/en/charset-charsets.html and it's still not entirely clear.
I've also updated the page with the form on to include a utf-8 <meta /> tag.
The background of the issue is that when a £ is posted from the form, it runs through our SQLBuilder class, is passed through mysql_real_escape_string (deprecated I know :() and ends up in the database, and subsequently in generated config files, as Â£
As I understand it, the collation is a way for the database to compare characters, but I'm still not totally sure.
Ninja edit
Web application, posting an HTML form through a PHP class, into a MySQL DB
I usually do a mysql_query("set names utf8"); immediately after connecting to the database.
