I was wondering if anyone had experience of trying to get Eth to enter into a mysql database properly at all? I have a simple html form, processed using PHP4 code which stores the data in mysql, but i want to allow users to be able to use characters such as Ð, æ, ö and the like. I have tried different collations such as latin1 and utf8_unicode_ci but none seem to want to accept them properly, I either get a question mark or completely different characters.
MySQL: 5.1.30
phpMyAdmin: 3.2.4
default charset: utf8
php charset: utf8
Any help would be most appreciated. Even if it is just to say it can't be done. I realise there is a chance i can't cover for all possible characters, but until someone says "No!", then i live in hope ;)
SET NAMES utf8 query at first
and then check your table character set and pages character set if problem persists
there are three main section in your application encoding:
HTML page encoding. set by Content-type HTTP header
db table encoding. set by table definition
db client (PHP) encoding. set by SET NAMES query
Check all three parts to have proper encoding and you'll never have any problem
Related
After migration from PHP 5.3 to PHP 5.6 I have encoding problem. My MySQL database is latin1 and my PHP files are in windows-1251. Now everything is displayed like "ñëåäíèòå àäðåñè" or "�����".
It should be display something in Cyrillic like "кирилица". I've tried mysqli_set_charset but it didn't solve my problem.
First, let's see what you have in the table. Do SELECT col, HEX(col)... to see how these are encoded. Here is the HEX that should be there if it is correctly utf8-encoded:
ñëå --> C3B1C3ABC3A5; кир --> D0BAD0B8D180
If you don't get those, then the problem was on inserting, and we may (or may not) be able to repair the data. If you have C390C2BAC390C2B8C391E282AC for the Cyrillic, then you have "double encoding", and it will take some work to 'fix'.
utf8 needs to be established in about 4 places.
The column(s) in the database -- Use SHOW CREATE TABLE to verify that they are explicitly set to utf8, or defaulted from the table definition. (It is not enough to change the database default.)
The connection between the client and the server. See SET NAMES utf8.
The bytes you have. (This is probably the case.)
If you are displaying the text in a web page, check the <meta> tag.
Halfer is right. Change both your PHP and MySQL encoding, first the PHP with
mb_internal_encoding ("UTF-8");
mb_http_output("UTF-8");
to UTF-8, at the top of your PHP pages.
If you miss out the "UTF-8" and print the output from these finctions, it will show you your current PHP encoding - probably windows-1251
Also note that with MySQL you need to change the character encoding on the row in the table as well as on the table itself overall and on the database itself overall, as the defaults will remain latin1 so any new fields you add would be latin1 without being carefully checked.
If you are trying to save Cryllic text to the database you will need the correct Cryllic character set in the database, rather than latin1
I am having a big problem with character encoding accross my domains. The big thing for me really is that I don't understand it. I set all my websites to be utf-8 using the meta tag:
<meta charset="UTF-8">
Which seemed to have solved a few problems a while back. Now I am seeing problems with between the website and the database, when a user enters their first or last name, and it has an accent in it, it doesnt display correctly. However I ran the following test.
I created a test table called 'test' (imaginative I know)
I wrote a very tiny script to take the value from a text box and put it into this table and then display the contents of this table each time this page displays, so I could see what is going on.
Here are some screen shots, first, from the output of the page:
And then a screenshot of the database itself:
So the type of column first is VARCHAR(50), I just left the settings as I would do normally, and the character encoding was latin_swedish1 or something. After id 4, I changed it utf8_bin, but that still didnt make a difference.
The problem is, the data still display okay on the website, but looks terrible in the database. Is that a problem, is this how it should be done? I think the problems I the users are complaining about are when it is put into emails and PDFs etc, which I don't think I set character encoding on.
Any help and advice would be greatly welcomed.
Make sure to have the charset is utf8 when you created the table.
create table my_table (
id int primary key,
.....
) engine=innodb charset=utf8;
and make sure that your connection is setting the charset for utf8. it depends on each framework you're using.
You can set internal character to utf-8 /* Set internal character encoding to UTF-8 */
mb_internal_encoding("UTF-8"); check http://php.net/manual/en/function.mb-internal-encoding.php
Try to use utf8_general_ci. This will solve troubles.
When the3 column type was assigned, how was it assigned? I would ensure that it was along the lines of VARCHAR(n) CHARSET utf8 as that would make your column type correct for the UTF8 standard, more normally I see the UCS2 (VARCHAR(n) CHARSET ucs2) which can then cause problems later down the line.
you can set this using NVARCHAR ( see http://dev.mysql.com/doc/refman/5.0/en/charset-national.html) since sql 5.
To change the type of a column on its own you can use this type of command:
Alter table tablenamehere MODIFY columnidhere newdatatypehere;
for example
Alter table mytabelforphp MODIFY oldvarcharcolumnId nvarchar(1024);
Let me know if you need more info:)
also add your character encoding to database connection
if you have mysqli connection use :
mysqli::set_charset
if you use PDO follow this:
PDO_MYSQL DSN
I have a form in my page for users to leave a comment.
I'm currently using this charset:
meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"
but retrieveving the comment from DB accents are not displaying correct ( Ex. è =>è ).
Which parameters should i care about for a correct handling of accents?
SOLVED
changed meta tag to charset='utf-8'
changed character-set Mysql (ALTER TABLE comments CONVERT TO CHARACTER SET utf-8)
changed connection character-set both when inserting records and retrieving ($conn->query('SET NAMES utf8'))
Now accents are displaying correct
thanks
Luca
Character sets can be complicated and pain to debug when it comes to LAMP web applications. At each of the stages that one piece of software talks to another there's scope for incorrect charset translation or incorrect storage of data.
The places you need to look out for are:
- Between the browser and the web server (which you've listed already)
- Between PHP and the MySQL server
The character you've listed look like normal a European character that will be included in the ISO-8859-1 charset.
Things to check for:
even though you're specifying the character set in a meta header have a look in your browser to be sure which character set the browser is actually using. If you've specified it the browser should use that charset to render/view the page but in cases I've seen it attempting to auto-detect the correct charset and failing. Most browsers will have an "encoding" menu (perhaps under "view") that allows you to choose the charset. Ensure that it says ISO-8859-1 (Western European).
MySQL can happily support character set conversion if required to but in most cases you want to have your tables and client connection set to use the same encoding. When configured this way MySQL won't attempt to do any encoding conversion and will just write the data you input byte for byte into the table. When read it'll come out the same way byte for byte.
You've not said if you're reading data from the database back out with the same web-app or with some other client. I'd suggest you try to read it out with the same web application and using the same meta charset header (again, check the browser is really setting it) and see what is displayed in the browser.
To debug these issues requires you to be really sure about whether the client/console you're using is doing any conversion too, the safest way is sometimes to get the data into a hex editor where you can be sure that nothing else is messing around with any translation.
If it doesn't look like it's a browser-side problem please can you include the output of the following commands against your database:
Run from a connection that your web-app makes (not from some other MySQL client):
SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';
Run from any MySQL client:
SHOW CREATE TABLE myTable;
(where myTable is the table you're reading/writing data from/to)
The ISO-8859-1 character set is for Latin characters only. Try UTF-8, and make sure that the database these characters are coming from are also UTF-8 columns.
There was a table in latin1 and site in cp1252
I want to have table in utf8 and site in utf-8
I've done:
1) on web page: Content-Type: text/html;charset=utf-8
2) Mysql: ALTER TABLE XXX CONVERT TO CHARACTER SET utf8
_
This SQL doesn't work as I want - it doesn't convert ä & ü characters in database to their multibyte equivalents
Please Help.
Tanks
As this blog post says, using MySQL's ALTER TABLE CONVERT syntax is A Bad Idea [TM]. Export your data, convert the table and then reimport the data, as described in the blog post.
Another idea: Have you set your default client connection charset via /etc/my.cnf or mysqli::set-charset .
I've been a fool. SET NAMES was missing.
What I know now:
1) Every time the charset of a column is changed, the actual data is ALWAYS recoded! Change field to binary to see that.
2) The charset of a column is prior!, the table and db charset follow in the priority. They are used mainly for setting defaults. (not 100% sure about last sentence)
3) SET NAMES is very important. German characters can come in latin1 and be placed get correctly in utf8 table(recoded by Mysql silently) when you SET NAMES correctly. The server can send data to a web page in the encoding you desire, no matter what the table encoding is. It can be recoded for output
4) If there is a column in encoding A and a column in encoding B, and you compare them (or use LIKE), the Mysql will silently convert them so that it looks like they are in one encoding
5) Mysql is smart. It never operates with text as with bytes unless the type is binary. It always operates as characters! He wants that ё in latin1 would equal ё in utf8 if he knows the data encoding
Since you claim you now get s**t back, it suggests that the characters were modified in the database.
How are you accessing the data in mysql? If you are using a programming interface such as PHP, then you may need to tell that interface what character encoding to expect.
For example, in PHP you will need to call something like mysql_set_charset("utf8"); but it can also be done with an SQL query of SET NAMES utf8
You will then also need to make sure that whatever is displaying the text knows it is utf8 and is rendering with an appropriate encoding. For example, on a web page you would need to set the content type to utf-8. something like Content-Type: text/html;charset=utf-8
okee, I followed all instructions I could find here
and i could display all kinds of multilingual characters on my pages...
The problem is in phpmyadmin the japanese characters are replaced by question marks, as in a bunch of ???? ??? pieced together. I think there's a problem with my database's collation but I just wanted to verify that here.
We've had this database set before on a default collation which is latin_swedish_ci
and it already has a lot of data. Now we had to add some tables that require support for special characters, so I definitely just couldn't set the database's collation to utf8. My solution was to use utf8 only on the tables which required such support and the specific columns where we expected special characters to be contained.
But still phpmyadmin displayed them as ????.
Another question that I have is will these fields be searchable?
I mean if the field contains some japanese characters and I typed sayuri as keyword, will the japanese character equivalent to their syllables pronounced in english?
Mmm, as to your first question do you have Japanese fonts installed on your system? They aren't installed by default on most OSs, but I have no idea what your OS is. Next one is silly but are your Browser settings Ok?
Next question, the answer is no, if you search for 吉永 小百合 it wont' match with Yoshinaga Sayuri.
Note: Can you see my Japanese characters?
I had similar problem, if your headers are correct.
once you made your database connection use:
mysqli_set_charset ( $mysqli,'utf8');
or
$mysqli->set_charset("utf8");
The problem is that your connection collation is not set to utf-8 (most probably latin1), which you need to display the Japanese characters. You could set it manually by issuing the queries:
SET CHARACTER SET utf8;
SET NAMES utf8;
Or in your MySQL configuration file:
default-character-set=utf8
skip-character-set-client-handshake