This question already has answers here:
strange character encoding of stored data , old script is showing them fine new one doesn't
(2 answers)
How to fix double-encoded UTF8 characters (in an utf-8 table)
(4 answers)
Closed 8 years ago.
I was using PDO MySQL to insert data from PHP. What I didn't know was that I had to run "SET NAMES 'utf8'" on the connection before making every query.
So all input made in Simplified Chinese before that fix (the SET NAMES 'utf8' command) became something like '世家'.
I have used PHP's mb_detect_encoding and the encoding still appears to be UTF-8. Is there any way I can convert this data (the rows users entered before the fix was added) back to readable Simplified Chinese characters?
Below are some examples:
(first name ; last name)
"è© æ˜Œæ˜Ž" ; "蔡"
"瑩書" ; "陳"
"培熙" ; "楊"
"培熙" ; "楊"
"立" ; "王"
"光凱" ; "陳"
SOLUTION:
As mentioned in the comments, the issue is actually double-encoded UTF-8, which can be fixed with the MySQL statement below (thanks to the answer provided in How to fix double-encoded UTF8 characters (in an utf-8 table)):
UPDATE TABLE_NAME
SET FIELD_NAME = CONVERT(CAST(CONVERT(FIELD_NAME USING latin1) AS BINARY) USING utf8);
I have updated the question with this answer, because there might be someone else who does not know about 'double-encoded UTF-8' :)
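To see what that CONVERT/CAST statement is undoing, here is a small sketch of the same byte-level round trip. Python is used purely for illustration (the question itself is PHP/MySQL); the latin1 step mirrors the charset the broken connection assumed.

```python
original = "世家"

# How the data got double-encoded: the client sent UTF-8 bytes, but the
# connection treated them as latin1 text and re-encoded that wrong text
# as UTF-8 on the way into the table.
mojibake = original.encode("utf-8").decode("latin-1")

# The repair, mirroring
#   CONVERT(CAST(CONVERT(col USING latin1) AS BINARY) USING utf8):
# encode the stored text back to its raw bytes via latin1, then decode
# those bytes as the UTF-8 they always were.
repaired = mojibake.encode("latin-1").decode("utf-8")

print(repaired)  # 世家
```

The key point is that no information was lost: the original UTF-8 bytes are still in the column, just wrapped in one extra layer of encoding, which is why the fix is lossless.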
This question already has answers here:
UTF-8 all the way through
(13 answers)
Closed 2 years ago.
I'm using Slovenian characters with a PHP MySQL SELECT ... WHERE clause to fetch results from the database. The query works properly with English letters but has issues with Slovenian letters like ć, č, đ, š. I have set the database collation to utf8_slovenian_ci, and the encoding of the PHP files is UTF-8.
I don't know how Slovenian characters are processed, but try this (it worked with Arabic characters):
$string = "ćčđš";
$mysqlEscaped = mysqli_real_escape_string($link, $string); // $link is the mysqli connection; the procedural form requires it as the first argument
I send the Russian alphabet with an inline keyboard, and in callback_data I pass the letter the user selected. It looks like this:
But Telegram returns the letter to me as \xd0\xb3.
I also save a word in a MySQL db for comparison. It comes back as \u0438\u043c\u043f\u0435\u0440\u0430\u0442\u0438\u0432. The encoding in the database is utf8_general_ci.
As a result, I need to check whether the selected letter is in the word from the database. How can I do that?
MySQL never generates \u0438, which is a Unicode escape representation. It will return the 2-byte character whose hex is D0B3 (which might display as \xd0\xb3), specifically a Cyrillic character, and you should provide that same form when INSERTing into a MySQL table.
PHP's json_encode generates the \uXXXX escapes by default; pass JSON_UNESCAPED_UNICODE in the second (flags) argument to get raw UTF-8 characters instead.
To check the database, do something like:
SELECT col, HEX(col) ...
If "correct" you should get something like
г D0B3
(That's a Cyrillic GHE, not a Latin r.)
Who knows what Telegram is doing to the data. There are over a hundred packages that use MySQL under the covers; I don't know anything about this one.
Terminology: the encoding is utf8 (or could be utf8mb4). The collation, according to what you say, is utf8_general_ci. Encoding is relevant to the question; collation has to do with the ordering of strings in comparisons and sorting.
Another example: Cyrillic small letter I (и) = utf8 hex D0B8 = Unicode code point U+0438.
HTML is quite happy with Unicode code points; it will show и when given the entity &#x438;. Perhaps Telegram is converting to code points as it builds the web page?
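The D0B3-versus-\u0433 distinction is easy to check outside MySQL. A quick Python illustration (used here only because it is easy to run; json.dumps with its ensure_ascii flag plays the role of PHP's json_encode with and without JSON_UNESCAPED_UNICODE):

```python
import json

s = "г"  # CYRILLIC SMALL LETTER GHE, U+0433

# The raw bytes MySQL actually stores and returns for this character:
print(s.encode("utf-8").hex().upper())     # D0B3

# JSON encoders escape non-ASCII by default, producing the \uXXXX form
# (like PHP's json_encode without JSON_UNESCAPED_UNICODE):
print(json.dumps(s))                       # "\u0433"

# With escaping disabled you get the raw UTF-8 character back:
print(json.dumps(s, ensure_ascii=False))   # "г"
```

So if you are seeing \u0433-style output, it was produced by a JSON layer somewhere between MySQL and you, not by MySQL itself.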
This question already has answers here:
Save Data in Arabic in MySQL database
(8 answers)
Closed 6 years ago.
I am having problems storing Urdu characters in a MySQL table. Values from an HTML form are stored in the tables, but when I view them in phpMyAdmin, they show up as weird characters.
I have tried the ucs2 and utf8 character sets, but new values I store are still unknown characters.
What is the correct collation type? Or is there anything else that is wrong?
Thanks
Try setting your column's character set to utf8 with the utf8_general_ci collation (in MySQL Workbench you can do this in the column editor).
Then you will be able to save Urdu characters.
Recommend using utf8_unicode_ci for accuracy
This question already has answers here:
UTF-8 all the way through
(13 answers)
Closed 6 years ago.
Is there any way to save Bengali-language content in the database without it being mangled, so that e.g. আমার is saved exactly as আমার? Instead it is inserted as গলà§à¦ªà§‡ গলà§à¦ªà§‡ সি পà§à. It is being encoded before saving, but how can I prevent this automatic encoding?
In this case, I would prefer using urlencode and urldecode. Before the content is sent to the DB, I'll use:
urlencode($content); // %E0%A6%86%E0%A6%AE%E0%A6%BE%E0%A6%B0
And while displaying:
echo urldecode($content); // আমার
Output:
https://ideone.com/O8UaTl
https://ideone.com/TBvy1z
This way, it is safe to store. The only issue is that the URL-encoded content is longer than the original, since each non-ASCII UTF-8 byte becomes a 3-character %XX sequence.
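The same round trip can be sketched with Python's urllib.parse, a rough analogue of PHP's urlencode/urldecode, shown here only to make the size overhead concrete:

```python
from urllib.parse import quote, unquote

content = "আমার"
encoded = quote(content)

print(encoded)           # %E0%A6%86%E0%A6%AE%E0%A6%BE%E0%A6%B0
print(unquote(encoded))  # আমার

# 4 Bengali characters -> 12 UTF-8 bytes -> 36 percent-encoded characters:
print(len(content), len(encoded))  # 4 36
```

A 9x blow-up in storage is exactly why this approach is a workaround rather than a fix: it hides the charset problem instead of solving it.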
But the right way is to use utf8mb4 (or a similar Unicode encoding) in the database. E.g. in MySQL:
CREATE TABLE `whatever` (
-- Fields
) ENGINE = InnoDB CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
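One reason to prefer utf8mb4 over MySQL's legacy utf8 charset: the legacy charset stores at most 3 bytes per character, so 4-byte code points (emoji, many supplementary-plane CJK characters) cannot go into a plain utf8 column. A quick Python check of the byte lengths involved, for illustration only:

```python
# Bengali letters encode to 3 UTF-8 bytes each, so they fit in
# MySQL's legacy utf8 charset:
print(len("আ".encode("utf-8")))   # 3

# Emoji and other supplementary-plane characters need 4 bytes,
# which only utf8mb4 columns can hold:
print(len("😀".encode("utf-8")))  # 4
```

For Bengali text alone, utf8 happens to be enough, but utf8mb4 is the safe default for any user-supplied content.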
This question already has answers here:
UTF-8 all the way through
(13 answers)
Closed 6 years ago.
I want to insert some words from a European language into MySQL, e.g. á, ó, Ö, ü. (Sorry, I don't know which language it is; maybe somebody can help me fix the title.)
But they become something unreadable.
I tried this solution: How to Inserting french characters in mySQL DB table?
I found out my PHP version is 5.1.6, so I can't use mysql_set_charset (it requires PHP 5.2.3+).
The MySQL collation is utf8_general_ci and the version is 5.0.45.
How can I fix this problem?
DB changes:
change the charset in your DB, tables + columns (yes, changing the table charset is not enough; you have to do it column by column! Took me a long time to figure this out!) to UTF8
change the collation in your DB, tables + columns (really check every column) to utf8_slovak_ci
import the DATA and check that it looks fine in your DB
PHP changes:
When querying, run mysql_query("SET character_set_results = UTF8;"); prior to your SELECT/UPDATE statements