Emoji in textarea does not save post - php

I have a commenting system, and when it's just text there's no problem: the comment is saved to the database. But when I add a 😄 (for instance), no comment is saved at all. Nothing saves whenever there is an emoji.
What can I do to allow emojis?
The "message" column is where I save the actual comment, and where the emoji would be.

You likely need to update the charset, and potentially the collation. I'm assuming you're using MySQL. It is very confusing, but MySQL's utf8 charset isn't actually UTF-8: it is MySQL's own subset that stores at most 3 bytes per character, so it cannot hold 4-byte characters such as emoji.
The way to handle it is to switch to real UTF-8, which in the world of MySQL is called utf8mb4. You can do so by running
ALTER DATABASE <your db name> CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
(this will affect only new tables that you create)
and
ALTER TABLE <your existing table name> CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
(this will convert an already existing table, although emojis that were already lost cannot be recovered)
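The 3-byte limitation can be seen directly in the encoding itself. A quick sketch (Python is used here purely to illustrate the byte arithmetic; the same applies to any client language):

```python
# A 4-byte character vs. a 1-byte character in real UTF-8.
# MySQL's old "utf8" stores at most 3 bytes per character,
# so the 4-byte emoji cannot fit and the INSERT fails.
smiley = "\U0001F604"  # 😄
print(len(smiley.encode("utf-8")))   # 4 -> too wide for MySQL "utf8"
print(len("a".encode("utf-8")))      # 1 -> plain ASCII is fine
```

On the PHP side, the connection must also use utf8mb4 (e.g. mysqli_set_charset($link, "utf8mb4"), or charset=utf8mb4 in the PDO DSN); otherwise the 4-byte characters are still mangled on the way in.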

Related

Bulk insert string containing Russian

I am converting a spreadsheet to a database using PHPExcel, and the cell values happen to contain Russian. If I run mb_detect_encoding() I am told the text is UTF-8, and if I send a UTF-8 header then I see the correct Russian characters.
However, if I compile it into a string (with only addslashes involved in the process) and insert it into the table, I see lots of ????. I have set the table character set to utf8mb4 and the collation to utf8mb4_general_ci. I have also run $this->db->query("SET NAMES 'utf8mb4'"); on my DB connection.
I run PDO query() with my multi-part insert and get the ???s, but if I output the query to the screen I get ÐŸÐ¾Ñ, which would be valid UTF-8. Why would this not be stored correctly in the database?
I have kept this question rather than deleting it so someone may find the answer helpful.
The reason I was struggling was that SQLyog doesn't show you the column charset by default. There is an option labelled "Hide language options" on the Alter Table view; revealing it shows that when SQLyog creates a table it uses the default server charset rather than the charset you define for the table. I'm not sure if that's correct behaviour, but the solution is simply to turn on the column charset settings and check they match what you expect.
ÐŸÐ¾ is Mojibake for По. Probably...
The bytes you have in the client are correctly encoded in utf8 (good).
You connected with SET NAMES latin1 (or set_charset('latin1') or ...), probably by default. (It should have been utf8.)
The column in the table may or may not have been CHARACTER SET utf8; it should have been.
The question marks imply...
you had utf8-encoded data (good)
SET NAMES latin1 was in effect (default, but wrong)
the column was declared CHARACTER SET latin1 (default, but wrong)
One way to help diagnose the problem(s) is to run
SELECT col, HEX(col) FROM tbl WHERE ...
For По, the hex should be D09FD0BE. Each Cyrillic character, in utf8, is hex D0xx.
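The hex claim above is easy to verify outside MySQL. A minimal sketch (Python, since the byte math is language-independent; note that MySQL's latin1 is effectively Windows-1252):

```python
good = "По"
raw = good.encode("utf-8")
print(raw.hex())             # d09fd0be -> every Cyrillic char is d0xx
# Reading those UTF-8 bytes through a latin1 connection produces the
# Mojibake seen in the question (cp1252 stands in for MySQL latin1):
print(raw.decode("cp1252"))  # ÐŸÐ¾
```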

How to set utf8 character set in magento core resource?

I am working on a bulk product import from an API response. This bulk import updates a huge amount of data using raw MySQL queries over the core resource connection.
The system receives some special characters from the API response, like the one below:
[Name] => GÄNGT M8X0.75 6H
We need to save this value as GÄNGT M8X0.75 6H.
Because it is a bulk update, we hit the MySQL database with a direct update query instead of using the native Magento adapter.
The special characters above are not converted to utf8 during the direct update. But if we use the Magento product import adapter, the value is converted and saved correctly in the MySQL database.
I have tried adding set character_set_results=utf8 on the Magento core resource connection, but no luck.
Below is what I tried:
$resource = Mage::getSingleton('core/resource');
$writeConnection = $resource->getConnection('core_write');
$writeConnection->query("set character_set_results=utf8");
$writeConnection->query($mysqlUpdateQuery);
$writeConnection->closeConnection();
Can anyone tell me what goes wrong, or what I should add or modify to get the utf8 value conversion?
Any help is much appreciated!
Ã„ is the Mojibake for utf8 Ä.
Usually Mojibake occurs when
The bytes you have in the client are correctly encoded in utf8 (good).
You connected with SET NAMES latin1 (or set_charset('latin1') or ...), probably by default. (It should have been utf8.)
The column in the table was declared CHARACTER SET latin1 (or possibly it inherited that from the table/database). (It should have been utf8.)
Since these seem to disagree with what you said, let's dig further. Please provide
SELECT col, HEX(col) FROM ... WHERE ...
GÄNGT M8X0.75 6H, if correctly stored in utf8, will have hex 47 C384 4E4754204D3858302E3735203648 (I added spaces for readability);
If stored incorrectly (in one way), the hex will be 47 C383 E2809E 4E4754204D3858302E3735203648.
Do you see either of those? Or a third hex?
With that answer, we can proceed to plan corrective actions.
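Both hex strings quoted above can be reproduced outside MySQL. A minimal sketch (Python, just for the byte arithmetic; Python's hex() prints lowercase, and cp1252 stands in for MySQL's latin1):

```python
name = "GÄNGT M8X0.75 6H"
correct = name.encode("utf-8")
print(correct.hex())  # 47c3844e4754204d3858302e3735203648
# Double-encoding: the UTF-8 bytes are misread as latin1/cp1252 and
# then UTF-8-encoded a second time on the way into the column:
double = correct.decode("cp1252").encode("utf-8")
print(double.hex())   # 47c383e2809e4e4754204d3858302e3735203648
```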
C383 E2809E was stored
That probably happened thus. And the result was "double-encoding", not "Mojibake".
The client had C384, the correct utf8 encoding for Ä.
The initialization was incorrectly set to latin1. This needs to be changed. Note that you had $writeConnection->query("set character_set_results=utf8");, which only handles the output side, not the input side. Read about SET NAMES. Change it to $writeConnection->query("SET NAMES utf8");
The column was correctly declared CHARSET utf8.
To repair the data:
UPDATE tbl SET name = CONVERT(BINARY(CONVERT(name USING latin1)) USING utf8);
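The effect of that UPDATE can be mimicked outside MySQL to convince yourself it is safe. A sketch (Python; cp1252 corresponds to MySQL's latin1):

```python
# What a utf8 SELECT shows for the double-encoded row:
stored = "GÃ„NGT M8X0.75 6H"
# CONVERT(name USING latin1)      -> back to the raw bytes
# CONVERT(BINARY(...) USING utf8) -> reinterpret those bytes as UTF-8
repaired = stored.encode("cp1252").decode("utf-8")
print(repaired)  # GÄNGT M8X0.75 6H
```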
To set the utf8 / utf8_general_ci MySQL database character set in Magento, proceed as below.
After you have created the database, you need to run this sql query:
ALTER DATABASE DB_NAME DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
Where DB_NAME is your database name.

Changed mysql charset to utf8, non-latin characters already in database now unreadable

I have several years of data in the DB, which is 99% Latin characters. Recently, I've added the following after the mysql connection:
mysqli_set_charset($link, "utf8");
Now all the existing data in the database composed of Asian, Hebrew, etc. characters is no longer readable and appears as garbage.
How can I fix the data in the DB so its readable with a utf8 charset?
The table charset was always utf8. The only thing that changed is that a charset is now set during the connection (as shown above); before, that line was absent.
The table creation is fairly basic, the collation is utf8_general_ci
CREATE TABLE `test` (
COLUMNS + INDEXES
) ENGINE=InnoDB DEFAULT CHARSET=utf8
You now have data that is double-encoded, and you are going to need to fix the data before you can read it on a connection that uses utf8 as the charset.
Here's a blog that explains in detail how to fix your data:
http://www.mysqlperformanceblog.com/2013/10/16/utf8-data-on-latin1-tables-converting-to-utf8-without-downtime-or-double-encoding/
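Before running any repair, it helps to know which rows are actually double-encoded. A rough heuristic, sketched in Python (cp1252 stands in for MySQL's latin1; this is an illustration, not production code):

```python
def looks_double_encoded(s: str) -> bool:
    """Rough check: a non-ASCII string whose cp1252 bytes form valid
    UTF-8 was probably written over a mislabeled connection."""
    if s.isascii():
        return False          # ASCII round-trips trivially; no signal
    try:
        s.encode("cp1252").decode("utf-8")
        return True
    except (UnicodeEncodeError, UnicodeDecodeError):
        return False

print(looks_double_encoded("Ã¤"))  # True  (Mojibake for "ä")
print(looks_double_encoded("ä"))   # False (already correct)
```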

Advice on converting ISO-8859-1 data to UTF-8 in MySQL

We have a very large InnoDB MySQL 5.1 database with all tables using the latin1_swedish_ci collation. We want to convert all of the data which should be in ISO-8859-1 into UTF-8. How effective would changing the collation to utf8_general_ci be, if at all?
Would we be better off writing a script to convert the data and inserting into a new table? Obviously our goal is to minimise the risk of losing any data when re-encoding.
Edit: We do have accented characters, £ symbols, etc.
If the data currently uses only Latin characters and you just want to change the character set and collation to UTF-8 to enable future addition of UTF-8 data, then there should be no problem simply changing the character set and collation. I would do it on a copy of the table first, of course.
About a week ago I had to do the same task (issues with ö, ä, å)
Created a dump.sql.
Searched and replaced all CHARSET=latin1 with CHARSET=utf8 (in the dump.sql).
Searched and replaced all COLLATE=latin1_swedish_ci with COLLATE=utf8_unicode_ci (in the dump.sql).
Created a new database with the collation utf8_unicode_ci.
Imported the dump.sql.
Altered the database's charset with alter database MY_DB charset=utf8;
and it worked perfectly
Note: after Mike Brant's remark, I think it's better to do a manual search-and-replace for the fields you specifically want. Or you can simply use ALTER for each field without needing the dump.sql at all. It didn't make much difference in my case, as most of my fields needed to be UTF-8 encoded.
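The search-and-replace steps above can be sketched as follows (Python used for illustration; the CREATE TABLE line is a made-up example of what appears in a dump.sql):

```python
# One CREATE TABLE line from a hypothetical dump.sql:
line = ("CREATE TABLE t (x INT) ENGINE=InnoDB "
        "DEFAULT CHARSET=latin1 COLLATE=latin1_swedish_ci;")
line = line.replace("CHARSET=latin1", "CHARSET=utf8")
line = line.replace("COLLATE=latin1_swedish_ci", "COLLATE=utf8_unicode_ci")
print(line)
# CREATE TABLE t (x INT) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
```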

Database charset conversion

I've moved a database from one host to another. I used phpMyAdmin to export and BigDump to import. The whole database has the latin2 charset set everywhere it is possible to set one. However, special characters (Polish ą, ę, ł, ó, etc.) are broken in the database: when I SELECT, I see "bushes" ("Ä�" instead of "ą"). When I set the document encoding to UTF-8, the characters display correctly. How can I fix this? Can it be done with CONVERT in a query? I don't want to export/import the database again, because it is over 200 MB. What's wrong?
Any PHP/MySQL query solution will save me.
Sorry if this is hard to understand; I'm still learning English.
If a table column contains the wrong kind of charset (let's say UTF-8 bytes have slipped into a latin1 VARCHAR(255) column):
ALTER TABLE tablename MODIFY columnname BINARY(255);
ALTER TABLE tablename MODIFY columnname VARCHAR(255) CHARSET utf8;
(or, if the bytes really are latin1, relabel the column as latin1 instead:)
ALTER TABLE tablename MODIFY columnname VARCHAR(255) CHARSET latin1;
See also: http://dev.mysql.com/doc/refman/4.1/en/charset-conversion.html
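The byte-relabeling trick above can be mimicked outside MySQL. A sketch in Python with the Polish ą from the question (cp1252 stands in for MySQL's latin1; the exact garbage the asker saw may differ slightly):

```python
# UTF-8 bytes for "ą" read with the wrong (cp1252/latin1) label:
garbled = "ą".encode("utf-8").decode("cp1252")
print(garbled)                                   # Ä…
# Relabeling the very same bytes as UTF-8 restores the text,
# which is exactly what the BINARY round-trip does in MySQL:
print(garbled.encode("cp1252").decode("utf-8"))  # ą
```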
However, it is more likely you just have a wrong character set in your default connection. What does a SET NAMES latin1; before selecting result in?
