Trouble with encoding in MySQL - PHP

I have the following table:
CREATE TABLE IF NOT EXISTS `applications` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(45) COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
I want to store the value "España" in the "name" field.
I have a PHP file (encoded in UTF-8) with a form to save that value. When I save "España" using the PHP file and read it back from MySQL with PHP, the data looks fine.
But if I go to phpMyAdmin or MySQL Query Browser I see this: "EspaÃ±a"
If I save it from phpMyAdmin (with the encoding set to UTF-8) or MySQL Query Browser, it looks fine in those two tools, but I see "Espa�a" from PHP.
I don't understand why.
In bytes:
If it is saved from PHP I see: C3 83 C2 B1 (for ñ)
If it is saved from MQB or PMA I see: C3 B1 (for ñ)

Run mysql_set_charset() on the open connection before executing queries. Note that MySQL names the character set utf8, not UTF-8:
mysql_set_charset("utf8", $con);

The problem was the PHP MySQL client: it uses latin1 as the connection encoding. You can see this with:
echo mysql_client_encoding($con);
or
print_r(mysql_fetch_assoc(mysql_query("show variables like 'char%';")));
There are two ways to solve this:
mysql_set_charset("utf8");
or
mysql_query("SET CHARACTER SET 'UTF8'", $con); // data send by the server
mysql_query("SET NAMES 'UTF8'", $con); // data send by the client

Related

MySQL Character Set & Select Query Performance in stored procedure

Recently I noticed that a few queries are taking a very long time to execute. I checked further and found that the MySQL optimizer is applying COLLATE in the WHERE clause, and that is causing the performance issue. If I run the query below without COLLATE, I get a quick response from the database:
SELECT notification_id FROM notification
WHERE ref_table = 2
AND ref_id = NAME_CONST('v_wall_detail_id',_utf8mb4'c37e32fc-b3b5-11ec-befc-02447a44a47c' COLLATE 'utf8mb4_unicode_ci')
MySQL version 5.7
Database Character Set: utf8mb4
Column Character set: UTF8
Column Data Type: CHAR(36) UUID
From PHP, the connection object passes utf8mb4
An index is applied
This query is written in a MySQL stored procedure
SHOW CREATE TABLE
CREATE TABLE `notification` (
`notification_id` CHAR(36) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
`title` VARCHAR(500) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`created` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`notification_id`)
) ENGINE=INNODB DEFAULT CHARSET=utf8mb4
SHOW VARIABLES LIKE 'coll%';
collation_connection utf8_general_ci
collation_database utf8mb4_unicode_ci
collation_server latin1_swedish_ci
SHOW VARIABLES LIKE 'char%';
character_set_client, character_set_connection, character_set_results: utf8
character_set_database utf8mb4
character_set_server latin1
character_set_system utf8
Any suggestions on what improvements are needed to make my queries faster?
The ref_id column's character set is utf8, so I guess its collation is one of utf8_general_ci or utf8_unicode_ci. You can check this way:
SELECT collation_name from INFORMATION_SCHEMA.COLUMNS
WHERE table_schema = '...your schema...' AND table_name = 'notification'
AND column_name = 'ref_id';
You are forcing it to compare to a string with a utf8mb4 charset and collation. An index is a sorted data structure, and the sort order depends on the collation of the column. Using that index means taking advantage of the sort order to look up values rapidly, without examining every row.
When you compare the column to a string with a different collation, MySQL cannot infer that the sort order or string equivalence of your UUID constant is compatible, so it must do the string comparison the hard way, row by row.
This is not a bug, this is the intended way for collations to work. To take advantage of the index, you must compare to a string with a compatible collation.
I tested and found that the following expressions fail to use the index:
Different character set, different collation:
WHERE ref_id = _utf8mb4'c37e32fc-b3b5-11ec-befc-02447a44a47c' COLLATE utf8mb4_general_ci
WHERE ref_id = _utf8mb4'c37e32fc-b3b5-11ec-befc-02447a44a47c' COLLATE utf8mb4_unicode_ci
Same character set, different collation:
WHERE ref_id = _utf8'c37e32fc-b3b5-11ec-befc-02447a44a47c' COLLATE 'utf8_unicode_ci'
The following expressions successfully use the index:
Different character set, default collation:
WHERE ref_id = _utf8mb4'c37e32fc-b3b5-11ec-befc-02447a44a47c'
Same character set, same collation:
WHERE ref_id = _utf8'c37e32fc-b3b5-11ec-befc-02447a44a47c' COLLATE 'utf8_general_ci'
Same character set, default collation:
WHERE ref_id = _utf8'c37e32fc-b3b5-11ec-befc-02447a44a47c'
To simplify your environment, I recommend using just one character set and one collation in all tables and in your session. I suggest:
ALTER TABLE notification CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
This will rebuild the indexes on string columns, using the sort order for the specified collation.
Then using COLLATE utf8mb4_unicode_ci will be compatible, and will use the index.
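To confirm that the index is actually used after the conversion, you can run EXPLAIN on the query. A hedged check from PHP might look like this (connection details are placeholders, and it assumes the ALTER above has already been run):
<?php
$mysqli = new mysqli("localhost", "user", "password", "mydb"); // hypothetical credentials
$mysqli->set_charset("utf8mb4"); // match the converted table
$sql = "EXPLAIN SELECT notification_id FROM notification
        WHERE ref_table = 2
          AND ref_id = 'c37e32fc-b3b5-11ec-befc-02447a44a47c' COLLATE utf8mb4_unicode_ci";
$row = $mysqli->query($sql)->fetch_assoc();
echo $row['key']; // should show the index on ref_id rather than NULL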
P.S. In all cases I omitted the NAME_CONST() function, because it has no purpose in a WHERE clause as far as I know. I don't know why you are using it.
These say what the client is talking in:
collation_connection utf8_general_ci
character_set_client, Connection,Result, System: utf8
Either change them or change the various columns to match them.
If you have stored routines, they need to be dropped; then do SET NAMES to match what you picked, and re-CREATE them.
Since you are using 5.7, I recommend using utf8mb4 and utf8mb4_unicode_520_ci throughout.
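As a rough sketch of that routine rebuild from PHP (the procedure name and body here are hypothetical; the point is that the session character set and collation are set before the CREATE runs):
<?php
$mysqli = new mysqli("localhost", "user", "password", "mydb"); // placeholder credentials
$mysqli->query("SET NAMES utf8mb4 COLLATE utf8mb4_unicode_520_ci"); // match what you picked for the columns
$mysqli->query("DROP PROCEDURE IF EXISTS get_notifications"); // hypothetical routine name
$mysqli->query("
    CREATE PROCEDURE get_notifications(IN v_wall_detail_id CHAR(36))
    SELECT notification_id FROM notification
    WHERE ref_table = 2 AND ref_id = v_wall_detail_id
"); // re-created while the session is utf8mb4, so the routine picks up that charset/collation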

Convert and show data in mysql from 1252 to UTF-8

I have an old database which I must use. The problem is that the old data (mostly text) is stored in 1252 (latin1_general_ci) and is shown like ?????? on the page. I've converted the whole database and the table to UTF-8 collation like this:
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
But the problem with old records remains. I know that the queries above only change the field collations. My question is: is there a way to show those ????? records properly on the web page now?
1) Create dump
mysqldump --default-character-set=latin1 --skip-set-charset mydatabase mytable > ./mytable.sql
2) In mytable.sql, replace latin1 with utf8 in definitions like the following (a small replacement helper is sketched after this snippet)
CREATE TABLE `test` (
`id` int(10) unsigned NOT NULL,
`name` char(255) NOT NULL default '',
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
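One way to do the replacement (a small PHP helper as an illustration; a text editor or sed works just as well):
<?php
// Rewrite the charset declarations in the dump before re-importing it.
$dump = file_get_contents('mytable.sql');
$dump = str_replace('CHARSET=latin1', 'CHARSET=utf8', $dump);
$dump = str_replace('CHARACTER SET latin1', 'CHARACTER SET utf8', $dump);
file_put_contents('mytable.sql', $dump);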
3) Import DB
mysql --user=login -p --database=mydatabase < ./mytable.sql
mysqldump — A Database Backup Program

Changing MySQL DB encoding from Latin1_swedish to UTF-8

I can't convert data from Latin1_swedish to UTF-8.
The application is based on Symfony2 and the database is MySQL.
I've already tried this query:
ALTER TABLE <tablename> CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
and:
ALTER TABLE t MODIFY col1 CHAR(50) CHARACTER SET utf8;
I would like a solution that handles all the tables and columns, because the database has 1000 tables. If I had to modify them all manually, it would take too long.
Since you mentioned "weird characters", I suspect that "changing from latin1 to utf8" is not the real task, but rather to fix up some kind of mess that happened during INSERTs.
There are about 5 cases to deal with. We don't yet know which case you have. Please provide
SHOW CREATE TABLE for a table that you are trying to change.
SELECT col, HEX(col) ... for some cell that has non-ascii text.
Let's review the attempts:
ALTER TABLE <tablename> CONVERT TO CHARACTER SET utf8;
That assumes the table is declared to be latin1 and correctly contains latin1 bytes, but you would like to change it to utf8. Since 'Ă' and 'Ĺ' do not exist in latin1, this ALTER feels very wrong.
ALTER TABLE t MODIFY col1 CHAR(50) CHARACTER SET utf8;
is similar to the above, but works only one column at a time, and needs exactly the right stuff in the MODIFY clause. Hence, it would be quite tedious.
ALTER DATABASE databasename DEFAULT CHARACTER SET utf8;
merely sets the default CHARACTER SET for any new tables created in that databasename. The word DEFAULT is optional.
HEX('ĂĹ') = 'C482C4B9' -- So it looks like you are working with some Eastern European language, perhaps using utf8, perhaps not. Please provide further details. What came before and after Ă?
The fix for the "weird characters" is probably in my blog, but I need details to point you to it directly.
Have you tried this?
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
If you would like to change the entire DB, you should run this command:
ALTER DATABASE db_name DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;

Wrong character encoding with database output using laravel

I've recently started using laravel for a project I'm working on, and I'm currently having problems displaying data from my database in the correct character encoding.
My current system consists of a separate script responsible for populating the database with data, while the Laravel project is responsible for displaying the data. The view that is used is set to display all text as UTF-8, which works, as I've successfully printed special characters in the view. Text from the database, however, is not printed as UTF-8 and will not print special characters the right way. I've tried using both Eloquent models and DB::select(), but they both show the same poor result.
charset in database.php is set to utf8 while collation is set to utf8_unicode_ci.
The database table:
CREATE TABLE `RssFeedItem` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`feedId` smallint(5) unsigned NOT NULL,
`title` varchar(250) COLLATE utf8_unicode_ci NOT NULL,
`url` varchar(250) COLLATE utf8_unicode_ci NOT NULL,
`created_at` datetime NOT NULL,
`updated_at` datetime NOT NULL,
`text` mediumtext COLLATE utf8_unicode_ci,
`textSha1` varchar(250) COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `url` (`url`),
KEY `feedId` (`feedId`),
CONSTRAINT `RssFeedItem_ibfk_1` FOREIGN KEY (`feedId`) REFERENCES `RssFeed` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=6370 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
I've also set up a test page in order to see if the problem could be my database setup, but the test page prints everything just fine. The test page uses PDO to select all data, and prints it on a simple html page.
Does anyone know what the problem might be? I've tried searching around with no luck besides this link, but I haven't found anything that might help me.
I did eventually end up solving this myself. The problem was caused by the separate script responsible for populating my database with data. It was solved by running a query with SET NAMES utf8 before inserting data into the database. The original data was pulled out, and then sent back in after running said query.
The reason it worked outside Laravel was simply that said query wasn't executed on my test page. If I ran the query before retrieving the data, it came out with the wrong encoding, because the query stated that the data was encoded as utf8 when it really wasn't.
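For illustration, a minimal sketch of what the fixed populating script could look like with PDO (connection details are placeholders; the table and columns are taken from the question's schema):
<?php
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'password'); // hypothetical credentials
$pdo->exec("SET NAMES utf8"); // the query the answer describes; charset=utf8 in the DSN is equivalent
$stmt = $pdo->prepare('INSERT INTO RssFeedItem (feedId, title, url, created_at, updated_at)
                       VALUES (?, ?, ?, NOW(), NOW())');
$stmt->execute([1, 'Blåbærsyltetøy', 'https://example.com/feed/item-1']); // sample UTF-8 title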

Why is unicode not working in my MySQL table?

I have a MySQL DB table where I store addresses, including Norwegian addresses.
CREATE TABLE IF NOT EXISTS `addresses` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`street1` varchar(50) COLLATE utf8_danish_ci NOT NULL,
`street2` varchar(50) COLLATE utf8_danish_ci DEFAULT 'NULL',
`zipcode` varchar(10) COLLATE utf8_danish_ci NOT NULL,
`city` varchar(30) COLLATE utf8_danish_ci NOT NULL,
PRIMARY KEY (`id`),
KEY `index_store` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_danish_ci;
Now, this table was fine until I screwed up and accidentally set all cities = 'test'. Luckily I had another table called helper_zipcode. This table contains all zipcodes and cities for Norway.
So I updated the addresses table with data from helper_zipcode.
Unfortunately in the front end, cities like Bodø now show up like Bod�.
All æ, ø, å are now shown as � � � (but they look fine in the DB).
I'm using HTML 5, so my header looks like this:
<!DOCTYPE HTML>
<head>
<meta charset = "utf-8" />
(...)
This is not the first time I've struggled with Unicode.
What is the secret to storing Unicode characters (from Europe) in the DB and displaying them the same way when retrieved from the DB?
From the MySQL docs (user comment posted by lorenz pressler on May 2, 2006):
If you get data via PHP from your MySQL DB (everything utf-8) but still get '?' for some special characters in your browser (<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />), try this: after mysql_connect() and mysql_select_db() add this line:
mysql_query("SET NAMES utf8");
Worked for me. I tried first with utf8_encode, but this only worked for äüöéè... and so on, not for Cyrillic and other chars.
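Putting the quoted advice together with the question's schema, a minimal round trip could look like this (connection details and the zipcode are placeholders):
<?php
header('Content-Type: text/html; charset=utf-8'); // match the <meta charset="utf-8"> in the page
$con = mysql_connect("localhost", "user", "password"); // hypothetical credentials
mysql_select_db("mydb", $con);
mysql_query("SET NAMES utf8", $con); // the line the quoted comment recommends
$result = mysql_query("SELECT city FROM addresses WHERE zipcode = '8001'", $con); // hypothetical zipcode
while ($row = mysql_fetch_assoc($result)) {
    echo htmlspecialchars($row['city'], ENT_QUOTES, 'UTF-8') . "<br />"; // Bodø, not Bod�
}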
Is your problem with storing the data in MySQL, or with retrieving the stored data using PHP?
Before your first query, you need to add mysql_query("SET NAMES utf8");.
What happens if you change your browser encoding from auto-detect to UTF-8 or Unicode?
What I'm trying to determine is whether it's the database or the web browser that's wrong.
Alternatively, if you have a database tool for your MySQL database, does it show the right or wrong characters?
