I'm working on a project with Zend Framework, pulling data from a UTF-8 database. The project itself is UTF-8 as well.
In a form, I have a select element displaying a list of countries. The problem is:
In French or Spanish, some countries are not displayed.
After doing a var_dump() of my country list, I saw that those were the countries with special (accented) characters.
In the var_dump output I could see the character represented as a ? in a diamond. I tried changing the encoding to iso-8859-1, and then the var_dump result showed the special characters just fine.
How come data coming from a UTF-8 database displays correctly in iso-8859-1?
Can I store iso-8859-1 characters in a UTF-8 table in MySQL without problems? Shouldn't it display messed-up characters?
Confused.
--
delimiter $$
CREATE TABLE `geo_Country` (
`CountryID` int(10) NOT NULL,
`CountryName` varchar(45) NOT NULL,
`CountryCompleteName` varchar(45) NOT NULL,
`Nationality` varchar(45) NOT NULL,
`CreationDate` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`Status` tinyint(1) NOT NULL DEFAULT '1',
`LanguageCode` char(2) NOT NULL,
`ZoneID` int(10) NOT NULL,
PRIMARY KEY (`CountryID`,`LanguageCode`),
KEY `fk_geo_Country_web_Language1` (`LanguageCode`),
KEY `fk_geo_Country_geo_Zone` (`ZoneID`),
KEY `idx_CountryName` (`CountryName`),
CONSTRAINT `fk_geo_Country_geo_Zone` FOREIGN KEY (`ZoneID`) REFERENCES `geo_Zone` (`ZoneID`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `fk_geo_Country_web_Language1` FOREIGN KEY (`LanguageCode`) REFERENCES `web_Language` (`LanguageCode`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=utf8$$
The thing to remember with UTF-8 is this:
Everything in your entire application needs to be UTF-8!
For a normal PHP/MySQL web application (a form, posting to a database), you need to check that:
1. Your database connection uses UTF-8 (execute this query right after your connection is set up: SET NAMES UTF8;)
2. Your PHP code uses UTF-8. That means no character set translation/encoding functions (there is no need for them when everything is UTF-8).
3. Your HTML output is UTF-8, either by sending a Content-Type: text/html; charset=utf-8 header, or by using a <meta charset="utf-8"> tag (for HTML5; for other HTML variants, use <meta http-equiv="Content-Type" content="text/html; charset=utf-8">)
In your case of var_dump'ing, there is just some plain text that is sent to the browser, without any mention of a character set. Looking at rule #3, this means your browser is displaying this in a different character set, presumably latin1, thus giving you the diamonds/question marks/blocks.
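To make the charset mismatch concrete, here is a minimal sketch of how the same country name looks when its bytes are interpreted under the wrong character set. Python is used only because the behaviour is byte-level and not specific to PHP; the sample string is an assumption, not data from the question.

```python
# A minimal sketch: one string, two encodings, two wrong interpretations.
text = "Côte d'Ivoire"  # sample value, assumed for illustration

latin1_bytes = text.encode("latin-1")  # bytes as a latin1 connection would return them
utf8_bytes = text.encode("utf-8")      # bytes as a utf8 connection would return them

# Bytes are latin1 but the viewer decodes them as UTF-8: the lone 0xF4 byte
# is not a valid UTF-8 sequence, so it becomes U+FFFD (the "? in a diamond").
as_utf8 = latin1_bytes.decode("utf-8", errors="replace")

# Bytes are UTF-8 but the viewer decodes them as latin1: the accented
# letter turns into two unrelated characters.
as_latin1 = utf8_bytes.decode("latin-1")

print(as_utf8)
print(as_latin1)
```

If switching the viewer to iso-8859-1 makes the output look right, the bytes being delivered are most likely latin1, which points at the connection charset rather than at the table definition.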
If you need to check whether your data is stored properly, use a database client like phpMyAdmin to view the record. This way you're viewing the content as UTF-8 (NOTE: this is a setting in PMA, so check that it is not set to a different charset!).
On a side note, set the collation of your database's text columns to utf8_general_ci. The collation is not used for storing, but for sorting, so it isn't related to your problem, but it's good practice to set it.
When connecting to the database, you should set the client encoding.
For Zend_Db it seems it should look like this (notice 'driver_options'):
$params = array(
'host' => 'localhost',
'username' => 'username',
'password' => 'password',
'dbname' => 'dbname',
'driver_options' => array(PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES UTF8;')
);
For the application.ini:
resources.db.params.charset = utf8
As a last resort, you could just run the query SET NAMES UTF8 manually, just like any other query.
Related
Inserting UTF-8 encoded string into UTF-8 encoded table gives incorrect string value.
PDOException: SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xF0\x9D\x84\x8E i...' for column 'body_value' at row 1: INSERT INTO
I have a 𝄎 character, in a string that mb_detect_encoding claims is UTF-8 encoded.
I try to insert this string into a MySQL table, which is defined as (among other things) DEFAULT CHARSET=utf8
Edit: Drupal always does SET NAMES utf8 with optional COLLATE (at least when talking to MySQL).
Edit 2: Some more details that appear to be relevant. I grab some text from a PostgreSQL database. I stick it onto an object, use mb_detect_encoding to verify that it's UTF-8, and persist the object to the database, using node_save. So while there is an HTTP request that triggers the import, the data does not come from the browser.
Edit 3: Data is denormalized over two tables:
SELECT character_set_name FROM information_schema.COLUMNS C WHERE table_schema = "[database]" AND table_name IN ("field_data_body", "field_revision_body") AND column_name = "body_value";
+--------------------+
| character_set_name |
+--------------------+
| utf8               |
| utf8               |
+--------------------+
Edit 4: Is it possible that the character is "too new"? I'm more than a little fuzzy on the relationship between Unicode and UTF-8, but this Wikipedia article implies that the character was standardized very recently.
I don't understand how that can fail with "Incorrect string value".
𝄎 (U+1D10E) is a Unicode character outside the BMP (Basic Multilingual Plane), i.e. above U+FFFF, and thus can't be represented in UTF-8 in 3 bytes. MySQL's utf8 charset only accepts UTF-8 characters that can be represented in 3 bytes. If you need to store this character in MySQL, you'll need to use the utf8mb4 charset, which requires MySQL 5.5.3 or later. You can use ALTER TABLE to change the character set without much trouble; however, since utf8mb4 needs more space per character, a couple of issues show up that may require you to reduce string sizes. See http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-upgrading.html .
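As a quick check of the 3-byte limit described above, the following sketch encodes the character and tests whether a string contains anything outside the BMP. It is written in Python purely for illustration; needs_utf8mb4 is a hypothetical helper name, not a MySQL or PHP function.

```python
def needs_utf8mb4(text: str) -> bool:
    """Return True if any character lies outside the BMP (above U+FFFF)."""
    return any(ord(ch) > 0xFFFF for ch in text)


symbol = "\U0001D10E"             # the character from the question
encoded = symbol.encode("utf-8")  # four bytes in UTF-8

print(encoded)                    # b'\xf0\x9d\x84\x8e'
print(len(encoded))               # 4
print(needs_utf8mb4(symbol))      # True
print(needs_utf8mb4("Zürich"))    # False
```

The four bytes printed here are the same ones quoted in the error message ('\xF0\x9D\x84\x8E'), which is what a 3-byte utf8 column rejects.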
To solve this issue, first change your database field to the utf8mb4 charset. For example:
ALTER TABLE `tb_name` CHANGE `field_name` `field_name` VARCHAR(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL;
Then set utf8mb4 on your db connection. For example, if you use PDO:
$db = new PDO('mysql:host=localhost;dbname=testdb;charset=utf8mb4', 'username', 'password');
or in Zend Framework 1.2, via driver_options:
$dbParam = array('host' => 'localhost', 'username' => 'db_user_name',
'password' => 'password', 'dbname' => 'db_name',
'driver_options' => array(
'1002' => "SET NAMES 'utf8mb4'", // 1002 is PDO::MYSQL_ATTR_INIT_COMMAND
'12' => 0 // this is not necessary
)
);
In your PDO connection, set the charset.
new PDO('mysql:host=localhost;dbname=the_db;charset=utf8mb4', $user, $password);
I fixed the error:
SQLSTATE[HY000]: General error: 1366 Incorrect string value ......
with this method:
Use utf8mb4_unicode_ci for the database
Set utf8mb4_unicode_ci for all tables
Set the longblob datatype for the column (not text, longtext...; you need a big datatype to store your 4-byte content)
It is okay now.
If you use Laravel, also edit config/database.php:
'charset' => 'utf8mb4',
'collation' => 'utf8mb4_unicode_ci',
If you use the strtolower function, replace it with mb_strtolower
Note: you have to put <meta charset="utf-8"> in your head tag
I have data with accents in my database, as in the image below.
When I display the data with my controller, it gives me this.
Here is the code:
header('Content-Type: text/html; charset=utf-8');
$icozim=$this->Icozim->find('all');
debug($icozim,0,0);
When I run my function, I get this.
How can I solve this problem?
The SQL for my table is:
CREATE TABLE `icozims` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`synonymes` MEDIUMTEXT NOT NULL,
PRIMARY KEY (`id`)
)
COLLATE='utf8_general_ci'
ENGINE=InnoDB;
In your APP/Config/database.php look for the line:
// 'encoding' => 'utf8',
and uncomment it.
There are several things you have to cover. All of these encodings have to be the same (set consistently); then you can retrieve symbols or special characters properly.
The encoding you are using for the database
The encoding you are using for the table
The encoding declared in your HTML header
The encoding of your PHP code whenever you are retrieving, printing or saving your data
There are also some functions you can play around with that do the conversion for you,
e.g. mb_convert_encoding($value, 'UTF-8', 'HTML-ENTITIES')
If you are using any kind of framework, you might need to set the encoding in its core class.
If your data has already been saved with this specific symbol, then you will need to edit all of it or write a function to convert it to the symbol or character you want to show. I remember having the same problem in one of my WordPress projects, where we removed them all manually; afterwards I found an article about a plugin that sorts out the problem automatically.
Hi, I am developing a mobile app using PhoneGap and I am querying the MySQL database through Ajax (JSONP). However, I have an issue when special characters are returned: they are displayed as "?" instead of, for example, Ż.
At the moment I have added this in my PHP, but it did not do the trick:
header('content-type: application/json; charset=UTF-8');
Is anyone aware of any other charset that can be used which includes special characters like the above?
First things first
a) Fix the db tables
Make sure that the tables are defined with the proper character set,
e.g.
CREATE TABLE `types` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=0 DEFAULT CHARSET=utf8
b) After connecting to the db, run the following
SET NAMES 'utf8';
[I also run the following]
SET character_set_client ='utf8',
character_set_connection ='utf8',
character_set_database ='utf8',
character_set_results ='utf8',
character_set_server ='utf8',
collation_connection ='utf8_general_ci',
collation_database ='utf8_general_ci',
collation_server ='utf8_general_ci'
c) Finally, set the proper content type for the HTML page
Hope this helps.
You're working with a MySQL database? So try to set a utf8 charset to the database connection like:
$conn = mysql_connect('localhost','user1','pass1',TRUE);
mysql_set_charset('utf8',$conn);
Or try UTF-8 encoding
string utf8_encode ( string $data )
Parameters:
data
An ISO-8859-1 string.
Return Values:
Returns the UTF-8 translation of data.
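A minimal sketch of what this kind of conversion does at the byte level, written in Python only to illustrate (latin1_to_utf8 and is_valid_utf8 are hypothetical helper names, not PHP functions). It also shows why the conversion should only be applied to data that really is ISO-8859-1: running it on bytes that are already UTF-8 encodes them twice.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def latin1_to_utf8(data: bytes) -> bytes:
    """Reinterpret ISO-8859-1 bytes as text and re-encode them as UTF-8."""
    return data.decode("latin-1").encode("utf-8")


latin1_value = "café".encode("latin-1")    # b'caf\xe9'
converted = latin1_to_utf8(latin1_value)   # b'caf\xc3\xa9'
double_encoded = latin1_to_utf8(converted) # b'caf\xc3\x83\xc2\xa9'

print(is_valid_utf8(latin1_value))  # False
print(is_valid_utf8(converted))     # True
print(double_encoded.decode("utf-8"))
```

Checking validity first is only a heuristic: it tells you the bytes are well-formed UTF-8, not that they were intended as UTF-8.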
According to the JSON standard, JSON text must be encoded in a Unicode encoding (UTF), the default being UTF-8.
But you can also use the other UTF encodings, such as UTF-32BE, UTF-16BE, UTF-32LE, UTF-16LE.
For details, see the IETF standard.
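As an illustration of the point about JSON encodings, the sketch below serialises the character from the question in two ways: as a pure-ASCII \u escape and as raw UTF-8 bytes. It uses Python's standard json module rather than the asker's PHP code, and the "name" key is an assumption made for the example.

```python
import json

name = "\u017b"  # the character Ż from the question

# ASCII-only output: the character is written as a \u escape, so the
# payload survives even if the transport charset is wrong.
escaped = json.dumps({"name": name})

# Raw UTF-8 output: smaller, but every layer must agree on UTF-8.
raw = json.dumps({"name": name}, ensure_ascii=False).encode("utf-8")

print(escaped)  # {"name": "\u017b"}
print(raw)      # b'{"name": "\xc5\xbb"}'

# Both forms decode to the same value.
same = json.loads(escaped) == json.loads(raw.decode("utf-8"))
print(same)     # True
```

If the escaped form arrives intact but the character is still a "?", the value was probably already damaged before serialisation, for example by the database connection charset.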
I've set up a post validator in my symfony form to stop duplication of primary keys.
A primary key is a two-character string in this instance. Code used to validate:
$this->mergePostValidator(new sfValidatorDoctrineUnique(array(
'model' => 'Manufacturers',
'column' => 'id',
'primary_key' => 'id'
)));
The primary key is uppercase (for example AU). Bizarrely, the post validator triggers successfully if lowercase 'au' is entered into the field (i.e. it stops the value from going to the database and triggering a 500 integrity constraint error), but if it is entered correctly as 'AU', the validator doesn't seem to notice the duplication.
Any thoughts?
That's not a symfony sfDoctrineValidator issue. All this validator does is search your database for an existing record. If you are using a "_ci" (case-insensitive) collation (are you using MySQL?), the search returns nothing and the validator is fooled.
Then, when you insert the duplicate, you get an exception from the database. Try changing the collation of your table like this:
ALTER TABLE `table` DEFAULT CHARACTER SET utf8 COLLATE utf8_bin
(you should tell Doctrine to do it for you:
MyTable:
options: { collate: utf8_bin, charset: utf8 }
)
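To illustrate the difference between the two kinds of collation, here is a rough model in Python: a "_ci" comparison ignores case, while a "_bin" comparison looks at the exact bytes. This is a simplification (real collations also handle accents and ordering), and the helper names and sample keys are hypothetical.

```python
existing_ids = ["AU", "DE", "FR"]  # sample primary keys, assumed for illustration


def exists_ci(value: str, ids: list) -> bool:
    """Rough model of a case-insensitive (_ci) lookup."""
    return value.casefold() in {item.casefold() for item in ids}


def exists_bin(value: str, ids: list) -> bool:
    """Rough model of a binary (_bin) lookup: exact byte comparison."""
    return value.encode("utf-8") in {item.encode("utf-8") for item in ids}


print(exists_ci("au", existing_ids))   # True
print(exists_bin("au", existing_ids))  # False
print(exists_bin("AU", existing_ids))  # True
```

Whichever collation is chosen, the check done by the validator and the uniqueness rule enforced by the database should compare values the same way; otherwise one of them will accept what the other rejects.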
I'm reading a UTF-8 encoded file using PHP and splatting the contents directly into a database. The problem is that when I encounter a character such as ’, it places â€ into the database.
How can I encode this correctly? I'm reading a UTF-8 file and my database column's collation is UTF-8. What am I doing wrong? Is there a nice function I'm missing? Any help is welcome.
This is my table:
CREATE TABLE tblProductData (
intProductDataId int(10) unsigned NOT NULL AUTO_INCREMENT,
strProductName varchar(50) NOT NULL,
strProductDesc varchar(255) NOT NULL,
strProductCode varchar(10) NOT NULL,
dtmAdded datetime DEFAULT NULL,
dtmDiscontinued datetime DEFAULT NULL,
stmTimestamp timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (intProductDataId),
UNIQUE KEY (strProductCode)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE utf8_unicode_ci;
EDIT:
I'm reading the data like this:
$hFile = @fopen($FileName, "r") or exit("\nUnable to open file: " . $FileName);
if($hFile)
{
    // fgets() returns false at end of file, so test its result directly
    while(($Line = fgets($hFile)) !== false)
    {
        $this->Products[] = new Product($Line);
    }
    fclose($hFile);
}
Use
mysql_query("SET NAMES utf8");
just after connecting to the DB, and be sure that the browser encoding is UTF-8 too:
header("Content-Type: text/html; charset=utf-8");
You should set your connection encoding with this query
SET NAMES 'utf8'
before storing any data.
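The symptom in the question is what typically happens when UTF-8 bytes are read as a single-byte charset and then stored again. A minimal sketch of that round trip, assuming the connection treated the bytes as cp1252 (the charset MySQL calls latin1); Python is used for illustration only, and the variable names are hypothetical.

```python
original = "\u2019"                      # right single quotation mark

utf8_bytes = original.encode("utf-8")    # b'\xe2\x80\x99' as read from the file
misread = utf8_bytes.decode("cp1252")    # three characters instead of one
stored = misread.encode("utf-8")         # what ends up in the utf8 column

print(misread)      # â€™
print(len(stored))  # 8

# The damage is reversible as long as no bytes were lost on the way.
repaired = stored.decode("utf-8").encode("cp1252").decode("utf-8")
print(repaired == original)  # True
```

Setting the connection charset before inserting avoids the misread step altogether, so no repair is needed afterwards.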
Also keep in mind that some database GUIs or web GUIs (e.g. phpMyAdmin) show the wrong encoding even if your data is encoded correctly. This happens, for example, with Sequel Pro on Mac and with phpMyAdmin in some environments.
You should trust your browser, i.e. show your inserted content in a page which has the
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
header and see if the data is shown correctly. Or, even better, trust the mysql command line from the shell:
echo 'SELECT yourdata FROM your_table' | mysql -uuser -pyourpwd db_name