This is on my Windows test platform.
I have the following CSV:
You have signed out successfully!,ar,لقد خرجت بنجاح!
I have the following table definition:
CREATE TABLE `translations` (
`sourcephrase` varchar(250) NOT NULL,
`language` char(5) NOT NULL,
`translatedphrase` varchar(250) CHARACTER SET utf8 DEFAULT NULL,
PRIMARY KEY (`sourcephrase`,`language`),
KEY `language` (`language`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
If I load this CSV into the table (via MySQL Workbench's CSV import), the data arrives intact:
sourcephrase, language, translation
You have signed out successfully! ar لقد خرجت بنجاح!
If instead I run this php code (where psquery is just execute a prepared statement):
$sourcephrase="You have signed out successfully!";
$language="ar";
$translated="لقد خرجت بنجاح!";
$sql = "insert into translations (sourcephrase, language, translatedphrase) values (?,?,?)";
$this->DB->psquery($sql, array("sss", $sourcephrase, $language, $translated));
The table contains the following data:
You have signed out successfully! ar لقد خرجت بنجاØ!
Why am I getting a different result in PHP? (I know it's something UTF-8 related, but I can't see what.) I don't believe it's MySQL related, as the CSV import works just fine.
لقد خرجت بنجاØ! is Mojibake for the desired string. See this for the likely causes, best practice, and debugging techniques.
Probably this item is relevant to your PHP connection: "The connection when INSERTing and SELECTing text needs to specify utf8 or utf8mb4."
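As a rough illustration of where the stray Ø comes from (a sketch only, assuming the mbstring extension and PHP 7+ for the \u{...} escape; it does not touch the database): the final Arabic letter ح is two bytes in UTF-8, and treating those two bytes as latin1 characters yields Ø followed by an invisible soft hyphen.

```php
<?php
// The Arabic letter hah (U+062D), written as an escape so the file encoding does not matter.
$hah = "\u{062D}";

// In UTF-8 it occupies two bytes: D8 AD.
$utf8Hex = bin2hex($hah);

// Simulate a connection that treats those bytes as latin1 (ISO-8859-1)
// and converts each byte to UTF-8 separately.
$mojibake = mb_convert_encoding($hah, 'UTF-8', 'ISO-8859-1');

echo $utf8Hex, "\n";            // d8ad
echo bin2hex($mojibake), "\n";  // c398c2ad (U+00D8 'Ø' followed by U+00AD, a soft hyphen)
```

With mysqli, calling $mysqli->set_charset('utf8mb4') (or 'utf8') right after connecting is the usual way to make the connection charset match what PHP actually sends.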
So I was having an issue with json_encode returning null that I found the solution for here, but I don't understand why it was an issue in the first place. The MySQL tables from which I was drawing the data are defined like
CREATE TABLE `super_table` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(100) DEFAULT NULL,
`values` text,
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
so shouldn't values and name be utf8 encoded already when I pull them out in PHP? A simplified version of what I'm doing:
$sql = "SELECT * FROM super_table";
$query = $mysqli->query($sql);
$data = (object) $query->fetch_all(MYSQLI_ASSOC);
foreach ($data as $key => $val) {
$data->$key = utf8_encode($val);
}
$result = array('success'=>$success, 'data'=>$data);
echo json_encode($result);
Why does skipping the extra utf8_encode step sometimes yield a null result when I try to json_encode the data?
Indeed, MySQL's utf8 is not full UTF-8: it is a subset that covers only the BMP and does not support 4-byte characters (utf8mb4 does). That is not your issue here, though. It only means that characters outside the BMP cannot be stored; everything else is still UTF-8 compatible.
Your actual issue is that MySQL is just storing the data in utf8, but that says nothing about how you will receive the data in your database client. MySQL converts text on the fly from the stored encoding to the connection encoding (and vice versa). When connecting to the database in your PHP code, you can choose which encoding you prefer to receive your data in. Use $mysqli->set_charset('utf8') to retrieve your data in UTF-8.
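The failure itself can be reproduced without a database. This is a minimal sketch, assuming the mbstring extension and a reasonably current PHP; the "café" value is made up for illustration. On current PHP versions json_encode() returns false for the whole structure when it meets invalid UTF-8, while versions before 5.5 could yield null instead.

```php
<?php
// "café" as latin1 bytes: the é is the single byte 0xE9, which is not valid UTF-8.
$latin1 = "caf\xE9";

// json_encode() only accepts valid UTF-8, so this fails.
$failed = json_encode(['name' => $latin1]);
$error  = json_last_error();

// Converting the bytes to UTF-8 first lets the encode succeed.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
$ok   = json_encode(['name' => $utf8]);

var_dump($failed);                      // bool(false)
var_dump($error === JSON_ERROR_UTF8);   // bool(true)
echo $ok, "\n";                         // {"name":"caf\u00e9"}
```

Setting the connection charset makes the conversion step unnecessary, because MySQL then hands over UTF-8 in the first place.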
Once again I'm having problems saving special characters into a database. After a lot of searching I still could not find a solution, so I am starting a new thread.
I have a MySQL DB using the UTF-8 character set and a PHP application that reads data from XML files into the DB. Earlier I had problems with Estonian characters, which I managed to solve. For example, &scaron; (š) is in the XML as the HTML entity &eth; and is converted in PHP to &#353;. Earlier in the PHP script I run the MySQL query "SET NAMES utf8", and š saves into the DB correctly.
Now I'm fighting with Lithuanian characters, for example ų (&#371;), which is in the XML file as the numeric entity &#371;. I am not doing any conversion for this in PHP: since converting &eth; to &#353; works for scaron, I assumed &#371; would save into the DB as ų without any PHP conversion. After saving, it appears in the DB as a question mark, and if I try mb_convert_encoding() or html_entity_decode() the result is ų.
Any advice?
You simply need to make sure your table has the correct encoding and run SET NAMES right after connecting.
I've prepared a simple test. Try running it to make sure everything works.
1) Create a database named testencoding and import the following code into it:
CREATE TABLE IF NOT EXISTS `sample` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`value` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=1;
2) Create a simple PHP script with the following content and run it:
<?php
header('Content-Type: text/html; charset=utf-8');
mb_internal_encoding('utf-8');
$subjectvalue='ų ų';
$link = mysqli_connect("localhost","root","","testencoding");
mysqli_query($link,"SET NAMES 'utf8'");
mysqli_query($link,"INSERT INTO sample(`value`) VALUES('".mysqli_real_escape_string($link,$subjectvalue)."')");
$result = mysqli_query($link, "SELECT * FROM sample");
echo "<br /><br />Data from database<br /><br />";
while ($data = mysqli_fetch_assoc($result)) {
echo $data['id'].' '.$data['value']."<br />";
}
3) On my PC all results are as expected:
As output from PHP file I have:
Data from database
1 ų ų
In phpMyAdmin I have:
ų ų
So everything works fine. Try it and compare with my results.
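On the question's &#371; point: MySQL stores whatever text it is sent and does not interpret HTML entities, so an entity has to be decoded in PHP if the character itself should end up in the table. A minimal sketch, assuming the value reaches PHP still in entity form; the explicit 'UTF-8' argument matters because PHP versions before 5.4 defaulted to ISO-8859-1.

```php
<?php
// Numeric entities as they might appear in the XML source (assumed input).
$raw = '&#353; &#371;';

// Decode to real characters, explicitly targeting UTF-8.
$decoded = html_entity_decode($raw, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $decoded, "\n";           // š ų
echo bin2hex($decoded), "\n";  // c5a120c5b3
```

The decoded string can then be inserted over a connection that has had SET NAMES utf8 applied, as in the test script.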
I am having an odd issue with PHP and MySQL.
In an attempt to create a table from PHP, I pasted the query I need, which executes successfully outside of the PHP environment, into PHP.
$CREATE_PAGES = "DROP TABLE IF EXISTS `MyDatabase`.`pages`;
CREATE TABLE `MyDatabase`.`pages` (
`Page_ID` int(10) unsigned NOT NULL AUTO_INCREMENT,
`Page_File` varchar(1000) NOT NULL,
`Page_Description` varchar(1000) NOT NULL,
`Page_Message` longtext NOT NULL,
PRIMARY KEY (`Page_ID`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=latin1;";
$result= mysql_query($CREATE_PAGES,$link);
if(!($result)){
echo mysql_error();
echo $CREATE_PAGES;
}
Then I get the standard error message
. . . for the right syntax to use near 'CREATE TABLE `MyDatabase`.`pages` ( `Page_ID` int(10) unsigned NOT NULL' at line 2
However, the odd part is that when I echo the query $CREATE_PAGES I can copy and paste and it will execute just fine. How can it be a syntax error?
I know that it is not a connection error, I can pull data from another table in that database.
Is there something I am missing?
A PHP call to mysql_query allows only one statement at a time (as part of SQL injection prevention, I guess), so you have to split your query into two parts and call mysql_query twice.
The mysql_query() function can only execute one query at a time, whereas you can execute an arbitrary number at the command line.
From the documentation:
mysql_query() sends a unique query (multiple queries are not supported) to the currently active database on the server that's associated with the specified link_identifier.
To overcome this:
$dropTable = "DROP TABLE IF EXISTS `MyDatabase`.`pages`";
mysql_query($dropTable, $link);
$createPages ="CREATE TABLE `MyDatabase`.`pages` (
`Page_ID` int(10) unsigned NOT NULL AUTO_INCREMENT,
`Page_File` varchar(1000) NOT NULL,
`Page_Description` varchar(1000) NOT NULL,
`Page_Message` longtext NOT NULL,
PRIMARY KEY (`Page_ID`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=latin1";
$result = mysql_query($createPages, $link);
if(!($result)) {
echo mysql_error();
}
From docs to mysql_query:
mysql_query() sends a unique query (multiple queries are not
supported) to the currently active database on the server that's
associated with the specified link_identifier.
mysql_query can only execute a single query, it doesn't support execution of multiple queries. It's also recommended to not end your query with a semicolon.
Look for more information in the PHP documentation.
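If the SQL arrives as one multi-statement string, one option is to split it and send each statement separately. The helper below, splitSqlStatements(), is a hypothetical name and a deliberately naive sketch: it splits on every semicolon, so it would break on statements that contain a semicolon inside a string literal. The shortened table definition is only there to keep the example small.

```php
<?php
/**
 * Naively split a multi-statement SQL string on semicolons.
 * Hypothetical helper; not safe for SQL containing ';' inside literals.
 */
function splitSqlStatements(string $sql): array
{
    $parts = array_map('trim', explode(';', $sql));
    // Drop the empty pieces left by a trailing semicolon.
    return array_values(array_filter($parts, 'strlen'));
}

$sql = "DROP TABLE IF EXISTS `pages`;
CREATE TABLE `pages` (`Page_ID` int(10) unsigned NOT NULL AUTO_INCREMENT, PRIMARY KEY (`Page_ID`));";

$statements = splitSqlStatements($sql);
echo count($statements), "\n";   // 2

// Each element can then be passed to its own query call, e.g.
// foreach ($statements as $statement) { mysql_query($statement, $link); }
```

Splitting this way also strips the trailing semicolons, which matches the recommendation not to end a query with one.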
I'm reading a UTF-8 encoded file using PHP and splatting the contents directly into a database. The problem is that when I encounter a character such as ”, it places the following †into the database.
How can I encode this correctly? I'm reading a UTF-8 file and my database column's collation is UTF-8. What am I doing wrong? Is there a nice function I'm missing? Any help is welcome.
This is my table:
CREATE TABLE tblProductData (
intProductDataId int(10) unsigned NOT NULL AUTO_INCREMENT,
strProductName varchar(50) NOT NULL,
strProductDesc varchar(255) NOT NULL,
strProductCode varchar(10) NOT NULL,
dtmAdded datetime DEFAULT NULL,
dtmDiscontinued datetime DEFAULT NULL,
stmTimestamp timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (intProductDataId),
UNIQUE KEY (strProductCode)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE utf8_unicode_ci;
EDIT:
I'm reading the data like this:
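// Open the file; @ suppresses the warning so the exit() message is shown instead.
$hFile = @fopen($FileName, "r") or exit("\nUnable to open file: " . $FileName);
if($hFile)
{
// fgets() returns false at end of file, so no trailing empty line is processed.
while(($Line = fgets($hFile)) !== false)
{
$this->Products[] = new Product($Line);
}
fclose($hFile);
}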
Use
mysql_query("SET NAMES utf8");
just after connecting to the DB, and make sure the browser encoding is UTF-8, too:
header("Content-Type: text/html; charset=utf-8");
You should set your connection encoding with this query
SET NAMES 'utf8'
before storing any data.
Also keep in mind that some database GUIs and web front ends (e.g. phpMyAdmin) can show the wrong encoding even when your data is stored correctly. This happens, for example, with Sequel Pro on Mac and with phpMyAdmin in some environments.
You should trust your browser instead: show the inserted content in a page that has the
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
tag and see whether the data displays correctly. Better still, trust the mysql command line from the shell:
echo 'SELECT yourdata FROM your_table' | mysql -uuser -pyourpwd db_name
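To rule out the input file, it can help to inspect the raw bytes in PHP before anything reaches MySQL. A small sketch, assuming the mbstring extension and PHP 7+; $line stands in for a line returned by fgets().

```php
<?php
// A right double quotation mark (U+201D), standing in for a line read from the file.
$line = "\u{201D}";

// Valid UTF-8 here means the file itself is fine and the damage happens later.
$isUtf8 = mb_check_encoding($line, 'UTF-8');

// The character is three bytes in UTF-8: E2 80 9D.
$hex = bin2hex($line);

// Reading the first two of those bytes as Windows-1252 gives the "â€" seen in the table.
$misread = mb_convert_encoding("\xE2\x80", 'UTF-8', 'Windows-1252');

var_dump($isUtf8);     // bool(true)
echo $hex, "\n";       // e2809d
echo $misread, "\n";   // â€
```

If the check passes for every line, the file is not the problem and the connection charset is the more likely culprit.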
I'm doing a project with Zend Framework and I'm pulling data from a UTF-8 database. The project is UTF-8 as well.
In a form, I have a select element displaying a list of countries. The problem is:
In French or Spanish, some countries are not displayed.
After doing a var_dump() of my country list, I saw that those were the countries with special characters, the accented ones.
In the var_dump I could see the character represented as a ? in a diamond. I tried changing the encoding to ISO-8859-1 and I could see the var_dump result with the special characters just fine.
How come data coming from a UTF-8 database is displaying in ISO-8859-1?
Can I store ISO-8859-1 characters in a UTF-8 table in MySQL without problems? Shouldn't it display messed-up characters?
confused.
--
delimiter $$
CREATE TABLE `geo_Country` (
`CountryID` int(10) NOT NULL,
`CountryName` varchar(45) NOT NULL,
`CountryCompleteName` varchar(45) NOT NULL,
`Nationality` varchar(45) NOT NULL,
`CreationDate` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`Status` tinyint(1) NOT NULL DEFAULT '1',
`LanguageCode` char(2) NOT NULL,
`ZoneID` int(10) NOT NULL,
PRIMARY KEY (`CountryID`,`LanguageCode`),
KEY `fk_geo_Country_web_Language1` (`LanguageCode`),
KEY `fk_geo_Country_geo_Zone` (`ZoneID`),
KEY `idx_CountryName` (`CountryName`),
CONSTRAINT `fk_geo_Country_geo_Zone` FOREIGN KEY (`ZoneID`) REFERENCES `geo_Zone` (`ZoneID`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `fk_geo_Country_web_Language1` FOREIGN KEY (`LanguageCode`) REFERENCES `web_Language` (`LanguageCode`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=utf8$$
The thing to remember with UTF-8 is this:
Everything in your entire application needs to be UTF-8!
For a normal PHP/MySQL web application (a form posting to a database), check that:
Your database connection uses UTF-8 (execute this query right after the connection is set up: SET NAMES UTF8;)
Your PHP code uses UTF-8. That means not using character set translation/encoding functions (there is no need when everything is UTF-8).
Your HTML output is UTF-8, either by sending a Content-Type: text/html; charset=utf-8 header or by using a <meta charset="utf-8"> tag (for HTML5; for other HTML variants, use <meta http-equiv="Content-Type" content="text/html; charset=utf-8">)
In your case of var_dump'ing, plain text is sent to the browser without any mention of a character set. Looking at the third rule, this means your browser is displaying it in a different character set, presumably latin1, thus giving you the diamonds/question marks/blocks.
If you need to check whether your data is stored properly, use a database client like phpMyAdmin to view the record. This way you're viewing the content as UTF-8 (NOTE: this is a setting in phpMyAdmin, so check that it is not set to a different charset!).
On a side note, set the collation of your database's text columns to utf8_general_ci. The collation is not used for storing, but for sorting and comparing, so it isn't related to your problem, but it's good practice.
When connecting to the database you should set the client encoding.
For Zend_Db it seems it should be like this (notice 'driver_options'):
$params = array(
'host' => 'localhost',
'username' => 'username',
'password' => 'password',
'dbname' => 'dbname',
'driver_options' => array(PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES UTF8'),
);
For application.ini:
resources.db.params.charset = utf8
As a last resort you could run the query SET NAMES UTF8 manually, just like any other query.
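Outside Zend, plain PDO can be told the charset in the DSN itself. The sketch below only builds the DSN string; buildUtf8Dsn() is a hypothetical helper, and the host, database name and credentials are the same placeholders used in the Zend_Db parameters.

```php
<?php
/**
 * Build a PDO MySQL DSN that requests a UTF-8 connection.
 * Hypothetical helper; the charset DSN option needs PHP 5.3.6 or later.
 */
function buildUtf8Dsn(string $host, string $dbname, string $charset = 'utf8'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

$dsn = buildUtf8Dsn('localhost', 'dbname');
echo $dsn, "\n";   // mysql:host=localhost;dbname=dbname;charset=utf8

// The DSN would then be passed to the constructor, e.g.
// $pdo = new PDO($dsn, 'username', 'password');
```

Passing 'utf8mb4' as the third argument would request full 4-byte UTF-8 instead, provided the tables use that character set too.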