Lithuanian characters not saving correctly into MySQL DB - php

Once again I'm having problems with saving special characters into a database. After lots of searches I still could not find a solution, so I am starting a new thread.
I have a MySQL DB using the UTF-8 character set and a PHP application that reads data from XML files into the DB. Earlier I had problems with Estonian characters, which I managed to solve. For example, &scaron; (š) is in the XML as the HTML entity &eth;, and it is converted in PHP to &#353;. Earlier in the PHP script I run the MySQL query "SET NAMES utf8". š saves into the DB correctly.
Now I'm fighting with Lithuanian characters, for example ų (&#371;), which is in the XML file as the numeric entity &#371;. I am not doing any conversion for this in PHP, since I assume that if converting &eth; to &#353; works for scaron, &#371; should save into the DB as ų without any PHP conversion. After saving, it appears in the DB as a question mark, and if I try to use mb_convert_encoding() or html_entity_decode() the result is ų.
Any advice?

You simply need to make sure your table has the correct encoding and run SET NAMES right after connecting.
I've prepared a simple test. Try running it to make sure everything works fine.
1) Create a database named testencoding and import the following code into it:
CREATE TABLE IF NOT EXISTS `sample` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `value` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
2) Create a simple PHP script with the following content and run it:
<?php
header('Content-Type: text/html; charset=utf-8');
mb_internal_encoding('utf-8');
$subjectvalue = 'ų ų';
$link = mysqli_connect("localhost", "root", "", "testencoding");
// Tell MySQL that this connection sends and receives UTF-8.
mysqli_query($link, "SET NAMES 'utf8'");
mysqli_query($link, "INSERT INTO sample(`value`) VALUES('" . mysqli_real_escape_string($link, $subjectvalue) . "')");
$result = mysqli_query($link, "SELECT * FROM sample");
echo "<br /><br />Data from database<br /><br />";
while ($data = mysqli_fetch_assoc($result)) {
    echo $data['id'] . ' ' . $data['value'] . "<br />";
}
3) On my PC all results are as expected:
As output from PHP file I have:
Data from database
1 ų ų
In phpMyAdmin I have:
ų ų
So everything works fine. Try it and compare with my results.
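If the XML carries numeric entities such as &#371;, one option is to decode them into real UTF-8 characters in PHP before the INSERT, so the database receives the character rather than the entity text. The snippet below is only a sketch: the input string is made up for illustration, and it assumes the connection has already been switched to utf8 as in the test above.

```php
<?php
// Illustrative input: entities as they might arrive from the XML file.
$fromXml = 'scaron = &#353;, uogonek = &#371;';

// Decode named and numeric entities into UTF-8 characters.
$decoded = html_entity_decode($fromXml, ENT_QUOTES | ENT_HTML5, 'UTF-8');
echo $decoded, "\n";

// Inspect the bytes of a single decoded character.
$uogonek = html_entity_decode('&#371;', ENT_QUOTES | ENT_HTML5, 'UTF-8');
echo bin2hex($uogonek), "\n"; // c5b3, the UTF-8 encoding of U+0173
```

If the hex shows c5b3 here but the table still holds a question mark, the loss is happening between PHP and MySQL (connection or column charset), not in the entity handling.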

Related

Php utf8 failing in mysql but ok on csv import

This is on my Windows test platform.
I have the following CSV:
You have signed out successfully!,ar,لقد خرجت بنجاح!
I have the following table definition:
CREATE TABLE `translations` (
`sourcephrase` varchar(250) NOT NULL,
`language` char(5) NOT NULL,
`translatedphrase` varchar(250) CHARACTER SET utf8 DEFAULT NULL,
PRIMARY KEY (`sourcephrase`,`language`),
KEY `language` (`language`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
If I load this CSV into the table (via MySQL Workbench, import CSV), I get the data just fine.
sourcephrase, language, translation
You have signed out successfully! ar لقد خرجت بنجاح!
If instead I run this PHP code (where psquery just executes a prepared statement):
$sourcephrase="You have signed out successfully!";
$language="ar";
$translated="لقد خرجت بنجاح!";
$sql = "insert into translations (sourcephrase, language, translatedphrase) values (?,?,?)";
$this->DB->psquery($sql, array("sss", $sourcephrase, $language, $translated));
The table then contains garbled characters where the Arabic phrase should be:
You have signed out successfully! ar (garbled version of لقد خرجت بنجاح!)
Why am I getting a different result in PHP? (I know it's something UTF-8 related, but I can't see what.) I don't believe it's MySQL related, as the CSV import is just fine.
The text stored by the PHP code is Mojibake for the desired string لقد خرجت بنجاح!. See this for the likely causes, best practice, and debugging techniques.
Probably this item is relevant to your PHP connection: "The connection when INSERTing and SELECTing text needs to specify utf8 or utf8mb4."
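To see how Mojibake of this kind arises, the misreading can be reproduced in plain PHP without a database. This is a rough sketch, not the asker's code: it uses ISO-8859-1 as a stand-in for MySQL's latin1 (which is closer to cp1252), so the exact garbled characters may differ from what the table shows.

```php
<?php
// The Arabic phrase from the question, as UTF-8 bytes.
$utf8 = "لقد خرجت بنجاح!";

// Misread every UTF-8 byte as one ISO-8859-1 character, then re-encode
// the result as UTF-8. This approximates a latin1 connection receiving
// UTF-8 bytes and storing them in a utf8 column.
$mojibake = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

// Every original byte has become a separate character.
var_dump(mb_strlen($mojibake, 'UTF-8') === strlen($utf8)); // bool(true)

// Undoing the misreading recovers the original text.
$repaired = mb_convert_encoding($mojibake, 'ISO-8859-1', 'UTF-8');
var_dump($repaired === $utf8); // bool(true)
```

The round trip works only because nothing was lost along the way; the real fix is still to declare utf8 (or utf8mb4) on the connection before inserting.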

Echo from database in collation "latin1_swedish_ci" results in questionmark symbols on webpage

I feel like this should be a simple fix, but the answer eludes me. I've tried googling and searching here, to no avail.
I have a long text stored in my database.
The collation is latin1_swedish_ci,
and when I look at it in the database, it is stored correctly. For example:
string= Sally was walking one day and saw Tom. Tom said "Hi, Sally!" Sally's response was "Hi, Tom."
Every " or ' shows up as a white question mark on a black diamond.
I want to run
$result=mysqli_query($db,"SELECT string FROM Table WHERE 1");
while($row = mysqli_fetch_assoc($result)){
echo $row['string'];
}
and have all of the characters show up.
Can anyone help?
You may be able to fix yours with this example.
SQL:
CREATE TABLE `test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`string` longtext,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=latin1;
INSERT INTO `test` VALUES ('1', 'Sally was walking one day and saw Tom. Tom said \"Hi, Sally!\" Sally\'s response was \"Hi, Tom.\"');
PHP:
$query = "SELECT id, string FROM test WHERE id = 1";
if ($result = mysqli_query($link, $query)) {
while ($row = mysqli_fetch_assoc($result)) {
echo $row['string'];
}
mysqli_free_result($result);
}
mysqli_close($link);
A single question mark is one kind of symptom; a string of question marks is another; and black diamond with a question mark is yet another problem. Sweep the slate clean... Get rid of all encoding functions and...
Have the data in the client be utf8-encoded, and
Establish that the connection is CHARACTER SET utf8, and
Have the tables/columns be CHARACTER SET utf8, and
On html pages, use <meta charset=UTF-8>.
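The black diamond typically means the page was declared as UTF-8 but received bytes that are not valid UTF-8. The sketch below assumes the stored quotes are cp1252 "smart quotes" (bytes 0x93 and 0x94); the question does not show the actual bytes, so treat that as a guess.

```php
<?php
// "Hi" wrapped in Windows-1252 curly quotes (assumed bytes 0x93 and 0x94).
$latin1 = "\x93Hi\x94";

// These bytes are not valid UTF-8, so a UTF-8 page cannot render them.
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)

// Converted to UTF-8, the same text is valid and displays normally.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'Windows-1252');
var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)
echo bin2hex($utf8), "\n";                     // e2809c4869e2809d
```

With the connection set to utf8, MySQL performs this conversion itself when it reads from a latin1 column, so no conversion function is needed in the application.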
After doing some more research, I found a post that didn't come up before.
This solution to a similar problem was offered by Emil H:
MySQL performs character set conversions on the fly to something called the connection charset. You can specify this charset using the SQL statement SET NAMES utf8 or use a specific API function such as mysql_set_charset():
mysql_set_charset("utf8", $conn);
If this is done correctly there's no need to use functions such as utf8_encode() and utf8_decode().
You also have to make sure that the browser uses the same encoding. This is usually done using a simple header:
header('Content-type: text/html;charset=utf-8');
(Note that the charset is called utf-8 in the browser but utf8 in MySQL.)
In most cases the connection charset and web charset are the only things that you need to keep track of, so if it still doesn't work there's probably something else you're doing wrong. Try experimenting with it a bit; it usually takes a while to fully understand.
Emil H, answered Mar 25 '09 at 11:52, edited Mar 25 '09 at 12:17
After reading that, I looked up the PHP function she/he was talking about and found that there is a mysqli equivalent, with different syntax:
mysqli::set_charset / mysqli_set_charset: Sets the default client character set
(PHP 5 >= 5.0.5, PHP 7)
Object oriented style:
bool mysqli::set_charset ( string $charset )
Procedural style:
bool mysqli_set_charset ( mysqli $link , string $charset )
Sets the default character set to be used when sending data from and to the database server.
I hope this helps anyone having the problem I had.

blob texts after backup and restore

Updated question:
I had an old script running on MySQL 5.5.46. I was storing my text in a column of type blob. The text was Persian, like this: سلام خوبی جه خبر.
After some months I changed the blob column to a longblob column through the phpMyAdmin GUI (without any conversion query).
Everything worked correctly, but I took a backup of my MySQL database and restored it after 2 years. Now it does not show my Persian characters correctly. English text is OK, but Persian text comes out like the HEX dump shown below.
I need the text stored in the page_about and page_contact columns of this table:
Creating Table :
mysql_query("CREATE TABLE `".$prefix."ProFolio_info` (
`id` int(5) NOT NULL auto_increment,
`page_about` blob NOT NULL,
`page_contact` blob NOT NULL,
PRIMARY KEY (`id`)
) TYPE=MyISAM AUTO_INCREMENT=1") or die(mysql_error());
Updating Data :
$info_query = mysql_query("SELECT * FROM ".$prefix."ProFolio_info ORDER BY id DESC LIMIT 0,10");
while($info_row = mysql_fetch_array($info_query)){
$about_page = html_entity_decode($info_row['page_about']);
$contact_page = html_entity_decode($info_row['page_contact']);
if(isset($_POST['change_settings']) && $LOGGEDIN == 'yes'){
$new_aboutpage = clean_page($_POST['about_page']);
$new_contactpage = clean_page($_POST['contact_page']);
if($about_page != $new_aboutpage){
mysql_query("UPDATE ".$prefix."ProFolio_info SET page_about = '$new_aboutpage' WHERE id = '$info_id'");
}
if($contact_page != $new_contactpage){
mysql_query("UPDATE ".$prefix."ProFolio_info SET page_contact = '$new_contactpage' WHERE id = '$info_id'");
}
get texts from DB :
<textarea name="contact_page"><? echo str_replace('<br />', '', $contact_page); ?></textarea>
</div>
I tested some queries such as CAST and CONVERT, and tried converting the column to longtext, but the result is the same and I still get wrong characters.
I think it was stored with a latin1 collation but I select the column as UTF-8.
HEX
BC28620264F736C6173683B26756D6C3B264F736C6173683B26736563743B265567726176653BC284264F736C6173683B26736563743B20265561637574653B266D6163723B265567726176653BC281264F736C6173683B266F7264663B265567726176653BC28520264F736C6173683B26736563743B2655636972633BC28C265567726176653BC286264F736C6173683B266E6F743B264F736C6173683B26736563743B20265567726176653BC286265567726176653BC2852655636972633BC28C265561637574653B266D6163723B265567726176653BC286264F736C6173683B266E6F743B265567726176653BC28720264F736C6173683B26736563743B265567726176653BC285264F736C6173683B26736563743B20265567726176653BC285264F736C6173683B26737570333B265567726176653BC286265561637574653B26636F70793B265567726176653BC28720265561637574653B266D6163723B265567726176653B
I uploaded my script to GitHub.
I also uploaded the script to my server and sent text from it, and that data goes into the DB correctly! I think the problem is just my database backup.
You can check my ready script from here and view my texts from the kave note's menu.
There are 4 places to "say" utf8:
The data in the client must be utf8-encoded. (It probably was.)
SET NAMES utf8 or equivalent. (You took care of that in new PDO.)
CHARACTER SET utf8 on the table or column declaration. Please provide SHOW CREATE TABLE, but do not try to fix it if it says latin1.
On the html page, <meta ... charset=UTF-8 ...>
Please provide SELECT col, HEX(col) FROM tbl WHERE ... so we can see whether the data was messed up in the table. There are two possible fixes for the data; we need to see the hex to know which fix to apply.
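Part of the HEX dump in the question is plain ASCII: it spells out HTML entities such as &Oslash;&sect;. That suggests the text was run through an entity encoder while its UTF-8 bytes were being treated as latin1. The sketch below decodes one such fragment; it is an illustration under that assumption, not a full repair, and it does not handle the raw bytes (such as C2 84) that also appear in the dump.

```php
<?php
// ASCII fragment taken from the HEX dump in the question.
$stored = hex2bin('264F736C6173683B26736563743B');
echo $stored, "\n"; // &Oslash;&sect;

// Decode the entities to single ISO-8859-1 bytes: 0xD8 0xA7.
$bytes = html_entity_decode($stored, ENT_QUOTES | ENT_HTML401, 'ISO-8859-1');
echo bin2hex($bytes), "\n"; // d8a7

// Read as UTF-8, those two bytes are the Arabic letter alef (U+0627).
var_dump($bytes === "\u{0627}"); // bool(true)
```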
I too had the same problem. The main reason is that Windows uses different character encodings for different character sets. I found the package convert character set; it solved my problem and it may solve yours too.

Stored non-English characters, got '?????' - MySQL Character Set issue

The site I am working on is in Farsi, and all the text is being displayed as ????? (question marks).
I changed the collation of my DB tables to UTF8_general_ci but it still shows ???
I ran the following script to change all the tables, but this did not work either.
I want to know what I am doing wrong.
<?php
// your connection
mysql_connect("mysql.ord1-1.websitesettings.com","user_name","pass");
mysql_select_db("895923_masihiat");
// convert code
$res = mysql_query("SHOW TABLES");
while ($row = mysql_fetch_array($res))
{
foreach ($row as $key => $table)
{
mysql_query("ALTER TABLE " . $table . " CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci");
echo $key . " => " . $table . " CONVERTED<br />";
}
}
?>
Bad news. But first, double check:
SELECT col, HEX(col)...
to see what is in the table. If the hex shows 3F, then the data is gone. Correctly stored, the dal character should be hex D8AF; hah is hex D8AD.
What happened:
you had utf8-encoded data (good)
SET NAMES latin1 was in effect (default, but wrong)
the column was declared CHARACTER SET latin1 (default, but wrong)
As you INSERTed the data, it was converted to latin1, which does not have values for Farsi characters, so question marks replaced them.
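The loss described above can be reproduced without a database. The following sketch uses mbstring's ISO-8859-1 as a stand-in for MySQL's latin1, which is an approximation; it shows the expected hex for correctly stored characters and the 3F that results once a character has been forced into latin1.

```php
<?php
// Use "?" (0x3F) for characters the target charset cannot represent.
mb_substitute_character(0x3F);

$dal = "\u{062F}"; // Arabic/Farsi letter dal
$hah = "\u{062D}"; // Arabic/Farsi letter hah

// Correctly stored UTF-8 looks like this in HEX().
echo bin2hex($dal), " ", bin2hex($hah), "\n"; // d8af d8ad

// latin1 has no slot for dal, so the conversion substitutes "?".
$asLatin1 = mb_convert_encoding($dal, 'ISO-8859-1', 'UTF-8');
echo bin2hex($asLatin1), "\n"; // 3f
```

Once the stored byte is 3F there is nothing left to convert back, which is why the original text has to be re-entered.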
The cure (for future INSERTs):
Recode your application using the mysqli_* interface instead of the deprecated mysql_* interface.
utf8-encoded data (good)
mysqli_set_charset($link, 'utf8')
check that the column(s) and/or table default are CHARACTER SET utf8
If you are displaying on a web page, <meta charset=UTF-8> should be near the top.
The discussion above is about CHARACTER SET, the encoding of characters. Now for a tip on COLLATION, which is used for comparing and sorting.
If you want these to be treated equal: 'بِسْمِ' = 'بسم', then use utf8_unicode_ci (instead of utf8_general_ci) for the COLLATION.

Encoding issue when reading values from a utf-8 file in PHP and putting them into a database.

I'm reading a UTF-8 encoded file using PHP and splatting the contents directly into a database. The problem is that when I encounter a character such as ” , it places the following †into the database.
How can I encode this correctly? I'm reading a UTF-8 file and my database column's collation is UTF-8. What am I doing wrong? Is there a nice function I'm missing? Any help is welcome.
This is my table:
CREATE TABLE tblProductData (
intProductDataId int(10) unsigned NOT NULL AUTO_INCREMENT,
strProductName varchar(50) NOT NULL,
strProductDesc varchar(255) NOT NULL,
strProductCode varchar(10) NOT NULL,
dtmAdded datetime DEFAULT NULL,
dtmDiscontinued datetime DEFAULT NULL,
stmTimestamp timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (intProductDataId),
UNIQUE KEY (strProductCode)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE utf8_unicode_ci;
EDIT:
I'm reading the data like this:
$hFile = @fopen($FileName, "r") or exit("\nUnable to open file: " . $FileName);
if($hFile)
{
while(!feof($hFile))
{
$Line = fgets($hFile);
$this->Products[] = new Product($Line);
}
fclose($hFile);
}
Use
mysql_query("SET NAMES utf8");
just after connecting to the DB, and make sure that the browser encoding is UTF-8, too:
header("Content-Type: text/html; charset=utf-8");
You should set your connection encoding with this query
SET NAMES 'utf8'
before storing any data.
Also keep in mind that some database GUIs or web GUIs (e.g. phpMyAdmin) show the wrong encoding even if your data is encoded correctly. This happens, for example, with Sequel Pro on Mac and with phpMyAdmin in some environments.
You should trust your browser, i.e. show your inserted content in a page which has the
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
header and see if the data is shown correctly. Or, even better, trust the mysql command line using the shell:
echo 'SELECT yourdata FROM your_table' | mysql -uuser -pyourpwd db_name
