Unable to retrieve UTF-8 accented characters from Access via PDO_ODBC - php

I am trying to convert an Access DB into MySQL. Everything works perfectly, except for one big monkey wrench: if the Access DB has any non-standard characters, it won't work. My query tells me:
Incorrect string value: '\xE9d'
If I directly echo the row text that has the 'invalid' character, I get a question mark in a black square in my browser (so é turns into that invalid symbol on echo).
NOTE: That same form will accept, save and display the "é" fine in a textbox that is used to title this DB upload. Also, if I 'save as' the page and re-open it, the 'é' is displayed correctly.
Here is how I connect:
$conn = new PDO("odbc:Driver={Microsoft Access Driver (*.mdb)};Dbq=$fileLocation;SystemDB=$securefilePath;Uid=developer;Pwd=pass;charset=utf;");
I have tried numerous things, including:
$conn -> exec("set names utf8");
When I try 'CurrentDb.CollatingOrder' in Access it tells me 1033; apparently that is dbSortGeneral, the "English, German, French, and Portuguese collating order".
What is wrong? It is almost as if PDO is sending me a collation that my browser and PHP do not fully understand.

The Problem
When using native PHP ODBC features (PDO_ODBC or the older odbc_ functions) and the Access ODBC driver, text is not UTF-8 encoded, even though it is stored in the Access database as Unicode characters. So, for a sample table named "Teams"
Team
-----------------------
Boston Bruins
Canadiens de Montréal
Федерация хоккея России
the code
<?php
header('Content-Type: text/html; charset=utf-8');
?>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Access character test</title>
</head>
<body>
<?php
$connStr =
'odbc:' .
'Driver={Microsoft Access Driver (*.mdb)};' .
'Dbq=C:\\Users\\Public\\__SO\\28311687.mdb;' .
'Uid=Admin;';
$db = new PDO($connStr);
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$sql = "SELECT Team FROM Teams";
foreach ($db->query($sql) as $row) {
$s = $row["Team"];
echo $s . "<br/>\n";
}
?>
</body>
</html>
displays this in the browser
Boston Bruins
Canadiens de Montr�al
????????? ?????? ??????
The Easy but Incomplete Fixes
The text returned by Access ODBC actually matches the Windows-1252 character encoding for the characters in that character set, so simply changing the line
$s = $row["Team"];
to
$s = utf8_encode($row["Team"]);
will allow the second entry to be displayed correctly
Boston Bruins
Canadiens de Montréal
????????? ?????? ??????
but the utf8_encode() function converts from ISO-8859-1, not Windows-1252, so some characters (notably the Euro symbol '€') will disappear. A better solution would be to use
$s = mb_convert_encoding($row["Team"], "UTF-8", "Windows-1252");
but that still wouldn't solve the problem with the third entry in our sample table.
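To make the ISO-8859-1 versus Windows-1252 difference concrete, here is a minimal sketch that needs no database. The sample bytes are an assumption standing in for what the Access ODBC driver returns, and the mbstring extension is assumed to be loaded.

```php
<?php
// Assumed sample: "\xE9" is 'é' in both encodings, but "\x80" is the Euro
// sign only in Windows-1252 (in ISO-8859-1 it is a C1 control character).
$raw = "Montr\xE9al, 5 \x80";

// Same bytes, two different source encodings.
$fromLatin1 = mb_convert_encoding($raw, 'UTF-8', 'ISO-8859-1');
$fromCp1252 = mb_convert_encoding($raw, 'UTF-8', 'Windows-1252');

// Hex dumps make the difference visible regardless of terminal encoding.
echo bin2hex($fromLatin1), "\n"; // ends in c280 (U+0080, not a Euro sign)
echo bin2hex($fromCp1252), "\n"; // ends in e282ac (U+20AC, the Euro sign)
```

Only the Windows-1252 conversion yields a real Euro sign; neither conversion can recover the Cyrillic row, because those characters have already been replaced by question marks before PHP sees them.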
The Complete Fix
For full UTF-8 support we need to use COM with ADODB Connection and Recordset objects like so
<?php
header('Content-Type: text/html; charset=utf-8');
?>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Access character test</title>
</head>
<body>
<?php
$connStr =
'Driver={Microsoft Access Driver (*.mdb)};' .
'Dbq=C:\\Users\\Public\\__SO\\28311687.mdb';
$con = new COM("ADODB.Connection", NULL, CP_UTF8); // specify UTF-8 code page
$con->Open($connStr);
$rst = new COM("ADODB.Recordset");
$sql = "SELECT Team FROM Teams";
$rst->Open($sql, $con, 3, 3); // adOpenStatic, adLockOptimistic
while (!$rst->EOF) {
$s = $rst->Fields("Team");
echo $s . "<br/>\n";
$rst->MoveNext(); // call the method explicitly to advance to the next record
}
$rst->Close();
$con->Close();
?>
</body>
</html>

A variant that makes the data a bit easier to manipulate: it collects the rows into a two-dimensional array.
function consulta($sql) {
$db_path = $_SERVER["DOCUMENT_ROOT"] . '/database/Registros.accdb';
$conn = new COM('ADODB.Connection', NULL, CP_UTF8) or exit('Falha ao iniciar o ADO (objeto COM).');
$conn->Open("Persist Security Info=False;Provider=Microsoft.ACE.OLEDB.12.0;Jet OLEDB:Database Password=ifpb#10510211298;Data Source=$db_path");
$rs = $conn->Execute($sql);
$numRegistos = $rs->Fields->Count;
$resultados = array(); // start empty so a query with no rows still returns an array
$index = 0;
while (!$rs->EOF){
for ($n = 0; $n < $numRegistos; $n++) {
if(is_null($rs->Fields[$n]->Value)) continue;
$resultados[$index][$rs->Fields[$n]->Name] = $rs->Fields[$n]->Value;
echo '.';
}
echo '<br>';
$index = $index + 1;
$rs->MoveNext();
}
$conn->Close();
return $resultados;
}
$dados = consulta("select * from campus");
var_dump($dados);

I found the following solution. Admittedly, I have not had the opportunity to test it with PHP, but I expect it should work.
For the native PHP ODBC features (PDO_ODBC or the older odbc_ functions) and the Access ODBC driver to correctly retrieve text that is stored in the Access database as Unicode characters, enable "Beta: Use Unicode UTF-8 for worldwide language support" in the Region settings of the Windows operating system.
After I did this on my machine, many programs that use the standard MS Access ODBC driver began to display Unicode text correctly.
All Settings -> Time & Language -> Language -> "Administrative Language Settings"

Related

PHP MySQL utf-8 Euro symbol shown as questionmark on a diamond

So after a whole day of googling and debugging I end up here.
MySQL
set to the following encoding:
db: utf8_general_ci
table: utf8_general_ci
column: utf8_general_ci, TEXT
I put in some euro symbols and some other weird characters
acentuação €€€€€
PHP (codeigniter)
config
$config['charset'] = 'UTF-8';
dsn
char_set=utf8,dbcollat=utf8_general_ci
I made some queries to compare
model
$query = $this->db->query("SET NAMES latin1");
$query = $this->db->query("SELECT shortdesc,HEX(shortdesc) FROM `contracttypes` WHERE id = 4");
$ret['latin1'] = $query->row();
$query = $this->db->query("SET NAMES utf8");
$query = $this->db->query("SELECT shortdesc,HEX(shortdesc) FROM `contracttypes` WHERE id = 4");
$ret['utf8'] = $query->row();
return $ret;
controller
public function utfhell() {
var_dump($this->campagne_model->utfhell());
}
This outputs
array (size=2)
'latin1' =>
object(stdClass)[34]
public 'shortdesc' => string 'acentua��o �����' (length=16)
public 'HEX(shortdesc)' => string '6163656E747561C3A7C3A36F20E282ACE282ACE282ACE282ACE282AC' (length=56)
'utf8' =>
object(stdClass)[33]
public 'shortdesc' => string 'acentuação €€€€€' (length=28)
public 'HEX(shortdesc)' => string '6163656E747561C3A7C3A36F20E282ACE282ACE282ACE282ACE282AC' (length=56)
So far so good, on to a
view
<?php header('Content-Type: text/html; charset="utf-8"', true); ?>
<!doctype html>
<html>
<head>
<title>UTFhell</title>
<link rel="stylesheet" href="../assets/css/style.css"/>
<meta charset="utf-8">
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
...
<?php
echo 'Original : ', $campagne_info->contractName->shortdesc."<br />";
echo 'UTF8 Encode : ', utf8_encode($campagne_info->contractName->shortdesc)."<br />";
echo 'UTF8 Decode : ', utf8_decode($campagne_info->contractName->shortdesc)."<br />";
echo 'TRANSLIT : ', iconv("ISO-8859-1", "UTF-8//TRANSLIT", $campagne_info->contractName->shortdesc)."<br />";
echo 'IGNORE TRANSLIT : ', iconv("ISO-8859-1", "UTF-8//IGNORE//TRANSLIT", $campagne_info->contractName->shortdesc)."<br />";
echo 'IGNORE : ', iconv("ISO-8859-1", "UTF-8//IGNORE", $campagne_info->contractName->shortdesc)."<br />";
echo 'Plain: ', iconv("ISO-8859-1", "UTF-8", $campagne_info->contractName->shortdesc)."<br />";
echo '€€€€€€€€€€<br>';
?>
None of these show me a normal euro symbol except the final echo statement; they all give me question-mark diamonds for the euro symbols.
The HEX is the utf8 encoding for that string, so the data is in the table 'correctly'.
The black diamond (�) is the browser's way of saying wtf. It comes from having latin1 characters, but telling the browser to display utf8 characters.
You could tell the browser to display "Western", but that only avoids the underlying problem.
Remember, the goal is to really use utf8.
Sometimes this occurs together with question marks, in which case you must start over.
The cause (probably):
1. The bytes you had were encoded latin1. You acquired them from somewhere: a file dump, online input, etc.
2. The connection parameters said latin1.
3. The column/table is declared CHARACTER SET utf8, so during INSERT the bytes were correctly converted.
4. When SELECTing, the setting from step 2 was again latin1, so they were converted back to latin1.
5. When displaying text in a web page, the page's header said that the bytes were utf8.
Solution, Plan A: (Sloppy, but probably workable)
Change #5 to say the appropriate equivalent of latin1.
Solution, Plan B:
1. Fix the source to be utf8-encoded.
2. query("SET NAMES utf8") (unless there is a way to set it at connect time).
3. Leave the table/column at CHARACTER SET utf8.
4. Step 2 covers this.
5. Leave <meta ... UTF-*>.
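When it is unclear which of these steps went wrong, it can help to look at the raw bytes PHP actually received. The sketch below is only a diagnostic aid under stated assumptions: `ensureUtf8` is a hypothetical helper name, Windows-1252 is an assumed fallback encoding, and the mbstring extension is assumed to be available.

```php
<?php
// Hypothetical helper: return the string unchanged if it is already valid
// UTF-8, otherwise reinterpret its bytes as the assumed source encoding.
function ensureUtf8(string $bytes, string $assumedSource = 'Windows-1252'): string
{
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes; // already valid UTF-8, leave it alone
    }
    return mb_convert_encoding($bytes, 'UTF-8', $assumedSource);
}

// Assumed sample: the question's text as it would arrive over a latin1 connection.
$latin1Bytes = "acentua\xE7\xE3o \x80";

var_dump(mb_check_encoding($latin1Bytes, 'UTF-8')); // bool(false): would render as diamonds
echo bin2hex(ensureUtf8($latin1Bytes)), "\n";       // hex of the repaired UTF-8 string
```

This does not replace Plan B. A short latin1 string can occasionally also be valid UTF-8, so the check is a hint, not proof; fixing the connection charset remains the real cure.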

imap_search and subject command

Hello all, and happy holidays!
I have some code that connects to my mail inbox. Here it is:
$host = '{'.SMTP_HOST.':143/novalidate-cert}INBOX';
$entrada = imap_open($host, SMTP_USER , SMTP_PASS);
$emails_mejora = imap_search($entrada, 'SUBJECT "Envíanos el tamaño de la imágen"', SE_UID, 'UTF-8');
The subject contains UTF-8 characters and the search returns 0 results. With other subjects without UTF-8 characters it works fine.
Any help, please.
Thanks ;)
@sergio, use header("Content-type: text/html; charset=UTF-8"); at the top of your PHP file to set the character set to UTF-8.
You should use the collation "utf8_general_ci" on the database table field if you use special characters.

PHP output JSON Web Service charset UTF-8 error

I am hosting a web service that outputs JSON from PHP.
I have a Hebrew data set in the DB and I am sending it as the output of the web service.
When I output the data initially, the result is as follows:
JSON:
{
"tasklist": [
{
"customerID": "9936",
"name": "טר ×רמה ×™×–×•× ×•×‘×™× ×•×™ בע"מ",
"cargo":"×ברר",
"destination":"מכר",
"quantity":"1.000000",
"startdate":"03/01/201300: 00: 00"
}
]
}
The garbled text (for example the "cargo" value) can still be read by the Android/iPhone parser and converted back to the original Hebrew. But I get an error on the "name" value, where a " appears in the middle of the string, so the JSON is not valid and shows an error!
To overcome this issue I switched the output to UTF-8, which turns the garbled "cargo" value into the Hebrew "נברר". But in this case the problem remains the same:
PHP:
header('Content-type: text/html; charset=UTF-8');
JSON:
{
"tasklist": [
{
"customerID": "9936",
"name": "טר ארמה יזום ובינוי בע"מ",
"cargo":"נברר",
"destination":"מכר",
"quantity":"1.000000",
"startdate":"03/01/201300: 00: 00"
}
]
}
The problem still remains.
Also, in some cases I get this � because of using UTF-8:
"name":"מחצבות כפר גלעדי-חומרי מ�"
How can I overcome this issue?
Is there any other specific encoding I need to use?
Note: The data cannot be changed in the database. The solution should be applied while outputting the JSON.
How the data is stored in the DB is shown below:
name
מחצבות כפר גלעדי-חומרי מ×
My PHP Script which output JSON:
<?php
//My DB connection and Query comes here
$jsontext = '{"tasklist":[';
while($row = mysql_fetch_array($queryExe)){
$jsontext .= '{"customerID":"'.$row['AUTO_ID'].'",';
$jsontext .='"name":"'.$row['Customer_Name'].'",';
$jsontext .='"cargo":"'.$row['Type_of_Cargo'].'",';
$jsontext .='"destination":"'.$row['Destination'].'",';
$jsontext .='"quantity":"'.$row['Quantity'].'",';
$jsontext .='"startdate":"'.$row['startdate'].'"},';
}
$jsontext = substr_replace($jsontext, '', -1); // to get rid of extra comma
$jsontext .= "]}";
header('Content-type: text/html; charset=UTF-8');
//Output the final JSON
echo $jsontext;
?>
Thank you for your help in advance!
Was the question clear enough to understand my issue?
If your db-field is utf8 you should first do:
mysql_query("SET NAMES 'utf8'");
You should always do the 'SET NAMES...' before inserting your data, too.
Be sure that you really stored utf8 encoded strings!
then do your query:
mysql_query($your_query);
$array = array("tasklist"=>array());
while($row = mysql_fetch_array($queryExe)){
$a = array();
$a["customerID"] = $row['AUTO_ID'];
$a["name"] = $row['Customer_Name'];
$a["cargo"] = $row['Type_of_Cargo'];
$a["destination"] = $row['Destination'];
$a["quantity"] = $row['Quantity'];
$a["startdate"] = $row['startdate'];
$array["tasklist"][] = $a;
}
header("Content-type: application/json; charset=utf-8");
echo json_encode($array);
exit();
In my experience this is not enough when the server's default charset is, for example, ISO. In that case I need to add the following to my .htaccess:
AddDefaultCharset utf-8
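As a quick check that building the array and calling json_encode() also takes care of the embedded double quote from the question, here is a small sketch without a database. The row is an assumed stand-in for one fetched record, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Assumed sample row; the name contains a literal double quote (בע"מ).
$row = array(
    'customerID' => '9936',
    'name'       => 'טר ארמה יזום ובינוי בע"מ',
);

// JSON_UNESCAPED_UNICODE keeps the Hebrew readable instead of \uXXXX escapes;
// the embedded double quote is escaped as \" either way.
$json = json_encode(array('tasklist' => array($row)), JSON_UNESCAPED_UNICODE);
echo $json, "\n";

// Round-trip to confirm the output is valid JSON and the name is unchanged.
$decoded = json_decode($json, true);
var_dump($decoded['tasklist'][0]['name'] === $row['name']);
```

json_encode() returns false when a string is not valid UTF-8, so a failure here usually points back at the connection charset rather than at the quotes.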
You should change your code to use json_encode. You need to pass it properly utf8 encoded data.
If you are using MySQL you can try running the following before your query to get your data.
SET NAMES 'utf8';
You can also look into using utf8_encode.
From http://www.php.net/manual/en/function.json-encode.php#100565
That said, quotes " will produce invalid JSON, but this is only an issue if you're using json_encode() and just expect PHP to magically escape your quotes. You need to do the escaping yourself.
Maybe you can replace " with \"; I guess that will solve the issue.
Source : PHP JSON String, escape Double Quotes for JS output

json output with native languages?

I am using JSON output for my application, and all the data is stored in my native language in a MySQL server with utf8_general_ci.
When I fetch it using json_encode I get the JSON array, but the native-language text does not come through. How can I solve this?
The code I use to create the JSON data:
<?php
header('Content-type: text/html; charset=utf-8; pageEncoding="ISO-8859-1"');
include('include/config.php');
mysql_query("SET NAMES 'utf-8' 'ISO-8859-1'");
//mysql_query("SET CHARACTER SET utf8 ISO-8859-1");
$sth = mysql_query("select v.verse,b.book_name,v.chapter,v.verse_number from tbl_verses_mal v inner join tbl_books_mal b on v.book_id=b.book_id");
$rows = array();
while($r = mysql_fetch_assoc($sth)) {
$rows[] = $r;
}
print json_encode($rows);
?>
The output I get looks like this:
[{"verse":"???????? ?????? ???????? ?????? ??? ????????? . ?? ????????? ??? ??????? ????????????????? .","book_name":"Genesis","chapter":"1","verse_number":"1"},{"verse":"???? ??????? ?????????? ??????? : ???????????? ???? ????????????? .????????????? ??????? ??????","book_name":"Genesis","chapter":"1","verse_number":"2"},{"verse":"???????? ?????????? ????? ???? ?????????: ???????? ??????? ","book_name":"Genesis","chapter":"1","verse_number":"3"}]
The ????? marks stand for the native-language text that is in the database.
The expected result is like the one given below:
[{"verse":"അയൽകാരന് ആവശ്യം വരുമ്പോൾ നിങ്ങൾ കടം കൊടുക്കു. .","book_name":"Genesis","chapter":"1","verse_number":"1"},{"verse":"ഭൂമി പാഴായും ശൂന്യമായും ഇരുന്നു : ആഴത്തിന്മീതെ","book_name":"Genesis","chapter":"1","verse_number":"2"},]
How can I solve this issue?
mysql_query("SET NAMES 'utf-8' 'ISO-8859-1'");
This makes no sense. Set the charset properly:
mysql_set_charset('utf8');
You need to set the character set properly. Try:
mysql_set_charset('utf8');
Also, the mysql_* functions are deprecated; use PDO or MySQLi instead.

Using str_split on a UTF-8 encoded string

I'm currently working on a project, and instead of using regular MySQL queries I thought I'd go ahead and learn how to use PDO.
I have a table called contestants; the database, the table, and all of the columns are in utf-8. I have ten entries in the contestants table, and their column "name" contains characters such as åäö.
Now, when I fetch an entry from the database, and var_dump the name, I get a good result, a string with all the special characters intact. But what I need to do is to split the string by characters, to get them in an array that I then shuffle.
For instance, I have this string:
Test ÅÄÖ Tåän
And when I run str_split I get each character in its own key in an array. The only issue is that all the special characters display as this: �, meaning the array will be like this:
Array
(
[0] => T
[1] => e
[2] => s
[3] => t
[4] =>
[5] => �
[6] => �
[7] => �
[8] => �
[9] => �
[10] => �
[11] =>
[12] => T
[13] => �
[14] => �
[15] => �
[16] => �
[17] => n
)
As you can see, it not only messes up the characters, but it also duplicates them in the str_split process. I've tried several ways to split the string, but they all have the same issue. When I output the string before the split, it shows the special characters just fine.
This is my dbConn.php code:
// Require config file:
require_once('config.inc.php');
// Start PDO connection:
$dbHandle = new PDO("mysql:host=$dbHost;dbname=$dbName;charset=utf-8", $dbUser, $dbPass);
$dbHandle -> exec("SET CHARACTER SET utf8");
// Set error reporting:
$dbHandle->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_WARNING);
And this is the code that I use to fetch from the database and loop:
// Require files:
require_once('dbConn.php');
// Get random artist:
$artist = $dbHandle->query("SELECT * FROM ".ARTIST_TABLE." WHERE id = 11 ORDER BY RAND() LIMIT 1");
$artist->setFetchMode(PDO::FETCH_OBJ);
$artist = $artist->fetch();
var_dump($artist->name);
// Split name:
$artistChars = str_split($artist->name);
I'm connecting with utf-8, my php file is utf-8 without BOM and no other special characters on this page share this issue. What could be wrong, or what am I doing wrong?
Mind that the charset declaration used in your connect string is reported not to work.
In the comments on php.net I frequently see this alternative:
$dbHandle = new PDO("mysql:host=$dbHost;dbname=$dbName;charset=utf8", $dbUser, $dbPass,
array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES 'utf8'"));
str_split does not work with multi-byte characters: it splits on every byte, which breaks your characters apart. You could use mb_str_split() (PHP 7.4+) or preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY) instead.
UTF-8 Using PDO
Problems when writing international (even Chinese and Thai) characters to the database
There may be more ways to make this work. I am not an expert, just a tech-freak interested in understanding all this. On Linux and Windows I have set up a few CMSs (content management systems), using a sample from the following website:
'http://www.elated.com/articles/cms-in-an-afternoon-php-mysql'
The sample is using PDO for insert, update and delete.
It took me a few hours to find a solution. Whatever I did, I always found differences between the data in my forms and in the phpMyAdmin/HeidiSQL views.
I followed the hints at 'https://mathiasbynens.be/notes/mysql-utf8mb4' but still had no success.
In my CMS-structure there is a file 'Config.php':
After reading this webpage I changed the line
define( 'DB_DSN', 'mysql:host=localhost;dbname=mythings');
to
define( 'DB_DSN', 'mysql:host=localhost;dbname=mythings;charset=utf8');
Now all works fine.
The str_split function splits by byte, not by character. You'll need a multibyte-aware split such as mb_str_split() (PHP 7.4+) or preg_split('//u', ...).
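A minimal sketch of the difference, using the sample string from the question. It assumes the source file is saved as UTF-8; preg_split() with the /u modifier is used so it also runs on PHP versions older than 7.4.

```php
<?php
$name = 'Test ÅÄÖ Tåän';

// Byte-wise split: each of Å, Ä, Ö, å, ä becomes two broken single-byte entries.
$bytes = str_split($name);

// Character-wise split: the /u modifier makes PCRE treat the subject as UTF-8.
// On PHP 7.4+ mb_str_split($name) gives the same result.
$chars = preg_split('//u', $name, -1, PREG_SPLIT_NO_EMPTY);

echo count($bytes), " bytes, ", count($chars), " characters\n";

// The character array can be shuffled safely, as the question intends.
$shuffled = $chars;
shuffle($shuffled);
echo implode('', $shuffled), "\n";
```

The 18 byte entries match the array shown in the question (indexes 0 to 17), while the character-wise split has 13 entries.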
This works for me; hope it's useful.
Ensure that the database, Apache and every config are set to utf8.
PDO OBJECT
$dsn = 'mysql:host=' . Config::read('db.host') . ';dbname=' . config::read('db.basename') .';charset=utf8'. ';port=' . Config::read('db.port') .';connect_timeout=15';
$user = Config::read('db.user');
$password = Config::read('db.password');
$this->dbh = new PDO($dsn, $user, $password,array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES 'utf8'"));
$this->dbh->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
It works as long as you are not using another function such as str_word_count.
When using str_word_count you need to use utf8_decode(utf8_encode(...)).
function cortar($str)
{
if (20>$count=str_word_count($str)) {
return $str;
}
else
{
$array = str_word_count($str,1,'.,-0123456789()+=?¿!"<>*ñÑáéíóúÁÉÍÓÚ#|/%$#¡');
$s='';
$c=0;
foreach ($array as $e) {
if (20>$c) {
if (19>$c) {
$s.=$e.' ';
}
else
{
$s.=$e;
}
}
$c+=1;
}
return utf8_decode(utf8_encode($s));
}
}
The function returns a string with at most 20 words.
UTF-8 PROBLEMS & SOLUTIONS by PHP FUNCTIONS
1. How to save UTF-8 characters (mathematical strings, special chars like 92 ÷ 8 ÷ 2 = ?)?
Ans. $string =utf8_encode('92 ÷ 8 ÷ 2 = ?');
2. How to print UTF-8 characters from the database?
Ans. echo utf8_decode($string);
Note: If you do not want to do this with encoding/decoding, you can do it as follows.
1. If you are using mysqli_query() then
$conn = mysqli_connect('localhost','db_username','password','your_database_name');
mysqli_set_charset($conn,"utf8");
2. If you are using PDO then
class Database extends PDO{
function __construct() {
parent::__construct("mysql:host=localhost;dbname=your_db_name","gurutslz_root","Your_db_password",array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES 'utf8'"));
}
}
$conn=new Database();
I only had issues with text fields in my database structure, storing product descriptions. I set the field settings to blob instead of text, which solved my problem.
