Cannot get the correct utf-8 text from Access - php

When I try to get Chinese characters from the database, I get garbled text.
I have tried almost everything, such as html_entity_decode, htmlentities, saving the file as UTF-8, and encoding in UTF-8, but I can't seem to get it right.
How do I get the right text?
Here's my code:
<meta http-equiv='Content-Type' content='text/html; charset=utf-8' />
<?php
header('Content-Type: text/html; charset=utf-8');
$conn=odbc_connect('vocab','','');
$rs1=odbc_exec($conn,"SELECT MAX(ID) AS MaxId FROM vocab");
$NewMaxID=odbc_result($rs1,"MaxId");
$rand=rand(1,$NewMaxID);
$sql="SELECT word,part_of_speech,chinese FROM vocab WHERE ID=".$rand.";";
$rs=odbc_exec($conn,$sql);
$i=1;
odbc_fetch_row($rs);
$a=(odbc_result($rs,1));
$b=(odbc_result($rs,2));
$c=(odbc_result($rs,3));
//$c="鎮";
//$d=html_entity_decode($c);
//$c=htmlentities($d, ENT_NOQUOTES , "UTF-8");
$rows=array("first"=>$a,"second"=>$b,"third"=>$c);
echo json_encode($rows);
?>
P.S. I am using the Traditional Chinese version of MS Office.

I encountered this issue a while ago and the only way I could get it to work was to write the HTML into an ADODB.Stream object, save it to a file, and then echo the file:
<?php
define("TEMP_FOLDER", "C:\\__tmp\\");
header('Content-Type: text/html; charset=utf-8');
$stm = new COM("ADODB.Stream") or die("Cannot create COM object.");
$stm->Type = 2; // adTypeText
$stm->Charset = 'utf-8';
$stm->Open();
$stm->WriteText('<html>');
$stm->WriteText('<head>');
$stm->WriteText('<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />');
$stm->WriteText('<title>ADODB test</title>');
$stm->WriteText('</head>');
$stm->WriteText('<body>');
$con = new COM("ADODB.Connection");
$con->Open(
"Driver={Microsoft Access Driver (*.mdb, *.accdb)};" .
"Dbq=C:\\Users\\Public\\Database1.accdb");
$rst = $con->Execute("SELECT word FROM vocab WHERE ID=3");
$stm->WriteText($rst->Fields("word"));
$rst->Close();
$con->Close();
$stm->WriteText('</body>');
$stm->WriteText('</html>');
$tempFile = TEMP_FOLDER . uniqid("", TRUE) . ".txt";
$stm->SaveToFile($tempFile, 2); // adSaveCreateOverWrite
$stm->Close();
echo file_get_contents($tempFile);
unlink($tempFile);
?>
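A lighter alternative may be possible when the machine's ANSI code page covers the data. On a Traditional Chinese Windows installation that is typically code page 950 (Big5), so the ODBC result could be converted to UTF-8 before calling json_encode. The following is only a minimal sketch under that assumption: the helper name odbc_text_to_utf8 is hypothetical, the Big5 value is simulated rather than read from Access, and the conversion cannot recover characters the driver has already replaced with "?".

```php
<?php
// Hypothetical helper: convert text from an assumed ANSI code page to UTF-8.
// Assumption: the Access ODBC driver returns Big5 (CP950) bytes.
function odbc_text_to_utf8(string $raw, string $from = 'BIG-5'): string
{
    return mb_convert_encoding($raw, 'UTF-8', $from);
}

// Simulate a Big5 value as the driver might return it.
$big5 = mb_convert_encoding('鎮', 'BIG-5', 'UTF-8');

// json_encode() requires valid UTF-8 input, so convert first.
$rows = array('third' => odbc_text_to_utf8($big5));
echo json_encode($rows, JSON_UNESCAPED_UNICODE), "\n";
```

In the question's code this would mean wrapping each odbc_result() value before building $rows.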

Related

I'm having problems with UTF-8 charset in MPDF library

I am building a system that automatically generates contracts. The problem is that some characters do not print correctly in the PDF.
My name, for example, comes out garbled.
It should come out like this: Sérgio Avilla.
Below is the simplified application code.
<?php
require_once __DIR__ . '/vendor/autoload.php';
include 'config.php';
header("Content-type: text/html; charset=utf-8");
function file_get_contents_utf8($fn) {
$content = file_get_contents($fn);
return mb_convert_encoding($content, 'UTF-8', mb_detect_encoding($content, 'UTF-8, ISO-8859-1', true));
}
$html = file_get_contents_utf8("contratos/".$contrato);
$mpdf = new \Mpdf\Mpdf();
$mpdf->WriteHTML($html);
$mpdf->Output();
?>
I would be grateful if anyone could help me. I've already tested $html: printed directly to the screen it gives no problems and all the characters are right, so the problem is further down, in mPDF.
The contract HTML file contained a meta tag with a charset declaration. I just changed it to charset=utf-8 and it worked.
After: <meta http-equiv=Content-Type content="text/html; charset=utf-8">
Before: <meta http-equiv=Content-Type content="text/html; charset=windows-1252">
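If the contract files cannot be edited by hand, the same change could be applied in code before the HTML reaches mPDF. This is a rough sketch, not part of the original fix: the helper name declare_utf8_charset is hypothetical, and it only rewrites the declaration, so the content itself must already be UTF-8 (for example after the mb_convert_encoding step in the question).

```php
<?php
// Hypothetical helper: rewrite the charset declared in the first HTML <meta>
// tag to UTF-8. It changes only the declaration, not the bytes of the content.
function declare_utf8_charset(string $html): string
{
    return preg_replace(
        '/(<meta\b[^>]*\bcharset\s*=\s*["\']?)[\w-]+/i',
        '${1}utf-8',
        $html,
        1 // replace the first declaration only
    );
}

$before = '<meta http-equiv=Content-Type content="text/html; charset=windows-1252">';
echo declare_utf8_charset($before), "\n";
```

It would be called on $html just before $mpdf->WriteHTML($html).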

SQL result not english characters

I always work with MySQL, but I am now forced to work with SQL Server and I am lost. I just want to read a row in Spanish and I can't make it work. Here is the code; hopefully everything makes sense.
$connection = odbc_connect("Driver={SQL Server Native Client 11.0};Server=$server;Database=$database;", $user, $password);
$sql="SELECT * FROM my_table";
$res=odbc_exec($connection,$sql)or die(exit("Error en odbc_exec"));
while($arr = odbc_fetch_array($res)) {
$var = $arr["OkRef"];
echo "1.- ".iconv("Windows-1256", "UTF-8", "$var")."<br />";
echo "2.- ".iconv("CP437", "UTF-8", $var)."<br />";
echo "3.- ".iconv("CP850", "UTF-8", $var)."<br />";
echo "4.- ".utf8_decode($arr["OkRef"])."<br />";
echo "5.- ".utf8_encode($arr["OkRef"])."<br />";
echo "6.- ".$arr["OkRef"]."<br />";
echo "7.- ".mb_convert_encoding($arr["OkRef"], "utf-8", "windows-1251")."<br />";
echo "8.- ".htmlspecialchars( iconv("iso-8859-1", "utf-8", $var) );
}
}
I get this as result:
1.- ér àçHه¬´§d_meta_packet1Y³§0ت.122) ¸ؤ
2.- Θr ατHσ¼┤ºd_meta_packet1Y│º0╩.122) ╕─
3.- Úr ÓþHÕ¼┤ºd_meta_packet1Y│º0╩.122) ©─
4.- ?r ??H????d_meta_packet1Y??0?.122) ??
5.- ér àçH嬴§d_meta_packet1Y³§0Ê.122) ¸Ä
6.- �r ��H����d_meta_packet1Y��0�.122) ��
7.- йr азH嬴§d_meta_packet1Yі§0К.122) ёД
8.- ér àçH嬴§d_meta_packet1Y³§0Ê.122) ¸Ä
I tried also to add the following (not at once obviously) to make it work as it is:
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
header('Content-Type: text/html;charset=utf-8');
header('Content-Type: text/html;charset=iso-8859-1');
ini_set('mssql.charset', 'UTF-8');
The server is a Microsoft SQL Server Enterprise Edition, and the server Collation is Modern_Spanish_CI_AS.
I know that this answer is posted late, but I have been in a similar situation recently, so I want to share my experience.
My configuration is almost the same: a database and table columns with Cyrillic_General_CS_AS collation. Note that I use the PHP Driver for SQL Server, not the built-in ODBC support.
The steps below helped me resolve my case. I've used the collation from your example.
Database:
CREATE TABLE [dbo].[MyTable] (
[TextInSpanish] [varchar](50) COLLATE Modern_Spanish_CI_AS NULL,
[NTextInSpanish] [nvarchar](50) COLLATE Modern_Spanish_CI_AS NULL
)
INSERT [dbo].[MyTable] (TextInSpanish, NTextInSpanish)
VALUES ('Algunas palabras en español', N'Algunas palabras en español')
PHP:
Set default_charset = "UTF-8" in your php.ini file.
Encode your source files in UTF-8. I use Notepad++ for this step.
Read data from database:
With default connection encoding: for reading data from the database, use $data = iconv('CP1252', 'UTF-8', $data);
Note that, by default, data is returned in 8-bit characters as specified in the code page of the Windows locale that is set on the system. Any multi-byte characters or characters that do not map into this code page are substituted with a single-byte question mark (?) character. This is the default encoding.
With UTF-8 connection encoding.
Column must be of type 'nchar' or 'nvarchar'.
HTML:
Use: <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
Working Example:
test.php (PHP 7.1, PHP Driver for SQL Server 4.3, file test.php is UTF-8 encoded):
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<meta http-equiv="X-UA-Compatible" content="IE=edge"/>
<meta charset="utf-8">
<?php
// Connection settings
$server = '127.0.0.1\instance,port';
$database = 'database';
$user = 'username';
$password = 'password';
$cinfo = array(
"CharacterSet"=>SQLSRV_ENC_CHAR,
#"CharacterSet"=>"UTF-8",
"Database"=>$database,
"UID"=>$user,
"PWD"=>$password
);
$conn = sqlsrv_connect($server, $cinfo);
if ($conn === false)
{
echo "Error (sqlsrv_connect): ".print_r(sqlsrv_errors(), true);
exit;
}
// Query
$sql = "SELECT * FROM MyTable";
$res = sqlsrv_query($conn, $sql);
if ($res === false) {
echo "Error (sqlsrv_query): ".print_r(sqlsrv_errors(), true);
exit;
}
// Results
while ($arr = sqlsrv_fetch_array($res, SQLSRV_FETCH_ASSOC)) {
# Use the next 2 lines with the "CharacterSet"=>SQLSRV_ENC_CHAR connection setting
echo iconv('CP1252', 'UTF-8', $arr['TextInSpanish'])."<br>";
echo iconv('CP1252', 'UTF-8', $arr['NTextInSpanish'])."<br>";
# Use the next 2 lines with the "CharacterSet"=>"UTF-8" connection setting
#echo $arr['TextInSpanish']."<br>";
#echo $arr['NTextInSpanish']."<br>";
}
// End
sqlsrv_free_stmt($res);
sqlsrv_close($conn);
?>
</head>
<body></body>
</html>
Oh my gosh, this did it:
"$data = iconv('CP1252', 'UTF-8', $data);"
Or in my case:
$specialnost = $_POST['specialnost'];
$specialnost = iconv('CP1251', 'UTF-8', $specialnost);
I have been searching for the last three days for a solution! Thank you Zhorov!
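The conversion step can be seen in isolation with a small self-contained sketch. It assumes the source bytes are Windows-1252, as in the answer above; the sample string is made up for illustration and is not read from a database.

```php
<?php
// "\xF1" is the single byte for "ñ" in Windows-1252.
$raw = "espa\xF1ol";

// The raw bytes are not valid UTF-8, so a UTF-8 page cannot display them.
var_dump(mb_check_encoding($raw, 'UTF-8'));

// Convert from the assumed source code page to UTF-8.
$utf8 = iconv('CP1252', 'UTF-8', $raw);
var_dump(mb_check_encoding($utf8, 'UTF-8'));
echo $utf8, "\n";
```

Choosing the wrong source code page (as with CP437 or Windows-1251 in the question) still produces valid UTF-8, just with the wrong characters, so the code page has to match what the server actually sends.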

UTF 8 encoding - characters displaying wrong

Can anybody tell me how this is possible? This is my code:
<?php
require_once('class.Widget.php');
try {
$objWidget = new Widget(1);
print "Název nástroje: " . $objWidget->getName() . "<br>\n";
print "Popis nástroje: " . $objWidget->getDescription() . "<br>\n";
$objWidget->setName('2. nástroj');
$objWidget->setDescription('Tohle je druhý nástroj!');
} catch (Exception $e) {
die("Došlo k problému: " . $e->getMessage());
}
?>
I saved it in UTF-8 encoding, but when I run it in my web browser (Mozilla Firefox), it looks like this:
Název nástroje: 2. nástroj
Popis nástroje: Tohle je druhý nástroj!
Why are some characters displaying wrong?
You also need to have:
<meta charset="utf-8" />
in your HTML code
or add this at the beginning of your PHP file:
header('Content-Type: text/html; charset=utf-8');
You can try this:
<?php
header('Content-Type: text/html; charset=utf-8');
?>
You should specify it at the beginning of your PHP script. Otherwise, I think your browser may decode the page using your OS default encoding.

text encoding in php response

I'm trying to print JSON in Hebrew, but I only get escaped (\uXXXX) strings. How can I make sure the client's browser shows the strings in Hebrew?
The code is:
<html>
<head>
<meta charset="utf-8" />
</head>
<body>
<?php
header('Content-Type: text/html; charset=utf-8');
$response = array();
require_once __DIR__.'/db_connect.php';
$db = new DB_CONNECT();
$result = mysql_query(" SELECT * FROM stores") or die(mysql_error());
if (mysql_num_rows($result)>0){
$response["stores"]=array();
while($row = mysql_fetch_array($result)){
$store = array();
$store["_id"]=$row["_id"];
$store["name"]=$row["name"];
$store["store_type"]=$row["store_type"];
array_push($response["stores"],$store);
}
$response["success"] = 1;
$string = utf8_encode(json_encode($response));
echo hebrevc($string);
}else{
$response["success"]=0;
$response["message"]="No stores found";
echo utf8_encode(json_encode($response));
}
?>
</body>
</html>
and the response is:
{{"stores":[{"_id":"1","name":"\u05d7\u05ea\u05d5\u05dc\u05d9","store_type":"\u05de\u05e1\u05e2\u05d3\u05ea \u05d1\u05e9\u05e8\u05d9\u05dd"},{"_id":"2","name":"\u05de\u05e2\u05d3\u05e0\u05d9 \u05de\u05d0\u05de\u05d9","store_type":"\u05de\u05e1\u05e2\u05d3\u05d4 \u05dc\u05e8\u05d5\u05e1\u05d9\u05dd"}],"success":1
A nice constant was added in PHP 5.4: JSON_UNESCAPED_UNICODE
Using it will not escape your Hebrew characters.
echo json_encode($response, JSON_UNESCAPED_UNICODE);
Check out the json_encode reference.
The result looks like a UCS-2 string. Try setting the charset of the MySQL connection:
mysql_set_charset('utf8', $conn);
Then remove the utf8_encode calls.
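The difference the flag makes can be shown without a database. This is a minimal sketch with a made-up $response array; the helper names encode_response and send_json are hypothetical. Both forms are valid JSON and decode to the same text; the flag only changes how the characters are written.

```php
<?php
// Hypothetical helper: encode a response without \uXXXX escapes.
function encode_response(array $response): string
{
    return json_encode($response, JSON_UNESCAPED_UNICODE);
}

// Hypothetical helper: send the JSON on its own, with no surrounding HTML.
// header() must run before any output, so it cannot follow <html> markup.
function send_json(array $response): void
{
    header('Content-Type: application/json; charset=utf-8');
    echo encode_response($response);
}

$response = array('name' => 'חתולי', 'success' => 1);
echo json_encode($response), "\n";     // escaped form
echo encode_response($response), "\n"; // literal Hebrew characters
```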

In php how to display chinese character?

I am fetching an RSS feed from a Chinese website, but when I echo the result it is blank. The same code works on English RSS feeds. I have tried many things (decoding, iconv, header("Content-Type: text/html; charset=utf-8");), but it still does not display any Chinese text on my screen.
Here is my code:
header("Content-Type: text/html; charset=utf-8");
function getrssfeed($feed_url){
$Current = date("Y-m-d" ,strtotime("now"));
$content = file_get_contents($feed_url);
$xml = new SimpleXmlElement($content);
$body = "";
foreach($xml->channel->item as $entry){
$body .= get_html_translation_table(htmlspecialchars_decode(strip_tags($Current ." ". $entry->description))) . "\n\n";
//$result = iconv('UTF-8', 'ISO-8859-1//TRANSLIT//IGNORE', $body);
$i++;
if($i==5) {
break;
}
}
echo $body;
}
getrssFeed("http://news.baidu.com/n?cmd=1&class=enternews&tn=rss");
Can you help me solve this problem?
Thank you.
Put this in your HTML head:
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" ></meta>
There are two things you need to do.
Set the document type or header to
content="text/html;charset=utf-8"
Save the Chinese characters in the database with the field collation set to utf8_general_ci.
Maybe you can use this function:
mb_convert_encoding
At the same time, note that the source document's charset must be UTF-8 or GB2312.
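One way to apply mb_convert_encoding safely is to convert only text that is not already UTF-8. This is a minimal sketch, not a complete fix for the feed code above. The helper name ensure_utf8 is hypothetical, the GB2312 input is simulated, and the fallback encoding is an assumption about the source (EUC-CN is the name mbstring uses for GB2312).

```php
<?php
// Hypothetical helper: return $text as UTF-8, converting only when needed.
// Assumption: anything that is not valid UTF-8 is encoded in $fallback.
function ensure_utf8(string $text, string $fallback = 'EUC-CN'): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already UTF-8; converting again would corrupt it
    }
    return mb_convert_encoding($text, 'UTF-8', $fallback);
}

// Simulate text as a GB2312 feed might deliver it.
$gb = mb_convert_encoding('中文', 'EUC-CN', 'UTF-8');

echo ensure_utf8($gb), "\n";     // converted from GB2312
echo ensure_utf8('中文'), "\n";  // left unchanged
```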
