I'm having problems with UTF-8 charset in MPDF library - php

I am building a system that automatically generates contracts. The problem is that some characters do not print correctly in the PDF.
My name, for example, comes out garbled in the PDF.
It should come out like this: Sérgio Avilla.
Below is the simplified application code.
<?php
require_once __DIR__ . '/vendor/autoload.php';
include 'config.php';
header("Content-type: text/html; charset=utf-8");
function file_get_contents_utf8($fn) {
$content = file_get_contents($fn);
return mb_convert_encoding($content, 'UTF-8', mb_detect_encoding($content, 'UTF-8, ISO-8859-1', true));
}
$html = file_get_contents_utf8("contratos/".$contrato);
$mpdf = new \Mpdf\Mpdf();
$mpdf->WriteHTML($html);
$mpdf->Output();
?>
I would be grateful for any help. I have already checked $html: printed directly to the screen, it shows all the right characters, so the problem is on the mPDF side.

The contract HTML file had a meta tag declaring charset=windows-1252. I changed it to charset=utf-8 and it worked.
Before: <meta http-equiv=Content-Type content="text/html; charset=windows-1252">
After: <meta http-equiv=Content-Type content="text/html; charset=utf-8">
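To illustrate the fix: a minimal sketch, assuming the contract HTML is already stored as UTF-8 and only its meta declaration is wrong. The helper name force_utf8_meta_charset is hypothetical and not part of mPDF; it rewrites the declared charset before the markup is handed to WriteHTML().

```php
<?php
// Hypothetical helper: rewrite any charset declared in a <meta> tag to utf-8.
// It only changes the declaration; it does not re-encode the bytes.
function force_utf8_meta_charset(string $html): string
{
    return preg_replace(
        '/(<meta\b[^>]*\bcharset\s*=\s*["\']?)[\w-]+/i',
        '${1}utf-8',
        $html
    );
}

$before = '<meta http-equiv=Content-Type content="text/html; charset=windows-1252">';
echo force_utf8_meta_charset($before), "\n";

// Possible use with the code from the question:
// $mpdf->WriteHTML(force_utf8_meta_charset($html));
```

If the file itself is stored as windows-1252, it would need to be converted as well; this sketch does not cover that case.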

Related

SQL result not english characters

I always work with MySQL, but I am now forced to work with SQL Server and I am lost. I just want to fetch a row in Spanish and I can't make it work. Here is the code; hopefully everything makes sense.
$connection = odbc_connect("Driver={SQL Server Native Client 11.0};Server=$server;Database=$database;", $user, $password);
$sql="SELECT * FROM my_table";
$res = odbc_exec($connection, $sql) or die("Error en odbc_exec");
while($arr = odbc_fetch_array($res)) {
$var = $arr["OkRef"];
echo "1.- ".iconv("Windows-1256", "UTF-8", "$var")."<br />";
echo "2.- ".iconv("CP437", "UTF-8", $var)."<br />";
echo "3.- ".iconv("CP850", "UTF-8", $var)."<br />";
echo "4.- ".utf8_decode($arr["OkRef"])."<br />";
echo "5.- ".utf8_encode($arr["OkRef"])."<br />";
echo "6.- ".$arr["OkRef"]."<br />";
echo "7.- ".mb_convert_encoding($arr["OkRef"], "utf-8", "windows-1251")."<br />";
echo "8.- ".htmlspecialchars( iconv("iso-8859-1", "utf-8", $var) );
}
I get this as result:
1.- ér àçHه¬´§d_meta_packet1Y³§0ت.122) ¸ؤ
2.- Θr ατHσ¼┤ºd_meta_packet1Y│º0╩.122) ╕─
3.- Úr ÓþHÕ¼┤ºd_meta_packet1Y│º0╩.122) ©─
4.- ?r ??H????d_meta_packet1Y??0?.122) ??
5.- ér àçH嬴§d_meta_packet1Y³§0Ê.122) ¸Ä
6.- �r ��H����d_meta_packet1Y��0�.122) ��
7.- йr азH嬴§d_meta_packet1Yі§0К.122) ёД
8.- ér àçH嬴§d_meta_packet1Y³§0Ê.122) ¸Ä
I also tried adding each of the following (not all at once, obviously) to make it work:
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
header('Content-Type: text/html;charset=utf-8');
header('Content-Type: text/html;charset=iso-8859-1');
ini_set('mssql.charset', 'UTF-8');
The server is a Microsoft SQL Server Enterprise Edition, and the server Collation is Modern_Spanish_CI_AS.
I know this answer is late, but I have been in a similar situation recently, so I want to share my experience.
My configuration is almost the same: a database and table columns with Cyrillic_General_CS_AS collation. Note that I use the PHP Driver for SQL Server, not the built-in ODBC support.
The steps below helped me resolve my case. I've used the collation from your example.
Database:
CREATE TABLE [dbo].[MyTable] (
[TextInSpanish] [varchar](50) COLLATE Modern_Spanish_CI_AS NULL,
[NTextInSpanish] [nvarchar](50) COLLATE Modern_Spanish_CI_AS NULL
)
INSERT [dbo].[MyTable] (TextInSpanish, NTextInSpanish)
VALUES ('Algunas palabras en español', N'Algunas palabras en español')
PHP:
1. Set default_charset = "UTF-8" in your php.ini file.
2. Encode your source files in UTF-8. I use Notepad++ for this step.
3. Read data from the database in one of two ways:
   a. With the default connection encoding. Convert each value with $data = iconv('CP1252', 'UTF-8', $data); Note that by default, data is returned in 8-bit characters as specified in the code page of the Windows locale that is set on the system. Any multi-byte characters or characters that do not map into this code page are substituted with a single-byte question mark (?) character.
   b. With UTF-8 connection encoding. The column must be of type 'nchar' or 'nvarchar'.
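The first reading option can be tried without a database. This is a hedged sketch, assuming the driver hands back CP1252 bytes; the helper row_to_utf8 and the sample row are made up for illustration.

```php
<?php
// Hypothetical helper: convert every string value of a fetched row
// from CP1252 (the assumed Windows code page) to UTF-8.
function row_to_utf8(array $row): array
{
    return array_map(
        function ($value) {
            return is_string($value) ? iconv('CP1252', 'UTF-8', $value) : $value;
        },
        $row
    );
}

// "\xF1" is the single CP1252 byte for "ñ", standing in for driver output.
$row = ['TextInSpanish' => "Algunas palabras en espa\xF1ol", 'Id' => 1];
print_r(row_to_utf8($row));
```

Non-string values are passed through unchanged, so the helper can be applied to a whole associative row.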
HTML:
Use: <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
Working Example:
test.php (PHP 7.1, PHP Driver for SQL Server 4.3, file test.php is UTF-8 encoded):
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<meta http-equiv="X-UA-Compatible" content="IE=edge"/>
<meta charset="utf-8">
<?php
// Connection settings
$server = '127.0.0.1\instance,port';
$database = 'database';
$user = 'username';
$password = 'password';
$cinfo = array(
"CharacterSet"=>SQLSRV_ENC_CHAR,
#"CharacterSet"=>"UTF-8",
"Database"=>$database,
"UID"=>$user,
"PWD"=>$password
);
$conn = sqlsrv_connect($server, $cinfo);
if ($conn === false)
{
echo "Error (sqlsrv_connect): ".print_r(sqlsrv_errors(), true);
exit;
}
// Query
$sql = "SELECT * FROM MyTable";
$res = sqlsrv_query($conn, $sql);
if ($res === false) {
echo "Error (sqlsrv_query): ".print_r(sqlsrv_errors(), true);
exit;
}
// Results
while ($arr = sqlsrv_fetch_array($res, SQLSRV_FETCH_ASSOC)) {
# Use next 2 lines with "CharacterSet"=>SQLSRV_ENC_CHAR connection setting
echo iconv('CP1252', 'UTF-8', $arr['TextInSpanish'])."<br>";
echo iconv('CP1252', 'UTF-8', $arr['NTextInSpanish'])."<br>";
# Use next 2 lines with "CharacterSet"=>"UTF-8" connection setting
#echo $arr['TextInSpanish']."<br>";
#echo $arr['NTextInSpanish']."<br>";
}
// End
sqlsrv_free_stmt($res);
sqlsrv_close($conn);
?>
</head>
<body></body>
</html>
Oh my gosh, this did it:
"$data = iconv('CP1252', 'UTF-8', $data);"
Or in my case:
$specialnost = $_POST['specialnost'];
$specialnost = iconv('CP1251', 'UTF-8', $specialnost);
I have been searching for the last three days for a solution! Thank you Zhorov!

Convert a view in utf-8 using codeigniter

I have an encoding problem. I am trying to render my HTML as UTF-8 using CodeIgniter. My code is:
public function generateTitlePage($company)
{
$this->load->library('dompdf_gen');
$dompdf = new DOMPDF();
$search = array('%27', '%20', '%C3%A2', '%C3%AE');
$replace = array('', ' ', 'â', 'î');
$company = str_replace($search, $replace, $company);
$html = '
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
</head>
<body>
<div style="margin-top:20px;text-align: center;font-weight: bold">
LIMITATĂ'.$company.'
</div>
</body>
</html>
';
$dompdf->load_html($html);
$dompdf->render();
$dompdf->stream("welcome.pdf");
}
My output PDF shows "LIMITAT?" followed by the company name. I don't understand why Ă is not rendered even though I use the meta tag. I also set the CodeIgniter config $config['charset'] = 'UTF-8';. Please help, thanks in advance.
There is no direct conversion method. You have to use str_replace or something similar. For more information, see: PHP converting special characters, like ş to s, ţ to t, ă to a
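As a side note on the str_replace list in the question: a minimal sketch, assuming $company arrives percent-encoded as UTF-8 (which the '%C3%A2' entries suggest). PHP's rawurldecode() decodes every escape in one call. The sample value is invented, and this only cleans the input string; it is not claimed to fix the missing Ă in the PDF.

```php
<?php
// Invented sample input: "Sân în" percent-encoded as UTF-8.
$company = 'S%C3%A2n%20%C3%AEn';

// rawurldecode() turns each %XX escape back into its byte,
// so multi-byte UTF-8 characters are restored without a lookup table.
$decoded = rawurldecode($company);

echo $decoded, "\n";
```

Unlike the question's list, this also turns %27 into an apostrophe rather than removing it, so that one replacement would still have to be applied separately if it is wanted.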

Cannot get the correct utf-8 text from Access

When I try to read Chinese characters from the database, I get garbled text.
I have tried almost everything: html_entity_decode, htmlentities, saving the file as UTF-8, encoding to UTF-8, but I can't seem to get it right.
How do I get the right text?
Here's my code:
<meta http-equiv='Content-Type' content='text/html; charset=utf-8' />
<?php
header('Content-Type: text/html; charset=utf-8');
$conn=odbc_connect('vocab','','');
$rs1=odbc_exec($conn,"SELECT MAX(ID) AS MaxId FROM vocab");
$NewMaxID=odbc_result($rs1,"MaxId");
$rand=rand(1,$NewMaxID);
$sql="SELECT word,part_of_speech,chinese FROM vocab WHERE ID=".$rand.";";
$rs=odbc_exec($conn,$sql);
$i=1;
odbc_fetch_row($rs);
$a=(odbc_result($rs,1));
$b=(odbc_result($rs,2));
$c=(odbc_result($rs,3));
//$c="鎮";
//$d=html_entity_decode($c);
//$c=htmlentities($d, ENT_NOQUOTES , "UTF-8");
$rows=array("first"=>$a,"second"=>$b,"third"=>$c);
echo json_encode($rows);
?>
P.S. I am using the Traditional Chinese version of MS Office.
I encountered this issue a while ago and the only way I could get it to work was to write the HTML into an ADODB.Stream object, save it to a file, and then echo the file:
<?php
define("TEMP_FOLDER", "C:\\__tmp\\");
header('Content-Type: text/html; charset=utf-8');
$stm = new COM("ADODB.Stream") or die("Cannot create COM object.");
$stm->Type = 2; // adTypeText
$stm->Charset = 'utf-8';
$stm->Open();
$stm->WriteText('<html>');
$stm->WriteText('<head>');
$stm->WriteText('<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />');
$stm->WriteText('<title>ADODB test</title>');
$stm->WriteText('</head>');
$stm->WriteText('<body>');
$con = new COM("ADODB.Connection");
$con->Open(
"Driver={Microsoft Access Driver (*.mdb, *.accdb)};" .
"Dbq=C:\\Users\\Public\\Database1.accdb");
$rst = $con->Execute("SELECT word FROM vocab WHERE ID=3");
$stm->WriteText($rst->Fields("word"));
$rst->Close();
$con->Close();
$stm->WriteText('</body>');
$stm->WriteText('</html>');
$tempFile = TEMP_FOLDER . uniqid("", TRUE) . ".txt";
$stm->SaveToFile($tempFile, 2); // adSaveCreateOverWrite
$stm->Close();
echo file_get_contents($tempFile);
unlink($tempFile);
?>
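One thing worth checking in the question's code, independent of the ADODB workaround: json_encode() only accepts valid UTF-8. A small sketch, assuming the ODBC driver may return bytes in another encoding; the byte string below is a made-up stand-in for such a value.

```php
<?php
// Made-up stand-in for a non-UTF-8 value coming back from ODBC.
$raw = "\xB9\xF1";

// json_encode() returns false for invalid UTF-8 input.
$encoded = json_encode(['third' => $raw]);
if ($encoded === false) {
    echo 'json_encode failed: ', json_last_error_msg(), "\n";
}

// With valid UTF-8, JSON_UNESCAPED_UNICODE keeps the characters readable
// instead of emitting \uXXXX escapes.
$ok = json_encode(['third' => '鎮'], JSON_UNESCAPED_UNICODE);
echo $ok, "\n";
```

Checking the return value this way at least shows whether the problem is an encoding mismatch before the JSON step.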

In php how to display chinese character?

I am fetching an RSS feed from a Chinese website, but when I echo it the output is blank. The same code works on English RSS feeds. I have tried various decode functions, iconv, and header("Content-Type: text/html; charset=utf-8");, but I still cannot display any Chinese text on screen.
Here is my code:
header("Content-Type: text/html; charset=utf-8");
function getrssfeed($feed_url){
$Current = date("Y-m-d" ,strtotime("now"));
$content = file_get_contents($feed_url);
$xml = new SimpleXmlElement($content);
$body = "";
foreach($xml->channel->item as $entry){
$body .= get_html_translation_table(htmlspecialchars_decode(strip_tags($Current ." ". $entry->description))) . "\n\n";
//$result = iconv('UTF-8', 'ISO-8859-1//TRANSLIT//IGNORE', $body);
$i++;
if($i==5) {
break;
}
}
echo $body;
}
getrssFeed("http://news.baidu.com/n?cmd=1&class=enternews&tn=rss");
Can you help me solve this problem?
Thank you.
In your HTML head, put this:
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" ></meta>
There are two things you need to do:
1. Set the document type or header to content="text/html;charset=utf-8"
2. Save the Chinese characters in the database with the field collation set to utf8_general_ci
Maybe you can use mb_convert_encoding, but note that the source document's charset must be UTF-8 or GB2312.
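The conversion suggested in the last answer might look like this. It is a sketch, assuming the feed is served as GB2312 (the question does not confirm the feed's actual encoding). The sample bytes are hard-coded so the snippet runs without network access.

```php
<?php
// "中文" encoded as GB2312 bytes, standing in for downloaded feed text.
$gb2312 = "\xD6\xD0\xCE\xC4";

// Convert to UTF-8 so the text matches a charset=utf-8 page.
$utf8 = mb_convert_encoding($gb2312, 'UTF-8', 'GB2312');

echo $utf8, "\n";
```

If the feed turns out to be UTF-8 already, this conversion should be skipped, since converting twice would garble the text.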

file_get_contents() Breaks Up UTF-8 Characters

I am loading an HTML document from an external server. The markup is UTF-8 encoded and contains characters such as ľ, š, č, ť, ž, etc. When I load the HTML with file_get_contents() like this:
$html = file_get_contents('http://example.com/foreign.html');
It messes up the UTF-8 characters and loads Å, ¾, ¤ and similar nonsense instead of proper UTF-8 characters.
How can I solve this?
UPDATE:
I tried both saving the HTML to a file and outputting it with UTF-8 encoding. Neither works, which suggests file_get_contents() is already returning broken HTML.
UPDATE2:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="sk" lang="sk">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta http-equiv="Content-Language" content="sk" />
<title>Test</title>
</head>
<body>
<?php
$html = file_get_contents('http://example.com');
echo htmlentities($html);
?>
</body>
</html>
I had a similar problem with Polish.
I tried:
$fileEndEnd = mb_convert_encoding($fileEndEnd, 'UTF-8', mb_detect_encoding($fileEndEnd, 'UTF-8', true));
I tried:
$fileEndEnd = utf8_encode ( $fileEndEnd );
I tried:
$fileEndEnd = iconv( "UTF-8", "UTF-8", $fileEndEnd );
And then:
$fileEndEnd = mb_convert_encoding($fileEndEnd, 'HTML-ENTITIES', "UTF-8");
This last one worked perfectly!
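On PHP 8.2 and later, passing 'HTML-ENTITIES' to mb_convert_encoding() raises a deprecation notice. A hedged sketch of one alternative, mb_encode_numericentity(), assuming the input is valid UTF-8; the sample string is invented.

```php
<?php
$text = 'ľščťž';

// Conversion map: code points 0x80..0x10FFFF, no offset, full mask.
// Every non-ASCII character becomes a numeric entity; ASCII is left alone.
$convmap = [0x80, 0x10FFFF, 0, 0x1FFFFF];
$entities = mb_encode_numericentity($text, $convmap, 'UTF-8');

echo $entities, "\n";
```

The result always uses numeric entities, which browsers render the same way as the characters themselves.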
A solution suggested in the comments of the PHP manual entry for file_get_contents:
function file_get_contents_utf8($fn) {
$content = file_get_contents($fn);
return mb_convert_encoding($content, 'UTF-8',
mb_detect_encoding($content, 'UTF-8, ISO-8859-1', true));
}
You might also try your luck with http://php.net/manual/en/function.mb-internal-encoding.php
Alright. I have found out that file_get_contents() is not causing this problem. There is a different cause, which I describe in another question. Silly me.
See this question: Why Does DOM Change Encoding?
Example:
$string = file_get_contents(".../File.txt");
$string = mb_convert_encoding($string, 'UTF-8', "ISO-8859-1");
echo $string;
I think you simply have a double conversion of the character encoding there.
That may be because you embedded an HTML document within an HTML document, so you end up with something like this:
<!DOCTYPE html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title></title>
</head>
<body>
<!DOCTYPE html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Test</title>.......
The use of mb_detect_encoding therefore may lead you to other issues.
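The double conversion described here can be reproduced in isolation. A minimal sketch, assuming the text is already UTF-8; the guard helper to_utf8_once is hypothetical and simply skips strings that already validate as UTF-8.

```php
<?php
// Hypothetical guard: convert from ISO-8859-1 only when the input
// is not already valid UTF-8.
function to_utf8_once(string $s): string
{
    return mb_check_encoding($s, 'UTF-8')
        ? $s
        : mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1');
}

$utf8 = 'é';                                                 // already UTF-8
$twice = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');  // converted again: mojibake

echo $twice, "\n";               // the two bytes of "é" are now two characters
echo to_utf8_once($utf8), "\n";  // unchanged
echo to_utf8_once("\xE9"), "\n"; // ISO-8859-1 byte for "é", converted once
```

The guard is a heuristic: some ISO-8859-1 byte sequences also happen to be valid UTF-8, so it cannot replace knowing the real source encoding.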
In Turkish, mb_convert_encoding or any other charset conversion did not work.
urlencode did not work either, because it converts the space character to +. Percent-encoding requires %20.
This one worked!
$url = rawurlencode($url);
$url = str_replace("%3A", ":", $url);
$url = str_replace("%2F", "/", $url);
$data = file_get_contents($url);
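The steps above can be wrapped in one function. This is a sketch with a hypothetical function name and an invented example URL; it assumes the URL has no query string, since "?", "=" and "&" would be percent-encoded too.

```php
<?php
// Hypothetical wrapper around the three steps shown above.
function encode_url_keep_separators(string $url): string
{
    $encoded = rawurlencode($url); // spaces become %20, not +
    return str_replace(['%3A', '%2F'], [':', '/'], $encoded);
}

echo urlencode('dosya adı.txt'), "\n";    // space becomes "+"
echo rawurlencode('dosya adı.txt'), "\n"; // space becomes "%20"
echo encode_url_keep_separators('http://example.com/dosya adı.txt'), "\n";
```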
I managed to solve it using the function below:
function file_get_contents_utf8($url) {
$content = file_get_contents($url);
return mb_convert_encoding($content, "HTML-ENTITIES", "UTF-8");
}
file_get_contents_utf8($url);
Try this too
$url = 'http://www.domain.com/';
$html = file_get_contents($url);
// Note: iconv() takes (from, to, string), so this call converts from UTF-8 to ISO-8859-1
$html = iconv('UTF-8', 'ISO-8859-1//TRANSLIT', $html);
I am working with 35,000 lines of data.
$f = fopen("veri1.txt", "r");
// fgets() returns false at end of file, so test its result instead of feof()
while (($raw = fgets($f)) !== false) {
    $line = mb_convert_encoding($raw, 'HTML-ENTITIES', "UTF-8");
    echo $line;
}
fclose($f);
This code converts my strange characters into normal ones.
I had a similar problem, what solved it was html_entity_decode.
My code is:
$content = file_get_contents("http://example.com/fr");
$x = new SimpleXMLElement($content);
foreach($x->channel->item as $entry) {
$subEntry = html_entity_decode($entry->description);
}
Here I am retrieving an XML file (in French), which is why I use the $x object. Only then do I decode each description into $subEntry.
I tried mb_convert_encoding, but it didn't work for me.
Try this function
function mb_html_entity_decode($string) {
if (extension_loaded('mbstring') === true)
{
mb_language('Neutral');
mb_internal_encoding('UTF-8');
mb_detect_order(array('UTF-8', 'ISO-8859-15', 'ISO-8859-1', 'ASCII'));
return mb_convert_encoding($string, 'UTF-8', 'HTML-ENTITIES');
}
return html_entity_decode($string, ENT_COMPAT, 'UTF-8');
}
