UTF-8 encoding - characters displaying wrong - PHP

Can anybody tell me how this is possible? This is my code:
<?php
require_once('class.Widget.php');
try {
$objWidget = new Widget(1);
print "Název nástroje: " . $objWidget->getName() . "<br>\n";
print "Popis nástroje: " . $objWidget->getDescription() . "<br>\n";
$objWidget->setName('2. nástroj');
$objWidget->setDescription('Tohle je druhý nástroj!');
} catch (Exception $e) {
die("Došlo k problému: " . $e->getMessage());
}
?>
I saved it in UTF-8 encoding, but when I run it in my web browser (Mozilla Firefox), it looks like this:
Název nástroje: 2. nástroj
Popis nástroje: Tohle je druhý nástroj!
Why are some characters displayed incorrectly?

You also need this in your HTML code:
<meta charset="utf-8" />
or add this at the beginning of your PHP file, before any output:
header('Content-Type: text/html; charset=utf-8');

You can try this:
<?php
header('Content-Type: text/html; charset=utf-8');
?>
You should specify it at the beginning of your PHP script. Otherwise, I think your browser may fall back to your OS default encoding.

Related

PHP using UTF8 characters in URL, url encoding fails

In my PHP script I try to send UTF-8 characters to the Google Translate website to get a translation of the text back, but this doesn't work for UTF-8 characters such as Chinese, Arabic and Russian, and I can't figure out why. If I want to translate 'как дела' to English, I can use this link: https://translate.googleapis.com/translate_a/single?client=gtx&sl=ru&tl=en&dt=t&q=как дела
And it would return this: [[["how are you","как дела",,,1]],,"ru"]
A fine translation, exactly what I wanted, but if I try to recreate it in PHP I do this (I used bytes in the beginning because my future script will use bytes as starting point):
<?php
$bytes = array(1082,1072,1082,32,1076,1077,1083,1072); // bytes of: как дела
$str = "";
for($i = 0; $i < count($bytes); ++$i) {
$str .= json_decode('"\u' . '0' . strtoupper(dechex($bytes[$i])) . '"'); // returns string: как дела
}
$from = 'ru';
$to = 'en';
$url = 'https://translate.googleapis.com/translate_a/single?client=gtx&sl=' . $from . '&tl=' . $to . '&dt=t&q=' . $str;
$call = fopen($url,"r");
$contents = fread($call,2048);
print $contents;
?>
And it outputs: [[["RєR RєRґRμR ° \"° F","какдела",,,0]],,"ru"]
The output doesn't make sense; it appears that my PHP script sends the string 'какдела' to be translated to English. I read something about making UTF-8 characters readable for Google in a URI (or URL). It says I should convert my bytes to UTF-8 code units and put them in my URL. I haven't yet figured out how to convert bytes to UTF-8 code units, but I first wanted to check whether it would work. I started by converting my text 'как дела' to code units (percent-encoded for the URL) to test it myself. This resulted in the following link: https://translate.googleapis.com/translate_a/single?client=gtx&sl=ru&tl=en&dt=t&q=%D0%BA%D0%B0%D0%BA+%D0%B4%D0%B5%D0%BB%D0%B0
And when tested in browser it returns: [[["how are you","как дела",,,1]],,"ru"]
Again a fine translation. It appears to work, so I tried to implement it in my script with the following code:
<?php
$from = 'ru';
$to = 'en';
$text = "%D0%BA%D0%B0%D0%BA+%D0%B4%D0%B5%D0%BB%D0%B0"; // code units of: как дела
$url = 'https://translate.googleapis.com/translate_a/single?client=gtx&sl=' . $from . '&tl=' . $to . '&dt=t&q=' . $text;
$call = fopen($url,"r");
$contents = fread($call,2048);
print $contents;
?>
This script outputs: [[["RєR Rє RґRμR ° \"° F","как дела",,,0]],,"ru"]
Again my script doesn't output what I want, nor what I get when I test these URLs in my own browser. I can't figure out what I'm doing wrong and why Google responds with a jumble of characters when I use the link in my PHP file.
Does someone know how to get the output I want? Thanks in advance!
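One likely reason the first attempt sends 'какдела' without the space: dechex(32) is '20', so the escape built in the loop is \u020, which has only three hex digits. That is not valid JSON, so json_decode() returns null and the space is silently dropped. The numbers in the array are also Unicode code points rather than bytes. The following is a minimal sketch of a more robust conversion, assuming PHP 7.2+ with the mbstring extension (for mb_chr); the helper name codepoints_to_query is hypothetical.

```php
<?php
// Hypothetical helper: turn Unicode code points into a percent-encoded
// UTF-8 query value. Requires the mbstring extension (mb_chr, PHP 7.2+).
function codepoints_to_query(array $codepoints): string
{
    $str = '';
    foreach ($codepoints as $cp) {
        // mb_chr() encodes one code point as UTF-8, including U+0020 (space).
        $str .= mb_chr($cp, 'UTF-8');
    }
    // rawurlencode() percent-encodes each UTF-8 byte (a space becomes %20).
    return rawurlencode($str);
}

$codepoints = array(1082, 1072, 1082, 32, 1076, 1077, 1083, 1072); // как дела
$url = 'https://translate.googleapis.com/translate_a/single?client=gtx&sl=ru&tl=en&dt=t&q='
     . codepoints_to_query($codepoints);
print $url . "\n";
```

This only fixes how the URL is built; it does not by itself explain the garbled translation in the response.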
Updated code to set strings in UTF-8 (not working)
I added a lot of settings at the top of the PHP file to make sure everything is in UTF-8 format. I also added an mb_convert_encoding call halfway through, but the output is still wrong. The fopen function doesn't seem to send the right UTF-8 string to Google.
Output I get:
URL: https://translate.googleapis.com/translate_a/single?client=gtx&sl=ru&tl=en&dt=t&q=%D0%BA%D0%B0%D0%BA%20%D0%B4%D0%B5%D0%BB%D0%B0
Encoding: ASCII
File contents: [[["RєR Rє RґRμR ° \"° F","как дела",,,0]],,"ru"]
Code I use:
<?php
header('Content-Type: text/html; charset=utf-8');
$TYPO3_CONF_VARS['BE']['forceCharset'] = 'utf-8';
mb_internal_encoding('UTF-8');
mb_http_output('UTF-8');
mb_http_input('UTF-8');
mb_language('uni');
mb_regex_encoding('UTF-8');
ob_start('mb_output_handler');
$from = 'ru';
$to = 'en';
$text = rawurlencode('как дела');
$url = 'https://translate.googleapis.com/translate_a/single?client=gtx&sl=' . $from . '&tl=' . $to . '&dt=t&q=' . $text;
$url = mb_convert_encoding($url, "UTF-8", "ASCII");
$call = fopen($url,"r");
$contents = fread($call,2048);
print 'URL: ' . $url . '<br>';
print 'Encoding: ' . mb_detect_encoding($url) . '<br>';;
print 'File contents: ' . $contents;
?>
Solved! I got a hint from someone outside these forums to look at this Stack Overflow post about setting a user agent. After some more research I found that this answer was the solution to my problem. Now everything works fine!
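The linked post is not reproduced here, so the following is only a sketch of what setting a user agent can look like with PHP's HTTP stream wrapper. The user agent string and the function name make_translate_context are assumptions; stream_context_create() and the 'user_agent' option of the http wrapper are standard PHP.

```php
<?php
// Hypothetical helper: build a stream context that sends a User-Agent header.
function make_translate_context(string $userAgent)
{
    return stream_context_create(array(
        'http' => array(
            'method'     => 'GET',
            'user_agent' => $userAgent, // sent as the User-Agent request header
        ),
    ));
}

// Assumed browser-like value; the string that actually works may differ.
$context = make_translate_context('Mozilla/5.0 (compatible; ExampleScript/1.0)');

$url = 'https://translate.googleapis.com/translate_a/single?client=gtx&sl=ru&tl=en&dt=t&q='
     . rawurlencode('как дела');

// Uncomment to perform the request (needs network access and the openssl extension):
// $contents = file_get_contents($url, false, $context);
// print $contents;
```

The same context can be passed as the fourth argument of fopen($url, 'r', false, $context) if the fopen/fread style from the question is kept.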

Result from URL request returning weird characters instead of accents

My problem is that the accents are not displayed in the output of print_r().
Here is my code:
<?php
include('./lib/simple_html_dom.php');
error_reporting(E_ALL);
if (isset($_GET['q'])){
$q = $_GET['q'];
$keyword=urlencode($q);
$url="https://www.google.com/search?q=$keyword";
$html=file_get_html($url);
$results=$html->find('li.g');
$G_tot = sizeof($results)-1;
for($g=0;$g<=$G_tot;$g++){
$results=$html->find('li.g',$g);
$array_ttl_google[]=$results->find('h3.r',0)->plaintext;
$array_desc_google[]=$results->find('span.st',0)->plaintext;
$array_href_google[]=$results->find('cite',0)->plaintext;
}
print_r($array_desc_google);
}
?>
Here is the result of print_r:
Array ( [0] => �t� m (plural �t�s)...
What is the resolution in your opinion?
Three basic things you can do:
1. Set the page encoding to UTF-8. Add at the very beginning of your page: header('Content-Type: text/html; charset=utf-8');
2. Make sure your code file is saved as UTF-8 (without BOM).
3. Add a function to convert the parsed string to UTF-8 (in case some other sites are using different encodings).
Your code should look something like this (tested and working well with English and Hebrew results):
<?php
header('Content-Type: text/html; charset=utf-8');
include('simple_html_dom.php');
error_reporting(0);
if (isset($_GET['q'])){
$q = $_GET['q'];
$keyword=urlencode($q);
$url="https://www.google.com/search?q=$keyword";
$html=file_get_html($url);
//Make sure we received UTF-8:
$encoding = @mb_detect_encoding($html);
if ($encoding && strtoupper($encoding) != "UTF-8")
$html = @iconv($encoding, "utf-8//TRANSLIT//IGNORE", $html);
//Proceed with your code:
$results=$html->find('li.g');
$G_tot = sizeof($results)-1;
for($g=0;$g<=$G_tot;$g++){
$results=$html->find('li.g',$g);
$array_ttl_google[]= $results->find('h3.r',0)->plaintext;
$array_desc_google[]= $results->find('span.st',0)->plaintext;
$array_href_google[] = $results->find('cite',0)->plaintext;
}
print_r($array_desc_google);
} else {
echo "You forgot to set the 'q' variable in your url.";
}
?>
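The third point above mentions a conversion function, but the listing does the conversion inline. Here is a minimal sketch of such a helper, assuming the mbstring extension is available. The name to_utf8 and the ISO-8859-1 fallback are assumptions, since the real source encoding depends on the site being fetched.

```php
<?php
// Hypothetical helper: return $text as UTF-8.
// $fallback is an assumed source encoding for input that is not valid UTF-8.
function to_utf8(string $text, string $fallback = 'ISO-8859-1'): string
{
    // Already valid UTF-8: leave untouched to avoid double encoding.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }
    return mb_convert_encoding($text, 'UTF-8', $fallback);
}

// "été" as ISO-8859-1 bytes is converted; valid UTF-8 input passes through unchanged.
$latin1 = "\xE9t\xE9";
print to_utf8($latin1) . "\n";
```

It could be applied to each ->plaintext value before it is stored in the result arrays.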

Cannot get the correct utf-8 text from Access

When I tried to get Chinese characters from the database, I got weird text.
I tried almost everything, like html_entity_decode, htmlentities, saving the file as UTF-8, and encoding in UTF-8, but I can't seem to get it right.
How do I get the right text?
Here's my code:
<meta http-equiv='Content-Type' content='text/html; charset=utf-8' />
<?php
header('Content-Type: text/html; charset=utf-8');
$conn=odbc_connect('vocab','','');
$rs1=odbc_exec($conn,"SELECT MAX(ID) AS MaxId FROM vocab");
$NewMaxID=odbc_result($rs1,"MaxId");
$rand=rand(1,$NewMaxID);
$sql="SELECT word,part_of_speech,chinese FROM vocab WHERE ID=".$rand.";";
$rs=odbc_exec($conn,$sql);
$i=1;
odbc_fetch_row($rs);
$a=(odbc_result($rs,1));
$b=(odbc_result($rs,2));
$c=(odbc_result($rs,3));
//$c="鎮";
//$d=html_entity_decode($c);
//$c=htmlentities($d, ENT_NOQUOTES , "UTF-8");
$rows=array("first"=>$a,"second"=>$b,"third"=>$c);
echo json_encode($rows);
?>
PS: I am using the Traditional Chinese version of MS Office.
I encountered this issue a while ago and the only way I could get it to work was to write the HTML into an ADODB.Stream object, save it to a file, and then echo the file:
<?php
define("TEMP_FOLDER", "C:\\__tmp\\");
header('Content-Type: text/html; charset=utf-8');
$stm = new COM("ADODB.Stream") or die("Cannot create COM object.");
$stm->Type = 2; // adTypeText
$stm->Charset = 'utf-8';
$stm->Open();
$stm->WriteText('<html>');
$stm->WriteText('<head>');
$stm->WriteText('<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />');
$stm->WriteText('<title>ADODB test</title>');
$stm->WriteText('</head>');
$stm->WriteText('<body>');
$con = new COM("ADODB.Connection");
$con->Open(
"Driver={Microsoft Access Driver (*.mdb, *.accdb)};" .
"Dbq=C:\\Users\\Public\\Database1.accdb");
$rst = $con->Execute("SELECT word FROM vocab WHERE ID=3");
$stm->WriteText($rst->Fields("word"));
$rst->Close();
$con->Close();
$stm->WriteText('</body>');
$stm->WriteText('</html>');
$tempFile = TEMP_FOLDER . uniqid("", TRUE) . ".txt";
$stm->SaveToFile($tempFile, 2); // adSaveCreateOverWrite
$stm->Close();
echo file_get_contents($tempFile);
unlink($tempFile);
?>

In php how to display chinese character?

What I am building grabs an RSS feed from a Chinese RSS website, but when I echo it out the result is blank. My code works on English RSS feeds. I tried a lot of decoding, iconv, and header("Content-Type: text/html; charset=utf-8");, but the result is the same: no Chinese text is displayed on my screen.
Here is my code:
header("Content-Type: text/html; charset=utf-8");
function getrssfeed($feed_url){
$Current = date("Y-m-d" ,strtotime("now"));
$content = file_get_contents($feed_url);
$xml = new SimpleXmlElement($content);
$body = "";
foreach($xml->channel->item as $entry){
$body .= get_html_translation_table(htmlspecialchars_decode(strip_tags($Current ." ". $entry->description))) . "\n\n";
//$result = iconv('UTF-8', 'ISO-8859-1//TRANSLIT//IGNORE', $body);
$i++;
if($i==5) {
break;
}
}
echo $body;
}
getrssFeed("http://news.baidu.com/n?cmd=1&class=enternews&tn=rss");
Can you help me solve my problem?
Thank you.
In your HTML header put this:
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
Two things you need to do:
1. Set the document type or header as
content="text/html;charset=utf-8"
2. Save the Chinese characters in the database in a field with the collation utf8_general_ci
Maybe you can use the function
mb_convert_encoding
but at the same time you should make sure that the source document's charset really is UTF-8 or GB2312.
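As a sketch of the mb_convert_encoding suggestion: the call below converts GB2312-encoded bytes to UTF-8. It assumes the mbstring extension and uses the encoding name GB18030, a superset of GB2312, as the source. Whether the feed in question really uses that encoding is an assumption that should be checked against its XML declaration or HTTP headers.

```php
<?php
// "中文" encoded as GB2312 (two bytes per character).
$gbBytes = "\xD6\xD0\xCE\xC4";

// Convert to UTF-8. GB18030 is named as the source because it is a
// superset of GB2312, so GB2312 bytes decode correctly under it.
$utf8 = mb_convert_encoding($gbBytes, 'UTF-8', 'GB18030');

echo $utf8, "\n";
echo bin2hex($utf8), "\n";
```

If the source is already UTF-8, converting it again from a GB encoding would corrupt it, which is why the source charset has to be known first.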

PHP header function not triggering?

I'm getting some fairly odd behavior here... I noticed that on localhost a header statement I have works just fine, but when copied over (SAME EXACT CODE) to my live site, the header statement no longer triggers.
I added some echos to help debug. This if statement will only trigger if the URL variable id is NOT set. So with this same exact code on localhost, I never see these debug statements. On the live site, I do... which I shouldn't. I should get redirected instead. Anyone know why a header statement would be ignored?
if((!isset($_GET['id'])) && $rows != 0) {
$result = mysql_query('SELECT videoinfo FROM videos where game_id=' . $gameid . ' LIMIT 1');
$row = mysql_fetch_array($result);
$tubeID = $row['videoinfo'];
echo '<script type="text/javascript">';
echo 'window.location = "videos.php?id=' . $tubeID . '&awayid=' . $awayid . '&homeid=' . $homeid . '&date=' . $date . '&time=' . $time . '&gameid=' . $gameid . '&play=0"';
echo '</script>';
}
EDIT: Four people have told me already that I can't have any echo calls before my header function. This code WORKS on localhost and the header function DOES trigger. Regardless, REMOVING the echo statements DOES NOT fix it.
AFAIK, you can't send header() after an echo.
You cannot output ANYTHING before you call header(), i.e. echo etc...
I don't think you can have a header after any sort of output. I just came up with that.
You need to enable output buffering. This is probably enabled on your localhost but not on your live system (or the value on the live system is too small).
This will keep your output buffered, and then the Header will work fine.
Your debug statements are making the problem worse.
Header statements will not work if there is any output before they are called. In this case, your echos are killing it. Also, make sure there is no other output before this is called (whitespace, HTML, etc.).
At the start of the script, add
ob_start();
and just before calling header(), use
ob_clean();
ob_clean();
header('Location: videos.php?id=' . $tubeID . '&awayid=' . $awayid . '&homeid=' . $homeid . '&date=' . $date . '&time=' . $time . '&gameid=' . $gameid . '&play=0');
Edit
Check to ensure that your files are encoded correctly. For instance, if you are encoding in UTF-8, make sure you encode in UTF-8 without BOM (Byte Order Mark).
How to check depends on what you use to edit and save your file. For instance, I use Notepad++ so I just go to the 'Encoding' menu and select 'Encode in UTF-8 without BOM' and then save the file.
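If the editor offers no such menu, the BOM can also be checked for in code. This is a minimal sketch; the function names has_utf8_bom and strip_utf8_bom are hypothetical, and the three bytes EF BB BF are the UTF-8 encoding of the byte order mark.

```php
<?php
// Hypothetical helpers for spotting a UTF-8 byte order mark (bytes EF BB BF).
const UTF8_BOM = "\xEF\xBB\xBF";

function has_utf8_bom(string $bytes): bool
{
    // Compare only the first three bytes with the BOM sequence.
    return strncmp($bytes, UTF8_BOM, 3) === 0;
}

function strip_utf8_bom(string $bytes): string
{
    return has_utf8_bom($bytes) ? substr($bytes, 3) : $bytes;
}

// Example with in-memory strings; for a real file, pass file_get_contents($path).
var_dump(has_utf8_bom(UTF8_BOM . "<?php echo 'hi';"));
var_dump(has_utf8_bom("<?php echo 'hi';"));
```

A BOM in front of the opening <?php tag counts as output, which is why it can keep header() from working.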
function goto_url($url) {
// must not have output anything prior to calling this function
if (!headers_sent()){
header('Location: '.$url); exit;
} else {
echo '<script type="text/javascript">';
echo 'window.location.href="'.$url.'";';
echo '</script>';
echo '<noscript>';
echo '<meta http-equiv="refresh" content="0;url='.$url.'" />';
echo '</noscript>'; exit;
};
};
http://www.w3schools.com/php/func_http_headers_sent.asp
