When I output the text £3.99 per M² from an XML file, the browser displays
it as Â£3.99 per MÂ². The XML file is in UTF-8 format. I wonder how to fix
this.
Make sure you're outputting UTF-8. That conversion sounds like your source is UTF-8, yet you're telling the browser to expect something else (Latin1?). You should send a header indicating to the browser UTF-8 is coming up, and you should have the correct meta header:
<?php
header ('Content-type: text/html; charset=utf-8');
?>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<?php echo "£3.99 per M²"; ?>
</body>
</html>
This should work correctly.
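To see why a stray Â shows up in front of £, it helps to look at the bytes. The following is a minimal sketch, assuming the mbstring extension is available; the variable names are only for illustration. It simulates a reader that takes UTF-8 bytes for ISO-8859-1:

```php
<?php
// "£" encoded as UTF-8 is two bytes: 0xC2 0xA3.
$pound = "\xC2\xA3";

// Simulate a reader that assumes ISO-8859-1: every byte is taken as one
// character, and the result is re-encoded as UTF-8 so it can be shown.
$misread = mb_convert_encoding($pound, 'UTF-8', 'ISO-8859-1');

echo bin2hex($pound), "\n";   // c2a3
echo bin2hex($misread), "\n"; // c382c2a3, which renders as "Â£"
```

The byte 0xC2 is Â in ISO-8859-1 and 0xA3 is £, which is exactly the pattern in the broken output.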
You should encode HTML entities. You could try:
htmlentities($str, ENT_QUOTES, "UTF-8");
Look here for a complete reference
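If you still have problems, you sometimes also have to decode the string with utf8_decode() first (note that utf8_decode() is deprecated as of PHP 8.2). The result is ISO-8859-1, so tell htmlentities() that as well:
$str = utf8_decode($str);
$str = htmlentities($str, ENT_QUOTES, "ISO-8859-1");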
There are a lot of topics about diacritics/accents in PHP, but none of them solved my problem.
I have this code:
<!DOCTYPE html>
<html lang="sk">
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
</head>
<body>
<?php
$items = scandir("test/");
echo $items[3];
?>
</body>
</html>
$items[3] is ľšá.png but it displays: ğšá.png
I tried:
foreach(mb_list_encodings() as $chr){
echo mb_convert_encoding($items[3], 'UTF-8', $chr) ." : ".$chr."<br>";
}
But none of them is right for me.
I also tried to put this before scandir():
mb_internal_encoding('UTF-8');
mb_http_output('UTF-8');
ini_set('default_charset', 'utf-8');
But no change.
It is very strange, because my website had always worked until I noticed the issue today, and I did not change any code.
You tried to convert from single-byte encodings to UTF-8 (a multi-byte encoding), but the wrong file name you see shows two characters where one is expected, so it is already UTF-8!
You need to convert it from UTF-8, and for me it worked like this:
mb_convert_encoding($items[3], "ISO-8859-15", 'UTF-8'); // to ISO-8859-15 from UTF-8
Personally I use iconv:
echo iconv("UTF-8", "ISO-8859-15", $items[3]); // from UTF-8 to ISO-8859-15
but I think it makes no big difference, as long as either of them actually works.
I also suggest checking the file names on your web server, in case they were accidentally converted when uploaded.
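Before converting in either direction, it can help to check which bytes the file name actually holds. A minimal sketch, assuming the mbstring extension is loaded; `describeName()` is a hypothetical helper and the two sample names are made up for illustration:

```php
<?php
// Hypothetical helper: report the raw bytes of a name and whether they
// form valid UTF-8.
function describeName(string $name): array
{
    return [
        'hex'        => bin2hex($name),
        'valid_utf8' => mb_check_encoding($name, 'UTF-8'),
    ];
}

// "ľšá.png" written as UTF-8 (two bytes per accented letter).
$utf8Name = "\u{013E}\u{0161}\u{00E1}.png";

// The same name in a single-byte encoding (ISO-8859-2 bytes here).
$legacyName = "\xB5\xB9\xE1.png";

var_dump(describeName($utf8Name));
var_dump(describeName($legacyName));
```

If `valid_utf8` is true, the name is most likely UTF-8 already and the remaining problem is how the page is declared or displayed, not the name itself.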
echo "£$price";
Displays:
Â£35
How can I get rid of that unwanted char?
First you have to convert that character to an html entity.
echo htmlentities('£').$price;
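htmlentities() has to know how its input is encoded, so it is safer to pass the encoding explicitly. A small sketch, with a made-up `$price` value and the pound sign written as a Unicode escape (PHP 7+) so that it does not depend on how the file was saved:

```php
<?php
$price = 35; // hypothetical value for illustration

// "\u{00A3}" is "£" as UTF-8, regardless of the encoding of this source file.
$html = htmlentities("\u{00A3}", ENT_QUOTES, 'UTF-8') . $price;

echo $html, "\n"; // &pound;35
```

Because the output is then pure ASCII, it displays the same way whichever charset the browser assumes.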
Sounds like you don't serve your content as utf-8.
Do this by setting the correct header :
header('Content-type: text/html; charset=utf-8');
In addition, to be really sure the browser understands, add this HTML meta tag to your page:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Or instead of £ you can always write the entity in your HTML output:
&pound;
Your editor is probably set to save your files in UTF-8, which is fine.
The browser may use a different encoding by default. You can hint the browser by using this HTML5 meta tag:
<meta charset="utf-8">
Or the older non-HTML5 equivalent:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
You can also specify the charset as a HTTP Header, using header():
header("Content-Type: text/html; charset=utf-8");
It also doesn't hurt to read Joel on charsets.
My PHP script parses a web site and pulls out an HTML DIV that looks like this (and saves it as a string)
<div id="merchantinfo">The following merchants: Nautica®, Brookstone®, Teds® ©2012 Blabla</div>
I store this as $merchantList (string).
However, when I output the data to the webpage
echo $merchantList;
The encoding gets messed up and displays as:
NauticaÂ®, BrookstoneÂ®, TedsÂ® Â©2012 Blabla
I tried adding the following to the display page:
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
</head>
But that didn't do anything. --Thanks
EDIT:: ------------
For the question, the accepted answer is correct.
But I realized my actual issue was slightly different.
The initial parsing using DOMDocument::loadHTML had already mangled the UTF-8 encoding, causing the string to save as
<div id="merchantinfo">The following merchants: Nauticaî, Brookstoneî, Tedsî ©2012 Blabla</div>
This was solved by:
$html = mb_convert_encoding($html, 'HTML-ENTITIES', "UTF-8");
$dom->loadHTML($html);
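Note that the 'HTML-ENTITIES' pseudo-encoding of mb_convert_encoding() is deprecated as of PHP 8.2. One possible alternative, which is not what the poster used, is to turn the non-ASCII characters into numeric entities with mb_encode_numericentity() before loading. A minimal sketch, assuming the dom and mbstring extensions and a shortened sample string:

```php
<?php
// UTF-8 input, shortened from the example above.
$html = "<div id=\"merchantinfo\">Nautica\u{00AE} \u{00A9}2012</div>";

// Replace every non-ASCII character with a numeric entity so that
// loadHTML() cannot misread the bytes as ISO-8859-1.
$map = [0x80, 0x10FFFF, 0, 0x1FFFFF]; // [first, last, offset, mask]
$escaped = mb_encode_numericentity($html, $map, 'UTF-8');

$dom = new DOMDocument();
$dom->loadHTML($escaped);

// DOM strings come back as UTF-8.
$text = $dom->getElementsByTagName('div')->item(0)->textContent;
echo $text, "\n";
```

The markup itself stays untouched because all of its characters are ASCII; only characters from U+0080 upwards are rewritten.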
Use:
ini_set('default_charset', 'UTF-8');
And do not use iso-8859-1. Use UTF-8.
From the mojibake you posted the input string is utf-8, not iso-8859-1.
You just need to use the htmlspecialchars_decode() function, for example:
$string = '&quot;hello dude&quot;';
$decodechars = htmlspecialchars_decode($string);
echo $decodechars; // output: "hello dude"
There is a MySQL database containing data with accented letters like é. I want to display it in my PHP page, but unrecognized characters are displayed because of the accents. Is there a function to convert accented letters to HTML entities, so that for example é is converted to &eacute;?
Rather than using htmlentities you should use the unicode charset in your files, e.g.
<?php
header('Content-Type: text/html; charset="utf-8"');
To be on the safe side, you can add the following meta tag to your html files:
<html>
<head>
<meta charset="utf-8" />
Then, make sure that your data base connection uses utf8:
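mysql_connect(...);
mysql_select_db(...);
// MySQL's charset name has no hyphen. The mysql_* functions were removed in PHP 7;
// with mysqli the equivalent call is mysqli_set_charset($link, 'utf8mb4').
mysql_set_charset('utf8');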
Then, all browsers should display the special characters correctly.
The advantage is that you can easily use unicode characters everywhere in your php files - for example the copyright sign (©) or a dash (–) - given that your php files are encoded in utf-8, too.
Try htmlspecialchars() and/or htmlentities()
You can easily make one yourself with str_replace(), which takes plain strings rather than regular expressions:
function txtFormat($input){
    $output = str_replace('à','&agrave;',$input);
    $output = str_replace('è','&egrave;',$output);
    $output = str_replace('é','&eacute;',$output);
    $output = str_replace('ì','&igrave;',$output);
    $output = str_replace('ò','&ograve;',$output);
    $output = str_replace('ù','&ugrave;',$output);
    return $output;
}
Use the following code
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
And don't encode your data with utf8_encode() before inserting it into the database.
Hope this solves your problem :)
I'm working on an IMAP email script and I have some lines encoded in GB2312 (which I assume is a Chinese encoding) that look like this: =?GB2312?B?foobarbazetc
How can I start working with this string? I checked mb_list_encodings() and this one is not listed.
If you have the base64-decoded data, then use mbstring or iconv. If you have the raw header, then mbstring.
<?php
$t = "\xc4\xe3\xba\xc3\n";
echo iconv('GB2312', 'UTF-8', $t);
echo mb_convert_encoding($t, 'UTF-8', 'GB2312');
mb_internal_encoding('UTF-8');
echo mb_decode_mimeheader("=?gb2312?b?xOO6ww==?=");
?>
Ignacio solved the meat of the problem with mb_decode_mimeheader() but for future reference these links are also helpful:
http://developer.loftdigital.com/blog/php-utf-8-cheatsheet
http://www.herongyang.com/PHP-Chinese/PHP-UTF-8-Chinese-String-Literals.html
The specific header string I was working with:
$subject = "=?GB2312?B?tPC4tDogUXVvdGF0aW9uIFBJSSBwcm9kdWN0cyA=?= =?GB2312?B?Rk9CIFNoYW5naGFpIG9yIE5pbmdibyBwb3J0?=";
This required a page header of
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
and PHP
mb_internal_encoding('utf-8');
echo mb_decode_mimeheader($subject)."<br />";
to output
主题: Quotation PII products FOB Shanghai or Ningbo port
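To make it clearer what mb_decode_mimeheader() does with such a header, here is a rough sketch that decodes a single base64 encoded-word by hand. `decodeEncodedWord()` is a hypothetical helper, not part of PHP. It assumes the mbstring extension, handles only the "B" (base64) form, and ignores "Q" encoding and multi-word headers:

```php
<?php
// Hypothetical helper: decode one encoded-word of the form
// =?charset?B?base64data?= into UTF-8. Returns null if the input does not
// have that shape or the payload is not valid base64.
function decodeEncodedWord(string $word): ?string
{
    if (!preg_match('/^=\?([^?]+)\?B\?([^?]*)\?=$/i', $word, $m)) {
        return null;
    }

    $bytes = base64_decode($m[2], true);
    if ($bytes === false) {
        return null;
    }

    // $m[1] is the charset named in the header, e.g. "gb2312".
    return mb_convert_encoding($bytes, 'UTF-8', $m[1]);
}

$decoded = decodeEncodedWord('=?gb2312?b?xOO6ww==?=');
echo bin2hex($decoded), "\n"; // e4bda0e5a5bd
```

For real mail headers the built-in mb_decode_mimeheader() shown above is the better choice; the sketch only illustrates the two steps involved, base64 decoding followed by charset conversion.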