I have been trying to print some characters on my PHP page, but it returns something else, such as corrupted characters.
For example, with
"joão", ø, ç, etc.
echo '<br>não<br>';
the return is
não
instead of:
não
Is this a problem with UTF-8 encoding?
I have tried this code:
header('Content-Type: text/html, charset=utf-8');
but with no result.
You have a typo in your Content-Type header. It's ...; charset=... with a semicolon.
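A minimal sketch of the corrected header, in case it helps. The helper name contentTypeHeader is made up for this example and is not part of PHP; it only makes sure the media type and the charset parameter are joined with "; ".

```php
<?php
// Hypothetical helper (not part of PHP): builds the header line so that
// the media type and the charset parameter are joined by "; ".
function contentTypeHeader(string $mediaType, string $charset): string
{
    return sprintf('Content-Type: %s; charset=%s', $mediaType, $charset);
}

// header() has to be called before any output is sent.
header(contentTypeHeader('text/html', 'utf-8'));

// "não" written with a Unicode escape, so the example does not depend on
// the encoding this file happens to be saved in.
echo "<br>n\u{00E3}o<br>\n";
```

The output is only displayed correctly if the bytes you send really are UTF-8, which is what the other answers are about.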
Note: your header, header('Content-type: text/html, charset=utf-8');, uses a comma.
The correct form is header('Content-type: text/html; charset=utf-8');
Your document (the PHP file) should be saved as UTF-8 (without BOM); otherwise, use iso-8859-1 instead of utf-8 in the header.
To save the document as UTF-8 without BOM, use Notepad++ (select "Convert to UTF-8 without BOM"),
or use:
header('Content-Type: text/html; charset=iso-8859-1');
Tip (if you are using a database):
If your database is UTF-8, then:
convert the PHP file to UTF-8 without BOM
set the header to utf-8: header('Content-type: text/html; charset=utf-8');
If your database is latin1, then:
convert the PHP file to ANSI
set the header to iso-8859-1: header('Content-type: text/html; charset=iso-8859-1');
Check that your PHP source file is encoded in UTF-8.
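One way to check this from PHP itself, as a rough sketch. It assumes the mbstring extension is loaded; isValidUtf8 is a name invented for this example.

```php
<?php
// Hypothetical helper: true when the given bytes form valid UTF-8.
// Assumes the mbstring extension is available.
function isValidUtf8(string $bytes): bool
{
    return mb_check_encoding($bytes, 'UTF-8');
}

$utf8   = "n\xC3\xA3o"; // "não" saved as UTF-8 (ã is two bytes)
$latin1 = "n\xE3o";     // "não" saved as ISO-8859-1 (ã is one byte)

var_dump(isValidUtf8($utf8));   // bool(true)
var_dump(isValidUtf8($latin1)); // bool(false)
```

To test a real source file, pass it the result of file_get_contents(__FILE__). A pure ASCII file also passes, since ASCII is a subset of UTF-8.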
try this
header('Content-Type: text/html; charset=iso-8859-1');
The encoding of your php file itself may be wrong. You might want to change it to utf-8.
However, it is not good practice to include special characters directly in your code. Instead, use the HTML character entity equivalents to produce valid output.
Example
echo 'n&atilde;o';
A comprehensive list of HTML character codes can be found here
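If you would rather not look the entities up by hand, PHP can produce them. This is only a sketch: it assumes the input string is UTF-8, and the variable names are made up for the example.

```php
<?php
// "não" as UTF-8, written with a Unicode escape so this file can stay ASCII.
$text = "n\u{00E3}o";

// Convert every character that has a named HTML 4.01 entity.
// The third argument tells htmlentities() how the input is encoded.
$escaped = htmlentities($text, ENT_QUOTES | ENT_HTML401, 'UTF-8');

echo $escaped, "\n"; // n&atilde;o

// html_entity_decode() reverses the conversion.
$roundTrip = html_entity_decode($escaped, ENT_QUOTES | ENT_HTML401, 'UTF-8');
```

Entities sidestep the charset question for the characters they cover, but the page still needs a correct Content-Type header for everything else.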
Related
I need to get the content of a remote file in utf-8 encoding. The file is in utf-8. When I display that file on screen, it has the proper encoding:
http://www.parfumeriafox.sk/source_file.html
(notice the ň and č characters, for example, these are alright).
When I run this code:
<?php
$url = 'http://parfumeriafox.sk/source_file.html';
$csv = file_get_contents_utf8($url);
header('Content-type: text/html; charset=utf-8');
print $csv;
function file_get_contents_utf8($fn) {
$content = file_get_contents($fn);
return mb_convert_encoding($content, 'utf-8');
}
(you can run it using http://www.parfumeriafox.sk/encoding.php), then I get question marks instead of those special characters. I have done a lot of research on this: I have tried the standard file_get_contents function, I have used a PHP stream context, and I have also tried fopen and fread to read the file at the binary level. Nothing seems to work. I have tried with and without sending the header. This is supposed to be perfectly simple, so what am I doing wrong? When I check the string with an encoding detection function, it returns UTF-8.
You can see which character set your browser decided the document was by opening the developer console and looking at document.characterSet:
> document.characterSet
"windows-1250"
With this knowledge we can ask iconv to convert from "windows-1250" to utf-8 for us:
<?php
$text = file_get_contents("source_file.csv");
$text = iconv("windows-1250", "utf-8", $text);
print($text);
The output is valid utf-8, and levanduľa is displayed correctly as well.
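The same conversion can be tried without the remote file. This is a small sketch that feeds iconv a hand-built Windows-1250 string; the wrapper name windows1250ToUtf8 is hypothetical, and it assumes your iconv build knows the WINDOWS-1250 charset name.

```php
<?php
// Hypothetical wrapper around iconv() that fails loudly instead of
// returning false.
function windows1250ToUtf8(string $bytes): string
{
    $converted = iconv('WINDOWS-1250', 'UTF-8', $bytes);
    if ($converted === false) {
        throw new RuntimeException('iconv could not convert the input');
    }
    return $converted;
}

// "levandu" + byte 0xBE + "a": in Windows-1250, 0xBE is the letter "ľ".
$raw  = 'levandu' . "\xBE" . 'a';
$utf8 = windows1250ToUtf8($raw);

// In UTF-8 the same letter takes two bytes, C4 BE.
echo $utf8, "\n";
```

Send the result with header('Content-Type: text/html; charset=utf-8'); so the browser decodes it as UTF-8.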
How about this one????
For this one I used header('Content-Type: text/plain; charset=Windows-1250');
bergamot, citrón, tráva, rebarbora, bazalka;levanduľa, škorica, hruška;céderové drevo, vanilka, pižmo, amberlyn
This code works for me
<?php
header('Content-Type: text/plain;charset=Windows-1250');
echo file_get_contents('http://www.parfumeriafox.sk/source_file.html');
?>
The problem is not with file_get_contents()
I saved $data to a file and the characters were correct, but my text editor still did not decode them correctly.
$data = file_get_contents('http://www.parfumeriafox.sk/source_file.html');
file_put_contents('doc.txt',$data);
UPDATE
There seems to be one problematic character.
It renders as ¾
Its hex value is 0xBE (190 decimal)
I tried these two character sets. Neither worked.
header('Content-Type: text/plain; charset=ISO 8859-1');
header('Content-Type: text/plain; charset=ISO 8859-2');
END OF UPDATE
It works by adding a header WITHOUT charset=utf-8.
These two headers work
header('Content-Type: text/plain');
header('Content-Type: text/html');
These two headers do NOT work
header('Content-Type: text/plain; charset=utf-8');
header('Content-Type: text/html; charset=utf-8');
This code is tested and displayed all characters.
<?php
header('Content-Type: text/plain');
echo file_get_contents('http://www.parfumeriafox.sk/source_file.html');
?>
<?php
header('Content-Type: text/html');
echo file_get_contents('http://www.parfumeriafox.sk/source_file.html');
?>
These are some of the problematic characters with their Hex values.
This is the saved file viewed in Notepad++ with UTF-8 Encoding.
Check the Hex values against these character sets.
From the table of character sets, I saw that the character set was Latin 2.
On the Wikipedia page for Windows code pages, I found that the Windows Latin 2 code page is Windows-1250.
bergamot, citrón, tráva, rebarbora, bazalka;levanduľa, škorica, hruška;céderové drevo, vanilka, pižmo, amberlyn
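The check of hex values against character sets can also be done in code. The sketch below decodes the single byte 0xBE under three candidate charsets; decodeByteAs is a name invented for this example, and the charset names are assumed to be supported by your iconv build.

```php
<?php
// Hypothetical helper: decode the same byte(s) under several candidate
// charsets and return the UTF-8 result for each (null if iconv refuses).
function decodeByteAs(string $byte, array $charsets): array
{
    $results = [];
    foreach ($charsets as $charset) {
        $decoded = @iconv($charset, 'UTF-8', $byte);
        $results[$charset] = ($decoded === false) ? null : $decoded;
    }
    return $results;
}

$candidates = ['ISO-8859-1', 'ISO-8859-2', 'WINDOWS-1250'];
$decodings  = decodeByteAs("\xBE", $candidates);

foreach ($decodings as $charset => $char) {
    printf("%-12s 0xBE => %s\n", $charset, $char ?? '(not convertible)');
}
```

Under ISO-8859-1 the byte is "¾", under ISO-8859-2 it is "ž", and only under Windows-1250 is it "ľ", which is consistent with the two ISO headers not helping.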
I have these Chinese characters:
汉字/漢字''test
If I do
echo utf8_encode($chinesevar);
it displays
??/??''test
Or even if I just do a simple
echo $chinesevar
it still displays some weird characters...
So how am I going to display these Chinese characters without using the <meta> tag with the UTF-8 thingy .. or the ini_set UTF-8 thing or even the header() thing with UTF-8?
Simple:
save your source code in UTF-8
output an HTTP header to specify to your browser that it should interpret the page using UTF-8:
header('Content-Type: text/html; charset=utf-8');
Done.
utf8_encode is for converting Latin-1 encoded strings to UTF-8. You don't need it.
For more details, see Handling Unicode Front To Back In A Web App.
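To illustrate why utf8_encode does not help with text that is already UTF-8, here is a small sketch. It uses mb_convert_encoding from ISO-8859-1 to UTF-8, which is the same conversion utf8_encode performs, because utf8_encode itself is deprecated as of PHP 8.2. It assumes the mbstring extension is loaded; the variable names are made up.

```php
<?php
// "汉字" as UTF-8: two characters, three bytes each.
$utf8 = "\u{6C49}\u{5B57}";

// Treat those UTF-8 bytes as if they were Latin-1 and convert them to
// UTF-8 again. Every byte >= 0x80 becomes a two-byte sequence.
$doubleEncoded = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo strlen($utf8), " bytes before, ", strlen($doubleEncoded), " bytes after\n";

// Converting back the other way recovers the original bytes.
$restored = mb_convert_encoding($doubleEncoded, 'ISO-8859-1', 'UTF-8');
```

The result is still valid UTF-8, but it now describes Latin-1 characters instead of the Chinese ones, so the browser cannot show the original text.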
Check that your file is saved as UTF-8 without BOM and that your web server delivers your site in UTF-8
HTML:
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
in PHP:
header('Content-Type: text/html; charset=utf-8');
And if you read the text from a database, check that your database is in UTF-8 as well.
$chinesevarOK = mb_convert_encoding($chinesevar, 'HTML-ENTITIES', 'UTF-8');
Perhaps take a look at the following solutions:
Your database, table and field COLLATE should be utf8_unicode_ci
Check if your records are showing the correct characters within the database...
Set your html to utf8
Add the following line to your php after connecting to the database
mysqli_set_charset($con,"utf8");
http://www.w3schools.com/php/func_mysqli_set_charset.asp
save your source code in UTF-8 No BOM
While I am using UTF-8 encoding in the PHP file, I am getting some weird types of characters
Say like this:
"conteúdo está"
How can I display it properly?
The data are being taken from a CSV file which is encoded as UTF-8 and not plain ANSI
Thanks in advance.
header('Content-type: text/html; charset=utf-8');
I set up both PHP 5 and Apache to use UTF-8 encoding.
I tried to show in my browser the result of this PHP code:
echo "Trying to visualize the letter ü";
and it shows me this result:
Trying to visualize the letter �
Why?
Try this :
<?php
header('Content-Type: text/plain; charset=utf-8');
echo "Trying to visualize the letter ü";
If this doesn't work, then your file is in a different encoding than UTF-8.
What's different between UTF-8 and UTF-8 without BOM?
Change File Encoding to utf-8 via vim in a script
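If you want to check for a BOM from PHP rather than in an editor, something like this should do. The constant and function names are invented for this sketch; the only fixed fact is that the UTF-8 BOM is the three bytes EF BB BF.

```php
<?php
// The UTF-8 byte order mark is the three bytes EF BB BF.
const UTF8_BOM = "\xEF\xBB\xBF";

// Hypothetical helper: true when the bytes start with a UTF-8 BOM.
function hasUtf8Bom(string $bytes): bool
{
    return strncmp($bytes, UTF8_BOM, 3) === 0;
}

// Hypothetical helper: returns the bytes without a leading BOM, if any.
function stripUtf8Bom(string $bytes): string
{
    return hasUtf8Bom($bytes) ? substr($bytes, 3) : $bytes;
}

$withBom = UTF8_BOM . "Trying to visualize the letter \u{00FC}";
var_dump(hasUtf8Bom($withBom));               // bool(true)
var_dump(hasUtf8Bom(stripUtf8Bom($withBom))); // bool(false)
```

A BOM in front of the opening PHP tag counts as output, so it can also prevent a later header() call from taking effect.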
Make sure you set your document's "Content-Type" HTTP header, and set the charset to the encoding that you're using:
header("Content-Type: text/html; charset=utf-8");