This question already has answers here:
UTF-8 all the way through
I want to read a .html or .txt file from a folder with PHP. The file is UTF-8 encoded, but if I use $html=file_get_contents('somewhere/somewhat.html'); and then echo $html; the output is not displayed as UTF-8: I see many "�" characters in the text. Any ideas? How can I prevent this?
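file_get_contents() returns the bytes of the file unchanged, so "�" in the output usually means the file is not actually UTF-8 (or the page is not being served as UTF-8). If the file is in another encoding, you need to convert it to UTF-8 yourself, using the PHP functions mb_convert_encoding() and mb_detect_encoding().
Like this: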
$html = file_get_contents('somewhere/somewhat.html');
// Detect the source encoding (strict mode), then convert to UTF-8.
$html = mb_convert_encoding($html, 'UTF-8', mb_detect_encoding($html, 'UTF-8, ISO-8859-1', true));
echo $html;
mb_convert_encoding() converts a string from one character encoding to another.
mb_detect_encoding() detects the character encoding of a string.
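A more defensive variant of the same idea can be sketched as follows. The helper name readAsUtf8() is hypothetical, and the fallback to ISO-8859-1 is an assumption about where the non-UTF-8 files come from; it relies on the mbstring extension.

```php
<?php
// Hypothetical helper (not part of the answer above): return a file's contents as UTF-8.
// Assumption: anything that is not valid UTF-8 is ISO-8859-1.
function readAsUtf8(string $path): string
{
    $raw = file_get_contents($path);
    if ($raw === false) {
        throw new RuntimeException("Cannot read $path");
    }
    // Valid UTF-8 (which includes plain ASCII) is returned untouched.
    if (mb_check_encoding($raw, 'UTF-8')) {
        return $raw;
    }
    // Otherwise convert from the assumed legacy encoding.
    return mb_convert_encoding($raw, 'UTF-8', 'ISO-8859-1');
}
```

If the bytes are already valid UTF-8 and the browser still shows "�", the page is probably not declared as UTF-8; sending header('Content-Type: text/html; charset=utf-8'); before the echo is the usual fix in that case.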
Try to use iconv on your string:
http://php.net/manual/pl/function.iconv.php
Another solution:
http://php.net/manual/en/function.mb-convert-encoding.php
Or, if the input is ISO-8859-1 (note that utf8_encode() is deprecated as of PHP 8.2):
http://php.net/manual/en/function.utf8-encode.php
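For the iconv() route, a minimal sketch might look like this. It assumes the source bytes are ISO-8859-1, which the question does not confirm, and the sample string is made up for illustration.

```php
<?php
// Assumption: the input is ISO-8859-1; adjust the first argument otherwise.
$latin1 = "caf\xE9";                            // "café" as ISO-8859-1 bytes
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);  // iconv(from, to, string)

echo bin2hex($utf8), "\n";                      // prints 636166c3a9
```

The single byte E9 becomes the two-byte UTF-8 sequence C3 A9, which is what a UTF-8 page expects for "é".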
This question already has answers here:
Convert utf8-characters to iso-88591 and back in PHP
How can I write a file in PHP with ISO-8859-1 encoding?
I'm using the function
$file = fopen("file.txt", "a");
When the file is created, it shows up as UTF-8 encoded, and I want it in ISO-8859-1.
UTF-8 supports all the characters found in ISO-8859-1. It's best to stick with UTF-8, as it covers far more characters and will give you the fewest problems.
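If the file really must be ISO-8859-1, keep in mind that fopen() does not assign an encoding: the file simply contains whatever bytes are written to it. A minimal sketch, assuming the text in your script is UTF-8 and using a hypothetical helper name appendLatin1():

```php
<?php
// Hypothetical helper: append UTF-8 text to a file as ISO-8859-1 bytes.
// Assumption: $utf8Text is valid UTF-8.
function appendLatin1(string $path, string $utf8Text): void
{
    // Convert before writing; the file itself carries no encoding label.
    $latin1 = mb_convert_encoding($utf8Text, 'ISO-8859-1', 'UTF-8');
    $file = fopen($path, 'a');
    fwrite($file, $latin1);
    fclose($file);
}
```

Characters that have no ISO-8859-1 equivalent cannot be represented in the output, which is one more reason to prefer UTF-8 where possible.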
This question already has an answer here:
Reference: Why are my "special" Unicode characters encoded weird using json_encode?
Is there any way to make PHP's json_encode output UTF-8 characters instead of Unicode escape sequences?
$arr=array('a'=>'á');
echo json_encode($arr);
mb_internal_encoding('UTF-8'); and $arr=array_map('utf8_encode',$arr); do not fix it.
Result: {"a":"\u00e1"}
Expected result: {"a":"á"}
{"a":"\u00e1"} and {"a":"á"} are different ways to write the same JSON document; the JSON decoder will decode the Unicode escape.
In PHP 5.4+, json_encode has the JSON_UNESCAPED_UNICODE option for plain output. On older PHP versions, you can roll your own JSON encoder that does not escape non-ASCII characters, or use PEAR's JSON encoder and remove lines 349 to 433.
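The difference between the two outputs, and the fact that they decode to the same data, can be seen in a short sketch. The variable names are illustrative only, and the "\u{E1}" string escape requires PHP 7 or later.

```php
<?php
$arr = ['a' => "\u{E1}"];   // "á" as UTF-8 (PHP 7+ escape syntax)

$escaped = json_encode($arr);                          // default: non-ASCII is escaped
$plain   = json_encode($arr, JSON_UNESCAPED_UNICODE);  // PHP 5.4+: raw UTF-8 output

echo $escaped, "\n";   // {"a":"\u00e1"}
echo $plain, "\n";     // {"a":"á"}

// Both forms decode to the same array.
var_dump(json_decode($escaped, true) === json_decode($plain, true));   // bool(true)
```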
I resolved my problem by doing this:
The .php file that contains the function creating the .json file is saved as ANSI.
I use json_encode($array, JSON_UNESCAPED_UNICODE) to encode the data.
The result is a .json file that my editor reports as "ANSI as UTF-8" (presumably UTF-8 without a BOM).
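This function, found here, works fine for me:
function jsonRemoveUnicodeSequences($struct) {
    // preg_replace_callback() replaces the /e modifier, which was removed in PHP 7.
    return preg_replace_callback('/\\\\u([a-f0-9]{4})/', function ($m) {
        return iconv('UCS-4LE', 'UTF-8', pack('V', hexdec($m[1])));
    }, json_encode($struct));
}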
Use JSON_UNESCAPED_UNICODE in json_encode() if your PHP version is >= 5.4.
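Just use this:
utf8_encode($string);
Replace $string with your own value (here, the elements of $arr).
Note that this only helps if the input is ISO-8859-1; it does not change how json_encode() escapes non-ASCII characters.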
This question already has answers here:
Convert ASCII TO UTF-8 Encoding
Is UTF-8 not the same as ASCII? How would you explain the different results I get from:
$result = mb_detect_encoding($PLAINText, mb_detect_order(), true);
Sometimes I get "UTF-8" in $result and sometimes I get "ASCII", so they are different. But that is not my question. My question is: why doesn't this iconv() call convert from ASCII to UTF-8?
$result = iconv("ASCII","UTF-8//IGNORE",$PLAINText);
I check the encoding of $result afterwards with mb_detect_encoding(), and it is still "ASCII", not "UTF-8".
The reason is that a UTF-8 string containing only ASCII characters is byte-for-byte identical to an ASCII string, so the two are indistinguishable. (Unless a byte order mark is used, but that is optional.)
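This can be checked directly. The sketch below uses a made-up ASCII-only sample in place of $PLAINText and assumes the iconv and mbstring extensions are available.

```php
<?php
$plainText = 'Hello, world';   // ASCII-only sample input (illustrative)

$converted = iconv('ASCII', 'UTF-8//IGNORE', $plainText);

// The conversion succeeds, but there is nothing to change: same bytes in, same bytes out.
var_dump($converted === $plainText);                 // bool(true)

// The same bytes are valid in both encodings, so a detector may report either.
var_dump(mb_check_encoding($converted, 'ASCII'));    // bool(true)
var_dump(mb_check_encoding($converted, 'UTF-8'));    // bool(true)
```

So the conversion did happen; the detector simply reports the first encoding in its list that matches.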
This question already has answers here:
Convert ASCII TO UTF-8 Encoding
I tried:
file_put_contents($file_name, utf8_encode($data));
But when I check the file encoding from the shell with the Linux command 'file file_name',
I get: 'file_name: ASCII text'
Does that mean utf8_encode didn't work? If so, what is the right way to convert from ASCII to UTF-8?
If your string doesn't contain any non-ASCII characters, then you likely won't see differences, since UTF-8 is backwards compatible with ASCII. Try writing, for example, the text "1000 さくら" and see what happens.
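A quick way to see the difference is to look for bytes above 0x7F, which is roughly what tools like 'file' key on. The helper hasNonAscii() is hypothetical, and the "\u{...}" escapes (PHP 7+) spell out "さくら" so the sketch does not depend on how the script file itself is saved.

```php
<?php
// Hypothetical helper: true if the string contains any byte outside the ASCII range.
function hasNonAscii(string $bytes): bool
{
    return preg_match('/[^\x00-\x7F]/', $bytes) === 1;
}

$ascii = '1000 sakura';
$mixed = "1000 \u{3055}\u{304F}\u{3089}";   // "1000 さくら" as UTF-8

var_dump(hasNonAscii($ascii));   // bool(false): the file would be reported as ASCII
var_dump(hasNonAscii($mixed));   // bool(true): multi-byte UTF-8 sequences are present
var_dump(strlen($mixed));        // int(14): 5 ASCII bytes + 3 characters of 3 bytes each
```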
Please note that utf8_encode only converts a string encoded in
ISO-8859-1 to UTF-8. A more appropriate name for it would be
"iso88591_to_utf8". If your text is not encoded in ISO-8859-1, you do
not need this function. If your text is already in UTF-8, you do not
need this function. In fact, applying this function to text that is
not encoded in ISO-8859-1 will most likely simply garble that text.
If you need to convert text from any encoding to any other encoding,
look at iconv() instead.
See http://php.net/manual/en/function.utf8-encode.php
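The "garbling" mentioned in the note is easy to reproduce. Because utf8_encode() is deprecated as of PHP 8.2, this sketch uses mb_convert_encoding() with an explicit ISO-8859-1 source, which performs the same conversion; the sample byte is illustrative.

```php
<?php
$latin1 = "\xE9";   // "é" as a single ISO-8859-1 byte

// Correct use: the input really is ISO-8859-1.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

// Incorrect use: the input is already UTF-8, so each of its bytes is
// treated as a separate ISO-8859-1 character and encoded again.
$twice = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8), "\n";    // c3a9     ("é")
echo bin2hex($twice), "\n";   // c383c2a9 (displays as "Ã©")
```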
ASCII is a subset of UTF-8, so if a document is ASCII then it is already UTF-8.
Found at: Convert ASCII TO UTF-8 Encoding
Try this:
$data = mb_convert_encoding($data, 'UTF-8', 'ASCII');
file_put_contents($file_name, $data);
Or use a stream filter to convert a file's encoding while copying it:
$fd = fopen($file, 'r');
// Filter name format is convert.iconv.<from>/<to>; this converts ASCII to UTF-8.
stream_filter_append($fd, 'convert.iconv.ASCII/UTF-8');
stream_copy_to_stream($fd, fopen($output, 'w'));
Reference: How to write file in UTF-8 format?
This question already has answers here:
Detect encoding and make everything UTF-8
I have a legacy database table with a mixed encoding. Some lines are UTF-8 and some lines are ISO 8859-1.
Are there some heuristics I can apply on the content of a line to guess which encoding best represents the content?
Convert from UTF-8. If that fails then it's not UTF-8, so you should probably convert from Latin-1 instead.
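That heuristic can be sketched as a small function. The name guessEncoding() is hypothetical, and the function assumes the only two candidates are UTF-8 and ISO-8859-1, as in the question.

```php
<?php
// Hypothetical helper: guess whether a row is UTF-8 or ISO-8859-1.
// Assumption: every row is in one of these two encodings.
function guessEncoding(string $text): string
{
    // Non-ASCII ISO-8859-1 text is very rarely valid UTF-8 by accident,
    // so a successful UTF-8 check is strong evidence for UTF-8.
    if (mb_check_encoding($text, 'UTF-8')) {
        return 'UTF-8';
    }
    return 'ISO-8859-1';
}

var_dump(guessEncoding("caf\xC3\xA9"));   // string(5) "UTF-8"
var_dump(guessEncoding("caf\xE9"));       // string(10) "ISO-8859-1"
```

Pure ASCII rows are reported as UTF-8, which is harmless because the bytes are the same in both encodings.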
Compare
iconv("UTF-8", "ISO-8859-1//IGNORE", $text)
and
iconv("UTF-8", "ISO-8859-1", $text)
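If they are equal, treat the text as UTF-8. If they differ (or the second call returns false), the text contains bytes that are not valid UTF-8 or characters that ISO-8859-1 cannot represent; in a table that holds only these two encodings, that most likely means the line is ISO-8859-1. Note that the behaviour of //IGNORE varies between iconv implementations.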