UTF-8 characters don't display correctly - php

This is my PHP code:
<?php
$result = '';
$str = 'Тугайный соловей';
for ($y=0; $y < strlen($str); $y++) {
$tmp = mb_substr($str, $y, 1);
$result = $result . $tmp;
}
echo 'result = ' . $result;
The output is:
Тугайный Ñоловей
What can I do? I have to put $result into a MySQL database.

What's the encoding of your file? It should be UTF8 too. What's the default charset of your http server? It should be UTF-8 as well.
Encoding only works if:
the file is encoded correctly
the server tells what's the encoding of the delivered file.
When working with databases, you also have to set the right encoding for your DB fields and for the connection between the MySQL client and the server (see mysql_set_charset(), or mysqli_set_charset() if you use the mysqli extension). Setting the fields alone is not enough, because your MySQL client (in this case, PHP) could default to ISO-8859-1 and reinterpret the data. You then end up with UTF-8 DB -> ISO client -> injected into UTF-8 PHP script. No wonder it's messed up at the end :-)
How to serve the file with the right charset?
header('Content-type: text/html; charset=utf-8') is one solution
.htaccess file containing AddDefaultCharset UTF-8 is another one
HTML meta content-type might work too but it's always better to send this information using HTTP headers.
PS: you also have to use mb_strlen(), because strlen() counts bytes, so on UTF-8 strings it reports more than the real number of characters as soon as the text contains non-ASCII characters.
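To illustrate the strlen() versus mb_strlen() point, here is a minimal sketch of the loop from the question. It assumes the mbstring extension is loaded and the source file is saved as UTF-8. The helper name copyByCharacter is made up for illustration, and passing 'UTF-8' explicitly to the mb_* calls is a choice that avoids relying on the configured internal encoding.

```php
<?php
// Hypothetical helper: rebuilds a string one character at a time.
function copyByCharacter(string $str): string
{
    $result = '';
    // mb_strlen() counts characters; strlen() would count bytes.
    $length = mb_strlen($str, 'UTF-8');
    for ($i = 0; $i < $length; $i++) {
        // The explicit encoding keeps mb_substr() from cutting by bytes.
        $result .= mb_substr($str, $i, 1, 'UTF-8');
    }
    return $result;
}

$str = 'Тугайный соловей';
echo strlen($str), "\n";             // number of bytes
echo mb_strlen($str, 'UTF-8'), "\n"; // number of characters
echo 'result = ' . copyByCharacter($str), "\n";
```

The two counts differ because each Cyrillic letter takes two bytes in UTF-8, while the space takes one.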

If you're going to send a mix of data and don't want to specify utf-8 using a php header, you can add this html to your page:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

I suppose your code is in windows-1251 encoding, since it is Russian :)
Convert your string to UTF-8:
$str = iconv('windows-1251', 'utf-8', $str);

If your database is UTF-8, that's fine for MySQL.
For your echo, if you do it on a web site, put this at the top of the page:
header('Content-Type: text/html; charset=UTF-8');

Just add this line at the beginning, after the connection with server:
mysqli_set_charset($conn,"utf8");

try this:
header('Content-Type: text/html; charset=UTF-8');
header("Content-type: application/octetstream");
header("Pragma: no-cache");
header("Expires: 0");
//print "$name_field\n$data";
// this code fixed it
print chr(255) . chr(254) . mb_convert_encoding("$name_field\n$data", 'UTF-16LE', 'UTF-8');

if you are just using PHP echo with no HTML headers etc., this worked great for me.
$connect = mysqli_connect($host_name, $user_name, $password, $database);
mysqli_set_charset($connect,"utf8");

Related

PHP read a line from a csv file return wrong in charset

I have a CSV file. If I set the charset to ISO-8859-2 (Eastern Europe) in LibreOffice Calc, then it renders the characters correctly, but the server's locale is set to EN-UK and I cannot read the characters correctly in PHP.
For example, it returns T�t instead of Tót.
I tried many things like:
echo (mb_detect_encoding("T�t","ISO-8859-2","UTF-8"));
I know probably the char does not exist in UTF-8 but I tried.
Also tried to setup the correct charset in the header:
header('Content-Type: text/html; charset=iso-8859-2');
echo "T�th";
but it returns TÄĹźËth instead of Tóth.
Please help me solve this, thanks in advance
I advise against setting the header to charset=iso-8859-2. It is usual to work with UTF-8. If the data is available in a different encoding, it should be converted to UTF-8 and then processed as CSV. The following example code can be kept this simple because the newline characters are the same in UTF-8 and ISO-8859-2.
$fileName = "yourpath/Iso8859_2.csv";
$fp = fopen($fileName,"r");
while($row = fgets($fp)){
$strUtf8 = mb_convert_encoding($row,'UTF-8','ISO-8859-2');
$arr = str_getcsv($strUtf8);
var_dump($arr);
}
fclose($fp);
The exact encoding of the CSV file must be known. mb_detect_encoding is not suitable for determining the encoding of a file.
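The warning about mb_detect_encoding can be illustrated with a short sketch: the same bytes are valid in more than one single-byte encoding, so a validity check alone cannot say which one the file was written in. The sample bytes are an assumption chosen for illustration (the name Erdős as it would be stored in ISO-8859-2), and the mbstring extension is assumed to be loaded.

```php
<?php
// "Erdős" as stored in ISO-8859-2: ő is the single byte 0xF5.
$bytes = "Erd\xF5s";

// 0xF5 can never appear in valid UTF-8, so this check fails.
var_dump(mb_check_encoding($bytes, 'UTF-8'));

// The bytes are valid in both of these encodings,
// so validity cannot tell them apart.
var_dump(mb_check_encoding($bytes, 'ISO-8859-1'));
var_dump(mb_check_encoding($bytes, 'ISO-8859-2'));

// Only the conversion with the real source encoding gives the right text.
$asLatin1 = mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1'); // wrong guess
$asLatin2 = mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-2'); // known encoding
echo $asLatin1, "\n";
echo $asLatin2, "\n";
```

Both conversions succeed without any error, which is why the source encoding has to be known rather than guessed.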

ftp_get doesn't work with accented characters [duplicate]

I will be brief. My FTP function returns wrong encoding of filenames
$conn_id = ftp_connect("site.com");
ftp_login($conn_id, "login", "pass");
ftp_pasv($conn_id, true);
$buff = ftp_nlist($conn_id, "./");
print_r($buff);
-> // result
array() {
[0]=> "��.txt"
}
The file name has Windows-1251 encoding.
I tried to connect to FTP via nodejs but it also returns something creepy — òð.txt.
My desktop client (WinSCP) however works fine with this.
PS: I tried to use utf8_encode - but that's also not working for me.
If the encoding is off, you could try to change it using mb_convert_encoding. The code below should output the correct value.
<?php
echo mb_convert_encoding($buff[0], "UTF-8");
//or
echo mb_convert_encoding($buff[0], "UTF-8", "windows-1251");
?>
If it doesn't work, you can try to find the right encoding using something like
<?php
foreach(mb_list_encodings() as $chr){
echo mb_convert_encoding($buff[0], 'UTF-8', $chr)." : ".$chr."<br>";
}
?>
Many (but not all) FTP servers support UTF-8 encoding for pathnames. You can turn this feature on by issuing the 'OPTS UTF8 ON' command before the ftp_nlist call.
ftp_raw($conn_id, 'OPTS UTF8 ON');
First you add content type on your page.
header('Content-Type: text/html; charset=utf-8');
And then try this, hope it helps
str_replace(array('%82','%94','+'),array('é','ö',' '),urlencode($folder_name));
It's not the best way, but it works for me. If you URL-encode a string, the awkward characters turn into sequences such as %82, which you can then replace with the HTML codes.
You can try using the iconv function. Hopefully it will solve your problem.
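For the iconv suggestion, a minimal sketch could look like the following. It assumes the iconv extension is available and that the server really sends Windows-1251 names, as the question states. The helper name namesToUtf8 is hypothetical, and the sample bytes stand in for a real ftp_nlist() result; they are inferred from the òð.txt rendering in the question.

```php
<?php
// Hypothetical helper: converts every name in a listing to UTF-8.
function namesToUtf8(array $names, string $sourceEncoding): array
{
    return array_map(
        static function (string $name) use ($sourceEncoding): string {
            $converted = iconv($sourceEncoding, 'UTF-8', $name);
            // iconv() returns false on failure; keep the raw name in that case.
            return $converted === false ? $name : $converted;
        },
        $names
    );
}

// Stand-in for ftp_nlist() output: bytes 0xF2 0xF0 followed by ".txt".
$buff = ["\xF2\xF0.txt"];
print_r(namesToUtf8($buff, 'Windows-1251'));
```

When a name has to be sent back to the server (for example to ftp_get), it would presumably need the reverse conversion from UTF-8 to Windows-1251.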

Display latin character in html

I have a PHP object displaying values as below when I use var_dump($obj):
object() (1) { ["name"]=> string(3) "Lê" ... }
But when I print $obj->name, the browser displays "Lê" instead.
My browser is displaying UTF-8.
HTML charset is also set to utf-8.
I tried with some functions but I didn't solve this.
could you please help me to solve this issue? Thanks.
EDIT:
I have already checked all of the items below:
header('Content-Type: text/html; charset=utf-8');
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
My browser is displaying UTF-8
The $obj comes from a DB table whose character set is utf-8 and whose collation is utf8_general_ci
PHP file is encoded to UTF-8
Set a UTF-8 header: header('Content-Type: text/html; charset=utf-8');
I've just tested this on PHP5.4.8, nginx, Ubuntu 12.04 and Firefox. AFAIK, you'll get the same results in pretty much any PHP stack from at least the past 5 years.
<?php
$mystring = 'Lê';
print $mystring;
output:
Lê
<?php
header('Content-Type: text/html; charset=utf-8');
$mystring = 'Lê';
print $mystring;
output:
Lê
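Why the header matters can be shown with a small sketch that simulates a client reading the UTF-8 bytes as ISO-8859-1. This only imitates what a browser does when it picks the wrong charset; it assumes mbstring is loaded and the file is saved as UTF-8.

```php
<?php
// Declare the charset before any output is sent.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

$name = 'Lê';
// Three bytes in UTF-8, which matches string(3) in the var_dump above.
echo strlen($name), ' bytes: ', bin2hex($name), "\n";

// Treat each byte as one ISO-8859-1 character, as a misconfigured client would.
$misread = mb_convert_encoding($name, 'UTF-8', 'ISO-8859-1');
echo $misread, "\n"; // the two bytes of ê show up as two separate characters
```

The string itself is fine in both cases; only the charset the client assumes changes what is displayed.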

PHP import CSV with utf-8 accents

I am having issues importing a CSV file which contains (French) names with accents in them. Whenever they are imported, the accents do not display properly. For example:
félix turns into fŽlix
The file is created by hand and then imported into PHP.
I have tried both utf8_encode() and utf8_decode(), and neither function converts the characters so that they can be viewed properly.
My question is how I can get this to render properly (convert the charset, etc.).
I believe the text is encoded in Cp850, based on other questions I've seen on here. I am using fgetcsv() to get the contents.
Set the header information before you output as UTF-8:
header('Content-Type: text/html; charset=utf-8');
$log = file_get_contents("log.csv");
echo utf8_encode($log);
Output
félix
Please try the iconv() function.
I think this is a late answer, but it may be helpful for those who are still searching for a solution. This is just a tweak and is not always recommended.
header('Content-Type: text/csv; charset=UTF-8');
header('Content-Disposition: attachment; filename=filename.csv');
echo "\xEF\xBB\xBF"; // UTF-8 with BOM
readfile("filename.csv");
exit;
I'm doing this on upload:
if (move_uploaded_file($_FILES["fileToUpload"]["tmp_name"], $target_dir . $target_file)) {
// utf8_encode() assumes the uploaded file is ISO-8859-1
$log = file_get_contents($target_dir . $target_file);
file_put_contents($target_dir . $target_file, utf8_encode($log));
}
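One risk with converting on upload is that a file which is already UTF-8 gets encoded a second time. A minimal sketch of a guard against that, assuming mbstring is loaded: the helper name ensureUtf8 and the ISO-8859-1 default are assumptions for illustration, not something the answers above prescribe.

```php
<?php
// Hypothetical helper: converts to UTF-8 only when the input is not already valid UTF-8.
function ensureUtf8(string $bytes, string $assumedSource = 'ISO-8859-1'): string
{
    if (mb_check_encoding($bytes, 'UTF-8')) {
        // Already valid UTF-8; converting again would double-encode the accents.
        return $bytes;
    }
    return mb_convert_encoding($bytes, 'UTF-8', $assumedSource);
}

$latin1 = "f\xE9lix";     // "félix" stored as ISO-8859-1
$utf8   = "f\xC3\xA9lix"; // "félix" stored as UTF-8

echo ensureUtf8($latin1), "\n";
echo ensureUtf8($utf8), "\n";
```

This is a heuristic: a short ISO-8859-1 string can happen to be valid UTF-8 as well, so knowing the real source encoding is still the safer option.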
