PHP: how to output Arabic characters on Windows using PowerShell

I have the following PHP script:
<?php
header('Content-Type: text/html; charset=utf-8');
$file = "//192.168.10.206/wwwroot/SABIS CORPORATE/PrepList/Documents/41/1516 Level I Arabic Basic Questions and Answers T1 الأسلوب الخبري.pdf";
echo $file;
?>
and the output is always:
PS C:\Users\aaoun> php -q c:\Users\aaoun\Desktop\Test-AAA.php
//192.168.10.206/wwwroot/SABIS CORPORATE/PrepList/Documents/41/1516 Level I Arabic Basic Questions and Answers T1 ╪د┘╪ث
╪│┘┘ê╪ذ ╪د┘╪«╪ذ╪▒┘è.pdf PS C:\Users\aaoun>
I tried UTF-8 encoding and all of the following conversions:
iconv("unicode", "utf-8", $file);
iconv("Latin1_General_CI_AS","utf-8",$file);
iconv("Arabic_CI_AS","utf-8",$file);
base64_encode($file);
utf8_encode($file);
base64_decode($file);
utf8_decode($file);
iconv('WINDOWS-1256', 'UTF-8', $file);
iconv('cp1256', 'UTF-8', $file);
Nothing seems to work; I keep getting the wrong text. I need the correct text so that I can check whether the file exists ...

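Make sure the encoding PHP uses for file-system calls matches the encoding of the string that holds the file name in your PHP code.
Otherwise you are testing for a different file name than the one on disk.
For example, the file name in your script is UTF-8, but the file system may not receive it as UTF-8. On Windows, NTFS stores file names as UTF-16, and PHP versions before 7.1 pass paths to the OS in the system ANSI code page, so a UTF-8 name does not match. (From PHP 7.1 on, UTF-8 paths are supported on Windows.) You can try converting your file name to UTF-16 like this, although the file functions may reject the NUL bytes that a UTF-16 string contains, in which case converting to the system code page (for example CP1256 for Arabic) is the more likely fix: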
$newName = mb_convert_encoding('your file path', 'UTF-16', 'UTF-8');
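If that does not work and you control how the files are named, you can use urlencode. All characters returned from urlencode are valid in filenames (NTFS/HFS/UNIX). Note that this only finds a file that was saved under the encoded name in the first place.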
$newName = urlencode($file);
Then check whether the file exists:
if (file_exists('your file path')) {
    // File exists
} else {
    // File does not exist
}
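A minimal sketch of the check described above, assuming the script itself is saved as UTF-8. The helper name findExistingPath and the fallback list are hypothetical, and CP1256 is only a guess at the code page an Arabic Windows installation would use. It tries the UTF-8 path first and then each fallback encoding:

```php
<?php
// Hypothetical helper: returns the first variant of $utf8Path that exists
// on disk, or null if none of them does.
function findExistingPath(string $utf8Path, array $fallbackCharsets = ['CP1256']): ?string
{
    // On PHP 7.1+ (and on most non-Windows systems) the UTF-8 path works as-is.
    if (file_exists($utf8Path)) {
        return $utf8Path;
    }
    foreach ($fallbackCharsets as $charset) {
        // iconv() returns false (and raises a notice) if a character
        // cannot be represented in the target charset.
        $converted = @iconv('UTF-8', $charset, $utf8Path);
        if ($converted !== false && file_exists($converted)) {
            return $converted;
        }
    }
    return null;
}
```

Calling findExistingPath($file) with the path from the question would then return either a usable path string or null, without echoing anything to the console.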

Related

PHP read a line from a csv file return wrong in charset

I have a CSV file. If I set the charset to ISO-8859-2 (Eastern Europe) in LibreOffice Calc, it renders the characters correctly, but the server's locale is set to EN-UK,
so I cannot read the characters correctly. For example,
it returns T�t instead of Tót.
I tried many things like:
echo (mb_detect_encoding("T�t","ISO-8859-2","UTF-8"));
I know the character probably does not exist in UTF-8, but I tried anyway.
I also tried to set the correct charset in the header:
header('Content-Type: text/html; charset=iso-8859-2');
echo "T�th";
but it returns TÄĹźËth instead of Tóth.
Please help me solve this. Thanks in advance.
I advise against setting the header to charset=iso-8859-2. It is usual to work with UTF-8. If the data is available in a different encoding, it should be converted to UTF-8 and then processed as CSV. The following example code can be kept this simple because the newline characters are encoded identically in UTF-8 and ISO-8859-2.
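$fileName = "yourpath/Iso8859_2.csv";
$fp = fopen($fileName, "r");
// Compare against false so that a line containing only "0" does not end the loop
while (($row = fgets($fp)) !== false) {
    // Convert each line to UTF-8 before parsing it as CSV
    $strUtf8 = mb_convert_encoding($row, 'UTF-8', 'ISO-8859-2');
    $arr = str_getcsv($strUtf8);
    var_dump($arr);
}
fclose($fp);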
The exact encoding of the CSV file must be known. mb_detect_encoding is not suitable for determining the encoding of a file.
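To illustrate why the encoding has to be known rather than detected, here is a small sketch. The sample bytes are made up for the illustration, and iconv is used only to keep the example dependency-free; mb_convert_encoding from the answer above should behave the same for these two encodings. The byte 0xF5 is a valid character in both ISO-8859-2 and ISO-8859-1, so the bytes alone cannot tell you which one was meant:

```php
<?php
// Made-up sample: "T", the single byte 0xF5, then "th".
$bytes = "T\xF5th";

// Interpreted as ISO-8859-2, 0xF5 is U+0151 (o with double acute).
$asLatin2 = iconv('ISO-8859-2', 'UTF-8', $bytes);

// Interpreted as ISO-8859-1, the same byte is U+00F5 (o with tilde).
$asLatin1 = iconv('ISO-8859-1', 'UTF-8', $bytes);

// Both conversions succeed, but they produce different text.
var_dump($asLatin2 === "T\u{0151}th");
var_dump($asLatin1 === "T\u{00F5}th");
var_dump($asLatin2 === $asLatin1);
```

Both conversions succeed without any error, which is exactly why a wrong guess goes unnoticed until someone reads the output.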

PHP function in_array doesn't recognize diacritic

I have code that runs through a directory and collects all images.
$img = '/srv/www/wordpress-default/public_html/wp-content/uploads/2018/07/2018_07_DogOwner_VS_CatOwner_655x368_NL-500x281.jpg';
$dir = preg_replace('#[^/]*$#', '', $img);
$image_files = scandir($dir);
// array_pop() needs a variable, so split the path first
$parts = explode('/', $img);
$image_name = array_pop($parts);
$find = $image_name;
var_dump(in_array($find, $image_files));
In this example I run through only one image. This code returns true. The problem is when I have an image whose name contains German characters (for example hundezubehör-für-sommer.jpg).
$img = '/srv/www/wordpress-default/public_html/wp-content/uploads/2018/07/hundezubehör-für-sommer.jpg';
This returns false. Any ideas why this doesn't work?
EDITED:
A few days ago I asked this question: How to find a shortest name (string) of the same image with different naming. The solution to it is here: https://3v4l.org/T7lfU. I think the problem is that when I run that code on the scandir result, it can't find the name with the diacritics.
The in_array function works regardless of the alphabet of the strings. I guess the problem is that your PHP file and your file system use different encodings, so the value returned by scandir is encoded differently from the $img value written in the code.
Try converting the encoding of the scandir result so that it matches the PHP file encoding. For example:
// ...
$image_files = scandir($dir);
foreach ($image_files as &$file) {
    $file = mb_convert_encoding($file, 'UTF-8', 'Windows-1251');
}
unset($file); // break the reference left over from the loop
// ...
var_dump(in_array($find, $image_files));
Replace UTF-8 with the PHP file encoding and Windows-1251 with your filesystem encoding.
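The conversion above can be sketched as a reusable function. The name namesToUtf8 is hypothetical, and the sketch assumes the PHP source file is UTF-8. It only converts names that are not already valid UTF-8, so it is harmless when the file system already returns UTF-8 names:

```php
<?php
// Hypothetical helper: normalizes a list of file names to UTF-8.
// $fsEncoding is the encoding you believe the file system uses.
function namesToUtf8(array $names, string $fsEncoding): array
{
    $result = [];
    foreach ($names as $name) {
        // With the /u modifier, preg_match() fails on invalid UTF-8,
        // which makes it a cheap validity check.
        if (preg_match('//u', $name) === 1) {
            $result[] = $name;
            continue;
        }
        $converted = iconv($fsEncoding, 'UTF-8', $name);
        // Keep the original bytes if the conversion fails.
        $result[] = ($converted === false) ? $name : $converted;
    }
    return $result;
}
```

It could then be used as in_array($find, namesToUtf8(scandir($dir), 'ISO-8859-1')). One caveat: a name in a legacy encoding can, by coincidence, also be valid UTF-8, in which case this check leaves it unconverted.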
The problem is with storing multi-byte characters like ö and ü in a PHP file.
You can try converting the string from ISO-8859-1 to UTF-8 (note that utf8_encode is deprecated as of PHP 8.2):
$img = utf8_encode('/srv/www/wordpress-default/public_html/wp-content/uploads/2018/07/hundezubehör-für-sommer.jpg');
Encoding, then decoding to make it safer:
$img = html_entity_decode('/srv/www/wordpress-default/public_html/wp-content/uploads/2018/07/hundezubehör-für-sommer.jpg');
Or write the UTF-8 bytes of ö and ü as octal escape sequences:
$img = "/srv/www/wordpress-default/public_html/wp-content/uploads/2018/07/hundezubeh\303\266r-f\303\274r-sommer.jpg";

PHP convert encoding with Shift_JIS

I have a text file. It contains the character "砡" and its encoding is Shift-JIS.
I use file_get_contents() in PHP (Laravel) to read this file and then return its content to the client as JSON.
$file = file_get_contents("/path/to/file/text");
$file = iconv("SJIS", "UTF-8//IGNORE", $file);
return response()->json(['content' => $file]);
However, the character "砡" is not displayed correctly; it shows up as "x".
How do I fix it?
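Try "SJIS-win" instead of "SJIS". Note that "SJIS-win" is the mbstring name (for mb_convert_encoding); with iconv the corresponding encoding is usually called "CP932".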
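A minimal sketch of that suggestion, under the assumption that the file really uses the Windows variant of Shift-JIS. The function name sjisWinToUtf8 is hypothetical. It prefers mbstring and falls back to iconv's CP932 when mbstring is not loaded:

```php
<?php
// Hypothetical helper: converts Windows-flavoured Shift-JIS bytes to UTF-8.
function sjisWinToUtf8(string $bytes): string
{
    if (function_exists('mb_convert_encoding')) {
        // "SJIS-win" is mbstring's name for the Microsoft variant.
        return mb_convert_encoding($bytes, 'UTF-8', 'SJIS-win');
    }
    // iconv knows the same variant as CP932.
    $converted = iconv('CP932', 'UTF-8', $bytes);
    if ($converted === false) {
        throw new RuntimeException('Input is not valid CP932');
    }
    return $converted;
}
```

In the controller from the question, the iconv line would become $file = sjisWinToUtf8($file); before the JSON response is built.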

mkdir UTF-8 file name

I'm having a problem with mkdir.
I'm using XAMPP on Windows. When I try to create a directory, the resulting name is not what it should be. For example,
mkdir(JPATH_SITE.'/images/projects/'.$region_folder.'/'.$project_folder, 0777, true);
should create something like
/images/projects/Ленинградская_область/Ленинградская_область_1
but it creates a directory like:
/images/projects/Ленинградская_область/Ленинградская_область_1
Is this something to do with encoding, or with the OS?
PHP on Windows (before PHP 7.1) does not pass file names to the OS as UTF-8, but in the system code page, such as Windows-1252 or Windows-1251.
Try this:
$dirname = JPATH_SITE.'/images/projects/'.$region_folder.'/'.$project_folder;
// Replace "UTF-8" with the actual input charset if it is not UTF-8
$dirname = iconv("UTF-8", "Windows-1252", $dirname);
mkdir($dirname, 0777, true);
// If this doesn't work, use another target charset in the iconv call above instead,
// for example Windows-1251 for Cyrillic names:
// $dirname = iconv("UTF-8", "Windows-1251", $dirname);
// You can also apply iconv to the Russian variables only, before building $dirname.
// Remember that you might need to change UTF-8 to another input charset.
$region_folder = iconv("UTF-8", "Windows-1251", $region_folder);
$project_folder = iconv("UTF-8", "Windows-1251", $project_folder);
Read more about iconv in the PHP manual: iconv()
Also useful for guessing your charset encoding: mb_detect_encoding()
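The conversion is only needed where PHP does not hand UTF-8 paths to the OS. A sketch of that decision follows, with a hypothetical helper name toFilesystemName and the assumption that default_charset is left at UTF-8 (the setting PHP 7.1+ relies on for UTF-8 paths on Windows):

```php
<?php
// Hypothetical helper: returns the byte string to pass to mkdir() and friends.
// $ansiCharset is the Windows code page to use on old PHP versions,
// e.g. 'Windows-1251' for Cyrillic.
function toFilesystemName(string $utf8Name, string $ansiCharset): string
{
    $isWindows = stripos(PHP_OS, 'WIN') === 0;
    // Non-Windows systems, and Windows with PHP 7.1+, take the UTF-8 name as-is.
    if (!$isWindows || PHP_VERSION_ID >= 70100) {
        return $utf8Name;
    }
    $converted = iconv('UTF-8', $ansiCharset, $utf8Name);
    if ($converted === false) {
        throw new RuntimeException("Name cannot be represented in $ansiCharset");
    }
    return $converted;
}
```

With it, the directory from the question could be created as mkdir(JPATH_SITE.'/images/projects/'.toFilesystemName($region_folder, 'Windows-1251'), 0777, true);.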

How can I upload a file with a UTF-8 name in PHP?

We have a file with a Persian name, like:
ایران.jpg
Our problem is that PHP is unable to copy or rename this file under its original name.
If the file name is not made up entirely of English characters, the result looks like this:
ط§ط´ط±ع©طھ ظ…ظ„غŒ ظ¾ط®ط´ ظپط±ط¢ظˆط±ط¯ظ‡ ظ‡ط§غŒ ظ†ظپطھغŒ-04ط¢ط¨ط§ظ†.jpg
Some articles recommend using the iconv function, like:
$fn = iconv("CP-1252", "UTF-8", $file['name']);
We tried that method, but it did not work.
You need to specify the correct character set to iconv from which to convert the string. Something like this:
$fn = iconv("<persian-character-set>", "UTF-8", $file['name']);
You may want to add additional options to the output character set, like TRANSLIT and/or IGNORE:
$fn = iconv("<persian-character-set>", "UTF-8//TRANSLIT//IGNORE", $file['name']);
See http://php.net/manual/en/function.iconv.php for details on these options.
You should choose the right code page. This code works on Windows for Arabic/Persian names:
$newname = iconv("UTF-8", "CP1256//IGNORE","گچپژ");
echo rename("1.txt", $newname);
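A caveat with //IGNORE is that characters missing from the code page are silently dropped. The sketch below converts strictly instead, so that such a name is reported rather than mangled. The function name encodeForCodePage is hypothetical:

```php
<?php
// Hypothetical helper: converts a UTF-8 name to the given code page,
// or returns null if any character cannot be represented in it.
function encodeForCodePage(string $utf8Name, string $codePage): ?string
{
    // Without //IGNORE or //TRANSLIT, iconv() fails on unrepresentable
    // characters; @ hides the notice it raises in that case.
    $converted = @iconv('UTF-8', $codePage, $utf8Name);
    return $converted === false ? null : $converted;
}

$newname = encodeForCodePage("گچپژ", 'CP1256');
if ($newname === null) {
    echo "Name cannot be represented in CP1256\n";
}
```

A null result is a hint that the chosen code page is wrong for the name, which is easier to debug than a file that was renamed with characters missing.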
In effect, that is a conversion from UTF-8 to UTF-8. Isn't that ridiculous?
The value of $_FILES['name'] is already UTF-8, and we are trying to convert it to UTF-8!
Our problem is that the name comes in as UTF-8 but is saved in some unknown encoding.
I think PHP has a serious bug here. If you think otherwise, enable the Persian language in your Windows OS and try to create a folder with a Persian name (like ایران) from PHP, using any method you think will work.
If it works, you did a great job, but if ...
