Write UTF-8 characters as ASCII into a file - PHP

I'm trying to convert Hebrew characters from UTF-8 to ISO-8859-8-1 in order to save them into a file.
I have read about ten posts on this site, but
no matter what I do, I always get question marks (???????) instead of Hebrew letters.
I tried iconv(), mb_convert_encoding(), and utf8_decode() to convert from UTF-8 to ISO-8859-8-1, but I keep getting '?????????' in the file.
mb_convert_encoding($fullRecord, 'ISO-8859-1', 'UTF-8');
iconv("UTF-8", "ISO-8859-1", $fullRecord);
iconv("UTF-8", "ISO-8859-1//TRANSLIT", $fullRecord);
Even this post didn't help, because the solution there is in JavaScript:
Conversion from UTF8 to ASCII
I wish it could be in PHP...
I know that there are no Hebrew characters in ASCII, but I have an example file that shows it can be done. When I open the file in Notepad, it shows the Hebrew correctly and the file is ANSI, so I guess it can be done somehow...
Can anyone please help?

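Try Windows-1255, the Windows ("ANSI") code page for Hebrew. ISO-8859-1 covers only Latin letters, which is why the attempts above produce question marks: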
iconv("UTF-8", "windows-1255", $fullRecord);
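As a minimal sketch of that suggestion (not necessarily the answerer's exact setup): the helper name utf8ToWindows1255 and the output file name record.txt are made up for illustration, and the sketch assumes your iconv build knows the name WINDOWS-1255.

```php
<?php
// Hypothetical helper: UTF-8 in, Windows-1255 (Hebrew "ANSI") bytes out.
function utf8ToWindows1255(string $text): string
{
    $bytes = iconv('UTF-8', 'WINDOWS-1255', $text);
    if ($bytes === false) {
        // iconv() returns false when a character has no Windows-1255 equivalent.
        throw new RuntimeException('Text is not representable in Windows-1255');
    }
    return $bytes;
}

// The Hebrew word "shalom", written with escapes so the example does not
// depend on the encoding of this source file.
$fullRecord = "\u{05E9}\u{05DC}\u{05D5}\u{05DD}";
$bytes = utf8ToWindows1255($fullRecord);

// Hypothetical output path; each Hebrew letter is now a single byte.
file_put_contents(sys_get_temp_dir() . '/record.txt', $bytes);
```

Opening the resulting file in Notepad on a Hebrew-locale Windows machine should then show the letters, since Notepad reads "ANSI" files using the system code page.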

Related

How to properly convert characters in GB2312 to UTF-8 in PHP?

I have variables containing Chinese words in the GB2312 charset. I want to convert them to UTF-8 because I want to save them to a MySQL table with UTF-8 encoding. How do I do that in PHP? I'm using PHP 7.
Here is what I have tried:
I have tried $myvar = iconv('gb2312', 'utf-8', $myvar); however, some of my variables come back empty if they contain certain characters (invalid UTF-8 chars maybe?)
I have tried $myvar = mb_convert_encoding($myvar, 'UTF-8', 'GB2312'); it works better than iconv, but when $myvar contains some of the characters mentioned above, they turn into question marks (?)
Please help me, thanks
Update
Here is an example of my chinese string:
GB2312 (Expected result): 第3章︰林鴻
With mb_convert_encoding it becomes: 第3章?林?
With iconv it becomes empty

Unable to convert file from ANSI to UTF-8, using PHP

I have a file that contains some Cyrillic characters. When I open this file in Notepad++, I see that it has ANSI encoding. If I manually convert it to UTF-8 using Notepad++, then everything is absolutely OK: I can use this file in my parsers and get results. But what I want is to do it programmatically, using PHP. This is what I tried after searching through SO and the documentation:
file_put_contents($file, utf8_encode(file_get_contents($file)));
In this case, when my algorithm parses the resulting files, it encounters letters such as "è", "í", "â". In other words, I get rubbish. I also tried this:
file_put_contents($file, iconv('WINDOWS-1252', 'UTF-8', file_get_contents($file)));
But it produces the very same rubbish. So I really wonder how I can achieve programmatically what Notepad++ does. Thanks!
Notepad++ may report your encoding as ANSI but this does not necessarily equate to Windows-1252. 1252 is an encoding for the Latin alphabet, whereas 1251 is designed to encode Cyrillic script. So use
file_put_contents($file, iconv('WINDOWS-1251', 'UTF-8', file_get_contents($file)));
to convert from Windows-1251 to UTF-8 with iconv.
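To see why the source encoding matters, here is a small sketch that decodes the same bytes both ways. The helper name cp1251ToUtf8 and the sample word are made up for illustration; only the iconv() call comes from the answer.

```php
<?php
// Hypothetical helper: Windows-1251 (Cyrillic) bytes in, UTF-8 out.
function cp1251ToUtf8(string $bytes): string
{
    $utf8 = iconv('WINDOWS-1251', 'UTF-8', $bytes);
    if ($utf8 === false) {
        throw new RuntimeException('Input is not valid Windows-1251');
    }
    return $utf8;
}

// Six bytes that spell the Russian word for "hello" in Windows-1251.
$raw = "\xCF\xF0\xE8\xE2\xE5\xF2";

// Correct source encoding: readable Cyrillic.
$good = cp1251ToUtf8($raw);

// Wrong source encoding: the same bytes read as Windows-1252 turn into
// accented Latin letters, the kind of rubbish described in the question.
$bad = iconv('WINDOWS-1252', 'UTF-8', $raw);

echo $good, PHP_EOL; // Привет
echo $bad, PHP_EOL;  // Ïðèâåò
```

Both calls succeed without any error, which is why the mistake is easy to miss: every byte is valid in either code page, so only the resulting letters differ.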

Hebrew words and letters become question marks

I'm trying to read information from a text file, but when the text is in Hebrew, it shows "????" instead of the Hebrew word.
I can't change the file encoding, because ZaraRadio outputs it, so I tried to convert the file contents to UTF-8 this way:
$npf = "CurrentSong.txt";
$ans = file_get_contents($npf);
$ans = mb_convert_encoding($ans, "UTF-8", "auto");
but it's still not working...
Any suggestions?
Thanks.
Most likely "auto" will not work, because the file is in a single-byte encoding. You don't say which encoding it uses, but ISO-8859-8 is probably it:
$ans = mb_convert_encoding($ans, "UTF-8", "ISO-8859-8");
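A small sketch of that conversion on known bytes, so the result can be checked without the real file. The helper name hebrewIsoToUtf8 and the sample bytes are assumptions for illustration; ZaraRadio's actual output encoding is not confirmed in the question.

```php
<?php
// Hypothetical helper wrapping the conversion suggested above.
function hebrewIsoToUtf8(string $bytes): string
{
    return mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-8');
}

// Four ISO-8859-8 bytes spelling the Hebrew word "shalom",
// standing in for the contents of CurrentSong.txt.
$ans = "\xF9\xEC\xE5\xED";
$ans = hebrewIsoToUtf8($ans);

// Each Hebrew letter takes two bytes in UTF-8.
echo strlen($ans), PHP_EOL; // 8
```

If the output still looks wrong, the file may be in a different Hebrew code page such as Windows-1255, in which case the source encoding name is what needs to change.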

How to list files with special (norwegian) characters

I'm doing a simple (I thought) directory listing of files, like so:
$files = scandir(DOCROOT.'files');
foreach($files as $file)
{
echo ' <li>'.$file.PHP_EOL;
}
The problem is that the file names contain Norwegian characters (æ, ø, å), and for some reason they come out as question marks. Why is this?
I can apparently fix(?) it by doing this before I echo it out:
$file = mb_convert_encoding($file, 'UTF-8', 'pass');
But it makes little sense to me why this helps, since pass should mean that no character encoding conversion is performed, according to the docs... *confused*
Here is an example: http://random.geekality.net/files/index.php
It appears the file names are encoded in ISO Latin 1, but the page is interpreted as UTF-8 by default. The characters do not come out as "question marks", but as Unicode replacement characters (�). That means the browser, which tries to interpret the byte stream as UTF-8, has encountered a byte that is invalid in UTF-8 and inserts the replacement character at that point instead. Switch your browser to ISO Latin 1 and see the difference (View > Encoding > ...).
So what you need to do is to convert the strings from ISO Latin 1 to UTF-8, if you designate your page to be UTF-8 encoded. Use mb_convert_encoding($file, 'UTF-8', 'ISO-8859-1') to do so.
Why it works if you specify the $from encoding as pass I can only guess. What you're telling mb_convert_encoding with that is to convert from pass to UTF-8. I guess that makes mb_convert_encoding take the mb_internal_encoding value as the $from encoding, which happens to be ISO Latin 1. I suppose it's equivalent to 'auto' when used as the $from parameter.
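Putting the explicit conversion into the listing loop could look roughly like this. The function names fileNameToUtf8 and listFiles are made up for illustration, and the sketch assumes every name that is not already valid UTF-8 is ISO Latin 1.

```php
<?php
// Hypothetical helper: make a file name safe to print on a UTF-8 page.
function fileNameToUtf8(string $name): string
{
    // Names that are already valid UTF-8 are left alone to avoid double encoding.
    if (mb_check_encoding($name, 'UTF-8')) {
        return $name;
    }
    // Assumption: anything else is ISO Latin 1, as described in the answer.
    return mb_convert_encoding($name, 'UTF-8', 'ISO-8859-1');
}

// Hypothetical wrapper around the listing loop from the question.
function listFiles(string $dir): string
{
    $files = scandir($dir);
    if ($files === false) {
        return '';
    }
    $html = '';
    foreach ($files as $file) {
        $safe = htmlspecialchars(fileNameToUtf8($file), ENT_QUOTES, 'UTF-8');
        $html .= ' <li>' . $safe . '</li>' . PHP_EOL;
    }
    return $html;
}
```

Naming the source encoding explicitly keeps the behaviour independent of whatever mb_internal_encoding happens to be set to on the server.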

uploaded file contents being echoed out but not showing accent marks

INT. PALO TORCIDO HIGH SCHOOL, CAFETER�A - DAY
Hi, I uploaded a .txt file to my server and read the contents with fopen/fread, and also tried file_get_contents just in case.
I can't seem to figure out how to encode the special characters...
In my HTML I have the charset set to UTF-8. I also tried a PHP header to use UTF-8 encoding.
What is the proper way to handle files with letters that are not part of the English alphabet?
Try utf8_encode()
echo utf8_encode(file_get_contents('file.txt'));
This works if the *.txt is encoded in Latin1. If other encodings may be used too, detect the encoding using mb_detect_encoding() and convert it to UTF-8 with mb_convert_encoding().
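A rough sketch of the detect-then-convert approach follows. The function name toUtf8 and the fallback list are assumptions for illustration; detection on short single-byte text is only a guess, so the candidate list should name the encodings you actually expect. It uses mb_convert_encoding() rather than utf8_encode(), which is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical helper: return $bytes as UTF-8, guessing among $fallbacks
// when the input is not already valid UTF-8.
function toUtf8(string $bytes, array $fallbacks = ['ISO-8859-1']): string
{
    // Already valid UTF-8: converting again would double-encode it.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes;
    }
    // Strict mode rejects candidates in which the bytes are invalid.
    $from = mb_detect_encoding($bytes, $fallbacks, true);
    if ($from === false) {
        throw new RuntimeException('Could not detect the source encoding');
    }
    return mb_convert_encoding($bytes, 'UTF-8', $from);
}

// The word from the question as Latin1 bytes (0xCD is the accented I).
$line = toUtf8("CAFETER\xCDA");
```

The same call can wrap file_get_contents('file.txt') in place of the sample string.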
