PHP fputcsv encoding - php

I create csv file with fputcsv. I want csv file to be in windows1251 ecnding. But can't find the solution. How can I do that?
Cheers

Default encoding for an excel file is machine specific ANSI, mainly windows1252. But since you are creating that file and maybe inserting UTF-8 characters, that file is not going to be handled ok.
You could use iconv() when creating the file. Eg:
function encodeCSV(&$value, $key){
$value = iconv('UTF-8', 'Windows-1252', $value);
}
array_walk($values, 'encodeCSV');

It's Working: Enjoy
use this before fputcsv:
$line = array_map("utf8_decode", $line);

The file will be in whatever encoding your strings are in. PHP strings are raw byte arrays, their encodings depends on wherever the bytes came from. If you just read them from your source code files, they're in whatever you saved your source code as. If they're from a database, they're in whatever encoding the database connection was set to.
If you need to convert from one encoding to another, use iconv. If you need more in-depth information, see here:
What Every Programmer Absolutely, Positively Needs To Know About Encodings And Character Sets To Work With Text
Handling Unicode Front To Back In A Web App

fputs( $fp, $bom = chr(0xEF) . chr(0xBB) . chr(0xBF) );
Try this it worked for me!!!

Try the iconv function:
http://php.net/manual/en/function.iconv.php
To transform the members of the array you're passing to the fputcsv function to the encoding you want.

$string = iconv(mb_detect_encoding($string), 'Windows-1252//TRANSLIT', $string);

Related

How to convert UTF-8 to ANSI csv file

I'm actually generating csv files in php, works great but I have to use these csv files to use into Microsoft Dynamics AX and here's the problem.
Csv file that I generated gets "NUL" space on some columns and I have to pull off those spaces to get clean csv files and use it in Dynamics AX.
I saw when opening them into Notepad ++ that csv files are in UTF-8 BOM and I want to convert them to ANSI, when I make the conversion to ANSI in Notepad++, all NUL spaces disappear.
I tried different things saw on StackOverflow and it is with the iconv method that I obtained the better result but it is far from perfect and what I expect.
Here's the actual code :
fprintf($fp, chr(0xEF) . chr(0xBB) . chr(0xBF));
for ( $a = 0 ; $a < count($tableau) ; $a++ ) {
foreach ( $tableau[$a] as $data ) {
fputcsv($fp, $data, ";", chr(0));
}
}
$fp=iconv("UTF-8", "Windows-1252//TRANSLIT//IGNORE", $fp);
fclose($fp);
echo json_encode(responseAjax(true));
}
and I obtain these result :
I don't understand why it's only apply in one cell instead on working on every cells which contain "NUL" spaces.
I tried the mb_converting_encoding method with no great result.
Any other idea, method or advice will be welcome,
thanks
"NUL" is the name generally given to a binary value of 00000000, which has the same meaning in all ASCII-compatible encodings, which includes UTF-8, Windows-1252, and most things that could be given the vague label "ANSI". So character encoding is not relevant here.
You appear to be explicitly adding it with chr(0) - specifically as the "enclosure" argument to fputcsv, so it's being used in place of quote marks around strings with spaces or punctuation. The string "Avion" doesn't need enclosing, which is why it doesn't have any NULs around.
Let's add some comments to the code you've apparently copied without understanding:
// Output the three bytes known as the "UTF-8 BOM"
// - an invisible character used to help software guess that a file should be read as UTF-8
fprintf($fp, chr(0xEF) . chr(0xBB) . chr(0xBF));
// Loop over the data
for ( $a = 0 ; $a < count($tableau) ; $a++ ) {
foreach ( $tableau[$a] as $data ) {
// Output a row of CSV data with the built-in function fputcsv
// $data is the array of data you want to output on this row
// Use ';' as the separator between columns
// Use chr(0) - a NUL byte - to "enclose" fields with spaces or punctuation
// The default would be to use comma (',') and quotes ('"')
// See the PHP manual at https://php.net/fputcsv for more details
fputcsv($fp, $data, ";", chr(0));
}
}
The character you use for the "enclosure" is entirely up to you; most systems will probably expect the default ", so you could use this:
fputcsv($fp, $data, ";");
Which is the same as this:
fputcsv($fp, $data, ";", '"');
The function doesn't support disabling the enclosure completely, but without it, CSV is fundamentally a very simple format - just combine the fields separated by some character, e.g. using implode:
fwrite($fp, implode(";", $data));
Character encoding is a completely separate issue. For that, you need to know two things:
What encoding is your data in
What encoding does the remote system want it in
If these don't match, you can use a function like iconv or mb_convert_encoding.
If your output is not UTF-8, you should also remove the line at the beginning of your code that outputs the UTF-8 BOM.
If your data is stored in UTF-8, and the remote system accepts data in UTF-8, you don't need to do anything here.

Laravel Storage file encoding

I'm trying to save text file as UTF-8 by using Laravel's Storage facade. Unfortunately couldn't find a way and it saves as us-ascii. How can I save as UTF-8?
Currently I'm using following code to save file;
Storage::disk('public')->put('files/test.txt", $fileData);
You should be able to append "\xEF\xBB\xBF" (the BOM which defines it as UTF-8) to your $fileData. So:
Storage::disk('public')->put('files/test.txt", "\xEF\xBB\xBF" . $fileData);
There are other ways to convert your text before writing it to the file, but this is the simplest and easiest to read and execute. As far as I know, there is also no character encoding methods within Illuminate\Filesystem\Filesystem.
For more information: https://stackoverflow.com/a/9047876/823549 and What's different between UTF-8 and UTF-8 without BOM?.
ASCII is a subset of UTF-8, so all ASCII files are already UTF-8 encoded. The bytes in the ASCII file and the bytes that would result from "encoding it to UTF-8" would be exactly the same bytes. There's no difference between them, so there's no need to do anything.
It looks like your problem is that the files are not actually ASCII. You need to determine what encoding they are using, and transcode them properly.
I recommend using mb_convert_encoding instead
$fileData = mb_convert_encoding($fileData, "UTF-8", "auto");
Storage::disk('public')->put('files/test.txt", $fileData);

Save filename with unicode chars

I have searched all over the Internet and SO, still no luck in the following:
I would like to know, how to properly save a file using file_put_contents when filename has some unicode characters. (Windows 7 as OS)
$string = "jérôme.jpg" ; //UTF-8 string
file_put_contents("images/" . $string, "stuff");
Resuts in a file:
jГ©rГґme.jpg
Tried all possible combinations of such functions as iconv and mb_convert_encoding with all possible encodings, converting source file into different encodings as well.
All proper headers are set, browser recognises UTF-8 properly.
However, I can successfully copy-paste and create a file with such a name in explorer's GUI, but how to make it via PHP?
The last hardcore solution was to urlencode the string and save file.
This might be late but i just found a solution to close this hurting issue for me as well.
Forget about iconv and multibyte solutions; the problem is on Windows! (in the link you'll find all it's beauty about this.)
After numerous attempts and ways to solve this, i met with URLify and decided that best way to cope with unicode-in-filenames is to transliterate them before writing to file.
Example of transliterating a filename before saving it:
$filename = "Αρχείο.php"; // greek name for 'file'
echo URLify::filter($filename,128,"",TRUE);
// output: arxeio.php

How to set text file encoding in PHP?

How can I set a text file ENCODING (for instance UTF-8) in PHP?
Let me show you my problem. This is my code:
<?php
file_put_contents('test.txt', $data); // data is some non-English text with UTF-8 charset
?>
Output: اÙ!
fwrite() has the similar output.
But when I create the test.txt by notepad and set the charset UTF-8 the output is what I want.
I wanna set the charset in the PHP file.
Now this is my question: How to set text file encoding by PHP?
PHP does not apply an encoding when storing text in a file: it stores data exactly as it is laid out in the string.
You mention that you have problems opening the file in notepad.exe. That text editor is not very good at guessing the encoding of the file you are opening; if the text is encoded in UTF-8 you must choose to open it as UTF-8. Use another text editor if possible. Notepad++ is a popular replacement.
If you must use notepad.exe, as a last resort, write a Byte Order Mark to the file before you write anything else; this will make it recognize the file as UTF-8 while potentially making the file unusable for other purposes (see the Wikipedia article for details).
file_put_contents("file.txt", "\xEF\xBB\xBF" . $data);
You may try this using mb_convert_encoding
$data = mb_convert_encoding($data, 'UTF-8', 'auto');
file_put_contents('test.txt', $data);
Also check iconv.
Update : (try this and find the right encoding for your text)
foreach(mb_list_encodings() as $chr){
echo mb_convert_encoding($data, 'UTF-8', $chr)." : ".$chr."<br>";
}
Also, try this on GitHub.
Try:
file_put_contents('test.txt', utf8_encode($data));
You can create a function which converts a string array into a utf8 encoded string array and another to decode and write to a notepad file for you.
<?php
function utf8_string_encode(&$array){
$myencode = function(&$value,&$key){
if(is_string($value)){
$value = utf8_encode($value);
}
if(is_string($key)){
$key = utf8_encode($key);
}
if(is_array($value)){
utf8_string_encode($value);
}
};
array_walk($array,$func);
return $array;
}
?>

Problem writing UTF-8 encoded file in PHP

I have a large file that contains world countries/regions that I'm seperating into smaller files based on individual countries/regions. The original file contains entries like:
EE.04 Järvamaa
EE.05 Jõgevamaa
EE.07 Läänemaa
However when I extract that and write it to a new file, the text becomes:
EE.04 Järvamaa
EE.05 Jõgevamaa
EE.07 Läänemaa
To save my files I'm using the following code:
mb_detect_encoding($text, "UTF-8") == "UTF-8" ? : $text = utf8_encode($text);
$fp = fopen(MY_LOCATION,'wb');
fwrite($fp,$text);
fclose($fp);
I tried saving the files with and without utf8_encode() and neither seems to work. How would I go about saving the original encoding (which is UTF8)?
Thank you!
First off, don't depend on mb_detect_encoding. It's not great at figuring out what the encoding is unless there's a bunch of encoding specific entities (meaning entities that are invalid in other encodings).
Try just getting rid of the mb_detect_encoding line all together.
Oh, and utf8_encode turns a Latin-1 string into a UTF-8 string (not from an arbitrary charset to UTF-8, which is what you really want)... You want iconv, but you need to know the source encoding (and since you can't really trust mb_detect_encoding, you'll need to figure it out some other way).
Or you can try using iconv with a empty input encoding $str = iconv('', 'UTF-8', $str); (which may or may not work)...
It doesn't work like that. Even if you utf8_encode($theString) you will not CREATE a UTF8 file.
The correct answer has something to do with the UTF-8 byte-order mark.
This to understand the issue:
- http://en.wikipedia.org/wiki/Byte_order_mark
- http://unicode.org/faq/utf_bom.html
The solution is the following:
As the UTF-8 byte-order mark is '\xef\xbb\xbf' we should add it to the document's header.
<?php
function writeStringToFile($file, $string){
$f=fopen($file, "wb");
$file="\xEF\xBB\xBF".$string; // utf8 bom
fputs($f, $string);
fclose($f);
}
?>
The $file could be anything text or xml...
The $string is your UTF8 encoded string.
Try it now and it will write a UTF8 encoded file with your UTF8 content (string).
writeStringToFile('test.xml', 'éèàç');
Maybe you want to call htmlentities($text) before writing it into file and html_entity_decode($fetchedData) before output. It'll work with Scandinavian letters.
It appears that your source file is not, in fact, in UTF-8. You might want to try using the same approach you've been using, but with a different encoding, such as UTF-16 perhaps.
You can do it as follows:
<?php
$s = "This is a string éèàç and it is in utf-8";
$f = fopen('myFile',"w");
fwrite($f, utf8_encode($s));
fclose($f);
?>

Categories