PHP import CSV with utf-8 accents - php

I am having issues importing a CSV file that contains (French) names with accents. Whenever they are imported, the accents do not display properly. For example,
félix turns into fŽlix
The file is created by hand and then imported into PHP.
I have tried both utf8_encode() and utf8_decode(), and neither function converts the characters so that they display properly.
My question is: how can I get this to render properly (convert the character set, etc.)?
I believe the text is encoded in Cp850, based on other questions I've seen on here. I am using fgetcsv() to get the contents.

Set the header information before you output the text as UTF-8:
header('Content-Type: text/html; charset=utf-8');
$log = file_get_contents("log.csv");
echo utf8_encode($log);
Output
félix

Please try the iconv() function.
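As a minimal sketch of the iconv() suggestion, assuming the file really is CP850 as the asker suspects (that is a guess, not something the question confirms), the raw bytes can be converted to UTF-8 before the lines are parsed. The helper name csv_rows_to_utf8 and the sample data are hypothetical, and splitting on line breaks ignores quoted fields that span several lines.

```php
<?php
// Hypothetical helper: convert raw CSV bytes from an ASSUMED source encoding
// to UTF-8, then parse each line into an array of fields.
function csv_rows_to_utf8(string $raw, string $sourceEncoding = 'CP850'): array
{
    $utf8 = iconv($sourceEncoding, 'UTF-8', $raw);
    if ($utf8 === false) {
        throw new RuntimeException("Could not convert from $sourceEncoding");
    }

    $rows = [];
    // Simplification: one record per line (no multi-line quoted fields).
    foreach (preg_split('/\r\n|\r|\n/', trim($utf8)) as $line) {
        $rows[] = str_getcsv($line, ',', '"', '\\');
    }
    return $rows;
}

// Sample data: "\x82" is the CP850 byte for "é".
$rows = csv_rows_to_utf8("id,name\n1,f\x82lix\n");
```

If the output is still wrong, the source encoding is probably something other than CP850, and the second argument would need to change accordingly.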

I think this is a late answer, but it may be helpful for those who are still searching for a solution. This is just a tweak and is not always recommended.
header('Content-Encoding: UTF-8');
header('Content-type: text/csv; charset=UTF-8');
header('Content-Disposition: attachment; filename=filename.csv');
echo "\xEF\xBB\xBF"; // UTF-8 with BOM
readfile("filename.csv");
exit;

I'm doing this on upload:
if (move_uploaded_file($_FILES["fileToUpload"]["tmp_name"], $target_dir . $target_file)) {
    // Re-encode the uploaded file in place; utf8_encode() assumes the input is ISO-8859-1
    $log = file_get_contents($target_dir . $target_file);
    file_put_contents($target_dir . $target_file, utf8_encode($log));
}

Related

Can't encode CSV file in UTF8 with PHP

I've been trying, for some time now, to export a properly encoded and formatted CSV file with PHP, but it's just not working. I've tried every tip in every CSV/PHP-related thread on Stack Overflow. I've checked that the data in my database is UTF-8, and it is. I've tried things like utf8_encode() on the whole CSV line, and I've checked that the actual PHP file is encoded in UTF-8, but still no success. When I run the file on http://csvlint.io/ I just get:
Your CSV appears to be encoded in ASCII-8BIT. We recommend you use UTF-8.
But I can't find a trace of any other encoding than UTF-8 anywhere in my code..
Basically this is my code:
First, I put all my CSV-rows in an array, then do this:
if (count($array) == 0)
{
return NULL;
}
ob_start();
$df = fopen("php://output", 'w');
$csv = utf8_encode("header1|header2|header3|header4|header5|header6|header7\r\n");
foreach($array as $line) {
$csv .= $line . "\r\n";
}
setlocale(LC_ALL, 'sv_SE', "swedish");
fwrite($df, "\xEF\xBB\xBF".$csv);
fclose($df);
return ob_get_clean();
And these are the headers sent:
$now = gmdate("D, d M Y H:i:s");
header("Expires: Tue, 03 Jul 2001 06:00:00 GMT");
header("Cache-Control: max-age=0, no-cache, must-revalidate, proxy-revalidate");
header("Last-Modified: {$now} GMT");
header("Content-Encoding: UTF-8");
header("Content-Type: text/csv; charset=UTF-8");
header("Content-Type: application/force-download");
header("Content-Type: application/octet-stream");
header("Content-Type: application/download");
header("Content-Disposition: attachment;filename={$filename}");
header("Content-Transfer-Encoding: binary");
Any ideas?
The issue is the byte-order mark you're prepending to the output in this line:
fwrite($df, "\xEF\xBB\xBF".$csv);
If you change this to simply
fwrite($df, $csv);
You should find the resulting file validates just fine (or at least, the validator doesn't complain about its encoding).
Arguably this is a problem with the validator, since as the Wikipedia article notes,
The Unicode Standard permits the BOM in UTF-8, but does not require or recommend its use.
I don't recommend you use it either, as most software seems not to recognize byte-order marks. But if you must or you simply prefer to, you can safely ignore the warning from CSVLint.
Since that is apparently not the issue, the next thing I'd look at is whether or not the data is being retrieved from the database in UTF-8. (I'll take your word you've already checked carefully to make sure the data is being stored in UTF-8.) If you're using MySQL, this will depend on the configuration of the database server and any options you may be sending the database after connection.
The PHP manual has a section on character sets and MySQL, and there is also this helpful article about using PHP and MySQL together with UTF-8 data. If you're using a different database system, it likely has equivalent configuration options that should be checked.
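One way to check what actually comes back from the database is to test each value before it is written to the CSV. This is only a sketch: the function name find_invalid_utf8_fields and the sample row are hypothetical, and it assumes the mbstring extension is available.

```php
<?php
// Hypothetical diagnostic: return the keys of all fields that are NOT valid UTF-8.
function find_invalid_utf8_fields(array $row): array
{
    $invalid = [];
    foreach ($row as $key => $value) {
        if (!mb_check_encoding((string) $value, 'UTF-8')) {
            $invalid[] = $key;
        }
    }
    return $invalid;
}

// Sample row: 'name' is UTF-8, 'city' is ISO-8859-1 ("\xF6" is "ö" in Latin-1).
$row = ['name' => "f\xC3\xA9lix", 'city' => "Malm\xF6"];
$bad = find_invalid_utf8_fields($row);
```

If nothing is flagged, the values are already valid UTF-8, and passing them through utf8_encode() would convert them a second time and mangle the accented characters.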
The only other suggestions I can make are that you
Move the call to setlocale higher in the script, before string concatenation begins in the foreach loop. (I don't think this setting affects simple concatenation, but I'm not certain.)
Remove the Content-Encoding header from your output, as it is invalid the way it is currently being used.
Try to use this code:
$filename = 'csv/' . date('Y-m-d_H:i:s') . '.csv';
$fp = fopen($filename, 'w');
// Write the UTF-8 BOM once at the start of the file, not before every row
fwrite($fp, chr(0xEF) . chr(0xBB) . chr(0xBF));
foreach ($csvData as $fields) {
    fputcsv($fp, $fields, ';');
}
fclose($fp);

How to export data in CSV into DOS\Windows Format in PHP?

I have an export in PHP like this:
header('Content-Encoding: UTF-8');
header('Content-type: text/csv; charset=UTF-8');
header('Content-Disposition: attachment; filename="agents_list.csv"');
When I export, Notepad++ tells me that the format is "Macintosh" (CR). I need it to be in "Dos\Windows" (CR+LF) format.
How can I do that? Must I modify a header? Thank you for your help.
I replaced \r (CR, a.k.a. Mac style) with \r\n, which is the DOS format. Thank you @Andrey for your suggestion, which put me on the way to finding the solution.
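The replacement described above might look like the following sketch. The function name to_crlf and the sample data are hypothetical; the pattern also handles LF and already-correct CRLF input so that nothing gets doubled.

```php
<?php
// Normalize any mix of CR, LF, or CRLF line endings to CRLF (DOS/Windows).
function to_crlf(string $text): string
{
    // "\r\n" is listed first so an existing CRLF is matched as one unit.
    return preg_replace('/\r\n|\r|\n/', "\r\n", $text);
}

// Sample data with bare CR ("Macintosh") line endings.
$csv = to_crlf("id;name\r1;Dupont\r");
```

The result can then be echoed after the headers shown in the question.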

Make PHP exported CSV with UTF-8 character work on mac excel using commas

I'm trying to export a CSV using PHP whose contents contain UTF-8 characters, and I want the resulting CSV to open smoothly in Excel (including Excel for Mac).
There is an answer here: How can I output a UTF-8 CSV in PHP that Excel will read properly?
Check out the top answer.
But in order to implement that, you need to use tabs to separate the fields instead of commas. Is there a way to achieve this while still using commas and not tabs, and still have it work on OS X?
EDIT
Mostly to Mark Baker but everyone feel free to comment
Another code update
while(@ob_end_clean());
header('Content-Encoding: UTF-8');
header('Content-type: text/csv; charset=UTF-8');
header("Cache-Control: no-store, no-cache");
header("Content-Disposition: attachment; filename=fileexport.csv");
echo "\xEF\xBB\xBF";
print "sep=,\n";
print $output;
exit;
fputcsv should work fine in this instance. Take the following example, where the third parameter of fputcsv is the delimiter. By default it is , (comma), but you could also use "\t" for tab-separated files. CSV files should be interpreted the same on either OS.
if( $fh = fopen("output_file.csv","w") ){
$put = array("column1, with comma","column2, with comma","column3" /*,"columnN"*/);
fputcsv($fh,$put,",");
fclose($fh);
}

UTF8 encoding difficulty using PHP in Notepad++

I have been making a program that creates multiple CSV's from another source CSV (encoded in 'SJIS'/SHIFT-JIS). Here's the process in which I am creating them:
Create a string, which will hold the contents of the output CSV's
Fill in said strings with their proper information
Encode the string to UTF-8 from SJIS using mb_convert_encoding()
code:
$contents2 = mb_convert_encoding($contents, "UTF-8", "SJIS");
Create a zip archive using PHP's provided library methods and append the files I desire with their corresponding strings using addFromString()
code:
$zipFileName = "output.zip";
$zip = new ZipArchive;
if ($zip->open($zipFileName, ZipArchive::CREATE) === TRUE){
$zip->addFromString('customer.csv', $contents2);
...do for the other files
$zip->close();
}
else{
echo 'Failed! File not created!';
}
Prompt the user with a dialogue box to save the file in their desired location.
code:
$zipContents = file_get_contents($zipFileName);
header('Content-Type: application/zip');
header("Content-Disposition: attachment; filename=inflow.zip");
header("Pragma: no-cache");
header("Expires: 0");
echo $zipContents;
Now here is my problem: the files that I have created from the zip file are encoded in "UTF-8 without BOM" when I open them in Notepad++. However, I need these files to be in plain "UTF-8". An inventory program I am using to upload these files will, for reasons beyond me, not show the proper characters for the CSVs encoded in "UTF-8 without BOM". Only once I manually open the files, re-encode them as "UTF-8", and save them are the files able to display the correct characters in this inventory program.
I have read a good deal of articles talking about the converse of this problem, where people were seeking to make their UTF-8 files become encoded without BOM. However, my situation is the exact opposite of this. If there's an easy solution in PHP I would more than welcome the help! Thanks for reading!!
Found a possible solution, see this: How can I output a UTF-8 CSV in PHP that Excel will read properly?
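Since Notepad++ labels files that start with a byte-order mark as plain "UTF-8", one possible fix is to prepend the BOM to each converted string before adding it to the archive. This is a sketch under that assumption; the helper name with_utf8_bom and the sample bytes are hypothetical, and it assumes the mbstring extension is available.

```php
<?php
// Hypothetical helper: prepend the UTF-8 BOM unless the string already starts with it.
function with_utf8_bom(string $utf8): string
{
    $bom = "\xEF\xBB\xBF";
    return strncmp($utf8, $bom, 3) === 0 ? $utf8 : $bom . $utf8;
}

// Sample data (not from the question): the Shift-JIS bytes for "日本".
$contents  = "\x93\xFA\x96\x7B";
$contents2 = with_utf8_bom(mb_convert_encoding($contents, 'UTF-8', 'SJIS'));

// In the question's flow, this string would then go into the archive:
// $zip->addFromString('customer.csv', $contents2);
```

Whether the inventory program then reads the files correctly would still need to be verified against that program.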

PHP generate Excel/CSV file and send as UTF-8

I'm retrieving data from my Postgres DB in UTF-8. The db and the client_connection settings are in UTF-8.
Then I send 2 headers to the visitor:
header("Content-Type: application/msexcel");
header("Content-Disposition: $mode; filename=export.xls");
and start outputting plain-text data in a CSV-like manner. This opens as a simple Excel file on the visitor's desktop.
$cols = array ("col1", "col2", "col3");
echo implode("\t", $cols)."\r\n";
This works fine until special characters like é, è, etc. are encountered.
I tried changing my client_encoding while retrieving the data from the db to latin-1, which works in most cases but not for all chars. So that is not a solution.
How could I send the outputted file as UTF-8? I don't think converting the data from the db to latin-1 is possible, since the char seems unknown in latin-1 ... so I need Excel to treat the file as UTF-8
I'd look into using the PHPExcel engine. It uses UTF-8 as default and it can generate a whole list of spreadsheet file types (Excel, OpenOffice, CSV, etc.).
I would recommend not sending plain-text and masquerading it as Excel. XLS files are typically binary, and while binary isn't required, the official Excel method of using non-binary data is to format it as XML.
You mention "CSV" in the title, but nothing about your problem includes anything related to CSV. I bring this up because I believe that you should actually change your tabs to commas, and then you could simply output a standard .csv file, which is read by Excel still but doesn't rely on undocumented or unstable functionality.
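A minimal sketch of the comma-separated approach, assuming the rows are already UTF-8 strings as the question states. The helper name rows_to_csv and the sample rows are hypothetical; fputcsv takes care of quoting fields that contain the delimiter.

```php
<?php
// Hypothetical helper: build comma-separated CSV text from rows of UTF-8 strings.
function rows_to_csv(array $rows): string
{
    $fh = fopen('php://temp', 'w+');
    foreach ($rows as $row) {
        // Delimiter, enclosure, and escape character are passed explicitly.
        fputcsv($fh, $row, ',', '"', '\\');
    }
    rewind($fh);
    $csv = stream_get_contents($fh);
    fclose($fh);
    return $csv;
}

$csv = rows_to_csv([
    ['col1', 'col2', 'col3'],
    ["caf\xC3\xA9", 'a,b', 'plain'],   // "caf\xC3\xA9" is "café" in UTF-8
]);
```

The string could then be sent with a text/csv content type and a .csv filename instead of application/msexcel.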
If you truly want to send application/msexcel, then you should use a real Excel library, because currently, you are not creating a real Excel file.
Use ; charset=UTF-8 after application/xxxxxx. I use:
header("Content-Type: application/vnd.ms-excel; charset=UTF-8");
// header("Content-Length: " . strlen($thecontent)); // this is not mandatory
header('Content-Disposition: attachment; filename="file.xls"');
Try the mb_convert_encoding function.
Try using iconv to convert the string into the required charset.
Have you tried calling utf8_encode() on the strings?
So something like: echo implode("\t", array_map('utf8_encode', $cols)) . "\r\n";
Not sure if that would work, but give it a go.
