Encoding accents csv - PHP - php

I've been looking through all the answers here and I haven't found the solution.
Here's what I got:
MySQL :
Database & Table encoding => utf8_unicode_ci
I'm trying to convert an array (containing rows from a query) to CSV
however when i open the csv I get this
Prénom
instead of
Prénom
here's my code
$allQueryRows = array();
while($row_query = $stmt_select->fetch(PDO::FETCH_ASSOC)){
$row_query = array_map("utf8_encode", $row_query);
array_push($allQueryRows, $row_query);
}
download_send_headers("csv" . date("Y-m-d") . ".csv");
echo array2csv($allQueryRows);
die();
function array2csv(array &$array)
{
if (count($array) == 0) {
return null;
}
ob_start();
$df = fopen("php://output", 'w');
fputcsv($df, array_keys(reset($array)));
foreach ($array as $row) {
fputcsv($df, $row);
}
fclose($df);
return ob_get_clean();
}
function download_send_headers($filename) {
// disable caching
$now = gmdate("D, d M Y H:i:s");
header("Expires: Tue, 03 Jul 2001 06:00:00 GMT");
header("Cache-Control: max-age=0, no-cache, must-revalidate, proxy-revalidate");
header("Last-Modified: {$now} GMT");
// force download
header("Content-Type: application/force-download");
header("Content-Type: application/octet-stream");
header("Content-Type: application/download");
// disposition / encoding on response body
header("Content-Disposition: attachment;filename={$filename}");
header("Content-Transfer-Encoding: binary");
}

You should send a SET NAMES utf8; MySQL query before your SELECT query instead of array_mapping your data after.
Then in HTTP headers, send
Content-Type: text/csv; charset=utf-8;
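A minimal sketch of that advice, assuming PDO as in the question. The charset option in the DSN stands in for a separate SET NAMES query, and utf8mb4 is MySQL's full UTF-8 character set. The function names, host, database name, and sample row are hypothetical.

```php
<?php
// Hypothetical connection helper; host, database and credentials are placeholders.
function connectUtf8(string $user, string $pass): PDO
{
    // The charset option makes MySQL deliver rows as UTF-8,
    // so no utf8_encode() pass over the data is needed afterwards.
    return new PDO('mysql:host=localhost;dbname=example;charset=utf8mb4', $user, $pass);
}

// Builds CSV text from rows that are already UTF-8 (no re-encoding step).
function rowsToCsv(array $rows): string
{
    if ($rows === []) {
        return '';
    }
    $out = fopen('php://temp', 'w+');
    fputcsv($out, array_keys(reset($rows)), ',', '"', '\\');
    foreach ($rows as $row) {
        fputcsv($out, $row, ',', '"', '\\');
    }
    rewind($out);
    $csv = stream_get_contents($out);
    fclose($out);
    return $csv;
}

// An in-memory row stands in for a fetched result set.
$csv = rowsToCsv([['Prénom' => 'Zoé', 'Nom' => 'Durand']]);

// In a web request, send this before echoing $csv:
// header('Content-Type: text/csv; charset=utf-8');
```

Calling utf8_encode() on data that is already UTF-8 encodes it a second time, which is what turns é into Ã©.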

At first blush it looks like a UTF-16 / UTF-8 issue. Here's how to start diagnosing it:
First, when you say, "however when i open the csv I get this" what do you mean by "open", i.e. open with what?
I would suggest looking at the file in hex to see exactly which bytes it contains. Either the program you open it with (your editor or whatever) is decoding correct bytes with the wrong encoding, or the bytes in the file are themselves wrong. You need to settle that question first.
(Tip: In general this is the old process of divide and conquer: find a way to test somewhere in the middle of your problem to see which half of your system is causing it. The quickest results come from picking test points about halfway through the complexity, not near an edge of the problem, i.e. a binary search for the bug. It might not find the problem in the first iteration, but it will help narrow it down.)
Also, perhaps you need to tell MySQL which character set to use on the connection, e.g. $connection->set_charset("utf8");
Or perhaps what you are seeing is being displayed differently from what is actually in the file because of a UTF-8/UTF-16 mix-up at the display level. I generally stay with UTF-8 and so set Content-Type: text/plain; charset=UTF-8. (Also, if you are viewing this file in your editor, make sure it's set to the correct character encoding.)
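The hex inspection suggested above can be sketched in PHP itself. The hexDump helper and the two sample strings are hypothetical; the point is that a correctly encoded é is the two bytes c3 a9, while a double-encoded one shows up as c3 83 c2 a9.

```php
<?php
// Hypothetical helper: renders a byte string as space-separated hex pairs.
function hexDump(string $bytes): string
{
    return implode(' ', str_split(bin2hex($bytes), 2));
}

$good   = "Pr\xC3\xA9nom";          // "Prénom" encoded once as UTF-8
$double = "Pr\xC3\x83\xC2\xA9nom";  // the same text after a second UTF-8 encoding pass

echo hexDump($good), "\n";
echo hexDump($double), "\n";
```

If the downloaded file contains the first pattern, the viewer is at fault; if it contains the second, the data was encoded twice before it was written.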

Related

Can't encode CSV file in UTF8 with PHP

I've been trying, for some time now, to export a properly encoded and formatted CSV file with PHP, but it's just not working. I've tried every tip in every CSV/PHP-related thread on SO, and I've checked that the data in my database is UTF-8 (it is). I've tried things like utf8_encode() on the whole CSV line, and I've checked that the PHP file itself is encoded in UTF-8, but still no success. When I run the file through http://csvlint.io/ I just get:
Your CSV appears to be encoded in ASCII-8BIT. We recommend you use UTF-8.
But I can't find a trace of any other encoding than UTF-8 anywhere in my code..
Basically this is my code:
First, I put all my CSV-rows in an array, then do this:
if (count($array) == 0)
{
return NULL;
}
ob_start();
$df = fopen("php://output", 'w');
$csv = utf8_encode("header1|header2|header3|header4|header5|header6|header7\r\n");
foreach($array as $line) {
$csv .= $line . "\r\n";
}
setlocale(LC_ALL, 'sv_SE', "swedish");
fwrite($df, "\xEF\xBB\xBF".$csv);
fclose($df);
return ob_get_clean();
And these are the headers sent:
$now = gmdate("D, d M Y H:i:s");
header("Expires: Tue, 03 Jul 2001 06:00:00 GMT");
header("Cache-Control: max-age=0, no-cache, must-revalidate, proxy-revalidate");
header("Last-Modified: {$now} GMT");
header("Content-Encoding: UTF-8");
header("Content-Type: text/csv; charset=UTF-8");
header("Content-Type: application/force-download");
header("Content-Type: application/octet-stream");
header("Content-Type: application/download");
header("Content-Disposition: attachment;filename={$filename}");
header("Content-Transfer-Encoding: binary");
Any ideas?
The issue is the byte-order mark you're prepending to the output in this line:
fwrite($df, "\xEF\xBB\xBF".$csv);
If you change this to simply
fwrite($df, $csv);
You should find the resulting file validates just fine (or at least, the validator doesn't complain about its encoding).
Arguably this is a problem with the validator, since as the Wikipedia article notes,
The Unicode Standard permits the BOM in UTF-8, but does not require or recommend its use.
I don't recommend you use it either, as most software seems not to recognize byte-order marks. But if you must or you simply prefer to, you can safely ignore the warning from CSVLint.
Since that is apparently not the issue, the next thing I'd look at is whether or not the data is being retrieved from the database in UTF-8. (I'll take your word you've already checked carefully to make sure the data is being stored in UTF-8.) If you're using MySQL, this will depend on the configuration of the database server and any options you may be sending the database after connection.
The PHP manual has a section on character sets and MySQL, and there is also this helpful article about using PHP and MySQL together with UTF-8 data. If you're using a different database system, it likely has equivalent configuration options that should be checked.
The only other suggestions I can make are that you
Move the call to setlocale higher in the script, before string concatenation begins in the foreach loop. (I don't think this setting affects simple concatenation, but I'm not certain.)
Remove the Content-Encoding header from your output, as it is invalid the way it is currently being used.
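On the headers: Content-Encoding is meant for codings such as gzip, not for character sets, and each repeated header("Content-Type: ...") call replaces the previous one by default, so only the last value is sent. A minimal sketch with a single Content-Type; the csvDownloadHeaders helper and the file name are hypothetical.

```php
<?php
// Hypothetical helper: returns the response headers for a UTF-8 CSV download.
function csvDownloadHeaders(string $filename): array
{
    return [
        // The character set belongs in Content-Type, not in Content-Encoding.
        'Content-Type: text/csv; charset=UTF-8',
        'Content-Disposition: attachment; filename="' . $filename . '"',
    ];
}

// Headers can only be sent before any output has gone out.
if (!headers_sent()) {
    foreach (csvDownloadHeaders('export.csv') as $line) {
        header($line);
    }
}
```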
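Try this code, which writes the UTF-8 BOM once at the start of the file and then each row with a semicolon delimiter:
$filename = 'csv/'.date('Y-m-d_H:i:s').'.csv';
$fp = fopen($filename, 'w');
// Write the byte-order mark once, before the first row, not once per row.
fwrite($fp, "\xEF\xBB\xBF");
foreach ($csvData as $fields) {
fputcsv($fp, $fields, ';');
}
fclose($fp);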

Make PHP exported CSV with UTF-8 character work on mac excel using commas

So I'm trying to export a csv using PHP in which the contents contains UTF-8 character and I want the resultant csv to open in Excel smoothly (including Mac excel)
So there is an answer here: How can I output a UTF-8 CSV in PHP that Excel will read properly?
Checkout the top answer.
But then in order to implement that you need to use tabs to separate the fields instead of commas...Is there a way to achieve this while still using commas and not tabs and still have it work in OS X
EDIT
Mostly to Mark Baker but everyone feel free to comment
Another code update
while(@ob_end_clean());
header('Content-Encoding: UTF-8');
header('Content-type: text/csv; charset=UTF-8');
header("Cache-Control: no-store, no-cache");
header("Content-Disposition: attachment; filename=fileexport.csv");
echo "\xEF\xBB\xBF";
print "sep=,\n";
print $output;
exit;
fputcsv should work fine in this instance. Take the following example, where the third parameter of fputcsv is the delimiter. By default it is , (comma), but you could also use "\t" for tab-separated files. CSV files should be interpreted the same on either OS.
if( $fh = fopen("output_file.csv","w") ){
$put = array("column1, with comma","column2, with comma","column3" /*,"columnN"*/);
fputcsv($fh,$put,",");
fclose($fh);
}

Apply text wrap to CSV cell containing long string

I am using PHP code to export data to a CSV file. Everything is working as required, but there is a problem when a cell contains long text: I want to wrap the text, or increase the cell size, so that the long text fits. Below is my code.
header("Content-Type: application/force-download\n");
header("Cache-Control: cache, must-revalidate");
header("Pragma: public");
header("Content-Disposition: attachment; filename=store_earning_report.csv");
echo "Notes \n";
echo $Notes;
echo "\n";
exit;
I have searched but didn't find any solution. Is there any way to handle this problem?
Thank you.
Make sure you are including a comma "," after each field, and "\r\n" to trigger a new line in the .csv file that is created.
A .csv is just a text file with commas used to separate the field values - So there is no way you can control the cell sizes that will appear when the file is first opened in Excel.
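To illustrate the point about CSV being plain text: a long note containing commas or line breaks can still be kept inside one cell by letting fputcsv quote it, but column width and wrapping remain up to the spreadsheet. A minimal sketch; the csvLine helper and the sample note are hypothetical.

```php
<?php
// Hypothetical helper: formats one row as a CSV line using fputcsv's quoting rules.
function csvLine(array $fields): string
{
    $handle = fopen('php://temp', 'w+');
    fputcsv($handle, $fields, ',', '"', '\\');
    rewind($handle);
    $line = stream_get_contents($handle);
    fclose($handle);
    return $line;
}

// The comma and the line break stay inside a single quoted field.
echo csvLine(['Notes', "First line, with a comma\nsecond line"]);
```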

PHP generated csv file is displaying £ for a UK pound sign (£) in Excel 2007

I'm generating the csv file with the following header commands:
header("Content-type: text/csv; charset=utf-8; encoding=utf-8");
header('Content-Disposition: attachment; filename="products.csv"');
If I open the file in Excel 2007, then I get £ wherever a £ sign should appear. However, if I open the file in Notepad++, then the pound signs appear fine; similarly, if I change the content-type to text/plain and get rid of the attachment header, the pound signs appear correctly in the browser.
One strange thing is that if I go to the "Format" menu in Notepad++, it appears that the file is encoded in "UTF-8 without BOM". If I change this to "Encode in UTF-8", then save the file, the pound signs appear correctly in Excel. Is there a way to make it so that the file is saved in this encoding by PHP?
Output 0xEF 0xBB 0xBF before emitting the CSV data. Don't forget to increase the content length header by 3 if you handle it.
header('Content-type: text/csv;');
header('Content-Length: ' . (strlen($content) + 3));
header('Content-disposition: attachment;filename=UK_order_' . date('Ymdhis') . '.csv');
echo "\xef\xbb\xbf";
echo $content;
exit;
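The same idea as a small sketch that keeps the BOM and the length in step. The withUtf8Bom helper and the sample content are hypothetical; strlen() counts bytes, which is what Content-Length expects.

```php
<?php
const UTF8_BOM = "\xEF\xBB\xBF";

// Hypothetical helper: prepends the UTF-8 BOM unless the content already starts with one.
function withUtf8Bom(string $content): string
{
    return strncmp($content, UTF8_BOM, 3) === 0 ? $content : UTF8_BOM . $content;
}

$content = "price\n£5\n";
$body    = withUtf8Bom($content);

// Measuring the final body avoids having to remember the extra 3 bytes.
$length = strlen($body);
```

Computing the length from the final body means the header stays correct even if the BOM is later dropped.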
Use utf8_decode() - WORKED FOR ME
header("Expires: Mon, 26 Nov 1962 00:00:00 GMT");
header("Last-Modified: " . gmdate('D, d M Y H:i:s') . ' GMT');
header("Cache-Control: no-cache, must-revalidate");
header("Pragma: no-cache");
header('Content-Type: text/csv;');
header("Content-Disposition: attachment; filename=".$savename);
echo utf8_decode($csv_string);
Posted as above by user769889 but I missed it with my frustrations with this v-annoying issue, all fixed now and working. Hope this helps someone...
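Note that utf8_decode() is deprecated as of PHP 8.2. It presumably works here because Excel falls back to a single-byte Western encoding when the file has no BOM, at the cost of losing any character outside ISO-8859-1. A sketch of an equivalent conversion, assuming the iconv extension is available; the helper name is hypothetical.

```php
<?php
// Hypothetical helper: converts UTF-8 text to ISO-8859-1, as utf8_decode() did.
function utf8ToLatin1(string $utf8): string
{
    $converted = iconv('UTF-8', 'ISO-8859-1', $utf8);
    if ($converted === false) {
        // iconv fails on invalid input or on characters Latin-1 cannot represent.
        throw new RuntimeException('Text could not be converted to ISO-8859-1');
    }
    return $converted;
}

// The pound sign is two bytes in UTF-8 (c2 a3) and one byte in ISO-8859-1 (a3).
$latin1 = utf8ToLatin1("\xC2\xA35");
echo bin2hex($latin1), "\n";
```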

PHP Streaming CSV always adds UTF-8 BOM

The following code gets a 'report line' as an array and uses fputcsv to transform it into CSV. Everything is working great except for the fact that regardless of the charset I use, it is putting a UTF-8 BOM at the beginning of the file. This is exceptionally annoying because A) I am specifying ISO-8859-1 and B) we have lots of users using tools that show the UTF-8 BOM as garbage characters.
I have even tried writing the results to a string, stripping the UTF-8 BOM and then echo'ing it out and still get it. Is it possible that the issue resides with Apache?
If I change the fopen to a local file it writes it just fine without the UTF-8 BOM.
header("Content-type: text/csv; charset=iso-8859-1");
header("Cache-Control: no-store, no-cache");
header("Content-Disposition: attachment; filename=\"report.csv\"");
$outstream = fopen("php://output",'w');
for($i = 0; $i < $report->rowCount; $i++) {
fputcsv($outstream, $report->getTaxMatrixLineValues($i), ',', '"');
}
fclose($outstream);
exit;
My guess would be that your php source code file has a BOM, and you have php's output buffering enabled.
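One way to test that guess is to check whether any loaded PHP file starts with the bytes EF BB BF. A minimal sketch; the startsWithUtf8Bom helper is hypothetical.

```php
<?php
// Hypothetical helper: reports whether a file starts with the UTF-8 BOM (EF BB BF).
function startsWithUtf8Bom(string $path): bool
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $head = fread($handle, 3);
    fclose($handle);
    return $head === "\xEF\xBB\xBF";
}

// Check every file PHP has loaded so far, including included files.
foreach (get_included_files() as $file) {
    if (is_file($file) && startsWithUtf8Bom($file)) {
        echo "BOM found in $file\n";
    }
}
```

Anything before the opening PHP tag, a BOM included, is sent as output, which would explain why the BOM appears on php://output but not when writing to a local file.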
I don't know if this solves your problem but have you tried using the print and implode functions to do the same thing?
header("Content-type: text/csv; charset=iso-8859-1");
header("Cache-Control: no-store, no-cache");
header("Content-Disposition: attachment; filename=\"report.csv\"");
for($i = 0; $i < $report->rowCount; $i++) {
print(implode(',',$report->getTaxMatrixLineValues($i)) . "\n");
}
That's not tested but pretty obvious.
