Writing a CSV file for Mac users with PHP - php

I use a generic algorithm to write CSV files that in theory works for the major OSes. However, the client started to use Mac a few weeks ago, and they keep telling me the CSV file cannot be read in Microsoft Excel 2008 for Mac 12.2.1.
They have their OS configured to use the semicolon (;) as the list separator, which is exactly what I write in the CSV. They also say that when they open the file in notepad there are no line breaks and everything is displayed on a single line, which would explain why Excel cannot read the file properly. But in my code I am using the \r\n (CR+LF) line break, which I expected to work on every platform.
This is the full code I use:
header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
// Output to browser with appropriate mime type, you choose ;)
header("Content-type: text/x-csv");
//header("Content-type: text/csv");
//header("Content-type: application/csv");
header("Content-Disposition: attachment; filename=participantes.csv");
$separator = ";";
$rs = $sql->doquery("SELECT A QUERY TO RETRIEVE DATA FROM THE DB");
$header = "";
$num_fields = mysql_num_fields($rs);
for ($i = 0; $i < $num_fields; $i++) {
    $field = mysql_field_name($rs, $i);
    $header .= $field.$separator;
}
echo $header."\r\n";
while ($row = $sql->fetch($rs)) {
    $str = "";
    for ($i = 0; $i < $num_fields; $i++) {
        $field = mysql_field_name($rs, $i);
        $value = str_replace(";", ",", $row->{$field});
        $value = str_replace("\n", ",", $value);
        $value = str_replace("\d", ",", $value);
        $value = str_replace(chr(13), ",", $value);
        $str .= $value.$separator;
    }
    echo $str."\r\n";
}
Is there anything I can do so Mac users can read the file properly?

For debugging purposes:
Create a CSV file and send it to them by mail. Can they open it OK?
Have them download the file from your page and send it back to you. Compare the files in a hex editor to rule out the off-chance that what they receive differs from what you send to the browser or from what you have saved.
Have them double-check their Excel settings.
Have them create a working CSV file from scratch (in a text editor on a Mac) and spot any differences from your approach.
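If no hex editor is at hand, the line endings of the two files can also be compared in PHP. This is only a minimal sketch, assuming the file contents are already loaded into a string; count_line_endings() is a hypothetical helper, not a built-in function.

```php
<?php
// Hypothetical helper: counts each kind of line terminator in a string.
function count_line_endings(string $data): array
{
    $crlf = substr_count($data, "\r\n");
    return [
        'CRLF' => $crlf,
        // Lone CR and lone LF: subtract the ones that belong to a CRLF pair.
        'CR'   => substr_count($data, "\r") - $crlf,
        'LF'   => substr_count($data, "\n") - $crlf,
    ];
}

// Sample input containing one terminator of each kind.
print_r(count_line_endings("a;b\r\nc;d\ne;f\rg;h"));
```

Running it on both the file you sent and the file they received shows whether the \r\n terminators survived the download.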

Here's some code I wrote to convert tab-delimited data into CSV, and the result opens fine on my Mac. Note that I have it set up so that I click a link to download the file, rather than pushing it to the browser. It's not a great solution (I'm pretty sure the code is crap), but it works for what I need it for.
$input = $_POST['input'];
// Replace existing commas so they don't clash with the delimiter.
$replacedinput1 = str_replace(",", "-", $input);
// Replace tabs with commas.
$replacedinput2 = str_replace("\t", ",", $replacedinput1);
// Split the text into an array of lines.
$explodedinput = explode("\n", $replacedinput2);
// Open the CSV file for writing; "w+" truncates any existing content.
$opencsvfile = fopen("/var/www/main/tools/replace_tab.csv", "w+");
// Write each line to the file as one CSV row.
foreach ($explodedinput as $line) {
    // explode() is used here because split() was removed in PHP 7.
    fputcsv($opencsvfile, explode(',', $line));
}
// Close the file.
fclose($opencsvfile);
//have the user download the file.
/*
header('Pragma: public');
header('Expires: Fri, 01 Jan 2010 00:00:00 GMT');
header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
header('Content-Type: application/csv');
header('Content-Disposition: filename=replace_tab.csv'); */
//Or not, since I can't get it to work.
echo "<a href='replace_tab.csv'>Download CSV File, then open in Numbers or Excel (Note, you may need to right click, save as.)</a>.";

The newline character on Mac OS X is LF (Unicode U+000A), classic Mac OS used CR (U+000D), and Windows uses CR + LF (U+000D followed by U+000A).
Maybe that's why the CSV is unreadable on a Mac.
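To test whether the line terminator really is the culprit, it helps to make it a parameter, so that CR, LF and CR+LF variants of the same file can each be sent to the client. The following is a sketch only: csv_line() and its parameters are made up for illustration, and which terminator Excel 2008 for Mac accepts is something to confirm with the client, not something this code asserts.

```php
<?php
// Hypothetical helper: joins fields into one CSV line with a chosen
// separator and a chosen line terminator.
function csv_line(array $fields, string $separator = ';', string $eol = "\r\n"): string
{
    $cells = array_map(function ($value) use ($separator) {
        $value = (string) $value;
        // Quote a cell that contains the separator, a quote, or a line break.
        if (strpbrk($value, $separator . "\"\r\n") !== false) {
            $value = '"' . str_replace('"', '""', $value) . '"';
        }
        return $value;
    }, $fields);

    return implode($separator, $cells) . $eol;
}

// The same row with three different terminators.
echo bin2hex(csv_line(['id', 'name'], ';', "\r")), "\n";
echo bin2hex(csv_line(['id', 'name'], ';', "\n")), "\n";
echo bin2hex(csv_line(['id', 'name'], ';', "\r\n")), "\n";
```

Quoting cells instead of replacing semicolons and line breaks with commas also keeps the original field values intact.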

Related

Convert html entities to unicode characters via php mysql csv export

In my MySQL database this is a sample of an HTML entity that I have:
&Uacute;
When I export it through my script this is what I get:
Ãš
As you can see in my script I already have 'html_entity_decode' which should convert it appropriately to this (which is what I want):
Ú
Obviously, I am doing something wrong. I have exhausted other various scripts, solutions and otherwise have been trying to resolve this issue for over a day. Here is my PHP code:
$link = mysqli_connect("localhost", "user", "pass", "db");
$sql="SELECT * FROM wtf";
$result=mysqli_query($link,$sql);
if (!$result) die('Couldn\'t fetch records');
$fp = fopen('php://output', 'w');
if ($fp && $result) {
header("Content-type: application/vnd.ms-excel");
header("Content-Encoding: UTF-8");
header('Content-Disposition: attachment; filename="results.csv"');
header('Pragma: no-cache');
header('Expires: 0');
fputcsv($fp, array('Nome'));
while ($row = $result->fetch_array(MYSQLI_NUM)) {
fputcsv($fp, array_map('html_entity_decode',array_values($row)), ',', '"');
}
die;
}
mysqli_close($link);
exit;
Could someone please help or at least point me in the right direction? Having taken on a project that requires European characters in the CSV results, it has been nothing less than a nightmare...
It sounds like you're probably using a newer version of PHP, which will default to "UTF-8" when html_entity_decode() is called. Maybe try something like this:
Instead of this:
fputcsv($fp, array_map('html_entity_decode',array_values($row)), ',', '"');
Try this:
$decode = function ($v) { return html_entity_decode($v, ENT_COMPAT, 'ISO-8859-1'); };
fputcsv($fp, array_map($decode, array_values($row)), ',', '"');
The problem is caused by Excel misinterpreting the character encoding of your output.
An output like Ãš indicates that a multi-byte character is being interpreted as two separate single-byte characters. If, instead of writing the string as CSV, you echo that same string, it is rendered correctly, so the problem is not in the string as stored by PHP.
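The byte-level difference can be made visible with bin2hex(). The snippet below is a small illustration rather than part of the export script: it decodes the same entity into two target encodings and prints the resulting bytes.

```php
<?php
// Decode the same entity into two different target encodings.
$utf8   = html_entity_decode('&Uacute;', ENT_QUOTES, 'UTF-8');
$latin1 = html_entity_decode('&Uacute;', ENT_QUOTES, 'ISO-8859-1');

echo bin2hex($utf8), "\n";   // c39a: two bytes in UTF-8
echo bin2hex($latin1), "\n"; // da: one byte in ISO-8859-1
```

A program that assumes a single-byte encoding will show the two UTF-8 bytes as two separate characters, which is the symptom described above.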
The header Content-Encoding: UTF-8 does not find its way to Excel, so in order to make Excel aware of the UTF-8 encoding, output a Byte Order Mark at the start of the output:
$fp = fopen('php://output', 'w');
if ($fp && $result) {
header("Content-type: application/vnd.ms-excel");
header("Content-Encoding: UTF-8");
header('Content-Disposition: attachment; filename="results.csv"');
header('Pragma: no-cache');
header('Expires: 0');
fwrite($fp, "\xEF\xBB\xBF"); // <--- add this
Secondly, things tend to also work better when you use a TAB character as separator instead of a comma, as in Europe some regional settings define the semi-colon as the separator (the comma being taken as decimal separator), and this will make all columns collapse into one. So write:
fputcsv($fp, array('Nome'), "\t", '"');
while ($row = $result->fetch_array(MYSQLI_NUM)) {
fputcsv($fp, array_map('html_entity_decode',array_values($row)), "\t", '"');
}

PHP: Export to CSV with special characters

I am trying to export some data that is stored in a table, but when I export it to CSV the letter č shows up as Ä or &#269.
I tried everything (utf8_decode, utf8_encode, html_entity_decode), but nothing works. What can I do?
Thanks,
Leandro.
Additional information: I am now testing the following directly:
$delimiter = ";";
$enclosure = '"';
header("Content-Disposition: attachment; filename=memorandos.csv");
header("Pragma: no-cache");
header("Expires: 0");
$output = fopen('php://output', 'w');
$header = array('Apellido');
fputcsv($output, $header, $delimiter, $enclosure);
$memorando = Memorando::getById(3263);
if ($memorando){
$dd = array ();
$dd[] = $memorando->apellido; // In the database it is stored as Jurič
fputcsv($output, $dd, $delimiter, $enclosure);
}
In the file I see Juri&#269 ; instead of Jurič.
There are many ways to approach this issue; you could even try ISO-8859-1.
Let's say you have
$input = "Fóø Bår Zacarías ?S?B?D Ferreíra"; // original text
Use iconv to transliterate the special characters to ASCII
$output = iconv("utf-8", "ascii//TRANSLIT//IGNORE", $input);
Then use a regular expression to remove everything that is left except letters, digits, whitespace and hyphens
$output = preg_replace("/^'|[^A-Za-z0-9\s-]|'$/", '', $output);
Results in: Foo Bar Zacarias ASABAD Ferreira
echo $output;
And where is your code? can you share?
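If the goal is to keep characters such as č instead of stripping them, a different technique from the iconv approach above is to decode the entities into UTF-8 and prefix the output with a UTF-8 BOM. The sketch below assumes the stored values contain well-formed numeric entities such as &#269; (with the trailing semicolon); csv_string_with_bom() is a hypothetical helper, and it writes to php://memory only so that the result can be inspected.

```php
<?php
// Hypothetical helper: builds a CSV string that starts with a UTF-8 BOM
// and has HTML entities decoded into UTF-8 characters.
function csv_string_with_bom(array $rows, string $delimiter = ';'): string
{
    $fh = fopen('php://memory', 'w+');
    fwrite($fh, "\xEF\xBB\xBF"); // UTF-8 byte order mark

    foreach ($rows as $row) {
        $decoded = array_map(function ($value) {
            return html_entity_decode((string) $value, ENT_QUOTES, 'UTF-8');
        }, $row);
        fputcsv($fh, $decoded, $delimiter, '"', '\\');
    }

    rewind($fh);
    $csv = stream_get_contents($fh);
    fclose($fh);

    return $csv;
}

// Assumed sample data: a header row and one value with a numeric entity.
$csv = csv_string_with_bom([['Apellido'], ['Juri&#269;']]);
echo bin2hex($csv), "\n";
```

In a real export the same writes would go to php://output after the download headers have been sent.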

Exporting array to CSV in PHP with special characters

I'm exporting a PHP array to CSV.
The problem is that all special characters are garbled (e.g. á, é, ñ), and the start of the file displays ï»¿field1 instead of field1.
The issue seems to be caused by Content-disposition: attachment; if I comment that line out, the file is created without any issues (sadly, it is then downloaded with a .php extension, as FILENAME.php).
# CSV headers
header('Cache-Control: public');
header('Content-Type: application/octet-stream');
header('Content-type: application/csv; charset=utf-8');
header('Content-disposition: attachment; filename='.date('Y-m-d H\hi').'.csv');
# Columns
$o = 'field1,field2,field3,field4';
$o .= "\n";
# Data
$rows = array();
foreach($data as $item) {
$fields = array();
foreach($item as $field) {
$fields[] = $field;
}
$rows[] = implode(', ', $fields);
}
$o .= implode("\n", $rows);
echo $o;
Any ideas? Thanks!
As commented by Dagon, ï»¿ is the BOM and may be causing problems when the file is read (especially if that is done in CMD on Windows). Remove the BOM from your script file.
As for the special characters, you may need to convert them, especially if your source isn't UTF-8.
I had a similar problem once, and the solution for me was to make sure the input was being read correctly and to convert it before outputting.
For converting the characters, I did something like this:
mb_convert_case($result['products_name'], MB_CASE_UPPER, "UTF-8");
To make sure I was working with UTF-8, I issued
$connection->set_charset("utf8");
when connecting to my database.
Take care.
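Whether a BOM has crept into a file can be checked on the bytes themselves. Here is a sketch that strips a leading UTF-8 BOM from a string, such as the contents of a script or data file; strip_utf8_bom() is a hypothetical helper, not a built-in function.

```php
<?php
// Hypothetical helper: removes a leading UTF-8 BOM (EF BB BF) if present.
function strip_utf8_bom(string $text): string
{
    if (strncmp($text, "\xEF\xBB\xBF", 3) === 0) {
        return substr($text, 3);
    }
    return $text;
}

// Assumed sample: a header line that starts with a BOM.
$clean = strip_utf8_bom("\xEF\xBB\xBFfield1,field2");
echo $clean, "\n";
```

For the script file itself, the usual fix is to re-save it in the editor as "UTF-8 without BOM".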

Comma Separated File, appending encrypted character instead of new line

I am creating a comma-separated file, but a strange ("encrypted") character is appended at the end of each line instead of the next row starting on a new line.
$inner_exported_header_array[] = 'ID';
$inner_exported_header_array[] = 'First Name';
$exported_customers_arr [] = $inner_exported_header_array;
foreach ($_list AS $row) {
$inner_exported_array = array();
$inner_exported_array[] = $row['id'];
$inner_exported_array[] = $row['v_first_name'];
$exported_customers_arr[] = $inner_exported_array;
}
header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
header('Content-Description: File Transfer');
header("Content-Type: text/plain");
header("Content-Disposition: attachment; filename=test.txt");
header("Pragma: no-cache");
header("Expires: 0");
outputCSV($exported_customers_arr);
function outputCSV($data) {
$outstream = fopen("php://output", "w");
function __outputCSV(&$vals, $key, $filehandler) {
fputcsv($filehandler, $vals); // add parameters if you want
}
array_walk($data, "__outputCSV", $outstream);
fclose($outstream);
}
Output is showing:
"ID","First Name"[special character not copied here]1,Testname
But it should be:
"ID","First Name"
1,Testname
Any thoughts on how to get each row on its own line, rather than a special character being inserted at the end of the line and the next row starting where the first one ends?
Summary from the comments:
The easiest solution may be to simply read the file in and replace every "\n" with "\r\n". The consumer of the file expects different line endings.
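A plain replacement of "\n" with "\r\n" would double the CR in lines that already end in "\r\n". One way around that, sketched here with a hypothetical normalize_line_endings() helper, is to match all three terminator styles in a single pass:

```php
<?php
// Hypothetical helper: converts CRLF, lone CR and lone LF to one terminator.
function normalize_line_endings(string $text, string $eol = "\r\n"): string
{
    // "\r\n" is listed first so a CRLF pair is matched as one unit.
    return (string) preg_replace('/\r\n|\r|\n/', $eol, $text);
}

$fixed = normalize_line_endings("\"ID\",\"First Name\"\n1,Testname\n");
echo bin2hex($fixed), "\n";
```

Note that this also rewrites line breaks inside quoted fields, which may or may not be wanted.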
What are these characters? Seems like they're coming from your database, somehow? These "special" characters seem to contain newline (i.e. \n), which messes up your output.
I suggest checking the value of the last character of your problematic field using PHP's ord() function:
http://php.net/manual/en/function.ord.php
You could clean your CSV output using PHP's trim() function:
http://php.net/manual/en/function.trim.php
trim() also clears newline, which should solve your issue, if applied to the output of each field, i.e change this:
$inner_exported_array[] = $row['v_first_name'];
to:
$inner_exported_array[] = trim($row['v_first_name']);
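The two checks can be tried on a sample value first. In this sketch the value "Testname\r" is an assumption standing in for whatever the database actually returns:

```php
<?php
// Assumed sample value: a name with a stray carriage return at the end.
$value = "Testname\r";

// ord() returns the byte value of a single character; 13 is CR, 10 is LF.
$lastByte = ord(substr($value, -1));
echo $lastByte, "\n"; // 13

// trim() removes whitespace, including \r and \n, from both ends.
$clean = trim($value);
echo bin2hex($clean), "\n";
```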

Export CSV for Excel

I'm writing a CSV file in PHP using fputcsv($file, $data).
It all works, however I can't just open it in Excel but have to import it and specify the encoding and which delimiter to use (in a wizard).
I've seen exports from other websites that open correctly just by clicking on them and now would like to know what I should do to my file to achieve that.
I tried using this library: http://code.google.com/p/parsecsv-for-php/
But I couldn't even get it to run and am not really confident if it would really help me...
This is how I make Excel-readable CSV files from PHP:
Add a BOM to fix UTF-8 in Excel
Set the semicolon (;) as delimiter
Set the correct header ("Content-Type: text/csv; charset=utf-8")
For example:
$headers = array('Lastname :', 'Firstname :');
$rows = array(
array('Doe', 'John'),
array('Schlüter', 'Rudy'),
array('Alvarez', 'Niño')
);
// Create file and make it writable
$file = fopen('file.csv', 'w');
// Add BOM to fix UTF-8 in Excel
fputs($file, $bom = (chr(0xEF) . chr(0xBB) . chr(0xBF)));
// Headers
// Set ";" as delimiter
fputcsv($file, $headers, ";");
// Rows
// Set ";" as delimiter
foreach ($rows as $row) {
fputcsv($file, $row, ";");
}
// Close file
fclose($file);
// Send file to browser for download
$dest_file = 'file.csv';
$file_size = filesize($dest_file);
header("Content-Type: text/csv; charset=utf-8");
header("Content-disposition: attachment; filename=\"file.csv\"");
header("Content-Length: " . $file_size);
readfile($dest_file);
Works with Excel 2013.
This is really a mess. You can use sep=; or sep=, or sep=\t (or whatever) to make Excel aware of the separator used in your CSV. Just put this string on the first line of your CSV contents, e.g.:
fwrite($handle, "sep=,\n");
fputcsv($handle,$yourcsvcontent);
This works smoothly, BUT it doesn't work in combination with a BOM, which is required to make Excel aware of UTF-8 in case you need to support special or multi-byte characters.
In the end, to make it bullet-proof, you need to read the user's locale and set the separator accordingly, as mentioned above.
Put a BOM ("\xEF\xBB\xBF") at the beginning of your CSV content, then write the CSV with, e.g.: fputcsv($handle, $fields, $user_locale_separator);
where $user_locale_separator is the separator you retrieved by checking the user's locale.
Not comfortable, but it works...
Despite the "C" for "comma" in CSV, Excel uses your locale's native list separator. So, since fputcsv uses a comma by default, the file won't open correctly if your locale's separator is, for example, a semicolon.
What Google AdSense does, when you click "Export to Excel CSV", is that it uses Tab as a separator. And that works.
To replicate that, set the third parameter (delimiter) of fputcsv to override the default comma. E.g. for Tab use: fputcsv($handle, $fields, "\t");
Compare the format of the CSV that works for you against the one generated by fputcsv.
Consider including example of both in your question. You might get better answers.
You may have an encoding issue.
Try this post:
http://onwebdev.blogspot.com.es/2010/10/php-encoding-of-csv-file-for-excel.html
I notice that you need to consider:
Content-Type header
BOM (Byte Order Mark)
Actual character encoding in the file
With BOM (works):
$bom = pack("CCC", 0xEF, 0xBB, 0xBF);
header('Content-Type: text/csv');
header('Content-Length: '.(strlen($csv)+strlen($bom)));
header('Content-Disposition: attachment;filename=my.csv');
echo $bom;
echo $csv;
Without BOM (works but you need to replace “smart quotes” then run utf8_decode on each value or cell, and it converts some characters, for example FRĒ is converted to FRE')
header('Content-Type: application/csv;charset=utf-8');
header('Content-Length: '.strlen($csv));
header('Content-Disposition: attachment;filename=my.csv');
echo $csv;
If the wrong combination of charset and BOM is used, the content just comes out wrong when the file is opened in MS Excel.
Bonus fact: mb_strlen tells you the number of characters, strlen tells you the number of bytes. You do NOT want to use mb_strlen for calculating the Content-Length header.
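The difference is easy to see on a short UTF-8 string. The sample value here is made up, and the mb_strlen() call is guarded because the mbstring extension may not be loaded:

```php
<?php
$bom = "\xEF\xBB\xBF";
$csv = "Ni\xC3\xB1o\n"; // "Niño" plus a newline, UTF-8 encoded

// strlen() counts bytes, which is what Content-Length needs: 3 + 6 = 9.
$contentLength = strlen($bom) + strlen($csv);
echo $contentLength, "\n"; // 9

// mb_strlen() counts characters instead (requires the mbstring extension).
if (function_exists('mb_strlen')) {
    echo mb_strlen($csv, 'UTF-8'), "\n"; // 5
}
```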
Bonus 2: replace microsoft "smart" characters (em dash, curly quotes, etc):
$map = array(
    chr(145) => "'",
    chr(146) => "'",
    chr(147) => '"',
    chr(148) => '"',
    chr(149) => '-',
    chr(150) => '-',
    chr(151) => '-',
    chr(152) => '-',
    chr(171) => '-',
    chr(187) => '-',
);
// faster than strtr
return str_replace(array_keys($map), $map, $str);
