PHPExcel how to solve the encoding issue when reading a file - php

I am working on a Yii2 API where I need to upload a .csv or .xlsx file, read it using PHPExcel (now deprecated, but I am stuck with it because its successor, PhpSpreadsheet, requires PHP 5.6 or newer), and return the data as an array.
This is the code used in the API function:
public function actionUpload()
{
$params = $_FILES['uploadFile'];
if($params)
{
$data = array();
$model = new UploadForm();
$model->uploadFile = $_FILES['uploadFile'];
$file = UploadedFile::getInstanceByname('uploadFile');
$inputFileName = $model->getpath($file,$data);
// Read your Excel workbook
try
{
$inputFileType = \PHPExcel_IOFactory::identify($inputFileName['link']);
$objReader = \PHPExcel_IOFactory::createReader($inputFileType);
if($inputFileType == 'CSV')
{
if (mb_check_encoding(file_get_contents($inputFileName['link']), 'UTF-8'))
{
$objReader->setInputEncoding('UTF-8');
}
else
{
$objReader->setInputEncoding('Windows-1255');
//$objReader->setInputEncoding('ISO-8859-8');
}
}
$objPHPExcel = $objReader->load($inputFileName['link']);
}
catch(Exception $e)
{
die('Error loading file "'.pathinfo($inputFileName['link'],PATHINFO_BASENAME).'": '.$e->getMessage());
}
// Get worksheet dimensions
$sheet = $objPHPExcel->getSheet(0);
$highestRow = $sheet->getHighestRow();
$highestColumn = $sheet->getHighestColumn();
$fileData = array();
// Loop through each row of the worksheet in turn
for ($row = 1; $row <= $highestRow; $row++)
{
// Read a row of data into an array
$rowData = $sheet->rangeToArray('A' . $row . ':' . $highestColumn . $row,
NULL,
TRUE,
FALSE);
array_push($fileData,$rowData[0]);
// Insert row data array into your database of choice here
}
return $fileData;
}
}
But there are encoding issues when we upload a file containing Hebrew data. The following part of the code above was meant to address this:
if (mb_check_encoding(file_get_contents($inputFileName['link']), 'UTF-8'))
{
$objReader->setInputEncoding('UTF-8');
}
else
{
$objReader->setInputEncoding('Windows-1255');
}
Later I found that UTF-8 and Windows-1255 are not the only possible encodings for the files that may be uploaded; others, such as UTF-16, can occur depending on the user's operating system. Is there a better way to find the encoding than using mb_check_encoding?
The common error that occurs while reading the data in the file is:
iconv(): Detected an illegal character in input string
This error occurs because the appropriate encoding of the file was not detected. Is there a workaround?

You can attempt to use mb_detect_encoding to detect the file encoding but I find that results vary. You might have to manually specify a custom match order of encodings to get proper results. Here is an example substitute for the if statement in question:
if ($inputFileType == 'CSV')
{
    // Try to detect the file encoding
    $encoding = mb_detect_encoding(file_get_contents($inputFileName['link']),
        // example of a manual detection order
        'ASCII,UTF-8,ISO-8859-15');
    // mb_detect_encoding returns false when no listed encoding matches
    if ($encoding !== false)
    {
        $objReader->setInputEncoding($encoding);
    }
}
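Since results from mb_detect_encoding vary, one alternative is to look for a byte-order mark first and only then fall back to a validity check. The following is a minimal sketch of that idea, not a definitive detector: guessCsvEncoding is a hypothetical helper name, the Windows-1255 fallback is an assumption taken from the question, and files without a BOM in other legacy encodings will still be misidentified. It requires the mbstring extension.

```php
<?php
// Hypothetical helper: guess the encoding of raw CSV bytes.
// $fallback is an assumption: the legacy encoding expected when nothing else fits.
function guessCsvEncoding($bytes, $fallback = 'Windows-1255')
{
    // A byte-order mark, when present, is the most reliable signal.
    if (strncmp($bytes, "\xEF\xBB\xBF", 3) === 0) {
        return 'UTF-8';
    }
    if (strncmp($bytes, "\xFF\xFE", 2) === 0) {
        return 'UTF-16LE';
    }
    if (strncmp($bytes, "\xFE\xFF", 2) === 0) {
        return 'UTF-16BE';
    }
    // No BOM: bytes that form valid UTF-8 (plain ASCII included) are treated as UTF-8.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return 'UTF-8';
    }
    // Otherwise assume the configured legacy encoding.
    return $fallback;
}
```

The returned name could then be passed to $objReader->setInputEncoding() in place of the detected value above.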

Make sure to first clean the output buffer in your page:
ob_end_clean();
header( "Content-type: application/vnd.ms-excel" );
header('Content-Disposition: attachment; filename="uploadFile.xls"');
header("Pragma: no-cache");
header("Expires: 0");
ob_end_clean();

Related

How to convert a csv file uploaded to utf-8 encoding using iconv

I want to read the contents of an uploaded CSV file, convert it to UTF-8, and write it back to the file.
I found that I can use the following code to convert the encoding:
iconv('hebrew', 'utf-8', $str);
I used the code below to read the CSV file line by line and write it back after converting.
The main idea was to import the CSV by reading it line by line, but I had some issues with Hebrew depending on the encoding of the file.
So I used the following code to check whether the encoding is UTF-8 or UTF-16LE (from Windows) and handle the data accordingly. If the data does not match either of those encodings, it falls back to iconv('hebrew', 'utf-8', $str); but that is not working.
public function actionUpload()
{
$params = $_FILES['uploadFile'];
if($params)
{
$data = array();
$model = new UploadForm();
$model->uploadFile = $_FILES['uploadFile'];
$file = UploadedFile::getInstanceByname('uploadFile');
$inputFileName = $model->getpath($file,$data);
// Read your Excel workbook
try
{
$inputFileType = \PHPExcel_IOFactory::identify($inputFileName['link']);
$objReader = \PHPExcel_IOFactory::createReader($inputFileType);
if($inputFileType == 'CSV')
{
if (mb_check_encoding(file_get_contents($inputFileName['link']), 'UTF-8'))
{
$objReader->setInputEncoding('UTF-8');
}
else if (mb_check_encoding(file_get_contents($inputFileName['link']), 'UTF-16le'))
{
$objReader->setInputEncoding('UTF-16le');
}
else
{
$handle = fopen("test.csv", "r+");
if ($handle)
{
while (($line = fgets($handle)) !== false)
{
$newLine = iconv('hebrew', 'utf-8', $line);
fwrite($handle , $newLine);
}
fclose($handle);
}
else
{
// error opening the file.
}
$objReader->setInputEncoding('UTF-8');
}
}
$objPHPExcel = $objReader->load($inputFileName['link']);
}
catch(Exception $e)
{
die('Error loading file "'.pathinfo($inputFileName['link'],PATHINFO_BASENAME).'": '.$e->getMessage());
}
// Get worksheet dimensions
$sheet = $objPHPExcel->getSheet(0);
$highestRow = $sheet->getHighestRow();
$highestColumn = $sheet->getHighestColumn();
$fileData = array();
// Loop through each row of the worksheet in turn
for ($row = 1; $row <= $highestRow; $row++)
{
// Read a row of data into an array
$rowData = $sheet->rangeToArray('A' . $row . ':' . $highestColumn . $row,
NULL,
TRUE,
FALSE);
array_push($fileData,$rowData[0]);
// Insert row data array into your database of choice here
}
return $fileData;
}
}
There is a hack in the PHP documentation comments for converting Hebrew encoding to UTF-8:
http://php.net/manual/en/function.iconv.php#49434
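Separately from that hack, note that the loop in the question reads from and writes to the same file handle, so converted lines land wherever the read pointer happens to be. A minimal sketch of a safer shape is below: it converts the whole file in one pass and writes to a separate path. The name convertFileToUtf8 is hypothetical, and passing 'ISO-8859-8' as the source encoding is an assumption; which encoding names iconv accepts depends on the iconv implementation installed.

```php
<?php
// Hypothetical helper: convert a whole file to UTF-8 and write it to a separate path.
// Returns true on success, false if reading, converting, or writing fails.
function convertFileToUtf8($sourcePath, $targetPath, $fromEncoding)
{
    $bytes = file_get_contents($sourcePath);
    if ($bytes === false) {
        return false;
    }
    // iconv returns false when it meets a byte that is illegal in $fromEncoding;
    // the @ only silences the accompanying notice, the failure is still reported.
    $converted = @iconv($fromEncoding, 'UTF-8', $bytes);
    if ($converted === false) {
        return false;
    }
    return file_put_contents($targetPath, $converted) !== false;
}
```

The converted copy can then be loaded with the input encoding set to UTF-8, leaving the original upload untouched.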

PHPExcel reading .csv file with Hebrew content is Not Working Properly

I tried to read a .csv file using the PHPExcel library, but it gives me question marks instead of the Hebrew text.
The problem only occurs when I upload .csv files created on Windows. I tried the same with files from macOS and Linux and the data is fine. How can I solve the encoding issue with files from Windows?
public function actionUpload()
{
$params = $_FILES['uploadFile'];
if($params)
{
$data = array();
$model = new UploadForm();
$model->uploadFile = $_FILES['uploadFile'];
$file = UploadedFile::getInstanceByname('uploadFile');
$inputFileName = $model->getpath($file,$data);
// Read your Excel workbook
try
{
$inputFileType = \PHPExcel_IOFactory::identify($inputFileName['link']);
$objReader = \PHPExcel_IOFactory::createReader($inputFileType);
$objPHPExcel = $objReader->load($inputFileName['link']);
}
catch(Exception $e)
{
die('Error loading file "'.pathinfo($inputFileName['link'],PATHINFO_BASENAME).'": '.$e->getMessage());
}
// Get worksheet dimensions
$sheet = $objPHPExcel->getSheet(0);
$highestRow = $sheet->getHighestRow();
$highestColumn = $sheet->getHighestColumn();
$fileData = array();
// Loop through each row of the worksheet in turn
for ($row = 1; $row <= $highestRow; $row++)
{
// Read a row of data into an array
$rowData = $sheet->rangeToArray('A' . $row . ':' . $highestColumn . $row,
NULL,
TRUE,
FALSE);
array_push($fileData,$rowData[0]);
// Insert row data array into your database of choice here
}
return $fileData;
}
}
The output array has "???? ???" in places where Hebrew text was present when a .csv file created on Windows was uploaded.

PHPExcel Clone Worksheet - Efficient use of memory

I need to clone the first worksheet a few times, according to the number of rows, but something seems to be wrong.
The code is:
public function downloadFile()
{
date_default_timezone_set('America/Sao_Paulo');
if(file_exists("xpto.xlsx")){
$objPHPExcel = PHPExcel_IOFactory::load("xpto.xlsx");
$sheets = 3;//3 is enough to throw the error
for($i = 0; $i<$sheets; $i++){
$objClonedWorksheet = clone $objPHPExcel->getSheet(0);
$objClonedWorksheet->setTitle('Sheet ' . $i);
$objClonedWorksheet->setCellValue('A1', 'Test ' . $i);
$objPHPExcel->addSheet($objClonedWorksheet);
}
$objPHPExcel->setActiveSheetIndex(0);
$filename = 'file.xlsx';
header('Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet');
header('Content-Disposition: attachment;filename="'.$filename.'"');
header('Cache-Control: max-age=0');
$objWriter = PHPExcel_IOFactory::createWriter($objPHPExcel, 'Excel2007');
ob_end_clean();
$ret = $objWriter->save('php://output');
exit;
}
}
But I got a memory exhausted error. Then I tried the most commonly suggested solution (which is really a workaround), which is to add
ini_set('memory_limit', '-1');
I added this line just after the load call and it worked, but I don't think it is a good solution for a SaaS application. I don't even think most hosts (AWS, for example) will allow it.
I also tried cloning the sheet before the for loop, but addSheet doesn't create a new object, so when I change the sheet's title on the second iteration it renames the sheet added previously, throwing an "already existing sheet with the same name" error.
Following one of the links @rhazen listed, I changed the for loop to:
$objFromSheet = $objPHPExcel->getSheet(0);
$sheets = 3;
for($i = 1; $i<=$sheets; $i++){
$objToSheet = $objPHPExcel->createSheet($i);
foreach($objFromSheet->getRowIterator() as $row){
$cellIterator = $row->getCellIterator();
$cellFrom = $cellIterator->current();
$cellTo = $objToSheet->getCell($cellFrom->getCoordinate());
$cellTo->setXfIndex($cellFrom->getXfIndex());
$cellTo->setValue($cellFrom->getValue());
}
}
But that does not seem to work either. Am I misunderstanding the iterator or XfIndex?
The solution is in the edited question. Thanks to those who helped.

exporting database to xls with price formatted with dollar sign and commas

I'm exporting my MySQL database to Excel, and everything is working, but I want the price field to display with a dollar sign and commas in the Excel spreadsheet.
Here is my code:
$pubtable = $_GET["publication"];
$addate = $_GET["adDateHidden"];
$export = mysql_query("SELECT * FROM $pubtable WHERE addate = '$addate' ORDER BY price DESC") or die ("Sql error : " . mysql_error());
$fields = mysql_num_fields($export);
for($i = 0; $i < $fields; $i++){
$header .= mysql_field_name($export , $i). "\t";
}
while($row = mysql_fetch_row($export)){
$line = '';
foreach($row as $value) {
if(!isset($value) || trim($value) == "") {
$value = "\t";
} else {
$value = str_replace('"' , '""' , $value);
$value = '"' . $value . '"' . "\t";
}
$line .= $value;
}
$data .= trim($line). "\n";
}
$data = str_replace("\r" , "" , $data);
if(trim($data) == ""){
$data = "\n(0)Records Found!\n";
}
header("Content-type: application/vnd.ms-excel");
header("Content-Disposition: attachment; filename=".$pubtable."_".$addate.".xls");
header("Pragma: no-cache");
header("Expires: 0");
header ('Content-Transfer-Encoding: binary');
header ('Last-Modified: '.gmdate('D, d M Y H:i:s').' GMT');
header ('Cache-Control: cache, must-revalidate');
print "$header\n$data";
I tried doing this
$export = mysql_query("SELECT CONCAT('$', FORMAT(price, 2)) as fieldalias * FROM $pubtable WHERE addate = '$addate' ORDER BY fieldalias DESC") or die ("Sql error : " . mysql_error());
This formats it correctly, but it only outputs the price field and nothing else.
Technically, you're not producing an Excel spreadsheet. You're producing a CSV file with a .xls extension. CSV has no mechanism for adding formatting, because it's just plain text. You can have MySQL and/or PHP format a number into what looks like a nice currency value, but then you're destroying its existence as a number. It'll be a string-that-used-to-be-a-number.
You should use PHPExcel to produce an ACTUAL Excel file, into which you can add all the usual goodies that Excel supports, including colors and formulae.
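To illustrate the point about a formatted price no longer being a number, here is a small sketch using PHP's built-in number_format. The helper name formatUsd is hypothetical and only for illustration; it is not part of the original export code.

```php
<?php
// Hypothetical helper: render an amount as a US-style currency string.
function formatUsd($amount)
{
    // number_format adds thousands separators and fixes two decimal places,
    // but the return value is a string, not a number.
    return '$' . number_format($amount, 2);
}

$display = formatUsd(1234.5);
// $display is a string that is no longer numeric, so a spreadsheet
// receiving it as text cannot sum or sort it as a number.
$stillNumeric = is_numeric($display);
```

This is why a real spreadsheet file, where the cell keeps the number and carries a separate display format, is the better fit.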
Your code is generating a text file (CSV formatted) with an XLS extension. On Windows with Excel installed, it may be opened automatically by Excel even though the content is only plain text.
This method cannot produce any styled data.
I suggest that you use OpenTBS to easily build your Excel file. OpenTBS is a PHP class which can build real XLSX, DOCX, ODT, ODS and more, using templates. You design the template in Excel, merge it with the data in PHP, and you have a new XLSX file ready for download or saved on the server.

php + download to excel --> worksheet name

I am currently exporting data from php to excel using the code as below:
include("dbconnect.php");
$query = $_POST['query'];
$result = odbc_exec($conn,$query);
$count = odbc_num_fields($result);
//Define Variable For ODBC
$data = "";
//Field Name Data
for ($i = 1; $i <= $count; $i++)
{
$data .= odbc_field_name($result, $i)."\t";
}
$data .= "\n";
//Row Data
while(odbc_fetch_row($result))
{
for ($j = 1; $j <= $count; $j++)
{
$data .= odbc_result($result, $j)."\t";
}
$data .= "\n";
}
header("Content-type: application/octet-stream");
header("Content-Disposition: attachment; filename=ExcelFile.xls;");
header("Pragma: no-cache");
header("Expires: 0");
echo $data;
odbc_close($conn);
This all works fine, but the generated excel file has a sheetname of: ".xls]ExcelFile(1)" , and when you try to rename the sheet it causes an error in excel (unless you save the file first).
How can I define the sheetname in my php file?
Thanks in advance!
Have a nice weekend:-)
You're not actually creating an xls file, but a tab separated value file. Excel can read this, but it simply populates the data in the first worksheet. Because it's not a true xls file, you can't name the worksheet tabs in any way.
One option would be to change your code to use a library that writes true xls files, such as PHPExcel ( http://www.phpexcel.net )... you would then be able to define a name for the worksheet tab within your script.
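For reference, what the export script effectively emits is tab-separated text. A minimal, self-contained sketch of building such output is below; buildTsv is a hypothetical helper, and replacing embedded tabs and line breaks with spaces is an assumption about how stray separators in the data should be handled.

```php
<?php
// Hypothetical helper: build tab-separated text from a header row and data rows.
function buildTsv(array $header, array $rows)
{
    $lines = array();
    foreach (array_merge(array($header), $rows) as $fields) {
        $clean = array();
        foreach ($fields as $field) {
            // Tabs and line breaks inside a value would break the layout,
            // so they are replaced with spaces.
            $clean[] = str_replace(array("\t", "\r", "\n"), ' ', (string) $field);
        }
        $lines[] = implode("\t", $clean);
    }
    // One line per row, each terminated by a newline.
    return implode("\n", $lines) . "\n";
}
```

Output like this has no place to store a worksheet name, which is why Excel derives one from the file name.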
By changing the
header("Content-type: application/octet-stream");
To
header("Content-type: application/vnd-ms-excel");
With this change it downloads without any errors and I am able to save it. The worksheet name is the same as the Excel file name.
