Parsing a badly written XLS file in PHP

I have to parse, in PHP, an XLS file that is written by some other code, and it seems to be poorly written.
I've tried parsing it with PHPExcel, using automatic type detection, like this:
$inputFileType = PHPExcel_IOFactory::identify($inputFileName);
echo 'filetype: '.$inputFileType.'<br>';
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
$objPHPExcel = $objReader->load($inputFileName);
Which returns:
filetype: CSV
The file opens, but it is not read correctly: the data is not recognized properly, content does not land in the proper cells, and some cells give errors. I've tried every other PHPExcel reader type, and all of them return an error.
I've opened it in a text editor (Notepad++) and the file is binary, not a simple CSV. The extension is XLS, but since the file is written by a script, the extension cannot be relied on to identify the format version.
If I open the file in Excel, it opens, and I can save it in another format (for example, as a new xlsx file); after that I can read it correctly.
Thinking it might be encoded in some very old format, I tried another library, SimpleExcel, and got this error:
File extension XLS doesn't match with xml
Is there a way to "correct" the format before parsing it?
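Before picking a reader, it can help to look at the first bytes of the file rather than its extension. The following is a minimal sketch, not a definitive detector: the function name sniff_spreadsheet_format() is hypothetical, and the "biff-legacy" branch assumes the file might be a pre-OLE2 workbook (BIFF2 to BIFF4), which is only one possible explanation for a binary file that is identified as CSV.

```php
<?php
// Hypothetical helper: guess the real container format from the leading bytes.
function sniff_spreadsheet_format(string $path): string
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $head = (string) fread($handle, 64);
    fclose($handle);

    // OLE2 compound file signature: the usual wrapper for BIFF5/BIFF8 .xls files.
    if (substr($head, 0, 8) === "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1") {
        return 'ole2';
    }
    // ZIP local file header: most likely an .xlsx package.
    if (strncmp($head, "PK\x03\x04", 4) === 0) {
        return 'zip';
    }
    // Assumption: a bare BOF record (ids 0x0009, 0x0209, 0x0409) marks BIFF2/3/4.
    if (strlen($head) >= 2 && $head[0] === "\x09"
        && in_array($head[1], ["\x00", "\x02", "\x04"], true)) {
        return 'biff-legacy';
    }
    // XML or HTML saved with an .xls extension, optionally preceded by a UTF-8 BOM.
    $text = ltrim($head, "\xEF\xBB\xBF \t\r\n");
    if (strncmp($text, '<', 1) === 0) {
        return 'markup';
    }
    return 'unknown';
}
```

If the result is "ole2" but the readers still fail, the problem is more likely inside the workbook stream than in the container, and re-saving through Excel, as described above, remains the fallback.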

Related

PhpSpreadsheet cannot read specific xls file

The following error
Fatal error: Uncaught PhpOffice\PhpSpreadsheet\Reader\Exception: Parameter pos=-12 is invalid
is given when trying to parse a specific xls file.
Code
$inputFileName = "excel.xls";
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader('Xls');
$spreadsheet = $reader->load($inputFileName);
The file in question: https://filebin.net/sle19tm0kdgduyne/excel.xls?t=u0itbeue
I have tried all available readers, such as Xlsx and Csv, and even the old deprecated PHPExcel library. Nothing can parse this specific file, even though it opens fine with Excel on Windows.
My end goal is to convert this xls file to an array, so I can insert the data into a database.
I think you don't need to use createReader().
Here is an example of my working code.
$reader = new \PhpOffice\PhpSpreadsheet\Reader\Xls();
$spreadsheet = $reader->load('path/to/file.xls');
Hope this can help you.

Change excel file extension from xls to xlsx while uploading with php

Is it possible to change an Excel file's extension while uploading, or before saving the file on the server, and if so, how? I am using PHP and MySQL.
Thank you
You can do something like this.
move_uploaded_file($_FILES['file']['tmp_name'], upload_PATH.'/'.$_FILES['file']['name'].'x');
But that only renames the file so that it has an .xlsx extension. It does not actually convert the file to xlsx format.
As previously mentioned in a different reply, changing the extension won't actually change the format, and it's not a good idea to serve a .xls file as .xlsx, since this will only confuse anyone trying to read it.
What you could do (disregarding potential problems with converting and verification of the file) is read the uploaded file into a library like PHPExcel (http://phpexcel.codeplex.com) and then use the builtin functions to export it as an .xlsx file. Sample below:
// Create a reader to read .xls format
$reader = PHPExcel_IOFactory::createReader('Excel5');
// Read the .xls file from upload storage
$workbook = $reader->load($_FILES['file']['tmp_name']);
// Create a writer to output in .xlsx format
$writer = PHPExcel_IOFactory::createWriter($workbook, 'Excel2007');
// Save file to destination .xlsx path
$writer->save($destination_path);
Keep in mind that although this might work perfectly well, the conversion might mess with the contents of the file. This might not be desirable, as the conversion can cause data loss, formatting changes and all sorts of weirdness.

xls file php download is in a different format than specified by the file extension

I'm struggling with my XML to XLS code in PHP.
I have PHP code that exports MySQL data to XML, and I have code that writes XML to XLS.
This all works fine, except that when I open the xls file, I get an error:
"The file you are trying to open is in a different format than specified by the file extension. Verify that the file is not corrupted and is from a trusted source before opening the file. Do you want to open the file now?"
my code:
<?php
$file = 'filename';
$url = "$file.xml"; // from xml file
header( "Content-Type: application/vnd.ms-excel" );
header( "Content-disposition: attachment; filename=$file.xls" ); // to xls file
if (file_exists($url)) {
    $xml = simplexml_load_file($url);
    echo 'Voornaam'."\t" . 'Achternaam'."\t" . 'Geslacht'."\t" . 'Instrument'."\t\n";
    foreach($xml->lid as $lid)
    {
        echo $lid->voornaam."\t" . $lid->achternaam."\t" . $lid->geslacht."\t" . $lid->instrument."\n";
    }
}
?>
I found that it has something to do with the MIME types: https://filext.com/faq/office_mime_types.php
The header("Content-Type: application/vnd.ms-excel"); is for the older versions of Excel. But when I use the xlsx type, the file won't open at all.
Like I said, this code runs fine and downloads the Excel file in xls format to my computer. But how can I open it without the error?
I would also like to upgrade the file to xlsx format, if someone knows how to make it open in that format.
Well, your header and filename are both saying "this is an XLS file" - but it isn't, it's actually a TSV file (Tab-Separated Values). So yes, the message is accurate: the file extension is misleading.
To write an actual, valid XLS (or XLSX) file, it is easiest to use a library that does this for you, since both formats are complex and tricky. I've had the best results using PHPExcel (example code).
When Excel opens an .xls file, it expects it to be in normal old binary Excel format. You seem to be outputting tab separated text (.tsv format), which Excel understands but doesn't want to open without issuing a warning. I think this warning was introduced with Excel 2007, and that earlier versions opened happily without warnings. Unfortunately I have no sources handy for this info, only personal experience.
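Since the output really is tab-separated text, one way to avoid the mismatch is to label it as such instead of as XLS. The sketch below is an illustration under assumptions, not the only fix: rows_to_tsv() and send_tsv_download() are hypothetical helper names, and the download filename is a placeholder supplied by the caller.

```php
<?php
// Hypothetical helper: serialize rows (arrays of scalar values) as tab-separated text.
function rows_to_tsv(array $rows): string
{
    $buffer = fopen('php://temp', 'r+');
    foreach ($rows as $row) {
        // Tab as delimiter; enclosure and escape characters are passed explicitly.
        fputcsv($buffer, $row, "\t", '"', '\\');
    }
    rewind($buffer);
    $tsv = stream_get_contents($buffer);
    fclose($buffer);
    return $tsv;
}

// Hypothetical helper: send the rows with a MIME type and extension that match the content.
function send_tsv_download(array $rows, string $filename): void
{
    header('Content-Type: text/tab-separated-values; charset=utf-8');
    header('Content-Disposition: attachment; filename="' . $filename . '"');
    echo rows_to_tsv($rows);
}
```

With the question's data, each `$lid` would become one row, for example `[(string) $lid->voornaam, (string) $lid->achternaam]`. Because the extension then matches the content, Excel has no format mismatch to warn about; producing a real .xls or .xlsx file still requires a spreadsheet library, as noted above.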

PHPExcel creating file in binary format instead of xml

I am using PHPExcel to generate an Excel file. The problem is that, instead of generating the file in XML format, it is being generated in binary format.
$phpExcel = new PHPExcel();
$phpExcel->getProperties()->setTitle($worksheet_title);
$phpExcel->setActiveSheetIndex(0);
$worksheet = $phpExcel->getActiveSheet();
$worksheet->setCellValue('A1', 'Field');
// ...
$phpExcelWriter = PHPExcel_IOFactory::createWriter($phpExcel, 'Excel2007');
$phpExcelWriter->save('tmp/xxx');
Since I am using the Excel2007 writer, it should be creating the output file in readable xml format, but it's being created in binary format, instead.
An OfficeOpenXML file is not directly human-readable as one gigantasaurus of an xml file. It is a zipped archive containing a collection of XML files.
Unzip the .xlsx file, then you will find a series of folders containing xml files.
That is what MS Excel generates when you save a file as xlsx, and it is what PHPExcel generates as well.
You can look at this link for full details of the OfficeOpenXML standard
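To see this without unzipping by hand, the archive's entries can be listed from PHP. This is a minimal sketch that assumes the zip extension (ZipArchive) is installed; list_xlsx_parts() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: list the parts stored inside an .xlsx (OfficeOpenXML) package.
function list_xlsx_parts(string $path): array
{
    $zip = new ZipArchive();
    // open() returns true on success and an integer error code otherwise.
    if ($zip->open($path) !== true) {
        throw new RuntimeException("Not a readable ZIP archive: $path");
    }
    $names = [];
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $names[] = $zip->getNameIndex($i);
    }
    $zip->close();
    return $names;
}
```

Called on the file saved as 'tmp/xxx' in the question, this should return entries such as [Content_Types].xml and xl/workbook.xml, which are the XML files the Excel2007 writer produced.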

Convert .xlsx file to .csv file using PHP

I'm looking for a low overhead way to convert a .xlsx file to a .csv file using PHP without consuming excess memory or loading extraneous classes. Anyone?
You can read XLSX files with PHP using PhpSpreadsheet. From there, you only need to figure out the destination format.
You can use the following code with PhpSpreadsheet.
// Read the .xlsx file (reader and writer type names are case-sensitive)
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader('Xlsx');
$spreadsheet = $reader->load('excel_file.xlsx');
// Write the loaded workbook out as CSV
$writer = \PhpOffice\PhpSpreadsheet\IOFactory::createWriter($spreadsheet, 'Csv');
$writer->save('csv_file.csv');
If you need to lower memory usage, you can configure cell caching for the processing; see https://phpspreadsheet.readthedocs.io/en/latest/topics/memory_saving/
