File is not readable with Excelwriter and phpExcelReader 2 - php

I use Harish Chauhan's Excel Writer class to generate an Excel (.xls) file.
I then use phpExcelReader 2 to read the file that Excel Writer created, but I get this error every time:
The filename myXls.xls is not readable
I can open the "myXls.xls" file with MS Excel, and if I then save the file under another name, it can be read successfully.
Exploring the code, the error seems to come from:
if (substr($this->data, 0, 8) != IDENTIFIER_OLE) {
    //echo 'Error';
    $this->error = 1;
    return false;
}
IDENTIFIER_OLE is defined as:
define('IDENTIFIER_OLE', pack("CCCCCCCC",0xd0,0xcf,0x11,0xe0,0xa1,0xb1,0x1a,0xe1));
I don't have any idea how to fix it. Please help.
Thanks for your time!

The file generated by Harish Chauhan's ExcelWriter class is not an actual OLE BIFF .xls file, but a mix of HTML markup and some elements from SpreadSheetML, the XML format defined by Microsoft as an alternative to BIFF in Excel 2003. It never proved particularly popular; but the later versions of MS Excel itself can still read and write this format. MS Excel is also very forgiving about reading HTML markup, though the latest version will give you a notice informing you if a file format does not match its extension.
phpExcelReader 2 is written to read Excel BIFF files, therefore it is incapable of reading the non-OLE/non-BIFF files generated by Harish Chauhan's class.
If you want to write and read files in the correct format, then I suggest you use PHPExcel, or one of the many other PHP libraries that work with genuine Excel files.
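To see which kind of file you actually have before handing it to a BIFF reader, you can repeat the same 8-byte signature test yourself. The following is a minimal sketch, not part of either library; the constant name OLE_SIGNATURE and the function looksLikeOleFile() are hypothetical names chosen for illustration.

```php
<?php
// The 8-byte signature that opens an OLE2 compound document, i.e. the same
// bytes that IDENTIFIER_OLE packs in phpExcelReader.
const OLE_SIGNATURE = "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1";

/**
 * Hypothetical helper: returns true when the file at $path starts with
 * the OLE2 signature, false otherwise (including unreadable files).
 */
function looksLikeOleFile(string $path): bool
{
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return false;
    }
    // Only the first 8 bytes are needed for the check.
    $header = fread($handle, 8);
    fclose($handle);

    return $header === OLE_SIGNATURE;
}
```

A file written by the HTML-based Excel Writer class would fail this check, while the copy re-saved by MS Excel as a real .xls would pass it.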

I had the same problem. The task was to parse a very old XLS file (Excel 2). I could not find any PHP library that works with such an old format.
So the solution was to convert the file to XLSX with the LibreOffice command line (converting to CSV also works) and then parse it with any more modern Excel parser.
We have LibreOffice installed on our server, and this is the command to convert:
libreoffice --headless --convert-to xlsx original_source.xls
or
libreoffice --headless --convert-to csv original_source.xls
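
If the conversion is triggered from PHP, the file name should be escaped before it reaches the shell. This is a minimal sketch under stated assumptions: buildConvertCommand() is a hypothetical helper, the libreoffice binary is assumed to be on the PATH as in the commands above, and the --outdir option is added so the output location is explicit.

```php
<?php
/**
 * Hypothetical helper: builds the LibreOffice command line for one conversion.
 * Assumes a "libreoffice" binary is available on the PATH.
 */
function buildConvertCommand(string $source, string $format, string $outDir): string
{
    // escapeshellarg() quotes each value so spaces or shell metacharacters
    // in a file name cannot break or alter the command.
    return sprintf(
        'libreoffice --headless --convert-to %s --outdir %s %s',
        escapeshellarg($format),
        escapeshellarg($outDir),
        escapeshellarg($source)
    );
}

// Usage sketch (not executed here):
// exec(buildConvertCommand('original_source.xls', 'xlsx', '/tmp/converted'), $output, $exitCode);
```

After running the command, it is safer to confirm that the expected output file exists rather than relying on the exit code alone.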

Related

check valid docx from linux command line

I generate docx files in a PHP script, but sometimes they are corrupted. The server does not know this, so it returns the docx file to the user, who then discovers that it is corrupted. This creates a very bad experience.
Does someone have a way to check from the Linux CLI whether a docx is corrupted? That would let me be more resilient, either by trying to fix the file or by giving the user a proper response.
For now I'm experimenting with:
libreoffice --headless --convert-to html corrupted.docx
But since most files are not corrupted, this check will increase the response time in the common case.
You can debug with this corrupted file.
You could call a PHP script that opens the document with PHPWord, which can report success or failure. See this example:
include_once 'Sample_Header.php';
// Read contents
$name = basename(__FILE__, '.php');
$source = __DIR__ . "/resources/{$name}.docx";
echo date('H:i:s'), " Reading contents from `{$source}`", EOL;
$phpWord = \PhpOffice\PhpWord\IOFactory::load($source);
return $phpWord instanceof PhpOffice\PhpWord\PhpWord;
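
A cheaper pre-check is also possible, because a docx file is a ZIP archive whose main body lives in word/document.xml. The sketch below is an assumption-laden illustration rather than a full validator: docxLooksValid() is a hypothetical name, it needs PHP's zip and dom extensions, and it only catches broken archives and malformed XML in the main document part, not every kind of corruption Word might reject.

```php
<?php
/**
 * Hypothetical helper: quick structural check of a .docx file.
 * Returns true when the archive opens, contains word/document.xml,
 * and that part is well-formed XML.
 */
function docxLooksValid(string $path): bool
{
    $zip = new ZipArchive();
    // ZipArchive::open() returns true on success, an error code otherwise.
    if ($zip->open($path) !== true) {
        return false;
    }
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();

    if ($xml === false || $xml === '') {
        return false;
    }

    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $document = new DOMDocument();
    $loaded = $document->loadXML($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $loaded !== false;
}
```

Files that pass this check could be returned immediately, with the slower PHPWord or LibreOffice check reserved for files that fail it.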

PHPExcel reading large CSV is failing

I am attempting to import a CSV using PHPExcel into my application so that I can load the data into a table. When the file grows past 2 MB, the code fails.
I'm running Laravel on WAMP64. The code that is failing is:
$objPHPExcel = PHPExcel_IOFactory::load(Input::file('file')->getRealPath());
The error message is:
ErrorException: file_get_contents(C:\wamp\www\imax\public): failed to open stream: Permission denied in C:\wamp\www\imax\vendor\phpoffice\phpexcel\Classes\PHPExcel\Shared\OLERead.php:85
I know it's a size issue because the code completes properly when the file is 2048K. I can add one character to the file pushing it to 2049K and it fails. So it's not a permissions issue.
The line that fails in OLERead.php is:
// Get the file identifier
// Don't bother reading the whole file until we know it's a valid OLE file
$this->data = file_get_contents($sFileName, FALSE, NULL, 0, 8);
Wampserver 3.0.6
PHP 7.0.10
It sounds like you need to raise the limits PHP is running with. You can do that at runtime or by changing your php.ini config file.
To raise the memory limit at runtime:
ini_set('memory_limit','16M');
Feel free to change the 16M to what you need.
To change the upload limit permanently:
Open your php.ini file, look for the line upload_max_filesize = 2M, and change the 2M to the value you want. A failure that starts just past 2048K matches this default 2M upload limit. (I also believe WAMP lets you edit this by right-clicking the icon and choosing one of the options.)
Note: You may want to search for just upload_max_filesize and leave out the = 2M part, as your value may be different.
If the code is using OLERead, then either you're not reading a large CSV file but a BIFF-format xls file, or you're letting PHPExcel try to identify the file type itself. If you know that it is a CSV file, instantiate the CSV Reader manually rather than letting PHPExcel try to identify the file.
// Tell PHPExcel that you want to load a CSV file
$objReader = new PHPExcel_Reader_CSV();
// Load the $inputFileName to a PHPExcel Object
$objPHPExcel = $objReader->load(Input::file('file')->getRealPath());
However, if you know that you're working with CSV files, then it's more efficient to use PHP's native CSV reading function, fgetcsv().
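
As a rough illustration of the fgetcsv() approach, the sketch below reads a file one row at a time so that memory use does not grow with file size. readCsvRows() is a hypothetical helper name, and the comma delimiter and double-quote enclosure are assumptions about the input.

```php
<?php
/**
 * Hypothetical helper: yields each CSV row as an array, one row at a time.
 */
function readCsvRows(string $path, string $delimiter = ','): Generator
{
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open {$path}");
    }
    try {
        // Length 0 means no line-length limit; enclosure and escape are passed explicitly.
        while (($row = fgetcsv($handle, 0, $delimiter, '"', '\\')) !== false) {
            // A blank line comes back as an array holding a single null; skip it.
            if ($row === [null]) {
                continue;
            }
            yield $row;
        }
    } finally {
        fclose($handle);
    }
}

// Usage sketch:
// foreach (readCsvRows(Input::file('file')->getRealPath()) as $row) { /* insert $row */ }
```

Because the rows are yielded lazily, each one can be inserted into the table as it is read instead of building the whole sheet in memory first.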

How to convert a .doc file to a .pdf without tracking changes, using writer from command line? using codeigniter

I am using either LibreOffice or MS Office to convert a docx file to PDF. I run the conversion from the command line with this PHP code:
if ($UseLibre)
    $command = 'C:\Progra~2\LibreO~1\program\soffice.exe --headless --convert-to pdf:writer_pdf_Export '.$doc_file_path.' -outdir '.$pdf_file_path; // When using LibreOffice
else
    $command = 'cscript /nologo D:\wamp\www\doc2pdf.vbs /nologo '.$doc_file_path.' '.$pdf_file_path; // When using MS Office
This works fine. The problem is that if the docx contains tracked changes (review markup), the alignment in the generated PDF is affected.
How can I remove the tracked changes before generating the PDF?
Or is there an extra parameter I need to pass to tell LibreOffice or MS Office to ignore tracked changes while converting the docx?

convert pdf to text file in php (note : shell_exec is disabled)

The best way to solve this would be to use "pdftotext" from the "xpdf" package, but shell_exec is disabled on every shared host I found. I found alternative methods that use only PHP, such as a function called pdf2string() (on php.net), but none of those functions worked as expected: with some PDF files they did not output the correct text, with others they output nothing, and some versions of the function did not work at all, so I excluded this option. Is there any way to convert the open source pdftotext into a PHP script? (The source is in C++, I think, and can be found here: http://www.foolabs.com/xpdf/download.html). Any other solution will be accepted as long as it gives me the correct text output of the PDF.
Since you have a restricted environment, you may want to look at this:
http://webcheatsheet.com/php/reading_clean_text_from_pdf.php
It uses no external library to parse PDF into text.
However, since it parses text out of the raw PDF format, I'm not sure how stable it is.

read contents of .pst file with php

Is it possible to somehow use PHP to read the contents of a .pst file?
There's a standalone program to convert PST to other formats (which may be then readable using PHP extensions, e.g. php_imap): http://www.five-ten-sg.com/libpst/
However, as Microsoft keeps changing the PST format, it's not guaranteed that you'll be able to convert all PST files.
Exporting folders from MS Outlook (FILE -> OPEN -> IMPORT -> EXPORT TO A FILE) into plain-text CSV enables easy parsing, e.g. via the fgetcsv function. libpst is only supported on Linux (RPM).
The format of a PST file is complex, and parsing it with PHP would be a tremendous job.
