I am having trouble converting an .xls (Excel) file to CSV with PHPExcel.
Everything works fine until a big file comes along. My PHP script then exceeds the memory limit and dies. I cannot use more than 64MB because of the constraints of the machine, which runs Apache.
I need to find a solution.
I think I have to tell PHPExcel to load just a few rows of the Excel file, convert them to a small CSV, save it, free the memory, and repeat with the rest of the file until it is done.
What do you think? Is there a better way of doing it?
You have a few options for saving memory with PHPExcel. The main two are:
Cell caching, described in section 4.2.1 of the developer documentation. This allows you to reduce the memory overhead for each cell that is read from the file.
Chunking, described in section 4.3 of the User Documentation for Readers. This allows you to read small ranges of rows and columns from a file, rather than the whole worksheet.
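Combining the two options might look roughly like the sketch below. It assumes a PHPExcel version that provides listWorksheetInfo(), converts only the first worksheet, and uses names of my own (ChunkReadFilter, convertXlsToCsvInChunks) that are not part of the library. The small interface stand-in at the top is only there so the sketch parses without PHPExcel installed; with the library loaded it is skipped.

```php
<?php
// Stand-in so this sketch parses without the library; skipped when PHPExcel is loaded.
if (!interface_exists('PHPExcel_Reader_IReadFilter')) {
    interface PHPExcel_Reader_IReadFilter
    {
        public function readCell($column, $row, $worksheetName = '');
    }
}

// Read filter that accepts only rows in [startRow, startRow + chunkSize).
class ChunkReadFilter implements PHPExcel_Reader_IReadFilter
{
    private $startRow = 0;
    private $endRow = 0;

    public function setRows($startRow, $chunkSize)
    {
        $this->startRow = $startRow;
        $this->endRow = $startRow + $chunkSize;
    }

    public function readCell($column, $row, $worksheetName = '')
    {
        return $row >= $this->startRow && $row < $this->endRow;
    }
}

// Hypothetical helper: converts the first worksheet to CSV, one chunk of rows at a time.
function convertXlsToCsvInChunks($inputFile, $outputFile, $chunkSize = 1000)
{
    // Option 1: cell caching, configured before anything is loaded.
    PHPExcel_Settings::setCacheStorageMethod(
        PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp,
        array('memoryCacheSize' => '8MB')
    );

    $reader = PHPExcel_IOFactory::createReaderForFile($inputFile);
    $reader->setReadDataOnly(true);

    // Row and column counts, obtained without building the whole workbook.
    $info = $reader->listWorksheetInfo($inputFile);
    $totalRows = $info[0]['totalRows'];
    $lastColumn = $info[0]['lastColumnLetter'];

    // Option 2: chunking through a read filter.
    $filter = new ChunkReadFilter();
    $reader->setReadFilter($filter);
    $out = fopen($outputFile, 'w');

    for ($start = 1; $start <= $totalRows; $start += $chunkSize) {
        $filter->setRows($start, $chunkSize);
        $excel = $reader->load($inputFile); // re-parses the file, keeps only this chunk
        $end = min($start + $chunkSize - 1, $totalRows);
        $range = 'A' . $start . ':' . $lastColumn . $end;
        $rows = $excel->getSheet(0)->rangeToArray($range, null, true, false);
        foreach ($rows as $row) {
            fputcsv($out, $row);
        }
        $excel->disconnectWorksheets(); // break circular references so memory can be freed
        unset($excel, $rows);
    }
    fclose($out);
}
```

Note that the reader still parses the whole file on every load() call, so a smaller chunk size trades memory for run time.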
Related
At the moment I am doing a mass interface of files/data, and some files are in XLS format, which I need to normalize into CSV (so basically, convert XLS files to CSV).
The problem is that PHPExcel (and similar libraries) load the entire sheet data at once, thus exhausting memory.
So far I have tried various libraries (while also negotiating to receive the data as CSV, with no luck so far).
I am running my tests on various large file sizes; my memory allocation is set properly before and after my script runs, using ini_set etc.
Is there a way to read an XLS file line by line or in chunks (like fgetcsv or fread)?
I am programming this so it can work with any filesize (even if it takes ages to run) as this is a fully automated system.
PS: I checked this post and various others already
Reading an Excel file in PHP
Possible ways...
Get help from other languages, e.g. find a Python Excel library and call Python from PHP.
Modify the source code of those Excel readers.
Use a command-line tool to convert Excel to CSV (Gnumeric's ssconvert, for example), and read the CSV in PHP.
An .xlsx file is nothing but a zip archive of XML files, so it can be unzipped and the values read from the XML directly. (This does not apply to the older .xls format, which is binary.)
First split one xls file into many small xls files with a non-PHP solution, e.g. VBA in Excel, then read each of them.
I'm trying to read a larger than 100MB Excel file using PHPExcel but it crashes while loading the file. I don't need any styling. I tried using:
$objReader->setReadDataOnly(true);
but it still crashes.
Is there any efficient way to read this size of Excel file in PHP?
Try Spout: https://github.com/box/spout.
This is a PHP library that was created to solve your problem (reading/writing large files). Here is why it works:
Other libraries keep a representation of the whole spreadsheet in memory, which makes them subject to out-of-memory errors. Caching strategies help with this kind of error but hurt performance badly.
Spout, on the other hand, uses streams to read or write data. This means that only one row is kept in memory at any time; every row is freed once it has been read or written. This allows fast reading and writing of datasets of any size! Give it a try :)
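A rough sketch of what row-by-row reading looks like, assuming Spout 3.x installed via Composer (the 2.x API differs). The function name xlsxToCsvWithSpout is made up for illustration. Keep in mind that Spout reads XLSX, ODS and CSV, not the old binary .xls format.

```php
<?php
// Assumes Spout 3.x installed via Composer; load the autoloader before calling:
// require 'vendor/autoload.php';
use Box\Spout\Reader\Common\Creator\ReaderEntityFactory;

// Hypothetical helper: streams every row of every sheet of an .xlsx file into one CSV.
function xlsxToCsvWithSpout($xlsxPath, $csvPath)
{
    $reader = ReaderEntityFactory::createXLSXReader();
    $reader->setShouldFormatDates(true); // return dates as strings, not DateTime objects
    $reader->open($xlsxPath);

    $out = fopen($csvPath, 'w');
    foreach ($reader->getSheetIterator() as $sheet) {
        foreach ($sheet->getRowIterator() as $row) {
            // Only the current row is held in memory at this point.
            fputcsv($out, $row->toArray());
        }
    }
    fclose($out);
    $reader->close();
}
```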
Spout just saved me a lot of time! I couldn't read a large file with PhpOffice/PhpSpreadsheet because of repeated "Fatal error: Allowed memory size exhausted" failures, and with Spout it works like a charm.
When I use this function:
$objPHPExcel = PHPExcel_IOFactory::load($fileName);
on smaller Excel files, it takes a while but finally loads an extensive structure into $objPHPExcel... Unfortunately, when I try it on a slightly larger, more complex Excel file I get:
Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 32 bytes)
The file is an xlsm file and is 1.7MB... Does this sound right or is something fishy going on?
I think this is all the code you need. I'm running off of default WAMP set up at the moment locally.
I had the same issue. At our company we need to import huge xls(x) files to our database. We have been using PEAR Spreadsheet Excel Reader, but it is no longer supported and we encountered many bugs with newer files, so we have tried to switch to PHPExcel.
Unfortunately we did not manage to overcome the memory limit issue, and we spent a lot of time trying.
There is no way you can load 100K rows with columns up to 'BB' with PHPExcel.
It is just not the right tool for the job. PHPExcel builds the whole spreadsheet in memory as objects. There is no way around this.
What you need is a tool that can read the file row by row, cell by cell.
Our solution was to use Java and Apache POI classes with event model - which does read only operations but is very memory and cpu efficient.
If you only need to support the XML-based "Office Open XML" format (xlsx) and not the old OLE-based one, you can process it as XML yourself. The format is not that complicated once you get into it. Just unzip a file and look at the XML files. There is one file that holds the string table, and one file with rows and cells for each sheet.
You should parse the string table first, and then the sheet data with xml reader (not the DOM).
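The two passes could be sketched like this with PHP's XMLReader. The function names are my own, and the sketch is deliberately simplified: it ignores inline strings, skips cells without a value (so empty cells do not leave gaps), and returns dates as raw serial numbers. The string table itself is still held in memory.

```php
<?php
// Pass 1: load the shared string table, one entry per <si>, joining its <t> runs.
function readSharedStrings($uri)
{
    $strings = array();
    $xml = new XMLReader();
    if (!$xml->open($uri)) {
        return $strings;
    }
    $current = null;
    while ($xml->read()) {
        $isStart = $xml->nodeType === XMLReader::ELEMENT;
        $isEnd = $xml->nodeType === XMLReader::END_ELEMENT;
        if ($isStart && $xml->localName === 'si') {
            if ($xml->isEmptyElement) {
                $strings[] = '';
            } else {
                $current = '';
            }
        } elseif ($isStart && $xml->localName === 't' && $current !== null) {
            $current .= $xml->readString();
        } elseif ($isEnd && $xml->localName === 'si') {
            $strings[] = $current;
            $current = null;
        }
    }
    $xml->close();
    return $strings;
}

// Pass 2: stream the sheet and hand each completed row to $onRow.
function streamSheetRows($uri, array $sharedStrings, $onRow)
{
    $xml = new XMLReader();
    if (!$xml->open($uri)) {
        return;
    }
    $row = array();
    $type = null;
    while ($xml->read()) {
        if ($xml->nodeType === XMLReader::ELEMENT) {
            if ($xml->localName === 'row') {
                $row = array();
            } elseif ($xml->localName === 'c') {
                $type = $xml->getAttribute('t'); // "s" marks a shared-string index
            } elseif ($xml->localName === 'v') {
                $value = $xml->readString();
                $row[] = ($type === 's') ? $sharedStrings[(int) $value] : $value;
            }
        } elseif ($xml->nodeType === XMLReader::END_ELEMENT && $xml->localName === 'row') {
            call_user_func($onRow, $row);
        }
    }
    $xml->close();
}
```

With the zip extension enabled, the XML parts can be read straight out of the archive through the zip:// stream wrapper, for example 'zip://' . realpath($file) . '#xl/sharedStrings.xml' for the strings and '#xl/worksheets/sheet1.xml' for the first sheet. Those are the usual part names, but the workbook's relationship files are what actually define them.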
As far as I know, there is no PHP library that can import large excel files out of the box as of October 2013.
Good luck.
In my experience, PHPExcel runs out of memory during a lot of operations.
You can try upping the memory limit for the script.
ini_set('memory_limit','256M');
I want to read an Excel file with PHP row by row, because reading the entire file at once causes a memory overflow.
I have searched a lot, but no luck until now.
I think the PHPExcel library can read an Excel file in chunks when you implement the read filter class, but each time it fetches a chunk it parses the entire file again, which is not practical for huge .xls files because of the time it takes.
Any help ?
This may be something that is totally out of question, but from the information that I get from your question the following seems like an obvious option, at least something to consider ...
I get the impression that this is a really big file that needs to be accessed often. So, I would just try to import its data in a database.
I guess there is no need to explain that databases are masters in performance and caching.
And it is still possible to export the contents of the database to an excel file afterwards.
MySQL works great with PHP and is certainly easier to access than an Excel file. Most PHP hosting providers offer a MySQL database by default, with phpMyAdmin as a management tool.
How to do it:
If you have phpMyAdmin installed, then you can follow these simple steps.
If you have command-line access to the server, you can even import the file from the command line directly into a MySQL database.
If the only thing you need from the Excel file is its data, here is my way of reading huge Excel files:
I install Gnumeric on my server, e.g. on Debian/Ubuntu:
apt-get install gnumeric
Then the PHP calls to read my Excel file and store it in a two-dimensional data array are incredibly simple (the dimensions are rows and columns):
// escapeshellarg() protects against spaces and shell metacharacters in the file name
system("ssconvert " . escapeshellarg($excel_file_name) . " temp.csv");
$array = array_map("str_getcsv", file("temp.csv"));
Then I can do what I want with my array. This takes less than 10 seconds for a 10MB xls file, the same time I would need to load the file in my favorite spreadsheet software!
For very large files, you should use the fopen() and fgetcsv() functions and process each row as you read it, rather than loading the whole CSV file into a huge array with file(). This will be slower, but will not eat all your server's memory!
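The streaming variant could be sketched as follows. The helper name eachCsvRow and the callback style are my own choices for illustration; the point is only that fgetcsv() returns one record per call, so memory use stays flat regardless of file size.

```php
<?php
// Hypothetical helper: calls $onRow for every CSV record, one at a time,
// and returns the number of records processed.
function eachCsvRow($csvPath, $onRow)
{
    $handle = fopen($csvPath, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $csvPath");
    }
    $count = 0;
    // 0 means "no line length limit"; delimiter, enclosure and escape are passed explicitly.
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if ($row === array(null)) {
            continue; // fgetcsv() reports a blank line as an array holding a single null
        }
        call_user_func($onRow, $row);
        $count++;
    }
    fclose($handle);
    return $count;
}
```

It would be called on the converted file, e.g. eachCsvRow('temp.csv', function ($row) { /* insert into the database, etc. */ });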
I have a PHP application that needs to work on many PHP configurations, with as few requirements outside of the CodeIgniter framework as possible.
I have an import function right now that uses .csv files. CSV is pretty good as it is cross-platform, but people have trouble with it when using Excel. It also can't display Chinese characters correctly.
Then there are .xls and .xlsx files. There are libraries for these, but they often require php_zip.
Which option should I choose that works with many PHP installs and is good for display and import?
Some information may be lost in the export to CSV.
It will only save the values of the cells, not their formatting information.
There's no way you'll read an .xlsx file without unzipping it, which means you'll need a zip lib.
PHPExcel handles several formats of excel files, but it can be a bit resource hungry.
http://phpexcel.codeplex.com/
XLSX2CSV is less resource intensive, but it only reads one sheet of a multi-sheet workbook, doesn't parse formulas and doesn't handle .xls files.
http://davidacollins.com/weblog/xlsx2csv