I am having issues including really large files, so I have used ini_set('memory_limit', '-1');
but I still cannot include a file that is just over 1GB. What should I do?
You will have to reconsider your architecture. PHP's include was not designed to handle such large files: it was designed to include and evaluate a file of PHP code. Without knowing what the file actually holds it's hard to say more, but it seems very unlikely that a 1GB file contains only PHP code; more likely you are trying to read a large amount of data that happens to be encoded in a PHP-like format.
You should instead, for example, read the file in chunks or line by line rather than using include.
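A minimal sketch of that idea, assuming the file holds line-oriented data rather than executable PHP (the filename and parsing are placeholders):

```php
// Hedged sketch: walk the file line by line instead of include-ing it.
$handle = fopen('huge_data_file.dat', 'r'); // hypothetical filename
if ($handle !== false) {
    while (($line = fgets($handle)) !== false) {
        // parse one line at a time; memory use stays roughly constant
    }
    fclose($handle);
}
```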
The only plausible reason to load such a file is to import it into a database. Every major database has a bulk loader meant for exactly this purpose; MySQL has LOAD DATA INFILE specifically to deal with this issue. I'd recommend getting the data into your database first, then using PHP or MySQL to transform the data to your needs. Using PHP to parse through a 1GB file is probably not the best use of PHP or your resources.
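For illustration, a rough sketch of driving that bulk load from PHP; the credentials, table name, and file layout are all hypothetical, and local-infile support must be enabled on both client and server:

```php
// Hedged sketch: bulk-load a large CSV into MySQL instead of parsing
// it in PHP. Credentials, table name and file path are hypothetical.
$pdo = new PDO(
    'mysql:host=localhost;dbname=mydb;charset=utf8',
    'user',
    'password',
    [PDO::MYSQL_ATTR_LOCAL_INFILE => true]
);
$pdo->exec("
    LOAD DATA LOCAL INFILE '/path/to/huge_file.csv'
    INTO TABLE imported_data
    FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\\n'
    IGNORE 1 LINES
");
```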
Related
At the moment I am doing a mass import of files/data, and some files are in XLS format, which I need to normalize into CSV (so basically, convert XLS to CSV files).
The problem is that PHPExcel (and similar libraries) load the entire sheet's data at once, thus exhausting memory.
So far I have tried various libraries (while in the meantime negotiating to receive the data as CSV instead, though no luck so far).
I am running my tests on various large file sizes, and my memory allocation is set properly before and after my script runs using ini_set() etc.
Is there a way I can read an XLS file line by line or in chunks (like fgetcsv or fread), please?
I am programming this so it can work with any file size (even if it takes ages to run), as this is a fully automated system.
PS: I checked this post and various others already
Reading an Excel file in PHP
Possible ways...
Get help from another language: e.g. find a Python Excel library and call Python from PHP.
Modify the source code of one of those Excel readers.
Use a command line tool to convert Excel to CSV, e.g. ssconvert from Gnumeric or LibreOffice in headless mode, and use the CSV in PHP (see the sketch after this list).
Since an xlsx file is essentially a zip archive of XML, it could be unzipped and the values extracted directly (this does not work for the older binary xls format).
First decompose one XLS file into many small XLS files via a non-PHP solution, e.g. VBA in Excel, then read each of them.
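A rough sketch of the command-line option; the tool choice and filenames are assumptions (ssconvert from Gnumeric, used in a later answer, works the same way):

```php
// Hedged sketch: convert XLS to CSV with LibreOffice in headless mode,
// then stream the result row by row. Filenames are hypothetical.
system('libreoffice --headless --convert-to csv input.xls --outdir /tmp');

$handle = fopen('/tmp/input.csv', 'r');
if ($handle !== false) {
    while (($row = fgetcsv($handle)) !== false) {
        // handle one row at a time, like fgetcsv on a native CSV
    }
    fclose($handle);
}
```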
I'm interested in making certain that a file uploaded via PHP into a DB is locked down. Currently the key functions I'm using are fopen and fgetcsv. Unfortunately this subject seems quite nebulous on the web.
The file isn't "executed" but is opened and walked with fgetcsv. What steps do I need to take to make certain that no foul play occurs on my server through this module?
Currently I limit the file size and check the extension.
Do I need to verify that the uploaded file is actually a CSV and not just some file with a .csv extension? I assume this would be done with a file type recognizer?
What do I need to do to avoid multibyte/encoding exploits?
Edit:
I found this link to be helpful, and it may be to others: http://php.net/manual/en/features.file-upload.post-method.php
Thanks
If you are relying on a library to parse user input, you should have confidence in the quality of the library.
If you don't, then picking a different library is advisable.
If no sufficiently stable library can be found for the task, the only viable option in a security-critical application is to implement the functionality yourself.
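For the specific checks the question raises (real CSV vs. just an extension, multibyte exploits), a rough starting point might look like this; the field name 'upload' is hypothetical, and this is a sketch, not a complete defence:

```php
// Hedged sketch: verify the upload really is text/CSV and valid UTF-8
// before parsing. The field name 'upload' is hypothetical.
$tmpPath = $_FILES['upload']['tmp_name'];

if (!is_uploaded_file($tmpPath)) {
    die('Not an uploaded file');
}

// Check the actual content type, not the client-supplied extension.
$finfo = new finfo(FILEINFO_MIME_TYPE);
if (!in_array($finfo->file($tmpPath), ['text/plain', 'text/csv', 'application/csv'], true)) {
    die('Not a CSV file');
}

// Reject malformed multibyte sequences before they reach the parser.
if (!mb_check_encoding(file_get_contents($tmpPath), 'UTF-8')) {
    die('Invalid encoding');
}
```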
I'm trying to read a larger-than-100MB Excel file using PHPExcel, but it crashes while loading the file. I don't need any styling. I tried using:
$objReader->setReadDataOnly(true);
but it still crashes.
Is there any efficient way to read this size of Excel file in PHP?
Try Spout: https://github.com/box/spout.
This is a PHP library that was created to solve your problem (reading/writing large files). Here is why it works:
Other libraries keep a representation of the whole spreadsheet in memory, which makes them subject to out-of-memory errors. Caching strategies will help with these kinds of errors but will affect performance pretty badly.
Spout, on the other hand, uses streams to read and write data. This means that only one row is kept in memory at any time, and every row that has been read or written is freed. This allows fast reads and writes of datasets of any size! Give it a try :)
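For reference, reading with Spout looks roughly like this; a sketch against Spout 3's API (namespaces differ in older versions), with a hypothetical filename:

```php
use Box\Spout\Reader\Common\Creator\ReaderEntityFactory;

$reader = ReaderEntityFactory::createXLSXReader();
$reader->open('large_file.xlsx'); // hypothetical filename

foreach ($reader->getSheetIterator() as $sheet) {
    foreach ($sheet->getRowIterator() as $row) {
        $cells = $row->toArray(); // only this row is in memory
        // process $cells ...
    }
}

$reader->close();
```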
Spout just saved my day! I couldn't read a large file with PhpOffice/PhpSpreadsheet without hitting many fatal memory-size errors, but with Spout it works like a charm.
I want to read an Excel file with PHP row by row, because reading the entire file at once causes a memory overflow.
I have searched a lot, but no luck until now.
I think the PHPExcel library can read chunks of an Excel file if you implement its read-filter class, but each time it fetches a chunk it re-reads the entire file, which is impractical for huge .xls files because of the time it takes.
Any help?
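For reference, the chunked-read approach mentioned above is built on PHPExcel's IReadFilter interface and looks roughly like this (adapted from the pattern in PHPExcel's documentation; the chunk size, row limit, and filename are placeholders). It also illustrates the limitation described: load() runs once per chunk.

```php
// Read filter that only admits rows in the current chunk.
class ChunkReadFilter implements PHPExcel_Reader_IReadFilter
{
    private $startRow = 0;
    private $endRow   = 0;

    public function setRows($startRow, $chunkSize)
    {
        $this->startRow = $startRow;
        $this->endRow   = $startRow + $chunkSize;
    }

    public function readCell($column, $row, $worksheetName = '')
    {
        // Always read the heading row, plus the rows in this chunk.
        return $row == 1 || ($row >= $this->startRow && $row < $this->endRow);
    }
}

$objReader   = PHPExcel_IOFactory::createReader('Excel5');
$chunkFilter = new ChunkReadFilter();
$objReader->setReadFilter($chunkFilter);

for ($startRow = 2; $startRow <= 65536; $startRow += 1000) {
    $chunkFilter->setRows($startRow, 1000);
    // Note: this re-parses the whole file on every iteration,
    // which is exactly the cost described above.
    $objPHPExcel = $objReader->load('huge_file.xls');
    // ... process the loaded rows ...
    $objPHPExcel->disconnectWorksheets();
    unset($objPHPExcel);
}
```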
This may be something that is totally out of the question, but from the information in your question the following seems like an obvious option, or at least something to consider...
I get the impression that this is a really big file that needs to be accessed often, so I would just try to import its data into a database.
I guess there is no need to explain that databases are masters of performance and caching.
And it is still possible to export the contents of the database to an Excel file afterwards.
MySQL works great with PHP and is certainly easier to access than an Excel file. Most PHP hosting providers offer a MySQL database by default, with a phpMyAdmin management tool.
How to do it:
If you have PhpMyAdmin installed, then you can follow these simple steps.
If you have command-line access to the server then you can even import the file from commandline directly to a MySql database.
If the only thing you need from the Excel file is its data, here is my way to read huge Excel files:
I install Gnumeric on my server, e.g. on Debian/Ubuntu:
apt-get install gnumeric
Then the PHP calls to read my Excel file and store it in a two-dimensional data array are incredibly simple (the dimensions are rows and cols):
system("ssconvert \"$excel_file_name\" \"temp.csv\"");
$array = array_map("str_getcsv", file("temp.csv"));
Then I can do what I want with my array. This takes less than 10 seconds for a 10MB xls file, about the same time I would need to open the file in my favorite spreadsheet software!
For very huge files, you should use the fopen() and fgetcsv() functions and do what you have to do row by row, without storing the data in a huge array: the file() call above reads the whole CSV into memory at once. Streaming will be slower, but will not eat all your server's memory!
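A sketch of that streaming variant (same hypothetical temp.csv as above):

```php
// Hedged sketch: stream the converted CSV instead of slurping it with
// file(), so only one row lives in memory at a time.
$handle = fopen('temp.csv', 'r');
if ($handle !== false) {
    while (($row = fgetcsv($handle)) !== false) {
        // do what you have to do with $row here
    }
    fclose($handle);
}
```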
I'm developing a webapp in PHP, and the core library is 94kb in size at this point. While I think I'm safe for now, how big is too big? Is there a point where the script's size becomes an issue, and if so can this be ameliorated by splitting the script into multiple libraries?
I'm using PHP 5.3 and Ubuntu 10.04 32bit in my server environment, if that makes any difference.
I've googled the issue, and everything I can find pertains to PHP upload size only.
Thanks!
Edit: To clarify, the 94kb file is a single file that contains all my data access and business logic, and a small amount of UI code that I have yet to extract to its own file.
Do you mean you have one file that is 94KB in size, or that your whole library is 94KB in total?
Regardless, as long as you aren't piling everything into one file and you organize your library into separate files, your file size should remain manageable.
If a single PHP file is starting to hit a few hundred KB, you have to think about why that file is getting so big and refactor the code so that everything is logically organized.
I've used PHP applications that probably included several megabytes' worth of code. The main thing, if you have big programs, is to use a code-caching tool such as APC on your production server. It caches the compiled (byte-code) PHP so the server doesn't have to parse and compile every file on every page request, which will dramatically speed up your code.
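Illustrative php.ini settings for APC on a PHP 5.3-era server; the directive names are the standard APC ones, but the values are assumptions to tune per machine:

```ini
; Hypothetical APC configuration; adjust shm_size to your workload.
extension=apc.so
apc.enabled=1
apc.shm_size=64M
apc.stat=1   ; re-check file mtimes (set to 0 for max speed on stable deploys)
```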