I have a PHP script that adds a lot of small files to the same zip archive file using PHP's ZipArchive class.
Recently the script started to run out of memory. The archive itself is large, but I only add small files one by one, safely re-opening the archive each time I need to add a file.
The archive file grew little by little to 50 MB, so I assume that adding the small files is not the problem. The real problem might be that whenever the ZipArchive class adds a file, it unpacks the whole archive into memory. Is this a correct assumption? Can it be so?
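The add-and-reopen pattern is roughly this (file names are just illustrative):

$zip = new ZipArchive();
if ($zip->open('archive.zip', ZipArchive::CREATE) === true) {
    $zip->addFile('/path/to/small-file.txt', 'small-file.txt');  // one small file per run
    $zip->close();  // the archive is written out and released here
}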
Memory management isn't one of PHP's strong points. I don't see anything in the manual to confirm or dispel the idea that the entire archive is unpacked into memory, but I'd guess that it is.
Try comparing the return value of $zip->open() to ZipArchive::ER_MEMORY - if they're equal, that should confirm that PHP is opening the entire archive in memory.
Another way to confirm it would be to compare the setting of memory_limit (http://us2.php.net/manual/en/ini.core.php#ini.memory-limit) to the size of the zip file.
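A quick diagnostic sketch could check both at once (the archive name is a placeholder):

$zip = new ZipArchive();
$result = $zip->open('archive.zip');
if ($result !== true) {
    if ($result === ZipArchive::ER_MEMORY) {
        echo "open() failed with a memory error\n";
    } else {
        echo "open() failed with error code $result\n";
    }
}
echo 'memory_limit: ' . ini_get('memory_limit') . "\n";
echo 'archive size: ' . filesize('archive.zip') . " bytes\n";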
I made three ZIP files (one with no compression, one with minimal compression and one with maximum compression).
I then had my script unzip each one and timed it. First I did it with ZipArchive, then with exec('unzip').
Every time, it was faster using ZipArchive than exec('unzip').
I thought that surely a native program would be faster than a library. Are there some switches I could use with exec('unzip') (that would make it behave like ZipArchive) to make it just as fast?
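My timing harness is roughly this (paths are illustrative, and the unzip switches are just my guess at a fair comparison):

$start = microtime(true);
$zip = new ZipArchive();
if ($zip->open('test.zip') === true) {
    $zip->extractTo('./out1');
    $zip->close();
}
printf("ZipArchive: %.4f s\n", microtime(true) - $start);

$start = microtime(true);
exec('unzip -q -o test.zip -d ./out2');  // -q quiet, -o overwrite without prompting
printf("exec(unzip): %.4f s\n", microtime(true) - $start);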
I'm developing a webapp in PHP, and the core library is 94kb in size at this point. While I think I'm safe for now, how big is too big? Is there a point where the script's size becomes an issue, and if so can this be ameliorated by splitting the script into multiple libraries?
I'm using PHP 5.3 and Ubuntu 10.04 32bit in my server environment, if that makes any difference.
I've googled the issue, and everything I can find pertains to PHP upload size only.
Thanks!
Edit: To clarify, the 94kb file is a single file that contains all my data access and business logic, and a small amount of UI code that I have yet to extract to its own file.
Do you mean you have one file that is 94KB in size, or that your whole library is 94KB in total?
Regardless, as long as you aren't piling everything into one file and you're organizing your library into different files, your file sizes should remain manageable.
If a single PHP file is starting to hit a few hundred KB, you have to think about why that file is getting so big and refactor the code to make sure that everything is logically organized.
I've used PHP applications that probably included several megabytes worth of code; the main thing, if you have big programs, is to use a code caching tool such as APC on your production server. It will cache the compiled (byte code) version of your PHP files so that PHP doesn't have to parse and compile every file for every page request, which will dramatically speed up your code.
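As a rough php.ini sketch (assuming the APC extension is installed; the values are only examples):

extension=apc.so
apc.enabled=1
apc.shm_size=64M   ; size the shared memory cache to fit your code base (older APC releases expect a plain MB integer, e.g. 64)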
I am doing some tests (LAMP):
Basically I have 2 versions of my custom framework.
A normal version that includes ~20 files.
A lite version that has everything inside one single big file.
Using the lite version I am consistently seeing a decrease in load time, e.g. from 0.01 s for the normal version to 0.005 s for the lite version.
Let's consider just the "include" part. I always thought PHP would keep the included .php files in memory so the file system doesn't have to retrieve them on every request.
Do you think condensing all the classes/functions into one big file is worth the "chaos"?
Or is there a setting to tell PHP to keep the required .php files in memory?
Thanks
(PHP 5.3.x, Apache 2.x, Debian 6 on a dedicated server)
Don't cripple your development by mushing everything up in one file.
A speed up of 5ms is nothing compared to the pain you will feel maintaining such a beast.
To put it another way, a single incorrect index in your database can give you orders of magnitude more slowdown.
Your page would load faster using the "normal" version and omitting one 2kb image.
Don't do it, really just don't.
Or you can do this:
1. Leave the code as it is (located in many different files)
2. Combine them into one file when you are ready to upload to the production server
Here's what I use:
cat js/* > all.js
yuicompressor all.js -o all.min.js
First I combine them into a single file and then I minify them with the YUI Compressor.
I need to download a very large file via PHP. The last time I did it manually via HTTP it was 2.2GB in size and took a few hours to download. I would like to automate the download somehow.
Previously I have used
file_put_contents($filename, file_get_contents($url));
Will this be OK for such a large file? I will want to untar the file after downloading and then perform analysis of the various files inside the tarball.
file_get_contents() is handy for small files but it's totally unsuitable for large files. Since it loads the entire file into memory you need like 2GB of RAM for each script instance!
You should resort to good old fopen() + fread() instead.
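A minimal streaming sketch with fopen()/fread(), reusing the $url and $filename variables from your snippet (the chunk size is arbitrary):

$src = fopen($url, 'rb');
$dst = fopen($filename, 'wb');
if ($src === false || $dst === false) {
    die('Could not open source or destination');
}
while (!feof($src)) {
    fwrite($dst, fread($src, 8192));  // copy 8 KB at a time instead of buffering the whole file
}
fclose($src);
fclose($dst);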
Also, don't rule out using a third-party download tool like wget (installed by default on many Linux systems) and creating a cron task to run it. It's possibly the best way to automate a daily download.
You will have to adapt your php.ini to accept larger file uploads, and adjust your memory usage limit.
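If you stay in PHP, the runtime equivalents are roughly (the values here are only examples):

ini_set('memory_limit', '256M');  // only matters if something still buffers data in memory
set_time_limit(0);                // let the multi-hour download finish without timing out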
I wrote a PHP script to dynamically pack files selected by the client into a zip file and force a download. It works well except that when the number of files is huge (like over 50000), it takes a very long time for the download dialog box to appear on the client side.
I thought about improving this using a cache (these files are not changed very often), but because the selection of the files is decided entirely by the user, and there are tens of thousands of possible combinations, it is very hard to cache the combinations. I also thought about generating zip archives for the individual files first and then combining those zip files on the fly, but I didn't find a way to concatenate zip files in PHP. Another approach I can think of is to send (i.e., read out) the zip file at the same time as it is being generated, but I don't know if that is supported either.
If someone could help me with this, I would really appreciate it.
To extend Mike Sherov's answer, try using a combination of tar and gzip/zip: individually pre-compress all the files using gzip/zip, then, when the client makes their selection, simply tar those files together. That way you still get the benefit of compression and the simplicity of downloading one file, but none of the overhead and delay associated with compressing large files in real time.
While not a silver bullet, you can try tar'ing the files instead. The resulting file is larger, but compression time is much shorter. See here for more info: http://birdhouse.org/blog/2010/03/08/zip-vs-tar-gzip/
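As a rough sketch of the tar approach (assuming $selectedFiles is an array of validated file paths and $tarPath is a writable temp path ending in .tar), PHP's built-in PharData class can build the archive on the fly:

$tar = new PharData($tarPath);
foreach ($selectedFiles as $path) {
    $tar->addFile($path, basename($path));  // store each file under its base name
}
clearstatcache();
header('Content-Type: application/x-tar');
header('Content-Disposition: attachment; filename="selection.tar"');
header('Content-Length: ' . filesize($tarPath));
readfile($tarPath);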
Check out mod_zip for Nginx:
https://github.com/evanmiller/mod_zip
It streams a ZIP file to the client dynamically and can include very large (2GB+) files while using very little RAM.
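Roughly speaking, the PHP side then only has to send the X-Archive-Files header plus a file manifest, along the lines of the example in the mod_zip README (the $files array and its keys are assumptions here; double-check the exact line format against the project documentation):

header('X-Archive-Files: zip');  // tells mod_zip to assemble the archive
header('Content-Disposition: attachment; filename="selection.zip"');
foreach ($files as $f) {  // $files: array of ['path' => ..., 'uri' => ..., 'name' => ...]
    $crc = hash_file('crc32b', $f['path']);  // or '-' to skip the checksum
    printf("%s %d %s %s\n", $crc, filesize($f['path']), $f['uri'], $f['name']);
}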