I need to download a very large file via PHP; the last time I did it manually via HTTP it was 2.2 GB in size and took a few hours to download. I would like to automate the download somehow.
Previously I have used
file_put_contents($filename, file_get_contents($url));
Will this be OK for such a large file? I will want to untar the file after downloading and then perform analysis of the various files inside the tarball.
file_get_contents() is handy for small files, but it's totally unsuitable for large files. Since it loads the entire file into memory, you'd need something like 2 GB of RAM for each script instance!
You should resort to plain old fopen() + fread() instead.
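A minimal sketch of that streaming approach, assuming allow_url_fopen is enabled (the URL and destination path below are placeholders):

<?php
// Stream a remote file to disk in small chunks so memory use stays constant.
$url  = 'http://example.com/huge-archive.tar.gz'; // placeholder URL
$dest = '/tmp/huge-archive.tar.gz';               // placeholder destination

set_time_limit(0); // the download may take hours

$in  = fopen($url, 'rb');
$out = fopen($dest, 'wb');
if ($in === false || $out === false) {
    die('Could not open source or destination');
}

while (!feof($in)) {
    $chunk = fread($in, 8192); // read 8 KB at a time
    if ($chunk === false) {
        break;
    }
    fwrite($out, $chunk);
}

fclose($in);
fclose($out);

Only the current 8 KB chunk is ever held in memory, so the script's footprint stays flat regardless of the file size.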
Also, don't rule out using a third-party download tool like wget (installed by default on many Linux systems) and creating a cron task to run it. It's possibly the best way to automate a daily download.
You will also have to adapt your php.ini to accept larger file uploads, and raise your memory limit.
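Some of those limits can also be raised from the script itself at runtime; the values below are only illustrative:

<?php
// Illustrative runtime overrides for a long-running script.
// The upload-related directives (upload_max_filesize, post_max_size) can only
// be changed in php.ini or the server config, not via ini_set().
ini_set('memory_limit', '512M'); // only matters if the script buffers data itself
set_time_limit(0);               // remove the execution time limit

var_dump(ini_get('upload_max_filesize'), ini_get('post_max_size'));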
I made three ZIP files (one with no compression, one with minimal compression, and one with maximum compression).
I then had my script unzip each one and timed it. First I did it with ZipArchive, then with exec('unzip').
Every time, it was faster using ZipArchive than exec('unzip').
I thought that surely a native program would be faster than a library. Are there any switches I could use with exec('unzip') (that would make it behave like ZipArchive) to make it as fast?
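For what it's worth, a rough comparison along those lines might look like this (the archive name and output directories are placeholders; -o overwrites without prompting and -q suppresses unzip's per-file listing, which by itself can shave some time off the exec() variant):

<?php
// Rough timing comparison between ZipArchive and the external unzip binary.
$archive = 'test.zip'; // placeholder archive

// ZipArchive
$start = microtime(true);
$zip = new ZipArchive();
if ($zip->open($archive) === true) {
    $zip->extractTo('out_ziparchive/');
    $zip->close();
}
printf("ZipArchive:    %.3f s\n", microtime(true) - $start);

// exec('unzip'): -o overwrite without prompting, -q quiet output
$start = microtime(true);
exec('unzip -o -q ' . escapeshellarg($archive) . ' -d out_unzip/');
printf("exec('unzip'): %.3f s\n", microtime(true) - $start);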
Hi, I wanted to know whether uploading large files like videos (over 200 MB - 1 GB) from PHP is a good option after setting up the server configuration like post_max_size, execution time, etc. The reason I ask is that I read somewhere that when a large file is uploaded, best practice is to break that file into chunks and upload it (I think YouTube does that). Do I need to use another language like Python or C++ for uploading large files, or is PHP enough? If I need to use another language, can anyone please point me to reading material for it?
Thank you.
PHP will hold the entire file in memory while the upload is happening. That means that if you are uploading five such files in parallel, you could need 5 GB+ of memory.
This can be done in PHP, and I have done it using a chunking method. There are several SO questions on this topic:
File uploads; How to utilize “chunking”?
Upload 1GB files using chunking in PHP
But my personal preference is to use plupload. It is a very complete cross-platform (JS, Flash, Silverlight) upload script with a nice PHP code sample to handle chunking.
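The server side of such a chunking scheme can be quite small. A minimal sketch, assuming the client sends each chunk as a regular file upload along with chunk, chunks and name parameters (plupload sends fields along these lines, but treat the exact names here as placeholders):

<?php
// Receive one chunk of a large upload and append it to the target file.
// The 'chunk', 'chunks', 'name' and 'file' parameter names are placeholders
// for whatever your upload client actually sends.
$chunk  = isset($_REQUEST['chunk'])  ? (int) $_REQUEST['chunk']  : 0;
$chunks = isset($_REQUEST['chunks']) ? (int) $_REQUEST['chunks'] : 1;
$name   = basename(isset($_REQUEST['name']) ? $_REQUEST['name'] : 'upload.bin');

$target = __DIR__ . '/uploads/' . $name;

// Append this chunk to the partially assembled file (truncate on the first chunk).
$out = fopen($target, $chunk === 0 ? 'wb' : 'ab');
$in  = fopen($_FILES['file']['tmp_name'], 'rb');
while (!feof($in)) {
    fwrite($out, fread($in, 8192));
}
fclose($in);
fclose($out);

if ($chunk === $chunks - 1) {
    // Last chunk received: the file at $target is now complete.
    echo 'done';
}

Since each request only carries one chunk, neither PHP's memory use nor post_max_size has to scale with the full file size.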
It's not only PHP that has to be considered for large file uploads. Your web server also needs to support them; nginx, at least, limits the request body size. I don't know how httpd handles that, but as you said, splitting into chunks is a viable solution. FTP is another option.
I quickly wrote this up: http://www.ionfish.org/php-shrink/ . A user uploads a .php file with comments and whitespace in it, and it "minifies" it to reduce the file size.
It does exactly what http://devpro.it/remove_phpcomments/ does, except it's web-based and instant. You get a download prompt to save the processed file after you click upload.
My questions:
Is a 2-megabyte upload limit enough? Should I make it 4, 8, etc. ?
Is the output (requires you test it with some random PHP) satisfactory, or should it be tweaked?
Would there be any use for this for the general public, and should I add support to minify HTML, CSS, JS, and even C++ and Python etc. ?
EDIT: Changed to 250K for now, will see if it suffices.
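Worth noting: PHP ships with php_strip_whitespace(), which removes comments and extra whitespace on its own, so the core of such a service can be very small. A minimal sketch (the upload field name and output filename are placeholders, and validation is omitted):

<?php
// Strip comments and whitespace from an uploaded PHP file and send it back
// as a download. The 'phpfile' field name is a placeholder.
$tmp      = $_FILES['phpfile']['tmp_name'];
$minified = php_strip_whitespace($tmp); // built-in comment/whitespace stripper

header('Content-Type: text/plain');
header('Content-Disposition: attachment; filename="minified.php"');
header('Content-Length: ' . strlen($minified));
echo $minified;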
A file that needs such a minification is proof of bad practice.
The file size of a single PHP file should never be a problem when best practice is followed.
You should not be uploading files during your deployment anyway. Instead you should be checking out files from your VCS, and you don't want 'minified' files in your VCS.
Such a minification will not improve site performance either since every serious project uses opcode caching.
Conclusion: Such a service is not needed.
I'm developing a webapp in PHP, and the core library is 94kb in size at this point. While I think I'm safe for now, how big is too big? Is there a point where the script's size becomes an issue, and if so can this be ameliorated by splitting the script into multiple libraries?
I'm using PHP 5.3 and Ubuntu 10.04 32bit in my server environment, if that makes any difference.
I've googled the issue, and everything I can find pertains to PHP upload size only.
Thanks!
Edit: To clarify, the 94kb file is a single file that contains all my data access and business logic, and a small amount of UI code that I have yet to extract to its own file.
Do you mean you have one file that is 94 KB in size, or that your whole library is 94 KB in total?
Regardless, as long as you aren't piling everything into one file and you're organizing your library into different files, your file sizes should remain manageable.
If a single PHP file is starting to hit a few hundred KB, you have to think about why that file is getting so big and refactor the code to make sure that everything is logically organized.
I've used PHP applications that probably included several megabytes' worth of code; the main thing, if you have big programs, is to use a code caching tool such as APC on your production server. That will cache the compiled (bytecode) PHP so that it doesn't have to parse every file on every page request, and it will dramatically speed up your code.
I wrote a PHP script to dynamically pack files selected by the client into zip file and force a download. It works well except that when the number of files is huge (like over 50000), it takes a very long time for the download dialog box to appear on the client side.
I thought about improving this with a cache (these files are not changed very often), but because the selection of files is decided entirely by the user, and there are tens of thousands of possible combinations, it is very hard to cache them. I also thought about generating zip archives for the individual files first and then combining those zip files on the fly, but I didn't find a way to concatenate zip files in PHP. Another way I can think of is sending (i.e., letting the client read) the zip file at the same time as it is being generated; I also don't know if this is supported.
If someone could help me on this, I would really appreciate your help.
To extend Mike Sherov's answer, try using a combination of tar and Gzip/Zip. Individually pre-compress all the files using Gzip/Zip; then, when the client makes their selection, you simply tar those files together. That way you still get the benefit of compression and the simplicity of downloading one file, but none of the overhead and delay associated with compressing large files in real time.
While not a silver bullet, you can try tar'ing the files instead. The resulting file is larger, but compression time is much shorter. See here for more info: http://birdhouse.org/blog/2010/03/08/zip-vs-tar-gzip/
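A minimal sketch of that tar-on-the-fly idea using PHP's built-in PharData class (the file paths, archive name and the assumption that the files are already individually gzipped are placeholders):

<?php
// Bundle the user's pre-compressed selections into a single tar for download.
// $selected would come from the user's selection; the paths are placeholders.
$selected = ['data/report1.csv.gz', 'data/report2.csv.gz'];

$tarPath = sys_get_temp_dir() . '/selection_' . uniqid() . '.tar';
$tar = new PharData($tarPath);

foreach ($selected as $path) {
    // Store each already-gzipped file under its base name inside the tar.
    $tar->addFile($path, basename($path));
}

header('Content-Type: application/x-tar');
header('Content-Disposition: attachment; filename="selection.tar"');
header('Content-Length: ' . filesize($tarPath));
readfile($tarPath);
unlink($tarPath);

Since tar only concatenates the files with headers, this step is fast even for large selections; the expensive compression already happened once, up front.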
Check out mod_zip for Nginx:
https://github.com/evanmiller/mod_zip
It streams a ZIP file to the client dynamically and can include very large (2GB+) files while using very little RAM.
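With mod_zip the PHP side only has to emit a file list; nginx assembles and streams the archive. A rough sketch of what that response might look like, assuming the listed locations are servable by nginx (check the mod_zip README for the exact file-list format; paths and names here are placeholders):

<?php
// Hand the file list off to mod_zip; nginx then streams the ZIP itself.
// Each line: <crc-32 or -> <size in bytes> <location> <name inside the zip>
$files = [
    '/protected/report1.csv' => 'report1.csv', // placeholder locations
    '/protected/report2.csv' => 'report2.csv',
];

header('X-Archive-Files: zip');
header('Content-Disposition: attachment; filename="selection.zip"');

foreach ($files as $location => $name) {
    $size = filesize($_SERVER['DOCUMENT_ROOT'] . $location); // assumes files under the docroot
    echo "- {$size} {$location} {$name}\n";
}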