PHP Aborting when creating large .zip file - php

My PHP script runs on CentOS 5.6 with PHP 5.2.12 and uses ZipArchive(). It successfully creates .zip files over 1.6GB, but for an archive of 2GB or larger PHP aborts with no apparent error. There is nothing in the PHP error log or on stderr. The script is executed from the command line, not interactively.
The script runs for about 8 minutes while the temp archive grows. While I was checking the file size, the last listing showed the tmp file at 2120011776 bytes; then the tmp file disappeared and the PHP script fell through the logic and executed the code after the archive creation.
For some reason top shows the CPU still at 95% and a new tmp archive file being created. This goes on for another 5+ minutes, then it silently stops and leaves the incomplete tmp archive file behind. In this test there were fewer than 4000 expected files.
The script as noted works just fine creating smaller archive files.
Tested on several different sets of large source data - same result for large files.
This issue sounds similar to this question:
Size limit on PHP's zipArchive class?
I thought maybe the ls -l command was returning a count of 2K blocks and thus 2120011776 would be close to 4GB but that size is in bytes - the size of the xxxx.zip.tmpxx file.
Thanks!

It could be many things. I'm assuming that you have enough free disk space to handle the process. As others have mentioned, some of the possible problems can be fixed either by editing your php.ini file or by using the ini_set() function in the code itself.
How much memory does your machine have? If the script exhausts your actual memory, then it makes sense that it would consistently abort after a certain size. So check the free memory before the script starts and monitor it as the script executes.
A third option could be the file system itself. I don't have much experience with CentOS, but some file systems do not allow files over 2GB. From the product page, though, it seems like most file systems on CentOS can handle it.
A fourth option, which seems to be the most promising, also comes from the product page linked above: another possible culprit is the "Maximum x86 per-process virtual address space," which is approximately 3GB. On x86_64 it is about 2TB, so check the type of processor.
Again, it seems like the fourth option is the culprit.
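One quick way to check the 32-bit versus 64-bit question from inside PHP is to look at the integer width of the build. This is only a sketch: describeIntegerWidth() is a made-up helper name, and whether a 32-bit build actually explains the abort is an assumption. It is suggestive, though, that the last observed size (2120011776 bytes) is just under 2^31 bytes (2147483648).

```php
<?php
// Report whether this PHP build uses 32-bit or 64-bit integers.
// describeIntegerWidth() is a hypothetical helper, not part of PHP.
function describeIntegerWidth(): array
{
    return [
        'bits'     => PHP_INT_SIZE * 8,
        'int_max'  => PHP_INT_MAX,
        // On a 32-bit build PHP_INT_MAX is 2147483647, just under 2 GB.
        'is_32bit' => PHP_INT_SIZE === 4,
    ];
}

$info = describeIntegerWidth();
printf(
    "PHP %s, %d-bit integers, PHP_INT_MAX = %d\n",
    PHP_VERSION,
    $info['bits'],
    $info['int_max']
);
```

Running `uname -m` on the box shows the machine architecture, but a 64-bit machine can still run a 32-bit PHP build, so the check above looks at PHP itself.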

Have you set the limit variables in PHP?
You can set them in .htaccess or within the PHP script.
Inside the script: set_time_limit(0);
Inside the .htaccess: php_value memory_limit 214572800
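Since the script in the question runs from the command line, where .htaccess is not read, the in-script form is probably the more relevant one. A minimal sketch of setting both limits at runtime follows; the '256M' value is just an illustration, not a recommendation.

```php
<?php
// Lift the execution time limit for this script; 0 means no limit.
set_time_limit(0);

// Raise the memory limit for this script only.
// '256M' is an illustrative value chosen for this sketch.
$previous = ini_set('memory_limit', '256M');
if ($previous === false) {
    fwrite(STDERR, "Could not change memory_limit\n");
}

printf(
    "memory_limit = %s, max_execution_time = %s\n",
    ini_get('memory_limit'),
    ini_get('max_execution_time')
);
```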

When the files are big, building the ZIP archive takes time, and PHP enforces a maximum execution time (max_execution_time in php.ini), so try increasing that value.

There is a maximum execution time setting in php.ini (max_execution_time).
Perhaps this limit is being hit.
Try increasing the value.
The OS also has its own file size limits, so check those too.

Related

How to tell if a file has finished uploading in PHP?

I have a cron script that compresses images. It basically iterates over folders and compresses the files in each folder. My problem is that some images are getting processed halfway. My theory is that a user is uploading an image, and before the upload has finished, the compressor tries to compress the file. It therefore compresses a half-uploaded image, which results in half an image being displayed.
Is there a way in PHP to confirm that a file has finished uploading, so that I only do the compression once I know the file has been fully written?
Or alternatively, is there a way to check if a file is being used by another process?
Or alternatively, would it be reliable enough to look at when the file was "written to disk" and not process it until 10 minutes has gone by?
PHP doesn't trigger your action until the files are fully uploaded, but it is possible for your cron job to start interacting with files before they're fully saved.
When saving something from $_FILES, save it to a version with a . prefix on it to tag it as incomplete. Make sure your cron job skips any such files.
Then, once the save operation is complete, rename the file without the . prefix to make it eligible for processing.
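A rough sketch of that dot-prefix approach is below. The helper names (saveThenPublish, listReadyFiles) are made up for illustration, and copy() stands in for the upload step so the example can run outside a web request; in a real upload handler that line would be a move_uploaded_file() call on the entry from $_FILES.

```php
<?php
// Hypothetical helpers illustrating the "dot prefix until complete" idea.

// Write the file under a hidden temporary name, then rename it into place.
function saveThenPublish(string $sourcePath, string $destDir, string $name): bool
{
    $pending = $destDir . '/.' . $name; // tagged as incomplete
    $final   = $destDir . '/' . $name;  // eligible for processing

    // In an upload handler this step would instead be something like
    // move_uploaded_file($_FILES['image']['tmp_name'], $pending);
    // ('image' is an assumed form field name).
    if (!copy($sourcePath, $pending)) {
        return false;
    }

    // Only after the write has finished does the file get its real name.
    return rename($pending, $final);
}

// What the cron job would call: list only files without the dot prefix.
function listReadyFiles(string $dir): array
{
    $entries = scandir($dir);
    if ($entries === false) {
        return [];
    }

    $ready = [];
    foreach ($entries as $entry) {
        // Skips ".", "..", and any pending ".name" files.
        if ($entry[0] === '.' || !is_file($dir . '/' . $entry)) {
            continue;
        }
        $ready[] = $entry;
    }
    return $ready;
}
```

Because the pending file and the final file live in the same directory, the rename does not have to copy data across file systems.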
There are two ways to handle this scenario:
Flags
Set a flag on a file before modifying or writing it.
Our app handles lots of files. We set a flag on each file before taking it for processing, and once processing is done we remove the flag. Since the job runs on cron, a flag is the best way to process files.
Usually you can add an extra column for the flag to the table that tracks each file, or you can keep an array that stores all files currently being handled.
filemtime()
As you mentioned, you can check whether the file's mtime is more than 10 minutes older than the current time and only then compress it. But if some other process has the file open at the same time, the problem occurs again.
So it's better to go with a flag, unless other processes rarely modify the files.
You can use flock to ensure file is not in use, see here for example. Alternatively you can check whether an image is broken or corrupted see here.
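For the flock idea, a minimal sketch might look like the following. tryExclusiveLock() is a hypothetical helper name. One important assumption: flock() locks are advisory on most systems, so this only helps if the process writing the file also takes a lock on it. A plain upload written by the web server will not be blocked or detected this way.

```php
<?php
// Try to take a non-blocking exclusive lock on a file.
// Returns the open handle on success, or false if the file is missing
// or another process currently holds a lock on it.
// tryExclusiveLock() is a hypothetical helper for this sketch.
function tryExclusiveLock(string $path)
{
    if (!is_file($path)) {
        return false;
    }

    $handle = fopen($path, 'r+');
    if ($handle === false) {
        return false;
    }

    // LOCK_NB makes flock() return immediately instead of waiting.
    if (!flock($handle, LOCK_EX | LOCK_NB)) {
        fclose($handle);
        return false;
    }

    return $handle;
}

// Usage idea: process the file only while holding the lock, then
// release it with flock($handle, LOCK_UN) and fclose($handle).
```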

PHP: How to change the default time before an uploaded file gets deleted from the default tmp folder? [duplicate]

I am working on an upload script.
If a user uploads a file and it already exists I want to warn the user (this is all through ajax) and give them the option to replace it, or cancel.
Instead of moving the file,
I was curious if I could just leave the file in tmp and pass back the path to that file in the ajax response.
If the user chooses to overwrite the old file, the next ajax request passes the path back to PHP, which continues to work on the file.
For this to work, however, I need to know how long a file stays in PHP's tmp dir.
Files uploaded through POST are deleted right after php script finishes its execution.
According to php.net:
"The file will be deleted from the temporary directory at the end of the request if it has not been moved away or renamed."
For uploaded files, the manual states:
"The file will be deleted from the temporary directory at the end of the request if it has not been moved away or renamed."
Files that are to be kept should therefore be moved to another location.
More generally, as your question title might imply, temporary folders are left to be cleaned up by the system. This is true when using functions like tempnam or tmpfile, or simply when writing to the temporary directory (see sys_get_temp_dir).
In Ubuntu, this is done at every system reboot, or at a time interval, as defined in /etc/default/rcS.
In some Red Hat based distros, it is done using the tmpwatch utility from a cronjob. In others, the /tmp partition is mounted using the tmpfs filesystem, which is similar to a RAM disk (therefore being cleaned when the computer shuts down).
Another known mechanism is a size threshold, which means that the temporary directory will be cleaned up from the older files when it reaches a certain size.
There are three settings that need to be set in PHP to make sure that garbage collection of session files in the /tmp directory happens correctly (they apply to PHP session files, not to uploaded files):
session.gc_maxlifetime = 21600
session.gc_probability = 1
session.gc_divisor = 100
Set session.gc_maxlifetime to the number of seconds you want each session tmp file to last before it's deleted. If you log in to the admin in OpenCart, this is the number of seconds until you will automatically be logged out. For example, to set half an hour, multiply 60 seconds by 30 minutes, which gives a value of 1800 seconds.
The other two settings control when the garbage collector runs: with the values above it has a 1 in 100 chance (gc_probability / gc_divisor) of running each time a session starts. It's important that they are set to the values above if you're having problems with this.
More info here:
https://www.antropy.co.uk/blog/opencart-php-session-tmp-files-filling-up/

Streaming Log file with Ajax

I need to stream a log file that is located on an FTP server; it comes from a remote server.
I am not sure how to stream this, possibly with Ajax.
There are a few things on Google, but I cannot seem to find something that can access a remote FTP server and stream the file.
Maybe with Ajax and using intervals, then scrolling down to the bottom of the page.
Note that the log file is being updated constantly and people will also be sending commands to the server, thus updating the log file. Will refreshing the log and downloading the log each time be slow? Some log files can be very large.
Stop using the filesystem and implement publish-subscriber pattern. For a reference implementation see loggly or papertrail.
I would think you would need some sort of intermediate script to keep track of the last read line of the log file and then respond to the AJAX call with any updates to the file since that point.
My pseudo-code solution would look like this:
Read local cache file for last line number that was processed
Count number of lines in the file (using linux wc -l or similar)
Get the last X number of lines from the files as calculated from the difference (linux tail -n X or similar)
Update local cache file with last line number read.
Return the content to the caller.
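The steps above could be sketched in PHP roughly as follows. readNewLogLines() and both path parameters are hypothetical names, and the sketch assumes the log has already been fetched to a local file. It also swaps the wc -l and tail -n calls for PHP's file(), which loads the whole file into memory, so for very large logs the shell tools or a stored byte offset with fseek() would likely scale better.

```php
<?php
// Return the lines added to a log since the previous call.
// readNewLogLines(), $logPath and $cachePath are illustrative names.
function readNewLogLines(string $logPath, string $cachePath): array
{
    // Step 1: read the last processed line number (0 if no cache yet).
    $last = is_file($cachePath) ? (int) file_get_contents($cachePath) : 0;

    // Step 2: count the lines in the file.
    $lines = file($logPath, FILE_IGNORE_NEW_LINES);
    if ($lines === false) {
        return [];
    }
    $total = count($lines);

    // If the log shrank, assume it was truncated or rotated and start over.
    if ($total < $last) {
        $last = 0;
    }

    // Step 3: take only the lines after the last processed one.
    $new = array_slice($lines, $last);

    // Step 4: remember how far we got.
    file_put_contents($cachePath, (string) $total);

    // Step 5: return the new content to the caller.
    return $new;
}
```

An AJAX endpoint could then call this on each poll and send back something like json_encode(readNewLogLines($log, $cache)), with the page appending the lines and scrolling to the bottom.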

Get PHP to wait until a file is done transferring before moving it

I have a PHP script that moves files out of a specific folder on the server(an IBM AS400). The problem I am running into is that sometimes the script runs while the file is still in the process of being moved in to the folder.
Poor logic on my part assumed that if a file was "in use" that PHP wouldn't attempt to move it but it does which results in a corrupted file.
I thought I could do this:
$oldModifyTime = filemtime('thefile.pdf');
sleep(2);
if ($oldModifyTime === filemtime('thefile.pdf')) {
    rename('thefile.pdf', '/folder2/thefile.pdf');
}
But the filemtime functions come up with the same value even while the file is being written. I have also tried fileatime with the same results.
If I do Right Click->Properties in Windows the Modified Date and Access Date are constantly changing as the file is being written.
Any ideas how to determine if a file is finished transferring before doing anything to it?
From the PHP manual entry for filemtime():
Note: The results of this function are cached. See clearstatcache() for more details.
I would also suggest that 2 seconds is a bit short to detect whether the file transfer is complete due to network congestion, buffering, etc.
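Putting those two points together, a sketch of the check with the stat cache cleared and a longer wait might look like this. isFileStable() is a made-up helper name, and the 10 second default is an arbitrary illustration rather than a tested value. It compares the size as well as the mtime, which is an addition to the original snippet.

```php
<?php
// Report whether a file's size and mtime stay unchanged over a wait period.
// isFileStable() is a hypothetical helper; the default wait is illustrative.
function isFileStable(string $path, int $waitSeconds = 10): bool
{
    if (!is_file($path)) {
        return false;
    }

    // Drop any cached stat data so we read fresh values.
    clearstatcache(true, $path);
    $sizeBefore  = filesize($path);
    $mtimeBefore = filemtime($path);

    sleep($waitSeconds);

    // Clear the cache again, otherwise PHP returns the same cached values.
    clearstatcache(true, $path);

    return $sizeBefore === filesize($path)
        && $mtimeBefore === filemtime($path);
}

// Usage idea:
// if (isFileStable('thefile.pdf')) {
//     rename('thefile.pdf', '/folder2/thefile.pdf');
// }
```

A transfer that stalls for longer than the wait period would still look finished, so this reduces the risk rather than removing it.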
Transfer it as a temporary name or to a different folder, then rename/copy it to the correct folder after the transfer is complete.

Detect with php if files is being uploaded, or is open

I have a PHP script that opens a local directory in order to copy and process some files. But these files may be incomplete, because they are being uploaded by a slow FTP process, and I do not want to copy or process any files which have not been completely uploaded yet.
Is it possible in PHP to find out if a file is still being copied (that is, read from) or written to?
I need my script to process only those files that have been completely uploaded.
The FTP process now uploads files in parallel, and it takes more than 1 second for each file's size to change, so this trick no longer works for me. Can anyone suggest another method?
Do you have script control over the FTP process? If so, have the script that's doing the uploading upload a [FILENAME].complete file (blank text file) after the primary upload completes, so the processing script knows that the file is complete if there's a matching *.complete file there also.
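On the processing side, the marker-file idea could be sketched like this. listCompletedUploads() is a hypothetical helper, and it assumes the uploader writes the marker as the data file name plus a ".complete" suffix (for example report.csv and report.csv.complete).

```php
<?php
// List data files that have a matching "<name>.complete" marker beside them.
// listCompletedUploads() and the marker naming are assumptions for this sketch.
function listCompletedUploads(string $dir, string $markerSuffix = '.complete'): array
{
    $markers = glob($dir . '/*' . $markerSuffix);
    if ($markers === false) {
        return [];
    }

    $done = [];
    foreach ($markers as $marker) {
        // Strip the suffix to get the path of the data file.
        $dataFile = substr($marker, 0, -strlen($markerSuffix));
        if (is_file($dataFile)) {
            $done[] = $dataFile;
        }
    }
    return $done;
}
```

After processing a file, the script would probably also delete or move its marker so the same file is not picked up again on the next run.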
+1 to @MidnightLightning for his excellent suggestion. If you don't have control over the process, you have a couple of options:
If you know what the final size of the file should be then use filesize() to compare the current size to the known size. Keep checking until they match.
If you don't know what the final size should be it gets a little trickier. You could use filesize() to check the size of the file, wait a second or two and check it again. If the size hasn't changed then the upload should be complete. The problem with the second method is if your file upload stalls for whatever reason it could give you a false positive. So the time to wait is key.
You don't specify what kind of OS you're on, but if it's a Unix-type box, you should have fuser and/or lsof available. fuser will report on who's using a particular file, and lsof will list all open files (including sockets, fifos, .so's, etc...). Either of those could most likely be used to monitor your directory.
On the Windows end, there are a few free tools from Sysinternals that do the same thing. handle might do the trick.
