I'm working on extracting a zip archive with PHP. The structure of the archive is seven folders, each of which contains on the order of 10,000 files, each around 1 kB.
My code is pretty simple and uses the ZipArchive class:
$zip = new ZipArchive();
$result = $zip->open($filename);
if ($result === true) {
    $zip->extractTo($tmpdir);
    $zip->close();
}
The problem I'm having, though, is that the extraction seems to halt. The first folder is fully extracted, but only about half of the second one is. None of the other five are extracted at all.
I also tried using this code, which breaks it into chunks of 10 kB at a time, but got the exact same result:
$archive = zip_open($filename);
while ($entry = zip_read($archive)) {
    $size = zip_entry_filesize($entry);
    $name = zip_entry_name($entry);
    if (substr($name, -1) == '/') {
        if (!file_exists($tmpdir . $name)) mkdir($tmpdir . $name);
    } else {
        $unzipped = fopen($tmpdir . $name, 'wb');
        while ($size > 0) {
            $chunkSize = ($size > 10240) ? 10240 : $size;
            $size -= $chunkSize;
            $chunk = zip_entry_read($entry, $chunkSize);
            if ($chunk !== false) fwrite($unzipped, $chunk);
        }
        fclose($unzipped);
    }
}
I've also tried increasing PHP's memory limit from 512 MB to 1024 MB, but again got the same result. Unzipped, everything is around 100 MB in total, so I wouldn't anticipate it being a memory issue anyway.
It's probably your max execution time. You can disable the limit completely by setting it to 0, or set a generous value:
ini_set('max_execution_time', 10000);
Don't set it to 0 in production, though.
If you don't have access to ini_set() because of the disable_functions directive, you may have to edit the value in your php.ini directly.
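Put together with the extraction code from the question, that might look like this minimal sketch (the paths and the 600-second budget are assumptions, not values from the question):

```php
<?php
// Raise the execution time limit for this script only. set_time_limit()
// also resets the time already spent; 0 would disable the limit entirely
// (avoid that in production).
set_time_limit(600);

$filename = 'archive.zip';     // hypothetical archive path
$tmpdir   = '/tmp/extracted/'; // hypothetical target directory

$zip = new ZipArchive();
if ($zip->open($filename) === true) {
    $zip->extractTo($tmpdir);
    $zip->close();
}
```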
I have to zip search results containing at most 10,000 files, with a total size of well over 1 GB.
I create a zip archive and, in a for loop, read each file with fread and add the result to the archive.
I never finish adding files because of this error:
PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 1257723 bytes)
but I don't think raising the memory_limit value in php.ini by 1 GB or more is a solution, because memory resources are limited.
Since the zip file stays in memory until it is closed (or so I read in another question), I changed the code to create a series of 50 MB zip files to keep memory usage down. But even though the script creates another zip file, it stops with the same PHP fatal error on the same file (the 174th).
Why?
Am I doing something wrong?
Any help will be appreciated.
Here is a code snippet of the file creation
$zip = new ZipArchive();
$nomeZipFile = "../tmp/" . $title . ".zip";
for ($x = 0; $x < count($risultato); $x++) {
    $numFiles = 0;
    $dir = '../tmp';
    // roll over to a new archive once the current one exceeds 50 MB
    if (file_exists($nomeZipFile) && filesize($nomeZipFile) > 52428800) {
        // count the files already in the folder to number the next archive
        if ($handle = opendir($dir)) {
            while (($file = readdir($handle)) !== false) {
                if (!in_array($file, array('.', '..')) && !is_dir($dir . $file))
                    $numFiles++;
            }
        }
        $nomeZipFile = '../tmp/' . $title . $numFiles . '.zip';
    }
    $res = $zip->open($nomeZipFile, ZipArchive::CREATE);
    ...
    // adding the file
    $fileDownload = "";
    $fDownload = fopen($kt_response->message, "r"); // the file is downloaded through a webservice
    while (!feof($fDownload)) { $fileDownload .= fread($fDownload, 1024); flush(); }
    $zip->addFromString($filename, $fileDownload);
    ....
    $zip->close();
}
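For comparison, a hedged sketch (file names are hypothetical) of adding files by path instead of by string: ZipArchive::addFile() only records the path and streams the contents from disk when close() writes the archive, so a large download never has to sit in a PHP variable.

```php
<?php
// Stand-in for a file fetched from the webservice and saved to disk first.
$src = sys_get_temp_dir() . '/sample.txt';
file_put_contents($src, 'downloaded payload');

$zipPath = sys_get_temp_dir() . '/batch.zip';
$zip = new ZipArchive();
$zip->open($zipPath, ZipArchive::CREATE | ZipArchive::OVERWRITE);
$zip->addFile($src, basename($src)); // path only; contents are read at close()
$zip->close();
```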
I am using the script below to split a large zip file into small chunks.
$filename = "pro.zip";
$targetfolder = '/tmp';

// File size in MB per piece/split.
// For a 200 MB file, if $piecesize = 10 it will create twenty 10 MB files.
$piecesize = 10; // split file size in MB

$buffer = 1024;
$piece = 1048576 * $piecesize;
$current = 0;
$splitnum = 1;

if (!file_exists($targetfolder)) {
    if (mkdir($targetfolder)) {
        echo "Created target folder $targetfolder" . br();
    }
}

if (!$handle = fopen($filename, "rb")) {
    die("Unable to open $filename for read! Make sure you edited filesplit.php correctly!" . br());
}

$base_filename = basename($filename);
$piece_name = $targetfolder . '/' . $base_filename . '.' . str_pad($splitnum, 3, "0", STR_PAD_LEFT);
if (!$fw = fopen($piece_name, "w")) {
    die("Unable to open $piece_name for write. Make sure target folder is writeable." . br());
}

echo "Splitting $base_filename into $piecesize MB files " . br() . "(last piece may be smaller in size)" . br();
echo "Writing $piece_name..." . br();

while (!feof($handle) and $splitnum < 999) {
    if ($current < $piece) {
        if ($content = fread($handle, $buffer)) {
            if (fwrite($fw, $content)) {
                $current += $buffer;
            } else {
                die("filesplit.php is unable to write to target folder");
            }
        }
    } else {
        fclose($fw);
        $current = 0;
        $splitnum++;
        $piece_name = $targetfolder . '/' . $base_filename . '.' . str_pad($splitnum, 3, "0", STR_PAD_LEFT);
        echo "Writing $piece_name..." . br();
        $fw = fopen($piece_name, "w");
    }
}

fclose($fw);
fclose($handle);
echo "Done! " . br();
exit;

function br() {
    return (!empty($_SERVER['SERVER_SOFTWARE'])) ? '<br>' : "\n";
}
?>
But this script is not creating the small files in the target temp folder after the split. The script runs successfully without any error.
Please help me find out what the issue is here. Or, if you have any other working script with similar functionality, please share it.
As indicated in the comments above, you can use split to split a file into smaller pieces, and can then use cat to join them back together.
split -b50m filename x
and to put them back
cat xaa xab xac > filename
If you are looking to split the zipfile into a spanning-type archive, so that you do not need to rejoin the pieces yourself, take a look at zipsplit:
zipsplit -n (size) filename
You can just call zipsplit from your exec script, and most standard unzip utilities should be able to read the result directly. See man zipsplit for more options, including setting the output path, etc.
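Invoked from PHP, that might look like this sketch (the archive name and the 50 MB piece size are assumptions, and zipsplit must be installed):

```php
<?php
// -n takes the maximum piece size in bytes; 52428800 bytes = 50 MB.
$cmd = 'zipsplit -n 52428800 ' . escapeshellarg('archive.zip');
exec($cmd, $output, $status); // $status is non-zero if zipsplit is absent or fails
```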
I could use getimagesize() to validate an image, but the problem is: what if a mischievous user puts in a link to a 10 GB random file? It would whack my production server's bandwidth. How do I limit the file size getimagesize() fetches? (e.g. a 5 MB maximum image size)
PS: I did research before asking.
You can download the file separately, imposing a maximum size you wish to download:
function mygetimagesize($url, $max_size = -1)
{
    // create temporary file to store data from $url
    if (false === ($tmpfname = tempnam(sys_get_temp_dir(), uniqid('mgis')))) {
        return false;
    }
    // open input and output
    if (false === ($in = fopen($url, 'rb')) || false === ($out = fopen($tmpfname, 'wb'))) {
        unlink($tmpfname);
        return false;
    }
    // copy at most $max_size bytes
    stream_copy_to_stream($in, $out, $max_size);
    // close input and output file
    fclose($in); fclose($out);
    // retrieve image information
    $info = getimagesize($tmpfname);
    // get rid of temporary file
    unlink($tmpfname);
    return $info;
}
You don't want to do something like getimagesize('http://example.com') to begin with, since this will download the image once, check the size, then discard the downloaded image data. That's a real waste of bandwidth.
So, separate the download process from the checking of the image size. For example, use fopen to open the image URL, read little by little and write it to a temporary file, keeping count of how much you have read. Once you cross 5MB and are still not finished reading, you stop and reject the image.
You could try to read the HTTP Content-Length header before starting the actual download to weed out obviously large files, but you cannot rely on it, since it can be spoofed or omitted.
Here is an example; you will need to make some changes to fit your requirements.
function getimagesize_limit($url, $limit)
{
    global $phpbb_root_path;
    $tmpfilename = tempnam($phpbb_root_path . 'store/', unique_id() . '-');
    $fp = fopen($url, 'r');
    if (!$fp) return false;
    $tmpfile = fopen($tmpfilename, 'w');
    $size = 0;
    while (!feof($fp) && $size < $limit)
    {
        $content = fread($fp, 8192);
        $size += strlen($content); // count the bytes actually read, not the buffer size
        fwrite($tmpfile, $content);
    }
    fclose($fp);
    fclose($tmpfile);
    $is = getimagesize($tmpfilename);
    unlink($tmpfilename);
    return $is;
}
How can I get a web image's size in KB with PHP?
getimagesize() only gets the width and height, and filesize() causes a warning:
$imgsize = filesize("http://static.adzerk.net/Advertisers/2564.jpg");
echo $imgsize;
Warning: filesize() [function.filesize]: stat failed for http://static.adzerk.net/Advertisers/2564.jpg
Is there any other way to get a web image's size in KB?
Short of doing a complete HTTP request, there is no easy way:
$img = get_headers("http://static.adzerk.net/Advertisers/2564.jpg", 1);
print $img["Content-Length"];
You can likely utilize cURL however to send a lighter HEAD request instead.
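A hedged sketch of that idea (the 5-second timeout is an assumption): with CURLOPT_NOBODY set, cURL issues a HEAD request, so only the headers cross the wire.

```php
<?php
// Returns the advertised Content-Length in bytes, or -1 when unknown/unreachable.
function remote_size_via_head($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);         // HEAD request: headers only
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // don't echo anything
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);           // don't hang on dead hosts
    $ok  = curl_exec($ch);
    $len = curl_getinfo($ch, CURLINFO_CONTENT_LENGTH_DOWNLOAD);
    curl_close($ch);
    return ($ok === false || $len < 0) ? -1 : (int) $len;
}
// e.g. remote_size_via_head("http://static.adzerk.net/Advertisers/2564.jpg")
```

Remember that the advertised length can be spoofed, so keep a hard limit during the actual download as well.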
<?php
$file_size = filesize($_SERVER['DOCUMENT_ROOT']."/Advertisers/2564.jpg"); // Get file size in bytes
$file_size = $file_size / 1024; // Get file size in KB
echo $file_size; // Echo file size
?>
Not sure about using filesize() for remote files, but there are good snippets on php.net about using cURL.
http://www.php.net/manual/en/function.filesize.php#92462
That sounds like a permissions issue because filesize() should work just fine.
Here is an example:
php > echo filesize("./9832712.jpg");
1433719
Make sure the permissions are set correctly on the image and that the path is also correct. You will need to apply some math to convert from bytes to KB but after doing that you should be in good shape!
Here is a good link regarding filesize()
You cannot use filesize() to retrieve remote file information. It must first be downloaded or determined by another method
Using Curl here is a good method:
Tutorial
You can also use this function:
<?php
$filesize = file_get_size($dir . '/' . $ff);
$filesize = $filesize / 1024; // to convert to KB
echo $filesize;

function file_get_size($file) {
    //open file
    $fh = fopen($file, "r");
    //declare some variables
    $size = "0";
    $char = "";
    //set file pointer to 0; I'm a little bit paranoid, you can remove this
    fseek($fh, 0, SEEK_SET);
    //set multiplicator to zero
    $count = 0;
    while (true) {
        //jump 1 MB forward in file
        fseek($fh, 1048576, SEEK_CUR);
        //check if we actually left the file
        if (($char = fgetc($fh)) !== false) {
            //if not, go on
            $count++;
        } else {
            //else jump back where we were before leaving and exit loop
            fseek($fh, -1048576, SEEK_CUR);
            break;
        }
    }
    //we could make $count jumps, so the file is at least $count * 1.000001 MB large
    //1048577 because we jump 1 MB and fgetc goes 1 B forward too
    $size = bcmul("1048577", $count);
    //now count the last few bytes; they're always less than 1048576 so it's quite fast
    $fine = 0;
    while (false !== ($char = fgetc($fh))) {
        $fine++;
    }
    //and add them
    $size = bcadd($size, $fine);
    fclose($fh);
    return $size;
}
?>
You can get the file size by using the get_headers() function. Use the code below:
$image = get_headers($url, 1);
$bytes = $image["Content-Length"];
$mb = $bytes/(1024 * 1024);
echo number_format($mb,2) . " MB";
I'm trying to unzip a 14MB archive with PHP with code like this:
$zip = zip_open("c:\kosmas.zip");
while ($zip_entry = zip_read($zip)) {
    $fp = fopen("c:/unzip/import.xml", "w");
    if (zip_entry_open($zip, $zip_entry, "r")) {
        $buf = zip_entry_read($zip_entry, zip_entry_filesize($zip_entry));
        fwrite($fp, "$buf");
        zip_entry_close($zip_entry);
        fclose($fp);
        break;
    }
    zip_close($zip);
}
It fails on my localhost with a 128 MB memory limit with the classic "Allowed memory size of blablabla bytes exhausted". On the server, I've got a 16 MB limit; is there a better way to do this so that I could fit into it? I don't see why this has to allocate more than 128 MB of memory. Thanks in advance.
Solution:
I started reading the files in 10 kB chunks; problem solved, with peak memory usage around 1.5 MB.
$filename = 'c:\kosmas.zip';
$archive = zip_open($filename);
while ($entry = zip_read($archive)) {
    $size = zip_entry_filesize($entry);
    $name = zip_entry_name($entry);
    $unzipped = fopen('c:/unzip/' . $name, 'wb');
    while ($size > 0) {
        $chunkSize = ($size > 10240) ? 10240 : $size;
        $size -= $chunkSize;
        $chunk = zip_entry_read($entry, $chunkSize);
        if ($chunk !== false) fwrite($unzipped, $chunk);
    }
    fclose($unzipped);
}
Why do you read the whole file at once?
$buf = zip_entry_read($zip_entry, zip_entry_filesize($zip_entry));
fwrite($fp,"$buf");
Try reading it in small chunks and writing them to a file.
Just because a zip is smaller than PHP's memory limit, and perhaps the unzipped contents are as well, doesn't take account of PHP's general overhead and, more importantly, the memory needed to actually unzip the file; whilst I'm no expert on compression, I'd expect that may well be a lot more than the final unzipped size.
For a file of that size, perhaps it is better if you use shell_exec() instead:
shell_exec('unzip archive.zip -d /destination_path');
PHP must not be running in safe mode and you must have access to both shell_exec and unzip for this method to work.
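If the archive path comes from user input, it should be shell-quoted first; a small sketch (the paths are hypothetical):

```php
<?php
$archive = 'my archive.zip';   // hypothetical, may contain spaces or quotes
$dest    = '/destination_path';
$cmd = 'unzip ' . escapeshellarg($archive) . ' -d ' . escapeshellarg($dest);
shell_exec($cmd); // harmless no-op here if unzip or the archive is missing
```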
Update:
Given that command line tools are not available, all I can think of is to create a script and send the file to a remote server where command line tools are available, extract the file and download the contents.
function my_unzip($full_pathname) {
    $unzipped_content = '';
    $zd = gzopen($full_pathname, "r");
    while ($zip_file = gzread($zd, 10000000)) {
        $unzipped_content .= $zip_file;
    }
    gzclose($zd);
    return $unzipped_content;
}