I am building a system for people to upload .tar (and .tar.gz, .tar.bz2, .zip, etc) files in PHP. Uploading the files is fine, but I would like to list files contained in the archive after it has been uploaded.
Can someone recommend a good PHP library that can read file archives?
I found File_Archive on PEAR but it hasn't been updated in a few years. ZipArchive works great for .zip files, but I need something that can handle more file types.
Update: I'm running on RHEL6, PHP 5.2, and Apache 2.2.
You can do this with the PharData class:
// Example: list files
$archive = new PharData('/some/file.tar.gz');
foreach($archive as $file) {
echo "$file\n";
}
This even works with the phar:// stream wrapper:
$list = scandir('phar:///some/file.tar.gz');
$fd = fopen('phar:///some/file.tar.gz/some/file/in/the/archive', 'r');
$contents = file_get_contents('phar:///some/file.tar.gz/some/file/in/the/archive');
If you don't have Phar, check the PHP-only implementation, or the pecl extension.
Don't try to build this yourself. Use an existing class like http://pear.php.net/package/Archive_Tar to handle that for you.
The code below reads a gzip-compressed (.gz) file:
<?php
$z = gzopen('zipfile.gz','r') or die("can't open: $php_errormsg");
$string = '';
// compare against false so that a line containing only "0" does not end the loop
while (($line = gzgets($z,1024)) !== false) {
    $string .= $line;
}
echo $string;
gzclose($z) or die("can't close: $php_errormsg");
?>
Note that you need to have the zlib extension of PHP enabled for this code to work.
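A slightly more defensive variant of the same idea, as a sketch only: it reads fixed-size chunks with gzread() instead of lines, so binary payloads are handled as well. The function name read_gz_file and the default chunk size are illustrative assumptions, not part of the answer above.

```php
<?php
// Hypothetical helper: return the decompressed contents of a .gz file.
// Requires the zlib extension.
function read_gz_file($path, $chunkSize = 8192)
{
    $handle = gzopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $contents = '';
    // gzeof() becomes true once the whole stream has been decompressed
    while (!gzeof($handle)) {
        $chunk = gzread($handle, $chunkSize);
        if ($chunk === false) {
            gzclose($handle);
            throw new RuntimeException("Cannot read from $path");
        }
        $contents .= $chunk;
    }

    gzclose($handle);
    return $contents;
}
```

Note that this only undoes the gzip layer. For a .tar.gz the result is the raw tar stream, which still has to be parsed by something like PharData or Archive_Tar.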
I don't think the first answer works, or at least it doesn't work for me: I could not read the file contents while iterating over the archive with foreach. My working code is below.
$fh = new RecursiveIteratorIterator(
new RecursiveDirectoryIterator('phar:///dir/file.tar.gz'),
RecursiveIteratorIterator::CHILD_FIRST
);
foreach ($fh as $splFileInfo) {
echo file_get_contents($splFileInfo->getPathname());
}
This works for gz, zip, tar and bz files.
Use the zlib extension
Related
I have a .tar.gz file downloaded from an external API which we have to implement. It contains images for an object.
I'm not sure how they managed to compress it this way, but the files are basically prefixed with the "current directory". It looks like this in WinRAR:
And like this in 7-Zip, note the .tar first level, and "." second level:
When calling
$file = 'archive.tar.gz';
$phar = new PharData($file, FilesystemIterator::CURRENT_AS_FILEINFO);
var_dump($phar->offsetGet('./12613_s_cfe3e73.jpg'));
I get the exception:
Cannot access phar file entry '/12613_s_cfe3e73.jpg' in archive '{...}/archive.tar.gz'
Calling a file which does not exist, e.g.:
var_dump($phar->offsetGet('non-existent.jpg'));
Or calling it without the directory separator, e.g.:
var_dump($phar->offsetGet('12613_s_cfe3e73.jpg'));
I get a
Entry 12613_s_cfe3e73.jpg does not exist
Exception.
It is not possible to get the archive formatted differently. Does anyone have an idea how to solve this?
Ended up using Archive_Tar. There must be something wrong in the source code of PHP, though I don't think this is the "normal" way of packaging a .tar either.
Unfortunately I'm not very good at C, but it's probably in here (line 1214) or here.
This library seems to handle it just fine, using this example code:
$file = 'archive.tar.gz';
$zip = new Archive_Tar($file);
foreach ($zip->listContent() as $file) {
echo $file['filename'] . '<br>';
}
Result:
./12613_s_f3b483d.jpg
./12613_s_cfe3e73.jpg
./1265717_s_db141dc.jpg
./1265717_s_af5de56.jpg
./1265717_s_b783547.jpg
./1265717_s_35b11f9.jpg
./1265716_s_83ef572.jpg
./1265716_s_9ac2725.jpg
./1265716_s_c5af3e9.jpg
./1265716_s_c070da3.jpg
./1265715_s_4339e8a.jpg
Note the filenames are still prefixed with "./" just like they are in WinRAR.
If you want to stick to using PharData, I suggest a more conservative, two-step approach, where you first decompress the gz and then unarchive all files of the tar to a target folder.
// decompress gz archive to get "/path/to/my.tar" file
$gz = new PharData('/path/to/my.tar.gz');
$gz->decompress();
// unarchive all files from the tar to the target path
$tar = new PharData('/path/to/my.tar');
$tar->extractTo('/target/path');
But it looks like you want to select individual files from the tar.gz archive directly, right?
It should work using fopen() with a stream wrapper (compress.zlib or phar) and selecting the individual file. Some examples:
$f = fopen("compress.zlib://http://some.website.org/my.gz/file/in/the/archive", "r");
$f = fopen('phar:///path/to/my.tar.gz//file/in/archive', 'r');
$filecontent = file_get_contents('phar:///some/my.tar.gz/some/file/in/the/archive');
Streaming should also work, when using Iterators:
$rdi = new RecursiveDirectoryIterator('phar:///path/to/my.tar.gz');
$rii = new RecursiveIteratorIterator($rdi, RecursiveIteratorIterator::CHILD_FIRST);
foreach ($rii as $splFileInfo){
echo file_get_contents($splFileInfo->getPathname());
}
The downside is that you have to buffer the stream and save it to a file yourself.
It's not a direct file extraction to a target folder.
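If a single entry does need to end up on disk, PharData::extractTo() also accepts an entry name as its second argument. The following is a minimal sketch under the assumption that the entry names in the archive are regular relative paths (not prefixed with "./" as in the question). The function name extract_single_entry is made up for illustration.

```php
<?php
// Hypothetical helper: extract one named entry from a tar / tar.gz / zip
// archive into $targetDir and return the path of the extracted file.
function extract_single_entry($archivePath, $entryName, $targetDir)
{
    $archive = new PharData($archivePath);

    // Second argument: the entry to extract (null would extract everything).
    // Third argument: overwrite an already existing file.
    // extractTo() throws a PharException on failure.
    $archive->extractTo($targetDir, $entryName, true);

    return rtrim($targetDir, '/') . '/' . $entryName;
}
```

The entry name has to match the name stored in the archive exactly, including any directory part.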
I have a BASE64 string of a zip file that contains one single XML file.
Any ideas on how I could get the contents of the XML file without having to deal with files on the disk?
I would like very much to keep the whole process in the memory as the XML only has 1-5k.
It would be annoying to have to write the zip, extract the XML and then load it up and delete everything.
I had a similar problem, I ended up doing it manually.
https://www.pkware.com/documents/casestudies/APPNOTE.TXT
This extracts a single file (just the first one), no error/crc checks, assumes deflate was used.
// zip in a string
$data = file_get_contents('test.zip');
// magic
$head = unpack("Vsig/vver/vflag/vmeth/vmodt/vmodd/Vcrc/Vcsize/Vsize/vnamelen/vexlen", substr($data,0,30));
$filename = substr($data,30,$head['namelen']);
$raw = gzinflate(substr($data,30+$head['namelen']+$head['exlen'],$head['csize']));
// first file uncompressed and ready to use
file_put_contents($filename,$raw);
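The same approach with the missing checks added, as a sketch rather than a complete ZIP reader: it verifies the local file header signature, refuses entries that use a data descriptor (where the sizes in the local header are zero), supports only the "stored" and "deflate" methods, and compares the CRC-32. The function name zip_first_entry and the returned array shape are illustrative assumptions.

```php
<?php
// Hypothetical helper: return name and contents of the first entry of a ZIP
// archive held in a string. Needs only the zlib extension.
function zip_first_entry($data)
{
    if (strlen($data) < 30) {
        throw new InvalidArgumentException('Data too short for a ZIP local file header');
    }
    $head = unpack(
        'Vsig/vver/vflag/vmeth/vmodt/vmodd/Vcrc/Vcsize/Vsize/vnamelen/vexlen',
        substr($data, 0, 30)
    );
    // "PK\x03\x04" read as a little-endian 32-bit integer
    if ($head['sig'] !== 0x04034b50) {
        throw new InvalidArgumentException('Missing local file header signature');
    }
    // Flag bit 3: sizes and CRC are stored after the data instead
    if ($head['flag'] & 0x08) {
        throw new RuntimeException('Entries with a data descriptor are not supported');
    }

    $name = substr($data, 30, $head['namelen']);
    $compressed = substr($data, 30 + $head['namelen'] + $head['exlen'], $head['csize']);
    if (strlen($compressed) !== $head['csize']) {
        throw new RuntimeException('Archive is truncated');
    }

    if ($head['meth'] === 0) {          // stored, no compression
        $raw = $compressed;
    } elseif ($head['meth'] === 8) {    // deflate
        $raw = gzinflate($compressed);
        if ($raw === false) {
            throw new RuntimeException('Could not inflate entry');
        }
    } else {
        throw new RuntimeException('Unsupported compression method ' . $head['meth']);
    }

    // Compare as unsigned strings so the check also works on 32-bit builds
    if (sprintf('%u', crc32($raw)) !== sprintf('%u', $head['crc'])) {
        throw new RuntimeException('CRC mismatch');
    }

    return array('name' => $name, 'data' => $raw);
}
```

For the base64 case from the question, the input would be the decoded string, e.g. zip_first_entry(base64_decode($base64_string)).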
After some hours of research I think it is, surprisingly, not possible to handle a zip without a temporary file:
The first try with php://memory will not work, because it's a stream that cannot be read by functions like file_get_contents() or ZipArchive::open(). In the comments is a link to the PHP bug tracker about the missing documentation of this problem.
ZipArchive has stream support via ::getStream(), but as stated in the manual, it only supports read operations on an opened file. So you cannot build an archive on the fly with that.
The zip:// wrapper is also read-only: Create ZIP file with fopen() wrapper
I also made some attempts with the other PHP wrappers/protocols like
file_get_contents("zip://data://text/plain;base64,{$base64_string}#test.txt")
$zip->open("php://filter/read=convert.base64-decode/resource={$base64_string}")
$zip->open("php://filter/read=/resource=php://memory")
but for me they don't work at all, even if there are examples like that in the manual. So you have to swallow the pill and create a temporary file.
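The temporary-file route can at least be kept short-lived. The following is a sketch, assuming the zip extension is available and that the wanted XML is the first entry of the archive. The function name first_entry_from_base64_zip is made up for illustration.

```php
<?php
// Hypothetical helper: decode a base64-encoded ZIP, write it to a temporary
// file, read the first entry with ZipArchive and remove the file again.
function first_entry_from_base64_zip($base64)
{
    $binary = base64_decode($base64, true);
    if ($binary === false) {
        throw new InvalidArgumentException('Input is not valid base64');
    }

    $tmp = tempnam(sys_get_temp_dir(), 'zip');
    if ($tmp === false || file_put_contents($tmp, $binary) === false) {
        throw new RuntimeException('Cannot write temporary file');
    }

    $zip = new ZipArchive();
    $result = $zip->open($tmp);
    if ($result !== true) {
        unlink($tmp);
        throw new RuntimeException("Cannot open ZIP (error code $result)");
    }

    // Index 0 is the first entry in the archive
    $contents = $zip->getFromIndex(0);
    $zip->close();
    unlink($tmp);

    if ($contents === false) {
        throw new RuntimeException('Cannot read first entry');
    }
    return $contents;
}
```

For a 1-5k XML file the cost of the temporary file is small, and the file is gone again before the function returns.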
Original Answer:
This is just a way of storing the data temporarily. I hope you can manage the zip handling and the parsing of the XML on your own.
Use the php://memory (doc) wrapper. Be aware that this is only useful for small files, because the data is stored in memory. Otherwise use php://temp instead.
<?php
// the decoded content of your zip file
$text = 'base64 _decoded_ zip content';
// this will empty the memory and append your zip content
$written = file_put_contents('php://memory', $text);
// bytes written to memory
var_dump($written);
// new instance of the ZipArchive
$zip = new ZipArchive;
// success of the archive reading
var_dump(true === $zip->open('php://memory'));
toster-cx had it right, you should award him the points. This is an example where the zip comes from a SOAP response as a byte array (binary), and the content is an XML file:
$objResponse = $objClient->__soapCall("sendBill", array($parameters)); // $parameters: your request parameters
$fileData=unzipByteArray($objResponse->applicationResponse);
header("Content-type: text/xml");
echo $fileData;
function unzipByteArray($data){
/* the first entry is a directory */
$head = unpack("Vsig/vver/vflag/vmeth/vmodt/vmodd/Vcrc/Vcsize/Vsize/vnamelen/vexlen", substr($data,0,30));
$filename = substr($data,30,$head['namelen']);
$if=30+$head['namelen']+$head['exlen']+$head['csize'];
/* the second entry is the actual file */
$head = unpack("Vsig/vver/vflag/vmeth/vmodt/vmodd/Vcrc/Vcsize/Vsize/vnamelen/vexlen", substr($data,$if,30));
$raw = gzinflate(substr($data,$if+$head['namelen']+$head['exlen']+30,$head['csize']));
/* you could loop here and continue decompressing further files, if there are any */
return $raw;
}
If you know the file name inside the .zip, just do this:
<?php
$xml = file_get_contents('zip://./your-zip.zip#your-file.xml');
If you have a plain string, just do this:
<?php
$xml = file_get_contents('compress.zlib://data://text/plain;base64,'.$base64_encoded_string);
[edit] Documentation is there: http://www.php.net/manual/en/wrappers.php
From the comments: if you don't have a base64 encoded string, you need to urlencode() it before using the data:// wrapper.
<?php
$xml = file_get_contents('compress.zlib://data://text/plain,'.urlencode($text));
[edit 2] Even if you already found a solution with a file, there's a solution (to test) I didn't see in your answer:
<?php
$zip = new ZipArchive;
$zip->open('data://text/plain,'.urlencode($base64_decoded_string));
$zip2 = new ZipArchive;
$zip2->open('data://text/plain;base64,'.urlencode($base64_string));
If you are running on Linux and administer the system yourself, you could mount a small ramdisk using tmpfs. The standard file_get_contents() / file_put_contents() and ZipArchive functions will then work as usual, except that they write to memory instead of to disk.
To have it permanently ready, the fstab is something like:
tmpfs /media/ramdisk tmpfs nodev,nosuid,noexec,nodiratime,size=2M 0 0
Set your size and location accordingly so it suits you.
Using php to mount a ramdisk and remove it after using it (if it even has the privileges) is probably less efficient than just writing to disk, unless you have a massive number of files to process in one go.
Note that this is neither a pure PHP solution nor portable.
You will still need to remove the "files" after use, or have the OS clean up old files.
They will of course not persist over reboots or remounts of the ramdisk.
If you want to read the content of a file inside a zip, such as an XML file, take a look at this. I use it to count the words in a .docx file (which is a zip):
if (!function_exists('docx_word_count')) {
function docx_word_count($filename)
{
$zip = new ZipArchive();
if ($zip->open($filename) === true) {
if (($index = $zip->locateName('docProps/app.xml')) !== false) {
$data = $zip->getFromIndex($index);
$zip->close();
$xml = new SimpleXMLElement($data);
return $xml->Words;
}
$zip->close();
}
return 0;
}
}
The idea from toster-cx is pretty useful for handling malformed zip files too!
I had one with missing data in the header, so I had to extract the central directory file header by using his method:
$CDFHoffset = strpos( $zipFile, "\x50\x4b\x01\x02" );
$CDFH = unpack( "Vsig/vverby/vverex/vflag/vmeth/vmodt/vmodd/Vcrc/Vcsize/Vsize/vnamelen/vexlen", substr( $zipFile, $CDFHoffset, 46 ) );
Can I include a file from a zip file in PHP? For example consider I have a zip file - test.zip and test.zip contains a file by the name a.php. Now, what I would like to do is something like below,
include "test.zip/a.php";
Is this possible? If it is, can anyone provide a code snippet? If not, is there any alternative way to do this?
$zip = new ZipArchive();
if ($zip->open('test.zip') !== true) {
    die('cannot open test.zip');
}
$tmp = tmpfile();
$metadata = stream_get_meta_data($tmp);
file_put_contents($metadata['uri'], $zip->getFromName('a.php'));
$zip->close();
include $metadata['uri'];
To go further, you may be interested in PHAR archives, which can themselves use the Zip format.
Edit:
With a cache strategy:
if (apc_exists('test_zip_a_php')) {
$content = apc_fetch('test_zip_a_php');
} else {
$zip = new ZipArchive();
$zip->open('test.zip');
$content = $zip->getFromName('a.php');
$zip->close();
apc_add('test_zip_a_php', $content);
}
$f = fopen('php://memory', 'w+');
fwrite($f, $content);
rewind($f);
// Note: this kind of include requires the `allow_url_include` directive to be set to `On`
include('data://text/plain,'.stream_get_contents($f));
Are you all sure? According to the phar extension, phar is implemented using a stream wrapper, so they can just call
include 'phar:///path/to/myphar.phar/file.php';
But there also exist stream wrappers for zip, see this example, where they call:
$reader->open('zip://' . dirname(__FILE__) . '/test.odt#meta.xml');
To open the file meta.xml in the zip-File test.odt (odt-files are only zip-files with another extension).
Also in another example they directly open a zip file via stream wrapper:
$im = imagecreatefromgif('zip://' . dirname(__FILE__) . '/test_im.zip#pear_item.gif');
imagepng($im, 'a.png');
I have to admit, I do not know exactly how it works.
I would try calling
include 'zip:///path/to/myarchive.zip#file.php';
Unlike the phar wrapper, the zip wrapper seems to require a hash sign (#), but you can also try it with a slash. But this is just an idea from reading the docs.
If it does not work, you can use phars of course.
I have a ZIP file on my server. I want to create a PHP file, loadZIP.php that will accept a single parameter, and then modify a text file within the ZIP to reflect that parameter.
So, accessing loadZIP.php?param=blue, will open up the zip file, and replace some text in a text file I specify with 'blue', and allow the user to download this edited zip file.
I've looked over all of the PHP ZIP functions, but I can't find a simple solution. It seems like a relatively easy problem, and I believe I'm over thinking it. Before I go and write some overly complex functions, I was wondering how you'd go about this.
Have you taken a look at PHP5's ZipArchive functions?
Basically, you can use ZipArchive::open() to open the zip, then ZipArchive::getFromName() to read the file into memory. Then modify it, use ZipArchive::deleteName() to remove the old file, use ZipArchive::addFromString() to write the new contents back to the zip, and finish with ZipArchive::close():
$zip = new ZipArchive;
$fileToModify = 'myfile.txt';
if ($zip->open('test1.zip') === TRUE) {
//Read contents into memory
$oldContents = $zip->getFromName($fileToModify);
//Modify contents:
$newContents = str_replace('key', $_GET['param'], $oldContents);
//Delete the old...
$zip->deleteName($fileToModify);
//Write the new...
$zip->addFromString($fileToModify, $newContents);
//And write back to the filesystem.
$zip->close();
echo 'ok';
} else {
echo 'failed';
}
Note ZipArchive was introduced in PHP 5.2.0 (but, ZipArchive is also available as a PECL package).
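Since every request with a different parameter would otherwise rewrite the same zip on the server, one option is to apply the steps above to a temporary copy and send that copy to the user. This is a sketch under that assumption. The function name build_customized_zip and the file names in the usage comment are illustrative.

```php
<?php
// Hypothetical helper: copy $sourceZip to a temporary file, replace $search
// with $replace inside the entry $entryName, and return the copy's path.
// The original archive is left untouched.
function build_customized_zip($sourceZip, $entryName, $search, $replace)
{
    $copy = tempnam(sys_get_temp_dir(), 'zip');
    if ($copy === false || !copy($sourceZip, $copy)) {
        throw new RuntimeException('Cannot create working copy');
    }

    $zip = new ZipArchive();
    if ($zip->open($copy) !== true) {
        unlink($copy);
        throw new RuntimeException('Cannot open working copy');
    }

    $oldContents = $zip->getFromName($entryName);
    if ($oldContents === false) {
        $zip->close();
        unlink($copy);
        throw new RuntimeException("Entry $entryName not found");
    }

    // Same delete-then-add sequence as in the answer above
    $zip->deleteName($entryName);
    $zip->addFromString($entryName, str_replace($search, $replace, $oldContents));
    $zip->close();

    return $copy;
}

// Usage sketch for loadZIP.php (file names are assumptions):
// $path = build_customized_zip('test1.zip', 'myfile.txt', 'key', $_GET['param']);
// header('Content-Type: application/zip');
// header('Content-Disposition: attachment; filename="custom.zip"');
// header('Content-Length: ' . filesize($path));
// readfile($path);
// unlink($path);
```

In a real script the request parameter should be validated before it is written into the file.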
In PHP 8 you can use ZipArchive::replaceFile
As demonstrated by this example from the docs:
<?php
$zip = new ZipArchive;
if ($zip->open('test.zip') === TRUE) {
$zip->replaceFile('/path/to/index.txt', 1);
$zip->close();
echo 'ok';
} else {
echo 'failed';
}
?>
So I have a client whose current host does not allow me to use tar via exec()/passthru()/etc., and I need to back up the site periodically and programmatically. Is there a solution?
This is a linux server.
PHP 5.3 offers a much easier way to solve this issue.
Look here: http://www.php.net/manual/en/phardata.buildfromdirectory.php
<?php
$phar = new PharData('project.tar');
// add all files in the project
$phar->buildFromDirectory(dirname(__FILE__) . '/project');
?>
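For a backup, the tar is usually compressed as well. A minimal sketch, assuming the zlib extension is loaded and that the target .tar path lies outside the directory being archived. The function name backup_directory is made up for illustration.

```php
<?php
// Hypothetical helper: archive $sourceDir into $tarPath (must end in .tar)
// and create a gzip-compressed copy next to it. Returns the path of the
// compressed archive as reported by Phar.
function backup_directory($sourceDir, $tarPath)
{
    $tar = new PharData($tarPath);

    // Adds every file below $sourceDir, with paths relative to it
    $tar->buildFromDirectory($sourceDir);

    // compress() writes a new archive and returns an object for it;
    // it throws if the target file already exists
    $compressed = $tar->compress(Phar::GZ);

    return $compressed->getPath();
}
```

The uncompressed .tar stays on disk after the call, so a periodic job would need to remove it (or reuse a fixed name and delete both files before each run).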
At http://pear.php.net/package/Archive_Tar you can download the PEAR tar package and use it like this to create the archive:
<?php
require 'Archive/Tar.php';
$obj = new Archive_Tar('archive.tar');
$path = '/path/to/folder/';
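$handle = opendir($path);
$files = array();
while (false !== ($file = readdir($handle)))
{
    // skip the "." and ".." pseudo-entries
    if ($file === '.' || $file === '..') {
        continue;
    }
    $files[] = $path . $file;
}
closedir($handle);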
if ($obj->create($files))
{
//Success
}
else
{
//Fail
}
?>
There is the Archive_Tar library. If that can't be used for some reason, the zip extension might be another option.
I needed a solution that would work on Azure Websites (IIS) and had trouble creating new files on the server using the methods from the other answers. The solution that worked for me was to use the small TbsZip library for compression, which doesn't need to write a file anywhere on the server: the archive is returned directly via HTTP.
This thread is old, but this approach might be a more generic and complete answer, so I am posting the code as an alternative:
// Compress all files in current directory and return via HTTP as a ZIP file
// by buli, 2013 (http://buli.waw.pl)
// requires TbsZip library from http://www.tinybutstrong.com
include_once('tbszip.php'); // load the TbsZip library
$zip = new clsTbsZip(); // instantiate the class
$zip->CreateNew(); // create a virtual new zip archive
// iterate through files, skipping directories
$objects = new RecursiveIteratorIterator(new RecursiveDirectoryIterator('.'));
foreach($objects as $name => $object)
{
$n = str_replace("/", "\\", substr($name, 2)); // path format
$zip->FileAdd($n, $n, TBSZIP_FILE); // add file to zip archive
}
$archiveName = "backup_".date('m-d-Y H:i:s').".zip"; // name of the returned file
$zip->Flush(TBSZIP_DOWNLOAD, $archiveName); // flush the result as an HTTP download
And here's the whole article on my blog.