PHP - file_put_contents() doesn't preserve filenames in output - php

I would like to compress a .csv file on my server and put it into .gz (gzip) file using PHP.
I used file_put_contents() like below:
$input = "test.csv";
$output = $input.".gz";
file_put_contents("compress.zlib://$output", file_get_contents($input));
However, when I open the gzip file (using winrar / 7zip), file extension is missing in the .gz archive; it's just "test" (without the file extension)?
It's not showing "test.csv" as I wanted. How to fix it?

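There is no information on any "filename" inside that compressed file. You're simply compressing the raw binary data of the input file and dumping it into an output file. The gzip header does have an optional field for the original file name, but PHP's compress.zlib:// wrapper doesn't fill it in, so the archive tool has to guess the name from the name of the .gz file itself. A .gz file also holds just one compressed stream; it has no meta information on how many files are contained within it. That's what the TAR file format is for, to provide that kind of meta information. If you need that, you should make a tarball, then compress it using gz into a .tar.gz.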
I'm not sure how to do this using PHP other than running a shell command through exec.
You may want to look at ZIP as an alternative with native PHP support.
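If the goal is only to have the original name stored inside the .gz, one option is to assemble the gzip container by hand. The sketch below follows the gzip layout from RFC 1952 (header with the FNAME flag, raw deflate data, CRC-32 and size trailer). The function name gzip_with_name() is made up for illustration, and the whole input is held in memory, so treat it as a starting point rather than a finished solution:

```php
<?php
// Hypothetical helper (not part of PHP): builds a gzip member whose header
// carries the original file name in the optional FNAME field (RFC 1952).
function gzip_with_name(string $data, string $name, int $level = 6): string
{
    $header = "\x1f\x8b"                 // gzip magic number
            . "\x08"                     // compression method: deflate
            . "\x08"                     // flags: FNAME bit set
            . pack('V', time())          // modification time, little-endian
            . "\x00"                     // extra flags
            . "\x03"                     // OS: Unix
            . basename($name) . "\x00";  // zero-terminated original file name

    $trailer = pack('V', crc32($data))                 // CRC-32 of the uncompressed data
             . pack('V', strlen($data) & 0xFFFFFFFF);  // uncompressed size modulo 2^32

    // gzdeflate() returns raw deflate data without any header or trailer.
    return $header . gzdeflate($data, $level) . $trailer;
}

$input = "test.csv";
if (is_readable($input)) {
    file_put_contents($input . ".gz", gzip_with_name(file_get_contents($input), $input));
}
```

Tools that honour the stored name should then show test.csv inside the archive. RFC 1952 expects the name in ISO 8859-1, so non-ASCII file names would need converting first.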

Let's try this:
$input = "test.txt";
exec("gzip " . escapeshellarg($input));
This will work on a Linux server.

I'm not exactly sure what you're asking, but PHP already has a function for gzip compression, gzencode.
Use it like this
<?php
$data = implode("", file("bigfile.txt"));
$gzdata = gzencode($data, 9);
$fp = fopen("bigfile.txt.gz", "w");
fwrite($fp, $gzdata);
fclose($fp);
?>

Your example works properly on PHP 5.3.10 at least:
-rw-rw-r-- 1 mats mats 8 Jul 17 13:05 test.csv
-rw-rw-r-- 1 mats mats 31 Jul 17 13:05 test.csv.gz
You're not hiding file extensions for known file types in your explorer / navigator?

This worked for me:
$input = "test.csv";
$output = $input.".xml.gz";
I know it's ugly, but when using gzopen and gzwrite, I've found this to be the only way to preserve the xml extension inside the archive. This way, when I extract it, I get the xml file.
Later on, once the file is created, you can rename it to remove the .xml part before the .gz extension.

Related

php gzip match unix gzip

I am trying to use PHP to create a file and gzip it. How can I match the same compression levels, headers, and so on as gzip run on Unix?
using php
ls -l
total 8
-rw-rw-r-- 1 owner owner 486 Jul 21 17:05 file.xml.gz
using gzip on unix command line:
ls -l
total 8
-rw-rw-r-- 1 owner owner 479 Jul 21 17:05 file.xml.gz
in php
$zip = gzencode($xml,2);
I have tried 0 through 9 as the compression level here. I have also tried
$zip = gzencode($xml,x,FORCE_DEFLATE)
again where x is 0-9.
My problem is this:
I have a 3rd party vendor that takes the gzipped file, unzips it and does fun things with it. The problem I am running into is that when I use PHP I get an error "cannot parse file.xm.gz", but when I use gzip on the CLI it works fine. I have no vision into what the 3rd party is doing or why it's failing. Could it be something like carriage returns or spaces or something in the XML? I know it's a tough question to answer. Here's a snippet of my PHP-generated XML.
$xml ='<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<localRoutes xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
';
$xml.='<route>
<user type="string">' . $mac . '</user>
';
$xml.='<next type="regex">!^(.*$)!sip:#' . $ip . "</next>
</route>
";
$xml .= '</localRoutes>';
The compressed data is identical. What's missing is a field in the header indicating the original name of the file (e.g, probably file.xml here). This field is generated by the gzip utility, but the gzencode() PHP function doesn't have an original filename to work with, so it doesn't write this field.
I'm not aware of any way to make PHP generate this field with the zlib extension. Its absence is very unlikely to cause any problems, though.
You can't.
You don't need to.
First, gzip and zlib (which is what PHP is using) both write deflate streams but use different compression code, so for large enough data they will generally not produce the same compressed bytes, even at the same compression level.
Second, as noted by @duskwuff, you will not be able to replicate the same gzip header, unless you pull off the gzip header that PHP made and write your own. The modification dates in the headers will be different. The way you're doing it, one will have a file name and one will not. Though you can invoke gzip with -n to not store the file name.
Third, there is no reason to try to make the results identical. All that matters is that both decompress to the same thing. Which they will.
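A quick way to check both points from PHP is to look at the flag byte of the header gzencode() writes and to confirm that every level decompresses to the same input. This is only a sketch with made-up sample XML; the byte counts it prints will vary with the data and the zlib version:

```php
<?php
// Made-up sample document, loosely modelled on the question's XML.
$xml = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>' . "\n"
     . '<localRoutes>'
     . str_repeat("<route><user type=\"string\">x</user></route>\n", 20)
     . '</localRoutes>';

foreach ([1, 6, 9] as $level) {
    $gz = gzencode($xml, $level);

    // Byte 3 of a gzip header is the flag byte; bit 0x08 (FNAME) marks a stored file name.
    $hasName = (ord($gz[3]) & 0x08) !== 0;

    printf(
        "level %d: %d bytes, name stored: %s, round-trips: %s\n",
        $level,
        strlen($gz),
        $hasName ? 'yes' : 'no',
        gzdecode($gz) === $xml ? 'yes' : 'no'
    );
}
```

If the vendor's parser really depends on the file name field, that would explain the failure; otherwise the round-trip check suggests looking at the XML itself.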

PHP Output Stream Filter w/zlib.inflate creates blank file

I'm attempting to use the 'zlib.inflate' built in stream filter on a file pointer that I'm using with ftp_fget(). The idea is, if the file is gzipped, this will inflate it. Everything works until I attach the stream filter.
$local = $remote = 'whatever.txt';
# Create local file to transfer ftp file into
$localFile = fopen($local, 'w');
# Attach inflation stream filter with write filter, so we can inflate as we write to the new file
stream_filter_append($localFile, 'zlib.inflate', STREAM_FILTER_WRITE);
# Use fget to read remote file into local with optional inflation
# $this->_connection(); returns csv resource
$result = ftp_fget($this->_connection(), $localFile, $remote, FTP_BINARY);
This seems like it should be a pretty straight forward thing, but it's just giving me a blank file. Any ideas?
[Edit] Running PHP 5.2.6 on Debian Lenny. Zlib is installed and shows up under phpinfo()
[Edit 2] It appears this is related to this bug. https://bugs.php.net/bug.php?id=49411 If so, I'll close this out.
[Edit 3] Spun up a Vb instance with PHP 5.4.4 and I'm still having the same issues, so I don't think it's a bug.
This was a misunderstanding on my part about how the stream filters work. The zlib.inflate filter doesn't handle the header and trailer that the command line gzip utility writes (the magic bytes and header fields at the beginning, and a checksum at the end). I'm going to have to attack this from a different angle. I'm going to mark this as solved.
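To see the difference for yourself, the sketch below pushes the same text through the zlib.inflate write filter twice: once as raw deflate data from gzdeflate(), and once as a full gzip file from gzencode(). The helper name inflate_via_filter() is invented for this example and it writes to a temporary file rather than an FTP transfer:

```php
<?php
$payload = "hello, stream filters\n";

// Hypothetical helper: writes $bytes through a zlib.inflate write filter
// and returns whatever actually ended up in the file.
function inflate_via_filter(string $bytes): string
{
    $path = tempnam(sys_get_temp_dir(), 'inflate');
    $fp = fopen($path, 'w');
    stream_filter_append($fp, 'zlib.inflate', STREAM_FILTER_WRITE);
    @fwrite($fp, $bytes);   // errors are suppressed; the gzip case is expected to fail
    fclose($fp);            // closing the stream flushes the filter
    $out = file_get_contents($path);
    unlink($path);
    return $out;
}

$fromRaw  = inflate_via_filter(gzdeflate($payload)); // raw deflate, no header or trailer
$fromGzip = inflate_via_filter(gzencode($payload));  // gzip container around the same data

var_dump($fromRaw === $payload);
var_dump($fromGzip === $payload);
```

With the filter's default settings only the raw deflate input comes back as the original text, which matches the blank file described in the question.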

gzcompress won't produce a valid zipped file?

Consider this:
$text = "hello";
$text_compressed = gzcompress($text, 6);
$success = file_put_contents('file.gz', $text_compressed);
When I try to open file.gz, I get errors. How can I open file.gz from a terminal without calling PHP? (Using gzuncompress works just fine!)
I can't recode every file, since I now have almost a billion files encoded this way! So if there is a solution... :)
You need to use gzencode() instead.
Luckily for you, the fix is easy: just write a script that opens each of your files one by one, uses gzuncompress() to uncompress that file, and then writes that file back out with gzencode() instead of gzcompress(), repeating the process for all of the files.
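A rough sketch of such a conversion script follows. The function name recode_zlib_to_gzip() and the glob pattern are placeholders, not anything from the question, and with that many files you would want to walk the directories in batches rather than glob everything at once:

```php
<?php
// Hypothetical helper: rewrites one gzcompress() file in place as a real gzip file.
// Returns false (and leaves the file alone) if it is not a zlib stream.
function recode_zlib_to_gzip(string $path, int $level = 6): bool
{
    $raw = file_get_contents($path);
    if ($raw === false) {
        return false;
    }

    // gzuncompress() fails on data that is not in zlib format,
    // e.g. a file that was already converted to gzip.
    $plain = @gzuncompress($raw);
    if ($plain === false) {
        return false;
    }

    // Write to a temporary name first so a crash cannot leave a half-written file.
    $tmp = $path . '.tmp';
    if (file_put_contents($tmp, gzencode($plain, $level)) === false) {
        return false;
    }
    return rename($tmp, $path);
}

// Assumed layout: every affected file matches this (placeholder) pattern.
foreach (glob('/path/to/files/*.gz') ?: [] as $file) {
    if (!recode_zlib_to_gzip($file)) {
        echo "Skipped: $file\n";
    }
}
```

Because already-converted files are skipped, the script can be stopped and restarted without harm.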
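Alternatively (since you said you "didn't want to recode your files"), you could leave the existing files alone and read them from the command line with a tool that understands the zlib format instead of gunzip/zcat, for example zlib-flate -uncompress (shipped with qpdf) or openssl zlib -d (only if your OpenSSL build includes zlib support).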
As noted on the gzcompress() manual page:
gzcompress
This is not the same as gzip compression, which includes some header data. See gzencode() for gzip compression.
As said, you don't really have gzipped files; gzcompress() writes zlib-format data. To open your files from a terminal you need a utility that reads that format, which gunzip does not.

Dynamically created zip files by ZipStream in PHP won't open in OSX

I have a PHP site with a lot of media files and users need to be able to download multiple files at a time as a .zip. I'm trying to use ZipStream to serve the zips on the fly with "store" compression so I don't actually have to create a zip on the server, since some of the files are huge and it's prohibitively slow to compress them all.
This works great and the resulting files can be opened by every zip program I've tried with no errors except for OS X's default unzipping program, Archive Utility. You double click the .zip file and Archive Utility decides it doesn't look like a real zip and instead compresses it into a .cpgz file.
Using unzip or ditto in the OS X terminal or StuffIt Expander unzips the file with no problem but I need the default program (Archive Utility) to work for the sake of our users.
What sort of things (flags, etc.) in otherwise acceptable zip files can trip Archive Utility into thinking a file isn't a valid zip?
I've read this question, which seems to describe a similar issue but I don't have any of the general purpose bitfield bits set so it's not the third bit issue and I'm pretty sure I have valid crc-32's because when I don't, WinRAR throws a fit.
I'm happy to post some code or a link to a "bad" zip file if it would help but I'm pretty much just using ZipStream, forcing it into "large file mode" and using "store" as the compression method.
Edit - I've tried the "deflate" compression algorithm as well and get the same results, so I don't think it's the "store". It's also worth pointing out that I'm pulling down the files one at a time from a storage server and sending them out as they arrive, so a solution that requires all the files to be downloaded before sending anything isn't going to be viable (an extreme example is 5GB+ of 20MB files; the user can't wait for all 5GB to transfer to the zipping server before their download starts or they'll think it's broken).
Here's a 140 byte, "store" compressed, test zip file that exhibits this behavior: http://teknocowboys.com/test.zip
The problem was in the "version needed to extract" field, which I found by doing a hex diff on a file created by ZipStream vs a file created by Info-zip and going through the differences, trying to resolve them.
ZipStream by default sets it to 0x0603. Info-zip sets it to 0x000A. Zip files with the former value don't seem to open in Archive Utility. Perhaps it doesn't support the features at that version?
Forcing the "version needed to extract" to 0x000A made the generated files open as well in Archive Utility as they do everywhere else.
Edit: Another cause of this issue is if the zip file was downloaded using Safari (user agent version >= 537) and you under-reported the file size when you sent out your Content-Length header.
The solution we employ is to detect Safari >= 537 server side and if that's what you're using, we determine the difference between the Content-Length size and the actual size (how you do this depends on your specific application) and after calling $zipStream->finish(), we echo chr(0) to reach the correct length. The resulting file is technically malformed and any comment you put in the zip won't be displayed, but all zip programs will be able to open it and extract the files.
IE requires the same hack if you're misreporting your Content-Length but instead of downloading a file that doesn't work, it just won't finish downloading and throws a "download interrupted".
Use ob_clean(); and flush(); before sending the file.
Example:
$file = __UPLOAD_PATH . $projectname . '/' . $fileName;
$zipname = "watherver.zip";
$zip = new ZipArchive();
$zip_full_path_name = __UPLOAD_PATH . $projectname . '/' . $zipname;
$zip->open($zip_full_path_name, ZipArchive::CREATE);
$zip->addFile($file); // Adding one file for testing
$zip->close();
if (file_exists($zip_full_path_name)) {
    header('Content-Type: application/zip');
    header('Content-Disposition: attachment; filename="' . $zipname . '"');
    ob_clean(); // discard anything already sitting in the output buffer
    flush();
    readfile($zip_full_path_name);
    unlink($zip_full_path_name);
}
I've had this exact issue but with a different cause.
In my case the php generated zip would open from the command line, but not via finder in OSX.
I had made the mistake of allowing some HTML content into the output buffer prior to creating the zip file and sending that back as the response.
<some html></....>
<?php
// Output a zip file...
The command line unzip program was evidently tolerant of this but the Mac unarchive function was not.
No idea. If the external ZipStream class doesn't work, try another option. The PHP ZipArchive extension won't help you, since it doesn't support streaming but only ever writes to files.
But you could try the standard Info-zip utility. It can be invoked from within PHP like this:
#header("Content-Type: application/zip");
passthru("zip -0 -q -r - *.*");
That would send an uncompressed zip file directly back to the client.
If that doesn't help, then the MacOS zip frontend probably doesn't like uncompressed stuff. Remove the -0 flag then.
The InfoZip commandline tool I'm using, both on Windows and Linux, uses version 20 for the zip's "version needed to extract" field. This is needed on PHP as well, as the default compression is the Deflate algorithm. Thus the "version needed to extract" field should really be 0x0014. If you alter the "(6 << 8) +3" code in the referenced ZipStream class to just "20", you should get a valid Zip file across platforms.
The author is basically telling you that the zip file was created in OS/2 using the HPFS file system, and the Zip version needed predates InfoZip 1.0. Not many implementations know what to do about that one any longer ;)
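Following that reading of the field (upper byte is the host system, lower byte is the spec version times ten, as laid out in the ZIP APPNOTE), the three values mentioned in this thread can be decoded with a few lines. The function name decode_zip_version() is just for illustration:

```php
<?php
// Hypothetical helper: splits a 16-bit ZIP version field into the host-system
// byte and the major/minor specification version.
function decode_zip_version(int $field): array
{
    $host    = ($field >> 8) & 0xFF; // upper byte, e.g. 6 = OS/2 HPFS in the APPNOTE table
    $version = $field & 0xFF;        // lower byte, spec version * 10 (20 means 2.0)

    return [
        'host'  => $host,
        'major' => intdiv($version, 10),
        'minor' => $version % 10,
    ];
}

foreach ([0x0603, 0x000A, 0x0014] as $field) {
    $v = decode_zip_version($field);
    printf("0x%04X => host %d, version %d.%d\n", $field, $v['host'], $v['major'], $v['minor']);
}
```

So 0x0603 reads as host 6 with version 0.3, while 0x000A and 0x0014 read as host 0 with versions 1.0 and 2.0.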
For those using ZipStream in Symfony, here's your solution: https://stackoverflow.com/a/44706446/136151
use Symfony\Component\HttpFoundation\StreamedResponse;
use Aws\S3\S3Client;
use ZipStream;
//...
/**
 * @Route("/zipstream", name="zipstream")
 */
public function zipStreamAction()
{
    //test file on s3
    $s3keys = array(
        "ziptestfolder/file1.txt"
    );
    $s3Client = $this->get('app.amazon.s3'); //s3client service
    $s3Client->registerStreamWrapper(); //required
    $response = new StreamedResponse(function() use($s3keys, $s3Client)
    {
        // Define suitable options for ZipStream Archive.
        $opt = array(
            'comment' => 'test zip file.',
            'content_type' => 'application/octet-stream'
        );
        //initialise zipstream with output zip filename and options.
        $zip = new ZipStream\ZipStream('test.zip', $opt);
        //loop keys useful for multiple files
        foreach ($s3keys as $key) {
            // Get the file name in S3 key so we can save it to the zip
            //file using the same name.
            $fileName = basename($key);
            //concatenate s3path.
            $bucket = 'bucketname';
            $s3path = "s3://" . $bucket . "/" . $key;
            //addFileFromStream
            if ($streamRead = fopen($s3path, 'r')) {
                $zip->addFileFromStream($fileName, $streamRead);
            } else {
                die('Could not open stream for reading');
            }
        }
        $zip->finish();
    });
    return $response;
}
If your controller action response is not a StreamedResponse, you are likely going to get a corrupted zip containing html as I found out.
It's an old question, but I'll leave what worked for me in case it helps someone else.
When setting the options, you need to set zero header to true and enable Zip64 to false (this will limit the archive to 4 GB, though):
$options->setZeroHeader(true);
$options->setEnableZip64(false);
Everything else as described by Forer.
Solution found on https://github.com/maennchen/ZipStream-PHP/issues/71

How can I tell if a file is text using PHP?

I'm making a search engine for our gigantic PHP codebase.
Given a filepath, how can I determine with some degree of certainty whether a file is a text file, or some other type? I'd prefer not to have to resort to file extensions (like substr($filename, -3) or something silly), as this is a linux based filesystem, so anything goes as far as file extensions are concerned.
I'm using RecursiveDirectoryIterator, so I have those methods available too..
Try the finfo_file() function.
Here's a blog describing its usage: Smart File Type Detection Using PHP
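A minimal sketch of that approach using the object-oriented finfo interface is shown below. The helper names looks_like_text() and text_files_under() are invented here, and the "starts with text/" rule is an assumption: libmagic reports some textual formats under application/ types, so you may need to whitelist a few extra MIME types for your codebase.

```php
<?php
// Hypothetical helper: true when libmagic reports a text/* MIME type for the file.
function looks_like_text(string $path): bool
{
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $mime  = $finfo->file($path);

    return is_string($mime) && strncmp($mime, 'text/', 5) === 0;
}

// Hypothetical helper: collects the text files below $root, using the same
// RecursiveDirectoryIterator the question already relies on.
function text_files_under(string $root): array
{
    $found = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if ($file->isFile() && looks_like_text($file->getPathname())) {
            $found[] = $file->getPathname();
        }
    }
    return $found;
}
```

Calling text_files_under('/path/to/codebase') would then give the list of files worth indexing; the path is a placeholder.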
if (mime_content_type($path) == "text/plain") {
echo "I'm a text file";
}
Try to use:
string mime_content_type ( string $filename )
Hope this is helpful.
William Choi
You can invoke the file utility:
echo shell_exec('file ' . escapeshellarg($file));
Returns things like:
$ file test.out
test.out: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, not stripped
$ file test.cpp
test.cpp: ASCII C program text
$ file test.txt
test.txt: ASCII text
$ file test.php
test.php: PHP script text
