I'm testing compression of HTML files.
I have 2 HTML files:
An uncompressed HTML file (its content will change)
A compressed HTML file, .gz (its content won't change)
Using PHP, I'm trying to output the compressed data, and this is where the trouble begins.
Test with the already compressed HTML file:
// header gzip
$data = getfile($name); // custom function wrapping fopen/fread
header('Content-Encoding: gzip'); // header works perfectly
echo $data; // output OK
// header deflate
$data = getfile($name); // custom function wrapping fopen/fread
header('Content-Encoding: deflate'); // file was gzip compressed, so an error is expected
echo $data; // Firefox: Content Encoding Error
Test with the uncompressed HTML file:
// header gzip using gzcompress()
$data = gzcompress(getfile($name), 9);
header('Content-Encoding: gzip'); // somehow the header is bad
echo $data; // Firefox: Content Encoding Error, but IE 9 output OK
But here we get magic:
// header deflate using gzcompress()
$data = gzcompress(getfile($name), 9);
header('Content-Encoding: deflate'); // header works perfectly
echo $data; // Firefox output OK, but IE output ERROR
How do I fix this and send all data as gzip with a gzip header, not deflate? Does anyone have an idea what is wrong?
Thank you
The HTTP spec (RFC 2616) says:
gzip
    An encoding format produced by the file compression program
    "gzip" (GNU zip) as described in RFC 1952 [25].
compress
    The encoding format produced by the common UNIX file compression
    program "compress".
deflate
    The "zlib" format defined in RFC 1950 [31] in combination with
    the "deflate" compression mechanism described in RFC 1951 [29].
The PHP docs say:
gzcompress
    For details on the ZLIB compression algorithm see the document
    "ZLIB Compressed Data Format Specification version 3.3"
    (RFC 1950).
gzdeflate
    For details on the DEFLATE compression algorithm see the document
    "DEFLATE Compressed Data Format Specification version 1.3"
    (RFC 1951).
gzencode
    For more information on the GZIP file format, see the document:
    GZIP file format specification version 4.3 (RFC 1952).
From this, one can come to the conclusion that gzencode() must be used with gzip, and gzcompress() (with the DEFLATE encoding) must be used with deflate.
The first combination works for me. I haven't tried the second; don't know why it wouldn't work with IE. A URL might help to trouble-shoot that problem.
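The pairing described above can be sketched in Python, whose gzip and zlib modules wrap the same zlib library that PHP uses. This is only an illustration under assumptions: the helper name encode_body is hypothetical, gzip.compress stands in for gzencode(), and zlib.compress stands in for gzcompress().

```python
import gzip
import zlib


def encode_body(body: bytes, content_encoding: str) -> bytes:
    """Compress body into the container that the given Content-Encoding names."""
    if content_encoding == "gzip":
        # RFC 1952 gzip container, the counterpart of PHP's gzencode().
        return gzip.compress(body, compresslevel=9)
    if content_encoding == "deflate":
        # RFC 1950 zlib container, the counterpart of PHP's gzcompress().
        return zlib.compress(body, 9)
    raise ValueError(f"unsupported Content-Encoding: {content_encoding}")


html = b"<html><body>hello</body></html>"
gz_body = encode_body(html, "gzip")
zl_body = encode_body(html, "deflate")

# A gzip stream starts with the magic bytes 1f 8b; a zlib stream made with
# the default 32 KiB window starts with the byte 78.
print(gz_body[:2].hex(), zl_body[:1].hex())
```

Sending gz_body with a deflate header, or zl_body with a gzip header, reproduces the kind of mismatch the question ran into.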
Related
When I compress a string in PHP with encoding ZLIB_ENCODING_DEFLATE and output the hex data, I can convert this back to the original string using zlib inflate() in a C++ project.
Per the example here ( https://www.php.net/manual/en/function.zlib-encode.php ) :
<?php
$str = 'hello world';
$enc = zlib_encode($str, ZLIB_ENCODING_DEFLATE);
echo bin2hex($enc);
?>
in c++, after having converted the hex string to binary data first: (simplified code)
z_stream d_stream;
d_stream.zalloc = (alloc_func)0;
d_stream.zfree = (free_func)0;
d_stream.opaque = (voidpf)0;
d_stream.next_in = InBuffer;
d_stream.avail_in = InBufferLen;
d_stream.next_out = OutBuffer;
d_stream.avail_out = OutBufferLen;
int err = inflateInit(&d_stream);
while (err == Z_OK)
    err = inflate(&d_stream, Z_NO_FLUSH); // returns Z_STREAM_END once all input is decoded
err = inflateEnd(&d_stream);
OutBuffer contains "hello world" again
I was wondering whether zlib's inflate() can also decompress the raw data that PHP generates with zlib_encode($str, ZLIB_ENCODING_RAW).
From the zlib documentation I think not:
The deflate compression method (the only one supported in this
version).
#define Z_DEFLATED 8
But PHP's function name zlib_encode() and the constant ZLIB_ENCODING_RAW seem to suggest that zlib does support it. If so, which function and/or parameters do I use?
The PHP designations are (as usual) confusing. I will assume that ZLIB_ENCODING_RAW means raw deflate data (per RFC 1951), and it appears that ZLIB_ENCODING_DEFLATE actually means zlib-wrapped deflate data (per RFC 1950).
If that's correct, they should have called them ZLIB_ENCODING_DEFLATE and ZLIB_ENCODING_ZLIB, respectively. But I digress.
You can decode raw deflate data with the zlib library by using inflateInit2() instead of inflateInit(), and giving -15 as the second argument (windowBits). The negative value tells zlib not to expect a zlib header or trailer.
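The same idea can be tried quickly in Python, where the wbits parameter of the zlib module is passed through to zlib as windowBits. This is a sketch, not the C++ code from the question; the sample string is taken from the PHP example above.

```python
import zlib

original = b"hello world"

# Raw deflate (RFC 1951): negative wbits makes zlib omit the header and checksum,
# which is assumed here to match what ZLIB_ENCODING_RAW produces in PHP.
compressor = zlib.compressobj(level=9, wbits=-15)
raw = compressor.compress(original) + compressor.flush()

# Counterpart of inflateInit2(&strm, -15) followed by inflate().
restored = zlib.decompress(raw, wbits=-15)

# With the default window bits zlib expects a zlib header and rejects raw data.
try:
    zlib.decompress(raw)
    default_mode_failed = False
except zlib.error:
    default_mode_failed = True

print(restored, default_mode_failed)
```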
I make an HTTP POST request to a remote service which requires the POST body to be "deflated" (and Content-Encoding: deflate should be sent in the headers). From my understanding, this is covered in RFC 1950. Which PHP function should I use to be compatible?
gzencode
gzdeflate
gzcompress
Content-Encoding: deflate requires data to be presented using the zlib structure (defined in RFC 1950), with the deflate compression algorithm (defined in RFC 1951).
Consider
<?php
$str = 'test';
$defl = gzdeflate($str);
echo bin2hex($defl), "\n";
$comp = gzcompress($str);
echo bin2hex($comp), "\n";
?>
This gives us:
2b492d2e0100
789c2b492d2e0100045d01c1
so the gzcompress result is the gzdeflate'd buffer preceded by 789c, which appears to be a valid zlib header:
0111         | 1000      | 10             | 0       | 11100
CINFO        | CM        | FLEVEL         | FDICT   | FCHECK
7=32K window | 8=deflate | 2=default algo | no dict | check bits
and followed by 4 bytes of checksum (Adler-32). This is what we're looking for.
To sum it up,
gzdeflate returns a raw deflated buffer (RFC 1951)
gzcompress returns a deflated buffer wrapped in zlib stuff (RFC 1950)
Content-Encoding: deflate requires a wrapped buffer, that is, use gzcompress when sending deflated data.
Note the confusing naming: gzdeflate is not for Content-Encoding: deflate and gzcompress is not for Content-Encoding: compress. Go figure!
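The relationship summarized above can be checked with a short Python sketch. It assumes that zlib.compress at its default level matches gzcompress() and that a compressobj with wbits=-15 matches gzdeflate(); the field names follow RFC 1950.

```python
import zlib

data = b"test"

# Raw deflate, like gzdeflate(): negative wbits suppresses the zlib wrapper.
c = zlib.compressobj(wbits=-15)
raw = c.compress(data) + c.flush()

# zlib-wrapped deflate, like gzcompress().
wrapped = zlib.compress(data)

# Split the zlib stream into 2-byte header, deflate body and 4-byte trailer.
header, body, trailer = wrapped[:2], wrapped[2:-4], wrapped[-4:]
cmf, flg = header

fields = {
    "CM": cmf & 0x0F,         # compression method, 8 = deflate
    "CINFO": cmf >> 4,        # log2(window size) - 8, 7 = 32 KiB
    "FCHECK": flg & 0x1F,     # makes (CMF * 256 + FLG) divisible by 31
    "FDICT": (flg >> 5) & 1,  # preset dictionary flag
    "FLEVEL": flg >> 6,       # compression level hint
}

print(wrapped.hex())
print(fields)
print(body == raw, trailer == zlib.adler32(data).to_bytes(4, "big"))
```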
How do I correctly compress a string, so PHP would be able to decompress?
I tried this:
public static byte[] compress(String string) throws IOException {
    ByteArrayOutputStream os = new ByteArrayOutputStream(string.length());
    DeflaterOutputStream gos = new DeflaterOutputStream(os);
    // ALSO TRIED GZIPOutputStream, same results!
    gos.write(string.getBytes());
    gos.close();
    byte[] compressed = os.toByteArray();
    os.close();
    return compressed;
}
But PHP does not recognize the output as a valid gzip-compressed string...
The problem seems to be in some headers / footers being added by Android...
For example, when I compress the word "something" in PHP with gzcompress, I get a result similar to Android's, but not similar enough for PHP to read it:
something (HEX DUMP):
Android: 1f8b08000000000000002bcecf4d2dc9c8cc4b0700fb31da0909000000
PHP: 789c2bcecf4d2dc9c8cc4b0700134703cf
The weirdest thing is that changing GZIPOutputStream to DeflaterOutputStream fixed the problem for the word "something", but the problem still appears with longer strings...
PS. Removing the leading 10 characters from the Android-generated data does not help at all.
EDIT: I tried to decompress it in PHP with:
gzdecode() - this function does not exist in the standard Debian PHP5 version.
gzuncompress() - does not work
And some functions to emulate gzdecode() from PHP site comments that don't really do much.
I tried all of the above both with the first 10 bytes removed and with them left in place.
PS2. I tried every single solution from Stack Overflow, and other sources, and still nothing. It is not a duplicate.
EDIT2 (BINARY DUMP): Sample data generated with Android that can't be decompressed by gzuncompress() or the pseudo-gzdecode() functions from PHP.NET: data.compressed.
It is supposed to be some JSON after decompression.
The Android data that starts with 1f8b is a gzip stream. In php you use gzdecode() for that. gzencode() on php makes gzip streams.
The php data that starts with 789c is a zlib stream. You used gzcompress() to make that, and you would use gzuncompress() to decode it.
The compressed data contained within both of those streams, starting with 2bce is raw deflate data. You can use gzinflate() to decode that if you happened to make it somewhere, and you can use gzdeflate() to generate raw deflate.
Just to rant, gzencode(), gzcompress(), and gzdeflate() are some of the most misleading function names ever concocted, since only one of them is related to gzip yet all start with gz, and nothing in the name gzcompress() indicates zlib.
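The three formats can be told apart by their leading bytes, as this Python sketch shows using the hex dumps from the question. The function name sniff_and_decompress is hypothetical, and the check is only a heuristic: raw deflate has no signature, so it is used as the fallback.

```python
import gzip
import zlib


def sniff_and_decompress(blob: bytes) -> tuple[str, bytes]:
    """Guess the container from the leading bytes and decompress accordingly."""
    if blob[:2] == b"\x1f\x8b":
        # gzip magic number (RFC 1952): PHP gzencode() / gzdecode().
        return "gzip", gzip.decompress(blob)
    if len(blob) >= 2 and blob[0] & 0x0F == 8 and int.from_bytes(blob[:2], "big") % 31 == 0:
        # Plausible zlib header (RFC 1950): PHP gzcompress() / gzuncompress().
        return "zlib", zlib.decompress(blob)
    # Otherwise assume raw deflate (RFC 1951): PHP gzdeflate() / gzinflate().
    return "raw deflate", zlib.decompress(blob, wbits=-15)


android = bytes.fromhex("1f8b08000000000000002bcecf4d2dc9c8cc4b0700fb31da0909000000")
php = bytes.fromhex("789c2bcecf4d2dc9c8cc4b0700134703cf")
inner = bytes.fromhex("2bcecf4d2dc9c8cc4b0700")

for sample in (android, php, inner):
    print(sniff_and_decompress(sample))
```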
Update:
The "EDIT2" data is, for some reason, doubly compressed. It was compressed first to the zlib format, and then that zlib stream was compressed to the gzip format. (Though gzip couldn't compress the already compressed data, so it's a little bigger.)
You should repair the problem that made it doubly compressed. Or, if you have no control over that, you can doubly decompress it: first strip the gzip header using the RFC 1952 specification and run gzinflate() on the raw deflate data, then use gzuncompress() on the result.
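A minimal Python sketch of the double decompression follows. The payload is a made-up stand-in for the JSON in the question, and gzip.decompress is used in place of stripping the gzip header by hand.

```python
import gzip
import zlib

payload = b'{"status": "ok"}'  # hypothetical JSON, not the data from the question

# Reproduce the mistake: a zlib stream that is then wrapped in a gzip stream.
doubly = gzip.compress(zlib.compress(payload))

# Layer 1: remove the gzip container; what comes out is still a zlib stream.
outer = gzip.decompress(doubly)

# Layer 2: remove the zlib container to get the original bytes back.
restored = zlib.decompress(outer)

print(outer[:2].hex(), restored)
```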
I want to compress some files with gzip in PHP.
It works as it should when the output is saved to a file. When the file is opened it looks like this
But not when the output is returned as a string. Then the opened file looks like this. Why is the tar file shown inside the gzip file?
public function compress(){
    if($this->stream){
        return gzencode($this->data, 9);
    }
    else{
        $gz = gzopen('test.tar.gz', 'w9');
        gzwrite($gz, $this->data);
        gzclose($gz);
    }
}
headers sent with string output to the browser
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename="'.$filename.'"');
This looks like a WinRAR file extension parsing issue.
In the first example your file is called .tar.gz and WinRAR knows how to handle both tar files and gz compression, so it is able to decompress the tar headers in memory and retrieve a list of files contained within.
In your second example the file is called .tar-19.gz, so WinRAR happily deals with the gz compression but has no idea what format tar-19 is supposed to be (it doesn't even try and guess from header heuristics)
I bet if you stream the file with a tar.gz extension it will open up just fine.
I am sure that you have looked at http://php.net/manual/en/function.gzcompress.php
Make sure that your PHP version is supported (PHP 4 >= 4.0.1, PHP 5).
string gzcompress ( string $data [, int $level = -1 [, int $encoding = ZLIB_ENCODING_DEFLATE ]] )
This function compresses the given string using the ZLIB data format.
For details on the ZLIB compression algorithm see the document "ZLIB Compressed Data Format Specification version 3.3" (RFC 1950).
<?php
$compressed = gzcompress('Compress me', 9);
echo $compressed;
?>
Note:
    This is not the same as gzip compression, which includes some header
    data. See gzencode() for gzip compression.
Parameters
data
    The data to compress.
level
    The level of compression. Can be given as 0 for no compression up to 9
    for maximum compression.
    If -1 is used, the default compression of the zlib library is used which is 6.
encoding
    One of ZLIB_ENCODING_* constants.
gzencode — Create a gzip compressed string
<?php
$data = implode("", file("bigfile.txt"));
$gzdata = gzencode($data, 9);
$fp = fopen("bigfile.txt.gz", "w");
fwrite($fp, $gzdata);
fclose($fp);
?>
Have you tried passing "Content-type: application/x-gzip" headers when sending the file as a string?
It's possible Apache is re-running it through a gzip filter and that's causing issues.
I want to compress a string in PHP and write it to a file without using the gzwrite function, because I want to store the compressed string in a database first. I am unsure whether to use gzcompress, gzencode or gzdeflate, as the difference is not very clear.
Any ideas?
Edit: the already compressed string will be written into a *.gz file from the database so it has to be compatible.
Use gzcompress if you just want to compress the string; it wraps the deflate data in a zlib header and checksum.
gzencode adds gzip file headers instead, so the result can be uncompressed directly by gzip and similar tools.
gzdeflate emits the raw deflate data with no header or checksum at all.
I think you want to use gzencode in this case, since the data is going to be stored as a .gz file.
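The store-first, write-later flow can be sketched in Python, with gzip.compress standing in for gzencode(). The variable stored_blob and the file name example.txt.gz are made up for illustration, and the database step is only simulated by keeping the bytes in memory.

```python
import gzip
import os
import tempfile

text = "compress me " * 20

# A complete gzip stream held in memory, ready to be stored as a binary column.
stored_blob = gzip.compress(text.encode("utf-8"), compresslevel=9)

# Later: write the stored bytes unchanged into a .gz file.
path = os.path.join(tempfile.mkdtemp(), "example.txt.gz")
with open(path, "wb") as fh:
    fh.write(stored_blob)

# Any gzip reader can now open the file.
with gzip.open(path, "rt", encoding="utf-8") as fh:
    round_trip = fh.read()

print(round_trip == text)
```

Because the stored bytes already form a full gzip stream, no further compression step is needed when the file is written.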