Unexpected output with zlib_encode() - php

I'm trying to encode a chunk of binary data with PHP in the same way zlib's compress2() function does it. However, using zlib_encode(), I get the wrong encoded output. I know this because I have a C program that does it (correctly). When I compare the output (using a hex editor) of the C program against that of the PHP script below, I notice it doesn't match at all.
My question I guess is, does this really compress in the same way zlib's compress2() function does?
<?php
$filename = 'C:\data.bin';
$in = fopen($filename, 'rb');
$data = fread($in, filesize($filename));
fclose($in);
$data_dec = zlib_decode($data);
$data_enc = zlib_encode($data_dec, ZLIB_ENCODING_DEFLATE, 9);
?>
The compression level is correct, so it should match the C program's encoded output. Is there perhaps a bug somewhere?

Yes, zlib_encode() (with the default arguments) and uncompress() are compatible, and compress2() and zlib_decode() are compatible.
The way to check is not to compare compressed output. Check by decompressing with uncompress() and zlib_decode(). There is no reason to expect that the compressed output will be the same, and it does not need to be. All that matters is that it can be losslessly decompressed on the other end.
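A minimal sketch of that check on the PHP side, assuming the zlib extension is loaded. The sample payload and variable names are made up for illustration. It compresses the same data at two levels and verifies only that each result decompresses back to the original.

```php
<?php
// Hypothetical sample payload; any binary string works.
$original = str_repeat("binary\x00data\xff", 1000);

// ZLIB_ENCODING_DEFLATE selects the zlib (RFC 1950) container,
// which is the format compress2() writes.
$level9 = zlib_encode($original, ZLIB_ENCODING_DEFLATE, 9);
$level1 = zlib_encode($original, ZLIB_ENCODING_DEFLATE, 1);

// The compressed bytes may differ between levels or zlib builds;
// only the round trip matters.
var_dump(zlib_decode($level9) === $original); // bool(true)
var_dump(zlib_decode($level1) === $original); // bool(true)
```

The same idea applies across languages: feed the PHP output to uncompress() in the C program and compare the decompressed bytes, not the compressed ones.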

Get Zipfile as Bytes in php?

Using php, how can I read a zip file and get its bytes, for example something like
$contents = file_get_contents('myzipfile.zip');
echo $contents;
// outputs: 504b 0304 1400 0000 0800 1bae 2f46 20e0
Thank you!
file_get_contents gets the raw bytes, and your echo outputs those raw bytes. If you want to output a hexadecimal representation of the raw byte contents instead, use bin2hex:
echo bin2hex($contents);
If you want that arbitrarily grouped with a space every two bytes, you can do something along these lines:
echo join(' ', str_split(bin2hex($contents), 4));
(Note that this is all rather inefficient, modifying the entire, possibly many megabyte large file in memory. I'm expecting this is just for debugging purposes, so won't go out of my way to write super efficient code.)
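If memory does become a concern, one possible approach is to read and convert the file in chunks. This is only a sketch for regular local files: hexDump() and its parameters are hypothetical names, not part of PHP.

```php
<?php
// Print a file as hex, grouped every two bytes, without loading it all at once.
// hexDump() is a hypothetical helper used for illustration.
function hexDump(string $path, int $chunkSize = 4096): void
{
    $fh = fopen($path, 'rb');
    if ($fh === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $first = true;
    // An even chunk size keeps the two-byte groups aligned across chunks.
    while (($chunk = fread($fh, $chunkSize)) !== false && $chunk !== '') {
        echo ($first ? '' : ' ') . implode(' ', str_split(bin2hex($chunk), 4));
        $first = false;
    }

    fclose($fh);
}
```

Calling hexDump('myzipfile.zip') would then print the same grouped output as the one-liner, one chunk at a time.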
file_get_contents() will return the exact contents of the file, so the format depends on the file type.
If you are looking for the size of the file in bytes, you can get it, along with other file information, from SPL's SplFileInfo class:
$info = new SplFileInfo('myzipfile.zip');
$bytes = $info->getSize();

string contain many '\0' after inflate

I am trying to decompress blocks of data that were compressed with zlib. The author noted that to decompress them I must use inflate_init and inflate with Z_SYNC_FLUSH. I am sure this must work, because it works in PHP this way:
$temp = substr($temp, 2, -4);
$temp[0] = chr(ord($temp[0]) | 1); // square brackets: the {} offset syntax was removed in PHP 8
$temp = gzinflate($temp);
But I have tried many methods to decompress this in C++, and every time it fails.
Here is one of them :
char compressedblockbuffer[3371];
char uncompressedblockbuffer[8192];
is.read(compressedblockbuffer, 3371);
z_stream strm;
strm.zalloc = Z_NULL;
strm.zfree = Z_NULL;
strm.opaque = Z_NULL;
strm.avail_in = 3371;
strm.next_in = (Bytef *)compressedblockbuffer;
strm.avail_out = 8192;
strm.next_out = (Bytef *)uncompressedblockbuffer;
inflateInit(&strm);
inflate(&strm, Z_SYNC_FLUSH);
inflateEnd(&strm);
It's not the full code, just an example to show the problem, which is why I hard-coded the already known sizes.
I am using the latest zlib release, so maybe something has changed in zlib's inflate since 2003-2004?
The result is that uncompressedblockbuffer contains '\0' at indexes 2, 3, 4 and many others, and if I print it to the console I see only the first two elements.
UPD:
If gzinflate() in PHP works on the data, then your code won't. gzinflate() expects raw deflate data. Your code is looking for zlib-wrapped deflate data. If you want to decode raw deflate data, you need to use inflateInit2(&strm, -15) instead.
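The difference between the two formats can be seen from the PHP side. This is a sketch assuming a complete zlib stream with no preset dictionary, where the wrapper is a 2-byte header plus a 4-byte Adler-32 trailer. The sample data is made up.

```php
<?php
// Hypothetical sample data.
$original = str_repeat('hello zlib ', 100);

// zlib-wrapped stream: 2-byte header + raw deflate data + 4-byte Adler-32 checksum.
$wrapped = zlib_encode($original, ZLIB_ENCODING_DEFLATE);

// Stripping the wrapper leaves raw deflate data, which is what gzinflate() expects.
$raw = substr($wrapped, 2, -4);

var_dump(gzinflate($raw) === $original);        // bool(true)
var_dump(gzuncompress($wrapped) === $original); // bool(true)
```

gzuncompress() plays the role of inflateInit() here (it expects the wrapper), while gzinflate() corresponds to inflateInit2() with negative window bits.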
Your call to inflate() is likely returning an error that you are not checking for. You need to always check the return codes of the zlib routines, or for that matter any function that has the potential to return an error.
What kind of data are you decompressing? Many binary formats are perfectly accepting of NUL bytes in their data, since it just reads as a value of 0. For example, inside of image data in many formats, it'd just represent a value of 0 in either that channel or pixel (depending on data size). Not to mention, binary formats don't necessarily read as bytes. A NUL byte may actually be a part of a 2- or 4-byte value.
This is the problem with trying to read binary data as a character string. Binary data needn't follow the rules of text. This is why usually the data boundary is a separate size value, because it can't terminate on NUL values like text.
If you have the original uncompressed data for comparison, either load that data into memory and compare the data, or save the decompressed data to a file and use a diff tool to do a binary comparison of the files.

PHP string comparison not working, file reading

I am using PHP to read in a tab-delimited CSV file and a pipe-delimited TXT file. Unfortunately, I cannot get a string comparison to work even though the characters appear to be exactly the same. I used trim to clean up hidden characters, and I even tried type-casting to string.
Var dump shows they are clearly different but I am not sure how to make them the same?
// read in CSV file
$fh = fopen($mapping_date, 'r');
$mapping_data = fread($fh, filesize($mapping_date));
...
// use str_getcsv to put each line into an array
// get values out that I want to compare
$this_strategy = (string)trim($strategy_name);
$row_strategy = (string)trim($row3['_Strategy_Name']);
if ($this_strategy == $row_strategy) { /* do something */ }
var_dump($this_strategy);
Vardump: string(16) "Low Spend ($0.2)"
var_dump($row_strategy);
Vardump: string(31) "Low Spend ($0.2)"
Can't figure out for the life of me how to make this work.
Looks like you have the database encoded in UCS2 (assuming it's MySQL). http://dev.mysql.com/doc/refman/5.1/en/charset-unicode-ucs2.html
You can possibly use iconv to convert the format. There is also an example in the comments on that page, although it doesn't use iconv: http://php.net/manual/en/function.iconv.php#49171 . I've not tested it.
Alternatively, change the database field encoding to utf8_generic or ASCII or whatever the file is encoded as?
Edit: Found the actual PHP function you want: mb_convert_encoding. UCS-2 is one of the supported encodings, so enable the mbstring extension in php.ini and you're good to go.
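A sketch of how that conversion might look, assuming the mbstring extension is enabled and that the value really is UTF-16LE (a guess: 31 bytes is consistent with a 16-character string stored at two bytes per character, with the trailing NUL byte removed by trim). The variable names are hypothetical.

```php
<?php
// Simulate a value read from a UTF-16LE file (an assumption about the asker's data).
$fromFile = mb_convert_encoding('Low Spend ($0.2)', 'UTF-16LE', 'UTF-8');

var_dump(strlen($fromFile));       // int(32): two bytes per character
var_dump(strlen(trim($fromFile))); // int(31): trim() also strips the trailing NUL byte

// Convert to UTF-8 first, then trim and compare.
$normalized = trim(mb_convert_encoding($fromFile, 'UTF-8', 'UTF-16LE'));
var_dump($normalized === 'Low Spend ($0.2)'); // bool(true)
```

The order matters: trimming before converting can remove a byte that belongs to the last character.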

PHP write binary response

In PHP, is there a way to write binary data to the response stream,
like the equivalent of this C# (ASP.NET) code?
System.IO.BinaryWriter Binary = new System.IO.BinaryWriter(Response.OutputStream);
Binary.Write((System.Int32)1);//01000000
Binary.Write((System.Int32)1020);//FC030000
Binary.Close();
I would then like to be able read the response in a c# application, like
System.Net.HttpWebRequest Request = (System.Net.HttpWebRequest)System.Net.WebRequest.Create("URI");
System.IO.BinaryReader Binary = new System.IO.BinaryReader(Request.GetResponse().GetResponseStream());
System.Int32 i = Binary.ReadInt32();//1
i = Binary.ReadInt32();//1020
Binary.Close();
In PHP, strings and byte arrays are one and the same. Use pack to create a byte array (string) that you can then write. Once I realized that, life got easier.
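// pack() takes the integer values; "V" writes each as 32-bit little-endian,
// giving the bytes 01000000 and FC030000 from the question.
$my_byte_array = pack("VV", 1, 1020);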
$fp = fopen("somefile.txt", "w");
fwrite($fp, $my_byte_array);
// or just echo to stdout
echo $my_byte_array;
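To confirm which bytes actually get written, a small sketch using the question's two values and the "V" format code (unsigned 32-bit, little-endian, which is the byte order BinaryReader.ReadInt32() reads). The variable names are hypothetical.

```php
<?php
// 'V' packs an unsigned 32-bit integer in little-endian byte order.
$payload = pack('VV', 1, 1020);
echo bin2hex($payload), "\n"; // 01000000fc030000

// Reading the values back on the PHP side.
$values = unpack('Vfirst/Vsecond', $payload);
var_dump($values['first'], $values['second']); // int(1) int(1020)
```

When serving this over HTTP, sending a Content-Type such as application/octet-stream before echoing the payload is a common choice.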
Usually, I use chr();
echo chr(255); // Returns one byte, value 0xFF
http://php.net/manual/en/function.chr.php
This is the same answer I posted to this, similar, question.
Assuming that array $binary is a previously constructed array of bytes (like monochrome bitmap pixels in my case) that you want written to disk in this exact order, the code below worked for me on an AMD 1055t running Ubuntu Server 10.04 LTS.
I iterated over every kind of answer I could find on the Net, checking the output (I used either shed or vi, like in this answer) to confirm the results.
<?php
$fp = fopen($base.".bin", "w");
$binout=Array();
for($idx=0; $idx < $stop; $idx=$idx+2 ){
    if( array_key_exists($idx,$binary) )
        fwrite($fp,pack( "n", $binary[$idx]<<8 | $binary[$idx+1]));
    else {
        echo "index $idx not found in array \$binary[], wtf?\n";
    }
}
fclose($fp);
echo "Filename $base.bin had ".filesize($base.".bin")." bytes written\n";
?>
You probably want the pack function -- it gives you a decent amount of control over how you want your values structured as well, i.e., 16 bits or 32 bits at a time, little-endian versus big-endian, etc.
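For illustration, here is how the same value comes out under four of pack's format codes. The value 0x1234 is arbitrary.

```php
<?php
$value = 0x1234;

echo bin2hex(pack('n', $value)), "\n"; // 1234     (16-bit, big-endian)
echo bin2hex(pack('v', $value)), "\n"; // 3412     (16-bit, little-endian)
echo bin2hex(pack('N', $value)), "\n"; // 00001234 (32-bit, big-endian)
echo bin2hex(pack('V', $value)), "\n"; // 34120000 (32-bit, little-endian)
```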

PHP fseek() equivalent for variables?

What I need is an equivalent for PHP's fseek() function. The function works on files, but I have a variable that contains binary data and I want to work on it. I know I could use substr(), but that would be lame - it's used for strings, not for binary data. Also, creating a file and then using fseek() is not what I am looking for either.
Maybe something constructed with streams?
EDIT: Okay, I'm almost there:
$data = fopen('data://application/binary;binary,'.$bin,'rb');
Warning: failed to open stream: rfc2397: illegal parameter
Kai:
You have almost answered yourself here. Streams are the answer. The following manual entry will be enlightening: http://us.php.net/manual/en/wrappers.data.php
It essentially allows you to pass arbitrary data to PHP's file handling functions such as fopen (and thus fseek).
Then you could do something like:
<?php
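// The data must be encoded to be valid in a data: URI, and fopen() needs a mode.
$data = fopen('data://application/octet-stream;base64,' . base64_encode($binaryData), 'rb');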
fseek($data, 128);
?>
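Another stream-based option is PHP's built-in php://memory stream, which avoids building a data: URI. A minimal sketch, with a made-up payload:

```php
<?php
// Hypothetical binary payload: the bytes 0x00..0xFF.
$binaryData = implode('', array_map('chr', range(0, 255)));

// php://memory is an in-memory stream that supports the usual file functions.
$stream = fopen('php://memory', 'r+b');
fwrite($stream, $binaryData);

fseek($stream, 128);        // jump to offset 128
$chunk = fread($stream, 4); // read 4 bytes from there
echo bin2hex($chunk), "\n"; // 80818283

fclose($stream);
```

Note that this copies the data into the stream, so it costs extra memory for large values.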
fseek on data in a variable doesn't make sense. fseek just positions the file handle to the specified offset, so the next fread call starts reading from that offset. There is no equivalent of fread for strings.
What's wrong with substr()?
With a file you would do:
$f = fopen(...)
fseek($f, offset)
$x = fread($f, len)
with substr:
$x = substr($var, offset, len)
I'm guessing, but maybe what is being asked for is a way to access bytes in a variable through a pointer: using it like an array of bytes, as you could in C, without the memory overhead of putting the data in PHP arrays, and being able to edit the bytes in place without the overhead of copying the data.
Not being able to do this is a big problem, but if the operating system caches disk data well, using fseek on a temporary file could be a workaround.
