How to extract certificates from app attestation object using php? - php

I tried to set up app attestation between my app and PHP, but I can hardly find any source of explanation other than Apple's own documentation, which left me stuck at quite an early stage. So far I have completed the following steps:
On the client side, following https://developer.apple.com/documentation/devicecheck/establishing_your_app_s_integrity, I created my attestation as a base64-encoded string:
attestation.base64EncodedString()
I then send that string to the server, following https://developer.apple.com/documentation/devicecheck/validating_apps_that_connect_to_your_server from now on.
The documentation says that the attestation is in CBOR format. I therefore first decode the base64-encoded string and parse it using https://github.com/Spomky-Labs/cbor-php.
<?php
use CBOR\Decoder;
use CBOR\OtherObject;
use CBOR\Tag;
use CBOR\StringStream;
$otherObjectManager = new OtherObject\OtherObjectManager();
$tagManager = new Tag\TagObjectManager();
$decoder = new Decoder($tagManager, $otherObjectManager);
$data = base64_decode(/* .. base64 encoded attestation string as sent from the client (see swift snippet above) */);
$stream = new StringStream($data);
$object = $decoder->decode($stream);
$norm = $object->getNormalizedData();
$fmt = $norm['fmt'];
$x5c = $norm['attStmt']['x5c'];
From the documentation, the normalized object should have the following format:
{
fmt: 'apple-appattest',
attStmt: {
x5c: [
<Buffer 30 82 02 cc ... >,
<Buffer 30 82 02 36 ... >
],
receipt: <Buffer 30 80 06 09 ... >
},
authData: <Buffer 21 c9 9e 00 ... >
}
which it does:
$fmt == "apple-appattest" // true
The next step, according to the documentation, is described as:
Verify that the x5c array contains the intermediate and leaf certificates for App Attest, starting from the credential certificate in the first data buffer in the array (credcert). Verify the validity of the certificates using Apple’s App Attest root certificate.
However, I don't know how to proceed from here. The content of e.g. $norm['attStmt']['x5c'][0] is a mix of readable chars and glyphs. To give you an idea, this is a random substring from the content of $norm['attStmt']['x5c'][0]: "Certification Authority10U Apple Inc.10 UUS0Y0*�H�=*�H�=B��c�}�". That's why I'm not really sure whether I have to perform any further encoding/decoding steps.
I tried parsing the certificate but without any luck (both var_dump return false):
$cert = openssl_x509_read($x5c[0]);
var_dump($cert); // false - indicating that reading the cert failed
$parsedCert = openssl_x509_parse($cert, false);
var_dump($parsedCert); // false - of course, since the prior step did not succeed
Any ideas, guidance or alternative resources are highly appreciated. Thank you!

After a while I came up with the following solution. The $x5c field contains a list of certificates, all in binary form. I wrote the following converter to create a ready-to-use certificate in PEM format, which does the following:
base64 encode the binary data
break lines after 64 characters
add BEGIN and END markers (also note the trailing line-break on the end certificate line)
function makeCert($bindata) {
    $beginpem = "-----BEGIN CERTIFICATE-----\n";
    $endpem = "-----END CERTIFICATE-----\n";
    // chunk_split() breaks the base64 text into 64-character lines and
    // terminates every line, including the last one, with "\n"
    $body = chunk_split(base64_encode($bindata), 64, "\n");
    return $beginpem . $body . $endpem;
}
the following then works:
openssl_x509_read(makeCert($x5c[0]))
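Every entry of $x5c is DER, i.e. binary ASN.1, which is why the buffers start with 30 82 and look like glyphs when printed. The same conversion can therefore be applied to the whole array. The following is a minimal sketch, assuming PHP 7.0 or later; derToPem and x5cToPem are hypothetical helper names, not part of any library:

```php
<?php
// Hypothetical helper: wrap one DER-encoded certificate in PEM armor.
function derToPem(string $der): string
{
    // chunk_split() appends "\n" after every 64 characters, including the last chunk.
    return "-----BEGIN CERTIFICATE-----\n"
        . chunk_split(base64_encode($der), 64, "\n")
        . "-----END CERTIFICATE-----\n";
}

// Hypothetical helper: convert every entry of the x5c array to PEM.
function x5cToPem(array $x5c): array
{
    $pems = [];
    foreach ($x5c as $i => $der) {
        // A DER certificate is an ASN.1 SEQUENCE, so its first byte is 0x30.
        if (!is_string($der) || $der === '' || $der[0] !== "\x30") {
            throw new InvalidArgumentException("x5c[$i] does not look like DER data");
        }
        $pems[] = derToPem($der);
    }
    return $pems;
}
```

Each returned string can be handed to openssl_x509_read(). On PHP 7.4 and later, openssl_x509_verify() should then allow checking each certificate's signature against its issuer, up to Apple's App Attest root certificate.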

Related

Passing public key in PEM format to openssl_pkey_get_public gives error:0906D06C:PEM routines:PEM_read_bio:no start line

The following public RSA key in PEM format was provided to openssl_pkey_get_public.
-----BEGIN PUBLIC KEY-----
MIIBITANBgkqhkiG9w0BAQEFAAOCAQ4AMIIBCQKCAQCIZouo/rL5IkIIGrke/qkY
Nsb9JDXUw2MfutYdwIVjPiEbAcLiVxK6tOVXy7dq+hU0zyNd68bUi7VJjXWoiepS
+Mm6v76GCGvVvno48m7ofWIq6VLEaMQjIM/pzkF0TW7CmtjKvgg722Hx87AI/KCM
sWuHjhcQZsMgV4ibC8EAY6GYwHYAPWfUq+LI2wfRsQHumFC2IuT4guO/Vs5FJGXw
Arrvv7VPyKwZ8cpcZn9ka1K0N7su7QiGnzOhS3n2THaj25alE6TMXnrKmt6yIiXh
amsKVEKPPzHpw9ldTao1aG7vVNC9QXC8i9uQTWhhokxvSNw5OYFFkDZC5jD7McvB
AgMBAAE=
-----END PUBLIC KEY-----
However, the method call fails, returning false, with the error string error:0906D06C:PEM routines:PEM_read_bio:no start line
Is the public key invalid? For the record, my code is starting with a public key modulus and exponent and converting it to PEM format using the algorithm posted here.
Here's the full script:
<?php
function createPemFromModulusAndExponent($n, $e)
{
$modulus = urlsafeB64Decode($n);
$publicExponent = urlsafeB64Decode($e);
$components = array(
'modulus' => pack('Ca*a*', 2, encodeLength(strlen($modulus)), $modulus),
'publicExponent' => pack('Ca*a*', 2, encodeLength(strlen($publicExponent)), $publicExponent)
);
$RSAPublicKey = pack('Ca*a*a*', 48, encodeLength(strlen($components['modulus']) + strlen($components['publicExponent'])), $components['modulus'], $components['publicExponent']);
$rsaOID = pack('H*', '300d06092a864886f70d0101010500');
$RSAPublicKey = chr(0) . $RSAPublicKey;
$RSAPublicKey = chr(3) . encodeLength(strlen($RSAPublicKey)) . $RSAPublicKey;
$RSAPublicKey = pack('Ca*a*', 48, encodeLength(strlen($rsaOID . $RSAPublicKey)), $rsaOID . $RSAPublicKey);
$RSAPublicKey = "-----BEGIN PUBLIC KEY-----\r\n" . chunk_split(base64_encode($RSAPublicKey), 64) . '-----END PUBLIC KEY-----';
return $RSAPublicKey;
}
function urlsafeB64Decode($input)
{
$remainder = strlen($input) % 4;
if ($remainder)
{
$padlen = 4 - $remainder;
$input .= str_repeat('=', $padlen);
}
return base64_decode(strtr($input, '-_', '+/'));
}
function encodeLength($length)
{
if ($length <= 0x7F)
{
return chr($length);
}
$temp = ltrim(pack('N', $length), chr(0));
return pack('Ca*', 0x80 | strlen($temp), $temp);
}
$key = createPemFromModulusAndExponent('iGaLqP6y-SJCCBq5Hv6pGDbG_SQ11MNjH7rWHcCFYz4hGwHC4lcSurTlV8u3avoVNM8jXevG1Iu1SY11qInqUvjJur--hghr1b56OPJu6H1iKulSxGjEIyDP6c5BdE1uwprYyr4IO9th8fOwCPygjLFrh44XEGbDIFeImwvBAGOhmMB2AD1n1KviyNsH0bEB7phQtiLk-ILjv1bORSRl8AK677-1T8isGfHKXGZ_ZGtStDe7Lu0Ihp8zoUt59kx2o9uWpROkzF56ypresiIl4WprClRCjz8x6cPZXU2qNWhu71TQvUFwvIvbkE1oYaJMb0jcOTmBRZA2QuYw-zHLwQ', 'AQAB');
print_r($key);
print_r(openssl_pkey_get_public($key));
print_r(openssl_error_string());
First: openssl_pkey_get_public is intended to either load the public key directly or extract it from a certificate, as described in the documentation of the certificate parameter of openssl_pkey_get_public.
There has already been a bug filed for this issue, #75643 from Dec 2017 (version 7.1.12), which has the status No Feedback and is currently suspended (note that #75643 actually refers to openssl_public_encrypt, which however uses the same logic regarding the key as openssl_pkey_get_public, here):
The error in the queue is expected. If you supply string as a PEM
(string not prefixed by "file://" which would be a file path), then
certificate is tried first (using PEM_ASN1_read_bio). It means that it
fails and the error is saved to the queue. However this queue is just
a copy of the OpenSSL which is emptied. After that the key is loaded
using PEM_read_bio_PUBKEY which is successful in your case so you get
back the result. To sum it up openssl_error_string does not mean that
the operation failed but just that some error was emitted...
According to this, the error message is caused by the failure to extract the key from the certificate. However, processing is continued and the key is loaded directly. In other words, the error message occurs as expected when loading the key directly and can be ignored in this context (at least if the direct loading is successful).
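In practice this means the return value, not the error queue, decides whether loading succeeded. A minimal sketch of that pattern, assuming the openssl extension is loaded; loadPublicKey is a hypothetical helper name:

```php
<?php
// Hypothetical helper: load a PEM public key and report only real failures.
function loadPublicKey(string $pem)
{
    // Drain stale entries so the queue only reflects this call.
    while (openssl_error_string() !== false) {
    }

    $key = openssl_pkey_get_public($pem);
    if ($key === false) {
        // Only a false return value means failure; now the messages matter.
        $errors = [];
        while (($msg = openssl_error_string()) !== false) {
            $errors[] = $msg;
        }
        throw new RuntimeException('Cannot load public key: ' . implode('; ', $errors));
    }

    // Success: discard any "no start line" noise left by the certificate attempt.
    while (openssl_error_string() !== false) {
    }
    return $key;
}
```

With this wrapper a leftover "no start line" entry no longer looks like a failure, because the queue is only inspected when openssl_pkey_get_public() actually returned false.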
For the record: As of 7.2(.17), a slightly different error message is displayed: error:0909006C:PEM routines:get_name:no start line.
Update:
As @President James Moveon Polk noted in his comment, createPemFromModulusAndExponent doesn't generate the key correctly. ASN.1 INTEGERs are signed, so if the first (most significant) byte is greater than 0x7F, the modulus must be preceded by a 0x00 byte, which currently does not happen. E.g. in the posted code the modulus (Base64url decoded) starts with 0x88, which means that the generated (i.e. the posted) key is invalid. If a 0x00 is prepended manually and the corrected value is passed (Base64url encoded) to createPemFromModulusAndExponent, the following, now valid, key results:
-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAiGaLqP6y+SJCCBq5Hv6p
GDbG/SQ11MNjH7rWHcCFYz4hGwHC4lcSurTlV8u3avoVNM8jXevG1Iu1SY11qInq
UvjJur++hghr1b56OPJu6H1iKulSxGjEIyDP6c5BdE1uwprYyr4IO9th8fOwCPyg
jLFrh44XEGbDIFeImwvBAGOhmMB2AD1n1KviyNsH0bEB7phQtiLk+ILjv1bORSRl
8AK677+1T8isGfHKXGZ/ZGtStDe7Lu0Ihp8zoUt59kx2o9uWpROkzF56ypresiIl
4WprClRCjz8x6cPZXU2qNWhu71TQvUFwvIvbkE1oYaJMb0jcOTmBRZA2QuYw+zHL
wQIDAQAB
-----END PUBLIC KEY-----
Of course it would be better if createPemFromModulusAndExponent made this correction automatically. @President James Moveon Polk has filed an issue for this, here.
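The automatic correction could look roughly like the sketch below, which encodes an unsigned big-endian byte string as an ASN.1 INTEGER and adds the 0x00 byte when the high bit is set. Both function names are hypothetical; this is not the code from the linked issue:

```php
<?php
// DER length octets: short form below 128, long form otherwise.
function derLength(int $length): string
{
    if ($length <= 0x7F) {
        return chr($length);
    }
    $bytes = ltrim(pack('N', $length), "\x00");
    return chr(0x80 | strlen($bytes)) . $bytes;
}

// Hypothetical helper: encode an unsigned big-endian byte string as an ASN.1 INTEGER.
function derUnsignedInteger(string $bytes): string
{
    // Drop redundant leading zero bytes, but keep at least one byte.
    $bytes = ltrim($bytes, "\x00");
    if ($bytes === '') {
        $bytes = "\x00";
    }
    // INTEGER is signed: a set high bit would mean "negative", so pad with 0x00.
    if (ord($bytes[0]) > 0x7F) {
        $bytes = "\x00" . $bytes;
    }
    return "\x02" . derLength(strlen($bytes)) . $bytes;
}
```

For a 256-byte modulus starting with 0x88 this yields the prefix 02 82 01 01 00, which matches the corrected key shown above.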
Allow me to propose an alternative way that's quite a bit simpler and more succinct. Using phpseclib,
require __DIR__ . '/vendor/autoload.php';
use phpseclib\Math\BigInteger;
use phpseclib\Crypt\RSA;
$rsa = new RSA;
$rsa->loadKey([
// The values are Base64url encoded, so map '-' and '_' back to '+' and '/' before decoding;
// base64_decode() would otherwise silently drop those characters
'e' => new BigInteger(base64_decode(strtr('AQAB', '-_', '+/')), 256),
'n' => new BigInteger(base64_decode(strtr('iGaLqP6y-SJCCBq5Hv6pGDbG_SQ11MNjH7rWHcCFYz4hGwHC4lcSurTlV8u3avoVNM8jXevG1Iu1SY11qInqUvjJur--hghr1b56OPJu6H1iKulSxGjEIyDP6c5BdE1uwprYyr4IO9th8fOwCPygjLFrh44XEGbDIFeImwvBAGOhmMB2AD1n1KviyNsH0bEB7phQtiLk-ILjv1bORSRl8AK677-1T8isGfHKXGZ_ZGtStDe7Lu0Ihp8zoUt59kx2o9uWpROkzF56ypresiIl4WprClRCjz8x6cPZXU2qNWhu71TQvUFwvIvbkE1oYaJMb0jcOTmBRZA2QuYw-zHLwQ', '-_', '+/')), 256)
]);
print_r(openssl_pkey_get_public($rsa));
The code you're using is, in fact, using code that was lifted from phpseclib 2.0. See https://github.com/dragosgaftoneanu/okta-simple-jwt-verifier/issues/1#issuecomment-612503921 for more info.

Unicode error when loading a json_encoded PHP array from memcached

I need to transfer a PHP associative array to further processing using python. The python code however using pylibmc is unable to load the string from memcached, throwing this error:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe0 in position 32: invalid continuation byte
I wrote a little tester. The PHP code to create the memcached data:
<?php
$mc = new Memcached();
$mc->addServer('localhost', 11211);
$data = array();
for ( $i = 0; $i < 100; $i++) {
$index = "ti" . $i;
$data += [$index => "test string $i"];
}
$mc->delete('test');
$mc->add('test', json_encode($data), 60);
$reverse = $mc->get('test');
echo "$reverse\n"; // prints {"ti0":"test string 0" ...... "ti99":"test string 99"} as expected
$reverse_array = json_decode($reverse, true);
echo $reverse_array['ti10'] . "\n";
//prints 'test string 10' as expected
?>
so this works fine writing to memcached from PHP and reading it back.
On the python side, this is the code I use to read it in:
#!/usr/bin/python
import pylibmc
import json
mc = pylibmc.Client(["127.0.0.1"], binary=True, behaviors={"cas": True, "tcp_nodelay": True,"ketama": True})
temp = json.loads(mc.get("test"))
When running the python code, this is the output I get:
Traceback (most recent call last):
File "./mctest.py", line 7, in <module>
temp = json.loads(mc.get("test"))
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe0 in position 32: invalid continuation byte
If I create a non-associative array in PHP and share that through memcached, things work fine.
Two further options I've tried:
adding utf8_encode to make sure it's properly encoded:
$mc->add('test', utf8_encode(json_encode($data)), 60);
adding JSON_UNESCAPED_UNICODE to the json_encode function:
$mc->add('test', json_encode($data, JSON_UNESCAPED_UNICODE), 60);
both result in identical outcomes on the python side.
Bit at a loss here - any ideas welcome!
While trying to determine the encoding of the string retrieved from memcached via pymemcache, it occurred to me that the string didn't look like any known encoding. I confirmed this using chardet as well as cchardet.
After some more digging at the PHP end, I discovered that the PHP memcached module adulterates the strings it saves to memcached by compressing the data!
Solution was to add this line to the /etc/php/7.2/cli/conf.d/25-memcached.ini file:
memcached.compression_threshold=9999999999
Now the data comes into python as it should!
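Instead of raising the threshold globally in the ini file, compression can presumably also be switched off for a single client through the Memcached::OPT_COMPRESSION option. A minimal sketch, assuming the php-memcached extension; addJsonUncompressed is a hypothetical helper name and this has not been tested against the setup above:

```php
<?php
// Hypothetical helper: store JSON so that non-PHP clients can read it verbatim.
function addJsonUncompressed(Memcached $mc, string $key, array $data, int $ttl): bool
{
    // Disable payload compression for this client instance only.
    $mc->setOption(Memcached::OPT_COMPRESSION, false);
    return $mc->add($key, json_encode($data), $ttl);
}
```

This keeps the change local to the code that shares data with Python, while other PHP consumers of the same server can continue to use compression.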

Shortest possible encoded string with a decode possibility (shorten URL) using only PHP

I'm looking for a method that encodes a string to the shortest possible length and lets it be decodable (pure PHP, no SQL). I have working script, but I'm unsatisfied with the length of the encoded string.
Scenario
Link to an image (it depends on the file resolution I want to show to the user):
www.mysite.com/share/index.php?img=/dir/dir/hi-res-img.jpg&w=700&h=500
Encoded link (so the user can't guess how to get the larger image):
www.mysite.com/share/encodedQUERYstring
So, basically I'd like to encode only the search query part of the URL:
img=/dir/dir/hi-res-img.jpg&w=700&h=500
The method I use right now will encode the above query string to:
y8xNt9VPySwC44xM3aLUYt3M3HS9rIJ0tXJbcwMDtQxbUwMDAA
The method I use is:
$raw_query_string = 'img=/dir/dir/hi-res-img.jpg&w=700&h=500';
$encoded_query_string = base64_encode(gzdeflate($raw_query_string));
$decoded_query_string = gzinflate(base64_decode($encoded_query_string));
How do I shorten the encoded result and still have the possibility to decode it using only PHP?
I suspect that you will need to think more about your method of encoding if you don't want it to be decodable by the user. The issue with Base64 is that a Base64 string looks like a Base64 string. There's a good chance that someone savvy enough to be looking at your page source will recognise it too.
Part one:
a method that encodes an string to shortest possible length
If you're flexible on your URL vocabulary/characters, this will be a good starting place. Since gzip makes most of its gains through back references, there is little point in using it on a string this short.
Consider your example - you've only saved 2 bytes in the compression, which are lost again in Base64 padding:
Non-gzipped: string(52) "aW1nPS9kaXIvZGlyL2hpLXJlcy1pbWcuanBnJnc9NzAwJmg9NTAw"
Gzipped: string(52) "y8xNt9VPySwC44xM3aLUYt3M3HS9rIJ0tXJbcwMDtQxbUwMDAA=="
If you reduce your vocabulary size, this will naturally allow you better compression. Let's say we remove some redundant information.
Take a look at the functions:
function compress($input, $ascii_offset = 38){
$input = strtoupper($input);
$output = '';
//We can try for a 4:3 (8:6) compression (roughly), 24 bits for 4 characters
foreach(str_split($input, 4) as $chunk) {
$chunk = str_pad($chunk, 4, '=');
$int_24 = 0;
for($i=0; $i<4; $i++){
//Shift the output to the left 6 bits
$int_24 <<= 6;
//Add the next 6 bits
//Discard the leading ASCII chars, i.e. make $ascii_offset the new zero
$int_24 |= (ord($chunk[$i]) - $ascii_offset) & 0b111111;
}
//Here we take the 4 sets of 6 apart in 3 sets of 8
for($i=0; $i<3; $i++) {
$output = pack('C', $int_24) . $output;
$int_24 >>= 8;
}
}
return $output;
}
And
function decompress($input, $ascii_offset = 38) {
$output = '';
foreach(str_split($input, 3) as $chunk) {
//Reassemble the 24 bit ints from 3 bytes
$int_24 = 0;
foreach(unpack('C*', $chunk) as $char) {
$int_24 <<= 8;
$int_24 |= $char & 0b11111111;
}
//Expand the 24 bits to 4 sets of 6, and take their character values
for($i = 0; $i < 4; $i++) {
$output = chr($ascii_offset + ($int_24 & 0b111111)) . $output;
$int_24 >>= 6;
}
}
//Make lowercase again and trim off the padding.
return strtolower(rtrim($output, '='));
}
It is basically a removal of redundant information, followed by the compression of 4 bytes into 3. This is achieved by effectively having a 6-bit subset of the ASCII table. This window is moved so that the offset starts at useful characters and includes all the characters you're currently using.
With the offset I've used, you can use anything from ASCII 38 to 101 (64 values). This gives you a resulting string of 30 bytes, that's a 9-byte (23%) compression! Unfortunately, you'll need to make it URL-safe (probably with base64), which brings it back up to 40 bytes.
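The URL-safe step could be done with the Base64url variant, which swaps + and / for - and _ and drops the padding. A minimal sketch; both function names are hypothetical:

```php
<?php
// Hypothetical helper: Base64 with the URL-safe alphabet and no "=" padding.
function base64UrlEncode(string $bin): string
{
    return rtrim(strtr(base64_encode($bin), '+/', '-_'), '=');
}

// Hypothetical helper: reverse of base64UrlEncode().
function base64UrlDecode(string $text): string
{
    // Restore the standard alphabet and the padding that was stripped.
    $padded = strtr($text, '-_', '+/') . str_repeat('=', (4 - strlen($text) % 4) % 4);
    $bin = base64_decode($padded, true);
    if ($bin === false) {
        throw new InvalidArgumentException('Input is not valid Base64url');
    }
    return $bin;
}
```

A 30-byte input encodes to exactly 40 characters, because 30 is a multiple of 3 and no padding is needed.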
I think at this point, you're pretty safe to assume that you've reached the "security through obscurity" level required to stop 99.9% of people. Let's continue though, to the second part of your question
so the user can't guess how to get the larger image
It's arguable that this is already solved with the above, but you need to pass this through a secret on the server, preferably with PHP's OpenSSL interface. The following code shows the complete usage flow of functions above and the encryption:
$method = 'AES-256-CBC';
$secret = base64_decode('tvFD4Vl6Pu2CmqdKYOhIkEQ8ZO4XA4D8CLowBpLSCvA=');
$iv = base64_decode('AVoIW0Zs2YY2zFm5fazLfg==');
$input = 'img=/dir/dir/hi-res-img.jpg&w=700&h=500';
var_dump($input);
$compressed = compress($input);
var_dump($compressed);
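// OPENSSL_RAW_DATA is not set (options = 0), so the ciphertext comes back Base64 encoded
$encrypted = openssl_encrypt($compressed, $method, $secret, 0, $iv);
var_dump($encrypted);
$decrypted = openssl_decrypt($encrypted, $method, $secret, 0, $iv);
var_dump($decrypted);
// Decompress what came out of the decryption step to close the cycle
$decompressed = decompress($decrypted);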
var_dump($decompressed);
The output of this script is the following:
string(39) "img=/dir/dir/hi-res-img.jpg&w=700&h=500"
string(30) "<��(��tJ��#�xH��G&(�%��%��xW"
string(44) "xozYGselci9i70cTdmpvWkrYvGN9AmA7djc5eOcFoAM="
string(30) "<��(��tJ��#�xH��G&(�%��%��xW"
string(39) "img=/dir/dir/hi-res-img.jpg&w=700&h=500"
You'll see the whole cycle: compression → encryption → Base64 encode/decode → decryption → decompression. The output is about as short as this approach can realistically get.
Everything aside, I feel obliged to conclude this with the fact that it is theoretical only, and this was a nice challenge to think about. There are definitely better ways to achieve your desired result - I'll be the first to admit that my solution is a little bit absurd!
Instead of encoding the URL, output a thumbnail copy of the original image. Here's what I'm thinking:
1. Create a "map" for PHP by naming your pictures (the actual file names) using random characters. random_bytes is a great place to start.
2. Embed the desired resolution within the randomized URL string from #1.
3. Use the imagecopyresampled function to copy the original image into the resolution you would like before sending it to the client's device.
So for example:
Filename example (from bin2hex(random_bytes(6))): a1492fdbdcf2.jpg
Resolution desired: 800x600. My new link could look like:
http://myserver.com/?800a1492fdbdcf2600 or maybe http://myserver.com/?a1492800fdbdc600f2 or maybe even http://myserver.com/?800a1492fdbdcf2=600, depending on where I choose to embed the resolution within the link.
PHP would know that the file name is a1492fdbdcf2.jpg, grab it, use the imagecopyresampled to copy to the resolution you want, and output it.
Theory
In theory we need a short input character set and a large output character set.
I will demonstrate it by the following example. We have the number 2468 as integer with 10 characters (0-9) as character set. We can convert it to the same number with base 2 (binary number system). Then we have a shorter character set (0 and 1) and the result is longer:
100110100100
But if we convert it to a hexadecimal number (base 16) with a character set of 16 (0-9 and A-F), we get a shorter result:
9A4
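The two conversions can be reproduced with PHP's built-in base_convert, which handles bases 2 to 36. A short sketch; the variable names are only illustrative:

```php
<?php
$decimal = '2468';

// Smaller character set (2 symbols): the representation gets longer.
$binary = base_convert($decimal, 10, 2);

// Larger character set (16 symbols): the representation gets shorter.
// base_convert() returns lowercase letters, hence strtoupper().
$hex = strtoupper(base_convert($decimal, 10, 16));

printf("%s: %d chars\n", $decimal, strlen($decimal));
printf("%s: %d chars\n", $binary, strlen($binary));
printf("%s: %d chars\n", $hex, strlen($hex));
```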
Practice
So in your case we have the following character set for the input:
$inputCharacterSet = "0123456789abcdefghijklmnopqrstuvwxyz=/-.&";
In total 41 characters: Numbers, lower cases and the special chars = / - . &
The character set for the output is a bit trickier. We want to use URL-safe characters only. I've grabbed them from here: Characters allowed in GET parameter
So our output character set is (73 characters):
$outputCharacterSet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz~-_.!*'(),$";
Numbers, lower and upper cases and some special characters.
We have more characters in our set for the output than for the input. Theory says we can shorten our input string. Check!
Coding
Now we need an encode function from base 41 to base 73. For that case I don't know a PHP function. Luckily we can grab the function 'convBase' from here: Convert an arbitrarily large number from any base to any base
<?php
function convBase($numberInput, $fromBaseInput, $toBaseInput)
{
if ($fromBaseInput == $toBaseInput) return $numberInput;
$fromBase = str_split($fromBaseInput, 1);
$toBase = str_split($toBaseInput, 1);
$number = str_split($numberInput, 1);
$fromLen = strlen($fromBaseInput);
$toLen = strlen($toBaseInput);
$numberLen = strlen($numberInput);
$retval = '';
if ($toBaseInput == '0123456789')
{
$retval = 0;
for ($i = 1;$i <= $numberLen; $i++)
$retval = bcadd($retval, bcmul(array_search($number[$i-1], $fromBase), bcpow($fromLen, $numberLen-$i)));
return $retval;
}
if ($fromBaseInput != '0123456789')
$base10 = convBase($numberInput, $fromBaseInput, '0123456789');
else
$base10 = $numberInput;
if ($base10<strlen($toBaseInput))
return $toBase[$base10];
while($base10 != '0')
{
$retval = $toBase[bcmod($base10,$toLen)] . $retval;
$base10 = bcdiv($base10, $toLen, 0);
}
return $retval;
}
Now we can shorten the URL. The final code is:
$input = 'img=/dir/dir/hi-res-img.jpg&w=700&h=500';
$inputCharacterSet = "0123456789abcdefghijklmnopqrstuvwxyz=/-.&";
$outputCharacterSet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz~-_.!*'(),$";
$encoded = convBase($input, $inputCharacterSet, $outputCharacterSet);
var_dump($encoded); // string(34) "BhnuhSTc7LGZv.h((Y.tG_IXIh8AR.$!t*"
$decoded = convBase($encoded, $outputCharacterSet, $inputCharacterSet);
var_dump($decoded); // string(39) "img=/dir/dir/hi-res-img.jpg&w=700&h=500"
The encoded string has only 34 characters.
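The length can be estimated up front: a string of n characters over an alphabet of size a needs at most ceil(n · log a / log b) characters over an alphabet of size b. A small sketch of that estimate; maxEncodedLength is a hypothetical helper and the result is an upper bound, not an exact length:

```php
<?php
// Hypothetical helper: upper bound for the length after a base conversion.
function maxEncodedLength(int $inputLength, int $fromBase, int $toBase): int
{
    // Each input character carries log($fromBase) units of information,
    // each output character can hold log($toBase).
    return (int) ceil($inputLength * log($fromBase) / log($toBase));
}

// 39 characters over 41 symbols, re-encoded with 73 symbols.
echo maxEncodedLength(39, 41, 73), "\n";
```

For the 39-character query string this gives 34, in line with the result above. The bound also shows why the gain is limited: the ratio log 41 / log 73 is about 0.87, so only shrinking the input alphabet or growing the output alphabet moves it further.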
Optimizations
You can reduce the number of characters by:
reducing the length of the input string. Do you really need the overhead of URL parameter syntax? Maybe you can format your string as follows:
$input = '/dir/dir/hi-res-img.jpg,700,500';
This reduces the input itself and the input character set. Your reduced input character set is then:
$inputCharacterSet = "0123456789abcdefghijklmnopqrstuvwxyz/-.,";
Final output:
string(27) "E$AO.Y_JVIWMQ9BB_Xb3!Th*-Ut"
string(31) "/dir/dir/hi-res-img.jpg,700,500"
reducing the input character set ;-). Maybe you can exclude some more characters?
You can encode the numbers to characters first. Then your input character set can be reduced by 10!
increasing your output character set. The set I gave was googled within two minutes. Maybe you can use more URL-safe characters.
Security
Heads up: There is no cryptographic logic in the code. So if somebody guesses the character sets, they can decode the string easily. You can shuffle the character sets (once), which makes it a bit harder for an attacker, but not really safe. Maybe it's enough for your use case anyway.
Reading from the previous answers and below comments, you need a solution to hide the real path of your image parser, giving it a fixed image width.
Step 1: http://www.example.com/tn/full/animals/images/lion.jpg
You can achieve a basic "thumbnailer" by taking advantage of .htaccess
RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule tn/(full|small)/(.*) index.php?size=$1&img=$2 [QSA,L]
Your PHP file:
$basedir = "/public/content/";
$filename = realpath($basedir.$_GET["img"]);
## Check that the file exists (realpath() returns false otherwise) and is inside $basedir
if ($filename === false
|| strncmp($filename, $basedir, strlen($basedir)) !== 0) die("Bad file path");
switch ($_GET["size"]) {
case "full":
$width = 700;
$height = 500;
## You can also use getimagesize() to test if the image is landscape or portrait
break;
default:
$width = 350;
$height = 250;
break;
}
## Here is your old code for resizing images.
## Note that the "tn" directory can exist and store the actual reduced images
This lets you use the URL www.example.com/tn/full/animals/images/lion.jpg to view the reduced-size image.
This has the SEO advantage of preserving the original file name.
Step 2: http://www.example.com/tn/full/lion.jpg
If you want a shorter URL, if the number of images you have is not too much, you can use the basename of the file (e.g., "lion.jpg") and recursively search. When there is a collision, use an index to identify which one you want (e.g., "1--lion.jpg")
function matching_files($filename, $base) {
$directory_iterator = new RecursiveDirectoryIterator($base);
$iterator = new RecursiveIteratorIterator($directory_iterator);
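## preg_quote() keeps characters such as "." in the file name from acting as regex operators
$regex_iterator = new RegexIterator($iterator, "#" . preg_quote($filename, "#") . "\$#");
$regex_iterator->setFlags(RegexIterator::USE_KEY);
## create_function() was removed in PHP 8, so use a closure instead
return array_map(function ($a) { return $a->getPathname(); }, iterator_to_array($regex_iterator, false));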
}
function encode_name($filename) {
$files = matching_files(basename($filename), realpath('public/content'));
$tot = count($files);
if (!$tot)
return NULL;
if ($tot == 1)
return basename($filename);
return "/tn/full/" . array_search(realpath($filename), $files) . "--" . basename($filename);
}
function decode_name($filename) {
$i = 0;
if (preg_match("#^([0-9]+)--(.*)#", $filename, $out)) {
$i = $out[1];
$filename = $out[2];
}
$files = matching_files($filename, realpath('public/content'));
return $files ? $files[$i] : NULL;
}
echo $name = encode_name("gallery/animals/images/lion.jpg").PHP_EOL;
## --> returns lion.jpg
## You can use with the above solution the URL http://www.example.com/tn/lion.jpg
echo decode_name(basename($name)).PHP_EOL;
## -> returns the full path on disk to the image "lion.jpg"
Original post:
Basically, if you add some formatting in your example, your shortened URL is in fact longer:
img=/dir/dir/hi-res-img.jpg&w=700&h=500 // 39 characters
y8xNt9VPySwC44xM3aLUYt3M3HS9rIJ0tXJbcwMDtQxbUwMDAA // 50 characters
Using base64_encode will always result in longer strings. And gzcompress needs to store at least one occurrence of each distinct character; this is not a good solution for small strings.
So doing nothing (or a simple str_rot13) is clearly the first option to consider if you want to shorten the result you had previously.
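These lengths are easy to check. A quick sketch, assuming the zlib extension is loaded; the deflated length is printed rather than stated here because it can vary with the zlib version:

```php
<?php
$raw = 'img=/dir/dir/hi-res-img.jpg&w=700&h=500';

// Base64 turns every 3 input bytes into 4 output characters.
$plain = base64_encode($raw);

// Deflate first, then Base64, as in the question.
$deflated = base64_encode(gzdeflate($raw));

printf(
    "raw: %d, base64: %d, deflate+base64: %d\n",
    strlen($raw),
    strlen($plain),
    strlen($deflated)
);
```

Base64 alone always yields 4 · ceil(n / 3) characters, which is 52 for the 39-character query string, so any saving has to come entirely from the compression step.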
You can also use a simple character replacement method of your choice:
$raw_query_string = 'img=/dir/dir/hi-res-img.jpg&w=700&h=500';
$from = "0123456789abcdefghijklmnopqrstuvwxyz&=/ABCDEFGHIJKLMNOPQRSTUVWXYZ";
// The following line is the result of str_shuffle($from)
$to = "0IQFwAKU1JT8BM5npNEdi/DvZmXuflPVYChyrL4R7xc&SoG3Hq6ks=e9jW2abtOzg";
echo strtr($raw_query_string, $from, $to) . "\n";
// Result: EDpL4MEu4MEu4NE-u5f-EDp.dmprYLU00rNLA00 // 39 characters
Reading from your comment, you really want "to prevent anyone from getting a high-resolution image".
The best way to achieve that is to generate a checksum with a private key.
Encode:
$secret = "ujoo4Dae";
$raw_query_string = 'img=/dir/dir/hi-res-img.jpg&w=700&h=500';
$encoded_query_string = $raw_query_string . "&k=" . hash("crc32", $raw_query_string . $secret);
Result: img=/dir/dir/hi-res-img.jpg&w=700&h=500&k=2ae31804
Decode:
if (preg_match("#(.*)&k=([^=]*)$#", $encoded_query_string, $out)
&& (hash("crc32", $out[1].$secret) == $out[2])) {
$decoded_query_string = $out[1];
}
This does not hide the original path, but this path has no reason to be public. Your "index.php" can output your image from the local directory once the key has been checked.
If you really want to shorten your original URL, you have to consider the acceptable characters in the original URL to be restricted. Many compression methods are based on the fact that you can use a full byte to store more than a character.
There are many ways to shorten URLs. You can look up how other services, like TinyURL, shorten their URLs. Here is a good article on hashes and shortening URLs: URL Shortening: Hashes In Practice
You can use the PHP function mhash() to apply hashes to strings (in current PHP versions, the hash() function supersedes it).
And if you scroll down to "Available Hashes" on the mhash website, you can see what hashes you can use in the function (although I would check what PHP versions have which functions): mhash - Hash Library
I think this would be better done by not obscuring at all. You could quite simply cache returned images and use a handler to provide them. This requires the image sizes to be hard coded into the PHP script. When you get new sizes, you can just delete everything in the cache as it is 'lazy loaded'.
1. Get the image from the request
This could be this: /thumbnail.php?image=img.jpg&album=myalbum. It could even be made to be anything using rewrite and have a URL like: /gallery/images/myalbum/img.jpg.
2. Check to see if a temporary version does not exist
You can do this using is_file().
3. Create it if it does not exist
Use your current resizing logic to do it, but don't output the image. Save it to the temporary location.
4. Read the temporary file contents to the stream
It pretty much just outputs it.
Here is an untested code example...
<?php
// Assuming we have a request /thumbnail.php?image=img.jpg&album=myalbum
// These are placeholder file names and locations. You need to adapt them to your system.
$image = basename($_GET['image']); // The file name (basename() strips any directory parts)
$album = basename($_GET['album']); // The album
$temp_folder = sys_get_temp_dir(); // Temporary directory to store images
// (this should really be a specific cache path)
$image_gallery = "images"; // Root path to the image gallery
$width = 700;
$height = 500;
$real_path = "$image_gallery/$album/$image";
$temp_path = "$temp_folder/$album/$image";
if(!is_file($temp_path))
{
// Read in the image
$contents = file_get_contents($real_path);
// Resize however you are doing it now.
$thumb_contents = resizeImage($contents, $width, $height);
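// Make sure the cache directory for this album exists
if (!is_dir(dirname($temp_path))) {
mkdir(dirname($temp_path), 0775, true);
}
// Write to the temporary file
file_put_contents($temp_path, $thumb_contents);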
}
$type = 'image/jpeg';
header('Content-Type:' . $type);
header('Content-Length: ' . filesize($temp_path));
readfile($temp_path);
?>
Short words about "security"
You simply won't be able to secure your link if there is no "secret password" stored somewhere: as long as the URI carries all information to access your resource, then it will be decodable and your "custom security" (they are opposite words btw) will be broken easily.
You can still put a salt in your PHP code (like $mysalt="....long random string...") since I doubt you want an eternal security (such approach is weak because you cannot renew the $mysalt value, but in your case, a few years security sounds sufficient, since anyway, a user can buy one picture and share it elsewhere, breaking any of your security mechanism).
If you want to have a safe mechanism, use a well-known one (as a framework would carry), along with authentication and user rights management mechanism (so you can know who's looking for your image, and whether they are allowed to).
Security has a cost. If you don't want to afford its computing and storing requirements, then forget about it.
Secure by signing the URL
If you want to keep users from easily bypassing the restriction and getting the full-resolution picture, you can simply sign the URI (but really, for safety, use something that already exists instead of the quick draft example below):
$salt = '....long random string...';
$params = array('img' => '...', 'h' => '...', 'w' => '...');
$p = http_build_query($params);
// Note: bcrypt only accepts a cost between 4 and 31, and the 'salt' option is
// deprecated since PHP 7.0 and ignored since PHP 8.0, where this check stops matching
$check = password_hash($p, PASSWORD_BCRYPT, array('salt' => $salt, 'cost' => 10));
$uri = http_build_query(array_merge($params, array('sig' => $check)));
Decoding:
$sig = $_GET['sig'];
$params = $_GET;
unset($params['sig']);
// Same as previous
$salt = '....long random string...';
$p = http_build_query($params);
$check = password_hash($p, PASSWORD_BCRYPT, array('salt' => $salt, 'cost' => 10));
if ($sig !== $check) throw new DomainException('Invalid signature');
See password_hash
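password_hash is designed for storing passwords rather than for signing messages, so a keyed hash is the more usual primitive for this job. The sketch below swaps in hash_hmac and hash_equals (PHP 5.6 or later) for the draft's password_hash call; signQuery and verifyQuery are hypothetical helper names, and $secret stands for a server-side key that never appears in a URL:

```php
<?php
// Hypothetical helper: build a query string with an appended HMAC signature.
function signQuery(array $params, string $secret): string
{
    ksort($params); // canonical order, so verification rebuilds the same string
    $query = http_build_query($params);
    return $query . '&sig=' . hash_hmac('sha256', $query, $secret);
}

// Hypothetical helper: check the signature of parsed request parameters.
function verifyQuery(array $get, string $secret): bool
{
    if (!isset($get['sig']) || !is_string($get['sig'])) {
        return false;
    }
    $sig = $get['sig'];
    unset($get['sig']);
    ksort($get);
    $expected = hash_hmac('sha256', http_build_query($get), $secret);
    // hash_equals() compares in constant time to avoid timing leaks.
    return hash_equals($expected, $sig);
}
```

On the receiving side, verifyQuery($_GET, $secret) would be called before any image is read. A changed width, height or path then invalidates the signature, although the parameters themselves stay readable.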
Shorten smartly
"Shortening" with a generic compression algorithm is useless here because the headers will be longer than the URI, so it will almost never shorten it.
If you want to shorten it, be smart: don't give the relative path (/dir/dir) if it's always the same (or give it only if it's not the main one). Don't give the extension if it's always the same (or give it when it's not png if almost everything is in png). Don't give the height because the image carries the aspect ratio: you only need the width. Give it in x100px if you do not need a pixel-accurate width.
A lot has been said about how encoding doesn't help security, so I am just concentrating on the shortening and aesthetics.
Rather than thinking of it as a string, you could consider it as three individual components. Then if you limit your code space for each component, you can pack things together a lot smaller.
E.g.,
path - Only consisting of the 26 characters (a-z) and / - . (Variable length)
width - Integer (0 - 65k) (Fixed length, 16 bits)
height - Integer (0 - 65k) (Fixed length, 16 bits)
I'm limiting the path alphabet to a maximum of 31 distinct characters (index 0 is reserved as a null terminator), so each character fits in a five-bit group.
Pack your fixed length dimensions first, and append each path character as five bits. It might also be necessary to add a special null character to fill up the end byte. Obviously you need to use the same dictionary string for encoding and decoding.
See the code below.
This shows that by limiting what you encode and how much you can encode, you can get a shorter string. You could make it even shorter by using only 12-bit dimension integers (max 4095), or even by removing parts of the path if they are known, such as the base path or file extension (see last example).
<?php
function encodeImageAndDimensions($path, $width, $height) {
    $dictionary = str_split("abcdefghijklmnopqrstuvwxyz/-."); // Maximum 31 characters, please
    if ($width >= pow(2, 16)) {
        throw new Exception("Width value is too high to encode with 16 bits");
    }
    if ($height >= pow(2, 16)) {
        throw new Exception("Height value is too high to encode with 16 bits");
    }
    // Pack width, then height first
    $packed = pack("nn", $width, $height);
    $path_bits = "";
    foreach (str_split($path) as $ch) {
        $index = array_search($ch, $dictionary, true);
        if ($index === false) {
            throw new Exception("Cannot encode character outside of the allowed dictionary");
        }
        $index++; // Add 1 due to index 0 meaning NULL rather than a.
        // Work with a bit string here rather than using complicated binary bit shift operators.
        $path_bits .= str_pad(base_convert($index, 10, 2), 5, "0", STR_PAD_LEFT);
    }
    // Remaining space left?
    $modulo = (8 - (strlen($path_bits) % 8)) % 8;
    if ($modulo >= 5) {
        // There is space for a null character to fill up to the next byte
        $path_bits .= "00000";
        $modulo -= 5;
    }
    // Pad with zeros
    $path_bits .= str_repeat("0", $modulo);
    // Split in to nibbles and pack as a hex string
    $path_bits = str_split($path_bits, 4);
    $hex_string = implode("", array_map(function($bit_string) {
        return base_convert($bit_string, 2, 16);
    }, $path_bits));
    $packed .= pack('H*', $hex_string);
    return base64_url_encode($packed);
}
function decodeImageAndDimensions($str) {
    $dictionary = str_split("abcdefghijklmnopqrstuvwxyz/-.");
    $data = base64_url_decode($str);
    $decoded = unpack("nwidth/nheight/H*path", $data);
    $path_bit_stream = implode("", array_map(function($nibble) {
        return str_pad(base_convert($nibble, 16, 2), 4, "0", STR_PAD_LEFT);
    }, str_split($decoded['path'])));
    $five_pieces = str_split($path_bit_stream, 5);
    $real_path_indexes = array_map(function($code) {
        return base_convert($code, 2, 10) - 1;
    }, $five_pieces);
    $real_path = "";
    foreach ($real_path_indexes as $index) {
        if ($index == -1) {
            break;
        }
        $real_path .= $dictionary[$index];
    }
    $decoded['path'] = $real_path;
    return $decoded;
}
// These do a bit of magic to get rid of the double equals sign and obfuscate a bit. It could save an extra byte.
function base64_url_encode($input) {
    $trans = array('+' => '-', '/' => ':', '*' => '$', '=' => 'B', 'B' => '!');
    return strtr(str_replace('==', '*', base64_encode($input)), $trans);
}
function base64_url_decode($input) {
    $trans = array('-' => '+', ':' => '/', '$' => '*', 'B' => '=', '!' => 'B');
    return base64_decode(str_replace('*', '==', strtr($input, $trans)));
}
// Example usage
$encoded = encodeImageAndDimensions("/dir/dir/hi-res-img.jpg", 700, 500);
var_dump($encoded); // string(27) "Arw!9NkTLZEy2hPJFnxLT9VA4A$"
$decoded = decodeImageAndDimensions($encoded);
var_dump($decoded); // array(3) { ["width"] => int(700) ["height"] => int(500) ["path"] => string(23) "/dir/dir/hi-res-img.jpg" }
$encoded = encodeImageAndDimensions("/another/example/image.png", 4500, 2500);
var_dump($encoded); // string(28) "EZQJxNhc-iCy2XAWwYXaWhOXsHHA"
$decoded = decodeImageAndDimensions($encoded);
var_dump($decoded); // array(3) { ["width"] => int(4500) ["height"] => int(2500) ["path"] => string(26) "/another/example/image.png" }
$encoded = encodeImageAndDimensions("/short/eg.png", 300, 200);
var_dump($encoded); // string(19) "ASwAyNzQ-VNlP2DjgA$"
$decoded = decodeImageAndDimensions($encoded);
var_dump($decoded); // array(3) { ["width"] => int(300) ["height"] => int(200) ["path"] => string(13) "/short/eg.png" }
$encoded = encodeImageAndDimensions("/very/very/very/very/very-hyper/long/example.png", 300, 200);
var_dump($encoded); // string(47) "ASwAyN2LLO7FlndiyzuxZZ3Yss8Rm!ZbY9x9lwFsGF7!xw$"
$decoded = decodeImageAndDimensions($encoded);
var_dump($decoded); // array(3) { ["width"] => int(300) ["height"] => int(200) ["path"] => string(48) "/very/very/very/very/very-hyper/long/example.png" }
$encoded = encodeImageAndDimensions("only-file-name", 300, 200);
var_dump($encoded); //string(19) "ASwAyHuZnhksLxwWlA$"
$decoded = decodeImageAndDimensions($encoded);
var_dump($decoded); // array(3) { ["width"] => int(300) ["height"] => int(200) ["path"] => string(14) "only-file-name" }
In your question you state that it should be pure PHP and not use a database, and there should be a possibility to decode the strings. So bending the rules a bit:
The way I am interpreting this question is that we don't care about security that much, but we do want the shortest hashes that lead back to images.
We can also take "decode possibility" with a pinch of salt by using a one-way hashing algorithm.
We can store the hashes inside a JSON object and keep that in a file, so all we have to do at the end of the day is string matching.
```
<?php
class FooBarHashing {
    private $hashes;
    /**
     * In production this should be outside the web root
     * to stop pesky users downloading it and getting hold of all the keys.
     */
    private $file_name = './my-image-hashes.json';

    public function __construct() {
        $this->hashes = $this->get_hashes();
    }

    public function get_hashes() {
        // A missing or empty file simply means nothing has been stored yet.
        if (! file_exists($this->file_name) || filesize($this->file_name) === 0) {
            return [];
        }
        $contents = file_get_contents($this->file_name);
        // Decode as an associative array: hash => query string.
        $hashes = json_decode($contents, true);
        return is_array($hashes) ? $hashes : [];
    }

    private function update() {
        // LOCK_EX keeps two simultaneous writers from interleaving their output.
        $res = file_put_contents($this->file_name, json_encode($this->hashes), LOCK_EX);
        if (false === $res) {
            //throw new Exception('Could not write to file');
        }
        return false !== $res;
    }

    public function add_hash($image_file_name) {
        $new_hash = md5($image_file_name);
        if (! isset($this->hashes[$new_hash])) {
            $this->hashes[$new_hash] = $image_file_name;
            return $this->update();
        }
        //throw new Exception('File already exists');
    }

    public function resolve_hash($hash_string = '') {
        if (isset($this->hashes[$hash_string])) {
            return $this->hashes[$hash_string];
        }
        //throw new Exception('File not found');
    }
}
```
Usage example:
<?php
// Include our class
require_once('FooBarHashing.php');
$hashing = new FooBarHashing;
// You will need to add the query string you want to resolve first.
$hashing->add_hash('img=/dir/dir/hi-res-img.jpg&w=700&h=500');
// Then when the user requests the hash the query string is returned.
echo $hashing->resolve_hash('65992be720ea3b4d93cf998460737ac6');
So the end result is a string that is only 32 chars long, which is way shorter than the 52 we had before.
From the discussion in the comments section it looks like what you really want is to protect your original high-resolution images.
Having that in mind, I'd suggest to actually do that first using your web server configuration (e.g., Apache mod_authz_core or Nginx ngx_http_access_module) to deny access from the web to the directory where your original images are stored.
Note that the server will only deny access to your images from the web; you will still be able to access them directly from your PHP scripts. Since you are already displaying images using some "resizer" script, I'd suggest putting a hard limit there and refusing to resize images to anything bigger than that (e.g., something like $width = min(1000, $_GET['w'])).
I know this does not answer your original question, but I think this would be the right solution to protect your images. And if you still want to obfuscate the original name and resizing parameters, you can do that however you see fit without worrying that someone might figure out what's behind it.
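The hard limit mentioned above could look roughly like the following sketch. The helper name `clampWidth` and the fallback width of 200 are assumptions for illustration; the point is only that a missing, non-numeric, or oversized `w` never reaches the resizer unchanged.

```php
<?php
// Hypothetical helper: turn an untrusted "w" parameter into a safe width.
function clampWidth($raw, int $max = 1000, int $fallback = 200): int {
    $width = filter_var($raw, FILTER_VALIDATE_INT);
    if ($width === false || $width < 1) {
        return $fallback; // missing, non-numeric, or non-positive input
    }
    return min($width, $max);
}

echo clampWidth('700'), "\n";   // 700
echo clampWidth('99999'), "\n"; // 1000
echo clampWidth('abc'), "\n";   // 200
```

In the resizer script this would be called as `clampWidth($_GET['w'] ?? null)`.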
I'm afraid you won't be able to shorten the query string better than any known
compression algorithm. As mentioned in other answers, a compressed
version will be shorter than the original by only a few (around 4-6) characters.
Moreover, the original string can be decoded relatively easily (as opposed to reversing SHA-1 or MD5, for instance).
I suggest shortening URLs by means of Web server configuration. You might
shorten it further by replacing image path with an ID (store ID-filename
pairs in a database).
For example, the following Nginx configuration accepts
URLs like /t/123456/700/500/4fc286f1a6a9ac4862bdd39a94a80858, where
the first number (123456) is supposed to be an image ID from database;
700 and 500 are image dimensions;
the last part is an MD5 hash protecting from requests with different dimensions.
# Adjust maximum image size
# image_filter_buffer 5M;
server {
    listen 127.0.0.13:80;
    server_name img-thumb.local;
    access_log /var/www/img-thumb/logs/access.log;
    error_log /var/www/img-thumb/logs/error.log info;
    set $root "/var/www/img-thumb/public";
    # /t/image_id/width/height/md5
    location ~* "(*UTF8)^/t/(\d+)/(\d+)/(\d+)/([a-zA-Z0-9]{32})$" {
        include fastcgi_params;
        fastcgi_pass unix:/tmp/php-fpm-img-thumb.sock;
        fastcgi_param QUERY_STRING image_id=$1&w=$2&h=$3&hash=$4;
        fastcgi_param SCRIPT_FILENAME /var/www/img-thumb/public/t/resize.php;
        image_filter resize $2 $3;
        error_page 415 = /empty;
        break;
    }
    location = /empty {
        empty_gif;
    }
    location / { return 404; }
}
The server accepts only URLs matching the specified pattern, forwards the request to the /public/t/resize.php script with a modified query string, then resizes the image generated by PHP with the image_filter module. In case of an error, it returns an empty GIF image.
The image_filter is optional and is included only as an example. Resizing can be performed entirely on the PHP side. By the way, with Nginx it is also possible to get rid of the PHP part.
The PHP script is supposed to validate the hash as follows:
// Store this in some configuration file.
$salt = '^sYsdfc_sd&9wa.';
$w = $_GET['w'];
$h = $_GET['h'];
// Cast the ID once; it is used both in the hash and in the lookup.
$image_id = (int) $_GET['image_id'];
$true_hash = md5($w . $h . $salt . $image_id);
// hash_equals() compares in constant time.
if (!hash_equals($true_hash, (string) $_GET['hash'])) {
    die('invalid hash');
}
$filename = fetch_image_from_database($image_id);
$img = imagecreatefrompng($filename);
header('Content-Type: image/png');
imagepng($img);
imagedestroy($img);
I don't think the resulting URL can be shortened much more than in your own example. But I suggest a few steps to obfuscate your images better.
First I would remove everything you can from the base URL you are zipping and Base64 encoding, so instead of
img=/dir/dir/hi-res-img.jpg&w=700&h=500
I would use
s=hi-res-img.jpg,700,500,062c02153d653119
where those last 16 chars are a hash used to validate that the URL being opened is the same one you offered in your code, and that the user is not trying to trick the high-resolution image out of the system.
Your index.php that serves the images would start like this:
function myHash($sRaw) { // returns a 16-character dual hash
    return hash('adler32', $sRaw) . strrev(hash('crc32', $sRaw));
} // These two hash algorithms are suggestions; there are more for you to choose from.
// s=hi-res-img.jpg,700,500,062c02153d653119
$aParams = explode(',', $_GET['s']);
if (count($aParams) != 4) {
    die('Invalid call.');
}
list($sFileName, $iWidth, $iHeight, $sHash) = $aParams;
$sRaw = session_id() . $sFileName . $iWidth . $iHeight;
if ($sHash != myHash($sRaw)) {
    die('Invalid hash.');
}
After this point you can send the image, since the user opening it had access to a valid link.
Note that using session_id as part of the raw string that makes up the hash is optional, but it would make it impossible for users to share a valid URL, as the URL would be session-bound. If you want the URLs to be shareable, just remove session_id from that call.
I would wrap the resulting URL the same way you already do, zip + Base64. The result would be even bigger than your version, but more difficult to see through the obfuscation, and therefore protecting your images from unauthorised downloads.
If you want only to make it shorter, I do not see a way of doing it without renaming the files (or their folders), or without the use of a database.
The file database solution proposed will surely create problems of concurrency - unless you always have no or very few people using the system simultaneously.
You say that you want the size there, so that if you decide some day that the preview images are too small, you want to increase the size - the solution here is to hard code the image size into the PHP script and eliminate it from the URL.
If you want to change the size in the future, change the hardcoded values in the PHP script (or in a config.php file that you include into the script).
You've also said that you are already using files to store image data as a JSON object, like: name, title, description. Exploiting this, you don't need a database and can use the JSON file name as the key for looking up the image data.
When the user visits a URL like this:
www.mysite.com/share/index.php?ax9v
You load ax9v.json from the location you are already storing the JSON files, and within that JSON file the image's real path is stored. Then load the image, resize it according to the hardcoded size in your script and send it to the user.
Drawing from the conclusions in "URL Shortening: Hashes In Practice", to get the smallest search string part of the URL you would need to iterate valid character combinations as new files are uploaded (e.g., the first one is "AAA" then "AAB", "AAC", etc.) instead of using a hashing algorithm.
Your solution would then have only three characters in the string for the first 238,328 photos you upload.
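One way to sketch that idea in PHP is to keep an upload counter and render it in base 62. The function names `shortIdEncode` and `shortIdDecode` and the ordering of the alphabet are assumptions for illustration, not taken from the cited article, and the sketch assumes you persist the counter yourself (for example next to your JSON files).

```php
<?php
// Hypothetical helpers: render an upload counter as a short base-62 string.
const SHORT_ID_ALPHABET = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';

function shortIdEncode(int $n): string {
    if ($n < 0) {
        throw new InvalidArgumentException('Counter must be non-negative');
    }
    $base = strlen(SHORT_ID_ALPHABET);
    $out = '';
    do {
        // Prepend the digit for the current least significant position.
        $out = SHORT_ID_ALPHABET[$n % $base] . $out;
        $n = intdiv($n, $base);
    } while ($n > 0);
    return $out;
}

function shortIdDecode(string $id): int {
    if ($id === '') {
        throw new InvalidArgumentException('ID must not be empty');
    }
    $base = strlen(SHORT_ID_ALPHABET);
    $n = 0;
    foreach (str_split($id) as $ch) {
        $pos = strpos(SHORT_ID_ALPHABET, $ch);
        if ($pos === false) {
            throw new InvalidArgumentException('Invalid character in ID');
        }
        $n = $n * $base + $pos;
    }
    return $n;
}

// 62^3 = 238,328 counters (0 .. 238327) fit in at most three characters.
echo shortIdEncode(238327), "\n"; // zzz
echo shortIdEncode(238328), "\n"; // 1000
```

Unlike a hash, the counter never collides, but the IDs are guessable, so this only shortens the URL and does nothing to protect it.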
I had started to prototype a PHP solution on PhpFiddle, but the code disappeared (don't use PhpFiddle).

Convert PHP SHA1 to Ruby

I have this algorithm in PHP:
$encoded_key = 'WHllcnRGYTY3eWpUNjQ';
$decoded_key = base64_decode($encoded_key);
// XyertFa67yjT64
$params_string = implode('', $params);
//U215250.00121715620http://partner.domain.ru/order/U215/successhttp://partner.domain.ru/order/U215/fail
$raw_signature = hash_hmac('sha1', $params_string, $decoded_key, true);
// Byte-encoded, hex: c6881d8665afbb46a93a16b34bd152878a19ab3a
$encoded_signature = base64_encode($raw_signature);
// xogdhmWvu0apOhazS9FSh4oZqzo=
I'm trying to port this code to Ruby and get the same result, but I can't get there with Base64 and OpenSSL. Does anyone know what's wrong?
One problem is that you are using HMAC.hexdigest instead of HMAC.digest. Your PHP code is generating a raw HMAC and then encoding it in base 64. Therefore, you need to do the same thing in Ruby.
The other problem is the base 64 decoding step of the key. The key you entered is not padded correctly and will therefore be truncated by Ruby's base 64 library. For example:
encoded_key = "WHllcnRGYTY3eWpUNjQ"
Base64.decode64(encoded_key)
#=> "XyertFa67yjT"
# incomplete!
Base64.decode64("#{encoded_key}=\n")
#=> "XyertFa67yjT64"
# this is what you actually want
The padding and the final newline are there to ensure that the base 64 encoded data is complete, since it marks the end. However, it is possible to manually add the padding and just assume that the data is complete:
require 'base64'
require 'openssl'
def base64_pad(unpadded_str)
  # Base64 works in groups of four characters, so pad up to a multiple of 4.
  padding = case unpadded_str.size % 4
            when 2 then "=="
            when 3 then "="
            end
  "#{unpadded_str}#{padding}\n"
end
encoded_key = "WHllcnRGYTY3eWpUNjQ"
key = Base64.decode64(base64_pad(encoded_key))
#=> "XyertFa67yjT64"
string = "U215250.00121715620http://partner.domain.ru/order/U215/successhttp://partner.domain.ru/order/U215/fail"
Base64.encode64(OpenSSL::HMAC.digest('SHA1', key, string))
#=> "xogdhmWvu0apOhazS9FSh4oZqzo=\n"

PHP (ZLIB) uncompression of a C (ZLIB) compressed array returns gibberish

I have a set of ZLIB compressed / base64 encoded strings (done in a C program) that are stored in a database. I have written a small PHP page that should retrieve these values and plot them (the string originally was a list of floats).
Chunk of C program that compresses/encodes:
error=compress2(comp_buffer, &comp_length,(const Bytef*)data.mz ,(uLongf)length,Z_DEFAULT_COMPRESSION); /* compression */
if (error != Z_OK) {fprintf(stderr,"zlib error..exiting"); exit(EXIT_FAILURE);}
mz_binary=g_base64_encode (comp_buffer,comp_length); /* encoding */
(Example) of original input format:
292.1149 8379.5928
366.1519 101313.3906
367.3778 20361.8105
369.1290 17033.3223
375.4355 1159.1841
467.3191 8445.3926
Each column was compressed/encoded as a single string. To reconstruct the original data I am using the following code:
//$row[4] is retrieved from the DB and contains the compressed/encoded string
$mz = base64_decode($row[4]);
$unc_mz = gzuncompress($mz);
echo $unc_mz;
Yet this gives me the following output:
f6jEÍ„]EšiSE#IEfŽ
Could anyone give me a tip/hint about what I might be missing?
------ Added Information -----
I feel that the problem comes from the fact that PHP currently views $unc_mz as a single string, while in reality I would have to reconstruct an array containing X lines (this output was from a 9-line file), but I have no idea how to do that assignment.
The C program that did that went roughly like this:
uncompress( pUncompr , &uncomprLen , (const Bytef*)pDecoded , decodedSize );
pToBeCorrected = (char *)pUncompr;
for (n = 0; n < (2 * peaksCount); n++) {
pPeaks[n] = (RAMPREAL) ((float *) pToBeCorrected)[n];
}
where peaksCount would be the amount of 'lines' in the input file.
EDIT (15-2-2012): The problem with my code was that I was not reconstructing the array, the fixed code is as follows (might be handy if someone needs a similar snippet):
while ($row = mysql_fetch_array($result, MYSQL_NUM)) {
    $mz = base64_decode($row[4]);
    $unc_mz = gzuncompress($mz);
    $max = strlen($unc_mz);
    $mz_array = array();
    $counter = 0;
    // Walk the binary string four bytes (one float) at a time.
    for ($i = 0; $i < $max; $i = $i + 4) {
        $temp = substr($unc_mz, $i, 4);
        $temp = unpack("f", $temp);
        $mz_array[$counter] = $temp[1];
        $counter++;
    }
}
The uncompressed string has to be chopped into chunks corresponding to the length of a float; unpack() then reconstructs the float data from the binary 'chunk'. That's the simplest description that I can give for the above snippet.
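As a more compact variant of the loop above, `unpack()` with a repeat count can decode every float in one call. The sketch below builds its own sample data with `gzcompress()` (which, like C's `compress2()`, emits the zlib format) so that it runs stand-alone. The helper name `decodeFloatColumn` is made up for illustration, and the sketch assumes the C program wrote 4-byte floats in the same byte order as the machine running PHP.

```php
<?php
// Hypothetical helper: base64 + zlib + packed 4-byte floats -> PHP array.
function decodeFloatColumn(string $encoded): array {
    $compressed = base64_decode($encoded, true);
    if ($compressed === false) {
        throw new InvalidArgumentException('Not valid base64');
    }
    $binary = gzuncompress($compressed);
    if ($binary === false) {
        throw new RuntimeException('Not valid zlib data');
    }
    // 'f*' reads machine-order floats until the data runs out; unpack()
    // returns a 1-indexed array, so reindex it from 0.
    return array_values(unpack('f*', $binary));
}

// Stand-in for a database value; the numbers are exact in single precision.
$sample = base64_encode(gzcompress(pack('f*', 1.5, 2.25, 1024.0)));
var_dump(decodeFloatColumn($sample));
```

Values such as 292.1149 will come back with single-precision rounding, so round them for display rather than comparing them for equality.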
compress2() produces the zlib format (RFC 1950). Despite its name, PHP's gzuncompress() expects that same zlib format; gzdecode() is the function for the gzip format (RFC 1952) and gzinflate() the one for raw deflate data. So the decompression call matches the compression call, and the output looks like gibberish because it is binary float data rather than text.
If you did want gzip-formatted data instead, you would need to use deflateInit2() in zlib to request that deflate() produce gzip-formatted output, and decode it with gzdecode() on the PHP side.
