PHP pack() format for signed 32-bit int - big endian

I am creating and then writing data to a file (a new 'ESRI Shape file') using PHP, fopen, fseek, pack etc. The file spec is here http://www.esri.com/library/whitepapers/pdfs/shapefile.pdf.
The file spec states that the data written needs to be in a combination of the following:
Integer: Signed 32-bit integer (4 bytes) - Big Endian
Integer: Signed 32-bit integer (4 bytes) - Little Endian
Double: Signed 64-bit IEEE double-precision floating point number (8 bytes) - Little Endian
I can't seem to find a pack() format that covers these. I don't want to use a machine-dependent format, as this code may run on a variety of platforms.
Can anyone advise on what format (or combination of formats) I need to use for these 3 formats?
Many thanks,
Steve

You could check the endianness of the machine running the code and reverse the bytes manually as necessary. The code below should work, but you will only be able to convert one int or float at a time.
define('BIG_ENDIAN', pack('L', 1) === pack('N', 1));
function pack_int32s_be($n) {
    if (BIG_ENDIAN) {
        return pack('l', $n); // that's a lower case L
    }
    return strrev(pack('l', $n));
}
function pack_int32s_le($n) {
    if (BIG_ENDIAN) {
        return strrev(pack('l', $n));
    }
    return pack('l', $n); // that's a lower case L
}
function pack_double_be($n) {
    if (BIG_ENDIAN) {
        return pack('d', $n);
    }
    return strrev(pack('d', $n));
}
function pack_double_le($n) {
    if (BIG_ENDIAN) {
        return strrev(pack('d', $n));
    }
    return pack('d', $n);
}

If PHP doesn't support it, you could implement your own.
function pack_int32be($i) {
    if ($i < -2147483648 || $i > 2147483647) {
        die("Out of bounds");
    }
    return pack('C4',
        ($i >> 24) & 0xFF,
        ($i >> 16) & 0xFF,
        ($i >> 8) & 0xFF,
        ($i >> 0) & 0xFF
    );
}
function pack_int32le($i) {
    if ($i < -2147483648 || $i > 2147483647) {
        die("Out of bounds");
    }
    return pack('C4',
        ($i >> 0) & 0xFF,
        ($i >> 8) & 0xFF,
        ($i >> 16) & 0xFF,
        ($i >> 24) & 0xFF
    );
}
Little-endian double precision is much harder. Supporting a system where d is wider than 64 bits (a quad-precision system, say) would involve packing the number using d, converting it to a binary string, splitting the binary into its sign, exponent and mantissa fields, truncating any field that is too large, concatenating the fields, then converting from binary back to bytes.
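For what it's worth, the three formats in the question may not need any manual byte shuffling. The sketch below assumes a PHP version recent enough to have the e code for little-endian doubles (PHP 7.2 or later is a safe assumption). It also relies on N and V, which are documented as unsigned but write the low 32 bits of the integer they are given, so a negative value should come out in two's complement. The helper names are made up for illustration.

```php
<?php
// Hypothetical helper names; only pack() and bin2hex() are real PHP functions.

// Signed 32-bit integer, big endian.
function shp_int32_be(int $n): string {
    return pack('N', $n);
}

// Signed 32-bit integer, little endian.
function shp_int32_le(int $n): string {
    return pack('V', $n);
}

// 64-bit IEEE double, little endian (needs the 'e' code).
function shp_double_le(float $x): string {
    return pack('e', $x);
}

// Example values only, printed as hex so the byte order is visible.
echo bin2hex(shp_int32_be(9994)), "\n";
echo bin2hex(shp_int32_le(1000)), "\n";
echo bin2hex(shp_int32_be(-2)), "\n";
echo bin2hex(shp_double_le(1.0)), "\n";
```

Because none of these codes depend on the machine's byte order, no BIG_ENDIAN check is needed with this approach.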

Related

Convert double to Pascal 6-byte (48 bits) real format

I need to do some work on data contained in legacy files. For this purpose, I need to read and write Turbo Pascal's 6-byte (48 bit) floating point numbers, from PHP. The Turbo Pascal data type is commonly known as real48 (specs).
I have the following php code to read the format:
/**
 * Convert Turbo Pascal 48-bit (6 byte) real to a PHP float
 * @param binary 48-bit real (in binary) to convert
 * @return float number
 */
function real48ToDouble($real48) {
    $byteArray = array_values( unpack('C*', $real48) );
    if ($byteArray[0] == 0) {
        return 0; // Zero exponent = 0
    }
    $exponent = $byteArray[0] - 129;
    $mantissa = 0;
    for ($b = 1; $b <= 4; $b++) {
        $mantissa += $byteArray[$b];
        $mantissa /= 256;
    }
    $mantissa += ($byteArray[5] & 127);
    $mantissa /= 128;
    $mantissa += 1;
    if ($byteArray[5] & 128) { // Sign bit check
        $mantissa = -$mantissa;
    }
    return $mantissa * pow(2, $exponent);
}
(adapted from)
Now I need to do the reverse: write the data type.
Note:
I'm aware of the answer to the question Convert C# double to Delphi Real48, but it seems awfully hacky and I would think a much cleaner solution is possible. AND my machine does not natively support 64-bits.
On a second look, the method posted in the answer to Convert C# double to Delphi Real48 cleaned up pretty nicely.
For future reference:
/**
 * Convert a PHP number [Int|Float] to a Turbo Pascal 48-bit (6 byte) real byte representation
 * @param float number to convert
 * @return binary 48-bit real
 */
function doubleToReal48($double) {
    $byteArray = array_values( unpack('C*', pack('d', $double)) ); // 64 bit double as array of integers
    $real48 = array(0, 0, 0, 0, 0, 0);
    // Copy the negative flag
    $real48[5] |= ($byteArray[7] & 128);
    // Get the exponent
    $n = ($byteArray[7] & 127) << 4;
    $n |= ($byteArray[6] & 240) >> 4;
    if ($n == 0) { // Zero exponent = 0
        return pack('c6', $real48[0], $real48[1], $real48[2], $real48[3], $real48[4], $real48[5]);
    }
    $real48[0] = $n - 1023 + 129;
    // Copy the Mantissa
    $real48[5] |= (($byteArray[6] & 15) << 3); // Get the last 4 bits
    $real48[5] |= (($byteArray[5] & 224) >> 5); // Get the first 3 bits
    for ($b = 4; $b >= 1; $b--) {
        $real48[$b] = (($byteArray[$b+1] & 31) << 3); // Get the last 5 bits
        $real48[$b] |= (($byteArray[$b] & 224) >> 5); // Get the first 3 bits
    }
    return pack('c6', $real48[0], $real48[1], $real48[2], $real48[3], $real48[4], $real48[5]);
}
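The function above re-slices the bytes of pack('d'), so it depends on the machine storing doubles in little-endian order. As a possible alternative, here is a rough sketch that builds the same six bytes arithmetically. It is not the method from the linked answer: the function name is hypothetical, it assumes 64-bit PHP integers, it truncates the mantissa rather than rounding, and it does not handle INF or NAN.

```php
<?php
// Sketch only: builds the 6 bytes from the value itself instead of
// re-slicing pack('d') output. Assumes 64-bit PHP integers.
function doubleToReal48Arith(float $x): string {
    if ($x == 0.0) {
        return str_repeat("\0", 6);           // zero exponent byte means 0
    }
    $sign = $x < 0 ? 0x80 : 0x00;
    $x = abs($x);

    // Find $exp so that $x = $m * 2^$exp with 1 <= $m < 2.
    $exp = (int) floor(log($x, 2));
    $m = $x / pow(2, $exp);
    if ($m >= 2.0) { $m /= 2; $exp++; }       // guard against rounding in log()
    if ($m < 1.0)  { $m *= 2; $exp--; }

    $biased = $exp + 129;                     // same bias as real48ToDouble()
    if ($biased < 1 || $biased > 255) {
        throw new RangeException('Value does not fit in a real48');
    }

    // 39-bit fraction, truncated (not rounded).
    $frac = (int) floor(($m - 1.0) * (1 << 39));

    return pack(
        'C6',
        $biased,
        $frac & 0xFF,
        ($frac >> 8) & 0xFF,
        ($frac >> 16) & 0xFF,
        ($frac >> 24) & 0xFF,
        $sign | (($frac >> 32) & 0x7F)
    );
}

echo bin2hex(doubleToReal48Arith(1.5)), "\n";
```

Feeding the result back through real48ToDouble() is a quick way to check a few values by hand.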

Reading the decimal value of little endian in PHP

I'm reading 6 bytes in little endian from a binary file using
$data = fread($fp, 6);
unpack("V", $data);
The result is 1152664389, which in hex is 0x44B44345.
This value was stored in little endian and actually encodes a number with a fractional part. In Delphi, I was able to get that number using this code:
var
  myint: integer;
  s: single;
begin
  myint := 1152664389; // same as myint := $44B44345;
  s := PSingle(@myint)^;
and the output of s is 1442.102... This is exactly the number I'm looking for.
I searched for a way to do it in PHP but I just got lost.
Please help,
Thanks
Okay, I was a bit confused. I wanted to actually retrieve a little endian number from a binary file then convert it to float.
So my steps were:
1) Read the bytes and unpack them using $number = unpack("V", $data); (unpack returns an array, so the integer itself is $number[1]).
2) Convert that integer to a hexadecimal string using dechex.
3) Then use the following function to convert hex to float:
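function hexTo32Float($strHex) {
    // Assumes 64-bit PHP integers and a normalized value
    // (zero, denormals, INF and NAN are not handled).
    $v = hexdec($strHex);
    $sign = (($v >> 31) & 1) ? -1 : 1;            // bit 31 is the sign
    $exp = (($v >> 23) & 0xFF) - 127;             // 8-bit biased exponent
    $x = ($v & ((1 << 23) - 1)) + (1 << 23);      // 23-bit mantissa plus implicit leading 1
    return $sign * $x * pow(2, $exp - 23);
}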
Thanks again.
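A shorter route may be to let unpack() do the float decoding. The sketch below assumes a PHP version that has the g code (little-endian float) and uses the four bytes from the question as sample input. The second option mirrors the Delphi pointer cast by writing the integer back out and re-reading the same bytes as a float.

```php
<?php
// The four bytes as they sit in the file (0x44B44345 stored little endian).
$data = "\x45\x43\xB4\x44";

// Option 1: decode a little-endian 32-bit float directly.
$a = unpack('g', $data)[1];

// Option 2: mimic the Delphi cast. 'L' and 'f' both use machine byte
// order, so the bytes written by pack() are read back unchanged.
$int = unpack('V', $data)[1];
$b = unpack('f', pack('L', $int))[1];

printf("%.3f %.3f\n", $a, $b);
```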

Trying to read a two's complement 16bit into a signed decimal

I'm trying (in PHP) to read the two's complement value of two bytes (16 bits) and return a signed decimal.
I am not sure how the two's complement math should work, but from php.net, I managed to get it to show almost what I expect. The issue I think I am having is that I do not get any negative values.
The code I have:
function _bin16dec($bin) {
    // Function to convert 16bit binary numbers to integers using two's complement
    $num = bindec($bin);
    if($num > 0xFFFF) { return false; }
    if($num >= 0x8000) {
        return -(($num ^ 0xFFFF)+1);
    } else {
        return $num;
    }
}
This code is what someone came up with online, but it's in Python, which I do not understand.
def twoscomp( x ) :
    "This returns a 16-bit signed number (two's complement)"
    if (0x8000 & x):
        x = - (0x010000 - x)
    return x
The application reads two bytes from a gyroscope for each axis in two's complement form.
Thanks in advance!
Sam
Assuming this Python function does what you expect...
def twoscomp( x ) :
    """This returns a 16-bit signed number (two's complement)"""
    if (0x8000 & x):
        x = - (0x010000 - x)
    return x
...this PHP function should do the exact same thing.
function _bin16dec($bin) {
    // converts 16bit binary number string to integer using two's complement
    $num = bindec($bin) & 0xFFFF; // only use bottom 16 bits
    if (0x8000 & $num) {
        $num = - (0x010000 - $num);
    }
    return $num;
}
This code works for me on PHP 5.3.15. Let me know if you would like further explanation.
--ap
Here's a version, tested on an EC2 Amazon Linux x64 distro with PHP 5.3.27, that converts 32-bit binary strings:
function _bin32dec($bin) {
    $num = bindec($bin) & 0xFFFFFFFF; // only use bottom 32 bits
    if (0x80000000 & $num) {
        $num = - (0xFFFFFFFF - $num + 1);
    }
    return $num;
}
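If the two gyroscope bytes arrive as raw binary (for example from fread) rather than as a string of '0' and '1' characters, the same idea can be applied straight after unpack(). This is only a sketch: the helper name is hypothetical, and whether the sensor sends the high byte first is an assumption to check against its datasheet.

```php
<?php
// Hypothetical helper: decode two raw bytes as a signed 16-bit integer.
function int16FromBytes(string $bytes, bool $bigEndian = true): int {
    // 'n' = unsigned 16-bit big endian, 'v' = unsigned 16-bit little endian.
    $u = unpack($bigEndian ? 'n' : 'v', $bytes)[1];
    // Values with the top bit set represent negatives: subtract 2^16.
    return ($u & 0x8000) ? $u - 0x10000 : $u;
}

echo int16FromBytes("\xFF\xFE"), "\n";        // high byte first
echo int16FromBytes("\xFE\xFF", false), "\n"; // low byte first
```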

PHP pack/unpack float in big endian byte order

How can I pack/unpack floats in big endian byte order with php?
I got this far with an unpack function, but I'm not sure if this would even work.
function unpackFloat ($float) {
    $n = unpack ('Nn', $float); // read 4 bytes as a big-endian unsigned integer
    $n = $n['n'];
    $sign = ($n >> 31);
    $exponent = ($n >> 23) & 0xFF;
    $fraction = $n & 0x7FFFFF;
}
After thinking about it for a while I found a pretty easy solution: reverse the byte order that pack('f') uses. This assumes a little-endian machine, where pack('f') produces little-endian bytes.
unpack
unpack('fdat', strrev(substr($data, 0, 4)));
pack
strrev(pack('f', $data));
PHP 7.2 introduced the option to pack floating point numbers with big endian byte order directly:
// float
$bytes = pack('G', 3.1415);
// double precision float
$bytes = pack('E', 3.1415);
https://www.php.net/manual/en/function.pack.php
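As a quick illustration of those two codes, the following sketch packs a value, prints the bytes as hex so the big-endian order is visible, and unpacks it again. It assumes a PHP version that supports G and E, and the variable names are arbitrary.

```php
<?php
// 32-bit float, big endian.
$bytes = pack('G', 1.5);
echo bin2hex($bytes), "\n";

// unpack() returns an array; with no name given the value is at index 1.
$value = unpack('G', $bytes)[1];

// 64-bit double, big endian.
$wide = pack('E', 1.5);
echo bin2hex($wide), "\n";
$wideValue = unpack('E', $wide)[1];
```

A value such as 1.5 is exactly representable in both widths, so it round-trips unchanged. A value like 3.1415 would come back slightly different from the 32-bit version because of the reduced precision.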

How does this code extract the signature?

I have to debug an old PHP script from a developer who has left the company. I understand the most part of the code, except the following function. My question: What does...
if($seq == 0x03 || $seq == 0x30)
...mean in context of extracting the signature out of an X.509 certificate?
public function extractSignature($certPemString) {
    $bin = $this->ConvertPemToBinary($certPemString);
    if(empty($certPemString) || empty($bin))
    {
        return false;
    }
    $bin = substr($bin,4);
    while(strlen($bin) > 1)
    {
        $seq = ord($bin[0]);
        if($seq == 0x03 || $seq == 0x30)
        {
            $len = ord($bin[1]);
            $bytes = 0;
            if ($len & 0x80)
            {
                $bytes = ($len & 0x0f);
                $len = 0;
                for ($i = 0; $i < $bytes; $i++)
                {
                    $len = ($len << 8) | ord($bin[$i + 2]);
                }
            }
            if($seq == 0x03)
            {
                return substr($bin,3 + $bytes, $len);
            }
            else
            {
                $bin = substr($bin,2 + $bytes + $len);
            }
        }
        else
        {
            return false;
        }
    }
    return false;
}
An X.509 certificate contains data in multiple sections (called Tag-Length-Value triplets). Each section starts with a Tag byte, which indicates the data format of the section. You can see a list of these data types here.
0x03 is the Tag byte for the BIT STRING data type, and 0x30 is the Tag byte for the SEQUENCE data type.
So this code is designed to handle the BIT STRING and SEQUENCE data types. If you look at this part:
if($seq == 0x03)
{
    return substr($bin,3 + $bytes, $len);
}
else // $seq == 0x30
{
    $bin = substr($bin,2 + $bytes + $len);
}
you can see that the function is designed to skip over Sequences (0x30), until it finds a Bit String (0x03), at which point it returns the value of the Bit String.
You might be wondering why the magic number is 3 for Bit String and 2 for Sequence. That is because in a Bit String, the first value byte is a special extra field which indicates how many bits are unused in the last byte of the data. (For example, if you're sending 13 bits of data, it will take up 2 bytes = 16 bits, and the "unused bits" field will be 3.)
Next issue: the Length field. When the length of the Value is less than 128 bytes, the length is simply specified using a single byte (the most significant bit will be 0). If the length is 128 or greater, then the first length byte has bit 7 set, and the remaining 7 bits indicate how many following bytes contain the length (in big-endian order). More description here. The parsing of the length field happens in this section of the code:
$len = ord($bin[1]);
$bytes = 0;
if ($len & 0x80)
{
    // length is greater than 127!
    $bytes = ($len & 0x0f);
    $len = 0;
    for ($i = 0; $i < $bytes; $i++)
    {
        $len = ($len << 8) | ord($bin[$i + 2]);
    }
}
After that, $bytes contains the number of extra bytes used by the length field, and $len contains the length of the Value field (in bytes).
Did you spot the error in the code? Remember,
If the length is 128 or greater, then the first length byte has bit 7
set, and the remaining 7 bits indicate how many following bytes
contain the length.
but the code says $bytes = ($len & 0x0f), which only takes the lower 4 bits of the byte! It should be:
$bytes = ($len & 0x7f);
Of course, this error is only a problem for extremely long messages: it will work fine as long as the length value will fit within 0x0f = 15 bytes, meaning the data has to be less than 256^15 bytes. That's about a trillion yottabytes, which ought to be enough for anybody.
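To make the header layout concrete, here is a small sketch of a standalone reader for one Tag/Length header, using the corrected 0x7f mask. The function name and return shape are made up for illustration, and it does no bounds checking, so it is not a drop-in replacement for the method in the question.

```php
<?php
// Hypothetical helper: read one Tag/Length header starting at $offset.
// Returns [tag, length of the Value in bytes, size of the header in bytes].
function readDerHeader(string $bin, int $offset = 0): array {
    $tag = ord($bin[$offset]);
    $len = ord($bin[$offset + 1]);
    $extra = 0;
    if ($len & 0x80) {
        // Long form: the low 7 bits give the number of length bytes that follow.
        $extra = $len & 0x7f;
        $len = 0;
        for ($i = 0; $i < $extra; $i++) {
            // Length bytes are big endian, most significant first.
            $len = ($len << 8) | ord($bin[$offset + 2 + $i]);
        }
    }
    return [$tag, $len, 2 + $extra];
}

// A SEQUENCE whose length (266) needs two extra length bytes.
print_r(readDerHeader("\x30\x82\x01\x0A"));
// A BIT STRING with a short-form length of 2.
print_r(readDerHeader("\x03\x02\x05\xA0"));
```

For a BIT STRING, the first byte of the Value is the "unused bits" count, which is why the original code skips one extra byte before returning the data.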
As Pateman says above, you just have a logical if, we're just checking if $seq is either 0x30 or 0x03.
I have a feeling you already know that though, so here goes. $seq is the first byte of the certificate, which is probably either the version of the certificate or the magic number to denote that the file is a certificate (also known as "I'm guessing this because 10:45 is no time to start reading RFCs").
In this case, we're comparing against 0x30 and 0x03. These numbers are expressed in hexadecimal (as is every number starting with 0x), which is base-16. This is just really a very convenient shorthand for binary, as each hex digit corresponds to exactly four binary bits. A quick table is this:
0 = 0000
1 = 0001
2 = 0010
3 = 0011
...
...
E = 1110
F = 1111
Equally well, we could have said if($seq == 3 || $seq == 48), but hex is just much easier to read and understand in this case.
I'd hazard a guess that it's a byte-order-independent check for version identifier '3' in an X.509 certificate. See RFC 1422, p7. The rest is pulling the signature byte-by-byte.
ord() gets the value of the ASCII character you pass it. In this case it's checking to see if the ASCII character is either a 0 or end of text (according to this ASCII table).
0x03 and 0x30 are hex values. Look that up and you'll have what $seq is matching to
