What string encoding is used by PHP's hash_hmac?

PHP has a method hash_hmac that computes the HMAC signature of a given string using a given key and algorithm. But HMAC technically operates on binary data, and PHP takes all its params here as strings. How does it convert those strings to binary data?

Short answer: String encoding is just metadata attached to a lump of binary data. PHP strings are just the lump, you have to keep track of the rest.
Long answer:
PHP takes the Honey Badger approach to native string encodings, in other words, "PHP don't care". You give it a sequence of bytes, it stores them. It has no concept of encoding until you want to use a function that cares about it. Even then you need to explicitly declare the input and output encodings, otherwise PHP will go with its configured default which is usually not what anyone actually wants.
function nice_hex($in) {
    return implode(' ', str_split(bin2hex($in), 2));
}

$utf8 = "You owe me €5.";
$utf16le = mb_convert_encoding($utf8, 'utf-16le', 'utf-8');
$utf16be = mb_convert_encoding($utf8, 'utf-16be', 'utf-8');
$iso88591 = mb_convert_encoding($utf8, 'iso-8859-1', 'utf-8');
$cp1252 = mb_convert_encoding($utf8, 'cp1252', 'utf-8');

var_dump(
    $utf8,
    nice_hex($utf8),
    hash_hmac('md5', $utf8, 'foo'),
    $utf16le,
    nice_hex($utf16le),
    hash_hmac('md5', $utf16le, 'foo'),
    $utf16be,
    nice_hex($utf16be),
    hash_hmac('md5', $utf16be, 'foo'),
    $iso88591,
    nice_hex($iso88591),
    hash_hmac('md5', $iso88591, 'foo'),
    $cp1252,
    nice_hex($cp1252),
    hash_hmac('md5', $cp1252, 'foo')
);
Output:
string(16) "You owe me €5."
string(47) "59 6f 75 20 6f 77 65 20 6d 65 20 e2 82 ac 35 2e"
string(32) "7724135d91c43906f8730a26dcd76ffb"
string(28) "You owe me � 5."
string(83) "59 00 6f 00 75 00 20 00 6f 00 77 00 65 00 20 00 6d 00 65 00 20 00 ac 20 35 00 2e 00"
string(32) "f4a2347b4a1336dae1db21554c54b9e2"
string(28) "You owe me �5."
string(83) "00 59 00 6f 00 75 00 20 00 6f 00 77 00 65 00 20 00 6d 00 65 00 20 20 ac 00 35 00 2e"
string(32) "b0c1a98d8b853e6568bae513d764a029"
string(14) "You owe me ?5."
string(41) "59 6f 75 20 6f 77 65 20 6d 65 20 3f 35 2e"
string(32) "301a0fb55e23285904413323d10cc774"
string(14) "You owe me �5."
string(41) "59 6f 75 20 6f 77 65 20 6d 65 20 80 35 2e"
string(32) "fa1ee73d39e1a70fe2cde7a8c5bbf0ba"
The output looks the way it does because:
StackOverflow uses UTF-8.
My editor uses UTF-8.
My console uses UTF-8.
The fact that PHP doesn't care about string encoding lets me produce arbitrarily-encoded trash output like the above quite easily.
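One practical consequence for hash_hmac(): two parties only get the same signature if they sign the same bytes. A minimal sketch, assuming both sides agree to normalise the text to UTF-8 before signing. sign_utf8() is a hypothetical helper, not part of PHP, and the mbstring extension is assumed to be loaded:

```php
<?php
// Hypothetical helper: convert the text to UTF-8 bytes, then sign those bytes.
function sign_utf8(string $text, string $sourceEncoding, string $key): string
{
    $bytes = mb_convert_encoding($text, 'UTF-8', $sourceEncoding);
    return hash_hmac('sha256', $bytes, $key);
}

$utf8   = "You owe me \u{20AC}5.";                         // euro sign as UTF-8 bytes
$cp1252 = mb_convert_encoding($utf8, 'CP1252', 'UTF-8');  // same text, different bytes

// Signing the raw strings gives different MACs because the bytes differ.
$rawMatch = hash_hmac('sha256', $utf8, 'foo') === hash_hmac('sha256', $cp1252, 'foo');

// Normalising first makes both sides sign identical bytes.
$normalisedMatch = hash_equals(
    sign_utf8($utf8, 'UTF-8', 'foo'),
    sign_utf8($cp1252, 'CP1252', 'foo')
);

var_dump($rawMatch, $normalisedMatch);
```

The choice of UTF-8 as the common form is only a convention here; any encoding works as long as signer and verifier use the same one.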
Additional recommended reading: UTF-8 all the way through
Fun Fact: One of the reasons why PHP6 never ended up happening was because they wanted to include native multibyte string encoding but no one could agree on what flavor it should be. Eventually they just scrapped the whole thing and left it up to us the same as it was in PHP5.

A string literal simply holds the bytes of your source file, so it is UTF-8 if the file is saved as UTF-8.
You can put whatever encoding you want in a string; hash_hmac() doesn't use any specific encoding, it just hashes the bytes your string contains.
Here's an example from Wikipedia, running an HMAC algorithm over the UTF-8 bytes of the message:
HMAC_MD5("key", "The quick brown fox jumps over the lazy dog") = 80070713463e7749b90c2dc24911e275
And here's the result of the equivalent PHP code, which gets the same response:
php > echo hash_hmac('md5', "The quick brown fox jumps over the lazy dog", "key");
80070713463e7749b90c2dc24911e275
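To see that no encoding step is involved, HMAC can be written out by hand using nothing but byte operations. The following is a sketch of the RFC 2104 construction for MD5; hmac_md5_manual() is an illustrative name of my own, and real code should simply call hash_hmac():

```php
<?php
// Illustrative only: HMAC-MD5 written out per RFC 2104. Use hash_hmac() in real code.
function hmac_md5_manual(string $key, string $message): string
{
    $blockSize = 64; // MD5 processes 64-byte blocks
    if (strlen($key) > $blockSize) {
        $key = md5($key, true); // long keys are replaced by their raw digest
    }
    $key = str_pad($key, $blockSize, "\0"); // short keys are zero-padded

    // The ^ operator XORs two PHP strings byte by byte.
    $innerPad = $key ^ str_repeat("\x36", $blockSize);
    $outerPad = $key ^ str_repeat("\x5c", $blockSize);

    return md5($outerPad . md5($innerPad . $message, true));
}

$message = 'The quick brown fox jumps over the lazy dog';
$manual  = hmac_md5_manual('key', $message);
$builtin = hash_hmac('md5', $message, 'key');
// Passing true as the fourth argument returns the 16 raw bytes instead of hex.
$raw     = hash_hmac('md5', $message, 'key', true);

var_dump($manual === $builtin, bin2hex($raw) === $builtin);
```

Every step works on raw bytes: padding, XOR and hashing never ask what characters those bytes represent.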

Related

Strange behavior reading DBF file in PHP using fread()

I have a DBF file created as part of a shapefile with rgdal library's writeOGR function (in R).
When I ask to see its first bytes with Linux od command, I get the following.
od -x -c -N 32 BRA.dbf
0000000 7703 1e07 001b 0000 00a1 00d1 0000 0000
0000020 0000 0000 0000 0000 0000 0000 5700 0000
0000040
My PHP code goes like this.
$dbf = fopen('BRA.dbf','rb');
fread($dbf,10); // jumps over the first 10 bytes
$dbfRecSize = unpack('v',fread($dbf,2))[1]; // 'v' = little endian 16 bits: 00d1 = d1(16) = 209
fread($dbf,17); // jumps over a few more bytes
$dbfLangID = ord(fread($dbf,1)); // language driver ID
if ($dbfLangID == 0x57) {
echo "Language: 0x57 (ISO-8859-1)\n";
} else {
echo "Language: $dbfLangID;\n";
}
The code above outputs "Language: 0x57 (ISO-8859-1)", which means the "57" close to the end of the od output is being read with the ord(fread($dbf,1)); command.
Strange thing is that I've read 10+2+17 = 29 bytes from the file, so the next byte should be "00", or not (right after the 0x57)? $dbfRecSize is 209, which means my logic is correct in the first two reads. Why isn't it in the following reads?
What am I misunderstanding here?

The error was that I was confusing the output of the od command with that of debug from DOS...
od -x prints the file as 2-byte words in the host's byte order, so on a little-endian machine the two bytes of every pair appear swapped (too confusing for me).
0000000 7703 1e07 001b 0000 00a1 00d1 0000 0000
0000020 0000 0000 0000 0000 0000 0000 5700 0000
od -t x1 prints every byte separately, in file order (harder to count or read in the middle of a line).
0000000 03 77 07 1e 1b 00 00 00 a1 00 d1 00 00 00 00 00
0000020 00 00 00 00 00 00 00 00 00 00 00 00 00 57 00 00
I wonder whether there is an option to print the bytes two by two (in hexadecimal) without swapping them.
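A dump that keeps the file's byte order can also be produced in PHP itself, which sidesteps the od confusion. The sketch below uses the 32 header bytes shown above instead of opening BRA.dbf, and assumes the usual dBASE header layout (record count at offset 4, header size at 8, record size at 10, language driver ID at 29). The key names in the unpack() format are my own:

```php
<?php
// The 32 header bytes from the `od -t x1` output, in file order.
$header = hex2bin(
    '0377071e1b000000a100d10000000000' .
    '00000000000000000000000000570000'
);

// Dump the bytes two by two without swapping them.
$pairs = implode(' ', str_split(bin2hex($header), 4));
echo $pairs, "\n";

// C = unsigned byte, V = 32-bit little endian, v = 16-bit little endian.
$fields = unpack(
    'Cversion/Cyear/Cmonth/Cday/VrecordCount/vheaderSize/vrecordSize',
    $header
);

// Offset 29 is the byte reached after reading 10 + 2 + 17 bytes.
$langId = ord($header[29]);

printf(
    "records=%d header=%d recordSize=%d langId=0x%02x\n",
    $fields['recordCount'],
    $fields['headerSize'],
    $fields['recordSize'],
    $langId
);
```

In this dump the language byte shows up as 0057 rather than the 5700 printed by od -x, which makes it clear that 0x57 sits at offset 29 and not 28.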

How to read a float from binary WebM file?

I'm trying to learn binary and create a simple WebM parser in PHP based on Matroska.
I read TimecodeScale, MuxingApp, WritingApp, etc. with unpack(format, data). My problem is that when I reach Duration (0x4489) in Segment Information (0x1549a966), I must read a float and, based on TimecodeScale, convert it to seconds (261.564s -> 00:04:21.564), and I don't know how.
This is a sample sequence:
`2A D7 B1 83 0F 42 40 4D 80 86 67 6F 6F 67 6C 65 57 41 86 67 6F 6F 67 6C 65 44 89 88 41 0F ED E0 00 00 00 00 16 54 AE 6B`
TimecodeScale := 2ad7b1 uint [ def:1000000; ]
MuxingApp := 4d80 string; ("google")
WritingApp := 5741 string; ("google")
Duration := 4489 float [ range:>0.0; ]
Tracks := 1654ae6b container [ card:*; ]{...}
I must read a float after (0x4489) and return 261.564s.

The duration is a double-precision floating-point value (64 bits) in IEEE 754 format, stored big-endian in the file. If you want to see how the conversion is done, look up the IEEE 754 double-precision layout.
The TimecodeScale is the timestamp scale in nanoseconds.
In PHP you can do:
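$bin = hex2bin('410fede000000000'); // the 8 payload bytes that follow 44 89 88
$timecode_scale = 1e6; // nanoseconds per tick, from TimecodeScale (0x0f4240)
// unpack('d') uses machine byte order, but the file is big-endian:
// reverse the bytes when running on a little-endian machine
if (unpack('S', "\xff\x00")[1] === 0xff) {
    $bytes = unpack('C8', $bin);
    $bytes = array_reverse($bytes);
    $bin = implode('', array_map('chr', $bytes));
}
$duration = unpack('d', $bin)[1];
$duration_s = $duration * $timecode_scale / 1e9;
echo "duration={$duration_s}s\n";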
Result:
duration=261.564s
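On current PHP versions the manual byte reversal can be avoided, because unpack() has explicit big-endian float codes ('E' for 64-bit, 'G' for 32-bit). The sketch below walks the sample sequence from the question element by element and formats the result. vint_length() and all variable names are my own, and the walker handles only what this sample needs, so treat it as a starting point rather than a complete EBML parser:

```php
<?php
// Length of an EBML variable-length integer, from the position of the first set bit.
function vint_length(int $firstByte): int
{
    for ($len = 1, $mask = 0x80; $len <= 8; $len++, $mask >>= 1) {
        if ($firstByte & $mask) {
            return $len;
        }
    }
    throw new InvalidArgumentException('Invalid EBML variable-length integer');
}

// The sample sequence from the question.
$bin = hex2bin(str_replace(' ', '',
    '2A D7 B1 83 0F 42 40 4D 80 86 67 6F 6F 67 6C 65 57 41 86 67 6F 6F 67 6C 65 ' .
    '44 89 88 41 0F ED E0 00 00 00 00 16 54 AE 6B'
));

$timecodeScale = 1000000; // default listed in the spec excerpt
$duration = null;
$pos = 0;
$end = strlen($bin);

while ($pos < $end) {
    // Element ID, kept with its length marker as in the spec listing.
    $idLen = vint_length(ord($bin[$pos]));
    $id = bin2hex(substr($bin, $pos, $idLen));
    $pos += $idLen;
    if ($pos >= $end) {
        break; // the sample stops right after the Tracks ID
    }

    // Element size: clear the length marker bit, then read big-endian.
    $sizeLen = vint_length(ord($bin[$pos]));
    $size = ord($bin[$pos]) & (0xFF >> $sizeLen);
    for ($i = 1; $i < $sizeLen; $i++) {
        $size = ($size << 8) | ord($bin[$pos + $i]);
    }
    $pos += $sizeLen;

    $payload = substr($bin, $pos, $size);
    $pos += $size;

    if ($id === '2ad7b1') {
        // Unsigned big-endian integer.
        $timecodeScale = 0;
        foreach (str_split($payload) as $byte) {
            $timecodeScale = ($timecodeScale << 8) | ord($byte);
        }
    } elseif ($id === '4489') {
        // 'E' = 64-bit big-endian double, 'G' = 32-bit big-endian float.
        $duration = unpack($size === 8 ? 'E' : 'G', $payload)[1];
    }
}

$seconds = $duration * $timecodeScale / 1e9;
$h = (int) floor($seconds / 3600);
$m = (int) floor(($seconds - $h * 3600) / 60);
$s = $seconds - $h * 3600 - $m * 60;
$formatted = sprintf('%02d:%02d:%06.3f', $h, $m, $s);
echo $formatted, "\n";
```

Checking the payload size before choosing the format code matters because a Matroska float element may be stored as either 4 or 8 bytes.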

NodeJS Buffer Equivalent In PHP

NodeJS code:
const salt = new Buffer('GHlDHToiZA1ViUu+W+EXww==', 'base64');
Output like this:
<Buffer 18 79 43 1d 3a 22 64 0d 55 89 4b be 5b e1 17 c3>
I need the same output in PHP. Read somewhere about PHP's pack function but I don't know how to use it.

It seems you are working with base64. You are right that pack and unpack are your friends in PHP.
Example:
in Node
$ node
> Buffer.from('hello world').toString('base64')
'aGVsbG8gd29ybGQ='
in PHP
$ php -a
php > echo base64_encode('hello world');
aGVsbG8gd29ybGQ=
But if you are only looking for the binary:
in Node
> Buffer.from('hello world')
<Buffer 68 65 6c 6c 6f 20 77 6f 72 6c 64>
in PHP
php > print_r(unpack('H*', 'hello world'));
Array
(
[1] => 68656c6c6f20776f726c64
)
So in your instance you would first decode the base64 and then unpack it.
php > $raw = base64_decode('GHlDHToiZA1ViUu+W+EXww==');
php > print_r(unpack('H*', $raw));
Array
(
[1] => 1879431d3a22640d55894bbe5be117c3
)
Easy peasy ;)
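Note that $raw itself is already the PHP counterpart of the Node Buffer: a PHP string is just the bytes, and unpack('H*') only produces a printable view of them. A small sketch, with variable names of my own, that prints the bytes in Node's format and also exposes them as integers:

```php
<?php
// Strict mode makes base64_decode() return false on invalid input.
$raw = base64_decode('GHlDHToiZA1ViUu+W+EXww==', true);
if ($raw === false) {
    throw new InvalidArgumentException('Salt is not valid base64');
}

// Hex view, formatted the way Node prints a Buffer.
$nodeStyle = '<Buffer ' . implode(' ', str_split(bin2hex($raw), 2)) . '>';
echo $nodeStyle, "\n";

// If an array of byte values is needed (like iterating over a Buffer):
$bytes = array_values(unpack('C*', $raw));
echo count($bytes), " bytes, first = ", $bytes[0], "\n";
```

Pass $raw, not the hex string, to functions such as hash_hmac() or hash_pbkdf2() when the salt is meant to be used as binary data.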

I had the same problem and found a solution using the lcobucci/jwt package.
Pass your base64-encoded key to InMemory::base64Encoded(); the library decodes it to binary and uses that binary key to sign the JWT.
use Lcobucci\JWT\Configuration;
use Lcobucci\JWT\Signer\Hmac\Sha256;
use Lcobucci\JWT\Signer\Key\InMemory;

$configuration = Configuration::forSymmetricSigner(
    // You may use any HMAC variations (256, 384, and 512)
    new Sha256(),
    // replace the value below with a key of your own!
    InMemory::base64Encoded('your-base64-key')
    // You may also override the JOSE encoder/decoder if needed by providing extra arguments here
);

PHP - convert little endian hex to big endian hex

I am trying to convert little endian hex to big endian hex.
Example:
Little endian:
E1 31 01 00 00 9D
Big endian:
9D 00 00 01 31 E1

If the numbers are in the format described, then you can convert them using standard array functions.
function littleToBigEndian($little) {
return implode(' ',array_reverse(explode(' ', $little)));
}
echo littleToBigEndian('E1 31 3C 01 00 00 9B');
// Output: 9B 00 00 01 3C 31 E1
If there are no spaces for separation of numbers you need to str_split() the string instead.
function littleToBigEndian($little) {
return implode('',array_reverse(str_split($little,2)));
}
echo littleToBigEndian('E1313C0100009B');
// Output: 9B0000013C31E1
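An alternative is to reverse the raw bytes rather than the text: decode the hex, reverse the byte string, and encode it again. This is only a sketch; swap_hex_byte_order() is a hypothetical helper name, and it accepts input with or without spaces:

```php
<?php
// Hypothetical helper: reverse the byte order of a hex string, with or without spaces.
function swap_hex_byte_order(string $hex): string
{
    $compact = str_replace(' ', '', $hex);
    if (strlen($compact) % 2 !== 0 || !ctype_xdigit($compact)) {
        throw new InvalidArgumentException('Expected an even number of hex digits');
    }
    // hex2bin() gives the raw bytes, strrev() reverses them, bin2hex() prints them again.
    return strtoupper(bin2hex(strrev(hex2bin($compact))));
}

echo swap_hex_byte_order('E1 31 01 00 00 9D'), "\n";
echo swap_hex_byte_order('E1313C0100009B'), "\n";
```

The validation step is the main difference from the array-based versions: malformed input raises an exception instead of silently producing a wrong result.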

PHP utf encoding problem

How can I encode strings in UTF-16BE format in PHP? For "Demo Message!!!" the encoded string should be '00440065006D006F0020004D00650073007300610067006'. I also need to encode Arabic characters in this format.

First of all, this is absolutely not UTF-8, which is just a character encoding (i.e. a way to store strings in memory / display them).
What you have here looks like a dump of the bytes that make up each character.
If so, you could get those bytes this way:
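// Plain ASCII is already valid UTF-8, so no conversion is needed here
// (utf8_encode() is deprecated as of PHP 8.2).
$str = "Demo Message!!!";
for ($i = 0; $i < strlen($str); $i++) {
    $byte = $str[$i];
    $char = ord($byte);
    printf('%02x ', $char);
}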
And you'd get the following output :
44 65 6d 6f 20 4d 65 73 73 61 67 65 21 21 21
But, once again, what you posted is not UTF-8: in UTF-8, as you can see in the example I've given, D is stored in only one byte: 0x44.
In what you posted, it's stored using two bytes: 0x00 0x44.
Maybe you're using some kind of UTF-16?
EDIT after a bit more testing and @aSeptik's comment: this is indeed UTF-16.
To get the kind of dump you're getting, you'll have to make sure your string is encoded in UTF-16, which can be done with, for example, the mb_convert_encoding function:
$str = mb_convert_encoding("Demo Message!!!", 'UTF-16', 'UTF-8');
Then, it's just a matter of iterating over the bytes that make this string, and dumping their values, like I did before :
for ($i=0 ; $i<strlen($str) ; $i++) {
$byte = $str[$i];
$char = ord($byte);
printf('%02x ', $char);
}
And you'll get the following output :
00 44 00 65 00 6d 00 6f 00 20 00 4d 00 65 00 73 00 73 00 61 00 67 00 65 00 21 00 21 00 21
Which looks a lot like what you posted :-)
(You just have to remove the space in the call to printf; I left it there to get output that is easier to read.)

E.g. by using the mbstring extension and its mb_convert_encoding() function.
$in = 'Demo Message!!!';
$out = mb_convert_encoding($in, 'UTF-16BE', 'UTF-8'); // name the source encoding explicitly instead of relying on the configured default
for($i=0; $i<strlen($out); $i++) {
printf("%02X ", ord($out[$i]));
}
prints
00 44 00 65 00 6D 00 6F 00 20 00 4D 00 65 00 73 00 73 00 61 00 67 00 65 00 21 00 21 00 21
Or by using iconv()
$in = 'Demo Message!!!';
$out = iconv('iso-8859-1', 'UTF-16BE', $in);
for($i=0; $i<strlen($out); $i++) {
printf("%02X ", ord($out[$i]));
}
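The byte loop can be collapsed into a single expression with bin2hex(), and the same approach covers the Arabic text the question asks about. A minimal sketch, assuming the input is UTF-8 and the mbstring extension is available; utf16be_hex() is a hypothetical helper name:

```php
<?php
// Hypothetical helper: hex dump of the UTF-16BE encoding of a UTF-8 string.
function utf16be_hex(string $utf8): string
{
    return strtoupper(bin2hex(mb_convert_encoding($utf8, 'UTF-16BE', 'UTF-8')));
}

echo utf16be_hex('Demo Message!!!'), "\n";
// 00440065006D006F0020004D006500730073006100670065002100210021

// Arabic "marhaba" written with \u{...} escapes, so the result does not
// depend on how this source file is saved.
echo utf16be_hex("\u{0645}\u{0631}\u{062D}\u{0628}\u{0627}"), "\n";
// 06450631062D06280627
```

For text in the Basic Multilingual Plane, such as Arabic, each character becomes exactly four hex digits equal to its code point.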
