PHP dechex change amount of bytes in output

Hi, I am working on an assembler, so I need to take a number and convert it to hex for a branch command. Is there a way to change the number of hex digits returned in the output? We are using 24-bit instructions (6 hex digits), and our branch commands use the first hex digit for the op code and the second for the conditional bits, which leaves me 4 hex digits for the number. If I have a negative number like -2 I get fffffffffffffffe, which is 16 hex digits. Is there an easy way to limit the output of dechex() to a specified number of digits? I know how to handle positive numbers, as they output the minimum number of digits needed, so 2 becomes 2 and 15 becomes f.
If I go from integer to binary using decbin I still get the full-width output. I can't just cut off the leading digits, can I?

Since I don't care about the possibility of overflow, and I will not get anywhere close to the 65k value that would need more than 4 hex digits, I can simply keep the lowest 4 hex digits and ignore the rest. I would still like to know if there is a proper way, though.
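One way to get a fixed width is to mask off the low-order bits and zero-pad the result. The sketch below is only an illustration: `toFixedHex` is a hypothetical helper name, and it assumes a 64-bit PHP build and a width of 1 to 15 hex digits.

```php
<?php
// Hypothetical helper (not part of PHP): format $value as exactly $digits
// hex digits, keeping only the low-order bits (two's-complement truncation).
function toFixedHex(int $value, int $digits): string
{
    $mask = (1 << (4 * $digits)) - 1;   // e.g. 0xFFFF for 4 digits
    return sprintf('%0' . $digits . 'x', $value & $mask);
}

echo toFixedHex(-2, 4), "\n";   // fffe
echo toFixedHex(2, 4), "\n";    // 0002
echo toFixedHex(15, 4), "\n";   // 000f
```

Cutting the leading digits off the dechex() output with substr(dechex(-2), -4) also gives fffe for negative numbers, but it does not pad short positive results, which is why the mask-and-pad form is used here.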

Related

PhpExcel how to turn off the default conversion of numbers to scientific notation

How can I turn off the default conversion of numbers to scientific notation? When importing from an Excel file, large numbers are automatically converted to scientific notation (3.5868405364945E+14; it should be 358684053649447). Is there any option to turn off the conversion in PhpExcel?
Or can the conversion be reversed from PHP? When I try to use printf,
printf("%d", "3.5868405364945E+14"); // 358684053649450 wrong value
the final number is inaccurate.

Sorry, you'll never get the full value back; it has already been rounded, because your number has 16 digits and 15 digits is the limit for numbers in Excel.
It happens at the entry point, when you enter a number that exceeds 15 digits. Excel will round it, modifying your entry forever.
It's similar to storing a decimal number like 1.2 as an integer: you'll lose that 0.2, and no matter what you do, it will be 1 forever.
The only solution for this (too late in your case) is storing the large number as text in the first place, by adding a single quote before the number: '358684053649447 instead of 358684053649447. Excel will interpret that as a string, not as a number, and you'll be able to save numbers longer than 15 digits.

Unexpected result in PHP bitwise operation

The operation 1539 | 0xfffff800 returns -509 in JavaScript and Python 2.7.
In PHP I get 4294966787.
Does anybody know why, and could you explain it to me? I would love to know how to get the expected result in PHP as well.

1539 | 0xfffff800 = 4294966787 (= 0xFFFFFE03)
This is perfectly right. So, PHP is right.
If you would like to have both positive and negative integers, you need some mechanism to determine whether a number is negative. This is usually done using the two's complement of the number: you negate a number by inverting all of its bits and then adding 1. To avoid ambiguity, you cannot use all the bits of your integer variable for the magnitude; the highest bit is reserved as a sign bit. (Otherwise you would never know whether your number is a big positive number or a negative number.)
For example, with an 8-bit integer variable you can represent numbers from 0 to 255. If you need signed values, you can represent numbers from -128 (1000 0000 binary) to +127 (0111 1111).
In your example, you have a 32-bit number which has its highest bit set. In Python and JavaScript it is interpreted as a negative number, as they apparently use 32-bit variables here, and in a 32-bit variable the highest bit is set. So the result of your calculation is also negative.
In the PHP version you are using, the integer variable seems to be 64 bits long and only the lower 32 bits are used. The highest bit (bit 63) is not set, so PHP interprets this number as positive. Depending on what you want to achieve, you may want to fill all bits from bit 32 to bit 63 with 1s, which will create a negative number.
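Filling bits 32 to 63 with 1s is the same as subtracting 2^32 whenever bit 31 is set. A minimal sketch, assuming a 64-bit PHP build; `toSigned32` is a hypothetical helper name, not a built-in:

```php
<?php
// Hypothetical helper: reinterpret the low 32 bits of $value as a signed
// 32-bit integer. Assumes a 64-bit PHP build (PHP_INT_SIZE === 8).
function toSigned32(int $value): int
{
    $value &= 0xFFFFFFFF;          // keep only bits 0..31
    if ($value & 0x80000000) {     // bit 31 set: negative in 32-bit terms
        $value -= 0x100000000;     // subtract 2^32 to sign-extend
    }
    return $value;
}

$result = 1539 | 0xfffff800;
echo $result, "\n";               // 4294966787
echo toSigned32($result), "\n";   // -509
```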

Encoding/Compressing a large integer into alphanumeric value

I have a very large integer, 12-14 digits long, and I want to encrypt/compress it to an alphanumeric value so that the integer can be recovered later from the alphanumeric value. I tried converting this integer to base 62 and mapping the digits to a-zA-Z0-9, but the resulting value is 7 characters long. That is still too long; I want to get down to about 4-5 characters.
Is there a general way to do this, or some method by which this can be done so that recovering the integer is still possible? I am asking about the mathematical aspects here, but I will be programming this in PHP, which I recently started using.
Edit:
I was thinking in terms of assigning a masking bit and using it to generate fewer characters. I am aware that the range is not large enough, which is why I was focusing on a mathematical trick or a way of representation. Base 62 was an idea that I already applied, but it is not working out.

14 digit decimal numbers can express 100,000,000,000,000 values (10^14).
5 characters of a 62 character alphabet can express 916,132,832 values (62^5).
You cannot cram the equivalent number of values of a 14 digit number into a 5 character base 62 string. It's simply not possible to express each possible value uniquely. See http://en.wikipedia.org/wiki/Pigeonhole_principle. Even base 64 with 7 characters is not enough (only 4,398,046,511,104 possible values). In fact, if you target a 5 character short string you'd need to compensate by using a base 631 alphabet (631^5 = 100,033,806,792,151).
Even compression doesn't help you. It would mean that two or more numbers would need to compress to the same compressed string (because there aren't enough possible unique compressed values), which logically means it's impossible to uncompress them into two different values.
To illustrate this very simply: Say my alphabet and target "string length" consists of one bit. That one bit can be 0 or 1. It can express 2 unique possible values. Say I have a compression algorithm which compresses anything and everything into this one bit. ... How could I possibly uncompress 100,000,000,000,000 unique values out of that one bit with two possible values? If you'd solve that problem, bandwidth and storage concerns would immediately evaporate and you'd be a billionaire.
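The counting argument can be checked directly. This is a small sketch, with `minLength` as a hypothetical helper name, that finds the shortest string length whose number of combinations covers all 14-digit values; it uses integer arithmetic only and assumes a 64-bit PHP build.

```php
<?php
// Hypothetical helper: smallest string length whose $base ** length
// combinations cover $values distinct inputs. Integer arithmetic only;
// assumes the intermediate products fit in a 64-bit PHP int.
function minLength(int $values, int $base): int
{
    $length = 0;
    $capacity = 1;                 // number of distinct strings of this length
    while ($capacity < $values) {
        $capacity *= $base;
        $length++;
    }
    return $length;
}

$values = 10 ** 14;                     // all 14-digit decimal numbers
echo minLength($values, 62), "\n";      // 8
echo minLength($values, 64), "\n";      // 8
echo minLength($values, 631), "\n";     // 5
```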

With 95 printable ASCII characters you can switch to base 95 encoding instead of 62:
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
That way an integer string of length X can be compressed into a base 95 string of length Y, where
Y = X * log(10) / log(95), or roughly X / 2
which is pretty good compression. So from length 12 you get down to 6 or 7 (95^6 is about 7.35 * 10^11, so the largest 12-digit values need a seventh character). If the purpose of compression is to save bandwidth when using JSON, then base 92 can be a good choice (excluding ",\,/ that become escaped in JSON).
Surely you can get better compression but the price to pay is a larger alphabet. Just replace 95 in the above formula by the number of symbols.
Unless of course, you know the structure of your integers. For instance, if they have plenty of zeroes, you can base your compression on this knowledge to get much better results.
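A minimal sketch of such an encoding, under stated assumptions: the function names are hypothetical, the input is a non-negative integer that fits in PHP's native int, and the alphabet is taken as the 95 printable ASCII characters from space (32) to tilde (126).

```php
<?php
// Illustrative helpers (names are hypothetical). Assumes a non-negative
// integer that fits in PHP's native int, and a single-byte alphabet.
function encodeBase(int $n, string $alphabet): string
{
    $base = strlen($alphabet);
    $out = '';
    do {
        $out = $alphabet[$n % $base] . $out;   // next least-significant symbol
        $n = intdiv($n, $base);
    } while ($n > 0);
    return $out;
}

function decodeBase(string $s, string $alphabet): int
{
    $base = strlen($alphabet);
    $n = 0;
    foreach (str_split($s) as $ch) {
        $n = $n * $base + strpos($alphabet, $ch);
    }
    return $n;
}

// The 95 printable ASCII characters, space (32) through tilde (126).
$alphabet = implode('', array_map('chr', range(32, 126)));

$encoded = encodeBase(12345678901234, $alphabet);
echo strlen($encoded), "\n";                  // 7
echo decodeBase($encoded, $alphabet), "\n";   // 12345678901234
```

Passing a different alphabet string (for example the 62 alphanumeric characters) changes the base without touching the functions.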

Because of the pigeonhole principle, you will end up with some values that get compressed and other values that get expanded. It is simply impossible to create a compression algorithm that compresses every possible input string (i.e. in your case, your numbers).
If you force the cardinality of the output set to be smaller than the cardinality of the input set, you'll get collisions (i.e. several input strings get "compressed" to the same compressed binary string). A compression algorithm should be reversible, right? :)

RijndaelManaged.CreateEncryptor key expansion

There are two ways to specify a key and an IV for a RijndaelManaged object. One is by calling CreateEncryptor:
var encryptor = rij.CreateEncryptor(Encoding.UTF8.GetBytes(key), Encoding.UTF8.GetBytes(iv));
and another one by directly setting Key and IV properties:
rij.Key = "1111222233334444";
rij.IV = "1111222233334444";
As long as the length of the Key and IV is 16 bytes, both methods produce the same result. But if your key is shorter than 16 bytes, the first method still allows you to encode the data and the second method fails with an exception.
Now this may sound like an absolutely abstract question, but I have to use PHP & the key which is only 10 bytes long in order to send an encrypted message to a server which uses the first method.
So the question is: How does CreateEncryptor expand the key and is there a PHP implementation? I cannot alter the C# code so I'm forced to replicate this behaviour in PHP.

I'm going to have to start with some assumptions. (TL;DR - The solution is about two-thirds of the way down but the journey is way cooler).
First, in your example you set IV and Key to strings. This can't be done. I'm therefore going to assume we call GetBytes() on the strings, which is a terrible idea, by the way, as there are fewer potential byte values in usable ASCII space than the 256 values a byte can hold; that's what GenerateIV() and GenerateKey() are for. I'll get to this at the very end.
Next I'm going to assume you're using the default block, key and feedback size for RijndaelManaged: 128, 256 and 128 respectively.
Now we'll decompile the Rijndael CreateEncryptor() call. When it creates the Transform object it doesn't do much of anything with the key at all (except set m_Nk, which I'll come to later). Instead it goes straight to generating a key expansion from the bytes it is given.
Now it gets interesting:
switch (this.m_blockSizeBits > rgbKey.Length * 8 ? this.m_blockSizeBits : rgbKey.Length * 8)
So:
128 > len(k) x 8: switch on 128
128 <= len(k) x 8: switch on len(k) x 8
128 / 8 = 16, so if len(k) is 16 we can expect to switch on len(k) x 8. If it's more, then it will switch on len(k) x 8 too. If it's less it will switch on the block size, 128.
Valid switch values are 128, 192 and 256. That means it will only fall to default (and throw an exception) if it's over 16 bytes in length and not a valid block (not key) length of some sort.
In other words, it never checks against the key length specified in the RijndaelManaged object. It goes straight in to the key expansion and starts operating at the block level, as long as the key length (in bits) is one of 128, 192, 256 or less than 128. This is actually a check against the block size, not the key size.
So what happens now that we've patently not checked the key length? The answer has to do with the nature of the key schedule. When you enter a key in to Rijndael, the key needs to be expanded before it can be used. In this case, it's going to be expanded to 176 bytes. In order to accomplish this, it uses an algorithm which is specifically designed to turn a short byte array in to much longer byte array.
Part of that involves checking the key length. A bit more decompilation fun and we find that this is defined as m_Nk. Sound familiar?
this.m_Nk = rgbKey.Length / 4;
Nk is 4 for a 16-byte key, and less when we enter shorter keys. That's 4 words, for anyone wondering where the magic number 4 came from. This causes a curious fork in the key scheduler: there's a specific path for Nk <= 6.
Without going too deep in to the details, this actually happens to 'work' (ie. not crash in a fireball) with a key length less than 16 bytes... until it gets below 8 bytes.
Then the entire thing crashes spectacularly.
So what have we learned? When you use CreateEncryptor you are actually throwing a completely invalid key straight in to the key scheduler and it's serendipity that sometimes it doesn't outright crash on you (or a horrible contractual integrity breach, depending on your POV); probably an unintended side effect of the fact there's a specific fork for short key lengths.
For completeness' sake, we can now look at the other implementation, where you set the Key and IV in the RijndaelManaged object. These are stored in the SymmetricAlgorithm base class, which has the following setter:
if (!this.ValidKeySize(value.Length * 8))
throw new CryptographicException(Environment.GetResourceString("Cryptography_InvalidKeySize"));
Bingo. Contract properly enforced.
The obvious answer is that you cannot replicate this in another library unless that library happens to contain the same glaring issue, which I'm going to call a bug in Microsoft's code because I really can't see any other option.
But that answer would be a cop out. By inspecting the key scheduler we can work out what's actually happening.
When the expanded key is initialised, it populates itself with 0x00s. It then writes to the first Nk words with our key (in our case Nk = 2, so it populates the first 2 words or 8 bytes). Then it enters a second stage of expanding upon that by populating the rest of the expanded key beyond that point.
So now we know it's essentially padding everything past 8 bytes with 0x00, we can pad it with 0x00s right? No; because this shifts the Nk up to Nk = 4. As a result, although our first 4 words (16 bytes) will be populated as we expect, the second stage will begin expanding at the 17th byte, not the 9th!
The solution then is utterly trivial. Rather than padding our initial key with 6 additional bytes, just chop off the last 2 bytes.
So your direct answer in PHP is:
$key = substr($key, 0, -2);
Simple, right? :)
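The same truncation can be written for any short key length by following the m_Nk = rgbKey.Length / 4 rule described above. This is only a sketch of that reasoning, not verified against .NET: `effectiveKey` is a hypothetical helper name, and it assumes the key scheduler really does use only the first Nk complete 4-byte words.

```php
<?php
// Hypothetical helper based on the analysis above: keep only the whole
// 4-byte words of a short key, mirroring m_Nk = rgbKey.Length / 4.
function effectiveKey(string $key): string
{
    $nk = intdiv(strlen($key), 4);      // number of complete 32-bit words
    return substr($key, 0, $nk * 4);
}

$key = '1111222233';                    // a 10-byte example key
echo strlen(effectiveKey($key)), "\n";  // 8
```

For a 10-byte key this gives the same result as substr($key, 0, -2).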
Now you can interop with this encryption function. But don't. It can be cracked.
Assuming your key uses lowercase, uppercase and digits, you have an exhaustive search space of only 218 trillion keys.
62 values (26 + 26 + 10) is the search space of each byte, because you're never using the other 194 (256 - 62) values. Since we have 8 bytes, there are 62^8 possible combinations: 218 trillion.
How fast can we try all the keys in that space? Let's ask openssl what my laptop (running lots of clutter) can do:
Doing aes-256 cbc for 3s on 16 size blocks: 12484844 aes-256 cbc's in 3.00s
That's 4,161,615 passes/sec. 218,340,105,584,896 / 4,161,615 / 3600 / 24 = 607 days.
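The arithmetic can be reproduced in a few lines. The figures are the ones quoted above; the variable names are only for illustration, and a 64-bit PHP build is assumed so that 62^8 stays an integer.

```php
<?php
$keySpace = 62 ** 8;              // 8 bytes, 62 possible values each
$perSecond = 12484844 / 3.0;      // from the openssl speed line above
$days = $keySpace / $perSecond / 3600 / 24;

echo $keySpace, "\n";             // 218340105584896
echo (int) floor($days), "\n";    // 607
```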
Okay, 607 days isn't bad. But I can always just fire up a bunch of Amazon servers and cut that down to ~1 day by asking 607 equivalent instances to calculate 1/607th of the search space. How much would that cost? Less than $1000, assuming that each instance was somehow only as efficient as my busy laptop. Cheaper and faster otherwise.
There is also an implementation that is twice the speed of openssl, so cut whatever figure we've ended up with in half.
Then we've got to consider that we'll almost certainly find the key before exhausting the entire search space. So for all we know it might be finished in an hour.
At this point we can assert that if the data is worth encrypting, it's probably worth cracking the key.
So there you go.

Hash value lengths

When referring to the length of a hash value such as sha1 or md5 in PHP, is it correct to interpret that as the size of the hash in memory rather than the number of characters present in the literal?

Yes, it is. However, that size is tightly related to the number of characters in the string: if you get a raw string, you'll get 1 character per 8 bits; if you get hex digits (the default), you're getting 1 character per 4 bits.

It's the minimum number of bits required to store the hash unambiguously.
>>> len(hashlib.md5('foo').digest()) * 8
128
>>> len(hashlib.sha1('foo').digest()) * 8
160
>>> len(hashlib.sha512('foo').digest()) * 8
512

The principal output of a secure hash function is always defined in bits. So when referring to the output of a hash function a cryptographer always talks about e.g. 128 bits for the broken MD5 algorithm, 160 bits for SHA1 and obviously 256 bits for SHA-256.
Most crypto APIs however only work with bytes. This means that if there is a specific method present to indicate hash size, that more often than not the size in bytes is returned. So that would be 16, 20 and 32 bytes for the above algorithms.
Of course, if the bytes are returned as e.g. hexadecimal digits, then the length of the string in characters is double that. The string length should then be 32, 40 or 64 characters. Whether that translates to an identical number of bytes depends on the character encoding (e.g. using UTF-16 would double the number of bytes).
Hash functions do have a big internal state, so the number of bytes taken by a running implementation is much higher than the number of bits in the output. It is not so high that you would notice on a modern PC, though.
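In PHP the two views can be compared with strlen(). A short sketch using the built-in md5(), sha1() and hash() functions, whose optional boolean argument switches to raw binary output:

```php
<?php
// Hex output (the default): one character per 4 bits.
echo strlen(md5('foo')), "\n";                   // 32
echo strlen(sha1('foo')), "\n";                  // 40
echo strlen(hash('sha256', 'foo')), "\n";        // 64

// Raw binary output: one byte per 8 bits.
echo strlen(md5('foo', true)), "\n";             // 16
echo strlen(sha1('foo', true)), "\n";            // 20
echo strlen(hash('sha256', 'foo', true)), "\n";  // 32
```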
