Is there anything that can make the length of the value returned by PHP's crc32() function vary?
Thanks!
No, by definition a CRC32 has 32 bits.
You can only vary its representation. For instance, while it can be represented with four 8-bit bytes (and hence fits in a PHP int), you may wish to represent that number in base 10 as a string, and then it can have up to 10 characters (unsigned), since 2^32-1 is 4294967295.
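To make the difference between the fixed 32-bit value and its representations concrete, here is a minimal sketch. It uses Python's zlib.crc32 as a stand-in for PHP's crc32() (an assumption made purely for illustration), the helper name crc32_representations is hypothetical, and the input bytes are arbitrary.

```python
import struct
import zlib

def crc32_representations(data: bytes) -> dict:
    """Return one CRC32 value in several representations."""
    value = zlib.crc32(data) & 0xFFFFFFFF  # always in 0 .. 2**32 - 1
    return {
        "int": value,
        "raw": struct.pack(">I", value),   # 4 bytes, big-endian
        "hex": format(value, "08x"),       # 8 hex digits, zero-padded
        "dec": str(value),                 # 1 to 10 decimal digits
    }

reps = crc32_representations(b"example input")  # arbitrary sample data
print(len(reps["raw"]), len(reps["hex"]), len(reps["dec"]))
```

The checksum itself never changes size; only the chosen text or byte encoding decides how many characters you see.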
Related
Hi, I am working on an assembler and need to convert a number to hex for a branch command. Is there a way to change the number of hex digits returned in the output? We are using 24-bit instructions (6 hex digits); our branch commands use the first hex digit for the opcode and the second for the conditional bits, which leaves me 4 hex digits for the number. If I have a negative number like -2, I get fffffffffffffffe, which is 16 hex digits. Is there an easy way to make dechex() output a specified number of digits? I know how to handle positive numbers, since they output the minimum number of digits needed, so 2 becomes 2 and 15 becomes f.
If I go from integer to binary using decbin() I still get the full-width output. I can't just cut off the leading digits, can I?
Since I don't care about the possibility of overflow and will not get anywhere close to the 65k needed to require more than 4 hex digits, I can simply keep the last 4 digits and ignore the rest. I would still like to know if there is a way, though.
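A common way to get a fixed-width result like the one asked about here is to mask the number to the desired bit width before formatting. That keeps the low bits of the two's complement pattern and discards the rest. The sketch below is in Python, and the helper name to_hex_field is hypothetical; the same mask-then-format idea should carry over to PHP, though that is not shown in the thread.

```python
def to_hex_field(value: int, hex_digits: int) -> str:
    """Format value as exactly hex_digits hex digits (two's complement)."""
    bits = 4 * hex_digits
    lowest, highest = -(1 << (bits - 1)), (1 << bits) - 1
    if not lowest <= value <= highest:
        raise ValueError(f"{value} does not fit in {bits} bits")
    mask = (1 << bits) - 1
    # Masking keeps only the low `bits` bits of the two's complement pattern.
    return format(value & mask, f"0{hex_digits}x")

print(to_hex_field(-2, 4))   # fffe
print(to_hex_field(2, 4))    # 0002
print(to_hex_field(15, 4))   # 000f
```

Cutting the leading f digits off fffffffffffffffe gives the same fffe, which is why truncation is safe as long as the value fits in the field.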
I have a very large integer, 12-14 digits long, and I want to encrypt/compress it into an alphanumeric value so that the integer can be recovered later from that value. I tried converting the integer to base 62 and mapping the digits to a-zA-Z0-9, but the resulting value is 7 characters long. That is still too long; I want to get down to about 4-5 characters.
Is there a general way to do this, or some method by which recovering the integer would still be possible? I am asking about the mathematical aspects here, but I will be programming this in PHP, which I started using only recently.
Edit:
I was thinking of reserving a masking bit and using it to generate fewer characters. I am aware that the range is not large enough, which is why I was hoping for a mathematical trick or a different representation. Base 62 was an idea I already tried, but it is not working out.
14 digit decimal numbers can express 100,000,000,000,000 values (10^14).
5 characters of a 62 character alphabet can express 916,132,832 values (62^5).
You cannot cram the equivalent number of values of a 14 digit number into a 5 character base 62 string. It's simply not possible to express each possible value uniquely. See http://en.wikipedia.org/wiki/Pigeonhole_principle. Even base 64 with 7 characters is not enough (only 4,398,046,511,104 possible values). In fact, if you target a 5 character short string you'd need to compensate by using a base 631 alphabet (631^5 = 100,033,806,792,151).
Even compression doesn't help you. It would mean that two or more numbers would need to compress to the same compressed string (because there aren't enough possible unique compressed values), which logically means it's impossible to uncompress them into two different values.
To illustrate this very simply: Say my alphabet and target "string length" consists of one bit. That one bit can be 0 or 1. It can express 2 unique possible values. Say I have a compression algorithm which compresses anything and everything into this one bit. ... How could I possibly uncompress 100,000,000,000,000 unique values out of that one bit with two possible values? If you'd solve that problem, bandwidth and storage concerns would immediately evaporate and you'd be a billionaire.
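The counting argument above can be checked with exact integer arithmetic. This is a small sketch rather than anything from the answer itself; min_length is a hypothetical helper that returns the shortest string length whose capacity covers a given number of distinct values.

```python
def min_length(num_values: int, alphabet_size: int) -> int:
    """Smallest string length whose capacity covers num_values distinct values."""
    length, capacity = 0, 1
    while capacity < num_values:
        capacity *= alphabet_size
        length += 1
    return length

values = 10 ** 14  # every decimal number of up to 14 digits

print(62 ** 5)                  # 916132832
print(min_length(values, 62))   # 8
print(min_length(values, 631))  # 5
print(min_length(values, 630))  # 6
```

With 62 symbols, covering all 14-digit numbers takes 8 characters, and 631 is the smallest alphabet size that reaches 5.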
With 95 printable ASCII characters you can switch to base 95 encoding instead of 62:
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
That way a decimal integer string of length X can be compressed into a base 95 string of length Y, where
Y = X * log(10) / log(95) = roughly X / 2
which is pretty good compression. So from length 12 you get down to 6 or 7 (95^6 is about 7.35 * 10^11, just short of covering every 12-digit number). If the purpose of the compression is to save bandwidth when using JSON, then base 92 can be a good choice (excluding ", \ and /, which get escaped in JSON).
Surely you can get better compression but the price to pay is a larger alphabet. Just replace 95 in the above formula by the number of symbols.
Unless of course, you know the structure of your integers. For instance, if they have plenty of zeroes, you can base your compression on this knowledge to get much better results.
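For a concrete picture of the larger-alphabet approach, here is a minimal round-trip sketch in Python. The function names encode and decode are hypothetical, and the alphabet is assumed to be the 95 printable ASCII characters with the space included (code points 32 to 126).

```python
# 95 printable ASCII characters, space included (an assumption of this sketch).
ALPHABET = "".join(chr(c) for c in range(32, 127))

def encode(number: int, alphabet: str = ALPHABET) -> str:
    """Write a non-negative integer in base len(alphabet)."""
    if number < 0:
        raise ValueError("only non-negative integers are supported")
    base = len(alphabet)
    digits = []
    while True:
        number, rem = divmod(number, base)
        digits.append(alphabet[rem])
        if number == 0:
            break
    return "".join(reversed(digits))

def decode(text: str, alphabet: str = ALPHABET) -> int:
    """Invert encode(): read the string as a base len(alphabet) number."""
    base = len(alphabet)
    index = {ch: i for i, ch in enumerate(alphabet)}
    number = 0
    for ch in text:
        number = number * base + index[ch]
    return number

n = 99999999999999             # largest 14-digit number
s = encode(n)
print(len(s), decode(s) == n)  # 8 True
```

Passing a shorter alphabet (for example one without the characters JSON escapes) works the same way, at the cost of slightly longer strings.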
Because of the pigeonhole principle, you will end up with some values that get compressed and other values that get expanded. It is simply impossible to create a compression algorithm that compresses every possible input string (i.e. in your case, your numbers).
If you force the cardinality of the output set to be smaller than the cardinality of the input set you'll get collisions (i.e. several input strings get "compressed" to the same compressed binary string). A compression algorithm should be reversible, right? :)
When referring to the length of a hash value such as sha1 or md5 in PHP, is it correct to interpret that as the size of the hash in memory rather than the number of characters present in the literal?
Yes, it is. However, that size is tightly related to the number of characters in the string: if you get a raw string, you'll get 1 character per 8 bits; if you get hex digits (the default), you're getting 1 character per 4 bits.
It's the minimum number of bits required to store the hash unambiguously.
>>> import hashlib
>>> len(hashlib.md5(b'foo').digest()) * 8
128
>>> len(hashlib.sha1(b'foo').digest()) * 8
160
>>> len(hashlib.sha512(b'foo').digest()) * 8
512
The principal output of a secure hash function is always defined in bits. So when referring to the output of a hash function, a cryptographer always talks about e.g. 128 bits for the broken MD5 algorithm, 160 bits for SHA-1 and obviously 256 bits for SHA-256.
Most crypto APIs, however, only work with bytes. This means that if there is a specific method to indicate the hash size, more often than not it returns the size in bytes. That would be 16, 20 and 32 bytes for the above algorithms.
Of course, if the bytes are returned as e.g. hexadecimal, then the length of the string in characters is double that: 32, 40 or 64 characters. Whether that translates to an identical number of bytes depends on the character encoding (e.g. using UTF-16 would double the number of bytes).
Hash functions do have a big internal state, so the number of bytes taken by a running implementation is much higher than the number of bits in the output. It is not so high that you would notice on a modern PC, though.
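The bits, bytes and characters mentioned above can be compared side by side. This is a small Python sketch using the standard hashlib module; hash_sizes is a hypothetical helper, and the input b"foo" is arbitrary.

```python
import hashlib

def hash_sizes(name: str, data: bytes = b"foo") -> dict:
    """Report one digest's size in bits, raw bytes, hex chars and UTF-16 bytes."""
    h = hashlib.new(name, data)
    raw = h.digest()        # raw bytes
    hexed = h.hexdigest()   # two hex characters per byte
    return {
        "bits": len(raw) * 8,
        "bytes": len(raw),
        "hex_chars": len(hexed),
        "utf16_bytes": len(hexed.encode("utf-16-le")),  # 2 bytes per hex char
    }

for name in ("md5", "sha1", "sha256"):
    print(name, hash_sizes(name))
```

The digest size is fixed per algorithm; everything else follows from how the digest is encoded for display or storage.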
I am trying this out, but am unable to store a large value:
$var = rand(100000000000000,999999999999999);
echo $var; // prints a 9 digit value(largest possible)
How can I get the desired value?
From the manual:
The size of an integer is platform-dependent, although a maximum value of about two billion is the usual value (that's 32 bits signed). 64-bit platforms usually have a maximum value of about 9E18. PHP does not support unsigned integers. Integer size can be determined using the constant PHP_INT_SIZE, and maximum value using the constant PHP_INT_MAX since PHP 4.4.0 and PHP 5.0.5.
...
If PHP encounters a number beyond the bounds of the integer type, it will be interpreted as a float instead. Also, an operation which results in a number beyond the bounds of the integer type will return a float instead.
BC Math and GMP are the (only?) way to work around this limitation.
PHP ints are 32 or 64 bits, depending on the platform. Other packages provide higher-precision ints: http://php.net/manual/en/language.types.integer.php
If you need to work with very large numbers I have found success with BC Math. Here is a link to everything you need to know:
http://php.net/manual/en/book.bc.php
If you want to generate the number and manipulate as a native type, you can't with most PHP installations (either you have 32 or 64 bit ints and nothing else), as the other answers have already stated. However, if you are just generating a number and want to pass it around a possible trick is to just concatenate strings:
$var = rand(0, PHP_INT_MAX) . str_pad(rand(0, 999999999), 9, '0', STR_PAD_LEFT); // pad string is '0', not the int 0
echo $var;
On a platform where PHP uses a 32-bit integer, this gives you a near-random integer (as a string) that is bigger than 32 bits (more than 10 decimal digits). Of course, there is a bias in this construction, which means you won't cover all the numbers with the same probability. The limits of the rand() calls follow normal decimal rules, so it's simple to adjust the upper bound of the number you want.
If all you're doing is storing/transmitting/showing this value, the string will be just fine. Equality and greater/less than tests will also work. Just don't do any math with it.
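The same chunk-and-concatenate idea can be sketched in Python, under the assumption that the generator only hands out numbers of up to 9 digits at a time (mimicking the 32-bit limit). The helper random_digit_string is hypothetical. Zero-padding every chunk after the first is the important detail: without it, a short draw would silently shorten the result.

```python
import random

def random_digit_string(length: int, chunk: int = 9) -> str:
    """Build a random decimal string of exactly `length` digits from small chunks."""
    if length < 1:
        raise ValueError("length must be positive")
    first_len = length % chunk or chunk
    # The leading chunk must not start with 0, so the string has exactly `length` digits.
    parts = [str(random.randint(10 ** (first_len - 1), 10 ** first_len - 1))]
    remaining = length - first_len
    while remaining > 0:
        # Later chunks are zero-padded to the full chunk width.
        parts.append(str(random.randint(0, 10 ** chunk - 1)).zfill(chunk))
        remaining -= chunk
    return "".join(parts)

print(random_digit_string(15))
```

Note that the random module is not suitable for security-sensitive values.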
Let's assume we are talking about a 32-bit system.
PHP doesn't support unsigned ints. This means an int value must be between -2,147,483,648 and 2,147,483,647, and an int takes 4 bytes (32 bits) to store.
So does that mean I have only 31 bits for the value and 1 bit for the sign? Or can I use the whole 32 bits to store a value?
You are using the whole 32 bits. It's just that the default output functions interpret it as signed integer. If you want to display the value "raw" use:
printf("%u", -1); // %u for unsigned
However, since PHP handles integers as signed internally, you can only rely on bitwise operations, not on addition, multiplication etc., if you expect the values to behave like unsigned ints.
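The point that signed and unsigned are two readings of the same 32 bits can be shown by packing a value as a signed 32-bit integer and unpacking the identical bytes as unsigned. This is a Python sketch using the standard struct module, assuming the 32-bit width discussed in the question; reinterpret_as_unsigned32 is a hypothetical helper name.

```python
import struct

def reinterpret_as_unsigned32(value: int) -> int:
    """Read the 4 bytes of a signed 32-bit integer as an unsigned one."""
    raw = struct.pack("<i", value)          # signed, little-endian, 4 bytes
    (unsigned,) = struct.unpack("<I", raw)  # same bytes, unsigned reading
    return unsigned

print(reinterpret_as_unsigned32(-1))           # 4294967295
print(reinterpret_as_unsigned32(-2147483648))  # 2147483648
print(reinterpret_as_unsigned32(5))            # 5
```

Non-negative values read the same either way; only patterns with the top bit set differ between the two interpretations.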
2147483647 is the usual value, 2^31-1: 1 bit is used for the sign, and the -1 is there because 0 also has to be represented.
from the manual:
"The size of an integer is platform-dependent, although a maximum value of about two billion is the usual value (that's 32 bits signed). 64-bit platforms usually have a maximum value of about 9E18. PHP does not support unsigned integers. Integer size can be determined using the constant PHP_INT_SIZE, and maximum value using the constant PHP_INT_MAX since PHP 4.4.0 and PHP 5.0.5."
As far as I know, on a 32-bit system the largest possible positive integer is 2147483647; values above that become floats. A PHP float can hold much larger magnitudes (about 1.8e308 on most platforms), but with only around 14 decimal digits of precision.
First of all, if you want to do calculations with huge numbers (say, regularly greater than PHP_INT_MAX), you should use the bcmath arbitrary-precision module.
Secondly, the official PHP implementation uses two's complement internally to represent integers, like virtually all other compilers and interpreters. Since the sign bit carries 1 bit of information (if you count 0 as positive[1]) and the remaining 31 bits carry 31, you can store 2^32 = 4 294 967 296 distinct values. How you interpret them is up to your application.
[1] - 0 is neither positive nor negative.
Normally with 32-bit integers the difference between signed and unsigned is just how you interpret the value. For example, (-1)+1 would be 0 for both signed and unsigned: for signed it's obvious, and for unsigned (where the same bit pattern reads as 4294967295) it's true only if you chop the overflow off. So you do have 32 bits to store values; it just so happens that 1 bit is interpreted differently from the rest.
Two's complement is most frequently used to store numbers, which means that 1 bit is not wasted just for the sign.
http://en.wikipedia.org/wiki/Two's_complement
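The wrap-around arithmetic behind two's complement can be simulated in a few lines. This is a Python sketch, not PHP behaviour (PHP converts an overflowing int to a float instead of wrapping); the names add_wrapping and to_signed are hypothetical, and a 32-bit width is assumed.

```python
BITS = 32
MASK = (1 << BITS) - 1

def add_wrapping(a: int, b: int) -> int:
    """Add two 32-bit patterns and chop off the overflow."""
    return (a + b) & MASK

def to_signed(pattern: int) -> int:
    """Interpret a 32-bit pattern as a two's complement signed value."""
    return pattern - (1 << BITS) if pattern >> (BITS - 1) else pattern

minus_one = -1 & MASK                 # bit pattern of -1
print(hex(minus_one))                 # 0xffffffff
print(add_wrapping(minus_one, 1))     # 0
print(to_signed(0x7FFFFFFF), to_signed(0x80000000))  # 2147483647 -2147483648
```

All 2^32 patterns map to distinct signed values, so no bit is wasted on the sign.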
In PHP, if a number overflows PHP_INT_MAX for the platform, it is converted to a floating-point value.
Yes. If PHP used unsigned integers, all 32 bits would hold the magnitude, since no sign is needed in that case. But as it supports only signed integers, a 32-bit system effectively has 31 bits for the value and 1 bit for the sign, so the signed integer range is -2147483648 to 2147483647.