The docs (http://php.net/manual/de/function.crypt.php) for the crypt() function show the following example for an MD5 hash:
$1$rasmusle$rISCgZzpwk3UhDidwXvin0
I understand that "$1$" is the prefix, which indicates that the hash is an MD5 hash.
But how is the rest of the string an MD5 hash? Normally it should be a 32 char string (0-9, a-f), right?
I'm sure it's a stupid question, but I still want to ask.
Normally it should be a 32 char string (0-9, a-f), right?
That is not correct (at least strictly speaking). Technically, an MD5 hash is a 128-bit numeric value. The form that you are used to is simply a hexadecimal representation of that number. It is often chosen because strings are easy to exchange, whereas 128-bit integers are difficult to handle (a typical integer variable usually holds only 64 bits). Consider the following examples:
md5("test") in hexadecimal (base 16) representation: 098f6bcd4621d373cade4e832627b4f6
md5("test") in base 64 representation: CY9rzUYh03PK3k6DJie09g==
md5("test") in decimal (base 10) representation: 12707736894140473154801792860916528374
md5("test") in base 27 representation (never used, just because I can and to prove my point): ko21h9o9h8bc1hgmao4e69bn6f
All these strings represent the same numerical value, just in different bases.
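The same point can be sketched in a few lines of Python (used here only for illustration; the variable names are my own). The sketch computes md5("test") once and prints the identical 128-bit value in several bases. The last lines assume that the hash part of the crypt() example is written with 6 bits per character, which would explain why it is 22 characters long rather than 32; they make no claim about how crypt() derives the value.

```python
import base64
import hashlib
import math

digest = hashlib.md5(b"test").digest()  # the raw 128-bit value as 16 bytes

as_hex = digest.hex()                              # base 16, 32 characters
as_b64 = base64.b64encode(digest).decode("ascii")  # base 64, 24 characters incl. padding
as_int = int.from_bytes(digest, "big")             # the same value as a Python integer

print(as_hex)  # 098f6bcd4621d373cade4e832627b4f6
print(as_b64)  # CY9rzUYh03PK3k6DJie09g==
print(as_int)  # decimal (base 10) representation

# Assumption: 6 bits per character, so 128 bits need ceil(128 / 6) characters.
chars_needed = math.ceil(128 / 6)
crypt_hash_part = "rISCgZzpwk3UhDidwXvin0"  # text after the last "$" in the example
print(chars_needed, len(crypt_hash_part))
```

All three printed strings can be converted back to the same 16 bytes, so none of them is "more" of an MD5 hash than the others.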
Related
As far as I know, SHA-2 generates a 256-bit hash.
256 bits / 8 = 32 bytes.
So it should only need a varchar(32) field in the database. But I saw an article saying a SHA-2 hash requires a varchar(64) field in the database. Is that true? Can someone explain, please?
Hashes are generally represented as hexadecimal strings:
string(64) "316a2017faa1ee410aadfb159097b8af260a258aa4210c550844cab89083111d"
In this form, the SHA-256 hash takes 64 bytes. However, you may choose to store it in its binary form. This takes half as much space (32 bytes) but makes it unreadable in your database shell:
string(32) "̵9�~Rbgc\�7ME���)Fw�w��E�kc5"
Whether you store as a 64 byte string or a 32 byte binary is up to you.
Use varchar(64) or string(64). You need 64 characters to represent the 256 bits of SHA-256 in hexadecimal: each digit represents 4 bits, and 256/4 = 64.
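A quick way to see both sizes side by side is the following Python sketch (the input string is an arbitrary placeholder, not taken from the question):

```python
import hashlib

h = hashlib.sha256(b"example input")  # arbitrary sample input

raw = h.digest()          # binary form: 32 bytes
hex_text = h.hexdigest()  # hexadecimal form: 64 characters

print(len(raw), len(hex_text))

# Converting between the two forms is lossless.
assert bytes.fromhex(hex_text) == raw
```

Which form to store is a trade-off between space (32 bytes in a binary column) and readability (64 characters in a text column).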
I'd like to encrypt an int up to 7'000'000 into a 4-char string and then decrypt it back.
Any idea how this can easily be achieved with PHP?
Clarification
I want to create a unique slug for each Wordpress user based on the user ID with a length of 4 chars.
example.com/rDfy
log2(7,000,000) ≈ 22.7, so the value fits in fewer than 3 full bytes (24 bits). Make the value a 4-byte integer, cast it to a 4-byte array, and Base64-encode the least significant 3 bytes into 4 characters. That is an encoding, not encryption, but it may suffice.
If you need encryption, then encrypt the byte array using a substitution cipher, although that is not strong encryption.
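The encoding step could look roughly like this in Python (a sketch only: the function names are hypothetical, and the URL-safe Base64 alphabet is my assumption because the result is meant to be used as a slug). Three bytes always encode to exactly four Base64 characters with no padding.

```python
import base64

MAX_ID = 7_000_000  # upper bound from the question; fits in 3 bytes (< 2**24)


def encode_id(user_id: int) -> str:
    """Encode an integer below 2**24 as a 4-character URL-safe Base64 string."""
    if not 0 <= user_id < 2**24:
        raise ValueError("user_id must fit in 3 bytes")
    raw = user_id.to_bytes(3, "big")  # the least significant 3 bytes
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_slug(slug: str) -> int:
    """Reverse encode_id: 4 Base64 characters back to the integer."""
    raw = base64.urlsafe_b64decode(slug)
    return int.from_bytes(raw, "big")


slug = encode_id(MAX_ID)
print(slug, decode_slug(slug))
```

Anyone who recognises Base64 can reverse this, so it hides nothing; it only shortens the ID.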
When referring to the length of a hash value such as sha1 or md5 in PHP, is it correct to interpret that as the size of the hash in memory rather than the number of characters present in the literal?
Yes. However, that size is closely tied to the number of characters in the string: if you get a raw string, you get 1 character per 8 bits; if you get hex digits (the default), you get 1 character per 4 bits.
It's the minimum number of bits required to store the hash unambiguously.
>>> import hashlib
>>> len(hashlib.md5(b'foo').digest()) * 8
128
>>> len(hashlib.sha1(b'foo').digest()) * 8
160
>>> len(hashlib.sha512(b'foo').digest()) * 8
512
The principal output of a secure hash function is always defined in bits. So when referring to the output of a hash function a cryptographer always talks about e.g. 128 bits for the broken MD5 algorithm, 160 bits for SHA1 and obviously 256 bits for SHA-256.
Most crypto APIs however only work with bytes. This means that if there is a specific method present to indicate hash size, that more often than not the size in bytes is returned. So that would be 16, 20 and 32 bytes for the above algorithms.
Of course, if the bytes are returned as e.g. hexadecimals, then the length of the string in characters is double that. The string length should then be 32, 40 or 64 characters. Whether that translates to an identical number of bytes depends on the character encoding (e.g. using UTF-16 would double the number of bytes).
Hash functions do have a large internal state, so the memory used by a running implementation is much larger than the number of bits in the output. It is not large enough to notice on a modern PC, though.
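The bit, byte and character counts above can be checked with Python's hashlib, whose digest_size attribute reports the size in bytes (the dictionary name and the sample input are my own):

```python
import hashlib

sizes = {}
for name in ("md5", "sha1", "sha256"):
    h = hashlib.new(name, b"foo")
    hex_text = h.hexdigest()
    sizes[name] = (
        h.digest_size * 8,                 # output size in bits
        h.digest_size,                     # output size in bytes
        len(hex_text),                     # characters in the hex string
        len(hex_text.encode("utf-16-le")), # bytes if that string is stored as UTF-16
    )

for name, row in sizes.items():
    print(name, row)
```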
I use this function for hashing my passwords:
// RETURNS: rAyZOnlNBxO2WA53z2rAtFlhdS+M7kec9hskSCpeL6j+WwcuUvfFbpFJUtHvv7ji
base64_encode(hash_hmac('sha384', $str . SC_NONCE, SC_SITEKEY, true));
And I store hashes in char(64) field (MySQL -InnoDB).
Should I use varchar(64) instead of char(64)? Why?
Edit:
I replaced sha256 with sha384, because in this example sha256 always returns 44 bytes for me. Sorry for the confusion. Now it's 64 bytes.
A varchar saves storage by using only the length required. If the hash is always 64 characters long, then it makes no difference in terms of storage, so char is probably just as good as varchar in this case.
If you have variable-length data to store, then a varchar avoids wasting space.
You should use CHAR(64) since your hash is fixed in length. Using VARCHAR will add another byte, wasting space.
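The fixed length can be checked with a short Python sketch that mirrors the hash_hmac/base64_encode call from the question (the key and messages are placeholders, and b64_hmac is a hypothetical helper name). Base64 turns every 3 input bytes into 4 characters, so n raw bytes give 4 * ceil(n / 3) characters regardless of the message.

```python
import base64
import hmac
import math


def b64_hmac(message: bytes, key: bytes, algo: str) -> str:
    """Base64 of the raw HMAC output, analogous to the PHP call in the question."""
    raw = hmac.new(key, message, algo).digest()
    return base64.b64encode(raw).decode("ascii")


key = b"placeholder-site-key"  # stands in for SC_SITEKEY
for algo, raw_len in (("sha256", 32), ("sha384", 48)):
    expected = 4 * math.ceil(raw_len / 3)
    lengths = {len(b64_hmac(msg, key, algo)) for msg in (b"", b"a", b"a much longer password")}
    print(algo, expected, lengths)
```

This matches the 44 and 64 characters reported in the question for sha256 and sha384.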
Even though you are using a Base64-encoded string, the result is not necessarily 64 characters in length; that depends on the size of the data being encoded. If that size can vary, VARCHAR is better because the result can be shorter than 64 characters.
In that case, 64 characters is the maximum length rather than the set length.
Is there any chance that a SHA-1 hash can be purely numeric, or does the algorithm ensure that there must be at least one alphabetical character?
Edit: I'm representing it in base 16, as a string returned by PHP's sha1() function.
Technically, a SHA-1 hash is a number; it is just most often encoded in base 16 (which is what PHP's sha1() does), so it nearly always has a letter in it. There is no guarantee of this, though.
The odds of a hex-encoded 160-bit number having no digits a-f are (10/16)^40, or about 6.84227766 × 10^-9.
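The figure can be reproduced with a few lines of Python, assuming each of the 40 hex digits is independent and uniformly distributed (which is how a good hash should behave, not something SHA-1 guarantees):

```python
from fractions import Fraction

# Each hex digit is one of 16 symbols; 10 of them (0-9) are not letters.
p_exact = Fraction(10, 16) ** 40
p = float(p_exact)

print(p)      # about 6.84e-09
print(1 / p)  # expected number of hashes per all-digit result, roughly 146 million
```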
The SHA-1 hash is a 160-bit number. For ease of writing, it is normally written in hexadecimal. Hexadecimal (base 16) digits are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e and f. There is nothing special about the letters. Each hexadecimal character is equivalent to 4 bits, which means the hash can be written in 40 characters.
I don't believe there is any reason why a SHA-1 hash can't be free of letters, but it is improbable. It's like generating a 40-digit (base 10) random number and not getting any 7s, 8s or 9s.
You can represent the output of SHA-1 (just like any binary data) in any base you want. Specifically, you can encode the result in base 8 or base 10.
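As an illustration in Python (the input "foo" is arbitrary), the raw digest can be turned into an integer and then written in whichever base is convenient; the decimal and octal forms contain no letters at all:

```python
import hashlib

digest = hashlib.sha1(b"foo").digest()  # 20 raw bytes
n = int.from_bytes(digest, "big")       # the hash as a 160-bit integer

as_hex = format(n, "040x")  # base 16, zero-padded to 40 characters
as_dec = str(n)             # base 10, digits 0-9 only
as_oct = format(n, "o")     # base 8, digits 0-7 only

print(as_hex)
print(as_dec)
print(as_oct)
```

Note that the decimal and octal strings are not fixed-length unless you pad them, since leading zeros are dropped.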