As far as I know, SHA-2 generates a 256-bit hash.
256 bits / 8 = 32 bytes.
So it should only need a varchar(32) field in the database. But I saw an article saying that a SHA-2 hash requires a varchar(64) field. Is that true? Can someone explain, please?
Hashes are generally represented as hexadecimal strings:
string(64) "316a2017faa1ee410aadfb159097b8af260a258aa4210c550844cab89083111d"
In this form, a SHA-256 hash takes 64 characters (64 bytes in a single-byte encoding). However, you may choose to store it in its binary form. This takes half as much space (32 bytes) but makes it unreadable in your database shell:
string(32) "̵9�~Rbgc\�7ME���)Fw�w��E�kc5"
Whether you store as a 64 byte string or a 32 byte binary is up to you.
Use varchar(64) or string(64). You need 64 characters to represent the 256 bits of SHA-256 in hexadecimal: each hex digit represents 4 bits, and 256/4 = 64.
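To make the two sizes concrete, here is a minimal Python sketch comparing the binary and hexadecimal forms of the same SHA-256 hash. The input string is made up purely for illustration.

```python
import hashlib

# Hypothetical input, chosen only for illustration.
data = b"example input"

raw = hashlib.sha256(data).digest()          # binary form
hex_form = hashlib.sha256(data).hexdigest()  # hexadecimal string

print(len(raw))       # 32 bytes, would fit a binary(32) column
print(len(hex_form))  # 64 characters, would fit a char(64) column
```

Both values carry the same 256 bits; only the representation differs.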
I'd like to encrypt an int of up to 7,000,000 into a 4-character string and then decrypt it back.
Any idea how this can easily be achieved with PHP?
Clarification
I want to create a unique slug for each WordPress user, based on the user ID, with a length of 4 characters.
example.com/rDfy
log2(7,000,000) ≈ 22.7, so the value fits in 3 bytes (24 bits). Store the value as a 4-byte integer, take its least significant 3 bytes, and Base64-encode those 3 bytes into 4 characters. That is an encoding, not encryption, but it may suffice.
If you need encryption, encrypt the byte array with a substitution cipher first. Be aware that this is not strong encryption.
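The encoding step described above can be sketched as follows. This is an illustration in Python rather than PHP, the function names encode_id and decode_id are made up, and it assumes the URL-safe Base64 alphabet so that the slug never contains "+" or "/". It only encodes; it does not encrypt.

```python
import base64

def encode_id(user_id: int) -> str:
    """Pack an ID below 2**24 into 3 bytes and Base64-encode it to 4 characters."""
    if not 0 <= user_id < 2**24:
        raise ValueError("ID does not fit in 3 bytes")
    three_bytes = user_id.to_bytes(3, "big")
    # 3 bytes map to exactly 4 Base64 characters, so there is no "=" padding.
    return base64.urlsafe_b64encode(three_bytes).decode("ascii")

def decode_id(slug: str) -> int:
    """Reverse encode_id: turn the 4-character slug back into the integer."""
    return int.from_bytes(base64.urlsafe_b64decode(slug), "big")

slug = encode_id(7_000_000)
print(slug, decode_id(slug))
```

Anyone who knows the scheme can recover the ID from the slug, so this is only suitable where the ID itself is not a secret.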
The docs (http://php.net/manual/de/function.crypt.php) for the crypt() function show the following example for an MD5 hash:
$1$rasmusle$rISCgZzpwk3UhDidwXvin0
I understand that "$1$" is the prefix, which indicates that the hash is an MD5 hash.
But how is the rest of the string an MD5 hash? Normally it should be a 32-character string (0-9, a-f), right?
I'm sure it's a stupid question, but I still want to ask.
Normally it should be a 32 char string (0-9, a-f), right?
That is not correct (at least strictly speaking). Technically, an MD5 hash is a 128-bit numeric value. The form that you are used to is simply a hexadecimal representation of that number. It is often chosen because strings are easy to exchange, whereas 128-bit integers are difficult to handle (a typical integer variable usually holds only 64 bits). Consider the following examples:
md5("test") in hexadecimal (base 16) representation: 098f6bcd4621d373cade4e832627b4f6
md5("test") in base 64 representation: CY9rzUYh03PK3k6DJie09g==
md5("test") in decimal (base 10) representation: 12707736894140473154801792860916528374
md5("test") in base 27 representation (never used, just because I can and to prove my point): ko21h9o9h8bc1hgmao4e69bn6f
All these strings represent the same numerical value, just in different bases.
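As a quick check, the following Python sketch derives the hexadecimal, Base64 and decimal forms from the same 16 raw bytes. The variable names are only illustrative.

```python
import base64
import hashlib

digest = hashlib.md5(b"test").digest()  # 16 raw bytes = 128 bits

as_hex = digest.hex()                                  # base 16
as_base64 = base64.b64encode(digest).decode("ascii")   # base 64
as_decimal = str(int.from_bytes(digest, "big"))        # base 10

print(as_hex)
print(as_base64)
print(as_decimal)
```

Converting any of these strings back to a number gives the same 128-bit value.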
When referring to the length of a hash value such as sha1 or md5 in PHP, is it correct to interpret that as the size of the hash in memory rather than the number of characters present in the literal?
Yes, it is. However, that size is tightly related to the number of characters in the string: if you get a raw string, you get 1 character per 8 bits; if you get hex digits (the default), you get 1 character per 4 bits.
It's the minimum number of bits required to store the hash unambiguously.
>>> import hashlib
>>> len(hashlib.md5(b'foo').digest()) * 8
128
>>> len(hashlib.sha1(b'foo').digest()) * 8
160
>>> len(hashlib.sha512(b'foo').digest()) * 8
512
The principal output of a secure hash function is always defined in bits. So when referring to the output of a hash function a cryptographer always talks about e.g. 128 bits for the broken MD5 algorithm, 160 bits for SHA1 and obviously 256 bits for SHA-256.
Most crypto APIs, however, only work with bytes. This means that if there is a specific method to indicate the hash size, it more often than not returns the size in bytes. That would be 16, 20 and 32 bytes for the above algorithms.
Of course, if the bytes are returned as e.g. hexadecimal, then the length of the string in characters is double that: 32, 40 or 64 characters. Whether that translates to an identical number of bytes depends on the character encoding (e.g. using UTF-16 would double the number of bytes).
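The three different "lengths" can be compared side by side. This is a small Python sketch; the input b"foo" and the sizes dictionary are only there for illustration.

```python
import hashlib

sizes = {}
for name in ("md5", "sha1", "sha256"):
    h = hashlib.new(name, b"foo")
    hex_text = h.hexdigest()
    sizes[name] = (
        h.digest_size,                      # size in bytes, as reported by the API
        len(hex_text),                      # characters in the hex string
        len(hex_text.encode("utf-16-le")),  # bytes if that string is stored as UTF-16
    )

for name, (n_bytes, n_chars, n_utf16) in sizes.items():
    print(name, n_bytes * 8, n_bytes, n_chars, n_utf16)
```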
Hash functions do have a large internal state, so the memory used by a running implementation is much larger than the number of bits in the output. It is not so large that you would notice on a modern PC, though.
I have a need to store an encrypted but recoverable (by admin) password in MySQL, from PHP. AFAIK, the most straightforward way to do this is with openssl_public_encrypt(), but I'm not sure what column type is needed. Can I make any reliable judgment on the maximum length of encrypted output, based upon the size of the key and the input?
Or am I forced to use a huge field (e.g. BLOB), and just hope it works all the time?
The openssl_public_encrypt function limits the size of the data you can encrypt to the length of the key. If you use padding (recommended), you lose an extra 11 bytes.
The PKCS#1 standard, which OpenSSL uses, specifies a padding scheme (so you can encrypt smaller quantities without losing security), and that padding scheme takes a minimum of 11 bytes (it will be longer if the value you're encrypting is smaller). So the largest input you can encrypt with a 1024-bit key is 936 bits (unless you disable the padding with the OPENSSL_NO_PADDING flag, in which case you can go up to 1023-1024 bits). With a 2048-bit key it's 1960 bits instead.
Of course, you should never disable padding, because then identical passwords will encrypt to the same value.
So for a 1024-bit key the maximum password input length is 117 chars.
For a 2048-bit key it's 245 chars.
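The arithmetic behind these limits is simple enough to sketch. The helper names below are made up, and the code assumes PKCS#1 v1.5 padding with its 11-byte minimum overhead, as described above.

```python
PKCS1_V15_OVERHEAD = 11  # minimum padding bytes, per the answer above

def max_plaintext_bytes(key_bits: int) -> int:
    """Largest input, in bytes, that fits under PKCS#1 v1.5 padding."""
    return key_bits // 8 - PKCS1_V15_OVERHEAD

def ciphertext_bytes(key_bits: int) -> int:
    """An RSA ciphertext is as long as the key modulus."""
    return key_bits // 8

for bits in (1024, 2048):
    print(bits, max_plaintext_bytes(bits), ciphertext_bytes(bits))
```

Note that the limit is in bytes, so a password with multi-byte characters reaches it sooner than its character count suggests.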
I'm not 100% sure of the output length, but a simple trial should confirm it. The output length is a simple function of the key length, so for a 2048-bit key I suspect it is 256 bytes.
You should use a binary string with the required length to store the password.
For speed reasons it's best to use a limited length index on the field.
Do not use blob (!) because that will slow things way down for no benefit.
CREATE TABLE user (
  id INTEGER UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  username VARCHAR(50) NOT NULL,
  passRSA BINARY(256),        -- double-check the length
  INDEX ipass (passRSA(10))   -- only indexes the first 10 bytes, for speed reasons
) ENGINE = InnoDB;
Adding extra bytes to the index will just slow things down and grow the index file for no benefit.
I use this function for hashing my passwords:
// RETURNS: rAyZOnlNBxO2WA53z2rAtFlhdS+M7kec9hskSCpeL6j+WwcuUvfFbpFJUtHvv7ji
base64_encode(hash_hmac('sha384', $str . SC_NONCE, SC_SITEKEY, true));
And I store the hashes in a char(64) field (MySQL, InnoDB).
Should I use varchar(64) instead of char(64)? Why?
Edit:
I replaced sha256 with sha384, because in this example sha256 always returns 44 bytes for me. Sorry for the confusion. Now it's 64 bytes.
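The 44 and 64 in the edit follow from Base64 producing 4 characters for every 3 input bytes (rounded up). Here is a rough Python equivalent of the PHP call above; the key and message are placeholders standing in for SC_SITEKEY and $str . SC_NONCE.

```python
import base64
import hashlib
import hmac

# Placeholder values standing in for SC_SITEKEY and $str . SC_NONCE.
SITE_KEY = b"hypothetical-site-key"
message = b"password" + b"hypothetical-nonce"

lengths = {}
for algo in ("sha256", "sha384"):
    raw = hmac.new(SITE_KEY, message, algo).digest()  # raw binary HMAC
    encoded = base64.b64encode(raw).decode("ascii")
    lengths[algo] = (len(raw), len(encoded))

print(lengths)
```

The raw HMAC has a fixed size per algorithm (32 bytes for sha256, 48 for sha384), so the Base64 length does not depend on the password.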
varchars save storage by only using up to the length required. If the encoded hash is always 64 characters, then it makes no difference in terms of storage, so char is probably just as good as varchar in this case.
If you have variable length data to store, then a varchar will save wasting unnecessary space.
You should use CHAR(64) since your hash is fixed in length. Using VARCHAR will add another byte, wasting space.
Even though you are using a Base64-encoded string, the result is not necessarily 64 characters in length. In this case, VARCHAR is better because the result can be shorter than 64 characters.
In fact, as seen here, 64 characters is the maximum length rather than the set length.