What is the pool of characters used in a BCRYPT hash - php

I was looking for answers in BCRYPT-specific resources but found the answer in PHP's documentation for crypt.
I have an issue whereby I need to do some REGEX to clean a BCRYPT hash (generated by PHP password_hash function using PASSWORD_BCRYPT). I want to be able to know the characters that could theoretically appear in the BCRYPT hash so that REGEX can remove all other characters from a string.
I have read all about various bits of BCrypt, its history and its development, but I have not come across anything that states canonically what a BCRYPT hash can contain.
Current understanding is:
*, 0-9, a-z, A-Z, $, . \
Does BCRYPT contain any ascii character? (edit: no) I see it contains many but going through the many BCRYPT hashes I can find is not a good methodology for being sure. For example, BCRYPT hash does NOT seem to contain = or ¬ or a few other characters that, -for want of a better description- have small UTF-8 definitions.
This is using a PHP interface if that changes how BCRYPT outputs hashes.

I found the answer on the PHP Crypt Function page:
CRYPT_BLOWFISH - Blowfish hashing with a salt as follows: "$2a$", "$2x$" or "$2y$", a two digit cost parameter, "$", and 22 characters from the alphabet "./0-9A-Za-z". Using characters outside of this range in the salt will cause crypt() to return a zero-length string
So this PHP/BCRYPT Hash will use characters from:
$./0-9A-Za-z range.
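As a rough sketch of the cleaning step described in the question, assuming the character set quoted above from the crypt() documentation: the first function strips everything outside `$./0-9A-Za-z`, and the second checks the overall shape of a 60-character hash. The function names are made up for illustration.

```php
<?php
// Remove every character that cannot occur in a PHP bcrypt hash.
// Allowed set (per the crypt() documentation): $ . / 0-9 A-Z a-z
function clean_bcrypt_hash(string $input): string
{
    return preg_replace('~[^$./0-9A-Za-z]~', '', $input) ?? '';
}

// Check the overall shape: "$2a$", "$2x$" or "$2y$", a two-digit cost, "$",
// then 53 characters (22 for the salt, 31 for the hash itself).
function looks_like_bcrypt_hash(string $hash): bool
{
    return preg_match('~^\$2[axy]\$\d{2}\$[./0-9A-Za-z]{53}$~D', $hash) === 1;
}

$dirty = " \$2y\$10\$.vGA1O9wmRjrwAVXD98HNOgsNpDczlqm3Jq7KnEd1rVAGv3Fykk1a=\n";
$clean = clean_bcrypt_hash($dirty);
var_dump(looks_like_bcrypt_hash($clean));
```

Stripping characters only removes noise such as whitespace or stray punctuation; the shape check is what tells you whether the result is still plausibly a complete hash.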

Related

What is the format of password_hash output?

I know the PHP function, password_hash outputs the algorithm, cost, salt, and hash all in one string so password_verify can check a password.
Sample output from PHP page:
$2y$10$.vGA1O9wmRjrwAVXD98HNOgsNpDczlqm3Jq7KnEd1rVAGv3Fykk1a
So $2y$ represents the algorithm and 10 represents the cost.
But how does password_verify separate the salt from the hash? I don't see any identifier separating the two afterwards.
This is for the bcrypt version of password_hash.
Bcrypt has a fixed-length salt. The crypt function, which PHP calls internally when you use password_hash()/password_verify() with the default algorithm, takes a 16-byte salt. It is written as 22 characters of bcrypt's custom base64 alphabet (./A-Za-z0-9), which are then decoded back into bytes. Since 22 base64 characters carry 132 bits (16.5 bytes), there is an extra nibble of data that is not taken into account.
For all other hashes, the salt is a defined set of bytes, encoded into ASCII-safe base64 and placed after the $ sign. The verifying function then only has to split the string on the delimiter $, take the third set of characters, and extract the salt with substr(0,B64_ENCODED_HASH_ALGORITHM_SALT_LEN). After that it passes the salt, the other parameters it got from the split string, and the password to check back into the hashing function.
The string it gives you is defined by the hashing algorithm's standard in most cases but is almost always something to the pattern of
$<ALGORITHM_ID>$<COST_IN_FORMAT>$<BASE64_ENCODED_SALT><BASE64_ENCODED_HASH>$
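For bcrypt specifically, the fixed lengths make the split easy to illustrate. The following is a minimal sketch, assuming a 60-character hash with a 22-character salt followed by a 31-character hash; `split_bcrypt_hash` is a made-up helper, not how password_verify is implemented internally.

```php
<?php
// Split a bcrypt hash into its parts using the fixed field lengths.
// Returns null if the string does not have the expected shape.
function split_bcrypt_hash(string $hash): ?array
{
    $pattern = '~^\$(2[axy])\$(\d{2})\$([./0-9A-Za-z]{22})([./0-9A-Za-z]{31})$~D';
    if (preg_match($pattern, $hash, $m) !== 1) {
        return null;
    }
    return [
        'algorithm' => $m[1],       // e.g. "2y"
        'cost'      => (int) $m[2], // log2 of the number of rounds
        'salt'      => $m[3],       // 22 characters, encodes 16 bytes
        'hash'      => $m[4],       // 31 characters
    ];
}

// Sample output from the PHP documentation page quoted in the question.
$parts = split_bcrypt_hash('$2y$10$.vGA1O9wmRjrwAVXD98HNOgsNpDczlqm3Jq7KnEd1rVAGv3Fykk1a');
print_r($parts);
```

For the sample hash, the salt comes out as `.vGA1O9wmRjrwAVXD98HNO` and the remaining 31 characters are the hash.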

Why shouldn't I use the 23rd character in a crypt() function's salt?

I'm learning about PHP's crypt() function and have been running some tests with it. According to this post, I should use a salt that's 22 characters long. I can, however, use a string that's 23 characters long with some limitations. When I use a 22 character long string I always get an outcome of '$2y$xxStringStringStringStri.HashHashHashHashHashHashHashHas'. I know the period is just part of the salt.
It seems that if I use 23 characters instead of just 22, I can successfully generate different hashes, but there are only 4 different outcomes for all 64 characters. The 23rd character "rounds down" to the nearest quarter of the 64-character alphabet (e.g. if the 23rd character is "W" it rounds down to "O", and any digit rounds down to "u").
v---------------v---------------v---------------v---------------
./ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
All four of these crypt functions generate the same salt:
crypt('Test123','$2y$09$AAAAAAAAAAAAAAAAAAAAAq');
crypt('Test123','$2y$09$AAAAAAAAAAAAAAAAAAAAAr');
crypt('Test123','$2y$09$AAAAAAAAAAAAAAAAAAAAAs');
crypt('Test123','$2y$09$AAAAAAAAAAAAAAAAAAAAAt');
But this one is different:
crypt('Test123','$2y$09$AAAAAAAAAAAAAAAAAAAAAu');
So why shouldn't I use the 23rd character when it can successfully generate different outcomes? Is there some kind of glitchy behavior in PHP that should be avoided by not using it?
For clarification on how I'm counting the 23rd character in the salt:
crypt('Test123','$2y$08$ABCDEFGHIJKLMNOPQRSTUV');
// The salt is '$ABCDEFGHIJKLMNOPQRSTUV'
// Which will be treated as '$ABCDEFGHIJKLMNOPQRSTUO'
It has to do with hash collisions. Once you exceed 22 characters, your generated hashes are no longer unique, depending on the namespace of the algorithm. Put another way, more than 22 characters doesn't add any security and can actually decrease it.
$ is not part of the actual salt. It is a separator.
For Blowfish crypt, the format is $2[axy]$log2Rounds$[salt][hash]. You describe it adding a . -- that's because you are missing the last character. Blowfish's salt is 128 bits. You could use only 126, yes, but you are just unnecessarily weakening the salt.
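The "rounds down to a quarter" behaviour follows from the bit arithmetic: 22 characters of 6 bits each give 132 bits, but the salt is only 128 bits, so only the top 2 bits of the last salt character are used and its low 4 bits are dropped. Here is a small sketch of that masking, assuming the bcrypt alphabet order `./A-Za-z0-9`; the helper name is hypothetical.

```php
<?php
// bcrypt's base64 alphabet, in index order 0..63 (assumed order: ./A-Za-z0-9).
const BCRYPT_ALPHABET = './ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';

// Return the character that the last (22nd) salt character is effectively
// treated as: only the top 2 of its 6 bits survive, so the index is masked
// with 0x30 (binary 110000).
function canonical_last_salt_char(string $char): string
{
    $index = strlen($char) === 1 ? strpos(BCRYPT_ALPHABET, $char) : false;
    if ($index === false) {
        throw new InvalidArgumentException('Not a single bcrypt base64 character');
    }
    return substr(BCRYPT_ALPHABET, $index & 0x30, 1);
}

foreach (['q', 'r', 's', 't', 'u', 'V', 'W', '5'] as $c) {
    echo $c, ' => ', canonical_last_salt_char($c), "\n";
}
```

Under these assumptions, q, r, s and t all map to "e", while "u" maps to itself, which matches the four identical salts and the one different salt in the question.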

What is the character set of hash when md5 is used with salt using crypt()?

What is the character set of the output given by crypt() using md5 with a salt?
By hash, I mean just the 22 characters after "$1$", the 8 random salt characters, and the following "$". So I wanted to know what kind of characters those 22 hashed characters can contain.
I was looking for this and found a few questions that touched on it, but nobody seems to have a definitive answer, except the code itself, of course, and this Python implementation of the same:
http://pythonhosted.org/passlib/lib/passlib.hash.md5_crypt.html
Based on this, it seems both the salt, and the hash itself are encoded with the following regex char set: [./0-9A-Za-z]
I would expect these to output the same since they are all trying to be compatible with the same shadow password utilities.
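One way to check this against PHP itself is to generate an MD5-crypt hash and match it against that character set. This is only a sketch: the function name is made up, and the pattern assumes a salt of up to 8 characters followed by a 22-character hash.

```php
<?php
// Check that a string has the shape of an MD5-crypt hash:
// "$1$", up to 8 salt characters, "$", then 22 hash characters,
// all drawn from [./0-9A-Za-z].
function looks_like_md5_crypt(string $hash): bool
{
    return preg_match('~^\$1\$[./0-9A-Za-z]{1,8}\$[./0-9A-Za-z]{22}$~D', $hash) === 1;
}

// "abcdefgh" is an arbitrary example salt, not a recommendation.
$hash = crypt('secret', '$1$abcdefgh$');
var_dump(looks_like_md5_crypt($hash));
```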

Expanding and using salt generation code for a php login system

I am working on a PHP/MySQL login system for a web project. After looking through SO and a lot of articles on the web, I've come up with a basic framework and started writing some code for it. However, I've come to a bit of an impasse with password encryption.
After a night's worth of reading, I've found out that:
I should hash the user's password with at least sha1 or sha2
I should also use a randomly generated salt (this is what I need help with) and append it to the password before encrypting it
the hashed password and the randomly generated salt should be stored in the database and then queried and combined/encrypted, then checked against the user's hashed password.
My problem is with randomly generating the salt.
Possibilities I can think of:
Use mt_rand() in a loop to pick an ASCII code, get the corresponding character with chr() and concatenate to salt.
This allows to create salts with any length.
Define a string with available characters, use mt_rand() in a loop to pick random positions from it, extract the character in the selected position with substr() or mb_substr() and concatenate to salt.
This allows to create salts with a chosen character set and length.
Use a builtin function that generates a random string (e.g. uniqid()) and optionally hash it.
This is quick and simple.
I normally use the second option.
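A minimal sketch of the second option might look like the following. Note that it swaps in random_int() (PHP 7+) where the description above uses mt_rand(), and indexes the string directly instead of calling substr(); the function name and the default alphabet are assumptions for illustration.

```php
<?php
// Build a salt by picking random positions from a string of allowed characters.
// random_int() is used here in place of mt_rand() (an assumption, not part of
// the description above).
function random_salt(
    int $length,
    string $alphabet = './0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
): string {
    $max = strlen($alphabet) - 1;
    $salt = '';
    for ($i = 0; $i < $length; $i++) {
        $salt .= $alphabet[random_int(0, $max)];
    }
    return $salt;
}

$salt = random_salt(22);
echo $salt, "\n";
```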
uniqid() ?
http://sg2.php.net/uniqid

Blowfish salt length for the Crypt() function?

According to the crypt() documentation, the salt needs to be 22 base 64 digits from the alphabet "./0-9A-Za-z".
This is the code example they give:
crypt('rasmuslerdorf', '$2a$07$usesomesillystringforsalt$');
The first confusing part is that salt has 25 characters, not 22.
Question #1: Does that mean the salt is supposed to be longer than 22 characters?
Then I tested the function myself and noticed something. If I use a 20 character salt, I get this
// using 20 char salt: 00000000001111111111
crypt('rasmuslerdorf', '$2a$07$00000000001111111111$');
// $2a$07$00000000001111111111$.6Th1f3O1SYpWaEUfdz7ieidkQOkGKh2
So, when I used a 20 character salt, the entire salt is in the output. Which is convenient, because I do not have to store the salt in a separate place then. (I want to use random salts). I would be able to read the salt back out of the generated hash.
However, if I use a 22 character salt as the documentation says, or a longer one, the salt is cut off at the end.
// using 22 char salt: 0000000000111111111122
crypt('rasmuslerdorf', '$2a$07$0000000000111111111122$');
// $2a$07$000000000011111111112uRTfyYkWmPPMWDRM/cUAlulrBkhVGlui
// 22nd character of the salt is gone
// using 25 char salt: 0000000000111111111122222
crypt('rasmuslerdorf', '$2a$07$0000000000111111111122222$');
// $2a$07$000000000011111111112uRTfyYkWmPPMWDRM/cUAlulrBkhVGlui
// Same hash was generated as before, 21 chars of the salt are in the hash
Question #2: So, what exactly is the proper length of a salt? 20? 22? Longer?
Question #3: Also, is it a good idea to read the salt out of the hash when it is time to check passwords? Instead of storing the salt in a separate field and reading it from there. (Which seems redundant since the salt seems to be included in the hash).
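Blowfish salts should be 22 characters long, not counting any trailing $ (you can double check with var_dump(CRYPT_SALT_LENGTH)). I can't verify this now, but my guess is that fewer characters will return an error and extra characters will be truncated. Note that the 22nd character only contributes 2 of its 6 bits, so it can come back changed in the output rather than being dropped, as in your 22-character test.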
Regarding your third question: yes, you should read and check the hash using the embedded salt (and cost) parameters from the hash itself.
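As an illustration of checking against the embedded parameters, here is a short sketch. It assumes the hash was created with password_hash(); the helper name is hypothetical, and the cost of 7 simply mirrors the examples in the question.

```php
<?php
// Verify a password using only the stored hash: crypt() reads the algorithm,
// cost, and salt back out of the hash string that is passed as its second argument.
function verify_with_embedded_salt(string $password, string $storedHash): bool
{
    $candidate = crypt($password, $storedHash);
    // hash_equals() compares the two strings in constant time.
    return hash_equals($storedHash, $candidate);
}

$stored = password_hash('rasmuslerdorf', PASSWORD_BCRYPT, ['cost' => 7]);

var_dump(verify_with_embedded_salt('rasmuslerdorf', $stored)); // bool(true)
var_dump(password_verify('rasmuslerdorf', $stored));           // bool(true)
var_dump(password_verify('wrong password', $stored));          // bool(false)
```

In practice password_verify() is the simpler choice; the crypt() variant is shown only to make it visible that no separately stored salt is needed.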
