I've been reading about password hash functions, and the advice is: use a salt to make the attacker's work harder, use a high-cost algorithm, etc.
I've found the password_hash function, but it returns something like this (the letters are placeholders to explain the layout):
AAAACCCSSS....SSSHHH.....HHHH
Example:
Where:
AAA is the algorithm
CCC is the cost
SSS....SSS is the salt
HHH.....HHHH is the hash we get
I assume the algorithm, cost and salt are there to make life harder for hackers, so that using them makes my hash safer.
But what happens if I use all of them and then tell the hackers which salt, algorithm and cost I'm using?
Doing that, I'm giving the hackers a lot of clues.
I mean,
can I store in my database all the string?
or
must I store the hash and the (algorithm, cost, salt) in different places?
If somebody gets my DB he has all the hash passwords.
The security does not come from the information being secret. It comes from the algorithm being very computationally expensive.
The attack here is to guess a plaintext which, when hashed with the given algorithm and salt and cost, will result in the same hash value. Even with all the information given (except the plaintext obviously) and assuming a strong (random) plaintext password, it takes many many years, possibly millennia, to find one such value. And that's just for one password hash, to say nothing of a whole database of hashes.
The protection is in using an algorithm costly enough to make guessing infeasibly slow, not in keeping details of the algorithm (which salt and cost are) secret.
The purpose of salt is to make sure that hashes are uniquely generated and cannot be looked up in an existing database such as MD5 Decrypt. Even if a salt is leaked the attacker would have to break each hashed password individually.
can I store in my database all the string?
Yes, you may store it in the database as a whole, just like WordPress does.
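A minimal sketch of what storing the whole string looks like in practice. The `$users` array is a hypothetical stand-in for a database table, and `registerUser()` / `checkLogin()` are illustrative names, not part of PHP:

```php
<?php
// Hypothetical in-memory "database": username => full password_hash() string.
$users = [];

// Store the complete string returned by password_hash(); it already
// contains the algorithm identifier, the cost and the salt.
function registerUser(array &$users, string $name, string $password): void
{
    $users[$name] = password_hash($password, PASSWORD_DEFAULT);
}

// password_verify() reads the algorithm, cost and salt back out of the stored string.
function checkLogin(array $users, string $name, string $password): bool
{
    return isset($users[$name]) && password_verify($password, $users[$name]);
}

registerUser($users, 'alice', 'correct horse battery staple');
var_dump(checkLogin($users, 'alice', 'correct horse battery staple')); // bool(true)
var_dump(checkLogin($users, 'alice', 'wrong guess'));                  // bool(false)
```

Nothing besides that one column is needed to verify a login later.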
If somebody gets my DB he has all the hash passwords.
Yes, but as I mentioned, the hashes are of no use unless the passwords are individually cracked by brute force, and that would take an insane amount of computation.
I'm having some trouble understanding the purpose of a salt to a password. It's my understanding that the primary use is to hamper a rainbow table attack. However, the methods I've seen to implement this don't seem to really make the problem harder.
I've seen many tutorials suggesting that the salt be used as the following:
$hash = md5($salt.$password)
The reasoning being that the hash now maps not to the original password, but to a combination of the password and the salt. But say $salt=foo and $password=bar and $hash=3858f62230ac3c915f300c664312c63f. Now somebody with a rainbow table could reverse the hash and come up with the input "foobar". They could then try all combinations of passwords (f, fo, foo, ... oobar, obar, bar, ar, r). It might take a few more milliseconds to get the password, but not much else.
The other use I've seen is on my linux system. In the /etc/shadow the hashed passwords are actually stored with the salt. For example, a salt of "foo" and password of "bar" would hash to this: $1$foo$te5SBM.7C25fFDu6bIRbX1. If a hacker somehow were able to get his hands on this file, I don't see what purpose the salt serves, since the reverse hash of te5SBM.7C25fFDu6bIRbX is known to contain "foo".
Thanks for any light anybody can shed on this.
EDIT: Thanks for the help. To summarize what I understand, the salt makes the hashed password more complex, thus making it much less likely to exist in a precomputed rainbow table. What I misunderstood before was that I was assuming a rainbow table existed for ALL hashes.
A public salt will not make dictionary attacks harder when cracking a single password. As you've pointed out, the attacker has access to both the hashed password and the salt, so when running the dictionary attack, she can simply use the known salt when attempting to crack the password.
A public salt does two things: makes it more time-consuming to crack a large list of passwords, and makes it infeasible to use a rainbow table.
To understand the first one, imagine a single password file that contains hundreds of usernames and passwords. Without a salt, I could compute "md5(attempt[0])", and then scan through the file to see if that hash shows up anywhere. If salts are present, then I have to compute "md5(salt[a] . attempt[0])", compare against entry A, then "md5(salt[b] . attempt[0])", compare against entry B, etc. Now I have n times as much work to do, where n is the number of usernames and passwords contained in the file.
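A toy sketch of that counting argument, assuming three accounts that all use the password "qwerty" and a three-word attempt list. The salts and variable names are made up for illustration, and md5 is used only because the example above uses it:

```php
<?php
// Toy password file: three accounts, all with the password "qwerty".
$salts = ['jX95psDZ', 'LPgB0sdg', 'dZVUABJt'];
$unsalted = [];
$salted = [];
foreach ($salts as $salt) {
    $unsalted[] = md5('qwerty');
    $salted[] = ['salt' => $salt, 'hash' => md5($salt . 'qwerty')];
}
$attempts = ['hello', 'foobar', 'qwerty'];

// Without salts: one md5() per attempt, then scan the whole file.
$unsaltedWork = 0;
$unsaltedCracked = 0;
foreach ($attempts as $attempt) {
    $candidate = md5($attempt);
    $unsaltedWork++;
    foreach ($unsalted as $hash) {
        if ($hash === $candidate) {
            $unsaltedCracked++;
        }
    }
}

// With salts: one md5() per attempt per entry.
$saltedWork = 0;
$saltedCracked = 0;
foreach ($attempts as $attempt) {
    foreach ($salted as $entry) {
        $saltedWork++;
        if (md5($entry['salt'] . $attempt) === $entry['hash']) {
            $saltedCracked++;
        }
    }
}

echo "unsalted: $unsaltedWork hashes, salted: $saltedWork hashes\n";
// prints "unsalted: 3 hashes, salted: 9 hashes"
```

Both loops crack all three accounts, but the salted file costs n times as many hash computations.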
To understand the second one, you have to understand what a rainbow table is. A rainbow table is a large list of pre-computed hashes for commonly-used passwords. Imagine again the password file without salts. All I have to do is go through each line of the file, pull out the hashed password, and look it up in the rainbow table. I never have to compute a single hash. If the look-up is considerably faster than the hash function (which it probably is), this will considerably speed up cracking the file.
But if the password file is salted, then the rainbow table would have to contain "salt . password" pre-hashed. If the salt is sufficiently random, this is very unlikely. I'll probably have things like "hello" and "foobar" and "qwerty" in my list of commonly-used, pre-hashed passwords (the rainbow table), but I'm not going to have things like "jX95psDZhello" or "LPgB0sdgxfoobar" or "dZVUABJtqwerty" pre-computed. That would make the rainbow table prohibitively large.
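As a rough illustration, here is a plain precomputed lookup table in the sense described above (real rainbow tables use hash chains to save space, which this sketch does not attempt). The word list and the salt are the examples from the text:

```php
<?php
// Precomputed table: md5 hash => password, built once for common passwords.
$common = ['hello', 'foobar', 'qwerty'];
$table = [];
foreach ($common as $password) {
    $table[md5($password)] = $password;
}

// An unsalted stored hash is found by a plain array lookup, no hashing needed.
$unsaltedHash = md5('foobar');
$fromUnsalted = $table[$unsaltedHash] ?? null;   // 'foobar'

// The same password stored with a salt is not in the table.
$saltedHash = md5('LPgB0sdgx' . 'foobar');
$fromSalted = $table[$saltedHash] ?? null;       // null

var_dump($fromUnsalted, $fromSalted);
```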
So, the salt reduces the attacker back to one-computation-per-row-per-attempt, which, when coupled with a sufficiently long, sufficiently random password, is (generally speaking) uncrackable.
The other answers don't seem to address your misunderstandings of the topic, so here goes:
Two different uses of salt
I've seen many tutorials suggesting that the salt be used as the following:
$hash = md5($salt.$password)
[...]
The other use I've seen is on my linux system. In the /etc/shadow the hashed passwords are actually stored with the salt.
You always have to store the salt with the password, because in order to validate what the user entered against your password database, you have to combine the input with the salt, hash it and compare it to the stored hash.
Security of the hash
Now somebody with a rainbow table could reverse the hash and come up with the input "foobar".
[...]
since the reverse hash of te5SBM.7C25fFDu6bIRbX is known to contain "foo".
It is not possible to reverse the hash as such (in theory, at least). The hash of "foo" and the hash of "saltfoo" have nothing in common. Changing even one bit in the input of a cryptographic hash function should completely change the output.
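A quick way to see this for yourself, using md5 only because it is the function from the question. The sketch counts how many of the 32 hex digits differ between the two digests:

```php
<?php
$plain  = md5('foo');
$salted = md5('saltfoo');

// Count the hex positions (out of 32) where the two digests differ.
$differing = 0;
for ($i = 0; $i < 32; $i++) {
    if ($plain[$i] !== $salted[$i]) {
        $differing++;
    }
}

echo "$plain\n$salted\n$differing of 32 hex digits differ\n";
```

Knowing that the input contains "foo" tells you nothing about where to look in the output.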
This means you cannot build a rainbow table with the common passwords and then later "update" it with some salt. You have to take the salt into account from the beginning.
This is the whole reason for why you need a rainbow table in the first place. Because you cannot get to the password from the hash, you precompute all the hashes of the most likely used passwords and then compare your hashes with their hashes.
Quality of the salt
But say $salt=foo
"foo" would be an extremely poor choice of salt. Normally you would use a random value, encoded in ASCII.
Also, each password has its own salt, different (hopefully) from all other salts on the system. This means that the attacker has to attack each password individually instead of hoping that one of the hashes matches one of the values in her database.
The attack
If a hacker somehow were able to get his hands on this file, I don't see what purpose the salt serves,
A rainbow table attack always needs /etc/passwd (or whatever password database is used), or else how would you compare the hashes in the rainbow table to the hashes of the actual passwords?
As for the purpose: let's say the attacker wants to build a rainbow table for 100,000 commonly used English words and typical passwords (think "secret"). Without salt she would have to precompute 100,000 hashes. Even with the traditional UNIX salt of 2 characters (each is one of 64 choices: [a–zA–Z0–9./]) she would have to compute and store 409,600,000 hashes... quite an improvement.
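The arithmetic behind that figure, as a small sketch (variable names are illustrative): 64 choices per salt character, two characters, times the dictionary size.

```php
<?php
$saltAlphabet   = 26 + 26 + 10 + 2;    // a-z, A-Z, 0-9, '.' and '/'
$possibleSalts  = $saltAlphabet ** 2;  // two-character salt: 4096 values
$dictionarySize = 100000;

// Every dictionary word has to be hashed once per possible salt.
$tableEntries = $possibleSalts * $dictionarySize;

echo number_format($tableEntries), "\n"; // 409,600,000
```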
The idea with the salt is to make it much harder to guess with brute-force than a normal character-based password. Rainbow tables are often built with a special character set in mind, and don't always include all possible combinations (though they can).
So a good salt value would be a random 128-bit or longer integer. This is what makes rainbow-table attacks fail. By using a different salt value for each stored password, you also ensure that a rainbow table built for one particular salt value (as could be the case if you're a popular system with a single salt value) does not give you access to all passwords at once.
Yet another great question, with many very thoughtful answers -- +1 to SO!
One small point that I haven't seen mentioned explicitly is that, by adding a random salt to each password, you're virtually guaranteeing that two users who happened to choose the same password will produce different hashes.
Why is this important?
Imagine the password database at a large software company in the northwest US. Suppose it contains 30,000 entries, of which 500 have the password bluescreen. Suppose further that a hacker manages to obtain this password, say by reading it in an email from the user to the IT department. If the passwords are unsalted, the hacker can find the hashed value in the database, then simply pattern-match it to gain access to the other 499 accounts.
Salting the passwords ensures that each of the 500 accounts has a unique (salt+password), generating a different hash for each of them, and thereby reducing the breach to a single account. And let's hope, against all probability, that any user naive enough to write a plaintext password in an email message doesn't have access to the undocumented API for the next OS.
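A small sketch of that scenario, assuming 500 accounts sharing the password and a 16-byte random salt per account. SHA-256 is used here only to keep the demonstration fast; it is not a recommendation for real password storage:

```php
<?php
$accounts = 500;
$password = 'bluescreen';

$unsalted = [];
$salted = [];
for ($i = 0; $i < $accounts; $i++) {
    // No salt: every account gets the identical hash.
    $unsalted[] = hash('sha256', $password);

    // Random per-account salt: every account gets its own hash.
    $salt = random_bytes(16);
    $salted[] = hash('sha256', $salt . $password);
}

$uniqueUnsalted = count(array_unique($unsalted)); // 1
$uniqueSalted   = count(array_unique($salted));   // 500, barring a salt collision

echo "unsalted: $uniqueUnsalted distinct, salted: $uniqueSalted distinct\n";
```

With the unsalted column, learning one account's password reveals the other 499 by simple comparison; with the salted column there is nothing to match on.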
I was searching for a good method to apply salts and found this excellent article with sample code:
http://crackstation.net/hashing-security.htm
The author recommends using random salts per user, so that gaining access to a salt won't render the entire list of hashes as easy to crack.
To Store a Password:
1) Generate a long random salt using a CSPRNG.
2) Prepend the salt to the password and hash it with a standard cryptographic hash function such as SHA256.
3) Save both the salt and the hash in the user's database record.
To Validate a Password:
1) Retrieve the user's salt and hash from the database.
2) Prepend the salt to the given password and hash it using the same hash function.
3) Compare the hash of the given password with the hash from the database. If they match, the password is correct. Otherwise, the password is incorrect.
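The steps above can be sketched in PHP roughly as follows. The function names and the array layout of the "record" are hypothetical, and the constant-time comparison with hash_equals() is my addition rather than something the quoted steps require. Note that this follows the quoted steps literally with a single SHA256 pass; other answers in this thread argue for a deliberately slow function such as the one behind password_hash().

```php
<?php
// Build the record to save for a new password: hex-encoded salt plus hash.
function createPasswordRecord(string $password): array
{
    $salt = random_bytes(16);                   // step 1: salt from a CSPRNG
    $hash = hash('sha256', $salt . $password);  // step 2: salt prepended, then hashed
    return ['salt' => bin2hex($salt), 'hash' => $hash]; // step 3: store both
}

// Check a login attempt against a stored record.
function validatePassword(string $password, array $record): bool
{
    $salt = hex2bin($record['salt']);               // step 1: retrieve the salt
    $candidate = hash('sha256', $salt . $password); // step 2: same salt, same function
    return hash_equals($record['hash'], $candidate); // step 3: compare in constant time
}

$record = createPasswordRecord('hunter2');
var_dump(validatePassword('hunter2', $record)); // bool(true)
var_dump(validatePassword('hunter3', $record)); // bool(false)
```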
The reason a salt can make a rainbow-table attack fail is that for n-bits of salt, the rainbow table has to be 2^n times larger than the table size without the salt.
Your example of using 'foo' as a salt could make the rainbow-table 16 million times larger.
Given Carl's example of a 128-bit salt, this makes the table 2^128 times larger - now that's big - or put another way, how long before someone has portable storage that big?
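The growth factors mentioned here are easy to check. A small sketch, assuming the salt 'foo' counts as 3 bytes, i.e. 24 bits:

```php
<?php
// 'foo' is 3 bytes, so it contributes 24 bits of salt.
$fooBits = strlen('foo') * 8;
$fooGrowth = 2 ** $fooBits;

echo number_format($fooGrowth), "\n"; // 16,777,216

// A 128-bit salt: the factor no longer fits in an integer, so PHP returns a float.
$bigGrowth = 2 ** 128;
printf("%.3e\n", $bigGrowth);
```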
Most methods of breaking hash-based encryption rely on brute-force attacks. A rainbow attack is essentially a more efficient dictionary attack: it is designed to use the low cost of digital storage to create a map from a substantial subset of possible passwords to their hashes, and so facilitate the reverse mapping. This sort of attack works because many passwords tend to be either fairly short or to use one of a few word-based patterns.
Such attacks are ineffective in the case where passwords contain many more characters and do not conform to common word based formats. A user with a strong password to start with won't be vulnerable to this style of attack. Unfortunately, many people do not pick good passwords. But there's a compromise, you can improve a user's password by adding random junk to it. So now, instead of "hunter2" their password could become effectively "hunter2908!fld2R75{R7/;508PEzoz^U430", which is a much stronger password. However, because you now have to store this additional password component this reduces the effectiveness of the stronger composite password. As it turns out, there's still a net benefit to such a scheme since now each password, even the weak ones, are no longer vulnerable to the same pre-computed hash / rainbow table. Instead, each password hash entry is vulnerable only to a unique hash table.
Say you have a site which has weak password strength requirements. If you use no password salt at all, your hashes are vulnerable to pre-computed hash tables; someone with access to your hashes would thus have access to the passwords of a large percentage of your users (however many used vulnerable passwords, which would be a substantial percentage). If you use a constant password salt, pre-computed hash tables are no longer valuable, so someone would have to spend the time to compute a custom hash table for that salt. They could do so incrementally, though, computing tables which cover ever greater permutations of the problem space. The most vulnerable passwords (e.g. simple word-based passwords, very short alphanumeric passwords) would be cracked in hours or days; less vulnerable passwords would be cracked after a few weeks or months. As time goes on, an attacker would gain access to passwords for an ever-growing percentage of your users. If you use a unique salt for every password, it would take days or months to gain access to each one of those vulnerable passwords.
As you can see, when you step up from no salt to a constant salt to a unique salt you impose a several orders of magnitude increase in effort to crack vulnerable passwords at each step. Without a salt the weakest of your users' passwords are trivially accessible, with a constant salt those weak passwords are accessible to a determined attacker, with a unique salt the cost of accessing passwords is raised so high that only the most determined attacker could gain access to a tiny subset of vulnerable passwords, and then only at great expense.
Which is precisely the situation to be in. You can never fully protect users from poor password choice, but you can raise the cost of compromising your users' passwords to a level that makes compromising even one user's password prohibitively expensive.
One purpose of salting is to defeat precomputed hash tables. If someone has a list of millions of pre-computed hashes, they aren't going to be able to look up $1$foo$te5SBM.7C25fFDu6bIRbX1 in their table even though they know the hash and the salt. They'll still have to brute force it.
Another purpose, as Carl S mentions, is to make brute-forcing a list of hashes more expensive (give them all different salts).
Both of these objectives are still accomplished even if the salts are public.
As far as I know, the salt is intended to make dictionary attacks harder.
It's a known fact that many people will use common words for passwords instead of seemingly random strings.
So, a hacker could use this to his advantage instead of using just brute force. He will not look for passwords like aaa, aab, aac... but instead use words and common passwords (like lord of the rings names! ;) )
So if my password is Legolas a hacker could try that and guess it with a "few" tries. However if we salt the password and it becomes fooLegolas the hash will be different, so the dictionary attack will be unsuccessful.
Hope that helps!
I assume that you are using PHP (md5() function and $-prefixed variables). In that case, you can try looking at this article, Shadow Password HOWTO, especially the 11th paragraph.
Also, if you are afraid of using message digest algorithms, you can try real cipher algorithms, such as the ones provided by the mcrypt module, or stronger message digest algorithms, such as the ones provided by the mhash module (sha1, sha256, and others).
I think that a stronger message digest algorithm is a must. It's known that MD5 and SHA1 have collision problems.
I am using password_hash for password encryption. However, there is something strange: password_hash takes a very long time. Here is some sample code.
This code takes more than 1 second. Is that normal?
<?php
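// Time one password_hash() call plus one password_verify() call at cost 13.
$startTime = microtime(true);
$password = '123456';
$cost = 13;
$hash = password_hash($password, PASSWORD_DEFAULT, ['cost' => $cost]);
password_verify($password, $hash);
$endTime = microtime(true);
$time = $endTime - $startTime; // elapsed wall-clock time in seconds
echo $time;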
?>
The result is: 1.0858609676361
After running on 3v4l that seems perfectly normal.
Password hashing is not something you want to optimize. In the words of Leigh on the hash documentation:
If you are hashing passwords etc for security, speed is not your friend. You should use the slowest method.
Slow to hash means slow to crack and will hopefully make generating things like rainbow tables more trouble than it's worth.
The default algorithm for password_hash, bcrypt, is designed to be slow.
http://en.wikipedia.org/wiki/Key_stretching
In cryptography, key stretching refers to techniques used to make a possibly weak key, typically a password or passphrase, more secure against a brute force attack by increasing the time it takes to test each possible key. Passwords or passphrases created by humans are often short or predictable enough to allow password cracking. Key stretching makes such attacks more difficult.
http://en.wikipedia.org/wiki/Rainbow_table#Defense_against_rainbow_tables
Another technique that helps prevent precomputation attacks is key stretching. When stretching is used, the salt, password, and a number of intermediate hash values are run through the underlying hash function multiple times to increase the computation time required to hash each password. For instance, MD5-Crypt uses a 1000 iteration loop that repeatedly feeds the salt, password, and current intermediate hash value back into the underlying MD5 hash function. The user's password hash is the concatenation of the salt value (which is not secret) and the final hash. The extra time is not noticeable to users because they have to wait only a fraction of a second each time they log in. On the other hand, stretching reduces the effectiveness of brute-force attacks in proportion to the number of iterations because it reduces the number of computations an attacker can perform in a given time frame. This principle is applied in MD5-Crypt and in bcrypt. It also greatly increases the time needed to build a precomputed table, but in the absence of salt, this needs only be done once.
A full second is probably a little long - you could experiment with dropping $cost by one or two to bring it more to something like a tenth of a second, which will retain the effective protection while making the delay unnoticeable to your users.
Yes, it's normal. That's what the cost parameter is for: it allows you to tweak the iteration count, making the hash slower or faster as needed.
You should always make the hash as slow as possible and as fast as necessary. The reason is that the only feasible attack on a password hash is brute force. You want to make the cost so large that it takes prohibitively long to simply brute-force all possible values. That's your only real defence against attackers with password hashing to begin with.
One whole second seems prohibitively long for your own use, though. You should lower the cost a bit to stay within a few hundred milliseconds at most. Adjust to your target systems as needed.
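One way to pick a cost for a given server is to measure it. The following is a rough sketch, not a definitive tool: `findCost()` is a hypothetical helper, and the 0.1 second target and the cost range of 8 to 15 are assumptions you would adjust to your own hardware.

```php
<?php
// Return the highest bcrypt cost in [$minCost, $maxCost] whose hashing time
// stays within $targetSeconds on this machine. If even $minCost is too slow,
// $minCost is returned.
function findCost(float $targetSeconds, int $minCost = 8, int $maxCost = 15): int
{
    $chosen = $minCost;
    for ($cost = $minCost; $cost <= $maxCost; $cost++) {
        $start = microtime(true);
        password_hash('benchmark-password', PASSWORD_BCRYPT, ['cost' => $cost]);
        $elapsed = microtime(true) - $start;

        if ($elapsed > $targetSeconds) {
            break; // this cost is already too slow; keep the previous one
        }
        $chosen = $cost;
    }
    return $chosen;
}

echo 'Suggested cost: ', findCost(0.1), "\n";
```

Run it on the machine that will actually serve logins, since the result depends entirely on that hardware.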
To begin, password_hash is not encryption.
password_hash() creates a new password hash using a strong one-way hashing algorithm. password_hash() is compatible with crypt(). Therefore, password hashes created by crypt() can be used with password_hash().
A hash is one-way, and whatever you pass into it will always have the same end result; however, there is no way for you to get the original string from the hash. This is ideal for passwords because you want to store an obfuscated version of the user's password that you can easily compare at login without actually storing what the password is. This means if the database is compromised, so long as the passwords were hashed, the attacker wouldn't get the passwords, they would get the hashed passwords which are essentially useless (you can use rainbow tables and I'm sure other techniques to get the resulting hashes, but it takes a decent amount of effort).
This leads into your original question. Why are password hashes slow? They are slow because one of the only ways to get the original string from a hash is to re-generate that hash. So if it takes 1 second to generate each hash, it becomes a bigger time sink than it would have been had you used a fast hash such as md5 or a version of sha. Fast hashes are great for pretty much everything except for password storage.
Hopefully this answers your question. Just as an aside, I would strongly recommend generating a unique salt for each user and passing that in as one of the options into password_hash. This salt can be stored as plain-text in the database alongside the hashed password. Using a different salt for each password will add that into the password so a would-be attacker would have to generate a rainbow table for every salt that's in the database. At this point the attacker would likely utilize other techniques to get the passwords instead of a database breach.
I am trying to understand password_hash fully in order to be able to explain it to an auditor.
Based on my searching for an answer, I understand that the password_hash() function is a wrapper for crypt(). While reading the PHP manual for predefined Constants I see that it uses PASSWORD_BCRYPT as the default integer value (basically it uses the CRYPT_BLOWFISH algorithm to hash a password).
What's confusing me is that the $options variable, if omitted, generates a random salt and the cost will be set to 10. If I supply a higher cost (for example: 12), will it still generate a random salt since I am not supplying a salt value? The reason why I am confused here is because I am not omitting the $options but instead supplying a different cost.
My other questions:
Why does increasing the cost value increase security?
How, since password_hash() is a one way hashing function, does password_verify() validate the password since the salt is random?
Is CRYPT_SHA512 stronger than CRYPT_BLOWFISH for hashing?
I find this article incredibly useful to understand how to correctly hash passwords. It explains how hashes can be cracked with various techniques if the hashes are weak, and how to hash passwords correctly to provide sufficient security.
If I supply a higher cost (say 12), will it still generate a random salt since I am not supplying a salt value
Yes, it will. As the documentation says, if salt is omitted, a random salt will be generated by password_hash() for each password hashed (this means that if you omit the salt value from your options array, password_hash() generates it by default). Moreover, the salt option has been deprecated since PHP 7.0.
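You can check this directly. A short sketch that supplies only a cost of 12 and hashes the same (example) password twice:

```php
<?php
$options = ['cost' => 12];   // only the cost is supplied, no 'salt' key

$first  = password_hash('secret', PASSWORD_BCRYPT, $options);
$second = password_hash('secret', PASSWORD_BCRYPT, $options);

// Different random salts give different strings for the same password.
var_dump($first === $second);                            // bool(false)

// The requested cost is recorded inside the hash string.
var_dump(password_get_info($first)['options']['cost']);  // int(12)

// Both still verify, because the salt is read back from each string.
var_dump(password_verify('secret', $first));             // bool(true)
var_dump(password_verify('secret', $second));            // bool(true)
```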
why increases to the cost value increase security?
This is also explained in the above article in section Making Password Cracking Harder: Slow Hash Functions. The higher the cost is set to, the slower is the hash function. The idea is to make the hash function very slow, so that even with a fast GPU or custom hardware, dictionary and brute-force attacks are too slow to be worthwhile. The cost should be however set to reasonable value (based on the specs of your server), so that it doesn't cause significant time delays when verifying users' passwords.
More, is CRYPT_SHA512 stronger than CRYPT_BLOWFISH for hashing?
Read this post about their comparison.
password_hash() is basically a wrapper around crypt(). It returns a string that contains the salt, the cost and the hash all in one. It is a one-way algorithm, in that you don't decrypt it to validate it; you simply pass the original string in with your password, and if it generates the same hash for the provided password, you're authenticated.
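To make the "all in one" part concrete, here is a sketch that splits a bcrypt string into its fields. The offsets assume the usual 60-character bcrypt layout produced by PASSWORD_BCRYPT; the `$parts` array is just for illustration, since password_verify() does this parsing for you.

```php
<?php
$hash = password_hash('secret', PASSWORD_BCRYPT, ['cost' => 10]);

// A bcrypt string is 60 characters: a "$2y$" prefix, a two-digit cost,
// a "$" separator, 22 characters of salt, then 31 characters of hash.
$parts = [
    'algorithm' => substr($hash, 0, 4),
    'cost'      => substr($hash, 4, 2),
    'salt'      => substr($hash, 7, 22),
    'hash'      => substr($hash, 29, 31),
];
print_r($parts);

// PHP can report the algorithm and cost itself.
print_r(password_get_info($hash));

// Verification only needs the candidate password and the stored string.
var_dump(password_verify('secret', $hash)); // bool(true)
```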
It's best to omit the salt and let it generate one for you. If you use only one salt, it makes it easier to break all your passwords instead of just that one. Salts can be generated regardless of the cost.
Cost (an exponential value) refers to how much effort goes into generating the hash (where higher = more computing power to generate a hash). Don't set it too high or you will bog your login scripts down.
Generally speaking:
You always should apply a salt when hashing passwords, to have a different hash even if you have the same password. This increases security by "preventing" people from using rainbow tables to crack the password.
But bcrypt handles the salting on its own!
Back to your original question:
The cost is used to make it "costly" to crack the password with a dictionary/brute force attack.
Bcrypt basically hashes the password over and over, which makes it time-consuming (= costly) to obtain the password for a given hash. If you try to find a password for a hash (brute-force attack), you have to calculate billions of password hashes. When each hash takes 2^cost rounds of work (the cost is an exponent, so each increment doubles the time), a brute-force attack stops being feasible, even if a single plain hash of a candidate password could be calculated in a fraction of a millisecond.
In simple terms:
If you have a password hash for SHA-1 (insecure, don't use it!) with the salt (as this is usually contained in the hash) and you want to hack it, then you have to hash all possible passwords + the salt, and when you find the combination with the same hash, you have found a possible password for this hash.
Let's say a plain salted hash of one candidate password takes some small amount of time. With the blowfish approach, each candidate costs 2^cost iterations instead, so cost=10 means 1024 iterations, and cost=11 twice that again.
For a single password, this is no big deal. So a directed attack on a single hash is still feasible, but usually people obtain large lists of user and password combinations and want to get the passwords for all of them quickly. Then this is much less lucrative for the bad guy, as he needs many times the CPU power to calculate all that stuff.
I understand that bcrypt is more secure than other methods, but it still puts you in the same situation where you need to salt passwords!
If the salt is included in the hash string, it doesn't need to be stored separately in the DB. Every time I need to create a new hash, meaning a new salt as well, do I have to get all the passwords, extract the salts and check that the new one doesn't already exist among my DB passwords?
Wouldn't it be easier to store the salts separately for easy comparison? If yes, then I don't get:
the point of storing the salt in plain text
why bcrypt is more secure than manually using sha256 with salted passwords
I'm actually going to disagree with Curtis Mattoon's answer on a couple of things.
1) When you hash using bcrypt, the salt is stored directly inside the hash, so you don't need to store it separately. I'm not sure what he means by not having to store it at all, because the hash without the salt is completely useless. The salt is needed to verify the password against the hash.
2) I agree on this point. If you are updating one password, you don't need to update them all. In fact, it would be impossible because you (hopefully) don't know the passwords for any other users.
3) You don't need to go through pains to get a unique salt. If that were the case, you could use uniqid, but the problem with that is that its output is predictable. Predictability is a bad thing in cryptography. Instead, what you want to do is use a pseudo-random salt as close to random as possible (i.e. using /dev/random instead of /dev/urandom). If you have a billion users, you may get one or two that have exactly the same salt, but seriously, is this such a big problem? All it does is double someone's chance of brute-forcing the password for those two particular passwords out of a billion, and I doubt the chance of a collision occurring is even that high. Don't strain yourself over this. Make the salts random, not unique. Using things like last login time or IP address is only going to take away from randomness.
As for a comparison between SHA512 and Blowfish, see here SHA512 vs. Blowfish and Bcrypt
This site seems to do a decent job at a brief explanation: http://michaelwright.me/php-password-storage
Quick answer:
1) You don't need to store the salt.
2) You don't need to update all the hashes, if you use a unique salt for each password.
3) I'm no crypto expert, but when you're using a unique salt for each user/password, an attacker would have to use a different set of rainbow tables for EACH user. Using the same salt value across the site means that every user's password would be susceptible to the same hash tables. In the past (for better or worse), I've used a function of the user's last login time and/or last IP as their password's salt.
e.g. (pseudocode) $password = hash(hash($_POST['password']) . hash($row['last_login']));
4) I'll defer the "Why is bcrypt better?" question to someone more knowledgeable about such things. This answer may help: How do you use bcrypt for hashing passwords in PHP?