Why are emails not commonly hashed when stored in db? - php

With almost weekly news about databases being pilfered I am wondering why only passwords are hashed and not emails too? To be clear, I mean hashed with a static salt, which is stored somewhere other than the database.
Obviously, it's just one step among many. But as part of a multi-faceted security setup (i.e. PDO, not rolling your own hasher, rate limiting, etc.) why is it not more common to hash the email? For logins (and password reminder emails, etc.) you could simply hash the submitted address and do a regular compare. Surely user emails should be treated more respectfully?
I have read a number of similar questions on SO and its sister sites, but I remain unconvinced that this is not an idea that should be adopted more often.

Because you usually need to be able to read the email address at a later date. Not just verify its value.
Passwords are not used for anything but validation, so you don't need to know the actual value so long as you have a way to validate it. Comparing hashes allows you to do that.
Email addresses are actually used for something, like sending emails. You can't do that unless you can actually read the email address.
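The keyed ("static salt") hash described in the question can be sketched roughly as follows. This is only an illustration: the helper name `emailLookupHash` and the placeholder key are assumptions, and the key is assumed to be loaded from somewhere outside the database. It shows that such a hash supports comparison at login but gives you nothing to send mail to.

```php
<?php
// Hypothetical helper: keyed hash of a normalised email address.
// $key is assumed to live outside the database (config file, environment).
function emailLookupHash(string $email, string $key): string
{
    $normalised = strtolower(trim($email));
    return hash_hmac('sha256', $normalised, $key);
}

$key = 'example-key-kept-outside-the-db'; // placeholder, not a real secret

$stored  = emailLookupHash('John.Doe@example.com', $key);   // saved at signup
$attempt = emailLookupHash(' john.doe@example.com ', $key); // computed at login

// Comparison works without knowing the address...
var_dump(hash_equals($stored, $attempt)); // bool(true)

// ...but $stored is just a 64-character hex string; there is no function
// that turns it back into an address you could send mail to.
var_dump(strlen($stored)); // int(64)
```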

Related

Storing email addresses anonymously

I run a service where users can log in, but I will never have a need to send an email to them. I try to keep user data as anonymous as possible. I'm not interested in user tracking, selling data, etc. I know there will be simpler solutions to this question, such as "don't use email addresses in the first place", but they make a good login identifier because they are GUIDs. My service goes through the process of having the user verify the address, and that's the only email I'll ever send.
So I had the idea of storing the addresses anonymously. My first thought was to simply store the SHA512 hash of each address, but in the event of a breach - which I believe my security would prevent - technically somebody could use rainbow tables to recover at least some of the addresses.
To use a salted hash, I need some way to narrow down the potential result list so I don't compute hashes for every user for every login. That won't scale. To achieve that, my idea was to store the first 5 characters of the SHA512 of the email. That wouldn't be a unique value of course, but it gives me a smaller pool of potential matches. Technically, this all works great.
My concern though is this is still vulnerable to rainbow tables. Those 5 characters are enough to look up possible inputs, and the attacker would already know that only inputs that look like email addresses would be valid. They'd still have enough to determine the email address given the first part of an unsalted hash and entire salted hash.
Am I overthinking this though? For the record, I'm using pgsql and php in this case, but that's really an implementation detail.
Update: I'm still not sure if I'm going to go ahead with this, but for anybody curious, the problem with rainbow tables here can be solved rather easily. Rather than hashing the whole email and taking the first few characters of the hash, use the first few characters of the email as the hash input and store the whole hash. It achieves the same effect, but at best the rainbow table will only reveal the first few characters.
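A rough sketch of the scheme from the update, under some assumptions: the helper names (`bucketKey`, `register`, `isRegistered`) are hypothetical, an in-memory array stands in for the pgsql table, and `password_hash()` is used here as a stand-in for the salted hash (the question mentions salted SHA512). The bucket key is derived only from the first few characters of the address, so reversing it reveals at most that prefix.

```php
<?php
// Bucket key: unsalted hash of only the first few characters of the address.
function bucketKey(string $email, int $prefixLength = 5): string
{
    $prefix = substr(strtolower(trim($email)), 0, $prefixLength);
    return hash('sha512', $prefix);
}

// In-memory stand-in for the users table: bucket key => list of salted hashes.
$table = [];

function register(array &$table, string $email): void
{
    // Salted hash of the full address (password_hash generates the salt).
    $salted = password_hash(strtolower(trim($email)), PASSWORD_DEFAULT);
    $table[bucketKey($email)][] = $salted;
}

function isRegistered(array $table, string $email): bool
{
    $candidate = strtolower(trim($email));
    // Only the hashes in the matching bucket are verified, not every user.
    foreach ($table[bucketKey($email)] ?? [] as $salted) {
        if (password_verify($candidate, $salted)) {
            return true;
        }
    }
    return false;
}

// Both addresses start with "alice", so they share one bucket.
register($table, 'alice@example.com');
register($table, 'alice@example.org');

var_dump(count($table));                             // int(1)
var_dump(isRegistered($table, 'alice@example.com')); // bool(true)
var_dump(isRegistered($table, 'alice@example.net')); // bool(false)
```

The prefix length is a trade-off: a longer prefix means smaller buckets and faster logins, but more of each address is exposed if the bucket keys are reversed.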
I think yes, you are overthinking it.
No matter how strong your structure is, there is always a small chance of a breach, as nobody is perfect and neither is any human-made script.
I think you should go for the best option you think it is and then stick to it.
Some things are best left to fate.
Good Luck
I think you're overthinking this. You stated that you don't need to email the users down the road, so my question back to you is why do you need to store the email at all? You mention that it's a good GUID, but if you're that concerned about data security, would it not be easier to let users define a username upon email verification?
Basically, I picture an ephemeral usage of the email, where it's never stored in the database, and only used to send a validation email. This would allow you to send a custom one-time-use link to the email, which would allow your user the chance to create a custom login name, which you could validate against your database to make sure it is unique.
You could then safely store this unique identifier without the concern that it would lead to email insecurity.
All of that said, I don't think any of it is necessary. As you said, email is an excellent GUID. What makes it an excellent GUID is that it is so widely known and available. The risks associated with the release of a plaintext email are far fewer and less damaging than the risks of a plaintext password. I believe our time as developers is better left securing the private data, and not the public data.

Obfuscate email in a way it is not recoverable but still usable

Probably a stupid question, but I need to ask anyway.
I'm working on a research project which involves sending fake phishing emails to participants.
At the beginning, I would have a database of email addresses.
Because of ethical considerations, I would like somehow to hash the email addresses in a way, that later they would not be recoverable even if I want to.
For example:
I want to send an email to
john.doe@mail.com
The email would lead to a page where I would collect some data (when it was visited, what he did on the page), so basically I would store the email address and its actions in a database.
I could store the hash of the email address in this database, so in the end I wouldn't have his address, but the problem is at a later stage I will need to email him a second time, and record those actions as well...
Now the problem is:
If I hash his email address and store it this way in the database, a simple re-hash of the original database would reveal the recipient.
If I hash his email with a random salt, I could not link his old and new actions together.
I need to be able to tell honestly that there is no way I can link real email addresses and real people to the database entries. (I just need the results anyway)
No, you can't keep the email address usable and also make it impossible to recover it. If you need to be able to decode/recover the email address to send an email at a later date, then there's no way to make it unrecoverable. That's a contradiction in terms. You would need to do something like use a third party to create per-user tokens, but then the third party would need to store the token and the email. There's no avoiding it: someone has to store the email.
The best solution is just to encrypt any sensitive data, including personally identifiable information (PII). If you want to be hyper-paranoid about it, you could throw away the key at the end of your project. But you have to keep it in the meantime, if you really need to be able to use the encrypted information (like the email address).
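As a minimal sketch of the "encrypt it, then throw away the key" idea, assuming PHP's OpenSSL extension is available: the function names `encryptEmail` and `decryptEmail` are hypothetical, and key storage (which is the hard part in practice) is not shown.

```php
<?php
// Sketch only: encrypt an address with AES-256-GCM via the OpenSSL extension.
function encryptEmail(string $email, string $key): string
{
    $iv  = random_bytes(12); // 96-bit nonce, the usual size for GCM
    $tag = '';
    $ciphertext = openssl_encrypt($email, 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $iv, $tag);
    if ($ciphertext === false) {
        throw new RuntimeException('Encryption failed');
    }
    // Store nonce, 16-byte auth tag and ciphertext together in one column.
    return base64_encode($iv . $tag . $ciphertext);
}

function decryptEmail(string $stored, string $key): ?string
{
    $raw = base64_decode($stored, true);
    if ($raw === false || strlen($raw) < 29) {
        return null; // too short to hold nonce + tag + at least one byte
    }
    $iv         = substr($raw, 0, 12);
    $tag        = substr($raw, 12, 16);
    $ciphertext = substr($raw, 28);
    $plain = openssl_decrypt($ciphertext, 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $iv, $tag);
    return $plain === false ? null : $plain;
}

$key    = random_bytes(32); // kept for the duration of the project
$stored = encryptEmail('john.doe@mail.com', $key);

var_dump(decryptEmail($stored, $key)); // string(17) "john.doe@mail.com"

// Once the real key is destroyed, any other key fails authentication.
$otherKey = random_bytes(32);
var_dump(decryptEmail($stored, $otherKey)); // NULL
```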
Also, be aware that what you are doing may have legal implications (both the sending of bogus phishing emails and the storage of PII). You should speak to a lawyer in whatever jurisdiction(s) is/are relevant.

Prevent user from creating multiple accounts with one e-mail address?

Let's say a user is making an account on a website. The e-mail address provided by the user is saved in MySQL, but is hashed before saving. That way a possible hacker is not going to see the e-mail addresses. But on the other hand, for me or you ("the programmer") there is no way to see if a user is trying to create an account with the same e-mail address (which I really want to prevent).
Question: In general what is your advice to cope with this problem? Any advice or solutions are appreciated?
Question: Would an account be more "secure" when hashing the e-mail address?
P.S. FYI, this application uses PHP as server language.
UPDATE:
I use BCRYPT with PHP built in salt.
I use mysqli.
Solution #1 - MySQL approach
Add a unique index on the email column. This will prevent any additional rows with an identical email field from being added: the duplicate INSERT is rejected with a duplicate-key error, which you can catch and handle.
Assuming your table is users and you store emails in email_hashed:
ALTER TABLE users
ADD UNIQUE (email_hashed)
If you already have duplicates, clean them up first, otherwise the index cannot be created.
Solution #2 - PHP approach
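Simply hash the email and SELECT all rows with that hash from the database. Like this:
$email = 'ex@example.com';
$hashed = someHashing($email);
// Compare against the hash, not the raw email; $mysqli is your existing mysqli connection
$stmt = $mysqli->prepare('SELECT id FROM users WHERE email_hashed = ?');
$stmt->bind_param('s', $hashed);
$stmt->execute();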
..
If any row is fetched, you can display a message, raise an error, or do anything else you like.
I recommend using both solutions.
EDIT - Regarding... BCrypt...
So yeah, you are using BCrypt. There are two ways to go if you want to hash emails (no idea why, but whatever!): one that everyone will laugh at you for, and a better one.
The first (laughable) one is to:
SELECT from database entire table with every possible existing hash of emails
Run foreach() {} loop through every hash from database
In every loop compare hashes using password_verify()
If any compare returns true then run some code of your own
The second one is easy:
CHANGE hashing to either md5 (using md5('text') function) or sha256 for longer hashes (using hash('sha256','text'))
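The reason BCrypt forces the loop above can be shown in a few lines. This is just an illustrative sketch: `password_hash()` salts every call differently, so equality lookups and UNIQUE indexes cannot match two hashes of the same address, whereas a plain digest such as SHA-256 is deterministic.

```php
<?php
$email = 'user@example.com';

// password_hash() picks a fresh random salt on every call, so two hashes of
// the same address differ and cannot be matched with "=" or a UNIQUE index.
$bcryptA = password_hash($email, PASSWORD_BCRYPT);
$bcryptB = password_hash($email, PASSWORD_BCRYPT);
var_dump($bcryptA === $bcryptB);             // bool(false)

// Each stored hash has to be checked individually with password_verify().
var_dump(password_verify($email, $bcryptA)); // bool(true)
var_dump(password_verify($email, $bcryptB)); // bool(true)

// hash('sha256', ...) is deterministic: the same input always yields the
// same 64-character hex string, so a WHERE clause or unique index works.
$shaA = hash('sha256', $email);
$shaB = hash('sha256', $email);
var_dump($shaA === $shaB);                   // bool(true)
```

The price of determinism is that an unsalted digest of an email address can be checked against lists of known addresses, which is the rainbow-table concern raised elsewhere on this page.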
Another edit
Question: Would an account be more "secure" when hashing the e-mail address?
I don't think it's a question to raise on StackOverflow, but since it's "a bonus" I will put some thoughts here.
I am not a security expert though, so it's possible that I am missing something.
Anyway, the basics are hashing passwords with BCrypt and making sure that nothing on the account can be edited in any way that doesn't require the password (for example through a flawed API or a compromised admin dashboard). I think you should also protect sensitive data (like names, addresses, phone numbers, etc.) from public access.
Hashing emails has only one purpose I can think of: in case of a successful hack, someone who dumped your whole database won't get a single email address. That is nice. But it also prevents you from sending newsletters, account expiration notices and other important emails.
In 90% of sites I'd say "hashing emails, are you insane?", but if you don't need to reuse email at all (you won't ever send any email except registration one) and want user emails to be pretty safe, then yes, hashing can prove useful. But please, no BCrypt :P
As S.L Barth states, you can hash the email address as given, perhaps with an ajax request once the field loses focus, and then check if that hash exists in the database. If the number of rows returned is > 0, JavaScript can output a message saying this account is already registered.
Creating a unique index on the table would also work, but it would not report a problem until the data was actually written to the database, which will probably be too late and means a needless round trip for the end user.
Update
If your email address is hashed with a salt and you can't confirm it against the same email address added again, what is the point of storing the email address if it cannot be recovered? Revise your method. Use PHP functions like password_hash() and password_verify().
When you hash the email to insert it into the database, check whether that hashed email already exists, just like you check for it each time a user logs in.
Accounts would be more secure if you hashed everything, but then it would be hard to recover or reset a lost username/password because you wouldn't know which email to send the reset information to.
On the onblur event, hash the email entered by the user and check whether it is present in the MySQL email column. If it is, the query returns a row, so disable the button; otherwise allow the user to submit.
Use: SELECT * FROM user WHERE email = '<hash of the entered email>'
if (result > 0)
disable button
else
enable button

Is there any way to send someone a password that they can use but cannot see?

Just trying to hack together a simple script, and I had a little question about passwords.
Is there any way I can send someone a random password that they cannot see themselves but can use to, say, change their Facebook password in order to block themselves from logging in? I will then send them the visible password at a specified time later on.
This is for purely educational purposes, as I'm just building little apps here and there to learn php and mysql.
Example: Friend wants to get off facebook for 3 hours. He uses web app and I email him a randomly generated password for him to change his current FB password to. However, on the email it is hidden to him. After 3 hours, he gets another email allowing him to use it.
I understand there might be some easier ways / clearer methods of achieving my end goal, but I am just curious about this itself!
Thanks so much
If you are sending an e-mail that contains "something" the user can log in with, you are sending them a password, and if it's clickable/copyable from the e-mail, they will see it. Whether the password is plain text that directly matches a stored value (like "thi$ismyPa$$word") or some encrypted value that is decrypted on input to match a stored value is irrelevant: either way the user knows what that value is (because they have to enter it). In order for the user to provide a value, they have to have the value. As others have mentioned, you could implement a one-time-use password in your own application, but that wouldn't work for Facebook because it's not your app and you can't control its functionality. The short answer: if you provide something to the user (like an e-mail) that is used to access a system, then they can see the value(s) necessary to log in.
The traditional approach is to calculate a hash sum of some kind from the password, and send that. There are one-way algorithms, like MD5, that can do this. That way, conformance to the password can be assured without having to send the password itself or anything from which the password can be inferred.
For even greater security, both a hash and a checksum can be sent: that way the hash itself cannot be intercepted and sent as a proxy for the password: authentication will not happen unless both the hash and the checksum values agree.

Activation on site

I have been running my website for a few months now and occasionally I find my activation isn't great. After the user signs up, they receive an email with an activation link.
I have a few problems and want to improve this if possible.
Firstly, the email sometimes doesn't arrive. Any reason for this?
How can I stop it going into the junk mail?
Secondly, at the moment, the activation is their username and an md5 of their username.
Is there a better way to do activations?
I'm always looking to improve and find better ways of doing things!
Thanks for your time.
Email doesn't arrive
First of all, you cannot really rely on mail. Never. You can't even know if it was received or read. A mail may be blocked as spam on the server side, filtered on the client side, or just be lost or ignored.
There may be plenty of causes. For example, you may use e-mail authentication mechanisms. You may also start to check if there is reverse DNS for your domain.
Further, you may want to read some documentation and books to know how spam filters work. It will show you some obvious methods to reduce filtering of your mails, like sending mails in plain text instead of full-HTML, but also less obvious stuff like the words to use, etc.
If you have no choice and you must send mail, probably the easiest way to prevent spam filtering would be to ask the users to add your domain to their list of safe senders. In practice, nobody will do it for you.
Activation through MD5
There is obviously a better way, since the one you implemented does not provide anything. If the activation code is a hash of the user name, you may as well just tell the users to calculate the hash themselves (thus avoiding all the problems with mails filtered as spam).
Normally, users must not be able to work out what their activation code will be. That means the activation code must be random or difficult to guess.
Generate a set of random characters, save them to database and send the code by mail. Then you would just need to validate the code against the one you keep in your database.
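A minimal sketch of that flow, with some assumptions: the function names are hypothetical and an in-memory array stands in for the database table. The code comes from `random_bytes()` and is compared with `hash_equals()` so the check takes the same time whether or not the prefix matches.

```php
<?php
// Hypothetical in-memory stand-in for the users table: username => code.
$pendingActivations = [];

function createActivationCode(array &$pending, string $username): string
{
    $code = bin2hex(random_bytes(16)); // 32 hex characters from a CSPRNG
    $pending[$username] = $code;       // "save them to database"
    return $code;                      // this value goes into the emailed link
}

function activate(array &$pending, string $username, string $submitted): bool
{
    if (!isset($pending[$username])) {
        return false; // unknown user or already activated
    }
    if (!hash_equals($pending[$username], $submitted)) {
        return false; // wrong code
    }
    unset($pending[$username]); // one-time use
    return true;
}

$code = createActivationCode($pendingActivations, 'alice');

var_dump(activate($pendingActivations, 'alice', md5('alice'))); // bool(false)
var_dump(activate($pendingActivations, 'alice', $code));        // bool(true)
var_dump(activate($pendingActivations, 'alice', $code));        // bool(false)
```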
Some emails will always end up in the trash folder. It's probably best to put up a notice so that people know to check there, and make it possible for the user to re-request the activation email.
Using the MD5 hash of the username is not a very good idea because anyone can automate that. At the very least add some salt before hashing it, or even better, use a completely unrelated random token saved in your database.
For your second question, you may want to generate a random activation code and store it in a database. When the user clicks the activation link you could verify the code in the database using their e-mail address. This way a malicious user will have a more difficult time automating registration on your site.
$code = md5(uniqid(rand(), true));
If you're on a shared server, services like Yahoo are apt to label you spam. They want you to have a dedicated IP. It's almost impossible to get users to check the 1000 messages in their spam folders for your one activation message.
The MD5 hash is fine if you're hashing with a timestamp.
Keep this implementation, but supplement it with OpenID. That will take care of your Gmail and Yahoo users.
Yes, that's wrong. You shouldn't use MD5 for that.
The most popular way of doing it is generating a random code, saving it in the users table in the DB, and sending it by email as a GET parameter of the link.
About the emails, I would tell users to look in their junk folders.
First problem: Make sure your mail isn't spammy. Follow the default guidelines for setting up mail... things like making sure you've got your SPF records configured, your mail is well-formatted, doesn't include spammy words. I generally test against Gmail, Hotmail and a server running SpamAssassin to check mails I send out; examine the headers to see if you're triggering any serious anti-spam rules.
Second problem: You'll want to make sure that the user cannot guess what his activation key is (thus removing the need for receiving the email). An MD5 of the username is insufficient for this. However, if you salt the MD5 you can easily prevent people from generating the MD5's in an automated way (that's an open invitation for automated signups). Adding Salt refers to adding a large amount of pregenerated random data to your input before hashing it. That way, the attacker can't lookup the hash in a 'rainbow table', as he no longer knows what the input for your hash was. Of course, you could just as well use a randomly generated string, which would probably be easier.
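The "salted" variant can be sketched as a keyed hash of the username. This is an illustration under assumptions, not the answer's exact method: it uses `hash_hmac()` with SHA-256 rather than MD5 with a prepended salt, the function names are hypothetical, and the secret is assumed to be stored server-side and never sent to users.

```php
<?php
// Keyed hash of the username: without $secret, the key cannot be reproduced.
function activationKey(string $username, string $secret): string
{
    return hash_hmac('sha256', $username, $secret);
}

function isValidActivationKey(string $username, string $submitted, string $secret): bool
{
    // Recompute the expected key and compare in constant time.
    return hash_equals(activationKey($username, $secret), $submitted);
}

// Assumed to be generated once and kept in server-side configuration.
$secret = bin2hex(random_bytes(32));

$key = activationKey('alice', $secret); // goes into the emailed link

var_dump(isValidActivationKey('alice', $key, $secret));         // bool(true)
var_dump(isValidActivationKey('alice', md5('alice'), $secret)); // bool(false)
var_dump(isValidActivationKey('bob', $key, $secret));           // bool(false)
```

Nothing needs to be stored per user with this approach, but every user's key depends on one secret, which is why a per-user random string saved in the database is often the simpler choice.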
Another take on user registration: take inspiration from Stack Overflow and use OpenID, and you don't have to care about user registration.
Update
You don't need to validate an OpenID user via email. A user who signed up via a Google or MyOpenID account is valid.
You don't have to worry about whether the user is a bot; those servers have already dealt with that.
I have never received a verification email from Stack Overflow.
Mail arriving in the junk folder is a perpetual problem. The range of 'not looking like spam' strategies is wide. Beyond the junk folder, I think that the overwhelming majority of reported 'not received' situations are actually just delays in delivering the email.
I'm currently implementing a resend for the activation email confirmation despite the fact that it should only actually be necessary in cases where the user has accidentally deleted the email and purged their trash or a transient error has discarded the mail. These cases are going to be rare but do exist so needed to be coded for.
I think the most important reason for implementing the resend of the activation confirm is customer service. It provides the user with an action that they can take while waiting for their mail and in the course of doing so and re-checking their email the activation email will eventually appear.
I wouldn't use the MD5 as it creates too predictable a result. You want something that has a random or at least less predictable element. It is problematic if resending a mail invalidates the token in the original email, so I would avoid overwriting the existing token and would instead re-use the same token, which you should have stored (or, better, you have stored the values from which it can be validated). This does constrain how you create the token, as you want to be able to recreate it in later resend mails, or at least to keep validating all the in-flight mails. I am using a session aging model to resend the same token if that token is still valid. There is no reason why the user shouldn't see it as the same token and hence understand that all the mails are valid. In the case of an expired session/token, a new one needs to be generated.
It's good practice to expire the activation mail token in case the mailbox falls into the wrong hands weeks or months later and the old mail is found, assuming this could have some undesirable effect on the state of the user's account at that later point.
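The resend and expiry behaviour described in this answer might look roughly like the following. It is a sketch under assumptions: the function names and the 24-hour default lifetime are made up for illustration, an array stands in for the token store, and the current time is passed in explicitly so the logic is easy to follow.

```php
<?php
// Issue a token, or hand back the existing one if it has not expired yet,
// so every activation mail still in flight stays valid.
function issueToken(array &$tokens, string $username, int $now, int $lifetime = 86400): string
{
    $existing = $tokens[$username] ?? null;
    if ($existing !== null && $existing['expires'] > $now) {
        return $existing['token']; // resend: same token as before
    }
    $token = bin2hex(random_bytes(16));
    $tokens[$username] = ['token' => $token, 'expires' => $now + $lifetime];
    return $token;
}

function isTokenValid(array $tokens, string $username, string $submitted, int $now): bool
{
    $entry = $tokens[$username] ?? null;
    return $entry !== null
        && $entry['expires'] > $now                  // not expired
        && hash_equals($entry['token'], $submitted); // matches stored token
}

$tokens = [];
$t0 = 1700000000; // arbitrary example timestamp

$first  = issueToken($tokens, 'alice', $t0);
$resent = issueToken($tokens, 'alice', $t0 + 3600);   // one hour later
var_dump($first === $resent);                         // bool(true)

var_dump(isTokenValid($tokens, 'alice', $first, $t0 + 7200));  // bool(true)
var_dump(isTokenValid($tokens, 'alice', $first, $t0 + 90000)); // bool(false), expired

$fresh = issueToken($tokens, 'alice', $t0 + 90000);   // new token after expiry
var_dump($fresh === $first);                          // bool(false)
```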
