How to protect a site-wide secret key in PHP

Imagine a pretty standard website where users authenticate with an email/password pair. Passwords are already hashed with a random salt, but the rest of the data is kept unencrypted.
We go one step further and encrypt the sensitive data with a secret key. The key, obviously, must be known to the application so that it can decrypt the data for its operation.
We don't want to have it in the source code, so it's kept in a file and read by the app when needed.
We've secured the file so that only the OS user that runs the app can read it.
(This point was added after some of the discussion below.) We have already considered buying an HSM (hardware security module) and found that it is not possible (for instance, we are running the server on a virtual machine).
This way we are relatively protected against the complete database being stolen, right? However, the key might become known if someone gets access to an OS user with read rights to the file.
The question is: what are the best practices for keeping such a key secure?

Buy a hardware security module and keep the key in it. The key then cannot be read out of the device.
Yubico makes a reasonably priced HSM, about $500 if I recall correctly.
While we're here, your DB server should be on a different box, in a different network zone from your web server.

Related

How to store the API keys of my clients in a secure way?

I am developing a SaaS service that allows my clients to connect third-party emailing tools (e.g. MailChimp). I therefore ask them to enter the API key for the desired service so that certain actions can be performed automatically on their account.
To do that, I store their API key(s) in the database and the connection is made. But from a security point of view, if my database gets hacked despite all the precautions taken (prepared statements, etc.), all of my clients' API keys are exposed, and the email addresses of their own customers can be retrieved, used, resold... because the tools I connect to are essentially used to store and organize contacts and send emails.
So I wonder what the best practice is to let my clients use the APIs of their favorite tools without endangering the security of their own accounts and their customers' data (emails, etc.). I am aware that launching my web application with this data stored in clear in the database would be dangerous.
I thought of several solutions:
Encrypt the API keys in the database, but I do not see how to use them afterwards (decryption), since it's not like a password that only needs to be compared?
Store the API keys in a different database hosted elsewhere, but the encryption problem remains the same... no?
Use an OAuth flow: it seemed convenient, but not all the services I want to connect via API offer it, and I'm not even sure it is really suitable for me.
I intend to host my SaaS on Amazon Web Services. I saw that it offers a service called KMS (Key Management Service), but again I do not know if it really fits my problem.
If someone has already had to solve this problem, or knows how to solve it, I would like to be enlightened!
Note: Sorry for my bad English, I'm French.

All of the solutions you mentioned are somewhat valid, and a combination is most likely the best answer. Your application needs access to these API keys, so it's not really possible for a hacker to gain full control of your application and not gain control of the API keys. Full control is the key part: you can make it a lot harder to get to them.
Encryption
You would need to encrypt them, not hash them, with something like AES, since you need to be able to decrypt them and use them in your requests to the third parties. This will help protect you against, e.g., a database leak: if someone gets your database, they would have to crack the encryption to get to the keys (as long as the encryption is properly implemented). The encryption/decryption key must of course NOT be in the database, otherwise the whole thing has no point :)
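As a rough illustration of the encryption step, here is a minimal sketch using AES-256-GCM through PHP's OpenSSL extension. The function names `encrypt_api_key` and `decrypt_api_key` are hypothetical, and the sketch assumes a 32-byte key that is loaded from somewhere outside the database (a protected file, an environment variable, or a key service); the random key generated at the bottom is only there to make the example self-contained.

```php
<?php
// Hypothetical helpers (names are illustrative, not from any library).
// Assumes the OpenSSL extension is available and $key is 32 random bytes
// that are stored OUTSIDE the database.

function encrypt_api_key(string $plaintext, string $key): string
{
    $iv = random_bytes(12); // 96-bit nonce, must be fresh for every encryption
    $tag = '';
    $ciphertext = openssl_encrypt($plaintext, 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $iv, $tag);
    if ($ciphertext === false) {
        throw new RuntimeException('Encryption failed');
    }
    // Store nonce + authentication tag + ciphertext together in one column.
    return base64_encode($iv . $tag . $ciphertext);
}

function decrypt_api_key(string $stored, string $key): string
{
    $raw = base64_decode($stored, true);
    if ($raw === false || strlen($raw) < 28) {
        throw new RuntimeException('Malformed ciphertext');
    }
    $iv = substr($raw, 0, 12);
    $tag = substr($raw, 12, 16);
    $ciphertext = (string) substr($raw, 28);
    $plaintext = openssl_decrypt($ciphertext, 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $iv, $tag);
    if ($plaintext === false) {
        // Wrong key, or the stored value was modified.
        throw new RuntimeException('Decryption failed');
    }
    return $plaintext;
}

// Example only: a real application would load the key, not generate it here.
$key = random_bytes(32);
$stored = encrypt_api_key('example-third-party-api-key', $key);
$recovered = decrypt_api_key($stored, $key);
```

GCM is used here because it also detects tampering; a stolen database dump then contains only the base64 blobs, which are useless without the separately stored key.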
Separation
Different database also makes sense - if someone dumps your main database they won't get to the API keys database and would have to get deeper into the application to access this database (ideally would be a completely separate DB server only accessible from your application).
Architecture of the solution matters too - you can have one server posing as a web-interface that is internet facing and that would talk to the backend server that is not internet facing over some limited (as much as possible) API to lower the attack surface. Only the backend server would then have access to the keys database and would perform the requests to the 3rd parties. Now an attacker has to jump through several servers to get even close to the keys.
Combining the above-mentioned will ensure one would have to obtain full control of your application (and all its parts) to get to the keys, the encryption key and bypass whatever other protection you might put in place.

Best practice for storing private images on a webserver

For example, suppose you had an online community that allowed members to send private images to each other, like a digital penpal or dating website.
What would be the best practice for securing these images on a webserver and the best practice for displaying them to the authenticated user?
Here is what I have done so far:
Store Images outside of public root.
Retrieve images via one time code instead of the actual image location.
Randomised hashed image names and folder names that are not easy to guess.
PHP script to authenticate user before displaying the image.
Storing the images outside of the public root seems to be one of the best ways to make them hard to access, but what if the server itself is hacked directly?
Is there a way to hash and salt the image files so it can only be displayed once the hash and salt matches, even if a hacker had the file?
Would this be possible to return via PHP or SQL?
I was thinking of encoding the images to base64 and salting the base64 with a salt generated from a randomly generated password per user (Is this possible?)
Or is there a better method?

For a basic protection, the things you have described could be enough, maybe even too much in the sense that if folders are outside of www root, randomizing folder names won't add much to security but will increase complexity.
Based on a risk assessment that you should conduct for your scenario, you can choose to do more. Of course if you find that you can lower the risk of a $100 breach with the cost of $10000, you probably don't want to do that. So do the maths first. :)
I can see two major threats to your solution, one is a bug in the access control logic that allows a user to download images that he was not supposed to be able to access. The other is an attacker gaining access to your web server and downloading images (as your web server needs to have access to image files, this is not necessarily root/admin access, which increases the risk).
An idea one could think of would be to encrypt images on the server. However, with encryption, key management is usually the problem, and that is exactly the case now. There is not much point in encryption with a key that your application can access anyway, as an attacker could also access that key in case of a successful application level attack (and also in case of a server/OS level attack, because the user running your web server and/or application must have access to the key).
In theory, you could generate a public/private keypair for all of your users. When somebody uploads an image, you would generate a symmetric key for the image, encrypt the image with that key, and then encrypt the symmetric key with each intended recipient's public key and store encrypted keys (and metadata) with the image. The private keys for users should also be encrypted, preferably with a key derived from the user's password with a proper key derivation function like PBKDF2. One implication is that you can only get the user's private key when the user logs in, because you don't store his password, so that's the only time you have it. This means you would have to store your user's decrypted private key in server memory at least, where it is not really safe (and any other store is much worse). This would still provide protection against offline attackers though (somebody having access to backups for instance), and it would also limit attack scope to victim users that log on while the server is compromised (meaning after it is compromised, but before you realize this). Another drawback is the complexity of this solution - crypto is hard, it would be really easy to mess this up without experience. This would also mitigate the threat posed by an access control flaw, because unintended images could not be decrypted with the logged on user's private key.
A completely different approach would be to separate your application into several components: a logon service (similar to SSO), your web server, and a backend image service. When your user logs on to the authentication provider (AP), he would in this case receive a token with claims, signed by the AP. When talking to the web application, he would use this token for authentication. What differentiates this solution from the previous is that when a user requests images, the web application would pass his token to the image service, and the image service could on the one hand store images securely on a box not directly accessible from the internet, and on the other hand it could authorize whether for the token received it wants to return images (it could verify the token with the AP or by itself, depending on the implementation you choose). In this case, even if an attacker compromises the web application, he would still not be able to produce (sign) a valid token from the AP to get access to images on the image service, and it could potentially be much harder to compromise the image service. Of course in case of a breach on the web server, the attacker would still be able to observe any image flowing through, meaning any user that logs on while the server is compromised would still lose his images to the attacker. The added complexity of this solution is even worse than the previous one, which means it is easy to get this wrong too, and it's also costly both to develop and maintain.
Note that none of these solutions protect images from server admins, which may or may not be a requirement for your app.
I hope this answer sheds some light on the difficulties involved in making it significantly more secure than your current solution. Having said all this, implementation is key, and details (the actual code level vulnerabilities) probably matter the most.

You have these listed as some of your security protocols:
"1. Store Images outside of public root.
2. Retrieve images via one time code instead of the actual image location.
...
4. PHP script to authenticate user before displaying the image."
This should be enough, but then you mentioned...
"3. Randomised hashed image names and folder names that are not easy to guess."
If you are actually doing the first two correctly, 1 and 2, then it's not really possible for 3 to have any effect. If the images are outside of the webserver directory, then it doesn't matter if the folder names and image names are easy to guess.
Your code for doing 1 and 2 will look something like this, assuming the script runs from the root directory of your webserver (i.e., example.com/index.php)...
$file_location = '../../images/' . $some_id_that_is_authenticated_and_cleansed_for_slashes . '.jpg';
header('Content-Type: image/jpeg'); // tell the browser what is being sent
readfile($file_location); // reads the file and sends it to the user
If you are doing the above, then 3 is redundant. Hashing the names, etc., won't help if your site is hacked (the Apache security is bypassed) and it won't help if your site isn't doing the above (since users can then just directly access the URLs). Except for that redundancy, the rest seems perfect.
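The "cleansed for slashes" part carries a lot of weight in the snippet above, so here is a minimal sketch of one way it could be done. The function name `resolve_image_path` and the ID format are assumptions for illustration; the idea is to accept only a strict whitelist of characters and then confirm that the resolved path really lies inside the image directory.

```php
<?php
// Hypothetical helper: maps an already-authorized image ID to a file path,
// or returns null if the ID is malformed or the file is not inside $baseDir.

function resolve_image_path(string $baseDir, string $imageId): ?string
{
    // Whitelist a conservative alphabet, so "../" or slashes can never appear.
    if (preg_match('/^[A-Za-z0-9_-]{1,64}$/', $imageId) !== 1) {
        return null;
    }
    $base = realpath($baseDir);
    $path = realpath($baseDir . '/' . $imageId . '.jpg');
    // realpath() returns false for missing files and resolves symlinks,
    // so the prefix check below also catches links pointing elsewhere.
    if ($base === false || $path === false
        || strpos($path, $base . DIRECTORY_SEPARATOR) !== 0) {
        return null;
    }
    return $path;
}

// In a request handler (not run here), after the user has been authenticated
// and authorized for this particular image:
// $path = resolve_image_path(__DIR__ . '/../../images', $requestedId);
// if ($path === null) { http_response_code(404); exit; }
// header('Content-Type: image/jpeg');
// readfile($path);
```

This does not replace the access control check itself; it only makes sure that whatever ID gets through cannot be used to read files outside the image folder.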

How to protect encryption key from server admin?

Scenario
Data is encrypted inside DB using key that is never stored in the app server or DB server
Key is entered upon login and is stored via $_COOKIE['key'] variable for persistence (so user doesn't have to enter it every page load)
Data is decrypted via $_COOKIE['key']
$_COOKIE['key'] is destroyed upon browser exit
Threat
A rogue server admin snoops on the PHP files and finds out that the key is stored in $_COOKIE['key']. He injects malicious code like email_me($_COOKIE['key']);. He erases the malicious code after obtaining the key.
Question
Is there a way to protect yourself from this kind of scenario?

You can make it harder for a server admin to get the key, but they always can.
Let's think about moving the encryption and decryption to the client side. Now, the server won't get the key, so the server admin should not be able to decrypt the data. That's not quite true, because the server admin can manipulate the page JavaScript so that either the key is sent to the server or nothing is encrypted at all.
The only way a client can be certain that a server admin cannot steal their data, is by using a client software that is open source and cannot be changed on-the-fly by an admin. So, web pages and automatically updating apps are out of the question.

If the key itself is a concern, you can use cryptography oracles like Keyvault in Azure that never release the keys contained within but perform cryptography themselves on data sent to them.
Of course an admin would be able to access the data as long as they have access to the cryptography oracle, but not afterwards, and they would never have the key. This helps in some scenarios, that's the whole point of services like Azure Keyvault. Also you don't need to give actual access to the encryption service to all admins.
Another mitigation (a detective control, as opposed to a preventive one) is audit logging both on the IT and application level. When done right, not even admins can hide the fact that they accessed the data, which again can help mitigate some risks and at least may provide non-repudiation.
Yet another thing you could do is proper change management, controlling who has access (especially write access) to your source code. This can get difficult with script languages like PHP, where you can't really sign code, but you can still have good processes for reviewing and releasing code to production.
So in the end, it's probably less of a technical question, there's a great deal you can do in terms of processes.

Is it possible to somehow get this randomly generated key for my site and access the SQL?

I have a PHP/JS site where the information is encrypted and put into the database. The encryption key for the information is randomly generated, then given back to the users after they submit a POST through a form. The encryption key is not stored in my database at all. A separate, randomly generated ID is created and stored in the database, and is used to look up the item itself before deciphering it.
My question is, is it possible at all to look through the logs and find information that would reveal the key? I am trying to make it impossible to read any of the SQL data without either being the person who has the code (who can do whatever he wants with it), or by a brute force attack (unavoidable if someone gets my SQL database)?
Just to re-iterate my steps:
User sends information through POST
The PHP file generates a random ID and access key. The data is encrypted with the access key, then put in the database with the ID as the PRIMARY KEY.
The PHP file echoes just the random ID and the access key.
The website uses jQuery to build a link from the ID and the key: mysite.com?i=cYFogD3Se8RkLSE1CA [9 characters A-Za-z0-9 = ID][9 characters A-Za-z0-9 = key]
Is there any possibility that someone who had access to my server could read the information? I want it to be impossible for me to read the messages myself. The information has to be decodable; it can't be a one-way encoding.

I like your system of putting the decryption key in the URL, so that not even you can access the data without information that is available only on the user's computer.
I still see a few gotchas in this.
URLs are often saved in web server logs. If you're logging to disk, and they get the disk, then they get the keys.
If the attacker has access to your database, he may have enough access to your system to secretly install software that logs the URLs. He could even do something as prosaic as turn logging back on.
The person visiting your site will have the URL bookmarked at least (otherwise it is useless to him) and it will likely appear in his browser history. Normally, bookmarks and history are not considered secure data. Thus, an attacker to a user's computer (either by sitting down directly or if the computer is compromised by malware) can access the data as well. If the payload is desirable enough, someone could create a virus or malware that specifically mines for your static authentication token, and could achieve a reasonable hit rate. The URLs could be available to browser plugins, even, or other applications acting under a seemingly reasonable guise of "import your bookmarks now".
So it seems to me that the best security is then for the client to not just have the bookmark (which, while it is information, it is not kept in anyone's head so can be considered "something he has"), but also for him to have to present "something he knows", too. So encrypt with his password, too, and don't save the password. When he presents the URL, ask for a password, and then decrypt with both (serially or in combination) and the data is secure.
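One way to "decrypt with both in combination" is to derive a single data key from the URL key and the password together. The sketch below is only an illustration under assumptions: the function names, the 200,000 PBKDF2 iterations, and the per-record salt stored next to the ciphertext are all choices made here, not part of the original scheme. It also generates the URL tokens with `random_bytes()` rather than a weaker random source.

```php
<?php
// Hypothetical sketch: the data key depends on BOTH the URL key
// ("something he has") and the password ("something he knows").

function url_safe_token(int $bytes = 16): string
{
    // random_bytes() is a CSPRNG; base64url keeps the token link-friendly.
    return rtrim(strtr(base64_encode(random_bytes($bytes)), '+/', '-_'), '=');
}

function derive_data_key(string $urlKey, string $password, string $salt): string
{
    // HMAC binds the password to the URL key, then PBKDF2 makes each
    // password guess expensive. Returns 32 raw bytes (e.g. an AES-256 key).
    $combined = hash_hmac('sha256', $password, $urlKey, true);
    return hash_pbkdf2('sha256', $combined, $salt, 200000, 32, true);
}

// Example: values created when the user submits the form.
$recordId = url_safe_token();   // stored in the database as the lookup ID
$urlKey   = url_safe_token();   // returned to the user in the link only
$salt     = random_bytes(16);   // stored in the database next to the record
$dataKey  = derive_data_key($urlKey, 'a password the user chose', $salt);
```

With this arrangement, neither a leaked URL (from logs, history, or bookmarks) nor a stolen database is enough on its own; the password is still needed, and it is never stored.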
Finally, I know that Google's two-factor authentication can be used by third parties (for example, I use it with Dropbox). This creates another "something you have" by requiring the person accessing the resource to have his cell phone, or nothing. Yes, there is recourse if you lose your cell phone, but it usually involves another phone number, or a special Google-supplied one-time long password that has been printed out and stashed in one's wallet.

Let's start with some basic definitions:
Code Protecting data by translating it to another language, usually a private language. English translated to Spanish is encoded, but it's not very secure since many people understand Spanish.
Cipher Protecting data by scrambling it up using a key. A letter substitution cipher first documented by Julius Caesar is an example of this. Modern techniques involve mathematical manipulation of binary data using prime numbers. The best techniques use asymmetric keys; the key that is used to encipher the data cannot decipher it, a different key is needed. This allows the public key to be published and is the basis of SSL browser communication.
Encryption Protecting data by encoding and/or enciphering it.
All of these terms are often used interchangeably but they are different and the differences are sometimes important. What you are trying to do is to protect the data by a cipher.
If the data is "in clear" then if it is intercepted it is lost. If it is enciphered, then both the data and the key need to be intercepted. If it is enciphered and encoded, then the data, the key and the code need to be intercepted.
Where is your data vulnerable?
The most vulnerable place for any data is when it is in clear in the personal possession of somebody, on a storage device (USB, CD, piece of paper) or inside their head, since that person is vulnerable to inducement or coercion. This is the foundation of Wikileaks - people who are trusted with confidential information are induced to betray that confidence - the ethics of this I leave to your individual consciences.
When it is in transit between the client and the server and vice versa. Except for data of national security importance the SSL method of encryption should be adequate.
When it is in memory in your program. The source code of your program is the best place to store your keys, however, they themselves need to be stored encrypted with a password that you enter each time your program runs (best), that is entered when you compile and publish or that is embedded in your code (worst). Unless you have a very good reason one key should be adequate; not one per user. You should also keep in-memory data encrypted except when you actually need it and you should use any in-memory in-clear data structures immediately and destroy them as soon as you are finished with them. The key has to be stored somewhere or else the data is irrecoverable. But consider, who has access to the source code (including backups and superseded versions) and how can you check for backdoors or trojans?
When it is in transit between your program's machine and the data store. If you only send encrypted data between the program and the data store and DO NOT store the key in the data store this should be OK.
When it is stored in the data store. Ditto.
Do not overlook physical security, quite often the easiest way to steal data is to walk up to the server and copy the hard drive. Many companies (and sadly defence/security forces) spend millions on on-line data security and then put their data in a room with no lock. They also have access protocols that a 10 year old child could circumvent.
You now have lovely encrypted data - how are you going to stop your program from serving it up in the clear to anyone who asks for it?
This brings us to identification, validation and authorisation. More definitions:
Identification A claim made by a person that they are so-and-so. This is usually handled in a computer program by a user name. In physical security applications it is by a person presenting themselves and saying "I am so-and-so"; this can explicitly be by a verbal statement or by presenting an identity document like a passport or implicitly by a guard you know recognising you.
Validation This is the proof that a person is who they say they are. In a computer this is the role of the password; more accurately, this proves that they know the person they say they are's password which is the big, massive, huge and insurmountable problem in the whole thing. In physical security it is by comparing physical metrics (appearance, height etc) as documented in a trusted document (like a passport) against the claim; you need to have protocols in place to ensure that you can trust the document. Incidentally, this is the main cause of problems with face recognition technology to identify bad guys – it uses a validation technique to try and identify someone. “This guy looks like Bad Guy #1”; guess what? So do a lot of people in a population of 7 billion.
Authorisation Once a person has been identified and validated they are then given authorisation to do certain things and go to certain places. They may be given a temporary identification document for this; think of a visitor id badge or a cookie. Depending on where they go they may be required to reidentify and revalidate themselves; think of a bank’s website; you identify and validate yourself to see your bank accounts and you do it again to make transfers or payments.
By and large, this is the weakest part of any computer security system; it is hard for me to steal your data, it is far easier for me to steal your identity and have the data given to me.
In your case, this is probably not your concern, providing that you do the normal thing of allowing the user to set, change and retrieve their password in the normal commercial manner, you have probably done all you can.
Remember, data security is a trade off between security on the one hand and trust and usability on the other. Make things too hard (like high complexity passwords for low value data) and you compromise the whole system (because people are people and they write them down).
Like everything in computers – users are a problem!
Why are you protecting this data, and what are you willing to spend to do so?
This is a classic risk management question. In effect, you need to consider the adverse consequences of losing this data, the risk of this happening with your present level of safeguards and if the reduction in risk that additional safeguards will cost is worth it.
Losing the data can mean any or all of:
Having it made public
Having it fall into the wrong person’s hands
Having it destroyed maliciously or accidentally. (Backup, people!)
Having it changed. If you know it has been changed this is equivalent to losing it; if you don’t this can be much, much worse since you may be acting on false data.
This type of thinking is what leads to the classification of data in defence and government into Top Secret, Secret, Restricted and Unrestricted (Australian classifications). The human element intervenes again here; due to the nature of bureaucracy there is no incentive to give a document a low classification and plenty of disincentive; so documents are routinely over-classified. This means that because many documents with a Restricted classification need to be distributed to people who don’t have the appropriate clearance simply to make the damn thing work, this is what happens.
You can think of this as a hierarchy as well; my personal way of thinking about it is:
Defence of the Realm Compromise will have serious adverse consequences for the strategic survival of my country/corporation/family whatever level you are thinking about.
Life and Death Compromise will put someone’s life or health in danger.
Financial Compromise will allow someone to have money/car/boat/space shuttle stolen.
Commercial Compromise will cause loss of future financial gain.
Humiliating Compromise will cause embarrassment. Of course, if you are a politician this is probably No 1.
Personal These are details that you would rather not have released but aren’t particularly earth shaking. I would put my personal medical history in here but, the impact of contravening privacy laws may push it up to Humiliating (if people find out) or Financial (if you get sued or prosecuted).
Private This is stuff that is nobody else’s business but doesn’t actually hurt you if they find out.
Public Print it in the paper for all anyone cares.
Irrespective of the level, you don’t want any of this data lost or changed, but if it is, you need to know that this has happened. For the Nazis, having their Enigma cipher broken was bad; not knowing it had happened was catastrophic.
In the comments below, I have been asked to describe best practice. This is impossible without knowing the risk of the data (and risk tolerance of the organisation). Spending too much on data security is as bad as spending too little.

First and most importantly, you need a really good, watertight legal disclaimer.
Second, don’t store the user’s data at all.
Instead when the user submits the data (using SSL), generate a hash of the SessionID and your system’s datetime. Store this hash in your table along with the datetime and get the record ID. Encrypt the user’s data with this hash and generate a URL with the record ID and the data within it and send this back to the user (again using SSL). Security of this URL is now the user’s problem and you no longer have any record of what they sent (make sure it is not logged).
Routinely, delete stale (4h,24h?) records from the database.
When a retrieval request comes in (using SSL) lookup the hash, if it’s not there tell the user the URL is stale. If it is, decrypt the data they sent and send it back (using SSL) and delete the record from your database.

Let's have a little think:
Use SSL - data is encrypted
Use username/password for authorisation
If somebody breaks that, you do have a problem with security
Spend the effort on fixing that. Disaster recovery is a waste of effort in this case. Just get the base cases correct.

Authenticate system without sessions - Only cookies - Is this reasonably secure?

I'm interested in your advice/opinion on this security problem.
I was thinking on doing something like this:
Compute an HMAC (SHA-256) over the string userId + expirationTime, using as the secret key a string built from some secret string and $_SERVER['HTTP_USER_AGENT'].
Compute an HMAC (SHA-256) over userId + expirationTime, using the hash from step 1 as the secret key.
Build a string from userId|expiration| and the hash from step 2.
Encrypt that string (from step 3) with the 'rijndael-256' algorithm (mcrypt family of functions).
Encode it as base64.
Set a cookie with the resulting value.
What do you think? Is this OK?
What else could I implement with $_SERVER['HTTP_USER_AGENT'] check, to make sure that the cookie isn't stolen (except IP address)?
P.S. From sensitive data cookie would contain only userId.
EDIT:
OK, to clear some things up:
I'm trying to make a "safe" auth system that doesn't rely on sessions. The app in question is built more or less as a pure RESTful API.
Step 2:
Problem:
"Fu’s protocol does not provide an answer to this question. There is only one key involved in Fu’s protocol, namely the server key. One straightforward solution is to use this server key to encrypt the data field of every cookie; however, this solution is not secure."
Solution:
"Our solution to this problem is simple and efficient. We propose to use HMAC(user name|expiration time, sk) as the encryption key. This solution has the following three good properties. First, the encryption key is unique for each different cookie because of the user name and expiration time. Note that whenever a new cookie is created, a new expiration time is included in the cookie. Second, the encryption key is unforgeable because the server key is kept secret. Third, the encryption key of each cookie does not require any storage on the server side or within the cookie, rather, it is computed by a server dynamically."
From the paper "A Secure Cookie Protocol" by Alex X. Liu, Jason M. Kovacs
Step 4:
Encrypts the data (which would look something like this: 'marko@example.com|34234324234|324erfkh42fx34gc4fgcc423g4'), so that even the client couldn't know exactly what's inside.
Step 5:
Base64 encode is there just to make final value pretty.

I'll bite.
In order to maintain any semblance of state you need to identify the user using a key of some type. That key is sent to the browser as a cookie OR through query string parameters.
Now, the validation of that key can occur inside the web server itself (session) or through checking some other storage mechanism, usually a database record.
The key itself should be obfuscated using some mechanism. The reason for the obfuscation is simply to make it harder to guess what values other keys might have if the originating user or someone else decides to inspect the value. For example, if the key is your user id (not recommended) and you are using incrementing ints, then it's trivial to guess the other user keys. I want to stress that obfuscating (or even downright encrypting) the key provides absolutely no protection against a hijacked session. All it does is make it harder to guess other people's session keys.
That said, I believe the key should have nothing at all to do with your user id and instead be some other near random value like a generated GUID. Quite frankly a base 64 encoded GUID is at the exact same level of security as encrypting user id + time. It's just that one is more computationally intensive on your server than the other.
Of course, this key could change upon each request. Browser posts something, you generate a new key and send it back. In the event the browser posts an out of date key then log it and kick them back to the login screen. This should prevent replay attacks .. to a degree. However, it introduces other challenges such as using the Back button on various browsers. So, you may not want to go down this path.
That said you can't depend on the client IP address because the same user might send follow up requests using a different IP. You can't depend on browser fingerprinting because any decent hacking tool will capture that and submit the same values regardless of whatever they are using.
Now, if you really want to do this right you should have SSL turned on. Otherwise you're wasting your time. The entire conversation (from the login screen on) needs to be encrypted. If it's not then someone could simply listen for that cookie, replay it immediately and hijack the session. Point is that they don't need to know the values contained therein to use them. So all of that hashing, etc you have is just fluff that will increase your server load.
Did I say use SSL? ;) This will encrypt the traffic from the beginning of the conversation and an attacker cannot replay the same packets as they would have to negotiate their own handshake with the server. Which means all you have to do is ensure that whatever session id you use is non-guessable so that one logged in user can't take over another's session.
So, to sum up: the method you posted is a waste of time.
You are much better off just getting a $10 SSL certificate and using a base 64 encoded GUID as the session ID. How you store that session info on your server doesn't really matter... except in load balanced situations. At which point it needs to be out-of-process and backed by a database server.. but that's another question.
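A minimal sketch of the "non-guessable session ID" idea follows. Note the substitution: instead of a base64-encoded GUID it uses 16 bytes from `random_bytes()` encoded as hex, since GUID generators are not always cryptographically random. The function names and the cookie name `sid` are made up for the example, and the options array form of `setcookie()` assumes PHP 7.3 or later.

```php
<?php
// Hypothetical helpers for an opaque, unguessable session identifier.

function new_session_id(): string
{
    // 128 bits from the OS CSPRNG, hex-encoded to 32 characters.
    return bin2hex(random_bytes(16));
}

function session_cookie_options(int $lifetimeSeconds): array
{
    return [
        'expires'  => time() + $lifetimeSeconds,
        'path'     => '/',
        'secure'   => true,  // only ever sent over HTTPS
        'httponly' => true,  // not readable from JavaScript
        'samesite' => 'Lax',
    ];
}

// In a login handler (not run here):
// $sid = new_session_id();
// ...store $sid => user id in the server-side session store...
// setcookie('sid', $sid, session_cookie_options(3600));
```

The ID carries no information about the user; everything else lives server-side, keyed by it.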

@Marko A few comments about how secure this kind of "session in a cookie" approach is:
First of all, as others have said, you need a secure connection. There is no reliable way around this requirement. It is a must.
Other than that, there are quite a few pitfalls in implementing a secure encryption/authentication system. For example, you need to make the MAC verification constant-time, and you need to pay attention to how you implement the encryption/authentication (mode of operation, IV creation, etc.), and so on.
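To make the constant-time point concrete, here is a small sketch of a signed (not encrypted) cookie that is verified with `hash_equals()`. It is not TCrypto's implementation and it leaves out the encryption step from the question; the function names are hypothetical, and it assumes user IDs never contain the `|` delimiter.

```php
<?php
// Hypothetical signed-cookie helpers. The cookie value is
// base64(userId|expires|HMAC-SHA256(userId|expires, serverKey)).

function make_auth_cookie(string $userId, int $expires, string $serverKey): string
{
    if (strpos($userId, '|') !== false) {
        throw new InvalidArgumentException('User ID must not contain "|"');
    }
    $payload = $userId . '|' . $expires;
    $mac = hash_hmac('sha256', $payload, $serverKey);
    return base64_encode($payload . '|' . $mac);
}

function verify_auth_cookie(string $cookie, string $serverKey, int $now): ?string
{
    $decoded = base64_decode($cookie, true);
    if ($decoded === false) {
        return null;
    }
    $parts = explode('|', $decoded);
    if (count($parts) !== 3) {
        return null;
    }
    [$userId, $expires, $mac] = $parts;
    $expected = hash_hmac('sha256', $userId . '|' . $expires, $serverKey);
    // hash_equals() compares in constant time, unlike == or ===.
    if (!hash_equals($expected, $mac)) {
        return null;
    }
    // Only trust the expiry value after the MAC has been verified.
    if (preg_match('/^\d+$/', $expires) !== 1 || (int) $expires < $now) {
        return null;
    }
    return $userId;
}

// Example with a throwaway key; a real server key would be loaded from config.
$serverKey = random_bytes(32);
$cookie = make_auth_cookie('42', time() + 3600, $serverKey);
$userId = verify_auth_cookie($cookie, $serverKey, time());
```

A plain string comparison can leak, through timing, how many leading characters of a forged MAC were correct; `hash_equals()` avoids that. As noted above, none of this helps against a cookie stolen over an unencrypted connection.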
If you are unsure about such issues, I recommend you to take a look at TCrypto (which I maintain):
TCrypto
It is a small PHP 5.3+ key-value storage library (cookies will be used as a storage backend by default). Designed exactly for (scalable) "session in a cookie" usage. Feel free to use it :) Also, if you are interested about the low-level implementation, take a look at the code. The codebase is not that huge, I guess it would do quite well, demonstrating encryption related code usage in PHP applications.
