For example, if you had an online community that allowed members to send private images to each other, like a digital penpal or dating website.
What would be the best practice for securing these images on a webserver and the best practice for displaying them to the authenticated user?
Here is what I have done so far:
Store Images outside of public root.
Retrieve images via one time code instead of the actual image location.
Randomised hashed image names and folder names that are not easy to guess.
PHP script to authenticate user before displaying the image.
Outside of root seems to be one of the best ways to store the images to make them hard to access, but what if the server itself is directly hacked into?
Is there a way to hash and salt the image files so it can only be displayed once the hash and salt matches, even if a hacker had the file?
Would this be possible to return via PHP or SQL?
I was thinking of encoding the images to base64 and salting the base64 with a salt generated from a randomly generated password per user (Is this possible?)
Or is there a better method?
For basic protection, the things you have described could be enough, maybe even too much, in the sense that if folders are outside of the www root, randomizing folder names won't add much to security but will increase complexity.
Based on a risk assessment that you should conduct for your scenario, you can choose to do more. Of course if you find that you can lower the risk of a $100 breach with the cost of $10000, you probably don't want to do that. So do the maths first. :)
I can see two major threats to your solution, one is a bug in the access control logic that allows a user to download images that he was not supposed to be able to access. The other is an attacker gaining access to your web server and downloading images (as your web server needs to have access to image files, this is not necessarily root/admin access, which increases the risk).
An idea one could think of would be to encrypt images on the server. However, with encryption, key management is usually the problem, and that is exactly the case now. There is not much point in encryption with a key that your application can access anyway, as an attacker could also access that key in case of a successful application level attack (and also in case of a server/OS level attack, because the user running your web server and/or application must have access to the key).
In theory, you could generate a public/private keypair for all of your users. When somebody uploads an image, you would generate a symmetric key for the image, encrypt the image with that key, and then encrypt the symmetric key with each intended recipient's public key and store encrypted keys (and metadata) with the image. The private keys for users should also be encrypted, preferably with a key derived from the user's password with a proper key derivation function like PBKDF2. One implication is that you can only get the user's private key when the user logs in, because you don't store his password, so that's the only time you have it. This means you would have to store your user's decrypted private key in server memory at least, where it is not really safe (and any other store is much worse). This would still provide protection against offline attackers though (somebody having access to backups for instance), and it would also limit attack scope to victim users that log on while the server is compromised (meaning after it is compromised, but before you realize this). Another drawback is the complexity of this solution - crypto is hard, it would be really easy to mess this up without experience. This would also mitigate the threat posed by an access control flaw, because unintended images could not be decrypted with the logged on user's private key.
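The password-derived key mentioned above can be sketched as follows. This is a minimal illustration in Python using only the standard library, not a full implementation of the scheme: it shows only the PBKDF2 step, and the iteration count, key length and function names are assumptions chosen for the example. Encrypting the private key and the images themselves would need a vetted cryptography library.

```python
import hashlib
import secrets

# Assumed parameters for illustration; tune the iteration count for your hardware.
PBKDF2_ITERATIONS = 200_000
KEY_LENGTH = 32  # bytes, enough for a 256-bit symmetric key


def new_salt() -> bytes:
    """Per-user random salt, stored next to the encrypted private key."""
    return secrets.token_bytes(16)


def derive_wrapping_key(password: str, salt: bytes) -> bytes:
    """Derive the key that would encrypt (wrap) the user's private key."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS, dklen=KEY_LENGTH
    )


# The same password and salt always give the same key, so the key never
# needs to be stored: it is recomputed at login from the submitted password.
salt = new_salt()
key_at_signup = derive_wrapping_key("correct horse battery staple", salt)
key_at_login = derive_wrapping_key("correct horse battery staple", salt)
```

Because the key exists only while the password is available, this matches the point made above: the server can unwrap a user's private key only during that user's login.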
A completely different approach would be to separate your application into several components: a logon service (similar to SSO), your web server, and a backend image service. When your user logs on to the authentication provider (AP), he would in this case receive a token with claims, signed by the AP. When talking to the web application, he would use this token for authentication. What differentiates this solution from the previous is that when a user requests images, the web application would pass his token to the image service, and the image service could on the one hand store images securely on a box not directly accessible from the internet, and on the other hand it could authorize whether for the token received it wants to return images (it could verify the token with the AP or by itself, depending on the implementation you choose). In this case, even if an attacker compromises the web application, he would still not be able to produce (sign) a valid token from the AP to get access to images on the image service, and it could potentially be much harder to compromise the image service. Of course in case of a breach on the web server, the attacker would still be able to observe any image flowing through, meaning any user that logs on while the server is compromised would still lose his images to the attacker. The added complexity of this solution is even worse than the previous one, which means it is easy to get this wrong too, and it's also costly both to develop and maintain.
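A rough sketch of the token check on the image service, in Python with the standard library only. As a simplifying assumption it uses an HMAC secret shared between the AP and the image service; a real deployment would more likely use an asymmetric signature so that the image service can verify tokens without being able to mint them. The key, claim names and function names are all hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical secret known to the AP and the image service only.
# The web application never sees it, so it cannot sign tokens itself.
AP_SIGNING_KEY = b"replace-with-a-long-random-secret"


def issue_token(user_id: int, allowed_images: list, ttl: int = 300) -> str:
    """Runs on the authentication provider: sign a small set of claims."""
    claims = {"sub": user_id, "images": allowed_images, "exp": int(time.time()) + ttl}
    body = base64.urlsafe_b64encode(json.dumps(claims).encode("utf-8"))
    sig = hmac.new(AP_SIGNING_KEY, body, hashlib.sha256).hexdigest()
    return body.decode("ascii") + "." + sig


def authorize(token: str, image_id: str) -> bool:
    """Runs on the image service: check signature, expiry and claims."""
    try:
        body, sig = token.encode("ascii").split(b".")
    except ValueError:  # wrong number of parts or non-ASCII input
        return False
    expected = hmac.new(AP_SIGNING_KEY, body, hashlib.sha256).hexdigest().encode("ascii")
    if not hmac.compare_digest(sig, expected):
        return False
    claims = json.loads(base64.urlsafe_b64decode(body))
    return claims["exp"] > time.time() and image_id in claims["images"]
```

The web application only forwards the token it received from the user; a compromised web application can replay tokens it observes but cannot create new ones.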
Note that none of these solutions protect images from server admins, which may or may not be a requirement for your app.
I hope this answer sheds some light on the difficulties involved in making it significantly more secure than your current solution. Having said all this, implementation is key, and details (the actual code level vulnerabilities) probably matter the most.
You have these listed as some of your security protocols:
"1. Store Images outside of public root.
2. Retrieve images via one time code instead of the actual image location.
...
4. PHP script to authenticate user before displaying the image."
This should be enough, but then you mentioned...
"3. Randomised hashed image names and folder names that are not easy to guess."
If you are actually doing the first two correctly, 1 and 2, then it's not really possible for 3 to have any effect. If the images are outside of the webserver directory, then it doesn't matter if the folder names and image names are easy to guess.
Your code will look like (for doing 1 and 2), assuming the environment is the root directory of your webserver (i.e., example.com/index.php)...
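$image_id = basename($some_id_that_is_authenticated_and_cleansed_for_slashes); // defence in depth: drop any directory part
$file_location = '../../images/' . $image_id . '.jpg'; // resolves to a directory outside the web root
header('Content-Type: image/jpeg'); // tell the browser what is being sent
readfile($file_location); // grabs file and shows it to user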
If you are doing the above, then 3 is redundant. Hashing the names, etc., won't help if your site is hacked (the Apache security is bypassed) and it won't help if your site isn't doing the above (since users can then just directly access the URLs). Except for that redundancy, the rest seems perfect.
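For points 1 and 2 together, the one-time code can be sketched like this. It is written in Python for brevity and the same logic carries over to PHP; the directory, the in-memory store and the function names are assumptions made for the example.

```python
import secrets
import time
from pathlib import Path

# Assumed location outside the web root; adjust to your layout.
IMAGE_DIR = Path("/srv/private-images")

# In-memory store for illustration only; a real application would use a
# database or cache shared between requests.
_codes = {}


def issue_code(image_id: str, user_id: int, ttl: int = 60) -> str:
    """Create a single-use code after the user has been authenticated."""
    code = secrets.token_urlsafe(32)
    _codes[code] = (image_id, user_id, time.time() + ttl)
    return code


def redeem_code(code: str, user_id: int):
    """Return the image path once, or None if the code is invalid."""
    entry = _codes.pop(code, None)  # pop makes the code single-use
    if entry is None:
        return None
    image_id, owner, expires = entry
    if owner != user_id or time.time() > expires:
        return None
    if not image_id.isalnum():  # reject slashes, dots and other path tricks
        return None
    return IMAGE_DIR / (image_id + ".jpg")
```

The page would embed the code in the image URL, and the script serving the image would call redeem_code before streaming the file, so the real location never reaches the browser.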
Related
Scenario
Data is encrypted inside DB using key that is never stored in the app server or DB server
Key is entered upon login and is stored via $_COOKIE['key'] variable for persistence (so user doesn't have to enter it every page load)
Data is decrypted via $_COOKIE['key']
$_COOKIE['key'] is destroyed upon browser exit
Threat
Rogue server admin snoops on PHP files and finds out the key is stored in $_COOKIE['key']. He injects malicious code like email_me($_COOKIE['key']);. He erases the malicious code after obtaining the key.
Question
Is there a way to protect yourself from this kind of scenario?
You can make it harder for a server admin to get the key, but they always can.
Let's think about moving the encryption and decryption to the client side. Now, the server won't get the key, so the server admin should not be able to decrypt the data. That's not quite true, because the server admin can manipulate the page JavaScript so that either the key is sent to the server or nothing is encrypted at all.
The only way a client can be certain that a server admin cannot steal their data, is by using a client software that is open source and cannot be changed on-the-fly by an admin. So, web pages and automatically updating apps are out of the question.
If the key itself is a concern, you can use cryptography oracles like Azure Key Vault, which never release the keys they contain but perform the cryptographic operations themselves on data sent to them.
Of course an admin would be able to access the data as long as they have access to the cryptography oracle, but not afterwards, and they would never have the key. This helps in some scenarios; that's the whole point of services like Azure Key Vault. Also, you don't need to give all admins actual access to the encryption service.
Another mitigation (a detective control, as opposed to a preventive one) is audit logging both on the IT and application level. When done right, not even admins can hide the fact that they accessed the data, which again can help mitigate some risks and at least may provide non-repudiation.
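One common way to make an audit log tamper-evident is to chain the entries with hashes. The sketch below (Python, standard library) is only an illustration of that idea, with hypothetical field names; it is not a description of any particular logging product. It detects edited entries and entries removed from the middle. Detecting truncation at the end additionally requires shipping the latest hash to a system the admin does not control.

```python
import hashlib
import json


def append_entry(log: list, actor: str, action: str) -> None:
    """Append an entry whose hash covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    record = {"actor": actor, "action": action, "prev": prev_hash}
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    record["hash"] = hashlib.sha256(payload).hexdigest()
    log.append(record)


def verify_chain(log: list) -> bool:
    """Recompute every hash; an edited or missing middle entry breaks the chain."""
    prev_hash = "0" * 64
    for record in log:
        body = {k: record[k] for k in ("actor", "action", "prev")}
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        if record["prev"] != prev_hash:
            return False
        if hashlib.sha256(payload).hexdigest() != record["hash"]:
            return False
        prev_hash = record["hash"]
    return True
```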
Yet another thing you could do is proper change management, controlling who has access (especially write access) to your source code. This can get difficult with script languages like PHP, where you can't really sign code, but you can still have good processes for reviewing and releasing code to production.
So in the end, it's probably less of a technical question, there's a great deal you can do in terms of processes.
Imagine a pretty standard website, with users authenticating with an email/password pair. For passwords, it already has hashing with a random salt, but the rest of the data is kept unencrypted.
We take another step forward and encrypt the sensitive data with a password key; the key, obviously, must be known to the application so that it can decrypt the data for its operation.
we don't want to have it in the source code, so it's kept in a file and read by the app when it needs it.
we've secured the file so that only user which executes the app can read it
(this point has appeared after some discussions below) We have already considered buying hardware HSM and found that not possible (for instance we are running the server on a virtual machine)
this way we are relatively protected from complete DB stealing, right? However, the key might become known if someone gets access to the OS user with read rights.
the question is: what are the best practices for keeping such key secure?
Buy a hardware security module and keep the key in it. The key will not be able to be read.
Yubico makes a reasonably priced HSM, $500 if I recall correctly.
While we're here, your DB server should be on a different box, in a different network zone from your web server.
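Where an HSM is not an option, as the question states, the file-based approach it describes can at least be checked at startup. A minimal sketch in Python, assuming a POSIX system; the function name is made up for the example:

```python
import os
import stat


def load_key(path: str) -> bytes:
    """Read the key file, refusing to start if group/others can access it."""
    mode = os.stat(path).st_mode
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(path + " must not be accessible by group or others")
    with open(path, "rb") as handle:
        return handle.read().strip()
```

This does not stop anyone who gains the application's own OS user, which is exactly the weakness raised in the question; it only catches accidental loosening of the file permissions.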
I have built a ZF2 application which includes user profiles and I now want to allow users to upload and display their photo as part of their profile. Something like what you see in LinkedIn.
Uploading the photo seems easy enough (using Zend\InputFilter\FileInput()). I have that working fine.
It seems to me that storing them outside of the web root makes a lot of sense. (For example, I don't need to worry about users running wget on the directory.) But how do I then embed these images as part of a web page?
If they were within the web root I would simply do <img width="140" src="/img/filename.jpg"> but obviously that's not possible if they are in a secure location. What's the solution?
You're right. Web developers traditionally obfuscate the paths used to store images to prevent malicious individuals from retrieving them in bulk (as you allude to with your wget comment).
So while storing a user's avatar in /uploads/users/{id}.jpg would be straightforward (and not necessarily inappropriate, depending on your use case), you can use methods to obfuscate the URL. Keep in mind: There are two ways of approaching the problem.
More simply, you want to ensure one cannot determine an asset URL based on "public" information (e.g., the user's primary key). So if a user has a user ID of 37, accessing their avatar won't be as simple as downloading /uploads/users/37.jpg.
A more vigorous approach would be to ensure one cannot relate a URL back to its public information. A URL like /uploads/users/37/this-is-some-gibberish.jpg puts its ownership "on display"; the user responsible for this content must be the user with an ID of 37.
A simple solution
If you'd like to go with the simpler approach, generate a fast hash based on a set property (e.g., the user's ID) and an application-wide salt. For PHP, take a look at "Fastest hash for non-cryptographic uses?".
$salt = 'abc123'; // Change this, keep it secret, store it as env. variable
$user->id; // 37
$hash = crc32($salt . strval($user->id)); // 1202873758
Now we have a unique hash and can store the file at this endpoint: /uploads/users/37/1202873758.jpg. Anytime we need to reference a user's avatar, we can repeat this logic to generate the hash needed to create the filename.
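The same idea can be sketched with a keyed hash. Note that this swaps crc32 for HMAC-SHA256, which makes the filename much harder to guess for anyone who does not know the salt; the path layout follows the example above, while the constant and function names are assumptions. Python is used here only to keep the sketch short.

```python
import hashlib
import hmac

# Hypothetical application-wide secret; load it from the environment in practice.
APP_SALT = b"abc123"


def avatar_path(user_id: int) -> str:
    """Recompute the avatar path from the user ID and the secret salt."""
    digest = hmac.new(APP_SALT, str(user_id).encode("ascii"), hashlib.sha256).hexdigest()
    return "/uploads/users/{}/{}.jpg".format(user_id, digest[:16])
```

Nothing needs to be stored: calling avatar_path(37) at upload time and again at render time yields the same path.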
The collision issue
You might be wondering, why can't I store it at /uploads/users/1202873758.jpg? Won't this keep my user's identity safe? (And if you're not wondering, that's OK, I'll explain for other readers.) We could, but the hash generated is not unique; with a sufficiently large number of users, we will overwrite the file with some other user's avatar, rendering our storage solution impotent.
To be more secretive
To be fair, /uploads/users/1202873758.jpg is a more secretive filename. Perhaps even /uploads/1202873758.jpg would be better. To store files with paths like these, we need to ensure uniqueness, which will require not only generating a hash, but also checking for uniqueness, accommodating for inevitable collisions, and storing the (potentially modified) hash, as well as being able to retrieve the hash from storage as needed.
Depending on your application stack, you could implement this an infinite number of ways, some more suitable than others depending on your needs, so I won't dive into it here.
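As one of those many possible ways, here is a minimal Python sketch of the check-and-retry loop. It uses a random token instead of a hash, and two in-memory structures stand in for a database table with a unique index; all names are hypothetical.

```python
import secrets

# Stand-ins for a database table with a unique index on the file name.
name_by_user = {}
taken_names = set()


def allocate_name(user_id: int, max_attempts: int = 5) -> str:
    """Pick a random file name, retrying on the (unlikely) collision."""
    for _ in range(max_attempts):
        name = secrets.token_hex(16)
        if name not in taken_names:
            taken_names.add(name)
            name_by_user[user_id] = name
            return name
    raise RuntimeError("could not allocate a unique file name")


def avatar_url(user_id: int) -> str:
    """Look the stored name up again whenever the avatar is rendered."""
    return "/uploads/" + name_by_user[user_id] + ".jpg"
```

The trade-off compared with the simple solution is the extra lookup: the name can no longer be recomputed from the user ID, so it has to be stored and fetched.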
If you use ZfcUser, you can use this module: HtProfileImage.
It contains a view helper to display images very easily!
I have a php/js site where the information is encoded and put into the database. The encryption key for the information is randomly generated, then given back to the users after they send a post through a form. The encryption key is not stored in my database at all. A separate, randomly generated ID is formed and stored in the database, used to look up the item itself before deciphering it.
My question is, is it possible at all to look through the logs and find information that would reveal the key? I am trying to make it impossible to read any of the SQL data without either being the person who has the code (who can do whatever he wants with it), or by a brute force attack (unavoidable if someone gets my SQL database)?
Just to re-iterate my steps:
User sends information through POST
The PHP file generates a random ID and access key. The data is encrypted with the access key, then put in the database with the ID as the PRIMARY KEY.
The PHP file echoes just the random ID and the access key.
The website uses jQuery to create a link from the ID and the key: mysite.com?i=cYFogD3Se8RkLSE1CA [9 characters A-Za-z0-9 = ID][9 characters A-Za-z0-9 = key]
Is there any possibility that someone who had access to my server could read the information? I want it to be impossible for me to read the messages myself. The information has to be decodable; it can't be a one-way encoding.
I like your system of the URL containing the decryption key, so that not even you, without having data available only on the user's computer, will be able to access.
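For reference, the link format from the question (9 characters of ID followed by 9 characters of key) can be sketched as below. This is a Python illustration with assumed function names; the encryption step itself is left out.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # A-Z, a-z, 0-9
PART_LENGTH = 9  # as in the question: 9 characters each for ID and key


def random_part() -> str:
    """Draw one 9-character part from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(PART_LENGTH))


def make_link(base_url: str = "https://mysite.com") -> tuple:
    """Return (record_id, key, link); only record_id is stored server-side."""
    record_id, key = random_part(), random_part()
    return record_id, key, "{}?i={}{}".format(base_url, record_id, key)


def split_token(token: str) -> tuple:
    """Split the 'i' parameter back into (record_id, key)."""
    if len(token) != 2 * PART_LENGTH or not all(c in ALPHABET for c in token):
        raise ValueError("malformed token")
    return token[:PART_LENGTH], token[PART_LENGTH:]
```

Nine characters from a 62-symbol alphabet give roughly 53 bits, so a longer key part would make the brute-force case mentioned in the question harder.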
I still see a few gotchas in this.
URLs are often saved in web server logs. If you're logging to disk, and they get the disk, then they get the keys.
If the attacker has access to your database, he may have enough access to your system to secretly install software that logs the URLs. He could even do something as prosaic as turn logging back on.
The person visiting your site will have the URL bookmarked at least (otherwise it is useless to him) and it will likely appear in his browser history. Normally, bookmarks and history are not considered secure data. Thus, an attacker to a user's computer (either by sitting down directly or if the computer is compromised by malware) can access the data as well. If the payload is desirable enough, someone could create a virus or malware that specifically mines for your static authentication token, and could achieve a reasonable hit rate. The URLs could be available to browser plugins, even, or other applications acting under a seemingly reasonable guise of "import your bookmarks now".
So it seems to me that the best security is then for the client to not just have the bookmark (which, while it is information, it is not kept in anyone's head so can be considered "something he has"), but also for him to have to present "something he knows", too. So encrypt with his password, too, and don't save the password. When he presents the URL, ask for a password, and then decrypt with both (serially or in combination) and the data is secure.
Finally, I know that Google's two-factor authentication can be used by third parties (for example, I use it with Dropbox). This creates another "something you have" by requiring the person accessing the resource to have his cell phone, or nothing. Yes, there is recourse if you lose your cell phone, but it usually involves another phone number, or a special Google-supplied one-time long password that has been printed out and stashed in one's wallet.
Let's start with some basic definitions:
Code: Protecting data by translating it to another language, usually a private language. English translated to Spanish is encoded, but it's not very secure since many people understand Spanish.
Cipher: Protecting data by scrambling it using a key. A letter substitution cipher first documented by Julius Caesar is an example of this. Modern techniques involve mathematical manipulation of binary data, in some cases using prime numbers. Asymmetric techniques use a pair of keys; the key that is used to encipher the data cannot decipher it, a different key is needed. This allows the public key to be published and is the basis of SSL browser communication.
Encryption: Protecting data by encoding and/or enciphering it.
All of these terms are often used interchangeably but they are different and the differences are sometimes important. What you are trying to do is to protect the data by a cipher.
If the data is "in clear" then if it is intercepted it is lost. If it is enciphered, then both the data and the key need to be intercepted. If it is enciphered and encoded, then the data, the key and the code need to be intercepted.
Where is your data vulnerable?
The most vulnerable place for any data is when it is in clear in the personal possession of somebody, on a storage device (USB, CD, piece of paper) or inside their head, since that person is vulnerable to inducement or coercion. This is the foundation of Wikileaks - people who are trusted with in-confidence information are induced to betray that confidence - the ethics of this I leave to your individual consciences.
When it is in transit between the client and the server and vice versa. Except for data of national security importance the SSL method of encryption should be adequate.
When it is in memory in your program. The source code of your program is the best place to store your keys, however, they themselves need to be stored encrypted with a password that you enter each time your program runs (best), that is entered when you compile and publish or that is embedded in your code (worst). Unless you have a very good reason one key should be adequate; not one per user. You should also keep in-memory data encrypted except when you actually need it and you should use any in-memory in-clear data structures immediately and destroy them as soon as you are finished with them. The key has to be stored somewhere or else the data is irrecoverable. But consider, who has access to the source code (including backups and superseded versions) and how can you check for backdoors or trojans?
When it is in transit between your program's machine and the data store. If you only send encrypted data between the program and the data store and DO NOT store the key in the data store this should be OK.
When it is stored in the data store. Ditto.
Do not overlook physical security, quite often the easiest way to steal data is to walk up to the server and copy the hard drive. Many companies (and sadly defence/security forces) spend millions on on-line data security and then put their data in a room with no lock. They also have access protocols that a 10 year old child could circumvent.
You now have lovely encrypted data - how are you going to stop your program from serving it up in the clear to anyone who asks for it?
This brings us to identification, validation and authorisation. More definitions:
Identification: A claim made by a person that they are so-and-so. This is usually handled in a computer program by a user name. In physical security applications it is by a person presenting themselves and saying "I am so-and-so"; this can be explicit, by a verbal statement or by presenting an identity document like a passport, or implicit, by a guard you know recognising you.
Validation: This is the proof that a person is who they say they are. In a computer this is the role of the password; more accurately, it proves only that they know the password of the person they claim to be, which is the big, massive, huge and insurmountable problem in the whole thing. In physical security it is by comparing physical metrics (appearance, height etc) as documented in a trusted document (like a passport) against the claim; you need to have protocols in place to ensure that you can trust the document. Incidentally, this is the main cause of problems with face recognition technology to identify bad guys – it uses a validation technique to try and identify someone. “This guy looks like Bad Guy #1”; guess what? So do a lot of people in a population of 7 billion.
Authorisation: Once a person has been identified and validated they are then given authorisation to do certain things and go to certain places. They may be given a temporary identification document for this; think of a visitor id badge or a cookie. Depending on where they go they may be required to reidentify and revalidate themselves; think of a bank’s website; you identify and validate yourself to see your bank accounts and you do it again to make transfers or payments.
By and large, this is the weakest part of any computer security system; it is hard for me to steal your data, it is far easier for me to steal your identity and have the data given to me.
In your case, this is probably not your concern, providing that you do the normal thing of allowing the user to set, change and retrieve their password in the normal commercial manner, you have probably done all you can.
Remember, data security is a trade off between security on the one hand and trust and usability on the other. Make things too hard (like high complexity passwords for low value data) and you compromise the whole system (because people are people and they write them down).
Like everything in computers – users are a problem!
Why are you protecting this data, and what are you willing to spend to do so?
This is a classic risk management question. In effect, you need to consider the adverse consequences of losing this data, the risk of this happening with your present level of safeguards and if the reduction in risk that additional safeguards will cost is worth it.
Losing the data can mean any or all of:
Having it made public
Having it fall into the wrong person’s hands
Having it destroyed maliciously or accidentally. (Backup, people!)
Having it changed. If you know it has been changed this is equivalent to losing it; if you don’t this can be much, much worse since you may be acting on false data.
This type of thinking is what leads to the classification of data in defence and government into Top Secret, Secret, Restricted and Unrestricted (Australian classifications). The human element intervenes again here; due to the nature of bureaucracy there is no incentive to give a document a low classification and plenty of disincentive, so documents are routinely over-classified. As a result, many documents with a Restricted classification end up being distributed to people who don’t have the appropriate clearance, simply to make the damn thing work.
You can think of this as a hierarchy as well; my personal way of thinking about it is:
Defence of the Realm: Compromise will have serious adverse consequences for the strategic survival of my country/corporation/family whatever level you are thinking about.
Life and Death: Compromise will put someone’s life or health in danger.
Financial: Compromise will allow someone to have money/car/boat/space shuttle stolen.
Commercial: Compromise will cause loss of future financial gain.
Humiliating: Compromise will cause embarrassment. Of course, if you are a politician this is probably No 1.
Personal: These are details that you would rather not have released but aren’t particularly earth shaking. I would put my personal medical history in here but, the impact of contravening privacy laws may push it up to Humiliating (if people find out) or Financial (if you get sued or prosecuted).
Private: This is stuff that is nobody else’s business but doesn’t actually hurt you if they find out.
Public: Print it in the paper for all anyone cares.
Irrespective of the level, you don’t want any of this data lost or changed but if it is, you need to know that this has happened. For the Nazis, having their Enigma cipher broken was bad; not knowing it had happened was catastrophic.
In the comments below, I have been asked to describe best practice. This is impossible without knowing the risk of the data (and risk tolerance of the organisation). Spending too much on data security is as bad as spending too little.
First and most importantly, you need a really good, watertight legal disclaimer.
Second, don’t store the user’s data at all.
Instead, when the user submits the data (using SSL), generate a hash of the SessionID and your system’s datetime. Store this hash in your table along with the datetime and get the record ID. Encrypt the user’s data with this hash, generate a URL with the record ID and the encrypted data within it, and send this back to the user (again using SSL). Security of this URL is now the user’s problem and you no longer have any record of what they sent (make sure it is not logged).
Routinely, delete stale (4h,24h?) records from the database.
When a retrieval request comes in (using SSL) lookup the hash, if it’s not there tell the user the URL is stale. If it is, decrypt the data they sent and send it back (using SSL) and delete the record from your database.
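The record lifecycle described in the last three paragraphs can be sketched as follows (Python, standard library). Only the bookkeeping is shown: encrypting and decrypting the user's data with the stored hash is left out and would need a vetted cryptography library. The dictionary stands in for the database table, and the 4-hour limit is one of the two values floated above.

```python
import hashlib
import time

MAX_AGE = 4 * 60 * 60  # seconds; assumed value, see the 4h/24h options above

# Stand-in for the database table: record_id -> (hash, created_at)
records = {}
_next_id = 1


def create_record(session_id: str, now: float) -> tuple:
    """Store only a hash and a timestamp; return (record_id, hash)."""
    global _next_id
    digest = hashlib.sha256((session_id + str(now)).encode("utf-8")).hexdigest()
    record_id, _next_id = _next_id, _next_id + 1
    records[record_id] = (digest, now)
    return record_id, digest


def purge_stale(now: float) -> None:
    """Routinely delete records older than MAX_AGE."""
    for record_id in [r for r, (_, created) in records.items() if now - created > MAX_AGE]:
        del records[record_id]


def redeem(record_id: int, now: float):
    """Return the hash once and delete the record; None means the URL is stale."""
    purge_stale(now)
    entry = records.pop(record_id, None)
    return entry[0] if entry else None
```

In a request handler, `now` would simply be time.time(); it is passed in explicitly here so that the expiry behaviour is easy to exercise.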
Let's have a little think
Use SSL - Data is encrypted
Use username/password for authorisation
If somebody breaks that, you do have a problem with security
Spend the effort on fixing that. Disaster recovery is a waste of effort in this case. Just get the base cases correct.
I'm going to be implementing a PHP/mySQL setup to store credit card information.
It seems like AES_ENCRYPT/AES_DECRYPT is the way to go,
but I'm still confused on one point:
How do I keep the encryption key secure?
Hardwiring it into my PHP scripts (which will live on the same server as the db) seems like a major security hole.
What's the "best practice" solution here?
You should think long and hard about whether you REALLY need to keep the CC#. If you don't have a great reason, DON'T! Every other week you hear about some company being compromised and CC#'s being stolen. All these companies made a fatal mistake - they kept too much information. Keep the CC# until the transaction clears. After that, delete it.
As far as securing the server, the best course of action is to secure the hardware and use the internal system socket to MySQL, and make sure to block any network access to the MySQL server. Make sure you're using both your system permissions and the MySQL permissions to allow as little access as needed. For some scripts, you might consider write-only authentication. There's really no encryption method that will be foolproof (as you will always need to decrypt, and thus must store the key). This is not to say you shouldn't - you can store your key in one location and if you detect system compromise you can destroy the file and render the data useless.
For MySQL, there are six easy steps you can take to secure your sensitive data.
Step 1: Remove wildcards in the grant tables
Step 2: Require the use of secure passwords
Note: Use the MySQL “--secure-auth” option to prevent the use of older, less secure MySQL password formats.
Step 3: Check the permissions of configuration files
Step 4: Encrypt client-server transmissions
Step 5: Disable remote access
Step 6: Actively monitor the MySQL access log
I agree: don't store the CC if you don't need to. But if you really have to, make sure the file that holds the key is not accessible from the web. You can write a binary that returns the key, so that it is not stored in clear text. But if your server is compromised, it's still easy to get it.
The security you need depends on your application. For example, if the only time the CC# will be used is when the user is logged in (think online store type scenario), then you can encrypt the CC# with a hash of the user's plain-text password, a per-user salt, and a dedicated CC# salt. Do not store this value permanently.
Since you're not storing this value, the only time you can get it is when the user enters their password to log in. Just make sure you have good session expiration and garbage collection policies in place.
If this situation does not apply to you, please describe your situation in more detail so we can provide a more appropriate answer.
Put your database files outside the computer, say on an external HDD, and keep it in a safe place. This works only if you can develop the project at the one place where the external drive is kept :)
Or you can at least protect those files using file system encryption tools like
https://itsfoss.com/password-protect-folder-linux/
In case of production environment I agree with Kyle Cronin.