I asked a question here and I manage to partially implement the advice. Data is now stored encrypted in binary field (varbinary(500)), after I remove the aes-256 encryption and I leave aes-128 (default) codeigniter encryption.
However, I have some questions, and I can't find answers, since I can not find many articles on this subject, so If anyone can answer my questions, or point me to a book, or any other literature for further reading, I would be very grateful.
Why encrypted data must be stored in binary type field? What is wrong with storing it in longtext, or varchar? Does that make the encryption worthless?
Why I must first encode the variable and then encrypt it when I store the data in the binary type of field, and I don't have to do that when I store the data in varchar field?
base64_encode($clientName);
$encClientName = $this->encryption->encrypt($clientName);
In my previous question (see the link on the top) I have been advised to use nonce. Since I didn't know how to use that with codeigniter library, I didn't implement that part. Does that make my data less secure? Can anyone post any snippet code of how to use nonce with the codeigniter?
Again, any link to reading material on this subject (storing encrypted data in the database with php) will be deeply appreciated.
Why encrypted data must be stored in binary type field? What is wrong with storing it in longtext, or varchar? Does that make the encryption worthless?
Encrypted data is binary. It will frequently contain byte sequences which are invalid in your text encoding, making them impossible to insert into a column which expects a string (like VARCHAR or TEXT).
The data type you probably want is either VARBINARY (which is similar to VARCHAR, but not a string) or BLOB (likewise, but for TEXT -- there's also MEDIUMBLOB, LONGBLOB, etc).
Why I must first encode the variable and then encrypt it when I store the data in the binary type of field, and I don't have to do that when I store the data in varchar field?
You don't. This is backwards.
If you were going to use a string-type column to store encrypted data, you could "fake it" by Base64 encoding the data after encryption. However, you're still better off using a binary-type column, at which point you don't need any additional encoding.
In my previous question (see the link on the top) I have been advised to use nonce. Since I didn't know how to use that with codeigniter library, I didn't implement that part. Does that make my data less secure?
Based on what I'm seeing in the documentation, I think the CodeIgniter Encryption library handles this for you by default. You shouldn't have to do anything additional.
In addition to duskwuffs answer, I covered your questions from a more crypto-related viewpoint. He just managed to post a minute before I did :)
Encrypted data must be stored in a binary type field due to the way that Character Encodings work. I recommend you read, if you haven't already, this excellent article by Joel Spolsky that details this very well.
It is important to remember that encryption algorithms operate on raw binary data. That is, a bit string. Literal 1's and 0's that can be interpreted in many ways. You can represent this data as unsigned byte values (255, 255), Hex (0xFF, 0xFF), whatever, they are really just bit strings underneath. Another property of encryption algorithms (or good ones, at least) is that the result of encryption should be indistinguishable from random data. That is, given an encrypted file and a blob of CSPRNG generated random data that have the same length, you should not be able to determine which is which.
Now lets presume you wanted to store this data in a field that expects UTF8 strings. Because the bit string we store in this field could contain any possible sequence of bytes, as we discussed above, we can't assume that the sequence of bytes that we store will denote actual valid UTF8 characters. The implication of this is that binary data encoded to UTF8 and then decoded back to binary is not guaranteed to give you the original binary data. In fact, it rarely will.
Your second question is also somewhat to do with encodings, but the encoding here is base64. Base64 is a encoding that plays very nicely with (in fact, it was designed for) binary data. Base64 is a way to represent binary data using common characters (a-z, A-Z, 0-9 and +, /) in most implementations. I am willing to bet that the encrypt function you are using either uses base64_decode or one of the functions it calls does. What you should actually be interested in is whether or not the output of the encrypt function is a base64 string or actual binary data, as this will affect the type of data field you use in your database (e.g. binary vs varchar).
I believe in your last question you stated that you were using CTR, so the following applies to the nonce used by CTR only.
CTR works by encrypting a counter value, and then xor-ing this encrypted counter value with your data. This counter value is made up of two things, the nonce, and the actual value of the counter, which normally starts at 0. Technically, your nonce can be any length, but I believe a common value is 12 bytes. Because the we are discussing AES, the total size of the counter value should be 16 bytes. That is, 12 bytes of nonce and 4 bytes of counter.
This is the important part. Every encryption operation should:
Generate a new 12 byte nonce to use for that operation.
Your implementation should add the counter and perform the actual encryption.
Once you have the final ciphertext, prepend the nonce to this ciphertext so that the result is len(ciphertext) + 12) bytes long.
Then store this final result in your database.
Repeating a nonce, using a static nonce, or performing more than 2^32 encryption operations with a single 12 byte nonce will make your ciphertext vulnerable.
Related
I've just had an interesting little problem.
Using Laravel 4, I encrypt some entries before adding them to a db, including email address.
The db was setup with the default varchar length of 255.
I've just had an entry that encrypted to 309 characters, blowing up the encryption by cutting off the last 50-odd characters in the db.
I've (temporarily) fixed this by simply increasing the varchar length to 500, which should - in theory - cover me from this, but I want to be sure.
I'm not sure how the encryption works, but is there a way to tell what maximum character length to expect from the encrypt output for the sake of setting my database?
Should I change my field type from varchar to something else to ensure this doesn't happen again?
Conclusion
First, be warned that there has been quite a few changes between 4.0.0 and 4.2.16 (which seems to be the latest version).
The scheme starts with a staggering overhead of 188 characters for 4.2 and about 244 for 4.0 (given that I did not forget any newlines and such). So to be safe you will probably need in the order of 200 characters for 4.2 and 256 characters for 4.0 plus 1.8 times the plain text size, if the characters in the plaintext are encoded as single bytes.
Analysis
I just looked into the source code of Laravel 4.0 and Laravel 4.2 with regards to this function. Lets get into the size first:
the data is serialized, so the encryption size depends on the size of the type of the value (which is probably a string);
the serialized data is PKCS#7 padded using Rijndael 256 or AES, so that means adding 1 to 32 bytes or 1 to 16 bytes - depending on the use of 4.0 or 4.2;
this data is encrypted with the key and an IV;
both the ciphertext and IV are separately converted to base64;
a HMAC using SHA-256 over the base64 encoded ciphertext is calculated, returning a lowercase hex string of 64 bytes
then the ciphertext consists of base64_encode(json_encode(compact('iv', 'value', 'mac'))) (where the value is the base 64 ciphertext and mac is the HMAC value, of course).
A string in PHP is serialized as s:<i>:"<s>"; where <i> is the size of the string, and <s> is the string (I'm presuming PHP platform encoding here with regards to the size). Note that I'm not 100% sure that Laravel doesn't use any wrapping around the string value, maybe somebody could clear that up for me.
Calculation
All in all, everything depends quite a lot on character encoding, and it would be rather dangerous for me to make a good estimation. Lets assume a 1:1 relation between byte and character for now (e.g. US-ASCII):
serialization adds up to 9 characters for strings up to 999 characters
padding adds up to 16 or 32 bytes, which we assume are characters too
encryption keeps data the same size
base64 in PHP creates ceil(len / 3) * 4 characters - but lets simplify that to (len * 4) / 3 + 4, the base 64 encoded IV is 44 characters
the full HMAC is 64 characters
the JSON encoding adds 3*5 characters for quotes and colons, plus 4 characters for braces and comma's around them, totaling 19 characters (I'm presuming json_encode does not end with a white space here, base 64 again adds the same overhead
OK, so I'm getting a bit tired here, but you can see it at least twice expands the plaintext with base64 encoding. In the end it's a scheme that adds quite a lot of overhead; they could just have used base64(IV|ciphertext|mac) to seriously cut down on overhead.
Notes
if you're not on 4.2 now, I would seriously consider upgrading to the latest version because 4.2 fixes quite a lot of security issues
the sample code uses a string as key, and it is unclear if it is easy to use bytes instead;
the documentation does warn against key sizes other than the Rijndael defaults, but forgets to mention string encoding issues;
padding is always performed, even if CTR mode is used, which kind of defeats the purpose;
Laravel pads using PKCS#7 padding, but as the serialization always seems to end with ;, that was not really necessary;
it's a nice thing to see authenticated encryption being used for database encryption (the IV wasn't used, fixed in 4.2).
#MaartenBodewes' does a very good job at explaining how long the actual string probably will be. However you can never know it for sure, so here are two options to deal with the situation.
1. Make your field text
Change the field from a limited varchar to an "self-expanding" text. This is probably the simpler one, and especially if you expect rather long input I'd definitely recommend this.
2. Just make your varchar longer
As you did already, make your varchar longer depending on what input length you expect/allow. I'd multiply by a factor of 5.
But don't stop there! Add a check in your code to make sure the data doesn't get truncated:
$encrypted = Crypt::encrypt($input);
if(strlen($encrypted) > 500){
// do something about it
}
What can you do about it?
You could either write an error to the log and add the encrypted data (so you can manually re-insert it after you extended the length of your DB field)
Log::error('An encrypted value was too long for the DB field xy. Length: '.strlen($encrypted).' Data: '.$encrypted);
Obviously that means you have to check the logs frequently (or send them to you by mail) and also that the user could encounter errors while using the application because of the incorrect data in your DB.
The other way would be to throw an exception (and display an error to the user) and of course also write it to the log so you can fix it...
Anyways
Whether you choose option 1 or 2 you should always restrict the accepted length of your input fields. Server side and client side.
I've been playing around with php mcrypt over the weekend with AES used to encrypt text strings with a key. Later I worked up a tiny php tool to encrypt / decrypt your strings with AES/mcrypt now when the key is "wrong" and the text doesn't get decrypted, you end up with what I think is binary from what I've read around (http://i.imgur.com/jF8cZMZ.png), is there anyway in PHP to check if the variable holds binary or a properly decoded string?
My apologies if the title and the intro are a bit misleading.
When you encrypt text and then try to decrypt it, you will get the same text, but when you try to decrypt random data, there is a small chance that the result will be text (decreasing with length of data). You haven't specified what kind of data we are talking about, but determining if the decryption is successful by applying a heuristic is a bad idea. It is slow and may lead to false positives.
You should have a checksum or something like that to determine if the decrypted result is valid. This could be easily done by running sha1 on the plaintext data, prepend the result to the text and encrypt it as a whole. When you decrypt it, you can split (sha1 output has a fixed size, so you know where to split) the resulting string run sha1 on the text part and compare with the hash part. If it matches you have a valid result. You can of course improve the security a little by using SHA-256 or SHA-512.
That's is just one way of doing it, but might not be the best. Better ways would be to use an authenticated mode of operation for AES like GCM or CCM, or use encrypt-then-MAC with a good MAC function like HMAC-SHA512.
With using the approaches above you're free to use any kind of data to encrypt, because you're not limited to determining if it is text or not anymore.
This question already has answers here:
What is base 64 encoding used for?
(19 answers)
Closed 8 years ago.
Hi my question is that does base64_encode does unique data every time we run the script?
Below is the code.
<?php
$id = 1;
echo base64_encode($id);
?>
If it does not provide the unique data every time then what is the point in encoding the string and passing in url. Does that make url safe??
Base64 encoding is not a method of encryption. It is used for encoding binary data into text, which makes it safer to transmit over the internet.
If you stream bits, some protocols may interpret it differently. Streaming text is much more reliable.
What is base 64 encoding used for?
If you need true encryption, you need to use something which hashes based on a salt you can hide from other users, such as the mcrypt library.
http://php.net/manual/en/book.mcrypt.php
base64-encoding does not provide unique data. Its purpose is to provide a compact representation of binary data in string form. In your example, you are encoding non-binary data, so it is not very practical. However, if you wanted to encode a string containing a newline and punctuation and pass it via the URL, you cannot send the binary data directly.
For example, if you had the string Hello, World!!\n there would be three punctuation marks, a space and a newline that all need to be URL-encoded. Doing that gives the result:
Hello%2C+World%21%21%0A
Which is 23 bytes long.
On the other hand if you were to base64-encode the same string, the result would be:
SGVsbG8sIFdvcmxkISEK
Which is 20 characters, or about 13% shorter. This adds up quickly if you've got a lot of non-alphanumeric characters or a large amount of data.
So the primary advantage of base64 encoding is its slightly more compact representation of certain data.
Base64 encoding is a way of representing data using only a limited set of characters. You use it when you need to store data in something such as a cookie that can't handle the data in its original format.
I'm trying to use mcrypt_create_iv to generate random salts. When I test to see if the salt is generated by echo'ing it out, it checks out but it isn't the required length which I pass as a parameter to it (32), instead its less than that.
When I store it in my database table however, it shows up as something like this K??5P?M???4?o???"?0??
I'm sure it's something to do with the database, but I tried to change the collation of it to correspond with the config settings of CI, which is utf8_general_ci, but it doesn't solve the problem, instead it generates a much smaller salt.
Does anyone know of what may be wrong? Thanks for any feedback/help
The function mcrypt_create_iv() will return a binary string, containing \0 and other unreadable characters. Depending on how you want to use the salts, you first have to encode those byte strings, to an accepted alphabet. It is also possible to store binary strings in the database, but of course you will have a problem to display them.
Since salts are normally used for password storing, i would recommend to have a look at PHP's function password_hash(), it will generate a salt automatically and includes it in the resulting hash-value, so you don't need a separate database field for the salt.
I'm planning to generate and store salt in database and pass it back to UI. I'm using openssl_random_pseudo_bytes function. Looking at the salt result I see following in PHP debug window
and the following in database record
I'm planning to recreate hash on client-side using this salt.
I'm curious if this random string of is okay to work with when passing it back and forth from server to UI? Or should this be encoded before passing it back and forth?
Following is database column definition for SALT column
You should change the column type to VARBINARY (or just BINARY if the length is fixed) instead of VARCHAR. Since the contents of the column are not meant to represent text, binary data types will be more efficient.
In any case, as long as there are no misconfiguration problem you should have no correctness issues in your application even with the column having its current data type.