Laravel 4 Encryption: how many characters to expect - php

I've just had an interesting little problem.
Using Laravel 4, I encrypt some entries before adding them to a db, including email address.
The db was setup with the default varchar length of 255.
I've just had an entry that encrypted to 309 characters, blowing up the encryption by cutting off the last 50-odd characters in the db.
I've (temporarily) fixed this by simply increasing the varchar length to 500, which should - in theory - cover me from this, but I want to be sure.
I'm not sure how the encryption works, but is there a way to tell what maximum character length to expect from the encrypt output for the sake of setting my database?
Should I change my field type from varchar to something else to ensure this doesn't happen again?

Conclusion
First, be warned that there have been quite a few changes between 4.0.0 and 4.2.16 (which seems to be the latest version).
The scheme starts with a staggering overhead of 188 characters for 4.2 and about 244 for 4.0 (provided I did not overlook any newlines and such). So to be safe you will probably need on the order of 200 characters for 4.2 and 256 characters for 4.0, plus 1.8 times the plaintext size, if the characters in the plaintext are encoded as single bytes.
Analysis
I just looked into the source code of Laravel 4.0 and Laravel 4.2 with regard to this function. Let's get into the size first:
the data is serialized, so the encryption size depends on the size of the type of the value (which is probably a string);
the serialized data is PKCS#7 padded using Rijndael 256 or AES, so that means adding 1 to 32 bytes or 1 to 16 bytes - depending on the use of 4.0 or 4.2;
this data is encrypted with the key and an IV;
both the ciphertext and IV are separately converted to base64;
an HMAC using SHA-256 over the base64 encoded ciphertext is calculated, returning a lowercase hex string of 64 characters;
then the ciphertext consists of base64_encode(json_encode(compact('iv', 'value', 'mac'))) (where the value is the base 64 ciphertext and mac is the HMAC value, of course).
A string in PHP is serialized as s:<i>:"<s>"; where <i> is the size of the string, and <s> is the string (I'm presuming PHP platform encoding here with regards to the size). Note that I'm not 100% sure that Laravel doesn't use any wrapping around the string value, maybe somebody could clear that up for me.
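As a quick check of the serialization format described above, PHP's built-in serialize() can be called directly. This is only a minimal sketch: the sample address is made up, and it says nothing about whether Laravel wraps the value any further.

```php
<?php
// Hypothetical sample value; any string works.
$email = 'user@example.com';

// serialize() produces s:<byte length>:"<string>";
$serialized = serialize($email);
echo $serialized, "\n";

// The overhead is the fixed characters s: : " " ; plus the digits of the length.
$overhead = strlen($serialized) - strlen($email);
echo "overhead: $overhead\n";
```

For a 16-character string the overhead is 8 characters; it grows by one for every extra digit in the length.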
Calculation
All in all, everything depends quite a lot on character encoding, so it would be rather risky for me to give a precise estimate. Let's assume a 1:1 relation between byte and character for now (e.g. US-ASCII):
serialization adds up to 9 characters for strings up to 999 characters
padding adds up to 16 or 32 bytes, which we assume are characters too
encryption keeps data the same size
base64 in PHP creates ceil(len / 3) * 4 characters, but let's simplify that to (len * 4) / 3 + 4; the base64 encoded IV is 44 characters for a 32-byte IV (24 for a 16-byte IV)
the full HMAC is 64 characters
the JSON encoding adds 3*5 characters for quotes and colons, 4 characters for the braces and commas around them, and 10 characters for the key names iv, value and mac, totaling 29 characters (I'm presuming json_encode does not add any whitespace here); base64 then adds the same overhead again
OK, so I'm getting a bit tired here, but you can see that the plaintext is expanded by base64 encoding at least twice. In the end it's a scheme that adds quite a lot of overhead; they could just have used base64(IV|ciphertext|mac) to seriously cut down on overhead.
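The steps above can be turned into a rough length estimate. This is a minimal sketch under the assumptions the answer already makes (one byte per character, no extra wrapping around the serialized string); the function names are made up for illustration. It counts 29 characters of JSON overhead because the key names iv, value and mac are part of the payload. Since json_encode() escapes each / as \/ unless told otherwise, real output may come out somewhat longer, so treat the result as a lower bound and leave headroom.

```php
<?php
// Rough estimate of the encrypted payload length, following the steps listed above.
// Assumptions: one byte per character, no extra wrapping around the serialized
// string, and JSON output without whitespace or escaped slashes.
function base64Length($bytes)
{
    return (int) (ceil($bytes / 3) * 4);
}

function estimateEncryptedLength($plainLength, $blockSize)
{
    // s:<digits>:"<string>";  => 6 fixed characters plus the digits of the length
    $serialized = 6 + strlen((string) $plainLength) + $plainLength;

    // PKCS#7 always adds between 1 and $blockSize bytes
    $padded = ((int) floor($serialized / $blockSize) + 1) * $blockSize;

    $value = base64Length($padded);     // base64 of the ciphertext
    $iv    = base64Length($blockSize);  // the IV is one block long
    $mac   = 64;                        // SHA-256 HMAC as lowercase hex

    // {"iv":"...","value":"...","mac":"..."} has 29 characters besides the three values
    $json = 29 + $iv + $value + $mac;

    return base64Length($json);
}

echo estimateEncryptedLength(0, 16), "\n";   // 16-byte blocks
echo estimateEncryptedLength(0, 32), "\n";   // 32-byte blocks
echo estimateEncryptedLength(50, 16), "\n";  // 50-character plaintext
```

With a block size of 16 an empty string comes out at 188 characters, and with a block size of 32 at 244, which matches the overhead figures given in the conclusion.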
Notes
if you're not on 4.2 now, I would seriously consider upgrading to the latest version because 4.2 fixes quite a lot of security issues
the sample code uses a string as key, and it is unclear if it is easy to use bytes instead;
the documentation does warn against key sizes other than the Rijndael defaults, but forgets to mention string encoding issues;
padding is always performed, even if CTR mode is used, which kind of defeats the purpose;
Laravel pads using PKCS#7 padding, but as the serialization always seems to end with ;, that was not really necessary;
it's a nice thing to see authenticated encryption being used for database encryption (the IV wasn't used, fixed in 4.2).

@MaartenBodewes does a very good job of explaining how long the actual string will probably be. However, you can never know it for sure, so here are two options to deal with the situation.
1. Make your field text
Change the field from a limited varchar to a "self-expanding" text. This is probably the simpler one, and especially if you expect rather long input I'd definitely recommend this.
2. Just make your varchar longer
As you did already, make your varchar longer depending on what input length you expect/allow. I'd multiply by a factor of 5.
But don't stop there! Add a check in your code to make sure the data doesn't get truncated:
$encrypted = Crypt::encrypt($input);
// 500 must match the length of the varchar column
if (strlen($encrypted) > 500) {
    // do something about it
}
What can you do about it?
You could write an error to the log and include the encrypted data (so you can manually re-insert it after you have extended the length of your DB field):
Log::error('An encrypted value was too long for the DB field xy. Length: '.strlen($encrypted).' Data: '.$encrypted);
Obviously that means you have to check the logs frequently (or have them mailed to you), and also that the user could encounter errors while using the application because of the incorrect data in your DB.
The other way would be to throw an exception (and display an error to the user) and of course also write it to the log so you can fix it...
Anyways
Whether you choose option 1 or 2 you should always restrict the accepted length of your input fields. Server side and client side.

Related

Encryption questions

I asked a question here and I managed to partially implement the advice. The data is now stored encrypted in a binary field (varbinary(500)), after I removed the aes-256 encryption and kept the default aes-128 CodeIgniter encryption.
However, I have some questions that I can't find answers to, since I cannot find many articles on this subject. If anyone can answer my questions, or point me to a book or any other literature for further reading, I would be very grateful.
Why must encrypted data be stored in a binary type field? What is wrong with storing it in longtext or varchar? Does that make the encryption worthless?
Why must I first encode the variable and then encrypt it when I store the data in a binary type field, when I don't have to do that for a varchar field?
base64_encode($clientName);
$encClientName = $this->encryption->encrypt($clientName);
In my previous question (see the link at the top) I was advised to use a nonce. Since I didn't know how to use that with the CodeIgniter library, I didn't implement that part. Does that make my data less secure? Can anyone post a code snippet showing how to use a nonce with CodeIgniter?
Again, any link to reading material on this subject (storing encrypted data in the database with php) will be deeply appreciated.
Why must encrypted data be stored in a binary type field? What is wrong with storing it in longtext or varchar? Does that make the encryption worthless?
Encrypted data is binary. It will frequently contain byte sequences which are invalid in your text encoding, making them impossible to insert into a column which expects a string (like VARCHAR or TEXT).
The data type you probably want is either VARBINARY (which is similar to VARCHAR, but not a string) or BLOB (likewise, but for TEXT -- there's also MEDIUMBLOB, LONGBLOB, etc).
Why must I first encode the variable and then encrypt it when I store the data in a binary type field, when I don't have to do that for a varchar field?
You don't. This is backwards.
If you were going to use a string-type column to store encrypted data, you could "fake it" by Base64 encoding the data after encryption. However, you're still better off using a binary-type column, at which point you don't need any additional encoding.
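To illustrate the point about text columns, here is a minimal sketch with a handful of bytes standing in for real ciphertext. The byte values are made up and chosen only because they are not valid UTF-8.

```php
<?php
// Stand-in for raw ciphertext: these bytes do not form valid UTF-8.
$raw = "\xff\xfe\x80\x00\xc3";

// preg_match() with the /u modifier returns false when the subject is not valid UTF-8.
$isValidUtf8 = preg_match('//u', $raw) === 1;
var_dump($isValidUtf8); // bool(false)

// Base64 turns the bytes into plain ASCII that a text column can hold...
$encoded = base64_encode($raw);
echo $encoded, "\n";

// ...and decoding restores the original bytes exactly.
var_dump(base64_decode($encoded, true) === $raw); // bool(true)
```

The price of the base64 route is size: every 3 bytes of ciphertext become 4 characters, which a binary column avoids.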
In my previous question (see the link at the top) I was advised to use a nonce. Since I didn't know how to use that with the CodeIgniter library, I didn't implement that part. Does that make my data less secure?
Based on what I'm seeing in the documentation, I think the CodeIgniter Encryption library handles this for you by default. You shouldn't have to do anything additional.
In addition to duskwuff's answer, I covered your questions from a more crypto-related viewpoint. He just managed to post a minute before I did :)
Encrypted data must be stored in a binary type field due to the way that Character Encodings work. I recommend you read, if you haven't already, this excellent article by Joel Spolsky that details this very well.
It is important to remember that encryption algorithms operate on raw binary data. That is, a bit string. Literal 1's and 0's that can be interpreted in many ways. You can represent this data as unsigned byte values (255, 255), Hex (0xFF, 0xFF), whatever, they are really just bit strings underneath. Another property of encryption algorithms (or good ones, at least) is that the result of encryption should be indistinguishable from random data. That is, given an encrypted file and a blob of CSPRNG generated random data that have the same length, you should not be able to determine which is which.
Now let's presume you wanted to store this data in a field that expects UTF8 strings. Because the bit string we store in this field could contain any possible sequence of bytes, as we discussed above, we can't assume that the sequence of bytes that we store will denote actual valid UTF8 characters. The implication of this is that binary data encoded to UTF8 and then decoded back to binary is not guaranteed to give you the original binary data. In fact, it rarely will.
Your second question is also somewhat to do with encodings, but the encoding here is base64. Base64 is an encoding that plays very nicely with (in fact, it was designed for) binary data. It is a way to represent binary data using common characters (a-z, A-Z, 0-9 and, in most implementations, + and /). I am willing to bet that the encrypt function you are using either calls base64_encode itself or one of the functions it calls does. What you should actually be interested in is whether the output of the encrypt function is a base64 string or actual binary data, as this will affect the type of data field you use in your database (e.g. binary vs varchar).
I believe in your last question you stated that you were using CTR, so the following applies to the nonce used by CTR only.
CTR works by encrypting a counter value, and then xor-ing this encrypted counter value with your data. This counter value is made up of two things: the nonce, and the actual value of the counter, which normally starts at 0. Technically, your nonce can be any length, but I believe a common value is 12 bytes. Because we are discussing AES, the total size of the counter value should be 16 bytes. That is, 12 bytes of nonce and 4 bytes of counter.
This is the important part. Every encryption operation should:
Generate a new 12 byte nonce to use for that operation.
Your implementation should add the counter and perform the actual encryption.
Once you have the final ciphertext, prepend the nonce to this ciphertext so that the result is len(ciphertext) + 12 bytes long.
Then store this final result in your database.
Repeating a nonce, using a static nonce, or encrypting more than 2^32 blocks (64 GiB with AES) under a single 12-byte nonce will make your ciphertext vulnerable.
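The nonce handling described above could look roughly like this. It is a sketch, not the CodeIgniter library's own mechanism: it assumes PHP 7+ with the OpenSSL extension, AES-128 in CTR mode, and a made-up key, and the function names are hypothetical. CTR on its own provides no integrity protection, so a real system would add a MAC or use an authenticated mode.

```php
<?php
// Encrypt with a fresh 12-byte nonce and store the nonce in front of the ciphertext.
function ctrEncrypt($plaintext, $key)
{
    $nonce = random_bytes(12);            // new nonce for every call
    $counterBlock = $nonce . "\0\0\0\0";  // 12-byte nonce + 4-byte counter starting at 0
    $ciphertext = openssl_encrypt($plaintext, 'aes-128-ctr', $key, OPENSSL_RAW_DATA, $counterBlock);

    return $nonce . $ciphertext;
}

// Split the stored value back into nonce and ciphertext, then decrypt.
function ctrDecrypt($stored, $key)
{
    $nonce = substr($stored, 0, 12);
    $ciphertext = substr($stored, 12);

    return openssl_decrypt($ciphertext, 'aes-128-ctr', $key, OPENSSL_RAW_DATA, $nonce . "\0\0\0\0");
}

$key = random_bytes(16);                  // hypothetical key, normally loaded from config
$stored = ctrEncrypt('client name', $key);

var_dump(strlen($stored) === strlen('client name') + 12); // nonce + ciphertext
var_dump(ctrDecrypt($stored, $key) === 'client name');
```

The stored value is raw binary, so it belongs in a VARBINARY or BLOB column unless it is base64 encoded first.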

Key length issue: AES encryption on phpseclib and decryption on PyCrypto

I am working on a data intensive project where I have been using PHP for fetching data and encrypting it using phpseclib. A chunk of the data has been encrypted in AES with the ECB mode -- however the key length is only 10. I am able to decrypt the data successfully.
However, I need to use Python in the later stages of the project and consequently need to decrypt my data using it. I tried employing PyCrypto but it tells me the key length must be 16, 24 or 32 bytes long, which is not the case. According to the phpseclib documentation the "keys are null-padded to the closest valid size", but I'm not sure how to implement that in Python. Simply extending the length of the string with 6 spaces is not working.
What should I do?
I strongly recommend you adjust your PHP code to use (at least) a sixteen byte key, otherwise your crypto system is considerably weaker than it might otherwise be.
I would also recommend you switch to CBC-mode, as ECB-mode may reveal patterns in your input data. Ensure you use a random IV each time you encrypt and store this with the ciphertext.
Finally, to address your original question:
According to the phpseclib documentation the "keys are null-padded to the closest valid size", but I'm not sure how to implement that in Python. Simply extending the length of the string with 6 spaces is not working.
The space character 0x20 is not the same as the null character 0x00.
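The difference is easy to see on the PHP side. This is a small sketch with a made-up 10-byte key; it only shows what null-padding to 16 bytes means, not phpseclib's internals.

```php
<?php
// Hypothetical 10-byte key, as in the question.
$key = '0123456789';

// Pad with NUL bytes (0x00) up to the next valid AES key size of 16 bytes.
$nullPadded = str_pad($key, 16, "\0");

// Padding with spaces (0x20) gives a different 16-byte key.
$spacePadded = str_pad($key, 16, ' ');

echo bin2hex($nullPadded), "\n";   // 30313233343536373839000000000000
echo bin2hex($spacePadded), "\n";  // 30313233343536373839202020202020
```

On the Python side the same padding can be produced on a bytes key with key.ljust(16, b"\0").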

AES-128 UTF-8 characters in key (iOS ↔ PHP)

After a really long time of research I finally got encoding / decoding working on iOS and PHP. I wrote a little algorithm that uses a pool of randomly created 16-byte keys on both iOS and PHP.
The algorithm keeps both systems synchronized so that no key is used more than once.
However my keys contain some UTF8 Characters (I think). I'm using the standard [a-z][A-Z][0-9] characters including these special chars:
!\"§$%&/()=?+-*#.,£[]|{}
Unfortunately, when using one of these keys, the decryption fails on PHP. On iOS I'm using an extension on the stringByAddingPercentEscapes: method which escapes a few more characters. Then I send the escaped data as POST variables to the server.
I played around a bit and it turned out that using only [a-z][A-Z][0-9] works great.
Any suggestions on solving my issue?
Of the characters you described, £ and § are not ASCII characters. Depending on how you are transmitting them, those two are probably being corrupted.
That being said — encryption keys are data, not strings. If you represent your encryption keys as NSData, rather than NSString, character sets will cease to be an issue, and you should be able to use any randomly generated key, not just ones consisting of these 85 characters.
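On the PHP side the same idea can be sketched as follows. It assumes PHP 7+ for random_bytes() and uses hex purely as an example transport encoding; the variable names are made up.

```php
<?php
// UTF-8 encodings of the pound sign and the section sign: two bytes each.
$pound = "\xc2\xa3";
$section = "\xc2\xa7";
var_dump(strlen($pound));    // int(2)
var_dump(strlen($section));  // int(2)

// Treating the key as raw bytes avoids the problem: generate 16 random bytes
// and use a hex (or base64) encoding only for transport or storage.
$key = random_bytes(16);
$transport = bin2hex($key);  // 32 ASCII characters, safe in a POST body

var_dump(hex2bin($transport) === $key); // bool(true)
```

A 16-character key containing either of those two characters is therefore 17 or more bytes long in UTF-8, which is enough to break decryption on the other side.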

Transforming a url into a unique 32 character token

I am writing an affiliate system, and I want to generate a unique 32 character wide token, from the url.
The problem is that a URL can be up to 128 chars long (IIRC). Is there a way that I can create a unique 32 char wide key/token from a given URL, without any 'collisions'?
I am not sure if this is an encoding, encryption or hashing problem (probably, a mixture of all three).
I will be implementing this 'mapping function' using PHP, since that is the language I am using to build this particular system. Any suggestions on how to go about doing this?
Is it even possible to map a 128 char string into a 32 char string uniquely (i.e. no collisions?) ...
[Edit]
I just did some reading up, and found that the max length of urls is actually, something in the order of 2K. However, I am not concerned about 'silly' edge cases like that. I am pretty sure that 99.9% of the time, my imposed limit of 128 chars should be sufficient.
Is it even possible to map a 128 char string into a 32 char string uniquely (i.e. no collisions?) ...
In part. You can use a hash function like md5 or sha1. That's what they were built to do.
MD5 generates a 32 char string, and SHA1 generates a 40 char string.
Of course you can't guarantee that there won't be collisions. That's impossible since the message space is too large for your hashes (there are 2^1024 messages vs 2^128 possible hashes if you are using MD5), but these functions are meant to be collision resistant and hard to reverse.
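In PHP both functions are built in. A minimal sketch with a made-up URL:

```php
<?php
// Hypothetical affiliate URL.
$url = 'http://example.com/products/42?ref=affiliate-7';

$md5Token  = md5($url);   // 128-bit digest as 32 hex characters
$sha1Token = sha1($url);  // 160-bit digest as 40 hex characters

echo strlen($md5Token), ' ', strlen($sha1Token), "\n"; // 32 40

// The same URL always maps to the same token.
var_dump(md5($url) === $md5Token); // bool(true)
```

Because uniqueness is only probable, a unique index on the token column (an assumption about the schema) would still be the way to notice a clash if one ever happened.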
Wikipedia references:
http://en.wikipedia.org/wiki/Hash_function
http://en.wikipedia.org/wiki/MD5
http://en.wikipedia.org/wiki/SHA-1
Is it even possible to map a 128 char string into a 32 char string uniquely (i.e. no collisions?) ...
That depends on the alphabet being used for both input and output. If your resulting 32 char hash is limited to an alphabet of a-z, you can encode a maximum of 26^32 = 1.901722×10^45 values in it. A URL can consist of at least a-z and quite a number of other characters, so can contain at least 26^128 = 1.307942×10^181 values. So, an alphabet of 26 characters is not enough.
Using a-zA-Z0-9 you can encode 62^32 = 2.272658×10^57 unique values, which is still not enough. Even an alphabet of 100 characters gives you only 100^32 = 1.0×10^64 possible values.
Depending on what exactly you want to do, you should either increase the length of the hash or rethink the overall approach.
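The counting argument can be checked with logarithms, since the raw numbers overflow ordinary integers. A small sketch; the helper name is made up.

```php
<?php
// Base-10 logarithm of alphabetSize^length, i.e. the exponent in 10^x.
function log10Count($alphabetSize, $length)
{
    return $length * log10($alphabetSize);
}

// 32-character tokens over different alphabets
printf("26^32  ~ 10^%.2f\n", log10Count(26, 32));
printf("62^32  ~ 10^%.2f\n", log10Count(62, 32));
printf("100^32 ~ 10^%.2f\n", log10Count(100, 32));

// 128-character inputs restricted to a-z only (a deliberate underestimate)
printf("26^128 ~ 10^%.2f\n", log10Count(26, 128));

// A collision-free mapping needs at least as many tokens as inputs.
var_dump(log10Count(100, 32) >= log10Count(26, 128)); // bool(false)
```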

Why is my SHA1 hash not matching?

I don't think I was specific enough last time. Here we go:
I have a hex string:
742713478fb3c36e014d004100440041004
e0041004e00000060f347d15798c9010060
6b899c5a98c9014d007900470072006f007
500700000002f0000001f7691944b9a3306
295fb5f1f57ca52090d35b50060606060606
The last 20 bytes should (theoretically) contain a SHA1 Hash of the first part (complete string - 20 bytes). But it doesn't match for me.
Trying to do this with PHP, but no luck. Can you get a match?
Ticket:
742713478fb3c36e014d004100
440041004e0041004e00000060
f347d15798c90100606b899c5a
98c9014d007900470072006f00
7500700000002f0000001f7691944b9a
sha1 hash of ticket appended to original:
3306295fb5f1f57ca52090d35b50060606060606
My sha1 hash of ticket:
b6ecd613698ac3533b5f853bf22f6eb4afb94239
Here's what is in the ticket and how it's being stored. FWIW, I can pull out username, etc, and spot the various delimiters.
http://www.codeproject.com/KB/aspnet/Forms_Auth_Internals/AuthTicket2.JPG
Edited: I have discovered that the string is padded on the end by the decryption function it goes through before this point. I removed the last 6 bytes and adjusted my ticket and hash accordingly. Still doesn't work, but I'm closer.
Your hash is being calculated on the hex string itself. Maybe the appended hash is calculated on another representation of the same data?
I think you are getting confused about bytes vs characters.
Internally, PHP stores every character in a string as a byte. The sha1 hash that PHP generates is a 40 character (40 byte) hexadecimal representation of the 20-byte binary digest, since each byte needs to be represented by 2 hex characters.
I'm not sure if this is the actual source of your discrepancy, but seeing this misunderstanding makes me wonder if it's related.
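The byte/character distinction can be seen directly in PHP. This sketch uses the first 8 bytes of the ticket from the question; hex2bin() needs PHP 5.4+.

```php
<?php
$hex = '742713478fb3c36e';   // first 8 bytes of the ticket, as 16 hex characters
$bytes = hex2bin($hex);      // the 8 raw bytes those characters stand for

var_dump(strlen($hex));      // int(16)
var_dump(strlen($bytes));    // int(8)

// sha1() returns 40 hex characters by default, or 20 raw bytes when the second argument is true.
var_dump(strlen(sha1($bytes)));        // int(40)
var_dump(strlen(sha1($bytes, true)));  // int(20)

// Hashing the hex text and hashing the decoded bytes give different digests.
var_dump(sha1($hex) === sha1($bytes)); // bool(false)
```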
Try trimming the string first; it's surprisingly easy to have a newline or space on the end that changes the hash completely.
According to this Online SHA1 tool the hash of the given text (after removing new lines and spaces) is
b6ecd613698ac3533b5f853bf22f6eb4afb94239
Idea: Make sure you're inputting characters, not a hex number, to the PHP version.
The problem was that the original was a keyed hash. I had to use hash_hmac() with a validation key rather than sha1() without.
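The difference between the two calls, as a minimal sketch. The key and the shortened ticket below are made up; the real validation key comes from the application's configuration, and whether the ticket is hashed as raw bytes (as assumed here) is not stated in the question.

```php
<?php
// Hypothetical values for illustration only.
$validationKey = hex2bin('00112233445566778899aabbccddeeff');
$ticket = hex2bin('742713478fb3c36e014d0041');

$plain = sha1($ticket);                               // unkeyed hash
$keyed = hash_hmac('sha1', $ticket, $validationKey);  // keyed hash (HMAC-SHA1)

var_dump(strlen($keyed));     // int(40), same length as a plain SHA-1 hex digest
var_dump($plain === $keyed);  // bool(false)

// Compare against a stored MAC in constant time (hash_equals needs PHP 5.6+).
var_dump(hash_equals($keyed, hash_hmac('sha1', $ticket, $validationKey))); // bool(true)
```

Since both digests have the same length, nothing in the stored value itself reveals which of the two was used.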
