How does cross platform AES encryption work? - php

I have been able to successfully encrypt and decrypt AES-256 in both PHP and Objective-C code, but not across the two. I won't post any code here since I have tried many varieties and none work. I have no idea how these encryption functions work... AES is a standardized algorithm, so to my thinking the reason it doesn't work boils down to
a) the iv
b) some encoding error
or
c) differences in padding (should be irrelevant for decryption).
If someone has AES functions that work in both php and objective-c that would be wonderful, but if not, any help in understanding what is causing these varied results would be appreciated.
If you want a more narrow question, it is about encodings, iv, and block size of this AES cipher.
1) Does it matter what encoding is used in terms of the key and the plaintext/ciphertext? Basically I'm guessing it is not a problem with the plain text since all the characters that I would use (at least during testing) are standard ASCII symbols. But let's say PHP strings are ASCII and I am using UTF-8 in Objective-C... I don't know enough to say if PHP uses ASCII or if the bytes, i.e. the key, would be different between the two.
2) To my knowledge the ECB mode uses no IV (correct me if wrong). CBC mode uses an IV. In this case, the IV must be recorded along with the ciphertext. Now this key is 16 or 32 chars long in PHP (depending on 128 vs 256 block size). Does this mean 16 or 32 bytes? And will the string 1234567890123456789012 be the same in ASCII and UTF-8 when converted to bytes?
3) What is the difference between block size and key size in terms of the algorithm? (again, correct me if wrong) Basically they are all the same algorithm, just with different parameters? And using a 256-bit key vs a 128-bit key is just a matter of which key is passed?
(Also, note that I have been using base64 encoding to transfer strings between the applications for testing)
Thanks,
Elijah

For decryption to work correctly, everything must be exactly the same. Same key, same IV, same mode. In particular the key must be the same. Byte for byte the same. Bit for bit the same. AES is designed to fail to decrypt correctly if even one bit of the key is incorrect.
Reading your question, I suspect that your problem lies with the key. Your real key is not characters, it is bytes. There are a number of different ways to translate between characters and bytes, which can cause decryption to fail. You need to be certain that the two keys match byte for byte, not character for character. At the very least you need to be explicit about what mapping is used. Don't rely on system defaults as they can differ across systems.
Looking at your three questions:
1) For plaintext encoding you will get back exactly what you put in: UTF-8 in, UTF-8 out. If you want to convert to a different encoding then you will have to do it after decryption.
2) You are right that ECB doesn't need an IV, but ECB mode leaks information and should be avoided. Use CBC or CTR mode instead, the same mode at both ends. The IV is tied to the block size, so for AES the IV is always 16 bytes or 128 bits. You cannot guarantee that ASCII and UTF-8 will be the same. UTF might have a BOM at the start. ASCII might have a C-style zero byte at the end. Don't think in terms of characters, think in terms of bytes. Everything has to match at the byte level. In CBC mode a faulty IV will munge up the first block but decrypt subsequent blocks OK.
3) Block size is fixed at 128 bits for AES and cannot be changed. Key sizes are less constrained, and can be 128, 192, or 256 bits. In practice most people seem to use 128 or 256 bits. A block is a conveniently sized processing unit that is built into the cipher at a very low level. The key determines what is done to the block in the course of the processing. The key you enter is used to build some internal structures, the "round keys". This process is called "key expansion". It is the round keys which interact with the block being processed. Because the key is used indirectly there is more flexibility about how large it can be.
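To make the "think in bytes, not characters" advice concrete, here is a small sketch in Python (used only for illustration; neither platform in the question is involved). The 16-character key string is a made-up example, not a real key. It shows that plain ASCII text gives the same bytes in ASCII and UTF-8, while a BOM, a trailing zero byte, or a different encoding all change the key at the byte level.

```python
import codecs

# Illustrative 16-character key text; an assumption for this sketch, not a real secret.
key_text = "1234567890123456"

ascii_bytes = key_text.encode("ascii")
utf8_bytes = key_text.encode("utf-8")
utf8_bom_bytes = codecs.BOM_UTF8 + utf8_bytes   # UTF-8 with a byte order mark in front
c_string_bytes = ascii_bytes + b"\x00"          # C-style string with its terminator included
utf16_bytes = key_text.encode("utf-16-le")      # two bytes per character


def describe(label: str, data: bytes) -> str:
    """Return a one-line summary so the byte-level differences are visible."""
    return f"{label:12s} {len(data):2d} bytes  {data.hex()}"


for label, data in [
    ("ascii", ascii_bytes),
    ("utf-8", utf8_bytes),
    ("utf-8 + BOM", utf8_bom_bytes),
    ("C string", c_string_bytes),
    ("utf-16-le", utf16_bytes),
]:
    print(describe(label, data))
```

Only the first two rows are identical; any of the other three used as a key on one side would make decryption fail on the other.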

In terms of encoding of the key, IV, plaintext, and ciphertext, AES encryption does not use encoding. AES encryption uses binary data -- a sequence of 8-bit bytes.
You need the same binary key, binary IV, and binary ciphertext on the decrypting platform, to produce the original binary plaintext.
When you are converting between character encodings and binary, you are not always guaranteed a round-trip conversion. That is, not all sequences of bytes can be converted to strings of UTF-8 characters.
However, if you treat UTF-8 plaintext as binary data, and encrypt it, and then transport the ciphertext as binary, for example, by encoding it as base64 to preserve the binary representation of the data, then when you base64-decode to reconstitute the binary ciphertext on the decrypting platform, and decrypt, the resulting binary plaintext will be the original UTF-8 character data.
Always treat the key, IV, plaintext, and ciphertext as binary data in terms of encryption and decryption. The plaintext is binary data that just might happen to be UTF-8, or some variant of ASCII, or UTF-16BE, etc. The ciphertext will probably be none of those, or, happen to be one of those purely by chance.
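As a rough illustration of the transport step described above, the following Python sketch base64-encodes a stand-in for ciphertext and decodes it again. No real encryption is performed: the `ciphertext` bytes are an arbitrary made-up value chosen because they are not valid UTF-8.

```python
import base64

# Plaintext is just bytes that happen to be UTF-8 ("café").
plaintext = "caf\u00e9".encode("utf-8")

# Stand-in for real ciphertext (assumption): arbitrary bytes, not valid UTF-8.
ciphertext = bytes([0xFF, 0x00, 0x80, 0xFE, 0x41, 0x9C])

# Sender: wrap the binary ciphertext in base64 so it survives text channels.
wire_text = base64.b64encode(ciphertext).decode("ascii")

# Receiver: undo base64 to get the exact same bytes back before decrypting.
received = base64.b64decode(wire_text)

print(wire_text)
print(received == ciphertext)
print(plaintext.decode("utf-8"))
```

The base64 text is only a transport wrapper; the key, IV, and ciphertext handed to the cipher are always the decoded bytes.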

Related

Encryption questions

I asked a question here and I managed to partially implement the advice. Data is now stored encrypted in a binary field (varbinary(500)), after I removed the AES-256 encryption and left the default AES-128 CodeIgniter encryption.
However, I have some questions, and I can't find answers, since I cannot find many articles on this subject. If anyone can answer my questions, or point me to a book or any other literature for further reading, I would be very grateful.
Why encrypted data must be stored in binary type field? What is wrong with storing it in longtext, or varchar? Does that make the encryption worthless?
Why I must first encode the variable and then encrypt it when I store the data in the binary type of field, and I don't have to do that when I store the data in varchar field?
base64_encode($clientName);
$encClientName = $this->encryption->encrypt($clientName);
In my previous question (see the link on the top) I have been advised to use nonce. Since I didn't know how to use that with codeigniter library, I didn't implement that part. Does that make my data less secure? Can anyone post any snippet code of how to use nonce with the codeigniter?
Again, any link to reading material on this subject (storing encrypted data in the database with php) will be deeply appreciated.
Why encrypted data must be stored in binary type field? What is wrong with storing it in longtext, or varchar? Does that make the encryption worthless?
Encrypted data is binary. It will frequently contain byte sequences which are invalid in your text encoding, making them impossible to insert into a column which expects a string (like VARCHAR or TEXT).
The data type you probably want is either VARBINARY (which is similar to VARCHAR, but not a string) or BLOB (likewise, but for TEXT -- there's also MEDIUMBLOB, LONGBLOB, etc).
Why I must first encode the variable and then encrypt it when I store the data in the binary type of field, and I don't have to do that when I store the data in varchar field?
You don't. This is backwards.
If you were going to use a string-type column to store encrypted data, you could "fake it" by Base64 encoding the data after encryption. However, you're still better off using a binary-type column, at which point you don't need any additional encoding.
In my previous question (see the link on the top) I have been advised to use nonce. Since I didn't know how to use that with codeigniter library, I didn't implement that part. Does that make my data less secure?
Based on what I'm seeing in the documentation, I think the CodeIgniter Encryption library handles this for you by default. You shouldn't have to do anything additional.
In addition to duskwuff's answer, I covered your questions from a more crypto-related viewpoint. He just managed to post a minute before I did :)
Encrypted data must be stored in a binary type field due to the way that Character Encodings work. I recommend you read, if you haven't already, this excellent article by Joel Spolsky that details this very well.
It is important to remember that encryption algorithms operate on raw binary data. That is, a bit string. Literal 1's and 0's that can be interpreted in many ways. You can represent this data as unsigned byte values (255, 255), Hex (0xFF, 0xFF), whatever, they are really just bit strings underneath. Another property of encryption algorithms (or good ones, at least) is that the result of encryption should be indistinguishable from random data. That is, given an encrypted file and a blob of CSPRNG generated random data that have the same length, you should not be able to determine which is which.
Now let's presume you wanted to store this data in a field that expects UTF-8 strings. Because the bit string we store in this field could contain any possible sequence of bytes, as we discussed above, we can't assume that the sequence of bytes that we store will denote actual valid UTF-8 characters. The implication of this is that binary data encoded to UTF-8 and then decoded back to binary is not guaranteed to give you the original binary data. In fact, it rarely will.
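A quick way to see this for yourself is the sketch below, written in Python purely for illustration. The `raw` bytes are a made-up stand-in for encrypted output, and `survives_utf8_round_trip` is a hypothetical helper name.

```python
# Stand-in for encrypted output (assumption): bytes that are not valid UTF-8.
raw = bytes([0xFF, 0xFE, 0x80, 0x41])


def survives_utf8_round_trip(data: bytes) -> bool:
    """True if decoding as UTF-8 and encoding again yields the same bytes."""
    try:
        return data.decode("utf-8").encode("utf-8") == data
    except UnicodeDecodeError:
        return False


# A lenient decoder replaces the invalid bytes, silently corrupting the data.
lossy = raw.decode("utf-8", errors="replace").encode("utf-8")

print(survives_utf8_round_trip(b"plain ascii"))
print(survives_utf8_round_trip(raw))
print(lossy == raw)
```

Strict decoding rejects the data outright and lenient decoding alters it, which is why a text column is the wrong home for raw ciphertext.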
Your second question also has to do with encodings, but the encoding here is base64. Base64 is an encoding that plays very nicely with (in fact, it was designed for) binary data. Base64 is a way to represent binary data using common characters (a-z, A-Z, 0-9 and +, / in most implementations). I am willing to bet that the encrypt function you are using either calls base64_encode on its output or one of the functions it calls does. What you should actually be interested in is whether the output of the encrypt function is a base64 string or actual binary data, as this will affect the type of data field you use in your database (e.g. binary vs varchar).
I believe in your last question you stated that you were using CTR, so the following applies to the nonce used by CTR only.
CTR works by encrypting a counter value, and then xor-ing this encrypted counter value with your data. This counter value is made up of two things: the nonce, and the actual value of the counter, which normally starts at 0. Technically, your nonce can be any length, but I believe a common value is 12 bytes. Because we are discussing AES, the total size of the counter value should be 16 bytes. That is, 12 bytes of nonce and 4 bytes of counter.
This is the important part. For every encryption operation:
1) Generate a new 12-byte nonce to use for that operation.
2) Let your implementation add the counter and perform the actual encryption.
3) Once you have the final ciphertext, prepend the nonce to it so that the result is (len(ciphertext) + 12) bytes long.
4) Store this final result in your database.
Repeating a nonce, using a static nonce, or encrypting more than 2^32 blocks under a single 12-byte nonce (at which point the 4-byte counter wraps around) will make your ciphertext vulnerable.
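The byte layout described above can be sketched as follows. This is Python for illustration only, and it deliberately performs no AES; it only shows how the nonce, counter, and stored record fit together. The function names are hypothetical, and the big-endian counter encoding is an assumption (a common convention, but check what your library uses).

```python
import secrets
import struct

NONCE_LEN = 12   # bytes of nonce, as suggested above
BLOCK_LEN = 16   # AES block size in bytes


def new_nonce() -> bytes:
    """Fresh random nonce from the OS CSPRNG; call once per encryption."""
    return secrets.token_bytes(NONCE_LEN)


def counter_block(nonce: bytes, counter: int) -> bytes:
    """Build the 16-byte value CTR mode would encrypt for block `counter`.

    Assumption: the 4-byte counter is appended in big-endian order.
    """
    if len(nonce) != NONCE_LEN:
        raise ValueError("nonce must be 12 bytes")
    if not 0 <= counter < 2**32:
        raise ValueError("counter does not fit in 4 bytes")
    return nonce + struct.pack(">I", counter)


def pack_record(nonce: bytes, ciphertext: bytes) -> bytes:
    """Prepend the nonce so it is stored alongside the ciphertext."""
    return nonce + ciphertext


def unpack_record(blob: bytes) -> tuple[bytes, bytes]:
    """Split a stored record back into (nonce, ciphertext)."""
    if len(blob) < NONCE_LEN:
        raise ValueError("record too short to contain a nonce")
    return blob[:NONCE_LEN], blob[NONCE_LEN:]


nonce = new_nonce()
record = pack_record(nonce, b"ciphertext-goes-here")
print(len(counter_block(nonce, 0)), len(record))
```

The nonce is not secret, so storing it in front of the ciphertext is fine; what matters is that it is never reused with the same key.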

What are the different ways of generating a key for encryption

I am using aes256 with php to encrypt data.
In the various documents I see various ways to generate a key, Like:
$key = pack('H*', "bcb04b7e103a0cd8b54763051cef08bc55abe029fdebae5e1d417e2ffb2a00a3");
Or
$Key = "what ever, plain string";
Or
$Key = "123456789abcdef";//128bit
What is the point of the first example, as opposed to the others?
Why not simply use a random string, 128 or 256 long?
I am using the example here http://php.net/manual/en/function.mcrypt-encrypt.php with each of the different key generating methods above.
You have three different key lengths. AES is specified for the following three key lengths: 128-bit (16 byte), 192-bit (24 byte) and 256-bit (32 byte). I'm not going to go into detail about the strength of different key sizes.
Let's take them apart:
$key = pack('H*', "bcb04b7e103a0cd8b54763051cef08bc55abe029fdebae5e1d417e2ffb2a00a3");
This is a hex-encoded key which is 64 characters long in encoded form. The key itself will be 32 bytes long, which means that when the key is passed to mcrypt_encrypt(), AES-256 is used automatically.
$Key = "what ever, plain string";
This is a 23 character string which can be used as a key for PHP versions before 5.6.0. This is not a valid length for a key in AES. MCrypt will pad the key with \0 up to the next valid key size which is 24 byte for AES-192. So this key is actually a valid key for PHP 5.6 in this form:
$Key = "what ever, plain string\0";
$Key = "123456789abcdef"; //128bit
This is a 15 character "key". As with the previous example, it will be padded to reach 16 bytes so that AES-128 is used.
Generating a key
Since you're asking about key generation, this question contains some approaches. Keys should be random and consist of all possible bytes. Using keys that are only alphanumeric or only contain printable characters is not good if you want to be safe against brute-force attacks on your key.
Since it's not possible to directly hard-code arbitrary bytes as a key in a code file, you should use the first approach of hard-coding an encoded version of the key and decode it programmatically.
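A minimal sketch of that approach, in Python for illustration: generate the key from a cryptographically secure source, keep it hex-encoded wherever it has to live as text, and decode it to raw bytes before use. `generate_key_hex` and `load_key` are hypothetical helper names; `bytes.fromhex` plays the role that `pack('H*', ...)` plays in the PHP line above.

```python
import secrets

VALID_KEY_SIZES = (16, 24, 32)  # AES-128, AES-192, AES-256


def generate_key_hex(num_bytes: int = 32) -> str:
    """Return a fresh random key as a hex string suitable for a config file."""
    if num_bytes not in VALID_KEY_SIZES:
        raise ValueError("AES keys are 16, 24 or 32 bytes")
    return secrets.token_bytes(num_bytes).hex()


def load_key(hex_text: str) -> bytes:
    """Decode a hex-encoded key and refuse anything that is not a valid size."""
    key = bytes.fromhex(hex_text)
    if len(key) not in VALID_KEY_SIZES:
        raise ValueError(f"decoded key is {len(key)} bytes, not 16, 24 or 32")
    return key


# The hex string from the question decodes to a 32-byte (AES-256) key.
example = "bcb04b7e103a0cd8b54763051cef08bc55abe029fdebae5e1d417e2ffb2a00a3"
print(len(load_key(example)))
print(len(generate_key_hex()))
```

Rejecting odd lengths up front avoids relying on any library's silent padding or truncation.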
Using hard-coded keys
There are only a handful of scenarios where hard-coding a symmetric key in the code can be used:
testing cryptographic implementations (during development)
encrypting data at rest where the data is not on the same machine as the encryption key (otherwise, it's just data obfuscation)
If your scenario doesn't match the above, you're either happy with obfuscation or you should think about how you can employ public-key encryption with a hybrid encryption approach.

Laravel 4 Encryption: how many characters to expect

I've just had an interesting little problem.
Using Laravel 4, I encrypt some entries before adding them to a db, including email address.
The db was setup with the default varchar length of 255.
I've just had an entry that encrypted to 309 characters, blowing up the encryption by cutting off the last 50-odd characters in the db.
I've (temporarily) fixed this by simply increasing the varchar length to 500, which should - in theory - cover me from this, but I want to be sure.
I'm not sure how the encryption works, but is there a way to tell what maximum character length to expect from the encrypt output for the sake of setting my database?
Should I change my field type from varchar to something else to ensure this doesn't happen again?
Conclusion
First, be warned that there have been quite a few changes between 4.0.0 and 4.2.16 (which seems to be the latest version).
The scheme starts with a staggering overhead of 188 characters for 4.2 and about 244 for 4.0 (given that I did not forget any newlines and such). So to be safe you will probably need in the order of 200 characters for 4.2 and 256 characters for 4.0 plus 1.8 times the plain text size, if the characters in the plaintext are encoded as single bytes.
Analysis
I just looked into the source code of Laravel 4.0 and Laravel 4.2 with regard to this function. Let's get into the size first:
the data is serialized, so the encryption size depends on the size of the type of the value (which is probably a string);
the serialized data is PKCS#7 padded using Rijndael 256 or AES, so that means adding 1 to 32 bytes or 1 to 16 bytes - depending on the use of 4.0 or 4.2;
this data is encrypted with the key and an IV;
both the ciphertext and IV are separately converted to base64;
a HMAC using SHA-256 over the base64 encoded ciphertext is calculated, returning a lowercase hex string of 64 bytes
then the ciphertext consists of base64_encode(json_encode(compact('iv', 'value', 'mac'))) (where the value is the base 64 ciphertext and mac is the HMAC value, of course).
A string in PHP is serialized as s:<i>:"<s>"; where <i> is the size of the string, and <s> is the string (I'm presuming PHP platform encoding here with regards to the size). Note that I'm not 100% sure that Laravel doesn't use any wrapping around the string value, maybe somebody could clear that up for me.
Calculation
All in all, everything depends quite a lot on character encoding, and it would be rather dangerous for me to make a good estimation. Let's assume a 1:1 relation between byte and character for now (e.g. US-ASCII):
serialization adds up to 9 characters for strings up to 999 characters
padding adds up to 16 or 32 bytes, which we assume are characters too
encryption keeps data the same size
base64 in PHP creates ceil(len / 3) * 4 characters, but let's simplify that to (len * 4) / 3 + 4; the base64-encoded IV is 44 characters
the full HMAC is 64 characters
the JSON encoding adds 3*5 characters for quotes and colons, plus 4 characters for braces and commas around them, totaling 19 characters (I'm presuming json_encode does not end with a white space here); base64 again adds the same overhead
OK, so I'm getting a bit tired here, but you can see it at least twice expands the plaintext with base64 encoding. In the end it's a scheme that adds quite a lot of overhead; they could just have used base64(IV|ciphertext|mac) to seriously cut down on overhead.
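The steps listed above can be turned into a rough calculator. The sketch below is Python for illustration and is only a model of the scheme as described in this answer, not Laravel code: `estimate_encrypted_length` is a hypothetical name, it assumes one byte per character, a string value with no extra wrapping, and it ignores any escaping that json_encode may apply (for example to "/" characters in the base64 strings), so treat the result as a lower bound.

```python
def b64_len(num_bytes: int) -> int:
    """Length of standard padded base64 output for num_bytes of input."""
    return 4 * ((num_bytes + 2) // 3)


def estimate_encrypted_length(plain_len: int, block_size: int = 16) -> int:
    """Estimate the output length for a plaintext of plain_len bytes.

    block_size=16 models AES (4.2), block_size=32 models Rijndael-256 (4.0).
    """
    # s:<i>:"<s>"; wraps the string during serialization
    serialized = len('s:%d:"";' % plain_len) + plain_len
    # PKCS#7 always adds between 1 and block_size bytes
    padded = (serialized // block_size + 1) * block_size
    value = b64_len(padded)          # base64 ciphertext
    iv = b64_len(block_size)         # base64 IV, one block long
    mac = 64                         # hex SHA-256 HMAC
    # JSON punctuation and key names around the three fields
    json_overhead = len('{"iv":"","value":"","mac":""}')
    return b64_len(json_overhead + iv + value + mac)


for n in (0, 20, 50, 100):
    print(n, estimate_encrypted_length(n, 16), estimate_encrypted_length(n, 32))
```

For an empty string this model gives 188 characters with a 16-byte block and 244 with a 32-byte block, which matches the overhead figures quoted in the conclusion.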
Notes
if you're not on 4.2 now, I would seriously consider upgrading to the latest version because 4.2 fixes quite a lot of security issues
the sample code uses a string as key, and it is unclear if it is easy to use bytes instead;
the documentation does warn against key sizes other than the Rijndael defaults, but forgets to mention string encoding issues;
padding is always performed, even if CTR mode is used, which kind of defeats the purpose;
Laravel pads using PKCS#7 padding, but as the serialization always seems to end with ;, that was not really necessary;
it's a nice thing to see authenticated encryption being used for database encryption (the IV wasn't used, fixed in 4.2).
@MaartenBodewes does a very good job of explaining how long the actual string will probably be. However, you can never know it for sure, so here are two options to deal with the situation.
1. Make your field text
Change the field from a limited varchar to a "self-expanding" text. This is probably the simpler one, and especially if you expect rather long input I'd definitely recommend this.
2. Just make your varchar longer
As you did already, make your varchar longer depending on what input length you expect/allow. I'd multiply by a factor of 5.
But don't stop there! Add a check in your code to make sure the data doesn't get truncated:
$encrypted = Crypt::encrypt($input);
if (strlen($encrypted) > 500) {
    // do something about it
}
What can you do about it?
You could either write an error to the log and add the encrypted data (so you can manually re-insert it after you extended the length of your DB field)
Log::error('An encrypted value was too long for the DB field xy. Length: '.strlen($encrypted).' Data: '.$encrypted);
Obviously that means you have to check the logs frequently (or send them to you by mail) and also that the user could encounter errors while using the application because of the incorrect data in your DB.
The other way would be to throw an exception (and display an error to the user) and of course also write it to the log so you can fix it...
Anyways
Whether you choose option 1 or 2 you should always restrict the accepted length of your input fields. Server side and client side.

Why must the Laravel 4.2 encryption key be shorter than the encryption key in CodeIgniter?

I'm currently studying Laravel 4.2 and have started comparing it with CodeIgniter.
But I found a problem with the encryption key shown in the code below.
I used this key for testing in Laravel 4.2, but it doesn't work; I get the message
"mcrypt_encrypt(): Size of key is too large for this algorithm"
But it works perfectly when I use the same encryption key in the latest version of CodeIgniter.
My question: how secure is Laravel 4.2 if I use MCRYPT_RIJNDAEL_256 with this encryption key?
'key' =>
'SdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrd',
'cipher' => MCRYPT_RIJNDAEL_256,
AES keys need to be indistinguishable from random and either 16, 24 or 32 bytes in length. It seems Laravel adds an additional check for the AES key to be a valid size.
Basically what PHP's mcrypt does (not sure about the C code) is that it extends the key data with 00-valued bytes if the key is smaller than 32 bytes, until it gets to the first legal AES key size. If the key is larger than 32 bytes it simply cuts it to 32 bytes. This is absolutely against any good practice with regard to handling keys.
So your AES key is likely just interpreted as 'SdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrd', encoded as ASCII. This kind of key certainly does not provide the full security of AES-256, as it reduces the key space significantly (by slightly more than 8 bytes if a 62-character alphabet is used, assuming each value within the alphabet is equally likely).
And note that MCRYPT_RIJNDAEL_256 is not AES, so you will only be able to decrypt it with libraries that support Rijndael with a block size of 256.
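The "slightly more than 8 bytes" figure can be checked with a few lines of arithmetic. The sketch below is Python for illustration; it assumes, as the answer does, that each of the 32 characters is drawn uniformly from a 62-character alphabet, and it models the truncation described above with a simple slice.

```python
import math

# The repeating 32-character unit of the key from the question.
unit = "SdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrd"
configured_key = unit * 3           # a shortened stand-in for the very long key
effective_key = configured_key[:32] # truncation to 32 bytes, as described above


def key_entropy_bits(length: int, alphabet_size: int) -> float:
    """Bits of entropy in `length` symbols drawn uniformly from the alphabet."""
    return length * math.log2(alphabet_size)


full_bits = 32 * 8                          # a truly random 32-byte key
alnum_bits = key_entropy_bits(32, 62)       # 32 alphanumeric characters
lost_bytes = (full_bits - alnum_bits) / 8   # reduction, expressed in bytes

print(effective_key == unit)
print(round(alnum_bits, 1), round(lost_bytes, 2))
```

Everything after the first 32 characters contributes nothing, and the alphanumeric restriction costs roughly 65 of the 256 bits.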

Key length issue: AES encryption on phpseclib and decryption on PyCrypto

I am working on a data intensive project where I have been using PHP for fetching data and encrypting it using phpseclib. A chunk of the data has been encrypted in AES with the ECB mode -- however the key length is only 10. I am able to decrypt the data successfully.
However, I need to use Python in the later stages of the project and consequently need to decrypt my data using it. I tried employing PyCrypto but it tells me the key length must be 16, 24 or 32 bytes long, which is not the case. According to the phpseclib documentation the "keys are null-padded to the closest valid size", but I'm not sure how to implement that in Python. Simply extending the length of the string with 6 spaces is not working.
What should I do?
I strongly recommend you adjust your PHP code to use (at least) a sixteen byte key, otherwise your crypto system is considerably weaker than it might otherwise be.
I would also recommend you switch to CBC-mode, as ECB-mode may reveal patterns in your input data. Ensure you use a random IV each time you encrypt and store this with the ciphertext.
Finally, to address your original question:
According to the phpseclib documentation the "keys are null-padded to the closest valid size", but I'm not sure how to implement that in Python. Simply extending the length of the string with 6 spaces is not working.
The space character 0x20 is not the same as the null character 0x00.
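On the Python side, null-padding could look like the sketch below. `null_pad_key` is a hypothetical helper written from the documentation sentence quoted above ("null-padded to the closest valid size"); the 10-byte key is a made-up example, and it is worth verifying against one known ciphertext that your phpseclib version really pads this way.

```python
VALID_KEY_SIZES = (16, 24, 32)


def null_pad_key(key: bytes) -> bytes:
    """Pad a short key with 0x00 bytes up to the next valid AES key size."""
    for size in VALID_KEY_SIZES:
        if len(key) <= size:
            return key.ljust(size, b"\x00")
    raise ValueError("key is longer than 32 bytes")


short_key = b"0123456789"                 # illustrative 10-byte key
padded = null_pad_key(short_key)
space_padded = short_key.ljust(16, b" ")  # what padding with spaces produces

print(padded.hex())
print(space_padded.hex())
print(padded == space_padded)
```

The padded value is what you would hand to PyCrypto, for example as the key argument of `AES.new(padded, AES.MODE_ECB)`; the space-padded variant is a different key and will not decrypt the data.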
