I am using AES-256 with PHP to encrypt data.
In the documentation I see various ways to generate a key, like:
$key = pack('H*', "bcb04b7e103a0cd8b54763051cef08bc55abe029fdebae5e1d417e2ffb2a00a3");
Or
$Key = "what ever, plain string";
Or
$Key = "123456789abcdef";//128bit
What is the point of the first example, as opposed to the others?
Why not simply use a random string, 128 or 256 bits long?
I am using the example here http://php.net/manual/en/function.mcrypt-encrypt.php with each of the different key generating methods above.
You have three different key lengths. AES is specified for the following three key lengths: 128-bit (16 byte), 192-bit (24 byte) and 256-bit (32 byte). I'm not going to go into detail about the strength of different key sizes.
Let's take them apart:
$key = pack('H*', "bcb04b7e103a0cd8b54763051cef08bc55abe029fdebae5e1d417e2ffb2a00a3");
This is a hex-encoded key that is 64 characters long in encoded form. The key itself is 32 bytes long, which means that when it is passed to mcrypt_encrypt(), AES-256 is used automatically.
$Key = "what ever, plain string";
This is a 23-character string, which is not a valid key length for AES. PHP versions before 5.6.0 accept it anyway: MCrypt pads the key with \0 up to the next valid key size, which is 24 bytes for AES-192. So for PHP 5.6, this key is only valid in this form:
$Key = "what ever, plain string\0";
$Key = "123456789abcdef"; //128bit
This is a 15 character "key". As with the previous example, it will be padded to reach 16 bytes so that AES-128 is used.
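As a rough illustration of the three cases above, the following sketch computes the effective key length for each key. The helper padKeyLikeOldMcrypt() is a hypothetical name that only imitates the null-padding described here; it is not part of mcrypt or any other library.

```php
<?php
// Hypothetical helper: imitates the null-padding that older mcrypt versions
// applied to keys of invalid length (pad with \0 up to 16, 24 or 32 bytes).
function padKeyLikeOldMcrypt(string $key): string
{
    foreach ([16, 24, 32] as $size) {
        if (strlen($key) <= $size) {
            return str_pad($key, $size, "\0");
        }
    }
    throw new InvalidArgumentException('Key is longer than 32 bytes');
}

// hex2bin() decodes the same way as pack('H*', ...) for even-length input.
$hexKey   = hex2bin('bcb04b7e103a0cd8b54763051cef08bc55abe029fdebae5e1d417e2ffb2a00a3');
$plainKey = padKeyLikeOldMcrypt('what ever, plain string');
$shortKey = padKeyLikeOldMcrypt('123456789abcdef');

echo strlen($hexKey) * 8, "\n";   // 256 -> AES-256
echo strlen($plainKey) * 8, "\n"; // 192 -> AES-192
echo strlen($shortKey) * 8, "\n"; // 128 -> AES-128
```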
Generating a key
Since you're asking about key generation, this question contains some approaches. Keys should be random and consist of all possible bytes. Using keys that are only alphanumeric or only contain printable characters is not good if you want to be safe against brute-force attacks on your key.
Since it's not possible to directly hard-code arbitrary bytes as a key in a code file, you should use the first approach of hard-coding an encoded version of the key and decode it programmatically.
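A minimal sketch of that approach, assuming PHP 7 or later (for random_bytes()); the variable names are only illustrative. The key is generated once from the operating system's CSPRNG, printed as hex so it can be stored in a configuration or code file, and decoded back to raw bytes before use.

```php
<?php
// Run once (for example from the command line) and store the printed string.
$newKeyHex = bin2hex(random_bytes(32)); // 32 random bytes = one 256-bit key
echo $newKeyHex, "\n";                  // 64 hex characters

// In the application: decode the stored hex string back into raw key bytes.
$key = hex2bin($newKeyHex);
echo strlen($key), "\n";                // 32
```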
Using hard-coded keys
There are only a handful of scenarios where hard-coding a symmetric key in the code can be used:
testing cryptographic implementations (during development)
encrypting data at rest where the data is not on the same machine as the encryption key (otherwise, it's just data obfuscation)
If your scenario doesn't match either of the above, you're either happy with obfuscation or you should think about how you can employ public-key encryption with a hybrid encryption approach.
Related
I asked a question here and managed to partially implement the advice. Data is now stored encrypted in a binary field (varbinary(500)), after I removed the AES-256 encryption and kept CodeIgniter's default AES-128 encryption.
However, I have some questions that I can't find answers to, since I cannot find many articles on this subject. If anyone can answer my questions, or point me to a book or any other literature for further reading, I would be very grateful.
Why must encrypted data be stored in a binary type field? What is wrong with storing it in longtext or varchar? Does that make the encryption worthless?
Why must I first encode the variable and then encrypt it when I store the data in a binary type field, while I don't have to do that when I store the data in a varchar field?
base64_encode($clientName);
$encClientName = $this->encryption->encrypt($clientName);
In my previous question (see the link at the top) I was advised to use a nonce. Since I didn't know how to use that with the CodeIgniter library, I didn't implement that part. Does that make my data less secure? Can anyone post a code snippet showing how to use a nonce with CodeIgniter?
Again, any link to reading material on this subject (storing encrypted data in the database with php) will be deeply appreciated.
Why must encrypted data be stored in a binary type field? What is wrong with storing it in longtext or varchar? Does that make the encryption worthless?
Encrypted data is binary. It will frequently contain byte sequences which are invalid in your text encoding, making them impossible to insert into a column which expects a string (like VARCHAR or TEXT).
The data type you probably want is either VARBINARY (which is similar to VARCHAR, but not a string) or BLOB (likewise, but for TEXT -- there's also MEDIUMBLOB, LONGBLOB, etc).
Why must I first encode the variable and then encrypt it when I store the data in a binary type field, while I don't have to do that when I store the data in a varchar field?
You don't. This is backwards.
If you were going to use a string-type column to store encrypted data, you could "fake it" by Base64 encoding the data after encryption. However, you're still better off using a binary-type column, at which point you don't need any additional encoding.
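To make the order of operations concrete, here is a small sketch. It uses PHP's OpenSSL extension as a stand-in for the CodeIgniter library (an assumption made only so the example is self-contained), and the key, IV handling and variable names are illustrative. It provides no integrity protection, which a real system would also want.

```php
<?php
$key        = random_bytes(32);   // illustrative key; a real key would be stored securely
$iv         = random_bytes(16);   // fresh IV for every encryption
$clientName = 'Jane Doe';

// Encrypt first; OPENSSL_RAW_DATA returns raw binary ciphertext.
$raw = openssl_encrypt($clientName, 'aes-256-cbc', $key, OPENSSL_RAW_DATA, $iv);

$forBinaryColumn = $iv . $raw;                 // VARBINARY/BLOB: store the bytes as they are
$forTextColumn   = base64_encode($iv . $raw);  // VARCHAR/TEXT: encode AFTER encrypting

// Reading back from the text column: decode first, then decrypt.
$decoded = base64_decode($forTextColumn, true);
$plain   = openssl_decrypt(
    substr($decoded, 16), 'aes-256-cbc', $key, OPENSSL_RAW_DATA, substr($decoded, 0, 16)
);
echo $plain, "\n"; // Jane Doe
```

The Base64 form is about a third longer than the raw bytes, which is one more reason to prefer the binary column.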
In my previous question (see the link at the top) I was advised to use a nonce. Since I didn't know how to use that with the CodeIgniter library, I didn't implement that part. Does that make my data less secure?
Based on what I'm seeing in the documentation, I think the CodeIgniter Encryption library handles this for you by default. You shouldn't have to do anything additional.
In addition to duskwuff's answer, I covered your questions from a more crypto-related viewpoint. He just managed to post a minute before I did :)
Encrypted data must be stored in a binary type field due to the way that Character Encodings work. I recommend you read, if you haven't already, this excellent article by Joel Spolsky that details this very well.
It is important to remember that encryption algorithms operate on raw binary data, that is, a bit string: literal 1's and 0's that can be interpreted in many ways. You can represent this data as unsigned byte values (255, 255), as hex (0xFF, 0xFF), or in any other form, but underneath they are really just bit strings. Another property of encryption algorithms (or good ones, at least) is that the result of encryption should be indistinguishable from random data. That is, given an encrypted file and a blob of CSPRNG-generated random data of the same length, you should not be able to determine which is which.
Now let's presume you wanted to store this data in a field that expects UTF8 strings. Because the bit string we store in this field could contain any possible sequence of bytes, as we discussed above, we can't assume that the sequence of bytes that we store will denote actual valid UTF8 characters. The implication of this is that binary data encoded to UTF8 and then decoded back to binary is not guaranteed to give you the original binary data. In fact, it rarely will.
Your second question is also somewhat to do with encodings, but the encoding here is base64. Base64 is an encoding that plays very nicely with (in fact, it was designed for) binary data. Base64 is a way to represent binary data using common characters (a-z, A-Z, 0-9 and +, / in most implementations). I am willing to bet that the encrypt function you are using either uses base64_encode or one of the functions it calls does. What you should actually be interested in is whether the output of the encrypt function is a base64 string or actual binary data, as this will affect the type of data field you use in your database (e.g. binary vs varchar).
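A quick way to see the problem: the four bytes below are an arbitrary stand-in for ciphertext (not real encryption output), and preg_match() with the /u modifier is used here only as a convenient UTF-8 validity check.

```php
<?php
// Arbitrary bytes standing in for ciphertext; 0xFF can never occur in UTF-8.
$bytes = "\xff\xfe\x80\x41";

// With the /u modifier, preg_match() returns false if the subject is not valid UTF-8.
$isValidUtf8 = preg_match('//u', $bytes) === 1;
var_dump($isValidUtf8);              // bool(false)

// A binary-safe text encoding such as hex (or Base64) does round-trip.
$hex = bin2hex($bytes);
var_dump(hex2bin($hex) === $bytes);  // bool(true)
```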
I believe in your last question you stated that you were using CTR, so the following applies to the nonce used by CTR only.
CTR works by encrypting a counter value, and then xor-ing this encrypted counter value with your data. This counter value is made up of two things: the nonce, and the actual value of the counter, which normally starts at 0. Technically, your nonce can be any length, but I believe a common value is 12 bytes. Because we are discussing AES, the total size of the counter value should be 16 bytes. That is, 12 bytes of nonce and 4 bytes of counter.
This is the important part. Every encryption operation should:
Generate a new 12 byte nonce to use for that operation.
Your implementation should add the counter and perform the actual encryption.
Once you have the final ciphertext, prepend the nonce to this ciphertext so that the result is (len(ciphertext) + 12) bytes long.
Then store this final result in your database.
Repeating a nonce, using a static nonce, or encrypting more than 2^32 blocks under a single 12 byte nonce (which exhausts the 4 byte counter) will make your ciphertext vulnerable.
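The steps above can be sketched with PHP's OpenSSL extension rather than the CodeIgniter library (an assumption; CodeIgniter's own API is not shown here). The function names encryptCtr() and decryptCtr() are hypothetical. The sketch assumes each message is far shorter than 2^32 blocks and adds no authentication, which a production system would need on top.

```php
<?php
// Hypothetical helper: AES-128-CTR with a fresh 12-byte nonce per message.
function encryptCtr(string $plaintext, string $key): string
{
    $nonce        = random_bytes(12);        // step 1: new nonce for this operation
    $counterBlock = $nonce . "\0\0\0\0";     // step 2: 12-byte nonce + 4-byte counter starting at 0
    $ciphertext   = openssl_encrypt(
        $plaintext, 'aes-128-ctr', $key, OPENSSL_RAW_DATA, $counterBlock
    );
    return $nonce . $ciphertext;             // step 3: prepend the nonce
}

// Hypothetical helper: splits off the nonce and reverses encryptCtr().
function decryptCtr(string $stored, string $key): string
{
    $nonce      = substr($stored, 0, 12);
    $ciphertext = substr($stored, 12);
    return openssl_decrypt(
        $ciphertext, 'aes-128-ctr', $key, OPENSSL_RAW_DATA, $nonce . "\0\0\0\0"
    );
}

$key    = random_bytes(16);                  // illustrative 128-bit key
$stored = encryptCtr('client name', $key);   // step 4: this value goes into the database

echo strlen($stored), "\n";                  // 23 = 11 bytes of ciphertext + 12 bytes of nonce
echo decryptCtr($stored, $key), "\n";        // client name
```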
I'm currently studying Laravel 4.2 and have started comparing it with CodeIgniter.
But I ran into a problem with the encryption key shown in the code below.
I used this key for testing in Laravel 4.2, but it doesn't work; I get the message
"mcrypt_encrypt(): Size of key is too large for this algorithm"
But it works perfectly when I use the same encryption key in the latest version of CodeIgniter.
My question: how secure is Laravel 4.2 if I use MCRYPT_RIJNDAEL_256 with this encryption key?
'key' =>
'SdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrdSdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrd',
'cipher' => MCRYPT_RIJNDAEL_256,
AES keys need to be indistinguishable from random and either 16, 24 or 32 bytes in length. It seems Laravel adds an additional check for the AES key to be a valid size.
Basically, what PHP's mcrypt does (not sure about the C code) is extend the key data with 00-valued bytes if the key is smaller than 32 bytes, until it reaches the first legal AES key size. If the key is larger than 32 bytes, it simply cuts it to 32 bytes. This goes against any good practice for handling keys.
So your AES key is likely just interpreted as 'SdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrd', encoded as ASCII. This kind of key certainly does not provide the full security of AES-256, as the restricted alphabet reduces the key space significantly (by slightly more than 8 bytes if a 62 character alphabet is used, assuming each value within the alphabet is equally likely).
And note that MCRYPT_RIJNDAEL_256 is not AES, so you will only be able to decrypt it with libraries that support Rijndael with a block size of 256.
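The truncation and the "slightly more than 8 bytes" estimate can be checked with a few lines of arithmetic. The sketch assumes the truncation behaviour described above and uses a shortened stand-in for the long key from the question.

```php
<?php
// Shortened stand-in for the very long key in the question (same 32-character prefix).
$configuredKey = str_repeat('SdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrd', 4);

// Truncation to 32 bytes, as described above.
$effectiveKey = substr($configuredKey, 0, 32);
echo $effectiveKey, "\n";                        // SdRlCcZtE2ujlTZv5S3JZKN5bJvGQkrd

// 32 characters drawn uniformly from a 62-character alphabet (a-z, A-Z, 0-9).
$bits = 32 * log(62, 2);
printf("%.1f bits\n", $bits);                    // 190.5 bits
printf("%.1f bytes short\n", (256 - $bits) / 8); // 8.2 bytes short
```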
I am working on a data intensive project where I have been using PHP for fetching data and encrypting it using phpseclib. A chunk of the data has been encrypted in AES with the ECB mode -- however the key length is only 10 bytes. I am able to decrypt the data successfully.
However, I need to use Python in the later stages of the project and consequently need to decrypt my data using it. I tried employing PyCrypto but it tells me the key length must be 16, 24 or 32 bytes long, which is not the case. According to the phpseclib documentation the "keys are null-padded to the closest valid size", but I'm not sure how to implement that in Python. Simply extending the length of the string with 6 spaces is not working.
What should I do?
I strongly recommend you adjust your PHP code to use (at least) a sixteen byte key, otherwise your crypto system is considerably weaker than it might otherwise be.
I would also recommend you switch to CBC-mode, as ECB-mode may reveal patterns in your input data. Ensure you use a random IV each time you encrypt and store this with the ciphertext.
Finally, to address your original question:
According to the phpseclib documentation the "keys are null-padded to the closest valid size", but I'm not sure how to implement that in Python. Simply extending the length of the string with 6 spaces is not working.
The space character 0x20 is not the same as the null character 0x00.
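The difference is easy to see at the byte level. The sketch below is in PHP to match the encrypting side, and 'tenbytekey' is a made-up 10-byte key; it only shows that the two paddings produce different keys and therefore different ciphertexts.

```php
<?php
$shortKey = 'tenbytekey';                        // hypothetical 10-byte key

$nullPadded  = str_pad($shortKey, 16, "\0");     // what "null-padded" means
$spacePadded = str_pad($shortKey, 16, ' ');      // what adding 6 spaces produces

echo bin2hex($nullPadded), "\n";                 // 74656e627974656b6579000000000000
echo bin2hex($spacePadded), "\n";                // 74656e627974656b6579202020202020

// Different key bytes give different ciphertexts for the same input.
$block = 'sixteen byte msg';
$a = openssl_encrypt($block, 'aes-128-ecb', $nullPadded, OPENSSL_RAW_DATA);
$b = openssl_encrypt($block, 'aes-128-ecb', $spacePadded, OPENSSL_RAW_DATA);
var_dump($a === $b);                             // bool(false)
```

On the Python side the same padding can be produced with something like key.ljust(16, b'\0') on a bytes object, before passing the key to the AES library.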
I have been able to successfully encrypt and decrypt AES-256 in both php and objective-c code. I won't post any code here since I have tried many varieties and none work. I have no idea how these encryption functions work... AES is a standardized algorithm, so why it doesn't work in my thinking boils down to
a) the iv
b) some encoding error
or
c) differences in padding (should be irrelevant for decryption).
If someone has AES functions that work in both php and objective-c that would be wonderful, but if not, any help in understanding what is causing these varied results would be appreciated.
If you want a more narrow question, it is about encodings, iv, and block size of this AES cipher.
1) Does it matter what encoding is used in terms of the key and the plaintext/ciphertext? Basically I'm guessing it is not a problem with the plain text since all the characters that I would use (at least during testing) are standard ASCII symbols. But let's say PHP strings are ASCII and I am using UTF8 in Objective-C... I don't know enough to say if PHP uses ASCII or if the bytes, i.e. the key, would be different between the two.
2) To my knowledge the ECB mode uses no iv (correct me if I'm wrong). CBC mode uses an iv. In this case, the iv must be recorded along with the cipher text. Now this key is 16 or 32 chars long in PHP (depending on 128 vs 256 block size). Does this mean 16 or 32 bytes? And will the string 1234567890123456789012 be the same in ASCII and UTF8 when converted to bytes?
3) What is the difference between block size and key size in terms of the algorithm? (Again, correct me if I'm wrong.) Basically they are all the same algorithm, just with different parameters? And using a 256 bit key vs a 128 bit key is just a matter of which key is passed?
(Also, note that I have been using base64 encoding to transfer strings between the applications for testing)
Thanks,
Elijah
For decryption to work correctly, everything must be exactly the same. Same key, same IV, same mode. In particular the key must be the same. Byte for byte the same. Bit for bit the same. AES is designed to fail to decrypt correctly if even one bit of the key is incorrect.
Reading your question, I suspect that your problem lies with the key. Your real key is not characters, it is bytes. There are a number of different ways to translate between characters and bytes, which can cause decryption to fail. You need to be certain that the two keys match byte for byte, not character for character. At the very least you need to be explicit about what mapping is used. Don't rely on system defaults as they can differ across systems.
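As a small illustration of "same characters, different bytes", the two strings below both display as "café" but are written out byte by byte in two different encodings. The word is an arbitrary example, not taken from the question.

```php
<?php
$keyUtf8   = "caf\xc3\xa9"; // "café" encoded as UTF-8: 5 bytes
$keyLatin1 = "caf\xe9";     // "café" encoded as ISO-8859-1: 4 bytes

echo bin2hex($keyUtf8), "\n";        // 636166c3a9
echo bin2hex($keyLatin1), "\n";      // 636166e9
var_dump($keyUtf8 === $keyLatin1);   // bool(false): as keys, these are different
```

Comparing hex dumps of the key, IV and ciphertext on both platforms is a simple way to find out where the bytes start to differ.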
Looking at your three questions:
1) For plaintext encoding you will get back exactly what you put in: UTF-8 in, UTF-8 out. If you want to convert to a different encoding then you will have to do it after decryption.
2) You are right that ECB doesn't need an IV, but ECB mode leaks information and should be avoided. Use CBC or CTR mode instead, the same mode at both ends. The IV is tied to the block size, so for AES the IV is always 16 bytes or 128 bits. You cannot guarantee that ASCII and UTF-8 will be the same. UTF-8 text might have a BOM at the start. ASCII might have a C-style zero byte at the end. Don't think in terms of characters, think in terms of bytes. Everything has to match at the byte level. In CBC mode a faulty IV will munge up the first block but decrypt subsequent blocks OK.
3) Block size is fixed at 128 bits for AES and cannot be changed. Key sizes are less constrained, and can be 128, 192, or 256 bits. In practice most people seem to use 128 or 256 bits. A block is a conveniently sized processing unit that is built into the cipher at a very low level. The key determines what is done to the block in the course of the processing. That allows more flexibility for the key. The key you enter is used to build some internal structures, the "round keys". This process is called "key expansion". It is the round keys which interact with the block being processed. Because the key is used indirectly there is more flexibility about how large it can be.
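On the PHP side, the fixed block size can be observed through the IV length that the OpenSSL extension reports for each AES key size. This is only a quick check, assuming the OpenSSL extension is available.

```php
<?php
// The IV length follows the block size, so it is the same for every AES key size.
$ivLengths = [];
foreach (['aes-128-cbc', 'aes-192-cbc', 'aes-256-cbc'] as $cipher) {
    $ivLengths[$cipher] = openssl_cipher_iv_length($cipher);
    echo $cipher, ': IV length ', $ivLengths[$cipher], " bytes\n"; // 16 in all three cases
}
```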
In terms of encoding of the key, IV, plaintext, and ciphertext, AES encryption does not use encoding. AES encryption uses binary data -- a sequence of 8-bit bytes.
You need the same binary key, binary IV, and binary ciphertext on the decrypting platform, to produce the original binary plaintext.
When you are converting between character encodings and binary, you are not always guaranteed a round-trip conversion. That is, not all sequences of bytes can be converted to strings of UTF-8 characters.
However, if you treat UTF-8 plaintext as binary data, and encrypt it, and then transport the ciphertext as binary, for example, by encoding it as base64 to preserve the binary representation of the data, then when you base64-decode to reconstitute the binary ciphertext on the decrypting platform, and decrypt, the resulting binary plaintext will be the original UTF-8 character data.
Always treat the key, IV, plaintext, and ciphertext as binary data in terms of encryption and decryption. The plaintext is binary data that just might happen to be UTF-8, or some variant of ASCII, or UTF-16BE, etc. The ciphertext will probably be none of those, or, happen to be one of those purely by chance.
I am writing an affiliate system, and I want to generate a unique 32 character wide token, from the url.
The problem is that a URL can be up to 128 chars long (IIRC). Is there a way that I can create a unique 32 char wide key/token from a given URL, without any 'collisions'?
I am not sure if this is an encoding, encryption or hashing problem (probably, a mixture of all three).
I will be implementing this 'mapping function' using PHP, since that is the language I am using to build this particular system. Any suggestions on how to go about doing this?
Is it even possible to map a 128 char string into a 32 char string uniquely (i.e. no collisions?) ...
[Edit]
I just did some reading up, and found that the max length of urls is actually, something in the order of 2K. However, I am not concerned about 'silly' edge cases like that. I am pretty sure that 99.9% of the time, my imposed limit of 128 chars should be sufficient.
Is it even possible to map a 128 char string into a 32 char string uniquely (i.e. no collisions?) ...
In part. You can use a hash function like md5 or sha1. That's what they were built to do.
MD5 generates a 32 char string, and SHA1 generates a 40 char string.
Of course you can't guarantee that there won't be collisions. That's impossible since the message space is too large for your hashes (there are 2^1024 messages vs 2^128 possible hashes if you are using MD5), but these functions are meant to be collision resistant and hard to reverse.
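A minimal sketch of the hashing approach, with a made-up URL. MD5 is used here only because its hex output happens to be 32 characters long; since practical MD5 collisions can be constructed deliberately, a longer hash such as SHA-256 is the safer choice if the token length can be relaxed.

```php
<?php
$url = 'http://example.com/products?id=42&ref=affiliate'; // hypothetical URL

$token = md5($url);             // 128-bit hash as 32 hex characters
echo strlen($token), "\n";      // 32

$longer = hash('sha256', $url); // 256-bit hash as 64 hex characters
echo strlen($longer), "\n";     // 64
```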
Wikipedia references:
http://en.wikipedia.org/wiki/Hash_function
http://en.wikipedia.org/wiki/MD5
http://en.wikipedia.org/wiki/SHA-1
Is it even possible to map a 128 char string into a 32 char string uniquely (i.e. no collisions?) ...
That depends on the alphabet being used for both input and output. If your resulting 32 char hash is limited to an alphabet of a-z, you can encode a maximum of 26^32 = 1.901722×10^45 values in it. A URL can consist of at least a-z and quite a number of other characters, so can contain at least 26^128 = 1.307942×10^181 values. So, an alphabet of 26 characters is not enough.
Using a-zA-Z0-9 you can encode 62^32 = 2.272658×10^57 unique values, which is still not enough. Even an alphabet of 100 characters gives you only 100^32 = 1.0×10^64 possible values.
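The same comparison can be made in bits instead of decimal magnitudes. The helper name bitsOfInformation() is hypothetical, and the alphabets are the ones used in the estimate above.

```php
<?php
// Hypothetical helper: bits needed to distinguish every string of the given
// length over an alphabet of the given size (length * log2(alphabet size)).
function bitsOfInformation(int $length, int $alphabetSize): float
{
    return $length * log($alphabetSize, 2);
}

$inputBits  = bitsOfInformation(128, 26); // 128-char URL, a-z only
$outputBits = bitsOfInformation(32, 62);  // 32-char token, a-zA-Z0-9

printf("input:  %.1f bits\n", $inputBits);   // input:  601.7 bits
printf("output: %.1f bits\n", $outputBits);  // output: 190.5 bits

// More possible inputs than outputs means collisions cannot be ruled out.
var_dump($inputBits > $outputBits);          // bool(true)
```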
Depending on what exactly you want to do, you should either increase the length of the hash or rethink the overall approach.