I have a PHP REST (Gateway) server. The client is a Node.js server. The data exchanged between them is encrypted (crypto_secretbox) and decrypted (crypto_secretbox_open) using the libsodium "easy" API implementations for PHP and Node respectively.
Data encrypted in PHP doesn't have the 16 zero bytes (salt) at the beginning, whereas data encrypted in Node.js does.
To decrypt in Node data that was encrypted in PHP, I have to prepend 16 zero bytes (salt) before calling secretBox.decrypt.
To decrypt in PHP data that was encrypted in Node, I first have to remove the 16 zero bytes before calling \Sodium\crypto_secretbox_open.
The question: is this the best possible approach, or am I missing something very obvious?
Are you actually using secretbox_easy with Node-Sodium, and not secretbox?
secretbox requires extra bytes to be prepended/stripped. It is only available for backward compatibility; it doesn't really make sense to use it except in C, but for some reason Node-Sodium provides it.
The PHP bindings don't require these extra bytes. Like most other bindings, secretbox is actually secretbox_easy under the hood.
The good news is that Node-Sodium also provides secretbox_easy. You just need to explicitly call it secretbox_easy. No more padding required.
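For illustration, here is a minimal sketch of the byte layout difference described above, using only the standard library. The helper names are hypothetical; the value 16 corresponds to libsodium's crypto_secretbox_BOXZEROBYTES. It only shows what the manual prepend/strip workaround amounts to. Calling secretbox_easy on both sides, as suggested, makes it unnecessary.

```python
# Number of zero bytes the legacy crypto_secretbox format puts in front
# of the ciphertext (crypto_secretbox_BOXZEROBYTES in libsodium).
BOXZEROBYTES = 16


def easy_to_legacy(boxed: bytes) -> bytes:
    """Hypothetical helper: turn secretbox_easy output into the legacy layout."""
    return b"\x00" * BOXZEROBYTES + boxed


def legacy_to_easy(boxed: bytes) -> bytes:
    """Hypothetical helper: strip the legacy zero prefix again."""
    if boxed[:BOXZEROBYTES] != b"\x00" * BOXZEROBYTES:
        raise ValueError("input does not start with the legacy zero prefix")
    return boxed[BOXZEROBYTES:]
```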
I've just had an interesting little problem.
Using Laravel 4, I encrypt some entries before adding them to a db, including email address.
The db was set up with the default varchar length of 255.
I've just had an entry that encrypted to 309 characters, blowing up the encryption by cutting off the last 50-odd characters in the db.
I've (temporarily) fixed this by simply increasing the varchar length to 500, which should - in theory - cover me from this, but I want to be sure.
I'm not sure how the encryption works, but is there a way to tell what maximum character length to expect from the encrypt output for the sake of setting my database?
Should I change my field type from varchar to something else to ensure this doesn't happen again?
Conclusion
First, be warned that there have been quite a few changes between 4.0.0 and 4.2.16 (which seems to be the latest version).
The scheme starts with a staggering overhead of 188 characters for 4.2 and about 244 for 4.0 (provided I did not forget any newlines and such). So to be safe you will probably need on the order of 200 characters for 4.2 and 256 characters for 4.0, plus 1.8 times the plaintext size, if the characters in the plaintext are encoded as single bytes.
Analysis
I just looked into the source code of Laravel 4.0 and Laravel 4.2 with regard to this function. Let's get into the size first:
the data is serialized, so the encryption size depends on the size of the type of the value (which is probably a string);
the serialized data is PKCS#7 padded using Rijndael 256 or AES, so that means adding 1 to 32 bytes or 1 to 16 bytes - depending on the use of 4.0 or 4.2;
this data is encrypted with the key and an IV;
both the ciphertext and IV are separately converted to base64;
an HMAC using SHA-256 over the base64 encoded ciphertext is calculated, returning a lowercase hex string of 64 bytes;
then the ciphertext consists of base64_encode(json_encode(compact('iv', 'value', 'mac'))) (where the value is the base 64 ciphertext and mac is the HMAC value, of course).
A string in PHP is serialized as s:<i>:"<s>"; where <i> is the size of the string, and <s> is the string (I'm presuming PHP platform encoding here with regards to the size). Note that I'm not 100% sure that Laravel doesn't use any wrapping around the string value, maybe somebody could clear that up for me.
Calculation
All in all, everything depends quite a lot on character encoding, and it would be rather dangerous for me to make a good estimation. Let's assume a 1:1 relation between byte and character for now (e.g. US-ASCII):
serialization adds up to 9 characters for strings up to 999 characters
padding adds up to 16 or 32 bytes, which we assume are characters too
encryption keeps data the same size
base64 in PHP creates ceil(len / 3) * 4 characters, but let's simplify that to (len * 4) / 3 + 4; the base 64 encoded IV is 44 characters for the 32-byte IV of 4.0 (24 characters for the 16-byte AES IV of 4.2)
the full HMAC is 64 characters
the JSON encoding adds 3*5 characters for quotes and colons, plus 4 characters for the braces and commas around them, totaling 19 characters (I'm presuming json_encode does not end with white space here), and the key names iv, value and mac add another 10; base 64 again adds the same overhead
OK, so I'm getting a bit tired here, but you can see it at least twice expands the plaintext with base64 encoding. In the end it's a scheme that adds quite a lot of overhead; they could just have used base64(IV|ciphertext|mac) to seriously cut down on overhead.
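To finish the arithmetic, here is a rough sketch that adds up the steps listed above. The function names are made up for illustration. It assumes single-byte characters and a serialized string with no extra wrapping. It also ignores that PHP's json_encode escapes "/" as "\/" by default, which can add a few characters whenever the base64 text contains slashes. Treat the result as an estimate, not a guaranteed maximum.

```python
def b64_len(n: int) -> int:
    """Length of the padded base64 encoding of n bytes: ceil(n / 3) * 4."""
    return 4 * ((n + 2) // 3)


def serialized_len(n: int) -> int:
    """Length of PHP's s:<i>:"<s>"; form for a string of n single-byte characters."""
    return len(f's:{n}:"";') + n


def estimate_encrypted_len(plain_len: int, block_size: int, iv_size: int) -> int:
    """Estimate the output length of the scheme described above.

    block_size/iv_size: 32/32 for Rijndael-256 (4.0), 16/16 for AES (4.2).
    """
    # PKCS#7 always adds between 1 and block_size bytes.
    padded = (serialized_len(plain_len) // block_size + 1) * block_size
    value = b64_len(padded)   # base64 of the ciphertext
    iv = b64_len(iv_size)     # base64 of the IV
    mac = 64                  # hex-encoded HMAC-SHA-256
    json_len = len('{"iv":"","value":"","mac":""}') + iv + value + mac
    return b64_len(json_len)  # the final base64 around the JSON


print(estimate_encrypted_len(0, 16, 16))  # 188, the 4.2 overhead quoted above
print(estimate_encrypted_len(0, 32, 32))  # 244, the 4.0 overhead quoted above
```

Calling it with the longest input length you accept gives a starting point for sizing the column, to which some slack should be added for the escaped slashes.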
Notes
if you're not on 4.2 now, I would seriously consider upgrading to the latest version because 4.2 fixes quite a lot of security issues
the sample code uses a string as key, and it is unclear if it is easy to use bytes instead;
the documentation does warn against key sizes other than the Rijndael defaults, but forgets to mention string encoding issues;
padding is always performed, even if CTR mode is used, which kind of defeats the purpose;
Laravel pads using PKCS#7 padding, but as the serialization always seems to end with ;, that was not really necessary;
it's a nice thing to see authenticated encryption being used for database encryption (the IV wasn't used, fixed in 4.2).
@MaartenBodewes does a very good job of explaining how long the actual string will probably be. However, you can never know it for sure, so here are two options to deal with the situation.
1. Make your field text
Change the field from a limited varchar to a "self-expanding" text. This is probably the simpler option, and I'd definitely recommend it if you expect rather long input.
2. Just make your varchar longer
As you did already, make your varchar longer depending on what input length you expect/allow. I'd multiply by a factor of 5.
But don't stop there! Add a check in your code to make sure the data doesn't get truncated:
$encrypted = Crypt::encrypt($input);
if(strlen($encrypted) > 500){
// do something about it
}
What can you do about it?
You could either write an error to the log and include the encrypted data (so you can manually re-insert it after you have extended the length of your DB field):
Log::error('An encrypted value was too long for the DB field xy. Length: '.strlen($encrypted).' Data: '.$encrypted);
Obviously that means you have to check the logs frequently (or have them mailed to you), and also that the user could encounter errors while using the application because of the incorrect data in your DB.
The other way would be to throw an exception (and display an error to the user) and of course also write it to the log so you can fix it...
Anyways
Whether you choose option 1 or 2 you should always restrict the accepted length of your input fields. Server side and client side.
I've been playing around with PHP mcrypt over the weekend, using AES to encrypt text strings with a key. Later I worked up a tiny PHP tool to encrypt / decrypt strings with AES/mcrypt. When the key is "wrong" and the text doesn't get decrypted, you end up with what I think is binary, from what I've read around (http://i.imgur.com/jF8cZMZ.png). Is there any way in PHP to check whether the variable holds binary or a properly decoded string?
My apologies if the title and the intro are a bit misleading.
When you encrypt text and then try to decrypt it, you will get the same text, but when you try to decrypt random data, there is a small chance that the result will be text (decreasing with length of data). You haven't specified what kind of data we are talking about, but determining if the decryption is successful by applying a heuristic is a bad idea. It is slow and may lead to false positives.
You should have a checksum or something like that to determine if the decrypted result is valid. This could easily be done by running sha1 on the plaintext data, prepending the result to the text and encrypting it as a whole. When you decrypt it, you can split the resulting string (sha1 output has a fixed size, so you know where to split), run sha1 on the text part and compare it with the hash part. If it matches, you have a valid result. You can of course improve the security a little by using SHA-256 or SHA-512.
That is just one way of doing it, but it might not be the best. Better ways would be to use an authenticated mode of operation for AES like GCM or CCM, or to use encrypt-then-MAC with a good MAC function like HMAC-SHA512.
With the approaches above you're free to encrypt any kind of data, because you no longer depend on determining whether the result is text or not.
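A minimal sketch of that framing, written in Python with SHA-256 in place of sha1. The function names are hypothetical, and the encryption step itself is left out: add_checksum would run before encrypting and strip_checksum after decrypting.

```python
import hashlib
import hmac
from typing import Optional

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes, fixed


def add_checksum(plaintext: bytes) -> bytes:
    """Prepend the SHA-256 digest of the plaintext; encrypt the result as a whole."""
    return hashlib.sha256(plaintext).digest() + plaintext


def strip_checksum(decrypted: bytes) -> Optional[bytes]:
    """Split off the digest and return the text only if the digest matches."""
    digest, text = decrypted[:DIGEST_SIZE], decrypted[DIGEST_SIZE:]
    if len(digest) == DIGEST_SIZE and hmac.compare_digest(
        hashlib.sha256(text).digest(), digest
    ):
        return text
    return None  # wrong key or corrupted data
```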
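The encrypt-then-MAC variant could look roughly like the sketch below, again in Python with only the standard library. The names are hypothetical, the ciphertext is treated as opaque bytes (IV plus AES output from whatever library is in use), and the MAC key is assumed to be separate from the encryption key.

```python
import hashlib
import hmac
from typing import Optional

TAG_SIZE = hashlib.sha512().digest_size  # 64 bytes


def append_tag(mac_key: bytes, iv_and_ciphertext: bytes) -> bytes:
    """Append an HMAC-SHA512 tag computed over the IV and ciphertext."""
    tag = hmac.new(mac_key, iv_and_ciphertext, hashlib.sha512).digest()
    return iv_and_ciphertext + tag


def verify_and_strip_tag(mac_key: bytes, message: bytes) -> Optional[bytes]:
    """Return the IV and ciphertext if the tag is valid, else None.

    Only decrypt after this check has succeeded.
    """
    if len(message) < TAG_SIZE:
        return None
    body, tag = message[:-TAG_SIZE], message[-TAG_SIZE:]
    expected = hmac.new(mac_key, body, hashlib.sha512).digest()
    return body if hmac.compare_digest(tag, expected) else None
```

With this layout, a wrong key or tampered data is detected before decryption, so the decrypted bytes never need to be inspected.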
I am working on a data intensive project where I have been using PHP for fetching data and encrypting it using phpseclib. A chunk of the data has been encrypted in AES with the ECB mode -- however the key length is only 10. I am able to decrypt the data successfully.
However, I need to use Python in the later stages of the project and consequently need to decrypt my data using it. I tried employing PyCrypto but it tells me the key length must be 16, 24 or 32 bytes long, which is not the case. According to the phpseclib documentation the "keys are null-padded to the closest valid size", but I'm not sure how to implement that in Python. Simply extending the length of the string with 6 spaces is not working.
What should I do?
I strongly recommend you adjust your PHP code to use (at least) a sixteen byte key, otherwise your crypto system is considerably weaker than it might otherwise be.
I would also recommend you switch to CBC-mode, as ECB-mode may reveal patterns in your input data. Ensure you use a random IV each time you encrypt and store this with the ciphertext.
Finally, to address your original question:
According to the phpseclib documentation the "keys are null-padded to the closest valid size", but I'm not sure how to implement that in Python. Simply extending the length of the string with 6 spaces is not working.
The space character 0x20 is not the same as the null character 0x00, so pad the key with null bytes rather than spaces.
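A minimal sketch of that padding in Python, assuming the key is available as bytes. The function name is made up, and raising an error for keys longer than 32 bytes is a choice made here for illustration, not something taken from the phpseclib documentation.

```python
def null_pad_key(key: bytes) -> bytes:
    """Pad a short key with 0x00 bytes up to the next valid AES key size."""
    for size in (16, 24, 32):
        if len(key) <= size:
            return key.ljust(size, b"\x00")
    raise ValueError("key is longer than 32 bytes")


padded = null_pad_key(b"0123456789")  # a 10-byte key, as in the question
print(len(padded))  # 16
```

The padded key can then be passed to the AES implementation in place of the original 10-byte key.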
I have the luxury of starting from scratch, so I'm wondering what would be a good hash to use between PHP and Python.
I just need to be able to generate the same hash from the same text in each language.
From what I read, PHP's md5() isn't going to work nicely.
md5() always plays nicely - it always does the same thing because it is a standard hashing format.
The only tripping hazard is that some languages' default return format for an MD5 hash is a 32-byte ASCII string containing hexadecimal characters, while others use a 16-byte string containing a literal binary representation of the hash.
PHP's md5() by default returns a 32-byte string, but if you pass true as the second argument, it will return the 16-byte form instead. So as long as you know which version your other language uses (in your case Python), you just need to make sure that you get the correct format from PHP.
You may be better off using the 32-byte form anyway, depending on how your applications communicate. If you use a communication protocol based on plain text (such as HTTP), it is usually safer to use plain-text versions of anything. Binary, in this case, is smaller, but liable to get corrupted in transmission by badly written servers/clients.
The binary vs. ASCII problem applies to just about any hashing algorithm you can think of.
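On the Python side, the two forms map onto hexdigest() and digest() in hashlib. A small sketch, assuming the text is encoded to the same bytes (UTF-8 here) in both languages:

```python
import hashlib

text = "hello".encode("utf-8")

# 32-character hexadecimal form, comparable to PHP's md5($text)
hex_form = hashlib.md5(text).hexdigest()

# 16-byte binary form, comparable to PHP's md5($text, true)
raw_form = hashlib.md5(text).digest()

print(hex_form)       # 5d41402abc4b2a76b9719d911017c592
print(len(raw_form))  # 16
```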
What is it you want from the hash? (portability, security, performance....)
From what I read, PHP's md5() isn't going to work nicely.
What did you read? Why won't it work?
I just need to be able to generate the same hash from the same text in each language
Since PHP only provides crc32 (very insecure), md5 and sha1 out of the box, it's not exactly a huge amount of testing you need to do. Of course if portability is not an issue then there's the mcrypt and openssl apis available. And more recently the hash PECL gives you a huge choice.
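I suggest using sha1, as it is implemented out of the box in both and does not have the well-known collision vulnerabilities of md5. See: http://en.wikipedia.org/wiki/MD5#Collision_vulnerabilities. Note that practical SHA-1 collisions have since been demonstrated as well, so where collision resistance matters, SHA-256 is the safer choice.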
Can anyone please tell me how I can encrypt/decrypt a file instead of a string? I mean I need to encrypt the entire file; it may be an Excel sheet, a document or even a text file.
instead of string.
That rather implies that you already know how to encrypt the string - and since you're being specific about the algorithm, that you can create an appropriate representation for the other tools being used to operate on the data. But you haven't said what mode of operation you need to use - implementing this using CBC is trivial.
It's also not stated, but implied in your question, that the data is too large to load into a string (otherwise it's simply a case of encrypting file_get_contents()).
There doesn't seem to be much in the way of documentation, but I would expect the modified key required for ECB to be updated in the resource created by mcrypt_module_open() and modified by mcrypt_generic_init(). Then it's just a matter of feeding in parts of the file sized as a multiple of the block size (see mcrypt_get_block_size).
See http://www.php.net/manual/en/function.mcrypt-module-open.php
C.
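The chunking idea above, reading the file in pieces whose size is a multiple of the cipher's block size, can be sketched in a language-neutral way. The example below is Python, the function name and default sizes are made up, and the actual encryption call is left to whichever library is used.

```python
from typing import Iterator


def iter_chunks(
    path: str, block_size: int = 16, blocks_per_chunk: int = 4096
) -> Iterator[bytes]:
    """Yield the file's contents in chunks that are a multiple of block_size.

    Only the final chunk may be shorter; it still has to be padded by the
    caller before being encrypted.
    """
    chunk_size = block_size * blocks_per_chunk
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk
```

Each yielded chunk would be passed to the cipher in turn and the output appended to the destination file, so the whole file never has to sit in memory.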
I'm a little confused, can't you just read/write the string to a file using functions like file_get_contents and file_put_contents?
If you need an encryption-class there are some over at PHP classes. There is also a paid solution here: phpAES.
I guess it is better to create your own library for it and expose an API that accepts a file path instead of the file's content. It can open and read the file and do the encryption / decryption.
You can use your own or a pre-existing algorithm for the encryption/decryption. You can also give that API an argument for the file path where the decrypted data should be stored, or have it replace the original file, or whatever suits you.