This seemingly simple question corresponds to another question that was asked with regard to MySQL: how does one store the hex value that results from a SHA1 hash in a PostgreSQL database?
Note: I realize I could use a VARCHAR(40) field, but this isn't efficient, as the data is in hex. Also, I am using PHP to interact with the database, so I can use PHP functions if necessary, but if this is the case, what do I store the result as in the database?
I would store as bytea, hex encoded. Converting the human-readable hex data to bytea is simply a matter of:
('\x' || sha1_hex_value)::bytea
The only real disadvantage here is that, depending on your app framework, you may get a binary representation back out. If not, you will get an escaped version and, depending on the escape settings, may want to convert it to binary yourself (if it is in hex format, though, you can just strip off the \x at the front of the value and use the rest as hex).
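As a rough illustration of the size difference and of the \x prefix handling described above, here is a minimal sketch. It uses Python rather than PHP purely for illustration; the input string and the helper name strip_pg_hex_prefix are made up for the example.

```python
import hashlib

# Hypothetical input; any bytes would do.
digest_hex = hashlib.sha1(b"hello world").hexdigest()  # 40 hex characters
digest_raw = bytes.fromhex(digest_hex)                 # 20 raw bytes

# PostgreSQL's hex format for bytea is the hex digits prefixed with \x.
pg_hex_value = "\\x" + digest_hex


def strip_pg_hex_prefix(value: str) -> str:
    """Return the plain hex digits from a backslash-x prefixed bytea text value."""
    return value[2:] if value.startswith("\\x") else value


print(len(digest_hex), len(digest_raw))
print(strip_pg_hex_prefix(pg_hex_value) == digest_hex)
```

The raw form is half the size of the hex text, which is the efficiency argument for bytea over VARCHAR(40).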
Related
I asked a question here and managed to partially implement the advice. Data is now stored encrypted in a binary field (varbinary(500)), after I removed the AES-256 encryption and kept CodeIgniter's default AES-128 encryption.
However, I have some questions that I can't find answers to, since I cannot find many articles on this subject. If anyone can answer my questions, or point me to a book or any other literature for further reading, I would be very grateful.
Why must encrypted data be stored in a binary type field? What is wrong with storing it in longtext or varchar? Does that make the encryption worthless?
Why must I first encode the variable and then encrypt it when I store the data in a binary type field, when I don't have to do that to store the data in a varchar field?
base64_encode($clientName);
$encClientName = $this->encryption->encrypt($clientName);
In my previous question (see the link at the top) I was advised to use a nonce. Since I didn't know how to use that with the CodeIgniter library, I didn't implement that part. Does that make my data less secure? Can anyone post a code snippet showing how to use a nonce with CodeIgniter?
Again, any link to reading material on this subject (storing encrypted data in the database with php) will be deeply appreciated.
Why must encrypted data be stored in a binary type field? What is wrong with storing it in longtext or varchar? Does that make the encryption worthless?
Encrypted data is binary. It will frequently contain byte sequences which are invalid in your text encoding, making them impossible to insert into a column which expects a string (like VARCHAR or TEXT).
The data type you probably want is either VARBINARY (which is similar to VARCHAR, but not a string) or BLOB (likewise, but for TEXT -- there's also MEDIUMBLOB, LONGBLOB, etc).
Why must I first encode the variable and then encrypt it when I store the data in a binary type field, when I don't have to do that to store the data in a varchar field?
You don't. This is backwards.
If you were going to use a string-type column to store encrypted data, you could "fake it" by Base64 encoding the data after encryption. However, you're still better off using a binary-type column, at which point you don't need any additional encoding.
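The order of operations described here (encrypt first, then Base64 encode only if the column is a string type) can be sketched as follows. This is an illustration in Python rather than PHP, and the random bytes are only a stand-in for real ciphertext.

```python
import base64
import os

# Stand-in for real ciphertext: 48 random bytes (an assumption for the demo).
ciphertext = os.urandom(48)

# Encode AFTER encryption so the value is plain ASCII and fits a text column.
stored_text = base64.b64encode(ciphertext).decode("ascii")

# Decode BEFORE decryption to recover the exact original bytes.
recovered = base64.b64decode(stored_text)

print(len(ciphertext), len(stored_text))
print(recovered == ciphertext)
```

Note the cost: the Base64 text is about a third larger than the raw bytes, which is one reason a binary column is the better choice.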
In my previous question (see the link at the top) I was advised to use a nonce. Since I didn't know how to use that with the CodeIgniter library, I didn't implement that part. Does that make my data less secure?
Based on what I'm seeing in the documentation, I think the CodeIgniter Encryption library handles this for you by default. You shouldn't have to do anything additional.
In addition to duskwuff's answer, I covered your questions from a more crypto-related viewpoint. He just managed to post a minute before I did :)
Encrypted data must be stored in a binary type field due to the way that Character Encodings work. I recommend you read, if you haven't already, this excellent article by Joel Spolsky that details this very well.
It is important to remember that encryption algorithms operate on raw binary data. That is, a bit string. Literal 1's and 0's that can be interpreted in many ways. You can represent this data as unsigned byte values (255, 255), Hex (0xFF, 0xFF), whatever, they are really just bit strings underneath. Another property of encryption algorithms (or good ones, at least) is that the result of encryption should be indistinguishable from random data. That is, given an encrypted file and a blob of CSPRNG generated random data that have the same length, you should not be able to determine which is which.
Now let's presume you wanted to store this data in a field that expects UTF-8 strings. Because the bit string we store in this field could contain any possible sequence of bytes, as we discussed above, we can't assume that the sequence of bytes we store will denote valid UTF-8 characters. The implication of this is that binary data decoded as UTF-8 text and then encoded back to binary is not guaranteed to give you the original binary data. In fact, it rarely will.
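A small demonstration of that round-trip failure, sketched in Python for illustration; the four bytes are arbitrary values chosen because they are not valid UTF-8.

```python
# Arbitrary bytes of the kind ciphertext may contain; 0xFF, 0xFE and a lone
# 0x80 are not valid UTF-8.
raw = bytes([0xFF, 0xFE, 0x41, 0x80])

try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Lenient decoding substitutes U+FFFD for each invalid byte, losing information.
lossy_text = raw.decode("utf-8", errors="replace")
round_tripped = lossy_text.encode("utf-8")

print(strict_ok)
print(round_tripped == raw)
```

Strict decoding rejects the data outright, and lenient decoding silently changes it, so neither path gives back the original bytes.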
Your second question is also somewhat to do with encodings, but the encoding here is Base64. Base64 is an encoding that plays very nicely with (in fact, it was designed for) binary data. It is a way to represent binary data using common characters (a-z, A-Z, 0-9, plus + and / in most implementations). I am willing to bet that the encrypt function you are using either calls base64_encode itself or one of the functions it calls does. What you should actually be interested in is whether the output of the encrypt function is a Base64 string or actual binary data, as this will affect the type of data field you use in your database (e.g. binary vs varchar).
I believe in your last question you stated that you were using CTR, so the following applies to the nonce used by CTR only.
CTR works by encrypting a counter value, and then XOR-ing this encrypted counter value with your data. This counter value is made up of two things: the nonce, and the actual value of the counter, which normally starts at 0. Technically, your nonce can be any length, but I believe a common value is 12 bytes. Because we are discussing AES, the total size of the counter value should be 16 bytes. That is, 12 bytes of nonce and 4 bytes of counter.
This is the important part. Every encryption operation should:
Generate a new 12 byte nonce to use for that operation.
Your implementation should add the counter and perform the actual encryption.
Once you have the final ciphertext, prepend the nonce to this ciphertext so that the result is (len(ciphertext) + 12) bytes long.
Then store this final result in your database.
Repeating a nonce, using a static nonce, or encrypting more than 2^32 blocks (64 GiB of data with AES) under a single 12-byte nonce will make your ciphertext vulnerable.
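The steps above can be sketched roughly as follows, in Python for illustration. To keep it runnable with only the standard library, demo_block_encrypt is a hypothetical stand-in built on SHA-256; it is NOT AES and not secure. The function names are invented for this example, and a real application should rely on a vetted library rather than hand-rolled CTR.

```python
import hashlib
import os

NONCE_LEN = 12  # bytes of nonce, as in the text
BLOCK_LEN = 16  # AES block size in bytes


def demo_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for AES block encryption. NOT AES and NOT secure; it only
    exists so the sketch runs with the standard library."""
    return hashlib.sha256(key + block).digest()[:BLOCK_LEN]


def ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with a keystream made of encrypted (nonce || counter) blocks."""
    out = bytearray()
    for counter, offset in enumerate(range(0, len(data), BLOCK_LEN)):
        # 12 bytes of nonce + 4 bytes of big-endian counter = 16 bytes.
        counter_block = nonce + counter.to_bytes(4, "big")
        keystream = demo_block_encrypt(key, counter_block)
        chunk = data[offset:offset + BLOCK_LEN]
        out.extend(c ^ k for c, k in zip(chunk, keystream))
    return bytes(out)


def encrypt_for_storage(key: bytes, plaintext: bytes) -> bytes:
    nonce = os.urandom(NONCE_LEN)                   # fresh nonce per operation
    return nonce + ctr_xor(key, nonce, plaintext)   # nonce is prepended


def decrypt_from_storage(key: bytes, stored: bytes) -> bytes:
    nonce, ciphertext = stored[:NONCE_LEN], stored[NONCE_LEN:]
    return ctr_xor(key, nonce, ciphertext)


key = os.urandom(16)
message = b"client name goes here"
stored = encrypt_for_storage(key, message)
print(len(stored) == len(message) + NONCE_LEN)
print(decrypt_from_storage(key, stored) == message)
```

Because the nonce travels with the ciphertext, decryption needs only the key and the stored value. The 4-byte counter also makes the 2^32 block limit concrete: to_bytes(4, "big") raises an error rather than wrapping around.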
I'm trying to use mcrypt_create_iv to generate random salts. When I test whether the salt is generated by echoing it out, it checks out, but it isn't the required length that I pass to it as a parameter (32); instead it's less than that.
When I store it in my database table however, it shows up as something like this K??5P?M???4?o???"?0??
I'm sure it's something to do with the database, but I tried to change the collation of it to correspond with the config settings of CI, which is utf8_general_ci, but it doesn't solve the problem, instead it generates a much smaller salt.
Does anyone know of what may be wrong? Thanks for any feedback/help
The function mcrypt_create_iv() returns a binary string, containing \0 and other unreadable characters. Depending on how you want to use the salts, you first have to encode those byte strings to an accepted alphabet. It is also possible to store binary strings in the database, but of course you will then have a problem displaying them.
Since salts are normally used for password storage, I would recommend having a look at PHP's password_hash() function. It generates a salt automatically and includes it in the resulting hash value, so you don't need a separate database field for the salt.
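To illustrate what encoding the byte string to an accepted alphabet looks like, here is a short sketch in Python (used for illustration instead of PHP). os.urandom stands in for mcrypt_create_iv, and hex and Base64 are just two possible alphabets.

```python
import base64
import os

# 32 random bytes; many of them will not be printable characters.
salt = os.urandom(32)

salt_hex = salt.hex()                              # characters 0-9 and a-f
salt_b64 = base64.b64encode(salt).decode("ascii")  # Base64 alphabet

print(len(salt), len(salt_hex), len(salt_b64))
```

Either encoded form is plain ASCII, so it survives a text column and can be echoed without looking truncated or garbled; the raw 32 bytes are recovered by decoding.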
I'm coding a tool to insert test data into a database. Some of the fields are blobs which are the (mcrypt) encrypted representations of strings.
I'm creating binary variables, but can't find a way to properly output it in the format I see in PHPmyAdmin when I export (known good) data as a reference.
For example:
I used PHPmyAdmin to export a known string. It produces a value of 0xe07861bbcaf39ad54a0b85389a9f08886997f8cafffe871b8569c2fcf3293bcc in the VALUES list.
Running bin2hex on my binary field (which I've confirmed contains the same contents as known good data) results in a representation of 7a49e1b3d7c6357cab6b4f9c61bc4d8535c23cbc8789e28ce9321993e9372c80
I can't find any documentation on how to properly convert binary PHP data to the (hex) format that MySQL uses. I've read the similar questions that seem related.
How can I get from a binary field to the 0x.... value that PHPmyAdmin makes?
It's as simple as:
'0x' . bin2hex($bin)
As for the different outputs, my bet is that you are mixing up the original data.
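For comparison, the same conversion sketched in Python (for illustration only); to_mysql_hex_literal is a made-up helper name, and the sample bytes are the first four bytes of the exported value quoted in the question.

```python
def to_mysql_hex_literal(data: bytes) -> str:
    """Format raw bytes as a 0x... hexadecimal literal."""
    return "0x" + data.hex()


print(to_mysql_hex_literal(bytes([0xE0, 0x78, 0x61, 0xBB])))  # prints 0xe07861bb
```

If two values that should be identical produce different hex strings here, the underlying bytes differ, which points back at the data rather than at the conversion.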
The whole point of designating data as binary is to simply treat the binary sequence as a raw, untouched sequence of bytes.
Given that MySQL has BLOB, BINARY and VARBINARY data types, why isn't it possible to store and retrieve any arbitrary binary stream of data from a PHP script without having the need to escape the sequence with mysql_real_escape_string or addslashes?
Because binary data are still serialized to a string… So, for example, imagine your $binary_data had the value a 'b" c. Then the query INSERT INTO foo VALUES $binary_data would fail.
The whole point of designating data as binary is to simply treat the binary sequence as a raw, untouched sequence of bytes.
You are wrong.
The only point of designating data as binary is to mark it as not subject to character set recoding. That's all.
why isn't it possible to store and retrieve any arbitrary binary stream of data from a php script without having the need to escape the sequence with mysql_real_escape_string or addslashes?
Who said it's impossible?
It's quite possible, both to store and to retrieve.
The whole point of prepared statements is to send an arbitrary binary stream directly to mysql.
Why is it necessary to add escape sequences to a binary string when storing to a MySQL Database?
If you are talking of SQL, you have to understand what it is first.
SQL is a programming language.
And, like any language, it has its own syntax to follow.
If you're going to add your raw binary data to this program, you have to make the data satisfy these rules. That's what escaping is for.
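The difference between pasting bytes into SQL text and binding them as a parameter can be sketched as follows. This uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so the example is self-contained), with the table name foo borrowed from the earlier answer.

```python
import sqlite3

# Bytes that would break a naively concatenated SQL string: quotes, a NUL,
# and a byte that is not valid UTF-8.
payload = b"a 'b\" c\x00\xff"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (data BLOB)")

# The ? placeholder sends the bytes as a bound parameter, not as SQL text,
# so no escaping is involved.
conn.execute("INSERT INTO foo (data) VALUES (?)", (payload,))

(stored,) = conn.execute("SELECT data FROM foo").fetchone()
print(stored == payload)
conn.close()
```

The bytes never become part of the SQL program, so the language's syntax rules do not apply to them; in PHP the same idea is available through prepared statements in mysqli or PDO.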
So I have been browsing the internet, and came across the MySQL built-in function AES_ENCRYPT. It doesn't seem too hard to use, but some sources tell me to store the encrypted data as a VARCHAR, and some say to store it as a BLOB. What should I store the encrypted data as?
Many encryption and compression functions return strings for which the result might contain arbitrary byte values. If you want to store these results, use a column with a VARBINARY or BLOB binary string data type. This will avoid potential problems with trailing space removal or character set conversion that would change data values, such as may occur if you use a nonbinary string data type (CHAR, VARCHAR, TEXT).
Source: http://dev.mysql.com/doc/refman/5.5/en/encryption-functions.html
If you need to use VARCHAR, rather than BLOB, then convert the encrypted binary to Base64 which only uses printable characters and can be safely stored as VARCHAR. Of course you will need to convert it back from Base64 to binary before decrypting.
I have always used BLOBs to store encrypted data in MySQL.
You can use BINARY, which is a binary string type. It should work; I am using it. Reply if it doesn't work for you.