I have a need to store an encrypted but recoverable (by admin) password in MySQL, from PHP. AFAIK, the most straightforward way to do this is with openssl_public_encrypt(), but I'm not sure what column type is needed. Can I make any reliable judgment on the maximum length of encrypted output, based upon the size of the key and the input?
Or am I forced to use a huge field (e.g. BLOB), and just hope it works all the time?
The openssl_public_encrypt function limits the size of the data you can encrypt to the length of the key. If you use padding (recommended), you lose an extra 11 bytes.
However, the PKCS#1 standard, which OpenSSL uses, specifies a padding scheme (so you can encrypt smaller quantities without losing security), and that padding scheme takes a minimum of 11 bytes (it will be longer if the value you're encrypting is smaller). So the highest number of bits you can encrypt with a 1024-bit key is 936 bits because of this (unless you disable the padding by adding the OPENSSL_NO_PADDING flag, in which case you can go up to 1023-1024 bits). With a 2048-bit key it's 1960 bits instead.
Of course you should never disable padding, because that would make identical passwords encrypt to the same value.
So for a 1024-bit key the maximum password input length is 117 chars.
For a 2048-bit key it's 245 chars.
I'm not 100% sure of the output length, but a simple trial should confirm it. The output length is a simple function of the key length: RSA ciphertext is the same size as the key modulus, so for a 2048-bit key it is 256 bytes.
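The sizes discussed above can be checked with a little arithmetic. This is a minimal sketch, assuming PKCS#1 v1.5 padding (the 11-byte minimum overhead mentioned earlier); the function name is made up for illustration and it does not call OpenSSL at all.

```python
def rsa_pkcs1_v15_sizes(key_bits):
    """Return (max plaintext bytes, ciphertext bytes) for an RSA key.

    Assumes PKCS#1 v1.5 padding, which needs at least 11 bytes of overhead.
    """
    key_bytes = key_bits // 8
    padding_overhead = 11
    # The ciphertext is always exactly as long as the key modulus.
    return key_bytes - padding_overhead, key_bytes


for bits in (1024, 2048):
    max_in, out = rsa_pkcs1_v15_sizes(bits)
    print(f"{bits}-bit key: up to {max_in} bytes in, {out} bytes out")
```

The second number is the one that matters for the column definition: it does not depend on the length of the password.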
You should use a binary string with the required length to store the password.
For speed reasons it's best to use a limited length index on the field.
Do not use blob (!) because that will slow things way down for no benefit.
CREATE TABLE user (
  id INTEGER UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  username VARCHAR(50) NOT NULL,
  passRSA BINARY(256),        -- double-check the length against your key size
  INDEX ipass (passRSA(10))   -- only indexes the first 10 bytes, for speed reasons
) ENGINE = InnoDB;
Adding extra bytes to the index will just slow things down and grow the index file for no benefit.
As far as I know, sha2 generates a 256-bit hash.
256 bits / 8 = 32 bytes.
So it should only need a varchar(32) field in the database. But I saw an article saying a sha2 hash requires a varchar(64) field. Is that true? Can someone explain, please?
Hashes are generally represented as hexadecimal strings:
string(64) "316a2017faa1ee410aadfb159097b8af260a258aa4210c550844cab89083111d"
In this case, SHA256 is 64 bytes. However, you may choose to store it in its binary form. This will make it take half as much space (32 bytes) but will make it unreadable in your database shell:
string(32) "̵9�~Rbgc\�7ME���)Fw�w��E�kc5"
Whether you store as a 64 byte string or a 32 byte binary is up to you.
Use varchar(64) or string(64). You need 64 characters to represent the 256 bits of SHA-256: it is written in hexadecimal, each digit represents 4 bits, and 256/4 = 64.
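The two lengths are easy to confirm. A minimal sketch in Python (the input string is an arbitrary placeholder):

```python
import hashlib

digest = hashlib.sha256(b"example input")

raw = digest.digest()          # binary form: 32 bytes
hex_text = digest.hexdigest()  # hexadecimal form: 64 characters

print(len(raw), len(hex_text))
```

Both forms carry the same 256 bits; the hex string just spends two characters per byte.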
There are two ways to specify a key and an IV for a RijndaelManaged object. One is by calling CreateEncryptor:
var encryptor = rij.CreateEncryptor(Encoding.UTF8.GetBytes(key), Encoding.UTF8.GetBytes(iv));
and another one by directly setting Key and IV properties:
rij.Key = "1111222233334444";
rij.IV = "1111222233334444";
As long as the length of the Key and IV is 16 bytes, both methods produce the same result. But if your key is shorter than 16 bytes, the first method still allows you to encode the data and the second method fails with an exception.
Now this may sound like an absolutely abstract question, but I have to use PHP & the key which is only 10 bytes long in order to send an encrypted message to a server which uses the first method.
So the question is: How does CreateEncryptor expand the key and is there a PHP implementation? I cannot alter the C# code so I'm forced to replicate this behaviour in PHP.
I'm going to have to start with some assumptions. (TL;DR - The solution is about two-thirds of the way down but the journey is way cooler).
First, in your example you set IV and Key to strings. This can't be done. I'm therefore going to assume we call GetBytes() on the strings, which is a terrible idea by the way, as there are fewer potential byte values in usable ASCII space than the 256 values a byte can hold; that's what GenerateIV() and GenerateKey() are for. I'll get to this at the very end.
Next I'm going to assume you're using the default block, key and feedback size for RijndaelManaged: 128, 256 and 128 respectively.
Now we'll decompile the Rijndael CreateEncryptor() call. When it creates the Transform object it doesn't do much of anything with the key at all (except set m_Nk, which I'll come to later). Instead it goes straight to generating a key expansion from the bytes it is given.
Now it gets interesting:
switch (this.m_blockSizeBits > rgbKey.Length * 8 ? this.m_blockSizeBits : rgbKey.Length * 8)
So:
128 > len(k) x 8 = 128
128 <= len(k) x 8 = len(k) x 8
128 / 8 = 16, so if len(k) is 16 we can expect to switch on len(k) x 8. If it's more, then it will switch on len(k) x 8 too. If it's less it will switch on the block size, 128.
Valid switch values are 128, 192 and 256. That means it will only fall to default (and throw an exception) if it's over 16 bytes in length and not a valid block (not key) length of some sort.
In other words, it never checks against the key length specified in the RijndaelManaged object. It goes straight in to the key expansion and starts operating at the block level, as long as the key length (in bits) is one of 128, 192, 256 or less than 128. This is actually a check against the block size, not the key size.
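To make the branch easier to follow, here is a small Python model of that one check. It is only a sketch of the logic as described above, not the .NET source; the function name is made up and the 128-bit default block size is the assumption stated earlier.

```python
VALID_SWITCH_BITS = (128, 192, 256)


def create_encryptor_accepts(key_len_bytes, block_size_bits=128):
    """Model of the switch: does this key length avoid the default branch?"""
    key_bits = key_len_bytes * 8
    # Mirrors: m_blockSizeBits > rgbKey.Length * 8 ? m_blockSizeBits : rgbKey.Length * 8
    switch_value = block_size_bits if block_size_bits > key_bits else key_bits
    return switch_value in VALID_SWITCH_BITS


for length in (10, 16, 20, 24, 32):
    print(length, create_encryptor_accepts(length))
```

Any key shorter than 16 bytes switches on the block size and so passes this particular check; a 20-byte key does not. The model covers this check only, not anything the key schedule does afterwards.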
So what happens now that we've patently not checked the key length? The answer has to do with the nature of the key schedule. When you enter a key in to Rijndael, the key needs to be expanded before it can be used. In this case, it's going to be expanded to 176 bytes. In order to accomplish this, it uses an algorithm which is specifically designed to turn a short byte array in to much longer byte array.
Part of that involves checking the key length. A bit more decompilation fun and we find that this is defined as m_Nk. Sound familiar?
this.m_Nk = rgbKey.Length / 4;
Nk is 4 for a 16-byte key, less when we enter shorter keys. That's 4 words, for anyone wondering where the magic number 4 came from. This causes a curious fork in the key scheduler: there's a specific path for Nk <= 6.
Without going too deep in to the details, this actually happens to 'work' (ie. not crash in a fireball) with a key length less than 16 bytes... until it gets below 8 bytes.
Then the entire thing crashes spectacularly.
So what have we learned? When you use CreateEncryptor you are actually throwing a completely invalid key straight in to the key scheduler and it's serendipity that sometimes it doesn't outright crash on you (or a horrible contractual integrity breach, depending on your POV); probably an unintended side effect of the fact there's a specific fork for short key lengths.
For completeness' sake we can now look at the other implementation, where you set the Key and IV in the RijndaelManaged object. These are stored in the SymmetricAlgorithm base class, which has the following setter:
if (!this.ValidKeySize(value.Length * 8))
throw new CryptographicException(Environment.GetResourceString("Cryptography_InvalidKeySize"));
Bingo. Contract properly enforced.
The obvious answer is that you cannot replicate this in another library unless that library happens to contain the same glaring issue, which I'm going to call a bug in Microsoft's code because I really can't see any other option.
But that answer would be a cop out. By inspecting the key scheduler we can work out what's actually happening.
When the expanded key is initialised, it populates itself with 0x00s. It then writes to the first Nk words with our key (in our case Nk = 2, so it populates the first 2 words or 8 bytes). Then it enters a second stage of expanding upon that by populating the rest of the expanded key beyond that point.
So now we know it's essentially padding everything past 8 bytes with 0x00, we can pad it with 0x00s right? No; because this shifts the Nk up to Nk = 4. As a result, although our first 4 words (16 bytes) will be populated as we expect, the second stage will begin expanding at the 17th byte, not the 9th!
The solution then is utterly trivial. Rather than padding our initial key with 6 additional bytes, just chop off the last 2 bytes.
So your direct answer in PHP is:
$key = substr($key, 0, -2);
Simple, right? :)
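The same truncation, generalised to any key length, might look like this in Python. It is a sketch under the assumption described above (only the first Nk whole 4-byte words of the key are used); the function name and the sample key are made up.

```python
def effective_key(key):
    """Return the part of the key that the short-key path actually uses."""
    nk = len(key) // 4      # number of whole 32-bit words, as in m_Nk
    return key[: nk * 4]    # trailing bytes that don't fill a word are dropped


key = b"1111222233"         # hypothetical 10-byte key
print(effective_key(key))   # the first 8 bytes, same as chopping off the last 2
```

For a 10-byte key this matches the substr() call; for a 16-byte key it returns the key unchanged.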
Now you can interop with this encryption function. But don't. It can be cracked.
Assuming your key uses lowercase, uppercase and digits you have an exhaustive search space of only 218 trillion keys.
Each byte has a search space of 62 values (26 + 26 + 10) because you're never using the other 194 (256 - 62) values. Since we have 8 bytes, there are 62^8 possible combinations: about 218 trillion.
How fast can we try all the keys in that space? Let's ask openssl what my laptop (running lots of clutter) can do:
Doing aes-256 cbc for 3s on 16 size blocks: 12484844 aes-256 cbc's in 3.00s
That's 4,161,615 passes/sec. 218,340,105,584,896 / 4,161,615 / 3600 / 24 = 607 days.
Okay, 607 days isn't bad. But I can always just fire up a bunch of Amazon servers and cut that down to ~1 day by asking 607 equivalent instances to calculate 1/607th of the search space. How much would that cost? Less than $1000, assuming that each instance was somehow only as efficient as my busy laptop. Cheaper and faster otherwise.
There is also an implementation that is twice the speed of openssl, so cut whatever figure we've ended up with in half.
Then we've got to consider that we'll almost certainly find the key before exhausting the entire search space. So for all we know it might be finished in an hour.
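The arithmetic above is easy to reproduce. A short Python sketch using the same figures (the benchmark count is the one from the openssl output quoted earlier; the halving assumes keys are uniformly random, so on average half the space is searched):

```python
ALPHABET_SIZE = 26 + 26 + 10   # lowercase + uppercase + digits
KEY_BYTES = 8                  # bytes of the key that are actually used

keyspace = ALPHABET_SIZE ** KEY_BYTES
rate = 12484844 / 3.0          # AES operations per second from the benchmark

days = keyspace / rate / 3600 / 24
average_days = days / 2        # expected time to hit the key

print(f"{keyspace:,} keys, {days:.0f} days worst case, {average_days:.0f} days on average")
```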
At this point we can assert if the data is worth encrypting, it's probably worth it to crack the key.
So there you go.
I use this function for hashing my passwords:
// RETURNS: rAyZOnlNBxO2WA53z2rAtFlhdS+M7kec9hskSCpeL6j+WwcuUvfFbpFJUtHvv7ji
base64_encode(hash_hmac('sha384', $str . SC_NONCE, SC_SITEKEY, true));
And I store hashes in char(64) field (MySQL -InnoDB).
Should I use varchar(64) instead of char(64)? Why?
Edit:
I replaced sha256 with sha384, because in this example sha256 always returns 44 bytes for me. Sorry for the confusion. Now it's 64 bytes.
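The 44 and 64 figures follow from Base64 turning every 3 raw bytes into 4 characters. A minimal Python sketch of the same construction (the message and key values are placeholders, not the real SC_NONCE or SC_SITEKEY):

```python
import base64
import hmac


def b64_hmac(algorithm, message, key):
    """Base64-encode the raw HMAC digest, like the PHP call with raw output."""
    raw = hmac.new(key, message, algorithm).digest()
    return base64.b64encode(raw).decode("ascii")


for name in ("sha256", "sha384"):
    print(name, len(b64_hmac(name, b"password+nonce", b"site key")))
```

Because the raw digest length is fixed per algorithm, the encoded length is fixed too, whatever the input.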
varchars save storage by only using up to the length required. If the hash is always 64 characters long then it makes no difference in terms of storage, so char is probably just as good as varchar in this case.
If you have variable length data to store, then a varchar will save wasting unnecessary space.
You should use CHAR(64) since your hash is fixed in length. Using VARCHAR will add another byte, wasting space.
Even though you are using a Base 64 encoded string, the result is not necessarily 64 characters in length. In this case, VARCHAR is better because the result can be shorter than 64 characters.
In fact as seen here, 64 characters is the maximum length rather than the set length.
When using VARCHAR (assuming this is the correct data type for a short string) does the size matter? If I set it to 20 characters, will that take up less space or be faster than 255 characters?
Yes, it matters when you index multiple columns.
Prefixes can be up to 1000 bytes long (767 bytes for InnoDB tables). Note that prefix limits are measured in bytes, whereas the prefix length in CREATE TABLE statements is interpreted as number of characters. Be sure to take this into account when specifying a prefix length for a column that uses a multi-byte character set.
source : http://dev.mysql.com/doc/refman/5.0/en/column-indexes.html
In a latin1 collation, you can only specify up to 3 columns of varchar(255) in an index,
while the same byte budget fits up to 50 columns of varchar(20).
Indirectly, without a proper index, queries will slow down.
In terms of storage, it makes no difference,
as varchar stands for variable-length strings.
In general, for a VARCHAR field, the amount of data stored in each field determines its footprint on the disk rather than the maximum size (unlike a CHAR field which always has the same footprint).
There is an upper limit on the total data stored within all fields of an index of 900 bytes (900 byte index size limit in character length).
The larger you make the field, the more likely people will try to use it for purposes other than what you intended, and the greater the screen real estate required to show the value. So it's good practice to try to pick the right size, rather than assuming that making it as large as possible will save you from having to revisit the design.
The actual differences are:
TINYTEXT and other TEXT fields are stored separately from in-memory row inside MySQL heap, whereas VARCHAR() fields add up to 64k limit (so you can have more than 64k in TINYTEXTs, whereas you won't with VARCHAR).
TINYTEXT and other 'blob-like' fields will force SQL layer (MySQL) to use on-disk temporary tables whenever they are used, whereas VARCHAR will be still sorted 'in memory' (though will be converted to CHAR for the full width).
InnoDB internally doesn't really care whether it is tinytext or varchar. It is very easy to verify: create two tables, one with VARCHAR(255), another with TINYTEXT, and insert a record into both. They will both take a single 16k page, whereas if overflow pages were used, the TINYTEXT table should show up as taking at least 32k in 'SHOW TABLE STATUS'.
I usually prefer VARCHAR(255) - they don't cause too much of heap fragmentation for single row, and can be treated as single 64k object in memory inside MySQL. On InnoDB size differences are negligible.
In the documentation of MySQL:
http://dev.mysql.com/doc/refman/5.0/en/char.html
You have a table that indicates the bytes of a VARCHAR(4) (vs a CHAR(4)).
An empty VARCHAR(4) takes only 1 byte, and an empty VARCHAR(255) also takes 1 byte. A VARCHAR(4) holding 'ab' takes 3 bytes, and a VARCHAR(255) holding 'ab' takes 3 bytes. It's the same, apart from the length limit :)
This will have no effect on performance. In this case the constraint merely helps ensure data integrity.
If you set it to 20, it will save only the first 20 characters. So yes, it will take up less space than 255 characters :).
The required storage space for VARCHAR is as follows:
VARCHAR(L), VARBINARY(L) — L + 1 bytes if column values require 0 – 255 bytes, L + 2 bytes if values may require more than 255 bytes
So VARCHAR does only require the space for the string plus one or two additional bytes for the length of the string.
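The rule quoted above can be written as a tiny calculator. This is a sketch of the documented formula only, assuming a single-byte character set such as latin1 by default; the function name is made up and it does not query MySQL.

```python
def varchar_storage_bytes(value, max_bytes, encoding="latin-1"):
    """Bytes needed to store `value` in a VARCHAR whose values may need up to `max_bytes`."""
    data_len = len(value.encode(encoding))
    if data_len > max_bytes:
        raise ValueError("value does not fit in the column")
    # One length byte if values never need more than 255 bytes, otherwise two.
    length_prefix = 1 if max_bytes <= 255 else 2
    return data_len + length_prefix


print(varchar_storage_bytes("ab", 4))     # VARCHAR(4)
print(varchar_storage_bytes("ab", 255))   # VARCHAR(255)
print(varchar_storage_bytes("ab", 1000))  # a column that needs the 2-byte prefix
```

The declared maximum only changes the result when it crosses the 255-byte boundary.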
I'm storing unique user-agents in a MySQL MyISAM table so when I have to look if it exists in the table, I check the md5 hash that is stored next to the TEXT field.
User-Agents
{
id - INT
user-agent - TEXT
hash - VARCHAR(32) // md5
}
Is there any way to do the same but using a 32-bit integer and not a text hash? Maybe the md5 in raw format would be faster? That would require a binary search.
[EDIT]
Doesn't MySQL handle hash searches for complete case-sensitive strings?
Store the UNHEX(MD5($value)) in a BINARY(16).
You could do this instead:
User-Agents
{
id - INT
user-agent - TEXT
hash - UNSIGNED INT (CRC32, indexed)
}
$crc32 = sprintf("%u", crc32($user_agent));
SELECT * FROM user_agents WHERE hash=$crc32 AND user_agent='$user_agent';
It's unlikely that you'll get collisions with crc32 for this kind of data.
To guarantee that collisions will not cause problems, add a secondary search parameter. MySQL will be able to use the index to quickly find the record. Then it can do a simple string search to guarantee that match is correct.
PS: The sprintf() is there to work around signed 32-bit integers. Should be unnecessary on 64-bit systems.
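For comparison, the same unsigned CRC32 value can be computed in Python. A minimal sketch (the function name is made up; masking with 0xFFFFFFFF plays the role of the sprintf("%u") workaround):

```python
import zlib


def crc32_unsigned(text):
    """CRC32 of the UTF-8 bytes of `text`, forced into the range 0..2**32-1."""
    return zlib.crc32(text.encode("utf-8")) & 0xFFFFFFFF


user_agent = "Mozilla/5.0 (example)"   # placeholder value
print(crc32_unsigned(user_agent))
```

The result always fits an UNSIGNED INT column.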
Let MySQL do the hard work for you. Use a CHAR column and create an index on that column. You could convert and store the hash as an integer, but there's absolutely no benefit, and it may actually cause problems.
Try MurmurHash. It's a fast hashing algorithm that's been ported to multiple languages. It takes your input and turns it into a 32- or 64-bit integer hash.
You can't store an MD5 hash in a 32-bit int: it simply won't fit. (It's 32 characters when written in hex, but it's 128-bits of data)
You could look at MySQL's BINARY and VARBINARY types. See http://dev.mysql.com/doc/refman/5.1/en/binary-varbinary.html. These types store binary data. In your case, BINARY(16) or VARBINARY(16), but since MD5 hashes are always 16 bytes, the latter seems a bit pointless.
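The 16-byte versus 32-character distinction is quick to check. A minimal Python sketch (the input is an arbitrary placeholder):

```python
import hashlib

md5 = hashlib.md5(b"Mozilla/5.0 (example)")

raw = md5.digest()          # 16 bytes, what a BINARY(16) column would hold
hex_text = md5.hexdigest()  # 32 hex characters, what a CHAR(32) column would hold

print(len(raw), len(hex_text))
```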
You can store MD5 hash in char(32) which is a bit faster than varchar(32).
It's also possible to make two BIGINT fields and keep first half of md5 hash in first field and second part in second field.
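A sketch of that split in Python, assuming big-endian byte order and unsigned 64-bit halves (so the columns would need to be BIGINT UNSIGNED; the function name is made up):

```python
import hashlib
import struct


def md5_halves(text):
    """Split the 16-byte MD5 digest into two unsigned 64-bit integers."""
    digest = hashlib.md5(text.encode("utf-8")).digest()
    high, low = struct.unpack(">QQ", digest)   # first 8 bytes, last 8 bytes
    return high, low


high, low = md5_halves("Mozilla/5.0 (example)")
print(high, low)
```

Packing the two integers back with the same format string recovers the original digest, so nothing is lost by the split.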
Are you REALLY sure the hashes are only 32-bit? MD5 is 128-bit. Cropping the hash to first 4 or 8 bytes would greatly increase risk of collisions.
If your hash field is always an MD5 value generated by PHP, then you can safely set it to CHAR(32). This should not impact the response time of your queries unless you plan to have millions of rows or, even worse, JOIN other tables on this field. The bottom line is that fixed-width columns are better than variable ones, so if you can optimize, do it.
Regarding changing MD5 into int values, see this question; the conclusion to this is that if you really want to change your MD5 into a 128-bit int value, you might as well use a random number instead of an MD5!
Have you tried creating a BINARY(16) field, and storing the result of md5($plaintext, true); in it? That might work, make sure you index that field as well.
Because trying to fit a 128-bit value in 32 bits doesn't make any sense...