I've been reading about generating a salt with a Cryptographically Secure Pseudo-Random Number Generator (CSPRNG). The salt is then appended to the string that needs to be hashed.
However, the salt generated by the CSPRNG function (in PHP I'm using openssl_random_pseudo_bytes) is binary data.
I was unsure how to append this binary data to a string, and then I saw a PHP example for creating a hash that encodes the binary data first.
So is that what I need to do: encode the salt to get a string, then append that salt to the string that needs to be hashed? Or are there other ways of adding a salt to a string?
Note: I'm not hashing a password.
If you need to hash a password, please use password_hash() and password_verify(), and probably add password_needs_rehash() - see http://de2.php.net/password_hash.
You might notice that these functions are available since PHP 5.5.0 - if you are using an earlier version of PHP, you can add this compatibility library to make it work with PHP starting at 5.3.7.
It can't get very much easier than that.
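For the password case, a minimal sketch of how the three functions fit together might look like this. The password value is a made-up placeholder, and where the hash gets stored is left out.

```php
<?php
// Hypothetical input; in practice this comes from a registration or login form.
$password = 'correct horse battery staple';

// Hash once at registration; the salt is generated and embedded automatically.
$hash = password_hash($password, PASSWORD_DEFAULT);

// Verify at login against the stored hash.
$ok    = password_verify($password, $hash);
$wrong = password_verify('not the password', $hash);

// After a successful login, check whether the stored hash should be upgraded.
if ($ok && password_needs_rehash($hash, PASSWORD_DEFAULT)) {
    $hash = password_hash($password, PASSWORD_DEFAULT);
    // ... persist the new $hash here
}

var_dump($ok, $wrong);
```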
It's probably best to first convert your string into binary data using an encoding. UTF-8 encoding is probably best for most use cases. Don't forget to (at least) document which character encoding is used.
Now concatenate the salt and the encoded string. Again, you need to (at least) document the size of the salt. Please make sure you use concatenation of bytes, not strings. Bytes can have any value, including invalid characters, control characters etc.
After the concatenation you can feed the resulting byte array into the hashing function.
If you have trouble with byte concatenation in PHP, you could use hexadecimal values instead. But don't forget to convert them back into bytes before feeding them into the hash method.
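The steps above can be sketched roughly as follows. This assumes PHP 7+ (random_bytes() is used here in place of openssl_random_pseudo_bytes()), a 16-byte salt, SHA-256 as the hash, and an input string that is already UTF-8; all of those are illustrative choices, not requirements.

```php
<?php
// Assumption: the input string is already UTF-8 (PHP strings are byte strings).
$message = 'text to hash';

// 16 random bytes from a CSPRNG; the salt size is an illustrative choice.
$salt = random_bytes(16);

// The "." operator concatenates raw bytes, so the salt needs no encoding here.
$digest = hash('sha256', $salt . $message);

// Hex is only needed for storing or displaying the salt.
$saltHex = bin2hex($salt);

// Later: convert the stored hex back into bytes before hashing again.
$again = hash('sha256', hex2bin($saltHex) . $message);

var_dump(hash_equals($digest, $again));
```

Hashing the hex text of the salt instead of its bytes would give a different digest, which is why the conversion back with hex2bin() matters.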
Related
I've been playing around with PHP's mcrypt over the weekend, using AES to encrypt text strings with a key. Later I worked up a tiny PHP tool to encrypt/decrypt strings with AES/mcrypt. When the key is "wrong" and the text doesn't get decrypted, you end up with what I think is binary data, from what I've read (http://i.imgur.com/jF8cZMZ.png). Is there any way in PHP to check whether a variable holds binary data or a properly decoded string?
My apologies if the title and the intro are a bit misleading.
When you encrypt text and then try to decrypt it, you will get the same text, but when you try to decrypt random data, there is a small chance that the result will be text (decreasing with length of data). You haven't specified what kind of data we are talking about, but determining if the decryption is successful by applying a heuristic is a bad idea. It is slow and may lead to false positives.
You should have a checksum or something similar to determine whether the decrypted result is valid. This can easily be done by running sha1 on the plaintext data, prepending the result to the text, and encrypting it as a whole. When you decrypt it, you can split the resulting string (sha1 output has a fixed size, so you know where to split), run sha1 on the text part, and compare it with the hash part. If they match, you have a valid result. You can of course improve the security a little by using SHA-256 or SHA-512.
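The prepend-and-split part of that idea could look something like this. The function names are made up for illustration, and the encryption step itself is left out: addChecksum() would run before encrypting and checkAndStrip() after decrypting.

```php
<?php
// Prepend the raw 20-byte SHA-1 digest to the plaintext (this is what gets encrypted).
function addChecksum(string $plaintext): string
{
    return sha1($plaintext, true) . $plaintext;
}

// Split the decrypted data and check the digest; returns null when it does not match.
function checkAndStrip(string $decrypted): ?string
{
    if (strlen($decrypted) < 20) {
        return null; // too short to even contain a digest
    }
    $digest = substr($decrypted, 0, 20);
    $text   = (string) substr($decrypted, 20);

    return hash_equals($digest, sha1($text, true)) ? $text : null;
}

var_dump(checkAndStrip(addChecksum('hello')));   // the original text
var_dump(checkAndStrip(random_bytes(64)));       // random data fails the check
```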
That is just one way of doing it, and it might not be the best. Better ways would be to use an authenticated mode of operation for AES like GCM or CCM, or to use encrypt-then-MAC with a good MAC function like HMAC-SHA512.
With the approaches above, you're free to encrypt any kind of data, because you no longer depend on determining whether the result is text.
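As a rough sketch of the authenticated-mode option: mcrypt has no GCM mode, so this swaps in the OpenSSL extension (PHP 7.1+ is assumed for the tag parameter). Key storage and nonce management are simplified here and would need more care in a real tool.

```php
<?php
// Assumptions: PHP 7.1+ with the OpenSSL extension; key handling is simplified.
$key   = random_bytes(32);   // 256-bit key
$nonce = random_bytes(12);   // 96-bit nonce, must never repeat for the same key
$tag   = '';                 // filled in by openssl_encrypt()

$ciphertext = openssl_encrypt('secret text', 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $nonce, $tag);

// Correct key: the plaintext comes back.
$plain = openssl_decrypt($ciphertext, 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $nonce, $tag);

// Wrong key: authentication fails and false is returned instead of garbage bytes.
$bad = openssl_decrypt($ciphertext, 'aes-256-gcm', random_bytes(32), OPENSSL_RAW_DATA, $nonce, $tag);

var_dump($plain, $bad);
```

With this, "was the key right?" becomes a check for false rather than a guess about whether the output looks like text.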
I'm trying to use mcrypt_create_iv to generate random salts. When I test whether the salt is generated by echoing it out, it checks out, but it isn't the required length that I pass as a parameter (32); instead it's less than that.
When I store it in my database table however, it shows up as something like this K??5P?M???4?o???"?0??
I'm sure it's something to do with the database, but I tried to change the collation of it to correspond with the config settings of CI, which is utf8_general_ci, but it doesn't solve the problem, instead it generates a much smaller salt.
Does anyone know of what may be wrong? Thanks for any feedback/help
The function mcrypt_create_iv() returns a binary string, which can contain \0 and other unreadable characters. Depending on how you want to use the salts, you first have to encode those byte strings to an accepted alphabet. It is also possible to store binary strings in the database, but of course you will then have a problem displaying them.
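To illustrate the encoding step, here is a small sketch using hex and base64. It assumes PHP 7+ and uses random_bytes() as a stand-in for mcrypt_create_iv(); the encoding part is the same for any binary string.

```php
<?php
// random_bytes() stands in for mcrypt_create_iv() here (an assumption; PHP 7+).
$raw = random_bytes(32);

$hex = bin2hex($raw);         // 64 characters from 0-9a-f
$b64 = base64_encode($raw);   // 44 characters including padding

// Both encodings are plain ASCII and round-trip to the original bytes.
var_dump(strlen($raw), strlen($hex), strlen($b64));
var_dump(hex2bin($hex) === $raw, base64_decode($b64, true) === $raw);
```

Either encoded form can go into an ordinary text column; the raw bytes would need a binary column type instead.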
Since salts are normally used for storing passwords, I would recommend having a look at PHP's password_hash() function. It generates a salt automatically and includes it in the resulting hash value, so you don't need a separate database field for the salt.
I know the PHP function password_hash outputs the algorithm, cost, salt, and hash all in one string, so that password_verify can check a password.
Sample output from PHP page:
$2y$10$.vGA1O9wmRjrwAVXD98HNOgsNpDczlqm3Jq7KnEd1rVAGv3Fykk1a
So the $2y$ represents the algorithm and the 10 represents the cost.
But how does password_verify separate the salt from the hash? I don't see any identifier separating the two afterwards.
This applies to the bcrypt version of password_hash.
Bcrypt has a fixed-length salt. The crypt function, which PHP calls internally when you use password_hash()/password_verify() with the default algorithm, uses a 16-byte salt. It is given as 22 characters of the custom base64 alphabet ./A-Za-z0-9, which are then decoded into bytes. Since 22 base64 characters encode 16.5 bytes, there is an extra nibble of data that is not taken into account.
For all other hashes, the salt is a defined number of bytes, which are encoded into ASCII-safe base64 and put after a $ sign. The verifying function then only has to split the string into parts at the $ delimiter, go to the third set of characters, and take substr(0, B64_ENCODED_HASH_ALGORITHM_SALT_LEN). After that, it passes the parameters it got from the split string back into the hashing function, along with the password to check.
The string you get is defined by the hashing algorithm's standard in most cases, but it almost always follows the pattern
$<ALGORITHM_ID>$<COST_IN_FORMAT>$<BASE64_ENCODED_SALT><BASE64_ENCODED_HASH>
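Applied to the sample hash from the question, the split could be done by hand like this. This is only a sketch of the idea for bcrypt, not how PHP implements it internally; the second half assumes that re-running crypt() with the stored string as its settings argument reproduces the same string for a correct password.

```php
<?php
// Sample hash as quoted in the question (single quotes avoid "$" interpolation).
$stored = '$2y$10$.vGA1O9wmRjrwAVXD98HNOgsNpDczlqm3Jq7KnEd1rVAGv3Fykk1a';

// Fields are delimited by "$"; the last field holds salt and hash back to back.
list(, $algo, $cost, $saltAndHash) = explode('$', $stored);

$salt = substr($saltAndHash, 0, 22);   // 22 characters of bcrypt base64
$hash = substr($saltAndHash, 22);      // the remaining 31 characters

echo "algo=$algo cost=$cost\nsalt=$salt\nhash=$hash\n";

// Verification idea: hash the candidate password with the settings taken from
// the stored string, then compare the result with the stored string.
$fresh   = password_hash('secret', PASSWORD_BCRYPT);
$matches = hash_equals($fresh, crypt('secret', $fresh));
var_dump($matches);
```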
I have the luxury of starting from scratch, so I'm wondering what would be a good hash to use between PHP and Python.
I just need to be able to generate the same hash from the same text in each language.
From what I read, PHP's md5() isn't going to work nicely.
md5() always plays nicely - it always does the same thing because it is a standard hashing format.
The only tripping hazard is that some languages' default return format for an MD5 hash is a 32-byte ASCII string of hexadecimal characters, while others use a 16-byte string containing the raw binary hash.
PHP's md5() by default returns a 32-byte string, but if you pass true as the second argument, it will return the 16-byte form instead. So as long as you know which version your other language uses (in your case Python), you just need to make sure that you get the correct format from PHP.
You may be better off using the 32-byte form anyway, depending on how your applications communicate. If you use a communication protocol based on plain text (such as HTTP), it is usually safer to use plain-text versions of anything. Binary, in this case, is smaller, but liable to get corrupted in transmission by badly written servers/clients.
The binary vs. ASCII problem applies to just about any hashing algorithm you can think of.
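A quick sketch of the two forms on the PHP side, with the corresponding Python calls noted in comments for comparison:

```php
<?php
$text = 'hello';

$hex = md5($text);         // 32 hexadecimal characters
$raw = md5($text, true);   // 16 raw bytes

// The two forms carry the same digest, just in different representations.
var_dump(bin2hex($raw) === $hex);

// In Python, hashlib.md5(b'hello').hexdigest() corresponds to $hex
// and hashlib.md5(b'hello').digest() corresponds to $raw.
echo $hex, "\n";   // 5d41402abc4b2a76b9719d911017c592
```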
What is it you want from the hash? (portability, security, performance....)
From what I read, PHP's md5() isn't going to work nicely.
What did you read? Why won't it work?
I just need to be able to generate the same hash from the same text in each language
Since PHP only provides crc32 (very insecure), md5 and sha1 out of the box, it's not exactly a huge amount of testing you need to do. Of course, if portability is not an issue, the mcrypt and openssl APIs are also available. More recently, the hash PECL extension gives you a huge choice.
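I suggest using sha1, as it is implemented out of the box in both and did not share md5's collision vulnerabilities when this was written (practical SHA-1 collisions have since been demonstrated, so SHA-256 is the safer choice where collision resistance matters). See: http://en.wikipedia.org/wiki/MD5#Collision_vulnerabilities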
I don't think I was specific enough last time. Here we go:
I have a hex string:
742713478fb3c36e014d004100440041004
e0041004e00000060f347d15798c9010060
6b899c5a98c9014d007900470072006f007
500700000002f0000001f7691944b9a3306
295fb5f1f57ca52090d35b50060606060606
The last 20 bytes should (theoretically) contain a SHA1 Hash of the first part (complete string - 20 bytes). But it doesn't match for me.
Trying to do this with PHP, but no luck. Can you get a match?
Ticket:
742713478fb3c36e014d004100
440041004e0041004e00000060
f347d15798c90100606b899c5a
98c9014d007900470072006f00
7500700000002f0000001f7691944b9a
sha1 hash of ticket appended to original:
3306295fb5f1f57ca52090d35b50060606060606
My sha1 hash of ticket:
b6ecd613698ac3533b5f853bf22f6eb4afb94239
Here's what is in the ticket and how it's being stored. FWIW, I can pull out username, etc, and spot the various delimiters.
http://www.codeproject.com/KB/aspnet/Forms_Auth_Internals/AuthTicket2.JPG
Edited: I have discovered that the string is padded at the end by the decryption function it goes through before this point. I removed the last 6 bytes and adjusted my ticket and hash accordingly. It still doesn't work, but I'm closer.
Your hash of the ticket is being calculated on the hex string itself. Maybe the appended hash is calculated on another representation of the same data?
I think you are getting confused about bytes vs characters.
Internally, PHP stores every character in a string as a byte. The sha1 hash that PHP generates is a 40-character (40-byte) hexadecimal representation of the 20-byte binary data, since each byte needs to be represented by 2 hex characters.
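To make the bytes vs. characters point concrete, here is a small sketch. The short hex string is just a stand-in, not the full ticket from the question.

```php
<?php
// Hypothetical short hex string standing in for the ticket.
$hex   = '742713478fb3c36e';
$bytes = hex2bin($hex);        // the 8 raw bytes the text represents

$hashOfText  = sha1($hex);     // digest of the 16 ASCII characters
$hashOfBytes = sha1($bytes);   // digest of the 8 bytes

// Same data on paper, but two different inputs and two different digests.
var_dump($hashOfText === $hashOfBytes);

// sha1() returns 40 hex characters by default, or 20 raw bytes with true.
var_dump(strlen(sha1($bytes)), strlen(sha1($bytes, true)));
```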
I'm not sure if this is the actual source of your discrepancy, but seeing this misunderstanding makes me wonder if it's related.
Try trimming the string first; it's surprisingly easy to have a newline or space at the end that changes the hash completely.
According to this Online SHA1 tool the hash of the given text (after removing new lines and spaces) is
b6ecd613698ac3533b5f853bf22f6eb4afb94239
Idea: Make sure you are inputting characters, not a hex number, to the PHP version.
The problem was that the original was a keyed hash. I had to use hash_hmac() with a validation key rather than sha1() without.
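For anyone hitting the same thing, the difference looks roughly like this. Both the ticket bytes and the validation key below are made-up placeholders, since the real key isn't part of this post.

```php
<?php
// Hypothetical values: neither the real ticket nor the validation key is shown here.
$ticket        = hex2bin('742713478fb3c36e');
$validationKey = hex2bin('00112233445566778899aabbccddeeff');

$plain = sha1($ticket);                                // unkeyed hash
$keyed = hash_hmac('sha1', $ticket, $validationKey);   // keyed hash (HMAC-SHA1)

// Stand-in for the 20 bytes appended to the ticket, hex-encoded.
$appended = $keyed;

// Only the keyed hash matches; compare in constant time.
var_dump(hash_equals($appended, $keyed), hash_equals($appended, $plain));
```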