Why is my SHA1 hash not matching? - php

I don't think I was specific enough last time. Here we go:
I have a hex string:
742713478fb3c36e014d004100440041004
e0041004e00000060f347d15798c9010060
6b899c5a98c9014d007900470072006f007
500700000002f0000001f7691944b9a3306
295fb5f1f57ca52090d35b50060606060606
The last 20 bytes should (theoretically) contain a SHA1 Hash of the first part (complete string - 20 bytes). But it doesn't match for me.
Trying to do this with PHP, but no luck. Can you get a match?
Ticket:
742713478fb3c36e014d004100
440041004e0041004e00000060
f347d15798c90100606b899c5a
98c9014d007900470072006f00
7500700000002f0000001f7691944b9a
sha1 hash of ticket appended to original:
3306295fb5f1f57ca52090d35b50060606060606
My sha1 hash of ticket:
b6ecd613698ac3533b5f853bf22f6eb4afb94239
Here's what is in the ticket and how it's being stored. FWIW, I can pull out username, etc, and spot the various delimiters.
http://www.codeproject.com/KB/aspnet/Forms_Auth_Internals/AuthTicket2.JPG
Edited: I have discovered that the string is padded on the end by the decryption function it goes through before this point. I removed the last 6 bytes and adjusted my ticket and hash accordingly. Still doesn't work, but I'm closer.

Your ticket is being calculated on the hex string itself. Maybe the appended hash is calculated on another representation of the same data?

I think you are getting confused about bytes vs characters.
Internally, PHP stores every character in a string as a byte. The sha1 hash that PHP generates is a 40 character (40 byte) hexadecimal representation of the 20-byte binary data, since each binary value needs to be represented by 2 hex characters.
I'm not sure if this is the actual source of your discrepancy, but seeing this misunderstanding makes me wonder if it's related.

Try trimming the string first; it's surprisingly easy to have a newline or space on the end that changes the hash completely.

According to this Online SHA1 tool the hash of the given text (after removing new lines and spaces) is
b6ecd613698ac3533b5f853bf22f6eb4afb94239
Idea: Make sure you're inputting characters, not a hex number, to the PHP version.

The problem was that the original was a keyed hash. I had to use hash_hmac() with a validation key rather than sha1() without.
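A minimal sketch of what that check might look like, assuming the trailing 20 bytes are an HMAC-SHA1 over everything before them. The function name verifyTrailingHmac, the key and the sample bytes are made up for illustration; the real ticket layout and key handling may differ.

```php
<?php
// Returns true when the last 20 bytes of $blob equal HMAC-SHA1(payload, key).
// Assumption: $blob and $key are raw bytes, not hex text.
function verifyTrailingHmac($blob, $key)
{
    if (strlen($blob) < 20) {
        return false; // too short to contain a SHA-1 sized tag
    }
    $payload  = substr($blob, 0, -20);
    $tag      = substr($blob, -20);
    $expected = hash_hmac('sha1', $payload, $key, true); // true = raw 20 bytes
    return hash_equals($expected, $tag);                 // constant-time compare
}

// Made-up data, not the ticket from the question.
$key     = 'example-validation-key';
$payload = hex2bin('616263'); // the three raw bytes "abc"
$blob    = $payload . hash_hmac('sha1', $payload, $key, true);

$matches          = verifyTrailingHmac($blob, $key);
// An unkeyed sha1() over the same bytes does not reproduce the tag.
$plainSha1Matches = hash_equals(sha1($payload, true), substr($blob, -20));
```

Note that the payload is converted with hex2bin() first: hashing the hex text and hashing the bytes it represents give different results.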

Related

Laravel 4 Encryption: how many characters to expect

I've just had an interesting little problem.
Using Laravel 4, I encrypt some entries before adding them to a db, including email address.
The db was setup with the default varchar length of 255.
I've just had an entry that encrypted to 309 characters, blowing up the encryption by cutting off the last 50-odd characters in the db.
I've (temporarily) fixed this by simply increasing the varchar length to 500, which should - in theory - cover me from this, but I want to be sure.
I'm not sure how the encryption works, but is there a way to tell what maximum character length to expect from the encrypt output for the sake of setting my database?
Should I change my field type from varchar to something else to ensure this doesn't happen again?
Conclusion
First, be warned that there have been quite a few changes between 4.0.0 and 4.2.16 (which seems to be the latest version).
The scheme starts with a staggering overhead of 188 characters for 4.2 and about 244 for 4.0 (given that I did not forget any newlines and such). So to be safe you will probably need in the order of 200 characters for 4.2 and 256 characters for 4.0 plus 1.8 times the plain text size, if the characters in the plaintext are encoded as single bytes.
Analysis
I just looked into the source code of Laravel 4.0 and Laravel 4.2 with regards to this function. Let's get into the size first:
the data is serialized, so the encryption size depends on the size of the type of the value (which is probably a string);
the serialized data is PKCS#7 padded using Rijndael 256 or AES, so that means adding 1 to 32 bytes or 1 to 16 bytes - depending on the use of 4.0 or 4.2;
this data is encrypted with the key and an IV;
both the ciphertext and IV are separately converted to base64;
a HMAC using SHA-256 over the base64 encoded ciphertext is calculated, returning a lowercase hex string of 64 bytes;
then the ciphertext consists of base64_encode(json_encode(compact('iv', 'value', 'mac'))) (where the value is the base 64 ciphertext and mac is the HMAC value, of course).
A string in PHP is serialized as s:<i>:"<s>"; where <i> is the size of the string, and <s> is the string (I'm presuming PHP platform encoding here with regards to the size). Note that I'm not 100% sure that Laravel doesn't use any wrapping around the string value, maybe somebody could clear that up for me.
Calculation
All in all, everything depends quite a lot on character encoding, so it would be rather dangerous for me to give an exact figure. Let's assume a 1:1 relation between byte and character for now (e.g. US-ASCII):
serialization adds up to 9 characters for strings up to 999 characters
padding adds up to 16 or 32 bytes, which we assume are characters too
encryption keeps data the same size
base64 in PHP creates ceil(len / 3) * 4 characters - but let's simplify that to (len * 4) / 3 + 4; the base 64 encoded IV is 44 characters (for a 32-byte IV)
the full HMAC is 64 characters
the JSON encoding adds 3*5 characters for quotes and colons, plus 4 characters for braces and commas around them, totaling 19 characters, plus another 10 for the key names iv, value and mac (I'm presuming json_encode does not end with a white space here); base 64 again adds the same overhead
OK, so I'm getting a bit tired here, but you can see it at least twice expands the plaintext with base64 encoding. In the end it's a scheme that adds quite a lot of overhead; they could just have used base64(IV|ciphertext|mac) to seriously cut down on overhead.
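The steps above can be turned into a rough size estimate. This is only a sketch of the scheme as described in this answer, not Laravel's actual code: the function name estimateEncryptedLength is hypothetical, zero bytes stand in for the real IV and ciphertext, and the IV is assumed to be one block long (block size 16 for the AES case, 32 for the Rijndael 256 case).

```php
<?php
// Estimate the length of the final base64 string for a plaintext of
// $plainLength single-byte characters.
function estimateEncryptedLength($plainLength, $blockSize = 16)
{
    // serialize() wraps a string as s:<len>:"<string>";
    $serialized = strlen(serialize(str_repeat('a', $plainLength)));

    // PKCS#7 always adds between 1 and $blockSize bytes.
    $padded = $serialized + ($blockSize - $serialized % $blockSize);

    // Placeholder bytes; only the lengths matter here.
    $payload = array(
        'iv'    => base64_encode(str_repeat("\0", $blockSize)),
        'value' => base64_encode(str_repeat("\0", $padded)),
        'mac'   => str_repeat('0', 64), // hex-encoded HMAC-SHA-256
    );

    return strlen(base64_encode(json_encode($payload)));
}

$estimate16 = estimateEncryptedLength(30, 16); // 30-character e-mail address
$estimate32 = estimateEncryptedLength(30, 32);
```

Treat the result as a lower bound: real base64 output can contain "/" characters, which json_encode() escapes as "\/" by default, so actual values can come out somewhat longer.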
Notes
if you're not on 4.2 now, I would seriously consider upgrading to the latest version because 4.2 fixes quite a lot of security issues
the sample code uses a string as key, and it is unclear if it is easy to use bytes instead;
the documentation does warn against key sizes other than the Rijndael defaults, but forgets to mention string encoding issues;
padding is always performed, even if CTR mode is used, which kind of defeats the purpose;
Laravel pads using PKCS#7 padding, but as the serialization always seems to end with ;, that was not really necessary;
it's a nice thing to see authenticated encryption being used for database encryption (the IV wasn't used, fixed in 4.2).
@MaartenBodewes does a very good job at explaining how long the actual string probably will be. However you can never know it for sure, so here are two options to deal with the situation.
1. Make your field text
Change the field from a limited varchar to a "self-expanding" text. This is probably the simpler one, and especially if you expect rather long input I'd definitely recommend this.
2. Just make your varchar longer
As you did already, make your varchar longer depending on what input length you expect/allow. I'd multiply by a factor of 5.
But don't stop there! Add a check in your code to make sure the data doesn't get truncated:
$encrypted = Crypt::encrypt($input);
if (strlen($encrypted) > 500) {
    // do something about it
}
What can you do about it?
You could either write an error to the log and add the encrypted data (so you can manually re-insert it after you extended the length of your DB field)
Log::error('An encrypted value was too long for the DB field xy. Length: '.strlen($encrypted).' Data: '.$encrypted);
Obviously that means you have to check the logs frequently (or have them sent to you by mail), and also that the user could encounter errors while using the application because of the incorrect data in your DB.
The other way would be to throw an exception (and display an error to the user) and of course also write it to the log so you can fix it...
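A minimal sketch of the exception variant, written as plain PHP so it runs without Laravel. The helper name assertFitsColumn and the limit passed to it are made up for illustration; in the application you would also write to the log in the catch block.

```php
<?php
// Throws instead of letting the database silently truncate the value.
function assertFitsColumn($encrypted, $maxLength)
{
    $length = strlen($encrypted);
    if ($length > $maxLength) {
        throw new LengthException(
            "Encrypted value is $length characters, column holds $maxLength."
        );
    }
    return $encrypted;
}

// Usage: a stand-in string takes the place of Crypt::encrypt($input).
$tooLong = false;
try {
    assertFitsColumn(str_repeat('x', 501), 500);
} catch (LengthException $e) {
    $tooLong = true; // log the problem and show an error to the user here
}
```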
Anyways
Whether you choose option 1 or 2 you should always restrict the accepted length of your input fields. Server side and client side.

Appending CSPRNG generated salt to a string that needs to be hashed

I've been reading about generating a salt using a Cryptographically Secure Pseudo-Random Number Generator (CSPRNG). This salt will then be appended to a string that needs to be hashed.
However, the salt generated by CSPRNG function (for PHP I'm using openssl_random_pseudo_bytes) is actually binary data.
Confused about how I should append this binary data to a string, I saw this PHP example for creating hash. It encodes binary data.
So I just wanted to know if that is what I need to do. I need to encode salt to get a string. Then I can append that salt to a string that needs to be hashed. Or are there other ways of adding salt to a string?
Note: I'm not hashing a password.
If you need to hash a password, please use password_hash() and password_verify(), and probably add password_needs_rehash() - see http://de2.php.net/password_hash.
You might notice that these functions are available since PHP 5.5.0 - if you are using an earlier version of PHP, you can add this compatibility library to make it work with PHP starting at 5.3.7.
It can't get very much easier than that.
It's probably best to first convert your string into binary data using an encoding. UTF-8 encoding is probably best for most use cases. Don't forget to (at least) document which character encoding is used.
Now concatenate the salt and the encoded string. Again, you need to (at least) document the size of the salt. Please make sure you use concatenation of bytes, not strings. Bytes can have any value, including invalid characters, control characters etc.
After the concatenation you can feed the resulting byte array into the hashing function.
If you have trouble with byte concatenation in PHP, you could use hexadecimal values instead. But don't forget to convert them back into bytes before feeding them into the hash method.
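The steps above might look like this in PHP. It is a sketch under a few assumptions: SHA-256 as the hash, a 16-byte salt placed in front of the data, and random_bytes() (PHP 7+) as the CSPRNG; openssl_random_pseudo_bytes() from the question could be used the same way. The function name saltedHash is hypothetical.

```php
<?php
// Hash salt-then-data. PHP strings are byte strings, so "." concatenates
// the raw bytes as they are, including non-printable values.
function saltedHash($data, $salt)
{
    return hash('sha256', $salt . $data);
}

$salt   = random_bytes(16);               // binary CSPRNG output
$data   = 'text to hash, assumed UTF-8';  // document the encoding you use
$digest = saltedHash($data, $salt);

// Hex-encode the salt only for storage or display,
// and convert it back to bytes before hashing again.
$storedSalt = bin2hex($salt);
$again      = saltedHash($data, hex2bin($storedSalt));
```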

What is the character set of hash when md5 is used with salt using crypt()?

What is the character set of the output given by crypt() using MD5 with a salt?
By hash, I mean just the 22 characters after "$1$<8 random characters>$". What characters can those 22 characters contain?
I was looking for this and have found a few questions that touched on it, but nobody seems to have a definitive answer; nobody except the code, of course, and this Python implementation of the same:
http://pythonhosted.org/passlib/lib/passlib.hash.md5_crypt.html
Based on this, it seems both the salt, and the hash itself are encoded with the following regex char set: [./0-9A-Za-z]
I would expect these to output the same since they are all trying to be compatible with the same shadow password utilities.
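A quick way to check this against PHP's own crypt() is sketched below. The password and the salt "saltsalt" are arbitrary example values.

```php
<?php
// '$1$' selects MD5-crypt; single quotes keep PHP from interpolating "$".
$hash = crypt('example password', '$1$saltsalt$');

// Everything after the last "$" is the encoded digest.
$digest = substr($hash, strrpos($hash, '$') + 1);

// True when the digest is 22 characters from the set [./0-9A-Za-z].
$onlyExpectedChars = preg_match('#^[./0-9A-Za-z]{22}$#', $digest) === 1;
```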

Transforming a url into a unique 32 character token

I am writing an affiliate system, and I want to generate a unique 32 character wide token, from the url.
The problem is that a URL can be up to 128 chars long (IIRC). Is there a way that I can create a unique 32 char wide key/token from a given URL, without any 'collisions'?
I am not sure if this is an encoding, encryption or hashing problem (probably, a mixture of all three).
I will be implementing this 'mapping function' using PHP, since that is the language I am using to build this particular system. Any suggestions on how to go about doing this?
Is it even possible to map a 128 char string into a 32 char string uniquely (i.e. no collisions?) ...
[Edit]
I just did some reading up, and found that the max length of urls is actually, something in the order of 2K. However, I am not concerned about 'silly' edge cases like that. I am pretty sure that 99.9% of the time, my imposed limit of 128 chars should be sufficient.
Is it even possible to map a 128 char string into a 32 char string uniquely (i.e. no collisions?) ...
In part. You can use a hash function like md5 or sha1. That's what they were built to do.
MD5 generates a 32 char string, and SHA1 generates a 40 char string.
Of course you can't guarantee that there won't be collisions. That's impossible since the message space is too large for your hashes (there are 2^1024 messages vs 2^128 possible hashes if you are using MD5), but these functions are meant to be collision resistant and hard to reverse.
Wikipedia references:
http://en.wikipedia.org/wiki/Hash_function
http://en.wikipedia.org/wiki/MD5
http://en.wikipedia.org/wiki/SHA-1
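A minimal sketch of the MD5 approach with an explicit collision check. The function names urlToken and registerUrl are hypothetical, and a plain array stands in for whatever table stores the mapping.

```php
<?php
// md5() returns 32 lowercase hex characters for any input length.
function urlToken($url)
{
    return md5($url);
}

// Store token => URL and refuse to overwrite a different URL.
function registerUrl(array &$registry, $url)
{
    $token = urlToken($url);
    if (isset($registry[$token]) && $registry[$token] !== $url) {
        throw new RuntimeException("Token collision for $url");
    }
    $registry[$token] = $url;
    return $token;
}

$registry = array();
$first    = registerUrl($registry, 'http://example.com/landing?aff=1');
$second   = registerUrl($registry, 'http://example.com/landing?aff=1');
```

Registering the same URL twice yields the same token, so the lookup table stays at one entry per URL.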
Is it even possible to map a 128 char string into a 32 char string uniquely (i.e. no collisions?) ...
That depends on the alphabet being used for both input and output. If your resulting 32 char hash is limited to an alphabet of a-z, you can encode a maximum of 26^32 = 1.901722×10^45 values in it. A URL can consist of at least a-z and quite a number of other characters, so can contain at least 26^128 = 1.307942×10^181 values. So, an alphabet of 26 characters is not enough.
Using a-zA-Z0-9 you can encode 62^32 = 2.272658×10^57 unique values, which is still not enough. Even an alphabet of 100 characters gives you only 100^32 = 1.0×10^64 possible values.
Depending on what exactly you want to do, you should either increase the length of the hash or rethink the overall approach.
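The counting argument can be checked with logarithms, which avoids overflowing PHP's number types. The helper name log10Count is made up for this sketch.

```php
<?php
// log10 of the number of distinct strings of $length characters
// over an alphabet of $alphabetSize symbols.
function log10Count($alphabetSize, $length)
{
    return $length * log10($alphabetSize);
}

$tokens62 = log10Count(62, 32);  // 32-char tokens over a-zA-Z0-9
$urls26   = log10Count(26, 128); // 128-char URLs over a-z only

// A collision-free mapping needs at least as many tokens as URLs.
$fits = $tokens62 >= $urls26;
```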

How to convert numbers to an alpha numeric system with php

I'm not sure what this is called, which is why I'm having trouble searching for it.
What I'm looking to do is to take numbers and convert them to some alphanumeric base so that the number, say 5000, wouldn't read as '5000' but as 'G4u', or something like that. The idea is to save space and also not make it obvious how many records there are in a given system. I'm using php, so if there is something like this built into php even better, but even a name for this method would be helpful at this point.
Again, sorry for not being able to be more clear, I'm just not sure what this is called.
You want to change the base of the number to something other than base 10 (I think you want base 36 as it uses the entire alphabet and numbers 0 - 9).
The built-in base_convert function may help, although it has the limitation that it can only convert between bases 2 and 36:
$number = '5000';
echo base_convert($number, 10, 36); //3uw
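If base 36 is not compact enough, a hand-rolled base 62 conversion is one option. This is a sketch, not a built-in: toBase62, fromBase62 and the digit order of the alphabet are choices made here for illustration, and it needs PHP 7 for intdiv() and the type declarations.

```php
<?php
const BASE62_ALPHABET = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

// Encode a non-negative integer using digits 0-9, a-z, A-Z.
function toBase62(int $number): string
{
    if ($number < 0) {
        throw new InvalidArgumentException('Only non-negative integers are supported.');
    }
    $alphabet = BASE62_ALPHABET;
    $digits   = '';
    do {
        $digits = $alphabet[$number % 62] . $digits; // least significant digit first
        $number = intdiv($number, 62);
    } while ($number > 0);
    return $digits;
}

// Decode a string produced by toBase62() back to an integer.
function fromBase62(string $encoded): int
{
    $number = 0;
    foreach (str_split($encoded) as $char) {
        $position = strpos(BASE62_ALPHABET, $char);
        if ($position === false) {
            throw new InvalidArgumentException("Invalid base 62 digit: $char");
        }
        $number = $number * 62 + $position;
    }
    return $number;
}
```

As with base_convert, this only changes the notation; anyone who knows the alphabet can decode the value.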
Funnily enough, I asked the exact opposite question yesterday.
The first thing that comes to mind is converting your decimal number into hexadecimal. 5000 would turn into 1388, 10000 into 2710. Will save a few bytes here and there.
You could also use a higher base that utilizes the full alphabet (0-Z instead of 0-F) or even all 256 byte values. As @Yacoby points out, you can use base_convert() for bases up to 36.
As I said in the comment, keep in mind that this is not an efficient way to mask IDs. If you have a security problem when people can guess the next or previous ID to a record, this is very poor protection.
dechex will convert a number to hex for you. It won't obfuscate how many records are in a given system, however. I don't think it will make it any more efficient to store or save space, either.
You'd probably want to use a 2 way crypt function if obfuscation is needed. That won't save space, either.
Please state your goals more clearly and give more background, because this seems a bit pointless as it is.
This might confuse more people than simply converting the base of the numbers ...
Try using signed digits to represent your numbers. For example, instead of using digits 0..9 for decimal numbers, use digits -5..5. This Wikipedia article gives an example for the binary representation of numbers, but the approach can be used for any numeric base.
Using this together with, say, base-36 arithmetic might satisfy you.
EDIT: This answer is not really a solution to the question, so ignore it unless you are trying to hash a number.
My first thought would be to hash it using e.g. md5 or sha1. (You'd probably not save any space though...)
To prevent people from using rainbow-tables or brute force to guess which number you hashed, you can always add a salt. It can be as simple as a string prepended to your number before hashing it.
md5 would return an alphanumeric string of exactly 32 chars and sha1 would return one of exactly 40 chars.
