I have the luxury of starting from scratch, so I'm wondering what would be a good hash to use between PHP and Python.
I just need to be able to generate the same hash from the same text in each language.
From what I read, PHP's md5() isn't going to work nicely.
md5() always plays nicely - it always does the same thing because it is a standard hashing format.
The only tripping hazard is that some languages' default return format for an MD5 hash is a 32-byte ASCII string containing hexadecimal characters, while others use a 16-byte string containing the literal binary representation of the hash.
PHP's md5() by default returns the 32-byte string, but if you pass true as the second argument, it will return the 16-byte form instead. So as long as you know which version your other language uses (in your case Python), you just need to make sure that you get the matching format from PHP.
You may be better using the 32-byte form anyway, depending on how your applications communicate. If you use a communication protocol based on plain-text (such as HTTP) it is usually safer to use plain-text versions of anything - binary, in this case, is smaller, but liable to get corrupted in transmission by badly written servers/clients.
The binary vs. ASCII problem applies to just about any hashing algorithm you can think of.
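To make the two formats concrete on the Python side, here is a minimal sketch. It assumes the text is encoded as UTF-8 in both languages, and the helper names md5_hex and md5_raw are made up for this example. Python's hashlib exposes both forms: hexdigest() corresponds to PHP's md5($text) and digest() to md5($text, true).

```python
import hashlib


def md5_hex(text: str) -> str:
    """32-character hex form, the format PHP's md5($text) returns."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def md5_raw(text: str) -> bytes:
    """16-byte binary form, the format PHP's md5($text, true) returns."""
    return hashlib.md5(text.encode("utf-8")).digest()


if __name__ == "__main__":
    print(md5_hex("hello"))       # 5d41402abc4b2a76b9719d911017c592
    print(len(md5_raw("hello")))  # 16
```

If the two sides disagree, the usual culprit is the character encoding of the input rather than the hash itself, so it is worth fixing the encoding explicitly as done here.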
What is it you want from the hash? (portability, security, performance....)
From what I read, PHP's md5() isn't going to work nicely.
What did you read? Why won't it work?
I just need to be able to generate the same hash from the same text in each language
Since PHP only provides crc32 (very insecure), md5 and sha1 out of the box, it's not exactly a huge amount of testing you need to do. Of course, if portability is not an issue, then there are the mcrypt and openssl APIs available. And more recently the hash PECL extension gives you a huge choice.
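I suggest using sha1, as it is implemented out of the box in both and is far harder to collide than md5, although practical SHA-1 collisions have since been demonstrated too, so prefer SHA-256 where collision resistance really matters. See: http://en.wikipedia.org/wiki/MD5#Collision_vulnerabilities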
Related
I've been playing around with PHP mcrypt over the weekend, using AES to encrypt text strings with a key. Later I worked up a tiny PHP tool to encrypt / decrypt strings with AES/mcrypt. Now, when the key is "wrong" and the text doesn't get decrypted, you end up with what I think is binary, from what I've read around (http://i.imgur.com/jF8cZMZ.png). Is there any way in PHP to check whether the variable holds binary or a properly decoded string?
My apologies if the title and the intro are a bit misleading.
When you encrypt text and then try to decrypt it, you will get the same text, but when you try to decrypt random data, there is a small chance that the result will be text (decreasing with length of data). You haven't specified what kind of data we are talking about, but determining if the decryption is successful by applying a heuristic is a bad idea. It is slow and may lead to false positives.
You should have a checksum or something like that to determine if the decrypted result is valid. This could easily be done by running sha1 on the plaintext data, prepending the result to the text and encrypting it as a whole. When you decrypt it, you can split the resulting string (sha1 output has a fixed size, so you know where to split), run sha1 on the text part and compare it with the hash part. If it matches, you have a valid result. You can of course improve the security a little by using SHA-256 or SHA-512.
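The framing described above can be sketched as follows. This is only an illustration in Python: the function names are hypothetical, and the AES step is left out, so add_checksum() produces what you would pass to your encryption routine and strip_checksum() takes what your decryption routine returns.

```python
import hashlib
from typing import Optional

DIGEST_SIZE = hashlib.sha1().digest_size  # 20 bytes for SHA-1


def add_checksum(plaintext: bytes) -> bytes:
    """Prepend the SHA-1 digest; encrypt the returned bytes as a whole."""
    return hashlib.sha1(plaintext).digest() + plaintext


def strip_checksum(decrypted: bytes) -> Optional[bytes]:
    """Return the plaintext if the checksum matches, otherwise None."""
    if len(decrypted) < DIGEST_SIZE:
        return None
    digest, body = decrypted[:DIGEST_SIZE], decrypted[DIGEST_SIZE:]
    if hashlib.sha1(body).digest() != digest:
        return None  # wrong key or corrupted data
    return body
```

Because the check works on raw bytes, it does not matter whether the payload is text or arbitrary binary data.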
That is just one way of doing it, and it might not be the best. Better ways would be to use an authenticated mode of operation for AES like GCM or CCM, or to use encrypt-then-MAC with a good MAC function like HMAC-SHA512.
With the approaches above you're free to encrypt any kind of data, because you no longer depend on determining whether the result is text or not.
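For the encrypt-then-MAC variant, a rough Python sketch might look like the following. The ciphertext is treated as opaque bytes (the AES step itself is not shown), the names seal and open_sealed are invented for this example, and it assumes the MAC key is separate from the encryption key and that any IV is already part of the ciphertext being authenticated.

```python
import hashlib
import hmac
from typing import Optional

TAG_SIZE = hashlib.sha512().digest_size  # 64 bytes for HMAC-SHA512


def seal(ciphertext: bytes, mac_key: bytes) -> bytes:
    """Append an HMAC-SHA512 tag computed over the ciphertext."""
    tag = hmac.new(mac_key, ciphertext, hashlib.sha512).digest()
    return ciphertext + tag


def open_sealed(message: bytes, mac_key: bytes) -> Optional[bytes]:
    """Verify the tag first; return the ciphertext only if it is authentic."""
    if len(message) < TAG_SIZE:
        return None
    ciphertext, tag = message[:-TAG_SIZE], message[-TAG_SIZE:]
    expected = hmac.new(mac_key, ciphertext, hashlib.sha512).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        return None
    return ciphertext
```

Only when open_sealed() returns something other than None would you go on to decrypt, which is what removes the need to guess whether the output "looks like text".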
When I apply Crypt::encrypt(1) I'm getting this encrypted string:
eyJpdiI6IlBoQnliQkZkb0NPT1g5NG9FbkpqV2hLa3ZLUnlWSEFRMEZwM2YxTEdNVk09IiwidmFsdWUiOiJ0N0kyWmZvRWVETzE3WTJWVU5DS1ZpTVFYTGpXNHQxT2YyQWdsMFgxK0xvPSIsIm1hYyI6IjAzMjAzNzdhNzZmYmZiZDVkZGJkMjM5MWY5NjhkNzJjMWFhMzNiYmYyZDJkODNlMmFkODcyNzdhYTE3ZjFkODMifQ==
Is it possible to make the string shorter (4-5 times shorter) in Laravel, using the same two-way encryption?
What you want to do, instead of encrypting the URI portion, is obfuscate it. For example, one of the great libraries for PHP that does this is Hashids.
I've been reading about generating a salt using a Cryptographically Secure Pseudo-Random Number Generator (CSPRNG). This salt will then be appended to a string that needs to be hashed.
However, the salt generated by CSPRNG function (for PHP I'm using openssl_random_pseudo_bytes) is actually binary data.
Confused about how I should append this binary data to a string, I saw this PHP example for creating hash. It encodes binary data.
So I just wanted to know if that is what I need to do. I need to encode salt to get a string. Then I can append that salt to a string that needs to be hashed. Or are there other ways of adding salt to a string?
Note: I'm not hashing a password.
If you need to hash a password, please use password_hash() and password_verify(), and probably add password_needs_rehash() - see http://de2.php.net/password_hash.
You might notice that these functions are available since PHP 5.5.0 - if you are using an earlier version of PHP, you can add this compatibility library to make it work with PHP starting at 5.3.7.
It can't get very much easier than that.
It's probably best to first convert your string into binary data using an encoding. UTF-8 encoding is probably best for most use cases. Don't forget to (at least) document which character encoding is used.
Now concatenate the salt and the encoded string. Again, you need to (at least) document the size of the salt. Please make sure you use concatenation of bytes, not strings. Bytes can have any value, including invalid characters, control characters etc.
After the concatenation you can feed the resulting byte array into the hashing function.
If you have trouble with byte concatenation in PHP, you could use hexadecimal values instead. But don't forget to convert them back into bytes before feeding them into the hash method.
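The steps above can be sketched in Python as follows. The 16-byte salt size, the choice of SHA-256 and the function names are assumptions made for this illustration, not something the answer prescribes. The second helper shows the hexadecimal route: the salt is stored as hex but converted back into bytes before hashing.

```python
import hashlib
import os

SALT_SIZE = 16  # assumed salt length in bytes; document whatever you choose


def salted_hash(text: str, salt: bytes) -> bytes:
    """Hash salt || UTF-8(text), concatenating bytes rather than strings."""
    data = salt + text.encode("utf-8")
    return hashlib.sha256(data).digest()


def salted_hash_from_hex(text: str, salt_hex: str) -> bytes:
    """Same as salted_hash, for a salt that was stored as a hex string."""
    return salted_hash(text, bytes.fromhex(salt_hex))


if __name__ == "__main__":
    salt = os.urandom(SALT_SIZE)  # CSPRNG-backed random bytes
    print(salt.hex(), salted_hash("some text", salt).hex())
```

Hashing the hex string itself instead of the decoded bytes would also be deterministic, but it would produce a different hash, so both sides must agree on which one is used.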
Most of the text stored in my DB is from 1MB to 1.5MB big, but never bigger than 1.5MB, because that's the limit I set.
Here are my needs:
I need it for lowering my mysql database size
I need it to be as fast as possible
no security needed
it must just work correctly, so that string_1 and string_2 can never have the same hash
I use PHP and MYSQL.
A hash is not reversible. You can make a 1.5MB text into a small string with the help of hashing, but you cannot convert the same hash back into the original text.
What you are looking for is a compression algorithm. You can make the files a lot smaller with compression, but it's unlikely to be as small as a hash.
I would suggest SHA1, as it is also in use by git and similar applications to identify strings.
See: https://en.wikipedia.org/wiki/Sha1
and: http://php.net/manual/en/function.hash.php
$hash = hash( 'sha1', $inputData );
Saving space
MySQL has built-in COMPRESS() and UNCOMPRESS() functions which will save space in your DB, as well as saving you from having to write extra PHP code.
Checking unique-ness
Instead of indexing TEXT columns [regardless of whether they're compressed or not] you can store and index 2 relatively small things that, taken together, make it practically certain that the text is unique.
A hash of the data, MD5, SHA, whatever you want.
The length of the uncompressed data.
For most hashing functions you're more likely to get hit by a meteor than to have 2 identical hashes for different text strings, and having 2 strings with identical length and hash is less likely than getting hit by a meteor and lightning while winning three simultaneous lotteries.
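As an illustration of the hash-plus-length idea, here is a small Python sketch. The function name fingerprint, the use of SHA-1 and the UTF-8 encoding are assumptions for the example. The returned pair is what you would store in two small indexed columns next to the text.

```python
import hashlib
from typing import Tuple


def fingerprint(text: str) -> Tuple[str, int]:
    """Return (SHA-1 hex digest, byte length) of the uncompressed text."""
    data = text.encode("utf-8")
    return hashlib.sha1(data).hexdigest(), len(data)


if __name__ == "__main__":
    print(fingerprint("hello"))
    # ('aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d', 5)
```

Two rows whose fingerprints differ are certainly different texts. Two rows with the same fingerprint are, for practical purposes, the same text, and a full comparison can confirm it if needed.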
I'm going to assume you want a compression algorithm to reduce the text size.
See http://php.net/manual/en/function.gzcompress.php.
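For comparison, a minimal Python sketch of the same idea. It uses zlib.compress, which emits the same zlib format as PHP's gzcompress(); the helper names and the compression level are assumptions for this example.

```python
import zlib


def compress_text(text: str, level: int = 6) -> bytes:
    """Compress UTF-8 text into a zlib-format blob for storage."""
    return zlib.compress(text.encode("utf-8"), level)


def decompress_text(blob: bytes) -> str:
    """Reverse compress_text and return the original text."""
    return zlib.decompress(blob).decode("utf-8")


if __name__ == "__main__":
    sample = "lorem ipsum dolor sit amet " * 1000
    blob = compress_text(sample)
    print(len(sample), "->", len(blob))
    assert decompress_text(blob) == sample
```

Unlike a hash, the blob can be turned back into the original text. How much space is saved depends entirely on how repetitive the text is.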
I'd like to have a super simple / fast encrypt/decrypt function for non-critical pieces of data. I'd prefer the encrypted string to be URL-friendly (bonus points for pure alphanumerics), and no longer than it has to be. Ideally it should have some sort of key or other mechanism to randomize the cipher as well.
Because of server constraints the solution should not use mcrypt. Ideally it should also avoid base64 because of easier decrypting.
Example strings:
sample#email_address.com
shortstring
two words
or three words
555-123-4567
Capitals Possible?
You will probably have to code it yourself, but a Vigenère cypher on the characters A-Z, a-z, 0-9 should meet your needs.
With careful key generation and a long key (ideally longer than the encrypted text) Vigenère can be secure, but you have to use it very carefully to ensure that.
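A Vigenère shift over A-Z, a-z, 0-9 could be sketched like this in Python. It is one possible reading of the suggestion, not a vetted cipher: the function names are made up, characters outside the 62-character alphabet (such as "#", "-" or spaces) are assumed to pass through unchanged, and the result should be treated as obfuscation only.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def _shift(text: str, key: str, direction: int) -> str:
    """Shift each alphanumeric character by the matching key character."""
    if not key or any(ch not in INDEX for ch in key):
        raise ValueError("key must be non-empty and alphanumeric")
    out = []
    k = 0  # position in the key; advances only on alphanumeric input
    for ch in text:
        if ch in INDEX:
            offset = INDEX[key[k % len(key)]]
            out.append(ALPHABET[(INDEX[ch] + direction * offset) % len(ALPHABET)])
            k += 1
        else:
            out.append(ch)  # leave punctuation and spaces as they are
    return "".join(out)


def encrypt(text: str, key: str) -> str:
    return _shift(text, key, 1)


def decrypt(text: str, key: str) -> str:
    return _shift(text, key, -1)
```

The output is exactly as long as the input, but it is only purely alphanumeric when the input is, and a short repeated key leaves the usual Vigenère patterns in place.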
There's a wide variety of easy-to-implement ciphers around, such as XTEA. Don't invent your own, or use a trivially broken one like the Vigenère cipher. Better yet, don't do this at all - inventing your own cryptosystems is fraught with danger, and if you don't want your users to view the data, you probably shouldn't be sending it to them in the first place.