I am trying to solve a CTF challenge that is meant to be solved with type juggling. The code is:
if ($_GET["hash"] == hash("ripemd160", $_GET["hash"]))
{
    echo $flag;
}
else
{
    echo "<h1>Bad Hash</h1>";
}
I wrote a Python script that hashes random inputs with ripemd160 and looks for a digest that begins with "0e" and otherwise contains only digits. The code is:
import hashlib
import random
import string

def id_generator(size, chars=string.digits):
    return ''.join(random.choice(chars) for _ in range(size))

param = "0e"
results = []
while True:
    h = hashlib.new('ripemd160')
    h.update("{0}".format(str(param)).encode('utf-8'))
    hashed = h.hexdigest()
    if param not in results:
        print(param)
        if hashed.startswith("0e") and hashed[2:].isdigit():
            print(param)
            print(hashed)
            break
        results.append(param)
    else:
        print("CHECKED")
    param = "0e" + str(id_generator(size=10))
Any suggestions on how to solve it? Thank you!
There seems to be a bit of misunderstanding in the comments, so I'll start by explaining the problem a little more:
Type juggling refers to the behaviour of PHP whereby variables are implicitly cast to different data types under certain conditions. For example, all the following logical expressions will evaluate to true in PHP:
0 == 0 // int vs. int
"0" == 0 // str -> int
"abc" == 0 // any non-numerical string -> 0 (true before PHP 8; PHP 8 evaluates this one to false)
"1.234E+03" == "0.1234E+04" // string that looks like a float -> float
"0e215962017" == 0 // another string that looks like a float
The last of these examples is interesting because its MD5 hash value is another string consisting of 0e followed by a bunch of decimal digits (0e291242476940776845150308577824). So here's another logical expression in PHP that will evaluate to true:
"0e215962017" == md5("0e215962017")
To solve this CTF challenge, you have to find a string that is "equal" to its own hash value, but using the RIPEMD160 algorithm instead of MD5. When this is provided as a query string variable (e.g., ?hash=0e215962017), then the PHP script will disclose the value of a flag.
Fake hash collisions like this aren't difficult to find. Roughly 1 in every 256 MD5 hashes will start with '0e', and the probability that the remaining 30 characters are all digits is (10/16)^30. If you do the maths, you'll find that the probability of an MD5 hash equating to zero in PHP is approximately one in 340 million. It took me about a minute (almost 216 million attempts) to find the above example.
Exactly the same method can be used to find similar values that work with RIPEMD160. You just need to test more hashes, since the extra hash digits mean that the probability of a "collision" will be approximately one in 14.6 billion. Quite a lot, but still tractable (in fact, I found a solution to this challenge in about 15 minutes, but I'm not posting it here).
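The two odds quoted above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes every hex character of a digest is independent and uniformly distributed, and the function name is made up for illustration.

```python
# Chance that a hex digest is "0e" followed only by decimal digits,
# treating every hex character as independent and uniformly distributed.
def expected_attempts(hex_length: int) -> float:
    p_prefix = (1 / 16) ** 2                  # first two characters are exactly "0e"
    p_digits = (10 / 16) ** (hex_length - 2)  # every remaining character is 0-9
    return 1 / (p_prefix * p_digits)

print(f"MD5 (32 hex chars):       1 in {expected_attempts(32):,.0f}")
print(f"RIPEMD160 (40 hex chars): 1 in {expected_attempts(40):,.0f}")
```

The results land at roughly 340 million and 14.6 billion, matching the figures in the text.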
Your code, on the other hand, will take much, much longer to find a solution. First of all, there is absolutely no point in generating random inputs. Sequential values will work just as well, and will be much faster to generate.
If you use sequential input values, then you also won't need to worry about repeating the same hash calculations. Your code uses a list structure to store previously hashed values. This is a terrible idea. Searching for an item in a list is an O(n) operation, so once your code has (unsuccessfully) tested a billion inputs, it will have to compare every new input against each of these billion inputs at each iteration, causing your code to grind to a complete standstill. Your code would actually run a lot faster if you didn't bother checking for duplicates. When you have time, I suggest you learn when to use lists, dicts and sets in Python.
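If a duplicate check were ever really needed, a set would be the usual container for it, since membership tests on a set take roughly constant time on average instead of scanning every element. A minimal sketch (the helper name is hypothetical):

```python
seen = set()  # hash-based container: average O(1) membership test

def already_checked(candidate: str) -> bool:
    """Return True if candidate was seen before; otherwise remember it."""
    if candidate in seen:
        return True
    seen.add(candidate)
    return False
```

Even so, a set holding billions of strings would exhaust memory long before the search finished, which is another reason to prefer sequential inputs that cannot repeat.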
Another problem is that your code only tests 10-digit numbers, which means it can only test a maximum of 10 billion possible inputs. Based on the numbers given above, are you sure this is a sensible limit?
Finally, your code is printing every single input string before you calculate its hash. Before your program outputs a solution, you can expect it to print out somewhere in the order of a billion screenfuls of incorrect guesses. Is there any point in doing this? No.
Here's the code I used to find the MD5 collision I mentioned earlier. You can easily adapt it to work with RIPEMD160, and you can convert it to Python if you like (although the PHP code is much simpler):
$n = 0;
while (1) {
    $s = "0e$n";
    $h = md5($s);
    if ($s == $h) break;
    $n++;
}
echo "$s : $h\n";
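For readers who prefer Python, the same sequential search might look roughly like the sketch below. The function names and the `limit` parameter are illustrative assumptions. The check is the "0e plus digits" pattern from the question's script, which is slightly narrower than PHP's == comparison. Whether `hashlib.new("ripemd160")` works depends on the OpenSSL build behind your Python, so MD5 is the default here.

```python
import hashlib
import itertools

def looks_like_zero(s: str) -> bool:
    """True for "0e" followed by one or more decimal digits (numeric zero in PHP)."""
    return s.startswith("0e") and s[2:].isdigit()

def find_magic_input(algorithm: str = "md5", start: int = 0, limit=None):
    """Try "0e<start>", "0e<start+1>", ... and return (input, digest) for the first hit.

    Returns None if `limit` candidates were tried without success.
    """
    counter = itertools.count(start) if limit is None else range(start, start + limit)
    for n in counter:
        candidate = f"0e{n}"
        digest = hashlib.new(algorithm, candidate.encode("ascii")).hexdigest()
        if looks_like_zero(digest):
            return candidate, digest
    return None
```

Calling `find_magic_input("md5")` without a limit walks through the same candidates as the PHP loop, only much more slowly; passing `"ripemd160"` (where available) targets the challenge itself.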
Note: Use PHP's hash_equals() function and strict comparison operators to avoid this sort of vulnerability in your own code.
I'm new to PHP and studying on my own. Normally I create my tokens and insert them into the tables like this:
private function create_token($reference, $bytes, $slice)
{
    $key = substr(preg_replace('/\W/', "", base64_encode(bin2hex(random_bytes($bytes)))), 0, $slice);
    return $reference . $key;
}
$this->create_token('token_B8', 34, 22); // e.g. token_B8eEr32EEddDsfSDGRGgHHhg
This may be a correct way to create tokens, but I'm not sure it really is. Obviously the chance of two identical tokens ending up in the same table is something like 1 in 1000000000, correct? Or is there a way to create a token that guarantees:
Under no circumstances create an equal token
without having to write a function that checks whether the token already exists in the table?
I believe what I should do is keep creating the token the way I do now and add a function that checks whether it already exists in the table: if it does, generate another token; if not, insert it. This seems correct, but as I'm new I don't know whether there is a more appropriate way. Can someone clear this up for me? Thanks.
The string generated by random_bytes() is maximally random, and literally everything you do to it after that is decreasing the amount of randomness in the string, and therefore the number of possible values that it could be.
random_bytes() yields 8 bits of randomness per byte.
bin2hex() stretches each byte of input over two bytes. [x0.5]
base64_encode() stretches 3 input bytes over 4 output bytes. [x0.75]
preg_replace('/\W/', "", $input) effectively changes the encoding from base64 to base62, decreasing the space slightly once again. [x??? < 1]
So, all told, the 22-byte token you're generating represents 22 * 8 * 0.5 * 0.75 * ??? <= 66 bits of random data. That is <= 2^66 = 73,786,976,294,838,206,464 possibilities.
Boy howdy, that sure seems like a lot, right? Well not really. Because of the Birthday Paradox the probability of collisions can get into the range of causing issues while you're still a few orders of magnitude away from filling the range.
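To put rough numbers on the Birthday Paradox remark, the standard approximation p ≈ 1 - exp(-n(n-1)/(2N)) can be evaluated directly. This is a sketch; the function name and the sample token counts are chosen purely for illustration.

```python
import math

def collision_probability(n_tokens: int, space_bits: int) -> float:
    """Birthday approximation: p ~= 1 - exp(-n(n-1) / (2N)) with N = 2**space_bits."""
    exponent = n_tokens * (n_tokens - 1) / (2 * 2 ** space_bits)
    return -math.expm1(-exponent)  # expm1 keeps precision when p is tiny

for n in (10**6, 10**9, 2**33):
    print(f"{n:>13,} tokens in a 66-bit space: p = {collision_probability(n, 66):.3g}")
```

With 2^33 tokens, the square root of the 66-bit space, the chance of at least one collision is already close to 40%.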
I guess if we remove that pointless bin2hex() we could squeeze out another 66 bits for 132 in total? But how much more does that really get us?
5,444,517,870,735,015,415,413,993,718,908,291,383,296
A lot. A lot. I don't even care about that preg_replace() anymore.
For the sake of completeness, what about just a random_bytes(22)? 176 bits?
95,780,971,304,118,053,647,396,689,196,894,323,976,171,195,136,475,136
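The three figures above follow from the same bookkeeping: start with 22 * 8 bits and multiply by each encoding's expansion ratio. A small sketch that redoes the arithmetic with exact integers (the dictionary labels are just descriptive names, and the base62 stripping step is ignored, so these are upper bounds):

```python
TOKEN_CHARS = 22  # length kept by substr() in the question

# Expansion ratios as (numerator, denominator) pairs so the math stays exact.
variants = {
    "bin2hex + base64 (original)": [(1, 2), (3, 4)],
    "base64 only": [(3, 4)],
    "raw random_bytes(22)": [],
}

def entropy_bits(ratios):
    """Upper bound on the random bits carried by a 22-character token."""
    bits = TOKEN_CHARS * 8
    for num, den in ratios:
        bits = bits * num // den
    return bits

for name, ratios in variants.items():
    bits = entropy_bits(ratios)
    print(f"{name:30} <= {bits:3d} bits -> {2 ** bits:,} values")
```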
I guess the take-aways are:
Don't confuse data encoding with "make more random" just because the output looks garbled. [Note: the same goes for hash functions]
Don't apply functions/encodings willy-nilly if you don't know what they are actually doing.
In code:
$input = 'abc';
// all of these outputs contain the SAME amount of entropy, some of them are just longer representations
var_dump(
$input,
bin2hex($input),
base64_encode($input),
base64_encode(bin2hex($input)),
bin2hex(base64_encode($input))
);
Output:
string(3) "abc"
string(6) "616263"
string(4) "YWJj"
string(8) "NjE2MjYz"
string(8) "59574a6a"
Anyway, with a sufficiently large random ID space it's more pragmatic to just put a UNIQUE constraint on the value and let the process fail when a duplicate value tries to be inserted. You can put in some retry logic, but odds are that it will never actually run unless someone leverages vulnerabilities specifically to make you generate duplicates and DoS yourself with retries. [yes, this is a thing]
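The "UNIQUE constraint plus a small retry loop" idea could be sketched as follows. Python's sqlite3 and secrets modules are used here only to keep the example self-contained; the table name, column name, function name and retry count are all assumptions, and the same pattern applies to PHP with PDO.

```python
import secrets
import sqlite3

def insert_with_unique_token(conn, make_token=lambda: secrets.token_urlsafe(22), max_attempts=3):
    """Insert a fresh token, relying on the UNIQUE constraint to reject duplicates."""
    for _ in range(max_attempts):
        token = make_token()
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute("INSERT INTO tokens (token) VALUES (?)", (token,))
            return token
        except sqlite3.IntegrityError:
            continue  # duplicate value: generate another token and retry
    raise RuntimeError("could not generate a unique token")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tokens (token TEXT NOT NULL UNIQUE)")
print(insert_with_unique_token(conn))
```

Capping the number of attempts is the guard against the self-inflicted DoS mentioned above: if duplicates keep appearing, something is wrong and the code should fail loudly rather than loop.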
In php is there a way to give a unique hash from a string, but that the hash was made up from numbers only?
example:
return md5(234); // returns 098f6bcd4621d373cade4e832627b4f6
but I need
return numhash(234); // returns 00978902923102372190
(20 numbers only)
the problem here is that I want the hashing to be short.
edit:
OK let me explain the back story here.
I have a site that has a ID for every registered person, also I need a ID for the person to use and exchange (hence it can't be too long), so far the ID numbering has been 00001, 00002, 00003 etc...
1. this makes some people look more important
2. this reveals application info that I don't want to reveal.
To fix points 1 and 2 I need to "hide" the number while keeping it unique.
Edit + SOLUTION:
Numeric hash function based on the code by https://stackoverflow.com/a/23679870/175071
/**
 * Return a number only hash
 * https://stackoverflow.com/a/23679870/175071
 * @param $str
 * @param null $len
 * @return number
 */
public function numHash($str, $len=null)
{
    $binhash = md5($str, true);
    $numhash = unpack('N2', $binhash);
    $hash = $numhash[1] . $numhash[2];
    if($len && is_int($len)) {
        $hash = substr($hash, 0, $len);
    }
    return $hash;
}
// Usage
numHash(234, 20); // always returns 6814430791721596451
An MD5 or SHA1 hash in PHP returns a hexadecimal number, so all you need to do is convert bases. PHP has a function that can do this for you:
$bignum = hexdec( md5("test") );
or
$bignum = hexdec( sha1("test") );
PHP Manual for hexdec
Since you want a limited size number, you could then use modular division to put it in a range you want.
$smallnum = $bignum % [put your upper bound here]
EDIT
As noted by Artefacto in the comments, using this approach will result in a number beyond the maximum size of an Integer in PHP, and the result after modular division will always be 0. However, taking a substring of the hash that contains only the first 15 characters (60 bits, which fits in an integer on 64-bit builds) doesn't have this problem. Revised version for calculating the initial large number:
$bignum = hexdec( substr(sha1("test"), 0, 15) );
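The same "hex prefix, then modulo" idea, sketched in Python for comparison. Python integers never overflow, so the 15-character cut is kept here only to mirror the PHP version; the function name is an assumption.

```python
import hashlib

def small_number(text: str, upper_bound: int) -> int:
    """Map text to 0 .. upper_bound-1 using the first 15 hex digits (60 bits) of SHA-1."""
    prefix = hashlib.sha1(text.encode("utf-8")).hexdigest()[:15]
    return int(prefix, 16) % upper_bound

print(small_number("test", 10**6))
```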
You can try crc32(). See the documentation at: http://php.net/manual/en/function.crc32.php
$checksum = crc32("The quick brown fox jumped over the lazy dog.");
printf("%u\n", $checksum); // prints 2191738434
With that said, crc should only be used to validate the integrity of data.
There are some good answers but for me the approaches seem silly.
They first force php to create a Hex number, then convert this back (hexdec) in a BigInteger and then cut it down to a number of letters... this is much work!
Instead why not
Read the hash as binary:
$binhash = md5('[input value]', true);
then using
$numhash = unpack('N2', $binhash); //- or 'V2' for little endian
to cast this as two INTs ($numhash is an array of two elements). Now you can reduce the number of bits in the number simply using an AND operation. e.g:
$result = $numhash[1] & 0x000FFFFF; //- to get numbers between 0 and 1048575
But be warned of collisions! Reducing the number means increasing the probability of two different [input value] with the same output.
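For comparison, the binary-digest approach might look like this in Python, where struct.unpack with the ">2I" format plays the role of PHP's unpack('N2', ...). This is a sketch and the function name is hypothetical.

```python
import hashlib
import struct

def masked_hash(value: str, mask: int = 0x000FFFFF) -> int:
    """Return the first 32 bits of the MD5 digest, reduced with a bit mask."""
    digest = hashlib.md5(value.encode("utf-8")).digest()   # 16 raw bytes, like md5($v, true)
    first, _second = struct.unpack(">2I", digest[:8])      # two big-endian 32-bit ints
    return first & mask                                    # default mask: 0 .. 1048575

print(masked_hash("234"))
```

A smaller mask gives shorter numbers but, as noted above, makes collisions more likely.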
I think a much better way would be "ID-crypting" with a bijective function, so that no collisions can happen. For the simplest kind, just use an affine cipher.
Example with max input value range from 0 to 25:
function numcrypt($a)
{
    return ($a * 15) % 26;
}
function unnumcrypt($a)
{
    return ($a * 7) % 26;
}
Output:
numcrypt(1) : 15
numcrypt(2) : 4
numcrypt(3) : 19
unnumcrypt(15) : 1
unnumcrypt(4) : 2
unnumcrypt(19) : 3
e.g.
$id = unnumcrypt($_GET['userid']);
... do something with the ID ...
echo ' go ';
Of course this is not secure, but if no one knows the method used for your encryption and there are no real security requirements, then this way is faster and collision-safe.
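The pair 15 and 7 works because 15 * 7 = 105 = 4 * 26 + 1, so 7 is the modular inverse of 15 modulo 26. A sketch that derives the decoding multiplier for any modulus (the helper `make_affine` is hypothetical, and the three-argument `pow` with a negative exponent needs Python 3.8 or later):

```python
from math import gcd

def make_affine(multiplier: int, modulus: int):
    """Return (encode, decode) for the bijection n -> n * multiplier mod modulus."""
    if gcd(multiplier, modulus) != 1:
        raise ValueError("multiplier must be coprime with modulus")
    inverse = pow(multiplier, -1, modulus)  # modular inverse

    def encode(n: int) -> int:
        return (n * multiplier) % modulus

    def decode(n: int) -> int:
        return (n * inverse) % modulus

    return encode, decode

numcrypt, unnumcrypt = make_affine(15, 26)
print([numcrypt(n) for n in (1, 2, 3)])       # [15, 4, 19]
print([unnumcrypt(n) for n in (15, 4, 19)])   # [1, 2, 3]
```

For real user IDs the modulus would have to be at least as large as the biggest ID, with a multiplier coprime to it.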
The problem with cutting off the hash is collisions. To avoid them, try:
return hexdec(crc32("Hello World"));
The crc32():
Generates the cyclic redundancy checksum polynomial of 32-bit lengths
of the str. This is usually used to validate the integrity of data
being transmitted.
That gives us a 32-bit integer, which is negative on 32-bit installations and positive on 64-bit ones. This integer can be stored as an ID in a database. This doesn't have collision problems, because it fits into a 32-bit variable once you convert it to decimal with the hexdec() function.
First of all, md5 is basically compromised, so you shouldn't be using it for anything but non-critical hashing.
PHP5 has the hash() function, see http://www.php.net/manual/en/function.hash.php.
Setting the last parameter to true will give you a string of binary data. Alternatively, you could split the resulting hexadecimal hash into pieces of 2 characters and convert them to integers individually, but I'd expect that to be much slower.
Try Hashids.
It hashes a number into a format you can define. The format includes how many characters to produce and which characters may be used.
Example:
$hashids->encode(1);
This will return something like "28630", depending on your format.
Just use my manual hash method below:
Divide the number (e.g. 6 digit) by prime values, 3,5,7.
And get the first 6 values that are in the decimal places as the ID to be used. Do a check on uniqueness before actual creation of the ID, if a collision exists, increase the last digit by +1 until a non collision.
E.g. 123456 gives you 771428
123457 gives you 780952
123458 gives you 790476.
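The three examples can be reproduced with integer arithmetic, which avoids floating-point rounding. This is only a sketch of the method as described, with a made-up function name. Note that the digits after the decimal point depend only on the number modulo 105, so there are just 105 distinct outputs and the uniqueness check has to do most of the work.

```python
def decimal_places_id(number: int, digits: int = 6) -> str:
    """First `digits` decimal places of number / (3 * 5 * 7), zero-padded."""
    divisor = 3 * 5 * 7                # 105
    remainder = number % divisor       # the fractional part is remainder / 105
    return str(remainder * 10 ** digits // divisor).zfill(digits)

for n in (123456, 123457, 123458):
    print(n, "gives", decimal_places_id(n))
```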
I tried this piece of code:
For i=1 to 1000000
  mystring.s=Str(i)+"'2013-"+mm+"-"+dd+"','"+valoare+"','"+curs+"','"+total+"','"+Str(cont)+"','"+simbolcont+"','Denumire"+Str(i)+"','"+valuta.s+"','"+RSet(Str(i),40,"0")+"','"+total.s+"'"
  id.s=UCase(MD5Fingerprint(@mystring.s,StringByteLength(mystring))+SHA1Fingerprint(@mystring,StringByteLength(mystring)))
Next i
The code above is in PureBasic, but I am more interested in the principle of using this as a unique ID.
I can say that in 1,000,000 generated strings I did not find any collisions.
Is MD5(String)+SHA1(String), resulting in a 72-character string, usable as a unique ID?
Keep in mind that String is the same in both functions and varies in length between 300 and 350 chars.
Or, the simple question:
if the SHA1 of a string collides, does the MD5 of the same string collide too? Or vice versa?
I'm not a math genius, but I guess the collision probability is low.
I cannot use a unique ID based on a timestamp here.
Thank you for your time.
To answer my own question, here is a quote from another forum:
If I have two random strings (s1, s2) that are different (s1 != s2), you want to know the probability that md5(s1) == md5(s2) AND sha1(s1) == sha1(s2).
Well, first, for two specific randomly chosen strings, what is the probability that md5(s1) == md5(s2)? The answer is 1/2^128: the first hash is some 128-bit string, and the chance that the second hash equals the first is 1 in 2^128, or about 2.9 x 10^-37 %.
Similarly, P(sha1(s1) == sha1(s2)) = 2^-160 ~ 6.8 x 10^-47 %.
Now the probability that that both conditions would be true assuming they are independent conditions (that is that the hashing functions are fundamentally independent of each other), is found by multiplying the probabilities since P(X AND Y) = P(X) P(Y) so P(md5(s1)==md5(s2) AND sha1(s1) == sha1(s2)) = 2^-288 ~ 2 x 10^-85 %.
Granted we assumed the hashing functions act independent of each other on the string -- which is a fair assumption for md5 and sha1 as hashing functions. But if instead of comparing MD5 and SHA-1, we compared MD5 and a new hashing function that's just MD5 applied to itself 100 times, we would find that whenever md5(s1) == md5(s2), that we'd also have md5^100(s1) == md5^100(s2), so the probability of both colliding is the same as the probability of having one collision.
Similarly, if we had a silly "hash" function that was just silly_hash(s) = md5(s) ++ s (where ++ means concatenate), then you could show that if s1 != s2 and md5(s1) == md5(s2) then silly_hash(s1) != silly_hash(s2) -- meaning that you could never have a double collision with md5 and silly_hash.
If you take 2 specific strings and compare, there's a 1 in 2^288 ~ 497323236409786642155382248146820840100456150797347717440463976893159497012533375533056 chance of both matching. Granted if you generate roughly about 2^144 ~ 22300745198530623141535718272648361505980416 strings together, there's a good chance that both hashes will match for one.
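The percentages and the two large numbers quoted above can be rechecked with exact fractions. A sketch, under the same independence assumption as the quoted answer:

```python
from fractions import Fraction

p_md5 = Fraction(1, 2 ** 128)    # two given strings share an MD5 digest
p_sha1 = Fraction(1, 2 ** 160)   # two given strings share a SHA-1 digest
p_both = p_md5 * p_sha1          # assumes the two hash functions behave independently

def as_percent(p: Fraction) -> float:
    return float(p * 100)

print(f"MD5:   {as_percent(p_md5):.1e} %")
print(f"SHA-1: {as_percent(p_sha1):.1e} %")
print(f"both:  {as_percent(p_both):.1e} %")

# Birthday bound: collisions become likely around the square root of the space.
birthday_threshold = 2 ** (288 // 2)
print(f"2^144 = {birthday_threshold}")
```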
Tested with 3,500,000 strings and not a single match, so it's good enough for me. (For the DB I use to reach that many records would take 10+ years of input at the current rate of 1,400,000 records in 4 years, and I do an ID check on the way; if needed, 1 char can be modified somewhere.)
And 22300745198530623141535718272648361505980416? I can't even count that.
Hope it helps someone. The answer is: yes, I can use MD5(s1)+SHA1(s1) as an ID.
the topic pretty much describes what we would like to accomplish.
a) start with a possible range of integers, for example, 1 to 10000.
b) take any md5 hash, run it thru this algo.
c) result that pops out will be an integer between 1 to 10000.
we are open to using another hashing method too.
the flow would ideally look like this:
string -> md5(string) -> algo(md5(string),range) -> resulting integer within range
is something like this possible?
final note: the range will always start with 1.
if you have an answer, feel free to post just the general idea, or if you so desire, php snippet works too :)
thanks!
MD5 will give you 128 bits of data (SHA-1 gives 160, and so on). In PHP you'll get it in hexadecimal string notation, so you need to convert it to an integer first. That number modulo 10000 gives an integer from 0 to 9999; add 1 to get your range of 1 to 10000.
Note however that many different hashes will convert to the same integer; this is unavoidable with any sort of conversion to your integer range, as the modulo operation essentially maps a larger set of numbers (in this case, 128 bits, that is numbers from 0 to 340,282,366,920,938,463,463,374,607,431,768,211,455) to a smaller set of numbers (less than 14 bits, numbers from 1 to 10,000).
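The general idea, sketched in Python (the function name is an assumption). The full 128-bit digest is turned into one big integer, reduced modulo the range size, and shifted by one so the result starts at 1.

```python
import hashlib

def to_range(text: str, upper: int = 10_000) -> int:
    """Map text to an integer in 1..upper via its MD5 digest."""
    as_int = int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)
    return as_int % upper + 1

print(to_range("some string"))
```

Because 2^128 is vastly larger than 10,000, the bias introduced by the modulo step is negligible here.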
Since the range we want will always start at 1, the following works great. All credit goes to Piskvor, as he was the one who provided the basic idea of how to go at this.
The code below seems to accomplish what we want. Please chime in if the idea (not the code, it's just for reference) can be improved at all. Running the code below results in 6305 / 10000 unique results, which in our case is good enough.
<?
$final=array();
$range=10000;
for($i=1;$i<=$range;$i++){
    $string='this is my test string - attempt #'.$i;
    echo 'initial string: '.$string.PHP_EOL;
    $crc32=crc32($string);
    echo 'crc32 of string: '.$crc32.PHP_EOL;
    // modulo yields 0..range-1; add 1 so the result falls in 1..range
    $postalgo=($crc32%$range)+1;
    echo 'post algo: '.$postalgo.PHP_EOL;
    if(!in_array($postalgo,$final)){
        $final[]=$postalgo;
    }
}
echo 'unique results for '.($i-1).' attempts: '.count($final).PHP_EOL;
?>
enjoy!
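A result of about 63% unique values is what random placement predicts: throwing n items into N buckets leaves roughly N * (1 - (1 - 1/N)^n) buckets occupied, which for n = N approaches 1 - 1/e. The sketch below compares that estimate with a CRC-32 run over the same test strings, using Python's zlib.crc32 (function names are assumptions):

```python
import zlib

def expected_unique(samples: int, buckets: int) -> float:
    """Expected number of occupied buckets for uniformly random placement."""
    return buckets * (1 - (1 - 1 / buckets) ** samples)

def observed_unique(samples: int, buckets: int) -> int:
    """Count distinct CRC-32 values modulo `buckets` over the test strings."""
    results = set()
    for i in range(1, samples + 1):
        text = f"this is my test string - attempt #{i}"
        results.add(zlib.crc32(text.encode("utf-8")) % buckets)
    return len(results)

print(f"expected: {expected_unique(10_000, 10_000):.0f}")
print(f"observed: {observed_unique(10_000, 10_000)}")
```

So the gap to 10000 is not a flaw of CRC-32 in particular; any hash squeezed into a range this small will leave a similar share of values unused.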
I am trying to port a piece of code from Perl to PHP. The Perl snippet is part of Akamai's video-on-demand link generation script. The script generates a seed based on the location / URL of the video file (which will always be constant for a single URL). The seed is then used to generate a serial ID for the stream (which is basically a random number between 1 and 2000 using the seed). Here is the Perl code:
$seed=6718;
srand($seed);
print(int(rand(1999)) + 1); # returns 442 every time
And the converted PHP code is:
$seed=6718;
srand($seed);
echo(rand(0, 1999) + 1); //returns 155 every time
Does PHP's rand behave differently from Perl's?
Yes. You can't depend on their algorithms being the same. For perl, which rand is used depends on what platform your perl was built for.
You may have more luck using a particular algorithm; for instance, Mersenne Twister looks to be available for both PHP and Perl.
Update: trying it produces different results, so that one at least won't do the trick.
Update 2: From the perl numbers you show, your perl is using the drand48 library; I don't know whether that's available for PHP at all, and google isn't helping.
[clippy]It looks like you're trying to hash a number, maybe you want to use a hash function?[/clippy]
Hash functions are designed to take an input and produce a consistently repeatable value, that is in appearance random. As a bonus they often have cross language implementations.
Using srand() with rand() to get what is basically a hash value is a fairly bad idea. Different languages use different algorithms, some just use system libraries. Changing (or upgrading) the OS, standard C library, or language can result in wildly different results.
Using SHA1 to get a number between 1 and 2000 is a bit overkill, but you can at least be sure that you could port the code to nearly any language and still get the same result.
use Digest::SHA1;
# get an integer hash value from $in between $min (inclusive) and $max (exclusive)
sub get_int_hash {
    my ($in, $min, $max) = @_;
    # calculate the SHA1 of $in, note $in is converted to a string.
    my $sha = Digest::SHA1->new;
    $sha->add( "$in" );
    my $digest = $sha->hexdigest;
    # use the last 7 characters of the digest (28 bits) for an effective range of 0 - 268,435,455.
    my $value = hex substr $digest, -7;
    # scale and shift the value to the desired range.
    my $out = int( $value / 0x10000000 * ( $max - $min ) ) + $min;
    return $out;
}
print get_int_hash(6718, 1, 2000); # this should print 812 for any SHA1 implementation.
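To illustrate the portability claim, here is the same routine sketched in Python. It follows the Perl version step by step (last 7 hex digits, scale, shift), so it is expected to agree with it for the same input, though that has not been checked here.

```python
import hashlib

def get_int_hash(value, minimum: int, maximum: int) -> int:
    """Integer hash of value between minimum (inclusive) and maximum (exclusive)."""
    digest = hashlib.sha1(str(value).encode("utf-8")).hexdigest()
    tail = int(digest[-7:], 16)  # last 7 hex digits: 28 bits, 0 .. 0x0FFFFFFF
    # scale into [0, maximum - minimum) and shift to start at minimum
    return int(tail / 0x10000000 * (maximum - minimum)) + minimum

print(get_int_hash(6718, 1, 2000))
```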
From this snippet alone it is impossible to say whether the two are equivalent.
First, you need to know that even a random generator like the rand() function is not really random. It calculates each new value from the previous one with a mathematical formula. With the srand() function you set the starting value.
Calling srand() with the same argument each time means that the program always returns the same numbers in the same order.
If you really want random numbers, in Perl you should remove the call to srand(), because Perl automatically seeds the generator with a better (random) value when you first call rand().
If your program really wants random numbers, the code should also be fine for PHP. But even in PHP I would check whether the generator is seeded automatically, and if not, seed it with a more random value.
If your program doesn't want random numbers and instead really wants a stream of numbers that is always the same, then the snippets are probably not identical. Even if you do the same initialization with srand(), PHP may use a different formula to calculate the next "random" number.
So you need to look at the surrounding code to see whether it really wants random numbers. If it does, you can use this code, but even then you should look for a better initialization for srand().
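The "same seed, same sequence" behaviour described above is easy to observe within a single implementation. The sketch below uses Python's random module (the function name is hypothetical). Its numbers will not match Perl's or PHP's for the same seed, which is exactly the portability problem being discussed.

```python
import random

def serial_ids(seed: int, count: int = 3):
    """Return `count` pseudo-random serial IDs in 1..2000 for a given seed."""
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.randint(1, 2000) for _ in range(count)]

print(serial_ids(6718))
print(serial_ids(6718))  # identical to the line above: same seed, same sequence
```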