Could a random sleep prevent timing attacks? (PHP)

From Wikipedia
In cryptography, a timing attack is a side channel attack in which the
attacker attempts to compromise a cryptosystem by analyzing the time
taken to execute cryptographic algorithms.
Actually, to prevent timing attacks, I'm using the following function taken from this answer:
function timingSafeCompare($safe, $user) {
    // Prevent issues if string length is 0
    $safe .= chr(0);
    $user .= chr(0);
    $safeLen = strlen($safe);
    $userLen = strlen($user);
    // Set the result to the difference between the lengths
    $result = $safeLen - $userLen;
    // Note that we ALWAYS iterate over the user-supplied length
    // This is to prevent leaking length information
    for ($i = 0; $i < $userLen; $i++) {
        // Using % here is a trick to prevent notices
        // It's safe, since if the lengths are different
        // $result is already non-0
        $result |= (ord($safe[$i % $safeLen]) ^ ord($user[$i]));
    }
    // They are only identical strings if $result is exactly 0...
    return $result === 0;
}
But I was wondering whether it is possible to prevent this kind of attack with a random sleep, like so:
function timingSafeCompare($a, $b) {
    sleep(rand(0, 100));
    if ($a === $b) {
        return true;
    } else {
        return false;
    }
}
Or maybe by augmenting the randomness of the sleep:
sleep(rand(1,10)+rand(1,10)+rand(1,10)+rand(1,10));
Can this kind of approach totally prevent timing attacks, or does it just make the attacker's work harder?

Can this kind of approach totally prevent timing attacks, or does it just make the attacker's work harder?
Neither. It doesn't prevent timing attacks, nor does it make them any more difficult at all.
To understand why, look at the docs for sleep. Specifically, the meaning of the first parameter:
Halt time in seconds.
So your app takes 0.3 seconds to respond without sleep. With sleep it takes either 0.3, 1.3, 2.3, etc...
So really, to get the part we care about (the timing difference), we just need to chop off the integer part:
$real_time = $time - floor($time);
But let's go a step further. Let's say that you randomly sleep using usleep. That's a lot more granular. That's sleeping in microseconds.
Well, the measurements are being made in the 15-50 nanosecond scale. So that sleep is still about 100 times less granular than the measurements being made. So we can average off to the single microsecond:
$microseconds = $time * 1000000;
$real_microseconds = $microseconds - floor($microseconds);
And still have meaningful data.
You could go further and use time_nanosleep which can sleep to nanosecond scale precision.
Then you could start fuddling with the numbers.
But the data is still there. The beauty of randomness is that you can just average it out:
$x = 15 + rand(1, 10000);
Run that enough times and you'll get a nice pretty graph. You'll be able to tell that there are about 10000 different values, so you can average away the randomness and deduce the "private" 15.
Because well-behaved randomness is unbiased, it's pretty easy to detect statistically over a large enough sample.
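As a toy illustration of why averaging defeats unbiased noise, here is a hypothetical simulation (the 15 stands in for the secret timing difference; everything else is made up for the example):

// Simulate many noisy measurements, then average the noise away.
$samples = 1000000;
$sum = 0;
for ($i = 0; $i < $samples; $i++) {
    $sum += 15 + rand(1, 10000); // secret + uniform jitter
}
// E[rand(1, 10000)] = (1 + 10000) / 2 = 5000.5, so subtracting the
// known mean of the noise leaves an estimate of the hidden 15.
printf("estimated secret: %.3f\n", $sum / $samples - 5000.5);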
So the question I would ask is:
Why bother with sleep-like hacks when you can fix the problem correctly?

Anthony Ferrara answered this question in his blog post, It's All About Time. I highly recommend this article.
Many people, when they hear about timing attacks, think "Well, I'll just add a random delay! That'll work!". And it doesn't.

This is fine for a single request if the only side channel observable by the attacker is the response time.
However, if an attacker makes enough requests, this random delay averages out, as noted in @Scott's answer citing ircmaxell's blog post:
So if we needed to run 49,000 tests to get an accuracy of 15ns [without a random delay], then we would need perhaps 100,000 or 1,000,000 tests for the same accuracy with a random delay. Or perhaps 100,000,000. But the data is still there.
As an example, let's estimate the number of requests a timing attack would need to recover a valid 160-bit session ID like PHP's, at 6 bits per character, which gives a length of 27 characters. Assume, like the linked answer, that the attack can only target one user at a time (as the user to look up is stored in the cookie).
Taking the very best case from the blog post, 100,000, the number of permutations would be 100,000 * 2^6 * 27.
On average, the attacker will find the value halfway through the number of permutations.
This gives the number of requests needed to discover the Session ID from a timing attack to be 86,400,000. This is compared to 42,336,000 requests without your proposed timing protection (assuming 15ns accuracy like the blog post).
In the blog post, taking the longest length tested, 14, took 0.01171 seconds on average, which means 86,400,000 would take 1,011,744 seconds which equates to 11 days 17 hours 2 minutes 24 seconds.
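For reference, a quick sketch that reproduces this arithmetic (all input figures are taken from the post and the blog numbers cited above):

// Back-of-envelope timing-attack cost, per the figures cited above.
$testsPerGuess  = 100000;  // best case with a random delay
$alphabet       = 64;      // 2^6 possible characters
$idLength       = 27;      // ceil(160 / 6) characters
$avgRequests    = $testsPerGuess * $alphabet * $idLength / 2; // 86,400,000
$secondsPerTest = 0.01171; // average response time from the blog post
$totalSeconds   = $avgRequests * $secondsPerTest;             // ~1,011,744 s
printf("%d requests, %.0f s (~%.1f days)\n",
    $avgRequests, $totalSeconds, $totalSeconds / 86400);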
Could a random sleep prevent timing attacks?
This depends on the context in which your random sleep is used and the bit strength of the string it is protecting. If it is for "keep me logged in" functionality, which is the context of the linked question, then it could be worth an attacker spending 11 days to brute force the value via the timing attack. However, this assumes perfect conditions (i.e. fairly consistent response times from your application for each string position tested, and no resetting or rollover of IDs). Also, this type of activity will create a lot of noise, and the attacker is likely to be spotted by an IDS or IPS.
It can't entirely prevent them, but it can make them more difficult for an attacker to execute. It would be much easier and better to use something like hash_equals, which prevents timing attacks entirely, assuming the string lengths are equal.
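For illustration, a minimal sketch built on the native hash_equals (PHP 5.6+); hashing both inputs first is one way to sidestep its equal-length requirement, at the cost of an extra hash per call:

// Sketch only: compare fixed-length digests in constant time.
function safeCompare($knownSecret, $userInput)
{
    return hash_equals(
        hash('sha256', $knownSecret),
        hash('sha256', $userInput)
    );
}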
Your proposed code
function timingSafeCompare($a, $b) {
    sleep(rand(0, 100));
    if ($a === $b) {
        return true;
    } else {
        return false;
    }
}
Note that the PHP rand function is not cryptographically secure:
Caution
This function does not generate cryptographically secure values, and should not be used for cryptographic purposes. If you need a cryptographically secure value, consider using openssl_random_pseudo_bytes() instead.
This means that, in theory, an attacker could predict what rand was going to generate and then use this information to determine whether a response-time delay from your application was due to the random sleep or not.
The best way to approach security is to assume that the attacker knows your source code; the only things secret from the attacker should be things like keys and passwords. Assume they know the algorithms and functions used. If you can still say your system is secure even though an attacker knows exactly how it works, you will be most of the way there. Functions like rand are usually seeded with the current time of day, so an attacker can simply set their system clock to match your server's and then make requests to verify that their generator matches yours.
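To see how little protection a guessable seed provides, here is a toy demonstration (the seed value is arbitrary; within a single PHP build the sequence is fully determined by it):

// Toy demonstration: a seeded Mersenne Twister is fully reproducible.
mt_srand(1234567890); // imagine this is derived from the server's clock
$serverSequence = array(mt_rand(), mt_rand(), mt_rand());

mt_srand(1234567890); // the attacker replays the same seed...
$attackerSequence = array(mt_rand(), mt_rand(), mt_rand());

var_dump($serverSequence === $attackerSequence); // bool(true)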
Due to this, it is best to avoid insecure random functions like rand and change your implementation to use openssl_random_pseudo_bytes which will be unpredictable.
Also, as per ircmaxell's comment, sleep is not granular enough, as it only accepts an integer number of seconds. If you are going to try this approach anyway, look into time_nanosleep with a random number of nanoseconds.
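A hedged sketch of what that could look like, with the delay drawn from a CSPRNG instead of rand() (the helper name and the 1 ms ceiling are illustrative assumptions, and as the accepted answer explains, this still only raises the attack cost):

// Illustrative only: random sub-second delay driven by a CSPRNG.
function randomNanoDelay($maxNanoseconds = 1000000) // up to 1 ms
{
    // Draw 4 bytes from OpenSSL and map them onto a nanosecond count.
    $n = unpack('Nval', openssl_random_pseudo_bytes(4));
    // The modulo introduces a slight bias; tolerable for a delay, never for keys.
    time_nanosleep(0, $n['val'] % $maxNanoseconds);
}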
These pointers should help secure your implementation against this type of timing attack.

Can this kind of approach totally prevent timing attacks, or does it just make the attacker's work harder?
ircmaxell has already answered why this only makes the work harder, but a solution to prevent timing attacks in PHP in general would be:
/**
 * Execute a callback function in constant time,
 * or throw an exception if the callback was too slow.
 *
 * @param callable $cb
 * @param float $target_time_seconds
 * @throws \LogicException if the callback was too slow
 * @return mixed whatever $cb returns
 */
function execute_in_constant_time(callable $cb, float $target_time_seconds = 0.01)
{
    $start_time = microtime(true);
    $ret = ($cb)();
    $success = time_sleep_until($start_time + $target_time_seconds);
    if ($success) {
        return $ret;
    }
    // dammit!
    $time_used = microtime(true) - $start_time;
    throw new \LogicException("callback function was too slow! time expired! target_time_seconds: {$target_time_seconds} actual time used: {$time_used}");
}
Using that approach, your code could be:
function timingSafeCompare($a, $b, float $target_time_seconds = 0.01) {
    return execute_in_constant_time(fn() => $a === $b, $target_time_seconds);
}
The downside is that you have to pick a number with a large margin, meaning a relatively large amount of time is lost sleeping. FWIW, on my laptop I had to use 0.2 (200 milliseconds) to compare two exactly-1-GiB strings, on a Core i7-8565U (a 2018 mid-range laptop CPU I'd never heard of),
and this loop:
ini_set("memory_limit", "-1");
$s1 = "a";
$s2 = "a";
$append = str_repeat("a", 100 * 1024);
try {
    for (;;) {
        $res = timingSafeCompare($s1, $s2, 0.01);
        $s1 .= $append;
        $s2 .= $append;
    }
} catch (\Throwable $e) {
    var_dump(strlen($s1));
}
It craps out at about 65 megabytes: int(65126401).
(But how often do you need to constant-time-compare strings above 65 MB? I imagine not often.)
You might think "then the attacker could send a HUGE string to compare and check how long it takes for the exception to be thrown", but I don't think that would work: === starts by checking whether both strings have the same length, and short-circuits if they don't. Such an attack should only work if the attacker can make both strings long enough to hit the timeout.
Today we have the native hash_equals() function to compare strings of exactly the same length, but hash_equals() will not protect you against strings of different lengths, while the function above will.

Related

Timing attack in PHP

strcmp: what does "binary safe string comparison" mean? Is this comparison safe against timing attacks?
If not, how can I compare two strings in a way that prevents a timing attack? Is comparing hashes of the strings enough, or must I use some library (or my own code) that compares in constant time?
It is written here that timing attacks can be used over the web. But does this type of attack exist in the real world, or is it only available to a small class of attackers (like governments), making this protection over the web excessive?
"binary safe" means that any bytes can be safely compared with strcmp, not just valid characters in some character set. A quick test confirms that strcmp is not safe against timing attacks:
$nchars = 1000;
$s1 = str_repeat('a', $nchars + 1);
$s2 = str_repeat('a', $nchars) . 'b';
$s3 = 'b' . str_repeat('a', $nchars);
$times = 100000;

$start = microtime(true);
for ($i = 0; $i < $times; $i++) {
    strcmp($s1, $s2);
}
$timeForSameAtStart = microtime(true) - $start;

$start = microtime(true);
for ($i = 0; $i < $times; $i++) {
    strcmp($s1, $s3);
}
$timeForSameAtEnd = microtime(true) - $start;

printf("'b' at the end: %.4f\n'b' at the front: %.4f\n", $timeForSameAtStart, $timeForSameAtEnd);
For me this prints something like: 'b' at the end: 0.0634, 'b' at the front: 0.0287.
Many other string-based functions in PHP likely suffer from similar issues. Working around this is tricky, especially in PHP where you don't actually know what a lot of functions are really doing at the physical level.
One possible tactic is just sticking a random wait time in your code before you return the answer to the caller/potential attacker. Even better, measure how long it took to check the input data (e.g., with microtime), and then wait a random time minus that amount of time. This is not 100% secure, but it makes attacking the system MUCH harder because, at a minimum, an attacker will have to try each input many times in order to filter out the randomness.
The problem with strcmp is that its behavior depends on the implementation. If it compares the strings byte by byte until it reaches a difference or the end of either string, then it is vulnerable to a timing attack.
Now how about hashing?
I have found this Security question and I believe it has the correct answer for you:
https://security.stackexchange.com/a/46215
Timing attacks are a myth.
Let me explain.
The difference in time between validating a similar text and a different one is a fraction of a second, let's say +/- 0.1 seconds (exaggerated!).
However, the time it takes an attacker to measure this is:
network delay + 0.1 seconds + system delay (the system may be busy with some other task) + other delays.
So no, it's not possible; even on a local system (zero lag), the measured interval is always unclear.
In a test, let's say the difference between one method and another is 1 µs.
So, if we test it and the difference is 1 µs, then we could guess part of the value.
But what if there is another factor, for example the network, the CPU usage at that moment, the CPU's clock cycle, and so on?
Even if we exclude the network, most operating systems are multi-tasking, so the test would have to be done on a system running a single task, and that is not something you see in the wild; even embedded systems run multiple threads at once.
But let's say we run locally (no network) and do a dry run on a computer that only runs a single task: ours. We still have another problem: modern CPUs don't run at a constant clock; they vary depending on load, temperature, and other factors.
So, it is only possible if:
it is executed locally and there is no other factor.
it runs as a single task and no other task is running on the server.
the CPU runs at a constant clock.
i.e. it is ABSURD.
Here is the test.
<?php
$text     = '123456789012345678901234567890123456789012345678901234567890123456789012345678901234';
$compare1 = '12345678901234567890123456789012345678901234567890123456789012345678901234567890123x';
$compare2 = '2222222222222222222222222222222222222222222222222222222222222222222222222222222222222';

$a1 = microtime(true);
for ($i = 0; $i < 100000; $i++) {
    if ($compare1 === $text) {
        // do something
    }
}
$a2 = microtime(true);
var_dump($a2 - $a1);

$a1 = microtime(true);
for ($i = 0; $i < 100000; $i++) {
    if ($compare2 === $text) {
        // do something
    }
}
$a2 = microtime(true);
var_dump($a2 - $a1);
It took me 5 minutes to invalidate this hypothesis.
What is tested:
it compares a 512-bit text against two candidates and compares the times.
The test is set up to favor the hypothesis, so it forces an unrealistic situation where the first candidate is almost identical to the target text (except for the last character).
It also excludes latencies and other operations.
(Why 512 bits? Most password hashes are 128 or 256 bits; 512 bits is what we can call safe.)
And here are the results.
one round:
0.021588087081909
0.021672010421753 (long time)
another run:
0.021767854690552
0.022729873657227 (long time)
and another run:
0.021697998046875 (long time)
0.021611213684082
and again
0.021565914154053 (long time)
0.020948171615601
and again
0.021995067596436
0.0224769115448 (long time)
So, even when the test is rigged to validate the point, it fails.
That is:
you can't find a trend when one of the variables is unknown, and this factor compromises the whole test. I could test it a million times and the result would be the same. And this test, in particular, avoids every other variable such as latency, other processes, database access, etc.

Secure string compare function

I just came across this code in the HTTP Auth library of the Zend Framework. It seems to be using a special string compare function to make it more secure. However, I don't quite understand the comments. Could anybody explain why this function is more secure than doing $a == $b?
/**
 * Securely compare two strings for equality while avoiding C level memcmp()
 * optimisations capable of leaking timing information useful to an attacker
 * attempting to iteratively guess the unknown string (e.g. password) being
 * compared against.
 *
 * @param string $a
 * @param string $b
 * @return bool
 */
protected function _secureStringCompare($a, $b)
{
    if (strlen($a) !== strlen($b)) {
        return false;
    }
    $result = 0;
    for ($i = 0; $i < strlen($a); $i++) {
        $result |= ord($a[$i]) ^ ord($b[$i]);
    }
    return $result == 0;
}
It looks like they're trying to prevent timing attacks.
In cryptography, a timing attack is a side channel attack in which the attacker attempts to compromise a cryptosystem by analyzing the time taken to execute cryptographic algorithms. Every logical operation in a computer takes time to execute, and the time can differ based on the input; with precise measurements of the time for each operation, an attacker can work backwards to the input.
Basically, if it takes a different amount of time to compare a correct password and an incorrect password, then you can use the timing to figure out how many characters of the password you've guessed correctly.
Consider an extremely flawed string comparison (this is basically the normal string equality function, with an obvious wait added):
function compare(a, b) {
    if (len(a) !== len(b)) {
        return false;
    }
    for (i = 0; i < len(a); ++i) {
        if (a[i] !== b[i]) {
            return false;
        }
        wait(10); // wait 10 ms
    }
    return true;
}
Say you give a password and it (consistently) takes some amount of time for one password, and about 10 ms longer for another. What does this tell you? It means the second password has one more character correct than the first one.
This lets you do movie-style hacking: guessing the password one character at a time, which is much easier than guessing every single possible password.
In the real world, there's other factors involved, so you have to try a password many, many times to handle the randomness of the real world, but you can still try every one character password until one is obviously taking longer, then start on two character password, and so on.
This function still has a minor problem here:
if (strlen($a) !== strlen($b)) {
    return false;
}
It lets you use timing attacks to figure out the correct length of the password, so you don't have to bother guessing any shorter or longer passwords. In general, you want to hash your passwords first (which produces equal-length strings), so I'm guessing they didn't consider it to be a problem.
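One common way to remove even that length leak is to HMAC both sides with an ephemeral random key before comparing. A sketch, assuming hash_equals and openssl_random_pseudo_bytes are available:

// Sketch only: both digests are fixed-size, so nothing about the
// secret's length is observable through the comparison.
function lengthHidingCompare($known, $user)
{
    $key = openssl_random_pseudo_bytes(32); // fresh per-call key
    $a = hash_hmac('sha256', $known, $key, true);
    $b = hash_hmac('sha256', $user, $key, true);
    return hash_equals($a, $b);
}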

Gathering entropy in web apps to create (more) secure random numbers

After several days of research and discussion I came up with this method to gather entropy from visitors (you can see the history of my research here).
When a user visits, I run this code:
$entropy = sha1(microtime() . $pepper . $_SERVER['REMOTE_ADDR'] . $_SERVER['REMOTE_PORT'] .
    $_SERVER['HTTP_USER_AGENT'] . serialize($_POST) . serialize($_GET) . serialize($_COOKIE));
Note: $pepper is a per-site/setup random string set by hand.
Then I execute the following (My)SQL query:
$query="update `crypto` set `value`=sha1(concat(`value`, '$entropy')) where name='entropy'";
That means we combine the entropy of the visitor's request with the entropy gathered from the others already.
That's all.
Then, when we want to generate random numbers, we combine the gathered entropy with the output:
$query="select `value` from `crypto` where `name`='entropy'";
//...
extract(unpack('Nrandom', pack('H*', sha1(mt_rand(0, 0x7FFFFFFF).$entropy.microtime()))));
Note: the last line is part of a modified version of the crypt_random function of phpseclib.
Please tell me your opinion about this scheme, and share other ideas/info regarding entropy gathering and random number generation.
PS: I know about randomness sources like /dev/urandom.
This system is just an auxiliary system, or (when we don't have access to such sources) a fallback scheme.
In the best scenario, your biggest danger is a local user disclosure of information exploit. In the worst scenario, the whole world can predict your data. Any user that has access to the same resources you do: the same log files, the same network devices, the same border gateway, or the same line that runs between you and your remote connections allows them to sniff your traffic by unwinding your random number generator.
How would they do it? Why, basic application of information theory and a bit of knowledge of cryptography, of course!
You don't have a wrong idea, though! Seeding your PRNG with real sources of randomness is generally quite useful to prevent the above attacks from happening. For example, this same level of attack can be exploited by someone that understands how /dev/random gets populated on a per-system basis if the system has low entropy or its sources of randomness are reproducible.
If you can sufficiently secure the processes that seed your pool of entropy (for example, by gathering data from multiple sources over secure lines), the likelihood that someone is able to listen in becomes smaller and smaller as you get closer and closer to the desirable cryptographic qualities of a one-time pad.
In other words, don't do this in PHP, using a single source of randomness fed into a single Mersenne twister. Do it properly, by reading from your best, system-specific alternative to /dev/random, seeding its entropy pool from as many secure, distinct sources of "true" randomness as possible. I understand you've stated that these sources of randomness are inaccessible, but this notion is strange when similar functions are afforded to all major operating systems. So, I suppose I find the concept of an "auxiliary system" in this context to be dubious.
This will still be vulnerable to an attack by a local user cognizant of your sources of entropy, but securing the machine and increasing the true entropy within /dev/random will make it far more difficult for them to do their dirty work short of a man-in-the-middle attack.
As for cases where /dev/random is indeed accessible, you can seed it fairly easily:
Look at what options exist on your system for using /dev/hw_random
Embrace rngd (or a good alternative) for defining your sources of randomness
Use rng-tools for inspecting and improving your randomness profile
And finally, if you need a good, strong source of randomness, consider investing in more specialized hardware.
Best of luck in securing your application.
PS: You may want to give questions like this a spin at Security.SE and Cryptography.SE in the future!
Use Random.Org
If you need truly random numbers, use random.org. These numbers are generated from atmospheric noise. Besides a library for PHP, it also has an HTTP interface which lets you fetch truly random numbers with simple requests:
https://www.random.org/integers/?num=10&min=1&max=6&col=1&base=10&format=plain&rnd=new
This means you can retrieve real random numbers in PHP without any additional PECL extension on the server.
If you don't want other users to be able to "steal" your random numbers (as MrGomez argues), just use https with certificate checking. Here follows an example with https certificate checking:
$url = "https://www.random.org/integers/?num=10&min=1&max=6&col=1&base=10&format=plain&rnd=new";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
$response = curl_exec($ch);
if ($response === FALSE)
echo "http request failed: " . curl_error($ch);
else
echo $response;
curl_close($ch);
If you need more information on how to create https requests:
Make a HTTPS request through PHP and get response
http://unitstep.net/blog/2009/05/05/using-curl-in-php-to-access-https-ssltls-protected-sites/
More on security
Again, some might argue that if the attacker queries random.org at the same time as you, he might get the same numbers and predict yours. I don't know whether random.org even works this way, but if you are really concerned, you can lessen the chance by fooling the attacker with dummy requests whose results you throw away, or by using only a certain part of the random numbers you get.
As MrGomez notes in his comment, this shall not be considered as an ultimate solution to security, but only as one of possible sources of entropy.
Performance
Of course, if you need minimal latency, then doing one random.org request per client request might not be the best idea... but what about doing one bigger request to pre-cache random numbers, say, every 5 minutes?
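A hypothetical sketch of such pre-caching (the file path, batch size, and lack of locking are all simplifications; the TLS verification shown in the curl example above is omitted for brevity):

// Illustrative only: serve numbers from a local cache, refilling from
// random.org when the cache runs dry.
function getCachedRandomNumber($cacheFile = '/tmp/random_org_cache.txt')
{
    $pool = is_file($cacheFile)
        ? array_filter(explode("\n", file_get_contents($cacheFile)))
        : array();
    if (empty($pool)) {
        $url = 'https://www.random.org/integers/?num=1000&min=0&max=65535'
             . '&col=1&base=10&format=plain&rnd=new';
        $raw = file_get_contents($url); // needs allow_url_fopen and openssl
        if ($raw === false) {
            return false; // no cache and no network; caller must handle this
        }
        $pool = array_filter(explode("\n", trim($raw)));
    }
    $number = (int) array_pop($pool);
    file_put_contents($cacheFile, implode("\n", $pool));
    return $number;
}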
To come to the point: as far as I know there is no way to generate entropy inside a PHP script, sorry for this non-answer. Even if you look at well-established scripts like phpass, you will see that their fallback system cannot do any magic.
The question is whether you should try anyway. Since you want to publish your system under the GPL, you probably don't know in what scenario it will be used. In my opinion it's best to require a random source, or to fail fast (die with an appropriate error message), so a developer who wants to use your system knows immediately that there is a problem.
To read from the random source, you could call the mcrypt_create_iv() function...
$randomBinaryString = mcrypt_create_iv($length, MCRYPT_DEV_URANDOM);
...this function reads from the random pool of the operating system. Since PHP 5.3 it does so on Windows servers as well, so you can leave it to PHP to handle the random source.
If you have access to /dev/urandom you can use this:
function getRandData($length = 1024) {
    $randf = fopen('/dev/urandom', 'r');
    $data = fread($randf, $length);
    fclose($randf);
    return $data;
}
UPDATE:
Of course you should have some backup in case opening the device fails.
If you have access to the client side, you can also enable mouse-movement tracking; this is what TrueCrypt uses for an extra level of entropy.
As I have said before, my rand function is a modified version of phpseclib's crypt_random function; you can see it in the link given in my first post. At least the author of the phpseclib cryptographic library confirmed it. Not enough for ordinary apps? I'm not talking about extreme/theoretical security, just practical security to the extent really needed, and at the same time cheaply and easily available, for almost all ordinary applications on the web.
phpseclib's crypt_random effectively and silently falls back to mt_rand (which, as you should know, is really weak) in the worst case (no openssl_random_pseudo_bytes or urandom available), but my function uses a much more secure scheme in such cases. It falls back to a scheme whose output is much harder to brute-force or predict, which should in practice be sufficient for all ordinary apps/sites. It uses possible (in practice very likely, and hard to predict or circumvent) extra entropy that is gathered over time and quickly becomes almost impossible for outsiders to know. It adds this possible entropy to mt_rand's output (and also to the output of the other sources: urandom, openssl_random_pseudo_bytes, mcrypt_create_iv). As you should know, this entropy can be added but not subtracted. In the (almost surely rare) worst case, that extra entropy would be zero or some tiny amount; in the typical case, which I think covers almost all cases, it would be even more than practically necessary. (I have studied cryptography extensively, so when I say "I think", it is based on a more informed analysis than an ordinary programmer's.)
See the full code of my modified crypt_random:
function crypt_random($min = 0, $max = 0x7FFFFFFF)
{
    if ($min == $max) {
        return $min;
    }
    global $entropy;

    if (function_exists('openssl_random_pseudo_bytes')) {
        // openssl_random_pseudo_bytes() is slow on windows per the following:
        // http://stackoverflow.com/questions/1940168/openssl-random-pseudo-bytes-is-slow-php
        if ((PHP_OS & "\xDF\xDF\xDF") !== 'WIN') { // PHP_OS & "\xDF\xDF\xDF" == strtoupper(substr(PHP_OS, 0, 3)), but a lot faster
            extract(unpack('Nrandom', pack('H*', sha1(openssl_random_pseudo_bytes(4) . $entropy . microtime()))));
            return abs($random) % ($max - $min) + $min;
        }
    }

    // see http://en.wikipedia.org/wiki//dev/random
    static $urandom = true;
    if ($urandom === true) {
        // Warnings will be output unless the error suppression operator is used. Errors such as
        // "open_basedir restriction in effect", "Permission denied", "No such file or directory", etc.
        $urandom = @fopen('/dev/urandom', 'rb');
    }
    if (!is_bool($urandom)) {
        extract(unpack('Nrandom', pack('H*', sha1(fread($urandom, 4) . $entropy . microtime()))));
        // say $min = 0 and $max = 3. if we didn't do abs() then we could have stuff like this:
        // -4 % 3 + 0 = -1, even though -1 < $min
        return abs($random) % ($max - $min) + $min;
    }

    if (function_exists('mcrypt_create_iv') and version_compare(PHP_VERSION, '5.3.0', '>=')) {
        $tmp16 = @mcrypt_create_iv(4, MCRYPT_DEV_URANDOM);
        if ($tmp16 !== false) {
            extract(unpack('Nrandom', pack('H*', sha1($tmp16 . $entropy . microtime()))));
            return abs($random) % ($max - $min) + $min;
        }
    }

    /* Prior to PHP 4.2.0, mt_srand() had to be called before mt_rand() could be called.
       Prior to PHP 5.2.6, mt_rand()'s automatic seeding was subpar, as elaborated here:
       http://www.suspekt.org/2008/08/17/mt_srand-and-not-so-random-numbers/
       The seeding routine is pretty much ripped from PHP's own internal GENERATE_SEED() macro:
       http://svn.php.net/viewvc/php/php-src/tags/php_5_3_2/ext/standard/php_rand.h?view=markup */
    static $seeded;
    if (!isset($seeded) and version_compare(PHP_VERSION, '5.2.5', '<=')) {
        $seeded = true;
        mt_srand(fmod(time() * getmypid(), 0x7FFFFFFF) ^ fmod(1000000 * lcg_value(), 0x7FFFFFFF));
    }

    extract(unpack('Nrandom', pack('H*', sha1(mt_rand(0, 0x7FFFFFFF) . $entropy . microtime()))));
    return abs($random) % ($max - $min) + $min;
}
$entropy contains my extra entropy, which comes from the combined entropy of all request parameters seen so far, plus the current request's parameters, plus the entropy of a random string (*) set by hand at installation time.
*: length 22, composed of lower- and uppercase letters plus digits (more than 128 bits of entropy)
Update 2: Code review warning to everyone: don't use the code in the original question. It's a security liability. If this code is online anywhere, remove it, as it opens the whole system, network, and database to a malevolent user. You're not only exposing your code but all of your users' data.
Never serialize user input. If your code is already doing it, stop your server and change the code. This is a great example of why you should not do crypto by yourself.
Update 1: For real security you need unguessable randomness in your entropy. A suitable option to add entropy, as your question refers to, is to use the delta of your script's execution time, not microtime() by itself. The delta depends on the load of your server, and so on a combination of the hardware environment, temperature, network load, power load, disk access, CPU usage, and voltage fluctuations, which together are unpredictable.
Using time(), a timestamp, or microtime() alone is a flaw in your implementation.
Script-execution delta example code coming:
@martinstoeckli stated correctly that suitable random generation for crypto comes from
mcrypt_create_iv($lengthinbytes, MCRYPT_DEV_URANDOM);
but that is outside the stated requirement of not having a crypto module.
In SQL, use RAND() in conjunction with your generated number:
http://www.tutorialspoint.com/mysql/mysql-rand-function.htm
PHP offers the rand() function as well:
http://php.net/manual/en/function.rand.php
They won't give you the same number, so you could use both.
mt_rand() should be used, not rand().

What is the fastest combination of compression + encoding + checking + serialization of an array?

I need a combination of functions that does:
array serialization (no objects; small: 3-7 key-value pairs of strings; no references)
data validity check of the above (is it better for the hash to be inside the array?)
encryption of the above (is there any encryption method that validates the decrypted information?)
compression of the above (I am not sure the cost is worth it: bandwidth vs. CPU time)
...of an array.
Everything should be optimized for speed.
For serializing the array I was thinking about using json_encode() rather than serialize() because it's faster. See Preferred method to store PHP arrays (json_encode vs serialize).
For the data validity check I was thinking about using sha1(), but I am considering crc32 because it's faster and I don't think collisions are likely. See Fastest hash for non-cryptographic uses?.
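To make the "hash inside the array" option concrete, here is a hedged sketch using crc32() (keep in mind crc32 only detects accidental corruption; it is not a MAC and offers no protection against deliberate tampering):

// Sketch only: wrap the payload with its checksum before encryption,
// verify after decryption.
function packPayload(array $data)
{
    return json_encode(array('crc' => crc32(json_encode($data)), 'data' => $data));
}

function unpackPayload($blob)
{
    $wrapper = json_decode($blob, true);
    if (!is_array($wrapper) || !isset($wrapper['crc'], $wrapper['data'])) {
        return null;
    }
    $ok = crc32(json_encode($wrapper['data'])) === $wrapper['crc'];
    return $ok ? $wrapper['data'] : null;
}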
For encryption I made:
<?php
function encode($pass, $data) {
    return mcrypt_encrypt(MCRYPT_RIJNDAEL_256, $pass, $data, MCRYPT_MODE_ECB);
}
function decode($pass, $data) {
    return mcrypt_decrypt(MCRYPT_RIJNDAEL_256, $pass, $data, MCRYPT_MODE_ECB);
}
$rand = str_repeat(rand(0, 1000), 5);
$start = microtime(true);
for ($i = 0; $i <= 10000; $i++) {
    encode('pass', $rand);
}
echo 'Script took ' . (microtime(true) - $start) . ' seconds for encryption<br/>';
$start = microtime(true);
for ($i = 0; $i <= 10000; $i++) {
    decode('pass', $rand);
}
echo 'Script took ' . (microtime(true) - $start) . ' seconds for decryption';
Results are:
Script took 1.8680129051208 seconds for encryption
Script took 1.8597548007965 seconds for decryption
I would rather avoid any randomness. I know that CBC mode is more secure, but it is also slower.
For compression I have no idea what is best to use, given that the resulting string is binary and short.
Is there any compression that doesn't require encoding in order to set the resulting string as a cookie? I know that sha1(), for example, returns only digits and letters.
It is a complex question, so feel free to point out anything wrong or inaccurate.
It touches many topics, but basically the short question is how to safely and rapidly encrypt/decrypt an array while keeping its representation small.
Is this the right order?
Is data validation required, given that there is a high probability the resulting JSON won't be valid if the data is altered?
Is there a function that already combines those, or some of those, functions?
(A sketch of the full pipeline follows.)
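For orientation, an illustrative sketch of one possible ordering: serialize, checksum, compress, then encrypt, then base64-encode for the cookie. This is a sketch, not a recommendation: ECB mode and the naive key handling merely mirror the snippet above, and the zero-padding removal is simplistic.

function sealArray(array $data, $key)
{
    $json = json_encode(array('crc' => crc32(json_encode($data)), 'data' => $data));
    $compressed = gzdeflate($json); // compress BEFORE encrypting: ciphertext doesn't compress
    $cipher = mcrypt_encrypt(MCRYPT_RIJNDAEL_256, $key, $compressed, MCRYPT_MODE_ECB);
    return base64_encode($cipher);  // cookie-safe characters only
}

function openArray($blob, $key)
{
    $cipher = base64_decode($blob);
    $compressed = mcrypt_decrypt(MCRYPT_RIJNDAEL_256, $key, $cipher, MCRYPT_MODE_ECB);
    $json = @gzinflate(rtrim($compressed, "\0")); // strip mcrypt's zero padding (naive)
    $wrapper = $json === false ? null : json_decode($json, true);
    return (is_array($wrapper) && isset($wrapper['data'])) ? $wrapper['data'] : null;
}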
I know that CBC mode is more secure, but it is also slower
Than ECB? Only if the data is more than a couple of blocks.
If you want the fastest encryption algorithm then there's no substitute for testing it yourself. Somewhat strangely, PHP's sha1() implementation is significantly faster than its md5() (I know these are hashes; the point is that performance depends on implementation as much as on algorithm).
Why are you trying to validate it? If it's an encrypted datagram then the contents are opaque to the user. If they try tampering with it, it will most likely fail to decompress; in the unlikely event it still decompresses, the decode will fail; and in the remote case that neither happens, it should be very easy to check for other modifications. Even an embedded CRC32 seems like overkill.
in order to set the resulting string as a cookie
Sounds like you're using lots of fancy encryption to cover up a basic insecurity in your application; it's likely to be open to replay attacks. And you've got the added complication of ensuring that your data fits in a cookie. Why not just use a server-side session keyed by a random value sent client-side? (You don't have to use the PHP session handler if you want to implement a remember-me type function alongside a conventional session.)
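A sketch of that server-side approach (the table and column names are invented for illustration):

// Illustrative only: the client gets an unguessable token; the real data stays server-side.
function issueRememberMeToken(PDO $db, $userId)
{
    $token = bin2hex(openssl_random_pseudo_bytes(32)); // 256 bits of randomness
    $stmt = $db->prepare('INSERT INTO remember_tokens (user_id, token_hash) VALUES (?, ?)');
    $stmt->execute(array($userId, hash('sha256', $token))); // store only a hash of it
    setcookie('remember_me', $token, time() + 30 * 86400, '/', '', true, true);
}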
In my opinion it would be sufficient to use only compression; reverse-engineering a compressed blob would take a long time. I can recommend Huffman compression.

Session hash does size matter?

Does size matter when choosing the right algorithm to use for a session hash?
I recently read this article, which suggested using Whirlpool to create the hash for a session ID. Whirlpool generates a 128-character hash string; is this too large?
The plan is to store the session hash in a DB. Is there much of a difference between using a 64-character field (sha256), a 96-character field (sha384), or a 128-character field (Whirlpool)? One of the initial arguments for Whirlpool was its speed versus other algorithms, but looking at the speed results, sha384 doesn't fare too badly.
There is also the option to truncate the hash to make it smaller than 128 characters.
I did modify the original code snippet to allow changing the algorithm based on need.
Update: there was some discussion about the string being hashed, so I've included the code.
function generateUniqueId($maxLength = null) {
    $entropy = '';

    // try ssl first
    if (function_exists('openssl_random_pseudo_bytes')) {
        $entropy = openssl_random_pseudo_bytes(64, $strong);
        // skip ssl since it wasn't using the strong algo
        if ($strong !== true) {
            $entropy = '';
        }
    }

    // add some basic mt_rand/uniqid combo
    $entropy .= uniqid(mt_rand(), true);

    // try to read from the windows RNG
    if (class_exists('COM')) {
        try {
            $com = new COM('CAPICOM.Utilities.1');
            $entropy .= base64_decode($com->GetRandom(64, 0));
        } catch (Exception $ex) {
        }
    }

    // try to read from the unix RNG
    if (is_readable('/dev/urandom')) {
        $h = fopen('/dev/urandom', 'rb');
        $entropy .= fread($h, 64);
        fclose($h);
    }

    // create hash
    $hash = hash('whirlpool', $entropy);

    // truncate hash if max length imposed
    if ($maxLength) {
        return substr($hash, 0, $maxLength);
    }
    return $hash;
}
The time taken to create the hash is not important, and as long as your database is properly indexed, the storage method should not be a major factor either.
However, the hash has to be transmitted with the client's request every time, frequently as a cookie. Large cookies can add a small amount of additional time to each request. See Yahoo!'s page performance best practices for more information. Smaller cookies, thus a smaller hash, have benefits.
Overall, large hash functions are probably not justified. For their limited scope, good old md5 and sha1 are probably just fine as the source behind a session token.
Yes, size matters.
If it's too short, you run the risk of collisions. You also make it practical for an attacker to find someone else's session by brute-force attack.
Being too long matters less, but every byte of the session ID has to be transferred from the browser to the server with every request, so if you're really optimising things, you may not want an ID that's too long.
You don't have to use all the bits of a hash algorithm, though - there's nothing stopping you from using something like Whirlpool, then only taking the first 128 bits (32 characters in hex). Practically speaking, 128 bits is a good lower bound on length, too.
As erickson points out, though, using a hash is a bit odd. Unless the input contains at least as much entropy as the length of the ID you're using, you're vulnerable to attacks that guess the input to your hash.
The article times out when I try to read it, but I can't think of a good reason to use a hash as a session identifier. Session identifiers should be unpredictable; given the title of the article, it sounds like the authors acknowledge that principle. Then, why not use a cryptographic random number generator to produce session identifiers?
A hash takes input, and if that input is predictable, so is the hash, and that's bad.
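For instance, a minimal sketch that derives the identifier straight from a CSPRNG (128 bits, which matches the lower bound suggested in the other answer):

// Sketch only: no hashing of guessable inputs, just raw CSPRNG output.
$strong = false;
$bytes = openssl_random_pseudo_bytes(16, $strong); // 128 bits
if ($strong !== true) {
    die('No cryptographically strong randomness source available');
}
$sessionId = bin2hex($bytes); // 32 hex characters, unpredictable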
SHA1 or MD5 is probably enough for your needs. In practice, the probability of a collision is so small that it will likely never happen.
Ultimately, though, it all depends upon your required level of security. Do also keep in mind that longer hashes are both more expensive to compute and require more storage space.
