There are quite a few articles about PHP stating that timing attacks are possible against direct string comparisons, since === is not constant-time. I've written some sample code to try to determine the order of magnitude of the difference, but it shows that one comparison is not always quicker than the other, even over millions of tests.
You would expect the first candidate to be quicker, because its first character is wrong and the comparison can bail out immediately, but that's not always the case.
$target = 'hello-world';
$comparison = ['Xello-world', 'hello-worlX'];

foreach ($comparison as $x) {
    $time = 0;
    $total = 500000;
    for ($i = 0; $i < $total; $i++) {
        $start = microtime(true);
        /** Actually perform the comparison */
        $result = ($target === $x);
        $end = microtime(true);
        /** Add up so we can compute the average */
        $time += $end - $start;
    }
    echo "Duration for $x: " . ($time / $total) . PHP_EOL;
}
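One thing worth keeping in mind: the snippet calls microtime() twice per iteration, and the overhead of those two calls is far larger than the cost of comparing one extra character, which can easily drown out the effect being measured. Here is a minimal variation on the same benchmark (a sketch, not a definitive measurement) that times the whole loop instead:
$target = 'hello-world';
$comparisons = ['Xello-world', 'hello-worlX'];
$total = 500000;
foreach ($comparisons as $x) {
    // Time the whole loop once, so the timer overhead is paid once
    // instead of 500,000 times.
    $start = microtime(true);
    for ($i = 0; $i < $total; $i++) {
        $result = ($target === $x);
    }
    $end = microtime(true);
    echo "Average for $x: " . (($end - $start) / $total) . PHP_EOL;
}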
I'm trying to produce a timing attack in PHP and am using PHP 7.1 with the following script:
<?php

$find = "hello";

$length = array_combine(range(1, 10), array_fill(1, 10, 0));

for ($i = 0; $i < 1000000; $i++) {
    for ($j = 1; $j <= 10; $j++) {
        $testValue = str_repeat('a', $j);
        $start = microtime(true);
        if ($find === $testValue) {
            // Do nothing
        }
        $end = microtime(true);
        $length[$j] += $end - $start;
    }
}

arsort($length);
$length = key($length);
var_dump($length . " found");

$found = '';
$alphabet = array_combine(range('a', 'z'), array_fill(1, 26, 0));

for ($len = 0; $len < $length; $len++) {
    $currentIteration = $alphabet;
    $filler = str_repeat('a', $length - $len - 1);
    for ($i = 0; $i < 1000000; $i++) {
        foreach ($currentIteration as $letter => $time) {
            $testValue = $found . $letter . $filler;
            $start = microtime(true);
            if ($find === $testValue) {
                // Do nothing
            }
            $end = microtime(true);
            $currentIteration[$letter] += $end - $start;
        }
    }
    arsort($currentIteration);
    $found .= key($currentIteration);
}

var_dump($found);
This is searching for a word with the following constraints:
a-z only
up to 10 characters
The script finds the length of the word without any issue, but the value of the word never comes back as expected with a timing attack.
Is there something I am doing wrong?
The script loops through the lengths and correctly identifies the length. It then loops through each letter (a-z) and checks the speed of the comparison for each. In theory, 'haaaa' should be slightly slower than 'aaaaa' due to the first letter being an 'h'. It then carries on for each of the five letters.
Running it gives something like 'brhas', which is clearly wrong (it's different each time, but always wrong).
Is there something I am doing wrong?
I don't think so. I tried your code and I too, like you and the other people who tried in the comments, get completely random results for the second loop. The first one (the length) is mostly reliable, though not 100% of the time. By the way, the $argv[1] trick suggested didn't really improve the consistency of the results, and honestly I don't really see why it should.
Since I was curious, I had a look at the PHP 7.1 source code. The string identity function (zend_is_identical) looks like this:
case IS_STRING:
    return (Z_STR_P(op1) == Z_STR_P(op2) ||
        (Z_STRLEN_P(op1) == Z_STRLEN_P(op2) &&
            memcmp(Z_STRVAL_P(op1), Z_STRVAL_P(op2), Z_STRLEN_P(op1)) == 0));
Now it's easy to see why the first timing attack, on the length, works so well. If the lengths are different, memcmp is never called, and the function returns a lot faster. The difference is easily noticeable, even without too many iterations.
Once you have the length figured out, in your second loop you are basically trying to attack the underlying memcmp. The problem is that the difference in timing highly depends on:
the implementation of memcmp
the current load and interfering processes
the architecture of the machine
I recommend the article titled "Benchmarking memcmp for timing attacks" for a more detailed explanation. They did a much more precise benchmark and still were not able to get a clear, noticeable difference in timing. I'm simply going to quote the conclusion of the article:
In conclusion, it highly depends on the circumstances if a memcmp() is subject to a timing attack.
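For what it's worth, if you want to squeeze a bit more precision out of the measurement, PHP 7.3+ has hrtime(), which returns an integer nanosecond counter instead of microtime()'s float. A rough sketch along the lines of your script (assuming PHP 7.3+; scheduler and CPU noise will most likely still dominate a one-byte memcmp difference):
$target = 'hello';
$candidates = ['haaaa', 'aaaaa'];
$iterations = 1000000;
foreach ($candidates as $candidate) {
    $elapsed = 0;
    for ($i = 0; $i < $iterations; $i++) {
        // hrtime(true) returns nanoseconds as an integer (PHP 7.3+)
        $start = hrtime(true);
        $result = ($target === $candidate);
        $elapsed += hrtime(true) - $start;
    }
    echo $candidate, ': ', $elapsed / $iterations, " ns per comparison\n";
}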
Being mostly a PHP developer (and self-taught), I've never really had a reason to know or understand things like sorting algorithms, except that quicksort is on average the quickest and is usually the algorithm behind PHP's sort functions.
But I have an interview coming up soon, and they recommend understanding basic algorithms like this one. So I broke open http://www.geeksforgeeks.org/quick-sort/ and implemented my own QuickSort and Partition functions, for practice of course, for sorting an array by one of its values. I came up with this (I'm using PHP 7.1, so a fair bit of the syntax is relatively new):
function Partition(array &$Array, $Column, int $Low, int $High): int {
    $Pivot = $Array[$High][$Column];
    $i = $Low - 1;
    for ($j = $Low; $j <= $High - 1; $j++) {
        if ($Array[$j][$Column] > $Pivot) {
            $i++;
            [$Array[$i], $Array[$j]] = [$Array[$j], $Array[$i]];
        }
    }
    [$Array[$i + 1], $Array[$High]] = [$Array[$High], $Array[$i + 1]];
    return $i + 1;
}

function QuickSort(array &$Array, $Column, int $Low = 0, ?int $High = null): void {
    $High = $High ?? (count($Array) - 1);
    if ($Low < $High) {
        $PartitionIndex = Partition($Array, $Column, $Low, $High);
        QuickSort($Array, $Column, $Low, $PartitionIndex - 1);
        QuickSort($Array, $Column, $PartitionIndex + 1, $High);
    }
}
And it works! Awesome! And so I thought there was no real point in using it, since there's no way the PHP-interpreted version of this algorithm is faster than the compiled C version (like what would be used in usort). But for the heck of it, I decided to benchmark the two approaches.
And very much to my surprise, mine is faster!
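(The post doesn't show what $Actions contains; something like this hypothetical test data, an array of rows each with a 'Timestamp' column, is enough to make the benchmark below runnable.)
// Hypothetical test data: 1000 rows, each with a random 'Timestamp' column.
$Actions = [];
for ($i = 0; $i < 1000; $i++) {
    $Actions[] = ['Timestamp' => mt_rand(0, 1000000)];
}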
$Tries = 1000;
$_Actions = $Actions;

$Start = microtime(true);
for ($i = 0; $i < $Tries; $i++) {
    $Actions = $_Actions;
    usort($Actions, function ($a, $b) {
        return $b['Timestamp'] <=> $a['Timestamp'];
    });
}
echo microtime(true) - $Start, "\n";

$Start = microtime(true);
for ($i = 0; $i < $Tries; $i++) {
    $Actions = $_Actions;
    QuickSort($Actions, 'Timestamp');
}
echo microtime(true) - $Start, "\n";
This gives me consistent numbers around 1.274071931839 for the first one and 0.87327885627747 for the second.
Is there something silly that I'm missing that would cause this? Does usort not really use an implementation of quicksort? Is it because I'm not taking into account the array keys (in my case I don't need the key/value pairs to stay the same)?
Just in case anyone wants the finished QuickSort function in PHP, this is what I ended up with, which sorts arrays by column, descending, in about half the time of the native usort. (It's iterative rather than recursive, and the partition function was inlined.)
function array_column_sort_QuickSort_desc(array &$Array, $Column, int $Start = 0, ?int $End = null): void {
    $End = $End ?? (count($Array) - 1);
    $Stack = [];
    $Top = 0;
    $Stack[$Top++] = $Start;
    $Stack[$Top++] = $End;
    while ($Top > 0) {
        $End = $Stack[--$Top];
        $Start = $Stack[--$Top];
        if ($Start < $End) {
            $Pivot = $Array[$End][$Column];
            $PartitionIndex = $Start;
            for ($i = $Start; $i < $End; $i++) {
                if ($Array[$i][$Column] >= $Pivot) {
                    [$Array[$i], $Array[$PartitionIndex]] = [$Array[$PartitionIndex], $Array[$i]];
                    $PartitionIndex++;
                }
            }
            [$Array[$End], $Array[$PartitionIndex]] = [$Array[$PartitionIndex], $Array[$End]];
            $Stack[$Top++] = $Start;
            $Stack[$Top++] = $PartitionIndex - 1;
            $Stack[$Top++] = $PartitionIndex + 1;
            $Stack[$Top++] = $End;
        }
    }
}
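For reference, it's called the same way as the recursive version, e.g. with the $Actions data from the benchmark above:
// Sorts $Actions in place, descending by the 'Timestamp' column.
array_column_sort_QuickSort_desc($Actions, 'Timestamp');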
Consider the difference between the arguments you pass to your QuickSort and those you pass to usort(). usort() has a much more generic interface, which operates in terms of a comparison function. Your QuickSort is specialized for your particular kind of data, and for performing comparisons via the > operator.
Very likely, then, the difference in performance is attributable to the much higher cost of evaluating function calls relative to evaluating individual > operations. That difference could easily swamp any inherent efficiency advantage that usort() might have. Consider, moreover, that because it relies on a comparison function written in PHP, usort()'s operation involves running a lot of PHP, not just compiled C code.
If you want to explore this further then consider modifying your implementation to present the same interface that usort() does. I'd be inclined to guess that usort() would win an apples-to-apples comparison with such a hand-rolled variation, but performance is notoriously hard to predict. This is why we test.
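As an illustration of what "the same interface" means, here is a sketch of how Partition might look if it took a usort()-style callable instead of a hard-coded column and > comparison (the name and signature are hypothetical, just to make the comparison apples-to-apples):
// Sketch: a partition step driven by a user-supplied comparison callable,
// so every element comparison costs a PHP function call, just as in usort().
function PartitionWithCallback(array &$Array, callable $Compare, int $Low, int $High): int {
    $Pivot = $Array[$High];
    $i = $Low - 1;
    for ($j = $Low; $j < $High; $j++) {
        if ($Compare($Array[$j], $Pivot) < 0) {
            $i++;
            [$Array[$i], $Array[$j]] = [$Array[$j], $Array[$i]];
        }
    }
    [$Array[$i + 1], $Array[$High]] = [$Array[$High], $Array[$i + 1]];
    return $i + 1;
}
Benchmarking a QuickSort built on that against usort() with the same callable would isolate the algorithm itself from the cost of the callback.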
Is there any kind of performance difference between these two?
$bin = "1000"; // 8 in decimal
$bin_a = strrev($bin);
$bin_a = str_split($bin_a);
or
$bin_b = str_split($bin);
$bin_b = array_reverse($bin_b);
Or is there any function to convert a string to an array and reverse it at the same time?
I want to manually convert binary to decimal without using a native PHP function. Is there any simpler way to do this?
Not a meaningful difference.
And it's not difficult to test. You should be able to write a test like this without any trouble whatsoever.
<?php

$start_a = microtime(true);
$bin = "1000"; // 8 in decimal
for ($n = 0; $n < 1000000; $n++) {
    $bin_a = strrev($bin);
    $bin_a = str_split($bin_a);
}
$end_a = microtime(true);
echo "Took ", $end_a - $start_a, " seconds \n";

$start_b = microtime(true);
for ($n = 0; $n < 1000000; $n++) {
    $bin_b = str_split($bin);
    $bin_b = array_reverse($bin_b);
}
$end_b = microtime(true);
echo "Took ", $end_b - $start_b, " seconds \n";
Output, for a million repetitions:
Took 0.26819205284119 seconds
Took 0.39758610725403 seconds
If you are optimizing for this, you are most likely doing it wrong. :)
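As for the manual conversion itself, a sketch without bindec() could look like this (the function name is just for illustration); note that walking the string left to right avoids the reverse step entirely:
// Sketch: manual binary-to-decimal conversion without bindec().
// Doubles the accumulator for each bit read, left to right.
function binary_to_decimal(string $bin): int {
    $decimal = 0;
    foreach (str_split($bin) as $bit) {
        $decimal = $decimal * 2 + (int) $bit;
    }
    return $decimal;
}

echo binary_to_decimal("1000"); // 8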
I need help creating PHP code to echo and run a function only 30% of the time.
Currently I have the code below, but it doesn't seem to work.
if (mt_rand(1, 3) == 2)
{
    echo '';
    theFunctionIWantCalled();
}
Are you trying to echo what the function returns? That would be:
if (mt_rand(1, 100) <= 30)
{
    echo theFunctionIWantCalled();
}
What you currently have echoes an empty string and then executes a function. I also changed the random statement: mt_rand(1, 3) == 2 fires on one value out of three, which is about 33%, not 30%, while drawing from 1-100 and checking <= 30 covers exactly 30% of the possible values. And since this is only pseudo-random rather than true randomness, having more options also gives you a better chance of actually hitting that 30% in practice.
If you intended to echo a blank statement and then execute a function,
if (mt_rand(1, 100) <= 30)
{
    echo '';
    theFunctionIWantCalled();
}
would be correct. Once again, I've changed the if statement to make it more evenly distributed. To help ensure a more even distribution, you could even do
if(mt_rand(1,10000) <= 3000)
since we aren't dealing with true randomness here. It's entirely possible that the algorithm is choosing one number more than others. As was mentioned in the comments of this question, since the algorithm is random, it could be choosing the same number over, and over, and over again. However, in practice, having more numbers to choose from will most likely result in an even distribution. Having only 3 numbers to choose from can skew the results.
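If you want to convince yourself of the distribution, a quick sketch that just counts how often the condition fires is enough:
// Count how often mt_rand(1, 100) <= 30 fires; the rate settles near 30%.
$hits = 0;
$trials = 1000000;
for ($i = 0; $i < $trials; $i++) {
    if (mt_rand(1, 100) <= 30) {
        $hits++;
    }
}
echo ($hits / $trials) * 100, "%\n";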
Since you are using rand, you can't guarantee it will be called exactly 30% of the time. You could instead use the modulus operator, which will effectively call it 1/3 of the time; not sure how important this is for you, but...
$max = 27;
for ($i = 1; $i < $max; $i++) {
    if ($i % 3 == 0) {
        call_function_here();
    }
}
Since the modulus operator does not work with floats, you can use fmod(). This code should be fairly close; you can substitute your own total iterations and percentage...
$total = 50;
$percent = 0.50;
$calls = $total * $percent;
$interval = $total / $calls;
$called = 0;
$notcalled = 0;

for ($i = 0; $i <= $total; $i++) {
    if (fmod($i, $interval) < 1) {
        $called++;
        echo "Called" . "\n";
    } else {
        $notcalled++;
        echo "Not Called" . "\n";
    }
}

echo "Called: " . $called . "\n";
echo "Not Called: " . $notcalled . "\n";
I know of several ways to get a character out of a string given an index.
<?php
$string = 'abcd';
echo $string[2];
echo $string{2};
echo substr($string, 2, 1);
?>
I don't know if there are any more ways; if you know of any, please don't hesitate to add them. The question is: if I were to choose one of the methods above and repeat it a couple of million times, possibly using mt_rand to get the index value, which method would be the most efficient in terms of least memory consumption and fastest speed?
To arrive at an answer, you'll need to set up a benchmark test rig. Compare all methods over several (hundreds of thousands or millions of) iterations on an idle box. Try the built-in microtime function to measure the difference between start and finish. That's your elapsed time.
The test should take you all of 2 minutes to write.
To save you some effort, I wrote a test. My own test shows that the functional solution (substr) is MUCH slower (expected). The idiomatic PHP ({}) solution is as fast as the index method; they are interchangeable. The [] syntax is preferred, as this is the direction PHP is going regarding string offsets.
<?php

$string = 'abcd';
$limit = 1000000;
$r = array(); // results

// PHP idiomatic string index method
$s = microtime(true);
for ($i = 0; $i < $limit; ++$i) {
    $c = $string{2};
}
$r[] = microtime(true) - $s;
echo "\n";

// PHP functional solution
$s = microtime(true);
for ($i = 0; $i < $limit; ++$i) {
    $c = substr($string, 2, 1);
}
$r[] = microtime(true) - $s;
echo "\n";

// index method
$s = microtime(true);
for ($i = 0; $i < $limit; ++$i) {
    $c = $string[2];
}
$r[] = microtime(true) - $s;
echo "\n";

// RESULTS
foreach ($r as $i => $v) {
    echo "RESULT ($i): $v \n";
}
?>
Results:
RESULT (PHP4 & 5 idiomatic braces syntax): 0.19106006622314
RESULT (string slice function): 0.50699090957642
RESULT (index syntax, the future as the braces are being deprecated): 0.19102001190186