Optimizing Base Conversion Loop - php

So, for my Cryptography Library, I have a base converter that I use quite often. It's not the most efficient thing in the world, but it works quite well for all ranges of input.
The bulk of the work is done by the callback loop:
$callback = function($source, $src, $dst) {
$div = array();
$remainder = 0;
foreach ($source as $n) {
$e = floor(($n + $remainder * $src) / $dst);
$remainder = ($n + $remainder * $src) % $dst;
if ($div || $e) {
$div[] = $e;
}
}
return array(
$div,
$remainder
);
};
while ($source) {
list ($source, $remainder) = $callback($source, $srcBase, $dstBase);
$result[] = $remainder;
}
Basically, it takes an array of numbers in $srcBase and converts them to an array of numbers in $dstBase. So, an example input would be array(1, 1), 2, 10 which would give array(3) as a result. Another example would be array(1, 0, 0, 0), 256, 10 which would give array(1, 6, 7, 7, 7, 2, 1, 6) (each element of the array is a single "digit" in the $dstBase).
The problem I'm now facing is that if I feed it 2 KB of data, it takes almost 10 seconds to run, so I've set out to optimize it. So far, I have it down to about 4 seconds by replacing that whole structure with this inlined loop:
while ($source) {
$div = array();
$remainder = 0;
foreach ($source as $n) {
$dividend = $n + $remainder * $srcBase;
$res = (int) ($dividend / $dstBase);
$remainder = $dividend % $dstBase;
if ($div || $res) {
$div[] = $res;
}
}
$result[] = $remainder;
$source = $div;
}
The problem I'm facing is how to optimize it further (if that's even possible). I think the problem is the sheer number of iterations it takes for large input (for a 2000-element array, from base 256 to base 10, it takes 4,815,076 iterations in total).
Any thoughts?

99.9% of the time taken to execute this script originates from the inherent need to iterate through an input. Since the code inside the foreach is very basic, the only way of decreasing execution time is to reduce the number of iterations. If that is not possible, then you have the most efficient version of this function.
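One way to reduce the iteration count, offered as a minimal sketch rather than a tested replacement for the library code, is to divide by a power of the destination base so that each pass yields several digits at once. The function name convertBaseChunked and the $chunk parameter are assumptions made for illustration. The sketch also assumes PHP 7+ with 64-bit integers, and that $dstBase ** $chunk multiplied by $srcBase stays well below PHP_INT_MAX. Unlike the loops in the question, it returns the digits most significant first.

```php
<?php
// Hypothetical helper: converts digits (most significant first) from
// $srcBase to $dstBase, extracting $chunk destination digits per pass.
function convertBaseChunked(array $source, int $srcBase, int $dstBase, int $chunk = 6): array
{
    $bigBase = $dstBase ** $chunk;   // one "big digit" holds $chunk real digits
    $result = [];                    // destination digits, least significant first

    while ($source) {
        $div = [];
        $remainder = 0;
        foreach ($source as $n) {
            $dividend = $n + $remainder * $srcBase;
            $res = intdiv($dividend, $bigBase);
            $remainder = $dividend % $bigBase;
            if ($div || $res) {      // skip leading zeros of the quotient
                $div[] = $res;
            }
        }
        // Split the big remainder into $chunk ordinary digits.
        for ($i = 0; $i < $chunk; $i++) {
            $result[] = $remainder % $dstBase;
            $remainder = intdiv($remainder, $dstBase);
        }
        $source = $div;
    }

    // Drop the zero padding produced by the final chunk.
    while (count($result) > 1 && end($result) === 0) {
        array_pop($result);
    }
    return array_reverse($result);
}

print_r(convertBaseChunked([1, 0, 0, 0], 256, 10));
```

With $chunk = 6 the outer loop needs roughly one sixth as many passes, at the cost of a short inner loop that splits each remainder into digits.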

Yes, it can be optimized a little:
$source_count = count($source);
while ($source) {
$remainder = $i = 0;
foreach ($source AS &$n) {
$dividend = $n + $remainder * $srcBase;
$remainder = $dividend % $dstBase;
$res = ($dividend - $remainder) / $dstBase;
if ($i || $res)
$source[$i++] = $res;
}
for ($j=$i; $j < $source_count; $j++)
unset($source[$j]); // remove the leftover high-order slots
$source_count=$i;
$result[] = $remainder;
}
or even faster, but more obscure:
$source_count = count($source);
while ($source) {
$remainder = $i = 0;
foreach ($source AS &$n) {
if (($res = ($dividend - ($remainder = ($dividend = $n + $remainder * $srcBase) % $dstBase)) / $dstBase) || $i)
$source[$i++] = $res;
}
for ($j=$i; $j < $source_count; $j++)
unset($source[$j]); // remove the leftover high-order slots
$source_count=$i;
$result[] = $remainder;
}
You will get some reduction in memory and CPU usage, and it is much more fun, but of course unreadable.
But personally, I think you are doing it the wrong way. I think you should use some fast C code for this kind of task (by using a system call or by writing or installing a PHP module). And I think that code optimizers and compilers such as HipHop for PHP or Zend Optimizer can dramatically increase performance in this case.

I'm not sure, but
$dividend = $remainder * $srcBase + $n;
could be a little bit faster...

Related

PHP: Getting random combinations to specified input

I am trying to display possibilities for additions of specific numbers but have not been getting the right results.
<?php
$num3 = 20;
$set = null;
$balance = $num3;
$dig = mt_rand(1,5);
for($i = $balance; $i > 0; $i -= $dig ){
echo $dig.'<br>';
if($i < 1){
$set .= $i;
$balance = 0;
}
else{
$set .= $dig.'+';
$dig = mt_rand(1,5);
}
}
echo $set.'='.$num3;
?>
Here are some of the outputs:
2+5+1+4+5+3+=20
1+4+3+5+3+=20
3+1+1+2+3+4+4+1+3+=20
Appreciate any pointers. Thanks in advance...
Ok, even though the requirement isn't completely clear, here's an approach:
(edit: demonstrating prevention of endless loop)
$max_loops = 1000;
$balance = 20;
$found = [];
while($balance > 0 && $max_loops-- > 0) {
$r = mt_rand(1, 5);
if ($balance - $r >= 0) {
$found[] = $r;
$balance -= $r;
}
}
echo implode(' + ', $found) . ' = '.array_sum($found);
Note: This code has a small risk of getting caught in an endless loop... though it's doubtful that it'll ever happen :)
Edit: Now the loop contains a hard-limit of 1000 iterations, after which the loop will end for sure...
To provoke an endless loop, set $balance = 7 and change the call to mt_rand(4, 5).
You can use a recursive function for this:
function randomAddends($target, $maxAddend = 5, $sum = 0, $addends = [])
{
// Return the array of addends when the target is reached
if ($target <= $sum) {
return $addends;
}
// generate a new random addend and add it to the array
$randomAddend = mt_rand(1, min($maxAddend, $target - $sum));
$addends[] = $randomAddend;
// make the recursive call with the new sum
return randomAddends($target, $maxAddend, $sum + $randomAddend, $addends);
}
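The same idea can be written without recursion. The following is a minimal sketch, assuming the goal is simply a list of random addends between 1 and $maxAddend that sum exactly to the target; the name randomAddendsIterative is made up for illustration.

```php
<?php
// Hypothetical iterative variant: cap each addend by what is still
// missing, so the running sum can never overshoot the target.
function randomAddendsIterative(int $target, int $maxAddend = 5): array
{
    $addends = [];
    $remaining = $target;
    while ($remaining > 0) {
        $addend = mt_rand(1, min($maxAddend, $remaining));
        $addends[] = $addend;
        $remaining -= $addend;
    }
    return $addends;
}

$addends = randomAddendsIterative(20);
echo implode(' + ', $addends) . ' = ' . array_sum($addends), PHP_EOL;
```

Because each draw is capped at the remaining amount, the loop always terminates and needs no iteration limit.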

PHP Memory Exhausted Error with simple fractions

I am using this library to work with fractions in PHP. This works fine but sometimes, I have to loop over a lot of values and this results in the following error:
Allowed memory size of 134217728 bytes exhausted
I can allocate more memory using PHP ini but that is a slippery slope. At some point, I am going to run out of memory when the loops are big enough.
Here is my current code:
for($q = 10; $q <= 20; $q++) {
for($r= 10; $r <= 20; $r++) {
for($p = 10; $p <= 20; $p++) {
for($s = 10; $s <= 20; $s++) {
for($x = 50; $x <= 100; $x++) {
for($y = 50; $y <= 100; $y++) {
$den = ($q + $r + 1000) - ($p + $s);
$num = $x + $y;
$c_diff = new Fraction($num, $den);
}
}
}
}
}
}
I used memory_get_peak_usage(true)/(1024*1024) to keep track of the memory the script is using. The total memory used was just 2MB until I added the line that creates a new fraction.
Could anyone please guide me on how to get rid of this error? I went through the code of the library posted on GitHub here but can't figure out how to get rid of the exhausted memory error. Is this because of the static keyword? I am a beginner, so I am not entirely sure what's going on.
The library code is about 100 lines after removing the empty lines and comments. Any help would be highly appreciated.
UPDATE:
The script exhausts its memory even if I use just this block of code and nothing else. I definitely know that creating a new Fraction object is the cause of exhausting memory.
I thought that there was no need to unset() anything because the same variable is used to store the new fractional value over and over again.
This leads me to think that whenever I create a new Fraction object, something else happens in the library code that takes up memory, and that memory is not released when the value in the $c_diff variable is overwritten.
I am not very good at this so I thought it has something to do with the static keyword used at a couple of places. Could anyone confirm it for me?
If this issue can indeed be resolved by using unset(), should I place it at the end of the loop?
Various Possible fixes and efficiencies:
You have 6 for loops, each loop cycles a single integer value within various ranges.
But your calculation only uses 3 values, so it doesn't matter whether $p = 10; $s = 14; or $p = 13; $s = 11;. These are entirely equivalent in the calculation.
All you need is the sum. Once you've found that the value 24 works, you can find all the parts (at or above the minimum value of 10) that fit that value: 24 (sum) - 10 (min) = 14, then collect the values within the range. The valid pairs are 10,14; 11,13; 12,12; 13,11; and 14,10, saving yourself 80%+ of the processing work on the inner for loops.
$sum = 24; $min = 10; $max = 20; // values from the example above
$pairs = "p,s<BR>"; //the set of paired values found
$other = $sum - $min;
if($other > $max){
$other = $sum - $max;
}
$hardMin = $min;
while ($other >= $hardMin && $min >= $hardMin && $min <= $max){
$pairs .= $min.", ".$other."<BR>";
$other--; // -1
$min++; // +1
}
print $pairs;
Giving:
  p,s
10,14
11,13
12,12
13,11
14,10
So for this for loop already, you may only need to do ~10% of the total work cycling the inner loops.
Stop instantiating new objects inside the loop. Creating an object is comparatively expensive. Instead, create one instance and simply plug the values in:
Example:
$c_diff = new Fraction();
for(...){
for(...){
$c_diff->checkValuesOrWhateverMethod($num, $den)
}
}
This will save you significant overhead (depending on the structure of the class)
The code you linked on GitHub is simply to turn the value into a fraction and seems to be highly inefficient.
All you need is this:
function float2frac($n, $tolerance = 1.e-6) {
$h1=1; $h2=0;
$k1=0; $k2=1;
$b = 1/$n;
do {
$b = 1/$b;
$a = floor($b);
$aux = $h1; $h1 = $a*$h1+$h2; $h2 = $aux;
$aux = $k1; $k1 = $a*$k1+$k2; $k2 = $aux;
$b = $b-$a;
} while (abs($n-$h1/$k1) > $n*$tolerance);
return $h1."/".$k1;
}
Taken from this excellent answer.
Example:
for(...){
for(...){
$den = ($q + $r + 1000) - ($p + $s);
$num = $x + $y;
$value = $num/$den;
$c_diff = float2frac($value);
unset($value,$den,$num);
}
}
If you need more precision you can read this question and update PHP.ini as appropriate, but personally I would recommend you use more specialist maths languages such as Matlab or Haskell.
Putting it all together:
You want to check three values, and then find the equivalent parts of each one.
You want to simply find the lowest common denominator fraction (I think).
So:
/***
* to generate a fraction with Lowest Common Denominator
***/
function float2frac($n, $tolerance = 1.e-6) {
$h1=1; $h2=0;
$k1=0; $k2=1;
$b = 1/$n;
do {
$b = 1/$b;
$a = floor($b);
$aux = $h1; $h1 = $a*$h1+$h2; $h2 = $aux;
$aux = $k1; $k1 = $a*$k1+$k2; $k2 = $aux;
$b = $b-$a;
} while (abs($n-$h1/$k1) > $n*$tolerance);
return $h1."/".$k1;
}
/***
* To find equivalents
***/
function find_equivs($sum = 1, $min = 1, $max = 2){
$value_A = $sum - $min;
$value_B = $min;
if($value_A > $max){
$value_B = $sum - $max;
$value_A = $max;
}
$output = "";
while ($value_A >= $min && $value_B <= $max){
if($value_A + $value_B == $sum){
$output .= $value_A . ", " . $value_B . "<BR>";
}
$value_A--; // -1
$value_B++; // +1
}
return $output;
}
/***
* Script...
***/
$c_diff = []; // an array of results.
for($qr = 20; $qr <= 40; $qr++) {
for($ps = 20; $ps <= 40; $ps++) {
for($xy = 100; $xy <= 200; $xy++) {
$den = ($qr + 1000) - $ps;
$num = $xy;
$value = $num/$den; // decimalised
$c_diff[] = float2frac($value);
/***
What is your criterion for success?
***/
$success = false; // placeholder: replace with your own test
if($success){
$qr_text = "Q,R<BR>";
$qr_text .= find_equivs($qr,10,20);
$sp_text = "S,P<BR>";
$sp_text .= find_equivs($ps,10,20);
$xy_text = "X,Y<BR>";
$xy_text .= find_equivs($xy,50,100);
}
}
}
}
This should do only a small percentage of the original looping.
I guess this isn't the entire block of code you are using.
This loop creates 11*11*11*11*51*51 = 38,081,241 Fraction objects, because every range is inclusive. Consider using PHP's unset() to clean up memory, since you are allocating memory to create objects, but you never free it up.
editing for clarification
When you create anything in PHP, be it variable, array, object, etc. PHP allocates memory to store it and usually, the allocated memory is freed when script execution ends.
unset() is the way to tell PHP, "hey, I don't need this anymore. Can you, pretty please, free up the memory it takes?". PHP takes this into consideration and frees up the memory, when its garbage collector runs.
It is better to prevent memory exhaustion rather than feeding your script with more memory.
Allowed memory size of 134217728 bytes exhausted
134217728 bytes = 128 MiB (about 134.218 megabytes)
Can you try this?
ini_set('memory_limit', '140M');
/* loop code below */

Modify Held-Karp TSP algorithm so we do not need to go back to the origin

I have to solve a problem where I need to find the shortest path linking all points, starting from a distance matrix. It's almost like a Traveling Salesman Problem, except I do not need to close my path by returning to the starting point. I found the Held-Karp algorithm (Python) that solves the TSP very well but always computes distances returning to the starting point. So now it leaves me with 3 questions:
1. Could at least one situation have a different result if I modify my function not to get back to the starting point?
2. If the answer to 1 is yes, how could I alter my held_karp() function to fit my needs?
3. If there is no way in 2, what should I look for next?
I have translated the held_karp() function from Python to PHP, and for my solution I'd be happy to use either language.
function held_karp($matrix) {
$nb_nodes = count($matrix);
# Maps each subset of the nodes to the cost to reach that subset, as well
# as what node it passed before reaching this subset.
# Node subsets are represented as set bits.
$c = [];
# Set transition cost from initial state
for($k = 1; $k < $nb_nodes; $k++) $c["(".(1 << $k).",$k)"] = [$matrix[0][$k], 0];
# Iterate subsets of increasing length and store intermediate results
# in classic dynamic programming manner
for($subset_size = 2; $subset_size < $nb_nodes; $subset_size++) {
$combinaisons = every_combinations(range(1, $nb_nodes - 1), $subset_size, false);
foreach($combinaisons AS $subset) {
# Set bits for all nodes in this subset
$bits = 0;
foreach($subset AS $bit) $bits |= 1 << $bit;
# Find the lowest cost to get to this subset
foreach($subset AS $bk) {
$prev = $bits & ~(1 << $bk);
$res = [];
foreach($subset AS $m) {
if(($m == 0)||($m == $bk)) continue;
$res[] = [$c["($prev,$m)"][0] + $matrix[$m][$bk], $m];
}
$c["($bits,$bk)"] = min($res);
}
}
}
# We're interested in all bits but the least significant (the start state)
$bits = (2**$nb_nodes - 1) - 1;
# Calculate optimal cost
$res = [];
for($k = 1; $k < $nb_nodes; $k++) $res[] = [$c["($bits,$k)"][0] + $matrix[$k][0], $k];
list($opt, $parent) = min($res);
# Backtrack to find full path
$path = [];
for($i = 0; $i < $nb_nodes - 1; $i++) {
$path[] = $parent;
$new_bits = $bits & ~(1 << $parent);
list($scrap, $parent) = $c["($bits,$parent)"];
$bits = $new_bits;
}
# Add implicit start state
$path[] = 0;
return [$opt, array_reverse($path)];
}
In case you need to know how the every_combinations() function works
function every_combinations($set, $n = NULL, $order_matters = true) {
if($n == NULL) $n = count($set);
$combinations = [];
foreach($set AS $k => $e) {
$subset = $set;
unset($subset[$k]);
if($n == 1) $combinations[] = [$e];
else {
$subcomb = every_combinations($subset, $n - 1, $order_matters);
foreach($subcomb AS $s) {
$comb = array_merge([$e], $s);
if($order_matters) $combinations[] = $comb;
else {
$needle = $comb;
sort($needle);
if(!in_array($needle, $combinations)) $combinations[] = $comb;
}
}
}
}
return $combinations;
}
Yes, the answer can be different. For instance, if the graph has 4 vertices and the following undirected edges:
1-2 1
2-3 1
3-4 1
1-4 100
1-3 2
2-4 2
The optimal path is 1-2-3-4 with a weight 1 + 1 + 1 = 3, but the weight of the same cycle is 1 + 1 + 1 + 100 = 103. However, the weight of the cycle 1-3-4-2 is 2 + 1 + 2 + 1 = 6 and the weight of this path is 2 + 1 + 2 = 5, so the optimal cycle and the optimal path are different.
If you're looking for a path, not a cycle, you can use the same algorithm, but you don't need to add the weight of the last edge to the start vertex, that is
for($k = 1; $k < $nb_nodes; $k++) $res[] = [$c["($bits,$k)"][0] + $matrix[$k][0], $k];
should be
for($k = 1; $k < $nb_nodes; $k++) $res[] = [$c["($bits,$k)"][0], $k];
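The 4-vertex example can be checked by brute force. This sketch is not Held-Karp: it simply tries every ordering, which is only practical for tiny graphs. The helper names permutations and bestTour are made up for illustration, and the vertices 1 to 4 are stored as indices 0 to 3 with the start fixed at index 0.

```php
<?php
// Distance matrix for the example graph (vertex 1 is index 0, and so on).
$matrix = [
    [0,   1, 2, 100],
    [1,   0, 1, 2],
    [2,   1, 0, 1],
    [100, 2, 1, 0],
];

// Yield every ordering of $items (plain recursive permutation generator).
function permutations(array $items): Generator
{
    if (count($items) <= 1) {
        yield $items;
        return;
    }
    foreach ($items as $i => $item) {
        $rest = $items;
        unset($rest[$i]);
        foreach (permutations(array_values($rest)) as $tail) {
            yield array_merge([$item], $tail);
        }
    }
}

// Cheapest route starting at index 0; $closed adds the edge back to the start.
function bestTour(array $matrix, bool $closed): array
{
    $best = [PHP_INT_MAX, []];
    foreach (permutations(range(1, count($matrix) - 1)) as $order) {
        $route = array_merge([0], $order);
        $cost = 0;
        for ($i = 1; $i < count($route); $i++) {
            $cost += $matrix[$route[$i - 1]][$route[$i]];
        }
        if ($closed) {
            $cost += $matrix[end($route)][0];
        }
        if ($cost < $best[0]) {
            $best = [$cost, $route];
        }
    }
    return $best;
}

print_r(bestTour($matrix, false)); // open path
print_r(bestTour($matrix, true));  // closed cycle
```

The open path and the closed cycle come out with different costs and different vertex orders, which is the point the example makes.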

How do I roll over when doing math on an array of integers?

So I am trying to do math on an array of integers while enforcing a maximum integer in each piece of the array. Similar to this:
function add($amount) {
$result = array_reverse([0, 0, 0, 100, 0]);
$max = 100;
for ($i = 0; $i < count($result); ++$i) {
$int = $result[$i];
$new = $int + $amount;
$amount = 0;
while ($new > $max) {
$new = $new - $max;
++$amount;
}
$result[$i] = $new;
}
return array_reverse($result);
}
add(1); // [0, 0, 0, 100, 1]
add(100); // [0, 0, 0, 100, 100]
add(101); // [0, 0, 1, 1, 1]
So what I have above works but it is slow when adding larger integers. I've tried to do this with bitwise shifts and gotten close but I just can't get it to work for some reason. I think I need a third-party perspective. Does anyone have some tips?
The part that is taking up the majority of the time is the while loop. You are reducing the value repeatedly until it is no greater than 100. However, using PHP to loop down like that takes an incredible amount of time (a 12-digit integer clocked in at over 20 seconds on my local machine). Instead, use multiplication and division (along with an if). It is orders of magnitude faster. The same 12-digit integer took less than a second to complete with this code:
function add($amount) {
$result = array_reverse([0, 0, 0, 100, 0]);
$max = 100;
for ($i = 0, $size = count($result); $i < $size; ++$i) {
$int = $result[$i];
$new = $int + $amount;
$amount = 0;
if( $new > $max ) {
$remainder = $new % $max;
// Carry is new divided by max, truncated to an integer
$amount = ((int) ($new / $max));
// If new is an exact multiple of max, the slot keeps max itself and the
// carry is one less; otherwise the slot keeps the remainder
if( $remainder == 0 ) {
$amount -= 1;
$new = $max;
} else {
$new = $remainder;
}
}
$result[$i] = $new;
}
return array_reverse($result);
}
Also note that I moved your count($result) call into the variable initialization section of the for loop. When it is inside the expression section it gets executed each time the for loop repeats which can also add to the overall time of executing the function.
Also note that with a large math change like this you may want to assert a range of values you expect to calculate to ensure there are no outliers. I did a small range and they all came out the same but I encourage you to run your own.
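A check along those lines might look like the following sketch. It assumes the digit array is passed in as a parameter instead of being hard-coded, and the names addLoop and addDiv are made up for illustration. addLoop mirrors the repeated-subtraction logic from the question, and addDiv is a compact division-based equivalent.

```php
<?php
// Reference version: repeated subtraction, as in the question.
function addLoop(array $digits, int $amount, int $max = 100): array
{
    $result = array_reverse($digits);
    foreach ($result as $i => $int) {
        $new = $int + $amount;
        $amount = 0;
        while ($new > $max) {
            $new -= $max;
            ++$amount;
        }
        $result[$i] = $new;
    }
    return array_reverse($result);
}

// Division-based version with the same results.
function addDiv(array $digits, int $amount, int $max = 100): array
{
    $result = array_reverse($digits);
    foreach ($result as $i => $int) {
        $new = $int + $amount;
        $amount = 0;
        if ($new > $max) {
            // Using ($new - 1) leaves an exact multiple of $max in the
            // slot as $max itself, matching the loop's behaviour.
            $amount = intdiv($new - 1, $max);
            $new -= $amount * $max;
        }
        $result[$i] = $new;
    }
    return array_reverse($result);
}

// Compare both versions over a range of inputs.
$start = [0, 0, 0, 100, 0];
for ($a = 0; $a <= 25000; $a++) {
    if (addLoop($start, $a) !== addDiv($start, $a)) {
        echo "Mismatch at $a", PHP_EOL;
    }
}
```

Both versions silently drop any carry that leaves the most significant slot, just as the original function does.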
Use min($max, $number) to get $number limited to $max.
for ($i = 0; $i < count($result); ++$i) {
$result[$i] = min($max, $result[$i] + $amount);
}

How to sort an SplFixedArray?

Is there a way to perform sorting on integers or strings in an instance of the SplFixedArray class? Is converting to a PHP array, sorting, and then converting back the only option?
Firstly, congratulations on finding and using SplFixedArrays! I think they're a highly under-utilised feature in vanilla PHP ...
As you've probably appreciated, their performance is unrivalled (compared to the usual PHP arrays) - but this does come at some trade-offs, including a lack of PHP functions to sort them (which is a shame)!
Implementing your own bubble sort is a relatively easy solution. Just iterate through, looking at each consecutive pair of elements, putting the higher on the right. Rinse and repeat until the array is sorted:
<?php
$arr = new SplFixedArray(10);
$arr[0] = 2345;
$arr[1] = 314;
$arr[2] = 3666;
$arr[3] = 93;
$arr[4] = 7542;
$arr[5] = 4253;
$arr[6] = 2343;
$arr[7] = 32;
$arr[8] = 6324;
$arr[9] = 1;
$moved = 0;
while ($moved < sizeof($arr) - 1) {
$i = 0;
while ($i < sizeof($arr) - 1 - $moved) {
if ($arr[$i] > $arr[$i + 1]) {
$tmp = $arr[$i + 1];
$arr[$i + 1] = $arr[$i];
$arr[$i] = $tmp;
}
$i++;
var_dump ($arr);
}
$moved++;
}
It's not fast, and it's not efficient. For that you might consider Quicksort: there are documented examples online, including this one at wikibooks.org (it will need modification to work with SplFixedArrays).
Seriously, beyond getting your question answered, I truly feel that forcing yourself to ask why things like SplFixedArray exist and forcing yourself to understand what goes on behind a "quick call to sort()" (and why it quickly takes a very long time to run) make the difference between programmers and programmers. I applaud your question!
Here's my adaptation of bubble sort using SplFixedArray. In PHP 7 this simple program is twice as fast as the regular bubble sort.
function bubbleSort(SplFixedArray $a)
{
$len = $a->getSize() - 1;
$sorted = false;
while (!$sorted) {
$sorted = true;
for ($i = 0; $i < $len; $i++)
{
$current = $a->offsetGet($i);
$next = $a->offsetGet($i + 1);
if ( $next < $current ) {
$a->offsetSet($i, $next);
$a->offsetSet($i + 1, $current);
$sorted = false;
}
}
}
return $a;
}
$starttime = microtime(true);
$array = SplFixedArray::fromArray([3,4,1,3,5,1,92,2,4124,424,52,12]);
$array = bubbleSort($array);
print_r($array->toArray());
echo (microtime(true) - $starttime) * 1000, PHP_EOL;
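For comparison, the round trip mentioned in the question (convert to a native array, sort, convert back) can be sketched as follows. The function name sortFixedArray is made up for illustration; toArray(), sort() and SplFixedArray::fromArray() are standard PHP. No timing claim is made here; measure against the bubble sort on your own data.

```php
<?php
// Hypothetical helper: sorts by round-tripping through a native array.
function sortFixedArray(SplFixedArray $a): SplFixedArray
{
    $values = $a->toArray();
    sort($values);                            // native ascending sort
    return SplFixedArray::fromArray($values); // back to a fixed-size array
}

$array = SplFixedArray::fromArray([3, 4, 1, 3, 5, 1, 92, 2, 4124, 424, 52, 12]);
$sorted = sortFixedArray($array);
print_r($sorted->toArray());
```

The trade-off is a temporary native array of the same length, which gives up some of the memory advantage of SplFixedArray while the sort runs.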
