Fastest way to compare numbers in PHP

I have a loop that needs to run a few million times; 10,967,700 to be precise. Within it, I am doing some checks including:
Number 1 is less than Number 2
Number 1 is less than or equal to Number 3
Number 4 is greater than Number 5
I'm wondering if there are any optimizations or tweaks I can apply to make these checks run faster. Or is this a ridiculous question?

Based on your snippet, I suggest the following changes:
Use a for loop instead of foreach, as in this example:
$key = array_keys($aHash);
$size = sizeOf($key);
for ($i=0; $i<$size; $i++) $aHash[$key[$i]] .= "a";
The equivalent foreach loop below is about 4.7x slower (see the benchmark at http://www.phpbench.com/):
foreach($aHash as $key=>$val) $aHash[$key] .= "a";
When checking whether a value is set, empty() is slightly faster than isset().
In if and elseif conditions, strict comparison (===) is also faster than loose comparison (==).
I hope this helps.
(Performance Source: http://www.phpbench.com/)
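None of this is from the answer above, but a minimal timing harness along these lines (the iteration count and compared values are arbitrary placeholders) lets you verify such claims on your own setup before relying on them:

```php
<?php
// Illustrative micro-benchmark: times loose (==) vs strict (===)
// comparison over many iterations. Numbers here are placeholders.
$iterations = 1000000;

$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    if ($i == 500000) {}
}
$loose = microtime(true) - $start;

$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    if ($i === 500000) {}
}
$strict = microtime(true) - $start;

printf("loose: %.4fs, strict: %.4fs\n", $loose, $strict);
```

On a given PHP build the difference may be negligible; measuring on your own hardware beats trusting a generic benchmark table.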

Related

Generate a fixed random number in a range based on a string in PHP

I need to generate a random number or hash that will be the same each time based on a string. This is done easily enough with the function crc32, however, I need it to be an integer between a range because the random number will be picking an item out of an array.
Here's the code I have so far:
$min=0;
$max=count($myarray);
$number = crc32("Joe Jones");
$rnd = '.'.(string)$number;
//(Int((max-min+1)*Rnd+min))
$rand = round(($max-$min+1)*$rnd+$min);
echo $rand;
It seems to work, but it always picks lower numbers. It never picks the higher numbers.
Just use mod (%). $x % $n will give an output between 0 and $n-1 for any non-negative $x.
$myArray=range(1,1000);
$max=count($myArray); //1000
$number = crc32("Joe Jones"); //2559948711
$rand=$number % $max; //711
Also, a note about crc32(): it may return a negative number on a 32-bit platform, so you may optionally want to do abs(crc32($input)).
Your crc32 function is producing a negative number. Change the line as follows:
$number = abs(crc32("Joe Jones"));
This turns that negative number into a positive one. Also, you might want to consider multiplying that number if your array count is low. How high that goes is up to you.

PHP looping context query

A question that has always puzzled me is why people write it like the first version when the second version is smaller and easier to read. I thought it might be because PHP calculates the strlen each time it iterates. Any ideas?
FIRST VERSION
for ($i = 0, $len = strlen($key); $i < $len; $i++) {}
You can obviously use $len inside the loop and further on in the code, but what are the benefits over the following version?
SECOND VERSION
for ($i = 0; $i < strlen($key); $i++) {}
It's a matter of performance.
The second version of the for loop recalculates strlen() on every iteration, which slows things down.
Even if a single call seems insignificant, you might be surprised how much that overhead adds up over many iterations.
See here for some performance benchmarks with loops.
The first version is best used if the loop is expected to have many iterations and $key won't change in the process.
The second one is best used if the loop is updating $key and you need to recalculate it, or, when recalculating it doesn't affect your performance.
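A quick way to check the difference yourself is a rough microtime() sketch like the following (the string and empty loop bodies are placeholders):

```php
<?php
// Rough sketch: hoisting strlen() into the init clause vs. calling
// it in the condition on every iteration. String is a placeholder.
$key = str_repeat('x', 100000);

$start = microtime(true);
for ($i = 0, $len = strlen($key); $i < $len; $i++) {}
$hoisted = microtime(true) - $start;

$start = microtime(true);
for ($i = 0; $i < strlen($key); $i++) {}
$inline = microtime(true) - $start;

printf("hoisted: %.6fs, inline: %.6fs\n", $hoisted, $inline);
```

Note that strlen() in PHP is O(1) (the length is stored with the string), so the gap is repeated function-call overhead rather than a rescan of the string.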

Multiple foreach with over 37 million possibilities

I've been tasked with creating a list of all possibilities using data in 8 blocks.
The 8 blocks have the following number of possibilities:
* Block 1: 12 possibilities
* Block 2: 8 possibilities
* Block 3: 8 possibilities
* Block 4: 11 possibilities
* Block 5: 16 possibilities
* Block 6: 11 possibilities
* Block 7: 5 possibilities
* Block 8: 5 possibilities
This gives a potential number of 37,171,200 possibilities.
I tried simply looping over everything and displaying only the combinations with the correct string length, like so:
foreach ($block1 AS $b1) {
    foreach ($block2 AS $b2) {
        foreach ($block3 AS $b3) {
            foreach ($block4 AS $b4) {
                foreach ($block5 AS $b5) {
                    foreach ($block6 AS $b6) {
                        foreach ($block7 AS $b7) {
                            foreach ($block8 AS $b8) {
                                if (strlen($b1.$b2.$b3.$b4.$b5.$b6.$b7.$b8) == 16) {
                                    echo $b1.$b2.$b3.$b4.$b5.$b6.$b7.$b8.'<br/>';
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
However the execution time was far too long to compute. I was wondering if anyone knew of a simpler way of doing this?
You could improve your algorithm by caching the string prefixes and remember their lengths. Then you don’t have to do that for each combination.
$len = 16;
// array for remaining characters per level
$r = array($len);
// array of level parts
$p = array();
foreach ($block1 AS &$b1) {
    // skip if already too long
    if (($r[0] - strlen($b1)) <= 0) continue;
    $r[1] = $r[0] - strlen($b1);
    $p[0] = $b1;
    foreach ($block2 AS &$b2) {
        if (($r[1] - strlen($b2)) <= 0) continue;
        $r[2] = $r[1] - strlen($b2);
        $p[1] = $b2;
        foreach ($block3 AS &$b3) {
            // …
            foreach ($block8 AS &$b8) {
                $r[8] = $r[7] - strlen($b8);
                if ($r[8] == 0) {
                    $p[7] = $b8;
                    echo implode('', $p).'<br/>';
                }
            }
        }
    }
}
Additionally, using references in foreach will stop PHP using a copy of the array internally.
You could try to store the precomputed part of the concatenated string known at each of the previous levels for later reuse, avoiding concatenating everything in the innermost loop:
foreach ($block7 AS $b7) {
    $precomputed7 = $precomputed6.$b7;
    foreach ($block8 AS $b8) {
        $precomputed8 = $precomputed7.$b8;
        if (strlen($precomputed8) == 16) {
            echo $precomputed8.'<br/>';
        }
    }
}
Do the same for the preceding levels. You could also test at the higher loop levels for strings that are already longer than 16 characters, letting you short-circuit and skip the remaining possibilities. But beware: computing string lengths costs performance, so depending on the input data that latter improvement may not be worth it at all.
Another idea is to precalculate the lengths for each block and then recurse on the arrays of lengths; calculating sums should be faster than concatenating strings and computing their lengths. For each vector of indices whose lengths sum to 16, you can easily output the full concatenated string.
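A minimal sketch of that length-first recursion, using made-up sample blocks and a helper name (emitMatches) invented for illustration; only a combination whose precomputed lengths sum to the target ever gets concatenated:

```php
<?php
// Sum the precomputed lengths while recursing; build the string
// only when a combination hits the target length exactly.
function emitMatches(array $blocks, array $lengths, int $target,
                     int $level = 0, int $sum = 0, array $picked = []) {
    if ($level === count($blocks)) {
        if ($sum === $target) {
            echo implode('', $picked), "\n";
        }
        return;
    }
    foreach ($blocks[$level] as $i => $item) {
        $next = $sum + $lengths[$level][$i];
        if ($next > $target) continue; // prune: already too long
        $picked[$level] = $item;
        emitMatches($blocks, $lengths, $target, $level + 1, $next, $picked);
    }
}

// Made-up sample data: three small blocks instead of eight.
$blocks  = [['ab', 'abcd'], ['cdef', 'x'], ['ghijklmnop', 'yz']];
$lengths = array_map(function ($b) { return array_map('strlen', $b); }, $blocks);
emitMatches($blocks, $lengths, 16); // prints "abcdefghijklmnop"
```

The pruning on $next > $target is what cuts the search space: whole subtrees of the 37 million combinations are skipped with one integer comparison.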
Since you have that length requirement of 16 and assuming each (i) possibility in each (b) of the eight blocks has length x_i_b you can get some reduction by some cases becoming impossible.
For example, say we have length requirement 16, but only 4 blocks, with possibilities with lengths indicated
block1: [2,3,4]
block2: [5,6,7]
block3: [8,9,10]
block4: [9,10,11]
Then all of the possibilities are impossible since block 4's lengths are all too large to permit any combination of blocks 1 - 3 of making up the rest of the 16.
Now if your target length is really 16, that means your (possible) item lengths range from 1 to 9, assuming no 0 lengths.
I can see two ways of approaching this:
Greedy
Dynamic Programming
Perhaps even combine them. For the greedy approach, pick the biggest possibility in all the blocks, then the next biggest, etc., and follow that through until you cross your threshold of 16. If you got all the blocks, then you can emit that one.
Whether or not you hit the threshold exactly, you can then iterate through the remaining possibilities.
The dynamic approach means storing some of the results you have already computed. For example, if a selection from some of the blocks gives you a length of 7, you don't need to recompute that in the future; you can iterate through the remaining blocks to see if you can find a combination that gives you length 9.
EDIT: This is kind of like the knapsack problem, but with the additional restriction of one choice per block per instance. Anyway, in terms of other optimizations, definitely pre-process the blocks into arrays of lengths only, and keep a running sum at each iteration level. That way you do 1 sum per iteration of each loop, rather than 8 sums per combination. Also, only concatenate strings if you need to emit the selection.
If you don't want a general solution (probably easier if you don't), then you can hand-code a lot of problem-instance-specific speedups by excluding the largest too-small combination of lengths (and all selections smaller than it) and the smallest too-large combination of lengths (and all selections larger).
If you can express this as a nested array, try a RecursiveIteratorIterator, http://php.net/manual/en/class.recursiveiteratoriterator.php

prime generator optimization

I'm starting out my expedition into Project Euler. And as many others I've figured I need to make a prime number generator. Problem is: PHP doesn't like big numbers. If I use the standard Sieve of Eratosthenes function, and set the limit to 2 million, it will crash. It doesn't like creating arrays of that size. Understandable.
So now I'm trying to optimize it. One way I found was that instead of creating an array with 2 million elements, I only need 1 million (only odd numbers can be prime). But now it's crashing because it exceeds the maximum execution time...
function getPrimes($limit) {
    $count = 0;
    for ($i = 3; $i < $limit; $i += 2) {
        $primes[$count++] = $i;
    }
    for ($n = 3; $n < $limit; $n += 2) {
        // array will be half the size of $limit
        for ($i = 1; $i < $limit/2; $i++) {
            if ($primes[$i] % $n === 0 && $primes[$i] !== $n) {
                $primes[$i] = 0;
            }
        }
    }
    return $primes;
}
The function works, but as I said, it's a bit slow...any suggestions?
One thing I've found to make it a bit faster is to switch the loop around.
foreach ($primes as $value) {
    // $limitSq is the sqrt of the limit, as that is as high as I have to go
    for ($n = 3; $n <= $limitSq; $n += 2) {
        if ($value !== $n && $value % $n === 0) {
            $primes[$count] = 0;
            $n = $limitSq; // breaking the inner loop
        }
    }
    $count++;
}
That, plus raising the time and memory limits (thanks Greg), finally got me an answer. Phew.
Without knowing much about the algorithm:
You're recalculating $limit/2 each time around the $i loop
Your if statement will be evaluated in order, so think about (or test) whether it would be faster to test $primes[$i] !== $n first.
Side note, you can use set_time_limit() to give it longer to run and give it more memory using
ini_set('memory_limit', '128M');
Assuming your setup allows this, of course - on a shared host you may be restricted.
From Algorithmist's proposed solution:
"This is a modification of the standard Sieve of Eratosthenes. It would be highly inefficient, using up far too much memory and time, to run the standard sieve all the way up to n. However, no composite number less than or equal to n will have a factor greater than sqrt(n), so we only need to know all primes up to this limit, which is no greater than 31622 (square root of 10^9). This is accomplished with a sieve. Then, for each query, we sieve through only the range given, using our pre-computed table of primes to eliminate composite numbers."
This problem has also appeared on UVA's and Sphere's online judges. Here's how it's enunciated on Sphere.
You can use a bit field to store your sieve. That is, it's roughly identical to an array of booleans, except you pack your booleans into a large integer. For instance if you had 8-bit integers you would store 8 bits (booleans) per integer which would further reduce your space requirements.
Additionally, using a bit field allows the possibility of using bit masks to perform your sieve operation. For example, if your sieve kept all numbers (not just odd ones), you could construct a bit mask of b01010101 which you could then AND against every element in your array. For threes you could use three integers as the mask: b00100100 b10010010 b01001001.
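A sketch of the bit-field idea in PHP (the function name and 64-bit word size are assumptions, not from the answer): one flag bit per number, 64 numbers per integer, so memory drops to roughly 1/64th of a boolean-per-number array.

```php
<?php
// Bit-packed Sieve of Eratosthenes sketch.
// A set bit means "composite"; bit k of word w covers number w*64 + k.
function bitSieve(int $limit): array {
    $words  = array_fill(0, intdiv($limit, 64) + 1, 0);
    $primes = [];
    for ($n = 2; $n <= $limit; $n++) {
        if (($words[$n >> 6] >> ($n & 63)) & 1) {
            continue; // already marked composite
        }
        $primes[] = $n;
        for ($m = $n * $n; $m <= $limit; $m += $n) {
            $words[$m >> 6] |= 1 << ($m & 63); // mark multiples
        }
    }
    return $primes;
}

print_r(bitSieve(30)); // the primes up to 30
```

Starting the inner loop at $n * $n (all smaller multiples of $n were marked by smaller primes) also gives the "don't check below $n*$n" speedup mentioned below for free.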
Finally, you do not need to check numbers that are lower than $n; in fact, you don't need to check numbers less than $n*$n-1.
Once you know the number is not a prime, I would exit the inner loop. I don't know PHP, but you need a statement like break in C or last in Perl.
If that is not available, I would set a flag and use it as a condition for continuing the inner loop. This should speed up your execution, as you are no longer checking $limit/2 items once a number is known not to be prime.
If you want speed, don't use PHP for this one :P
No, seriously: I really like PHP and it's a cool language, but it's not suited at all for such algorithms.

Collect Lowest Numbers Algorithm

I'm looking for an algorithm (or PHP code, I suppose) to end up with the 10 lowest numbers from a group of numbers. I was thinking of making a ten item array, checking to see if the current number is lower than one of the numbers in the array, and if so, finding the highest number in the array and replacing it with the current number.
However, I'm planning on finding the lowest 10 numbers from thousands, and was thinking there might be a faster way to do it. I plan on implementing this in PHP, so any native PHP functions are usable.
Sort the array and use the ten first/last entries.
Honestly: sorting an array with a thousand entries costs less time than it takes you to blink.
What you're looking for is called a selection algorithm. The Wikipedia page on the subject has a few subsections in the selecting k smallest or largest elements section. When the list is large enough, you can beat the time required for the naive "sort the whole list and choose the first 10" algorithm.
Naive approach is to just sort the input. It's likely fast enough, so just try it and profile it before doing anything more complicated.
Potentially faster approach: Linearly search the input, but keep the output array sorted to make it easier to determine if the next input belongs in the array or not. Pseudocode:
output[0-9] = input[0-9];
sort(output);
for i = 10..n-1
    if input[i] < output[9]
        insert(input[i])
where insert(x) will find the right spot (binary search) and do the appropriate shifting.
But seriously, just try the naive approach first.
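A direct PHP rendering of the pseudocode above (the helper name is made up; a plain sort() on the 10-element window stands in for the binary-search insert, since sorting ten items costs next to nothing):

```php
<?php
// Keep a sorted 10-element window; a new value that beats the
// current largest evicts it, and the window is re-sorted.
function lowestTen(array $input): array {
    $out = array_slice($input, 0, 10);
    sort($out);
    $n = count($input);
    for ($i = 10; $i < $n; $i++) {
        if ($input[$i] < $out[9]) {
            $out[9] = $input[$i]; // evict the current largest
            sort($out);
        }
    }
    return $out;
}

$vals = range(1, 1000);
shuffle($vals);
print_r(lowestTen($vals)); // 1 through 10, sorted
```

For randomly ordered input the branch is rarely taken after the first few elements, so this stays close to a single linear scan.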
Where are you getting this group of numbers?
If your list of numbers is already in an array, you could simply do a sort() and then an array_slice() to get the first 10.
It doesn't matter much for a small array, but as it gets larger, a fast and easy way to increase processing speed is to take advantage of array key indexing, which for 1 million rows uses about 40% of the time. Example:
// sorting array values
$numbers = array();
for ($i = 0; $i < 1000000; ++$i) {
    $numbers[$i] = rand(1, 999999);
}
$start = microtime(true);
sort($numbers);
$res = array_slice($numbers, 0, 10, true);
echo microtime(true) - $start . "\n";
// 2.6612658500671
print_r($res);
unset($numbers, $res, $start);

// sorting array keys
$numbers = array();
for ($i = 0; $i < 1000000; ++$i) {
    $numbers[rand(1, 999999)] = $i;
}
$start = microtime(true);
ksort($numbers);
$res = array_keys(array_slice($numbers, 0, 10, true));
echo microtime(true) - $start . "\n";
// 0.9651210308075
print_r($res);
But if the array data is from a database the fastest is probably to just sort it there:
SELECT number_column FROM table_with_numbers ORDER BY number_column LIMIT 10
Create a sorted set (a TreeSet in Java; I don't know the PHP equivalent) and add the first 10 numbers. Then iterate over the rest of the numbers: add each new one, then remove the biggest number from the set.
This algorithm is O(n) when n >> 10.
I would use a heap with 10 elements and the highest number at the root of the tree. Then start at the beginning of the list of numbers:
If the heap has fewer than 10 elements, add the number to the heap.
Otherwise, if the number is smaller than the highest number in the heap, remove the highest number from the heap, then add the current number.
Otherwise, ignore it.
You will end up with the 10 lowest numbers in the heap. If you are using an array as the heap data structure, then you can just use the array directly.
(alternatively: you can slice out the first 10 elements, and heapify them instead of using the first step above, which will be slightly faster).
However, as other people have noted, for 1000 elements, just sort the list and take the first 10 elements.
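For reference, the heap approach above maps almost directly onto PHP's built-in SplMaxHeap (the helper name is invented): the root is always the largest of the current candidates, so a smaller incoming number evicts it cheaply.

```php
<?php
// 10 lowest numbers via a bounded max-heap.
function lowestTenHeap(array $input): array {
    $heap = new SplMaxHeap();
    foreach ($input as $value) {
        if ($heap->count() < 10) {
            $heap->insert($value);
        } elseif ($value < $heap->top()) {
            $heap->extract(); // drop the current largest
            $heap->insert($value);
        }
    }
    // iterating an SplMaxHeap drains it largest-first
    $result = iterator_to_array($heap, false);
    sort($result);
    return $result;
}

print_r(lowestTenHeap(array_reverse(range(1, 100)))); // 1 through 10
```

Each element costs at most O(log 10) heap work, so the whole scan is effectively linear in the input size.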
