What is the fastest way to count the number of set bits in PHP? - php

I want to find the fastest function for counting set bits in PHP.
For example, 0010101 => 3, 00011110 => 4
I saw that there is a good algorithm that can be implemented in C++:
How to count the number of set bits in a 32-bit integer?
Is there a built-in PHP function for this, or a fast user-defined one?

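You can apply a mask with a binary AND and use a shift to test the bits one by one, in a loop that iterates at most once per bit (up to 32 times for a 32-bit value). Note that this assumes a non-negative value: PHP's >> is an arithmetic shift, so a negative $value never becomes 0.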
function getBitCount($value) {
$count = 0;
while($value)
{
$count += ($value & 1);
$value = $value >> 1;
}
return $count;
}
You can also easily port the function you linked to PHP:
function NumberOfSetBits($v)
{
$c = $v - (($v >> 1) & 0x55555555);
$c = (($c >> 2) & 0x33333333) + ($c & 0x33333333);
$c = (($c >> 4) + $c) & 0x0F0F0F0F;
$c = (($c >> 8) + $c) & 0x00FF00FF;
$c = (($c >> 16) + $c) & 0x0000FFFF;
return $c;
}

I could figure out a few ways to do it, but I'm not sure which one would be the fastest:
use substr_count()
replace all non-'1' characters with '' and then use strlen()
use preg_match_all()
PS: if you start with an integer, these examples would involve using decbin() first.
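The three approaches listed above could be sketched as follows. The function names are made up for this illustration, and str_replace() stands in for the "replace, then strlen()" idea; which variant is fastest would have to be measured.

```php
<?php
// Hypothetical helper names; each converts the integer with decbin() first.

// Count the '1' characters directly.
function bitsViaSubstrCount(int $n): int
{
    return substr_count(decbin($n), '1');
}

// Remove every '0', then measure what is left.
function bitsViaStrReplace(int $n): int
{
    return strlen(str_replace('0', '', decbin($n)));
}

// Let the regex engine count the matches of '1'.
function bitsViaPregMatchAll(int $n): int
{
    return preg_match_all('/1/', decbin($n));
}
```

All three should agree on the examples from the question (0010101 and 00011110).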

There are a number of other ways, but for a 32-bit integer, NumberOfSetBits is definitely the fastest.
I recently stumbled over Brian Kernighan's algorithm, which loops once per set bit, whereas most of the others loop once per bit position. I don't know why it doesn't show up as much faster here, but it still has a measurable advantage over all other non-specialized functions.
Of course, nothing can beat NumberOfSetBits, which runs in O(1).
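For reference, a minimal sketch of Kernighan's idea: $n & ($n - 1) clears the lowest set bit, so the loop body runs once per set bit. The function name and the guard against negative input are assumptions of this sketch, not part of the benchmarked code.

```php
<?php
// Sketch of Brian Kernighan's bit count (hypothetical name).
// Assumes a non-negative integer; negative values are rejected because
// the loop is not guaranteed to terminate for them.
function kernighanBitCount(int $n): int
{
    if ($n < 0) {
        throw new InvalidArgumentException('expected a non-negative integer');
    }
    $count = 0;
    while ($n !== 0) {
        $n &= $n - 1; // clear the lowest set bit
        $count++;
    }
    return $count;
}
```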
my benchmarks:
function getBitCount($value) { $count = 0; while($value) { $count += ($value & 1); $value = $value >> 1; } return $count; }
function getBitCount2($value) { $count = 0; while($value) { if ($value & 1)$count++; $value >>= 1; } return $count; }
// if() instead of +=; >>=1 instead of assignment: sometimes slower, sometimes faster
function getBitCount2a($value) { for($count = 0;$value;$value >>= 1) if($value & 1)$count ++; return $count; }
// for instead of while: sometimes slower, sometimes faster
function getBitCount3($value) { for($i=1,$count=0;$i;$i<<=1) if($value&$i)$count++; return $count; }
// shifting the mask: incredibly slow (always shifts all bits)
function getBitCount3a($value) { for($i=1,$count=0;$i;$i<<=1) !($value&$i) ?: $count++; return $count; }
// with ternary instead of if: even slower
function NumberOfSetBits($v) {
// longest (in source code bytes), but fastest
$c = $v - (($v >> 1) & 0x55555555); $c = (($c >> 2) & 0x33333333) + ($c & 0x33333333);
$c = (($c >> 4) + $c) & 0x0F0F0F0F; $c = (($c >> 8) + $c) & 0x00FF00FF;
$c = (($c >> 16) + $c) & 0x0000FFFF; return $c;
}
function bitsByPregReplace($n) { return strlen(preg_replace('_0_','',decbin($n))); }
function bitsByNegPregReplace($n) { return strlen(preg_replace('/[^1]/','',decbin($n))); }
function bitsByPregMatchAll($n) { return preg_match_all('/1/',decbin($n)); }
function bitsBySubstr($i) { return substr_count(decbin($i), '1'); }
function bitsBySubstrInt($i) { return substr_count(decbin($i), 1); }
// shortest (in source code bytes)
function bitsByCountChars($n){ return count_chars(decbin($n))[49]; }
// slowest by far
function bitsByCountChars1($n) { return count_chars(decbin($n),1)[49]; }
// throws a notice for $n=0
function Kernighan($n) { for($c=0;$n;$c++)$n&=$n-1;return $c; }
// Brian Kernighan’s Algorithm
function benchmark($function)
{
gc_collect_cycles();
$t0=microtime();
for($i=1e6;$i--;) $function($i);
$t1=microtime();
$t0=explode(' ', $t0); $t1=explode(' ', $t1);
echo ($t1[0]-$t0[0])+($t1[1]-$t0[1]), " s\t$function\n";
}
benchmark('getBitCount');
benchmark('getBitCount2');
benchmark('getBitCount2a');
benchmark('getBitCount3');
benchmark('getBitCount3a');
benchmark('NumberOfSetBits');
benchmark('bitsBySubstr');
benchmark('bitsBySubstrInt');
benchmark('bitsByPregReplace');
benchmark('bitsByPregMatchAll');
benchmark('bitsByCountChars');
benchmark('bitsByCountChars1');
benchmark('bitsByNegPregReplace');
benchmark('Kernighan');
benchmark('decbin');
Benchmark results (sorted):
> php count-bits.php
2.286831 s decbin
1.364934 s NumberOfSetBits
3.241821 s Kernighan
3.498779 s bitsBySubstr*
3.582412 s getBitCount2a
3.614841 s getBitCount2
3.751102 s getBitCount
3.769621 s bitsBySubstrInt*
5.806785 s bitsByPregMatchAll*
5.748319 s bitsByCountChars1*
6.350801 s bitsByNegPregReplace*
6.615289 s bitsByPregReplace*
13.863838 s getBitCount3
16.39626 s getBitCount3a
19.304038 s bitsByCountChars*
Those are the numbers from one of my runs (with PHP 7.0.22); other runs showed a different order within the 3.5-second group. I can say that, on my machine, four of those five are pretty much equal, and bitsBySubstrInt is always a little slower due to the typecasts.
Most other ways require a decbin call (which mostly takes longer than the actual counting; I marked them with a * in the benchmark results); only bitsBySubstr would get close to the winner without that handicap.
I find it notable that you can make count_chars three times faster by limiting it to only the characters that actually occur. Seems like array indexing needs quite some time.
edit:
added another preg_replace version
fixed preg_match_all version
added Kernighan's algorithm (fastest algorithm for arbitrary size integers)
added garbage collection to benchmarking function
reran benchmarks

My benchmarking code
start_benchmark();
for ($i = 0; $i < 1000000; $i++) {
getBitCount($i);
}
end_benchmark();
start_benchmark();
for ($i = 0; $i < 1000000; $i++) {
NumberOfSetBits($i);
}
end_benchmark();
start_benchmark();
for ($i = 0; $i < 1000000; $i++) {
substr_count(decbin($i), '1');
}
end_benchmark();
Benchmarking result:
benchmark (NumberOfSetBits()) : 1.429042 milliseconds
benchmark (substr_count()) : 1.672635 milliseconds
benchmark (getBitCount()): 10.464981 milliseconds
I think NumberOfSetBits() and substr_count() are best.
Thanks.
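The start_benchmark() and end_benchmark() helpers are not shown above. As a rough stand-in, a timing helper could look like the sketch below; the name timeCallable is hypothetical, and hrtime() requires PHP 7.3 or later.

```php
<?php
// Hypothetical timing helper: calls $fn($i) for $i = 0 .. $iterations - 1
// and returns the elapsed wall-clock time in seconds.
function timeCallable(callable $fn, int $iterations = 1000000): float
{
    $start = hrtime(true); // monotonic clock, in nanoseconds
    for ($i = 0; $i < $iterations; $i++) {
        $fn($i);
    }
    return (hrtime(true) - $start) / 1e9;
}

// Example: time the decbin() + substr_count() variant.
$seconds = timeCallable(function ($i) {
    return substr_count(decbin($i), '1');
}, 1000);
printf("%.6f s\tsubstr_count\n", $seconds);
```

Absolute numbers depend on the machine and PHP version, so only comparisons within a single run are meaningful.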

This option is a little faster than NumberOfSetBits($v)
function bitsCount(int $integer)
{
$count = $integer - (($integer >> 1) & 0x55555555);
$count = (($count >> 2) & 0x33333333) + ($count & 0x33333333);
$count = ((((($count >> 4) + $count) & 0x0F0F0F0F) * 0x01010101) >> 24) & 0xFF;
return $count;
}
Benchmark (PHP 8)
1.78 s bitsBySubstr
1.42 s NumberOfSetBits
1.11 s bitsCount
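To see why the final multiplication in bitsCount works: after the third step, each of the four bytes holds the bit count of that byte, and multiplying by 0x01010101 adds all four bytes together in the byte at bits 24 to 31. The sketch below isolates that step; the function name is made up, and it assumes a 64-bit PHP build so the product does not overflow.

```php
<?php
// Hypothetical helper isolating the last step of bitsCount():
// $perByteCounts holds one small count per byte (0..8 each).
function sumBytesByMultiply(int $perByteCounts): int
{
    // The multiplication accumulates byte0 + byte1 + byte2 + byte3
    // in bits 24..31; the shift and mask extract that byte.
    return (($perByteCounts * 0x01010101) >> 24) & 0xFF;
}

// 0xFFFFFFFF has 8 set bits in every byte, i.e. per-byte counts 0x08080808.
echo sumBytesByMultiply(0x08080808), "\n"; // 8 + 8 + 8 + 8
```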

Here is another solution. It may not be the fastest, but it is the shortest. It also works for negative numbers:
function countBits($num)
{
return substr_count(decbin($num), "1");
}

Related

php bit array in integer

I have written a wrapper class around a byte stream in order to read bit by bit from that stream (bit arrays) using this method:
public function readBits($len) {
if($len === 0) {
return 0;
}
if($this->nextbyte === null) {
//no byte has been started yet
if($len % 8 == 0) {
//don't start a byte with the cache, even number of bytes
$ret = 0;
//just return byte count not bit count
$len /= 8;
while ($len--) {
if($this->bytestream->eof()) {
//no more bytes
return false;
}
$byte = $this->bytestream->readByte();
$ret = ($ret << 8) | ord($byte);
}
return $ret;
} else {
$this->nextbyte = ord($this->bytestream->readByte());
$this->byteshift = 0;
}
}
if($len <= 8 && $this->byteshift + $len <= 8) {
//get the bitmask e.g. 00000111 for 3
$bitmask = self::$includeBitmask[$len - 1];
//can be satisfied with the remaining bits
$ret = $this->nextbyte & $bitmask;
//shift by len
$this->nextbyte >>= $len;
$this->byteshift += $len;
} else {
//read the remaining bits first
$bitsremaining = 8 - $this->byteshift;
$ret = $this->readBits($bitsremaining);
//decrease len by the amount bits remaining
$len -= $bitsremaining;
//set the internal byte cache to null
$this->nextbyte = null;
if($len > 8) {
//read entire bytes as far as possible
for ($i = intval($len / 8); $i > 0; $i--) {
if($this->bytestream->eof()) {
//no more bytes
return false;
}
$byte = $this->bytestream->readByte();
$ret = ($ret << 8) | ord($byte);
}
//reduce len to the rest of the requested number
$len = $len % 8;
}
//read a new byte to get the rest required
$newbyte = $this->readBits($len);
$ret = ($ret << $len) | $newbyte;
}
if($this->byteshift === 8) {
//delete the cached byte
$this->nextbyte = null;
}
return $ret;
}
This allows me to read bit arrays of arbitrary length off my byte stream, which are returned as integers (as PHP has no unsigned integers).
The problem appears once I try to read a bit array that is bigger than 64 bits, and I assume that if I were to use the class on a 32-bit system, the problem would already appear with 32-bit arrays.
The problem is that the return value is obviously too big to be held within an integer, so it overflows into a negative integer.
My question now is what would be the best way to deal with this. I can think of:
Forcing the number to be saved as a string (I am unsure if that's even possible)
Using the GMP extension (which I'd rather not do, because I think the GMP bitwise functions are probably quite a performance hit compared to the normal bitwise operators)
Is there something I missed, or is one of the options I mentioned actually the best way to deal with this problem?
Thanks in advance for your help.
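As an illustration of the "save it as a string" option, one possible sketch keeps the bits as a string of '0' and '1' characters instead of packing them into an integer, so the length is not limited by the integer size. The function name is hypothetical and this is not part of the class above; if GMP is available, such a string could later be handed to gmp_init($bits, 2).

```php
<?php
// Hypothetical helper: turn raw bytes into a string of '0'/'1' characters,
// most significant bit of each byte first.
function bytesToBitString(string $bytes): string
{
    $bits = '';
    for ($i = 0, $n = strlen($bytes); $i < $n; $i++) {
        // %08b pads each byte to exactly eight binary digits
        $bits .= sprintf('%08b', ord($bytes[$i]));
    }
    return $bits;
}

// Nine bytes (72 bits) would not fit into a 64-bit integer, but fit a string.
echo strlen(bytesToBitString(str_repeat("\xFF", 9))), "\n";
```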

Is there a faster way than x >= start && x <= end in PHP to test if an integer is between two integers?

This is a similar question to Fastest way to determine if an integer is between two integers (inclusive) with known sets of values, but the accepted answer will not work (as far as I know) in PHP, because PHP is not strictly typed and does not have controllable integer overflow.
The use case here is to determine if an integer is between 65 and 90 (ASCII values for 'A' and 'Z'). These bounds might help optimize the solution due to 64 being a power of two and acting as boundary condition for this problem.
The only pseudo optimization I have come up with so far is:
//$intVal will be between 0 and 255 (inclusive)
function isCapital($intVal)
{
//255-64=191 (bit mask of 1011 1111)
return (($intVal & 191) <= 26) && (($intVal & 191) > 0);
}
This function is not much of an improvement (possibly slower) over a normal double comparison of $intVal >= 65 && $intVal <= 90, but it is just where I started heading while trying to optimize.
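One caveat with the bitwise version is worth checking: masking with 191 only clears the bit with value 64, so the values 1 to 26 pass the test as well as 65 to 90. The sketch below compares both checks over the whole 0 to 255 range; the function names are made up for this comparison.

```php
<?php
// Same expressions as in the question, under hypothetical names.
function isCapitalMasked(int $intVal): bool
{
    return (($intVal & 191) <= 26) && (($intVal & 191) > 0);
}

function isCapitalRange(int $intVal): bool
{
    return $intVal >= 65 && $intVal <= 90;
}

// Collect every input on which the two checks disagree.
$mismatches = [];
for ($v = 0; $v <= 255; $v++) {
    if (isCapitalMasked($v) !== isCapitalRange($v)) {
        $mismatches[] = $v;
    }
}
echo implode(',', $mismatches), "\n";
```

If the input can contain control characters (1 to 26), the masked check would need an extra test, which removes whatever advantage it might have had.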
function isCapitalBitwise($intVal) {
return (($intVal & 191) <= 26) && (($intVal & 191) > 0);
}
function isCapitalNormal($intVal) {
return $intVal >= 65 && $intVal <= 90;
}
function doTest($repetitions) {
$i = 0;
// microtime(true) returns a float, so the subtractions below are meaningful
$startFirst = microtime(true);
while ($i++ < $repetitions) {
isCapitalBitwise(76);
}
$first = microtime(true) - $startFirst;
$i = 0;
$startSecond = microtime(true);
while ($i++ < $repetitions) {
isCapitalNormal(76);
}
$second = microtime(true) - $startSecond;
$i = 0;
$startThird = microtime(true);
while ($i++ < $repetitions) {
ctype_upper('A');
}
// end time minus start time, as in the other two measurements
$third = microtime(true) - $startThird;
echo $first . ' ' . $second . ' ' . $third . PHP_EOL;
}
doTest(1000000);
On my system this returns:
0.217393 0.188426 0.856837
PHP is not as good at bitwise operations as compiled languages... but more importantly, I had to do a million comparisons to get less than 3 hundredths of a second of difference.
Even ctype_upper() is well in the range of "you might save a few seconds of CPU time per year" with these other ways of comparison, with the added bonus that you don't have to call ord() first.
Go for readability. Go for maintainability. Write your application, then profile it to see where your real bottlenecks are.
Instead of reinventing the wheel, why not use the built-in PHP function ctype_upper?
$char = 'A';
echo ctype_upper($char) ? "It's uppercase" : "It's lowercase";
You can even pass in the integer value of a character (note that passing a non-string argument is deprecated as of PHP 8.1):
echo ctype_upper($intVal) ? "It's uppercase" : "It's lowercase";
http://php.net/manual/en/function.ctype-upper.php
Even if you do find a method other than comparing via && or what I pasted above, the difference will be a matter of microseconds. You will waste hours coming up with a way to save a few seconds over the course of a year.
From How to check if an integer is within a range?:
t1_test1: ($val >= $min && $val <= $max): 0.3823 ms
t2_test2: (in_array($val, range($min, $max)): 9.3301 ms
t3_test3: (max(min($var, $max), $min) == $val): 0.7272 ms
You can also use range with characters (A, B, C...) but as you see it is not a good approach.
I think you will get the best results by going native, but it's only a fraction faster. Use ctype_upper directly. Here are my tests:
<?php
$numTrials = 500000;
$test = array();
for ($ii = 0; $ii < $numTrials; $ii++) {
$test[] = mt_rand(0, 255);
}
function compare2($intVal) {
return $intVal >= 65 && $intVal <= 90;
}
$tic = microtime(true);
for ($ii = 0; $ii < $numTrials; $ii++) {
$result = compare2($test[$ii]);
}
$toc = microtime(true);
echo "compare2...: " . ($toc - $tic) . "\n";
$tic = microtime(true);
for ($ii = 0; $ii < $numTrials; $ii++) {
$result = ctype_upper($test[$ii]);
}
$toc = microtime(true);
echo "ctype_upper: " . ($toc - $tic) . "\n";
echo "\n";
Which gives something pretty consistently like:
compare2...: 0.39210104942322
ctype_upper: 0.32374000549316

Calculating the n-th root of an integer using PHP/GMP

How can I calculate the n-th root of an integer using PHP/GMP?
Although I found a function called gmp_root(a, nth) in the PHP source, it seems that this function has not been published in any release yet*: http://3v4l.org/8FjU7
*) 5.6.0alpha2 being the most recent one at the time of writing
Original source: Calculating Nth root with bcmath in PHP – thanks and credits to HamZa!
I've rewritten the code to use GMP instead of BCMath:
function gmp_nth_root($num, $n) {
if ($n < 1) return 0; // we want positive exponents
if ($num <= 0) return 0; // we want positive numbers
if ($num < 2) return 1; // the n-th root of 1 is 1
// g is our guess number
$g = 2;
// while (g^n < num) g=g*2
while (gmp_cmp(gmp_pow($g, $n), $num) < 0) {
$g = gmp_mul($g, 2);
}
// if (g^n == num) then g is the exact root, we're lucky, end of job
if (gmp_cmp(gmp_pow($g, $n), $num) == 0) {
return $g;
}
// if we're here, g^n overshot num :(
$og = $g; // og means original guess and here is our upper bound
$g = gmp_div($g, 2); // g is set to be our lower bound
$step = gmp_div(gmp_sub($og, $g), 2); // step is the half of upper bound - lower bound
$g = gmp_add($g, $step); // we start at lower bound + step , basically in the middle of our interval
// while step != 1
while (gmp_cmp($step, 1) > 0) {
$guess = gmp_pow($g, $n);
$step = gmp_div($step, 2);
$comp = gmp_cmp($guess, $num); // compare our guess with real number
if ($comp < 0) { // if guess is lower we add the new step
$g = gmp_add($g, $step);
} else if ($comp > 0) { // if guess is higher we sub the new step (gmp_cmp only guarantees a positive value, not exactly 1)
$g = gmp_sub($g, $step);
} else { // if guess is exactly the num we're done, we return the value
return $g;
}
}
// whatever happened, g is the closest guess we can make so return it
return $g;
}
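The same bisection idea can be tried out without GMP. The following is a sketch using native integers instead of GMP numbers, so it only applies while the input fits in a PHP int; the function names are hypothetical. It returns the floor of the n-th root.

```php
<?php
// Hypothetical helper: true if $base ** $exp <= $limit, checked without
// ever computing a product that could exceed $limit (and overflow).
function powAtMost(int $base, int $exp, int $limit): bool
{
    $result = 1;
    for ($i = 0; $i < $exp; $i++) {
        // $result * $base > $limit  <=>  $result > floor($limit / $base)
        if ($result > intdiv($limit, $base)) {
            return false;
        }
        $result *= $base;
    }
    return true;
}

// Floor of the n-th root of $num, found by binary search on the root.
function intNthRoot(int $num, int $n): int
{
    if ($n < 1 || $num < 1) {
        return 0; // same convention as above: only positive inputs
    }
    $lo = 1;
    $hi = $num;
    while ($lo < $hi) {
        $mid = $lo + intdiv($hi - $lo + 1, 2); // upper middle, so $lo advances
        if (powAtMost($mid, $n, $num)) {
            $lo = $mid;      // $mid ** $n still fits, root is at least $mid
        } else {
            $hi = $mid - 1;  // $mid ** $n is too large
        }
    }
    return $lo;
}
```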

PHP "Maximum execution time"

I'm trying to program my own Sine function implementation for fun but I keep getting :
Fatal error: Maximum execution time of 30 seconds exceeded
I have a small HTML form where you can enter the "x" value of Sin(x) you're looking for and the number of "iterations" you want to calculate (the precision of your value); the rest is PHP.
The maths are based on the "Series definition" of Sine on Wikipedia:
--> http://en.wikipedia.org/wiki/Sine#Series_definition
Here's my code :
<?php
function factorial($int) {
if($int<2)return 1;
for($f=2;$int-1>1;$f*=$int--);
return $f;
};
if(isset($_POST["x"]) && isset($_POST["iterations"])) {
$x = $_POST["x"];
$iterations = $_POST["iterations"];
}
else {
$error = "You forgot to enter the 'x' or the number of iterations you want.";
global $error;
}
if(isset($x) && is_numeric($x) && isset($iterations) && is_numeric($iterations)) {
$x = floatval($x);
$iterations = floatval($iterations);
for($i = 0; $i <= ($iterations-1); $i++) {
if($i%2 == 0) {
$operator = 1;
global $operator;
}
else {
$operator = -1;
global $operator;
}
}
for($k = 1; $k <= (($iterations-(1/2))*2); $k+2) {
$k = $k;
global $k;
}
function sinus($x, $iterations) {
if($x == 0 OR ($x%180) == 0) {
return 0;
}
else {
while($iterations != 0) {
$result = $result+(((pow($x, $k))/(factorial($k)))*$operator);
$iterations = $iterations-1;
return $result;
}
}
}
$result = sinus($x, $iterations);
global $result;
}
else if(!isset($x) OR !isset($iterations)) {
$error = "You forgot to enter the 'x' or the number of iterations you want.";
global $error;
}
else if(isset($x) && !is_numeric($x)&& isset($iterations) && is_numeric($iterations)) {
$error = "Not a valid number.";
global $error;
}
?>
My mistake probably comes from an infinite loop at this line :
$result = $result+(((pow($x, $k))/(factorial($k)))*$operator);
but I don't know how to solve the problem.
What I'm trying to do at this line is to calculate:
((pow($x, $k)) / (factorial($k)) + (((pow($x, $k))/(factorial($k)) * ($operator)
iterating :
+ (((pow($x, $k))/(factorial($k)) * $operator)
an "$iterations" amount of times with "$i"'s and "$k"'s values changing accordingly.
I'm really stuck here ! A bit of help would be needed. Thank you in advance !
By the way: the factorial function is not mine. I found it in a php.net comment and apparently it's the optimal factorial function.
Why are you computing the 'operator' and the power 'k' outside the sinus function?
The sine expansion looks like x - x^3/3! + x^5/5! - ...
Also remember that the iteration count is an integer, so apply intval to it, not floatval.
Also read up on how to use global. In any case, you do not need global here, because the 'operator' and power 'k' computations belong inside the sinus function.
Best of luck.
That factorial function is hardly optimal for speed, though it is not bad. At least it does not recurse, and it is simple and correct. The major aspect of the timeout is that you are calling it a lot. One technique for improving its performance is to remember, in a local array, the values for factorial previously computed. Or just compute them all once.
There are many bits of your code which could endure improvement:
This statement:
while($iterations != 0)
What if $iterations is entered as 0.1? Or negative. That would cause an infinite loop. You can make the program more resistant to bad input with
while ($iterations > 0)
The formula for computing a sine uses the odd numbers: 1, 3, 5, 7; not every integer
There are easier ways to compute the alternating sign.
Excess complication of arithmetic expressions.
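return $result is within the loop, terminating it early.
The for loop over $k uses $k+2 as its increment expression, which does not modify $k (it should be $k += 2), so that loop never terminates; this alone is enough to hit the execution time limit.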
Here is a tested, working program which has adjustments for all these issues:
<?php
// precompute the factorial values
global $factorials;
$factorials = array();
foreach (range (0, 170) as $j)
if ($j < 2)
$factorials [$j] = 1;
else $factorials [$j] = $factorials [$j-1] * $j;
function sinus($x, $iterations)
{
global $factorials;
$sign = 1;
for ($j = 1, $result = 0; $j < $iterations * 2; $j += 2)
{
$result += pow($x, $j) / $factorials[$j] * $sign;
$sign = - $sign;
}
return $result;
}
// test program to prove functionality
$pi = 3.14159265358979323846264338327950288419716939937510582097494459230781640628620;
$x_vals = array (0, $pi/4, $pi/2, $pi, $pi * 3/2, 2 * $pi);
foreach ($x_vals as $x)
{
$y = sinus ($x, 20);
echo "sinus($x) = $y\n";
}
?>
Output:
sinus(0) = 0
sinus(0.78539816339745) = 0.70710678118655
sinus(1.5707963267949) = 1
sinus(3.1415926535898) = 3.4586691443274E-16
sinus(4.7123889803847) = -1
sinus(6.2831853071796) = 8.9457384260403E-15
By the way, this executes very quickly: 32 milliseconds for this output.
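As a variation on the program above, the factorial table and pow() can be avoided altogether by deriving each term from the previous one: going from x^(2k+1)/(2k+1)! to the next term multiplies by -x^2 / ((2k+2)(2k+3)). This is only a sketch of that idea, and the function name is made up.

```php
<?php
// Hypothetical variant: each series term is computed from the previous one,
// so no factorials or powers are needed and the sign alternates by itself.
function sinusIncremental(float $x, int $iterations): float
{
    $term = $x;    // first term of the series: x^1 / 1!
    $result = 0.0;
    for ($k = 0; $k < $iterations; $k++) {
        $result += $term;
        $n = 2 * $k + 2;
        // next term = current term * (-x^2) / ((2k+2)(2k+3))
        $term *= -$x * $x / ($n * ($n + 1));
    }
    return $result;
}

echo sinusIncremental(M_PI / 2, 20), "\n";
```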

Calculating prime factors for Project Euler

I need to find the greatest prime factor of a large number: up to 12 places (xxx,xxx,xxx,xxx). I have solved the problem, and the code works for small numbers (up to 6 places); however, the code won't run fast enough to not trigger a timeout on my server for something in the 100 billions.
I found a solution, thanks to all.
Code:
<?php
set_time_limit(300);
function is_prime($number) {
$sqrtn = intval(sqrt($number));
//won't work for 0-2
for($i=3; $i<=$sqrtn; $i+=2) {
if($number%$i == 0) {
return false;
}
}
return true;
}
$initial = 600851475143;
$prime_factors = array();
for($i=3; $i<=9999; $i++) {
$remainder = fmod($initial, $i);
if($remainder == 0) {
if(is_prime($i)) {
$prime_factors[] = $i;
}
}
}
//print_r($prime_factors);
echo "\n\n";
echo "<b>Answer: </b>". max($prime_factors);
?>
The test number in this case is 600851475143.
Your code will not find any prime factors larger than sqrt(n). To correct that, you have to test the quotient $number / $i also, for each factor (not only prime factors) found.
Your is_factor function
function is_factor($number, $factor) {
$half = $number/2;
for($y=1; $y<=$half; $y++) {
if(fmod($number, $factor) == 0) {
return true;
}
}
}
doesn't make sense. What's $y and the loop for? If $factor is not a divisor of $number, that will perform $number/2 utterly pointless divisions. With that fixed, reordering the tests in is_prime_factor will give a good speedup because the costly primality test needs only be performed for the few divisors of $number.
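The suggestion above (test the quotient as well, and run the primality test only on actual divisors) could be sketched like this. The function names are hypothetical, and the sketch assumes the number fits in a PHP int, which on a 64-bit build covers 12-digit inputs.

```php
<?php
// Plain trial division; adequate for the handful of divisors tested below.
function isPrimeTrial(int $n): bool
{
    if ($n < 2) {
        return false;
    }
    for ($i = 2; $i * $i <= $n; $i++) {
        if ($n % $i === 0) {
            return false;
        }
    }
    return true;
}

// For every divisor $i up to sqrt($n), also consider the quotient $n / $i,
// so prime factors larger than sqrt($n) are not missed.
function largestPrimeFactorByDivisors(int $n): int
{
    $best = 0;
    for ($i = 1; $i * $i <= $n; $i++) {
        if ($n % $i !== 0) {
            continue; // cheap divisibility test first
        }
        foreach ([$i, intdiv($n, $i)] as $divisor) {
            // costly primality test only for real divisors
            if ($divisor > $best && isPrimeTrial($divisor)) {
                $best = $divisor;
            }
        }
    }
    return $best;
}

echo largestPrimeFactorByDivisors(13195), "\n";
```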
Here is a really simple and fast solution.
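function LPF($n)
{
// divide out every factor i; what remains is the largest prime factor
for ($i = 2; $i <= sqrt($n); $i++)
{
while ($n > $i && $n % $i == 0) $n = intdiv($n, $i);
}
return $n;
}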
