Calculate a % b for very large numbers - php

I have to calculate a % b for two very large numbers.
I cannot use the built-in modulo operator, because a and b are larger than PHP_INT_MAX, so I have to handle them as strings.
I know there are special math libraries such as BCMath or GMP, but I can't use them, because my app will probably be hosted on a shared host where they are not enabled.
I have to write a PHP function that does the job. It takes two strings (the two numbers) as parameters and has to return a % b, but I don't know where to start.
How can I solve this problem?

Since PHP 4.0.4, libbcmath is bundled with PHP. You don't need any external libraries for this extension. These functions are only available if PHP was configured with --enable-bcmath.
The Windows version of PHP has built-in support for this extension. You do not need to load any additional extensions in order to use these functions. You should be able to enable these functions yourself, without any action on the part of the hosting company.

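I thought of this solution:
$n represents a huge number (as a string), $m the (not so huge) modulus.
function getModulus($n, $m)
{
    // Process the decimal string one digit at a time, reducing modulo $m
    // at each step so that $r always stays below $m.
    // Requires 10 * $m to fit in a PHP integer.
    $a = str_split($n);
    $r = 0;
    foreach ($a as $v)
    {
        $r = ((($r * 10) + intval($v)) % $m);
    }
    return $r;
}
Hope it helps someone.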
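The digit-by-digit approach above assumes the modulus fits in a native integer. When both operands are too large, as in the question, one possible pure-PHP route is schoolbook long division on decimal strings. The following is only a rough sketch: strMod and its helpers strNormalize, strCompare and strSubtract are hypothetical names, not library functions, and the code assumes non-negative integers passed as digit-only strings.

```php
<?php
// Strip leading zeros, keeping at least one digit.
function strNormalize(string $s): string
{
    $s = ltrim($s, '0');
    return $s === '' ? '0' : $s;
}

// Compare two normalized decimal strings; returns -1, 0 or 1.
function strCompare(string $a, string $b): int
{
    if (strlen($a) !== strlen($b)) {
        return strlen($a) < strlen($b) ? -1 : 1;
    }
    // Equal length: lexicographic order equals numeric order.
    return strcmp($a, $b) <=> 0;
}

// Compute $a - $b for normalized decimal strings, assuming $a >= $b.
function strSubtract(string $a, string $b): string
{
    $b = str_pad($b, strlen($a), '0', STR_PAD_LEFT);
    $result = '';
    $borrow = 0;
    for ($i = strlen($a) - 1; $i >= 0; $i--) {
        $digit = (int) $a[$i] - (int) $b[$i] - $borrow;
        if ($digit < 0) {
            $digit += 10;
            $borrow = 1;
        } else {
            $borrow = 0;
        }
        $result = $digit . $result;
    }
    return strNormalize($result);
}

// Long division: bring down one digit at a time and subtract the
// divisor until the running remainder is smaller than it.
function strMod(string $a, string $b): string
{
    $a = strNormalize($a);
    $b = strNormalize($b);
    if ($b === '0') {
        throw new InvalidArgumentException('Modulus must not be zero');
    }
    $remainder = '0';
    for ($i = 0, $len = strlen($a); $i < $len; $i++) {
        $remainder = strNormalize($remainder . $a[$i]);
        // At most nine subtractions per digit.
        while (strCompare($remainder, $b) >= 0) {
            $remainder = strSubtract($remainder, $b);
        }
    }
    return $remainder;
}

echo strMod('100000000000000000000', '99999999999999999999'), "\n"; // 1
```

This trades speed for portability: the work grows with the product of the two string lengths, so BCMath or GMP would be preferable wherever they are available.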

PHP_INT_MAX depends on your platform: it is 2^63-1 on a 64-bit build and 2^31-1 on a 32-bit build. That limits how many decimal digits an integer can hold; above that you will get wrong values.
You can work around this by splitting your number into chunks.
Example:
My number is 18 digits long, so I split it into chunks of 9, 7 and 2 digits (9 + 7 + 2 = 18).
Calculate the mod of the first chunk.
Prepend the mod of the first chunk to the second chunk.
Example: if the result of the first mod is 23, the second chunk becomes 23XXXXXXX. Find the mod of 23XXXXXXX and prepend it to the last chunk. Example: if that mod is 15, the last chunk becomes 15XX.
$string = '123456789123456789'; // 18 decimal long
$chunk[0] = '123456789'; // 9 decimal long
$chunk[1] = '1234567'; // 7 decimal long
$chunk[2] = '89'; // 2 decimal long
$modulus = null;
foreach($chunk as $value){
$modulus = (int)($modulus.$value) % 45;
}
The resulting $modulus above should be the same as
$modulus = $string % 45;
Better late than never.
Hope this helps. Has anyone used a similar approach?

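You can use fmod for values larger than PHP_INT_MAX, but note that it operates on floats, so the result is only exact while the value fits in a float's 53-bit mantissa.
Read more about it here:
http://php.net/manual/en/function.fmod.php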
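One caveat is worth illustrating here: fmod works on floating-point values, and a double cannot represent every integer above 2^53. The small sketch below (the variable names are made up for the example) compares fmod with an exact digit-by-digit reduction of the same decimal string.

```php
<?php
// 2^53 + 1 cannot be represented exactly as a double; it rounds to 2^53.
$asFloat = (float) '9007199254740993';

// fmod() therefore sees 9007199254740992 and reports its remainder instead.
$viaFmod = fmod($asFloat, 10);

// Digit-by-digit reduction on the original string stays exact.
$exact = 0;
foreach (str_split('9007199254740993') as $digit) {
    $exact = ($exact * 10 + (int) $digit) % 10;
}

var_dump($viaFmod); // float(2)
var_dump($exact);   // int(3)
```

So fmod is only a safe substitute while the operands stay within the range that floats represent exactly.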

Related

A random-string function produces many duplicates

I am using the function below to generate random strings for filenames. I have no problems on Unix machines, but I get many duplicates on Windows. I just ran a test and generated 100,000 strings, with the result that each string occurs 227 (??) times. Could anyone explain this? Even with rand() I got duplicates, but srand() seems to work.
function generateRandomString($length = 6)
{
$rows = array();
array_push($rows, range('A', 'Z'));
array_push($rows, range('a', 'z'));
array_push($rows, range(0, 9));
$signs = array();
foreach ($rows as $row) {
$signs = array_merge($signs, $row);
}
shuffle($signs);
shuffle($signs);
$password = '';
for ($i = 0; $i < $length; $i++) {
$password .= $signs[array_rand($signs, 1)];
}
return $password;
}
Well, for one, Windows file names aren't case sensitive, so you should only use either lowercase or uppercase characters in your source arrays, not both.
The rest is basically up to the actual implementation of the PHP processor you're using, rather than Windows vs. *nix. Perhaps it relies on timer resolution and keeps no internal state - that would be bad, given that the timer resolution is usually about 15ms, which isn't a whole lot. In any case, it's most likely an issue with the PHP implementation, not Windows itself, as this code in C# nicely illustrates:
Random rnd = new Random();
byte[] buf = new byte[10 * 100000];
rnd.NextBytes(buf);
buf
.Select((val, idx) => new { Index = idx, Value = val })
.ToLookup(i => i.Index / 10)
.Select(i => string.Join(string.Empty,
i.Select(j => j.Value.ToString("X2")).ToArray()))
.GroupBy(i => i)
.Where(i => i.Count() > 1)
.Dump();
This creates 100 000 random strings of 20 characters (0 to F) and looks for duplicates. In a few hundred tests, I haven't had a single collision. So if you've got trouble with the random generator in PHP, go look at the particular implementation of PHP you're using, rather than blaming Windows :)
It's interesting how your code does a few passes of the randomness (shuffling the $signs array twice and then picking randomly from that?). Doing this most likely reduces the randomness of the data rather than increasing it. Seems just like a stupid attempt at hiding the password generation mechanism behind layers of obscurity (and then open sourcing it... eh :D).
As for passwords (your code seems to indicate that's what it was used for), you should probably use the crypto-secure randoms anyway - they're far less predictable, more random and less prone to bias.
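As a rough illustration of the last two points, here is a minimal sketch that draws from PHP's cryptographically secure generator via random_int (available since PHP 7) and sticks to lowercase letters and digits so that case-insensitive file systems cannot fold two names together. The function name randomFileStem and the alphabet are assumptions made for this example.

```php
<?php
// Hypothetical helper: builds a random file-name stem from a
// lowercase-plus-digits alphabet using PHP's CSPRNG.
function randomFileStem(int $length = 6): string
{
    $alphabet = 'abcdefghijklmnopqrstuvwxyz0123456789';
    $maxIndex = strlen($alphabet) - 1;
    $stem = '';
    for ($i = 0; $i < $length; $i++) {
        // random_int() draws uniformly from [0, $maxIndex]; no manual seeding.
        $stem .= $alphabet[random_int(0, $maxIndex)];
    }
    return $stem;
}

echo randomFileStem(), "\n";
```

With 36^6 (about 2.2 billion) possible stems, a couple of accidental duplicates among 100,000 strings are still statistically likely, so a longer stem or an existence check before writing the file remains advisable.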

php5 pack is broken on x86_64 env

pack('H*', dechex(12345678900)) /* on 32bit */
!= pack('H*', dechex(12345678900)) /* on 64bit */
why ?
I don't know how to fix it, but I think I know why this is happening. No bug here - straight from the manual http://php.net/manual/en/function.dechex.php
The largest number that can be converted is 4294967295 in decimal resulting to "ffffffff"
I do not know exactly what is happening inside PHP, but you are probably causing a 32-bit unsigned integer to overflow (12,345,678,900 > 4,294,967,295). Since on 64-bit this limit should be 18,446,744,073,709,551,615, dechex is returning "correct" values (the 32 vs 64 bit difference doesn't seem to be documented, and I might be wrong since I don't have a 64-bit system for testing).
//Edit:
As a last resort you could use the GMP extension to write your own dechex function for 32-bit systems, but that is going to produce lots and lots of overhead. It will probably be one of the slowest implementations known to modern programming.
//Edit2:
I wrote a function using BCMath; I'm on Windows at the moment and was struggling to find the correct DLL for GMP.
function dechex32($i) {
    //Cast to string
    $i = (string)$i;
    //Initialize result string
    $r = '';
    //Map hex values 0-9, a-f to array keys
    $hex = array_merge(range(0, 9), range('a', 'f'));
    //While input is larger than 0
    while(bccomp($i, '0') > 0) {
        //Modulo 16 and append hex char to result
        $r.= $hex[$mod = bcmod($i, '16')];
        //i = (i - mod) / 16
        $i = bcdiv(bcsub($i, $mod), '16');
    }
    //Reverse result and return
    return strrev($r);
}
var_dump(dechex32(12345678900));
/*string(9) "2dfdc1c34"*/
Didn't test it thoroughly, but it seems to work. Use as a last resort - rough benchmarking with 100,000 iterations showed that it's ~40 times slower than the native implementation.

How to compare two 64 bit numbers

In PHP I have a 64 bit number which represents tasks that must be completed. A second 64 bit number represents the tasks which have been completed:
$pack_code = 1001111100100000000000000011111101001111100100000000000000011111
$veri_code = 0000000000000000000000000001110000000000000000000000000000111110
I need to compare the two and provide a percentage of tasks completed figure. I could loop through both and find how many bits are set, but I don't know if this is the fastest way?
Assuming that these are actually strings, perhaps something like:
$pack_code = '1001111100100000000000000011111101001111100100000000000000011111';
$veri_code = '0000000000000000000000000001110000000000000000000000000000111110';
$matches = array_intersect_assoc(str_split($pack_code),str_split($veri_code));
$finished_matches = array_intersect($matches,array(1));
$percentage = (count($finished_matches) / 64) * 100;
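If the percentage should be measured against the number of tasks that are actually required, rather than against all 64 positions, a variation along the following lines might work. It is only a sketch: completionPercentage is a hypothetical name, and it assumes both inputs are equal-length strings made up solely of '0' and '1'. It relies on PHP's bitwise AND operating byte by byte when both operands are strings.

```php
<?php
// Hypothetical helper: percentage of required tasks that are completed.
// Assumes both arguments are equal-length strings of '0' and '1'.
function completionPercentage(string $required, string $completed): float
{
    if (strlen($required) !== strlen($completed)) {
        throw new InvalidArgumentException('Bit strings must have equal length');
    }
    // String AND works per byte: '1' & '1' gives '1',
    // any pair containing '0' gives '0'.
    $done  = substr_count($required & $completed, '1');
    $total = substr_count($required, '1');
    // With nothing required, treat the work as fully complete.
    return $total === 0 ? 100.0 : $done / $total * 100;
}

echo completionPercentage('11110000', '10100001'), "\n"; // 50
```

In the usage line, four tasks are required and two of them are done; the stray completed bit in the last position is ignored because it was never required.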
Because you're getting the numbers as hex strings instead of ones and zeros, you'll need to do a bit of extra work.
PHP does not reliably support numbers over 32 bits as integers. 64-bit support requires being compiled and running on a 64-bit machine. This means that attempts to represent a 64-bit integer may fail depending on your environment. For this reason, it will be important to ensure that PHP only ever deals with these numbers as strings. This won't be hard, as hex strings coming out of the database will be, well, strings, not ints.
There are a few options here. The first would be using the GMP extension's gmp_xor function, which performs a bitwise-XOR operation on two numbers. The resulting number will have bits turned on when the two numbers have opposing bits in that location, and off when the two numbers have identical bits in that location. Then it's just a matter of counting the bits to get the remaining task count.
Another option would be transforming the number-as-a-string into a string of ones and zeros, as you've represented in your question. If you have GMP, you can use gmp_init to read it as a base-16 number, and use gmp_strval to return it as a base-2 number.
If you don't have GMP, this function provided in another answer (scroll to "Step 2") can accurately transform a string-as-number into anything between base-2 and 36. It will be slower than using GMP.
In both of these cases, you'd end up with a string of ones and zeros and can use code like that posted by @Mark Baker to get the difference.
Optimization in this case is not worth considering. I'm 100% sure that you don't really care whether your script runs 0.00000014 sec. faster, am I right?
Just loop through each bit of that number, compare it with another and you're done.
Remember words of Donald Knuth:
We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.
This code uses the GNU Multiple Precision library (GMP), which PHP supports through an extension. Since it is implemented in C it should be fast enough, and it supports arbitrary precision.
$pack_code = gmp_init("1001111100100000000000000011111101001111100100000000000000011111", 2);
$veri_code = gmp_init("0000000000000000000000000001110000000000000000000000000000111110", 2);
$number_of_different_bits = gmp_popcount(gmp_xor($pack_code, $veri_code));
$a = 11111;
echo sprintf('%032b',$a)."\n";
$b = 12345;
echo sprintf('%032b',$b)."\n";
$c = $a & $b;
echo sprintf('%032b',$c)."\n";
$n=0;
while($c)
{
$n += $c & 1;
$c = $c >> 1;
}
echo $n."\n";
Output:
00000000000000000010101101100111
00000000000000000011000000111001
00000000000000000010000000100001
3
Given that your PHP setup can handle 64-bit integers, this can easily be extended.
If not, you can sidestep this restriction using GNU Multiple Precision.
You could also split up the hex representation and operate on the corresponding parts instead, since you only need to know whether each bit is 1 or 0, not which number is actually represented. I think that would solve your problem best.
For example:
0xF1A35C and 0xD546C1
you just compare the binary versions of F and D, 1 and 5, A and 4, ...
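The digit-by-digit idea could be sketched as follows. The helper name commonBitsHex and the 16-entry lookup table are assumptions made for this example, and the two hex strings are assumed to have the same length. Treating the first value as the required tasks and the second as the completed ones is likewise just an interpretation for the demo.

```php
<?php
// Hypothetical helper: number of bits set in both $a and $b, where both
// are equal-length hexadecimal strings, processed one hex digit at a time.
function commonBitsHex(string $a, string $b): int
{
    if (strlen($a) !== strlen($b)) {
        throw new InvalidArgumentException('Hex strings must have equal length');
    }
    // Number of set bits for every 4-bit value 0..15.
    $bitsInNibble = [0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4];
    $count = 0;
    for ($i = 0, $len = strlen($a); $i < $len; $i++) {
        // Each hex digit is at most 15, so native integers are always enough.
        $count += $bitsInNibble[hexdec($a[$i]) & hexdec($b[$i])];
    }
    return $count;
}

$required  = 'F1A35C';
$completed = 'D546C1';
$done  = commonBitsHex($required, $completed); // 6
$total = commonBitsHex($required, $required);  // 13
printf("%.1f%%\n", $done / $total * 100);      // 46.2%
```

Because no value larger than 15 is ever turned into an integer, this works the same on 32-bit and 64-bit builds and needs no extension.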

PHP bitwise left shifting 32 spaces problem and bad results with large numbers arithmetic operations

I have the following problems:
First: I am trying to do a 32-spaces bitwise left shift on a large number, and for some reason the number is always returned as-is. For example:
echo(516103988<<32); // echoes 516103988
Because shifting the bits to the left one space is the equivalent of multiplying by 2, I tried multiplying the number by 2^32, and it works: it returns 2216649749795176448.
Second: I have to add 9379 to the number from the above point:
printf('%0.0f', 2216649749795176448 + 9379); // prints 2216649749795185920
Should print: 2216649749795185827
Doing 32 bit-shifting operations will probably not work like you expect, as integers tend to be stored on 32 bits.
Quoting this page: Bitwise Operators
Don't right shift for more than 32 bits on 32 bits systems. Don't left shift in case it results to number longer than 32 bits. Use functions from the gmp extension for bitwise manipulation on numbers beyond PHP_INT_MAX.
PHP integer precision is limited to the machine word size (32 or 64 bits). To work with arbitrary-precision integers you have to store them as strings and use the BCMath or GMP library:
echo bcmul('516103988', bcpow(2, 32)); // 2216649749795176448
Based on Pascal MARTIN's suggestions, I tried both the BCMath and the GMP extension and came up with the following solutions:
With BCMath:
$a = 516103988;
$s = bcpow(2, 32);
$a = bcadd(bcmul($a, $s), 9379);
echo $a; // works, echoes 2216649749795185827
With GMP:
$a = gmp_init(516103988);
$s = gmp_pow(gmp_init(2), 32);
$a = gmp_add(gmp_mul($a, $s), gmp_init(9379));
echo gmp_strval($a); // also works
From what I understand, there is a far greater chance of BCMath being installed on the server than GMP, so I will be using the first solution.
Thanks :)
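For completeness, a hedged sketch that combines the ideas from this thread: use native integers when the build is 64-bit and the result is guaranteed to fit, and fall back to BCMath otherwise. The function name shiftAndAdd is hypothetical, and the range checks are one conservative choice among several.

```php
<?php
// Hypothetical helper: computes ($high << 32) + $low, returned as a string.
function shiftAndAdd(int $high, int $low): string
{
    // On a 64-bit build the result fits in a signed integer as long as
    // $high is below 2^31 and $low is an unsigned 32-bit value.
    if (PHP_INT_SIZE >= 8
        && $high >= 0 && $high < (1 << 31)
        && $low >= 0 && $low <= 0xFFFFFFFF
    ) {
        return (string) (($high << 32) + $low);
    }
    // Otherwise fall back to string arithmetic if BCMath is loaded.
    if (function_exists('bcmul')) {
        return bcadd(bcmul((string) $high, bcpow('2', '32')), (string) $low);
    }
    throw new RuntimeException('Needs a 64-bit PHP build or the BCMath extension');
}

echo shiftAndAdd(516103988, 9379), "\n"; // 2216649749795185827
```

Returning a string in both branches keeps callers from accidentally pushing the value through float formatting, which is what produced the rounded 2216649749795185920 in the question.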

prime generator optimization

I'm starting out my expedition into Project Euler. And as many others I've figured I need to make a prime number generator. Problem is: PHP doesn't like big numbers. If I use the standard Sieve of Eratosthenes function, and set the limit to 2 million, it will crash. It doesn't like creating arrays of that size. Understandable.
So now I'm trying to optimize it. One way, I found, is that instead of creating an array with 2 million entries, I only need 1 million (apart from 2, only odd numbers can be prime). But now it's crashing because it exceeds the maximum execution time...
function getPrimes($limit) {
$count = 0;
for ($i = 3; $i < $limit; $i += 2) {
$primes[$count++] = $i;
}
for ($n = 3; $n < $limit; $n += 2) {
//array will be half the size of $limit
for ($i = 1; $i < $limit/2; $i++) {
if ($primes[$i] % $n === 0 && $primes[$i] !== $n) {
$primes[$i] = 0;
}
}
}
return $primes;
}
The function works, but as I said, it's a bit slow...any suggestions?
One thing I've found to make it a bit faster is to switch the loop around.
//$count must be reset to 0 before this loop
foreach ($primes as $value) {
    //$limitSq is the sqrt of the limit, as that is as high as I have to go
    for ($n = 3; $n <= $limitSq; $n += 2) {
        if ($value !== $n && $value % $n === 0) {
            $primes[$count] = 0;
            $n = $limitSq; //breaking the inner loop
        }
    }
    $count++;
}
In addition, after setting the time and memory limit (thanks Greg), I've finally managed to get an answer. Phew.
Without knowing much about the algorithm:
You're recalculating $limit/2 each time around the $i loop
Your if statement will be evaluated in order, so think about (or test) whether it would be faster to test $primes[$i] !== $n first.
Side note, you can use set_time_limit() to give it longer to run and give it more memory using
ini_set('memory_limit', '128M');
Assuming your setup allows this, of course - on a shared host you may be restricted.
From Algorithmist's proposed solution:
This is a modification of the standard Sieve of Eratosthenes. It would be highly inefficient, using up far too much memory and time, to run the standard sieve all the way up to n. However, no composite number less than or equal to n will have a factor greater than sqrt(n), so we only need to know all primes up to this limit, which is no greater than 31622 (square root of 10^9). This is accomplished with a sieve. Then, for each query, we sieve through only the range given, using our pre-computed table of primes to eliminate composite numbers.
This problem has also appeared on UVA's and Sphere's online judges. Here's how it's enunciated on Sphere.
You can use a bit field to store your sieve. That is, it's roughly identical to an array of booleans, except you pack your booleans into a large integer. For instance if you had 8-bit integers you would store 8 bits (booleans) per integer which would further reduce your space requirements.
Additionally, using a bit field allows the possibility of using bit masks to perform your sieve operation. For example, if your sieve kept all numbers (not just odd ones), you could construct a bit mask of b01010101 which you could then AND against every element in your array. For threes you could use three integers as the mask: b00100100 b10010010 b01001001.
Finally, you do not need to check numbers that are lower than $n; in fact you can start crossing off at $n*$n, because every smaller multiple of $n has already been eliminated by a smaller prime.
Once you know the number is not a prime, I would exit the inner loop. I don't know PHP, but you need a statement like a break in C or a last in Perl.
If that is not available, I would set a flag and use it as a condition for continuing the inner loop. This should speed up your execution, as you are not checking $limit/2 items if it is not a prime.
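Several of the suggestions in this thread (store odd numbers only, keep the flags compact, stop at the square root of the limit, start crossing off at the square of each prime) can be combined into one sieve. The following is a sketch under those assumptions, not the asker's original code: sievePrimes is a hypothetical name, and a byte string of '0'/'1' flags stands in for a true bit field.

```php
<?php
// Hypothetical helper: all primes <= $limit via an odd-only sieve.
function sievePrimes(int $limit): array
{
    if ($limit < 2) {
        return [];
    }
    // $flags[$i] describes the odd number 2 * $i + 1; '1' means "maybe prime".
    $half = intdiv($limit + 1, 2);
    $flags = str_repeat('1', $half);
    $flags[0] = '0'; // 1 is not prime

    for ($i = 1; (2 * $i + 1) * (2 * $i + 1) <= $limit; $i++) {
        if ($flags[$i] !== '1') {
            continue;
        }
        $p = 2 * $i + 1;
        // Start at p * p; smaller multiples were crossed off by smaller primes.
        // A step of $p in index space is a step of 2 * $p in number space,
        // which skips the even multiples.
        for ($j = intdiv($p * $p - 1, 2); $j < $half; $j += $p) {
            $flags[$j] = '0';
        }
    }

    $primes = [2];
    for ($i = 1; $i < $half; $i++) {
        if ($flags[$i] === '1') {
            $primes[] = 2 * $i + 1;
        }
    }
    return $primes;
}

echo implode(' ', sievePrimes(30)), "\n"; // 2 3 5 7 11 13 17 19 23 29
```

For a limit of 2 million the flag string takes about one megabyte, so the memory pressure in this design comes mainly from the returned array of primes rather than from the sieve itself.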
if you want speed, don’t use PHP on this one :P
no, seriously, i really like PHP and it’s a cool language, but it’s not suited at all for such algorithms
