I have millions of long numbers (e.g. 5216672577) in CSV data files and I'd like to reduce the file size. I plan to do this by rounding as many of the trailing digits as possible to 0, while staying within X% of the original number. Then I will convert the numbers to scientific notation. I prefer the format 103e7 to 1.03E+9: the period and plus sign add unnecessary bytes. I'm working with integers.
Update - I got it to work:
for($i = 9000; $i < 11000; $i++) {
echo notation($i), "\r\n<br>";
}
// Round and abbreviate an integer
function notation($int, $precision=0.01) {
// We cannot shorten small numbers
if(is_int($int) && $int >= 1000) {
$best = $int;
// For each decimal place
$l = strlen($int);
for($i = 3; $i <= $l - 1; $i++) {
// Round to the decimal place
$newInt = round($int, -$i);
// Check precision
$ratio = $int / $newInt;
if($ratio < (1 - $precision) || $ratio > (1 + $precision)) {
break;
}
// Save the best option
$best = $newInt;
}
// Count and remove trailing zeros
$l = strlen($best);
$best = rtrim($best, '0');
$i = $l - strlen($best);
// Check that we can actually shorten the int
if ($i >= 3) {
// Add scientific Notation
return $best . 'e' . $i;
}
}
return $int;
}
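Reading the shortened values back is not covered above. A minimal sketch, assuming a 64-bit PHP build and values below 2^53 so the float round-trip is exact; expandNotation is a hypothetical name, not part of the original code:

```php
<?php
// Hypothetical helper: expand a value written by notation() back into an integer.
// Assumes 64-bit integers and magnitudes below 2^53.
function expandNotation(string $value): int
{
    // Going through float handles the "52e8" form; plain digit strings pass through unchanged
    return (int) (float) $value;
}

echo expandNotation('52e8'), "\n"; // 5200000000
echo expandNotation('999'), "\n";  // 999
```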
I have a loop that generates all possible combinations of bits for a given number of bits, but I run out of memory when the number of bits goes beyond 20. Are there any optimizations I can make to solve this issue?
Here is my code:
function bitsGenerator($N)
{
$listN = $N;
$bits = ['0', '1'];
//check if input is valid or not
if (!is_int($listN)) {
echo 'Input must be numeric!';
}
if ($listN >= 1 && $listN <= 65) {
if ($listN == 1) {
echo '1';
exit;
}
for ($i = 1; $i <= ($listN - 1); $i++) {
$reverseBits = array_reverse($bits);
$prefixBit = preg_filter('/^/', '0', $bits);
$prefixReverseBits = preg_filter('/^/', '1', $reverseBits);
$bits = array_merge($prefixBit, $prefixReverseBits);
unset($prefixBit, $prefixReverseBits, $reverseBits);
}
$finalBits = array_slice($bits, -$listN);
foreach ($finalBits as $k => $v) {
echo $v . "\n";
}
} else {
echo 'Invalid input!';
}
}
The purpose of this function is to get the last $N combinations and display them; all other combinations are thrown away. I'm looking for an optimization so that the $bits array never stores more than 65 items, because the maximum number of bits, and thus the maximum number of combinations to display, is 65.
Thanks to everyone for helping me.
The main problem is the size of the $bits array holding all your results. The array will contain 2^$N elements, each $N characters long at 8 bits per character (you are using strings to keep the leading zeroes), so the raw character data alone comes to approximately (2^$N)*$N*8 bits, which is 167772160 bits (20 MiB) for $N = 20. PHP's per-string and per-element overhead comes on top of that, and none of it gets any smaller as long as everything is held in RAM.
Your working copies of $bits, preg_filter and array_merge will also consume a lot of RAM. Running your function with $N = 20 consumes 180375552 bytes (172MiB).
BTW: unsetting the variables will not reduce the consumed RAM, because they would get overwritten in the next iteration anyway (or destroyed at the end of the function).
The following function was my first sketch based on your function and uses a bit less RAM: 171970560 bytes or 164MiB (vs 172MiB).
function myBitsGenerator($length)
{
if (!is_int($length)) {
die('Input must be numeric!');
}
if ($length < 1 || $length > 65) {
die('Input must be between 1 and 65');
}
$bitsArray = ['0', '1'];
$count = 2;
if ($length > 1) {
// Start at 1: the strings are already 1 bit long, so $length - 1 doublings are needed
for ($i = 1; $i < $length; $i++) {
for ($j = 0; $j < $count; $j++) {
$bitsArray[] = $bitsArray[$j] . '1';
$bitsArray[$j] = $bitsArray[$j] . '0';
}
$count += $j;
}
}
$printArray = array_slice($bitsArray, $count - $length);
array_walk(
$printArray,
function ($value) {
echo $value . PHP_EOL;
}
);
}
However, the generation of those numbers is too complicated and should be simplified:
Theoretical: A binary number can be written in decimal and vice versa. The number of possible combinations of binary numbers with a fixed length of $N is x = 2 ^ $N. Each decimal number from 0 to x - 1 represents one binary number in the results.
Practical example: 0b101 (binary) === 5 (int)
All you have to do is to pad the calculated binary number with zeroes.
The simplified generator looks like:
$iterations = pow(2, $length);
for ($i = 0; $i < $iterations; $i++) {
$binary = str_pad(decbin($i), $length, '0', STR_PAD_LEFT);
}
You can use this generator to generate
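One way to apply it, as a sketch: since only the last few combinations are wanted, the loop can start near the end of the range instead of at 0, so nothing else is ever stored. lastBinaryStrings is a hypothetical name, the output is in plain binary counting order (as in the generator above), and the length is capped at 62 on the assumption of 64-bit integers.

```php
<?php
// Hypothetical helper: return only the last $count binary strings of $length bits.
// Assumes a 64-bit PHP build, hence the cap at 62 bits.
function lastBinaryStrings(int $length, int $count): array
{
    if ($length < 1 || $length > 62) {
        throw new InvalidArgumentException('Length must be between 1 and 62');
    }
    $total = 1 << $length;            // number of combinations, 2^$length
    $start = max(0, $total - $count); // skip everything before the last $count values
    $out = [];
    for ($i = $start; $i < $total; $i++) {
        $out[] = str_pad(decbin($i), $length, '0', STR_PAD_LEFT);
    }
    return $out;
}

echo implode("\n", lastBinaryStrings(3, 3)), "\n"; // 101, 110, 111
```

Memory use then depends on $count rather than on 2^$length.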
If you really need to use even less RAM, you should think about storing the results in a file, which makes the whole thing slower but uses far less RAM.
function fileBitsGenerator($length)
{
$handle = fopen('bits.txt', 'w+');
$iterations = pow(2, $length);
for ($i = 0; $i < $iterations; $i++) {
fwrite($handle, str_pad(decbin($i), $length, '0', STR_PAD_LEFT) . PHP_EOL);
}
// Release the file handle once all lines are written
fclose($handle);
}
This consumes just 2097152 bytes and scales!
But be aware that the performance will depend on your HDD/SSD speed, and that it executes a lot of write operations (which might shorten your SSD's life span). For example, the resulting file bits.txt is about 92MB for a length of 22.
So I am trying to do math on an array of integers while enforcing a maximum integer in each piece of the array. Similar to this:
function add($amount) {
$result = array_reverse([0, 0, 0, 100, 0]);
$max = 100;
for ($i = 0; $i < count($result); ++$i) {
$int = $result[$i];
$new = $int + $amount;
$amount = 0;
while ($new > $max) {
$new = $new - $max;
++$amount;
}
$result[$i] = $new;
}
return array_reverse($result);
}
add(1); // [0, 0, 0, 100, 1]
add(100); // [0, 0, 0, 100, 100]
add(101); // [0, 0, 1, 1, 1]
So what I have above works but it is slow when adding larger integers. I've tried to do this with bitwise shifts and gotten close but I just can't get it to work for some reason. I think I need a third-party perspective. Does anyone have some tips?
The part that takes up the majority of the time is the while loop. You are subtracting repeatedly until the value no longer exceeds $max. Looping down like that in PHP takes an incredible amount of time (a 12-digit integer clocked in at over 20 seconds on my local machine). Instead, use the modulo operator and division (along with an if). It is orders of magnitude faster. The same 12-digit integer took less than a second to complete with this code:
function add($amount) {
$result = array_reverse([0, 0, 0, 100, 0]);
$max = 100;
for ($i = 0, $size = count($result); $i < $size; ++$i) {
$int = $result[$i];
$new = $int + $amount;
$amount = 0;
if( $new > $max ) {
$remainder = $new % $max;
// Amount is the number of whole times max goes into new
$amount = (int) ($new / $max);
if( $remainder == 0 ) {
// Exact multiple: this slot keeps max and one less is carried over
$amount -= 1;
$new = $max;
} else {
// Otherwise this slot keeps the remainder
$new = $remainder;
}
}
$result[$i] = $new;
}
return array_reverse($result);
}
Also note that I moved your count($result) call into the initialization section of the for loop. When it is inside the condition, it gets executed on every iteration, which can also add to the overall execution time of the function.
Also note that with a large math change like this you may want to assert a range of values you expect to calculate to ensure there are no outliers. I did a small range and they all came out the same but I encourage you to run your own.
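Such a check could be sketched as follows. The two helpers are hypothetical: they isolate the carry step of the loop-based version and of the division-based version so the results can be compared over a range of inputs.

```php
<?php
// Hypothetical helper: carry step as in the original while loop
function carryByLoop(int $new, int $max): array
{
    $carry = 0;
    while ($new > $max) {
        $new -= $max;
        ++$carry;
    }
    return [$new, $carry];
}

// Hypothetical helper: carry step using modulo and division
function carryByDivision(int $new, int $max): array
{
    if ($new <= $max) {
        return [$new, 0];
    }
    $remainder = $new % $max;
    $carry = (int) ($new / $max);
    if ($remainder === 0) {
        // Exact multiple: the slot keeps max and one less is carried
        return [$max, $carry - 1];
    }
    return [$remainder, $carry];
}

// Compare both versions over a small range and report any difference
for ($n = 0; $n <= 10000; $n++) {
    if (carryByLoop($n, 100) !== carryByDivision($n, 100)) {
        echo "Mismatch at $n\n";
    }
}
```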
Use min($max, $number) to get $number limited to $max.
for ($i = 0; $i < count($result); ++$i) {
$result[$i] = min($max, $result[$i] + $amount);
}
What is the best way to generate a random integer with a restricted set of digits?
I want to generate a 4-digit random number where each digit is in the range [1..6]. I was thinking of generating a number in the range [0..1295], then converting it to base 6 and incrementing each digit, but that goes through a string.
Without string conversion, and with only one call to a random number generator, you could do this:
function myRandom() {
$num = mt_rand(0, 1295);
$result = 0;
for ($i = 0; $i < 4; $i++) {
$result = $result*10 + $num % 6;
$num = floor($num / 6);
}
return $result + 1111;
}
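To see how a value from [0..1295] maps onto the digits, the same arithmetic can be run on a fixed input instead of mt_rand(). A sketch only: digitsFromIndex is a hypothetical name, and intdiv() (PHP 7+) stands in for floor() to keep everything in integers.

```php
<?php
// Hypothetical helper: same digit extraction as myRandom(), but for a given index
function digitsFromIndex(int $num): int
{
    $result = 0;
    for ($i = 0; $i < 4; $i++) {
        // Take the next base-6 digit (0..5) and append it as a decimal digit
        $result = $result * 10 + $num % 6;
        $num = intdiv($num, 6);
    }
    // Shift every digit from 0..5 up to 1..6
    return $result + 1111;
}

echo digitsFromIndex(0), "\n";    // 1111
echo digitsFromIndex(1295), "\n"; // 6666
```

Every index in the range yields a distinct result, so a uniform index gives a uniform 4-digit number.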
You could generate each digit separately like this:
$result = '';
for ($i=0; $i < 4; $i++) {
$result .= mt_rand(1, 6);
}
$result = (int) $result;
Or if using a string is not preferred, you could do it with math:
$result = 0;
for ($i=0; $i < 4; $i++) {
$result += mt_rand(1, 6) * 10 ** $i;
// or for PHP versions < 5.6 (no ** exponentiation operator)
// $result += mt_rand(1, 6) * pow(10, $i);
}
<?php
// 216_10 = 1000_6
// 1295_10 = 5555_6
base_convert(mt_rand(216,1295),10,6);
rand(1,5) generates random numbers, for example: 4 3 2 3 2 (the sum is 14).
I want the total to NOT exceed x (which is say 5), so in this case, it could be:
1 + 2 + 2 = 5
2 + 3 = 5
and so on ... variable length, as long as sum <= x
Should I generate a random, check against x, generate another, check again or is there another way?
The most obvious way is just to keep looping and generating a smaller and smaller random number until you're capped out.
$min = 2;
$max = 5;
$randoms = [];
while ($max > $min) {
$max -= ( $rand = rand($min, $max) );
$randoms[] = $rand;
}
Updated for the actual use-case (see comments):
function generateRandomSpend($balance, $maximum = 5, $minimum = 2) {
$amounts = array();
while ($balance) {
// If we can't add any more minimum-spends, stop
if ($balance - $minimum < 0) {
break;
} else {
// Don't generate pointlessly-high values
$maximum = min($balance, $maximum);
$balance -= $amounts[] = rand($minimum, $maximum);
}
}
return $amounts;
}
print_r( $s = generateRandomSpend(10) );
You can also do
echo array_sum( generateRandomSpend(10) );
to get a numeric value of their total spend.
This also works and gives the result you want:
<?php
$fixed = 5;
$random = array();
$no = 0;
while(1) {
$number = rand(1,5);
$no +=$number;
if($no > $fixed) {
break;
}
$random[]= $number;
}
?>
I am looking to create an auto-incrementing unique string using PHP, containing [a-zA-Z0-9], starting at 2 chars long and growing when needed.
This is for a URL shrinker, so each string (or alias) will be saved in the database attached to a URL.
Any insight would be greatly appreciated!
Note this solution won't produce uppercase letters.
Use base_convert() to convert to base 36, which will use [a-z0-9].
<?php
// outputs a, b, c, ..., 2o, 2p, 2q
for ($i = 10; $i < 99; ++$i)
echo base_convert($i, 10, 36), "\n";
Given the last used value, you can convert it back to an integer with intval(), increment it, and convert the result back to base 36 with base_convert().
<?php
$value = 'bc9z';
$value = intval($value, 36);
++$value;
$value = base_convert($value, 10, 36);
echo $value; // bca0
// or
echo $value = base_convert(intval($value, 36) + 1, 10, 36);
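Wrapped up as a function, this could look like the sketch below. nextAlias is a hypothetical name; starting from '10' (36 in decimal) gives the two-character minimum the question asks for, and the integer round-trip assumes aliases stay within PHP_INT_MAX.

```php
<?php
// Hypothetical helper: return the base-36 alias that follows $last
function nextAlias(string $last): string
{
    // Decode from base 36, add one, and encode again
    $next = intval($last, 36) + 1;
    return base_convert((string) $next, 10, 36);
}

echo nextAlias('10'), "\n";   // 11
echo nextAlias('zz'), "\n";   // 100
echo nextAlias('bc9z'), "\n"; // bca0
```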
Here's an implementation of an incr function which takes a string containing characters [0-9a-zA-Z] and increments it, pushing a 0 onto the front if required using the 'carry-the-one' method.
<?php
function incr($num) {
$chars = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
$parts = str_split((string)$num);
$carry = 1;
for ($i = count($parts) - 1; $i >= 0 && $carry; --$i) {
$value = strpos($chars, $parts[$i]) + 1;
if ($value >= strlen($chars)) {
$value = 0;
$carry = 1;
} else {
$carry = 0;
}
$parts[$i] = $chars[$value];
}
if ($carry)
array_unshift($parts, $chars[0]);
return implode($parts);
}
$num = '0';
for ($i = 0; $i < 1000; ++$i) {
echo $num = incr($num), "\n";
}
If your string was single case rather than mixed, and didn't contain numerics, then you could literally just increment it:
$testString="AA";
for($x = 0; $x < 65536; $x++) {
echo $testString++.'<br />';
}
$testString="aa";
for($x = 0; $x < 65536; $x++) {
echo $testString++.'<br />';
}
But you could possibly make some use of this feature even with a mixed alphanumeric string.
To expand on meagar's answer, here is how you can do it with uppercase letters as well, and for arbitrarily large numbers (requires the bcmath extension, but you could just as well use gmp or the bigintegers PEAR package):
function base10ToBase62($number) {
static $chars = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
$result = "";
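$n = (string) $number;
do {
// bcmath works on numeric strings, so pass '62' as a string and cast the remainder for indexing
$remainder = (int) bcmod($n, '62');
$n = bcdiv($n, '62', 0);
$result = $chars[$remainder] . $result;
} while (bccomp($n, '0') > 0);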
return $result;
}
for ($i = 10; $i < 99; ++$i) {
echo base10ToBase62((string) $i), "\n";
}