Calculating Floating Point Powers (PHP/BCMath)

I'm writing a wrapper for the bcmath extension, and bug #10116 regarding bcpow() is particularly annoying: it casts the $right_operand ($exp) to a (native PHP, not arbitrary-length) integer, so when you try to calculate the square root (or any other root of degree higher than 1) of a number, you always end up with 1 instead of the correct result.
I started searching for algorithms that would allow me to calculate the nth root of a number, and I found this answer, which looks pretty solid. I actually expanded the formula using WolframAlpha and was able to improve its speed by about 5% while keeping the accuracy of the results.
Here is a pure PHP implementation mimicking my BCMath implementation and its limitations:
function _pow($n, $exp)
{
    $result = pow($n, intval($exp)); // bcmath casts $exp to (int)

    if (fmod($exp, 1) > 0) // does $exp have a fractional part higher than 0?
    {
        $exp = 1 / fmod($exp, 1); // convert the fractional part into a root (2.5 -> 1 / 0.5 = 2)

        // Newton's method for the $exp-th root of $n
        $x = 1;
        $y = (($n * _pow($x, 1 - $exp)) / $exp) - ($x / $exp) + $x;

        do
        {
            $x = $y;
            $y = (($n * _pow($x, 1 - $exp)) / $exp) - ($x / $exp) + $x;
        } while ($x > $y);

        return $result * $x; // 4^2.5 = 4^2 * 4^0.5 = 16 * 2 = 32
    }

    return $result;
}
The above seems to work great, except when 1 / fmod($exp, 1) doesn't yield an integer. For example, if $exp is 0.123456, its inverse will be 8.10005, and the outcomes of pow() and _pow() will differ slightly (demo):
pow(2, 0.123456) = 1.0893412745953
_pow(2, 0.123456) = 1.0905077326653
_pow(2, 1 / 8) = _pow(2, 0.125) = 1.0905077326653
How can I achieve the same level of accuracy using "manual" exponential calculations?

The algorithm employed to find the nth root of a (positive) number a is Newton's method for finding the zero of
f(x) = x^n - a.
That involves only powers with natural numbers as exponents, hence is straightforward to implement.
Calculating a power with an exponent 0 < y < 1 where y is not of the form 1/n with an integer n is more complicated. Doing the analogue, solving
x^(1/y) - a == 0
would again involve calculating a power with non-integral exponent, the very problem we're trying to solve.
If y = n/d is rational with small denominator d, the problem is easily solved by calculating
x^(n/d) = (x^n)^(1/d),
but for most rational 0 < y < 1, numerator and denominator are rather large, and the intermediate x^n would be huge, so the computation would use a lot of memory and take a (relatively) long time.
(For the example exponent of 0.123456 = 1929/15625, it's not too bad, but 0.1234567 would be rather taxing.)
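As a tiny concrete illustration of that identity with BCMath, for the easy case d = 2 (my own example, not from the answer):
echo bcsqrt(bcpow('4', '5'), 10); // 32.0000000000, i.e. 4^2.5 = (4^5)^(1/2)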
One way to calculate the power for general rational 0 < y < 1 is to write
y = 1/a ± 1/b ± 1/c ± ... ± 1/q
with integers a < b < c < ... < q and to multiply/divide the individual x^(1/k). (Every rational 0 < y < 1 has such representations, and the shortest such representations generally don't involve many terms, e.g.
1929/15625 = 1/8 - 1/648 - 1/1265625;
using only additions in the decomposition leads to longer representations with larger denominators, e.g.
1929/15625 = 1/9 + 1/82 + 1/6678 + 1/46501020 + 1/2210396922562500,
so that would involve more work.)
Some improvement is possible by mixing the approaches: first find a close rational approximation to y with a small denominator via the continued fraction expansion of y, then decompose the remainder into pure fractions. For the example exponent, 1929/15625 = [0; 8, 9, 1, 192], and using the first four partial quotients yields the approximation 10/81 = 0.123456790123... (note that 10/81 = 1/8 - 1/648; the partial sums of the shortest decomposition into pure fractions are convergents).
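For completeness, a quick sketch of that expansion in PHP (my own helper, not part of the original answer); it computes the partial quotients of n/d with the Euclidean algorithm:

function cf(int $n, int $d): array
{
    $q = [];
    while ($d != 0) {
        $q[] = intdiv($n, $d); // next partial quotient
        [$n, $d] = [$d, $n % $d]; // Euclidean step
    }
    return $q;
}

print_r(cf(1929, 15625)); // [0, 8, 9, 1, 192], as quoted above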
However, in general that approach leads to calculating nth roots for large n, which also is slow and memory-intensive if the desired accuracy of the final result is high.
All in all, it is probably simpler and faster to implement exp and log and use
x^y = exp(y*log(x)).
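For illustration, here is a minimal BCMath sketch of that route (my own code, not from the answer): the Taylor series for exp, the artanh series for log, and guard digits to absorb truncation. It assumes x > 0 and moderate arguments; convergence is slow when x is far from 1.

function bc_exp(string $x, int $scale = 20): string
{
    $g = $scale + 10; // guard digits
    $sum = '1';
    $term = '1';
    for ($i = 1; bccomp($term, '0', $g) !== 0; $i++) {
        $term = bcdiv(bcmul($term, $x, $g), (string)$i, $g); // x^i / i!
        $sum = bcadd($sum, $term, $g);
    }
    return bcadd($sum, '0', $scale); // truncate to the requested scale
}

function bc_ln(string $x, int $scale = 20): string
{
    $g = $scale + 10;
    // ln(x) = 2 * artanh(t) with t = (x-1)/(x+1)
    $t = bcdiv(bcsub($x, '1', $g), bcadd($x, '1', $g), $g);
    $t2 = bcmul($t, $t, $g);
    $sum = '0';
    $term = $t;
    for ($k = 1; bccomp($term, '0', $g) !== 0; $k += 2) {
        $sum = bcadd($sum, bcdiv($term, (string)$k, $g), $g); // t^k / k
        $term = bcmul($term, $t2, $g);
    }
    return bcmul('2', $sum, $scale);
}

function bc_powf(string $x, string $y, int $scale = 20): string
{
    return bc_exp(bcmul($y, bc_ln($x, $scale + 10), $scale + 10), $scale);
}

echo bc_powf('2', '0.123456'); // ~1.0893412745953, cf. pow(2, 0.123456)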

Related

What's the most efficient way of randomly picking a floating number within a specific range? [duplicate]


Why is my Python code 100 times slower than the same code in PHP?

I have two points (x1 and x2) and want to generate a normal distribution in a given step count. The sum of the y values for the x values between x1 and x2 is 1. To the actual problem:
I'm fairly new to Python and wonder why the following code produces the desired result but runs about 100x slower than the same program in PHP. There are about 2000 x1-x2 pairs and about 5 step values per pair.
I tried compiling with Cython and used multiprocessing, but that only improved things 2x, which is still 50x slower than PHP. Any suggestions on how to improve speed to at least match PHP's performance?
from scipy.stats import norm
import numpy as np
import time

# Calculates normal distribution
def calculate_dist(x1, x2, steps, slope):
    points = []
    range = np.linspace(x1, x2, steps + 2)
    for x in range:
        y = norm.pdf(x, x1 + ((x2 - x1) / 2), slope)
        points.append([x, y])
    sum = np.array(points).sum(axis=0)[1]
    norm_points = []
    for point in points:
        norm_points.append([point[0], point[1] / sum])
    return norm_points

start = time.time()
for i in range(0, 2000):
    for j in range(10, 15):
        calculate_dist(0, 1, j, 0.15)
print(time.time() - start)  # Around 15 seconds or so
Edit, PHP Code:
$start = microtime(true);
for ($i = 0; $i < 2000; $i++) {
    for ($j = 10; $j < 15; $j++) {
        $x1 = 0; $x2 = 1; $steps = $j; $slope = 0.15;
        $step = abs($x2 - $x1) / ($steps + 1);
        $points = [];
        for ($x = $x1; $x <= $x2 + 0.000001; $x += $step) {
            $y = stats_dens_normal($x, $x1 + (($x2 - $x1) / 2), $slope);
            $points[] = [$x, $y];
        }
        $sum = 0;
        foreach ($points as $point) {
            $sum += $point[1];
        }
        $norm_points = [];
        foreach ($points as &$point) {
            array_push($norm_points, [$point[0], $point[1] / $sum]);
        }
    }
}
return microtime(true) - $start; # Around 0.1 seconds or so
Edit 2: I profiled each line and found that norm.pdf() was taking 98% of the time, so I found a custom normpdf function and defined it; now the time is around 0.67s, which is considerably faster, but still around 10x slower than PHP. Also, I think redefining common functions goes against the idea of Python's simplicity?!
The custom function (source is some other Stackoverflow answer):
from math import sqrt, pi, exp

def normpdf(x, mu, sigma):
    u = (x - mu) / abs(sigma)
    y = (1 / (sqrt(2 * pi) * abs(sigma))) * exp(-u * u / 2)
    return y
The answer is: you aren't using the right tools/data structures for the task in Python.
Calling numpy functionality has quite an overhead (scipy.stats.norm.pdf uses numpy under the hood), so one would never call these functions for a single element, but rather for a whole array (so-called vectorized computation). That means instead of
for x in range:
    y = norm.pdf(x, x1+((x2-x1)/2), slope)
    ys.append(y)
one would rather use:
ys = norm.pdf(x,x1+((x2-x1)/2), slope)
calculating pdf for all elements in x and paying the overhead only once rather than len(x) times.
For example to calculate pdf for 10^4 elements takes less than 10 times more time than for one element:
%timeit norm.pdf(0) # 68.4 µs ± 1.62 µs
%timeit norm.pdf(np.zeros(10**4)) # 415 µs ± 12.4 µs
Using vectorized computation will not only make your program faster but often also shorter/easier to understand, for example:
def calculate_dist_vec(x1, x2, steps, slope):
    x = np.linspace(x1, x2, steps + 2)
    y = norm.pdf(x, x1 + ((x2 - x1) / 2), slope)
    ys = y / np.sum(y)
    return x, ys
Using this vectorized version gives you a speed-up of around 10x.
The problem: norm.pdf is optimized for long vectors (nobody really cares how fast/slow it is for 10 elements if it is very fast for one million elements), but your test is biased against numpy because it uses/creates only short arrays, so norm.pdf cannot shine.
So if it is really about small arrays and you are serious about speeding it up, you will have to roll your own version of norm.pdf. Using Cython to create this fast and specialized function might be worth a try.

How to optimise a Exponential Moving Average algorithm in PHP?

I'm trying to retrieve the last EMA of a large dataset (15000+ values). It is a very resource-hungry algorithm since each value depends on the previous one. Here is my code :
$k = 2 / ($range + 1);
$lastEMA = 0;
for ($i = 0; $i < $size_data; ++$i) {
    $lastEMA = $lastEMA + $k * ($data[$i] - $lastEMA);
}
What I already did:

- Isolate $k so it is not computed 10000+ times
- Keep only the latest computed EMA, rather than keeping all of them in an array
- Use for() instead of foreach()
- The $data[] array doesn't have keys; it's a basic array

This allowed me to reduce execution time from 2000ms to about 500ms for 15000 values!
What didn't work:

- Using SplFixedArray(); this shaved only ~10ms off a run over 1,000,000 values
- Using the PHP_Trader extension; it returns an array containing all the EMAs instead of just the latest, and it's slower

Writing the same algorithm in C# and running it over 2,000,000 values takes only 13ms! So obviously, using a compiled, lower-level language seems to help ;P
Where should I go from here? The code will ultimately run on Ubuntu, so which language should I choose? Will PHP be able to call and pass such a huge argument to the script?
Clearly, implementing it as an extension gives you a significant boost.
Additionally, the calculation itself can be improved, and that gain carries over to whichever language you choose.
It is easy to see that lastEMA can be calculated as follows:
$lastEMA = 0;
$k = 2 / ($range + 1);
for ($i = 0; $i < $size_data; ++$i) {
    $lastEMA = (1 - $k) * $lastEMA + $k * $data[$i];
}
This can be rewritten as follows in order to move as much as possible out of the loop:
$lastEMA = 0;
$k = 2 / ($range + 1);
$k1m = 1 - $k;
for ($i = 0; $i < $size_data; ++$i) {
    $lastEMA = $k1m * $lastEMA + $data[$i];
}
$lastEMA = $lastEMA * $k;
To explain the extraction of $k: in the previous formulation it is as if all the original raw data were multiplied by $k, so you can instead multiply the end result once.
Note that, rewritten this way, you have 2 operations inside the loop instead of 3 (to be precise, the loop also performs the $i increment, the comparison of $i with $size_data, and the assignment to $lastEMA), so you can expect an additional speedup in the range of 16% to 33%.
Furthermore, there are other improvements that can be considered, at least in some circumstances:
Consider only last values
The first values are multiplied several times by $k1m = 1 - $k, so their contribution may be small or even fall below the floating-point precision (or the acceptable error).
This idea is particularly helpful if you can assume that the older data are of the same order of magnitude as the newer, because if you consider only the last $n values, the error you make is
$err = $EMA_of_discarded_data * (1-$k) ^ $n.
So if the order of magnitude is broadly the same, we can say that the relative error is
$rel_err = $err / $lastEMA = $EMA_of_discarded_data * (1-$k) ^ $n / $lastEMA
which is almost equal to simply (1-$k) ^ $n.
Under the assumption that $lastEMA is almost equal to $EMA_of_discarded_data:
let's say that you can accept a relative error $rel_err;
then you can safely consider only the last $n values where (1 - $k)^$n < $rel_err.
That means you can pre-calculate (before the loop) $n = log($rel_err) / log(1-$k) and compute the EMA considering only the last $n values.
If the dataset is very big this can give a sensible speedup.
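As a sketch of that idea (my own illustration; it assumes $data, $range and an acceptable relative error $rel_err are given):

$k = 2 / ($range + 1);
$n = (int)ceil(log($rel_err) / log(1 - $k)); // values worth keeping
$start = max(0, count($data) - $n); // discard everything older
$lastEMA = 0;
for ($i = $start, $size = count($data); $i < $size; ++$i) {
    $lastEMA = (1 - $k) * $lastEMA + $k * $data[$i];
}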
Consider that for 64-bit floating point numbers you have a relative precision (related to the mantissa) of 2^-53 (about 1.1e-16; it is only 2^-24 = 5.96e-8 for 32-bit floating point numbers), so you cannot obtain a relative error better than this.
Basically, you should never have an advantage in calculating more than $n = log(1.1e-16) / log(1-$k) values.
To give an example, if $range = 2000 then $n = log(1.1e-16) / log(1-2/2001) = 36'746.
It is interesting to know that any extra calculations would simply be lost in the rounding: they are useless, so it is better not to do them.
Now an example for the case where you can accept a relative error larger than the floating-point precision. With $rel_err = 1ppm = 1e-6 = 0.0001% = 6 significant decimal digits, you have $n = log(1e-6) / log(1-2/2001) = 13'815.
I think that is quite a small number compared to your sample count, so in those cases the speedup should be evident (I'm assuming that $range = 2000 is meaningful or high for your application, but that I cannot know).
Just a few more numbers, because I do not know what your typical figures are:
$rel_err = 1e-3; $range = 2000 => $n = 6'907
$rel_err = 1e-3; $range = 200 => $n = 691
$rel_err = 1e-3; $range = 20 => $n = 69
$rel_err = 1e-6; $range = 2000 => $n = 13'815
$rel_err = 1e-6; $range = 200 => $n = 1'381
$rel_err = 1e-6; $range = 20 => $n = 138
If the assumption that $lastEMA is almost equal to $EMA_of_discarded_data cannot be made, things are less easy, but since the advantage can be significant it may be worthwhile to go on:
we need to reconsider the full formula: $rel_err = $EMA_of_discarded_data * (1-$k) ^ $n / $lastEMA
so $n = log($rel_err * $lastEMA / $EMA_of_discarded_data) / log (1-$k) = (log($rel_err) + log($lastEMA / $EMA_of_discarded_data)) / log (1-$k)
The central point is to estimate $lastEMA / $EMA_of_discarded_data (without actually calculating $lastEMA or $EMA_of_discarded_data, of course).
One case is when we know a priori that, for example, $EMA_of_discarded_data / $lastEMA < M (say M = 1000 or M = 1e6);
in that case $n < (log($rel_err/M)) / log (1-$k).
If you cannot give any such M,
you have to find a good way to over-estimate $EMA_of_discarded_data / $lastEMA;
one quick way could be to take M = max(data) / min(data).
Parallelization
The calculation can be re-written in a form where it is a simple addition of independent terms:
$lastEMA = 0;
$k = 2 / ($range + 1);
$k1m = 1 - $k;
for ($i = 0; $i < $size_data; ++$i) {
    $lastEMA += pow($k1m, $size_data - 1 - $i) * $data[$i]; // note: ^ is XOR in PHP, so pow() is needed here
}
$lastEMA = $lastEMA * $k;
So if the implementing language supports parallelization, the dataset can be divided into 4 (or 8, or n... basically the number of CPU cores available) chunks, the sum of terms on each chunk can be computed in parallel, and the individual results summed up at the end.
I do not go in detail with this since this reply is already terribly long and I think the concept is already expressed.
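To make the idea concrete, here is a rough sequential sketch of the chunked decomposition (my own illustration; each partial sum is independent, so in an environment with real parallelism the chunks could be farmed out to workers):

$k = 2 / ($range + 1);
$k1m = 1 - $k;
$size = count($data);
$total = 0.0;
foreach (array_chunk($data, (int)ceil($size / 4), true) as $chunk) { // true preserves the original keys
    $partial = 0.0;
    foreach ($chunk as $i => $v) {
        $partial += pow($k1m, $size - 1 - $i) * $v;
    }
    $total += $partial; // in a parallel run, collect these from the workers
}
$lastEMA = $k * $total;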
Building your own extension definitely improves performance. Here's a good tutorial from the Zend website.
Some performance figures. Setup: Ubuntu 14.04, PHP 5.5.9, 1-core Intel CPU @ 3.3GHz, 128MB RAM (it's a VPS).
- Before (PHP only, 16,000 values): 500ms
- C extension, 16,000 values: 0.3ms
- C extension, 100,000 values: 3.7ms
- C extension, 500,000 values: 28.0ms
But I'm memory limited at this point, using 70MB. I will fix that and update the numbers accordingly.

Is there a clever way to do this with pure math

I've got this spot of code that seems like it could be done more cleanly with pure math (perhaps logarithms?). Can you help me out?
The code finds the first power of 2 greater than a given input. For example, if you give it 500, it returns 9, because 2^9 = 512 > 500. 2^8 = 256, would be too small because it's less than 500.
function getFactor($iMaxElementsPerDir)
{
    $aFactors = range(128, 1);
    foreach ($aFactors as $i => $iFactor) {
        if ($iMaxElementsPerDir > pow(2, $iFactor) - 1) {
            break;
        }
    }
    if ($i == 0) {
        return false;
    }
    return $aFactors[$i - 1];
}
The following holds true
getFactor(500) = 9
getFactor(1000) = 10
getFactor(2500) = 12
getFactor(5000) = 13
You can get the same effect by shifting the bits in the input to the right and checking against 0. Something like this:

i = 1
while ((input >> i) != 0)
    i++
return i
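In PHP, that pseudocode would look something like this (a direct transliteration; it assumes a positive integer input):

function getFactorByShift($input)
{
    $i = 1;
    while (($input >> $i) != 0) {
        $i++;
    }
    return $i;
}

echo getFactorByShift(500); // 9
echo getFactorByShift(5000); // 13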
The same as Jack's answer, but shorter. The base-2 logarithm is the inverse function of 2^x.
echo ceil(log(500, 2));
If you're looking for a "math only" solution (that is a single expression or formula), you can use log() and then take the ceiling value of its result:
$factors = ceil(log(500) / log(2)); // 9
$factors = ceil(log(5000) / log(2)); // 13
I seem to have not noticed that this function accepts a second argument (since PHP 4.3) with which you can specify the base; though internally the same operation is performed, it does indeed make the code shorter:
$factors = ceil(log(500, 2)); // 9
To factor in some inaccuracies, you may need some tweaking:
$factors = floor(log($nr - 1, 2)) + 1;
There are a few ways to do this.

1. Zero all but the most significant bit of the number, maybe like this:
while (x & x-1) x &= x-1;
and look the answer up in a table. Use a table of length 67 and mod your power of two by 67. (A PHP sketch follows this list.)
2. Binary search for the high bit.
3. If you're working with a floating-point number, inspect the exponent field. This field contains 1023 plus your answer, except in the case where the number is a perfect power of two. You can detect the perfect-power case by checking whether the significand field is exactly zero.
4. If you aren't working with a floating-point number, convert it to floating-point and look at the exponent as in 3. Check for a power of two by testing (x & x-1) == 0 instead of looking at the significand; this is true exactly when x is a power of two.
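Here is a sketch of approach 1 in PHP (my own illustration; the lookup table is built at runtime, and the input is assumed to be a positive integer below 2^63):

function floorLog2(int $x): int
{
    static $table = null;
    if ($table === null) {
        $table = [];
        for ($k = 0; $k < 63; $k++) {
            $table[(1 << $k) % 67] = $k; // 2 is a primitive root mod 67, so the residues are distinct
        }
    }
    while ($x & ($x - 1)) {
        $x &= $x - 1; // clear the lowest set bit until only the MSB remains
    }
    return $table[$x % 67];
}

echo floorLog2(500); // 8, so the first power of 2 greater than 500 is 2^9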
Note that log(2^100) is the same double as log(nextafter(2^100, 1.0/0.0)), so any solution based on floating-point natural logarithms will fail.
Here's (nonconformant C++, not PHP) code for 4:
int ceillog2(unsigned long long x) {
    if (x < 2) return x - 1;
    double d = x - 1;
    int ans = (long long &)d >> 52;
    return ans - 1022;
}

Random Float between 0 and 1 in PHP

How does one generate a random float between 0 and 1 in PHP?
I'm looking for the PHP's equivalent to Java's Math.random().
You may use the standard function: lcg_value().
Here's another function given on the rand() docs:
// auxiliary function
// returns random number with flat distribution from 0 to 1
function random_0_1()
{
    return (float)rand() / (float)getrandmax();
}
Example from the documentation:
function random_float($min, $max) {
    return ($min + lcg_value() * (abs($max - $min)));
}
rand(0,1000)/1000 returns:
0.348 0.716 0.251 0.459 0.893 0.867 0.058 0.955 0.644 0.246 0.292
or use a bigger number if you want more digits after the decimal point
class SomeHelper
{
    /**
     * Generate random float number.
     *
     * @param float|int $min
     * @param float|int $max
     * @return float
     */
    public static function rand($min = 0, $max = 1)
    {
        return ($min + ($max - $min) * (mt_rand() / mt_getrandmax()));
    }
}
Update: forget this answer, it doesn't work with PHP > 5.3.
What about
floatVal('0.'.rand(1, 9));
?
This works perfectly for me, and it's not only for 0 - 1; for example, between 1.0 and 15.0:
floatVal(rand(1, 15) . '.' . rand(1, 9));
function mt_rand_float($min, $max, $countZero = '0') {
    $countZero = +('1' . $countZero);
    $min = floor($min * $countZero);
    $max = floor($max * $countZero);
    $rand = mt_rand($min, $max) / $countZero;
    return $rand;
}
example:
echo mt_rand_float(0, 1);
result: 0.2
echo mt_rand_float(3.2, 3.23, '000');
result: 3.219
echo mt_rand_float(1, 5, '00');
result: 4.52
echo mt_rand_float(0.56789, 1, '00');
result: 0.69
$random_number = rand(1,10).".".rand(1,9);
function frand($min, $max, $decimals = 0) {
    $scale = pow(10, $decimals);
    return mt_rand($min * $scale, $max * $scale) / $scale;
}

echo "frand(0, 10, 2) = " . frand(0, 10, 2) . "\n";
This question asks for a value from 0 to 1. For most mathematical purposes this is subtly invalid. The standard convention is 0 <= N < 1. You should consider whether you really want something inclusive of 1.
Many implementations that do this absent-mindedly produce an anomalous result about one in a couple of billion times. This becomes obvious if you think about performing the operation backwards.
(int)(random_float() * 10) would return a value from 0 to 9 with an equal chance of each value. If one in a billion times it can return 1, then very rarely it will return 10 instead.
Some people would fix this after the fact (deciding that 10 should be 9). Multiplying by 2 should give around a ~50% chance of 0 or 1, but will also have a ~0.000000000465% chance of returning a 2, like in Bender's dream.
Saying 0 to 1 as a float is a bit like mistakenly saying 0 to 10 instead of 0 to 9 as ints when you want ten values starting at zero. Because of the broad range of possible float values, it's more like accidentally saying 0 to 1000000000 instead of 0 to 999999999.
With 64 bits it's exceedingly rare to overflow, but some random functions are 32-bit internally, so it's not implausible for that one-in-two-and-a-half-billion chance to occur.
The standard solution would instead be something like this:
mt_rand() / (mt_getrandmax() + 1)
There can also be small, usually insignificant differences in distribution. For example, between 0 and 9 you might find 0 is slightly more likely than 9 due to precision, but this will typically be around the billionth place and is not as severe as the above issue, which can produce an invalid, unexpected out-of-bounds figure for a calculation that would otherwise be flawless.
Java's Math.random will also never produce a value of 1. Part of that is that it is a mouthful to explain specifically what it does: it returns a value from 0 to less than one. It's Zeno's arrow: it never reaches 1. This isn't something someone would conventionally say; instead, people tend to say "between 0 and 1" or "from 0 to 1", but those are false.
This is somewhat a source of amusement in bug reports. For example, any PHP code using lcg_value without consideration for this may glitch approximately one in a couple of billion times, if it holds true to its documentation, but that makes it painfully difficult to faithfully reproduce.
This kind of off-by-one error is one of the common sources of "just turn it off and on again" issues typically encountered in embedded devices.
Solution for PHP 7. Generates a random number in [0,1), i.e. it includes 0 and excludes 1.
function random_float() {
    return random_int(0, 2**53 - 1) / (2**53);
}
Thanks to Nommyde in the comments for pointing out my bug.
>>> number_format((2**53-1)/2**53,100)
=> "0.9999999999999998889776975374843459576368331909179687500000000000000000000000000000000000000000000000"
>>> number_format((2**53)/(2**53+1),100)
=> "1.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
Most answers are using mt_rand. However, mt_getrandmax() usually returns only 2147483647. That means you only have 31 bits of information, while a double has a 52-bit mantissa, which allows for a density of 2^53 representable numbers between 0 and 1.
This more complicated approach will get you a finer distribution:
function rand_754_01() {
    // Generate 64 random bits (8 bytes)
    $entropy = openssl_random_pseudo_bytes(8);
    // Create a string of 12 '0' bits and 52 '1' bits.
    $x = 0x000FFFFFFFFFFFFF;
    $first12 = pack("Q", $x);
    // Set the first 12 bits to 0 in the random string.
    $y = $entropy & $first12;
    // Now set the first 12 bits to be 0[exponent], where exponent is randomly chosen between 1 and 1022.
    // Here $e has a probability of 0.5 to be 1022, 0.25 to be 1021, etc.
    $e = 1022;
    while ($e > 1) {
        if (mt_rand(0, 1) == 0) {
            break;
        } else {
            --$e;
        }
    }
    // Pack the exponent properly (add four '0' bits behind it and 49 more in front)
    $z = "\0\0\0\0\0\0" . pack("S", $e << 4);
    // Now convert to a double.
    return unpack("d", $y | $z)[1];
}
Please note that the above code only works on 64-bit machines with a Little-Endian byte order and Intel-style IEEE 754 representation (x64-compatible computers will have this). Unfortunately, PHP does not allow bit-shifting past int32-sized boundaries, so you have to write a separate function for Big-Endian.
You should replace this line:
$z = "\0\0\0\0\0\0" . pack("S", $e << 4);
with its big-endian counterpart:
$z = pack("S", $e << 4) . "\0\0\0\0\0\0";
The difference is only notable when the function is called a large number of times: 10^9 or more.
Testing if this works
It should be obvious that the mantissa follows a nice approximately uniform distribution, but it's less obvious that a sum of a large number of such distributions (each with cumulatively halved chance and amplitude) is uniform.
Running:
function randomNumbers() {
    $f = 0.0;
    for ($i = 0; $i < 1000000; ++$i) {
        $f += rand_754_01();
    }
    echo $f / 1000000;
}
Produces an output of 0.49999928273099 (or a similar number close to 0.5).
I found the answer on PHP.net
<?php
function randomFloat($min = 0, $max = 1) {
    return $min + mt_rand() / mt_getrandmax() * ($max - $min);
}
var_dump(randomFloat());
var_dump(randomFloat(2, 20));
?>
float(0.91601131712832)
float(16.511210331931)
So you could do
randomFloat(0,1);
or simply
mt_rand() / mt_getrandmax() * 1;
What about:
echo (float)('0.' . rand(0, 99999));
It would probably work fine... hope it helps you.
