The usual advice for handling money and other decimal numbers where accuracy is crucial is to use either integers or strings (plus arbitrary precision libraries), and it makes sense if you understand how floating point maths works. However, I don't have at hand any specific example to illustrate this, as every wrong calculation I've spotted in the wild was due to some other mistake: naive comparisons using ==, lack of proper rounding when displaying results, blatantly wrong logic (e.g. calculating taxes with an inconsistent algorithm that also doesn't work on paper)... I've done some research and the results either only apply to C/C++ (float/double having different precision) or were mere elaborations on why you can't trust two floats to be equal.
Can you share a self-contained PHP code snippet with carefully selected floating point figures and a correct algorithm that renders an incorrect result explicitly caused by floating point limitations?
Disclaimer: I don't intend to argue, refute or debunk anything, I honestly need an example for my toolbelt.
Things can break fairly easily with much less than a billion iterations. The thing is, by using floats and arithmetic you can very easily find yourself with unexpected results, and even if numbers superficially look fine, the subtle imprecisions can lead to an application bugging out.
Let's try a variation of the example in your answer:
$total = 0.0;
for ($i = 0; $i < 10; $i++) {
$total += 0.1;
}
echo "added ten cents, ten times\n";
// since we added 0.1 € x 10 times, we now have 1€ in total, right?
if ($total == 1) {
echo "I have 1€. All is good in the realm.";
}
else {
echo "WTF? Where is my money? I only have {$total}€!!!!\n";
echo "\$total holds: ";
var_dump($total);
}
The output for the above is:
added ten cents, ten times
WTF? Where is my money? I only have 1€!!!!
$total holds: float(1)
Even if $total appears to be float(1), the code follows the 'wrong' branch of execution, breaking our application.
If we execute the same code in PHP8 (beta so far), you'll get an easier to understand result:
added ten cents, ten times
WTF? Where is my money? I only have 1€!!!!
$total holds: float(0.9999999999999999)
Another simple example:
$balance = 50.03;
$debit = 45.42;
$expected_balance = 4.61;
$real_balance = $balance - $debit;
if ($real_balance !== $expected_balance) {
echo "balance mismatch: ";
var_dump($real_balance);
}
The output for the above is:
balance mismatch: float(4.61)
or, in PHP8:
balance mismatch: float(4.609999999999999)
Either of the above examples shows that, in practice, using floating point numbers to do (specifically) money arithmetic can be problematic. Since the results no longer match your expectations, not only can it lead to plainly wrong results, but the subtly different results can make the whole application behave in unexpected ways.
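For contrast, here is a minimal sketch of the same two checks with the amounts held as integer cents instead of floats. The variable names are only illustrative, and it assumes every amount fits in whole cents:

```php
<?php
// Sketch: amounts are stored as integer cents, so no binary fractions are involved.
$total = 0;
for ($i = 0; $i < 10; $i++) {
    $total += 10; // ten cents, as an integer
}
var_dump($total === 100); // bool(true)

$balance  = 5003; // 50.03 expressed in cents
$debit    = 4542; // 45.42 expressed in cents
$expected = 461;  // 4.61 expressed in cents
var_dump($balance - $debit === $expected); // bool(true)
```

Integer addition and subtraction are exact (as long as the values stay within the platform's integer range), so both comparisons take the expected branch.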
Examples and results, here.
echo floor((0.1 + 0.7) * 10);
Expected result: 0.1 + 0.7 = 0.8; 0.8 * 10 = 8;
Result: 7
Tested on PHP 7.2.12
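A small sketch of why this happens and one possible way around it: the product is stored as a float slightly below 8, so floor() drops to 7, while rounding to the nearest integer gives the value you would expect on paper. Whether rounding is appropriate depends on the calculation; this is only an illustration.

```php
<?php
$product = (0.1 + 0.7) * 10;

// floor() sees a value just under 8 and truncates it.
var_dump(floor($product));       // float(7)

// round() goes to the nearest integer instead.
var_dump((int) round($product)); // int(8)
```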
The question makes little sense because that isn't how floating point errors work.
Inaccuracies are tiny. They happen in remote decimals and they're only noticeable when you require very high precision levels. After all, IEEE 754 powers the vast majority of computer systems and it offers excellent precision. To put it in context, 0.1 kilometres stored as a single-precision float is 0.100000001490116119384765625, which makes it accurate to within about 1.5 µm, if I didn't get my maths wrong, and PHP floats are double precision, which is far more accurate still.
There probably isn't a set of carefully chosen figures and a real-life calculation you'd be expected to use PHP for (an invoice, a stock exchange index...) that renders incorrect results no matter how careful you are with precision levels. Because that's not the problem.
The problem with floating point maths is that it forces you to be extremely careful on every step and it makes it very easy for bugs to slip in.
For applications where accuracy matters, you can write correct software using floats, but it won't be as easy, maintainable or robust.
Original answer:
This is the best I've got so far (thanks to chtz for the hint):
// Set-up and display settings (shouldn't affect internal calculations or final result)
set_time_limit(0);
ini_set('precision', -1);
// Expected accuracy: 2 decimal positions
$total = 0;
for ($i = 0; $i < 1e9; $i++) {
$total += 0.01;
// It's important to NOT round inside the loop, e.g.: $total = round($total + 0.01, 2);
}
var_dump($total, number_format($total, 2));
float(9999999.825158669)
string(12) "9,999,999.83" // Correct value would be "10,000,000.00"
Unfortunately, it relies on the accumulation of a very large number of precision errors (it needs around 1,000,000,000 of them to happen and it takes more than 4 minutes to run on my PC), so it isn't as real-life as I would have liked, but it certainly illustrates the underlying issue.
I'm working on a system where I need to round financial payments down to the nearest penny. Naively I thought I would multiply up by 100, take the floor and then divide back down. However the following example is misbehaving:
echo 1298.34*100;
correctly shows:
129834
but
echo floor(1298.34*100);
unexpectedly shows:
129833
I get the same problem using intval for example.
I suspect the multiplication is falling foul of floating point rounding. But if I can't rely on multiplication, how can I do this? I always want to round down reliably, and I don't need to take negative amounts into consideration.
To be clear, I want any fractional penny amounts to be stripped off:
1298.345 should give 1298.34
1298.349 should give 1298.34
1298.342 should give 1298.34
Since you mention you only use this for displaying purposes, you could take the amount, turn it into a string and truncate anything past the second decimal. A regular expression could do the job:
preg_match('/\d+\.{0,1}\d{0,2}/', (string) $amount, $matches);
This expression works with any number of decimals (including zero). How it works in detail:
\d+ matches one or more digits
\.{0,1} matches 0 or 1 literal dot
\d{0,2} matches zero, one or two digits after the dot
You can run the following code to test it:
$amounts = [
1298,
1298.3,
1298.34,
1298.341,
1298.349279745,
];
foreach ($amounts as $amount) {
preg_match('/\d+\.{0,1}\d{0,2}/', (string) $amount, $matches);
var_dump($matches[0]);
}
Also available as a live test in this fiddle.
You can use round() to round to the required precision, and with the expected behavior when rounding the final 5 (which is another financial hurdle you might encounter).
$display = round(3895.0 / 3.0, 2);
Also, as a reminder, I have the habit of always writing floating point integers with a final dot or a ".0". This prevents some languages from inferring the wrong type and doing, say, integer division, in which case 5 / 3 would yield 1.
If you need a "custom rounding" and want to be sure, well, the reason it didn't work is because not all floating point numbers exist in machine representation. 1298.34 does not exist; what does exist (I'm making the precise numbers up!) in its place might be 1298.33999999999999124.
So when you multiply it by 100 and get 129833.999999999999124, of course truncating it will yield 129833.
What you need to do then is to add a small quantity that must be enough to cover the machine error but not enough to matter in the financial calculation. There is an algorithm to determine this quantity, but you can probably get away with "one thousandth after upscaling".
So:
$display = floor((3895.0 / 3.0)*100.0 + 0.001);
Please be aware that this number, which you will "see" as 1234.56, might again not exist precisely. It might really be 1234.5600000000000123 or 1234.559999999999876. This might have consequences in complex, composite calculations.
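As a rough sketch of the "add a small quantity after upscaling" idea applied to the amounts from the question: floorToPennies() is a hypothetical helper name, the 0.001 margin is the one suggested above, and the sketch assumes non-negative amounts with no more than three decimal places (an input such as 1298.34999 would be pushed up to the next penny by the margin).

```php
<?php
// Hypothetical helper: returns the amount in whole pennies, rounded down.
function floorToPennies(float $amount): int
{
    // The 0.001 margin absorbs the representation error of $amount * 100.
    return (int) floor($amount * 100.0 + 0.001);
}

foreach ([1298.34, 1298.345, 1298.349, 1298.342] as $amount) {
    var_dump(floorToPennies($amount)); // int(129834) each time
}
```

Returning an integer number of pennies, rather than dividing back down by 100, avoids reintroducing a float that may again not be exactly representable.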
Since you're working with financial data, you should use some kind of Money library (https://github.com/moneyphp/money). Almost all other solutions are asking for trouble.
Other ways, which I don't recommend, are: a) use integers only, b) calculate with bcmath or c) use Number class from the Money library e.g.:
function getMoneyValue($value): string
{
if (!is_numeric($value)) {
throw new \RuntimeException(sprintf('Money value has to be a numeric value, "%s" given', is_object($value) ? get_class($value) : gettype($value)));
}
$number = \Money\Number::fromNumber($value)->base10(-2);
return $number->getIntegerPart();
}
The other function available is round(), which takes two parameters -
the number to round, and the number of decimal places to round to. If
a number is exactly half way between two integers, round() will always
round up.
use round :
echo round (1298.34*100);
result :
129834
I'm trying to convert sporting odds from decimal to fraction. I have found via a search a PHP function which works well, but certain decimals cause problems, such as 2.1, which maxes out the server:
function dec2frac($dec) {
$decBase = --$dec;
$div = 1;
do {
$div++;
$dec = $decBase * $div;
} while (intval($dec) != $dec);
if ($dec % $div == 0) {
$dec = $dec / $div;
$div = $div / $div;
}
return $dec.'/'.$div;
}
$decimal = 2.3;
echo $decimal.' --> '.dec2frac($decimal);
Decimal odds of 6 should give 5/1. This is calculated as 6-1=5, i.e. 5/1
I have found that decimal input of 2.2 and 2.3 trips the function up but other values seem to be ok. What is causing this anomaly, is there way around it?
Thanks.
This problem consists of two separate steps
Create a fraction from a decimal number
Convert between betting odds and fractions
Let's start with the latter: A betting odd of 5 means, for every $1 invested, you get $5 if you win. Since you invested $1, your actual win is just $4. So the odds are 4-1 or 4/1
Analogously, betting odds of 2.5 mean that for every $1 you invest, you win $1.5, giving you 1.5-1 or 3-2 or 3/2
This leads us to the conclusion, that what we need is the fraction of ($odds-1)
Next part: Fractionizing. I didn't analyze the given algorithm, but wrote a very bad (but easily readable) one:
function dec2frac($val) {
//first pump denominator up
$tmp=strstr("$val",'.');
if ($tmp) $tmp=strlen($tmp)-1;
else $tmp=0;
$n=$val;
$d=1;
for (;$tmp>0;$tmp--) {
$n*=10;
$d*=10;
}
$n=intval(round($n));
$d=intval(round($d));
//Now shorten the fraction
//Find limit for pseudoprime search
$min=$n;
if ($d<$n) $min=$d;
$min=ceil($min/2);
if (ceil($d/2)>$min) $min=ceil($d/2);
if (ceil($n/2)>$min) $min=ceil($n/2);
$pseudoprime=2;
while ($pseudoprime<=$min) {
//Shorten by current pseudoprime as long as possible
while (true) {
$nn=$n/$pseudoprime;
if ($nn!=round($nn)) break;
$dd=$d/$pseudoprime;
if ($dd!=round($dd)) break;
$n=intval($nn);
$d=intval($dd);
}
//Move on to next pseudoprime
$pseudoprime+=($pseudoprime==2)?1:2;
if ($pseudoprime>3)
if (($pseudoprime/3)==floor($pseudoprime/3)) $pseudoprime+=2;
}
return "$n/$d";
}
This was tested to work with the values 0.25, 2.5, 3.1, 3.14, 3.141, 3.1415, 3.14159 and 3.141592.
The very unoptimized nature of the algorithm is a less important limit, as betting odds tend to have not very many decimal digits.
Together with
function odds2fract($odds) {
return dec2frac($odds-1);
}
derived from the other step, we get successful conversion of
5 --> 4/1
2.5 --> 3/2
2.1 --> 11/10
2.2 --> 6/5
Edit
The original version had a bug in the search limit calculation, which caused some fractions (e.g. completely reducible ones) not to be shortened. The updated version fixes this.
Edit 2
Another bug fixed: failing to round() the values obtained in the first step before intval()ing them gave wrong results for fractions that are poorly represented in floating point. Fixed by applying the missing round()
The first thing that just immediately triggers is that you used '.1' in there. This suggests, to me, that we're dealing with floating point problems... check your code... yeah, looks like floating point problems to me. The problem here is that your numbers are stored in binary, and there's no convenient way for binary to use a lot of decimal values. When you start using simple math, you're not getting exact results, you're getting shortcut estimations.
Floating Point numbers in PHP
That link should give you a detailed run-down on what's going wrong, links to more pages that will give you still more information, and links to libraries that exist to fix it.
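One way to sidestep the binary representation issue entirely is to never turn the odds into a float. The following is a minimal sketch, not the method used in any of the answers above: it assumes the decimal odds arrive as a string (e.g. from a form or database) and are at least 1, and the names gcd() and oddsToFraction() are made up for illustration.

```php
<?php
// Hypothetical helper: greatest common divisor via Euclid's algorithm.
function gcd(int $a, int $b): int
{
    while ($b !== 0) {
        [$a, $b] = [$b, $a % $b];
    }
    return $a;
}

// Hypothetical helper: "2.2" -> "6/5", using integer arithmetic only.
// Assumes $odds is a plain decimal string >= 1, such as "5" or "2.25".
function oddsToFraction(string $odds): string
{
    $parts = explode('.', $odds, 2);
    $whole = $parts[0];
    $frac  = $parts[1] ?? '';

    $denominator = 10 ** strlen($frac);                   // "2.25" -> 100
    $numerator   = (int) ($whole . $frac) - $denominator; // 225 - 100 = 125

    $divisor = gcd($numerator, $denominator);
    return intdiv($numerator, $divisor) . '/' . intdiv($denominator, $divisor);
}

foreach (['5', '2.5', '2.1', '2.2', '6'] as $odds) {
    echo $odds, ' --> ', oddsToFraction($odds), "\n";
}
```

Because the digits are read straight from the string, values such as 2.1 and 2.2 never pass through a binary float, and the reduction step is a plain gcd rather than a trial division loop.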
What I'm trying to do isn't exactly a Gaussian distribution, since it has a finite minimum and maximum. The idea is closer to rolling X dice and counting the total.
I currently have the following function:
function bellcurve($min=0,$max=100,$entropy=-1) {
$sum = 0;
if( $entropy < 0) $entropy = ($max-$min)/15;
for($i=0; $i<$entropy; $i++) $sum += rand(0,15);
return floor($sum/(15*$entropy)*($max-$min)+$min);
}
The idea behind the $entropy variable is to try and roll enough dice to get a more even distribution of fractional results (so that flooring it won't cause problems).
It doesn't need to be a perfect RNG, it's just for a game feature and nothing like gambling or cryptography.
However, I ran a test over 65,536 iterations of bellcurve() with no arguments, and the following graph emerged:
(source: adamhaskell.net)
As you can see, there are a couple of values that are "offset", and drastically so. While overall it doesn't really affect that much (at worst it's offset by 2, and ignoring that the probability is still more or less where I want it), I'm just wondering where I went wrong.
Any additional advice on this function would be appreciated too.
UPDATE: I fixed the problem above just by using round instead of floor, but I'm still having trouble getting a good function for this. I've tried pretty much every function I can think of, including gaussian, exponential, logistic, and so on, but to no avail. The only method that has worked so far is this approximation of rolling dice, which is almost certainly not what I need...
If you are looking for a bell curve distribution, generate multiple random numbers and add them together. If you are looking for more modifiers, simply multiply them to the end result.
Generate a random bell curve number, with a bonus of 50% - 150%.
array_sum([rand(0,15), rand(0,15), rand(0,15)]) * (rand(1,3)/2)
Though if you're concerned about rand not providing random enough numbers, you can use mt_rand, which has a much better distribution (it uses the Mersenne Twister)
The main issue turned out to be that I was trying to generate a continuous bell curve based on a discrete variable. That's what caused holes and offsets when scaling the result.
The fix I used for this was: +rand(0,1000000)/1000000 - it essentially takes the whole number discrete variable and adds a random fraction to it, more or less making it continuous.
The function is now:
function bellcurve() {
$sum = 0;
$entropy = 6;
for($i=0; $i<$entropy; $i++) $sum += rand(0,15);
return ($sum+rand(0,1000000)/1000000)/(15*$entropy);
}
It returns a float between 0 and roughly 1 (the maximum is actually 91/90, because the random fraction is added on top of the largest possible sum of 90, though the extremes are extremely unlikely), which can then be scaled and rounded as needed.
Example usage:
$damage *= bellcurve()+0.5; // adjusts $damage by a random amount
// between 50% and 150%, weighted in favour of 100%
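As an alternative sketch (a different technique from the dice-plus-fraction fix above, not the author's function), you can average several continuous uniform samples. The result is bell-shaped and stays within 0 and 1 by construction. The name bellcurveUnit() and the default of 6 samples are assumptions chosen for illustration.

```php
<?php
// Hypothetical helper: averages $samples uniform values in [0, 1].
// More samples give a narrower, more bell-like distribution.
function bellcurveUnit(int $samples = 6): float
{
    $sum = 0.0;
    for ($i = 0; $i < $samples; $i++) {
        $sum += mt_rand() / mt_getrandmax(); // uniform value in [0, 1]
    }
    return $sum / $samples;
}

$damage = 40;
$damage *= bellcurveUnit() + 0.5; // multiplier between 0.5 and 1.5, centred on 1
```

Since each sample is already a fraction, there is no discrete grid to leave holes when the result is scaled and rounded.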
I'm adding together two numerical strings $a and $b and then comparing the result against another numerical string $c. All three numbers are stored as strings, and are converted to floats by PHP at the comparison step.
For some reason, the test $a+$b == $c does not evaluate as true, even though it should.
You can recreate the problem with this script:
<?php
$a = "-111.11";
$b = "-22.22";
$c = "-133.33";
echo '$a is '.$a."\n";
echo '$b is '.$b."\n";
echo '$c is '.$c."\n";
echo '$a + $b is '.($a+$b). "\n";
if ($a + $b == $c) {
echo 'a + b equals c'."\n";
} else {
echo 'a + b does not equal c'."\n";
}
?>
Weirdly, if I change the values slightly so that $a=-111.11, $b=-22.23 and $c=-133.34 it works as expected.
Am I missing something obvious, or is this a bug with PHP?
From the large red box on this page: http://php.net/manual/en/language.types.float.php
never compare floating point numbers for equality.
Basically, you're not getting the correct numbers, because they are saved in a slightly different format, so when you compare, it gets screwed.
That link from @Corbin is really good, so I'm adding it just for the love :)
http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html
What Every Computer Scientist Should Know About Floating-Point Arithmetic
This paper presents a tutorial on those aspects of floating-point that
have a direct impact on designers of computer systems. It begins with
background on floating-point representation and rounding error,
continues with a discussion of the IEEE floating-point standard, and
concludes with numerous examples of how computer builders can better
support floating-point.
You're running into a limitation of floating point arithmetic. Just as there are certain numbers you can't represent exactly in decimal (1/3 for instance), so there are certain numbers you can't represent exactly in floating point binary.
You should never try and compare floating point numbers for equality, as the limitations of floating point make it unlikely that the variables you're comparing have an actual value that matches exactly the value you think they have. You need to add a "fudge factor", that is if the two numbers are similar to within a certain tolerance, then you should consider them to be equal.
You can do this by subtracting one number from another and seeing if the absolute result is below your threshold (in my example, 0.01):
if (abs ($someFloatingPointNumber - $someOtherFloatingPointNumber) <= 0.01)
{
// The values are close enough to be considered equal
}
Of course, this combined with rounding errors that can creep in with successive mathematical operations mean that floating point numbers are often not the best choice anyway, and should be avoided where possible. For example, if you're dealing with currency, store your values as integers in the minor unit (pennies for GBP, cents for USD, etc), and only convert to the major unit by dividing by 100 for display.
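Applied to the values from the question, the tolerance check could be wrapped in a small helper. This is only a sketch: floatsEqual() is a hypothetical name and the default tolerance of 0.00001 is an assumption that suits two-decimal amounts, not a universal constant.

```php
<?php
// Hypothetical helper: treats two floats as equal if they differ by at most $tolerance.
function floatsEqual(float $x, float $y, float $tolerance = 0.00001): bool
{
    return abs($x - $y) <= $tolerance;
}

$a = "-111.11";
$b = "-22.22";
$c = "-133.33";

$sum = (float) $a + (float) $b;

// This is the direct comparison that the question reports as failing.
var_dump($sum == (float) $c);

// The tolerance-based comparison accepts the tiny representation error.
var_dump(floatsEqual($sum, (float) $c)); // bool(true)
```

The tolerance should be well below the smallest difference that matters to you (here, one cent) and well above the accumulated rounding error.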
Do your numbers always have two decimal places?
If so, you can try this:
$aDec = round($a * 100);
$bDec = round($b * 100);
$cDec = round($c * 100);
if ($aDec + $bDec == $cDec) {
...
}
Because the float data type in PHP is inaccurate, and a FLOAT in MySQL takes up more space than an INT (and is inaccurate), I always store prices as INTs, multiplying by 100 before storing to ensure we have exactly 2 decimal places of precision. However I believe PHP is misbehaving. Example code:
echo "<pre>";
$price = "1.15";
echo "Price = ";
var_dump($price);
$price_corrected = $price*100;
echo "Corrected price = ";
var_dump($price_corrected);
$price_int = intval(floor($price_corrected));
echo "Integer price = ";
var_dump($price_int);
echo "</pre>";
Produced output:
Price = string(4) "1.15"
Corrected price = float(115)
Integer price = int(114)
I was surprised. When the final result was lower than expected by 1, I was expecting the output of my test to look more like:
Price = string(4) "1.15"
Corrected price = float(114.999999999)
Integer price = int(114)
which would demonstrate the inaccuracy of the float type. But why is floor(115) returning 114??
Try this as a quick fix:
$price_int = intval(floor($price_corrected + 0.5));
The problem you are experiencing is not PHP's fault; all programming languages that represent real numbers with floating point arithmetic have similar issues.
The general rule of thumb for monetary calculations is to never use floats (neither in the database nor in your script). You can avoid all kinds of problems by always storing the cents instead of dollars. The cents are integers, and you can freely add them together, and multiply by other integers. Whenever you display the number, make sure you insert a dot in front of the last two digits.
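A minimal sketch of that approach, converting a price string to cents and back without ever creating a float. Both helper names are hypothetical, and parseCents() assumes a non-negative amount written with a dot and at most two decimals (extra decimals are simply cut off).

```php
<?php
// Hypothetical helper: "1.15" -> 115, "3" -> 300, "0.5" -> 50.
// Assumes a non-negative decimal string; digits past the second decimal are dropped.
function parseCents(string $price): int
{
    [$whole, $frac] = array_pad(explode('.', $price, 2), 2, '');
    $frac = str_pad(substr($frac, 0, 2), 2, '0'); // "5" -> "50", "" -> "00"
    return (int) $whole * 100 + (int) $frac;
}

// Hypothetical helper: 115 -> "1.15", 5 -> "0.05", -461 -> "-4.61".
function formatCents(int $cents): string
{
    $sign = $cents < 0 ? '-' : '';
    $abs  = abs($cents);
    return sprintf('%s%d.%02d', $sign, intdiv($abs, 100), $abs % 100);
}

var_dump(parseCents("1.15"));                   // int(115)
var_dump(formatCents(parseCents("1.15") * 3));  // string(4) "3.45"
```

All arithmetic between the two calls happens on integers, so there is no hidden 114.999... for floor() or intval() to trip over.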
The reason why you are getting 114 instead of 115 is that floor rounds down to the nearest integer below, thus floor(114.999999999) becomes 114. The more interesting question is why 1.15 * 100 is 114.999999999 instead of 115. The reason is that the float 1.15 is not exactly 115/100 but very slightly less, so if you multiply it by 100, you get a number a tiny bit smaller than 115.
Here is a more detailed explanation what echo 1.15 * 100; does:
It parses 1.15 to a binary floating point number. This involves rounding; it happens to round down a little bit to get the binary floating point number nearest to 1.15. The reason why you cannot get an exact number (without rounding error) is that 1.15 has an infinite number of digits in base 2.
It parses 100 to a binary floating point number. This involves rounding, but since 100 is a small integer, the rounding error is zero.
It computes the product of the previous two numbers. This also involves a little rounding, to find the nearest binary floating point number. The result ends up slightly below 115.
It converts the binary floating point number to a base 10 decimal number with a dot, and prints this representation. This also involves a little rounding.
The reason why PHP prints the surprising Corrected price = float(115) (instead of 114.999...) is that var_dump doesn't print the exact number (!), but it prints the number rounded to n - 2 (or n - 1) digits, where n digits is the precision of the calculation. You can easily verify this:
echo 1.15 * 100; # this prints 115
printf("%.30f", 1.15 * 100); # you get 114.999....
echo 1.15 * 100 == 115.0 ? "same" : "different"; # this prints `different'
echo 1.15 * 100 < 115.0 ? "less" : "not-less"; # this prints `less'
If you are printing floats, remember: you don't always see all digits when you print the float.
See also the big warning near the beginning of the PHP float docs.
The other answers have covered the cause and a good workaround to the problem, I believe.
To aim at fixing the problem from a different angle:
For storing price values in MySQL, you should probably look at the DECIMAL type, which lets you store exact values with decimal places.
Maybe it's another possible solution for this "problem":
intval(number_format($problematic_float, 0, '', ''));
PHP is doing rounding based on significant digits. It's hiding the inaccuracy (on line 2). Of course, when floor comes along, it doesn't know any better and lops it all the way down.
As stated, this is not a problem with PHP per se. It is more an issue of handling fractions that can't be expressed as finite floating point values, which leads to a loss of precision when rounding.
The solution is to ensure that when you are working on floating point values and you need to maintain accuracy - use the gmp functions or the BC maths functions - bcpow, bcmul et al. and the problem will be resolved easily.
E.g instead of
$price_corrected = $price*100;
use $price_corrected = bcmul($price, '100');