PHP integer rounding problems - php

echo (int) ( (0.1+0.7) * 10 );
Why does the above output 7? I understand how PHP rounds towards 0, but isn't (0.1+0.7) * 10 evaluated as a float and then cast to an integer?
Thanks!

There's a loss in precision when decimals are converted internally to their binary equivalent. The computed value will be something like 7.9+ instead of the expected 8.
If you need a high degree of accuracy, use the GMP family of functions or the bcmath library.
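A minimal sketch of the same computation in Python, whose float is the same IEEE 754 double that PHP typically uses. Decimal is used here only to display the exact stored value; the variable names are illustrative and not from the original question.

```python
from decimal import Decimal

# Same expression as the PHP snippet; Python floats are IEEE 754 doubles.
value = (0.1 + 0.7) * 10

# Decimal(float) converts the stored binary value exactly, with no rounding,
# so this shows a number just below 8.
exact = Decimal(value)
print(exact)

# int() truncates toward zero, comparable to PHP's (int) cast.
truncated = int(value)
print(truncated)
```

The truncation is what turns a tiny representation error into a visible off-by-one result.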

See the manual:
http://php.net/manual/en/language.types.float.php
It is typical that simple decimal fractions like 0.1 or 0.7 cannot be converted into their internal binary counterparts without a small loss of precision. This can lead to confusing results: for example, floor((0.1+0.7)*10) will usually return 7 instead of the expected 8, since the internal representation will be something like 7.9.

The other answers explained WHY this happens. This should get you what you want:
echo (int) round( (0.1+0.7) * 10 );
Just round the float before casting it to an int.

I don't have php installed, but in python:
$ python
>>> 0.1+0.7
0.79999999999999993
>>>
Not all numbers in base 10 can be represented precisely in base 2 system. Check Wikipedia article:
http://en.wikipedia.org/wiki/Binary_numeral_system
section Fractions in Binary. In particular, this line:
Fraction | Decimal | Binary         | Fractional Approx.
1/10     | 0.1     | 0.000110011... | 1/16+1/32+1/256...
1/10 cannot be represented in a finite way in base 2. Thus, 0.1 + 0.7 cannot be precisely calculated in base 2.
Never assume floating-point calculations are precise; it will bite you sooner or later.

1/10 cannot be represented in a finite number of binary digits, just like 1/3 cannot be represented as a finite number of base-10 digits. Therefore you are actually adding together binary approximations of 0.1 and 0.7 (roughly 0.1000000000000000055... and 0.6999999999999999555...). The sum comes out just under 0.8, so after multiplying by 10 the result is almost 8, but not quite.
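The stored approximations can be inspected directly. This is a small Python sketch (Python floats are IEEE 754 doubles); the variable names are illustrative. Note that a stored approximation may land on either side of the decimal value it stands for.

```python
from decimal import Decimal

# Decimal(float) shows the exact value held by the binary double.
stored_tenth = Decimal(0.1)
stored_seven_tenths = Decimal(0.7)
stored_sum = Decimal(0.1 + 0.7)

print(stored_tenth)         # slightly above 0.1
print(stored_seven_tenths)  # slightly below 0.7
print(stored_sum)           # slightly below 0.8
```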

Related

Why do some numbers lose accuracy when stored as floating point numbers?
For example, the decimal number 9.2 can be expressed exactly as a ratio of two decimal integers (92/10), both of which can be expressed exactly in binary (0b1011100/0b1010). However, the same ratio stored as a floating point number is never exactly equal to 9.2:
32-bit "single precision" float: 9.19999980926513671875
64-bit "double precision" float: 9.199999999999999289457264239899814128875732421875
How can such an apparently simple number be "too big" to express in 64 bits of memory?
In most programming languages, floating point numbers are represented a lot like scientific notation: with an exponent and a mantissa (also called the significand). A very simple number, say 9.2, is actually this fraction:
5179139571476070 * 2^-49
Where the exponent is -49 and the mantissa is 5179139571476070. The reason it is impossible to represent some decimal numbers this way is that both the exponent and the mantissa must be integers. In other words, all floats must be an integer multiplied by an integer power of 2.
9.2 may be simply 92/10, but 10 cannot be expressed as 2^n if n is limited to integer values.
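The claim that the stored 9.2 is exactly an integer times a power of two can be checked with exact rational arithmetic. A minimal Python sketch, using the mantissa and exponent quoted above; the variable names are illustrative.

```python
from fractions import Fraction

mantissa = 5179139571476070
exponent = -49

# The value described in the text: mantissa * 2^exponent, kept as an exact ratio.
as_fraction = Fraction(mantissa, 2 ** -exponent)

# Fraction(float) converts the stored double exactly, so this compares the
# formula with what the machine really holds for 9.2.
matches_stored_value = Fraction(9.2) == as_fraction
differs_from_true_value = as_fraction != Fraction(92, 10)

print(matches_stored_value, differs_from_true_value)
```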
Seeing the Data
First, a few functions to see the components that make a 32- and 64-bit float. Gloss over these if you only care about the output (example in Python):
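import struct
from itertools import islice

def float_to_bin_parts(number, bits=64):
    """Return the [sign, exponent, mantissa] bit strings of a float."""
    if bits == 32:          # single precision
        int_pack = 'I'
        float_pack = 'f'
        exponent_bits = 8
        mantissa_bits = 23
        exponent_bias = 127
    elif bits == 64:        # double precision. all python floats are this
        int_pack = 'Q'
        float_pack = 'd'
        exponent_bits = 11
        mantissa_bits = 52
        exponent_bias = 1023
    else:
        raise ValueError('bits argument must be 32 or 64')
    # Reinterpret the float's bytes as an unsigned integer, then render it
    # as a binary string left-padded with zeros to the full width.
    bin_iter = iter(bin(struct.unpack(int_pack, struct.pack(float_pack, number))[0])[2:].rjust(bits, '0'))
    # Slice the bit string into sign (1 bit), exponent, and mantissa.
    return [''.join(islice(bin_iter, x)) for x in (1, exponent_bits, mantissa_bits)]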
There's a lot of complexity behind that function, and it'd be quite the tangent to explain, but if you're interested, the important resource for our purposes is the struct module.
Python's float is a 64-bit, double-precision number. In other languages such as C, C++, Java and C#, double-precision has a separate type double, which is often implemented as 64 bits.
When we call that function with our example, 9.2, here's what we get:
>>> float_to_bin_parts(9.2)
['0', '10000000010', '0010011001100110011001100110011001100110011001100110']
Interpreting the Data
You'll see I've split the return value into three components. These components are:
Sign
Exponent
Mantissa (also called Significand, or Fraction)
Sign
The sign is stored in the first component as a single bit. It's easy to explain: 0 means the float is a positive number; 1 means it's negative. Because 9.2 is positive, our sign value is 0.
Exponent
The exponent is stored in the middle component as 11 bits. In our case, 0b10000000010. In decimal, that represents the value 1026. A quirk of this component is that you must subtract a number equal to 2^((# of bits) - 1) - 1 to get the true exponent; in our case, that means subtracting 0b1111111111 (decimal number 1023) to get the true exponent, 0b00000000011 (decimal number 3).
Mantissa
The mantissa is stored in the third component as 52 bits. However, there's a quirk to this component as well. To understand this quirk, consider a number in scientific notation, like this:
6.0221413 x 10^23
The mantissa would be the 6.0221413. Recall that the mantissa in scientific notation always begins with a single non-zero digit. The same holds true for binary, except that binary only has two digits: 0 and 1. So the binary mantissa always starts with 1! When a float is stored, the 1 at the front of the binary mantissa is omitted to save space; we have to place it back at the front of our third element to get the true mantissa:
1.0010011001100110011001100110011001100110011001100110
This involves more than just a simple addition, because the bits stored in our third component actually represent the fractional part of the mantissa, to the right of the radix point.
When dealing with decimal numbers, we "move the decimal point" by multiplying or dividing by powers of 10. In binary, we can do the same thing by multiplying or dividing by powers of 2. Since our third element has 52 bits, we divide it by 2^52 to move it 52 places to the right:
0.0010011001100110011001100110011001100110011001100110
In decimal notation, that's the same as dividing 675539944105574 by 4503599627370496 to get 0.1499999999999999. (This is one example of a ratio that is exact in binary but needs many more decimal digits than shown here to be written out exactly; for more detail, see: 675539944105574 / 4503599627370496.)
Now that we've transformed the third component into a fractional number, adding 1 gives the true mantissa.
Recapping the Components
Sign (first component): 0 for positive, 1 for negative
Exponent (middle component): Subtract 2^((# of bits) - 1) - 1 to get the true exponent
Mantissa (last component): Divide by 2^(# of bits) and add 1 to get the true mantissa
Calculating the Number
Putting all three parts together, we're given this binary number:
1.0010011001100110011001100110011001100110011001100110 x 10^11
Which we can then convert from binary to decimal:
1.1499999999999999 x 2^3 (inexact!)
And multiply to reveal the final representation of the number we started with (9.2) after being stored as a floating point value:
9.1999999999999993
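The recap above can be replayed in a few lines of Python. This is only a sketch: the three bit strings are the ones returned for 9.2 earlier, the constants 1023 and 52 are the double-precision bias and mantissa width, and the variable names are illustrative.

```python
# The three components returned by float_to_bin_parts(9.2).
sign_bits = '0'
exponent_bits = '10000000010'
mantissa_bits = '0010011001100110011001100110011001100110011001100110'

sign = int(sign_bits, 2)                             # 0 means positive
true_exponent = int(exponent_bits, 2) - 1023         # remove the bias
true_mantissa = 1 + int(mantissa_bits, 2) / 2 ** 52  # restore the leading 1

# Combine the parts: (-1)^sign * mantissa * 2^exponent.
value = (-1) ** sign * true_mantissa * 2 ** true_exponent
print(true_exponent, value)
```

Every step here is exact in double precision, so the rebuilt value compares equal to the float literal 9.2.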
Representing as a Fraction
9.2
Now that we've built the number, it's possible to reconstruct it into a simple fraction:
1.0010011001100110011001100110011001100110011001100110 x 10^11
Shift mantissa to a whole number:
10010011001100110011001100110011001100110011001100110 x 10^(11-110100)
Convert to decimal:
5179139571476070 x 2^(3-52)
Subtract the exponent:
5179139571476070 x 2^-49
Turn negative exponent into division:
5179139571476070 / 2^49
Multiply exponent:
5179139571476070 / 562949953421312
Which equals:
9.1999999999999993
9.5
>>> float_to_bin_parts(9.5)
['0', '10000000010', '0011000000000000000000000000000000000000000000000000']
Already you can see the mantissa is only 4 digits followed by a whole lot of zeroes. But let's go through the paces.
Assemble the binary scientific notation:
1.0011 x 10^11
Shift the decimal point:
10011 x 10^(11-100)
Subtract the exponent:
10011 x 10^-1
Binary to decimal:
19 x 2^-1
Negative exponent to division:
19 / 2^1
Multiply exponent:
19 / 2
Equals:
9.5
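Python can report the same fractions directly through the built-in float method as_integer_ratio(), which returns the stored value as a ratio in lowest terms. A short sketch with illustrative variable names:

```python
# 9.5 is stored exactly, so the ratio is the small fraction derived above.
ratio_9_5 = (9.5).as_integer_ratio()

# 9.2 is stored as an approximation; its denominator is a large power of two
# (the fraction from the text, reduced to lowest terms).
ratio_9_2 = (9.2).as_integer_ratio()

print(ratio_9_5)
print(ratio_9_2)
```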
Further reading
The Floating-Point Guide: What Every Programmer Should Know About Floating-Point Arithmetic, or, Why don’t my numbers add up? (floating-point-gui.de)
What Every Computer Scientist Should Know About Floating-Point Arithmetic (Goldberg 1991)
IEEE Double-precision floating-point format (Wikipedia)
Floating Point Arithmetic: Issues and Limitations (docs.python.org)
Floating Point Binary
This isn't a full answer (mhlester already covered a lot of good ground I won't duplicate), but I would like to stress how much the representation of a number depends on the base you are working in.
Consider the fraction 2/3
In good-ol' base 10, we typically write it out as something like
0.666...
0.666
0.667
When we look at those representations, we tend to associate each of them with the fraction 2/3, even though only the first representation is mathematically equal to the fraction. The second and third representations/approximations have an error on the order of 0.001, which is actually much worse than the error between 9.2 and 9.1999999999999993. In fact, the second representation isn't even rounded correctly! Nevertheless, we don't have a problem with 0.666 as an approximation of the number 2/3, so we shouldn't really have a problem with how 9.2 is approximated in most programs. (Yes, in some programs it matters.)
Number bases
So here's where number bases are crucial. If we were trying to represent 2/3 in base 3, then
(2/3)_10 = 0.2_3
In other words, we have an exact, finite representation for the same number by switching bases! The take-away is that even though you can convert any number to any base, all rational numbers have exact finite representations in some bases but not in others.
To drive this point home, let's look at 1/2. It might surprise you that even though this perfectly simple number has an exact representation in base 10 and 2, it requires a repeating representation in base 3.
(1/2)_10 = 0.5_10 = 0.1_2 = 0.1111..._3
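The base dependence is easy to see by generating digits one at a time. Below is a minimal Python sketch; expand() is a hypothetical helper written for this illustration, not a library function. It multiplies the remainder by the base, peels off the integer part as the next digit, and stops when the remainder reaches zero or a digit limit is hit.

```python
from fractions import Fraction

def expand(fraction, base, max_digits=12):
    """Digits after the radix point of a fraction in [0, 1), plus a flag
    telling whether the expansion terminated within max_digits."""
    digits = []
    remainder = fraction
    while remainder != 0 and len(digits) < max_digits:
        remainder *= base
        digit = remainder.numerator // remainder.denominator  # integer part
        digits.append(digit)
        remainder -= digit
    return digits, remainder == 0

two_thirds_base3 = expand(Fraction(2, 3), 3)   # terminates after one digit
half_base3 = expand(Fraction(1, 2), 3)         # keeps repeating 1
tenth_base2 = expand(Fraction(1, 10), 2)       # keeps repeating 0011

print(two_thirds_base3, half_base3, tenth_base2)
```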
Why are floating point numbers inaccurate?
Because often-times, they are approximating rationals that cannot be represented finitely in base 2 (the digits repeat), and in general they are approximating real (possibly irrational) numbers which may not be representable in finitely many digits in any base.
While all of the other answers are good there is still one thing missing:
It is impossible to represent irrational numbers (e.g. π, sqrt(2), log(3), etc.) precisely!
They are called irrational because they cannot be written as a ratio of two integers, and as a consequence their digits never terminate or repeat in any integer base. No amount of bit storage in the world would be enough to hold even one of them digit by digit. Only symbolic arithmetic is able to preserve their precision.
If you limit your math needs to rational numbers only, however, the problem of precision becomes manageable. You would need to store a pair of (possibly very big) integers a and b to hold the number represented by the fraction a/b. All your arithmetic would have to be done on fractions just like in high school math (e.g. a/b * c/d = ac/bd).
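Python's standard fractions module works in this way, storing a numerator and denominator as arbitrary-size integers. A small sketch applied to the 0.1 + 0.7 example; the variable names are illustrative.

```python
from fractions import Fraction

tenth = Fraction(1, 10)
seven_tenths = Fraction(7, 10)

# Exact rational arithmetic: no binary approximation is involved.
result = (tenth + seven_tenths) * 10
as_int = int(result)

# The high school rule a/b * c/d = ac/bd, reduced to lowest terms.
product = Fraction(2, 3) * Fraction(3, 4)

print(result, as_int, product)
```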
But of course you would still run into the same kind of trouble when pi, sqrt, log, sin, etc. are involved.
TL;DR
For hardware-accelerated arithmetic, only a limited number of rational numbers can be represented. Every number that is not representable is approximated. Some numbers (e.g. irrational ones) can never be written out exactly as digits, no matter the system.
There are infinitely many real numbers (so many that you can't enumerate them), and there are infinitely many rational numbers (it is possible to enumerate them).
The floating-point representation is a finite one (like anything in a computer) so unavoidably many many many numbers are impossible to represent. In particular, 64 bits only allow you to distinguish among 18,446,744,073,709,551,616 different values (which is nothing compared to infinity). With the standard convention, 9.2 is not one of them. Those that can be represented are of the form m * 2^e for some integers m and e.
You might come up with a different numeration system, 10 based for instance, where 9.2 would have an exact representation. But other numbers, say 1/3, would still be impossible to represent.
Also note that double-precision floating-point numbers are extremely accurate. They can represent any number in a very wide range to about 15 significant decimal digits. For daily life computations, 4 or 5 digits are more than enough. You will never really need those 15, unless you want to count every millisecond of your lifetime.
Why can we not represent 9.2 in binary floating point?
Floating point numbers are (simplifying slightly) a positional numbering system with a restricted number of digits and a movable radix point.
A fraction can only be expressed exactly using a finite number of digits in a positional numbering system if the prime factors of the denominator (when the fraction is expressed in its lowest terms) are factors of the base.
The prime factors of 10 are 5 and 2, so in base 10 we can represent any fraction of the form a/(2^b * 5^c).
On the other hand the only prime factor of 2 is 2, so in base 2 we can only represent fractions of the form a/(2^b).
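The prime-factor rule can be turned into a short test. The sketch below is one possible way to write it in Python; terminates_in_base() is a hypothetical helper name. It repeatedly strips from the denominator every factor it shares with the base, and the expansion is finite only if nothing is left.

```python
from fractions import Fraction
from math import gcd

def terminates_in_base(fraction, base):
    """True if the fraction has a finite expansion in the given base."""
    denominator = fraction.denominator  # Fraction is always in lowest terms
    common = gcd(denominator, base)
    while common > 1:
        denominator //= common          # remove factors shared with the base
        common = gcd(denominator, base)
    return denominator == 1

print(terminates_in_base(Fraction(92, 10), 10))  # 9.2 in base 10
print(terminates_in_base(Fraction(92, 10), 2))   # 9.2 in base 2
print(terminates_in_base(Fraction(19, 2), 2))    # 9.5 in base 2
```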
Why do computers use this representation?
Because it's a simple format to work with and it is sufficiently accurate for most purposes. Basically the same reason scientists use "scientific notation" and round their results to a reasonable number of digits at each step.
It would certainly be possible to define a fraction format, with (for example) a 32-bit numerator and a 32-bit denominator. It would be able to represent numbers that IEEE double precision floating point could not, but equally there would be many numbers that can be represented in double precision floating point that could not be represented in such a fixed-size fraction format.
However, the big problem is that such a format is a pain to do calculations on, for two reasons.
If you want to have exactly one representation of each number, then after each calculation you need to reduce the fraction to its lowest terms. That means that for every operation you basically need to do a greatest common divisor calculation.
If after your calculation you end up with an unrepresentable result because the numerator or denominator is too large, you need to find the closest representable result. This is non-trivial.
Some languages do offer fraction types, but usually they do it in combination with arbitrary precision. This avoids needing to worry about approximating fractions, but it creates its own problem: when a number passes through a large number of calculation steps, the size of the denominator, and hence the storage needed for the fraction, can explode.
Some languages also offer decimal floating point types. These are mainly used in scenarios where it is important that the results the computer gets match pre-existing rounding rules that were written with humans in mind (chiefly financial calculations). These are slightly more difficult to work with than binary floating point, but the biggest problem is that most computers don't offer hardware support for them.
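Python's decimal module is one example of a software decimal floating point type. A brief sketch, assuming the module's default context of 28 significant digits; the variable names are illustrative. Decimal fractions such as 0.1 become exact, while fractions such as 1/3 are still approximated.

```python
from decimal import Decimal

# Built from strings so the decimal digits are taken literally.
result = (Decimal("0.1") + Decimal("0.7")) * 10
as_int = int(result)

# 1/3 has no finite base-10 expansion, so it is still rounded.
third = Decimal(1) / Decimal(3)
round_trip_is_exact = (third * 3 == 1)

print(result, as_int, round_trip_is_exact)
```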

PHP & Base 2. Which Floats give a precise Value?

Apologies for my poor maths skills, I've tried to understand this to answer my own query but I'm not convinced.
We all know that PHP doesn't store Floats in base 10 but base 2.
I have a series of calculations that are using 0.5 as the only float, and in trying to understand if they will be stored as 0.500001 or 0.4999999 (for rounding purposes there is a big difference!!!) I have come to understand that 0.5 will be stored precisely in base2.
My queries are
A Have I understood this correctly?
B What other floats are stored precisely in base2? eg 0.25?
Any multiple of 1/pow(2, x) can be precisely represented as a float, as long as it fits within the precision of the mantissa.
That means x/2, x/4, x/8, x/16, etc. can be accurately represented.
For more information on how floating point numbers are stored see http://kipirvine.com/asm/workbook/floating_tut.htm
GMP is a good library for high precision math.
PHP is not required to use binary floating-point. It depends on the system.
Many systems use IEEE-754 binary floating-point (sometimes incompletely or with modifications, such as flushing subnormal numbers to zero).
In IEEE-754 64-bit binary floating point, a number is exactly representable if and only if it is representable as an integer F times a power of two, 2^E, such that:
The magnitude of F is less than 2^53.
–1074 ≤ E < 972.
For example, ½ equals 1•2^–1. 1 is an integer under the integer limit, and –1 is an exponent within the exponent limits. So ½ is representable.
2^53+1 is not representable. As it is, it is an integer outside the integer limit. If you try to scale it by a power of two to bring it within the limit, you get a number that is not an integer. So there is no way to represent this value exactly in IEEE-754 64-bit binary floating-point.
1/3 and 1/10 are also not representable because no matter what power of two you scale them by, you will not produce an integer.
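These cases can be checked in Python, whose float is an IEEE-754 64-bit binary value on common platforms (an assumption of this sketch). Python compares an int with a float by exact value, which makes the check simple; the variable names are illustrative.

```python
from fractions import Fraction

limit = 2 ** 53

half_is_exact = Fraction(0.5) == Fraction(1, 2)
limit_is_exact = float(limit) == limit

# 2^53 + 1 needs 54 significant bits, so conversion rounds it to a neighbour.
limit_plus_one_is_exact = float(limit + 1) == limit + 1

# 1/10 cannot be scaled to an integer by any power of two.
tenth_is_exact = Fraction(0.1) == Fraction(1, 10)

print(half_is_exact, limit_is_exact, limit_plus_one_is_exact, tenth_is_exact)
```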

Float number behavior

My roommate just came up with a question.
Why in php (maybe other languages as well) floor($foo) and (int)$foo is 7?
$foo = (0.7 + 0.1) * 10;
var_dump(
    $foo,
    floor($foo),
    (int)$foo,
    ceil($foo),
    is_infinite($foo),
    is_finite($foo)
);
result
float(8)
float(7)
int(7)
float(8)
bool(false)
bool(true)
Notice that $foo is not an infinite number.
From answers I can see that everyone says that it is actually x.(9)
But what is reason behind number being x.(9) and not actual x as it should be in real life?
A rational number will become a repeating decimal if the denominator contains a prime factor that isn't among the ones in the base's prime factor list (i.e. 2 and 5)
A rational number has an infinitely repeating sequence of finite length less than the value of the fully reduced fraction's denominator if the reduced fraction's denominator contains a prime factor that is not a factor of the base. The repeating sequence is preceded after the radix point by a transient of finite length if the reduced fraction also shares a prime factor with the base.
https://en.wikipedia.org/wiki/Repeating_decimal
Floating-point types in computers are almost always in binary, so any number whose denominator in the rational representation is not a power of 2 would be an infinite periodic binary fraction. For example 0.1 would be rounded to 0.100000001490116119384765625 in IEEE-754 single precision, which is the nearest sum of powers of 2.
Here neither 0.7 nor 0.1 is representable in binary floating-point, and neither is 0.8. Their sum is also not equal to 0.8: try printing 0.7 + 0.1 == 0.8 and you'll get the result as false. It's actually slightly less than 0.8. But it's not a repeating decimal like 7.(9) as you thought but a value with a finite number of 9s in the fractional part.
As a result the floor and int result in 7. If you take the ceil or round result you'll get 8.
Not always. If you end up with 8.0000001 due to floating point imprecision, the floor will snap to 8. Sometimes it may be 7.999999, which will snap to 7.
Chances are, if you're multiplying 0.x by y(which is read as an int in most languages), it will come out whole, so you won't see this behavior.
This is similar in other languages as well.
Because 0.7 and 0.1 are internally only approximations: 0.7 is actually stored as 0.6999999..., and 0.1 is also slightly off.
That means your (0.7 + 0.1) comes out as something more like 0.7999..... After multiplying by 10 and int/flooring, you end up with 7.
The floor function rounds down to nearest integer. Casting to int simply throws away the decimal part. $foo is a float, and it is not exactly 8, (must be 7.99999...) so you can observe that behavior.
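The difference between the four operations can be reproduced outside PHP. A minimal Python sketch of the same value (Python floats are IEEE 754 doubles); Python's int() is used here as a stand-in for PHP's (int) cast, since both truncate toward zero.

```python
import math

foo = (0.7 + 0.1) * 10   # stored as a value just below 8

floored = math.floor(foo)    # rounds down
truncated = int(foo)         # drops the fractional part
ceiled = math.ceil(foo)      # rounds up
rounded = round(foo)         # rounds to the nearest integer

print(floored, truncated, ceiled, rounded)
```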

Wrong Calculation? [duplicate]

echo (int) ( (0.1+0.7) * 10);
Gives me answer 7 but not 8.
echo (int) ( (0.1+0.8) * 10);
Gives answer 9 which is right
whats wrong? can anybody explain
thanx,
-Navi
It’s normal – There’s a thing called precision while working with floating point numbers. It’s present in most of the modern languages. See here: http://www.mredkj.com/javascript/nfbasic2.html for more information.
((0.1+0.7) * 10) is probably something like 7.9999999 and (int) 7.9999999 = 7
On the other hand ((0.1+0.8) * 10) is probably 9.00000001 and (int)9.00000001 = 9
Floating point numbers have limited precision. Although it depends on the system, PHP typically uses the IEEE 754 double precision format, which will give a maximum relative error due to rounding in the order of 1.11e-16. Non elementary arithmetic operations may give larger errors, and, of course, error propagation must be considered when several operations are compounded.
Additionally, rational numbers that are exactly representable as floating point numbers in base 10, like 0.1 or 0.7, do not have an exact representation as floating point numbers in base 2, which is used internally, no matter the size of the mantissa. Hence, they cannot be converted into their internal binary counterparts without a small loss of precision. This can lead to confusing results: for example, floor((0.1+0.7)*10) will usually return 7 instead of the expected 8, since the internal representation will be something like 7.9999999999999991118....
So never trust floating number results to the last digit, and do not compare floating point numbers directly for equality. If higher precision is necessary, the arbitrary precision math functions and gmp functions are available.
Source: http://php.net/manual/en/language.types.float.php
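The advice not to compare floats directly for equality can be sketched as follows in Python. math.isclose is a standard library function; its default relative tolerance of 1e-09 is used here, and a real application would pick a tolerance suited to its own data.

```python
import math

total = 0.1 + 0.7

# Direct comparison fails because both sides carry different rounding errors.
equal_exactly = (total == 0.8)

# Comparing within a tolerance treats the two values as the same number.
equal_within_tolerance = math.isclose(total, 0.8)

print(equal_exactly, equal_within_tolerance)
```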
try
echo (float) ( (0.1+0.7) * 10);
Use Float
echo (float) ( (0.1+0.7) * 10);
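This prints 8, but only because echo rounds the float to the configured display precision; the stored value is still slightly below 8.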
try
echo (int) round( ( (0.1+0.7) * 10));
This should compensate for the floating point computation error.

Weird PHP floating point behavior

Having weird problem:
$testTotal = 0;
foreach ($completeBankArray as $bank) {
    var_dump($testTotal);
    echo " + ";
    var_dump(floatval($bank["amount"]));
    echo " = " . (floatval($testTotal) + floatval($bank["amount"])) . "<br>";
    $testTotal = floatval(floatval($testTotal) + floatval($bank["amount"]));
}
And this is output I get:
------------------//--------------------
float(282486.09) + float(15) = 282501.09
float(282501.09) + float(3.49) = 282504.58
float(282504.58) + float(22.98) = 282527.55999999
float(282527.55999999) + float(5.2) = 282532.76
float(282532.76) + float(39.98) = 282572.73999999
float(282572.73999999) + float(2.6) = 282575.33999999
float(282575.33999999) + float(2.99) = 282578.32999999
------------------//-----------------------
How is this possible, what am I doing wrong?
You aren't doing anything wrong. Floats are notoriously inaccurate. From the docs (in the huge red warning box):
Floating point numbers have limited precision. Although it depends on the system, PHP typically uses the IEEE 754 double precision format, which will give a maximum relative error due to rounding in the order of 1.11e-16. Non elementary arithmetic operations may give larger errors, and, of course, error propagation must be considered when several operations are compounded.
Additionally, rational numbers that are exactly representable as floating point numbers in base 10, like 0.1 or 0.7, do not have an exact representation as floating point numbers in base 2, which is used internally, no matter the size of the mantissa. Hence, they cannot be converted into their internal binary counterparts without a small loss of precision. This can lead to confusing results: for example, floor((0.1+0.7)*10) will usually return 7 instead of the expected 8, since the internal representation will be something like 7.9999999999999991118....
So never trust floating number results to the last digit, and do not compare floating point numbers directly for equality. If higher precision is necessary, the arbitrary precision math functions and gmp functions are available.
Floats are often inexact, and the error can accumulate over many operations. If you are working with precision math, please read about the bc library.
Classic numeric precision example. Computers store floating point numbers in binary.
The short answer is that the computer cannot accurately represent some floating point numbers in binary.
The long answer involves moving between numerical bases. A fraction cannot be represented exactly in binary unless its denominator, in lowest terms, is a power of 2.
The other answers have given some insight into why you get this behaviour with floats.
If you are dealing with money, one solution to your problem would be to use integers instead of floats, and deal with cents instead of dollars. Then all you need to do is format your output to include the decimal.
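A minimal sketch of the integer-cents approach in Python, using the amounts from the question's output as strings. to_cents() is a hypothetical helper for this illustration, and it assumes the inputs never carry more than two decimal places.

```python
from decimal import Decimal

def to_cents(amount_text):
    """Parse a decimal amount such as "3.49" into an integer number of cents."""
    return int(Decimal(amount_text) * 100)

amounts = ["282486.09", "15", "3.49", "22.98", "5.2", "39.98", "2.6", "2.99"]

# Integer addition is exact, so no rounding error can accumulate.
total_cents = sum(to_cents(amount) for amount in amounts)

# Insert the decimal point only when formatting for display.
formatted = f"{total_cents // 100}.{total_cents % 100:02d}"
print(formatted)
```

Parsing from strings matters here: converting an already inexact float to cents would bring the rounding error back in.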
