Reliable Margin of Error for Float -> String -> Float Conversion?

Reliable Margin of Error for Float -> String -> Float Conversion? - php

I have a float value that I need to store as a string in PHP and then compare later after casting back into a float.
Due to the conversion I know that relying on equality would be a mistake, as there's potential for a loss of precision, so I'm doing something like the following:
if (abs((float)$string_value - $float_value) < 0.001) { echo "Values are close enough\n"; }
Now, while a margin for error of 0.001 should be fine for my immediate purposes, it got me wondering; what is the smallest margin of error that I can reliably/safely use?
I realise that the safe margin of error will change with the size of the float (i.e- larger values have less or even no fractional precision), so an answer should probably account for this.
So to put it another way; given a float value that I want to store in base 10 and read back, how can I reliably decide what my margin of error should be such that I can reasonably confirm that the two values are the same?
Unfortunately the values I'm handling must be stored in plain decimal form, so my usual go-to of packing them as a network order 64-bit integer is not an option here ☹️
EDIT: To clarify; please assume that my question is about handling arbitrarily sized floats; the example code I've given is for a recent case where I'm handling floats within a limited range, so setting the margin of error manually is fine, but I'd like to be able to handle floats of any magnitude in future.

As mentioned in Mark Dickinson's comment, it is possible to convert a floating-point number to a string and back without losing precision. This only works if
you use enough significant decimal digits (17 for IEEE doubles)
the conversions are accurate (i.e. they're guaranteed to convert to the nearest number)
From a quick look, it seems that casting a double $f to a string in PHP, either implicitly or with (string) $f, only uses 14 significant digits, so this method isn't accurate enough. But you can use sprintf with a %.16e conversion specifier to get 17 significant digits. So after the following roundtrip
$s = sprintf("%.16e", $f);
$f2 = (double) $s;
$f2 should equal $f exactly unless PHP uses suboptimal algorithms internally.
Note that the %e conversion specifier uses scientific (exponential) notation. If you need plain decimal strings, you can use the %f specifier and calculate the required number of digits after the decimal point using log10:
if ($f != 0) {
$prec = 16 - floor(log10(abs($f)));
if ($prec < 0) $prec = 0;
}
else {
$prec = 0;
}
$s = sprintf("%.${prec}f", $f);
This can produce extremely long strings for very small or large numbers, though.
It would probably require a huge amount of research to tell the whether these methods are completely reliable, and if not what the maximum error is. It all depends on several implementation details like PHP version, underlying C library, etc.
Another idea is to compare the string representations instead of floating-point values:
# Assuming $string_value was also converted with float_to_string
if ($string_value == float_to_string($float_value)) {
echo "Values are close enough\n";
}
This should be reliable as long as you stick to the same PHP version.
If you must compare floating-point numbers, it often makes more sense to compare the relative error. See Bruce Dawson's excellent blog for more details.

Related

PHP Long string. Output it all

So I got a really long string, made by a calculator.
$string='483451102828322427131269442894636268716773727170';
$result=(8902543901+$string)*($string/93.189)/($string)+55643907015.57895461;
echo $result;
This outputs 5.1878558931668E+45
So now my question is. How can I output the whole string, without that nasty E+45?

PHP on a 64 bit machine can only accurately calculate number up until 9223372036854775807. As soon as you calculate with numbers higher than that, php will switch to floats which may loose some of it's precision, especially when you use divisions.
There's an extension for php that will allow you to make calculations based on string, called BCMath.
Example:
$string = '483451102828322427131269442894636268716773727170';
$result = bcadd($string, 8902543901);
echo $result;
bcadd() is for additions, bcdiv() for divisions and bcmul() for multiplying.

You can't print exact value because you are using calculation, so this $string becomes a number (float in this case) and all numbers have limited precision.
If you want to do operations on big numbers you should use BCMath
However if you want to display it without scientific notation you can do it using:
echo sprintf("%f",$result);
or
echo sprintf("%.0f",$result);
if you want to omit decimal part

How to compare two 64 bit numbers

In PHP I have a 64 bit number which represents tasks that must be completed. A second 64 bit number represents the tasks which have been completed:
$pack_code = 1001111100100000000000000011111101001111100100000000000000011111
$veri_code = 0000000000000000000000000001110000000000000000000000000000111110
I need to compare the two and provide a percentage of tasks completed figure. I could loop through both and find how many bits are set, but I don't know if this is the fastest way?

Assuming that these are actually strings, perhaps something like:
$pack_code = '1001111100100000000000000011111101001111100100000000000000011111';
$veri_code = '0000000000000000000000000001110000000000000000000000000000111110';
$matches = array_intersect_assoc(str_split($pack_code),str_split($veri_code));
$finished_matches = array_intersect($matches,array(1));
$percentage = (count($finished_matches) / 64) * 100

Because you're getting the numbers as hex strings instead of ones and zeros, you'll need to do a bit of extra work.
PHP does not reliably support numbers over 32 bits as integers. 64-bit support requires being compiled and running on a 64-bit machine. This means that attempts to represent a 64-bit integer may fail depending on your environment. For this reason, it will be important to ensure that PHP only ever deals with these numbers as strings. This won't be hard, as hex strings coming out of the database will be, well, strings, not ints.
There are a few options here. The first would be using the GMP extension's gmp_xor function, which performs a bitwise-XOR operation on two numbers. The resulting number will have bits turned on when the two numbers have opposing bits in that location, and off when the two numbers have identical bits in that location. Then it's just a matter of counting the bits to get the remaining task count.
Another option would be transforming the number-as-a-string into a string of ones and zeros, as you've represented in your question. If you have GMP, you can use gmp_init to read it as a base-16 number, and use gmp_strval to return it as a base-2 number.
If you don't have GMP, this function provided in another answer (scroll to "Step 2") can accurately transform a string-as-number into anything between base-2 and 36. It will be slower than using GMP.
In both of these cases, you'd end up with a string of ones and zeros and can use code like that posted by #Mark Baker to get the difference.

Optimization in this case is not worth of considering. I'm 100% sure that you don't really care whether your scrip will be generated 0.00000014 sec. faster, am I right?
Just loop through each bit of that number, compare it with another and you're done.
Remember words of Donald Knuth:
We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

This code utilizes the GNU Multi Precision library, which is supported by PHP, and since it is implemented in C, should be fast enough, and supports arbitrary precision.
$pack_code = gmp_init("1001111100100000000000000011111101001111100100000000000000011111", 2);
$veri_code = gmp_init("0000000000000000000000000001110000000000000000000000000000111110", 2);
$number_of_different_bits = gmp_popcount(gmp_xor($pack_code, $veri_code));

$a = 11111;
echo sprintf('%032b',$a)."\n";
$b = 12345;
echo sprintf('%032b',$b)."\n";
$c = $a & $b;
echo sprintf('%032b',$c)."\n";
$n=0;
while($c)
{
$n += $c & 1;
$c = $c >> 1;
}
echo $n."\n";
Output:
00000000000000000010101101100111
00000000000000000011000000111001
00000000000000000010000000100001
3
Given your PHP-setuo can handle 64bit, this can be easily extended.
If not you can sidestep this restriction using GNU Multiple Precision
You could also split up the HEx-Representation and then operate on those coresponding parts parts instead. As you need just the local fact of 1 or 0 and not which number actually is represented! I think that would solve your problem best.
For example:
0xF1A35C and 0xD546C1
you just compare the binary version of F and D, 1 and 5, A and 4, ...

Cast a numeric string as float type data

What is the PHP command that does something similar to intval(), but for decimals?
Eg. I have string "33.66" and I want to convert it to decimal value before sending it to MSSQL.

How about floatval()?
$f = floatval("33.66");
You can shave a few nanoseconds off of type conversions by using casting instead of a function call. But this is in the realm of micro-optimization, so don't worry about it unless you do millions of these operations per second.
$f = (float) "33.66";
I also recommend learning how to use sscanf() because sometimes it's the most convenient solution.
list($f) = sscanf("33.66", "%f");

If you mean a float:
$var = floatval("33.66")
Or
$var = (float)"33.66";
If you need the exact precision of a decimal, there is no such type in PHP. There is the Arbitrary Precision Mathematics extension, but it will return strings, so it's only usefull for you when performing calculations.

You could try floatval, but floats are potentially lossy.
You could try running the number through sprintf to get it to a more correct format. The format string %.2f would produce a floating-point-formatted number with two decimal places. Excess places get rounded.
I'm not sure if sprintf will convert the value to a float internally for formatting, so the lossy problem might still exist. That being said, if you're only worrying about two decimal places, you shouldn't need to worry about precision loss.

php is a loosely typed language. It doesn't matter if you have
$x = 33.66;
or
$x = "33.66";
sending it to mssql will be the same regardless.
Are you just wanting to make sure it is formatted properly, or is an actual float?

php intval() and floor() return value that is too low?

Because the float data type in PHP is inaccurate, and a FLOAT in MySQL takes up more space than an INT (and is inaccurate), I always store prices as INTs, multipling by 100 before storing to ensure we have exactly 2 decimal places of precision. However I believe PHP is misbehaving. Example code:
echo "<pre>";
$price = "1.15";
echo "Price = ";
var_dump($price);
$price_corrected = $price*100;
echo "Corrected price = ";
var_dump($price_corrected);
$price_int = intval(floor($price_corrected));
echo "Integer price = ";
var_dump($price_int);
echo "</pre>";
Produced output:
Price = string(4) "1.15"
Corrected price = float(115)
Integer price = int(114)
I was surprised. When the final result was lower than expected by 1, I was expecting the output of my test to look more like:
Price = string(4) "1.15"
Corrected price = float(114.999999999)
Integer price = int(114)
which would demonstrate the inaccuracy of the float type. But why is floor(115) returning 114??

Try this as a quick fix:
$price_int = intval(floor($price_corrected + 0.5));
The problem you are experiencing is not PHP's fault, all programming languages using real numbers with floating point arithmetics have similar issues.
The general rule of thumb for monetary calculations is to never use floats (neither in the database nor in your script). You can avoid all kinds of problems by always storing the cents instead of dollars. The cents are integers, and you can freely add them together, and multiply by other integers. Whenever you display the number, make sure you insert a dot in front of the last two digits.
The reason why you are getting 114 instead of 115 is that floor rounds down, towards the nearest integer, thus floor(114.999999999) becomes 114. The more interesting question is why 1.15 * 100 is 114.999999999 instead of 115. The reason for that is that 1.15 is not exactly 115/100, but it is a very little less, so if you multiply by 100, you get a number a tiny bit smaller than 115.
Here is a more detailed explanation what echo 1.15 * 100; does:
It parses 1.15 to a binary floating point number. This involves rounding, it happens to round down a little bit to get the binary floating point number nearest to 1.15. The reason why you cannot get an exact number (without rounding error) is that 1.15 has infinite number of numerals in base 2.
It parses 100 to a binary floating point number. This involves rounding, but since 100 is a small integer, the rounding error is zero.
It computes the product of the previous two numbers. This also involves a little rounding, to find the nearest binary floating point number. The rounding error happens to be zero in this operation.
It converts the binary floating point number to a base 10 decimal number with a dot, and prints this representation. This also involves a little rounding.
The reason why PHP prints the surprising Corrected price = float(115) (instead of 114.999...) is that var_dump doesn't print the exact number (!), but it prints the number rounded to n - 2 (or n - 1) digits, where n digits is the precision of the calculation. You can easily verify this:
echo 1.15 * 100; # this prints 115
printf("%.30f", 1.15 * 100); # you 114.999....
echo 1.15 * 100 == 115.0 ? "same" : "different"; # this prints `different'
echo 1.15 * 100 < 115.0 ? "less" : "not-less"; # this prints `less'
If you are printing floats, remember: you don't always see all digits when you print the float.
See also the big warning near the beginning of the PHP float docs.

The other answers have covered the cause and a good workaround to the problem, I believe.
To aim at fixing the problem from a different angle:
For storing price values in MySQL, you should probably look at the DECIMAL type, which lets you store exact values with decimal places.

Maybe it's another possible solution for this "problem":
intval(number_format($problematic_float, 0, '', ''));

PHP is doing rounding based on significant digits. It's hiding the inaccuracy (on line 2). Of course, when floor comes along, it doesn't know any better and lops it all the way down.

As stated this is not a problem with PHP per se, It is more of an issue of handling fractions that can't be expressed as finite floating point values hence leading to loss of character when rounding up.
The solution is to ensure that when you are working on floating point values and you need to maintain accuracy - use the gmp functions or the BC maths functions - bcpow, bcmul et al. and the problem will be resolved easily.
E.g instead of
$price_corrected = $price*100;
use $price_corrected = bcmul($price,100);

PHP money string conversion to integer error

I have a small financial application with PHP as the front end and MySQL as the back end. I have ancient prejudices, and I store money values in MySQL as an integer of cents. My HTML forms allow input of dollar values, like "156.64" and I use PHP to convert that to cents and then I store the cents in the database.
I have a function that both cleans the dollar value from the form, and converts it to cents. I strip leading text, I strip trailing text, I multiply by 100 and convert to an integer. That final step is
$cents = (integer) ($dollars * 100);
This works fine for almost everything, except for a very few values like '156.64' which consistently converts to 15663 cents. Why does it do this?
If I do this:
$cents = (integer) ($dollars * 100 + 0.5);
then it consistently works. Why do I need to add that rounding value?
Also, my prejudices about storing money amounts as integers and not floating point values, is that no longer needed? Will modern float calculations produce nicely rounded and accurate money values adequate for keeping 100% accurate accounting?

If you want precision, you should store your money values using the DECIMAL data type in MySQL.

Your "prejudices" about floats will never be overcome - it's fundamental to the way they work. Without going into too much detail, they store a number based on powers of two and since not all decimal number can be presented this way, it doesn't always work. Your only reliable solution is to store the number as a sequence of digits and the location of the decimal point (as per DECIMAL type mentioned above).
I'm not 100% on the PHP, but is it possible the multiplication is converting the ints to floats and hence introducing exactly the problem you're trying to avoid?

Currency/money values should never be stored in a database (or used in a program) as floats.
Your integer method is fine, as is using a DECIMAL, NUMERIC or MONEY type where available.
Your problem is caused by $dollars being treated as a float and PHP doesn't have a better type to deal with money. Depending on when $dollars is being assigned, it could be being treated as a string or a float, but is certainly converted to a float if it's still a string for the * 100 operation if it looks like a float.
You might be better off parsing the string to an integer "money" value yourself (using a regex) instead of relying on the implicit conversions which PHP is doing.

The code you posted does the multiplication first, forcing a floating point calculation that introduces error, before converting the value to an integer. Instead, you should avoid floating point arithmetic entirely by reversing the order. Convert to integer values first, then perform the arithmetic.
Assuming previous code already validated and formatted the input, try this:
list($bills, $pennies) = explode('.', $dollars);
$cents = 100 * $bills + $pennies;
Your prejudice against floating point values to represent money is well founded because of truncation and because of values being converted from base-10 to base-2 and back again.

Casting does not round() as in round-to-nearest, it truncates at the decimal: (int)3.99 yields 3. (int)-3.99 yields -3.
Since float arithmetic often induces error (and possibly not in the direction you want), use round() if you want reliable rounding.

You should never ever store currency in floating point, because it always get results you don't expect.
Check out php BC Maths, it allow you to store your currency as string, then perform very high precision arithmetic on them.

Instead of using
$cents = (integer) ($dollars * 100);
you may want to try to use:
$cents = bcmul($dollars, 100, 2);

When converting from float to integer, the number will be rounded towards zero (src).
Read the Floating point precision warning.

There's no point in storing money as integer if you enter it through a floating point operation (no pun intended). If you want to convert from string to int and be consistent with your "prejudice" you can simply use string functions.
You can use an arbitrary precision library to divide by 10 (they handle numbers internally as strings), e.g. bcdiv() or gmp_div_q(), but of course, you could have also used it from the beginning for all the math.
Or you can use plain string functions:
<?php
// Quick ugly code not fully tested
$input = '156.64';
$output = NULL;
if( preg_match('/\d+(\.\d+)?/', $input) ){
$tmp = explode('.', $input);
switch( count($tmp) ){
case 1:
$output = $tmp[0];
break;
case 2:
$output = $tmp[0] . substr($tmp[1], 0, 2);
break;
default:
echo "Invalid decimal\n";
}
}else{
echo "Invalid number\n";
}
var_dump($output);
?>

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.