Bitwise operator with large numbers - php

Is there a way to check values greater than "2147483648"?
I have to work with numbers up to "6.73297395398192e212" (2^707).
The data is stored in a mysql-database as float.
Maybe I'm just using the wrong search terms or there is not a good way.

A double precision value uses 8 bytes, and you obviously cannot store 707 bits in those (which I assume you are trying to do). It can store a value of 1e308 by an approximation that costs precision in the lower digits, which makes it a bad choice for storing data that you want to do bitwise operations on. For bitwise operation on 8 bytes, you can use bigint.
Since MySQL 8, MySQL supports bitwise operations on binary strings of arbitrary length, so you should store your value that way; a bit array is basically a binary string anyway. You cannot treat them as numbers though (e.g. add or multiply them like integers).
For earlier MySQL versions, bit operations on binary strings were limited to 8 bytes. You should still store your bits as a binary string (which allows for an easy upgrade), and write a small function that does the operation e.g. bytewise.
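As a rough sketch of the "small function" idea on the PHP side: the helpers below are hypothetical (the names `setBit` and `hasBit` and the bit numbering are assumptions, not part of the answer). They treat a binary string as a bit array and rely on the fact that PHP's `&`, `|` and `^` work byte by byte when both operands are strings.

```php
<?php
// Hypothetical helpers. Bit 0 is the least significant bit of the LAST byte,
// so the string reads like a big-endian number.

function setBit(string $bits, int $n): string
{
    $index = strlen($bits) - 1 - intdiv($n, 8);
    $bits[$index] = chr(ord($bits[$index]) | (1 << ($n % 8)));
    return $bits;
}

function hasBit(string $bits, int $n): bool
{
    $index = strlen($bits) - 1 - intdiv($n, 8);
    return (ord($bits[$index]) & (1 << ($n % 8))) !== 0;
}

$empty = str_repeat("\x00", 89);          // 89 bytes = 712 bits, room for bits 0..711
$flags = setBit(setBit($empty, 3), 706);  // two flags set
$mask  = setBit($empty, 706);             // mask for a single flag

// String & string is applied byte by byte and returns a string.
$common = $flags & $mask;

var_dump(hasBit($flags, 706));  // bool(true)
var_dump(hasBit($common, 3));   // bool(false)
```

Keeping both operands the same length avoids having to think about how PHP sizes the result when the strings differ in length.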

Related

SolR float (TrieFloatField) storage limits

I'm trying to understand how floats are stored in SolR.
I have a delta between the float value in PHP (32-bit) and the stored one in SolR.
I've searched in the documentation, "Field Types Included with SolR" :
https://cwiki.apache.org/confluence/display/solr/Field+Types+Included+with+Solr
And found for TrieFloatField:
Floating point field (32-bit IEEE floating point). precisionStep="0"
enables efficient numeric sorting and minimizes index size;
precisionStep="8" (the default) enables efficient range queries.
But I don't know how to estimate what will be the stored value.
Here are some tests I've made.
The value I've tried to insert in the float field and the result:
ok: 2097151.1
ko: 2097152.1 -> 2097152
ko: 20971521 -> 20971520
ok: 16777216
ko: 16777217 -> 16777216
ko: 4294967296 -> 4294967300
ok: 4294967300
ko: 4294967301 -> 4294967300
I don't understand which constraint is used, it is not rounded.
Maybe it is a binary constraint, because it looks like it is rounded to fit powers of 2.
https://en.wikipedia.org/wiki/Power_of_two#The_first_96_powers_of_two
2^21 = 2,097,152
2^24 = 16,777,216
2^32 = 4,294,967,296
As you can see, these values are close to the ones stored by SolR.
Does someone have an idea how SolR stores float?
And how to evaluate it with PHP?
Thanks.
As you've mentioned, it's a 32 bit floating point number. A 32-bit floating point number can't represent all the values between 0 and 2^32 exactly, so there will be inaccuracies and numbers that can't be represented using those bits.
You can use a converter like IEEE754 Floating Point Conversion to test the values you've included, and they all convert to what you're getting back from Solr.
Floating point numbers are not exact, and aren't magic - there's still just 2^32 distinct values available, so when you're trying to store values that don't map exactly onto the possible values that a 32 bit FP can represent, you'll get inaccuracies.
Doubles were introduced to have more accuracy (64-bit vs 32-bit), and you can use doubles in Solr by using a TrieDoubleField instead.
Another option, depending on what you need, is to use a long field instead, and multiplying by 10 or 100 when storing a value and dividing the value on the way out. That will allow you to exactly represent a decimal number with two digits after the dot.
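The scaled-integer idea could look roughly like this in PHP. The function names and the default scale of 100 are assumptions made for illustration; the Solr side would simply be a long field holding the integer.

```php
<?php
// Hypothetical helpers: keep two decimal places by storing hundredths as an integer.

function toScaledInt(float $value, int $scale = 100): int
{
    // round() removes the tiny binary representation error before the cast.
    return (int) round($value * $scale);
}

function fromScaledInt(int $stored, int $scale = 100): string
{
    // Format as a string so the two decimals are shown exactly.
    return number_format($stored / $scale, 2, '.', '');
}

$stored = toScaledInt(2097152.1);
echo $stored, "\n";                 // 209715210
echo fromScaledInt($stored), "\n";  // 2097152.10
```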
Apparently, the most secure way to compare floats is to use pack().
Pack data into binary string to securely compare two floats.
http://php.net/manual/en/language.types.float.php#119860
So, as an alternative to using
$float1 === $float2
one could use
pack('f', $float1) === pack('f', $float2)
with a big footnote that one should really remember that this reduces the accuracy of the comparison. AFAIK this is the only way (apart from epsilon methods) to securely compare two floats.
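To estimate in PHP what a 32-bit float field will keep, one option is to round-trip the value through pack() and unpack(). This is a sketch, assuming pack('f') uses IEEE 754 single precision on your platform (the format is machine dependent); `toFloat32` is a hypothetical helper name.

```php
<?php
// Assumption: pack('f') produces an IEEE 754 single-precision value here.
function toFloat32(float $value): float
{
    // unpack() returns a 1-indexed array for an unnamed format code.
    return unpack('f', pack('f', $value))[1];
}

// Values taken from the question's test list.
foreach ([2097152.1, 16777217.0, 20971521.0] as $value) {
    printf("%.1f -> %.1f\n", $value, toFloat32($value));
}
```

Under that assumption the three values should come back as 2097152.0, 16777216.0 and 20971520.0, matching what the question reports from Solr.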

Encoding/Compressing a large integer into alphanumeric value

I have a very large integer 12-14 digits long and I want to encrypt/compress this to an alphanumeric value so that the integer can be recovered later from the alphanumeric value. I tried to convert this integer using a 62 base and tried to map those values to a-zA-Z0-9, but the value generated from this is 7 characters long. This length is still long enough and I want to convert to about 4-5 characters.
Is there a general way to do this or some method in which this can be done so that recovering the integer would still be possible? I am asking the mathematical aspects here but I would be programming this in PHP and I recently started programming in php.
Edit:
I was thinking in terms of assigning a masking bit and using this in a fashion to generate less number of Chars. I am aware of the fact that the range is not enough and that is the reason I was focusing on using a mathematical trick or a way of representation. The 62 base was an Idea that I already applied but is not working out.
14-digit decimal numbers can express 100,000,000,000,000 values (10^14).
5 characters of a 62-character alphabet can express 916,132,832 values (62^5).
You cannot cram the equivalent number of values of a 14 digit number into a 5 character base 62 string. It's simply not possible to express each possible value uniquely. See http://en.wikipedia.org/wiki/Pigeonhole_principle. Even base 64 with 7 characters is not enough (only 4,398,046,511,104 possible values). In fact, if you target a 5 character short string you'd need to compensate by using a base 631 alphabet (631^5 = 100,033,806,792,151).
Even compression doesn't help you. It would mean that two or more numbers would need to compress to the same compressed string (because there aren't enough possible unique compressed values), which logically means it's impossible to uncompress them into two different values.
To illustrate this very simply: Say my alphabet and target "string length" consists of one bit. That one bit can be 0 or 1. It can express 2 unique possible values. Say I have a compression algorithm which compresses anything and everything into this one bit. ... How could I possibly uncompress 100,000,000,000,000 unique values out of that one bit with two possible values? If you'd solve that problem, bandwidth and storage concerns would immediately evaporate and you'd be a billionaire.
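The counting argument above can be checked with a few lines of PHP. This is only a sketch: `minLength` is a hypothetical helper, and it assumes 64-bit PHP integers so that the counts do not overflow.

```php
<?php
// Hypothetical helper; assumes 64-bit integers.
// Returns the shortest string length whose capacity covers $valueCount values.
function minLength(int $valueCount, int $alphabetSize): int
{
    $length = 0;
    $capacity = 1;  // number of distinct strings of the current length
    while ($capacity < $valueCount) {
        $capacity *= $alphabetSize;
        $length++;
    }
    return $length;
}

$values = 10 ** 14;  // every 14-digit decimal string

echo minLength($values, 62), "\n";   // 8
echo minLength($values, 631), "\n";  // 5
```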
With 95 printable ASCII characters you can switch to base 95 encoding instead of 62:
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
That way an integer string of length X can be compressed into length Y base 95 string, where
Y = X * log 10/ log 95 = roughly X / 2
which is pretty good compression. So from length 12 you get down to about 6 (the largest 12-digit values need 7 characters, since 95^6 is only about 7.35 × 10^11). If the purpose of compression is to save bandwidth when using JSON, then base 92 can be a good choice (excluding ",\,/ that become escaped in JSON).
Surely you can get better compression but the price to pay is a larger alphabet. Just replace 95 in the above formula by the number of symbols.
Unless of course, you know the structure of your integers. For instance, if they have plenty of zeroes, you can base your compression on this knowledge to get much better results.
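A generic encoder/decoder for an arbitrary alphabet might look like the sketch below. The function names are hypothetical, and it assumes non-negative integers that fit in a 64-bit PHP int. It is shown with the 62-character alphabet from the question; passing a longer alphabet shortens the output accordingly.

```php
<?php
// Hypothetical helpers; assume non-negative values that fit a 64-bit int.

function encodeBase(int $n, string $alphabet): string
{
    if ($n < 0) {
        throw new InvalidArgumentException('Only non-negative integers are supported.');
    }
    $base = strlen($alphabet);
    $out = '';
    do {
        $out = $alphabet[$n % $base] . $out;  // prepend the lowest digit
        $n = intdiv($n, $base);
    } while ($n > 0);
    return $out;
}

function decodeBase(string $encoded, string $alphabet): int
{
    $base = strlen($alphabet);
    $n = 0;
    foreach (str_split($encoded) as $char) {
        $digit = strpos($alphabet, $char);
        if ($digit === false) {
            throw new InvalidArgumentException("Character not in alphabet: $char");
        }
        $n = $n * $base + $digit;
    }
    return $n;
}

$alphabet = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
$encoded = encodeBase(99999999999999, $alphabet);

echo strlen($encoded), "\n";                  // 8
echo decodeBase($encoded, $alphabet), "\n";   // 99999999999999
```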
Because of the pigeonhole principle, you will end up with some values that get compressed and other values that get expanded. It is simply impossible to create a compression algorithm that compresses every possible input string (i.e. in your case your numbers).
If you force the cardinality of the output set to be smaller than the cardinality of the input set you'll get collisions (i.e. more input strings get "compressed" to the same compressed binary string). A compression algorithm should be reversible, right? :)

Using long int in PHP

I am trying this out, but am unable to store large value
$var = rand(100000000000000,999999999999999);
echo $var; // prints a 9 digit value(largest possible)
How to get a desired value ?
From the manual:
The size of an integer is platform-dependent, although a maximum value of about two billion is the usual value (that's 32 bits signed). 64-bit platforms usually have a maximum value of about 9E18. PHP does not support unsigned integers. Integer size can be determined using the constant PHP_INT_SIZE, and maximum value using the constant PHP_INT_MAX since PHP 4.4.0 and PHP 5.0.5.
...
If PHP encounters a number beyond the bounds of the integer type, it will be interpreted as a float instead. Also, an operation which results in a number beyond the bounds of the integer type will return a float instead.
BC Math and GMP are the (only?) way to work around this limitation.
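The overflow behaviour described in the manual excerpt is easy to observe directly. A minimal sketch:

```php
<?php
// Integer width depends on the platform PHP was built for.
printf("%d bytes, max %d\n", PHP_INT_SIZE, PHP_INT_MAX);

// One past the maximum is silently converted to a float,
// which can no longer represent every integer exactly.
$overflowed = PHP_INT_MAX + 1;

var_dump(is_int(PHP_INT_MAX));    // bool(true)
var_dump(is_float($overflowed));  // bool(true)
```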
PHP ints are typically 32 bits. Other packages provide higher-precision ints: http://php.net/manual/en/language.types.integer.php
If you need to work with very large numbers I have found success with BC Math. Here is a link to everything you need to know:
http://php.net/manual/en/book.bc.php
If you want to generate the number and manipulate as a native type, you can't with most PHP installations (either you have 32 or 64 bit ints and nothing else), as the other answers have already stated. However, if you are just generating a number and want to pass it around a possible trick is to just concatenate strings:
$var = rand(0, PHP_INT_MAX) . str_pad(rand(0, 999999999), 9, '0', STR_PAD_LEFT);
echo $var;
On a platform in which PHP uses a 32-bit integer, this allows you to get a near-random integer (as a string) that is bigger than 32 bits (more than 10 decimal digits). Of course, there is a bias in this construction, which means you won't cover all the numbers with the same probability. The limits of the rand() calls obey normal decimal rules, so it's simple to adjust the upper bound of the number you want.
If all you're doing is storing/transmitting/showing this value, the string will be just fine. Equality and greater/less than tests will also work. Just don't do any math with it.
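If the bias matters, a variation on the string trick is to draw the number digit by digit. This is a sketch rather than the answer's method: `randomDigits` is a hypothetical helper, and random_int() requires PHP 7 or later.

```php
<?php
// Hypothetical helper; random_int() requires PHP 7 or later.
// Builds a decimal string of exactly $length digits with no leading zero.
function randomDigits(int $length): string
{
    $digits = (string) random_int(1, 9);  // first digit: 1-9
    for ($i = 1; $i < $length; $i++) {
        $digits .= random_int(0, 9);      // remaining digits: 0-9
    }
    return $digits;
}

echo randomDigits(15), "\n";
```

Because every digit is drawn independently, each 15-digit value is equally likely, and the result never has to fit into a native integer.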

Comparing large numbers in php and sql

I need to compare a very large number in php (30 digits long) with 2 numbers in my database. What's a good way to do this? I tried using floats but it's not precise enough and I don't know of a good way to use large numbers in php.
Have you tried using string comparison? Just make sure every number is padded with leading zeroes to the same length.
mysql> select "123123123123123123456456456"<"123123123123123123456456457";
+-------------------------------------------------------------+
| "123123123123123123456456456"<"123123123123123123456456457" |
+-------------------------------------------------------------+
| 1 |
+-------------------------------------------------------------+
Just tested this with up to 200+ chars; it works like a charm.
Check the bccomp function.
You could compare strings instead.
Depending on how you're fetching the data from the database, you may want to explicitly cast the integer to a string type in the SQL statement.
Other than that, there are several libraries in PHP that handle large integers, like BCMath and GMP.
Handling large numbers in PHP is done through either of two libraries: GMP or BC Math.
I haven't done this myself, so it may not be correct, but I think you'd have to take the string result from GMP or BC Math and feed that into the query. Store your numbers as BIGINT only if they fit in 64 bits; a 30-digit number does not, so it would need a DECIMAL or string column instead.
Interesting fact: BIGINT is limited to about 20 digits, but within that range MySQL can take the value exactly from a string:
You can always store an exact integer value in a BIGINT column by storing it using a string. In this case, MySQL performs a string-to-number conversion that involves no intermediate double-precision representation.
If they're -very- big, I'd even compare them as strings. First, if one is longer than the other, it wins. If they're the same length, compare digit by digit, left to right; at the first position where the digits differ, the number with the bigger digit wins. This of course only works for positive integers without leading zeroes.
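The length-then-digits rule can be written in a few lines of plain PHP without any extension. A minimal sketch, assuming non-negative integers given as decimal strings; `compareBigInts` is a hypothetical name.

```php
<?php
// Hypothetical helper for non-negative integers given as decimal strings.
// Returns -1, 0 or 1, like the spaceship operator.
function compareBigInts(string $a, string $b): int
{
    // Leading zeroes would distort the length comparison, so drop them.
    $a = ltrim($a, '0');
    $b = ltrim($b, '0');

    // The longer number is the bigger one.
    if (strlen($a) !== strlen($b)) {
        return strlen($a) <=> strlen($b);
    }

    // Same length: a byte-wise comparison equals a digit-by-digit comparison.
    return strcmp($a, $b) <=> 0;
}

echo compareBigInts('123123123123123123456456456', '123123123123123123456456457'), "\n";  // -1
```

strcmp() is used instead of `<` or `==` because PHP's loose comparison operators may treat numeric strings as numbers and lose precision on values this large.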

Numeric String (arbitrary size) -> Multiple Integers

I'm running into a problem because my database has BIGINT data (64-bit integers) but the version of PHP I'm running is only 32-bit.
So when I pull out value from a table I end up with a numeric string representing a 64-bit integer in base 10. What I would ideally like to do is use the 64-bit integer as a bitmask. So I need to go to either two 32-bit integers (one representing the upper part and one the lower part) or a numeric string in base 2.
Problem is I can't just multiply it out because my PHP is only 32-bit. Am I stuck?
You can use MySQL's bit shift operators to split the 64-bit integer into two 32-bit integers. So, you could select:
select (myBigIntField & 0xffffffff) as lowerHalf,
(myBigIntField >> 32) as upperHalf
To handle arbitrary size integers, you have two options in PHP: GMP and BC Math methods.
I believe GMP is more efficient by using some custom resources, while BC uses strings directly.
If you won't be processing too many numbers at a time (fewer than thousands), you can simply use BC Math.
You have a few options here:
Use either BCMath or GMP if they are included in your PHP installation. Both should provide arbitrary-length integers.
Ask the database to convert the integer to a 64-character long bit string instead
Write a bignum implementation yourself (more work :-))
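For the last option, a full bignum library is not needed if all you want is the base-2 string. The sketch below converts a decimal string to a bit string by repeated division by two, using only small integers, so it also runs on 32-bit PHP (it uses intdiv(), so PHP 7 or later). Both function names are hypothetical.

```php
<?php
// Hypothetical helpers: schoolbook long division by 2 on a decimal string.

function decimalToBinary(string $decimal): string
{
    $decimal = ltrim($decimal, '0');
    if ($decimal === '') {
        return '0';
    }
    $bits = '';
    while ($decimal !== '') {
        $quotient = '';
        $carry = 0;
        for ($i = 0, $len = strlen($decimal); $i < $len; $i++) {
            $current = $carry * 10 + (int) $decimal[$i];  // at most 19
            $quotient .= intdiv($current, 2);
            $carry = $current % 2;
        }
        $bits = $carry . $bits;            // the remainder is the next lowest bit
        $decimal = ltrim($quotient, '0');  // continue with the quotient
    }
    return $bits;
}

// Bit 0 is the rightmost character of the bit string.
function isBitSet(string $bits, int $n): bool
{
    $index = strlen($bits) - 1 - $n;
    return $index >= 0 && $bits[$index] === '1';
}

$bits = decimalToBinary('9223372036854775808');  // 2^63, too big for a signed 64-bit int
var_dump(isBitSet($bits, 63));  // bool(true)
var_dump(isBitSet($bits, 0));   // bool(false)
```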
