I found a question about making short codes like TinyURL (https://stackoverflow.com/a/960364/1778465), and I am not sure if what I am doing is working.
I have the following test code:
<?php
$val = intval('murwaresuperchainreaction', 36);
echo $val."\n";
echo base_convert($val, 10, 36) . "\n";
echo "---\n";
$val = intval('murwarebarnstormers', 36);
echo $val."\n";
echo base_convert($val, 10, 36) . "\n";
echo "---\n";
$val = intval('murwarenightmare', 36);
echo $val."\n";
echo base_convert($val, 10, 36) . "\n";
and I am getting these results:
9223372036854775807
1y2p0ij32e8e7
---
9223372036854775807
1y2p0ij32e8e7
---
9223372036854775807
1y2p0ij32e8e7
The question I have is: why are all the results the same? According to the answer I linked to above, I should be getting "collision-proof" results, but they are all the same.
As per the Documentation of intval,
The maximum value depends on the system. 32 bit systems have a maximum signed integer range of -2147483648 to 2147483647. So for example on such a system, intval('1000000000000') will return 2147483647. The maximum signed integer value for 64 bit systems is 9223372036854775807.
When you try this method with shorter strings, you'll get collision-free results. But a long string decodes to a number above the integer limit, so intval returns the maximum value instead. Hence this method is not suitable for creating short codes from long strings.
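The clamping described above can be checked with a small round-trip test. This is only a sketch: the helper name decodes_safely is made up for illustration, and it assumes codes made of lowercase letters and digits without leading zeros.

```php
<?php
// Hypothetical helper, not part of PHP: returns true when $code survives a
// base-36 round trip through intval(), i.e. when the decoded value was not
// clamped to the maximum integer.
function decodes_safely(string $code): bool
{
    $value = intval($code, 36);
    return base_convert((string) $value, 10, 36) === $code;
}

var_dump(decodes_safely('qglj'));                      // short code: round trip succeeds
var_dump(decodes_safely('murwaresuperchainreaction')); // too long: value is clamped
```

Any code for which this returns false has lost information, which is why all three long strings in the question print the same number.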
The value being encoded in the answer you linked is an integer: an ID that references that shortened link's record. By Base 64 or Base 36 encoding the ID, the string becomes a lot shorter:
echo base_convert(1234567, 10, 36);
// output qglj
intval can then be used to convert the shortened string back to the ID:
echo intval('qglj', 36);
// output 1234567
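Wrapped into a pair of functions, the encode and decode steps might look like the sketch below. The function names are hypothetical, and the ID is assumed to be a non-negative integer that fits in a native PHP int.

```php
<?php
// Hypothetical helper names; $id is assumed to be a non-negative integer
// that fits in a native PHP int (for example an auto-increment key).
function id_to_code(int $id): string
{
    return base_convert((string) $id, 10, 36);
}

function code_to_id(string $code): int
{
    return intval($code, 36);
}

$code = id_to_code(1234567);
echo $code, "\n";             // qglj
echo code_to_id($code), "\n"; // 1234567
```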
Related
If I hash a bunch of random numbers in PHP and convert the hashes to base-10, it turns out that the digit 9 never appears toward the end of the resultant integers. I think I must be missing something obvious in the way MD5 hashing works or the way PHP handles it.
I noticed this because I have a list of (all different) strings and need to randomly bucket them into two groups (with 90% of the strings in bucket A and 10% in the bucket B). I figured I could just hash the strings, convert to base-10, and do something like this:
if( ( md5_hash_in_base_ten % 100 ) < 90 ) use bucket A
else use bucket B
But it turns out that the 9 digit was never appearing near the end of the resultant integers, so bucket B was never chosen.
I know there could be a million ways to randomly group strings, but I'm not interested in different solutions to that problem. I'm just curious about the (possible) odd results of my test code.
for( $i = 0; $i < 10000; $i++ ) {
$r = rand();
$bc = base_convert( md5( $r ), 16, 10 );
echo $bc . '<br>';
}
One chunk of results looks like this:
302600829905161600608260662606624826442
59585669553455458666446844468880068068
330999075520965568846868468088242088640
192131673950084244086840262480428482262
219128507900677482440460800240644480082
255318670176792246888206600668682602264
240208061481025440684246208684488420642
294217394926758646048046684044640488204
278449747058183168002848628868886688226
195713211929924564840668644204640202264
249037264096573760228220842660668480862
207646493898559360028468248404088664884
169051134173421386202080006468046600882
91273057168422960202446286266888840680
289959365012917366428044866648660802042
172462762250895562808826226442626868482
21264346015514864044284484068442686886
37414331404805136842220266424646680664
76064003552382186484240646428006806660
316804269790551588866666266482482808288
142781990240421424242486286048486626288
12211092583070068208404402226428806286
164064659807615146666228064640060626026
336702095492281784288600868224440806802
264447819530445920480408448628866828002
127283138187204864060642440804622660688
220658311731241408862084402042406680248
71873545317929552826606228242842664868
And if we ctrl-F that for the number 9, it looks like this:
Ideas?
PHP's base_convert function is not made for arbitrarily large numbers, as the red box on the function's documentation page indicates.
You can verify that yourself the following way:
echo ('a' == base_convert(base_convert('a', 16, 10), 10, 16)) ? 1 : 0;
echo ('abcdefabcdefabcdef' == base_convert(base_convert('abcdefabcdefabcdef', 16, 10), 10, 16)) ? 1 : 0;
This will print 1 and 0: Converting the number a (hex) to decimal and back works as expected, but the number abcdefabcdefabcdef (hex) causes base_convert to lose precision.
To work around this problem, you need to use a function that is able to handle numbers of arbitrary length. For an example, check out one of the comments on the documentation page (function convBase).
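As a rough idea of what such a function does, here is a minimal sketch that converts a hex string to decimal using only string and small-integer arithmetic. It is not the convBase function from the documentation comments; the name hex_to_dec_string is made up, and the input is assumed to contain only hex digits.

```php
<?php
// Hypothetical helper: converts a hex string of any length to a decimal
// string. The number is kept as an array of decimal digits (least
// significant first), so no value ever exceeds the native integer range.
function hex_to_dec_string(string $hex): string
{
    $digits = [0];
    foreach (str_split(strtolower($hex)) as $ch) {
        // Multiply the running number by 16 and add the next hex digit.
        $carry = hexdec($ch);
        for ($i = 0; $i < count($digits); $i++) {
            $v = $digits[$i] * 16 + $carry;
            $digits[$i] = $v % 10;
            $carry = intdiv($v, 10);
        }
        // Append whatever is left of the carry as new leading digits.
        while ($carry > 0) {
            $digits[] = $carry % 10;
            $carry = intdiv($carry, 10);
        }
    }
    return implode('', array_reverse($digits));
}

echo hex_to_dec_string('ffffffffffffffff'), "\n"; // 18446744073709551615
```

The same function can be fed a full 32-character MD5 hash; it is slower than GMP or BCMath but needs no extension.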
You're using a bigger number than base_convert can handle. Refer to the PHP documentation here.
An MD5 hash is 128 bits, which is far more than the 53 bits of precision offered by the floating-point type that base_convert falls back to for large numbers, so your converted result can't hold the precision. In your case you can use the GNU Multiple Precision (GMP) library.
<?php
/* Use the GMP extension to convert between bases; GMP handles integers well beyond 32 or 64 bits. */
function gmp_convert($num, $base_a, $base_b)
{
return gmp_strval ( gmp_init($num, $base_a), $base_b );
}
for( $i = 0; $i < 5; $i++ ) {
$r = rand();
$h = md5($r);
$bc = base_convert( $h, 16, 10 );
$gmp = gmp_convert( $h, 16, 10 );
echo "Random value: " . $r . PHP_EOL;
echo "MD5 Hash: " . $h . PHP_EOL;
echo "Base converted: " . $bc . PHP_EOL;
echo "GMP converted: " . $gmp . PHP_EOL;
}
?>
Which will output:
$ php -f foo.php
Random value: 1198279904
MD5 Hash: 714ae450dedfd56314b47f84e1922c9a
Base converted: 150591624287845962826662264228684068862
GMP converted: 150591624287845974934538261676650802330
Random value: 2000471768
MD5 Hash: 6359b22761538dd02822732ba45c66bf
Base converted: 132059299392045104262828066404880468000
GMP converted: 132059299392045115619248080281367504575
Random value: 851022648
MD5 Hash: 1e95df1b73599a92637982bab7814fc4
Base converted: 40655017257670256242606204044284220868
GMP converted: 40655017257670268183638196631434776516
Random value: 711523039
MD5 Hash: e23aff29be3bb611abbb3736fbdd4d07
Base converted: 300711855586863926204240426446264628688
GMP converted: 300711855586863939825593015112763788551
Random value: 953421999
MD5 Hash: a5990cd2bbab7707db05ebd3b468df17
Base converted: 220117300808777322406606064084840664268
GMP converted: 220117300808777304730115892103715806999
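For the 90/10 bucketing mentioned in the question, the full 128-bit value is not actually needed. One possible sketch, assuming a slight bias is acceptable, is to use only a short prefix of the hash so that everything stays within native integers. The function name bucket_for is hypothetical.

```php
<?php
// Hypothetical helper: assigns a string to bucket 'A' (about 90%) or 'B'
// (about 10%). Only the first 7 hex digits (28 bits) of the hash are used,
// so the value fits in a native integer even on 32-bit builds and no
// precision is lost.
function bucket_for(string $s): string
{
    $n = hexdec(substr(md5($s), 0, 7));
    return ($n % 100) < 90 ? 'A' : 'B';
}

echo bucket_for('some string'), "\n";
```

Because 2^28 is not a multiple of 100, the split is very slightly uneven, which should not matter for this purpose.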
php function round not working correctly.
I have number 0.9950.
I put code:
$num = round("0.9950", 2);
And I get 1.0. Why? Why can't I get 0.99?
You can add a third parameter to the function to make it do what you need.
You have to choose from one of the following :
PHP_ROUND_HALF_UP
PHP_ROUND_HALF_DOWN
PHP_ROUND_HALF_EVEN
PHP_ROUND_HALF_ODD
These constants are easy enough to understand, so just use the appropriate one :)
In your example, to get 0.99, you'll need to use:
<?php echo round("0.9950", 2, PHP_ROUND_HALF_DOWN); ?>
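To see how the four modes differ, here is a small sketch that applies each of them to values lying exactly halfway between two integers. The values 2.5, 3.5 and -2.5 are chosen because they are exactly representable as floats, so no floating-point noise is involved.

```php
<?php
$modes = [
    'PHP_ROUND_HALF_UP'   => PHP_ROUND_HALF_UP,   // halves go away from zero
    'PHP_ROUND_HALF_DOWN' => PHP_ROUND_HALF_DOWN, // halves go toward zero
    'PHP_ROUND_HALF_EVEN' => PHP_ROUND_HALF_EVEN, // halves go to the nearest even value
    'PHP_ROUND_HALF_ODD'  => PHP_ROUND_HALF_ODD,  // halves go to the nearest odd value
];

foreach ([2.5, 3.5, -2.5] as $value) {
    foreach ($modes as $name => $mode) {
        echo $name, '(', $value, ') = ', round($value, 0, $mode), "\n";
    }
}
// 2.5  ->  3,  2,  2,  3
// 3.5  ->  4,  3,  4,  3
// -2.5 -> -3, -2, -2, -3
```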
When you round 0.9950 to two decimal places, you get 1.00 because round() by default rounds a trailing half away from zero (PHP_ROUND_HALF_UP). If you want an operation which would result in 0.99 then perhaps you are looking for floating point truncation. One option to truncate a floating point number to two decimal places is to multiply by 100, cast to integer, then divide again by 100:
$num = "0.9950";
$output = (int)(100*$num) / 100;
echo $output;
0.99
This trick works because after the first step 0.9950 becomes 99.50, which, when cast to integer becomes just 99, discarding everything after the second decimal place in the original number. Then, we divide again by 100 to restore the original number, minus what we want truncated.
Just tested in PHP Sandbox... PHP seems funny sometimes.
<?php
$n = 16.90;
echo (100*$n)%100, "\n"; // 89
echo (int)(100*$n)%100, "\n"; // 89
echo 100*($n - (int)($n)), "\n"; // 90
echo (int)(100*($n - (int)($n))), "\n"; // 89
echo round(100*($n - (int)($n))), "\n"; // 90
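The 89 comes from 16.90 not being exactly representable in binary floating point: the product 100 * 16.90 ends up a hair below 1690, and the integer cast simply drops the fraction. A minimal sketch of the effect, and of rounding before casting as one way around it:

```php
<?php
$n = 16.90;
$scaled = 100 * $n;

var_dump($scaled < 1690);             // bool(true): the product is slightly below 1690
var_dump((int) $scaled);              // int(1689): the cast drops the fraction
var_dump((int) round($scaled));       // int(1690): rounding first gives the expected value
var_dump((int) round($scaled) % 100); // int(90)
```

Rounding first is only appropriate when the input is known to have at most two decimal places; it is not a general replacement for truncation.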
I am trying to xor two values which are like below:
Variable 1 : 6463334891
Variable 2 : 1000212390
When i did xor with these values in php it gives me wrong answer.
It should give me "7426059853"
This is my code
$numericValue = (int)$numericValue;
$privateKey = (int)$privateKey;
echo "Type of variable 1 ".gettype($numericValue)."<br />";
echo "Type of variable 2 ".gettype($privateKey)."<br />";
$xor_val = (int)$numericValue ^ (int)$privateKey;
echo "XOR Value :".$xor_val."<br />";
Just a total stab into the dark...
You're doing this:
echo "6463334891" ^ "1000212390";
When you want to be doing this:
echo 6463334891 ^ 1000212390;
XOR is an operation on bytes. The byte representation of the integer 6463334891 and the string "6463334891" are very different. Hence this operation will result in very different values depending on whether the operands are strings or integers. If you get your numbers in string form, cast them to an int first:
echo (int)$var1 ^ (int)$var2;
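The difference between the two forms can be made visible with a short sketch. It assumes a 64-bit PHP build, since both numbers must fit in a native integer for the first result to be correct.

```php
<?php
// Assumes a 64-bit PHP build; on a 32-bit build these literals do not fit
// in an int and the integer XOR would not give the expected result.
$a = 6463334891;
$b = 1000212390;

// Integer operands: XOR of the numeric values.
var_dump($a ^ $b); // int(7426059853)

// String operands: XOR of the character codes, byte by byte.
// The result is a 10-byte string of mostly unprintable characters.
$s = "6463334891" ^ "1000212390";
var_dump(is_string($s), strlen($s), bin2hex($s));
```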
That is because you're hitting the maximum integer limit, which is 2147483647 on a 32-bit system.
From the PHP Docs...
The maximum value depends on the system. 32 bit systems have a maximum
signed integer range of -2147483648 to 2147483647. So for example on
such a system, intval('1000000000000') will return 2147483647. The
maximum signed integer value for 64 bit systems is
9223372036854775807.
Thus, to handle such big integers you need to make use of an extension like GMP (GNU Multiple Precision):
<?php
$v1="6463334891";
$v2="1000212390";
$a = gmp_init($v1);
$b = gmp_init($v2);
// Do the XOR inside GMP; gmp_intval() would convert back to a native int and overflow again on 32-bit
echo gmp_strval(gmp_xor($a, $b)); // prints 7426059853
Otherwise, switch to a 64-bit system.
My solution to preserve the value of big integers is to convert them to binary (with base_convert, because decbin doesn't work), XOR them bit by bit, and finally convert the resulting string back to decimal.
function binxor($w1,$w2)
{
$x=base_convert($w1, 10, 2);
$y=base_convert($w2, 10, 2);
// left-pad the shorter one with zeros so both have the same length
if (strlen($y)<strlen($x)) $y=str_repeat('0',strlen($x)-strlen($y)).$y;
if (strlen($x)<strlen($y)) $x=str_repeat('0',strlen($y)-strlen($x)).$x;
$x=str_split($x);$y=str_split($y);
$z="";
for ($k=0;$k<sizeof($x);$k++)
{
// XOR bit by bit
$z.=(int)($x[$k])^(int)($y[$k]);
}
return base_convert($z,2,10);
}
Also, to reduce large numbers to 32 bits:
bindec(decbin($number))
because on a 32-bit build decbin cuts the number to 32 bits automatically.
I know of the PHP function floor() but that doesn't work how I want it to in negative numbers.
This is how floor works
floor( 1234.567); // 1234
floor(-1234.567); // -1235
This is what I WANT
truncate( 1234.567); // 1234
truncate(-1234.567); // -1234
Is there a PHP function that will return -1234?
I know I could do this but I'm hoping for a single built-in function
$num = -1234.567;
echo $num >= 0 ? floor($num) : ceil($num);
Yes: intval
intval(1234.567); // 1234
intval(-1234.567); // -1234
Truncate floats with specific precision:
echo bcdiv(2.56789, 1, 1); // 2.5
echo bcdiv(2.56789, 1, 3); // 2.567
echo bcdiv(-2.56789, 1, 1); // -2.5
echo bcdiv(-2.56789, 1, 3); // -2.567
This method avoids the rounding that the round() function would perform.
Also you can use typecasting (no need to use functions),
(int) 1234.567; // 1234
(int) -1234.567; // -1234
http://php.net/manual/en/language.types.type-juggling.php
You can see the difference between intval and (int) typecasting from here.
Another hack is using the prefix ~~:
echo ~~1234.567; // 1234
echo ~~-1234.567; // -1234
it's simpler and faster
Tilde ~ is the bitwise NOT operator in PHP and JavaScript.
Double tilde (~~) is a quick way to cast a variable to an integer; it is called 'two tildes' to indicate a form of double negation.
It removes everything after the decimal point because the bitwise operators implicitly convert their operands to integers (signed 32-bit integers in JavaScript, the platform's native integer in PHP). In JavaScript this works whether the operand is a number or a string; in PHP, ~ applied to a string operates on the bytes of the string instead, so convert the string to a number first.
reference:
https://en.wikipedia.org/wiki/Double_tilde
What does ~~ ("double tilde") do in Javascript?
You can use intval($number), but if your number is bigger than 2147483647 and your machine/OS is 32-bit, it does not fit in an integer and the result will be wrong. So you can use
if ($number < 0)
    $res = ceil($number);
else
    $res = floor($number);
echo $res;
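Wrapped into a function, truncation toward zero that stays in floating point might look like the following sketch. The name truncate_float is made up; the point is that the result is never converted to an integer, so values beyond the native integer range are not clamped.

```php
<?php
// Hypothetical helper: truncates toward zero and keeps the result as a
// float, so values outside the native integer range are not clamped.
function truncate_float(float $number): float
{
    return $number < 0 ? ceil($number) : floor($number);
}

echo truncate_float(1234.567), "\n";     // 1234
echo truncate_float(-1234.567), "\n";    // -1234
echo truncate_float(3000000000.9), "\n"; // 3000000000
```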
You can shift the decimal to the desired place, intval, and shift back:
function truncate($number, $precision = 0) {
// warning: precision is limited by the size of the int type
$shift = pow(10, $precision);
return intval($number * $shift)/$shift;
}
Note the warning about size of int -- this is because $number is potentially being multiplied by a large number ($shift) which could make the resulting number too large to be stored as an integer type. Possibly converting to floating point might be better.
You could get fancy with a $base parameter, and sending that to intval(...).
Could (should) also get fancy with error/bounds checking.
An alternative approach would be to treat number as a string, find the decimal point and do a substring at the appropriate place after the decimal based on the desired precision. Relatively speaking, that won't be fast.
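The string-based approach mentioned last could be sketched as follows. The function name is hypothetical, and the input is assumed to be in plain decimal notation (optional sign, digits, optional decimal point and more digits), not exponent form.

```php
<?php
// Hypothetical helper. Assumes $number is in plain decimal notation
// (optional sign, digits, optional "." and more digits), not exponent form.
function truncate_decimal_string(string $number, int $precision = 0): string
{
    $dot = strpos($number, '.');
    if ($dot === false) {
        return $number; // no fractional part to cut
    }
    if ($precision <= 0) {
        return substr($number, 0, $dot); // drop the point and everything after it
    }
    // Keep the point plus at most $precision digits after it.
    return substr($number, 0, $dot + 1 + $precision);
}

echo truncate_decimal_string('-1234.567'), "\n"; // -1234
echo truncate_decimal_string('2.56789', 3), "\n"; // 2.567
```

Since the number is never converted to a float or an int, there is no precision or size limit, at the cost of working on strings.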
How can I convert a string (e.g. an email address) to a unique integer, to use it as an ID?
The amount of information a PHP integer may store is limited. The amount of information you can store in a string is not (at least if the string isn't unreasonably long.)
Thus you would need to compress your arbitrary-length string to a fixed-length integer. This is impossible without data loss.
You may use a hashing algorithm, but hashing algorithms may always have collisions. Especially if you want to hash a string to an integer the collision probability is pretty high - integers can store only very little data.
Thus you shall either stick with the email or use an auto incrementing integer field.
Try the bin2hex function
from the above site:
<?php
$str = "Hello world!";
echo bin2hex($str) . "<br />";
echo pack("H*",bin2hex($str)) . "<br />";
?>
outputs
48656c6c6f20776f726c6421
Hello world!
Why not just have an auto-increment ID field on the database?
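This code generates a 64-bit number which can be used as is or stored as a BIGINT or similar data type in databases like MySQL. Note that the values can exceed the signed 64-bit range, so the column needs to be unsigned.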
function get64BitNumber($str)
{
return gmp_strval(gmp_init(substr(md5($str), 0, 16), 16), 10);
}
echo get64BitNumber('Hello World!'); // 17079728445181560374
echo get64BitNumber('Hello World#'); // 2208921763183434891
echo get64BitNumber('http://waqaralamgir.tk/'); // 12007604953204508983
echo get64BitNumber('12345678910'); // 4841164765122470932
If the emails are ascii text, you could use PHP ord function to generate a unique integer, but it will be a very large number!
The approach would be to work through the email address one character at a time, calling ord for each of them. The ord function returns an integer uniquely expressing the character's value. You can pad each of these numbers with zeros and then use string concatenation to plug them into each other.
Consider "abc".
ord("a");
>> 97
ord("b");
>> 98
ord("c");
>> 99
Pad these numbers with a 0, and you have a unique number for it, that is: 970980990.
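A variation of this idea is sketched below: each character code is left-padded to three digits rather than followed by a 0, which keeps every character at a fixed width and makes the encoding reversible. Both function names are hypothetical, and the result is kept as a string because it quickly exceeds the integer range.

```php
<?php
// Hypothetical helpers. Each byte becomes exactly three decimal digits, so
// different strings always give different digit strings. The result is a
// string, not an int: it is far too long for a native integer.
function string_to_digits(string $s): string
{
    if ($s === '') {
        return '';
    }
    $out = '';
    foreach (str_split($s) as $ch) {
        $out .= str_pad((string) ord($ch), 3, '0', STR_PAD_LEFT);
    }
    return $out;
}

function digits_to_string(string $digits): string
{
    if ($digits === '') {
        return '';
    }
    $out = '';
    foreach (str_split($digits, 3) as $group) {
        $out .= chr((int) $group);
    }
    return $out;
}

echo string_to_digits('abc'), "\n";           // 097098099
echo digits_to_string('097098099'), "\n";     // abc
```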
I hope that helps!
You can use crc32 function.
Example:
$email = "user#gmail.com";
echo $email . " = " . crc32($email);
Live example: https://repl.it/repls/HonorableRespectfulBundledsoftware
Why not create your own associative table locally that will bind the emails with unique integers?
So the work flow would be in the lines of:
1 get the record from the ldap server.
2 check it locally if it has already an int assigned.
2.1 if yes use that int.
2.2 if no, generate an associative row in the table locally.
3 do your things with the unique ids.
Does that make sense?
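The workflow above can be sketched with an in-memory array standing in for the local table. The class name EmailIdRegistry is made up for illustration; a real setup would presumably back it with a database table and an auto-increment key.

```php
<?php
// In-memory stand-in for the local lookup table (hypothetical class).
final class EmailIdRegistry
{
    /** @var array<string,int> maps email address to its assigned ID */
    private array $ids = [];
    private int $next = 1;

    public function idFor(string $email): int
    {
        // Assumption: addresses are compared case-insensitively.
        $key = strtolower($email);
        if (!isset($this->ids[$key])) {
            // Step 2.2: no ID yet, so assign the next free one.
            $this->ids[$key] = $this->next++;
        }
        // Step 2.1: return the ID already on record.
        return $this->ids[$key];
    }
}

$registry = new EmailIdRegistry();
echo $registry->idFor('alice@example.com'), "\n"; // 1
echo $registry->idFor('bob@example.com'), "\n";   // 2
echo $registry->idFor('alice@example.com'), "\n"; // 1
```

Unlike a hash, the IDs are small, guaranteed unique, and independent of the length of the address.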
You can use this function:
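function stringToInteger($string) {
    $output = '';
    for ($i = 0; $i < strlen($string); $i++) {
        // pad every character code to three digits so the result is unambiguous
        $output .= str_pad((string) ord($string[$i]), 3, '0', STR_PAD_LEFT);
    }
    // return the digits as a string: casting to int would overflow for all but very short inputs
    return $output;
}
A bit ugly, but it works as long as you keep the result as a numeric string :)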