Why is 0 so frequent in this number generation pattern? - php

I was just goofing around with PHP and I decided to generate some random numbers with PHP_INT_MIN (-9223372036854775808) and PHP_INT_MAX (9223372036854775807). I simply echoed the following:
echo rand(-9223372036854775808, 9223372036854775807);
I kept refreshing to see how random the generated numbers were, and I started to notice a pattern. Every 2-4 refreshes 0 appeared, without fail; at one stage I even got 0 four times in a row.
I wanted to experiment further so I created the following snippet:
<?php
$countedZero = 0;
$totalGen = 250;
for ($i = 1; $i <= $totalGen; $i++) {
    $rand = rand(-9223372036854775808, 9223372036854775807);
    if ($rand == 0) {
        echo $i . ": <font color='red'>" . $rand . "</font><br/>";
        $countedZero++;
    } else {
        echo $i . ": " . $rand . "<br/>";
    }
}
echo "0 was generated " . $countedZero . "/" . $totalGen . " times which is " . (($countedZero / $totalGen) * 100) . "%.";
?>
This would give me a clear idea of what the generation rate is. I ran 8 tests in 4 groups:
Tests 1-3 used a $totalGen of 250.
Tests 4-6 used a $totalGen of 1000.
Test 7 was just to see what the results would be on a larger number; I chose 10,000.
Test 8 was the final test. I was intrigued at this point because the previous (large number) test gave a surprisingly high result, so I raised the stakes and set $totalGen to 500,000.
Results
I took a screenshot of each result. In every case I took the first output; I didn't keep rerunning the test to make it fit a certain pattern:
[Screenshots: Test 1 (250), runs 1-3; Test 2 (1000), runs 1-3; Test 3 (10,000), run 1; Test 4 (500,000), run 1]
From the above results, it is safe to assume that 0 has a very high probability of showing up even when the range of possible numbers is at its maximum. So my question is:
Is there a logical reason to why this is happening?
Considering how many numbers it can choose from why is 0 a recurring number?
Note: Test 8 was originally going to be 1,000,000, but it lagged out quite badly, so I reduced it to 500,000. If someone could test 1,000,000 and show the results by editing the OP, it would be much appreciated.
Edit 1
As requested by @maiorano84, I used mt_rand instead of rand, and these were the results:
[Screenshots: Test 1 (250), runs 1-3; Test 2 (1000), runs 1-3; Test 3 (10,000), run 1; Test 4 (500,000), run 1]
As the results show, 0 still has a high probability of showing up. Of the two functions, rand produced the lowest rate.
Update
It seems that in PHP 7 the new function random_int fixes the issue.
Example PHP7 random_int
https://3v4l.org/76aEH
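For reference, a minimal sketch of the same zero-counting experiment using random_int, assuming PHP 7 or later on a 64-bit build. The variable names mirror the snippet above; the sample size of 100,000 is an arbitrary choice for illustration.

```php
<?php
// Count how often 0 comes up when drawing from the full integer range
// with random_int() (available since PHP 7).
$countedZero = 0;
$totalGen = 100000; // arbitrary sample size for illustration

for ($i = 1; $i <= $totalGen; $i++) {
    $rand = random_int(PHP_INT_MIN, PHP_INT_MAX);
    if ($rand === 0) {
        $countedZero++;
    }
}

echo "0 was generated " . $countedZero . "/" . $totalGen . " times.\n";
```

With a uniform draw over the whole range, seeing 0 even once in a run of this size would be extraordinarily unlikely.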

This is basically an example of how someone wrote a bad rand() function. When you specify the min/max range in rand(), you hit a part of PHP's source that just results in imperfect distribution in the PRNG.
Specifically lines 44-45 of php_rand.h in php-src, which is the following macro:
#define RAND_RANGE(__n, __min, __max, __tmax) \
(__n) = (__min) + (zend_long) ((double) ( (double) (__max) - (__min) + 1.0) * ((__n) / ((__tmax) + 1.0)))
From higher up the call stack (lines 300-302 in rand.c of php-src):
if (argc == 2) {
    RAND_RANGE(number, min, max, PHP_RAND_MAX);
}
RAND_RANGE being the macro defined above. By removing the range parameters by just calling rand() instead of rand(-9223372036854775808, 9223372036854775807) you will get even distribution again.
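To make the scaling step concrete, here is a rough sketch of the arithmetic the macro performs for this range. It assumes PHP_RAND_MAX is 2^31 - 1 (so n lies in [0, 2^31)) and only checks whether the scaled offset still fits in a signed 64-bit integer before the zend_long cast. scaledOffset and the sample size are names and values made up for illustration. What the C cast actually yields for an out-of-range double is not defined by the language, so the link from "does not fit" to "comes out as 0" is my reading, not something the php-src excerpt states.

```php
<?php
// Sketch of the RAND_RANGE arithmetic for min = PHP_INT_MIN, max = PHP_INT_MAX.
// Assumption: PHP_RAND_MAX is 2^31 - 1, so n is drawn from [0, 2^31).
$span = 2.0 ** 64;         // (double) max - min + 1.0
$tmaxPlusOne = 2.0 ** 31;  // tmax + 1.0
$limit = 2.0 ** 63;        // smallest double that no longer fits in a signed 64-bit int

// Hypothetical helper: the value the macro computes before casting to zend_long.
function scaledOffset(int $n, float $span, float $tmaxPlusOne): float
{
    return $span * ($n / $tmaxPlusOne);
}

$samples = 100000; // arbitrary sample size
$tooLarge = 0;
for ($i = 0; $i < $samples; $i++) {
    $n = mt_rand(0, 2147483647);
    if (scaledOffset($n, $span, $tmaxPlusOne) >= $limit) {
        $tooLarge++;
    }
}

printf("%.1f%% of raw values scale to an offset outside the 64-bit range\n", 100 * $tooLarge / $samples);
```

Every n of 2^30 or more produces an offset of at least 2^63, which is half of the possible inputs. That lines up with the roughly 50% rate of zeros reported in this thread, though the exact mechanism depends on the platform.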
Here's a script to demonstrate the effects...
function unevenRandDist() {
    $r = [];
    for ($i = 0; $i < 10000; $i++) {
        $n = rand(-9223372036854775808,9223372036854775807);
        if (isset($r[$n])) {
            $r[$n]++;
        } else {
            $r[$n] = 1;
        }
    }
    arsort($r);
    // you should see 0 well above average in the top 10 here
    var_dump(array_slice($r, 0, 10));
}
function evenRandDist() {
    $r = [];
    for ($i = 0; $i < 10000; $i++) {
        $n = rand();
        if (isset($r[$n])) {
            $r[$n]++;
        } else {
            $r[$n] = 1;
        }
    }
    arsort($r);
    // you should see the top 10 are about identical
    var_dump(array_slice($r, 0, 10));
}
unevenRandDist();
evenRandDist();
Sample Output I Got
array(10) {
[0]=>
int(5005)
[1]=>
int(1)
[2]=>
int(1)
[3]=>
int(1)
[4]=>
int(1)
[5]=>
int(1)
[6]=>
int(1)
[7]=>
int(1)
[8]=>
int(1)
[9]=>
int(1)
}
array(10) {
[0]=>
int(1)
[1]=>
int(1)
[2]=>
int(1)
[3]=>
int(1)
[4]=>
int(1)
[5]=>
int(1)
[6]=>
int(1)
[7]=>
int(1)
[8]=>
int(1)
[9]=>
int(1)
}
Notice the inordinate difference in how often 0 shows up in the first array versus the second. (Strictly speaking, rand() with no arguments only returns values from 0 to getrandmax(), so the two calls do not cover the same range, but the second shows what the unscaled generator output looks like.)
I guess you could blame PHP for this, but it's important to note here that glibc rand is not known for generating good random numbers (regardless of crypto). This problem is known in glibc's implementation of rand, as pointed out by this SO answer.

I took a quick look at your script and ran it through the command line. The first thing I noticed is that, because I was running a 32-bit version of PHP, my integer minimum and maximum were different from yours.
Because I was using your original values, I was actually getting 0 100% of the time. I resolved this by modifying the script like so:
$countedZero = 0;
$totalGen = 1000000;
for ($i = 1; $i <= $totalGen; $i++) {
    $rand = rand(~PHP_INT_MAX, PHP_INT_MAX);
    if ($rand === 0) {
        //echo $i . ": <font color='red'>" . $rand . "</font><br/>";
        $countedZero++;
    } else {
        //echo $i . ": " . $rand . "<br/>";
    }
}
echo "0 was generated " . $countedZero . "/" . $totalGen . " times which is " . (($countedZero / $totalGen) * 100) . "%.";
I was able to confirm that each test would yield just shy of a 50% hit rate for 0.
Here's the interesting part, though:
$rand = rand(~PHP_INT_MAX+1, PHP_INT_MAX-1);
Altering the range to these values causes the likelihood of zero coming up to plummet to an average of 0.003% (after 8 tests). The weird part was that after checking the value of $rand that was not zero, I was seeing many values of 1, and many random negative numbers. No positive numbers greater than 1 were showing up.
After changing the range to the following, I was able to see consistent behavior and more randomization:
$rand = rand(~PHP_INT_MAX/2, PHP_INT_MAX/2);
Here's what I'm pretty sure is happening:
Because you're dealing with a range here, you have to take into account the difference between the minimum and the maximum, and whether or not PHP can support that value.
In my case, the minimum that PHP is able to support is -2147483648, the maximum 2147483647, but the difference between them actually ends up being 4294967295 - a much larger number than PHP can store, so it truncates the maximum in order to try to manage that value.
Ultimately, if the difference of your minimum and maximum exceeds the PHP_INT_MAX constant, you're going to see unexpected behavior.
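A quick way to check that condition is to let PHP do the subtraction: when an integer result overflows, PHP returns a float instead of an int. The sketch below relies on that behavior; spanFitsInInt is a hypothetical helper name, not a built-in.

```php
<?php
// Hypothetical helper: true when ($max - $min) is still representable as an int.
// PHP promotes an overflowing integer subtraction to float, so is_int() detects it.
function spanFitsInInt(int $min, int $max): bool
{
    return is_int($max - $min);
}

// Full range: the difference exceeds PHP_INT_MAX.
var_dump(spanFitsInInt(~PHP_INT_MAX, PHP_INT_MAX)); // bool(false)

// Half range on each side: the difference still fits.
$half = intdiv(PHP_INT_MAX, 2);
var_dump(spanFitsInInt(-$half, $half)); // bool(true)
```

This works the same way on 32-bit and 64-bit builds, because it only uses the PHP_INT_MAX constant rather than hardcoded limits.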

Related

in_array always returning false when searching for strings

I'm currently writing a simple Battleships game in PHP. At the start of the game, I generate three ship positions on a 5 x 5 board, with each ship occupying one square:
function generate_three_battleships($ships){
    for ($n=0; $n<3; $n++){
        // Generate new position, returned from function as string
        $ship_position = generate_new_position();
        // Check to ensure that the position is not already occupied - if so, recast
        if (in_array($ship_position, $ships)){
            $ship_position = generate_new_position();
        }//if
        // Assign position to array
        array_push($ships, $ship_position);
    }//for
}//generate_three_battleships
Each position is represented as a two-digit string of Cartesian coordinates (so, for example, "32" represents y = 3, x = 2). This task is handled by the generate_new_position function:
function generate_new_position(){
    // Generate x and y coordinates - cast to string for storage
    $ship_row = (string)random_pos();
    $ship_col = (string)random_pos();
    $ship_position = $ship_row.$ship_col;
    return $ship_position;
}//generate_new_position
The user then enters their guesses for rows and columns, and the game will check to see if there is a ship there:
// Generate battleships
generate_three_battleships($ships);
for($turn=1; $turn <= GUESSES; $turn++){
    // First check to see if all ships have been sunk. If not, proceed with the game
    if ($ships_sunk < 3){
        $guess_row = (string)readline("Guess a row: ");
        $guess_col = (string)readline("Guess a column: ");
        $guess = $guess_row.$guess_col; // format guesses as strings
        if(($guess_row=="") || ($guess_col=="") || ($guess_row < 0) || ($guess_col < 0) || ($guess_row >= BOARDSIZE) || ($guess_col >= BOARDSIZE)){
            print("Oops, that's not even in the ocean. \n");
        }
        else if(array_search($guess, $ships) != false){
            print("Congratulations! You sunk one of my battleships!\n");
            $board[$guess_row][$guess_col] = "X";
            $ships_sunk++;
        }
    }
However, the in_array function is consistently returning false for every guess, even if that guess is actually in the $ships array. I can't see where I am going wrong, as I have explicitly cast everything to string. Am I missing something obvious?
As some people have asked, the output of var_dump on $ships after generate_three_battleships has executed is as follows:
array(3) {
[0]=>
string(2) "12"
[1]=>
string(2) "30"
[2]=>
string(2) "03"
}
Unfortunately I don't have a complete answer because I am missing some information to understand what the problem is.
You can debug what's going on by printing the contents of the array using var_dump to see the actual contents of $ships and maybe forcing generate_new_position to always return the same value.
If you can't solve this problem yourself, could you post the contents of $ships (using var_dump) before and after the for loop?
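One thing worth checking in the posted loop, assuming $ships really holds the values shown in the var_dump: array_search returns the matching key, and for the first ship that key is 0, which a loose != false comparison treats as "not found". A small sketch of the difference, using the dumped values as sample data:

```php
<?php
// Sample data copied from the var_dump in the question.
$ships = ["12", "30", "03"];
$guess = "12";

// array_search() returns the key of the match, here int(0).
$index = array_search($guess, $ships);

$loose = ($index != false);   // false, because 0 == false under loose comparison
$strict = ($index !== false); // true, because int(0) is not identical to false

// in_array() with strict mode avoids the key problem altogether.
$found = in_array($guess, $ships, true);

var_dump($index, $loose, $strict, $found);
```

This would only explain misses on the ship stored at index 0, so it may not be the whole story for guesses that match the other two positions.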

Float 1 converts to integer sometimes as 0, sometimes as 1 [duplicate]

This question already has answers here:
Is floating point math broken?
(31 answers)
PHP integer rounding problems
(5 answers)
Closed 8 years ago.
I am trying to get the first decimal place of a float number as an integer by subtracting the integer part, multiplying the remainder with 10 and then casting the result to int or using intval(). I noticed that the result for numbers with x.1 is correctly 1 as float, but after converting it to integer, it becomes sometimes 0, sometimes 1.
I tried to test it with numbers from 1.1 to 9.1:
for ($number = 1; $number < 10; $number++) {
    $result = 10 * ($number + 0.1 - $number);
    echo "<br/> number = " . ($number + 0.1) . ", result: ";
    var_dump($result);
    $result_int = intval($result);
    var_dump($result_int);
}
Starting with 4.1 as input, the 1 oddly gets converted to 0:
number = 1.1, result: float(1) int(1)
number = 2.1, result: float(1) int(1)
number = 3.1, result: float(1) int(1)
number = 4.1, result: float(1) int(0)
number = 5.1, result: float(1) int(0)
number = 6.1, result: float(1) int(0)
number = 7.1, result: float(1) int(0)
number = 8.1, result: float(1) int(0)
number = 9.1, result: float(1) int(0)
Why at 4.1? That doesn't make any sense to me. Can anyone give me a hint what I am doing wrong?
PS: also tested at http://ideone.com/hr7M0A
You are seeing these results because floating point arithmetic is not perfectly accurate.
Instead of trying to manually get the first decimal point use fmod:
$result = substr(fmod($number, 1) * 10, 0, 1);
My PHP is a bit rusty, so my syntax is probably off, but wouldn't it be simpler to convert to a string and take the rightmost digit?
$Str = sprintf("%.1f", $number);
$digit = $Str[strlen($Str) - 1]; // Last digit
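To illustrate what is going on, and one possible way around it: for some inputs the product is slightly below 1, the output above shows it as float(1) because the displayed value is rounded, and intval then truncates it to 0. The sketch below formats the number with a fixed number of decimals and reads the digit after the decimal point. firstDecimalDigit is a hypothetical helper, and ten decimals is an arbitrary choice.

```php
<?php
// Hypothetical helper: first digit after the decimal point of a float.
function firstDecimalDigit(float $number): int
{
    // Format with 10 decimals (arbitrary) so tiny representation errors round away.
    $text = sprintf('%.10F', abs($number));
    $dot = strpos($text, '.');
    return (int) $text[$dot + 1];
}

// The expression from the question for 4.1: slightly below 1, so intval() gives 0.
$raw = 10 * (4 + 0.1 - 4);
var_dump(intval($raw));           // int(0), as in the question's output

var_dump(firstDecimalDigit(4.1)); // int(1)
var_dump(firstDecimalDigit(9.1)); // int(1)
```

Working on the formatted text avoids the subtraction that introduces the error in the first place.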

Generate random numbers with fix probability

I read a lot on the forum about this, but all the answers were specific to the question asked. The nearest one to my need was: Probability Random Number Generator by Alon Gubkin.
The difference is that Alon asked to give one face (six) an extra chance. In my case, I want to divide the chance among the six faces so that they add up to 100%. For example, face 1 has a chance of 40%, face 2 has only 10%, face 3 has 25%, etc.
How can I do that?
The single probability check with linear probability can be easily done with:
function checkWithProbability($probability=0.1, $length=10000)
{
    $test = mt_rand(1, $length);
    return $test<=$probability*$length;
}
For example, this will produce:
for($i=0; $i<10; $i++)
{
    var_dump(checkWithProbability(1/3));
}
Something like:
bool(false)
bool(true)
bool(false)
bool(false)
bool(false)
bool(false)
bool(false)
bool(false)
bool(true)
bool(false)
And you can use that principle to pick among your faces with the desired probabilities:
function checkWithSet(array $set, $length=10000)
{
    $left = 0;
    foreach($set as $num=>$right)
    {
        $set[$num] = $left + $right*$length;
        $left = $set[$num];
    }
    $test = mt_rand(1, $length);
    $left = 1;
    foreach($set as $num=>$right)
    {
        if($test>=$left && $test<=$right)
        {
            return $num;
        }
        $left = $right;
    }
    return null;//debug, no event realized
}
The idea is to use geometric probability, i.e. split a line segment into pieces of corresponding length and then check which piece our random number falls into.
                  0.75 0.9
                  |    |
                  V    V
*--------*--*-----*-*--*--* <-- (length)
^        ^  ^       ^     ^
|        |  |       |     |
0      0.4  0.5     0.8   1
Sample will be:
$set = [
    1 => 0.4,
    2 => 0.1,
    3 => 0.25,
    4 => 0.05,
    5 => 0.1,
    6 => 0.1
];
for($i=0; $i<10; $i++)
{
    var_dump(checkWithSet($set));
}
With result like:
int(1)
int(2)
int(2)
int(6)
int(3)
int(1)
int(1)
int(6)
int(1)
int(1)
You can increase $length. In theory this will increase the "quality" of the randomized check, but it is not that simple, because mt_rand() uses a pseudo-random generator, the Mersenne Twister, so the distribution is not ideally uniform.
A quite simple approach would be to have an array of length 100, write your face numbers into it, shuffle it, and take the first element.
So for your example the array contains 40x 1, 10x 2, 25x 3, and so on.
A small code example (not tested):
$probabilities = array(
    1 => 40,
    2 => 10,
    3 => 25,
    4 => 5,
    5 => 10,
    6 => 10
);
$random = array();
foreach($probabilities as $key => $value) {
    for($i = 0; $i < $value; $i++) {
        $random[] = $key;
    }
}
shuffle($random);
echo $random[0];
In your case you might generate a random number from 1 to 100 and then:
if random in 1:40 -> face 1
elseif random in 41:50 -> face 2
and so on.
Of course, real code would be a little more complex, computing the real ranges rather than hardcoding the ifs.
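A minimal sketch of what computing the ranges could look like, assuming the percentages are whole numbers that add up to 100. faceForRoll is a hypothetical helper name; it takes the roll as a parameter so the mapping can be checked separately from the random draw.

```php
<?php
// Hypothetical helper: map a roll in 1..100 to a face using cumulative upper bounds.
function faceForRoll(array $percentages, int $roll): ?int
{
    $upper = 0;
    foreach ($percentages as $face => $percent) {
        $upper += $percent;      // upper bound of this face's range
        if ($roll <= $upper) {
            return $face;
        }
    }
    return null; // roll was outside 1..sum of percentages
}

// Percentages from the question; the last three are filled in so the total is 100.
$faces = [1 => 40, 2 => 10, 3 => 25, 4 => 5, 5 => 10, 6 => 10];

echo faceForRoll($faces, mt_rand(1, 100)) . "\n";
```

Rolls 1 to 40 map to face 1, 41 to 50 to face 2, 51 to 75 to face 3, and so on, without any hardcoded ranges.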
I can think of a very simple solution. It does not alter the random number generator's generation pattern but interprets the outcomes to suit your problem. I'd ask the random number generator to generate numbers between 0 and 99, and then map ranges of the generated number to the values I'm interested in, based on the probability I want to assign to each value:
If result <= 39, face = 1
else if result <= 49, face = 2
else if result <= 74, face = 3
//and so on
I tried changing Alma's code a little.
The main goal was to make the code shorter and simpler.
In this example you input the probabilities as integers rather than decimals, so adding a probability of 7.5% will force you to multiply everything by 10.
// face 1 = 40%, face 2 = 10% etc...
$probabilities = [40, 10, 25, 25];
$results = ['face 1', 'face 2', 'face 3', 'face 4'];
echo checkWithSet($probabilities, $results);
function checkWithSet($probabilities, $results)
{
    $total = array_sum($probabilities);
    $random_num = mt_rand(1, $total);
    $counter = 0;
    foreach($probabilities as $index=>$value)
    {
        $counter += $value;
        // >= so that each face covers exactly $value of the $total outcomes
        if($counter >= $random_num)
        {
            return $results[$index];
        }
    }
}

rounding a number, NOT necessarily decimal PHP

I have a question.
I am using PHP to generate a number based on operations that a user has specified. This variable is called $new.
$new is an integer, and I want to be able to round $new to a 12-digit number, regardless of the answer.
I was thinking I could use round() or ceil(), but I believe these are used for rounding decimal places.
So, I have an integer stored in $new, and when $new is echoed out I want it to print 12 digits, whether the number is 60 billion or 0.00000000006.
If I understand correctly:
function showNumber($input) {
    $show = 12;
    $input = number_format(min($input,str_repeat('9', $show)), ($show-1) - strlen(number_format($input,0,'.','')),'.','');
    return $input;
}
var_dump(showNumber(1));
var_dump(showNumber(0.00000000006));
var_dump(showNumber(100000000000000000000000));
gives
string(12) "1.0000000000"
string(12) "0.0000000001"
string(12) "999999999999"

How to make 5 random numbers with sum of 100 [duplicate]

This question already has answers here:
Getting N random numbers whose sum is M
(9 answers)
Closed 1 year ago.
Do you know a way to split an integer into, say, 5 groups?
Each group's value must be random, but their total must equal a fixed number.
For example, I have 100 and I want to split this number into:
1- 20
2- 3
3- 34
4- 15
5- 18
EDIT: I forgot to say that, yes, balance would be a good thing. I suppose this could be done with an if statement blocking any number above 30, for instance.
I have a slightly different approach from some of the answers here. I create a loose percentage based on the number of items you want to sum, and then add or subtract up to 10% on a random basis.
I then do this n-1 times (n is the total number of values), so you have a remainder. The remainder is then the last number, which isn't itself truly random, but it's based on other random numbers.
Works pretty well.
/**
 * Calculate n random numbers that sum y.
 * Function calculates a percentage based on the number
 * required, gives a random number around that number, then
 * deducts the rest from the total for the final number.
 * Final number cannot be truly random, as it's a fixed total,
 * but it will appear random, as it's based on other random
 * values.
 *
 * @author Mike Griffiths
 * @return Array
 */
private function _random_numbers_sum($num_numbers=3, $total=500)
{
    $numbers = [];
    $loose_pcc = $total / $num_numbers;
    for($i = 1; $i < $num_numbers; $i++) {
        // Random number +/- 10%
        $ten_pcc = $loose_pcc * 0.1;
        // Cast explicitly: mt_rand() expects integer bounds
        $rand_num = mt_rand( (int) ($loose_pcc - $ten_pcc), (int) ($loose_pcc + $ten_pcc) );
        $numbers[] = $rand_num;
    }
    // $numbers now contains 1 less number than it should do, sum
    // all the numbers and use the difference as final number.
    $numbers_total = array_sum($numbers);
    $numbers[] = $total - $numbers_total;
    return $numbers;
}
This:
$random = $this->_random_numbers_sum();
echo 'Total: '. array_sum($random) ."\n";
print_r($random);
Outputs:
Total: 500
Array
(
[0] => 167
[1] => 164
[2] => 169
)
Pick 4 random numbers, each around an average of 20 (with a spread of, e.g., around 40% of 20, i.e. 8). Add a fifth number such that the total is 100.
In response to several other answers here: the last number cannot in fact be random, because the sum is fixed. To picture it, take a line from 0 to 100. Only 4 points on it can be chosen randomly, placed cumulatively, each one a random step of around the mean (total/n, i.e. 20) past the previous one. The result is 5 spacings, which are the 5 random numbers you are looking for.
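A sketch of that picture in code, with one simplification: the cut points are drawn uniformly between 0 and the total rather than stepped around the mean, so the parts come out less balanced than described above. randomParts is a hypothetical name.

```php
<?php
// Hypothetical helper: split $total into $parts non-negative integers by
// choosing ($parts - 1) random cut points on the line 0..$total.
function randomParts(int $total, int $parts): array
{
    $cuts = [];
    for ($i = 1; $i < $parts; $i++) {
        $cuts[] = mt_rand(0, $total);
    }
    sort($cuts);
    $cuts[] = $total; // the end of the line closes the last spacing

    // The spacings between consecutive cut points are the parts.
    $result = [];
    $previous = 0;
    foreach ($cuts as $cut) {
        $result[] = $cut - $previous;
        $previous = $cut;
    }
    return $result;
}

$parts = randomParts(100, 5);
print_r($parts);
echo "Total: " . array_sum($parts) . "\n";
```

Only the 4 cut points are random; the fifth part is whatever is left between the last cut and 100, which is the point made above.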
Depending on how random you need it to be and how resource-rich the environment is where you plan to run the script, you might try the following approach.
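<?php
set_time_limit(10);
$number_of_groups = 5;
$sum_to = 100;
$groups = array();
$group = 0;
// Keep overwriting one group at a time until the groups happen to sum to the target
while(array_sum($groups) != $sum_to)
{
    // intdiv() keeps the upper bound an integer, as mt_rand() expects
    $groups[$group] = mt_rand(0, intdiv($sum_to, mt_rand(1,5)));
    if(++$group == $number_of_groups)
    {
        $group = 0;
    }
}
var_dump($groups);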
An example of the generated result looks something like this. Pretty random.
[root@server ~]# php /var/www/dev/test.php
array(5) {
[0]=>
int(11)
[1]=>
int(2)
[2]=>
int(13)
[3]=>
int(9)
[4]=>
int(65)
}
[root@server ~]# php /var/www/dev/test.php
array(5) {
[0]=>
int(9)
[1]=>
int(29)
[2]=>
int(21)
[3]=>
int(27)
[4]=>
int(14)
}
[root@server ~]# php /var/www/dev/test.php
array(5) {
[0]=>
int(18)
[1]=>
int(26)
[2]=>
int(2)
[3]=>
int(5)
[4]=>
int(49)
}
[root@server ~]# php /var/www/dev/test.php
array(5) {
[0]=>
int(20)
[1]=>
int(25)
[2]=>
int(27)
[3]=>
int(26)
[4]=>
int(2)
}
[root@server ~]# php /var/www/dev/test.php
array(5) {
[0]=>
int(9)
[1]=>
int(18)
[2]=>
int(56)
[3]=>
int(12)
[4]=>
int(5)
}
[root@server ~]# php /var/www/dev/test.php
array(5) {
[0]=>
int(0)
[1]=>
int(50)
[2]=>
int(25)
[3]=>
int(17)
[4]=>
int(8)
}
[root@server ~]# php /var/www/dev/test.php
array(5) {
[0]=>
int(17)
[1]=>
int(43)
[2]=>
int(20)
[3]=>
int(3)
[4]=>
int(17)
}
$number = 100;
$numbers = array();
$iteration = 0;
while($number > 0 && $iteration < 5) {
    $sub_number = rand(1,$number);
    if (in_array($sub_number, $numbers)) {
        continue;
    }
    $iteration++;
    $number -= $sub_number;
    $numbers[] = $sub_number;
}
if ($number != 0) {
    $numbers[] = $number;
}
print_r($numbers);
This should do what you need:
<?php
$tot = 100;
$groups = 5;
$numbers = array();
for($i = 1; $i < $groups; $i++) {
    $num = rand(1, $tot-($groups-$i));
    $tot -= $num;
    $numbers[] = $num;
}
$numbers[] = $tot;
It won't give you a truly balanced distribution, though, since the first numbers will on average be larger.
I think the trick to this is to keep setting the ceiling for your random number generator to 100 - currentTotal.
The solution depends on how random you want your values to be, in other words, what random situation you're going to simulate.
To get a totally random distribution, you'll have to do 100 polls, in each of which one unit is bound to a randomly chosen group. In code:
$groups = array_fill(1, 5, 0);
for ($i = 1; $i <= 100; $i++) {
    $groups[mt_rand(1, 5)]++;
}
For bigger numbers, you could increase the selected group by random(1, n/100) or something like that, until the total sum matches n.
However, you want balance, so I think the best fit for you would be the normal distribution. Draw 5 Gaussian values, which will divide the number (their sum) into 5 parts. Then scale these parts so that their sum is n and round them, and you have your 5 groups.
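A rough sketch of the Gaussian idea, under a few assumptions of mine: the normal values come from a Box-Muller transform on top of mt_rand, the mean of 1 and standard deviation of 0.25 are arbitrary, and any rounding leftover is added to the largest part. gaussian and balancedParts are hypothetical names.

```php
<?php
// Hypothetical helper: one normally distributed value via the Box-Muller transform.
function gaussian(float $mean, float $stdDev): float
{
    $u1 = mt_rand(1, mt_getrandmax()) / mt_getrandmax(); // in (0, 1], keeps log() finite
    $u2 = mt_rand() / mt_getrandmax();
    return $mean + $stdDev * sqrt(-2 * log($u1)) * cos(2 * M_PI * $u2);
}

// Hypothetical helper: $parts roughly balanced integers that sum to $total.
function balancedParts(int $total, int $parts): array
{
    $raw = [];
    for ($i = 0; $i < $parts; $i++) {
        // Mean 1, standard deviation 0.25 (arbitrary); clamp so values stay positive.
        $raw[] = max(0.05, gaussian(1.0, 0.25));
    }

    // Scale so the values sum to $total, then round to integers.
    $scale = $total / array_sum($raw);
    $result = [];
    foreach ($raw as $value) {
        $result[] = (int) round($value * $scale);
    }

    // Rounding can leave the sum off by a few units; add the difference to the largest part.
    $largest = array_search(max($result), $result);
    $result[$largest] += $total - array_sum($result);
    return $result;
}

$groups = balancedParts(100, 5);
print_r($groups);
```

A smaller standard deviation keeps the groups closer to 20 each; a larger one spreads them out more.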
The solution I found to this problem is a little different but makes more sense to me, so in this example I generate an array of numbers that add up to 960. Hope this is helpful.
// the range of the array
$arry = range(1, 999, 1);
// how many numbers do you want
$nrresult = 3;
do {
    // select three random keys, then look up the numbers stored under them
    $keys = array_rand($arry, $nrresult);
    $arry_rand = array_map(function ($key) use ($arry) { return $arry[$key]; }, $keys);
    $arry_fin = array_sum($arry_rand);
    // don't stop till they sum 960
} while ($arry_fin != 960);
// to see the results
echo implode(' + ', $arry_rand);
