Generating random values and keeping track of their sum - php

I have more than 200 entries in a database table and I would like to generate a random value for each entry, but in the end the sum of the entries' values must equal 100. Is it possible to do this using a for loop and rand() in PHP?

You could simply normalize a set of numbers, like:
$numbers = array();
for ($i = 0; $i < 200; $i += 1) {
    $numbers[] = rand();
}
$sum = array_sum($numbers);
// divide $sum by the target sum, to have an instant result, e.g.:
// $sum = array_sum($numbers) / 100;
// $sum = array_sum($numbers) / 42;
// ...
$numbers = array_map(function ($n) use($sum) {
    return $n / $sum;
}, $numbers);
print_r($numbers);
print_r(array_sum($numbers)); // ~ 1
demo: http://codepad.viper-7.com/RDOIvX

One solution is to generate a random number from 0 to 100 for each record and put it in an array, then sum the values and divide that sum by the target (100). Finally, loop through the elements and divide each one by the result of that division; the new values add up to the target.
$sum = 0;
$max = 100; // target sum, also used as the upper bound of each random value
$nr_of_records = 200; // number of records that should sum to $max
$arr = array();
for($i=0;$i<$nr_of_records;++$i)
{
    $arr[$i] = rand(0,$max);
}
$div = array_sum($arr) / $max;
for($i=0;$i<$nr_of_records;++$i)
{
    $arr[$i] /= $div;
    echo $arr[$i].'<br>';
}
echo array_sum($arr);
Created living example

How exact does the 100 have to be? Just curious, because all the hints end up using floating-point values, which tend to be inaccurate.
I'd propose using fractions. Let's say 10000 fractions, each counting as 1/100 of a point (10000 * 1/100 = 100 points). Distribute the 10000 units over the 200 elements using integers, and you can be absolutely sure that the sum of all integers divided by 100 is 100. There is no need for floats; just think around the corner...
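A minimal sketch of that idea, assuming PHP 7+ (for random_int) and a hypothetical helper name, distribute_units, that does not come from the answers here. It hands out 10000 integer units one at a time to randomly chosen slots, so the integer total is exact by construction; the division by 100 happens only when a value is displayed.

```php
<?php
// Hypothetical helper (not from the original answers): spread $units
// indivisible integer units at random over $slots entries.
function distribute_units(int $slots, int $units): array
{
    $values = array_fill(0, $slots, 0);
    for ($i = 0; $i < $units; $i++) {
        // Each unit lands in a uniformly chosen slot.
        $values[random_int(0, $slots - 1)]++;
    }
    return $values;
}

$values = distribute_units(200, 10000); // each unit is worth 1/100 of a point
echo array_sum($values), "\n";          // always exactly 10000 units, i.e. 100 points
printf("%.2f\n", $values[0] / 100);     // convert to points only for display
```

With this scheme the individual values cluster around the average (50 units here). If a wider spread is wanted, the units could be handed out in random-sized chunks instead, at the cost of a little more bookkeeping.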

Do a little over/under:
$size = 200;
$sum = 100;
$places = 3;
$base = round($sum/$size, $places);
$values = array_fill(0, $size, $base);
for($i=0; $i<$size; $i+=2) {
    $diff = round((rand()/getrandmax()) * $base, $places);
    $values[$i] += $diff;
    $values[$i+1] -= $diff;
}
//optional: shuffle($values);
$sum = 0;
foreach($values as $item) {
    printf("%0.3f ", $item);
    $sum += $item;
}
echo $sum;
Output:
0.650 0.350 0.649 0.351 0.911 0.089 0.678 0.322 0.566 0.434 0.563 0.437 0.933 0.067 0.505 0.495 0.503 0.497 0.752 0.248 0.957 0.043 0.856 0.144 0.977 0.023 0.863 0.137 0.766 0.234 0.653 0.347 0.770 0.230 0.888 0.112 0.637 0.363 0.716 0.284 0.891 0.109 0.549 0.451 0.629 0.371 0.501 0.499 0.652 0.348 0.729 0.271 0.957 0.043 0.769 0.231 0.767 0.233 0.513 0.487 0.647 0.353 0.612 0.388 0.509 0.491 0.925 0.075 0.797 0.203 0.799 0.201 0.588 0.412 0.788 0.212 0.693 0.307 0.688 0.312 0.847 0.153 0.903 0.097 0.843 0.157 0.801 0.199 0.538 0.462 0.954 0.046 0.541 0.459 0.893 0.107 0.592 0.408 0.913 0.087 0.711 0.289 0.679 0.321 0.816 0.184 0.781 0.219 0.632 0.368 0.839 0.161 0.568 0.432 0.914 0.086 0.991 0.009 0.979 0.021 0.666 0.334 0.678 0.322 0.705 0.295 0.683 0.317 0.869 0.131 0.837 0.163 0.792 0.208 0.618 0.382 0.606 0.394 0.574 0.426 0.927 0.073 0.661 0.339 0.986 0.014 0.759 0.241 0.547 0.453 0.804 0.196 0.681 0.319 0.960 0.040 0.708 0.292 0.558 0.442 0.605 0.395 0.986 0.014 0.621 0.379 0.992 0.008 0.622 0.378 0.937 0.063 0.884 0.116 0.840 0.160 0.607 0.393 0.765 0.235 0.632 0.368 0.898 0.102 0.946 0.054 0.794 0.206 0.561 0.439 0.801 0.199 0.770 0.230 0.843 0.157 0.681 0.319 0.794 0.206 100
The rounding gets a bit squiffy if you're not using nice numbers like 100 and 200, but never more than 0.1 off.

Original question yesterday had exactly 200 entries and the sum "not greater than 100".
My original answer from yesterday:
Use random numbers not greater than 0.5 to be sure.
Alternatively, depending on how "random" those numbers need to be (how
much correlation is allowed), you could keep a running total, and if
it gets disproportionately high, you can mix in a bunch of smaller
values.
Edit:
Way to go changing the question, making me look stupid and get downvoted.
To get the exact sum you have to normalize, and better use exact fractions instead of floats to avoid rounding errors.

Related

Weighted Random Choice In PHP

I need help getting my probability odds closer to the test results for low percentages. What I have seems to work for percentages of 1% or higher, but I need it to work with very low percentages such as 0.02% (down to 4 decimals). Anything below 1% tends to end up with a probability of around 1%; the results are similar whether I run 1,000 or 100,000 tests at once.
Example Results
ID   Odds      Test Total   Test Odds
1    60.0000   301773       60.3546%
2    30.0000   148360       29.672%
3    9.9800    44897        8.9794%
4    0.0200    4970         0.994%
Function
// $values = [1,2,3,4]
// $weights = [60.0000,30.0000,9.9800,.0200]
private function getRandom($values, $weights)
{
    $count = count($values);
    $i = 0;
    $n = 0;
    $num = mt_rand(0, array_sum($weights));
    while($i < $count)
    {
        $n += $weights[$i];
        if($n >= $num)
            break;
        $i++;
    }
    return $values[$i];
}
mt_rand returns an integer so comparing it to 0.02 is effectively the same as comparing it to 1. Hence you always get around 1% for the weights which are less than 1%. Try computing $num like this instead:
$num = mt_rand(0, array_sum($weights) * 100) / 100;
Demo on 3v4l.org
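The same fix can be taken one step further by doing all of the arithmetic in integers, which also covers the four decimal places asked for in the question. The sketch below is an illustration under assumptions: the helper name weighted_pick and the $scale parameter are hypothetical, weights are assumed to have at most four decimals, and at least one weight must be positive.

```php
<?php
// Hypothetical helper: weighted choice with weights given to 4 decimal places.
function weighted_pick(array $values, array $weights, int $scale = 10000)
{
    // Convert each weight to a whole number of "tickets".
    $tickets = array_map(function ($w) use ($scale) {
        return (int) round($w * $scale);
    }, $weights);

    // Draw a ticket number in 1..total; every ticket is equally likely.
    $draw = mt_rand(1, array_sum($tickets));

    $running = 0;
    foreach ($tickets as $i => $count) {
        $running += $count;
        if ($draw <= $running) {
            return $values[$i];
        }
    }
    return end($values); // not reached when at least one weight is positive
}

echo weighted_pick([1, 2, 3, 4], [60.0000, 30.0000, 9.9800, 0.0200]), "\n";
```

Drawing from 1 to the total, rather than from 0, means each ticket is counted exactly once, so the first entry does not get one extra chance.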

Defining percentage for random number

My rand(0,1) PHP call returns 0 or 1 at random.
Can I define something in PHP so that 30% of the results are 0 and 70% are 1 across the random calls? Does PHP have a built-in function for this?
Sure.
$rand = (float)rand()/(float)getrandmax();
if ($rand < 0.3)
    $result = 0;
else
    $result = 1;
You can deal with arbitrary results and weights, too.
$weights = array(0 => 0.3, 1 => 0.2, 2 => 0.5);
$rand = (float)rand()/(float)getrandmax();
foreach ($weights as $value => $weight) {
    if ($rand < $weight) {
        $result = $value;
        break;
    }
    $rand -= $weight;
}
You can do something like this:
$rand = (rand(0,9) > 2 ? 1 : 0);
rand(0,9) produces a random integer between 0 and 9, and whenever that number is greater than 2 (seven of the ten possible values, so about 70% of the time), the expression gives you 1, otherwise 0.
It seems to be the easiest solution to me. It won't give you 1 in exactly 70% of the calls, but over many calls it should come quite close.
I doubt that any solution based on rand will give you 1 exactly 70% of the time.
Generate a new random value between 1 and 100. If the value is 30 or below, use 0, and 1 otherwise:
$probability = rand(1, 100);
if ($probability <= 30) {
    echo 0;
} else {
    echo 1;
}
To test this theory, consider the following loop:
$arr = array();
for ($i = 0; $i < 10000; $i++) {
    $probability = rand(1, 100);
    if ($probability <= 30) {
        $arr[] = 0;
    } else {
        $arr[] = 1;
    }
}
$c = array_count_values($arr);
echo "0 = " . $c['0'] / 10000 * 100 . "\n";
echo "1 = " . $c['1'] / 10000 * 100 . "\n";
Output:
0 = 29.33
1 = 70.67
Create an array with 70% 1 and 30% 0s. Then random sort it. Then start picking numbers from the beginning of the array to the end :)
$num_array = array();
for($i = 0; $i < 3; $i++) $num_array[] = 0; // append, so the 1s below do not overwrite these
for($i = 0; $i < 7; $i++) $num_array[] = 1;
shuffle($num_array);
Pros:
You'll get exactly 30% 0 and 70% 1 for any such array.
Cons: Might take longer computation time than a rand() only solution to create the initial array.
I searched for an answer to my question and this was the topic I found.
It didn't answer my question, so I had to figure it out myself, and I did :).
I figured that maybe this will help someone else as well.
It's related to what you asked, but for a broader use.
Basically, I use it as a "power" calculator for a randomly generated item (let's say a weapon). The item has a "min power" and a "max power" value in the db. I wanted an 80% chance of the "power" value falling in the lower 80% of the item's possible range, and a 20% chance of it falling in the top 20% (the limits are stored in the db).
So, to do this I did the following:
$min = 1; // this value is normally taken from the db
$max = 30; // this value is normally taken from the db
$total_possibilities = ($max - $min) + 1;
$rand = random_int(1, 100);
if ($rand <= 80) { // 80% chance
    // remove 20% from the max value, so you can get a number only from the lowest 80%
    $new_max = (int) floor($max - ($total_possibilities * 0.20)); // random_int() needs integers
    $new_rand = random_int($min, $new_max);
} elseif ($rand <= 100) { // 20% chance
    // add 80% to the min value, so you can get a number only from the highest 20%
    $new_min = (int) ceil($min + ($total_possibilities * 0.80));
    $new_rand = random_int($new_min, $max);
}
echo $new_rand; // this will be the final item power
The only problem you can run into is when the initial $min and $max values are the same (or, obviously, when $min is bigger than $max). This will throw an error, since random_int() expects its arguments in ($min, $max) order, not the other way around.
This code can very easily be changed to use other percentages for different purposes: instead of 80% and 20% you could use 40%, 40% and 20% (or whatever you need). I think the code is pretty easy to read and understand.
Sorry if this is not helpful, but I hope it is :).
It can't do any harm either way ;).

PHP Generate x amount of random odd numbers within a range

I need to generate x amount of random odd numbers, within a given range.
I know this can be achieved with simple looping, but I'm unsure which approach would be the best, and is there a better mathematical way of solving this.
EDIT: Also I cannot have the same number more than once.
Generate x integer values over half the range, and for each value double it and add 1.
ANSWERING REVISED QUESTION: 1) Generate a list of candidates in range, shuffle them, and then take the first x. Or 2) generate values as per my original recommendation, and reject and retry if the generated value is in the list of already generated values.
The first will work better if x is a substantial fraction of the range, the latter if x is small relative to the range.
ADDENDUM: Should have thought of this approach earlier, it's based on conditional probability. I don't know php (I came at this from the "random" tag), so I'll express it as pseudo-code:
generate(x, upper_limit)
loop with index i from upper_limit downto 1 by 2
p_value = x / floor((i + 1) / 2)
if rand <= p_value
include i in selected set
decrement x
return/exit if x <= 0
end if
end loop
end generate
x is the desired number of values to generate, upper_limit is the largest odd number in the range, and rand generates a uniformly distributed random number between zero and one. Basically, it steps through the candidate set of odd numbers and accepts or rejects each one based how many values you still need and how many candidates still remain.
I've tested this and it really works. It requires less intermediate storage than shuffling and fewer iterations than the original acceptance/rejection.
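For PHP readers, the pseudo-code above might be translated roughly as follows. This is a sketch under assumptions: the function name pick_odd_numbers is hypothetical, the candidates are the odd numbers 1, 3, ..., $upperLimit, $upperLimit itself is odd, and $x does not exceed the number of candidates. It uses mt_rand(1, $remaining) <= $x as an integer way of accepting with probability $x / $remaining.

```php
<?php
// Hypothetical PHP port of the pseudo-code above (selection sampling).
function pick_odd_numbers(int $x, int $upperLimit): array
{
    $selected = [];
    for ($i = $upperLimit; $i >= 1 && $x > 0; $i -= 2) {
        $remaining = intdiv($i + 1, 2); // odd candidates left, including $i
        // Accept $i with probability $x / $remaining.
        if (mt_rand(1, $remaining) <= $x) {
            $selected[] = $i;
            $x--;
        }
    }
    return $selected;
}

print_r(pick_odd_numbers(5, 19)); // five distinct odd numbers from 1..19, in descending order
```

The result comes out sorted in descending order; call shuffle() on it if the order should be random as well.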
Generate a list of elements in the range, remove the element you want in your random series. Repeat x times.
Or you can generate an array with the odd numbers in the range, then do a shuffle
Generation is easy:
$range_array = array();
for ($i = 1; $i <= $max_value; $i += 2) { // odd numbers from 1 up to $max_value
    $range_array[] = $i;
}
Shuffle
shuffle( $range_array );
Slice out the first x elements.
$result = array_slice( $range_array, 0, $x );
This is a complete solution.
function mt_rands($min_rand, $max_rand, $num_rand){
    if(!is_integer($min_rand) or !is_integer($max_rand)){
        return false;
    }
    if($min_rand >= $max_rand){
        return false;
    }
    if(!is_integer($num_rand) or ($num_rand < 1)){
        return false;
    }
    // cannot draw more unique values than the range contains
    if($num_rand > ($max_rand - $min_rand + 1)){
        return false;
    }
    $rands = array();
    while(count($rands) < $num_rand){
        $loops = 0;
        do{
            ++$loops; // loop limiter, use it if you want to
            $rand = mt_rand($min_rand, $max_rand);
        }while(in_array($rand, $rands, true));
        $rands[] = $rand;
    }
    return $rands;
}
// let's see how it went
var_export($rands = mt_rands(0, 50, 5));
Code is not tested. Just wrote it. Can be improved a bit but it's up to you.
This code generates 5 odd unique numbers in the interval [1, 20]. Change $min, $max and $n = 5 according to your needs.
<?php
function odd_filter($x)
{
    if (($x % 2) == 1)
    {
        return true;
    }
    return false;
}
// seed with microseconds
function make_seed()
{
    list($usec, $sec) = explode(' ', microtime());
    return (int) ((float) $sec + ((float) $usec * 100000)); // srand() expects an integer
}
srand(make_seed());
$min = 1;
$max = 20;
//number of random numbers
$n = 5;
if (($max - $min + 1)/2 < $n)
{
    print "interval [$min, $max] is too short to generate $n odd numbers!\n";
    exit(1);
}
$result = array();
for ($i = 0; $i < $n; ++$i)
{
    $x = rand($min, $max);
    //not exists in the hash and is odd
    if(!isset($result[$x]) && odd_filter($x))
    {
        $result[$x] = 1;
    }
    else//new iteration needed
    {
        --$i;
    }
}
$result = array_keys($result);
var_dump($result);

displaying axis from min to max value - calculating scale and labels

Writing a routine to display data on a horizontal axis (using PHP gd2, but that's not the point here).
The axis starts at $min to $max and displays a diamond at $result, such an image will be around 300px wide and 30px high, like this:
(source: testwolke.de)
In the example above, $min=0, $max=3, $result=0.6.
Now, I need to calculate a scale and labels that make sense, in the above example e.g. dotted lines at 0 .25 .50 .75 1 1.25 ... up to 3, with number-labels at 0 1 2 3.
If $min=-200 and $max=600, dotted lines should be at -200 -150 -100 -50 0 50 100 ... up to 600, with number-labels at -200 -100 0 100 ... up to 600.
With $min=.02 and $max=5.80, dotted lines at .02 .5 1 1.5 2 2.5 ... 5.5 5.8 and numbers at .02 1 2 3 4 5 5.8.
I tried explicitly telling the function where to put dotted lines and numbers by arrays, but hey, it's the computer who's supposed to work, not me, right?!
So, how to calculate???
An algorithm (example values $min=-186 and $max=+153 as limits):
1. Take these two limits $min, $max and mark them if you wish.
2. Calculate the difference between $max and $min: $diff = $max - $min
   153 - (-186) = 339
3. Calculate the base-10 logarithm of the difference: $base10 = log($diff, 10) = 2.5302
4. Round down: $power = floor($base10) = 2.
   This is your power of ten for the base unit.
5. To calculate $step:
   $base_unit = pow(10, $power) = 100;
   $step = $base_unit / 2; (if you want 2 ticks per $base_unit).
6. Check whether $min is divisible by $step; if not, take the nearest multiple above it
   (in the case of $step = 50 it is $loop_start = -150).
7. Loop over the ticks:
   for ($i = $loop_start; $i <= $max; $i += $step) { // $i's are your ticks
   }
I tested it in Excel and it gives quite nice results. You may want to extend it,
for example (in point 5) by calculating $step first from $diff,
say $step = $diff / 4, and rounding $step in such a way that $base_unit is divisible by $step;
this avoids situations such as getting only four ticks between 101 and 201 with $step=25 while getting 39 ticks with the same $step=25 between 0 and 999.
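The basic steps of this answer can be sketched in PHP as follows. The function name axis_ticks and the $ticksPerUnit parameter are assumptions made for illustration, and the sketch assumes $max is greater than $min. Ticks are computed by multiplying an integer index by the step, rather than by repeated addition, to limit floating-point drift.

```php
<?php
// Hypothetical helper following the steps above; not part of the original answer.
function axis_ticks(float $min, float $max, int $ticksPerUnit = 2): array
{
    $diff  = $max - $min;                // assumes $max > $min
    $power = (int) floor(log10($diff));  // power of ten at or below $diff
    $step  = pow(10, $power) / $ticksPerUnit;
    $first = (int) ceil($min / $step);   // index of the first tick at or above $min
    $last  = (int) floor($max / $step);  // index of the last tick at or below $max
    $ticks = [];
    for ($k = $first; $k <= $last; $k++) {
        $ticks[] = $k * $step;
    }
    return $ticks;
}

print_r(axis_ticks(-186, 153)); // ticks from -150 to 150 in steps of 50
```

For the question's first example, axis_ticks(0, 3) yields ticks every 0.5 from 0 to 3; passing 4 as the third argument would give the 0.25 spacing asked for there.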
ACM Algorithm 463 provides three simple functions to produce good axis scales with outputs xminp, xmaxp and dist for the minimum and maximum values on the scale and the distance between tick marks on the scale, given a request for n intervals that include the data points xmin and xmax:
Scale1() gives a linear scale with approximately n intervals and dist being an integer power of 10 times 1, 2 or 5.
Scale2() gives a linear scale with exactly n intervals (the gap between xminp and xmaxp tends to be larger than the gap produced by Scale1()).
Scale3() gives a logarithmic scale.
The original 1973 paper is online here, which provides more explanation than the code linked to above.
The code is in Fortran but it is just a set of arithmetical calculations so it is very straightforward to interpret and convert into other languages. I haven't written any PHP myself, but it looks a lot like C so you might want to start by running the code through f2c which should give you something close to runnable in PHP.
There are more complicated functions that give prettier scales (e.g. the ones in gnuplot), but Scale1() would likely do the job for you with minimal code.
(This answer builds on my answer to a previous question Graph axis calibration in C++)
(EDIT -- I've found an implementation of Scale1() that I did in Perl):
use strict;
use warnings;

# Perl has no built-in log10, so define one
sub log10 { return log($_[0]) / log(10); }

sub scale1 {
    # from TOMS 463
    # returns a suitable scale ($xMinp, $xMaxp, $dist), when called with
    # the minimum and maximum x values, and an approximate number of intervals
    # to divide into. $dist is the size of each interval that results.
    # @vInt is an array of acceptable values for $dist.
    # @sqr is an array of geometric means of adjacent values of @vInt, which
    # is used as break points to determine which @vInt value to use.
    #
    my ($xMin, $xMax, $n) = @_;
    my @vInt = (1, 2, 5, 10);
    my @sqr = (1.414214, 3.162278, 7.071068);
    if ($xMin > $xMax) {
        my ($tmp) = $xMin;
        $xMin = $xMax;
        $xMax = $tmp;
    }
    my ($del) = 0.0002; # accounts for computer round-off
    my ($fn) = $n;
    # find approximate interval size $size
    my ($size) = ($xMax - $xMin) / $fn;
    my ($al) = log10($size);
    my ($nal) = int($al);
    if ($size < 1) {
        $nal = $nal - 1;
    }
    # $size is scaled into a variable named $scaled, between 1 and 10
    my ($scaled) = $size / 10**$nal;
    # the closest permissible value for $scaled is found
    my ($i);
    for ($i = 0; $i < @sqr; $i++) {
        last if ($scaled < $sqr[$i]);
    }
    # the interval size is computed
    my $dist = $vInt[$i] * 10**$nal;
    my $fm1 = $xMin / $dist;
    my $m1 = int($fm1);
    $m1-- if ($fm1 < 0);
    $m1++ if (abs(($m1 + 1.0) - $fm1) < $del);
    # the new minimum and maximum limits are found
    my $xMinp = $dist * $m1;
    my $fm2 = $xMax / $dist;
    my $m2 = int($fm2 + 1);
    $m2-- if ($fm2 < -1);
    $m2-- if (abs($fm2 + 1 - $m2) < $del);
    my $xMaxp = $dist * $m2;
    # adjust limits to account for round-off if necessary
    $xMinp = $xMin if ($xMinp > $xMin);
    $xMaxp = $xMax if ($xMaxp < $xMax);
    return ($xMinp, $xMaxp, $dist);
}
sub scale1_Test {
    my @par = (-3.1, 11.1, 5,
               5.2, 10.1, 5,
               -12000, -100, 9);
    print "xMin\txMax\tn\txMinp\txMaxp,dist\n";
    for (my $i = 0; $i < @par / 3; $i++) {
        my ($xMinp, $xMaxp, $dist) = scale1($par[3*$i+0],
            $par[3*$i+1], $par[3*$i+2]);
        print "$par[3*$i+0]\t$par[3*$i+1]\t$par[3*$i+2]\t$xMinp\t$xMaxp,$dist\n";
    }
}
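Since the question is about PHP, here is a rough PHP sketch of the same Scale1 idea. It is an illustrative port, not the published routine: it assumes $xMin differs from $xMax and $n is positive, and it uses floor() where the Fortran-style code truncates and then corrects for negative values, which should give the same limits.

```php
<?php
// Hypothetical PHP port of Scale1 (TOMS 463) as described above.
function scale1(float $xMin, float $xMax, int $n): array
{
    $vInt = [1, 2, 5, 10];                    // acceptable interval mantissas
    $sqr  = [1.414214, 3.162278, 7.071068];   // geometric means between them
    $del  = 0.0002;                           // tolerance for round-off

    if ($xMin > $xMax) {
        [$xMin, $xMax] = [$xMax, $xMin];
    }

    $size   = ($xMax - $xMin) / $n;           // approximate interval size
    $nal    = (int) floor(log10($size));      // its power of ten
    $scaled = $size / pow(10, $nal);          // scaled into the range 1..10

    // pick the closest permissible mantissa
    $i = 0;
    while ($i < count($sqr) && $scaled >= $sqr[$i]) {
        $i++;
    }
    $dist = $vInt[$i] * pow(10, $nal);

    // round the limits outwards to multiples of $dist, within tolerance
    $fm1 = $xMin / $dist;
    $m1  = (int) floor($fm1);
    if (abs($m1 + 1 - $fm1) < $del) {
        $m1++;
    }
    $fm2 = $xMax / $dist;
    $m2  = (int) floor($fm2) + 1;
    if (abs($fm2 + 1 - $m2) < $del) {
        $m2--;
    }

    // returns [scale minimum, scale maximum, tick distance]
    return [min($dist * $m1, $xMin), max($dist * $m2, $xMax), $dist];
}

[$lo, $hi, $dist] = scale1(-3.1, 11.1, 5);
echo "$lo $hi $dist\n";
```

Tick positions can then be generated by stepping from the returned minimum to the returned maximum in increments of the returned distance.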
I know that this isn't exactly what you are looking for, but hopefully it will get you started in the right direction.
$min = -200;
$max = 600;
$difference = $max - $min;
$labels = 10;
$picture_width = 300;
/* Get units per label */
$difference_between = $difference / ($labels - 1);
$width_between = $picture_width / ($labels - 1); // same number of gaps, so the last label lands on the right edge
/* Make the label array */
$label_arr = array();
$label_arr[] = array('label' => $min, 'x_pos' => 0);
/* Loop through the number of labels */
for($i = 1, $l = $labels; $i < $l; $i++) {
    $label = $min + ($difference_between * $i);
    $label_arr[] = array('label' => $label, 'x_pos' => $width_between * $i);
}
A quick example would be something along the lines of $increment = ($max-$min)/$scale, where you can tweak $scale to control how many increments you get. Since you divide by it, the increment changes proportionately as your max and min values change. After that you will have a loop like:
$last_value = $min;
$end = false;
while($end==false){
    $breakpoint = $last_value + $increment; // that's your current breakpoint
    if($breakpoint > $max){
        $end = true;
    }
    $last_value = $breakpoint; // advance, otherwise the loop never ends
}
At least that's the concept... Let me know if you have trouble with it.

Smaller number generation from a large number and total of smaller number must be a large number

How can I generate a fixed count of smaller random numbers from a large number? The sum of these smaller numbers must equal the large number. Suppose I want to generate 400 random numbers whose sum is, e.g., 1,000,000. Every number should be unique and may take any value: number 1 might be 1000 while number 2 might be only 5. But the total of all the numbers must be the large number. Is there an algorithm for this kind of operation in PHP?
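function array_generate_sum($n, $total)
{
    $sum = 0;
    $arr = array();
    // inside the loop, $n counts the values still to come after the current one
    for($n--; $n >= 0; $n--)
    {
        // the last value takes the remainder; earlier ones leave at least 1 for each later value
        $current = $n == 0 ? $total - $sum : mt_rand(1, $total - $sum - $n);
        $sum += $current;
        $arr[] = $current;
    }
    return $arr;
}
// Generate an array of 5 values whose sum is 30
array_generate_sum(5, 30);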
