Generating a random number in the range [M..N] is easy enough. I however would like to generate a series of random numbers in that range with mean X (M < X < N).
For example, assume the following:
M = 10000
N = 1000000
X = 20000
I would like to generate a large number of random numbers such that the entire range [M..N] is covered, but numbers closer to N should become increasingly rare. Numbers closer to M should be more common, so that the mean converges to X.
The intended target language is PHP, but this is not a language question per se.
There are many ways to accomplish this, and the right one depends heavily on how much precision you need. The following code uses the 68-95-99.7 rule, based on the normal distribution, with a standard deviation of 15% of the mean.
It does not:
ensure exact precision. If you need that, you have to calculate the real mean and compensate for the difference.
create a true normal distribution curve, as all values within each of the three bands (68-95-99.7) are treated as equally likely.
It does however give you a start:
<?php
$mean = (int)$_GET['mean']; // The mean you want
$amnt = (int)$_GET['amnt']; // The amount of integers to generate
$sd = (int) round($mean * 0.15); // integer, because mt_rand() expects integer bounds
$numbers = array();
for($i=0;$i<$amnt;$i++) // start at 0 so that exactly $amnt numbers are generated
{
$n = mt_rand(($mean-$sd), ($mean+$sd));
$r = mt_rand(10,1000)/10; // For decimal counting
if($r>68)
{
if(2==mt_rand(1,2)) // Coin flip, should it add or subtract?
{
$n = $n+$sd;
}
else
{
$n = $n-$sd;
}
}
if($r>95)
{
if(2==mt_rand(1,2))
{
$n = $n+$sd;
}
else
{
$n = $n-$sd;
}
}
if($r>99.7)
{
if(2==mt_rand(1,2))
{
$n = $n+$sd;
}
else
{
$n = $n-$sd;
}
}
$numbers[] = $n;
}
arsort($numbers);
print_r($numbers);
// Echo real mean to see how far off you get. Typically within 1%
/*
$sum = 0;
foreach($numbers as $val)
{
$sum = $sum + $val;
}
echo $rmean = $sum/$amnt;
*/
?>
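The compensation step mentioned above (calculate the real mean and correct for the difference) could be sketched roughly as follows. The helper name shift_to_mean and the sample values are made up for illustration, and a uniform shift like this may push values slightly outside the intended range, so treat it as a starting point only.

```php
<?php
// Hypothetical helper: shift every value by the same offset so that the
// sample mean equals $target. This does not clamp values to a range.
function shift_to_mean(array $numbers, float $target): array
{
    $actual = array_sum($numbers) / count($numbers);
    $offset = $target - $actual;
    return array_map(fn($n) => $n + $offset, $numbers);
}

// Made-up sample values standing in for the generated $numbers array
$numbers = [18500, 21000, 19750, 23000];
$adjusted = shift_to_mean($numbers, 20000);
echo array_sum($adjusted) / count($adjusted), "\n";
```

If the values must stay integers, the offset would have to be rounded and the remainder spread over a few elements.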
Hope it helps!
I need to find the value of x where the variance of two results (which take x into account) is the closest to 0. The problem is, the only way to do this is to cycle through all possible values of x. The equation uses currency, so I have to check in increments of 1 cent.
This might make it easier:
$previous_var = null;
$high_amount = 50;
for ($i = 0.01; $i <= $high_amount; $i += 0.01) {
$val1 = find_out_1($i);
$val2 = find_out_2();
$var = variance($val1, $val2);
if ($previous_var == null) {
$previous_var = $var;
}
// If this variance is larger, it means the previous one was the closest to
// 0 as the variance has now started increasing
if ($var > $previous_var) {
$l_s -= 0.01;
break;
}
}
$optimal_monetary_value = $i;
I feel like there is a mathematical formula that would make the "cycling through every cent" more optimal? It works fine for small values, but if you start using 1000's as the $high_amount it takes quite a few seconds to calculate.
Based on the comment in your code, it sounds like you want something similar to bisection search, but a little bit different:
function calculate_variance($i) {
$val1 = find_out_1($i);
$val2 = find_out_2();
return variance($val1, $val2);
}
function search($lo, $loVar, $hi, $hiVar) {
// find the midpoint between the hi and lo values
$mid = round($lo + ($hi - $lo) / 2, 2);
if ($mid == $hi || $mid == $lo) {
// we have converged, so pick the better value and be done
return ($hiVar > $loVar) ? $lo : $hi;
}
$midVar = calculate_variance($mid);
if ($midVar >= $loVar) {
// the optimal point must be in the lower interval
return search($lo, $loVar, $mid, $midVar);
} elseif ($midVar >= $hiVar) {
// the optimal point must be in the higher interval
return search($mid, $midVar, $hi, $hiVar);
} else {
// we don't know where the optimal point is for sure, so check
// the lower interval first
$loBest = search($lo, $loVar, $mid, $midVar);
if ($loBest == $mid) {
// we can't be sure this is the best answer, so check the hi
// interval to be sure
return search($mid, $midVar, $hi, $hiVar);
} else {
// we know this is the best answer
return $loBest;
}
}
}
$optimal_monetary_value = search(0.01, calculate_variance(0.01), 50.0, calculate_variance(50.0));
This assumes that the variance is monotonically increasing when moving away from the optimal point. In other words, if the optimal value is O, then for all X < Y < O, calculate_variance(X) >= calculate_variance(Y) >= calculate_variance(O) (and the same with all > and < flipped). The comment in your code and the way you have it written make it seem like this is true. If this isn't true, then you can't really do much better than what you have.
Be aware that this is not as good as bisection search. There are some pathological inputs that will make it take linear time instead of logarithmic time (e.g., if the variance is the same for all values). If you can strengthen the requirement from calculate_variance(X) >= calculate_variance(Y) >= calculate_variance(O) to calculate_variance(X) > calculate_variance(Y) > calculate_variance(O), you can make this logarithmic in all cases by checking how the variance for $mid compares to the variance for $mid + 0.01 and using that to decide which interval to check.
Also, you may want to be careful about doing math with currency. You probably either want to use integers (i.e., do all math in cents instead of dollars) or use exact precision numbers.
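As a minimal sketch of the "do all math in cents" suggestion, the loop counter can be an integer number of cents that is converted to a currency amount only at the point of use. The variable names here are illustrative and not taken from the question.

```php
<?php
// Iterate over candidate amounts in whole cents so the loop counter never
// accumulates floating-point error; convert only where a currency value
// is actually needed.
$highAmountCents = 50 * 100;
$candidates = 0;
$amount = 0;
for ($cents = 1; $cents <= $highAmountCents; $cents++) {
    $amount = $cents / 100; // e.g. 1 cent => 0.01
    $candidates++;
}
echo $candidates, " candidates, last amount ", $amount, "\n";
```

The same idea applies inside the search functions: pass cents around and divide by 100 only when calling code that expects a decimal amount.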
If you know nothing at all about the behavior of the objective function, there is no other way than trying all possible values.
Conversely, if you have a guarantee that the minimum is unique, the golden-section method will converge very quickly. This is a variant of the Fibonacci search, which is known to be optimal (it requires the minimum number of function evaluations).
Your function may have different properties which call for other algorithms.
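For reference, a golden-section search could look roughly like the sketch below. It assumes the objective is unimodal on the interval; the function name golden_section_min and the quadratic test objective are invented for the example and would be replaced by the real variance computation.

```php
<?php
// Golden-section search for the minimum of a unimodal function on [$lo, $hi].
// $f is a stand-in for the real objective (e.g. the variance calculation).
function golden_section_min(callable $f, float $lo, float $hi, float $tol = 0.001): float
{
    $invPhi = (sqrt(5) - 1) / 2; // 1/phi, roughly 0.618
    $c = $hi - $invPhi * ($hi - $lo);
    $d = $lo + $invPhi * ($hi - $lo);
    $fc = $f($c);
    $fd = $f($d);
    while (($hi - $lo) > $tol) {
        if ($fc < $fd) {
            // minimum lies in [$lo, $d]; the old $c is reused as the new $d
            $hi = $d;
            $d = $c;
            $fd = $fc;
            $c = $hi - $invPhi * ($hi - $lo);
            $fc = $f($c);
        } else {
            // minimum lies in [$c, $hi]; the old $d is reused as the new $c
            $lo = $c;
            $c = $d;
            $fc = $fd;
            $d = $lo + $invPhi * ($hi - $lo);
            $fd = $f($d);
        }
    }
    return ($lo + $hi) / 2;
}

// Made-up objective with its minimum at 12.34
$best = golden_section_min(fn($x) => ($x - 12.34) ** 2, 0.01, 50.0);
echo round($best, 2), "\n";
```

Each iteration costs only one new function evaluation, which is where the speed-up over checking every cent comes from.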
Why not implement a binary search?
<?php
$high_amount = 50;
// find_out_2() does not depend on the candidate value,
// so compute it once outside the loop
$val2 = find_out_2();
// work in whole cents to avoid floating-point drift
$start = 1;
$end = $high_amount * 100;
// assumes the variance decreases down to a single minimum and then increases
while ($start < $end) {
    $section = intdiv($start + $end, 2);
    $here = variance(find_out_1($section / 100), $val2);
    $next = variance(find_out_1(($section + 1) / 100), $val2);
    if ($next < $here) {
        // still descending: the minimum lies to the right of $section
        $start = $section + 1;
    }
    else {
        // flat or ascending: the minimum is at $section or to its left
        $end = $section;
    }
}
$optimal_monetary_value = $start / 100;
I need to generate x amount of random odd numbers, within a given range.
I know this can be achieved with simple looping, but I'm unsure which approach would be the best, and is there a better mathematical way of solving this.
EDIT: Also I cannot have the same number more than once.
Generate x integer values over half the range, and for each value double it and add 1.
ANSWERING REVISED QUESTION: 1) Generate a list of candidates in range, shuffle them, and then take the first x. Or 2) generate values as per my original recommendation, and reject and retry if the generated value is in the list of already generated values.
The first will work better if x is a substantial fraction of the range, the latter if x is small relative to the range.
ADDENDUM: Should have thought of this approach earlier, it's based on conditional probability. I don't know php (I came at this from the "random" tag), so I'll express it as pseudo-code:
generate(x, upper_limit)
    loop with index i from upper_limit downto 1 by 2
        p_value = x / floor((i + 1) / 2)
        if rand <= p_value
            include i in selected set
            decrement x
            return/exit if x <= 0
        end if
    end loop
end generate
x is the desired number of values to generate, upper_limit is the largest odd number in the range, and rand generates a uniformly distributed random number between zero and one. Basically, it steps through the candidate set of odd numbers and accepts or rejects each one based on how many values you still need and how many candidates still remain.
I've tested this and it really works. It requires less intermediate storage than shuffling and fewer iterations than the original acceptance/rejection.
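A possible PHP rendering of the pseudo-code above is sketched here. The function name pick_odd_numbers is made up, the range is assumed to start at 1 as in the pseudo-code, and the acceptance test uses integers (mt_rand(1, remaining) <= x) instead of a floating-point comparison, which gives the same probability x / remaining.

```php
<?php
// Selection sampling: choose $x distinct odd numbers from 1..$upperLimit.
function pick_odd_numbers(int $x, int $upperLimit): array
{
    if ($upperLimit % 2 === 0) {
        $upperLimit--; // make sure we start on an odd number
    }
    $selected = [];
    for ($i = $upperLimit; $i >= 1 && $x > 0; $i -= 2) {
        $remaining = intdiv($i + 1, 2); // odd candidates left, including $i
        // accept $i with probability $x / $remaining
        if (mt_rand(1, $remaining) <= $x) {
            $selected[] = $i;
            $x--;
        }
    }
    return $selected;
}

print_r(pick_odd_numbers(5, 19));
```

The values come out in descending order, so shuffle() the result if the order should be random as well.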
Generate a list of elements in the range, remove the element you want in your random series. Repeat x times.
Or you can generate an array with the odd numbers in the range, then do a shuffle
Generation is easy:
$range_array = array();
for( $i = 0; $i < $max_value; $i++){
$range_array[] = $i*2 + 1; // the first $max_value odd numbers: 1, 3, 5, ...
}
Shuffle
shuffle( $range_array );
Take the first x elements:
$result = array_slice( $range_array, 0, $x );
This is a complete solution.
function mt_rands($min_rand, $max_rand, $num_rand){
if(!is_integer($min_rand) or !is_integer($max_rand)){
return false;
}
if($min_rand >= $max_rand){
return false;
}
if(!is_integer($num_rand) or ($num_rand < 1)){
return false;
}
if($num_rand > ($max_rand - $min_rand + 1)){ // more unique values requested than the range contains
return false;
}
$rands = array();
while(count($rands) < $num_rand){
$loops = 0;
do{
++$loops; // loop limiter, use it if you want to
$rand = mt_rand($min_rand, $max_rand);
}while(in_array($rand, $rands, true));
$rands[] = $rand;
}
return $rands;
}
// let's see how it went
var_export($rands = mt_rands(0, 50, 5));
Code is not tested. Just wrote it. Can be improved a bit but it's up to you.
This code generates 5 odd unique numbers in the interval [1, 20]. Change $min, $max and $n = 5 according to your needs.
<?php
function odd_filter($x)
{
if (($x % 2) == 1)
{
return true;
}
return false;
}
// seed with microseconds
function make_seed()
{
list($usec, $sec) = explode(' ', microtime());
return (float) $sec + ((float) $usec * 100000);
}
srand(make_seed());
$min = 1;
$max = 20;
//number of random numbers
$n = 5;
if (($max - $min + 1)/2 < $n)
{
print "interval [$min, $max] is too short to generate $n odd numbers!\n";
exit(1);
}
$result = array();
for ($i = 0; $i < $n; ++$i)
{
$x = rand($min, $max);
//not exists in the hash and is odd
if(!isset($result[$x]) && odd_filter($x))
{
$result[$x] = 1;
}
else//new iteration needed
{
--$i;
}
}
$result = array_keys($result);
var_dump($result);
I want to calculate Frequency (Monobits) test in PHP:
Description: The focus of the test is
the proportion of zeroes and ones for
the entire sequence. The purpose of
this test is to determine whether that
number of ones and zeros in a sequence
are approximately the same as would be
expected for a truly random sequence.
The test assesses the closeness of the
fraction of ones to ½, that is, the
number of ones and zeroes in a
sequence should be about the same.
I am wondering whether I really need to count the 0's and 1's (the bits), or whether the following is adequate:
$value = 0;
// Loop through all the bytes and sum them up.
for ($a = 0, $length = strlen((binary) $data); $a < $length; $a++)
$value += ord($data[$a]);
// The average should be 127.5.
return (float) $value/$length;
If the above is not the same, then how do I exactly calculate the 0's and 1's?
No, you really need to check all zeroes and ones. For example, take the following binary input:
01111111 01111101 01111110 01111010
It is clearly (literally) one-sided (8 zeroes, 24 ones; the correct result is 24/32 = 3/4 = 0.75) and therefore not random. However, your test would compute 125.0/255, which is close to ½.
Instead, count like this:
function one_proportion($binary) {
$oneCount = 0;
$len = strlen($binary);
for ($i = 0;$i < $len;$i++) {
$intv = ord($binary[$i]);
// check all 8 bits of the byte (0 through 7)
for ($bitp = 0;$bitp < 8;$bitp++) {
$oneCount += ($intv>>$bitp) & 0x1;
}
}
return $oneCount / (8 * $len);
}
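To make the difference concrete, the sketch below runs both measures on the four example bytes given above. The helper name count_one_bits is made up for this illustration.

```php
<?php
// Count the set bits in a binary string, one byte at a time.
function count_one_bits(string $data): int
{
    $ones = 0;
    for ($i = 0, $len = strlen($data); $i < $len; $i++) {
        $byte = ord($data[$i]);
        while ($byte > 0) {
            $ones += $byte & 1;
            $byte >>= 1;
        }
    }
    return $ones;
}

// The example input: 01111111 01111101 01111110 01111010
$data = chr(0b01111111) . chr(0b01111101) . chr(0b01111110) . chr(0b01111010);
$ones = count_one_bits($data);
$bits = 8 * strlen($data);
$byteAverage = array_sum(array_map('ord', str_split($data))) / strlen($data);

echo "ones: $ones of $bits bits, proportion ", $ones / $bits, "\n"; // 24 of 32, 0.75
echo "byte average: $byteAverage\n"; // 125
```

The byte average looks unremarkable while the proportion of ones is far from ½, which is why the bits have to be counted.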
I want a random number generator with non-uniform distribution, ie:
// prints 0 with 0.1 probability, and 1 with 0.9 probability
echo probRandom(array(10, 90));
This is what I have right now:
/**
* Method to generate a *not uniformly* random index
*
* @param array $probs int array with weights
* @return int a random index in $probs
*/
function probRandom($probs) {
$size = count($probs);
// construct probability vector
$prob_vector = array();
$ptr = 0;
for ($i=0; $i<$size; $i++) {
$ptr += $probs[$i];
$prob_vector[$i] = $ptr;
}
// get a random number
$rand = rand(0, $ptr);
for ($i=0, $ret = false; $ret === false; $i++) {
if ($rand <= $prob_vector[$i])
return $i;
}
}
Can anyone think of a better way? Possibly one that doesn't require me to do pre-processing?
If you know the sum of all elements in $probs, you can do this without preprocessing.
Like so:
function probRandom($probs) {
    $max = array_sum($probs); // total weight
    $r = rand(0, $max - 1);
    $tot = 0;
    for ($i = 0; $i < count($probs); $i++) {
        $tot += $probs[$i];
        if ($r < $tot) {
            return $i;
        }
    }
}
This will do what you want in O(N) time, where N is the length of the array. This is a firm lower bound on the algorithmic runtime of such an algorithm, as each element in the input must be considered.
The probability a given index $i is selected is $probs[$i]/sum($probs), given that the rand function returns independent uniformly distributed integers in the given range.
In your solution you generate an accumulated probability vector, which is very useful.
I have two suggestions for improvement:
If $probs is static, i.e. it's the same vector every time you want to generate a random number, you can preprocess $prob_vector just once and keep it.
You can use binary search over $prob_vector to find $i instead of scanning it linearly.
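A sketch of these two suggestions combined might look like this. The function names build_cumulative and pick_index are invented for the example, and the weights are assumed to be non-negative integers with a positive total.

```php
<?php
// Build the accumulated weight vector once (the preprocessing step).
function build_cumulative(array $weights): array
{
    $cumulative = [];
    $total = 0;
    foreach ($weights as $w) {
        $total += $w;
        $cumulative[] = $total;
    }
    return $cumulative;
}

// Binary search for the first index whose cumulative weight exceeds $r.
function pick_index(array $cumulative): int
{
    $lo = 0;
    $hi = count($cumulative) - 1;
    $r = mt_rand(0, $cumulative[$hi] - 1);
    while ($lo < $hi) {
        $mid = intdiv($lo + $hi, 2);
        if ($r < $cumulative[$mid]) {
            $hi = $mid;
        } else {
            $lo = $mid + 1;
        }
    }
    return $lo;
}

$cumulative = build_cumulative([10, 90]); // reuse this for every draw
echo pick_index($cumulative), "\n";
```

Each draw then costs O(log N) after the one-off O(N) preprocessing.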
EDIT: I now see that you ask for a solution without preprocessing.
Without preprocessing, you will end up with worst case linear runtime (i.e., double the length of the vector, and your running time will double as well).
Here is a method that doesn't require preprocessing. It does, however, require you to know a maximum limit of the elements in $probs:
Rejection method
Pick a random index, $i and a random number, X (uniformly) between 0 and max($probs)-1, inclusive.
If X is less than $probs[$i], you're done - $i is your random number
Otherwise reject $i (hence the name of the method) and restart.
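The three steps above might be written as follows. The function name pick_by_rejection is made up, and the sketch assumes integer weights, at least one of which is positive, and a caller-supplied $maxWeight that is at least as large as the largest weight.

```php
<?php
// Rejection method: no preprocessing, but the caller must know an upper
// bound on the individual weights.
function pick_by_rejection(array $weights, int $maxWeight): int
{
    $n = count($weights);
    while (true) {
        $i = mt_rand(0, $n - 1);         // candidate index, uniform
        $x = mt_rand(0, $maxWeight - 1); // uniform in 0..$maxWeight-1
        if ($x < $weights[$i]) {
            return $i;                   // accept
        }
        // otherwise reject $i and try again
    }
}

echo pick_by_rejection([10, 90], 90), "\n";
```

The expected number of retries grows when most weights are small compared to $maxWeight, so this works best when the weights are of similar size.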
I'm fairly new to PHP - programming in general. So basically what I need to accomplish is, create an array of x amount of numbers (created randomly) whose value add up to n:
Let's say, I have to create 4 numbers that add up to 30. I just need the first random dataset. The 4 and 30 here are variables which will be set by the user.
Essentially something like
x = amount of numbers;
n = sum of all x's combined;
// create x random numbers which all add up to n;
$row = array(5, 7, 10, 8); // these add up to 30
Also, no duplicates are allowed and all numbers have to be positive integers.
I need the values within an array. I have been messing around with it sometime, however, my knowledge is fairly limited. Any help will be greatly appreciated.
First off, this is a really cool problem. I'm almost sure that my approach doesn't even distribute the numbers perfectly, but it should be better than some of the other approaches here.
I decided to build the array from the lowest number up (and shuffle them at the end). This allows me to always choose a random range that will always yield valid results. Since the numbers must always be increasing, I solved for the highest possible number that ensures that a valid solution still exists (i.e., if n=4 and max=31, if the first number was picked to be 7, then it wouldn't be possible to pick numbers greater than 7 such that the sum of 4 numbers would be equal to 31).
$n = 4;
$max = 31;
$array = array();
$current_min = 1;
while( $n > 1 ) {
//solve for the highest possible number that would allow for $n many random numbers
$current_max = floor( ($max/$n) - (($n-1)/2) );
if( $current_max < $current_min ) throw new Exception( "Can't use combination" );
$new_rand = rand( $current_min, $current_max ); //get a new rand
$max -= $new_rand; //drop the max
$current_min = $new_rand + 1; //bump up the new min
$n--; //drop the n
$array[] = $new_rand; //add rand to array
}
$array[] = $max; //we know what the last element must be
shuffle( $array );
EDIT: For large values of $n you'll end up with a lot of grouped values towards the end of the array, since there is a good chance you will get a random value near the max value forcing the rest to be very close together. A possible fix is to have a weighted rand, but that's beyond me.
I'm not sure whether I understood you correctly, but try this:
$n = 4;
$max = 30;
$array = array();
do {
$random = mt_rand(0, $max);
if (!in_array($random, $array)) {
$array[] = $random;
$n--;
}
} while ($n > 0);
Sorry, I missed the 'no duplicates' requirement too, so you would need to tack on a 'deduplicator' (I put it in the other question).
To generate a series of random numbers with a fixed sum:
Make a series of random numbers (of the largest practical magnitude, to hide granularity).
Calculate their sum.
Multiply each number in the series by desiredsum/sum (basically, to scale the random series to its new size).
Then there is a rounding error to adjust for:
Recalculate the sum and its difference from the desired sum.
Add the difference to a random element in the series if that doesn't make it negative; if it does, move on to another random element until it works.
To be ultra-tight, instead add or subtract 1 to random elements until the difference is 0.
One source of non-randomness with this approach: if the magnitude of the source random numbers is too small, the result becomes granular.
I don't have PHP to hand, but here's a shot:
$n = 4; // size of array (example value, set as needed)
$targsum = 30; // target sum (example value, set as needed)
$ceiling = 0x3fff; // biggish number for rands
$sizedrands = array();
$firstsum = 0;
$finsum = 0;
// make rands, sum size (start at 1 so the total can never be zero)
for ($count = 0; $count < $n; $count++) {
    $arand = rand(1, $ceiling);
    $sizedrands[$count] = $arand;
    $firstsum += $arand;
}
// resize (rounding down to whole numbers), sum resize
for ($count = 0; $count < $n; $count++) {
    $sizedrands[$count] = intdiv($sizedrands[$count] * $targsum, $firstsum);
    $finsum += $sizedrands[$count];
}
// redistribute parts of rounding error randomly until done
$roundup = $targsum - $finsum;
$rounder = ($roundup < 0) ? -1 : 1;
while ($roundup != 0) {
    $arand = rand(0, $n - 1); // valid indexes are 0..$n-1
    if (($rounder + $sizedrands[$arand]) > 0) {
        $sizedrands[$arand] += $rounder;
        $roundup -= $rounder;
    }
}
Hope this helps you more.
Approach 1
$aRandomarray = array();
for($i=0;$i<100;$i++)
{
$iRandomValue = mt_rand(100, 999); // the minimum must not exceed the maximum
if (!in_array($iRandomValue , $aRandomarray)) {
$aRandomarray[$i] = $iRandomValue;
}
}
Approach 2
$aRandomarray = array();
$sRandom = ''; // initialise before appending to it
for($i=0;$i<100;$i++)
{
$iRandomValue = mt_rand(100, 999);
$sRandom .= $iRandomValue;
}
array_push($aRandomarray, $sRandom);