I have a sorted array with numbers as keys. I need a reasonably fast algorithm to pick the key whose value is closest or identical (if it exists) to a given variable. If the given value is higher than the maximum or lower than the minimum, the key holding the maximum or minimum is returned, respectively.
So I made an attempt: here is a function that translates values in one array according to another. It can be used, for example, to map a temperature of -10 to "cold" or +30 to "hot". For big arrays it is not so fast, though. Any clue how to make it faster?
function transnum($nums,$transarr,$searchkey='x',$returnkey='y') {
$was_arr = is_array($nums); $nums = (array)$nums;
foreach ($nums as &$num) {
if ($num===null or $num==='') continue;
reset($transarr[$searchkey]);
$ckey= key($transarr[$searchkey]);
$closest = abs($num-current($transarr[$searchkey]));
while(($next = next($transarr[$searchkey])) !== false) { // !== false so that a value of 0 does not end the loop early
$checkclosest=abs($num-$next);
if($closest>$checkclosest) {
$closest = $checkclosest;
$ckey = key($transarr[$searchkey]);
}
else break;
}
$num = $transarr[$returnkey][$ckey];
}
if(!$was_arr) $nums = $nums[0];
return $nums;
}
You could use a binary search. The basic algorithm goes something like this, assuming you're looking for myVal:
1. Look at the middle value of the array.
2. If the value is myVal, you're done.
3. If the value is higher than myVal, go to 1, but use only the bottom half of the array.
4. If the value is lower, go to 1, but use only the top half.
5. Once you reach an array of length one, compare that value to its neighbours to see which is closest.
This should be an O(log N) search.
This is a Java implementation; you may translate it into a language of your choice.
static int findNearestValue(int value, int arr[], int low, int high) {
int result = -1;
if(high-low>1){
int mid = (low + high)/2;
if(arr[mid]>value)
result = findNearestValue(value, arr, low, mid);
else if(arr[mid]<value)
result = findNearestValue(value, arr, mid, high);
else
result = arr[mid];
} else
result = Math.abs(value-arr[low]) < Math.abs(value-arr[high]) ? arr[low] : arr[high];
return result;
}
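Since the question is about PHP, here is a rough PHP sketch of the same idea. It is only an illustration under some assumptions: the function name nearestIndex is made up, the array is sorted ascending, zero-indexed and non-empty, and the function returns the index (key) rather than the value. It is iterative rather than recursive.

```php
<?php
// Hypothetical helper: index of the value closest to $target in a sorted,
// zero-indexed, non-empty array. Out-of-range targets clamp to the ends.
function nearestIndex(array $sorted, $target): int
{
    $low = 0;
    $high = count($sorted) - 1;

    if ($target <= $sorted[$low]) {
        return $low;   // at or below the minimum
    }
    if ($target >= $sorted[$high]) {
        return $high;  // at or above the maximum
    }

    // Invariant: $sorted[$low] <= $target <= $sorted[$high].
    while ($high - $low > 1) {
        $mid = intdiv($low + $high, 2);
        if ($sorted[$mid] == $target) {
            return $mid;           // exact match
        }
        if ($sorted[$mid] < $target) {
            $low = $mid;           // keep the top half
        } else {
            $high = $mid;          // keep the bottom half
        }
    }

    // Two neighbours remain; pick the closer one (ties go to the lower index).
    return ($target - $sorted[$low] <= $sorted[$high] - $target) ? $low : $high;
}
```

With the returned index you can look up the matching entry in the second array, much like $transarr[$returnkey][$ckey] in the question.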
I really need your help. I have to create a function that takes two positive integers as its arguments and returns an array of n numerical palindromes that come after num, including num. Single-digit numbers are not considered numerical palindromes. So the outcome must be like this: function(4, 6) // returns [11,22,33,44]. That is, function(4, 6) returns an array of only 4 elements, and the numerical palindromes must be greater than 6. Other examples are function(1, 75) // returns [77] and function(3, 100) // returns [101, 111, 121].
My code so far:
<?php
function createPalindrome($input)
{
$m = $input;
$palin = $input;
while ($m > 1) {
$d = intval($m % 10);
$palin = $palin * 10 + $d;
$m = intval($m / 10);
}
return $palin;
}
function generatePalindromes($x, $n)
{
$arr = [];
$i = 1;
while (($number = createPalindrome($i)) <= $n) {
$arr[] = $number;
$i++;
}
for($j = 0; $j < $x; $j++)
var_dump($arr[$j]);
}
generatePalindromes(4, 77);
The outcome is:
int(1)
int(22)
int(33)
int(44)
Had to modify this answer a fair bit once Giann49 expounded on his question in a comment reply.
This is not the cleanest or most precise way to do this for sure but it will principally function and hopefully help point you in the right direction logically.
function findExceedingPalindromes($palindromeLimit,$startingPoint){
$palindromesFound = 0; //Set an initial counter for number of palindromes found so far.
$palindromeSet = []; //Create an array to contain all the palindromes.
if($palindromeLimit <= 0 || $startingPoint <= 0){ //Both integers need to be positive as stated.
return false; //If they aren't return false. You can return whatever you want to halt execution of the function. This is just an easy example.
}
if($startingPoint < 10){
$startingPoint = 10; //Since single digits aren't valid, if the starting number is less than 10 kick it up to 10.
}
while($palindromesFound < $palindromeLimit){ //Keep going until the requested number of palindromes has been found.
$startingPoint++; //Since the first palindrome must exceed the starting point increment it up once at the top of the loop.
$reverseNumber = strrev((string)$startingPoint); //reverse the current number as a string.
if((string)$startingPoint === $reverseNumber){ //Compare as strings; an integer is never identical (===) to a string.
array_push($palindromeSet,$startingPoint);
$palindromesFound++; //If we find a palindrome move the number found 1 higher.
}
}
return $palindromeSet;
}
As an explanation.
The first argument is the number of palindromes to generate. The second argument is the number we want to start palindrome generation at then work up from there.
We create two variables. One is to track how many palindromes have been found. The other is an empty array to insert found palindromes into.
You say the two numbers must be positive integers so if they are anything less than 1 we'll want to exit the function. (optional)
You say single digits don't count so if the starting point is less than 10 we'll just move it up to 10 for starters. (optional)
Now we'll start a while loop. While the number of palindromes is less than the number we want to find the loop will keep running.
We add 1 to the starting point right out of the gate because we want the first palindrome to be higher than the starting point if it already is one. As in, if 11 is the number set as the point to start searching, we want to look at 11 + 1 for starters. (optional)
To check if a number is a palindrome we simply reverse its string. If the strings are the same forward and back, it matches the definition of a palindrome. So we'll add that number to the set of found palindromes and move the number found 1 higher.
Once the requested number of palindromes are found we'll break the while loop and return the array of what was found.
I'm trying to write this solution in PHP.
Inputs:
number of numbers = x
smallest number = 1
sum of numbers = y
I'm not dealing with very large numbers: the largest x is approximately 50, the largest y is approximately 80.
Rules: within each set of numbers, each number must be equal to or greater than the previous one.
For example
x = 3
min = 1
y = 6
solution:
(1,1,4),(1,2,3),(2,2,2)
note that (3,2,1) isn't a solution as they are in descending order.
This is easily solved via recursion. The time complexity will be high, though. For a better (but slightly more complex) solution, use dynamic programming.
Here's the idea:
If the size of the set is 1 then the only possible solution is the desired sum.
If the set is larger than one then you can merge a number X between the minimum and the desired sum with a set of numbers which add up to the desired sum minus X.
function tuplesThatSumUpTo($desiredSum, $minimumNumber, $setSize) {
$tuples = [];
if ($setSize <= 1) {
return [ [ $desiredSum ] ]; //A set of sets of size 1 e.g. a set of the desired sum
}
for ($i = $minimumNumber;$i < $desiredSum;$i++) {
$partial = tuplesThatSumUpTo($desiredSum-$i, $minimumNumber,$setSize-1);
$tuples = array_merge($tuples, array_map(function ($tuple) use ($i) {
$res = array_merge([$i], $tuple);
sort($res);
return $res;
},$partial));
}
return array_unique($tuples,SORT_REGULAR);
}
See it run:
http://sandbox.onlinephpfunctions.com/code/1b0e507f8c2fcf06f4598005bf87ee98ad2505b3
The dynamic programming approach would have you instead hold an array of sets with partial sums and refer back to it to fill in what you need later on.
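One way to read that suggestion is memoized recursion. The sketch below is an illustration only, not the code behind the link: the name nonDecreasingTuples and the $memo cache are assumptions of mine. It builds each tuple in non-decreasing order directly, so no sorting or de-duplication is needed, and it caches results per (sum, minimum, size).

```php
<?php
// Hypothetical helper: all non-decreasing tuples of $size integers,
// each >= $min, that add up to $sum. Results are cached in $memo.
function nonDecreasingTuples(int $sum, int $min, int $size, array &$memo = []): array
{
    if ($size === 1) {
        // A single number must equal the remaining sum and respect the minimum.
        return $sum >= $min ? [[$sum]] : [];
    }

    $key = "$sum:$min:$size";
    if (isset($memo[$key])) {
        return $memo[$key];
    }

    $tuples = [];
    // The first element $i is at most $sum / $size, because the remaining
    // $size - 1 elements must each be >= $i.
    for ($i = $min; $i * $size <= $sum; $i++) {
        // Passing $i as the new minimum keeps every tuple non-decreasing.
        foreach (nonDecreasingTuples($sum - $i, $i, $size - 1, $memo) as $rest) {
            $tuples[] = array_merge([$i], $rest);
        }
    }

    return $memo[$key] = $tuples;
}
```

For the example in the question, nonDecreasingTuples(6, 1, 3) yields (1,1,4), (1,2,3) and (2,2,2).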
I have a set of items. I need to randomly pick one. The problem is that they each have a weight of 1-10. A weight of 2 means that the item is twice as likely to be picked as an item with a weight of 1. A weight of 3 means three times as likely.
I currently fill an array with each item. If the weight is 3, I put three copies of the item in the array. Then, I pick a random item.
My method is fast, but uses a lot of memory. I am trying to think of a faster method, but nothing comes to mind. Anyone have a trick for this problem?
EDIT: My Code...
Apparently, I wasn't clear. I do not want to use (or improve) my code. This is what I did.
//Given an array $a where $a[0] is an item name and $a[1] is the weight from 1 to 100.
$b = array();
foreach($a as $t)
$b = array_merge($b, array_fill(0,$t[1],$t));
$item = $b[array_rand($b)];
This required me to check every item in $a and uses max_weight/2*size of $a memory for the array. I wanted a COMPLETELY DIFFERENT algorithm.
Further, I asked this question in the middle of the night using a phone. Typing code on a phone is nearly impossible because those silly virtual keyboards simply suck. It auto-corrects everything, ruining any code I type.
And yet further, I woke up this morning with an entirely new algorithm that uses virtually no extra memory at all and does not require checking every item in the array. I posted it as an answer below.
This one's your huckleberry.
$arr = array(
array("val" => "one", "weight" => 1),
array("val" => "two", "weight" => 2),
array("val" => "three", "weight" => 3),
array("val" => "four", "weight" => 4)
);
$weight_sum = 0;
foreach($arr as $val)
{
$weight_sum += $val['weight'];
}
$r = rand(1, $weight_sum);
print "random value is $r\n";
for($i = 0; $i < count($arr); $i++)
{
if($r <= $arr[$i]['weight'])
{
print "$r <= {$arr[$i]['weight']}, this is our match\n";
print $arr[$i]['val'] . "\n";
break;
}
else
{
print "$r > {$arr[$i]['weight']}, subtracting weight\n";
$r -= $arr[$i]['weight'];
print "new \$r is $r\n";
}
}
No need to generate arrays containing an item for every weight, and no need to fill an array with n elements for a weight of n. Just generate a random number between 1 and the total weight, then loop through the array until you find a weight greater than or equal to your random number. If the weight is less than the number, subtract that weight from the random number and continue.
Sample output:
# php wr.php
random value is 8
8 > 1, subtracting weight
new $r is 7
7 > 2, subtracting weight
new $r is 5
5 > 3, subtracting weight
new $r is 2
2 <= 4, this is our match
four
This should also support fractional weights.
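For fractional weights, rand(1, $weight_sum) no longer fits, because it only produces integers. A minimal sketch of one way to adapt it, assuming a non-empty list of items with positive 'weight' values: the function names pickByPoint and weightedPick are made up for this illustration, and the random point is passed in separately so the selection logic can be checked deterministically.

```php
<?php
// Hypothetical helper: $items is a non-empty list of
// ['val' => ..., 'weight' => float]; $r is a point in [0, total weight).
function pickByPoint(array $items, float $r)
{
    foreach ($items as $item) {
        if ($r < $item['weight']) {
            return $item['val'];
        }
        $r -= $item['weight'];
    }
    // Floating-point rounding can leave $r at or just above zero after the
    // last subtraction; fall back to the last item in that case.
    $last = end($items);
    return $last['val'];
}

function weightedPick(array $items)
{
    $total = array_sum(array_column($items, 'weight'));
    // mt_rand() / (mt_getrandmax() + 1) is a float in [0, 1).
    $r = $total * (mt_rand() / (mt_getrandmax() + 1));
    return pickByPoint($items, $r);
}
```

The comparison is a strict < here because the random point starts at 0 rather than 1.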
Modified version to use an array keyed by weight, rather than by item:
$arr2 = array(
);
for($i = 0; $i <= 500000; $i++)
{
$weight = rand(1, 10);
$num = rand(1, 1000);
$arr2[$weight][] = $num;
}
$start = microtime(true);
$weight_sum = 0;
foreach($arr2 as $weight => $vals) {
$weight_sum += $weight * count($vals);
}
print "weighted sum is $weight_sum\n";
$r = rand(1, $weight_sum);
print "random value is $r\n";
$found = false;
$elem = null;
foreach($arr2 as $weight => $vals)
{
if($found) break;
for($j = 0; $j < count($vals); $j ++)
{
if($r <= $weight) // <= because $r starts in the range 1..$weight_sum
{
$elem = $vals[$j];
$found = true;
break;
}
else
{
$r -= $weight;
}
}
}
$end = microtime(true);
print "random element is: $elem\n";
print "total time is " . ($end - $start) . "\n";
With sample output:
# php wr2.php
weighted sum is 2751550
random value is 345713
random element is: 681
total time is 0.017189025878906
The measurement is hardly scientific, and it fluctuates depending on where in the array the element falls (obviously), but it seems fast enough for huge datasets.
This way requires two random calculations, but they should be faster and require about 1/4 of the memory, with some reduced accuracy if weights have disproportionate counts. (See Update for increased accuracy at the cost of some memory and processing.)
Store a multidimensional array where each item is stored in an array based on its weight:
$array[$weight][] = $item;
// example: Item with a weight of 5 would be $array[5][] = 'Item'
Generate a new array with the weights (1-10) appearing n times for n weight:
foreach($array as $n=>$null) {
for ($i=1;$i<=$n;$i++) {
$weights[] = $n;
}
}
The above array would be something like: [ 1, 2, 2, 3, 3, 3, 4, 4, 4, 4 ... ]
First calculation: Get a random weight from the weighted array we just created
$weight = $weights[mt_rand(0, count($weights)-1)];
Second calculation: Get a random key from that weight array
$value = $array[$weight][mt_rand(0, count($array[$weight])-1)];
Why this works: You solve the weighted issue by using the weighted array of integers we created. Then you select randomly from that weighted group.
Update: Because of the possibility of disproportionate counts of items per weight, you could add another loop and array for the counts to increase accuracy.
foreach($array as $n=>$null) {
$counts[$n] = count($array[$n]);
}
foreach($array as $n=>$null) {
// Calculate proportionate weight (number of items in this weight opposed to minimum counted weight)
$proportion = $n * ($counts[$n] / min($counts));
for ($i=1; $i<=$proportion; $i++) {
$weights[] = $n;
}
}
What this does is if you have 2000 10's and 100 1's, it'll add 200 10's (20 * 10, 20 because it has 20x the count, and 10 because it is weighted 10) instead of 10 10's to make it proportionate to how many are in there opposed the minimum weight count. So to be accurate, instead of adding one for EVERY possible key, you are just being proportionate based on the MINIMUM count of weights.
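As a variation on the same bucket layout, the counts can also be taken into account without building a $weights array at all, by treating each bucket as holding weight × count points. This is a sketch of a different technique from the one above, not the answer's own method; the names pickFromBuckets and bucketMass are hypothetical, and each bucket is assumed to be a zero-indexed list.

```php
<?php
// $buckets maps weight => list of items, as in $array[$weight][] = $item.
// Total number of "points": each item contributes its weight.
function bucketMass(array $buckets): int
{
    $total = 0;
    foreach ($buckets as $weight => $items) {
        $total += $weight * count($items);
    }
    return $total;
}

// $r must be in 1..bucketMass($buckets); it is passed in so that the
// selection can be tested without randomness.
function pickFromBuckets(array $buckets, int $r)
{
    foreach ($buckets as $weight => $items) {
        $mass = $weight * count($items);   // points held by this bucket
        if ($r <= $mass) {
            // Each item owns $weight consecutive points inside the bucket.
            return $items[intdiv($r - 1, $weight)];
        }
        $r -= $mass;
    }
    return null; // only reached if $r exceeds the total mass
}

$buckets = [1 => ['a'], 3 => ['b', 'c']];
echo pickFromBuckets($buckets, mt_rand(1, bucketMass($buckets))), "\n";
```

This needs only one random number and loops over at most one entry per distinct weight (ten iterations for weights 1-10).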
I greatly appreciate the answers above. Please consider this answer, which does not require checking every item in the original array.
// Given $a as an array of items
// where $a[0] is the item name and $a[1] is the item weight.
// It is known that weights are integers from 1 to 100.
for($i=0; $i<sizeof($a); $i++) // Safeguard described below
{
$item = $a[array_rand($a)];
if(rand(1,100)<=$item[1]) break;
}
This algorithm only requires storage for two variables ($i and $item) as $a was already created before the algorithm kicked in. It does not require a massive array of duplicate items or an array of intervals.
In a best-case scenario, this algorithm will touch one item in the original array and be done. In a worst-case scenario, it will touch n items in an array of n items (not necessarily every item in the array as some may be touched more than once).
If there was no safeguard, this could run forever. The safeguard is there to stop the algorithm if it simply never picks an item. When the safeguard is triggered, the last item touched is the one selected. However, in millions of tests using random data sets of 100,000 items with random weights of 1 to 10 (changing rand(1,100) to rand(1,10) in my code), the safeguard was never hit.
I made histograms comparing the frequency of items selected among my original algorithm, the ones from answers above, and the one in this answer. The differences in frequencies are trivial - easy to attribute to variances in the random numbers.
EDIT... It is apparent to me that my algorithm may be combined with the algorithm pala_ posted, removing the need for a safeguard.
In pala_'s algorithm, a list is required, which I call an interval list. To simplify, you begin with a random_weight that is rather high. You step down the list of items and subtract the weight of each one until your random_weight falls to zero (or less). Then, the item you ended on is your item to return. There are variations on this interval algorithm that I've tested and pala_'s is a very good one. But, I wanted to avoid making a list. I wanted to use only the given weighted list and never touch all the items. The following algorithm merges my use of random jumping with pala_'s interval list. Instead of a list, I randomly jump around the list. I am guaranteed to get to zero eventually, so no safeguard is needed.
// Given $a as the weighted array (described above)
$weight = rand(1,100); // The bigger this is, the slower the algorithm runs.
while($weight>0)
{
$item = $a[array_rand($a)];
$weight-= $item[1];
}
// $item is the random item you want.
I wish I could select both pala_ and this answer as the correct answers.
I'm not sure if this is "faster", but I think it may be better balanced between memory usage and speed.
The idea is to replace your current expanded array (about 500,000 entries) with an array that has one entry per item (100,000 entries), keyed by each item's lowest "origin" position in the expanded array and holding the item's index in the original set as its value:
<?php
$set=[["a",3],["b",5]];
$current_implementation=["a","a","a","b","b","b","b","b"];
// 0=>0 means the lowest "position" 0
// points to 0 in the set;
// 3=>1 means the lowest "position" 3
// points to 1 in the set;
$my_implementation=[0=>0,3=>1];
Then randomly pick a number between 0 and the highest position (the lowest position of the last element plus its weight, minus 1):
// 3 is the lowest position of the last element ("b")
// and 5 the weight of that last element
$my_implemention_pick=mt_rand(0,3+5-1);
Full code:
<?php
function randomPickByWeight(array $set)
{
$low=0;
$high=0;
$candidates=[];
foreach($set as $key=>$item)
{
$candidates[$high]=$key;
$high+=$item["weight"];
}
$pick=mt_rand($low,$high-1);
while(!array_key_exists($pick,$candidates))
{
$pick--;
}
return $set[$candidates[$pick]];
}
$cache=[];
for($i=0;$i<100000;$i++)
{
$cache[]=["item"=>"item {$i}","weight"=>mt_rand(1,10)];
}
$time=time();
for($i=0;$i<100;$i++)
{
print_r(randomPickByWeight($cache));
}
$time=time()-$time;
var_dump($time);
3v4l.org demo
3v4l.org has a time limit on code, so the demo didn't finish. On my laptop the above demo finished in 10 seconds (i7-4700 HQ).
Here is my suggestion, in case I've understood you correctly. Take a look, and if there are any questions I'll explain.
Some words in advance:
- My sample has only 3 weight levels, to keep it clear.
- With the outer while I'm simulating your main loop; I count only to 100.
- The array must be initialized with one set of initial numbers, as shown in my sample.
- In every pass of the main loop I output only one random value, and the weights are respected throughout.
<?php
$array=array(
0=>array('item' => 'A', 'weight' => 1),
1=>array('item' => 'B', 'weight' => 2),
2=>array('item' => 'C', 'weight' => 3),
);
$etalon_weights=array(1,2,3);
$current_weights=array(0,0,0);
$ii=0;
while($ii<100){ // Simulates your main loop
// Randomisation cycle
if($current_weights==$etalon_weights){
$current_weights=array(0,0,0);
}
$ft=true;
while($ft){
$curindex=rand(0,(count($array)-1));
$cur=$array[$curindex];
if($current_weights[$cur['weight']-1]<$etalon_weights[$cur['weight']-1]){
echo $cur['item'];
$array[]=$cur;
$current_weights[$cur['weight']-1]++;
$ft=false;
}
}
$ii++;
}
?>
I'll use this input array for my explanation:
$values_and_weights=array(
"one"=>1,
"two"=>8,
"three"=>10,
"four"=>4,
"five"=>3,
"six"=>10
);
The simple version isn't going to work for you because your array is so large. It requires no array modification but may need to iterate the entire array, and that's a deal breaker.
/*$pick=mt_rand(1,array_sum($values_and_weights));
$x=0;
foreach($values_and_weights as $val=>$wgt){
if(($x+=$wgt)>=$pick){
echo "$val";
break;
}
}*/
For your case, re-structuring the array will offer great benefits.
The cost in memory for generating a new array will be increasingly justified as:
array size increases and
number of selections increases.
The new array requires the replacement of "weight" with a "limit" for each value: the running total of all weights up to and including the current element's weight.
Then flip the array so that the limits are the array keys and the values are the array values.
The selection logic is: the selected value will have the lowest limit that is >= $pick.
// Declare new array using array_walk one-liner:
array_walk($values_and_weights,function($v,$k)use(&$limits_and_values,&$x){$limits_and_values[$x+=$v]=$k;});
//Alternative declaration method - 4-liner, foreach() loop:
/*$x=0;
foreach($values_and_weights as $val=>$wgt){
$limits_and_values[$x+=$wgt]=$val;
}*/
var_export($limits_and_values);
$limits_and_values looks like this:
array (
1 => 'one',
9 => 'two',
19 => 'three',
23 => 'four',
26 => 'five',
36 => 'six',
)
Now to generate the random $pick and select the value:
// $x (from walk/loop) is the same as writing: end($limits_and_values); $x=key($limits_and_values);
$pick=mt_rand(1,$x); // pull random integer between 1 and highest limit/key
while(!isset($limits_and_values[$pick])){++$pick;} // smallest possible loop to find key
echo $limits_and_values[$pick]; // this is your random (weighted) value
This approach is brilliant because isset() is very fast and the maximum number of isset() calls in the while loop can only be as many as the largest weight (not to be confused with limit) in the array.
FOR YOUR CASE, THIS APPROACH WILL FIND THE VALUE IN 10 ITERATIONS OR LESS!
Here is my Demo that will accept a weighted array (like $values_and_weights), then in just four lines:
Restructure the array,
Generate a random number,
Find the correct value, and
Display it.
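The four steps above can be sketched as two small functions. This is only an illustration of the approach described in this answer, not the linked demo: the names buildLimits and valueForPick are made up, and weights are assumed to be positive integers.

```php
<?php
// Builds [cumulative limit => value] from [value => weight].
function buildLimits(array $valuesAndWeights): array
{
    $limits = [];
    $x = 0;
    foreach ($valuesAndWeights as $val => $wgt) {
        $limits[$x += $wgt] = $val;
    }
    return $limits;
}

// Returns the value whose limit is the lowest one >= $pick.
// Assumes 1 <= $pick <= highest limit, so the loop always terminates.
function valueForPick(array $limits, int $pick)
{
    while (!isset($limits[$pick])) {
        ++$pick;
    }
    return $limits[$pick];
}

$limits = buildLimits(["one" => 1, "two" => 8, "three" => 10, "four" => 4, "five" => 3, "six" => 10]);
$max = array_key_last($limits);   // highest limit, i.e. the total weight
echo valueForPick($limits, mt_rand(1, $max)), "\n";
```

Building the limits once and reusing them for many picks is where the extra memory pays off. Note that array_key_last() needs PHP 7.3 or later.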
I wonder if is there a good way to get the number of digits in right/left side of a decimal number PHP. For example:
12345.789 -> RIGHT SIDE LENGTH IS 3 / LEFT SIDE LENGTH IS 5
I know it is readily attainable with the help of string functions and exploding the number. I mean, is there a mathematical or programmatic way to do it that is better than string manipulation?
Your answers would be greatly appreciated.
Update
The best solution for left side till now was:
$left = floor(log10($x))+1;
but it is still not sufficient for the right side.
Still waiting ...
To get the digits on the left side you can do this:
$left = floor(log10($x))+1;
This uses the base 10 logarithm to get the number of digits.
The right side is harder. A simple approach would look like this, but due to floating point numbers, it would often fail:
$decimal = $x - floor($x);
$right = 0;
while (floor($decimal) != $decimal) {
$right++;
$decimal *= 10; //will bring in floating point 'noise' over time
}
This will loop through multiplying by 10 until there are no digits past the decimal. That is tested with floor($decimal) != $decimal.
However, as Ali points out, giving it the number 155.11 (a number that is hard to represent exactly in binary) results in an answer of 14. This is because the number is stored as something like 155.11000000000001, given the limited precision of a binary floating-point number.
So instead, a more robust solution is needed. (PopoFibo's solution is particularly elegant, and uses PHP's inherent float comparison functions well.)
The fact is, we can never distinguish between an input of 155.11 and 155.11000000000001. We will never know which number was originally given. They will both be represented the same. However, if we define the number of zeroes that we can see in a row before we just decide the decimal is 'done', then we can come up with a solution:
$x = 155.11; //the number we are testing
$LIMIT = 10; //number of zeroes in a row until we say 'enough'
$right = 0; //number of digits we've checked
$empty = 0; //number of zeroes we've seen in a row
while (floor($x) != $x) {
$right++;
$base = floor($x); //so we can see what the next digit is;
$x *= 10;
$base *= 10;
$digit = floor($x) - $base; //the digit we are dealing with
if ($digit == 0) {
$empty += 1;
if ($empty == $LIMIT) {
$right -= $empty; //don't count all those zeroes
break; // exit the loop, we're done
}
} else {
$empty = 0; // a non-zero digit resets the run of zeroes
}
}
This should find the solution given the reasonable assumption that 10 zeroes in a row means any other digits just don't matter.
However, I still like PopoFibo's solution better, as without any multiplication, PHP's default comparison functions effectively do the same thing, without the messiness.
I am lost on PHP semantics big time but I guess the following would serve your purpose without the String usage (that is at least how I would do in Java but hopefully cleaner):
Working code here: http://ideone.com/7BnsR3
Non-string solution (only Math)
Left side is resolved hence taking the cue from your question update:
$value = 12343525.34541;
$left = floor(log10($value))+1;
echo($left);
$num = floatval($value);
$right = 0;
while($num != round($num, $right)) {
$right++;
}
echo($right);
Prints
85
8 for the LHS and 5 for the RHS.
Because I'm taking a floatval, 155.0 gives 0 digits on the RHS, which I think is valid; if you need a different result there, string functions can handle it.
php > $num = 12345.789;
php > $left = strlen(floor($num));
php > $right = strlen($num - floor($num));
php > echo "$left / $right\n";
5 / 16 <--- 16 digits, huh?
php > $parts = explode('.', $num);
php > var_dump($parts);
array(2) {
[0]=>
string(5) "12345"
[1]=>
string(3) "789"
}
As you can see, floats aren't the easiest to deal with... Doing it "mathematically" leads to bad results. Doing it by strings works, but makes you feel dirty.
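For reference, the string route can be wrapped up roughly like this. It is a sketch under assumptions: the name digitCounts is hypothetical, and it relies on PHP's default float-to-string conversion (the precision ini setting at its default), so very large or very small floats that print in exponent notation are not handled.

```php
<?php
// Hypothetical helper: counts digits on each side of the decimal point
// using PHP's default float-to-string conversion.
function digitCounts(float $num): array
{
    // abs() drops a leading minus sign so it is not counted as a digit.
    $parts = explode('.', (string) abs($num));
    return [
        'left'  => strlen($parts[0]),
        'right' => isset($parts[1]) ? strlen($parts[1]) : 0,
    ];
}

print_r(digitCounts(12345.789));
```

A whole number such as 42.0 converts to "42", so its right side comes out as 0.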
$number = 12345.789;
list($whole, $fraction) = sscanf($number, "%d.%d");
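This works even if $number is an integer, and you get real integers returned. Note that %d drops leading zeros in the fraction (1.05 yields a fraction of 5), so the fraction length is only reliable when the fraction has no leading zeros. Length is best done with strlen() even for integer values. The proposed log10() approach won't work for 10, 100, 1000, … as you might expect.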
// 5 - 3
echo strlen($whole) , " - " , strlen($fraction);
If you really, really want to get the length without calling any string function here you go. But it's totally not efficient at all compared to strlen().
/**
* Get integer length.
*
* @param integer $integer
* The integer to count.
* @param boolean $count_zero [optional]
* Whether 0 is to be counted or not, defaults to FALSE.
* @return integer
* The integer's length.
*/
function get_int_length($integer, $count_zero = false) {
// 0 would be 1 in string mode! Highly depends on use case.
if ($count_zero === false && $integer === 0) {
return 0;
}
return floor(log10(abs($integer))) + 1;
}
// 5 - 3
echo get_int_length($whole) , " - " , get_int_length($fraction);
The above will correctly count the result of 1 / 3, but be aware that the precision is important.
$number = 1 / 3;
// Above code outputs
// string : 1 - 10
// math : 0 - 10
$number = bcdiv(1, 3);
// Above code outputs
// string : 1 - 0 <-- oops
// math : 0 - INF <-- 8-)
No problem there.
I would like to apply a simple logic.
<?php
$num=12345.789;
$num_str="".$num; // Converting number to string
$array=explode('.',$num_str); //Explode number (String) with .
echo "Left side length : ".intval(strlen($array[0])); // $array[0] contains left hand side then check the string length
echo "<br>";
if(sizeof($array)>1)
{
echo "Right side length : ".intval(strlen($array[1])); // $array[1] contains the right-hand side; check its string length
}
?>
I have a fully-populated array of values, and I would like to arbitrarily remove elements from this array with more removed towards the far end.
For example, given input ( where a . signifies a populated index )
............................................
I would like something like
....... . ... .. . . .. . .
My first thought was to count the elements, then iterate over the array generating a random number somewhere between the current index and the total size of the array, eg:
if ( mt_rand( 0, $total ) > $total - $current_index )
//remove this element
however, as this entails making a random number each time the loop goes round it becomes very arduous.
Is there a better way of doing this?
One easy way is to flip a weighted coin for each entry with coin flips more weighted towards the end. For example, if the array is size n, for each entry you could choose a random number from 0 to n-1 and only keep the value if the index is less than or equal to the random number. (That is, keep each entry with probability 1 - index/total.) This has the nice advantage that if you're going to be compacting your array anyways, and you're using a good enough but efficient random number generator (could be a simple integer hash over a nonce), it's going to be rather fast for memory access.
On the other hand, if you're only blanking out a few items and aren't rearranging the array, you can go with some sort of weighted random number generator that more often chooses numbers toward the end of the index range. For example, if you have a random number generator that generates floats in the range [0,1] (closed or open bounds likely not mattering much), consider obtaining such a random float r and squaring it. This will tend to prefer lower values. You can fix this by flipping it around: 1-r^2. Of course, you need this to be in your index range of 0 to n - 1, so take floor(n * (1 - r^2)) and, if the result is n, round it down to n-1.
There's practically an infinite number of variations on both of these techniques.
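A minimal sketch of the first technique (the weighted coin flip per entry), assuming mt_rand() is good enough as the random source; the function name thinTowardsEnd is made up for this illustration.

```php
<?php
// Hypothetical helper: keeps the entry at position $i with probability
// 1 - $i / $n, so entries near the end are removed more often.
// Original keys are preserved, leaving gaps where entries were removed.
function thinTowardsEnd(array $items): array
{
    $n = count($items);
    $kept = [];
    $i = 0;
    foreach ($items as $key => $value) {
        // mt_rand(0, $n - 1) >= $i holds for ($n - $i) of the $n outcomes.
        if (mt_rand(0, $n - 1) >= $i) {
            $kept[$key] = $value;
        }
        $i++;
    }
    return $kept;
}

$thinned = thinTowardsEnd(range(0, 43));
```

This still draws one random number per element, as in the question, but it is a single pass with no array rebuilding inside the loop.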
This is quite probably not the best/most efficient way to do this, but it is the best I can come up with and it does work.
N.B. the codepad example takes a long time to execute, but this is because of the pretty-print loop I added to the end so you can see it visibly working. If you remove the inner loop, execution time drops to acceptable levels.
<?php
$array = range(0, 99);
for ($i = 0, $count = count($array); $i < $count; $i++) {
// Get array keys
$keys = array_keys($array);
// Get a random number between 0 and count($keys) - 1
$rand = mt_rand(0, count($keys) - 1);
// Cut $rand elements off the beginning of the keys
$keys = array_slice($keys, $rand);
// Unset a random key from the remaining keys
unset($array[$keys[array_rand($keys)]]);
}
This method isn't random: it works by defining a function and its inverse. Different functions, with different constant coefficients, will have different distribution characteristics.
The results are very pattern like, as expected when mapping a continuous function to a discrete structure like an array.
Here's an example using a quadratic function. You could try varying the constant.
demo: http://codepad.org/ojU3s9xM
#as in y = x^2 / 7;
function y($x) {
return $x * $x / 7;
}
function x($y) {
return sqrt(7 * $y); //inverse of y = x^2 / 7
}
$theArray = range(0,100);
$size = count($theArray);
//use func inverse to find the max value we can input to $y() without going out of array bounds
$maximumX = x($size);
for ($i=0; $i<$maximumX; $i++) {
$index = (int) y($i);
//unset the index if it still exists, else, the next greatest index
while (!isset($theArray[$index]) && $index < $size) {
$index++;
}
unset($theArray[$index]);
}
for ($i=0; $i<$size; $i++) {
printf("[%-3s]", isset($theArray[$i]) ? $theArray[$i] : '');
}