So I've got a list of weighted items, and I'd like to pick 4 non-duplicate items from this list.
Item Weight
Apple 5
Banana 7
Cherry 12
...
Orange 8
Pineapple 50
What is the most efficient way to do this? My initial foray was to just reroll for the subsequent picks if an already picked item came up... but for a small list this can result in a ton of rerolls.
Edit for clarification:
For the above example, and ignoring fruits D through N, there is a total weight of 82. So the chances of being picked first are:
A ~6%
B ~8.5%
C ~14.6%
O ~9.8%
P ~61%
Once an item is picked, the probabilities would (should!) change.
In your comment you say that unique means:
I don't want to pick the same item twice.
...and that the weights determine the likelihood of being picked.
All you need to do to make sure you don't pick duplicates is remove the last picked item from the list before you pick the next one. Yes, this will change your weights slightly, but that is the correct statistical change to make if you do want unique results.
In addition, I'm not sure how you are using the weights to determine the candidates, but I came up with this algorithm that should do this with a minimal number of loops (and without the need to fill an array according to the weights, which could result in extremely large arrays, requires integer weights, etc.).
I've used JavaScript here, simply so it's easy to see the output in a browser without a server. It should be trivial to port to PHP since it's not doing anything complicated.
Constants
var FRUITS = [
    { name: "Apple",     weight: 8 },
    { name: "Orange",    weight: 4 },
    { name: "Banana",    weight: 4 },
    { name: "Nectarine", weight: 3 },
    { name: "Kiwi",      weight: 1 }
];
var PICKS = 3;

function getNewFruitsAvailable(fruits, removeFruit) {
    var newFruits = [];
    for (var idx in fruits) {
        if (fruits[idx].name != removeFruit) {
            newFruits.push(fruits[idx]);
        }
    }
    return newFruits;
}
Script
var results = [];
var candidateFruits = FRUITS;

for (var i = 0; i < PICKS; i++) {
    // CALCULATE TOTAL WEIGHT OF AVAILABLE FRUITS
    var totalweight = 0;
    for (var idx in candidateFruits) {
        totalweight += candidateFruits[idx].weight;
    }
    console.log("Total weight: " + totalweight);

    var rand = Math.random();
    console.log("Random: " + rand);

    // ITERATE THROUGH FRUITS AND PICK THE ONE THAT MATCHES THE RANDOM
    var weightinc = 0;
    for (idx in candidateFruits) {
        // INCREMENT THE WEIGHT BY THE NEXT FRUIT'S WEIGHT
        var candidate = candidateFruits[idx];
        weightinc += candidate.weight;

        // IF rand IS BETWEEN LAST WEIGHT AND NEXT WEIGHT, PICK THIS FRUIT
        if (rand < weightinc / totalweight) {
            results.push(candidate.name);
            console.log("Pick: " + candidate.name);

            // GET NEXT SET OF FRUITS (REMOVING PICKED FRUIT)
            candidateFruits = getNewFruitsAvailable(candidateFruits, candidate.name);
            break;
        }
    }
    console.log("CandidateFruits: " + candidateFruits.length);
}
Output
for (var i = 0; i < results.length; i++) {
    document.write(results[i] + "<br/>");
}
The basic strategy is to allocate each fruit a portion of the total range [0,1). In the first loop, you'd have this:
Apple — 8/20 = 0.0 up to 0.4
Orange — 4/20 = 0.4 up to 0.6
Banana — 4/20 = 0.6 up to 0.8
Nectarine — 3/20 = 0.8 up to 0.95
Kiwi — 1/20 = 0.95 up to 1.0
The script iterates over each item in the list, progressing a weight counter. When it reaches the range that contains the random number, it picks that item, removes it from the list, then recalculates the ranges based on the new total weight and runs again.
Here I found the idea; it comes down to the following steps (a quick PHP sketch follows the list):
Build the sum of the weights --> SUM
Build a random number between 0 and SUM --> RAND_NUMBER
Iterate through the list and subtract each element weight from RAND_NUMBER. If RAND_NUMBER gets negative, you have your first element.
Remove the found element from the list and go back to step 1 until you have 4 elements.
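A minimal PHP sketch of those steps (my own illustration, assuming a name => weight map with integer weights; the sample data is made up):
// Weighted pick of $count distinct items from a name => weight map.
// Sketch of the steps above; assumes integer weights.
function pickDistinctWeighted(array $items, int $count): array
{
    $picked = [];
    for ($i = 0; $i < $count && count($items) > 0; $i++) {
        $sum = array_sum($items);              // step 1: SUM
        $rand = mt_rand(1, $sum);              // step 2: RAND_NUMBER
        foreach ($items as $name => $weight) { // step 3: subtract each weight
            $rand -= $weight;
            if ($rand <= 0) {
                $picked[] = $name;
                unset($items[$name]);          // step 4: remove and repeat
                break;
            }
        }
    }
    return $picked;
}

// e.g. print_r(pickDistinctWeighted(
//     ['Apple' => 5, 'Banana' => 7, 'Cherry' => 12, 'Orange' => 8, 'Pineapple' => 50], 4));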
Update
function array_rand2($ary, $n = 1)
{
    // make sure we don't get into an infinite loop:
    // check we have enough options to select from
    // (array keys are already unique, so this is just count($ary))
    $unique = count(array_unique(array_keys($ary)));
    if ($n > $unique) $n = $unique;

    // First, explode the array and expand out all the weights;
    // this means something with a weight of 5 will appear
    // in the array 5 times
    $_ary = array();
    foreach ($ary as $item => $weight)
    {
        $_ary = array_merge($_ary, array_fill(0, $weight, $item));
    }

    // now look for $n unique entries
    $matches = array();
    while (count($matches) < $n)
    {
        $r = $_ary[array_rand($_ary)];
        if (!in_array($r, $matches))
        {
            $matches[] = $r;
        }
    }

    // and now return those $n entries
    $result = array();
    foreach ($matches as $match) {
        $result[] = $match;
    }
    return $result;
}
See if that does a better job.
Maybe instead of "rerolls" you could just increment the list element index you've randomly generated: list.elementAt(rand_index++ % size(list)) (something like that). I'd think you'd find the next random unique item pretty fast with logic like that.
I'm sure there are even better solutions, of course, there usually are.
Edit: Looks like Brad's already provided one.. :)
Related
I have a set of items. I need to randomly pick one. The problem is that they each have a weight of 1-10. A weight of 2 means that the item is twice as likely to be picked than a weight of 1. A weight of 3 is three times as likely.
I currently fill an array with each item. If the weight is 3, I put three copies of the item in the array. Then, I pick a random item.
My method is fast, but uses a lot of memory. I am trying to think of a faster method, but nothing comes to mind. Anyone have a trick for this problem?
EDIT: My Code...
Apparently, I wasn't clear. I do not want to use (or improve) my code. This is what I did.
//Given an array $a where $a[0] is an item name and $a[1] is the weight from 1 to 100.
$b = array();
foreach($a as $t)
$b = array_merge($b, array_fill(0,$t[1],$t));
$item = $b[array_rand($b)];
This required me to check every item in $a and uses roughly (max_weight / 2) × size($a) memory for the array. I wanted a COMPLETELY DIFFERENT algorithm.
Further, I asked this question in the middle of the night using a phone. Typing code on a phone is nearly impossible because those silly virtual keyboards simply suck. It auto-corrects everything, ruining any code I type.
And yet further, I woke up this morning with an entirely new algorithm that uses virtually no extra memory at all and does not require checking every item in the array. I posted it as an answer below.
This one's your huckleberry.
$arr = array(
    array("val" => "one",   "weight" => 1),
    array("val" => "two",   "weight" => 2),
    array("val" => "three", "weight" => 3),
    array("val" => "four",  "weight" => 4)
);

$weight_sum = 0;
foreach ($arr as $val)
{
    $weight_sum += $val['weight'];
}

$r = rand(1, $weight_sum);
print "random value is $r\n";

for ($i = 0; $i < count($arr); $i++)
{
    if ($r <= $arr[$i]['weight'])
    {
        print "$r <= {$arr[$i]['weight']}, this is our match\n";
        print $arr[$i]['val'] . "\n";
        break;
    }
    else
    {
        print "$r > {$arr[$i]['weight']}, subtracting weight\n";
        $r -= $arr[$i]['weight'];
        print "new \$r is $r\n";
    }
}
No need to generate arrays containing an item for every weight, no need to fill an array with n elements for a weight of n. Just generate a random number between 1 and the total weight, then loop through the array until you find a weight greater than or equal to your random number. If it isn't, subtract that weight from the random number and continue.
Sample output:
# php wr.php
random value is 8
8 > 1, subtracting weight
new $r is 7
7 > 2, subtracting weight
new $r is 5
5 > 3, subtracting weight
new $r is 2
2 <= 4, this is our match
four
This should also support fractional weights.
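(A note of mine, not from the original answer: rand() takes integer bounds, so for genuinely fractional weights you would want a float draw instead, for example the line below, keeping the rest of the loop unchanged.)
$r = mt_rand() / mt_getrandmax() * $weight_sum; // uniform float in [0, $weight_sum]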
Modified version, using an array keyed by weight rather than by item:
$arr2 = array();
for ($i = 0; $i <= 500000; $i++)
{
    $weight = rand(1, 10);
    $num = rand(1, 1000);
    $arr2[$weight][] = $num;
}

$start = microtime(true);

$weight_sum = 0;
foreach ($arr2 as $weight => $vals) {
    $weight_sum += $weight * count($vals);
}
print "weighted sum is $weight_sum\n";

$r = rand(1, $weight_sum);
print "random value is $r\n";

$found = false;
$elem = null;
foreach ($arr2 as $weight => $vals)
{
    if ($found) break;
    for ($j = 0; $j < count($vals); $j++)
    {
        if ($r <= $weight) // <= so a boundary value matches, as in the version above
        {
            $elem = $vals[$j];
            $found = true;
            break;
        }
        else
        {
            $r -= $weight;
        }
    }
}
$end = microtime(true);

print "random element is: $elem\n";
print "total time is " . ($end - $start) . "\n";
With sample output:
# php wr2.php
weighted sum is 2751550
random value is 345713
random element is: 681
total time is 0.017189025878906
The measurement is hardly scientific and fluctuates depending on where in the array the element falls (obviously), but it seems fast enough for huge datasets.
This way requires two random calculations, but they should be faster and require about 1/4 of the memory, at the cost of some accuracy if weights have disproportionate item counts. (See the update below for increased accuracy at the cost of some memory and processing.)
Store a multidimensional array where each item is stored in the an array based on its weight:
$array[$weight][] = $item;
// example: Item with a weight of 5 would be $array[5][] = 'Item'
Generate a new array with the weights (1-10) appearing n times for n weight:
foreach ($array as $n => $null) {
    for ($i = 1; $i <= $n; $i++) {
        $weights[] = $n;
    }
}
The above array would be something like: [ 1, 2, 2, 3, 3, 3, 4, 4, 4, 4 ... ]
First calculation: Get a random weight from the weighted array we just created
$weight = $weights[mt_rand(0, count($weights)-1)];
Second calculation: Get a random key from that weight array
$value = $array[$weight][mt_rand(0, count($array[$weight])-1)];
Why this works: You solve the weighted issue by using the weighted array of integers we created. Then you select randomly from that weighted group.
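Putting those fragments together, a rough end-to-end sketch might look like this (the $items data here is made up purely for illustration):
// Hypothetical item => weight input (weights 1-10).
$items = ['Apple' => 5, 'Banana' => 7, 'Cherry' => 2, 'Orange' => 5];

// Bucket the items by weight.
$array = [];
foreach ($items as $item => $weight) {
    $array[$weight][] = $item;
}

// Weighted list of the weights themselves (weight n appears n times).
$weights = [];
foreach ($array as $n => $null) {
    for ($i = 1; $i <= $n; $i++) {
        $weights[] = $n;
    }
}

// First draw: a weight; second draw: an item within that weight bucket.
$weight = $weights[mt_rand(0, count($weights) - 1)];
$value  = $array[$weight][mt_rand(0, count($array[$weight]) - 1)];
echo $value, "\n";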
Update: Because of the possibility of disproportionate counts of items per weight, you could add another loop and array for the counts to increase accuracy.
foreach ($array as $n => $null) {
    $counts[$n] = count($array[$n]);
}

foreach ($array as $n => $null) {
    // Calculate proportionate weight (number of items in this weight
    // as opposed to the minimum counted weight)
    $proportion = $n * ($counts[$n] / min($counts));
    for ($i = 1; $i <= $proportion; $i++) {
        $weights[] = $n;
    }
}
What this does: if you have 2000 10's and 100 1's, it'll add 200 10's (20 × 10: 20 because it has 20× the count, and 10 because it is weighted 10) instead of 10 10's, to make it proportionate to how many are in there as opposed to the minimum weight count. So, to be accurate, instead of adding one entry for EVERY possible key, you are just being proportionate based on the MINIMUM count of weights.
I greatly appreciate the answers above. Please consider this answer, which does not require checking every item in the original array.
// Given $a as an array of items
// where $a[0] is the item name and $a[1] is the item weight.
// It is known that weights are integers from 1 to 100.
for ($i = 0; $i < sizeof($a); $i++) // Safeguard described below
{
    $item = $a[array_rand($a)];
    if (rand(1, 100) <= $item[1]) break;
}
This algorithm only requires storage for two variables ($i and $item) as $a was already created before the algorithm kicked in. It does not require a massive array of duplicate items or an array of intervals.
In a best-case scenario, this algorithm will touch one item in the original array and be done. In a worst-case scenario, it will touch n items in an array of n items (not necessarily every item in the array as some may be touched more than once).
If there was no safeguard, this could run forever. The safeguard is there to stop the algorithm if it simply never picks an item. When the safeguard is triggered, the last item touched is the one selected. However, in millions of tests using random data sets of 100,000 items with random weights of 1 to 10 (changing rand(1,100) to rand(1,10) in my code), the safeguard was never hit.
I made histograms comparing the frequency of items selected among my original algorithm, the ones from answers above, and the one in this answer. The differences in frequencies are trivial - easy to attribute to variances in the random numbers.
EDIT... It is apparent to me that my algorithm may be combined with the algorithm pala_ posted, removing the need for a safeguard.
In pala_'s algorithm, a list is required, which I call an interval list. To simplify, you begin with a random_weight that is rather high. You step down the list of items and subtract the weight of each one until your random_weight falls to zero (or less). Then, the item you ended on is your item to return. There are variations on this interval algorithm that I've tested and pala_'s is a very good one. But, I wanted to avoid making a list. I wanted to use only the given weighted list and never touch all the items. The following algorithm merges my use of random jumping with pala_'s interval list. Instead of a list, I randomly jump around the list. I am guaranteed to get to zero eventually, so no safeguard is needed.
// Given $a as the weighted array (described above)
$weight = rand(1, 100); // The bigger this is, the slower the algorithm runs.
while ($weight > 0)
{
    $item = $a[array_rand($a)];
    $weight -= $item[1];
}
// $item is the random item you want.
I wish I could select both pala_ and this answer as the correct answers.
I'm not sure if this is "faster", but I think it may be better "balanced" between memory usage and speed.
The idea is to transform your current implementation (a 500,000-entry expanded array) into an array of the same length as the original set (100,000 entries), with the lowest "origin" position as key and the original index as value:
<?php
$set=[["a",3],["b",5]];
$current_implementation=["a","a","a","b","b","b","b","b"];
// 0=>0 means the lowest "position" 0
// points to 0 in the set;
// 3=>1 means the lowest "position" 3
// points to 1 in the set;
$my_implementation=[0=>0,3=>1];
And then randomly picks a number between 0 and highest "origin" position:
// 3 is the lowest position of the last element ("b")
// and 5 the weight of that last element
$my_implementation_pick=mt_rand(0,3+5-1);
Full code:
<?php
function randomPickByWeight(array $set)
{
    $low = 0;
    $high = 0;
    $candidates = [];
    foreach ($set as $key => $item)
    {
        $candidates[$high] = $key;
        $high += $item["weight"];
    }
    $pick = mt_rand($low, $high - 1);
    while (!array_key_exists($pick, $candidates))
    {
        $pick--;
    }
    return $set[$candidates[$pick]];
}

$cache = [];
for ($i = 0; $i < 100000; $i++)
{
    $cache[] = ["item" => "item {$i}", "weight" => mt_rand(1, 10)];
}

$time = time();
for ($i = 0; $i < 100; $i++)
{
    print_r(randomPickByWeight($cache));
}
$time = time() - $time;
var_dump($time);
3v4l.org demo
3v4l.org has a time limit on code execution, so the demo didn't finish there. On my laptop the demo above finished in 10 seconds (i7-4700HQ).
Here is my offer, in case I've understood you right. Take a look, and if there are any questions I'll explain.
Some words in advance:
- My sample uses only 3 stages of weight, to keep it clear.
- With the outer while loop I'm simulating your main loop - I only count to 100.
- The array must be initialized with one set of initial numbers, as shown in my sample.
- In every pass of the main loop I get only one random value, and I keep track of the weights throughout.
<?php
$array = array(
    0 => array('item' => 'A', 'weight' => 1),
    1 => array('item' => 'B', 'weight' => 2),
    2 => array('item' => 'C', 'weight' => 3),
);
$etalon_weights = array(1, 2, 3);
$current_weights = array(0, 0, 0);
$ii = 0;
while ($ii < 100) { // Simulates your main loop
    // Randomisation cycle
    if ($current_weights == $etalon_weights) {
        $current_weights = array(0, 0, 0);
    }
    $ft = true;
    while ($ft) {
        $curindex = rand(0, (count($array) - 1));
        $cur = $array[$curindex];
        if ($current_weights[$cur['weight'] - 1] < $etalon_weights[$cur['weight'] - 1]) {
            echo $cur['item'];
            $array[] = $cur;
            $current_weights[$cur['weight'] - 1]++;
            $ft = false;
        }
    }
    $ii++;
}
?>
I'll use this input array for my explanation:
$values_and_weights = array(
    "one"   => 1,
    "two"   => 8,
    "three" => 10,
    "four"  => 4,
    "five"  => 3,
    "six"   => 10
);
The simple version isn't going to work for you because your array is so large. It requires no array modification but may need to iterate the entire array, and that's a deal breaker.
/*
$pick = mt_rand(1, array_sum($values_and_weights));
$x = 0;
foreach ($values_and_weights as $val => $wgt) {
    if (($x += $wgt) >= $pick) {
        echo "$val";
        break;
    }
}
*/
For your case, re-structuring the array will offer great benefits.
The cost in memory for generating a new array will be increasingly justified as:
array size increases and
number of selections increases.
The new array requires the replacement of "weight" with a "limit" for each value by adding the previous element's weight to the current element's weight.
Then flip the array so that the limits are the array keys and the values are the array values.
The selection logic is: the selected value will have the lowest limit that is >= $pick.
// Declare new array using array_walk one-liner:
array_walk($values_and_weights,function($v,$k)use(&$limits_and_values,&$x){$limits_and_values[$x+=$v]=$k;});
//Alternative declaration method - 4-liner, foreach() loop:
/*$x=0;
foreach($values_and_weights as $val=>$wgt){
$limits_and_values[$x+=$wgt]=$val;
}*/
var_export($limits_and_values);
$limits_and_values looks like this:
array (
1 => 'one',
9 => 'two',
19 => 'three',
23 => 'four',
26 => 'five',
36 => 'six',
)
Now to generate the random $pick and select the value:
// $x (from walk/loop) is the same as writing: end($limits_and_values); $x=key($limits_and_values);
$pick=mt_rand(1,$x); // pull random integer between 1 and highest limit/key
while(!isset($limits_and_values[$pick])){++$pick;} // smallest possible loop to find key
echo $limits_and_values[$pick]; // this is your random (weighted) value
This approach is brilliant because isset() is very fast and the maximum number of isset() calls in the while loop can only be as many as the largest weight (not to be confused with limit) in the array.
FOR YOUR CASE, THIS APPROACH WILL FIND THE VALUE IN 10 ITERATIONS OR LESS!
Here is my Demo that will accept a weighted array (like $values_and_weights), then in just four lines:
Restructure the array,
Generate a random number,
Find the correct value, and
Display it.
I'm not that good at math, so I'm stuck here.
I need to get the total number of possible arrangements (I think, or permutations maybe?) of X elements amongst N.
I want to pick X distinct elements amongst N (N>=X)
order DOES matter
each element can not come more than once in a combination
=> For example, given $N = count(1,2,3,4,5,6,7,8,9), a valid combination of $X=6 elements could be:
- 1,4,5,3,2,8
- 4,2,1,9,7,3
What formula do I need to use in PHP to get the total number of possibilities?
There are N choices for the first element, N-1 for the second (as you have already chosen one), then N-2 choices for the third, and so on. Using factorials you can express this as N! / (N-X)!. See https://en.wikipedia.org/wiki/Permutations
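For the example above ($N = 9, $X = 6) that gives 9 × 8 × 7 × 6 × 5 × 4 = 9!/3! = 60480 possible arrangements.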
Ok, I think I got it.
$set = array('A', 'B', 'C', 'D', 'E', 'F', 'G');
$n = count($set);
$k = 6;

if ($n > 0)
{
    if ($k < $n)
    {
        $outcomes = gmp_fact($n) / gmp_fact($n - $k);
    }
    else
    {
        $outcomes = gmp_fact($n);
    }
} else { $outcomes = 0; }
where gmp_fact($n) is the PHP function for $n! (n factorial), i.e. N × (N-1) × ... × 1.
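For the sample set above ($n = 7, $k = 6) that works out to 7!/(7-6)! = 5040 possible arrangements.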
In a browser game we have items that occur based on their probabilities.
P(i1) = 0.8
P(i2) = 0.45
P(i3) = 0.33
P(i4) = 0.01
How do we implement a function in php that returns a random item based on its probability chance?
edit
The items have a property called rarity, which varies from 1 to 100 and represents the probability of occurring. The item that occurs is chosen from a set of all items of a certain type (e.g. the example given above represents all tier 1 artifacts).
I don't know if it's the best solution, but when I had to solve this a while back, this is what I found:
Function taken from this blog post:
// Given an array of values, and weights for those values (any positive int)
// it will select a value randomly as often as the given weight allows.
// for example:
// values(A, B, C, D)
// weights(30, 50, 100, 25)
// Given these values C should come out twice as often as B, and 4 times as often as D.
function weighted_random($values, $weights) {
    $count = count($values);
    $i = 0;
    $n = 0;
    $num = mt_rand(0, array_sum($weights));
    while ($i < $count) {
        $n += $weights[$i];
        if ($n >= $num) {
            break;
        }
        $i++;
    }
    return $values[$i];
}
Example call:
$values = array('A','B','C');
$weights = array(1,50,100);
$weighted_value = weighted_random($values, $weights);
It's somewhat unwieldy as obviously the values and weights need to be supplied separately but this could probably be refactored to suit your needs.
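For instance (my own illustration, not part of the original answer), if your items are a name => rarity map, you could call it like this, treating rarity as a relative weight:
// Hypothetical data: item name => rarity (1-100).
$items = ['sword' => 80, 'shield' => 45, 'amulet' => 33, 'crown' => 1];
$picked = weighted_random(array_keys($items), array_values($items));
echo $picked, "\n";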
I tried to understand how Bulk's function works; here is how I understand it, based on Benjamin Kloster's answer:
https://softwareengineering.stackexchange.com/questions/150616/return-random-list-item-by-its-weight
Generate a random number n in the range 0 to sum(weights) - in this case $num. So let's say the weights are (30, 50, 100, 25).
The sum is 205.
Now $num has to be 0-30 to get A,
30-80 to get B,
80-180 to get C,
and 180-205 to get D.
The while loop finds which interval $num falls into.
I have a set of elements and I need to randomly choose one of them. The problem is that each element has a percentage chance associated with it. The percentages add up to 100.
I need to choose one of those elements so that the chance of an element being chosen equals its percent value. So if an element has a 25% chance, it should have a 25% chance of getting chosen. In other words, if we choose elements 1 million times, that element should be chosen close to 250k times.
What you describe is a multinomial process.
http://en.wikipedia.org/wiki/Multinomial_distribution#Sampling_from_a_multinomial_distribution
The way to generate such a random process is like this:
(I'll use pseudocode, but it should be easy to turn into real code.)
Sort the 'boxes' in reverse order of their probability:
(not needed. it's just an optimization)
so that you have for example values=[0.45,0.3,0.15,0.1]
Then create the 'cumulative' distribution, where element i is the sum of all values with index <= i.
pseudocode:
cumulant = [0, 0, 0, 0]   // initialize it
s = 0
for i = 0 to size()-1 {
    s = s + values[i]
    cumulant[i] = s
}
In our case cumulant = [0.45, 0.70, 0.85, 1].
make a uniform random number x between 0 and 1.
For php: http://php.net/manual/en/function.rand.php
The resulting random box index i is
the lowest i for which x <= cumulant[i].
pseudocode:
for i = 0 to size()-1 {
    if !(cumulant[i] < x) {
        print "your index is ", i
        break
    }
}
That is it. Get another random index i by going back to the step where you draw a new uniform x.
If you sort as suggested above, the final search will be faster. For example, if you have this vector of probabilities: 0.001 0.001 0.001 0.001 0.996, then, when sorted, you will almost always only have to look at index i = 0, since the random number x will almost always be lower than 0.996. Whether the sort pays off depends on whether you repeatedly use the same 'boxes'. So, yes, with 250k tries it will help a lot. Just remember that the box index i you get is for the sorted vector.
I guess it was faster for me to write it than it was for you to show us what you did so far.
Probably not the best solution, but as it stands, it looks like it's the only one you've got.
Here you go:
$elements = array(
    'This'   => 25,
    'is'     => 15,
    'a'      => 15,
    'crappy' => 20,
    'list'   => 25
);

asort($elements);
$elements = array_reverse($elements);

// Precalc cumulative value
$cumulant = 0;
foreach ($elements as $key => &$value) {
    $cumulant += $value;
    $value = $cumulant;
}

function pickAnElement($elements) {
    $random = rand(1, 100);
    foreach ($elements as $key => $value) {
        if ($random <= $value) {
            return $key;
        }
    }
}

$picks = array();
for ($i = 0; $i < 10000; $i++) {
    $element = pickAnElement($elements);
    if (!array_key_exists($element, $picks)) {
        $picks[$element] = 0;
    }
    $picks[$element]++;
}

var_dump($picks);
Inspired by Johan's answer, I added a loop to sort and pre-calculate the cumulant.
I am trying to create a little PHP script that can make my life a bit easier.
Basically, I am going to have 21 text fields on a page where I am going to input 20 different numbers. In the last field I will enter a number; let's call it the TOTAL AMOUNT. All I want the script to do is point out which numbers from the 20 fields add up to the TOTAL AMOUNT.
Example:
field1 = 25.23
field2 = 34.45
field3 = 56.67
field4 = 63.54
field5 = 87.54
....
field20 = 4.2
Total Amount = 81.90
Output: field1 + field3 = 81.90
Some of the fields might have 0 as value because sometimes I only need to enter 5-15 fields and the maximum will be 20.
If someone can help me out with the php code for this, will be greatly appreciated.
If you look at oezi's algorithm, one drawback is immediately clear: it spends a lot of time summing up numbers which are already known not to work. (For example, if 1 + 2 is already too big, it doesn't make any sense to try 1 + 2 + 3, 1 + 2 + 3 + 4, 1 + 2 + 3 + 4 + 5, ..., too.)
Thus I have written an improved version. It does not use bit magic; it does everything manually. A drawback is that it requires the input values to be sorted in descending order (use rsort). But that shouldn't be a big problem ;)
function array_sum_parts($vals, $sum){
    $solutions = array();
    $pos = array(0 => count($vals) - 1);
    $lastPosIndex = 0;
    $currentPos = $pos[0];
    $currentSum = 0;
    while (true) {
        $currentSum += $vals[$currentPos];
        if ($currentSum < $sum && $currentPos != 0) {
            $pos[++$lastPosIndex] = --$currentPos;
        } else {
            if ($currentSum == $sum) {
                $solutions[] = array_slice($pos, 0, $lastPosIndex + 1);
            }
            if ($lastPosIndex == 0) {
                break;
            }
            $currentSum -= $vals[$currentPos] + $vals[1 + $currentPos = --$pos[--$lastPosIndex]];
        }
    }
    return $solutions;
}
A modified version of oezi's testing program (see end) outputs:
possibilities: 540
took: 3.0897309780121
So it took only 3.1 seconds to execute, whereas oezi's code took 65 seconds on my machine (yes, my machine is very slow). That's more than 20 times faster!
Furthermore you may notice that my code found 540 instead of 338 possibilities. This is because I adjusted the testing program to use integers instead of floats. Direct floating point comparison is rarely the right thing to do, and this is a great example why: you sometimes get 59.959999999999 instead of 59.96 and thus the match will not be counted. So, if I run oezi's code with integers it finds 540 possibilities, too ;)
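(A side note of mine, not from the original answer: when scaling floats up to integers it is safest to round explicitly, otherwise float artifacts can survive the multiplication, e.g.)
$cents = (int) round($value * 100); // 59.96 becomes exactly 5996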
Testing program:
// Inputs
$n = array();
$n[0] = 6.56;
$n[1] = 8.99;
$n[2] = 1.45;
$n[3] = 4.83;
$n[4] = 8.16;
$n[5] = 2.53;
$n[6] = 0.28;
$n[7] = 9.37;
$n[8] = 0.34;
$n[9] = 5.82;
$n[10] = 8.24;
$n[11] = 4.35;
$n[12] = 9.67;
$n[13] = 1.69;
$n[14] = 5.64;
$n[15] = 0.27;
$n[16] = 2.73;
$n[17] = 1.63;
$n[18] = 4.07;
$n[19] = 9.04;
$n[20] = 6.32;
// Convert to integers (rounding avoids leftover float artifacts)
foreach ($n as &$num) {
    $num = (int) round($num * 100);
}
$sum = (int) round(57.96 * 100);
// Sort from High to Low
rsort($n);
// Measure time
$start = microtime(true);
echo 'possibilities: ', count($result = array_sum_parts($n, $sum)), '<br />';
echo 'took: ', microtime(true) - $start;
// Check that the result is correct
foreach ($result as $element) {
    $s = 0;
    foreach ($element as $i) {
        $s += $n[$i];
    }
    if ($s != $sum) echo '<br />FAIL!';
}
var_dump($result);
Sorry for adding a new answer, but this is a completely new solution to solve all problems of life, the universe and everything...:
function array_sum_parts($n, $t, $all = false){
    $count_n = count($n); // how many fields are in that array?
    $count = pow(2, $count_n); // we need to do 2^fields calculations to test all possibilities

    # now i want to look at every number from 1 to $count, where the number represents
    # the array, and add up all array elements which are at positions where my actual number
    # has a 1-bit
    # EXAMPLE:
    # $i = 1,  in binary 01:   I'll use only the first array element
    # $i = 10, in binary 1010: I'll use the second and the fourth array element
    # and so on... the number of 1-bits is the amount of numbers used in that try
    $loesung = array();
    for ($i = 1; $i <= $count; $i++) { // start calculating all possibilities
        $total = 0;  // sum of this try
        $anzahl = 0; // counter for 1-bits in this try
        $k = $i;     // store $i in another variable which can be changed during the loop
        for ($j = 0; $j < $count_n; $j++) { // loop through array elements
            $total += ($k % 2) * $n[$j]; // add up if the corresponding bit of $i is 1
            $anzahl += ($k % 2);         // add up the number of 1-bits
            $k = $k >> 1; // shift right to look at the next bit in the next loop
        }
        if ($total == $t) {
            // if the sum of this try is the sum we are looking for, save it
            // (with the number of 1-bits, for sorting)
            $loesung[$i] = $anzahl;
            if (!$all) {
                break; // if we're not looking for all solutions, stop: the first one was found
            }
        }
    }
    asort($loesung); // sort all solutions by the amount of numbers used

    // format the solutions to get back the original array keys (which should be the return value)
    $ret = array();
    foreach ($loesung as $val => $anzahl) {
        $bit = strrev(decbin($val));
        $ret_this = array();
        for ($j = 0; $j < strlen($bit); $j++) {
            if ($bit[$j] == '1') {
                $ret_this[] = $j;
            }
        }
        $ret[] = $ret_this;
    }
    return $ret;
}
// Inputs
$n[0]=6.56;
$n[1]=8.99;
$n[2]=1.45;
$n[3]=4.83;
$n[4]=8.16;
$n[5]=2.53;
$n[6]=0.28;
$n[7]=9.37;
$n[8]=0.34;
$n[9]=5.82;
$n[10]=8.24;
$n[11]=4.35;
$n[12]=9.67;
$n[13]=1.69;
$n[14]=5.64;
$n[15]=0.27;
$n[16]=2.73;
$n[17]=1.63;
$n[18]=4.07;
$n[19]=9.04;
$n[20]=6.32;
// Output
$t=57.96;
var_dump(array_sum_parts($n,$t));       // returns one possible solution (very fast)
var_dump(array_sum_parts($n,$t,true));  // returns all possible solutions (relatively fast considering all the needed calculations)
If you don't use the third parameter, it returns the best solution (the one using the fewest numbers) as an array (with the keys of the input array); if you set the third parameter to true, ALL solutions are returned (for testing, I used the same numbers as zaf in his post - there are 338 solutions in this case, found in ~10 sec on my machine).
EDIT:
If you get all solutions, they are ordered by which one is "best" - without this, you only get the first solution found (which isn't necessarily the best).
EDIT2:
To fulfil the desire for some explanation, I commented the essential parts of the code. If anyone needs more explanation, please ask.
1. Check and eliminate field values greater than the 21st field.
2. Check the highest of the remaining, add the smallest.
3. If it's greater than the 21st, eliminate the highest (iterate this process).
4. If lower: highest + second lowest; if equal, show the result.
5. If higher, go to step 7.
6. If lower, go to step 4.
7. If it's lower, add the second lowest, go to step 3.
8. If it's equal, show the result.
This is efficient and will take less execution time.
The following method will give you an answer... almost all of the time. Increase the iterations variable to your taste.
<?php
// Inputs
$n[1]=8.99;
$n[2]=1.45;
$n[3]=4.83;
$n[4]=8.16;
$n[5]=2.53;
$n[6]=0.28;
$n[7]=9.37;
$n[8]=0.34;
$n[9]=5.82;
$n[10]=8.24;
$n[11]=4.35;
$n[12]=9.67;
$n[13]=1.69;
$n[14]=5.64;
$n[15]=0.27;
$n[16]=2.73;
$n[17]=1.63;
$n[18]=4.07;
$n[19]=9.04;
$n[20]=6.32;
// Output
$t=57.96;
// Let's try to do this a million times randomly
// Relax, thats less than a blink
$iterations=1000000;
while ($iterations-- > 0) {
    $z = array_rand($n, mt_rand(2, 20));
    $total = 0;
    foreach ($z as $x) $total += $n[$x];
    if ($total == $t) break;
}
// If we did less than a million times we have an answer
if ($iterations > 0) {
    $total = 0;
    foreach ($z as $x) {
        $total += $n[$x];
        print("[$x] + " . $n[$x] . " = $total<br/>");
    }
}
?>
One solution:
[1] + 8.99 = 8.99
[4] + 8.16 = 17.15
[5] + 2.53 = 19.68
[6] + 0.28 = 19.96
[8] + 0.34 = 20.3
[10] + 8.24 = 28.54
[11] + 4.35 = 32.89
[13] + 1.69 = 34.58
[14] + 5.64 = 40.22
[15] + 0.27 = 40.49
[16] + 2.73 = 43.22
[17] + 1.63 = 44.85
[18] + 4.07 = 48.92
[19] + 9.04 = 57.96
A probably inefficient but simple solution with backtracking
function subset_sums($a, $val, $i = 0) {
    $r = array();
    while ($i < count($a)) {
        $v = $a[$i];
        if ($v == $val)
            $r[] = $v;
        if ($v < $val)
            foreach (subset_sums($a, $val - $v, $i + 1) as $s)
                $r[] = "$v $s";
        $i++;
    }
    return $r;
}
example
$ns = array(1, 2, 6, 7, 11, 5, 8, 9, 3);
print_r(subset_sums($ns, 11));
result
Array
(
[0] => 1 2 5 3
[1] => 1 2 8
[2] => 1 7 3
[3] => 2 6 3
[4] => 2 9
[5] => 6 5
[6] => 11
[7] => 8 3
)
I don't think the answer is as easy as nik suggested. Let's say you have the following numbers:
1 2 3 6 8
looking for an amount of 10.
nik's solution would do this (if I understand it right):
1+8 = 9 = too low
adding next lowest (2) = 11 = too high
now he would delete the high number and start again, taking the new highest
1+6 = 7 = too low
adding next lowest (2) = 9 = too low
adding next lowest (3) = 12 = too high
... and so on, where the correct answer would simply
be 8+2 = 10... I think the only solution is trying every possible combination of
numbers and stopping when the amount you are looking for is found (or really calculating all of them, if there are different solutions, and saving which one used the fewest numbers).
EDIT: really calculating all possible combinations of 21 numbers will end up in really, really, really many calculations - so there must be some "intelligent" solution for adding numbers in a particular order (like the one in nik's post - with some improvements; maybe that will bring us to a reliable solution).
Without knowing if this is a homework assignment or not, I can give you some pseudo code as a hint for a possible solution; note the solution is not very efficient, it's more of a demonstration.
Hint:
Compare each field value to every other field value and at each iteration check if their sum is equal to TOTAL_AMOUNT.
Pseudo code:
for i through fields 1-20
    for j through fields 1-20
        if value of i + value of j == total_amount
            return i and j
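A rough PHP rendering of that pseudo code might look like this (a sketch only: as written it finds pairs of fields, not larger combinations, and the field data in the example is hypothetical):
// $fields is an array of the entered numbers, $total the value of the 21st field.
function findPair(array $fields, float $total)
{
    foreach ($fields as $i => $a) {
        foreach ($fields as $j => $b) {
            // Skip the same field and compare in cents to avoid float issues.
            if ($i !== $j && (int) round(($a + $b) * 100) === (int) round($total * 100)) {
                return [$i, $j];
            }
        }
    }
    return null; // no pair adds up to the total
}

// Example from the question: 25.23 + 56.67 = 81.90
// print_r(findPair([1 => 25.23, 2 => 34.45, 3 => 56.67], 81.90)); // [1, 3]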
Update:
What you seem to have is the subset sum problem; the Wiki link includes pseudo code for the algorithm, which might help point you in the right direction.