Permutations of Varying Size


I'm trying to write a function in PHP that gets all permutations of all possible sizes. I think an example would be the best way to start off:
$my_array = array(1,1,2,3);
Possible permutations of varying size:
1
1 // * See Note
2
3
1,1
1,2
1,3
// And so forth, for all the sets of size 2
1,1,2
1,1,3
1,2,1
// And so forth, for all the sets of size 3
1,1,2,3
1,1,3,2
// And so forth, for all the sets of size 4
Note: I don't care if there's a duplicate or not. For the purposes of this example, all future duplicates have been omitted.
What I have so far in PHP:
function getPermutations($my_array){
    $permutation_length = 1;
    $keep_going = true;
    while($keep_going){
        while($there_are_still_permutations_with_this_length){
            // Generate the next permutation and return it into an array
            // Of course, the actual important part of the code is what I'm having trouble with.
        }
        $permutation_length++;
        if($permutation_length>count($my_array)){
            $keep_going = false;
        }
        else{
            $keep_going = true;
        }
    }
    return $return_array;
}
The closest thing I can think of is shuffling the array, picking the first n elements, seeing if it's already in the results array, and if it's not, add it in, and then stop when there are mathematically no more possible permutations for that length. But it's ugly and resource-inefficient.
Any pseudocode algorithms would be greatly appreciated.
Also, for super-duper (worthless) bonus points, is there a way to get just 1 permutation with the function but make it so that it doesn't have to recalculate all previous permutations to get the next?
For example, I pass it a parameter 3, which means it's already done 3 permutations, and it just generates number 4 without redoing the previous 3? (Passing it the parameter is not necessary, it could keep track in a global or static).
The reason I ask this is because as the array grows, so does the number of possible combinations. Suffice it to say that one small data set with only a dozen elements grows quickly into the trillions of possible combinations and I don't want to task PHP with holding trillions of permutations in its memory at once.

Sorry no php code, but I can give you an algorithm.
It can be done with small amounts of memory and since you don't care about dupes, the code will be simple too.
First: Generate all possible subsets.
If you view the subset as a bit vector, you can see that there is a 1-1 correspondence to a set and a binary number.
So if your array had 12 elements, you will have 2^12 subsets (including empty set).
So to generate a subset, you start with 0 and keep incrementing till you reach 2^12. At each stage you read the set bits in the number to get the appropriate subset from the array.
Once you get one subset, you can now run through its permutations.
The next permutation (of the array indices, not the elements themselves) can be generated in lexicographic order like here: http://www.de-brauwer.be/wiki/wikka.php?wakka=Permutations and can be done with minimal memory.
You should be able to combine these two to give yourself a next_permutation function. Instead of passing in numbers, you could pass in an array of 12 elements which contains the previous permutation, plus possibly a little extra state (again, very little memory) that records whether you need to move on to the next subset.
You should actually be able to find very fast algorithms which use minimal memory, provide a next_permutation type feature and do not generate dupes: Search the web for multiset permutation/combination generation.
Hope that helps. Good luck!
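A minimal sketch of the two steps above in Python (not PHP), assuming duplicates are acceptable as stated in the question. The function name `permutations_of_all_sizes` is made up for illustration, and `itertools.permutations` is swapped in for the hand-written lexicographic next-permutation step. Because it is a generator, only one permutation is held in memory at a time.

```python
from itertools import permutations


def permutations_of_all_sizes(items):
    """Lazily yield every permutation of every non-empty subset of items."""
    n = len(items)
    # Each number from 1 to 2^n - 1 is a bit vector that selects one subset.
    for mask in range(1, 1 << n):
        subset = [items[i] for i in range(n) if (mask >> i) & 1]
        # Permutations are taken by position, so equal values yield duplicates.
        for perm in permutations(subset):
            yield perm
```

Note that the output is grouped by subset, not by size, so the order differs from the listing in the question. For the example array (1,1,2,3) it yields 4 + 12 + 24 + 24 = 64 tuples.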

The best set of functions I've come across was provided by a user in the comments on the shuffle function at php.net. Here is the link. It works pretty well.
Hope it's useful.

The problem seems to be trying to give an index to every permutation and having a constant access time. I cannot think of a constant time algorithm, but maybe you can improve this one to be so. This algorithm has a time complexity of O(n) where n is the length of your set. The space complexity should be reducible to O(1).
Assume our set is 1,1,2,3 and we want the 10th permutation. Also, note that we will index each element of the set from 0 to 3. Going by your order, this means the single element permutations come first, then the two element, and so on. We are going to subtract from the number 10 until we can completely determine the 10th permutation.
First up are the single element permutations. There are 4 of those, so we can view this as subtracting one four times from 10. We are left with 6, so clearly we need to start considering the two element permutations. There are 12 of these, and we can view this as subtracting three up to four times from 6. We discover that the second time we subtract 3, we are left with 0. This means the indexes of our permutation must be 2 (because we subtracted 3 twice) and 0, because 0 is the remainder. Therefore, our permutation must be 2,1.
Division and modulus may help you.
If we were looking for the 12th permutation, we would run into the case where we have a remainder of 2. Depending on your desired behavior, the permutation 2,2 might not be valid. Getting around this is very simple, however, as we can trivially detect that the indexes 2 and 2 (not to be confused with the element) are the same, so the second one should be bumped to 3. Thus the 12th permutation can trivially be calculated as 2,3.
The biggest confusion right now is that the indexes and the element values happen to match up. I hope my algorithm explanation is not too confusing because of that. If it is, I will use a set other than your example and reword things.
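The subtract-then-divide idea can be sketched in Python as follows. This is one possible reading of the description, not a definitive implementation: the name `nth_partial_permutation` is hypothetical, and the "bump the second index" rule is realised by removing each chosen element from a pool so a later index automatically skips it.

```python
def nth_partial_permutation(items, index):
    """Return the permutation at position `index`, sizes 1, 2, ... in turn."""
    n = len(items)
    size, count = 1, n  # count = number of permutations of the current size
    # Subtract whole size classes until the index falls inside one.
    while size <= n and index >= count:
        index -= count
        size += 1
        count *= n - size + 1
    if size > n:
        raise IndexError("index is past the last permutation")
    # Split the index into digits with bases n, n-1, ..., n-size+1
    # (computed least significant first).
    digits = []
    for base in range(n - size + 1, n + 1):
        digits.append(index % base)
        index //= base
    # Each digit picks from the elements that are still unused.
    pool = list(items)
    return [pool.pop(d) for d in reversed(digits)]
```

With the set 1,1,2,3 this reproduces the two worked cases: index 10 gives 2,1 and index 12 gives 2,3.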

Inputs: Permutation index k, indexed set S.
Pseudocode:
L = {S_1}
for i = 2 to |S| do
Insert S_i before L_{k % i}
k <- k / i
loop
return L
This algorithm can also be easily modified to work with duplicates.
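The pseudocode translates almost line for line into Python. A small sketch, assuming k is a zero-based index in the range 0 to |S|! - 1 and that inserting "before position len(L)" means appending; the name `kth_permutation` is illustrative only.

```python
def kth_permutation(items, k):
    """Build the k-th full-length permutation by repeated insertion."""
    if not items:
        return []
    result = [items[0]]
    for i in range(2, len(items) + 1):
        # k % i selects one of the i possible insertion slots.
        result.insert(k % i, items[i - 1])
        k //= i
    return result
```

Each k selects a different sequence of insertion slots, so every k in range yields a different permutation, and no earlier permutation has to be recomputed.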

Related

Subset Sum floats Eliminations

I will be happy to get some help. I have the following problem:
I'm given a list of numbers and a target number.
subset_sum([11.96,1,15.04,7.8,20,10,11.13,9,11,1.07,8.04,9], 20)
I need an algorithm that finds all combinations of numbers that sum to the target number (e.g. 20).
First find any single values equal to 20.
And next for example the best combinations here are:
11.96 + 8.04
1 + 10 + 9
11.13 + 7.8 + 1.07
9 + 11
Remaining value 15.04.
I need an algorithm that uses 1 value only once and it could use from 1 to n values to sum target number.
I tried some recursion in PHP but runs out of memory really fast (50k values) so a solution in Python will help (time/memory wise).
I'd be glad for some guidance here.
One possible solution is this: Finding all possible combinations of numbers to reach a given sum
The only difference is that I need to put a flag on elements already used so it won't be used twice and I can reduce the number of possible combinations
Thanks for anyone willing to help.

There are many ways to think about this problem.
If you do recursion make sure to identify your end cases first, then proceed with the rest of the program.
This is the first thing that comes to mind.
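<?php
subset_sum([11.96,1,15.04,7.8,20,10,11.13,9,11,1.07,8.04,9], 20);

// Prints every combination of values from $a that sums to $s.
// $c accumulates the values chosen so far.
// Note: $s is compared to 0 exactly, so float rounding may hide some matches.
function subset_sum($a, $s, $c = array())
{
    // End case: overshot the target, abandon this branch.
    if ($s < 0)
        return;
    // End case: nothing left to pick and the target is not reached.
    if ($s != 0 && count($a) == 0)
        return;
    if ($s != 0)
    {
        foreach ($a as $xd => $xdd)
        {
            // Remove the value so it cannot be picked again further down.
            unset($a[$xd]);
            subset_sum($a, $s - $xdd, array_merge($c, array($xdd)));
        }
    }
    else
        print_r($c);
}
?>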

This is a possible solution, but it's not pretty:
import itertools
import operator
from functools import reduce

def subset_num(array, num):
    # Build every combination of every size, then keep those with the right sum.
    subsets = reduce(operator.add, [list(itertools.combinations(array, r)) for r in range(1, 1 + len(array))])
    return [subset for subset in subsets if sum(subset) == num]

print(subset_num([11.96,1,15.04,7.8,20,10,11.13,9,11,1.07,8.04,9], 20))
Output:
[(20,), (11.96, 8.04), (9, 11), (11, 9), (1, 10, 9), (1, 10, 9), (7.8, 11.13, 1.07)]

DISCLAIMER: this is not a full solution, it is a way to just help you build the possible subsets. It does not help you to pick which ones go together (without using the same item more than once and getting the lowest remainder).
Using dynamic programming you can build all the subsets that add up to the given sum, then you will need to go through them and find which combination of subsets is best for you.
To build this archive you can (I'm assuming we're dealing with non-negative numbers only) put the items in a column, go from top to bottom and for each element compute all the subsets that add up to the sum or a lower number than it and that include only items from the column that are in the place you are looking at or higher. When you build a subset you put in its node both the sum of the subset (which may be the given sum or smaller) and the items that are included in the subset. So in order to compute the subsets for an item [i] you need only look at the subsets you've created for item [i-1]. For each of them there are 3 options:
1) the subset's sum is the given sum ---> Keep the subset as it is and move to the next one.
2) the subset's sum is smaller than the given sum but larger than it if item [i] is added to it ---> Keep the subset as it is and move on to the next one.
3) the subset's sum is smaller than the given sum and it will still be smaller or equal to it if item [i] is added to it ---> Keep one copy of the subset as it is and create another one with item [i] added to it (both as a member and added to the sum of the subset).
When you're done with the last item (item [n]), look at the subsets you've created - each one has its sum in its node and you can see which ones are equal to the given sum (and which ones are smaller - you don't need those anymore).
As I wrote at the beginning - now you need to figure out how to take the best combination of subsets that do not have a shared member between any of them.
Basically you're left with a problem that resembles the classic knapsack problem but with another limitation (not every stone can be taken with every other stone). Maybe the limitation actually helps, I'm not sure.
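The column-by-column construction described above could look roughly like this in Python. It is a sketch under two assumptions that are not in the answer: values are converted to integer cents so that sums compare exactly, and subsets are stored as tuples of item positions. The name `subsets_reaching` is made up for illustration.

```python
def subsets_reaching(values, target):
    """Return index tuples of all subsets of non-negative values summing to target."""
    # Each node holds (sum of the subset, positions of its members).
    nodes = [(0, ())]
    for i, value in enumerate(values):
        extended = []
        for total, members in nodes:
            # Option 3: adding item i still fits, so keep a second copy with it.
            if total + value <= target:
                extended.append((total + value, members + (i,)))
            # Options 1 and 2: the existing subset is simply kept as it is.
        nodes.extend(extended)
    return [members for total, members in nodes if total == target]
```

For the question's list expressed in cents, the result includes (0, 10), which is 11.96 + 8.04, and (3, 6, 9), which is 7.8 + 11.13 + 1.07. The number of stored nodes can still grow exponentially, so this does not by itself solve the 50k-value case.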
A bit more about the advantage of dynamic programming in this case
The basic idea of dynamic programming instead of recursion is to trade redundancy of operations with occupation of memory space. By that I mean to say that recursion with a complex problem (normally a backtrack knapsack-like problem, as we have here) normally ends up calculating the same thing a fair amount of times because the different branches of calculation have no concept of each other's operations and results. Dynamic programming saves the results and uses them along the way to build "bigger" results, relying on the previous/"smaller" ones. Because the use of the stack is much more straightforward than in recursion, you don't get the memory problem you get with recursion regarding the maintenance of the function's state, but you do need to handle a great deal of memory that you store (sometimes you can optimise that).
So for example in our problem, when trying to build a subset that adds up to the required sum, the branch that starts with item A and the branch that starts with item B do not know of each other's operations. Let's assume item C and item D together add up to the sum, that either of them added alone to A or B would not exceed the sum, and that A doesn't go with B in the solution (we can have sum=10, A=B=4, C=D=5, and there is no subset that sums up to 2, so A and B can't be in the same group). The branch trying to figure out A's group would (after trying and rejecting having B in its group) add C (A+C=9) and then add D, at which point it would reject this group and backtrack (A+C+D=14 > sum=10). The same would happen to B of course (A=B), because the branch figuring out B's group has no information about what just happened in the branch dealing with A. So in fact we've calculated C+D twice and haven't even used it yet (and we're about to calculate it a third time to realise they belong in a group of their own).
NOTE:
Looking around while writing this answer I came across a technique I was not familiar with and might be a better solution for you: memoization. Taken from wikipedia:
memoization is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again.
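As a small illustration of memoization on this kind of problem, the sketch below caches the answer to "can the items from position i onward reach this remaining sum?" so that each (i, remaining) pair is computed once. It assumes integer values, and the name `subset_exists` is hypothetical. It only answers whether a subset exists and does not list the subsets.

```python
from functools import lru_cache


def subset_exists(values, target):
    """Return True if some subset of the integer values sums to target."""
    values = tuple(values)

    @lru_cache(maxsize=None)
    def reach(i, remaining):
        if remaining == 0:
            return True
        if i == len(values) or remaining < 0:
            return False
        # Either take values[i] or skip it; repeated states hit the cache.
        return reach(i + 1, remaining - values[i]) or reach(i + 1, remaining)

    return reach(0, target)
```

With the numbers from the example above (A=B=4, C=D=5), `subset_exists([4, 4, 5, 5], 10)` is true and `subset_exists([4, 4, 5, 5], 2)` is false.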

So I have a possible solution:
from collections import Counter

# compute difference between 2 lists but keep duplicates
def list_difference(a, b):
    count = Counter(a)  # count items in a
    count.subtract(b)  # subtract items that are in b
    diff = []
    for x in a:
        if count[x] > 0:
            count[x] -= 1
            diff.append(x)
    return diff

# return the first combination of numbers that matches target
def subset_sum(numbers, target, partial=[]):
    s = sum(partial)
    # check if the partial sum is equal to target
    if s == target:
        print("--------------------------------------------sum_is(%s)=%s" % (partial, target))
        return partial
    else:
        if s >= target:
            return  # if we reach the number why bother to continue
        for i in range(len(numbers)):
            n = numbers[i]
            remaining = numbers[i+1:]
            rest = subset_sum(remaining, target, partial + [n])
            if type(rest) is list:
                return rest  # assumed completion: this line was cut off in the post

# repeat until rest is > target and rest is not the same as previous
def repeatUntil(subset, target):
    currSubset = []
    while sum(subset) > target and currSubset != subset:
        diff = subset_sum(subset, target)
        currSubset = subset
        subset = list_difference(subset, diff)
    return subset
Output:
--------------------------------------------sum_is([11.96, 8.04])=20
--------------------------------------------sum_is([1, 10, 9])=20
--------------------------------------------sum_is([7.8, 11.13, 1.07])=20
--------------------------------------------sum_is([20])=20
--------------------------------------------sum_is([9, 11])=20
[15.04]
Unfortunately this solution only works for a small list. For a big list I am still trying to break the list into small chunks and calculate each one, but the answer is not quite correct. You can see it in a new thread here:
Finding unique combinations of numbers to reach a given sum

Accessing unique value pairs from an array without repeating myself

I am trying to access unique value pairs from an array in a random order - without repeating myself until I have to.
For example, if I have an array set A,B,C,D (generally an even number of items, but up to 20) then the first time through I might pair A-B & C-D. But I want to guarantee that the next time I do it, I avoid repeating my pairing and that I get both A-C & B-D and A-D and B-C before I then get A-B and C-D again. Each item should only be called once in each round.
I started off by shuffling the order of the array randomly then pairing two values together - but I need a way to prevent some pairings from occurring more frequently than others (ideally I'd want them to increment equally all the way through).
So I've moved to looking at permutations - and have managed to get a full array containing all the possible pairings using the code below:
$this->items = array('A','B','C','D');
$input = $this->items;
$input_copy = $input;
$output = array();
$i = 0;
foreach($input as $val) {
    $j = 0;
    foreach($input_copy as $cval) {
        if($j == $i) break;
        print $val.'-'.$cval.'<br/>';
        //$output[] = array($val => $cval);
        $j++;
    }
    $i++;
}
//print_r($output);
e.g for A, B, C, D I get:
b-a
c-a
c-b
d-a
d-b
d-c
I want to cycle through the set n-1 times and capture the results in another array, but I'm not sure how to generate the actual order from these unique options
In other words, I want to turn the list above in to the below:
1st run =>
1=> A-B,
2=> C-D,
2nd run =>
1=> A-C,
2=> B-D,
3rd run =>
1=> A-D,
2=> C-B,
It may be that I can do this more simply from $this->items. I've also had a look at the Math_Combinatorics PEAR package, but I wasn't sure where to start.
I'd be grateful for any help!

You can use round-robin tournament algorithm
Place elements in two rows.
Fix one element - in this case A
For next round shift all other elements in circular manner.
Pair them.
Repeat N-1 times
A B
D C
-----
A D
C B
----
A C
B D
----
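The rotation above can be sketched in Python as follows. This is an illustration of the round-robin (circle) method, not code from the answer: the name `round_robin` is made up, and padding an odd-sized list with `None` as a "sits out" marker is an added assumption.

```python
def round_robin(items):
    """Return N-1 rounds; each round pairs every item exactly once."""
    items = list(items)
    if len(items) % 2:
        items.append(None)  # assumed convention: None means "no partner"
    n = len(items)
    fixed, rest = items[0], items[1:]
    rounds = []
    for _ in range(n - 1):
        row = [fixed] + rest
        # Pair the first with the last, the second with the second last, ...
        rounds.append([(row[i], row[n - 1 - i]) for i in range(n // 2)])
        # Shift everything except the fixed element one step around the circle.
        rest = rest[-1:] + rest[:-1]
    return rounds
```

For A, B, C, D this gives the rounds A-D/B-C, A-C/D-B and A-B/C-D, matching the three diagrams. To randomise, shuffle the input once and shuffle the order of the rounds.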

I assume that you want to generate each pairing exactly once, i.e. each partition of your whole sequence into pairs. If you only want each pair exactly once, that's a different problem handled in a different answer.
Think about this problem recursively: At the beginning you have n elements. From these, take the first and choose a partner for it from the remaining n-1 elements. Take this pair out of the list and recurse with the remaining n-2 elements. If you make each choice unbiased, the remaining pairing will be unbiased as well. But that doesn't guarantee you won't repeat yourself earlier than necessary.
If you really want to be sure you avoid repeating pairings, you should first think about how many possible pairings there are. For now I'll assume that n is even, so you only have complete pairs. It's easy to adjust this to odd n with one unpaired element. To obtain the total number of possible pairings, you have to multiply your choices:
m=(n-1)*(n-3)*(n-5)*...*7*5*3*1
So it's a product of odd numbers. That's OEIS sequence A001147, also written as a double factorial m=(n-1)!!. Note that these numbers grow fairly quickly, so even for moderate n (like n=16) you might not have to worry about repeating yourself, simply because there are so many possible pairings to choose from that a repetition is fairly unlikely.
If you really want to be sure that you avoid repetitions, you could of course simply generate the whole list and shuffle it. But as I just indicated, that list could become huge as well. So instead I'd suggest you divide this problem into two steps. Find a way to generate all numbers from 0 to m-1 each exactly once, and find a way to turn such numbers into pairings. For the latter, you can simply decompose your number step by step. At each step, take index % (n-1) to make the current choice, and choose (int)(index / (n-1)) as the index for subsequent choices in the recursive calls.
For the former, the easiest thing I can think of would be using a PRNG with a prime number p>m as its period. Using modular arithmetic, that should be easy to do. Then simply discard all values which are greater or equal to m. Discarding means that you skip to the next element in the sequence. This will give all pairings in an order which should seem fairly random. If the starting point in that sequence should be random, be sure that if you at first choose a value which is to be discarded, then you have to initialize again, not skip to the next element. Otherwise some elements would be more likely as starting points than others.
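The step that turns a number into a pairing can be sketched like this in Python, assuming n is even and the index lies between 0 and m-1. The names `count_pairings` and `pairing_from_index` are hypothetical, and the loop replaces the recursive calls described above. The PRNG part is not covered here.

```python
def count_pairings(n):
    """m = (n-1) * (n-3) * ... * 3 * 1 for even n."""
    m = 1
    for k in range(n - 1, 0, -2):
        m *= k
    return m


def pairing_from_index(items, index):
    """Decode index (0 <= index < m) into one partition of items into pairs."""
    items = list(items)
    pairs = []
    while items:
        first = items.pop(0)
        choices = len(items)  # n-1 possible partners at this step
        partner = items.pop(index % choices)
        index //= choices  # the rest of the index drives the later choices
        pairs.append((first, partner))
    return pairs
```

For four items there are three pairings: indexes 0, 1 and 2 decode to A-B/C-D, A-C/B-D and A-D/B-C respectively.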

performance on iterations in PHP

Can someone tell me about this performance issue
I've got 2 arrays,
I need to pick 5 numbers from these 2 arrays and work on the logic
the first array has got 5 number, out of which I need to pick 3 numbers
and the second array has got 4 numbers, out of which I need to pick 2 number
so taking this into consideration, 5C3 = 10 and 4C2 = 6,
which means 10 * 6 = 60 iterations for a single case
Is the method I'm approaching the right way??
is there any performance issue on this type of iterations ??

If you have to go through the whole array and pick numbers, then there is no optimization for that. The execution time depends on the size of the arrays: the bigger the arrays, the higher the execution time.
However, if you know that it will always be exactly 5 numbers from two rows whose elements will not change, then I think you could generate all the possible valid combinations, store them in a database or file, and return a random one (if random choice is what you are looking for). In this case, you will achieve some optimization.
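To make the count from the question concrete, here is a short Python sketch (not PHP) that enumerates every way of picking 3 of 5 and 2 of 4. The name `pick_combinations` and the sample values are invented for illustration.

```python
from itertools import combinations, product


def pick_combinations(first, second, k1=3, k2=2):
    """Every choice of k1 items from first joined with k2 items from second."""
    return [a + b for a, b in product(combinations(first, k1),
                                      combinations(second, k2))]


# Hypothetical sample data: 5 numbers and 4 numbers.
picks = pick_combinations([1, 2, 3, 4, 5], [6, 7, 8, 9])
```

Here `len(picks)` is 60, which is small enough to precompute once and reuse.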

random function: higher values appear less often than lower

I have a tricky question that I've looked into a couple of times without figuring it out.
Some backstory: I am making a textbased RPG-game where players fight against animals/monsters etc. It works like any other game where you hit a number of hitpoints on each other every round.
The problem: I am using the random-function in php to generate the final value of the hit, depending on levels, armor and such. But I'd like the higher values (like the max hit) to appear less often than the lower values.
This is an example-graph:
How can I reproduce something like this using PHP and the rand-function? When typing rand(1,100) every number has an equal chance of being picked.
My idea is this: Make a 2nd degree (or quadratic function) and use the random number (x) to do the calculation.
Would this work like I want?
The question is a bit tricky, please let me know if you'd like more information and details.

Please look at this beautiful article:
http://www.redblobgames.com/articles/probability/damage-rolls.html
There are interactive diagrams covering dice rolling and the percentage of results.
This should be very useful for you.
Pay attention to this kind of rolling random number:
roll1 = rollDice(2, 12);
roll2 = rollDice(2, 12);
damage = min(roll1, roll2);
This should give you what you look for.
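A rough Python equivalent of the min-of-two-rolls idea, assuming each roll is a uniform integer from 1 to the maximum hit. The names `damage_roll` and `min_of_two_probability` are made up here; the second function gives the exact chance of each value so the bias can be checked without sampling.

```python
import random


def damage_roll(max_hit, rng=random):
    """Roll twice and keep the lower value, so high hits are rarer."""
    return min(rng.randint(1, max_hit), rng.randint(1, max_hit))


def min_of_two_probability(value, max_hit):
    """Exact probability that the lower of two uniform rolls equals value."""
    return (2 * (max_hit - value) + 1) / max_hit ** 2
```

With a maximum of 100, a hit of 1 has probability 199/10000 while a hit of 100 has probability 1/10000, and the chance falls linearly in between.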

OK, here's my idea :
Let's say you've got an array of elements (a,b,c,d) and you want to randomly pick one of them. Doing a rand(1,4) to get the random element index would mean that all elements have an equal chance to appear (25%).
Now, let's say we take this array : (a,b,c,d,d).
Here we still have 4 distinct elements, but they no longer have equal chances to appear.
a,b,c : 20%
d : 40%
Or, let's take this array :
(1,2,3,...,97,97,97,98,98,98,99,99,99,100,100,100,100)
Hint : This way you won't only bias the random number generation algorithm, but you'll actually set the desired probability of apparition of each one (or of a range of numbers).
So, that's how I would go about that :
If you want numbers from 1 to 100 (with higher numbers appearing more frequently), get a random number from 1 to 1000 and associate it with a wider range. E.g.
rand = 800-1000 => rand/10 (80->100)
rand = 600-800 => rand/9 (66->88)
...
Or something like that. (You could use any math operation you imagine, modulo or whatever... and play with your algorithm). I hope you get my idea.
Good luck! :-)
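The repeated-entries idea can be expressed without building the long array by giving each value a weight. A sketch in Python, assuming a linear weighting in which value v gets weight max_hit - v + 1, so that lower hits are the common ones as the question asks; the weighting and the name `weighted_hit` are illustrative choices, not part of the answer.

```python
import random


def weighted_hit(max_hit, rng=random):
    """Pick a value from 1..max_hit where smaller values carry more weight."""
    values = list(range(1, max_hit + 1))
    # Weight of v is how many times v would be repeated in the array.
    weights = [max_hit - v + 1 for v in values]
    return rng.choices(values, weights=weights, k=1)[0]
```

Reversing the weights list would favour high values instead, which is the direction used in the 1 to 1000 example above.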

Find the value in the minimum number of trials

I have an array of 52 different values that I can pass through a class to get a number in return.
$array = array("A","B","C","D"...);
Each value passed through the class gives a different number that can be either positive or negative.
The numbers are not equally distributed but are sorted in natural order.
E.g.
$myclass->calculate("A"); // 2.3
$myclass->calculate("B"); // 0.25
$myclass->calculate("C"); // -1.3
$myclass->calculate("D"); // -6
I want to get the last value that return a number >= 0.20 (in the example would be "B").
This should be done in the minimum number of "class invocation" to avoid time wasting.
I thought of something like this: divide $array into 2 pieces and calculate the number I get; if it is >= 0.20, then split the last part of $array into 2 smaller pieces, and so on. But I don't know if this would work.
How would you solve this?
Thanks in advance.

What you're describing is called a binary search, but it won't really work for this use case, because you aren't searching for a known value. Rather, you're searching for the value that is the lowest number >= 0.2 in a set where the exact value 0.2 may not exist (if it were guaranteed to exist, then you could do a binary search for 0.2, and then your letter would simply be n - 1; n != 0).
If your range is always A-Z, a simple linear search would definitely be the easiest method. The time savings on a data set of 26 elements for using a more efficient method is negligible (talking milliseconds here), compared to implementation time.
Edit: I see you actually mentioned 52 elements, not 26. My point is still the same, though. The number of elements would need to be in the tens of thousands or more for there to be any significant savings, unless you are performing this operation in a tight loop.
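If the number of calls does matter, the halving idea from the question can be applied to the boundary between values that pass and values that fail, rather than to an exact value. A minimal sketch in Python, assuming the returned numbers are in descending order as in the example; `last_at_least` and the `calculate` callback are hypothetical stand-ins for the asker's class method.

```python
def last_at_least(keys, calculate, threshold):
    """Return the last key whose value is >= threshold, or None if there is none.

    Assumes calculate(key) is non-increasing along keys.
    """
    lo, hi = 0, len(keys)  # keys before lo pass, keys from hi onward fail
    while lo < hi:
        mid = (lo + hi) // 2
        if calculate(keys[mid]) >= threshold:
            lo = mid + 1
        else:
            hi = mid
    return keys[lo - 1] if lo > 0 else None
```

With the four sample values from the question and a threshold of 0.20 this returns "B". For 52 keys it needs at most 6 calls to `calculate`, compared with up to 52 for a linear scan.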
