Recursive generators in PHP

Introduction
PHP has had generators since version 5.5. I will not repeat the official manual page, but they are a great way to define iterators concisely. The best-known example is:
function xrange($from, $till, $step)
{
if ($from>$till || $step<=0)
{
throw new InvalidArgumentException('Invalid range initializers');
}
for ($i = $from; $i < $till; $i += $step)
{
yield $i;
}
}
//...
foreach (xrange(2, 13, 3) as $i)
{
echo($i.PHP_EOL); // 2,5,8,11
}
Calling such a function does not run it like an ordinary function; it returns an instance of a concrete class:
get_class(xrange(1, 10, 1)); // Generator
The problem
Done with RTM stuff, now moving on to my question. Imagine that we want to create generator of Fibonacci numbers. Normally, to get those, we can use simple function:
function fibonacci($n)
{
if(!is_int($n) || $n<0)
{
throw new InvalidArgumentException('Invalid sequence limit');
}
return $n < 2 ? $n : fibonacci($n-1) + fibonacci($n-2);
}
var_dump(fibonacci(6)); // 8
Let's transform this into something that holds the whole sequence and not only its last member:
function fibonacci($n)
{
if (!is_int($n) || $n<0)
{
throw new InvalidArgumentException('Invalid sequence limit');
}
if ($n<2)
{
return range(0, $n);
}
$n1 = fibonacci($n-1);
$n2 = fibonacci($n-2);
return array_merge($n1, [array_pop($n1)+array_pop($n2)]);
}
//...
foreach (fibonacci(6) as $i)
{
echo($i.PHP_EOL); // 0,1,1,2,3,5,8
}
We now have a function that returns an array with the full sequence.
The question
Finally, the question part: how can I transform my latest fibonacci function so it will yield my values, not holding them in an array? My $n can be big, so I want to use benefits of generators, like in xrange sample. Pseudo-code will be:
function fibonacci($n)
{
if (!is_int($n) || $n<0)
{
throw new InvalidArgumentException('Invalid sequence limit');
}
if ($n<2)
{
yield $n;
}
yield fibonacci($n-2) + fibonacci($n-1);
}
But this obviously does not work: each recursive call returns an object of class Generator rather than an int, so the results cannot be added together.
Bonus: the Fibonacci sequence is just a sample for a more general question: how do you use generators with recursion in the general case? Of course, I can use a standard Iterator for that, or rewrite my function to avoid recursion. But I want to achieve it with generators. Is this possible? Is it worth the effort to do it this way?

So the issue I ran into when attempting to create a recursive generator function, is that once you go past your first depth level each subsequent yield is yielding to its parent call rather than the iteration implementation (the loop).
As of PHP 7, a new feature allows you to yield from another generator function. This is the Generator Delegation feature: https://wiki.php.net/rfc/generator-delegation
This allows us to yield from subsequent recursive calls, which means we can now efficiently write recursive functions with the use of generators.
$items = ['what', 'this', 'is', ['is', 'a', ['nested', 'array', ['with', 'a', 'bunch', ['of', ['values']]]]]];
function processItems($items)
{
foreach ($items as $value)
{
if (is_array($value))
{
yield from processItems($value);
continue;
}
yield $value;
}
}
foreach (processItems($items) as $item)
{
echo $item . "\n";
}
This gives the following output..
what
this
is
is
a
nested
array
with
a
bunch
of
values
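Applied to the original Fibonacci question, the same delegation idea could look like the sketch below. This is only one possible shape, not code from the posts above: fibonacciSequence and its $a/$b accumulator parameters are names invented for illustration, and it assumes PHP 7 or later for yield from.

```php
<?php
// Hypothetical helper (not from the original posts): yields F(0)..F($n).
// $a and $b carry the current and the next Fibonacci number down the recursion.
function fibonacciSequence(int $n, int $a = 0, int $b = 1): Generator
{
    if ($n < 0) {
        return; // nothing left to produce
    }
    yield $a;
    // Delegate the rest of the sequence to the recursive call.
    yield from fibonacciSequence($n - 1, $b, $a + $b);
}

foreach (fibonacciSequence(6) as $value) {
    echo $value, PHP_EOL; // 0, 1, 1, 2, 3, 5, 8
}
```

Every level of recursion keeps one more generator alive, so for a very large $n an iterative generator is probably the safer choice. Also note that yield from keeps the keys of the inner generator, so pass false as the second argument when collecting the values with iterator_to_array().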

I've finally identified a real-world use for recursive generators.
I've been exploring QuadTree data structures recently. For those not familiar with QuadTrees, they're a tree-based data structure used for geospatial indexing, allowing a fast lookup of all points/locations within a defined bounding box.
Each node in the QuadTree represents a segment of the mapped region, and acts as a bucket in which locations are stored... but a bucket of restricted size. When a bucket overflows, the QuadTree node splits off 4 child nodes, representing the North-west, North-east, South-west and South-east areas of the parent node, and starts to fill those.
When searching for locations falling within a specified bounding box, the search routine starts at the top-level node, testing all the locations in that bucket; then recurses into the child nodes, testing whether they intersect with the bounding box, or are encompassed by the bounding box, testing each QuadTree node within that set, then recursing again down through the tree. Each node may return none, one or many locations.
I implemented a basic QuadTree in PHP, designed to return an array of results; then realised that it might be a valid use case for a recursive generator, so I implemented a GeneratorQuadTree that can be accessed in a foreach() loop yielding a single result each iteration.
It seems a much more valid use-case for recursive generators because it is a truly recursive search function, and because each generator may return none, one or many results rather than a single result. Effectively, each nested generator is handling a part of the search, feeding its results back up through the tree through its parent.
The code is rather too much to post here; but you can take a look at the implementation on github.
It's fractionally slower than the non-generator version (but not significantly): the main benefit is reduction in memory because it isn't simply returning an array of variable size (which can be a significant benefit depending on the number of results returned). The biggest drawback is the fact that the results can't easily be sorted (my non-generator version does a usort() on the results array after it's returned).
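To give a rough idea of the shape of such a search without the full implementation, here is a minimal sketch. It is not the GeneratorQuadTree mentioned above: QuadNode, search() and the array layouts for bounds and points are assumptions made up for illustration, the tree is built by hand rather than by splitting buckets, and it relies on PHP 7's yield from.

```php
<?php
// Illustrative only: QuadNode and its array layouts are assumptions,
// not the GeneratorQuadTree implementation mentioned above.
final class QuadNode
{
    private $bounds;   // [minX, minY, maxX, maxY]
    private $points;   // list of [x, y] stored in this node's bucket
    private $children; // zero or four QuadNode instances

    public function __construct(array $bounds, array $points = [], array $children = [])
    {
        $this->bounds = $bounds;
        $this->points = $points;
        $this->children = $children;
    }

    // Lazily yields every stored point that lies inside $box.
    public function search(array $box): Generator
    {
        if (!self::intersects($this->bounds, $box)) {
            return; // this whole subtree is outside the box
        }
        foreach ($this->points as $p) {
            if ($p[0] >= $box[0] && $p[0] <= $box[2] && $p[1] >= $box[1] && $p[1] <= $box[3]) {
                yield $p;
            }
        }
        foreach ($this->children as $child) {
            yield from $child->search($box); // each child may yield none, one or many
        }
    }

    private static function intersects(array $a, array $b): bool
    {
        return $a[0] <= $b[2] && $a[2] >= $b[0] && $a[1] <= $b[3] && $a[3] >= $b[1];
    }
}

$root = new QuadNode([0, 0, 10, 10], [[5, 5]], [
    new QuadNode([0, 5, 5, 10], [[1, 9]]),        // north-west
    new QuadNode([5, 5, 10, 10], [[9, 9]]),       // north-east
    new QuadNode([0, 0, 5, 5], [[1, 1], [4, 4]]), // south-west
    new QuadNode([5, 0, 10, 5], [[9, 1]]),        // south-east
]);

foreach ($root->search([0, 0, 4, 4]) as $point) {
    echo implode(',', $point), PHP_EOL; // 1,1 then 4,4
}
```

The caller can stop the foreach early, in which case the remaining subtrees are never visited. If the results have to be sorted, they still need to be collected into an array first.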

function fibonacci($n)
{
if($n < 2) {
yield $n;
return; // stop here, so iterating past the first value never recurses into negative $n
}
$x = fibonacci($n-1);
$y = fibonacci($n-2);
yield $x->current() + $y->current();
}
for($i = 0; $i <= 10; $i++) {
$x = fibonacci($i);
$value = $x->current();
echo $i , ' -> ' , $value, PHP_EOL;
}

If you first want to make a generator you might as well use the iterative version of fibonacci:
function fibonacci ($from, $to)
{
$a = 0;
$b = 1;
while( $to > 0 ) {
if( $from > 0 )
$from--;
else
yield $a;
$tmp = $a + $b;
$a=$b;
$b=$tmp;
$to--;
}
}
foreach( fibonacci(10,20) as $fib ) {
print "$fib "; // prints "55 89 144 233 377 610 987 1597 2584 4181 "
}

Here's a recursive generator for combinations (order unimportant, without replacement):
<?php
function comb($set = [], $size = 0) {
if ($size == 0) {
// end of recursion
yield [];
}
// since nothing to yield for an empty set...
elseif ($set) {
$prefix = [array_shift($set)];
foreach (comb($set, $size-1) as $suffix) {
yield array_merge($prefix, $suffix);
}
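// on PHP 7+ this loop can be written as `yield from comb($set, $size);`,
// but `yield from` keeps the inner generator's keys instead of renumbering them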
foreach (comb($set, $size) as $next) {
yield $next;
}
}
}
// let's verify correctness
assert(iterator_to_array(comb([0, 1, 2, 3, 4], 3)) == [
[0, 1, 2], [0, 1, 3], [0, 1, 4], [0, 2, 3], [0, 2, 4],
[0, 3, 4], [1, 2, 3], [1, 2, 4], [1, 3, 4], [2, 3, 4]
]);
foreach (comb([0, 1, 2, 3], 3) as $combination) {
echo implode(", ", $combination), "\n";
}
Outputs:
0, 1, 2
0, 1, 3
0, 2, 3
1, 2, 3
Same thing non-yielding.
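For comparison, here is a sketch of how the same generator might look with yield from on PHP 7+. The name combYieldFrom is made up to keep it apart from comb() above. Because yield from preserves the keys of the inner generator, several combinations end up sharing the same key, so the sketch collects them with iterator_to_array(..., false).

```php
<?php
// Hypothetical yield-from variant of comb(); the function name is illustrative.
function combYieldFrom(array $set, int $size): Generator
{
    if ($size === 0) {
        yield []; // end of recursion: exactly one empty combination
        return;
    }
    if (!$set) {
        return; // nothing to pick from
    }
    $head = array_shift($set);
    // Combinations that contain $head...
    foreach (combYieldFrom($set, $size - 1) as $suffix) {
        yield array_merge([$head], $suffix);
    }
    // ...followed by combinations that do not.
    yield from combYieldFrom($set, $size);
}

// false = do not preserve keys, otherwise later results overwrite earlier ones
$all = iterator_to_array(combYieldFrom([0, 1, 2, 3], 3), false);
foreach ($all as $combination) {
    echo implode(", ", $combination), "\n";
}
```

A plain foreach over the generator is unaffected by the duplicate keys; they only matter when the keys are used, for example by iterator_to_array() with its default arguments.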

Recently ran into a problem that needed 'recursive' generators or generator delegation. I ended up writing a little function that converts delegated generators calls into a single generator.
I turned it into a package so you could just require it with composer, or checkout the source here: hedronium/generator-nest.
Code:
function nested(Iterator $generator)
{
$cur = 0;
$gens = [$generator];
while ($cur > -1) {
if ($gens[$cur]->valid()) {
$key = $gens[$cur]->key();
$val = $gens[$cur]->current();
$gens[$cur]->next();
if ($val instanceof Generator) {
$gens[] = $val;
$cur++;
} else {
yield $key => $val;
}
} else {
array_pop($gens);
$cur--;
}
}
}
You use it like:
foreach (nested(recursive_generator()) as $combination) {
// your code
}
Checkout that link above. It has examples.
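As a rough, self-contained illustration of the usage above: the sketch below repeats the nested() function from this answer and feeds it a generator that yields sub-generators instead of delegating to them. lazyFlatten is a name invented for this example; it stands in for recursive_generator().

```php
<?php
// nested() as posted above, repeated so the example runs on its own.
function nested(Iterator $generator)
{
    $cur = 0;
    $gens = [$generator];
    while ($cur > -1) {
        if ($gens[$cur]->valid()) {
            $key = $gens[$cur]->key();
            $val = $gens[$cur]->current();
            $gens[$cur]->next();
            if ($val instanceof Generator) {
                $gens[] = $val; // descend into the yielded generator
                $cur++;
            } else {
                yield $key => $val;
            }
        } else {
            array_pop($gens); // this level is exhausted, go back up
            $cur--;
        }
    }
}

// Hypothetical recursive generator: yields a generator for every nested array.
function lazyFlatten(array $items)
{
    foreach ($items as $item) {
        if (is_array($item)) {
            yield lazyFlatten($item); // no `yield from` needed
        } else {
            yield $item;
        }
    }
}

foreach (nested(lazyFlatten(['a', ['b', ['c']], 'd'])) as $value) {
    echo $value, PHP_EOL; // a, b, c, d
}
```

The keys of the inner generators are passed through unchanged, so they repeat across levels; use iterator_to_array(..., false) if the values are collected into an array.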

Short answer: recursive generators are simple. Example for walking through tree:
class Node {
public function getChildren() {
return [ /* array of children */ ];
}
public function walk() {
yield $this;
foreach ($this->getChildren() as $child) {
foreach ($child->walk() as $return) {
yield $return;
}
}
}
}
That's all.
Long answer about fibonacci:
A generator is something that is used with foreach (generator() as $item) { ... }. But the OP wants the fib() function to return an int and, at the same time, to return a generator to be used in foreach. That is very confusing.
It is possible to implement a recursive generator solution for Fibonacci. We just need to put a loop somewhere inside the fib() function that actually yields each member of the sequence. As a generator is supposed to be used with foreach, it looks really weird, and I do not think it is efficient, but here it is:
function fibGenerator($n) {
if ($n < 2) {
yield $n;
return;
}
// calculating current number
$x1 = fibGenerator($n - 1);
$x2 = fibGenerator($n - 2);
$result = $x1->current() + $x2->current();
// yielding the sequence
yield $result;
yield $x1->current();
yield $x2->current();
for ($n = $n - 3; $n >= 0; $n--) {
$res = fibGenerator($n);
yield $res->current();
}
}
foreach (fibGenerator(15) as $x) {
echo $x . " ";
}

I am offering two solutions for Fibonacci numbers, with and without recursion:
function fib($n)
{
return ($n < 3) ? ($n == 0) ? 0 : 1 : fib($n - 1) + fib($n - 2);
}
function fib2()
{
$a = 0;
$b = 1;
for ($i = 1; $i <= 10; $i++)
{
echo $a . "\n";
$a = $a + $b;
$b = $a - $b;
}
}
for ($i = 0; $i <= 10; $i++)
{
echo fib($i) . "\n";
}
fib2(); // fib2() echoes its own output and returns nothing, so no echo is needed

Related

Generate all combinations of an array, not always using all arrays

I'm working on a piece of code that should generate all combinations of a set of filters. All the values of a filter are in an array, which creates a multidimensional array of filters. The end result should be all possible URL combinations with the filters, e.g.:
/lorem
/lorem/foo
/lorem/foo/
/lorem/foo/primary
But also
/primary
/primary/ipsum
/primary/bar
etcetera.
I got this piece of code, which I believe is already from StackOverflow;
<?php
$parts = [];
$parts[] = ['lorem', 'ipsum'];
$parts[] = ['foo', 'bar'];
$parts[] = ['primary', 'secondary'];
$parts[] = ['test-value'];
echo "<pre>";
var_dump( create_all_combinations($parts) );
echo "</pre>";
function create_all_combinations($arrays)
{
$result = array();
$arrays = array_values($arrays);
$sizeIn = sizeof($arrays);
$size = $sizeIn > 0 ? 1 : 0;
foreach ($arrays as $array)
$size = $size * sizeof($array);
for ($i = 0; $i < $size; $i ++)
{
$result[$i] = array();
for ($j = 0; $j < $sizeIn; $j ++)
array_push($result[$i], current($arrays[$j]));
for ($j = ($sizeIn -1); $j >= 0; $j --)
{
if (next($arrays[$j]))
break;
elseif (isset ($arrays[$j]))
reset($arrays[$j]);
}
}
return $result;
}
Currently every result uses all 4 filters (since we feed 4 different filters). I am missing the unique combinations of 1, 2 or 3 filters, and I'm not sure where to start to create this. Any help appreciated.
be wary. beware.
In this post I will provide essential functions where I have managed the complexity to a meticulous degree. A beginner should be able to step through them and verify the behaviour along the way. It is my goal to help prevent copy/paste development -
Cargo cult programming is symptomatic of a programmer not understanding either a bug they were attempting to solve or the apparent solution. The term cargo cult programmer may apply when anyone inexperienced with the problem at hand copies some program code from one place to another with little understanding of how it works or whether it is required.
The functions below are written with functional style principles. That means avoiding things like mutation, variable reassignment and other side effects. Instead of destroying old values, we will create new ones. This makes writing functions and debugging a lot easier. If you have any questions, don't hesitate to ask :D
array_combinations
I would break your problem down into several pieces. First, a generic combinations function which generates combinations of any multi-dimensional array -
function array_combinations (array $a) {
if (count($a) == 0)
yield [];
else
foreach (array_combinations(array_slice($a, 1)) as $c)
foreach ($a[0] as $v)
yield array_merge([$v], array_filter($c));
}
$t = [[3,6], ["a","b"], [9]];
foreach (array_combinations($t) as $c)
echo json_encode($c), PHP_EOL;
[3,"a",9]
[6,"a",9]
[3,"b",9]
[6,"b",9]
array_transpose
But according to your question, you want the transpose variant of this output. So we simply write a function for that too -
function array_transpose (array $a) {
return array_map(null, ...$a);
}
$t = [[3,6], ["a","b"], [9]];
foreach (array_combinations(array_transpose($t)) as $c)
echo json_encode($c), PHP_EOL;
[3,6]
["a",6]
[9,6]
[3,"b"]
["a","b"]
[9,"b"]
[3]
["a"]
[9]
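To make the intermediate step visible, the sketch below prints what array_transpose returns on its own. array_map with a null callback zips the input arrays and pads the shorter ones with null, which is why array_combinations calls array_filter. One caveat worth knowing: array_filter without a callback drops every falsy value, including 0 and empty strings. without_nulls is a hypothetical stricter helper, not part of the answer.

```php
<?php
function array_transpose(array $a): array
{
    return array_map(null, ...$a);
}

$rows = array_transpose([[3, 6], ["a", "b"], [9]]);
echo json_encode($rows), PHP_EOL; // [[3,"a",9],[6,"b",null]]

// array_filter() without a callback drops every falsy value, not only null.
echo json_encode(array_values(array_filter([0, "a", null]))), PHP_EOL; // ["a"]

// Hypothetical stricter variant: remove only the null padding.
function without_nulls(array $a): array
{
    return array_values(array_filter($a, function ($v) {
        return $v !== null;
    }));
}

echo json_encode(without_nulls([0, "a", null])), PHP_EOL; // [0,"a"]
```

If the filter values can legitimately be 0 or an empty string, the stricter check is probably the safer one to use inside array_combinations.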
combine
Complex programs are made up of several smaller programs -
$parts = [
['lorem', 'ipsum'],
['foo', 'bar'],
['primary', 'secondary'],
['test-value'],
];
foreach (array_combinations(array_transpose($parts)) as $c)
echo "/".join("/", $c), PHP_EOL;
/lorem/ipsum
/foo/ipsum
/primary/ipsum
/test-value/ipsum
/lorem/bar
/foo/bar
/primary/bar
/test-value/bar
/lorem/secondary
/foo/secondary
/primary/secondary
/test-value/secondary
/lorem
/foo
/primary
/test-value
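The question can also be read differently: keep the filters in their original order and treat each one as optional, so that /lorem, /lorem/foo and /foo are all produced. Under that assumption, a recursive generator could look like the sketch below. optional_combinations is a hypothetical name, and this is an alternative reading, not what the functions above compute.

```php
<?php
// Hypothetical variant: every filter is either skipped or contributes one value.
function optional_combinations(array $parts): Generator
{
    if (count($parts) === 0) {
        yield []; // one way to combine nothing
        return;
    }
    foreach (optional_combinations(array_slice($parts, 1)) as $rest) {
        yield $rest; // skip the first filter
        foreach ($parts[0] as $v) {
            yield array_merge([$v], $rest); // or use one of its values
        }
    }
}

$parts = [
    ['lorem', 'ipsum'],
    ['foo'],
];

foreach (optional_combinations($parts) as $c) {
    if ($c !== []) { // ignore the combination that skips every filter
        echo "/" . join("/", $c), PHP_EOL;
    }
}
// /lorem
// /ipsum
// /foo
// /lorem/foo
// /ipsum/foo
```

For the four filters in the question this yields 3 * 3 * 3 * 2 = 54 combinations, one of which is the empty one.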
without generators
If you do not want to use generators, adhering to functional principles is more challenging but still worth it. The complexity ramps up here considerably and you should use the generator, if possible -
function array_combinations (array $a) {
if (count($a) == 0)
return [[]];
else
return array_flatmap(
array_combinations(array_slice($a, 1)),
function ($c) use ($a) {
return array_map(function($v) use ($c) {
return array_merge([$v], array_filter($c));
}, $a[0]);
}
);
}
Where array_flatmap is defined as -
function array_flatmap (array $a, callable $f) {
return array_reduce($a, function ($r, $v) use ($f) {
return array_merge($r, $f($v));
}, []);
}
with arrow functions
If you are on PHP >= 7.4, you can use the new arrow functions, which cleans up the generatorless approach significantly -
function array_combinations (array $a) {
if (count($a) == 0)
return [[]];
else
return array_flatmap(
array_combinations(array_slice($a, 1)),
fn($c) => array_map(
fn($v) => array_merge([$v], array_filter($c)),
$a[0]
)
);
}
Where array_flatmap is defined as -
function array_flatmap (array $a, callable $f) {
return array_reduce(
$a,
fn($r, $v) => array_merge($r, $f($v)),
[]
);
}
Usage and output of each variant is the same.
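In case array_flatmap is unfamiliar: it maps every element to an array and concatenates the results into one flat array. A tiny standalone sketch, using the PHP 7 compatible definition from above and a made-up callback:

```php
<?php
function array_flatmap(array $a, callable $f)
{
    return array_reduce($a, function ($r, $v) use ($f) {
        return array_merge($r, $f($v));
    }, []);
}

// Each input value produces two output values; the pairs are joined end to end.
$result = array_flatmap([1, 2, 3], function ($v) {
    return [$v, $v * 10];
});

echo json_encode($result), PHP_EOL; // [1,10,2,20,3,30]
```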

Sorting an array of strings by arbitrary numbers of substrings

I'm attempting to modify the OrderedImportsFixer class in php-cs-fixer so I can clean up my files the way I want. What I want is to order my imports in a fashion similar to what you'd see in a filesystem listing, with "directories" listed before "files".
So, given this array:
$indexes = [
26 => ["namespace" => "X\\Y\\Zed"],
9 => ["namespace" => "A\\B\\See"],
3 => ["namespace" => "A\\B\\Bee"],
38 => ["namespace" => "A\\B\\C\\Dee"],
51 => ["namespace" => "X\\Wye"],
16 => ["namespace" => "A\\Sea"],
12 => ["namespace" => "A\\Bees"],
31 => ["namespace" => "M"],
];
I'd like this output:
$sorted = [
38 => ["namespace" => "A\\B\\C\\Dee"],
3 => ["namespace" => "A\\B\\Bee"],
9 => ["namespace" => "A\\B\\See"],
12 => ["namespace" => "A\\Bees"],
16 => ["namespace" => "A\\Sea"],
26 => ["namespace" => "X\\Y\\Zed"],
51 => ["namespace" => "X\\Wye"],
31 => ["namespace" => "M"],
];
As in a typical filesystem listing:
I've been going at uasort for a while (key association must be maintained) and have come close. Admittedly, this is due more to desperate flailing than any sort of rigorous methodology. Not really having a sense of how uasort works is kind of limiting me here.
// get the maximum number of namespace components in the list
$ns_counts = array_map(function($val){
return count(explode("\\", $val["namespace"]));
}, $indexes);
$limit = max($ns_counts);
for ($depth = 0; $depth <= $limit; $depth++) {
uasort($indexes, function($first, $second) use ($depth, $limit) {
$fexp = explode("\\", $first["namespace"]);
$sexp = explode("\\", $second["namespace"]);
if ($depth === $limit) {
// why does this help?
array_pop($fexp);
array_pop($sexp);
}
$fexp = array_slice($fexp, 0, $depth + 1, true);
$sexp = array_slice($sexp, 0, $depth + 1, true);
$fimp = implode(" ", $fexp);
$simp = implode(" ", $sexp);
//echo "$depth: $fimp <-> $simp\n";
return strnatcmp($fimp, $simp);
});
}
echo json_encode($indexes, JSON_PRETTY_PRINT);
This gives me properly sorted output, but with deeper namespaces on the bottom instead of the top:
{
"31": {
"namespace": "M"
},
"12": {
"namespace": "A\\Bees"
},
"16": {
"namespace": "A\\Sea"
},
"3": {
"namespace": "A\\B\\Bee"
},
"9": {
"namespace": "A\\B\\See"
},
"38": {
"namespace": "A\\B\\C\\Dee"
},
"51": {
"namespace": "X\\Wye"
},
"26": {
"namespace": "X\\Y\\Zed"
}
}
I'm thinking I may have to build a separate array for each level of namespace and sort it separately, but have drawn a blank on how I might do that. Any suggestions for getting the last step of this working, or something completely different that doesn't involve so many loops?
We divide this into 4 steps.
Step 1: Create hierarchical structure from the dataset.
function createHierarchicalStructure($indexes){
$data = [];
foreach($indexes as $d){
$temp = &$data;
foreach(explode("\\",$d['namespace']) as $namespace){
if(!isset($temp[$namespace])){
$temp[$namespace] = [];
}
$temp = &$temp[$namespace];
}
}
return $data;
}
Split the namespaces by \\ and maintain a $data variable. Use & address reference to keep editing the same copy of the array.
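To see what Step 1 produces, here is the same function run on a small made-up input of three namespaces. Leaves ("files") end up as empty arrays, and everything with children is a "folder".

```php
<?php
// createHierarchicalStructure() as posted above, repeated so the example runs on its own.
function createHierarchicalStructure($indexes)
{
    $data = [];
    foreach ($indexes as $d) {
        $temp = &$data; // start again at the root for every namespace
        foreach (explode("\\", $d['namespace']) as $namespace) {
            if (!isset($temp[$namespace])) {
                $temp[$namespace] = [];
            }
            $temp = &$temp[$namespace]; // step one level deeper
        }
    }
    return $data;
}

$tree = createHierarchicalStructure([
    ["namespace" => "A\\B\\Bee"],
    ["namespace" => "A\\Sea"],
    ["namespace" => "M"],
]);

echo json_encode($tree), PHP_EOL; // {"A":{"B":{"Bee":[]},"Sea":[]},"M":[]}
```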
Step 2: Sort the hierarchy in first folders then files fashion.
function fileSystemSorting(&$indexes){
foreach($indexes as $key => &$value){
fileSystemSorting($value);
}
uksort($indexes,function($key1,$key2) use ($indexes){
if(count($indexes[$key1]) == 0 && count($indexes[$key2]) > 0) return 1;
if(count($indexes[$key2]) == 0 && count($indexes[$key1]) > 0) return -1;
return strnatcmp($key1,$key2);
});
}
Sort the subordinate folders first, then use uksort for the current level. The other order would also work. If both entries being compared are folders (or both are files), compare them as strings; if one is a folder and the other is a file, the folder comes first.
Step 3: Flatten the hierarchical structure now that they are in order.
function flattenFileSystemResults($hierarchical_data){
$result = [];
foreach($hierarchical_data as $key => $value){
if(count($value) > 0){
$sub_result = flattenFileSystemResults($value);
foreach($sub_result as $r){
$result[] = $key . "\\" . $r;
}
}else{
$result[] = $key;
}
}
return $result;
}
Step 4: Restore the initial data keys back and return the result.
function associateKeys($data,$indexes){
$map = array_combine(array_column($indexes,'namespace'),array_keys($indexes));
$result = [];
foreach($data as $val){
$result[ $map[$val] ] = ['namespace' => $val];
}
return $result;
}
Driver code:
function foldersBeforeFiles($indexes){
$hierarchical_data = createHierarchicalStructure($indexes);
fileSystemSorting($hierarchical_data);
return associateKeys(flattenFileSystemResults($hierarchical_data),$indexes);
}
print_r(foldersBeforeFiles($indexes));
Demo: https://3v4l.org/cvoB2
I believe the following should work:
uasort($indexes, static function (array $entry1, array $entry2): int {
$ns1Parts = explode('\\', $entry1['namespace']);
$ns2Parts = explode('\\', $entry2['namespace']);
$ns1Length = count($ns1Parts);
$ns2Length = count($ns2Parts);
for ($i = 0; $i < $ns1Length && isset($ns2Parts[$i]); $i++) {
$isLastPartForNs1 = $i === $ns1Length - 1;
$isLastPartForNs2 = $i === $ns2Length - 1;
if ($isLastPartForNs1 !== $isLastPartForNs2) {
return $isLastPartForNs1 <=> $isLastPartForNs2;
}
$nsComparison = $ns1Parts[$i] <=> $ns2Parts[$i];
if ($nsComparison !== 0) {
return $nsComparison;
}
}
return 0;
});
What it does is:
split namespaces into parts,
compare each part starting from the first one, and:
if we're at the last part for one and not the other, prioritize the one with the most parts,
otherwise, if the respective parts are different, prioritize the one that is before the other one alphabetically.
Demo
Here's another version that breaks the steps down further that, although it might not be the most optimal, definitely helps my brain think about it. See the comments for more details on what is going on:
uasort(
$indexes,
static function (array $a, array $b) {
$aPath = $a['namespace'];
$bPath = $b['namespace'];
// Just in case there are duplicates
if ($aPath === $bPath) {
return 0;
}
// Break into parts
$aParts = explode('\\', $aPath);
$bParts = explode('\\', $bPath);
// If we only have a single thing then it is a root-level, just compare the item
if (1 === count($aParts) && 1 === count($bParts)) {
return $aPath <=> $bPath;
}
// Get the class and namespace (file and folder) parts
$aClass = array_pop($aParts);
$bClass = array_pop($bParts);
$aNamespace = implode('\\', $aParts);
$bNamespace = implode('\\', $bParts);
// If the namespaces are the same, sort by class name
if ($aNamespace === $bNamespace) {
return $aClass <=> $bClass;
}
// If the first namespace _starts_ with the second namespace, sort it first
if (0 === mb_strpos($aNamespace, $bNamespace)) {
return -1;
}
// Same as above but the other way
if (0 === mb_strpos($bNamespace, $aNamespace)) {
return 1;
}
// Just only by namespace
return $aNamespace <=> $bNamespace;
}
);
Online demo
I find no fault with Jeto's algorithmic design, but I decided to implement it more concisely. My snippet avoids iterated function calls and arithmetic in the for() loop, uses a single spaceship operator, and a single return. My snippet is greater than 50% shorter and I generally find it easier to read, but then everybody thinks their own baby is cute, right?
Code: (Demo)
uasort($indexes, function($a, $b) {
$aParts = explode('\\', $a['namespace']);
$bParts = explode('\\', $b['namespace']);
$aLast = count($aParts) - 1;
$bLast = count($bParts) - 1;
for ($cmp = 0, $i = 0; $i <= $aLast && !$cmp; ++$i) {
$cmp = [$i === $aLast, $aParts[$i]] <=> [$i === $bLast, $bParts[$i]];
}
return $cmp;
});
Like Jeto's answer, it iterates each array simultaneously and
checks if either element is the last element of the array, if so it is bumped down the list (because we want longer paths to win tiebreakers);
if neither element is the last in its array, then compare the elements' current string alphabetically.
The process repeats until a non-zero evaluation is generated.
Since duplicate entries are not expected to occur, the return value should always be -1 or 1 (never 0)
Note, I am halting the for() loop with two conditions.
If $i is greater than the number of elements in $aParts (only) -- because if $bParts has fewer elements than $aParts, then $cmp will generate a non-zero value before a Notice is triggered.
If $cmp is a non-zero value.
Finally, to explain the array syntax on either side of the spaceship operator...The spaceship operator will compare the arrays from left to right, so it will behave like:
leftside[0] <=> rightside[0] then leftside[1] <=> rightside[1]
Making multiple comparisons in this way does not impact performance because there are no function calls on either side of the <=>. If there were function calls involved, it would be more performant to make individual comparisons in a fallback manner like:
fun($a1) <=> fun($b1) ?: fun($a2) <=> fun($b2)
This way subsequent function calls are only actually made if a tiebreak is necessary.
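Both techniques can be tried in isolation. The sketch below first compares two-element arrays with the spaceship operator, then uses the ?: fallback so that the second comparison only runs on a tie. byLengthThenAlpha and the word list are made up for illustration.

```php
<?php
// Arrays of equal size are compared element by element, left to right.
var_dump([false, 'B'] <=> [true, 'A']);  // int(-1): decided by the first elements
var_dump([true, 'A'] <=> [true, 'B']);   // int(-1): first elements tie, second decides
var_dump([false, 'A'] <=> [false, 'A']); // int(0)

// Hypothetical comparator: shorter strings first, alphabetical order as tiebreaker.
// strcmp() is only called when the lengths are equal.
function byLengthThenAlpha(string $a, string $b): int
{
    return strlen($a) <=> strlen($b) ?: strcmp($a, $b);
}

$words = ['pear', 'fig', 'apple', 'kiwi'];
usort($words, 'byLengthThenAlpha');
echo implode(' ', $words), PHP_EOL; // fig kiwi pear apple
```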

PHP Resource for getting combinations of x distinct items in y distinct bins?

Does anyone know a resource (manual or book) or have the PHP solution for getting all the combinations of x distinct items in y distinct bins?
For example, if I had 2 items [1, 2] with 2 bins, the 4 possibilities would be:
[ 1,2 ] [ ]
[ 1 ] [ 2 ]
[ 2 ] [ 1 ]
[ ] [ 1,2 ]
I need the combinations, not permutations, as the order of items is irrelevant. And there is no min/max for items in a bin. And if you're going to downvote my question because it's unclear, please specify what you're confused about. I've spent the entire day trying to find a solution, even in another programming language. Apparently, it is not very easy to come up with.
UPDATE: Hi Karol, thanks for the comment and link. I'm still working away on this, and did find that page in my searches and converted that to PHP here:
function combinationsOf($k, $xs){
if ($k === 0)
return array(array());
if (count($xs) === 0)
return array();
$x = $xs[0];
$xs1 = array_slice($xs,1,count($xs)-1);
$res1 = combinationsOf($k-1,$xs1);
for ($i = 0; $i < count($res1); $i++) {
array_splice($res1[$i], 0, 0, $x);
}
$res2 = combinationsOf($k,$xs1);
return array_merge($res1, $res2);
}
I'm going about it in a different way with this than what I originally hoped for, so still hoping to hear from someone ... thanks!
UPDATE: So I'm making progress, making use of the above recursive function along with another link I found: Permutation Of Multidimensional Array in PHP
Although, correct me if I'm wrong (it's been a loooong day), but it's not permutations, but combinations, that's being generated here.
You could use a backtracking method that utilizes recursion. Basically, it's a "smart brute-force" approach which follows a path and tries to build combinations that work.
The solution may look a little large, but the main brains of the algorithm are in the combo function, which creates the combinations. The rest of the functions are there only to support combo and to print nice-looking output.
<?php
function toPlainArray($arr2) {
$output = "[";
foreach($arr2 as $arr) {
$output .= "[";
foreach($arr as $val) {
$output .= $val . ", ";
}
if($arr != []) {
$output = substr($output, 0, -2) . "], ";
} else {
$output .= "], ";
}
}
return substr($output, 0, -2) . "]";
}
function difference($arr2d, $arr1d) {
foreach((array)$arr2d as $arr) {
foreach($arr as $item) {
if(in_array($item, $arr1d)) {
$index = array_search($item, $arr1d);
unset($arr1d[$index]);
}
}
}
return $arr1d;
}
function getNextPossibleSol($pSol, $item) { // returns an array (1d)
$allItems = range(1, $item);
return difference($pSol, $allItems);
}
function createEmpty2dArray($arr, $amount) {
for($i = 0; $i < $amount; $i++) {
$arr[] = [];
}
return $arr;
}
function isSmallerThenPartialItems($item, $pSol) {
foreach($pSol as $arr) {
foreach($arr as $val) {
if($val > $item) return false;
}
}
return true;
}
function combo($items, $buckets, $partialSol=[]) {
if($partialSol == []) { // Starting empty array, populate empty array with other arrays (ie create empty buckets to fill)
$partialSol = createEmpty2dArray($partialSol, $buckets);
}
$nextPossibleSol = getNextPossibleSol($partialSol, $items);
if($nextPossibleSol == []) { // base case: solution found
echo toPlainArray($partialSol); // 2d array
echo "<br /><br />";
} else {
foreach($nextPossibleSol as $item) {
for($i = 0; $i < count($partialSol); $i++) {
if(isSmallerThenPartialItems($item, $partialSol)) { // as order doesn't matter, we can use this if-statement to remove duplicates
$partialSol[$i][] = $item;
combo($items, $buckets, $partialSol);
array_pop($partialSol[$i]);
}
}
}
}
}
combo(2, 2); // call the combinations functions with 2 items and 2 buckets
?>
Output:
[[1, 2], []]
[[1], [2]]
[[2], [1]]
[[], [1, 2]]
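Since every item independently goes into exactly one bin, there are y^x distributions in total, and they can also be produced lazily with a recursive generator. The sketch below is an alternative to the backtracking code above, not a rewrite of it; distribute is a hypothetical name, and the order of the results differs from the output shown.

```php
<?php
// Hypothetical generator: yields every way to place $items into $bins distinct bins.
function distribute(array $items, int $bins): Generator
{
    if ($items === []) {
        yield array_fill(0, $bins, []); // all bins empty
        return;
    }
    $first = $items[0];
    $rest = array_slice($items, 1);
    foreach (distribute($rest, $bins) as $partial) {
        for ($b = 0; $b < $bins; $b++) {
            $next = $partial;                // copy, so $partial stays untouched
            array_unshift($next[$b], $first); // put $first into bin $b
            yield $next;
        }
    }
}

foreach (distribute([1, 2], 2) as $solution) {
    echo json_encode($solution), PHP_EOL;
}
// [[1,2],[]]
// [[2],[1]]
// [[1],[2]]
// [[],[1,2]]
```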

Calculate the intersection of arrays with a threshold in PHP

Let's say I have following arrays:
$a = [1,2,3,4,5];
$b = [1,3,4,5,6];
$c = [1,7,8,9,10];
$d = [1,2,3,4];
The intersection of those would be $result = [1], which is easy enough. But what if I wanted the intersection of those with a minimum threshold of let's say 3? The threshold means I can skip one or more arrays from the intersection, as long as my resulting intersection has at least 3 elements, which in this case might result in:
$result = [1,3,4];
1, 3 and 4 are present in $a, $b and $d, but not in $c which is skipped because of the threshold. Is there an existing PHP class, algorithm or function with which I might accomplish this?
To do that we have to use combinations of an array. I have used combinations algorithm from this great article. Adjusting this algorithm we can write the following class:
class Intersections
{
protected $arrays;
private $arraysSize;
public function __construct($arrays)
{
$this->arrays = $arrays;
$this->arraysSize = count($arrays);
}
public function getByThreshold($threshold)
{
$intersections = $this->getAll();
foreach ($intersections as $intersection) {
if (count($intersection) >= $threshold) {
return $intersection;
}
}
return null;
}
protected $intersections;
public function getAll()
{
if (is_null($this->intersections)) {
$this->generateIntersections();
}
return $this->intersections;
}
private function generateIntersections()
{
$this->generateCombinationsMasks();
$this->generateCombinations();
$combinationSize = $this->arraysSize;
$intersectionSize = 0;
foreach ($this->combinations as $combination) {
$intersection = call_user_func_array('array_intersect', $combination);
if ($combinationSize > count($combination)) {
$combinationSize = count($combination);
$intersectionSize = 0;
}
if (count($intersection) > $intersectionSize) {
$this->intersections[$combinationSize] = $intersection;
$intersectionSize = count($intersection);
}
}
}
private $combinationsMasks;
private function generateCombinationsMasks()
{
$combinationsMasks = [];
$totalNumberOfCombinations = pow(2, $this->arraysSize);
for ($i = $totalNumberOfCombinations - 1; $i > 0; $i--) {
$combinationsMasks[] = str_pad(
decbin($i), $this->arraysSize, '0', STR_PAD_LEFT
);
}
usort($combinationsMasks, function ($a, $b) {
return strcmp(strtr($b, ['']), strtr($a, ['']));
});
$this->combinationsMasks = array_slice(
$combinationsMasks, 0, -$this->arraysSize
);
}
private $combinations;
private function generateCombinations()
{
$this->combinations = array_map(function ($combinationMask) {
return $this->generateCombination($combinationMask);
}, $this->combinationsMasks);
}
private function generateCombination($combinationMask)
{
$combination = [];
foreach (str_split($combinationMask) as $key => $indicator) {
if ($indicator) {
$combination[] = $this->arrays[$key];
}
}
return $combination;
}
}
I have tried to give self-explanatory names to methods. Some chunks of code can be optimized more (for example, I call count function multiple times on same arrays; this was done in order to reduce variables fiddling) for production use.
So basically the logic is pretty simple. We generate all combinations of arrays and sort them decreasingly by the number of used arrays. Then we find the longest intersection for each length of combinations. Actually, this is the hardest part. To get one particular intersection we return first one that matches threshold.
$intersections = new Intersections([$a, $b, $c, $d]);
var_dump($intersections->getAll());
var_dump($intersections->getByThreshold(3));
Here is working demo.
There are other ways to find all combinations, for example, one from "PHP Cookbook". You can choose whatever one you like most.
There is no built-in feature for that. You need to write something short like:
$values = [];
foreach ([$a, $b, $c, $d] as $arr)
foreach ($arr as $value)
$values[$value] = ($values[$value] ?? 0) + 1;
// For threshold of 3
$values = array_keys(array_filter($values, function($a) { return $a >= 3; }));
Note: This requires PHP7 for ?? operator. Otherwise use something like:
$values[$value] = empty($values[$value]) ? 1 : $values[$value] + 1;
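The counting approach could be wrapped in a function as sketched below. values_in_at_least is a hypothetical name. Note that here the threshold means "present in at least this many arrays", which is a slightly different reading from "the intersection has at least this many elements". The array_unique call is an added assumption, so that a value repeated inside one array is only counted once.

```php
<?php
// Hypothetical helper: values that occur in at least $minArrays of the given arrays.
function values_in_at_least(array $arrays, int $minArrays): array
{
    $counts = [];
    foreach ($arrays as $arr) {
        // array_unique: a value repeated inside one array counts only once
        foreach (array_unique($arr) as $value) {
            $counts[$value] = ($counts[$value] ?? 0) + 1;
        }
    }
    return array_keys(array_filter($counts, function ($n) use ($minArrays) {
        return $n >= $minArrays;
    }));
}

$a = [1, 2, 3, 4, 5];
$b = [1, 3, 4, 5, 6];
$c = [1, 7, 8, 9, 10];
$d = [1, 2, 3, 4];

echo json_encode(values_in_at_least([$a, $b, $c, $d], 3)), PHP_EOL; // [1,3,4]
```

Because the values are used as array keys, this only works for integers and strings.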

Optimizing PHP algorithm

I have built a class that finds the smallest number divisible by all numbers in a given range.
This is my code:
class SmallestDivisible
{
private $dividers = array();
public function findSmallestDivisible($counter)
{
$this->dividers = range(10, 20);
for($x=1; $x<$counter; $x++) {
if ($this->testIfDevisibleByAll($x, $this->dividers) == true) {
return $x;
}
}
}
private function testIfDevisibleByAll($x, $dividers)
{
foreach($dividers as $divider) {
if ($x % $divider !== 0) {
return false;
}
}
return true;
}
}
$n = new SmallestDivisible();
echo $n->findSmallestDivisible(1000000000);
This class finds a number that is divisible by all numbers in the range from 10 to 20 ($this->dividers).
I know it works well as I tested it with other, lower ranges, but, unfortunately, it is not able to find the solution for range(10, 20) within 30 seconds - and this is the time after which a PHP script is halted.
The parameter fed to the findSmallestDivisible method is the ceiling of the group of numbers the script is going to inspect (i.e. from 1 to $counter, which is 1000000000 in this execution).
I would be grateful for suggestions on how I can optimize this script so that it executes faster.
Your solution is brute-force and simply horrible.
Instead, how about handling it mathematically? You're looking for the lowest common multiple of numbers in your range, so...
function gcd($n, $m) {
$n=abs($n); $m=abs($m);
list($n,$m) = array(min($m,$n),max($m,$n));
while($r = $m % $n) {
list($m,$n) = array($n,$r);
}
return $n;
}
function lcm($n, $m) {
return $m * ($n/gcd($n,$m));
}
function lcm_array($arr) {
while(count($arr) > 1) {
array_push($arr, lcm(array_shift($arr),array_shift($arr)));
}
return array_shift($arr);
}
var_dump(lcm_array(range(10,20)));
// result int(232792560)
This means your original code would have had to do 232,792,560 iterations, no wonder it took so long!
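The same calculation can be folded over the range with array_reduce. The sketch below is a compact variant under a few assumptions: the function names are chosen for this example, it requires PHP 7.1+ for the short list assignment, and it uses intdiv so the division is explicitly an integer one.

```php
<?php
// Euclid's algorithm; returns the greatest common divisor of $a and $b.
function gcd(int $a, int $b): int
{
    while ($b !== 0) {
        [$a, $b] = [$b, $a % $b];
    }
    return abs($a);
}

// Least common multiple; divide before multiplying to keep intermediates small.
function lcm(int $a, int $b): int
{
    if ($a === 0 || $b === 0) {
        return 0;
    }
    return intdiv(abs($a), gcd($a, $b)) * abs($b);
}

// Fold lcm() over the whole list, starting from the neutral element 1.
function lcm_all(array $numbers): int
{
    return array_reduce($numbers, 'lcm', 1);
}

echo lcm_all(range(10, 20)), PHP_EOL; // 232792560
```

range(1, 20) gives the same result, because every number from 1 to 9 divides some number between 10 and 20.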
Your goal is an easy mathematical calculation named the least common multiple but using brute force to compute it is totally wrong (as you already found out).
The Wikipedia page lists several reasonable algorithms that can be used to compute it faster.
The one explained in the section "A method using a table" is really fast and doesn't require much memory. You keep only the leftmost column of the table (the numbers you want to get the lcm for) and the rightmost column (the current step). If you implement it I suggest you hardcode a list of prime numbers into your program to avoid computing them.
Here is another solution I came up with.
In short, the algorithm will calculate the LCM (least common multiple) for a group of numbers.
class Lcmx
{
public $currentLcm = 0;
private function gcd($a, $b)
{
if ($a == 0 || $b == 0)
return abs( max(abs($a), abs($b)) );
$r = $a % $b;
return ($r != 0) ?
$this->gcd($b, $r) :
abs($b);
}
private function lcm($a, $b)
{
return array_product(array($a, $b)) / $this->gcd($a, $b);
}
public function lcm_array($array = array())
{
$factors = $array;
while(count($factors) > 1) {
$this->currentLcm = $this->lcm(array_pop($factors), array_pop($factors));
array_push($factors, $this->currentLcm);
}
return $this;
}
}
$l = new Lcmx;
echo $l->lcm_array(range(1, 20))->currentLcm;
//232792560
