I'm working on a piece of code that should generate all combinations of a set of filters. All the values of a filter are in an array, which creates a multidimensional array of filters. The end result should be all possible URL combinations with the filters, e.g.:
/lorem
/lorem/foo
/lorem/foo/
/lorem/foo/primary
But also
/primary
/primary/ipsum
/primary/bar
etcetera.
I have this piece of code, which I believe originally came from Stack Overflow:
<?php
$parts = [];
$parts[] = ['lorem', 'ipsum'];
$parts[] = ['foo', 'bar'];
$parts[] = ['primary', 'secondary'];
$parts[] = ['test-value'];
echo "<pre>";
var_dump( create_all_combinations($parts) );
echo "</pre>";
function create_all_combinations($arrays)
{
$result = array();
$arrays = array_values($arrays);
$sizeIn = sizeof($arrays);
$size = $sizeIn > 0 ? 1 : 0;
foreach ($arrays as $array)
$size = $size * sizeof($array);
for ($i = 0; $i < $size; $i ++)
{
$result[$i] = array();
for ($j = 0; $j < $sizeIn; $j ++)
array_push($result[$i], current($arrays[$j]));
for ($j = ($sizeIn -1); $j >= 0; $j --)
{
if (next($arrays[$j]))
break;
elseif (isset ($arrays[$j]))
reset($arrays[$j]);
}
}
return $result;
}
Currently every result contains 4 URL parts (since we feed in 4 different filters). I am missing the unique combinations of 1, 2 or 3 filters and I'm not sure where to start to create these. Any help appreciated.
be wary. beware.
In this post I will provide essential functions where I have managed the complexity to a meticulous degree. A beginner should be able to step through them and verify the behaviour along the way. It is my goal to help prevent copy/paste development -
Cargo cult programming is symptomatic of a programmer not understanding either a bug they were attempting to solve or the apparent solution. The term cargo cult programmer may apply when anyone inexperienced with the problem at hand copies some program code from one place to another with little understanding of how it works or whether it is required.
The functions below are written with functional style principles. That means avoiding things like mutation, variable reassignment and other side effects. Instead of destroying old values, we will create new ones. This makes writing functions and debugging a lot easier. If you have any questions, don't hesitate to ask :D
array_combinations
I would break your problem down into several pieces. First, a generic combinations function which generates combinations of any multi-dimensional array -
function array_combinations (array $a) {
if (count($a) == 0)
yield [];
else
foreach (array_combinations(array_slice($a, 1)) as $c)
foreach ($a[0] as $v)
yield array_merge([$v], array_filter($c));
}
$t = [[3,6], ["a","b"], [9]];
foreach (array_combinations($t) as $c)
echo json_encode($c), PHP_EOL;
[3,"a",9]
[6,"a",9]
[3,"b",9]
[6,"b",9]
array_transpose
But according to your question, you want the transpose variant of this output. So we simply write a function for that too -
function array_transpose (array $a) {
return array_map(null, ...$a);
}
$t = [[3,6], ["a","b"], [9]];
foreach (array_combinations(array_transpose($t)) as $c)
echo json_encode($c), PHP_EOL;
[3,6]
["a",6]
[9,6]
[3,"b"]
["a","b"]
[9,"b"]
[3]
["a"]
[9]
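To see why array_filter appears inside array_combinations, it may help to look at what the transpose actually returns. The following is a small sketch (the variable names are only illustrative): array_map with a null callback zips its arguments position by position and pads the shorter ones with null.

```php
<?php
// array_map(null, ...) zips the input arrays position by position.
// Shorter inputs are padded with null, which array_filter later strips.
$t = [[3, 6], ["a", "b"], [9]];
$transposed = array_map(null, ...$t);

echo json_encode($transposed), PHP_EOL;
// [[3,"a",9],[6,"b",null]]

// array_filter without a callback drops every falsy value, so it removes the
// null padding, but it would also drop legitimate values such as 0 or "".
$withoutPadding = array_values(array_filter([6, "b", null]));
echo json_encode($withoutPadding), PHP_EOL;
// [6,"b"]
```

If your filter values can be 0 or an empty string, a stricter callback such as one that only rejects null would probably be the safer choice.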
combine
Complex programs are made up of several smaller programs -
$parts = [
['lorem', 'ipsum'],
['foo', 'bar'],
['primary', 'secondary'],
['test-value'],
];
foreach (array_combinations(array_transpose($parts)) as $c)
echo "/".join("/", $c), PHP_EOL;
/lorem/ipsum
/foo/ipsum
/primary/ipsum
/test-value/ipsum
/lorem/bar
/foo/bar
/primary/bar
/test-value/bar
/lorem/secondary
/foo/secondary
/primary/secondary
/test-value/secondary
/lorem
/foo
/primary
/test-value
without generators
If you do not want to use generators, adhering to functional principles is more challenging but still worth it. The complexity ramps up here considerably and you should use the generator, if possible -
function array_combinations (array $a) {
if (count($a) == 0)
return [[]];
else
return array_flatmap(
array_combinations(array_slice($a, 1)),
function ($c) use ($a) {
return array_map(function($v) use ($c) {
return array_merge([$v], array_filter($c));
}, $a[0]);
}
);
}
Where array_flatmap is defined as -
function array_flatmap (array $a, callable $f) {
return array_reduce($a, function ($r, $v) use ($f) {
return array_merge($r, $f($v));
}, []);
}
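As a quick illustration of what array_flatmap does on its own, here is a minimal sketch. The definition is repeated so the snippet runs by itself; the input values are made up.

```php
<?php
// Same definition as above, repeated so the snippet runs on its own.
function array_flatmap(array $a, callable $f) {
    return array_reduce($a, function ($r, $v) use ($f) {
        return array_merge($r, $f($v));
    }, []);
}

// Each input value maps to a list; the lists are concatenated in order.
$pairs = array_flatmap([1, 2, 3], function ($x) {
    return [$x, $x * 10];
});

echo json_encode($pairs), PHP_EOL; // [1,10,2,20,3,30]
```

A plain array_map would have returned three nested lists here; the reduce with array_merge is what flattens them by one level.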
with arrow functions
If you are on PHP >= 7.4, you can use the new arrow functions, which clean up the generatorless approach significantly -
function array_combinations (array $a) {
if (count($a) == 0)
return [[]];
else
return array_flatmap(
array_combinations(array_slice($a, 1)),
fn($c) => array_map(
fn($v) => array_merge([$v], array_filter($c)),
$a[0]
)
);
}
Where array_flatmap is defined as -
function array_flatmap (array $a, callable $f) {
return array_reduce(
$a,
fn($r, $v) => array_merge($r, $f($v)),
[]
);
}
Usage and output of each variant is the same.
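The question also asks for URLs that use only 1, 2 or 3 of the filters. One possible way to get those is to treat every filter as optional. The sketch below is not part of the functions above: optional_combinations is a hypothetical helper, and it assumes the URL parts should keep the order of the groups in $parts.

```php
<?php
// Hypothetical helper: every filter group contributes either one of its
// values or nothing at all (represented internally by null).
function optional_combinations(array $groups): Generator {
    if (count($groups) === 0) {
        yield [];
        return;
    }
    $choices = array_merge([null], $groups[0]);
    foreach ($choices as $v) {
        foreach (optional_combinations(array_slice($groups, 1)) as $rest) {
            yield ($v === null ? $rest : array_merge([$v], $rest));
        }
    }
}

$parts = [['lorem', 'ipsum'], ['foo', 'bar']];
$urls = [];
foreach (optional_combinations($parts) as $c) {
    if ($c !== []) {               // skip the combination with no filters at all
        $urls[] = '/' . implode('/', $c);
    }
}
echo implode(PHP_EOL, $urls), PHP_EOL;
```

For two groups of two values this produces 3 * 3 - 1 = 8 URLs, for example /foo, /lorem and /lorem/foo. If the parts may also appear in any order, each combination would additionally need to be permuted.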
Related
I'm attempting to modify the OrderedImportsFixer class in php-cs-fixer so I can clean up my files the way I want. What I want is to order my imports in a fashion similar to what you'd see in a filesystem listing, with "directories" listed before "files".
So, given this array:
$indexes = [
26 => ["namespace" => "X\\Y\\Zed"],
9 => ["namespace" => "A\\B\\See"],
3 => ["namespace" => "A\\B\\Bee"],
38 => ["namespace" => "A\\B\\C\\Dee"],
51 => ["namespace" => "X\\Wye"],
16 => ["namespace" => "A\\Sea"],
12 => ["namespace" => "A\\Bees"],
31 => ["namespace" => "M"],
];
I'd like this output:
$sorted = [
38 => ["namespace" => "A\\B\\C\\Dee"],
3 => ["namespace" => "A\\B\\Bee"],
9 => ["namespace" => "A\\B\\See"],
12 => ["namespace" => "A\\Bees"],
16 => ["namespace" => "A\\Sea"],
26 => ["namespace" => "X\\Y\\Zed"],
51 => ["namespace" => "X\\Wye"],
31 => ["namespace" => "M"],
];
As in a typical filesystem listing:
I've been going at uasort for a while (key association must be maintained) and have come close. Admittedly, this is due more to desperate flailing than any sort of rigorous methodology. Not really having a sense of how uasort works is kind of limiting me here.
// get the maximum number of namespace components in the list
$ns_counts = array_map(function($val){
return count(explode("\\", $val["namespace"]));
}, $indexes);
$limit = max($ns_counts);
for ($depth = 0; $depth <= $limit; $depth++) {
uasort($indexes, function($first, $second) use ($depth, $limit) {
$fexp = explode("\\", $first["namespace"]);
$sexp = explode("\\", $second["namespace"]);
if ($depth === $limit) {
// why does this help?
array_pop($fexp);
array_pop($sexp);
}
$fexp = array_slice($fexp, 0, $depth + 1, true);
$sexp = array_slice($sexp, 0, $depth + 1, true);
$fimp = implode(" ", $fexp);
$simp = implode(" ", $sexp);
//echo "$depth: $fimp <-> $simp\n";
return strnatcmp($fimp, $simp);
});
}
echo json_encode($indexes, JSON_PRETTY_PRINT);
This gives me properly sorted output, but with deeper namespaces on the bottom instead of the top:
{
"31": {
"namespace": "M"
},
"12": {
"namespace": "A\\Bees"
},
"16": {
"namespace": "A\\Sea"
},
"3": {
"namespace": "A\\B\\Bee"
},
"9": {
"namespace": "A\\B\\See"
},
"38": {
"namespace": "A\\B\\C\\Dee"
},
"51": {
"namespace": "X\\Wye"
},
"26": {
"namespace": "X\\Y\\Zed"
}
}
I'm thinking I may have to build a separate array for each level of namespace and sort it separately, but have drawn a blank on how I might do that. Any suggestions for getting the last step of this working, or something completely different that doesn't involve so many loops?
We divide this into 4 steps.
Step 1: Create hierarchical structure from the dataset.
function createHierarchicalStructure($indexes){
$data = [];
foreach($indexes as $d){
$temp = &$data;
foreach(explode("\\",$d['namespace']) as $namespace){
if(!isset($temp[$namespace])){
$temp[$namespace] = [];
}
$temp = &$temp[$namespace];
}
}
return $data;
}
Split each namespace on \ and build up a nested $data array. The & reference ($temp) always points at the current level of the tree, so each part is added in place instead of to a copy of the array.
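To make the reference walking in Step 1 more concrete, here is a small sketch with two made-up namespaces. The variable name $level is an assumption; it plays the role of $temp above.

```php
<?php
// Walk a reference down the tree, creating missing levels on the way.
$data = [];
foreach (['A\\B\\See', 'A\\Bees'] as $namespace) {
    $level = &$data;                      // start at the root for every namespace
    foreach (explode('\\', $namespace) as $part) {
        if (!isset($level[$part])) {
            $level[$part] = [];           // an empty array marks a "file" so far
        }
        $level = &$level[$part];          // descend without copying
    }
}
unset($level);                            // drop the dangling reference

echo json_encode($data), PHP_EOL;
// {"A":{"B":{"See":[]},"Bees":[]}}
```

Entries that end up with an empty array are the "files"; everything with children is a "folder".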
Step 2: Sort the hierarchy in first folders then files fashion.
function fileSystemSorting(&$indexes){
foreach($indexes as $key => &$value){
fileSystemSorting($value);
}
uksort($indexes,function($key1,$key2) use ($indexes){
if(count($indexes[$key1]) == 0 && count($indexes[$key2]) > 0) return 1;
if(count($indexes[$key2]) == 0 && count($indexes[$key1]) > 0) return -1;
return strnatcmp($key1,$key2);
});
}
Sort the subordinate folders first, then use uksort for the current level (the other way round would also work). If the two entries being compared are both folders or both files, compare them as strings; if one is a folder and the other is a file, the folder comes first.
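The comparison rule for a single level can be tried in isolation. This is only a sketch with made-up data; the $isFile1 and $isFile2 names are mine and simply spell out the two count() checks.

```php
<?php
// One level of the hierarchy: "B" has children (a folder), the others do not.
$level = [
    'Sea'  => [],
    'Bees' => [],
    'B'    => ['See' => [], 'Bee' => []],
];

uksort($level, function ($key1, $key2) use ($level) {
    $isFile1 = count($level[$key1]) === 0;
    $isFile2 = count($level[$key2]) === 0;
    if ($isFile1 !== $isFile2) {
        return $isFile1 ? 1 : -1;   // folders sort before files
    }
    return strnatcmp($key1, $key2); // same kind: natural string order
});

$order = array_keys($level);
echo implode(', ', $order), PHP_EOL; // B, Bees, Sea
```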
Step 3: Flatten the hierarchical structure now that they are in order.
function flattenFileSystemResults($hierarchical_data){
$result = [];
foreach($hierarchical_data as $key => $value){
if(count($value) > 0){
$sub_result = flattenFileSystemResults($value);
foreach($sub_result as $r){
$result[] = $key . "\\" . $r;
}
}else{
$result[] = $key;
}
}
return $result;
}
Step 4: Restore the initial data keys back and return the result.
function associateKeys($data,$indexes){
$map = array_combine(array_column($indexes,'namespace'),array_keys($indexes));
$result = [];
foreach($data as $val){
$result[ $map[$val] ] = ['namespace' => $val];
}
return $result;
}
Driver code:
function foldersBeforeFiles($indexes){
$hierarchical_data = createHierarchicalStructure($indexes);
fileSystemSorting($hierarchical_data);
return associateKeys(flattenFileSystemResults($hierarchical_data),$indexes);
}
print_r(foldersBeforeFiles($indexes));
Demo: https://3v4l.org/cvoB2
I believe the following should work:
uasort($indexes, static function (array $entry1, array $entry2): int {
$ns1Parts = explode('\\', $entry1['namespace']);
$ns2Parts = explode('\\', $entry2['namespace']);
$ns1Length = count($ns1Parts);
$ns2Length = count($ns2Parts);
for ($i = 0; $i < $ns1Length && isset($ns2Parts[$i]); $i++) {
$isLastPartForNs1 = $i === $ns1Length - 1;
$isLastPartForNs2 = $i === $ns2Length - 1;
if ($isLastPartForNs1 !== $isLastPartForNs2) {
return $isLastPartForNs1 <=> $isLastPartForNs2;
}
$nsComparison = $ns1Parts[$i] <=> $ns2Parts[$i];
if ($nsComparison !== 0) {
return $nsComparison;
}
}
return 0;
});
What it does is:
split namespaces into parts,
compare each part starting from the first one, and:
if we're at the last part for one and not the other, prioritize the one with the most parts,
otherwise, if the respective parts are different, prioritize the one that is before the other one alphabetically.
Demo
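To check the comparison by hand, the same idea can be wrapped in a named function and run on a few plain strings. compareNamespaces is a hypothetical name, and the sketch works on strings rather than on the ["namespace" => ...] entries from the question.

```php
<?php
// Hypothetical wrapper around the comparison above, operating on plain strings.
function compareNamespaces(string $ns1, string $ns2): int {
    $parts1 = explode('\\', $ns1);
    $parts2 = explode('\\', $ns2);
    $last1 = count($parts1) - 1;
    $last2 = count($parts2) - 1;
    for ($i = 0; $i <= $last1 && $i <= $last2; $i++) {
        // A path that ends here is a "file" at this level and sorts after
        // a path that continues deeper (a "folder").
        if (($i === $last1) !== ($i === $last2)) {
            return ($i === $last1) <=> ($i === $last2);
        }
        $cmp = $parts1[$i] <=> $parts2[$i];
        if ($cmp !== 0) {
            return $cmp;
        }
    }
    return 0;
}

$list = ['A\\Bees', 'M', 'A\\B\\See', 'A\\B\\C\\Dee'];
usort($list, 'compareNamespaces');
echo implode(PHP_EOL, $list), PHP_EOL;
```

With this input the deepest path, A\B\C\Dee, comes out first and the root-level M comes out last.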
Here's another version that breaks the steps down further that, although it might not be the most optimal, definitely helps my brain think about it. See the comments for more details on what is going on:
uasort(
$indexes,
static function (array $a, array $b) {
$aPath = $a['namespace'];
$bPath = $b['namespace'];
// Just in case there are duplicates
if ($aPath === $bPath) {
return 0;
}
// Break into parts
$aParts = explode('\\', $aPath);
$bParts = explode('\\', $bPath);
// If we only have a single thing then it is a root-level, just compare the item
if (1 === count($aParts) && 1 === count($bParts)) {
return $aPath <=> $bPath;
}
// Get the class and namespace (file and folder) parts
$aClass = array_pop($aParts);
$bClass = array_pop($bParts);
$aNamespace = implode('\\', $aParts);
$bNamespace = implode('\\', $bParts);
// If the namespaces are the same, sort by class name
if ($aNamespace === $bNamespace) {
return $aClass <=> $bClass;
}
// If the first namespace _starts_ with the second namespace, sort it first
if (0 === mb_strpos($aNamespace, $bNamespace)) {
return -1;
}
// Same as above but the other way
if (0 === mb_strpos($bNamespace, $aNamespace)) {
return 1;
}
// Just only by namespace
return $aNamespace <=> $bNamespace;
}
);
Online demo
I find no fault with Jeto's algorithmic design, but I decided to implement it more concisely. My snippet avoids iterated function calls and arithmetic in the for() loop, uses a single spaceship operator, and a single return. My snippet is more than 50% shorter and I generally find it easier to read, but then everybody thinks their own baby is cute, right?
Code: (Demo)
uasort($indexes, function($a, $b) {
$aParts = explode('\\', $a['namespace']);
$bParts = explode('\\', $b['namespace']);
$aLast = count($aParts) - 1;
$bLast = count($bParts) - 1;
for ($cmp = 0, $i = 0; $i <= $aLast && !$cmp; ++$i) {
$cmp = [$i === $aLast, $aParts[$i]] <=> [$i === $bLast, $bParts[$i]];
}
return $cmp;
});
Like Jeto's answer, it iterates each array simultaneously and
checks if either element is the last element of the array, if so it is bumped down the list (because we want longer paths to win tiebreakers);
if neither element is the last in its array, then compare the elements' current string alphabetically.
The process repeats until a non-zero evaluation is generated.
Since duplicate entries are not expected to occur, the return value should always be -1 or 1 (never 0)
Note, I am halting the for() loop with two conditions.
If $i is greater than the number of elements in $aParts (only) -- because if $bParts has fewer elements than $aParts, then $cmp will generate a non-zero value before a Notice is triggered.
If $cmp is a non-zero value.
Finally, to explain the array syntax on either side of the spaceship operator...The spaceship operator will compare the arrays from left to right, so it will behave like:
leftside[0] <=> rightside[0] then leftside[1] <=> rightside[1]
Making multiple comparisons in this way does not impact performance because there are no function calls on either side of the <=>. If there were function calls involved, it would be more performant to make individual comparisons in a fallback manner like:
fun($a1) <=> fun($b1) ?: fun($a2) <=> fun($b2)
This way subsequent function calls are only actually made if a tiebreak is necessary.
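The left-to-right behaviour of the spaceship operator on arrays can be verified with a few literal values. This is just a sketch using made-up pairs of the same shape as in the snippet above ([isLast, part]).

```php
<?php
// Arrays of equal length are compared element by element, left to right;
// the first non-equal pair decides the result.
$folderVsFile = [false, 'B'] <=> [true, 'A'];
$sameKind     = [true, 'A'] <=> [true, 'B'];
$identical    = [true, 'A'] <=> [true, 'A'];

var_dump($folderVsFile); // int(-1): false < true decides, 'B' vs 'A' is never reached
var_dump($sameKind);     // int(-1): the first pair ties, so 'A' < 'B' decides
var_dump($identical);    // int(0)
```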
Let's say I have
$a = [1,4,5,8,9,...];
containing a large range of non-contiguous numbers, and
$p = [
1 => [...],
3 => [...],
8 => [...],
10 => [...],
...
];
containing arrays whose indexes are non-contiguous as well.
I need to remove every number in $a that has a corresponding index in $p ...wait... without using any function.
Is it possible? If yes, how? If no, what is the most optimized way (wall-clock-wise) to solve this?
Sure, it's possible:
$filtered_array = [];
foreach ($a as $index => $value) {
foreach ($p as $key => $sub_array) {
if ($key == $value) {
// this $value of $a corresponds with an existing index in $p, so
// do NOT add it to our $filtered_array but move on to the next
// value of $a
continue 2;
}
}
$filtered_array[$index] = $value;
}
Of course this will be very slow, since for every entry in $a that is not in $p you need to iterate over the entire $p array to find out that it isn't in there. A more efficient solution is to look the key up directly with isset(), which is a language construct rather than a function. (Wrapping $p[$value] in a try/catch does not work here: reading a missing array index only raises a notice or warning in PHP, it does not throw an exception.)
$filtered_array = [];
foreach ($a as $index => $value) {
if (!isset($p[$value])) {
// $value does not exist as an index of $p
$filtered_array[$index] = $value;
}
}
Performance might be improved somewhat if you don't need to preserve the array keys (you won't need $index in that case, which saves a memory assignment for every item in $a), but I think that difference will be negligible.
Note that isset() also returns false when the key exists but its value is null. If $p may contain null values, array_key_exists is the safer check:
$filtered_array = [];
foreach ($a as $index => $value) {
if (!array_key_exists($value, $p)) {
$filtered_array[$index] = $value;
}
}
Both checks are hash lookups, so either one avoids scanning all of $p for every value of $a.
It would probably be even faster if you allow yourself to use unset() to remove values that are in $p, so you don't need to create a new array but can instead modify $a in place. From your code example and the short array syntax, I'm assuming you use PHP 7. Since PHP 7 no longer uses the internal array pointer for foreach, you can safely unset() items while iterating. If you're running on PHP 5, you can still do that but you run the risk of foreach skipping over some items.
foreach ($a as $index => $value) {
if (array_key_exists($value, $p)) {
unset($a[$index]);
}
}
Doing this will reduce the memory overhead of having a second (potentially large) array hanging around.
But since now you're already breaking your own "no functions" rule anyway, you might as well go all the way and use array_filter (although this does not modify the existing array but instead builds a new array, which will degrade performance for very large arrays):
$filtered_array = array_filter($a, function($value) use ($p) {
return !array_key_exists($value, $p);
});
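If functions are allowed anyway, the whole loop can also be expressed with key-based set operations. This is a sketch I have not benchmarked; it assumes $a contains only unique integers or strings (array_flip would collapse duplicates), and the sample data is made up.

```php
<?php
$a = [1, 4, 5, 8, 9];
$p = [1 => ['x'], 3 => ['y'], 8 => ['z'], 10 => []];

// Flip $a so its values become keys, drop every key that also exists in $p,
// then turn the remaining keys back into values.
// Assumption: $a holds only unique integers or strings.
$remaining = array_keys(array_diff_key(array_flip($a), $p));

echo json_encode($remaining), PHP_EOL; // [4,5,9]
```

Unlike the array_filter version, this does not preserve the original keys of $a.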
So I'm answering my own question, since people were too busy downvoting and childishly commenting "it's not possible" or "is it homework". No, it is not homework; I was just hoping for a little help here (you know, as in community).
What I tried to achieve: remove every number in an array that has an index in another array (both having non-contiguous values) without using any function:
// temporary $a
$a_temp = [];
foreach ($a as $avalue) {
foreach ($p as $k => $v) {
// if iterator $a is in the $p (we disregard)
if ($avalue === $k) {
continue 2;
}
}
$a_temp[] = $avalue;
}
$a = $a_temp;
(if you see a function above, please send me an email)
Here is some testing.
/********************************
* initialize fake data
********************************/
$p = [];
for ($i = 0; $i < 10; ++$i) {
$p[rand(0, 500)] = null;
}
for ($j = 0; $j < 1000; ++$j) {
$a[rand(0, 500)] = null;
}
$a = array_keys($a);
/********************************
* first test (without functions)
********************************/
$start = microtime(true);
for ($i = 0; $i < 9999; ++$i) {
$a_temp = [];
foreach ($a as $avalue) {
foreach ($p as $k => $v) {
if ($avalue === $k) {
continue 2;
}
}
$a_temp[] = $avalue;
}
}
$firstTestExecTime = (microtime(true) - $start);
// average time : 8.6757750511169
/********************************
* second test, with functions
********************************/
$start = microtime(true);
for ($i = 0; $i < 9999; ++$i) {
$a_temp = [];
foreach ($a as $avalue) {
if (!array_key_exists($avalue, $p)) {
$a_temp[] = $avalue;
}
}
}
$secondTestExecTime = (microtime(true) - $start);
// average time : 5.1003220081329
/********************************
* printing results
********************************/
printf('first test execution time : %s', $firstTestExecTime);
printf('second test execution time : %s', $secondTestExecTime);
Sometimes using functions with a big set of data is not recommended, but in this case it seems that array_key_exists performs better than a hand-written loop.
Let's say I have following arrays:
$a = [1,2,3,4,5];
$b = [1,3,4,5,6];
$c = [1,7,8,9,10];
$d = [1,2,3,4];
The intersection of those would be $result = [1], which is easy enough. But what if I wanted the intersection of those with a minimum threshold of let's say 3? The threshold means I can skip one or more arrays from the intersection, as long as my resulting intersection has at least 3 elements, which in this case might result in:
$result = [1,3,4];
1, 3 and 4 are present in $a, $b and $d, but not in $c which is skipped because of the threshold. Is there an existing PHP class, algorithm or function with which I might accomplish this?
To do that we have to use combinations of an array. I have used combinations algorithm from this great article. Adjusting this algorithm we can write the following class:
class Intersections
{
protected $arrays;
private $arraysSize;
public function __construct($arrays)
{
$this->arrays = $arrays;
$this->arraysSize = count($arrays);
}
public function getByThreshold($threshold)
{
$intersections = $this->getAll();
foreach ($intersections as $intersection) {
if (count($intersection) >= $threshold) {
return $intersection;
}
}
return null;
}
protected $intersections;
public function getAll()
{
if (is_null($this->intersections)) {
$this->generateIntersections();
}
return $this->intersections;
}
private function generateIntersections()
{
$this->generateCombinationsMasks();
$this->generateCombinations();
$combinationSize = $this->arraysSize;
$intersectionSize = 0;
foreach ($this->combinations as $combination) {
$intersection = call_user_func_array('array_intersect', $combination);
if ($combinationSize > count($combination)) {
$combinationSize = count($combination);
$intersectionSize = 0;
}
if (count($intersection) > $intersectionSize) {
$this->intersections[$combinationSize] = $intersection;
$intersectionSize = count($intersection);
}
}
}
private $combinationsMasks;
private function generateCombinationsMasks()
{
$combinationsMasks = [];
$totalNumberOfCombinations = pow(2, $this->arraysSize);
for ($i = $totalNumberOfCombinations - 1; $i > 0; $i--) {
$combinationsMasks[] = str_pad(
decbin($i), $this->arraysSize, '0', STR_PAD_LEFT
);
}
usort($combinationsMasks, function ($a, $b) {
// ['0' => ''] strips the zeros, so the masks compare by their number of 1s
return strcmp(strtr($b, ['0' => '']), strtr($a, ['0' => '']));
});
$this->combinationsMasks = array_slice(
$combinationsMasks, 0, -$this->arraysSize
);
}
private $combinations;
private function generateCombinations()
{
$this->combinations = array_map(function ($combinationMask) {
return $this->generateCombination($combinationMask);
}, $this->combinationsMasks);
}
private function generateCombination($combinationMask)
{
$combination = [];
foreach (str_split($combinationMask) as $key => $indicator) {
if ($indicator) {
$combination[] = $this->arrays[$key];
}
}
return $combination;
}
}
I have tried to give self-explanatory names to the methods. Some chunks of code can be optimized further for production use (for example, I call count multiple times on the same arrays; this was done to avoid juggling extra variables).
So basically the logic is pretty simple. We generate all combinations of arrays and sort them in decreasing order by the number of arrays used. Then we find the longest intersection for each combination size. Actually, this is the hardest part. To get one particular intersection, we return the first one that meets the threshold.
$intersections = new Intersections([$a, $b, $c, $d]);
var_dump($intersections->getAll());
var_dump($intersections->getByThreshold(3));
Here is a working demo.
There are other ways to find all combinations, for example, one from "PHP Cookbook". You can choose whatever one you like most.
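The mask generation step of the class can be tried in isolation. The sketch below is not the class code itself: it sorts with substr_count instead of the strtr/strcmp trick (same intent, counting the 1s), and $arrayCount is an illustrative value.

```php
<?php
$arrayCount = 3; // illustrative value

// One binary mask per non-empty subset: "101" means "use arrays 0 and 2".
$masks = [];
for ($i = (1 << $arrayCount) - 1; $i > 0; $i--) {
    $masks[] = str_pad(decbin($i), $arrayCount, '0', STR_PAD_LEFT);
}

// Masks with more 1s (more arrays used) come first.
usort($masks, function ($a, $b) {
    return substr_count($b, '1') <=> substr_count($a, '1');
});

foreach ($masks as $mask) {
    echo $mask, PHP_EOL;
}
```

For three arrays this prints 111 first, then the three two-array masks, then the three single-array masks (which the class drops with array_slice, since a single array has nothing to intersect with).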
There is no built-in feature for that. You need to write something short like:
$values = [];
foreach ([$a, $b, $c, $d] as $arr)
foreach ($arr as $value)
$values[$value] = ($values[$value] ?? 0) + 1;
// For threshold of 3
$values = array_keys(array_filter($values, function($a) { return $a >= 3; }));
Note: This requires PHP 7 for the ?? operator. Otherwise use something like:
$values[$value] = empty($values[$value]) ? 1 : $values[$value] + 1;
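Put together with the arrays from the question, the counting approach might look like the sketch below. Note the assumption it makes: the threshold here is the number of arrays a value must occur in, which is not quite the same thing as the minimum size of the intersection; for this data the two happen to give the same answer. The array_unique call and the $minArrays name are my additions.

```php
<?php
$a = [1, 2, 3, 4, 5];
$b = [1, 3, 4, 5, 6];
$c = [1, 7, 8, 9, 10];
$d = [1, 2, 3, 4];

$counts = [];
foreach ([$a, $b, $c, $d] as $arr) {
    // array_unique guards against a value being counted twice for one array
    foreach (array_unique($arr) as $value) {
        $counts[$value] = ($counts[$value] ?? 0) + 1;
    }
}

$minArrays = 3; // a value must occur in at least this many arrays
$result = array_keys(array_filter($counts, function ($n) use ($minArrays) {
    return $n >= $minArrays;
}));

echo json_encode($result), PHP_EOL; // [1,3,4]
```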
Introduction
Since version 5.5, PHP has had a great feature: generators. I will not repeat the official manual page, but they are a great way to define iterators concisely. The best-known example is:
function xrange($from, $till, $step)
{
if ($from>$till || $step<=0)
{
throw new InvalidArgumentException('Invalid range initializers');
}
for ($i = $from; $i < $till; $i += $step)
{
yield $i;
}
}
//...
foreach (xrange(2, 13, 3) as $i)
{
echo($i.PHP_EOL); // 2,5,8,11
}
and generator is actually not a function, but an instance of a concrete class:
get_class(xrange(1, 10, 1)); // Generator
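One consequence of a generator being an object is that calling the function does not run its body yet. The sketch below uses a made-up countdown generator to show that an argument check like the one in xrange only fires on first use.

```php
<?php
function countdown(int $from): Generator {
    if ($from < 0) {
        throw new InvalidArgumentException('Invalid start value');
    }
    for ($i = $from; $i > 0; $i--) {
        yield $i;
    }
}

$generator = countdown(-1);                 // nothing has run yet, no exception
var_dump($generator instanceof Generator);  // bool(true)

$thrownOnFirstUse = false;
try {
    $generator->current();                  // first use runs the body up to the first yield
} catch (InvalidArgumentException $e) {
    $thrownOnFirstUse = true;
    echo 'Thrown on first use: ', $e->getMessage(), PHP_EOL;
}
```

If the validation has to happen at call time, one option is a regular wrapper function that checks the arguments and then returns the generator.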
The problem
Done with RTM stuff, now moving on to my question. Imagine that we want to create generator of Fibonacci numbers. Normally, to get those, we can use simple function:
function fibonacci($n)
{
if(!is_int($n) || $n<0)
{
throw new InvalidArgumentException('Invalid sequence limit');
}
return $n < 2 ? $n : fibonacci($n-1) + fibonacci($n-2);
}
var_dump(fibonacci(6)); // 8
Let's transform this into something that holds the whole sequence and not only its last member:
function fibonacci($n)
{
if (!is_int($n) || $n<0)
{
throw new InvalidArgumentException('Invalid sequence limit');
}
if ($n<2)
{
return range(0, $n);
}
$n1 = fibonacci($n-1);
$n2 = fibonacci($n-2);
return array_merge($n1, [array_pop($n1)+array_pop($n2)]);
}
//...
foreach (fibonacci(6) as $i)
{
echo($i.PHP_EOL); // 0,1,1,2,3,5,8
}
We now have a function that returns an array with the full sequence.
The question
Finally, the question part: how can I transform my latest fibonacci function so it will yield my values, not holding them in an array? My $n can be big, so I want to use benefits of generators, like in xrange sample. Pseudo-code will be:
function fibonacci($n)
{
if (!is_int($n) || $n<0)
{
throw new InvalidArgumentException('Invalid sequence limit');
}
if ($n<2)
{
yield $n;
}
yield fibonacci($n-2) + fibonacci($n-1);
}
But this obviously does not work, because each recursive call returns an object of class Generator rather than an int value.
Bonus: getting the Fibonacci sequence is just an example of a more general question: how do you use generators with recursion in the general case? Of course, I can use a standard Iterator for that or rewrite my function to avoid recursion. But I want to achieve it with generators. Is this possible? Is it worth the effort to use them this way?
So the issue I ran into when attempting to create a recursive generator function, is that once you go past your first depth level each subsequent yield is yielding to its parent call rather than the iteration implementation (the loop).
As of php 7 a new feature has been added that allows you to yield from a subsequent generator function. This is the new Generator Delegation feature: https://wiki.php.net/rfc/generator-delegation
This allows us to yield from subsequent recursive calls, which means we can now efficiently write recursive functions with the use of generators.
$items = ['what', 'this', 'is', ['is', 'a', ['nested', 'array', ['with', 'a', 'bunch', ['of', ['values']]]]]];
function processItems($items)
{
foreach ($items as $value)
{
if (is_array($value))
{
yield from processItems($value);
continue;
}
yield $value;
}
}
foreach (processItems($items) as $item)
{
echo $item . "\n";
}
This gives the following output:
what
this
is
is
a
nested
array
with
a
bunch
of
values
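The same delegation can be applied to the Fibonacci sequence from the question. This is a minimal sketch, assuming PHP 7 or later; fibonacciSequence and its two extra parameters are hypothetical and simply carry the current pair of neighbouring numbers down the recursion.

```php
<?php
// Hypothetical signature: $a and $b carry the current pair of neighbours.
function fibonacciSequence(int $n, int $a = 0, int $b = 1): Generator {
    if ($n < 0) {
        return;                 // an empty generator ends the recursion
    }
    yield $a;
    yield from fibonacciSequence($n - 1, $b, $a + $b);
}

foreach (fibonacciSequence(6) as $value) {
    echo $value, ' ';           // 0 1 1 2 3 5 8
}
echo PHP_EOL;

// yield from keeps the inner generator's keys, so every key here is 0;
// pass false to iterator_to_array() to avoid overwriting entries.
$all = iterator_to_array(fibonacciSequence(6), false);
```

The recursion is as deep as $n, so for a large $n a plain loop inside a single generator is likely the cheaper option.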
I've finally identified a real-world use for recursive generators.
I've been exploring QuadTree data structures recently. For those not familiar with QuadTrees, they're a tree-based data structure used for geospatial indexing, allowing a fast lookup of all points/locations within a defined bounding box.
Each node in the QuadTree represents a segment of the mapped region, and acts as a bucket in which locations are stored... but a bucket of restricted size. When a bucket overflows, the QuadTree node splits off 4 child nodes, representing the North-west, North-east, South-west and South-east areas of the parent node, and starts to fill those.
When searching for locations falling within a specified bounding box, the search routine starts at the top-level node, testing all the locations in that bucket; then recurses into the child nodes, testing whether they intersect with the bounding box, or are encompassed by the bounding box, testing each QuadTree node within that set, then recursing again down through the tree. Each node may return none, one or many locations.
I implemented a basic QuadTree in PHP, designed to return an array of results; then realised that it might be a valid use case for a recursive generator, so I implemented a GeneratorQuadTree that can be accessed in a foreach() loop yielding a single result each iteration.
It seems a much more valid use-case for recursive generators because it is a truly recursive search function, and because each generator may return none, one or many results rather than a single result. Effectively, each nested generator is handling a part of the search, feeding its results back up through the tree through its parent.
The code is rather too much to post here; but you can take a look at the implementation on github.
It's fractionally slower than the non-generator version (but not significantly): the main benefit is reduction in memory because it isn't simply returning an array of variable size (which can be a significant benefit depending on the number of results returned). The biggest drawback is the fact that the results can't easily be sorted (my non-generator version does a usort() on the results array after it's returned).
function fibonacci($n)
{
if($n < 2) {
yield $n;
// stop here so the base case never falls through to the recursive part
return;
}
$x = fibonacci($n-1);
$y = fibonacci($n-2);
yield $x->current() + $y->current();
}
for($i = 0; $i <= 10; $i++) {
$x = fibonacci($i);
$value = $x->current();
echo $i , ' -> ' , $value, PHP_EOL;
}
If you first want to make a generator you might as well use the iterative version of fibonacci:
function fibonacci ($from, $to)
{
$a = 0;
$b = 1;
while( $to > 0 ) {
if( $from > 0 )
$from--;
else
yield $a;
$tmp = $a + $b;
$a=$b;
$b=$tmp;
$to--;
}
}
foreach( fibonacci(10,20) as $fib ) {
print "$fib "; // prints "55 89 144 233 377 610 987 1597 2584 4181 "
}
Here's a recursive generator for combinations (order unimportant, without replacement):
<?php
function comb($set = [], $size = 0) {
if ($size == 0) {
// end of recursion
yield [];
}
// since nothing to yield for an empty set...
elseif ($set) {
$prefix = [array_shift($set)];
foreach (comb($set, $size-1) as $suffix) {
yield array_merge($prefix, $suffix);
}
// similar to `yield from comb($set, $size);`, except that this loop re-keys the results
foreach (comb($set, $size) as $next) {
yield $next;
}
}
}
// let's verify correctness
assert(iterator_to_array(comb([0, 1, 2, 3, 4], 3)) == [
[0, 1, 2], [0, 1, 3], [0, 1, 4], [0, 2, 3], [0, 2, 4],
[0, 3, 4], [1, 2, 3], [1, 2, 4], [1, 3, 4], [2, 3, 4]
]);
foreach (comb([0, 1, 2, 3], 3) as $combination) {
echo implode(", ", $combination), "\n";
}
Outputs:
0, 1, 2
0, 1, 3
0, 2, 3
1, 2, 3
Same thing non-yielding.
Recently ran into a problem that needed 'recursive' generators or generator delegation. I ended up writing a little function that converts delegated generators calls into a single generator.
I turned it into a package so you could just require it with composer, or checkout the source here: hedronium/generator-nest.
Code:
function nested(Iterator $generator)
{
$cur = 0;
$gens = [$generator];
while ($cur > -1) {
if ($gens[$cur]->valid()) {
$key = $gens[$cur]->key();
$val = $gens[$cur]->current();
$gens[$cur]->next();
if ($val instanceof Generator) {
$gens[] = $val;
$cur++;
} else {
yield $key => $val;
}
} else {
array_pop($gens);
$cur--;
}
}
}
You use it like:
foreach (nested(recursive_generator()) as $combination) {
// your code
}
Check out the link above; it has examples.
Short answer: recursive generators are simple. Example for walking through tree:
class Node {
public function getChildren() {
return [ /* array of children */ ];
}
public function walk() {
yield $this;
foreach ($this->getChildren() as $child) {
foreach ($child->walk() as $return) {
yield $return;
}
}
}
}
That's all.
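For a concrete tree, the walk might look like the sketch below. TreeNode is a hypothetical minimal class, and the inner foreach is replaced with yield from, which assumes PHP 7 or later.

```php
<?php
// Hypothetical minimal tree node, for illustration only.
class TreeNode {
    public $name;
    public $children;

    public function __construct(string $name, array $children = []) {
        $this->name = $name;
        $this->children = $children;
    }

    public function walk(): Generator {
        yield $this;                       // visit the node itself first
        foreach ($this->children as $child) {
            yield from $child->walk();     // then delegate to each subtree
        }
    }
}

$tree = new TreeNode('root', [
    new TreeNode('a', [new TreeNode('a1'), new TreeNode('a2')]),
    new TreeNode('b'),
]);

$names = [];
foreach ($tree->walk() as $node) {
    $names[] = $node->name;
}
echo implode(', ', $names), PHP_EOL; // root, a, a1, a2, b
```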
Long answer about fibonacci:
A generator is something that is used with foreach (generator() as $item) { ... }. But the OP wants the fib() function to return an int and, at the same time, to return a generator to be used in foreach. That is very confusing.
It is possible to implement a recursive generator solution for Fibonacci. We just need to put a loop somewhere inside the fib() function that actually yields each member of the sequence. As a generator is supposed to be used with foreach, it looks really weird, and I do not think it is efficient, but here it is:
function fibGenerator($n) {
if ($n < 2) {
yield $n;
return;
}
// calculating current number
$x1 = fibGenerator($n - 1);
$x2 = fibGenerator($n - 2);
$result = $x1->current() + $x2->current();
// yielding the sequence
yield $result;
yield $x1->current();
yield $x2->current();
for ($n = $n - 3; $n >= 0; $n--) {
$res = fibGenerator($n);
yield $res->current();
}
}
foreach (fibGenerator(15) as $x) {
echo $x . " ";
}
I am offering two solutions for Fibonacci numbers, with and without recursion:
function fib($n)
{
return ($n < 3) ? ($n == 0) ? 0 : 1 : fib($n - 1) + fib($n - 2);
}
function fib2()
{
$a = 0;
$b = 1;
for ($i = 1; $i <= 10; $i++)
{
echo $a . "\n";
$a = $a + $b;
$b = $a - $b;
}
}
for ($i = 0; $i <= 10; $i++)
{
echo fib($i) . "\n";
}
fib2(); // fib2() echoes the numbers itself and returns nothing
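Since the thread is about generators, the non-recursive loop can also be turned into a generator that yields each number instead of echoing it. A minimal sketch; fibSequence() and its $count parameter are hypothetical names, not part of the answers above:

```php
<?php
// Yields the first $count Fibonacci numbers, starting at 0.
function fibSequence(int $count): Generator
{
    $a = 0;
    $b = 1;
    for ($i = 0; $i < $count; $i++) {
        yield $a;
        [$a, $b] = [$b, $a + $b]; // advance the pair without a temp variable
    }
}

$first = iterator_to_array(fibSequence(10));
echo implode(' ', $first), "\n"; // 0 1 1 2 3 5 8 13 21 34
```

The caller decides what to do with each value, and only two numbers are kept in memory at any time.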
What would be the fastest, most efficient way to implement a search method that will return an object with a qualifying id?
Sample object array:
$array = [
(object) ['id' => 'one', 'color' => 'white'],
(object) ['id' => 'two', 'color' => 'red'],
(object) ['id' => 'three', 'color' => 'blue']
];
What do I write inside of:
function findObjectById($id){
}
The desired result would return the object at $array[0] if I called:
$obj = findObjectById('one');
Otherwise, it would return false if I passed 'four' as the parameter.
You can iterate over the objects:
function findObjectById($id){
$array = array( /* your array of objects */ );
foreach ( $array as $element ) {
if ( $id == $element->id ) {
return $element;
}
}
return false;
}
Edit:
A faster way is to have an array whose keys are equal to the objects' ids (if they are unique).
Then you can build your function as follows:
function findObjectById($id){
$array = array( /* your array of objects with ids as keys */ );
if ( isset( $array[$id] ) ) {
return $array[$id];
}
return false;
}
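If the objects arrive as a plain list, the keyed array can be built once and reused for every lookup. A minimal sketch, assuming the ids are unique; indexById() is a hypothetical helper name, and findObjectById() here takes the keyed array as an extra parameter rather than defining it inside the function:

```php
<?php
$array = [
    (object) ['id' => 'one', 'color' => 'white'],
    (object) ['id' => 'two', 'color' => 'red'],
    (object) ['id' => 'three', 'color' => 'blue'],
];

// Hypothetical helper: build an id => object array in a single pass.
function indexById(array $objects): array
{
    $byId = [];
    foreach ($objects as $object) {
        $byId[$object->id] = $object; // a later duplicate id would overwrite an earlier one
    }
    return $byId;
}

// Lookup against the keyed array; returns false when the id is missing.
function findObjectById(array $byId, string $id)
{
    return $byId[$id] ?? false;
}

$byId = indexById($array);
var_dump(findObjectById($byId, 'one')->color); // string(5) "white"
var_dump(findObjectById($byId, 'four'));       // bool(false)
```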
It's an old question but for the canonical reference as it was missing in the pure form:
$obj = array_column($array, null, 'id')['one'] ?? false;
The false is per the questions requirement to return false. It represents the non-matching value, e.g. you can make it null for example as an alternative suggestion.
This works transparently since PHP 7.0. In case you (still) have an older version, there are user-space implementations of it that can be used as a drop-in replacement.
However array_column also means to copy a whole array. This might not be wanted.
Instead it could be used to index the array and then map over with array_flip:
$index = array_column($array, 'id');
$map = array_flip($index);
$obj = $array[$map['one'] ?? null] ?? false;
On the index the search problem might still be the same, the map just offers the index in the original array so there is a reference system.
Keep in mind, though, that this might not be necessary as PHP has copy-on-write, so there might be less duplication than initially thought. This is just to show some options.
Another option is to go through the whole array and unless the object is already found, check for a match. One way to do this is with array_reduce:
$obj = array_reduce($array, static function ($carry, $item) {
return $carry === false && $item->id === 'one' ? $item : $carry;
}, false);
This variant again is with the returning false requirement for no-match.
It is a bit more straightforward with null:
$obj = array_reduce($array, static function ($carry, $item) {
return $carry ?? ($item->id === 'one' ? $item : $carry);
}, null);
And a different no-match requirement can then be added with $obj = ...) ?? false; for example.
Fully exposing to foreach within a function of its own even has the benefit to directly exit on match:
$result = null;
foreach ($array as $object) {
if ($object->id === 'one') {
$result = $object;
break;
}
}
unset($object);
$obj = $result ?? false;
This is effectively the original answer by hsz, which shows how universally it can be applied.
You can use PHP's array_search function like this:
$key = array_search("one", array_column(json_decode(json_encode($array), TRUE), 'id'));
var_dump($key !== false ? $array[$key] : false);
i: is the index of item in array
1: is the property value looking for
$arr: Array looking inside
'ID': the property key
$i = array_search(1, array_column($arr, 'ID'));
$element = ($i !== false ? $arr[$i] : null);
Well, you would have to loop through them and compare the IDs, unless your array is sorted (by ID), in which case you can implement a searching algorithm like binary search to make it quicker.
My suggestion would be to first sort the array using a sorting algorithm (binary sort, insertion sort or quick sort) if it is not sorted already. Then you can implement a search algorithm, which should improve performance, and I think that's as good as it gets.
http://www.algolist.net/Algorithms/Binary_search
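The sort-then-search idea could look like the following sketch. It assumes string ids compared byte-wise with strcmp(), and it uses PHP's built-in usort() instead of a hand-written sort; binarySearchById() is a hypothetical name:

```php
<?php
$array = [
    (object) ['id' => 'one', 'color' => 'white'],
    (object) ['id' => 'two', 'color' => 'red'],
    (object) ['id' => 'three', 'color' => 'blue'],
];

// Sort once by id; the search below must use the same comparison.
usort($array, function ($a, $b) {
    return strcmp($a->id, $b->id);
});

// Hypothetical helper: binary search over an array sorted by id.
function binarySearchById(array $sorted, string $id)
{
    $low = 0;
    $high = count($sorted) - 1;
    while ($low <= $high) {
        $mid = intdiv($low + $high, 2);
        $cmp = strcmp($sorted[$mid]->id, $id);
        if ($cmp < 0) {
            $low = $mid + 1;   // target is in the upper half
        } elseif ($cmp > 0) {
            $high = $mid - 1;  // target is in the lower half
        } else {
            return $sorted[$mid];
        }
    }
    return false;
}

var_dump(binarySearchById($array, 'one')->color); // string(5) "white"
var_dump(binarySearchById($array, 'four'));       // bool(false)
```

Sorting costs more than a single linear scan, so this only pays off when the same array is searched many times.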
This is my absolute favorite algorithm for finding what I need in a very large array quickly. It is a binary search implementation I created and use extensively in my PHP code. It hands-down beats straightforward iterative search routines. You can vary it in a multitude of ways to fit your needs, but the basic algorithm remains the same.
To use it (this variation), the array must be sorted, by the index you want to find, in lowest-to-highest order.
function quick_find(&$array, $property, $value_to_find, &$first_index) {
$l = 0;
$r = count($array) - 1;
$m = 0;
while ($l <= $r) {
$m = intdiv($l + $r, 2); // integer midpoint; floor() would return a float
if ($array[$m]->{$property} < $value_to_find) {
$l = $m + 1;
} else if ($array[$m]->{$property} > $value_to_find) {
$r = $m - 1;
} else {
$first_index = $m;
return $array[$m];
}
}
return FALSE;
}
And to test it out:
/* Define a class to put into our array of objects */
class test_object {
public $index;
public $whatever_you_want;
public function __construct( $index_to_assign ) {
$this->index = $index_to_assign;
$this->whatever_you_want = rand(1, 10000000);
}
}
/* Initialize an empty array we will fill with our objects */
$my_array = array();
/* Get a random starting index to simulate data (possibly loaded from a database) */
$my_index = rand(1256, 30000);
/* Say we are needing to locate the record with this index */
$index_to_locate = $my_index + rand(200, 30234);
/*
* Fill "$my_array()" with ONE MILLION objects of type "test_object"
*
* 1,000,000 objects may take a little bit to generate. If you don't
* feel patient, you may lower the number!
*
*/
for ($i = 0; $i < 1000000; $i++) {
$searchable_object = new test_object($my_index); // Create the object
array_push($my_array, $searchable_object); // Add it to the "$my_array" array
$my_index++; /* Increment our unique index */
}
echo "Searching array of ".count($my_array)." objects for index: " . $index_to_locate ."\n\n";
$index_found = -1; // Variable into which the array-index at which our object was found will be placed upon return of the function.
$object = quick_find($my_array, "index", $index_to_locate, $index_found);
if ($object === FALSE) {
echo "Index $index_to_locate was not contained in the array.\n";
} else {
echo "Object found at index $index_found!\n";
print_r($object);
}
echo "\n\n";
Now, a few notes:
You MAY use this to find non-unique indexes; the array MUST still be sorted in ascending order. Then, when it finds an element matching your criteria, you must walk the array backwards to find the first element, or forward to find the last. It will add a few "hops" to your search, but it will still most likely be faster than iterating a large array.
For STRING indexes, you can change the arithmetic comparisons (i.e. " > " and " < " ) in quick_find() to PHP's function "strcasecmp()". Just make sure the STRING indexes are sorted the same way (for the example implementation): Alphabetically and Ascending.
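The first note, about non-unique indexes, can be sketched as follows: run the binary search, then walk backwards while the previous element still matches. findFirstMatchIndex() is a hypothetical name, and the sample data is made up for illustration:

```php
<?php
// Returns the position of the FIRST element whose property equals $value,
// or -1 when there is none. The array must be sorted ascending by that property.
function findFirstMatchIndex(array $sorted, string $property, $value): int
{
    $low = 0;
    $high = count($sorted) - 1;
    while ($low <= $high) {
        $mid = intdiv($low + $high, 2);
        $current = $sorted[$mid]->{$property};
        if ($current < $value) {
            $low = $mid + 1;
        } elseif ($current > $value) {
            $high = $mid - 1;
        } else {
            // Some match found; step back to the first one in the run.
            while ($mid > 0 && $sorted[$mid - 1]->{$property} == $value) {
                $mid--;
            }
            return $mid;
        }
    }
    return -1;
}

$items = array_map(static function ($i) {
    return (object) ['index' => $i];
}, [1, 3, 3, 3, 7]);

var_dump(findFirstMatchIndex($items, 'index', 3)); // int(1)
var_dump(findFirstMatchIndex($items, 'index', 5)); // int(-1)
```

With long runs of equal values the backward walk becomes linear in the length of the run.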
And if you want to have a version that can search arrays of objects sorted in EITHER ascending OR descending order:
function quick_find_a(&$array, $property, $value_to_find, &$first_index) {
$l = 0;
$r = count($array) - 1;
$m = 0;
while ($l <= $r) {
$m = intdiv($l + $r, 2); // integer midpoint; floor() would return a float
if ($array[$m]->{$property} < $value_to_find) {
$l = $m + 1;
} else if ($array[$m]->{$property} > $value_to_find) {
$r = $m - 1;
} else {
$first_index = $m;
return $array[$m];
}
}
return FALSE;
}
function quick_find_d(&$array, $property, $value_to_find, &$first_index) {
$l = 0;
$r = count($array) - 1;
$m = 0;
while ($l <= $r) {
$m = intdiv($l + $r, 2); // integer midpoint; floor() would return a float
if ($value_to_find > $array[$m]->{$property}) {
$r = $m - 1;
} else if ($value_to_find < $array[$m]->{$property}) {
$l = $m + 1;
} else {
$first_index = $m;
return $array[$m];
}
}
return FALSE;
}
function quick_find(&$array, $property, $value_to_find, &$first_index) {
if ($array[0]->{$property} < $array[count($array)-1]->{$property}) {
return quick_find_a($array, $property, $value_to_find, $first_index);
} else {
return quick_find_d($array, $property, $value_to_find, $first_index);
}
}
The thing with the performance of data structures is not only how you get your data but mostly how you store it.
If you are free to design your array, use an associative array:
$array['one'] = (object) ['id' => 'one', 'color' => 'white'];
$array['two'] = (object) ['id' => 'two', 'color' => 'red'];
$array['three'] = (object) ['id' => 'three', 'color' => 'blue'];
Finding is then as cheap as it gets: $one = $array['one'];
UPDATE:
If you cannot modify the structure of your array, you could create a separate array which maps ids to indexes. Finding an object this way costs only a key lookup:
$map['one'] = 0;
$map['two'] = 1;
$map['three'] = 2;
...
getObjectById() then first looks up the index of the id within the original array and then returns the right object:
$index = $map[$id];
return $array[$index];
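Put together, the map can be built in one pass instead of by hand. A minimal sketch, assuming unique ids; buildIdMap() is a hypothetical helper, and getObjectById() takes the array and the map as parameters and returns false for an unknown id, as the question asks:

```php
<?php
$array = [
    (object) ['id' => 'one', 'color' => 'white'],
    (object) ['id' => 'two', 'color' => 'red'],
    (object) ['id' => 'three', 'color' => 'blue'],
];

// Hypothetical helper: map each id to its position in the original array.
function buildIdMap(array $objects): array
{
    $map = [];
    foreach ($objects as $index => $object) {
        $map[$object->id] = $index;
    }
    return $map;
}

function getObjectById(array $objects, array $map, string $id)
{
    if (!isset($map[$id])) {
        return false; // unknown id
    }
    return $objects[$map[$id]];
}

$map = buildIdMap($array);
var_dump($map['three']);                             // int(2)
var_dump(getObjectById($array, $map, 'one')->color); // string(5) "white"
var_dump(getObjectById($array, $map, 'four'));       // bool(false)
```

The map has to be rebuilt whenever the original array is reordered or elements are removed.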
Something I like to do in these situations is to create a referential array, thus avoiding having to re-copy the object but having the power to use the reference to it like the object itself.
$array['one'] = (object) ['id' => 'one', 'color' => 'white'];
$array['two'] = (object) ['id' => 'two', 'color' => 'red'];
$array['three'] = (object) ['id' => 'three', 'color' => 'blue'];
Then we can create a simple referential array:
$ref = array();
foreach ( $array as $row )
$ref[$row->id] = &$array[$row->id];
Now we can simply test if an instance exists in the array and even use it like the original object if we wanted:
if ( isset( $ref['one'] ) )
echo $ref['one']->color;
would output:
white
If the id in question did not exist, the isset() would return false, so there's no need to iterate the original object over and over looking for a value...we just use PHP's isset() function and avoid using a separate function altogether.
Please note when using references that you want to use the "&" with the original array and not the iterator, so using &$row would not give you what you want.
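For arrays of objects specifically, a plain assignment already avoids copying, because a PHP variable holds a handle to the object rather than the object itself. A minimal sketch using the question's sample data, where $ref is built without "&":

```php
<?php
$array = [
    (object) ['id' => 'one', 'color' => 'white'],
    (object) ['id' => 'two', 'color' => 'red'],
    (object) ['id' => 'three', 'color' => 'blue'],
];

$ref = [];
foreach ($array as $row) {
    $ref[$row->id] = $row; // stores the object handle, not a copy of the object
}

$ref['one']->color = 'black';        // change through the lookup array...
echo $array[0]->color, "\n";         // black (...is visible in the original)
var_dump($ref['one'] === $array[0]); // bool(true): same instance
var_dump(isset($ref['four']));       // bool(false)
```

The "&" form only becomes necessary when the elements are scalars or arrays, or when the slot in the original array itself should be replaceable through $ref.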
This is definitely not efficient, O(N). But it looks sexy:
$result = array_reduce($array, function ($found, $obj) use ($id) {
return $obj->id == $id ? $obj : $found;
}, null);
addendum:
I see hakre already posted something akin to this.
Here is what I use. Reusable functions that loop through an array of objects. The second one allows you to retrieve a single object directly out of all matches (the first one to match criteria).
function get_objects_where($match, $objects) {
if ($match == '' || !is_array($match)) return array ();
$wanted_objects = array ();
foreach ($objects as $object) {
$wanted = false;
foreach ($match as $k => $v) {
if (is_object($object) && isset($object->$k) && $object->$k == $v) {
$wanted = true;
} else {
$wanted = false;
break;
};
};
if ($wanted) $wanted_objects[] = $object;
};
return $wanted_objects;
};
function get_object_where($match, $objects) {
if ($match == '' || !is_array($match)) return (object) array ();
$wanted_objects = get_objects_where($match, $objects);
return count($wanted_objects) > 0 ? $wanted_objects[0] : (object) array ();
};
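The same matching rule can be expressed with array_filter(). This is a sketch of an alternative, not a drop-in replacement for the functions above; filterObjectsWhere() is a hypothetical name, and the sample objects are taken from the question:

```php
<?php
// Returns every object whose properties loosely equal all pairs in $match.
function filterObjectsWhere(array $match, array $objects): array
{
    if ($match === []) {
        return []; // like the loop version: no criteria means no matches
    }
    $matches = array_filter($objects, static function ($object) use ($match) {
        if (!is_object($object)) {
            return false;
        }
        foreach ($match as $key => $value) {
            if (!isset($object->$key) || $object->$key != $value) {
                return false; // one failed criterion rejects the object
            }
        }
        return true;
    });
    return array_values($matches); // reindex from 0
}

$objects = [
    (object) ['id' => 'one', 'color' => 'white'],
    (object) ['id' => 'two', 'color' => 'red'],
    (object) ['id' => 'three', 'color' => 'blue'],
];

$red = filterObjectsWhere(['color' => 'red'], $objects);
echo $red[0]->id, "\n"; // two
var_dump(count(filterObjectsWhere(['id' => 'two', 'color' => 'blue'], $objects))); // int(0)
```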
The easiest way:
function objectToArray($obj) {
return json_decode(json_encode($obj), true);
}