Multiple generators in a single loop within PHP - php

I need to write a simple script that loads data from multiple files and merges it somehow. However, since the files might be huge, I'd like to load the data in parts. To do so I decided to use yield. According to the examples I found, I can use the following construction for a single generator:
$generator = $someClass->load(); //load method uses yield so it returns generator object
foreach($generator as $i) {
// do something
}
But what if I want to use two generators at once?
$generatorA = $someClass1->load(); //load method uses yield so it returns generator object
$generatorB = $someClass2->load(); //load method uses yield so it returns generator object
foreach($generatorA as $i) {
// how can I access to resultSet from generatorB here?
}

Generators in PHP implement the Iterator interface, so you can merge / combine multiple Generators like you can combine multiple Iterators.
If you want to iterate over both generators one after the other (merge A + B), then you can make use of the AppendIterator.
$aAndB = new AppendIterator();
$aAndB->append($generatorA);
$aAndB->append($generatorB);
foreach ($aAndB as $i) {
...
If you want to iterate over both generators at once, you can make use of MultipleIterator.
$both = new MultipleIterator();
$both->attachIterator($generatorA);
$both->attachIterator($generatorB);
foreach ($both as list($valueA, $valueB)) {
...
Examples of both, including caveats, are in this blog post of mine as well:
Iterating over Multiple Iterators at Once
Otherwise I don't understand what you've been asking for. If you could clarify, I should be able to give you more directions.
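To make the "at once" variant concrete, here is a minimal sketch. The generator functions letters() and numbers() are made up for illustration and stand in for the two load() calls. MIT_NEED_ANY keeps the loop going until the longer generator is exhausted and fills the gaps with null; the default, MIT_NEED_ALL, would stop as soon as the shorter one ends.

```php
<?php
// Hypothetical generators standing in for $someClass1->load() and $someClass2->load().
function letters(): Generator
{
    yield 'a';
    yield 'b';
    yield 'c';
}

function numbers(): Generator
{
    yield 1;
    yield 2;
}

// MIT_NEED_ANY: keep iterating while at least one generator still has values.
$both = new MultipleIterator(MultipleIterator::MIT_NEED_ANY | MultipleIterator::MIT_KEYS_NUMERIC);
$both->attachIterator(letters());
$both->attachIterator(numbers());

$pairs = [];
foreach ($both as list($letter, $number)) {
    // $number is null once numbers() has run out of values.
    $pairs[] = [$letter, $number];
}

var_dump($pairs);
```

Whether null padding or stopping early is the right choice depends on how the two files are supposed to be merged.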

From https://www.php.net/manual/en/language.generators.syntax.php#control-structures.yield.from
Generator delegation via yield from
In PHP 7, generator delegation allows you to yield values from another
generator, Traversable object, or array by using the yield from keyword.
The outer generator will then yield all values from
the inner generator, object, or array until that is no longer valid,
after which execution will continue in the outer generator.
So it's possible to combine two (or more) generators using yield from.
/**
* Yield all values from $generator1, then all values from $generator2
* Keys are preserved
*/
function combine_sequentially(Generator $generator1, Generator $generator2): Generator
{
yield from $generator1;
yield from $generator2;
}
Or something more fancy (here, it's not possible to use yield from):
/**
* Yield a value from $generator1, then a value from $generator2, and so on
* Keys are preserved
*/
function combine_alternatively(Generator $generator1, Generator $generator2): Generator
{
while ($generator1->valid() || $generator2->valid()) {
if ($generator1->valid()) {
yield $generator1->key() => $generator1->current();
$generator1->next();
}
if ($generator2->valid()) {
yield $generator2->key() => $generator2->current();
$generator2->next();
}
}
}
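The same valid() / current() / next() pattern also works for more than two generators. This is only a sketch: interleave() is a hypothetical name, not a built-in. Because the original keys are preserved, pass false as the second argument of iterator_to_array() when collecting the values, otherwise colliding keys overwrite each other.

```php
<?php
// Hypothetical helper: take one value from each generator in turn until all are exhausted.
function interleave(Generator ...$generators): Generator
{
    while ($generators) {
        foreach ($generators as $index => $generator) {
            if (!$generator->valid()) {
                // This generator is done; drop it from the next round.
                unset($generators[$index]);
                continue;
            }
            yield $generator->key() => $generator->current();
            $generator->next();
        }
    }
}

$letters = (function () {
    yield 'a';
    yield 'b';
    yield 'c';
})();
$numbers = (function () {
    yield 1;
    yield 2;
})();

// false = ignore the (colliding) keys and build a plain list.
$result = iterator_to_array(interleave($letters, $numbers), false);
var_dump($result);
```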

While AppendIterator works for Iterators, it has some issues.
Firstly it is not so nice to need to construct a new object rather than just calling a function. What is even less nice is that you need to mutate the AppendIterator, since you cannot provide the inner iterators in its constructor.
Secondly AppendIterator only takes Iterator instances, so if you have a Traversable, such as IteratorAggregate, you are out of luck. Same story for other iterables that are not Iterator, such as array.
This PHP 7.1+ function combines any number of iterables:
/**
* array_merge clone for iterables using lazy evaluation
*
* As with array_merge, elements with numeric keys are assigned a fresh key,
* starting with key 0. Unlike array_merge, elements with duplicate non-numeric
* keys are kept in the Generator. Beware that when converting the Generator
* to an array with a function such as iterator_to_array, these duplicates will
* be dropped, resulting in identical behaviour as array_merge.
*
* @param iterable ...$iterables
* @return Generator
*/
function iterable_merge( iterable ...$iterables ): Generator {
$numericIndex = 0;
foreach ( $iterables as $iterable ) {
foreach ( $iterable as $key => $value ) {
yield is_int( $key ) ? $numericIndex++ : $key => $value;
}
}
}
Usage example:
foreach ( iterable_merge( $iterator1, $iterator2, $someArray ) as $k => $v ) {}
This function is part of a small library for working with iterables, where it is also extensively tested.

If you want to use Generators with AppendIterator you'll need to use NoRewindIterator with it:
https://3v4l.org/pgiXB
<?php
function foo() {
foreach ([] as $foo) {
yield $foo;
}
}
$append = new AppendIterator();
$append->append(new NoRewindIterator(foo()));
var_dump(iterator_to_array($append));
Trying to traverse a bare Generator with AppendIterator will cause a fatal error if the Generator never actually calls yield:
https://3v4l.org/B4Qnh
<?php
function foo() {
foreach ([] as $foo) {
yield $foo;
}
}
$append = new AppendIterator();
$append->append(foo());
var_dump(iterator_to_array($append));
Output:
Fatal error: Uncaught Exception: Cannot traverse an already closed generator in /in/B4Qnh:10
Stack trace:
#0 [internal function]: AppendIterator->rewind()
#1 /in/B4Qnh(10): iterator_to_array(Object(AppendIterator))
#2 {main}
thrown in /in/B4Qnh on line 10
Process exited with code 255.

You can use yield from
function one()
{
yield 1;
yield 2;
}
function two()
{
yield 3;
yield 4;
}
function merge()
{
yield from one();
yield from two();
}
foreach(merge() as $i)
{
echo $i;
}
An example reusable function:
function iterable_merge( iterable ...$iterables ): Generator {
foreach ( $iterables as $iterable ) {
yield from $iterable;
}
}
$merge=iterable_merge(one(),two());
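One caveat when collecting the merged values into an array: yield from preserves the keys of the inner generators, and both one() and two() produce the keys 0 and 1. A small sketch of the effect (merged() is just an illustrative name):

```php
<?php
function one(): Generator
{
    yield 1;
    yield 2;
}

function two(): Generator
{
    yield 3;
    yield 4;
}

function merged(): Generator
{
    yield from one(); // keys 0, 1
    yield from two(); // keys 0, 1 again
}

// With keys preserved (the default), the values of two() overwrite those of one().
$withKeys = iterator_to_array(merged());

// Passing false discards the keys and keeps every value.
$withoutKeys = iterator_to_array(merged(), false);

var_dump($withKeys, $withoutKeys);
```

A plain foreach over merged() is not affected; it sees all four values either way.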

Something like:
$generatorA = $someClass1->load(); //load method uses yield so it returns generator object
$generatorB = $someClass2->load(); //load method uses yield so it returns generator object
// Generators support neither count() nor array access, so advance both by hand.
while ($generatorA->valid() && $generatorB->valid()) {
    // Access both generators
    $genA = $generatorA->current();
    $genB = $generatorB->current();
    $generatorA->next();
    $generatorB->next();
}

Try this:
<?php
foreach ($generatorA as $key => $i) {
    $A = $i; // value from $generatorA
    $B = $generatorB->current(); // value from $generatorB at the same position (null once exhausted)
    $generatorB->next();
}

Related

unsetting objects in an array based on a bool method

I'm trying to filter an array of objects implementing a specific interface (which simply defines the isComplete(): bool method) based on the result of that method. array_filter doesn't work because it can't call a method on each object to determine whether to filter it (or can it?). I've tried writing a function that takes the splatted array as an argument by reference, but this doesn't work either:
function skipIncomplete(CompletableObjectInterface &...$objects): array {
$skipped = [];
foreach ($objects as $index => $item) {
if (!$item->isComplete()) {
$skipped[] = $item->id ?? $index;
unset($objects[$index]);
}
}
return $skipped;
}
The original elements passed in simply don't end up getting unset.
I'm looking for a way that doesn't include creating an entirely new Collection class to hold my CompletableObjects for complexity reasons. I really want to keep the type hint so no one can pass in a generic array, causing runtime errors when the function tries to call $item->isComplete.
Is there any way I can achieve this in PHP 7.3.15?
Added a filter, please comment as to what is wrong with this type of approach:
<?php
interface CompletableObjectInterface {
public function isComplete() : bool;
}
class Foo implements CompletableObjectInterface
{
public function isComplete() : bool
{
return false;
}
}
class Bar implements CompletableObjectInterface
{
public function isComplete() : bool
{
return true;
}
}
$foo = new Foo;
$bar = new Bar;
$incomplete = array_filter([$foo, $bar], function($obj) { return !$obj->isComplete();});
var_dump($incomplete);
Output:
array(1) {
[0]=>
object(Foo)#1 (0) {
}
}
Looks like you got a bit hung up on a wrong understanding of the ... syntax for a variable number of arguments.
You are passing in one array, and the $objects parameter will therefore contain that array in the first index, i.e. in $objects[0].
So in theory you could just change your line
unset($objects[$index]);
to
unset($objects[0][$index]);
However, I do not really see why the variable number of arguments syntax is used at all, since you apparently are just expecting one array of values (objects in this case) as an argument to the function. Therefore I'd recommend you just remove the ... from the argument list and your solution does what you wanted.
Alternatively you can of course add an outer foreach loop and iterate over all passed "arrays of objects", if that is a use case for you.
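If the goal is to keep the type hint and still use array_filter, one possible approach is to let the function return the filtered array instead of modifying its arguments by reference. This is a sketch under assumptions: Task and keepComplete() are hypothetical names, and the code avoids typed properties so it should also run on PHP 7.3.

```php
<?php
interface CompletableObjectInterface
{
    public function isComplete(): bool;
}

// Hypothetical implementation used only for this demonstration.
final class Task implements CompletableObjectInterface
{
    private $done;

    public function __construct(bool $done)
    {
        $this->done = $done;
    }

    public function isComplete(): bool
    {
        return $this->done;
    }
}

// The variadic type hint rejects anything that is not a CompletableObjectInterface.
function keepComplete(CompletableObjectInterface ...$objects): array
{
    return array_filter($objects, function (CompletableObjectInterface $object): bool {
        return $object->isComplete();
    });
}

$tasks = [new Task(true), new Task(false), new Task(true)];
$complete = keepComplete(...$tasks); // spread the array into separate arguments

var_dump(array_keys($complete)); // array_filter preserves the original indexes
```

The caller then assigns the result back, for example $tasks = keepComplete(...$tasks);, instead of relying on unset() inside the function.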

Difference in behavior when using "yield from" with an Iterator or a Generator

According to the documentation it is possible to delegate generation to any Traversable object. However, I see a difference between yield from {a Generator instance} and yield from {an Iterator instance}.
An example.
This piece of code has an Iterator and a Generator that are doing the same: they provide the "1, 2, 3" sequence:
$iterator = new class implements \Iterator {
private $values = [1, 2, 3];
public function current()
{
return current($this->values);
}
public function next()
{
next($this->values);
}
public function key()
{
return key($this->values);
}
public function valid()
{
return current($this->values) !== false;
}
public function rewind()
{
reset($this->values);
}
};
$generator = (function ()
{
yield 1;
yield 2;
yield 3;
})();
But when I try to yield from them, I've got different results. Let's play with this function:
function skipOne(\Iterator $iterator)
{
$iterator->next();
yield from $iterator;
}
With a generator everything works as expected:
foreach (skipOne($generator) as $value) {
var_dump($value);
}
Output:
int(2)
int(3)
But with an Iterator it doesn't skip the first number:
foreach (skipOne($iterator) as $value) {
var_dump($value);
}
Output:
int(1)
int(2)
int(3)
I've found that the yield from operator causes $iterator->rewind() to be invoked for some reason. What am I doing wrong?
You are not doing anything wrong.
yield from will attempt to rewind something that's capable of being rewound, but generators are not, they only go forward.
That's the reason your code works as you expect in the first example, and not in the second.
You could wrap your $iterator in a generator, and you'd get the same result.
$wrappedIterator = (function($iterator) {
foreach ($iterator as $value) {
yield $value;
}
})($iterator);
You are simply taking "advantage" of the one-use nature of generators, since they can only go forward and never be rewound.
The same would happen with foreach, for example, since it also attempts to rewind whatever it is looping over.
If you simply did
$iterator->next();
foreach ($iterator as $value) {
var_dump($value);
}
you would also get:
int(1)
int(2)
int(3)
In both cases (yield from and foreach) the iterator will be rewound.
Although if you tried
$generator->next();
foreach ($generator as $value) {
var_dump($value);
}
you would get a fatal error.
In your case it can be confusing, because when yield from cannot rewind the object it is given, it doesn't complain and simply proceeds as if nothing happened.
You can easily verify the behavior by checking this.
As others have stated, yield from will call $iterator->rewind().
Generators are the exception: they cannot be rewound once iteration has started, so yield from simply continues from the generator's current position.
From the PHP Manual:
Generators are forward-only iterators, and cannot be rewound once iteration has started. This also means that the same generator can't be iterated over multiple times: the generator will need to be rebuilt by calling the generator function again.
So if you wish to have the same behavior as Generators, you have two options.
Option 1: No-op Rewind
Simply leave the rewind() method of your Iterator empty, and move any rewind code to the constructor of the iterator.
class MyIterator implements \Iterator
{
public function __construct()
{
// put rewind code here
}
public function rewind()
{
// Do nothing
}
}
This will make your iterators less reusable, while also making it unclear from outside code why the iterator is not behaving like a regular iterator. So use this option with caution.
Option 2: Use NoRewindIterator
NoRewindIterator is an SPL iterator class that decorates another iterator.
$noRewindIterator = new \NoRewindIterator($iterator);
It will behave identically to the $iterator it is given, with the exception of the rewind() method. When $noRewindIterator->rewind() is called, nothing will happen.
This has the advantage of allowing you to write normal rewind-able iterators, while also being able to do partial iteration.
Here's how your skipOne() function can use NoRewindIterator:
function skipOne(\Iterator $iterator)
{
$iterator->next();
yield from new \NoRewindIterator($iterator);
}
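To see both behaviours side by side, here is a small self-contained sketch. It uses the built-in ArrayIterator in place of the anonymous class from the question, and skipOneRewinding() is a hypothetical name for the original, unwrapped version.

```php
<?php
// Original version: yield from rewinds the iterator, so the skip is lost.
function skipOneRewinding(Iterator $iterator): Generator
{
    $iterator->next();
    yield from $iterator;
}

// Wrapped version: NoRewindIterator turns rewind() into a no-op.
function skipOne(Iterator $iterator): Generator
{
    $iterator->next();
    yield from new NoRewindIterator($iterator);
}

// false = ignore keys, collect the values as a plain list.
$rewound = iterator_to_array(skipOneRewinding(new ArrayIterator([1, 2, 3])), false);
$skipped = iterator_to_array(skipOne(new ArrayIterator([1, 2, 3])), false);

var_dump($rewound, $skipped);
```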
The PHP documentation says: "The outer generator will then yield all values from the inner generator, object, or array until that is no longer valid, after which execution will continue in the outer generator."
"All" is one of the key words here. The PHP source code confirms this too.

How to pass in an empty generator parameter?

I have a method which takes a generator plus some additional parameters and returns a new generator:
function merge(\Generator $carry, array $additional)
{
foreach ( $carry as $item ) {
yield $item;
}
foreach ( $additional as $item ) {
yield $item;
}
}
The usual use case for this function is similar to this:
function source()
{
for ( $i = 0; $i < 3; $i++ ) {
yield $i;
}
}
foreach ( merge(source(), [4, 5]) as $item ) {
var_dump($item);
}
But the problem is that sometimes I need to pass empty source to the merge method. Ideally I would like to be able to do something like this:
merge(\Generator::getEmpty(), [4, 5]);
Which is exactly how I would do in C# (there is a IEnumerable<T>.Empty property). But I don't see any kind of empty generator in the manual.
I've managed to work around this (for now) by using this function:
function sourceEmpty()
{
if ( false ) {
yield;
}
}
And this works. The code:
foreach ( merge(sourceEmpty(), [4, 5]) as $item ) {
var_dump($item);
}
correctly outputs:
int(4)
int(5)
But this is obviously not an ideal solution. What would be the proper way of passing an empty generator to the merge method?
Bit late, but needed an empty generator myself, and realized creating one is actually quite easy...
function empty_generator(): Generator
{
yield from [];
}
Don't know if that's better than using the EmptyIterator, but this way you get exactly the same type as non-empty generators at least.
Just for completeness, perhaps the least verbose answer so far:
function generator() {
return; yield;
}
I wondered about the same question and remembered an early description in the docs (which should still hold today, at least semantically) that a generator function is any function containing the yield keyword.
Now when the function returns before it yields, the generator should be empty.
And so it is.
Example on 3v4l.org: https://3v4l.org/iqaIY
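For comparison, a short sketch that feeds the different "empty" sources into a merge function. The names emptyViaYieldFrom() and emptyViaReturn() are made up here, and the iterable parameter type is an assumption (it needs PHP 7.1+ and is wider than the \Generator hint in the question).

```php
<?php
function emptyViaYieldFrom(): Generator
{
    yield from [];
}

function emptyViaReturn(): Generator
{
    return;
    yield; // never reached, but its presence makes this a generator function
}

// Assumption: accepting iterable lets generators, EmptyIterator and arrays all pass.
function merge(iterable $carry, array $additional): Generator
{
    foreach ($carry as $item) {
        yield $item;
    }
    foreach ($additional as $item) {
        yield $item;
    }
}

$fromYieldFrom = iterator_to_array(merge(emptyViaYieldFrom(), [4, 5]), false);
$fromReturn = iterator_to_array(merge(emptyViaReturn(), [4, 5]), false);
$fromEmptyIterator = iterator_to_array(merge(new EmptyIterator(), [4, 5]), false);

var_dump($fromYieldFrom, $fromReturn, $fromEmptyIterator);
```

All three calls should produce only the values from the second argument.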
I've found the solution:
Since \Generator extends \Iterator I can just change the method signature to this:
function merge(\Iterator $carry, array $additional)
{
// ...
This widens the parameter type (parameter contravariance), so it would break backward compatibility only if someone extended the merge method. Any existing invocations will still work.
Now I can invoke the method with PHP's native EmptyIterator:
merge(new \EmptyIterator, [4, 5]);
And the usual generator also works:
merge(source(), [4, 5]);
As explained in the official docs, you can create an in-line Generator instance, by using yield in an expression:
$empty = (yield);
That should work, but when I tried using that, I got a fatal error (yield expression can only be used in a function). Using null didn't help either:
$empty = (yield null); //error
So I guess you're stuck with the sourceEmpty function... it was the only thing I found that works. Note that its yield is never reached, so it produces no values at all.
All the code was tested on PHP 5.5.9, BTW
The best fix I can come up with (seeing as compatibility is an issue) would be to make both arguments optional:
function merge(\Generator $carry = null, array $additional = array())
{
if ($carry)
foreach ($carry as $item)
yield $item;
foreach ($additional as $item)
yield $item;
}
foreach(merge(null, [1,2]) as $item)
var_dump($item);
This way, existing code won't break, and instead of constructing an empty generator, passing null will work just fine, too.

How to check if variable is array?... or something array-like

I want to use a foreach loop with a variable, but this variable can be many different types, NULL for example.
So before foreach I test it:
if(is_array($var)){
foreach($var as ...
But I realized that it can also be a class that implements Iterator interface. Maybe I am blind but how to check whether the class implements interface? Is there something like is_a function or inherits operator? I found class_implements, I can use it, but maybe there is something simpler?
And second, more important: supposing such a function exists, would it be enough to check whether the variable is_array or "implements Iterator interface", or should I test for something more?
If you are using foreach inside a function and you are expecting an array or a Traversable object you can type hint that function with:
function myFunction(array $a)
function myFunction(Traversable $a)
If you are not using foreach inside a function or you are expecting both you can simply use this construct to check if you can iterate over the variable:
if (is_array($a) or ($a instanceof Traversable))
foreach can handle arrays and objects. You can check this with:
$can_foreach = is_array($var) || is_object($var);
if ($can_foreach) {
foreach ($var as ...
}
You don't need to specifically check for Traversable as others have hinted it in their answers, because all objects - like all arrays - are traversable in PHP.
More technically:
foreach works with all kinds of traversables, i.e. with arrays, with plain objects (where the accessible properties are traversed) and Traversable objects (or rather objects that define the internal get_iterator handler).
(source)
Simply said in common PHP programming, whenever a variable is
an array
an object
and is not
NULL
a resource
a scalar
you can use foreach on it.
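A quick sketch of what this means in practice: foreach over a plain object walks its accessible properties, even though such an object is not Traversable. The $point object is made up for illustration, and is_iterable() requires PHP 7.1+.

```php
<?php
$point = new stdClass();
$point->x = 1;
$point->y = 2;

// foreach on a plain object iterates over its public properties.
$seen = [];
foreach ($point as $name => $value) {
    $seen[$name] = $value;
}

var_dump($seen);
var_dump($point instanceof Traversable); // a plain object is not Traversable
var_dump(is_iterable($point));           // so is_iterable() rejects it as well
```

So is_object() answers "will foreach accept this?", while Traversable and is_iterable() answer the narrower question "was this object designed to be iterated?".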
You can check for an instance of Traversable with a simple function. This works for all kinds of Iterator because Iterator extends Traversable.
function canLoop($mixed) {
return is_array($mixed) || $mixed instanceof Traversable ? true : false;
}
<?php
$var = new ArrayIterator();
var_dump(is_array($var), ($var instanceof ArrayIterator));
outputs bool(false) and bool(true)
PHP 7.1.0 has introduced the iterable pseudo-type and the is_iterable() function, which is specially designed for such a purpose:
This […] proposes a new iterable pseudo-type. This type is analogous to callable, accepting multiple types instead of one single type.
iterable accepts any array or object implementing Traversable. Both of these types are iterable using foreach and can be used with yield from within a generator.
function foo(iterable $iterable) {
foreach ($iterable as $value) {
// ...
}
}
This […] also adds a function is_iterable() that returns a boolean: true if a value is iterable and will be accepted by the iterable pseudo-type, false for other values.
var_dump(is_iterable([1, 2, 3])); // bool(true)
var_dump(is_iterable(new ArrayIterator([1, 2, 3]))); // bool(true)
var_dump(is_iterable((function () { yield 1; })())); // bool(true)
var_dump(is_iterable(1)); // bool(false)
var_dump(is_iterable(new stdClass())); // bool(false)
You can also use the function is_array($var) to check if the passed variable is an array:
<?php
var_dump( is_array(array()) ); // true
var_dump( is_array(array(1, 2, 3)) ); // true
var_dump( is_array($_SERVER) ); // true
?>
Read more in How to check if a variable is an array in PHP?
Functions
<?php
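/**
* Is Array? (a sequential list, i.e. not associative)
* @param mixed $x
* @return bool
*/
function isArray($x) : bool {
    // Non-arrays are rejected explicitly; isAssociative() alone returns false for them.
    return is_array($x) && !isAssociative($x);
}
/**
* Is Associative Array?
* @param mixed $array
* @return bool
*/
function isAssociative($array) : bool {
    if (!is_array($array)) {
        return false;
    }
    // Any missing index in 0..count-1 means the keys are not sequential.
    $i = count($array);
    while ($i > 0) {
        if (!isset($array[--$i])) {
            return true;
        }
    }
    return false;
}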
Example
<?php
$arr = [ 'foo', 'bar' ];
$obj = [ 'foo' => 'bar' ];
var_dump(isAssociative($arr));
# bool(false)
var_dump(isAssociative($obj));
# bool(true)
var_dump(isArray($obj));
# bool(false)
var_dump(isArray($arr));
# bool(true)
Since PHP 7.1 there is a pseudo-type iterable for exactly this purpose. Type-hinting iterable accepts any array as well as any implementation of the Traversable interface. PHP 7.1 also introduced the function is_iterable(). For older versions, see other answers here for accomplishing the equivalent type enforcement without the newer built-in features.
Fair play: As BlackHole pointed out, this question appears to be a duplicate of Iterable objects and array type hinting? and his or her answer goes into further detail than mine.

PHP Variable Scope and "foreach"

My application is building PDF documents. It uses scripts to produce each page's HTML.
The PDF-Generating class is "Production", and page class is "Page".
class Production
{
private $_pages; // an array of "Page" objects that the document is composed of
public function getPages()
{
return $this->_pages;
}
public function render()
{
foreach($this->_pages as $page) {
$pageHtml = $page->getHtml($this); // Page takes a pointer to production to access some of its data.
}
}
}
Here is the Page class summary:
class Page
{
private $scriptPath; // Path to Script File (PHP)
public function getHtml(Production &$production)
{
$view = new Zend_View();
$view->production = $production;
return $view->render($this->scriptPath);
}
}
I've encountered a problem when coding the Table of Contents. It accesses Production, gets all the pages, queries them, and builds the TOC based on page titles:
// TableOfContents.php
// "$this" refers to Zend_View from Pages->getHtml();
$pages = $this->production->getPages();
foreach($pages as $page) {
// Populate TOC
// ...
// ...
}
What happens is that foreach inside the TableOfContents.php is interfering with foreach in Production. Production foreach loop is terminated at Index page (which is actually a second page in the document, after the cover page).
The Document Layout is like so:
1) Cover Page
2) Table of Contents
3) Page A
4) Page B
5) Page C
TableOfContents.php, in its foreach loop, goes through the pages as required and builds an index of the entire document, but the loop in Production terminates at Table of Contents and does not proceed to render Pages A, B and C.
If I remove foreach from TableOfContents.php, all consecutive pages are rendered appropriately.
I feel it's a problem with the pointer and variable scope, so what can I do to fix it?
Diagnosis
I suspect the problem is that $_pages isn't a normal PHP array, but instead an object which happens to implement the Iterator interface. Because of this, the "state" of the foreach loop is stored on the object itself, which means that the two loops are conflicting.
If $_pages was a plain array, then there would be no problem, since the line $pages = $this->production->getPages(); would make a copy, since PHP arrays are copied on assignment (unlike objects) and also because nested foreach loops on a normal array don't have that problem. (Presumably from some internal array copy/assignment logic.)
Solution
The "fast and dirty" fix is to avoid foreach loops, but I think that would be both annoying and be a cause of future bugs because it's very easy to forget that $_pages needs super-special treatment.
For a real fix, I suggest looking at whatever class is behind the object in $_pages and see if you can change that class. Instead of having $_pages be the Iterator, change $_pages so that it provides iterators through the IteratorAggregate interface.
That way every foreach loop asks for a separate iterator object and maintains separate state.
Here is a sample script to illustrate the problem, partially cribbed from the PHP reference pages:
<?php
class MyIterator implements Iterator
{
private $var = array();
public function __construct($array)
{
if (is_array($array)) {
$this->var = $array;
}
}
public function rewind()
{
reset($this->var);
}
public function current()
{
$var = current($this->var);
return $var;
}
public function key()
{
$var = key($this->var);
return $var;
}
public function next()
{
$var = next($this->var);
return $var;
}
public function valid()
{
$key = key($this->var);
$var = ($key !== NULL && $key !== FALSE);
return $var;
}
}
// END BOILERPLATE DEFINITION OF ITERATOR, START OF INTERESTING PART
function getMyArrayThingy(){
/*
* Hey, let's conveniently give them an object that
* behaves like an array. It'll be convenient!
* Nothing could possibly go wrong, right?
*/
return new MyIterator(array("a","b","c"));
}
// $arr = array("a","b","c"); // This is old code. It worked fine. Now we'll use the new convenient thing!
$arr = getMyArrayThingy();
// We expect this code to output nine lines, showing all combinations of a,b,c
foreach($arr as $item){
foreach($arr as $item2){
echo("$item, $item2\n");
}
}
/*
* Oh no! It printed only a,a and a,b and a,c!
* The outer loop exited too early because the counter
* was set to C from the inner loop.
*/
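For contrast, here is a minimal sketch of the IteratorAggregate approach suggested above. PageCollection is a hypothetical class name; the point is only that getIterator() hands every foreach loop its own iterator, so nested loops no longer share one cursor.

```php
<?php
// Hypothetical collection class: it provides iterators instead of being one.
final class PageCollection implements IteratorAggregate
{
    private $items;

    public function __construct(array $items)
    {
        $this->items = $items;
    }

    // Each call returns a fresh iterator with its own position.
    public function getIterator(): Iterator
    {
        return new ArrayIterator($this->items);
    }
}

$arr = new PageCollection(array("a", "b", "c"));

$lines = [];
foreach ($arr as $item) {
    foreach ($arr as $item2) {
        $lines[] = "$item, $item2";
    }
}

echo implode("\n", $lines), "\n";
```

With this change the nested loops should print all nine combinations.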
I'm not sure I understand your problem, but you may want to look at the PHP function reset =)
The solution was to avoid using foreach and use conventional loops, as suggested here:
nested foreach in PHP problem
