How should PHP's Iterator methods valid(), current(), and next() behave? - php

I am using PHP 5.3.6 from MAMP.
I have a use case where it would be best to use PHP's Iterator interface methods, next(), current(), and valid() to iterate through a collection. A foreach loop will NOT work for me in my particular situation. A simplified while loop might look like
<?php
while ($iter->valid()) {
// do something with $iter->current()
$iter->next();
}
Should the above code always work when $iter implements PHP's Iterator interface? How does PHP's foreach keyword deal with Iterators?
The reason I ask is that the code I am writing may be given an ArrayIterator or a MongoCursor. Both implement PHP's Iterator interface but they behave differently. I would like to know if there is a bug in PHP or PHP's Mongo extension.
ArrayIterator::valid() returns true before any call to next() -- immediately after the ArrayIterator is created.
MongoCursor::valid() only returns true after the first call to next(). Therefore the while loop above will never execute.
At risk of being verbose, the following code demonstrates these assertions:
<?php
// Set up array iterator
$arr = array("first");
$iter = new \ArrayIterator($arr);
// Test array iterator
echo(($iter->valid() ? "true" : "false")."\n"); // Echoes true
var_dump($iter->current()."\n"); // "first"
$iter->next();
echo(($iter->valid() ? "true" : "false")."\n"); // Echoes false
// Set up mongo iterator
$m = new \Mongo();
$collection = $m->selectDB("iterTest")->selectCollection("mystuff");
$collection->drop(); // Ensure collection is empty
$collection->insert(array('a' => 'b'));
$miter = $collection->find(); // should find one object
// Test mongo iterator
echo(($miter->valid() ? "true" : "false")."\n"); // Echoes false
$miter->next();
echo(($miter->valid() ? "true" : "false")."\n"); // Echoes true
var_dump($miter->current()); // Array(...)
Which implementation is correct? I found little documentation to support either behavior, and the official PHP documentation is either ambiguous or I'm reading it wrong. The doc for Iterator::valid() states:
This method is called after Iterator::rewind() and Iterator::next() to check if the current position is valid.
This would suggest that my while loop should first call next().
Yet the PHP documentation for Iterator::next states:
This method is called after each foreach loop.
This would suggest that my while loop is correct as written.
To summarize - how should PHP iterators behave?

This is an interesting question. I'm not sure why a foreach won't work for you, but I have some ideas.
Take a look at the example given on the Iterator interface reference page. It shows the order in which PHP's internal implementation of foreach calls the Iterator methods. In particular, notice that when the foreach is first set up, the very first call is to rewind(). This example, though it's not well-annotated, is the basis for my answer.
I'm not sure why a MongoCursor would not return true for valid() until after next() is called, but you should be able to reset either type of object by calling rewind() prior to your loop. So you would have:
// $iter may be either MongoCursor or ArrayIterator
$iter->rewind();
while( $iter->valid() ){
// do something with $iter->current()
$iter->next();
}
I believe this should work for you. If it does not, the Mongo class may have a bug in it.
Edit: Mike Purcell's answer correctly calls out that ArrayIterator and Iterator are not the same. However, ArrayIterator implements Iterator, so you should be able to use rewind() as I show above on either of them.
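To illustrate why the rewind() call matters, here is a minimal sketch. LazyCursor is a hypothetical stand-in whose valid() stays false until iteration has been started; it is not MongoCursor and makes no claim about how the Mongo extension is implemented. collect() is likewise an illustrative name, and the type declarations assume PHP 8.0 or later.

```php
<?php
// Hypothetical cursor: valid() stays false until rewind() or next() is called.
// It only mimics the behaviour described in the question; it is not MongoCursor.
final class LazyCursor implements Iterator
{
    private ?int $pos = null; // null = iteration not started yet

    public function __construct(private array $rows) {}

    public function rewind(): void { $this->pos = 0; }
    public function valid(): bool
    {
        return $this->pos !== null && $this->pos < count($this->rows);
    }
    public function current(): mixed { return $this->rows[$this->pos]; }
    public function key(): mixed { return $this->pos; }
    public function next(): void { $this->pos = ($this->pos ?? -1) + 1; }
}

// Walks any Iterator the way foreach does: rewind() first, then valid()/current()/next().
function collect(Iterator $iter): array
{
    $out = [];
    for ($iter->rewind(); $iter->valid(); $iter->next()) {
        $out[] = $iter->current();
    }
    return $out;
}

$lazy = new LazyCursor(['a', 'b']);
var_dump($lazy->valid());                        // bool(false): not started yet
print_r(collect($lazy));                         // both rows, because rewind() ran first
print_r(collect(new ArrayIterator(['first'])));  // same loop, different iterator
```

Because collect() always starts with rewind(), it does not depend on what valid() reports for a freshly constructed iterator.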

Subclassing any Iterator and echoing each time one of its methods is called will tell you how it behaves.
Example (demo)
class MyArrayIterator extends ArrayIterator
{
public function __construct ($array)
{
echo __METHOD__, PHP_EOL;
parent::__construct($array);
}
…
}
foreach (new MyArrayIterator(range(1,3)) as $k => $v) {
echo "$k => $v", PHP_EOL;
}
Output
MyArrayIterator::__construct
MyArrayIterator::rewind
MyArrayIterator::valid
MyArrayIterator::current
MyArrayIterator::key
0 => 1
MyArrayIterator::next
MyArrayIterator::valid
MyArrayIterator::current
MyArrayIterator::key
1 => 2
MyArrayIterator::next
MyArrayIterator::valid
MyArrayIterator::current
MyArrayIterator::key
2 => 3
MyArrayIterator::next
MyArrayIterator::valid
This is equivalent to doing
$iterator = new MyArrayIterator(range(1,3));
for ($iterator->rewind(); $iterator->valid(); $iterator->next()) {
echo "{$iterator->key()} => {$iterator->current()}", PHP_EOL;
}
The sequence in which the methods are called is identical to a custom Iterator:
class MyIterator implements Iterator
{
protected $iterations = 0;
public function current()
{
echo __METHOD__, PHP_EOL;
return $this->iterations;
}
public function key ()
{
echo __METHOD__, PHP_EOL;
return $this->iterations;
}
public function next ()
{
echo __METHOD__, PHP_EOL;
return $this->iterations++;
}
public function rewind ()
{
echo __METHOD__, PHP_EOL;
return $this->iterations = 0;
}
public function valid ()
{
echo __METHOD__, PHP_EOL;
return $this->iterations < 3;
}
}
foreach (new MyIterator as $k => $v) {
echo "$k => $v", PHP_EOL;
}

It seems you may have mixed up Iterator and ArrayIterator; each documents its own valid() method.
ArrayIterator: the documentation does not mention next() specifically, but its example demonstrates what you describe in your question: no other ArrayIterator call is needed before valid() can tell you whether the current element exists.
Iterator: as you mention, "This method is called after Iterator::rewind() and Iterator::next() to check if the current position is valid."
As such, I would stick with ArrayIterator, as it demonstrates the most correct behavior: valid() correctly determines whether the current element of the array is valid without requiring another call (next() or rewind()) first.
If you want Mongo to behave like ArrayIterator does, you could add an instance check before starting the while loop:
if ($iter instanceof MongoCursor) {
$iter->next(); // advance once so that valid() can return true
}
while ($iter->valid()) {
// Do stuff
$iter->next();
}


Encoding clone $this in JsonSerializable

This simplified case is resulting in a PHP segfault (exit 127):
class Datum implements \JsonSerializable{
public function jsonSerialize(){
return clone $this;
}
}
echo json_encode(new Datum);
The last line of code results in exit(127). I'm unable to retrieve any stack in my current environment.
Meanwhile, removing the clone token works.
Is there any possible explanation why this is happening?
This code results in an infinite recursion.
It appears that the PHP JSON module supports JsonSerializable in this manner (pseudocode):
function json_encode($data){
if($data instanceof JsonSerializable) return json_encode($data->jsonSerialize());
else return real_json_encode($data); // handling primitive data or arrays or pure data objects
}
If you return yet another instance of JsonSerializable, json_encode is going to try to serialize it again, resulting in an infinite recursion.
Returning $this works, however, because json_encode's implementation deliberately special-cases it: when the returned object is identical to the original, it goes straight to the real encoding step. That does not happen for a cloned object, since $a !== clone $a.
References
This answer can be supported by reference from the php-src.
// in php_json_encode_zval
if (instanceof_function(Z_OBJCE_P(val), php_json_serializable_ce)) {
return php_json_encode_serializable_object(buf, val, options, encoder);
}
// in php_json_encode_serializable_object
if ((Z_TYPE(retval) == IS_OBJECT) &&
(Z_OBJ(retval) == Z_OBJ_P(val))) {
/* Handle the case where jsonSerialize does: return $this; by going straight to encode array */
PHP_JSON_HASH_APPLY_PROTECTION_DEC(myht);
return_code = php_json_encode_array(buf, &retval, options, encoder);
} else {
/* All other types, encode as normal */
return_code = php_json_encode_zval(buf, &retval, options, encoder);
PHP_JSON_HASH_APPLY_PROTECTION_DEC(myht);
}
These snippets prove that PHP would encode return $this; as an array (or as a non-serializable object), while returning anything else makes Z_OBJ(retval) == Z_OBJ_P(val) false, going to the else block which recursively calls php_json_encode_zval again.
TL;DR, Simple solution: return (array) $this; instead of clone $this;.
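As a minimal sketch of that fix, the class below returns plain data from jsonSerialize(), so the encoder has no further JsonSerializable object to recurse into. The two properties are hypothetical, added only so there is something to encode, and the declarations assume PHP 8.0 or later. get_object_vars($this) is used instead of the (array) cast because it keeps plain property names even for private properties.

```php
<?php
final class Datum implements JsonSerializable
{
    // Hypothetical properties, for illustration only.
    public function __construct(
        public string $name = 'x',
        public int $value = 1
    ) {}

    public function jsonSerialize(): mixed
    {
        // Plain array, not an object implementing JsonSerializable,
        // so json_encode() does not call jsonSerialize() again.
        return get_object_vars($this);
    }
}

echo json_encode(new Datum('a', 2)), PHP_EOL; // {"name":"a","value":2}
```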

PHP SeekableIterator : catch OutOfBoundsException or check valid() method?

So I'm not sure if this is buggy design with PHP, or if there is an understood logic to handling inconsistent outcomes for the same interface.
The SeekableIterator interface has two methods (seek and valid) that are either in conflict with one another or should be working consistently with each other, but I'm seeing both.
The documentation for the interface says that seek should throw an exception of class OutOfBoundsException, but this seems to negate the usefulness of valid unless the iterator position is updated (making valid return false) before throwing the exception (which apparently must be caught).
Three test examples
Example 1.
Custom class implementing SeekableIterator, as provided by example in docs:
The class:
class MySeekableIterator implements SeekableIterator {
private $position;
private $array = array(
"first element",
"second element",
"third element",
"fourth element"
);
/* Method required for SeekableIterator interface */
public function seek($position) {
if (!isset($this->array[$position])) {
throw new OutOfBoundsException("invalid seek position ($position)");
}
$this->position = $position;
}
/* Methods required for Iterator interface */
public function rewind() {
$this->position = 0;
}
public function current() {
return $this->array[$this->position];
}
public function key() {
return $this->position;
}
public function next() {
++$this->position;
}
public function valid() {
return isset($this->array[$this->position]);
}
}
Example 1. Test :
echo PHP_EOL . "Custom Seekable Iterator seek Test" . PHP_EOL;
$it = new MySeekableIterator;
$it->seek(1);
try {
$it->seek(10);
echo $it->key() . PHP_EOL;
echo "Is valid? " . (int) $it->valid() . PHP_EOL;
} catch (OutOfBoundsException $e) {
echo $e->getMessage() . PHP_EOL;
echo $it->key() . PHP_EOL; // outputs previous position (1)
echo "Is valid? " . (int) $it->valid() . PHP_EOL;
}
Test 1 Output:
Custom Seekable Iterator seek Test
invalid seek position (10)
1
Is valid? 1
Example 2:
Using native ArrayIterator::seek
Test 2 Code:
echo PHP_EOL . "Array Object Iterator seek Test" . PHP_EOL;
$array = array('1' => 'one',
'2' => 'two',
'3' => 'three');
$arrayobject = new ArrayObject($array);
$iterator = $arrayobject->getIterator();
$iterator->seek(1);
try {
$iterator->seek(5);
echo $iterator->key() . PHP_EOL;
echo "Is valid? " . (int) $iterator->valid() . PHP_EOL;
} catch (OutOfBoundsException $e) {
echo $e->getMessage() . PHP_EOL;
echo $iterator->key() . PHP_EOL; // outputs previous position (1)
echo "Is valid? " . (int) $iterator->valid() . PHP_EOL;
}
Test 2 Output:
Array Object Iterator seek Test
Seek position 5 is out of range
1
Is valid? 1
Example 3:
Using native DirectoryIterator::seek
Test 3 Code:
echo PHP_EOL . "Directory Iterator seek Test" . PHP_EOL;
$dir_iterator = new DirectoryIterator(dirname(__FILE__));
$dir_iterator->seek(1);
try {
$dir_iterator->seek(500); // arbitrarily high seek position
echo $dir_iterator->key() . PHP_EOL;
echo "Is valid? " . (int) $dir_iterator->valid() . PHP_EOL;
} catch (OutOfBoundsException $e) {
echo $e->getMessage() . PHP_EOL;
echo $dir_iterator->key() . PHP_EOL;
echo "Is valid? " . (int) $dir_iterator->valid() . PHP_EOL;
}
Test 3 Output:
Directory Iterator seek Test
90
Is valid? 0
So how would one reasonably expect to know whether to use valid() to confirm valid position after seek($position) while also anticipating that the seek() might throw an Exception instead of updating the position, so that valid() returns true?
It seems that DirectoryIterator::seek() is not implemented with an exception. Instead it returns nothing and leaves it to valid() to report the result.
Your other example, ArrayIterator::seek(), does work "correctly" and throws an OutOfBoundsException.
The reasoning is simple: the ArrayIterator (and most likely most custom implementations too) knows beforehand how many elements it contains, and thus can quickly check its bounds. The DirectoryIterator, however, must read the directory entries from disk one by one in order to reach the given position. It does so by literally calling valid() and next() in a loop. This is why key() has changed and valid() returns 0.
The other iterators do not even touch the current iterator state, and can quickly decide whether your request falls within their range.
On a side note: if you seek backwards in a DirectoryIterator, it resets the iterator first and then iterates over each element again. So if you are at position 1000 and call $it->seek(999), it will actually iterate over 999 elements again.
IMHO, the DirectoryIterator is not a good implementation of the SeekableIterator interface. The interface is intended for quickly jumping to a certain element within the iterator, and clearly that is not doable with a DirectoryIterator. Instead, a full iteration must be done, which results in a changed iterator state.
The SeekableIterator interface is useful for filter iterators that do something with a range of the inner iterator. Within the SPL, this is only the LimitIterator. When you do:
$it = new ArrayIterator(range('a','z'));
$it = new LimitIterator($it, 5, 10);
and the LimitIterator detects that the given iterator implements the SeekableIterator interface, it calls seek() to jump straight to offset 5; otherwise it just iterates until it reaches that offset.
Conclusion: do not implement SeekableIterator when you cannot quickly jump to a position or check bounds. At best you gain nothing; at worst you get iterators that change state without you knowing why.
To answer your question: seek() should throw an exception and not change state. The DirectoryIterator (and maybe some others) should be changed either to not implement SeekableIterator or to find out how many entries there are prior to seek() (but that doesn't fix the "rewind" problem when seeking backwards).
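Until the implementations agree, calling code can defend against both conventions. The helper below is a sketch under that assumption; seekOrFalse() is a hypothetical name, and the type declarations assume PHP 7.0 or later. It treats a thrown OutOfBoundsException and an invalid position after the call as the same failure.

```php
<?php
// Returns true only if the iterator ends up on a valid element at $position.
function seekOrFalse(SeekableIterator $it, int $position): bool
{
    try {
        $it->seek($position);
    } catch (OutOfBoundsException $e) {
        // Convention 1: the iterator reports a bad position by throwing.
        return false;
    }
    // Convention 2: the iterator moved silently and is now simply invalid.
    return $it->valid();
}

$it = new ArrayIterator(['a', 'b', 'c']);
var_dump(seekOrFalse($it, 1)); // bool(true)
var_dump(seekOrFalse($it, 5)); // bool(false)
```

After a failed seek the iterator's position may or may not have changed, as the three tests in the question show, so callers that care should rewind() or seek to a known-good position afterwards.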

Fast check if an object will be successfully instantiated in PHP?

How can I check if an object will be successfully instantiated with the given argument, without actually creating the instance?
Currently I'm only checking the number of required parameters, ignoring types (I haven't tested this code, but it should work):
// Filter definition and arguments as per configuration
$filter = $container->getDefinition($serviceId);
$args = $activeFilters[$filterName];
// Check number of required arguments vs arguments in config
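$constructor = $reflector->getConstructor(); // null when the class declares no constructor
$numRequired = $constructor ? $constructor->getNumberOfRequiredParameters() : 0;
$numSpecified = is_array($args) ? count($args) : 1;
// Fail when the config supplies fewer arguments than the constructor requires
if ($numSpecified < $numRequired) {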
throw new InvalidFilterDefinitionException(
$serviceId,
$numRequired,
$numSpecified
);
}
EDIT: $constructor can be null...
The short answer is that you simply cannot determine if a set of arguments will allow error-free instantiation of a constructor. As commenters have mentioned above, there's no way to know for sure if a class can be instantiated with a given argument list because there are runtime considerations that cannot be known without actually attempting instantiation.
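The following sketch shows both halves of that point: a cheap arity pre-check through reflection, and an actual instantiation attempt as the only conclusive test. SampleFilter, hasEnoughArguments() and tryMake() are hypothetical names introduced for illustration, and the code assumes PHP 7.2 or later.

```php
<?php
// Hypothetical class used only to exercise the checks below.
final class SampleFilter
{
    public function __construct(string $field, int $limit = 10) {}
}

// True when $args supplies at least as many values as the constructor requires.
// This says nothing about types or about exceptions thrown by the constructor.
function hasEnoughArguments(string $class, array $args): bool
{
    $ctor = (new ReflectionClass($class))->getConstructor();
    $required = $ctor === null ? 0 : $ctor->getNumberOfRequiredParameters();
    return count($args) >= $required;
}

// Attempts the instantiation and returns null on any failure.
function tryMake(string $class, array $args): ?object
{
    if (!hasEnoughArguments($class, $args)) {
        return null;
    }
    try {
        return (new ReflectionClass($class))->newInstanceArgs($args);
    } catch (Throwable $e) {
        return null; // type errors and constructor exceptions end up here
    }
}

var_dump(hasEnoughArguments('SampleFilter', []));      // bool(false)
var_dump(tryMake('SampleFilter', ['name']) !== null);  // bool(true)
var_dump(tryMake('SampleFilter', [[], 5]) === null);   // bool(true): an array is not a string
```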
However, there is value in trying to instantiate a class from a list of constructor arguments. The most obvious use-case for this sort of operation is a configurable Dependency Injection Container (DIC). Unfortunately, this is a much more complicated operation than the OP suggests.
We need to determine for each argument in a supplied definition array whether or not it matches specified type-hints from the constructor method signature (if the method signature actually has type-hints). Also, we need to resolve how to treat default argument values. Additionally, for our code to be of any real use we need to allow the specification of "definitions" ahead of time for instantiating a class. A sophisticated treatment of the problem will also involve a pool of reflection objects (caching) to minimize the performance impact of repeatedly reflecting things.
Another hurdle is the fact that there's no way to access the type-hint of a reflected method parameter without calling its ReflectionParameter::getClass method and subsequently instantiating a reflection class from the returned class name (if null is returned the param has no type-hint). This is where caching generated reflections becomes particularly important for any real-world use-case.
The code below is a severely stripped-down version of my own string-based recursive dependency injection container. It's a mixture of pseudo-code and real-code (if you were hoping for free code to copy/paste you're out of luck). You'll see that the code below matches the associative array keys of "definition" arrays to the parameter names in the constructor signature.
The real code can be found over at the relevant github project page.
class Provider {
private $definitions;
public function define($class, array $definition) {
$class = strtolower($class);
$this->definitions[$class] = $definition;
}
public function make($class, array $definition = null) {
$class = strtolower($class);
if (is_null($definition) && isset($this->definitions[$class])) {
$definition = $this->definitions[$class];
}
$reflClass = new ReflectionClass($class);
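// Pass the definition along; cast so that a missing definition becomes array()
$instanceArgs = $this->buildNewInstanceArgs($reflClass, (array) $definition);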
return $reflClass->newInstanceArgs($instanceArgs);
}
private function buildNewInstanceArgs(
ReflectionClass $reflClass,
array $definition
) {
$instanceArgs = array();
$reflCtor = $reflClass->getConstructor();
// IF no constructor exists we're done and should just
// return a new instance of $class:
// return $this->make($reflClass->name);
// otherwise ...
$reflCtorParams = $reflCtor->getParameters();
foreach ($reflCtorParams as $ctorParam) {
if (isset($definition[$ctorParam->name])) {
$instanceArgs[] = $this->make($definition[$ctorParam->name]);
continue;
}
$typeHint = $this->getParameterTypeHint($ctorParam);
if ($typeHint && $this->isInstantiable($typeHint)) {
// The typehint is instantiable, go ahead and make a new
// instance of it
$instanceArgs[] = $this->make($typeHint);
} elseif ($typeHint) {
// The typehint is abstract or an interface. We can't
// proceed because we already know we don't have a
// definition telling us which class to instantiate
throw new Exception("No definition for non-instantiable type-hint: $typeHint");
} elseif ($ctorParam->isDefaultValueAvailable()) {
// No typehint, try to use the default parameter value
$instanceArgs[] = $ctorParam->getDefaultValue();
} else {
// If all else fails, try passing in a NULL or something
$instanceArgs[] = NULL;
}
}
return $instanceArgs;
}
private function getParameterTypeHint(ReflectionParameter $param) {
// ... see the note about retrieving parameter typehints
// in the exposition ...
}
private function isInstantiable($class) {
// determine if the class typehint is abstract/interface
// RTM on reflection for how to do this
}
}

PHP @ foreach warning

I have a PHP foreach over an array. The array is given to me by my DB provider via a SOAP web service, so I cannot change the array I get. When there are no elements to return, I get an empty array, which results in
Warning: Invalid argument supplied for foreach()
the loop looks like
foreach (($point1['return']) as $val)
Where can I put an @ to stop this warning, and if I can't, what do I do to turn off PHP warnings?
Hiding the warning is not the right way. You should check whether it exists and is an array.
if (is_array($point1['return'])) {
foreach ($point1['return'] as $val) {
...
}
}
PHP is_array()
Actually, turning off warnings or using the @ operator is not the right way to go 99% of the time.
Solve the problem instead of hiding it.
foreach() can handle not only arrays but also objects, by using either the default "all visible properties" implementation or a custom implementation via the Traversable/Iterator interface.
And a "DB provider via a soap web service" is something where I'd keep an eye on the possibility of (suddenly) having an object/iterator instead of a plain array.
So, if you're going to test the existence and data type before passing the variable to foreach, you should consider not only testing for is_array() but also for instanceof Traversable.
<?php
class Foo implements Iterator {
protected $v = 0;
public function current() { return $this->v; }
public function key() { return $this->v; }
public function next() { ++$this->v; }
public function rewind() { $this->v=0; }
public function valid() { return 10>$this->v; }
}
//$a = array(1,2,3,4);
$a = new Foo;
if( is_array($a) || $a instanceof Traversable ) {
foreach($a as $e) {
echo $e, "\n";
}
}
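On PHP 7.1 and later the same combined test is available as the built-in is_iterable(). The sketch below wraps it in a helper; valuesOf() is a hypothetical name used only for illustration.

```php
<?php
// Collects the values of anything foreach can walk as an array or Traversable;
// returns an empty array for null, strings, and other non-iterable input.
function valuesOf($maybeList): array
{
    $out = [];
    if (is_iterable($maybeList)) { // true for arrays and Traversable objects
        foreach ($maybeList as $value) {
            $out[] = $value;
        }
    }
    return $out;
}

print_r(valuesOf([1, 2]));                  // 1, 2
print_r(valuesOf(new ArrayIterator([3])));  // 3
print_r(valuesOf(null));                    // empty array, and no warning
```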
An empty array does not cause that error; the problem is that you are trying to iterate through something that is not an array. You could add a check using the is_array() function.
Better to let errors display but check that the input is an array first. So you could wrap the foreach in an if, like this:
if (is_array($point1) && is_array($point1['return'])) {
foreach ($point1['return'] as $val) {
...
}
}
Check first for an array:
if(is_array($point1['return']))
{
...
}
You can also explicitly cast the argument to array:
foreach ((array) $point1['return'] as $val) {
Note: this still will issue undefined index, if there is no 'return' key in $point1
Check whether that is actually an array, using is_array().
There's no need to suppress the warning.
As a matter of fact, it's not possible to suppress that invalid argument warning with @, because foreach is a statement and @ applies only to expressions.
Paste this into your function file:
set_error_handler(function($errno, $errstr){
if(stristr($errstr,'Invalid argument supplied for foreach()')){
return true;
}
return false;
});

How do I implement a callback in PHP?

How are callbacks written in PHP?
The manual uses the terms "callback" and "callable" interchangeably, however, "callback" traditionally refers to a string or array value that acts like a function pointer, referencing a function or class method for future invocation. This has allowed some elements of functional programming since PHP 4. The flavors are:
$cb1 = 'someGlobalFunction';
$cb2 = ['ClassName', 'someStaticMethod'];
$cb3 = [$object, 'somePublicMethod'];
// this syntax is callable since PHP 5.2.3 but a string containing it
// cannot be called directly
$cb2 = 'ClassName::someStaticMethod';
$cb2(); // fatal error before PHP 7.0
// legacy syntax for PHP 4
$cb3 = array(&$object, 'somePublicMethod');
This is a safe way to use callable values in general:
if (is_callable($cb2)) {
// Autoloading will be invoked to load the class "ClassName" if it's not
// yet defined, and PHP will check that the class has a method
// "someStaticMethod". Note that is_callable() will NOT verify that the
// method can safely be executed in static context.
$returnValue = call_user_func($cb2, $arg1, $arg2);
}
Modern PHP versions allow the first three formats above to be invoked directly as $cb(). call_user_func and call_user_func_array support all the above.
See: http://php.net/manual/en/language.types.callable.php
Notes/Caveats:
If the function/class is namespaced, the string must contain the fully-qualified name. E.g. ['Vendor\Package\Foo', 'method']
call_user_func does not support passing non-objects by reference, so you can either use call_user_func_array or, in later PHP versions, save the callback to a var and use the direct syntax: $cb();
Objects with an __invoke() method (including anonymous functions) fall under the category "callable" and can be used the same way, but I personally don't associate these with the legacy "callback" term.
The legacy create_function() creates a global function and returns its name. It's a wrapper for eval() and anonymous functions should be used instead.
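The flavors above can be exercised side by side in a short sketch. shout() and the Text class are hypothetical names introduced only for this example, and the scalar type declarations assume PHP 7.0 or later.

```php
<?php
function shout(string $s): string { return strtoupper($s); }

final class Text
{
    public static function reverse(string $s): string { return strrev($s); }
    public function wrap(string $s): string { return "[$s]"; }
}

$callbacks = [
    'shout',                                           // global function name
    ['Text', 'reverse'],                               // static method
    [new Text, 'wrap'],                                // instance method
    function (string $s): string { return $s . '!'; }, // anonymous function
];

$results = [];
foreach ($callbacks as $cb) {
    if (is_callable($cb)) {
        $results[] = call_user_func($cb, 'abc');
    }
}

print_r($results); // ABC, cba, [abc], abc!
```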
With PHP 5.3, you can now do this:
function doIt($callback) { $callback(); }
doIt(function() {
// this will be done
});
Finally a nice way to do it. A great addition to PHP, because callbacks are awesome.
Implementation of a callback is done like so
// This function uses a callback function.
function doIt($callback)
{
$data = "this is my data";
$callback($data);
}
// This is a sample callback function for doIt().
function myCallback($data)
{
print 'Data is: ' . $data . "\n";
}
// Call doIt() and pass our sample callback function's name.
doIt('myCallback');
Displays: Data is: this is my data
One nifty trick that I've recently found is to use PHP's create_function() to create an anonymous/lambda function for one-shot use. It's useful for PHP functions like array_map(), preg_replace_callback(), or usort() that use callbacks for custom processing. It looks pretty much like it does an eval() under the covers, but it's still a nice functional-style way to use PHP.
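For comparison, here is the same one-shot style written with anonymous functions, which are available from PHP 5.3. create_function() itself was deprecated in PHP 7.2 and removed in PHP 8.0. The data is made up for illustration, and the type declarations assume PHP 7.0 or later.

```php
<?php
$words = ['pear', 'fig', 'banana'];

// Anonymous function as a comparison callback: order by string length.
usort($words, function (string $a, string $b): int {
    return strlen($a) - strlen($b);
});

// Anonymous function as a mapping callback.
$lengths = array_map(function (string $w): int {
    return strlen($w);
}, $words);

print_r($words);   // fig, pear, banana
print_r($lengths); // 3, 4, 6
```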
well... with 5.3 on the horizon, all will be better, because with 5.3, we'll get closures and with them anonymous functions
http://wiki.php.net/rfc/closures
You will want to verify that whatever you're calling is valid. For example, in the case of a specific function, you will want to check whether the function exists:
function doIt($callback) {
if(function_exists($callback)) {
$callback();
} else {
// some error handling
}
}
create_function did not work for me inside a class. I had to use call_user_func.
<?php
class Dispatcher {
//Added explicit callback declaration.
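public $callback;
// __construct() replaces the PHP 4-style constructor named after the class,
// which PHP 8 no longer treats as a constructor.
public function __construct( $callback ){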
$this->callback = $callback;
}
public function asynchronous_method(){
//do asynch stuff, like fwrite...then, fire callback.
if ( isset( $this->callback ) ) {
if (function_exists( $this->callback )) call_user_func( $this->callback, "File done!" );
}
}
}
Then, to use:
<?php
include_once('Dispatcher.php');
$d = new Dispatcher( 'do_callback' );
$d->asynchronous_method();
function do_callback( $data ){
print 'Data is: ' . $data . "\n";
}
?>
[Edit]
Added a missing parenthesis.
Also, added the callback declaration, I prefer it that way.
For those who don't care about breaking compatibility with PHP < 5.4, I'd suggest using type hinting to make a cleaner implementation.
function call_with_hello_and_append_world( callable $callback )
{
// No need to check $callback because of the type hint
return $callback( "hello" )."world";
}
function append_space( $string )
{
return $string." ";
}
$output1 = call_with_hello_and_append_world( function( $string ) { return $string." "; } );
var_dump( $output1 ); // string(11) "hello world"
$output2 = call_with_hello_and_append_world( "append_space" );
var_dump( $output2 ); // string(11) "hello world"
$old_lambda = create_function( '$string', 'return $string." ";' );
$output3 = call_with_hello_and_append_world( $old_lambda );
var_dump( $output3 ); // string(11) "hello world"
I cringe every time I use create_function() in php.
Parameters are a comma-separated string, the whole function body in a string... Argh... I think they could not have made it uglier even if they tried.
Unfortunately, it is the only choice when creating a named function is not worth the trouble.
