How to optimize an ArrayIterator implementation in PHP?

I have a long running PHP daemon with a collection class that extends ArrayIterator. This holds a set of custom Column objects, typically less than 1000. Running it through the xdebug profiler I found my find method consuming about 35% of the cycles.
How can I internally iterate over the items in an optimized way?
class ColumnCollection extends \ArrayIterator
{
public function find($name)
{
$return = null;
$name = trim(strtolower($name));
$this->rewind();
while ($this->valid()) {
/** @var Column $column */
$column = $this->current();
if (strtolower($column->name) === $name) {
$return = $column;
break;
}
$this->next();
}
$this->rewind();
return $return;
}
}

Your find() method apparently just returns the first Column object with the queried $name. In that case, it might make sense to index the array by name, i.e. store each object with its name as the key. Then your lookup becomes an O(1) call.
ArrayIterator implements ArrayAccess. This means you can add new items to your Collection like this:
$collection = new ColumnCollection;
$collection[$someCollectionObject->name] = $someCollectionObject;
and also retrieve them via the square bracket notation:
$someCollectionObject = $collection["foo"];
If you don't want to change your client code, you can simply override offsetSet in your ColumnCollection:
public function offsetSet($index, $newValue)
{
if ($index === null && $newValue instanceof Column) {
return parent::offsetSet($newValue->name, $newValue);
}
return parent::offsetSet($index, $newValue);
}
This way, doing $collection[] = $column would automatically add the $column by name. See http://codepad.org/egAchYpk for a demo.
If you use the append() method to add new elements, you just change it to:
public function append($newValue)
{
parent::offsetSet($newValue->name, $newValue);
}
However, ArrayAccess is slower than native array access, so you might want to change your ColumnCollection to something like this:
class ColumnCollection implements IteratorAggregate
{
private $columns = []; // or SplObjectStorage
public function add(Column $column) {
$this->columns[$column->name] = $column;
}
public function find($name) {
return isset($this->columns[$name]) ? $this->columns[$name] : null;
}
public function getIterator()
{
return new ArrayIterator($this->columns);
}
}
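A minimal sketch of the keyed approach, assuming lookups should stay case-insensitive like the original find(). The Column class here is a hypothetical stand-in with only a public name property, and the class name KeyedColumnCollection and the strtolower(trim()) key normalization are assumptions, not part of the original code.

```php
<?php
// Hypothetical stand-in for the real Column class.
class Column
{
    public $name;

    public function __construct($name)
    {
        $this->name = $name;
    }
}

class KeyedColumnCollection implements IteratorAggregate
{
    private $columns = [];

    // Normalize once so add() and find() always agree on the key.
    private static function normalize($name)
    {
        return strtolower(trim($name));
    }

    public function add(Column $column)
    {
        $this->columns[self::normalize($column->name)] = $column;
    }

    // Plain array lookup instead of a linear scan.
    public function find($name)
    {
        $key = self::normalize($name);
        return isset($this->columns[$key]) ? $this->columns[$key] : null;
    }

    public function getIterator(): Traversable
    {
        return new ArrayIterator($this->columns);
    }
}

$collection = new KeyedColumnCollection();
$collection->add(new Column('Id'));
$collection->add(new Column('Title'));
```

Note that with this normalization two columns whose names differ only in case would overwrite each other, so it only fits if names are unique case-insensitively.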

I replaced the iterator method calls with a loop on a copy of the array. I presume this gives direct access to the internal storage since PHP implements copy-on-write. The native foreach is much faster than calling rewind(), valid(), current(), and next(). Also pre-calculating the strtolower on the Column object helped. This got performance down from 35% of cycles to 0.14%.
public function find($name)
{
$return = null;
$name = trim(strtolower($name));
/** @var Column $column */
foreach ($this->getArrayCopy() as $column) {
if ($column->nameLower === $name) {
$return = $column;
break;
}
}
return $return;
}
I am also experimenting with @Gordon's suggestion of using an array keyed on name instead of the internal storage. The above is working well as a simple drop-in replacement.
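To check the difference on your own machine, a rough timing sketch along these lines may help. DemoColumn and DemoCollection are hypothetical stand-ins, the collection size and repeat count are arbitrary, and the numbers will vary with PHP version and hardware.

```php
<?php
// Hypothetical stand-in: caches the lowercased name once at construction.
class DemoColumn
{
    public $name;
    public $nameLower;

    public function __construct($name)
    {
        $this->name = $name;
        $this->nameLower = strtolower($name);
    }
}

class DemoCollection extends ArrayIterator
{
    // Original style: explicit iterator method calls.
    public function findSlow($name)
    {
        $name = strtolower(trim($name));
        for ($this->rewind(); $this->valid(); $this->next()) {
            if (strtolower($this->current()->name) === $name) {
                return $this->current();
            }
        }
        return null;
    }

    // Rewritten style: native foreach over a plain array copy.
    public function findFast($name)
    {
        $name = strtolower(trim($name));
        foreach ($this->getArrayCopy() as $column) {
            if ($column->nameLower === $name) {
                return $column;
            }
        }
        return null;
    }
}

$collection = new DemoCollection();
for ($i = 0; $i < 1000; $i++) {
    $collection->append(new DemoColumn("Col$i"));
}

// Time repeated worst-case lookups (the last element) for each variant.
foreach (['findSlow', 'findFast'] as $method) {
    $start = microtime(true);
    for ($i = 0; $i < 200; $i++) {
        $collection->$method('col999');
    }
    printf("%s: %.4f s\n", $method, microtime(true) - $start);
}
```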

Related

How do I list all objects created from a class in PHP [duplicate]

I would like to get all the instances of an object of a certain class.
For example:
class Foo {
}
$a = new Foo();
$b = new Foo();
$instances = get_instances_of_class('Foo');
$instances should be either array($a, $b) or array($b, $a) (order does not matter).
A plus is if the function would return instances which have a superclass of the requested class, though this isn't necessary.
One method I can think of is using a static class member variable which holds an array of instances. In the class's constructor and destructor, I would add or remove $this from the array. This is rather troublesome and error-prone if I have to do it on many classes.
If you derive all your objects from a TrackableObject class, this class could be set up to handle such things (just be sure you call parent::__construct() and parent::__destruct() when overriding those in subclasses).
class TrackableObject
{
protected static $_instances = array();
public function __construct()
{
self::$_instances[] = $this;
}
public function __destruct()
{
unset(self::$_instances[array_search($this, self::$_instances, true)]);
}
/**
* @param bool $includeSubclasses Optionally include subclasses in returned set
* @return array array of objects
*/
public static function getInstances($includeSubclasses = false)
{
$return = array();
// $this is not available in a static method; use the called class instead
$class = get_called_class();
foreach (self::$_instances as $instance) {
if ($instance instanceof $class) {
if ($includeSubclasses || (get_class($instance) === $class)) {
$return[] = $instance;
}
}
}
return $return;
}
}
The major issue with this is that no object would be automatically picked up by garbage collection (as a reference to it still exists within TrackableObject::$_instances), so __destruct() would need to be called manually to destroy said object. (Circular Reference Garbage Collection was added in PHP 5.3 and may present additional garbage collection opportunities)
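One way around the garbage-collection problem described above, assuming PHP 7.4 or later, is to keep weak references in the registry instead of the objects themselves. The sketch below is only an illustration: WeakRegistry and Foo are hypothetical names, and each tracked class is assumed to register itself in its constructor.

```php
<?php
class WeakRegistry
{
    /** @var WeakReference[] keyed by spl_object_id() */
    private static array $refs = [];

    public static function register(object $object): void
    {
        // A weak reference does not keep the object alive.
        self::$refs[spl_object_id($object)] = WeakReference::create($object);
    }

    public static function instancesOf(string $class): array
    {
        $found = [];
        foreach (self::$refs as $id => $ref) {
            $object = $ref->get();
            if ($object === null) {
                // The object has been destroyed; drop the stale entry.
                unset(self::$refs[$id]);
                continue;
            }
            if ($object instanceof $class) {
                $found[] = $object;
            }
        }
        return $found;
    }
}

// Hypothetical tracked class.
class Foo
{
    public function __construct()
    {
        WeakRegistry::register($this);
    }
}

$a = new Foo();
$b = new Foo();
$before = count(WeakRegistry::instancesOf('Foo'));
unset($b); // no other references, so the object is destroyed here
$after = count(WeakRegistry::instancesOf('Foo'));
```

Because the registry never holds a strong reference, no manual __destruct() call is needed, and instanceof also matches subclasses of the requested class.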
Here's a possible solution:
function get_instances_of_class($class) {
$instances = array();
foreach ($GLOBALS as $value) {
if (is_a($value, $class) || is_subclass_of($value, $class)) {
array_push($instances, $value);
}
}
return $instances;
}
Edit: Updated the code to check if the $class is a superclass.
Edit 2: Made a slightly messier recursive function that checks each object's variables instead of just the top-level objects:
function get_instances_of_class($class, $vars=null) {
if ($vars == null) {
$vars = $GLOBALS;
}
$instances = array();
foreach ($vars as $value) {
if (is_a($value, $class)) {
array_push($instances, $value);
}
// get_object_vars() only accepts objects; skip scalars and arrays
$object_vars = is_object($value) ? get_object_vars($value) : array();
if ($object_vars) {
$instances = array_merge($instances, get_instances_of_class($class, $object_vars));
}
}
return $instances;
}
I'm not sure if it can go into infinite recursion with certain objects, so beware...
I need this because I am making an event system and need to be able to send events to all objects of a certain class (a global notification, if you will, which is dynamically bound).
I would suggest having a separate object with which you register your objects (the observer pattern). PHP has built-in support for this through SPL; see SplObserver and SplSubject.
As far as I know, the PHP runtime does not expose the underlying object space, so it would not be possible to query it for instances of an object.
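A minimal sketch of the observer approach using the SplSubject and SplObserver interfaces. EventHub, Listener, dispatch() and the lastEvent property are hypothetical names chosen for illustration; only the two interfaces and their attach(), detach(), notify() and update() methods come from SPL.

```php
<?php
// Hypothetical event hub: the subject that listeners register with.
class EventHub implements SplSubject
{
    /** @var SplObserver[] keyed by spl_object_id() */
    private $observers = [];
    public $lastEvent = null;

    public function attach(SplObserver $observer): void
    {
        $this->observers[spl_object_id($observer)] = $observer;
    }

    public function detach(SplObserver $observer): void
    {
        unset($this->observers[spl_object_id($observer)]);
    }

    public function notify(): void
    {
        foreach ($this->observers as $observer) {
            $observer->update($this);
        }
    }

    // Convenience wrapper: record the event, then notify every listener.
    public function dispatch($event)
    {
        $this->lastEvent = $event;
        $this->notify();
    }
}

class Listener implements SplObserver
{
    public $received = [];

    public function update(SplSubject $subject): void
    {
        $this->received[] = $subject->lastEvent;
    }
}

$hub = new EventHub();
$first = new Listener();
$second = new Listener();
$hub->attach($first);
$hub->attach($second);
$hub->dispatch('saved');
$hub->detach($second);
$hub->dispatch('closed');
```

Unlike scanning for instances, only objects that were explicitly attached receive events, and detaching them releases the hub's reference.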

PHP object comparison and private properties

I am wondering how PHP determines the equality of instances of a class with private properties:
class Example {
private $x;
public $y;
public function __construct($x, $y) {
$this->x = $x; $this->y = $y;
}
}
and something like
$needle = new Example(1,2);
$haystack = [new Example(2,2), new Example(1,2)];
$index = array_search($needle, $haystack); // result is 1
The result is indeed 1, so the private member is compared. Is there a possibility to only match public properties?
I know I could override the __toString method and cast all arrays and needles to string, but that leads to ugly code.
I am hoping to find a solution that is elegant enough to work with in_array, array_search, array_unique, etc.
A possible solution could be the PHP Reflection API. With that in mind you can read the public properties of a class and compare them to other public properties of another instance of the same class.
The following code is a simple comparison of public class properties. The base for the comparison is a simple value object.
declare(strict_types=1);
namespace Marcel\Test;
use ReflectionClass;
use ReflectionProperty;
class Example
{
private string $propertyA;
public string $propertyB;
public string $propertyC;
public function getPropertyA(): string
{
return $this->propertyA;
}
public function setPropertyA(string $propertyA): self
{
$this->propertyA = $propertyA;
return $this;
}
public function getPropertyB(): string
{
return $this->propertyB;
}
public function setPropertyB($propertyB): self
{
$this->propertyB = $propertyB;
return $this;
}
public function getPropertyC(): string
{
return $this->propertyC;
}
public function setPropertyC($propertyC): self
{
$this->propertyC = $propertyC;
return $this;
}
public function __compare(Example $b, $filter = ReflectionProperty::IS_PUBLIC): bool
{
$reflection = new ReflectionClass($b);
$properties = $reflection->getProperties($filter);
$same = true;
foreach ($properties as $property) {
if (!property_exists($this, $property->getName())) {
$same = false;
}
if ($this->{$property->getName()} !== $property->getValue($b)) {
$same = false;
}
}
return $same;
}
}
The __compare method of the Example class uses the PHP Reflection API. First we build a reflection instance of the object we want to compare with the current instance. Then we request all public properties of that class. If a public property does not exist in the current instance, or its value differs from the value in the other object, the method returns false; otherwise it returns true.
Some examples.
$objectA = (new Example())
->setPropertyA('bla')
->setPropertyB('yadda')
->setPropertyC('bar');
$objectB = (new Example())
->setPropertyA('foo')
->setPropertyB('yadda')
->setPropertyC('bar');
$result = $objectA->__compare($objectB);
var_dump($result); // true
In this example the comparison evaluates to true because the public properties propertyB and propertyC exist in both instances and have the same values. Keep in mind that this comparison only works if the second instance is of the same class. One could take this solution further and compare arbitrary objects based on their properties.
In Array Filter Example
It is a kind of rebuild of the in_array function based on the shown __compare method.
declare(strict_types=1);
namespace Marcel\Test;
use ArrayObject;
class InArrayFilter
{
protected ArrayObject $data;
public function __construct(ArrayObject $data)
{
$this->data = $data;
}
public function contains(object $b)
{
foreach ($this->data as $object) {
if ($b->__compare($object)) {
return true;
}
}
return false;
}
}
This filter class acts like the in_array function. It takes a collection of objects and checks whether an object with the same public properties is in the collection.
Conclusion
If you want this solution to act like array_unique, array_search or in_array, you have to write your own callback functions that call the __compare method in the way you need.
Whether this is practical depends on the amount of data and the cost of the callback: running a reflection-based comparison on every element can make the application noticeably slower and use more memory.
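As an alternative to reflection, here is a small sketch of an array_search-like helper. It relies on get_object_vars() being scope-aware: when called from outside the class, it only returns public properties. The Item class and both function names are hypothetical.

```php
<?php
// Hypothetical value object mirroring the question's Example class.
class Item
{
    private $x;
    public $y;

    public function __construct($x, $y)
    {
        $this->x = $x;
        $this->y = $y;
    }
}

// Called from outside any class, get_object_vars() only sees public properties.
function public_props_equal($a, $b)
{
    return get_class($a) === get_class($b)
        && get_object_vars($a) == get_object_vars($b);
}

// array_search-like helper: returns the first matching key, or false.
function search_by_public_props($needle, array $haystack)
{
    foreach ($haystack as $key => $candidate) {
        if (is_object($candidate) && public_props_equal($needle, $candidate)) {
            return $key;
        }
    }
    return false;
}

$needle = new Item(1, 2);
$haystack = [new Item(2, 2), new Item(1, 2)];
var_dump(search_by_public_props($needle, $haystack)); // int(0): only y is compared
```

With the built-in array_search the same data yields 1, because the private $x is compared as well.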

PHP An iterator cannot be used with foreach by reference

I have an object that implements Iterator and holds 2 arrays: "entries" and "pages". Whenever I loop through this object, I want to modify the entries array but I get the error An iterator cannot be used with foreach by reference which I see started in PHP 5.2.
My question is, how can I use the Iterator class to change the value of the looped object while using foreach on it?
My code:
//$flavors = instance of this class:
class PaginatedResultSet implements \Iterator {
private $position = 0;
public $entries = array();
public $pages = array();
//...Iterator methods...
}
//looping
//throws error here
foreach ($flavors as &$flavor) {
$flavor = $flavor->stdClassForApi();
}
The reason for this is that sometimes $flavors will not be an instance of my class and instead will just be a simple array. I want to be able to modify this array easily regardless of the type it is.
I just tried creating an iterator which used:
public function &current() {
$element = &$this->array[$this->position];
return $element;
}
But that still did not work.
The best I can recommend is that you implement \ArrayAccess, which will allow you to do this:
foreach ($flavors as $key => $flavor) {
$flavors[$key] = $flavor->stdClassForApi();
}
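A sketch of what that could look like, assuming PHP 8 method signatures. ResultSet is a hypothetical, trimmed-down version of PaginatedResultSet that uses IteratorAggregate for brevity, and to_upper_all() is a made-up helper standing in for the stdClassForApi() loop.

```php
<?php
class ResultSet implements IteratorAggregate, ArrayAccess
{
    public $entries = [];

    public function __construct(array $entries)
    {
        $this->entries = $entries;
    }

    // Iterate over a snapshot so writes during the loop are safe.
    public function getIterator(): Traversable
    {
        return new ArrayIterator($this->entries);
    }

    public function offsetExists(mixed $offset): bool
    {
        return isset($this->entries[$offset]);
    }

    public function offsetGet(mixed $offset): mixed
    {
        return $this->entries[$offset];
    }

    public function offsetSet(mixed $offset, mixed $value): void
    {
        if ($offset === null) {
            $this->entries[] = $value; // supports $set[] = $value
        } else {
            $this->entries[$offset] = $value;
        }
    }

    public function offsetUnset(mixed $offset): void
    {
        unset($this->entries[$offset]);
    }
}

// The same loop body works for a plain array and for a ResultSet.
function to_upper_all($items)
{
    foreach ($items as $key => $item) {
        $items[$key] = strtoupper($item);
    }
    return $items;
}

$fromArray = to_upper_all(['a', 'b']);
$fromObject = to_upper_all(new ResultSet(['a', 'b']));
```

This keeps the calling code identical regardless of whether $flavors is an array or an object, which was the stated goal.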
Using generators:
Updating based on Mark's comment on generators, the following will allow you to iterate over the results without needing to implement \Iterator or \ArrayAccess.
class PaginatedResultSet {
public $entries = array();
public function &iterate()
{
foreach ($this->entries as &$v) {
yield $v;
}
}
}
$flavors = new PaginatedResultSet(/* args */);
foreach ($flavors->iterate() as &$flavor) {
$flavor = $flavor->stdClassForApi();
}
Generators are available as of PHP 5.5.
Expanding upon Flosculus' solution, if you don't want to reference the key each time you use the iterated variable, you can assign a reference to it to a new variable in the first line of your foreach.
foreach ($flavors as $key => $f) {
$flavor = &$flavors[$key];
$flavor = $flavor->stdClassForApi();
}
This is functionally identical to using the key on the base object, but helps keep code tidy, and variable names short... If you're into that kind of thing.
If you implemented the iterator functions in your class, I would suggest adding another method, setCurrent(), to the class:
//$flavors = instance of this class:
class PaginatedResultSet implements \Iterator {
private $position = 0;
public $entries = array();
public $pages = array();
/* --- Iterator methods block --- */
private $current;
public function setCurrent($value){
$this->current = $value;
}
public function current(){
return $this->current;
}
//...Other Iterator methods...
}
Then you can just use this function inside the foreach loop:
foreach ($flavors as $flavor) {
$newFlavor = makeNewFlavorFromOldOne($flavor);
$flavors->setCurrent($newFlavor);
}
If you need this function in other classes, you can also define a new iterator and extend the Iterator interface to contain setCurrent()

PHP lazy array mapping

Is there a way of doing array_map but as an iterator?
For example:
foreach (new MapIterator($array, $function) as $value)
{
if ($value == $required)
break;
}
The reason to do this is that $function is expensive to compute and $array has many elements; I only need to map until I find a specific value. array_map will calculate all values before I can search for the one I want.
I could implement the iterator myself, but I want to know if there is a native way of doing this. I couldn't find anything searching PHP documentation.
In short: No.
There is no lazy iterator mapping built into PHP. There is a non-lazy function iterator_apply(), but nothing like what you are after.
You could write one yourself, as you said. I suggest you extend IteratorIterator and simply override the current() method.
If there were such a thing it would either be documented here or here.
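The IteratorIterator suggestion could be sketched roughly as follows, assuming PHP 8 (for the mixed return type). The class name MapIterator is taken from the question; accepting plain arrays in the constructor and the call counter are additions for illustration.

```php
<?php
// Lazy map: the callback runs only when foreach asks for the current element.
class MapIterator extends IteratorIterator
{
    private $function;

    public function __construct($source, callable $function)
    {
        // Accept plain arrays as well as any Traversable.
        parent::__construct(is_array($source) ? new ArrayIterator($source) : $source);
        $this->function = $function;
    }

    public function current(): mixed
    {
        return ($this->function)(parent::current());
    }
}

$calls = 0;
$slowDouble = function ($x) use (&$calls) {
    $calls++; // count how often the "expensive" callback runs
    return $x * 2;
};

$found = null;
foreach (new MapIterator([1, 2, 3, 4, 5], $slowDouble) as $value) {
    if ($value == 6) {
        $found = $value;
        break;
    }
}
```

The callback is invoked only for the elements visited before the break, so the remaining elements are never mapped.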
This is a lazy collection map function that gives you back an Iterator:
/**
* @param array|Iterator $collection
* @param callable $function
* @return Iterator
*/
function collection_map( $collection, callable $function ) {
foreach( $collection as $element ) {
yield $function( $element );
}
}
I'm thinking of a simple Map class implementation which uses an array of keys and an array of values. The overall implementation could be used like Java's Iterator class whereas you'd iterate through it like:
while ($map->hasNext()) {
$value = $map->next();
...
}
foreach ($array as $key => $value) {
if ($value === $required) {
break;
} else {
$array[$key] = call_back_function($value);
}
}
process and iterate until required value is found.
Don't bother with an iterator, is the answer:
foreach ($array as $origValue)
{
$value = $function($origValue);
if ($value == $required)
break;
}
I wrote this class to use a callback for that purpose. Usage:
$array = new ArrayIterator(array(1,2,3,4,5));
$doubles = new ModifyIterator($array, function($x) { return $x * 2; });
Definition (feel free to modify for your need):
class ModifyIterator implements Iterator {
/**
* @var Iterator
*/
protected $iterator;
/**
* @var callable Modifies the current item in iterator
*/
protected $callable;
/**
* @param Iterator|array $iterator
* @param callable $callable A closure must accept exactly one parameter
* @throws Exception
*/
public function __construct($iterator, $callable) {
if (is_array($iterator)) {
$this->iterator = new ArrayIterator($iterator);
}
elseif (!($iterator instanceof Iterator))
{
throw new Exception("iterator must be instance of Iterator");
}
else
{
$this->iterator = $iterator;
}
if (!is_callable($callable)) {
throw new Exception("callable must be a closure");
}
if ($callable instanceof Closure) {
// make sure there's one argument
$reflection = new ReflectionObject($callable);
if ($reflection->hasMethod('__invoke')) {
$method = $reflection->getMethod('__invoke');
if ($method->getNumberOfParameters() !== 1) {
throw new Exception("callable must have only one parameter");
}
}
}
$this->callable = $callable;
}
/**
* Alters the current item with $this->callable and returns a new item.
* Be careful with your types as we can't do static type checking here!
* @return mixed
*/
public function current()
{
$callable = $this->callable;
return $callable($this->iterator->current());
}
public function next()
{
$this->iterator->next();
}
public function key()
{
return $this->iterator->key();
}
public function valid()
{
return $this->iterator->valid();
}
public function rewind()
{
$this->iterator->rewind();
}
}
PHP's iterators are quite cumbersome to use, especially if deep nesting is required. LINQ, which implements SQL-like queries for arrays and objects, is better suited for this, because it allows easy method chaining and is lazy through and through. One of the libraries implementing it is YaLinqo*. With it, you can perform mapping and filtering like this:
// $array can be an array or \Traversable. If it's an iterator, it is traversed lazily.
$is_value_in_array = from($array)->contains(2);
// where is like array_filter, but lazy. It'll be called only until the value is found.
$is_value_in_filtered_array = from($array)->where($slow_filter_function)->contains(2);
// select is like array_map, but lazy.
$is_value_in_mapped_array = from($array)->select($slow_map_function)->contains(2);
// first function returns the first value which satisfies a condition.
$first_matching_value = from($array)->first($slow_filter_function);
// equivalent code
$first_matching_value = from($array)->where($slow_filter_function)->first();
There're many more functions, over 70 overall.
* developed by me
Have a look at Non-standard PHP library. It has a lazy map function:
use function \nspl\a\lazy\map;
$heavyComputation = function($value) { /* ... */ };
$iterator = map($heavyComputation, $list);
