While extending ArrayIterator, how can I access the current array to modify it, or to get some other data from it?
Please consider example below:
class Test extends ArrayIterator {
public function in_array($key) {
return in_array($key, ???);
}
}
Is using $this->getArrayCopy() in place of ??? OK? Is there a better solution? What about performance?
And how can I change the class's array dynamically, for example using array_walk?
Regards,
Notice that the ArrayIterator class implements ArrayAccess. To modify the array, simply treat $this as an array:
$this['k'] = 'v';
Unfortunately, functions such as in_array don't work on array-like objects; you need an actual array. getArrayCopy() will work, but I would just use (array) $this.
EDIT: As salathe notes in a comment, getArrayCopy() is better, because it always gets the internal array, while (array) $this will behave differently if you use the STD_PROP_LIST flag.
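A minimal sketch of both suggestions, assuming a subclass named Test as in the question. The upperCaseAll() method is a hypothetical name, added only to show in-place modification through array syntax:

```php
<?php
class Test extends ArrayIterator
{
    // Search the internal array; getArrayCopy() always returns the internal array.
    public function in_array($needle, $strict = false)
    {
        return in_array($needle, $this->getArrayCopy(), $strict);
    }

    // Hypothetical helper: modify every element through ArrayAccess.
    public function upperCaseAll()
    {
        foreach (array_keys($this->getArrayCopy()) as $key) {
            $this[$key] = strtoupper($this[$key]);
        }
    }
}

$t = new Test(['a' => 'apple', 'b' => 'pear']);
$t['c'] = 'plum';               // the object accepts array syntax
$found = $t->in_array('plum');  // true
$t->upperCaseAll();
$result = $t->getArrayCopy();   // ['a' => 'APPLE', 'b' => 'PEAR', 'c' => 'PLUM']
```

The loop runs over a snapshot of the keys, so writing to $this inside it does not disturb the iteration.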
Performance-wise, making a copy of an array like that does cause a small slowdown. As a benchmark, I tried getArrayCopy() and (array) on an ArrayIterator of 1000000 items, and both took about 0.11 seconds on my machine. The in_array operation itself (on the resulting array), on the other hand, took 0.011 seconds - about a tenth as long.
I also tried this version of the in_array function:
class Test extends ArrayIterator {
public function in_array($key) {
foreach($this as $v)
if($v == $key)
return true;
return false;
}
}
That function runs in 0.07 seconds on my machine when searching for a value that doesn't exist, which is the worst-case scenario for this algorithm.
The performance problems are too small to matter in most cases. If your situation is extreme enough that 100 nanoseconds or so per array element actually make a difference, then I would suggest putting the values you want to search for in the array keys and using offsetExists() instead.
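As a rough illustration of that last suggestion (the colour names are made-up sample data), storing the searchable values as keys turns the search into a single lookup:

```php
<?php
// Store the searchable values as keys; the stored value (true) is a placeholder.
$set = new ArrayIterator(array_fill_keys(['red', 'green', 'blue'], true));

$hasGreen  = $set->offsetExists('green');  // direct key lookup, no scan
$hasPurple = isset($set['purple']);        // array syntax works as well
```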
Can somebody explain clearly the fundamental differences between ArrayIterator, ArrayObject and Array in PHP in terms of functionality and operation? Thanks!
Array is a native PHP type. You can create one using the PHP language construct array(), or, as of PHP 5.4, the short syntax [].
ArrayObject is an object that works much like an array. It is created using the new keyword.
ArrayIterator is like ArrayObject, but it can iterate over itself. It is also created using new.
Comparing Array vs (ArrayObject/ArrayIterator)
Both can be used with PHP's array syntax, e.g.:
$array[] = 'foo';
$object[] = 'foo';
// adds new element with the expected numeric key
$array['bar'] = 'foo';
$object['bar'] = 'foo';
// adds new element with the key "bar"
foreach($array as $value);
foreach($object as $value);
// iterating over the elements
However, they are still objects vs arrays, so you would notice the differences in
is_array($array); // true
is_array($object); // false
is_object($array); // false
is_object($object); // true
Most of the PHP array functions expect arrays, so passing objects to them raises errors. There are many such functions, e.g.:
sort($array); // works as expected
sort($object); // Warning: sort() expects parameter 1 to be array, object given in ......
Finally, objects can do what you would expect from a stdClass object, i.e. accessing public properties using the object syntax
$object->foo = 'bar'; // works
$array->foo = 'bar'; // Warning: Attempt to assign property of non-object in ....
Arrays (being the native type) are much faster than objects. On the other hand, the ArrayObject and ArrayIterator classes have certain methods defined that you can use, while there is no such thing for arrays.
Comparing ArrayObject vs ArrayIterator
The main difference between these 2 is in the methods the classes have.
ArrayIterator implements the Iterator interface, which gives it methods related to iteration/looping over the elements. ArrayObject has a method called exchangeArray that swaps its internal array with another one. Implementing a similar thing in ArrayIterator would mean either creating a new object or looping through the keys and unsetting all of them one by one, then setting the elements from the new array one by one.
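A short sketch of exchangeArray on an ArrayObject, using made-up sample values:

```php
<?php
$list = new ArrayObject(['a', 'b']);

// exchangeArray() swaps the internal array and returns the previous one.
$old = $list->exchangeArray(['x', 'y', 'z']);

$oldCount = count($old);    // 2
$newCount = count($list);   // 3
$first    = $list[0];       // 'x'
```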
Next, since ArrayObject is not itself an Iterator (it implements IteratorAggregate), using it in foreach makes PHP create an ArrayIterator object internally. This means a second object exists for the duration of the loop, which adds some overhead and can matter for large arrays. However, you can specify which class to use for the iterator, so you can have custom iterators in your code.
ArrayObject and array are somewhat alike. Merely a collection of objects (or native types). They have some different methods that you can call, but it mostly boils down to the same thing.
However, an Iterator is something else completely. The iterator design pattern is a way to secure your array (making it read-only). Let's take the next example:
You have a class that has an array. You can add items to that array by using addSomethingToMyArray. Note, however, that we do something to the item before we actually add it to the array. This could be anything, but let's for a moment pretend it is very important that this method is fired for EVERY item that we want to add to the array.
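class A
{
private $myArray = [];
public function returnMyArray()
{
return $this->myArray;
}
public function addSomethingToMyArray( $item )
{
$this->doSomethingToItem( $item );
// array_push() needs the target array as its first argument
array_push( $this->myArray, $item );
}
private function doSomethingToItem( $item )
{
// placeholder for the important per-item step
}
}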
The problem with this is that returnMyArray hands out the collection itself. (Strictly speaking, PHP returns plain arrays by value, so the concern applies when the collection is an object, such as an ArrayObject, or when the method returns by reference.) In that case, classes that use returnMyArray get the real myArray. That means that classes other than A can add things to it, and therefore change the collection inside A without having to use addSomethingToMyArray. But we needed to doSomethingToItem, remember? This is an example of a class not in control of its own inner state.
The solution to this is an iterator. Instead of passing the array, we pass the array to a new object, that can only READ things from the array. The most simple Iterator ever is something like this:
<?php
class MyIterator{
private $array;
// start at the first element
private $index = 0;
public function __construct( $array )
{
$this->array = $array;
}
public function hasNext()
{
return count( $this->array ) > $this->index;
}
public function next()
{
$item = $this->array[ $this->index ];
$this->index++;
return $item;
}
}
?>
As you can see, I have no way of adding new items to the given array, but I do have the possibility to read the array like this:
while( $iterator->hasNext() )
$item = $iterator->next();
Now there is, again, only one way to add items to myArray in A, namely via the addSomethingToMyArray method. So that is what an Iterator is: somewhat of a shell around arrays that provides something called encapsulation.
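The same idea can be sketched with PHP's built-in interfaces instead of a hand-written hasNext()/next() pair. This is only one possible arrangement: it assumes A implements IteratorAggregate, and strtoupper(trim(...)) is a stand-in for whatever doSomethingToItem really does:

```php
<?php
class A implements IteratorAggregate
{
    private $myArray = [];

    public function addSomethingToMyArray($item)
    {
        // Stand-in for doSomethingToItem(): normalise before storing (assumption).
        $this->myArray[] = strtoupper(trim($item));
    }

    // Callers receive an iterator over a copy, never the private array itself.
    public function getIterator(): Iterator
    {
        return new ArrayIterator($this->myArray);
    }
}

$a = new A();
$a->addSomethingToMyArray('  first ');
$a->addSomethingToMyArray('second');

$iterator = $a->getIterator();
$iterator[] = 'sneaky';            // changes only the iterator's copy

$seen = iterator_to_array($a);     // ['FIRST', 'SECOND']
```

Because the private array is handed to ArrayIterator by value, writes made through the iterator never reach A, so addSomethingToMyArray stays the only way in.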
Array is array type, ArrayObject and ArrayIterator are built-in classes, with their instances being object type, that partially behave like arrays on syntax and usage level.
For the classes I think the quick idea can be gained from the interfaces they implement:
class ArrayObject implements IteratorAggregate, ArrayAccess, Serializable, Countable {
class ArrayIterator implements SeekableIterator, ArrayAccess, Serializable, Countable {
Both classes implement:
ArrayAccess, so they support array syntax to access values
Countable, so they support count of values, like arrays do
Traversable (though in different ways: ArrayObject through IteratorAggregate, ArrayIterator through Iterator), so they fit the iterable type and can be looped over with foreach and such
One way to look at it, is that classes go through a lot of trouble to work like an array. The point is that being classes, they can be extended and customized.
They would also be inherently interoperable with any code that expects Iterator in general, so the basic use case is just that - wrapping to feed array data into iterator code.
So in a nutshell ArrayObject and ArrayIterator are Do It Yourself arrays (with some implementation differences between the two). Their instances (partially) behave like array type, but as classes they are extensible and as Iterator implementers they are interoperable with code that wants that.
Unless you need a deeply custom behavior and/or Iterator interoperability, sticking with array type is probably the way to go between these.
Notably there are also implementations of collections around, that aim to deliver the class benefits with more friendly abstractions.
array is one of the eight primitive types in PHP. It comes with a lot of built-in utility functions, but they are all procedural.
Both the ArrayObject and the ArrayIterator allow us to make arrays first class citizens in an object oriented program (OOP).
The difference between ArrayObject and ArrayIterator is that, since ArrayIterator implements the SeekableIterator interface, you can do $myArray->seek(10); with ArrayIterator.
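A small sketch of seek() on made-up data:

```php
<?php
$it = new ArrayIterator(['a', 'b', 'c', 'd']);

$it->seek(2);               // jump straight to position 2
$atTwo = $it->current();    // 'c'

$it->next();
$atThree = $it->current();  // 'd'
```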
An Iterator is an object that enables a programmer to traverse a container, particularly lists. Various types of iterators are often provided via a container's interface.
There is not much difference between ArrayObject and array, as they represent the same thing using different types.
ArrayIterator is an Iterator that iterates over array-like objects; this includes objects that implement ArrayAccess and the native array type.
Conceptually, when you foreach over an array, the loop behaves as if PHP had created an ArrayIterator and transformed your code to look as if you had typed this:
for ($arrayIterator->rewind(); $arrayIterator->valid(); $arrayIterator->next())
{ $key = $arrayIterator->key();
$value = $arrayIterator->current();
}
So you can see, every collection object has an Iterator except for your defined collections which you need to define your own Iterators for.
EDIT:forgot to address the normal array in my answer..
Before you read all this: if you're new to PHP and not trying to do anything tricky, just storing values to be used later, then just use the array primitive type.
$data = [1,2,3,4,5];
//or
$data = array(1,2,3,4,5);
Array is the entry point, and you should get used to using arrays first before considering the others. If you do not understand arrays, then you probably will not understand the benefits of using the others.
( See https://www.w3schools.com/php/php_arrays.asp for a tutorial on arrays. )
Otherwise continue reading...
I had the same question and, after coming here, decided to just spend some time testing them out. Here are my findings and conclusions...
First, let's clear up some things that were said by others that I immediately tested.
ArrayObject and ArrayIterator do not protect the data. Both can still be used by reference in a foreach loop (see the bottom for an example).
Both return true for is_object(), both return false for is_array(), and both allow direct access to the values as an array without providing protection against adding values etc. Both can also be used by reference, allowing the original data to be manipulated during a foreach loop.
$array = new ArrayIterator();
var_dump(is_array($array)); // false
var_dump(is_object($array)); // true
$array[] = 'value one';
var_dump($array[0]);//string(9) "value one"
$array = new ArrayObject();
var_dump(is_array($array)); // false
var_dump(is_object($array)); // true
$array[] = 'value one';
var_dump($array[0]);//string(9) "value one"
The big difference can be seen in the functions that are available for either of them.
ArrayIterator has all the methods required to traverse the values, such as in a foreach loop. (A foreach loop will call the rewind(), valid(), current(), key() and next() methods.)
see: https://www.php.net/manual/en/class.iterator.php for an excellent example of the concept (lower level class documentation).
While an ArrayObject can still be iterated over and access values the same way, it does not offer public access to the pointer functions.
ArrayObject is kind of like a wrapper around an ArrayIterator object; it has public getIterator ( void ) : ArrayIterator, which facilitates traversing the values inside.
You can always get the ArrayIterator from an ArrayObject if you really need the added functions.
ArrayIterator is the best choice if you have your own pseudo foreach loop for traversing in unusual ways and want better control instead of just start-to-end traversal.
ArrayIterator would also be a good choice for overriding the default array behavior when iterated over by a foreach loop, e.g.:
//I originally made to this to solve some problem where generic conversion to XML via a foreach loop was used,
//and I had a special case where a particular API wanted each value to have the same name in the XML.
class SameKey extends ArrayIterator{
public function key()
{
return "AlwaysTheSameKey";
}
}
$extendedArrayIterator = new SameKey(['value one','value two','value three']);
$extendedArrayIterator[] = 'another item added after construct';
//according to foreach, all the keys are the same
foreach ($extendedArrayIterator as $key => $value){
echo "$key: ";//key is always the same
var_dump($value);
}
//can still be accessed as an array with indexes if you need to differentiate the values
for ($i = 0; $i < count($extendedArrayIterator); $i++){
echo "Index [$i]: ";
var_dump($extendedArrayIterator[$i]);
}
The ArrayObject might be a good choice if you have more high-level complexity with iterators going on, for example...
//let's pretend I have many custom classes extending ArrayIterator each with a different behavior..
$O = new ArrayObject(['val1','val2','val3']);
//and I want to change the behavior on the fly dynamically by changing the iterator class
$some_condition = true;
if ($some_condition) $O->setIteratorClass(SameKey::class);
foreach ($O as $key => $value){
echo "$key: ";//AlwaysTheSameKey:
var_dump($value);
}
One example might be changing the output of the same data set such as having a bunch of custom iterators that will return the values in a different format when traversing the same data set. eg...
class AustralianDates extends ArrayIterator{
public function current()
{
return date('d/m/Y',parent::current());
}
}
$O = new ArrayObject([time(),time()+37474,time()+37845678]);
//and I want to change the behaviour on the fly dynamically by changing the iterator class
$some_condition = true;
if ($some_condition) $O->setIteratorClass(AustralianDates::class);
foreach ($O as $key => $value){
echo "$key: ";//0: 1: 2: (the keys are unchanged; only the values are formatted)
var_dump($value);
}
Obviously there are probably better ways to do this kind of thing.
In short, these are the major differences I can see.
ArrayIterator
- lower level ability to control or extend & override traversal behaviors.
VS
ArrayObject
- one container for data, but able to change the ArrayIterator class.
There are also other differences you might care to inspect, but I'd imagine you won't fully grasp them all until you use them extensively.
It appears both objects can be used by reference in a foreach, but NOT when using a custom iterator class via an ArrayObject..
//reference test
$I = new ArrayIterator(['mouse','tree','bike']);
foreach ($I as $key => &$value){
$value = 'dog';
}
var_dump($I);//all values in the original are now 'dog'
$O = new ArrayObject(['mouse','tree','bike']);
foreach ($O as $key => &$value){
$value = 'dog';
}
var_dump($O);//all values in the original are now 'dog'
$O->setIteratorClass(SameKey::class);
foreach ($O as $key => &$value){//PHP Fatal error: An iterator cannot be used with foreach by reference
$value = 'dog';
}
var_dump($O);
Recommendation/Conclusion
Use arrays.
If you want to do something tricky, then
I'd recommend always using ArrayIterator to start off with, and only moving on to ArrayObject if you want to do something specific that only ArrayObject can do.
Considering that ArrayObject can make use of any custom ArrayIterators you have created along the way, I'd say this is the logical path to take.
Hope this helps you as much as it helped me looking into it.
Arrays
An array in PHP is actually an ordered map. A map is a type that
associates values to keys. This type is optimized for several
different uses; it can be treated as an array, list (vector), hash
table (an implementation of a map), dictionary, collection, stack,
queue, and probably more. As array values can be other arrays, trees
and multidimensional arrays are also possible.
The ArrayObject class
This class allows objects to work as arrays.
The ArrayIterator class
This iterator allows to unset and modify values and keys while
iterating over Arrays and Objects.
When you want to iterate over the same array multiple times you need
to instantiate ArrayObject and let it create ArrayIterator instances
that refer to it either by using foreach or by calling its
getIterator() method manually.
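A quick sketch of that last point, with made-up data: each foreach over an ArrayObject asks it for a fresh iterator, so nested loops over the same object do not disturb each other.

```php
<?php
$data = new ArrayObject([1, 2, 3]);

$pairs = [];
foreach ($data as $x) {
    // The inner loop gets its own ArrayIterator via getIterator().
    foreach ($data as $y) {
        $pairs[] = [$x, $y];
    }
}

$pairCount = count($pairs);   // 9
```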
array_values() doesn't work with ArrayAccess object.
neither does array_keys()
why?
if I can access $object['key'] I should be able to do all kind of array operations
No, you've misunderstood the utility of ArrayAccess. It isn't just a sort of wrapper for an array. Yes, the standard example for implementing it uses a private $array variable whose functionality is wrapped by the class, but that isn't a particularly useful one. Often, you may as well just use an array.
One good example of ArrayAccess is when the script doesn't know what variables are available.
As a fairly silly example, imagine an object that worked with a remote server. Resources on that server can be read, updated and deleted using an API across a network. A programmer decides they want to wrap that functionality with array-like syntax, so $foo['bar'] = 'foobar' sets the bar resource on that server to foobar and echo $foo['bar'] retrieves it. The script has no way of finding out what keys or values are present without trying all possible values.
So ArrayAccess allows the use of array syntax for setting, updating, retrieving or deleting from an object with array-like syntax: no more, no less.
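A minimal sketch of such a class. RemoteResources is a hypothetical name, and the private array merely stands in for the remote server described above; a real implementation would make network calls in each method:

```php
<?php
// Hypothetical wrapper: the private array stands in for a remote API.
class RemoteResources implements ArrayAccess
{
    private $fakeServer = [];

    public function offsetExists($offset): bool
    {
        return isset($this->fakeServer[$offset]);
    }

    #[\ReturnTypeWillChange]
    public function offsetGet($offset)
    {
        return $this->fakeServer[$offset] ?? null;
    }

    public function offsetSet($offset, $value): void
    {
        $this->fakeServer[$offset] = $value;
    }

    public function offsetUnset($offset): void
    {
        unset($this->fakeServer[$offset]);
    }
}

$foo = new RemoteResources();
$foo['bar'] = 'foobar';          // calls offsetSet()
$value  = $foo['bar'];           // calls offsetGet(), gives 'foobar'
unset($foo['bar']);              // calls offsetUnset()
$exists = isset($foo['bar']);    // calls offsetExists(), gives false
```

Nothing here exposes a list of keys, which is why functions such as array_keys() have nothing to work with.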
Another interface, Countable, allows the use of count(). You could use both interfaces on the same class. Ideally, there would be more such interfaces, perhaps including those that can do array_values or array_keys, but currently they don't exist.
ArrayAccess is very limited. It does not allow the use of native array_ functions (no existing interface does).
If you need to do more array-like operations on your object, then you are essentially creating a collection. A collection should be manipulated by its methods.
So, create an object and extend ArrayObject. This implements IteratorAggregate, Traversable, ArrayAccess, Serializable and Countable.
If you need the keys, simply add an array_keys method:
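public function array_keys($search_value = null, $strict = false)
{
// Forward the filter only when one was given; an explicit null would search for null values.
if (func_num_args() === 0) {
return array_keys($this->getArrayCopy());
}
return array_keys($this->getArrayCopy(), $search_value, $strict);
}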
Then you can:
foreach ($object->array_keys() as $key) {
echo $object[$key];
}
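ArrayObject/ArrayAccess allow objects to work as arrays, but they're still objects. So instead of array_keys() (which works only on arrays) you could use get_object_vars() (though as of PHP 7.3 it returns the ArrayObject's own properties rather than the wrapped array, so casting to an array is the safer route), for example: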
var_dump(array_keys(get_object_vars($ArrObj)));
or convert your ArrayObject by casting it into array by (array) $ArrObj, e.g.:
var_dump(array_keys((array)$ArrObj));
What's the equivalent in PHP of the C++ "set" ("Sets are a kind of associative containers that stores unique elements, and in which the elements themselves are the keys.")?
There isn't one, but they can be emulated.
Here is an archived copy of the article's contents, made before the link died.
A Set of Objects in PHP: Arrays vs. SplObjectStorage
One of my projects, QueryPath, performs many tasks that require maintaining a set of unique objects. In my quest to optimize QueryPath, I have been looking into various ways of efficiently storing sets of objects in a way that provides expedient containment checks. In other words, I want a data structure that keeps a list of unique objects, and can quickly tell me if some object is present in that list. The ability to loop through the contents of the list is also necessary.
Recently I narrowed the list of candidates down to two methods:
Use good old fashioned arrays to emulate a hash set.
Use the SplObjectStorage system present in PHP 5.2 and up.
Before implementing anything directly in QueryPath, I first set out designing the two methods, and then ran some micro-benchmarks (with Crell's help) on the pair of methods. To say that the results were surprising is an understatement. The benchmarks will likely change the way I structure future code, both inside and outside of Drupal.
The Designs
Before presenting the benchmarks, I want to quickly explain the two designs that I settled on.
Arrays emulating a hash set
The first method I have been considering is using PHP's standard array() to emulate a set backed by a hash mapping (a "hash set"). A set is a data structure designed to keep a list of unique elements. In my case, I am interested in storing a unique set of DOM objects. A hash set is a set that is implemented using a hash table, where the key is a unique identifier for the stored value. While one would normally write a class to encapsulate this functionality, I decided to test the implementation as a bare array with no layers of indirection on top. In other words, what I am about to present are the internals of what would be a hash set implementation.
The Goal: Store a (unique) set of objects in a way that makes them (a) easy to iterate, and (b) cheap to check membership.
The Strategy: Create an associative array where the key is a hash ID and the value is the object.
With a reasonably good hashing function, the strategy outlined above should work as desired.
"Reasonably good hashing function" -- that was the first gotcha. How do you generate a good hashing function for an object like DOMDocument? One (bad) way would be to serialize the object and then, perhaps, take its MD5 hash. That, however, will not work on many objects -- specifically any object that cannot be serialized. The DOMDocument, for example, is actually backed by a resource and cannot be serialized.
One need not look far for such a function, though. It turns out that there is an object hashing function in PHP 5. It's called spl_object_hash(), and it can take any object (even one that is not native PHP) and generate a hashcode for it.
Using spl_object_hash() we can build a simple data structure that functions like a hash set. This structure looks something like this:
array(
$hashcode => $object
);
For example, we can generate an entry like so:
$object = new stdClass();
$hashcode = spl_object_hash($object);
$arr = array(
$hashcode => $object
);
In the example above, then, the hashcode string is an array key, and the object itself is the array value. Note that since the hashcode will be the same each time an object is re-hashed, it serves not only as a comparison point ("if object a's hashkey == object b's hashkey, then a == b"), it also functions as a uniqueness constraint. Only one object with the specified hashcode can exist per array, so there is no possibility of two copies (actually, two references) to the same object being placed in the array.
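A small sketch of that uniqueness constraint, using throwaway stdClass objects:

```php
<?php
$a = new stdClass();
$b = new stdClass();

$set = [];
foreach ([$a, $b, $a] as $object) {
    // Re-adding $a overwrites its own entry instead of creating a duplicate.
    $set[spl_object_hash($object)] = $object;
}

$size = count($set);   // 2
```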
With a data structure like this, we have a host of readily available functions for manipulating the structure, since we have at our disposal all of the PHP array functions. So to some degree this is an attractive choice out of the box.
The most important task, in our context at least, is that of determining whether an entry exists inside of the set. There are two possible candidates for this check, and both require supplying the hashcode:
Check whether the key exists using array_key_exists().
Check whether the key is set using isset().
To cut to the chase, isset() is faster than array_key_exists(), and offers the same features in our context, so we will use that. (The fact that they handle null values differently makes no difference to us. No null values will ever be inserted into the set.)
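The difference in null handling mentioned in the parenthesis can be seen in a short sketch (the keys 'present' and 'empty' are made up for illustration):

```php
<?php
$arr = ['present' => new stdClass(), 'empty' => null];

$a = isset($arr['present']);              // true
$b = array_key_exists('present', $arr);   // true
$c = isset($arr['empty']);                // false: the stored value is null
$d = array_key_exists('empty', $arr);     // true: the key itself exists
```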
With this in mind, then, we would perform a containment check using something like this:
$object = new stdClass();
$hashkey = spl_object_hash($object);
// ...
// Check whether $arr has the $object.
if (isset($arr[$hashkey])) {
// Do something...
}
Again, using an array that emulates a hash set allows us to use all of the existing array functions and also provides easy iterability. We can easily drop this into a foreach loop and iterate the contents. Before looking at how this performs, though, let's look at the other possible solution.
Using SplObjectStorage
The second method under consideration makes use of the new SplObjectStorage class from PHP 5.2+ (it might be in 5.1). This class, which is backed by a C implementation, provides a set-like storage mechanism for objects. It enforces uniqueness; only one of each object can be stored. It is also traversable, as it implements the Iterator interface. That means you can use it in loops such as foreach. (On the down side, the version in PHP 5.2 does not provide any method of random access or index-based access. The version in PHP 5.3 rectifies this shortcoming.)
The Goal: Store a (unique) set of objects in a way that makes them (a) easy to iterate, and (b) cheap to check membership.
The Strategy: Instantiate an object of class SplObjectStorage and store references to objects inside of this.
Creating a new SplObjectStorage is simple:
$objectStore = new SplObjectStorage();
An SplObjectStorage instance not only retains uniqueness information about objects, but objects are also stored in predictable order. SplObjectStorage is a FIFO -- First In, First Out.
Adding objects is done with the attach() method:
$objectStore = new SplObjectStorage();
$object = new stdClass();
$objectStore->attach($object);
It should be noted that attach will only attach an object once. If the same object is passed to attach() twice, the second attempt will simply be ignored. For this reason it is unnecessary to perform a contains() call before an attach() call. Doing so is redundant and costly.
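A brief sketch of that behaviour:

```php
<?php
$objectStore = new SplObjectStorage();
$object = new stdClass();

$objectStore->attach($object);
$objectStore->attach($object);   // second attach of the same object is ignored

$count = count($objectStore);    // 1
```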
Checking for the existence of an object within an SplObjectStorage instance is also straightforward:
$objectStore = new SplObjectStorage();
$object = new stdClass();
$objectStore->attach($object);
// ...
if ($objectStore->contains($object)) {
// Do something...
}
While SplObjectStorage has nowhere near the number of supporting methods that one has access to with arrays, it allows for iteration and somewhat limited access to the objects stored within. In many use cases (including the one I am investigating here), SplObjectStorage provides the requisite functionality.
Now that we have taken a look at the two candidate data structures, let's see how they perform.
The Comparisons
Anyone who has seen Larry (Crell) Garfield's micro-benchmarks for arrays and SPL ArrayAccess objects will likely come into this set of benchmarks with the same set of expectations Larry and I had. We expected PHP's arrays to blow the SplObjectStorage out of the water. After all, arrays are a primitive type in PHP, and have enjoyed years of optimizations. However, the documentation for the SplObjectStorage indicates that the search time for an SplObjectStorage object is O(1), which would certainly make it competitive if the base speed is similar to that of an array.
My testing environments are:
An iMac (current generation) with a 3.06 GHz Intel Core 2 Duo and 2 GB of 800 MHz DDR2 RAM. MAMP 1.72 (PHP 5.2.5) provides the AMP stack.
A MacBook Pro with a 2.4 GHz Intel Core 2 Duo and 4 GB of 667 MHz DDR2 RAM. MAMP 1.72 (PHP 5.2.5) provides the AMP stack.
In both cases, the performance tests averaged about the same. Benchmarks in this article come from the second system.
Our basic testing strategy was to build a simple test that captured information about three things:
The amount of time it takes to load the data structure
The amount of time it takes to seek the data structure
The amount of memory the data structure uses
We did our best to minimize the influence of other factors on the test. Here is our testing script:
<?php
/**
* Object hashing tests.
*/
$sos = new SplObjectStorage();
$docs = array();
$iterations = 100000;
for ($i = 0; $i < $iterations; ++$i) {
$doc = new DOMDocument();
//$doc = new stdClass();
$docs[] = $doc;
}
$start = $finis = 0;
$mem_empty = memory_get_usage();
// Load the SplObjectStorage
$start = microtime(TRUE);
foreach ($docs as $d) {
$sos->attach($d);
}
$finis = microtime(TRUE);
$time_to_fill = $finis - $start;
// Check membership on the object storage
$start = microtime(TRUE); // TRUE returns a float; the string form cannot be subtracted reliably
foreach ($docs as $d) {
$sos->contains($d);
}
$finis = microtime(TRUE);
$time_to_check = $finis - $start;
$mem_spl = memory_get_usage();
$mem_used = $mem_spl - $mem_empty;
printf("SplObjectStorage:\nTime to fill: %0.12f.\nTime to check: %0.12f.\nMemory: %d\n\n", $time_to_fill, $time_to_check, $mem_used);
unset($sos);
$mem_empty = memory_get_usage();
// Test arrays:
$start = microtime(TRUE);
$arr = array();
// Load the array
foreach ($docs as $d) {
$arr[spl_object_hash($d)] = $d;
}
$finis = microtime(TRUE);
$time_to_fill = $finis - $start;
// Check membership on the array
$start = microtime(TRUE); // TRUE returns a float; the string form cannot be subtracted reliably
foreach ($docs as $d) {
//$arr[spl_object_hash($d)];
isset($arr[spl_object_hash($d)]);
}
$finis = microtime(TRUE);
$time_to_check = $finis - $start;
$mem_arr = memory_get_usage();
$mem_used = $mem_arr - $mem_empty;
printf("Arrays:\nTime to fill: %0.12f.\nTime to check: %0.12f.\nMemory: %d\n\n", $time_to_fill, $time_to_check, $mem_used);
?>
The test above is broken into four separate tests. The first two test how well the SplObjectStorage method handles loading and containment checking. The second two perform the same test on our improvised array data structure.
There are two things worth noting about the test above.
First, the object of choice for our test was a DOMDocument. There are a few reasons for this. The obvious reason is that this test was done with the intent of optimizing QueryPath, which works with elements from the DOM implementation. There are two other interesting reasons, though. One is that DOMDocuments are not lightweight. The other is that DOMDocuments are backed by a C implementation, making them one of the more difficult cases when storing objects. (They cannot, for example, be conveniently serialized.)
That said, after observing the outcome, we repeated the test with basic stdClass objects and found the performance results to be nearly identical, and the memory usage to be proportional.
The second thing worth mentioning is that we used 100,000 iterations to test. This was about the upper bound that my PHP configuration allowed before running out of memory. Other than that, though, the number was chosen arbitrarily. When I ran tests with lower iteration counts, the SplObjectStorage definitely scaled linearly. Array performance was less predictable (larger standard deviation) with smaller data sets, though it seemed to average around the same for lower sizes as it does (more predictably) for larger sized arrays.
The Results
So how did these two strategies fare in our micro-benchmarks? Here is a representative sample of the output generated when running the above:
SplObjectStorage:
Time to fill: 0.085041999817.
Time to check: 0.073099000000.
Memory: 6124624
Arrays:
Time to fill: 0.193022966385.
Time to check: 0.153498000000.
Memory: 8524352
Averaging this over multiple runs, SplObjectStorage executed both fill and check functions twice as fast as the array method presented above. We tried various permutations of the tests above. Here, for example, are results for the same test with a stdClass object:
SplObjectStorage:
Time to fill: 0.082209110260.
Time to check: 0.070617000000.
Memory: 6124624
Arrays:
Time to fill: 0.189271926880.
Time to check: 0.152644000000.
Memory: 8524360
Not much different. Even adding arbitrary data to the object we stored does not make a difference in the time it takes for the SplObjectStorage (though it does seem to raise the time ever so slightly for the array).
Our conclusion is that SplObjectStorage is indeed a better solution for storing lots of objects in a set. Over the last week, I've ported QueryPath to SplObjectStorage (see the Quark branch at GitHub -- the existing Drupal QueryPath module can use this experimental branch without alteration), and will likely continue benchmarking. But preliminary results seem to provide a clear indication as to the best approach.
As a result of these findings, I'm much less inclined to default to arrays as "the best choice" simply because they are basic data types. If the SPL library contains features that out-perform arrays, they should be used when appropriate. From QueryPath to my Drupal modules, I expect that my code will be impacted by these findings.
Thanks to Crell for his help, and to Eddie at Frameweld for sparking my examination of these two methods in the first place.
In PHP you use arrays for that.
There is no built-in equivalent of std::set in PHP.
You can use arrays "like" sets, but it's up to you to enforce the rules.
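As a minimal sketch of what "enforcing the rules" can look like, the values can be stored as array keys, so duplicates collapse and membership is a key lookup. The helper names set_add() and set_contains() are hypothetical and only serve the illustration:

```php
<?php
// Hypothetical helpers (not part of PHP): a set of scalar values
// emulated with array keys, so a value can only be present once.

function set_add(array $set, $value)
{
    $set[$value] = true;   // adding an existing value changes nothing
    return $set;
}

function set_contains(array $set, $value)
{
    return isset($set[$value]);
}

$set = [];
foreach (['apple', 'pear', 'apple'] as $fruit) {
    $set = set_add($set, $fruit);
}

var_dump(count($set));                  // int(2)
var_dump(set_contains($set, 'pear'));   // bool(true)
var_dump(set_contains($set, 'plum'));   // bool(false)
```

This only works for values that are valid array keys (integers and strings), and a numeric string such as '1' ends up as the same key as the integer 1.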
Have a look at Set from Nspl. It supports basic set operations which take other sets, arrays and traversable objects as arguments. You can see examples here.
In the middle of a period of big refactoring at work, I wish to introduce stdClass as a way to return data from functions, and I'm trying to find non-subjective arguments to support my decision.
Are there any situations in which it would be best to use one instead of the other?
What benefits would I get from using stdClass instead of arrays?
Some would say that functions have to be as small and specific as possible, so that they return one single value. My decision to use stdClass is temporary, as I hope to find the right Value Objects for each process in the long run.
The usual approach is
Use objects when returning a defined data structure with fixed branches:
$person
-> name = "John"
-> surname = "Miller"
-> address = "123 Fake St"
Use arrays when returning a list:
"John Miller"
"Peter Miller"
"Josh Swanson"
"Harry Miller"
Use an array of objects when returning a list of structured information:
$person[0]
-> name = "John"
-> surname = "Miller"
-> address = "123 Fake St"
$person[1]
-> name = "Peter"
-> surname = "Miller"
-> address = "345 High St"
Objects are not suitable to hold lists of data, because you always need a key to address them. Arrays can fulfill both functions - hold arbitrary lists, and a data structure.
Therefore, you can use associative arrays over objects for the first and third examples if you want to. I'd say that's really just a question of style and preference.
@Deceze makes a number of good points on when to use an object (validation, type checking and future methods).
Using stdClass to fulfill the same function as an array is not very useful IMHO, it just adds the overhead of an object without any real benefit. You're also missing out on many useful array functions (e.g. array_intersect). You should at least create your own class to enable type checking, or add methods to the object to make it worth using an object.
I don't think there is any reasonable advantage of using a stdClass over an array as long as your sole intention is to return multiple arbitrary datatypes from a function call.
Since you cannot technically return multiple values natively, you have to use a container that can hold all other datatypes available in PHP. That would be either an object or an array.
function fn1() { return array(1,2); }
function fn2() { return array('one' => 1, 'two' => 2); }
function fn3() { return (object) array(1,2); }
function fn4() { return (object) array('one' => 1, 'two' => 2); }
All of the above would work. The array is a tiny, negligible fraction faster and less work to type. It also has a clearly defined purpose, in contrast to the generic stdClass (which is a bit wishy-washy, isn't it?). Both only have an implicit interface, so you will have to look at the docs or the function body to know what they will contain.
If you want to use objects at any cost, you could use ArrayObject or SplFixedArray, but if you look at their APIs would you say you need their functionality for the simple task of returning random multiple values? I don't think so. Don't get me wrong though: if you want to use stdClass, then use it. It's not like it would break anything. But you also would not gain anything. To add at least some benefit, you could create a separate class named ReturnValues for this.
Could be a simple tagging class
class ReturnValues {}
or something more functional
class ReturnValues implements Countable
{
protected $values;
public function __construct() { $this->values = func_get_args(); }
public function __get($key) { return $this->values[$key]; }
public function count() { return count($this->values); }
}
Granted, it doesn't do much, and getting the values out of it is still done through an implicit interface, but at least the class has a more clearly defined responsibility now. You could extend this class to create ReturnValue objects for particular operations and give those an explicit interface:
class FooReturnValues extends ReturnValues
{
public function getFoo() { return $this->values['foo']; }
public function getBar() { return $this->values['bar']; }
}
Now the developer just has to look at the API to know which multiple values foo() will return. Of course, having to write concrete ReturnValue classes for each and every operation that might return multiple values could become tedious quickly. And personally, I find this overengineered for the initial purpose.
Anyway, hope that makes any sense.
Well, there are 3 differences:
Objects have an identity, which is why arrays are passed by value by default while objects are passed by sharing.
There is a semantic difference. If you use an object, anyone who reads the code understands that the value represents a model of some sort of entity, while an array is supposed to act as a collection or a map.
And last but not least, refactoring becomes significantly easier. If you want to use a concrete class rather than stdClass, all you have to do is instantiate another class, which also allows you to add methods.
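The first point can be seen in a small sketch. The function names renameArray() and renameObject() are made up for this illustration:

```php
<?php
// Hypothetical functions, for illustration only.

function renameArray(array $person)
{
    $person['name'] = 'Peter';   // changes a local copy of the array
}

function renameObject(stdClass $person)
{
    $person->name = 'Peter';     // changes the object the caller holds
}

$asArray = ['name' => 'John'];

$asObject = new stdClass();
$asObject->name = 'John';

renameArray($asArray);
renameObject($asObject);

echo $asArray['name'], "\n";   // John
echo $asObject->name, "\n";    // Peter
```

The array keeps its original value because the function received a copy, while the object was changed through the shared handle.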
greetz
back2dos
The only OBJECTIVE vote of confidence I can find is:
json_decode uses stdClass by default so us mortals in userland should use stdClass for similar situations.
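For reference, a short sketch of that default and of the second argument that switches to associative arrays (the JSON string is made-up sample data):

```php
<?php
// Made-up sample data.
$json = '{"name": "John", "surname": "Miller"}';

$asObject = json_decode($json);         // default: returns a stdClass object
$asArray  = json_decode($json, true);   // second argument true: associative array

var_dump($asObject instanceof stdClass);   // bool(true)
echo $asObject->name, "\n";                // John
echo $asArray['surname'], "\n";            // Miller
```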
I find stdClass objects more useful than arrays when I need to keep my code clean and readable, somewhat like a sentence. Take for example a function getProperties() which returns a set of properties, say data about a person (name, age, sex). If getProperties() returned an associative array, then to use one of the returned properties you would write two instructions (at least before PHP 5.4 introduced function array dereferencing):
$data = getProperties();
echo $data['name'];
On the other hand, if getProperties() returns a stdClass then you could write that in just one instruction:
echo getProperties()->name;
In tests of arrays vs. stdClass, the way PHP handles dynamic properties is slower than associative arrays. I'm not saying this to argue for micro-optimization, but rather to point out that if you're going to do this, you're better off defining a data class with no methods and public properties, especially if you are using PHP 5.4+. Under the hood, defined properties are mapped directly to a C array with no hash table, whereas dynamic ones need to use a hash table.
This has the added bonus of later becoming a full class without any major interface reworking.
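A minimal sketch of such a data class might look like this. The class name PersonData and the function getPerson() are hypothetical, and the sample values are borrowed from the examples earlier in the thread:

```php
<?php
// Hypothetical data class: declared public properties, no behaviour yet.
class PersonData
{
    public $name;
    public $surname;
    public $address;

    public function __construct($name, $surname, $address)
    {
        $this->name = $name;
        $this->surname = $surname;
        $this->address = $address;
    }
}

// Hypothetical function returning structured data as a PersonData object.
function getPerson()
{
    return new PersonData('John', 'Miller', '123 Fake St');
}

echo getPerson()->name, "\n";   // John
```

Callers read the properties exactly as they would on a stdClass, so methods can be added to PersonData later without changing them.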