In PHP I'm having a real hard time using serialize/unserialize on a large array of objects (100000+ objects). These objects can be of a lot of different types, but are all descendants from a base class.
Somehow when I use unserialize on the array of objects about 0,001% of the objects are generated wrong! A whole different object is generated instead. This does not happen random, but each time with the same objects. But if I change the order of the array, it happens with different objects, so this looks like a bug to me.
I switched to json_encode/json_decode, but found that this always uses stdClass as the object's class. I solved this by including the classname of each object as a property, and then use this property to construct a the new object, but this solution is not very elegant.
Using var_export with eval works fine but is about 3 times slower than the other methods and uses much more memory.
Now my questions are :
what could cause the bug / wrong objects that are created with
unserialize ?
is there a better way to use json_decode with an array of objects, so that classes are somehow stored within the json
automatically?
is there maybe even an other method to read/write a large array of objects in PHP?
UPDATE
I'm beginning to believe there must be something strange with my array-data, because with msgpack_serialize (php extension, alternative to serialize) I get the same kind of errors (but strangely enough not the same objects are generated wrong!).
UPDATE 2
Found a solution : instead of doing serialize on the entire array, I do it on each object now, first serialize and then base64_encode and then I store each serialized object as a separate line in a text-file. This way I can generate the entire array of objects and then iterate each object using file() with unserialize and base64_decode : no more errors!
With serialize/unserialize functions 2 magic methods are connected.
__sleep()
serialize() checks if your class has a function with the magic name __sleep(). If so, that function is executed prior to any serialization. It can clean up the object and is supposed to return an array with the names of all variables of that object that should be serialized. If the method doesn't return anything then NULL is serialized and E_NOTICE is issued.
With sleep you have better control on serializaction you can pass the variables that can be serialized and clean resources befere seralization.
When unserialize is called then the other function should be mentioned
__wakeup()
The intended use of __wakeup() is to reestablish any database connections that may have been lost during serialization and perform other reinitialization tasks.
About json_encode()
It doesn't have magic methods __wakeup, __sleep so you have less control
It doesn't serialize private properties
Objects are always stored as stdClass
Json_encode is faster than serialize
It's is up to you what you choose but for more advanced classes with db connection etc. I would suggest serialize()
Related
I have a function that requires information to be passed to it. The information is contained within an object. Therefore I must pass that object as one of the function arguments. The object is very large however, and I would like to reduce the overhead involved in making copies every time it is passed. Here is an example of
My function Call:
1 myFunction($myObject1);
and the function:
2 function myFunction($myObject2){
3 //do stuff
4 }
I understand there is more to it in php than just pass-by-reference vs pass-by-value. Correct me if I am wrong, but I believe on line 1 there is only a reference to the object made, but on line 2 the object is copied. To avoid this copy I have replaced ($myObject2) with (&$myObject2). I still refer to the object within the function definition as $myObject2 and everything seems to work. I believe I am now using a reference only and therefore making no copies of the object (which was my goal). Is my thinking correct? If not not why?
In PHP5, "objects" are not values. The value of the variables $myObject1, $myObject2 are object references (i.e. pointers to objects). You cannot get "the object itself"; objects can only be manipulated through these pointers.
Assignment and passing by value only copy values. Since objects are not values, they cannot ever be cloned through assignment, passing, etc. The only way to duplicate an object is to use the clone operator.
Putting & on a variable makes it pass or assign by reference, instead of by value without the &. Passing by reference allows you to modify the variable passed. Since the value of a variable cannot be an object, this has nothing to do with objects.
array_values() doesn't work with ArrayAccess object.
neither does array_keys()
why?
if I can access $object['key'] I should be able to do all kind of array operations
No, you've misunderstood the utility of ArrayAccess. It isn't just a sort of wrapper for an array. Yes, the standard example for implementing it uses a private $array variable whose functionality is wrapped by the class, but that isn't a particularly useful one. Often, you may as well just use an array.
One good example of ArrayAccess is when the script doesn't know what variables are available.
As a fairly silly example, imagine an object that worked with a remote server. Resources on that server can be read, updated and deleted using an API across a network. A programmer decides they want to wrap that functionality with array-like syntax, so $foo['bar'] = 'foobar' sets the bar resource on that server to foobar and echo $foo['bar'] retrieves it. The script has no way of finding out what keys or values are present without trying all possible values.
So ArrayAccess allows the use of array syntax for setting, updating, retrieving or deleting from an object with array-like syntax: no more, no less.
Another interface, Countable, allows the use of count(). You could use both interfaces on the same class. Ideally, there would be more such interfaces, perhaps including those that can do array_values or array_keys, but currently they don't exist.
ArrayAccess is very limited. It does not allow the use of native array_ functions (no existing interface does).
If you need to do more array-like operations on your object, then you are essentially creating a collection. A collection should be manipulated by its methods.
So, create an object and extend ArrayObject. This implements IteratorAggregate, Traversable, ArrayAccess, Serializable and Countable.
If you need the keys, simply add an array_keys method:
public function array_keys($search_value = null, $strict = false)
{
return call_user_func_array('array_keys', array($this->getArrayCopy(), $search_value, $strict));
}
Then you can:
foreach ($object->array_keys() as $key) {
echo $object[$key];
}
The ArrayObject/ArrayAccess allows objects to work as arrays, but they're still objects. So instead of array_keys() (which work only on arrays) you should use get_object_vars(), for example:
var_dump(array_keys(get_object_vars($ArrObj)));
or convert your ArrayObject by casting it into array by (array) $ArrObj, e.g.:
var_dump(array_keys((array)$ArrObj));
I have an XML file which contains the structure of some objects. The object is like this:
class Object:
{
private $name;
private $info;
private $$items;
}
Where $items is an array of objects, so it is recursive.
As of now, when I have to list the items I use simplexml to iterate within the elements and show them.
My questions are:
1) If I parse the XML and convert the data into the object instead of working with pure XML, would it affect the overall performance of the pages all that much? Will it slow down too much considering that every page the user loads he will have to load the items?
2) Is it a good idea to serialize() a recursive object like the one I’ve defined?
SimpleXML cannot be serialized because it's considered to be a resource. However, you can easily grab the output of $sx->toXML(); and serialize that, re-constructing the SimpleXMLElement(s) once you've unserialized them.
The performance difference of re-parsing the XML is not very considerable unless you're working with extremely large XML trees.
In regards to your object, you can also implement the __sleep() and __wakeup() magic methods that will allow you do alter the object before it is serialized and unserialized respectively.
When serializing your recursive object example, don't include your $$items variable in the __sleep() magic method and re-implement it in __wakeup.
I'm trying to serialize a Doctrine_query object in Symfony :
var_dump(serialize($this->pager->getQuery()));
The result is :
string(2) "N;"
What am I doing wrong?
You can not serialize every object in PHP. Objects themselves - by implementing the Serializeable interface PHP Manual - can protect themselves from being serialized for example.
They return a NULL value then (or don't return anything which is then NULL in PHP). And that's exactly the contents of your serialized string: a serialized NULL (N;).
And there are even some build-in classes that go even further than that. But it applies as well to user-defined classes and build-in classes: Some of them are not available for serialization.
One example of a built-in class that can not be serialized in PHP is DOMDocument, however it is possible to add the functionality as the following question demonstrates:
How to serialize/save a DOMElement in $_SESSION?.
I have tried to unserialize a PHP Object.
Warning: unserialize() [function.unserialize]: Node no longer exists in /var/www/app.php on line 42
But why was that happen?
Even if I found a solution to unserialize simplexml objects, its good to know why php cant unserialize objects?
To serialize simplexml object i use this function
function serializeSimpleXML(SimpleXMLElement $xmlObj)
{
return serialize($xmlObj->asXML());
}
To unserialize an simplexml objetc i use this function
function unserializeSimpleXML($str)
{
return simplexml_load_string(unserialize($str));
}
SimpleXMLElement wraps a libxml resource type. Resources cannot be serialized. On the next invocation, the resource representing the libxml Node object doesn't exist, so unserialization fails. It may be a bug that you are allowed to serialize a SimpleXMLElement at all.
Your solution is the correct one, since text/xml is the correct serialization format for anything XML. However, since it is just a string, there isn't really any reason to serialize the XML string itself.
Note that this has nothing inherently to do with "built-in" PHP classes/objects, but is an implementation detail of SimpleXML (and I think DOM in PHP 5).
just inherent the class(the main xml class would be best) in a other one
and use __sleep
to store the data required to initialize simplexml(any object)
and __wake
to reinitialize the object as required
this way you can serialize(any object)
edit: remember this class needs to be accessible first,
this can be done by loading(including) the class or an __autoload