I'm having a hard time finding clear information on PHP's garbage collection as it relates to objects referencing other objects. Specifically, I'm trying to understand what happens when I have a chain of objects that reference each other, and I destroy or unset() the first object in the chain. Will PHP know to GC the remaining objects in the chain or do I have to destroy them all individually to prevent a memory leak?
For example, say I have a class BinaryTree with a rootNode property, and a Node class that extends BinaryTree and has three properties: a parent Node, a leftChild Node, and a rightChild Node. The rootNode will be the Node whose parent is null, but every other Node created will reference either the rootNode or a subsequent child Node before it as its parent, and any Nodes after it as either leftChild or rightChild. The rootNode, therefore, is the only way to reach any of the child nodes.
If I unset() the rootNode, will all other Nodes remain in memory? Would I need to iterate through the entire tree unsetting each Node individually to prevent them from just existing without a way to reference them anymore, or will PHP know that those objects can no longer be referenced and GC them?
So far what I'm reading leads me to believe I would need to iterate the entire list of objects to destroy them individually, but I'm wondering if anyone has any experience with this and can offer some clarity. TIA.
The PHP manual has a section on this:
...a zval container also has an internal reference counting mechanism to optimize memory usage. This second piece of additional information, called "refcount", contains how many variable names (also called symbols) point to this one zval container.
[...]
...when the "refcount" reaches zero, the variable container is removed from memory
[...]
...unsetting a variable removes the symbol, and the reference count of the variable container it points to is decreased by one.
And so you get a cascade effect. When you unset the rootNode, the reference count of the root object drops to zero and the memory used for the root is freed. Freeing it also releases the references it held to its child objects, whose reference counts are decreased in turn, and so on down the tree. One caveat: because your Nodes also hold parent back-references, parent and child objects form reference cycles, and reference counting alone can never free a cycle. Since PHP 5.3, a separate cycle collector detects such mutually referencing groups and frees them once nothing outside the group points at them.
So, no, you do not have to unlink the nodes or destroy them yourself. Between the reference counting mechanism and the cycle collector, the whole structure will be freed.
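A minimal sketch of this in action (the Node class and payload size here are illustrative, not from the original post; typed properties require PHP 7.4+). The parent back-references create cycles, so the script calls gc_collect_cycles() explicitly to show the collector reclaiming the whole chain after a single unset of the external symbols:

```php
<?php

class Node
{
    public ?Node $parent = null;
    public ?Node $leftChild = null;
    public array $payload;

    public function __construct()
    {
        $this->payload = range(1, 1000); // make each node's footprint measurable
    }
}

// Build a chain of 1000 nodes with parent back-references (i.e. cycles).
$root = new Node();
$current = $root;
for ($i = 0; $i < 999; $i++) {
    $child = new Node();
    $child->parent = $current;   // back-reference -> reference cycle
    $current->leftChild = $child;
    $current = $child;
}

$before = memory_get_usage();
unset($root, $current, $child);  // drop every external symbol pointing into the chain
gc_collect_cycles();             // the cycle collector frees the mutually referencing nodes
$after = memory_get_usage();

echo ($after < $before) ? "freed\n" : "leaked\n"; // freed
```

Without the back-references there would be no cycles at all, and plain reference counting would free the chain the moment the last external symbol was unset, with no collector run needed.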
Related
foreach in PHP 7, when iterating by value, by default operates on a copy of the array, according to http://php.net/manual/en/migration70.incompatible.php
Does it lazily create a copy only if changes are made to the array or one of its values, or will it always make a copy, in essence making looping by reference a performance optimization?
Also, do arrays of objects still give you references to the objects when you loop over them? Or will foreach also create copies and hand you the objects by value?
In PHP 7, if you iterate an array by value, the copy will be done lazily, only when and if the array is actually modified.
If you iterate an array by reference instead, a separation will be performed at the start of the loop. If the array is currently used in more than one place, this separation will lead to a copy.
Furthermore iterating by reference means that a) the array has to be wrapped into a reference and b) each element has to be wrapped in a reference as well. Creating a reference wrapper is an expensive operation, because it requires allocation.
Additionally iteration by reference requires us to use a modification-safe iteration mechanism. This works by registering the iterator with the array and checking for potentially affected iterators in various array modification operations.
So no, iterating by reference is certainly not an optimization, it's a de-optimization, as using references usually is.
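A short sketch of both points, including the arrays-of-objects question (the Counter class is illustrative; typed properties require PHP 7.4+). Even when iterating by value, what lands in the loop variable for an object element is the object handle, so mutating the object is visible through the array, while rebinding the loop variable is not:

```php
<?php

// 1) By-value iteration does not copy the array up front; a copy is only
//    made if the array itself is modified during the loop.
$big = range(1, 5);
foreach ($big as $value) {
    // no modification of $big here, so no copy is ever made
}

class Counter
{
    public int $n = 0;
}

// 2) Arrays of objects: $c receives the object handle, even by value,
//    so mutating the object affects the one stored in the array.
$counters = [new Counter(), new Counter()];
foreach ($counters as $c) {
    $c->n = 42;               // mutates the object inside the array
}
echo $counters[0]->n, "\n";   // 42

// 3) Reassigning $c itself only rebinds the local variable and leaves
//    the array untouched.
foreach ($counters as $c) {
    $c = new Counter();
}
echo $counters[0]->n, "\n";   // still 42
```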
I have written a file-handler-class, that works like this:
__construct opens and exclusively locks a file, reads its JSON content, and parses it into a PHP array, which is kept as a property of the class.
The file stays locked, in order to avoid race conditions.
Other 'worker classes' make changes to this array from other scopes.
__destruct encodes the finished array, writes it back to the file, and unlocks the file.
Everything works fine ...
QUESTION:
Is it sensible to keep the array as a property of the original class, or is it better to pass the array to the worker classes and let them return it at the end?
Perhaps there is a way to keep the array local and pass it to the worker classes by reference, instead of as raw data?
I mean ... this is a question of avoiding duplicates and not wasting memory; a question of speed, not passing things around unnecessarily; and a question of best practice, keeping things easy to understand.
Actually, by passing the array to another function, having that function modify it, and then returning it to some other caller that may or may not also modify it, you are in fact copying that array multiple times (each modification of a shared array triggers PHP's copy-on-write) and by definition wasting memory.
Whereas by keeping it as a property of the object instance, you would not be invoking any copy-on-write semantics, even if the caller is not the same instance: passing an object instance won't copy the array, nor will modifying it through that instance.
Not to mention you just make it easier to retain state within that object (assuming you care about validation).
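A minimal sketch of the property-based approach (the class and method names here are illustrative, not from the original post). The workers receive the handler object, and since objects are passed by handle, the array is modified in place through its one owner rather than being copied back and forth:

```php
<?php

class JsonFileHandler
{
    public array $data;

    public function __construct(array $initial)
    {
        // In the real class this would come from the locked JSON file.
        $this->data = $initial;
    }
}

class Worker
{
    // Only the object handle is passed; $handler->data is never copied here.
    public function addEntry(JsonFileHandler $handler, string $key, $value): void
    {
        $handler->data[$key] = $value; // modifies the one shared array in place
    }
}

$handler = new JsonFileHandler(['existing' => true]);
(new Worker())->addEntry($handler, 'new', 123);

echo $handler->data['new'], "\n"; // 123
```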
I have two classes. Parent and Child in a OneToMany relationship. Parent has an array called $children where it stores Child instances. Child has a private $name property with public getter/setter methods. I want children with unique names.
The way I went about solving this is that I pass the Parent instance to the Child's constructor, store it in $_my_parent, and in the Child's setName($name) method I ask the Parent instance to loop over all children and check whether $name can be used.
Pretty straight forward.
Q1: This obviously creates a circular reference between Parent and Child. Is that a problem, e.g. when serializing?
Q2: Is there another way of doing this?
While 100% guaranteed data integrity this way may be nice in theory, in practice it's not attainable anyway. You could always set properties on your objects which make them not unique, for example using the Reflection API.
I'd keep it simple:
your child objects are dumb data objects, they do not know anything about their parent and are self contained
the parent just holds child objects, it does not inject itself into them
either in the parent object or in yet another external class, have a validation method which checks whether the parent-child combination is valid by iterating the children and ensuring their names are unique
Simply call this validation method explicitly whenever necessary, don't trigger it automatically whenever you modify a child. It gets rid of a lot of complexity and problems with very few downsides.
I am building a tree structure in PHP and I could leave it as an array, or turn it into a tree of objects. I think there will be much better performance if I leave it as an array, but I'm not sure.
In the case of an array, the owning object would have a reference to the root element, and that's it. The root element would contain sub-arrays which may, in turn, contain their own sub-arrays.
In the case of objects, my mapper would need to instantiate them on load, and for every child object there would be a reference from its parent. For a 300-node tree, this would mean 299 references, as opposed to 1 when using arrays.
So, it seems to me that the performance would be much better if I use arrays rather than objects. Is this correct? It's important because sacrificing the behaviour of objects will be a considerable trade off in this case.
Is the PHP implementation of a Heap really a full implementation?
When I read this article, http://en.wikipedia.org/wiki/Heap_%28data_structure%29, I get the idea that a child node has a specific parent, and that a parent has specific children.
When I look at the example in the PHP documentation however, http://au.php.net/manual/en/class.splheap.php, it seems that child nodes all share the same 'level', but the specific parent/child information is not important.
For example, which node is the parent for each of the three nodes who are ranked 10th in the PHP example?
In my application, when a user selects 'node 156', I need to know who its children are so that I can pay them each a visit. (I could make their identities 'node 1561', 'node 1562', etc, so the relationship is obvious).
Is the PHP Heap implementation incomplete? Should I forget the Spl class and go my own way? Or am I missing something about how heaps should operate? Or perhaps I should be looking at a particular heap variant?
Thanks heaps!
The API for this heap implementation does not allow array access, which is what you need in this case. Heaps are typically used to implement other structures that allow you to easily remove items from the top.
You could create a wrapper iterator that seeks to the positions you need, but I suspect this would perform poorly and would not be a very good solution.
In this situation, a BinaryTree is what I think you need. Given a spot in the tree, you can walk down to its children because they're directly linked. I do have a BinarySearchTree that keeps the tree ordered, and you can call getRoot to get a copy of the BinaryTree. There's also an AvlTree which keeps the tree balanced for optimal searching.
A Heap is usually used to answer questions that revolve around "what is the max/min element in the set".
While a heap could incidentally answer "who are the children of the max/min node" efficiently, it doesn't stand out as a good data structure when you need to access an arbitrary node (one that isn't the max node) to answer your question. It's also not the data structure to use if there are specific parent-child relationships that need to be represented and maintained, because children can and do get swapped between different parent nodes as the heap re-orders itself.
So PHP's heap implementation is definitely not incomplete on these grounds. You simply need a different data structure.
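For the "visit each child of node 156" use case, a plain tree of linked node objects is all that's needed. A minimal sketch (the TreeNode class and node ids are illustrative; constructor property promotion requires PHP 8):

```php
<?php

class TreeNode
{
    /** @var TreeNode[] */
    public array $children = [];

    public function __construct(public string $id) {}

    public function addChild(TreeNode $child): TreeNode
    {
        $this->children[] = $child;
        return $child;
    }
}

$node156 = new TreeNode('node 156');
$node156->addChild(new TreeNode('node 1561'));
$node156->addChild(new TreeNode('node 1562'));

// "Pay each child a visit": the parent/child links are explicit and stable,
// unlike in a heap, where siblings and parents get swapped on insert/extract.
foreach ($node156->children as $child) {
    echo $child->id, "\n";
}
// node 1561
// node 1562
```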