php garbage collection while script running - php

I have a PHP script that runs on cron and can take up to 15 minutes to execute. At regular intervals I have it print memory_get_usage() so I can see what is happening. The first time it reports my usage I am at 10 MB. When the script finishes I am at 114 MB!
Does PHP do its garbage collection while the script is running? Or what is happening to all that memory? Is there something I can do to force garbage collection? The task my script performs is a nightly import of a couple thousand nodes into Drupal, so it is doing the same thing many times.
Any suggestions?

The key is that you unset your global variables as soon as you don't need them.
You needn't call unset explicitly for local variables and object properties because these are destroyed when the function returns or the object is destroyed.
PHP keeps a reference count for all variables and destroys them (in most conditions) as soon as this reference count goes to zero. Objects have one internal reference count and the variables themselves (the object references) each have one reference count. When all the object references have been destroyed because their reference counts have hit 0, the object itself will be destroyed. Example:
$a = new stdclass; //$a zval refcount 1, object refcount 1
$b = $a; //$a/$b zval refcount 2, object refcount 1
//this forces the zval separation because $b isn't part of the reference set:
$c = &$a; //$a/$c zval refcount 2 (isref), $b 1, object refcount 2
unset($c); //$a zval refcount 1, $b 1, object refcount 2
unset($a); //$b refcount 1, object refcount 1
unset($b); //everything is destroyed
But consider the following scenario:
class A {
public $b;
}
class B {
public $a;
}
$a = new A;
$b = new B;
$a->b = $b;
$b->a = $a;
unset($a); //cannot destroy object $a because $b still references it
unset($b); //cannot destroy object $b because $a still references it
These cyclic references are where PHP 5.3's garbage collector kicks in. You can explicitly invoke the garbage collector with gc_collect_cycles.
See also Reference Counting Basics and Collecting Cycles in the manual.
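For a long-running import like the one in the question, the effect of cycles can be measured directly. The following is only a sketch: ParentNode, ChildNode and importBatch() are hypothetical stand-ins, not Drupal classes, and the exact numbers depend on the PHP version.

```php
<?php
// Hypothetical stand-ins for two objects that point at each other.
class ParentNode { public $child; }
class ChildNode  { public $parent; }

function importBatch(int $count): void
{
    for ($i = 0; $i < $count; $i++) {
        $parent = new ParentNode;
        $child  = new ChildNode;
        $parent->child = $child;
        $child->parent = $parent;   // cycle: the refcounts never reach zero
    }                               // each pair becomes garbage once overwritten
}

importBatch(1000);                  // small enough not to trigger an automatic run
$before    = memory_get_usage();
$collected = gc_collect_cycles();   // force a cycle-collection run
$after     = memory_get_usage();

printf("collected %d, freed %d bytes\n", $collected, $before - $after);
```

Calling gc_collect_cycles() every few hundred iterations of a batch job is one way to keep such cycles from piling up between automatic runs.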

PHP's garbage collection is largely reference counting, with some cycle detection on top. If you keep references around that are still reachable, they will easily add up if not freed.
Use unset() to free variables you are no longer using. Simply overwriting a variable (e.g. with null) also releases the old value, but the variable itself stays in the symbol table; unset() removes the variable as well.
You should also properly release any resources etc. that you use.
You will still see memory increase during runtime: values are freed as soon as their reference count reaches zero, but cycles are only reclaimed when the cycle collector runs, which happens at its own discretion (for example, when its buffer of possible roots fills up) rather than on every unset().
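How much memory each approach gives back is easy to check on your own PHP version. A minimal sketch, where measureRelease() is a hypothetical helper and the 10 MB string merely stands in for real data:

```php
<?php
function measureRelease(): array
{
    $base = memory_get_usage();
    $big  = str_repeat('x', 10 * 1024 * 1024);   // roughly 10 MB of data
    $withBig = memory_get_usage();

    $big = null;                                  // overwrite the value
    $afterNull = memory_get_usage();

    $big = str_repeat('x', 10 * 1024 * 1024);
    unset($big);                                  // remove the variable
    $afterUnset = memory_get_usage();

    return compact('base', 'withBig', 'afterNull', 'afterUnset');
}

$usage = measureRelease();
foreach ($usage as $label => $bytes) {
    printf("%-10s %d\n", $label, $bytes);
}
```

Comparing the afterNull and afterUnset figures against base shows what was actually returned in each case.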

Use unset() as much as possible and check used memory more often. Yes, PHP does garbage collection during runtime under a few conditions.
Here is a helpful post on php.net.

If the memory is increasing that much, then you are probably not releasing it. You have created a memory leak. Garbage collection won't help you if you don't unset variables, destroy objects and/or they go out of scope.
Are you unsetting the nodes you load once you are done with them? I've written PHP scripts that run for hours, processing millions of database records, with no problems and memory usage that goes up and down within a very acceptable range.

Related

Cyclic references in PHP 7.4

In PHP 7.4 I noticed that the number of collected cycles returned by gc_collect_cycles() is always zero when an object in the reference cycle has a destructor method.
class A {
public function __destruct() {
}
}
gc_disable();
$a1 = new A;
$a2 = new A;
$a1->ref = $a2;
$a2->ref = $a1;
$a1 = $a2 = NULL;
echo('removed cycles: '.gc_collect_cycles()); // Output: removed cycles: 0
When I remove the __destruct method the output is:
removed cycles: 2
You can see this behavior started as of PHP 7.4.0beta4
What is going on here? Do garbage cycles get collected in the destructor even when GC is disabled?
Since PHP 7.4, the initial garbage collection run will only call destructors on objects that have them, and the actual destruction of the object is deferred to the next GC run. You can see this if you perform two calls to gc_collect_cycles(): https://3v4l.org/0LIVn
The reason for this behavior is that destructors can introduce additional references to the object, such that it is no longer valid to destroy it. Previous versions used an unreliable heuristic to detect this case. PHP 7.4 will instead delay destruction to a separate GC run.
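The deferred destruction can be observed without relying on the returned counts, which may differ between versions. This sketch assumes PHP 7.4 or later (for WeakReference); the Tracked class and its counter are hypothetical names used only for illustration.

```php
<?php
class Tracked
{
    public $ref;                       // declared, so no dynamic property is needed
    public static $destructed = 0;

    public function __destruct()
    {
        self::$destructed++;
    }
}

$a1 = new Tracked;
$a2 = new Tracked;
$a1->ref = $a2;
$a2->ref = $a1;
$probe = WeakReference::create($a1);   // observes the object without keeping it alive
$a1 = $a2 = null;

$firstRun  = gc_collect_cycles();
$secondRun = gc_collect_cycles();

printf("first run: %d, second run: %d\n", $firstRun, $secondRun);
printf("destructors called: %d\n", Tracked::$destructed);
var_dump($probe->get() === null);      // true once the object is really gone
```

The destructor counter shows when __destruct ran, and the weak reference shows when the memory was actually released.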

PHP, does assignment use double memory for a variable?

If I have a large object and assign another variable to that object, does php create two objects, or does it use a pointer internally?
for example:
<?php
$myObject = new Class_That_Will_Consume_Lots_Of_Memory();
$testObject = $myObject;
In this example, will I be using 2x the memory footprint of a Class_That_Will_Consume_Lots_Of_Memory instance, or will it be one instance and a pointer?
The latter: one object and a pointer/reference (and in fact, here, two pointers/references, since the first is one as well).
To get a new object, use clone.
Related: Are PHP5 objects passed by reference?
Objects in PHP 5 are passed by reference (strictly speaking, by handle); arrays and other types are passed using copy-on-write:
$a = ['a'=>'b'];
$b = $a; // still 1x memory: both variables share one array
$b['x'] = 'y'; // now 2x memory: the write triggers the copy
You can use memory_get_usage() to debug memory usage.
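As a rough sketch of that kind of debugging, the copy-on-write behaviour of arrays can be made visible like this. measureCopyOnWrite() is a hypothetical helper, and the byte counts vary by PHP version:

```php
<?php
function measureCopyOnWrite(): array
{
    $a = range(1, 100000);          // large array built at runtime
    $start = memory_get_usage();

    $b = $a;                        // shares the same underlying array
    $afterAssign = memory_get_usage();

    $b[] = 0;                       // first write to $b triggers the real copy
    $afterWrite = memory_get_usage();

    return [
        'assign' => $afterAssign - $start,
        'write'  => $afterWrite - $afterAssign,
    ];
}

$delta = measureCopyOnWrite();
printf("assignment cost: %d bytes, first write cost: %d bytes\n", $delta['assign'], $delta['write']);
```

The assignment itself should cost next to nothing, while the first write accounts for the full copy.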

=& operator and memory usage

I am very confused about how to use the & operator to reduce memory usage.
Can I get an answer to the question below?
class C {
public $a;
function B(&$a) {
$this->a = &$a;
$this->a = $a;
// Are these two the same here? Does each create a new variable $this->a?
// Or does only $this->a = $a create a new variable?
}
}
$a = 1;
$c = new C;
$c->B($a);
Or maybe my understanding is totally wrong.....
Never ever use references in PHP just to reduce memory load. PHP handles that perfectly with its internal copy on write mechanism. Example:
$a = str_repeat('x', 100000000); // Memory used ~ 100 MB
$b = $a; // Memory used ~ 100 MB
$b = $b . 'x'; // Memory used ~ 200 MB
You should only use references if you know exactly what you are doing and need them for functionality (and that's almost never, so you could as well just forget about them). PHP references are quirky and can result in some unexpected behaviour.
I am very confused about how to use the & operator to reduce memory usage.
If you don't know it, you probably don't need it :) The & is mostly useless nowadays, because of several enhancements in the PHP core over the last years. Usually you would use & to keep PHP from copying the value into the memory allocated for the second variable and instead (in short) let both variables point to the same memory.
But nowadays:
Objects are passed by reference anyway. They don't clone themselves magically just because they are passed to a method ;)
When you pass primitive types, PHP will not copy the value unless you change the variable (copy-on-write).
To sum it up: the benefits of & already exist as a feature of the core, but without the ugly side effects of the operator.
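One well-known example of such a side effect is a reference that outlives a foreach loop. A small sketch:

```php
<?php
$items = [1, 2, 3];

foreach ($items as &$item) {
    // nothing is modified here, but $item stays bound to the last element afterwards
}

foreach ($items as $item) {
    // each iteration writes the current value into $items[2] via the leftover reference
}

print_r($items);   // the last element has been overwritten: [1, 2, 2]

unset($item);      // breaking the reference right after the first loop avoids this
```

No value was ever assigned explicitly, yet the array changed, which is the kind of surprise that makes references hard to reason about.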
Value-type variables are only copied when their value changes. If you only assign, as in your example, nothing is copied, and the memory footprint is the same as if you had not used the & operator.
I recommend that you read these articles about passing values by reference:
When to pass-by-reference in PHP
When is it good to use pass by reference in PHP?
http://schlueters.de/blog/archives/125-Do-not-use-PHP-references.html
It is considered a micro-optimization, and it hurts the transparency of the code.

What does PHP's gc_enable function do exactly?

Before you tell me to read the manual, check out the php.net documentation for this function:
Warning
This function is currently not documented; only its argument list is available.
That was helpful!
This page explains that it enables garbage collection for cyclic references. Where and when is this useful? Could someone show me an example of its use? Preferably an example where a cyclic reference is created and then collected.
gc_enable is only needed if you call gc_disable. There is really no sane reason to do this, as that would cause cyclic references to not be garbage collected (like pre-5.3, when the cyclic GC did not exist).
PHP's garbage collector works by reference counting. You can think of a variable as a "pointer" to an object. When an object has no pointers to it, it is "dead" because nothing can reach it, so it is garbage collected.
//one thing points to the Foo object
$a = new Foo();
//now two things do
$b = $a;
//now only $b points to it
$a = null;
//now nothing points to Foo, so php garbage collects the object
$b = null;
Consider this though:
$a = new Foo();
$b = new Bar();
$b->foo = $a;
$a->bar = $b;
$a = $b = null;
At this point nothing is holding on to $a or $b except the objects themselves. This is a cyclic reference, and in previous versions of php (< 5.3), would not be collected. The cyclic collector in 5.3 can now detect this and clean up these objects.
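A complete, runnable version of that scenario might look like the sketch below. The empty Foo and Bar classes are assumptions made for illustration, and the collector is triggered by hand instead of waiting for an automatic run:

```php
<?php
class Foo { public $bar; }
class Bar { public $foo; }

gc_enable();                       // on by default; shown for completeness
var_dump(gc_enabled());

$a = new Foo();
$b = new Bar();
$b->foo = $a;
$a->bar = $b;
$a = $b = null;                    // only the objects reference each other now

$collected = gc_collect_cycles();  // run the cycle collector right away
printf("collected: %d\n", $collected);
```

A non-zero result means the collector found and freed the unreachable pair.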
There is a full chapter on Garbage Collection in the PHP Manual explaining this:
Reference Counting Basics
Collecting Cycles
Performance Considerations
I usually try not to just link offsite, but feel it's too much to summarize.
There are reasons why we use gc_disable and gc_enable.
In the latest PHP manual, it states
Can be very useful for big projects, when you create a lot of objects that should stay in memory. So GC can't clean them up and just wasting CPU time.
Issue in composer:
https://github.com/composer/composer/pull/3482#issuecomment-65199153
Solution and people replies:
https://github.com/composer/composer/commit/ac676f47f7bbc619678a29deae097b6b0710b799
Please be reminded that the second link above contains a lot of comments with graphics.
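The pattern behind that change can be sketched roughly as follows. This is not Composer's actual code: buildGraph() and its workload are hypothetical, and whether disabling the collector helps at all depends on the PHP version and the data.

```php
<?php
// Hypothetical workload: many long-lived objects that must all stay in memory.
function buildGraph(int $size): array
{
    $root  = new stdClass;
    $nodes = [];
    for ($i = 0; $i < $size; $i++) {
        $node = new stdClass;
        $node->id = $i;
        $node->owner = $root;     // shared reference, no garbage is produced
        $nodes[] = $node;
    }
    return $nodes;
}

$wasEnabled = gc_enabled();
gc_disable();                     // nothing here is garbage, so skip collector runs
try {
    $graph = buildGraph(20000);
} finally {
    if ($wasEnabled) {
        gc_enable();              // restore the previous setting afterwards
    }
}

printf("built %d nodes, gc enabled: %s\n", count($graph), gc_enabled() ? 'yes' : 'no');
```

Restoring the earlier state in a finally block keeps the rest of the script unaffected even if the build step throws.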

In PHP can someone explain cloning vs pointer reference?

To begin with, I understand programming and objects, but the following doesn't make much sense to me in PHP.
In PHP we use the & operator to retrieve a reference to a variable. I understand a reference as being a way to refer to the same 'thing' with a different variable. If I say for example
$b = 1;
$a =& $b;
$a = 3;
echo $b;
will output 3 because changes made to $a are the same as changes made to $b. Conversely:
$b = 1;
$a = $b;
$a = 3;
echo $b;
should output 1.
If this is the case, why is the clone keyword necessary? It seems to me that if I set
$obj_a = $obj_b then changes made to $obj_a should not affect $obj_b,
conversely $obj_a =& $obj_b should be pointing to the same object so changes made to $obj_a affect $obj_b.
However it seems in PHP that certain operations on $obj_a DO affect $obj_b even if assigned without the reference operator ($obj_a = $obj_b). This caused a frustrating problem for me today while working with DateTime objects that I eventually fixed by doing basically:
$obj_a = clone $obj_b
But most of the php code I write doesn't seem to require explicit cloning like in this case and works just fine without it. What's going on here? And why does PHP have to be so clunky??
Basically, there are two ways variables work in PHP...
For everything except objects:
Assignment is by value (meaning a copy occurs if you do $a = $b).
Reference can be achieved by doing $a = &$b (note that the reference operator operates upon the variable, not the assignment operator, since you can use it in other places)...
Copies use a copy-on-write technique. So if you do $a = $b, there is no memory copy of the variable. But if you then do $a = 5;, the memory is copied then and overwritten.
For objects:
Assignment is by object reference. It's not really the same as normal variable by reference (I'll explain why later).
Copy by value can be achieved by doing $a = clone $b.
Reference can be achieved by doing $a = &$b, but beware that this has nothing to do with the object. You're binding the $a variable to the $b variable. It doesn't matter if it's an object or not.
So, why is assignment for objects not really reference? What happens if you do:
$a = new stdclass();
$b = $a;
$a = 4;
What's $b? Well, it's stdclass... That's because it's not writing a reference to the variable, but to the object...
$a = new stdclass();
$a->foo = 'bar';
$b = $a;
$b->foo = 'baz';
What's $a->foo? It's baz. That's because when you did $b = $a, you are telling PHP to use the same object instance (hence the object reference). Note that $a and $b are not the same variable, but they do both reference the same object.
One way of thinking about it is to think of all variables which store an object as storing the pointer to that object. So the object lives somewhere else. When you assign $a = $b where $b is an object, all you're doing is copying that pointer. The actual variables are still disjoint. But when you do $a = &$b, you're storing a pointer to $b inside of $a. Now, when you manipulate $a it cascades the pointer chain to the base object. When you use the clone operator, you're telling PHP to copy the existing object, and create a new one with the same state... So clone really just does a by-value copy of the variable...
So if you noticed, I said the object is not stored in an actual variable. It's stored somewhere else and nothing but a pointer is stored in the variable. So this means that you can have (and often do have) multiple variables pointing to the same instance. For this reason, the internal object representation contains a refcount (simply a count of the number of variables pointing to it). When an object's refcount drops to 0 (meaning that all the variables pointing to it either go out of scope, or are changed to something else) it is garbage collected (as it is no longer accessible)...
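Applied to the DateTime problem from the question, the pointer model can be illustrated with a short sketch (the dates and variable names are arbitrary):

```php
<?php
$utc = new DateTimeZone('UTC');

$start = new DateTime('2020-01-01', $utc);
$alias = $start;            // copies the handle: same object
$copy  = clone $start;      // new object with the same state

$alias->modify('+1 day');   // mutates the one shared object

echo $start->format('Y-m-d'), "\n";   // 2020-01-02 (changed through $alias)
echo $copy->format('Y-m-d'), "\n";    // 2020-01-01 (unaffected)

$alias = 4;                 // rebinding $alias does not touch $start
var_dump($start instanceof DateTime); // bool(true)
```

Mutating through either handle changes the shared object, while reassigning one variable leaves the other alone.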
You can read more on references and PHP in the docs...
Disclaimer: Some of this may be oversimplification or blurring of certain concepts. I intended this only to be a guide to how they work, and not an exact breakdown of what goes on internally...
Edit: Oh, and as for this being "clunky", I don't think it is. I think it is really useful. Otherwise you'd have variable references being passed around all over the place. And that can yield some really interesting bugs when a variable in one part of an application affects another variable in another part of the app. And not because it's passed, but because a reference was made somewhere along the line.
In general, I don't use variable references that much. It's rare that I find an honest need for them. But I do use object references all the time. I use them so much, that I'm happy that they are the default. Otherwise I'd need to write some operator (since & denotes a variable reference, there'd need to be another to denote an object reference). And considering that I rarely use clone, I'd say that 99.9% of use cases should use object references (so make the operator be used for the lower frequency cases)...
JMHO
I've also created a video explaining these differences. Check it out on YouTube.
In Short:
In PHP 5+ objects are passed by reference. In PHP 4 they were passed by value (that's why it had call-time pass-by-reference, which has since been deprecated). So, you have to use the clone operator in PHP 5 to copy objects:
$objectB = clone $objectA;
Also note that only objects are passed by reference, not other variables. The following may clear things up further:
PHP References
PHP Object Cloning
PHP Objects and References
I've written a presentation to better explain how PHP manages memory for its variables:
https://docs.google.com/presentation/d/1HAIdvSqK0owrU-uUMjwMWSD80H-2IblTlacVcBs2b0k/pub?start=false&loop=false&delayms=3000
Take a look ;)
