Cleanup problems with references in PHP

While studying how PHP's garbage collector deals with references, I could not understand the cleanup problem that the manual describes for the following code:
$a = array( 'one' );
$a[] =& $a;
xdebug_debug_zval( 'a' );
unset($a);
The manual section that discusses this code says:
Although there is no longer a symbol in any scope pointing to this structure, it cannot be cleaned up because the array element "1" still points to this same array. Because there is no external symbol pointing to it, there is no way for a user to clean up this structure; thus you get a memory leak.
After studying PHP references, I learned that unsetting a variable breaks the binding between the variable name and its content.
So, given the code below:
$a = array( 'one' );
$a[] =& $a;
unset($a);
the name $a is no longer bound to the content, and since the whole array is removed, the references and variables it contains should be removed too. So where is the cleanup problem?
Note that for the code below, xdebug_debug_zval() reports a refcount of 2. I take this to mean that two references (pointers, or bindings) are released, which would show that there is no cleanup problem:
$a = array( 'one' );
$a[] =& $a;
xdebug_debug_zval( 'a' );
References I studied:
Manual
Toptal Article
Sitepoint

The point is that the array cannot be removed, because there is still a pointer to the content of $a. With plain reference counting there is no easy way to detect that this pointer lives inside $a itself, so the memory reserved for $a cannot be freed.
There is a pointer from inside the array to the array itself. This pointer (which increments the refcount) still exists even after you unset $a.
$a = array( 'one' ); // refcount for a = 1
$a[] =& $a; // refcount for a = 2
unset($a); // refcount for a = 1, but no usable pointer remains for the PHP user
A simple loop demonstrates this behavior:
$start = memory_get_usage ();
for ($i = 0; $i < 100000; $i++) {
    $a = ['test'];
    // if you remove the next line, the reported memory usage is 0; if not, 4000000
    $a[] = &$a;
    unset($a);
}
echo memory_get_usage() - $start;
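The loop above describes what plain reference counting does. Since PHP 5.3 the engine also includes a cycle collector, which the manual covers in the section following the one quoted in the question. A minimal sketch, assuming PHP 5.3 or later; the iteration count and variable names are only illustrative:

```php
<?php
// Assumption: PHP >= 5.3, where gc_enable() and gc_collect_cycles() exist.
gc_enable();

for ($i = 0; $i < 1000; $i++) {
    $a = array('test');
    $a[] =& $a;   // the array now contains a reference to itself
    unset($a);    // the name is gone, but the refcount does not reach zero
}

$before = memory_get_usage();
// Ask the cycle collector to look for unreachable self-referencing structures.
$collected = gc_collect_cycles();
$after = memory_get_usage();

echo "collected: $collected\n";
echo "bytes released: " . ($before - $after) . "\n";
```

The exact numbers depend on the PHP version and build, so none are given here. The point is only that the structures stay allocated until the collector runs.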

Related

Assign by reference PHP misunderstanding

What explains the result "2" for $arr[0] in the code below? $arr2 is a copy of $arr and its first value is incremented by one, so the result "2" for $arr2[0] is understandable. But what is happening with $arr? When I assign $arr[0] to $a by reference, like $a = &$arr[0], $arr[0] ends up as 2; when I assign it by value, $a = $arr[0], $arr[0] stays 1 as it should. Can anyone enlighten me on this?
<?php
$a = 1;
$arr = array(1);
$a = &$arr[0];
$arr2 = $arr;
$arr2[0]++;
echo $arr[0]. "<br>";//2
echo $arr2[0]. "<br>";//2
?>
References in PHP are not like pointers; when you assign or pass something by reference, you create what we might call a "reference set" where both variables are references to the same "zval" (the structure in memory which holds the type and value of a variable).
An important consequence of this is that references are symmetrical. I tend to write assign by reference as $foo =& $bar rather than $foo = &$bar to emphasise this: the operator doesn't just take a reference to $bar and put it into $foo, it affects both operands.
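The symmetry can be seen in a short sketch (the variable names follow the paragraph above and are only illustrative):

```php
<?php
$foo = 1;
$bar =& $foo;          // $foo and $bar now name the same value

$bar = 2;              // writing through either name ...
echo $foo, "\n";       // 2

$foo = 3;              // ... is visible through the other
echo $bar, "\n";       // 3

unset($foo);           // removes only the name $foo from the reference set
echo $bar, "\n";       // 3
var_dump(isset($foo)); // bool(false)
```

Neither name is the "original": removing either one leaves the value reachable through the other.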
With that in mind, let's go through your example:
$arr = array(1);
This will create two zvals: one for the array $arr itself, and one for the value in position 0 of that array ($arr[0]).
$a =& $arr[0];
This binds $a and $arr[0] into a reference set. Whichever of those names we use, we will be using the same zval that was previously created, and currently holds 1.
$arr2 = $arr;
This copies the contents of $arr into a new array zval, $arr2. But it doesn't resolve the references inside that array into values, it just copies the fact that they are a reference.
So $arr2[0] is now also part of the reference set containing $arr[0] and $a.
$arr2[0]++;
This increments the zval pointed to by our reference set, from 1 to 2.
echo $arr[0]. "<br>";//2
echo $arr2[0]. "<br>";//2
Since these are both part of the same reference set, they both point to the same zval, which is integer 2.
The inevitable question is, "bug or feature?" A more useful example would be passing an array containing references into a function:
function foo($array) {
    $array[0] = 42;
    $array[1] = 69;
}
$a = 1;
$b = 1;
$foo = [ &$a, &$b ];
foo($foo);
echo $a; // 42
echo $b; // 69
Like assignment, passing into the function didn't break the references, so code manipulating them can safely be re-factored into multiple functions.
Not sure if this is a bug but var_dump() can help to explain.
<?php
$a = 1;
$arr = array(1);
var_dump( $arr );
$a = &$arr[0];
var_dump( $arr );
$arr2 = $arr;
$arr2[0]++;
Output:
array(1) {
[0]=>
int(1)
}
array(1) {
[0]=>
&int(1)
}
Take note of &int(1) in the second var_dump(). This tells us that position #0 of $arr has been turned into a reference pointer to a position in PHP's memory instead of remaining a dedicated value.
So when you perform $arr2 = $arr;, $arr2 receives that reference pointer as well.
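Besides reading the & marker in var_dump() output, the same check can be made in code. A minimal sketch, assuming PHP 7.4 or later, where the ReflectionReference class is available; the variable names are only illustrative:

```php
<?php
// Assumption: PHP >= 7.4 (ReflectionReference was added in that version).
$arr = array(1);
$before = ReflectionReference::fromArrayElement($arr, 0);
var_dump($before === null); // bool(true): the element is a plain value

$a =& $arr[0];
$after = ReflectionReference::fromArrayElement($arr, 0);
var_dump($after instanceof ReflectionReference); // bool(true): now a reference

$arr2 = $arr;          // plain assignment, the reference travels with the array
$arr2[1] = 'extra';    // force $arr2 to become a separate array
$copy = ReflectionReference::fromArrayElement($arr2, 0);
// Equal IDs mean both elements belong to the same reference set.
var_dump($copy->getId() === $after->getId()); // bool(true)
```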

Smarty assignByRef inside loop

I need to render some Smarty template for different values inside a loop (a PHP loop, not a Smarty foreach), in the following way (just an example):
$a = 0;
$b = 0;
$output = "";
$tmpl = '{$a}, {$b}';
$smarty->assignByRef('a', $a);
$smarty->assignByRef('b', $b);
for ($i = 0; $i < 10; ++$i) {
    ++$a;
    ++$b;
    $output .= $smarty->fetch("string:" . $tmpl);
}
My doubt is about assignByRef. The Smarty v3 docs say:
With the introduction of PHP5, assignByRef() is not necessary for most
intents and purposes. assignByRef() is useful if you want a PHP array
index value to be affected by its reassignment from a template.
Assigned object properties behave this way by default.
but I don't fully understand what that technical note means. So, can I use assignByRef that way or not? Or will plain assign produce the same output?
In PHP 4, objects were passed by value unless the user explicitly asked for a reference by prepending an ampersand: &$variable. For this reason, function arguments that were likely to consume a large amount of memory were passed by reference in order to optimize memory usage:
function f(&$huge) {
    // ...
}
In PHP 5 and later, a variable holds an object's identifier rather than the object itself, so objects are not copied on assignment or when passed to a function, even if the user doesn't specify a reference explicitly (no ampersand is used). Assigning one variable to another only creates a new container (internally called a zval) for the same data in memory. Consider this:
$a = new stdClass;
$b = $a;
The first line allocates memory for variable $a and an object of stdClass, and stores the object's identifier into the variable. The second line allocates memory for variable $b, stores the object's identifier into the $b variable, and increments an internal reference counter. The reference counter value shows how many times the object is referenced in the code. When $b variable is destroyed, the reference counter is decremented by one. When the value of reference counter becomes equal to zero, the object's memory is deallocated. The following code demonstrates the idea:
$a = new stdClass;
debug_zval_dump($a);
$b = $a;
debug_zval_dump($a);
$c = $a;
debug_zval_dump($a);
$c = null; // destroy $c
debug_zval_dump($a);
$b = null; // destroy $b
debug_zval_dump($a);
Output
object(stdClass)#1 (0) refcount(2){
}
object(stdClass)#1 (0) refcount(3){
}
object(stdClass)#1 (0) refcount(4){
}
object(stdClass)#1 (0) refcount(3){
}
object(stdClass)#1 (0) refcount(2){
}
For non-object values such as strings and arrays, PHP versions 5 and 7 copy the data only when one of the variables is modified (copy-on-write), so that the original value stays intact:
$m1 = memory_get_usage();
$a = str_repeat('a', 1 << 24);
echo number_format(memory_get_usage() - $m1), PHP_EOL;
// 16,781,408
$b = $a;
$c = $a;
echo number_format(memory_get_usage() - $m1), PHP_EOL;
// 16,781,472
$b[0] = 'x';
echo number_format(memory_get_usage() - $m1), PHP_EOL;
// 33,562,880
$c[0] = 'x';
echo number_format(memory_get_usage() - $m1), PHP_EOL;
// 50,344,288
The same applies to function arguments. Thus, if a variable is only going to be read, there is no need to pass it by reference explicitly. The Smarty documentation means that in most cases you pass variables to templates and do not expect the template to change them. You need to pass a variable by reference only when you really want the template to be able to modify it. The same concept applies to any function argument in PHP 5 and newer.
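The distinction can be sketched in plain PHP, without Smarty. The three helper functions below are hypothetical and exist only for this illustration; they stand in for any code that receives a value, such as a template:

```php
<?php
// Hypothetical helpers, named only for this sketch.
function changeArray(array $data) { $data['x'] = 2; }        // works on a copy
function changeArrayByRef(array &$data) { $data['x'] = 2; }  // works on the caller's array
function changeObject(stdClass $obj) { $obj->x = 2; }        // same object, no ampersand needed

$arr = array('x' => 1);
changeArray($arr);
$afterByValue = $arr['x'];
echo $afterByValue, "\n"; // 1

changeArrayByRef($arr);
echo $arr['x'], "\n";     // 2

$obj = new stdClass;
$obj->x = 1;
changeObject($obj);
echo $obj->x, "\n";       // 2
```

This matches the wording of the quoted note: an array index is only affected by a reassignment on the receiving side when it was handed over by reference, while object properties behave that way by default.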

Changing a value in a copy of an array changes the value in the original array

I have some issues with variables in PHP that I don't understand.
This is a simplified code example of the issue.
//Create an initial array with a sub-array
$a = array();
$a['test'] = array(1,2);
//Create a reference to $a['test'] in $b
//Changing $b[0] should now change the value in $a['test'][0]
$b = &$a['test'];
$b[0] = 3;
//Create a copy of $a in $c
$c = $a;
//Change one value in $c, which is a copy of $a.
//This should NOT change the original value of $a as it is a copy.
$c['test'][1] = 5;
print_r($a);
print_r($b);
print_r($c);
This is the output:
Array
(
[test] => Array
(
[0] => 3
[1] => 5
)
)
Array
(
[0] => 3
[1] => 5
)
Array
(
[test] => Array
(
[0] => 3
[1] => 5
)
)
The script creates an array with a sub-array and puts two values in it.
A reference to the sub-array is then put into $b, and one of the values in $a is changed through it.
I then copy $a into $c.
I then change one value of $c.
As $c is a copy of $a, I would expect the change to $c not to affect $a. But the output tells a different tale.
Can anyone explain why changing a value in $c affects the value in $a when $c is just a copy of $a? Why is there a 5 in the values of $a?
You're making $b a reference to $a['test'] (that's what the & prefix does), so any change to $b also modifies $a['test']. Use a plain assignment instead:
$b = $a['test'];
$c does not modify $a. Here's the order of what's going on, and why the arrays are identical:
$a['test'] is assigned an array of 1,2.
$b is assigned as a reference to $a['test'], and modifies its values
$c is then assigned to $a, which has now been modified by $b.
I think I found the answer to my own question on this page: http://www.php.net/manual/en/language.references.whatdo.php
I can't really understand why it does what it does, but I do understand that I should probably avoid mixing references and arrays in the future.
I'm referring to this section:
Note, however, that references inside arrays are potentially
dangerous. Doing a normal (not by reference) assignment with a
reference on the right side does not turn the left side into a
reference, but references inside arrays are preserved in these normal
assignments. This also applies to function calls where the array is
passed by value. Example:
<?php
/* Assignment of scalar variables */
$a = 1;
$b =& $a;
$c = $b;
$c = 7; //$c is not a reference; no change to $a or $b
/* Assignment of array variables */
$arr = array(1);
$a =& $arr[0]; //$a and $arr[0] are in the same reference set
$arr2 = $arr; //not an assignment-by-reference!
$arr2[0]++;
/* $a == 2, $arr == array(2) */
/* The contents of $arr are changed even though it's not a reference! */
?>
In other words, the reference behavior of arrays is defined in an
element-by-element basis; the reference behavior of individual
elements is dissociated from the reference status of the array
container.
You are making $b a reference to $a['test'] by using $b = &$a['test']; hence,
change
$b = &$a['test'];
to
$b = $a['test'];
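If the reference has to stay but an independent copy is still needed, one option is to rebuild the array element by element, since reading an element yields its value rather than the reference. A minimal sketch; detachCopy() is a hypothetical helper, not a PHP built-in:

```php
<?php
// Hypothetical helper: rebuilds an array so that no element of the
// result is a reference. Not suitable for arrays that contain themselves.
function detachCopy(array $source) {
    $copy = array();
    foreach ($source as $key => $value) {
        // foreach by value hands us the element's value, not the reference
        $copy[$key] = is_array($value) ? detachCopy($value) : $value;
    }
    return $copy;
}

$a = array('test' => array(1, 2));
$b =& $a['test'];         // $a['test'] is now part of a reference set
$b[0] = 3;

$c = detachCopy($a);      // independent copy
$c['test'][1] = 5;

echo $a['test'][1], "\n"; // 2
echo $c['test'][1], "\n"; // 5
```

Objects inside the array are still shared after such a copy, because the helper copies object handles and not the objects themselves.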

String concatenation while incrementing

This is my code:
$a = 5;
$b = &$a;
echo ++$a.$b++;
Shouldn't it print 66?
Why does it print 76?
Alright. This is actually pretty straightforward behavior, and it has to do with how references work in PHP. It is not a bug, just unexpected behavior.
PHP internally uses copy-on-write. Which means that the internal variables are copied when you write to them (so $a = $b; doesn't copy memory until you actually change one of them). With references, it never actually copies. That's important for later.
Let's look at those opcodes:
line     #  *  op           fetch  ext  return  operands
---------------------------------------------------------------------------------
   2     0  >  ASSIGN                           !0, 5
   3     1     ASSIGN_REF                       !1, !0
   4     2     PRE_INC                  $2      !0
         3     POST_INC                 ~3      !1
         4     CONCAT                   ~4      $2, ~3
         5     ECHO                             ~4
         6  >  RETURN                           1
The first two should be pretty easy to understand.
ASSIGN - Basically, we're assigning the value 5 to the compiled variable named !0.
ASSIGN_REF - We're creating a reference between !0 and !1 (the direction doesn't matter).
So far, that's straightforward. Now comes the interesting bit:
PRE_INC - This is the opcode that actually increments the variable. Of note is that it returns its result into a temporary variable named $2.
So let's look at the source code behind PRE_INC when called with a variable:
static int ZEND_FASTCALL ZEND_PRE_INC_SPEC_VAR_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
    USE_OPLINE
    zend_free_op free_op1;
    zval **var_ptr;
    SAVE_OPLINE();
    var_ptr = _get_zval_ptr_ptr_var(opline->op1.var, execute_data, &free_op1 TSRMLS_CC);
    if (IS_VAR == IS_VAR && UNEXPECTED(var_ptr == NULL)) {
        zend_error_noreturn(E_ERROR, "Cannot increment/decrement overloaded objects nor string offsets");
    }
    if (IS_VAR == IS_VAR && UNEXPECTED(*var_ptr == &EG(error_zval))) {
        if (RETURN_VALUE_USED(opline)) {
            PZVAL_LOCK(&EG(uninitialized_zval));
            AI_SET_PTR(&EX_T(opline->result.var), &EG(uninitialized_zval));
        }
        if (free_op1.var) {zval_ptr_dtor(&free_op1.var);};
        CHECK_EXCEPTION();
        ZEND_VM_NEXT_OPCODE();
    }
    SEPARATE_ZVAL_IF_NOT_REF(var_ptr);
    if (UNEXPECTED(Z_TYPE_PP(var_ptr) == IS_OBJECT)
        && Z_OBJ_HANDLER_PP(var_ptr, get)
        && Z_OBJ_HANDLER_PP(var_ptr, set)) {
        /* proxy object */
        zval *val = Z_OBJ_HANDLER_PP(var_ptr, get)(*var_ptr TSRMLS_CC);
        Z_ADDREF_P(val);
        fast_increment_function(val);
        Z_OBJ_HANDLER_PP(var_ptr, set)(var_ptr, val TSRMLS_CC);
        zval_ptr_dtor(&val);
    } else {
        fast_increment_function(*var_ptr);
    }
    if (RETURN_VALUE_USED(opline)) {
        PZVAL_LOCK(*var_ptr);
        AI_SET_PTR(&EX_T(opline->result.var), *var_ptr);
    }
    if (free_op1.var) {zval_ptr_dtor(&free_op1.var);};
    CHECK_EXCEPTION();
    ZEND_VM_NEXT_OPCODE();
}
Now I don't expect you to understand what that's doing right away (this is deep engine voodoo), but let's walk through it.
The first two if statements check to see if the variable is "safe" to increment (the first checks to see if it's an overloaded object, the second checks to see if the variable is the special error variable $php_error).
Next is the really interesting bit for us. Since we're modifying the value, it needs to perform copy-on-write. So it calls:
SEPARATE_ZVAL_IF_NOT_REF(var_ptr);
Now, remember, we already set the variable to be a reference above. So the variable is not separated... Which means everything we do to it here will happen to $b as well...
Next, the variable is incremented (fast_increment_function()).
Finally, it sets the result as itself. This is copy-on-write again. It's not returning the value of the operation, but the actual variable. So what PRE_INC returns is still a reference to $a and $b.
POST_INC - This behaves similarly to PRE_INC, except for one VERY important fact.
Let's check out the source code again:
static int ZEND_FASTCALL ZEND_POST_INC_SPEC_VAR_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
    retval = &EX_T(opline->result.var).tmp_var;
    ZVAL_COPY_VALUE(retval, *var_ptr);
    zendi_zval_copy_ctor(*retval);
    SEPARATE_ZVAL_IF_NOT_REF(var_ptr);
    fast_increment_function(*var_ptr);
}
This time I cut away all of the non-interesting stuff. So let's look at what it's doing.
First, it gets the return temporary variable (~3 in our code above).
Then it copies the value from its argument (!1 or $b) into the result (and hence the reference is broken).
Then it increments the argument.
Now remember, the argument !1 is the variable $b, which has a reference to !0 ($a) and $2, which if you remember was the result from PRE_INC.
So there you have it. It returns 76 because the reference is maintained in PRE_INC's result.
We can prove this by forcing a copy, by assigning the pre-inc to a temporary variable first (through normal assignment, which will break the reference):
$a = 5;
$b = &$a;
$c = ++$a;
$d = $b++;
echo $c.$d;
Which works as you expected. Proof
And we can reproduce the other behavior (your bug) by introducing a function to maintain the reference:
function &pre_inc(&$a) {
    return ++$a;
}
$a = 5;
$b = &$a;
$c = &pre_inc($a);
$d = $b++;
echo $c.$d;
Which works as you're seeing it (76): Proof
Note: the only reason for the separate function here is that PHP's parser doesn't like $c = &++$a;. So we need to add a level of indirection through the function call to do it...
The reason I don't consider this a bug is that it's how references are supposed to work. Pre-incrementing a referenced variable will return that variable. Even a non-referenced variable should return that variable. It may not be what you expect here, but it works quite well in almost every other case...
The Underlying Point
If you're using references, you're doing it wrong about 99% of the time. So don't use references unless you absolutely need them. PHP is a lot smarter than you may think at memory optimizations. And your use of references really hinders how it can work. So while you think you may be writing smart code, you're really going to be writing less efficient and less friendly code the vast majority of the time...
And if you want to know more about references and how variables work in PHP, check out one of my YouTube videos on the subject...
I think the whole concatenation is evaluated first and only then sent to echo.
For example:
$a = 5;
$b = &$a;
echo ++$a.$b++;
// output 76
$a = 5;
$b = &$a;
echo ++$a;
echo $b++;
// output 66
EDIT: Also very important, $b is equal to 7, but echoed before adding:
$a = 5;
$b = &$a;
echo ++$a.$b++; //76
echo $b;
// output 767
EDIT: adding Corbin example: https://eval.in/34067
There's obviously a bug in PHP. If you execute this code:
<?php
{
    $a = 5;
    echo ++$a.$a++;
}
echo "\n";
{
    $a = 5;
    $b = &$a;
    echo ++$a.$b++;
}
echo "\n";
{
    $a = 5;
    echo ++$a.$a++;
}
You get:
66
76
76
Which means the same block (1st and 3rd one are identical) of code doesn't always return the same result. Apparently the reference and the increment are putting PHP in a bogus state.
https://eval.in/34023

PHP array copy semantics: what it does when members are references, and where is it documented?

This code
<?php
$a = 10;
$arr1 = array(&$a);
$arr1[0] = 20;
echo $a; echo "\n";
$arr2 = $arr1;
$arr2[0] = 30;
echo $a;
produces
20
30
Obviously reference array members are "preserved", which can lead, for example, to some interesting/strange behavior, like
<?php
function f($arr) {
    $arr[0] = 20;
}
$val = 10;
$a = array(&$val);
f($a);
echo $a[0];
?>
outputting
20
My question is: what is it for, and where is it documented (other than in a user comment at http://www.php.net/manual/en/language.types.array.php#50036 and the Zend Engine source code itself)?
PHP's assignment by reference behavior is documented on the manual page "PHP: What References Do". You'll find a paragraph on array value references there, too, starting with:
While not being strictly an assignment by reference, expressions created with the language construct array() can also behave as such by prefixing & to the array element to add.
The page also explains why your first code behaves like it does:
Note, however, that references inside arrays are potentially dangerous. Doing a normal (not by reference) assignment with a reference on the right side does not turn the left side into a reference, but references inside arrays are preserved in these normal assignments. This also applies to function calls where the array is passed by value.
