PHP:
$a = array("key" => 23);
var_dump($a);
$c = &$a["key"];
var_dump($a);
unset($c);
var_dump($a);
Output:
array(1) {
["key"]=>
int(23)
}
array(1) {
["key"]=>
&int(23)
}
array(1) {
["key"]=>
int(23)
}
In the second dump the value of "key" is shown as a reference. Why is that?
If I do the same with a normal variable instead of an array key this does not happen.
My only explanation would be that array keys are usually stored as references and as long as there is only one entry in the symbol table it is shown as a scalar in the dump.
Internally, PHP arrays are hashmaps (or dictionaries, or HashTables, or whatever you want to call them). Even a numerically indexed array is implemented as a hash table, and the array itself is held in a zval, just like any other value.
However, what you're seeing is expected behaviour, which is explained both here and here.
Basically, what your array looks like internally is this:
typedef struct _zval_struct {
zvalue_value value;
zend_uint refcount__gc;
zend_uchar type;
zend_uchar is_ref__gc;
} zval;
// zvalue_value:
typedef union _zvalue_value {
long lval;
double dval;
struct {
char *val;
int len;
} str;
HashTable *ht;
zend_object_value obj;
} zvalue_value;
In the case of an array, zval.type is set to indicate that the zval value is an array, and so the zvalue_value.ht member is used.
When you write $c = &$a['key'], the zval assigned to $a['key'] is updated: zval.refcount__gc is incremented and is_ref__gc is set to 1. The value is not copied; it is now used by more than one variable, which is what makes it a reference. Once you call unset($c), the refcount is decremented, only one user remains, and is_ref is set back to 0.
Now for the big one: why don't you see the same thing when you use regular, scalar variables? That's because an array is a HashTable, complete with its own internal ref-counting (zval_ptr_dtor). Once an array itself is empty, it too should be destroyed. If you create a reference to an array value and then unset the array, the zval should be GC'ed. But that would mean you have a reference to a destroyed zval floating around.
Therefore, the zval in the array is changed to a reference, too: a reference can be deleted safely. So that if you were to do this:
$foo = array(123);
$bar = &$foo[0];
unset($foo[0]);
echo $bar, PHP_EOL;
Your code will still work as expected: $foo[0] no longer exists, but $bar is now the only existing reference to 123.
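One way to observe this from user-land, without reading the engine source, is to copy the array while the reference is still alive. The following is only a minimal sketch; the names $copy, $copy2, $afterSharedWrite and $afterPlainWrite are illustrative and do not come from the original post.

```php
<?php
$a = array('key' => 23);
$c = &$a['key'];                 // the element slot is now shared with $c

$copy = $a;                      // copying the array shares reference slots
$copy['key'] = 99;               // this write goes through the shared slot
$afterSharedWrite = $a['key'];   // $a changed although only $copy was written

unset($c, $copy);                // drop the other holders of the slot

$copy2 = $a;                     // the slot is an ordinary value again
$copy2['key'] = 1;               // only the copy changes this time
$afterPlainWrite = $a['key'];

var_dump($afterSharedWrite, $afterPlainWrite, $copy2['key']);
```

While $c exists, a plain array copy is not fully independent of the original, which is the same effect the & marker in var_dump() is hinting at.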
This is just a really, really, short and incomplete explanation, but google the PHP internals, and how the memory management works, how references are dealt with internally, and how the garbage collector uses the is_ref and refcount members to manage the memory.
Pay special attention to the internal mechanisms like copy-on-write, and (when looking through the first link I provided here), look for the snippet that looks like this:
$ref = &$array;
foreach ($ref as $val) {}
Because it deals with some oddities in terms of references and arrays.
I have come across some very strange PHP behaviour (5.3.2 on Ubuntu 10.04). An unset which should occur within local scope is affecting the scope of the caller function. The following snippet is a simplification of my code which displays what I can only assume is a bug:
<?php
function should_not_alter($in)
{
$in_ref =& $in['level1'];
should_only_unset_locally($in);
return $in;
}
function should_only_unset_locally($in)
{
unset($in['level1']['level2_0']);
}
$data = array('level1' => array('level2_0' => 'first value', 'level2_1' => 'second value'));
$data = should_not_alter($data); //test 1
//should_only_unset_locally($data); //test 2
print_r($data);
?>
If you run the above you will see that the value 'first value' has been unset from the $data array in the global scope. However, if you comment out test 1 and run test 2, this does not happen.
I can only assume that PHP does not like referencing an element of an array. In my code I need to alter $in_ref, hence the $in_ref =& $in['level1']; line in the above code. I realize that removing this line would fix the problem of 'first value' being unset in the global scope, but this is not an option.
Can anyone confirm whether this is intended behaviour of PHP?
I suspect it is a bug rather than a feature, because this behaviour is inconsistent with the way that PHP handles scopes and references with normal (non-array) variables. For example, using a string rather than an array, the function should_only_unset_locally() has no effect on the global scope:
<?php
function should_not_alter($in)
{
$in_ref =& $in;
should_only_unset_locally($in);
return $in;
}
function should_only_unset_locally($in)
{
unset($in);
}
$data = 'original';
$data = should_not_alter($data); //test 1
//should_only_unset_locally($data); //test 2
print_r($data);
?>
Both test 1 and test 2 output original as expected. Actually, even if $data is an array but $in_ref references the entire array (i.e. $in_ref =& $in;), the buggy behaviour goes away.
Update
I have submitted a bug report.
Yup, looks like a bug.
As the name of the function implies, should_not_alter should not alter the array since it's passed by value. (I'm of course not basing that just off of the name -- it also should not alter anything based on its definition.)
The fact that commenting $in_ref =& $in['level1']; makes it leave $in alone seems to be further proof that it's a bug. That is quite an odd little quirk. No idea what could be happening internally to cause that.
I'd file a bug report on the PHP bug tracker. For what it's worth, it still exists in 5.4.6.
$data = should_not_alter($data)
This line is overwriting the $data array with the return value of should_not_alter, which is $in. This is normal behavior.
Also, you're creating a reference with $in_ref =& $in['level1']; but you're not doing anything with it. It will have no effect on the program output.
Short answer:
Delete the reference variable via unset($in_ref) before calling the should_only_unset_locally() function.
Long answer:
When a reference to an array element is created, the array element is replaced with a reference. This behavior is weird but it isn't a bug - it's a feature of the language and is by design.
Consider the following PHP program:
<?php
$a = array(
'key1' => 'value1',
'key2' => 'value2',
);
$r = &$a['key1'];
$a['key1'] = 'value3';
var_dump($a['key1']);
var_dump($r);
var_dump($a['key1'] === $r);
Output:
string(6) "value3"
string(6) "value3"
bool(true)
Assigning a value to $a['key1'] changes the value of $r as they both reference the same value. Conversely updating $r will update the array element:
$r = 'value4';
var_dump($a['key1']);
var_dump($r);
Output:
string(6) "value4"
string(6) "value4"
The value doesn't live in $r or $a['key1'] - those are just references. It's like they're both referencing some spooky, hidden value. Weird, huh?
For most use cases this is desired and useful behavior.
Now apply this to your program. The following line modifies the local $in array and replaces the 'level1' element with a reference:
$in_ref = &$in['level1'];
$in_ref is not a reference to $in['level1'] - instead they both reference the same spooky value. So when this line comes around:
unset($in['level1']['level2_0']);
PHP sees $in['level1'] as a reference to a spooky value and removes the 'level2_0' element. And since it's a reference the removal is also felt in the scope of the should_not_alter() function.
The solution to your particular problem is to destroy the reference variable which will automagically restore $in['level1'] back to normal behavior:
function should_not_alter($in) {
$in_ref =& $in['level1'];
// Do some stuff with $in_ref
// After you're done with it delete the reference to restore $in['level1']
unset($in_ref);
should_only_unset_locally($in);
return $in;
}
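To see the difference side by side, here is a small self-contained sketch that runs the leaky variant and the fixed variant against the same data. The function names strip_level2_0, with_live_reference and with_reference_released are made up for this illustration; they stand in for the functions from the question.

```php
<?php
// Callee: receives the array by value and removes one nested key.
function strip_level2_0($in) {
    unset($in['level1']['level2_0']);
}

// The reference is still alive during the call, so the callee's unset
// acts on the shared 'level1' slot.
function with_live_reference($in) {
    $in_ref =& $in['level1'];
    strip_level2_0($in);
    return $in;
}

// The reference is destroyed first, so 'level1' is a plain value again.
function with_reference_released($in) {
    $in_ref =& $in['level1'];
    unset($in_ref);
    strip_level2_0($in);
    return $in;
}

$data = array('level1' => array('level2_0' => 'first value', 'level2_1' => 'second value'));

$leaky = with_live_reference($data);
$safe  = with_reference_released($data);

$leakyHasKey = isset($leaky['level1']['level2_0']);
$safeHasKey  = isset($safe['level1']['level2_0']);
var_dump($leakyHasKey, $safeHasKey);
```

Note that $data itself is never touched here because the results are assigned to separate variables; in the question the return value was assigned back to $data, which is why the change showed up in the global scope.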
I have an array I'm using as a stack to store a path through a tree. Each element points to a node in the tree and I want to pop the last element off and then set the object referred to by that element to null.
Basically:
$node = array_pop($path);
*$node = null;
assuming that PHP had a '*' operator as C-ish languages do. Right now I have the ugly solution of starting at the parent node and remembering which child I took and then setting that to null as in:
if($goLeft) {
$parent->left = null;
} else {
$parent->right = null;
}
I say this is ugly because the array containing the path is created by a public function in my tree class. I'd like to expose the ability to work directly on the nodes in a path through the tree without exposing an implementation detail that addresses an idiosyncrasy (feature?) in PHP. ATM I need to include a boolean in the return value ($goLeft in this case) just so I can workaround an inability to dereference a reference.
This is the second time I've encountered this problem, so if anyone knows a way I can do something similar to the first block of code please share!
(EDIT)
After experimenting with many permutations of &'s and arrays, it turns out that the basic problem was that I had misinterpreted the reason for an error I was getting.
I tried
$a = ($x > $y) ? &$foo[$bar] : $blah;
and got " syntax error, unexpected '&' ". I interpreted this to mean that the problem was using the &-operator on $foo[$bar]. It actually turns out that the culprit is the ?-operator, as
if($x > $y) {
$a = &$foo[$bar];
} else {
$a = null;
}
works perfectly fine. I thus went on a wild goose chase looking for a workaround for a problem that didn't exist. As long as I don't break the chain of &'s, PHP does what I want, which is to operate on the object referred to by a variable (not the variable itself). Example
$a1 = new SomeClass;
$a2 = &$a1;
$a3 = &$a2;
$a4 = &$a3;
$a4 = 42; // This actually sets $a1 to 42
var_dump($a1); // Emits int(42)
What messed me up is that I thought objects are passed around by reference anyway (this is wrong), so I didn't think the & was necessary if the expression resolved to an object. I mean:
class A {
public $b;
}
class B {}
$a = new A;
$a->b = new B;
$c1 = $a->b;
$c2 = &$a->b;
$c1 = 42; // Merely assigns 42 to $c1
$c2 = 42; // Assigns 42 to $a->b
It turns out that this exact issue is addressed at http://www.php.net/manual/en/language.oop5.references.php. Wish that had sunk in the first time I read it!
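The distinction between copying an object handle and taking a reference can be sketched with a tiny tree node. The Node class and the variable names below are hypothetical and only serve to illustrate the point.

```php
<?php
// Hypothetical tree node used only for this illustration.
class Node {
    public $value = 1;
    public $left = null;
}

$parent = new Node;
$parent->left = new Node;

$handle = $parent->left;     // copies the object handle, not the object
$handle->value = 2;          // same object, so the change is visible via $parent
$handle = null;              // rebinds $handle only; $parent->left is untouched
$stillThere = $parent->left->value;

$ref = &$parent->left;       // a real reference to the property slot
$ref = null;                 // writes through to $parent->left

var_dump($stillThere, $parent->left);
```

Assigning null to a copied handle only drops that one handle, whereas assigning null through a reference clears the slot the reference is bound to.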
Very interesting question! I may have found a workaround: if you populate the array with object references, with the & operator, you can destroy the original object by setting that array value to NULL. You have to operate on the array directly, instead of using a variable returned by array_pop. After that you can pop the array to free that position (that would then contain a NULL value).
This is what I mean (based on Rocket's code):
$a=(object)'a';
$b=array(&$a);
$b[0] = NULL;
// array still contains an element
array_pop($b);
// now array is empty
var_dump($a); // NULL
http://codepad.org/3D7Lphde
I wish I could remember where I read this, but PHP works by maintaining a counter of references to a given object. You have some object (e.g. a Tree) that has a reference to some nodes. When you use array_pop, a reference to the node object is returned (i.e. an additional reference is created), but the original reference still exists. When you unset the popped reference, that is destroyed but the original object is not destroyed because Tree still has that reference. The only way to free the memory of that object is to have Tree destroy it personally (which seems to be what you're doing in the second code block).
PHP does not seem to have any method for forcing deallocation of an object that is still referenced somewhere (gc_collect_cycles() only reclaims unreachable reference cycles), so unless you carefully handle your references, you're stuck.
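The counting described above can be made visible with a destructor. This is a rough sketch under the assumption that the tree and the path each hold their own handle to the node; the Node class and its $destroyed counter are invented for the example.

```php
<?php
// Hypothetical node that counts how many instances have been destroyed.
class Node {
    public static $destroyed = 0;
    public function __destruct() {
        self::$destroyed++;
    }
}

$tree = array('root' => new Node);   // first handle, owned by the tree
$path = array($tree['root']);        // second handle, stored on the path stack

$popped = array_pop($path);          // the handle moves from $path to $popped
unset($popped);                      // one handle gone, the tree still holds one
$afterPop = Node::$destroyed;

unset($tree['root']);                // last handle released, destructor runs
$afterTreeUnset = Node::$destroyed;

var_dump($afterPop, $afterTreeUnset);
```

The destructor only fires once the owner of the last remaining handle lets go of it, which matches the observation that the tree itself has to drop the node.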
This is not possible
P.S. I'm still really confused about what you're trying to do. Rocket's explanation helps, but what is $path, and how does it relate to the second block?
Just don't assign the array_pop() return value.
php > $test = array(1, 2, 3);
php > $test2 = array(0 => &$test[0], 1 => &$test[1], 2 => &$test[2]);
php > array_pop($test2);
php > var_dump($test);
array(3) {
[0]=>
&int(1)
[1]=>
&int(2)
[2]=>
int(3)
}
php > var_dump($test2);
array(2) {
[0]=>
&int(1)
[1]=>
&int(2)
}
$one = 1;
$two = 2;
$array = array(&$one, &$two);
// magic
end($array);
$array[key($array)] = NULL;
var_dump($two);
// NULL
References in PHP allow you to change the referenced variable through the array element.
Extension from https://stackoverflow.com/a/55191/547210
I am creating a validating function to check several attributes of string variables, which may or may not have been set. (One of the attributes which is checked)
What I am trying to do with the function is receive an unknown number of arguments in the form shown below, and suppress errors that may be caused by passing an unset variable.
I'm receiving the variables like validate([ mixed $... ] ) by using func_get_args()
The previous post mentioned that it was possible by passing by reference, now is this possible when the variables are passed implicitly like this?
If you pass a variable that is not set in the calling scope, the array returned by func_get_args() will contain a NULL value at the position where the variable was passed, and an error will be triggered. This error is not triggered in the function code itself, but in the function call. There is, therefore, nothing that can be done to suppress this error from within the code of the function.
Consider this:
function accepts_some_args () {
$args = func_get_args();
var_dump($args);
}
$myVar = 'value';
accepts_some_args($notSet, $myVar);
/*
Output:
Notice: Undefined variable: notSet in...
array(2) {
[0]=>
NULL
[1]=>
string(5) "value"
}
*/
As you can see, the variable name notSet appears in the error, telling us that the error was triggered in the caller's scope, not that of the callee.
If we want to counter the error, we could do this:
accepts_some_args(@$notSet, $myVar);
...and prefix the variable names with the evil @ operator, but a better solution would be to structure our code differently, so we can do the checks ourselves:
function accepts_some_args ($args) {
var_dump($args);
}
$myVar = 'value';
$toPassToFunction = array();
$toPassToFunction[] = (isset($notSet)) ? $notSet : NULL;
$toPassToFunction[] = (isset($myVar)) ? $myVar : NULL;
accepts_some_args($toPassToFunction);
/*
Output:
array(2) {
[0]=>
NULL
[1]=>
string(5) "value"
}
*/
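Regarding the pass-by-reference idea raised in the question: on PHP 5.6 or later, which is an assumption the original thread does not make, a variadic parameter can itself be declared by reference, and passing an undefined variable by reference does not raise a notice. A minimal sketch, with the hypothetical function name which_are_set:

```php
<?php
// Hypothetical validator: reports which arguments hold a non-null value.
// The & before ... makes every argument a by-reference parameter.
function which_are_set(&...$vars) {
    $result = array();
    foreach ($vars as $value) {
        $result[] = isset($value);
    }
    return $result;
}

$myVar = 'value';
$flags = which_are_set($notSet, $myVar);   // no notice is raised for $notSet

var_dump($flags);
```

The trade-off is a side effect: binding an undefined variable by reference creates it in the caller's scope with the value NULL, and literals can no longer be passed as arguments.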
1) When an array is passed as an argument to a method or function, is it passed by reference, or by value?
2) When assigning an array to a variable, is the new variable a reference to the original array, or is it new copy?
What about doing this:
$a = array(1,2,3);
$b = $a;
Is $b a reference to $a?
For the second part of your question, see the array page of the manual, which states (quoting) :
Array assignment always involves value
copying. Use the reference operator to
copy an array by reference.
And the given example :
<?php
$arr1 = array(2, 3);
$arr2 = $arr1;
$arr2[] = 4; // $arr2 is changed,
// $arr1 is still array(2, 3)
$arr3 = &$arr1;
$arr3[] = 4; // now $arr1 and $arr3 are the same
?>
For the first part, the best way to be sure is to try ;-)
Consider this example of code :
function my_func($a) {
$a[] = 30;
}
$arr = array(10, 20);
my_func($arr);
var_dump($arr);
It'll give this output :
array
0 => int 10
1 => int 20
Which indicates the function has not modified the "outside" array that was passed as a parameter : it's passed as a copy, and not a reference.
If you want it passed by reference, you'll have to modify the function, this way :
function my_func(& $a) {
$a[] = 30;
}
And the output will become :
array
0 => int 10
1 => int 20
2 => int 30
As, this time, the array has been passed "by reference".
Don't hesitate to read the References Explained section of the manual : it should answer some of your questions ;-)
With regards to your first question, the array is passed by reference UNLESS it is modified within the method / function you're calling. If you attempt to modify the array within the method / function, a copy of it is made first, and then only the copy is modified. This makes it seem as if the array is passed by value when in actual fact it isn't.
For example, in this first case, even though you aren't defining your function to accept $my_array by reference (by using the & character in the parameter definition), it still gets passed by reference (ie: you don't waste memory with an unnecessary copy).
function handle_array($my_array) {
// ... read from but do not modify $my_array
print_r($my_array);
// ... $my_array effectively passed by reference since no copy is made
}
However if you modify the array, a copy of it is made first (which uses more memory but leaves your original array unaffected).
function handle_array($my_array) {
// ... modify $my_array
$my_array[] = "New value";
// ... $my_array effectively passed by value since requires local copy
}
FYI - this is known as "lazy copy" or "copy-on-write".
TL;DR
a) the method/function only reads the array argument => implicit (internal) reference
b) the method/function modifies the array argument => value
c) the method/function array argument is explicitly marked as a reference (with an ampersand) => explicit (user-land) reference
Or this:
- non-ampersand array param: passed by reference; the writing operations alter a new copy of the array, copy which is created on the first write;
- ampersand array param: passed by reference; the writing operations alter the original array.
Remember - PHP does a value-copy the moment you write to the non-ampersand array param. That's what copy-on-write means. I'd love to show you the C source of this behaviour, but it's scary in there. Better use xdebug_debug_zval().
Pascal MARTIN was right. Kosta Kontos was even more so.
Answer
It depends.
Long version
I think I'm writing this down for myself. I should have a blog or something...
Whenever people talk of references (or pointers, for that matter), they usually end up in a logomachy (just look at this thread!).
PHP being a venerable language, I thought I should add to the confusion (even though this is a summary of the above answers). Because, although two people can be right at the same time, you're better off just cracking their heads together into one answer.
First off, you should know that you're not a pedant if you don't answer in a black-and-white manner. Things are more complicated than "yes/no".
As you will see, the whole by-value/by-reference thing is very much related to what exactly are you doing with that array in your method/function scope: reading it or modifying it?
What does PHP say? (aka "change-wise")
The manual says this (emphasis mine):
By default, function arguments are passed by value (so that if the
value of the argument within the function is changed, it does not get
changed outside of the function). To allow a function to modify its
arguments, they must be passed by reference.
To have an argument to a
function always passed by reference, prepend an ampersand (&) to the
argument name in the function definition
As far as I can tell, when big, serious, honest-to-God programmers talk about references, they usually talk about altering the value of that reference. And that's exactly what the manual talks about: hey, if you want to CHANGE the value in a function, consider that PHP's doing "pass-by-value".
There's another case that they don't mention, though: what if I don't change anything - just read?
What if you pass an array to a method which doesn't explicitly mark the parameter as a reference, and we don't change that array in the function scope? E.g.:
<?php
function readAndDoStuffWithAnArray($array)
{
return $array[0] + $array[1] + $array[2];
}
$x = array(1, 2, 3);
echo readAndDoStuffWithAnArray($x);
Read on, my fellow traveller.
What does PHP actually do? (aka "memory-wise")
The same big and serious programmers, when they get even more serious, they talk about "memory optimizations" in regards to references. So does PHP. Because PHP is a dynamic, loosely typed language, that uses copy-on-write and reference counting, that's why.
It wouldn't be ideal to pass HUGE arrays to various functions, and PHP to make copies of them (that's what "pass-by-value" does, after all):
<?php
// filling an array with 10000 elements of int 1
// let's say it grabs 3 mb from your RAM
$x = array_fill(0, 10000, 1);
// pass by value, right? RIGHT?
function readArray($arr) { // <-- a new symbol (variable) gets created here
echo count($arr); // let's just read the array
}
readArray($x);
Well now, if this actually was pass-by-value, we'd have some 3mb+ RAM gone, because there are two copies of that array, right?
Wrong. As long as we don't change the $arr variable, that's a reference, memory-wise. You just don't see it. That's why PHP mentions user-land references when talking about &$someVar, to distinguish between internal and explicit (with ampersand) ones.
Facts
So, when an array is passed as an argument to a method or function is it passed by reference?
I came up with three (yeah, three) cases:
a) the method/function only reads the array argument
b) the method/function modifies the array argument
c) the method/function array argument is explicitly marked as a reference (with an ampersand)
Firstly, let's see how much memory that array actually eats:
<?php
$start_memory = memory_get_usage();
$x = array_fill(0, 10000, 1);
echo memory_get_usage() - $start_memory; // 1331840
That many bytes. Great.
a) the method/function only reads the array argument
Now let's make a function which only reads the said array as an argument and we'll see how much memory the reading logic takes:
<?php
function printUsedMemory($arr)
{
$start_memory = memory_get_usage();
count($arr); // read
$x = $arr[0]; // read (+ minor assignment)
$arr[0] - $arr[1]; // read
echo memory_get_usage() - $start_memory; // let's see the memory used whilst reading
}
$x = array_fill(0, 10000, 1); // this is 1331840 bytes
printUsedMemory($x);
Wanna guess? I get 80! See for yourself. This is the part that the PHP manual omits. If the $arr param was actually passed-by-value, you'd see something similar to 1331840 bytes. It seems that $arr behaves like a reference, doesn't it? That's because it is a reference - an internal one.
b) the method/function modifies the array argument
Now, let's write to that param, instead of reading from it:
<?php
function printUsedMemory($arr)
{
$start_memory = memory_get_usage();
$arr[0] = 1; // WRITE!
echo memory_get_usage() - $start_memory; // let's see the memory used whilst writing
}
$x = array_fill(0, 10000, 1);
printUsedMemory($x);
Again, see for yourself, but, for me, that's pretty close to being 1331840. So in this case, the array is actually being copied to $arr.
c) the method/function array argument is explicitly marked as a reference (with an ampersand)
Now let's see how much memory a write operation to an explicit reference takes - note the ampersand in the function signature:
<?php
function printUsedMemory(&$arr) // <----- explicit, user-land, pass-by-reference
{
$start_memory = memory_get_usage();
$arr[0] = 1; // WRITE!
echo memory_get_usage() - $start_memory; // let's see the memory used whilst writing
}
$x = array_fill(0, 10000, 1);
printUsedMemory($x);
My bet is that you get 200 max! So this eats approximately as much memory as reading from a non-ampersand param.
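The three cases can also be compared in a single run. The sketch below is only an approximation: the helper names delta_read, delta_write and delta_write_ref are made up, and the absolute byte counts depend on the PHP version and build, so only the relative sizes are meaningful.

```php
<?php
// Case a: the parameter is only read, so no copy should be needed.
function delta_read($arr) {
    $before = memory_get_usage();
    $n = count($arr);
    $first = $arr[0];
    return memory_get_usage() - $before;
}

// Case b: the first write to a by-value parameter forces the copy.
function delta_write($arr) {
    $before = memory_get_usage();
    $arr[0] = 2;
    return memory_get_usage() - $before;
}

// Case c: explicit reference, the caller's array is written in place.
function delta_write_ref(&$arr) {
    $before = memory_get_usage();
    $arr[0] = 2;
    return memory_get_usage() - $before;
}

$x = array_fill(0, 10000, 1);

$read     = delta_read($x);
$write    = delta_write($x);
$writeRef = delta_write_ref($x);

printf("read: %d, write: %d, write by reference: %d\n", $read, $write, $writeRef);
```

Measuring inside each function matters here, because the temporary copy made in case b is released again as soon as the function returns.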
By default
Primitives are passed by value. Unlike in Java, a string is a primitive in PHP.
Arrays of primitives are passed by value.
Objects are passed by handle, so changes made to the object are visible to the caller (this behaves much like pass-by-reference).
Arrays of objects are passed by value (the array), but each element still refers to the same object.
<?php
$obj=new stdClass();
$obj->field='world';
$original=array($obj);
function example($hello) {
$hello[0]->field='mundo'; // change will be applied in $original
$hello[1]=new stdClass(); // change will not be applied in $original
}
example($original);
var_dump($original);
// array(1) { [0]=> object(stdClass)#1 (1) { ["field"]=> string(5) "mundo" } }
Note: As an optimization, every value is shared internally with the caller until it is modified inside the function. If it is modified and the parameter was not declared by reference, the value is copied first and the copy is modified.
When an array is passed to a method or function in PHP, it is passed by value unless you explicitly pass it by reference, like so:
function test(&$array) {
$array['new'] = 'hey';
}
$a = array(1,2,3);
// prints [0=>1,1=>2,2=>3]
var_dump($a);
test($a);
// prints [0=>1,1=>2,2=>3,'new'=>'hey']
var_dump($a);
In your second question, $b is not a reference to $a, but a copy of $a.
Much like the first example, you can reference $a by doing the following:
$a = array(1,2,3);
$b = &$a;
// prints [0=>1,1=>2,2=>3]
var_dump($b);
$b['new'] = 'hey';
// prints [0=>1,1=>2,2=>3,'new'=>'hey']
var_dump($a);
In PHP arrays are passed to functions by value by default, unless you explicitly pass them by reference, as the following snippet shows:
$foo = array(11, 22, 33);
function hello($fooarg) {
$fooarg[0] = 99;
}
function world(&$fooarg) {
$fooarg[0] = 66;
}
hello($foo);
var_dump($foo); // (original array not modified) array passed-by-value
world($foo);
var_dump($foo); // (original array modified) array passed-by-reference
Here is the output:
array(3) {
[0]=>
int(11)
[1]=>
int(22)
[2]=>
int(33)
}
array(3) {
[0]=>
int(66)
[1]=>
int(22)
[2]=>
int(33)
}
To extend one of the answers: subarrays of multidimensional arrays are also passed by value unless passed explicitly by reference.
<?php
$foo = array( array(1,2,3), 22, 33);
function hello($fooarg) {
$fooarg[0][0] = 99;
}
function world(&$fooarg) {
$fooarg[0][0] = 66;
}
hello($foo);
var_dump($foo); // (original array not modified) array passed-by-value
world($foo);
var_dump($foo); // (original array modified) array passed-by-reference
The result is:
array(3) {
[0]=>
array(3) {
[0]=>
int(1)
[1]=>
int(2)
[2]=>
int(3)
}
[1]=>
int(22)
[2]=>
int(33)
}
array(3) {
[0]=>
array(3) {
[0]=>
int(66)
[1]=>
int(2)
[2]=>
int(3)
}
[1]=>
int(22)
[2]=>
int(33)
}
This thread is a bit older, but here is something I just came across:
Try this code:
$date = new DateTime();
$arr = ['date' => $date];
echo $date->format('Ymd') . '<br>';
mytest($arr);
echo $date->format('Ymd') . '<br>';
function mytest($params = []) {
if (isset($params['date'])) {
$params['date']->add(new DateInterval('P1D'));
}
}
http://codepad.viper-7.com/gwPYMw
Note there is no ampersand on the $params parameter, and still it changes the value of $arr['date']. This doesn't really match all the other explanations here or what I thought until now.
If I clone the $params['date'] object, the second date printed stays the same. If I just set it to a string, it doesn't affect the output either.
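The behaviour is consistent with the array being copied by value while its element is still a handle to the same DateTime object. A short sketch of both variants follows; the function names add_day_in_place and add_day_to_clone and the fixed date are chosen only for this illustration.

```php
<?php
// Mutates the object that the array element points to; the caller sees it.
function add_day_in_place(array $params) {
    $params['date']->add(new DateInterval('P1D'));
}

// Works on an independent copy, leaving the caller's object alone.
function add_day_to_clone(array $params) {
    $copy = clone $params['date'];
    return $copy->add(new DateInterval('P1D'));
}

$date = new DateTime('2020-01-01');
$arr = array('date' => $date);

$later = add_day_to_clone($arr);
$afterClone = $date->format('Ymd');     // original object is unchanged

add_day_in_place($arr);
$afterInPlace = $date->format('Ymd');   // original object moved one day on

echo $afterClone, ' ', $afterInPlace, ' ', $later->format('Ymd'), PHP_EOL;
```

If mutation through shared handles is not wanted, cloning as above or using DateTimeImmutable are the usual ways around it.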