I was wondering why php does not complain when we reference a non existing variable (being it a plain variable or array), is this just the way it is, or there is something else I am missing?
For example this code
<?php
$t = &$r["er"];
var_dump($r);
?>
throws no warning about a non existing variable.
Besides that the var_dump show this:
array(1) { ["er"]=> &NULL }
that &NULL is something I didn't really expected, I thought I would get a plain NULL.
Thanks in advance!
If memory of the PHP interpreter reference var allocation serves me right, PHP will create a null element in the hash table with a key like the one you sent and reference it. This is visible by running the following test:
<?php
$i = 0;
$arr = [];
$arrS = null;
$v = memory_get_peak_usage();
for ($i = 0; $i < 150; $i++) {
$arrS = &$arr[rand()];
}
$v = memory_get_peak_usage() - $v;
echo $v;
Until the default heap size, PHP will return exactly an extra 0 memory used - as it is still allocating already "prepared" array items (PHP keeps a few extra hash table elements empty but allocated for performance purposes). You can see this by setting it from 0 to 16 (which is the heap size!).
When you get over 16, PHP will have to allocate extra items, and will do so on i=17, i=18 etc..., creating null items in order to reference them.
P.S: contrary to what people said, this does NOT throw an error, warning or notice. Recalling an empty item without reference would - referencing an inexistent item does not. Big big BIG difference.
throws no warning about a non existing variable.
This is how references work. $a = &$b; creates $b if it does not exist yet, "for future reference", if you will. It's the same as in parameters that are passed by reference. Maybe this looks familiar to you:
preg_match($pattern, $string, $matches);
Here the third parameter is a reference parameter, thus $matches does not have to exist at time of the method invocation, but it will be created.
that &NULL is something I didn't really expected, I thought I would get a plain NULL.
Why didn't you expect it? $r['er'] is a reference "to/from" $t. Note that references are not pointers, $r['er'] and $t are equal references to the same value, there is no direction from one to the other (hence the quotation marks in the last sentence).
Related
In C++ if you pass a large array to a function, you need to pass it by reference, so that it is not copied to the new function wasting memory. If you don't want it modified you pass it by const reference.
Can anyone verify that passing by reference will save me memory in PHP as well. I know PHP does not use addresses for references like C++ that is why I'm slightly uncertain. That is the question.
The following does not apply to objects, as it has been already stated here. Passing arrays and scalar values by reference will only save you memory if you plan on modifying the passed value, because PHP uses a copy-on-change (aka copy-on-write) policy. For example:
# $array will not be copied, because it is not modified.
function foo($array) {
echo $array[0];
}
# $array will be copied, because it is modified.
function bar($array) {
$array[0] += 1;
echo $array[0] + $array[1];
}
# This is how bar shoudl've been implemented in the first place.
function baz($array) {
$temp = $array[0] + 1;
echo $temp + $array[1];
}
# This would also work (passing the array by reference), but has a serious
#side-effect which you may not want, but $array is not copied here.
function foobar(&$array) {
$array[0] += 1;
echo $array[0] + $array[1];
}
To summarize:
If you are working on a very large array and plan on modifying it inside a function, you actually should use a reference to prevent it from getting copied, which can seriously decrease performance or even exhaust your memory limit.
If it is avoidable though (that is small arrays or scalar values), I'd always use functional-style approach with no side-effects, because as soon as you pass something by reference, you can never be sure what passed variable may hold after the function call, which sometimes can lead to nasty and hard-to-find bugs.
IMHO scalar values should never be passed by reference, because the performance impact can not be that big as to justify the loss of transparency in your code.
The short answer is use references when you need the functionality that they provide. Don't think of them in terms of memory usage or speed. Pass by reference is always going to be slower if the variable is read only.
Everything is passed by value, including objects. However, it's the handle of the object that is passed, so people often mistakenly call it by-reference because it looks like that.
Then what functionality does it provide? It gives you the ability to modify the variable in the calling scope:
class Bar {}
$bar = new Bar();
function by_val($o) { $o = null; }
function by_ref(&$o) { $o = null; }
by_val($bar); // $bar is still non null
by_ref($bar); // $bar is now null
So if you need such functionality (most often you do not), then use a reference. Otherwise, just pass by value.
Functions that look like this:
$foo = modify_me($foo);
sometimes are good candidates for pass-by-reference, but it should be absolutely clear that the function modifies the passed in variable. (And if such a function is useful, often it's because it really ought to just be part of some class modifying its own private data.)
In PHP :
objects are passed by reference1 -- always
arrays and scalars are passed by value by default ; and can be passed by reference, using an & in the function's declaration.
For the performance part of your question, PHP doesn't deal with that the same way as C/C++ ; you should read the following article : Do not use PHP references
1. Or that's what we usually say -- even if it's not "completely true" -- see Objects and references
Basically in the case of working with large arrays it's convenient to pass null back if an error occurs since if $array = null then $array[] = 1 is [ 1 ] and null is also usable in a callable context, ie. function (array $array = null) will accept null as an acceptable value. Basically null is convenient since you can easily identify it as an error if you have error handling code, but also easily ignore it if you don't care.
This is fairly straight forward in most cases, however there is a corner case where PHP doesn't really support it all that well and that's when passing back a reference in the context of a function that doesn't necessarily accept a reference but needs to pass null back sometimes, yet pass a reference back other times (most of the time this is not an issue since you're returning a reference to a instance variable but sometimes that's not the case). There's also the case of calling a function with a null value in a non awkward way.
The reason to pass the reference is obviously to save the trouble of copying the array around (especially when it's very large).
The following "solution"...
\error_reporting(-1);
function & nil()
{
$nil = null;
return $nil;
}
function & pass(array & $variable = null)
{
return $variable;
}
function & check ()
{
return nil();
}
$test = pass(nil());
$test = &pass(nil());
$test1 = &check();
$test1[] = 1;
$test2 = &check();
$test2[] = 2;
\var_dump($test1, $test2);
Works, but...
My question is Does PHP guarantee the local variable won't be garbage collected before all references to it are garbage collected? or is that undefined behavior.
PHP will never GC anything while reference count is greater than 0.
I'm creating a wrapper function around mysqli so that my application needn't be overly complicated with database-handling code. Part of that is a bit of code to parameterize the SQL calls using mysqli::bind_param(). bind_param(), as you may know, requires references. Since it's a semi-general wrapper, I end up making this call:
call_user_func_array(array($stmt, 'bind_param'), $this->bindArgs);
and I get an error message:
Parameter 2 to mysqli_stmt::bind_param() expected to be a reference, value given
The above discussion is to forstall those who would say "You don't need references at all in your example".
My "real" code is a bit more complicated than anyone wants to read, so I've boiled the code leading up to this error into the following (hopefully) illustrative example:
class myclass {
private $myarray = array();
function setArray($vals) {
foreach ($vals as $key => &$value) {
$this->myarray[] =& $value;
}
$this->dumpArray();
}
function dumpArray() {
var_dump($this->myarray);
}
}
function myfunc($vals) {
$obj = new myclass;
$obj->setArray($vals);
$obj->dumpArray();
}
myfunc(array('key1' => 'val1',
'key2' => 'val2'));
The problem appears to be that, in myfunc(), in between the call to setArray() and the call to dumpArray(), all the elements in $obj->myarray stop being references and become just values instead. This can be easily seen by looking at the output:
array(2) {
[0]=>
&string(4) "val1"
[1]=>
&string(4) "val2"
}
array(2) {
[0]=>
string(4) "val1"
[1]=>
string(4) "val2"
}
Note that the array is in the "correct" state in the first half of the output here. If it made sense to do so, I could make my bind_param() call at that point, and it would work. Unfortunately, something breaks in the latter half of the output. Note the lack of the "&" on the array value types.
What happened to my references? How can I prevent this from happening? I hate to call "PHP bug" when I'm really not a language expert, but could this be one? It does seem very odd to me. I'm using PHP 5.3.8 for my testing at the moment.
Edit:
As more than one person pointed out, the fix is to change setArray() to accept its argument by reference:
function setArray(&$vals) {
I'm adding this note to document WHY this seems to work.
PHP generally, and mysqli in particular, appear to have a slightly odd concept of what a "reference" is. Observe this example:
$a = "foo";
$b = array(&$a);
$c = array(&$a);
var_dump($b);
var_dump($c);
First of all, I'm sure you're wondering why I'm using arrays instead of scalar variables -- it's because var_dump() doesn't show any indication of whether a scalar is a reference, but it does for array members.
Anyway, at this point, $b[0] and $c[0] are both references to $a. So far, so good. Now we throw our first wrench into the works:
unset($a);
var_dump($b);
var_dump($c);
$b[0] and $c[0] are both still references to the same thing. If we change one, both will still change. But what are they references to? Some unnamed location in memory. Of course, garbage collection insures that our data is safe, and will remain so, until we stop refering to it.
For our next trick, we do this:
unset($b);
var_dump($c);
Now $c[0] is the only remaining reference to our data. And, whoa! Magically, it's no longer a "reference". Not by var_dump()'s measure, and not by mysqli::bind_param()'s measure either.
The PHP manual says that there's a separate flag, 'is_ref' on every piece of data. However, this test appears to suggest that 'is_ref' is actually equivalent to '(refcount > 1)'
For fun, you can modify this toy example as follows:
$a = array("foo");
$b = array(&$a[0]);
$c = array(&$a[0]);
var_dump($a);
var_dump($b);
var_dump($c);
Note that all three arrays have the reference mark on their members, which backs up the idea that 'is_ref' is functionally equivalent to '(refcount > 1)'.
It's beyond me why mysqli::bind_param() would care about this distinction in the first place (or perhaps it's call_user_func_array()... either way), but it would appear that what we "really" need to ensure is that the reference count is at least 2 for each member of $this->bindArgs in our call_user_func_array() call (see the very beginning of the post/question). And the easiest way to do that (in this case) is to make setArray() pass-by-reference.
Edit:
For extra fun and games, I modified my original program (not shown here) to leave its equivalent to setArray() pass-by-value, and to create a gratuitous extra array, bindArgsCopy, containing exactly the same thing as bindArgs. Which means that, yes, both arrays contained references to "temporary" data which was deallocated by the time of the second call. As predicted by the analysis above, this worked. This demonstrates that the above analysis is not an artifact of var_dump()'s inner workings (a relief to me, at least), and it also demonstrates that it's the reference count that matters, not the "temporary-ness" of the original data storage.
So. I make the following assertion: In PHP, for the purpose of call_user_func_array() (and probably more), saying that a data item is a "reference" is the same thing as saying that the item's reference count is greater than or equal to 2 (ignoring PHP's internal memory optimizations for equal-valued scalars)
Administrivia note: I'd love to give mario the site credit for the answer, as he was the first to suggest the correct answer, but since he wrote it in a comment, not an actual answer, I couldn't do it :-(
Pass the array as a reference:
function setArray(&$vals) {
foreach ($vals as $key => &$value) {
$this->myarray[] =& $value;
}
$this->dumpArray();
}
My guess (which could be wrong in some details, but is hopefully correct for the most part) as to why this makes your code work as expected is that when you pass as a value, everything's cool for the call to dumpArray() inside of setArray() because the reference to the $vals array inside setArray() still exist. But when control returns to myfunc() then that temporary variable is gone as are all references to it. So PHP dutifully changes the array to string values instead of references before deallocating the memory for it. But if you pass it as a reference from myfunc() then setArray() is using references to an array that lives on when control returns to myfunc().
Adding the & in the argument signature fixed it for me. This means the function will receive the memory address of the original array.
function setArray(&$vals) {
// ...
}
CodePad.
I just encountered this same problem in almost exactly the same situation (I'm writing a PDO wrapper for my own purposes). I suspected that PHP was changing the reference to a value once no other variables were referencing the same data, and Rick, your comments in the edit to your question confirmed that suspicion, so thank you for that.
Now, for anybody else who comes across something like this, I believe I have the simplest solution: simply set each relevant array element to a reference to itself before passing the array to call_user_func_array().
I'm not sure exactly what happens internally because it seems like that shouldn't work, but the element becomes a reference again (which you can see with var_dump()), and call_user_func_array() then passes the argument by reference as expected. This fix seems to work even if the array element is still a reference already, so you don't have to check first.
In Rick's code, it would be something like this (everything after the first argument for bind_param is by reference, so I skip the first one and fix all of the ones after that):
for ($i = 1, $count = count($this->bindArgs); $i < $count; $i++) {
$this->bindArgs[$i] = &$this->bindArgs[$i];
}
call_user_func_array(array($stmt, 'bind_param'), $this->bindArgs);
I'm using the PHP shorthand addition operator to tally the number of times a specific id occurs within a multidimensional array:
$source['tally'] = array();
foreach ($items as $item) {
$source['tally'][$item->getId()] += 1;
}
The first time it hits a new id, it sets its 'tally' value to 1 and then increments it each time it's found thereafter.
The code works perfectly (I get the correct totals), but PHP gives me an "Undefined Offset" notice each time it finds a new id.
I know I can just turn off notices in php.ini, but figured there must be a reason why PHP doesn't approve of my technique.
Is it considered bad practice to create a new key/offset dynamically like this, and is there a better approach I should take instead?
Please Note: To help clarify following initial feedback, I do understand why the notice is being given. My question is whether I should do anything about it or just live with the notice. Apologies if my question didn't make that clear enough.
You have to understand that PHP notices are a tool. They exist so you have additional help when writing code and you are able to detect potential bugs easily. Uninitialized variables are the typical example. Many developers ask: if it's not mandatory to initialize variables, why is PHP complaining? Because it's trying to help:
$item_count = 0;
while( do_some_stuff() ){
$iten_count++; // Notice: Undefined variable: iten_count
}
echo $item_count . ' items found';
Oops, I mistyped the variable name.
$res = mysql_query('SELECT * FROM foo WHERE foo_id=' . (int)$_GET['foo_id']);
// Notice: Undefined index: foo_id
Oops, I haven't provided a default value.
Yours is just another example of the same situation. If you're incrementing the wrong array element, you'd like to know.
If you simply want to hide the notice, you can use the error control operator:
$source['tally'] = array();
foreach ($items as $item) {
#$source['tally'][$item->getId()]++;
}
However, you should generally initialize your variables, in this case by adding the following code inside the loop:
if (!isset( $source['tally'][$item->getId()] ))
{
$source['tally'][$item->getId()] = 0;
}
Using += (or any of the other augmented assignment operators) assumes that a value already exists for that key. Since this is not the case the first time the ID is encountered, a notice is emitted and 0 is assumed.
It's caused because you don't initialize your array to contain the initial value of 0. Note that the code would probably work, however it is considered good practice to initialize all the variable you are about to preform actions upon. So the following code is an example of what you should probably have:
<?php
$source['tally'] = array();
foreach ($items as $item) {
//For each $item in $items,
//check if that item doesn't exist and create it (0 times).
//Then, regardless of the previous statement, increase it by one.
if (!isset($source['tally'][$item->getID()]) $source['tally'][$item->getID()] = 0;
$source['tally'][$item->getId()] += 1;
}
?>
The actual reason why PHP cares about it, is mainly to warn you about that empty value (much like it would if you try to read it). It is a kind of an error, not a fatal, script-killing one, but a more subtle quiet one. You should still fix it though.
In C++ if you pass a large array to a function, you need to pass it by reference, so that it is not copied to the new function wasting memory. If you don't want it modified you pass it by const reference.
Can anyone verify that passing by reference will save me memory in PHP as well. I know PHP does not use addresses for references like C++ that is why I'm slightly uncertain. That is the question.
The following does not apply to objects, as it has been already stated here. Passing arrays and scalar values by reference will only save you memory if you plan on modifying the passed value, because PHP uses a copy-on-change (aka copy-on-write) policy. For example:
# $array will not be copied, because it is not modified.
function foo($array) {
echo $array[0];
}
# $array will be copied, because it is modified.
function bar($array) {
$array[0] += 1;
echo $array[0] + $array[1];
}
# This is how bar shoudl've been implemented in the first place.
function baz($array) {
$temp = $array[0] + 1;
echo $temp + $array[1];
}
# This would also work (passing the array by reference), but has a serious
#side-effect which you may not want, but $array is not copied here.
function foobar(&$array) {
$array[0] += 1;
echo $array[0] + $array[1];
}
To summarize:
If you are working on a very large array and plan on modifying it inside a function, you actually should use a reference to prevent it from getting copied, which can seriously decrease performance or even exhaust your memory limit.
If it is avoidable though (that is small arrays or scalar values), I'd always use functional-style approach with no side-effects, because as soon as you pass something by reference, you can never be sure what passed variable may hold after the function call, which sometimes can lead to nasty and hard-to-find bugs.
IMHO scalar values should never be passed by reference, because the performance impact can not be that big as to justify the loss of transparency in your code.
The short answer is use references when you need the functionality that they provide. Don't think of them in terms of memory usage or speed. Pass by reference is always going to be slower if the variable is read only.
Everything is passed by value, including objects. However, it's the handle of the object that is passed, so people often mistakenly call it by-reference because it looks like that.
Then what functionality does it provide? It gives you the ability to modify the variable in the calling scope:
class Bar {}
$bar = new Bar();
function by_val($o) { $o = null; }
function by_ref(&$o) { $o = null; }
by_val($bar); // $bar is still non null
by_ref($bar); // $bar is now null
So if you need such functionality (most often you do not), then use a reference. Otherwise, just pass by value.
Functions that look like this:
$foo = modify_me($foo);
sometimes are good candidates for pass-by-reference, but it should be absolutely clear that the function modifies the passed in variable. (And if such a function is useful, often it's because it really ought to just be part of some class modifying its own private data.)
In PHP :
objects are passed by reference1 -- always
arrays and scalars are passed by value by default ; and can be passed by reference, using an & in the function's declaration.
For the performance part of your question, PHP doesn't deal with that the same way as C/C++ ; you should read the following article : Do not use PHP references
1. Or that's what we usually say -- even if it's not "completely true" -- see Objects and references