I am always unsure whether to write functions that take their arguments by reference. It would be great if someone could explain exactly when I should use them, with some realistic examples.
A common reason for passing by reference (or by pointer) in other languages is to save memory, but PHP implements copy-on-write for arguments declared as passed by value, so no copy is made until the value is modified. There are also some hidden semantic oddities: PHP 5 introduced the practice of always passing objects by reference, array values are always stored as references, and call_user_func() always calls by value, never by reference (because it is itself a function, not a language construct).
But this is additional to the original question asked.
In general it's good practice to always declare your functions as passing by value (copy) unless you explicitly want the value to be different after the called function returns. The reason is that you should know how the called function changes the state of the code you are currently writing. These concepts are generally referred to as isolation and separation of concerns.
Since PHP 5 there is no real reason to pass values by reference.
One exception is if you want to modify arrays in-place. Take for example the sort function. You can see that the array is passed by reference, which means that the array is sorted in place (no new array is returned).
Or consider a recursive function where each call needs to have access to the same datum (which is often an array too).
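For the sort() case mentioned above, here is a minimal sketch of the difference. sort() is the built-in function; the sample data and the helper sorted_copy() are hypothetical and only there for contrast.

```php
<?php
// sort() takes its argument by reference, so it reorders $numbers itself
// and returns a boolean rather than a new array.
$numbers = array(3, 1, 2);
sort($numbers);

// Hypothetical by-value counterpart: the caller's array is left untouched
// and a sorted copy is returned instead.
function sorted_copy(array $values)
{
    sort($values);      // sorts the local copy only
    return $values;
}

$original = array(3, 1, 2);
$sorted   = sorted_copy($original);

echo implode(',', $numbers), "\n";   // prints 1,2,3
echo implode(',', $original), "\n";  // prints 3,1,2
echo implode(',', $sorted), "\n";    // prints 1,2,3
```

The by-value form makes the data flow visible at the call site, which is the isolation argument made earlier in this answer.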
In PHP 4 it was used for large variables. If you passed an array to a function, the array was copied for use in the function, using a lot of memory and CPU. The solution was this:
function foo(&$arr)
{
echo $arr['value'];
}
$arr = array('value' => 'example');
foo($arr);
This way you only passed a reference, a link to the array, and saved memory and CPU. Since PHP 5, objects are passed as handles and arrays and scalars are copy-on-write, so a by-value argument is not actually copied unless the function modifies it, and there is no need to do this yourself.
This is best when your function always returns a modified version of the variable passed to it, and the result is assigned back to the same variable:
$var = modify($var);
function modify($var)
{
return $var.'ret';
}
If you always assign the result back to the variable you passed in, using a reference is a good fit.
Also, when dealing with large variables and especially arrays, it is good to pass by reference wherever feasible. This helps save on memory.
Usually, I pass by reference when dealing with arrays, since I usually assign the modified array back to the original variable.
In C++ if you pass a large array to a function, you need to pass it by reference, so that it is not copied to the new function wasting memory. If you don't want it modified you pass it by const reference.
Can anyone verify that passing by reference will save me memory in PHP as well? I know PHP does not use addresses for references like C++ does, which is why I'm slightly uncertain. That is the question.
The following does not apply to objects, as it has been already stated here. Passing arrays and scalar values by reference will only save you memory if you plan on modifying the passed value, because PHP uses a copy-on-change (aka copy-on-write) policy. For example:
# $array will not be copied, because it is not modified.
function foo($array) {
echo $array[0];
}
# $array will be copied, because it is modified.
function bar($array) {
$array[0] += 1;
echo $array[0] + $array[1];
}
# This is how bar should've been implemented in the first place.
function baz($array) {
$temp = $array[0] + 1;
echo $temp + $array[1];
}
# This would also work (passing the array by reference) and $array is not
# copied here, but it has a serious side effect that you may not want.
function foobar(&$array) {
$array[0] += 1;
echo $array[0] + $array[1];
}
To summarize:
If you are working on a very large array and plan on modifying it inside a function, you actually should use a reference to prevent it from getting copied, which can seriously decrease performance or even exhaust your memory limit.
If it is avoidable though (that is, for small arrays or scalar values), I'd always use a functional-style approach with no side effects, because as soon as you pass something by reference, you can never be sure what the passed variable may hold after the function call, which can sometimes lead to nasty and hard-to-find bugs.
IMHO scalar values should never be passed by reference, because the performance impact cannot be big enough to justify the loss of transparency in your code.
The short answer is use references when you need the functionality that they provide. Don't think of them in terms of memory usage or speed. Pass by reference is always going to be slower if the variable is read only.
Everything is passed by value, including objects. However, it's the handle of the object that is passed, so people often mistakenly call it by-reference because it looks like that.
Then what functionality does it provide? It gives you the ability to modify the variable in the calling scope:
class Bar {}
$bar = new Bar();
function by_val($o) { $o = null; }
function by_ref(&$o) { $o = null; }
by_val($bar); // $bar is still non null
by_ref($bar); // $bar is now null
So if you need such functionality (most often you do not), then use a reference. Otherwise, just pass by value.
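To make the "handle is passed by value" point concrete, here is a small sketch. The class Account and the functions deposit() and replace() are hypothetical names used only for illustration.

```php
<?php
class Account
{
    public $balance = 0;
}

// By value: the function receives a copy of the handle, which still
// points at the same object, so property changes are visible to the caller.
function deposit($account, $amount)
{
    $account->balance += $amount;
}

// By value: rebinding the local variable to a new object does not
// affect the caller's variable.
function replace($account)
{
    $account = new Account();
    $account->balance = 999;
}

$mine = new Account();
deposit($mine, 50);
replace($mine);
echo $mine->balance, "\n"; // prints 50
```

Mutating the object works without &; only rebinding the caller's variable itself needs a reference.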
Functions that look like this:
$foo = modify_me($foo);
sometimes are good candidates for pass-by-reference, but it should be absolutely clear that the function modifies the passed in variable. (And if such a function is useful, often it's because it really ought to just be part of some class modifying its own private data.)
In PHP:
objects are passed by reference [1] -- always
arrays and scalars are passed by value by default, and can be passed by reference by using an & in the function's declaration.
For the performance part of your question, PHP doesn't deal with that the same way as C/C++; you should read the following article: Do not use PHP references
[1] Or that's what we usually say -- even if it's not "completely true" -- see Objects and references
I'm using the Facebook library with this code in it:
class FacebookRestClient {
...
public function &users_hasAppPermission($ext_perm, $uid=null) {
return $this->call_method('facebook.users.hasAppPermission',
array('ext_perm' => $ext_perm, 'uid' => $uid));
}
...
}
What does the & at the beginning of the function definition mean, and how do I go about using a library like this (in a simple example)?
An ampersand before a function name means the function will return a reference to a variable instead of the value.
Returning by reference is useful when you want to use a function to find to which variable a reference should be bound. Do not use return-by-reference to increase performance. The engine will automatically optimize this on its own. Only return references when you have a valid technical reason to do so.
See Returning References.
It's returning a reference, as mentioned already. In PHP 4, objects were assigned by value, just like any other value. This is highly unintuitive and contrary to how most other languages work.
To get around the problem, references were used for variables that pointed to objects. In PHP 5, references are very rarely used. I'm guessing this is legacy code or code trying to preserve backwards compatibility with PHP 4.
This is often known in PHP as Returning reference or Returning by reference.
Returning by reference is useful when you want to use a function to find to which variable a reference should be bound. Do not use return-by-reference to increase performance. The engine will automatically optimize this on its own. Only return references when you have a valid technical reason to do so.
PHP documentation on Returning reference
A reference in PHP is simply another name assigned to the content of a variable. PHP references are not like pointers in C: they are not actual memory addresses, so they cannot be used for pointer arithmetic.
The concept of returning references can be very confusing especially to beginners, so an example will be helpful.
$populationCount = 120;
function &getPopulationCount() {
global $populationCount;
return $populationCount;
}
$countryPopulation =& getPopulationCount();
$countryPopulation++;
echo "\$populationCount = $populationCount\n"; // Output: $populationCount = 121
echo "\$countryPopulation = $countryPopulation\n"; //Output: $countryPopulation = 121
The function getPopulationCount() defined with a preceding &, returns the reference to the content or value of $populationCount. So, incrementing $countryPopulation, also increments $populationCount.
Let's assume the following:
private $array = array(/*really big multi-dimensional array*/);
public function &func1($specific_large_sub_array_key)
{
return $this->array[$specific_large_sub_array_key];
}
public function func2()
{
$specificArray = &$this->func1(1);
$this->func3($specificArray);
}
public function func3($specificArray)
{
/* do stuff here*/
}
My question is this:
If func3 does not declare $specificArray as passed by reference, does PHP make a copy of $specificArray when it calls func3 inside of func2? Or does PHP keep the reference and propagate it automatically?
i.e. Will this...
public function func3($specificArray)
{
unset($specificArray[234]);
}
...affect $array?
Thank you
Note, this example is extremely simplified.
PHP is pretty intelligent as to how it deals with variables and copies.
Take the following example:
// Allocate one variable with content 'Hello'
$var = 'Hello';
At this point, the Zend Engine has a representation of your string variable with the content, Hello.
Now if you do this:
$varCopy = $var;
You have 2 independent variables ($var and $varCopy), but since their contents are the same, the content only exists in one place in memory (basically a true copy hasn't been made yet). At this point, the two variables reference the same value (Hello) in a symbol table. It will only copy the contents once one of the two variables is modified. This same logic works for 2 copies to any number of copies.
Put simply, PHP is smart enough not to copy the value of the variable or array when it isn't necessary to make a copy.
You can learn more about this on the Reference Counting Basics page on the PHP manual. They even give an example specific to arrays towards the end.
A useful function is memory_get_usage which can show you how much memory PHP is using. You can use this to track the fact that the memory usage will change very little as you pass multiple copies of your array around. This can help prove the point outlined in the reference counting basics section of the manual.
You don't need to know all the details about how it works, but do be aware that PHP is smart in how it creates and manages references.
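As a rough way to observe this with memory_get_usage(), here is a minimal sketch. The function name measure_copy_on_write() and the array size are arbitrary choices for illustration, and the exact byte counts vary by PHP version and platform, so treat the numbers as indicative only.

```php
<?php
// Illustrative only: exact byte counts vary by PHP version and platform.
function measure_copy_on_write()
{
    $big = range(1, 100000);

    $before = memory_get_usage();
    $copy = $big;                 // no real copy yet; both names share one array
    $afterAssign = memory_get_usage();

    $copy[0] = -1;                // first write triggers the actual copy
    $afterWrite = memory_get_usage();

    return array(
        'assign' => $afterAssign - $before,
        'write'  => $afterWrite - $afterAssign,
    );
}

$delta = measure_copy_on_write();
printf(
    "assignment: %d bytes, first write: %d bytes\n",
    $delta['assign'],
    $delta['write']
);
```

The assignment should cost next to nothing, while the first write should cost roughly the size of the array's storage.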
EDIT:
To answer your actual question directly, no, in func3 PHP will not make a copy of the array even if you don't pass it by reference. It will use references as illustrated in the reference counting basics section, so you can pass it by value without any concern.
If you call unset however, the value you unset will only be removed from the local copy of the array, so it ultimately isn't removed from the source array unless you pass it by reference to the function. But passing it by value does not create a whole new copy of the entire gigantic array. Even removing one value from the copy doesn't create a whole new copy minus the entry you removed (you just have a second array with all identical references to the first, but it is missing the one reference to the removed entry).
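A simplified sketch of the situation in the question may help. The class Store and its method names are hypothetical stand-ins for func1, func2 and func3, and the tiny array stands in for the large one.

```php
<?php
class Store
{
    private $array = array(1 => array(234 => 'x', 235 => 'y'));

    // Returns a reference to one sub-array, like func1 in the question.
    public function &sub($key)
    {
        return $this->array[$key];
    }

    public function run()
    {
        $sub = &$this->sub(1);

        $this->byValue($sub);                     // works on a separated copy
        $afterByValue = count($this->array[1]);

        $this->byReference($sub);                 // works on the original
        $afterByReference = count($this->array[1]);

        return array($afterByValue, $afterByReference);
    }

    private function byValue($a)
    {
        unset($a[234]);
    }

    private function byReference(&$a)
    {
        unset($a[234]);
    }
}

$store  = new Store();
$counts = $store->run();
echo implode(',', $counts), "\n"; // prints 2,1
```

Only the variant that declares its parameter with & removes the entry from the source array.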
Can't do multiline comments, so as an answer:
return &$this->array[$specific_large_sub_array_key]
^
But to also give you an answer to your question:
i.e. Will this... [...] ...affect $array?
Plain and simple: No. Reason: It's a different variable, not an alias (reference).
Why don't the function handling functions like call_user_func() support passing parameters by reference?
The docs say terse things like "Note that the parameters for call_user_func() are not passed by reference." I assume the PHP devs had some kind of reason for disabling that capability in this case.
Were they facing a technical limitation? Was it a language design choice? How did this come about?
EDIT:
In order to clarify this, here is an example.
<?php
function more(&$var){ $var++; }
$count = 0;
print "The count is $count.\n";
more($count);
print "The count is $count.\n";
call_user_func('more', $count);
print "The count is $count.\n";
// Output:
// The count is 0.
// The count is 1.
// The count is 1.
This is functioning normally; call_user_func does not pass $count by reference, even though more() declared it as a referenced variable. The call_user_func documentation clearly says that this is the way it's supposed to work.
I am well aware that I can get the effect I need by using call_user_func_array('more', array(&$count)).
The question is: why was call_user_func designed to work this way? The passing by reference documentation says that "Function definitions alone are enough to correctly pass the argument by reference." The behavior of call_user_func is an exception to that. Why?
The answer is embedded deep down in the way references work in PHP's model - not necessarily the implementation, because that can vary a lot, particularly in the 5.x versions. I'm sure you've heard the lines, they're not like C pointers, or C++ references, etc etc... Basically when a variable is assigned or bound, it can happen in two ways - either by value (in which case the new variable is bound to a new 'box' containing a copy of the old value), or by reference (in which case the new variable is bound to the same value box as the old value). This is true whether we're talking about variables, or function arguments, or cells in arrays.
Things start to get a bit hairy when you start passing references into functions - obviously the intent is to be able to modify the original variables. Quite some time ago, call-time pass-by-reference (the ability to pass a reference into a function that wasn't expecting one) got deprecated, because a function that wasn't aware it was dealing with a reference might 'accidentally' modify the input. Taking it to another level, if that function calls a second function, that itself wasn't expecting a reference... then everything ends up getting disconnected. It might work, but it's not guaranteed, and may break in some PHP version.
This is where call_user_func() comes in. Suppose you pass a reference into it (and get the associated call-time pass-by-reference warning). Then your reference gets bound to a new variable - the parameters of call_user_func() itself. Then when your target function is called, its parameters are not bound where you expect. They're not bound to the original parameters at all. They're bound to the local variables that are in the call_user_func() declaration. call_user_func_array() requires caution too. Putting a reference in an array cell could be trouble - since PHP passes that array with "copy-on-write" semantics, you can't be sure if the array won't get modified underneath you, and the copy won't get detached from the original reference.
The most insightful explanation I've seen (which helped me get my head around references) was in a comment on the PHP 'passing by reference' manual:
http://ca.php.net/manual/en/language.references.pass.php#99549
Basically the logic goes like this. How would you write your own version of call_user_func() ? - and then explain how that breaks with references, and how it fails when you avoid call-time pass-by-reference. In other words, the right way to call functions (specify the value, and let PHP decide from the function declaration whether to pass value or reference) isn't going to work when you use call_user_func() - you're calling two functions deep, the first by value, and the second by reference to the values in the first.
Get your head around this, and you'll have a much deeper understanding of PHP references (and a much greater motivation to steer clear if you can).
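The "write your own call_user_func()" thought experiment can be sketched as follows. my_call() and my_call_ref() are hypothetical userland stand-ins, not how the engine implements call_user_func(); they only illustrate the two-deep binding described above.

```php
<?php
function more(&$var)
{
    $var++;
}

// Hypothetical stand-in for call_user_func(): $arg is this function's own
// local variable, received by value.
function my_call($callback, $arg)
{
    // more() binds its reference to the local $arg, not to the caller's variable.
    $callback($arg);
    return $arg;
}

// Variant that itself declares the parameter by reference, so the chain
// of bindings reaches back to the caller's variable.
function my_call_ref($callback, &$arg)
{
    $callback($arg);
}

$count = 0;
$seenInside   = my_call('more', $count);  // the wrapper's local copy became 1
$afterByValue = $count;                   // the caller's $count is still 0
my_call_ref('more', $count);              // now the caller's $count is 1

echo "$seenInside $afterByValue $count\n"; // prints 1 0 1
```

A generic wrapper cannot know in advance which of its arguments the target wants by reference, which is why the by-value version is the natural one to write.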
See this:
http://hakre.wordpress.com/2011/03/09/call_user_func_array-php-5-3-and-passing-by-reference/
Is it possible to pass parameters by reference using call_user_func_array()?
http://bugs.php.net/bug.php?id=17309&edit=1
Passing references in an array works correctly.
Updated Answer:
You can use:
call_user_func('more', &$count)
to achieve the same effect as:
call_user_func_array('more', array(&$count))
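For this reason I believe (without evidence) that call_user_func is just a compile-time shortcut, i.e. it gets replaced with the latter at compile time. Note that the first form relies on call-time pass-by-reference, which was deprecated in PHP 5.3 and removed in PHP 5.4, where it is a fatal error.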
To give my view on your actual question, "Why was call_user_func designed to work this way?":
It probably falls along the same lines as "Why are some functions named like strstr and others like str_replace?" or "Why do array functions take haystack, needle while string functions take needle, haystack?"
It's because PHP was designed by many different people, over a long period of time, with no strict standards in place at the time.
Original Answer:
You must make sure you set the variable inside the array to a reference as well.
Try this and take note of the array(&$t) part:
function test(&$t) {
$t++;
echo '$t is '.$t.' inside function'.PHP_EOL;
}
$t = 0;
echo '$t is '.$t.' in global scope'.PHP_EOL;
test($t);
$t++;
echo '$t is '.$t.' in global scope'.PHP_EOL;
call_user_func_array('test', array(&$t));
$t++;
echo '$t is '.$t.' in global scope'.PHP_EOL;
Should output:
$t is 0 in global scope
$t is 1 inside function
$t is 2 in global scope
$t is 3 inside function
$t is 4 in global scope
Another possible way - the by-reference syntax stays the 'right' way:
$data = 'some data';
$func = 'more';
$func($data);
function more(&$data) {
// Do something with $data here...
}
With PHP 5 using "copy on write", and passing by reference causing more of a performance penalty than a gain, why should I use pass-by-reference? Other than callback functions that need to return more than one value, or classes whose attributes you want to be alterable without calling a set function later (bad practice, I know), is there a use for it that I am missing?
You use pass-by-reference when you want the function to modify the variable you pass in, and that's all there is to it.
Remember as well that in PHP objects are always pass-by-reference.
Personally I find PHP's system of copying values implicitly (I guess to defend against accidental modification) cumbersome and unintuitive, but then again I started in strongly typed languages, which probably explains that. But I find it interesting that objects differ from PHP's normal operation, and I take it as evidence that PHP's implicit copying mechanism really isn't a good system.
A recursive function that fills an array? I remember writing something like that once.
There's no point in having hundreds of copies of a partially filled array and copying, splicing and joining parts at every turn.
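A minimal sketch of such a recursive filler, assuming a nested array as input. collect_leaves() is a hypothetical helper written for this illustration.

```php
<?php
// Hypothetical helper: appends every leaf value of a nested array to $out.
// $out is taken by reference so all recursion levels fill the same array.
function collect_leaves(array $tree, array &$out)
{
    foreach ($tree as $node) {
        if (is_array($node)) {
            collect_leaves($node, $out);
        } else {
            $out[] = $node;
        }
    }
}

$leaves = array();
collect_leaves(array(1, array(2, array(3, 4)), 5), $leaves);
echo implode(',', $leaves), "\n"; // prints 1,2,3,4,5
```

Without the & on $out, each level would append to its own copy and the caller's array would stay empty unless every call returned and merged its partial result.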
Even when passing objects there is a difference.
Try this example:
class Penguin { }
$a = new Penguin();
function one($a)
{
$a = null;
}
function two(&$a)
{
$a = null;
}
var_dump($a);
one($a);
var_dump($a);
two($a);
var_dump($a);
The result will be:
object(Penguin)#1 (0) {}
object(Penguin)#1 (0) {}
NULL
When you pass a variable containing a reference to an object by reference, you are able to modify the reference to the object.