Unsetting a variable vs setting to '' - php

Is it better form to do one of the following? If not, is one of them faster than the other?
unset($variable);
or to do
$variable = '';

They will do slightly different things:
unset will remove the variable from the symbol table and will decrement the reference count on the contents by 1. References to the variable after that will trigger a notice ("undefined variable"; a warning as of PHP 8). (Note: an object can override the default unset behavior on its properties by implementing __unset().)
Setting to an empty string will decrement the reference count on the contents by 1 and set the contents to a 0-length string, but the symbol will still remain in the symbol table, and you can still reference the variable. (Note: an object can override the default assignment behavior on its properties by implementing __set().)
In older PHP versions, when the ref count falls to 0, the destructor is called and the memory is freed immediately. In newer versions (>= 5.3), PHP uses a buffered scheme that has better handling for cyclical references (http://www.php.net/manual/en/features.gc.collecting-cycles.php), so the memory could possibly be freed later, though it might not be delayed at all. In any case, that doesn't really cause any issues, and the new algorithm prevents certain memory leaks.
If the variable name won't be used again, unset should be a few CPU cycles faster (since new contents don't need to be created). But if the variable name is re-used, PHP would have to create a new variable and symbol table entry, so it could be slower! The difference would be negligible in most situations.
If you want to mark the variable as invalid for later checking, you could set it to false or null. That would be better than testing with isset(), because a typo in the variable name would make isset() return false without any error. You can also pass false and null values to another function and retain the sentinel value, which can't be done with an unset var.
So I would say:
$var = false; ...
if ($var !== false) ...
or
$var = null; ...
if (!is_null($var)) ...
would be better for checking sentinel values than
unset($var); ...
if (isset($var)) ...
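A minimal sketch of the sentinel argument above. The helper describe() is hypothetical and only exists for illustration: a null sentinel keeps the symbol alive and can be handed to another function, while unset() removes the symbol altogether, and isset() alone cannot tell the two states apart.

```php
<?php
// Hypothetical helper, used only for illustration.
function describe($value)
{
    if ($value === null) {
        return 'sentinel: no value';
    }
    return 'value: ' . $value;
}

$var = 'data';
$var = null;                       // mark as invalid, but keep the symbol
$viaSentinel = describe($var);     // the sentinel is passed along intact

$other = 'data';
unset($other);                     // remove the symbol entirely
$varDefined   = array_key_exists('var', get_defined_vars());
$otherDefined = array_key_exists('other', get_defined_vars());

// isset() reports false in both cases, so it cannot distinguish them.
$issetNull  = isset($var);
$issetUnset = isset($other);
```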

Technically, after $test = '' the check
if(isset($test))
returns true, because the variable is still 'set'; it is just set to an empty value.
It will also return true for
if(empty($test))
as it is an empty variable. It just depends on what you are checking for. Generally, though, people tend to check whether a variable isset rather than whether it is empty.
So it is better to just unset it completely.
Also, this is easier to understand
unset($test);
than this
$test = '';
The first immediately tells you that the variable is NO LONGER SET, whereas the latter simply tells you it is set to an empty string. The latter is commonly used when you are going to append to a variable and don't want PHP raising an error about it.
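As a quick check of the isset()/empty() behaviour described in this answer, here is a small sketch (the result variable names are made up for illustration):

```php
<?php
$test = '';
$emptyStringIsSet   = isset($test);   // true: the variable exists and is not null
$emptyStringIsEmpty = empty($test);   // true: '' counts as an "empty" value

unset($test);
$unsetIsSet   = isset($test);         // false: the variable no longer exists
$unsetIsEmpty = empty($test);         // true: empty() does not complain about a missing variable
```

Note that empty() is true in both states, so only isset() separates "set to ''" from "unset".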

You are doing different things. The purpose of unset is to destroy the specified variable in the scope where you call it; your second example simply sets the variable to an empty string.
Unsetting a variable doesn't force immediate memory freeing. If you are concerned about performance, setting the variable to NULL may be a better option, but really, the difference will not be noticeable.
Discussed in the docs:
unset() does just what it's name says - unset a variable. It does not force immediate memory freeing. PHP's garbage collector will do it when it see fits - by intention as soon, as those CPU cycles aren't needed anyway, or as late as before the script would run out of memory, whatever occurs first.
If you are doing $whatever = null; then you are rewriting variable's data. You might get memory freed / shrunk faster, but it may steal CPU cycles from the code that truly needs them sooner, resulting in a longer overall execution time.

I think the most relevant difference is that unsetting a variable communicates that the variable will not be used by subsequent code (it also "enforces" this by reporting an E_NOTICE if you try to use it, as jspcal said that's because it's not in the symbol table anymore).
Therefore, if the empty string is a legal (or sentinel) value for whatever you are doing with your variable, go ahead and set it to ''. Otherwise, if the variable is no longer useful, unsetting it makes for clearer code intent.

They have totally different meanings. The former makes a variable non-existent. The latter just sets its value to the empty string. It doesn't matter which one is "better" so to speak, because they are for totally different things.
Are you trying to clean up memory or something? If so, don't; PHP manages memory for you, so you can leave it lying around and it'll get cleaned up automatically.
If you're not trying to clean up memory, then you need to figure out why you want to unset a variable or set it to empty, and choose the appropriate one. One good sanity check for this: let's say someone inserted the following line of code somewhere after your unset/empty:
if(strcmp($variable, '') == 0) { do_something(); }
And then, later:
if(!isset($variable)) { do_something_else(); }
The first will run do_something() if you set the variable to the empty string. The second will run do_something_else() if you unset the variable. Which of these do you expect to run if your script is behaving properly?
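The sanity check can be tried out with a sketch like the following. The function branches_taken() is a hypothetical stand-in for the two if statements, and an isset() guard is added to the first check (an assumption on my part) so that the unset case does not raise an "undefined variable" error before reaching strcmp().

```php
<?php
// Hypothetical stand-in for the two checks; returns the names of the branches that would run.
function branches_taken(array $vars)
{
    $taken = [];
    if (isset($vars['variable']) && strcmp($vars['variable'], '') == 0) {
        $taken[] = 'do_something';
    }
    if (!isset($vars['variable'])) {
        $taken[] = 'do_something_else';
    }
    return $taken;
}

$variable = '';
$afterEmpty = branches_taken(compact('variable'));   // variable present, set to ''

unset($variable);
$afterUnset = branches_taken([]);                    // variable absent
```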

There is one other 'gotcha' to consider here, the reference.
if you had:
$a = 'foobar';
$variable =& $a;
then to do either of your two alternatives is quite different.
$variable = '';
sets both $variable and $a to the empty string, whereas
unset($variable);
removes the reference link between $a and $variable while removing $variable from the symbol table. This is indeed the only way to unlink $a and $variable without setting $variable to reference something else. Note, e.g., $variable = null; won't do it.
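The three cases from this answer, side by side, as a small sketch (the extra variable names $b, $c, $alias and $link are introduced here only to keep the cases independent):

```php
<?php
$a = 'foobar';
$variable =& $a;
$variable = '';            // writes through the reference
$aAfterAssign = $a;        // $a is now '' as well

$b = 'foobar';
$alias =& $b;
$alias = null;             // still writes through: $b becomes null too
$bAfterNull = $b;

$c = 'foobar';
$link =& $c;
unset($link);              // only removes the name $link; $c keeps its value
$cAfterUnset = $c;
$link = 'changed';         // a brand-new variable, no longer tied to $c
$cAfterReuse = $c;
```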


Best approach with sort in place algorithms [duplicate]

In C++ if you pass a large array to a function, you need to pass it by reference, so that it is not copied to the new function wasting memory. If you don't want it modified you pass it by const reference.
Can anyone verify that passing by reference will save me memory in PHP as well. I know PHP does not use addresses for references like C++ that is why I'm slightly uncertain. That is the question.
The following does not apply to objects, as it has been already stated here. Passing arrays and scalar values by reference will only save you memory if you plan on modifying the passed value, because PHP uses a copy-on-change (aka copy-on-write) policy. For example:
# $array will not be copied, because it is not modified.
function foo($array) {
    echo $array[0];
}
# $array will be copied, because it is modified.
function bar($array) {
    $array[0] += 1;
    echo $array[0] + $array[1];
}
# This is how bar should've been implemented in the first place.
function baz($array) {
    $temp = $array[0] + 1;
    echo $temp + $array[1];
}
# This would also work (passing the array by reference) and $array is not
# copied here, but it has a serious side effect which you may not want:
# the caller's array is modified.
function foobar(&$array) {
    $array[0] += 1;
    echo $array[0] + $array[1];
}
To summarize:
If you are working on a very large array and plan on modifying it inside a function, you actually should use a reference to prevent it from getting copied, which can seriously decrease performance or even exhaust your memory limit.
If it is avoidable though (that is small arrays or scalar values), I'd always use functional-style approach with no side-effects, because as soon as you pass something by reference, you can never be sure what passed variable may hold after the function call, which sometimes can lead to nasty and hard-to-find bugs.
IMHO scalar values should never be passed by reference, because the performance impact can not be that big as to justify the loss of transparency in your code.
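One way to observe the copy-on-write behaviour summarized above is to compare memory use inside the function. This is only a rough sketch: the helper names are hypothetical, and the exact byte counts reported by memory_get_usage() depend on the PHP version and build, so only the relative sizes are meaningful.

```php
<?php
// Hypothetical helpers: each returns how many bytes were allocated since $before.
function read_only(array $array, $before)
{
    $first = $array[0];                       // read: no copy is made
    return memory_get_usage() - $before;
}

function modify_copy(array $array, $before)
{
    $array[0] += 1;                           // write: PHP separates (copies) the array first
    return memory_get_usage() - $before;
}

function modify_in_place(array &$array, $before)
{
    $array[0] += 1;                           // write through the reference: the caller's array changes
    return memory_get_usage() - $before;
}

$big = range(1, 100000);

$readDelta  = read_only($big, memory_get_usage());
$writeDelta = modify_copy($big, memory_get_usage());
$refDelta   = modify_in_place($big, memory_get_usage());

printf(
    "read: %d bytes, write to copy: %d bytes, write by reference: %d bytes\n",
    $readDelta,
    $writeDelta,
    $refDelta
);
```

Only modify_copy() should show a large increase, and only modify_in_place() changes $big as seen by the caller.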
The short answer is use references when you need the functionality that they provide. Don't think of them in terms of memory usage or speed. Pass by reference is always going to be slower if the variable is read only.
Everything is passed by value, including objects. However, it's the handle of the object that is passed, so people often mistakenly call it by-reference because it looks like that.
Then what functionality does it provide? It gives you the ability to modify the variable in the calling scope:
class Bar {}
$bar = new Bar();
function by_val($o) { $o = null; }
function by_ref(&$o) { $o = null; }
by_val($bar); // $bar is still non null
by_ref($bar); // $bar is now null
So if you need such functionality (most often you do not), then use a reference. Otherwise, just pass by value.
Functions that look like this:
$foo = modify_me($foo);
sometimes are good candidates for pass-by-reference, but it should be absolutely clear that the function modifies the passed in variable. (And if such a function is useful, often it's because it really ought to just be part of some class modifying its own private data.)
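To make the "handle is passed by value" point concrete, here is a small sketch. The $name property and the three function names are assumptions added for illustration: changing a property through the copied handle is visible to the caller, but rebinding the parameter is only visible when it is declared with &.

```php
<?php
class Bar
{
    public $name = 'original';
}

// Follows the copied handle to the same object, so the caller sees the change.
function rename_by_val($o)
{
    $o->name = 'renamed';
}

// Rebinds only the local copy of the handle; the caller's variable is untouched.
function replace_by_val($o)
{
    $o = new Bar();
    $o->name = 'replacement';
}

// Rebinds the caller's variable itself.
function replace_by_ref(&$o)
{
    $o = new Bar();
    $o->name = 'replacement';
}

$bar = new Bar();

rename_by_val($bar);
$afterRename = $bar->name;

replace_by_val($bar);
$afterReplaceByVal = $bar->name;

replace_by_ref($bar);
$afterReplaceByRef = $bar->name;
```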
In PHP:
objects are passed by reference [1] -- always
arrays and scalars are passed by value by default, and can be passed by reference using an & in the function's declaration.
For the performance part of your question, PHP doesn't deal with that the same way as C/C++; you should read the following article: Do not use PHP references
[1] Or that's what we usually say -- even if it's not "completely true" -- see Objects and references

How to check if return_value (from php function, inside native extension) is used in php userland?

So i'm writing an extension to act as a wrapper for a certain multithreaded networking library. Now, the pattern used is a simple request-reply.
All I want to do is allocate a zend_string (with zend_string_init) for the reply, and return it via return_value. The thing is, I don't want to explicitly call zend_string_release, because of how php treats zvals.
What I mean by that is: if a certain zval gets to php userland, it will be destroyed and freed after it is no longer used. If it doesn't get there (e.g. a user will use in php something like "myfunc();", instead of "$result = myfunc();" ), I have to destroy it.
I find this to be quite a tricky case. I wonder if there isn't some function or macro or field in the execute_data parameter, that can tell me if the result from my function is used in php or not. If it isn't used, then it means that I don't need to alloc memory for the reply. If it is, I will alloc memory and return it, and it will be freed automatically.
EDIT: if such a mechanism doesn't exist or isn't reliable or isn't best practice to use (as another user pointed out), how should memory be managed?
For instance something like
PHP_FUNCTION(myfunc)
{
    zend_string *dummy = zend_string_init("dummy", sizeof("dummy") - 1, 0);
    RETVAL_STR(dummy);
    return;
}
would cause a memory leak if the user doesn't do something like $var = myfunc() in php, and it would also cause a double free if the user does indeed do
$var = myfunc() and I do something like zend_string_release(dummy) in the RSHUTDOWN() function (assuming i have a pointer to it saved somehow in a global hashtable)
In general, you shouldn't have to worry about handling the lifecycle of a function's return value. The PHP engine handles this value just like any other value in userland. In other words, your conception of the return value "being used" (or "not being used") in userland is not quite correct. In fact, every return value is "used" by the engine whether the userland code assigns the return value to a variable or not. This means PHP's automatic memory management is applied and everything is handled and eventually freed properly.
Underneath the hood, every PHP_FUNCTION is passed a zval having the argument name return_value (of course, the PHP_FUNCTION macro hides this fact). This zval is always initialized to NULL (e.g. ZVAL_NULL(return_value)) before the function is even called. The macro RETVAL_STR(str) evaluates into the call ZVAL_STR(return_value,str), which assigns the zend_string* to the existing return_value zval. Keeping in mind that a zend_string is a reference-counted structure, the assigning zval inherits the initial reference set via zend_string_init. The engine will then handle reference-counting the assigned zend_string* and will eventually call zend_string_release once the zend_string's reference counter hits zero (i.e. it is no longer in use by any zvals). The engine handles destroying the return_value zval after potentially assigning the value to another zval via a variable assignment, function call parameter assignment, etc.
Note: if anyone reading this is more familiar with PHP5 than PHP7 (like myself), then note it is the same concept except the zend_string* is a char*, and in PHP5 the zval itself is reference-counted instead of the zend_string.
So, for example, consider this:
<?php
myfunc();
The return value is handled by the engine in this instance like we'd expect using PHP's reference-counting system. The refcount for the zend_string* peaks at 1 and is decremented once before being destroyed. In other words, the zend_string lives entirely within the return_value zval before getting cleaned up. Now consider this other case:
<?php
$val = myfunc();
echo "Got message: $val\n";
In this case, the return value is still handled in the exact same way using the reference-counting system, albeit for a longer amount of time (i.e. more refcount increments/decrements). The refcount will at least hit 2 when the return value is assigned to $val. When the engine destroys the return_value zval, the refcount will decrement to 1 since the zend_string* is still alive in $val.
I hope this cleared things up. Note that I also answered a very similar question to yours a while back; it may be useful to you as well.
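The same lifecycle can be observed from userland, without writing an extension. The following is only an analogy under stated assumptions: it uses a hypothetical Reply class with a destructor instead of a zend_string, but it shows the engine releasing a discarded return value on its own, and releasing an assigned one once the last variable holding it goes away.

```php
<?php
// Hypothetical class that records when the engine destroys each instance.
class Reply
{
    public static $log = [];
    private $label;

    public function __construct($label)
    {
        $this->label = $label;
    }

    public function __destruct()
    {
        self::$log[] = "destroyed {$this->label}";
    }
}

function make_reply($label)
{
    return new Reply($label);
}

make_reply('unused');                 // result discarded: released as soon as the call completes
$logAfterDiscard = Reply::$log;

$val = make_reply('assigned');        // result kept alive by $val
$logAfterAssign = Reply::$log;

unset($val);                          // last reference gone: the destructor runs now
$logAfterUnset = Reply::$log;
```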

PHP: with an associative array of counters, is it more idiomatic to explicitly initialize each value to 0 before the first increment?

I'm writing some code that builds up associative array of counters. When it encounters a new item for the first time it creates a new key and initializes it to zero. I.e.:
if (!array_key_exists($item, $counters)) {
    $counters[$item] = 0;
}
$counters[$item]++;
However, PHP actually does that first part implicitly. If I just do...
$counters[$item]++;
... then $counters[$item] will evaluate to NULL and be converted to 0 before it's incremented. Obviously the second way is simpler and more concise, but it feels a little sleazy because it's not obvious that $counters[$item] might not exist yet. Is one way or the other preferred in PHP?
For comparison, in Python the idiomatic approach would be to use collections.Counter when you want keys that initialize themselves to 0, and a regular dictionary when you want to initialize them yourself. In PHP you only have the first option.
Incrementing an uninitialized key will generate a PHP Notice (a Warning as of PHP 8), and is a bad idea. You should always initialize first.
However, the use of array_key_exists is not very idiomatic. I know coming from Python it may seem natural, but if you know that $counters has no meaningful NULL values it's more idiomatic to use isset() to test for array membership. (It's also much faster for no reason I can discern!)
This is how I would write a counter in PHP:
$counters = array();
foreach ($thingtobecounted as $item) {
    if (isset($counters[$item])) {
        $counters[$item]++;
    } else {
        $counters[$item] = 1;
    }
}
Unfortunately unlike Python PHP does not provide any way to do this without performing two key lookups.
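For readers on PHP 7.0 or later, the same explicit initialization can be written more compactly with the null coalescing operator. This is a sketch, not part of the original answer; the sample data is made up, and array_count_values() only applies when the items are plain strings or integers.

```php
<?php
$thingtobecounted = ['apple', 'pear', 'apple', 'fig', 'apple'];   // sample data, assumed for illustration

$counters = [];
foreach ($thingtobecounted as $item) {
    // ?? yields 0 for a missing key without raising a notice or warning (PHP 7.0+).
    $counters[$item] = ($counters[$item] ?? 0) + 1;
}

// For plain string or integer values, the built-in does the same job in one call.
$builtin = array_count_values($thingtobecounted);
```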
The first is preferred. The second option will generate a Notice in your logs that $counters[$item] is undefined. It still works, but if you set display_errors = On and error_reporting = E_ALL in your php.ini file you will see these notices in your browser.
The first way is generally how you do it, if for nothing other than simpler maintenance. Remember, you may not be the one maintaining the code. You don't want error logs riddled with correctly operating code. Even worse, you may need to transfer methods to other languages (or earlier versions of PHP) where implicit initialization might not occur.
If you don't really need a check on each array index - or know that most of the indexes will be undefined - why not suppress errors with the @ operator, like this?
(this way you save some performance on initializing [useless] indexes)
if (@!array_key_exists($item, $counters)) {

What does PHP assignment operator do?

I happened to read this: http://code.google.com/speed/articles/optimizing-php.html
It claims that this code
$description = strip_tags($_POST['description']);
echo $description;
should be optimized as below
echo strip_tags($_POST['description']);
However, in my understanding, an assignment in PHP does not necessarily create a copy in memory.
This only has one copy of "abc" in memory.
$a = $b = "abc";
It consumes more memory only when one variable is changed.
$a = $b = "abc";
$a = "xyz";
Is that correct?
"should be optimized as below"
It's only a good idea if you don't need to store it, thereby avoiding unnecessary memory consumption. However, if you need to output the same thing again later, it's better to store it in a variable to avoid another function call.
"Is that correct?"
Yes. It's called copy-on-write.
In the first example, if the variable is only used once then there is no point in making a variable in the first place; just echo the statement's result right away.
In the second example, PHP has something called copy on write. That means that if you have two variables that point to the same thing, they are both just pointing at the same bit of memory. That is until one of the variables is written to, then a copy is made, and the change is made to that copy.
The author does have a point insofar as copying data into a variable will keep that data in memory until the variable is unset. If you do not need the data again later, it's indeed wasted memory.
Otherwise there's no difference at all in peak memory consumption between the two methods, so his reasoning ("copying") is wrong.
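Copy-on-write for strings can be observed roughly as follows. This is a sketch that uses a large str_repeat() string instead of the literal "abc" so the effect is measurable; the exact figures from memory_get_usage() vary by PHP version, so treat the numbers as indicative only.

```php
<?php
$a = str_repeat('x', 1000000);     // roughly 1 MB string
$base = memory_get_usage();

$b = $a;                           // assignment: both names share one string
$afterAssign = memory_get_usage() - $base;

$b .= 'y';                         // write: $b gets its own copy at this point
$afterWrite = memory_get_usage() - $base;

printf("after assignment: %d bytes, after write: %d bytes\n", $afterAssign, $afterWrite);
```

The assignment alone should cost next to nothing; the extra megabyte only appears once $b is modified.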

Freeing objects in PHP

I am new to OOP in PHP (normally write software). I have read that when an object goes out of scope it will be free'd so there is no need to do it manually. However, if I have a script like:
while ($var == 1) {
    $class = new My_Class();
    //Do something
    if ($something) {
        break;
    }
}
This script will loop until $something is true which in my mind will create a lot of instances of $class. Do I need to free it at the end of each iteration? Will the same var name just re-reference itself? If I do need to free it, would unset() suffice?
Thanks.
When you assign a new instance to a variable, the old instance referenced by that variable (if any) has its reference count decreased. In this case the refcount will become zero. Since it is no longer referenced it will be automatically cleaned up.
From PHP 5.3 there is a proper garbage collector that can also handle circular references. It is enabled by default; if it has been turned off, you can enable it by calling gc_enable().
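A small sketch of the circular-reference case the collector exists for. The property name peer is made up for illustration, and the number reported by gc_collect_cycles() is left unstated because it depends on engine internals.

```php
<?php
gc_enable();                       // no effect if the collector is already on, which is the default
$enabled = gc_enabled();

// Two objects that point at each other: their refcounts never reach zero on their own.
$a = new stdClass();
$b = new stdClass();
$a->peer = $b;
$b->peer = $a;

unset($a, $b);                     // the names are gone, but the cycle keeps both objects alive

$collected = gc_collect_cycles();  // force a collection run and report how much was reclaimed
echo "collected: $collected\n";
```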
It shouldn't be necessary to unset() it in this context, as it will be overwritten on each iteration of the loop. Depending on what other actions are taking place in the while loop, it may be preferable to assign the $class outside the loop. Does $class change on each iteration?
$class = new My_Class();
while ($var == 1) {
    // Do something
}
Unless: the loop will be running a very long time; you anticipate a high number of concurrent users; or there are limited resources on the server (i.e. self-host, VPS/shared, etc), then you don't need to worry about it. In any scenario where the script won't be running for very long (less than 5 seconds), anything you try to do to free memory is going to be less effective than PHP's garbage collector.
That said, if you need to clear the reference (because of one of the aforementioned scenarios, or because you like to be tidy), you can set the variable to null or use the unset function. That will remove the reference and PHP's garbage collector will clean it up because there are no more references to it.
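To check that the loop from the question does not pile up instances, here is a sketch with a hypothetical My_Class that counts its live instances (the counters and the fixed iteration count are assumptions added for illustration):

```php
<?php
// Hypothetical class that tracks how many instances are alive at once.
class My_Class
{
    public static $alive = 0;
    public static $peak  = 0;

    public function __construct()
    {
        self::$alive++;
        self::$peak = max(self::$peak, self::$alive);
    }

    public function __destruct()
    {
        self::$alive--;
    }
}

for ($i = 0; $i < 1000; $i++) {
    $class = new My_Class();   // the previous instance is released by this assignment
}

$aliveAfterLoop = My_Class::$alive;   // only the last instance remains
$peak = My_Class::$peak;              // the new object briefly coexists with the old one

unset($class);                        // or: $class = null;
$aliveAfterUnset = My_Class::$alive;
```

The new object is constructed before the old one is released, so at most two instances exist at any moment, regardless of how many iterations run.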
