What does PHP assignment operator do? - php

I happened to read this: http://code.google.com/speed/articles/optimizing-php.html
It claims that this code
$description = strip_tags($_POST['description']);
echo $description;
should be optimized as below
echo strip_tags($_POST['description']);
However, as I understand it, an assignment in PHP does not necessarily create a copy in memory.
This keeps only one copy of "abc" in memory:
$a = $b = "abc";
It consumes more memory only when one variable is changed.
$a = $b = "abc";
$a = "xyz";
Is that correct?

should be optimized as below
It's only a good idea if you don't need to store it, thereby avoiding unnecessary memory consumption. However, if you need to output the same thing again later, it's better to store it in a variable to avoid another function call.
Is that correct?
Yes. It's called copy-on-write.

In the first example, if the variable is only used once, there is no point in creating it in the first place. Just echo the statement's result right away.
In the second example, PHP has something called copy on write. That means that if you have two variables that point to the same thing, they are both just pointing at the same bit of memory. That is until one of the variables is written to, then a copy is made, and the change is made to that copy.

The author does have a point insofar as copying data into a variable will keep that data in memory until the variable is unset. If you do not need the data again later, it's indeed wasted memory.
Otherwise there's no difference at all in peak memory consumption between the two methods, so his reasoning ("copying") is wrong.
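The copy-on-write behaviour described in these answers can be observed with memory_get_usage(). The following is a minimal sketch, not a benchmark: the variable names and the 1 MB string size are arbitrary choices for illustration, and exact byte counts vary by PHP version and build.

```php
<?php
// Build a ~1 MB string at run time.
$a = str_repeat('x', 1000000);

$before = memory_get_usage();
$b = $a;                          // assignment only: both names share one string
$afterAssign = memory_get_usage();
$b .= 'y';                        // first write to $b: a separate copy is made
$afterWrite = memory_get_usage();

$assignCost = $afterAssign - $before;     // expected to be close to zero
$writeCost  = $afterWrite - $afterAssign; // expected to be roughly the string size

echo "assignment:  {$assignCost} bytes\n";
echo "first write: {$writeCost} bytes\n";
```

The assignment itself should cost next to nothing; the extra megabyte only appears once $b is modified.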

Related

PHP behavior under the hood

I was just wondering how PHP works under the hood in this certain scenario. Let's say I have these two pieces of code:
function foo() {
    return 2 * 2;
}
// First.
if (foo()) {
    bar(foo());
}
// Second.
if (($ref = foo())) {
    bar($ref);
}
Now the questions:
In the first case, does PHP create some sort of temporary variable inside the if clause? If so, isn't the second piece of code always the better approach?
Does the second case take more memory? If the answer to the first question is yes, then presumably not?
The two codes are not equivalent, because the first one calls foo() twice (if it returns a truthy value). If it has side effects, such as printing something, they will be done twice. Or if it's dependent on something that can change (e.g. the contents of a file or database), the two calls may return different values. In your example where it just multiplies two numbers, this doesn't happen, but it still means it has to do an extra multiplication, which is unnecessary.
The answer to your questions is:
Yes, it needs to hold the returned value in a temporary memory location so it can test whether it's true or not.
Yes, it uses a little more memory. In the first version, the temporary memory can be reclaimed as soon as the if test is completed. In the second version, it will not be reclaimed until the variable $ref is reassigned or goes out of scope.
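The point about foo() being called twice in the first form can be made visible by giving foo() a side effect. This is a small sketch: the CallCounter class and the body of bar() are hypothetical helpers added only to count invocations, and are not part of the original question.

```php
<?php
// Hypothetical helper that records how often foo() runs.
final class CallCounter
{
    public static $calls = 0;
}

function foo()
{
    CallCounter::$calls++;   // side effect: count every invocation
    return 2 * 2;
}

function bar($value)
{
    return $value;           // placeholder consumer
}

// First form: foo() runs once for the test and once for the argument.
if (foo()) {
    bar(foo());
}
$callsFirst = CallCounter::$calls;

// Second form: foo() runs once and the stored result is reused.
CallCounter::$calls = 0;
if (($ref = foo())) {
    bar($ref);
}
$callsSecond = CallCounter::$calls;

echo "first form: {$callsFirst} calls, second form: {$callsSecond} call\n";
// prints "first form: 2 calls, second form: 1 call"
```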
In the first case, you are calling a function twice, so, if the function is time consuming, it is inefficient. The second case is indeed better since you are saving the result of foo().
In both cases, PHP needs to allocate memory depending on what data foo() generates. That memory will be freed by the garbage collector later on. In terms of memory both cases are pretty much equivalent. Maybe the memory will be released earlier, maybe not, but most likely you won't encounter a case where it matters.
PHP can't reuse the first result as a temporary variable, because it can't be sure that foo()'s return value will always be the same. microtime() and rand(), for example, return different values on each call.
The second example does indeed take more memory, since PHP needs to create the variable and keep the value in memory.
Here is how to test it:
<?php
function foo() {
    return true;
}
function bar($bool) {
    echo memory_get_usage();
}
if (1) {
    // 253632 bytes on my machine
    if (foo()) {
        bar(foo());
    }
} else {
    // 253720 bytes on my machine
    if (($ref = foo())) {
        bar($ref);
    }
}

Create new variable or reassign the old one - php

Which is less expensive in terms of performance, and why? The first case creates a new variable, but in the second case shouldn't PHP first unset $var1 in order to reassign it?
1)
$var1 = $someBigArray;
$var2 = $this->someFunction($var1);
// use $var2
2)
$var1 = $someBigArray;
$var1 = $this->someFunction($var1);
// use $var1
UPDATE
I can't really do the following; I left out the rest of my code to isolate the core part and keep the question simple:
$var1 = $this->someFunction($someBigArray);
There are two things to consider here: processing and memory.
About processing:
In PHP you don't really know the type of a variable in advance. If $someBigArray and the result of $this->someFunction($var1) are not the same type, reassigning will be a lot more expensive than assigning to a new variable $var2. If they are the same type, reassigning is less expensive (less processing).
About memory:
Using the same variable may use less memory, which is a small optimization for RAM.
Memory is cheap; you should be more careful about the processing power you're using. In cases like this, try a benchmark.
PHP's garbage collection is a bit weird and when overriding a variable it isn't (always?) flushed from memory. So memory wise, assign a new variable and use unset() on the old one. You don't really want 30Mb of array wasting away in your memory, especially if this is a script that runs for any length of time or if you ever have multiple of it running at once.
I don't know about processing performance. As Nafis said, try a benchmark. (Simplest and dumbest way, make a script that runs it 1000 times and see which is faster)
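A rough version of the benchmark suggested above might look like the sketch below. It assumes a free function someFunction() standing in for $this->someFunction() (its body is invented for illustration), uses hrtime(), which needs PHP 7.3 or later, and picks the array size and iteration count arbitrarily. Timings depend heavily on the machine and PHP version, so treat the numbers as indicative only.

```php
<?php
// Hypothetical stand-in for $this->someFunction(): returns a modified copy.
function someFunction(array $input): array
{
    $input[] = 0;
    return $input;
}

$someBigArray = range(1, 100000);
$iterations = 100;

// Variant 1: keep the input in $var1 and store the result in $var2.
$start = hrtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $var1 = $someBigArray;
    $var2 = someFunction($var1);
}
$newVariableNs = hrtime(true) - $start;
$resultNew = $var2;

// Variant 2: overwrite $var1 with the result.
$start = hrtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $var1 = $someBigArray;
    $var1 = someFunction($var1);
}
$reassignNs = hrtime(true) - $start;
$resultReassign = $var1;

printf("new variable: %.2f ms\n", $newVariableNs / 1e6);
printf("reassign:     %.2f ms\n", $reassignNs / 1e6);
```

Both variants produce the same result; only the time and the number of live variables differ.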

=& operator, memories

I am very confused about how to use the & operator to reduce memory usage.
Could someone answer the question below?
class C {
    public $a;

    function B(&$a) {
        $this->a = &$a;
        $this->a = $a;
        // Are they the same here, both creating a new variable $this->a?
        // Or will only $this->a = $a create a new variable?
    }
}
$a = 1;
$c = new C;
$c->B($a);
Or maybe my understanding is totally wrong.....
Never ever use references in PHP just to reduce memory load. PHP handles that perfectly with its internal copy on write mechanism. Example:
$a = str_repeat('x', 100000000); // Memory used ~ 100 MB
$b = $a; // Memory used ~ 100 MB
$b = $b . 'x'; // Memory used ~ 200 MB
You should only use references if you know exactly what you are doing and need them for functionality (and that's almost never, so you could as well just forget about them). PHP references are quirky and can result in some unexpected behaviour.
I am very confused about how to use the & operator to reduce memory usage.
If you don't know it, you probably don't need it :) The & is quite useless nowadays because of several enhancements in the PHP core over the last years. Usually you would use & to stop PHP from copying the value into memory allocated for the second variable and instead (in short) let both variables point to the same memory.
But nowadays
Objects are passed by handle anyway, so the callee works on the same object. They don't clone themselves magically just because they are passed to a method ;)
When you pass primitive types, PHP will not copy the value, unless you change the variable (copy-on-write).
To sum it up: the benefits of & already exist as a feature of the core, but without the ugly side effects of the operator.
Value-type variables will only be copied when their value changes. If you only assign, as in your example, the value won't be copied, and the memory footprint will be the same as if you had not used the & operator.
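To illustrate that passing by value does not copy until the callee writes, here is a small sketch. The function names readOnly() and modifies() and the array size are made up for the example, and the exact byte counts depend on the PHP version.

```php
<?php
// Reads the argument without modifying it: no copy should be needed.
function readOnly(array $data): int
{
    $count = count($data);
    return memory_get_usage();
}

// Writes to the argument: PHP has to separate it from the caller's array first.
function modifies(array $data): int
{
    $data[] = 1;
    return memory_get_usage();
}

$big = range(1, 100000);
$baseline = memory_get_usage();

$readCost  = readOnly($big) - $baseline;  // expected to be close to zero
$writeCost = modifies($big) - $baseline;  // expected to be roughly the array size

echo "read-only call: {$readCost} bytes\n";
echo "modifying call: {$writeCost} bytes\n";
```

Neither function takes its parameter by reference, yet only the one that writes to it pays for a copy, and the caller's $big is left untouched.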
I recommend that you read these articles about passing values by reference:
When to pass-by-reference in PHP
When is it good to use pass by reference in PHP?
http://schlueters.de/blog/archives/125-Do-not-use-PHP-references.html
It is considered a micro-optimization and hurts the clarity of the code.

Calling unset() in PHP script

Coming from a C/C++ background, I am used to doing my own garbage collection - i.e. freeing resources after using them (i.e. RAII in C++ land).
I find myself unsetting (mostly ORM) variables after using them. Are there any benefits of this habit?
I remember reading somewhere a while back that unsetting variables marks them for deletion for the attention of PHP's GC, which can help resource usage on the server side. True or false?
[Edit]
I forgot to mention, I am using PHP 5.3, and also most of the unset() calls I make are in a loop where I am processing several 'heavy' ORM variables
I find that if you're having to use unset a lot, you're probably doing it wrong. Let scoping do the "unsetting" for you. Consider the two examples:
1:
$var1 = f( ... );
....
unset( $var1 );
$var2 = g( ... );
....
unset( $var2 );
2:
function scope1()
{
    $var1 = f( ... );
    ....
} //end of function triggers release of $var1
function scope2()
{
    $var2 = g( ... );
    ....
} //end of function triggers release of $var2
scope1();
scope2();
The second example would be preferable because it clearly defines the scope and decreases the risk of leaking variables to global scope (which are only released at the end of the script).
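That scoping releases memory without an explicit unset() can be checked with a short sketch like this one. The function name scoped() and the 1 MB string are assumptions for illustration; exact figures vary between PHP versions.

```php
<?php
function scoped(): int
{
    $big = str_repeat('x', 1000000);  // lives only inside this function
    return memory_get_usage();        // usage while $big is still alive
}

$before = memory_get_usage();
$inside = scoped();
$after  = memory_get_usage();         // $big has gone out of scope by now

echo "held inside the function: " . ($inside - $before) . " bytes\n";
echo "left over after return:   " . ($after - $before) . " bytes\n";
```

The first figure should be about the size of the string and the second close to zero, since the local variable is released when the function returns.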
EDIT:
Another thing to keep in mind is that unset in PHP costs more (CPU) than normal scope-based cleanup. While the difference is small, it goes to show how little emphasis the PHP team puts on unset. If anything, unset should give PHP insight into how to release memory, but it actually adds to execution time. unset is really only a hack to free up variables that are no longer needed; unless you're doing something fairly complex, reusing variables (which acts like a natural unset on the old variable) and scoping should be all you ever need.
function noop( $value ){}
function test1()
{
    $value = "something";
    noop( $value ); //make sure $value isn't optimized out
}
function test2()
{
    $value = "something";
    noop( $value ); //make sure $value isn't optimized out
    unset( $value );
}
$start1 = microtime(true);
for( $i = 0; $i < 1000000; $i++ ) test1();
$end1 = microtime(true);
$start2 = microtime(true);
for( $i = 0; $i < 1000000; $i++ ) test2();
$end2 = microtime(true);
echo "test1 ".($end1 - $start1)."\n"; //test1 0.404934883118
echo "test2 ".($end2 - $start2)."\n"; //test2 0.434437990189
If a very large object is used early in a long script, and there is no opportunity for the object to go out of scope, then unset() might help with memory usage. In most cases, objects go out of scope and they're marked for GC automatically.
Yes, it can, especially when you are dealing with big arrays and the script takes a long time to run.
Without going to look up some proof I'm going to say that it doesn't really matter. Garbage collection occurs automatically when you leave a function or a script ends. So unless you are really strapped for resources, don't bother.
OK, looked something up. Here is a good quote:
"Freeing memory - particularly large amounts - isn't free in terms of processor time, which means that if you want your script to execute as fast as possible at the expense of RAM, you should avoid garbage collection on large variables while it's running, then let PHP do it en masse at the end of the script."
For more info on the subject check out the links provided in the first answer here.
I thought PHP variables were only preserved through the lifetime of your script, so it's unlikely to help that much unless your script is particularly long-running or using a lot of temporary memory in one step.
Clearing explicitly may be slower than letting them all be cleared automatically at the end of the script.
You're adding more code, which is generally going to make things slower unless you know that it helps.
Premature optimization, in any case.
The PHP GC is usually good enough so that you usually do not need to call unset() on simple variables. For objects however, the GC will only destroy them when they leave scope and no other objects refer to them. Unset can help with memory in this case. See http://us3.php.net/manual/en/language.references.unset.php
I have had to use unset when running into memory issues while looping through and making copies of arrays. I would say don't use it unless you are in this situation, as the GC will kick in automatically.

Unsetting a variable vs setting to ''

Is it better form to do one of the following? If not, is one of them faster than the other?
unset($variable);
or to do
$variable = '';
they will do slightly different things:
unset will remove the variable from the symbol table and will decrement the reference count on the contents by 1. references to the variable after that will trigger a notice ("undefined variable"). (note, an object can override the default unset behavior on its properties by implementing __unset()).
setting to an empty string will decrement the reference count on the contents by 1, set the contents to a 0-length string, but the symbol will still remain in the symbol table, and you can still reference the variable. (note, an object can override the default assignment behavior on its properties by implementing __set()).
in older php's, when the ref count falls to 0, the destructor is called and the memory is freed immediately. in newer versions (>= 5.3), php uses a buffered scheme that has better handling for cyclical references (http://www.php.net/manual/en/features.gc.collecting-cycles.php), so the memory could possibly be freed later, tho it might not be delayed at all... in any case, that doesn't really cause any issues and the new algorithm prevents certain memory leaks.
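The cyclical-reference case mentioned above can be sketched as follows. The Node class and makeCycle() function are hypothetical names for illustration; typed properties require PHP 7.4 or later, and gc_enable() is called first in case the collector is switched off in the configuration.

```php
<?php
// Hypothetical class whose instances can point at each other.
final class Node
{
    public ?Node $other = null;
}

function makeCycle(): void
{
    $a = new Node();
    $b = new Node();
    $a->other = $b;
    $b->other = $a;
    // On return, $a and $b go out of scope, but each object is still
    // referenced by the other, so neither reference count reaches zero.
}

gc_enable();
gc_collect_cycles();              // clear anything already pending

makeCycle();
$collected = gc_collect_cycles(); // the cycle collector reclaims the orphaned pair

echo "collected: {$collected}\n";
```

Plain reference counting alone would never free these two objects; the cycle collector introduced in 5.3 is what reclaims them.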
if the variable name won't be used again, unset should be a few cpu cycles faster (since new contents don't need to be created). but if the variable name is re-used, php would have to create a new variable and symbol table entry, so it could be slower! the diff would be a negligible difference in most situations.
if you want to mark the variable as invalid for later checking, you could set it to false or null. that would be better than testing with isset() because a typo in the variable name would return false without any error... you can also pass false and null values to another function and retain the sentinel value, which can't be done with an unset var...
so i would say:
$var = false; ...
if ($var !== false) ...
or
$var = null; ...
if (!is_null($var)) ...
would be better for checking sentinel values than
unset($var); ...
if (isset($var)) ...
Technically, after $test = '', this returns true:
if(isset($test))
because the variable is still 'set'; it is just set to an empty value.
This also returns true:
if(empty($test))
as it is an empty variable. It just depends on what you are checking for. Generally people tend to check whether a variable isset rather than whether it is empty, though.
So it is better to just unset it completely.
Also, this is easier to understand
unset($test);
than this
$test = '';
The first immediately tells you that the variable is NO LONGER SET, whereas the latter simply tells you it is set to an empty string. The latter is commonly used when you are going to add stuff to a variable and don't want PHP erroring on you.
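How isset() and empty() respond to the states discussed in these answers can be tabulated with a short sketch. The $results array is just a convenience added for this example.

```php
<?php
$results = [];

// Set to an empty string: the variable exists but is "empty".
$test = '';
$results['empty string'] = ['isset' => isset($test), 'empty' => empty($test)];

// Set to null: isset() treats null the same as "not set".
$test = null;
$results['null'] = ['isset' => isset($test), 'empty' => empty($test)];

// Unset: the variable no longer exists; neither check raises a notice.
unset($test);
$results['unset'] = ['isset' => isset($test), 'empty' => empty($test)];

foreach ($results as $state => $checks) {
    printf(
        "%-12s isset=%s empty=%s\n",
        $state,
        var_export($checks['isset'], true),
        var_export($checks['empty'], true)
    );
}
```

Only the empty string reports isset() as true; all three states report empty() as true.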
You are doing different things. The purpose of unset is to destroy the specified variable in the context where you call it, while your second example simply sets the variable to an empty string.
Unsetting a variable doesn't force immediate memory freeing, if you are concerned about performance, setting the variable to NULL may be a better option, but really, the difference will be not noticeable...
Discussed in the docs:
unset() does just what it's name says - unset a variable. It does not force immediate memory freeing. PHP's garbage collector will do it when it see fits - by intention as soon, as those CPU cycles aren't needed anyway, or as late as before the script would run out of memory, whatever occurs first.
If you are doing $whatever = null; then you are rewriting variable's data. You might get memory freed / shrunk faster, but it may steal CPU cycles from the code that truly needs them sooner, resulting in a longer overall execution time.
I think the most relevant difference is that unsetting a variable communicates that the variable will not be used by subsequent code (it also "enforces" this by reporting an E_NOTICE if you try to use it, as jspcal said that's because it's not in the symbol table anymore).
Therefore, if the empty string is a legal (or sentinel) value for whatever you are doing with your variable, go ahead and set it to ''. Otherwise, if the variable is no longer useful, unsetting it makes for clearer code intent.
They have totally different meanings. The former makes a variable nonexistent. The latter just sets its value to the empty string. It doesn't matter which one is "better" so to speak, because they are for totally different things.
Are you trying to clean up memory or something? If so, don't; PHP manages memory for you, so you can leave it lying around and it'll get cleaned up automatically.
If you're not trying to clean up memory, then you need to figure out why you want to unset a variable or set it to empty, and choose the appropriate one. One good sanity check for this: let's say someone inserted the following line of code somewhere after your unset/empty:
if(strcmp($variable, '') == 0) { do_something(); }
And then, later:
if(!isset($variable)) { do_something_else(); }
The first will run do_something() if you set the variable to the empty string. The second will run do_something_else() if you unset the variable. Which of these do you expect to run if your script is behaving properly?
There is one other 'gotcha' to consider here, the reference.
if you had:
$a = 'foobar';
$variable =& $a;
then to do either of your two alternatives is quite different.
$variable = '';
sets both $variable and $a to the empty string, whereas
unset($variable);
removes the reference link between $a and $variable while removing $variable from the symbol table. This is indeed the only way to unlink $a and $variable without setting $variable to reference something else. Note, e.g., $variable = null; won't do it.
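The three cases from this answer can be checked side by side. This is a small sketch; the $after... and $variableStillSet names are added only to capture the outcome of each case.

```php
<?php
// Case 1: assigning through the reference changes $a as well.
$a = 'foobar';
$variable =& $a;
$variable = '';
$afterEmptyString = $a;                // '' because both names share one value

// Case 2: unset() only removes the name $variable; $a keeps its value.
$a = 'foobar';
$variable =& $a;
unset($variable);
$afterUnset = $a;                      // 'foobar'
$variableStillSet = isset($variable);  // false

// Case 3: assigning null still writes through the reference.
$a = 'foobar';
$variable =& $a;
$variable = null;
$afterNull = $a;                       // null

var_dump($afterEmptyString, $afterUnset, $variableStillSet, $afterNull);
```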
