I am somewhat new to PHP, and I am wondering:
How important is it to unset variables in PHP?
I know in languages like C, we free the allocated memory to prevent leaks, etc. By using unset on variables when I am done with them, will this significantly increase performance of my applications?
Also, is there a benchmark anywhere that compares the difference between using unset and not using unset?
See this example (and the article I linked below the question):
$x = str_repeat('x', 80000);
echo memory_get_usage() . "<br>\n"; // 120172
echo memory_get_peak_usage() . "<br>\n"; // 121248
$x = str_repeat('x', 80000);
echo memory_get_usage() . "<br>\n"; // 120172
echo memory_get_peak_usage() . "<br>\n"; // 201284
As you can see, at one point PHP had used up almost double the memory. This is because before assigning the 'x'-string to $x, PHP builds the new string in memory, while holding the previous variable in memory, too. This could have been prevented with unsetting $x.
Another example:
for ($i=0; $i<3; $i++) {
$str = str_repeat("Hello", 10000);
echo memory_get_peak_usage(), PHP_EOL;
}
This will output something like
375696
425824
425824
At the first iteration $str is still empty before assignment. On the second iteration $str will hold the generated string though. When str_repeat is then called for the second time, it will not immediately overwrite $str, but first create the string that is to be assigned in memory. So you end up with $str and the value it should be assigned. Double memory. If you unset $str, this will not happen:
for($i=0;$i<3;$i++) {
$str = str_repeat("Hello", 10000);
echo memory_get_peak_usage(), PHP_EOL;
unset($str);
}
// outputs something like
375904
376016
376016
Does it matter? Well, the linked article sums it quite good with
This isn't critical, except when it is.
It doesn't hurt to unset your variables when you no longer need them. Maybe you are on a shared host and want to do some iterating over large datasets. If unsetting would prevent PHP from ending with Allowed memory size of XXXX bytes exhausted, then it's worth the tiny effort.
What should also be taken into account is, that even if the request lifetime is just a second, doubling the memory usage effectively halves the maximum amount of simultaneous requests that can be served. If you are nowhere close to the server's limit anyway, then who cares, but if you are, then a simple unset could save you the money for more RAM or an additional server.
There are many situations in which unset will not actually deallocate much of anything, so its use is generally quite pointless unless the logical flow of your code necessitates its non-existence.
Applications being written in languages like C usually run for many hours. But usual php application's runtime is just about 0.05 sec. So, it is much less important to use unset.
It depends, you should make sure you unsetted very long strings (such as a blog post's content selected from a db).
<?php
$post = get_blog_post();
echo $post;
unset($post);
?>
Related
Let's say I have $variable holding more than 500 kb info.
while ($row = mysqli_fetch_assoc($selectFromTable))
{
$variable .= "<p>$row[info]</p>";
}
or
while ($row = mysqli_fetch_assoc($selectFromTable))
{
echo "<p>$row[info]</p>";
}
Optimization wise, is it better to echo the info right away than saving it to a variable?
I can't decide because I can't see the difference in performance because I don't know what tool to use in monitoring the response time. Any suggestion?
Even though there is not enough difference in performance, I still wanted to learn on how can I optimize my coding.
There is no significant difference in speed or memory usage between the two pieces of code you listed. They both build a new string that contains the value of $row['info'] enclosed in a <p> HTML element.
You can pass each string as an individual argument to echo:
echo "<p>", $row['info'], "</p>";
This avoids the creation of a new string, uses less memory and runs slightly faster (the improvement speed is not significant unless you do it thousands of times in a loop).
Read about the echo language construct.
Also please note that $row[info] is not correct. The correct way is $row['info']. It is explained in the documentation why.
You need to do something with variable, instead of just saving some data in it in the first loop.
With your current setup first loop with only variable storage will always be faster as operation with IO (input/output) devices are slow that means echo to print output in screen.
But if you add an echo after the variable statement then, loop with only single echo will obviously be faster.
I got to wondering about the efficiency of this:
I have a csv file with about 200 rows in it, I use a class to filter/break up the csv and get the bits I want. It is cached daily.
I found that many descriptions (can be up to ~500 chars each) have a hanging word "Apply" and it needs chopping off.
Thinking that calling toString() on my object more than once would be bad practice, I created a temp var : $UJM_desc (this code is inside a loop)
// mad hanging 'Apply' in `description` very often, cut it off
$UJM_desc = $description->toString();
$hanging = substr($UJM_desc, -5);
if($hanging == "Apply")
$UJM_desc = substr($UJM_desc, 0 , -5);
$html .= '<p>' . $UJM_desc ;
But could have just called $description->toString() a couple of times, I am aware there is room to simplify this maybe with a ternary, but still, I froze the moment and thought I'd ask.
Call a method twice or use a temp var? Which is best?
I'd just use regex to strip off the end:
$html .= '<p>' . preg_replace('/Apply$/', '', $description->toString());
That said, if $description->toString() gives the same output no matter where you use it, there's absolutely no reason to call it multiple times, and a temporary variable will be the most efficient.
There's also no reason to save $hanging to a variable, as you only use it once.
In general, it depends, and it's a tradeoff.
Keeping a calculated value in a variable takes up memory, and runs the risk of containing stale data.
Calculating the value anew might be slow, or expensive in some other way.
So it's a matter of deciding which resource is most important to you.
In this case, however, the temporary variable is so short-lived, it's definitely worth using.
I'm curious about using of unset() language construct just about everywhere, where I took memory or declare some variables (regardless of structure).
I mean, when somebody declares variable, when should it really be left for GC, or be unset()?
Example 1:
<?php
$buffer = array(/* over 1000 elements */);
// 1) some long code, that uses $buffer
// 2) some long code, that does not use $buffer
?>
Is there any chance, that $buffer might affect performance of point 2?
Am I really need (or should) to do unset($buffer) before entering point 2?
Example 2:
<?php
function someFunc(/* some args */){
$buffer = new VeryLargeObject();
// 1) some actions with $buffer methods and properties
// 2) some actions without usage of $buffer
return $something;
}
?>
Am I really need (or should) to do unset($buffer) within someFunc()s body before entering point 2?
Will GC free all allocated memory (references and objects included) within someFunc()s scope, when function will come to an end or will find return statement?
I'm interested in technical explaination, but code style suggestions are welcome too.
Thanks.
In php, all memory gets cleaned up after script is finished, and most of the time it's enough.
From php.net:
unset() does just what it's name says - unset a variable. It does not
force immediate memory freeing. PHP's garbage collector will do it
when it see fits - by intention as soon, as those CPU cycles aren't
needed anyway, or as late as before the script would run out of
memory, whatever occurs first.
If you are doing $whatever = null; then you are rewriting variable's
data. You might get memory freed / shrunk faster, but it may steal CPU
cycles from the code that truly needs them sooner, resulting in a
longer overall execution time.
In reality you would use unset() for cleaning memory pretty rare, and it's described good in this post:
https://stackoverflow.com/a/2617786/1870446
By doing an unset() on a variable, you mark the variable for being "garbage collected" so the memory isn't immediately available. The variable does not have the data anymore, but the stack remains at the larger size.
In PHP >= 5.3.0, you can call gc_collect_cycles() to force a GC pass. (after doing gc_enable() first).
But you must understand that PHP is script language, it's not Java so you shouldn't consider it like one. If your script is really that heavy to use tons of RAM - you can use unset and when script is close to exceed the memory - GC will trigger and clean up everything useless, including your unset variables. But in most cases you can forget about it.
Also, if you would want to go for unsetting every variable you do not use - don't. It will actually make your script execute longer - by using more CPU cycles - for the sake of getting free memory that would, in most cases, would never be needed.
Some people also say that they use unset to explicitly show that they won't use variable anymore. I find it a bad practice too, for me it just makes code more verbose with all these useless unsets.
foreach(explode(',' $foo) as $bar) { ... }
vs
$test = explode(',' $foo);
foreach($test as $bar) { ... }
In the first example, does it explode the $foo string for each iteration or does PHP keep it in memory exploded in its own temporary variable? From an efficiency point of view, does it make sense to create the extra variable $test or are both pretty much equal?
I could make an educated guess, but let's try it out!
I figured there were three main ways to approach this.
explode and assign before entering the loop
explode within the loop, no assignment
string tokenize
My hypotheses:
probably consume more memory due to assignment
probably identical to #1 or #3, not sure which
probably both quicker and much smaller memory footprint
Approach
Here's my test script:
<?php
ini_set('memory_limit', '1024M');
$listStr = 'text';
$listStr .= str_repeat(',text', 9999999);
$timeStart = microtime(true);
/*****
* {INSERT LOOP HERE}
*/
$timeEnd = microtime(true);
$timeElapsed = $timeEnd - $timeStart;
printf("Memory used: %s kB\n", memory_get_peak_usage()/1024);
printf("Total time: %s s\n", $timeElapsed);
And here are the three versions:
1)
// explode separately
$arr = explode(',', $listStr);
foreach ($arr as $val) {}
2)
// explode inline-ly
foreach (explode(',', $listStr) as $val) {}
3)
// tokenize
$tok = strtok($listStr, ',');
while ($tok = strtok(',')) {}
Results
Conclusions
Looks like some assumptions were disproven. Don't you love science? :-)
In the big picture, any of these methods is sufficiently fast for a list of "reasonable size" (few hundred or few thousand).
If you're iterating over something huge, time difference is relatively minor but memory usage could be different by an order of magnitude!
When you explode() inline without pre-assignment, it's a fair bit slower for some reason.
Surprisingly, tokenizing is a bit slower than explicitly iterating a declared array. Working on such a small scale, I believe that's due to the call stack overhead of making a function call to strtok() every iteration. More on this below.
In terms of number of function calls, explode()ing really tops tokenizing. O(1) vs O(n)
I added a bonus to the chart where I run method 1) with a function call in the loop. I used strlen($val), thinking it would be a relatively similar execution time. That's subject to debate, but I was only trying to make a general point. (I only ran strlen($val) and ignored its output. I did not assign it to anything, for an assignment would be an additional time-cost.)
// explode separately
$arr = explode(',', $listStr);
foreach ($arr as $val) {strlen($val);}
As you can see from the results table, it then becomes the slowest method of the three.
Final thought
This is interesting to know, but my suggestion is to do whatever you feel is most readable/maintainable. Only if you're really dealing with a significantly large dataset should you be worried about these micro-optimizations.
In the first case, PHP explodes it once and keeps it in memory.
The impact of creating a different variable or the other way would be negligible. PHP Interpreter would need to maintain a pointer to a location of next item whether they are user defined or not.
From the point of memory it will not make a difference, because PHP uses the copy on write concept.
Apart from that, I personally would opt for the first option - it's a line less, but not less readable (imho!).
Efficiency in what sense? Memory management, or processor? Processor wouldn't make a difference, for memory - you can always do $foo = explode(',', $foo)
Coming from a C/C++ background, I am used to doing my own garbage collection - i.e. freeing resources after using them (i.e. RAII in C++ land).
I find myself unsetting (mostly ORM) variables after using them. Are there any benefits of this habit?
I remeber reading somewhere a while back, that unsetting variables marks them for deletion for the attention of PHP's GC - which can help resource usage on the server side - true or false?
[Edit]
I forgot to mention, I am using PHP 5.3, and also most of the unset() calls I make are in a loop where I am processing several 'heavy' ORM variables
I find that if your having to unset use alot your probably doing it wrong. Let scoping doing the "unsetting" for you. Consider the two examples:
1:
$var1 = f( ... );
....
unset( $var1 );
$var2 = g( ... );
....
unset( $var2 );
2:
function scope1()
{
$var1 = f( ... );
....
} //end of function triggers release of $var1
function scope2()
{
$var2 = g( ... );
....
} //end of function triggers release of $var2
scope1();
scope2();
The second example I would be preferable because it clearly defines the scope and decreases the risk of leaking variables to global scope (which are only released at the end of the script).
EDIT:
Another things to keep in mind is the unset in PHP costs more (CPU) than normal scope garbage collection. While the difference is small, it goes to show how little of an emphases the PHP team puts on unset. If anything unset should give PHP insight that on how to release memory, but it actually adds to execution time. unset is really only a hack to free up variables that are no longer needed, unless your doing something fairly complex, reusing variables (which acts like a natural unset on the old variable) and scoping should be all you ever need.
function noop( $value ){}
function test1()
{
$value = "something";
noop( $value ); //make sure $value isn't optimized out
}
function test2()
{
$value = "something";
noop( $value ); //make sure $value isn't optimized out
unset( $value );
}
$start1 = microtime(true);
for( $i = 0; $i < 1000000; $i++ ) test1();
$end1 = microtime(true);
$start2 = microtime(true);
for( $i = 0; $i < 1000000; $i++ ) test2();
$end2 = microtime(true);
echo "test1 ".($end1 - $start1)."\n"; //test1 0.404934883118
echo "test2 ".($end2 - $start2)."\n"; //test2 0.434437990189
If a very large object is used early in a long script, and there is no opportunity for the object to go out of scope, then unset() might help with memory usage. In most cases, objects go out of scope and they're marked for GC automatically.
Yes it can especially when you are dealing with big arrays, and script require much time to run.
Without going to look up some proof I'm going to say that it doesn't really matter. Garbage collection occurs automatically when you leave a function or a script ends. So unless you are really strapped for resources, don't bother.
OK, looked something up. Here is a good quote:
"Freeing memory - particularly large
amounts - isn't free in terms of
processor time, which means that if
you want your script to execute as
fast as possible at the expense of
RAM, you should avoid garbage
collection on large variables while
it's running, then let PHP do it en
masse at the end of the script."
For more info on the subject check out the links provided in the first answer here.
I thought PHP variables were only preserved through the lifetime of your script, so it's unlikely to help that much unless your script is particularly long-running or using a lot of temporary memory in one step.
Clearing explicitly may be slower than letting them all be automatically cleared at startup.
You're adding more code, which is generally going to make thing slower unless you know that it helps.
Premature optimization, in any case.
The PHP GC is usually good enough so that you usually do not need to call unset() on simple variables. For objects however, the GC will only destroy them when they leave scope and no other objects refer to them. Unset can help with memory in this case. See http://us3.php.net/manual/en/language.references.unset.php
I have had to use unset when you are running into memory issues when looping through and making copies of arrays. I would say don't use it unless you are in this situation ad the GC will kick in automatically.