What is less expensive in terms of performance and why ? Though for the first case it creates new variable, but in second case should not it first unset the var1 to reassign it ?
1)
$var1 = $someBigArray;
$var2 = $this->someFunction($var1);
// use $var2
2)
$var1 = $someBigArray;
$var1 = $this->someFunction($var1);
// user $var1
UPDATE
I cant really do this, I just excluded the rest of my code, asking the core part and making it look simpler
$var1 = $this->someFunction($someBigArray);
There you have two things, One is processing other is memory.
About Processing:
In PHP you really don't know what is the type of the variable. if the variable type of $someBigArray and $this->someFunction($var1) are not the same, it will be lot more expensive then assigning a new variable $var2. if they are same type($someBigArray and $this->someFunction($var1)), then it is less expensive.(less processing)
About memory:
using the same variable will may use less memory. small optimization for the RAM.
Memory is cheap, You should be more careful about the processing power your using. In these cases, try to do a benchmark.
PHP's garbage collection is a bit weird and when overriding a variable it isn't (always?) flushed from memory. So memory wise, assign a new variable and use unset() on the old one. You don't really want 30Mb of array wasting away in your memory, especially if this is a script that runs for any length of time or if you ever have multiple of it running at once.
I don't know about processing performance. As Nafis said, try a benchmark. (Simplest and dumbest way, make a script that runs it 1000 times and see which is faster)
Related
I have read several times that, in order to invoke the garbage collector and actually clean the RAM used by a variable, you have to assign a new value (e.g. NULL) instead of simply unset() it.
This code, however, shows that the memory allocated for the array $a is not cleaned after the NULL assignment.
function build_array()
{
for ($i=0;$i<10000;$i++){
$a[$i]=$i;
}
$i=null;
return $a;
}
echo '<p>'.memory_get_usage(true);
$a = build_array();
echo '<p>'.memory_get_usage(true);
$a = null;
echo '<p>'.memory_get_usage(true);
The output I get is:
262144
1835008
786432
So part of the memory is cleaned, but not all the memory. How can I completely clean the RAM?
You have no way to definitely clear a variable from the memory with PHP.
Its is up to PHP garbage collector to do that when it sees that it should.
Fortunately, PHP garbage collector may not be perfect but it is one of the best features of PHP. If you do things as per the PHP documentation there's no reasons to have problems.
If you have a realistic scenario where it may be a problem, post the scenario here or report it to PHP core team.
Other unset() is the best way to clear vars.
Point is that memory_get_usage(true) shows the memory allocated to your PHP process, not the amount actually in use. System could free unused part once it is required somewhere.
More details on memory_get_usage could be found there
If you run that with memory_get_usage(false), you will see that array was actually gc'd. example
I'm curious about using of unset() language construct just about everywhere, where I took memory or declare some variables (regardless of structure).
I mean, when somebody declares variable, when should it really be left for GC, or be unset()?
Example 1:
<?php
$buffer = array(/* over 1000 elements */);
// 1) some long code, that uses $buffer
// 2) some long code, that does not use $buffer
?>
Is there any chance, that $buffer might affect performance of point 2?
Am I really need (or should) to do unset($buffer) before entering point 2?
Example 2:
<?php
function someFunc(/* some args */){
$buffer = new VeryLargeObject();
// 1) some actions with $buffer methods and properties
// 2) some actions without usage of $buffer
return $something;
}
?>
Am I really need (or should) to do unset($buffer) within someFunc()s body before entering point 2?
Will GC free all allocated memory (references and objects included) within someFunc()s scope, when function will come to an end or will find return statement?
I'm interested in technical explaination, but code style suggestions are welcome too.
Thanks.
In php, all memory gets cleaned up after script is finished, and most of the time it's enough.
From php.net:
unset() does just what it's name says - unset a variable. It does not
force immediate memory freeing. PHP's garbage collector will do it
when it see fits - by intention as soon, as those CPU cycles aren't
needed anyway, or as late as before the script would run out of
memory, whatever occurs first.
If you are doing $whatever = null; then you are rewriting variable's
data. You might get memory freed / shrunk faster, but it may steal CPU
cycles from the code that truly needs them sooner, resulting in a
longer overall execution time.
In reality you would use unset() for cleaning memory pretty rare, and it's described good in this post:
https://stackoverflow.com/a/2617786/1870446
By doing an unset() on a variable, you mark the variable for being "garbage collected" so the memory isn't immediately available. The variable does not have the data anymore, but the stack remains at the larger size.
In PHP >= 5.3.0, you can call gc_collect_cycles() to force a GC pass. (after doing gc_enable() first).
But you must understand that PHP is script language, it's not Java so you shouldn't consider it like one. If your script is really that heavy to use tons of RAM - you can use unset and when script is close to exceed the memory - GC will trigger and clean up everything useless, including your unset variables. But in most cases you can forget about it.
Also, if you would want to go for unsetting every variable you do not use - don't. It will actually make your script execute longer - by using more CPU cycles - for the sake of getting free memory that would, in most cases, would never be needed.
Some people also say that they use unset to explicitly show that they won't use variable anymore. I find it a bad practice too, for me it just makes code more verbose with all these useless unsets.
I happens to read this http://code.google.com/speed/articles/optimizing-php.html
It claims that this code
$description = strip_tags($_POST['description']);
echo $description;
should be optimized as below
echo strip_tags($_POST['description']);
However, in my understanding, assignment operation in PHP is not necessarily create a copy in memory.
This only have one copy of "abc" in memory.
$a = $b = "abc";
It consumes more memory only when one variable is changed.
$a = $b = "abc";
$a = "xyz";
Is that correct?
should be optimized as below
It's only a good idea if you don't need to store it, thereby avoiding unnecessary memory consumption. However, if you need to output the same thing again later, it's better to store it in a variable to avoid a another function call.
Is that correct?
Yes. It's called copy-on-write.
In the first example, if the variable is only used once then there is not point of making a variable in the first place, just echo the statements result right away, there is no need for the variable.
In the second example, PHP has something called copy on write. That means that if you have two variables that point to the same thing, they are both just pointing at the same bit of memory. That is until one of the variables is written to, then a copy is made, and the change is made to that copy.
The author does have a point insofar as copying data into a variable will keep that data in memory until the variable is unset. If you do not need the data again later, it's indeed wasted memory.
Otherwise there's no difference at all in peak memory consumption between the two methods, so his reasoning ("copying") is wrong.
Coming from a C/C++ background, I am used to doing my own garbage collection - i.e. freeing resources after using them (i.e. RAII in C++ land).
I find myself unsetting (mostly ORM) variables after using them. Are there any benefits of this habit?
I remeber reading somewhere a while back, that unsetting variables marks them for deletion for the attention of PHP's GC - which can help resource usage on the server side - true or false?
[Edit]
I forgot to mention, I am using PHP 5.3, and also most of the unset() calls I make are in a loop where I am processing several 'heavy' ORM variables
I find that if your having to unset use alot your probably doing it wrong. Let scoping doing the "unsetting" for you. Consider the two examples:
1:
$var1 = f( ... );
....
unset( $var1 );
$var2 = g( ... );
....
unset( $var2 );
2:
function scope1()
{
$var1 = f( ... );
....
} //end of function triggers release of $var1
function scope2()
{
$var2 = g( ... );
....
} //end of function triggers release of $var2
scope1();
scope2();
The second example I would be preferable because it clearly defines the scope and decreases the risk of leaking variables to global scope (which are only released at the end of the script).
EDIT:
Another things to keep in mind is the unset in PHP costs more (CPU) than normal scope garbage collection. While the difference is small, it goes to show how little of an emphases the PHP team puts on unset. If anything unset should give PHP insight that on how to release memory, but it actually adds to execution time. unset is really only a hack to free up variables that are no longer needed, unless your doing something fairly complex, reusing variables (which acts like a natural unset on the old variable) and scoping should be all you ever need.
function noop( $value ){}
function test1()
{
$value = "something";
noop( $value ); //make sure $value isn't optimized out
}
function test2()
{
$value = "something";
noop( $value ); //make sure $value isn't optimized out
unset( $value );
}
$start1 = microtime(true);
for( $i = 0; $i < 1000000; $i++ ) test1();
$end1 = microtime(true);
$start2 = microtime(true);
for( $i = 0; $i < 1000000; $i++ ) test2();
$end2 = microtime(true);
echo "test1 ".($end1 - $start1)."\n"; //test1 0.404934883118
echo "test2 ".($end2 - $start2)."\n"; //test2 0.434437990189
If a very large object is used early in a long script, and there is no opportunity for the object to go out of scope, then unset() might help with memory usage. In most cases, objects go out of scope and they're marked for GC automatically.
Yes it can especially when you are dealing with big arrays, and script require much time to run.
Without going to look up some proof I'm going to say that it doesn't really matter. Garbage collection occurs automatically when you leave a function or a script ends. So unless you are really strapped for resources, don't bother.
OK, looked something up. Here is a good quote:
"Freeing memory - particularly large
amounts - isn't free in terms of
processor time, which means that if
you want your script to execute as
fast as possible at the expense of
RAM, you should avoid garbage
collection on large variables while
it's running, then let PHP do it en
masse at the end of the script."
For more info on the subject check out the links provided in the first answer here.
I thought PHP variables were only preserved through the lifetime of your script, so it's unlikely to help that much unless your script is particularly long-running or using a lot of temporary memory in one step.
Clearing explicitly may be slower than letting them all be automatically cleared at startup.
You're adding more code, which is generally going to make thing slower unless you know that it helps.
Premature optimization, in any case.
The PHP GC is usually good enough so that you usually do not need to call unset() on simple variables. For objects however, the GC will only destroy them when they leave scope and no other objects refer to them. Unset can help with memory in this case. See http://us3.php.net/manual/en/language.references.unset.php
I have had to use unset when you are running into memory issues when looping through and making copies of arrays. I would say don't use it unless you are in this situation ad the GC will kick in automatically.
In a PHP program, I sequentially read a bunch of files (with file_get_contents), gzdecode them, json_decode the result, analyze the contents, throw most of it away, and store about 1% in an array.
Unfortunately, with each iteration (I traverse over an array containing the filenames), there seems to be some memory lost (according to memory_get_peak_usage, about 2-10 MB each time). I have double- and triple-checked my code; I am not storing unneeded data in the loop (and the needed data hardly exceeds about 10MB overall), but I am frequently rewriting (actually, strings in an array). Apparently, PHP does not free the memory correctly, thus using more and more RAM until it hits the limit.
Is there any way to do a forced garbage collection? Or, at least, to find out where the memory is used?
it has to do with memory fragmentation.
Consider two strings, concatenated to one string. Each original must remain until the output is created. The output is longer than either input.
Therefore, a new allocation must be made to store the result of such a concatenation. The original strings are freed but they are small blocks of memory.
In a case of 'str1' . 'str2' . 'str3' . 'str4' you have several temps being created at each . -- and none of them fit in the space thats been freed up. The strings are likely not laid out in contiguous memory (that is, each string is, but the various strings are not laid end to end) due to other uses of the memory. So freeing the string creates a problem because the space can't be reused effectively. So you grow with each tmp you create. And you don't re-use anything, ever.
Using the array based implode, you create only 1 output -- exactly the length you require. Performing only 1 additional allocation. So its much more memory efficient and it doesn't suffer from the concatenation fragmentation. Same is true of python. If you need to concatenate strings, more than 1 concatenation should always be array based:
''.join(['str1','str2','str3'])
in python
implode('', array('str1', 'str2', 'str3'))
in PHP
sprintf equivalents are also fine.
The memory reported by memory_get_peak_usage is basically always the "last" bit of memory in the virtual map it had to use. So since its always growing, it reports rapid growth. As each allocation falls "at the end" of the currently used memory block.
In PHP >= 5.3.0, you can call gc_collect_cycles() to force a GC pass.
Note: You need to have zend.enable_gc enabled in your php.ini enabled, or call gc_enable() to activate the circular reference collector.
Found the solution: it was a string concatenation. I was generating the input line by line by concatenating some variables (the output is a CSV file). However, PHP seems not to free the memory used for the old copy of the string, thus effectively clobbering RAM with unused data. Switching to an array-based approach (and imploding it with commas just before fputs-ing it to the outfile) circumvented this behavior.
For some reason - not obvious to me - PHP reported the increased memory usage during json_decode calls, which mislead me to the assumption that the json_decode function was the problem.
There's a way.
I had this problem one day. I was writing from a db query into csv files - always allocated one $row, then reassigned it in the next step. Kept running out of memory. Unsetting $row didn't help; putting an 5MB string into $row first (to avoid fragmentation) didn't help; creating an array of $row-s (loading many rows into it + unsetting the whole thing in every 5000th step) didn't help. But it was not the end, to quote a classic.
When I made a separate function that opened the file, transferred 100.000 lines (just enough not to eat up the whole memory) and closed the file, THEN I made subsequent calls to this function (appending to the existing file), I found that for every function exit, PHP removed the garbage. It was a local-variable-space thing.
TL;DR
When a function exits, it frees all local variables.
If you do the job in smaller portions, like 0 to 1000 in the first function call, then 1001 to 2000 and so on, then every time the function returns, your memory will be regained. Garbage collection is very likely to happen on return from a function. (If it's a relatively slow function eating a lot of memory, we can safely assume it always happens.)
Side note: for reference-passed variables it will obviously not work; a function can only free its inside variables that would be lost anyway on return.
I hope this saves your day as it saved mine!
I've found that PHP's internal memory manager is most-likely to be invoked upon completion of a function. Knowing that, I've refactored code in a loop like so:
while (condition) {
// do
// cool
// stuff
}
to
while (condition) {
do_cool_stuff();
}
function do_cool_stuff() {
// do
// cool
// stuff
}
EDIT
I ran this quick benchmark and did not see an increase in memory usage. This leads me to believe the leak is not in json_decode()
for($x=0;$x<10000000;$x++)
{
do_something_cool();
}
function do_something_cool() {
$json = '{"a":1,"b":2,"c":3,"d":4,"e":5}';
$result = json_decode($json);
echo memory_get_peak_usage() . PHP_EOL;
}
I was going to say that I wouldn't necessarily expect gc_collect_cycles() to solve the problem - since presumably the files are no longer mapped to zvars. But did you check that gc_enable was called before loading any files?
I've noticed that PHP seems to gobble up memory when doing includes - much more than is required for the source and the tokenized file - this may be a similar problem. I'm not saying that this is a bug though.
I believe one workaround would be not to use file_get_contents but rather fopen()....fgets()...fclose() rather than mapping the whole file into memory in one go. But you'd need to try it to confirm.
HTH
C.
Call memory_get_peak_usage() after each statement, and ensure you unset() everything you can. If you are iterating with foreach(), use a referenced variable to avoid making a copy of the original (foreach()).
foreach( $x as &$y)
If PHP is actually leaking memory a forced garbage collection won't make any difference.
There's a good article on PHP memory leaks and their detection at IBM
There recently was a similar issue with System_Daemon. Today I isolated my problem to file_get_contents.
Could you try using fread instead? I think this may solve your problem.
If it does, it's probably time to do a bugreport over at PHP.