I have a PHP script that scrapes a website (text files only). After it had run for a few hours, I noticed that the script stopped because it reached the memory limit. I know I can increase the limit, but since the files the script loads are only HTML files, the only explanation I have for hitting the limit is that the script fails to release memory after each loop iteration. Could I optimize my script's memory management by flush()ing its memory regularly?
In general, you shouldn't need to manually manage memory in PHP, as it has a high-level Memory Manager built in to the Zend Engine which takes care of this for you. However, it is useful to know a bit about how this works in order to better understand why your code is running out of memory.
As a very basic overview, PHP frees memory based on a "refcount" of how many variables are referencing a particular piece of data. So if you say $a = 'hello'; $b = $a;, a single piece of memory containing the string 'hello' will have a refcount of 2. If you call unset() on either variable, or they fall out of scope (e.g. at the end of the function they were defined in), the refcount will decrease. Once the refcount reaches zero, the data will be deleted and the memory freed. Note that "freed" in this case means freed for use by other parts of that PHP script, not necessarily freed back to the Operating System for use by other processes.
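To make the refcount description concrete, here is a minimal sketch that watches memory_get_usage() while two variables share one string. The variable names are only for illustration, and the exact byte counts will vary by PHP version and build.

```php
<?php
// Illustrative only: byte counts differ between PHP versions and builds.
$before = memory_get_usage();

$a = str_repeat('x', 1024 * 1024); // one string of about 1 MB, refcount 1
$b = $a;                            // same string, refcount 2, nothing is copied
$withTwoNames = memory_get_usage();

unset($a);                          // refcount drops to 1, the string stays alive
$withOneName = memory_get_usage();

unset($b);                          // refcount drops to 0, memory goes back to PHP's pool
$afterRelease = memory_get_usage();

printf("two names:     %d bytes above baseline\n", $withTwoNames - $before);
printf("one name left: %d bytes above baseline\n", $withOneName - $before);
printf("none left:     %d bytes above baseline\n", $afterRelease - $before);
```

The first unset() should leave the usage essentially unchanged; only the second one gives the memory back to the script.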
There are a few differences between PHP versions worth knowing:
The reference counting mechanism described above doesn't work if you have circular references (e.g. $obj1->foo = $obj2; $obj2->bar = $obj1;) because the reference count never reaches zero. In PHP 5.2 and earlier, this meant that such circular references led to memory leaks, and had to be manually handled by the programmer. In PHP 5.3, a "Garbage Collector" was added specifically to handle this case. It does not replace the normal refcount mechanism, but if circular references are common in your code, it may be worth reading up on.
PHP 5.4 included a large number of optimizations to the way PHP allocates and uses memory. AFAIK, none of these change the fundamental recommendations of how to write efficient code, they are just a good reason to upgrade your PHP version if you can.
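The circular-reference case can be sketched as follows, assuming PHP 5.3 or later so that gc_enable() and gc_collect_cycles() exist. The function name makeCycle() and the payload property are made up for the example.

```php
<?php
// Requires PHP 5.3+ for the cycle collector.
gc_enable();

function makeCycle()
{
    $obj1 = new stdClass();
    $obj2 = new stdClass();
    $obj1->foo = $obj2;
    $obj2->bar = $obj1;
    $obj1->payload = str_repeat('x', 1024 * 1024); // makes the leak easy to see
    // Both locals go out of scope here, but each object still references the other,
    // so neither refcount reaches zero.
}

gc_collect_cycles();                       // start from a clean state
$before = memory_get_usage();

makeCycle();
$heldByCycle = memory_get_usage() - $before;

$collected = gc_collect_cycles();          // ask the collector to find the orphaned cycle
$afterCollect = memory_get_usage() - $before;

printf("held by unreachable cycle: %d bytes\n", $heldByCycle);
printf("collected: %d, still held: %d bytes\n", $collected, $afterCollect);
```

In a long-running script the collector normally runs on its own once enough candidate roots accumulate, so an explicit call like this is mostly useful for measuring or at known checkpoints.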
Other than that, there are a few common tips for writing PHP code that makes good use of memory:
Make sure unused variables are discarded when no longer needed. In a well-structured program, this is often a non-issue, because most variables will be local to a particular function; when the function exits, they will go out of scope, and be freed. But if you are creating large intermediate variables, or dynamically creating large numbers of variables, manually calling unset() may be a good idea. And if your code is very linear, or uses large numbers of global and static variables, just refactoring it into a more modular structure may improve its memory performance as well as its readability, maintainability, etc.
Assigning or passing a variable by reference ($foo = &$bar) may cause PHP to use more memory than a straight assignment ($foo = $bar). This is because PHP uses a "Copy On Write" mechanism to store variables with the same content in one location of memory, but reference assignment conflicts with this mechanism, so PHP has to copy the variable early.
Objects are more memory-hungry than scalar values (int, boolean, string) or arrays. This is one of the things that has been much improved in PHP 5.4, but is still worth thinking about - although obviously not to the exclusion of writing well-structured code!
You can unset variables as you no longer need them (e.g. unset($var) or $var = null). If you're on PHP 5.3 or later, you can also explicitly call the garbage collector: see gc_collect_cycles() and gc_enable().
Some functions seem to be worse than others. I recently found that array_merge_recursive() did horrible things to my code's memory footprint.
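As a rough illustration of the first tip in the list above, this sketch compares a large intermediate array kept local to a function with the same data held in the global scope. summarize() is a hypothetical function, and the array size is arbitrary.

```php
<?php
function summarize()
{
    $rows = range(1, 200000);   // large intermediate value, local to the function
    return array_sum($rows);    // only the small result leaves the function
}

$before = memory_get_usage();
$total = summarize();
$afterCall = memory_get_usage();   // $rows was released when the function returned

$big = range(1, 200000);           // same data, but kept alive in the global scope
$whileHeld = memory_get_usage();
unset($big);                       // here an explicit unset() is what releases it
$afterUnset = memory_get_usage();

printf("after function call: %d bytes above baseline\n", $afterCall - $before);
printf("while global is set: %d bytes above baseline\n", $whileHeld - $before);
printf("after unset():       %d bytes above baseline\n", $afterUnset - $before);
```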
If you want to be able to analyse where the memory's going, you can use tools like Xdebug or XHProf/XHGui to help, e.g. see "Xdebug and tracing memory usage" and "Profiling with XHProf".
See also:
Force freeing memory in PHP
php garbage collection while script running
I have a PHP application that has to export a huge amount of data, and it has to run in production, so I need to keep memory usage as low as possible (this is the main criterion).
In short, the app exports data in a loop, like this:
for($fileCounter=0;$fileCounter<=70;$fileCounter++) {
... HERE a lot of (more than 1K lines) huge work, many variables a lot of DB queries from another databases etc ...
}
I don't want to show the full logic here because it would take other people a lot of time to read, and it's not the main point.
The main point is: why doesn't memory usage decrease if I unset() all newly created variables during each iteration? Like this:
for($fileCounter=0;$fileCounter<=70;$fileCounter++) {
// optimization purpose
$vars_at_start = array_keys(get_defined_vars());
echo memory_get_peak_usage(true) . PHP_EOL;
... huge logic ...
$vars_at_end = array_diff($vars_at_start, array_keys(get_defined_vars()));
foreach($vars_at_end as $v) unset($v);
unset($vars_at_end);
}
And how could I decrease memory usage, given that I need to use so many queries, variables, etc.?
P.S. code is not mine :) and I don't want to rewrite it from scratch, I'm just looking for optimization direction.
Without cleaning up variables, memory usage is as follows (measured at the beginning of each iteration):
23592960
Started: 0 - 12:58:26
Ended: 13:00:51
877920256 (difference 854'327'296)
Started: 1 - 13:00:51
Ended: 13:03:39
1559494656 (difference 681'574'400)
And with the variable cleanup:
23592960
Started: 0 - 12:47:57
Ended: 12:50:20
877920256 (difference 854'327'296)
Started: 1 - 12:50:20
Ended: 12:53:16
1559756800 (difference 681'836'544)
Based on my reading, PHP has a lot of ways to leak memory, like this one: https://bugs.php.net/bug.php?id=48781
There is a tool called valgrind that can help; I'm going to try it :)
Although unset() does not free the memory consumed by the PHP process, it does free it for use by the PHP script itself.
So, if you are creating a variable of 10M of size 10 times in a loop and unsetting (or rewriting) it at the end of the loop, the memory consumption should be as low as 10M + all other variables by the end of the loop.
If it grows - then, there is a leak somewhere in the full logic you don't want to show.
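The expectation described in this answer can be checked with a small sketch like the one below. It uses a 1 MB value instead of 10M to keep it light, and $samples and $growth are names chosen for the example.

```php
<?php
$samples = [];

for ($i = 0; $i < 10; $i++) {
    // Each iteration overwrites the previous 1 MB value; the old one is released
    // as soon as the assignment replaces it.
    $chunk = str_repeat(chr(65 + $i), 1024 * 1024);
    $samples[] = memory_get_usage();
}

$growth = max($samples) - min($samples);
printf("difference between highest and lowest sample: %d bytes\n", $growth);
```

If the real export loop shows steady growth where this one stays flat, something in the loop body is still holding references to the old data.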
Because PHP does it automatically.
When a variable is no longer used, PHP unsets it itself.
Because unset doesn't free up memory. It merely frees the variable. But if that variable points to some sort of complex structure, that PHP's memory management can't figure out how to free up, it won't.
$vars_at_end = array_diff($vars_at_start, array_keys(get_defined_vars()));
foreach($vars_at_end as $v) unset($v);
unset($vars_at_end);
This whole block of code is working on copies (or even copies of copies) of your variables, so it's adding large amounts of memory.
All that you ever unset are the copies of copies in the foreach loop.
You need to unset the actual variables that you use
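A minimal sketch of what unsetting the actual variables could look like, under the assumption that the loop body runs inside a function. runIteration(), $rows and $report are hypothetical stand-ins for the real logic. Note also that the names created during the iteration are the "after" list minus the "before" list, which is the opposite argument order from the question's array_diff() call.

```php
<?php
function runIteration()
{
    $before = array_keys(get_defined_vars());

    // ... stand-ins for the "huge logic" ...
    $rows = range(1, 1000);
    $report = str_repeat('x', 1024);

    // Names that exist now but did not exist at the start (excluding our bookkeeping).
    $created = array_diff(array_keys(get_defined_vars()), $before, ['before']);

    foreach ($created as $name) {
        // unset($name) would only destroy the loop copy of the *name*;
        // unset($$name) destroys the variable that the name refers to.
        unset($$name);
    }

    return array_keys(get_defined_vars());
}

$remaining = runIteration();
echo implode(', ', $remaining), PHP_EOL;
```

This only helps when nothing else still references the same data; values also stored in globals, static properties or caches stay alive regardless.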
BTW, your foreach ... unset loop does nothing anyway. PHP uses a reference-based delayed copy-on-write system to optimise memory use. This foreach loop is in effect a no-op. Storage is freed (that is, returned to the Zend Engine's internal emalloc allocator, not to the OS) for reuse once the reference count for any element is zero. This will happen anyway for local variables when you leave the scope of a function and for class properties when you destroy a class object. There is no point in cloning a shallow copy of a variable and then unsetting this, as this just does a +1 -1 on the reference count.
If you mine the code for the main variables used in the loop and unset them, then you will truly decrement their reference counts, and once a count reaches 0 you will free up storage for reuse. However, as troelskyn implies, there are many ways the existing code can leave data elements with a non-zero reference count. The classic way is if your code is using references, then you can create cyclic reference chains which will never be reclaimed unless the cycle is explicitly broken. Even having a global array which is used to hold results can hog memory.
Sorry, but your statement:
I don't want to show here full logic because it can take a lot of time for other peoples, it's not the main point here.
is wrong. In PHP if you want to understand why you are not returning storage into the memory pool, then you must look into the code.
unset() on a variable marks it for 'garbage collection'
Have you tried __destruct() ?
http://www.php.net/manual/en/language.oop5.decon.php
Can a PHP script find out its allowed memory size and how much memory can still be allocated? I know that it is possible to free memory using unset, but I'd like to understand how to write PHP scripts that consume as little memory as possible.
The basic mechanism which PHP uses is garbage collection
How it works in short is something like:
Say you have a certain memory location M allocated to store variable $m e.g.:
$m = [ 0,1,2,3,4,5 ]; //M refers to the memory which is storing this array
As long as $m keeps pointing to M then PHP is not allowed to destroy M. However if you do something like:
$m = null;
This makes $m point to nothing and therefore M no longer is referenced by anything. PHP at this point is allowed to clear that memory, but may not do so immediately. The point is if you ensure that you stop referencing something when you don't need it anymore you're giving PHP the opportunity to run as memory optimized as possible.
However, garbage collection for large complex applications is expensive so keep in mind that PHP may opt to delay garbage collection if it can.
unset will free the memory.
However, most of the memory consumption will come from resources such as I/O, database results, etc.
For these tasks, it is very important to release the memory or free the resources so they can be used again.
Also note that the processed data will have a similar footprint to the raw data while it is being processed, so freeing it after use helps as well.
To reduce memory usage further, make functions as pure as possible, so that after a function has executed there is only its output and no side effects. This leaves fewer things in the global memory space.
Before starting, I'm not asking about standard coding practice or "etiquette." My question is more from curiosity with the internals of PHP. My research so far mostly seems to find people confused about scope in PHP.
Does re-using variables come with any benefit/detriment in PHP, either in memory or speed? For science.
Say you are sequentially accessing multiple files throughout a script.
Scenario A: your handles each have a variable, $file1, $file2, $file3, etc.
Scenario B: your handles all reuse the variable $fp
Will this theoretical scenario require respectively resource intensive scripts to matter? Will B allow garbage collection to get rid of the old handles while A won't? Will optimization through Zend make this a non-issue either way?
There is not a cut & dry answer to this question. Optimization and performance will depend heavily on your particular codebase, the platform it runs on, what else is running on the server, and more.
To start with your scenarios are too vague to provide an adequate answer. However, to touch on some of the prior comments/concerns...
PHP does not have very well defined rules for garbage collection. In THEORY scenario A will release the memory when a function exits thanks to garbage collection. In reality this rarely happens. There are a number of triggers that will cause garbage collection to release that memory, but behind the scenes the actual low-level free() and mallocs() are not cut & dry. If you watch your memory stack closely you will find that after a function exit the memory space for $file1, $file2, $file3 will remain. Sometimes until the entire application exits.
Your application construction will also determine which is faster, creating a new entry in the symbol table for $file1, $file2, $file3 or re-using $fp over & over. Re-using $fp, again IN THEORY, would typically mean the memory space does not need to be re-allocated and a new symbol table entry and corresponding management object does not need to be re-created. However this is not always the case. Sometimes re-using $fp actually can be slower because a destroy needs to be called first, then re-creating the object. In some corner cases it may be faster to just create a new $file1, $file2, $file3 on the iterative process and let garbage collection happen all-at-once.
So, the bottom line of all this....
You need to analyze and test your own apps in their native environment to learn how things behave in YOUR playground. It is rarely an "always do this" or "never do that" scenario.
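One way to observe the difference between the two scenarios on your own setup is to replace the file handles with objects that log their own destruction. The sketch below does that; TrackedHandle is a hypothetical stand-in class, not a real file API, and it only shows when PHP releases each value, not how long the release takes.

```php
<?php
// Hypothetical stand-in for a file handle that records when it is created and destroyed.
class TrackedHandle
{
    public static $log = [];
    private $label;

    public function __construct($label)
    {
        $this->label = $label;
        self::$log[] = "open {$label}";
    }

    public function __destruct()
    {
        self::$log[] = "close {$this->label}";
    }
}

function scenarioA()
{
    $file1 = new TrackedHandle('1');
    $file2 = new TrackedHandle('2');
    TrackedHandle::$log[] = 'end of body';
    // Both handles are released only when the function returns.
}

function scenarioB()
{
    $fp = new TrackedHandle('1');
    $fp = new TrackedHandle('2'); // handle 1 loses its last reference on this assignment
    TrackedHandle::$log[] = 'end of body';
}

scenarioA();
$logA = TrackedHandle::$log;
TrackedHandle::$log = [];

scenarioB();
$logB = TrackedHandle::$log;

echo 'A: ', implode(', ', $logA), PHP_EOL;
echo 'B: ', implode(', ', $logB), PHP_EOL;
```

In scenario B the first handle is closed before the function body finishes, while in scenario A both stay open until the function exits.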
I'm not confident in my answer, but I have found that reusing vars saves memory, especially when re-using vars for query results, as those vars often fill up with a lot of other unwanted stuff.
You can use
echo memory_get_usage() at different stages of the code execution to see the difference and compare.
But reusing vars can get confusing as your code grows, and it makes the code harder for people to read.
Also, PHP runs garbage collection when the script is done, so how you name your vars probably won't have anything to do with it; rather, it affects how much memory the script uses during execution.
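A small sketch of that measuring approach, recording both current and peak usage at a few stages. memoryCheckpoint() is a hypothetical helper, and range() stands in for a query result.

```php
<?php
// Hypothetical helper: snapshot of current and peak memory usage with a label.
function memoryCheckpoint($label)
{
    return [
        'label'   => $label,
        'current' => memory_get_usage(),
        'peak'    => memory_get_peak_usage(),
    ];
}

$points = [];
$points[] = memoryCheckpoint('start');

$result = range(1, 100000);              // stand-in for a large query result
$points[] = memoryCheckpoint('result loaded');

$result = range(1, 10);                  // reuse the same variable for a small result
$points[] = memoryCheckpoint('variable reused');

foreach ($points as $point) {
    printf("%-16s current=%d peak=%d\n", $point['label'], $point['current'], $point['peak']);
}
```

The current figure should drop again once the variable is reused, while the peak figure keeps the high-water mark.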
Possible Duplicates:
What's better at freeing memory with PHP: unset() or $var = null
Is there a real benefit of unsetting variables in php?
class test {
    public function m1($a, $b) {
        $c = $a + $b;
        unset($a, $b);
        return $c;
    }
}
Is it true that unsetting variables doesn't actually decrease the memory consumption during runtime?
Is it true that unsetting variables
doesn't actually decrease the memory
consumption during runtime?
Yep. From PHP.net:
unset() does just what it's name says
- unset a variable. It does not force immediate memory freeing. PHP's
garbage collector will do it when it
see fits - by intention as soon, as
those CPU cycles aren't needed anyway,
or as late as before the script would
run out of memory, whatever occurs
first.
If you are doing $whatever = null;
then you are rewriting variable's
data. You might get memory freed /
shrunk faster, but it may steal CPU
cycles from the code that truly needs
them sooner, resulting in a longer
overall execution time.
Regarding your other question:
And is there any reason to unset
variables apart from destroying
session variables for instance or for
scoping?
Not really, you pretty much summed it up.
PHP will clean up memory on its own with the garbage collector, and it usually does a pretty good job. unsetting will simply make it explicit that you're done with that particular variable.
Probably no benefit for simple data types, but for any system resources you'd want to use that command to free those resources.
It depends on what the variable is. If it's a large array that consumes a few megs of data, and your script is liable to require lots of memory in the future (i.e.: before it finishes execution) then it would be wise to tag this memory as being available for use by unsetting the array.
That said, this is only really of use if the array is still in scope, as PHP will effectively have automatically disposed of it otherwise.
In terms of your provided example, there's no need to use unset, as those variables immediately go out of scope.
Even though there's no real gain over PHP's own garbage collection, I will occasionally unset() variables to make it clear in the code that a var's role has been completed and will no longer be accessed or assigned. I tend not to do this with atomic data types, but instead with major actors in a script - configuration singletons, large objects, etc.
It releases memory which is being used by your script. See http://ie2.php.net/memory_get_usage.
The benefit is with scripts which are processing large amounts of data you can run into out of memory errors, see the memory_limit ini setting for more on this.
So, yes, there may be benefit, but unless you are working with large amounts of data you shouldn't need to use it.
You may also want to unset variable to prevent their value being used later on, but if that's the case it could be argued that your code needs to be written differently to prevent such things happening.
As mentioned in unset
unset() does just what it's name says - unset a variable.
It does not force immediate memory freeing.
PHP's garbage collector will do it when it see fits - by intention as soon,
as those CPU cycles aren't needed anyway,
or as late as before the script would run out of memory, whatever occurs first.
If you are doing $whatever = null;
then you are rewriting variable's data.
You might get memory freed / shrunk faster,
but it may steal CPU cycles from the code that truly needs them sooner,
resulting in a longer overall execution time.
Can anybody give me an introduction to programming efficiently in PHP, so that my program produces its results using as little memory as possible?
Based on how I read your question, I think you may be barking up the wrong tree with PHP. It was never designed for a low memory overhead.
If you just want to be as efficient as possible, then look at the other answers. Remember that every single variable costs a fair bit of memory, so use only what you have to, and let the garbage collector work. Make sure that you only declare variables in a local scope so they can get GC'd when the program leaves that scope. Objects will be more expensive than scalar variables. But the biggest common abuse I see are multiple copies of data. If you have a large array, operate directly on it rather than copying it (It may be less CPU efficient, but it should be more memory efficient).
If you are looking to run it in a low memory environment, I'd suggest finding a different language to use. PHP is nice because it manages everything for you (with respect to variables). But that type coercion and flexibility comes at a price (speed and memory usage). Each variable requires a lot of meta-data stored with it. So while a 4-byte (32-bit) int would take 4 bytes to store in C, it will likely take more than 64 bytes in PHP (because of all of the "tracking" information associated with it such as type, name, scoping information, etc). That overhead is normally seen as ok since PHP was not designed for large memory loads. So it's a trade-off: more memory used for easier programming. But if you have tight memory constraints, I'd suggest moving to a different language...
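The per-value overhead is easy to measure on your own installation. The sketch below estimates the bytes used per integer in a PHP array; the figure depends heavily on the PHP version (newer versions are considerably leaner than older ones), so treat the result as a local measurement rather than a constant.

```php
<?php
$count = 100000;

$before = memory_get_usage();
$numbers = range(1, $count);                       // $count integers in a PHP array
$bytesPerInt = (memory_get_usage() - $before) / $count;

// For comparison, a C array of 32-bit ints would use 4 bytes per element.
printf("about %.1f bytes per integer element\n", $bytesPerInt);
```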
It's difficult to give advice with so little information on what you're trying to do and why memory utilization is a problem. In the common scenarios (web servers that serve many requests), memory is not a limiting factor and it's preferable to serve the requests as fast as possible, even if this means sacrificing memory for speed.
However, the following general guidelines apply:
unset your variables as soon as you don't need them. In a program that's well written, this, however, won't have a big impact, as variables going out of scope have the same effect.
In long-running scripts with lots of variables with circular references, and if using PHP 5.3 or later, try calling the garbage collector explicitly at certain points.
First of all: Don't try to optimize memory usage by using references. PHP is smart enough not to copy the contents of a variable if you do something like this:
$array = array(1,2,3,4,5,);
$var = $array;
PHP will only copy the contents of the variable when you write to it. Using references all the time because you think they will save you from copying the variable content can often backfire ;)
But I think your question is hard to answer unless you are more precise.
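The copy-on-write behaviour can be observed with memory_get_usage(). This is a minimal sketch with an arbitrary array size; the exact numbers vary by PHP version.

```php
<?php
$array = range(1, 100000);
$base = memory_get_usage();

$var = $array;                          // no copy yet: both names share one array
$afterAssign = memory_get_usage() - $base;

$var[] = 0;                             // first write: $var gets its own copy now
$afterWrite = memory_get_usage() - $base;

printf("after assignment:  %d extra bytes\n", $afterAssign);
printf("after first write: %d extra bytes\n", $afterWrite);
```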
For example, if you are working with files, it can be advisable not to always file_get_contents() the whole file, but to use the f(open|...) functions to load only small parts of the file at once or even skip whole chunks.
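As a sketch of the chunked approach, the hypothetical helper below counts newline characters in a file while holding only one chunk in memory at a time. countLines() and the 8192-byte chunk size are choices made for this example.

```php
<?php
// Hypothetical helper: counts newline characters without loading the whole file.
function countLines($path, $chunkSize = 8192)
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open {$path}");
    }

    $lines = 0;
    while (!feof($handle)) {
        $chunk = fread($handle, $chunkSize);   // only $chunkSize bytes are held at once
        if ($chunk === false) {
            break;
        }
        $lines += substr_count($chunk, "\n");
    }

    fclose($handle);
    return $lines;
}

// Usage: build a throwaway file, count its lines, then remove it.
$path = tempnam(sys_get_temp_dir(), 'chunk');
file_put_contents($path, str_repeat("line\n", 50000));
$count = countLines($path);
unlink($path);

echo $count, PHP_EOL;
```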
Or if you are working with strings make use of functions which return a string offset instead of the rest of a string (e.g. strcspn instead of strpbrk) when possible.
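The difference between the two functions named here, on a made-up input string: strpbrk() builds a new string holding the rest of the input, while strcspn() returns only an integer offset, which you can then use to cut out just the part you need.

```php
<?php
$line = 'timeout=30;retries=5';

$rest = strpbrk($line, '=;');       // new string: "=30;retries=5"
$offset = strcspn($line, '=;');     // integer offset of the first '=' or ';': 7
$key = substr($line, 0, $offset);   // "timeout"

var_dump($rest, $offset, $key);
```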