I have a PHP application that has to export a huge amount of data, and it has to run in production, so keeping memory usage as low as possible is my main criterion.
In short, the app exports the data in a loop, like this:
for($fileCounter=0;$fileCounter<=70;$fileCounter++) {
... HERE a lot of (more than 1K lines) huge work, many variables a lot of DB queries from another databases etc ...
}
I don't want to show the full logic here because it would take other people a lot of time to read, and it's not the main point.
The main point is: why doesn't memory usage decrease if I unset() all newly created variables during each iteration? Like this:
for($fileCounter=0;$fileCounter<=70;$fileCounter++) {
// optimization purpose
$vars_at_start = array_keys(get_defined_vars());
echo memory_get_peak_usage(true) . PHP_EOL;
... huge logic ...
$vars_at_end = array_diff($vars_at_start, array_keys(get_defined_vars()));
foreach($vars_at_end as $v) unset($v);
unset($vars_at_end);
}
And how can I decrease memory usage if I need to use so many queries, variables, etc.?
P.S. The code is not mine :) and I don't want to rewrite it from scratch; I'm just looking for a direction for optimization.
Without variable cleanup, memory usage is as follows (measured at the beginning of each iteration):
23592960
Started: 0 - 12:58:26
Ended: 13:00:51
877920256 (difference 854'327'296)
Started: 1 - 13:00:51
Ended: 13:03:39
1559494656 (difference 681'574'400)
And with variable cleanup:
23592960
Started: 0 - 12:47:57
Ended: 12:50:20
877920256 (difference 854'327'296)
Started: 1 - 12:50:20
Ended: 12:53:16
1559756800 (difference 681'836'544)
From what I have read, PHP has many ways to leak memory, like this one: https://bugs.php.net/bug.php?id=48781
There is a tool called valgrind that can help; I'm going to try it :)
Although unset() does not free the memory consumed by the PHP process, it does free it for use by the PHP script itself.
So, if you are creating a variable of 10M of size 10 times in a loop and unsetting (or rewriting) it at the end of the loop, the memory consumption should be as low as 10M + all other variables by the end of the loop.
If it grows - then, there is a leak somewhere in the full logic you don't want to show.
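To make the point about reuse concrete, here is a minimal sketch, not the asker's code: it builds a string of roughly 10 MB ten times and unsets it at the end of each pass. The sizes and variable names are assumptions chosen for illustration. It also reads memory_get_usage() next to memory_get_peak_usage(), because a peak reading (which is what the question prints) can never go down, whatever you unset.

```php
<?php
// Illustrative sizes only; the real export loop is not shown in the question.
$size  = 10 * 1024 * 1024;
$start = memory_get_usage();

$samples = [];
for ($i = 0; $i < 10; $i++) {
    $chunk = str_repeat('x', $size);           // roughly 10 MB string
    $samples[] = memory_get_usage() - $start;  // usage while the string is alive
    unset($chunk);                             // refcount hits 0, block goes back to PHP's allocator
}

$afterLoop = memory_get_usage() - $start;      // what is still held after the loop
$peak      = memory_get_peak_usage();          // high-water mark, never decreases

printf("largest per-pass usage: %d bytes\n", max($samples));
printf("held after the loop:    %d bytes\n", $afterLoop);
printf("peak usage:             %d bytes\n", $peak);
```

If the per-pass figure stays near 10 MB instead of climbing towards 100 MB, the freed block is being reused. The exact numbers depend on the PHP version and build.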
Because PHP does it automatically.
When a variable is no longer used, PHP unsets it itself.
Because unset doesn't free up memory. It merely frees the variable. But if that variable points to some sort of complex structure, that PHP's memory management can't figure out how to free up, it won't.
$vars_at_end = array_diff($vars_at_start, array_keys(get_defined_vars()));
foreach($vars_at_end as $v) unset($v);
unset($vars_at_end);
This whole block of code works on copies: get_defined_vars() returns a copy of the variable table, and array_keys() reduces that to a list of names.
All that you ever unset is the loop variable $v in the foreach loop, which holds a copy of a name.
You need to unset the actual variables that you use.
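A minimal sketch of the difference, assuming the work happens inside a function (runIteration() and its variables are hypothetical names, not taken from the original code). It changes two things in the quoted block: array_diff() needs the later list as its first argument to report newly created names, and unset($v) only removes the loop variable, whereas unset($$name) removes the variable whose name is stored in $name.

```php
<?php
// Hypothetical stand-in for one iteration of the export loop.
function runIteration(): array
{
    $before = array_keys(get_defined_vars());   // names defined before the work

    $rows  = range(1, 1000);                    // variables created by the "huge logic"
    $label = 'batch';

    // Later list first, so the result is the set of NEW names.
    // 'before' is excluded because it was assigned after the first snapshot.
    $created = array_diff(array_keys(get_defined_vars()), $before, ['before']);

    // The loop from the question: this only unsets $v itself.
    foreach ($created as $v) {
        unset($v);
    }
    $stillSet = isset($rows, $label);

    // Variable variables: unset the variable NAMED by $name.
    foreach ($created as $name) {
        unset($$name);
    }
    $setAfterFix = isset($rows) || isset($label);

    return [$stillSet, $setAfterFix];
}

var_dump(runIteration()); // first element true, second element false
```

Even with this fix, memory only comes back if nothing else (a global array, a static cache, a reference) still points at the same data.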
BTW, your foreach ... unset loop does nothing anyway. PHP uses a reference-counted, delayed copy-on-write system to optimise memory use, so this foreach loop is in effect a no-op. Storage is freed (that is, returned to the Zend Engine's internal emalloc allocator, not to the OS) for reuse once the reference count for an element reaches zero. This happens anyway for local variables when you leave the scope of a function, and for object properties when the object is destroyed. There is no point in making a shallow copy of a variable and then unsetting it, as this just does a +1 -1 on the reference count.
If you mine the code for the main variables used in the loop and unset them, then you will truly decrement their reference counts, and once a count reaches 0 the storage is freed for reuse. However, as troelskyn implies, there are many ways the existing code can leave data elements with a non-zero reference count. The classic way is code that uses references: you can create cyclic reference chains which, without the cycle collector added in PHP 5.3, will never be reclaimed unless the cycle is explicitly broken. Even having a global array which is used to hold results can hog memory.
Sorry, but your statement:
I don't want to show here full logic because it can take a lot of time for other peoples, it's not the main point here.
is wrong. In PHP if you want to understand why you are not returning storage into the memory pool, then you must look into the code.
unset() on a variable marks it for 'garbage collection'
Have you tried __destruct() ?
http://www.php.net/manual/en/language.oop5.decon.php
Related
I have a PHP script to scrape a website (text files only). After running for a few hours, I noticed the script stops because it reaches the memory limit. I know I can increase the limit, but since the files the script loads are only HTML files, the only explanation I have for reaching the limit is that the script fails to free memory after each loop iteration. Could I optimize my script's memory management by flush()ing its memory regularly?
In general, you shouldn't need to manually manage memory in PHP, as it has a high-level Memory Manager built in to the Zend Engine which takes care of this for you. However, it is useful to know a bit about how this works in order to better understand why your code is running out of memory.
As a very basic overview, PHP frees memory based on a "refcount" of how many variables are referencing a particular piece of data. So if you say $a = 'hello'; $b = $a;, a single piece of memory containing the string 'hello' will have a refcount of 2. If you call unset() on either variable, or they fall out of scope (e.g. at the end of the function they were defined in), the refcount will decrease. Once the refcount reaches zero, the data will be deleted and the memory freed. Note that "freed" in this case means freed for use by other parts of that PHP script, not necessarily freed back to the Operating System for use by other processes.
There are a few differences between PHP versions worth knowing:
The reference counting mechanism described above doesn't work if you have circular references (e.g. $obj1->foo = $obj2; $obj2->bar = $obj1;) because the reference count never reaches zero. In PHP 5.2 and earlier, this meant that such circular references led to memory leaks, and had to be manually handled by the programmer. In PHP 5.3, a "Garbage Collector" was added specifically to handle this case. It does not replace the normal refcount mechanism, but if circular references are common in your code, it may be worth reading up on.
PHP 5.4 included a large number of optimizations to the way PHP allocates and uses memory. AFAIK, none of these change the fundamental recommendations of how to write efficient code, they are just a good reason to upgrade your PHP version if you can.
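As a rough illustration of the circular-reference case, the sketch below creates pairs of objects that point at each other and then asks the cycle collector to reclaim them. makeCycle() and the property name peer are made up for the example; gc_enable() and gc_collect_cycles() are the built-in functions available since PHP 5.3. The loop count is kept small on the assumption that the collector does not trigger by itself before the explicit call.

```php
<?php
// Hypothetical helper: builds two objects that reference each other.
function makeCycle(): void
{
    $a = new stdClass();
    $b = new stdClass();
    $a->peer = $b;
    $b->peer = $a;
    // On return $a and $b go out of scope, but each object is still
    // referenced by the other, so neither refcount reaches zero.
}

gc_enable();
gc_collect_cycles();                 // start from a clean state

$before = memory_get_usage();
for ($i = 0; $i < 1000; $i++) {
    makeCycle();
}
$held = memory_get_usage() - $before;      // memory kept alive by the cycles

$collected = gc_collect_cycles();          // ask the collector to break the cycles
$heldAfterGc = memory_get_usage() - $before;

printf("held by cycles: %d bytes, collected: %d, held after GC: %d bytes\n",
    $held, $collected, $heldAfterGc);
```

In a long-running loop, calling gc_collect_cycles() at the end of each iteration is one cheap experiment to see whether cycles are what keeps memory growing.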
Other than that, there are a few common tips for writing PHP code that makes good use of memory:
Make sure unused variables are discarded when no longer needed. In a well-structured program, this is often a non-issue, because most variables will be local to a particular function; when the function exits, they will go out of scope, and be freed. But if you are creating large intermediate variables, or dynamically creating large numbers of variables, manually calling unset() may be a good idea. And if your code is very linear, or uses large numbers of global and static variables, just refactoring it into a more modular structure may improve its memory performance as well as its readability, maintainability, etc.
Assigning or passing a variable by reference ($foo = &$bar) may cause PHP to use more memory than a straight assignment ($foo = $bar). This is because PHP uses a "Copy On Write" mechanism to store variables with the same content in one location of memory, but reference assignment conflicts with this mechanism, so PHP has to copy the variable early.
Objects are more memory-hungry than scalar values (int, boolean, string) or arrays. This is one of the things that has been much improved in PHP 5.4, but is still worth thinking about - although obviously not to the exclusion of writing well-structured code!
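The "Copy On Write" behaviour mentioned above can be observed with a small measurement. This is only a sketch: measureCopyOnWrite() is a hypothetical helper, the array size is arbitrary, and the byte counts vary between PHP versions. It covers only the plain-assignment half of the tip, not the reference case.

```php
<?php
// Hypothetical helper that measures memory around an assignment and a write.
function measureCopyOnWrite(): array
{
    $big = range(1, 100000);        // one large array

    $m0 = memory_get_usage();
    $copy = $big;                   // plain assignment: both names share one array
    $m1 = memory_get_usage();
    $copy[] = 0;                    // first write: PHP now has to duplicate the array
    $m2 = memory_get_usage();

    return [
        'assign' => $m1 - $m0,      // cost of the assignment itself
        'write'  => $m2 - $m1,      // cost of the first modification
    ];
}

$cost = measureCopyOnWrite();
printf("assignment: %d bytes, first write: %d bytes\n", $cost['assign'], $cost['write']);
```

The assignment should cost next to nothing, while the first write pays for a full copy of the array.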
You can unset variables as you no longer need them (e.g. unset($var) or $var = null). If you're on PHP 5.3 or later, you can also explicitly call the garbage collector: see gc_collect_cycles() and gc_enable().
Some functions seem to be worse than others. I recently found that array_merge_recursive() did horrible things to my code's memory footprint.
If you want to be able to analyse where the memory's going, you can use tools like Xdebug or XHProf/XHGui to help. e.g. Xdebug and tracing memory usage and Profiling with XHProf
See also:
Force freeing memory in PHP
php garbage collection while script running
My question is really simple: I fetch data from a database and store it in variables for future use. Once I am done working with the variables, is it cost-efficient to use PHP's unset function to free up the memory? By 'cost-efficient' I mean whether it is worth calling the function multiple times in the hope of clearing memory to reduce page load time.
As mentioned in unset
From here
"unset() does just what it's name says - unset a variable. It does not force immediate memory freeing. PHP's garbage collector will do it when it see fits - by intention as soon, as those CPU cycles aren't needed anyway, or as late as before the script would run out of memory, whatever occurs first.
If you are doing $whatever = null; then you are rewriting variable's data. You might get memory freed / shrunk faster, but it may steal CPU cycles from the code that truly needs them sooner, resulting in a longer overall execution time."
Use unset when you are dealing with huge data like if you are dealing with arrays.
You do not really need to call the function multiple times unless you use the same variable name for different values - which I do not recommend.
unset($variable1, $variable2, $variable3)
I use unset at the end of my loops and to unset my arrays at the end of my code.
I do not really need to use unset unless I really have to - loops again here for a known php weirdness - or I have really huge arrays.
I don't have a better way to phrase the question; you might suggest one. Most of the time when I re-use a variable in PHP, I wonder which approach is more memory/processor efficient, e.g.
Case A: use a separate variable for the string and for the array
$string_var ='1,2,4,5,6,7,8';
$array_var =explode(',',$string_var);
Case B: re-use the same variable (string variable and re-declare as array object)
$array_var ='1,2,4,5,6,7,8';
$array_var =explode(',',$array_var);
My question is not about code readability. I wonder which one is more efficient in terms of memory and processor utilization.
$string_var ='1,2,4,5,6,7,8';
$array_var =explode(',',$string_var);
This will keep both the string and the array in memory, using more memory. If you'd overwrite the original variable, the previously stored content would be garbage collected at some point, freeing up memory. In practice it may not make any real difference, since the values won't be garbage collected immediately, and if your variables are reasonably scoped they should go out of scope soon enough anyway.
It makes virtually no difference in processing time.
Go with what makes more sense logically. If you don't need $string_var anymore, there's no need to keep it around as a separate variable. Try to declutter your namespace as much as possible.
Each time you create a new variable, a bit of memory is allocated to that variable. Therefore 2 variables will take around twice as much memory. It is better to use the same variable Case B as that only uses the memory required for 1 variable.
Both of them will affect the processor in same manner. Though, there will be lower memory usage in the Case B as you'll be updating an existing variable to a new array.
If this wasn't PHP, but a "lower level" language like C, the answer would be quite simple:
will be slightly more efficient in speed (directly rewrite results to destination location in memory, skip rewriting the pointer)
will be slightly more efficient in matter of memory (temporary location holding the result between right-side and left-side of the operation deallocated immediately after operation, just one variable allocated permanently.)
With PHP being a scripting language, all I can say is that B might be marginally more memory-conservative in the long run, but the amount is really insignificant. Other than that, all bets are off; write a benchmark.
Memory management is not something that most PHP developers ever need to think about. I'm running into an issue where my command line script is running out of memory. It performs multiple iterations over a large array of objects, making multiple database requests per iteration. I'm sure that increasing the memory ceiling may be a short term fix, but I don't think it's an appropriate long-term solution. What should I be doing to make sure that my script is not using too much memory, and using memory efficiently?
The golden rule
The number one thing to do when you encounter (or expect to encounter) memory pressure is: do not read massive amounts of data in memory at once if you intend to process them sequentially.
Examples:
Do not fetch a large result set in memory as an array; instead, fetch each row in turn and process it before fetching the next
Do not read large text files in memory (e.g. with file); instead, read one line at a time
This is not always the most convenient thing in PHP (arrays don't cut it, and there is a lot of code that only works on arrays), but in recent versions and especially after the introduction of generators it's easier than ever to stream your data instead of chunking it.
Following this practice religiously will "automatically" take care of other things for you as well:
There is no longer any need to clean up resources with a big memory footprint by closing them and losing all references to them on purpose, because there will be no such resources to begin with
There is no longer a need to unset large variables after you are done with them, because there will be no such variables as well
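A minimal sketch of the streaming approach for the text-file case, using a generator. readLines() is a hypothetical helper, and the temporary file only exists so the example can run on its own; the point is that only one line is held in memory at a time, unlike file(), which loads every line into an array.

```php
<?php
// Hypothetical helper: yields one line at a time instead of loading the file.
function readLines(string $path): Generator
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        while (($line = fgets($handle)) !== false) {
            yield rtrim($line, "\r\n");
        }
    } finally {
        fclose($handle);            // runs even if the consumer stops early
    }
}

// Demo data: a temporary file with the numbers 1..1000, one per line.
$path = tempnam(sys_get_temp_dir(), 'lines');
file_put_contents($path, implode("\n", range(1, 1000)) . "\n");

$count = 0;
$sum   = 0;
foreach (readLines($path) as $line) {
    $sum += (int) $line;            // process the line, then let it go
    $count++;
}
unlink($path);

printf("%d lines, sum %d\n", $count, $sum);
```

The same shape works for database rows: fetch one row per iteration from the result set instead of calling a fetch-all method.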
Other things to do
Be careful of creating closures inside loops; this should be easy to do, as creating such inside loops is a bad code smell. You can always lift the closure upwards and give it more parameters.
When expecting massive input, design your program and pick algorithms accordingly. For example, you can mergesort any amount of text files of any size using a constant amount of memory.
You could try profiling it by putting in some calls to memory_get_usage() to look for the place where it's peaking.
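One way to do that, sketched with a hypothetical checkpoint() helper: record memory_get_usage() at labelled points and print the difference between consecutive readings. The labels and the test data are assumptions; in real code the checkpoints would go around the suspect parts of the loop.

```php
<?php
// Hypothetical helper: stores the current usage under a label.
function checkpoint(array &$log, string $label): void
{
    $log[$label] = memory_get_usage();
}

$log = [];
checkpoint($log, 'start');

$rows = array_fill(0, 50000, str_repeat('x', 32));   // stand-in for real work
checkpoint($log, 'after build');

unset($rows);
checkpoint($log, 'after unset');

// Print each reading with the change since the previous one.
$previous = null;
foreach ($log as $label => $bytes) {
    $delta = $previous === null ? 0 : $bytes - $previous;
    printf("%-12s %10d bytes (%+d)\n", $label, $bytes, $delta);
    $previous = $bytes;
}
```

A checkpoint whose delta never comes back down across iterations points at the code that is holding on to data.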
Of course, knowing what the code really does you'll have more information to reduce its memory usage.
When you compute your large array of objects, try not to compute it all at once. Walk through it in steps: process one batch of elements, free the memory, then take the next batch.
It will take more time, but you can control the amount of memory you use.
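A sketch of that step-by-step walk. fetchBatch() is a made-up stand-in for whatever produces the objects (for example a query with LIMIT and OFFSET); here it just returns numbers so the example runs on its own.

```php
<?php
// Hypothetical data source: returns up to $limit items starting at $offset,
// or an empty array when there is nothing left. Pretends there are 25 items.
function fetchBatch(int $offset, int $limit): array
{
    $total = 25;
    if ($offset >= $total) {
        return [];
    }
    return range($offset + 1, min($offset + $limit, $total));
}

$batchSize = 10;
$processed = 0;
$batches   = 0;

for ($offset = 0; ($batch = fetchBatch($offset, $batchSize)) !== []; $offset += $batchSize) {
    foreach ($batch as $item) {
        $processed++;               // real per-item work would go here
    }
    $batches++;
    unset($batch);                  // release this batch before fetching the next
}

printf("%d items in %d batches\n", $processed, $batches);
```

Peak memory is then bounded by the batch size rather than by the total number of items.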
class test {
    public function m1($a, $b)
    {
        $c = $a + $b;
        unset($a, $b);
        return $c;
    }
}
Is it true that unsetting variables doesn't actually decrease the memory consumption during runtime?
Is it true that unsetting variables doesn't actually decrease the memory consumption during runtime?
Yep. From PHP.net:
unset() does just what it's name says - unset a variable. It does not force immediate memory freeing. PHP's garbage collector will do it when it see fits - by intention as soon, as those CPU cycles aren't needed anyway, or as late as before the script would run out of memory, whatever occurs first.
If you are doing $whatever = null; then you are rewriting variable's data. You might get memory freed / shrunk faster, but it may steal CPU cycles from the code that truly needs them sooner, resulting in a longer overall execution time.
Regarding your other question:
And is there any reason to unset variables apart from destroying session variables for instance or for scoping?
Not really, you pretty much summed it up.
PHP will clean up memory on its own with the garbage collector, and it usually does a pretty good job. unsetting will simply make it explicit that you're done with that particular variable.
Probably no benefit for simple data types, but for any system resources you'd want to use that command to free those resources.
It depends on what the variable is. If it's a large array that consumes a few megs of data, and your script is liable to require lots of memory in the future (i.e.: before it finishes execution) then it would be wise to tag this memory as being available for use by unsetting the array.
That said, this is only really of use if the array is still in scope, as PHP will effectively have automatically disposed of it otherwise.
In terms of your provided example, there's no need to use unset, as those variables immediately go out of scope.
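To illustrate the scope point with a quick, non-authoritative check: the hypothetical function below builds a large local array and returns only its size. No unset() is called, yet the memory reading after the call should be back close to the reading before it, because the local variable is released when the function returns.

```php
<?php
// Hypothetical function: $data is local and disappears when the function returns.
function countLargeList(): int
{
    $data = range(1, 100000);
    return count($data);
}

$before = memory_get_usage();
$count  = countLargeList();
$after  = memory_get_usage();

printf("counted %d items, memory change after the call: %d bytes\n", $count, $after - $before);
```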
Even though there's no real gain over PHP's own garbage collection, I will occasionally unset() variables to make it clear in the code that a var's role has been completed and will no longer be accessed or assigned. I tend not to do this with atomic data types, but instead with major actors in a script - configuration singletons, large objects, etc.
It releases memory which is being used by your script. See http://ie2.php.net/memory_get_usage.
The benefit is with scripts which are processing large amounts of data you can run into out of memory errors, see the memory_limit ini setting for more on this.
So, yes, there may be benefit, but unless you are working with large amounts of data you shouldn't need to use it.
You may also want to unset variable to prevent their value being used later on, but if that's the case it could be argued that your code needs to be written differently to prevent such things happening.
As mentioned in unset
unset() does just what it's name says - unset a variable. It does not force immediate memory freeing. PHP's garbage collector will do it when it see fits - by intention as soon, as those CPU cycles aren't needed anyway, or as late as before the script would run out of memory, whatever occurs first.
If you are doing $whatever = null; then you are rewriting variable's data. You might get memory freed / shrunk faster, but it may steal CPU cycles from the code that truly needs them sooner, resulting in a longer overall execution time.