Do I need to worry about memory leaks with PHP? In particular, I have the following code that is being called from a browser. When the call finishes, is everything cleaned up properly, or, do I need to clear the memory created by the first array that was created?
class SomeClass
{
    var $someArray = array();
    function someMethod()
    {
        $this->someArray[1] = "Some Value 1";
        $this->someArray[2] = "Some Value 2";
        $this->someArray[3] = "Some Value 3";
        $this->someArray = array();
        $this->someArray[1] = "Some other Value";
        $this->someArray[2] = "Some other Value";
        $this->someArray[3] = "Some other Value";
    }
}
someMethod();
Thanks,
Scott
Do I need to worry about memory leaks with PHP?
It's possible to create a cyclic reference in PHP where the refcount of the zval never drops to 0. This causes a memory leak, because pure reference counting never frees values that still have a reference to them. This was fixed in PHP 5.3, which added a cycle collector.
In particular, I have the following code that is being called from a browser. When the call finishes, is everything cleaned up properly, or, do I need to clear the memory created by the first array that was created?
PHP scripts have a request lifecycle (run application, return response, close application), so it shouldn't be a worry. All memory used by your application is freed when the request finishes, ready to be reused by the next request.
If you're super paranoid, you can always unset things. However, PHP is a garbage-collected language, so unless there is a bug in the core or in an extension, memory you no longer reference gets reclaimed.
On a side note, you should use the newer PHP 5 OOP syntax. Also, calling someMethod() on its own would be an error; it needs to be $obj->someMethod(), where $obj is an instance of the class.
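As a minimal sketch of that side note, the class from the question could look like this with PHP 5 visibility keywords, called on an instance. The variable names $obj and $result are only illustrative, and the return value is added here purely so the outcome can be inspected.

```php
<?php
class SomeClass
{
    // PHP 5+ visibility keyword instead of the PHP 4 style "var".
    private $someArray = array();

    public function someMethod()
    {
        $this->someArray[1] = "Some Value 1";
        $this->someArray[2] = "Some Value 2";
        $this->someArray[3] = "Some Value 3";

        // Reassigning drops the last reference to the first array,
        // so its refcount reaches 0 and PHP frees it; no manual cleanup needed.
        $this->someArray = array();
        $this->someArray[1] = "Some other Value";

        return $this->someArray;
    }
}

// The method has to be called on an instance, not as a bare function.
$obj = new SomeClass();
$result = $obj->someMethod();
```

After the call, only the second array is still alive; the first one was released at the moment of reassignment.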
There actually do exist memory problems if you run mod_php through Apache with the mpm_prefork behavior. The problem is that memory consumed by PHP is not released back to the operating system. The same Apache process can reuse the memory for subsequent requests, but it can't be used by other programs (not even other Apache processes).
One solution is to restart the processes from time to time, for example by setting the MaxRequestsPerChild setting to something rather low (100 or so, maybe lower for lightly loaded servers). The best solution is to not use mod_php at all but instead run PHP through FastCGI.
This is a sysadmin issue though, not a programmer issue.
I have a PHP script that sends data to another script, which should process it asynchronously (at least that is what I hope to achieve). Here is the code of called.php:
include_once("../caller.php");
chdir(__DIR__);
fclose(STDOUT); //THIS
fclose(STDIN); //THIS
fclose(STDERR); //THIS
function giveCake($arg1,$arg2){
    global $mysqli;
    $sleep = 15; //script has to sleep
    (...) code amongst sleep (...)
    sleep($sleep);
    $_SESSION; //would session variable of the user be available if the script is called as described?
    //script caller.php is firstly initiated by a script with pre-defined $_SESSION
    //now that I'm thinking maybe it won't since it is called from the command line...
    pcntl_exec("/usr/bin/php",Array($_SERVER['argv'][1]));
}
if (!isset($_SERVER["HTTP_HOST"])) { //check if it comes from within the server? localhost?
    $arg1 = parse_str($argv[1], $_GET);
    $arg2 = parse_str($argv[1], $_POST);
    if($arg1 && $arg2){
        giveCake($arg1,$arg2);
    }
}
And my concerns are given in the title, as so:
Does closing the standard streams (as at the beginning of called.php) affect all other scripts that might be using file operations, or only this script during this execution?
If it were called using cURL, would I leave the script vulnerable to inappropriate execution? I think I would then have access to $_SESSION, but that would leave it easily spoofable if someone wanted to execute it. Is there any way to counter this?
The arguments I need to transfer between scripts could easily add up to a lot of bytes (around 400 bytes per array, times x arrays). Would that cause any problem for execution?
Thank you very much for your help. I hope you don't consider this too broad; I've tried to detail all my concerns explicitly and would like help with the whole process (easier than fragmenting it). Please help as you can, tyvm.
Q1: File operations always affect the script currently in execution, of course including all libraries loaded via require or include.
Q2: Depending on where the caller and the callee sit, you could limit access for example by restricting access to certain IPs and maybe access method via .htaccess.
Like:
<Limit GET POST>
order deny,allow
deny from all
allow from 1.2.3.4
</Limit>
Q3: Also depending on the connection between the two scripts, usually there should be no problem with big data amounts if you have enough bandwidth available.
We have some scripts in operation that regularly handle data in the range of several hundred megabytes. It may be necessary to raise or disable the script execution time limit, either by setting max_execution_time in php.ini or via ini_set(), or by calling set_time_limit() (which is a different approach, since it restarts the timer).
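A rough sketch of the two runtime options mentioned above. The 300-second value is an arbitrary assumption for illustration, and either call can be blocked by the server configuration.

```php
<?php
// Option 1: change the ini setting at runtime (value in seconds).
ini_set('max_execution_time', '300');

// Option 2: restart the timeout counter from zero with a new limit.
// Passing 0 instead would remove the limit entirely.
set_time_limit(300);

// Read back whatever limit is currently in effect.
$limit = ini_get('max_execution_time');
```

The difference matters inside long loops: calling set_time_limit() on each iteration gives every iteration a fresh allowance, while the ini setting is a single budget for the whole script.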
pcntl_exec() will simply replace the current process by the new one. There is actually no communication happening. I'm wondering how you can think that some asynchronous communication is happening.
Also I'm unsure what $_SERVER['argv'][1] should do here. Don't you mean argv[0]?
So at the moment you have just presented a bunch of non-working code. That's not enough to go on.
I've encountered the dreaded error message; possibly through painstaking effort, PHP has run out of memory:
Allowed memory size of #### bytes exhausted (tried to allocate #### bytes) in file.php on line 123
Increasing the limit
If you know what you're doing and want to increase the limit see memory_limit:
ini_set('memory_limit', '16M');
ini_set('memory_limit', -1); // no limit
Beware! You may only be solving the symptom and not the problem!
Diagnosing the leak:
The error message points to a line within a loop that I believe to be leaking, or needlessly accumulating, memory. I've printed memory_get_usage() statements at the end of each iteration and can see the number slowly grow until it reaches the limit:
foreach ($users as $user) {
    $task = new Task;
    $task->run($user);
    unset($task); // Free the variable in an attempt to recover memory
    print memory_get_usage(true); // increases over time
}
For the purposes of this question let's assume the worst spaghetti code imaginable is hiding in global-scope somewhere in $user or Task.
What tools, PHP tricks, or debugging voodoo can help me find and fix the problem?
PHP manages memory primarily through reference counting; before PHP 5.3 it had no collector for cycles at all. Thus, the most common sources of memory leaks are cyclic references and global variables. If you use a framework, you'll have a lot of code to trawl through to find it, I'm afraid. The simplest instrument is to selectively place calls to memory_get_usage and narrow it down to where the code leaks. You can also use xdebug to create a trace of the code: run it with execution traces and show_mem_delta enabled.
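To illustrate the cyclic-reference case, here is a small sketch. The Node class and makeCycle() function are hypothetical names made up for this example; gc_enable() and gc_collect_cycles() are the built-in functions available since PHP 5.3.

```php
<?php
// Hypothetical class used only to build a reference cycle.
class Node
{
    public $other;
}

function makeCycle()
{
    $a = new Node();
    $b = new Node();
    $a->other = $b;   // $a points at $b ...
    $b->other = $a;   // ... and $b points back at $a.
    // When this function returns, each object is still referenced by the
    // other one, so reference counting alone cannot free either of them.
}

gc_enable();           // make sure the cycle collector is switched on
gc_collect_cycles();   // start from a clean slate

for ($i = 0; $i < 100; $i++) {
    makeCycle();
}

// Force a collection run; the return value says how much garbage was found.
$collected = gc_collect_cycles();
```

If $collected is greater than zero, the loop was producing cycles that plain reference counting left behind. In a long-running script, calling gc_collect_cycles() every so often inside the loop is one way to test whether cycles are the cause of the growth.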
Here's a trick we've used to identify which scripts are using the most memory on our server.
Save the following snippet in a file at, e.g., /usr/local/lib/php/strangecode_log_memory_usage.inc.php:
<?php
function strangecode_log_memory_usage()
{
    $site = '' == getenv('SERVER_NAME') ? getenv('SCRIPT_FILENAME') : getenv('SERVER_NAME');
    $url = $_SERVER['PHP_SELF'];
    $current = memory_get_usage();
    $peak = memory_get_peak_usage();
    error_log("$site current: $current peak: $peak $url\n", 3, '/var/log/httpd/php_memory_log');
}
register_shutdown_function('strangecode_log_memory_usage');
Employ it by adding the following to httpd.conf:
php_admin_value auto_prepend_file /usr/local/lib/php/strangecode_log_memory_usage.inc.php
Then analyze the log file at /var/log/httpd/php_memory_log
You might need to touch /var/log/httpd/php_memory_log && chmod 666 /var/log/httpd/php_memory_log before your web user can write to the log file.
I noticed one time in an old script that PHP would maintain the "as" variable as in scope even after my foreach loop. For example,
foreach($users as $user){
$user->doSomething();
}
var_dump($user); // would output the data from the last $user
I'm not sure if future PHP versions fixed this or not since I've seen it. If this is the case, you could unset($user) after the doSomething() line to clear it from memory. YMMV.
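For what it's worth, foreach does not introduce its own scope, so the loop variable stays set after the loop. A minimal sketch with placeholder data (the $users values here are made up):

```php
<?php
$users = array('alice', 'bob', 'carol');   // placeholder data

foreach ($users as $user) {
    // ... work with $user ...
}

// The loop variable is still set and holds the last element.
$lastSeen = $user;

// Drop the leftover variable once the loop is done.
unset($user);
$stillSet = isset($user);
```

This only releases the last element's reference, so it is unlikely to explain steady growth across thousands of iterations, but it does matter when the last item is large.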
There are several possible points of memory leaking in php:
php itself
php extension
php library you use
your php code
It is quite hard to find and fix the first 3 without deep reverse engineering or php source code knowledge. For the last one you can use binary search for memory leaking code with memory_get_usage
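One way to do that kind of bisection is to measure how much memory each suspect step leaves allocated. This is only a sketch: memoryDelta(), leakyStep() and cleanStep() are hypothetical names invented for the example, standing in for pieces of your own code.

```php
<?php
// Hypothetical helper: report how many bytes a piece of code leaves allocated.
function memoryDelta(callable $step)
{
    $before = memory_get_usage();
    $step();
    return memory_get_usage() - $before;
}

// Stand-in for a leaky component: it keeps appending to a static array.
function leakyStep()
{
    static $hoard = array();
    $hoard[] = str_repeat('x', 100000);
}

// Stand-in for a well-behaved component: its data is freed on return.
function cleanStep()
{
    $tmp = str_repeat('x', 100000);
    return strlen($tmp);
}

$leaky = memoryDelta('leakyStep');
$clean = memoryDelta('cleanStep');
```

Wrap half of the suspect code in such a measurement, see which half retains memory, and repeat on that half until the offending call is isolated.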
I recently ran into this problem in an application, under what I gather to be similar circumstances: a script that runs in PHP's CLI and loops over many iterations. My script depends on several underlying libraries. I suspect a particular library is the cause, and I spent several hours trying in vain to add appropriate destruct methods to its classes. Faced with a lengthy conversion to a different library (which could turn out to have the same problems), I came up with a crude workaround for the problem in my case.
In my situation, on a Linux CLI, I was looping over a bunch of user records and, for each one, creating a new instance of several classes I had written. I decided to try creating the new instances using PHP's exec function so that each one would run in a new process. Here is a really basic sample of what I am referring to:
foreach ($ids as $id) {
    $lines = array();
    // escapeshellarg() keeps an unexpected $id from being interpreted by the shell
    exec("php ./path/to/my/classes.php " . escapeshellarg($id), $lines);
    foreach ($lines as $line) { echo $line."\n"; } //display some output
}
Obviously this approach has limitations, and one needs to be aware of its dangers, as it would be easy to create a "rabbit job" (a fork bomb). However, in some rare cases it might help get over a tough spot until a better fix can be found, as in my case.
I came across the same problem, and my solution was to replace foreach with a regular for. I'm not sure about the specifics, but it seems like foreach creates a copy (or somehow a new reference) to the object. Using a regular for loop, you access the item directly.
I would suggest checking the PHP manual, or calling gc_enable() so that the garbage collector runs. That way the memory leaks don't affect how your code runs.
PS: PHP has a garbage collector; gc_enable() takes no arguments.
I recently noticed that PHP 5.3 lambda functions leave extra memory used when they are removed.
for ($i = 0; $i < 1000; $i++)
{
    //$log = new Log;
    $log = function() { return new Log; };
    //unset($log);
}
I'm not sure why, but each lambda seems to take an extra 250 bytes even after the function is removed.
I didn't see it explicitly mentioned, but xdebug does a great job profiling time and memory (as of 2.6). You can take the information it generates and pass it off to a gui front end of your choice: webgrind (time only), kcachegrind, qcachegrind or others and it generates very useful call trees and graphs to let you find the sources of your various woes.
If what you say about PHP only doing GC after a function is true, you could wrap the loop's contents inside a function as a workaround/experiment.
One huge problem I had was caused by using create_function. As with lambda functions, it leaves the generated temporary name in memory.
Another cause of memory leaks (in case of Zend Framework) is the Zend_Db_Profiler.
Make sure that is disabled if you run scripts under Zend Framework.
For example, I had the following in my application.ini:
resources.db.profiler.enabled = true
resources.db.profiler.class = Zend_Db_Profiler_Firebug
Running approximately 25,000 queries, plus loads of processing before that, brought memory use to a nice 128 MB (my max memory limit).
Just setting:
resources.db.profiler.enabled = false
was enough to keep it under 20 MB.
This script was running in the CLI, but it was instantiating Zend_Application and running the Bootstrap, so it used the "development" config.
Running the script with Xdebug profiling really helped.
I'm a little late to this conversation but I'll share something pertinent to Zend Framework.
I had a memory leak problem after installing php 5.3.8 (using phpfarm) to work with a ZF app that was developed with php 5.2.9. I discovered that the memory leak was being triggered in Apache's httpd.conf file, in my virtual host definition, where it says SetEnv APPLICATION_ENV "development". After commenting this line out, the memory leaks stopped. I'm trying to come up with an inline workaround in my php script (mainly by defining it manually in the main index.php file).
I didn't see it mentioned here but one thing that might be helpful is using xdebug and xdebug_debug_zval('variableName') to see the refcount.
I can also provide an example of a PHP extension getting in the way: Zend Server's Z-Ray. If data collection is enabled, its memory use will balloon on each iteration, just as if garbage collection were off.
Let's say I cache data in a PHP file in PHP array like this:
/cache.php
<?php return (object) array(
'key' => 'value',
);
And I include the cache file like this:
<?php
$cache = include 'cache.php';
Now, the question is will the cache file be automatically cached by APC in the memory? I mean as a typical opcode cache, as all .php files.
If I store the data differently for example in JSON format (cache.json), the data will not be automatically cached by APC?
Would apc_store be faster/preferable?
Don't mix APC's caching abilities with its ability to optimize intermediate code and cache compiled code. APC provides 2 different things:
1. It gives a handy method of caching data structures (objects, arrays etc.), so that you can store/get them with apc_store and apc_fetch.
2. It keeps a compiled version of your scripts so that the next time they run, they run faster.
Let's see an example for (1): Suppose you have a data structure which takes 1 second to calculate:
function calculate_array() {
    sleep(1);
    return array('foo' => 'bar');
}
$data = calculate_array();
You can store its output so that you don't have to call the slow calculate_array() again:
function calculate_array() {
    sleep(1);
    return array('foo' => 'bar');
}
if (!apc_exists('key1')) {
    $data = calculate_array();
    apc_store('key1', $data);
} else {
    $data = apc_fetch('key1');
}
which will be considerably faster, much less than the original 1 second.
Now, for (2) above: having APC will not make your program run faster than 1 second, which is the time that calculate_array() needs. However, if your file additionally needed (say) 100 milliseconds to initialize and execute, simply having APC enabled might bring that down to approximately 20 milliseconds. That is an 80% reduction in initialization/preparation time. This can make quite a difference in production systems, so simply installing APC can have a noticeable positive impact on your script's performance, even if you never explicitly call any of its functions.
If you are just storing static data (as in your example), it would be preferable to use apc_store.
The reasoning behind this is not so much whether the opcode cache is faster or slower, but the fact you are using include to fetch static data into scope.
Even with an opcode cache, the file will still be checked for consistency on each execution. PHP will not have to parse the contents, but it will have to check whether the file exists, and that it hasn't changed since the opcode cache was created. Filesystem checks are resource expensive, even if it is only to stat a file.
Therefore, of the two approaches I would use apc_store to remove the filesystem checks completely.
Unlike the other answer I would use the array-file-solution (the first one)
<?php return (object) array(
'key' => 'value',
);
The reason is that either solution is fine, but when you leave the caching to APC itself you don't have to juggle the apc_*() functions. You simply include the file and use it. When you set
apc.stat = 0
you avoid the stat-calls on every include too. This is useful for production, but remember to clear the system-cache on every deployment.
http://php.net/apc.configuration.php#ini.apc.stat
Oh, not to forget: With the file-approach it works even without APC. Useful for the development setup, where you usually shouldn't use any caching.
I'm wondering whether I can skip the fclose call by just unsetting the variable that holds the handle.
$handle = fopen($file, 'r');
...
fclose($handle);
... // script goes on for a long time
Compared with:
$handle = fopen($file, 'r');
...
unset($handle);
... // script goes on for a long time
Insights anyone?
Thanks to the reference-counting system introduced with PHP 4's Zend Engine, a resource with no more references to it is detected automatically, and it is freed by the garbage collector.
Consider the implications of this. It's safe to assume that all traces of the variable are gone after the garbage collection. In other words, at the end of PHP's execution, if PHP is not still tracking the reference, how would it close it? Thus, it seems fairly logical that it would close it when the garbage collector eats it.
This is a bad logical argument, though, because it assumes that garbage collection happens either immediately or shortly after unset, and that PHP does not keep hidden references to variables that no longer exist in userland.
A more compelling case though could be a potential behavioural flaw if PHP did not close file handles when they go out of scope. Consider a daemon of some kind that opens lots of files. Now consider if fclose is never called. Instead, variables are allowed to fall out of scope or unset is explicitly called on them.
If these file handles were not closed, this long running daemon would run out of file handles.
Potentially behavior specific test script:
<?php
$db = mysql_connect(...);
if ($db) {
    echo "Connected\n";
    sleep(5); //netstat during this just for paranoia
    unset($db);
    echo "Unset\n";
    sleep(5); //netstat during this and the connection is closed
}
On both Windows 7 and Debian 6, the connection has been closed after the unset.
Obviously though, this only proves that it works on my specific machines with my specific PHP version. It says nothing about file handles or the like :).
Am searching the PHP source now for hard proof
PHP docs hint that all resources with no remaining references are "freed", I assume for file handles this would include closing the file.
Simple test case:
$f = fopen("test.php", "r");
// LOCK_NB makes flock() return immediately instead of waiting for the lock
if (!flock($f, LOCK_EX | LOCK_NB)) {
    print("still locked\n");
    exit;
}
unset($f);
sleep(5);
print("goodbye\n");
(I've saved this as test.php, so it is locking itself; might need to change the filename in the fopen() to some existing file otherwise)
Run the script twice within 5 seconds; if you get "still locked", then apparently unsetting the handle did not release the lock. In my test, I did not get "still locked", so apparently unsetting the handle at least releases the lock, though it would seem silly to release locks upon garbage collection, but not close the file.
unset($handle) will destroy the $handle variable, but it won't close the file pointed to by $handle. You still need to call fclose() to close the file.
Some research:
After fclose, $handle is resource(5) of type (Unknown), while after unset it is NULL.
Also, after fclose PHP consumes 88 bytes more memory.
So: they are different =)
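The difference in what is left in the variable can be checked with a short sketch. It uses php://memory as a stand-in stream so that no file on disk is needed, and the variable names are only illustrative. It shows the state of the variable after each call; on its own it says nothing about when the underlying file descriptor is released.

```php
<?php
// After fclose(): the variable still exists but no longer holds a usable stream.
$closed = fopen('php://memory', 'r+');
fclose($closed);
$closedIsSet      = isset($closed);
$closedIsResource = is_resource($closed);

// After unset(): the variable itself is gone.
$unsetHandle = fopen('php://memory', 'r+');
unset($unsetHandle);
$unsetIsSet = isset($unsetHandle);
```

So a closed handle can still be passed around by mistake (is_resource() is the way to detect that), whereas an unset one simply no longer exists.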