I have some scripts that use a ton of cpu is it possible to cap the amount of cpu a process is allowed to use? I am running on CentOs 5.5 by the way.
I helped a fellow PHP coder create PHP scripts which address a similar issue. These are long-running PHP scripts which generate a lot of load. Since they're long running, the goal was to "pause" them if load gets too high. The script has a function similar to:
function get_server_load()
{
$fh = fopen('/proc/loadavg', 'r')
$data = fread($fh, 6);
fclose($fh);
$load_avg = explode(" ", $data);
return floatval(trim($load_avg[0]));
}
The script calls get_server_load() during each loop, and if the load is greater than a given max, it sleeps for 30 seconds and checks again:
set_time_limit(120);
while(get_server_load() > $max_load)
sleep($load_sleep_time);
This allows the script to give CPU time back to the server during periods of high load.
maybe you could use nice?
PHP is considered a scripting language, and does not have such low level access to the hardware.
Instead, what you can do is use functions like "set_time_limit()"
http://php.net/manual/en/function.set-time-limit.php
and memory_limit in your php.ini
http://php.net/manual/en/ini.core.php
Those are the recommended methods, but the closest you'll get to what you want are probably a combination of "sleep()"
http://php.net/manual/en/function.sleep.php
and getting the current CPU load with "exec('uptime');". Note that you may or may not have access to those system commands.
Related
I've encountered the dreaded error-message, possibly through-painstaking effort, PHP has run out of memory:
Allowed memory size of #### bytes exhausted (tried to allocate #### bytes) in file.php on line 123
Increasing the limit
If you know what you're doing and want to increase the limit see memory_limit:
ini_set('memory_limit', '16M');
ini_set('memory_limit', -1); // no limit
Beware! You may only be solving the symptom and not the problem!
Diagnosing the leak:
The error message points to a line withing a loop that I believe to be leaking, or needlessly-accumulating, memory. I've printed memory_get_usage() statements at the end of each iteration and can see the number slowly grow until it reaches the limit:
foreach ($users as $user) {
$task = new Task;
$task->run($user);
unset($task); // Free the variable in an attempt to recover memory
print memory_get_usage(true); // increases over time
}
For the purposes of this question let's assume the worst spaghetti code imaginable is hiding in global-scope somewhere in $user or Task.
What tools, PHP tricks, or debugging voodoo can help me find and fix the problem?
PHP doesn't have a garbage collector. It uses reference counting to manage memory. Thus, the most common source of memory leaks are cyclic references and global variables. If you use a framework, you'll have a lot of code to trawl through to find it, I'm afraid. The simplest instrument is to selectively place calls to memory_get_usage and narrow it down to where the code leaks. You can also use xdebug to create a trace of the code. Run the code with execution traces and show_mem_delta.
Here's a trick we've used to identify which scripts are using the most memory on our server.
Save the following snippet in a file at, e.g., /usr/local/lib/php/strangecode_log_memory_usage.inc.php:
<?php
function strangecode_log_memory_usage()
{
$site = '' == getenv('SERVER_NAME') ? getenv('SCRIPT_FILENAME') : getenv('SERVER_NAME');
$url = $_SERVER['PHP_SELF'];
$current = memory_get_usage();
$peak = memory_get_peak_usage();
error_log("$site current: $current peak: $peak $url\n", 3, '/var/log/httpd/php_memory_log');
}
register_shutdown_function('strangecode_log_memory_usage');
Employ it by adding the following to httpd.conf:
php_admin_value auto_prepend_file /usr/local/lib/php/strangecode_log_memory_usage.inc.php
Then analyze the log file at /var/log/httpd/php_memory_log
You might need to touch /var/log/httpd/php_memory_log && chmod 666 /var/log/httpd/php_memory_log before your web user can write to the log file.
I noticed one time in an old script that PHP would maintain the "as" variable as in scope even after my foreach loop. For example,
foreach($users as $user){
$user->doSomething();
}
var_dump($user); // would output the data from the last $user
I'm not sure if future PHP versions fixed this or not since I've seen it. If this is the case, you could unset($user) after the doSomething() line to clear it from memory. YMMV.
There are several possible points of memory leaking in php:
php itself
php extension
php library you use
your php code
It is quite hard to find and fix the first 3 without deep reverse engineering or php source code knowledge. For the last one you can use binary search for memory leaking code with memory_get_usage
I recently ran into this problem on an application, under what I gather to be similar circumstances. A script that runs in PHP's cli that loops over many iterations. My script depends on several underlying libraries. I suspect a particular library is the cause and I spent several hours in vain trying to add appropriate destruct methods to it's classes to no avail. Faced with a lengthy conversion process to a different library (which could turn out to have the same problems) I came up with a crude work around for the problem in my case.
In my situation, on a linux cli, I was looping over a bunch of user records and for each one of them creating a new instance of several classes I created. I decided to try creating the new instances of the classes using PHP's exec method so that those process would run in a "new thread". Here is a really basic sample of what I am referring to:
foreach ($ids as $id) {
$lines=array();
exec("php ./path/to/my/classes.php $id", $lines);
foreach ($lines as $line) { echo $line."\n"; } //display some output
}
Obviously this approach has limitations, and one needs to be aware of the dangers of this, as it would be easy to create a rabbit job, however in some rare cases it might help get over a tough spot, until a better fix could be found, as in my case.
I came across the same problem, and my solution was to replace foreach with a regular for. I'm not sure about the specifics, but it seems like foreach creates a copy (or somehow a new reference) to the object. Using a regular for loop, you access the item directly.
I would suggest you check the php manual or add the gc_enable() function to collect the garbage... That is the memory leaks dont affect how your code runs.
PS: php has a garbage collector gc_enable() that takes no arguments.
I recently noticed that PHP 5.3 lambda functions leave extra memory used when they are removed.
for ($i = 0; $i < 1000; $i++)
{
//$log = new Log;
$log = function() { return new Log; };
//unset($log);
}
I'm not sure why, but it seems to take an extra 250 bytes each lambda even after the function is removed.
I didn't see it explicitly mentioned, but xdebug does a great job profiling time and memory (as of 2.6). You can take the information it generates and pass it off to a gui front end of your choice: webgrind (time only), kcachegrind, qcachegrind or others and it generates very useful call trees and graphs to let you find the sources of your various woes.
Example (of qcachegrind):
If what you say about PHP only doing GC after a function is true, you could wrap the loop's contents inside a function as a workaround/experiment.
One huge problem I had was by using create_function. Like in lambda functions, it leaves the generated temporary name in memory.
Another cause of memory leaks (in case of Zend Framework) is the Zend_Db_Profiler.
Make sure that is disabled if you run scripts under Zend Framework.
For example I had in my application.ini the folowing:
resources.db.profiler.enabled = true
resources.db.profiler.class = Zend_Db_Profiler_Firebug
Running approximately 25.000 queries + loads of processing before that, brought the memory to a nice 128Mb (My max memory limit).
By just setting:
resources.db.profiler.enabled = false
it was enough to keep it under 20 Mb
And this script was running in CLI, but it was instantiating the Zend_Application and running the Bootstrap, so it used the "development" config.
It really helped running the script with xDebug profiling
I'm a little late to this conversation but I'll share something pertinent to Zend Framework.
I had a memory leak problem after installing php 5.3.8 (using phpfarm) to work with a ZF app that was developed with php 5.2.9. I discovered that the memory leak was being triggered in Apache's httpd.conf file, in my virtual host definition, where it says SetEnv APPLICATION_ENV "development". After commenting this line out, the memory leaks stopped. I'm trying to come up with an inline workaround in my php script (mainly by defining it manually in the main index.php file).
I didn't see it mentioned here but one thing that might be helpful is using xdebug and xdebug_debug_zval('variableName') to see the refcount.
I can also provide an example of a php extension getting in the way: Zend Server's Z-Ray. If data collection is enabled it memory use will balloon on each iteration just as if garbage collection was off.
So I have this function for making non-blocking curl requests. It works fine on what I've tested so far (small amounts of requests). But I need this to scale up to thousands of requests (maybe max 10,000). My issue is that I don't want to run into issues with too many parallel requests running at once.
What would you suggest to rate-limit the requests? Usleep? Requests in batches? The function is below:
function poly_curl($requests){
$queue = curl_multi_init();
$curl_array = array();
$count = 0;
foreach($requests as $request)
{
$curl_array[$count] = curl_init($request);
curl_setopt($curl_array[$count], CURLOPT_RETURNTRANSFER, true);
curl_multi_add_handle($queue, $curl_array[$count]);
$count++;
}
$running = NULL;
do {
curl_multi_exec($queue,$running);
} while($running > 0);
$res = array();
$count = 0;
foreach($requests as $request)
{
$res[$count] = curl_multi_getcontent($curl_array[$count]);
$count++;
}
$count = 0;
foreach($requests as $request){
curl_multi_remove_handle($queue, $curl_array[$count]);
$count++;
}
curl_multi_close($queue);
return $res;
}
I think curl_multi_exec is bad for this purpose, because even if you use batches in groups of 100, 99 request could be finished and still will have to wait for the last request completion.
But you need 100 parallel requests and when one finishes, another is immediately started. So you cannot use curl_multi_exec at all.
I would use normal producer-consumer algorithm with multiple (constant number) consumers with every consumer processing only one url. For example php-resque and COUNT=100 php resque.php
You may want to implement something that is called Exponential Backoff (wikipedia).
Basically, it is an algorithm that allows you dynamically scale your processes depending on some feedback.
You define a rate in your application, and on the first time-out, error, or anything you decide, you decrease this rate until the request finishes.
You may implement it easily using the HTTP response code for example.
Last time i was doing something like this it was including downloading and "parsing" files. Was able to proceed only 4 subpages at a time limited by very weak hardware processor (2 cores with HT). What time i ended up with two queuest: 1 for waiting, 2 for in-process. Every time a task gone from 2nd queue, new was taken from 1st one.
May saund complicated, but ended in two loops inside eachother and simple count()'s
Btw, considering so hight rate i would think of using Node.js - for simplicity - or anything more nonblocking and more suitable for deamons than PHP.. As long as threads are PHP weakpoint, it just does not suit there.
PS: nice & useful bit of code, thanks.
We used to face the same problem with C++ connection pooling code. The approach is those days involved some serious analysis.
But, the essence was that, we created a pool and requests would get processed depending on number of available requests. What we also did was assign a maximum number of connection pools.[This was determined by testing].
What you really need is a method to determine how many requests are being processed and put a limit to it. In your case that is $count
Just compare $count to a maximum value[say, $max] to it and stop there. define the value depending on the system the program runs. $max could be hardcoded or dynamic.
I have a script that is very long to execute, so when i run it it hit the max execution time on my webserver and end up timing out.
To illustrate that imagine i have a for loop that make some pretty intensive manipulation one million time. How could i spread this loop execution in several parts so that i don t hit the max execution time of my Webserver?
Many thanks,
If you have an application that is going to loop a known number of times (i.e. you are sure that it's going to finish some time) you can increase time limit inside the loop:
foreach ($data as $row) {
set_time_limit(10);
// do your stuff here
}
This solution will protect you from having one run-away iteration, but will let your whole script run undisturbed as long as you need.
Best solution is to use http://php.net/manual/en/function.set-time-limit.php to change the timeout. Otherwise, you can use 301 redirects to send to an updated URL on a timeout.
$threshold = 10000;
$t = microtime();
$i = isset( $_GET['i'] ) ? $_GET['i'] : 0;
for( $i; $i < 10000000; $i++ )
{
if( microtime - $t > $threshold )
{
header('Location: http://www.example.com/?i='.$i);
exit;
}
// Your code
}
The browser will only respect a few redirects before it stops, you're better to use javascript to force a page reload.
I someday used a technique where I splitted the work from one file into three parts. It was just an array of 120.000 elements with intensive operation. I created a splitter script which stored the arrays in a database of the size of 40.000 each one. Then I created an HTML file with a redirect to the first PHP file to compute the first 40.000 elements. After computing the first 40.000 elments I had again a HTML forward to the next PHP file and so on.
Not very elegant, but it worked :-)
If you have the right permissions on your hosting server, you could use the php interpreter to execute a php script and have it run in the background.
See Asynchronous shell exec in PHP.
if you are running a script that needs to execute for unknown time, you can use:
set_time_limit(0);
If possible you can make the script so that it handles a portion of the wanted operations. Once it completes say 10%, you via AJAX call the script again to execute the next 10%. But there are circumstances where this is not an ideal solution, it really depends on what you are doing.
I used this method to create a web-based crawler which only ran on my computer for instance. If it had to do the operations at once it would time out as well. So it was split into 200 "tasks", each called via Ajax once the previous completes. Works perfectly, and it's been over a year since it started running (crawling?)
I've encountered the dreaded error-message, possibly through-painstaking effort, PHP has run out of memory:
Allowed memory size of #### bytes exhausted (tried to allocate #### bytes) in file.php on line 123
Increasing the limit
If you know what you're doing and want to increase the limit see memory_limit:
ini_set('memory_limit', '16M');
ini_set('memory_limit', -1); // no limit
Beware! You may only be solving the symptom and not the problem!
Diagnosing the leak:
The error message points to a line withing a loop that I believe to be leaking, or needlessly-accumulating, memory. I've printed memory_get_usage() statements at the end of each iteration and can see the number slowly grow until it reaches the limit:
foreach ($users as $user) {
$task = new Task;
$task->run($user);
unset($task); // Free the variable in an attempt to recover memory
print memory_get_usage(true); // increases over time
}
For the purposes of this question let's assume the worst spaghetti code imaginable is hiding in global-scope somewhere in $user or Task.
What tools, PHP tricks, or debugging voodoo can help me find and fix the problem?
PHP doesn't have a garbage collector. It uses reference counting to manage memory. Thus, the most common source of memory leaks are cyclic references and global variables. If you use a framework, you'll have a lot of code to trawl through to find it, I'm afraid. The simplest instrument is to selectively place calls to memory_get_usage and narrow it down to where the code leaks. You can also use xdebug to create a trace of the code. Run the code with execution traces and show_mem_delta.
Here's a trick we've used to identify which scripts are using the most memory on our server.
Save the following snippet in a file at, e.g., /usr/local/lib/php/strangecode_log_memory_usage.inc.php:
<?php
function strangecode_log_memory_usage()
{
$site = '' == getenv('SERVER_NAME') ? getenv('SCRIPT_FILENAME') : getenv('SERVER_NAME');
$url = $_SERVER['PHP_SELF'];
$current = memory_get_usage();
$peak = memory_get_peak_usage();
error_log("$site current: $current peak: $peak $url\n", 3, '/var/log/httpd/php_memory_log');
}
register_shutdown_function('strangecode_log_memory_usage');
Employ it by adding the following to httpd.conf:
php_admin_value auto_prepend_file /usr/local/lib/php/strangecode_log_memory_usage.inc.php
Then analyze the log file at /var/log/httpd/php_memory_log
You might need to touch /var/log/httpd/php_memory_log && chmod 666 /var/log/httpd/php_memory_log before your web user can write to the log file.
I noticed one time in an old script that PHP would maintain the "as" variable as in scope even after my foreach loop. For example,
foreach($users as $user){
$user->doSomething();
}
var_dump($user); // would output the data from the last $user
I'm not sure if future PHP versions fixed this or not since I've seen it. If this is the case, you could unset($user) after the doSomething() line to clear it from memory. YMMV.
There are several possible points of memory leaking in php:
php itself
php extension
php library you use
your php code
It is quite hard to find and fix the first 3 without deep reverse engineering or php source code knowledge. For the last one you can use binary search for memory leaking code with memory_get_usage
I recently ran into this problem on an application, under what I gather to be similar circumstances. A script that runs in PHP's cli that loops over many iterations. My script depends on several underlying libraries. I suspect a particular library is the cause and I spent several hours in vain trying to add appropriate destruct methods to it's classes to no avail. Faced with a lengthy conversion process to a different library (which could turn out to have the same problems) I came up with a crude work around for the problem in my case.
In my situation, on a linux cli, I was looping over a bunch of user records and for each one of them creating a new instance of several classes I created. I decided to try creating the new instances of the classes using PHP's exec method so that those process would run in a "new thread". Here is a really basic sample of what I am referring to:
foreach ($ids as $id) {
$lines=array();
exec("php ./path/to/my/classes.php $id", $lines);
foreach ($lines as $line) { echo $line."\n"; } //display some output
}
Obviously this approach has limitations, and one needs to be aware of the dangers of this, as it would be easy to create a rabbit job, however in some rare cases it might help get over a tough spot, until a better fix could be found, as in my case.
I came across the same problem, and my solution was to replace foreach with a regular for. I'm not sure about the specifics, but it seems like foreach creates a copy (or somehow a new reference) to the object. Using a regular for loop, you access the item directly.
I would suggest you check the php manual or add the gc_enable() function to collect the garbage... That is the memory leaks dont affect how your code runs.
PS: php has a garbage collector gc_enable() that takes no arguments.
I recently noticed that PHP 5.3 lambda functions leave extra memory used when they are removed.
for ($i = 0; $i < 1000; $i++)
{
//$log = new Log;
$log = function() { return new Log; };
//unset($log);
}
I'm not sure why, but it seems to take an extra 250 bytes each lambda even after the function is removed.
I didn't see it explicitly mentioned, but xdebug does a great job profiling time and memory (as of 2.6). You can take the information it generates and pass it off to a gui front end of your choice: webgrind (time only), kcachegrind, qcachegrind or others and it generates very useful call trees and graphs to let you find the sources of your various woes.
Example (of qcachegrind):
If what you say about PHP only doing GC after a function is true, you could wrap the loop's contents inside a function as a workaround/experiment.
One huge problem I had was by using create_function. Like in lambda functions, it leaves the generated temporary name in memory.
Another cause of memory leaks (in case of Zend Framework) is the Zend_Db_Profiler.
Make sure that is disabled if you run scripts under Zend Framework.
For example I had in my application.ini the folowing:
resources.db.profiler.enabled = true
resources.db.profiler.class = Zend_Db_Profiler_Firebug
Running approximately 25.000 queries + loads of processing before that, brought the memory to a nice 128Mb (My max memory limit).
By just setting:
resources.db.profiler.enabled = false
it was enough to keep it under 20 Mb
And this script was running in CLI, but it was instantiating the Zend_Application and running the Bootstrap, so it used the "development" config.
It really helped running the script with xDebug profiling
I'm a little late to this conversation but I'll share something pertinent to Zend Framework.
I had a memory leak problem after installing php 5.3.8 (using phpfarm) to work with a ZF app that was developed with php 5.2.9. I discovered that the memory leak was being triggered in Apache's httpd.conf file, in my virtual host definition, where it says SetEnv APPLICATION_ENV "development". After commenting this line out, the memory leaks stopped. I'm trying to come up with an inline workaround in my php script (mainly by defining it manually in the main index.php file).
I didn't see it mentioned here but one thing that might be helpful is using xdebug and xdebug_debug_zval('variableName') to see the refcount.
I can also provide an example of a php extension getting in the way: Zend Server's Z-Ray. If data collection is enabled it memory use will balloon on each iteration just as if garbage collection was off.
I want to run a cron job that does cleanup that takes a lot of CPU and Mysql resources. I want it to run only if the server is not relatively busy.
What is the simplest way to determine that from PHP? (for example, is there a query that returns how many queries were done in the last minute? ...)
if (function_exists('sys_getloadavg')) {
$load = sys_getloadavg();
if ($load[0] > 80) {
header('HTTP/1.1 503 Too busy, try again later');
die('Server too busy. Please try again later.');
}
}
Try to update this function to your needs
If this is a Unix system, parse the output of uptime. This shows the CPU load which is generally considered to be a good measure of "busy." Anything near or over 1.0 means "completely busy."
There are three CPU "load" times in that output giving the load average over 1, 5, and 15 minutes. Pick whichever makes sense for your script. Definition of "load" is here.
On Linux you can get load from /proc/loadavg file.
$load = split(' ',file_get_contents('/proc/loadavg'))
$loadAvg = $load[0]
You can probably use the info in the Mysql List Processes function to see how many are active vs. sleeping, a decent indicator about the load on the DB.
If you're on Linux you can use the Sys GetLoadAvg function to see the overall system load.