So here is a little background info on my setup. Running Centos with apache and php 5.2.17. I have a website that lists products from many different retailers websites. I have crawler scripts that run to grab products from each website. Since every website is different, each crawler script had to be customized to crawl the particular retailers website. So basically I have 1 crawler per retailer. At this time I have 21 crawlers that are constantly running to gather and refresh the products from these websites. Each crawler is a php file and once the php script is done running it checks to ensure its the only instance of itself running and at the very end of the script it uses exec to start itself all over again while the original instance closes. This helps protect against memory leaks since each crawler restarts itself before it closes. However recently I will check the crawler scripts and notice that one of them Isnt running anymore and in the error log I find the following.
PHP Warning: exec() [<a href='function.exec'>function.exec</a>]: Unable to fork [nice -n 20 php -q /home/blahblah/crawler_script.php >/dev/null &]
This is what is supposed to start this particular crawler over again however since it was "unable to fork" it never restarted and the original instance of the crawler ended like it normally does.
Obviously its not a permission issue because each of these 21 crawler scripts runs this exec command every 5 or 10 minutes at the end of its run and most of the time it works as it should. This seems to happen maybe once or twice a day. It seems as though its a limit of some sort as I have only just recently started to see this happen ever since I added my 21st crawler. And its not always the same crawler that gets this error it will be any one of them at a random time that are unable to fork its restart exec command.
Does anyone have an idea what could be causing php to be unable to fork or maybe even a better way to handle these processes as to get around the error all together? Is there a process limit I should look into or something of that nature? Thanks in advance for help!
Process limit
"Is there a process limit I should look into"
It's suspected somebody (system admin?) set limitation of max user process. Could you try this?
$ ulimit -a
....
....
max user processes (-u) 16384
....
Run preceding command in PHP. Something like :
echo system("ulimit -a");
I searched whether php.ini or httpd.conf has this limit, but I couldn't find it.
Error Handling
"even a better way to handle these processes as to get around the error all together?"
The third parameter of exec() returns exit code of $cmd. 0 for success, non zero for error code. Refer to http://php.net/function.exec .
exec($cmd, &$output, &$ret_val);
if ($ret_val != 0)
{
// do stuff here
}
else
{
echo "success\n";
}
In my case (large PHPUnit test suite) it would say unable to fork once the process hit 57% memory usage. So, one more thing to watch for, it may not be a process limit but rather memory.
I ran into same problem and I tried this and it worked for me;
ulimit -n 4096
The problem is often caused by the system or the process or running out of available memory. Be sure that you have enough by running free -m. You will get a result like the following:
total used free shared buffers cached
Mem: 7985 7722 262 19 189 803
-/+ buffers/cache: 6729 1255
Swap: 0 0 0
The buffers/cache line is what you want to look at. Notice free memory is 1255 MB on this machine. When running your program keep trying free -m and check free memory to see if this falls into the low hundreds. If it does you will need to find a way to run you program while consumer less memory.
For anyone else who comes across this issue, it could be several problems as outlined in this question's answer.
However, my problem was my nginx user did not have a proper shell to execute the commands I wanted. Adding .bashrc to the nginx user's home directory fixed this.
Related
As part of a process automation routine I have a shell script which executes several PHP scripts via CLI. One of them, the very last of course, is very, very long. The code base I was handed was a mess and much of the import.php script and ImportProduct.php object is essentially unmodifiable.
Last week I finished updating the code to MySQLi and set to testing and keep getting a similar process termination message. I have dug through the Magento PHP library, my own code, the original code and can find nothing which would cause it. The format alone lends one to believe that either the PHP CLI interpreter, Linux OS, or Apache (if it is even running via Apache, sadly not very sysadmin savvy) is causing the termination. I SSH to the remote server, but the last two lines show that SSH itself did not timeout.
The default timeout via PHP CLI is that there is none, however setting it to 0 or 180 days makes no difference. Therefore, I believe the issue is at a lower level. Please don't bother screaming about root access it's not my decision to work this way.
Here is a section of my messages. Line 5 contains the termination message.
new_import_products.php: ET068FII.TXT: Processed 406085 records in 1686 queries.
import.php: Begin
ImportProduct.php> start 1055398
import.php: done loadFile()
./executeFullUpdateImport.sh: line 58: 4908 Killed php -f import.php
executeFullUpdateImport> full update/import completed
executeFullUpdateImport> Performed in 384m, 56s.\n\n
[root#stinedev import]#
[root#stinedev import]# ls
Searching for something like this was very unhelpful. Google returns 9 billion pages of how to kill a process... not keep one alive, no matter what combinations of terms I use (admittedly I could be asking the wrong questions).
I have met this situation. One possible cause is oom killer. That's easy to check. dmsg will print this behavior.
As some other user process send signal to kill,my solution is to use kernel audition. google auditctl the first entry.
The answer to this question was that it is indeed an Out-of-Memory (OOM) error. Wish the message was more descriptive when it was killed.
The log file is platform dependant, but on CentOS it was located at:
/var/log/messages[-date]
where [-date] is an optional YYYYMMDD date stamp. The offending entries will look something like this:
... some log stuff here ...
Apr 13 21:26:23 MACHINE kernel: php invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
Apr 13 21:26:23 MACHINE kernel: php cpuset=/ mems_allowed=0
Apr 13 21:26:23 MACHINE kernel: Pid: 1337, comm: php Not tainted 2.6.32-431.29.2.el6.x86_64 #1
Apr 13 21:26:23 MACHINE kernel: Call Trace:
... more log stuff here ...
where MACHINE is the name of your rig.
PS: Thanks for the help in the comments. I wouldn't of been able to answer this without the help, and would of chose an answer from them, but it isn't allowed.
I've stumbled upon a phenomenon that I can't explain myself.
I'm using popen to execute php and then execute a php script that way, and pclose to close that.
So far so fine. I ran into quite some severe troubles as the script where I used this didn't execute and instead after trying it 3 times in a row I crashed the zend-server (no page would open any more). I found out that the reason to this was that I used a wrong directory for the php.exe. Example:
if (pclose(popen("C:\wrongDir\php\php.exe C:\Zend\Apache2\htdocs\myApp\public\mytest.php 57 > C:\Logs\1\0\jobOut.log 2> C:\Logs\1\0\jobErr.log"))>-1)
{
.....
}
Aside from the "wrongDir" all other dirs were correct....the popen even created the jobOut and jobErr files (which were empty). (remark: PHP is not in a searchpath that is why it wasn't found without the correct path)
Even though I now solved the problem....I have the question if this is a normal behaviour there, or if I have done something wrong (maybe even server settings). As from what I read in the manual about both commands it sounded to me that in the my case I should have had a return value of either -1 or 0 and not the problem I ran into with the process and then the server hanging).
Thanks.
It appears that pclose() doesn't return the exit status of the process but rather of it's ability to close the process.
To get the process termination 'code' use pcntl_wifexited() and pcntl_wexitstatus()
http://php.net/manual/en/function.pclose.php
http://php.net/manual/en/function.pcntl-wexitstatus.php
I have 1 cronjob that runs every 60 minutes but for some reason, recently, it is running slow.
Env: centos5 + apache2 + mysql5.5 + php 5.3.3 / raid 10/10k HDD / 16gig ram / 4 xeon processor
Here's what the cronjob do:
parse the last 60 minutes data
a) 1 process parse user agent and save the data to the database
b) 1 process parse impressions/clicks on the website and save them to the database
from the data in step 1
a) build a small report and send emails to the administrator/bussiness
b) save the report into a daily table (available in the admin section)
I see now 8 processes (the same file) when I run the command ps auxf | grep process_stats_hourly.php (found this command in stackoverflow)
Technically I should only have 1 not 8.
Is there any tool in Cent OS or something I can do to make sure my cronjob will run every hour and not overlapping the next one?
Thanks
Your hardware seems to be good enough to process this.
1) Check if you already have hanging processes. Using the ps auxf (see tcurvelo answer), check if you have one or more processes that takes too much resources. Maybe you don't have enough resources to run your cronjob.
2) Check your network connections:
If your databases and your cronjob are on a different server you should check whats the response time between these two machines. Maybe you have network issues that makes the cronjob wait for the network to send the package back.
You can use: Netcat, Iperf, mtr or ttcp
3) Server configuration
Is your server is configured correctly? Your OS, MySQL are setup correctly? I would recommend to read these articles:
http://www3.wiredgorilla.com/content/view/220/53/
http://www.vr.org/knowledgebase/1002/Optimize-and-disable-default-CentOS-services.html
http://dev.mysql.com/doc/refman/5.1/en/starting-server.html
http://www.linux-mag.com/id/7473/
4) Check your database:
Make sure your database has the correct indexes and make sure your queries are optimized. Read this article about the explain command
If a query with few hundreds thousands of record takes times to execute that will affect the rest of your cronjob, if you have a query inside a loop, even worse.
Read these articles:
http://dev.mysql.com/doc/refman/5.0/en/optimization.html
http://20bits.com/articles/10-tips-for-optimizing-mysql-queries-that-dont-suck/
http://blog.fedecarg.com/2008/06/12/10-great-articles-for-optimizing-mysql-queries/
5) Trace and optimized PHP code?
Make sure your PHP code runs as fast as possible.
Read these articles:
http://phplens.com/lens/php-book/optimizing-debugging-php.php
http://code.google.com/speed/articles/optimizing-php.html
http://ilia.ws/archives/12-PHP-Optimization-Tricks.html
A good technique to validate your cronjob is to trace your cronjob script:
Based on your cronjob process, put some debug trace including how much memory, how much time it took to execute the last process. eg:
<?php
echo "\n-------------- DEBUG --------------\n";
echo "memory (start): " . memory_get_usage(TRUE) . "\n";
$startTime = microtime(TRUE);
// some process
$end = microtime(TRUE);
echo "\n-------------- DEBUG --------------\n";
echo "memory after some process: " . memory_get_usage(TRUE) . "\n";
echo "executed time: " . ($end-$start) . "\n";
By doing that you can easily find which process takes how much memory and how long it takes to execute it.
6) External servers/web service calls
Is your cronjob calls external servers or web service? if so, make sure these are loaded as fast as possible. If you request data from a third-party server and this server takes few seconds to return an answer that will affect the speed of your cronjob specially if these calls are in loops.
Try that and let me know what you find.
The ps's output also shows when the process have started (see column STARTED).
$ ps auxf
USER PID %CPU %MEM VSZ RSS TTY STAT STARTED TIME COMMAND
root 2 0.0 0.0 0 0 ? S 18:55 0:00 [ktrheadd]
^^^^^^^
(...)
Or you can customize the output:
$ ps axfo start,command
STARTED COMMAND
18:55 [ktrheadd]
(...)
Thus, you can be sure if they are overlapping.
You should use a lockfile mechanism within your process_stats_hourly.php script. Doesn't have to be anything overly complex, you could have php write the PID which started the process to a file like /var/mydir/process_stats_hourly.txt. So if it takes longer than an hour to process the stats and cron kicks off another instance of the process_stats_hourly.php script, it can check to see if the lockfile already exists, if it does it will not run.
However you are left with the problem of how to "re-queue" the hourly script if it did find the lock file and couldn't start.
You might use strace -p 1234 where 1234 is a relevant process id, on one of the processes which is running too long. Perhaps you'll understand why is it so slow, or even blocked.
Is there any tool in Cent OS or something I can do to make sure my cronjob will run every hour and not overlapping the next one?
Yes. CentOS' standard util-linux package provides a command-line convenience for filesystem locking. As Digital Precision suggested, a lockfile is an easy way to synchronize processes.
Try invoking your cronjob as follows:
flock -n /var/tmp/stats.lock process_stats_hourly.php || logger -p cron.err 'Unable to lock stats.lock'
You'll need to edit paths and adjust for $PATH as appropriate. That invocation will attempt to lock stats.lock, spawning your stats script if successful, otherwise giving up and logging the failure.
Alternatively your script could call PHP's flock() itself to achieve the same effect, but the flock(1) utility is already there for you.
How often is that logfile rotated?
A log-parsing job suddenly taking longer than usual sounds like the log isn't being rotated and is now too big for the parser to handle efficiently.
Try resetting the logfile and see if the job runs faster. If that solves the problem, I recommend logrotate as a means of preventing the problem in the future.
You could add a step to the cronjob to check the output of your above command:
ps auxf | grep process_stats_hourly.php
Keep looping until the command returns nothing, indicating that the process isn't running, then allow the remaining code to execute.
I created a script that runs in the background using the ignore_user_abort() function. However, I was foolish enough not to insert any sort of code to make the script stop and now it is sending e-mails every 30 seconds...
Is there any way to stop the script? I am in a shared hosting, so I don't have access to the command prompt, and I don't know the PID.
Is there any way to stop the script? I am in a shared hosting, so I don't have access to the command prompt, and I don't know the PID.
Then no.
But are you sure you don't have any shell access? Even via PHP? If you do, you could try....
<?php
print `ps -ef | grep php`;
...and if you can identify the process from that then....
<?php
$pid=12345; // for example.
print `kill -9 $pid`;
And even if you don't have access to run shell commands, you may be able to find the pid in /proc (on a linux system) and terminate it using the POSIX extension....
<?php
$ps=glob('/proc/[0-9]*');
foreach ($ps as $p) {
if (is_dir($p) && is_writeable($p)) {
print "proc= " . basename($p);
$cmd=file_get_contents($p . '/cmdline');
print " / " . file_get_contents($p . '/cmdline');
if (preg_match('/(php).*(myscript.php)/',$cmd)) {
posix_kill(basename($p), SIGKILL);
print " xxxxx....";
break;
}
print "\n";
}
}
I came to this thread Yesterday! I by mistake had a infinite loop in a page which was not supposed to be visited and that increased my I/O to 100 and CPU usage to 100 I/O was because of some php errors and it was getting logged and log file size was increasing beyond anyone can think.
None of the above trick worked on my shared hosting.
MY SOLUTION
In cPanel, go to PHP Version (except that of current)
Select any PHP Version for time being.
And then Apply Changes.
REASON WHY IT WORKED
The script which had infinite loop with some php errors was a process so I just needed to kill it, changing php version reinforce restart of services like php and Apache, and as restart was involved earlier processes were killed, and I was relaxed as I/O and CPU usage stabilized. Also, I fixed that bug before hand changing the php version :)
how did you deploy the script? surely you can just remove it (if that's an acceptable option). otherwise modify it and insert some logic to only allow it to send a mail once every n mins/hours/days based on the server time?
re. stopping the script from executing (or rather the system trying to execute it) how did you schedule it for execution? is it some type of gui to a crontab or something? can you not just undo what you did there (seeing as you have no access to the command line/terminal)?
rob ganly
Simply .
Call the support, get it cancelled.
Next time, don't execute something you can't control.
We have a node js script that runs a command to execute the following command:
/usr/local/bin/php -q /home/www/441.php {"id":"325241"}
This script does a lot of things, however it does not seem to respect the time limit. The first line of this file is:
set_time_limit(1800);
Yet if we check what processes are running on the server (ps -aux | grep php) we will see a lot of these commands that have been open since last week.
Any ideas on how we can clean this up?
I found the following comment on the PHP user guide for max_execution_time
Keep in mind that for CLI SAPI
max_execution_time is hardcoded to 0.
So it seems to be changed by ini_set
or set_time_limit but it isn't,
actually. The only references I've
found to this strange decision are
deep in bugtracker
(http://bugs.php.net/37306) and in
php.ini (comments for
'max_execution_time' directive).
So it would seem that there's a bug in the CLI module that means max_execution_time is effectively ignored.
The commenter mentioned a page in the bug tracker about this at http://bugs.php.net/37306 but the tracker seems to be down.
set_time_limit only has meaning to the php part of the program. If you had a query on a database that takes 5h to finish, those 5h are not counted by php, so they fall out of scope of the set_time_limit limitation. Having said that, it seems weird that a php process is still running after a week, if it is not calling another program that runs forever (which, in this case, the set_time_limit neither affects that calling).
Also, what does the -q flag? I can't find it on man php nor php --help nor in php's command line options.
If you start the script in nodejs, why not kill it there too, after 1800s?
var pid = startPHPProcess();
setTimeout(function() {
killPHPProcess(pid);
}, 1800);