Why does PHP child process become zombie? - php

I have a script that performs several tasks.
To avoid timeouts, memory limits, variables leaking between tasks and so on, I decided to have a main script that forks each task into a separate PHP process.
I can run the main script manually and it works fine.
I can run every single child process manually and it works fine.
However, from time to time I see that some of the child processes run forever and I have to kill them from top.
Does anybody know why a PHP process executed from the CLI would become a zombie, never exiting and also blocking the main process?
The Spawn process:
foreach ($OPS as $OP) {
$command = $PHP_BIN." ".__DIR__."/process_this.php op_id=".$OP["id"];
exec($command);
sleep(5);
}
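Note that exec() blocks until the child exits, so a child that hangs also stalls the loop above. A minimal sketch of one way to cap a child's run time, assuming PHP 7.4+ (for the array form of proc_open()); the helper name runWithTimeout() and the timeout value are hypothetical, not from the original script:

```php
<?php
// Hypothetical helper: runs a command and kills it if it exceeds the timeout.
// Returns the child's exit code, or null if it had to be killed.
function runWithTimeout(array $command, float $timeoutSeconds): ?int
{
    // Detach the child's stdio from ours by pointing it at /dev/null.
    $descriptors = [
        0 => ['file', '/dev/null', 'r'],
        1 => ['file', '/dev/null', 'w'],
        2 => ['file', '/dev/null', 'w'],
    ];
    $process = proc_open($command, $descriptors, $pipes);
    if (!is_resource($process)) {
        throw new RuntimeException('Could not start child process');
    }

    $deadline = microtime(true) + $timeoutSeconds;
    do {
        $status = proc_get_status($process);
        if (!$status['running']) {
            proc_close($process);        // reaps the child
            return $status['exitcode'];
        }
        usleep(100000);                  // poll every 100 ms
    } while (microtime(true) < $deadline);

    proc_terminate($process, 9);         // SIGKILL the overdue child
    proc_close($process);                // wait for it so no zombie is left
    return null;
}
```

In the loop above, the exec($command) call could then become something like runWithTimeout([$PHP_BIN, __DIR__."/process_this.php", "op_id=".$OP["id"]], 300), with the 300-second limit being an arbitrary choice.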

Related

MySQL code causes PHP script to crash at popen/exec

I have the following PHP 5.6.19 code on an Ubuntu 14.04 server. This code simply connects to a MySQL 5.6.28 database, waits a minute, launches another process of itself, then exits.
Note: this is the full script, and its purpose is to demonstrate the problem - it doesn't do anything useful.
class DatabaseConnector {
const DB_HOST = 'localhost';
const DB_NAME = 'database1';
const DB_USERNAME = 'root';
const DB_PASSWORD = 'password';
public static $db;
public static function Init() {
if (DatabaseConnector::$db === null) {
DatabaseConnector::$db = new PDO('mysql:host=' . DatabaseConnector::DB_HOST . ';dbname=' . DatabaseConnector::DB_NAME . ';charset=utf8', DatabaseConnector::DB_USERNAME, DatabaseConnector::DB_PASSWORD);
}
}
}
$startTime = time();
// ***** Script works fine if this line is removed.
DatabaseConnector::Init();
while (true) {
// Sleep for 100 ms.
usleep(100000);
if (time() - $startTime > 60) {
$filePath = __FILE__;
$cmd = "nohup php $filePath > /tmp/1.log 2>&1 &";
// ***** Script sometimes exits here without opening the process and without errors.
$p = popen($cmd, 'r');
pclose($p);
exit;
}
}
I start the first process of the script using nohup php myscript.php > /tmp/1.log 2>&1 &.
This process loop should go on forever but... based on multiple tests, within a day (but not instantly), the process on the server "disappears" without reason. I discovered that the MySQL code is causing the popen code to fail (the script exits without any error or output).
What is happening here?
Notes
The server runs 24/7.
Memory is not an issue.
The database connects correctly.
The file path does not contain spaces.
The same problem exists when using shell_exec or exec instead of popen (and pclose).
I also know that popen is the line that fails because I did further debugging (not shown above) by logging to a file at certain points in the script.
Is the parent process definitely exiting after forking? I had thought pclose would wait for the child to exit before returning.
If it isn't exiting, I'd speculate that because the MySQL connection is never closed, you're eventually hitting its connection limit (or some other limit) as you spawn the tree of child processes.
Edit 1
I've just tried to replicate this. I altered your script to fork every half-second, rather than every minute, and was able to kill it off within about 10 minutes.
It looks like the repeated creation of child processes generates ever more FDs, until eventually no more can be opened:
$ lsof | grep type=STREAM | wc -l
240
$ lsof | grep type=STREAM | wc -l
242
...
$ lsof | grep type=STREAM | wc -l
425
$ lsof | grep type=STREAM | wc -l
428
...
And that's because the child inherits the parent's FDs (in this case for the MySQL connection) when it forks.
If you close the MySQL connection before popen with (in your case):
DatabaseConnector::$db = null;
The problem will hopefully go away.
I had a similar situation using pcntl_fork() and a MySQL connection. The cause here is probably the same.
Background info
popen() creates a child process. The call to pclose() closes the communication channel and the child process continues to run until it exits. This is when things start to get out of control.
When a child process completes, the parent process receives a SIGCHLD signal. The parent process here is the PHP interpreter that runs the code you posted. The child process is the one launched using popen() (it doesn't matter what command it runs).
There is a small detail here that you probably don't know, or that you found in the documentation and ignored because it doesn't seem to matter much when programming in PHP. It is mentioned in the documentation of sleep():
If the call was interrupted by a signal, sleep() returns a non-zero value.
The sleep() PHP function is just a wrapper around the sleep() C library function on Linux (and the usleep() PHP function is a wrapper around usleep()).
What the PHP documentation does not say is clearly stated in the Linux manual page:
sleep() makes the calling thread sleep until seconds seconds have elapsed or a signal arrives which is not ignored.
Back to your code.
There are two places in your code where the PHP interpreter calls the usleep() Linux system function. One of them is clearly visible: your PHP code invokes it. The other one is hidden (see below).
What happens (the visible part)
Starting with the second iteration, if a child process (created using popen() on a previous iteration) happens to exit while the parent program is inside the usleep(100000) call, the PHP interpreter process receives the SIGCHLD signal and its execution resumes before the timeout elapses. The usleep() call returns earlier than expected. Because the timeout is short, this effect is not visible to the naked eye. Use 10 seconds instead of 0.1 seconds and you'll notice it.
However, apart from the broken timeout, this doesn't affect the execution of your code in a fatal manner.
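If the shortened delay does matter, one way to get the full delay is to keep sleeping until a deadline has passed. A minimal sketch; the helper name sleepAtLeast() is hypothetical and not part of PHP:

```php
<?php
// Hypothetical helper: keeps sleeping until the full delay has elapsed,
// even if an individual usleep() call returns early.
function sleepAtLeast(float $seconds): void
{
    $deadline = microtime(true) + $seconds;
    while (($remaining = $deadline - microtime(true)) > 0) {
        usleep((int) ceil($remaining * 1000000));
    }
}

sleepAtLeast(0.1); // would stand in for usleep(100000) in the loop
```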
Why it crashes (the invisible part)
The second place where an incoming signal hurts your program's execution is hidden deep inside the code of the PHP interpreter. For protocol reasons, the MySQL client library uses sleep() and/or usleep() in several places. If the interpreter happens to be inside one of these calls when the SIGCHLD arrives, the MySQL client library code resumes unexpectedly and, in many cases, ends with the erroneous status "MySQL server has gone away (error 2006)".
It's possible that your code ignores (or swallows) the MySQL error status (because it doesn't expect it to happen in that place). Mine didn't and I spent a couple of days of investigation to find out the facts summarized above.
A solution
The solution to the problem is easy (once you know all the internal details described above). It is hinted at in the documentation quote above: "a signal arrives which is not ignored".
Signals can be blocked (masked) when their arrival is not desired. The PHP PCNTL extension provides the function pcntl_sigprocmask(). It wraps the sigprocmask() Linux system call, which sets the signals the program blocks from now on.
There are two strategies you can implement, depending on what you need.
If your program needs to communicate with the database and be notified when the child processes complete, then you have to wrap all your database calls within a pair of calls to pcntl_sigprocmask() to block and then unblock the SIGCHLD signal.
If you don't care when the child processes complete, then you just call:
pcntl_sigprocmask(SIG_BLOCK, array(SIGCHLD));
before you start creating any child process (before the while()).
It makes your process ignore the termination of the child processes and lets it run its database queries without undesired interruption.
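The first strategy (block SIGCHLD only around database calls) could be sketched as follows. This assumes the PCNTL extension is loaded; the helper name withSigchldBlocked() and the placeholder closure are hypothetical:

```php
<?php
// Hypothetical helper: runs $work with SIGCHLD blocked, then restores
// the previous signal mask, even if $work throws.
function withSigchldBlocked(callable $work)
{
    pcntl_sigprocmask(SIG_BLOCK, [SIGCHLD], $previousMask);
    try {
        return $work();
    } finally {
        pcntl_sigprocmask(SIG_SETMASK, $previousMask);
    }
}

// Usage sketch: the closure stands in for a real database call,
// e.g. DatabaseConnector::$db->query('SELECT 1').
$result = withSigchldBlocked(function () {
    return 'query result';
});
```

Restoring the saved mask with SIG_SETMASK, rather than unblocking SIGCHLD unconditionally, keeps the helper safe to use when the caller had already blocked the signal.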
Warning
The SIGCHLD signal is normally handled by calling wait() so that the system can clean up after the completed child process. What happens if nobody waits for the child (for example, because delivery of the signal is blocked) is explained in the documentation of wait():
A child that terminates, but has not been waited for becomes a "zombie". The kernel maintains a minimal set of information about the zombie process (PID, termination status, resource usage information) in order to allow the parent to later perform a wait to obtain information about the child. As long as a zombie is not removed from the system via a wait, it will consume a slot in the kernel process table, and if this table fills, it will not be possible to create further processes. If a parent process terminates, then its "zombie" children (if any) are adopted by init(1), which automatically performs a wait to remove the zombies.
In plain English, if you block the reception of the SIGCHLD signal, then you have to call pcntl_wait() in order to clean up the zombie child processes.
You can add:
pcntl_wait($status, WNOHANG);
somewhere inside the while loop (just before it ends, for example).
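Since more than one child may have exited between two iterations, the call can also be repeated until nothing is left to reap. A minimal sketch, assuming the PCNTL extension; the helper name reapFinishedChildren() is hypothetical:

```php
<?php
// Hypothetical helper: reaps every child that has already exited,
// without blocking, and returns how many were reaped.
function reapFinishedChildren(): int
{
    $reaped = 0;
    // pcntl_wait() returns the PID of a reaped child, 0 if children exist
    // but none has exited yet, and -1 if there are no children (or on error).
    while (pcntl_wait($status, WNOHANG) > 0) {
        $reaped++;
    }
    return $reaped;
}
```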
the script exits without any error or output
Not surprising when there's no error checking in the code. However if it really is "crashing", then:
if the cause is trapped by the PHP runtime, then it will try to log an error. Have you tried deliberately creating an error scenario to verify that the reporting/logging is working as you expect?
if the error is not trapped by the PHP runtime, then the OS should be dumping a core file. Have you checked the OS config? Looked for the core file? Analyzed it?
$cmd = "nohup php $filePath > /tmp/1.log 2>&1 &";
This probably doesn't do what you think it does. When you run a process in the background with most versions of nohup, it still retains a relationship with the parent process; the parent cannot be reaped until the child process exits - and a child is always spawning another child before it does.
This is not a valid way to keep your code running in the background / as a daemon. What the right approach is depends on what you are trying to achieve. Is there a specific reason for attempting to renew the process every 60 seconds?
(You never explicitly close the database connection - this is less of an issue as PHP should do this when exit is invoked).
You might want to read this and this
I suspect that the process doesn't exit after pclose. In that case every process holds its own connection to the database. After some time MySQL's connection limit is reached and new connections fail.
To understand what's going on, add some logging before and after the lines DatabaseConnector::Init(); and pclose($p);

Running a single process in the background at all times

How can I run a process in the background at all times?
I want to create a process that manages some work queues based on the info from a database.
I'm currently doing it using a cron job, where I run the cron job every minute, and have 30 calls with a sleep(2) interval. While this is working OK, I've noticed from time to time that there is a race condition.
Is it possible to just run the same process all the time? I would still have the cron job attempt to start periodically, but it would just shut down if it sees itself running.
Or is this a bad idea? Is there any possibility of a memory leak or other issues occurring?
Some years ago I didn't know about MQ systems, Node.js, etc.,
so I used code like this and added it to cron to run every minute:
<?php
// defining path to lock file. example: /home/user1/bin/cronjob1.lock
define('LOCK_FILE', __DIR__."/".basename(__FILE__).'.lock');
// function to check if process is running or not
function isLocked()
{
// lock file exists, but let's check if it's running?
if(is_file(LOCK_FILE))
{
$pid = trim(file_get_contents(LOCK_FILE)); // reading process id from .lock file
$pids = explode("\n", trim(`ps -e | awk '{print $1}'`)); // running process ids
if(in_array($pid, $pids)) // $pid exists in process ids
return true; // it's ok, process running
}
// making .lock file with new process id in it
file_put_contents(LOCK_FILE, getmypid()."\n" );
return false; // previous process was not running
}
// if previous process locked to run same script
if(isLocked()) die("Already running.\n"); // locked, exiting
// from this point we run our new process
set_time_limit(0);
while(true) {
// some ops
sleep(1);
}
// cleanup before finishing
unlink(LOCK_FILE);
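The PID check above shells out to ps and awk. A lighter-weight sketch of the same check, assuming the POSIX extension is available; the helper name isProcessRunning() is hypothetical:

```php
<?php
// Hypothetical helper: checks whether a PID exists by sending signal 0,
// which only performs the existence/permission check and delivers nothing.
// It returns false for a process the current user may not signal, so it
// suits lock files written by the same user that runs the check.
function isProcessRunning(int $pid): bool
{
    return $pid > 0 && posix_kill($pid, 0);
}
```

Like the ps-based version, this only shows that some process has that PID; it cannot tell whether the PID has been reused by an unrelated program.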
You could use something called forever which requires nodejs.
Once you have node installed,
Install forever with:
$ [sudo] npm install forever -g
To run a script forever:
forever start app.js

Don't run a cron PHP task until the last one has finished

I have a php-cli script that is run by cron every 5 minutes. Because this interval is short, multiple processes run at the same time. That's not what I want, since this script has to write a numeric ID, incremented each time, to a text file. Sometimes several writers write to this text file at the same time, and the value written is incorrect.
I have tried to use PHP's flock function to block writing to the file while another process is writing to it, but it doesn't work.
$fw = fopen($path, 'r+');
if (flock($fw, LOCK_EX)) {
ftruncate($fw, 0);
fwrite($fw, $latestid);
fflush($fw);
flock($fw, LOCK_UN);
}
fclose($fw);
So I suppose the solution is to create a bash script that checks whether an instance of this PHP script is already running and, if so, waits until it has finished. But I don't know how to do it. Any ideas?
The solution I'm using with a bash script is this:
exec 9>/path/to/lock/file
if ! flock -n 9 ; then
echo "another instance is running";
exit 1
fi
# this now runs under the lock until 9 is closed (it will be closed automatically when the script ends)
File descriptor 9 is opened on the lock file, and flock -n makes a new process exit immediately if another instance of the script is already running and holding the lock.
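The same non-blocking lock can be taken from inside the PHP script itself. A minimal sketch; the helper name acquireInstanceLock() is hypothetical, and the lock file location is left to the caller:

```php
<?php
// Hypothetical helper: returns an open handle holding an exclusive lock,
// or null if another instance already holds it.
function acquireInstanceLock(string $lockPath)
{
    $handle = fopen($lockPath, 'c'); // create if missing, never truncate
    if ($handle === false) {
        return null;
    }
    if (!flock($handle, LOCK_EX | LOCK_NB)) {
        fclose($handle);
        return null;
    }
    return $handle; // keep this handle open for the lifetime of the script
}
```

A script would call this once at startup and exit if it returns null. As with the bash version, the lock is released automatically when the process ends, so a crashed run does not leave a stale lock behind.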
How can I ensure that only one instance of a script is running at a time (mutual exclusion)?
I don't really understand how incrementing a counter every 5 minutes will result in multiple processes trying to write the counter file at the same time, but...
A much simpler approach is to use a simple locking mechanism similar to the below:
<?php
$lock_filename = 'nobodyshouldincrementthecounterwhenthisfileishere';
if(file_exists($lock_filename)) {
return;
}
touch($lock_filename);
// your stuff...
unlink($lock_filename);
This simple approach does not handle the case where the script breaks before it can remove the lock file; the script would then never run again until the file is removed by hand.
More sophisticated approaches are also possible as you suggest, e.g. fork the job in its own process, write the PID into a file, then before running the job it could be checked whether that PID is still running.
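For the counter file in the question, flock only helps if reading the old value, incrementing it, and writing it back all happen under the same lock. A minimal sketch, assuming the file holds nothing but the number; the helper name incrementCounter() is hypothetical:

```php
<?php
// Hypothetical helper: reads, increments and rewrites a numeric counter
// while holding one exclusive lock, so concurrent runs cannot interleave.
function incrementCounter(string $path): int
{
    $handle = fopen($path, 'c+'); // read/write, create if missing, no truncate
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        if (!flock($handle, LOCK_EX)) {
            throw new RuntimeException("Cannot lock $path");
        }
        // An empty or new file counts as 0.
        $next = (int) stream_get_contents($handle) + 1;
        ftruncate($handle, 0);
        rewind($handle);
        fwrite($handle, (string) $next);
        fflush($handle);
        flock($handle, LOCK_UN);
        return $next;
    } finally {
        fclose($handle);
    }
}
```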
To prevent the next run of a program (such as the next cron job) from starting while the previous run is still active, I recommend checking for a running process of that program, either inside the program itself or in an external script. Before starting your program, just execute
ps -ef|grep <process_name>|grep -v grep|wc -l
and check whether the result is 0. Only in that case may your program be started.
You must make sure that no third-party process has a similar name (give your program a longer, unique name for this purpose), and the name of your program must not contain the pattern "grep".
This works well in combination with the regular start of your program configured in the crontab and run by the cron daemon.
If your check is written as an external script, the crontab entry might look like
<time_specification> <your_starter_script> <your_program> ...
Two important remarks: the exit code of your_starter_script must be 0 when it does not start your program, and it is better for this script to write nothing to stdout or stderr.
Such a starter is very short and a simple programming exercise, so I don't feel the need to provide its complete code.
Instead of using cron to run your script every 5 minutes, how about using at to schedule your script to run again, 5 minutes after it finishes. Near the end of your script, you can use shell_exec() to run an at command to schedule your script to run again in 5 minutes, like so:
echo "/path/to/script" | at now + 5 minutes
Or, perhaps even simpler than my previous answer (using at to schedule the script to run again in 5 minutes) is make your script a daemon, by using a non-terminating loop, like so:
while(1) {
// whatever your script does here....
sleep(300); // wait 5 minutes
}
Then, you can do away with scheduling by way of cron or at altogether. Just simply run your script in the background from the command line, like so:
/path/to/your/script &
Or, add /path/to/your/script in /etc/rc.local to make your script start automatically when the machine boots.

Queueing shells with CakePHP

I'm using CakePHP 2.3.8 and I'm using the CakePHP Queue Plugin.
I'm inserting data in a queue to be processed by a shell. I can process the queue (which runs a shell), and it runs properly. However, when I process the queue the page hangs until the queue is done processing. The queue can take a while to process because it makes external requests, and if there are failures I have some retries and waiting periods. Basically, I need to find a way to process the queue in the background so the user doesn't have to wait for it.
This is my code right now
App::uses('Queue', 'Queue.Lib');
//queue up the items!
foreach($queue_items as $item){
$task_id = Queue::add("ExternalSource ".$item, 'shell');
$this->ExternalSource->id = $item;
$this->ExternalSource->saveField('queue_task_id', $task_id['QueueTask']['id']);
}
//now process the queue
Queue::process();
echo "Done!";
I've also tried calling the shell directly but the same thing happens. I have to wait until it's done being processed.
Is there any way to call the shell so that the user doesn't have to wait for it to finish being processed? I'd prefer not to do it with a cronjob checking frequently.
Edit
I've also tried using exec, but it doesn't seem to work:
$exec = exec('/Applications/XAMPP/xamppfiles/htdocs/app/Console/cake InitiateQueueProcess');
var_dump($exec);
The dump shows string '' (length=0). When I change exec to exec('pwd'); it shows the directory. I can use that exact path and call the shell from the terminal so I know it's correct. I also changed the permission but still nothing happens. The shell doesn't run.

using cron to make php-script starting continuous processes

I created a cron job which calls a PHP script every 5 minutes.
This PHP script needs to start several other PHP CLI scripts and keep them running in background even when the cron-script terminates.
I'm currently creating these sub-processes by the following line of code:
if (!$pid = shell_exec("nohup /var/[..]/cake.php test doSomething > /dev/null 2>&1 & echo $!")) return false;
When I call the "motherscript" from the command line, everything works great. But it seems that the sub-processes started by the above line of code are terminated when the cron job stops.
So how can I spawn these cake.php test doSomething scripts and keep the child processes running under the user predefined in the crontab?
That makes sense; you need to tell the child PHP scripts NOT to terminate when their parent stops.
ignore_user_abort(true);
Add the above line to the child PHP scripts. That should do it.
