Terminating zombie child processes forked from a socket server - PHP

Disclaimer
I am well aware that PHP might not have been the best choice in this case for a socket server. Please refrain from suggesting
different languages/platforms - believe me - I've heard it from all
directions.
Working in a Unix environment and using PHP 5.2.17, my situation is as follows - I have constructed a socket server in PHP that communicates with flash clients. My first hurdle was that each incoming connection blocked subsequent connections until it had finished being processed. I solved this by utilizing PHP's pcntl_fork(). I was successfully able to spawn numerous child processes (saving their PIDs in the parent) that took care of broadcasting messages to the other clients, thereby "releasing" the parent process and allowing it to continue processing the next connection[s].
My main issue right now is handling the collection of these dead/zombie child processes and terminating them. I have read (over and over) the relevant PHP manual pages for pcntl_fork() and realize that the parent process is in charge of cleaning up its children. The parent process receives a SIGCHLD signal from its child when the child executes an exit(0). I am able to "catch" that signal using the pcntl_signal() function to set up a signal handler.
My signal_handler looks like this :
declare(ticks = 1);

function sig_handler($signo) {
    global $forks; // this is an array that holds all the child PID's
    foreach ($forks as $key => $childPid) {
        echo "has my child {$childPid} gone away?" . PHP_EOL;
        if (posix_kill($childPid, 9)) {
            echo "Child {$childPid} has tragically died!" . PHP_EOL;
            unset($forks[$key]);
        }
    }
}
I am indeed seeing both echoes, including the relevant and correct child PID that needs to be removed, but it seems that
posix_kill($childPid, 9)
which I understand to be synonymous with kill -9 $childPid, is returning TRUE although it is in fact NOT removing the process...
Taken from the man pages of posix_kill :
Returns TRUE on success or FALSE on failure.
I am monitoring the child processes with the ps command. They appear like this on the system :
web5 5296 5234 0 14:51 ? 00:00:00 [php] <defunct>
web5 5321 5234 0 14:51 ? 00:00:00 [php] <defunct>
web5 5466 5234 0 14:52 ? 00:00:00 [php] <defunct>
As you can see, all these processes are children of the parent process, which has PID 5234.
Am I missing something in my understanding? I seem to have managed to get everything to work (and it does) but I am left with countless zombie processes on the system!
My plans for a zombie apocalypse are rock solid -
but what on earth can I do when even sudo kill -9 does not kill the zombie child processes?
Update 10 Days later
I've answered this question myself after some additional research; if you are still able to stand my ramblings, proceed at will.

I promise there is a solution at the end :P
Alright... so here we are, 10 days later and I believe that I have solved this issue. I didn't want to add onto an already longish post so I'll include in this answer some of the things that I tried.
Taking @sym's advice, and reading more into the documentation and the comments on the documentation, the pcntl_waitpid() description states:
If a child as requested by pid has already exited by the time of the call (a so-called
"zombie" process), the function returns immediately. Any system resources used by the child
are freed...
So I set up my pcntl_signal() handler like this -
function sig_handler($signo) {
    global $childProcesses;
    $pid = pcntl_waitpid(-1, $status, WNOHANG);
    echo "Sound the alarm! ";
    if ($pid != 0) {
        if (posix_kill($pid, 9)) {
            echo "Child {$pid} has tragically died!" . PHP_EOL;
            unset($childProcesses[$pid]);
        }
    }
}
// These define the signal handling
// pcntl_signal(SIGTERM, "sig_handler");
// pcntl_signal(SIGHUP, "sig_handler");
// pcntl_signal(SIGINT, "sig_handler");
pcntl_signal(SIGCHLD, "sig_handler");
For completeness, I'll include the actual code I'm using for forking a child process -
function broadcastData($socketArray, $data) {
    global $db, $childProcesses;
    $pid = pcntl_fork();
    if ($pid == -1) {
        // Something went wrong (handle errors here)
        // Log error, email the admin, pull emergency stop, etc...
        echo "Could not fork()!!";
    } elseif ($pid == 0) {
        // This part is only executed in the child
        foreach ($socketArray as $socket) {
            // There's more happening here but the essence is this
            socket_write($socket, $msg, strlen($msg));
            // TODO : Consider additional forking here for each client.
        }
        // This is where the signal is fired
        exit(0);
    }
    // If the child process did not exit above, then this code would be
    // executed by both parent and child. In my case, the child will
    // never reach these commands.
    $childProcesses[] = $pid;
    // The child process is now occupying the same database
    // connection as its parent (in my case mysql). We have to
    // reinitialize the parent's DB connection in order to continue using it.
    $db = dbEngine::factory(_dbEngine);
}
Yea... That's a ratio of 1:1 comments to code :P
So this was looking great and I saw the echo of :
Sound the alarm! Child 12345 has tragically died!
However, when the socket server loop performed its next iteration, the socket_select() function failed, throwing this error:
PHP Warning: socket_select(): unable to select [4]: Interrupted system call...
The server would now hang and not respond to any requests other than manual kill commands from a root terminal.
I'm not going to get into why this was happening or what I did after that to debug it... let's just say it was a frustrating week...
much coffee, sore eyes and 10 days later...
Drum roll please
TL;DR - The Solution :
Mentioned here in a comment from 2007 in the PHP sockets documentation and in this tutorial on stuporglue (search for "good parenting"), one can simply "ignore" signals coming in from the child processes (SIGCHLD) by passing SIG_IGN to the pcntl_signal() function -
pcntl_signal(SIGCHLD, SIG_IGN);
Quoting from that linked blog post :
If we are ignoring SIGCHLD, the child processes will be reaped automatically upon completion.
Believe it or not - I included that pcntl_signal() line, deleted all the other handlers and everything else dealing with the children, and it worked! There were no more <defunct> processes left hanging around!
In my case, it really did not interest me to know exactly when a child process died, or which one it was - I wasn't interested in them at all - just that they didn't hang around and crash my entire server :P
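For anyone who wants the whole pattern in one place, here is a minimal sketch of a forking loop with that one line in place (handleClient() is a stand-in for whatever work your child actually does):

// Let the kernel reap children automatically - no SIGCHLD handler needed.
pcntl_signal(SIGCHLD, SIG_IGN);

while (true) {
    // accept/select on your sockets here...
    $pid = pcntl_fork();
    if ($pid == -1) {
        die("Could not fork()!");
    } elseif ($pid == 0) {
        // Child: do the work, then exit. No <defunct> entry is left behind.
        handleClient();
        exit(0);
    }
    // Parent: continue the loop immediately; no child bookkeeping required.
}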

Regarding your disclaimer - PHP is no better or worse than many other languages for writing a server in. There are some things which are not possible to do (lightweight processes, asynchronous I/O), but these do not really apply to a forking server. If you're using OO code, then do ensure that you've got the circular-reference-checking garbage collector enabled.
Once a child process exits, it becomes a zombie until the parent process cleans it up. Your code sends a KILL signal to every child on receipt of any signal. That won't clean up the process-table entries; it will only terminate processes which have not yet called exit. To get child processes reaped correctly you should call waitpid (see also this example on the pcntl_wait manual page).
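As a sketch (names of my choosing, untested), a handler that reaps rather than kills would look something like this - note the loop, since several children can exit before the handler runs once:

declare(ticks = 1);

function reap_children($signo) {
    // Reap every child that has already exited; WNOHANG keeps this non-blocking.
    while (($pid = pcntl_waitpid(-1, $status, WNOHANG)) > 0) {
        echo "Reaped child {$pid}" . PHP_EOL;
    }
}
pcntl_signal(SIGCHLD, 'reap_children');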

http://www.linuxsa.org.au/tips/zombies.html
Zombies are dead processes. You cannot kill the dead. All processes
eventually die, and when they do they become zombies. They consume
almost no resources, which is to be expected because they are dead!
The reason for zombies is so the zombie's parent (process) can
retrieve the zombie's exit status and resource usage statistics. The
parent signals the operating system that it no longer needs the zombie
by using one of the wait() system calls.
When a process dies, its child processes all become children of
process number 1, which is the init process. Init is "always"
waiting for children to die, so that they don't remain as zombies.
If you have zombie processes it means those zombies have not been
waited for by their parent (look at PPID displayed by ps -l). You
have three choices: Fix the parent process (make it wait); kill the
parent; or live with it. Remember that living with it is not so hard
because zombies take up little more than one extra line in the output
of ps.

I know only too well how hard you have to search for a solution to the problem of zombie processes. My concern with potentially having hundreds or thousands of them was (rightly or wrongly, as I don't know if this would actually be a problem) running out of inodes, as all hell can break loose when that happens.
If only the pcntl_fork() manual page linked to posix-setsid(), many of us would have discovered years ago that the solution was so simple.

Related

What's wrong with my concurrent programming logic?

I wrote a web spider to spider pages concurrently. For each link that the spider finds, I want to fork off a new child that starts the process all over again.
I don't want to overload the target server so I created a static array that all objects can access. Each child can add their PID to the array, and either parent or child should check the array to see if $maxChildren have been met, and if so, patiently wait until any child finishes.
As you see, I have $maxChildren set to 3. I am expecting to see 3 simultaneous processes at any given time. However, that's not the case. The Linux top command shows 12 to 30 processes at any given time. In concurrent programming, how can I regulate the number of simultaneous processes? My logic is currently inspired by how Apache handles its max children, but I'm not exactly sure how that works.
As pointed out in one of the answers, globally accessing the static variable brings up issues with race conditions. To deal with this, the $children array takes the unique $PID of the process as both the key and its value, thereby creating a unique value. My thinking is that since any object can only deal with one $children[$pid] value, locking is not necessary. Is this not true? Is there a chance that two processes could try to unset or add the same value at some point?
private static $children = array();
private $maxChildren = 3;

public function concurrentSpider($url) {
    // STEP 1:
    // Download the $url
    $pageData = http_get($url, $ref = '');
    if (!$this->checkIfSaved($url)) {
        $this->save_link_to_db($url, $pageData);
    }

    // STEP 2:
    // extract all hyperlinks from this url's page data
    $linksOnThisPage = $this->harvest_links($url, $pageData);

    // STEP 3:
    // Check the links array from STEP 2 to see if this page has
    // already been saved or is excluded because of any other
    // logic from the excluded_link() function
    $filteredLinks = $this->filterLinks($linksOnThisPage);
    shuffle($filteredLinks);

    // STEP 4: loop through each of the links and
    // repeat the process
    foreach ($filteredLinks as $filteredLink) {
        $pid = pcntl_fork();
        switch ($pid) {
            case -1:
                print "Could not fork!\n";
                exit(1);
            case 0:
                if ($this->checkIfSaved($filteredLink)) {
                    exit();
                }
                //$pid = getmypid();
                print "In child with PID: " . getmypid() . " processing $filteredLink \n";
                $var[$pid]->concurrentSpider($filteredLink);
                sleep(2);
                exit(1);
            default:
                // Add an element to the children array
                self::$children[$pid] = $pid;
                // If the maximum number of children has been
                // achieved, wait until one or more return
                // before continuing.
                while (count(self::$children) >= $this->maxChildren) {
                    //print count(self::$children) . " children \n";
                    $pid = pcntl_waitpid(-1, $status);
                    unset(self::$children[$pid]);
                }
        }
    }
}
This is written in PHP. I know that the pcntl_waitpid function with an argument of -1 waits for any child of the calling process to complete (http://php.net/manual/en/function.pcntl-waitpid.php).
What's wrong with my logic and how can I correct it so that only $maxChildren processes are running simultaneously? I'm also open to improving the logic in general if you have suggestions.
First thing to note: if this is truly a global being shared among multiple threads, it's possible that multiple threads are adding to it at once and you're running afoul of a race condition. You need some sort of concurrency control to ensure that only one process is accessing your global array at once.
Also, try the simple debugging trick of having each process write out (to the console or to a file) its PID and the full contents of the global array each time a new spider is forked. It will help you to check your assumptions (which are plainly wrong at some point) and figure out what's going wrong.
EDIT: (In response to the comments)
I'm not a PHP developer, but if I had to guess, based on the fact that you're using an OS tool that counts OS-level processes, I'd guess that your fork is spawning multiple processes, but your static array is global within the current process. Implementing system-wide shared memory is a lot more complicated!
If you just want to count something and ensure that instances of a shared resource don't grow out of control, look into semaphores, and see if you can find a way in PHP to create a named semaphore object that can be shared between multiple instances of your spider.
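If PHP was built with the System V semaphore extension (sysvsem), a named, system-wide semaphore can guard a shared structure across forked processes. A sketch (the ftok() arguments are arbitrary):

// One semaphore, identified by a key derived from an existing file path.
$key = ftok(__FILE__, 's');
$sem = sem_get($key, 1); // at most one holder at a time

sem_acquire($sem);
// ... critical section: read/update the shared children bookkeeping ...
sem_release($sem);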
Use a real programming language ;)
Step 1 is kind of bad: why are you downloading if it might already be in the DB? Put that inside the if, and see if you can put a mutex around it. Maybe do something in SQL to imitate one.
I hope harvest_links uses a proper HTML processor with CSS selector support (I like Fizzler for .NET). I guess a regular expression would be fine if it's just to get links, but it is possible to mess up.
I see step 4 and I don't think it's bad, but personally I'd do it a different way.
I'd have something like step one insert url, page, and a flag into a DB. Then I'd have another process (or the same one) ask the DB for unprocessed pages and set the flag to one value if it errors and another if it succeeds. This way, if something fails or the process exits (shutdown, crash, power outage, etc.), it can pick up easily and doesn't need to scan every page to find where it left off. It just asks the database for the next link and redoes what it didn't finish. A sketch of that idea follows.
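As a rough sketch (assuming a PDO-style $db handle; the pages table and its columns are made up), each worker atomically claims one unprocessed row and flags the outcome:

// Hypothetical schema: pages(id, url, flag, worker). Claiming through an
// atomic UPDATE means two workers can never grab the same row.
$pid = getmypid();
$db->query("UPDATE pages SET flag = 'processing', worker = $pid
            WHERE flag = 'new' LIMIT 1");
$row = $db->query("SELECT * FROM pages
                   WHERE flag = 'processing' AND worker = $pid")->fetch();
if ($row) {
    // ... download the page, harvest links, insert them with flag = 'new' ...
    $db->query("UPDATE pages SET flag = 'done' WHERE id = {$row['id']}");
}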
PHP doesn't support multithreading, therefore it doesn't support mutexes or any other synchronization methods. As others have said in their answers, this will lead to a race condition.
You'll have to write a wrapper in C or bash. That way, the PHP script can submit targets to the wrapper, and the wrapper will handle scheduling.
Another approach is to rewrite your spider in Python or Ruby, both of which support multithreading. That will eliminate the need for interprocess communication.
Edit: On second thought, the best way is to write the wrapper in Python or Ruby and reuse your existing PHP code as a black box. That's a compromise of the solutions above.
If the spider is for practical purposes, you might want to google "curl multithread"
cURL Multi Threading with PHP

PHP PCNTL - Maintain control in terminal

I want to run parallel processes to process at least 4 users simultaneously.
ob_end_flush();
for ($i = 0; $i < 4; $i++) {
    $pid = pcntl_fork();
    if ($pid == -1) {
        echo("Could not create a new child.");
        exit;
    } else if ($pid == 0) {
        // This will get executed only in the child process
        echo 'parent1';
        sleep(5);
        echo 'parent2';
        exit;
    }
}
The output is something like this:
> php testscript.php
parent1
parent1
parent1
parent1
> parent2
parent2
parent2
parent2
What is currently happening is that after displaying parent1 four times, control of the script is lost and a new, empty command line appears. After waiting five more seconds, the remaining output arrives.
How can I prevent that from happening? I want to be able to break the script as and when I want.
EDIT
If I send a signal via the terminal, say by pressing CTRL+C, the script does not stop; I need a way to inform the script to stop at that instant.
You are creating dirty zombie/parent-less processes who're being adopted by others who certainly don't want them either =)
When you're working with forking, you need to understand that you're creating a copy of the current process (stack/heap and all), giving it a new process ID (PID) and setting its parent ID (PPID) to the forking process's PID, which is to say the spawned process is a child. Here, the parent of these 4 new children finished all his apparent work and then died without any error, because he had nothing left to do. But as any respectful parent should do, your parent process should sit around and take care of his children. That's where signals become useful, especially the SIGCHLD signal, which is sent to any parent process whose child has just passed away. Combining that with a small event-handling loop, your parent can gracefully manage his children \o/
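A minimal version of that parent-side wait (a sketch; the count matches the four children forked above):

// Parent: after the for loop has forked the children, wait for all of
// them to die before exiting, so no zombie (or stray prompt) remains.
for ($children = 4; $children > 0; $children--) {
    pcntl_wait($status); // blocks until one child exits, then reaps it
}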
tl;dr: I have an example fork wrapper sitting on my blog http://codehackit.blogspot.be/2012/01/childpids-break-pcntlsignaldispatch.html
You need to use pcntl_signal to set up a signal handler for Ctrl-C (which corresponds to SIGINT, with a value of 2) if you want to react to it in a custom fashion.
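Something along these lines (a sketch; requires PHP 5.3+ for the closure, and the cleanup is up to you):

declare(ticks = 1); // so the handler is dispatched without explicit calls

pcntl_signal(SIGINT, function($signo) {
    echo "Caught SIGINT - cleaning up and stopping" . PHP_EOL;
    // terminate/reap the children here if needed, then:
    exit(0);
});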

How do I keep my mysql connection in the parent process after pcntl_fork?

As all of you know when you fork the child gets a copy of everything, including file and network descriptors - man fork.
In PHP, when you use pcntl_fork, all of your connections created with mysql_connect are copied, and this is somewhat of a problem - php docs and SO question. Common sense in this situation says: close the parent connection, create a new one, and let the child use the old one. But what if said parent needs to create many children every few seconds? In that case you end up creating loads of new connections - one for every bunch of forks.
What does that mean in code:
while (42) {
    $db = mysql_connect($host, $user, $pass);
    // do some stuff with $db
    // ...
    foreach ($jobs as $job) {
        if (($pid = pcntl_fork()) == -1) {
            continue;
        } else if ($pid) {
            continue;
        }
        fork_for_job($job);
    }
    mysql_close($db);
    wait_children();
    sleep(5);
}

function fork_for_job($job) {
    // do something.
    // does not use the global $db
    // ...
    exit(0);
}
Well, I do not want to do that - that's way too many connections to the database. Ideally I would want to achieve behaviour similar to this:
$db = mysql_connect($host, $user, $pass);
while (42) {
    // do some stuff with $db
    // ...
    foreach ($jobs as $job) {
        if (($pid = pcntl_fork()) == -1) {
            continue;
        } else if ($pid) {
            continue;
        }
        fork_for_job($job);
    }
    wait_children();
    sleep(5);
}

function fork_for_job($job) {
    // do something
    // does not use the global $db
    // ...
    exit(0);
}
Do you think it is possible?
Some other things:
This is php-cli script
I've tried using mysql_pconnect in the first example, but as far as I can tell there is no difference - the mysql server receives just as many new connections. Maybe that's because this is cli and pconnect does not work as it does in mod_php. As Marc has noticed - pconnect in php-cli does not make sense.
The only thing you could try is to let your children wait until every other child has finished its job. This way you could use the same database connection (provided there aren't any synchronization issues). But of course you'll have a lot of processes, which is not very good either (in my experience PHP has quite a big memory footprint). If having multiple processes accessing the same database connection is not a problem, you could try to make "groups" of processes which share a connection. That way you don't have to wait until each job finishes (you can clean up when the whole group finishes) and you don't have a lot of connections either.
You should ask yourself whether you really need a database connection for your worker processes. Why not let the parent fetch the data and write your results to a file?
If you do need the connection, you should consider using another language for the job. PHP's CLI itself is not a "typical" use case (it was added in 4.3), and multiprocessing is more of a hack than a supported feature.
If the child calls exec() or _exit() fairly quickly, you're alright. The problem is if the child sticks around and holds on to copies of your file descriptors.
You could also use posix_spawn if PHP has an API for that. That might work well.
My advice (from personal experience on the same issue) is to close the connection before pcntl_fork() then open new connections in parent and/or the child process as needed.
If you open a new connection in the parent process then you have to block the SIGCHLD signal (using pcntl_sigprocmask(SIG_BLOCK, array(SIGCHLD))). No special care is needed in the child processes (except when they also launch their own children, becoming parents this way).
SIGCHLD is a signal that is received by the parent process when one of its children completes.
During the communication with the server, the MySQL client library uses nanosleep() to suspend the execution of the program for some amounts of time. The sleep() functions return when the time passes but they also return before the time passes if the process receives a signal while it is suspended.
When nanosleep() returns because of a signal (i.e. before enough time has passed), the MySQL library gets confused and reports the error "MySQL server has gone away" and the connection cannot be used any more. It is a false alarm, the MySQL server is still there waiting for queries but the client code is fooled by the signal arrived at the wrong moment.
If you are interested in receiving the SIGCHLD signal, then you can block it before running a MySQL query and unblock it again afterwards (to avoid it being received during the communication with the MySQL server).
Also read this answer and this answer I wrote on similar questions (it's the same information, but with more details and explanation.)
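Schematically, that block/unblock dance looks like this (a sketch built from the description above, not a drop-in; pcntl_sigprocmask and pcntl_signal_dispatch require PHP 5.3+):

// Keep SIGCHLD away from the MySQL client while a query is in flight,
// so nanosleep() inside the client library is never interrupted.
pcntl_sigprocmask(SIG_BLOCK, array(SIGCHLD));
$result = mysql_query('SELECT 1', $db);
pcntl_sigprocmask(SIG_UNBLOCK, array(SIGCHLD));
pcntl_signal_dispatch(); // now handle any SIGCHLD we held back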

Executing functions parallelly in PHP

Can PHP call a function and not wait for it to return? Something like this:
function callback($pause, $arg) {
    sleep($pause);
    echo $arg, "\n";
}

header('Content-Type: text/plain');
fast_call_user_func_array('callback', array(3, 'three'));
fast_call_user_func_array('callback', array(2, 'two'));
fast_call_user_func_array('callback', array(1, 'one'));
would output
one (after 1 second)
two (after 2 seconds)
three (after 3 seconds)
rather than
three (after 3 seconds)
two (after 3 + 2 = 5 seconds)
one (after 3 + 2 + 1 = 6 seconds)
The main script is intended to run as a permanent process (a TCP server). The callback() function would receive data from a client, execute an external PHP script and then do something based on the other arguments that are passed to callback(). The problem is that the main script must not wait for the external PHP script to finish. The result of the external script is important, so exec('php -f file.php &') is not an option.
Edit:
Many have recommended taking a look at PCNTL, so it seems that such functionality can be achieved. PCNTL is not available on Windows, and I don't have access to a Linux machine right now, so I can't test it, but if so many people have advised it, then it should do the trick :)
Thanks, everyone!
On Unix platforms you can enable the PCNTL functions, and use pcntl_fork to fork the process and run your jobs in child processes.
Something like:
function fast_call_user_func_array($func, $args) {
    if (pcntl_fork() == 0) {
        call_user_func_array($func, $args);
        exit(0); // without this, the child would fall through and keep running the parent's code
    }
}
Once you call pcntl_fork, two processes will execute your code from the same position. The parent process will get the child's PID returned from pcntl_fork, while the child process will get 0. (If there's an error, pcntl_fork will return -1 in the parent, which is worth checking for in production code.)
You can check out PHP Process Control:
http://us.php.net/manual/en/intro.pcntl.php
Note: This is not threading, but the handling of separate processes. There is more overhead attached.
Wouldn't it solve your problem to fork, keeping the parent process free for other connections and actions? See http://www.php.net/pcntl_fork. If you need an answer back, you could listen on a socket in the parent and write from the child. A simple while(true) loop with a read could do, and you probably already have that basic functionality if you run a permanent TCP server. Another option would be to keep track of your child process IDs in an accessible store somewhere (file/database/memcached, etc.), with a pcntl_wait in the main process using WNOHANG to check which process has exited, and then retrieve the data from the store.
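For the write-back channel, socket_create_pair() gives you a connected pair of sockets before the fork. A sketch (do_the_work() is hypothetical; error handling omitted):

// One connected UNIX-domain socket pair: the child writes its result,
// the parent reads it back whenever it is ready.
socket_create_pair(AF_UNIX, SOCK_STREAM, 0, $pair);

$pid = pcntl_fork();
if ($pid == 0) {
    socket_close($pair[0]);
    socket_write($pair[1], do_the_work()); // child sends its answer
    exit(0);
}
socket_close($pair[1]);
$answer = socket_read($pair[0], 8192); // parent collects the answer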
You can do some threading in PHP if you use the method pcntl_fork.
http://ca.php.net/manual/en/function.pcntl-fork.php
I have never used this myself, but there are some good examples of how to use it on php.net.
PHP doesn't have this functionality as far as I know
You can emulate the function using a different technique, like this one:
Parallel functions in PHP
PHP does not support multi-threading, so there's no other option than taking advantage of the OS's or the web server's multiprocessing capabilities. Note that you can actually fetch both the result and the output of exec:
string exec ( string $command [, array &$output [, int &$return_var ]] )
You can, at least, prevent the parent process from hanging until the child process is done by ignoring the child signals using pcntl_signal(SIGCHLD, SIG_IGN).
So, let's say you want to fork a process and execute another PHP function that takes a while without making the parent wait for it to finish (since you want the main process to finish in a timely manner):
pcntl_signal(SIGCHLD, SIG_IGN);
$pid = pcntl_fork();
if ($pid < 0) {
    exit(0);
} elseif (!$pid) {
    my_slow_function();
    exit(0);
}
// Parent keeps executing and finishes before the child does
If you want to execute a slow external script as the child process, pcntl_exec is handy:
$script = array('/path/to/my/script'); // E.g. /home/my_user/my_script.php
pcntl_exec('/path/to/program/executable',$script); // E.g. /usr/bin/php

Stopping gearman workers nicely

I have a number of Gearman workers running constantly, saving things like records of user page views, etc. Occasionally, I'll update the PHP code that is used by the Gearman workers. In order to get the workers to switch to the new code, I kill and restart their PHP processes.
What is a better way to do this? Presumably, I'm sometimes losing data (albeit not very important data) when I kill one of those worker processes.
Edit: I found an answer that works for me, and posted it below.
Solution 1
Generally I run my workers with the unix daemon utility with the -r flag and let them expire after one job. Your script will end gracefully after each iteration and daemon will restart automatically.
Your workers will be stale for one job but that may not be as big a deal to you as losing data
This solution also has the advantage of freeing up memory. You may run into problems with memory if you're doing large jobs as PHP pre 5.3 has god awful GC.
Solution 2
You could also add a quit function to all of your workers that exits the script. When you'd like to restart, you simply submit Gearman calls to quit with a high priority.
function AutoRestart() {
    static $startTime = null;
    if ($startTime === null) {
        // A static initializer must be a constant expression,
        // so record the start time on the first call instead.
        $startTime = time();
    }
    clearstatcache(); // stat results are cached in long-running scripts
    if (filemtime(__FILE__) > $startTime) {
        exit();
    }
}
AutoRestart();
Well, I posted this question, now I think I have found a good answer to it.
If you look in the code for Net_Gearman_Worker, you'll find that in the work loop, the function stopWork is monitored, and if it returns true, it exits the function.
I did the following:
Using memcache, I created a cached value, gearman_restarttime, and I use a separate script to set that to the current timestamp whenever I update the site. (I used Memcache, but this could be stored anywhere--a database, a file, or anything).
I extended the Worker class as, essentially, Net_Gearman_Worker_Foo, and had all of my workers instantiate that. In the Foo class, I overrode the stopWork function to do the following: first, it checks gearman_restarttime; the first time through, it saves the value in a global variable. From then on, each time through, it compares the cached value to the global. If it has changed, stopWork returns true and the worker quits. A cron job checks every minute to see if each worker is still running, and restarts any worker that has quit.
It may be worth putting a timer in stopWork as well, and checking the cache only once every x minutes. In our case, Memcache is fast enough that checking the value each time doesn't seem to be a problem, but if you are using some other system to store off the current timestamp, checking less often would be better.
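In sketch form, that override might look like this (Net_Gearman internals are simplified, and the $memcache property is my assumption):

class Net_Gearman_Worker_Foo extends Net_Gearman_Worker
{
    private $restartTimeAtStartup = null;

    public function stopWork()
    {
        $cached = $this->memcache->get('gearman_restarttime');
        if ($this->restartTimeAtStartup === null) {
            $this->restartTimeAtStartup = $cached; // remember the first value seen
        }
        // A changed timestamp means the site was redeployed: quit.
        return $cached !== $this->restartTimeAtStartup;
    }
}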
Hmm, you could implement code in the workers to check occasionally whether the source code has been modified; if yes, they just kill themselves when they see fit. That is, they can even check while in the middle of a job, if the job is very large.
Another way would be to implement some kind of interrupt, maybe via the network, to say "stop whenever you have the chance and restart".
The last solution would be to modify Gearman's source to include this functionality.
I've been looking at this recently as well (though in Perl with Gearman::XS). My use case was the same as yours - allow a long-running gearman worker to periodically check for a new version of itself and reload.
My first attempt was just having the worker keep track of how long it had been since it last checked the worker script's version (an md5sum would also work). Then, once N seconds had elapsed between jobs, it would check whether a new version of itself was available and restart itself (fork()/exec()). This did work OK, but workers registered for rare jobs could potentially end up waiting hours for work() to return, and thus for checking the current time.
So I'm now setting a fairly short timeout when waiting for jobs with work(), so I can check the time more regularly. The PHP interface suggests that you can set this timeout value when registering for the job. I'm using SIGALRM to trigger the new-version check. The Perl interface blocks on work(), so the alarm wasn't being triggered initially. Setting the timeout to 60 seconds got the SIGALRM working.
If someone is looking for an answer for a worker running in Perl, that's part of what the GearmanX::Starter library is for. You can stop workers after they complete the current job in two different ways: externally, by sending the worker process a SIGTERM, or programmatically, by setting a global variable.
Given the fact that the workers are written in PHP, it would be a good idea to recycle them on a known schedule. This can be a static amount of time since they started, or it can be done after a certain number of jobs have been attempted.
This essentially kills (no pun intended) two birds with one stone: you mitigate the potential for memory leaks, and you have a consistent way to determine when your workers will pick up any new code.
I generally write workers such that they report their interval to stdout and/or to a logging facility, so it is simple to check where a worker is in the process. A sketch of this recycling pattern follows.
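A rough sketch (the job name and thresholds are made up; the supervisor that restarts the process is assumed to exist):

$worker = new GearmanWorker();
$worker->addServer(); // defaults to 127.0.0.1:4730
$worker->addFunction('record_pageview', 'record_pageview'); // hypothetical job

$jobsDone = 0;
$started  = time();
while ($worker->work()) {
    $jobsDone++;
    echo "completed job #{$jobsDone}" . PHP_EOL; // report progress to stdout
    // Recycle after 1000 jobs or one hour, whichever comes first;
    // the supervisor (daemontools/supervisord/cron) starts a fresh copy.
    if ($jobsDone >= 1000 || time() - $started >= 3600) {
        exit(0);
    }
}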
I ran into this same problem and came up with a solution for Python 2.7.
I'm writing a Python script which uses gearman to communicate with other components on the system. The script has multiple workers, and I have each worker running in a separate thread. The workers all receive gearman data; they process and store that data on a message queue, and the main thread can pull the data off the queue as necessary.
My solution to cleanly shutting down each worker was to subclass gearman.GearmanWorker and override the work() function:
from gearman import GearmanWorker

POLL_TIMEOUT_IN_SECONDS = 60.0

class StoppableWorker(GearmanWorker):
    def __init__(self, host_list=None):
        super(StoppableWorker, self).__init__(host_list=host_list)
        self._exit_runloop = False

    # OVERRIDDEN
    def work(self, poll_timeout=POLL_TIMEOUT_IN_SECONDS):
        worker_connections = []
        continue_working = True

        def continue_while_connections_alive(any_activity):
            return self.after_poll(any_activity)

        while continue_working and not self._exit_runloop:
            worker_connections = self.establish_worker_connections()
            continue_working = self.poll_connections_until_stopped(
                worker_connections,
                continue_while_connections_alive,
                timeout=poll_timeout)

        for current_connection in worker_connections:
            current_connection.close()

        self.shutdown()

    def stopwork(self):
        self._exit_runloop = True
Use it just like GearmanWorker. When it's time to exit the script, call the stopwork() function. It won't stop immediately--it can take up to poll_timeout seconds before it kicks out of the run loop.
There may be multiple smart ways to invoke the stopwork() function. In my case, I create a temporary gearman client in the main thread. For the worker that I'm trying to shut down, I send a special STOP command through the gearman server. When the worker gets this message, it knows to shut itself down.
Hope this helps!
http://phpscaling.com/2009/06/23/doing-the-work-elsewhere-sidebar-running-the-worker/
As the above article demonstrates, I've run a worker inside a Bash shell script, exiting occasionally between jobs to clean up (or reload the worker script) - or, if a given task tells it to, it can exit with a specific exit code and shut down completely.
I use the following code, which supports both Ctrl-C and kill -TERM. By default, supervisor sends a TERM signal if you have not modified the signal= setting. In PHP 5.3+, declare(ticks = 1) is deprecated; use pcntl_signal_dispatch() instead.
$terminate = false;

pcntl_signal(SIGINT, function() use (&$terminate) {
    $terminate = true;
});
pcntl_signal(SIGTERM, function() use (&$terminate) {
    $terminate = true;
});

$worker = new GearmanWorker();
$worker->addOptions(GEARMAN_WORKER_NON_BLOCKING);
$worker->setTimeout(1000);
$worker->addServer('127.0.0.1', 4730);

$worker->addFunction('reverse', function(GearmanJob $job) {
    return strrev($job->workload());
});

$count = 500 + rand(0, 100); // rand to prevent multiple workers restarting at the same time

for ($i = 0; $i < $count; $i++) {
    if ($terminate) {
        break;
    } else {
        pcntl_signal_dispatch();
    }

    $worker->work();

    if ($terminate) {
        break;
    } else {
        pcntl_signal_dispatch();
    }

    if (GEARMAN_SUCCESS == $worker->returnCode()) {
        continue;
    }

    if (GEARMAN_IO_WAIT != $worker->returnCode() && GEARMAN_NO_JOBS != $worker->returnCode()) {
        $e = new ErrorException($worker->error(), $worker->returnCode());
        // log exception
        break;
    }

    $worker->wait();
}

$worker->unregisterAll();
This would fit nicely into your continuous integration system. I hope you have one, or that you will soon :-)
As you check in new code, it automatically gets built and deployed onto the server. As a part of the build script, you kill all workers, and launch new ones.
What I do is use gearadmin to check whether any jobs are running. I used the admin API to make a UI for this. When the jobs are sitting idle, there is no harm in killing them.
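That check can be as simple as speaking Gearman's text-based admin protocol to the server directly (a sketch, assuming the default port 4730):

// "status" makes gearmand answer one line per function:
// "function\ttotal\trunning\tavailable_workers", terminated by ".".
$fp = fsockopen('127.0.0.1', 4730);
fwrite($fp, "status\n");
while (($line = fgets($fp)) !== false && rtrim($line) !== '.') {
    echo $line; // all-zero counts mean it is safe to kill the workers
}
fclose($fp);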
