I have a simple program that uses the Qt framework.
It uses QProcess to execute RAR and compress some files. In my program I am catching SIGINT and doing something in my code when it occurs:
signal(SIGINT, &unix_handler);
When SIGINT occurs, I check whether the RAR process is done, and if it isn't, I wait for it. The problem is that (I think) the RAR process also gets the SIGINT that was meant for my program, and it quits before it has compressed all the files.
Is there a way to run the RAR process so that it doesn't receive SIGINT when my program receives it?
Thanks
If you are generating the SIGINT with Ctrl+C on a Unix system, then the signal is being sent to the entire process group.
You need to use setpgid or setsid to put the child process into a different process group so that it will not receive the signals generated by the controlling terminal.
[Edit:]
Be sure to read the RATIONALE section of the setpgid page carefully. It is a little tricky to plug all of the potential race conditions here.
To guarantee 100% that no SIGINT will be delivered to your child process, you need to do something like this:
#define CHECK(x) if(!(x)) { perror(#x " failed"); abort(); /* or whatever */ }

/* Block SIGINT. */
sigset_t mask, omask;
sigemptyset(&mask);
sigaddset(&mask, SIGINT);
CHECK(sigprocmask(SIG_BLOCK, &mask, &omask) == 0);

/* Spawn child. */
pid_t child_pid = fork();
CHECK(child_pid >= 0);
if (child_pid == 0) {
    /* Child */
    CHECK(setpgid(0, 0) == 0);
    execl(...);
    abort();
}

/* Parent */
if (setpgid(child_pid, child_pid) < 0 && errno != EACCES)
    abort(); /* or whatever */

/* Unblock SIGINT */
CHECK(sigprocmask(SIG_SETMASK, &omask, NULL) == 0);
Strictly speaking, every one of these steps is necessary. You have to block the signal in case the user hits Ctrl+C right after the call to fork. You have to call setpgid in the child in case the execl happens before the parent has time to do anything. You have to call setpgid in the parent in case the parent runs and someone hits Ctrl+C before the child has time to do anything.
The sequence above is clumsy, but it does handle 100% of the race conditions.
What are you doing in your handler? There are only certain Qt functions that you can call safely from a Unix signal handler. This page in the documentation identifies which ones they are.
The main problem is that the handler will execute outside of the main Qt event thread. That page also proposes a method to deal with this. I prefer having the handler "post" a custom event to the application and handling it that way. I posted an answer describing how to implement custom events here.
Just make the subprocess ignore SIGINT:
child_pid = fork();
if (child_pid == 0) {
    /* child process */
    signal(SIGINT, SIG_IGN);
    execl(...);
}
man sigaction:
During an execve(2), the dispositions of handled signals are reset to the default;
the dispositions of ignored signals are left unchanged.
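If you are spawning the child from PHP rather than C, the same pattern works with the pcntl extension. A minimal sketch, assuming a rar binary at /usr/bin/rar:

$pid = pcntl_fork();
if ($pid === 0) {
    // Child: ignore SIGINT, then exec. Per the man page excerpt above,
    // the SIG_IGN disposition survives the exec.
    pcntl_signal(SIGINT, SIG_IGN);
    pcntl_exec('/usr/bin/rar', array('a', 'archive.rar', 'file1.txt'));
    exit(1); // reached only if pcntl_exec() failed
}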
I have software written in PHP that can run for a long time; it's launched via the command line (not a web application). I wanted to make sure a function is called when the software exits, including when it is killed via Ctrl+C. When the process is killed by a signal, the shutdown functions (destructors and functions registered via register_shutdown_function()) are not called.
After some reading, I realized I had to add a handler on every relevant signal just to call "exit;". Doing so solved my problem and works.
My issue is that in some cases the signal is not handled right away. It can take seconds or minutes, and in some cases it is not handled at all, just as if the handler code were never reached. I don't really know where to start debugging this. I've tried other signals (SIGKILL, SIGHUP, etc.); same behaviour.
My code is within these brackets:
declare(ticks=1)
{...}
I can't find any correlation between the times it gets handled right away and the times it doesn't.
Any help would be appreciated.
The signal should be handled this way:
First, you have to make a signal handler like this:
function signalHandler($signo = null) {
    $pid = posix_getpid();
    switch ($signo) {
        // case SIGKILL: // you can't override SIGKILL, so it is useless
        case SIGTERM:
        case SIGINT:
            // unexpected shutdown
            exit(3);
        case SIGCHLD:
        case SIGHUP:
            // just ignore it
            break;
        case 10: // SIGUSR1 on most Linux systems
            // user signal 1 received: process exited normally
            exit(0);
        case 12: // SIGUSR2 on most Linux systems
            // user signal 2 received: process exited with a caught error
            exit(3);
        default:
            // ignore other signals
            break;
    }
}
Then, in your main process, you have to tell PCNTL to use this handler:
pcntl_signal(SIGTERM, "signalHandler");
pcntl_signal(SIGINT, "signalHandler");
// pcntl_signal(SIGKILL, "signalHandler"); // you can't override SIGKILL so it is useless
pcntl_signal(SIGCHLD, "signalHandler");
pcntl_signal(SIGHUP, "signalHandler");
pcntl_signal(10, "signalHandler");
pcntl_signal(12, "signalHandler");
I'm using this and it seems to work for 200 processes, with 8 running at the same time and about 10 seconds for each process.
Here, when a process exits, it sends signal 10 or 12 (for success or error) using posix_kill($pid, 10).
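For illustration, here is a sketch of the child's side of that convention (do_job() and the use of posix_getppid() are assumptions; 10 and 12 are SIGUSR1 and SIGUSR2 on most Linux systems):

$parentPid = posix_getppid(); // the monitoring parent process

try {
    do_job();                   // hypothetical: the child's actual work
    posix_kill($parentPid, 10); // signal 10 (SIGUSR1): finished normally
} catch (Exception $e) {
    posix_kill($parentPid, 12); // signal 12 (SIGUSR2): caught an error
}
exit(0);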
I think the problem here is that you may be calling a function that has a loop in it, where every iteration of the loop takes a lot of time (for example, fetching data from a database). In that case, when your program receives a stop signal, it adds the signal to a queue, waits until the data has been fetched from the database, and only then calls the signal handler function.
I'm also looking for a way to call the signal handler function immediately. Maybe PHP just doesn't have this option.
UPDATE:
I think I found the solution. Try passing false as the third argument to pcntl_signal(). That argument is restart_syscalls; with false, a blocking call interrupted by the signal (such as sleep()) returns immediately instead of being restarted, so the handler gets to run sooner:
pcntl_signal(SIGINT, function () {
    echo 'Do something here.';
}, false);
I am executing a PHP script through the Windows console. I have some ending functions that write results to a file. Anyway, sometimes I have to interrupt the execution (Ctrl+C) and halt the script. I am interested in some way to write the current progress to a file between the keystroke and the actual termination. Is this possible? I'd really need to be able to resume my script's execution from the last point the next time I run it. Thank you!
You can register a signal handler for SIGTERM:
function sig_handler($signo)
{
    // Do something
}

pcntl_signal(SIGTERM, "sig_handler");
Your handler should then be executed when the signal is received. Note that PHP delivers signals only at defined points: either use declare(ticks=1) near the top of the script, or (PHP 5.3+) call pcntl_signal_dispatch() periodically in your main loop. Also be aware that the pcntl extension is not available on Windows; for a Windows CLI script, PHP 7.4+ provides sapi_windows_set_ctrl_handler() to catch Ctrl+C.
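A minimal sketch of that dispatch step on a Unix system (save_progress() and the loop are placeholders for your own logic):

pcntl_signal(SIGTERM, function ($signo) {
    save_progress(); // hypothetical: write current state to a file
    exit(0);
});

while (!$finished) {         // hypothetical main loop
    do_one_step();           // hypothetical unit of work
    pcntl_signal_dispatch(); // PHP 5.3+: pending signals are delivered here
}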
I've got a PHP script in the works that is a job worker; its main task is to check a database table for new jobs and, if there are any, to act on them. But jobs will come in bursts, with long gaps in between, so I devised a sleep cycle like:
while (true) {
    if ($jobs = get_new_jobs()) {
        // Act upon the jobs
    } else {
        // No new jobs now
        sleep(30);
    }
}
Good, but in some cases that means there might be a 30 second lag before a new job is acted upon. Since this is a daemon script, I figured I'd try the pcntl_signal hook to catch a SIGUSR1 signal to nudge the script to wake up, like:
$_isAwake = true;

function user_sig($signo) {
    global $_isAwake;
    daemon_log("Caught SIGUSR1");
    $_isAwake = true;
}

pcntl_signal(SIGUSR1, 'user_sig');

while (true) {
    if ($jobs = get_new_jobs()) {
        // Act upon the jobs
    } else {
        // No new jobs now
        daemon_log("No new jobs, sleeping...");
        $_isAwake = false;
        $ts = time();
        while (time() < $ts + 30) {
            sleep(1);
            if ($_isAwake) break; // Did a signal happen while we were sleeping? If so, stop sleeping
        }
        $_isAwake = true;
    }
}
I broke the sleep(30) up into smaller sleep bits in case a signal doesn't interrupt a sleep() call, thinking this would cause at most a one-second delay. But in the log file I'm seeing that the SIGUSR1 isn't caught until after the full 30 seconds have passed (and maybe the outer while loop resets).
I found the pcntl_signal_dispatch command, but that's only for PHP 5.3 and higher. If I were on that version, I could stick a call to it before the if ($_isAwake) check, but as it currently stands I'm on 5.2.13.
In what sort of situations is the signal queue processed in PHP versions without a means to explicitly trigger queue parsing? Could I put some otherwise useless command in that sleep loop that would trigger signal processing there?
Fixed my own problem: the answer is the "ticks" declaration. I had done declare(ticks=1); as part of the daemon's startup, but it didn't seem to carry over to the main script (since that declaration was inside a function, in an include file?). Adding a declare(ticks=1) line before the while(true) loop causes signals to come through immediately (i.e. the sleep(1) command causes a tick, so after waking from sleep, signals are processed).
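In other words, a minimal sketch of the fix:

declare(ticks=1); // must be in effect in the file/scope that contains the loop

pcntl_signal(SIGUSR1, 'user_sig');

while (true) {
    // ... each sleep(1) statement now generates ticks, so a pending
    // SIGUSR1 is handled as soon as the statement completes
}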
I have a number of Gearman workers running constantly, saving things like records of user page views, etc. Occasionally I'll update the PHP code the workers use. To get the workers to switch to the new code, I kill and restart their PHP processes.
What is a better way to do this? Presumably I'm sometimes losing data (albeit not very important data) when I kill one of those worker processes.
Edit: I found an answer that works for me, and posted it below.
Solution 1
Generally I run my workers under the Unix daemon utility with the -r flag and let them expire after one job. The script ends gracefully after each iteration, and daemon restarts it automatically.
Your workers will be stale for one job, but that may not be as big a deal to you as losing data.
This solution also has the advantage of freeing up memory. You may run into memory problems if you're doing large jobs, as PHP pre-5.3 has god-awful GC.
Solution 2
You could also add a quit function to all of your workers that exits the script. When you'd like to restart, you simply send Gearman calls to quit with a high priority.
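A sketch of that idea (the 'quit' function name and the use of GearmanClient::doHighBackground() are assumptions):

// In the worker, registered alongside the real job functions:
$worker->addFunction('quit', function (GearmanJob $job) {
    exit(0); // daemon -r (or your supervisor) restarts the script with fresh code
});

// From a deploy script, one call per running worker:
$client = new GearmanClient();
$client->addServer('127.0.0.1', 4730);
$client->doHighBackground('quit', '');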
Separately, here's an auto-restart check based on the script file's modification time:

function AutoRestart() {
    static $startTime = null;
    if ($startTime === null) {
        $startTime = time(); // statics can't be initialized with a function call
    }
    clearstatcache(); // filemtime() results are cached in long-running processes
    if (filemtime(__FILE__) > $startTime) {
        exit();
    }
}

AutoRestart();
Well, I posted this question, and now I think I have found a good answer to it.
If you look in the code for Net_Gearman_Worker, you'll find that in the work loop the function stopWork is monitored, and if it returns true, the worker exits the function.
I did the following:
Using memcache, I created a cached value, gearman_restarttime, and I use a separate script to set it to the current timestamp whenever I update the site. (I used Memcache, but this could be stored anywhere: a database, a file, or anything.)
I extended the Worker class into, essentially, Net_Gearman_Worker_Foo, and had all of my workers instantiate that. In the Foo class, I overrode the stopWork function to do the following: first, it checks gearman_restarttime; the first time through, it saves the value in a global variable. From then on, each time through, it compares the cached value to the global one. If the value has changed, stopWork returns true and the worker quits. A cron job checks every minute to see whether each worker is still running and restarts any worker that has quit.
It may be worth putting a timer in stopWork as well, checking the cache only once every x minutes. In our case, Memcache is fast enough that checking the value each time doesn't seem to be a problem, but if you are using some other system to store the current timestamp, checking less often would be better.
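A sketch of what the subclass might look like (the Memcache host/port and the exact Net_Gearman method signature are assumptions):

class Net_Gearman_Worker_Foo extends Net_Gearman_Worker
{
    private $memcache = null;
    private $startValue = null;

    // Polled by Net_Gearman's work loop; returning true makes the
    // worker exit cleanly between jobs.
    public function stopWork()
    {
        if ($this->memcache === null) {
            $this->memcache = new Memcache();
            $this->memcache->connect('127.0.0.1', 11211);
        }
        $current = $this->memcache->get('gearman_restarttime');
        if ($this->startValue === null) {
            $this->startValue = $current; // first pass: remember the value
            return false;
        }
        return $current !== $this->startValue; // changed by a deploy => quit
    }
}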
Hmm, you could implement code in the workers to check occasionally whether the source code has been modified and, if so, have them kill themselves when they see fit. That is, check while they are in the middle of a job if the job is very large.
Another way would be to implement some kind of interrupt, maybe via the network, to say "stop whenever you have the chance and restart".
The last option would be to modify Gearman's source to include this functionality.
I've been looking at this recently as well (though in Perl with Gearman::XS). My use case was the same as yours: allow a long-running Gearman worker to periodically check for a new version of itself and reload.
My first attempt was just having the worker keep track of how long it had been since it last checked its own script's version (an md5sum would also work). Then once N seconds had elapsed, between jobs, it would check whether a new version of itself was available and restart itself (fork()/exec()). This did work OK, but workers registered for rare jobs could potentially wait hours for work() to return, and thus hours before checking the current time.
So I'm now setting a fairly short timeout when waiting for jobs with work(), so I can check the time more regularly. The PHP interface suggests that you can set this timeout value when registering for the job. I'm using SIGALRM to trigger the new-version check. The Perl interface blocks on work(), so the alarm wasn't being triggered initially; setting the timeout to 60 seconds got the SIGALRM working.
If someone is looking for an answer for a worker running in Perl, that's part of what the GearmanX::Starter library is for. You can stop workers after they complete the current job in two different ways: externally, by sending the worker process a SIGTERM, or programmatically, by setting a global variable.
Given that the workers are written in PHP, it's a good idea to recycle them on a known schedule. This can be a static amount of time since they started, or it can be done after a certain number of jobs have been attempted (see the sketch below).
This essentially kills (no pun intended) two birds with one stone: you mitigate the potential for memory leaks, and you have a consistent way to determine when your workers will pick up any new code.
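A sketch of such a recycling policy, assuming the pecl GearmanWorker API used elsewhere on this page (the limits are arbitrary figures):

$jobsDone    = 0;
$maxJobs     = 1000;   // recycle after this many jobs
$startedAt   = time();
$maxLifetime = 3600;   // or after an hour, whichever comes first

while ($worker->work()) {
    $jobsDone++;
    if ($jobsDone >= $maxJobs || (time() - $startedAt) >= $maxLifetime) {
        exit(0); // the supervisor restarts the worker, picking up any new code
    }
}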
I generally write workers such that they report their interval to stdout and/or to a logging facility so it is simple to check on where a worker is in the process.
I ran into this same problem and came up with a solution for Python 2.7.
I'm writing a Python script which uses gearman to communicate with other components on the system. The script has multiple workers, and I have each worker running in a separate thread. The workers all receive gearman data; they process and store that data on a message queue, and the main thread can pull the data off the queue as necessary.
My solution to cleanly shutting down each worker was to subclass gearman.GearmanWorker and override the work() function:
from gearman import GearmanWorker

POLL_TIMEOUT_IN_SECONDS = 60.0

class StoppableWorker(GearmanWorker):
    def __init__(self, host_list=None):
        super(StoppableWorker, self).__init__(host_list=host_list)
        self._exit_runloop = False

    # OVERRIDDEN
    def work(self, poll_timeout=POLL_TIMEOUT_IN_SECONDS):
        worker_connections = []
        continue_working = True

        def continue_while_connections_alive(any_activity):
            return self.after_poll(any_activity)

        while continue_working and not self._exit_runloop:
            worker_connections = self.establish_worker_connections()
            continue_working = self.poll_connections_until_stopped(
                worker_connections,
                continue_while_connections_alive,
                timeout=poll_timeout)

        for current_connection in worker_connections:
            current_connection.close()

        self.shutdown()

    def stopwork(self):
        self._exit_runloop = True
Use it just like GearmanWorker. When it's time to exit the script, call the stopwork() method. It won't stop immediately; it can take up to poll_timeout seconds before it kicks out of the run loop.
There may be multiple smart ways to invoke the stopwork() function. In my case, I create a temporary gearman client in the main thread. For the worker that I'm trying to shut down, I send a special STOP command through the gearman server. When the worker gets this message, it knows to shut itself down.
Hope this helps!
http://phpscaling.com/2009/06/23/doing-the-work-elsewhere-sidebar-running-the-worker/
As the above article demonstrates, I've run a worker inside a Bash shell script, exiting occasionally between jobs to clean up (or reload the worker script). Alternatively, if a given task is sent to it, the worker can exit with a specific exit code and shut down.
I use the following code, which supports both Ctrl-C and kill -TERM. By default, supervisor sends a TERM signal if you have not modified its signal= setting. In PHP 5.3+, declare(ticks=1) is deprecated, so use pcntl_signal_dispatch() instead.
$terminate = false;

pcntl_signal(SIGINT, function () use (&$terminate) {
    $terminate = true;
});
pcntl_signal(SIGTERM, function () use (&$terminate) {
    $terminate = true;
});

$worker = new GearmanWorker();
$worker->addOptions(GEARMAN_WORKER_NON_BLOCKING);
$worker->setTimeout(1000);
$worker->addServer('127.0.0.1', 4730);

$worker->addFunction('reverse', function (GearmanJob $job) {
    return strrev($job->workload());
});

$count = 500 + rand(0, 100); // rand to prevent multiple workers restarting at the same time

for ($i = 0; $i < $count; $i++) {
    if ($terminate) {
        break;
    } else {
        pcntl_signal_dispatch();
    }

    $worker->work();

    if ($terminate) {
        break;
    } else {
        pcntl_signal_dispatch();
    }

    if (GEARMAN_SUCCESS == $worker->returnCode()) {
        continue;
    }

    if (GEARMAN_IO_WAIT != $worker->returnCode() && GEARMAN_NO_JOBS != $worker->returnCode()) {
        $e = new ErrorException($worker->error(), $worker->returnCode());
        // log exception
        break;
    }

    $worker->wait();
}

$worker->unregisterAll();
This would fit nicely into your continuous integration system. I hope you have one; if not, you should soon. :-)
As you check in new code, it automatically gets built and deployed onto the server. As part of the build script, you kill all workers and launch new ones.
What I do is use gearmadmin to check whether any jobs are running. I used the admin API to make a UI for this. When the jobs are sitting idle, there is no harm in killing them.