How to prevent upstart from killing a daemon's child processes? - php

Situation
I have a daemon I wrote in PHP (not the best language for this, but work with me) that receives jobs from a queue and processes them as they come in. For each new job, I use pcntl_fork() to fork the job off into a child process. Within this child process, I then use proc_open() to execute long-running system commands for audio transcoding, which return control to the child when finished. When the job is completely done, the child exits and is reaped by the parent process.
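For reference, the fork-per-job pattern looks roughly like this (a minimal sketch only; next_job() and run_job() are hypothetical stand-ins for the queue and transcoding logic):

while (true) {
    $job = next_job(); // hypothetical: block until the queue yields a job
    $pid = pcntl_fork();
    if ($pid === -1) {
        error_log('fork failed');
    } elseif ($pid === 0) {
        run_job($job); // hypothetical: child runs the transcode via proc_open()
        exit(0);
    }
    // parent: reap any finished children without blocking
    while (pcntl_waitpid(-1, $status, WNOHANG) > 0) {}
}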
To keep this daemon always running, I use upstart. Here is my upstart configuration file:
description "Audio Transcoding Daemon"
start on startup
stop on shutdown
# kill signal SIGCHLD
kill timeout 1200 # Don't force kill the process until it runs over 20 minutes
respawn
exec audio-daemon.php
Goal
Because I want to use this daemon in a distributed environment, I want to be able to shut down the server at any time without disrupting any running jobs. To do this, I have already implemented signal handlers using pcntl_signal() for SIGTERM, SIGHUP, and SIGINT in the parent process, which waits for all children to exit normally before exiting itself. The children have signal handlers too, but theirs ignore all kill signals.
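The parent-side handling looks roughly like this (a sketch only; on the PHP 5.3-era versions current with upstart, declare(ticks=1) or periodic pcntl_signal_dispatch() calls are needed for the handlers to fire):

declare(ticks=1);

$shutdown = false;
foreach (array(SIGTERM, SIGHUP, SIGINT) as $sig) {
    pcntl_signal($sig, function () use (&$shutdown) {
        $shutdown = true; // stop accepting new jobs
    });
}
// ... main loop runs jobs until $shutdown is set, then:
// while (pcntl_waitpid(-1, $status) > 0) {} // wait for every child
// exit(0);
// In each forked child, the same signals are ignored instead:
// pcntl_signal(SIGTERM, SIG_IGN); // etc.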
Problem
The problem is, according to the docs...
The signal specified by the kill signal stanza is sent to the process group of the main process (such that all processes belonging to the job's main process are killed). By default this signal is SIGTERM.
This is concerning because, in my child process, I run system commands through proc_open(), which spawns new child processes as well. So whenever I run sudo stop audio-daemon, this sub-process (which happens to be sox) is killed immediately, and the job returns with an error. Apparently, sox obeys SIGTERM and does what it's told...
Originally, I thought, "Fine. I'll just change kill signal to send something that is inherently ignored, and I'll pick it up in the main process only." But according to the manual, there are only two signals that are ignored by default: SIGCHLD and SIGURG (and possibly SIGWINCH). I'm afraid of false positives, though, since these can also be triggered in other ways.
There are ways to create a custom signal using what the manual calls "Real-time Signals" but it also states...
The default action for an unhandled real-time signal is to terminate the receiving process.
So that doesn't help...
Can you think of any way that I can get upstart to keep all of my sub-processes open until they complete? I really don't want to go digging through sox's source code to modify its signal handlers, and while I could set SIGCHLD, SIGURG, or SIGWINCH as my upstart kill signal and pray nothing else sends them my way, I can't help but think there's a better way to do this... Any ideas?
Thanks for all your help! :)

Since I haven't received any other answers for how to do this a better way, this is what I ended up doing, and I hope it helps someone out there...
To stall shutdown/reboot of the system until the daemon is finished, I changed my start on and stop on in my upstart configuration. And to keep upstart from killing my children, I resorted to using SIGURG as my kill signal, which I then catch as a kill signal in my main daemon process only.
Here is my final upstart configuration:
description "Audio Transcoding Daemon"
start on runlevel [2345]
stop on starting rc RUNLEVEL=[016] # Block shutdown/reboot until the daemon ends
kill signal SIGURG # Kill the process group with SIGURG instead of SIGTERM so only the main process will pick it up (since SIGURG will be ignored by all children by default)
kill timeout 1200 # Don't force kill the process until it runs over 20 minutes
respawn
exec audio-daemon.php
Note that using stop on starting rc RUNLEVEL=[016] is necessary to stall shutdown/reboot. stop on runlevel [016] will not work.
Also note that if you use SIGURG in your application for any other reason, using it as a kill signal may cause problems. In my case, I wasn't, so this works fine as far as I can tell.
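On the daemon side, the parent is the only place SIGURG is caught; the children never install a handler for it, so the default "ignore" disposition protects them. A minimal sketch of the parent's handler:

declare(ticks=1);

$shutdown = false;
pcntl_signal(SIGURG, function () use (&$shutdown) {
    $shutdown = true; // treat upstart's SIGURG as a graceful-stop request
});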
Ideally, the POSIX standard would provide a user-defined signal, in the spirit of SIGUSR1 and SIGUSR2, that is ignored by default. But right now, it looks like no such signal exists.
Feel free to chime in if you have a better answer, but for now, I hope this helps anyone else having this problem.

Disclaimer: I don't know any PHP
I solved a similar problem with my Ruby process by setting a new group id for a launched subprocess. It looks like PHP has a similar facility.
You can start a new group (detaching it from your audio-daemon.php) by setting the child's group id to its process id,
something like
$chldPid = pcntl_fork();
// ... error checks etc.
if ($chldPid) {
    // parent: move the child into its own process group
    posix_setpgid($chldPid, $chldPid);
}

Related

Python gracefully stop a script from Laravel/Php

I'm looking for a way to gracefully exit from a long running python script which I launch from my Laravel app.
My actual process to do so is:
From Laravel, set a 'script_state' flag to 1 in a MySQL table
Launch the Python script via shell_exec
The Python script periodically checks 'script_state' with a MySQL query. If it has been changed to 0 (intentionally, from my Laravel app), the script exits gracefully
Retrieving the pid from the shell_exec and then killing it could have been an option, but I actually want the script to stop and exit gracefully.
My current setup works, but I'm looking for a better approach.
Retrieving the pid from the shell_exec and then killing it could have been an option, but I actually want the script to stop and exit gracefully.
This is probably your best bet. You can use SIGTERM to politely ask the script to exit:
The SIGTERM signal is a generic signal used to cause program termination. Unlike SIGKILL, this signal can be blocked, handled, and ignored. It is the normal way to politely ask a program to terminate.
This is effectively what happens when you click the close button in a GUI application.
In your Python code you can handle SIGTERM using the signal module with whatever cleanup logic you want, then exit:
import signal
import sys

def handler(signum, frame):
    print('Received signal', signum)
    # ... run whatever cleanup logic you need here ...
    sys.exit(0)

# Install the handler so SIGTERM triggers a graceful exit
signal.signal(signal.SIGTERM, handler)
See also this Stack Overflow answer for a class that cleans up before exiting.
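On the PHP/Laravel side, a sketch of launching the script in the background and later signalling it (the path and script name here are examples only):

// start the script detached, capturing its pid
$pid = (int) shell_exec('nohup python3 /path/to/long_script.py > /dev/null 2>&1 & echo $!');
// ... later, to request a graceful stop:
posix_kill($pid, SIGTERM); // invokes the Python handler above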

How to stop SIGTERM and SIGKILL?

I need to run a huge process which will run for 10+ minutes. I maxed out max_execution_time, but in my error logs I get a SIGTERM and then a SIGKILL.
I read a little about SIGTERM and SIGKILL; they come from a daemon, but I didn't figure out how to stop this from happening. I just need to disable it for one night.
Rather than trying to ignore the signals, you need to find out who sends them and why. If you're starting PHP from the command line, no one will send that signal and your script will have all the time it needs.
But if you're actually starting this process in response to an HTTP request, it's probably the web server or FastCGI process manager that limits how long it waits for the script to finish. It may also simply kill the script because the client connection (between the user's browser and the HTTP server) has been terminated.
So the important question to ask yourself is: what is the source of that signal, and how can the timeout be increased? Please also provide all details about how you start this script and what platform you're running on.

using exec for long running scripts

I want to get some data from an API and save it for that user in the database. This action takes a variable amount of time depending on the data, sometimes as long as 4 hours.
I am executing the script using exec with & to run it in the background in PHP.
My question is: is exec safe for long-running jobs? I don't know much about fork, Linux processes, etc., so I don't know what happens internally on the CPU cores.
Here is something I found that confused me:
http://symcbean.blogspot.com/2010/02/php-and-long-running-processes.html
Can somebody tell me if I am going in the right direction with exec?
Will the process be cleaned up by itself after the script completes?
Thanks
Well, that article is talking about process "trees" and how a child process depends on its spawning parent.
The PHP instance starts a child process (through exec or similar). If it doesn't wait for the process output, the PHP script ends (and the response is sent to the browser, for instance), but the server process sits idle, waiting for its child process to finish.
The problem with this is that the child process (the long-running 4-hour process) is not guaranteed to finish its job before Apache decides to kill its parent process (because there are too many idle processes), effectively killing its children.
The article's author then gives the suggestion of using a daemon and separate the child process from the parent process.
Edit:
Answering the question you left in the comments, here's a quick explanation of the command he uses in the article
echo /usr/bin/php -q longThing.php | at now
Starting from left to right.
echo prints its arguments to standard output (STDOUT), so...
echo /usr/bin/php -q longThing.php will print to the shell /usr/bin/php -q longThing.php
| (pipeline) feeds directly the STDOUT of a previous command to the standard input (STDIN) of the next command.
at reads commands from STDIN and executes them at a specified time. at now means the command will be executed immediately.
So basically this is the same thing as running the following sequence in the shell:
at now - Opens the at prompt
/usr/bin/php -q longThing.php - The command we want to run
^D (by pressing Control+D) - To save the job
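From PHP, that whole pipeline can be launched in one call; a minimal sketch (longThing.php is the article's example script):

// shell_exec() returns as soon as `at` has queued the job,
// so the web request is not tied to longThing.php's runtime
shell_exec('echo /usr/bin/php -q longThing.php | at now 2>&1');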
So, regarding your questions:
Will the child process be immediately killed after the PARENT PHP script ends?
No.
Will the child process be killed at all, in some future moment?
Yes. Apache takes care of that for you.
Will the child process finish its job before being killed?
Maybe. Maybe not. Apache might kill it before it's done. The odds of that happening increase with the number of idle processes and with the time the process takes to finish.
Sidenote:
I think this article points in the right direction, but I dislike the idea of spawning processes directly from PHP. PHP simply does not have the appropriate tools for running (long and/or intensive) background work. With PHP alone, you have little to no control over it.
I can, however, give you the solution we found for a similar problem I faced a while ago. We created a small program that would accept and queue data processing requests (about 5 mins long) and report back when the request was finished. That way we could control how many processes could be running at the same time, memory usage, number of requests by the same user, etc...
The program was actually hosted in another LAN server, which prevented memory usage spikes slowing down the webserver.
At the front-end, the user would be informed through long polling when the request was completed.

Running a PHP process as a daemon while safely killing it from background

We are running a PHP daemon which looks into a queue, receives worker jobs, and spawns a worker to handle each one. The workers themselves acquire a lock on a specific location before proceeding.
We spawn the daemon as a nohup background process.
This entire architecture seems to work, except when we have to kill the processes, for whatever reason. If we kill them using -9, there is no way to trap it in the worker process and release the locks before dying.
If we use anything less than -9 (like TERM or HUP), it doesn't seem to be received by either the daemon or the worker processes.
Has anybody solved this problem in a better way?
(P.S. Due to other considerations, we may not be able to change our language of implementation, so please only consider PHP-based solutions.)
I had related problems once too. Let me explain. I had a PHP 'daemon' that worked like a downloader. It accessed feeds periodically and downloaded (laaaarge) content from the net. The daemon had to be stopped at a certain time, let's say 05:00 in the morning, to prevent it from using the whole bandwidth during daytime. I decided to use a cronjob to send SIGTERM to the daemon at 05:00.
In the daemon I had the following code:
pcntl_signal(SIGTERM, array($this, 'signal_handler'));
where signal_handler looked like this:
public function signal_handler($signal) {
// some cleanup code
exit(1);
}
Unfortunately this did not work :|
It took me some time to find out what was going on. The first thing I figured out was that I would have to call pcntl_signal_dispatch() to enable signal dispatching at all. Quote from the doc (comments):
If you are running PHP as CLI and as a "daemon" (i.e. in a loop), this function must be called in each loop to check if new signals are waiting dispatching.
OK, so far, it seemed to work. But I quickly realized that under certain conditions even this would not work as expected. Sometimes the daemon could still only be stopped by kill -9, as before. :|
So what's the problem? Answer: my program called wget to download the files via shell_exec. The problem is that shell_exec() blocks until the child process has terminated. During this blocking wait no signal processing is done, so the process can only be terminated using SIGKILL, which is harsh. Another problem was that child processes had to be terminated one by one, as they became zombie processes after the parent was killed.
My solution to this was to execute the child process using proc_open() and then use stream_select() on its output for non-blocking IO.
Now it works like a charm. :) If you need further information don't hesitate to drop a comment.
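For illustration, here is a sketch of that approach (the wget command and buffer size are examples; error handling trimmed):

declare(ticks=1); // or call pcntl_signal_dispatch() in the loop on PHP >= 5.3

$stop = false;
pcntl_signal(SIGTERM, function () use (&$stop) { $stop = true; });

$proc = proc_open('wget -q http://example.com/file',
                  array(1 => array('pipe', 'w')), $pipes);
stream_set_blocking($pipes[1], false);

while (!$stop) {
    $read = array($pipes[1]); $write = null; $except = null;
    // wait up to 1 second for output, then loop so pending signals get handled
    if (stream_select($read, $write, $except, 1) > 0) {
        echo fread($pipes[1], 8192);
    }
    if (feof($pipes[1])) break; // child finished
}
if ($stop) {
    proc_terminate($proc); // forward the shutdown to the child
}
fclose($pipes[1]);
proc_close($proc);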
Note: if you are working with PHP < 5.3, you'll have to use
declare(ticks=1);
instead of pcntl_signal_dispatch(). You can refer to the documentation of pcntl_signal() for that. But if possible you should upgrade to PHP >= 5.3.
The problem was solved just by adding ticks:
// tick use required as of PHP 4.3.0
declare(ticks = 1);
Leaving this out was causing my code not to work.
(It's unfortunate that the documentation of pcntl_signal doesn't mention it in a more attention-grabbing way.)
You need to catch the signal (SIGTERM). This can be achieved via the function pcntl_signal. This will give you the option to perform any necessary functions before calling exit.
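For example (a minimal sketch; release_locks() is a hypothetical stand-in for whatever teardown your workers need):

declare(ticks=1);
pcntl_signal(SIGTERM, function () {
    release_locks(); // hypothetical: free the location lock before dying
    exit(0);
});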

php command line reader

So I want to be able to do the following in PHP. From the command line I call a script.
/usr/bin/php mychildren.php
I want the script to be able to create 2 child processes, both of which stay active indefinitely (say we make them infinite loops =D), and I want the child processes to occasionally echo out "hello" for the 1st process and "goodbye" for the 2nd process. Then, when I send a signal interrupt (Ctrl+C), I want to use pcntl_signal to kill the 2 child processes, and once I have verification that they are dead, kill the parent process.
Is this even possible?! I looked through streams a little and I am super confuzzled as to how to get this working. Seems like it should work, but I can't get anything working properly.
Quick details:
2 child processes
each child process occasionally echoes something
when I interrupt the parent, the children are killed first, and once they are dead the parent dies
While you can use pcntl_fork to create subprocesses, oftentimes it is better to execute the subprocesses anew with proc_open. Use pcntl_signal to install signal handlers (to kill the subprocesses). If you want the child processes to directly write to the same output, you'll have to implement some kind of IPC to avoid both writing at the same time.
Therefore, it's probably better to let both subprocesses write to the main process, and let the main process wait for full lines or otherwise synchronize outputs.
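That said, here is a minimal sketch of the pcntl_fork() variant the question asks about (signal timing details vary by PHP version; ticks or pcntl_signal_dispatch() are needed for the handlers to run):

declare(ticks=1);

$children = array();
foreach (array('hello', 'goodbye') as $word) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        die("fork failed\n");
    }
    if ($pid === 0) { // child
        pcntl_signal(SIGINT, SIG_IGN); // let the parent decide when we die
        pcntl_signal(SIGTERM, function () { exit(0); });
        while (true) {
            echo $word, "\n";
            sleep(rand(1, 5));
        }
    }
    $children[] = $pid; // parent collects child pids
}

pcntl_signal(SIGINT, function () use ($children) {
    foreach ($children as $pid) {
        posix_kill($pid, SIGTERM);    // ask the child to exit
        pcntl_waitpid($pid, $status); // verify it is gone
    }
    exit(0); // only then does the parent die
});

while (true) {
    sleep(60); // parent idles until Ctrl+C
}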
