I need to run a huge process that will take 10+ minutes. I maxed out max_execution_time, but in my error logs I get a SIGTERM and then a SIGKILL.
I read a little about SIGTERM and SIGKILL and that they come from the daemon, but I didn't figure out how to stop it from happening. I just need to disable it for one night.
Rather than trying to ignore the signals, you need to find out who sends them and why. If you're starting PHP from the command line, nobody will send that signal and your script will have all the time it needs.
But if you're actually starting this process in response to an HTTP request, it's probably the web server or FastCGI process manager that limits how long it waits for the script to finish. It may also simply kill the script because the client connection (between the user's browser and the HTTP server) has been closed.
So the important question to ask yourself is: what is the source of that signal, and how can that timeout be increased? Please also provide details about how you start this script and what platform you're running on.
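For completeness, these two calls lift PHP's own limits (a sketch only; they do not stop the web server or the FastCGI process manager from sending SIGTERM/SIGKILL when its own timeout is reached):

set_time_limit(0);       // disable PHP's max_execution_time for this request
ignore_user_abort(true); // keep running even if the browser disconnects
// ... long-running work ...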
I'm creating a socket server with ReactPHP and I need it to run forever.
I also have a control panel where I have to check whether the process is running, and where I can stop, start, or restart it.
I don't know how to achieve this.
My plan was:
With the play button: start the process via shell_exec with simply "php script.php".
With the stop button I can do it in two ways: 1. I can set a timer in the loop that checks every 5 seconds whether a file (like "stop.lock") exists in the folder and then stops the process. 2. I can save the process PID in the database, so clicking the stop button just kills the process.
Checking online status: I can make another script that tries to connect to the IP/port; if it succeeds the server is online, if not (5-second timeout) it is offline.
I also want the script to always stay listening, so how can I make it auto-start if, for example, I have to restart my server?
I was thinking about a cron job that tries to connect to the server every minute; if it fails, it just launches shell_exec('php script.php'); again.
What is the best way to handle all of this? (The server OS is CentOS 7.)
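Roughly, the PID and port-check ideas would look something like this (the PID file path, port, and stop trigger are just placeholders):

// Stop: read the saved PID and ask the server to terminate gracefully.
$pid = (int) file_get_contents('/var/run/socket-server.pid');
if ($pid > 0 && posix_kill($pid, 0)) {   // signal 0 only checks that the process exists
    posix_kill($pid, SIGTERM);           // the SIGTERM constant requires ext-pcntl
}

// Status: try to connect with a 5-second timeout.
$socket = @fsockopen('127.0.0.1', 8080, $errno, $errstr, 5);
$online = $socket !== false;
if ($socket) {
    fclose($socket);
}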
As @Volker said, just stop the loop if you want to stop it gracefully. You could periodically check a file or query a table, but that's not a great approach.
A nicer flow could be to listen for an admin message telling the server to stop. Of course, you should take care to authenticate who is allowed to stop the server. This way it stops without having to wait for an interval to elapse, and you avoid the overhead of periodically polling your filesystem or your database.
Another cool way could be to use RabbitMQ or a similar queue service. You just listen to your queue server, and you can send a message from your script to RabbitMQ and from there to your server.
Good luck!
Edit: If you are running your server under systemd, a great way of handling this is to listen for a system signal and stop the application gracefully. Take a look at addSignal: with it you can handle a kill by PID, but also a stop issued through systemd.
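A minimal sketch of that approach (assuming ext-pcntl is available so the default loop supports signals):

require 'vendor/autoload.php';

$loop = React\EventLoop\Factory::create();

// ... attach your socket server to $loop here ...

$stop = function (int $signal) use ($loop) {
    echo "Received signal $signal, shutting down...\n";
    $loop->stop(); // the script then falls through run() and exits cleanly
};

$loop->addSignal(SIGTERM, $stop); // sent by `kill <pid>` or `systemctl stop`
$loop->addSignal(SIGINT, $stop);  // Ctrl+C when run from a terminal

$loop->run();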
To handle graceful shutdown versus long-running streamed responses, I've created an acquire/release-like mechanism.
When a handler starts streaming a long response it acquires a lock, and when streaming is done it releases it (it's just an array of uniqid() values).
The server can decide to wait if there are active locks.
I use supervisor to handle the start/stop with a SIGTERM signal.
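In rough terms the idea is just this (a sketch, not the actual implementation):

$locks = []; // one entry per in-flight streamed response

function acquireLock(array &$locks): string {
    $id = uniqid('', true);
    $locks[$id] = true;
    return $id;
}

function releaseLock(array &$locks, string $id): void {
    unset($locks[$id]);
}

// On SIGTERM (e.g. from supervisor): stop accepting new connections,
// then keep the loop alive until $locks is empty before exiting.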
I have a long-running PHP script that I'm attempting to convert into a systemd daemon.
While planning the daemon, I figured that I could simply send SIGTERM/SIGUSR1/SIGUSR2/etc. signals to my script with the kill command to restart/reload it when necessary, but after reading through the systemd documentation, I noticed this bit in the "ExecReload" section:
Note however that reloading a daemon by sending a signal (as with the example line above) is usually not a good choice, because this is an asynchronous operation and hence not suitable to order reloads of multiple services against each other. It is strongly recommended to set ExecReload= to a command that not only triggers a configuration reload of the daemon, but also synchronously waits for it to complete.
So, while my script will run just fine and the daemon itself will work properly using kill to signal various events (I don't have, and most likely won't ever have, another daemon that depends on this one), the quote above got me thinking about alternatives for sending a synchronous message to the daemon.
The only thing that I could think of so far is (rough sketch below):
Open a local socket in the daemon and listen for messages on it
Execute any supported action when receiving a message
Send an OK message back to the sender once the action is complete
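A rough sketch of that flow with a Unix domain socket (the socket path and the command names are assumptions):

$server = stream_socket_server('unix:///run/mydaemon.sock', $errno, $errstr);
if ($server === false) {
    die("Cannot listen: $errstr\n");
}

while ($conn = stream_socket_accept($server, -1)) {
    $command = trim(fgets($conn));
    if ($command === 'reload') {
        // ... re-read the configuration here ...
        fwrite($conn, "OK\n"); // reply only once the reload has completed
    }
    fclose($conn);
}

ExecReload= could then point to a tiny client script that connects, sends "reload", and blocks until it reads the "OK" line, which makes the reload synchronous from systemd's point of view.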
Is there a better/recommended/optimal way of achieving this?
Situation
I have a daemon I wrote in PHP (not the best language for this, but work with me), and it is made to receive jobs from a queue and process them whenever a job needs to be done. For each new job, I use pcntl_fork() to fork the job off into a child process. Within this child process, I then use proc_open() to execute long-running system commands for audio transcoding, which return control to the child when finished. When the job is completely done, the child exits and is cleaned up by the parent process.
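Condensed, the per-job flow looks roughly like this (queue handling and error handling omitted; the sox command line is just an example):

$pid = pcntl_fork();
if ($pid === 0) {
    // Child: run the long transcoding command and wait for it to finish.
    $proc = proc_open('sox in.wav out.mp3', [], $pipes);
    $exitCode = proc_close($proc); // blocks until sox exits
    exit($exitCode);
}
// Parent: reap the child once the job is done.
pcntl_waitpid($pid, $status);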
To keep this daemon always running, I use upstart. Here is my upstart configuration file:
description "Audio Transcoding Daemon"
start on startup
stop on shutdown
# kill signal SIGCHLD
kill timeout 1200 # Don't force kill the process until it runs over 20 minutes
respawn
exec audio-daemon.php
Goal
Because I want to use this daemon in a distributed environment, I want to be able to shut down the server at any time without disrupting any running jobs. To do this, I have already implemented signal handlers using pcntl_signal() for SIGTERM, SIGHUP, and SIGINT on the parent process, which waits for all children to exit normally before exiting itself. The children also have signal handlers, but they are made to ignore all kill signals.
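For context, the parent's handlers look roughly like this ($children is assumed to hold the PIDs of the forked job processes):

declare(ticks=1); // or call pcntl_signal_dispatch() from the main loop

$gracefulExit = function () use (&$children) {
    foreach ($children as $childPid) {
        pcntl_waitpid($childPid, $status); // wait for each job to finish normally
    }
    exit(0);
};

pcntl_signal(SIGTERM, $gracefulExit);
pcntl_signal(SIGHUP, $gracefulExit);
pcntl_signal(SIGINT, $gracefulExit);

// In each child, termination signals are ignored so only the parent reacts:
// pcntl_signal(SIGTERM, SIG_IGN);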
Problem
The problem is, according to the docs...
The signal specified by the kill signal stanza is sent to the process group of the main process. (such that all processes belonging to the jobs main process are killed). By default this signal is SIGTERM.
This is concerning because, in my child process, I run system commands through proc_open(), which spawns new child processes as well. So, whenever I run sudo stop audio-daemon, this sub-process (which happens to be sox) is killed immediately, and the job returns with an error. Apparently, sox obeys SIGTERM and does what it's told...
Originally, I thought, "Fine. I'll just change kill signal to send something that is inherently ignored, and I'll pick it up in the main process only." But according to the manual, there are only two signals that are ignored by default: SIGCHLD and SIGURG (and possibly SIGWINCH). I'm afraid of getting false positives, though, since these can also be triggered in other ways.
There are ways to create a custom signal using what the manual calls "Real-time Signals", but it also states...
The default action for an unhandled real-time signal is to terminate the receiving process.
So that doesn't help...
Can you think of any way I can get upstart to keep all of my sub-processes running until they complete? I really don't want to go digging through sox's source code to modify its signal handlers, and while I could set SIGCHLD, SIGURG, or SIGWINCH as my upstart kill signal and pray nothing else sends them my way, I can't help but think there's a better way to do this... Any ideas?
Thanks for all your help! :)
Since I haven't received any other answers suggesting a better way to do this, here is what I ended up doing; I hope it helps someone out there...
To stall shutdown/reboot of the system until the daemon is finished, I changed the start on and stop on stanzas in my upstart configuration. And to keep upstart from killing my children, I resorted to using SIGURG as my kill signal, which I then catch as a kill signal in my main daemon process only.
Here is my final upstart configuration:
description "Audio Transcoding Daemon"
start on runlevel [2345]
stop on starting rc RUNLEVEL=[016] # Block shutdown/reboot until the daemon ends
kill signal SIGURG # Kill the process group with SIGURG instead of SIGTERM so only the main process will pick it up (since SIGURG will be ignored by all children by default)
kill timeout 1200 # Don't force kill the process until it runs over 20 minutes
respawn
exec audio-daemon.php
Note that using stop on starting rc RUNLEVEL=[016] is necessary to stall shutdown/reboot. stop on runlevel [016] will not work.
Also note that if you use SIGURG in your application for any other reason, using it as a kill signal may cause problems. In my case, I wasn't, so this works fine as far as I can tell.
Ideally, it would be nice if the POSIX standard provided a user-defined signal like SIGUSR1 and SIGUSR2 that was ignored by default. But as far as I can tell, no such signal exists.
Feel free to chime in if you have a better answer, but for now, I hope this helps anyone else having this problem.
Disclaimer: I don't know any PHP
I solved a similar problem with my Ruby process by setting a new group ID for a launched subprocess. It looks like PHP has a similar facility.
You can start a new process group (detaching it from your audio-daemon.php) by setting its group ID to its process ID.
Something like:
$chldPid = pcntl_fork();
if ($chldPid === -1) {
    die('fork failed'); // error checks etc.
}
if ($chldPid) {
    // Parent: move the child into its own process group so signals sent
    // to the daemon's process group no longer reach it.
    posix_setpgid($chldPid, $chldPid);
}
I've created a PHP script that reads from beanstalkd and processes the jobs. No problems there.
The last thing I've got to do is just write an init script for it, so it can run as a service.
However, this has now raised another question for me. When trying to stop the service, the obvious way of doing it would be to kill the process. However, if I do that, what happens to the job if the PHP script was halfway through processing it? The job was reserved, but the script never succeeded or failed (i.e. never deleted or buried it), so what happens?
My guess is that the TTR will expire and the job then gets put back into the ready queue?
And a bonus second question: any hints on how to better manage stopping the PHP service?
When a worker process (beanstalkd client) opens a connection with beanstalkd and reserves a job, the job stays in the "reserved" state until the client issues a delete/release command or the job times out.
If the worker process terminates abruptly, its connection with beanstalkd is closed and the server immediately releases all the jobs that were reserved on that particular connection.
Ref: http://groups.google.com/group/beanstalk-talk/browse_thread/thread/232d0cac5bebe30f?hide_quotes=no#msg_efa0109e7af4672e
Any job that runs out of time and is not buried or touched goes back into the ready queue to be reserved again.
I've posted elsewhere about using Supervisord and shell scripts to run workers. It has the advantage that most of the time you probably don't mind waiting a little while as jobs finish cleanly. You can have supervisord kill the bash wrapper scripts that run a worker script; when the worker script itself has finished, it simply exits, as the wrapper can't restart it.
Another way is to put a highest-priority (0) message into a tube that the workers listen on, which tells the workers to first delete the message and then exit. I set up the shell scripts to check for a specific return value (from exit($val);), and they would then exit their own loop as well.
I've used these techniques with Beanstalkd and also AWS SQS queue runners for some time, dealing with millions of jobs per day running through the system.
If your job is too valuable to lose, you can also use pcntl to wait until the job finishes and then restart/shut down your worker. I've managed to handle all the relevant pcntl signals to release the job back to the tube.
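A rough sketch of that worker loop with Pheanstalk (the tube name and exact method names are assumptions; signatures differ a bit between library versions):

require 'vendor/autoload.php';

use Pheanstalk\Pheanstalk;

$pheanstalk = Pheanstalk::create('127.0.0.1');
$pheanstalk->watch('jobs');

$stopping = false;
pcntl_async_signals(true);
pcntl_signal(SIGTERM, function () use (&$stopping) {
    $stopping = true; // finish or hand back the current job, then exit
});

while (!$stopping) {
    $job = $pheanstalk->reserveWithTimeout(5);
    if ($job === null) {
        continue;
    }
    try {
        // ... process the job ...
        $pheanstalk->delete($job);
    } catch (\Throwable $e) {
        $pheanstalk->release($job); // put it back into the ready queue
    }
}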
I use cPanel and my CPU usage is too high. My PHP script (messenger.php) uses too much CPU, so I want to kill it with Process Management.
I looked at the documentation of cPanel here: http://docs.cpanel.net/twiki/bin/view/AllDocumentation/WHMDocs/CurrentCPUUsage
When I kill this process, what happens for my users who are using this page (messenger.php), and when will it run again?
When you kill it, everything in progress will be aborted - database queries, deletions, inserts, form submissions, ...
So in the worst case someone could lose some important data.
If you kill a PHP process it will shut down ungracefully and not finish any outstanding work. This usually results in an error 500 for users who requested the page but had not yet received it. However, the process will usually restart automatically, and new page requests should be served again within milliseconds. The other running PHP processes will take over the workload of the terminated process while it restarts, unless of course the other processes are hanging too.