When does a PHP <5.3.0 daemon script receive signals? - php

I've got a PHP script in the works that is a job worker; its main task is to check a database table for new jobs, and if there are any, to act on them. But jobs will be coming in in bursts, with long gaps in between, so I devised a sleep cycle like:
while(true) {
if ($jobs = get_new_jobs()) {
// Act upon the jobs
} else {
// No new jobs now
sleep(30);
}
}
Good, but in some cases that means there might be a 30 second lag before a new job is acted upon. Since this is a daemon script, I figured I'd try the pcntl_signal hook to catch a SIGUSR1 signal to nudge the script to wake up, like:
$_isAwake = true;
function user_sig($signo) {
global $_isAwake;
daemon_log("Caught SIGUSR1");
$_isAwake = true;
}
pcntl_signal(SIGUSR1, 'user_sig');
while(true) {
if ($jobs = get_new_jobs()) {
// Act upon the jobs
} else {
// No new jobs now
daemon_log("No new jobs, sleeping...");
$_isAwake = false;
$ts = time();
while(time() < $ts+30) {
sleep(1);
if ($_isAwake) break; // Did a signal happen while we were sleeping? If so, stop sleeping
}
$_isAwake = true;
}
}
I broke the sleep(30) up into smaller sleep bits, in case a signal doesn't interrupt a sleep() command, thinking that this would cause at most a one-second delay, but in the log file, I'm seeing that the SIGUSR1 isn't being caught until after the full 30 seconds has passed (and maybe the outer while loop resets).
I found the pcntl_signal_dispatch command, but that's only for PHP 5.3 and higher. If I were using that version, I could stick a call to that command before the if ($_isAwake) call, but as it currently stands I'm on 5.2.13.
In what situations is the signal queue processed in PHP versions that have no way to trigger that processing explicitly? Could I put some other otherwise useless command in that sleep loop that would trigger a signal queue parse there?

Fixed my own problem: the answer is the "ticks" declaration. As part of the daemon process startup I had done the declare(ticks=1); action, but it didn't seem to carry over to the main script (since that was inside a function, in an include file?). Adding a declare(ticks=1) line before the while(true) loop causes signals to come through immediately (i.e. the sleep(1) command causes a tick, so after waking up from sleep, signals are processed).
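A minimal sketch of that fix, assuming the pcntl extension is loaded and the script runs under the PHP CLI. The names on_wake_signal(), wait_or_wake() and $wakeRequested are hypothetical; the point is only that declare(ticks=1) sits in the same file as the sleep loop, so a pending signal is handled right after sleep(1) returns.

```php
<?php
// The ticks directive applies to the file it is written in,
// so it is declared here, next to the loop that needs it.
declare(ticks=1);

$wakeRequested = false;

// Handler: only records that a wake-up was requested.
function on_wake_signal($signo)
{
    global $wakeRequested;
    $wakeRequested = true;
}

pcntl_signal(SIGUSR1, 'on_wake_signal');

// Sleep up to $seconds in one-second slices.
// Returns true if a signal cut the wait short, false otherwise.
function wait_or_wake($seconds)
{
    global $wakeRequested;
    $deadline = time() + $seconds;
    while (time() < $deadline) {
        sleep(1);               // a tick fires after this statement
        if ($wakeRequested) {
            $wakeRequested = false;
            return true;
        }
    }
    return false;
}
```

In the job loop, wait_or_wake(30) would replace the hand-written inner while loop, and kill -USR1 on the worker's PID should end the wait within about a second.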

Related

php never ending loop

I need a function that executes by itself in PHP without the help of cron. I have come up with the following code, which works well for me, but since it is a never-ending loop, will it cause any problems for my server or script? If so, could you give me some suggestions or alternatives, please? Thanks.
$interval=60; //minutes
set_time_limit(0);
while (1){
$now=time();
#do the routine job, trigger a php function and what not.
sleep($interval*60-(time()-$now));
}
We have used the infinite loop in a live system environment to basically wait for incoming SMS and then process it. We found out that doing it this way makes the server resource intensive over time and had to restart the server in order to free up memory.
Another issue we encountered is when you execute a script with an infinite loop in your browser, even if you hit the stop button it will continue to run unless you restart Apache.
while (1){ //infinite loop
// write code to insert text to a file
// The file size will still continue to grow
//even when you click 'stop' in your browser.
}
The solution is to run the PHP script as a daemon on the command line. Here's how:
nohup php myscript.php &
The & puts your process in the background.
Not only did we find this method to be less memory intensive, but you can also kill it without restarting Apache by running the following command:
kill processid
Edit: As Dagon pointed out, this is not really the true way of running PHP as a 'Daemon' but using the nohup command can be considered as the poor man's way of running a process as a daemon.
You can use the time_sleep_until() function. It returns TRUE on success or FALSE on failure.
$interval = 60; // minutes
set_time_limit(0);
$next = time() + $interval * 60; // absolute timestamp of the next run
while (1) {
    // The loop pauses here until the scheduled time is reached;
    // time_sleep_until() returns FALSE if that time is already in the past.
    time_sleep_until($next);
    # do the routine job, trigger a php function and what not.
    $next += $interval * 60; // schedule the following run
}
There are many ways to create a daemon in php, and have been for a very long time.
Just running something in background isn't good. If it tries to print something and the console is closed, for example, the program dies.
One method I have used on linux is pcntl_fork() in a php-cli script, which basically splits your script into two PIDs. Have the parent process kill itself, and have the child process fork itself again. Again have the parent process kill itself. The child process will now be completely divorced and can happily hang out in background doing whatever you want it to do.
$i = 0;
do{
$pid = pcntl_fork();
if( $pid == -1 ){
die( "Could not fork, exiting.\n" );
}else if ( $pid != 0 ){
// We are the parent
die( "Level $i forking worked, exiting.\n" );
}else{
// We are the child.
++$i;
}
}while( $i < 2 );
// This is the daemon child, do your thing here.
Unfortunately, this model has no way to restart itself if it crashes, or if the server is rebooted. (This can be resolved through creativity, but...)
To get the robustness of respawning, try an Upstart script (if you are on Ubuntu.) Here is a tutorial - but I have not yet tried this method.
while(1) means it is an infinite loop. If you want to leave it, use break with a condition.
E.g.:
while (1){ //infinite loop
$now=time();
#do the routine job, trigger a php function and what not.
sleep($interval*60-(time()-$now));
if(condition) break; //it will break when condition is true
}

Execute function in php before SIGTERM

I am executing a PHP script through the Windows console. I have some ending functions that write results to a file. Sometimes I have to interrupt the execution (Ctrl+C) and halt the script. I am interested in some way to write the current progress to a file between the keystroke and the actual termination. Is this possible? I really need to be able to resume my script from the last point the next time I run it. Thank you!
You can register a signal handler for SIGTERM:
function sig_handler($signo)
{
// Do something
}
pcntl_signal(SIGTERM, "sig_handler");
Your handler should then be executed when the signal is received.
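A rough sketch of how such a handler could be combined with a checkpoint file so the script can resume later. It assumes a Unix-like CLI with the pcntl extension (pcntl is not available on Windows) and PHP 5.3+ for pcntl_signal_dispatch(). The names request_stop(), run_with_checkpoint() and the checkpoint file are hypothetical.

```php
<?php
$stopRequested = false;

// Handler: only sets a flag; the main loop decides when to stop.
function request_stop($signo)
{
    global $stopRequested;
    $stopRequested = true;
}

pcntl_signal(SIGTERM, 'request_stop');
pcntl_signal(SIGINT, 'request_stop');   // Ctrl+C in a terminal

// Process items $from..$to, saving the index of the last finished item.
// Returns the index to resume from on the next run.
function run_with_checkpoint($from, $to, $checkpointFile)
{
    global $stopRequested;
    for ($i = $from; $i <= $to; $i++) {
        // ... real work for item $i would go here ...
        file_put_contents($checkpointFile, (string) $i);
        pcntl_signal_dispatch();        // run any pending signal handlers
        if ($stopRequested) {
            return $i + 1;
        }
    }
    return $to + 1;
}
```

On the next run, the script could read the checkpoint file and pass the stored index plus one as $from.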

pcntl_sigwaitinfo and signal handlers

I'm writing a daemon which periodically does some work and sleeps for some time before repeating it again. But it must still be responsive to outside events (e.g. a termination request) while asleep.
I managed to implement sleep timeout with ALRM signal and termination with TERM signal (sample):
// ...
declare(ticks = 1);
function do_work()
{
echo "Doing some work.\n";
}
$term = FALSE;
$sighandler = function ($signal) use (&$term)
{
if ($signal === SIGTERM)
{
pcntl_alarm(0);
$term = TRUE;
echo "TERM HANDLER\n";
} else {
echo "ALRM HANDLER\n";
}
};
pcntl_signal(SIGALRM, $sighandler);
pcntl_signal(SIGTERM, $sighandler);
while (!$term)
{
do_work();
// Kick myself after 2 seconds
pcntl_alarm(2);
// Wait for alarm or termination
$signal = pcntl_sigwaitinfo(array(SIGTERM, SIGALRM), $info);
pcntl_signal_dispatch();
switch ($signal)
{
case SIGALRM: echo "ALRM SIGWI\n"; break;
case SIGTERM: echo "TERM SIGWI\n"; $term = TRUE; break;
}
}
// ...
But for God's sake, I can't figure out why the sighandler is never called. I get the following output:
$ php sigsample.php
Doing some work.
ALRM SIGWI
Doing some work.
ALRM SIGWI
Doing some work.
TERM SIGWI
And at the same time, if I don't set this handler, the script dies because of an unhandled signal.
Am I missing something? Why is my signal handler function never called? Does pcntl_sigwaitinfo() interfere?
Are there any other means to implement a timeout and signal handling at the same time?
That's not entirely unexpected.
You've asked for delivery to a signal handler (pcntl_signal(...)), but then also asked to accept the signal without invoking any handlers (pcntl_sigwaitinfo(...)). Your OS gets to decide what happens in this case, and your OS (like mine) chooses to let pcntl_sigwaitinfo() win.
Background
A process can receive ("suffer?") a signal in two different ways:
asynchronous delivery
The signal induces some asynchronous action, typically killing the process or invoking a user-defined handler. pcntl_signal registers just such a handler.
The underlying calls familiar to C programmers are signal and sigaction.
synchronous acceptance
Special system functions note that a signal is pending, and remove it from the pending list, returning information about the signal to the process. pcntl_sigwaitinfo is such a function.
The underlying calls are sigwait, sigwaitinfo and sigtimedwait.
These two ways, delivery and acceptance, are different and are not meant to be used together. AIX, for example, simply forbids "[c]oncurrent use of sigaction and sigwait".
(Related to the above is the concept of the signal mask, which can "block" signals, effectively forcing them to stay pending until accepted or until "unblocked" and delivered.)
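One way to stay with synchronous acceptance only, sketched under the assumption of Linux with the pcntl and posix extensions (pcntl_sigtimedwait() is not available on every platform): block the signals with pcntl_sigprocmask() so they stay pending, register no handlers, and collect them with pcntl_sigtimedwait(). The helper name wait_for_signal() is hypothetical.

```php
<?php
// Block both signals: they now stay pending until explicitly accepted,
// so no handler is needed and the default action never kills the script.
$watched = array(SIGTERM, SIGALRM);
pcntl_sigprocmask(SIG_BLOCK, $watched);

// Wait up to $seconds for one of $signals.
// Returns the signal number, or null on timeout or error.
function wait_for_signal(array $signals, $seconds)
{
    $info = array();
    $signo = pcntl_sigtimedwait($signals, $info, $seconds);
    return (is_int($signo) && $signo > 0) ? $signo : null;
}
```

A daemon loop could then do its work, call wait_for_signal($watched, 2) as the sleep, and leave the loop when the result is SIGTERM; the timeout replaces the pcntl_alarm() kick.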

PHP Launch script after background process completes?

I am converting a PDF with PDF2SWF and indexing it with XPDF, both through exec(), but this requires a really high execution time limit.
Is it possible to run it as background process and then launch a script when it is done converting?
In general, PHP does not implement threads.
But there is a ZF class which may be suitable for you:
http://framework.zend.com/manual/en/zendx.console.process.unix.overview.html
ZendX_Console_Process_Unix allows developers to spawn an object as a new process, and so do multiple tasks in parallel on console environments. Through its specific nature, it is only working on *nix based systems like Linux, Solaris, Mac/OSx and such. Additionally, the shmop_*, pcntl_* and posix_* modules are required for this component to run. If one of the requirements is not met, it will throw an exception after instantiating the component.
suitable example:
class MyProcess extends ZendX_Console_Process_Unix
{
protected function _run()
{
// doing pdf and flash stuff
}
}
$process1 = new MyProcess();
$process1->start();
while ($process1->isRunning()) {
sleep(1);
}
echo 'Process completed';
Try using popen() instead of exec().
This hack will work on any standard PHP installation, even on Windows, with no additional libraries required. You can't really control all aspects of the processes you spawn this way, but sometimes this is enough:
$p1 = popen("/bin/bash ./some_shell_script.sh argument_1","r");
$p2 = popen("/bin/bash ./some_other_shell_script.sh argument_2","r");
$p3 = popen("/bin/bash ./yet_other_shell_script.sh argument_3","r");
The three spawned shell scripts will run simultaneously, and as long as you don't do a pclose($p1) (or $p2 or $p3) or try to read from any of these pipes, they will not block your PHP execution.
When you're done with your other stuff (the one that you are doing with your PHP script) you can call pclose() on the pipes, and that will pause your script execution until the process you are pclosing finishes. Then your script can do something else.
Note that your PHP will not conclude or die() until those scripts have finished. Reaching the end of the script or calling die() will make it wait.
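A small sketch of that behaviour, assuming a Unix-like shell. The two commands are hypothetical stand-ins for the real conversion scripts, and the timing variables are only there to show that both commands run at the same time.

```php
<?php
// Start both commands; popen() returns immediately while they keep running.
$started = microtime(true);
$p1 = popen('sleep 1; echo first', 'r');
$p2 = popen('sleep 1; echo second', 'r');
$launchTime = microtime(true) - $started;

// ... other work in the PHP script could happen here ...

// Reading blocks until each command has finished writing its output.
$out1 = trim(stream_get_contents($p1));
$out2 = trim(stream_get_contents($p2));

// pclose() waits for the process to end and returns its termination status.
$status1 = pclose($p1);
$status2 = pclose($p2);
$totalTime = microtime(true) - $started;
```

Because the two commands overlap, $totalTime should come out close to one second rather than two. A follow-up script can be launched right after the pclose() calls.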
If you are running it from the command line, you can fork a php process using pcntl_fork
There are also daemon classes that would do the same trick:
http://pear.php.net/package/System_Daemon
$pid = pcntl_fork();
if ($pid == -1) {
die('could not fork');
} else if ($pid) {
//We are the parent, exit
exit();
} else {
// We are the child, do something interesting then call the script at the end.
}

Stopping gearman workers nicely

I have a number of Gearman workers running constantly, saving things like records of user page views, etc. Occasionally, I'll update the PHP code that is used by the Gearman workers. In order to get the workers to switch to the new code, I kill and restart the PHP processes for the workers.
What is a better way to do this? Presumably, I'm sometimes losing data (albeit not very important data) when I kill one of those worker processes.
Edit: I found an answer that works for me, and posted it below.
Solution 1
Generally I run my workers with the unix daemon utility with the -r flag and let them expire after one job. Your script will end gracefully after each iteration and daemon will restart automatically.
Your workers will be stale for one job but that may not be as big a deal to you as losing data
This solution also has the advantage of freeing up memory. You may run into problems with memory if you're doing large jobs as PHP pre 5.3 has god awful GC.
Solution 2
You could also add a quit function to all of your workers that exits the script. When you'd like to restart you simply give gearman calls to quit with a high priority.
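function AutoRestart() {
    static $startTime = null;
    if ($startTime === null) {
        $startTime = time(); // remember when this worker started
    }
    clearstatcache(); // filemtime() results are cached within a request
    if (filemtime(__FILE__) > $startTime) {
        exit();
    }
}
AutoRestart();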
Well, I posted this question, now I think I have found a good answer to it.
If you look in the code for Net_Gearman_Worker, you'll find that in the work loop, the function stopWork is monitored, and if it returns true, it exits the function.
I did the following:
Using memcache, I created a cached value, gearman_restarttime, and I use a separate script to set that to the current timestamp whenever I update the site. (I used Memcache, but this could be stored anywhere--a database, a file, or anything).
I extended the Worker class to be, essentially, Net_Gearman_Worker_Foo, and had all of my workers instantiate that. In the Foo class, I overrode the stopWork function to do the following: first, it checks gearman_restarttime; the first time through, it saves the value in a global variable. From then on, each time through, it compares the cached value to the global. If it has changed, the stopWork returns true, and the worker quits. A cron checks every minute to see if each worker is still running, and restarts any worker that has quit.
It may be worth putting a timer in stopWork as well, and checking the cache only once every x minutes. In our case, Memcache is fast enough that checking the value each time doesn't seem to be a problem, but if you are using some other system to store off the current timestamp, checking less often would be better.
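The comparison logic described above might look roughly like the sketch below. It deliberately swaps Memcache for a plain file to stay self-contained, and it is not part of Net_Gearman_Worker: the class name RestartWatcher and its method shouldStop() are hypothetical. An overridden stopWork could simply return the result of shouldStop().

```php
<?php
// Hypothetical helper: remembers the first restart stamp it sees and
// reports true once the stored stamp differs from that first value.
class RestartWatcher
{
    private $path;
    private $firstSeen = null;

    public function __construct($path)
    {
        $this->path = $path;
    }

    public function shouldStop()
    {
        $current = is_file($this->path) ? trim(file_get_contents($this->path)) : '';
        if ($this->firstSeen === null) {
            $this->firstSeen = $current;   // first call: just remember the value
            return false;
        }
        return $current !== $this->firstSeen;
    }
}
```

A deploy script would then only need to write a new timestamp into the stamp file to make every worker quit after its current job.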
Hmm, you could implement code in the workers that occasionally checks whether the source code was modified and, if so, has them kill themselves when they see fit. That is, they could check between jobs, or in the middle of a job if the job is very large.
Another way would be to implement some kind of interrupt, maybe via the network, telling the worker to stop whenever it has the chance and restart.
A last option is to help modify Gearman's source to include this functionality.
I've been looking at this recently as well (though in perl with Gearman::XS). My usecase was the same as yours - allow a long-running gearman worker to periodically check for a new version of itself and reload.
My first attempt was just having the worker keep track of how long since it last checked the worker script version (an md5sum would also work). Then once N seconds had elapsed, between jobs, it would check to see if a new version of itself was available, and restart itself (fork()/exec()). This did work OK, but workers registered for rare jobs could potentially end up waiting hours for work() to return, and thus for checking the current time.
So I'm now setting a fairly short timeout when waiting for jobs with work(), so I can check the time more regularly. The PHP interface suggests that you can set this timeout value when registering for the job. I'm using SIGALRM to trigger the new-version check. The perl interface blocks on work(), so the alarm wasn't being triggered initially. Setting the timeout to 60 seconds got the SIGALRM working.
If someone were looking for answer for a worker running perl, that's part of what the GearmanX::Starter library is for. You can stop workers after completing the current job two different ways: externally by sending the worker process a SIGTERM, or programmatically by setting a global variable.
Given the fact that the workers are written in PHP, it would be a good idea to recycle them on a known schedule. This can be a static amount of time since started or can be done after a certain number of jobs have been attempted.
This essentially kills (no pun intended) two birds with one stone. You are mitigating the potential for memory leaks, and you have a consistent way to determine when your workers will pick up on any potentially new code.
I generally write workers such that they report their interval to stdout and/or to a logging facility so it is simple to check on where a worker is in the process.
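As an illustration of such a recycling policy, here is a minimal sketch. The function name should_recycle() and its parameters are hypothetical, and it assumes something external (cron, supervisor, a shell loop) restarts the worker after it exits.

```php
<?php
// Decide whether a worker should exit so that a fresh process can replace it.
// $now is injectable to make the decision easy to check; it defaults to time().
function should_recycle($startedAt, $jobsDone, $maxAgeSeconds, $maxJobs, $now = null)
{
    $now = ($now === null) ? time() : $now;
    return ($now - $startedAt) >= $maxAgeSeconds || $jobsDone >= $maxJobs;
}
```

The worker would call this between jobs and exit cleanly when it returns true.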
I ran into this same problem and came up with a solution for python 2.7.
I'm writing a python script which uses gearman to communicate with other components on the system. The script will have multiple workers, and I have each worker running in separate thread. The workers all receive gearman data, they process and store that data on a message queue, and the main thread can pull the data off of the queue as necessary.
My solution to cleanly shutting down each worker was to subclass gearman.GearmanWorker and override the work() function:
from gearman import GearmanWorker
POLL_TIMEOUT_IN_SECONDS = 60.0
class StoppableWorker(GearmanWorker):
    def __init__(self, host_list=None):
        super(StoppableWorker, self).__init__(host_list=host_list)
        self._exit_runloop = False
    # OVERRIDDEN
    def work(self, poll_timeout=POLL_TIMEOUT_IN_SECONDS):
        worker_connections = []
        continue_working = True
        def continue_while_connections_alive(any_activity):
            return self.after_poll(any_activity)
        while continue_working and not self._exit_runloop:
            worker_connections = self.establish_worker_connections()
            continue_working = self.poll_connections_until_stopped(
                worker_connections,
                continue_while_connections_alive,
                timeout=poll_timeout)
        for current_connection in worker_connections:
            current_connection.close()
        self.shutdown()
    def stopwork(self):
        self._exit_runloop = True
Use it just like GearmanWorker. When it's time to exit the script, call the stopwork() function. It won't stop immediately--it can take up to poll_timeout seconds before it kicks out of the run loop.
There may be multiple smart ways to invoke the stopwork() function. In my case, I create a temporary gearman client in the main thread. For the worker that I'm trying to shutdown, I send a special STOP command through the gearman server. When the worker gets this message, it knows to shut itself down.
Hope this helps!
http://phpscaling.com/2009/06/23/doing-the-work-elsewhere-sidebar-running-the-worker/
As the above article demonstrates, I've run a worker inside a Bash shell script, exiting occasionally between jobs to clean up (or reload the worker script). If a particular task is given to it, it can also exit with a specific exit code to shut down.
I use the following code, which supports both Ctrl-C and kill -TERM. By default, supervisor sends the TERM signal unless you have modified the stopsignal= setting. In PHP 5.3+ you can call pcntl_signal_dispatch() instead of relying on declare(ticks = 1).
$terminate = false;
pcntl_signal(SIGINT, function() use (&$terminate)
{
$terminate = true;
});
pcntl_signal(SIGTERM, function() use (&$terminate)
{
$terminate = true;
});
$worker = new GearmanWorker();
$worker->addOptions(GEARMAN_WORKER_NON_BLOCKING);
$worker->setTimeout(1000);
$worker->addServer('127.0.0.1', 4730);
$worker->addFunction('reverse', function(GearmanJob $job)
{
return strrev($job->workload());
});
$count = 500 + rand(0, 100); // rand to prevent multiple workers restarting at the same time
for($i = 0; $i < $count; $i++)
{
if ( $terminate )
{
break;
}
else
{
pcntl_signal_dispatch();
}
$worker->work();
if ( $terminate )
{
break;
}
else
{
pcntl_signal_dispatch();
}
if ( GEARMAN_SUCCESS == $worker->returnCode() )
{
continue;
}
if ( GEARMAN_IO_WAIT != $worker->returnCode() && GEARMAN_NO_JOBS != $worker->returnCode() )
{
$e = new ErrorException($worker->error(), $worker->returnCode());
// log exception
break;
}
$worker->wait();
}
$worker->unregisterAll();
This would fit nicely into your continuous integration system. I hope you have it or you should have it soon :-)
As you check in new code, it automatically gets built and deployed onto the server. As a part of the build script, you kill all workers, and launch new ones.
What I do is use gearadmin to check if there are any jobs running. I used the admin API to make a UI for this. When the workers are sitting idle, there is no harm in killing them.
