How do Heroku workers work? - PHP

Heroku supports multiple types of dyno configurations and allows us to set both a web and a worker process type for an app, for example like so:
web: vendor/bin/heroku-php-apache2 web/
worker: php worker/myworker.php
The web dyno handles web traffic, while the worker dyno type can be used for background jobs (e.g. to process a queue).
The docs didn't make it clear to me how these workers function, i.e. how they start and behave.
In particular:
Are they run just once at deployment? Or are they run repeatedly (if so when)?
Do they have a maximum execution time?
Is it okay to use an endless loop inside of them? Or should I trigger them somehow?
If I go for a simple hello world:
myworker.php
<?php
error_log( "Hello world. This is the worker talking." );
?>
Then (after a deploy or restart) heroku logs --tail --ps worker shows me only this:
2017-08-31T17:46:55.353948+00:00 heroku[worker.1]: State changed from crashed to starting
2017-08-31T17:46:57.834203+00:00 heroku[worker.1]: Starting process with command `php worker/myworker.php`
2017-08-31T17:46:58.452281+00:00 heroku[worker.1]: State changed from starting to up
2017-08-31T17:46:59.782480+00:00 heroku[worker.1]: State changed from up to crashed
2017-08-31T17:46:59.773468+00:00 heroku[worker.1]: Process exited with status 0
2017-08-31T17:46:59.697976+00:00 app[worker.1]: Hello world. This is the worker talking.
Is this the expected behavior?
(I'm not very used to using PHP from the command line, which is what workers seem to be doing, and that might explain some of my confusion.)
Background: I'm trying to understand how they work for my own sake, but also to help me decide whether I should use a worker or a clock for a homemade mailing/newsletter system I'm adapting to Heroku.

You have probably already found out, but here is my experience with it for anyone looking for the answer:
Are they run just once at deployment? Or are they run repeatedly (if so when)?
They will run once deployed and keep running unless an exit command is issued or the script exits with an error. In my experience, when an error occurs, the process is interrupted immediately and restarted after some delay (which seems random). I'm not sure whether it restarts after a clean exit.
Do they have a maximum execution time?
No. But keep in mind that all dynos are restarted once per day.
Is it okay to use an endless loop inside of them? Or should I trigger them somehow?
Yes. Most of my workers use an endless loop with a sleep, followed by a call to the main script function, as sketched below.
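For illustration, a minimal worker along those lines might look like this (process_queue() is a hypothetical stand-in for your own job logic):
<?php
// worker/myworker.php - runs as the `worker` dyno process type.
// The endless loop keeps the process alive; if the script ever exits
// (even with status 0, as in the logs above), Heroku marks the dyno
// as crashed and restarts it after a delay.
while ( true ) {
    process_queue(); // hypothetical: do one batch of background work
    sleep( 10 );     // avoid busy-waiting between checks
}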

Related

How to reliably start and stop a long running process (think days) that outlives the script that started it?

I have a CLI php app with a command that starts a background process (a Selenium server) and then exits:
php app.php server:start # this should return immediately
The app also needs to be able to stop the background process in a later invocation:
php app.php server:stop
Since the server process outlives the start/stop script, I cannot manage it by keeping a file descriptor to it open.
I could store the PID of the process on the file system during start and kill that PID in stop. But if the stop command is run after the background process has died on its own, I risk killing a process that I did not start, because the PID might have been reused by the OS for some other process.
Right now my approach is to store not just the PID of the background process, but also its start time and the command used. That works, but it is hard to make it behave consistently across different platforms (I need Linux, Mac, and Windows). A Linux-only sketch of that stop-side check follows.
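Roughly, the check can look like this (Linux-only, reading the start time from /proc; the state-file path and its contents are made up for illustration):
<?php
// server_stop.php - only kill the stored PID if it is still our process.
// The start command is assumed to have saved the PID and the process
// start time (field 22 of /proc/<pid>/stat) into this state file.
$state = json_decode( file_get_contents( '/tmp/selenium-server.json' ), true );
$pid   = $state['pid'];

$stat = @file_get_contents( "/proc/$pid/stat" );
if ( $stat !== false ) {
    // The comm field may contain spaces, so parse after the closing ')'.
    $fields    = explode( ' ', substr( $stat, strrpos( $stat, ')' ) + 2 ) );
    $startTime = (int) $fields[19]; // field 22 overall = starttime in clock ticks
    if ( $startTime === $state['starttime'] ) {
        posix_kill( $pid, 15 ); // SIGTERM: same PID and same start time, safe to kill
    }
}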
Can you think of a better way to implement such a behaviour?

How can I keep an Amazon SQS PHP receiver script running forever?

I've previously used Gearman along with supervisor to manage jobs.
In this case we are using Amazon SQS which I have spent some time trying to get my head around.
I have set up a micro instance, separate from our main web server, to use as an image processing server (purely for testing at the moment; it will be upgraded and become part of a cluster before this implementation goes live).
On this micro instance I have installed PHP and ImageMagick in order to perform the image processing.
I have also written a worker script which receives the messages from Amazon SQS.
All works perfectly; however, I need this script to run over and over again in order to continuously check for messages.
I don't like the thought of running a continuous loop, so I have started to look at other methods, with little success.
So my question is what is generally considered the best practice way to do this?
I am worried about memory, since PHP wasn't really designed for long-running processes, so it feels like running the script for a while, then stopping and restarting it, might be my best bet.
I have experience using supervisor (to ensure that Gearman workers kept running) and am wondering if I could simply use that to execute the simple PHP script over and over?
My thoughts are as follows:
Set up SQS long polling so that each receive call waits up to 20 seconds.
Use a while loop with a 20 second sleep to keep this script running for, say, an hour at a time.
Have all of this run through supervisor. When the hour is up and the loop is complete, allow the script to exit.
Supervisor should then automatically restart it.
Does this sound viable? Is there a better way? What is generally considered the best practice for receiving SQS messages in PHP?
Thanks in advance
In supervisord you can set autorestart to true to have it run your command over and over again. See: http://supervisord.org/configuration.html#program-x-section-settings
Overall, using an endless while loop is perfectly fine; PHP will free your objects correctly and keep memory in check if the code is written correctly. It can run for years without leaks (if there is a leak, you probably created it yourself, so review your code).
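Putting the pieces from the question together, a time-bounded long-polling worker could look roughly like this (AWS SDK for PHP v3; the queue URL and process_image() are placeholders):
<?php
require 'vendor/autoload.php';

use Aws\Sqs\SqsClient;

$sqs      = new SqsClient( [ 'region' => 'us-east-1', 'version' => '2012-11-05' ] );
$queueUrl = 'https://sqs.us-east-1.amazonaws.com/123456789012/images'; // placeholder
$deadline = time() + 3600; // run for about an hour, then exit cleanly

while ( time() < $deadline ) {
    // Long polling: this call blocks for up to 20 seconds waiting for messages.
    $result = $sqs->receiveMessage( [
        'QueueUrl'        => $queueUrl,
        'WaitTimeSeconds' => 20,
    ] );
    foreach ( (array) $result->get( 'Messages' ) as $message ) {
        process_image( $message['Body'] ); // hypothetical image-processing step
        $sqs->deleteMessage( [
            'QueueUrl'      => $queueUrl,
            'ReceiptHandle' => $message['ReceiptHandle'],
        ] );
    }
}
// Exiting here lets supervisord (autorestart=true) start a fresh process.
Note that with long polling the extra 20 second sleep from the question is unnecessary; receiveMessage itself does the waiting.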
How do I stop a Supervisord process without killing the program it's controlling? might be of interest to you; the OP had a similar setup, with autorestart and wanted to add graceful shutdowns to it.

PHP, re-run the code when finished

I'm developing a PHP service which performs numerous operations per customer, and I want it to run continuously. I've already taken a look at cron, but as far as I understood, cron runs the code at set times. That can be a bit dangerous, since we depend on the previous run having finished before the next one starts, and the time each run takes may vary as the customer base increases. So refresh, cron, or other timed intervals can't be used, as far as I'm aware.
So I'm wondering if you know any solutions where I can restart my service when it is finished, and under no circumstances start a re-run before all the code has finished executing?
I'm sorry if this has been answered before or is easily found on Google; I have tried to find something, but to no avail.
Edit: I could set timed intervals to be 1 hour, to be absolutely sure, but I want as little time as possible between each run.
Look at this:
http://www.godlikemouse.com/2011/03/31/php-daemons-tutorial/
What you need is a daemon that keeps running. There are more solutions than the while loop shown there.
The following is one I once used in a project: http://kvz.io/blog/2009/01/09/create-daemons-in-php/ ; it is also available as a PEAR package: http://pear.php.net/package/System_Daemon
For more information, see the following SO links:
What is a daemon: What is daemon? Their practical use? Usage with php?
How to use: PHP script that works forever :)
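If you'd rather see the core idea without a library, the classic double-fork daemonization looks something like this in PHP (CLI only; requires the pcntl and posix extensions; run_service_pass() is a hypothetical stand-in for one full pass over all customers):
<?php
$pid = pcntl_fork();
if ( $pid < 0 ) exit( 1 );          // fork failed
if ( $pid > 0 ) exit( 0 );          // parent exits; the child carries on
posix_setsid();                     // start a new session, detach from the terminal
if ( pcntl_fork() > 0 ) exit( 0 );  // second fork: the daemon can never reacquire a terminal

chdir( '/' );
fclose( STDIN );
fclose( STDOUT );
fclose( STDERR );

while ( true ) {
    run_service_pass(); // hypothetical: the next run starts only after this returns
    sleep( 1 );
}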
Have you tried running the PHP script as a process? This article has more details: http://nsaunders.wordpress.com/2007/01/12/running-a-background-process-in-php/
If you do not want to learn how to code a daemon, I recommend using software that manages processes in userland: Supervisor (http://supervisord.org/)
You just need to write a configuration file to specify which processes you want to run, and how.
It is extremely simple to configure and very adaptable (you can force a single instance of your process, or instead run a fixed number of instances, etc.).
It will also handle automatic restart in case your script crashes, and logging.
On the PHP side, just create a script that never quits, using a while(true) { ... } loop, and add an entry like this to supervisord's configuration:
[program:your-script]
command=/usr/bin/php /path/to/your_script.php
I'm using this software in production for a few projects (to run Ruby and PHP Gearman asynchronous workers for websites).
Try custom logic where you can set a flag ON and OFF, and have your cron job check the flag before running the code inside it. I wanted to suggest something like a queue-based solution: once you get an entry, run your processing logic, which can live in either a daemon or a cron job. It gives you more control over whether your task is OK to execute right now.

PHP on Apache on Linux: when web app starts processes, is it possible to keep those processes alive if Apache gets restarted?

We have a web app which allows us to monitor and control our server applications. The web pages start applications by executing a shell script. The problem we have run into is that if we need to restart Apache, it kills any of the processes that were started by the web app.
The web pages are PHP and use the exec() command to call the start scripts. The start scripts launch Java apps and run them with something like this:
nohup java ... &
As mentioned, PHP is running in Apache on Linux. Is there some switch or other way to start these processes so that they are not child processes of Apache (and killed when it stops)?
CLARIFICATION
I am more familiar with Windows than with Linux. In Windows, to accomplish what we are trying to do, you add the start keyword in the shell, i.e.:
start <batchfile>
When you use start, the new shell/process can be unhooked from the one that started it. Is there a Linux equivalent to the start command?
Starting long-lasting processes from PHP sounds like asking for big trouble.
You will have problems like yours, and you will have huge security implications.
A much better solution is to have your PHP pages record their intent that something needs to be run in batch mode in a database table (MySQL or PostgreSQL).
Another process (probably running under more privileged credentials than the Apache www user) should run as a daemon, constantly check the database for new things to do, and execute the necessary tasks (it could also be fired by cron every few minutes).
This way, you will kill two birds with one stone.
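As a rough illustration of that pattern (the table layout, credentials, and column names are invented), the daemon side might look like this, with the web page merely inserting a 'pending' row:
<?php
// task_daemon.php - runs under its own user, not under Apache, so the
// processes it starts survive an Apache restart.
$db = new PDO( 'mysql:host=localhost;dbname=ops', 'daemon', 'secret' ); // placeholder credentials

while ( true ) {
    $row = $db->query( "SELECT id, command FROM tasks WHERE status = 'pending' LIMIT 1" )
              ->fetch( PDO::FETCH_ASSOC );
    if ( $row ) {
        $db->exec( "UPDATE tasks SET status = 'running' WHERE id = {$row['id']}" );
        // Commands come from a trusted table written only by the web app;
        // a long-running server should background itself (e.g. nohup ... &).
        exec( $row['command'] );
        $db->exec( "UPDATE tasks SET status = 'done' WHERE id = {$row['id']}" );
    } else {
        sleep( 5 ); // nothing pending; poll again shortly
    }
}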
I wrote up how to create long running processes with PHP on my blog; however, I have to agree with mvp that this approach is far from ideal for your purposes, and not just from the point of view of privilege separation (a setuid program or sudo solves that easily enough).
Depending on what you're trying to achieve here, I suspect that the additional functionality in DJ Bernstein's daemontools will be a better fit.
You could use batch(1) to start your long-lasting server processes.
In a shell, you could do
batch << END
java -jar yourjava.jar
END
If you have a batch shell script file, start it with
batch -f yourbatchfile
If you can improve the Java programs, you might have them call daemon(3) at their start time, perhaps using the daemon support from Apache (Commons Daemon).
You probably want to store the daemons' process PIDs somewhere (e.g. in a file or database) to be able to stop them (first with kill -TERM, then with kill -QUIT, and as a last resort kill -KILL).
Using the daemon function or the Java equivalent is probably better than using batch.

Constantly Running Gearman Worker

I have a process I'd like to be able to run in the background by starting up a Gearman client at any time.
I've found success by opening two SSH connections to my server, starting the worker in one and then running the client in the other. This produces the desired output.
The problem is that I'd like to have a worker constantly running in the background so I can just call up a client whenever I need the process done. But as soon as I close the terminal in which the worker PHP file is running, a call to the client stops working; the worker seems to die.
Is there a way to have the worker run constantly in the background, so calling a new client will work without having to start up a new worker?
Thanks!
If you want a program to keep running even after its parent is dead (i.e. you've closed your terminal), you must invoke it with nohup:
nohup your-command &
Quoting the relevant Wikipedia page I linked to:
nohup is a POSIX command to ignore the HUP (hangup) signal, enabling the command to keep running after the user who issued the command has logged out. The HUP (hangup) signal is by convention the way a terminal warns dependent processes of logout.
For another (possibly more interesting) solution, see the following article: Dæmonize Your PHP.
It points to Supervisord, which makes sure a process is still running, relaunching it if necessary.
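For the worker in the question, that would be something like the following (filename illustrative; output is redirected so it doesn't accumulate in nohup.out):
nohup php worker.php > worker.log 2>&1 &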
Is there a way to have the worker run constantly in the background, so calling a new client will work without having to start up a new worker?
Supervisor!
The 2009 PHP Advent Calendar has a quick article on using Supervisor (and other tricks) to create constantly-running PHP scripts without having to deal with the daemonization process in PHP itself.
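For example, a minimal supervisord entry for such a worker might look like this (program name and path are illustrative):
[program:gearman-worker]
command=/usr/bin/php /path/to/worker.php
autostart=true
autorestart=true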
