I am developing a website that requires a lot of background processes for the site to run: for example, a queue, a video encoder and a few other types of background processes. Currently I have each of these running as a PHP CLI script that contains:
while (true) {
    // some code
    sleep($someAmountOfSeconds);
}
OK, these work fine, but I was thinking of setting them up as daemons, which would give them an actual process ID that I can monitor; I could also run them in the background without keeping a terminal open all the time.
I would like to know if there is a better way of handling these. I was also thinking about cron jobs, but some of these processes need to loop every few seconds.
Any suggestions?
Creating a daemon which you can make calls to and ask questions of would seem the sensible option. It depends on whether your host permits such things; if you're requiring it to do work every few seconds, then an OS-based service/daemon definitely seems far more sensible than anything else.
You could create a daemon in PHP, but in my experience this is a lot of hard work and the result is unreliable due to PHP's memory management and error handling.
I had the same problem: I wanted to write my logic in PHP but have it daemonised by a stable program that could restart the PHP script if it failed, so I wrote The Fat Controller.
It's written in C, runs as a daemon and can run PHP scripts, or indeed anything. If the PHP script ends for whatever reason, The Fat Controller will restart it. This means you don't have to take care of daemonising or error recovery - it's all handled for you.
The Fat Controller can also do lots of other things such as parallel processing which is ideal for queue processing, you can read about some potential use cases here:
http://fat-controller.sourceforge.net/use-cases.html
I've done this for 5 years, using PHP to run background tasks, and it's no different to doing it in any other language. Just use cron and lock files. The lock file will prevent multiple instances of your script from running.
It's also important to monitor your code. One check I always do, to stop stale lock files from blocking scripts, is to have a second cron job that checks whether the lock file is older than a few minutes and whether an instance of the PHP script is actually running; if not, it removes the lock file.
Using this technique allows you to set your cron to run the script every minute without issues.
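A minimal sketch of the lock-file idea described above, folding the stale-lock check into the worker itself rather than into a second cron job (the path and the five-minute threshold are placeholders, not from the answer):

<?php
// worker.php - run from cron every minute; the lock file keeps runs from overlapping.
$lockFile = '/tmp/worker.lock';   // placeholder path

if (file_exists($lockFile)) {
    if (time() - filemtime($lockFile) < 300) {
        exit(0);                  // recent lock: another instance is (probably) still running
    }
    unlink($lockFile);            // stale lock: remove it and carry on
}

file_put_contents($lockFile, (string) getmypid());

// ... do the actual background work here ...

unlink($lockFile);

An alternative is flock() on the lock file: the operating system releases the lock automatically if the process dies, which avoids the stale-lock problem entirely.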
Use the System::Daemon module from PEAR.
One solution (which I really need to try myself, as I may need it) is to use cron, but get the process to loop for five minutes or so. Then, get cron to kick it off every five minutes. As one instance starts up, the previous one should be finishing (or close to finishing).
Bear in mind that the two may overlap a bit, and so you need to ensure that this doesn't cause a clash (e.g. writing to the same video file). Some simple inter-process communication may be useful, even if it is just writing to a PID file in the temp directory.
This approach is a bit low-tech but helps avoid PHP hanging onto memory over the longer term - sort of in-built task restarts!
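A rough sketch of that approach, assuming cron starts the script every five minutes, the POSIX extension is available, and a PID file in the temp directory is used to detect an overlapping run (all names and timings are made up):

<?php
// Five-minute worker started by cron; it exits shortly before the next cron run begins.
$pidFile = sys_get_temp_dir() . '/video-worker.pid';   // hypothetical name

// If the previous instance is still alive, let it finish instead of clashing with it.
if (file_exists($pidFile) && posix_kill((int) file_get_contents($pidFile), 0)) {
    exit(0);
}
file_put_contents($pidFile, (string) getmypid());

$deadline = time() + 290;            // a little under five minutes
while (time() < $deadline) {
    // ... process one unit of work (e.g. one video) ...
    sleep(5);
}

unlink($pidFile);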
Related
I have some limitations with my host and my scripts can't run longer than 2 or 3 seconds. But the time it will take to finish will certainly increase as the database gets larger.
So I thought about making the script stop what it is doing and call itself after 2 seconds, for example.
Firstly I tried using cURL, and then I made some attempts with wget. But there is always a problem with waiting for the response and timeouts (with cURL, for example, I just need to ping the script, not wait for a response), or with permissions on the server (the functions used to run wget, such as exec, seem to be disabled on my server, or something like that).
What do you think is the best idea to make a PHP script ping/call itself?
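For reference, one common way to make cURL "ping" without waiting is to give it a very short timeout and simply ignore the resulting timeout error; a minimal sketch (the URL is a placeholder):

<?php
// Fire-and-forget ping: allow cURL ~100 ms, then treat the inevitable timeout as success.
$ch = curl_init('https://example.com/path/to/this-script.php');   // placeholder URL
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_NOSIGNAL, true);    // needed for sub-second timeouts on many builds
curl_setopt($ch, CURLOPT_TIMEOUT_MS, 100);
curl_exec($ch);                              // times out almost immediately - that's intended
curl_close($ch);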
On Unix/Linux systems I would personally recommend scheduling cron jobs to keep running the scripts at certain intervals.
Maybe this SO link will help you.
PHP scripts generally don't call other PHP scripts. It is possible to spawn a background process as illustrated here, but I don't think that's what you're after. If so, you'd be better off using cron, as was discussed above.
Calling a function every X seconds from the same script is certainly possible, but this does the opposite of what you want, since it would only extend the run time of the script in question.
What you seem to be asking is, contrary to your comment, somewhat paradoxical. A process that calls method() every so often is still a long running process and is subject to the same restrictions as any other process on the server, regardless of the fact that it may be sitting idle for short intervals.
As far as I can see your options are:
Extend the PHP max_execution_time directive, or have your sysadmin do so if they are willing
Revise your script so that it completes within the time limit
Move to a new server
I've previously used Gearman along with supervisor to manage jobs.
In this case we are using Amazon SQS which I have spent some time trying to get my head around.
I have set up a separate micro instance from our main webserver to use as an image processing server (purely for testing at the moment; it will be upgraded and become part of a cluster before this implementation goes live).
On this micro instance I have installed PHP and ImageMagick in order to perform the image processing.
I have also written a worker script which receives the messages from Amazon SQS.
All works perfectly, however I need this script to run over and over again in order to continuously check for messages.
I don't like the thought of running a continuous loop so have started to look at other methods with little success.
So my question is what is generally considered the best practice way to do this?
I am worried about memory since PHP wasn't really designed for this, therefore it feels like running the script for a while, then stopping and restarting it might be my best bet.
I have experience using supervisor (to ensure that Gearman workers kept running) and am wondering if I could simply use that to continuously execute the simple PHP script over and over.
My thoughts are as follows:
Set up SQS long polling so that the script checks for 20 seconds.
Use a while loop with a 20 second sleep to keep this script running for say an hour at a time
Have all this run through supervisor. When the hour is up and the loop is complete, allow the script to exit.
Supervisor should then automatically restart it
Does this sound viable? Is there a better way? What is generally considered the best practice for receiving SQS messages in PHP?
Thanks in advance
In supervisord you can set autorestart to true to have it run your command over and over again. See: http://supervisord.org/configuration.html#program-x-section-settings
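As a rough sketch (the program name, command and log paths are placeholders, not taken from the question), a supervisord program section for such a worker might look like:

[program:sqs-worker]
; placeholder command - point this at the real worker script
command=/usr/bin/php /var/www/worker.php
autostart=true
; restart the worker every time it exits, whether cleanly or not
autorestart=true
redirect_stderr=true
stdout_logfile=/var/log/sqs-worker.log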
Overall, using an endless while loop is perfectly fine; PHP will free your objects correctly and keep memory in check if the script is written correctly. It can run for years without leaks (if there's a leak, you probably created it yourself, so review your code).
How do I stop a Supervisord process without killing the program it's controlling? might be of interest to you; the OP had a similar setup, with autorestart and wanted to add graceful shutdowns to it.
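To make the "run for an hour, then let supervisord restart it" idea concrete, here is a rough sketch of a time-boxed long-polling worker, assuming the AWS SDK for PHP is installed via Composer; the queue URL, region and one-hour limit are placeholders:

<?php
// Time-boxed SQS worker: long-poll for messages for roughly an hour, then exit and let
// supervisord start a fresh process.
require 'vendor/autoload.php';

use Aws\Sqs\SqsClient;

$sqs = new SqsClient([
    'region'  => 'us-east-1',     // placeholder region
    'version' => 'latest',        // credentials are assumed to come from the instance role/env
]);

$queueUrl = 'https://sqs.us-east-1.amazonaws.com/123456789012/image-jobs';   // placeholder
$deadline = time() + 3600;

while (time() < $deadline) {
    $result = $sqs->receiveMessage([
        'QueueUrl'            => $queueUrl,
        'WaitTimeSeconds'     => 20,    // SQS long polling: blocks up to 20 s, so no busy loop
        'MaxNumberOfMessages' => 1,
    ]);

    foreach ($result->get('Messages') ?: [] as $message) {
        // ... run the ImageMagick processing for the file referenced in $message['Body'] ...
        $sqs->deleteMessage([
            'QueueUrl'      => $queueUrl,
            'ReceiptHandle' => $message['ReceiptHandle'],
        ]);
    }
}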
I'm looking for some ideas to do the following. I need a PHP script to perform a certain action for quite a long time. This is an extension for a CMS and it can't be anything else but PHP. It also can't be a command-line script, because it should be usable by ordinary people who have only the standard means of the CMS. One of the options is having a cron job (most simple hostings have one) that triggers the script often, so that instead of working for a long time it can perform the action step by step, preserving its state from one launch to the next. This is not perfect, but I can't see any other solution. If the script keeps redirecting to itself, the server will interrupt it. What other options could work?
Thanks everyone in advance!
What you're talking about is a daemon or long-running program that waits for calls from client programs, performs an action, provides a response, then keeps on waiting for more calls.
You might be familiar with these in the form of Apache & MySQL ;) Anyway, PHP is generally OK in this regard: it has the ability to work over raw sockets as well as fork sub-processes to handle multiple requests simultaneously.
Having said that, PHP daemons are a tool where YMMV. Some folks will say they work great; other folks, like me, will say they have issues with interprocess communication and leaking memory even amidst a plethora of unset() calls.
Anyway, you likely won't be able to deploy a daemon of any type in a shared hosting environment. You'll need to get a better server package or stick with a cron-based solution.
Here's a link about writing a PHP daemon.
Also, one more note: daemons do crash from time to time, so you may still need to store state about what's going on, just in case someone trips over the power cord to your shared server :)
I would also suggest that you think about making it a daemon, but if not, then you can simply use
set_time_limit(0);
ignore_user_abort(true);
at the top to tell it not to time out and not to get interrupted by anything. Then call it from cron to start it every day or whatever. I have this on many long daily processing tasks and it works great for me. However, it won't be able to talk easily to the outside world (other scripts can't query it or anything; if that is what you want, look into PHP services), so once you get it running, make sure it will stop, and have it print its progress to a log file.
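A small sketch of what such a cron-started task might look like with progress logging (the log path and the work list are placeholders):

<?php
// Long-running daily task kicked off by cron.
set_time_limit(0);          // never time out
ignore_user_abort(true);    // keep running even if whatever started us disconnects

$log  = fopen('/var/log/daily-task.log', 'a');
$work = range(1, 1000);     // stand-in for the real list of items to process

foreach ($work as $i => $item) {
    // ... process $item ...
    fwrite($log, date('c') . " finished item $i\n");
}

fwrite($log, date('c') . " all done\n");
fclose($log);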
I am looking for the PHP equivalent for VB doevents.
I have written a realtime analysis package in VB and used doevents to release to the operating system.
Doevents allows me to stay in memory and run continuously without filling up memory and allows me to respond to user input.
I have rewritten the package in PHP and I am looking for that same doevents feature.
If it doesn't exist I could reschedule myself and exit.
But I currently don't know how to do that and I think that would add a lot more overhead.
Thank you, gerardg
usleep is what you are looking for. It delays program execution for the given number of microseconds.
http://php.net/manual/en/function.usleep.php
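For example, a polling loop that pauses for a quarter of a second between passes might look like this (just a sketch, not the original VB logic):

<?php
while (true) {
    // ... read and analyse whatever new data has arrived ...
    usleep(250000);   // 250,000 microseconds = 0.25 seconds
}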
It's been almost 10 years since I last wrote anything in VB, and as I recall, the DoEvents() function allowed the application to yield to the processor during intensive processing (usually to allow other system events to fire - the most common being WM_PAINT - so that your UI won't appear hung).
I don't think PHP has such functionality - your script will run as a single process and end (either when it's done or when it hits the default 30 second timeout).
If you are thinking in terms of threads (as most Windows programmers tend to do) and needing to spawn more than one instance of your script, perhaps you should take a look at PHP's Process Control functions as a start.
I'm not entirely sure which aspects of doevents you're looking to emulate, so here's pretty much everything that could be useful for you.
You can use ob_implicit_flush(true) at the top of your script to enable implicit output buffer flushing. That means that whenever your script calls echo or print or whatever you use to display stuff, PHP will automatically send it all to the user's browser. You could also just use ob_flush() after each call to display something, which acts more like Application.DoEvents() in VB with regards to keeping your UI active, but must be called each time something is output.
Naturally if your script uses the output buffer already, you could build a copy of the buffer before flushing, with ob_get_contents().
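A small sketch of the flushing idea (the loop body is only an illustration):

<?php
// Show progress in the browser as it happens rather than all at the end.
while (ob_get_level() > 0) {
    ob_end_flush();           // drop any existing output buffers first
}
ob_implicit_flush(true);      // flush automatically after every echo/print

for ($i = 1; $i <= 10; $i++) {
    echo "step $i done<br>\n";
    sleep(1);                 // stand-in for real work
}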
If you need to allow the script to run for more time than usual, you can set a longer timeout with set_time_limit($time). If you need more memory, and you have access to edit your .htaccess file, add the following line and edit the value:
php_value memory_limit 64M
That sets the memory limit to 64 megabytes.
For running multiple scripts at once, you can use pcntl_exec to start another one running.
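Note that pcntl_exec() replaces the current process, so to launch a second script while the first keeps running you would normally pcntl_fork() first; a minimal sketch (the script path is a placeholder, and the pcntl extension must be enabled):

<?php
$pid = pcntl_fork();
if ($pid === 0) {
    // Child: replace this process image with the other script.
    pcntl_exec('/usr/bin/php', ['/var/www/other-script.php']);
    exit(1);   // only reached if pcntl_exec() fails
}
// Parent: carries on, and can reap the child later with pcntl_waitpid().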
If I am missing something important about DoEvents(), let me know and I will try to help you make it work.
PHP is designed for asynchronous, on-demand processing. However, it can be forced to become a background task with a little hackery.
As PHP runs as a single thread, you do not have to worry about letting the CPU do other things, as that is already taken care of. If this were not the case, then a web server would only be able to serve up one page at a time and all other requests would have to sit in a queue. You will need to write some sort of loop that never exits until some detectable condition happens (like the "now please exit" message you set in the DB or something).
As pointed out by others, you will need to call set_time_limit($something); perhaps with usleep() stopping the code from running "too fast" if it eats a lot of CPU each loop. However, if you are also using a database connection, most of your script time is actually spent waiting for the database (by far the biggest overhead for a script).
I have seen PHP worker processes created by using screen and detaching it as a background task. Other approaches also work, so long as you do not have a session that will time out or exit (say, when the web browser is closed). A cron job that checks every x minutes or hours whether the script is still running gives you automatic recovery from forced exits and/or system restarts.
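As a sketch of the "detectable exit condition" idea mentioned above (the flag file name is made up):

<?php
// Loop until something external asks us to stop.
$stopFlag = '/tmp/worker.stop';   // touch this file to request a graceful exit

while (!file_exists($stopFlag)) {
    // ... pull one job from the queue/database and process it ...
    usleep(500000);               // half a second between passes so we don't spin the CPU
}

unlink($stopFlag);                // clean up so the next start behaves normally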
TL;DR: doevents is "baked in" to PHP and you don't have to worry about it.
See also Having a PHP script loop forever doing computing jobs from a queue system, but that doesn't answer all my questions.
If I want to run a PHP script forever, accessing a queue and doing jobs:
What is the potential for memory problems? How to avoid them? (any flush functions or something I should use?)
What if the script dies for some reason? What would be a good method to automatically start it up again?
What would be the best basic approach to start the script? Since it runs forever, I don't need cron. But how do I start it up? (See also 2.)
Set the queue up as a cron script. Have it execute every 10 seconds. When the script fires up, check if there's a lock file present (something like .lock). If there is, exit immediately. If not, create the .lock and start processing. If any errors occur, email/log these errors, delete .lock and exit. If there are no tasks, then exit.
I think this approach is ideal, since PHP isn't really designed to run a script for extended periods of time like you're asking. To avoid potential memory leaks, crashes etc., repeatedly executing the script in short runs is a better approach.
While PHP can access (publish and consume) MQs, if at all possible try to use a fully functional MQ application to do this.
A fully functional MQ application (in ruby, perl, .NET, java etc) will handle all of the concurrency, error logging, state management and scalability issues that you discuss.
Without going too far into state machines, it's at least a good idea to introduce states both for 'jobs' (example: flv2avi conversion) and for 'tasks' (flv2avi 1.flv).
In my script (Perl), zombie processes sometimes start to degrade the whole script's performance. It is a rare case, but it originates in the source itself, so the script should be able to stop reading from the queue and allow a new instance to continue its tasks and jobs; keeping as much of the running tasks' data as possible is welcome, though. Once the first instance is down to 1-2 tasks, it gets killed.
On start :
check for common errors (due to shutdown)
check for known errors (out of space, can't read input)
kill whatever may be killed and set status to 'waiting'
start all waiting.
If you run piped jobs (vlc | ffmpeg, tail -f | grep), you can try to avoid doing too much I/O in your own program, instead doing a fork() (a bad idea for PHP?) or just calling /bin/bash -c "prog1 | prog2"; this saves a lot of CPU load.
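In PHP, handing the piping off to the shell might look like this (the pipeline itself is just a placeholder):

<?php
// Let the shell wire the programs together so PHP never shuttles the data itself.
$pipeline = 'zcat /var/log/jobs.log.gz | grep -c ERROR';   // placeholder pipeline
exec('/bin/bash -c ' . escapeshellarg($pipeline), $output, $exitCode);
echo ($output[0] ?? '0') . "\n";                           // first line of the pipeline's output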
Start points: both /etc/rc.d and cron (check processes; run the first instance || run a second with a 'debug' argument).