Running Cron jobs in parallel (PHP) - php

In the past, I ran a bunch of scripts each as a separate cron job. Now I'd like to run a controller script with one cron job, then have that call the scripts separately (and in parallel, all at the same time), so I don't have to create a new cron job every time I add another script.
I looked up pcntl_fork() but we don't have that installed. Can fsockopen() do this as well?
A few questions:
I saw this example, http://phplens.com/phpeverywhere/?q=node/view/254, that uses fsockopen(). Will this allow me to run PHP scripts in parallel? Note, the scripts don't interact, but I would still like to know if any of them exited prematurely with an error.
Secondly the scripts I'm running aren't externally accessible, they are internal only. The script was previously run like so: php -f /path/to/my/script1.php. It's not a web-accessible path. Would the example in #1 work with this, or only web-accessible paths?.
Thanks for any advice you can offer.

You can use proc_open to run multiple processes without waiting for each process to finish.
You will have a process handle, you can terminate each process at any time and you can read the standard output of each process.
You can also communicate via pipes, which is optional.
Passing 1st param php /your/path/to/script.php param1 "param2 x" means starting a separate PHP process.
proc_open (see Example #1)
Ultimately you will want to use an infinite while loop + usleep (or sleep) to avoid maxing out on the CPU. Break when all processes finish, or after you killed them.
Edit: you can know if a process has exited prematurely.
Edit2: a simpler way of doing the above is popen

Please correct me if I'm wrong, but if I understand things correctly, the solution Tiberiu-Ionut Stan proposed implies that starting the processes with proc_open and waiting for them to finish will not be run as a cron script, but is part of a running program/service, right?
As far as I understand the cron jobs, the controller script user920050 was thinking of using would be started by cron on a schedule and each new instance would launch the processes all over again, do the waiting for them to finish and probably run in parallel with other cron-launched instances of the controller script.

Related

How to make php work forever without cron?

Is there a way to make php work forever without cron.
What I want it for is to unban users after a few hours by running a mysql query, thanks
If you don't have access to cron jobs on your server (I guess you are running on a shared hosting?), the best alternative is to run an "external cron". Have a look at www.setcronjob.com. I have been using this for a couple of months now and it is pretty stable.
You can set it up such that it calls a script on your website every whenever you want. (Example: http://www.yoursite.com/script.xxx)
In the script, you can run a MySQL query to check which users have been banned for a couple of hours and then unban them.
You can start your script from the command line and let it run in the background. You will have to design this script in such a way that it never exits and just loops forever using the sleep() function to avoid unnecessary processor load. Since php scripts invoked from the command line have no max execution time the script will run until you manually kill it off with the kill command.
Once you've written the script you can start it with:
nohup php myscript.php &
nohup makes the script still run once you log out of the console session that you started it from, otherwise it would kill off then. The & symbol at the end starts the script as a new process in the background so that you can continue using the console.

Running a PHP process as a daemon while safely killing it from background

We are running a PHP Daemon which look into a queue, receives worker jobs and spawns the worker to handle it. The workers themselves acquire a lock on a specific location before proceeding.
We spawn the Daemon as nohup background processes.
This entire architecture seems to work, except when we have to kill the processes, for whatever reason. If we kill them using -9, there is no way to trap it in the worker process and release the locks before dying.
If we use anything less than -9 (like TERM or HUP), it doesn't seem to be received by either the daemon or the worker processes.
Has anybody solved this problem in a better way?
(ps: BTW, Due to other considerations, we may not be able to change our language of implementation, so please only consider PHP based solutions)
I had related problems once too. Let me explain. I had a php 'daemon' that worked like a downloader. It accessed feeds periodically and downloads (laaaarge) content from the net. The daemon had to be stopped at a certain time, lets say 0500 in the morning to prevent it from using the whole bandwith during daytime. I decided to use a cronjob to send SIGTERM to the daemon at 0500.
In the daemon I had the following code:
pcntl_signal(SIGTERM, array($this, 'signal_handler'));
where signal_handler looked like this:
public function signal_handler($signal) {
// some cleanup code
exit(1);
}
Unfortunately this did not work :|
It took me a time to find out what's going on. The first thing I figured out was that I'll have to call the method pcntl_signal_dispatch() on init to enable signal dispatching at all. Quote from the doc (comments):
If you are running PHP as CLI and as a "daemon" (i.e. in a loop), this function must be called in each loop to check if new signals are waiting dispatching.
Ok, so far, it seemed working. But I realized quickly that under certain conditions even this will not work as expected. Sometimes the daemon could only being stopped by kill -9 - as before. :|
So what's the problem?.. Answer: My program called wget to download the files via shell_exec. The problem is, that shell_exec() blocking waits until the child process has terminated. During this blocking wait no signal processing is done, the process can only being terminated using SIGKILL - what is hard. Also a problem was that child processes had to be terminated one by one as they became zombie processes after killing the father.
My solution to this was to execute the child process using proc_open() and the use stream_select() on it's output for non blocking IO.
Now it works like a charm. :) If you need further information don't hesitate to drop a comment.
Note If you are working with PHP < 5.3 then you'll have to use `
declare(ticks=1);
instead of pcntl_signal_dispatch(). You can rfer to the the documentation of pcntl_signal() for that. But if possible you should upgrade to PHP >= 5.3
The problem was solved just by adding ticks:
// tick use required as of PHP 4.3.0
declare(ticks = 1);
Leaving this alone was causing my code not to work.
*(It's unfortunate that the documentation of pcntl_signal doesn't mention it in a lot more attention grabbing way.)*
You need to catch the signal (SIGTERM). This can be achieved via the function pcntl_signal. This will give you the option to perform any necessary functions before calling exit.

How to set up Beanstalkd with PHP

Recently I've been researching the use of Beanstalkd with PHP. I've learned quite a bit but have a few questions about the setup on a server, etc.
Here is how I see it working:
I install Beanstalkd and any dependencies (such as libevent) on my Ubuntu server. I then start the Beanstalkd daemon (which should basically run at all times).
Somewhere in my website (such as when a user performs some actions, etc) tasks get added to various tubes within the Beanstalkd queue.
I have a bash script (such as the following one) that is run as a deamon that basically executes a PHP script.
#!/bin/sh
php worker.php
4) The worker script would have something like this to execute the queued up tasks:
while(1) {
$job = $this->pheanstalk->watch('test')->ignore('default')->reserve();
$job_encoded = json_decode($job->getData(), false);
$done_jobs[] = $job_encoded;
$this->log('job:'.print_r($job_encoded, 1));
$this->pheanstalk->delete($job);
}
Now here are my questions based on the above setup (which correct me if I'm wrong about that):
Say I have the task of importing an RSS feed into a database or something. If 10 users do this at once, they'll all be queued up in the "test" tube. However, they'd then only be executed one at a time. Would it be better to have 10 different tubes all executing at the same time?
If I do need more tubes, does that then also mean that i'd need 10 worker scripts? One for each tube all running concurrently with basically the same code except for the string literal in the watch() function.
If I run that script as a daemon, how does that work? Will it constantly be executing the worker.php script? That script loops until the queue is empty theoretically, so shouldn't it only be kicked off once? How does the daemon decide how often to execute worker.php? Is that just a setting?
Thanks!
If the worker isn't taking too long to fetch the feed, it will be fine. You can run multiple workers if required to process more than one at a time. I've got a system (currently using Amazon SQS, but I've done similar with BeanstalkD before), with up to 200 (or more) workers pulling from the queue.
A single worker script (the same script running multiple times) should be fine - the script can watch multiple tubes at the same time, and the first one available will be reserved. You can also use the job-stat command to see where a particular $job came from (which tube), or put some meta-information into the message if you need to tell each type from another.
A good example of running a worker is described here. I've also added supervisord (also, a useful post to get started) to easily start and keep running a number of workers per machine (I run shell scripts, as in the first link). I would limit the number of times it loops, and also put a number into the reserve() to have it wait for a few seconds, or more, for the next job the become available without spinning out of control in a tight loop that does not pause at all - even if there was nothing to do.
Addendum:
The shell script would be run as many times as you need. (the link show how to have it re-run as required with exec $#). Whenever the php script exits, it re-runs the PHP.
Apparently there's a Djanjo app to show some stats, but it's trivial enough to connect to the daemon, get a list of tubes, and then get the stats for each tube - or just counts.

What do I use when a cron job isn't enough? (php)

I'm trying to figure out the most efficient way to running a pretty hefty PHP task thousands of times a day. It needs to make an IMAP connection to Gmail, loop over the emails, save this info to the database and save images locally.
Running this task every so often using a cron isn't that big of a deal, but I need to run it every minute and I know eventually the crons will start running on top of each other and cause memory issues.
What is the next step up when you need to efficiently run a task multiple times a minute? I've been reading about beanstalk & pheanstalk and I'm not entirely sure if that will do what I need. Thoughts???
I'm not a PHP guy but ... what prevents you from running your script as a daemon? I've written many a perl script that does just that.
Either create a locking mechanism so the scripts won't overlap. This is quite simple as scripts only run every minute, a simple .lock file would suffice:
<?php
if (file_exists("foo.lock")) exit(0);
file_put_contents("foo.lock", getmypid());
do_stuff_here();
unlink("foo.lock");
?>
This will make sure scripts don't run in parallel, you just have to make sure the .lock file is deleted when the program exits, so you should have a single point of exit (except for the exit at the beginning).
A good alternative - as Brian Roach suggested - is a dedicated server process that runs all the time and keeps the connection to the IMAP server up. This reduces overhead a lot and is not much harder than writing a normal php script:
<?php
connect();
while (is_world_not_invaded_by_aliens())
{
get_mails();
get_images();
sleep(time_to_next_check());
}
disconnect();
?>
I've got a number of scripts like these, where I don't want to run them from cron in case they stack-up.
#!/bin/sh
php -f fetchFromImap.php
sleep 60
exec $0
The exec $0 part starts the script running again, replacing itself in memory, so it will run forever without issues. Any memory the PHP script uses is cleaned up whenever it exits, so that's not a problem either.
A simple line will start it, and put it into the background:
cd /x/y/z ; nohup ./loopToFetchMail.sh &
or it can be similarly started when the machine starts with various means (such as Cron's '#reboot ....')
fcron http://fcron.free.fr/ will not start new job if old one is still running, Your could use # 1 command and not worry about race conditions.

What options are there for executing a PHP script at a certain time every day?

I have a PHP script that needs to be run at certain times every weekday. Is cron or Windows task scheduler the only way to do this?
Is there a way to set this up from within another PHP script?
Depends how exact the timing needs to be. A technique I've seen used (dubbed "poor man's cron") is to have a frequently accessed script (a site's home page, for example) fire off a task on the first page load after a certain time (kept track of in the database).
If you need any sort of guaranteed accuracy, though, cron or a Windows scheduled task is really the best option. If your host doesn't support them, it's time to get a better one.
Apart from cron or Scheduled Tasks, you can have your PHP script to always run it. The process should sleep (within a loop) until the time has reached. Then, the process could execute the remaining functions/methods. Or, you can set up another script (which will act as a daemon) to check for the time and execute the other script.
Well since the web is a pull mechanism you have to have some sort of action that will trigger a PHP script to execute. cron is an option on *nix and task scheduler on windows. You could also write your own service that has a timer but only if needed, this is common on windows services for updaters, jobs etc.
One way you could do it is in the cron task just call a php script for each action needed. Or one php script that executes other tasks. The problem with web based tasks though such as PHP is timeouts. Make sure your tasks are under 60-90 seconds. If not you might look at using python , perl or ruby or even bash scripts to do the work rather than the PHP script.
cron seems like the best option for you though. You will have to call your script with wget. There are examples here: http://www.thesitewizard.com/general/set-cron-job.shtml
For instance this runs the script everyday at 11:
30 11 * * * /usr/bin/wget http://www.example.com/cron.php
Cron, of course, is by far the best way to schedule anything on *nix.
If this is in a remote server you do not have cron access to, you can setup cron/windows scheduler on your computer, to open a web browser to the page that contains the script you wish to run
You probably want to use cron (or windows scheduled tasks).
If you really wanted, you could set up another php script to run continuously with an infinite loop (with a sleep command inside the loop, say for 30 seconds or so) and then when you reach your desired day/time execute the other script via a shell command call. While possible, I can't think if a single good reason to use this method rather than cron/scheduled tasks
You can write a long running script that runs your main script in predefined times but it will be very unnecessary, error prone, and it will basically be a "cron rewrite in phph".
Using the real cron itself will be easier and a more robust solution. If you are packaging an application, you can put a file in /etc/cron.d which contains a single cron line running your application.
You'll need to use a cron job (under Linux/Unix) or a scheduled task under Windows. You could have another script running on a continuous basis which checks the time and executes a script at a specified interval, but using the OS-supplied mechanism is easier to manage and resilient to restarts, etc.
The Uniform Server project has some good suggestions on mimicking cron in environments where cron is unacceptable. Still though, if cron is at all an option, use it.

Categories