We have a website running on multiple Azure instances – typically between 2 and 5.
There is a PHP script I would like to schedule to run every few minutes on each instance. (It just makes a local copy of data from a system that couldn't handle the load from all our users hitting it in real-time.)
If it were just one instance, that would be easy - I'd use Azure Scheduler to call www.example.com/my-scheduled-task.php every 5 minutes.
But the script needs to run on each instance, so that every instance has a reasonably up-to-date copy of the data. How would you achieve this? I can't work out if it's something in Azure Scheduler, or if I should be looking at some sort of startup script?
You can use a continuous webjob for that.
Just tweak your PHP script to run in a loop, with a sleep of a few minutes between runs of your code.
The continuous WebJob will run on all of your instances, and even if something fails it will be brought back up.
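As a rough illustration of the loop-and-sleep idea, here is a minimal sketch. The function names refreshLocalCopy and runPeriodically are hypothetical, and the $maxRuns parameter is only an assumption added so the loop can be exercised without running forever; a real continuous WebJob would leave it at null.

```php
<?php
// Sketch of a continuous WebJob entry point: run a task, sleep, repeat.
// refreshLocalCopy() is a hypothetical stand-in for the real copy logic.

function refreshLocalCopy(): void
{
    // Assumption: the real script fetches the remote data and stores it locally.
    echo date('c') . " refreshed local copy\n";
}

/**
 * Runs $task repeatedly, sleeping $intervalSeconds between runs.
 * $maxRuns = null loops forever (what a continuous WebJob wants);
 * a finite value only exists to make the loop testable.
 * Returns the number of runs attempted.
 */
function runPeriodically(callable $task, int $intervalSeconds, ?int $maxRuns = null): int
{
    $runs = 0;
    while ($maxRuns === null || $runs < $maxRuns) {
        try {
            $task();
        } catch (Throwable $e) {
            // Log and keep looping so one failed run does not end the job.
            error_log('Task failed: ' . $e->getMessage());
        }
        $runs++;
        if ($maxRuns === null || $runs < $maxRuns) {
            sleep($intervalSeconds);
        }
    }
    return $runs;
}

// In the real job, for a 5-minute interval:
// runPeriodically('refreshLocalCopy', 300);
```

Catching the exception inside the loop is a design choice: it keeps a transient failure from stopping the refresh cycle, at the cost of needing to watch the log for repeated errors.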
In my experience, a PHP WebJob running on each web app instance is a good solution, as @AmitApple said. However, I think you could try a scheduled WebJob with a CRON expression to guarantee the start time, rather than a continuous one with a sleep. Please also make sure the script can complete within the interval.
You can refer to the section Create a scheduled WebJob using a CRON expression of the doc Run Background tasks with WebJobs to learn how to get started.
Please see the note in the section Create a continuously running WebJob https://azure.microsoft.com/en-us/documentation/articles/web-sites-create-web-jobs/#CreateContinuous.
Note:
If your web app runs on more than one instance, a continuously running WebJob will run on all of your instances. On-demand and scheduled WebJobs run on a single instance selected for load balancing by Microsoft Azure.
For continuous WebJobs to run reliably and on all instances, enable the Always On configuration setting for the web app; otherwise they can stop running when the SCM host site has been idle for too long.
Related
I have created a PHP script that builds cache files from an API. It takes around 30 minutes to finish, that is, until all the cache files have been created.
My concern is that Hostinger's customer support tells me it won't run for 30 minutes, but in some answers I found that it can run in the background and there is nothing to worry about until it has finished.
So is it possible for a cron job to run for up to 30 minutes?
If not, what is the best way to run that cache-building script in the background at a specific time, like a cron job does? Please explain briefly so I can find a way forward.
Thanks for the great answer.
Ideally, a long-running task should be hosted on a platform that allows extended operations and defined so that it can be triggered externally, for example as an endpoint in a web API.
Then you can use the cron job to trigger that process.
Without creating a whole API, you could make this a single endpoint on your website, a hidden page that only the cron job knows how to call, and run your script from there.
There are lots of ways around this, but the methodology is the same: use the cron job as the trigger for a different process, and move the core logic of your script to a platform that allows the long execution time.
This is a similar post: Run a “long” php-script via Cronjob. Its answer suggests executing the script without waiting for the response. The same applies when calling an external web process or API: the cron job should not wait for a response.
It's good practice to limit resources on a web server, especially on a shared hosting account, because a long-running script can slow the web server down and in the worst case cause a denial of service.
It's recommended to run the script using php-cli and cron.
php-cli is much more relaxed about limits such as execution time and resources. Please also read
Events in MariaDB VS Cron in php - which is better
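To illustrate the php-cli suggestion, here is a minimal sketch of a guard that a long-running script could call first. The function names isCli and prepareLongRun are hypothetical. Under the CLI, max_execution_time already defaults to 0 (no limit), so the set_time_limit call is only a safeguard in case the host overrides that default.

```php
<?php
// Sketch: only do the long-running work when started from the command line
// (for example by cron), never through the web server.

function isCli(string $sapi = PHP_SAPI): bool
{
    return $sapi === 'cli';
}

/**
 * Returns false when not running under php-cli, so the caller can bail out.
 * Under the CLI it removes the execution time limit and returns true.
 */
function prepareLongRun(string $sapi = PHP_SAPI): bool
{
    if (!isCli($sapi)) {
        return false;
    }
    set_time_limit(0); // 0 means no limit; this is already the CLI default
    return true;
}

// Typical use at the top of the cache-building script:
// if (!prepareLongRun()) { exit(1); }
```

A crontab entry would then call the PHP binary directly rather than a URL, along the lines of `0 3 * * * /usr/bin/php /path/to/build-cache.php`, where both paths are placeholders that depend on the host.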
Context
I'm currently implementing a feature to schedule notifications for a specific period through a web form using PHP and Firebase.
To send the notification I use Firebase and it sends notifications to Android/Ios.
To schedule the notification I use the Linux at service, which seems to suit this better than cron: cron runs at a recurring frequency, whereas at runs a job once at a specific time.
man page about the AT: man page AT
Sample code
echo "/usr/bin/php send_notification.php" | at 15:40 2021-07-11
This creates a job file on Linux that runs exactly once, at 15:40 on 2021-07-11.
Problems
The at service, like cron, creates files inside a directory on the operating system that represent the jobs.
1 - If a machine on AWS is scaled out, jobs would likely be duplicated and consequently send notifications more than once. (Note: I don't know much about machine scaling, but I believe this would happen.)
2 - If the machine is down at the scheduled time, for example during a deployment of new functionality, I believe the job would not be executed with the current setup.
3 - Another problem, but not the main one, would be if I was using a docker container. As Ubuntu + PHP are inside the container, the job files would probably be lost if I restarted the container, so in this case I believe that a solution would be to use volume, but that would not be my problem now, as currently the application uses only one machine on AWS EB with the PHP image.
Questions
Is there any solution I can apply to solve this duplicate job problem using PHP?
Is the approach using at the most suitable? I see a lot of people recommending cron, but cron will run the job repeatedly, and that's not what I'm looking for.
I think you need a place where scheduled and finished notifications are persisted, regardless of whether you use cron or at.
If I had such a task, I would go with a solution like this: use cron to run a special script, "scheduler.php", every minute (or less often, e.g. every 5 minutes). It checks a log file (or a remote database if there are several machines) for new lines. If a line has a timestamp in the past and the status "scheduled", the script locks it and runs your "sender.php". After that it marks the line as "done". Each line in the storage should contain the timestamp to run at and one of three statuses: "scheduled", "running" or "done".
With this approach you can plan new notifications by adding a line with the desired time and the status "scheduled" to the storage. Note that there can be a small delay between the scheduled time and the actual notification, depending on the cron interval, but I suppose that is not critical.
This allows you to run any number of crons on different machines and guarantees that each job is done only once.
Important: if you adopt this scheme, make sure that scheduler.php reads and updates the storage in a single atomic operation, to prevent race conditions between several crons. File locks or "select for update" will do.
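The claim step could be sketched as follows. Note that this swaps in a conditional UPDATE in place of the file lock or "select for update" mentioned above: a row is claimed only by the process whose UPDATE actually changes the status. The table and column names (notifications, run_at, status) and the function names are assumptions made for the example.

```php
<?php
// Sketch of the claim step in scheduler.php using a conditional UPDATE.
// Table and column names (notifications, run_at, status) are hypothetical.

/**
 * Returns the ids of due notifications that THIS process managed to claim.
 * $now is a Unix timestamp; run_at is assumed to be stored the same way.
 */
function claimDueJobs(PDO $db, int $now): array
{
    $due = $db->prepare(
        "SELECT id FROM notifications WHERE status = 'scheduled' AND run_at <= :now"
    );
    $due->execute([':now' => $now]);

    $claim = $db->prepare(
        "UPDATE notifications SET status = 'running'
         WHERE id = :id AND status = 'scheduled'"
    );

    $claimed = [];
    foreach ($due->fetchAll(PDO::FETCH_COLUMN) as $id) {
        $claim->execute([':id' => $id]);
        // Exactly one process sees rowCount() === 1 for a given row;
        // any other scheduler racing for it gets 0 and skips the row.
        if ($claim->rowCount() === 1) {
            $claimed[] = (int) $id;
        }
    }
    return $claimed;
}

// Called after sender.php has finished for the given notification.
function markDone(PDO $db, int $id): void
{
    $db->prepare("UPDATE notifications SET status = 'done' WHERE id = :id")
       ->execute([':id' => $id]);
}
```

One thing this sketch does not handle is a scheduler that crashes after claiming a row: the row would stay in "running" and would need some timeout or manual cleanup.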
I feel a little bit silly for asking this question but I can't seem to find an answer on the internet for this problem. After searching for several hours I figured out that on a linux server you use Supervisor to run "php artisan queue:listen" (either with or without daemon) continuously on your website to handle jobs pushed to the queue. This is all well and good, but what if I want to do this on a Windows Azure web app? After searching around the solutions I found were:
Make a cron job to run "php artisan queue:listen" every minute (or every X minutes). I really dislike this solution and want to avoid it, especially if the site gets more traffic;
Add a WebJob that runs "php artisan queue:listen" continuously (the problem here is that I don't know how to write the script for the WebJob...);
I'd like your help to know which of these is the correct solution, whether there is a better one, and, if the WebJob is the best one, how to write the script for it. Thanks in advance.
In short, Supervisor is a modern alternative to nohup (no hang up) with a few other bits and pieces tacked on. There are other tools that can keep a task running in the background (as a daemon), and the one I use for Windows-based projects (very few, to be honest) is Forever, which I discovered via: https://stackoverflow.com/a/18226392/5912664
C:\myprojectroot > forever -c php artisan queue:listen --queue=some_nice_queue --tries=3
How?
Install Node for Windows, then install Forever with npm:
C:\myprojectroot > npm install -g forever
If you're stuck getting Node running on Windows, I recommend the Windows package manager Chocolatey:
https://chocolatey.org/packages?q=node
Be sure to check for any log files that Forever creates, as I once left one long enough to consume 30 GB of disk space!
For Azure you can add a new WebJob to your web app and upload a .cmd file containing a command like this:
php %HOME%\site\wwwroot\artisan queue:work --daemon
Then define it as a triggered WebJob with the CRON expression 0 * * * * * as its schedule.
That works for me.
Best.
First of all you cannot use a WebJob with Laravel on Azure. The Azure PHP Web App is hosted on Linux. WebJobs do not work with Linux at this moment.
The best way to do cron jobs in Laravel on Azure is to create an Azure Logic App. You use the Recurrence trigger and then an HTTP action to send a POST request to your Laravel Web App. You use this periodic heartbeat to run whatever actions you need to do. Be sure to add authentication to your POST request.
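As one possible way to authenticate the heartbeat, here is a minimal sketch of a shared-secret check in plain PHP (not Laravel-specific). The header name X-Heartbeat-Token and the function name isAuthorizedHeartbeat are hypothetical; the Logic App's HTTP action would be configured to send the same secret in that header.

```php
<?php
// Sketch: accept the heartbeat only if it is a POST carrying the shared secret.
// The header name X-Heartbeat-Token is an assumption made for this example.

/**
 * $server is normally $_SERVER; $expectedToken comes from configuration.
 */
function isAuthorizedHeartbeat(array $server, string $expectedToken): bool
{
    if (($server['REQUEST_METHOD'] ?? '') !== 'POST') {
        return false;
    }
    // PHP exposes the X-Heartbeat-Token header as HTTP_X_HEARTBEAT_TOKEN.
    $given = $server['HTTP_X_HEARTBEAT_TOKEN'] ?? '';

    // An empty configured secret must never authorize anything.
    // hash_equals compares in constant time to avoid timing leaks.
    return $expectedToken !== ''
        && is_string($given)
        && hash_equals($expectedToken, $given);
}

// Typical use at the top of the heartbeat endpoint:
// if (!isAuthorizedHeartbeat($_SERVER, getenv('HEARTBEAT_TOKEN') ?: '')) {
//     http_response_code(403);
//     exit;
// }
```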
The next problem you will have is that POST will be synchronous so the work you are doing cannot be extensive or your HTTP request will time out or you will reach the time limit on PHP scripts (60 seconds).
The solution is not Laravel Jobs because here again you need something running in the background to process the queues.
The solution is also not PHP threads. The standard Azure PHP Web App does not support PHP Threads. You can of course build your own Web App and enable PHP threads, but this is really swimming upstream.
You simply have to live with synchronous logic. So the work you are doing with the heartbeat should take no more than about 60 seconds.
If you need more extensive processing, then you really need to offload it to another place: another Web App, an Azure Function, etc.
But why not do that in the first place? The reason is cost and complexity. If you have something simple...like a daily report...you simply connect the report to the heartbeat and all the facilities for producing the report are right there in Laravel. To separate the daily report into its own container would require setup and the Web App it runs in would incur costs...not worth it in my view for something simple.
I have a setup where there are several application servers running php-fpm service and they all share a GlusterFS mount for the application code and other assets. In the current deploy process, the files get updated directly on the file server and many times to reflect changes the application service must be reloaded. To achieve that, the deployment script needs to get into every server and issue a reload command but with autoscaling, the number of servers is not the same at every moment.
Overall, I am sketching a couple of alternatives to solve this problem:
The first one, more artisanal and not perfect, as a proof of concept, would be a cron job that runs every X minutes on the application machines and looks in a file for a unique identifier such as its own hostname or IP address. If it finds a match, it takes no action; if not, it reloads and writes its identifier to the file. During deployment, the script would clear the file, and all servers would get reloaded on the next cron run.
The second one uses a more sophisticated approach, like a message queue or notification service that the running application machines subscribe to at boot time, waiting for an order to reload. The deploy script would then publish a notification to make all servers aware that it is time. A cron job similar to the one in the previous method would then notice that and reload the app server.
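The cron-side check of the first alternative could be sketched like this. The function name needsReload and the marker file path in the usage note are hypothetical, and the sketch assumes the shared mount honours flock, which would need verifying on GlusterFS.

```php
<?php
// Sketch of the marker-file check each application server runs from cron.

/**
 * Returns true if this host is not yet listed in the marker file
 * (meaning a deploy cleared it) and records the host in the file.
 * Returns false if the host already reloaded since the last deploy.
 */
function needsReload(string $markerFile, string $hostId): bool
{
    $handle = fopen($markerFile, 'c+'); // create if missing, never truncate
    if ($handle === false) {
        throw new RuntimeException("Cannot open $markerFile");
    }
    try {
        flock($handle, LOCK_EX); // serialise servers sharing the file
        $hosts = [];
        while (($line = fgets($handle)) !== false) {
            $hosts[] = trim($line);
        }
        if (in_array($hostId, $hosts, true)) {
            return false;
        }
        fseek($handle, 0, SEEK_END);
        fwrite($handle, $hostId . "\n");
        return true;
    } finally {
        flock($handle, LOCK_UN);
        fclose($handle);
    }
}
```

The cron script would call something like `needsReload('/mnt/shared/reload.marker', gethostname())` and issue the php-fpm reload when it returns true; the deploy script only has to truncate the marker file.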
Would any of that make sense? Is there any simpler or more standard way to trigger a broadcast for the applications servers running at a given moment in the deploy procedure without having to ssh to each and issuing the reload command? Any other advice you can provide or other suggestions?
Thanks!
I'm looking for a better solution for handling our cron tasks in a load-balanced environment.
Currently have:
PHP application running on 3 CentOS servers behind a load balancer.
Tasks that need to be run periodically but only on a single machine at a time.
Good old cron set up to run those tasks on the first server.
Problems if the first server is out of play for whatever reason.
Looking for:
Something more robust and decentralized.
Load balancing the tasks, so that each task runs only once but on random/different servers to spread the load.
Making sure the tasks still run when the first server goes down.
Being able to manage tasks and see aggregate reports ideally using a web interface.
Notifications if anything goes wrong.
The solution doesn't need to be implemented in PHP but it would be nice as it would allow us to easily tweak it if needed.
I have found two projects that look promising: GNUBatch and Job Scheduler. I will most likely test both further, but I wonder if someone has a better solution for the above.
Thanks.
You can use this small library that uses Redis to create a temporary timed lock:
https://github.com/AlexDisler/MutexLock
The servers should be identical and have the same cron configuration. The server that will be first to create the lock will also execute the task. The other servers will see the lock and exit without executing anything.
For example, in the PHP file that executes the scheduled task:
MutexLock\Lock::init([
    'host' => $redisHost,
    'port' => $redisPort
]);
// check if a lock was already created,
// if it was, it means that another server is already executing this task
if (!MutexLock\Lock::set($lockKeyName, $lockTimeInSeconds)) {
    return;
}
// if no lock was created, execute the scheduled task
scheduledTaskThatRunsOnlyOnce();
To run the tasks in a decentralized way and spread the load, take a look at: https://github.com/chrisboulton/php-resque
It's a PHP port of the Ruby version of resque and it stores the data in the exact same format, so you can use https://github.com/resque/resque-web or http://resqueboard.kamisama.me/ to monitor the workers and see reports.
Assuming you have a database available that is not hosted on one of those 3 servers:
Write a "wrapper" script that goes in cron and takes the program you're running as its argument. The very first thing it does is connect to the remote database and check when an entry was last inserted into a table (created for this wrapper). If the last insertion is older than the time the job was supposed to run, it inserts a new record into the table with the current time and executes the wrapper's argument (your cron job).
Cron up the wrapper on each server, each set X minutes behind the other (server A runs at the top of the hour, server B at 5 minutes past, C at 10 minutes past, etc).
The first server will always execute the cron first, so the other two servers never will. If the first server goes down, the second server will see that the job hasn't run, and will run it.
If you also record in the table which server it was that executed the job, you'll have a log of when/where the script was executed.
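The wrapper's decision could be sketched as below. The table name cron_runs, its columns and the function name runIfNotDone are assumptions made for the example; $dueAt is the start of the current slot (for instance the top of the hour) as a Unix timestamp. The check and the insert are not atomic here, so the sketch relies on the staggered start times described above to keep servers from racing each other.

```php
<?php
// Sketch of the cron wrapper: run the task only if no server has
// recorded a run for the current slot yet.
// Table and column names (cron_runs, job, ran_at, server) are hypothetical.

/**
 * Returns true if this server ran the task, false if the slot was
 * already covered by an earlier run.
 */
function runIfNotDone(
    PDO $db,
    string $job,
    int $dueAt,
    int $now,
    string $server,
    callable $task
): bool {
    $stmt = $db->prepare('SELECT MAX(ran_at) FROM cron_runs WHERE job = :job');
    $stmt->execute([':job' => $job]);
    $lastRun = $stmt->fetchColumn(); // null when the job has never run

    if ($lastRun !== null && $lastRun !== false && (int) $lastRun >= $dueAt) {
        return false; // another server already ran this slot
    }

    // Record the run first, including which server did it, then execute.
    $insert = $db->prepare(
        'INSERT INTO cron_runs (job, ran_at, server) VALUES (:job, :ran_at, :server)'
    );
    $insert->execute([':job' => $job, ':ran_at' => $now, ':server' => $server]);

    $task();
    return true;
}
```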
Wouldn't this be an ideal situation for using a message / task queue?
I ran into the same problem and came up with this little repository:
https://github.com/incapption/LoadBalancedCronTask