I've created a PHP daemon whose main job is polling FTP servers at a set interval.
There is now a need to add the same functionality at set times as well (say, 7 PM on Monday).
How would I modify the service to perform tasks at certain times of the day?
I know I could do something like "if date() equals the time the task should run, then ...", but if one iteration of the loop takes longer than normal, it might miss running the task.
Any ideas of how to achieve this?
I am working on scheduler-like code (in PHP, if that matters) and ran into an interesting problem: it's easy to reschedule a recurring task, but what if, for some reason, it ran significantly later than it was supposed to?
For example, let's say a job needs to run every hour and its next scheduled run is 13.05.2021 18:00, but it actually runs at 13.05.2021 20:00. Normal rescheduling logic would take the original scheduled time and add the recurrence frequency (1 hour in this case), but that would make the new time 13.05.2021 19:00, which is already in the past and can cause the job to run twice. We could, theoretically, use the "last run" time instead, but that can be something like 13.05.2021 20:03, which would make the new time 13.05.2021 21:03.
Now my question is: what logic can we use so that in this case the next time would be 13.05.2021 21:00? I've tried googling for something like this, but was not able to find anything. And I do see that Task Scheduler in Windows, for example, reschedules jobs the way I want.
I actually found a pretty easy way to do what I needed, so posting it as an answer.
If we have a value of frequency in seconds (in my case, at least) and we have the original nextrun, which is when a task was supposed to be run initially, then the logic is as follows:
1. Get the current time (time(), UTC_TIMESTAMP() or whatever).
2. Compare the current time against nextrun and get the difference between them in seconds.
3. Calculate how many iterations of the task could have been completed in that many seconds by dividing the time difference by frequency.
4. Round the resulting value up (ceil()). If the value is lower than 1, sanitize it (for example, clamp it to 1).
5. Multiply this rounded-up value by frequency. This gives a different result from step 2, and that is the crux of this method.
6. Add the resulting number of seconds to nextrun.
And that's it. This does not guarantee that you won't ever have a task run twice, if it ended just a few seconds before the time value from step 6, but to my knowledge Windows Task Scheduler has the same "flaw".
Since I am doing this calculation in SQL, here's how this would look in SQL (at least for MySQL/MariaDB):
UPDATE `cron__schedule` SET `nextrun`=TIMESTAMPADD(SECOND, IF(CEIL(TIMESTAMPDIFF(SECOND, `nextrun`, UTC_TIMESTAMP())/`frequency`) > 0, CEIL(TIMESTAMPDIFF(SECOND, `nextrun`, UTC_TIMESTAMP())/`frequency`), 1)*`frequency`, `nextrun`)
To explain by referencing the logic above:
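1. UTC_TIMESTAMP() - the current time.
2. TIMESTAMPDIFF(SECOND, `nextrun`, UTC_TIMESTAMP()) - time comparison in seconds.
3. TIMESTAMPDIFF(...)/`frequency` - the number of iterations that fit into that difference.
4. CEIL(...) to round up the value. IF(...) is used to sanitize, since we can get 0 iterations, which would result in the time not changing at all.
5. CEIL(...)*`frequency` - the number of seconds to add.
6. TIMESTAMPADD(...) - adds those seconds to `nextrun`.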
I do not like having to use TIMESTAMPDIFF(...) twice because of IF(...), but I do not know a way to avoid that without moving to a stored procedure, which feels like overkill. Besides, as far as I know, MySQL should calculate this value only once regardless. But if someone can advise me on a cleaner approach, I'll update the answer.
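For readers who want the six steps outside SQL, here is a minimal sketch in Python (used here only for illustration; the arithmetic carries over to PHP unchanged). The function name `next_run` and its parameters are assumptions made for this example, not part of the original answer.

```python
import math
from datetime import datetime, timedelta


def next_run(scheduled: datetime, now: datetime, frequency_s: int) -> datetime:
    """Move `scheduled` forward by whole periods, following steps 1-6 above.

    `scheduled` is the original nextrun, `now` is the current time (step 1)
    and `frequency_s` is the recurrence frequency in seconds.
    """
    # Step 2: difference between the current time and nextrun, in seconds.
    elapsed = (now - scheduled).total_seconds()
    # Steps 3-4: how many periods fit into the difference, rounded up.
    # The clamp to 1 is the "sanitize" step: always advance at least once.
    periods = max(1, math.ceil(elapsed / frequency_s))
    # Steps 5-6: convert periods back to seconds and add them to nextrun.
    return scheduled + timedelta(seconds=periods * frequency_s)


if __name__ == "__main__":
    # The example from the question: due at 18:00, actually finished at 20:03.
    print(next_run(datetime(2021, 5, 13, 18, 0),
                   datetime(2021, 5, 13, 20, 3),
                   3600))
```

Because the result is always the original time plus a whole number of periods, the schedule stays on its original grid (on the hour, in the example) no matter how late the run was.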
There isn't a right or wrong in this situation; it really depends on your business logic and how you want to build this.
WordPress and Drupal, two of the largest CMSs out there have faced this problem, too, which boils down to "poor man's cron" versus "system cron". For a "poor man's cron", these systems rely on someone hitting the website in order to "wake" the scheduler up, and if no one visits your site in a month, your tasks don't run, either. Both of these systems instead recommend using the system's cron to be more consistent and "wake up" the scheduler at certain intervals. I would encourage you to explore this in your system, too.
The next problem is, how are you storing your recurrence? Do you have (effectively) a table with every possible run time, so that for an hourly run there are 24 entries? Or is there just a single task that has an ideal run date/time? The latter is generally easier to control than the former, which stores a lot of duplicated data.
Then, do tasks reschedule themselves, does the scheduler do that, or is there a middle ground where the scheduler asks the task for the next best run? Figuring this out is very important and there are some nuances.
Another thing to think about: what happens if a task runs earlier than planned? For instance, does the world break if a task runs at 01:00 and 01:15, or is it just sub-optimal?
Generally when I build these types of systems, my tasks conform to a pattern (an interface in OOP) and support a "next run time". The scheduler pulls all of the tasks from a data store that have an expired "next run time" and runs them. Doing this, there's no chance for a single task to exist at both 01:00 and 02:00, because it only exists in the data store once, for instance at 01:00. If the scheduler then wakes up at 01:15, it finds the 01:00 task, which has expired, and runs it, and then it asks the task for the next run. The task looks at the clock (or the time as provided by the scheduler, if you are running in a distributed environment) and performs its own logic to determine that. If the logic is "every hour", you can add 60 minutes to "now" and then drop the minutes portion, so 01:15 becomes 02:00.
Throw some exception handling and possibly database transactions into this mix to guarantee that a task can't fail but still get rescheduled, too.
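The pattern described above could be sketched roughly as follows, in Python for illustration. `Task`, `run_due` and `next_full_hour` are hypothetical names, the in-memory list stands in for the data store, and "leave a failed task unrescheduled so it is retried" is one possible policy, not the only one.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List


def next_full_hour(now: datetime) -> datetime:
    """Add 60 minutes to `now`, then drop the minutes: 01:15 becomes 02:00."""
    return (now + timedelta(hours=1)).replace(minute=0, second=0, microsecond=0)


@dataclass
class Task:
    name: str
    next_run: datetime                          # stored exactly once per task
    action: Callable[[], None]                  # the work itself
    reschedule: Callable[[datetime], datetime]  # the task's own "next run" logic


def run_due(tasks: List[Task], now: datetime) -> List[str]:
    """Run every task whose next_run has expired; return the names that ran."""
    ran = []
    for task in tasks:
        if task.next_run > now:
            continue  # not due yet
        try:
            task.action()
        except Exception:
            # Assumed policy: a failed task keeps its old next_run, so it is
            # still expired and gets picked up on the next wake-up.
            continue
        # Ask the task for its next run, based on the scheduler's clock.
        task.next_run = task.reschedule(now)
        ran.append(task.name)
    return ran
```

Since the comparison is "next run time has expired" rather than "next run time equals now", a scheduler that wakes up late still finds the task, which is also what keeps a slow loop from missing a run.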
I want to set up a system where a privileged user can create a new task that runs from date/time X to date/time Y, saved in MySQL or SQLite. The task will send a request to a remote server via SSH, and when the end date/time is reached another SSH request will be sent.
What I'm not sure about is how to actually trigger the event at the start time and how to trigger the other one at the end time.
Should I be polling the server somehow every 1min (sounds like a performance hit) or setup jobs in Iron.io/Amazon SQS or something else?
I noticed Amazon SQS only allows messages to queue for up to 14 days, how would that work for events weeks or months in the future?
I'm not looking for code, just the idea of how it should work.
Basically there are two solutions, but maybe a hybrid version suits your problem best...
1. Use a queue (built into Laravel) and set up delayed jobs in the queue to be fired later on. You already mention that this might not be the best solution when a task is weeks or months away.
2. Use a cron job. The downside is that if you check once a day, that could mean a delay of up to 23h59m, and if you check every minute, that might give you performance issues (in most cases it kind of works, but it is definitely not perfect).
Combining 1 & 2 might be the best solution: check at the beginning of each day whether there are tasks that will end during the coming day. If so, schedule a job in the queue to end the task at the exact time at which it should end. This gives you scalability and the possibility to create tasks that end a year after they were created.
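The daily check in the combined approach boils down to selecting the tasks that end inside the coming window and computing a delay for each. A minimal sketch in Python (for illustration only; `jobs_for_window` and the `end_times` mapping are hypothetical, and handing each delay to a real queue is left out):

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple


def jobs_for_window(
    end_times: Dict[str, datetime],
    now: datetime,
    window: timedelta = timedelta(days=1),
) -> List[Tuple[str, float]]:
    """Return (task_id, delay_seconds) for every task ending before now + window.

    Tasks whose end time has already passed get a delay of 0 so that a
    missed sweep is caught up instead of silently skipped (an assumption
    of this sketch).
    """
    jobs = []
    for task_id, ends_at in end_times.items():
        if ends_at < now + window:
            delay = max(0.0, (ends_at - now).total_seconds())
            jobs.append((task_id, delay))
    # Earliest deadline first.
    return sorted(jobs, key=lambda job: job[1])
```

Each returned pair would then be pushed to the queue as a delayed job, so no delay ever exceeds the window length and queue retention limits stop being a concern.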
I'm currently working on a browser game with a PHP backend that needs to perform certain checks at specific, changing points in the future. Cron jobs don't really cut it for me as I need precision at the level of seconds. Here's some background information:
The game is multiplayer and turn-based
On creation of a game room the game creator can specify the maximum amount of time taken per action (30 seconds - 24 hours)
Once a player performs an action, they should only have the specified amount of time to perform the next, or the turn goes to the player next in line.
For obvious reasons I can't just keep track of time through JavaScript, as this would be far too easy to manipulate. I also can't rely on a cron job that runs every minute, as it may be up to 30 seconds late.
What would be the most efficient way to tackle this problem? I can't imagine querying a database every second would be very server-friendly, but it is the direction I am currently leaning towards[1].
Any help or feedback would be much appreciated!
[1]:
A user makes a move
A PHP function is called that sets 'switchTurnTime' in the MySQL table's game row to 'TIMESTAMP'
A PHP script that is always running in the background queries the table for any games where the 'switchTurnTime' has passed, switches the turn and resets the time.
You can always use a queue or daemon. This only works if you have shell access to the server.
https://stackoverflow.com/a/858924/890975
Every time you need an action to occur at a specific time, add it to a queue with a delay. I've used beanstalkd with varying levels of success.
You have lots of options this way. Here are two examples with 6-second intervals:
Use a cron job every minute to add 10 jobs, each delayed 6 seconds more than the previous one
Write a simple PHP script that runs in the background (daemon) and adds a new job to the queue every 6 seconds
I'm going with the following approach for now, since it seems to be the easiest to implement and test, as well as to deploy on different kinds of servers/hosting, while still being reliable.
Set up a cron job to run a PHP script every minute.
Within that script, first do a query to find candidates that will have their endtime within this minute.
Start a while-loop, that runs until 59 seconds have passed.
Inside this loop, check the remaining time for each candidate.
If the time limit has passed, do another query on that specific candidate to ensure the endtime hasn't changed.
If it has changed, re-add it to the candidates queue as necessary. If not, act accordingly (in my case: switch the turn to the next player).
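The steps above can be sketched as a single function that the per-minute cron job would call. This is a Python illustration under assumptions: `fetch_end_time` and `on_expired` are hypothetical callbacks standing in for the database re-check and the turn switch, and the clock and sleep functions are injectable only to make the loop testable.

```python
import time
from typing import Callable, Dict


def run_minute(
    candidates: Dict[str, float],
    fetch_end_time: Callable[[str], float],
    on_expired: Callable[[str], None],
    clock: Callable[[], float] = time.time,
    sleep: Callable[[float], None] = time.sleep,
    budget_s: float = 59.0,
    tick_s: float = 0.25,
) -> None:
    """Watch candidates (game id -> end time) for up to budget_s seconds."""
    started = clock()
    pending = dict(candidates)
    # Stop after the time budget, or early once nothing is left to watch.
    while pending and clock() - started < budget_s:
        now = clock()
        for game_id, end_time in list(pending.items()):
            if end_time > now:
                continue  # time limit not reached yet
            # Time limit passed: re-check this one candidate in the store.
            current = fetch_end_time(game_id)
            if current != end_time:
                if current < started + budget_s:
                    pending[game_id] = current  # moved, but still in this run
                else:
                    del pending[game_id]  # a later cron run will pick it up
                continue
            on_expired(game_id)  # e.g. switch the turn to the next player
            del pending[game_id]
        sleep(tick_s)
```

The tick length bounds how late an action can fire, so a smaller `tick_s` buys precision at the cost of more wake-ups, while the database is only queried again for candidates whose time has actually run out.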
Hope this will help somebody in the future, cheers!
I've searched the web and apparently there is no way to launch a PHP script without user interaction.
A few people have recommended cron, but I am not sure this is the right way to go.
I am building a website where auctions are possible, just like eBay. After a certain amount of time the objects are no longer available and the auction is considered finished.
I would like to know a way to interact with the database automatically.
When do you need to know if an object is available? -> Only if someone asks.
And then you have the user interaction you are searching for.
It's something different if you want to, let's say, send an email to the winner of an auction. In this case you'd need some timer set to the ending time of the auction. The easiest way to do this would be a cron job...
There are several ways to do this. Cron is a valid one and the one I would recommend if it's available.
Another is to check before handling each request related to an object whether it is still valid. If it is not, you can delete it from the database on-the-fly (or do whatever you need to) and display a different page.
Also you could store the time at which your time-based script was run last in the database and compare that time with the current time. If the delay is large enough, you can run your time based code. However, this is prone to race conditions if multiple users hit the page at the same time, so the script may run multiple times (maybe this can be avoided using locks or anything though).
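One way to avoid that race without explicit locks is to let the database decide who runs: a single conditional UPDATE moves the stored "last run" time forward, and only the request whose UPDATE actually changed a row executes the time-based code. Below is a minimal sketch using Python's `sqlite3` module for illustration; the `job_state` table and `try_claim_run` function are hypothetical, and the same statement should work in MySQL.

```python
import sqlite3

# Hypothetical single-row table holding the time of the last run.
SCHEMA = """
CREATE TABLE IF NOT EXISTS job_state (
    id INTEGER PRIMARY KEY,
    last_run INTEGER NOT NULL
);
INSERT OR IGNORE INTO job_state (id, last_run) VALUES (1, 0);
"""


def try_claim_run(conn: sqlite3.Connection, now: int, interval_s: int) -> bool:
    """Return True only for the one caller allowed to run the periodic code.

    The WHERE clause makes the check and the update a single statement, so
    two requests arriving together cannot both see a stale last_run.
    """
    cur = conn.execute(
        "UPDATE job_state SET last_run = ? WHERE id = 1 AND last_run <= ?",
        (now, now - interval_s),
    )
    conn.commit()
    # rowcount is 1 if this call moved last_run forward, 0 otherwise.
    return cur.rowcount == 1
```

A page handler would call `try_claim_run` on each request and run the time-based code only when it returns True.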
To edit cronjobs from the shell: crontab -e
A job to run every 10 minutes: */10 * * * * curl "http://example.com/finished.php"
TheGeekStuff.com cron Examples
Use a heartbeat/bot implementation
A cron job that runs pretty frequently, or a program that starts on boot and runs continuously (maybe sleeping periodically), is the way to go. With a cron job you'll need to make sure that you don't have two running at any given time, or write it such that it doesn't matter if more than one is working at any given time. With a "resident" program you'll need to figure out how to handle the case when it crashes unexpectedly.
I wouldn't rely on this mechanism to actually close the auction, though. That should be handled in your database/web site. That is, the auction has a close time and either the database constraints or your code makes it impossible to bid on a closed auction. Notifying the winner and seller, setting up the payment process, etc. are things your service/scheduled task could do.
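To illustrate the point that the close time itself should end the auction, here is a minimal sketch in Python (names such as `Auction` and `place_bid` are hypothetical): the bid path compares the clock with the stored close time, so the auction is closed even if no scheduled task has run yet.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Auction:
    closes_at: datetime
    highest_bid: int = 0


def place_bid(auction: Auction, amount: int, now: datetime) -> bool:
    """Accept a bid only while the auction is open and the bid is higher."""
    if now >= auction.closes_at:
        return False  # closed by its own close time, no background job needed
    if amount <= auction.highest_bid:
        return False
    auction.highest_bid = amount
    return True
```

The scheduled task is then only responsible for follow-up work such as notifications, and running it a little late does not affect who wins.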
I'm new to PHP, so I need some guidance as to which would be the simplest and/or elegant solution to the following problem:
I'm working on a project which has a table with as many as 500,000 records. At user-specified times, a background task must be started which will invoke a command line application on the server that does the magic. The problem is that every minute or so, I need to check all 500,000 records (and counting) to see if something needs to be done.
As the title says, it is time-critical: a maximum of 1 minute delay is allowed between the time expected by the user and the time that the task is executed. Of course, the less delay, the better.
Thus far, I can only think of a very dirty option: have a simple utility app that runs on the server and, every minute, makes multiple requests to the server, for example:
check records between 1 and 100,000;
check records between 100,000 and 200,000;
etc. you get the point;
and the server basically starts a task for each batch of 100,000 records or fewer. It seems to me that there must be a faster approach, something similar to Facebook's notifications.
Additional info:
server is Windows 2008
using apache + php
EDIT 1
users have an average of 3 tasks per day at about 6-8 hours interval
more than half of the tasks can be executed at the same time at least once per day[!]
Any suggestion is highly appreciated!
The easiest approach would be using a persistent task that runs the whole time and receives notification about records that need to be processed. Then it could process them immediately or, in case it needs to be processed at a certain time, it could sleep until either that time is reached or another notification arrives.
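The bookkeeping such a persistent task needs is small: a priority queue ordered by due time, plus a way to ask how long it may sleep. The sketch below is a Python illustration with hypothetical names (`DueQueue`, `notify`, `pop_due`, `sleep_for`); the notification transport itself is not shown.

```python
import heapq
from typing import List, Optional, Tuple


class DueQueue:
    """Records waiting to be processed, ordered by their due time."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, str]] = []

    def notify(self, due_at: float, record_id: str) -> None:
        """Called when the worker is told about a record to process."""
        heapq.heappush(self._heap, (due_at, record_id))

    def pop_due(self, now: float) -> List[str]:
        """Remove and return every record whose due time has been reached."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[1])
        return due

    def sleep_for(self, now: float) -> Optional[float]:
        """Seconds until the earliest record is due, or None if idle."""
        if not self._heap:
            return None
        return max(0.0, self._heap[0][0] - now)
```

A worker loop would block on its notification channel with `sleep_for(now)` as the timeout, call `notify` when a message arrives, and process whatever `pop_due(now)` returns. That way only records that are actually due are touched, instead of scanning all 500,000 rows every minute.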
I think I gave this question more than enough time. I will stick with a utility application (that sits on the server) which makes requests to a URL accessible only from the server's IP; that URL will spawn a new thread for each task if multiple tasks need to be executed at the same time. It's not really scalable, but it will have to do for now.