I need to run automated tasks every 15 minutes. The tasks are for my server (call it Server A, Ubuntu 10.04 LAMP) to query another server (Server B) for updates.
I have multiple users to query for, possibly 14 (or more). As of now, the scripts are written in PHP. They do the following:
query Server B for updates on users
if Server B says there are updates, then Server A retrieves them
on Server A, update the DB with the new data for those users
run calculations on Server A
send prompts to the users whose data was updated
I know cron jobs might be the way to go, but there could be a scenario where I end up with a cron job for every user. Is that reasonable? Or should I use one cron job that queries data for all my users?
Also, the server I'm querying has a Java API that I can use to query it, which means I could develop a Java servlet to do the same thing. I've had trouble with this approach, but I'm looking for feedback on whether it's the way to go. I'm not familiar with Tomcat and don't fully understand it yet.
Summary: I need my server to run tasks automatically every 15 minutes that request data from another server, update the local DB, and then send prompts to users. What are the recommended approaches?
Thanks for your help!
Create a single script, triggered by cron, that loops through the users and performs each step for every one of them. Pseudo code:
$users = queryLocalDbForUsers();
foreach ($users as $user) {
    $updates = checkForUpdates($user);
    if ($updates) {
        updateDb($user, $updates);
        emailUser($user);
    }
}
If you have a lot of users or the API is slow, you'll either want to set a long script timeout (via ini_set) or you could add a TIMESTAMP DB column, "LastUpdateCheck", and run the cron more often (every 30 seconds?) while limiting the update/API query to one or two users per run (those with the oldest check times).
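A minimal sketch of that LastUpdateCheck idea (the helper name and data shape are assumptions). In SQL you would pick the stalest users with something like SELECT id FROM users ORDER BY LastUpdateCheck ASC LIMIT 2; the selection logic itself, shown in plain PHP so it is runnable here:

```php
<?php
// Pick the N users whose updates were checked longest ago.
// $users maps user id => last-check Unix timestamp (hypothetical shape).
function pickStalestUsers(array $users, int $limit): array
{
    asort($users);                                // oldest check time first
    return array_slice(array_keys($users), 0, $limit);
}
```

Each cron run then processes only those users and sets their LastUpdateCheck to NOW(), so the next run naturally moves on to the others.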
I am working on a system developed in PHP without a framework.
It has a function that automatically runs some jobs via a third-party API every night. It loops through all the jobs in a table and calls the API using cURL.
// run cron job to loop this table
ID JOB
1 updateUser
2 getLatestInfo/topicA
……
// code
// if job == updateUser:
//   loop through the user table and call the API to get the latest info…
// other cURL tasks also run here, like sending emails / notifications…
It worked perfectly before, but recently we have gained many new users, so it now makes 50-100 API calls at the same time.
Each API call takes 10-20 seconds to respond, and we retry the call if it times out.
I checked the log: the first job alone takes 3-4 hours (with many errors).
I could make the cron job queue the cURL calls, e.g. take the first 5 and run them every minute. But if we keep adding users or tasks, and the third-party API stays slow, it may take even more hours to finish.
Is there any solution that keeps listening to the job table and runs the cURL calls one by one?
I want it to be triggered automatically when a new row is added to the table (like a websocket?), and not a single PHP script running an infinite loop (to avoid having to rerun the PHP task manually if an error occurs).
(The API keys are in the PHP project, so I hope I can do this in the same project.)
PHP scripts need to be triggered in order to do something; they can't really "run in the background" (I mean, they can, technically, but PHP isn't supposed to be used that way).
Instead, one of three options is usually used for job management:
call jobs on every web request, alongside the actual code that generates the output
use an external web cron service to query specific URLs tied to job execution
use a local cron job on your system to call the PHP executable and have it execute jobs periodically
If you want an event-based system, PHP is likely the wrong option. Depending on your DB system, though, you might be able to create a small wrapper that subscribes to DB changes, is triggered on inserts, and then calls PHP again. But it's definitely cleaner to use a more suitable programming language / environment.
I'm using AWS (Amazon Web Services) and I have 4 identical web servers. I have a cron job set up to send out emails; it only needs to run on one server but exists on all 4.
Everything on these web servers is identical, which suits me, but currently the cron job runs on all 4. I'm therefore looking for a way to keep the job on all 4 servers but run it on only one.
I do have a database and thought of using a tmp table: the first server to write/register itself in that table would be the only one to run (the others would read it and not execute past that point). But I was concerned that, because they all use an identical system clock, they might read the table at the same time, and even though one writes to it, the others would assume they also have permission to run.
Would using a table be able to restrict the 4 web servers to only 1 continuing, or would the identical clock get in the way?
Note: the server names can change over time, as they are more like resources than permanent systems (e.g. new ones can be spun up when required)...
Is there a better way to do this? Note: I don't want a separate server, and I want them all identical.
Thank you!
For me, I would do:
1 - create a lock for each server
2 - define a list of servers, like ['1' => 'server ip', '2' => 'server ip', ...]
3 - pass the server id to the cron job as an argument
4 - modify the script: if its lock does not exist, run the task, then create the lock for this server and delete the lock for the next server
5 - to start things off, delete the lock for the first server.
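A hedged sketch of that rotating-lock scheme, assuming the lock files live on storage all servers can reach (e.g. NFS); the function name and paths are made up:

```php
<?php
// Rotating lock: a server runs only when its own lock file is absent;
// after running it re-locks itself and unlocks the next server in the
// list, so the task rotates across servers on successive cron ticks.
function shouldRunAndRotate(string $myId, array $servers, string $lockDir): bool
{
    $myLock = "$lockDir/server-$myId.lock";
    if (file_exists($myLock)) {
        return false;                     // not this server's turn
    }
    touch($myLock);                       // re-lock this server
    // Unlock the next server so it runs on the following tick.
    $next = $servers[(array_search($myId, $servers) + 1) % count($servers)];
    @unlink("$lockDir/server-$next.lock");
    return true;                          // caller runs the task now
}
```

To bootstrap the rotation, delete the first server's lock file by hand, as step 5 says.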
The machines will each have different hostnames, so you could use that to execute the job only on the desired machine. For example, if the machines are "server01", "server02", etc., then in PHP you might do:
if (gethostname() === 'server01') {
// do the task
}
Another way is, instead of having a single server image for all servers, to have two slightly different images: one for the "primary" server (which has the cron tasks, etc.) and one for all other ("secondary") servers.
I have a service like Backupify, which downloads data from different social media platforms. Currently I have about 2,500 active users; for each user a script runs that gets data from Facebook and stores it on Amazon S3. My server is an EC2 instance on AWS.
I have entries in a table, around 900 for Facebook users. A PHP script gets a user from the database table, backs up their data from Facebook, and then picks the next user.
Everything was fine when I had fewer than 1,000 users, but now that I have more than 2,500 the problem is that the PHP script halts, or runs for the first 100 users and then halts, times out, etc. I am running the script with the php -q myscript.php command.
The other problem is that a single user takes about 65 seconds, so reaching the last user in the table may take days. What's the best way to run this in parallel over the database table?
Please suggest the best way to back up a large amount of data for a large number of users. I should also be able to monitor the cron, something like a manager.
If I understand correctly, you've got a single cron task for all the users, running at some frequency, trying to process every user's data in a single shot.
Did you try issuing set_time_limit(0); at the beginning of your code?
Also, if the task is resource demanding, did you consider creating a separate cron task for every N user (basically mimicking multithreaded behaviour; and thus utilizing multiple CPU cores of the server)?
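That "separate cron task for every N users" idea can be sketched like this (the chunk size, helper name, and the way each cron entry passes its own chunk index, e.g. php backup.php 0, php backup.php 1, are all assumptions):

```php
<?php
// Split the full user list into fixed-size chunks so that each cron
// task (invoked with its own chunk index) processes only its slice.
function usersForChunk(array $allUserIds, int $chunk, int $chunkSize): array
{
    return array_slice($allUserIds, $chunk * $chunkSize, $chunkSize);
}
```

With, say, 2,500 users and a chunk size of 100, you would register 25 cron entries, each handling an independent slice in parallel.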
Is writing your data to some kind of cache instead of the database and having a separate task commit the cache contents to the database feasible for you?
Do you have the opportunity to use an in-memory data table (that's pretty quick)? You'll need to persist the DB contents to disk every now and then, but for this price you get a fast DB access.
Can you maybe outsource the task to separate servers as a distributed service and write the cron script as a load balancer for them?
Also, optimizing your code might help. For example (if you're not doing so already) you could buffer the collected data and commit it in a single transaction at the end of the script, so the execution flow isn't scattered by recurring DB I/O.
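A small sketch of that buffering idea (an in-memory SQLite DB is used here just to keep the example self-contained; the table and payloads are made up):

```php
<?php
// Buffer results in PHP, then write them in one transaction instead
// of one round-trip per user.
$pdo = new PDO('sqlite::memory:');
$pdo->exec('CREATE TABLE backups (user_id INTEGER, payload TEXT)');

$buffer = [];
foreach ([1, 2, 3] as $userId) {
    // In the real script this is the slow Facebook/S3 work.
    $buffer[] = [$userId, "data-for-$userId"];
}

// One transaction instead of one commit per user.
$pdo->beginTransaction();
$stmt = $pdo->prepare('INSERT INTO backups (user_id, payload) VALUES (?, ?)');
foreach ($buffer as $row) {
    $stmt->execute($row);
}
$pdo->commit();
```

On MySQL with InnoDB the same pattern applies; the single commit avoids flushing to disk once per row.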
I'm looking for a better solution for handling our cron tasks in a load-balanced environment.
Currently have:
PHP application running on 3 CentOS servers behind a load balancer.
Tasks that need to be run periodically but only on a single machine at a time.
Good old cron set up to run those tasks on the first server.
Problems if the first server is out of play for whatever reason.
Looking for:
Something more robust and de-centralized.
Load balancing the tasks, so each task runs only once, but on a random/different server, to spread the load.
Ensuring the tasks still run when the first server goes down.
Being able to manage tasks and see aggregate reports ideally using a web interface.
Notifications if anything goes wrong.
The solution doesn't need to be implemented in PHP but it would be nice as it would allow us to easily tweak it if needed.
I have found two projects that look promising: GNUBatch and Job Scheduler. I will most likely test both further, but I wonder if someone has a better solution for the above.
Thanks.
You can use this small library that uses redis to create a temporary timed lock:
https://github.com/AlexDisler/MutexLock
The servers should be identical and have the same cron configuration. The server that will be first to create the lock will also execute the task. The other servers will see the lock and exit without executing anything.
For example, in the php file that executes the scheduled task:
MutexLock\Lock::init([
'host' => $redisHost,
'port' => $redisPort
]);
// check if a lock was already created,
// if it was, it means that another server is already executing this task
if (!MutexLock\Lock::set($lockKeyName, $lockTimeInSeconds)) {
return;
}
// if no lock was created, execute the scheduled task
scheduledTaskThatRunsOnlyOnce();
To run the tasks in a de-centralized way and spread the load, take a look at: https://github.com/chrisboulton/php-resque
It's a PHP port of the Ruby version of Resque, and it stores the data in the exact same format, so you can use https://github.com/resque/resque-web or http://resqueboard.kamisama.me/ to monitor the workers and see reports.
Assuming you have a database available that is not hosted on one of those 3 servers:
Write a "wrapper" script that goes in cron and takes the program you're running as its argument. The very first thing it does is connect to the remote database and check when an entry was last inserted into a table (created for this wrapper). If the last insertion is older than the current scheduled run, it inserts a new record with the current time and executes the wrapper's argument (your cron job).
Cron up the wrapper on each server, each set X minutes behind the others (server A runs at the top of the hour, server B at 5 minutes past, C at 10 minutes past, etc.).
The first server will always execute the cron job first, so the other two never will. If the first server goes down, the second will see that the job hasn't run, and will run it itself.
If you also record in the table which server it was that executed the job, you'll have a log of when/where the script was executed.
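A sketch of that wrapper's check, assuming a shared cron_runs table and a 15-minute interval (an in-memory SQLite DB stands in for the remote database just so the example is self-contained):

```php
<?php
// Run the real job only if no other server has claimed this interval.
$pdo = new PDO('sqlite::memory:');
$pdo->exec('CREATE TABLE cron_runs (server TEXT, ran_at INTEGER)');

function claimRun(PDO $pdo, string $server, int $now, int $interval): bool
{
    $last = $pdo->query('SELECT MAX(ran_at) FROM cron_runs')->fetchColumn();
    if ($last !== null && $now - (int)$last < $interval) {
        return false;                     // another server already ran this slot
    }
    $stmt = $pdo->prepare('INSERT INTO cron_runs (server, ran_at) VALUES (?, ?)');
    $stmt->execute([$server, $now]);      // record who ran it, and when
    return true;                          // caller executes the real cron job
}
```

Because each server's crontab is offset a few minutes from the others, the first healthy server always claims the slot, and the recorded server name gives you the when/where log mentioned above.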
Wouldn't this be an ideal situation for using a message / task queue?
I ran into the same problem but came up with this little repository:
https://github.com/incapption/LoadBalancedCronTask
How to create an always-running PHP background script to clean up particular rows in a MySQL DB?
Hi guys, I want to create a PHP script that runs automatically (for example, every hour) and deletes particular rows from the database (for example, user comments that are at least 5 days old). For this I already have a date column in the table.
Please guide me how to do this as I am new to PHP.
For this, you can create a cron job, if you have access to cPanel.
Cron jobs run periodically, say every minute, hour, day, or week. Refer to this.
What you are looking for is information on running PHP maintenance scripts as a cron job. How you do that depends on what type of server you are using: shared hosting or a dedicated server. On shared hosting there will be information about this in your control panel; otherwise you need to talk to your server admin.
You can set your PHP script that cleans up the MySQL DB to run every hour, or every few minutes/weeks/months, with cron.
cron job basics
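A sketch of the cleanup script itself (the comments table layout is an assumption, and an in-memory SQLite DB keeps the example runnable; with MySQL the WHERE clause could instead be created_at < NOW() - INTERVAL 5 DAY). You would then schedule it with a crontab line like 0 * * * * php /path/to/cleanup.php:

```php
<?php
// Delete comments older than a cutoff age; meant to be run from cron.
$pdo = new PDO('sqlite::memory:');
$pdo->exec('CREATE TABLE comments (id INTEGER, created_at INTEGER)');

function deleteOldComments(PDO $pdo, int $maxAgeSeconds): int
{
    $cutoff = time() - $maxAgeSeconds;
    $stmt = $pdo->prepare('DELETE FROM comments WHERE created_at < ?');
    $stmt->execute([$cutoff]);
    return $stmt->rowCount();             // number of rows removed
}
```

Running it hourly keeps each batch small, so the delete never locks the table for long.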