I need to build a web application that updates a MySQL database every minute, with the data shown in real time in the browser.
The only thing I'm concerned about is the cron job.
In the early stages the cron job will need less than a minute to finish, but later on it might need more than a minute to complete. Will that crash the process?
For example, Process A is still running when, a minute later, Process B starts to run. (Could this corrupt the data in the database?)
Thanks.
Related
I am working on a web application that requires PHP code to run at a specific date/time.
Some hypothetical examples would be sending a user an email at 09:00 on their birthday or modifying a database entry (MySQL) at a predetermined date and time.
What would be the conventional way to implement this kind of scheduling feature?
I've seen cron jobs being used for similar requirements, but would this be feasible for a large number of scheduled tasks?
It depends on the size of your project, but usually we don't have one script do all the cron jobs, because things need to happen at different times. Some emails might be sent at 9pm, some database cleanup might happen at midnight, and sessions might expire every few minutes. The best thing to do is set up a separate cron job for each thing. There are a couple of exceptions to this:
Some tasks need to run very frequently, like every 30 seconds. For that we set up a cron task which checks "ps" to see whether its own process is already running, and have it just wait 30 seconds and loop.
Some tasks make more sense to run when a queue is full. For that we trigger the task when a certain user-based script executes the 100th or 200th or 300th time. For example when a user pings 100 times without doing anything else, we log them out without using a cron task, but there is a separate cron task which checks if they've been inactive for 10 minutes.
To detect online users who close the browser without logging out properly, I want to run a cron job every 2 or 3 seconds. (While a user is logged in, I update the database every 10 seconds.)
Will this be harmful to the server?
Cron's shortest possible interval is 1 minute, so you cannot fire anything more often with it. As for whether it is overkill: it may or may not be, but you need to review the code and the load it produces yourself.
You can't run cron jobs more frequently than once a minute so this isn't possible from a cron anyways. You'd have to have a process running in a loop with sleep to achieve this, but the idea is overkill anyways - once a minute is fine.
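The "loop with sleep" idea can be sketched roughly like this. It is only a sketch in Python rather than PHP, and the function name `run_within_window` and its parameters are made up for illustration: cron starts the script once a minute, and the script repeats the task every few seconds until the minute is used up.

```python
import time


def run_within_window(task, interval=10.0, window=60.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Call task() every `interval` seconds until `window` seconds are used up.

    Hypothetical helper: cron would start this script once per minute, and
    the loop covers the sub-minute repeats that cron itself cannot schedule.
    """
    start = clock()
    runs = 0
    while True:
        tick = clock()
        task()
        runs += 1
        # Schedule the next run relative to the start of this one.
        next_start = tick + interval
        if next_start - start >= window:
            break  # The next cron invocation takes over from here.
        delay = next_start - clock()
        if delay > 0:
            sleep(delay)
    return runs


if __name__ == "__main__":
    run_within_window(lambda: print("checking inactive users"), interval=10.0)
```

If the task itself can run longer than the window, this still needs some kind of lock so that the next cron invocation does not overlap with it.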
I have a PHP cronjob which runs at 6:00am every day to do some calculations and store them in the SQL database. What happens now is that the site becomes a bit unresponsive for about 40 seconds while the cron is running.
I was wondering whether using sleep() in the script would do the trick? Let's say I'd pause it for 2 seconds roughly every second. Therefore, the script would run for a lot longer time, but would also let apache use the system's resources.
Am I right? Does it work? Or will the script still keep resources to itself, even during the sleep()?
I need to run some tasks continuously. These tasks consist mainly of retrieving specific records from the DB, analyzing them and saving them. This is a non-trivial analysis, which might take several seconds (perhaps more than a minute).
I do not know how frequently new records will be saved in the DB waiting for analysis (there's another cronjob for that).
Should I retrieve records one by one calling the same analysis function again once it finishes (recursively) and try to keep the cronjob running until there are no more unanalyzed records?
Or should I retrieve a fixed amount of new records on each cronjob run and call the cronjob every certain amount of minutes?
A job queue server may work well for this scenario (see ActiveMQ or MemcacheQ, for example). Rather than adding the un-analyzed records directly to the database, send them to a queue for processing. Then your cron job could retrieve some items from the queue for processing, and if one job takes so long that the cron job is triggered again, the next run will grab the next items in the queue.
Personally, I would have the cron job retrieve a fixed number of records for processing, just to make sure you don't get the script stuck processing for a very long time in the event new records keep getting added and the processor can't keep up. Eventually it would probably finish everything but you could end up in a situation where it continues for a very long time.
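A rough sketch of the fixed-batch approach, in Python with `sqlite3` standing in for the real database. The `records` table, its columns and the `process_batch` name are all assumptions made for the example, not anything from the question:

```python
import sqlite3

# Hypothetical schema, only for illustration.
SCHEMA = """
CREATE TABLE IF NOT EXISTS records (
    id INTEGER PRIMARY KEY,
    payload TEXT,
    analyzed INTEGER DEFAULT 0,
    result TEXT
)
"""


def process_batch(conn, analyze, batch_size=100):
    """Analyze at most `batch_size` pending records, then return.

    Returns the number of records handled, so the caller can log it
    and tune the batch size over time.
    """
    rows = conn.execute(
        "SELECT id, payload FROM records WHERE analyzed = 0 "
        "ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    for record_id, payload in rows:
        result = analyze(payload)
        conn.execute(
            "UPDATE records SET analyzed = 1, result = ? WHERE id = ?",
            (result, record_id),
        )
        # Commit per record so a crash loses at most one analysis.
        conn.commit()
    return len(rows)


if __name__ == "__main__":
    conn = sqlite3.connect("records.db")
    conn.execute(SCHEMA)
    print(process_batch(conn, str.upper, batch_size=100), "records analyzed")
```

Each cron run does a bounded amount of work, so whatever is left over simply waits for the next run.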
You may also consider creating a lock file that the job can look for to see if the task processor is already running. For example, when the cron job starts, check for the existence of a file (e.g. processor.lock). If it exists, exit; if not, create the file, process some records, and delete the file.
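Here is a minimal sketch of the lock file idea in Python. The `run_with_lock` name is made up. One deliberate change from the description above: it creates the file with `O_CREAT | O_EXCL`, so "check whether it exists" and "create it" happen as one atomic step and two runs starting at the same moment cannot both pass the check.

```python
import os


def run_with_lock(task, lock_path="processor.lock"):
    """Run task() unless another run already holds the lock file.

    Returns True if the task ran, False if it was skipped.
    """
    try:
        # O_EXCL makes the call fail if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # Previous run is still going, so exit quietly.
    with os.fdopen(fd, "w") as lock_file:
        lock_file.write(str(os.getpid()))  # Handy when debugging a stale lock.
    try:
        task()
    finally:
        os.remove(lock_path)
    return True


if __name__ == "__main__":
    if not run_with_lock(lambda: print("processing some records")):
        print("already running, exiting")
```

Note that if the process is killed hard, the file stays behind and every later run will skip, so some stale-lock check (for example on the stored PID or the file's age) is probably needed in practice.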
Hope that helps.
Or should I retrieve a fixed amount of new records on each cronjob run and call the cronjob every certain amount of minutes?
That. And you'll have to do some trial and error metrics first to decide an optimal fixed amount.
Of course it heavily depends on what you are actually doing, how many DB-intensive cron jobs you are running simultaneously and what kind of setup you have. I recently spent a day looking for a Heisenbug in a very intensive script that migrated images from the DB to S3 (and created a few thumbs while migrating). The problem was that, due to an undocumented behaviour in our ORM, the connection to the database was lost at some point, as posting to S3 plus thumb generation for certain images took a little bit more than the connection time limit. It was an ugly situation that would probably have cost more than a day to identify in a recursive do-it-all scheme.
You'd be better off with the safe approach, even if it means a little time lost between cron executions.
Instead of using a cron job, I would use The Fat Controller to run and repeat tasks. It is basically a daemon which can run any script or application and restart it after it finishes, optionally with a delay between runs.
You can additionally specify a timeout so that long-running scripts will be stopped. This way you don't need to care about locking, long-running processes, error handling and so on. It will help to keep your business logic clean.
There are more examples and use cases on the website:
http://fat-controller.sourceforge.net/
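To give a feel for the general idea (this is not The Fat Controller's own code or interface, just a hypothetical Python sketch of the same pattern), a small supervisor can run a command, stop it if it exceeds a timeout, wait a bit, and start it again. Because only one run exists at a time, runs cannot overlap:

```python
import subprocess
import sys
import time


def run_repeatedly(command, delay=5.0, timeout=60.0, max_runs=None):
    """Run `command` again and again, one run at a time.

    Yields the exit code of each run, or the string "timeout" if the run
    had to be stopped. `max_runs=None` means keep going forever.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        try:
            # subprocess.run kills the child itself when the timeout expires.
            outcome = subprocess.run(command, timeout=timeout).returncode
        except subprocess.TimeoutExpired:
            outcome = "timeout"
        runs += 1
        yield outcome
        time.sleep(delay)  # Pause between the end of one run and the next.


if __name__ == "__main__":
    # "task.py" is a placeholder for whatever script does the real work.
    for outcome in run_repeatedly([sys.executable, "task.py"], delay=5.0):
        print("run finished:", outcome)
```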
I have a cron job with PHP which I want to set up on my webhost, but at the moment the script takes about 20 seconds to run with only 3 users' data being refreshed. If I get 1000 users, it's going to take ages. Is there an alternative to a cron job? Will my web host let me run a cron job which takes, for example, 10 minutes to run?
Your cron job can be as long as you want.
The main problem for you is that you must ensure the next cron job execution does not occur while the previous one is still running. There are a lot of ways to avoid this; basically, use a semaphore.
It can be a lock file or a record in the database. Your cron job should check whether the previous one has finished. It may be a good idea to send yourself an email if a job cannot run because the previous one is taking too long (this way you'll get a notice alerting you that something may be going wrong). By default cron emails a job's output to the account running the job, so depending on how the platform is configured you could rely on this behavior, or open an SMTP connection from the job (or store the alert in a database table).
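As a rough illustration of the "record in the database" variant, here is a Python sketch with `sqlite3` standing in for the real database. The `job_locks` table and the function names are invented for the example, and `notify` is just a placeholder for whatever alerting you choose (email, a row in an alerts table, and so on):

```python
import sqlite3
import time

# Hypothetical lock table: the primary key guarantees one row per job name.
SCHEMA = """
CREATE TABLE IF NOT EXISTS job_locks (
    name TEXT PRIMARY KEY,
    acquired_at REAL
)
"""


def try_acquire(conn, name):
    """Return True if this run now holds the lock, False if another run does."""
    try:
        conn.execute(
            "INSERT INTO job_locks (name, acquired_at) VALUES (?, ?)",
            (name, time.time()),
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        return False  # A row already exists, so a previous run is active.


def release(conn, name):
    conn.execute("DELETE FROM job_locks WHERE name = ?", (name,))
    conn.commit()


def run_exclusively(conn, name, task, notify=print):
    """Run task() only if no other run holds the lock; otherwise notify."""
    if not try_acquire(conn, name):
        notify(f"{name}: previous run still in progress, skipping")
        return False
    try:
        task()
    finally:
        release(conn, name)
    return True


if __name__ == "__main__":
    conn = sqlite3.connect("jobs.db")
    conn.execute(SCHEMA)
    run_exclusively(conn, "refresh_users", lambda: print("refreshing"))
```

With the default `notify=print`, the message goes to standard output, which is exactly what cron would mail to the job's owner.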
If you want some alternatives to cron jobs, you should have a look at work queues. You can mix work queues with a cron job, or use a work queue in an Apache/PHP environment. There are a lot of solutions, but the main idea is to make one single queue of things that should be done and execute them one after the other (but be careful: if you handle these tasks very slowly, you'll end up with a big fat waiting queue).
Cron itself shouldn't have any bearing on how long its 'job' takes to complete. If your jobs are taking 20 seconds to complete, it's PHP's fault, not cron's.
Will my web host let me run a cron job which takes, for example, 10 minutes to run?
Ask your webhost.
If you want to learn about optimizing PHP scripts, take a look at Profiling PHP Code.