I have to populate and update one of my MySQL database tables using a complex and expensive query, based on data selected from other tables. This table doesn't need to be fully up to date every time I query it, but I'd like a cyclic update every 5 minutes.
This automatic update should run indefinitely, and I need to be sure that it never stops.
After some research I've found a few solutions, but I don't know which is better for security and performance.
One of these approaches could be what I need:
Don't create the table at all, and run the complex query from PHP every time to get the desired result.
Create a PHP script that runs cyclically and updates the table, maybe using a cron job.
Update the table using a SQL event.
I think the first solution could be too expensive, since the query is complex and there could be many requests every second, but the result would always be up to date. I have no experience with cron jobs, so I can't tell whether that would be a good idea or not. As for the third solution, I don't yet have the database privileges to run events, but I'd like to know whether it could be a valid approach.
All other solutions are welcome, thanks.
Do not use cron. Think about what will happen if one instance runs beyond 5 minutes and the next one starts up. Eventually you will have hundreds of copies bogged down, stumbling over each other.
Instead, have a single job in a loop doing the update. (OK, you could have a cron job to perform a "keep-alive" task of restarting the query if it dies.)
The job would:
CREATE TABLE new ...;
INSERT INTO new SELECT complex-stuff...;
RENAME TABLE real TO old, new TO real;
DROP TABLE old;
and then loop.
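A minimal PHP sketch of that loop, assuming a PDO connection and keeping the hypothetical table names real/new/old (the expensive SELECT itself is left as a placeholder):

<?php
// Sketch of the single long-running rebuild job described above.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$complexSelect = 'SELECT ...'; // placeholder for the expensive query

while (true) {
    try {
        $pdo->exec('DROP TABLE IF EXISTS new');
        $pdo->exec('CREATE TABLE new LIKE real');            // same structure as the live table
        $pdo->exec('INSERT INTO new ' . $complexSelect);     // fill it with fresh results
        $pdo->exec('RENAME TABLE real TO old, new TO real'); // atomic swap for readers
        $pdo->exec('DROP TABLE IF EXISTS old');
    } catch (Exception $e) {
        error_log($e->getMessage()); // log and keep looping so the update never stops
    }
    sleep(300); // wait 5 minutes before the next rebuild
}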
I would opt for a cron job.
It doesn't clog any request, since it's executed by the operating system.
You can define which user executes the script (crontab -u apache -e).
It's easy to define the interval (e.g. every 5 minutes: */5 * * * * php /path/to/script.php).
It's loggable.
Additional Notes
I had a cron job running under root and it worked just fine. My problem was that the project had a private logging mechanism in which each log file would be created by the apache user. By running the job as root, sometimes a file would be created by root, and after that the scripts executed by apache would not be able to APPEND to the log.
I also had an emailing script that ran once every 2 minutes and got stuck for an hour. It turned out that, because of a bug in the application, an invalid email address (somethingwithoutatsign.com) had been inserted into the database, which made the PHPMailer library throw errors. After that, I added a catch block that sends an email to me whenever an exception is thrown. Now, if the script stops running because of a bad execution, I get to know right away.
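A minimal sketch of that kind of self-notification (the processing function and the recipient address are placeholders):

<?php
// Sketch: wrap the cron script's work so any exception is reported immediately.
try {
    processQueuedEmails(); // hypothetical: whatever the script actually does
} catch (Exception $e) {
    // Email yourself so a silent failure doesn't go unnoticed for an hour.
    mail('admin@example.com', 'Cron script failed', $e->getMessage());
    exit(1); // non-zero exit code so cron logs reflect the failure too
}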
Related
I have a very big database, and my users can sample from it.
They build very large queries that join about 30-40 tables. A query sometimes takes up to 2 minutes to run. I have optimized the server as much as possible, but the data transfer rate is still very low.
So I built an interface around the query, so that the user can save the request, and the result is sent to his browser once the query has been executed.
But there is one problem: I don't know how to scan the database for requests awaiting execution.
I created an event system: I record events in the database and then process them. Separately, I scan the database via cron.
But the problem with cron is that the job doesn't manage to finish within 1 minute before a new cron instance is launched, which increases the load on the server and creates a recursion.
I want to create a PHP task so that, after a user saves a request, it starts executing, but only after the event for its execution has been created.
Could you please advise how best to do this, and which methods can help me?
Thanks
I would use a framework such as Laravel and take advantage of its queue system.
https://laravel.com/docs/5.6/queues#job-events
There is already a queue driver implemented for databases.
"Using the before and after methods on the Queue facade, you may specify callbacks to be executed before or after a queued job is processed."
I guess this can give you an idea of what to do after the query is processed.
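A short sketch of those hooks from the linked docs, typically registered in a service provider's boot method (the notification call is a placeholder):

use Illuminate\Support\Facades\Queue;
use Illuminate\Queue\Events\JobProcessed;

// Runs after any queued job has been processed; this is where you could
// notify the user that their saved query has finished executing.
Queue::after(function (JobProcessed $event) {
    // $event->connectionName and $event->job describe the finished job
    // notifyUser($event->job->payload()); // hypothetical notification helper
});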
Does anybody know how WordPress stores its CronJob events? I'm developing a plugin with multiple concurrent CronJobs, which behaves really strangely. When the plugin is configured, the first event generates some page data over a period of roughly 10-15 minutes, split into multiple packages. These packages reschedule themselves to get the maximum running time without hitting the script execution limit. However, while the first CronJob is executing, the user can start a second one (not the same one; it's from another section), which always results in the second one being scheduled, staying in standby, and getting removed after the first one has finished an execution.
We had problems with long-running CronJobs and the database cache before: some of our data is bundled into an option, and inserting data into this package will overwrite changes made outside of the CronJob. Maybe something similar is happening here. For reference: the reschedule of the first CronJob happens inside said CronJob. Could that be a problem too?
This is how the error is behaving:
Init
Cron 1 is scheduled to a past timestamp.
Cron 1 is starting.
Cron 2 is scheduled to a past timestamp.
Cron 1 is working.
Cron 1 is finished.
Cron 1 is rescheduled to a new timestamp.
Cron 2 gets removed from the event queue.
Cron 1 is starting...
I have checked everything that relates to the scripts themselves: the events are properly registered, have a unique argument (just in case), and even pull a fresh copy of the database options they change before doing so. Limits are set beforehand, and every related function is wrapped in a try-catch block.
My questions so far: Does anybody know what can cause a CronJob to get deleted (besides "wp_clear_scheduled_hook")? Does WordPress store the events as an option? Can a CronJob overwrite these settings when it is running for a long time?
Thanks for your help and greetings
SOLUTION: Thanks #kyon147 for pointing out that WordPress uses the wp_options table to store information about the scheduled events. In case anyone has similar problems: WordPress loads ALL options into its cache when it is called. Meaning that when Cron 1 starts, the "cron" array with your events might look like this:
array('cron1' => 'time')
When something changes this option while the script is still running, the change will not be reflected in the script. Meaning the array will still be as above, even when an event is added from another script/session. So when rescheduling the event INSIDE Cron 1, WordPress took the array above, not the new one. This reset the changes to the state from when Cron 1 was started, and thus the event appeared to be missing.
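Given that, a sketch of the workaround: force WordPress to re-read the option from the database before rescheduling inside the long-running job (the hook name my_plugin_cron1 is made up):

// Inside Cron 1, just before rescheduling:
wp_cache_delete( 'cron', 'options' );       // drop the stale cached copy of the 'cron' option
wp_cache_delete( 'alloptions', 'options' ); // drop the autoloaded-options cache too
if ( ! wp_next_scheduled( 'my_plugin_cron1' ) ) {
    wp_schedule_single_event( time() + 60, 'my_plugin_cron1' ); // reschedule against fresh data
}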
Hello guys, I need advice with this situation:
For example, I have a free classified-ads website where a user can post classified ads.
An ad would be listed on the website for a maximum of 30 days; then, on the 31st day, it would automatically be deleted from the database, as well as its images on the server. The question is:
$db_ad_tbl('id', 'user_id', 'title', 'description', 'timestamp');
What is the right approach for doing this?
Can anyone suggest tutorials/links that cover this same situation?
Another approach that does not require cron is to use MySQL events. If you can come up with the correct query, you can set it up as a recurring event. phpMyAdmin 4.0.x supports event handling from its interface.
See http://dev.mysql.com/doc/refman/5.5/en/events.html.
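A sketch of such a recurring event, assuming the table layout from the question and a DATETIME timestamp column (the event scheduler must be enabled, e.g. SET GLOBAL event_scheduler = ON):

-- Hypothetical recurring event: purge ads older than 30 days, once a day
CREATE EVENT purge_expired_ads
    ON SCHEDULE EVERY 1 DAY
    DO
        DELETE FROM db_ad_tbl
        WHERE `timestamp` < NOW() - INTERVAL 30 DAY;

Keep in mind that an event can only clean up the database rows; removing the ad images from the server would still require a script.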
As Barmar has noted, you should add a cron job for this task. You can write a simple PHP script and then add it to your crontab with something like:
1 0 * * * php -f /path/to/file/clean.php
This means that the PHP file will be executed every day at 00:01, one minute past midnight.
Just a few notes:
The file should not be in your web folder.
You might want to do some tests and report errors by email (such as being unable to connect to the DB).
If you build more of these, you should keep a list of them somewhere in case you switch servers (or the server dies).
If you use a config file (e.g. to store your DB connection details), you should make sure it is accessible to the user the cron job runs as.
Most hosting platforms allow crontab editing and run the jobs as the same user that runs the web server, so this should not be a problem.
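For reference, a minimal sketch of what clean.php could look like, assuming the table from the question, a DATETIME timestamp column, and one image folder per ad (all assumptions):

<?php
// Sketch: remove ads older than 30 days together with their images on disk.
$pdo = new PDO('mysql:host=localhost;dbname=classifieds', 'user', 'pass');

$ids = $pdo->query(
    "SELECT id FROM db_ad_tbl WHERE `timestamp` < NOW() - INTERVAL 30 DAY"
)->fetchAll(PDO::FETCH_COLUMN);

foreach ($ids as $id) {
    $dir = "/var/www/uploads/ads/$id"; // hypothetical per-ad image folder
    foreach (glob("$dir/*") ?: [] as $file) {
        unlink($file);                 // delete each image file
    }
    if (is_dir($dir)) {
        rmdir($dir);                   // remove the now-empty folder
    }
    $pdo->exec('DELETE FROM db_ad_tbl WHERE id = ' . (int) $id);
}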
There is really no other good solution to this than creating a cron job, unless you check the timestamp every time you get the data from the database. You can then delete a row if it is past the expiry date (DELETE FROM my_table WHERE timestamp > [Expiry Timestamp]). This is of course risky, since you will have to include the timestamp check every time you query, and you risk storing everything forever if an expired resource is never requested from the database.
I have 5 cron jobs running a PHP file. The PHP file checks the MySQL database for items that require processing. Since cron launches the scripts all at the same time, some of the items get processed twice, or sometimes even up to five times.
Upon SELECTing an item in one of the scripts, it immediately sends an UPDATE query so that the other jobs won't pick it up again. But it looks like it's still double-processing.
What can I do to prevent the other scripts from processing an item that was previously selected by another cron job?
This issue is called a "race condition". In this case it happens because SELECT and UPDATE, though called one after another, are not a single atomic operation. Therefore, there is a chance that two jobs SELECT the same item, then the first does its UPDATE, then the second does its UPDATE, and both proceed to run the job simultaneously.
There is a workaround, however.
You could add a field to your table containing the ID of the current cron worker (if you run them all on one machine, this may be the PID). In the worker you do the UPDATE first, trying to reserve a job for it:
UPDATE jobs
SET worker = $PID, status = 'processing'
WHERE worker IS NULL AND status = 'awaiting' LIMIT 1
Then you verify that you successfully reserved a job for this worker:
SELECT * FROM jobs WHERE worker = $PID
If it did not return a row, it means another worker was first to reserve it. You can try again from step 1 to acquire another job. If it did return a row, you do all your processing, and then the final UPDATE at the end:
UPDATE jobs
SET status = 'done', worker = NULL
WHERE id = $JOB_ID
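Put together, one worker could look like this sketch (table and column names follow the queries above; processJob is a placeholder for the real work):

<?php
// Sketch of a worker using the reserve-then-verify pattern described above.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$pid = getmypid();

while (true) {
    // Step 1: try to reserve one awaiting job for this worker.
    $reserved = $pdo->exec(
        "UPDATE jobs SET worker = $pid, status = 'processing'
         WHERE worker IS NULL AND status = 'awaiting' LIMIT 1"
    );
    if ($reserved === 0) {
        break; // nothing left to process
    }
    // Step 2: verify the reservation by fetching the row back.
    $job = $pdo->query("SELECT * FROM jobs WHERE worker = $pid AND status = 'processing'")
               ->fetch(PDO::FETCH_ASSOC);
    if ($job === false) {
        continue; // another worker was first; try to reserve again
    }
    processJob($job); // hypothetical: the actual processing
    // Step 3: mark the job done and free the worker slot.
    $pdo->exec("UPDATE jobs SET status = 'done', worker = NULL WHERE id = {$job['id']}");
}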
I think you have a typical problem that can be solved with semaphores. Take a look at this article:
http://www.re-cycledair.com/php-dark-arts-semaphores
The idea would be, at the start of each script, to ask for the same semaphore and wait until it is free. Then SELECT and UPDATE the DB as you do now, free the semaphore, and start the processing. This is the only way you can be sure that no more than one script is reading the DB while another one is about to write to it.
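A minimal sketch of that approach using PHP's System V semaphore functions (requires the sysvsem extension; the critical-section body is a placeholder):

<?php
// Sketch: serialize the SELECT+UPDATE step across concurrent cron scripts.
$key = ftok(__FILE__, 'j'); // shared key derived from this script's path
$sem = sem_get($key, 1);    // at most one holder at a time

if (sem_acquire($sem)) {    // blocks until the semaphore is free
    try {
        // ... SELECT the next item and UPDATE it as reserved ...
    } finally {
        sem_release($sem);  // let the next script in
    }
}
// ... process the reserved item outside the critical section ...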
I would start again. This train of thought:
"it takes time to process one item, about 30 seconds. If I have five cron jobs, five items are processed in 30 seconds"
This is just plain wrong, and you should not write your code with this in mind.
By that logic, why not make 100 cron jobs and do 100 items per 30 seconds? Answer: because your server is not RoadRunner, and it will fall over and fail.
You should:
Rethink your problem; this is the most important point, as it will help with the two below.
Optimise your code so that it does not take 30 seconds.
Segment your code so that each job does only one task at a time, which will make it quicker and also ensure that you do not get this "double processing" effect.
EDIT
Even with the new knowledge that this runs on a third-party server, my logic still stands: do not start multiple calls that you are not in control of. In fact, this is now even more important.
If you do not know what they are doing with the calls, then you cannot be sure they are processed in the right order, when, or whether at all. So just make one call to ensure you do not get double processing.
A technical solution would be for them to improve the processing time, or for you to cache the responses, but that may not be relevant to your situation.
I've searched the web, and apparently there is no way to launch a PHP script without user interaction.
A few advisors recommended cron to me, but I am not sure it is the right way to go.
I am building a website where auctions are possible, just like eBay. After a certain amount of time the objects are no longer available, and the auction is considered finished.
I would like to know a way to interact with the database automatically.
When do you need to know if an object is available? -> Only if someone asks.
And then you have the user interaction you are searching for.
It's something different if you want to, let's say, send an email to the winner of an auction. In this case you'd need some timer set to the ending time of the auction. The easiest way to do this would be a cron job...
There are several ways to do this. Cron is a valid one of them, and the one I would recommend if it's available.
Another is to check, before handling each request related to an object, whether it is still valid. If it is not, you can delete it from the database on the fly (or do whatever you need to) and display a different page.
Also, you could store in the database the time at which your time-based script last ran and compare it with the current time. If the delay is large enough, you can run your time-based code. However, this is prone to race conditions if multiple users hit the page at the same time, so the script may run multiple times (though this can perhaps be avoided using locks).
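A sketch of that check-on-request idea for the auction case (table and column names are made up for illustration):

<?php
// Sketch: lazily close an expired auction whenever its page is requested.
$pdo = new PDO('mysql:host=localhost;dbname=auctions', 'user', 'pass');
$auctionId = (int) $_GET['id'];

$stmt = $pdo->prepare('SELECT * FROM auctions WHERE id = ? AND ends_at > NOW()');
$stmt->execute([$auctionId]);
$auction = $stmt->fetch(PDO::FETCH_ASSOC);

if ($auction === false) {
    // Missing or past its end time: mark it finished and show a different page.
    $pdo->prepare("UPDATE auctions SET status = 'finished' WHERE id = ?")
        ->execute([$auctionId]);
    echo 'This auction has ended.';
} else {
    // ... render the live auction and accept bids ...
}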
To edit cronjobs from the shell: crontab -e
A job to run every 10 minutes: */10 * * * * curl "http://example.com/finished.php"
TheGeekStuff.com cron Examples
Use a heartbeat/bot implementation.
A cron job that runs fairly frequently, or a program that starts on boot and runs continuously (perhaps sleeping periodically), is the way to go. With a cron job you'll need to make sure you don't have two running at any given time, or write it such that it doesn't matter if more than one is working at once. With a "resident" program you'll need to figure out how to handle the case where it crashes unexpectedly.
I wouldn't rely on this mechanism to actually close the auction, though. That should be handled in your database/website. That is, the auction has a close time, and either the database constraints or your code makes it impossible to bid on a closed auction. Notifying the winner and seller, setting up the payment process, etc. are things your service/scheduled task could do.