On the webpage there is a Google map where users can pan to a location they are interested in and sign up for alerts of new jobs by pressing a button. The saved location of interest is defined by the bounds of the Google map. Whenever a new job appears within those bounds, an email alert is sent to the user at a frequency they choose (every hour or every day).
Problem: I am confused about how I should process all the alerts for all users.
Currently I am thinking of using one cron job, running every hour, over a table of lat1, lng1, lat2, lng2, user_id rows for hourly alerts, and another cron job, running once a day (say 9pm), over a similar table for daily alerts. Each cron job loops through the individual users' lat/lng pairs that define their Google map bounds and queries the main jobs database for any jobs with a posting timestamp within the last hour (or day). If there are any, an email alert is sent.
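Roughly, the per-user check each cron run would perform looks like this (PDO and all table/column names are just placeholders; lat1/lng1 is assumed to be the south-west corner and lat2/lng2 the north-east):

```php
<?php
// Illustrative only -- adjust connection details and names.
$pdo = new PDO('mysql:host=localhost;dbname=jobs', 'user', 'pass');

$bounds = $pdo->query("SELECT user_id, email, lat1, lng1, lat2, lng2
                         FROM hourly_alerts");
$jobs = $pdo->prepare("SELECT id FROM jobs
                        WHERE lat BETWEEN :lat1 AND :lat2
                          AND lng BETWEEN :lng1 AND :lng2
                          AND posted_at >= NOW() - INTERVAL 1 HOUR
                        LIMIT 1");

foreach ($bounds as $b) {  // one row per saved map view
    $jobs->execute([':lat1' => $b['lat1'], ':lat2' => $b['lat2'],
                    ':lng1' => $b['lng1'], ':lng2' => $b['lng2']]);
    if ($jobs->fetchColumn() !== false) {
        // mail($b['email'], 'New jobs in your area', ...);
    }
}
```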
This seems like a lot of work for the server, especially once there are 5,000 users' location preferences and 1,000,000 jobs in the database (30-ish minutes to finish the cron job?). I am stuck here and would like your opinions.
Instead of searching everything every time the cron runs (assuming I'm reading correctly that that's what you're doing), I'd consider performing that check when the new job is added:
A job is added to the system. The system checks for any matching boundaries, and for each match stores that info in a separate table. Stick two extra columns in this new table, one for hourly sending, one for daily.
On the hourly check, just send those where the hourly flag hasn't yet been set, and for the daily, send those where the daily flag hasn't been set.
Then delete any rows where both flags have been set.
Doing it this way, you break the work up from one massive check on each cron run (all alerts, all boundaries) into one smaller check per new job (one job, all boundaries).
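A rough sketch of that flow, with invented table names (pending_alerts, boundaries) and PDO assumed:

```php
<?php
// Runs when a new job is added: record every saved boundary it falls inside.
$pdo = new PDO('mysql:host=localhost;dbname=jobs', 'user', 'pass');

// Example values; in practice they come from the job that was just inserted.
$jobId = 123; $jobLat = 1.352; $jobLng = 103.82;

$pdo->prepare("INSERT INTO pending_alerts (user_id, job_id, hourly_sent, daily_sent)
               SELECT b.user_id, :job_id, 0, 0
                 FROM boundaries b
                WHERE :lat BETWEEN b.lat1 AND b.lat2
                  AND :lng BETWEEN b.lng1 AND b.lng2")
    ->execute([':job_id' => $jobId, ':lat' => $jobLat, ':lng' => $jobLng]);

// Hourly cron: mail rows WHERE hourly_sent = 0, then set hourly_sent = 1.
// Daily cron:  the same idea with daily_sent.
// Cleanup:     DELETE FROM pending_alerts WHERE hourly_sent = 1 AND daily_sent = 1;
```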
I think you can probably create two cron jobs with these frequencies:
by hour,
by day,
(or any frequency you like).
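For example, the crontab entries could look like this (script paths are illustrative):

```
0 * * * *  php /var/www/alerts/send_hourly.php
0 21 * * * php /var/www/alerts/send_daily.php
```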
Rather than processing all the alerts for all users, why not create a task file with the details of the job whenever a user subscribes to a location in your PHP CodeIgniter app, e.g. user_id, location (coordinates), frequency. The exact details of this task file depend on your situation and you will need to work them out for your system. Then place the task file in a directory.
Then, based on the frequency specified above, create a general PHP script that is called at that frequency. This script loops through the directories, processes the task files, and sends out the emails. This way you don't need to scan the whole database. There are also minor details like removing, updating, and deleting task files, but those are entirely implementation related.
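A very rough sketch of the idea, assuming one JSON task file per subscription and one directory per frequency (all paths and names invented):

```php
<?php
$taskDir = '/var/tasks/hourly/';  // e.g. /var/tasks/daily/ for daily alerts

// On subscribe (e.g. in a CodeIgniter controller):
$userId = 42;                                        // example values
$task   = ['user_id'   => $userId,
           'bounds'    => [1.29, 103.77, 1.35, 103.87],
           'frequency' => 'hourly'];
file_put_contents($taskDir . $userId . '.json', json_encode($task));

// In the cron script for that frequency:
foreach (glob($taskDir . '*.json') as $file) {
    $task = json_decode(file_get_contents($file), true);
    // query for new jobs inside $task['bounds'] and mail the user
}
```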
Side note: this is probably irrelevant since you tagged this question with PHP, but in case you'd like to know, Quartz does exactly what you want, though it is in Java. You can look it up if you want.
Related
I am developing a Web Application for businesses to track the status of their repairs & part orders that is running on LAMP (Linux Apache MySQL PHP). I just need some input as to how I should go about allowing users to customize the frequency of email notifications.
Currently, I just have a cron job running every Monday at 6:00 AM that runs a PHP script which emails each user a list of their unprocessed jobs. But I would like to give users the flexibility of choosing not only the time the emails are sent, but the days of the week as well.
One idea I had was, some way or another, to store their email notification preferences in a MySQL database, and then write a PHP script that sends the notification email only if the current date/time fits the criteria they have set, with code to prevent it from being sent twice within the same cycle. Then I could just run the cron job every minute, or every 5, or whatever.
Or would it be better to somehow create individual cron jobs for each user programmatically via PHP?
Any input would be greatly appreciated! :)
No, you are right.
Individual crons will consume a lot of resources. Imagine 10k users each requesting mail at a different time... that implies 10k tasks.
The best solution is to create one cron task that runs over your users and takes the correct actions:
Iterate over your users, check the date/time they set up, detect who is due, and send the mail, adding a flag somewhere to say "it's done" (an attribute like last_cron_scandate or next_calculated_cron_scandate could be a good solution).
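A minimal sketch of that single cron (run every few minutes), assuming invented columns notify_time, notify_days (e.g. "Mon,Wed,Fri"), and last_cron_scandate on the users table:

```php
<?php
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$now = new DateTime();

$users = $pdo->query("SELECT id, email, notify_time, notify_days, last_cron_scandate
                        FROM users");
foreach ($users as $u) {
    $dueToday   = strpos($u['notify_days'], $now->format('D')) !== false;
    $timeHit    = $u['notify_time'] <= $now->format('H:i:s');
    $notSentYet = $u['last_cron_scandate'] < $now->format('Y-m-d');
    if ($dueToday && $timeHit && $notSentYet) {
        // mail() the unprocessed-jobs summary here, then flag it as done:
        $pdo->prepare("UPDATE users SET last_cron_scandate = CURDATE() WHERE id = ?")
            ->execute([$u['id']]);
    }
}
```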
My application has a VIEW of another table that displays a list of all recorded high scores for the last calendar week. Each week, I'd like to increment a value in the row of every user who is in the top X of this list, to indicate how many weeks they've been in said top X. Since I'm only using a date filter and there's no sort of server-side "event" firing off each week, I'm not sure of the most efficient way to register this information on a weekly basis.
One way I can think of doing so is to simply run some sort of "check" every time a user logs on, and the first user to log on each calendar week shoulders the burden of telling the server to increment the top X users of last week. While this seems a bit hokey to me, I will gladly do it with enough Internet Approval.
The main languages I'm working with are MySQL and PHP, in case it's of relevance.
You could write an update script that 1) checks when it last ran and aborts if that was this week, 2) updates the last-run timestamp, and 3) processes the previous week. You can then schedule this script with your operating system, either by running it with php or by using e.g. wget to "download" the page. On Linux you'd normally use cron, and Task Scheduler on Windows. This is much better than relying on the first user, because that can lead to race conditions where it is hard to ensure the processing isn't run twice (in parallel).
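A minimal sketch of that guard, assuming a one-row weekly_meta table holding the last run date and that the weekly VIEW (called top_scores_last_week here) already returns only the top X users:

```php
<?php
$pdo = new PDO('mysql:host=localhost;dbname=game', 'user', 'pass');

// 1) Abort if we already ran during this ISO week.
$last = $pdo->query("SELECT last_run FROM weekly_meta")->fetchColumn();
if ($last !== false && date('oW', strtotime($last)) === date('oW')) {
    exit;
}
// 2) Update the last-run timestamp (row assumed to exist).
$pdo->exec("UPDATE weekly_meta SET last_run = CURDATE()");
// 3) Process the previous week.
$pdo->exec("UPDATE users u
              JOIN top_scores_last_week t ON t.user_id = u.id
               SET u.weeks_in_top = u.weeks_in_top + 1");
```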
You should consider running a cron job. You can run a check each time a user logs in but this really isn't ideal, especially if your system is going to scale up.
I am putting together an interface for our employees to upload a list of products for which they need industry stats (currently doing 'em manually, one at a time).
Each product will then be served up to our stats engine via a web service API.
My side will be the one replying: the stats engine will request the "next victim" from my API.
Each list the users upload will have between 50 and 1000 products, and will be its own queue.
For now, queues/lists will likely be added (and removed via completion) approx 10-20 times per day.
If successful, traffic will probably rev up after a few months to something like 700-900 lists per day.
We're just planning to go with a simple round-robin approach to direct the traffic evenly across queues.
The multiplexer would grab the top item off of List A, then List B, then List C and so on until looping back around to List A again ... keeping in mind that lists/queues can be added/removed at any time.
The issue I'm facing is just conceptualizing the management of this.
I thought about storing each queue as a flat file and managing the rotation via relational DB (MySQL). Thought about doing it the reverse. Thought about going either completely flat-file or completely relational DB ... bottom line, I'm flexible.
Regardless, my brain is just vapor locking when I try to statelessly meld a variable list of participants with a circular rotation (I just got back from a quick holiday, and I don't think my brain's made it home yet ;)
Has anyone done something like this?
How did you handle it?
What would you improve if you had to do it again?
Any & all tips/suggestions/advice are welcome.
NOTE: Since each request from our stats engine/tool will be separated by many seconds, if not a couple of minutes, I need to keep this stateless.
List data should be stored in a database, for sure. Your PHP side should have a view giving the status of the system, and the form to add lists.
Since each request becomes its own queue, and all the request-queues are considered equal in priority, the ideal number of tables is probably three: one to list the requests, their priority relative to one another (to determine who goes next in the round-robin), and their processing status; another to list the yet-to-be-processed contents (list items) of each request; and a third to list the processed items from each queue.
You will also need a script that does the actual processing, that is not driven by a user request, but instead by a system-scheduled job that executes periodically (throttled to whatever you desire). This can of course also be in PHP. This is where you would set up your 10-at-a-time list checks and updates.
The processing would be something like:
Select the next set of at most 10 items from the highest-priority queue.
Process them, updating their DB status as they complete.
Update the priority of the above queue so that it is now the lowest priority.
And if new queues are added, they would be added with lowest priority.
Priority could be represented with an integer.
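A sketch of one processing pass, with invented table names; for brevity the "to process" and "processed" tables are collapsed into a single done flag:

```php
<?php
// Assumed schema: queues(id, priority), queue_items(id, queue_id, payload, done)
$pdo = new PDO('mysql:host=localhost;dbname=stats', 'user', 'pass');

// 1) Pick the highest-priority queue (lowest number) that still has work.
$q = $pdo->query("SELECT q.id FROM queues q
                   JOIN queue_items i ON i.queue_id = q.id AND i.done = 0
                  ORDER BY q.priority ASC LIMIT 1")->fetchColumn();

if ($q !== false) {
    // 2) Process at most 10 of its pending items.
    $items = $pdo->prepare("SELECT id, payload FROM queue_items
                             WHERE queue_id = ? AND done = 0 LIMIT 10");
    $items->execute([$q]);
    foreach ($items as $item) {
        // ... fetch the stats for $item['payload'], then:
        $pdo->prepare("UPDATE queue_items SET done = 1 WHERE id = ?")
            ->execute([$item['id']]);
    }
    // 3) Send this queue to the back of the rotation.
    $max = $pdo->query("SELECT MAX(priority) FROM queues")->fetchColumn();
    $pdo->prepare("UPDATE queues SET priority = ? WHERE id = ?")
        ->execute([$max + 1, $q]);
}
```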
Your users would need to wait patiently for their list to be processed and then view or download the result. You might set up an auto-refresh script for this on your view page.
It sounds like you're trying to implement something that Gearman already does very well. For each upload / request, you can simply send off a job to the Gearman server to be queued.
Gearman can be configured to be persistent (just in case things go to hell), which should eliminate the need for you logging requests in a relational database.
Then, you can start as many workers as you'd like. I know you suggest running all jobs serially, which you can still do, but you can also parallelize the work, so that your user isn't sitting around quite as long as they would've been if all jobs had been processed in a serial fashion.
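For example, with the PECL Gearman extension, the submit side and a worker might look roughly like this (host, port, and the function name process_list are assumptions):

```php
<?php
// Submit side: queue an uploaded list as a background job.
$listRows = [['product' => 'SKU-1'], ['product' => 'SKU-2']]; // example payload
$client = new GearmanClient();
$client->addServer('127.0.0.1', 4730);
$client->doBackground('process_list', json_encode($listRows));

// worker.php -- start as many of these as you want in parallel:
$worker = new GearmanWorker();
$worker->addServer('127.0.0.1', 4730);
$worker->addFunction('process_list', function (GearmanJob $job) {
    $rows = json_decode($job->workload(), true);
    foreach ($rows as $row) {
        // call the stats engine for each product ...
    }
});
while ($worker->work());
```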
After a good night's sleep, I now have my wits about me (I hope :).
A simple solution is a flat file for the priorities.
Have a text file simply with one List/Queue ID on each line.
Feed from one end of the list, and add to the other ... simple.
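Something like this, ignoring locking for the moment (concurrent requests would need flock() or similar):

```php
<?php
// rotation.txt holds one List/Queue ID per line.
$ids  = file('rotation.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$next = array_shift($ids);   // feed from one end...
$ids[] = $next;              // ...and add to the other
file_put_contents('rotation.txt', implode(PHP_EOL, $ids) . PHP_EOL);
// $next is the queue to serve the current request from
```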
Criticisms are welcome ;o)
Thanks #Trylobot and #Chris_Henry for the feedback.
I'm developing a web game (JS, PHP, MySQL) in which one clicks a button to start an action that takes time to complete (let's say 10 hours), and when it finishes some points are added to that player's total. The problem is that I need those points to be added even if the player is not online when the action finishes: for example, I need the rankings updated, or an email sent to the player.
I thought about a cron job constantly checking for ending actions, but I think that would kill the resources (constantly checking against the actions of thousands of players...).
Is there a better solution to this problem?
Thanks for your attention!!
You can just write into your database when it's finished and, when the user logs in, add the earned points to their account. You can also check with a cron job. Even if you have millions of users, this will not kill your server.
Cron is perfect for this. You could write your tasks as stored procedures, then have cron run an SQL script that calls the stored procedure to update the records of your players.
Databases are designed to work with thousands and millions of pieces of information efficiently, so I don't think the idea that it will kill system resources is a valid one unless your hosting system is really constrained already.
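For instance, a procedure along these lines, called from cron every minute (all table and column names are invented):

```sql
-- Credit due points for every action whose end time has passed.
DELIMITER //
CREATE PROCEDURE finish_due_actions()
BEGIN
  UPDATE players p
    JOIN actions a ON a.player_id = p.id
     SET p.points = p.points + a.points,
         a.done   = 1
   WHERE a.done = 0
     AND a.ends_at <= NOW();
END //
DELIMITER ;
```

with a crontab entry such as `* * * * * mysql game_db -e "CALL finish_due_actions()"` (credentials supplied via a .my.cnf file).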
If you want to be safe against cheating you need to do the checking on the server anyway. If the "waiting" happens within JavaScript on the client, one could easily decrease the remaining time.
So you need to send the job to the server (which is assumed to be safe against clock modifications) and the server will determine the end timestamp. You could store your jobs in a queue.
If you only need this information for the user himself, you can just look at the queue when the user logs in. Otherwise run a cron job every minute (or so). This job will mark as finished all jobs whose timestamp is in the past (and remove them from the database).
If you need more precise checking you will need to come up with an alternative server side solution that is doing this more often (e.g. a simple program polling the database every few seconds).
Well, I'm currently developing a browser multiplayer game and having some problems with the design. It is an Ogame-like game (http://www.ogame.org/) using PHP/JS/MySQL.
What I want is for players to launch an action (cut wood) that finishes XX minutes later.
So basically my idea was to create an on-the-fly cron job which, after XX minutes, launches a SQL query that adds XX resources to the player in the DB.
Another problem is that players can navigate on the sea with their ships, so if they are moving to a certain destination they will cover XX meters every 5 minutes, for example.
The thing is that all the players can see each other on the sea. So basically, I can't wait 10 minutes to add XX meters to the player's ship; it has to be done every 5 minutes...
EDIT: So basically I need something like a MySQL job, or an infinite loop that is constantly checking a table. This table will contain the end time (timestamp) of every action, and the job has to execute a SQL query whenever that time matches the CURRENT_TIME.
Hope you guys have understood my problem ;)
What you want is a "job system", or "queueing system", or "event system", or a "message queue".
These can be quite complex to build, but a simple version might just look like this:
You have a database table that just stores a queue of "messages", one message per row, each with a field specifying when it should be handled: either immediately, after some timespan, etc.
Your application inserts into this table as necessary.
You have a separate daemon running whose job it is to handle this queue intelligently. For example, it pulls down any events that are ready to be handled, based on the criteria you specified when you inserted the event row, and then handles them. You can run this in a never-ending while() loop, and run the script itself as a background process on your server. If you handle memory management well, the script can run forever.
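A toy version of such a daemon, with invented table and column names:

```php
<?php
// Run with `php daemon.php &` (or under supervisord and the like).
$pdo = new PDO('mysql:host=localhost;dbname=game', 'user', 'pass');

while (true) {
    $due = $pdo->query("SELECT id, type, payload FROM messages
                         WHERE handle_at <= NOW() AND handled = 0
                         LIMIT 50")->fetchAll();
    foreach ($due as $msg) {
        // dispatch on $msg['type'] here, then mark it done:
        $pdo->prepare("UPDATE messages SET handled = 1 WHERE id = ?")
            ->execute([$msg['id']]);
    }
    if (!$due) {
        sleep(1); // nothing ready; don't hammer the DB
    }
}
```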
There are a ton of issues with this setup, such as:
What happens if you need to divide up the tasks among multiple servers?
What happens when the daemon processes events more slowly than new ones are getting inserted?
A better way to avoid these issues is to throw some money at a 3rd-party managed message queue service (like Amazon SQS) or use a prebuilt framework that supports this (like Gearman). That way you can pull one event at a time from multiple machines that don't really have to care about how backlogged the system is; they just happily churn away at events.
Rather than having a specific script run in the future, you could have an events table in your database. So when the user initiates the chop wood task (at say 10:30 AM), you add an event to increment their wood balance by 5 at 10:35 AM.
Then you have a cron script which runs every minute. It gets the current time and looks up the table for any events with a time less than or equal to the current time. You then perform those actions and remove the event records from the table. Or, if you need to keep a history of these things for your players, you can have a column called processed which tracks whether each event has been performed; in that case your cron script looks for unprocessed events dated before the current time.
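A minimal sketch of that cron script, using the processed-column variant (all names are made up):

```php
<?php
$pdo = new PDO('mysql:host=localhost;dbname=game', 'user', 'pass');

$events = $pdo->query("SELECT id, player_id, wood_delta FROM events
                        WHERE run_at <= NOW() AND processed = 0");
foreach ($events as $e) {
    $pdo->prepare("UPDATE players SET wood = wood + ? WHERE id = ?")
        ->execute([$e['wood_delta'], $e['player_id']]);
    $pdo->prepare("UPDATE events SET processed = 1 WHERE id = ?")
        ->execute([$e['id']]);
}
```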