Here is the scenario: I'm using a Beanstalkd queue to send emails to a huge list of addresses (50,000+). Each email has to have some unique content, so the fired job loops over all the addresses, generates the content, and sends the mail.
Sometimes the user may want to cancel the operation in the middle of sending: for example, while the job is running and after the mail has gone out to, say, 20,000 addresses, the user clicks "Stop", which should "delete" the job.
What I have done so far is get hold of the running job instance: Queue::push() returns the job ID, so I save that ID in the DB, and when I want to stop the job this is what I tried:
$phean = Queue::getPheanstalk();
$res = $phean->peek($Job_ID); // returns a Pheanstalk_Job
$job = new \Illuminate\Queue\Jobs\BeanstalkdJob(app(), $phean, $res, 'default');
$res = $job->delete();      // returns NOT_FOUND ??
$data = $job->getRawBody(); // returns correct data, so I'm sure this is the right job instance
So why do I get NOT_FOUND, even though when I run
supervisorctl tail -f queuename
I can see that the job is still running and outputting content?
Any help? If there is a better approach than trying to get the job and delete it this way, I'm open to suggestions. I thought about saving the job ID in the database as (ID, status); when I want to delete the job I change the status, and the loop running inside the job checks it on every iteration (or maybe every 10), and if the status equals 1, for example, it calls $job->delete(). But that will be slow, as it hits the DB on every pass through the loop.
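For reference, that periodic-check idea could look roughly like this; the jobs_control table, its columns, the payload key, and checking only every 10th address are illustrative assumptions, not working code from the question:

// Inside the queued job's handler (Laravel 4-style fire($job, $data) assumed);
// the jobs_control table and the control_id payload key are illustrative only.
foreach ($addresses as $i => $address) {
    if ($i % 10 === 0) { // hit the database only every 10 addresses to limit the overhead
        $row = DB::table('jobs_control')->where('id', $data['control_id'])->first();
        if ($row && $row->status == 1) { // 1 = stop requested
            $job->delete();              // remove the mail job from the tube
            return;                      // stop sending
        }
    }
    // ... generate the unique content and send the mail to $address ...
}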
So you have a main job, which you reserve() and hold open, and in that job you create many emails directly.
Since the job you are trying to delete is currently reserved, you can't delete it. Even if you could, how would the currently running job be informed by Beanstalkd?
Instead, I would have the main loop check for any jobs on some separate control tube (you could do a quick check every, say, 10 or 100 emails sent) - just to request a new job, but without waiting if there isn't anything there. If there is a job there, then the main process cleans up and exits.
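As a rough sketch of that control-tube check (the tube name, the loop counter, and the Pheanstalk 2/3-style reserveFromTube() call are assumptions; $job is the queue job instance passed to the handler):

// Inside the mass-mail loop: poll a separate control tube every 100 addresses.
$pheanstalk = Queue::getPheanstalk();

foreach ($addresses as $i => $address) {
    if ($i % 100 === 0) {
        $stop = $pheanstalk->reserveFromTube('mailer-control', 0); // timeout 0: don't wait
        if ($stop !== false) {
            $pheanstalk->delete($stop); // consume the STOP message
            $job->delete();             // remove the mail job itself
            return;                     // clean up and exit
        }
    }
    // ... generate the unique content and send to $address ...
}

The "Stop" button then only needs to do something like $pheanstalk->useTube('mailer-control')->put('STOP').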
Another idea is not to actually send email in the main loop, but instead put the details of what emails to send, one per address, into the queue. Other processes read that mass-queue and start sending emails, but again, also read a control tube (with a higher priority message that would be returned ahead of the lower-priority email/details message). If there's anything in the control tube, stop sending emails. You would need at least as many STOP messages in the control tube as you have workers.
Related
I am developing a Web Application for businesses to track the status of their repairs & part orders that is running on LAMP (Linux Apache MySQL PHP). I just need some input as to how I should go about allowing users to customize the frequency of email notifications.
Currently, I just have a cron job running every Monday at 6:00 AM that runs a PHP script that emails each user a list of their unprocessed jobs. But I would like to give users the flexibility of choosing not only the time the emails are sent at, but the days of the week as well.
One idea I had was, some way or another, storing their email notification preferences in a MySQL database and then writing a PHP script that sends the notification email only if the current date/time fits the criteria they have set, with code to prevent it from being sent twice within the same cycle. Then I could just run the cron job every minute, or every 5, or whatever.
Or would it be better to somehow create individual cron jobs for each user programmatically via PHP?
Any input would be greatly appreciated! :)
No, your first idea is the right one.
Individual crons would consume a lot of resources. Imagine 10k users each asking for mail at a different time: that implies 10k scheduled tasks.
The best solution is a single cron task that runs over your users and takes the correct actions.
Iterate over your users, check the date/time each one has set up, detect when a notification is due, and send the mail while recording a flag somewhere that says "it's done" (an attribute such as last_cron_scandate or next_calculated_cron_scandate would be a good solution).
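Roughly, that single cron task could look like this; the users table, its columns, and sendUnprocessedJobsMail() are assumptions for the sketch:

// Run this script from a single cron entry, e.g. every 5 minutes.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass'); // connection details assumed
$now = new DateTime();

// users whose chosen weekday/time has arrived and who have not been notified today
$stmt = $pdo->prepare(
    'SELECT * FROM users
     WHERE notify_weekday = :weekday
       AND notify_time <= :time
       AND (last_cron_scandate IS NULL OR last_cron_scandate < :today)'
);
$stmt->execute([
    ':weekday' => $now->format('N'),     // 1 (Mon) .. 7 (Sun)
    ':time'    => $now->format('H:i:s'),
    ':today'   => $now->format('Y-m-d'),
]);

foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $user) {
    sendUnprocessedJobsMail($user); // hypothetical mailer for the existing weekly digest
    $pdo->prepare('UPDATE users SET last_cron_scandate = :today WHERE id = :id')
        ->execute([':today' => $now->format('Y-m-d'), ':id' => $user['id']]);
}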
I'm currently working on email notifications for a social media website, and I want to send users an email notification whenever they are not viewing the site (at least for a certain period: 30 mins, 1 h, etc.).
I'm considering using a cron job to send those email notifications and firing the cron job every 30 minutes.
Let's say user A commented on user B at 2014/8/13 18:39:00; there would be a row in the comment table of the database like
comment_user | user_received | comment_send_time   | view_or_not(y/n) | email(y/n)
user_A       | user_B        | 2014/8/13 18:39:00  | n                | n
In my cron job PHP script, I would check whether the interval between the current time and the comment_send_time is greater than 30 mins and whether view=n and email=n; if so, the cron job sends user_B an email notification of the new comment, and after the email has been sent successfully it updates the email flag to y, so that it won't send a redundant notification later.
My concern is: if I run the cron job every 30 mins, will it harm server performance? Is a cron job a proper way to handle this task, and what other options are there?
Personally, I'd have the job set to run every minute and have the script running a very lightweight sql query to check for new alerts.
If your script is heavy and takes longer to run than 1 minute, then scale it back or performance tweak it.
To answer your question directly, yes, it will affect performance. However, if you have a lightweight script, that hit is negligible. If you're sending out thousands of emails every minute, it's time to scale that gear to its own server. At that point you should be making enough money to support another server, or you need to rethink your strategy for alerting users. If you need to start scaling, then start looking at message queueing and task/work queues like RabbitMQ, or Laravel's queue system.
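For instance, the lightweight per-minute check could boil down to something like this; the comments table mirrors the question, while the id column and sendCommentNotification() are assumptions:

$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass'); // connection details assumed

// unseen comments older than 30 minutes that have not been emailed yet
$stmt = $pdo->query(
    "SELECT * FROM comments
     WHERE view_or_not = 'n'
       AND email = 'n'
       AND comment_send_time <= NOW() - INTERVAL 30 MINUTE"
);

foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $comment) {
    sendCommentNotification($comment); // hypothetical mailer
    $pdo->prepare("UPDATE comments SET email = 'y' WHERE id = ?")->execute([$comment['id']]);
}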
I have to send bulk emails to users. I'm thinking of having an endless loop in a cron job, where I fetch a few dozen or a few hundred users and send the emails one by one, updating the table to record that each email was sent. I should also sleep for some interval once each packet of a dozen (or a hundred) users has received its email. Basically it looks like:
while (true) {
    // fetch notifications where the email has not been sent yet (placeholder)
    $notifications = fetchUnsentNotifications();

    foreach ($notifications as $notification) {
        // 1) send email
        // 2) update table - email was sent
    }

    sleep(5);
}
Now, is this all right to use, or is it considered bad practice?
I know I can also use multiple crons, let's say every minute, and prevent overlapping with a lock file: as soon as a cron starts, if the lock file exists (so another cron is still running) it should either
a) wait for some time for the first cron to finish before starting,
or
b) just return empty, allowing the next cron run to do the job as soon as the ongoing one is done.
The problem with a) is: what if the crons take a lot more time than expected? After a while I will have a bunch of crons in a "waiting" state. With b), what if the first cron ends immediately after the second cron returns empty? Then I have a gap of roughly one minute, and I need to send emails to users as soon as possible.
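For reference, option b) is typically implemented with a non-blocking flock; a rough sketch, with an illustrative lock-file path:

// Exit immediately if another cron run still holds the lock (option b).
$fp = fopen('/tmp/send-notifications.lock', 'c');

if (!flock($fp, LOCK_EX | LOCK_NB)) {
    exit(0); // another run is still going; the next minute's cron will try again
}

// ... fetch and send the pending notifications here ...

flock($fp, LOCK_UN);
fclose($fp);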
Also, question 2: which is better performance-wise, one cron in a loop or multiple crons?
Thanks
What you are describing is a daemon, not a cron task.
There are lots of daemons that run continuously, so no, it's not a bad practice to do that.
If you want the daemon automatically restarted if it crashes, you could have a watchdog task, which continuously checks that the daemon is running, and starts a daemon process if one isn't running.
Another alternative (as you describe) is to have a cron task that occasionally attempts to start the daemon; the startup should detect whether the daemon process is already running. If it's already running, leave it be and just exit. If it's not running, then start another one (in the background, as a detached process). Either way, the cron task completes quickly.
(And it doesn't matter one whit whether the daemon connects to MySQL.)
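A rough sketch of that cron-launched starter, assuming pgrep is available (as it is on Debian) and with illustrative file paths:

// starter.php - run from cron every few minutes; starts the daemon only if it isn't already running.
exec('pgrep -f "php /path/to/daemon.php"', $pids);

if (!empty($pids)) {
    exit(0); // the daemon is already running - leave it be
}

// not running (or it crashed): start a fresh, detached daemon and let this cron task exit quickly
exec('nohup php /path/to/daemon.php >> /var/log/mailer-daemon.log 2>&1 &');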
Personally, I dislike endless loops. I prefer a cron job running every 5 minutes for example.
And you can optimize your script to send the maximum number of emails within the cron job's time window.
You need to estimate how many emails you will send per minute. I'll assume 1 email per second.
So my idea is:
Query for 290 notifications [at 1 email per second that fills the 5-minute window, leaving about 10 seconds to fetch and update the rows] and mark them with a "sending" status (so the next cron run doesn't pick them up).
Send the emails and save each result in an array (for a later update).
When finished, update the notifications' status (sent or error).
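A rough sketch of those three steps; the notifications table, its status values, the batch column, and sendNotificationMail() are all assumptions:

$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass'); // connection details assumed

// 1) claim a batch and mark it as "sending" so the next cron run skips it
$batchId = uniqid('batch', true);
$pdo->prepare("UPDATE notifications SET status = 'sending', batch = ? WHERE status = 'pending' LIMIT 290")
    ->execute([$batchId]);

$stmt = $pdo->prepare("SELECT * FROM notifications WHERE batch = ?");
$stmt->execute([$batchId]);

// 2) send and remember each outcome
$results = [];
foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $n) {
    $results[$n['id']] = sendNotificationMail($n) ? 'sent' : 'error'; // hypothetical mailer
}

// 3) write the final statuses back
$update = $pdo->prepare("UPDATE notifications SET status = ? WHERE id = ?");
foreach ($results as $id => $status) {
    $update->execute([$status, $id]);
}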
Just my 2 cents.
On the webpage there is a Google map where the user can change the location to one that he is interested in and sign up for alerts of new jobs by pressing a button. The saved location of interest will be defined by the bounds of the Google map. Whenever a new job appears within those bounds, an email alert will be sent to that user based on a frequency chosen by him (every hour or every day).
Problem: I am confused on how I should process all the alerts for all users.
Currently I am thinking of using a cron job for a table with all the lat1, lng1, lat2, lng2, user_id rows for hourly alerts that runs every hour, and another cron job for another table of daily alerts that runs once a day at, say, 9 pm. The cron job will loop through all the individual users' lat, lng pairs that define the Google map bounds and query the main jobs database for any jobs with a posting timestamp within the last hour (or day). If there are any, an email alert will be sent.
This seems like a lot of work for the server, especially when there are 5,000 users' location preferences and 1,000,000 jobs in the database (30-ish minutes to finish the cron job?). I am stuck here and would like your opinions.
Instead of searching everything every time the cron runs (assuming I'm reading correctly that that's what you're doing), I'd consider performing that check when the alert is added:
Alert added to the system. System checks for any matching boundaries, if any are found then for each match store that info into a separate table. Stick two extra columns in this new table, one for hourly sending, one for daily.
On the hourly check, just send those where the hourly flag hasn't yet been applied, and for the daily, send those where the daily flag hasn't been set.
Then delete any where both have been set afterwards.
Doing it this way, you'll be breaking up the work to be done from one massive check on each cron job (All alerts, all boundaries), to one smaller check for each alert (One alert, all boundaries).
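A sketch of that insert-time matching; the pending_alerts and user_bounds tables, the column names, and the flag handling are assumptions, and it assumes lat1 < lat2 and lng1 < lng2:

// Run when a new job is saved ($pdo is an open PDO connection; $jobId, $jobLat, $jobLng come from the new job).
$stmt = $pdo->prepare(
    'INSERT INTO pending_alerts (user_id, job_id, sent_hourly, sent_daily)
     SELECT user_id, :job_id, 0, 0
     FROM user_bounds
     WHERE :lat BETWEEN lat1 AND lat2
       AND :lng BETWEEN lng1 AND lng2'
);
$stmt->execute([':job_id' => $jobId, ':lat' => $jobLat, ':lng' => $jobLng]);

// The hourly cron sends rows where sent_hourly = 0 and then sets that flag; the daily cron does the
// same with sent_daily, and rows with both flags set can be deleted afterwards.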
I think you can probably create two crons with these frequencies:
by hour.
by day.
(or any frequency you like)
Rather than processing all the alerts for all users, why not, when a user subscribes to a location, have your PHP (CodeIgniter) code create a task file with the details of this job? For example: user_id, location (coordinates), frequency. The exact details of this task file depend on your situation and you will need to work out how it fits into your system. Then place this task file in a directory.
Then, based on the frequency specified above, create a general PHP script to be called at that frequency. This script will loop through the directories, process each task file, and send out the emails. This way you don't have to scan the whole database. There are also minor details like removing, updating, and deleting task files, but those are entirely implementation related.
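A sketch of how that frequency runner might look; the directory layout, the JSON task format, and the two helper functions are assumptions:

// Scan one directory per frequency; a cron at that frequency calls this script with the matching path.
$taskDir = '/var/tasks/hourly';

foreach (glob($taskDir . '/*.json') as $file) {
    // e.g. {"user_id": 1, "lat1": ..., "lng1": ..., "lat2": ..., "lng2": ...}
    $task = json_decode(file_get_contents($file), true);

    $jobs = findNewJobsInBounds($task, '-1 hour');  // hypothetical lookup against the jobs table
    if (!empty($jobs)) {
        sendJobAlertMail($task['user_id'], $jobs);  // hypothetical mailer
    }
}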
Side note: this is probably irrelevant since you tagged this question with PHP, but just in case you'd like to know, Quartz does exactly what you want, though it is in Java. You can look it up if you want.
I want to extract some of the time-consuming things into a queue. For this I found Gearman to be the most widely used, but I don't know if it is the right thing for me.
One of the tasks we want to queue is sending emails, and we want to provide the ability to cancel sending a mail within 1 minute. So it should not work on the job right away, but execute it at now + 1 minute. That way I can cancel the job before then and it never gets sent.
Is there a way to do this?
It will run on Debian and should be usable from PHP. The only thing I found so far was "Schedule a job in Gearman for a specific date and time", but that relies on something not widely available :(
There are two parts to your question: (1) scheduling in the future and (2) being able to cancel the job until that time.
For (1), the at command should work just fine as specified in that question, and the guy even posted his wrapper code. Have you tried it?
If you don't want to use that, consider this scenario:
insert a record for the to-be-sent email into a database, including a "timeSent" column which you set to 1 minute in the future.
have a single gearman worker (I'll explain why single) look at the database for emails that have not been sent (eg some status column = 0) and where timeSent has already passed, and send those.
So, for (2), if you want to cancel an email before it's sent just update its status column to something else.
Your Gearman worker has to be a single one because if you have multiple workers they might fetch and try to send the same email record. If you need multiple, make sure the one that gets the email record first locks it immediately, before any time-consuming operations like actually emailing it (say, by updating that status column to something else).
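A sketch of the worker's claim-then-send step under that scheme; the emails table, the status codes, and sendQueuedEmail() are assumptions, and it relies on the single worker recommended above:

$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass'); // connection details assumed

// claim the due emails first (status: 0 = pending, 1 = cancelled, 2 = claimed, 3 = sent)
$pdo->prepare("UPDATE emails SET status = 2 WHERE status = 0 AND timeSent <= NOW() LIMIT 50")
    ->execute();

foreach ($pdo->query("SELECT * FROM emails WHERE status = 2") as $email) {
    sendQueuedEmail($email); // hypothetical mailer
    $pdo->prepare("UPDATE emails SET status = 3 WHERE id = ?")->execute([$email['id']]);
}

// Cancelling within the first minute is then just:
//   UPDATE emails SET status = 1 WHERE id = ? AND status = 0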