I am making a payment system in PHP which depends on a REST API.
My Business Logic:
If someone submits a request through my system, let's say "transfer money from point A to point B", that transaction is saved in my database with the status "submitted" and then sent to the Mobile Network Operator's API URL, which processes it and returns a status to my system. My system then updates the transaction status in my database to the new status (e.g. "waiting for confirmation") and notifies the user of it.
The problem is:
My application should keep requesting the status at an interval of 10 seconds and show each new status to the user until a final status of "complete" or "declined" is reached. There can be up to 5 statuses, e.g. "waiting", "declined", "approved", "complete"...
I have managed to do this using AJAX, setting time intervals in JavaScript. But it stops requesting if the user closes the browser or anything else happens at their end, which results in my app not knowing whether the money was delivered or not.
I would like to know how I can run these circular tasks in the background using Gearman, without involving JavaScript time intervals. Thanks.
Gearman is more of a worker queue, not a scheduling system. I would probably set up some type of cron job that queries the database and submits the appropriate jobs to Gearman asynchronously. With Gearman, you will want to use libdrizzle or something else for persistent queues, and also some type of GearmanWorker process manager to run more than one job at a time. There are a number of projects that currently do this with varying degrees of success, like https://github.com/brianlmoon/GearmanManager. None of the worker managers I have evaluated have really been up to par, so I created my own that will probably be open-sourced shortly.
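The cron-driven check described above could look roughly like the following. This is only a minimal sketch in Python with the standard library, not Gearman or PHP code. It assumes a `transactions` table with `id` and `status` columns, and `fetch_status` is a hypothetical callable standing in for the operator's REST API call. In the setup described above, the loop body would submit a Gearman job instead of calling the API directly.

```python
import sqlite3

TERMINAL = {"complete", "declined"}  # final statuses named in the question


def poll_pending(conn, fetch_status):
    """Check every unfinished transaction once and store any new status.

    fetch_status is a hypothetical callable (transaction id -> status string)
    standing in for the operator's REST API call.
    """
    marks = ",".join("?" * len(TERMINAL))
    rows = conn.execute(
        f"SELECT id, status FROM transactions WHERE status NOT IN ({marks})",
        tuple(TERMINAL),
    ).fetchall()
    changed = []
    for tx_id, old_status in rows:
        new_status = fetch_status(tx_id)
        if new_status != old_status:
            conn.execute(
                "UPDATE transactions SET status = ? WHERE id = ?",
                (new_status, tx_id),
            )
            changed.append((tx_id, new_status))
    conn.commit()
    return changed  # the caller can notify users about these changes
```

Because the function only looks at rows that have not reached a final status, it keeps working whether or not the user's browser is still open.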
You wouldn't use Gearman in the background for circular tasks, which is normally referred to as polling. Gearman is normally used as a job queue for doing things like video compression, resizing images, sending emails, or other tasks that you want to 'background'.
I don't recommend polling the database, either on the frontend or the backend. Polling is generally considered bad because it doesn't scale. In your JavaScript example, you can see that as your application grows and is used by thousands of users, polling is going to introduce a lot of unnecessary traffic and load on your servers. On the backend, the machine doing the polling is a single point of failure.
The architecture you want to explore is a message queue. It's similar to the Listener/Observer pattern in programming, but applied at the systems level. This will allow a more robust system that can handle interruptions, from a user closing the browser all the way to a backend system going down for maintenance.
Related
There is a long-running process (Excel report creation) in my web app that needs to be executed in the background.
Some details about the app and environment.
The app consists of many instances, where each client has a separate one (with customized business logic), while everything is hosted on our server. The functionality that produces the Excel report is the same.
I'm planning to have one RabbitMQ server installed. One part of the app (the Publisher) will take all report options from the user and put them into a message. A background job (the Consumer) will consume it, produce the report and send it via email.
However, there is a flaw in such a design: say users from one instance queue lots of complicated reports (worth ~10 min of work each) and a user from another instance queues an easy one (1-2 mins). That user will have to wait until the others finish.
There could be separate queues for each app instance, but in that case I would need to create one consumer per instance. Given that there are 100+ instances at the moment, it doesn't look like a viable approach.
I was wondering whether it's possible to have a script that checks all available queues (and consumers) and creates a new consumer for any queue that doesn't have one. There are no limitations on the language for the consumer or for such a script.
Does that sound like a feasible approach? If not, please give a suggestion.
Thanks
If I understood correctly, everything lives on one server: RabbitMQ, the web application, the different instances per client, and the message consumers. In that case I would rather use different topics per message (https://www.rabbitmq.com/tutorials/tutorial-five-python.html) and introduce consumer priorities (https://www.rabbitmq.com/consumer-priority.html). When publishing a message, I would combine a topic with a priority for the message: the publisher knows the number of reports already sent per client and the selected options, and decides whether the priority is high, low or normal.
The logic for pulling messages based on that data would be in the consumer, so the consumer will not take heavy topics when, for example, 3 are already in progress.
Based on the total number of messages in the queue (which is not 100% accurate) and the previous topics and priorities, you can implement a kind of leaky bucket strategy to keep control of resources, e.g. a maximum of 100 reports generated simultaneously.
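To illustrate the fairness idea (no single client monopolising the consumers), here is a small broker-independent sketch in plain Python. It does not use the RabbitMQ API. The class name, the per-client cap and the round-robin policy are all assumptions made for illustration, and the RabbitMQ topic and priority mechanism described above is replaced here by in-memory per-client queues.

```python
from collections import OrderedDict, deque


class FairDispatcher:
    """Round-robin over per-client queues with a cap on jobs in flight per client."""

    def __init__(self, max_in_flight_per_client=3):
        self.queues = OrderedDict()   # client id -> deque of pending jobs
        self.in_flight = {}           # client id -> jobs currently being processed
        self.cap = max_in_flight_per_client

    def submit(self, client, job):
        self.queues.setdefault(client, deque()).append(job)

    def next_job(self):
        """Return (client, job) for the next eligible client, or None."""
        for client in list(self.queues):
            pending = self.queues[client]
            if pending and self.in_flight.get(client, 0) < self.cap:
                self.in_flight[client] = self.in_flight.get(client, 0) + 1
                # Move this client to the back so other clients are served first.
                self.queues.move_to_end(client)
                return client, pending.popleft()
        return None

    def done(self, client):
        """Call when a job for this client has finished."""
        self.in_flight[client] -= 1
```

With this policy, a client that queues one easy report is served after at most one job from each other client, rather than after everything that was queued earlier.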
You can also consider ZeroMQ (http://zeromq.org). It may be more suitable for your case than RabbitMQ because it is simpler and is a brokerless solution.
I'm just curious how you would handle a scenario where a lot of PDFs have to be generated on the server and sent to the user by email. It must not be possible to tamper with the PDF, because it needs to be 100% secure or close to that.
For example the PDF contains the order you just made in a webshop, proof of purchase or something like that.
The application will have a lot of concurrent users. For this question I will use Laravel as a base platform for the web application.
I had the idea of running a cron job at night that will generate all these PDFs at once and send them by email.
What is considered best practise in this scenario?
Given that these will presumably occur throughout the day, a queue may be a better solution than a cron. Every time someone does an action that'd require a PDF, fire off a queue job. A background process will check for queued jobs and process them.
This avoids having a giant backlog, protects you in the case a cron fails, and gets PDFs out to the clients in a more timely fashion.
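The queue-plus-background-worker idea can be sketched as follows. This is a language-agnostic illustration written with Python's standard library, not Laravel code. `render_and_send` is a hypothetical function that would build the PDF for one order and email it. An in-memory queue like this one is lost on restart, so a real deployment would presumably use a persistent queue backend.

```python
import queue
import threading


def start_worker(jobs, render_and_send):
    """Start a background thread that processes jobs until it sees None."""

    def run():
        while True:
            order_id = jobs.get()
            try:
                if order_id is None:      # sentinel value: shut the worker down
                    return
                render_and_send(order_id)
            finally:
                jobs.task_done()

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread
```

The request handler only needs to call `jobs.put(order_id)` and can respond to the user immediately, while the worker picks the job up in the background.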
Now that all of the browsers I like have almost full support for Server Sent Events, I wanted to try implementing it on a site I've been putting off because I hate polling. But I have initial hesitation that I was hoping I could get some help on.
Here is my use case:
User goes to a form, something time-based and competitive, in this case class registration. All things being equal, they have a list of about 30 - 40 classes they are eligible for, and in order to minimize instances of "she logged in first but he hit save first but he didn't mean to hit save but she already chose another class" etc, I want to make the form real-time, so that when someone selects an option, it goes straight into the db and anyone else viewing the form sees that it is filling up. (I'll deal with the stress of people changing their minds later).
So, in a polling scenario, I had to deal with the AJAX calls having to check on the status of 40 spots and update them and setting an interval that could potentially still create collisions.
But with Server Sent Events, I can have the listener get just the spots that need updating, which seems better, but here's where I get stuck:
Is there any risk of the listener getting overloaded? Let's say the script sends 15 messages, back-to-back, about a status change. I see vague mentions of how user agents should handle queued tasks, but it's not clear if that's for establishing a connection or handling server-sent messages
Is this basically just passing the burden of polling from the browser to the server? Does the script have to check the DB every second for changes? Is there any way for the script to be aware or notified when change has occurred? Let's assume that seat requests are sent to requests.php via ajax and that updates.php pushes events back to the browser. Is there a standard and/or clever way for updates to idle until requests has made a commit?
The only solution I can think of is for requests.php to write the committed changes to a flat file (commits.xml perhaps) and updates.php just polls the file size every half-second, thereby keeping the workload to a minimum.
Any better/smarter/more obvious solutions out there?
Polling your database for changes is not a good idea. Instead, you should do inter-process PUB/SUB on the server. To do that, you can use a message queue like RabbitMQ, ZeroMQ or Redis PUB/SUB.
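As a rough picture of what publish/subscribe means here, the following sketch shows listeners that sleep until a commit is published, with no periodic database check. It is an in-process illustration using Python threads only, and the class and method names are made up for this example. Since requests.php and updates.php run as separate processes, a real setup would need a broker such as the ones named above to carry the notification between them.

```python
import threading


class CommitChannel:
    """In-process publish/subscribe: listeners sleep until a commit is published."""

    def __init__(self):
        self._cond = threading.Condition()
        self._events = []          # append-only log of committed changes

    def publish(self, event):
        with self._cond:
            self._events.append(event)
            self._cond.notify_all()   # wake every waiting listener

    def wait_for_events(self, after, timeout=None):
        """Block until events beyond index `after` exist (or the timeout expires).

        Returns (new_events, new_index). new_events is empty on timeout.
        """
        with self._cond:
            self._cond.wait_for(lambda: len(self._events) > after, timeout=timeout)
            return self._events[after:], len(self._events)
```

Each listener remembers the index it last saw and passes it back in, so it receives only the spots that changed since its previous update.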
In my application, the user needs to register through a form, after which I have to send three emails and do some other (huge) database checks. It takes a lot of time. Is it possible to run the whole task as a background process, or is there some other alternative?
If your database activities take too long, then you need to rethink your design. However, if the delay is due to the emails, then just store the emails in the DB or in files. Create a cron job that sends out these queued emails every 5/10/15 minutes (and then deletes them).
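A minimal sketch of that store-then-send approach, in Python with SQLite for illustration. It assumes an `outbox` table with `id`, `recipient`, `subject` and `body` columns, and `send` is a hypothetical mail function supplied by the caller. None of these names come from the original post.

```python
import sqlite3


def queue_email(conn, recipient, subject, body):
    """Called during registration: store the email instead of sending it."""
    conn.execute(
        "INSERT INTO outbox (recipient, subject, body) VALUES (?, ?, ?)",
        (recipient, subject, body),
    )
    conn.commit()


def drain_outbox(conn, send):
    """Called by cron: send each stored email, deleting it only after success."""
    rows = conn.execute(
        "SELECT id, recipient, subject, body FROM outbox ORDER BY id"
    ).fetchall()
    sent = 0
    for row_id, recipient, subject, body in rows:
        send(recipient, subject, body)   # hypothetical mail function; may raise
        conn.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
        conn.commit()
        sent += 1
    return sent
```

Deleting a row only after `send` returns means a failed run leaves the remaining emails in place for the next cron invocation.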
Maybe, once a user is registered, you can flag him as pending in your database.
Then you could defer the work to a Python or PHP routine running continuously in the background, which would look for any pending requests, do the checks, send the emails and finally update the database accordingly.
During this time the user would be in a registered but pending status, but at least from the visitor's point of view, he is not stuck waiting for everything to be processed.
You cannot turn a PHP script that was started through a web server process into a background process.
I would check whether I can optimize the database (you probably have insufficient indexes), and if that doesn't fly, build a second process that gets started regularly (maybe once every five minutes or so) on the CLI side with a cron job, while showing the user a "Thank you for your registration" page...
As per my comment elsewhere, spawning a long running process from PHP is a practical solution bearing in mind a few caveats if the performance problems are unavoidable.
However "send 3 mails" should not take an appreciable amount of time (I don't know what the database checks are). You need to spend some time looking at optimizing the existing process.
Other ways to solve the problem would be conventional batch processing, offloading the heavy lifting to a multi-process/multi-threaded daemon via a network call or asynchronous messaging system, or even a single threaded job processor using a message queue.
I have a web application where users can create topics and also comment on other topics (similar to what we have here on Stack Overflow). I want to be able to send notifications to the participating users of a discussion.
I know the easiest way to go about it is to hook the notification into the script executed when a user interacts with a discussion. As easy as that seems, I believe it's not the most appropriate way, because the user would have to wait until all the email notifications are sent (i.e. until the notification script finishes execution) before he gets the status of his action.
Another alternative I know of is to schedule the execution of the notification script using a cron job. For the notifications to be relevant, the script would be scheduled to execute every 3 to 7 minutes, to make sure the users get notified in a reasonable time.
Now my concern is: will a cron job that runs a script every 3 minutes consume a reasonable amount of system resources, considering that my application is still running on a shared hosting platform?
Also, I am wondering whether it is possible for the comment script to trigger or notify a notification script that sends notifications to the specified email addresses, while the comment script continues its execution without having to wait for the notification script to complete. If this is achievable, then I think it will be the best choice for me.
Thank you very much for your time.
Unless your notification script is enormously resource-intensive and sends dozens or hundreds of messages out on each run, I would not worry about scheduling it every 3-7min on a shared host. Indeed, if you scheduled it for 3 minutes and found performance sagging on your site, then increase it to 4min for a 25% reduction in resources. It's pretty unlikely to be a problem though.
As far as starting a background process, you can achieve that with a system call to exec(). I would direct you to this question for an excellent answer.
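For comparison, this is roughly what "start a process and don't wait for it" looks like, written in Python for illustration rather than with PHP's exec(). The function name `spawn_detached` is made up for this sketch. The key points are the same in either language: redirect the child's output and return without waiting. PHP's exec() documentation notes that output must be redirected for the program to keep running in the background.

```python
import subprocess
import sys


def spawn_detached(args):
    """Start a command and return immediately without waiting for it."""
    return subprocess.Popen(
        args,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,   # POSIX only: detach from the caller's session
    )
```

A comment handler could call something like `spawn_detached([sys.executable, "notify.py", "42"])`, where `notify.py` and its argument are hypothetical, and then return its response to the user right away.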
IMO adding a "hook" to each "discussion interaction" is by far the cleanest approach, and one trick to avoid making users wait is to send back a Content-Length header in the HTTP response. Well-behaved HTTP clients are supposed to read the specified number of octets and then close the connection, so if you send back your "status" response with the proper Content-Length HTTP header (and set ignore_user_abort) then the end user won't notice that your server-side script actually continues on its merry way, generating email notifications (perhaps even for several minutes) before exiting.