Various users will submit their phone numbers to get notified about a certain service that I provide. As soon as I start providing the service, an SMS notification should be sent out to all the users who signed up (submitted their phone number).
foreach ($phone_numbers as $number) {
    // API call to send the SMS
    $msg->send($number);
}
Now the thing that is confusing me: at some point I could have 1000 users signed up, and when the script runs automatically as I add that service, it will take a long time to loop through 1000 numbers and make 1000 API calls.
Is there a better way of doing it?
One approach would be to use a message queue like RabbitMQ. That way, you could just push the task to the queue and have it handled by a separate process, which could run on a different server. It also has the advantage that it should be a bit more robust, since if it fails, your worker script can try again.
I haven't had occasion to use it with a PHP application, but I have used it with Celery for a Django app with good results. I found decent-looking PHP examples at http://blog.teqneers.com/2013/10/simple-spawn-rabbitmq-consumers-with-php/ and http://www.rabbitmq.com/tutorials/tutorial-two-php.html.
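To make that concrete, here is a minimal publisher sketch using php-amqplib (one of the libraries the linked tutorials cover); the queue name 'sms_notifications' and the $phone_numbers array are assumptions:

<?php
// Publisher: runs in the web request and only enqueues the numbers.
// A separate worker process consumes the queue and makes the SMS API calls.
require_once __DIR__ . '/vendor/autoload.php';

use PhpAmqpLib\Connection\AMQPStreamConnection;
use PhpAmqpLib\Message\AMQPMessage;

$connection = new AMQPStreamConnection('localhost', 5672, 'guest', 'guest');
$channel = $connection->channel();
$channel->queue_declare('sms_notifications', false, true, false, false); // durable

foreach ($phone_numbers as $number) {
    $msg = new AMQPMessage($number, ['delivery_mode' => AMQPMessage::DELIVERY_MODE_PERSISTENT]);
    $channel->basic_publish($msg, '', 'sms_notifications');
}

$channel->close();
$connection->close();

The worker is the mirror image: it consumes from 'sms_notifications' and calls $msg->send($number) for each delivery, acknowledging the message only after the API call succeeds, so a failed send gets redelivered.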
Related
There is a long-running process (Excel report creation) in my web app that needs to be executed in the background.
Some details about the app and environment.
The app consists of many instances, one per client (each with customized business logic), and everything is hosted on our server. The functionality that produces the Excel file is the same everywhere.
I'm planning to have one RabbitMQ server installed. One part of the app (the publisher) will take all the report options from the user and put them into a message. Some background job (the consumer) will consume it, produce the report, and send it via email.
However, there is a flaw in such a design: users from one instance might queue lots of complicated reports (worth ~10 minutes of work each), while a user from another instance queues an easy one (1-2 minutes) and has to wait until the others finish.
There could be a separate queue for each app instance, but in that case I would need one consumer per instance. Given that there are 100+ instances at the moment, that doesn't look like a viable approach.
I was thinking of a script that checks all available queues (and consumers) and creates a new consumer for any queue that doesn't have one. There are no limitations on the language for the consumer or for such a script.
Does that sound like a feasible approach? If not, please give a suggestion.
Thanks
If I understood the topic correctly, everything lies on one server: RabbitMQ, the web application, the different per-client instances, and the message consumers. In that case I would rather put a different topic on each message (https://www.rabbitmq.com/tutorials/tutorial-five-python.html) and introduce consumer priorities (https://www.rabbitmq.com/consumer-priority.html). Based on those options, at publish time I would create a combination of topic and message priority: the publisher knows the number of reports already sent per client and the selected options, and can decide whether the message is high, normal, or low priority.
The logic to pull messages based on that data lives in the consumer, so the consumer will not pick up heavy topics when, say, three are already in process.
Based on the total number of messages in the queue (it's not 100% accurate) and on previous topics and priorities, you can implement a kind of leaky-bucket strategy to keep control of resources, e.g. a maximum of 100 reports generated simultaneously. A publish-side sketch follows below.
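As a rough illustration of the topic-plus-priority idea, here is a hedged publisher sketch with php-amqplib; the exchange/queue names, $clientId, $estimatedMinutes, and the priority heuristic are all assumptions:

<?php
use PhpAmqpLib\Connection\AMQPStreamConnection;
use PhpAmqpLib\Message\AMQPMessage;
use PhpAmqpLib\Wire\AMQPTable;

$connection = new AMQPStreamConnection('localhost', 5672, 'guest', 'guest');
$channel = $connection->channel();
$channel->exchange_declare('reports', 'topic', false, true, false);
// x-max-priority enables per-message priorities on this queue
$channel->queue_declare('reports.work', false, true, false, false, false,
    new AMQPTable(['x-max-priority' => 10]));
$channel->queue_bind('reports.work', 'reports', 'report.#');

// Hypothetical heuristic: heavy reports get a low priority so quick ones jump ahead.
$priority = $estimatedMinutes > 5 ? 1 : 9;
$message = new AMQPMessage(json_encode($reportOptions), ['priority' => $priority]);
$channel->basic_publish($message, 'reports', "report.client{$clientId}");

The routing key encodes the client, so a consumer could still bind selectively (e.g. 'report.client42') if you ever do want per-client consumers.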
You could also consider ZeroMQ (http://zeromq.org) for your case; it may be more suitable than RabbitMQ because it is simpler and is a brokerless solution.
I am making a payment system in PHP which depends on a REST API.
My Business Logic:
If someone submits a request through my system, let's say "transfer money from point A to point B", the transaction is saved in my database with status "submitted", then submitted to the Mobile Network Operator's API URL, which processes it and returns a status to my system. I then update the transaction's status in my database to the new status (e.g. "waiting for confirmation") and notify the user of the incoming status.
The problem is:
My application should keep requesting at an interval of 10 seconds to check for a new status and show it to the user, until a final status of "complete" or "declined" is reached; there are about five possible statuses, e.g. "waiting, declined, approved, complete...".
I have managed to do this using AJAX, setting time intervals in JavaScript. But it stops requesting if the user closes the browser or anything else happens at their end, resulting in my app not knowing whether the money was delivered or not.
I would like to know how I can run these recurring tasks in the background using Gearman, without involving JavaScript time intervals. Thanks.
Gearman is more of a worker queue, not a scheduling system. I would probably set up some type of cron job that queries the database and submits the appropriate jobs to Gearman in an async way. With Gearman, you will want to use libdrizzle or something else for persistent queues, and also some type of GearmanWorker process manager to run more than one job at a time. There are a number of projects that currently do this with varying degrees of success, like https://github.com/brianlmoon/GearmanManager. None of the worker managers I have evaluated have really been up to par, so I created my own, which will probably be open-sourced shortly.
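A minimal sketch of that cron-driven submitter, using the pecl gearman extension and PDO; the table, column, and function names are assumptions:

<?php
// Run from cron, e.g. every minute: query pending transactions and hand each
// one to Gearman as an async background job.
$client = new GearmanClient();
$client->addServer('127.0.0.1', 4730);

$pdo = new PDO('mysql:host=localhost;dbname=payments', 'user', 'pass');
$stmt = $pdo->query(
    "SELECT id FROM transactions WHERE status NOT IN ('complete', 'declined')"
);

foreach ($stmt->fetchAll(PDO::FETCH_COLUMN) as $transactionId) {
    // doBackground() returns immediately; a worker does the actual status check.
    $client->doBackground('check_transaction_status', (string) $transactionId);
}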
You wouldn't use Gearman in the background for circular tasks like this; that pattern is normally referred to as polling. Gearman is normally used as a job queue for things like video compression, image resizing, sending emails, or other tasks that you want to 'background'.
I don't recommend polling the database, either on the frontend or the backend. Polling is generally considered bad because it doesn't scale. In your JavaScript example, you can see that as your application grows and is used by thousands of users, polling is going to introduce a lot of unnecessary traffic and load on your servers. On the backend, the machine doing the polling is a single point of failure.
The architecture you want to explore is a message queue. It's similar to the Listener/Observer pattern in programming, but applied at the systems level. This will allow a more robust system that can handle interruptions, from a user closing the browser all the way to a backend system going down for maintenance.
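To show the worker side of such a queue-based design (complementing the cron submitter sketched above), here is a hedged pecl gearman worker; checkStatusViaApi() is a placeholder for your call to the MNO API:

<?php
$worker = new GearmanWorker();
$worker->addServer('127.0.0.1', 4730);

$worker->addFunction('check_transaction_status', function (GearmanJob $job) {
    $transactionId = $job->workload();
    $newStatus = checkStatusViaApi($transactionId); // hypothetical API call
    // Persist $newStatus here and notify the user (email/push) once it is
    // terminal ('complete' or 'declined'), so nothing depends on the browser.
});

while ($worker->work()); // block and handle jobs as the server pushes them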
I have a Symfony2 app that under some circumstances has to send more than 10,000 push and email notifications.
I developed an SQS flow with some workers polling the queues to send emails and mobile push notifications.
But now I have the problem that submitting these tasks/jobs to SQS within the request/response cycle (maybe not even that many) itself consumes a lot of time, and the response timeout is normally reached.
Should I process this task in the background (I need to send back a quick response)? And how do I handle possible errors in this scenario?
NOTE: Amazon SQS can receive 10 messages in one request, and I am already using this method. Maybe I should build a single SQS message containing many notification jobs (max. 256 KB) to send fewer HTTP requests to SQS?
The moment you have a single action that triggers 10k actions, you need to try to find a way to tell the user that "OK, I got it. I'll start working on it and will let you know when it's done".
So to bring that work into the background, a domain event should be raised from your user's action which would be queued into SQS. The user gets notified, and then a worker can pick up that message from the queue and start sending emails and push notifications to another queue.
At the end of the day, 10k messages in batches of 10 are just 1k requests to SQS, which should be pretty quick anyway.
Try to keep your messages small. Don't send the whole content of an email into a queue message, because then you'll get unnecessarily long latencies. Keep the content in a reachable place, or just query for it again when consuming the message, instead of passing big content up and down the network.
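As a sketch of the batching, here is what the SendMessageBatch call (up to 10 entries per request) looks like with the AWS SDK for PHP; the queue URL and the $jobs array are assumptions:

<?php
require_once __DIR__ . '/vendor/autoload.php';

use Aws\Sqs\SqsClient;

$sqs = new SqsClient(['region' => 'us-east-1', 'version' => 'latest']);
$queueUrl = 'https://sqs.us-east-1.amazonaws.com/123456789012/notifications';

foreach (array_chunk($jobs, 10) as $chunk) {
    $entries = [];
    foreach ($chunk as $i => $job) {
        // Keep the body small: just an ID the consumer can use to look the job up.
        $entries[] = [
            'Id'          => (string) $i, // must be unique within the batch
            'MessageBody' => json_encode(['job_id' => $job['id']]),
        ];
    }
    $sqs->sendMessageBatch(['QueueUrl' => $queueUrl, 'Entries' => $entries]);
}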
And how to handle possible errors with this scenario?
Amazon provides dead letter queues for this. In asynchronous systems I've built, I usually create a queue and then attach a redrive policy to it that says "if I see the same message on this queue 10 times, send it to a dead letter queue so that it doesn't bounce back and forth between the queue and a consumer for all eternity". The dead letter queue is simply another queue.
From a dead letter queue you can decide what to do with data that did not process. Since it's notifications (emails or push notifications) in your case, you might have another component in your system that will periodically reprocess a dead letter queue. Scheduled Lambdas are good for this.
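A hedged sketch of attaching such a redrive policy with the AWS SDK for PHP; the queue URL, the DLQ ARN, and the maxReceiveCount of 10 (mirroring the description above) are assumptions:

<?php
use Aws\Sqs\SqsClient;

$sqs = new SqsClient(['region' => 'us-east-1', 'version' => 'latest']);

$sqs->setQueueAttributes([
    'QueueUrl'   => 'https://sqs.us-east-1.amazonaws.com/123456789012/notifications',
    'Attributes' => [
        // After 10 receives without a successful delete, the message is moved
        // to the dead letter queue instead of bouncing back forever.
        'RedrivePolicy' => json_encode([
            'deadLetterTargetArn' => 'arn:aws:sqs:us-east-1:123456789012:notifications-dlq',
            'maxReceiveCount'     => '10',
        ]),
    ],
]);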
Does RabbitMQ call the callback function for a consumer when it has some message for it, or does the consumer have to poll the RabbitMQ client?
So on the consumer side, if there is a PHP script, can RabbitMQ call it and pass the message/parameters to it? E.g. if a rating is submitted on shard 1 and the aggregateRating table is on shard 2, would the RabbitMQ consumer on shard 2 trigger the script, say aggRating.php, and pass the parameters that were inserted on shard 1?
The AMQPQueue::consume method is now a "proper" implementation of basic.consume as of version 1.0 of the PHP AMQP library (http://www.php.net/manual/en/amqpqueue.consume.php). Unfortunately, since PHP is a single-threaded language, you can't do other things while waiting for a message in the same process space. If you call AMQPQueue::consume and pass it a callback, your entire application will block and wait for the next message to be sent by the broker, at which point it will call the provided callback function. If you want a non-blocking method, you will have to use AMQPQueue::get (http://www.php.net/manual/en/amqpqueue.get.php), which will poll the server for a message and return a boolean FALSE if there is no message.
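A minimal sketch of both styles with the pecl amqp extension; the connection details and the 'ratings' queue (assumed to be declared already) are assumptions:

<?php
$connection = new AMQPConnection([
    'host' => 'localhost', 'login' => 'guest', 'password' => 'guest',
]);
$connection->connect();
$channel = new AMQPChannel($connection);
$queue = new AMQPQueue($channel);
$queue->setName('ratings');

// Blocking: the process waits here; the callback fires once per delivery.
$queue->consume(function (AMQPEnvelope $envelope, AMQPQueue $queue) {
    // e.g. hand $envelope->getBody() to aggRating.php in a child process
    $queue->ack($envelope->getDeliveryTag());
});

// Non-blocking alternative: returns FALSE immediately if no message is waiting.
// $envelope = $queue->get();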
I disagree with scvatex's suggestion to use a separate language for using a "push" approach to this problem though. PHP is not IO driven, and therefore using a separate language to call a PHP script when a message arrives seems like unnecessary complexity: why not just use AMQPQueue::consume and let the process block (wait for a message), and either put all the logic in the callback or make the callback run a separate PHP script.
We have done the latter at my work as a large scale job processing system so that we can segregate errors and keep the parent job processor running no matter what happens in the children. If you would like a detailed description of how we set this up and some code samples, I would be more than happy to post them.
What you want is basic.consume, which allows the broker to push messages to clients.
That said, the libraries are implemented differently. Most of them have support for basic.consume, but because of inherent limitations of the frameworks used, some don't (most notably the official RabbitMQ C client on which a lot of other clients are based).
If your PHP library does not support basic.consume, you either have to use polling (bad), or you could use one of the more complete clients to drive the script. For instance, you could write a Python or Java program that consumes from the broker (so, the broker pushes deliveries to them) and they could call the script whenever a new message is received. The official tutorials are a great introduction to the AMQP APIs and are a good place to start.
This is efficient from most points of view, but it does require a stable connection to the broker.
If in doubt about the capabilities of the various clients, or if you need more guidance, the RabbitMQ Discuss mailing list is a great place to ask questions. The developers make a point of answering any query posted there.
The pecl amqp extension supports push-style consumption via the AMQPQueue::consume method. You just need to pass a callback function to it, and the callback will be executed when a message arrives.
I have a PHP script which queries a list of clients from a MySQL database, goes to each client's IP address, and picks up some information which is then displayed on the webpage.
But it takes a long time if the number of clients is too high. Is there any way I can send those URL requests (file_get_contents) in parallel?
Lineke Kerckhoffs-Willems wrote a good article about Multithreading in PHP with CURL. You can use that instead of file_get_contents() to get needed information.
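For reference, a minimal curl_multi sketch of the parallel-fetch idea; $urls stands in for the list built from your database query:

<?php
$multi = curl_multi_init();
$handles = [];

foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10); // don't let one slow client stall the page
    curl_multi_add_handle($multi, $ch);
    $handles[$url] = $ch;
}

// Drive all transfers concurrently until every handle has finished.
do {
    $status = curl_multi_exec($multi, $running);
    if ($running) {
        curl_multi_select($multi); // wait for activity instead of busy-looping
    }
} while ($running && $status === CURLM_OK);

$results = [];
foreach ($handles as $url => $ch) {
    $results[$url] = curl_multi_getcontent($ch);
    curl_multi_remove_handle($multi, $ch);
    curl_close($ch);
}
curl_multi_close($multi);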
I would use something like Gearman to assign them as jobs in a queue for workers to come along and complete if this needs to scale.
As another option I have also written a PHP wrapper for the Unix at queue, which might be a fit for this problem. It would allow you to schedule the requests so that they can run in parallel. I have used this method successfully in the past to handle the sending of bulk email, which has similar blocking problems to your script.