I have a question concerning the third RabbitMQ tutorial. I am trying to implement something similar, except there is no guarantee that the consumer(s) will be running when the producer sends a message to the exchange.
So, I have my producer which publishes the messages to a fanout exchange:
$channel->exchange_declare('my_exchange', 'fanout', false, false, false);
$channel->basic_publish(new AMQPMessage('my_message'), 'my_exchange');
In my consumers, I declare queues, which I then bind to the exchange:
list($queueName,, ) = $channel->queue_declare("", false, false, true, false);
$channel->queue_bind($queueName, 'my_exchange');
And this is where my problem has its root. The tutorial says:
The messages will be lost if no queue is bound to the exchange yet,
but that's okay for us; if no consumer is listening yet we can safely
discard the message.
Is there a way to somehow preserve those messages, so when a consumer starts, it would access the previously sent messages? The only way I figured out how to do it is to declare the same queue in my producer and my publisher, but it kind of defeats the purpose of having an exchange and separate queues for different consumers.
The queues need to exist; it doesn't really matter who/what creates them: it can be the producer (although I would strongly discourage this), the consumer, some third admin app that just creates queues via the REST API, rabbitmqctl... If you want to consume the queue(s) later, just make sure that they're durable and that the TTL for messages is long enough (and make the messages durable too, if needed). But beware that your queue(s) don't get into flow state.
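For example, a pre-created durable queue with a long per-queue TTL might look like this in php-amqplib (the queue name and the 24-hour TTL are placeholders, not anything from the question):

use PhpAmqpLib\Connection\AMQPStreamConnection;
use PhpAmqpLib\Wire\AMQPTable;

$connection = new AMQPStreamConnection('localhost', 5672, 'guest', 'guest');
$channel = $connection->channel();

// Named and durable, so it survives broker restarts and can be found
// again by any consumer; x-message-ttl (in ms) keeps messages around
// long enough for a late consumer to pick them up.
$channel->queue_declare(
    'my_queue',
    false,   // passive
    true,    // durable
    false,   // exclusive
    false,   // auto_delete
    false,   // nowait
    new AMQPTable(['x-message-ttl' => 24 * 60 * 60 * 1000])
);
$channel->queue_bind('my_queue', 'my_exchange');

Publish with delivery_mode = 2 (persistent) if the messages themselves also need to survive a broker restart.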
The only way I figured out how to do it is to declare the same queue
in my producer and my publisher, but it kind of defeats the purpose of
having an exchange and separate queues for different consumers.
First - I think you meant to say in my producer and my subscriber :)
Second, separate queues for consumers (or a queue per consumer) is just how this example works. Bear in mind that this is a fanout exchange, and each consumer declares an exclusive queue - when the consumer disconnects, the queue is gone. And that's why that's okay for us: we're simply broadcasting, and whoever wants the broadcast (the messages) needs to be there to get it. A fanout exchange just puts messages into all the queues bound to it, that's it. It's perfectly OK to have multiple consumers consuming from the same queue (look at tutorial 2).
So you just need to consider your use case. Of course it doesn't make sense to create a fanout exchange and pre-set up the queues for the consumers... Perhaps you just need some routing keys or something else.
In this example (tutorial 3) the point is that there is a broadcast of messages, and if no one gets them, it's no big (or small) deal. If anyone wants them, they need to get them. It's like a TV channel - regardless of whether someone is watching or not, the signal goes on.
Consumers should attach themselves to queues, they shouldn't declare their own queues. Think of queues as buckets of work to be done. Depending on the workload you can add N consumers to those queues to do the work.
When you create an exchange you should have one or more queues (buckets of work) that are attached to that exchange. If you do this, messages will flow into the queues and start to queue-up (pardon the pun). Your consumers can then attach whenever they are ready and start doing the work.
Related
I was wondering, since I cannot find anything in the Symfony docs or other resources, whether PHP's symfony/messenger can handle messages in "bulk" with any async transport.
For example. Grab 20 messages from the bus, handle those 20 messages, and ack or reject any of the messages.
I know RabbitMQ has a feature to grab n messages from the queue and process all of them in a single run.
In some cases this will perform better than scaling the async workers.
Does anybody have any leads, resources or experience with it? Or am I trying to solve something by going against the idea of symfony/messenger?
[update]
I'm aware that bulk messages are not part of the (async) messaging concept, and that each message should be processed individually. But some message brokers have implemented a feature to "grab" X messages from a queue and process them (acknowledging or rejecting each one, or otherwise). I know handling multiple messages in a single iteration increases the complexity of any consumer, but in some cases it will improve performance.
I've used this concept of consuming multiple messages in a single iteration many times, but never with PHP's symfony/messenger.
This was not natively possible prior to Symfony 5.4.
They added a BatchHandlerInterface which allows you to handle your messages in batches (and choose the batch size).
You can find more info here :
Symfony - Handle messages in batches
GitHub PR of the feature
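A minimal sketch of such a handler, based on the interface documented in the links above (MyMessage and the per-message work are placeholders):

use Symfony\Component\Messenger\Handler\Acknowledger;
use Symfony\Component\Messenger\Handler\BatchHandlerInterface;
use Symfony\Component\Messenger\Handler\BatchHandlerTrait;

class MyBatchHandler implements BatchHandlerInterface
{
    use BatchHandlerTrait;

    public function __invoke(MyMessage $message, Acknowledger $ack = null)
    {
        // Buffers the message; the trait calls process() once the batch is full.
        return $this->handle($message, $ack);
    }

    private function process(array $jobs): void
    {
        foreach ($jobs as [$message, $ack]) {
            try {
                // ... do the actual work for $message here ...
                $ack->ack();
            } catch (\Throwable $e) {
                $ack->nack($e);
            }
        }
    }

    // Optional: flush every 20 messages instead of the default 10.
    private function shouldFlush(): bool
    {
        return 20 <= \count($this->jobs);
    }
}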
First, I think you have the wrong concept: there is no such thing as "messages in bulk" in the queue world.
The idea of the queue is that one message is received by the consumer, and the consumer is responsible for letting the queue know that the message was acknowledged, so it can be deleted. If this does not happen within X time, the message becomes visible to other consumers again.
If the messenger gets 20 messages from the queue, it still processes them one by one and acknowledges each message after it finishes processing. These 20 messages are "hidden" from other consumers for some time (it depends on the configuration of the queue). This also answers the question about multiple consumers.
There is a long-running process (Excel report creation) in my web app that needs to be executed in the background.
Some details about the app and environment.
The app consists of many instances, where each client has a separate one (with customized business logic), while everything is hosted on our server. The functionality that produces the Excel reports is the same.
I'm planning to have one RabbitMQ server installed. One part of the app (the publisher) will take all report options from the user and put them into a message, and some background job (the consumer) will consume it, produce the report and send it via email.
However, there is a flaw in such a design: say users from one instance queue lots of complicated reports (worth ~10 min of work each) and a user from another instance queues an easy one (1-2 mins); that user will have to wait until the others finish.
There could be separate queues for each app instance, but in that case I would need to create one consumer per instance. Given that there are 100+ instances atm, it doesn't look like a viable approach.
I was thinking of having a script that checks all available queues (and consumers) and creates a new consumer for any queue that doesn't have one. There are no limitations on the language for the consumer or such a script.
Does that sound like a feasible approach? If not, please give a suggestion.
Thanks
If I understood the topic correctly, everything lives on one server - RabbitMQ, the web application, the different per-client instances and the messages' consumers. In that case I would rather use different topics per message (https://www.rabbitmq.com/tutorials/tutorial-five-python.html) and introduce consumer priorities (https://www.rabbitmq.com/consumer-priority.html). Based on those options, during publishing I would create a combination of topic and priority for the message - the publisher knows the number of reports already sent per client and the selected options, and can decide whether the priority is high, low or normal.
The logic to pull messages based on that data will live in the consumer, so a consumer will not pick up heavy topics when, for example, 3 are already in process.
Based on the total number of messages in the queue (it's not 100% accurate) and the previous topics and priorities, you can implement a kind of leaky-bucket strategy to keep control of resources - say, a maximum of 100 reports generated simultaneously.
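A rough php-amqplib sketch of the publishing side, combining a topic routing key with a per-message priority (the exchange name, key scheme and priority values are assumptions; consumer priorities would additionally be set via the x-priority argument on basic_consume):

use PhpAmqpLib\Connection\AMQPStreamConnection;
use PhpAmqpLib\Message\AMQPMessage;

$connection = new AMQPStreamConnection('localhost', 5672, 'guest', 'guest');
$channel = $connection->channel();
$channel->exchange_declare('reports', 'topic', false, true, false);

// The publisher decides the weight from the selected report options
// and from how many reports this client already has in flight.
$routingKey = sprintf('report.%s.%s', $clientId, $isHeavy ? 'heavy' : 'light');

$msg = new AMQPMessage(json_encode($reportOptions), [
    'delivery_mode' => AMQPMessage::DELIVERY_MODE_PERSISTENT,
    'priority'      => $isHeavy ? 0 : 5, // the queue must declare x-max-priority
]);
$channel->basic_publish($msg, 'reports', $routingKey);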
You can consider using ZeroMQ (http://zeromq.org) for your case; it may be more suitable than RabbitMQ because it is simpler and is a brokerless solution.
I have a Symfony2 app that under some circumstances has to send more than 10,000 push and email notifications.
I developed an SQS flow with some workers polling the queues to send emails and mobile push notifications.
But now I have the problem that, when I need to send these tasks/jobs to SQS within the request/response cycle (maybe not that many), the sending itself consumes a lot of time (the response timeout is usually reached).
Should I process this task in the background (I need to send back a quick response)? And how do I handle possible errors in this scenario?
NOTE: Amazon SQS can receive 10 messages in one request and I'm already using this method. Maybe I should build a single SQS message containing many notification jobs (max. 256 KB) to send fewer HTTP requests to SQS?
The moment you have a single action that triggers 10k actions, you need to try to find a way to tell the user that "OK, I got it. I'll start working on it and will let you know when it's done".
So to bring that work into the background, a domain event should be raised from your user's action which would be queued into SQS. The user gets notified, and then a worker can pick up that message from the queue and start sending emails and push notifications to another queue.
At the end of the day, 10k messages in batches of 10 are just 1k requests to SQS, which should be pretty quick anyway.
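With the AWS SDK for PHP, that batching looks roughly like this ($jobs and $queueUrl are placeholders):

use Aws\Sqs\SqsClient;

$sqs = new SqsClient(['region' => 'us-east-1', 'version' => '2012-11-05']);

// Send the notification jobs in batches of 10 (the SQS maximum per request).
foreach (array_chunk($jobs, 10) as $batch) {
    $entries = [];
    foreach ($batch as $i => $job) {
        $entries[] = [
            'Id'          => (string) $i,       // must be unique within the request
            'MessageBody' => json_encode($job), // keep it small: ids, not content
        ];
    }
    $sqs->sendMessageBatch([
        'QueueUrl' => $queueUrl,
        'Entries'  => $entries,
    ]);
}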
Try to keep your messages small. Don't send the whole content of an email into a queue message, because then you'll get unnecessary long latencies. Keep the content in a reachable place or just query for it again when consuming the message instead of passing big content up and down the network.
And how do I handle possible errors in this scenario?
Amazon provides dead letter queues for this. In asynchronous systems I've built, I usually create a queue and then attach a redrive policy to it that says "if I see the same message on this queue 10 times, send it to a dead letter queue so that it doesn't bounce back and forth between the queue and a consumer for all eternity". The dead letter queue is simply another queue.
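Attaching such a redrive policy with the AWS SDK for PHP looks roughly like this (the queue URL, ARN and account id are placeholders):

use Aws\Sqs\SqsClient;

$sqs = new SqsClient(['region' => 'us-east-1', 'version' => '2012-11-05']);

// After 10 receives without a successful delete, SQS moves the message
// to the dead letter queue instead of redelivering it forever.
$sqs->setQueueAttributes([
    'QueueUrl'   => 'https://sqs.us-east-1.amazonaws.com/123456789012/notifications',
    'Attributes' => [
        'RedrivePolicy' => json_encode([
            'deadLetterTargetArn' => 'arn:aws:sqs:us-east-1:123456789012:notifications-dlq',
            'maxReceiveCount'     => '10',
        ]),
    ],
]);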
From a dead letter queue you can decide what to do with data that did not process. Since it's notifications (emails or push notifications) in your case, you might have another component in your system that will periodically reprocess a dead letter queue. Scheduled Lambdas are good for this.
I'm implementing RabbitMQ to perform some image editing operations on another server. However, from time to time the request may arrive on that server before the source image has synced to it - in which case I would like to pop the message back into the queue and process it after all other operations have completed.
However, calling basic.nack with the requeue bit set makes my consumer re-receive that message immediately - ahead of any operations that can actually complete.
Currently I feel like I'm forced to implement some logic that just re-submits the original message to the exchange, but I'd like to avoid that - both because the same message may have been successfully processed on another server (with its own queue), and because I expect this to be such a common pattern that there must be a better way.
(oh, I'm using php-amqplib in both consumer and server code)
Thanks!
Update: I solved my problem using Dead Letter Exchange, as suggested by zaq178miami
My current solution:
Declares a dead letter exchange $dead_letter_exchange on the original queue $worker
Declares a recovery exchange $recovery_exchange
Declares a queue $dead_queue, with an x-message-ttl of 5 seconds and x-dead-letter-exchange set to $recovery_exchange
Binds $dead_queue to $dead_letter_exchange
And binds $worker to $recovery_exchange
$dead_letter_exchange and $recovery_exchange are generated names, based on the exchange I'm consuming from and the value of $worker
This makes every message that gets nack'ed (with requeue unset) return to the worker, only on that specific queue (server), after five seconds for a retry. I may still want to apply some logic that throws the message away after $n retries.
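In php-amqplib terms, the wiring looks roughly like this (a sketch; $exchange and $worker come from the setup described above, and the derived names and the fanout type for the helper exchanges are assumptions):

use PhpAmqpLib\Connection\AMQPStreamConnection;
use PhpAmqpLib\Wire\AMQPTable;

$connection = new AMQPStreamConnection('localhost', 5672, 'guest', 'guest');
$channel = $connection->channel();

// Names derived from the consumed exchange and $worker, as described above.
$dead_letter_exchange = $exchange . '.' . $worker . '.dlx';
$recovery_exchange    = $exchange . '.' . $worker . '.recovery';
$dead_queue           = $worker . '.dead';

$channel->exchange_declare($dead_letter_exchange, 'fanout', false, true, false);
$channel->exchange_declare($recovery_exchange, 'fanout', false, true, false);

// Messages nack'ed on $worker (with requeue = false) go to the DLX...
$channel->queue_declare($worker, false, true, false, false, false,
    new AMQPTable(['x-dead-letter-exchange' => $dead_letter_exchange]));

// ...sit in $dead_queue for 5 seconds, then dead-letter again into the
// recovery exchange, which feeds them back to $worker for the retry.
$channel->queue_declare($dead_queue, false, true, false, false, false,
    new AMQPTable([
        'x-message-ttl'          => 5000,
        'x-dead-letter-exchange' => $recovery_exchange,
    ]));

$channel->queue_bind($dead_queue, $dead_letter_exchange);
$channel->queue_bind($worker, $recovery_exchange);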
I'm still open to better ideas ;-)
Looks like you have a 'race condition' issue, which is the cause of the problem. Maybe it is a good choice to delay message publishing, or to publish delayed messages to be sure the image has synced to the target machine, or to publish the message when the image arrives (which might be tricky), or just sync the image on demand (when the message is consumed). You could even add some API to fetch the source image, so you can scale your consumers horizontally without any pain at any time. The idea is to make consumers as atomic and independent as they can be.
Back to the original question: if it is an option for you, try Dead Letter Exchanges to move failed messages to a separate queue. Mixing failed and valid messages without a definitive mechanism to detect re-published ones smells a bit (for reasons like potential cycling problems and management difficulties). But it really depends on your needs, message rates and hardware; if some solution yields stable results and you are sure about it, just stick to it.
Note that if you are using php-amqplib you can consume messages from more than one queue at the same time, so you can consume messages from the main queue and from the postponed-messages queue (but in that case the messages published to the postponed queue have to be delayed too, to prevent them from being consumed immediately).
Usually, delayed message publishing is done via a per-message or per-queue TTL plus an extra queue whose DLX is set to the main working queue - or, in your case, to the postponed-messages queue.
I'm a little confused. I'm trying to implement topic exchanges and am not sure what is needed.
I want to have several routing keys and 1 topic exchange (the default amq.topic). My keys would be like:
customer.appA.created
customer.appB.created
customer.*.created
I want my queue(s) to be durable, but do I need 1 'customer' queue or 2 queues for appA and appB?
I have my publisher figured out; connect, exchange declare, basic publish.
But I'm struggling with the consumers. Let's say I want to open 3 consoles, one for each of the aforementioned routing keys.
My current consumer has: connect, exchange declare, queue bind, basic consume. These are connected to a durable 'customer' queue. However, my messages are being round-robined to each console/consumer instead of following the routing keys.
So my questions;
For a typical topic exchange setup, how many queues do you need?
Can my consumers get away with just exchange binding, or does it have to include queue interaction?
Is it possible for a single message to appear in 2 consumers with topic exchange (or do you need fanout for that)?
First things first: an exchange does not deliver to a consumer. It delivers messages to queues with matching routing keys.
1. For a typical topic exchange setup, how many queues do you need?
If you have multiple consumers, then you will need one queue for every consumer.
2. Can my consumers get away with just exchange binding, or does it have to include queue interaction?
You need to bind the consumer to a queue; if the queue does not exist, create it and bind it.
3. Is it possible for a single message to appear in 2 consumers with topic exchange (or do you need fanout for that)?
Yes, but only if each consumer has its own separate queue bound with the same routing key. Otherwise it will be round-robin.
So the best way is to give each consumer its own queue with the required routing key.
For a typical topic exchange setup, how many queues do you need?
It depends on your application's needs. You may start with just one queue as a simple FIFO pipeline and later add more queues for more fine-grained message consumption.
Can my consumers get away with just exchange binding, or does it have to include queue interaction?
The idea of AMQP exchanges is to take published messages and put them into one or more queues (or route them to other exchanges, or even drop them entirely under certain conditions). A consumer works only with queues, while a publisher works only with exchanges.
There can be some misunderstanding around the Default Exchange, since it routes messages to the queue whose name equals the routing key; this is sometimes interpreted as publishing directly to queues, which is not true.
Is it possible for a single message to appear in 2 consumers with topic exchange (or do you need fanout for that)?
I guess you are talking about duplicating one message to multiple queues (although there are some error situations in which a single message really may be processed by multiple consumers).
If so - sure. You can just create multiple bindings for a queue to get different messages. Here is a small example:
Say you have three messages published to a topic exchange with three different routing keys: customer.appA.created, customer.appB.created and customer.appC.created.
You can create a separate queue to collect specific messages by binding the queue with an exact routing key - customer.appA.created, customer.appB.created and so on - or, as you already noted, bind a queue with the wildcard routing key customer.*.created.
To collect messages only for appA and appB, you can create a queue with two bindings, customer.appA.created and customer.appB.created, so you'll get both message types in one queue.
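In php-amqplib that might look like this (the queue names are placeholders; amq.topic is the default topic exchange mentioned in the question):

use PhpAmqpLib\Connection\AMQPStreamConnection;

$connection = new AMQPStreamConnection('localhost', 5672, 'guest', 'guest');
$channel = $connection->channel();

// Durable queue that collects only appA and appB "created" events
// via two bindings on the same topic exchange.
$channel->queue_declare('customer_ab', false, true, false, false);
$channel->queue_bind('customer_ab', 'amq.topic', 'customer.appA.created');
$channel->queue_bind('customer_ab', 'amq.topic', 'customer.appB.created');

// A second queue with a wildcard binding gets its own copy of every
// customer.*.created message, so one publish can reach both consumers.
$channel->queue_declare('customer_all', false, true, false, false);
$channel->queue_bind('customer_all', 'amq.topic', 'customer.*.created');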