Dead lettering with php-amqplib and RabbitMQ?

I'm just starting out with php-amqplib and RabbitMQ and want a way to handle messages that, for whatever reason, can't be processed and are nack'd. I thought that one way people handle this is with a dead letter queue. I'm trying to set this up but haven't had any luck so far, and I hope someone can offer some suggestions.
My initialization of the queues looks a little something like:
use PhpAmqpLib\Channel\AMQPChannel;
use PhpAmqpLib\Connection\AMQPStreamConnection;
use PhpAmqpLib\Wire\AMQPTable;

class BaseAbstract
{
    /** @var AMQPStreamConnection */
    protected $connection;

    /** @var AMQPChannel */
    protected $channel;

    /** @var array */
    protected $deadLetter = [
        'exchange' => 'dead_letter',
        'type' => 'direct',
        'queue' => 'delay_queue',
        'ttl' => 10000 // in milliseconds
    ];

    protected function initConnection(array $config)
    {
        try {
            $this->connection = AMQPStreamConnection::create_connection($config);
            $this->channel = $this->connection->channel();

            // Set up dead letter exchange and queue
            $this->channel->exchange_declare($this->deadLetter['exchange'], $this->deadLetter['type'], false, true, false);
            $this->channel->queue_declare($this->deadLetter['queue'], false, true, false, false, false, new AMQPTable([
                'x-dead-letter-exchange' => $this->deadLetter['exchange'],
                'x-dead-letter-routing-key' => $this->deadLetter['queue'],
                'x-message-ttl' => $this->deadLetter['ttl']
            ]));
            $this->channel->queue_bind($this->deadLetter['queue'], $this->deadLetter['exchange']);

            // Set up regular exchange and queue
            $this->channel->exchange_declare($this->getExchangeName(), $this->getExchangeType(), true, true, false);
            $this->channel->queue_declare($this->getQueueName(), true, true, false, false, new AMQPTable([
                'x-dead-letter-exchange' => $this->deadLetter['exchange'],
                'x-dead-letter-routing-key' => $this->deadLetter['queue']
            ]));
            if (method_exists($this, 'getRouteKey')) {
                $this->channel->queue_bind($this->getQueueName(), $this->getExchangeName(), $this->getRouteKey());
            } else {
                $this->channel->queue_bind($this->getQueueName(), $this->getExchangeName());
            }
        } catch (\Exception $e) {
            throw new \RuntimeException('Cannot connect to the RabbitMQ service: ' . $e->getMessage());
        }

        return $this;
    }

    // ...
}
which I thought should set up my dead letter exchange and queue, and then also set up my regular exchange and queue (with the getRouteKey, getQueueName, and getExchangeName/getExchangeType methods provided by extending classes).
When I try to handle a message like:
public function process(AMQPMessage $message)
{
    $msg = json_decode($message->body);
    if (empty($msg->payload) || empty($msg->payload->run)) {
        $message->delivery_info['channel']->basic_nack($message->delivery_info['delivery_tag'], false, true);
        return;
    }

    // removed for post brevity, but compose $cmd variable
    exec($cmd, $output, $returned);
    if ($returned !== 0) {
        $message->delivery_info['channel']->basic_ack($message->delivery_info['delivery_tag']);
    } else {
        $message->delivery_info['channel']->basic_nack($message->delivery_info['delivery_tag']);
    }
}
But I get back the error: Something went wrong: Cannot connect to the RabbitMQ service: PRECONDITION_FAILED - inequivalent arg 'x-dead-letter-exchange' for queue 'delay_queue' in vhost '/': received 'dead_letter' but current is ''
Is this the way I should be setting up dead lettering? The different examples I've seen all seem to show slightly different ways of handling it, none of which have worked for me. So I've obviously misunderstood something here and would appreciate any advice. :)

Setting up (permanent) queues and exchanges is something you want to do once, when deploying code, not every time you want to use them. Think of them like your database schema - although the protocol provides "declare" rather than "create", you should generally be writing code that assumes things are configured a particular way. You could build the first part of your code into a setup script, or use the web- and CLI-based management plugin to manage these using a simple JSON format.
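For example, a definitions file importable through the management plugin might look like this sketch (trimmed, with names taken from your code; a real export also carries users, vhosts, and version info):
{
  "exchanges": [
    {"name": "dead_letter", "vhost": "/", "type": "direct",
     "durable": true, "auto_delete": false, "arguments": {}}
  ],
  "queues": [
    {"name": "delay_queue", "vhost": "/", "durable": true,
     "auto_delete": false, "arguments": {"x-message-ttl": 10000}}
  ],
  "bindings": [
    {"source": "dead_letter", "vhost": "/", "destination": "delay_queue",
     "destination_type": "queue", "routing_key": "", "arguments": {}}
  ]
}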
The error you are seeing is probably a result of trying to declare the same queue at different times with different parameters - the "declare" won't replace or reconfigure an existing queue, it will treat the arguments as "pre-conditions" to be checked. You'll need to drop and recreate the queue, or manage it via the management UI, to change its existing parameters.
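If you just want to clear the inequivalent-arguments error during development, php-amqplib can drop the queue so your next declare recreates it with the new arguments (note this discards any messages in it):
// One-off cleanup: delete the existing queue so queue_declare
// can recreate it with the new x-dead-letter-* arguments.
$channel->queue_delete('delay_queue');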
Where run-time declares become more useful is when you want to dynamically create items in your broker. You can either give them names you know will be unique to that purpose, or pass null as the name to receive a randomly-generated name back (people sometimes refer to creating an "anonymous queue", but every queue in RabbitMQ has a name, even if you didn't choose it).
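With php-amqplib, that looks like this small sketch; the flags make the queue exclusive to this connection and auto-deleted when it closes:
// Passing '' as the name asks the broker to generate one.
// Flags: passive=false, durable=false, exclusive=true, auto_delete=true
list($queueName, ,) = $channel->queue_declare('', false, false, true, true);
// $queueName now holds a server-generated name like "amq.gen-..."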
If I'm reading it correctly, your "schema" looks like this:
# Dead Letter eXchange and Queue
Exchange: DLX
Queue: DLQ; dead letter exchange: DLX, with key "DLQ"; automatic expiry
Binding: copy messages arriving in DLX to DLQ
# Regular eXchange and Queue
Exchange: RX
Queue: RQ; dead letter exchange: DLX, with key "DLQ"
Binding: copy messages from RX to RQ, optionally filtered by routing key
When a message is "nacked" in RQ, it will be passed to DLX, with its routing key overwritten to be "DLQ". It will then be copied to DLQ. If it is nacked from DLQ, or waits in that queue too long, it will be routed round to itself.
I would simplify in two ways:
1. Remove the dead letter exchange and TTL from the "dead letter queue" (which I've labelled DLQ); that loop's likely to be more confusing than useful.
2. Remove the x-dead-letter-routing-key option from the regular queue (which I've labelled RQ). The configuration of the regular queue shouldn't need to know whether the Dead Letter Exchange has zero, one, or several queues attached to it, so shouldn't know the name of that other queue. If you want nacked messages to go straight to one queue, just make it a "fanout exchange" (which ignores routing keys) or a "topic exchange" with the binding key set to # (which is a wildcard matching all routing keys).
An alternative might be to set x-dead-letter-routing-key to the name of the regular queue, i.e. to label which queue it came from. But until you have a use case for that, I'd keep it simple and leave the message with its original routing key.
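Here's a minimal php-amqplib sketch of that simplified schema (the queue names work_queue and dead_letter_queue are placeholders, and the fanout type reflects the suggestion above, not your original setup):
// Dead letter exchange as fanout: routing keys are ignored, so
// dead-lettered messages keep their original routing key.
$channel->exchange_declare('dead_letter', 'fanout', false, true, false);

// The dead letter queue carries no x-dead-letter-* or TTL arguments,
// so nothing loops back out of it.
$channel->queue_declare('dead_letter_queue', false, true, false, false);
$channel->queue_bind('dead_letter_queue', 'dead_letter');

// The regular queue only needs to name the exchange to dead-letter into.
$channel->queue_declare('work_queue', false, true, false, false, false, new AMQPTable([
    'x-dead-letter-exchange' => 'dead_letter',
]));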

Related

How to manually ack messages in RabbitMQ?

I need to know how to manually ack messages on the queue directly from the Consumer I created, and to set a retry strategy of 5 attempts, with the delay increasing each time: 5 min before the second try, 10 min before the third, 15 min before the fourth, and so on...
I'm kinda lost in the Rabbit documentation; I've learned a bit of the concepts but the practical use is still a mystery to me...
I'm using Symfony 6.1 and my old_sound_rabbit_mq.yaml looks like this:
old_sound_rabbit_mq:
    connections:
        default:
            host: '%rabbitmqHost%'
            port: '%rabbitmqPort%'
            user: '%rabbitmqUser%'
            password: '%rabbitmqPassword%'
            vhost: '%rabbitmqVhost%'
    consumers:
        upload_file:
            connection: default
            exchange_options: { name: 'upload_file_exchange', type: direct, durable: true, auto_delete: false }
            queue_options: { name: 'upload_file_queue', durable: true, auto_delete: false, arguments: { 'x-max-priority': [ 'I', 20 ] } }
            callback: App\Consumer\UploadFileConsumer
            qos_options: { prefetch_size: 0, prefetch_count: 1, global: false }
This is my consumer:
<?php

declare(strict_types=1);

namespace App\Consumer;

use OldSound\RabbitMqBundle\RabbitMq\ConsumerInterface;
use PhpAmqpLib\Message\AMQPMessage;

class UploadFileConsumer implements ConsumerInterface
{
    public function execute(AMQPMessage $msg): void
    {
        try {
            // do something with $msg, if all is good then ack the msg and remove from queue
        } catch (\Exception $e) {
            // keep message in queue, don't ack it, keep it in queue retry 5 times then stop consumer if no success
        }
    }
}
AMQPMessage provides both ack() and nack() methods for this purpose.
https://github.com/php-amqplib/php-amqplib/blob/master/PhpAmqpLib/Message/AMQPMessage.php#L98-L128
So likely what you want is:
<?php

declare(strict_types=1);

namespace App\Consumer;

use OldSound\RabbitMqBundle\RabbitMq\ConsumerInterface;
use PhpAmqpLib\Message\AMQPMessage;

class UploadFileConsumer implements ConsumerInterface
{
    public function execute(AMQPMessage $msg): void
    {
        try {
            // do something with $msg
            $msg->ack();
        } catch (\Exception $e) {
            // nack and requeue so the message can be retried
            $msg->nack(true);
        }
    }
}
Though I'm not familiar with a way to limit the number of times a message is re-queued without modifying the headers/payload and re-queueing it as a new message. Alternatively, you can set a TTL value and messages will eventually time out of the queue. You can also create a dead-letter exchange if you want to inspect nack'ed/expired messages. [just make sure to clean it out, otherwise you'll have new problems]
If I had to kludge in a "re-queue X times" limit, I'd use a cache with a built-in TTL, like Redis: the key is the message ID and the value is the number of retries.
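A rough sketch of that kludge inside the consumer, assuming the phpredis extension and that the publisher sets a unique message_id property (both assumptions, not something the bundle provides):
// Hypothetical retry guard: count attempts in Redis, give up after 5.
$redis = new \Redis();
$redis->connect('127.0.0.1', 6379);

$key = 'retries:' . $msg->get('message_id'); // assumes publisher set message_id
$attempts = (int) $redis->incr($key);
$redis->expire($key, 3600); // counter evaporates after an hour

if ($attempts > 5) {
    $msg->nack(false);  // don't requeue; dead-letters if the queue is configured for it
} else {
    $msg->nack(true);   // requeue for another attempt
}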
Edit:
Spitballing some workflows for "Task A must complete before Task B can begin", in order of decreasing preference:
1. If Tasks A and B can never happen without each other, consolidate Task A and Task B into a single task, as they are not independent.
2. If Task B cannot happen without A, have B invoke A in synchronous/RPC fashion.
3. Create a new async Task C that calls A and B in synchronous/RPC fashion.
4. Task A ends with submitting Task B to the queue. [this still has the smell of "A and B" actually being a single work unit]
5. Track task status externally [eg: in a cache like Redis] and have Task B nack/requeue if Task A is not yet completed.
And always beware of "infinite requeue" if your queue does not have a defined TTL: messages will continue to build up, and your consumers will be working constantly on tasks that may never complete successfully.

Is there a way to manually store Kafka offset so a consumer never misses messages?

Using the PHP Laravel framework to consume Kafka messages with the help of the mateusjunges/laravel-kafka package.
Is it possible to save the offset by consumer in, for example, Redis or a DB?
And, when the broker shuts down and comes back up, is it possible to tell the consumer to start consuming messages from that specific offset?
Let's say I have a Laravel Artisan command that builds the following consumer:
public function handle()
{
    $topics = [
        'fake-topic-1',
        'fake-topic-2',
        'fake-topic-3'
    ];

    $cachedRegistry = new CachedRegistry(
        new BlockingRegistry(
            new PromisingRegistry(
                new Client(['base_uri' => 'https://fake-schema-registry.com'])
            )
        ),
        new AvroObjectCacheAdapter()
    );

    $registry = new \Junges\Kafka\Message\Registry\AvroSchemaRegistry($cachedRegistry);
    $recordSerializer = new RecordSerializer($cachedRegistry);

    foreach ($topics as $topic) {
        $registry->addKeySchemaMappingForTopic(
            $topic,
            new \Junges\Kafka\Message\KafkaAvroSchema($topic . '-key')
        );
        $registry->addBodySchemaMappingForTopic(
            $topic,
            new \Junges\Kafka\Message\KafkaAvroSchema($topic . '-value')
        );
    }

    $deserializer = new \Junges\Kafka\Message\Deserializers\AvroDeserializer($registry, $recordSerializer);

    $consumer = \Junges\Kafka\Facades\Kafka::createConsumer(
            $topics, 'fake-test-group', 'fake-broker.com:9999')
        ->withOptions([
            'security.protocol' => 'SSL',
            'ssl.ca.location' => storage_path() . '/client.keystore.crt',
            'ssl.keystore.location' => storage_path() . '/client.keystore.p12',
            'ssl.keystore.password' => 'fakePassword',
            'ssl.key.password' => 'fakePassword',
        ])
        ->withAutoCommit()
        ->usingDeserializer($deserializer)
        ->withHandler(function (\Junges\Kafka\Contracts\KafkaConsumerMessage $message) {
            KafkaMessagesJob::dispatch($message)->onQueue('kafka_messages_queue');
        })
        ->build();

    $consumer->consume();
}
My problem now is that, from time to time, "fake-broker.com:9999" shuts down, and when it comes back up, the consumer misses a few messages...
offset_reset is set to latest;
the option auto.commit.interval.ms is not set in the ->withOptions() method, so it uses the default value (5 seconds, I believe);
auto_commit is set to true, and the consumer is built with the option ->withAutoCommit() as well.
Let me know if you guys need any additional information ;)
Thank you in advance.
EDIT:
According to this thread, I should set my "offset_reset" to "earliest", and not "latest".
Even so, I'm almost 100% sure that an offset is committed (somehow, somewhere stored), because I am using the same consumer group ID on the same partition (0), so the "offset_reset" is not even taken into consideration, I'm assuming...
somehow, somewhere stored
Kafka consumer groups store offsets in Kafka itself (the __consumer_offsets topic). Storing them externally therefore doesn't really make sense, because you need Kafka to be up regardless.
Is it possible to save the offset by consumer in, for example, Redis or DB? And, when the broker shuts down and comes back up, is it possible to tell the consumer to start consuming messages from that specific offset?
In general it is, but it adds unnecessary complexity. You'd need to manually assign each partition to your client rather than subscribing the consumer to just a topic. It's not clear to me whether that Kafka library supports custom partition assignment, though.
It's not clear from your question why Kafka would be scaled to zero brokers and have less uptime than "Redis or DB" for you not to store offsets in Kafka. (Redis is a DB, so not sure why that's an "or"...)
That offset_reset value only matters when there is no committed offset for the consumer group. The consumer client isn't (shouldn't be? I don't know the PHP client code.) "caching" the offsets locally, and broker restarts should preserve any committed values. If you want to guarantee you are able to commit every message, you need to disable auto-commits and handle commits yourself: https://junges.dev/documentation/laravel-kafka/v1.8/advanced-usage/4-custom-committers
You can optionally inspect the message in your handler function and store that message's offset somewhere else, but then you are fully responsible for seeking the consumer when it starts back up (again, you'd want to disable all commit functionality in the consumer, and also set the auto.offset.reset consumer config to none rather than latest/earliest). Note that this config will throw an error when the offset doesn't exist.
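For illustration, a minimal sketch of that pattern against the underlying php-rdkafka extension (which laravel-kafka wraps); the broker and topic names are the fake ones from the question, Redis is just one possible external store, and a real version would need per-partition error handling:
$conf = new \RdKafka\Conf();
$conf->set('group.id', 'fake-test-group');
$conf->set('metadata.broker.list', 'fake-broker.com:9999');
$conf->set('enable.auto.commit', 'false'); // we manage offsets ourselves
$conf->set('auto.offset.reset', 'none');   // error out rather than jump to latest/earliest

$redis = new \Redis();
$redis->connect('127.0.0.1', 6379);

// Resume from the last offset we recorded for partition 0 of this topic.
$offset = (int) ($redis->get('kafka-offset:fake-topic-1:0') ?: 0);

$consumer = new \RdKafka\KafkaConsumer($conf);
$consumer->assign([new \RdKafka\TopicPartition('fake-topic-1', 0, $offset)]);

while (true) {
    $message = $consumer->consume(120 * 1000);
    if ($message->err === RD_KAFKA_RESP_ERR_NO_ERROR) {
        // ... process $message->payload ...
        // Record the next offset to read, only after successful processing.
        $redis->set('kafka-offset:fake-topic-1:0', $message->offset + 1);
    }
}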

Breaking out of a Gearman loop

I have a php application that gets requests for part numbers from our server. At that moment, we reach out to a third party API to gather pricing information to make sure we have the latest pricing for that particular request. Sometimes the third party API is slow or it might be down, so we have a database that stores the latest pricing requests for each particular part number that we can use as a fallback. I'd like to run the request to the third party API and the database in parallel using Gearman. Here is the idea:
Receive request
Through gearman, create two jobs:
Request to third party API
MySQL database lookup
Wait in a loop and return the results based on the following conditions:
If the third party API has completed, return that result immediately
If an elapsed time has passed (e.g. 2 seconds) and the third party API hasn't responded, return the MySQL lookup data
Using gearman, my thoughts were to either run the two tasks in the foreground and break out of runTasks() within the setCompleteCallback() call, or to run them in the background and check on the two tasks within a separate loop using jobStatus().
Unfortunately, I can't get either route to work for me while still getting access to the resulting data. Is there a better way, or are there some existing examples of how someone has made this work?
I think you've described a single blocking problem, namely the result of a 3rd-party API lookup. There are two ways you can handle this, from my point of view: either you abort the attempt altogether once you decide you've run out of time, or you report back to the client that you ran out of time but continue the lookup anyway, just to update your local cache in case the API happens to respond slower than you would like. I'll describe how I would go about the former, since that's the easier one.
From the client side:
$request = array(
    'productId' => 5,
);

$client = new GearmanClient();
$client->addServer('127.0.0.1', 4730);

$results = json_decode($client->doNormal('apiPriceLookup', json_encode($request)));

if ($results && property_exists($results, 'success') && $results->success) {
    // Use the fresh data from the API lookup
} else {
    // Fall back to the local (cached) data
}
This will create a job on the job server with a function name of 'apiPriceLookup' and pass it the workload data containing a product id of 5. It will wait for the results to come back, and check for a success property. If it exists and is true, then the api lookup was successful.
The idea is to set the timeout condition then in the worker task, which completely depends on how you're implementing the API lookup. If you're using cURL (or some wrapper around cURL), you can see the answer to how to detect a timeout here.
From the worker side:
$worker = new GearmanWorker();
$worker->addServer();
$worker->addFunction("apiPriceLookup", "apiPriceLookup");

while ($worker->work());

function apiPriceLookup($job)
{
    $payload = json_decode($job->workload());

    try {
        $results = [
            'data' => PerformApiLookupForProductId($payload->productId),
            'success' => true,
        ];
    } catch (Exception $e) {
        $results = ['success' => false];
    }

    return json_encode($results);
}
This just creates a GearmanWorker object and subscribes it to the function apiPriceLookup. The worker will call apiPriceLookup whenever a client submits a task to the job server. That function calls out to another function, PerformApiLookupForProductId, which should be written so as to throw an exception whenever a timeout condition occurs.
I don't think this would be considered using exceptions to control logic flow; timeout conditions generally are (or should be) exceptional events. For instance, Guzzle will throw a GuzzleHttp\Exception\RequestException when it has decided to time out.
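For completeness, here's a hedged sketch of what PerformApiLookupForProductId could look like with plain cURL; the endpoint URL and the 2-second budget are assumptions to match the question:
function PerformApiLookupForProductId($productId)
{
    // Hypothetical endpoint; substitute your 3rd-party API URL.
    $ch = curl_init('https://api.example.com/prices/' . urlencode((string) $productId));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 2); // overall time budget before falling back

    $body = curl_exec($ch);
    $errno = curl_errno($ch);
    curl_close($ch);

    if ($errno === 28) { // 28 == CURLE_OPERATION_TIMEDOUT
        throw new Exception('Price API timed out');
    }
    if ($body === false) {
        throw new Exception('Price API request failed (curl errno ' . $errno . ')');
    }

    return json_decode($body, true);
}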

How to Get the MongoDB Connection String to Work Around Downed Nodes?

I am running some tests on a test cluster I have set up. Right now, I have a three node cluster with one master, one slave, and one arbiter.
I have a connection string like
mongodb://admin:pass@the_slave_node,the_master_node
I was under the impression that one of the features inherent in the connection string was that supplying more than one host would introduce a certain degree of resiliency on the client side. I was expecting that when I took down the_slave_node, the PHP driver would move on and try connecting to the_master_node; however, this doesn't seem to be the case, and instead I get the error:
The MongoCursor object has not been correctly initialized by its constructor
I know that MongoClient is responsible for making the initial connections, and indeed it is that way in the code. So this error is an indication to me that the MongoClient didn't connect properly and that I didn't implement correct error checking. However, that is a different issue --
How do I guarantee that the MongoClient will connect to at least one of the hosts in the CSV list in the event at least one host is up and some hosts are down?
Thank you
The MongoCursor object has not been correctly initialized by its constructor
This error should only ever happen if you are constructing your own MongoCursor and overriding its constructor.
This would happen with for example
class MyCursor extends MongoCursor
{
    function __construct(MongoClient $connection, $ns, array $query = array(), array $fields = array())
    {
        /* Do some work, forgetting to call parent::__construct(....); */
    }
}
If you are not extending any of the classes, then this error is definitely a bug and you should report it please :)
How do I guarantee that the MongoClient will connect to at least one of the hosts
Put at least one member of each datacenter into your seed list.
Tune the various timeout options, and plan for the case when the primary is down (e.g. which servers should you be reading from instead?)
I suspect you may have forgotten to specify the "replicaSet" option, since you mention your connection string without it?
The following snippet is what I recommend (adapt at will), especially when you require full consistency when possible (e.g. always reading from a primary).
<?php
$seedList = "hostname1:port,hostname2:port";
$options = array(
    // If the server is down, don't wait forever
    "connectTimeoutMS" => 500,
    // When the server goes down in the middle of operation, don't wait forever
    "socketTimeoutMS" => 5000,
    "replicaSet" => "ReplicasetName",
    "w" => "majority",
    // Don't wait forever for majority write acknowledgment
    "wtimeout" => 500,
    // When the primary goes down, allow reading from secondaries
    "readPreference" => MongoClient::RP_PRIMARY_PREFERRED,
    // When the primary is down, prioritize reading from our local datacenter
    // If that datacenter is down too, fall back to any server available
    "readPreferenceTags" => array("dc:is", ""),
);

try {
    $mc = new MongoClient($seedList, $options);
} catch (Exception $e) {
    /* I always have some sort of "Skynet"/"Ground control"/"Houston, we have
       a problem" system to automate taking down (or putting into automated
       maintenance mode) my webservers in case of epic failure... */
    automateMaintenance($e);
}

Define a custom ExceptionStrategy in a ZF2 module

Hi all,
I've been struggling with this issue for more than a week and finally decided to ask for help, hoping that someone knows the answer.
I am developing an application which uses Google's Protocol Buffers as the data exchange format. I am using DrSlump's PHP implementation, which lets you populate class instances with data and then serialize them into a binary string (or decode binary strings into PHP objects).
I have managed to implement my custom ProtobufStrategy whose selectRenderer(ViewEvent $e) returns an instance of ProtobufRenderer in case the event contains an instance of ProtobufModel. The renderer then extracts my custom parameters from the model by calling $model->getOptions() to determine which message needs to be sent back to the client, serializes the data and outputs the binary string to php://output.
For it to make more sense, let's look at the following sample message:
message SearchRequest {
    required string query = 1;
    optional int32 page_number = 2;
    optional int32 result_per_page = 3;
}
If I wanted to respond to the client with this message, I would return something like this from my action:
public function getSearchRequestAction()
{
    [..]
    $data = array(
        'query' => 'my query',
        'page_number' => 3,
        'result_per_page' => 20,
    );

    return new ProtobufModel($data, array(
        'message' => 'MyNamespace\Protobuf\SearchRequest',
    ));
}
As you can see I am utilizing ViewModel's second parameter, $options, to tell which message needs to be serialized. That can then, as mentioned earlier, be extracted inside the renderer by calling $model->getOptions().
So far, so good. My controller actions output binary data as expected.
However, I am having issues with handling exceptions. My plan was to catch all exceptions and respond to the client with an instance of my Exception message, which looks like this:
message Exception {
    optional string message = 1;
    optional int32 code = 2;
    optional string file = 3;
    optional uint32 line = 4;
    optional string trace = 5;
    optional Exception previous = 6;
}
In theory it should work out of the box, but it does not. The issue is that Zend\Mvc\View\Http\ExceptionStrategy::prepareExceptionViewModel(MvcEvent $e) returns an instance of ViewModel, which obviously does not contain the additional $options information I need.
Also, it returns a ViewModel and not a ProtobufModel, which means that Zend invokes the default ViewPhpRenderer and outputs the exception as an HTML page.
What I want to do is replace the default ExceptionStrategy (and eventually also the RouteNotFoundStrategy) with my own classes, which would be returning something like this:
$data = array(
    'message' => $e->getMessage(),
    'code' => $e->getCode(),
    'file' => $e->getFile(),
    'line' => $e->getLine(),
    'trace' => $e->getTraceAsString(),
    'previous' => $e->getPrevious(),
);

return new ProtobufModel($data, array(
    'message' => 'MyNamespace\Protobuf\Exception',
));
...and I can't find the way to do it...
I tried creating my own ExceptionStrategy class and aliasing it to the existing ExceptionStrategy service, but Zend complained that a service with that name already exists.
I have a suspicion that I am on the right path with the custom strategy extension, but I can't find a way to override the default one.
I noticed that the default ExceptionStrategy and the console one get registered in Zend/Mvc/View/Http/ViewManager. I hope I won't have to add custom view managers to achieve such a simple thing but please, correct me if I'm wrong.
Any help will be appreciated!
The easiest way is to do a little fudging.
First, register your listener to run at a higher priority than the ExceptionStrategy; since it registers at default priority, this means any priority higher than 1.
Then, in your listener, before you return, make sure you set the "error" in the MvcEvent to a falsy value:
$e->setError(false);
Once you've done that, the default ExceptionStrategy will say, "nothing to do here, move along" and return early, before doing anything with the ViewModel.
While you're at it, you should also make sure you change the result instance in the event:
$e->setResult($yourProtobufModel)
as this will ensure that this is what is inspected by other listeners.
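For reference, a minimal sketch of wiring that up in your module (the listener body and the ProtobufModel construction are assumptions based on your post; 100 is just any priority above the ExceptionStrategy's default of 1):
use Zend\Mvc\MvcEvent;

class Module
{
    public function onBootstrap(MvcEvent $e)
    {
        $events = $e->getApplication()->getEventManager();

        // Runs before the default ExceptionStrategy (priority 1)
        $events->attach(MvcEvent::EVENT_DISPATCH_ERROR, function (MvcEvent $e) {
            $exception = $e->getParam('exception');
            if (!$exception instanceof \Exception) {
                return;
            }

            $model = new ProtobufModel(array(
                'message' => $exception->getMessage(),
                'code' => $exception->getCode(),
            ), array(
                'message' => 'MyNamespace\Protobuf\Exception',
            ));

            $e->setResult($model);
            $e->setError(false); // the default ExceptionStrategy now returns early
        }, 100);
    }
}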
