I am using Laravel Jobs to read messages from an SQS queue (Laravel version 5.7).
Following the Laravel documentation, I am using Supervisor to run multiple queue:work processes at the same time.
All goes well until I get this SQS error related to message availability:
InvalidParameterValue (client): Value
... for parameter ReceiptHandle is invalid. Reason: Message does not exist or
is not available for visibility timeout change. - <?xml version="1.0"?>
<ErrorResponse xmlns="http://queue.amazonaws.com/doc/2012-11-05/"><Error>
<Type>Sender</Type><Code>InvalidParameterValue</Code><Message>Value ...
for parameter ReceiptHandle is invalid. Reason: Message does not exist or is
not available for visibility timeout change.</Message><Detail/></Error>
<RequestId>8c1d28b7-a02c-5059-8b65-7c6292a0e56e</RequestId></ErrorResponse>
{"exception":"[object] (Aws\\Sqs\\Exception\\SqsException(code: 0): Error
executing \"ChangeMessageVisibility\" on \"https://sqs.eu-central-
1.amazonaws.com/123123123123/myQueue\"; AWS HTTP error: Client error: `POST
https://sqs.eu-central-1.amazonaws.com/123123123123/myQueue` resulted in a
`400 Bad Request` response:
In particular, the strange part is "Message does not exist or is not available for visibility timeout change."
Each Supervisor process runs command=php /home/application/artisan queue:work, without a --sleep=3 (I'd like the process to be reactive and not wait for 3 seconds when nothing is in the queue) and without a --tries=3 (I need all the tasks to be completed, so I don't put a limit on the tries parameter).
If the message does not exist (and I can't exclude this possibility), why does the process fetch it from the queue? Is there anything I can do to prevent this?
I've seen this error intermittently in production too, where we run a good number of consumers for a single SQS queue. In our case, I'm pretty convinced that the error is due to SQS's at-least-once delivery semantics. Essentially, a message can be delivered twice or more on rare occasions.
Laravel's queue worker command isn't strictly idempotent because it will throw an exception when trying to release or delete an SQS message that is no longer available (i.e., because it has been deleted by another queue worker process, which received a duplicate of the message from SQS).
Our workaround is to try to detect when a duplicate message has been received, and then attempt to safely release the message back onto the queue. If the other queue worker that is currently working on the message succeeds, it will delete the message, and it won't be received again. If the other queue worker fails, then the message will be released and received again later. Something like this:
<?php

namespace App\Jobs;

use Aws\Sqs\Exception\SqsException;
use Exception;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Illuminate\Support\Facades\Cache;
use Illuminate\Support\Str;

class ProcessPodcast implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    private $jobId;

    public function __construct($jobId)
    {
        $this->jobId = $jobId;
    }

    public function handle()
    {
        // The atomic lock ensures only one worker processes this job at a time.
        $acquired = Cache::lock("process-podcast-$this->jobId")->get(function () {
            // Process the podcast (NB: this should be idempotent)
        });

        if (!$acquired) {
            $this->releaseDuplicateMessage($delay = 60);
        }
    }

    private function releaseDuplicateMessage($delay)
    {
        try {
            $this->release($delay);
        } catch (Exception $ex) {
            if (!$this->causedByMessageNoLongerAvailable($ex)) {
                throw $ex;
            }
        }
    }

    private function causedByMessageNoLongerAvailable(Exception $ex): bool
    {
        return $ex instanceof SqsException &&
            Str::contains(
                $ex->getAwsErrorMessage(),
                "Message does not exist or is not available for visibility timeout change"
            );
    }
}
Another potential cause of these duplicate messages is that SQS has a default visibility timeout of 30 seconds.
Visibility timeout sets the length of time that a message received from a queue (by one consumer) will not be visible to the other message consumers.
So if one worker reads a message from the queue and takes longer than 30 seconds to process it, the message becomes visible again and another worker starts processing it.
When the first worker finishes, it deletes the message from the queue. When the second worker then finishes processing the same message and tries to delete it, it can't, because the first worker already deleted it.
We're having the same issue at the moment and are implementing a fix/workaround similar to Louis's. We will post our version once it is done and confirmed working.
Note: you can increase the visibility timeout on SQS.
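For example, with the AWS SDK for PHP this can be done in a few lines. A minimal sketch; the queue URL is taken from the error log above, and the 120-second value is an arbitrary assumption to adjust to your slowest job:

<?php

require 'vendor/autoload.php';

use Aws\Sqs\SqsClient;

// Region and API version mirror the queue URL seen in the error log above.
$sqs = new SqsClient([
    'region'  => 'eu-central-1',
    'version' => '2012-11-05',
]);

// Raise the queue's visibility timeout so it comfortably exceeds the
// slowest job's processing time; attribute values are strings per the SQS API.
$sqs->setQueueAttributes([
    'QueueUrl'   => 'https://sqs.eu-central-1.amazonaws.com/123123123123/myQueue',
    'Attributes' => [
        'VisibilityTimeout' => '120',
    ],
]);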
Related
I am trying to debug some bizarre behaviour of my PHP application. It is running Laravel 6 + AWS SQS. The program downloads call recordings from a VoIP provider's API using a job. The API has a heavy rate limit of 10 req/minute, so I'm throttling the requests on my side. The job is configured to try to complete within 24 hours using the retryUntil method.

However, the job disappears from the queue after 4 tries. It doesn't fail. The job's failed method never gets executed (I've put logging and Sentry::capture in there). It's not in the failed_jobs table. The last log line says "Cannot complete job, retrying in ... seconds", which is written right before the release call. After that, the job simply disappears from the queue and never gets executed again.
I am logging the number of attempts, max tries, timeoutAt, etc. Everything seems to be configured properly. Here's (the essence of) my code:
public function handle()
{
    /** @var Track $track */
    $track = Track::idOrUuId($this->trackId);

    $this->logger->info('Downloading track', [
        'trackId' => $track->getId(),
        'attempt' => $this->attempts(),
        'retryUntil' => $this->job->timeoutAt(),
        'maxTries' => $this->job->maxTries(),
    ]);

    $throttleKey = sprintf('track.download.%s', $track->getUser()->getTeamId());

    if (!$this->rateLimiter->tooManyAttempts($throttleKey, self::MAX_ALLOWED_JOBS)) {
        $this->downloadTrack($track);
        $this->rateLimiter->hit($throttleKey, 60);
    } else {
        $delay = random_int(10, 100) + $this->rateLimiter->availableIn($throttleKey);

        $this->logger->info('Throttling track download.', [
            'trackId' => $track->getId(),
            'delay' => $delay,
        ]);

        $this->release($delay);
    }
}

public function retryUntil(): DateTimeInterface
{
    return now()->addHours(24);
}

public function failed(Exception $exception)
{
    $this->logger->info('Job failed', ['exception' => $exception->getMessage()]);

    Sentry::captureException($exception);
}
I found the problem and I'm posting it here for anyone who might struggle with this in the future. It all came down to a simple piece of configuration. In AWS SQS, the queue I am working with has a dead-letter queue (DLQ) configured, with Maximum receives set to 4. According to the SQS docs:
The Maximum receives value determines when a message will be sent to the DLQ. If the ReceiveCount for a message exceeds the maximum receive count for the queue, Amazon SQS moves the message to the associated DLQ (with its original message ID).
Since this is an infrastructure-level configuration, it overrides any Laravel parameters you might pass to the job. And because the message is simply removed from the queue, the processing job does not actually fail, so the failed method is never executed.
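If you want Laravel's retryUntil/tries settings to govern instead, one option is to raise the maximum receive count in the queue's redrive policy so the DLQ no longer kicks in first. A sketch with the AWS SDK for PHP; the queue URL, DLQ ARN, and the count of 100 are all placeholder assumptions:

<?php

require 'vendor/autoload.php';

use Aws\Sqs\SqsClient;

$sqs = new SqsClient([
    'region'  => 'eu-central-1',
    'version' => '2012-11-05',
]);

// The RedrivePolicy attribute is a JSON document; maxReceiveCount must
// exceed the number of receives (releases + retries) a job may need.
$sqs->setQueueAttributes([
    'QueueUrl'   => 'https://sqs.eu-central-1.amazonaws.com/123123123123/myQueue',
    'Attributes' => [
        'RedrivePolicy' => json_encode([
            'deadLetterTargetArn' => 'arn:aws:sqs:eu-central-1:123123123123:myQueue-dlq',
            'maxReceiveCount'     => 100,
        ]),
    ],
]);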
I'm using Lumen (Laravel) to handle an AWS SQS queue with the sqs-plain driver for the connection. That driver works the same way as the default sqs driver but allows me to use custom JSON in the queue message body.
Based on that package I've created a job handler that works fine, but I'm having trouble understanding how to properly release a message back to the queue.
In the handler method I've injected an SqsJob and tried calling the release method, which isn't working as I'd expect.
It seems that SqsJob::release calls changeMessageVisibility on the SQS API and the parent release method, which simply sets a class variable ($this->released = true;) on the job.
So the message isn't really being "released" so much as marked as released, with its visibility in the queue changed. Not handling the release would result in the message being "handled" and deleted from the queue.
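For reference, the framework method in question looks roughly like this (a paraphrase of Illuminate\Queue\Jobs\SqsJob::release, not the exact source; check the code of your installed version):

// Paraphrase of Illuminate\Queue\Jobs\SqsJob::release()
public function release($delay = 0)
{
    parent::release($delay); // only marks the job: $this->released = true

    // The message stays on the queue; only its visibility timeout changes.
    $this->sqs->changeMessageVisibility([
        'QueueUrl'          => $this->queue,
        'ReceiptHandle'     => $this->job['ReceiptHandle'],
        'VisibilityTimeout' => $delay,
    ]);
}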
By trial and error I found out that throwing an exception in the handler method sends the message back to the queue.
Here's a simplified version of the job handler:
namespace App\Jobs;

use Illuminate\Contracts\Queue\Job as LaravelJob;
use Illuminate\Queue\Jobs\SqsJob;
use Illuminate\Support\Facades\Log;

class HandlerJob extends Job
{
    protected $data;

    /**
     * @param SqsJob $job
     * @param array $data
     */
    public function handle(SqsJob $job, array $data)
    {
        if ($this->validData($data)) {
            // handle the job
        } else {
            $this->release(60);
            // actually release the job back to the queue
            throw new \Exception('Invalid Data');
        }
    }

    private function validData($data)
    {
        // not relevant
    }
}
This seems like a basic task for the queue worker to handle, but I can't figure it out.
What is the proper way of releasing a message back onto the queue using the Laravel/Lumen framework?
Below is what happens when I run php artisan queue:listen, while my jobs table has only one job.
And this is my code:
public function handle(Xero $xero)
{
    $this->getAndCreateXeroSnapshotID();
    $this->importInvoices($xero);
    $this->importBankTransaction($xero);
    $this->importBankStatement($xero);
    $this->importBalanceSheet($xero);
    $this->importProfitAndLoss($xero);
}
In order for a job to leave the queue, it must reach the end of the handle function without errors or exceptions.
There must be something breaking inside one or more of your functions.
If an exception is thrown while the job is being processed, the job will automatically be released back onto the queue so it may be attempted again. https://laravel.com/docs/5.8/queues
The same behavior can be achieved with
$this->release()
If you can't figure out what is breaking, you can set your job to run only once. If an error is thrown, the job will be considered failed and will be put in the failed jobs table.
The maximum number of attempts is defined by the --tries switch used
on the queue:work Artisan command. https://laravel.com/docs/5.8/queues
php artisan queue:work --tries=1
If you are using the database queue driver (awesome for debugging), run these commands to create the failed jobs table:
php artisan queue:failed-table
php artisan migrate
Finally, to find out what is wrong with your code, you can catch and log the error:
public function handle(Xero $xero)
{
    try {
        $this->getAndCreateXeroSnapshotID();
        $this->importInvoices($xero);
        $this->importBankTransaction($xero);
        $this->importBankStatement($xero);
        $this->importBalanceSheet($xero);
        $this->importProfitAndLoss($xero);
    } catch (\Exception $e) {
        Log::error($e->getMessage());
    }
}
You could also set your error log channel to Slack, Bugsnag, or whatever. Just be sure to check it. Please don't be offended, it's normal to screw up when dealing with Laravel queues. How do you think I got here?
Laravel will try to run the job again and again.
php artisan queue:work --tries=3
The above command will try to run the jobs only 3 times.
Hope this helps.
In my case the problem was the payload: I had declared the variable as private, but it needs to be protected.
class EventJob implements ShouldQueue
{
    use InteractsWithQueue, Queueable, SerializesModels;

    // payload
    protected $command;

    // Maximum tries of this event
    public $tries = 5;

    public function __construct(CommandInterface $command)
    {
        $this->command = $command;
    }

    public function handle()
    {
        $event = I_Event::create([
            'event_type_id' => $this->command->getEventTypeId(),
            'sender_url' => $this->command->getSenderUrl(),
            'sender_ip' => $this->command->getSenderIp()
        ]);

        return $event;
    }
}
The solution that worked for me was to delete the job once it has been processed, after pushing it onto the queue.
Consider this example:
class SomeController extends Controller
{
    public function uploadProductCsv()
    {
        // process the file here and push the job onto the queue
        Queue::push('SomeController@processFile', $someDataArray);
    }

    public function processFile($job, $data)
    {
        // logic to process the data

        $job->delete(); // after the processing completes, delete the job
    }
}
Note: this is implemented for Laravel 4.2.
I am wondering how to deal with this. I have a webhook endpoint which responds to a webhook call from GitHub.
It starts a long-running process in which it clones the repository from which the webhook call was made.
/**
 * The webhook endpoint.
 *
 * @param Request $request
 * @return mixed
 * @throws \Exception
 */
public function webhook(Request $request)
{
    // The type of GitHub event that we receive.
    $event = $request->header('X-GitHub-Event');

    $url = $this->createCloneUrl();

    $this->cloneGitRepo($url);

    return new Response('Webhook received successfully', 200);
}
The problem with this is that GitHub reports an error when the response does not arrive soon enough:
We couldn’t deliver this payload: Service Timeout
This is rightfully so, because my cloneGitRepo method is blocking the response and taking too long.
How can I still deliver a response to acknowledge to GitHub that the webhook call has been made successfully, and start my long-running process?
I am using Laravel for all of this with Redis; maybe something can be accomplished there? I am open to all suggestions.
What you're looking for is a queued job. Laravel makes this very easy with Laravel Queues.
With queues, you set up a queue driver (database, Redis, Amazon SQS, etc.), and then you have one to many queue workers continuously running. When you put a job on the queue from your webhook method, it will be picked up by one of your queue workers and run in a separate process. The act of dispatching a queued job is very quick, though, so your webhook method will return promptly while the real work is done by a queue worker.
The linked documentation has all the details, but the general process will be:
Set up a queue connection. You mention you're already using Redis; I would start with that.
Use php artisan make:job CloneGitRepo to create a CloneGitRepo job class.
It should implement the Illuminate\Contracts\Queue\ShouldQueue interface so that Laravel knows to send this job to a queue when it is dispatched.
Make sure to define properties on the class for any data you pass into the constructor. This is necessary so the worker can rebuild the job correctly when it is pulled off the queue.
The queue worker will call the handle() method to process the job. Any dependencies can be type hinted here and they will be injected from the IoC container.
To dispatch the job to the queue, you can either use the global dispatch() helper function, or call the static dispatch() method on the job itself.
dispatch(new CloneGitRepo($url));
CloneGitRepo::dispatch($url);
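For illustration, a minimal CloneGitRepo job might look like the sketch below (the target directory and the use of exec() are assumptions, not prescribed by Laravel):

<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;

class CloneGitRepo implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable;

    // Stored as a property so it survives serialization onto the queue.
    protected $url;

    public function __construct(string $url)
    {
        $this->url = $url;
    }

    public function handle()
    {
        // The long-running work happens here, in the worker process,
        // after the webhook has already returned its response.
        $target = storage_path('repos/' . sha1($this->url));

        exec(sprintf('git clone %s %s', escapeshellarg($this->url), escapeshellarg($target)));
    }
}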
So, your webhook would look like:
public function webhook(Request $request)
{
    // The type of GitHub event that we receive.
    $event = $request->header('X-GitHub-Event');

    $url = $this->createCloneUrl();

    CloneGitRepo::dispatch($url);

    return new Response('Webhook received successfully', 200);
}
I want to do something like this in my fire method:
class MyClass
{
    public function fire($job)
    {
        if (something) {
            $job->fail();
        } else {
            // processing
        }

        $job->delete();
    }
}
There is no such method as fail(). Is it possible to do something like this?
There is no such thing as failing a job, but here is what you can do:
Release it back onto the queue with
$job->release();
After the defined number of attempts it will end up in the failed jobs table.
Throw an exception. The job will be released back onto the queue on its own.
If you're using beanstalkd as the queue driver, you can bury a job:
$job->bury();
If your condition is unrecoverable, you can log that fact and simply delete the job.
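Putting those options together, a fire() method might look like this sketch (shouldRetry() is a hypothetical helper; adapt the conditions to your job):

public function fire($job, $data)
{
    try {
        // ... process $data ...
    } catch (\Exception $e) {
        if ($this->shouldRetry($e)) {
            // Recoverable: put it back; after the configured number of
            // attempts it will land in the failed jobs table.
            $job->release(30);
        } else {
            // Unrecoverable: record why, then drop the message for good.
            Log::error('Job aborted: ' . $e->getMessage());
            $job->delete();
        }

        return;
    }

    // Success: remove the message from the queue.
    $job->delete();
}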