Laravel retryUntil job exits after 4th retry without failing - php

I am trying to debug some bizarre behaviour in my PHP application. It is running Laravel 6 + AWS SQS. The program downloads call recordings from a VoIP provider's API using a job. The API has a heavy rate limit of 10 req/minute, so I'm throttling the requests on my side. The job is configured to keep retrying for up to 24 hours using the retryUntil method. However, the job disappears from the queue after the 4th attempt. It doesn't fail: the job's failed method never gets executed (I've put logging and Sentry::capture in there) and it's not in the failed_jobs table. The last log entry says "Cannot complete job, retrying in ... seconds", which is written right before the release call. After that, the job simply disappears from the queue and is never executed again.
I am logging the number of attempts, max tries, timeoutAt, etc. Everything seems to be configured properly. Here's (the essence of) my code:
public function handle()
{
    /** @var Track $track */
    $track = Track::idOrUuId($this->trackId);

    $this->logger->info('Downloading track', [
        'trackId' => $track->getId(),
        'attempt' => $this->attempts(),
        'retryUntil' => $this->job->timeoutAt(),
        'maxTries' => $this->job->maxTries(),
    ]);

    $throttleKey = sprintf('track.download.%s', $track->getUser()->getTeamId());

    if (!$this->rateLimiter->tooManyAttempts($throttleKey, self::MAX_ALLOWED_JOBS)) {
        $this->downloadTrack($track);
        $this->rateLimiter->hit($throttleKey, 60);
    } else {
        $delay = random_int(10, 100) + $this->rateLimiter->availableIn($throttleKey);

        $this->logger->info('Throttling track download.', [
            'trackId' => $track->getId(),
            'delay' => $delay,
        ]);

        $this->release($delay);
    }
}

public function retryUntil(): DateTimeInterface
{
    return now()->addHours(24);
}

public function failed(Exception $exception)
{
    $this->logger->info('Job failed', ['exception' => $exception->getMessage()]);
    Sentry::captureException($exception);
}

I found the problem and I'm posting it here for anyone who might struggle with this in the future. It all came down to a simple configuration setting. In AWS SQS, the queue I am working with has a DLQ (Dead-Letter Queue) configured and Maximum receives set to 4. According to the SQS docs:
The Maximum receives value determines when a message will be sent to the DLQ. If the ReceiveCount for a message exceeds the maximum receive count for the queue, Amazon SQS moves the message to the associated DLQ (with its original message ID).
Since this is an infrastructure-level configuration, it overrides any Laravel parameters you might pass to the job (retryUntil, maxTries, and so on). And because the message is simply moved off the queue rather than failing, the processing job does not actually fail, so the failed method is never executed.
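If you want Laravel's retryUntil to actually govern the retries, the redrive policy's maxReceiveCount has to be at least as high as the number of attempts you expect within that window (or the DLQ has to be removed). You can change this in the SQS console; below is a rough sketch of doing it programmatically with the AWS SDK for PHP, where the region, queue URL and DLQ ARN are placeholders for your own values:

use Aws\Sqs\SqsClient;

// Placeholder client configuration; credentials come from the environment or instance profile.
$sqs = new SqsClient([
    'region'  => 'eu-west-1',
    'version' => '2012-11-05',
]);

// Raise "Maximum receives" on the work queue's redrive policy so SQS does not
// move the message to the DLQ before Laravel's retryUntil window has elapsed.
$sqs->setQueueAttributes([
    'QueueUrl'   => 'https://sqs.eu-west-1.amazonaws.com/123456789012/my-queue',
    'Attributes' => [
        'RedrivePolicy' => json_encode([
            'deadLetterTargetArn' => 'arn:aws:sqs:eu-west-1:123456789012:my-queue-dlq',
            'maxReceiveCount'     => 1000,
        ]),
    ],
]);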

Related

Rate limiting PHP function

I have a PHP function that gets called when someone sends a POST request to www.example.com/webhook. However, the external service (which I cannot control) sometimes calls this URL twice in rapid succession, messing with my logic, since the webhook persists things in the database and that takes a few ms to complete.
In other words, when the second request comes in (and it cannot be ignored), the first request has likely not completed yet; however, I need the requests to be processed in the order they came in.
So I've created a little hack in Laravel that should "throttle" the execution with 5 seconds in between. It seems to work most of the time, but due to an error in my code or some other oversight, this solution does not work every time.
function myWebhook() {
    // Compare the cached timestamp (defaults to 0) with the current time.
    while (Cache::get('g2a_webhook_timestamp', 0) + 5 > Carbon::now()->timestamp) {
        // Postpone execution.
        sleep(1);
    }

    // Store the current timestamp in the cache (file storage) for 1 minute.
    Cache::put('g2a_webhook_timestamp', Carbon::now()->timestamp, 1);

    // Execute rest of code ...
}
Anyone perhaps got a watertight solution for this issue?
You have essentially designed your own simplified queue system, which is the right approach, but you can make use of the native Laravel queue for a more robust solution to your problem.
Define a job, e.g.: ProcessWebhook
When a POST request is received at /webhook, queue the job
The Laravel queue worker will process one job at a time[1] in the order they're received, ensuring that no matter how many requests come in, they'll be processed one by one and in order.
The implementation of this would look something like this:
Create a new Job, e.g: php artisan make:job ProcessWebhook
Move your webhook processing code into the handle method of the job, e.g:
public $data;

public function __construct($data)
{
    $this->data = $data;
}

public function handle()
{
    Model::where('x', 'y')->update([
        'field' => $this->data->newValue
    ]);
}
Modify your Webhook controller to dispatch a new job when a POST request is received, e.g:
public function webhook(Request $request)
{
    $data = $request->getContent();

    ProcessWebhook::dispatch($data);
}
Start your queue worker, php artisan queue:work, which will run in the background processing jobs in the order they arrive, one at a time.
That's it, a maintainable solution to processing webhooks in order, one-by-one. You can read the Queue documentation to find out more about the functionality available, including retrying failed jobs which can be very useful.
[1] Laravel will process one job at a time per worker. You can add more workers to improve queue throughput for other use cases but in this situation you'd just want to use one worker.
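As a small illustration of that last point, retry behaviour can be declared right on the job class. This is a sketch using standard Laravel queue options (the trait/namespace list matches recent Laravel versions and the value is arbitrary):

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class ProcessWebhook implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    // Give up after 3 attempts; the job is then marked as failed and
    // recorded in the failed_jobs table, from where it can be retried.
    public $tries = 3;

    // __construct() and handle() as shown above...
}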

Send response but keep long running script going to prevent timeout?

I am wondering how to deal with this. I have a webhook endpoint which responds to a webhook call from Github.
It starts a long running process in where it clones the repository from which the webhook call was made.
/**
 * The webhook endpoint.
 *
 * @param Request $request
 * @return mixed
 * @throws \Exception
 */
public function webhook(Request $request)
{
    // The type of GitHub event that we receive.
    $event = $request->header('X-GitHub-Event');

    $url = $this->createCloneUrl();

    $this->cloneGitRepo($url);

    return new Response('Webhook received successfully', 200);
}
The problem with this is that GitHub reports an error when the response is not delivered soon enough:
We couldn’t deliver this payload: Service Timeout
This is understandable, because my cloneGitRepo method is blocking the response and simply takes too long.
How can I still deliver a response to acknowledge to GitHub that the webhook call was received successfully, and then start my long-running process?
I am using Laravel for all of this, together with Redis; maybe something can be accomplished there? I am open to all suggestions.
What you're looking for is a queued job. Laravel makes this very easy with Laravel Queues.
With queues, you set up a queue driver (database, Redis, Amazon SQS, etc.), and then you have one or more queue workers that are continuously running. When you put a job on the queue from your webhook method, it will be picked up by one of your queue workers and run in a separate process. The act of dispatching a queued job is very quick, though, so your webhook method will return quickly while the real work is done by the queue worker.
The linked documentation has all the details, but the general process will be:
Set up a queue connection. You mention you're already using Redis; I would start with that.
Use php artisan make:job CloneGitRepo to create a CloneGitRepo job class.
It should implement the Illuminate\Contracts\Queue\ShouldQueue interface so that Laravel knows to send this job to a queue when it is dispatched.
Make sure to define properties on the class for any data you pass into the constructor. This is necessary so the worker can rebuild the job correctly when it is pulled off the queue.
The queue worker will call the handle() method to process the job. Any dependencies can be type hinted here and they will be injected from the IoC container.
To dispatch the job to the queue, you can either use the global dispatch() helper function, or call the static dispatch() method on the job itself.
dispatch(new CloneGitRepo($url));
CloneGitRepo::dispatch($url);
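Putting those pieces together, the job class itself might look roughly like this. This is only a sketch: the namespaces and traits match recent Laravel versions, and the shell-based clone in handle() is a stand-in for however you actually clone the repository:

<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class CloneGitRepo implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    /** @var string */
    private $url;

    public function __construct($url)
    {
        // The URL is stored on a property so it is serialized with the job payload.
        $this->url = $url;
    }

    public function handle()
    {
        // Stand-in clone logic; replace with your own implementation.
        $target = storage_path('repos/' . sha1($this->url));

        exec(sprintf('git clone %s %s', escapeshellarg($this->url), escapeshellarg($target)));
    }
}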
So, your webhook would look like:
public function webhook(Request $request)
{
    // The type of GitHub event that we receive.
    $event = $request->header('X-GitHub-Event');

    $url = $this->createCloneUrl();

    CloneGitRepo::dispatch($url);

    return new Response('Webhook received successfully', 200);
}

Laravel email with queue 550 error (too many emails per second)

Our emails are failing to send using Laravel with a Redis Queue.
The code that triggers the error is this: ->onQueue('emails')
$job = (new SendNewEmail($sender, $recipients))->onQueue('emails');
$job_result = $this->dispatch($job);
In combination with this in the job:
use InteractsWithQueue;
Our error message is:
Feb 09 17:15:57 laravel: message repeated 7947 times: [ production.ERROR: exception 'Swift_TransportException' with message 'Expected response code 354 but got code "550", with message "550 5.7.0 Requested action not taken: too many emails per second "' in /home/laravel/app/vendor/swiftmailer/swiftmailer/lib/classes/Swift/Transport/AbstractSmtpTransport.php:383 Stack trace: #0 /home/laravel/app/vendor/swiftmailer/swiftmailer/lib/classes/Swift/Transport/AbstractSmtpTransport.php(281):
Our error only happens using Sendgrid and not Mailtrap, which fakes email sending. I've talked with Sendgrid: the emails never touched their servers, and their service was fully operational when my error occurred. So the error appears to be on my end.
Any thoughts?
Seems like only Mailtrap sends this error, so either open another account or upgrade to a paid plan.
I finally figured out how to set up the entire Laravel app to throttle mail based on a config.
In the boot() function of AppServiceProvider,
$throttleRate = config('mail.throttleToMessagesPerMin');

if ($throttleRate) {
    $throttlerPlugin = new \Swift_Plugins_ThrottlerPlugin($throttleRate, \Swift_Plugins_ThrottlerPlugin::MESSAGES_PER_MINUTE);

    Mail::getSwiftMailer()->registerPlugin($throttlerPlugin);
}
In config/mail.php, add this line:
'throttleToMessagesPerMin' => env('MAIL_THROTTLE_TO_MESSAGES_PER_MIN', null), //https://mailtrap.io has a rate limit of 2 emails/sec per inbox, but consider being even more conservative.
In your .env files, add a line like:
MAIL_THROTTLE_TO_MESSAGES_PER_MIN=50
The only problem is that it doesn't seem to affect mail sent via the later() function if QUEUE_DRIVER=sync.
For debugging only!
If you don't expect more than 5 emails and don't have the option to change Mailtrap, try:
foreach ($emails as $email) {
    ...
    Mail::send(... $email);

    if (env('MAIL_HOST', false) == 'smtp.mailtrap.io') {
        sleep(1); // use usleep(500000) for half a second or less
    }
}
Using sleep() is really bad practice. In theory, this code should only execute in a test environment or in debug mode.
Maybe you should make sure it was really sent via Sendgrid and not Mailtrap. Sendgrid's hard rate limit currently seems to be 3k requests per second, versus 3 requests per second for Mailtrap on the free plan :)
I used sleep(5) to wait five seconds before using mailtrap again.
foreach ($this->suscriptores as $suscriptor) {
    \Mail::to($suscriptor->email)
        ->send(new BoletinMail($suscriptor, $sermones, $entradas));

    sleep(5);
}
I used the sleep(5) inside a foreach. The foreach traverses all the emails stored in the database, and the sleep(5) halts the loop for five seconds before continuing with the next email.
I achieved this on Laravel v5.8 by setting the authentication routes manually. The routes are located in the file routes/web.php. Below are the routes that need to be added to that file:
Auth::routes();

Route::get('email/verify', 'Auth\VerificationController@show')->name('verification.notice');
Route::get('email/verify/{id}', 'Auth\VerificationController@verify')->name('verification.verify');

Route::group(['middleware' => 'throttle:1,1'], function () {
    Route::get('email/resend', 'Auth\VerificationController@resend')->name('verification.resend');
});
Explanation:
Do not pass any parameters to Auth::routes(); this lets you configure the authentication routes manually.
Wrap the email/resend route inside a Route::group with the throttle:1,1 middleware (the two numbers represent the maximum number of attempts and the decay time in minutes for those attempts).
I also removed a line of code in the file app\Http\Controllers\Auth\VerificationController.php in the __construct function.
I removed this:
$this->middleware('throttle:6,1')->only('verify', 'resend');
You need to rate limit the emails queue.
The "official" way is to set up the Redis queue driver, but that is hard and time-consuming.
So I wrote a custom queue worker, mxl/laravel-queue-rate-limit, that uses Illuminate\Cache\RateLimiter to rate limit job execution (the same class Laravel uses internally to rate limit HTTP requests).
In config/queue.php, specify a rate limit for the emails queue (for example, 2 emails per second):
'rateLimit' => [
    'emails' => [
        'allows' => 2,
        'every' => 1
    ]
]
And run a worker for this queue:
$ php artisan queue:work --queue emails
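Anything pushed onto the emails queue is then rate limited by that worker. Queuing mail onto it is just the standard Laravel call; the mailable class and variables below are assumptions for illustration:

// Queue the message on the rate-limited "emails" queue instead of sending it inline.
// OrderShipped, $order and $recipient are hypothetical names.
Mail::to($recipient)->queue(
    (new \App\Mail\OrderShipped($order))->onQueue('emails')
);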
I had this problem when working with Mailtrap. I was sending 10 mails in one second and it was treated as spam, so I had to add a delay between jobs. Take a look at the solution (it's Laravel; I have a system_jobs table and use queue=database).
Make a static function that checks the last job's time and adds self::DELAY_IN_SECONDS to it (however many seconds you want between jobs):
public static function addSecondsToQueue() {
    $job = SystemJobs::orderBy('available_at', 'desc')->first();

    if ($job) {
        $now = Carbon::now()->timestamp;
        $jobTimestamp = $job->available_at + self::DELAY_IN_SECONDS;
        $result = $jobTimestamp - $now;

        return $result;
    } else {
        return 0;
    }
}
Then use it to send messages with a delay (taking into consideration the last job in the queue):
Mail::to($mail)->later(SystemJobs::addSecondsToQueue(), new SendMailable($params));

Breaking out of a Gearman loop

I have a php application that gets requests for part numbers from our server. At that moment, we reach out to a third party API to gather pricing information to make sure we have the latest pricing for that particular request. Sometimes the third party API is slow or it might be down, so we have a database that stores the latest pricing requests for each particular part number that we can use as a fallback. I'd like to run the request to the third party API and the database in parallel using Gearman. Here is the idea:
Receive request
Through gearman, create two jobs:
Request to third party API
MySQL database lookup
Wait in a loop and return the results based on the following conditions:
If the third party API has completed, return that result immediately
If an elapsed time has passed (e.g. 2 seconds) and the third party API hasn't responded, return the MySQL lookup data
Using Gearman, my thoughts were to either run the two tasks in the foreground and break out of runTasks() within the setCompleteCallback() call, or to run them in the background and check on the two tasks within a separate loop using jobStatus().
Unfortunately, I can't get either route to work for me while still getting access to the resulting data. Is there a better way, or are there existing examples of how someone has made this work?
I think you've described a single blocking problem, namely the result of a 3rd-party API lookup. There are two ways you can handle this, from my point of view: either you abort the attempt altogether once you decide you've run out of time, or you report back to the client that you ran out of time but continue with the lookup anyway, just to update your local cache in case the API responds slower than you would like. I'll describe how I would go about the former, because it's easier.
From the client side:
$request = array(
    'productId' => 5,
);

$client = new GearmanClient();
$client->addServer('127.0.0.1', 4730);

$results = json_decode($client->doNormal('apiPriceLookup', json_encode($request)));

if ($results && property_exists($results, 'success') && $results->success) {
    // Use the fresh data returned by the API lookup
} else {
    // Fall back to the local (database) data
}
This will create a job on the job server with a function name of 'apiPriceLookup' and pass it workload data containing a product id of 5. It will wait for the results to come back and check for a success property. If it exists and is true, then the API lookup was successful.
The idea is then to set the timeout condition in the worker task, which depends entirely on how you're implementing the API lookup. If you're using cURL (or some wrapper around cURL), you can see the answer to how to detect a timeout here.
From the worker side:
$worker = new GearmanWorker();
$worker->addServer();
$worker->addFunction("apiPriceLookup", "apiPriceLookup");

while ($worker->work());

function apiPriceLookup($job)
{
    $payload = json_decode($job->workload());

    try {
        $results = [
            'data' => PerformApiLookupForProductId($payload->productId),
            'success' => true,
        ];
    } catch (Exception $e) {
        $results = ['success' => false];
    }

    return json_encode($results);
}
This just creates a GearmanWorker object and subscribes it to the apiPriceLookup function. The worker will call apiPriceLookup whenever a client submits a task to the job server. That function calls out to another function, PerformApiLookupForProductId, which should be written so as to throw an exception whenever a timeout condition occurs.
I don't think this would be considered using exceptions to control logic flow; timeout conditions generally are (or should be) exceptional events. For instance, Guzzle will throw a GuzzleHttp\Exception\RequestException when it has decided to time out.
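For illustration, PerformApiLookupForProductId could be written along these lines with cURL so that a timeout (or any transport error) surfaces as an exception the worker can catch. The endpoint URL and the 2-second timeout are assumptions:

function PerformApiLookupForProductId($productId)
{
    // Hypothetical pricing endpoint; substitute the real third-party API URL.
    $ch = curl_init('https://api.example.com/prices/' . urlencode($productId));

    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => 2, // give up after 2 seconds
    ]);

    $body = curl_exec($ch);

    if ($body === false) {
        $error = curl_error($ch);
        curl_close($ch);

        // A timeout (or any other cURL error) becomes an exception for the worker to catch.
        throw new Exception('Price lookup failed: ' . $error);
    }

    curl_close($ch);

    return json_decode($body, true);
}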

Multiple queue listeners will run the same job over multiple processes

I have a simple web application that is written with the Laravel 4.2 framework. I have configured the Laravel queue component to add new queue items to a locally running beanstalkd server.
Essentially, there is a POST route that will add an item to the beanstalkd tube.
I then have supervisord set up to run artisan queue:listen as three separate processes. The issue that I am seeing is that the different queue:listen processes will end up spawning anywhere from one to three queue:worker processes for just one inserted job.
The end result being that one job inserted into the queue is sometimes being processed by multiple workers at the same time, something I am obviously trying to avoid.
The job code is relatively simple:
<?php

use App\Repositories\URLRepository;

class ProcessDataJob {

    private $urls;

    public function __construct(URLRepository $urls)
    {
        $this->urls = $urls;
    }

    public function fire($job, $data)
    {
        $input = $data['post'];

        if (!isset($data['post']) && !is_array($data['post'])) {
            Log::error('[Job #'.$job->getJobId().'] $input was empty inside CreateAuditJob. Deleting job. Quitting. Bye!');
            $job->delete();
            return false;
        }

        //
        // ... code that will take a few hours to run.
        //

        $job->delete();
        Log::info('[Job #'.$job->getJobId().'] ProcessDataJob was successful, deleting the job!');
        return true;
    }
}
The fun part is that most of the (duplicated) queue workers fail when deleting the job, with this left in the error log:
exception 'Pheanstalk_Exception_ServerException' with message 'Job 3248 NOT_FOUND: does not exist or is not reserved by client'
The ttr (Time to Run) is set to 172,800 seconds (or 48 hours), which is much larger than the time it would take for the job to complete.
What's the job's time_to_run when queued? If running the job takes longer than time_to_run seconds, the job is automatically re-queued and becomes eligible to be run by the next worker.
A reserved job has time_to_run seconds to be deleted, released or touched. Calling touch() restarts the timeout timer, so workers can use it to give themselves more time to finish. Or use a large enough value when queueing.
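If you go the touch() route, a rough sketch of what that could look like inside the long-running loop follows. It assumes the underlying Pheanstalk connection and job are reachable from the Laravel job wrapper (BeanstalkdJob has historically exposed getPheanstalk() and getPheanstalkJob(); verify against your version), and $workChunks / processChunk() are hypothetical stand-ins for your own work:

// Inside ProcessDataJob::fire($job, $data), while the hours-long work runs:
foreach ($workChunks as $chunk) {
    $this->processChunk($chunk); // hypothetical unit of work

    // Reset the ttr countdown so beanstalkd does not hand the job to another worker mid-run.
    $job->getPheanstalk()->touch($job->getPheanstalkJob());
}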
I've found the beanstalkd protocol document helpful
https://github.com/kr/beanstalkd/blob/master/doc/protocol.md
Since you are running with Laravel, verify your queue.php configuration file.
Change the ttr value from 60 (default) to something else that is better for you.
'beanstalkd' => array(
    'driver' => 'beanstalkd',
    'host'   => 'localhost',
    'queue'  => 'default',
    'ttr'    => 600, // Example
),
