Laravel process multiple jobs at the sametime - php

I am making an api that requires the job to be dispatched multiple times, however, each job takes 10 seconds, and it takes forever to process one by one. Is their anyway to run multiple job once?
GetCaptcha::dispatch($task_id)->afterCommit()->onQueue('default');

You can achieve that by running multiple workers at the same time.
From the Laravel docs:
To assign multiple workers to a queue and process jobs concurrently,
you should simply start multiple queue:work processes. This can either
be done locally via multiple tabs in your terminal or in production
using your process manager's configuration settings. When using
Supervisor, you may use the numprocs configuration value.
Read more here:
https://laravel.com/docs/9.x/queues#running-multiple-queue-workers
https://laravel.com/docs/9.x/queues#supervisor-configuration

Related

Laravel Queue - How to setup a FAST processor

I'm using Laravel 5.5 and I'm trying to setup some fast queue processing. I've been running into one roadblock after another.
This site is an employer/employee matching service. So when an employer posts a job position, it needs to then run through all the employees in our system and calculate a number of variables to determine how well they match to the job. We have this all figured out, but it takes a long time to process one at a time when you have thousands of employees in the system. So, I set up to write a couple of tables. The first is a simple table that defines the position ID and the status. The second is a table listing all the employee IDs, the position ID, and the status of that employee being processed. This takes only a few seconds to write and then allows the user to move on in the application.
Then I have another server setup to run a cron every minute that checks for new entries in the first table. When found, it marks it out as started and then grabs all the employees and runs through each employee and starts a queued job in Laravel. The job I have defined does properly submit to the queue and running queue:work does in fact process the job properly. This is all tested.
However, the problem I'm running into is that I've tried database (MySQL), Redis and SQS for the queue and they are all very slow. I was using this same server to try to operate the queue:work (using Supervisor and attempting to run up to 300 processes) but then created 3 clones that don't run the cron but only run Supervisor (100 processes per clone) and killed Supervisor on the first server. With database it would process ok, though to run through 10k queued jobs would take hours, but with SQS and Redis I'm getting a ton of failures. The scripts are taking too long or something. I checked the CPUs on the clones running the workers and they are barely hitting 40% so I'm not over-taxing the servers.
I was just reading about Horizon and I'm not sure if it would help the situation. I keep trying to find information about how to properly setup a queue processing system with Laravel and just keep running into more questions than answers.
Is anyone familiar with this stuff and have any advice on how to set this up correctly so that it's very fast and failure free (assuming my code has no bugs)?
UPDATE: Following some other post advice, I figured I'd share a few more details:
I'm using Forge as the setup tool with AWS EC2 servers with 2G of RAM.
Each of the three clones has the following worker configuration:
command=php /home/forge/default/artisan queue:work sqs --sleep=10 --daemon --quiet --timeout=30 --tries=3
process_name=%(program_name)s_%(process_num)02d
autostart=true
autorestart=true
stopasgroup=true
killasgroup=true
user=forge
numprocs=100
stdout_logfile=/home/forge/.forge/worker-149257.log
The database is on Amazon RDS.
I'm curious if the Laravel cache will work with the queue system. There's elements of the queued script that are common to every run so perhaps if I queued that data up from the beginning it may save some time. But I'm not convinced it will be a huge improvement.
If we ignore the actual logic processed by each job, and consider the overhead of running jobs alone, Laravel's queueing system can easily handle 10,000 jobs per hour, if not several times that, in the environment described in the question—especially with a Redis backend.
For a typical queue setup, 100 queue worker processes per box seems extremely high. Unless these jobs spend a significant amount of time in a waiting state—such as jobs that make requests to web services across a network and use only a few milliseconds processing the response—the large number of processes running concurrently will actually diminish performance. We won't gain much by running more than one worker per processor core. Additional workers create overhead because the operating system must divide and schedule compute time between all the competing processes.
I checked the CPUs on the clones running the workers and they are barely hitting 40% so I'm not over-taxing the servers.
Without knowing the project, I can suggest that it's possible that these jobs do spend some of their time waiting for something. You may need to tune the number of workers to find the sweet spot between idle time and overcrowding.
With database it would process ok, though to run through 10k queued jobs would take hours, but with sqs and redis I'm getting a ton of failures.
I'll try to update this answer if you add the error messages and any other related information to the question.
I'm curious if the Laravel cache will work with the queue system. There's elements of the queued script that are common to every run so perhaps if I queued that data up from the beginning it may save some time.
We can certainly use the cache API when executing jobs in the queue. Any performance improvement we see depends on the cost of reproducing the data for each job that we could store in the cache. I can't say for sure how much time caching would save because I'm not familiar with the project, but you could profile sections of the code in the job to find expensive operations.
Alternatively, we could cache reusable data in memory. When we initialize a queue worker using artisan queue:work, Laravel starts a PHP process and boots the application once for all of the jobs that the worker executes. This is different from the application lifecycle for a typical PHP web app wherein the application reboots for every request and disposes state at the end of each request. Because every job executes in the same process, we can create an object that caches shared job data in the process memory, perhaps by binding a singleton into the IoC container, which the jobs can read much faster than even a Redis cache store because we avoid the overhead needed to fetch the data from the cache backend.
Of course, this also means that we need to make sure that our jobs don't leak memory, even if we don't cache data as described above.
I was just reading about Horizon and I'm not sure if it would help the situation.
Horizon provides a monitoring service that may help to track down problems with this setup. It may also improve efficiency a bit if the application uses other queues that Horizon can distribute work between when idle, but the question doesn't seem to indicate that this is the case.
Each of the three clones has the following worker configuration:
command=php /home/forge/default/artisan queue:work sqs --sleep=10 --daemon --quiet --timeout=30 --tries=3
(Sidenote: for Laravel 5.3 and later, the --daemon option is deprecated, and the queue:work command runs in daemon mode by default.)

Laravel how to set or define number of workers?

In laravel you can start a queue listener with:
php artisan queue:listen
But how many workers (threads, processes) will be used to process the queue?
Is there any way to define the number of workers?
https://laravel.com/docs/queues#supervisor-configuration
You generate a config file where you define the number of workers.
numprocs=10
By running php artisan queue:listen only one process will be run and fetches the jobs from the queue. So the jobs will be fetched and processed one by one.
If you want to have more than one thread to process the queue jobs you need to run the listener many times in different consoles. But instead of running them manually you can use Supervisor to manage your threads then you will be able to configure the number of thread by setting numprocs parameter in Supervisor configuration setting

How do I batch multiple gearman jobs to be handled together by a worker?

I have multiple clients adding jobs to my gearman queue.
These jobs are documents ultimately destined to be batch uploaded to SOLR for indexing
I would like to grab multiple jobs from my queue and concatinate them together in batches of 1000 documents for performance reasons.
I'm open to using the gearman cmd tool, or any of their SDKs
I've been looking at the PHP extension and the only option $worker->work() is inadequate.
I found a forum post suggesting the use of grab_job() but that's from 2009 and the method doesnt seem to exist anymore.
Am I using gearman wrong or am I missing something?
Your php workers do persist between the jobs. So you can do the following: use $worker->work() in a typical manner - I mean, one job - one work() call. Set a counter inside your worker and save all the job workloads into internal variable. Every 1000 jobs also do your batch processing.

PHP Concurrency via Cron

I have a few scripts that need to run concurrently as separate processes. My plan is to have a cron job that executes multiple instances of these scripts at a set interval. Is this a good idea? What are the pros/cons to this approach? Are there any other options I need to consider?
Bottomline: I'm trying to mimic multithreading. Any race conditions will be handled via code (e.g. setting statuses in DB, etc.). The scripts are supposed to do processing intensive tasks (e.g. creating thumbnails, etc.).
You can use forking. The startup script would load all the default configurations and initializations, then fork child processes to do the processing. It could then monitor the processes to see if they are still running.
http://php.net/manual/en/function.pcntl-fork.php
Well, if you need it as a cronjob, go ahead. If you want multiple processes, you most likely want to use pcntl_fork to create multiple instances of the same script.
Depending on how quickly you want to react to those jobs and if you're looking to do processor intensive tasks then you can also spread out that processing using a queuing system. Check out Gearman or beanstalkd with multiple workers per machine if you have multiple cores/processors.
Doesn't PHP have fork()? While that's not really multithreading, it is a basic way of co-routines.
One con of using cron is that it will execute a copy of your script at the interval you set regardless of how many script processes are already running. This means the scripts need a way to communicate with each other so that a maximum of N scripts are kept running concurrently (excess scripts can just exit immediately).
An alternative to cron could be supervisord which will execute a configurable number of scripts and monitor each one so any that exit are respawned.

PHP - Activating Multiple Instances of a Process

I'm building a system that watches a queue and activates a set of tasks on a regular interval.
I'm interested in running multiple instances of my processing "bots" based on how many items are in the queue. So if there are 5 items I'll run two bots and if their are 10 I'll run four.
I know how to run multiple instances from CLI (manually), but how would I do this as a function of my application? And how would I properly track the creation and destruction of these bots?
It seems like cron (*nix) or task scheduler (windows) would be what you need.
http://en.wikipedia.org/wiki/Cron
http://msdn.microsoft.com/en-us/library/aa383614%28VS.85%29.aspx
These can run a PHP script that determine how many "bots" need run, calculations, etc. Anything PHP is capable of.
Also, for running the multiple bots in the background (after the main controller script has finished executing) you may want to look at PHP process forking.
You might also want to look at gearman ( http://gearman.org/ )

Categories