I have a system running several kinds of jobs (Laravel 5.1 queues). I use the database queue driver.
Supervisor makes sure 3 'instances' of queue:work are running as a daemon all the time.
ps aux | grep queue confirms this - three processes are waiting for jobs.
Sometimes I can see several job records (without a delay) in the database table, but only one of them ever has the reserved flag set to 1.
What am I doing wrong?
Why are the 3 daemon workers not taking care of the other jobs in the queue?
How can I make sure more than one job can be done at the same time?
UPDATE:
I wrote a job that sleeps for 15 seconds and then dispatches three new jobs (same class).
I ran this on my dev server and it worked. Three jobs were reserved at the same time. Once one of them is dealt with - another is taken from the queue and reserved. The behaviour one would expect.
Finally I ran the same scenario on my production server and it did not work: only one job was reserved at a time, even though the environment is similar, three queue:work processes are alive, etc.
I then asked supervisord to restart the workers and it started working as well.
So it does work this way. The problem is that I don't know what caused the issue I had or when it happens. How do I avoid it, and how do I know whether it's fine now?
The MySQL DB CPU is running at 95%+ basically at all times, even when there's seemingly no activity in the app. It doesn't happen right away, only once the app has been running for a while, but then it stays at 95% CPU even when there's seemingly no activity.
The number of active sessions / connections gradually climbs from dozens to even hundreds. Looking at the MySQL processes on RDS reveals a dozen processes trying to use 8% of the DB CPU each for some reason.
I've checked for Laravel jobs via php artisan queue:listen but nothing appears.
Checked the database and query logs. Many entries suggest a job or something similar running in a loop, but there is no indication of what the source is, because the queries being run are generic and could be called from many different places in the application.
We do not believe this is due to user activity, but if it is, it's some kind of user action which results in some kind of server loop.
Checked application and error logs and nothing in particular stands out.
I still don't know the root cause of why this is occurring, but I have discovered the following which is enough for me to solve the problem:
There is a scheduled custom command ($schedule->command(...)->everyTenMinutes()) registered in app/Console/Commands/Kernel.php.
At some point, the job fails to complete on time (so running instances of the command gradually build up) and/or it hits an error and gets stuck processing essentially the same records again and again in a loop.
protected $signature = 'minute:mycustomjob';
Over a period of several hours, the multiple running instances of the same command end up using 100% DB CPU due to the never-ending loop of queries. I verified the running processes with the following in the CLI: ps -ef | grep 'artisan', which listed about a dozen instances of this very CPU- and DB-intensive process running on the server during peak load times.
Killing the process by killing artisan jobs with the command's "signature" name dropped the CPU usage back down to 0%, further proving the job as the culprit:
sudo kill -9 `ps -ef | awk '/[a]rtisan minute:mycustomjob/{print $2}'`
The potential solutions I have in mind are: rewrite the job so it does not error; rewrite the job to be more efficient so it completes within 10 minutes; lower the frequency at which the job executes; or upgrade Laravel to a newer version which supports preventing task overlaps: https://laravel.com/docs/9.x/scheduling#preventing-task-overlaps
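The idea behind preventing overlaps can be illustrated outside Laravel. The following is a minimal sketch in Python, not the framework's implementation: a run claims a lock file atomically and a second run that finds the file simply skips. The names run_exclusively and lock_path are hypothetical.

```python
import os
import tempfile


def run_exclusively(lock_path, task):
    """Run task() only if no other run currently holds lock_path.

    Returns True if the task ran, False if the run was skipped.
    """
    try:
        # O_CREAT | O_EXCL fails atomically if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # a previous run is still in progress
    try:
        # Record the owner's PID so a stuck run can be identified later.
        os.write(fd, str(os.getpid()).encode())
        task()
        return True
    finally:
        os.close(fd)
        os.remove(lock_path)
```

Note that a process killed with kill -9 never reaches the cleanup step, so a real setup would also need some way to expire or clear a stale lock.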
I'm using Laravel 5.5 and I'm trying to set up some fast queue processing. I've been running into one roadblock after another.
This site is an employer/employee matching service. So when an employer posts a job position, it needs to then run through all the employees in our system and calculate a number of variables to determine how well they match to the job. We have this all figured out, but it takes a long time to process one at a time when you have thousands of employees in the system. So, I set up to write a couple of tables. The first is a simple table that defines the position ID and the status. The second is a table listing all the employee IDs, the position ID, and the status of that employee being processed. This takes only a few seconds to write and then allows the user to move on in the application.
Then I have another server set up to run a cron every minute that checks for new entries in the first table. When one is found, it marks it as started, grabs all the employees, and dispatches a queued job in Laravel for each employee. The job I have defined is submitted to the queue correctly, and running queue:work does in fact process it properly. This is all tested.
However, the problem I'm running into is that I've tried database (MySQL), Redis and SQS for the queue and they are all very slow. I was using this same server to try to operate the queue:work (using Supervisor and attempting to run up to 300 processes) but then created 3 clones that don't run the cron but only run Supervisor (100 processes per clone) and killed Supervisor on the first server. With database it would process ok, though to run through 10k queued jobs would take hours, but with SQS and Redis I'm getting a ton of failures. The scripts are taking too long or something. I checked the CPUs on the clones running the workers and they are barely hitting 40% so I'm not over-taxing the servers.
I was just reading about Horizon and I'm not sure if it would help the situation. I keep trying to find information about how to properly set up a queue processing system with Laravel and just keep running into more questions than answers.
Is anyone familiar with this stuff, and does anyone have advice on how to set this up correctly so that it's very fast and failure free (assuming my code has no bugs)?
UPDATE: Following some other post advice, I figured I'd share a few more details:
I'm using Forge as the setup tool with AWS EC2 servers with 2 GB of RAM.
Each of the three clones has the following worker configuration:
command=php /home/forge/default/artisan queue:work sqs --sleep=10 --daemon --quiet --timeout=30 --tries=3
process_name=%(program_name)s_%(process_num)02d
autostart=true
autorestart=true
stopasgroup=true
killasgroup=true
user=forge
numprocs=100
stdout_logfile=/home/forge/.forge/worker-149257.log
The database is on Amazon RDS.
I'm curious if the Laravel cache will work with the queue system. There's elements of the queued script that are common to every run so perhaps if I queued that data up from the beginning it may save some time. But I'm not convinced it will be a huge improvement.
If we ignore the actual logic processed by each job, and consider the overhead of running jobs alone, Laravel's queueing system can easily handle 10,000 jobs per hour, if not several times that, in the environment described in the question—especially with a Redis backend.
For a typical queue setup, 100 queue worker processes per box seems extremely high. Unless these jobs spend a significant amount of time in a waiting state—such as jobs that make requests to web services across a network and use only a few milliseconds processing the response—the large number of processes running concurrently will actually diminish performance. We won't gain much by running more than one worker per processor core. Additional workers create overhead because the operating system must divide and schedule compute time between all the competing processes.
I checked the CPUs on the clones running the workers and they are barely hitting 40% so I'm not over-taxing the servers.
Without knowing the project, I can suggest that it's possible that these jobs do spend some of their time waiting for something. You may need to tune the number of workers to find the sweet spot between idle time and overcrowding.
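As a starting point for that tuning, here is a rough sketch in Python. The formula (cores divided by the share of time a job actually computes) is a common rule of thumb and an assumption on my part, not something measured on this project; suggested_workers and wait_fraction are hypothetical names.

```python
def suggested_workers(cpu_cores, wait_fraction):
    """Rough starting point for the number of queue workers on one box.

    wait_fraction is the share of a job's wall-clock time spent waiting
    on I/O (0.0 means pure CPU work). Heuristic only:
    cores / (1 - wait_fraction), never fewer than one worker per core.
    """
    if not 0.0 <= wait_fraction < 1.0:
        raise ValueError("wait_fraction must be in [0, 1)")
    return max(cpu_cores, round(cpu_cores / (1.0 - wait_fraction)))
```

Treat the result as the first value to try for numprocs, then measure throughput and adjust up or down.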
With database it would process ok, though to run through 10k queued jobs would take hours, but with sqs and redis I'm getting a ton of failures.
I'll try to update this answer if you add the error messages and any other related information to the question.
I'm curious if the Laravel cache will work with the queue system. There's elements of the queued script that are common to every run so perhaps if I queued that data up from the beginning it may save some time.
We can certainly use the cache API when executing jobs in the queue. Any performance improvement we see depends on the cost of reproducing the data for each job that we could store in the cache. I can't say for sure how much time caching would save because I'm not familiar with the project, but you could profile sections of the code in the job to find expensive operations.
Alternatively, we could cache reusable data in memory. When we initialize a queue worker using artisan queue:work, Laravel starts a PHP process and boots the application once for all of the jobs that the worker executes. This is different from the application lifecycle for a typical PHP web app wherein the application reboots for every request and disposes state at the end of each request. Because every job executes in the same process, we can create an object that caches shared job data in the process memory, perhaps by binding a singleton into the IoC container, which the jobs can read much faster than even a Redis cache store because we avoid the overhead needed to fetch the data from the cache backend.
Of course, this also means that we need to make sure that our jobs don't leak memory, even if we don't cache data as described above.
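The in-memory approach can be sketched generically. This is a Python illustration of the pattern rather than Laravel code: one long-lived worker process loads the shared data on first use and every later job reads it from process memory. SharedJobData, run_worker and the "multiplier" field are hypothetical.

```python
class SharedJobData:
    """Loads expensive, job-independent data once per worker process."""

    def __init__(self, loader):
        self._loader = loader  # hypothetical expensive lookup (DB, API, ...)
        self._value = None
        self._loaded = False
        self.load_count = 0    # for demonstration: how often the loader ran

    def get(self):
        if not self._loaded:
            self._value = self._loader()
            self._loaded = True
            self.load_count += 1
        return self._value


def run_worker(jobs, shared):
    """Process jobs in one long-lived process, reusing the shared data."""
    results = []
    for employee_score in jobs:
        weights = shared.get()  # loaded on the first job, then served from memory
        results.append(employee_score * weights["multiplier"])
    return results
```

With three jobs the loader runs once, which is the saving described above. The data lives as long as the worker does, so it goes stale until the worker restarts.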
I was just reading about Horizon and I'm not sure if it would help the situation.
Horizon provides a monitoring service that may help to track down problems with this setup. It may also improve efficiency a bit if the application uses other queues that Horizon can distribute work between when idle, but the question doesn't seem to indicate that this is the case.
Each of the three clones has the following worker configuration:
command=php /home/forge/default/artisan queue:work sqs --sleep=10 --daemon --quiet --timeout=30 --tries=3
(Sidenote: for Laravel 5.3 and later, the --daemon option is deprecated, and the queue:work command runs in daemon mode by default.)
I want to implement a queue for sending out emails in Laravel. I have the queue working fine, but am worried about efficiency. These are my settings:
I have created the jobs table and set up the .env file, to use the queues with my local database.
I have set up this crontab on the server:
* * * * * php /var/www/imagine.dev/artisan schedule:run >> /dev/null 2>&1
And have set up a schedule in app\Console\Kernel.php, so I don't have to manually enter 'queue:listen' through the console every time.
$schedule->command('queue:listen');
Now to my question: is this efficient? I am worried about having queue:listen running all the time in the background, consuming CPU and memory.
I have been trying to only run the queue:listen once every 5 minutes, and then put it to sleep with
$schedule->command('queue:listen --sleep 300');
but again, am not sure if this is the best approach.
Another thing I tried is using 'queue:work', but this only processes one queue at a time.
Ideally, I would like a way to process all the queues every 5 minutes, avoiding constant use of memory and CPU.
What is the best approach?
Not sure which version of Laravel you're using, but I suspect it's 5.2 or earlier.
You do not need to run this every minute; it continues to run until it's manually stopped.
From Laravel 5.2 documentation:
Note that once this task has started, it will continue to run until it is manually stopped. You may use a process monitor such as Supervisor to ensure that the queue listener does not stop running.
So maybe you want to look into Supervisor.
Also, if this is helpful at all, you can chain ->everyFiveMinutes() onto $schedule. There are several other methods available as well; see the Laravel Scheduling documentation.
I have a cron job with PHP which I want to set up on my web host, but at the moment the script takes about 20 seconds to run with only 3 users' data being refreshed. If I get 1000 users, it's going to take ages. Is there an alternative to a cron job? Will my web host let me run a cron job which takes, for example, 10 minutes to run?
Your cron job can be as long as you want.
The main problem for you is that you must ensure the next cron job execution does not occur while the previous one is still running. There are many ways to avoid that; basically, use a semaphore.
It can be a lock file or a record in the database. Your cron job should check whether the previous run has finished. It is also a good idea to send yourself an email if the job cannot run because a previous run is taking too long (that way you get notice that something may be going wrong). By default, cron emails a job's output to the account running the job; depending on how the platform is configured, you could rely on this behavior, or open an SMTP connection from the job (or store the alert in a database table).
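The "record in the database" variant could look roughly like this. It is a sketch in Python with SQLite rather than PHP with MySQL, and the cron_locks table, the function names and the alert callback are all hypothetical; the primary key is what makes claiming the lock atomic.

```python
import sqlite3


def try_acquire(conn, name):
    """Claim the lock row for `name`; return False if it is already held."""
    conn.execute("CREATE TABLE IF NOT EXISTS cron_locks (name TEXT PRIMARY KEY)")
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO cron_locks (name) VALUES (?)", (name,))
        return True
    except sqlite3.IntegrityError:
        return False  # the row exists, so a previous run holds the lock


def release(conn, name):
    with conn:
        conn.execute("DELETE FROM cron_locks WHERE name = ?", (name,))


def run_cron_job(conn, name, job, alert):
    """Run job() unless a previous run still holds the lock; alert if skipped."""
    if not try_acquire(conn, name):
        alert(f"{name}: previous run still in progress, skipping")
        return False
    try:
        job()
        return True
    finally:
        release(conn, name)
```

The alert callback could print to standard output (so cron mails it), send mail directly, or write to an alerts table, matching the options listed above.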
If you want some alternatives to cron jobs, you should have a look at work queues. You can mix work queues with a cron job, or use a work queue in an Apache/PHP environment; there are lots of solutions, but the main idea is to make one single queue of things that should be done and execute them one after the other (but be careful: if you handle these tasks very slowly, you'll end up with a big, fat waiting queue).
Cron has no bearing on how long its job takes to complete. If your jobs are taking 20 seconds to complete, that is down to the PHP script, not cron.
Will my web host let me run a cron job which takes, for example, 10 minutes to run?
Ask your webhost.
If you want to learn about optimizing php scripts, take a look at Profiling PHP Code.
I am using MySQL as my database and PHP as my programming language. I want to run a cron job which runs until the current system date matches the "deadline" (date) column in my database table called "PROJECT". Once the dates are the same, an update query has to run which changes the status (a field of the project table) from "open" to "close".
I am not really sure if cron jobs are the best way, or whether I could use triggers or maybe something else. Also, I am using Apache as my web server and my OS is Windows Vista.
Also, which is the best way to do it: PHP scheduler, cron jobs, or some other method? Can anybody enlighten me?
I think your concept needs to change.
PHP cannot schedule a job on its own, and triggers in MySQL execute when a MySQL query occurs, not at a specific time.
This limitation usually isn't a problem in web development. The reason is that your PHP application should control all data going in and out. Usually, this means just the HTML that displays that data, or other formats for users or other programs.
In your case you can think about it this way. The deadline is a set date. You can treat it as data and save it to your database. When the deadline occurs is not important; what matters is that the data you have saved in your database is displayed correctly.
When a request is made to your application, check if the date of the deadline is in the past. If it is, then display that the project is closed, or update the project to closed just before display.
There really is no reason to update data independently of your PHP application.
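The check-on-request idea fits in a few lines. This is a sketch in Python rather than PHP; effective_status is a hypothetical helper, and the "open" and "close" values follow the question. It treats the deadline day itself as closed, which is an assumption based on the question's "once the dates are the same".

```python
from datetime import date


def effective_status(stored_status, deadline, today):
    """Status to display: an open project at or past its deadline is closed."""
    if stored_status == "open" and today >= deadline:
        return "close"
    return stored_status
```

The page handler would call this with the row's status and deadline and today's date, and could optionally write the new status back before rendering.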
Usually, the only things you want to schedule are jobs that would affect your application in terms of load, or that need to be done only once, or where concurrency or time is an issue.
In your case none of those apply.
PS: I haven't tried PHPscheduler, but I would guess it isn't a true scheduler. Cron is a daemon that sleeps until a given task is due in its queue, executes the task, then sleeps until the next one is due (at least that's what it does in the current algorithm). PHP cannot do that without the sockets and fork extensions and a special setup. So PHPscheduler is most likely just checking whether the date for a task has passed on each load of a web page (whenever PHP executes a page). This is no different from you just checking whether the project's deadline has passed, without the overhead of PHPscheduler.
I would always go for a cron job for anything scheduling related.
The big bonus point is that you can echo info out as well and it gets emailed to you.
You'll find once you start using cronjobs, it's hard to stop.
Cron does not exist, per se, in Vista, but what does exist is the standard Windows scheduling manager, which you can give a command line like "php -q -f myfile.php" to execute the PHP file at a given time.
You can also use a port of the cron program; there are many out there.
If timing is not critical to the second, any Windows scheduling application will do. Just be sure to have your PHP bin path in your PATH variable for simplicity.
For Windows CRON jobs I cannot recommend PyCron enough.
While CRON and Windows Scheduled Tasks are the tried and true ways of scheduling jobs/tasks to run on a regular basis, there are use cases where having a different scheduled task in CRON/Windows can become tedious. Namely when you want to let users schedule things to run, or for instances where you prefer simplicity/maintainability/portability/etc or all of the above.
In cases where I prefer to not use CRON/Windows for scheduled tasks, I build into the application a task scheduling system. This still requires 1 CRON job or Windows Task to be scheduled. The idea is to store Job details in the database (job name, job properties, last run time, run interval, anything else that is important for your implementation). You then schedule a "Master" job in CRON or Windows which handles running all of your other jobs for you. You'll need this master job to run at least as often as your shortest interval; if you want to be able to schedule jobs that run every minute the master job needs to run every minute.
You can then launch each scheduled job in the background from PHP with minimal effort (if you want). In memory constrained systems you can monitor memory usage or keep track of the PIDs (various methods) and limit to N jobs running at a given time.
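A stripped-down version of the master job might look like the sketch below. It is in Python with an in-memory list standing in for the database table, and the field names (name, interval, last_run) and function names are hypothetical.

```python
from datetime import datetime, timedelta


def due_jobs(jobs, now):
    """Return names of jobs whose interval has elapsed since their last run."""
    due = []
    for job in jobs:
        last_run = job["last_run"]
        # A job that has never run is always due.
        if last_run is None or now - last_run >= job["interval"]:
            due.append(job["name"])
    return due


def run_master(jobs, now, launch):
    """One pass of the master job: launch what is due and record the run time."""
    due = set(due_jobs(jobs, now))
    for job in jobs:
        if job["name"] in due:
            launch(job["name"])    # e.g. start the job in the background
            job["last_run"] = now  # would be an UPDATE on the jobs table
```

Cron or a Windows scheduled task would call run_master once per minute (or at whatever the shortest interval is), passing the current time and a function that actually starts the job.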
I've had a great deal of success with this method, YMMV however based on your needs and your implementation.
How about PHPscheduler? Is it not better than cron jobs? I think cron jobs would be independent of the application and hence difficult to move if one has to change hosts, though I am not really sure. It would be great if anyone could comment on this. Thanks!