Background
I'm running a Laravel 5.3 powered web app on nginx. Production is working fine (on an AWS t2.medium), and staging had been running fine until it recently got overloaded. Staging is a t2.micro.
Problem
The problem started when we began hitting the API endpoints and got this error:
503 (Service Unavailable: Back-end server is at capacity)
Using htop, we found that our beanstalkd queue processes were consuming an insane amount of memory.
What I have tried
We used telnet to peek into what's going on inside beanstalkd:
~/beanstalk-console$ telnet localhost 11300
Trying 127.0.0.1...
Connected to staging-api-3.
Escape character is '^]'.
stats
OK 940
---
current-jobs-urgent: 0
current-jobs-ready: 0
current-jobs-reserved: 0
current-jobs-delayed: 2
current-jobs-buried: 0
cmd-put: 451
cmd-peek: 0
cmd-peek-ready: 0
cmd-peek-delayed: 0
cmd-peek-buried: 0
cmd-reserve: 0
cmd-reserve-with-timeout: 769174
cmd-delete: 449
cmd-release: 6
cmd-use: 321
cmd-watch: 579067
cmd-ignore: 579067
cmd-bury: 0
cmd-kick: 0
cmd-touch: 0
cmd-stats: 1
cmd-stats-job: 464
cmd-stats-tube: 0
cmd-list-tubes: 0
cmd-list-tube-used: 0
cmd-list-tubes-watched: 0
cmd-pause-tube: 0
job-timeouts: 0
total-jobs: 451
max-job-size: 65535
current-tubes: 2
current-connections: 1
current-producers: 0
current-workers: 0
current-waiting: 0
total-connections: 769377
pid: 1107
version: 1.10
rusage-utime: 97.572000
rusage-stime: 274.560000
uptime: 1609870
binlog-oldest-index: 0
binlog-current-index: 0
binlog-records-migrated: 0
binlog-records-written: 0
binlog-max-size: 10485760
id: 906b3629b01390dc
hostname: staging-api-3
Nothing there seems concerning.
Question
I would like a more transparent look into what's going on in these jobs (i.e. what exactly are the jobs?). I know Laravel Horizon provides this, but it only comes with Laravel 5.5. I researched what queue monitors are out there and tried to install beanstalk console. Right now, having installed it, I'm getting 52.16.%ip%.%ip% took too long to respond, which I think is expected considering the whole machine is already jammed.
I figure that if I reboot the machine I can install beanstalk_console just fine, but then I'll lose the opportunity to investigate what's causing the problem this time around, since it's a rare occurrence. What else can I do to investigate and see exactly which jobs are draining the CPU, and why?
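Since telnet already works on the box, one low-overhead way to see what the jobs actually are is to keep using the raw beanstalkd protocol and peek at individual jobs. This is only a sketch; the tube name default is an assumption (Laravel pushes to a tube named after the configured queue), so substitute whatever list-tubes reports:

$ telnet localhost 11300
list-tubes
use default
stats-tube default
peek-ready
peek-delayed
peek-buried

stats-tube prints per-tube counters (ready/reserved/delayed/buried), and each peek-* command answers with FOUND <id> <bytes> followed by the job payload, which for a Laravel queue is the serialized job class and its data, so you can see exactly which job types are sitting in the tube (for instance, the 2 current-jobs-delayed shown above).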
Update
I restarted the instance, and the APIs work now, but the CPU is still at 100%. What am I missing?
Related
https://www.php.net/manual/en/features.commandline.webserver.php
From 7.4 onwards, I assume the PHP built-in server is capable of handling multiple incoming requests, up to a number equal to the environment variable PHP_CLI_SERVER_WORKERS.
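For reference, the variable only has an effect if it is present in the environment of the php process that runs the built-in server. A minimal sketch (worker count, port, and docroot are placeholders, not taken from the setup described below):

PHP_CLI_SERVER_WORKERS=8 php -S 0.0.0.0:8000 -t public

If a wrapper (such as the symfony CLI or a Docker entrypoint) starts the server for you, the variable needs to be set in that process's environment, e.g. via the container's environment section, before it launches php.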
I have a web app composed of a couple dozen AJAX-powered lists. On the first page load, using the built-in server, it slows to a crawl and usually fails with timeouts in the PHP scripts.
I read about the above feature, added the environment variable (PHP runs in a Docker container), shelled into my container, and ran top/ps. I can now see X number of PHP processes:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 20 0 722864 21932 14448 S 0.3 1.1 0:09.91 symfony
20 root 20 0 210892 48188 36544 S 0.0 2.4 0:00.79 php7.4
21 root 20 0 205676 33460 24500 S 0.0 1.6 0:00.22 php7.4
22 root 20 0 208212 40908 29640 S 0.0 2.0 0:00.42 php7.4
23 root 20 0 210644 42236 30836 S 0.0 2.1 0:00.61 php7.4
24 root 20 0 208764 40784 31176 S 0.0 2.0 0:01.14 php7.4
25 root 20 0 205804 33588 24508 S 0.0 1.6 0:00.22 php7.4
...
I am using Symfony to start the dev server, but no matter what I do, none of the processes seem to be carrying any of the load. What am I missing?
I switched to the built-in PHP server about 2 years ago, from nginx. It made my Vagrant setup (now Docker) easier, but performance took a hit, which I've dealt with. I'd like to improve the responsiveness of the app using this approach if possible.
Any ideas?
Background
I've been given a Laravel app whose queue is configured by Forge, and I'm trying to run it on my localhost, which is OS X.
This is what I did:
installed beanstalkd on OS X
ran the beanstalkd server from my console: $ beanstalkd
ran the Laravel worker command:
$ php artisan queue:work beanstalkd --env=local --queue=default
I then performed some actions that create jobs, but they never got processed. I used telnet as a poor man's monitor for beanstalkd, like so:
$ telnet localhost 11300
Trying ::1...
Connected to localhost.
Escape character is '^]'.
stats
OK 923
---
current-jobs-urgent: 0
current-jobs-ready: 3
current-jobs-reserved: 0
current-jobs-delayed: 0
current-jobs-buried: 0
cmd-put: 3
cmd-peek: 0
cmd-peek-ready: 0
cmd-peek-delayed: 0
cmd-peek-buried: 0
cmd-reserve: 0
cmd-reserve-with-timeout: 652
cmd-delete: 0
cmd-release: 0
cmd-use: 1
cmd-watch: 0
cmd-ignore: 0
cmd-bury: 0
cmd-kick: 0
cmd-touch: 0
cmd-stats: 8
cmd-stats-job: 0
cmd-stats-tube: 0
cmd-list-tubes: 0
cmd-list-tube-used: 0
cmd-list-tubes-watched: 0
cmd-pause-tube: 0
job-timeouts: 0
total-jobs: 3
max-job-size: 65535
current-tubes: 2
current-connections: 2
current-producers: 0
current-workers: 1
current-waiting: 0
total-connections: 8
pid: 56692
version: 1.10
rusage-utime: 0.010171
rusage-stime: 0.031001
uptime: 2023
binlog-oldest-index: 0
binlog-current-index: 0
binlog-records-migrated: 0
binlog-records-written: 0
binlog-max-size: 10485760
id: 3620777b4ee08cdc
Question
I can see that 3 jobs are ready, but I have no idea how to dispatch them (or, for that matter, find out what exactly is inside them). What should I do?
You can use the beanstalk console web app https://github.com/ptrofimov/beanstalk_console.
I would also log some info to a separate log file, recording the values and details of what happens inside the running job. Then I tail that log file while the queued jobs execute and watch the beanstalk console interface.
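A minimal sketch of that logging idea, using a hypothetical job class that is not part of the app in question; in Laravel 5.3 Log::info() writes to storage/logs/laravel.log by default, which you can tail with tail -f storage/logs/laravel.log while the worker runs:

<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Support\Facades\Log;

class ExampleJob implements ShouldQueue
{
    use InteractsWithQueue, Queueable, SerializesModels;

    private $orderId;

    public function __construct($orderId)
    {
        $this->orderId = $orderId;
    }

    public function handle()
    {
        // Record enough context to identify the job while tailing the log
        Log::info('ExampleJob started', ['orderId' => $this->orderId]);

        // ... the actual work goes here ...

        Log::info('ExampleJob finished', ['orderId' => $this->orderId]);
    }
}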
When I run the Yii2 migrate command from the console, I always get an "Out of memory" message. PHP does not normally show this error; it only happens with the migration command.
# ./yii migrate
Yii Migration Tool (based on Yii v2.0.13-dev)
Out of memory
However, PHP through Apache works absolutely fine; it's just on the CLI that I get this error. The machine is running CentOS 6 and PHP 5.6. System memory is 6 GB, which should be enough to run the command:
total used free shared buffers cached
Mem: 5971 1557 4413 128 119 440
-/+ buffers/cache: 998 4973
Swap: 0 0 0
Added:
I've also seen a Laravel artisan command show this error.
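Since the CLI reads its own php.ini, one quick diagnostic (a sketch only; a bare "Out of memory" can also come from the OS rather than from memory_limit, so this is not a confirmed cause) is to compare the CLI limit and retry the command with a larger one:

php -r "echo ini_get('memory_limit'), PHP_EOL;"
php -d memory_limit=512M ./yii migrate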
I never found the exact cause of this error, but when I reinstalled the PHP packages from scratch, the error disappeared. This may help you.
I was doing some testing with memcached using php sessions.
The following steps cause problems.
Start memcache
Login to application which creates a session
Restart memcache
Try to navigate to another page
Browser hangs for 30 seconds and logs out
Requests after logging in again take 30 seconds but work. Randomly, actions stop taking 30 seconds and speed returns to normal.
What is the cause of this odd behavior?
Sometimes I get the following error:
A PHP Error was encountered
Severity: Warning
Message: Unknown: Failed to write session data (memcache). Please verify that the current setting of session.save_path is correct (tcp://10.181.16.192:11211?persistent=1&weight=1&timeout=1&retry_interval=15)
Filename: Unknown
Line Number: 0
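For reference, the session configuration implied by that warning looks roughly like this; it is reconstructed from the message above, so treat it as an assumption rather than the exact php.ini in use:

session.save_handler = memcache
session.save_path = "tcp://10.181.16.192:11211?persistent=1&weight=1&timeout=1&retry_interval=15"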
EDIT:
If I restart memcache and then Apache, the problem does not occur.
I noticed the TCP connections change to CLOSE_WAIT when restarting memcached.
Restarting memcache and Apache back to back, without a delay, also avoids the problem.
It seems like there is some sort of bug in the way PHP handles connections to memcache, where it doesn't recognize that the connection is no longer valid, and that causes the issues described.
[root@php-pos-web ~]# netstat -natp | grep '11211'
tcp 1 0 10.181.16.33:58722 10.181.16.192:11211 CLOSE_WAIT 7574/httpd
tcp 205 0 10.181.16.33:58753 10.181.16.192:11211 ESTABLISHED 7583/httpd
tcp 1 0 10.181.16.33:58745 10.181.16.192:11211 CLOSE_WAIT 7578/httpd
tcp 1 0 10.181.16.33:58749 10.181.16.192:11211 CLOSE_WAIT 7573/httpd
There is a bug starting in version 3.0.4 of the memcache PECL extension. This does not happen on the latest stable release (2.2.7). I have reported the bug to the maintainers. I think it has something to do with session locking. The bug does not occur with the memcached extension.
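If you need to stay on the older extension until the bug is fixed, a downgrade sketch using PECL (version number taken from the answer above; adjust to your distro's packaging, and restart Apache afterwards so PHP reloads the extension):

pecl uninstall memcache
pecl install memcache-2.2.7
service httpd restart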
I'm trying to wrap Pheanstalk in my PHP job base class. I'm testing the reserve and reserve-with-delay functionality, and I've found that I can reserve a job from a second instance of my base class without the first instance releasing the job or the TTR timing out. This is unexpected, since I thought this is exactly what job queues are supposed to prevent. Here are the beanstalkd commands for the first put and the first reserve, along with timestamps. I also do a stats-job request at the end:
01:40:15: Sending command: use QueuedCoreEvent
01:40:15: Got response: USING QueuedCoreEvent
01:40:15: Sending command: put 1024 0 300 233
a:4:{s:9:"eventName";s:21:"ReQueueJob_eawu7xr9bi";s:6:"params";a:2:{s:12:"InstanceName";s:21:"ReQueueJob_eawu7xr9bi";s:17:"aValueToIncrement";i:123456;}s:9:"behaviors";a:1:{i:0;s:22:"BehMCoreEventTestDummy";}s:12:"failureCount";i:0;}
01:40:15: Got response: INSERTED 10
01:40:15: Sending command: watch QueuedCoreEvent
01:40:15: Got response: WATCHING 2
01:40:15: Sending command: ignore default
01:40:15: Got response: WATCHING 1
01:40:15: Sending command: reserve-with-timeout 0
01:40:15: Got response: RESERVED 10 233
01:40:15: Data: a:4:{s:9:"eventName";s:21:"ReQueueJob_eawu7xr9bi";s:6:"params";a:2:{s:12:"InstanceName";s:21:"ReQueueJob_eawu7xr9bi";s:17:"aValueToIncrement";i:123456;}s:9:"behaviors";a:1:{i:0;s:22:"BehMCoreEventTestDummy";}s:12:"failureCount";i:0;}
01:40:15: Sending command: stats-job 10
01:40:15: Got response: OK 162
01:40:15: Data: ---
id: 10
tube: QueuedCoreEvent
state: reserved
pri: 1024
age: 0
delay: 0
ttr: 300
time-left: 299
file: 0
reserves: 1
timeouts: 0
releases: 0
buries: 0
kicks: 0
So far, so good. Now I do another reserve from a second instance of my base class, followed by another stats-job request. Notice that the timestamps are within the same second, nowhere near the 300-second TTR I've set. Also notice in this second stats-job printout that there are 2 reserves of this job, with 0 timeouts and 0 releases.
01:40:15: Sending command: watch QueuedCoreEvent
01:40:15: Got response: WATCHING 2
01:40:15: Sending command: ignore default
01:40:15: Got response: WATCHING 1
01:40:15: Sending command: reserve-with-timeout 0
01:40:15: Got response: RESERVED 10 233
01:40:15: Data: a:4:{s:9:"eventName";s:21:"ReQueueJob_eawu7xr9bi";s:6:"params";a:2:{s:12:"InstanceName";s:21:"ReQueueJob_eawu7xr9bi";s:17:"aValueToIncrement";i:123456;}s:9:"behaviors";a:1:{i:0;s:22:"BehMCoreEventTestDummy";}s:12:"failureCount";i:0;}
01:40:15: Sending command: stats-job 10
01:40:15: Got response: OK 162
01:40:15: Data: ---
id: 10
tube: QueuedCoreEvent
state: reserved
pri: 1024
age: 0
delay: 0
ttr: 300
time-left: 299
file: 0
reserves: 2
timeouts: 0
releases: 0
buries: 0
kicks: 0
Does anyone have any idea what I might be doing wrong? Is there something I have to do to tell the queue that I want jobs to be accessed by only one worker at a time? I'm doing an unset on the Pheanstalk instance as soon as I get the job off the queue, which I believe terminates the session with beanstalkd. Could this cause beanstalkd to decide the worker has died and automatically release the job without a timeout? I'm uncertain how much beanstalkd relies on session state to determine worker state. I was assuming I could open and close sessions with impunity and that the job id was the only thing beanstalkd cared about to tie job operations together, but that may have been foolish on my part. This is my first foray into job queues.
Thanks!
My guess is your first client instance closed the TCP socket to the beanstalkd server before the second one reserved the job.
Closing the TCP connection implicitly releases the job back onto the queue. These implicit releases (close connection, quit command etc) do not seem to increment the releases counter.
Here's an example:
# Create a job, reserve it, close the connection:
pda@paulbookpro ~ > telnet 0 11300
Trying 0.0.0.0...
Connected to 0.
Escape character is '^]'.
put 0 0 600 5
hello
INSERTED 1
reserve
RESERVED 1 5
hello
^]
telnet> close
Connection closed.
# Reserve the job, stats-job shows two reserves, zero releases.
# Use 'quit' command to close connection.
pda@paulbookpro ~ > telnet 0 11300
Trying 0.0.0.0...
Connected to 0.
Escape character is '^]'.
reserve
RESERVED 1 5
hello
stats-job 1
OK 151
---
id: 1
tube: default
state: reserved
pri: 0
age: 33
delay: 0
ttr: 600
time-left: 593
file: 0
reserves: 2
timeouts: 0
releases: 0
buries: 0
kicks: 0
quit
Connection closed by foreign host.
# Reserve the job, stats-job still shows zero releases.
# Explicitly release the job, stats-job shows one release.
pda@paulbookpro ~ > telnet 0 11300
Trying 0.0.0.0...
Connected to 0.
Escape character is '^]'.
reserve
RESERVED 1 5
hello
stats-job 1
OK 151
---
id: 1
tube: default
state: reserved
pri: 0
age: 46
delay: 0
ttr: 600
time-left: 597
file: 0
reserves: 3
timeouts: 0
releases: 0
buries: 0
kicks: 0
release 1 0 0
RELEASED
stats-job 1
OK 146
---
id: 1
tube: default
state: ready
pri: 0
age: 68
delay: 0
ttr: 600
time-left: 0
file: 0
reserves: 3
timeouts: 0
releases: 1
buries: 0
kicks: 0
quit
Connection closed by foreign host.
I got the same issue. The problem was multiple connections opened to beanstalkd.
<?php

require __DIR__ . '/vendor/autoload.php';

use Pheanstalk\Pheanstalk;
use Pheanstalk\Job;

$pheanstalk = connect();
$pheanstalk->put(serialize([1]), 1, 0, 1800);

/** @var Job $job */
$job = $pheanstalk->reserve(10);
print_r($pheanstalk->statsJob($job->getId()));
// state: reserved, but only the connection that reserved the job can delete/update it

$pheanstalk2 = connect();
print_r($pheanstalk->statsJob($job->getId()));

// A new connection opened in the same process still cannot update the job:
// PHP Fatal error: Uncaught Pheanstalk\Exception\ServerException: Cannot delete job 89: NOT_FOUND in /var/www/vendor/pda/pheanstalk/src/Command/DeleteCommand.php:45
$pheanstalk2->delete($job);

function connect() {
    return new Pheanstalk(
        'localhost',
        11300,
        5
    );
}
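In other words, the fix is to reserve and delete (or release) on the same Pheanstalk connection, since beanstalkd ties ownership of a reserved job to the connection that reserved it. A minimal sketch, reusing the hypothetical connect() helper from above:

$worker = connect();

$job = $worker->reserve(10);
if ($job !== false) {
    // ... process the job ...

    // Delete through the connection that reserved the job, so beanstalkd
    // still considers this connection the job's owner.
    $worker->delete($job);
}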