Docker - Nginx + PHP-FPM = reaching timeout - php

Simple question: I'm getting "Maximum execution time of 0 seconds exceeded", but I don't know why. There is a lot written about this error, but my script has no set_time_limit() call or anything like that, and in the PHP-FPM config I have php_value[max_execution_time] = 0. Even with this I'm getting a timeout from PHP-FPM's side (nginx is fine, no 504 Gateway Timeout or anything like that being thrown). The long-running HTTP request is killed after approx. 3 minutes. The setup is standard nginx + PHP-FPM (7.2) running in Docker.
Thanks for any pointers!
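For reference, max_execution_time only limits PHP's own execution time, so a kill after a fixed wall-clock interval usually comes from one of the other timeouts along the chain. A sketch of the two usual suspects (directive names are the standard nginx / php-fpm ones; the values are examples, not taken from the setup above):

# nginx (server/location block): how long nginx waits for the FastCGI response;
# if this limit were being hit, nginx would normally log a 504
fastcgi_read_timeout 300s;

; php-fpm pool config (www.conf): hard-kills the worker regardless of
; max_execution_time; 0 means off, any non-zero value would explain the kill
request_terminate_timeout = 0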

Related

php-fpm processes stuck on state "getting request informations"

My webserver has been experiencing a problem where the number of active php-fpm processes slowly increases until the pm.max_children setting is reached, at which point everything is stuck and I need to restart php-fpm.
(OS: Ubuntu 20.04, webserver: Caddy, php-fpm version: 7.1, pm = dynamic, running the Laravel 5.5 framework)
I've enabled the php-fpm status page and found that many processes are stuck in the "Getting request informations" state.
Example row from the output of /status?html&full (this one has been stuck here for over an hour):
pid:                  1772235
state:                Getting request informations
start time:           24/Jun/2021:15:03:07 +0000
start since:          5111
requests:             131
request duration:     4625314443
request method:       POST
request uri:          /api.php?t=removed&e=/role/checkOut/3461
content length:       5542139
user:                 -
script:               /var/www/nameremoved/app/fe/production/api.php
last request cpu:     0.00
last request memory:  0
Can anyone shed some light on what the "Getting request informations" state is? I can't seem to find it documented anywhere.
In php.ini I have:
max_execution_time = 180
Yet this seems to be ignored.
The scripts being run are from Laravel 5.5 and definitely shouldn't take more than a few seconds to execute - they are just basic database operations, possibly with file uploads of up to 500MB.
I guess my next step could be to set the php-fpm setting:
request_terminate_timeout
and see if that terminates the processes.
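In pool-config terms (the stock www.conf naming), that would be something like the sketch below; the 2h value is the one tried in the update further down:

; php-fpm pool config (sketch)
pm.status_path = /status           ; the status page already enabled above
request_terminate_timeout = 2h     ; hard-kill any worker whose request runs longer than this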
The strange thing is I have an identical server set up in a different location (requests are routed to either server based on location) which does not have this problem.
Any advice appreciated :)
UPDATE 25/6/2021
Still happening; it seems to affect only POST requests with file uploads.
UPDATE 29/6/2021
I've set request_terminate_timeout = 2h.
This successfully kills the requests stuck in the "Getting request informations" state, so it kind of solves the problem, but I still have no idea what was causing it.
UPDATE 16/6/2022
Now using PHP 8.1, Laravel 8, and Caddy v2.4.6; the same problem is still occurring.
I've added global before and after middleware in Laravel to log each HTTP request with the php-fpm process id to try to find the culprit, but it seems the problem occurs before the before middleware is even hit.
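For context, that kind of global logging middleware is roughly the sketch below (class and log message names are made up here, and as noted above the stuck requests never even seem to reach it):

<?php

namespace App\Http\Middleware;

use Closure;
use Illuminate\Support\Facades\Log;

// Global middleware sketch: log the php-fpm worker pid before and after each request.
class LogFpmRequest
{
    public function handle($request, Closure $next)
    {
        $pid = getmypid(); // pid of the php-fpm worker serving this request

        Log::info('request start', ['pid' => $pid, 'uri' => $request->fullUrl()]);

        $response = $next($request);

        Log::info('request end', ['pid' => $pid, 'status' => $response->getStatusCode()]);

        return $response;
    }
}

Registered in the global $middleware array in app/Http/Kernel.php so that it wraps every request.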
I have the same behavior with Ubuntu 20.04.3 LTS, Laravel 8, php-fpm 7.4 and caddy 2.4.5.
Restarting either the caddy or php-fpm service immediately frees up the processes. So I first quickly "fixed" it by restarting caddy every 15 minutes via crontab.
Since this doesn't happen with nginx, I'm now running caddy -> nginx -> php-fpm, and it works so far.

How to diagnose NGINX/PHP slowness

I'm developing an NGINX/PHP web site. There is a significant discrepancy between the NGINX and PHP processing times and I don't know how to diagnose it.
Pulling a JPEG from the NGINX server is speedy.
ab -l -c 100 -n 10000 http://blah.com/a_40kb_file.jpg
7,700 requests per second. The NGINX log shows them served in 2ms to 8ms.
Pulling a PHP front page is another matter. It's a simple form with no graphics so each connection represents a page. Each download is approx 3kb. Each page is different as it contains a randomised token in the form.
ab -l -c 100 -n 10000 http://blah.com/
900 requests per second. The NGINX log shows them served in 15ms to 250ms. No errors reported from NGINX. PHP-FPM complained about reaching pm.max_children; I increased it until the errors stopped.
The PHP script records its execution time with a simple microtime(true) at the beginning and end. These show that:
Calling a single page, the NGINX log time and the PHP run time broadly align (approx 1 to 2 ms).
Under simulated load, the NGINX time goes mad but the PHP execution time remains the same.
NGINX is waiting around somewhere for something to happen and I don't know how to diagnose it. Are there any tools/methods available?
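For reference, the microtime-based measurement described above is nothing more than this sketch; logging the elapsed time per request makes it easy to line up against the request time in the NGINX access log:

<?php
// Minimal per-request timing sketch: record wall-clock time at the start and
// end of the script and write the difference to the PHP error log.
$start = microtime(true);

// ... normal page work here ...

$elapsedMs = (microtime(true) - $start) * 1000;
error_log(sprintf('php time: %.1f ms', $elapsedMs));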
NGINX.CONF
worker_processes auto;
worker_connections 768;
Dev platform:
1 Core VM. Vbox 6.0. Guest: Ubuntu 18.04. Intel i5-6260U.

Monitor php-fpm max processes count script

We have an issue where, on the production server, some bug in our system locks/hangs a php-fpm process and it is never released. Over a period of 10-15 minutes this causes more processes to lock (probably trying to access a shared resource that is never released), and after a while the server cannot serve any new users because no free php-fpm processes are available.
In parallel with trying to find what is creating that deadlock, we were thinking of creating a simple cron job which runs every 1-2 minutes and, if it sees the process count above X, either kills all php-fpm processes or restarts php-fpm.
What do you think of that as a simple temporary fix for the problem?
Simple PHP script:
<?php
// Count the php-fpm worker processes owned by USERNAME
$processCount = (int) shell_exec('ps aux | grep php-fpm | grep USERNAME -c');

if ($processCount >= 60) {
    echo "killing all processes";
    // Kill whatever is listening on the FPM port (9056), then restart the service
    shell_exec('kill -9 $(lsof -t -i:9056)');
    shell_exec('sudo service php56u-php-fpm restart');
    // Check how many processes remain now
    $processCount = (int) shell_exec('ps aux | grep php-fpm | grep USERNAME -c');
}
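To run it every couple of minutes as described, the crontab entry would be something along these lines (the script path and log file are placeholders, not from the original setup):

*/2 * * * * php /path/to/fpm-watchdog.php >> /var/log/fpm-watchdog.log 2>&1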
Killing all PHP processes doesn't seem like a good solution to your problem. It would also kill legitimate processes and return errors to visitors, and generally just hide the problem deeper. You may also introduce data inconsistencies, corrupt files, and other problems by killing processes indiscriminately.
Maybe it would be better to set some timeout, so the process would be killed if it takes too long to execute.
You could add something like this to php-fpm pool config:
request_terminate_timeout = 3m
and/or max_execution_time in php.ini
You can also enable logging in php-fpm config:
slowlog = /var/log/phpslow.log
request_slowlog_timeout = 2m
This will log slow requests and may help you find the culprit of your issue.
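In the pool file those suggestions come down to something like this sketch (timeout values and log path are just the examples above); once request_slowlog_timeout trips, php-fpm writes a backtrace of the stuck request to the slowlog, which is usually enough to see where it hangs:

; php-fpm pool config (sketch)
request_terminate_timeout = 3m      ; hard limit: kill the worker after 3 minutes
slowlog = /var/log/phpslow.log      ; where slow-request backtraces are written
request_slowlog_timeout = 2m        ; dump a backtrace once a request runs this long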
It's not a good solution to kill PHP processes. In your PHP-FPM config file (/etc/php5/pool.d/www.conf),
set pm.max_requests = 100 so that after 100 requests the process will close and another process will start for the remaining requests.
Also, maybe there's a problem with your code; please make sure the request is ending.
If the problem is with your script, try request_terminate_timeout = 2m:
; The timeout for serving a single request after which the worker process will
; be killed. This option should be used when the 'max_execution_time' ini option
; does not stop script execution for some reason. A value of '0' means 'off'.
; Available units: s(econds)(default), m(inutes), h(ours), or d(ays)
; Default Value: 0
;request_terminate_timeout = 0
Please note that if you are doing some long polling, this may affect your code.
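Put together in the pool file, the two settings from this answer would look roughly like this:

; php-fpm pool config (sketch)
pm.max_requests = 100              ; recycle each worker after it has served 100 requests
request_terminate_timeout = 2m     ; kill any single request that runs longer than 2 minutes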

Ubuntu, PHP, NGINX, FastCGI, FPM, CodeIgniter Timeout

I need help resetting a server timeout. At present, the PHP scripts on my Linode server time out at 50 seconds; I have tried all the options to increase the timeout but nothing helps.
Environment -
Linode VPS
Server OS - Ubuntu 12
Nginx
PHP / Fast CGI
apache
My Development is being done with CodeIgniter
I have a test script (test.php), with the following code -
<?php
sleep(70);
die( 'Hello World' );
?>
When I call this script through a browser, I get a 504 Gateway Timeout error.
I have setup the following -
PHP - set max_execution_time to 300 and max_input_time to 300
Updated the NGINX configuration file with all the timeout variables (proxy, fastcgi, etc.)
Set up FastCGI to time out after 600 seconds
Still my script times out at 50 seconds. I have reloaded nginx and rebooted the VPS, but the changes are not taking effect.
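For reference, the settings listed above live roughly in the places sketched below (the paths and the FastCGI socket/port are assumptions, since the exact setup isn't shown):

# nginx: inside the location block that passes PHP to FastCGI
location ~ \.php$ {
    fastcgi_pass 127.0.0.1:9000;      # or a unix socket, depending on the setup
    fastcgi_read_timeout 600;         # how long nginx waits for the PHP response
}

; php.ini
max_execution_time = 300
max_input_time = 300

; php-fpm / FastCGI side: request_terminate_timeout overrides max_execution_time if set
;request_terminate_timeout = 0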
Please help me, this issue is driving me mad.
Sincerely,

Memcached concurrency w/ lighttpd php

I'm having an issue with memcached. I'm not sure if it's memcached, PHP, or TCP sockets, but every time I run a benchmark with apache ab at a concurrency of 50 or more against a page that uses memcached, some of those requests fail. I get the (99) Cannot assign requested address error.
When I do a concurrency test of 5000 against a plain phpinfo() page, everything is fine. No failed requests.
It seems like memcached cannot support high concurrency, or am I missing something? I'm running memcached with the -c 5000 flag.
Server: (2) Quad Core Xeon 2.5Ghz, 64GB ram, 4TB Raid 10, 64bit OpenSUSE 11.1
Ok, I've figured it out. Maybe this will help others who have the same problem.
It seems like the issue can be a combination of things.
Set server.max-worker in lighttpd.conf to a higher number.
Original: 16 Now: 32
Turned off keep-alive in lighttpd.conf; it was keeping connections open for too long.
server.max-keep-alive-requests = 0
Change ulimit -n open files to a higher number.
ulimit -n 65535
If you're on linux use:
server.event-handler = "linux-sysepoll"
server.network-backend = "linux-sendfile"
Increase max-fds
server.max-fds = 2048
Lower the TCP TIME_WAIT before recycling; this closes connections faster.
In /etc/sysctl.conf add:
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_fin_timeout = 3
Make sure you force it to reload with: /sbin/sysctl -p
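Collected in one place, the lighttpd.conf side of those changes looks roughly like this (values as given above):

# lighttpd.conf (sketch of the settings listed above)
server.max-worker              = 32
server.max-keep-alive-requests = 0        # effectively disables keep-alive
server.event-handler           = "linux-sysepoll"
server.network-backend         = "linux-sendfile"
server.max-fds                 = 2048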
After making these changes, my server now handles 30,000 concurrent connections and 1,000,000 requests without any issues, failed requests, or write errors from apache ab.
Command used to benchmark: ab -n 1000000 -c 30000 http://localhost/test.php
My Apache can't get even close to this benchmark. Lighttpd makes me laugh at Apache now; Apache crawls at around 200 concurrency.
I'm using just a 4-byte integer as a page counter for testing purposes. Other PHP pages work fine even with 5,000 concurrent connections and 100,000 requests. This server has a lot of horsepower and RAM, so I know that's not the issue.
The page that seems to die has nothing but 5 lines of code to test the page counter using memcached. Making the connection gives me this error: (99) Cannot assign requested address.
This problem starts to arise at 50 concurrent connections.
I'm running memcached with -c 5000 for 5000 concurrency.
Everything is on one machine (localhost)
The only process running is SSH, Lighttpd, PHP, and Memcached
There are no users connected to this box (test machine)
Linux -nofile is set to 32000
That's all I have for now; I'll post more information as I find more. It seems like there are a lot of people with this problem.
I just tested something similar with a file:
$mc = memcache_connect('localhost', 11211);       // connect to the local memcached
$visitors = memcache_get($mc, 'visitors') + 1;    // read and increment the counter
memcache_set($mc, 'visitors', $visitors, 0, 30);  // write it back with a 30-second expiry
echo $visitors;
running on a tiny virtual machine with nginx, php-fastcgi, and memcached.
I ran ab -c 250 -t 60 http://testserver/memcache.php from my laptop in the same network without seeing any errors.
Where are you seeing the error? In your php error log?
This is what I used for Nginx/php-fpm, adding these lines in /etc/sysctl.conf (on Rackspace dedicated servers with Memcached/Couchbase/Puppet):
# Memcached fix
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_fin_timeout = 3
I hope it helps.
