I use Laravel Forge to spin up my EC2 environments, which sets up a LEMP stack for me. I recently started getting 504 timeouts on requests.
I'm no sysadmin (hence the Forge subscription), but I looked through the logs and narrowed the issue down to these two repeated entries:
In /var/log/nginx/default-error.log:
2017/09/15 09:32:17 [error] 2308#2308: *1 upstream timed out (110: Connection timed out) while sending request to upstream, client: x.x.x.x, server: xxxx.com, request: "POST /upload HTTP/2.0", upstream: "fastcgi://unix:/var/run/php/php7.1-fpm.sock", host: "xxxx.com", referrer: "https://xxxx.com/rest/of/the/path"
In /var/log/php7.1-fpm-log:
[15-Sep-2017 09:35:09] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 0 idle, and 14 total children
It seems like fpm opens connections that never die, and from my RDS load logs I can see that the RAM is constantly maxed out.
I've tried:
Rolling back to a known-stable version of my app (from 2 months ago)
Reinstalling my EC2 instance with PHP 5.6, 7.0, and 7.1 (with their respective fpm)
Doing all of the above on Ubuntu 14.04 and 16.04
Creating a bigger RDS
Right now the only thing that works is a beefy RDS instance (8 GB RAM) plus killing fpm pool workers every 300 requests (pm.max_requests). But obviously throwing resources at this problem is not the solution.
Here is my config for /etc/php/7.1/fpm/pool.d/www.conf
user = forge
group = forge
listen = /run/php/php7.1-fpm.sock
listen.owner = www-data
listen.group = www-data
listen.mode = 0666
pm = dynamic
pm.max_children = 30
pm.start_servers = 7
pm.min_spare_servers = 6
pm.max_spare_servers = 10
pm.process_idle_timeout = 7s;
pm.max_requests = 300
And here is my config for nginx.conf
listen 80;
listen [::]:80;
listen 443 ssl http2;
listen [::]:443 ssl http2;
server_name xxxx.com;
root /home/forge/xxxx.com/public;
ssl_certificate /etc/nginx/ssl/xxxx.com/111111/server.crt;
ssl_certificate_key /etc/nginx/ssl/xxxx.com/111111/server.key;
ssl_protocols xxxx;
ssl_ciphers ...;
ssl_prefer_server_ciphers on;
ssl_dhparam /etc/nginx/dhparams.pem;
add_header X-Frame-Options "SAMEORIGIN";
add_header X-XSS-Protection "1; mode=block";
add_header X-Content-Type-Options "nosniff";
index index.html index.htm index.php;
charset utf-8;
include forge-conf/xxxx.com/server/*;
location / {
try_files $uri $uri/ /index.php?$query_string;
location = /favicon.ico
location = /robots.txt
access_log /var/log/nginx/xxxx.com-access.log;
error_log /var/log/nginx/xxxx.com-error.log error;
error_page 404 /index.php;
location ~ \.php$ {
fastcgi_split_path_info ^(.+\.php)(/.+)$;
fastcgi_pass unix:/var/run/php/php7.1-fpm.sock;
fastcgi_index index.php;
fastcgi_read_timeout 60;
include fastcgi_params;
location ~ /\.(?!well-known).* {
deny all;
location ~* \.(?:ico|css|js|gif|jpe?g|png)$ {
expires 30d;
add_header Pragma public;
add_header Cache-Control "public";
OK, after a LOT of debugging and testing, I found these causes.
Primary cause for me: The AWS RDS instance I was using for MySQL had 500 MB of memory. Looking back, all these issues started once the DB size surpassed 400 MB.
Solution: Make sure the instance always has at least twice as much RAM as your DB size. Otherwise the entire B+Tree doesn't fit in memory, so it has to swap constantly. This can push your query time upwards of 15 seconds.
Primary cause for problems like these in general: unoptimized SQL queries.
Solution: On your local machine, keep a dataset similar in size to the data on the server, so slow queries show up before they reach production.
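To make the 2x rule of thumb above concrete, here is a minimal sketch of the arithmetic. The function name and the `factor` parameter are my own; the 400 MB and 500 MB figures are the ones quoted in the answer, and the rule itself is the answerer's heuristic rather than a hard MySQL requirement.

```python
def ram_headroom(db_size_mb: float, ram_mb: float, factor: float = 2.0) -> dict:
    """Compare instance RAM against the `factor` x DB-size rule of thumb."""
    required_mb = db_size_mb * factor
    return {
        "required_mb": required_mb,
        "shortfall_mb": max(0.0, required_mb - ram_mb),
        "ok": ram_mb >= required_mb,
    }


if __name__ == "__main__":
    # Figures from the answer: roughly 400 MB of data on a 500 MB instance.
    print(ram_headroom(db_size_mb=400, ram_mb=500))
```

With those numbers the check reports a 300 MB shortfall, which matches the point at which the timeouts reportedly started.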
I was running Ubuntu Server 20.04 quite successfully with iRedMail and 2 websites, one of them with WordPress.
I wanted to install Nextcloud; to do that I had to reinstall php-fpm to generate php7.4-fpm.sock. After this Nextcloud worked, but my other websites stopped working with the error '502 Bad Gateway'.
So to say the least, I'm very confused!
I followed this article to install Nextcloud and set up the sites-enabled .conf file as per instructions: https://www.linuxbabe.com/ubuntu/install-nextcloud-ubuntu-20-04-nginx-lemp-stack/amp
I think I understand that the .conf file used to listen on and now listens on php7.4-fpm.sock?
Here is the .conf file that I have put together for my website after re-installing php-fpm:
# Note: This file must be loaded before other virtual host config files,
server {
listen 443 ssl http2;
listen [::]:443 ssl http2;
error_log /var/log/nginx/localhost.error_log info;
root /var/www/SOMEWEBSITE/html;
index index.php index.html;
include /etc/nginx/templates/misc.tmpl;
include /etc/nginx/templates/ssl.tmpl;
include /etc/nginx/templates/iredadmin.tmpl;
include /etc/nginx/templates/roundcube.tmpl;
include /etc/nginx/templates/sogo.tmpl;
include /etc/nginx/templates/netdata.tmpl;
include /etc/nginx/templates/php-catchall.tmpl;
include /etc/nginx/templates/stub_status.tmpl;
location / {
try_files $uri $uri/ /index.php?q=$uri$args;
# PHP handling
location ~ \.php$ {
try_files $uri =404;
fastcgi_pass unix:/var/run/php/php7.4-fpm.sock;
fastcgi_index index.php;
fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
include fastcgi_params;
ssl_certificate /etc/letsencrypt/live/SOMEWEBSITE/fullchain.pem; # managed by Certbot
ssl_certificate_key /etc/letsencrypt/live/SOMEWEBSITE/privkey.pem; # managed by Certbot
# Redirect http to https
server {
listen 80;
listen [::]:80;
return 301 https://$host$request_uri;
I have checked the file permissions for php7.4-fpm.sock
ll /var/run/php/ | grep php
-rw-r--r-- 1 root root 3 May 22 21:13 php7.4-fpm.pid
srw-rw---- 1 www-data www-data 0 May 22 21:13 php7.4-fpm.sock=
lrwxrwxrwx 1 root root 30 May 22 21:13 php-fpm.sock -> /etc/alternatives/php-fpm.sock=
and I think it looks ok.
here is the log file:
2021/05/23 20:32:52 [error] 43596#43596: *305 connect() failed (111: Connection refused) while connecting to upstream, client: xx.xx.xxx.xxx, server: SOMEWEBSITE, request: "GET / HTTP/1.1", upstream: "fastcgi://", host: "SOMEWEBSITE"
2021/05/23 20:32:53 [info] 43596#43596: *305 client xx.xx.xxx.xxx closed keepalive connection
Any ideas? Need any more information? Thank you in advance for looking.
PHP-FPM can accept FastCGI requests in two ways: over a TCP socket or over a Unix socket.
You specify which one in the php-fpm configuration. On Ubuntu the configuration is in /etc/php/7.4/fpm/pool.d/www.conf; check the listen directive there.
If you want to use unix socket, use configuration below.
listen = /run/php/php7.4-fpm.sock
For a TCP socket:
listen =
Next, in nginx, set fastcgi_pass to match the fpm configuration. If you are using a Unix socket, all your websites, including Nextcloud, must use the Unix socket.
fastcgi_pass unix:/var/run/php/php7.4-fpm.sock;
If you are using a TCP socket, you must change the nginx configuration for Nextcloud to pass requests to the TCP socket instead.
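A quick way to see which of the two endpoints is actually accepting connections is to try connecting to each. This is only a sketch using the Python standard library: the helper names are my own, the socket path is the one from this answer, and a successful connect only shows that something is listening, not that it is PHP-FPM or that nginx's user has permission to use it.

```python
import os
import socket
import stat


def unix_socket_accepts(path: str, timeout: float = 1.0) -> bool:
    """Return True if something accepts connections on the Unix socket at `path`."""
    try:
        if not stat.S_ISSOCK(os.stat(path).st_mode):
            return False  # path exists but is not a socket
    except OSError:
        return False  # path missing or not accessible
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        try:
            s.connect(path)
            return True
        except OSError:
            # e.g. ECONNREFUSED (nothing listening) or EACCES (permissions)
            return False


def tcp_port_accepts(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Path taken from the answer above; adjust it to your PHP version.
    print(unix_socket_accepts("/run/php/php7.4-fpm.sock"))
```

Run it as the same user nginx runs as (www-data in the listing above), since socket permissions are part of what is being checked. Whichever endpoint answers is the one every fastcgi_pass line should point at.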
I have faced this issue as well. The problem was that php-fpm wasn't listening on port 9999 (in my case I use port 9999). To make /mail/ work, you need to change the config files below, replacing the ip:port with the socket:
listen = /var/run/php/php7.2-fpm.sock
fastcgi_pass unix:/var/run/php/php7.2-fpm.sock;
I have a load balancer and two EC2 instances with php-fpm + nginx serving my website, and I have configured Redis to store PHP sessions. By running "keys *" in redis-cli, I realized that PHP is creating a lot of empty sessions in addition to the correct ones. Even if I close the browser, clear all cookies, and do not run any PHP command or open any URLs, it keeps creating empty sessions. The problem is that the session expiry time is 15 hours, and it creates about 30 empty sessions per hour, so within that window it creates more sessions than it removes. The only way I could stop it creating new sessions was by stopping php-fpm on my instances.
My guess is that it may have something to do with the load balancer health check. I have added my nginx.conf and php.ini below so you can see how I'm handling these load balancer checks and my PHP session settings.
keys *
1) "PHPREDIS_SESSION:22u4tot1ilj2jn2pegsvsa9455"
2) "PHPREDIS_SESSION:u9c530pk3h0kr0moigf9a030c7"
316) "PHPREDIS_SESSION:d3t36ou13ljuj5ntt2l2b6sne0"
317) "PHPREDIS_SESSION:5kbn03dn01qdn405pg43bbd1i3"
Only 1 session has data; the other 316 are empty.
By running "ttl key", I see the expiry time is the same as the one I've set in php.ini.
My PHP code is just a session_start(); for testing purposes.
My php.ini:
session.use_strict_mode = 0
session.use_cookies = 1
session.cookie_secure = 1
session.use_only_cookies = 1
session.name = PHPSESSID
session.cookie_lifetime = 54000
session.cookie_path = /
session.cookie_domain = .domain.xxx
session.cookie_httponly = 1
session.serialize_handler = php
session.gc_probability = 1
session.gc_divisor = 50
session.gc_maxlifetime = 54000
session.cache_limiter = nocache
session.cache_expire = 900
session.use_trans_sid = 0
I've checked phpinfo() and nothing is overwriting these configs
#This is the block that responds to loadbalancer requests and serves the website
server {
listen 80 default_server;
listen [::]:80 default_server;
server_name localhost;
root /var/www/html;
upstream php-fpm {
location /nginx-health {
access_log off;
return 200 "healthy\n";
try_files $uri $uri/ @rewrite;
location @rewrite {
rewrite ^/(.*)$ /index.php?param=$1;
location ~ \.php$ {
try_files $uri =404;
fastcgi_intercept_errors on;
fastcgi_index index.php;
include fastcgi_params;
fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
fastcgi_pass php-fpm;
#This is the block that serves a websocket listener. It handles connections directly to the node, bypassing the load balancer.
server {
listen 443 ssl http2;
listen [::]:443 ssl http2;
server_name sub.domain.xxx;
root /var/www/html;
ssl_session_cache shared:SSL:1m;
ssl_session_timeout 720m;
ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
ssl_prefer_server_ciphers on;
location / {
proxy_set_header Host $host;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "Upgrade";
proxy_set_header Host $host;
ssl_certificate /etc/letsencrypt/live/sub.domain.xxx/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/sub.domain.xxx/privkey.pem;
I tried to create a cron job to fetch the Redis keys, test their values, and delete the empty ones, but I read that running the "keys" command is really bad for production environments. Does anyone have an idea how to fix this issue?
I can't tell you exactly what's going on, but I do have a few suggestions:
Yes, running keys is an O(n) operation, but if your instance is small, then it is trivial. Keep an eye on your slow log and see if any of your keys operations really are taking too long, but my guess is that they are not.
If you think that the extra sessions are being created by the nginx health checks, take a peek at the access logs; you should see every access to your site there.
I also see that you are using HTTP/2. I don't know that much about how HTTP/2 interacts with PHP, but consider reverting to HTTP/1.1 and see if you get the same behavior.
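If the O(n) cost of keys does turn out to matter, the usual alternative is to walk the keyspace incrementally with SCAN. Below is a minimal sketch of the cleanup job from the question written that way. It assumes a redis-py style client (one that exposes scan_iter, strlen and delete), that an "empty" session is stored as a zero-length string, and that the key prefix is the PHPREDIS_SESSION: one shown in the question. The function name and parameters are my own.

```python
def delete_empty_sessions(client, prefix="PHPREDIS_SESSION:", batch=500, dry_run=True):
    """Walk keys with SCAN (not KEYS) and remove sessions whose value is empty.

    `client` is assumed to be a redis-py style client exposing
    scan_iter(), strlen() and delete().
    Returns the list of empty keys that were found.
    """
    empty = []
    # scan_iter issues repeated SCAN calls, so Redis is never blocked
    # for the duration of a full keyspace walk.
    for key in client.scan_iter(match=prefix + "*", count=batch):
        if client.strlen(key) == 0:
            empty.append(key)
            if not dry_run:
                client.delete(key)
    return empty
```

Run it with dry_run=True first and compare the result with what redis-cli shows. Note that this only treats the symptom: a session that gets written between the strlen and delete calls would be lost, and new empty sessions will keep appearing until whatever is hitting session_start() without a cookie is found.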
I'm running a Laravel app. Below is my nginx configuration:
server {
listen 80;
server_name domain.com;
root /var/www/project/public;
index index.html index.htm index.php index.nginx-debian.html;
charset utf-8;
location ~ /.well-known {
allow all;
location / {
try_files $uri $uri/ /index.php?$query_string;
location = /favicon.ico { access_log off; log_not_found off; }
location = /robots.txt { access_log off; log_not_found off; }
access_log off;
error_log /var/log/nginx/domain.log error;
sendfile off;
client_max_body_size 100m;
location ~ \.php$ {
include snippets/fastcgi-php.conf;
fastcgi_pass unix:/run/php/php7.0-fpm.sock;
location ~ /\.ht {
deny all;
Error Log
2018/06/29 08:41:30 [error] 928#928: *14875 connect() to unix:/run/php/php7.0-fpm.sock failed (11: Resource temporarily unavailable) while connecting to upstream, client: IP, server: IP, request: "GET / HTTP/1.1", upstream: "fastcgi://unix:/run/php/php7.0-fpm.sock:", host: "IP"
Is something wrong with my config? I have a powerful server with 32 GB RAM, an SSD, and 2x E5-2670 0 @ 2.60GHz processors. I'm using Ubuntu with NGINX.
Kindly let me know. I've switched servers many times but have not been able to get rid of this issue.
Check your php-fpm logs to see if PHP-FPM is dying. The error you are getting is nginx saying that the server it is proxying requests to is not answering. This could be a network issue if they are on two different machines, it could be that there are not enough workers to satisfy all of the incoming requests, or it could be the OOM killer on a VPS killing the process. Just because a server has a ton of RAM doesn't mean it is impervious to memory exhaustion. I have had a customer with a Magento site whose end-of-day reports threw memory errors on a server with similar specs due to a poorly coded plugin.
I just saw that you posted a line from your log. Check your pool.conf for the following:
process.max, pm.max_children, pm.min/max_spare_servers
These are just a few of the things you may need to tweak in a heavy traffic environment.
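A common starting point for pm.max_children is the memory you can spare for PHP divided by the memory one worker typically uses. The sketch below just does that division. The function name is my own, and the 4 GB reserve and 64 MB per worker in the example are made-up illustration values, not measurements from the asker's server; measure your own worker size before relying on the result.

```python
def estimate_max_children(total_ram_mb: int, reserved_mb: int, avg_worker_mb: int) -> int:
    """Rule-of-thumb ceiling for pm.max_children: RAM left for PHP / RAM per worker."""
    if avg_worker_mb <= 0:
        raise ValueError("avg_worker_mb must be positive")
    available = total_ram_mb - reserved_mb
    return max(1, available // avg_worker_mb)


if __name__ == "__main__":
    # 32 GB server as in the question; the reserve and per-worker size
    # are hypothetical values chosen only for illustration.
    print(estimate_max_children(32 * 1024, 4 * 1024, 64))
```

Treat the number as an upper bound for memory safety, not a target: the database, nginx, and anything else on the box need to come out of the reserved figure.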
Running Shopware 5 on a Debian Jessie machine with nginx and php5-fpm, we very often get a 502 Bad Gateway. This happens mostly in the backend during longer operations such as thumbnail creation, even when the work is split into small chunks across single AJAX requests.
The server, with 64 GB RAM and 16 cores, is essentially idle, because there is no real traffic on it. We are currently using it as a staging system until we have fixed all errors like this one.
Error log:
In the nginx-error log the following lines can be found then:
[error] 20524#0: *175 connect() failed (111: Connection refused) while connecting to upstream, client: xx.xx.xx.xx, server: domain.com, request: "POST /backend/MediaManager/createThumbnails HTTP/1.1", upstream: "fastcgi://", host: "www.domain.com", referrer: "http://www.domain.com/backend/"
[error] 20524#0: *175 no live upstreams while connecting to upstream, client: xx.xx.xx.xx, server: domain.com, request: "POST /backend/Log/createLog HTTP/1.1", upstream: "fastcgi://php-fpm", host: "www.domain.com", referrer: "http://www.domain.com/backend/"
[error] 20524#0: *175 connect() failed (111: Connection refused) while connecting to upstream, client: xx.xx.xx.xx, server: domain.com, request: "GET /backend/login/getLoginStatus?_dc=1457014588680 HTTP/1.1", upstream: "fastcgi://", host: "www.domain.com", referrer: "http://www.domain.com/backend/"
[error] 20522#0: *209 connect() failed (111: Connection refused) while connecting to upstream, client: xx.xx.xx.xx, server: domain.com, request: "GET /backend/login/getLoginStatus?_dc=1457014618682 HTTP/1.1", upstream: "fastcgi://", host: "www.domain.com", referrer: "http://www.domain.com/backend/"
It may be worth noting that at first a lot of "*175 connect" errors occur, and then finally a "*209 connect".
Config files:
I'll try to post only significant lines related to this topic and will leave out all those lines which are commented out.
user = www-data
group = www-data
listen = /var/run/php5-fpm.sock
listen.owner = www-data
listen.group = www-data
pm.max_children = 5
pm.start_servers = 2
pm.min_spare_servers = 1
pm.max_spare_servers = 3
user www-data;
worker_processes auto;
pid /run/nginx.pid;
events {
worker_connections 768;
multi_accept on;
http {
## MIME types.
include /etc/nginx/mime.types;
default_type application/octet-stream;
## Default log and error files.
access_log /var/log/nginx/access.log;
error_log /var/log/nginx/error.log;
## Use sendfile() syscall to speed up I/O operations and speed up
## static file serving.
sendfile on;
## Handling of IPs in proxied and load balancing situations.
# set_real_ip_from; # set to your proxies ip or range
# real_ip_header X-Forwarded-For;
## Timeouts.
client_body_timeout 60;
client_header_timeout 60;
keepalive_timeout 10 10;
send_timeout 60;
## Reset lingering timed out connections. Deflect DDoS.
reset_timedout_connection on;
## Body size.
client_max_body_size 10m;
## TCP options.
tcp_nodelay on;
## Optimization of socket handling when using sendfile.
tcp_nopush on;
## Compression.
gzip on;
gzip_buffers 16 8k;
gzip_comp_level 1;
gzip_http_version 1.1;
gzip_min_length 10;
gzip_types text/plain text/css application/json application/javascript application/x-javascript text/xml application/xml application/xml+rss text/javascript image/x-icon application/vnd.ms-fontobject font/opentype application/x-font-ttf;
gzip_vary on;
gzip_proxied any; # Compression for all requests.
gzip_disable "msie6";
## Hide the Nginx version number.
server_tokens off;
## Upstream to abstract backend connection(s) for PHP.
upstream php-fpm {
server unix:/var/run/php5-fpm.sock;
# server;
## Create a backend connection cache.
keepalive 32;
## Include additional configs
include /etc/nginx/conf.d/*.conf;
## Include all vhosts.
include /etc/nginx/sites-enabled/*;
server {
listen 80;
listen 443 ssl;
server_name xxxxxxxx.com;
root /var/www/shopware;
## Access and error logs.
access_log /var/log/nginx/xxxxxxxx.com.access.log;
error_log /var/log/nginx/xxxxxxxx.com.error.log;
## leaving out lots of shopware/mediafiles-related settings
## ....
## continue:
location ~ \.php$ {
try_files $uri $uri/ =404;
## NOTE: You should have "cgi.fix_pathinfo = 0;" in php.ini
fastcgi_split_path_info ^(.+\.php)(/.+)$;
## required for upstream keepalive
# disabled due to failed connections
#fastcgi_keep_conn on;
include fastcgi_params;
fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
fastcgi_param SHOPWARE_ENV $shopware_env if_not_empty;
fastcgi_param ENV $shopware_env if_not_empty; # BC for older SW versions
fastcgi_buffers 8 16k;
fastcgi_buffer_size 32k;
client_max_body_size 24M;
client_body_buffer_size 128k;
## upstream "php-fpm" must be configured in http context
fastcgi_pass php-fpm;
What should I do now? Please let me know if I should provide further information for this question.
After applying the nginx and fpm settings from @peixotorms, the errors in the nginx logs changed to:
30 upstream timed out (110: Connection timed out) while reading response header from upstream
But the issue itself isn't solved. It just has another face...
It might sound strange to you, but your problem is most probably due to the fact that you're running PHP on a socket instead of a tcp port. You will start seeing 502 errors (and others) when you have around 300 concurrent requests (sometimes less) to php on a socket configuration.
Also your pm.max_children is way too low, unless you want to limit your server to around 5 simultaneous php requests maximum: http://php.net/manual/en/install.fpm.configuration.php
Configure it this way, and those errors should go away:
For your nginx.conf change the following values:
worker_processes 4;
worker_rlimit_nofile 750000;
# handles connection stuff
events {
worker_connections 50000;
multi_accept on;
use epoll;
upstream php-fpm {
keepalive 30;
Your /etc/php5/fpm/pool.d/www.conf
(Use these settings because you have plenty of RAM and CPU)
user = www-data
group = www-data
listen.owner = www-data
listen.group = www-data
listen.mode = 0660
listen =
listen.allowed_clients =
listen.backlog = 65000
pm = dynamic
pm.max_children = 1024
pm.start_servers = 8
pm.min_spare_servers = 4
pm.max_spare_servers = 16
pm.max_requests = 10000
Also add this on your location ~ \.php$ { block:
location ~ \.php$ {
try_files $uri $uri/ =404;
## NOTE: You should have "cgi.fix_pathinfo = 0;" in php.ini
fastcgi_split_path_info ^(.+\.php)(/.+)$;
## required for upstream keepalive
# disabled due to failed connections
#fastcgi_keep_conn on;
include fastcgi_params;
fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
fastcgi_param SHOPWARE_ENV $shopware_env if_not_empty;
fastcgi_param ENV $shopware_env if_not_empty; # BC for older SW versions
fastcgi_keep_conn on;
fastcgi_connect_timeout 20s;
fastcgi_send_timeout 60s;
fastcgi_read_timeout 60s;
fastcgi_pass php-fpm;
Change the values below on your /etc/php5/fpm/php.ini file to this and restart:
safe_mode = Off
output_buffering = Off
zlib.output_compression = Off
max_execution_time = 900
max_input_time = 900
memory_limit = 2048M
post_max_size = 120M
file_uploads = On
upload_max_filesize = 120M
Try binding to
listen =
I am running some siege tests on my nginx server. The bottleneck doesn't seem to be cpu or memory so what is it?
I try to do this on my macbook:
sudo siege -t 10s -c 500 server_ip/test.php
The response time goes to 10 seconds, I get errors and siege aborts before completing.
But if I run the above on my server
siege -t 10s -c 500 localhost/test.php
I get:
Transactions: 6555 hits
Availability: 95.14 %
Elapsed time: 9.51 secs
Data transferred: 117.30 MB
Response time: 0.18 secs
Transaction rate: 689.27 trans/sec
Throughput: 12.33 MB/sec
Concurrency: 127.11
Successful transactions: 6555
Failed transactions: 335
Longest transaction: 1.31
Shortest transaction: 0.00
I also noticed that at lower concurrency I get a vastly better transaction rate on localhost than externally.
But while the above is running on localhost, CPU and memory usage both stay low in htop. So I'm confused about how to boost performance, because I can't see a bottleneck.
ulimit returns 50000 because I've increased it. There are 4 nginx worker processes, which is 2 times my CPU core count. Here are my other settings:
worker_rlimit_nofile 40000;
events {
worker_connections 20000;
# multi_accept on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;
The test.php is just an echo phpinfo() script, nothing else. No database connections.
The machine is an AWS m3.large, with 2 CPU cores and about 7 GB of RAM, I believe.
Here is the contents of my server block:
listen 80 default_server;
listen [::]:80 default_server ipv6only=on;
root /var/www/sitename;
index index.php index.html index.htm;
# Make site accessible from http://localhost/
server_name localhost;
location / {
try_files $uri $uri.html $uri/ @extensionless-php;
location @extensionless-php {
rewrite ^(.*)$ $1.php last;
error_page 404 /404.html;
error_page 500 502 503 504 /50x.html;
location = /50x.html {
root /usr/share/nginx/html;
# pass the PHP scripts to FastCGI server listening on
location ~ \.php$ {
try_files $uri =404;
fastcgi_split_path_info ^(.+\.php)(/.+)$;
fastcgi_pass unix:/var/run/php5-fpm.sock;
fastcgi_index index.php;
fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
include fastcgi_params;
Also this was in my error log:
connect() to unix:/var/run/php5-fpm.sock failed (11: Resource temporarily unavailable) while connecting to upstream, cli$
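The number in parentheses in these nginx lines is the errno from the failed system call, and it narrows down the cause considerably. Here is a small sketch that pulls it out of a log line and attaches a hint. The errno values are the Linux ones that appear in the log lines quoted in these threads (11, 110, 111); the hint texts are my own summaries, and the function and pattern names are hypothetical.

```python
import re

# errno values as they appear in the nginx error lines quoted above (Linux).
HINTS = {
    11: "EAGAIN: the Unix socket's listen backlog is full; PHP-FPM is not accepting fast enough",
    110: "ETIMEDOUT: the upstream did not respond within nginx's timeout",
    111: "ECONNREFUSED: nothing is listening on the configured socket or port",
}

# Captures the errno and the phase, e.g. "connecting to" or "sending request to".
PATTERN = re.compile(r"\((\d+): [^)]*\) while (\w[\w ]*?) upstream")


def explain(line: str):
    """Return (errno, phase, hint) for an nginx upstream error line, or None."""
    m = PATTERN.search(line)
    if not m:
        return None
    code = int(m.group(1))
    return code, m.group(2), HINTS.get(code, "unknown errno")


if __name__ == "__main__":
    sample = ("connect() to unix:/var/run/php5-fpm.sock failed "
              "(11: Resource temporarily unavailable) while connecting to upstream")
    print(explain(sample))
```

For errno 11 under a 500-connection siege run, the things to look at would be listen.backlog and pm.max_children in the pool config, since low CPU and memory usage is consistent with requests queueing rather than being processed.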