When does a PHP-FPM process release memory?

I am trying to configure our server cluster to handle large spikes in traffic. What I have noticed is that when we get a spike, we see a lot of failures because PHP-FPM has to spawn many workers quickly.
I can offset this by setting pm.start_servers higher so the PHP-FPM processes are already running, but that creates some RAM management dilemmas.
On a test server used only by me and a few cron jobs, I spin up a large number of workers and watch the RAM. Over time the PHP-FPM processes steadily increase their memory usage.
Why does that RAM stay allocated inside the worker?
I am trying to understand why these processes gain RAM and then just keep it. What is that memory, and when exactly does PHP-FPM recycle it?
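For reference, a minimal way to watch this on the test box, plus the one pool directive that forces a worker to hand its memory back by respawning (a hedged sketch; the value 500 and the process name are illustrative):

# watch resident memory per PHP-FPM process while load runs
# (the process name may be php-fpm7.x or similar on some distros)
ps -o pid,rss,cmd -C php-fpm --sort=-rss | head

; pool config: recycle each worker after it has served 500 requests,
; so whatever memory it accumulated is returned to the OS when it exits
pm.max_requests = 500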

Related

Max Users for WordPress WooCommerce Limited by PHP-FPM Process RAM Utilization

I'm finding that the number of concurrent users/visitors/connections to my WooCommerce website is limited by the amount of RAM available. Each visitor seems to spawn a PHP-FPM process that is roughly 50-70 MB in size. My server has 32 GB of RAM, and I allocate 28 GB to the PHP memory_limit and 28 GB to innodb_buffer_pool_size for MariaDB. When the number of users reaches 400-500, the server starts to run very slowly and produces connection errors. Below are the PHP-FPM settings.
What I'm wondering is whether 50-70 MB per PHP-FPM process is normal. My WordPress installation does have several plugins. I could perform a bunch of tests to find out, but I was hoping there are some experts out there who really understand what goes on here. In order to get to 1,500 concurrent users without server lag/hangs, I will upgrade my dedicated server to 4x more cores and 4x more RAM. However, it would be great to minimize the PHP-FPM process memory if there is a trick or two (like nginx caching).
Thank you for any input you can offer.
PHP-FPM settings
pm.max_children=400
pm.max_requests=500
pm=ondemand
PHP: version 7.4.13, run as an FPM application served by nginx
Server version: 10.3.27-MariaDB - MariaDB Server
WordPress: 5.6
WooCommerce: Version 4.8.0
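For a rough sanity check, here is the back-of-the-envelope math with the numbers given above (a hedged sketch; the per-worker figure and the resulting cap are illustrative, not a recommendation):

; 400 children x ~70 MB ≈ 28 GB for PHP-FPM alone, while MariaDB's
; innodb_buffer_pool_size already claims 28 GB of the 32 GB box
; (memory_limit is a per-process cap, not a pool-wide budget).
; A more conservative cap for the box as described might look like:
pm.max_children = 55    ; ≈ (32 GB total - 28 GB buffer pool) / ~70 MB per worker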

PHP-FPM performance tuning - bursts of traffic

I have a web application written in Laravel / PHP that is in its early stages and generally serves about 500-600 reqs/min. We use MariaDB, plus Redis for caching, and everything is on AWS.
For events we want to promote on our platform, we send out a push notification (mobile platform) to all users, which results in a roughly 2-minute traffic burst that takes us to 3.5k reqs/min.
At our current server scale, this completely bogs down the application servers' CPUs, which usually run at around 10% utilization. The database and Redis clusters seem fine during this burst.
Looking at the logs, it seems all PHP-FPM worker pool processes get occupied and begin queuing up requests from the Nginx upstream.
We currently have:
three m4.large servers (2 cores, 8 GB RAM each)
dynamic PHP-FPM process management, with a max of 120 child processes (servers) on each box
My questions:
1) Should we increase the FPM pool? It seems that, memory-wise, we're probably nearing our limit.
2) Should we decrease the FPM pool? It seems possible that we're spinning up so many processes that the CPU gets bogged down and is unable to really complete any of them. I wonder if we'd therefore get better results with fewer.
3) Should we simply use larger boxes with more RAM and CPU, which will allow us to add more FPM workers?
4) Is there any FPM performance tuning that we should be considering? We use OPcache; however, should we switch to static process management for FPM to cut down on the overhead of processes spinning up and down?
There are too many child processes in relation to the number of cores.
First, you need to know the server's status at normal times and at burst time.
1) Check the number of php-fpm processes.
ps -ef | grep '[p]hp-fpm: pool' | wc -l
2) Check the load average. With 2 cores, a value of 2 or more means work is starting to be delayed.
top
htop
glances
3) Depending on the service, start adjusting from about twice the number of cores.
; Example
;pm.max_children = 120 ; normal) pool 5, load 0.1 / burst) pool 120, load 5 **Bad**
;pm.max_children = 4 ; normal) pool 4, load 0.1 / burst) pool 4, load 1
pm.max_children = 8 ; normal) pool 6, load 0.1 / burst) pool 8, load 2 **Good**
load 2 = the maximum the 2 cores can handle
It is more accurate to test the web server with a load similar to the actual one, using Apache Bench (ab).
ab -c100 -n10000 http://example.com
Time taken for tests: 60.344 seconds
Requests per second: 165.72 [#/sec] (mean)
100% 880 (longest request)
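To see whether the pool itself is the bottleneck during a burst, PHP-FPM's built-in status page can also be polled (a hedged sketch; the /fpm-status path and the localhost URL are assumptions, and the path still has to be exposed through nginx):

; pool config
pm.status_path = /fpm-status

# poll it while the burst is running
watch -n 1 'curl -s http://127.0.0.1/fpm-status'
# a non-zero "listen queue" or a growing "max children reached" counter
# means requests are waiting for a free worker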

Huge CPU load - php-fpm + nginx

I use php-fpm with a STATIC pool, and the problem is that 2-3 of the 20 worker processes run at 80-100% CPU while the other PHP workers stay unused.
My question is: why do the other 17 processes stay unused?
We use an AWS c4.large instance.
Our Docker container is allocated 1024 CPU units and 2560 MB of RAM.
(Screenshots: Docker containers on the instance, all processes in the container, and top output.)
The PHP-FPM pm static setting depends heavily on how much free memory your server has. Basically, if you are suffering from low server memory, then pm ondemand or dynamic may be better options. On the other hand, if you have the memory available, you can avoid much of the PHP process manager (PM) overhead by setting pm to static at the maximum capacity of your server. In other words, when you do the math, pm.max_children under pm = static should be set to the maximum number of PHP-FPM processes that can run without creating memory availability or cache pressure issues, and not so high as to overwhelm the CPU(s) and leave a pile of pending PHP-FPM operations.
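A hedged sketch of that math, with assumed numbers rather than measurements:

# average resident memory per php-fpm process
ps --no-headers -o rss -C php-fpm | awk '{s+=$1; n++} END {printf "avg %.0f MB over %d processes\n", s/n/1024, n}'

; e.g. assuming ~60 MB per worker inside the 2560 MB container above and
; ~500 MB of headroom for everything else: (2560 - 500) / 60 ≈ 34
pm = static
pm.max_children = 34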

php-cgi.exe processes cause high cpu usage in IIS 7.5

I have a Windows Server that has random spikes of high CPU usage, and upon looking at Process Explorer and Windows Task Manager, it seems there is a high number of php-cgi.exe processes running concurrently, sometimes up to 6-8 instances, each taking around 10-15% of the CPU. Sometimes it gets so bad that the server becomes unresponsive.
In the FastCGI settings I've set MaxInstances to 4, so by rights there shouldn't be more than 4 php-cgi.exe processes running simultaneously. Hence I would like some advice or direction on how to actually limit the number of instances to 4.
Additional notes: I've also set instanceMaxRequests to 10000 and PHP_FCGI_MAX_REQUESTS to 10000.
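As a sanity check, the effective FastCGI limits can be read and written with appcmd (a hedged sketch; the php-cgi.exe path is an assumption and must match the fullPath registered in IIS):

%windir%\system32\inetsrv\appcmd.exe list config -section:system.webServer/fastCgi

%windir%\system32\inetsrv\appcmd.exe set config -section:system.webServer/fastCgi "/[fullPath='C:\PHP\php-cgi.exe'].maxInstances:4" /commit:apphost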

Nginx scaling and bottleneck identification on an EC2 cluster

I am developing a big application and I have to load test it. It is an EC2-based cluster with one High-CPU Extra Large instance for the application, which runs PHP / Nginx.
This application is responsible for reading data from a Redis server which holds some 5k-10k key-value pairs; it then builds the response, logs the data to a MongoDB server, and replies to the client.
Whenever I send a request to the app server, it does all its computation in about 20-25 ms, which is awesome.
I am now trying to do some load testing, and I run a PHP-based app on my laptop to send requests to the server: many thousands of them quickly, over 20-30 seconds. During this load period, whenever I open the app URL in the browser, it replies with an execution time of around 25-35 ms, which again is fine, so I am sure that Redis and Mongo are not causing the bottleneck. But it takes about 25 seconds to get the response back during the load test.
The High-CPU Extra Large instance has 8 GB of RAM and 8 cores.
Also, during the load test, the top command shows about 4-6 php-cgi processes consuming some 15-20% of CPU.
I have 50 worker processes on nginx and 1024 worker connections.
What could be the issue causing the bottleneck?
If this doesn't work out, I am seriously considering moving to a whole Java application with an embedded web server and an embedded cache.
UPDATE - increased PHP_FCGI_CHILDREN to 8 and it halved the response time during load.
50 worker processes is too many; you need only one worker process per CPU core. Using more worker processes causes extra switching between processes, which wastes a lot of time.
What you can do now:
1. Set worker_processes to the minimum (one worker per CPU, e.g. 4 worker processes if you have 4 CPU cores), but worker_connections to the maximum (10240, for example).
2. Tune the TCP stack via sysctl. You can hit stack limits if you have many connections.
3. Get statistics from nginx's stub_status module (you can use munin + nginx; it's easy to set up and gives you enough information about system status).
4. Check nginx's error.log and the system messages log for errors.
5. Tune nginx itself (decrease connection timeouts and the maximum request size).
I hope that helps you.
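A minimal nginx sketch of those suggestions (assuming the stub_status module is compiled in; the status path and the core count are placeholders):

worker_processes 8;                 # one worker per CPU core

events {
    worker_connections 10240;       # raise the per-worker connection cap
}

http {
    server {
        listen 80;

        # connection counters for munin or manual checks; keep it private
        location = /nginx_status {
            stub_status on;
            allow 127.0.0.1;
            deny all;
        }
    }
}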
