Optimize PHP, MYSQL community website performance - php

I own a community website of about 12.000 users (write heavy), 100 concurrent users max on a single VPS with 1Gb ram. The load rarely goes above 3 and response is quite good.
Currently a simple file cache is used to store DB query results to ease the load on the DB, but the website still can slow down over 220 concurrent users (load test).
How can I find out what the bottleneck is?
I assume that DB is fine as cache is working fine, however Disk IO could cause problem. Each pageload has about 10 includes and 10-20 querys from DB or from the file cache, plus lots of php processing.
I tried using memcache instead of the file cache, but to my suprise the load test seemed to like file cache more.
I plan to use Alternative PHP Cache, but I still don't really understand how that cache is invalidated. I have a singe index.php that handles all requests. Will the cache store the result for each individual request? Will it clear the cache automatically if one of my includes (or query result from cache) change?
Any other suggestions for finding bottlenecks (tried xdebug)?
Thanks,
Hamlet

I plan to use Alternative PHP Cache,
but I still don't really understand
how that cache is invalidated. I have
a singe index.php that handles all
requests. Will the cache store the
result for each individual request?
Will it clear the cache automatically
if one of my includes (or query result
from cache) change?
APC doesn't cache output. It caches your compiled bytecode.
Essentially, a normal PHP request looks like this:
PHP files are parsed and compiled to bytecode
The PHP interpreter executes the bytecode
APC caches the result of the first step, so you aren't reparsing/recompiling the same code over and over again. By default, it still stat()s your PHP files on every request, to see if the file has been modified since its cached copy was compiled -- so any changes to your code will automatically invalidate the cached copy.
You can also use APC much like you'd use memcached, for storing arbitrary user data. Keep in mind, however:
A memcached server can serve data to multiple servers; data cached in APC can only really be used locally. Better to serve a gig of data from one memcached box to four servers, than to have 4 copies of that gig of data in APC on each individual server.
Memcached, in my experience, is better at handling large numbers of concurrent writes to a single cache key.
APC doesn't seem to cope very well with its cache filling up. Fragmentation increases, and performance drops.
Also, beware: unless you've set up some sort of locking mechanism, your file-based cache is likely to become corrupt due to simultaneous writes. If you have implemented locking, that may become a bottleneck of its own. IMO, concurrency is tricky -- let memcached/APC/the database deal with it.

You mention you used XDebug - what weren't you able to do? Typically, to start tracking down a bottleneck you enable profiling of a request and then view the resulting "cachegrind" file in KCacheGrind or WinCacheGrind.
As for using a cache system, a dynamic script such as yours will generally do something like this
construct a cache "key" from the unique inputs to the script
ask the caching system if it has data for that key. If has, you're good to go!
otherwise, do all the hard work to generate the data, and ask the caching system to store it under the desired key for next time.
APC Cache can help to speed things up further by caching the parsed version of the PHP code.

MySQL has its own query cache.
You can enable it by setting query_cache_size to more than 0.
The query results are taken from the cache if the query is repeated verbatim and does not contain certain things like non-deterministic functions, session variables and some other things describe here:
The cache for a query is invalidated by issuing any DML operation against any of the underlying queries.

I turned on and configured APC on the test server and got a performance increase of about 400%
300 concurrent users with response time 1,4 secs max :) Good for a start.
Update:
Live server test results
Original:
No APC: 220 concurrent users, server load 20, response time 5000ms
No APC: 250 concurrent users, server load 20+, site is unavailable
New:
APC enabled: 250 concurrent users, server load 2, response time is 600ms
APC enabled: 350 concurrent users, server load 10, response time is 1500ms
APC enabled: 500 concurrent users, server load 20, response is 5000ms + site is fully operational, but a bit slow but can be used normally
Thanks for the suggestions, this is pretty great improvement.
Query cache is disabled as the site is write heavy thus cache would be invalidated constantly for whole tables.

I would say that it's likely that your database is IO bound, I don't know exactly what a "VPS" is, but if it's some kind of VM, then there is almost guaranteed to be very poorly performing IO.
Get it on to real hardware ASAP; and get a sensible amount of ram (1G is tiny; 16G sounds more reasonable).
Then you may be able to tune your db so it can behave properly. How big are your data in total? If you can get all of them (or most of them) to fit in your database cache (not the dodgy query cache, the proper innodb buffer pool one), then do so.
I'm assuming you're using the innodb engine; if so, then set up the buffer pool to be big enough for all your data - if you don't have enough ram, buy more until you do (No, really!).
Then your db queries should be fast even if they're fairly bad (yes).
The tricky bit is, if you have a single machine, how to carve up ram usage between mysql and PHP - the web server (I assume Apache), particularly if you use prefork and lots of MaxClients, can use up loads of ram and deprive your database of it.
Get some decent monitoring on the job (with trending), and make changes carefully and record exactly when you made them.

Related

Caching data to spare mysql queries

I have a PHP application that is executed up to one hundred times simultaneously, and very often. (its a telegram anti-spam bot with 250k+ users)
The script itself makes various DB calls (tickers update, counters etc.) but it also load each time some more or less 'static' data from the database, like regexes or json config files.
My script is also doing image manipulation, so the server's CPU and RAM are sometimes under pressure.
Some days ago i ran into a problem, the apache2 OOM-Killer was killing the mysql server process due to lack of avaible memory. The mysql server were not restarting automaticaly, leaving my script broken for hours.
I already made some code optimisations that enabled my server to breathe, but what i'm looking now, is to have some caching method to store data between script executions, with the possibility to update them based on a time interval.
First i thought about flat file where i could serialize data, but i would like to know if it is a good idea or not regarding performances.
In my case, is there a benefit of using caching data over mysql queries ?
What are the pro/con, regarding speed of access, speed of execution ?
Finaly, what caching method should i implement ?
I know that the simplest solution is to upgrade my server capacity, I plan to do so anytime soon.
Server is running Debian 11, PHP 8.0
Thank you.
If you could use a NoSQL to provide those queries it would speed up dramatically.
Now if this is a no go, you can go old school and keep that "static" data in the filesystem.
You can then create a timer of your own that runs, for example, every 20 minutes to update the files.
When you ask info regarding speed of access, speed of execution the answer will always be "depends" but from what you said it would be better to access the file system that being constantly querying the database for the same info...
The complexity, consistency, etc, lead me to recommend against a caching layer. Instead, let's work a bit more on other optimizations.
OOM implies that something is tuned improperly. Show us what you have in my.cnf. How much RAM do you have? How much RAM des the image processing take? (PHP's image* library is something of a memory hog.) We need to start by knowing how much RAM can MySQL can have.
For tuning, please provide GLOBAL STATUS and VARIABLES. See http://mysql.rjweb.org/doc.php/mysql_analysis
That link also shows how to gather the slowlog. In it we should be able to find the "worst" queries and work on optimizing them. For "one hundred times simultaneously", even fast queries need to be further optimized. When providing the 'worst' queries, please provide SHOW CREATE TABLE.
Another technique is to decrease the number of children that Apache is allowed to run. Apache will queue up others. "Hundreds" is too many for Apache or MySQL; it is better to wait to start some of them rather than having "hundreds" stumbling over each other.

How to stabilise PHP response times when dealing with multiple simultaneous requests?

I'm building a PHP application with an API that has be able to respond very rapidly (within 100ms) to all requests, and must be able to handle up to 200 queries per second (requests are in JSON, and responses require a DB lookup + save every time). My code runs easily fast enough (very consistently around 30ms) for single requests, but as soon as it has to respond to multiple requests per second, the response times start jumping all over the place.
I don't think it's a memory problem (PHP's memory limit is set to 128MB and the code's memory usage is only around 3.5MB) or a MySQL problem (the code before any DB request is as likely to bottleneck as the bit that interacts with the DB).
Because the timing is so important, I need to get the response times as consistent as possible. So my question is: are there any simple tweaks I can make (to php.ini or Apache) to stabilise PHP's response times when handling multiple simultaneous requests?
One of the slowest things (easiest to fix) in my experience in a server in terms of bottleneck is going to be your filesystem and hard drives. I think speeding this up will help out in all other areas.
So you could for example upgrade the hard drive where your httpdocs and database resides. You can put it on an SSD drive for example. Or even make a RAM disk and place all files on it.
Alternatively you can setup your database such that it operates off of a Memory storage engine.
(Related info here too)
Of course for all that you'll need a lot of physical memory. It is also important to note if your web/app hosting you got is shared then your going to have problems with Shared Memory.
Tune Mysql
Tune Apache
Performance tune PHP
Get Zend Optimizer enabled, or look at APC, or eAccelerator
Here's some basic LAMP tuning tips from IBM
Here's a slideshare with some good advice as well

First page request on website very slow

The first page I load from my site after not visiting it for 20+ mins is very slow. Subsequent page loads are 10-20x faster. What are the common causes of this symptom? Could my server be sleeping or something when it's not receiving http requests?
I will answer this question generally because I'm sure it's something that confuses a lot of newcomers.
The really short answer is: caching.
Just about every program in your computer uses some form of caching to remember data that has already been loaded/processed recently, so it doesn't have to do the work again.
The size of the cache is invariably limited, so stuff has to be thrown out. And 99% of the time the main criteria for expiring cache entries is, how long ago was this last used?
Your operating system caches file data that is read from disk
PHP caches pages and keeps them compiled in memory
The CPU caches memory in its own special faster memory (although this may be less obvious to most users)
And some things that are not actually a cache, work in the same way as cache:
virtual memory aka swap. When there not enough memory available for certain programs, the operating system has to make room for them by moving chunks of memory onto disk. On more recent operating systems the OS will do this just so it can make the disk cache bigger.
Some web servers like to run multiple copies of themselves, and share the workload of requests between them. The copies individually cache stuff too, depending on the setup. When the workload is low enough the server can terminate some of these processes to free up memory and be nice to the rest of the computer. Later on if the workload increases, new processes have to be started, and their memory loaded with various data.
(Note, the wikipedia links above go into a LOT of detail. I'm not expecting everyone to read them, but they're there if you really want to know more)
It's probably not sleeping. It's just not visited for a while and releases it's resources. It takes time to get it started again.
If the site is visited frequently by many users it should response quickly every time.
It sounds like it could be caching. Is the server running on the same machine as your browser? If not, what's the network configuration (same LAN, etc...)?

Save memcache data to a file or a database

We're running a website which performs financial modeling and it takes a while for memcache to build its cache. For instance, after 1 week the number of hits is only at 48% and the cache used is 2GB (out of 5GB allocated). Since we don't want to loose that cache should the server crash or need to be restarted, we would like to save it somewhere.
Q: What are the best options for storing the content of the memcache cache somewhere permanent (and restoring that content)?
So far we haven't seen memcache reach the point whereby % of hits doesn't improve. We know we quickly get to 30% hits with 300MB of data, which corresponds to caching of shared content. Afterwards, objects become much bigger and are created less frequently. By looking at our munin graphs, I would say we could reach our best % of hits within 2 to 3 months. I really think we have a case for saving our memcache data.
FYI I'm not adding the graph showing the % of hits/misses because it evolves so slowly that it's not really readable.
You might try:
CouchBase
Redis
Or you may find more tips on StackOverflow here: alternative to memcached that can persist to disk
I'd not suggest using Redis in a financial production environment, it's to unstable for it at this time. It is something to keep an eye out for in the future.
CouchBase is a solution.
You can also use repcached (http://repcached.lab.klab.org/) to replicate the memcache data to a 2nd memcache daemon in case the first one crashes, you still have all your memcache data. You can also use this type of setup to loadbalance your memcache usage if you want, i've been using it for quite some time now and are very happy with it's performance.

Performance comparison of MemCached with Disk Caching

I would like to know the performances of Memcached on remote server(on same LAN) with Disk Caching.Besides Memcached is a scalable cache solution, would there be any advantage of using Memcached with respect to performance when compared to disk caching.
Regards,
Mugil.
From my personal experience I've found memcached isn't as fast as disk cache. I believe this is because of the OS's disk IO's caching, but memcached allows for a "scalable" cache, meaning if you have more than 1 server accessing the same cache data, it will scale (especially since memcached have a very low CPU overhead compared to PHP). The only way to allow more than 1 machine to access a disk cache at once is network mount which is surely going to kill the speed of the accesses. Also one other thing you have to worry about with file caching is garbage collection to prevent disk from being saturated.
As your site(s) scale, you might want to change your mind later, so whatever choice you make, use a cache wrapper, so you can easily change your method. Zend provides a good API.
Reads are going to be about the same speed (since the OS will cache files frequently accessed)... The difference is going to be with writes. With memcached, all it needs to do is write the ram. But with file storage, it gets a little bit more tricky. If you have write caching enabled, it'll be about as fast. But most servers have it turned off (Unless they have a battery backed cache) for more reliable writes in the case of power failure. So if yours is not using a write cache, it'll require the write to complete on disk (can take 5+ ms on server grade drives, possibly more on desktop grade hardware) before returning. So writing files may be significantly slower than memcached.
But there's another caviat. With files, when you want to write, you'd need to lock the file yourself. This means that if two processes try to write to the same file, the writes would have to complete serially. With memcached, those two writes get pushed into a queue and happen in the order received, but the writing process (PHP) doesn't need to wait for the actual commit...

Categories