I have a PHP script that runs very simple queries on a MySQL database (up to a maximum of 25 times per page). Is it going to be worth caching the results (e.g. using APC), or is the performance difference likely to be negligible?
Caching is just about always worth it. Pulling data from APC's in-memory user cache vs. establishing a db connection and running queries is a massive difference--especially if you're doing 25 queries on a page!
The benefits will compound:
Pulling from memory, you'll serve requests faster because each one needs less overhead
You'll free up db connections
You'll free up apache processes faster
All of which will help serve up requests faster...
I'm running an app that interacts with a MySQL database using the native MySQL PDO driver. I'm not using Laravel or any other framework.
Most of the API's logic is fetching from the DB and/or inserting into it.
When running in production I can see high memory usage from MySQL tasks, check below:
I have a couple of questions.
Is such high memory usage normal? And what's the best practice for managing PHP-MySQL connections properly in a multi-threaded, production-level app?
When an intensive query is running (fetching historical data to plot a graph), the CPU usage jumps to 100% until the query finishes, then returns to 2-3%. During that time the system is completely unresponsive.
I'm thinking of hardware-based solutions, such as separating the db server from the app server (currently they both run on the same node), managing a cluster, and using read-only nodes.
But I'd like to know if there are other options, and what's the most efficient way to handle PHP-MySQL connections.
You could also check whether your queries are fully optimized and whether connections are closed when they are not in use.
You can also reduce the load on the MySQL server by moving some of the work to PHP.
Also take a look at the query_cache_size, innodb_io_capacity, innodb_buffer_pool_size and max_connections settings in your my.cnf.
Sometimes upgrading PHP and doing some caching in Apache can also help reduce RAM usage.
https://phoenixnap.com/kb/improve-mysql-performance-tuning-optimization
Normally, innodb_buffer_pool_size is set to about 70% of available RAM, and MySQL will consume that, plus other space that it needs. Scratch that; you have only 1GB of RAM? The buffer_pool needs to be a smaller percentage. It seems that the current value is pretty good.
Your top shows only 37.2% of RAM in use by MySQL -- the many threads are sharing the same memory.
Sort by memory (in top, use < or >). Something else is consuming a bunch of RAM.
query_cache_size should be zero.
max_connections should be, say, 10 or 20.
CPU -- You have only one core? So a big query hogging the CPU is to be expected. It also says that I/O was not an issue. You could show us the query, plus SHOW CREATE TABLE and EXPLAIN SELECT...; perhaps we can suggest how to speed it up.
I use Codeigniter 3 with database-backed sessions. It uses SELECT GET_LOCK("$SESSION_ID", 300) to prevent concurrent requests from causing issues with session data.
This works fine until I have a ton of traffic, which causes a bunch of open connections all calling this function and waiting for the lock to be released.
It sometimes brings the application to a halt for a couple of minutes. It never reaches my max number of connections in mysql and CPU and RAM usages are fine, but performance is slow for all users of that database server. (Not sure why it doesn't perform ok). I do use pt-kill to remove any queries taking longer than 15 seconds.
I already tried using the redis driver that had its own performance issues so I am back to the database sessions.
So my questions are:
How do I optimize my application to perform well when I get a ton of traffic that causes GET_LOCK queries to pile up and hold connections to MySQL open? I was thinking I could use persistent connections, but I'm not sure that is a good idea, as most people recommend against them.
I have a MySQL table with about 90k rows. I have a routine I've written which loops through each one of these rows, and then cross-checks the results against another table with about 90k rows. If there is a match, I delete one of the rows. I've indexed all the columns I'm cross-checking.
When I run this script on a dedicated local server with 2 x quad 2.4GHz Intel Xeon, 24GB of RAM (with PHP memory_limit set to 12288M), and an SSD, the whole script takes about a minute to complete. I would imagine then that the server's resources are maxing out, but actually the CPU is about 93% idle, RAM is at about 6% utilisation, and looking at reads/writes on the SSD, not much is happening at all.
I mentioned the problem to somebody else who said that the problem is I'm executing a single-threaded process and wondering why it's not using all 8 processors, but even so, is checking through a mysql table 90k times really a big deal? Wouldn't at least one CPU be running at max?
Why doesn't my server attempt to throw more resources at the script when I run it? Or, how can I unleash more resources so that my local web app runs not like a low spec'd VPS?
Depending on the size of the rows, 90K rows isn't a whole lot. Odds are they're all cached in RAM.
As for the CPUs, your process is not quite single threaded, but it's pretty close. Your process and the DB server are separate processes. The problem, of course, is that your process stops while the DB server handles the request, so whichever core has your process scheduled goes idle just as the one running the DB spools up.
As the commenter mentioned, it's likely you can do this more efficiently by offloading most of the processing to the DB server. Most of your time is just in statement overhead sending 90K SQL statements to the server.
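To make that concrete, here is a rough sketch of replacing a row-by-row loop with one set-based DELETE. It uses Python's built-in sqlite3 module only so that it runs anywhere; the table names (table_a, table_b) and the match column (ref) are made up for illustration, since the question doesn't show the real schema. On MySQL the same idea is usually written as a multi-table DELETE with a JOIN.

```python
import sqlite3


def delete_matches(conn):
    """Delete every table_a row that has a match in table_b, in one statement.

    Hypothetical schema: both tables have an indexed `ref` column.
    A MySQL equivalent would be roughly:
        DELETE a FROM table_a a JOIN table_b b ON a.ref = b.ref;
    """
    cur = conn.execute(
        "DELETE FROM table_a "
        "WHERE EXISTS (SELECT 1 FROM table_b WHERE table_b.ref = table_a.ref)"
    )
    conn.commit()
    return cur.rowcount  # number of rows removed


# Small in-memory demo: 1000 rows in table_a, every second ref also in table_b.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (id INTEGER PRIMARY KEY, ref INTEGER)")
conn.execute("CREATE TABLE table_b (id INTEGER PRIMARY KEY, ref INTEGER)")
conn.execute("CREATE INDEX idx_b_ref ON table_b (ref)")
conn.executemany("INSERT INTO table_a (ref) VALUES (?)", [(i,) for i in range(1000)])
conn.executemany("INSERT INTO table_b (ref) VALUES (?)", [(i,) for i in range(0, 1000, 2)])

deleted = delete_matches(conn)
remaining = conn.execute("SELECT COUNT(*) FROM table_a").fetchone()[0]
```

The point is that the database receives one statement instead of 90K, so the per-statement round trip and parsing overhead is paid once.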
I am having very high CPU spikes on mysqld process (greater than 100%, and even saw a 300% at one point). My load average is around: .25, .34, .28.
I read this great post about this issue: MySQL high CPU usage
One of the main things to do is disable persistent connections. So I checked my php.ini and mysql.allow_persistent = on and mysql.max_persistent = -1 -- which means no limit.
This raises a few questions for me before changing anything just to be sure:
If my mysqld process is spiking over 100% every couple of seconds, shouldn't my load average be higher than it is?
What will disabling persistent links do - will my scripts continue to function as is?
If I turn this off and reload php what does this mean for my current users as there will be many active users.
EDIT:
CPU Info: Core2Quad q9400 2.6 Ghz
Persistent connections won't use any CPU by themselves - if nothing's using a connection, it's just sitting idle and only consumes a bit of memory and occupies a socket.
Load averages are just that - averages. If you have a process that alternates between 0% and 100% 10 times a second, you'd get a load average of 0.5. They're good for figuring out long-term persistent high cpu, but by their nature hide/obliterate signs of spikes.
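A small numeric sketch of why averaging hides spikes. The sample series is invented (a process that is runnable in every other sample), and the damped average is only an approximation of the exponentially smoothed value that Unix-like kernels report; the 5-second sampling interval and 60-second window are assumptions chosen to resemble the 1-minute figure.

```python
import math


def simple_average(samples):
    """Plain arithmetic mean of the samples."""
    return sum(samples) / len(samples)


def damped_average(samples, interval_s=5.0, window_s=60.0):
    """Exponentially damped moving average, similar in spirit to a load average.

    interval_s and window_s are illustrative assumptions, not values
    read from any particular system.
    """
    decay = math.exp(-interval_s / window_s)
    load = 0.0
    for n in samples:
        load = load * decay + n * (1.0 - decay)
    return load


# A process that alternates between idle (0) and fully busy (1).
busy = [0, 1] * 500

mean_load = simple_average(busy)      # the spikes average out to one half
peak_load = max(busy)                 # the individual spikes still hit 1
smoothed_load = damped_average(busy)  # hovers close to the mean
```

Both averages sit near one half even though the process repeatedly hits 100%, which is why a calm load average and visible spikes in top can coexist.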
Persistent connections with mysql are usually not needed. MySQL has a relatively fast connection protocol and any time savings from using persistent connections is fairly minimal. The downside is that once a connection goes persistent, it can be left in an inconsistent state. e.g. If an app using the connection dies unexpectedly, MySQL will not see that and start cleaning up. This means that any server-side variables created by the app, any locks, any transactions, etc... will be left at the state they were in when the app crashed.
When the connection gets re-used by another app, you'll start out with what amounts to dirty dishes in the sink and an unflushed toilet. It can quite easily cause deadlocks because of the dangling transactions/locks - the new app won't know about them, and the old app is no longer around to relinquish those.
Spikes are fine. This is MySQL doing work. Your load average seems appropriate.
Disabling persistent links simply means that the scripts cannot reuse an existing connection to the database. I wouldn't recommend disabling this. At the very least, if you want to disable them, do it at the application layer, rather than in MySQL. Disabling them might even increase load slightly, depending on the conditions.
Finally, DB persistence has nothing to do with the users on your site (generally). Users make a request, and once all of the page resources are loaded, that is it, until the next request. (Except in a few specific cases.) In any case, while the request is happening, the script will still be connected to the DB.
I own a community website of about 12.000 users (write heavy), 100 concurrent users max on a single VPS with 1Gb ram. The load rarely goes above 3 and response is quite good.
Currently a simple file cache is used to store DB query results to ease the load on the DB, but the website still can slow down over 220 concurrent users (load test).
How can I find out what the bottleneck is?
I assume that the DB is fine as the cache is working fine; however, disk IO could be causing problems. Each page load has about 10 includes and 10-20 queries from the DB or from the file cache, plus lots of PHP processing.
I tried using memcache instead of the file cache, but to my surprise the load test seemed to like the file cache more.
I plan to use Alternative PHP Cache, but I still don't really understand how that cache is invalidated. I have a single index.php that handles all requests. Will the cache store the result for each individual request? Will it clear the cache automatically if one of my includes (or query result from cache) changes?
Any other suggestions for finding bottlenecks (tried xdebug)?
Thanks,
Hamlet
I plan to use Alternative PHP Cache,
but I still don't really understand
how that cache is invalidated. I have
a single index.php that handles all
requests. Will the cache store the
result for each individual request?
Will it clear the cache automatically
if one of my includes (or query result
from cache) change?
APC doesn't cache output. It caches your compiled bytecode.
Essentially, a normal PHP request looks like this:
PHP files are parsed and compiled to bytecode
The PHP interpreter executes the bytecode
APC caches the result of the first step, so you aren't reparsing/recompiling the same code over and over again. By default, it still stat()s your PHP files on every request, to see if the file has been modified since its cached copy was compiled -- so any changes to your code will automatically invalidate the cached copy.
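The stat-and-compare idea can be sketched in a few lines. This is not how APC is implemented internally; it is only a minimal illustration, in Python, of caching a compiled result keyed by file path and recompiling when the file's modification time changes. The class name StatCheckedCache and the demo file are hypothetical.

```python
import os
import tempfile


class StatCheckedCache:
    """Cache compiled code per file and recompile when the file's mtime changes."""

    def __init__(self, compile_fn):
        self._compile = compile_fn
        self._entries = {}       # path -> (mtime_ns, compiled object)
        self.compile_count = 0   # how many times we actually compiled

    def get(self, path):
        mtime = os.stat(path).st_mtime_ns  # one stat() per lookup
        entry = self._entries.get(path)
        if entry is not None and entry[0] == mtime:
            return entry[1]                # file unchanged: reuse cached copy
        with open(path, "r", encoding="utf-8") as fh:
            source = fh.read()
        compiled = self._compile(source, path)
        self.compile_count += 1
        self._entries[path] = (mtime, compiled)
        return compiled


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "page.py")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("x = 1\n")

    cache = StatCheckedCache(lambda src, p: compile(src, p, "exec"))
    cache.get(path)
    cache.get(path)                        # second call is served from the cache
    compiles_before_edit = cache.compile_count

    # Edit the file and push its mtime forward so the change is visible
    # even on filesystems with coarse timestamp resolution.
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("x = 2\n")
    st = os.stat(path)
    os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns + 2_000_000_000))

    cache.get(path)                        # mtime changed: recompiled
    compiles_after_edit = cache.compile_count
```

The cost of this scheme is one stat() per file per request, which is the trade-off the paragraph above describes.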
You can also use APC much like you'd use memcached, for storing arbitrary user data. Keep in mind, however:
A memcached server can serve data to multiple servers; data cached in APC can only really be used locally. Better to serve a gig of data from one memcached box to four servers, than to have 4 copies of that gig of data in APC on each individual server.
Memcached, in my experience, is better at handling large numbers of concurrent writes to a single cache key.
APC doesn't seem to cope very well with its cache filling up. Fragmentation increases, and performance drops.
Also, beware: unless you've set up some sort of locking mechanism, your file-based cache is likely to become corrupt due to simultaneous writes. If you have implemented locking, that may become a bottleneck of its own. IMO, concurrency is tricky -- let memcached/APC/the database deal with it.
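If you do stay with a file cache, one common way to avoid half-written files without holding a lock is to write to a temporary file and rename it over the target. The sketch below shows that pattern in Python; the function name, file name and JSON format are assumptions for illustration. Note that it only prevents readers from seeing a torn file: if two writers race, the last rename wins.

```python
import json
import os
import tempfile


def write_cache_atomically(path, data):
    """Write `data` as JSON so readers see the old file or the new one, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory so the rename stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh)
            fh.flush()
            os.fsync(fh.fileno())      # make sure the bytes are on disk
        os.replace(tmp_path, path)     # atomic swap into place
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)        # clean up the partial temp file
        raise


with tempfile.TemporaryDirectory() as tmp:
    cache_file = os.path.join(tmp, "stats.cache.json")
    write_cache_atomically(cache_file, {"user_count": 41})
    write_cache_atomically(cache_file, {"user_count": 42})  # replaces the first
    with open(cache_file, "r", encoding="utf-8") as fh:
        loaded = json.load(fh)
```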
You mention you used XDebug - what weren't you able to do? Typically, to start tracking down a bottleneck you enable profiling of a request and then view the resulting "cachegrind" file in KCacheGrind or WinCacheGrind.
As for using a cache system, a dynamic script such as yours will generally do something like this
construct a cache "key" from the unique inputs to the script
ask the caching system if it has data for that key. If it has, you're good to go!
otherwise, do all the hard work to generate the data, and ask the caching system to store it under the desired key for next time.
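The three steps above can be sketched as follows. This is a language-neutral illustration written in Python, with an in-process dictionary standing in for APC or memcached; TTLCache, make_key, get_or_compute and the fake query are all hypothetical names, not part of any real caching API.

```python
import hashlib
import json
import time


class TTLCache:
    """Tiny in-memory stand-in for a cache backend, with per-entry expiry."""

    def __init__(self):
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]   # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl_s):
        self._store[key] = (time.monotonic() + ttl_s, value)


def make_key(prefix, **inputs):
    """Step 1: build a key from the inputs that make the result unique."""
    payload = json.dumps(inputs, sort_keys=True)
    return prefix + ":" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def get_or_compute(cache, key, compute, ttl_s=60.0):
    """Steps 2 and 3: return the cached value, or compute and store it.

    A result of None is never cached in this sketch, because None is
    also used to signal a miss.
    """
    value = cache.get(key)
    if value is None:
        value = compute()
        cache.set(key, value, ttl_s)
    return value


calls = []

def expensive_query():
    """Pretend database query; records each call so the demo can count them."""
    calls.append(1)
    return ["alice", "bob"]


cache = TTLCache()
key = make_key("top_users", page=1, limit=2)
first = get_or_compute(cache, key, expensive_query)    # miss: runs the query
second = get_or_compute(cache, key, expensive_query)   # hit: served from cache
```

Sorting the inputs before hashing means the same parameters always map to the same key regardless of the order they were passed in.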
APC Cache can help to speed things up further by caching the parsed version of the PHP code.
MySQL has its own query cache (in versions before 8.0, which removed it).
You can enable it by setting query_cache_size to more than 0.
The query results are taken from the cache if the query is repeated verbatim and does not contain certain things like non-deterministic functions, session variables and some other things described here:
The cache for a query is invalidated by issuing any DML operation against any of the underlying tables.
I turned on and configured APC on the test server and got a performance increase of about 400%
300 concurrent users with response time 1,4 secs max :) Good for a start.
Update:
Live server test results
Original:
No APC: 220 concurrent users, server load 20, response time 5000ms
No APC: 250 concurrent users, server load 20+, site is unavailable
New:
APC enabled: 250 concurrent users, server load 2, response time is 600ms
APC enabled: 350 concurrent users, server load 10, response time is 1500ms
APC enabled: 500 concurrent users, server load 20, response time is 5000ms; the site is fully operational and can be used normally, if a bit slow
Thanks for the suggestions, this is pretty great improvement.
Query cache is disabled because the site is write heavy, so the cache would be invalidated constantly for whole tables.
I would say that it's likely that your database is IO bound. I don't know exactly what a "VPS" is, but if it's some kind of VM, then it is almost guaranteed to have very poorly performing IO.
Get it on to real hardware ASAP; and get a sensible amount of ram (1G is tiny; 16G sounds more reasonable).
Then you may be able to tune your db so it can behave properly. How big are your data in total? If you can get all of them (or most of them) to fit in your database cache (not the dodgy query cache, the proper innodb buffer pool one), then do so.
I'm assuming you're using the innodb engine; if so, then set up the buffer pool to be big enough for all your data - if you don't have enough ram, buy more until you do (No, really!).
Then your db queries should be fast even if they're fairly bad (yes).
The tricky bit is, if you have a single machine, how to carve up ram usage between mysql and PHP - the web server (I assume Apache), particularly if you use prefork and lots of MaxClients, can use up loads of ram and deprive your database of it.
Get some decent monitoring on the job (with trending), and make changes carefully and record exactly when you made them.