I am seeing very high CPU spikes on the mysqld process (greater than 100%, and even 300% at one point). My load averages are around 0.25, 0.34, 0.28.
I read this great post about this issue: MySQL high CPU usage
One of the main suggestions is to disable persistent connections. So I checked my php.ini: mysql.allow_persistent = on and mysql.max_persistent = -1, which means no limit.
This raises a few questions for me before changing anything just to be sure:
If my mysqld process is spiking over 100% every couple of seconds, shouldn't my load averages be higher than they are?
What will disabling persistent links do - will my scripts continue to function as is?
If I turn this off and reload PHP, what does this mean for my current users, given that there will be many active users?
EDIT:
CPU info: Core 2 Quad Q9400 @ 2.6 GHz
Persistent connections won't use any CPU by themselves - if nothing's using a connection, it's just sitting idle and only consumes a bit of memory and occupies a socket.
Load averages are just that - averages. If you have a process that alternates between 0% and 100% 10 times a second, you'd get a load average of 0.5. They're good for spotting long-term, persistently high CPU usage, but by their nature they hide/obliterate signs of spikes.
Persistent connections with MySQL are usually not needed. MySQL has a relatively fast connection protocol, and any time savings from using persistent connections are fairly minimal. The downside is that once a connection goes persistent, it can be left in an inconsistent state. For example, if an app using the connection dies unexpectedly, MySQL will not notice and clean up after it. This means that any server-side variables created by the app, any locks, any transactions, etc. will be left in the state they were in when the app crashed.
When the connection gets re-used by another app, you'll start out with what amounts to dirty dishes in the sink and an unflushed toilet. It can quite easily cause deadlocks because of the dangling transactions/locks - the new app won't know about them, and the old app is no longer around to relinquish those.
Spikes are fine. This is MySQL doing work. Your load average seems appropriate.
Disabling persistent links simply means that the scripts cannot re-use an existing connection to the database. I wouldn't recommend disabling this. At the very least, if you want to disable them, do it at the application layer rather than in MySQL. This might even increase load slightly, depending on the conditions.
Finally, DB persistence has nothing to do with the users on your site (generally). Users make a request, and once all of the page resources are loaded, that is it, until the next request. (Except in a few specific cases.) In any case, while the request is happening, the script will still be connected to the DB.
Related
I'm running an app that interacts with a MySQL database using the native MySQL PDO driver; I'm not using Laravel or any framework.
So most of the API's logic is fetching from the DB and/or inserting.
When running in production I can see high memory usage from the MySQL tasks; see below:
I have a couple of questions.
Is such high memory usage normal? And what's the best practice for managing PHP-MySQL connections properly in a multi-threaded, production-level app?
When an intensive query is running (fetching historical data to plot a graph), CPU usage jumps to 100%; once the query finishes, it drops back to 2-3%. But during that time the system is completely stalled.
I'm thinking of hardware-based solutions, such as separating the DB server from the app server (currently they both run on the same node), or managing a cluster and using read-only nodes.
But I'd like to know whether there are other options, and what the most efficient way to handle PHP-MySQL connections is.
You could also check whether the queries you've written are fully optimized, and whether connections are closed when not in use.
You can also reduce the load on the MySQL server by shifting some work to PHP.
Also take a look at your query_cache_size, innodb_io_capacity, innodb_buffer_pool_size and max_connections settings in your my.cnf.
Sometimes upgrading PHP and adding some caching in Apache can also help reduce RAM usage.
https://phoenixnap.com/kb/improve-mysql-performance-tuning-optimization
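To see what the server is currently using for the settings mentioned above, a quick check from PHP might look like this (a sketch only; the host and credentials are placeholders for your own):

// Print the current values of a few tuning variables (placeholders for credentials).
$pdo = new PDO('mysql:host=127.0.0.1;charset=utf8', 'user', 'password',
    array(PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION));

foreach (array('query_cache_size', 'innodb_io_capacity',
               'innodb_buffer_pool_size', 'max_connections') as $name) {
    $stmt = $pdo->prepare('SHOW VARIABLES LIKE ?');
    $stmt->execute(array($name));
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    printf("%s = %s\n", $row['Variable_name'], $row['Value']);
}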
Normally, innodb_buffer_pool_size is set to about 70% of available RAM, and MySQL will consume that, plus other space that it needs. Scratch that; you have only 1 GB of RAM? Then the buffer pool needs to be a smaller percentage. It seems that the current value is pretty good.
Your top output shows only 37.2% of RAM in use by MySQL; the many threads are sharing the same memory.
Sort by memory (in top, use < or >). Something else is consuming a bunch of RAM.
query_cache_size should be zero.
max_connections should be, say, 10 or 20.
CPU -- You have only one core? So a big query hogging the CPU is to be expected. It also says that I/O was not an issue. You could show us the query, plus SHOW CREATE TABLE and EXPLAIN SELECT...; perhaps we can suggest how to speed it up.
I use CodeIgniter 3 with database-backed sessions. It uses SELECT GET_LOCK("$SESSION_ID", 300) to prevent concurrent requests from causing issues with session data.
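Roughly, the locking pattern looks like this (a simplified sketch, not CodeIgniter's actual session driver; the $pdo handle and the session id variable are assumed):

// Simplified sketch of the per-session lock (placeholder code, not the real driver).
$sessionId = session_id();
$lock = $pdo->prepare('SELECT GET_LOCK(?, 300)');   // wait up to 300 seconds for the lock
$lock->execute(array($sessionId));

// ... read, modify and write the session row while holding the lock ...

$release = $pdo->prepare('SELECT RELEASE_LOCK(?)'); // let the next request in
$release->execute(array($sessionId));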
This works fine until I have a ton of traffic, which causes a bunch of open connections all calling this function and waiting for the lock to be released.
It sometimes brings the application to a halt for a couple of minutes. It never reaches my maximum number of connections in MySQL, and CPU and RAM usage are fine, but performance is slow for all users of that database server. (Not sure why it doesn't perform OK.) I do use pt-kill to remove any queries taking longer than 15 seconds.
I already tried the Redis driver, which had its own performance issues, so I am back to database sessions.
So my questions are:
How do I optimize my application to perform well when a ton of traffic causes GET_LOCK queries to pile up and open connections to MySQL? I was thinking I could use persistent connections, but I'm not sure that is a good idea, since most people recommend against them.
In PDO, a connection can be made persistent using the PDO::ATTR_PERSISTENT attribute. According to the PHP manual:
Persistent connections are not closed at the end of the script, but are cached and re-used when another script requests a connection using the same credentials. The persistent connection cache allows you to avoid the overhead of establishing a new connection every time a script needs to talk to a database, resulting in a faster web application.
The manual also recommends not using persistent connections with the PDO ODBC driver, because doing so may interfere with the ODBC connection pooling process.
So apparently there are no drawbacks to using persistent connections in PDO, except in that last case. However, I would like to know whether there are any other disadvantages of this mechanism, i.e., situations where it results in performance degradation or something like that.
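For reference, this is how the attribute is set (a minimal sketch; the DSN and credentials are placeholders):

// Open (or re-use) a persistent PDO connection; placeholders for credentials.
$dbh = new PDO('mysql:host=127.0.0.1;dbname=test;charset=utf8', 'user', 'password',
    array(
        PDO::ATTR_PERSISTENT => true,               // re-use a cached connection if one exists
        PDO::ATTR_ERRMODE    => PDO::ERRMODE_EXCEPTION,
    ));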
Please be sure to read this answer below, which details ways to mitigate the problems outlined here.
The same drawbacks exist using PDO as with any other PHP database interface that does persistent connections: if your script terminates unexpectedly in the middle of database operations, the next request that gets the left over connection will pick up where the dead script left off. The connection is held open at the process manager level (Apache for mod_php, the current FastCGI process if you're using FastCGI, etc), not at the PHP level, and PHP doesn't tell the parent process to let the connection die when the script terminates abnormally.
If the dead script locked tables, those tables will remain locked until the connection dies or the next script that gets the connection unlocks the tables itself.
If the dead script was in the middle of a transaction, that can block a multitude of tables until the deadlock timer kicks in, and even then, the deadlock timer can kill the newer request instead of the older request that's causing the problem.
If the dead script was in the middle of a transaction, the next script that gets that connection also gets the transaction state. It's very possible (depending on your application design) that the next script might not actually ever try to commit the existing transaction, or will commit when it should not have, or roll back when it should not have.
This is only the tip of the iceberg. It can all be mitigated to an extent by always trying to clean up after a dirty connection on every single script request, but that can be a pain depending on the database. Unless you have identified creating database connections as the one thing that is a bottleneck in your script (this means you've done code profiling using xdebug and/or xhprof), you should not consider persistent connections as a solution to anything.
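As one illustration, a defensive clean-up of a possibly dirty MySQL connection could look something like this (a sketch only, and necessarily incomplete; $dbh is an assumed PDO handle):

// Defensive reset at the start of a request when reusing a persistent connection.
// Incomplete by nature: per-session variables, temporary tables, changed character
// sets and so on are not covered.
$dbh->exec('ROLLBACK');           // abandon any transaction a dead script left behind
$dbh->exec('UNLOCK TABLES');      // release any explicit table locks
$dbh->exec('SET autocommit = 1'); // restore the default autocommit mode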
Further, most modern databases (including PostgreSQL) have their own preferred ways of performing connection pooling that don't have the immediate drawbacks that plain vanilla PHP-based persistent connections do.
To clarify a point, we use persistent connections at my workplace, but not by choice. We were encountering weird connection behavior, where the initial connection from our app server to our database server was taking exactly three seconds, when it should have taken a fraction of a fraction of a second. We think it's a kernel bug. We gave up trying to troubleshoot it because it happened randomly and could not be reproduced on demand, and our outsourced IT didn't have the concrete ability to track it down.
Regardless, when the folks in the warehouse are processing a few hundred incoming parts, and each part is taking three and a half seconds instead of a half second, we had to take action before they kidnapped us all and made us help them. So, we flipped a few bits on in our home-grown ERP/CRM/CMS monstrosity and experienced all of the horrors of persistent connections first-hand. It took us weeks to track down all the subtle little problems and bizarre behavior that happened seemingly at random. It turned out that those once-a-week fatal errors that our users diligently squeezed out of our app were leaving locked tables, abandoned transactions and other unfortunate wonky states.
This sob-story has a point: It broke things that we never expected to break, all in the name of performance. The tradeoff wasn't worth it, and we're eagerly awaiting the day we can switch back to normal connections without a riot from our users.
In response to Charles' problem above,
From http://www.php.net/manual/en/mysqli.quickstart.connections.php:
A common complaint about persistent connections is that their state is not reset before reuse. For example, open and unfinished transactions are not automatically rolled back. But also, authorization changes which happened in the time between putting the connection into the pool and reusing it are not reflected. This may be seen as an unwanted side-effect. On the contrary, the name persistent may be understood as a promise that the state is persisted.
The mysqli extension supports both interpretations of a persistent connection: state persisted, and state reset before reuse. The default is reset. Before a persistent connection is reused, the mysqli extension implicitly calls mysqli_change_user() to reset the state. The persistent connection appears to the user as if it was just opened. No artifacts from previous usages are visible.
The mysqli_change_user() function is an expensive operation. For best performance, users may want to recompile the extension with the compile flag MYSQLI_NO_CHANGE_USER_ON_PCONNECT being set.
It is left to the user to choose between safe behavior and best performance. Both are valid optimization goals. For ease of use, the safe behavior has been made the default at the expense of maximum performance.
Persistent connections are a good idea only when it takes a (relatively) long time to connect to your database. Nowadays that's almost never the case. The biggest drawback to persistent connections is that they limit the number of users you can have browsing your site: if MySQL is configured to only allow 10 concurrent connections at once, then when an 11th person tries to browse your site it won't work for them.
PDO does not manage the persistence; the MySQL driver does. It reuses a connection when one is available and the host/user/password/database match. If any of these change, it will not reuse the connection. The net effect in the best case is that the connections you have will be started and stopped so often, because you have different users on the site, that making them persistent doesn't do any good.
The key thing to understand about persistent connections is that you should NOT use them in most web applications. They sound enticing but they are dangerous and pretty much useless.
I'm sure there are other threads on this, but a persistent connection is dangerous because it persists between requests. If, for example, you lock a table during a request and then fail to unlock it, that table is going to stay locked indefinitely. Persistent connections are also pretty much useless for 99% of your apps, because you have no way of knowing whether the same connection will be used between different requests. Each web thread will have its own set of persistent connections, and you have no way of controlling which thread will handle which requests.
The procedural mysql library of PHP has a feature whereby subsequent calls to mysql_connect will return the same link, rather than opening a different connection (as one might expect). This has nothing to do with persistent connections and is specific to the mysql library. PDO does not exhibit this behaviour.
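To illustrate with the old ext/mysql API (long deprecated and since removed in PHP 7; host and credentials are placeholders):

// Old ext/mysql behaviour: identical arguments return the existing link.
$link1 = mysql_connect('127.0.0.1', 'user', 'password');
$link2 = mysql_connect('127.0.0.1', 'user', 'password');       // same link as $link1
$link3 = mysql_connect('127.0.0.1', 'user', 'password', true); // $new_link forces a fresh connection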
In general, you could use this as a rough "ruleset":
YES, use persistent connections, if:
There are only a few applications/users accessing the database, i.e. you will not end up with 200 open (but probably idle) connections because 200 different users are sharing the same host.
The database is running on another server that you are accessing over the network.
A single application accesses the database very often.
NO, don't use persistent connections, if:
Your application only needs to access the database 100 times an hour.
You have many, many webservers accessing one database server
Using persistent connections is considerably faster, especially if you are accessing the database over a network. It doesn't make so much difference if the database is running on the same machine, but it is still a little bit faster. However, as the name says, the connection is persistent, i.e. it stays open even if it is not used.
The problem with that is that in the default configuration MySQL only allows a limited number of parallel connections; after that, new connections are refused (you can tweak this via the max_connections setting). So if you have, say, 20 web servers with 100 clients each, and every one of them has just one page access per hour, simple math shows that you'll need 2000 parallel persistent connections to the database. That won't work.
Ergo: Only use it for applications with lots of requests.
In my tests I had a connection time of over a second to my localhost, which made me assume I should use a persistent connection. Further tests showed the problem was with 'localhost':
Test results in seconds (measured with PHP's microtime()):
hosted web: connectDB: 0.0038912296295166
localhost: connectDB: 1.0214691162109 (over one second: do not use localhost!)
127.0.0.1: connectDB: 0.00097203254699707
Interestingly: The following code is just as fast as using 127.0.0.1:
// In these tests, resolving 'localhost' to an IP address first and connecting
// to that was as fast as hard-coding 127.0.0.1. DATABASE is a constant defined
// elsewhere in the application.
$host = gethostbyname('localhost');
// echo "<p>$host</p>";
$db = new PDO("mysql:host=$host;dbname=" . DATABASE . ';charset=utf8', $username, $password,
    array(PDO::ATTR_EMULATE_PREPARES => false,
          PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION));
Persistent connections should give a sizable performance boost. I disagree with the assessment that you should "avoid" persistence.
It sounds like the complaints above are driven by someone using MyISAM tables and hacking in their own versions of transactions by grabbing table locks. Well, of course you're going to deadlock! Use PDO's beginTransaction() and move your tables over to InnoDB.
It seems to me that having a persistent connection would eat up more system resources. Maybe a trivial amount, but still...
The rationale for using persistent connections is obviously to reduce the number of connects, which are rather costly, even though they are considerably faster with MySQL than with most other databases.
The very first trouble with persistent connections...
If you are creating thousands of connections per second, you normally don't keep each one open for very long, but the operating system does: because of the TCP/IP protocol, ports can't be recycled instantly and have to spend a while in the FIN/TIME_WAIT state before they can be reused.
The second problem: using too many MySQL server connections.
Many people simply don't realize that you can increase the max_connections variable and get more than 100 concurrent connections with MySQL; others were bitten by older Linux problems that made it impossible to have more than 1024 connections to MySQL.
Let's talk now about why persistent connections were disabled in the mysqli extension. Even though you can misuse persistent connections and get poor performance, that was not the main reason. The actual reason is that you can run into many more issues with them.
Persistent connections were added to PHP in the days of MySQL 3.22/3.23, when MySQL was simple enough that you could recycle connections easily with no problems. In later versions a number of problems arose, however: if you recycle a connection that has uncommitted transactions, you run into trouble; if you recycle connections with custom character-set configurations, you're at risk again; and the same goes for per-session variables that may have been changed.
One trouble with using persistent connections is that they don't really scale that well. If you have 5000 people connected, you'll need 5000 persistent connections. Remove the requirement for persistence, and you may be able to serve 10000 people with the same number of connections, because they can share those connections while they're not using them.
I was just wondering whether a partial solution would be to have a pool of use-once connections. You could spend time creating a connection pool when the system is at low usage, up to a limit, hand them out, and kill them when they've either completed or timed out. In the background you'd be creating new connections as they're being taken. In the worst case this should only be as slow as creating the connection without the pool, assuming that establishing the link is the limiting factor?
I run a community website with about 12,000 users (write heavy) and at most 100 concurrent users, on a single VPS with 1 GB of RAM. The load rarely goes above 3 and response times are quite good.
Currently a simple file cache is used to store DB query results to ease the load on the DB, but the website can still slow down above 220 concurrent users (load test).
How can I find out what the bottleneck is?
I assume the DB is fine since the cache is working well; however, disk I/O could be causing problems. Each page load has about 10 includes and 10-20 queries from the DB or from the file cache, plus lots of PHP processing.
I tried using memcache instead of the file cache, but to my surprise the load test seemed to like the file cache more.
I plan to use Alternative PHP Cache, but I still don't really understand how that cache is invalidated. I have a single index.php that handles all requests. Will the cache store the result for each individual request? Will it clear the cache automatically if one of my includes (or a query result from the cache) changes?
Any other suggestions for finding bottlenecks (I tried Xdebug)?
Thanks,
Hamlet
I plan to use Alternative PHP Cache, but I still don't really understand how that cache is invalidated. I have a single index.php that handles all requests. Will the cache store the result for each individual request? Will it clear the cache automatically if one of my includes (or a query result from the cache) changes?
APC doesn't cache output. It caches your compiled bytecode.
Essentially, a normal PHP request looks like this:
PHP files are parsed and compiled to bytecode
The PHP interpreter executes the bytecode
APC caches the result of the first step, so you aren't reparsing/recompiling the same code over and over again. By default, it still stat()s your PHP files on every request, to see if the file has been modified since its cached copy was compiled -- so any changes to your code will automatically invalidate the cached copy.
You can also use APC much like you'd use memcached, for storing arbitrary user data. Keep in mind, however:
A memcached server can serve data to multiple servers; data cached in APC can only really be used locally. Better to serve a gig of data from one memcached box to four servers, than to have 4 copies of that gig of data in APC on each individual server.
Memcached, in my experience, is better at handling large numbers of concurrent writes to a single cache key.
APC doesn't seem to cope very well with its cache filling up. Fragmentation increases, and performance drops.
Also, beware: unless you've set up some sort of locking mechanism, your file-based cache is likely to become corrupt due to simultaneous writes. If you have implemented locking, that may become a bottleneck of its own. IMO, concurrency is tricky -- let memcached/APC/the database deal with it.
You mention you used XDebug - what weren't you able to do? Typically, to start tracking down a bottleneck you enable profiling of a request and then view the resulting "cachegrind" file in KCacheGrind or WinCacheGrind.
As for using a cache system, a dynamic script such as yours will generally do something like this (see the sketch after the list):
construct a cache "key" from the unique inputs to the script
ask the caching system if it has data for that key. If it has, you're good to go!
otherwise, do all the hard work to generate the data, and ask the caching system to store it under the desired key for next time.
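In code, that pattern might look roughly like this using APC's user cache (a sketch only; the key scheme, the 300-second TTL and build_page_data() are made up for illustration):

// Cache-aside: build the key from the request's unique inputs, try the cache,
// and only do the expensive work on a miss.
$key  = 'page:' . md5($_SERVER['REQUEST_URI']);
$data = apc_fetch($key, $hit);

if (!$hit) {
    $data = build_page_data();   // hypothetical: the expensive DB queries / PHP processing
    apc_store($key, $data, 300); // keep it for five minutes
}

echo $data;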
APC Cache can help to speed things up further by caching the parsed version of the PHP code.
MySQL has its own query cache.
You can enable it by setting query_cache_size to more than 0.
The query results are taken from the cache if the query is repeated verbatim and does not contain certain things like non-deterministic functions, session variables, and some other things described in the MySQL documentation.
The cache for a query is invalidated by issuing any DML operation against any of the underlying tables.
I turned on and configured APC on the test server and got a performance increase of about 400%
300 concurrent users with a response time of 1.4 seconds max :) Good for a start.
Update:
Live server test results
Original:
No APC: 220 concurrent users, server load 20, response time 5000ms
No APC: 250 concurrent users, server load 20+, site is unavailable
New:
APC enabled: 250 concurrent users, server load 2, response time is 600ms
APC enabled: 350 concurrent users, server load 10, response time is 1500ms
APC enabled: 500 concurrent users, server load 20, response time is 5000ms; the site is fully operational, a bit slow, but can be used normally
Thanks for the suggestions, this is pretty great improvement.
The query cache is disabled, as the site is write heavy and the cache would therefore be invalidated constantly for whole tables.
I would say it's likely that your database is I/O bound. I don't know exactly what a "VPS" is, but if it's some kind of VM, then it is almost guaranteed to have very poorly performing I/O.
Get it onto real hardware ASAP, and get a sensible amount of RAM (1 GB is tiny; 16 GB sounds more reasonable).
Then you may be able to tune your DB so it behaves properly. How big is your data in total? If you can get all of it (or most of it) to fit in your database cache (not the dodgy query cache, the proper InnoDB buffer pool), then do so.
I'm assuming you're using the InnoDB engine; if so, set up the buffer pool to be big enough for all your data; if you don't have enough RAM, buy more until you do (no, really!).
Then your DB queries should be fast even if they're fairly bad (yes).
The tricky bit, if you have a single machine, is how to carve up RAM usage between MySQL and PHP; the web server (I assume Apache), particularly if you use prefork and lots of MaxClients, can use up loads of RAM and deprive your database of it.
Get some decent monitoring on the job (with trending), and make changes carefully and record exactly when you made them.
I'm trying to separate my LAMP application onto two servers, one for PHP and one for MySQL. So far the application connects locally through a Unix socket and works fine.
I'm worried about the number of connections I can establish if it goes over the network. I have been benchmarking TCP connections on Unix, and I know that you cannot exceed a certain number of connections per second, otherwise everything halts due to lack of resources (be it sockets, file handles, or whatever). I also understand that PHP does not implement connection pooling, so for each page load a new connection over the network must be made. I also looked into pconnect for PHP, and it seems to bring more problems.
I know this is a very, very common setup (PHP + MySQL); can anyone provide some typical usage figures and statistics they get out of their servers? Thanks!
The problem is not related to running out of connections allowed by MySQL. The main problem is that Unix cannot create and tear down TCP connections very quickly. Sockets end up in TIME_WAIT, and you have to wait for a period before those sockets are freed up to connect again. The two screenshots (linked below) clearly show this pattern: MySQL works up to a certain point and then pauses because the web server ran out of sockets. After a certain amount of time passed, the web server was able to make new connections.
(Screenshots: http://img35.imageshack.us/img35/3809/picture4k.png and http://img35.imageshack.us/img35/4580/picture2uyw.png)
I think the limit is 65535 ports, so you'd have to have 65535 connections open at the same time to hit it, since a regular MySQL connection closes automatically.
mysql_connect()
Note: The link to the server will be closed as soon as the execution of the script ends, unless it's closed earlier by explicitly calling mysql_close().
But if you're using a persistent mysql connection, then you can run into trouble.
Using persistent connections can require a bit of tuning of your Apache and MySQL configurations to ensure that you do not exceed the number of connections allowed by MySQL.
Each MySQL connection actually uses several megabytes of RAM for various buffers, and takes a while to set up, which is why MySQL is limited to 100 concurrent open connections by default. You can raise that limit, but it's better to spend your time trying to limit concurrent connections, via various methods.
Beware of raising the connection limit too high, as you can run out of memory (which, I believe, crashes MySQL), or you may push important things out of memory. For example, MySQL's performance is highly dependent on the OS automatically caching the data it reads from disk in memory; if you set your connection limit too high, you'll be contending for memory with that cache.
If you don't raise your connection limit, you'll run out of connections long before you run out of sockets/file handles/etc. If you do increase your connection limit, you'll run out of RAM long before you run out of sockets/file handles/etc.
Regarding limiting concurrent connections:
Use a connection pooling solution. You're right, there isn't one built in to PHP, but there are plenty of standalone ones out there to choose from. This saves expensive connection setup/tear down time.
Only open database connections when you absolutely need them. In my current project, we automatically open a database connection when the first query is issued, and not a moment before; we also release the connection after we've done all our database work, but before the page's HTML is actually generated. The shorter the period of time you hold connections open, the fewer connections will be open simultaneously.
Cache what you can in a lighter-weight solution like memcached. My current project temporarily caches pages displayed to anonymous users (since every anonymous user gets the same HTML, in the end -- why bother running the same database queries all over again a few scant milliseconds later?), meaning no database connection is necessary at all. This is especially useful for bursts of anonymous traffic, like a front-page digg.
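A rough sketch of that last idea using the Memcached extension (the key scheme, the 60-second TTL and render_page() are invented for illustration, not the project's actual code):

// Only cache pages for anonymous users; logged-in users get personalised HTML.
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

$key  = 'anon:' . md5($_SERVER['REQUEST_URI']);
$html = $mc->get($key);

if ($html === false) {
    $html = render_page();      // hypothetical: runs the queries and builds the HTML
    $mc->set($key, $html, 60);  // a minute is enough to absorb a burst of traffic
}

echo $html;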