Why is connecting to my InnoDB database often delayed by an integer amount of seconds?
Some background
I have a very small MySQL database, consisting of a table 'users' (150 records) and a table 'connections' (growing to 150*150 records). Tables and indexes add up to less than 5 MB.
When users are active, 5-50 records in 'connections' are changed (the weight is changed) or added (if they didn't exist yet). The whole app runs smoothly and load times are below ~100 ms.
Except when they are not.
The details
Under even quite small loads, page load times spike from 60 ms to somewhere between 1,000 ms and 10,000 ms.
Using the profiler in Symfony, I could pin 95+% of the delay down to the 'getRepository' statement, while the queries themselves took only ~1 ms each. This led me to believe that connecting to the database was the slow step. To test this theory, I wrote a helper script that regularly connects to the database.
<?php
// Call this script from the command line using watch, to sample connection latency.
$a = microtime(true);                                                    // start timer
$pdo = new PDO('mysql:host=127.0.0.1;dbname=mydb', 'myuser', 'mypass');  // open the connection
file_put_contents('performance.txt', (microtime(true) - $a) . PHP_EOL, FILE_APPEND); // log elapsed seconds
The mystery
Connecting to the database took consistently 1-3 ms, or 1,001-1,003 ms, or 2,001-2,003 ms, or 3,001-3,003 ms, etc. An integer amount of seconds, plus the normal time. Nothing in between, like 400 ms or 800 ms. With no writes going on, the connection was made almost instantly. As soon as some writes were performed via the app, the higher numbers were reached.
What is causing this behavior? The InnoDB page_cleaner appears to do its work every 1,000 ms, maybe that's part of the explanation?
More importantly, how can I fix this? I was thinking of switching to MEMORY tables, but I'd say more elegant options should be available.
EDIT
On request, the variables and global status.
Additional information: I connect directly to 127.0.0.1 (see the code snippet above) and I tested the skip-name-resolve flag to no effect. It's a Debian server, by the way.
EDIT 2
I found the delays were either 1, 3, 7 or 15 seconds. Notice the pattern: 1 second, +2s, +4s, +8s. This really looks like some kind of timeout issue...
It's common for a reverse DNS lookup to take a long time. Combined with the size of the host_cache, this can produce erratic behaviour.
Turn it off by adding this to my.cnf
[mysqld]
skip-name-resolve
Note that all grants must be by IP, not by hostname, if you change this.
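For illustration only, here is a hedged sketch of an IP-based account; the user, password, and database names are placeholders, and the statements are shown through PDO simply to stay in PHP like the rest of the thread:
<?php
// Hypothetical sketch: with skip-name-resolve, accounts must match by IP address.
// 'myuser', 'mypass', and 'mydb' are placeholder names, not taken from the question.
$admin = new PDO('mysql:host=127.0.0.1', 'root', 'rootpassword');
$admin->exec("CREATE USER 'myuser'@'127.0.0.1' IDENTIFIED BY 'mypass'");
$admin->exec("GRANT ALL PRIVILEGES ON mydb.* TO 'myuser'@'127.0.0.1'");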
There is more to read in the manual
Related
I have a LAMP server on which I run a PHP script that makes a SELECT query on a table containing about 1 million rows.
Here is my script (PHP 8.2 and MariaDB 10.5.18):
$db = new PDO("mysql:host=$dbhost;dbname=$dbname;", $dbuser, $dbpass);
$req = $db->prepare('SELECT * FROM '.$f_dataset);  // $f_dataset holds the table name
$req->execute();
$fetch = $req->fetchAll(PDO::FETCH_ASSOC);         // pulls the entire result set into PHP memory
$req->closeCursor();
My problem is that each execution of this script seems to consume about 500 MB of RAM on my server, and this memory is not released when the execution ends. Since I only have 2 GB of RAM, after 3 executions the server kills the Apache2 task, which forces me to restart the Apache server each time.
Is there a solution to this? A piece of code that allows to free the used memory?
I tried to use unset($fetch) and gc_collect_cycles() but nothing works and I haven't found anyone who had the same problem as me.
EDIT
Since some of you were skeptical about my problem and posted responses asking for evidence and additional information, here is what else I can tell you:
I am currently developing a trading strategy testing tool in which I set the parameters manually via an HTML form. The form is then processed by a PHP script that first performs calculations to reproduce technical indicators (using the Trader library for some of them, and code of my own for others) from the parameters submitted in the form.
In a second step, after reproducing the technical indicators and storing their values in my database, the PHP script simulates buy or sell orders based on the market prices I am interested in and on the indicator values calculated just before.
To do this, my database contains, for example, two tables: the first stores the 1-minute candles (open, close, high, low, volume, ...), one candle per row; the second stores the value of a technical indicator for each candle, i.e. for each row of the first table.
The reason I need to run the calculations, and therefore fetch my 1 million candles, is that my table contains 1 million one-minute candles on which I want to test my strategy. It could just as well be 500 candles or 10 million.
My problem right now is only with retrieving the candles; there are no calculations yet. I shared my script above, which is very short, and there is absolutely nothing else in it except the definitions of my variables $dbname, $dbhost, etc. So look no further, you have absolutely everything here.
When I run this script from my browser and watch the RAM load during execution, I see an Apache process consume up to 697 MB of RAM. So far, nothing abnormal: the table I'm retrieving candles from is a little over 100 MB. The real problem is that once the script has finished, the RAM usage stays the same. If I run my script a second time, the RAM load reaches 1400 MB, and this continues until all the RAM is used up and my Apache server crashes.
So my question is simple, do you know a way to clear this RAM after my script is executed?
What you describe is improbable, and you don't say how you made these measurements. If your assertions are valid, there are a couple of ways to solve the memory issue; however, this is the XY problem. There is no good reason to read a million rows into a web page script.
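If the rows really do have to be streamed through PHP, one hedged sketch (reusing the question's connection variables) is to switch to an unbuffered cursor and process rows one at a time instead of materializing everything with fetchAll():
<?php
// Sketch only: stream rows instead of buffering the whole result set in PHP.
// $dbhost, $dbname, $dbuser, $dbpass and $f_dataset are assumed to be defined as in the question.
$db = new PDO("mysql:host=$dbhost;dbname=$dbname", $dbuser, $dbpass);
$db->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false); // unbuffered: rows stay on the server
$req = $db->query('SELECT * FROM '.$f_dataset);
while ($row = $req->fetch(PDO::FETCH_ASSOC)) {
    // process one candle at a time; PHP memory use stays roughly constant
}
$req->closeCursor();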
After several hours of research and discussion, it seems that this problem of unreleased memory has no solution. It is simply a limitation of Apache in my case: it cannot free the memory it uses unless the process is restarted every time.
I have, however, found a workaround in the Apache configuration, by allowing a maximum of one request per server process instead of the default of 5.
This way, the process my script runs on is killed at the end of the run and automatically replaced by a new one.
I have an AWS EC2 instance with a dual-core CPU and 4 GB of memory. I have set up my Apache2 HTTP server running PHP 7.0.30 and MySQL Ver 14.14 Distrib 5.7.22.
There are various devices sending GET/POST requests to my HTTP server. Each POST and GET request uses SELECT and UPDATE queries.
Right now there are around 200 devices hitting my HTTP server simultaneously, and therefore issuing SELECT and UPDATE queries together. These requests carry data in JSON format.
The problem is that my MySQL server has become very slow. It takes a long time to return data from SELECT queries and to load pages.
In phpMyAdmin I see a number of Sleep processes in the process list. I have also tuned various parameters of my MySQL server, but with no result.
One of the major queries taking time is an UPDATE that writes long text data to a table; it comes from every device every 60 seconds, all at once, and we only see its processes clear from the MySQL process list after a long time.
Is there a way to optimize this with MySQL parameters, so that the server stays fast even with thousands of queries, over multiple connections, updating a table column that holds long text?
Most of the global variables have their default values. I also tried changing the values of various global variables, but it didn't produce any result.
How can I reduce this slow processing of queries?
P.S: I believe the issue is due to the UPDATE queries. I have tuned the SELECT queries and they seem fine. But for UPDATE queries, I see sleeps of up to 12 seconds in the Processes tab of phpMyAdmin.
I have added a link to an image showing this issue
(here you can see sleeps of even 13 seconds, all on UPDATE queries):
Here is the PasteBin for the query of an UPDATE operation:
https://pastebin.com/kyUnkJmz
That is ~25KB for the JSON! (Maybe 22KB if backslashes vanish.) And 40 inserts/sec, but more every 2 minutes.
I would like to see SHOW CREATE TABLE, but I can still make some comments.
In InnoDB, that big a row will be stored 'off record'. That is, there will be an extra disk hit to write that big string elsewhere.
Compressing the JSON should shrink it to about 7K, which may lead to storing that big string inline, thereby cutting back some on the I/O. Do the compression in the client to help cut back on network traffic. And make the column a BLOB, not TEXT.
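A rough sketch of that client-side compression follows; the table and column names, and the $pdo/$json/$usermac variables, are assumptions for illustration, not taken from the question:
<?php
// Hypothetical sketch: compress the JSON in PHP before updating the row.
// Table 'device_data' and columns 'payload' (BLOB) / 'usermac' are placeholder names.
$compressed = gzcompress($json, 6);                 // ~25KB of JSON typically shrinks several-fold
$stmt = $pdo->prepare('UPDATE device_data SET payload = :payload WHERE usermac = :usermac');
$stmt->bindParam(':payload', $compressed, PDO::PARAM_LOB);
$stmt->bindValue(':usermac', $usermac);
$stmt->execute();
// Reading it back later: $json = gzuncompress($row['payload']);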
Spinning drives can handle about 100 I/Os per second.
The 200 devices every 5 seconds needs to average 40 writes/second in order to keep up. That's OK.
Every 2 minutes there are an extra 40 writes. This may (or may not) push the amount of I/O past what the disk can handle. This may be the proximate cause of the "updating for 13 seconds" you showed. That snapshot was taken shortly after a 2-minute boundary?
Or are the devices out of sync? That is, do the POSTs all come at the same time, or are they spread out across the 2 minutes?
If each Update is a separate transaction (or you are running with autocommit=ON), then there is an extra write -- for transactional integrity. This can be turned off (a tradeoff between speed and security): innodb_flush_log_at_trx_commit = 2. If you don't mind risking 1 second's worth of data, this may be a simple solution.
Is anything else going on with the table? Or is it just these Updates?
I hope you are using InnoDB (which is what my remarks above are directed toward), because MyISAM would be stumbling all over itself with fragmentation.
Long "Sleeps" are not an issue; long "Updates" are an issue.
More
Have an index on usermac so that the UPDATE does not have to slog through the entire table looking for the desired row. You could probably drop the id and add PRIMARY KEY(usermac).
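As a sketch (the table name 'device_data' is the same placeholder as above; adjust before running), adding the index could be as simple as:
<?php
// Hypothetical sketch: index the lookup column so the UPDATE can find its row directly.
// Swapping the PRIMARY KEY from 'id' to 'usermac' needs more care (AUTO_INCREMENT, foreign keys),
// so a plain secondary index is the low-risk first step.
$pdo->exec('ALTER TABLE device_data ADD INDEX idx_usermac (usermac)');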
Some of the comments above are off by a factor of 8 -- there seem to be 8 JSON columns in the table, hence 200KB/row.
I'm really new to MySQL and just started to use the InnoDB table engine on my VPS.
Server info: 4 x 2.4 GHz, 8 GB RAM, 120 GB RAID 10.
my.cnf:
[mysqld]
innodb_file_per_table=1
innodb_buffer_pool_size=4G
innodb_log_file_size=512M
innodb_flush_log_at_trx_commit=2
The table being inserted into has 6 INTs and 1 DATE, plus 1 trigger for mapping that inserts 1 row into a mapping table for each inserted row.
The last line helps a lot with speeding up the inserts (the database is 80% insert/update vs 20% read).
When I run a test, calling a PHP file in the web browser that does 10,000 inserts, it takes about 3 seconds (is this fast or slow for this hardware?). But when I open the PHP file in multiple tabs at the same time, they all report an execution time of about 3 seconds, yet they seem to wait for each other to finish :/
Any ideas which settings I should change? Any other suggestions for faster inserts are greatly appreciated!
3,300 inserts a second is quite respectable, especially with a trigger. It's hard to know more about that without understanding the tables.
What are you doing about commits? Are you doing all 10K inserts and then committing once? If so, other clients doing similar tasks are probably locked out until each client runs. That's a feature!
On the other hand, if you're using autocommit you're making your MySQL server churn.
The best strategy is to insert chunks of something like 100 or 500 rows, and then commit.
You should attempt to solve this kind of lockout not with configuration settings, but by tuning your web php application to manage transactions carefully.
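For instance, here is a hedged sketch of that chunking pattern with PDO; the table 'events', its columns, and the $rows array are made up for illustration:
<?php
// Sketch only: group inserts into explicit transactions of ~500 rows.
$stmt = $pdo->prepare('INSERT INTO events (a, b, c, d, e, f, created_at) VALUES (?, ?, ?, ?, ?, ?, ?)');
$chunkSize = 500;
$pdo->beginTransaction();
foreach ($rows as $i => $row) {
    $stmt->execute($row);
    if (($i + 1) % $chunkSize === 0) {   // commit every 500 rows...
        $pdo->commit();
        $pdo->beginTransaction();        // ...and start the next chunk
    }
}
$pdo->commit();                           // commit the final partial chunk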
You might want to consider using a MyISAM table if you don't need robust transaction semantics for this application. MyISAM handles large volumes of inserts more quickly. That is, if you don't particularly care if you read data that's current or a few seconds old, then MyISAM will serve you well.
Given: 20M documents, averaging 550 bytes each, and the PHP driver on a single machine.
First import (not mongoimport) with the journal on and WriteConcern at its default (1). It took about 12 hours. That made me wonder, so I tried a second import.
Second, I used batchInsert() with --nojournal and WriteConcern=0, and noted the performance. In total it too took 12 hours?! What was interesting is that it started at 40,000 records inserted per minute, ended up at 2,500 records per minute, and I can only imagine it would have been 100 records per minute towards the end.
My questions are:
I assumed that by turning the journal off, setting w=0, and using batchInsert(), my total insertion time would drop significantly!
How is the significant drop in inserts per minute explained?
--UPDATE--
The machine is a Core Duo 3 GHz with 8 GB of RAM. RAM usage stays steady at 50% during the whole process; CPU usage, however, goes high. In PHP I have ini_set('memory_limit', -1) so that memory usage is not limited.
If it is only a one-time migration, I would suggest deleting all indexes before these inserts, using the deleteIndex(..) method.
After all inserts have finished, use ensureIndex(..) to get the indexes back.
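With the legacy PHP Mongo driver's MongoClient/MongoCollection classes, a hedged sketch of that pattern could look like this; the database, collection, and field names are placeholders:
<?php
// Sketch only: drop secondary indexes before the bulk load, rebuild them afterwards.
$collection = (new MongoClient())->mydb->documents;   // placeholder database/collection names
$collection->deleteIndexes();                         // remove secondary indexes before the import
// ... run the batchInsert() import here ...
$collection->ensureIndex(array('someField' => 1));    // rebuild each needed index afterwards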
P.S. From the numbers you provided, it is not a big amount of data; probably you have misconfigured the MongoDB server. Please provide your MongoDB server config and memory size, and maybe I can find something else to improve.
Replying to your question (2): probably your server runs out of memory after some inserts.
After a lot of hair pulling, I realized it was a backlog effect. Interestingly enough, when I bundled my documents into batches of 5,000 rows, the batch insert worked like magic and imported everything in just under 4 minutes!!
This tool gave me the idea: https://github.com/jsteemann/BulkInsertBenchmark
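A minimal sketch of that batching, assuming an existing MongoCollection handle $collection, the full document array $allDocuments, and a 1.3+ legacy driver (older versions used 'safe' instead of 'w'):
<?php
// Sketch only: send documents in batches of 5,000 rather than one giant batchInsert.
foreach (array_chunk($allDocuments, 5000) as $batch) {
    $collection->batchInsert($batch, array('w' => 0)); // unacknowledged, 5,000 documents per call
}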
I'm trying to understand how to better identify where the issue is with what I'm currently seeing.
Presently I am updating a collection through a cron job, downloading information from a 3rd-party vendor every 15 minutes (without issues). There are times when I need to do a 2-year refresh, and that is when I see this issue.
Incoming are about 300-600k results, all of which I'm writing with mongo->collection->save($item). I have the _id for all results, so that is being hit too for (what I thought would be) quick inserts.
The document sizes aren't changing much and are rather small to begin with (~12 KB).
I batch the downloads at about 200 per request to the 3rd-party server, format them, then insert them one at a time into Mongo using save() with safe insert set to true.
Right now, when saves are happening, my lock percentage goes up to between 20-30%. I'm wondering how to track down why this is happening, as I believe it's the reason I end up hitting a timeout (which is set to 100 seconds).
Timeout Error: MongoCursorTimeoutException Object->cursor timed out (timeout: 100000, time left: 0:0, status: 0)
Mongo Driver: Mongo Native Driver 1.2.6 (from PHP.net)
I'm currently on Mongo 2.2.1 with SSD drives and 16 GB of RAM.
Here is an example of the mongoStat operation that I follow while inserts are happening:
insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netIn netOut conn set repl time
0 0 201 0 215 203 0 156g 313g 1.57g 7 mydb:36.3% 0 0|0 0|0 892k 918k 52 a-cluster PRI 10:04:36
I have a primary with a secondary setup and an arb fronting them (per documentation suggestions), using PHP to do my inserts.
any help would be GREATLY APPRECIATED.
Thank you so much for your time
Update
I store all items in a "MongoDoc" since there are times when formatting of the individual elements is needed; after batching the items in there, I get the data out and insert it like this:
$mongoData = $mongoSpec->getData();
try {
    foreach ($mongoData as $insert) {
        $this->collection_instance->save($insert); // upsert by _id, one document at a time
        $count++;
    }
} catch (Exception $e) {
    print_r($e->getTrace());
    exit;
}
I will say that I have removed safe writes and I've seen a drastic reduction in timeouts occurring, so for now I'm chalking it up to that (unless there is something wrong with the inserts...).
Thank you for your time and thoughts.
You're hitting the PHP max execution limit? Which Mongo library are you using? I was using FuelPHP's MongoDb library, and it would take nearly 1 second for only ~50 inserts (because each write was a confirmed, fsync'd operation), so this doesn't surprise me. My solution was to fsync and write confirm only at certain intervals, which gives much better performance, with reasonable assurance that nothing went wrong.
More info:
http://docs.mongodb.org/manual/reference/command/fsync/
http://docs.mongodb.org/manual/core/write-operations/#write-concern
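A hedged sketch of that interval approach with the legacy PHP driver (the $collection handle, the confirm-every-500 value, and the option names are assumptions; drivers before 1.3 used 'safe' instead of 'w'):
<?php
// Sketch only: unacknowledged saves, with an acknowledged + fsync'd write every 500 documents.
$confirmEvery = 500;
foreach ($mongoData as $i => $insert) {
    if (($i + 1) % $confirmEvery === 0) {
        // periodic checkpoint: wait for acknowledgement and force a flush to disk
        $collection->save($insert, array('w' => 1, 'fsync' => true));
    } else {
        $collection->save($insert, array('w' => 0)); // fire-and-forget
    }
}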