We have an online application with a large amount of data, usually 10+ million rows in each table.
The performance hit I am facing is in the reporting modules, where some charts and tables load very slowly.
I am assuming that total time = PHP execution time + MySQL query time + HTTP response time.
To verify this, I opened phpMyAdmin, which is itself another web app.
If I click a table with 3 records (SELECT * FROM table_name), the total display time is 1 - 1.5 seconds, and I can see a MySQL query time of 0.0001 sec.
When I click a table with 10 million records, the total time is 7 - 8 seconds, with the MySQL query time again close to 0.0001 sec.
Shouldn't the page load time be the sum of the MySQL and script run times? Why does the page load more slowly when the table has more rows, even though MySQL reports the same query time?
PHPMyAdmin uses LIMIT, so that's an irrelevant comparison.
You should use EXPLAIN to see why your query is so awfully slow. 10 million is a small dataset (assuming average row size) and shouldn't take anywhere near 7 seconds.
Also, your method of counting the execution time is flawed. You should measure by timing the individual parts of your script. If SQL is your bottleneck, start optimizing your table or query.
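As a rough illustration of timing the individual parts, here is a minimal Python sketch (the original script is PHP, where the same idea works with microtime(true)). The functions run_query and render are hypothetical placeholders for the database call and the page-building step; they are not part of the original application.

```python
import time
from contextlib import contextmanager

timings = {}

@contextmanager
def timed(label):
    """Add the wall-clock seconds spent inside the with-block to timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = timings.get(label, 0.0) + (time.perf_counter() - start)

def run_query():
    # Hypothetical stand-in for the database call.
    time.sleep(0.01)
    return list(range(1000))

def render(rows):
    # Hypothetical stand-in for building the chart or table output.
    return ",".join(str(r) for r in rows)

with timed("query"):
    rows = run_query()
with timed("render"):
    page = render(rows)

# Print one line per phase so the slow part is visible.
for label, seconds in timings.items():
    print(f"{label}: {seconds:.4f} s")
```

Measured this way, each phase gets its own number, so it is clear whether the query, the script, or the output is the slow part.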
I've been at this for a few days now trying different methods to reduce the 95-135% CPU load due to a recent AJAX script, calling this query:
SELECT COUNT(*) as total FROM users WHERE last_access >= DATE_SUB(NOW(), INTERVAL 1 MINUTE) ORDER BY last_access DESC LIMIT 16
I have tried COUNT(id) to reduce table scan times, and I added LIMIT and ORDER BY. I think it improved, but I can't tell. I am monitoring top -P on my BSD box, which I rely on for my query testing, and the CPU has been spiking quite high, almost killing Apache too.
We have a jQuery bit that queries the AJAX script and returns a count of users online in the last minute, polling at a 15-second interval on the jQuery side (the 1 minute is in the query statement). It was fine for a day, then we noticed the server was running overtime and the fans were going haywire.
We ended up removing MySQL 5.7 and installing MariaDB 12.4, and what a huge difference that made. However, while it's sitting at a load reduced by ~20% CPU, it's struggling too, so the query is bad. I disabled the script and, sure enough, CPU went down to 15-30% avg. However, this is a big part of our website's UX. It simply reports (455 online), for example, and updates the text every 15 seconds (dynamically).
My question is: given the hits every 15 seconds to the SELECT COUNT(*) statement on a table of 9600 records, how can I optimize this so the SQL server doesn't crash and suck up all available memory?
I didn't include the script as it works wonderfully, the query is the issue but will provide if needed.
This is our only AJAX script on the site. No other AJAX calls are made.
Kind regards,
I don't think the SQL-query itself is the problem here. Adding ORDER BY and LIMIT should not make any difference since there is only one row to sort and limit. What you could think about is to add an index for the last_access column.
Depending on your website traffic, I think the problem is the way you've designed your system. You're saying each client requests your server every 15 s, and each request asks your database for the number of users active in the last minute. Imagine having 1,000 users online: in that case, you will have about 66 queries per second.
In this case, you should consider implementing a cache. You tagged your post with PHP, and Memcached is fairly easy to use from PHP. Cache the result of your SQL query for 15 seconds and your database server will see only about 0.07 queries per second instead.
For MyISAM tables, COUNT(*) is optimized to return very quickly if the SELECT retrieves from one table, no other columns are retrieved, and there is no WHERE clause.
https://dev.mysql.com/doc/refman/8.0/en/group-by-functions.html#function_count
SELECT COUNT(*) as total
FROM users
WHERE last_access >= DATE_SUB(NOW(), INTERVAL 1 MINUTE)
-- ORDER BY last_access DESC -- Toss; one output row, nothing to sort
-- LIMIT 16 -- Toss, only output row
Be sure to have this on the table
INDEX(last_access)
COUNT(id) is, if anything, slower. It is the same as COUNT(*), plus checking id for being NOT NULL.
How many different threads are there?
Let's look at what that query takes --
The AJAX request hits the web server.
It sends the request to a 'child' or 'thread' which it probably already has waiting for action.
If the child needs PHP, then it launches that process.
That connects to MySQL.
The query is performed. If you have the index, this step is the trivial step.
Things shut down.
In other words, have you investigated which process is taking most of the CPU?
I have a tool which compares one string with, on average, 250k strings from the database.
Two tables are used during the compare process: categories and categories_strings. The string table has around 2.5 million rows, while the pivot table, categories_strings, contains 7 million rows.
My query is pretty simple: selecting the string columns, joining the pivot table, adding a WHERE clause to specify the category, and setting a limit of 10,000.
I run this query in a loop, and every batch is 10,000 strings. To execute the whole script faster, I use the seek method instead of a MySQL OFFSET, which was way too slow on huge offsets.
Then a comparison using common algorithms such as simple text matching, Levenshtein, etc. is performed on each batch. This part is simple.
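For readers unfamiliar with the seek method (also called keyset pagination), here is a minimal Python sketch. It runs against an in-memory SQLite table purely so that it is self-contained; the table name strings, its columns, and the batch size are assumptions for illustration, and the join to the pivot table from the real query is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE strings (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO strings (id, value) VALUES (?, ?)",
    [(i, f"string-{i}") for i in range(1, 26)],
)

def fetch_batches(conn, batch_size):
    """Yield batches ordered by id, resuming after the last id already seen."""
    last_id = 0
    while True:
        # Seek: filter on the indexed key instead of skipping rows with OFFSET.
        rows = conn.execute(
            "SELECT id, value FROM strings WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        yield rows
        last_id = rows[-1][0]

batches = list(fetch_batches(conn, batch_size=10))
print([len(batch) for batch in batches])
```

Because each batch starts from an indexed key value, the database does not have to read and discard all the rows before the offset, which is why this stays fast deep into the table.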
The question starts here.
On my laptop (Lenovo X230), the whole process for e.g. 250k strings compared takes 7.4 seconds to load from SQL and 13.3 seconds to compare all rows, and then 0.1 second for sorting and transforming for the view.
I also have a small dedicated server. Same PHP version, same MySQL. The web server doesn't matter, as I run it from the command line right now. While on my laptop it takes +- 20 seconds in total, on the server it takes... 120 seconds.
So, what is the most important factor affecting the execution time of a long-running PHP program? All I can think of is the CPU, which is worse on the dedicated server: it is an Intel(R) Atom(TM) CPU N2800 @ 1.86GHz. Memory consumption is pretty low, about 2-4%. CPU usage, however, is around 60% on my laptop and 99.7 - 100% on the server.
Is the CPU the most important factor in this case? Is there any way to split the work, for example into several processes, which in total would take less time? Apart from that, how can I monitor CPU usage and find which part of the script consumes the most?
MySQL takes too much time to insert records
I have a dedicated server with 32 GB RAM, and it hardly uses more than 15% CPU and 20% memory, even with 5 crons executing simultaneously.
The issue is a PHP script of about 200 simple lines of code, with some basic calculations and 3 queries in total that select and insert rows with 12 columns (4 integer, 8 varchar).
It executes once per day and inserts around 280,000 to 300,000 records, and it takes on average 5-6 hours to execute.
Questions:
1) Why does it take 5-6 hours to insert just 3 lakh (300,000) records?
2) Why doesn't it use much of the resources (RAM and CPU)?
3) Is there any configuration that limits MySQL execution?
Server Details:
4 processors in total, each an Intel(R) Xeon(R) CPU E3-1220 v3 @ 3.10GHz, cache 8192 KB
32 GB RAM
Please help me to figure out the issue
First create an INDEX on your table. Then try to execute this query:
SELECT COLUMN1,COLUMN2..
FROM YOUR.
How long did it take to execute? Note that timing, and also note the execution time of the same query without the index on either table.
Without the index, the query will definitely take longer.
This indicates that the speed of your insert depends on how fast the data is fetched.
So once fetching is fast, insertion will be faster than before.
Hope this helps.
I'm stress testing my database for a geolocation search system. It has a lot of optimisation built in already, such as a square-box long/lat index system to narrow searches before performing arc distance calculations. My aim is to serve 10,000,000 users from one table.
At present my query time is between 0.1 and 0.01 seconds based on other conditions such as age, gender etc. This is for 10,000,000 users evenly distributed across the UK.
I have a LIMIT condition as I need to show the user X people, where X can be between 16 and 40.
The issue is when there are no other users / few users that match, the query can take a long time as it cannot reach the LIMIT quickly and may have to scan 400,000 rows.
There may be other optimisation techniques which I can look at, but my question is:
Is there a way to get the query to give up after X seconds? If it takes more than 1 second then it is not going to return results and I'm happy for this to occur. In pseudo query code it would be something like:
SELECT data FROM table WHERE ....... LIMIT 16 GIVEUP AFTER 1 SECOND
I have thought about a cron solution to kill slow queries but that is not very elegant. The query will be called every few seconds when in production so the cron would need to be on continuously.
Any suggestions?
Version is 10.1.14-MariaDB
With MariaDB 10.1, you have two ways of limiting your query. It can be done based on time or on the total number of rows examined.
By rows:
SELECT ... LIMIT ROWS EXAMINED rows_limit;
You can use the ROWS EXAMINED clause and set a number of rows, like the 400000 you mentioned (available since MariaDB 10.0).
By time:
If the max_statement_time variable is set, any query (excluding stored
procedures) taking longer than the value of max_statement_time
(specified in seconds) to execute will be aborted. This can be set
globally, by session, as well as per user and per query.
If you want it for a specific query, as I imagine, you can use this:
SET STATEMENT max_statement_time=1 FOR
SELECT field1 FROM table_name ORDER BY field1;
Remember that max_statement_time is set in seconds (unlike MySQL, whose equivalent timeout is in milliseconds), so you can adjust it until you find the best fit for your case (available since MariaDB 10.1).
If you need more information, I recommend this excellent post about query timeouts.
Hope this helps you.
I have set up a 15 min cron job to monitor the performance of 10+ websites (all hosted on different servers). Every 15 mins I check whether the server is up or down, the response time in ms, etc.
I would like to save this information into MySQL instead of a log.txt file, so that it can be easily retrieved, queried and analysed (i.e. the server performance on day x or month x, or the server performance between days x and y).
Here is what my table is going to look like:
id website_ip recorded_timestamp response_time_in_ms website_status
If I insert a new entry for each website, every day I'll have 1440 records for each website (15 x 4 x 24), then for 10 websites it will be 14400 records every day!!
So, I'm thinking of creating only 1 entry per hour per website. This way, instead of creating 14400 records every day, I'll only have 24 x 10 = 240 records every day for 10 websites.
But still, it's not perfect. What if I want to keep the records for the whole year? Then I'll have 87600 records for 365 days for 10 websites.
Is 87600 records a lot? My biggest concern is the difference between the server local time and the client local time. How can I improve the design without compromising accuracy and time zone handling?
This is a bit long or too short for a comment. The simple answer to your question is "no".
No, 87,600 records is not a lot of records. In fact, your full data with 5,256,000 records per year is not that much. It could be a lot of data if you had really wide records, but your record is at most a few tens of bytes. The resulting table would still have less than a gigabyte per year. Not very much at all, really.
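The row counts in this thread can be checked with a few lines of Python. The 50 bytes per row is an assumption standing in for "a few tens of bytes", and it ignores index and storage-engine overhead. Note that the 14,400-per-day figure corresponds to one check per minute (60 x 24 = 1,440 per site); a check every 15 minutes gives only 96 rows per site per day, so the real volume would be smaller still.

```python
SITES = 10
DAYS_PER_YEAR = 365
BYTES_PER_ROW = 50  # assumption: "a few tens of bytes", excluding index overhead

def rows_per_year(rows_per_site_per_day):
    """Total rows stored in one year for all monitored sites."""
    return rows_per_site_per_day * SITES * DAYS_PER_YEAR

hourly = rows_per_year(24)                       # one summary row per hour
per_minute = rows_per_year(60 * 24)              # the 14,400-per-day scenario
every_15_min = rows_per_year((60 // 15) * 24)    # one row per 15-minute check

for label, rows in [
    ("hourly", hourly),
    ("per minute", per_minute),
    ("every 15 min", every_15_min),
]:
    megabytes = rows * BYTES_PER_ROW / 1_000_000
    print(f"{label}: {rows:,} rows/year, about {megabytes:.1f} MB")
```

Even the largest of these scenarios stays well under a gigabyte per year at the assumed row size, which is consistent with the estimate above.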
Databases are designed to have big tables. You have the opportunity to do things to speed your queries on the log files. The most obvious is to create indexes on columns that would often be used for selection purposes. Another opportunity is to use partitioning to break up the storage of the tables into separate "files" (technically table spaces), so queries require less I/O.
You may want to periodically summarize more recent records and store the results in summary tables for your most common queries. However, that level of detail is beyond the scope of this answer.