MySQL (MariaDB) execution timeout within query called from PHP

I'm stress testing my database for a geolocation search system. It has a lot of optimisation built in already, such as a square bounding-box long/lat index system to narrow searches before performing arc-distance calculations. My aim is to serve 10,000,000 users from one table.
At present my query time is between 0.01 and 0.1 seconds, depending on other conditions such as age, gender, etc. This is for 10,000,000 users evenly distributed across the UK.
I have a LIMIT condition as I need to show the user X people, where X can be between 16 and 40.
The issue is that when there are few or no matching users, the query can take a long time because it cannot reach the LIMIT quickly and may have to scan 400,000 rows.
There may be other optimisation techniques I could look at, but my question is:
Is there a way to get the query to give up after X seconds? If it takes more than 1 second then it is not going to return results and I'm happy for this to occur. In pseudo query code it would be something like:
SELECT data FROM table WHERE ....... LIMIT 16 GIVEUP AFTER 1 SECOND
I have thought about a cron solution to kill slow queries, but that is not very elegant. The query will be called every few seconds in production, so the cron job would need to run continuously.
Any suggestions?
Version is 10.1.14-MariaDB

Using MariaDB 10.1, you have two ways of limiting your query: by execution time or by the total number of rows examined.
By rows:
SELECT ... LIMIT ROWS EXAMINED rows_limit;
You can use the ROWS EXAMINED clause and set a row limit such as the 400,000 you mentioned (available since MariaDB 10.0).
By time:
If the max_statement_time variable is set, any query (excluding stored procedures) taking longer than the value of max_statement_time (specified in seconds) to execute will be aborted. This can be set globally, by session, as well as per user and per query.
If you want it for a specific query, as I imagine, you can use this:
SET STATEMENT max_statement_time=1 FOR
SELECT field1 FROM table_name ORDER BY field1;
Remember that max_statement_time is specified in seconds (unlike MySQL, whose equivalent max_execution_time is in milliseconds), so you can adjust it until you find the best fit for your case (available since MariaDB 10.1).
If you need more information, I recommend this excellent post about query timeouts.
Hope this helps you.
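Since the question is about a query called from PHP, here is a minimal sketch of how this could be wired up with mysqli. The connection details and the table/column names are assumptions; the query only mirrors the pseudo-query from the question.

<?php
// Hedged sketch: abort the search after 1 second using MariaDB's
// SET STATEMENT ... FOR wrapper (MariaDB 10.1+).
$db = mysqli_connect('localhost', 'user', 'pass', 'geodb');

$sql = "SET STATEMENT max_statement_time=1 FOR
        SELECT data FROM users
        WHERE age BETWEEN 18 AND 30 AND gender = 'f'
        LIMIT 16";

$result = mysqli_query($db, $sql);

if ($result === false) {
    // The statement was aborted (MariaDB raises an error when
    // max_statement_time is exceeded) or failed for another reason;
    // treat it as "no matches found".
    $people = [];
} else {
    $people = mysqli_fetch_all($result, MYSQLI_ASSOC);
    mysqli_free_result($result);
}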

Related

AJAX - MySQL Query Flooding Server

I've been at this for a few days now, trying different methods to reduce the 95-135% CPU load caused by a recent AJAX script that calls this query:
SELECT COUNT(*) as total FROM users WHERE last_access >= DATE_SUB(NOW(), INTERVAL 1 MINUTE) ORDER BY last_access DESC LIMIT 16
I have tried COUNT(id) to reduce table scan time, and I added LIMIT and ORDER BY; I think it improved things, but I can't tell. I am monitoring top -P on my BSD box, and the CPU has been spiking quite high, almost killing Apache too, while I rely on it for my query testing.
We have a jQuery snippet that calls the AJAX script and returns a count of users last online, with a 15-second interval on the jQuery side (1 minute in the query statement). It was fine for a day, then we noticed the server was running overtime and the fans going haywire.
We ended up removing MySQL 5.7 and installing MariaDB 12.4, and that made a huge difference; however, while it is sitting at a load reduced by ~20% CPU, it is struggling too, so the query is bad. I disabled the script and, sure enough, CPU went down to 15-30% average, but this is a big part of our website's UX. It simply reports "455 online", for example, and updates the text every 15 seconds (dynamically).
My question is: given the 15-second interval hits to the SELECT COUNT(*) statement on a table of 9,600 records, how can I optimize this so the SQL server doesn't crash and eat up all available memory?
I didn't include the script as it works wonderfully, the query is the issue but will provide if needed.
This is our only AJAX script on the site. No other AJAX calls are made.
I don't think the SQL query itself is the problem here. Adding ORDER BY and LIMIT should not make any difference, since there is only one row to sort and limit. What you could consider is adding an index on the last_access column.
Depending on your website traffic, I think the problem is the way you've designed your system. You say each client requests your server every 15 seconds, asking your database for the number of users active in the last minute. Imagine having 1,000 users online; in that case you will have about 66 queries per second.
In this case, you should consider implementing a cache. You tagged your post with PHP; Memcached is fairly easy to use from PHP. Cache the result of your SQL query for 15 seconds and your database server will only see about 0.07 queries per second (one query every 15 seconds) instead.
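A rough sketch of what that could look like with PHP's Memcached extension (the server address, cache key and connection details are assumptions):

<?php
// Hedged sketch: serve the online-user count from Memcached and only hit
// MySQL when the cached value has expired (at most once per 15 seconds).
$cache = new Memcached();
$cache->addServer('127.0.0.1', 11211);

$total = $cache->get('online_user_count');

if ($total === false) {
    $db  = mysqli_connect('localhost', 'user', 'pass', 'mydb');
    $row = mysqli_fetch_assoc(mysqli_query($db,
        "SELECT COUNT(*) AS total
         FROM users
         WHERE last_access >= DATE_SUB(NOW(), INTERVAL 1 MINUTE)"));
    $total = (int) $row['total'];

    $cache->set('online_user_count', $total, 15);   // cache for 15 seconds
}

echo $total;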
For MyISAM tables, COUNT(*) is optimized to return very quickly if the SELECT retrieves from one table, no other columns are retrieved, and there is no WHERE clause.
https://dev.mysql.com/doc/refman/8.0/en/group-by-functions.html#function_count
SELECT COUNT(*) as total
FROM users
WHERE last_access >= DATE_SUB(NOW(), INTERVAL 1 MINUTE)
-- ORDER BY last_access DESC -- Toss; one output row, nothing to sort
-- LIMIT 16 -- Toss, only output row
Be sure to have this on the table
INDEX(last_access)
COUNT(id) is, if anything, slower. It is the same as COUNT(*), plus checking id for being NOT NULL.
How many different threads are there?
Let's look at what that query takes --
The AJAX request hits the web server.
It sends the request to a 'child' or 'thread' which it probably already has waiting for action.
If the child needs PHP, then it launches that process.
That connects to MySQL.
The query is performed. If you have the index, this step is the trivial step.
Things shut down.
In other words, have you investigated which process is taking most of the CPU?

mysqli_free_result() is slow when not all the data has been read while using MYSQLI_USE_RESULT

I am using MYSQLI_USE_RESULT (an unbuffered query) while querying a huge amount of data from a table.
For testing I used a table 5.6 GB in size.
I selected all columns: SELECT * FROM test_table.
If I do not read any rows with a method like fetch_assoc() and then try to close the result with mysqli_free_result(), it takes 5 to 10 seconds to close.
Sometimes I read a certain number of rows based on available memory; when I then call mysqli_free_result(), it takes less time than when not even one row has been read.
So fewer unread rows means less time to free the result, and more unread rows means more time.
To the best of my knowledge, it is nowhere documented that this call can consume time.
Time taken for query is around 0.0008 sec.
Is it a bug, or is it expected behavior?
For me this sluggishness defeats the whole point of using MYSQLI_USE_RESULT.
MySQL v5.7.21, PHP v7.2.4 used for testing.
Aliases of this function are mysqli_result::free, mysqli_result::close, mysqli_result::free_result and mysqli_free_result().
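For reference, a minimal sketch of the pattern described above (the table name and credentials are placeholders); timing each step shows where the cost lands:

<?php
// Hedged sketch reproducing the reported behaviour with an unbuffered query.
$db = mysqli_connect('localhost', 'user', 'pass', 'testdb');

$start  = microtime(true);
$result = mysqli_query($db, "SELECT * FROM test_table", MYSQLI_USE_RESULT);
echo "query: ", microtime(true) - $start, " sec\n";   // ~0.0008 sec as reported

// Optionally read only part of the result set:
// while ($row = mysqli_fetch_assoc($result)) { /* stop when the memory budget is reached */ }

$start = microtime(true);
mysqli_free_result($result);   // unread rows still have to be discarded here
echo "free: ", microtime(true) - $start, " sec\n";    // grows with the number of unread rows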

Strange performance test results for LAMP site

We have an online application with a large amount of data, usually 10+ million rows in each table.
The performance hit I am facing is in the reporting modules, where some charts and tables load very slowly.
I am assuming that total time = PHP execution time + MySQL query time + HTTP response time.
To verify this I opened phpMyAdmin, which is itself another web app.
If I click a table with 3 records (SELECT * FROM table_name), the total time for displaying is 1-1.5 seconds, and I can see the MySQL query time is 0.0001 sec.
When I click a table with 10 million records, the total time is 7-8 seconds, with the MySQL query time again close to 0.0001 sec.
Shouldn't the page load time be the sum of the MySQL and script run times? Why does it load slowly when the table has more rows, even though MySQL says the query took the same time?
PHPMyAdmin uses LIMIT, so that's an irrelevant comparison.
You should use EXPLAIN to see why your query is so awfully slow. 10 million is a small dataset (assuming average row size) and shouldn't take anywhere near 7 seconds.
Also, your method of measuring the execution time is flawed. You should measure by timing the individual parts of your script. If SQL is your bottleneck, start optimizing your table or query.
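A quick way to do that split in PHP (the query and connection details are placeholders):

<?php
// Hedged sketch: time the query, the row transfer and the rendering separately
// instead of judging by the total page load time.
$db = mysqli_connect('localhost', 'user', 'pass', 'reports');

$t0     = microtime(true);
$result = mysqli_query($db, "SELECT * FROM big_table");
$t1     = microtime(true);

$rows = mysqli_fetch_all($result, MYSQLI_ASSOC);   // transferring millions of rows is costly by itself
$t2   = microtime(true);

// ... build the HTML / charts from $rows ...
$t3 = microtime(true);

printf("query: %.4f s, fetch: %.4f s, render: %.4f s\n",
       $t1 - $t0, $t2 - $t1, $t3 - $t2);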

instant win procedure in PHP / MySQL - how to make sure only one winner is chosen?

I'm building an instant win action for a competition draw. Basically at a given randomly selected minute of an hour, the next user to submit their details should be chosen as the winner.
The problem I'm having is that if MySQL is running multiple connections, then how do I stop two or three winners being drawn by mistake? Can I limit the connections, or maybe get PHP to wait until all current MySQL connections are closed?
Have a look at LOCK TABLES:
http://dev.mysql.com/doc/refman/5.0/en/lock-tables.html
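A rough sketch of that idea ($db is an existing mysqli connection; the winners table, its columns and $userId are made up for illustration):

<?php
// Hedged sketch: lock the table so only one connection can run the draw check at a time.
mysqli_query($db, "LOCK TABLES winners WRITE");

$row = mysqli_fetch_assoc(mysqli_query($db,
    "SELECT COUNT(*) AS n FROM winners WHERE draw_hour = HOUR(NOW())"));

if ((int) $row['n'] === 0) {
    mysqli_query($db, "INSERT INTO winners (user_id, draw_hour)
                       VALUES (" . (int) $userId . ", HOUR(NOW()))");
}

mysqli_query($db, "UNLOCK TABLES");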
Use the starting value of a field that you're updating as the concurrency check in your WHERE clause when you make the update. That way, only the first update to be executed will go through, because after that the row will no longer match the WHERE clause. You can tell whose update went through using the mysql_affected_rows() function, which will return 1 for the successful update and 0 for any others.
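For example, with a per-hour flag row the update could look like this (the draw_status table and its columns are hypothetical; mysqli_affected_rows() is the mysqli equivalent of mysql_affected_rows()):

<?php
// Hedged sketch: only the first connection flips the flag; everyone else
// no longer matches the WHERE clause and affects 0 rows.
mysqli_query($db,
    "UPDATE draw_status
     SET winner_id = " . (int) $userId . ", won = 1
     WHERE draw_hour = HOUR(NOW()) AND won = 0");

if (mysqli_affected_rows($db) === 1) {
    // This user is the winner for the current hour.
}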
Use a timestamp field "registertime" in the user details table. When inserting the data, use the NOW() function to insert into the registertime field.
When you choose a random minute, convert that to a unix time.
The winner is: SELECT * FROM userTable WHERE registertime > -- the timestamp of your random minute -- ORDER BY registertime LIMIT 1
Keep a small status table somewhere that records the hour of the previous draw. If the new record's insert time falls in a different hour, check whether they're a winner; a sketch follows below.
If they are, update the status table with this new winner's draw hour, which will prevent any more draws being made for the rest of the hour. Though, what happens if, by chance, no one actually "wins" the draw in a particular hour? Do you guarantee a win to the last person who registered, or is there simply no winner at all?
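Putting those pieces together, a rough sketch (the draw_status table, its columns and the pre-computed $minuteTs are assumptions) might be:

<?php
// Hedged sketch of the timestamp + status-table approach described above.
// $minuteTs is the unix time of the randomly chosen minute, computed elsewhere;
// $db is an existing mysqli connection.
$status = mysqli_fetch_assoc(mysqli_query($db,
    "SELECT last_draw_hour FROM draw_status WHERE id = 1"));

if ((int) $status['last_draw_hour'] !== (int) date('G')) {
    $winner = mysqli_fetch_assoc(mysqli_query($db,
        "SELECT * FROM userTable
         WHERE registertime > FROM_UNIXTIME(" . (int) $minuteTs . ")
         ORDER BY registertime LIMIT 1"));

    if ($winner) {
        // Record this hour's draw so no further winners are picked until the next hour.
        mysqli_query($db, "UPDATE draw_status SET last_draw_hour = " . (int) date('G') . " WHERE id = 1");
    }
}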

Mysql - Summary Tables

Which method do you suggest and why?
Creating a summary table and . . .
1) Updating the table as the action occurs in real time.
2) Running group by queries every 15 minutes to update the summary table.
3) Something else?
The data must be near real time, it can't wait an hour, a day, etc.
I think there is a 3rd option, which might allow you to manage your CPU resources a little better. How about writing a separate process that periodically updates the summarized data tables? Rather than recreating the summary with a group by, which is GUARANTEED to run slower over time because there will be more rows every time you do it, maybe you can just update the values. Depending on the nature of the data, it may be impossible, but if it is so important that it can't wait and has to be near-real-time, then I think you can afford the time to tweak the schema and allow the process to update it without having to read every row in the source tables.
For example, say your data is just login_data (cols username, login_timestamp, logout_timestamp). Your summary could be login_summary (cols username, count). Once every 15 mins you could truncate the login_summary table, and then insert using select username, count(*) kind of code. But then you'd have to rescan the entire table each time. To speed things up, you could change the summary table to have a last_update column. Then every 15 mins you'd just do an update for every record newer than the last_update record for that user. More complicated of course, but it has some benefits: 1) You only update the rows that changed, and 2) You only read the new rows.
And if 15 minutes turned out to be too old for your users, you could adjust it to run every 10 mins. That would have some impact on CPU of course, but not as much as redoing the entire summary every 15 mins.
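A hedged sketch of that incremental update, following the login_data / login_summary example above (it assumes login_summary has a UNIQUE key on username and a last_update column, as suggested; connection details are placeholders):

<?php
// Hedged sketch: every 15 minutes, fold only the new login_data rows into
// login_summary instead of rebuilding the whole summary with GROUP BY.
$db   = mysqli_connect('localhost', 'user', 'pass', 'mydb');
$last = mysqli_fetch_assoc(mysqli_query($db,
    "SELECT MAX(last_update) AS last_update FROM login_summary"));
$since = $last['last_update'] ?? '1970-01-01 00:00:00';

mysqli_query($db, "
    INSERT INTO login_summary (username, `count`, last_update)
    SELECT username, COUNT(*), NOW()
    FROM login_data
    WHERE login_timestamp > '$since'
    GROUP BY username
    ON DUPLICATE KEY UPDATE
        `count` = `count` + VALUES(`count`),
        last_update = NOW()
");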
