I have a summary query in which all the results of a calculation are shown. I store only the raw data, and on every request a function calculates the results. The problem is that this approach causes waiting times, which I would like to reduce. The best thing I could do is cache the results for 24 hours.
I cannot store the results permanently, because many of the variables have to be easy to change.
What would be the best practice for running PHP functions on a MySQL SELECT query?
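For example, something like this rough file-based cache is what I had in mind, but I'm not sure it's the right approach. Here calculateSummary() stands in for my existing calculation function, and the cache file path is just a placeholder:

// Rough sketch: cache the calculated summary for 24 hours.
// $db is whatever connection object the calculation already uses.
function getCachedSummary($db, $cacheFile = '/tmp/summary_cache.json', $ttl = 86400)
{
    // Serve from the cache while the file is younger than the TTL (24 hours).
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttl) {
        return json_decode(file_get_contents($cacheFile), true);
    }

    // Cache miss or expired: recalculate and rewrite the cache file.
    $result = calculateSummary($db); // the existing expensive calculation
    file_put_contents($cacheFile, json_encode($result), LOCK_EX);

    return $result;
}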
I'm stress testing my database for a geolocation search system. It has a lot of optimisation built in already, such as a square-box long/lat index system to narrow searches before performing arc-distance calculations. My aim is to serve 10,000,000 users from one table.
At present my query time is between 0.1 and 0.01 seconds based on other conditions such as age, gender etc. This is for 10,000,000 users evenly distributed across the UK.
I have a LIMIT condition as I need to show the user X people, where X can be between 16 and 40.
The issue is that when there are no, or only a few, other users that match, the query can take a long time: it cannot reach the LIMIT quickly and may have to scan 400,000 rows.
There may be other optimisation techniques I could look at, but my question is:
Is there a way to get the query to give up after X seconds? If it takes more than 1 second then it is not going to return results and I'm happy for this to occur. In pseudo query code it would be something like:
SELECT data FROM table WHERE ....... LIMIT 16 GIVEUP AFTER 1 SECOND
I have thought about a cron solution to kill slow queries, but that is not very elegant. The query will be called every few seconds in production, so the cron job would need to run continuously.
Any suggestions?
Version is 10.1.14-MariaDB
Using MariaDB 10.1, you have two ways of limiting your query: by time, or by the total number of rows examined.
By rows:
SELECT ... LIMIT ROWS EXAMINED rows_limit;
You can use the ROWS EXAMINED clause and set a limit such as the 400,000 rows you mentioned (available since MariaDB 10.0).
By time:
If the max_statement_time variable is set, any query (excluding stored procedures) taking longer than the value of max_statement_time (specified in seconds) to execute will be aborted. This can be set globally, by session, as well as per user and per query.
If you want it for a specific query, as I imagine you do, you can use this:
SET STATEMENT max_statement_time=1 FOR
SELECT field1 FROM table_name ORDER BY field1;
Remember that max_statement_time is specified in seconds (unlike MySQL's equivalent, which uses milliseconds), so you can adjust it until you find the best fit for your case (available since MariaDB 10.1).
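From PHP you can issue the same statement directly. Here is a rough sketch with mysqli, assuming an existing $mysqli connection and placeholder table and column names; if the server aborts the query because of the timeout, the call simply returns false and you can show no results, which you said is acceptable:

// Rough sketch: run the geolocation query with a 1-second server-side timeout.
// Table and column names are placeholders for your own schema.
$sql = "SET STATEMENT max_statement_time=1 FOR
        SELECT data FROM users WHERE age BETWEEN 18 AND 40 LIMIT 16";

$result = $mysqli->query($sql);

$rows = [];
if ($result === false) {
    // Query failed or was aborted (e.g. it exceeded max_statement_time): show nothing.
} else {
    while ($row = $result->fetch_assoc()) {
        $rows[] = $row;
    }
}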
If you need more information, I recommend this excellent post about query timeouts.
Hope this helps you.
I'm using a query that calculates some values on a table with about 11 million rows, and I need to display the results in real time on my site, but the calculation takes about a minute to execute. The table content changes every 30 minutes, so I don't have to recalculate the results every time a user reloads the page. How can I cache the results of the calculations? Via PHP (I use ODBC), with some SQL statement, or with some Sybase IQ option? Thanks.
I also asked this question at https://dba.stackexchange.com/. So sorry for duplicating, can't figure out where is the better place.
So I found a solution. It is not optimized, but it works for me: I insert my calculations into a temp table and add a column with the current date. At the start of the script I check whether the table is older than 30 minutes, and if so, I drop it and create it again.
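Roughly, the pattern looks like this. The table names, columns, and the rebuild query below are placeholders for my real calculation, $conn is the existing ODBC connection, and the exact SQL (e.g. GETDATE()) depends on the Sybase IQ setup:

// Rough sketch of the "rebuild the cache table every 30 minutes" pattern.
$ttl = 30 * 60; // 30 minutes

// created_at is written when the cache table is rebuilt.
$res = @odbc_exec($conn, "SELECT MAX(created_at) AS built_at FROM calc_cache");
$builtAt = ($res && odbc_fetch_row($res)) ? strtotime((string) odbc_result($res, 'built_at')) : 0;

if (time() - $builtAt > $ttl) {
    // Cache table is missing or older than 30 minutes: drop and rebuild it.
    @odbc_exec($conn, "DROP TABLE calc_cache");
    odbc_exec($conn, "CREATE TABLE calc_cache (item_id INT, total DECIMAL(18,2), created_at DATETIME)");
    odbc_exec($conn, "INSERT INTO calc_cache
                      SELECT item_id, SUM(amount), GETDATE()
                      FROM big_table GROUP BY item_id");
}

// Every page load now reads the cached results instead of recalculating.
$cached = odbc_exec($conn, "SELECT item_id, total FROM calc_cache");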
I'm developing an algorithm for intense calculations on multiple huge arrays. Right now I use PHP arrays to do the job, but it is slower than I need it to be. I was thinking of using MySQL tables (via MySQLi), converting the PHP arrays into database rows, and then running the calculations there to solve the speed issue.
At the very first step, when I was converting a 20*10 PHP array into 200 rows of database containing zeros, it took a long time. Here is the code: (Basically the following code is generating a zero matrix, if you're interested to know)
// Prepared statement: insert one zero cell at position (Row, Col).
$stmt = $mysqli->prepare("INSERT INTO `table` (`Row`, `Col`, `Value`) VALUES (?, ?, '0')");

for ($i = 0; $i < $rowsNo; $i++) {
    for ($j = 0; $j < $colsNo; $j++) {
        //$myArray[$j]=array_fill(0,$colsNo,0);
        $stmt->bind_param("ii", $i, $j);
        $stmt->execute();   // one INSERT round-trip per matrix cell
    }
}
$stmt->close();
The commented-out line "$myArray[$j]=array_fill(0,$colsNo,0);" generates the array very quickly, whereas filling the table in the next two lines takes far longer.
Array time: 0.00068 seconds
MySQLi time: 25.76 seconds
There is a lot more calculation remaining, and I'm worried that even after optimizing numerous parts it may get worse. I searched a lot but couldn't find any answer on whether arrays or MySQL tables are the better choice. Has anybody done, or does anybody know of, any benchmarking tests on this?
I really appreciate any help.
Thanks in advance
UPDATE:
I did the following test for a 273*273 matrix. I created two versions of the same data: a two-dimensional PHP array, and a table with 273*273 = 74,529 rows, both containing the same values. The following are the speed test results for retrieving similar data from both [in this case, finding which column(s) of a certain row have a value equal to 1; the other columns are zero]:
It took 0.00021 seconds for the array.
It took 0.0026 seconds for mysqli table. (more than 10 times slower)
My conclusion is to stick with the arrays instead of converting them into database tables.
One last thing: if the data in question is stored in a database table in the first place, generating an array from it and then using the array is much, much slower, as shown below (slower due to retrieving the data from the database):
It took 0.9 seconds for the array. (more than 400 times slower)
It took 0.0021 seconds for mysqli table.
The main reason is not that the database itself is slower. The main reason is that the database accesses the hard drive to store data, while PHP arrays live entirely in RAM, which is much faster than the hard drive.
Although there is a way to speed up your insert queries (most likely you are using an InnoDB table without a transaction; see the sketch at the end of this answer), the very premise of the question is wrong.
A database is intended, first and foremost, to store data, and to store it permanently. It does that well. It can do calculations too, but again, before doing any calculations there is one necessary step: storing the data.
If you want to do your calculations on already stored data, it's fine to use a database.
If you want to push your data into a database only to run calculations on it, it doesn't make much sense.
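For completeness, the insert speedup mentioned above usually means wrapping the whole loop in a single transaction, so InnoDB flushes to disk once instead of once per row. A rough sketch, reusing the same statement as in the question:

// Rough sketch: batch all the zero-matrix inserts into one transaction.
// Without this, InnoDB commits (and flushes) after every single INSERT.
$stmt = $mysqli->prepare("INSERT INTO `table` (`Row`, `Col`, `Value`) VALUES (?, ?, '0')");

$mysqli->begin_transaction();
for ($i = 0; $i < $rowsNo; $i++) {
    for ($j = 0; $j < $colsNo; $j++) {
        $stmt->bind_param("ii", $i, $j);
        $stmt->execute();
    }
}
$mysqli->commit();   // one flush instead of $rowsNo * $colsNo of them

$stmt->close();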
In my case, as shown in the update part of the question, I think arrays perform better than MySQL tables.
Array usage showed 10 times faster response even when I search through the cells to find desired values in a row. Even good indexing of the table couldn't beat the array functionality and speed.
I'm constructing a website for a small collection of parents at a private daycare centre. One of the desired functions of the site is a calendar where you can pick which days you can be responsible for cleaning the premises. Now, I have made a working calendar. I found a simple script online that I modified a bit to fit our purpose. Technically it works well, but I'm starting to wonder whether I really should change the way it extracts information from the database.
The calendar is presented monthly, and drawn as a table using a for-loop. That means that said for-loop is run 28-31 times each time the page is loaded depending on the month. To present who is responsible for cleaning each day, I have added a call to a MySQL database where each member's cleaning day is stored. The pseudo code looks like this, simplified:
Draw table month
for day = start_of_month to day = end_of_month
    type day
    select member from cleaning_schedule where picked_day = day
    type member
This means that each reload of the page makes at least 28 SELECT calls to the database, which to me seems both inefficient and potentially susceptible to a DDoS attack. Is there a more efficient way of getting the same result? There are much more complex booking calendars out there; how do they handle it?
SELECT picked_day, member FROM cleaning_schedule WHERE picked_day BETWEEN '2012-05-01' AND '2012-05-31' ORDER BY picked_day ASC
You can loop through the results of that query, each row will have a date and a person from the range you picked, in order of ascending dates.
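As a rough sketch with mysqli (assuming the column names from the query above, and an existing $mysqli connection), you would index the rows by date once and then let the calendar loop read from the array instead of querying per day:

// Rough sketch: fetch the whole month in one query, then group the rows by date.
$sql = "SELECT picked_day, member FROM cleaning_schedule
        WHERE picked_day BETWEEN '2012-05-01' AND '2012-05-31'
        ORDER BY picked_day ASC";

$byDay = [];
foreach ($mysqli->query($sql) as $row) {
    $byDay[$row['picked_day']][] = $row['member'];
}

// Inside the existing calendar loop: no query needed, just an array lookup.
// $members = isset($byDay[$currentDate]) ? $byDay[$currentDate] : [];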
The MySQL query cache will save your bacon.
Short version: if you repeat the same SQL query often, it will be served without table access as long as the underlying tables have not changed. So the first load of a month will run ca. 35 SQL queries, which is a lot but not too much; the second load of the same page will come back blazingly fast from the cache.
My experience is that this tends to be much faster than building fancy join queries, even when that would be possible.
Not that 28 calls is a big deal, but I would use a join and pull in the entire month's data in one hit. You can then iterate through the MySQL query result as if it were an array.
You can use greater-than and less-than comparisons in SQL, so instead of doing one SELECT per day, you can write one SELECT for the entire month:
SELECT day, member FROM cleaning_schedule
WHERE day >= :first_day_of_month AND day <= :last_day_of_month
ORDER BY day;
Then you need to take care in your program to handle multiple members per day. Although the program logic will be a bit more complex, the program will be faster: interprocess or even network-based communication is a lot slower than the additional logic.
Depending on the data structure, the following statement might be possible and more convenient:
SELECT day, group_concat(member) FROM cleaning_schedule
WHERE day >= :first_day_of_month AND day <= :last_day_of_month
GROUP BY day
ORDER BY day;
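As a rough PHP sketch of consuming that GROUP_CONCAT variant (assuming PDO and placeholder names), each fetched row carries one day plus a comma-separated member list that you split back into an array:

// Rough sketch: one row per day, members collapsed into a comma-separated string.
$stmt = $pdo->prepare("SELECT day, GROUP_CONCAT(member) AS members
                       FROM cleaning_schedule
                       WHERE day >= :first_day_of_month AND day <= :last_day_of_month
                       GROUP BY day ORDER BY day");
$stmt->execute([
    ':first_day_of_month' => '2012-05-01',
    ':last_day_of_month'  => '2012-05-31',
]);

$byDay = [];
foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    // GROUP_CONCAT returns "Alice,Bob" style strings; split them back into an array.
    $byDay[$row['day']] = explode(',', $row['members']);
}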
28 queries isn't a massive issue and is pretty common for most commercial websites, but I'd recommend just grabbing your monthly data in one hit and then looping through the records day by day.
I have a table in my database that has about 200 rows of data that I need to retrieve. How significant, if at all, is the difference in efficiency when retrieving all of them at once in one query, versus each row individually in separate queries?
The queries are usually made via a socket, so executing 200 queries instead of 1 represents a lot of overhead, plus the RDBMS is optimized to fetch a lot of rows for one query.
200 queries instead of 1 will make the RDBMS initialize datasets, parse the query, fetch one row, populate the datasets, and send the results 200 times instead of 1 time.
It's a lot better to execute only one query.
I think the difference will be significant, because there will (I guess) be a lot of overhead in parsing and executing the query, packaging the data up to send back etc., which you are then doing for every row rather than once.
It is often useful to write a quick test which times various approaches, then you have meaningful statistics you can compare.
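For example, a quick-and-dirty timing harness might look like the following sketch, assuming a $pdo connection and a hypothetical items table with 200 rows; the only goal is to compare one bulk query against row-by-row fetching on your own data:

// Rough sketch: crude timing of "one query for all rows" vs "one query per row".
$start = microtime(true);
$all = $pdo->query("SELECT * FROM items")->fetchAll(PDO::FETCH_ASSOC);
$bulk = microtime(true) - $start;

$start = microtime(true);
$stmt = $pdo->prepare("SELECT * FROM items WHERE id = ?");
$rows = [];
for ($id = 1; $id <= 200; $id++) {
    $stmt->execute([$id]);
    $rows[] = $stmt->fetch(PDO::FETCH_ASSOC);
}
$single = microtime(true) - $start;

printf("bulk: %.4fs, per-row: %.4fs\n", $bulk, $single);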
If you were comparing some constant number of queries k against a slightly larger constant number k + k1, you might conceivably find that more queries perform better. I don't know for sure, but SQL has all sorts of unusual quirks, so it wouldn't surprise me if someone could come up with such a scenario.
However, if you're comparing a constant number of queries k against a number of queries n that grows with your data, you should always pick the constant number of queries.
In general, you want to minimize the number of calls to the database. You can already assume that MySQL is optimized to retrieve rows; however, you cannot be certain that your calls are optimized, if at all.
Extremely significant. Usually getting all the rows at once takes about as much time as getting one row. So let's say that time is 1 second (very high, but good for illustration): getting all the rows takes 1 second, while getting each row individually takes 200 seconds (1 second per row), a very dramatic difference. And this isn't counting where you get the list of 200 from to begin with.
All that said, you've only got 200 rows, so in practice it won't matter much.
But still, get them all at once.
Exactly as the others have said. Your RDBMS will not break a sweat throwing 200+ rows at you all at once. Getting all the rows in one associative array will also not make much difference to your script, since you no doubt already have a loop for grabbing each individual row.
All you need to do is modify this loop to iterate through the array you are given [a very minor tweak!], as sketched below.
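In mysqli terms, that tweak might look roughly like this (table and column names are placeholders): the 200 individual queries become one fetch plus a plain foreach:

// Rough sketch: grab all ~200 rows in one go and loop over the resulting array.
$result = $mysqli->query("SELECT id, name, value FROM settings");
$rows = $result->fetch_all(MYSQLI_ASSOC);   // one associative array per row

foreach ($rows as $row) {
    // ... same per-row processing as before, just fed from the array ...
    echo $row['name'] . ' = ' . $row['value'] . "\n";
}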
The only time I have found it better to get fewer results from multiple queries instead of one big set is if there is lots of processing to be done on the results. I was able to cut out about 40,000 records from the result set (plus associated processing) by breaking the result set up. Anything you can build into the query that will allow the DB to do the processing and reduce result set size is a benefit, but if you truly need all the rows, just go get them.