I am running queries on a table that has thousands of rows:
$sql="select * from call_history where extension_number = '0536*002' and flow = 'in' and DATE(initiated) = '".date("Y-m-d")."' ";
and it's taking forever to return results.
The SQL itself is
select *
from call_history
where extension_number = '0536*002'
and flow = 'in'
and DATE(initiated) = 'dateFromYourPHPcode'
Is there any way to make it run faster? Should I put the DATE(initiated) = '".date("Y-m-d")."' condition before the extension_number condition in the WHERE clause?
Or should I select all rows where DATE(initiated) = '".date("Y-m-d")."', put that in a while loop, and then run all my other queries (where extension_number = ...) within the while loop?
Here are some suggestions:
1) Replace SELECT * with only the fields you actually need (see the sketch after this list).
2) Add indexes on the table columns involved in the query.
3) Avoid running queries in loops; this causes multiple round trips to the SQL server.
4) Fetch all the data at once.
5) Apply a LIMIT clause where appropriate. Don't select all the records.
6) Fire two different queries: one for counting the total number of records and another for fetching one page of records (e.g. 10, 20, 50, etc.).
7) If applicable, create database views and get data from them instead of the tables.
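For example, applying points 1 and 5 to the query from the question might look something like this (the selected columns are only placeholders for whichever columns you actually need):
SELECT initiated, extension_number, flow
FROM call_history
WHERE extension_number = '0536*002'
  AND flow = 'in'
LIMIT 100;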
Thanks
The order of conditions in the WHERE clause is irrelevant to optimization.
Pro-tip, also suggested by somebody else: Never use SELECT * in a query in a program
unless you have a good reason to do so. "I don't feel like writing out the names of the columns I need" isn't a good reason. Always enumerate the columns you need. MySQL and other database systems can often optimize things in surprising ways when the list of data columns you need is available.
Your query contains this selection criterion.
AND DATE(initiated) = 'dateFromYourPHPcode'
Notice that this search criterion takes the form
FUNCTION(column) = value
This form of search defeats the use of any index on that column. Your initiated column has a TIMESTAMP data type. Try this instead:
AND initiated >= 'dateFromYourPHPcode'
AND initiated < 'dateFromYourPHPcode' + INTERVAL 1 DAY
This will find all the initiated items in the particular day. And, because it doesn't use a function on the column value it can use an index range scan to do that, which performs well. It may, or may not, also help without an index. It's worth a try.
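Plugged back into the PHP from the question, that would look roughly like this (same query, just with the range condition substituted in; the string concatenation is kept from the question, though a prepared statement would be safer):
$sql = "SELECT * FROM call_history
        WHERE extension_number = '0536*002'
          AND flow = 'in'
          AND initiated >= '" . date("Y-m-d") . "'
          AND initiated < '" . date("Y-m-d") . "' + INTERVAL 1 DAY";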
I suspect your ideal index for this particular search would be created by
ALTER TABLE call_history
ADD INDEX flowExtInit (flow, extension_number, initiated)
You should ask the administrator of the database to add this index if your query needs good performance.
You should add an index to your table; that way MySQL can fetch rows faster. I have not tested it, but the command should be something like this:
ALTER TABLE `call_history` ADD INDEX `callhistory` (`extension_number`, `flow`, `initiated`);
Related
I have a database design here that looks like this in a simplified version:
Table building:
id
attribute1
attribute2
Data in there is like:
(1, 1, 1)
(2, 1, 2)
(3, 5, 4)
And the tables attribute1_values and attribute2_values are structured as:
id
value
Which contains information like:
(1, "Textual description of option 1")
(2, "Textual description of option 2")
...
(6, "Textual description of option 6")
I am unsure whether this is the best setup or not, but it is done this way per the requirements of my project manager. There is definitely some truth in it, as you can now modify the text easily without messing up the IDs.
However, now I have come to a page where I need to list the attributes, so how do I go about that? I see two major options:
1) Make one big query which gathers all values from building and at the same time picks the correct textual representation from the attribute{x}_values table.
2) Make a small query that gathers all values from the building table. Then after that get the textual representation of each attribute one at a time.
What is the best option to pick? Is option 1 even faster than option 2 at all? If so, is it worth the extra trouble in terms of maintenance?
Another suggestion would be to create a view on the server with only the data you need and query from that. That would keep the work on the server end, and you can pull just what you need each time.
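A minimal sketch of such a view, assuming the tables from the question (the view name and column aliases are made up):
CREATE VIEW building_with_attributes AS
SELECT b.id, a1.value AS attribute1_text, a2.value AS attribute2_text
FROM building b
JOIN attribute1_values a1 ON a1.id = b.attribute1
JOIN attribute2_values a2 ON a2.id = b.attribute2;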
If you have a small number of rows in the attribute tables, then I suggest fetching them first, all of them, and storing them in an array using the id as the array key.
Then you can proceed with the building data; now you just have to use the respective array to look up each attribute value.
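A rough sketch of that idea in PHP (mysqli assumed; the variable names are mine):
// Fetch each small attribute table once and key it by id.
$attr1 = array();
$res = mysqli_query($connection, "SELECT id, value FROM attribute1_values");
while ($row = mysqli_fetch_assoc($res)) {
    $attr1[$row['id']] = $row['value'];
}
// Repeat for attribute2_values. Later, while looping over the building rows:
// $text1 = $attr1[$building['attribute1']];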
I would recommend something in between. Parse the result from the first table in PHP, and figure out which attribute values you need to select from each attribute[x]_values table.
You can then select attributes in bulk using one query per table, rather than one query per attribute, or one query per building.
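For example, something along these lines (a sketch; $attribute1Ids is a hypothetical array of the distinct attribute1 values collected from the building result):
$idList = implode(',', array_map('intval', $attribute1Ids));
$res = mysqli_query($connection, "SELECT id, value FROM attribute1_values WHERE id IN ($idList)");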
Here is a PHP solution:
$query = "SELECT * FROM building";
$result = mysqli_query(connection,$query);
$query = "SELECT * FROM attribute1_values";
$result2 = mysqli_query(connection,$query);
$query = "SELECT * FROM attribute2_values";
$result3 = mysqli_query(connection,$query);
$n = mysqli_num_rows($result);
for($i = 1; $n <= $i; $i++) {
$row = mysqli_fetch_array($result);
mysqli_data_seek($result2,$row['attribute1']-1);
$row2 = mysqli_fetch_array($result2);
$row2['value'] //Use this as the value for attribute one of this object.
mysqli_data_seek($result3,$row['attribute2']-1);
$row3 = mysqli_fetch_array($result3);
$row3['value'] //Use this as the value for attribute one of this object.
}
Keep in mind that this solution requires that the tables attribute1_values and attribute2_values start at 1 and increase by 1 every single row.
Oracle / Postgres / MySql DBA here:
Running a query many times has quite a bit of overhead. There are multiple round trips to the DB, and if it's on a remote server, this can add up. The DB will likely have to parse the same query multiple times in MySQL, which will be terribly inefficient if there are tons of rows. Now, one advantage of your PHP method (multiple queries) is that it will use less memory, since it releases the results as they're no longer needed (if you run the query as a nested loop, that is; if you query all the results up front, you'll have a lot of memory overhead, depending on the table sizes).
The optimal result would be to run it as one query and fetch the results one at a time, displaying each one as needed and discarding it, which can wreak havoc with MVC frameworks unless you're either comfortable running model code in your view or run small view fragments.
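One way to get that fetch-one-row-at-a-time behaviour in PHP is an unbuffered result; this is only a sketch, not the poster's code, and the variable names are mine:
$res = mysqli_query($connection, $sql, MYSQLI_USE_RESULT); // unbuffered: rows stay on the server until fetched
while ($row = mysqli_fetch_assoc($res)) {
    // render this row, then let it go out of scope
}
mysqli_free_result($res);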
Your question is very generic, and I think that to get an answer you should give more hints about how this page will look and how big the dataset is.
Will you get all the buildings with their attributes, or just one at a time?
Your data structure looks very simple, and anything more powerful than a Raspberry Pi can handle it very well.
If you need one record at a time you don't need any special technique, just JOIN the tables.
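For a single building that could be as simple as this (tables and columns from the question, the aliases are mine):
SELECT b.id, a1.value AS attribute1_text, a2.value AS attribute2_text
FROM building b
JOIN attribute1_values a1 ON a1.id = b.attribute1
JOIN attribute2_values a2 ON a2.id = b.attribute2
WHERE b.id = 1;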
If you need to list all the buildings and you want to save DB time, you have to measure your data.
If you have more attributes than buildings you have to choose one way; if you have 8 attributes and 2000 buildings you can think of caching the attributes in an array, with one SELECT per table, and then just printing them using the array. I don't think you will see any speed drop or improvement with such simple tables on a modern computer.
$att1[1]='description1'
$att1[2]='description2'
....
Never do one-at-a-time queries; try to combine them into a single one.
MySQL will cache your query and it will run much faster. PHP loops are faster than doing many requests to the database.
The query cache stores the text of a SELECT statement together with the corresponding result that was sent to the client. If an identical statement is received later, the server retrieves the results from the query cache rather than parsing and executing the statement again.
http://dev.mysql.com/doc/refman/5.1/en/query-cache.html
I have a simple query using server-side pagination. The issue is that the WHERE clause makes a call to an expensive function, and the function's argument is the user input, e.g. what the user is searching for.
SELECT *
FROM
  ( SELECT /*+ FIRST_ROWS(numberOfRows) */
      query.*,
      ROWNUM rn
    FROM
      ( SELECT myColumns
        FROM myTable
        WHERE expensiveFunction(:userInput) = 1
        ORDER BY id ASC
      ) query
  )
WHERE rn >= :startIndex
  AND ROWNUM <= :numberOfRows
This works and is quick assuming numberOfRows is small. However, I would also like to have the total row count of the query. Depending on the user input and database size, the query can take minutes. My current approach is to cache this value, but that still means the user needs to wait minutes to see the first result.
The results should be displayed in the jQuery DataTables plugin, which greatly helps with things like server-side paging. It however requires the server to return a value for the total records to correctly display the paging controls.
What would be the best approach? (Note: PHP)
I thought of returning the first page immediately with a fake (better: estimated) row count. After the page is loaded, do an AJAX call to a method that determines the total row count of the query (what happens if the user pages during that time?) and then update the faked/estimated total row count.
However, I have no clue how to do an estimate. I tried count(*) * 1000 with SAMPLE (0.1), but for whatever reason that actually takes longer than the full count query. Also, just returning a fake/random value seems a bit hacky. It would need to be bigger than one page size so that the "Next" button is enabled.
Other ideas?
One way to do it is as I said in the comments, to use a 'countless' approach. Modify the client side script in such a way that the Next button is always enabled and fetch the rows until there are none, then disable the Next button. You can always add a notification message to say that there are no more rows so it will be more user friendly.
Considering that you are expecting a significant amount of records, I doubt that the user will paginate through all the results.
Another way is to schedule a cron job that will do the counting of the records in the background and store that result in a table called totals. The running interval of the job should be set up based on the frequency of inserts / deletions.
Then in the frontend, just use the count previously stored in totals. It should be a decent approximation of the amount.
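A sketch of what that job could run, assuming a helper table named totals (the table and column names here are invented):
-- run periodically from cron
UPDATE totals
SET row_count = (SELECT COUNT(*) FROM myTable),
    counted_at = CURRENT_TIMESTAMP
WHERE table_name = 'myTable';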
Depends on your DB engine.
In mysql, solution looks like this :
mysql> SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name
-> WHERE id > 100 LIMIT 10;
mysql> SELECT FOUND_ROWS();
Basically, you add another modifier to your SELECT (SQL_CALC_FOUND_ROWS) which tells MySQL, while executing the query, to count the rows as if the LIMIT clause were not present; FOUND_ROWS() then actually retrieves that number.
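In PHP that might look roughly like this (both statements must run on the same connection; mysqli and the variable names are assumed):
$rows  = mysqli_query($connection, "SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name WHERE id > 100 LIMIT 10");
$found = mysqli_query($connection, "SELECT FOUND_ROWS()");
$row   = mysqli_fetch_row($found);
$total = $row[0]; // total matching rows, ignoring the LIMIT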
For Oracle, see this article:
How can I perform this query in oracle
Other DBMS might have something similar, but I don't know.
I am indexing all the columns that I use in my WHERE / ORDER BY; is there anything else I can do to speed the queries up?
The queries are very simple, like:
SELECT COUNT(*)
FROM TABLE
WHERE user = id
AND other_column = 'something'
I am using PHP 5, MySQL client version: 4.1.22 and my tables are MyISAM.
Talk to your DBA. Run your local equivalent of showplan. For a query like your sample, I would suspect that a covering index on the columns id and other_column would greatly speed up performance. (I assume user is a variable or niladic function).
A good general rule is the columns in the index should go from left to right in descending order of variance. That is, that column varying most rapidly in value should be the first column in the index and that column varying least rapidly should be the last column in the index. Seems counter intuitive, but there you go. The query optimizer likes narrowing things down as fast as possible.
If all your queries include a user id then you can start with the assumption that userid should be included in each of your indexes, probably as the first field. (Can we assume that the user id is highly selective? i.e. that any single user doesn't have more than several thousand records?)
So your indexes might be:
user + otherfield1
user + otherfield2
etc.
If your user id is really selective, like several dozen records, then just the index on that field should be pretty effective (sub-second return).
What's nice about a "user + otherfield" index is that mysql doesn't even need to look at the data records. The index has a pointer for each record and it can just count the pointers.
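For the COUNT query in the question, such a covering index could be added like this (the index name is arbitrary and your_table stands in for the real table name):
ALTER TABLE your_table ADD INDEX user_other (user, other_column);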
consider "Query1", which is quite time consuming. "Query1" is not static, it depends on $language_id parameter, thats why I can not save it on the server.
I would like to query this "Query1" with another query statement. I expect, that this should be fast. I see perhaps 2 ways
$result = mysql_query("SELECT * FROM raw_data_tbl WHERE ((ID=$language_id) AND (age>13))");
then what? here I want to take result and requery it with something like:
$result2 = mysql_query('SELECT * FROM $result WHERE (Salary>1000)');
Is it possible to create something like a "variable-based" MySQL query directly on the server side and somehow pass the variable $language_id to it? The second query would then query that query :-)
Thanks...
No, there is no such thing as your second idea.
For the first idea, though, I would go with a single query:
select *
from raw_data_tbl
where id = $language_id
and age > 13
and Salary > 1000
Provided you have set the right indexes on your table, this query should be pretty fast.
Here, considering the WHERE clause of that query, I would at least go with an index on these three columns:
id
age
Salary
This should speed things up quite a bit.
For more information on indexes and query optimization, take a look at:
Chapter 7. Optimization
7.3.1. How MySQL Uses Indexes
12.1.11. CREATE INDEX Syntax
With the use of subqueries you can take advantage of MySQL's caching facilities.
SELECT * FROM raw_data_tbl WHERE (ID='eng') AND (age>13);
... and after this:
SELECT * FROM (SELECT * FROM raw_data_tbl WHERE (ID='eng') AND (age>13)) AS sub WHERE salary > 1000;
But this is only beneficial in some very rare circumstances.
With the right indexes your query will run fast enough without the need for trickery. In your case:
CREATE INDEX filter1 ON raw_data_tbl (ID, age, salary);
Although the best solution would be to just add conditions from your second query to the first one, you can use temporary tables to store temporary results. But it would still be better if you put that in a single query.
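A sketch of that temporary-table variant (still two statements, so usually not worth it compared to the single query above; tmp_lang is a made-up name and 'eng' reuses the literal from the earlier example):
CREATE TEMPORARY TABLE tmp_lang AS
  SELECT * FROM raw_data_tbl WHERE ID = 'eng' AND age > 13;
SELECT * FROM tmp_lang WHERE Salary > 1000;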
You could also use subqueries, like SELECT * FROM (SELECT * FROM table WHERE ...) AS sub WHERE ....
I will create 5 tables, namely data1, data2, data3, data4 and data5. Each table can only store 1000 records.
When there is a new entry, or when I want to insert new data, I must do a check:
<?php
$data1 = mysql_query("SELECT * FROM data1");
if (mysql_num_rows($data1) > 1000) {
    $data2 = mysql_query("SELECT * FROM data2");
    if (mysql_num_rows($data2) > 1000) {
        // and so on...
    }
}
I think this is not the right way. I mean, if I am user 4500, it would take some time to do all the checks. Is there any better way to solve this problem?
I haven't decided the numbers yet; it could be 5000 or 10000 records. The reason is flexibility and portability? Well, one of my SQL gurus suggested I do it this way.
Unless your guru was talking about something like partitioning, I'd seriously doubt his advice. If your database can't handle more than 1000, 5000 or 10000 rows, look for another database. Unless you have a really specific example of how a record limit will help you, it probably won't. With the amount of overhead it adds, it probably only complicates things for no gain.
A properly set up database table can easily handle millions of records. Splitting it into separate tables will most likely increase neither flexibility nor portability. If you accumulate enough records to run into performance problems, congratulate yourself on a job well done and worry about it then.
Read up on how to count rows in MySQL.
Depending on what storage engine you are using, COUNT(*) operations on InnoDB tables are quite expensive, and those counts should be maintained by triggers and tracked in an adjacent information table.
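A minimal sketch of that trigger approach, assuming a helper table named row_counts (both names are made up); a matching AFTER DELETE trigger would decrement the count:
CREATE TABLE row_counts (
  table_name VARCHAR(64) PRIMARY KEY,
  row_count  INT NOT NULL
);
CREATE TRIGGER data1_count AFTER INSERT ON data1
FOR EACH ROW
  UPDATE row_counts SET row_count = row_count + 1 WHERE table_name = 'data1';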
The structure you describe is often designed around a mapping table first. One queries the mapping table to find the destination table associated with a primary key.
You can keep a "tracking" table to keep track of the current table between requests.
Also be on the alert for race conditions (use transactions, or ensure only one process is running at a time).
Also, don't do $data1 = mysql_query("SELECT * FROM data1"); with nested ifs; do something like:
$i = 1;
do {
    $rowCount = mysql_result(mysql_query("SELECT COUNT(*) FROM data$i"), 0);
    $i++;
} while ($rowCount >= 1000);
I'd be surprised if MySQL doesn't have some fancy-pants way to manage this automatically (or at least, better than what I'm about to propose), but here's one way to do it.
1. Insert record into 'data'
2. Check the length of 'data'
3. If >= 1000,
- CREATE TABLE `dataX` LIKE `data`;
(X will be the number of tables you have + 1)
- INSERT INTO `dataX` SELECT * FROM `data`;
- TRUNCATE `data`;
This means you will always be inserting into the 'data' table, and 'data1', 'data2', 'data3', etc are your archived versions of that table.
You can create a MERGE table like this:
CREATE TABLE all_data ([col_definitions]) ENGINE=MERGE UNION=(data1,data2,data3,data4,data5);
Then you would be able to count the total rows with a query like SELECT COUNT(*) FROM all_data.
If you're using MySQL 5.1 or above, you can let the database handle this (nearly) automatically using partitioning:
Read this article or the official documentation
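A rough sketch of what that could look like (the column layout is invented, since the question never shows the table definition):
CREATE TABLE data (
  id INT NOT NULL AUTO_INCREMENT,
  payload VARCHAR(255),
  PRIMARY KEY (id)
)
PARTITION BY RANGE (id) (
  PARTITION p0 VALUES LESS THAN (1001),
  PARTITION p1 VALUES LESS THAN (2001),
  PARTITION p2 VALUES LESS THAN (3001),
  PARTITION p3 VALUES LESS THAN (4001),
  PARTITION p4 VALUES LESS THAN MAXVALUE
);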