Creating an INDEX to improve slow execution time caused by ORDER BY

I have the following query in a PHP function. This gets called a number of times depending on a number of factors, but even if it is executed only 1 time it takes a long time.
SELECT `date` as dateTo
FROM table_name tbl
WHERE `colA` = 223 and `colB` <> 1
ORDER BY `date` DESC
LIMIT 1
The database table has about 2 million records and the ORDER BY is slowing the execution time.
What is the best INDEX I could have in this scenario?
Would an index on date only be beneficial or would I have to include colA and colB?
-----
I ended up using this query,
SELECT `ColA`,`date`, `ColB`
FROM atm_status_log
WHERE `ColA` = 223
HAVING `ColB` <> 1
ORDER BY `date` DESC
LIMIT 1;
and this INDEX, INDEX(colA, colB, date)

You have to add an index on date, colA, colB.
Please keep in mind that the order of the columns in that index matters.
ALTER TABLE table_name ADD KEY index_name (date, colA, colB);

You have to add an index on colA and colB. That will resolve the issue.

This is your query:
SELECT `date` as dateTo
FROM table_name tbl
WHERE `colA` = 223 and `colB` <> 1
ORDER BY `date` DESC
LIMIT 1
One approach to indexing would be table_name(colA, colB, date). This is a covering index for the query. However, it will not eliminate the sorting.
You could try this approach:
SELECT dateTo
FROM (SELECT `date` as dateTo, colB
FROM table_name tbl
WHERE `colA` = 223
ORDER BY `date` DESC
) t
WHERE `colB` <> 1;
The subquery should be able to make use of an index on table_name(colA, date). It will need to materialize the entire result set, but then choosing colB <> 1 should be pretty fast. I'm not thrilled with this approach, because it assumes that the subquery remains ordered when read by the outer query -- but that is true in MySQL.

How many different values are in colB? If it can only be 0 or 1, then change the filter to
colB = 0
and add this
INDEX(colA, colB, date)
(The date must be last, but A and B can be in either order.) Only then can all the filtering and the ORDER BY be handled by the INDEX.
If colB has more than 2 values, then let's slightly improve on Gordon's solution:
SELECT `date` as dateTo
FROM table_name tbl
WHERE `colA` = 223
AND colB <> 1
ORDER BY `date` DESC;
with INDEX(colA, colB, date), with the columns in exactly that order. This index would be a "covering" index.
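
To see why the equality filter matters for INDEX(colA, colB, date), here is a minimal sketch using Python's built-in sqlite3 module. SQLite's planner is not MySQL's, so treat this only as an illustration of the principle; the table name log and the index name idx_a_b_date are made up for the example. With colB = 0 the index already delivers rows in date order, while with colB <> 1 a separate sort step is needed.

```python
import sqlite3

# In-memory SQLite database; table and index names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (id INTEGER PRIMARY KEY, colA INT, colB INT, date TEXT)")
conn.execute("CREATE INDEX idx_a_b_date ON log (colA, colB, date)")


def plan(sql):
    """Return the planner's detail strings for a query, joined into one line."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# Equality on both leading columns: the index is already ordered by date.
equality = plan(
    "SELECT date FROM log WHERE colA = 223 AND colB = 0 ORDER BY date DESC LIMIT 1"
)
# Inequality on colB: matching rows span several colB values, so a sort is needed.
inequality = plan(
    "SELECT date FROM log WHERE colA = 223 AND colB <> 1 ORDER BY date DESC LIMIT 1"
)

print(equality)
print(inequality)
```

In MySQL the equivalent check would be to run EXPLAIN on both queries and look for "Using filesort" in the Extra column.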

PHP MYSQL General Error returned when using LIMIT [duplicate]

I have this query with MySQL:
select * from table1 LIMIT 10,20
How can I do this with SQL Server?

Starting with SQL Server 2005, you can do this...
USE AdventureWorks;
GO
WITH OrderedOrders AS
(
SELECT SalesOrderID, OrderDate,
ROW_NUMBER() OVER (ORDER BY OrderDate) AS 'RowNumber'
FROM Sales.SalesOrderHeader
)
SELECT *
FROM OrderedOrders
WHERE RowNumber BETWEEN 10 AND 20;
or something like this for SQL Server 2000 and earlier versions...
SELECT TOP 10 * FROM (SELECT TOP 20 * FROM Table ORDER BY Id) AS t ORDER BY Id DESC

Starting with SQL SERVER 2012, you can use the OFFSET FETCH Clause:
USE AdventureWorks;
GO
SELECT SalesOrderID, OrderDate
FROM Sales.SalesOrderHeader
ORDER BY SalesOrderID
OFFSET 10 ROWS
FETCH NEXT 10 ROWS ONLY;
GO
http://msdn.microsoft.com/en-us/library/ms188385(v=sql.110).aspx
This may not work correctly when the order by is not unique.
If the query is modified to ORDER BY OrderDate, the result set returned is not as expected.

This is how I limit the results in MS SQL Server 2012:
SELECT *
FROM table1
ORDER BY columnName
OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY
NOTE: OFFSET can only be used together with ORDER BY.
To explain the clause OFFSET xx ROWS FETCH NEXT yy ROWS ONLY:
xx is the number of rows to skip before any rows are returned, i.e. if there are 40 records in table1, the code above skips the first 10 rows and starts pulling from row 11.
yy is the number of records/rows you want to pull from the table.
To build on the previous example: if table1 has 40 records, the query skips 10 rows and then grabs the NEXT set of 10 (yy).
That means the code above pulls the records from table1 starting at row 11 and ending at row 20, thus returning rows 11 - 20.
Check out the link for more info on OFFSET
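
The offset arithmetic is easy to check on a small table. This is a minimal sketch using Python's built-in sqlite3 module; SQLite spells the clause LIMIT yy OFFSET xx rather than OFFSET xx ROWS FETCH NEXT yy ROWS ONLY, but the skip/take semantics shown here are the same. The 40-row table1 mirrors the example above.

```python
import sqlite3

# Build a 40-row table with ids 1..40, as in the example above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO table1 (id) VALUES (?)", [(i,) for i in range(1, 41)])

offset, count = 10, 10  # xx (rows to skip) and yy (rows to return)

rows = [
    r[0]
    for r in conn.execute(
        "SELECT id FROM table1 ORDER BY id LIMIT ? OFFSET ?", (count, offset)
    )
]

# The first returned row is number offset + 1, the last is offset + count.
print(rows[0], rows[-1])
```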

This is almost a duplicate of a question I asked in October:
Emulate MySQL LIMIT clause in Microsoft SQL Server 2000
If you're using Microsoft SQL Server 2000, there is no good solution. Most people have to resort to capturing the result of the query in a temporary table with an IDENTITY primary key. Then query against the primary key column using a BETWEEN condition.
If you're using Microsoft SQL Server 2005 or later, you have a ROW_NUMBER() function, so you can get the same result but avoid the temporary table.
SELECT t2.*
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY id) AS row, t1.*
FROM ( ...original SQL query... ) t1
) t2
WHERE t2.row BETWEEN @offset+1 AND @offset+@count;
You can also write this as a common table expression as shown in @Leon Tayson's answer.
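
As a rough, runnable illustration of the ROW_NUMBER() pattern, here is a sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or later). The items table and the page() helper are invented for the example; the BETWEEN offset+1 AND offset+count bounds are the same as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO items (id) VALUES (?)", [(i,) for i in range(1, 41)])


def page(offset, count):
    """Emulate MySQL's LIMIT offset, count with ROW_NUMBER() in a derived table."""
    sql = """
        SELECT id
        FROM (
            SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS rn
            FROM items
        ) AS numbered
        WHERE rn BETWEEN ? + 1 AND ? + ?
        ORDER BY rn
    """
    return [r[0] for r in conn.execute(sql, (offset, offset, count))]


# LIMIT 10,20 means: skip 10 rows, then return the next 20.
page_rows = page(10, 20)
print(page_rows[0], page_rows[-1], len(page_rows))
```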

SELECT *
FROM (
SELECT TOP 20
t.*, ROW_NUMBER() OVER (ORDER BY field1) AS rn
FROM table1 t
ORDER BY
field1
) t
WHERE rn > 10

Syntactically MySQL LIMIT query is something like this:
SELECT * FROM table LIMIT OFFSET, ROW_COUNT
This can be translated into Microsoft SQL Server like
SELECT * FROM
(
SELECT TOP #{OFFSET+ROW_COUNT} *, ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS rnum
FROM table
) a
WHERE rnum > OFFSET
Now your query select * from table1 LIMIT 10,20 will be like this:
SELECT * FROM
(
SELECT TOP 30 *, ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS rnum
FROM table1
) a
WHERE rnum > 10

SELECT TOP 10 * FROM table;
Is the same as
SELECT * FROM table LIMIT 0,10;
Here's an article about implementing LIMIT in MS SQL. It's a nice read, especially the comments.

This is one of the reasons I try to avoid using MS Server... but anyway. Sometimes you just don't have an option (yei! and I have to use an outdated version!!).
My suggestion is to create a virtual table:
From:
SELECT * FROM table
To:
CREATE VIEW v_table AS
SELECT ROW_NUMBER() OVER (ORDER BY table_key) AS row,* FROM table
Then just query:
SELECT * FROM v_table WHERE row BETWEEN 10 AND 20
If fields are added, or removed, "row" is updated automatically.
The main problem with this option is that ORDER BY is fixed. So if you want a different order, you would have to create another view.
UPDATE
There is another problem with this approach: if you try to filter your data, it won't work as expected. For example, if you do:
SELECT * FROM v_table WHERE field = 'test' AND row BETWEEN 10 AND 20
WHERE becomes limited to those data which are in the rows between 10 and 20 (instead of searching the whole dataset and limiting the output).
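
The filtering problem can be reproduced on a tiny dataset. This is a sketch with Python's built-in sqlite3 module (SQLite 3.25+ for window functions); the table t, the view v_t and the sample data are made up. Numbering in the view happens before the outer WHERE runs, so filtering on the view gives a different result from filtering first and numbering afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, field TEXT)")
# Even ids carry field = 'test', odd ids carry 'other'.
conn.executemany(
    "INSERT INTO t (id, field) VALUES (?, ?)",
    [(i, "test" if i % 2 == 0 else "other") for i in range(1, 41)],
)
conn.execute(
    "CREATE VIEW v_t AS "
    "SELECT ROW_NUMBER() OVER (ORDER BY id) AS rn, id, field FROM t"
)

# Filtering on the view: rows were numbered over the WHOLE table first.
via_view = [
    r[0]
    for r in conn.execute(
        "SELECT id FROM v_t WHERE field = 'test' AND rn BETWEEN 11 AND 20 ORDER BY rn"
    )
]

# Filtering first, then numbering: this is what paging a filtered list needs.
filtered_first = [
    r[0]
    for r in conn.execute(
        """
        SELECT id FROM (
            SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS rn
            FROM t
            WHERE field = 'test'
        ) AS numbered
        WHERE rn BETWEEN 11 AND 20
        ORDER BY rn
        """
    )
]

print(via_view)
print(filtered_first)
```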

SQL Server has no LIMIT keyword. If you only need a limited number of rows, use the TOP keyword, which is similar to LIMIT.

Try this. The query below combines GROUP BY, ORDER BY, skipping rows, and limiting rows.
select emp_no , sum(salary_amount) from emp_salary
Group by emp_no
ORDER BY emp_no
OFFSET 5 ROWS -- skip the first 5 rows
FETCH NEXT 10 ROWS ONLY; -- return the next 10 rows after the skipped ones

Easy way
MYSQL:
SELECT 'fields' FROM 'table' WHERE 'where' LIMIT 'offset','per_page'
MSSQL:
SELECT 'fields' FROM 'table' WHERE 'where' ORDER BY 'any' OFFSET 'offset'
ROWS FETCH NEXT 'per_page' ROWS ONLY
ORDER BY is mandatory

This is a multi step approach that will work in SQL2000.
-- Create a temp table to hold the data
CREATE TABLE #foo(rowID int identity(1, 1), myOtherColumns)
INSERT INTO #foo (myColumns) SELECT myData order By MyCriteria
Select * FROM #foo where rowID > 10

SELECT
*
FROM
(
SELECT
top 20 -- ($a) number of records to show
*
FROM
(
SELECT
top 29 -- ($b) last record position
*
FROM
table -- replace this for table name (i.e. "Customer")
ORDER BY
2 ASC
) AS tbl1
ORDER BY
2 DESC
) AS tbl2
ORDER BY
2 ASC;
-- Examples:
-- Show 5 records from position 5:
-- $a = 5;
-- $b = (5 + 5) - 1
-- $b = 9;
-- Show 10 records from position 4:
-- $a = 10;
-- $b = (10 + 4) - 1
-- $b = 13;
-- To calculate $b:
-- $b = ($a + position) - 1
-- For the present exercise we need to:
-- Show 20 records from position 10:
-- $a = 20;
-- $b = (20 + 10) - 1
-- $b = 29;

If your ID is a uniqueidentifier type, or the IDs in your table are not sorted, you can number the rows like this:
select * from
(select ROW_NUMBER() OVER (ORDER BY (select 0)) AS RowNumber,* from table1) a
where a.RowNumber between 2 and 5
This returns rows 2 through 5, which in MySQL would be written as
select * from table1 limit 1,4

better use this in MSSQLExpress 2017.
SELECT * FROM
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) as [Count], * FROM table1
) as a
WHERE [Count] BETWEEN 10 and 20;
--Giving a Column [Count] and assigning every row a unique counting without ordering something then re select again where you can provide your limits.. :)

One possible way to get the result is shown below; hope this helps.
declare @start int
declare @end int
SET @start = '5000'; -- 0 , 5000 ,
SET @end = '10000'; -- 5001, 10001
SELECT * FROM (
SELECT TABLE_NAME,TABLE_TYPE, ROW_NUMBER() OVER (ORDER BY TABLE_NAME) as row FROM information_schema.tables
) a WHERE a.row > @start and a.row <= @end

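If I remember correctly (it's been a while since I dabbled with SQL Server) you may be able to use something like this (2005 and up). The row number has to be computed in a subquery, because a column alias cannot be referenced in the WHERE clause of the same SELECT:
SELECT *
FROM (
SELECT
*
,ROW_NUMBER() OVER(ORDER BY SomeFields) AS [RowNum]
FROM SomeTable
) AS numbered
WHERE RowNum BETWEEN 10 AND 20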

Select most common value? [duplicate]

How can I find the most frequent value in a given column in an SQL table?
For example, for this table it should return two since it is the most frequent value:
one
two
two
three

SELECT
<column_name>,
COUNT(<column_name>) AS `value_occurrence`
FROM
<my_table>
GROUP BY
<column_name>
ORDER BY
`value_occurrence` DESC
LIMIT 1;
Replace <column_name> and <my_table>. Increase 1 if you want to see the N most common values of the column.
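
Here is the same query run against the four sample rows from the question, as a small sketch using Python's built-in sqlite3 module. The table name words and column name word are placeholders chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (word TEXT)")
conn.executemany(
    "INSERT INTO words (word) VALUES (?)",
    [("one",), ("two",), ("two",), ("three",)],
)

# Count occurrences per value, sort by the count, keep the top row.
most_common = conn.execute(
    """
    SELECT word, COUNT(word) AS value_occurrence
    FROM words
    GROUP BY word
    ORDER BY value_occurrence DESC
    LIMIT 1
    """
).fetchone()

print(most_common)
```

Note that with LIMIT 1 a tie between two equally frequent values returns only one of them.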

Try something like:
SELECT `column`
FROM `your_table`
GROUP BY `column`
ORDER BY COUNT(*) DESC
LIMIT 1;

Let us consider table name as tblperson and column name as city. I want to retrieve the most repeated city from the city column:
select city,count(*) as nor from tblperson
group by city
having count(*) =(select max(nor) from
(select city,count(*) as nor from tblperson group by city) tblperson)
Here nor is an alias name.

Below query seems to work good for me in SQL Server database:
select column, COUNT(column) AS MOST_FREQUENT
from TABLE_NAME
GROUP BY column
ORDER BY COUNT(column) DESC
Result:
column MOST_FREQUENT
item1 highest count
item2 second highest
item3 third highest
..
..

For use with SQL Server, which has no LIMIT clause:
You can use TOP 1 to find the most frequently occurring value in a particular column, in this case (value)
SELECT TOP 1
[value],
COUNT([value]) AS [value_occurrence]
FROM
[my_table]
GROUP BY
[value]
ORDER BY
[value_occurrence] DESC;

Assuming Table is 'SalesLT.Customer' and the Column you are trying to figure out is 'CompanyName' and AggCompanyName is an Alias.
Select CompanyName, Count(CompanyName) as AggCompanyName from SalesLT.Customer
group by CompanyName
Order By Count(CompanyName) Desc;

If you can't use LIMIT, or LIMIT is not an option in your query tool, you can use "ROWNUM" instead, but you will need a subquery:
SELECT FIELD_1, ALIAS1
FROM(SELECT FIELD_1, COUNT(FIELD_1) ALIAS1
FROM TABLENAME
GROUP BY FIELD_1
ORDER BY COUNT(FIELD_1) DESC)
WHERE ROWNUM = 1

If you have an ID column and you want to find most repetitive category from another column for each ID then you can use below query,
Table:
Query:
SELECT ID, CATEGORY, COUNT(*) AS FREQ
FROM TABLE
GROUP BY 1,2
QUALIFY ROW_NUMBER() OVER(PARTITION BY ID ORDER BY FREQ DESC) = 1;
Result:

Return all most frequent rows in case of tie
Find the most frequent value in mysql,display all in case of a tie gives two possible approaches:
Scalar subquery:
SELECT
"country",
COUNT(country) AS "cnt"
FROM "Sales"
GROUP BY "country"
HAVING
COUNT("country") = (
SELECT COUNT("country") AS "cnt"
FROM "Sales"
GROUP BY "country"
ORDER BY "cnt" DESC
LIMIT 1
)
ORDER BY "country" ASC
With the RANK window function, available since MySQL 8+:
SELECT "country", "cnt"
FROM (
SELECT
"country",
COUNT("country") AS "cnt",
RANK() OVER (ORDER BY COUNT(*) DESC) "rnk"
FROM "Sales"
GROUP BY "country"
) AS "sub"
WHERE "rnk" = 1
ORDER BY "country" ASC
This method might save a second recount compared to the first one.
RANK works by ranking all rows, such that if two rows are at the top, both get rank 1. So it basically directly solves this type of use case.
RANK is also available on SQLite and PostgreSQL, I think it might be SQL standard, not sure.
In the above queries I also sorted by country to have more deterministic results.
Tested on SQLite 3.34.0, PostgreSQL 14.3, GitHub upstream.
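
A small runnable sketch of the RANK approach with a deliberate tie, using Python's built-in sqlite3 module (SQLite 3.25+). The sample countries are invented; DE and FR both appear twice, so both should come back with rank 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (country TEXT)")
conn.executemany(
    "INSERT INTO sales (country) VALUES (?)",
    [("FR",), ("DE",), ("FR",), ("US",), ("DE",)],
)

# RANK() gives every row of a tie the same rank, so rnk = 1 keeps all winners.
top_countries = conn.execute(
    """
    SELECT country, cnt
    FROM (
        SELECT
            country,
            COUNT(*) AS cnt,
            RANK() OVER (ORDER BY COUNT(*) DESC) AS rnk
        FROM sales
        GROUP BY country
    ) AS sub
    WHERE rnk = 1
    ORDER BY country ASC
    """
).fetchall()

print(top_countries)
```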
Most frequent for each GROUP BY group
MySQL: MySQL SELECT most frequent by group
PostgreSQL:
Get most common value for each value of another column in SQL
https://dba.stackexchange.com/questions/193307/find-most-frequent-values-for-a-given-column
SQLite: SQL query for finding the most frequent value of a grouped by value

SELECT TOP 20 WITH TIES COUNT(Counted_Column) AS Count, OtherColumn1,
OtherColumn2, OtherColumn3, OtherColumn4
FROM Table_or_View_Name
WHERE
(Date_Column >= '01/01/2023') AND
(Date_Column <= '03/01/2023') AND
(Counted_Column = 'Desired_Text')
GROUP BY OtherColumn1, OtherColumn2, OtherColumn3, OtherColumn4
ORDER BY COUNT(Counted_Column) DESC
20 can be changed to any desired number
WITH TIES allows all ties in the count to be displayed
Date range used if date/time column exists and can be modified to search a date range as desired
Counted_Column 'Desired_Text' can be modified to only count certain entries in that column
Works in INSQL for my instance

One way I like to use is:
select <given_column>,COUNT(<given_column>) as VAR1 from Table_Name
group by <given_column>
order by VAR1 desc
limit 1

SQL position of row(ranking system) WITHOUT same rank for two records

so I'm trying to create a ranking system for my website, however as a lot of the records have same number of points, they all have same rank, is there a way to avoid this?
currently have
$conn = $db->query("SELECT COUNT( * ) +1 AS 'position' FROM tv WHERE points > ( SELECT points FROM tv WHERE id ={$data['id']} )");
$d = $db->fetch_array($conn);
echo $d['position'];
And DB structure
`id` int(11) NOT NULL,
`name` varchar(150) NOT NULL,
`points` int(11) NOT NULL,
Edited below,
What I'm doing right now is getting records by lets say
SELECT * FROM tv WHERE type = 1
Now I run a while loop, and I need to make myself a function that will get the rank, but it would make sure that the ranks aren't duplicate
How would I go about making a ranking system that doesn't have same ranking for two records? lets say if the points count is the same, it would order them by ID and get their position? or something like that? Thank you!

If you are using MS SQL Server 2008R2, you can use the RANK function.
http://msdn.microsoft.com/en-us/library/ms176102.aspx
If you are using MySQL, you can look at one of the below options:
http://thinkdiff.net/mysql/how-to-get-rank-using-mysql-query/
http://www.fromdual.ch/ranking-mysql-results

select @rnk:=@rnk+1 as rnk,id,name,points
from table,(select @rnk:=0) as r order by points desc,id

You want to use ORDER BY. Applying on multiple columns is as simple as comma delimiting them: ORDER BY points, id DESC will sort by points and if the points are the same, it will sort by id.
Here's your SELECT query:
SELECT * FROM tv WHERE points > ( SELECT points FROM tv WHERE id ={$data['id']} ) ORDER BY points, id DESC
Documentation to support this: http://dev.mysql.com/doc/refman/5.0/en/sorting-rows.html

Many Database vendors have added special functions to their products to do this, but you can also do it with straight SQL:
Select *, 1 +
(Select Count(*) From myTable
Where ColName < t.ColName) Rank
From MyTable t
or to avoid giving records with the same value of colName the same rank, (This requires a key)
Select *, 1 +
(Select Count(Distinct KeyCol)
From myTable
Where ColName < t.ColName or
(ColName = t.ColName And KeyCol < t.KeyCol)) Rank
From MyTable t
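
Applied to the tv table from the question, the tie-breaking idea could look like the sketch below. It uses Python's built-in sqlite3 module; the sample rows and the position() helper are made up, and it assumes a higher points value means a better rank, with the lower id winning among equal points. The id is passed as a bound parameter instead of being interpolated into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tv (id INTEGER PRIMARY KEY, name TEXT NOT NULL, points INT NOT NULL)"
)
conn.executemany(
    "INSERT INTO tv (id, name, points) VALUES (?, ?, ?)",
    [(1, "a", 50), (2, "b", 70), (3, "c", 50), (4, "d", 70), (5, "e", 10)],
)


def position(row_id):
    """Rank of one row: 1 + number of rows that sort ahead of it."""
    sql = """
        SELECT 1 + COUNT(*)
        FROM tv AS other
        JOIN tv AS me ON me.id = ?
        WHERE other.points > me.points
           OR (other.points = me.points AND other.id < me.id)
    """
    return conn.execute(sql, (row_id,)).fetchone()[0]


# Rows 2 and 4 share 70 points, rows 1 and 3 share 50, yet every rank is unique.
positions = {row_id: position(row_id) for row_id in (1, 2, 3, 4, 5)}
print(positions)
```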

Forward Back Records in MySQL with the same DATA in the primary

I have a table that is sorted first by reminder date, then by ID.
Table Looks like:
ID | remind_date
1 2011-01-23
2 2010-02-21
4 2011-04-04
5 2011-04-04
6 2009-05-04
I am using a PHP front end to move forward and back through the records. I want to have forward and back buttons, but I am running into a problem with the 2 reminder dates that are the same.
Just to note, the IDs are NOT in order. They are here, but in the actual database they are mixed up when sorting by remind_date.
The select statement I am using is: ($iid is the current record I am on)
SELECT id FROM myDB.reminders where remind_date > (SELECT remind_date FROM myDB.reminders where id=$iid) order by remind_date ASC LIMIT 1
So what happens when I get to the dates that are the same is that it skips over one, because it's asking for remind_date >.
If I use remind_date >= it returns the current record. My solution was then to use LIMIT 2 and check via PHP whether the 1st record = my current ID, and if it did, use the next one. But what if there are 3 dates the same, or 4, etc.?
I also thought about using the ID field, but since the IDs are out of order I can't add in an ID > $iid.
Any ideas? It works great except for dates that are the same.

You might be able to use this:
SELECT ID, remind_date
FROM
(
SELECT @prev_id := -1
) AS vars
STRAIGHT_JOIN
(
SELECT
ID,
remind_date,
@prev_id AS prev_id,
@prev_id := id
FROM myDB.reminders
ORDER BY remind_date, ID
) T1
WHERE prev_id = $iid
Here is a test of the above with your test data from your comment:
CREATE TABLE Table1 (ID INT NOT NULL, remind_date DATE NOT NULL);
INSERT INTO Table1 (ID, remind_date) VALUES
(45, '2011-01-14'),
(23, '2011-01-22'),
(48, '2011-01-23'),
(25, '2011-01-23'),
(63, '2011-02-19');
SELECT ID, remind_date
FROM
(
SELECT @prev_id := -1
) AS vars
STRAIGHT_JOIN
(
SELECT
ID,
remind_date,
@prev_id AS prev_id,
@prev_id := id
FROM table1
ORDER BY remind_date, ID
) T1
WHERE prev_id = 25
Result:
ID remind_date
48 2011-01-23
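
A different technique that avoids user variables is to compare on the pair (remind_date, ID), so rows with the same date are stepped through in ID order. The sketch below is not the method from the answer above, just an alternative under the same ordering (remind_date, ID). It uses Python's built-in sqlite3 module with the test data from above; the neighbour() helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reminders (id INT NOT NULL, remind_date TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO reminders (id, remind_date) VALUES (?, ?)",
    [
        (45, "2011-01-14"),
        (23, "2011-01-22"),
        (48, "2011-01-23"),
        (25, "2011-01-23"),
        (63, "2011-02-19"),
    ],
)


def neighbour(current_id, forward=True):
    """Return the id after (or before) current_id in (remind_date, id) order."""
    cur = conn.execute(
        "SELECT remind_date FROM reminders WHERE id = ?", (current_id,)
    ).fetchone()
    if cur is None:
        return None
    # Only these fixed operator/direction strings are formatted into the SQL.
    op, direction = (">", "ASC") if forward else ("<", "DESC")
    sql = f"""
        SELECT id FROM reminders
        WHERE remind_date {op} ?
           OR (remind_date = ? AND id {op} ?)
        ORDER BY remind_date {direction}, id {direction}
        LIMIT 1
    """
    row = conn.execute(sql, (cur[0], cur[0], current_id)).fetchone()
    return row[0] if row else None


print(neighbour(25), neighbour(48), neighbour(25, forward=False))
```

With an index on (remind_date, id) this kind of lookup does not need to scan and number the whole table for every click.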

Add a condition WHERE ID<>MY_LAST_ID. This does not work with three or more identical dates, so you can collect the IDs already taken into an array like (4,5,6) (see array_push()), implode it with "," to convert it to a string (let's call it YOUR_IDS_STRING) and add it to your query:
WHERE id NOT IN( YOUR_IDS_STRING )
After each query, check whether the date has changed; if it has, you can unset your array and start from the beginning (this is not necessary, but gives you better performance, because YOUR_IDS_STRING will only be as long as needed).
If your page refreshes between queries, try keeping YOUR_IDS_STRING in a session variable, _GET or cookies, and simply append the next IDs with the .= operator.

I used the code provided by Mark Byers and with small changes I adapted it to navigate in opposite directions (and to pass other columns too, not only the date and ID):
$results = $mysqli->query("SELECT * FROM (SELECT @prev_id := -1) AS vars STRAIGHT_JOIN (SELECT *, @prev_id AS prev_id, @prev_id := ID FROM my_table ORDER BY data, ID) T1 WHERE prev_id = ".$ID);
$results = $mysqli->query("SELECT * FROM (SELECT @next_id := 1) AS vars STRAIGHT_JOIN (SELECT *, @next_id AS next_id, @next_id := ID FROM my_table ORDER BY data DESC, ID DESC) T1 WHERE next_id = ".$ID);
I tested it on duplicate dates and it navigates well through a list of records displayed with:
$results = $mysqli->query("SELECT * FROM my_table ORDER BY data DESC, ID DESC");

getting non-distinct results from a distinct mysql query

Right, another question on queries (there must be a syntax guide more helpful than mySQL's manual, surely?)
I have this query (from another helpful answer on SO)...
SELECT DATE_FORMAT(`when`, '%e_%c_%Y')date, COUNT(`ip`) AddressCount FROM `Metrics` WHERE `ID` = '1' GROUP BY DATE(`when`)
I now want to do a similar query to get unique/distinct results for the IPs... i.e. unique visitors per date. My query was this...
SELECT DATE_FORMAT(`when`, '%e_%c_%Y')date, COUNT(distinct `ip`) AddressCount FROM `Metrics` WHERE `ID` = '1' GROUP BY DATE(`when`)
However, that returns a repetition of dates, though different quantities of Addresscount...
date AddressCount
29_6_2009 1
30_6_2009 1
29_6_2009 1
30_6_2009 1
29_6_2009 1
NULL 1
15_5_2009 1
14_5_2009 2
NULL 3
14_5_2009 4
15_5_2009 1
26_6_2009 1
29_6_2009 1
26_6_2009 1
15_5_2009 1
26_6_2009 1
29_6_2009 1
Any ideas on where I'm going wrong?

Your group by will need to match the data you're selecting, so this should work:
SELECT DATE_FORMAT(`when`, '%e_%c_%Y')date, COUNT(distinct `ip`) AddressCount FROM `Metrics` WHERE `ID` = '1' GROUP BY date

Try
SELECT DATE_FORMAT(when, '%e_%c_%Y')date, COUNT(distinct ip) AddressCount FROM Metrics WHERE ID = '1' GROUP BY date(when)
You might have run into some bugs when using reserved words in MySQL
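
For reference, here is what the intended result (one row per day with the number of distinct IPs) looks like on a tiny dataset. This is only a sketch using Python's built-in sqlite3 module, so it uses SQLite's date() instead of MySQL's DATE_FORMAT; the table layout, the column name visited_at (standing in for `when`) and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (id INT, visited_at TEXT, ip TEXT)")
conn.executemany(
    "INSERT INTO metrics (id, visited_at, ip) VALUES (1, ?, ?)",
    [
        ("2009-06-29 10:00:00", "1.1.1.1"),
        ("2009-06-29 11:00:00", "1.1.1.1"),  # repeat visitor on the same day
        ("2009-06-29 12:00:00", "2.2.2.2"),
        ("2009-06-30 09:00:00", "1.1.1.1"),
    ],
)

# Group by the same expression that is selected, so each day appears once.
per_day = conn.execute(
    """
    SELECT date(visited_at) AS day, COUNT(DISTINCT ip) AS address_count
    FROM metrics
    WHERE id = 1
    GROUP BY day
    ORDER BY day
    """
).fetchall()

print(per_day)
```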
