The following is the simplest possible example, though any solution should be able to scale to however many n top results are needed:
Given a table like that below, with person, group, and age columns, how would you get the 2 oldest people in each group? (Ties within groups should not yield more results, but give the first 2 in alphabetical order)
+--------+-------+-----+
| Person | Group | Age |
+--------+-------+-----+
| Bob | 1 | 32 |
| Jill | 1 | 34 |
| Shawn | 1 | 42 |
| Jake | 2 | 29 |
| Paul | 2 | 36 |
| Laura | 2 | 39 |
+--------+-------+-----+
Desired result set:
+--------+-------+-----+
| Shawn | 1 | 42 |
| Jill | 1 | 34 |
| Laura | 2 | 39 |
| Paul | 2 | 36 |
+--------+-------+-----+
NOTE: This question builds on a previous one, "Get records with max value for each group of grouped SQL results", which covers getting a single top row from each group and which received a great MySQL-specific answer from @Bohemian:
select *
from (select * from mytable order by `Group`, Age desc, Person) x
group by `Group`
Would love to be able to build off this, though I don't see how.
Here is one way to do this, using UNION ALL (see SQL Fiddle with Demo). This works with two groups. If you have more than two groups, you would need to specify each group number and add a query for each group:
(
select *
from mytable
where `group` = 1
order by age desc
LIMIT 2
)
UNION ALL
(
select *
from mytable
where `group` = 2
order by age desc
LIMIT 2
)
There are a variety of ways to do this, see this article to determine the best route for your situation:
http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/
Edit:
This might work for you too, it generates a row number for each record. Using an example from the link above this will return only those records with a row number of less than or equal to 2:
select person, `group`, age
from
(
select person, `group`, age,
(@num:=if(@group = `group`, @num +1, if(@group := `group`, 1, 1))) row_number
from test t
CROSS JOIN (select @num:=0, @group:=null) c
order by `Group`, Age desc, person
) as x
where x.row_number <= 2;
See Demo
In other databases you can do this using ROW_NUMBER. MySQL versions before 8.0 don't support ROW_NUMBER, but you can use variables to emulate it:
SELECT
person,
groupname,
age
FROM
(
SELECT
person,
groupname,
age,
@rn := IF(@prev = groupname, @rn + 1, 1) AS rn,
@prev := groupname
FROM mytable
JOIN (SELECT @prev := NULL, @rn := 0) AS vars
ORDER BY groupname, age DESC, person
) AS T1
WHERE rn <= 2
See it working online: sqlfiddle
Edit: I just noticed that bluefeet posted a very similar answer: +1 to him. However, this answer has two small advantages:
It is a single query. The variables are initialized inside the SELECT statement.
It handles ties as described in the question (alphabetical order by name).
So I'll leave it here in case it can help someone.
Try this:
SELECT a.person, a.group, a.age FROM person AS a WHERE
(SELECT COUNT(*) FROM person AS b
WHERE b.group = a.group AND b.age >= a.age) <= 2
ORDER BY a.group ASC, a.age DESC
DEMO
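One caveat with the counting query above: because it compares with `>=` only on age, two people tied for second place each count three rows and both drop out. A minimal sketch of a tie-aware variant, run here through Python's sqlite3 module so it can be tried without a server. The `groupname` column name and the extra row for Joe are assumptions added for illustration, not part of the original data.

```python
import sqlite3

rows = [
    ("Bob", 1, 32), ("Jill", 1, 34), ("Shawn", 1, 42),
    ("Jake", 2, 29), ("Paul", 2, 36), ("Laura", 2, 39),
    ("Joe", 2, 36),  # hypothetical extra row: ties with Paul on age
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (person TEXT, groupname INTEGER, age INTEGER)")
conn.executemany("INSERT INTO person VALUES (?, ?, ?)", rows)

# Keep a row when fewer than 2 rows of the same group sort ahead of it.
# "Ahead" means older, or the same age with an alphabetically earlier name.
top2 = conn.execute("""
    SELECT a.person, a.groupname, a.age
    FROM person AS a
    WHERE (SELECT COUNT(*)
           FROM person AS b
           WHERE b.groupname = a.groupname
             AND (b.age > a.age
                  OR (b.age = a.age AND b.person < a.person))) < 2
    ORDER BY a.groupname, a.age DESC, a.person
""").fetchall()

for row in top2:
    print(row)
```

Changing the `< 2` to `< n` should scale this to any top n, though the correlated subquery can get slow on large tables.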
How about using self-joining:
CREATE TABLE mytable (person, groupname, age);
INSERT INTO mytable VALUES('Bob',1,32);
INSERT INTO mytable VALUES('Jill',1,34);
INSERT INTO mytable VALUES('Shawn',1,42);
INSERT INTO mytable VALUES('Jake',2,29);
INSERT INTO mytable VALUES('Paul',2,36);
INSERT INTO mytable VALUES('Laura',2,39);
SELECT a.* FROM mytable AS a
LEFT JOIN mytable AS a2
ON a.groupname = a2.groupname AND a.age <= a2.age
GROUP BY a.person
HAVING COUNT(*) <= 2
ORDER BY a.groupname, a.age DESC;
gives me:
a.person a.groupname a.age
---------- ----------- ----------
Shawn 1 42
Jill 1 34
Laura 2 39
Paul 2 36
I was strongly inspired by the answer from Bill Karwin to Select top 10 records for each category
Also, I'm using SQLite, but this should work on MySQL.
Another thing: in the above, I replaced the group column with a groupname column for convenience.
Edit:
Following up on the OP's comment regarding missing tie results, I built on snuffin's answer to show all the ties. This means that if the last ones are ties, more than 2 rows can be returned, as shown below:
.headers on
.mode column
CREATE TABLE foo (person, groupname, age);
INSERT INTO foo VALUES('Paul',2,36);
INSERT INTO foo VALUES('Laura',2,39);
INSERT INTO foo VALUES('Joe',2,36);
INSERT INTO foo VALUES('Bob',1,32);
INSERT INTO foo VALUES('Jill',1,34);
INSERT INTO foo VALUES('Shawn',1,42);
INSERT INTO foo VALUES('Jake',2,29);
INSERT INTO foo VALUES('James',2,15);
INSERT INTO foo VALUES('Fred',1,12);
INSERT INTO foo VALUES('Chuck',3,112);
SELECT a.person, a.groupname, a.age
FROM foo AS a
WHERE a.age >= (SELECT MIN(b.age)
FROM foo AS b
WHERE (SELECT COUNT(*)
FROM foo AS c
WHERE c.groupname = b.groupname AND c.age >= b.age) <= 2
GROUP BY b.groupname)
ORDER BY a.groupname ASC, a.age DESC;
gives me:
person groupname age
---------- ---------- ----------
Shawn 1 42
Jill 1 34
Laura 2 39
Paul 2 36
Joe 2 36
Chuck 3 112
Snuffin's solution seems quite slow to execute when you have plenty of rows, and the Mark Byers/Rick James and Bluefeet solutions don't work in my environment (MySQL 5.6) because ORDER BY is applied after execution of the SELECT. Here is a variant of the Mark Byers/Rick James solutions that fixes this issue (with an extra nested select):
select person, groupname, age
from
(
select person, groupname, age,
(@rn:=if(@prev = groupname, @rn +1, 1)) as rownumb,
@prev:= groupname
from
(
select person, groupname, age
from persons
order by groupname , age desc, person
) as sortedlist
JOIN (select @prev:=NULL, @rn :=0) as vars
) as groupedlist
where rownumb<=2
order by groupname , age desc, person;
I tried a similar query on a table with 5 million rows and it returned results in less than 3 seconds.
If the other answers are not fast enough, give this code a try:
SELECT
province, n, city, population
FROM
( SELECT @prev := '', @n := 0 ) init
JOIN
( SELECT @n := if(province != @prev, 1, @n + 1) AS n,
@prev := province,
province, city, population
FROM Canada
ORDER BY
province ASC,
population DESC
) x
WHERE n <= 3
ORDER BY province, n;
Output:
+---------------------------+------+------------------+------------+
| province | n | city | population |
+---------------------------+------+------------------+------------+
| Alberta | 1 | Calgary | 968475 |
| Alberta | 2 | Edmonton | 822319 |
| Alberta | 3 | Red Deer | 73595 |
| British Columbia | 1 | Vancouver | 1837970 |
| British Columbia | 2 | Victoria | 289625 |
| British Columbia | 3 | Abbotsford | 151685 |
| Manitoba | 1 | ...
Check this out:
SELECT
p.Person,
p.`Group`,
p.Age
FROM
people p
INNER JOIN
(
SELECT MAX(Age) AS Age, `Group` FROM people GROUP BY `Group`
UNION
SELECT MAX(p3.Age) AS Age, p3.`Group` FROM people p3 INNER JOIN (SELECT MAX(Age) AS Age, `Group` FROM people GROUP BY `Group`) p4 ON p3.Age < p4.Age AND p3.`Group` = p4.`Group` GROUP BY `Group`
) p2 ON p.Age = p2.Age AND p.`Group` = p2.`Group`
ORDER BY
`Group`,
Age DESC,
Person;
SQL Fiddle: http://sqlfiddle.com/#!2/cdbb6/15
WITH cte_window AS (
SELECT movie_name,director_id,release_date,
ROW_NUMBER() OVER( PARTITION BY director_id ORDER BY release_date DESC) r
FROM movies
)
SELECT * FROM cte_window WHERE r <= <n>;
The above query returns the latest n movies for each director.
I wanted to share this because I spent a long time searching for an easy way to implement this in a Java program I'm working on. This doesn't quite give the output you're looking for, but it's close. The MySQL function GROUP_CONCAT() worked really well for specifying how many results to return in each group. Using LIMIT or any of the other fancy ways of trying to do this with COUNT didn't work for me. So if you're willing to accept a modified output, it's a great solution. Let's say I have a table called 'student' with student ids, their gender, and gpa, and I want the top 5 gpas for each gender. Then I can write the query like this:
SELECT sex, SUBSTRING_INDEX(GROUP_CONCAT(cast(gpa AS char ) ORDER BY gpa desc), ',',5)
AS subcategories FROM student GROUP BY sex;
Note that the parameter '5' tells it how many entries to concatenate into each row
And the output would look something like
+--------+----------------+
| Male | 4,4,4,4,3.9 |
| Female | 4,4,3.9,3.9,3.8|
+--------+----------------+
You can also change the ORDER BY expression to order them a different way. So if I had the student's age, I could replace 'gpa desc' with 'age desc' and it would work! You can also add columns to the GROUP BY clause to get more columns in the output. So this is just a way I found that is pretty flexible and works well if you are OK with just listing results.
In SQL Server, row_number() is a powerful function that gets the result easily, as below:
select Person,[group],age
from
(
select * ,row_number() over(partition by [group] order by age desc) rn
from mytable
) t
where rn <= 2
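The query above orders only by age, so which of two tied rows gets the lower row number is not defined. A small sketch of the same ROW_NUMBER() idea with the question's alphabetical tie-break, wrapped in Python's sqlite3 so it can be run locally. It assumes the bundled SQLite is 3.25 or newer (the first release with window functions); the `groupname` column, the `top_n_per_group` helper, and the extra row for Joe are hypothetical additions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (person TEXT, groupname INTEGER, age INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [("Bob", 1, 32), ("Jill", 1, 34), ("Shawn", 1, 42),
     ("Jake", 2, 29), ("Paul", 2, 36), ("Laura", 2, 39),
     ("Joe", 2, 36)],  # hypothetical extra row: ties with Paul on age
)

def top_n_per_group(conn, n):
    """Return the n oldest people per group, ties broken by name."""
    return conn.execute("""
        SELECT person, groupname, age
        FROM (SELECT person, groupname, age,
                     ROW_NUMBER() OVER (PARTITION BY groupname
                                        ORDER BY age DESC, person) AS rn
              FROM mytable) AS t
        WHERE rn <= ?
        ORDER BY groupname, rn
    """, (n,)).fetchall()

for row in top_n_per_group(conn, 2):
    print(row)
```

Passing n as a bound parameter is one way to get the "scale to any n" behaviour the question asks for.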
There is a really nice answer to this problem at MySQL - How To Get Top N Rows per Each Group
Based on the solution in the referenced link, your query would be like:
SELECT Person, `Group`, Age
FROM
(SELECT Person, `Group`, Age,
@group_rank := IF(@current_group = `Group`, @group_rank + 1, 1) AS group_rank,
@current_group := `Group`
FROM `your_table`
ORDER BY `Group`, Age DESC
) ranked
WHERE group_rank <= n
ORDER BY `Group`, Age DESC;
where n is the top n and your_table is the name of your table.
I think the explanation in the reference is really clear. For quick reference I will copy and paste it here:
Currently MySQL does not support ROW_NUMBER() function that can assign
a sequence number within a group, but as a workaround we can use MySQL
session variables.
These variables do not require declaration, and can be used in a query
to do calculations and to store intermediate results.
@current_country := country This code is executed for each row and
stores the value of country column to @current_country variable.
@country_rank := IF(@current_country = country, @country_rank + 1, 1)
In this code, if @current_country is the same we increment rank,
otherwise set it to 1. For the first row @current_country is NULL, so
rank is also set to 1.
For correct ranking, we need to have ORDER BY country, population DESC
SELECT
p1.Person,
p1.`GROUP`,
p1.Age
FROM
person AS p1
WHERE
(
SELECT
COUNT( DISTINCT ( p2.age ) )
FROM
person AS p2
WHERE
p2.`GROUP` = p1.`GROUP`
AND p2.Age >= p1.Age
) <= 2
ORDER BY
p1.`GROUP` ASC,
p1.age DESC
reference leetcode
Related
Here's what I'm trying to do. Let's say I have this table t:
key_id | id | record_date | other_cols
1 | 18 | 2011-04-03 | x
2 | 18 | 2012-05-19 | y
3 | 18 | 2012-08-09 | z
4 | 19 | 2009-06-01 | a
5 | 19 | 2011-04-03 | b
6 | 19 | 2011-10-25 | c
7 | 19 | 2012-08-09 | d
For each id, I want to select the row containing the minimum record_date. So I'd get:
key_id | id | record_date | other_cols
1 | 18 | 2011-04-03 | x
4 | 19 | 2009-06-01 | a
The only solutions I've seen to this problem assume that all record_date entries are distinct, but that is not the case in my data. Using a subquery and an inner join with two conditions would give me duplicate rows for some ids, which I don't want:
key_id | id | record_date | other_cols
1 | 18 | 2011-04-03 | x
5 | 19 | 2011-04-03 | b
4 | 19 | 2009-06-01 | a
How about something like:
SELECT mt.*
FROM MyTable mt INNER JOIN
(
SELECT id, MIN(record_date) AS MinDate
FROM MyTable
GROUP BY id
) t ON mt.id = t.id AND mt.record_date = t.MinDate
This gets the minimum date per ID, and then gets the values based on those values. The only time you would have duplicates is if there are duplicate minimum record_dates for the same ID.
I could get to your expected result just by doing this in mysql:
SELECT id, min(record_date), other_cols
FROM mytable
GROUP BY id
Does this work for you?
To get the cheapest product in each category, you use the MIN() function in a correlated subquery as follows:
SELECT categoryid,
productid,
productName,
unitprice
FROM products a WHERE unitprice = (
SELECT MIN(unitprice)
FROM products b
WHERE b.categoryid = a.categoryid)
The outer query scans all rows in the products table and returns the products whose unit price matches the lowest price in their category, as returned by the correlated subquery.
I would like to add to some of the other answers here: if you don't need the first item but, say, the second, you can use ROW_NUMBER() in a subquery and base your result set on that.
SELECT * FROM
(
SELECT
ROW_NUMBER() OVER (PARTITION BY Id ORDER BY record_date, other_cols) as rownum,
P.*
FROM products P
) ranked
WHERE rownum = 2
This also allows you to order by multiple columns in the subquery, which may help if two record_dates have identical values. You can also partition by multiple columns if needed by delimiting them with a comma.
This does it simply:
select t2.id,t2.record_date,t2.other_cols
from (select ROW_NUMBER() over(partition by id order by record_date)as rownum,id,record_date,other_cols from MyTable)t2
where t2.rownum = 1
If record_date has no duplicates within a group:
Think of it as filtering. Simply get (WHERE) the one (MIN(record_date)) row from the current group:
SELECT * FROM t t1 WHERE record_date = (
select MIN(record_date)
from t t2 where t2.group_id = t1.group_id)
If there could be 2+ min record_date within a group:
filter out non-min rows (see above)
then (AND) pick only one from the 2+ min record_date rows, within the given group_id. E.g. pick the one with the min unique key:
AND key_id = (select MIN(key_id)
from t t3 where t3.record_date = t1.record_date
and t3.group_id = t1.group_id)
so
key_id | group_id | record_date | other_cols
1 | 18 | 2011-04-03 | x
4 | 19 | 2009-06-01 | a
8 | 19 | 2009-06-01 | e
will select key_ids: #1 and #4
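Putting the two filters together, here is a minimal runnable sketch using Python's sqlite3 module. The table name `t` and the `group_id` column follow this answer; the rows are a subset of the question's data plus the tied row with key_id 8 from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (key_id INTEGER, group_id INTEGER, record_date TEXT, other_cols TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, 18, "2011-04-03", "x"), (2, 18, "2012-05-19", "y"),
     (4, 19, "2009-06-01", "a"), (5, 19, "2011-04-03", "b"),
     (8, 19, "2009-06-01", "e")],  # key_id 8 ties with key_id 4 on the minimum date
)

# First filter: keep rows holding the group's minimum record_date.
# Second filter: among those, keep the row with the smallest key_id.
earliest = conn.execute("""
    SELECT t1.key_id, t1.group_id, t1.record_date, t1.other_cols
    FROM t AS t1
    WHERE t1.record_date = (SELECT MIN(t2.record_date)
                            FROM t AS t2
                            WHERE t2.group_id = t1.group_id)
      AND t1.key_id = (SELECT MIN(t3.key_id)
                       FROM t AS t3
                       WHERE t3.group_id = t1.group_id
                         AND t3.record_date = t1.record_date)
    ORDER BY t1.group_id
""").fetchall()

for row in earliest:
    print(row)
```

Dates are stored as ISO-8601 text here, which is an assumption that makes MIN() compare them in chronological order.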
SELECT p.* FROM tbl p
INNER JOIN(
SELECT t.id, MIN(record_date) AS MinDate
FROM tbl t
GROUP BY t.id
) t ON p.id = t.id AND p.record_date = t.MinDate
GROUP BY p.id
This code eliminates duplicate record_date in case there are same ids with same record_date.
If you want duplicates, remove the last line GROUP BY p.id.
This is an old question, but this may be useful for someone.
In my case I can't use a subquery because I have a big query and I need to use min() on my result; if I use a subquery, the database has to re-execute my big query. I'm using MySQL.
select t.*
from (select m.*, @g := 0
from MyTable m -- here I have a big query
order by id, record_date) t
where (1 = case when @g = 0 or @g <> id then 1 else 0 end )
and (@g := id) IS NOT NULL
Basically I ordered the result and then put a variable in order to get only the first record in each group.
The query below takes the first date for each work order (in a table showing all status changes):
SELECT
WORKORDERNUM,
MIN(DATE)
FROM
WORKORDERS
WHERE
DATE >= to_date('2015-01-01','YYYY-MM-DD')
GROUP BY
WORKORDERNUM
select
department,
min_salary,
(select s1.last_name from staff s1 where s1.salary=s3.min_salary ) lastname
from
(select department, min (salary) min_salary from staff s2 group by s2.department) s3
Today I am facing a challenge for me, that I could solve with multiple queries, a little bit of PHP and some other funny things, but I was wondering whether what I mean to do can be achieved with a single query and/or stored fn/procedure.
Let me explain myself better:
In a list of cities, I need to pick up a value (say "general expenses") for a named city (say "Rome").
Pretty simple.
What I would like to do is:
Have 6 records for the same value BEFORE and 6 AFTER the Rome one.
So I would see something:
| position | city | expenses |
| 35 | Paris | 1364775 |
| 36 | Milan | 1378499 |
| 37 | New York | 1385759 |
| 38 | London | 1398594 |
| 39 | Oslo | 1404648 |
| 40 | Munchen | 1414857 |
| 41 | Rome | 1425773 | *** <--this is the value I need
| 42 | Dublin | 1437588 |
| 43 | Athen | 1447758 |
| 44 | Stockholm | 1458593 |
| 46 | Helsinki | 1467489 |
| 47 | Moscow | 1477484 |
| 48 | Kiev | 1485665 |
These values will populate a bars chart.
As you can see there is also another complexity level: the position.
Position must be calculated on all the records.
So let's say I have 100 records, I will have the ranking position from 1 to 100, but only the "limited 13" records must be output.
Any link, suggestion, tutorial, or anything else that could help me out with that?
Thank you in advance as always.
EDIT
Position MUST BE calculated. It is not an input value.
Anyway, thanks folks for all your efforts.
SELECT
all_ranked.*
FROM (SELECT B.`rank`
FROM (SELECT a.id AS id2,
@curRow := @curRow + 1 AS `rank`
FROM the_table a
JOIN
(SELECT @curRow := 0) r
ORDER BY position DESC
) AS B
WHERE B.id2 = 1234567) AS rank_record, -- just one record: the rank of the target row
(SELECT a.id AS id2,
@curRow2 := @curRow2 + 1 AS `rank`
FROM the_table a
JOIN
(SELECT @curRow2 := 0) r
ORDER BY position DESC
) AS all_ranked -- all ranked rows
WHERE all_ranked.`rank` >= rank_record.`rank` - 6 AND all_ranked.`rank` <= rank_record.`rank` + 6
Create 2 queries joined into one. The first gets the position, and the second sets positions and cuts out the desired fragment.
You could use a stored function/procedure that takes in an input that indicates the subject record, e.g., "Rome" to derive a ranking, which I shall refer to here as the perceived ID (PID) for that record, e.g., 41. You can then use a variable @PID to store that location.
Then you can do your ranking query again but select all records.
SELECT .... WHERE Ranking BETWEEN (@PID-6) AND (@PID+6)
An advantage to doing it this way is that the function/procedure can take in an additional parameter to allow it to fetch X records after and Y records before that ranking. It would be easier to read and maintain as well.
Performing it as a single query without the use of PHP would be tricky as you need to insert a WHERE clause in which the condition is the result of another query.
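For what it's worth, a single query is possible if the position is computed by counting rows rather than with a session variable. Below is a rough sketch using Python's sqlite3 module and the cities from the question. The `cities` table layout and the `neighbours` helper are assumptions for illustration, positions are computed only over these 13 rows (so Rome comes out at 7 rather than 41), and the counting approach assumes expenses are distinct.

```python
import sqlite3

cities = [
    ("Paris", 1364775), ("Milan", 1378499), ("New York", 1385759),
    ("London", 1398594), ("Oslo", 1404648), ("Munchen", 1414857),
    ("Rome", 1425773), ("Dublin", 1437588), ("Athen", 1447758),
    ("Stockholm", 1458593), ("Helsinki", 1467489), ("Moscow", 1477484),
    ("Kiev", 1485665),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (city TEXT, expenses INTEGER)")
conn.executemany("INSERT INTO cities VALUES (?, ?)", cities)

def neighbours(conn, city, before, after):
    """Rank every row by expenses, then keep the rows around `city`."""
    return conn.execute("""
        WITH ranked AS (
            -- position = number of rows with expenses at or below this row's
            SELECT (SELECT COUNT(*) FROM cities AS c2
                    WHERE c2.expenses <= c.expenses) AS position,
                   c.city, c.expenses
            FROM cities AS c
        )
        SELECT r.position, r.city, r.expenses
        FROM ranked AS r
        JOIN ranked AS target ON target.city = ?
        WHERE r.position BETWEEN target.position - ? AND target.position + ?
        ORDER BY r.position
    """, (city, before, after)).fetchall()

window = neighbours(conn, "Rome", 2, 2)
for row in window:
    print(row)
```

Calling `neighbours(conn, "Rome", 6, 6)` would give the 6-before/6-after window from the question; a window of 2 is used here only to keep the output short.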
If your position is going to be a continuous, unique number, you can use a subquery in the WHERE condition.
SELECT `position`, `city`, `expenses`
FROM table_name
WHERE `position` > (
SELECT `position`-7
FROM table_name
WHERE `city`='Rome'
)
ORDER BY `position`
LIMIT 13
PS: I am not an expert in SQL. There may be better more efficient ways.
I haven't tried it, but this should work: You get the positions with a variable you increment while selecting in MySQL. Then you would have to select this "temporary table" twice; once to find Rome, once to find all 13 records:
select position, city, expenses
from
(
select @rowno := @rowno + 1 as position, city, expenses
from cities
cross join (select @rowno := 0) init
order by expenses
) ranked
where abs(position -
(
select position
from
(
select @rowno2 := @rowno2 + 1 as position, city, expenses
from cities
cross join (select @rowno2 := 0) init2
order by expenses
) ranked2
where city = 'Rome'
)
) <= 6;
SELECT rank as position,city,expenses FROM
(SELECT @rownum := @rownum + 1 AS position, city, expenses, FIND_IN_SET( position, (
SELECT GROUP_CONCAT( position
ORDER BY position ASC )
FROM test )
) AS rank
FROM test,(SELECT @rownum := 0) r
HAVING rank BETWEEN(SELECT FIND_IN_SET( position, (
SELECT GROUP_CONCAT( position
ORDER BY position ASC )
FROM test )
)-6 AS rank
FROM test
WHERE expenses=1425773)
AND
(SELECT FIND_IN_SET( position, (
SELECT GROUP_CONCAT( position
ORDER BY position ASC )
FROM test )
)+6 AS rank
FROM test
WHERE expenses=1425773))x
FIDDLE
I have a table that looks like this
id | kID
--------------------------
0 | 3
1 | 6
2 | 7
3 | 6
4 | 7
5 | 5
What I want to do is find the number of rows where the kID occurs only once. So in this case the value of the variable should be 2, because kIDs 3 and 5 occur only once, so I'm trying to count those while ignoring everything else. I am really stumped; thanks for any help.
This will show kIDs that occur only once:
SELECT kID, COUNT(kID)
FROM table
GROUP BY kID
HAVING COUNT(kID) < 2
Result
| KID | COUNT(KID) |
--------------------
| 3 | 1 |
| 5 | 1 |
See the demo
Then to get the total count of those:
SELECT Count(*) AS count
FROM (SELECT kid,
Count(kid)
FROM tbl
GROUP BY kid
HAVING Count(kid) < 2) a
Result
| COUNT |
---------
| 2 |
See the demo
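To check both steps against the sample data from the question, here is a small sketch using Python's sqlite3 module. The table name `tbl` matches the second query above; the variable names are just for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, kID INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [(0, 3), (1, 6), (2, 7), (3, 6), (4, 7), (5, 5)],
)

# Step 1: the kIDs that occur exactly once.
singles = conn.execute("""
    SELECT kID
    FROM tbl
    GROUP BY kID
    HAVING COUNT(*) = 1
    ORDER BY kID
""").fetchall()

# Step 2: count those groups by wrapping step 1 in a derived table.
(count,) = conn.execute("""
    SELECT COUNT(*)
    FROM (SELECT kID
          FROM tbl
          GROUP BY kID
          HAVING COUNT(*) = 1) AS a
""").fetchone()

print(singles, count)
```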
Try this
SELECT
id,
count(kID) as `Count`
FROM mytable as t
GROUP BY kID
HAVING Count = 1
How about
select count(*) from
(select kid, count(*) from table group by kid having count(*) = 1)
You could do the following:
select count(*) from
(
select kID, COUNT(*) [c] from tableName
group by kID
) t
where t.c = 1
SELECT kID,
COUNT(kID)
FROM tableName
GROUP BY kID
HAVING COUNT(kID) = 1
You could do it with a sub-select. This should work, though might not be extremely efficient:
SELECT COUNT(1) FROM (SELECT kID, COUNT(1) AS cnt FROM tableName
GROUP BY kID
HAVING COUNT(1) = 1) AS t
One more way to do it. It will work as long as the (id) is the primary key of the table or there is a unique constraint on (kid, id):
SELECT COUNT(*) AS cnt
FROM
( SELECT NULL
FROM tableX
GROUP BY kid
HAVING MIN(id) = MAX(id)
) AS g ;
Tested at SQL-Fiddle
An index on (kid, id) will improve efficiency - and only one COUNT() will be done, not 2.
This is the raw data, and I want to rank the rows according to score (count(tbl_1.id)).
[tbl_1]
===========
id | name
===========
1 | peter
2 | jane
1 | peter
2 | jane
3 | harry
3 | harry
3 | harry
3 | harry
4 | ron
So I make a temporary table (tbl_2) to count the score for each id.
SELECT id, name, COUNT( id ) AS score
FROM tbl_1
GROUP BY id
ORDER BY score DESC
LIMIT 0, 30;
Then the result is:
[tbl_2]
===================
id | name | score
===================
3 | harry | 4
1 | peter | 2
2 | jane | 2
4 | ron | 1
Then I query this:
SELECT v1.id, v1.name, v1.score, COUNT( v2.score ) AS rank
FROM votes v1
JOIN votes v2 ON v1.score < v2.score
OR (
v1.score = v2.score
AND v1.id = v2.id
)
GROUP BY v1.id, v1.score
ORDER BY rank ASC, v1.id ASC
LIMIT 0, 30;
Then the result is:
==========================
id | name | score | rank
==========================
3 | harry | 4 | 1
1 | peter | 2 | 2
2 | jane | 2 | 2
4 | ron | 1 | 4
Is it possible to do this in one transaction (query) nicely?
Yes, it's possible to do this in a single query. But it's a total hairball in MySQL, because MySQL doesn't have a simple ROWNUM operation, and you need one for the rank computation.
Here's your vote query with the rank shown. The @ranka variable is used to number the rows.
SELECT @ranka:=@ranka+1 AS rank, id, name, score
FROM
(
SELECT id,
name,
COUNT( id ) AS score
FROM tbl_1
GROUP BY id
ORDER BY score DESC, id
) votes,
(SELECT @ranka:=0) r
As you have discovered already, you need to self-join this thing to get a proper ranking (which handles ties correctly). So, if you take your query and replace the two references to your votes table each with their own version of this subquery, you get what you need.
SELECT v1.id,
v1.name,
v1.score,
COUNT( v2.score ) AS rank
FROM (
SELECT @ranka:=@ranka+1 AS rank,
id,
name,
score
FROM
(
SELECT id,
name,
COUNT( id ) AS score
FROM tbl_1
GROUP BY id
ORDER BY score DESC, name
) votes,
(SELECT @ranka:=0) r) v1
JOIN (
SELECT @rankb:=@rankb+1 AS rank,
id,
name,
score
FROM
(
SELECT id,
name,
COUNT( id ) AS score
FROM tbl_1
GROUP BY id
ORDER BY score DESC, name
) votes,
(SELECT @rankb:=0) r) v2
ON (v1.score < v2.score) OR
(v1.score = v2.score AND v1.id = v2.id)
GROUP BY v1.id, v1.score
ORDER BY v1.rank ASC, v1.id ASC
LIMIT 0, 30;
Told you it's a hairball. Notice that you need different @ranka and @rankb variables in the two versions of the subquery that you're self-joining, to make the row numbering work correctly: these variables have connection scope, not subquery scope, in MySQL.
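If the engine supports common table expressions, the same tie-aware rank can be written without session variables by counting how many rows score strictly higher. This is only a sketch, run through Python's sqlite3 module against the question's data; the `votes` CTE name and the `score_rank` alias are choices made here, and on MySQL it would need version 8.0 or later for the WITH clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_1 (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO tbl_1 VALUES (?, ?)",
    [(1, "peter"), (2, "jane"), (1, "peter"), (2, "jane"),
     (3, "harry"), (3, "harry"), (3, "harry"), (3, "harry"),
     (4, "ron")],
)

# votes: one row per id with its score.
# score_rank: 1 + number of rows with a strictly higher score, so ties share
# a rank and the next rank is skipped (1, 2, 2, 4).
ranked = conn.execute("""
    WITH votes AS (
        SELECT id, name, COUNT(*) AS score
        FROM tbl_1
        GROUP BY id, name
    )
    SELECT v1.id, v1.name, v1.score,
           1 + (SELECT COUNT(*)
                FROM votes AS v2
                WHERE v2.score > v1.score) AS score_rank
    FROM votes AS v1
    ORDER BY score_rank, v1.id
""").fetchall()

for row in ranked:
    print(row)
```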
http://sqlfiddle.com/#!2/c5350/1/0 shows this working.
Edit: It's far easier to do this using PostgreSQL's RANK() function.
SELECT name, votes, rank() over (ORDER BY votes)
FROM (
SELECT name, count(id) votes
FROM tab
GROUP BY name
)x
http://sqlfiddle.com/#!1/94cca/18/0