MySQL+PHP: optimise ranking query and count subquery

This is my raw data. I want to rank the rows by score, i.e. COUNT(tbl_1.id).
[tbl_1]
===========
id | name
===========
1 | peter
2 | jane
1 | peter
2 | jane
3 | harry
3 | harry
3 | harry
3 | harry
4 | ron
So I build a temporary table (tbl_2) that counts the score for each id:
SELECT id, name, COUNT( id ) AS score
FROM tbl_1
GROUP BY id
ORDER BY score DESC
LIMIT 0, 30;
The result is:
[tbl_2]
===================
id | name | score
===================
3 | harry | 4
1 | peter | 2
2 | jane | 2
4 | ron | 1
Then I run this query:
SELECT v1.id, v1.name, v1.score, COUNT( v2.score ) AS rank
FROM votes v1
JOIN votes v2 ON v1.score < v2.score
OR (
v1.score = v2.score
AND v1.id = v2.id
)
GROUP BY v1.id, v1.score
ORDER BY rank ASC, v1.id ASC
LIMIT 0, 30;
The result is:
==========================
id | name | score | rank
==========================
3 | harry | 4 | 1
1 | peter | 2 | 2
2 | jane | 2 | 2
4 | ron | 1 | 4
Is it possible to do this neatly in a single query?

Yes, it's possible to do this in a single query. But it's a total hairball in MySQL, because MySQL (before version 8.0, which added window functions) doesn't have a simple ROWNUM operation, and you need one for the rank computation.
Here's your vote query with the rank shown. The @ranka variable is used to number the rows.
SELECT @ranka:=@ranka+1 AS rank, id, name, score
FROM
(
SELECT id,
name,
COUNT( id ) AS score
FROM tbl_1
GROUP BY id
ORDER BY score DESC, id
) votes,
(SELECT @ranka:=0) r
As you have discovered already, you need to self-join this thing to get a proper ranking (which handles ties correctly). So, if you take your query and replace the two references to your votes table each with their own version of this subquery, you get what you need.
SELECT v1.id,
v1.name,
v1.score,
COUNT( v2.score ) AS rank
FROM (
SELECT @ranka:=@ranka+1 AS rank,
id,
name,
score
FROM
(
SELECT id,
name,
COUNT( id ) AS score
FROM tbl_1
GROUP BY id
ORDER BY score DESC, name
) votes,
(SELECT @ranka:=0) r) v1
JOIN (
SELECT @rankb:=@rankb+1 AS rank,
id,
name,
score
FROM
(
SELECT id,
name,
COUNT( id ) AS score
FROM tbl_1
GROUP BY id
ORDER BY score DESC, name
) votes,
(SELECT @rankb:=0) r) v2
ON (v1.score < v2.score) OR
(v1.score = v2.score AND v1.id = v2.id)
GROUP BY v1.id, v1.score
ORDER BY v1.rank ASC, v1.id ASC
LIMIT 0, 30;
Told you it's a hairball. Notice that you need different @ranka and @rankb variables in the two versions of the subquery that you're self-joining, to make the row numbering work correctly: these variables have connection scope, not subquery scope, in MySQL.
http://sqlfiddle.com/#!2/c5350/1/0 shows this working.
Edit: It's far easier to do this using PostgreSQL's RANK() window function.
SELECT name, votes, rank() over (ORDER BY votes DESC)
FROM (
SELECT name, count(id) votes
FROM tab
GROUP BY name
) x
http://sqlfiddle.com/#!1/94cca/18/0
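For comparison, here is a minimal sketch of the count-then-rank idea run end to end as one statement. It is not the MySQL query above: it assumes Python's built-in sqlite3 module with an in-memory copy of tbl_1, it uses no session variables at all (the self-join alone produces the tie-aware rank), and the alias rnk is my own name, chosen because rank is a reserved word in newer MySQL versions.

```python
import sqlite3

# In-memory copy of the question's tbl_1 (one row per vote).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_1 (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO tbl_1 VALUES (?, ?)",
    [(1, "peter"), (2, "jane"), (1, "peter"), (2, "jane"),
     (3, "harry"), (3, "harry"), (3, "harry"), (3, "harry"), (4, "ron")],
)

# The score subquery appears twice and is joined to itself.
# For each v1 row we count the rows with a strictly higher score,
# plus the row itself, which yields 1, 2, 2, 4 style ranks for ties.
RANK_SQL = """
SELECT v1.id, v1.name, v1.score, COUNT(v2.id) AS rnk
FROM (SELECT id, name, COUNT(id) AS score
      FROM tbl_1 GROUP BY id, name) AS v1
JOIN (SELECT id, name, COUNT(id) AS score
      FROM tbl_1 GROUP BY id, name) AS v2
  ON v1.score < v2.score
  OR (v1.score = v2.score AND v1.id = v2.id)
GROUP BY v1.id, v1.name, v1.score
ORDER BY rnk ASC, v1.id ASC
"""

ranked = conn.execute(RANK_SQL).fetchall()
for row in ranked:
    print(row)
```

The self-join compares every scored row with every other one, so this is only reasonable for small result sets. A window function such as RANK() is the better choice where the database supports it.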

Related

Need help in database query for contest winners

I'm creating a web application where the user will be able to participate in a contest and based on their rank, the user will be rewarded.
(table name contest)
id | contest_name | status
------------------------------------
1 | Test Contest | active
(table participants)
id | user_id | contest_id | score | time_taken
----------------------------------------------------------
1 | 123 | 1 | 10 | 2332 --> in milliseconds
My contest table prize distribution (table name price_distribution)
id | contest_id | rank_start | rank_end | price
-------------------------------------------------------
1 | 1 | 1 | 10 | 50
-------------------------------------------------------
2 | 1 | 11 | 20 | 25
-------------------------------------------------------
Meaning that users who rank between 1 and 10 get 50 points, users who rank between 11 and 20 get 25 points, and so on.
I've used this query to get the list of all users in the contest with their rank.
SELECT participants.score,
participants.time_taken,
contest.name as contest_name,
user.name as username,
user.image as userimage,
FIND_IN_SET( participants.score, (SELECT GROUP_CONCAT( participants.score ORDER BY participants.score DESC, participants.time_taken ASC )
FROM participants )) AS rank
FROM participants
LEFT JOIN contest
ON participants.contest_id = contest.id
LEFT JOIN user ON user.id = participants.user_id
WHERE participants.contest_id = '1'
ORDER BY participants.score DESC, participants.time_taken ASC
LIMIT 50
The above query returns this:
score | time_taken | contest_name | username | userimage | rank
-----------------------------------------------------------------------
10 | 2356 | test_contest | abc | image | 1
-----------------------------------------------------------------------
The above query only lists the users by rank and does nothing else.
I want to reward each user based on their rank. How can I achieve this?
I'm looking for a query that, when executed, rewards each user according to their rank, taking the value from the prize distribution table.
Any help will be greatly appreciated.
You can join the table price_distribution based on the user's rank. The rank alias cannot be referenced in the ON clause of the same SELECT, so compute it in a derived table first and join against that:
SELECT r.score,
r.time_taken,
r.contest_name,
r.username,
r.userimage,
r.`rank`,
pd.price
FROM (
SELECT participants.contest_id,
participants.score,
participants.time_taken,
contest.name as contest_name,
user.name as username,
user.image as userimage,
FIND_IN_SET( participants.score, (SELECT GROUP_CONCAT( participants.score ORDER BY participants.score DESC, participants.time_taken ASC )
FROM participants )) AS `rank`
FROM participants
LEFT JOIN contest ON participants.contest_id = contest.id
LEFT JOIN user ON user.id = participants.user_id
WHERE participants.contest_id = '1'
) r
LEFT JOIN price_distribution pd
ON r.contest_id = pd.contest_id AND r.`rank` BETWEEN pd.rank_start AND pd.rank_end
ORDER BY r.score DESC, r.time_taken ASC
LIMIT 50
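To make the rank-to-prize lookup concrete, here is a small sketch under stated assumptions: it runs on Python's built-in sqlite3 rather than MySQL, the participant rows and the two prize bands are made-up sample data, and the rank is computed with a correlated COUNT (higher score first, then shorter time_taken) instead of FIND_IN_SET. The column name user_rank is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE participants (
    id INTEGER PRIMARY KEY, user_id INTEGER, contest_id INTEGER,
    score INTEGER, time_taken INTEGER);
CREATE TABLE price_distribution (
    id INTEGER PRIMARY KEY, contest_id INTEGER,
    rank_start INTEGER, rank_end INTEGER, price INTEGER);
-- Hypothetical sample data: four players, prizes for ranks 1-2 and 3.
INSERT INTO participants (user_id, contest_id, score, time_taken) VALUES
    (123, 1, 10, 2332), (124, 1, 10, 2100), (125, 1, 8, 1500), (126, 1, 5, 900);
INSERT INTO price_distribution (contest_id, rank_start, rank_end, price) VALUES
    (1, 1, 2, 50), (1, 3, 3, 25);
""")

# Step 1 (derived table r): rank = 1 + number of players who did better.
# Step 2: match that rank against the [rank_start, rank_end] bands.
REWARD_SQL = """
SELECT r.user_id, r.user_rank, pd.price
FROM (
    SELECT p.user_id, p.contest_id,
           1 + (SELECT COUNT(*) FROM participants q
                WHERE q.contest_id = p.contest_id
                  AND (q.score > p.score
                       OR (q.score = p.score
                           AND q.time_taken < p.time_taken))) AS user_rank
    FROM participants p
    WHERE p.contest_id = ?
) AS r
LEFT JOIN price_distribution pd
       ON pd.contest_id = r.contest_id
      AND r.user_rank BETWEEN pd.rank_start AND pd.rank_end
ORDER BY r.user_rank
"""

rewards = conn.execute(REWARD_SQL, (1,)).fetchall()
for row in rewards:
    print(row)
```

Because of the LEFT JOIN, a player whose rank falls outside every band still appears, with a NULL (None) price.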

postgreSQL ranking query with the given user_id

I am trying to get the rank of a user from two parameters: donation sum and total donor count.
My rank formula is: rank of [(rank of donation_sum + rank of donor_count) / 2]
Sample table:
donation_id | user_id | donor_id | donation_sum
-----------------------------------------------
1 | 1 | 1 | 10
2 | 1 | 2 | 5
3 | 2 | 3 | 10
4 | 3 | 1 | 50
...
As you can see, some donors donate to several users, so I used sum(donation_sum) and count(distinct(donation_id)) to get exact rankings.
I can get the two rankings (by donation sum and by total donor count) separately with two SQL queries, but what I need is a single user's rank, computed with the formula above for a given user_id, in PostgreSQL 9.4.
Do you have a solution for this? I will use the SQL query in the Yii2 PHP framework.
Thanks
Edit:
We added donation_date to tbl_donation and modified the actual query as below.
Is this the correct way to use WHERE donation_date?
with list as (
select
s.runner_id, sum, count, rank_sum, rank_count,
(rank_sum+ rank_count)::float/ 2 as rank_avg,
row_number() over (order by rank_sum + rank_count, rank_sum) as rank
from (
select *, rank() over (order by sum desc) rank_sum
from (
select runner_id, sum(donation_sum)
from tbl_donation
where donation_date >= '2015-01-01'
group by 1
) s
) s
join (
select *, rank() over (order by count desc) rank_count
from (
select runner_id, count(distinct(donator_id))
from tbl_donation
where donation_date >= '2015-01-01'
group by 1
) c
) c
using (runner_id)
)
select rank
from list
where runner_id = 251;
Make two rankings in separate subqueries:
select
s.user_id, sum, count, rank_sum, rank_count,
(rank_sum+ rank_count)::float/ 2 as rank_avg,
row_number() over (order by rank_sum + rank_count, rank_sum) as rank
from (
select *, rank() over (order by sum desc) rank_sum
from (
select user_id, sum(donation_sum)
from donations
group by 1
) s
) s
join (
select *, rank() over (order by count desc) rank_count
from (
select user_id, count(distinct(donation_id))
from donations
group by 1
) c
) c
using (user_id);
user_id | sum | count | rank_sum | rank_count | rank_avg | rank
---------+-----+-------+----------+------------+----------+------
3 | 100 | 1 | 1 | 2 | 1.5 | 1
1 | 30 | 2 | 2 | 1 | 1.5 | 2
2 | 20 | 1 | 3 | 2 | 2.5 | 3
(3 rows)
If you want to select the rank for a single user_id, wrap the query in a WITH clause, e.g.:
with list as (
-- place here the above query
)
select rank
from list
where user_id = 2;
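The two-rankings idea can be tried without a PostgreSQL server. The sketch below is an approximation, not the query above: it assumes Python's built-in sqlite3, uses the four sample rows from the question, counts distinct donor_id values (my reading of "donor count"), and replaces rank() with an equivalent correlated COUNT so it does not depend on window-function support. The CTE names totals and ranked are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE donations (donation_id INTEGER, user_id INTEGER, "
    "donor_id INTEGER, donation_sum INTEGER)"
)
# The four visible sample rows from the question.
conn.executemany(
    "INSERT INTO donations VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 10), (2, 1, 2, 5), (3, 2, 3, 10), (4, 3, 1, 50)],
)

# totals: one row per user with both measures.
# ranked: competition rank for each measure (1 + number of better rows).
# final select: average of the two ranks, best average first.
COMBINED_SQL = """
WITH totals AS (
    SELECT user_id,
           SUM(donation_sum)        AS total,
           COUNT(DISTINCT donor_id) AS donors
    FROM donations
    GROUP BY user_id
),
ranked AS (
    SELECT t.user_id,
           1 + (SELECT COUNT(*) FROM totals x WHERE x.total  > t.total)  AS rank_sum,
           1 + (SELECT COUNT(*) FROM totals x WHERE x.donors > t.donors) AS rank_count
    FROM totals t
)
SELECT user_id, rank_sum, rank_count,
       (rank_sum + rank_count) / 2.0 AS rank_avg
FROM ranked
ORDER BY rank_avg, user_id
"""

combined = conn.execute(COMBINED_SQL).fetchall()
for row in combined:
    print(row)
```

Users 1 and 3 tie on the averaged rank here, so a final ordering needs an explicit tie-breaker. This sketch uses user_id, which is an arbitrary choice.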

Sql count Average Limit By 2 each row [duplicate]

The following is the simplest possible example, though any solution should be able to scale to however many n top results are needed:
Given a table like that below, with person, group, and age columns, how would you get the 2 oldest people in each group? (Ties within groups should not yield more results, but give the first 2 in alphabetical order)
+--------+-------+-----+
| Person | Group | Age |
+--------+-------+-----+
| Bob | 1 | 32 |
| Jill | 1 | 34 |
| Shawn | 1 | 42 |
| Jake | 2 | 29 |
| Paul | 2 | 36 |
| Laura | 2 | 39 |
+--------+-------+-----+
Desired result set:
+--------+-------+-----+
| Shawn | 1 | 42 |
| Jill | 1 | 34 |
| Laura | 2 | 39 |
| Paul | 2 | 36 |
+--------+-------+-----+
NOTE: This question builds on a previous one (Get records with max value for each group of grouped SQL results) about getting a single top row from each group, which received a great MySQL-specific answer from @Bohemian:
select *
from (select * from mytable order by `Group`, Age desc, Person) x
group by `Group`
Would love to be able to build off this, though I don't see how.
Here is one way to do this, using UNION ALL (see SQL Fiddle with Demo). This works with two groups; if you have more than two groups, you need to specify the group number and add a query for each group:
(
select *
from mytable
where `group` = 1
order by age desc
LIMIT 2
)
UNION ALL
(
select *
from mytable
where `group` = 2
order by age desc
LIMIT 2
)
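Writing one branch per group by hand does not scale, but the branches can be generated. The following is a rough sketch, assuming Python's built-in sqlite3 and a hypothetical people table whose grp column stands in for `group`. It builds one LIMIT 2 branch per distinct group and glues them together with UNION ALL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (person TEXT, grp INTEGER, age INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [("Bob", 1, 32), ("Jill", 1, 34), ("Shawn", 1, 42),
     ("Jake", 2, 29), ("Paul", 2, 36), ("Laura", 2, 39)],
)

# Discover the groups first, then emit one top-2 branch per group.
groups = [g for (g,) in conn.execute("SELECT DISTINCT grp FROM people ORDER BY grp")]

# Each branch wraps its ORDER BY / LIMIT in a subquery, because SQLite
# does not allow them directly on the members of a compound SELECT.
branch = (
    "SELECT person, grp, age FROM ("
    "SELECT person, grp, age FROM people "
    "WHERE grp = ? ORDER BY age DESC, person LIMIT 2)"
)
sql = " UNION ALL ".join([branch] * len(groups)) + " ORDER BY grp, age DESC, person"

# One bound parameter per branch, in the same order as the branches.
top2 = conn.execute(sql, groups).fetchall()
for row in top2:
    print(row)
```

This costs one extra query to list the groups and produces a statement that grows with the number of groups, so it suits a handful of groups better than thousands.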
There are a variety of ways to do this; see this article to determine the best route for your situation:
http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/
Edit:
This might work for you too. It generates a row number for each record. Using an example from the link above, it returns only those records with a row number less than or equal to 2:
select person, `group`, age
from
(
select person, `group`, age,
(@num:=if(@group = `group`, @num +1, if(@group := `group`, 1, 1))) row_number
from test t
CROSS JOIN (select @num:=0, @group:=null) c
order by `Group`, Age desc, person
) as x
where x.row_number <= 2;
See Demo
In other databases you can do this using ROW_NUMBER. MySQL before 8.0 doesn't support ROW_NUMBER, but you can use variables to emulate it:
SELECT
person,
groupname,
age
FROM
(
SELECT
person,
groupname,
age,
@rn := IF(@prev = groupname, @rn + 1, 1) AS rn,
@prev := groupname
FROM mytable
JOIN (SELECT @prev := NULL, @rn := 0) AS vars
ORDER BY groupname, age DESC, person
) AS T1
WHERE rn <= 2
See it working online: sqlfiddle
Edit: I just noticed that bluefeet posted a very similar answer: +1 to him. However, this answer has two small advantages:
It is a single query. The variables are initialized inside the SELECT statement.
It handles ties as described in the question (alphabetical order by name).
So I'll leave it here in case it can help someone.
Try this:
SELECT a.person, a.`group`, a.age FROM person AS a WHERE
(SELECT COUNT(*) FROM person AS b
WHERE b.`group` = a.`group` AND b.age >= a.age) <= 2
ORDER BY a.`group` ASC, a.age DESC
DEMO
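A counting subquery like this one can also honour the question's tie rule (first two in alphabetical order) if the comparison includes the name. Here is a sketch of that variation, assuming Python's built-in sqlite3, a hypothetical people table with a grp column in place of `group`, and one extra made-up row (Joe, tied with Paul at 36) to exercise the tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (person TEXT, grp INTEGER, age INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [("Bob", 1, 32), ("Jill", 1, 34), ("Shawn", 1, 42),
     ("Jake", 2, 29), ("Paul", 2, 36), ("Laura", 2, 39),
     ("Joe", 2, 36)],  # hypothetical extra row: same age as Paul
)

# A row is kept when fewer than 2 rows in its group come before it.
# "Before" means older, or the same age with an alphabetically
# earlier name, so ties never produce more than 2 rows per group.
TOP2_SQL = """
SELECT a.person, a.grp, a.age
FROM people AS a
WHERE (SELECT COUNT(*)
       FROM people AS b
       WHERE b.grp = a.grp
         AND (b.age > a.age
              OR (b.age = a.age AND b.person < a.person))) < 2
ORDER BY a.grp, a.age DESC, a.person
"""

top2_ties = conn.execute(TOP2_SQL).fetchall()
for row in top2_ties:
    print(row)
```

Joe is kept and Paul is dropped because only the name order separates them. Changing the 2 to n gives the top n per group.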
How about using self-joining:
CREATE TABLE mytable (person, groupname, age);
INSERT INTO mytable VALUES('Bob',1,32);
INSERT INTO mytable VALUES('Jill',1,34);
INSERT INTO mytable VALUES('Shawn',1,42);
INSERT INTO mytable VALUES('Jake',2,29);
INSERT INTO mytable VALUES('Paul',2,36);
INSERT INTO mytable VALUES('Laura',2,39);
SELECT a.* FROM mytable AS a
LEFT JOIN mytable AS a2
ON a.groupname = a2.groupname AND a.age <= a2.age
GROUP BY a.person
HAVING COUNT(*) <= 2
ORDER BY a.groupname, a.age DESC;
gives me:
a.person a.groupname a.age
---------- ----------- ----------
Shawn 1 42
Jill 1 34
Laura 2 39
Paul 2 36
I was strongly inspired by the answer from Bill Karwin to Select top 10 records for each category
Also, I'm using SQLite, but this should work on MySQL.
Another thing: in the above, I replaced the group column with a groupname column for convenience.
Edit:
Following up on the OP's comment regarding missing tie results, I built on snuffin's answer to show all the ties. This means that if the last places are tied, more than 2 rows can be returned, as shown below:
.headers on
.mode column
CREATE TABLE foo (person, groupname, age);
INSERT INTO foo VALUES('Paul',2,36);
INSERT INTO foo VALUES('Laura',2,39);
INSERT INTO foo VALUES('Joe',2,36);
INSERT INTO foo VALUES('Bob',1,32);
INSERT INTO foo VALUES('Jill',1,34);
INSERT INTO foo VALUES('Shawn',1,42);
INSERT INTO foo VALUES('Jake',2,29);
INSERT INTO foo VALUES('James',2,15);
INSERT INTO foo VALUES('Fred',1,12);
INSERT INTO foo VALUES('Chuck',3,112);
SELECT a.person, a.groupname, a.age
FROM foo AS a
WHERE a.age >= (SELECT MIN(b.age)
FROM foo AS b
WHERE (SELECT COUNT(*)
FROM foo AS c
WHERE c.groupname = b.groupname AND c.age >= b.age) <= 2
GROUP BY b.groupname)
ORDER BY a.groupname ASC, a.age DESC;
gives me:
person groupname age
---------- ---------- ----------
Shawn 1 42
Jill 1 34
Laura 2 39
Paul 2 36
Joe 2 36
Chuck 3 112
Snuffin's solution seems quite slow when you have many rows, and the Mark Byers/Rick James and Bluefeet solutions don't work in my environment (MySQL 5.6) because the ORDER BY is applied after the SELECT list is evaluated. Here is a variant of the Mark Byers/Rick James solution that fixes this issue (with an extra nested select):
select person, groupname, age
from
(
select person, groupname, age,
(@rn:=if(@prev = groupname, @rn +1, 1)) as rownumb,
@prev:= groupname
from
(
select person, groupname, age
from persons
order by groupname , age desc, person
) as sortedlist
JOIN (select @prev:=NULL, @rn :=0) as vars
) as groupedlist
where rownumb<=2
order by groupname , age desc, person;
I tried a similar query on a table with 5 million rows, and it returned results in less than 3 seconds.
If the other answers are not fast enough, give this code a try:
SELECT
province, n, city, population
FROM
( SELECT @prev := '', @n := 0 ) init
JOIN
( SELECT @n := if(province != @prev, 1, @n + 1) AS n,
@prev := province,
province, city, population
FROM Canada
ORDER BY
province ASC,
population DESC
) x
WHERE n <= 3
ORDER BY province, n;
Output:
+---------------------------+------+------------------+------------+
| province | n | city | population |
+---------------------------+------+------------------+------------+
| Alberta | 1 | Calgary | 968475 |
| Alberta | 2 | Edmonton | 822319 |
| Alberta | 3 | Red Deer | 73595 |
| British Columbia | 1 | Vancouver | 1837970 |
| British Columbia | 2 | Victoria | 289625 |
| British Columbia | 3 | Abbotsford | 151685 |
| Manitoba | 1 | ...
Check this out:
SELECT
p.Person,
p.`Group`,
p.Age
FROM
people p
INNER JOIN
(
SELECT MAX(Age) AS Age, `Group` FROM people GROUP BY `Group`
UNION
SELECT MAX(p3.Age) AS Age, p3.`Group` FROM people p3 INNER JOIN (SELECT MAX(Age) AS Age, `Group` FROM people GROUP BY `Group`) p4 ON p3.Age < p4.Age AND p3.`Group` = p4.`Group` GROUP BY `Group`
) p2 ON p.Age = p2.Age AND p.`Group` = p2.`Group`
ORDER BY
`Group`,
Age DESC,
Person;
SQL Fiddle: http://sqlfiddle.com/#!2/cdbb6/15
WITH cte_window AS (
SELECT movie_name,director_id,release_date,
ROW_NUMBER() OVER( PARTITION BY director_id ORDER BY release_date DESC) r
FROM movies
)
SELECT * FROM cte_window WHERE r <= <n>;
The above query returns the latest n movies for each director.
I wanted to share this because I spent a long time searching for an easy way to implement this in a Java program I'm working on. This doesn't quite give the output you're looking for, but it's close. The MySQL function GROUP_CONCAT() worked really well for specifying how many results to return in each group. Using LIMIT or any of the other fancy ways of trying to do this with COUNT didn't work for me. So if you're willing to accept a modified output, it's a great solution. Let's say I have a table called 'student' with student ids, their gender, and GPA, and I want the top 5 GPAs for each gender. Then I can write the query like this:
SELECT sex, SUBSTRING_INDEX(GROUP_CONCAT(cast(gpa AS char ) ORDER BY gpa desc), ',',5)
AS subcategories FROM student GROUP BY sex;
Note that the parameter '5' tells SUBSTRING_INDEX how many of the concatenated entries to keep in each row.
The output would look something like this:
+--------+----------------+
| Male | 4,4,4,4,3.9 |
| Female | 4,4,3.9,3.9,3.8|
+--------+----------------+
You can also change the ORDER BY expression to order them a different way. So if I had the student's age, I could replace 'gpa desc' with 'age desc' and it would work. You can also add columns to the GROUP BY clause to get more columns in the output. This is just a way I found that is pretty flexible and works well if you are OK with just listing results.
In SQL Server, row_number() is a powerful function that gets the result easily, as below:
select Person,[group],age
from
(
select * ,row_number() over(partition by [group] order by age desc) rn
from mytable
) t
where rn <= 2
There is a really nice answer to this problem at MySQL - How To Get Top N Rows per Each Group
Based on the solution in the referenced link, your query would be like:
SELECT Person, `Group`, Age
FROM
(SELECT Person, `Group`, Age,
@group_rank := IF(@current_group = `Group`, @group_rank + 1, 1) AS group_rank,
@current_group := `Group`
FROM `your_table`
ORDER BY `Group`, Age DESC
) ranked
WHERE group_rank <= `n`
ORDER BY `Group`, Age DESC;
where n is the top n and your_table is the name of your table.
I think the explanation in the reference is really clear. For quick reference I will copy and paste it here:
Currently MySQL does not support ROW_NUMBER() function that can assign
a sequence number within a group, but as a workaround we can use MySQL
session variables.
These variables do not require declaration, and can be used in a query
to do calculations and to store intermediate results.
@current_country := country This code is executed for each row and
stores the value of country column to @current_country variable.
@country_rank := IF(@current_country = country, @country_rank + 1, 1)
In this code, if @current_country is the same we increment rank,
otherwise set it to 1. For the first row @current_country is NULL, so
rank is also set to 1.
For correct ranking, we need to have ORDER BY country, population DESC
SELECT
p1.Person,
p1.`GROUP`,
p1.Age
FROM
person AS p1
WHERE
(
SELECT
COUNT( DISTINCT ( p2.age ) )
FROM
person AS p2
WHERE
p2.`GROUP` = p1.`GROUP`
AND p2.Age >= p1.Age
) < 2
ORDER BY
p1.`GROUP` ASC,
p1.age DESC
reference leetcode

Count occurrences of distinct values in 2 fields

I am trying to find a MySQL query that will find distinct values in a particular field, count the number of occurrences of that value in 2 fields (1_user, 2_user) and then order the results by the count.
example db
+------+-----------+-----------+
| id | 1_user | 2_user |
+------+-----------+-----------+
| 1 | 2 | 1 |
| 2 | 3 | 2 |
| 3 | 8 | 7 |
| 4 | 1 | 8 |
| 5 | 2 | 8 |
| 6 | 3 | 8 |
+------+-----------+-----------+
expected result
user count
----- -----
8 4
2 3
3 2
1 2
The Query
SELECT user, count(*) AS count
FROM
(
SELECT 1_user AS USER FROM test
UNION ALL
SELECT 2_user FROM test
) AS all_users
GROUP BY user
ORDER BY count DESC
Explanation
List all the users in the first column.
SELECT 1_user AS USER FROM test
Combine them with the users from the second column.
UNION ALL
SELECT 2_user FROM test
The trick here is the UNION ALL which preserves duplicate values.
The rest is easy -- select the results you want from the subquery:
SELECT user, count(*) AS count
aggregate by user:
GROUP BY user
and prescribe the order:
ORDER BY count DESC
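The whole pipeline (stack both columns with UNION ALL, then group and count) can be checked against the example data. This is a minimal sketch assuming Python's built-in sqlite3 instead of MySQL. SQLite requires the column names 1_user and 2_user to be double-quoted because they start with a digit, and the aliases usr and cnt are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE test (id INTEGER, "1_user" INTEGER, "2_user" INTEGER)')
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [(1, 2, 1), (2, 3, 2), (3, 8, 7), (4, 1, 8), (5, 2, 8), (6, 3, 8)],
)

# UNION ALL keeps duplicates, so every appearance of a user in either
# column becomes one row of all_users before the GROUP BY counts them.
COUNT_SQL = """
SELECT usr, COUNT(*) AS cnt
FROM (
    SELECT "1_user" AS usr FROM test
    UNION ALL
    SELECT "2_user" FROM test
) AS all_users
GROUP BY usr
ORDER BY cnt DESC, usr DESC
"""

counts = conn.execute(COUNT_SQL).fetchall()
for row in counts:
    print(row)
```

On this data the query also reports user 7 (one occurrence, in 2_user only), which the expected result in the question leaves out. Restricting the output to users that appear in 1_user would need an extra filter.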
SELECT u, count(u) AS cnt
FROM (
SELECT 1_user AS u FROM table
UNION ALL
SELECT 2_user AS u FROM table
) subquery
GROUP BY u
ORDER by cnt DESC
Take the 2 queries:
SELECT COUNT(*) FROM table GROUP BY 1_user
SELECT COUNT(*) FROM table GROUP BY 2_user
Now combine them:
SELECT user, COUNT(*) AS count FROM
((SELECT 1_user as user FROM table)
UNION ALL
(SELECT 2_user as user FROM table)) AS all_users
GROUP BY user ORDER BY count DESC;
I think this is what you are looking for, since your expected result did not include 7:
select usr, count(usr) cnt from
(
select user_1 usr from users
union all
select user_2 usr from users
) u
where u.usr in (select user_1 from users)
group by usr
order by count(u.usr) desc
