Let me explain the whole thing with an example:
id | product | rating
---------------------
1  | 23      | 54
2  | 23      | 54
3  | 23      | 53
4  | 24      | 33
5  | 26      | 22
6  | 24      | 11
Let's say we have multiple ratings for each product and want to display the top three products. We can use an inner, left, or right join to get the product's name from another table, order by rating descending, and set a limit of 3. But this would show us the same product three times, with ratings of 54, 54 and 53.
Is it possible to avoid products with the same id in the result just with SQL?
So the dream output from one SQL query would be:
id | product | rating
---------------------
1  | 23      | 54
4  | 24      | 33
5  | 26      | 22
In words: the top three unique products by rating (and of course only the row of the item with the highest rating -> id 1 or 2 for product 23 and not id 3).
Furthermore, if there are only one or two products (each with multiple ratings), it should return only 1 or 2 rows.
You can do this by taking the maximum rating for each product and choosing the top three:
select product, max(rating) as maxrating
from table t
group by product
order by maxrating desc
limit 3;
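As a quick check of the group by approach against the sample data from the question, here is a minimal sketch. It uses Python's built-in sqlite3 module in place of MySQL, and the table name ratings is an assumption, since the original query only uses a placeholder.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; "ratings" is an assumed table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ratings (id INTEGER PRIMARY KEY, product INTEGER, rating INTEGER)")
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?, ?)",
    [(1, 23, 54), (2, 23, 54), (3, 23, 53), (4, 24, 33), (5, 26, 22), (6, 24, 11)],
)

# One row per product, ranked by that product's best rating.
top_products = conn.execute(
    """
    SELECT product, MAX(rating) AS maxrating
    FROM ratings
    GROUP BY product
    ORDER BY maxrating DESC
    LIMIT 3
    """
).fetchall()
print(top_products)
```

Each product appears once, so product 23 no longer fills all three slots.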
If you want the id for this rating, you can use the substring_index()/group_concat() trick:
select product, max(rating) as maxrating,
substring_index(group_concat(id order by rating desc), ',', 1) as id
from table t
group by product
order by maxrating desc
limit 3;
Alternatively, you can eschew the group by:
select t.*
from table t
where not exists (select 1
from table t2
where t2.product = t.product and
(t2.rating > t.rating or
t2.rating = t.rating and t2.id > t.id
)
)
order by t.rating desc
limit 3;
The complicated where clause is because multiple ratings can be the same.
EDIT:
The not exists version gets, for each product, the row with the highest rating, using the highest id to break ties. The logic is simply saying: "Get me all rows from the table where the product in the row has no other row with a higher rating/id combination". This is an awkward way for people to express "Get the row with the maximum rating", but it turns out to be easier for the database to process. It is typically the most efficient method in MySQL and often the most efficient method in other databases as well, particularly with the right indexes defined.
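To see which rows the not exists query keeps, here is a small sketch run against the question's sample data. It assumes SQLite (via Python's sqlite3) as a stand-in for MySQL and a hypothetical table name ratings.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; "ratings" is an assumed table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ratings (id INTEGER PRIMARY KEY, product INTEGER, rating INTEGER)")
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?, ?)",
    [(1, 23, 54), (2, 23, 54), (3, 23, 53), (4, 24, 33), (5, 26, 22), (6, 24, 11)],
)

# Keep a row only if no other row of the same product beats it
# on rating, or ties on rating with a higher id.
best_rows = conn.execute(
    """
    SELECT t.id, t.product, t.rating
    FROM ratings t
    WHERE NOT EXISTS (
        SELECT 1
        FROM ratings t2
        WHERE t2.product = t.product
          AND (t2.rating > t.rating
               OR (t2.rating = t.rating AND t2.id > t.id))
    )
    ORDER BY t.rating DESC
    LIMIT 3
    """
).fetchall()
print(best_rows)
```

For product 23 the tie between ids 1 and 2 is broken in favour of the higher id, so id 2 is returned. Flipping the comparison to t2.id < t.id would return id 1 instead.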
Use a SELECT DISTINCT query. Check out details here: http://dev.mysql.com/doc/refman/5.6/en/select.html
Related
I'm working on a track and field ranking database in MySQL/PHP5 whereby I'm struggling to find the best way to query results per unique athlete by highest value.
just
SELECT distinct name, event
FROM results
sample database
name | event | result
--------------------------
athlete 1 | 40 | 7.43
athlete 2 | 40 | 7.66
athlete 1 | 40 | 7.33
athlete 1 | 60 | 9.99
athlete 2 | 60 | 10.55
So let's say that in this case I'd like to rank the athletes in the 40m dash event by best performance. I tried:
SELECT distinct name, event
FROM results
WHERE event = 40
ORDER by result DESC
But the distinct only keeps the first performance (7.43) of the athlete, which isn't the best (7.33). Is there an easy way to do this other than creating a temp table with the results ordered first and then performing a select on the temp table?
You might be interested in group by:
SELECT name, min(result) as result
FROM results
WHERE event = 40
GROUP BY name
This gives you the best result per athlete.
As suggested by spencer, you can also order the list by appending this:
ORDER BY min(result) ASC
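Here is a small sketch of the combined query (group by plus the order by) on the sample rows from the question. It runs on SQLite through Python's sqlite3 rather than MySQL, which is an assumption made only to keep the example self-contained.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; data is the sample from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (name TEXT, event INTEGER, result REAL)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [
        ("athlete 1", 40, 7.43),
        ("athlete 2", 40, 7.66),
        ("athlete 1", 40, 7.33),
        ("athlete 1", 60, 9.99),
        ("athlete 2", 60, 10.55),
    ],
)

# Best (lowest) time per athlete in the 40m event, fastest first.
ranking = conn.execute(
    """
    SELECT name, MIN(result) AS result
    FROM results
    WHERE event = 40
    GROUP BY name
    ORDER BY MIN(result) ASC
    """
).fetchall()
print(ranking)
```

Athlete 1 is listed with 7.33 rather than 7.43, which is the behaviour the question was after.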
The problem is that the columns used in the ORDER BY aren't specified in the DISTINCT. To do this, you need to use an aggregate function to sort on, and use a GROUP BY to make the DISTINCT work.
SELECT name, event, MIN(result) AS best_result
FROM results
WHERE event = 40
GROUP BY name, event
ORDER BY best_result ASC
I am trying to get a certain amount of rows of which another amount of rows satisfy a specific condition.
I'll explain.
table 1:
ID | NAME
1 | Thomas
2 | Jason
3 | Oleg
4 | Matt
5 | Sheldon
6 | Jenny
table 2:
ID | ACTIVE
1 | 1
2 | 0
3 | 1
4 | 1
5 | 0
6 | 1
Query:
SELECT tbl_1.ID, tbl_1.NAME, tbl_2.ACTIVE
FROM tbl_1 JOIN tbl_2 ON
tbl_1.ID = tbl_2.ID
WHERE tbl_2.ACTIVE=1
LIMIT 5
In this example I would like to get a minimum of 5 users, of which 3 are active.
Of course the query above will not do the job right, as it limits the total rows to 5. But 3 of the rows in the result (or fewer if no more exist) MUST be active.
The other way I can think of getting this done is a union, but my query is already cumbersome, long and complex.
Any ideas?
Use ORDER BY instead:
SELECT tbl_1.ID, tbl_1.NAME, tbl_2.ACTIVE
FROM tbl_1 JOIN
tbl_2
ON tbl_1.ID = tbl_2.ID
ORDER BY (tbl_2.ACTIVE = 1) DESC
LIMIT 5;
This puts the active users at the top of the list and then fills in the rest with other users.
Note: The ORDER BY clause could simply be ORDER BY tbl_2.ACTIVE DESC. I left the boolean logic so you could see the similarity to the WHERE clause.
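The effect of the ORDER BY can be checked with a short sketch on the two sample tables. It assumes SQLite (through Python's sqlite3) instead of MySQL, and it adds tbl_1.ID as a secondary sort key, which is not in the original query, only so the output order is deterministic.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; tables mirror the question's sample.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_1 (ID INTEGER PRIMARY KEY, NAME TEXT)")
conn.execute("CREATE TABLE tbl_2 (ID INTEGER PRIMARY KEY, ACTIVE INTEGER)")
conn.executemany(
    "INSERT INTO tbl_1 VALUES (?, ?)",
    [(1, "Thomas"), (2, "Jason"), (3, "Oleg"), (4, "Matt"), (5, "Sheldon"), (6, "Jenny")],
)
conn.executemany(
    "INSERT INTO tbl_2 VALUES (?, ?)",
    [(1, 1), (2, 0), (3, 1), (4, 1), (5, 0), (6, 1)],
)

# Active users sort first; inactive users only fill the remaining slots.
# The tbl_1.ID tie-breaker is an addition for reproducible output.
rows = conn.execute(
    """
    SELECT tbl_1.ID, tbl_1.NAME, tbl_2.ACTIVE
    FROM tbl_1 JOIN tbl_2 ON tbl_1.ID = tbl_2.ID
    ORDER BY (tbl_2.ACTIVE = 1) DESC, tbl_1.ID
    LIMIT 5
    """
).fetchall()
print(rows)
```

With four active users in the sample, the fifth slot is filled by the first inactive user.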
The way to get at least x results is to use the count aggregate and the keyword having:
select f1, count(*) records
from yourTable
where whatever
group by f1
having count(*) >= x
I have a list of films that users can rank in order of which they like best using jQuery UI Sortable (all works well). The lower the order number the better the film (1) and the higher (26) the worse it is. The list of films could be endless but is fixed in the database (users can't add more), so the user can only select from x list of films.
Films do not have to be in the users list, if they haven't seen film 5 then it won't get included (this may be compounding the problem).
Currently this is stored in the table:
film_id | user_id | order
4 2 3
5 3 3
6 2 1
7 2 2
7 3 1
8 3 2
What I want, and don't know where to start with, is an overall 'Top 10' style list, i.e. film 7 is the most popular because it appears higher up people's lists and is in more lists. Film 6 could be the most popular, but it's only in one list?!
I am stuck on both the logic and the MySQL queries to do it!
I am thinking I might need to weight the order somehow? Or have a separate table with the score per film and just update it after every edit. The following query seems like the right idea if it was just based on the count of items in the table but not when I want to add position in to the equation.
SELECT ff.film_id, COUNT(ff.film_id) AS cnt, SUM(ff.order) AS rank FROM
`favourite_film` AS ff GROUP BY ff.film_id ORDER BY cnt DESC, rank ASC
I guess I need the count of all the films in the table and the sum of the order (but reversed?), my theory then goes flat!
Any help or links would be greatly appreciated. Thanks.
Depending on your "business rules", I think you should find some sort of calculation that takes into account both the position and the number of "votes".
Just a random guess, but why not sort by COUNT(votes)/AVG(pos)? For maintainability reasons, you might want to factor out the ranking function:
CREATE FUNCTION ranking(average_pos REAL, vote_count INT)
RETURNS REAL
DETERMINISTIC
RETURN vote_count/average_pos;
The query is now simply:
SELECT film_id,
AVG(pos) as a, COUNT(*) as c, ranking(AVG(pos),COUNT(*)) AS rank
FROM vote GROUP BY film_id
ORDER BY ranking(AVG(pos), COUNT(*)) DESC;
Producing with your example:
+----------+------+----+----------------+
| FILM_ID | A | C | RANK |
+----------+------+----+----------------+
| 7 | 1.5 | 2 | 1.333333333333 |
| 6 | 1 | 1 | 1 |
| 8 | 2 | 1 | 0.5 |
| 5 | 3 | 1 | 0.333333333333 |
| 4 | 3 | 1 | 0.333333333333 |
+----------+------+----+----------------+
See http://sqlfiddle.com/#!2/3b1d9/1
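The same count/average idea can be reproduced locally with a short sketch. It assumes SQLite via Python's sqlite3 instead of MySQL, registers a Python function named ranking to play the role of the stored function, and adds film_id as a tie-breaker so the two films scoring 1/3 come out in a fixed order.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; "vote" and "pos" follow the answer's query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vote (film_id INTEGER, user_id INTEGER, pos INTEGER)")
conn.executemany(
    "INSERT INTO vote VALUES (?, ?, ?)",
    [(4, 2, 3), (5, 3, 3), (6, 2, 1), (7, 2, 2), (7, 3, 1), (8, 3, 2)],
)

# Python stand-in for the stored function: more votes and a lower
# average position both push the score up.
conn.create_function("ranking", 2, lambda average_pos, vote_count: vote_count / average_pos)

ranked = conn.execute(
    """
    SELECT film_id, AVG(pos) AS a, COUNT(*) AS c,
           ranking(AVG(pos), COUNT(*)) AS score
    FROM vote
    GROUP BY film_id
    ORDER BY score DESC, film_id
    """
).fetchall()
for row in ranked:
    print(row)
```

Keeping the formula in one function means the weighting can be changed later without touching the query.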
You should have reversed the list before saving it. That way you could leave the unselected movies out of the rating.
A workaround might be:
Count the number of lists with SELECT COUNT(DISTINCT user_id) FROM favourite_film and save this as $AMOUNT_OF_LISTS.
Now count the points using:
SELECT film_id, (SUM(`order`)+($AMOUNT_OF_LISTS-COUNT(DISTINCT user_id))*POINTS_FOR_NOT_IN_LIST) AS points FROM favourite_film GROUP BY film_id
Logic: sum up all points and add POINTS_FOR_NOT_IN_LIST points for every list the movie is not in (total number of lists minus the number of lists the movie appears in).
Set POINTS_FOR_NOT_IN_LIST to a value of your liking (it might be 26 or 27, or even lower).
Because the stored order is not reversed, a lower total means a more popular film, so you probably want to add ORDER BY points ASC LIMIT 10 to the query to get the top 10.
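A minimal sketch of this penalty scheme on the question's sample rows follows. It assumes SQLite via Python's sqlite3 rather than MySQL, and a penalty of 4 points chosen purely for illustration. Since position 1 is the favourite, films are sorted with the lowest total first.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; data is the sample from the question.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE favourite_film (film_id INTEGER, user_id INTEGER, "order" INTEGER)')
conn.executemany(
    "INSERT INTO favourite_film VALUES (?, ?, ?)",
    [(4, 2, 3), (5, 3, 3), (6, 2, 1), (7, 2, 2), (7, 3, 1), (8, 3, 2)],
)

# Step 1: how many users have a list at all.
amount_of_lists = conn.execute(
    "SELECT COUNT(DISTINCT user_id) FROM favourite_film"
).fetchone()[0]

# Step 2: sum of positions plus a penalty for every list the film is missing from.
# The penalty value 4 is an arbitrary assumption for this example.
points = conn.execute(
    """
    SELECT film_id,
           SUM("order") + (:lists - COUNT(DISTINCT user_id)) * :penalty AS points
    FROM favourite_film
    GROUP BY film_id
    ORDER BY points ASC, film_id
    LIMIT 10
    """,
    {"lists": amount_of_lists, "penalty": 4},
).fetchall()
print(points)
```

Film 7 comes out on top because it is in both lists, while film 6 pays the penalty for being in only one.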
SELECT MIN( `order` ) , COUNT( * ) AS cnt, `film_id`
FROM `favourite_film`
GROUP BY `film_id`
ORDER BY cnt DESC , `order`
I would assign a higher value to the movies with the higher ranking. Then I would sum the values per movie and order by the total descending to get the overall ranking. This way you are giving weight to both the popularity and the ranking of each movie.
So if you wanted to do it by the top 3 ranked movies per user you could do this:
SELECT film_id, SUM(3 -- The max number of ranked movies per user
- `order` -- the ranking
+ 1) total_score
FROM TABLE_NAME
GROUP BY film_id
ORDER BY total_score DESC;
Obviously you could remove the comments
This way the top rated movie would get the higher score, the next highest, the next highest score, etc. If you were counting the top 10 movies per user, just change the 3 to 10.
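For a concrete feel of the scores, here is a small sketch of the same sum on the question's sample rows, assuming SQLite via Python's sqlite3 in place of MySQL. The table name favourite_film is taken from the question, and film_id is added as a tie-breaker only to make the output order stable.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; data is the sample from the question.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE favourite_film (film_id INTEGER, user_id INTEGER, "order" INTEGER)')
conn.executemany(
    "INSERT INTO favourite_film VALUES (?, ?, ?)",
    [(4, 2, 3), (5, 3, 3), (6, 2, 1), (7, 2, 2), (7, 3, 1), (8, 3, 2)],
)

MAX_RANKED = 3  # max number of ranked movies per user, as in the answer

# Position 1 earns MAX_RANKED points, position 2 earns one fewer, and so on.
totals = conn.execute(
    """
    SELECT film_id, SUM(? - "order" + 1) AS total_score
    FROM favourite_film
    GROUP BY film_id
    ORDER BY total_score DESC, film_id
    """,
    (MAX_RANKED,),
).fetchall()
print(totals)
```

Film 7 scores 2 + 3 = 5 from its two lists, ahead of film 6 with a single first place worth 3.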
I have a MySQL table that stores names and values.
For example, this is how the table looks:
ID name value
-- ------ -----
1 John 500
2 Rock 350
3 Wayne 700
4 John 350
5 Rock 250
6 Nick 100
7 Sweety 75
8 Lex 350
How do I display the total value for each user? Also, if I want to filter it to show only the top 3, is there an easy command for that, or do I need to write some kind of function?
In PHP
John 850
Wayne 700
Rock 600
Lex 350
Nick 100
Sweety 75
It can be accomplished with an aggregate:
SELECT name, SUM(value) AS total_score
FROM user_scores
GROUP BY name
ORDER BY total_score DESC
You need an aggregate function, specifically SUM. You can order a query by any value in the select list, and to go high-to-low just use the DESC keyword:
SELECT Name, SUM(value) AS TotalValue
FROM myTable
GROUP BY Name
ORDER BY TotalValue DESC
The AS TotalValue will ensure that the column name is TotalValue when you reference it in PHP.
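For the "top 3" part of the question, adding LIMIT 3 to the sorted query is enough. A minimal sketch on the sample data, assuming SQLite via Python's sqlite3 in place of MySQL and the table name myTable from the query above:

```python
import sqlite3

# In-memory SQLite stands in for MySQL; data is the sample from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (ID INTEGER PRIMARY KEY, name TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?)",
    [
        (1, "John", 500), (2, "Rock", 350), (3, "Wayne", 700), (4, "John", 350),
        (5, "Rock", 250), (6, "Nick", 100), (7, "Sweety", 75), (8, "Lex", 350),
    ],
)

# Total per name, highest first, cut off after three rows.
top3 = conn.execute(
    """
    SELECT name, SUM(value) AS TotalValue
    FROM myTable
    GROUP BY name
    ORDER BY TotalValue DESC
    LIMIT 3
    """
).fetchall()
print(top3)
```

Dropping the LIMIT clause gives the full list shown in the question.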
I'm using PHP and MySQL.
I have a table named quantity. It holds records of product name, product price and product quantity. Besides these, there are a few other fields that help me select the latest records based on date and position, and I GROUP BY the field named price because there are different quantities with different prices for the same product. So, I currently select my product-specific price and quantity like this:
SELECT `price`,`quantity` FROM (SELECT `price`,`quantity` FROM `quantity` WHERE `product_name` = 'DELL' ORDER BY `date` DESC, `position`) AS `Actions` GROUP BY `price`
This query is a workaround because I need to get data like this:
product_name | price | quantity
DELL | 100 | 30
DELL | 120 | 10
DELL | 130 | 2
Assume that I have multiple records like these and need to get the latest of them. From this query, I need to select the records whose quantities, added together row by row, reach 35. By using my query I know that it should stop at line 2, because I can take the 30 products that came with the price of $100 and another 5 products from line 2, which has a price of $120. Then I need to write my updates, so the new data would look like:
product_name | price | quantity
DELL | 100 | 0
DELL | 120 | 5
DELL | 130 | 2
How am I going to achieve this?
Option 1: Use program logic instead of a query:
There is nothing wrong with using the programming layer to do more advanced database interactions. SQL is not an answer to everything... (Also consider a stored procedure).
enough = 35
running_total = 0
START TRANSACTION
while running_total < enough:
    select one record order by price limit 1 FOR UPDATE
    add to running_total
    UPDATE records...
COMMIT
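The loop above could look roughly like this in application code. This is only a sketch under several assumptions: it is written in Python with sqlite3 instead of PHP with MySQL, the function name take_stock and the row_id column are made up for the example, and SQLite has no FOR UPDATE, so row locking is left out.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; row_id is an assumed primary key.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE quantity (row_id INTEGER PRIMARY KEY, product_name TEXT, price INTEGER, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO quantity (product_name, price, quantity) VALUES (?, ?, ?)",
    [("DELL", 100, 30), ("DELL", 120, 10), ("DELL", 130, 2)],
)


def take_stock(conn, product_name, needed):
    """Consume `needed` units, cheapest rows first (hypothetical helper)."""
    with conn:  # commits on success, rolls back if an exception is raised
        rows = conn.execute(
            "SELECT row_id, quantity FROM quantity "
            "WHERE product_name = ? AND quantity > 0 ORDER BY price",
            (product_name,),
        ).fetchall()
        for row_id, available in rows:
            if needed == 0:
                break
            taken = min(available, needed)  # partial take on the last row
            conn.execute(
                "UPDATE quantity SET quantity = quantity - ? WHERE row_id = ?",
                (taken, row_id),
            )
            needed -= taken
        if needed > 0:
            raise ValueError("not enough stock")


take_stock(conn, "DELL", 35)
remaining = conn.execute("SELECT price, quantity FROM quantity ORDER BY price").fetchall()
print(remaining)
```

After taking 35 units, the rows match the desired result from the question: 0 left at 100, 5 left at 120, and 2 untouched at 130.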
Option 2: Use a query with a running total:
In this option, you obtain a running total using a derived query, and then filter that down to specific records in the outer query. If you intend on updating them, you should wrap this in a transaction with the right isolation level.
SET @running_total = 0;
SELECT
    row_id,
    product_name,
    price,
    quantity
FROM
    (
        SELECT
            row_id,
            product_name,
            price,
            quantity,
            @running_total := @running_total + quantity AS running_total
        FROM
            sometable
        WHERE
            quantity > 0
        ORDER BY
            quantity
        LIMIT
            35 /* for performance reasons :) */
    ) AS T1
WHERE
    /* test the total before this row, so the row that crosses 35 is kept */
    running_total - quantity < 35;
I would tend to prefer option 1 because it is more "obvious", but perhaps this will give you some food for thought.
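For comparison, the running total in option 2 can also be computed without user variables, using a correlated subquery. The sketch below assumes SQLite via Python's sqlite3 instead of MySQL, orders by price to match the question's example, and assumes prices are unique per product so the subquery gives a well-defined running total.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; data is the sample from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quantity (product_name TEXT, price INTEGER, quantity INTEGER)")
conn.executemany(
    "INSERT INTO quantity VALUES (?, ?, ?)",
    [("DELL", 100, 30), ("DELL", 120, 10), ("DELL", 130, 2)],
)

NEEDED = 35

# running_total = quantity of this row plus all cheaper rows.
# A row is needed if the total before it is still below NEEDED.
picked = conn.execute(
    """
    SELECT price, quantity, running_total
    FROM (
        SELECT q.price, q.quantity,
               (SELECT SUM(q2.quantity)
                FROM quantity q2
                WHERE q2.price <= q.price) AS running_total
        FROM quantity q
        WHERE q.quantity > 0
    ) AS t
    WHERE running_total - quantity < ?
    ORDER BY price
    """,
    (NEEDED,),
).fetchall()
print(picked)
```

The query returns the 100 and 120 rows. The second row's running total of 40 shows that only 5 of its 10 units are needed, which the application would then write back in an update.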