I have two tables.
RatingsTable contains a ratingname and a bit indicating whether the rating is positive or negative:
RatingsTable
----------------------
ratingname ispositive
----------------------
Good 1
Bad 0
Fun 1
Boring 0
FeedbackTable contains feedback on things: the person rating, the thing rated and the rating given. Whether a piece of feedback is positive or negative is determined by looking up its ratingname in RatingsTable.
FeedbackTable
---------------------------------
username thing ratingname
---------------------------------
Jim Chicken Good
Jim Steak Bad
Ted Waterskiing Fun
Ted Hiking Fun
Nancy Hiking Boring
I am trying to write an efficient MySQL query for the following:
On a page, I want to display the top 'things' that have the highest proportion of positive ratings. I want to be sure that the items from the feedback table are unique, meaning that if Jim has rated Chicken Good 20 times, it should only be counted once. At some point I will also want to require a minimum number of ratings (at least 10) before a thing is counted for this page. I'll want to do the same for the highest proportion of negative ratings, but I am sure I can tweak the positive query accordingly.
To get the "things" in order of proportion of good ratings you can use this query:
SELECT thing, SUM(ispositive) / COUNT(*) AS proportion_positive
FROM (SELECT DISTINCT username, thing, ratingname FROM FeedbackTable) T1
JOIN RatingsTable T2
ON T1.ratingname = T2.ratingname
GROUP BY thing
ORDER BY proportion_positive DESC
For your example data it returns this:
thing proportion_positive
Chicken 1.0000
Waterskiing 1.0000
Hiking 0.5000
Steak 0.0000
To require at least 10 votes before displaying a thing in the results add this line after the GROUP BY:
HAVING COUNT(*) >= 10
To get the proportion of negative ratings change SUM(ispositive) to SUM(NOT ispositive).
Note: it might be better to add a unique constraint to your voting table instead of selecting only the distinct values.
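The query above, including the HAVING filter, can be tried out on the sample data with a small script. This is only a sketch: it uses Python's built-in sqlite3 module rather than MySQL, the helper name top_things and its min_ratings parameter are made up for illustration, and one duplicate vote was added to show the DISTINCT step. SQLite does integer division, so the sum is multiplied by 1.0; MySQL's / already returns a decimal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE RatingsTable (ratingname TEXT PRIMARY KEY, ispositive INTEGER);
CREATE TABLE FeedbackTable (username TEXT, thing TEXT, ratingname TEXT);
INSERT INTO RatingsTable VALUES ('Good', 1), ('Bad', 0), ('Fun', 1), ('Boring', 0);
INSERT INTO FeedbackTable VALUES
  ('Jim',   'Chicken',     'Good'),
  ('Jim',   'Chicken',     'Good'),   -- duplicate vote, must count once
  ('Jim',   'Steak',       'Bad'),
  ('Ted',   'Waterskiing', 'Fun'),
  ('Ted',   'Hiking',      'Fun'),
  ('Nancy', 'Hiking',      'Boring');
""")


def top_things(conn, min_ratings=1):
    """Return (thing, proportion_positive) rows, best first.

    min_ratings is a hypothetical parameter standing in for the
    'at least 10 ratings' rule from the question.
    """
    sql = """
    SELECT thing, 1.0 * SUM(ispositive) / COUNT(*) AS proportion_positive
    FROM (SELECT DISTINCT username, thing, ratingname FROM FeedbackTable) T1
    JOIN RatingsTable T2 ON T1.ratingname = T2.ratingname
    GROUP BY thing
    HAVING COUNT(*) >= ?
    ORDER BY proportion_positive DESC, thing
    """
    return conn.execute(sql, (min_ratings,)).fetchall()


for row in top_things(conn):
    print(row)
```

The extra ORDER BY key (thing) is only there to make ties come out in a stable order.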
SELECT *
FROM `feedback`
LEFT JOIN `ratings` ON `feedback`.`rating` = `ratings`.`label`
GROUP BY `feedback`.`username`
ORDER BY `ratings`.`value` DESC
LIMIT 10
The summary: join the ratings to your feedback table, but group by the username so you only get one username per result.
I am working with a table that contains a list of uploaded files that will be processed by a Laser Cutter.
To increase efficiency I need to list files that have the same material and colour in order of how many of the same combination occurs. Then order that by the time it was sent.
Here is a very simplified table example:
From the example table above I want to achieve the following results:
6 File Number 6 Plastic Red 9am
4 File Number 4 Plastic Red 10am
5 File Number 5 Plastic Red 10:30am
1 File Number 1 Card Blue 9am
2 File Number 2 Card Blue 9:30am
7 File Number 7 Plastic Purple 8am
3 File Number 3 Card Green 9am
So where both material and colour occur the most, the respective rows are at the top and within that, the earliest file is at the top.
As you can see, red plastic occurs the most so is at the top. Single files are ordered by time only as seen with file 7 and 3.
Ideally I would like to do this in one MySQL query, but given its complexity I'm not sure that will be possible. Perhaps by storing the results and looping through arrays?
I have nearly managed to achieve this with this query:
SELECT *
FROM propellor.pro_files
GROUP BY material, colour
ORDER BY count(*) DESC,
material DESC,
colour DESC,
sent DESC;
Although this seems to be ordered almost correctly it only returns single rows for each material/colour combination.
It's worth noting that the colour column can be null.
For an idea of how I'm grouping them, here's how it will look in practice:
UPDATE: The material / colour groups are ordered correctly now, but single files are ordered newest to oldest when I need it the other way around. I have tried switching between sent ASC and sent DESC, but it only affects files in a group.
UPDATE 2: Here is my current full MySQL query:
SELECT *
FROM propellor.pro_files
INNER JOIN propellor.pro_users ON propellor.pro_files.userid=propellor.pro_users.userid
INNER JOIN (
SELECT material, colour, count(*) AS occ
FROM propellor.pro_files
GROUP BY material, colour
)
AS mcCounts
USING (material, colour)
WHERE status =1 AND laser=1
ORDER BY mcCounts.occ DESC,
material,
colour,
propellor.pro_files.timesent ASC
;
You need to calculate the counts separately, though it can all be done in a single query; this should do it:
SELECT pf.*
FROM propellor.pro_files AS pf
INNER JOIN (
SELECT material, colour, count(*) AS occ
FROM propellor.pro_files
GROUP BY material, colour
) AS mcCounts USING (material, colour)
ORDER BY mcCounts.occ DESC
, material, colour
, pf.sent ASC
;
Edit: Added material, colour to ORDER BY... for when two combinations have the same frequency.
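Here is a runnable sketch of the derived-table approach, using Python's sqlite3 module in place of MySQL. The table contents are reconstructed from the expected output in the question, and the column names id and name are guesses since the original table was not shown. The two CASE expressions are my own addition, not part of the answer above: they blank out the material and colour sort keys for combinations that occur only once, so that single files fall back to being ordered by time alone, which is what the question's expected output seems to ask for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pro_files (id INTEGER, name TEXT, material TEXT, colour TEXT, sent TEXT);
INSERT INTO pro_files VALUES
  (1, 'File Number 1', 'Card',    'Blue',   '09:00'),
  (2, 'File Number 2', 'Card',    'Blue',   '09:30'),
  (3, 'File Number 3', 'Card',    'Green',  '09:00'),
  (4, 'File Number 4', 'Plastic', 'Red',    '10:00'),
  (5, 'File Number 5', 'Plastic', 'Red',    '10:30'),
  (6, 'File Number 6', 'Plastic', 'Red',    '09:00'),
  (7, 'File Number 7', 'Plastic', 'Purple', '08:00');
""")

ORDERED = """
SELECT pf.id
FROM pro_files AS pf
INNER JOIN (
    SELECT material, colour, COUNT(*) AS occ
    FROM pro_files
    GROUP BY material, colour
) AS mc ON mc.material = pf.material AND mc.colour = pf.colour
ORDER BY mc.occ DESC,
         -- only group by material/colour when the combination repeats
         CASE WHEN mc.occ > 1 THEN pf.material END,
         CASE WHEN mc.occ > 1 THEN pf.colour END,
         pf.sent ASC
"""

ordered_ids = [row[0] for row in conn.execute(ORDERED)]
print(ordered_ids)  # [6, 4, 5, 1, 2, 7, 3]
```

Note that an equality join like this one never matches rows whose colour is NULL, so if NULL colours matter they need separate handling.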
You can try adding count columns with window functions (MySQL 8.0 or later) and then ordering by those counts:
select count(material) over (partition by material) as materialCount
, count(colour) over (partition by colour) as colourCount
, ..... other columns
from propellor.pro_files
order by materialCount desc, colourCount desc
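If what you want is the count per material/colour combination rather than per column, one variation is to partition by both columns at once. The sketch below is an assumption-laden illustration, not the answer's exact query: it runs on SQLite (3.25 or later) through Python's sqlite3 module, with a three-row table invented to include NULL colours. It also shows why this matters for the question: a window partition keeps NULL colours together, while a join with USING (material, colour) silently drops them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pro_files (id INTEGER, material TEXT, colour TEXT, sent TEXT);
INSERT INTO pro_files VALUES
  (1, 'Card',    NULL,  '09:00'),
  (2, 'Card',    NULL,  '09:30'),
  (3, 'Plastic', 'Red', '08:00');
""")

# Window version: NULL colours land in the same partition.
windowed = conn.execute("""
SELECT id, COUNT(*) OVER (PARTITION BY material, colour) AS occ
FROM pro_files
ORDER BY occ DESC, material, colour, sent
""").fetchall()

# Join version: NULL = NULL is not true, so the NULL-colour rows vanish.
joined = conn.execute("""
SELECT pf.id
FROM pro_files AS pf
INNER JOIN (
    SELECT material, colour, COUNT(*) AS occ
    FROM pro_files
    GROUP BY material, colour
) AS mc USING (material, colour)
ORDER BY pf.id
""").fetchall()

print(windowed)  # [(1, 2), (2, 2), (3, 1)]
print(joined)    # [(3,)]
```

In MySQL the join version could be kept by comparing colour with the null-safe operator <=> instead of USING.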
I run a real estate website, which allows different users to add their properties.
When someone searches with specific criteria, we use a select statement with the specified conditions to select the matching properties, do the paging and display the results.
We use a rating algorithm to score each property and use the score to prioritize the displayed properties.
A simplified version:
name type bedrooms user score
green house sale two bedrooms alex 6
Blue one rent three bedrooms jack 6
Blue one sale three bedrooms jack 4
gray one sale three bedrooms jack 6
green one rent three bedrooms jack 6
purple one rent three bedrooms jack 6
green one rent three bedrooms jack 6
green one rent three bedrooms gary 6
Now the problem is that sometimes a few properties have the same score. In these cases I don't want properties from one user to dominate a search result page, I want to set a limit to display a maximum of three properties of any given user in a search result page.
In the example, I don't want the properties owned by jack to dominate the first page, and properties of other users go to second page. This would upset other users and create a bad experience for visitors.
If I wanted to show only one property for a given user, I'd use GROUP BY, but I'm not sure how to limit to a larger number (three, for instance). Is there anything I could do in MySQL to achieve this?
EDIT:
Sorry if it wasn't clear enough.
The user field shows the user who added the particular property. A sample query could be
SELECT * FROM properties WHERE type = 'sale' ORDER BY score LIMIT 5
The result could be five properties, all added by jack. I want to make sure that no more than three properties added by a particular user are included in the results. This way properties added by other users would have a chance to be displayed.
Use DISTINCT. A column may contain many duplicate values, and sometimes you only want to list the different (distinct) values; the DISTINCT keyword returns only those. Example:
SELECT DISTINCT user FROM table_name;
Use DISTINCT in your query, something like this:
SELECT DISTINCT column_name, column_name FROM table_name;
Try this, replacing your_table with your table name:
SELECT
*
FROM
`your_table`
LEFT JOIN
(SELECT * FROM `your_table` LIMIT 3) as lr on lr.user = `your_table`.user
GROUP BY
user
UPDATE 2
if you want to order by your score you can use
SELECT
*
FROM
`your_table`
LEFT JOIN
(SELECT * FROM `your_table` ORDER BY score DESC LIMIT 3) as lr on lr.user = `your_table`.user
GROUP BY
user
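For the cap the question actually describes (no more than three properties per user in the results), a different technique from the answers above is to number each user's rows with ROW_NUMBER() and keep only the first few. The sketch below is a hedged illustration only: it runs on SQLite through Python's sqlite3 module, the id column and the function name capped_search are invented, and in MySQL this needs version 8.0 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE properties (id INTEGER, name TEXT, type TEXT, user TEXT, score INTEGER);
INSERT INTO properties VALUES
  (1, 'green house', 'sale', 'alex', 6),
  (2, 'Blue one',    'rent', 'jack', 6),
  (3, 'Blue one',    'sale', 'jack', 4),
  (4, 'gray one',    'sale', 'jack', 6),
  (5, 'green one',   'rent', 'jack', 6),
  (6, 'purple one',  'rent', 'jack', 6),
  (7, 'green one',   'rent', 'jack', 6),
  (8, 'green one',   'rent', 'gary', 6);
""")


def capped_search(conn, prop_type, per_user=3, limit=5):
    """Return (id, user) for matching properties, at most per_user rows per user."""
    sql = """
    SELECT id, user
    FROM (
        SELECT p.*,
               ROW_NUMBER() OVER (PARTITION BY user ORDER BY score DESC, id) AS rn
        FROM properties AS p
        WHERE type = ?
    ) AS ranked
    WHERE rn <= ?
    ORDER BY score DESC, id
    LIMIT ?
    """
    return conn.execute(sql, (prop_type, per_user, limit)).fetchall()


print(capped_search(conn, "rent"))
# [(2, 'jack'), (5, 'jack'), (6, 'jack'), (8, 'gary')]
```

This caps each user across the whole result set, not per page; a strict per-page cap would need extra logic on top of it.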
I need help with a query involving a review system set up with the following two tables.
reviews
-------
id date user_id item_id rating review
1 02-2012 40 456 3 'I like it'
2 03-2012 22 342 1 'I don't like it'
3 04-2012 45 548 0 'I hate it'
reviews_thumbs
--------------
review_id user_id like
1 22 1
1 45 -1
2 40 -1
3 22 1
The "reviews_thumbs" table exists to keep track of upvotes and downvotes for the reviews, so that reviews can be rated by quality. In the 'like' column, a 1 is an upvote and a -1 is a downvote. (The rating column in the reviews table is a star system, unrelated.)
When loading reviews, I need to join the reviews_thumbs table in such a way that I know the following details (for each individual review as they are returned):
1. The total number of upvotes
2. The total number of downvotes
3. Whether the current active user has upvoted or downvoted the review
I have accomplished this using the following query, which isn't sitting right with me:
SELECT `reviews`.*,
COUNT(upVoteTable.`user_id`) AS upVotes,
COUNT(downVoteTable.`user_id`) AS downVotes,
COUNT(userUpTable.`user_id`) AS userUp,
COUNT(userDownTable.`user_id`) as userDown
FROM `reviews`
LEFT JOIN `reviews_thumbs` AS upVoteTable
ON upVoteTable.`review_id` = `reviews`.`id`
AND upVoteTable.`like` = 1
LEFT JOIN `reviews_thumbs` AS downVoteTable
ON downVoteTable.`review_id` = `reviews`.`id`
AND downVoteTable.`like` = -1
LEFT JOIN `reviews_thumbs` AS userUpTable
ON userUpTable.`review_id` = `reviews`.`id`
AND userUpTable.`like` = 1
AND userUpTable.`user_id` = :userid
LEFT JOIN `reviews_thumbs` AS userDownTable
ON userDownTable.`review_id` = `reviews`.`id`
AND userDownTable.`like` = -1
AND userDownTable.`user_id` = :userid
WHERE `item_id`=:itemid
GROUP BY `reviews`.`id`
ORDER BY `date` DESC
(And binding the appropriate :userid and :itemid.)
So this query works perfectly and accomplishes what I need it to. But that is a lot of joining, and I'm almost positive there must be a better way to do this, but I can't seem to figure anything out.
Could someone please point me in the right direction on how to accomplish this in a cleaner way?
What I've Tried:
I've tried doing a GROUP_CONCAT, to list a string that contains all the user ids and likes, and to then run a regex to find the user's id to see if they've voted on the review, but this also feels really unclean.
Thank you in advance for any help you may provide.
You could modify reviews_thumbs to look more like this:
reviews_thumbs
--------------
review_id user_id upvote downvote
1 22 1 0
1 45 0 1
2 40 0 1
3 22 1 0
This would effectively store duplicate information, but that's okay when you have a good purpose. You really have 2 things you want to know, and this gives you a quick sum on 2 columns (and a quick subtraction on those results) to get exactly what you are looking for. It cuts you down to querying the reviews_thumbs table twice: once for the totals, and once for the user's specific action.
I'm not sure whether it improves your query performance, but you could try doing the counting in a sub-select:
SELECT `reviews`.*,
(SELECT count(*) FROM reviews_thumbs t WHERE t.review_id =reviews.id AND t.like = 1) AS upVotes
...
FROM reviews
...
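Another option, which neither answer above spells out, is to join reviews_thumbs once and use conditional aggregation. The sketch below runs on SQLite through Python's sqlite3 module with the sample rows from the question; the function name vote_summary and the userVote column are made up, and userVote assumes each user has at most one row per review (1 for an upvote, -1 for a downvote, 0 for no vote). Summing a comparison such as `like` = 1 works the same way in MySQL, where a true comparison evaluates to 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reviews (id INTEGER, date TEXT, user_id INTEGER,
                      item_id INTEGER, rating INTEGER, review TEXT);
CREATE TABLE reviews_thumbs (review_id INTEGER, user_id INTEGER, `like` INTEGER);
INSERT INTO reviews VALUES
  (1, '02-2012', 40, 456, 3, 'I like it'),
  (2, '03-2012', 22, 342, 1, 'I don''t like it'),
  (3, '04-2012', 45, 548, 0, 'I hate it');
INSERT INTO reviews_thumbs VALUES (1, 22, 1), (1, 45, -1), (2, 40, -1), (3, 22, 1);
""")


def vote_summary(conn, itemid, userid):
    """Return (review id, upVotes, downVotes, userVote) for one item."""
    sql = """
    SELECT r.id,
           COALESCE(SUM(t.`like` = 1), 0)  AS upVotes,
           COALESCE(SUM(t.`like` = -1), 0) AS downVotes,
           COALESCE(SUM(CASE WHEN t.user_id = :userid THEN t.`like` END), 0) AS userVote
    FROM reviews AS r
    LEFT JOIN reviews_thumbs AS t ON t.review_id = r.id
    WHERE r.item_id = :itemid
    GROUP BY r.id
    ORDER BY r.date DESC
    """
    return conn.execute(sql, {"itemid": itemid, "userid": userid}).fetchall()


print(vote_summary(conn, itemid=456, userid=22))  # [(1, 1, 1, 1)]
```

The COALESCE calls turn the NULL you get for a review with no votes into 0.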
I have a MySQL database where I add soccer games shown on TV. Each team is represented as a number. I can't figure out how to write a query that lists how many times a team has been shown on TV, regardless of whether it played at home or away.
I'm trying to make a top 20 list of the teams that have been shown on TV. The two columns that hold the team id are called "hjemmehold" and "udehold" (it's Danish :)).
Can anyone help me here?
SELECT Team, Count(*)
FROM (select Away as Team from Games union all select Home as Team from Games) t
GROUP BY Team
ORDER BY Count(*) Desc
LIMIT 20
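To see the UNION ALL approach working with the question's column names, here is a small sketch run on SQLite through Python's sqlite3 module. The table name games and the four sample rows are invented for illustration; hjemmehold is the home team and udehold the away team.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE games (hjemmehold INTEGER, udehold INTEGER);
INSERT INTO games VALUES (1, 2), (1, 3), (2, 1), (3, 1);
""")

# Stack the home and away columns into one list of team ids, then count.
top_teams = conn.execute("""
SELECT team, COUNT(*) AS shown
FROM (
    SELECT hjemmehold AS team FROM games
    UNION ALL
    SELECT udehold FROM games
) AS t
GROUP BY team
ORDER BY shown DESC, team
LIMIT 20
""").fetchall()

print(top_teams)  # [(1, 4), (2, 2), (3, 2)]
```

UNION ALL is needed rather than UNION, because UNION would remove duplicate team ids before they are counted.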
SELECT SUM(Away = @Team) + SUM(Home = @Team) AS NumGames
FROM Games
This is schematic, but put the proper tables/fields/params in and you should be good to go.
For the top 20 teams:
SELECT Team, COUNT(*) AS NumGames
FROM (SELECT Home AS Team FROM Games UNION ALL SELECT Away FROM Games) AS t
GROUP BY Team
ORDER BY NumGames DESC
LIMIT 20
This should give you the top 20 teams by number of games played. It also lets you find teams that have not yet played a game (if you remove the LIMIT 20 clause).
SELECT
team_id,
(SELECT COUNT(1) FROM GAMES WHERE HOME = t.team_id) + (SELECT COUNT(1) FROM GAMES WHERE AWAY = t.team_id) AS team_count
FROM TEAMS t
ORDER BY team_count DESC
LIMIT 20
I am new to all of this and I have Googled and searched on here, but to no avail. Using Google and some of the responses here, I've managed to solve a separate problem, but this is what I'm really interested in, and I'm wondering whether it is even possible and how to accomplish it.
I have a MySQL table that looks like this:
id type of game players timestamp
1 poker a,b,c,d,e,f,g,h 2011-10-08 08:00:00
2 fencing i,j,k,l,m,n,o,p 2011-10-08 08:05:00
3 tennis a,e,k,g,p,o,d,z 2011-10-08 08:10:00
4 football x,y,f,b 2011-10-08 08:15:00
There are 7 types of games, and either 4 or 8 players separated by commas for each gametype.
However, the players are IRC nicknames so potentially there could be new players with unique nicknames all the time.
What I am trying to do is look in the players column of the entire table and find the top 10 players in terms of games played, regardless of the gametype, and print it out to a website in this format, e.g.:
Top 10 Players:
a (50 games played)
f (39 games played)
o (20 games played)
......
10 g (2 games played)
Does anyone have any idea how to accomplish this? Any help is appreciated! Honestly, without this website I would not have even come this far in my project!
My suggestion is that you don't keep a list of the players for each game in the same table, but rather implement a relationship between a games table and a players table.
The new model could look like:
TABLE Games:
id type of game timestamp
1 poker 2011-10-08 08:00:00
2 fencing 2011-10-08 08:05:00
3 tennis 2011-10-08 08:10:00
4 football 2011-10-08 08:15:00
TABLE Players:
id name
1 a
2 b
3 c
.. ..
TABLE PlayersInGame:
id idGame idPlayer current
1 1 1 true //Player a is currently playing poker
When a player starts a game, add it to the PlayersInGame table.
When a player exits a game, set the current status to false.
To retrieve the number of games played by a player, query the PlayersInGame table.
SELECT COUNT(*) FROM PlayersInGame WHERE idPlayer=1
For faster processing you can de-normalize the table (strictly speaking it is a cached count rather than denormalization, but I don't know what else to call it) and keep track of the number of games for each player in the Players table. This increases the table size but gives better speed.
So add a games_played column to Players and then query it:
SELECT * FROM Players ORDER BY games_played DESC LIMIT 10
EDIT:
As Ilmari Karonen pointed out, to gain speed from this you must create an INDEX for the column games_played.
Unless you have a huge number of players, you probably don't need the denormalization step suggested at the end of Luchian Grigore's answer. Assuming tables structured as he initially suggests, and an index on PlayersInGame (idPlayer), the following query should be reasonably fast:
SELECT
name,
COUNT(*) AS games_played
FROM
PlayersInGame AS g
JOIN Players AS p ON p.id = g.idPlayer
GROUP BY g.idPlayer
ORDER BY games_played DESC
LIMIT 10
This does require a filesort, but only on the grouped data, so its performance will only depend on the number of players, not the number of games played.
P.S. If you do end up adding an explicit games_played column to the player table, do remember to create an index on it, otherwise the denormalization will gain you nothing.
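To tie the two answers together, here is a rough sketch of moving the comma-separated players column into the Players and PlayersInGame tables and then running the top-10 query. It uses Python's sqlite3 module rather than MySQL, and several names are assumptions: the old table is called OldGames here, its columns gametype, players and ts are guesses at the original column names, and migrate and top_players are illustrative helpers. INSERT OR IGNORE is SQLite syntax; the MySQL equivalent is INSERT IGNORE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OldGames (id INTEGER PRIMARY KEY, gametype TEXT, players TEXT, ts TEXT);
CREATE TABLE Players (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE PlayersInGame (idGame INTEGER, idPlayer INTEGER);
CREATE INDEX idx_pig_player ON PlayersInGame (idPlayer);
INSERT INTO OldGames VALUES
  (1, 'poker',    'a,b,c,d,e,f,g,h', '2011-10-08 08:00:00'),
  (2, 'fencing',  'i,j,k,l,m,n,o,p', '2011-10-08 08:05:00'),
  (3, 'tennis',   'a,e,k,g,p,o,d,z', '2011-10-08 08:10:00'),
  (4, 'football', 'x,y,f,b',         '2011-10-08 08:15:00');
""")


def migrate(conn):
    """Split each comma-separated players list into one row per player per game."""
    rows = conn.execute("SELECT id, players FROM OldGames").fetchall()
    for game_id, players in rows:
        for nick in players.split(","):
            nick = nick.strip()
            # Create the player on first sight; the UNIQUE constraint skips repeats.
            conn.execute("INSERT OR IGNORE INTO Players (name) VALUES (?)", (nick,))
            (player_id,) = conn.execute(
                "SELECT id FROM Players WHERE name = ?", (nick,)
            ).fetchone()
            conn.execute("INSERT INTO PlayersInGame VALUES (?, ?)", (game_id, player_id))


def top_players(conn, limit=10):
    """Return (name, games_played) for the most active players."""
    sql = """
    SELECT p.name, COUNT(*) AS games_played
    FROM PlayersInGame AS g
    JOIN Players AS p ON p.id = g.idPlayer
    GROUP BY g.idPlayer
    ORDER BY games_played DESC, p.name
    LIMIT ?
    """
    return conn.execute(sql, (limit,)).fetchall()


migrate(conn)
for name, games in top_players(conn):
    print(f"{name} ({games} games played)")
```

The p.name tie-break is only there so that players with the same count come out in a predictable order.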