I need a rating system for my app in which a user can rate a thread from 1 to 5. The calculation I was going to use is shown below:
ID   UID   TID   Rating
--------------------------------------
1    1     37    5
2    4     37    5
3    8     37    5
4    22    37    5
5    2     37    5
This is a sample table. As you can see, the way I did it was:
   r1        r2        r3        r4        r5
(0 x 1) + (0 x 2) + (0 x 3) + (0 x 4) + (5 x 5)
----------------------------------------------- = 5
                       5
No user (UID) gave a rating of 1 (r1), so r1 = (0 x 1). Five users gave a rating of 5 (r5), so r5 = (5 x 5). The rest (r2, r3, r4) are 0 because there are no ratings for those. The sum is then divided by the total number of ratings (5). Hope you get it; if not, I'll explain more.
So I get a rating of 5 stars.
But my problem is this: if, say, 100 different users all rate a thread 5, the formula gives a rating of 5, and if only 5 different users rate another thread 5, that thread also gets a rating of 5. I don't want these results, because both threads hit the top rating. I know I could order the threads in SQL by the number of users who rated them, so that the thread with 100 raters goes on top and the thread with 5 raters comes second.
Is there another way to rate these threads that takes into account how many users have rated each thread?
I hope you understand my question; if not, I'll edit.
I also need a PHP script that calculates this in my SELECT statement when I retrieve the rating, but I'll ask another question when this is solved. Thanks.
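I saw that you mentioned SELECT statements, so I assume you meant the SQL statements required.
What you asked for can be done purely in SQL, using the statement below. It orders threads by average rating and breaks ties by the sum of the ratings, so of two threads with the same average, the one with more votes comes first.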
SELECT TID, AVG(RATING) FROM ratings GROUP BY TID ORDER BY AVG(RATING) DESC, SUM(RATING) DESC;
Here is the data I used for testing.
CREATE TABLE ratings(ID INT,UID INT,TID INT,RATING INT);
INSERT INTO ratings VALUES (1,1,37,5);
INSERT INTO ratings VALUES (2,4,37,5);
INSERT INTO ratings VALUES (3,8,37,5);
INSERT INTO ratings VALUES (4,22,37,5);
INSERT INTO ratings VALUES (5,2,37,5);
INSERT INTO ratings VALUES ( 6,1,12,5);
INSERT INTO ratings VALUES ( 7,4,12,5);
SELECT TID, AVG(RATING) FROM ratings GROUP BY TID ORDER BY AVG(RATING) DESC, SUM(RATING) DESC;
The output I got:
TID   AVG(RATING)
------------------
37    5
12    5
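For anyone who wants to try the ordering without a MySQL server, here is a minimal sketch that runs the same query against an in-memory SQLite database from Python. The SQLite setup and the extra COUNT(*) column (aliased votes) are my additions for illustration; the table, the data, and the ORDER BY come from the answer above.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ratings (ID INT, UID INT, TID INT, RATING INT)")
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?, ?, ?)",
    [
        (1, 1, 37, 5), (2, 4, 37, 5), (3, 8, 37, 5), (4, 22, 37, 5), (5, 2, 37, 5),
        (6, 1, 12, 5), (7, 4, 12, 5),
    ],
)

# Same ordering as the answer: average first, sum of ratings as the tie-breaker.
# COUNT(*) is only added so the number of votes is visible in the result.
rows = conn.execute(
    """
    SELECT TID, AVG(RATING) AS avg_rating, COUNT(*) AS votes
    FROM ratings
    GROUP BY TID
    ORDER BY AVG(RATING) DESC, SUM(RATING) DESC
    """
).fetchall()

for tid, avg_rating, votes in rows:
    print(tid, avg_rating, votes)
```

Both threads average 5, but thread 37 has five votes against two for thread 12, so it sorts first.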
Let us just imagine I have the following table structure
Table: images
id  path
1   x
2   x
3   x
4   x
5   x
6   x
7   x
8   x

Table: user
id  image  imageSmall
1   1      1
2   2      2
3   4      4

Table: books
id  image  imageSmall
1   5      5
2   6      6
3   8      8
I now want to get the ID of every image used in other tables. I made this query here
SELECT id FROM images WHERE id IN (SELECT image FROM user) OR id IN (SELECT imageSmall FROM user) OR id IN (SELECT image FROM books) OR id IN (SELECT imageSmall FROM books);
The problem I see is that with a large amount of data this query could be very slow because of the many IN subqueries. Is there a way to improve its performance?
I would phrase this using EXISTS rather than IN:
SELECT id
FROM images i
WHERE EXISTS (SELECT 1 FROM user u WHERE u.image = i.id) OR
      EXISTS (SELECT 1 FROM user u WHERE u.imageSmall = i.id) OR
      EXISTS (SELECT 1 FROM books b WHERE b.image = i.id) OR
      EXISTS (SELECT 1 FROM books b WHERE b.imageSmall = i.id);
For performance, be sure that you have the following indexes:
create index idx_user_image on user(image);
create index idx_user_imageSmall on user(imageSmall);
create index idx_books_image on books(image);
create index idx_books_imageSmall on books(imageSmall);
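As a quick sanity check of the EXISTS approach, here is a small sketch using the sample tables from the question in an in-memory SQLite database (the Python/SQLite harness is my assumption; the question does not say which database is used). It checks all four referencing columns and creates one index per column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE images (id INTEGER PRIMARY KEY, path TEXT);
    CREATE TABLE user   (id INTEGER PRIMARY KEY, image INT, imageSmall INT);
    CREATE TABLE books  (id INTEGER PRIMARY KEY, image INT, imageSmall INT);

    INSERT INTO images VALUES (1,'x'),(2,'x'),(3,'x'),(4,'x'),(5,'x'),(6,'x'),(7,'x'),(8,'x');
    INSERT INTO user   VALUES (1,1,1),(2,2,2),(3,4,4);
    INSERT INTO books  VALUES (1,5,5),(2,6,6),(3,8,8);

    -- One index per referencing column, so each EXISTS probe is an index lookup.
    CREATE INDEX idx_user_image       ON user(image);
    CREATE INDEX idx_user_imageSmall  ON user(imageSmall);
    CREATE INDEX idx_books_image      ON books(image);
    CREATE INDEX idx_books_imageSmall ON books(imageSmall);
    """
)

# An image counts as "used" if any of the four columns references it.
used_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT i.id
        FROM images i
        WHERE EXISTS (SELECT 1 FROM user u  WHERE u.image = i.id)
           OR EXISTS (SELECT 1 FROM user u  WHERE u.imageSmall = i.id)
           OR EXISTS (SELECT 1 FROM books b WHERE b.image = i.id)
           OR EXISTS (SELECT 1 FROM books b WHERE b.imageSmall = i.id)
        ORDER BY i.id
        """
    )
]
print(used_ids)
```

Images 3 and 7 are not referenced by any row, so they are the only ones left out.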
Use joins instead of that many subselects, and DISTINCT to return unique values. Give each join an ON condition, otherwise the tables are cross-joined:
SELECT DISTINCT i.id
FROM images i
LEFT JOIN `user` u ON u.image = i.id OR u.imageSmall = i.id
LEFT JOIN `books` b ON b.image = i.id OR b.imageSmall = i.id
WHERE u.id IS NOT NULL OR b.id IS NOT NULL
I'm having a brain fart as to how I would do this.
I need to select only the latest entry in each group of entries that share the same id.
I have records in an appointment table.
lead_id  app_id
4        42
3        43
1        44
2        45
2        46   (want this one)
1        48
3        49   (this one)
4        50   (this one)
1        51   (this one)
The results I require are app_id 46, 49, 50, 51:
only the latest entry in the appointment table for each duplicated lead_id.
Here is the query you're looking for:
SELECT A.lead_id
      ,MAX(A.app_id) AS last_app_id
FROM appointment A
GROUP BY A.lead_id
If you want every column of the expected rows:
SELECT A.*
FROM appointment A
INNER JOIN (SELECT A2.lead_id
                  ,MAX(A2.app_id) AS last_app_id
            FROM appointment A2
            GROUP BY A2.lead_id) M ON M.lead_id = A.lead_id
                                  AND M.last_app_id = A.app_id
ORDER BY A.lead_id
Here I simply join against the previous query in order to keep only the desired rows.
Hope this helps.
The accepted answer by George Garchagudashvili is not a good answer, because it uses GROUP BY with unaggregated columns in the SELECT. SELECT * with GROUP BY is simply something that should not be allowed in SQL, and it isn't in almost all databases. Happily, the default configuration of more recent versions of MySQL also rejects this syntax.
An efficient solution is:
select a.*
from appointment a
where a.app_id = (select max(a2.app_id)
from appointment a2
where a2.lead_id = a.lead_id
);
With an index on appointment(lead_id, app_id), this should be as fast or faster than George's query.
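Here is a small runnable sketch of the correlated-subquery approach, using the rows from the question in an in-memory SQLite database (the Python/SQLite harness and the index name are my assumptions; the ORDER BY is only there to make the output stable).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE appointment (lead_id INT, app_id INT)")
conn.executemany(
    "INSERT INTO appointment VALUES (?, ?)",
    [(4, 42), (3, 43), (1, 44), (2, 45), (2, 46), (1, 48), (3, 49), (4, 50), (1, 51)],
)
# Composite index so the MAX lookup per lead_id can be answered from the index.
conn.execute("CREATE INDEX idx_appointment_lead_app ON appointment (lead_id, app_id)")

# Keep a row only if its app_id is the largest one for its lead_id.
latest = conn.execute(
    """
    SELECT a.lead_id, a.app_id
    FROM appointment a
    WHERE a.app_id = (SELECT MAX(a2.app_id)
                      FROM appointment a2
                      WHERE a2.lead_id = a.lead_id)
    ORDER BY a.lead_id
    """
).fetchall()
print(latest)
```

Every selected column comes from a single real row, so there is no reliance on how the database treats unaggregated columns.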
I think this is a more efficient way of doing it (sorting, then grouping):
SELECT * FROM (
SELECT * FROM appointment
ORDER BY lead_id, app_id DESC
) AS ord
GROUP BY lead_id
This is useful when you also need all the other fields from the table, without complicated queries.
Result:
lead_id  app_id
1        51
2        46
3        49
4        50
I'm trying to build a really simple online functionality system. One of the queries that this system must handle is returning a scoreboard (all of the players' scores, ordered and ranked). I know basic SQL queries, but I'm completely lost when it comes to subqueries, variables within queries, etc.
The table only has three columns: game, user_id, score. The table will let people upload and download scores for any game. I need to work out how to create a query that returns only the users from the game being queried, orders the players by score, then ranks them so that duplicate scores have the same rank. Here's a brief example of the desired outcome:
TABLE
user   game  score
fred   A     100
bill   A     78
john   A     78
dave   B     71
terry  B     60
jean   B     60
tom    A     60
nick   A     57

DESIRED OUTPUT
user   score  rank
fred   100    1
bill   78     2
john   78     2
tom    60     4
nick   57     5

CURRENT OUTPUT ** TAKES INTO ACCOUNT THE GAMES IT SHOULD IGNORE
user   score  rank
fred   100    1
bill   78     2
john   78     2
tom    60     5
nick   57     8
This is currently the query that works the best:
SELECT @rank:=@rank+1 AS ranking, user_id, score
FROM score_table , (SELECT @rank:=1) AS i
WHERE game='A'
ORDER BY score DESC
But the rank seems to take into account other games, which ruins the rankings. Other queries I've found have ranked correctly but not eliminated the other games (again, taking the other games' scores into account when ranking).
Again, the above is an example I tweaked, as I have no idea how to use the @ variables, subqueries, etc.
Many thanks,
Dan.
To show the same rank for the same score, you can use CASE and an additional variable that remembers the previous score. Filter on the game inside the subquery, so that the other games' scores are not counted when ranking:
SELECT ranking, user, score FROM
(SELECT
    @rank := CASE WHEN score = @s THEN @rank ELSE @rank + 1 END AS ranking,
    @s := score,
    user, score, game
 FROM tablename, (SELECT @rank := 0, @s := 0) AS i
 WHERE game = 'A'
 ORDER BY score DESC
) new_alias
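If you would rather avoid user variables, a correlated subquery is another option. The sketch below is not the query from this answer: it is a variable-free alternative I am assuming fits the question, run against an in-memory SQLite database from Python. It counts how many players in the same game have a strictly higher score, which gives the 1, 2, 2, 4, 5 numbering from the desired output (tied players share a rank and the next rank is skipped).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE score_table (user_id TEXT, game TEXT, score INT)")
conn.executemany(
    "INSERT INTO score_table VALUES (?, ?, ?)",
    [
        ("fred", "A", 100), ("bill", "A", 78), ("john", "A", 78),
        ("dave", "B", 71), ("terry", "B", 60), ("jean", "B", 60),
        ("tom", "A", 60), ("nick", "A", 57),
    ],
)

# rank = 1 + number of players in the SAME game with a strictly higher score.
# Restricting the inner query to h.game = s.game keeps other games out of the count.
scoreboard = conn.execute(
    """
    SELECT s.user_id,
           s.score,
           1 + (SELECT COUNT(*)
                FROM score_table h
                WHERE h.game = s.game
                  AND h.score > s.score) AS ranking
    FROM score_table s
    WHERE s.game = ?
    ORDER BY s.score DESC, s.user_id
    """,
    ("A",),
).fetchall()

for user_id, score, ranking in scoreboard:
    print(user_id, score, ranking)
```

On a large table this benefits from an index on (game, score), since the inner count runs once per returned row.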
I am developing a small gaming website for a college fest where users attend a few contests and, based on their ranks in the result table, points are added in the user table. The result table is then truncated for the next event. The schemas are as follows:
user
-------------------------------------------------------------
user_id | name | college | points |
-------------------------------------------------------------
result
---------------------------
user_id | score
---------------------------
Now, the first 3 students are given 100 points, the next 15 are given 50 points, and the others are given 10 points each.
I am having a problem building the queries because I don't know how many users will attempt the contest, so I have to append that many ? placeholders to the query and then close it with ).
My queries are like:
$query_top3 = "update user set points = points + 100 where user_id in (?,?,?)";
$query_next15 = "update user set points = points + 50 where user_id in (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)";
$query_others = "update user set points = points + 10 where user_id in (?,?...........,?)";
How can I prepare those queries dynamically? Or is there a better approach?
EDIT
Though it's similar to this question, in my scenario I have 3 different dynamic queries.
If I understand your requirements correctly, you can rank the results and update the user table (adding points) all in one query:
UPDATE user u JOIN
(
  SELECT user_id,
         (
           SELECT 1 + COUNT(*)
           FROM result
           WHERE score >= r.score
             AND user_id <> r.user_id
         ) rnk
  FROM result r
) q
  ON u.user_id = q.user_id
SET u.points = u.points +
    CASE
      WHEN q.rnk BETWEEN 1 AND 3 THEN 100
      WHEN q.rnk BETWEEN 4 AND 18 THEN 50
      ELSE 10
    END;
It is totally dynamic, based on the contents of the result table. You no longer need to deal with each user_id individually.
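To try the single-statement idea without a MySQL server, here is a rough sketch against an in-memory SQLite database from Python. SQLite has no UPDATE ... JOIN, so the rank is computed in a correlated subquery instead. The 21 sample users, their scores, and the tie rule (rank = 1 + number of strictly higher scores) are my assumptions for illustration, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user   (user_id INTEGER PRIMARY KEY, name TEXT, college TEXT, points INT);
    CREATE TABLE result (user_id INT, score INT);
    """
)
# Hypothetical data: users 1..21 exist, but only users 1..20 took part.
conn.executemany(
    "INSERT INTO user VALUES (?, ?, ?, 0)",
    [(i, f"user{i}", "college") for i in range(1, 22)],
)
# Distinct scores: user 1 scored 100, user 2 scored 99, ..., user 20 scored 81.
conn.executemany(
    "INSERT INTO result VALUES (?, ?)",
    [(i, 101 - i) for i in range(1, 21)],
)

# One statement: each participant's rank is 1 + the number of strictly better
# scores, and the CASE maps that rank to the points to add.
conn.execute(
    """
    UPDATE user
    SET points = points + (
        SELECT CASE
                 WHEN 1 + COUNT(*) <= 3  THEN 100
                 WHEN 1 + COUNT(*) <= 18 THEN 50
                 ELSE 10
               END
        FROM result better
        WHERE better.score > (SELECT me.score
                              FROM result me
                              WHERE me.user_id = user.user_id)
    )
    WHERE user_id IN (SELECT user_id FROM result)
    """
)

points = dict(conn.execute("SELECT user_id, points FROM user"))
print(points)
```

The outer WHERE keeps users without a result row untouched, so nobody who skipped the contest picks up points.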
I have the following tables
customers
cust_id  cust_name
1        a company
2        a company 2
3        a company 3

tariffs
tariff_id  cost_1  cost_2  cost_3
1          2       0       3
2          1       1       1
3          4       0       0

terminals
term_id  cust_id  term_number  tariff_id
1        1        12345        1
2        1        67890        2
3        2        14324        1
4        3        78788        3

usage
term_ident  usage_type  usage_amount  date
12345       1           20            11/12/2010
67890       2           10            31/12/2010
14324       1           1             01/01/2011
14324       2           5             01/01/2011
78788       1           0             14/01/2011
In real life the tables are quite large - there are 5000 customers, 250 tariffs, 500000 terminals and 5 million usage records.
In the terminals table - term_id, cust_id and tariff_id are all foreign keys. There are no foreign keys in the usage table - this is just raw data imported from a csv file.
There could be terminals in the usage table that do not exist in the terminals table - these can be ignored.
What I need to do is produce an invoice per customer based on usage. I only want to include usage between 15/12/2010 and 15/01/2011, which is the billing period. I need to calculate the line items of the invoice for the usage based on its tariff. For example, take the first record in the usage table: the cost of usage_1 (for term_id 1) would be 90x2=180, because that term_ident uses tariff_id number 1.
The output should be as follows:
Customer 2
date        terminal  usage_cost_1  usage_cost_2  usage_cost_3  total cost
01/01/2011  14324     18            0             6             24
I am a competent PHP developer, but only a beginner with SQL. What I need advice on is the most efficient process for producing the invoices. Perhaps there is an SQL query that would help me before the processing in PHP starts to calculate the costs, or perhaps the SQL statement could produce the costs too? Any advice is welcome.
Edit:
1) There is something currently running this process. It's written in C++ and takes around 24 hours to run, and I do not have access to the source.
2) I am using Doctrine in Symfony. I'm not sure how helpful Doctrine will be, as retrieving data as objects is only going to slow down the process, and I'm not sure the use of objects will help much here.
Edit @13:54 ->
Had the usage table specified incorrectly ... Sorry!
I have to map the usage_type to a cost on the specific tariff for each terminal, i.e. a usage_type of 1 = cost_1 in the appropriate tariff ... I guess that makes it slightly more complicated?
There you go, should take less than 24 hours ;)
SELECT u.date, u.term_ident terminal,
       (ta.cost_1 * u.usage_1) usage_cost_1,
       (ta.cost_2 * u.usage_2) usage_cost_2,
       (ta.cost_3 * u.usage_3) usage_cost_3,
       (ta.cost_1 * u.usage_1 + ta.cost_2 * u.usage_2 + ta.cost_3 * u.usage_3) total_cost
FROM `usage` u
INNER JOIN terminals te ON te.term_number = u.term_ident
INNER JOIN tariffs ta ON ta.tariff_id = te.tariff_id
INNER JOIN customers c ON c.cust_id = te.cust_id
WHERE u.date BETWEEN '2010-12-15' AND '2011-01-15'
  AND c.cust_id = 2
This query is only for the customer with cust_id = 2. If you want a result for the whole dataset, just remove that condition. The total repeats the three products because a column alias cannot be referenced in the same SELECT list, and usage is quoted with backticks because it is a reserved word in MySQL.
Update
It's not that trivial with your new requirements. You could transform the usage table into the one you posted before.
To make a decision inside a SELECT query you can use CASE, as below. This is not yet the result you expect, but it could be used to create the transformed usage table.
SELECT u.date, u.term_ident terminal,
       CASE u.usage_type
           WHEN 1 THEN ta.cost_1 * u.usage_amount
           WHEN 2 THEN ta.cost_2 * u.usage_amount
           WHEN 3 THEN ta.cost_3 * u.usage_amount
       END AS usage_cost
FROM `usage` u
INNER JOIN terminals te ON te.term_number = u.term_ident
INNER JOIN tariffs ta ON ta.tariff_id = te.tariff_id
INNER JOIN customers c ON c.cust_id = te.cust_id
WHERE u.date BETWEEN '2010-12-15' AND '2011-01-15'
  AND c.cust_id = 2
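Building on the CASE idea, one way to get one invoice line per terminal and date is to wrap each CASE in SUM and group the rows. The following is only a sketch under my own assumptions: it uses an in-memory SQLite database from Python, stores the dates as ISO yyyy-mm-dd strings so that BETWEEN compares them correctly, and uses the sample rows from the corrected usage table as posted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tariffs   (tariff_id INT, cost_1 INT, cost_2 INT, cost_3 INT);
    CREATE TABLE terminals (term_id INT, cust_id INT, term_number INT, tariff_id INT);
    CREATE TABLE usage     (term_ident INT, usage_type INT, usage_amount INT, date TEXT);

    INSERT INTO tariffs   VALUES (1,2,0,3),(2,1,1,1),(3,4,0,0);
    INSERT INTO terminals VALUES (1,1,12345,1),(2,1,67890,2),(3,2,14324,1),(4,3,78788,3);
    INSERT INTO usage VALUES
        (12345,1,20,'2010-12-11'),
        (67890,2,10,'2010-12-31'),
        (14324,1,1,'2011-01-01'),
        (14324,2,5,'2011-01-01'),
        (78788,1,0,'2011-01-14');
    """
)

# Each SUM(CASE ...) only picks up rows of one usage_type, which pivots the
# per-type rows into one invoice line per customer, date and terminal.
# The inner joins drop usage rows whose terminal is unknown.
invoice_lines = conn.execute(
    """
    SELECT te.cust_id,
           u.date,
           u.term_ident AS terminal,
           SUM(CASE u.usage_type WHEN 1 THEN ta.cost_1 * u.usage_amount ELSE 0 END) AS usage_cost_1,
           SUM(CASE u.usage_type WHEN 2 THEN ta.cost_2 * u.usage_amount ELSE 0 END) AS usage_cost_2,
           SUM(CASE u.usage_type WHEN 3 THEN ta.cost_3 * u.usage_amount ELSE 0 END) AS usage_cost_3,
           SUM(CASE u.usage_type
                   WHEN 1 THEN ta.cost_1
                   WHEN 2 THEN ta.cost_2
                   WHEN 3 THEN ta.cost_3
                   ELSE 0
               END * u.usage_amount) AS total_cost
    FROM usage u
    JOIN terminals te ON te.term_number = u.term_ident
    JOIN tariffs ta   ON ta.tariff_id = te.tariff_id
    WHERE u.date BETWEEN '2010-12-15' AND '2011-01-15'
    GROUP BY te.cust_id, u.date, u.term_ident
    ORDER BY te.cust_id, u.date, u.term_ident
    """
).fetchall()

for line in invoice_lines:
    print(line)
```

Add a cust_id filter to the WHERE clause to produce the lines for a single customer. With 5 million usage rows, an index on the usage date and on terminals(term_number) is likely to matter more than anything done later in PHP.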