Count rows by foreign key - php

I need a little help with some SQL queries. To summarise, I have 2 tables: Player (which represents a sports player) and Goal (which represents a goal a player scores). A Player can have many Goals, linked using a foreign key on the Goal table (player_id).
What I want to do is get a list of "top scoring players" (top 5), but I have no idea where to start with this in MySQL. In PHP I'm fetching all the goals, counting how many times each player_id appears, and then trimming the resulting array of players and goal counts down to 5. It works, but I'm almost positive I can do the counting in MySQL.
How should I approach this?
EDIT
Tables look like:
Player: ID, Name
Goal: player_id, scored_against, time

SELECT p.Name, COUNT(g.player_id) AS Goals
FROM Player p
JOIN Goal g ON g.player_id = p.ID
GROUP BY p.ID, p.Name
ORDER BY Goals DESC
LIMIT 5
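A minimal sketch of running a query of this shape end to end. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained); the column names ID, Name and player_id come from the question's table listing, while the function name top_scorers and the sample rows are made up for illustration.

```python
import sqlite3


def top_scorers(conn, limit=5):
    """Return (name, goals) pairs for the highest-scoring players."""
    return conn.execute(
        """
        SELECT p.Name, COUNT(g.player_id) AS Goals
        FROM Player p
        JOIN Goal g ON g.player_id = p.ID
        GROUP BY p.ID, p.Name
        ORDER BY Goals DESC, p.Name
        LIMIT ?
        """,
        (limit,),
    ).fetchall()


# Hypothetical sample data, only here to exercise the query.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Player (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Goal (
        player_id INTEGER REFERENCES Player(ID),
        scored_against TEXT,
        time TEXT
    );
    INSERT INTO Player VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO Goal VALUES (1, 'X', '10'), (1, 'Y', '20'), (2, 'X', '30');
    """
)
print(top_scorers(conn))
```

Because this is an inner join, a player with no goals (Cy here) does not appear at all, which is usually what a "top scorers" list wants.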

Related

MySQL summary from columns

I need to sum columns together on each row, like a leaderboard. This is how it looks:
Name  | Country | Track 1 | Track 2 | Track 3 | Total
John  | ENG     | 32      | 56      | 24      |
Peter | POL     | 45      | 43      | 35      |
There are two issues here. I could use
update 'table' set Total = track 1 + track 2 + track 3
BUT it's not always 3 tracks; it can be anywhere from 3 to 20.
Second, if I don't SUM it in MySQL I cannot sort it when I present the data in HTML/PHP.
Or is there some other smart way to build leaderboards?
You need to redesign your table to have columns for name, country, track number and data. Then, instead of having a wide table with just 3 track numbers, you have a tall, thin table with each row being the data for a given name, country and track.
Then you can summarise using something like
SELECT
country,
name,
sum(data) as total
FROM trackdata
GROUP BY
name,
country
ORDER BY
sum(data) desc
Take a look here where I have made a SQL fiddle showing this working the way you want it
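As a rough illustration of the tall-table idea, here is a sketch that loads the two rows from the question as one row per track and sums them with the query above. It runs on Python's built-in sqlite3 rather than MySQL, and the column name track and the function name leaderboard are assumptions made for this example.

```python
import sqlite3


def leaderboard(conn):
    """Return (name, country, total) rows, highest total first."""
    return conn.execute(
        """
        SELECT name, country, SUM(data) AS total
        FROM trackdata
        GROUP BY name, country
        ORDER BY total DESC
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trackdata (name TEXT, country TEXT, track INTEGER, data INTEGER)"
)
# One row per (name, country, track) instead of one column per track.
conn.executemany(
    "INSERT INTO trackdata VALUES (?, ?, ?, ?)",
    [
        ("John", "ENG", 1, 32), ("John", "ENG", 2, 56), ("John", "ENG", 3, 24),
        ("Peter", "POL", 1, 45), ("Peter", "POL", 2, 43), ("Peter", "POL", 3, 35),
    ],
)
print(leaderboard(conn))
```

Adding a fourth or twentieth track is then just more rows; neither the table definition nor the query changes.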
Depending upon your expected data, however, you might really be better off having a separate table for Country, where each country name only appears once (and maybe one for name as well). For example, if John is always associated with ENG then you have a repeating group, and it's better to remove that association from the table above (which is really about scores on a track, not who is in what country) and put it into its own table, which is then joined to the track data.
A full solution might have the following tables
**Athlete**
athlete_id
athlete_name
(other data about athletes)
**Country**
country_id
country_name
(other data about countries)
**Track**
Track_id
Track_number
(other data about tracks)
**country_athlete** (this joining table allows for the one-to-many relationship of one country having many athletes)
country_athlete_id
country_id
athlete_id
**Times**
country_athlete_id <--- this identifies a given combination of athlete and country
track_id <--- this identifies the track
data <--- this is where you store the actual time
It can get more complex depending on your data, e.g. can the same track number appear in different countries? If so, then you need another joining table to join one track number to many countries.
Alternatively, even with the poor design of my SQL fiddle example, it might be good to make name, country and track a primary key so that you can only ever have one 'data' value for a given combination of name, country and track. However, this decision, and that of normalising your table into multiple joined tables, would be based upon the data you expect to get.
But either way, as soon as you say 'I don't know how many tracks there will be', you should start thinking 'each track's data appears in one ROW and not one COLUMN'.
Like others mentioned, you need to redesign your database. You need a one-to-many relationship between your Leaderboard table and a new Tracks table. This means that one user can have many tracks, with each track being represented by a record in the Tracks table.
These two tables should be connected by a foreign key; in this case it could be a user_id field.
The total field in the leaderboard table could be updated every time a new track is inserted or updated, or you could have a query similar to the one you wanted. Here is what such a query could look like:
UPDATE leaderboard SET total = (
SELECT SUM(track) FROM tracks WHERE user_id = leaderboard.user_id
)
I recommend you read about database relationships, here is a link:
https://code.tutsplus.com/articles/sql-for-beginners-part-3-database-relationships--net-8561
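A small sketch of the correlated UPDATE above, again on Python's built-in sqlite3 as a stand-in for MySQL. The table layouts, the sample rows and the function name refresh_totals are assumptions; the COALESCE is a small addition so that users without any tracks get 0 instead of NULL.

```python
import sqlite3


def refresh_totals(conn):
    """Recompute leaderboard.total from the tracks table."""
    conn.execute(
        """
        UPDATE leaderboard
        SET total = (
            SELECT COALESCE(SUM(tracks.track), 0)
            FROM tracks
            WHERE tracks.user_id = leaderboard.user_id
        )
        """
    )
    return conn.execute(
        "SELECT user_id, total FROM leaderboard ORDER BY user_id"
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE leaderboard (user_id INTEGER PRIMARY KEY, total INTEGER);
    CREATE TABLE tracks (user_id INTEGER REFERENCES leaderboard(user_id),
                         track INTEGER);
    INSERT INTO leaderboard (user_id) VALUES (1), (2), (3);
    INSERT INTO tracks VALUES (1, 32), (1, 56), (1, 24), (2, 45), (2, 43);
    """
)
print(refresh_totals(conn))
```

Storing the total is a trade-off: it makes reads cheap, but it has to be refreshed whenever tracks change, whereas a SUM ... GROUP BY query is always up to date.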
I still have a lot of issues with this... I don't think the issue is the database though; I think it's more the way I present the data on the web.
I'm able to get all the data etc. The only thing is my table is not filling up the right way.
What I do now is something like: "SELECT * FROM `times` NATURAL JOIN `players`"
Then <?php foreach... ?>
<tr>
<td> <?php echo $row['playerID'];?> </td>
<td> <?php echo $row['Time'];?> </td>
....
The thing is it's hard to get sorting, ordering and SUM all at once with this static table solution.
I searched around for leaderboards and I really don't understand how they build theirs with active ordering etc., like https://www.pgatour.com/leaderboard.html
How do they build leaderboards like that? With sorting and everything.

Most efficient method determining if a list of values completely satisfy a one to many relationship (MySQL)

I have a one-to-many relationship of rooms and their occupants:
Room | User
1 | 1
1 | 2
1 | 4
2 | 1
2 | 2
2 | 3
2 | 5
3 | 1
3 | 3
Given a list of users, e.g. 1, 3, what is the most efficient way to determine which room is completely/perfectly filled by them? So in this case, it should return room 3 because, although they are both in room 2, room 2 has other occupants as well, which is not a "perfect" fit.
I can think of several solutions to this, but am not sure about the efficiency. For example, I can do a group concatenate on the user (ordered ascending) grouping by room, which will give me comma separated strings such as "1,2,4", "1,2,3,5" and "1,3". I can then order my input list ascending and look for a perfect match to "1,3".
Or I can do a count of the total number of users in a room AND containing both users 1 and 3. I will then select the room which has the count of users equal to two.
Note I want the most efficient way, or at least a way that scales up to millions of users and rooms. Each room will have around 25 users. Another thing I want to consider is how to pass this list to the database. Should I construct a query by concatenating AND userid = 1 AND userid = 3 AND userid = 5 and so on? Or is there a way to pass the values as an array into a stored procedure?
Any help would be appreciated.
For example, I can do a group concatenate on the user (ordered ascending) grouping by room, which will give me comma separated strings such as "1,2,4", "1,2,3,5" and "1,3". I can then order my input list ascending and look for a perfect match to "1,3".
First, a word of advice, to improve your level of function as a developer. Stop thinking of the data, and of the solution, in terms of CSVs. It limits you to thinking in spreadsheet terms, and prevents you from thinking in Relational Data terms. You do not need to construct strings, and then match strings, when the data is in the database, you can match it there.
Solution
Now then, in Relational data terms, what exactly do you want ? You want the rooms where the count of users that match your argument user list is highest. Is that correct ? If so, the code is simple.
You haven't given the tables. I will assume room, user, room_user, with deadly ids on the first two, and a composite key on the third. I can give you the SQL solution, you will have to work out how to do it in the non-SQL.
Another thing I want to consider is how to pass this list to the database. Should I construct a query by concatenating AND userid = 1 AND userid = 3 AND userid = 5 and so on? Or is there a way to pass the values as an array into a stored procedure?
To pass the list to the stored proc, because it needs a single calling parm, the length of which is variable, you have to create a CSV list of users. Let's call that parm #user_list. (Note, that is not contemplating the data, that is passing a list to a proc in a single parm, because you can't pass an unknown number of identified users to a proc otherwise.)
Since you constructed the #user_list on the client, you may as well compute #user_count (the number of members in the list) while you are at it, on the client, and pass that to the proc.
Something like:
CREATE PROC room_user_match_sp (
#user_list CHAR(255),
#user_count INT
...
)
AS
-- validate parms, etc
...
SELECT room_id,
match_count,
match_count / #user_count * 100 AS match_pct
FROM (
SELECT room_id,
COUNT(user_id) AS match_count -- no of users matched
FROM room_user
WHERE user_id IN ( #user_list )
GROUP BY room_id -- get one row per room
) AS match_room -- has any matched users
WHERE match_count = MAX( match_count ) -- remove this while testing
It is not clear, if you want full matches only. In that case, use:
WHERE match_count = #user_count
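For the "perfect fit" case from the question, one way to sketch it without a proc is a single GROUP BY with a HAVING clause: a room qualifies when its total occupant count and its count of matched users both equal the size of the list. This is a swapped-in variant, not the proc above. It runs on Python's built-in sqlite3 as a stand-in for MySQL, passes the list as bound parameters rather than a CSV string, and the function name is hypothetical; the room_user table and its composite key follow the assumption stated earlier.

```python
import sqlite3


def rooms_filled_exactly_by(conn, user_ids):
    """Return ids of rooms whose occupants are exactly the given users."""
    users = sorted(set(user_ids))
    if not users:
        return []
    placeholders = ", ".join("?" for _ in users)
    sql = f"""
        SELECT room_id
        FROM room_user
        GROUP BY room_id
        HAVING COUNT(*) = ?                              -- no extra occupants
           AND SUM(user_id IN ({placeholders})) = ?      -- every listed user present
        ORDER BY room_id
    """
    n = len(users)
    return [row[0] for row in conn.execute(sql, (n, *users, n))]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE room_user (room_id INTEGER, user_id INTEGER, "
    "PRIMARY KEY (room_id, user_id))"
)
# The rooms and occupants listed in the question.
conn.executemany(
    "INSERT INTO room_user VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 4), (2, 1), (2, 2), (2, 3), (2, 5), (3, 1), (3, 3)],
)
print(rooms_filled_exactly_by(conn, [1, 3]))
```

As written this aggregates every room; with millions of rooms it would probably be worth first narrowing to rooms that contain one of the listed users, which is left out here to keep the sketch short.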
Expectation
You have asked for a proc-based solution, so I have given that. Yes, it is the fastest. But keep in mind that for this kind of requirement and solution, you could construct the SQL string on the client, and execute it on the "server" in the usual manner, without using a proc. The proc is faster here only because the code is compiled and that step is removed, as opposed to that step being performed every time the client calls the "server" with the SQL string.
The point I am making here is, with the data in a reasonably Relational form, you can obtain the result you are seeking using a single SELECT statement, you don't have to mess around with work tables or temp tables or intermediate steps, which requires a proc. Here, the proc is not required, you are implementing a proc for performance reasons.
I make this point because it is clear from your question that your expectation of the solution is "gee, I can't get the result directly, I have to work with the data first, I am ready and willing to do that". Such intermediate work steps are required only when the data is not Relational.
Maybe not the most efficient SQL, but something like:
SELECT x.room_id,
SUM(x.occupants) AS occupants,
SUM(x.selectees) AS selectees,
SUM(x.selectees) / SUM(x.occupants) as percentage
FROM ( SELECT room_id,
COUNT(user_id) AS occupants,
NULL AS selectees
FROM Rooms
GROUP BY room_id
UNION
SELECT room_id,
NULL AS occupants,
COUNT(user_id) AS selectees
FROM Rooms
WHERE user_id IN (1,3)
GROUP BY room_id
) x
GROUP BY x.room_id
ORDER BY percentage DESC
This will give you a list of rooms ordered by the "best fit" percentage, i.e. it works out a percentage of fulfilment based on the number of people in the room and the number of people from your set who are in the room.
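The same "best fit" ranking can also be sketched with conditional aggregation in a single pass instead of a UNION. This is an alternative formulation, not the query above, run here on Python's built-in sqlite3 as a stand-in for MySQL. The table name Rooms and its columns follow the query above; the function name best_fit and the 1.0 multiplier (to force non-integer division in SQLite) are assumptions of this example.

```python
import sqlite3


def best_fit(conn, user_ids):
    """Return (room_id, occupants, selectees, fraction) rows, best fit first."""
    users = sorted(set(user_ids))
    placeholders = ", ".join("?" for _ in users)
    sql = f"""
        SELECT room_id, occupants, selectees,
               1.0 * selectees / occupants AS percentage
        FROM (
            SELECT room_id,
                   COUNT(*) AS occupants,
                   SUM(user_id IN ({placeholders})) AS selectees
            FROM Rooms
            GROUP BY room_id
        ) AS x
        ORDER BY percentage DESC, room_id
    """
    return conn.execute(sql, users).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Rooms (room_id INTEGER, user_id INTEGER)")
# The rooms and occupants listed in the question.
conn.executemany(
    "INSERT INTO Rooms VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 4), (2, 1), (2, 2), (2, 3), (2, 5), (3, 1), (3, 3)],
)
for row in best_fit(conn, [1, 3]):
    print(row)
```

For users 1 and 3 this ranks room 3 first (both occupants match), then room 2 (two of four), then room 1 (one of three).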

How to use ORDER BY and comparing more than 2 columns

So I'm having some problems with an SQL query, or rather with deciding whether I could solve this faster in a PHP while loop. I have a small table with 4 teams, and each team is assigned a rank based on its total points in golf, football and miniature golf. The points in each game are: 8 to the first-placed team, then 6, 4 and 2 to the other teams depending on their score in that game. Then I convert the total score to a ranking number. Here is an example:
My question and my problem is:
I don't want two teams to share a rank. As you can see, team_id 3 and 2 have the same total points, therefore they share rank 3. I want to order this table by the team with the highest value in any of the games that have been played. So, team 2 has 8 points in one of the games, which is higher than team 3's highest score. So rank_13 should look like this instead:
I tried to make a query that compares two teams with the same rank_12 point, but I don't know how to compare on 3 columns. Also tried to print the first table out in PHP and then alter the displayed values depending on highest value in the 3 columns, even more confusing. Am I unclear somehow? Please give me a comment.
You can order by several columns separated by commas, and use GREATEST() to find the maximum points and compare, like this:
...... ORDER BY rank_12 ASC, GREATEST(golf_points, football_points, mini_points) DESC
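A sketch of this tie-break on made-up data. It runs on Python's built-in sqlite3, where the multi-argument scalar MAX() plays the role of MySQL's GREATEST(); the teams table, its rows and the function name ranked_teams are hypothetical, and the ordering uses the summed points directly rather than a stored rank_12 column.

```python
import sqlite3


def ranked_teams(conn):
    """Return (rank, team_id, total, best_single_game) with ties broken."""
    rows = conn.execute(
        """
        SELECT team_id,
               golf_points + football_points + mini_points AS total,
               MAX(golf_points, football_points, mini_points) AS best
        FROM teams
        ORDER BY total DESC, best DESC
        """
    ).fetchall()
    # Number the ordered rows so no two teams share a rank.
    return [(rank, *row) for rank, row in enumerate(rows, start=1)]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE teams (team_id INTEGER PRIMARY KEY, golf_points INTEGER, "
    "football_points INTEGER, mini_points INTEGER)"
)
# Hypothetical points: teams 2 and 3 tie on 14, but team 2 has an 8.
conn.executemany(
    "INSERT INTO teams VALUES (?, ?, ?, ?)",
    [(1, 4, 8, 8), (3, 6, 6, 2), (2, 8, 2, 4), (4, 2, 4, 6)],
)
for row in ranked_teams(conn):
    print(row)
```

If two teams also share the same best single-game score, a further column (for example team_id) would be needed in the ORDER BY to make the order fully deterministic.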

Complicated SQL query - how do i do it?

I'm currently working on a small feature for a project and wondered how best to achieve the result. I have a table full of reviews, with the score being out of 5 and stored as score in the database. On one page I want to show how many reviews there are for each score, i.e. the number of 5-star reviews, 4-star reviews, etc.
But I don't know how best to achieve this. Surely I don't need 5 different queries; that would be awful design, would it not?
Thanks and hope you can help !
Since I do not have your table structure, I would do something similar to this (with appropriate names replaced)
edited SQL based on comments
Select Score, COUNT(*) as NumScores
From MyTableOfScores
Group By Score
Order by Score Desc
You need something like this:
select score, count(*) from reviews group by score
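One thing to be aware of with GROUP BY is that a score nobody has given simply produces no row. A minimal sketch, using Python's built-in sqlite3 as a stand-in for MySQL and a made-up reviews table, that runs the single query and fills in the missing scores with 0 on the application side (the function name score_counts is hypothetical):

```python
import sqlite3


def score_counts(conn):
    """Return {score: number_of_reviews} for scores 5 down to 1."""
    found = dict(
        conn.execute("SELECT score, COUNT(*) FROM reviews GROUP BY score")
    )
    # Scores with no reviews are absent from the query result; default to 0.
    return {score: found.get(score, 0) for score in range(5, 0, -1)}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (id INTEGER PRIMARY KEY, score INTEGER)")
# Hypothetical sample reviews.
conn.executemany(
    "INSERT INTO reviews (score) VALUES (?)",
    [(5,), (5,), (4,), (3,), (5,), (1,)],
)
print(score_counts(conn))
```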

MySQL - Please help with optimization, I am not sure how

I would really appreciate it if some of you could help optimize my tables, as they currently take extremely long to execute because of the following issues:
Main table:
game {
game_id [PRIMARY KEY]
*team_id
*stadium_id
home_score
opponent_score
date
tour
half_home_score
half_opp_score
attendance
referee_id
coach_id
captain_id
}
players (contains all players that played in the game) {
*game_id
*team_id
*player_id
position
number
}
tries, conversions, penalties, dropgoals {
*player_id
*game_id
*team_id
}
team {
team_id [PRIMARY KEY]
country
}
player {
player_id [PRIMARY KEY]
*team_id
name
surname
fullname
DOB
POB
school
height
weight
position
}
I tend to make all *_id's of type (UNSIGNED INT[5])? I am a bit unsure if UNSIGNED is needed
All text is VARCHAR(500..200) depending on the size needed
I use DATE type where I can eg. DOB, date
' * ' - refers to foreign keys that are primary keys in other tables
One page is particularly slow, what happens is the following:
The page shows the lineup of players for the specific game so my queries are the following:
SELECT all player_id's,number,position FROM players table WHERE game_id is specific game's id
getTries(player_id,game_id) from tries table
.
.
getDropgoals(player_id,game_id) from dropgoals table
getPlayerName(player_id) from player table
Output to table the received details
<tr>
<td>tries</td>...<td>dropgoals</td><td>position</td><td>player's name</td><td>number</td></tr>
I would really appreciate it if someone could point out some visible pitfalls.
Regards
// edit
I have used the following query, but it only outputs the rows found in the tries table. I want it to output all players found and count the number of tries each one scored; if a player scored no tries, it must still output the player's details, with 0 for tries scored.
I am not sure if my query is correct:
SELECT ps.player_id, ps.position, ps.number, p.name, p.surname, COUNT(*) AS triesNo
FROM players ps, player p, tries
WHERE p.player_id=ps.player_id
AND ps.game_id = '$game_id'
AND ps.team_id IS NULL
AND tries.player_id = ps.player_id
GROUP BY ps.player_id
ORDER BY ps.number
I want it to also return a player if he scored no tries, now it only returns the player if he scored a try.
Can anyone help please?
From the sounds of it you may be (a) running too many queries on a page load, or (b) not joining data on appropriate fields, if you're selecting from multiple tables at once.
For example, from what you said, you appear to be running a whole set of queries to get player names, but you can merge that with your first query like so:
SELECT ps.player_id, ps.position, ps.number, p.name
FROM players ps, player p
WHERE p.player_id=ps.player_id
That query joins two tables on the player_id, and you end up with an array of players with id/position/number/name.
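The try counts can probably be folded into the same query with a LEFT JOIN, which keeps players who have no matching rows in tries and counts them as 0. The sketch below runs on Python's built-in sqlite3 as a stand-in for MySQL, with trimmed-down versions of the question's tables and made-up rows; the function name lineup_with_tries is hypothetical, and the ps.team_id IS NULL condition from the question's query is left out because its purpose is not clear from the post.

```python
import sqlite3


def lineup_with_tries(conn, game_id):
    """Return every player in the game with the number of tries scored."""
    return conn.execute(
        """
        SELECT ps.player_id, ps.position, ps.number, p.name, p.surname,
               COUNT(t.player_id) AS triesNo   -- counts only matched rows
        FROM players ps
        JOIN player p ON p.player_id = ps.player_id
        LEFT JOIN tries t
               ON t.player_id = ps.player_id
              AND t.game_id = ps.game_id
        WHERE ps.game_id = ?
        GROUP BY ps.player_id, ps.position, ps.number, p.name, p.surname
        ORDER BY ps.number
        """,
        (game_id,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE player (player_id INTEGER PRIMARY KEY, name TEXT, surname TEXT);
    CREATE TABLE players (game_id INTEGER, team_id INTEGER, player_id INTEGER,
                          position TEXT, number INTEGER);
    CREATE TABLE tries (player_id INTEGER, game_id INTEGER, team_id INTEGER);
    INSERT INTO player VALUES (10, 'A', 'One'), (11, 'B', 'Two');
    INSERT INTO players VALUES (1, 7, 10, 'Prop', 1), (1, 7, 11, 'Hooker', 2);
    -- Player 10 scores twice in game 1 and once in game 2; player 11 never.
    INSERT INTO tries VALUES (10, 1, 7), (10, 1, 7), (10, 2, 7);
    """
)
for row in lineup_with_tries(conn, 1):
    print(row)
```

The important details are COUNT(t.player_id) rather than COUNT(*), so that the all-NULL row produced by the LEFT JOIN counts as 0, and putting the game_id match in the ON clause rather than the WHERE clause, so that players without tries are not filtered out again.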
You may also want to look into indexes. It's a good idea to index any field(s) used in a WHERE clause (that aren't already indexed with a primary key).
As others have said, you'll need to be more specific about which queries are running slowly.
UNSIGNED doubles the maximum value an INT id column can hold (from 2,147,483,647 to 4,294,967,295). The drawback is that you will not be able to use negative ids (which is evil, anyway).
