I would really appreciate it if some of you could help optimize my tables, as queries against them currently take extremely long to execute. The structure is as follows:
Main table:
game {
game_id [PRIMARY KEY]
*team_id
*stadium_id
home_score
opponent_score
date
tour
half_home_score
half_opp_score
attendance
referee_id
coach_id
captain_id
}
players (contains all players that played in the game) {
*game_id
*team_id
*player_id
position
number
}
tries, conversions, penalties, dropgoals {
*player_id
*game_id
*team_id
}
team {
team_id [PRIMARY KEY]
country
}
player {
player_id [PRIMARY KEY]
*team_id
name
surname
fullname
DOB
POB
school
height
weight
position
}
I tend to make all *_id's of type (UNSIGNED INT[5])? I am a bit unsure if UNSIGNED is needed
All text is VARCHAR(500..200) depending on the size needed
I use DATE type where I can eg. DOB, date
' * ' - refers to foreign keys that are primary keys in other tables
One page is particularly slow, what happens is the following:
The page shows the lineup of players for the specific game so my queries are the following:
SELECT all player_id's,number,position FROM players table WHERE game_id is specific game's id
getTries(player_id,game_id) from tries table
.
.
getDropgoals(player_id,game_id) from dropgoals table
getPlayerName(player_id) from player table
Output to table the received details
<tr>
<td>tries</td>...<td>dropgoals</td><td>position</td><td>player's name</td><td>number</td></tr>
I would really appreciate it if someone could point out some visible pitfalls.
Regards
// edit
I have used the following query, but it only outputs the rows that are found in the tries table. I want it to output all players found and count the number of tries each one scored; if a player scored no tries, it must still output the player's details, with 0 for tries scored.
I am not sure if my query is correct:
SELECT ps.player_id, ps.position, ps.number, p.name, p.surname, COUNT(*) AS triesNo FROM players ps, player p, tries WHERE p.player_id=ps.player_id AND ps.game_id = '$game_id' AND ps.team_id IS NULL AND tries.player_id = ps.player_id GROUP BY ps.player_id ORDER BY ps.number
I want it to also return a player if he scored no tries; right now it only returns the player if he scored a try.
Can anyone help please?
From the sounds of it you may be (a) running too many queries on a page load, or (b) not joining data on appropriate fields, if you're selecting from multiple tables at once.
For example, from what you said, you appear to be running a whole set of queries to get player names, but you can merge that with your first query like so:
SELECT ps.player_id, ps.position, ps.number, p.name
FROM players ps, player p
WHERE p.player_id=ps.player_id
That query joins two tables on the player_id, and you end up with an array of players with id/position/number/name.
You may also want to look into indexes. It's a good idea to index any field(s) used in a WHERE clause (that aren't already indexed with a primary key).
As others have said, you'll need to be more specific about which queries are running slow.
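The join and index advice above also covers the edited question (players with zero tries must still appear). Here is a minimal sketch of one way to do it, using Python's sqlite3 module as a stand-in for MySQL; the sample rows and the index names are made up for illustration. The key points are the LEFT JOIN onto tries and COUNT(t.player_id), which counts only matched rows, so a player without tries gets 0 instead of disappearing.

```python
import sqlite3

# Hypothetical miniature of the schema in the question; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE player  (player_id INTEGER PRIMARY KEY, name TEXT, surname TEXT);
    CREATE TABLE players (game_id INTEGER, team_id INTEGER, player_id INTEGER,
                          position TEXT, number INTEGER);
    CREATE TABLE tries   (player_id INTEGER, game_id INTEGER, team_id INTEGER);
    -- Index the columns used for lookups (index names are assumptions).
    CREATE INDEX idx_players_game ON players (game_id, player_id);
    CREATE INDEX idx_tries_game_player ON tries (game_id, player_id);

    INSERT INTO player  VALUES (1, 'Ann', 'Able'), (2, 'Bob', 'Baker');
    INSERT INTO players VALUES (10, 1, 1, 'Fullback', 15), (10, 1, 2, 'Wing', 14);
    INSERT INTO tries   VALUES (2, 10, 1), (2, 10, 1);
""")

LINEUP_SQL = """
    SELECT ps.player_id, ps.position, ps.number, p.name, p.surname,
           COUNT(t.player_id) AS triesNo   -- counts only matched tries rows
    FROM players ps
    JOIN player p ON p.player_id = ps.player_id
    LEFT JOIN tries t ON t.player_id = ps.player_id AND t.game_id = ps.game_id
    WHERE ps.game_id = ?
    GROUP BY ps.player_id, ps.position, ps.number, p.name, p.surname
    ORDER BY ps.number
"""

def lineup(game_id):
    """Return one row per player in the game, with 0 tries when none were scored."""
    return conn.execute(LINEUP_SQL, (game_id,)).fetchall()

for row in lineup(10):
    print(row)
```

The same pattern should extend to conversions, penalties and dropgoals, although joining several of those tables in one query multiplies rows, so a separate subquery per table is probably safer there.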
UNSIGNED doubles the maximum positive value an INT id column can hold (the storage size stays the same). The drawback is that you will not be able to use negative ids (which is evil, anyway).
First a bit of background about the tables & DB.
I have a MySQL db with a few tables in:
films:
Contains all film/series info with netflixid as a unique primary key.
users:
Contains user info "ratingid" is a unique primary key
rating:
Contains ALL user rating info, netflixid and a unique primary key of a compound "netflixid-userid"
This statement works:
SELECT *
FROM films
WHERE
INSTR(countrylist, 'GB')
AND films.netflixid NOT IN (SELECT netflixid FROM rating WHERE rating.userid = 1)
LIMIT 1
but it takes longer and longer to retrieve a new film record that you haven't rated. (currently at 6.8 seconds for around 2400 user ratings on an 8000 row film table)
First I thought it was the INSTR(countrylist, 'GB'), so I split them out into their own tinyint columns - made no difference.
I have tried NOT EXISTS as well, but the times are similar.
Any thoughts/ideas on how to select a new "unrated" row from films quickly?
Thanks!
Try just joining?
SELECT *
FROM films
LEFT JOIN rating on rating.ratingid=CONCAT(films.netflixid,'-',1)
WHERE
INSTR(countrylist, 'GB')
AND rating.ratingid IS NULL
LIMIT 1
Or doing the equivalent NOT EXISTS.
I would recommend not exists:
select *
from films f
where
instr(countrylist, 'GB')
and not exists (
select 1 from rating r where r.userid = 1 and f.netflixid = r.netflixid
)
This should take advantage of the primary key index of the rating table, so the subquery executes quickly.
That said, the instr() function in the outer query is also a bottleneck. The database cannot take advantage of an index here, because of the function call: basically it needs to apply the computation to the whole table before it is able to filter. To avoid this, you would probably need to review your design: that is, have a separate table to represent the relationship between movies and countries, with each tuple on a separate row; then you could use another exists subquery to filter on the country.
The INSTR(countrylist, 'GB') could be changed to countrylist = 'GB', or to countrylist LIKE '%GB%' if countrylist contains more than one country. Note that a LIKE pattern with a leading wildcard still cannot use an ordinary index.
Also, don't select everything with '*' if you only need some of the columns. Depending on the number of columns, the query could be really slow.
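The redesign described above could look roughly like this. It is a sketch only: film_country is a hypothetical table name, the sample data is invented, and SQLite (through Python's sqlite3) stands in for MySQL. Both filters become index lookups, because each subquery hits the primary key of its table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE films (netflixid INTEGER PRIMARY KEY, title TEXT);
    -- Hypothetical link table: one row per (film, country) pair.
    CREATE TABLE film_country (
        netflixid INTEGER NOT NULL,
        country   TEXT NOT NULL,
        PRIMARY KEY (country, netflixid)
    );
    CREATE TABLE rating (
        netflixid INTEGER NOT NULL,
        userid    INTEGER NOT NULL,
        PRIMARY KEY (userid, netflixid)
    );
    INSERT INTO films VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO film_country VALUES (1, 'GB'), (2, 'GB'), (3, 'US');
    INSERT INTO rating VALUES (1, 1);   -- user 1 has already rated film 1
""")

UNRATED_SQL = """
    SELECT f.netflixid, f.title
    FROM films f
    WHERE EXISTS (SELECT 1 FROM film_country fc
                  WHERE fc.netflixid = f.netflixid AND fc.country = ?)
      AND NOT EXISTS (SELECT 1 FROM rating r
                      WHERE r.userid = ? AND r.netflixid = f.netflixid)
    ORDER BY f.netflixid
    LIMIT 1
"""

def next_unrated(country, userid):
    """Return the first film available in `country` that `userid` has not rated."""
    return conn.execute(UNRATED_SQL, (country, userid)).fetchone()

print(next_unrated("GB", 1))
```

Whether MySQL picks the same plan depends on its optimizer and your data, so it is worth checking the real query with EXPLAIN.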
I need to sum columns together on each row, like a leaderboard. How it looks:
Name  | Country | Track 1 | Track 2 | Track 3 | Total
John  | ENG     | 32      | 56      | 24      |
Peter | POL     | 45      | 43      | 35      |
Two issues here, I could use the
update 'table' set Total = track 1 + track 2 + track 3
BUT it's not always 3 tracks, anywhere from 3 to 20.
Second, if I don't SUM it in MySQL I cannot sort it when I present the data in HTML/PHP.
Or is there some other smart way to build leaderboards?
You need to redesign your table to have columns for name, country, track number and data. Then, instead of having a wide table with just 3 track numbers, you have a tall, thin table with each row being the data for a given name, country and track.
Then you can summarise using something like
SELECT
country,
name,
sum(data) as total
FROM trackdata
GROUP BY
name,
country
ORDER BY
sum(data) desc
Take a look here where I have made a SQL fiddle showing this working the way you want it
Depending on your expected data, however, you might be better off having a separate table for Country, where each country name appears only once (and maybe one for name too). For example, if John is always associated with ENG then you have a repeating group, and it's better to remove that association from the table above, which is really about scores on a track rather than who is in which country, and put it into its own table that is then joined to the track data.
A full solution might have the following tables
**Athlete**
athlete_id
athlete_name
(other data about athletes)
**Country**
country_id
country_name
(other data about countries)
**Track**
Track_id
Track_number
(other data about tracks)
**country_athlete** (this joining table allows for the one-to-many of one country having many athletes)
country_athlete_id
country_id
athlete_id
**Times**
country_athlete_id <--- this identifies a given combination of athlete and country
track_id <--- this identifies the track
data <--- this is where you store the actual time
It can get more complex depending on your data, e.g. can the same track number appear in different countries? If so, then you need another joining table to join one track number to many countries.
Alternatively, even with the poor design of my SQL fiddle example, it might be good to make name,country and track a primary key so that you can only ever have one 'data' value for a given combination of name, country and track. However, this decision, and that of normalising your table into multiple joined tables would be based upon the data you expect to get.
But either way as soon as you say 'I don't know how many tracks there will be' then you should start thinking 'each track's data appears in one ROW and not one COLUMN'.
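To make the "one row per track" idea concrete, here is a small sketch using the two rows from the question, again with Python's sqlite3 standing in for MySQL. The trackdata table and its column names follow the query earlier in this answer; the composite primary key is the optional constraint mentioned above. The total is computed and sorted in SQL, and it keeps working when a player has 4 or 20 tracks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trackdata (
        name    TEXT    NOT NULL,
        country TEXT    NOT NULL,
        track   INTEGER NOT NULL,
        data    INTEGER NOT NULL,
        PRIMARY KEY (name, country, track)   -- one value per name/country/track
    );
    INSERT INTO trackdata VALUES
        ('John',  'ENG', 1, 32), ('John',  'ENG', 2, 56), ('John',  'ENG', 3, 24),
        ('Peter', 'POL', 1, 45), ('Peter', 'POL', 2, 43), ('Peter', 'POL', 3, 35);
""")

def leaderboard():
    """Return (name, country, total, number_of_tracks), highest total first."""
    return conn.execute("""
        SELECT name, country, SUM(data) AS total, COUNT(*) AS tracks
        FROM trackdata
        GROUP BY name, country
        ORDER BY total DESC
    """).fetchall()

for row in leaderboard():
    print(row)
```

Adding a fourth track for John is just one more INSERT; no column has to be added and the query stays the same.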
Like others mentioned, you need to redesign your database. You need a one-to-many relationship between your Leaderboard table and a new Tracks table. This means that one user can have many tracks, with each track represented by a record in the Tracks table.
These two tables should be connected by a foreign key; in this case it could be a user_id field.
The total field in the leaderboard table could be updated every time a new track is inserted or updated, or you could have a query similar to the one you wanted. Here is what such a query could look like:
UPDATE leaderboard SET total = (
SELECT SUM(track) FROM tracks WHERE user_id = leaderboard.user_id
)
I recommend you read about database relationships, here is a link:
https://code.tutsplus.com/articles/sql-for-beginners-part-3-database-relationships--net-8561
I still get a lot of issues with this... I don't think the issue is the database though, I think it's more the way I present the data on the web.
I'm able to get all the data etc. The only thing is my table is not filling up the right way.
What I do now is like: "SELECT * FROM `times` NATURAL JOIN `players`"
Then <?php foreach... ?>
<tr>
<td> <?php echo $row['playerID']; ?> </td>
<td> <?php echo $row['Time']; ?> </td>
....
The thing is it's hard to get sorting, ordering and SUM all at once with this static table solution.
I searched around for leaderboards and I really don't understand how they build theirs with active ordering etc., like https://www.pgatour.com/leaderboard.html
How do they build leaderboards like that? With sorting and everything.
I need a little help with some SQL queries. To summarise, I have 2 tables: Player (which represents a sports player) and Goal (which represents a goal a player scores). A Player can have many Goals, linked using a foreign key on the Goal table (player_id).
What I want to do is get a list of "top scoring players" (top 5), but I have no idea where to start to do this using MySQL. In PHP I'm getting all the goals, then with each goal counting how many player_id's appear and group them like that (then with the array of players and their goal count, trimming the array down to 5). It works, but I'm almost positive I can do the counting in MySQL.
How should I approach this?
EDIT
Tables look like
Player
ID
Name
Goal
player_id
scored_against
time
SELECT COUNT(*) AS Goals, Player.Name
FROM Player
JOIN Goal ON Goal.player_id = Player.ID
GROUP BY Player.ID, Player.Name
ORDER BY Goals DESC
LIMIT 5
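Here is a runnable sketch of that counting approach, with the Player and Goal tables from the question built in an in-memory SQLite database through Python's sqlite3 (the sample players and goals are invented, and SQLite is only a stand-in for MySQL). The database does the grouping, counting and trimming to five, so the application code only reads the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Player (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Goal (player_id INTEGER REFERENCES Player(ID),
                       scored_against TEXT, time INTEGER);
""")
# Seven made-up players; player i scores i goals, so P7 should lead.
conn.executemany("INSERT INTO Player VALUES (?, ?)",
                 [(i, f"P{i}") for i in range(1, 8)])
conn.executemany("INSERT INTO Goal VALUES (?, 'X', 0)",
                 [(i,) for i in range(1, 8) for _ in range(i)])

def top_scorers(limit=5):
    """Return (name, goals) for the highest scorers, best first."""
    return conn.execute("""
        SELECT p.Name, COUNT(*) AS goals
        FROM Goal g
        JOIN Player p ON p.ID = g.player_id
        GROUP BY p.ID, p.Name
        ORDER BY goals DESC, p.Name
        LIMIT ?
    """, (limit,)).fetchall()

print(top_scorers())
```

Players with no goals do not appear because of the inner join, which is fine for a top-five list.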
I'm trying to record test/quiz scores in a database. What's the best method to do this when there might be a lot of tests and users?
These are some options I considered: should I create a new column for each quiz and a row for each user, or does this have its limitations? Might this be slow? Should I create a new row for each user & quiz? Should I stick to my original 'user' database and encode it in text?
Elaborating a little on the plan: a JavaScript quiz submits the score with AJAX, and a script sends it to the database. I'm new to PHP so I'm not sure about a good approach.
Any help would be greatly appreciated :) this is for a school science fair
I'd suggest 3 data tables in your database: students, tests, and scores.
Each student needs to have fields for an ID and whatever else (name, dob, etc) you want to record about them.
Tests should have fields for an ID and whatever else (name, date, weight, etc).
Scores should have the student ID, a test ID, and the score (and anything else).
This means you can query a student and join with the scores table to get all the student's scores. You can also join the tests table to these results to put labels onto each score and calculate a grade based on scores and weight.
Alternately you can query for a test and join with the scores to get all the scores on a given test to get the class stats.
I would say create a database table, maybe one that lists all students (name, dob, student id), and then one for all tests (score, date, written by). Will only you access the db, or can your students access it too? If the latter, you need to set up appropriate security or "views" to ensure each student can only see their own grades (not everyone's).
Definitely do not create dynamic columns! (No column for each quiz.) Also, adding columns to the user table (or generally any table) when they do not describe the user (or generally that table's item) is a bad approach...
This is a good example of normalization: you should avoid storing any redundant rows. To do that you would create 3 tables and foreign keys to ensure scores always reference an existing user and quiz. E.g.:
users - id, nickname, name
quizzes - id, quizName, quizOtherData
scores - id, user_id (references users.id), quiz_id (references quizzes.id), score
And then add rows to the scores table per user per quiz. Additionally, you could create a UNIQUE key on the columns user_id and quiz_id to prevent users from completing one quiz more than once.
This will be fast and will not store redundant (unneeded extra) data.
To get the results of the quiz with id e.g. 4, plus user info for the people who submitted this quiz, ordered from highest to lowest score, you would run a query like:
SELECT users.*, scores.score
FROM scores RIGHT JOIN users ON(users.id=scores.user_id)
WHERE scores.quiz_id = 4
ORDER BY score DESC
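The reason I used a RIGHT JOIN here is that there might be users who didn't do this quiz, whereas every score always has an existing user and quiz (due to the foreign keys). Note, though, that the WHERE condition on scores.quiz_id removes users without a score for quiz 4; to keep them in the result, move that condition into the ON clause.
To get overall info on all users, quizzes and scores you would do something like: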
SELECT *
FROM quizzes
LEFT JOIN scores ON(quizzes.id=scores.quiz_id)
LEFT JOIN users ON(users.id=scores.user_id)
ORDER BY quizzes.id DESC, scores.score DESC, users.name ASC
BTW: If you are new to PHP (or anybody reading this), use PHP's PDO interface to communicate with your database :) AVOID functions like mysql_query; at the very least use mysqli_query, but for portability I would recommend staying with PDO.
I have a voting system for articles. Articles are stored in 'stories' table and all votes are stored in 'votes' table. id in 'stories' table is equal to item_name in 'votes' table (therefore each vote is related to article with item_name).
I want to make it so when sum of votes gets to 10 it updates 'showing' field in 'stories' table to value of "1".
I was thinking about setting up a cron job that runs every hour to check all posts that have showing = 0. If showing = 0 then it will sum up the votes related to that article and set showing = 1 if the sum of votes >= 10. I'm not sure if it is efficient, as it might take up a lot of server resources.
So could anyone suggest a cron job that could do the task?
Here is my database structure:
Stories table
Votes table
Edit:
For example this row from 'stories' table:
id| 12
st_auth | author name
st_date | story date
st_title| story title
st_category| story category
st_body| story body
showing| 0 for unapproved and 1 for approved
This row is related to this one from 'votes' table
id| 83
item_name| 12 (id of article)
vote_value| 1 for upvote -1 for downvote
...
Couple of things:
Why did you name the column item_name in the votes table, when it is actually the id of the article? I would recommend making its type match the article table's id, i.e. an int(11) rather than a varchar(255). Also, you should add a foreign key constraint to the votes table, so that if an article is ever deleted, you don't orphan rows in the votes table.
Why is the vote_value column an int(11)? If it can only be two states (1, or -1) you can do a tinyint(1) signed (for the -1).
The ip column in the votes table is a bit concerning. If you are regulating 'unique' votes by ip, did you account for proxy ips? Something like this should be handled at the account level, so several users from the same proxy IP can issue individual votes.
I wouldn't do a cronjob for determining whether the showing column should be flagged 0 or 1. Rather, I would issue a count every time a vote was cast against the article. So if someone up-voted or down-voted, calculate the new value of the story, and store it in cache for future reads.
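Recalculating at vote time could be sketched as follows. This is an illustration under assumptions: cast_vote is a hypothetical function name, the tables are cut down to the columns from the question, and SQLite through Python's sqlite3 stands in for your real database. The insert and the threshold check run in one transaction, so no cron job is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stories (id INTEGER PRIMARY KEY, st_title TEXT,
                          showing INTEGER NOT NULL DEFAULT 0);
    CREATE TABLE votes (id INTEGER PRIMARY KEY, item_name INTEGER NOT NULL,
                        vote_value INTEGER NOT NULL);
    INSERT INTO stories (id, st_title) VALUES (12, 'story title');
""")

THRESHOLD = 10  # sum of votes needed before a story is shown

def cast_vote(story_id, value):
    """Store a vote (+1 or -1), approve the story if it reached the threshold,
    and return the story's current `showing` flag."""
    with conn:  # one transaction for the insert and the check
        conn.execute(
            "INSERT INTO votes (item_name, vote_value) VALUES (?, ?)",
            (story_id, value),
        )
        conn.execute(
            """
            UPDATE stories SET showing = 1
            WHERE id = ? AND showing = 0
              AND (SELECT COALESCE(SUM(vote_value), 0)
                   FROM votes WHERE item_name = ?) >= ?
            """,
            (story_id, story_id, THRESHOLD),
        )
    row = conn.execute("SELECT showing FROM stories WHERE id = ?", (story_id,))
    return row.fetchone()[0]
```

Once a story is approved this sketch never hides it again; if downvotes should be able to un-approve a story, the UPDATE would need to set showing from the comparison instead.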
Using this query, you get a list of all articles that have at least one vote, plus a column containing the sum of the associated votes.
SELECT s.*, SUM(v.vote_value) AS votes_total
FROM stories AS s INNER JOIN votes AS v
ON v.item_name = s.id
GROUP BY s.id
This way, you can create a view from which you can filter on votes_total >= 10, without the need for a cron job.
Or you can use it as a normal query, something like this:
SELECT * FROM (
SELECT s.*, SUM(v.vote_value) AS votes_total
FROM stories AS s INNER JOIN votes AS v
ON v.item_name = s.id
GROUP BY s.id
) AS totals WHERE votes_total >= 10;
I would use a trigger (an insert trigger) and handle your logic there, in the database itself.
This would remove the polling code (the cron job) altogether.
I would also make your foreign key (in VOTES) the same type as the primary key (in STORIES).
Using a trigger instead of polling will be much cleaner in the long run.
You don't specify your database, but in TSQL (for SQL Server) it could be close to this
CREATE TRIGGER myTrigger
ON VOTES
FOR INSERT
AS
DECLARE @I INT --HOLDS SUM OF VOTES
DECLARE @IN VARCHAR(255) --HOLDS FK ID FOR LOOKUP INTO STORIES IF UPDATE REQUIRED
SELECT @IN = ITEM_NAME FROM INSERTED
SELECT @I = SUM(VOTE_VALUE) FROM VOTES WHERE ITEM_NAME = @IN
IF (@I >= 10)
BEGIN
UPDATE STORIES SET SHOWING = 1 WHERE ID = @IN --This is why your PK/FK should be refactored
END
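Since the database isn't specified, here is the same trigger idea as a runnable sketch in SQLite, driven from Python's sqlite3. The trigger name approve_story and the helper functions are made up, and the trigger syntax differs between SQLite, MySQL and SQL Server, so treat this as an illustration of the approach rather than code to paste in. The application only inserts votes; the database flips showing on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stories (id INTEGER PRIMARY KEY,
                          showing INTEGER NOT NULL DEFAULT 0);
    CREATE TABLE votes (id INTEGER PRIMARY KEY, item_name INTEGER NOT NULL,
                        vote_value INTEGER NOT NULL);

    -- After every inserted vote, approve the story once its vote sum reaches 10.
    CREATE TRIGGER approve_story AFTER INSERT ON votes
    BEGIN
        UPDATE stories SET showing = 1
        WHERE id = NEW.item_name
          AND (SELECT SUM(vote_value) FROM votes
               WHERE item_name = NEW.item_name) >= 10;
    END;

    INSERT INTO stories (id) VALUES (12);
""")

def upvote(story_id):
    """Insert a single +1 vote; the trigger handles the approval logic."""
    with conn:
        conn.execute(
            "INSERT INTO votes (item_name, vote_value) VALUES (?, 1)",
            (story_id,),
        )

def is_showing(story_id):
    """Return the story's current `showing` flag (0 or 1)."""
    row = conn.execute("SELECT showing FROM stories WHERE id = ?", (story_id,))
    return row.fetchone()[0]
```

Because item_name is an integer here, the lookup into stories needs no type conversion, which is the point of keeping the foreign key the same type as the primary key.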