I am finalizing a comments system and have a question.
I have a table for blogs and one for news, and they accept comments.
My comments table stores the text and the id of the content being commented on.
I wonder whether I need to (or should) store some sort of reference so that I know where the comment comes from.
table comment
id | id_content | text | ref
1 | 1 | test | blog
2 | 1 | test | news
thanks
Depending on the number of comments you expect to receive, there are two ways of doing this:
1 - parent_tbl, parent_id - in one big comment table
2 - two tables for comments with a parent_id - one for each primary table
Either way you need to index properly. The second will usually be faster because each table is smaller, but it doesn't extend well: if you add, say, "press_releases", you have to duplicate code, tables, and so on.
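A minimal sketch of option 1, using Python's built-in sqlite3 as a stand-in for MySQL. The column names parent_tbl and parent_id come from the answer above; the index name, the sample rows and the comments_for helper are assumptions made for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here; names other than parent_tbl/parent_id are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comment (
        id         INTEGER PRIMARY KEY,
        parent_tbl TEXT    NOT NULL,   -- 'blog' or 'news'
        parent_id  INTEGER NOT NULL,   -- id of the row in that table
        text       TEXT    NOT NULL
    );
    -- Composite index so a lookup by (table, id) does not scan every comment.
    CREATE INDEX idx_comment_parent ON comment (parent_tbl, parent_id);
""")
conn.executemany(
    "INSERT INTO comment (parent_tbl, parent_id, text) VALUES (?, ?, ?)",
    [("blog", 1, "test"), ("news", 1, "test"), ("blog", 1, "another")],
)


def comments_for(parent_tbl, parent_id):
    """Return the comment texts attached to one blog post or news item."""
    rows = conn.execute(
        "SELECT text FROM comment "
        "WHERE parent_tbl = ? AND parent_id = ? ORDER BY id",
        (parent_tbl, parent_id),
    )
    return [text for (text,) in rows]
```

Adding a third content type such as "press_releases" then only means storing a new parent_tbl value; no new table or query is needed.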
I am creating a video player application with php and mysql.
The application has videos that are gathered in playlists like this:
Playlists table:
+----+--------------+------+
| id | name         | lang |
+----+--------------+------+
| 1  | Introduction | 1    |
+----+--------------+------+
Videos table:
+----+--------+-------------+
| id | name   | playlist_id |
+----+--------+-------------+
| 1  | Video1 | 1           |
| 2  | Video2 | 1           |
+----+--------+-------------+
It worked fine until now, but I need to build a search feature that finds videos by name and language.
I thought of creating another field called lang in the videos table, but then I realized that this might violate database normalization rules, because I would be repeating information unnecessarily.
What can I do to select the videos without creating another field? Or do I need to create a new one with the repeated information?
EDIT:
A LEFT JOIN of both tables is not a solution, because in the future I may add a new table that links to playlists, such as courses.
You can add a language_id column to the Videos table as a foreign key that references Playlists.lang.
Try above solution.
Hope this will help you.
You need to be clear about which attribute you want to assign to which entity (playlist, video or possibly course). You can assign language ids to both playlists and videos independently. Who is to say that you are not allowed to include a video with a language id of 2 in a playlist that carries a language id of 1? (This could, for example, be a video in a foreign language that you want to appear in a playlist in your own language.)
To search for suitable items you should then definitely use some kind of join (on video.playlist_id = playlist.id). The resulting table will contain both video.language_id and playlist.language_id, which is not redundant information, as explained above, since they refer to different entities.
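The join-based search can be sketched as follows, using Python's sqlite3 in place of MySQL. This version uses the question's original schema, where only playlists carry a lang column; the second playlist, the third video and the search_videos helper are made-up sample data and names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE playlists (id INTEGER PRIMARY KEY, name TEXT, lang INTEGER);
    CREATE TABLE videos (
        id          INTEGER PRIMARY KEY,
        name        TEXT,
        playlist_id INTEGER REFERENCES playlists(id)
    );
    -- Rows beyond the question's sample data are hypothetical.
    INSERT INTO playlists VALUES (1, 'Introduction', 1), (2, 'Second playlist', 2);
    INSERT INTO videos VALUES (1, 'Video1', 1), (2, 'Video2', 1), (3, 'Video3', 2);
""")


def search_videos(name_part, lang):
    """Find videos whose name contains name_part in playlists of one language."""
    rows = conn.execute(
        """
        SELECT v.name
        FROM videos AS v
        JOIN playlists AS p ON v.playlist_id = p.id
        WHERE v.name LIKE ? AND p.lang = ?
        ORDER BY v.id
        """,
        (f"%{name_part}%", lang),
    )
    return [name for (name,) in rows]
```

If videos later get their own language_id, the WHERE clause can filter on that column instead of (or in addition to) the playlist's language.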
DETAILS
I have a quiz (let’s call it quiz1). Quiz1 uses the same wordlist each time it is generated.
If the user needs to, they can skip words to complete the quiz. I’d like to store those skipped words in mysql and then later perform statistics on them.
At first I was going to store the missed words in one column as a string. Each word would be separated by a comma.
|testid | missedwords | score | userid |
*************************************************************************
| quiz1 | wordlist,missed,skipped,words | 59 | 1 |
| quiz2 | different,quiz,list | 65 | 1 |
The problem with this approach is that I want to show statistics at the end of each quiz about which words were most frequently missed by users who took quiz1.
I’m assuming that storing missed words in one column as above is inefficient for this purpose, as I'd need to extract the information and then tally it (probably using PHP, unless I stored the tallied data in a separate table).
I then thought perhaps I need to create a separate table for the missed words.
The advantage of the table below is that it should be easy to tally the words from it.
|Instance| missed word |
*****************************
| 1 | wordlist |
| 1 | missed |
| 1 | skipped |
Another approach
I could create a table with tallies and update it each time quiz1 is taken.
Testid | wordlist| missed| skipped| otherword|
**************************************************
Quiz1 | 1 | 1| 1| 0 |
The problem with this approach is that I would need a different table for each quiz, because each quiz will use different words. Also, information is lost because only the tally is kept, not the related data such as which user missed which words.
Question
Which approach would you use? Why? Alternative approaches to this task are welcome. If you see any flaws in my logic please feel free to point them out.
EDIT
Users will be able to retake the quiz as many times as they like. Their information will not be updated, instead a new instance would be created for each quiz they retook.
The best way to do this is to have the word collection completely normalized. This way, analyses will be easy and fast.
quiz_words with wordID, word
quiz_skipped_words with quizID, userID, wordID
To get all the skipped words of a user:
SELECT wordID, word
FROM quiz_words
JOIN quiz_skipped_words USING (wordID)
WHERE userID = ?;
You could add a GROUP BY clause to get a count for each word.
To get the count of a specific word:
SELECT COUNT(*)
FROM quiz_skipped_words
JOIN quiz_words USING (wordID)
WHERE word = ?;
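As a rough illustration of the GROUP BY idea, here is a sketch using Python's sqlite3 in place of MySQL. The table and column names follow this answer; the sample rows and the most_skipped helper are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE quiz_words (wordID INTEGER PRIMARY KEY, word TEXT UNIQUE);
    CREATE TABLE quiz_skipped_words (
        quizID TEXT,
        userID INTEGER,
        wordID INTEGER REFERENCES quiz_words(wordID)
    );
    -- Hypothetical sample data: three users skipping words in quiz1.
    INSERT INTO quiz_words VALUES (1, 'wordlist'), (2, 'missed'), (3, 'skipped');
    INSERT INTO quiz_skipped_words VALUES
        ('quiz1', 1, 1), ('quiz1', 1, 2), ('quiz1', 2, 2),
        ('quiz1', 3, 2), ('quiz1', 3, 3);
""")


def most_skipped(quiz_id):
    """Return (word, times_skipped) for one quiz, most frequently skipped first."""
    rows = conn.execute(
        """
        SELECT w.word, COUNT(*) AS times_skipped
        FROM quiz_skipped_words AS s
        JOIN quiz_words AS w ON w.wordID = s.wordID
        WHERE s.quizID = ?
        GROUP BY w.word
        ORDER BY times_skipped DESC, w.word
        """,
        (quiz_id,),
    )
    return rows.fetchall()
```

Because each skipped word is its own row, the end-of-quiz statistics come from one query, and the per-user detail is still available for later analysis.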
According to database normalization theory, the second approach is better, because ideally each cell of a relational table should store only one value, which is atomic and indivisible. Each word is an entity instance.
Also, I would suggest not creating separate Quiz-Word tables, but instead adding a column to the Missed-Word table for the quiz the word belongs to, and using that column as a foreign key to the Quiz table. That way you can probably avoid creating tables at run time (which is a "bad practice" in database design).
Why not have a quiz table and a quiz_words table? The quiz_words table would store id, quizID and word as columns. Then, for each quiz instance, create records in the quiz_words table for each word the user did use.
You could then run MySQL counts on the quiz_words table based on quizID and/or quiz type.
The best solution (from my point of view) for what you are trying to achieve is the normalized approach:
test table, which has a test_id column and other columns
missed_words table, which has id (auto-increment PK) and word (unique). Here you can also have a hits column that is incremented each time an association to this word is made in the test_missed_words table; this way the stats you want are already compiled and don't need to be calculated by a SELECT query.
test_missed_words, which is a link table that has test_id and missed_word_id (composite PK)
This way you do not have redundant data (missed words) and you can easily extract the stats that you want.
To keep as much information as possible (and to be able to compile user-specific stats later as well as overall stats now), I would create a table structure similar to:
Stats
quizId | userId | type| wordId|
******************************************
1 | 1 | missed| 4|
1 | 1 | skipped| 7|
Where type can be either an int defining the different types of actions or a string representation, depending on whether you believe there can ever be more types. ^^
Then:
Quizzes
quizId | quizName|
********************
1| Quiz 1|
With the word list made for each quiz like:
WordList (pk: wordId)
quizId | wordId| word|
***************************
1 | 1 | Cat|
1 | 2 | Dog|
You would have your user table however you want; we are just linking its id into this system.
With this, all id fields will be non-unique keys in the stats table. When a user skips or misses a word, you would add the id of that word to the stats table along with the relevant quizId and type. Getting stats this way is easy on a per-user basis, a per-word basis, or a per-type basis, or any combination of the three. It also makes the word list for each quiz easily available for building the quizzes. ^^
Hope this helps!
I am making a comment system and now I want to insert comments into the database, but I am confused about the basis on which to assign a specific comment_id.
Suppose we have multiple divs of images, each with a comment system. If someone comments on an image, how can we assign a specific comment id to that specific image? And if another user comments on the same image, how can they find that image's comment_id so that the comment is saved in the right place?
We have many images and a comment system for each image.
My English is bad; hopefully you understand what I want to say.
You need to create several tables in the database:
table: comments (for example)
| comment_id | userID | name  | comment |
| 1          | 50     | James | test    |
| 2          | 50     | James | test    |
| 3          | 50     | James | test    |
table: images
| image_id | link                          |
| 1        | example.com/images/image1.png |
| 2        | example.com/images/image2.png |
| 3        | example.com/images/image3.png |
table: comments_on_images (to make the table's purpose clear)
| id | comment_id | image_id |
| 1  | 1          | 2        |
| 2  | 2          | 2        |
| 3  | 3          | 1        |
Using this method you can associate any number of comments with any image. You have to query the database using JOINs to get all the information you need.
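To make the flow concrete, here is a sketch in Python with sqlite3 standing in for MySQL. The three tables follow the layout above; the add_comment and comments_for_image helpers are hypothetical names. The point it tries to show is that the database assigns comment_id itself (auto-increment), and the link table records which image the comment belongs to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comments (
        comment_id INTEGER PRIMARY KEY,   -- assigned automatically on insert
        userID     INTEGER,
        name       TEXT,
        comment    TEXT
    );
    CREATE TABLE images (image_id INTEGER PRIMARY KEY, link TEXT);
    CREATE TABLE comments_on_images (
        id         INTEGER PRIMARY KEY,
        comment_id INTEGER REFERENCES comments(comment_id),
        image_id   INTEGER REFERENCES images(image_id)
    );
    INSERT INTO images VALUES
        (1, 'example.com/images/image1.png'),
        (2, 'example.com/images/image2.png'),
        (3, 'example.com/images/image3.png');
""")


def add_comment(image_id, user_id, name, text):
    """Store a comment, link it to an image, and return the new comment_id."""
    with conn:  # one transaction: both rows are written or neither is
        cur = conn.execute(
            "INSERT INTO comments (userID, name, comment) VALUES (?, ?, ?)",
            (user_id, name, text),
        )
        conn.execute(
            "INSERT INTO comments_on_images (comment_id, image_id) VALUES (?, ?)",
            (cur.lastrowid, image_id),
        )
    return cur.lastrowid


def comments_for_image(image_id):
    """Return (comment_id, name, comment) for every comment on one image."""
    rows = conn.execute(
        """
        SELECT c.comment_id, c.name, c.comment
        FROM comments AS c
        JOIN comments_on_images AS ci ON ci.comment_id = c.comment_id
        WHERE ci.image_id = ?
        ORDER BY c.comment_id
        """,
        (image_id,),
    )
    return rows.fetchall()
```

The page only needs to know the image_id of the image being commented on; it never has to look up or invent a comment_id.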
I'm working on a commenting application and I would like some feedback on the method I am using to keep track of the number of replies or likes that a comment has. Comments and replies are stored in the same table; to determine whether a comment is a reply I use the field parent_id: if it is anything other than 0, the comment is a reply.
Please note that I won't be including all the columns of the table below:
cid | parent_id | replies | likes
-----+-----------+---------+-------
2 | 0 | 3 | 0
3 | 2 | 0 | 0
4 | 2 | 0 | 2
5 | 2 | 0 | 0
In the table above, the comments with id (cid) 3, 4 and 5 are replies to comment #2. The columns replies and likes are integers that hold the count of replies and likes respectively. The integrity and accuracy of these columns is maintained by the PHP code; for example, if another reply to comment #2 is added, then the replies column is increased by one, and it is decreased by one if a reply is deleted.
I'm also aware that I could calculate the replies count dynamically in the SQL query that fetches the comments, but I thought it would put more load on the SQL server. This query would look something like this:
SELECT cid, parent_id, (
    SELECT COUNT(*)
    FROM comments AS SC
    WHERE SC.parent_id = C.cid
) AS replies
FROM comments AS C
WHERE thread = {thread_id}
Am I doing it right by storing the replies and likes in actual columns in the table? Or am I overestimating the load that a query such as the one above would put on the MySQL server, and should I use that more complex query instead?
Any feedback would be appreciated, thanks
I don't think you need the column called 'replies'. It just takes up additional space.
Create a combined index on cid and parent_id. That should be good enough, and queries should be fast.
By having the column, you are adding more load to both the app code and MySQL (the app code has to maintain integrity, and MySQL does two writes instead of one when a comment is entered).
But if you are talking about millions of rows, I wouldn't choose MySQL for it but rather MongoDB; the data can be constructed as a beautiful JSON document and dumped into MongoDB.
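A small sketch of the computed-count approach, with Python's sqlite3 standing in for MySQL. The cid, parent_id and thread columns come from the question; the index on parent_id alone is an assumption about what the correlated subquery needs, and comments_with_reply_counts is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comments (
        cid       INTEGER PRIMARY KEY,
        parent_id INTEGER NOT NULL DEFAULT 0,   -- 0 means a top-level comment
        thread    INTEGER NOT NULL
    );
    -- Assumed index: lets the subquery below find replies without a full scan.
    CREATE INDEX idx_comments_parent ON comments (parent_id);
    INSERT INTO comments (cid, parent_id, thread) VALUES
        (2, 0, 1), (3, 2, 1), (4, 2, 1), (5, 2, 1);
""")


def comments_with_reply_counts(thread_id):
    """Return (cid, parent_id, replies), with replies counted at query time."""
    rows = conn.execute(
        """
        SELECT C.cid, C.parent_id,
               (SELECT COUNT(*)
                FROM comments AS SC
                WHERE SC.parent_id = C.cid) AS replies
        FROM comments AS C
        WHERE C.thread = ?
        ORDER BY C.cid
        """,
        (thread_id,),
    )
    return rows.fetchall()
```

With this approach there is no stored counter to keep in sync when replies are added or deleted; whether it is fast enough depends on data volume and is best measured on real data.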
I want to create something like reddit where they have comments, then replies to the comment, then reply to the reply.
What type of database structure do they use so:
1. they keep track of all the comments to a posting
2. a reply to a comment
3. a reply to a reply
All I have right now is just a posting and a bunch of comments relating to it, like:
POSTING TABLE
posting_id | title | author
COMMENTS TABLE
comment_id | posting_id | comment
REPLIES TABLE
????
How do I relate the comments to the replies?
What type of css do they use to give replies that indented space?
EDIT:
Thanks for the answers! Now my only question is: how do I indent the replies?
Such as..
you like food
yes I love italian
Yes i do like it too
chinese is best
You can add another column to your comments table specifying parent_comment_id where you populate it with the ID of the comment (or reply) the user is replying to. In the case where the comment is a direct reply to the post (not a reply to a comment) this column would be null.
To show replies inside replies, you'll have to do a recursive call to keep on generating the sub replies.
Something like
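function get_comments($comment_id) {
    // Each comment gets its own wrapper div; nested divs produce the indentation.
    print '<div class="comment_body">';
    // print comment body or something?
    // comment_has_reply() and comment_comments() are placeholder helpers
    // that you would implement against your comments table.
    if (comment_has_reply($comment_id)) {
        foreach (comment_comments($comment_id) as $comment) {
            get_comments($comment->id); // recurse into each reply
        }
    }
    print '</div>';
}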
To indent comments however, use css.
<style type="text/css">
    .comment_body {
        margin-left: 10px;
    }
</style>
This way sub replies are indented more than the parent, and their subs are indented even more, and so on.
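The same recursive idea, sketched in Python so it can be run without a database. The COMMENTS rows and the render function are hypothetical; each row is (comment_id, parent_comment_id, text), with None as the parent of a top-level comment, matching the parent_comment_id column suggested above.

```python
from html import escape

# Hypothetical sample rows: (comment_id, parent_comment_id, text).
COMMENTS = [
    (1, None, "top"),
    (2, 1, "reply"),
    (3, 2, "reply to reply"),
    (4, 1, "second reply"),
]


def render(comments, parent_id=None):
    """Render all comments under parent_id as nested comment_body divs."""
    parts = []
    for comment_id, pid, text in comments:
        if pid == parent_id:
            parts.append(
                '<div class="comment_body">'
                + escape(text)                   # the comment's own text
                + render(comments, comment_id)   # then its replies, nested inside
                + "</div>"
            )
    return "".join(parts)
```

Because each reply's div sits inside its parent's div, the single margin-left rule accumulates with depth. This sketch rescans the whole list for every comment, which is fine for small threads; for large ones, grouping rows by parent first would avoid the repeated scans.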
I would do that by making a cross reference table.
Example:
Table: Posts
Columns: pstkey | userid | postMessage | etc...
pstkey is the key for the post body. userid is the person who created the post. postMessage is the actual post entry.
Table: Comments
Columns: comkey | pstkey | userid | commentMessage | etc...
comkey is the key for the comment made. pstkey references the post it belongs to. userid is the person who made the comment, and commentMessage is the text body of the actual comment.
Table: xref_postComm
Columns: xrefkey | pstkey | comkey | comkey2 |
Now for the fun part. ALL posts go into the post table. ALL comments go into the comment table. The relationships are all defined in the cross reference table.
I do all of my programming this way. I was privileged to work with one of the world's best database engineers, who was retired, and he taught me a few tricks.
How to use the Cross Reference table:
xrefkey | pstkey | comkey | comkey2
All that you look for is whether a given field is populated.
xrefkey (auto-incremented)
pstkey (contains the pstkey for the post)
comkey (contains the comkey for the comment)
comkey2 (contains the comkey for the reply to that comment;
only populate comkey2 if comkey already has a value)
See, no reason for a third table!
With this method you can add as many relationships as you want, now or in the future!
comkey2 is your reply to a reply. A single row then contains the key of the post, the key of the comment, and the key of the reply to that comment, all expressed by which columns of the xref row are populated.
EXAMPLE:
PAGES.... Page table
POSTS
pstkey | pageid | user | Post
-------+--------+------+---------------------------
1      | 1      | 45   | Went to the store the....
2      | 2      | 18   | Saw an apple on tv.....
COMMENTS
comkey | pstkey | user | Comment
-------+--------+------+-------------------------
1      | 1      | 9    | Wanted to say thanks...
2      | 1      | 7    | Cool I like tha.....
3      | 2      | 3    | Great seeing ya....
4      | 2      | 6    | Had a great....
5      | 2      | 2    | Don't sweat it man...
xref_PostCom
xrefkey | pageid | pstkey | comkey | comkey2 | meaning
--------+--------+--------+--------+---------+--------------------------------------------------
1       | 1      | 1      | NULL   | NULL    | Post1 on Page1
2       | 1      | 1      | 1      | NULL    | Comment1 under Post1
3       | 1      | 1      | 2      | NULL    | Comment2 under Post1
4       | 2      | 2      | NULL   | NULL    | Post2 on Page2
5       | 2      | 2      | 3      | NULL    | Comment3 under Post2 on Page2
6       | 2      | 2      | 4      | NULL    | Comment4 under Post2 on Page2 (a second Comment)
7       | 2      | 2      | 4      | 5       | Explained below....
Comment key 5 is matched with comment key 4 under Post2 on Page2; that is, comment 5 is a reply to comment 4.
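One way to read the cross reference table, sketched with Python's sqlite3 in place of MySQL. The rows are the ones from the example above; the page_map helper and the kind labels ('post', 'comment', 'reply') are assumptions added to show how "which columns are populated" can be turned into a row type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE xref_postcom (
        xrefkey INTEGER PRIMARY KEY,
        pageid  INTEGER NOT NULL,
        pstkey  INTEGER NOT NULL,
        comkey  INTEGER,          -- NULL: the row describes the post itself
        comkey2 INTEGER           -- NULL: the row describes a top-level comment
    );
    INSERT INTO xref_postcom VALUES
        (1, 1, 1, NULL, NULL),
        (2, 1, 1, 1,    NULL),
        (3, 1, 1, 2,    NULL),
        (4, 2, 2, NULL, NULL),
        (5, 2, 2, 3,    NULL),
        (6, 2, 2, 4,    NULL),
        (7, 2, 2, 4,    5);
""")


def page_map(pageid):
    """Return (pstkey, comkey, comkey2, kind) rows for one page, in xref order."""
    rows = conn.execute(
        """
        SELECT pstkey, comkey, comkey2,
               CASE
                   WHEN comkey  IS NULL THEN 'post'
                   WHEN comkey2 IS NULL THEN 'comment'
                   ELSE 'reply'
               END AS kind
        FROM xref_postcom
        WHERE pageid = ?
        ORDER BY xrefkey
        """,
        (pageid,),
    )
    return rows.fetchall()
```

The keys returned here would then be joined back to the POSTS and COMMENTS tables to fetch the actual text.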
If you know how to write SELECTs that use JOIN, LEFT JOIN, RIGHT JOIN and INNER/OUTER JOIN to get the data arrays using these relationships, your job becomes a whole lot easier.
I believe engineers call this, basically, "the data map" of defined relationships. The trick is how you access the data using these relationships. It seems hard at first, but now that I know it, I refuse to do it any other way.
What happens in the end is that you write one script that says, OK, go do everything and come back. You end up with one function call that asks for page 1. It returns page 1, post 1, comments 1, 2 and 3, and the replies to the replies in one array. Echo it to the output and you're done.
UPDATE FOR COMMENT
I said the exact same thing the first time it was shown to me. As a matter of fact, it really made me mad that the database programmer was forcing me to do it this way. But now I get it. The advantages are many:
1) One query can be written to pull it all out in one shot.
2) The answers from multiple queries can populate arrays in a structure such that, when printing the page, a loop within a loop can display it.
3) When you upgrade the software that uses it, it can support any design change you can think of. Flawless expandability.
The guy who taught it to me was the hired gun who redesigned the Sears and JCPenney databases, back when they had nine books going to the same house because of duplicate-record issues.
Cross reference tables prevent a lot of issues in design.
The heart of this theory is that a column can not only contain data but also serve as a true-or-false statement at the same time. That in itself saves space. Why search 20 tables when you can search one? One indexed cross reference table can tell you everything you need to know about the other 20 tables: their contents, what you need, what you don't need, and whether you even need to open the other tables at all.
IN SHORT:
One cross reference table containing nothing but INT(2/11) columns, which tells you everything you need to know before you ever open another table, gives you not only flawless expandability but lightning-speed results, not to mention little possibility of duplicate records. To you and me duplicate records may not be an issue. But to Sears, with 4 billion records at $11 a book, mistakes add up.
Add another field to your comments table, called "reply_to" or some such, and store in it the id of the comment that it is a reply to.
You could make the comments table generic, like so:
COMMENTS TABLE
comment_id | posting_type | posting_id | comment
where posting_type is some sort of discriminator, e.g. a string 'POST' or 'COMMENT', or an integer for more efficiency (1 = POST, 2 = COMMENT, etc.).
Edit: admittedly this is more complicated, but it means you can use the same comment table for comments on anything, not just posts and other comments.
You don't need the replies table. As others have already correctly pointed out, recursion is the way to go with an RDBMS. You could also consider using a NoSQL-style DBMS to avoid having to deal with recursion.
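Recursion can also be done inside the database rather than in application code. The sketch below uses a recursive common table expression via Python's sqlite3 (SQLite 3.8.3 or later). The parent_comment_id column follows the suggestion made earlier in this thread; the sample rows, the thread helper and the zero-padded path used for ordering are assumptions. MySQL 8.0 also supports WITH RECURSIVE, though the padding function would differ there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comments (
        comment_id        INTEGER PRIMARY KEY,
        posting_id        INTEGER NOT NULL,
        parent_comment_id INTEGER REFERENCES comments(comment_id),
        comment           TEXT NOT NULL
    );
    -- Hypothetical sample thread on posting 10.
    INSERT INTO comments VALUES
        (1, 10, NULL, 'first'),
        (2, 10, 1,    'reply to first'),
        (3, 10, 2,    'reply to reply'),
        (4, 10, NULL, 'second');
""")


def thread(posting_id):
    """Return (comment_id, depth) for a posting, replies listed under their parent."""
    rows = conn.execute(
        """
        WITH RECURSIVE tree(comment_id, depth, path) AS (
            -- Anchor: top-level comments of the posting.
            SELECT comment_id, 0, printf('%08d', comment_id)
            FROM comments
            WHERE posting_id = ? AND parent_comment_id IS NULL
            UNION ALL
            -- Recursive step: replies to rows already in the tree.
            SELECT c.comment_id, t.depth + 1,
                   t.path || '/' || printf('%08d', c.comment_id)
            FROM comments AS c
            JOIN tree AS t ON c.parent_comment_id = t.comment_id
        )
        SELECT comment_id, depth FROM tree ORDER BY path
        """,
        (posting_id,),
    )
    return rows.fetchall()
```

The depth value can drive the indentation directly (for example, depth times 10px of left margin), so the page can be rendered with a single flat loop.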