I have a script I wrote a while back for comments, but it is only single-threaded. I would like it to be multi-threaded, but only so that a user can reply to a comment, not to a reply. So the threads would only be two levels deep.
Currently I store a comment_id against a user_id in my database.
The only way I can think of to do the multi-threaded comments is to have a parent field in the comments table. But if I do this, then when I am selecting the comments with PHP, I will have to do another SELECT to fetch each comment's children (if any). That seems like a lot of work for the database.
There has to be a better way. Any ideas on this? Or tutorials?
There are three (four) alternative possibilities:
1. A recursive query to select all comments based on their parent ids. This is supported by many database products and the syntax depends on the database type. Check the docs for more info (search for 'recursive').
2. If you store the article id in each (sub-)comment, you can just select all comments with the article id in one regular select query. You can use the parent ids to properly display the comments on the page under the right parent comment:
SELECT * FROM comments WHERE article_id = :article_id
3. If you only need two levels of comments, you can use an extended where to include both first level and second level comments:
SELECT * FROM comments
WHERE parent_id = :article_id
OR parent_id IN (SELECT id FROM comments WHERE parent_id = :article_id)
4. It is also possible to use UNION ALL to combine two queries that have the same columns, but since I assume that all data are from the same table, there is probably no need for it (see the extended where-clause above):
SELECT * FROM comments WHERE parent_id = :article_id
UNION ALL
SELECT * FROM comments WHERE parent_id IN
(SELECT id FROM comments WHERE parent_id = :article_id)
Personally, I would go for option 2 because it is simple (no exotic SQL construct required), efficient (1 query) and flexible (supports as many levels of comments as you like).
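For completeness, the recursive query mentioned as the first possibility could look roughly like the sketch below. It uses Python's sqlite3 module as a stand-in for whatever database you run, and it assumes a comments table with id, article_id, parent_id and body columns where top-level comments have a NULL parent_id. Those names and that convention are assumptions for illustration, not something taken from the question.

```python
import sqlite3

def fetch_thread(conn, article_id):
    """Return (id, parent_id, depth, body) rows for one article's comment tree."""
    sql = """
    WITH RECURSIVE thread(id, parent_id, body, depth) AS (
        -- anchor: top-level comments of the article (assumed to have NULL parent_id)
        SELECT id, parent_id, body, 0
        FROM comments
        WHERE article_id = :article_id AND parent_id IS NULL
        UNION ALL
        -- recursive step: replies to anything already in the thread
        SELECT c.id, c.parent_id, c.body, t.depth + 1
        FROM comments c
        JOIN thread t ON c.parent_id = t.id
    )
    SELECT id, parent_id, depth, body FROM thread ORDER BY depth, id
    """
    return conn.execute(sql, {"article_id": article_id}).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments (id INTEGER PRIMARY KEY, article_id INTEGER, "
    "parent_id INTEGER, body TEXT)"
)
conn.executemany("INSERT INTO comments VALUES (?, ?, ?, ?)", [
    (1, 10, None, "first"),
    (2, 10, 1, "reply to first"),
    (3, 10, None, "second"),
    (4, 11, None, "other article"),
    (5, 10, 2, "reply to reply"),
])
rows = fetch_thread(conn, 10)
for comment_id, parent_id, depth, body in rows:
    print(depth, comment_id, body)
```

The exact WITH RECURSIVE syntax varies between database products, so treat this as a shape to adapt rather than something to paste in.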
1 query is enough for this, you just need to loop the data and store it into an array correctly, then you loop the array and display the comments.
This is a common use for hierarchical, or tree-structure data. I wrote a popular answer to this Stack Overflow question: What is the most efficient/elegant way to parse a flat table into a tree?
I also wrote a presentation describing alternatives for tree-structured data: Models for Hierarchical Data with SQL and PHP.
Another solution that is not included in my presentation is the way Slashdot does threaded comments. They use a parent column like you do, so each comment references the comment it replies to. But then they also include a root column so each comment knows the post it belongs to. There are seldom more than a few hundred comments on a given post, and usually you want to get the whole tree of comments for a given post starting at the top of the comment thread:
SELECT * FROM comments WHERE root = 1234;
Then as you fetch the comments, you can write PHP code to process them into arrays of arrays according to the parent columns (this is what @JanL's answer alluded to). I posted code to do this in my answer to another Stack Overflow question, Convert flat array to the multi-dimentional.
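The "arrays of arrays" step can be sketched as follows. This is not the code from the linked answer, just a minimal illustration in Python; the row layout (id, root, parent, body keys, with parent set to None for top-level comments) is an assumption.

```python
def build_tree(rows):
    """Nest flat comment rows under their parents.

    First pass indexes every row by id, second pass attaches each row
    to its parent's "children" list (or to the list of roots).
    """
    nodes = {row["id"]: {**row, "children": []} for row in rows}
    roots = []
    for row in rows:
        node = nodes[row["id"]]
        parent = nodes.get(row["parent"])
        if parent is None:
            roots.append(node)      # no known parent: top-level comment
        else:
            parent["children"].append(node)
    return roots

# Rows as they might come back from: SELECT * FROM comments WHERE root = 1234
rows = [
    {"id": 1, "root": 1234, "parent": None, "body": "top-level"},
    {"id": 2, "root": 1234, "parent": 1, "body": "reply"},
    {"id": 3, "root": 1234, "parent": 2, "body": "reply to reply"},
    {"id": 4, "root": 1234, "parent": None, "body": "another top-level"},
]
tree = build_tree(rows)
```

Because every row is indexed before any linking happens, the rows do not need to arrive in parent-before-child order.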
This query might work for you (I did not know your structure, so I guessed at it):
SELECT * FROM comments a
LEFT JOIN comments b ON a.comment_id = b.parent_comment_id
LEFT JOIN comments c ON b.comment_id = c.parent_comment_id
WHERE a.comment_id <> b.comment_id
AND a.comment_id <> c.comment_id
AND b.comment_id <> c.comment_id;
I'm working on an existing application that uses some JOIN statements to create "immutable" objects (i.e. the results are always JOINed to create a processable object - results from only one table will be meaningless).
For example:
SELECT r.*,u.user_username,u.user_pic FROM articles r INNER JOIN users u ON u.user_id=r.article_author WHERE ...
will yield a result of type, let's say, ArticleWithUser that is necessary to display an article with the author details (like a blog post).
Now, I need to make a table featured_items which contains the columns item_type (article, file, comment, etc.) and item_id (the article's, file's or comment's id), and query it to get a list of the featured items of some type.
Assuming tables other than articles contain whole objects that do not need JOINing with other tables, I can simply pull them with a dynamically generated query like
SELECT some_table.* FROM featured_items RIGHT JOIN some_table ON some_table.id = featured_items.item_id WHERE featured_items.item_type = X
But what if I need to get a featured item from the aforementioned type ArticleWithUser? I cannot use the dynamically generated query because the syntax will not suit two JOINs.
So, my question is: is there a better practice to retrieve results that are always combined together? Maybe do the second JOIN on the application end?
Or do I have to write special code for each of those combined results types?
Thank you!
A view can be thought of as a table for the faint of heart.
https://dev.mysql.com/doc/refman/5.0/en/create-view.html
Views can incorporate joins and other views. Keep in mind that upon creation they take a snapshot of the columns that exist at that time on the underlying tables, so ALTER TABLE statements that add columns to those tables are not picked up by SELECT *.
An old article which I consider required reading on the subject of MySQL Views:
By Peter Zaitsev
To answer your question as to whether they are widely used, they are a major part of the database developer's toolkit, and in some situations offer significant benefits, which have more to do with indexing than with the nature of views, per se.
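To make the view suggestion concrete, here is a rough sketch of how the ArticleWithUser join could be wrapped in a view so that the same dynamically generated query works for it as for a plain table. It runs against SQLite through Python's sqlite3 module as a stand-in for MySQL. The view name article_with_user, the id and title columns on articles, and the featured() helper are all assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_username TEXT, user_pic TEXT);
CREATE TABLE articles (id INTEGER PRIMARY KEY, article_author INTEGER, title TEXT);
CREATE TABLE featured_items (item_type TEXT, item_id INTEGER);

-- The view bakes in the JOIN, so it can be queried like a single table.
CREATE VIEW article_with_user AS
    SELECT r.id, r.title, u.user_username, u.user_pic
    FROM articles r
    INNER JOIN users u ON u.user_id = r.article_author;

INSERT INTO users VALUES (1, 'alice', 'alice.png');
INSERT INTO articles VALUES (100, 1, 'Hello'), (101, 1, 'Not featured');
INSERT INTO featured_items VALUES ('article', 100);
""")

def featured(conn, table, item_type):
    """Run the same generated query against a plain table or a view.

    `table` is interpolated into the SQL, so it must come from a fixed
    whitelist in the application, never from user input.
    """
    sql = (
        f"SELECT t.* FROM featured_items f "
        f"INNER JOIN {table} t ON t.id = f.item_id "
        f"WHERE f.item_type = ?"
    )
    return conn.execute(sql, (item_type,)).fetchall()

rows = featured(conn, "article_with_user", "article")
print(rows)
```

The view lists its columns explicitly instead of using SELECT *, which sidesteps the column snapshot caveat mentioned above.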
I have one table called cf_posts
ID pk
user INT
subject VARCHAR
body TEXT
datetime TIMESTAMP
parent INT
category INT
mod INT
When a post is submitted to the forum the default parent is 0, when a post is submitted as a reply, then its parent is the ID of the original post.
How can I make it so that the default view of the forum main page is ordered so that the most recently updated post (including its latest replies) is at the "top" of the pile, working down in date order? What would be the PHP/MySQL query?
The only workarounds I have seen for this are separate topics and reply tables, but I'd like to stay away from this approach if possible.
One workaround that I tried and failed was GROUP BY parent.
But this grouped all topics that had no replies together as one.
Another idea that I have yet to try is to make the parent id of the original post match the post ID, and not include matching ID and parent IDs in the output.
I look forward to hearing your thoughts.
SELECT mainPost.subject,
(
SELECT subPost.datetime
FROM cf_posts subPost
WHERE subPost.parent = mainPost.ID
ORDER BY subPost.datetime DESC
LIMIT 1
) AS lastPostDatetime
FROM cf_posts mainPost
WHERE mainPost.parent = 0
This is done briefly, so there may be some syntax issues but I think this should help.
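One possible way to get the ordering the question asks for (topics sorted by their own date or their newest reply, whichever is later) is a self-join with MAX(). The sketch below runs it against SQLite through Python's sqlite3 module as a stand-in for MySQL; it uses a cut-down cf_posts table, the sample rows are invented, and it assumes topics have parent = 0 as described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cf_posts (ID INTEGER PRIMARY KEY, subject TEXT, "
    "datetime TEXT, parent INTEGER DEFAULT 0)"
)
conn.executemany("INSERT INTO cf_posts VALUES (?, ?, ?, ?)", [
    (1, "Old topic, fresh reply", "2024-01-01 10:00:00", 0),
    (2, "Newer topic, no replies", "2024-01-02 10:00:00", 0),
    (3, "Re: old topic", "2024-01-03 10:00:00", 1),
])

sql = """
SELECT mainPost.ID, mainPost.subject,
       -- newest reply if there is one, otherwise the topic's own date
       COALESCE(MAX(reply.datetime), mainPost.datetime) AS last_activity
FROM cf_posts AS mainPost
LEFT JOIN cf_posts AS reply ON reply.parent = mainPost.ID
WHERE mainPost.parent = 0
GROUP BY mainPost.ID, mainPost.subject, mainPost.datetime
ORDER BY last_activity DESC
"""
topics = conn.execute(sql).fetchall()
for topic_id, subject, last_activity in topics:
    print(last_activity, subject)
```

The LEFT JOIN keeps topics that have no replies at all, which is what the GROUP BY parent attempt in the question lost.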
You can do the following: run a separate query for each thing that you need, for example one query per topic. Then you can use UNION to combine them into one list. The trick to preserving the order is as follows: in each separate query, append a column called sort to the returned result and give it a higher integer value in each successive query; ordering the combined result by that column then guarantees that the final result is properly sorted. Review UNION for SELECT statements to get a better understanding of what I'm talking about.
I'll try to keep this simple and to the point. Essentially I have a news feed and a comments section. The comments section has two tiers: responses, and then replies to responses. It is basically structured like so for a given news post:
-> comment
---> reply
---> reply
Each comment can have multiple replies. Obviously, the WRONG way to do this is to do an SQL query for every comment to check for replies and list them out. EDIT Comments only have 1 tier of replies, ie replies CANNOT have replies. - Thanks JohnP
My Questions for this kind of query:
Should I keep the comments and replies in separate tables and use a JOIN, or can I keep the replies and comments in the same table and use a qualifier to separate the type?
Should I attempt to sort them using the query or pull all the data into an array and sort & display that way?
My table currently is as follows:
ID (unique, auto increment)
NEWS_ID (ties the comment to a particular news post)
REPLY_ID (ties the comment to a parent comment if it is a reply to another comment)
USER_ID
BODY
PUBLISHED_DATE
Any suggestions from those wiser than me would be greatly appreciated! I'm still in the very early stages of fully understanding JOINs and other higher-level MySQL query structures. (I.e.: I suck at MySQL, but I'm learning :)
Since you said replies are one level deep..
I would make comments 1 table and have a comment_id field to denote ownership and a news_id field to add the relationship to the news item. This way you can simply query for all comments that match the news_id and sort it by comment_id. And then a wee bit of PHP array magic will get you a sorted list of comments/replies.
So having a look at your current table, you're on the correct path.
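If you would rather let the query do the sorting, something along these lines might work for the one-level case. It is only a sketch: it runs on SQLite through Python's sqlite3 module rather than MySQL, the table name comments is assumed, and the table is trimmed to the columns that matter here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments (ID INTEGER PRIMARY KEY, NEWS_ID INTEGER, "
    "REPLY_ID INTEGER, BODY TEXT)"
)
conn.executemany("INSERT INTO comments VALUES (?, ?, ?, ?)", [
    (1, 7, None, "comment A"),
    (2, 7, None, "comment B"),
    (3, 7, 1, "reply to A"),
    (4, 8, None, "other news post"),
    (5, 7, 1, "second reply to A"),
])

sql = """
SELECT ID, REPLY_ID, BODY
FROM comments
WHERE NEWS_ID = ?
ORDER BY COALESCE(REPLY_ID, ID),   -- thread key: the parent comment's ID
         REPLY_ID IS NOT NULL,     -- the parent first, then its replies
         ID
"""
ordered = conn.execute(sql, (7,)).fetchall()
for comment_id, reply_id, body in ordered:
    # Indent replies under their parent comment.
    print(("    " if reply_id is not None else "") + body)
```

This only works because replies cannot have replies; with deeper nesting the thread key would no longer be a single column.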
I'm trying to write a commenting system, where people can comment on other comments, and these are displayed as recursive threads on the page. (Reddit's Commenting system is an example of what I'm trying to achieve), however I am confused on how to implement such a system that would not be very slow and computationally expensive.
I imagine that each comment would be stored in a comment table, and contain a parent_id, which would be a foreign key to another comment. My problem comes with how to get all of this data without a ton of queries, and then how to efficiently organize the comments into the order they belong in. Does anyone have any ideas on how to best implement this?
Try using a nested set model. It is described in Managing Hierarchical Data in MySQL.
The big benefit is that you don't have to use recursion to retrieve child nodes, and the queries are pretty straightforward. The downside is that inserting and deleting takes a little more work.
It also scales really well. I know of one extremely huge system which stores discussion hierarchies using this method.
Here's another site providing information on that method + some source code.
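As a rough illustration of why retrieval needs no recursion in a nested set, the sketch below stores a left and right number per comment and fetches a whole subtree, with depths, in one query. It uses SQLite via Python's sqlite3 module; the lft/rgt column names follow the common convention for this model, and the table layout and sample tree are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT, "
    "lft INTEGER, rgt INTEGER)"
)
# Tree: 1 has children 2 and 4; 2 has child 3.
# lft/rgt are assigned by a depth-first walk of that tree.
conn.executemany("INSERT INTO comments VALUES (?, ?, ?, ?)", [
    (1, "root comment", 1, 8),
    (2, "reply", 2, 5),
    (3, "reply to reply", 3, 4),
    (4, "second reply", 6, 7),
])

def subtree(conn, comment_id):
    """Return (id, body, depth) for a comment and all its descendants."""
    sql = """
    SELECT node.id, node.body,
           -- depth = number of comments whose interval encloses this one
           (SELECT COUNT(*) FROM comments anc
            WHERE anc.lft < node.lft AND anc.rgt > node.rgt) AS depth
    FROM comments AS node
    JOIN comments AS base ON node.lft BETWEEN base.lft AND base.rgt
    WHERE base.id = ?
    ORDER BY node.lft
    """
    return conn.execute(sql, (comment_id,)).fetchall()

rows = subtree(conn, 1)
for comment_id, body, depth in rows:
    print("  " * depth + body)
```

Inserting a comment means shifting the lft/rgt values of everything to its right, which is the extra work on inserts and deletes mentioned above.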
It's just a suggestion, but since I'm facing the same problem right now,
how about adding a sequence field (int) and a depth field to the comments table,
and updating them as new comments are inserted?
The sequence field would serve the purpose of ordering the comments.
And the depth field would indicate the recursion level of the comment.
Then the hard part would be doing the right updates as users insert new comments.
I don't know yet how hard this is to implement,
but I'm pretty sure once implemented, we will have a performance gain over nested model based
solutions.
I created a small tutorial explaining the basic concepts behind the recursive approach. As people have said above, the recursive function doesn't scale as well, however, inserts are far more efficient.
Here are the links:
http://www.evanpetersen.com/index.php/item/php-and-mysql-recursion.html
and
http://www.evanpetersen.com/index.php/item/php-mysql-revisited.html
I normally work with a parent - child system.
For example, consider the following:
Table comments(
commentID,
pageID,
userID,
comment
[, parentID]
)
parentID is a foreign key to commentID (from the same table) which is optional (can be NULL).
For selecting comments use this for a 'root' comment:
SELECT * FROM comments WHERE pageID=:pageid AND parentID IS NULL
And this for a child:
SELECT * FROM comments WHERE pageID=:pageid AND parentID=:parentid
I had to implement recursive comments too.
I struggled with the nested set model; let me explain why:
Let's say you want comments for an article.
Let's call root comments the comments directly attached to this article.
Let's call reply comments the comments that are an answer to another comment.
I noticed (unfortunately) that I wanted the root comments to be ordered by date descending,
BUT I wanted the reply comments to be ordered by date ascending!
Paradoxical!
So the nested model didn't help me reduce the number of queries.
Here is my solution :
Create a comment table with following fields :
id
article_id
parent_id (nullable)
date_creation
email
whateverYouLike
sequence
depth
The 3 key fields of this implementation are parent_id, sequence and depth.
parent_id and depth help to insert new nodes.
sequence is the real key field; it's a kind of nested model emulation.
Each time you insert a new root comment, its sequence is a multiple of x.
I chose x=1000, which basically means that I can have at most 1000 nested comments (that's the only drawback I found
for this system, but this limit can easily be modified, and it's enough for my needs now).
The most recent root comment has to be the one with the greatest sequence number.
Now reply comments :
we have two cases :
reply for a root comment, or reply for a reply comment.
In both cases the algorithm is the same:
take the parent's sequence and subtract one to get your sequence number.
Then you have to update the sequence numbers that are below the parent's sequence and above the base sequence,
which is the sequence of the root comment just below the root comment concerned.
I don't expect you to understand all this since I'm not a very good explainer,
but I hope it may give you new ideas.
(At least it worked better for me than the nested model would have: fewer queries, which is the real goal.)
I’m taking a simple approach.
Save root id (if it’s comments then post_id)
Save parent_id
Then fetch all comments with post_id and recursively order them on the client.
I don’t care if there’s 1000 comments. This happens in memory.
It’s one database call, and that’s the expensive part.
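The in-memory ordering step might look something like this sketch. It is written in Python for illustration, and the field names (id, post_id, parent_id, body) and the None-for-top-level convention are assumptions.

```python
from collections import defaultdict

def thread_order(comments):
    """Return (depth, comment) pairs in display order, parents before replies."""
    # Group comments by parent so each lookup during the walk is O(1).
    children = defaultdict(list)
    for comment in comments:
        children[comment["parent_id"]].append(comment)

    ordered = []

    def walk(parent_id, depth):
        for comment in children.get(parent_id, []):
            ordered.append((depth, comment))
            walk(comment["id"], depth + 1)

    walk(None, 0)  # top-level comments are assumed to have parent_id None
    return ordered

# Rows as they might come back from a single query filtered by post_id.
comments = [
    {"id": 1, "post_id": 5, "parent_id": None, "body": "first"},
    {"id": 2, "post_id": 5, "parent_id": None, "body": "second"},
    {"id": 3, "post_id": 5, "parent_id": 1, "body": "reply to first"},
    {"id": 4, "post_id": 5, "parent_id": 3, "body": "nested reply"},
]
for depth, comment in thread_order(comments):
    print("  " * depth + comment["body"])
```

Siblings keep the order in which the rows were fetched, so an ORDER BY on the query controls the order within each level. Extremely deep threads could hit the language's recursion limit, in which case an explicit stack would be the safer variant.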
So, basically, I have a MySQL table called 'topics' and another one called 'replies', for example. In table 'topics', there's a field called 'relforum' which relates this topic to a forum section. And in the table 'replies', there's a field called 'reltopic', which relates the reply to a topic. Both tables have an id field, auto_increment primary key.
Now, I want to select all replies from a certain forum. Since 'replies' has no 'relforum' field, my way would be:
Select all topics with 'relforum' equal to that certain forum and loop through them
While in the loop, select all replies from the topic that is being 'processed' right now
Merge all fetch_array results in one multidimensional array, then loop through them.
That's something like this:
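$results = array();
$query = mysql_query("SELECT * FROM `topics` WHERE `relforum` = '1'");
while ($array = mysql_fetch_array($query)) {
    // One extra query per topic: this is the part I would like to avoid
    $temp = mysql_query("SELECT * FROM `replies` WHERE `reltopic` = {$array['id']}");
    // Collect every reply of the topic, not only the first row
    while ($row = mysql_fetch_array($temp)) {
        $results[] = $row;
    }
}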
Is there a way to merge all that into fewer queries? Because this process would basically run one query per topic in that forum plus one. That would be too much :P
Adding the relforum field to the replies table is a solution (I'm still designing the DB Part so it's not a problem to add it), but I would like to see if there's a solution.
I'm really not good at SQL things, I only know the basic SELECT/INSERT/UPDATE, and I usually generate the last two ones using PHPMyAdmin, so... I guess I need some help.
Thanks for reading!
You need to learn to use joins. The link below is for SQL Server, but the theory for MySQL is pretty much the same for basic joins. Please do not use comma-based joins, as they are 18 years outdated and are a poor practice. Learn to use the ANSI standard joins.
http://www.tek-tips.com/faqs.cfm?fid=4785
In accessing a database, you almost never want to use any looping. Databases are designed to perform best when asked to operate on sets of data not individual rows. So you need to stop thinking about looping and start thinking about the set of the data you need.
SELECT
r.*
FROM
replies r
INNER JOIN
topics t
ON
r.reltopic = t.id
WHERE
t.relforum = 1;
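To see the join replace the loop end to end, here is a small sketch that runs the same query shape from application code with a bound parameter instead of a hard-coded forum id. It uses Python's sqlite3 module and an in-memory SQLite database as a stand-in for MySQL; the title and body columns and the replies_in_forum() helper are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE topics (id INTEGER PRIMARY KEY, relforum INTEGER, title TEXT);
CREATE TABLE replies (id INTEGER PRIMARY KEY, reltopic INTEGER, body TEXT);
INSERT INTO topics VALUES (1, 1, 'topic in forum 1'), (2, 2, 'topic in forum 2');
INSERT INTO replies VALUES (1, 1, 'first'), (2, 1, 'second'), (3, 2, 'elsewhere');
""")

def replies_in_forum(conn, forum_id):
    """Fetch every reply in a forum with one query instead of one per topic."""
    sql = """
    SELECT r.id, r.reltopic, r.body
    FROM replies r
    INNER JOIN topics t ON r.reltopic = t.id
    WHERE t.relforum = ?
    ORDER BY r.id
    """
    return conn.execute(sql, (forum_id,)).fetchall()

rows = replies_in_forum(conn, 1)
print(rows)
```

However many topics the forum has, this stays at a single round trip to the database.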
You basically need a join of two tables.
SELECT * FROM `replies`, `topics` WHERE `replies`.`reltopic` = `topics`.`id` AND `topics`.`relforum` = '1';
SELECT r.* FROM replies r, topics t
WHERE t.relforum = 1 AND r.reltopic = t.id
Get rid of the backquotes. They're nonstandard and clutter the code.
Yes, you should use a join here. However you will need to take greater care processing your result set.
Joins are the essential query in a relational database schema. Add them to your arsenal of knowledge :)