Need help with a multiple table query in mysql - php

I'm working on building a forum with Kohana. I know there is already good, free forum software out there, but it's for a family site, so I thought I'd use it as a learning experience. I'm also not using the ORM that is built into Kohana, as I would like to learn more about SQL in the process of building the forum.
For my forum I have 4 main tables:
USERS
TOPICS
POSTS
COMMENTS
TOPICS table: id (auto incremented), topic column.
USERS table: username, email, first and last name and a few other unrelated columns
POSTS table: id (auto incremented), post-title, post-body, topic-id, user-id, post-date, updated-date, updated-by(which will contain the user-id of the person who made the most recent comment)
COMMENTS table: id (auto incremented), post-id, user-id and comment
On the main forum page I would like to have:
a list of all of the topics
the number of posts for each topic
the last updated post, and who updated it
the most recently updated topic to be on top, most likely an "ORDER BY updated-date"
Here is the query I have so far:
SELECT topics.id AS topic-id,
topics.topic,
post-user.id AS user-id,
CONCAT_WS(' ', post-user.first-name, post-user.last-name) AS name,
recent-post.id AS post-id,
post-num.post-total,
recent-post.title AS post-title,
recent-post.update_date AS updated-date,
recent-post.updated-by AS updated-by
FROM topics
JOIN (SELECT posts.topic-id,
COUNT(*) AS post-total
FROM POSTS
WHERE posts.topic-id = topic-id
GROUP BY posts.topic-id) AS post-num ON topics.id = post-num.topic-id
JOIN (SELECT posts.*
FROM posts
ORDER BY posts.update-date DESC) AS recent-post ON topics.id = recent-post.topic-id
JOIN (SELECT users.*,
posts.user-id
FROM users, posts
WHERE posts.user-id = users.id) as post-user ON recent-post.user_id = post-user.id
GROUP BY topics.id
This query almost works, as it gets all of the information for topics that have posts. But it doesn't return the topics that don't have any posts.
I'm sure that the query is inefficient and wrong since it makes two sub-selects to the posts table, but it was the only way I could get to the point I'm at.

Dash is not a valid character in SQL identifiers, but you can use "_" instead.
You don't necessarily have to get everything from a single SQL query. In fact, trying to do so makes it harder to code, and also sometimes makes it harder for the SQL optimizer to execute.
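ORDER BY in a subquery has no reliable effect on the outer query unless it is combined with LIMIT, so the one in your recent-post subquery does not pick the latest post.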
Name your primary key columns topic_id, user_id, and so on (instead of "id" in every table), and you won't have to alias them in the select-list.
Here's how I would solve this:
First get the most recent post per topic, with associated user information:
SELECT t.topic_id, t.topic,
u.user_id, CONCAT_WS(' ', u.first_name, u.last_name) AS full_name,
p.post_id, p.title, p.update_date, p.updated_by
FROM topics t
INNER JOIN
(posts p INNER JOIN users u ON (p.updated_by = u.user_id))
ON (t.topic_id = p.topic_id)
LEFT OUTER JOIN posts p2
ON (p.topic_id = p2.topic_id AND p.update_date < p2.update_date)
WHERE p2.post_id IS NULL;
Then get the counts of posts per topic in a separate, simpler query. Count p.post_id rather than * so that a topic with no posts reports 0 instead of 1.
SELECT t.topic_id, COUNT(p.post_id) AS post_total
FROM topics t LEFT OUTER JOIN posts p USING (topic_id)
GROUP BY t.topic_id;
Merge the two data sets in your application.
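A minimal sketch of that merge step, assuming Python with an in-memory SQLite database standing in for MySQL. The table contents, the forum_index name and the column names (topic_id, post_id, user_id, as suggested above) are made up for illustration. It counts p.post_id so that a topic with no posts reports 0.

```python
import sqlite3

# Hypothetical miniature schema; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
CREATE TABLE topics (topic_id INTEGER PRIMARY KEY, topic TEXT);
CREATE TABLE users  (user_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE posts  (post_id INTEGER PRIMARY KEY, topic_id INTEGER, title TEXT,
                     update_date TEXT, updated_by INTEGER);
INSERT INTO topics VALUES (1, 'General'), (2, 'Recipes'), (3, 'Empty topic');
INSERT INTO users  VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
INSERT INTO posts  VALUES
  (1, 1, 'Hello',   '2024-01-01', 1),
  (2, 1, 'Welcome', '2024-01-05', 2),
  (3, 2, 'Soup',    '2024-01-03', 1);
""")

# Query 1: most recent post per topic (the row for which no newer post p2 exists).
latest = conn.execute("""
SELECT t.topic_id, t.topic, p.post_id, p.title, p.update_date,
       u.first_name || ' ' || u.last_name AS full_name
FROM topics t
JOIN posts p ON t.topic_id = p.topic_id
JOIN users u ON p.updated_by = u.user_id
LEFT JOIN posts p2
  ON p.topic_id = p2.topic_id AND p.update_date < p2.update_date
WHERE p2.post_id IS NULL
""").fetchall()

# Query 2: post count per topic, including topics with no posts.
counts = conn.execute("""
SELECT t.topic_id, t.topic, COUNT(p.post_id) AS post_total
FROM topics t LEFT JOIN posts p ON t.topic_id = p.topic_id
GROUP BY t.topic_id, t.topic
""").fetchall()

# Merge: start from the counts (every topic), attach the latest post if any.
latest_by_topic = {row["topic_id"]: row for row in latest}
forum_index = []
for row in counts:
    last = latest_by_topic.get(row["topic_id"])
    forum_index.append({
        "topic_id": row["topic_id"],
        "topic": row["topic"],
        "post_total": row["post_total"],
        "last_title": last["title"] if last else None,
        "last_author": last["full_name"] if last else None,
        "last_update": last["update_date"] if last else None,
    })

# Most recently updated topic first; topics without posts sort last.
forum_index.sort(key=lambda r: r["last_update"] or "", reverse=True)
```

Starting the merge from the counts query means every topic appears in the result, even those the first query skips because they have no posts.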

To ensure you get results for topics without posts, you'll need to use LEFT JOIN instead of JOIN for the first join between topics and the next table. LEFT JOIN means "always return a result set row for every row in the left table, even if there's no match with the right table."
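To see the difference on a toy data set, here is a small sketch in Python using an in-memory SQLite database as a stand-in for MySQL. The two-table schema and the sample rows are assumptions made up for the demonstration.

```python
import sqlite3

# Hypothetical two-table schema: one topic has a post, the other has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE topics (id INTEGER PRIMARY KEY, topic TEXT);
CREATE TABLE posts  (id INTEGER PRIMARY KEY, topic_id INTEGER, title TEXT);
INSERT INTO topics VALUES (1, 'General'), (2, 'Empty topic');
INSERT INTO posts  VALUES (1, 1, 'Hello');
""")

# Plain (inner) JOIN: the topic without posts disappears.
inner_rows = conn.execute("""
SELECT t.topic, p.title FROM topics t
JOIN posts p ON p.topic_id = t.id
ORDER BY t.id
""").fetchall()

# LEFT JOIN: every topic is kept; missing post columns come back as NULL.
left_rows = conn.execute("""
SELECT t.topic, p.title FROM topics t
LEFT JOIN posts p ON p.topic_id = t.id
ORDER BY t.id
""").fetchall()

print(inner_rows)
print(left_rows)
```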
Gotta go now, but I'll try to look at the efficiency issues later.

This is a very complicated query. You should note that JOIN statements will limit your topics to those that have posts. If a topic does not have a post, a JOIN statement will filter it out.
Try the following query.
SELECT *
FROM
(
SELECT T.Topic,
COUNT(AllTopicPosts.ID) AS NumberOfPosts,
MAX(IFNULL(MostRecentPost.Post_Title, '')) AS MostRecentPostTitle,
MAX(IFNULL(MostRecentPostUser.UserName, '')) AS MostRecentPostUser,
MAX(IFNULL(MostRecentPost.Updated_Date, '')) AS MostRecentPostDate
FROM TOPICS T
LEFT JOIN POSTS AllTopicPosts ON AllTopicPosts.Topic_Id = T.ID
LEFT JOIN POSTS MostRecentPost ON MostRecentPost.ID =
(
-- correlated subquery: id of the newest post in this topic
SELECT P.ID
FROM POSTS P
WHERE P.Topic_Id = T.ID
ORDER BY P.Updated_Date DESC
LIMIT 1
)
LEFT JOIN USERS MostRecentPostUser ON MostRecentPostUser.ID = MostRecentPost.User_Id
GROUP BY T.Topic
) AS TopicSummary
ORDER BY MostRecentPostDate DESC

I'd use a left join inside a subquery to pull back the correct topic, and then you can do a little legwork outside of that to get some of the user info.
select
s.topic_id,
s.topic,
u.id as last_updated_by_id,
u.username as last_updated_by,
s.last_post,
s.post_count
from
(
select
t.id as topic_id,
t.topic,
t.user_id as orig_poster,
max(coalesce(p.post_date, t.post_date)) as last_post,
count(*) as post_count -- would be count(p.post_id) if you don't want to count the topic
from
topics t
left join posts p on
t.id = p.topic_id
group by
t.id,
t.topic,
t.user_id
) s
left join posts p on
s.topic_id = p.topic_id
and s.last_post = p.post_date
and s.post_count > 1 -- 0 if you're using p.post_id up top
inner join users u on
u.id = coalesce(p.user_id, s.orig_poster)
order by
s.last_post desc
This query introduces coalesce and left join, both of which are worth looking into. With exactly two arguments, as used here, you can also use ifnull in MySQL, since it is functionally equivalent.
Keep in mind that ifnull is not standard SQL (if you need to port this code). Other databases have other functions for that (isnull in SQL Server, nvl in Oracle, etc.). I used coalesce so that I could keep this query all ANSI-fied.
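A quick way to check the equivalence is a sketch like the following, which uses Python's sqlite3 module as a stand-in, since SQLite happens to support both coalesce and ifnull. It also shows that coalesce accepts more than two arguments, which ifnull does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each expression returns its first non-NULL argument.
row = conn.execute("""
SELECT COALESCE(NULL, 'fallback'),      -- two arguments, first is NULL
       IFNULL(NULL, 'fallback'),        -- same result as the line above
       COALESCE(NULL, NULL, 'third'),   -- coalesce takes any number of arguments
       COALESCE('first', 'second')      -- first argument wins when it is not NULL
""").fetchone()

print(row)
```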

Related

Joining 3 tables and matching null entries

I have 3 tables, users, news, news_viewed. I'm trying to join these 3 tables and find a list of news each user has not viewed.
TABLE users
userid
username
status
TABLE news
newsid
title
post_time
TABLE news_viewed
nvid
username
newsid
Looking to find a list from users that have not read news (found in news_viewed)
I've tried many different joins, including left joins and inners and outers but cannot get the results I need.
$_30daysago = strtotime('-30 days');
SELECT * FROM
(
SELECT users.username, news_id
FROM users inner join news_viewed ON
users.username = news_viewed.username and users.status='active'
UNION
SELECT news_viewed.username, post_time
FROM news_viewed inner join news ON
news_viewed.newsid = news.newsid and news.post_time>'$_30daysago'
) as JoinedTable
I need the required results to include the users.username, news.newsid and news.title.
Any help would be appreciated, thank you!
This is a good spot to use the LEFT JOIN anti-join pattern:
SELECT u.username, n.newsid, n.title
FROM users u
INNER JOIN news n ON n.post_time > ?
LEFT JOIN news_viewed nv
ON n.newsid = nv.newsid
AND nv.username = u.username
WHERE
u.status = 'active'
AND nv.nvid IS NULL
This query generates a cartesian product of users and recent news (i.e. news with a post time greater than the parameter indicated by ?), and returns the user/news tuples for which the left join on news_viewed found no match (hence "anti-join").
Note: it is unclear which column to use in the join. The column name news_viewed(username) suggests that it relates to users(username), whereas the primary key of users appears to be userid. Fix your column names or fix your relationship.
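Here is a small runnable sketch of this query, assuming Python's sqlite3 with an in-memory database in place of MySQL. The sample rows, the integer post_time values and the cutoff variable are invented for illustration. The ? placeholder is bound as a parameter instead of being interpolated into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (userid INTEGER PRIMARY KEY, username TEXT, status TEXT);
CREATE TABLE news (newsid INTEGER PRIMARY KEY, title TEXT, post_time INTEGER);
CREATE TABLE news_viewed (nvid INTEGER PRIMARY KEY, username TEXT, newsid INTEGER);
INSERT INTO users VALUES (1, 'ann', 'active'), (2, 'bob', 'active'), (3, 'cat', 'inactive');
INSERT INTO news VALUES (1, 'Old story', 100), (2, 'Fresh story', 900), (3, 'Newest story', 950);
-- ann has read both recent items, bob only the newest one
INSERT INTO news_viewed VALUES (1, 'ann', 2), (2, 'ann', 3), (3, 'bob', 3);
""")

cutoff = 500  # hypothetical stand-in for the "30 days ago" timestamp

unread = conn.execute("""
SELECT u.username, n.newsid, n.title
FROM users u
INNER JOIN news n ON n.post_time > ?
LEFT JOIN news_viewed nv
  ON n.newsid = nv.newsid
 AND nv.username = u.username
WHERE u.status = 'active'
  AND nv.nvid IS NULL          -- keep only pairs with no view record
ORDER BY u.username, n.newsid
""", (cutoff,)).fetchall()

print(unread)
```

With this data only bob's unread recent item survives: ann has read everything recent, cat is inactive, and the old story is filtered out by the cutoff.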
Elaborating on @GMB's answer
Your query:
$_30daysago = strtotime('-30 days');
SELECT * FROM
(
SELECT users.username, news_id
FROM users inner join news_viewed ON
users.username = news_viewed.username and users.status='active'
UNION
SELECT news_viewed.username, post_time
FROM news_viewed inner join news ON
news_viewed.newsid = news.newsid and news.post_time>'$_30daysago'
) as JoinedTable
is saying:
get all active users with the news they have read (inner join)
SELECT users.username, news_id
FROM users inner join news_viewed ON
users.username = news_viewed.username and users.status='active'
and add all the news with the users that have read them in the last 30 days (inner join again)
SELECT news_viewed.username, post_time
FROM news_viewed inner join news ON
news_viewed.newsid = news.newsid and news.post_time>'$_30daysago'
That actually brings up all the tuples from news_viewed minus the ones where the user is not active AND the news item is over 30 days old.
However, given the usage of inner join, you're bringing back a lot of duplicate records:
1.- The results from the first query where the news item is less than 30 days old
2.- The results from the second query where the user is active
Since you're using UNION and not UNION ALL, you are implicitly asking for a SELECT DISTINCT, but the fields are different (it makes no sense to display newsid and then post_time in the same field).
Plus, you have a typo in the field name: it is newsid, not news_id.
You have to look at it the other way around. The potential combinations amount to a scenario where every user has read every news item. So you take that universe as a basis (number of users times number of news items) and then
1- remove inactive users
2- remove news older than 30 days
3- remove tuples that have a matching row in the news_viewed table (already read)
SELECT users.username, news.newsid
FROM users
JOIN news
ON users.status='active' -- removes inactive users
AND news.post_time>'$_30daysago' -- removes older news
LEFT JOIN
news_viewed nv USING (username, newsid)
WHERE nv.nvid IS NULL -- keeps only user/news pairs with no row in news_viewed (unread)

How to join 3 tables with different data between them?

I'm not too good with explaining things, apologies.
I have 3 tables that are similar to the below:
users
id
username
threads
id
title
user_id
lastpost_id
posts
id
content
thread_id
user_id
On a page listing forum threads, I want to display the username of both the thread author and the author of the last post in that thread. I'm attempting to achieve this in a single query.
My query looks like this:
SELECT t.*,u.username FROM threads t
INNER JOIN users u ON t.user_id=u.id
INNER JOIN posts p ON t.lastpost_id=p.id
ORDER BY t.id DESC
The first join enables me to get the username of the user id that started the thread.
The second join is the one I'm not sure about: it can get me the user id, but how do I get the username from that? With a third join?
You can select the same table multiple times if you give it a different alias. You can give the fields aliases too:
SELECT
t.*,
tu.username as threadusername, /* Result field is called 'threadusername' */
p.*,
pu.username as lastpostusername
FROM threads t
INNER JOIN users tu ON t.user_id=tu.id /* thread user */
INNER JOIN posts p ON t.lastpost_id=p.id
INNER JOIN users pu ON p.user_id=pu.id /* post user */
ORDER BY t.id DESC
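A small sketch of the double join on sample data, assuming Python's sqlite3 with an in-memory database in place of MySQL. The rows are invented for illustration, and only a few columns are selected instead of t.* and p.* to keep the output readable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users   (id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE threads (id INTEGER PRIMARY KEY, title TEXT, user_id INTEGER, lastpost_id INTEGER);
CREATE TABLE posts   (id INTEGER PRIMARY KEY, content TEXT, thread_id INTEGER, user_id INTEGER);
INSERT INTO users   VALUES (1, 'ann'), (2, 'bob');
INSERT INTO threads VALUES (1, 'First thread', 1, 2), (2, 'Second thread', 2, 3);
INSERT INTO posts   VALUES (1, 'opening', 1, 1), (2, 'reply', 1, 2), (3, 'solo', 2, 2);
""")

# users is joined twice: tu resolves the thread author, pu the last poster.
rows = conn.execute("""
SELECT t.id, t.title,
       tu.username AS threadusername,
       pu.username AS lastpostusername
FROM threads t
INNER JOIN users tu ON t.user_id = tu.id      /* thread user */
INNER JOIN posts p  ON t.lastpost_id = p.id
INNER JOIN users pu ON p.user_id = pu.id      /* post user */
ORDER BY t.id DESC
""").fetchall()

print(rows)
```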
You can join to a joined table like this:
SELECT t.*,u.username AS thread_author,u2.username AS lastpost_author FROM threads t
INNER JOIN users u ON t.user_id=u.id
INNER JOIN posts p ON t.lastpost_id=p.id
INNER JOIN users u2 ON p.user_id=u2.id
ORDER BY t.id DESC
Note, I haven't had time to test it, but it should work (at least in MySQL).
I don't know if I understood correctly, but as I see it you could use an inner query to fetch the thread ids and then an outer query to fetch the posts for those thread ids, taking the max post id and grouping by user id. Also join to users to get the name. Hope that helps.

Sorting a MySQL query based on another query?

I'm building some very simple forum software as a sort of self-test for my PHP and MySQL skills, but I'm not quite sure how to accomplish this task. I have a table called Threads which contains a list of all threads. I also have a table called Posts, which contains all posts. Each table's primary key is an auto-increment ID. Each row in Posts also contains the ID of the thread it belongs to.
While I can easily retrieve all of the threads in a certain subforum with a query like this:
SELECT * FROM Threads WHERE ForumID='$forumid'
...I'm not sure how to sort that based on the latest post from each thread. I can get the posts for any given thread like this:
SELECT * FROM Posts WHERE ThreadID='$threadid'
...but I don't know how to incorporate that data into my initial query to sort it. Since each post has a unique ID, posts with higher IDs will always be more recent, so there's no need to compare dates or anything like that. I'm just unclear on how to actually do the query.
I'm pretty sure that this is possible in just MySQL, but if it's not, what would be the most efficient PHP solution? Thanks!
Try this with INNER JOIN
SELECT p.*,t.*
FROM Threads t
INNER JOIN Posts p ON t.id = p.ThreadID
ORDER BY
-- whatever column you want, for example p.date
p.date DESC
Something like this can help you
SELECT DISTINCT t.* FROM Posts p LEFT JOIN Threads t ON p.ThreadID = t.id ORDER BY p.last_modified_time DESC
Or
SELECT DISTINCT t.* FROM Posts p LEFT JOIN Threads t ON p.ThreadID = t.id ORDER BY p.id DESC
A simple JOIN statement is the solution:
SELECT *
FROM Threads t
JOIN Posts p ON t.ID = p.ThreadID
ORDER BY t.ModificationDate DESC -- or whatever column you want
This gives you the latest posts over all threads.
All of the given answers were very close, but I ended up using the following query:
SELECT t.*
FROM Threads t
INNER JOIN Posts p ON t.id = p.threadId
GROUP BY t.id
ORDER BY MAX(p.id) DESC
This is because DISTINCT was causing problems due to its auto-grouping. See this answer for more info on the topic.
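For anyone who wants to try the GROUP BY / MAX approach on sample data, here is a rough sketch using Python's sqlite3 with an in-memory database standing in for MySQL. The Title column, the sample rows and the forum_id variable are assumptions added for the demonstration, and the forum filter is bound as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Threads (id INTEGER PRIMARY KEY, ForumID INTEGER, Title TEXT);
CREATE TABLE Posts   (id INTEGER PRIMARY KEY, ThreadID INTEGER);
INSERT INTO Threads VALUES (1, 1, 'A'), (2, 1, 'B'), (3, 1, 'C'), (4, 2, 'Other forum');
-- post ids grow over time: thread 2 has the newest post (5), then 1 (4), then 3 (3)
INSERT INTO Posts VALUES (1, 1), (2, 2), (3, 3), (4, 1), (5, 2), (6, 4);
""")

forum_id = 1  # hypothetical subforum to list

threads = conn.execute("""
SELECT t.id, t.Title
FROM Threads t
INNER JOIN Posts p ON t.id = p.ThreadID
WHERE t.ForumID = ?
GROUP BY t.id, t.Title          -- one row per thread
ORDER BY MAX(p.id) DESC         -- newest post first
""", (forum_id,)).fetchall()

print(threads)
```

Because of the INNER JOIN, a thread with no posts at all would not be listed; a LEFT JOIN would keep it.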

mysql/php: show posts and for each post all comments

I know this question has been asked multiple times (however, I could still not find a solution):
PHP MYSQL showing posts with comments
mysql query - blog posts and comments with limit
mysql structure for posts and comments
...
Basic question: having tables posts, comments, user... can you, with one single select statement, select and show all posts and all comments (with comment.user, comment.text, comment.timestamp)? What would such a select statement look like? If not, what is the easiest solution?
I also tried to JOIN the comments table with the posts table and use GROUP BY, but I got either only one comment in each row or each comment but also those posts multiple times!?
I tried the solution of the first link (nested mysql_query and then fetch) as well as the second link (with arrays). However, the first caused a bunch of errors (the syntax in that post seems to be not correct and I could not figure out how to solve it) and in the second I had problems with the arrays.
My query looks like this till now:
SELECT p.id, p.title, p.text, u.username, c.country_name, (SELECT SUM(vote_type) FROM votes v WHERE v.post_id = p.id) AS sum_vote_type FROM posts p LEFT JOIN user u ON ( p.user_id = u.id ) LEFT JOIN countries c ON ( c.country_id = u.country_id ) ORDER BY $orderby DESC
I was wondering if this issue was not very common, having posts and comments to show...?
Thank you for every help in advance!
Not knowing your database structure, it should look something like this. Note that you should replace the * characters with more explicit lists of columns you actually need.
SELECT p.*, c.*, u.* FROM posts p
LEFT JOIN comments c ON c.post_id = p.id
LEFT JOIN users u ON u.id = p.author_id
Note that if you're just trying to get counts, sums and things like that it's a good idea to cache some of that information. For instance, you may want to cache the comment count in the post table instead of counting them every query. Only count and update the comment count when adding/removing a comment.
EDIT:
Realized that you also wanted to attach user data to each comment. You can JOIN the same table more than once but it gets ugly. This could turn into a really expensive query. I also am including an example of how to alias columns so it's less confusing:
SELECT p.*, c.*, u.name as post_author, u2.name as comment_author FROM posts p
LEFT JOIN comments c ON c.post_id = p.id
LEFT JOIN users u ON u.id = p.author_id
LEFT JOIN users u2 ON u2.id = c.author_id
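A joined result like this repeats the post columns once per comment, so the rows still have to be regrouped in the application. Here is one possible sketch in Python, with an in-memory SQLite database standing in for MySQL. The column names (body, name, author_id) and the sample rows are assumptions, not the asker's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users    (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE posts    (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, author_id INTEGER, body TEXT);
INSERT INTO users    VALUES (1, 'ann'), (2, 'bob');
INSERT INTO posts    VALUES (1, 'First', 1), (2, 'Second', 2);
INSERT INTO comments VALUES (1, 1, 2, 'Nice'), (2, 1, 1, 'Thanks');
""")

# One row per (post, comment) pair; posts without comments get NULL comment columns.
rows = conn.execute("""
SELECT p.id, p.title, pu.name, c.id, c.body, cu.name
FROM posts p
LEFT JOIN users pu   ON pu.id = p.author_id
LEFT JOIN comments c ON c.post_id = p.id
LEFT JOIN users cu   ON cu.id = c.author_id
ORDER BY p.id, c.id
""").fetchall()

# Regroup: each post appears once, with its comments nested in a list.
posts = {}
for post_id, title, post_author, comment_id, body, comment_author in rows:
    post = posts.setdefault(
        post_id, {"title": title, "author": post_author, "comments": []}
    )
    if comment_id is not None:  # skip the NULL row of a comment-less post
        post["comments"].append({"author": comment_author, "text": body})

for post in posts.values():
    print(post["title"], "by", post["author"], "-", len(post["comments"]), "comment(s)")
```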

MySql If Record Exists Inside Select Query

Creating a forum-type site and I have 3 tables, one for storing user information (users), one for the original post (threads), and one for an upvote system like SO has (votes).
The votes table has 3 columns, id, userid, and threadid. When a user upvotes a thread, a record is inserted into the votes table. When I query for the thread I want to know if the user has upvoted for it, essentially if a record exists in the votes table with the correct userid and threadid. I can do this in two queries, but I think there has to be a way to get everything in one.
My query currently:
"SELECT t.id, t.title, t.content, u.id AS uid, u.username
FROM threads t, users u
WHERE t.id = '".$threadid."'
AND t.author = '".$userid."'"
In case you need a better idea, the following will query the desired results ONLY if the user has upvoted. I need the query to still return if the record in the votes table doesn't exist (possibly return a vote value as null?).
"SELECT t.id, t.title, t.content, u.id
AS uid, u.username, v.id
FROM threads t, users u, votes v
WHERE t.id = '".$threadid."'
AND t.author = '".$userid."'
AND v.threadid = t.id
AND v.userid = '".$userid."'"
Also I taught myself (and am still learning) mysql and database design so if there's a better method/approach such as joining tables, please let me know. Thanks.
Your second query is doing inner joins, which would only return records that appear on both sides of the join. You'd want to do a left/right outer join on the votes table instead, so that you'd still get user+thread records even if there's no matching vote record.
SELECT t.id, t.title, t.content, u.id AS uid, u.username, v.id AS voteid
FROM threads AS t
INNER JOIN users AS u ON t.author = u.id
LEFT JOIN votes AS v ON (v.userid = u.id and t.id = v.threadid)
WHERE (u.id = $userid) AND (t.id = $threadid)
Just guessing at this, but it should be enough to get you started. voteid comes back as NULL when there is no matching vote record.
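A rough sketch of the same idea with bound parameters, using Python's sqlite3 and an in-memory database in place of PHP and MySQL. The thread_with_vote helper, the sample rows and the has_voted key are made up for illustration. It also assumes the viewer may be someone other than the thread author, so the viewer id is matched only against votes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users   (id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE threads (id INTEGER PRIMARY KEY, title TEXT, content TEXT, author INTEGER);
CREATE TABLE votes   (id INTEGER PRIMARY KEY, userid INTEGER, threadid INTEGER);
INSERT INTO users   VALUES (1, 'ann'), (2, 'bob');
INSERT INTO threads VALUES (1, 'Hello', 'First thread', 1);
INSERT INTO votes   VALUES (1, 2, 1);   -- bob upvoted thread 1
""")

def thread_with_vote(conn, thread_id, viewer_id):
    """Return the thread plus a has_voted flag for the viewer, or None if not found."""
    row = conn.execute("""
        SELECT t.id, t.title, u.username, v.id
        FROM threads t
        INNER JOIN users u ON u.id = t.author
        LEFT JOIN votes v ON v.threadid = t.id AND v.userid = ?
        WHERE t.id = ?
    """, (viewer_id, thread_id)).fetchone()
    if row is None:
        return None
    tid, title, author, vote_id = row
    # The LEFT JOIN leaves v.id NULL when the viewer has no vote on this thread.
    return {"id": tid, "title": title, "author": author, "has_voted": vote_id is not None}

print(thread_with_vote(conn, 1, 2))
print(thread_with_vote(conn, 1, 1))
```

Passing the ids as parameters instead of concatenating them into the SQL string also avoids SQL injection.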
