Joining 3 tables and matching null entries - php

I have 3 tables, users, news, news_viewed. I'm trying to join these 3 tables and find a list of news each user has not viewed.
TABLE users
userid
username
status
TABLE news
newsid
title
post_time
TABLE news_viewed
nvid
username
newsid
Looking to find a list from users that have not read news (found in news_viewed)
I've tried many different joins, including left joins and inners and outers but cannot get the results I need.
$_30daysago = strtotime('-30 days');
SELECT * FROM
(
SELECT users.username, news_id
FROM users inner join news_viewed ON
users.username = news_viewed.username and users.status='active'
UNION
SELECT news_viewed.username, post_time
FROM news_viewed inner join news ON
news_viewed.newsid = news.newsid and news.post_time>'$_30daysago'
) as JoinedTable
I need the required results to include the users.username, news.newsid and news.title.
Any help would be appreciated, thank you!

This is a good spot to use the LEFT JOIN anti-join pattern:
SELECT u.username, n.newsid, n.title
FROM users u
INNER JOIN news n ON n.post_time > ?
LEFT JOIN news_viewed nv
ON n.newsid = nv.newsid
AND nv.username = u.username
WHERE
u.status = 'active'
AND nv.nvid IS NULL
This query generates a Cartesian product of users and recent news (i.e. news with a post time greater than the parameter indicated by ?), and returns the user/news tuples for which the left join on news_viewed found no match (hence the anti-join).
Note: it is unclear which column to use in the join. The column name news_viewed(username) suggests that it relates to users(username), whereas the primary key of users seems to be userid. Fix your column names or fix your relationship.
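As a minimal sketch of how this query behaves, the snippet below runs it against an in-memory SQLite database from Python. The sample rows, the integer timestamps, and the helper name unread_news are assumptions made for illustration; the question itself uses MySQL and PHP.

```python
import sqlite3


def unread_news(conn, since):
    """Return (username, newsid, title) for active users and recent news they have not viewed."""
    sql = """
        SELECT u.username, n.newsid, n.title
        FROM users u
        INNER JOIN news n ON n.post_time > ?
        LEFT JOIN news_viewed nv
               ON n.newsid = nv.newsid
              AND nv.username = u.username
        WHERE u.status = 'active'
          AND nv.nvid IS NULL
        ORDER BY u.username, n.newsid
    """
    return conn.execute(sql, (since,)).fetchall()


# Hypothetical sample data; the schema follows the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userid INTEGER PRIMARY KEY, username TEXT, status TEXT);
    CREATE TABLE news (newsid INTEGER PRIMARY KEY, title TEXT, post_time INTEGER);
    CREATE TABLE news_viewed (nvid INTEGER PRIMARY KEY, username TEXT, newsid INTEGER);
    INSERT INTO users VALUES (1, 'alice', 'active'), (2, 'bob', 'active'), (3, 'carol', 'inactive');
    INSERT INTO news VALUES (1, 'Old story', 100), (2, 'Fresh story', 900), (3, 'Newer story', 950);
    INSERT INTO news_viewed VALUES (1, 'alice', 2), (2, 'bob', 1);
""")

rows = unread_news(conn, 500)
for row in rows:
    print(row)
```

Here alice has viewed news 2 and bob has only viewed the old news 1, so alice is left with one recent unread item and bob with two; carol is skipped because she is inactive.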

Elaborating on @GMB's answer
Your query:
$_30daysago = strtotime('-30 days');
SELECT * FROM
(
SELECT users.username, news_id
FROM users inner join news_viewed ON
users.username = news_viewed.username and users.status='active'
UNION
SELECT news_viewed.username, post_time
FROM news_viewed inner join news ON
news_viewed.newsid = news.newsid and news.post_time>'$_30daysago'
) as JoinedTable
is saying:
get all active users with the news they have read (inner join)
SELECT users.username, news_id
FROM users inner join news_viewed ON
users.username = news_viewed.username and users.status='active'
and add all the news with the users that have read them in the last 30 days (inner join again)
SELECT news_viewed.username, post_time
FROM news_viewed inner join news ON
news_viewed.newsid = news.newsid and news.post_time>'$_30daysago'
That actually brings up all the tuples from news_viewed, minus the ones where the user is not active AND the news item is over 30 days old.
However, given the use of inner joins, you're bringing in a lot of duplicate records:
1.- The results from the first query where the news item is less than 30 days old
2.- The results from the second query where the user is active
Since you're using UNION and not UNION ALL, you are implicitly asking for a SELECT DISTINCT, but the fields are different (it makes no sense to display newsid and then post_time in the same field), so the rows rarely match and the duplicates stay.
Plus, you have a typo in the field name: it is newsid, not news_id.
You have to look at it the other way around. The potential combinations amount to a scenario where every user has read every news item. So you take that universe as a basis (number of users times number of news items) and then
1- remove inactive users
2- remove news older than 30 days
3- remove tuples that already have a matching row in the news_viewed table (those were viewed)
SELECT users.username, news.newsid, news.title
FROM users
JOIN news
ON users.status='active' -- removes inactive users
AND news.post_time>'$_30daysago' -- removes older news
LEFT JOIN
news_viewed nv USING (username, newsid)
WHERE nv.nvid IS NULL -- keeps only the pairs with no news_viewed row
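To make the "start from the universe, then remove" reasoning concrete, here is a small sketch that counts the surviving user/news pairs after each step. It uses Python's sqlite3 module with invented sample rows and an integer cutoff; those values and the helper name count are assumptions, not part of the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userid INTEGER PRIMARY KEY, username TEXT, status TEXT);
    CREATE TABLE news (newsid INTEGER PRIMARY KEY, title TEXT, post_time INTEGER);
    CREATE TABLE news_viewed (nvid INTEGER PRIMARY KEY, username TEXT, newsid INTEGER);
    INSERT INTO users VALUES (1, 'alice', 'active'), (2, 'bob', 'active'), (3, 'carol', 'inactive');
    INSERT INTO news VALUES (1, 'Old story', 100), (2, 'Fresh story', 900), (3, 'Newer story', 950);
    INSERT INTO news_viewed VALUES (1, 'alice', 2), (2, 'bob', 1);
""")


def count(sql, params=()):
    """Run a COUNT(*) query and return the single number."""
    return conn.execute(sql, params).fetchone()[0]


cutoff = 500  # hypothetical stand-in for the 30-days-ago timestamp

# Universe: every user paired with every news item.
universe = count("SELECT COUNT(*) FROM users CROSS JOIN news")

# Step 1: drop inactive users.
active_only = count(
    "SELECT COUNT(*) FROM users CROSS JOIN news WHERE users.status = 'active'"
)

# Step 2: also drop news at or before the cutoff.
recent_only = count(
    "SELECT COUNT(*) FROM users CROSS JOIN news "
    "WHERE users.status = 'active' AND news.post_time > ?",
    (cutoff,),
)

# Step 3: also drop pairs that have a row in news_viewed.
unviewed = count(
    """
    SELECT COUNT(*)
    FROM users
    JOIN news ON users.status = 'active' AND news.post_time > ?
    LEFT JOIN news_viewed nv
           ON nv.username = users.username AND nv.newsid = news.newsid
    WHERE nv.nvid IS NULL
    """,
    (cutoff,),
)

print(universe, active_only, recent_only, unviewed)
```

With this sample data the counts shrink from 9 pairs to 6, then 4, then 3.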

Related

How to join 3 tables with different data between them?

I'm not too good with explaining things, apologies.
I have 3 tables that are similar to the below:
users
id
username
threads
id
title
user_id
lastpost_id
posts
id
content
thread_id
user_id
On a page listing forum threads, I want to display the username of both the thread author and the last post author of that thread. I'm attempting to achieve this in a single query.
My query looks like this:
SELECT t.*,u.username FROM threads t
INNER JOIN users u ON t.user_id=u.id
INNER JOIN posts p ON t.lastpost_id=p.id
ORDER BY t.id DESC
The first join enables me to get the username of the user id that started the thread.
The second join is what I'm not sure about: it can get me the user id, but how do I get the username from that? With a 3rd join?
You can join the same table multiple times if you give it a different alias each time. You can give the fields aliases too:
SELECT
t.*,
tu.username as threadusername, /* Result field is called 'threadusername' */
p.*,
pu.username as lastpostusername
FROM threads t
INNER JOIN users tu ON t.user_id=tu.id /* thread user */
INNER JOIN posts p ON t.lastpost_id=p.id
INNER JOIN users pu ON p.user_id=pu.id /* post user */
ORDER BY t.id DESC
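A quick way to check the double join on users is to run it on a tiny data set. The sketch below does that with Python's sqlite3 module; the sample users, thread, and posts are made up for illustration, and the column list is trimmed to the fields of interest.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read columns by their alias
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE threads (id INTEGER PRIMARY KEY, title TEXT, user_id INTEGER, lastpost_id INTEGER);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, content TEXT, thread_id INTEGER, user_id INTEGER);
    INSERT INTO users VALUES (1, 'ann'), (2, 'ben');
    INSERT INTO threads VALUES (10, 'Hello', 1, 101);
    INSERT INTO posts VALUES (100, 'first', 10, 1), (101, 'reply', 10, 2);
""")

row = conn.execute("""
    SELECT t.id, t.title,
           tu.username AS threadusername,    -- author of the thread
           pu.username AS lastpostusername   -- author of the last post
    FROM threads t
    INNER JOIN users tu ON t.user_id = tu.id
    INNER JOIN posts p ON t.lastpost_id = p.id
    INNER JOIN users pu ON p.user_id = pu.id
    ORDER BY t.id DESC
""").fetchone()

print(row["title"], row["threadusername"], row["lastpostusername"])
```

The two aliases tu and pu are what let the same users table supply two different usernames in one row.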
You can join to a joined table like this:
SELECT t.*, u.username AS thread_username, u2.username AS lastpost_username FROM threads t
INNER JOIN users u ON t.user_id=u.id
INNER JOIN posts p ON t.lastpost_id=p.id
INNER JOIN users u2 ON p.user_id=u2.id
ORDER BY t.id DESC
Note, I haven't had time to test it, but it should work (at least in MySQL).
I don't know if I understood it correctly, but as I understand it you can use an inner query to fetch the thread ids, then an outer query to fetch the posts for those thread ids, taking the max post id and grouping by user id. Also join to users to get the name. Hope that helps.

MySQL query counting on multiple tables

At the moment, we have 3 queries. In PHP, we loop over the results of the first, execute the 2nd once per row, and then execute the 3rd inside that loop. I'd like to have all of this in one single query.
The first query is:
SELECT id FROM users
Then inside looping over those results, the 2nd is
SELECT id AS rid, count(recommendedById) FROM users WHERE id=$id
where $id is users.id from the first query.
The 3rd query, which is executed inside the 2nd loop, is:
SELECT count(likes) AS likeCounter FROM posts WHERE author_id=$rid
and likeCounter is summed up per user from the first query.
Anyone able to bring this into one query?
Desired result
The result should be a row per user with a count of users he recommended and a sum of likes his recommended users got on their posts.
SELECT u.id,COUNT(DISTINCT ruid),sum(p.likes)
FROM users as u
LEFT JOIN (SELECT recommendedById as rid,id as ruid from users) as r ON r.rid = u.id
LEFT JOIN posts p ON p.author_id = ruid
GROUP BY u.id
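The reason for COUNT(DISTINCT ...) in a query like this is that the join to posts repeats each recommended user once per post. The sketch below illustrates that on a small SQLite database from Python; the sample users, posts, and like counts are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, recommendedById INTEGER);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, likes INTEGER);
    -- user 1 recommended users 2 and 3; user 2 recommended user 4
    INSERT INTO users VALUES (1, NULL), (2, 1), (3, 1), (4, 2);
    INSERT INTO posts VALUES (1, 2, 5), (2, 2, 3), (3, 3, 1), (4, 1, 10);
""")

query = """
    SELECT u.id,
           COUNT(DISTINCT r.ruid) AS recommended,  -- each recommended user once
           COUNT(r.ruid) AS naive_count,           -- inflated by the posts join
           SUM(p.likes) AS likes
    FROM users AS u
    LEFT JOIN (SELECT recommendedById AS rid, id AS ruid FROM users) AS r
           ON r.rid = u.id
    LEFT JOIN posts p ON p.author_id = r.ruid
    GROUP BY u.id
    ORDER BY u.id
"""
results = conn.execute(query).fetchall()
for line in results:
    print(line)
```

For user 1 the naive count is 3 (user 2 appears twice because of two posts), while the distinct count is the expected 2. Users whose recommended users have no posts get NULL (None in Python) for the like sum.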
You can do this:
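SELECT u.id AS rid,
COUNT(DISTINCT recs.id) AS recommendedCount, -- DISTINCT: the posts join repeats each recommended user once per post
SUM(p.likes) AS likeCounter
FROM users u
LEFT JOIN users recs ON recs.recommendedById=u.id
LEFT JOIN posts p ON p.author_id=recs.id -- posts written by the recommended users
GROUP BY u.id
But in your 2nd query you filter on id, the key of the users table, so it only ever matches one row. Isn't that count always 1?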

Complex MySQL Query, Multi Table - How Do I Write This Relational Query?

I'm having issues writing this query, it's an odd relation that I can't figure out. I'm wondering if it would be better to just use two mysql queries and merge the results with php?... Anyways.. so here we go.
Here's the tables we're using:
- media -
id
userId
accessKey
internalName
type
created
modified
- reposts -
id
userId
mediaId
created
- users -
id
username
Basically, I want a result set of media items, each associated with the user who posted it. In the same result set, I ALSO want additional rows for media items that have been reposted. For those reposted rows, the username should come from reposts.userId instead of media.userId.
Here's a rough idea to illustrate, these two example queries below need to work as 1 to provide a combined result set.
SELECT media.*, users.username,
0 AS reposted
FROM media
LEFT JOIN users ON users.id = media.userId
SELECT media.id, media.accessKey, media.internalName, media.type, media.modified, users.username, reposts.userId, reposts.created,
1 AS reposted
FROM reposts
LEFT JOIN media ON media.id = reposts.mediaId
LEFT JOIN users ON users.id = reposts.userId
How would I go about doing this? Or would I be better off using 2 queries and merging the results with PHP?
You can use UNION in your query, but UNION requires the same number of columns in both queries (MySQL will convert mismatched column types, but it is safer to keep them compatible):
(SELECT media.id, media.accessKey, media.internalName, media.type, media.modified, users.username, users.id, media.created,
0 AS reposted
FROM media
LEFT JOIN users ON users.id = media.userId)
UNION
(SELECT media.id, media.accessKey, media.internalName, media.type, media.modified, users.username, reposts.userId, reposts.created,
1 AS reposted
FROM reposts
LEFT JOIN media ON media.id = reposts.mediaId
LEFT JOIN users ON users.id = reposts.userId)
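Here is a reduced sketch of the combined result set, run on SQLite from Python with a shortened column list. The sample rows and the helper column_mismatch_fails are assumptions for illustration; the helper only shows that a UNION with differing column counts is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE media (id INTEGER PRIMARY KEY, userId INTEGER, created INTEGER);
    CREATE TABLE reposts (id INTEGER PRIMARY KEY, userId INTEGER, mediaId INTEGER, created INTEGER);
    INSERT INTO users VALUES (1, 'ann'), (2, 'ben');
    INSERT INTO media VALUES (1, 1, 100);
    INSERT INTO reposts VALUES (1, 2, 1, 200);  -- ben reposts ann's media item
""")

feed = conn.execute("""
    SELECT media.id, users.username, media.created AS created, 0 AS reposted
    FROM media
    LEFT JOIN users ON users.id = media.userId
    UNION
    SELECT media.id, users.username, reposts.created, 1
    FROM reposts
    LEFT JOIN media ON media.id = reposts.mediaId
    LEFT JOIN users ON users.id = reposts.userId
    ORDER BY created
""").fetchall()
print(feed)


def column_mismatch_fails():
    """Return True if a UNION with different column counts is rejected."""
    try:
        conn.execute("SELECT id, userId FROM media UNION SELECT id FROM reposts")
    except sqlite3.Error:
        return True
    return False


print(column_mismatch_fails())
```

Because the reposted flag differs between the two halves, a row from one half can never duplicate a row from the other, so UNION ALL would return the same rows here without the de-duplication step.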

Seeking a more efficient manner for Ordering before Grouping, with Unions

The following code works for what I am looking for, but I am trying to see if there is a better way to create an efficient news feed. This news feed is supposed to choose the most recent comments and/or likes, etc., but only show the most recent per user (i.e. if John liked 2 items and commented on 3 items, only show the most recent comment and like). Please note that there are multiple UNIONs involved.
SELECT *
FROM (SELECT username, time, comment FROM comments ORDER BY time DESC) AS temp
GROUP by username
UNION
SELECT *
FROM (SELECT username, time, likes FROM likes ORDER BY time DESC) AS temp
GROUP BY username
ORDER BY time DESC
LIMIT 10
I think you can JOIN the two tables on username, but this will give you extra columns instead of the union, something like:
SELECT
c1.username,
c1.`time` AS CommentTime,
c1.comment,
l1.`time` AS LikeTime,
l1.likes
FROM comments AS c1
INNER JOIN
(
SELECT username, MAX(`time`) LatestTime
FROM comments
GROUP BY username
) AS c2 ON c1.username = c2.username
AND c1.`time` = c2.LatestTime
INNER JOIN likes l1 ON c1.username = l1.username
INNER JOIN
(
SELECT username, MAX(`time`) LatestTime
FROM likes
GROUP BY username
) AS l2 ON l1.username = l2.username
AND l1.`time` = l2.LatestTime
ORDER BY c1.`time` DESC
LIMIT 10;
This will give you, for each username, the latest comment and the latest like, each with its time. Because of the INNER JOINs, users who have only comments or only likes are left out.
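The "join each table to its own MAX(time) per user" idea can be checked on a small example. This sketch uses Python's sqlite3 module with invented rows; the table and column names follow the question, everything else is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comments (username TEXT, `time` INTEGER, comment TEXT);
    CREATE TABLE likes (username TEXT, `time` INTEGER, likes TEXT);
    INSERT INTO comments VALUES ('john', 1, 'a'), ('john', 5, 'b'), ('mary', 3, 'c');
    INSERT INTO likes VALUES ('john', 2, 'x'), ('john', 4, 'y');  -- mary has no likes
""")

latest = conn.execute("""
    SELECT c1.username, c1.`time`, c1.comment, l1.`time`, l1.likes
    FROM comments AS c1
    INNER JOIN (
        SELECT username, MAX(`time`) AS LatestTime
        FROM comments
        GROUP BY username
    ) AS c2 ON c1.username = c2.username AND c1.`time` = c2.LatestTime
    INNER JOIN likes AS l1 ON c1.username = l1.username
    INNER JOIN (
        SELECT username, MAX(`time`) AS LatestTime
        FROM likes
        GROUP BY username
    ) AS l2 ON l1.username = l2.username AND l1.`time` = l2.LatestTime
    ORDER BY c1.`time` DESC
    LIMIT 10
""").fetchall()
print(latest)
```

Only john comes back, with his latest comment and latest like. A user like mary, who has comments but no likes, would need LEFT JOINs on the likes side to appear.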

Need help with a multiple table query in mysql

I'm working on building a forum with kohana. I know there is already good, free, forum software out there, but it's for a family site, so I thought I'd use it as a learning experience. I'm also not using the ORM that is built into Kohana, as I would like to learn more about SQL in the process of building the forum.
For my forum I have 4 main tables:
USERS
TOPICS
POSTS
COMMENTS
TOPICS table: id (auto incremented), topic column.
USERS table: username, email, first and last name and a few other unrelated columns
POSTS table: id (auto incremented), post-title, post-body, topic-id, user-id, post-date, updated-date, updated-by(which will contain the user-id of the person who made the most recent comment)
COMMENTS table: id (auto incremented), post-id, user-id and comment
On the main forum page I would like to have:
a list of all of the topics
the number of posts for each topic
the last updated post, and who updated it
the most recently updated topic to be on top, most likely an "ORDER BY updated-date"
Here is the query I have so far:
SELECT topics.id AS topic-id,
topics.topic,
post-user.id AS user-id,
CONCAT_WS(' ', post-user.first-name, post-user.last-name) AS name,
recent-post.id AS post-id,
post-num.post-total,
recent-post.title AS post-title,
recent-post.update_date AS updated-date,
recent-post.updated-by AS updated-by
FROM topics
JOIN (SELECT posts.topic-id,
COUNT(*) AS post-total
FROM POSTS
WHERE posts.topic-id = topic-id
GROUP BY posts.topic-id) AS post-num ON topics.id = post-num.topic-id
JOIN (SELECT posts.*
FROM posts
ORDER BY posts.update-date DESC) AS recent-post ON topics.id = recent-post.topic-id
JOIN (SELECT users.*,
posts.user-id
FROM users, posts
WHERE posts.user-id = users.id) as post-user ON recent-post.user_id = post-user.id
GROUP BY topics.id
This query almost works as it will get all of information for topics that have posts. But it doesn't return the topics that don't have any posts.
I'm sure that the query is inefficient and wrong since it makes two sub-selects to the posts table, but it was the only way I could get to the point I'm at.
A dash is not a valid character in unquoted SQL identifiers, but you can use "_" instead.
You don't necessarily have to get everything from a single SQL query. In fact, trying to do so makes it harder to code, and also sometimes makes it harder for the SQL optimizer to execute.
It makes no sense to use ORDER BY in a subquery without LIMIT, since it does not guarantee the order of the outer result.
Name your primary key columns topic_id, user_id, and so on (instead of "id" in every table), and you won't have to alias them in the select-list.
Here's how I would solve this:
First get the most recent post per topic, with associated user information:
SELECT t.topic_id, t.topic,
u.user_id, CONCAT_WS(' ', u.first_name, u.last_name) AS full_name,
p.post_id, p.title, p.update_date, p.updated_by
FROM topics t
INNER JOIN
(posts p INNER JOIN users u ON (p.updated_by = u.user_id))
ON (t.topic_id = p.topic_id)
LEFT OUTER JOIN posts p2
ON (p.topic_id = p2.topic_id AND p.update_date < p2.update_date)
WHERE p2.post_id IS NULL;
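The self-join on posts p2 keeps a post only when no later post exists in the same topic. A minimal sketch of that technique, run on SQLite from Python with invented rows and without the users join, looks like this:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE topics (topic_id INTEGER PRIMARY KEY, topic TEXT);
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY, topic_id INTEGER, title TEXT, update_date INTEGER);
    INSERT INTO topics VALUES (1, 'General'), (2, 'Off-topic');
    INSERT INTO posts VALUES
        (1, 1, 'first', 10), (2, 1, 'second', 20),
        (3, 2, 'only', 15);
""")

latest_posts = conn.execute("""
    SELECT t.topic, p.post_id, p.title
    FROM topics t
    INNER JOIN posts p ON t.topic_id = p.topic_id
    LEFT OUTER JOIN posts p2
           ON p.topic_id = p2.topic_id
          AND p.update_date < p2.update_date  -- p2 is any later post in the topic
    WHERE p2.post_id IS NULL                  -- keep p only if no later post exists
    ORDER BY t.topic_id
""").fetchall()
print(latest_posts)
```

If two posts in a topic share the same latest update_date, both are returned, so a tie-breaker on post_id may be needed in practice.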
Then get the counts of posts per topic in a separate, simpler query.
SELECT t.topic_id, COUNT(p.post_id) AS post_total -- COUNT(*) would report 1 for a topic with no posts
FROM topics t LEFT OUTER JOIN posts p USING (topic_id)
GROUP BY t.topic_id;
Merge the two data sets in your application.
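Merging in the application can be as simple as indexing one result set by topic_id. The sketch below is in Python rather than PHP, and the function name merge_topic_data plus the sample rows are hypothetical; it only illustrates the idea.

```python
def merge_topic_data(latest_posts, post_counts):
    """Combine latest-post rows with per-topic post counts, keyed by topic_id."""
    latest_by_topic = {row["topic_id"]: row for row in latest_posts}
    merged = []
    for count_row in post_counts:
        topic_id = count_row["topic_id"]
        entry = {"topic_id": topic_id, "post_total": count_row["post_total"]}
        # Topics without posts have no latest-post row and keep only the count.
        entry.update(latest_by_topic.get(topic_id, {}))
        merged.append(entry)
    return merged


# Hypothetical rows, shaped like the output of the two queries.
latest_posts = [
    {"topic_id": 1, "topic": "General", "post_id": 7, "title": "Welcome", "full_name": "Ann Lee"},
]
post_counts = [
    {"topic_id": 1, "post_total": 3},
    {"topic_id": 2, "post_total": 0},
]

merged = merge_topic_data(latest_posts, post_counts)
for entry in merged:
    print(entry)
```

Driving the loop from the counts query means that topics with zero posts still show up in the merged list.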
To ensure you get results for topics without posts, you'll need to use LEFT JOIN instead of JOIN for the first join between topics and the next table. LEFT JOIN means "always return a result set row for every row in the left table, even if there's no match with the right table."
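The difference is easy to see on a two-topic example where one topic has no posts. This is a sketch using Python's sqlite3 module with made-up data, not the forum's real schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE topics (id INTEGER PRIMARY KEY, topic TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, topic_id INTEGER);
    INSERT INTO topics VALUES (1, 'General'), (2, 'Empty');
    INSERT INTO posts VALUES (1, 1), (2, 1);
""")

# Plain JOIN: topics without posts disappear.
inner_rows = conn.execute("""
    SELECT t.topic, COUNT(p.id)
    FROM topics t
    JOIN posts p ON p.topic_id = t.id
    GROUP BY t.id
    ORDER BY t.id
""").fetchall()

# LEFT JOIN: every topic is kept; COUNT(p.id) ignores the NULL filler row.
left_rows = conn.execute("""
    SELECT t.topic, COUNT(p.id)
    FROM topics t
    LEFT JOIN posts p ON p.topic_id = t.id
    GROUP BY t.id
    ORDER BY t.id
""").fetchall()

print(inner_rows)
print(left_rows)
```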
Gotta go now, but I'll try to look at the efficiency issues later.
This is a very complicated query. You should note that JOIN statements will limit your topics to those that have posts. If a topic does not have a post, a JOIN statement will filter it out.
Try the following query.
SELECT *
FROM
(
SELECT T.Topic,
COUNT(AllTopicPosts.ID) NumberOfPosts,
MAX(IFNULL(MostRecentPost.Post_Title, '')) MostRecentPostTitle,
MAX(IFNULL(MostRecentPostUser.UserName, '')) MostRecentPostUser,
MAX(IFNULL(MostRecentPost.Updated_Date, '')) MostRecentPostDate
FROM TOPICS T
LEFT JOIN POSTS AllTopicPosts ON AllTopicPosts.Topic_Id = T.ID
-- a plain derived table cannot reference the outer TOPICS row, so the latest post is picked with a correlated subquery in the ON clause
LEFT JOIN POSTS MostRecentPost ON MostRecentPost.ID =
(
SELECT P.ID
FROM POSTS P
WHERE P.Topic_Id = T.ID
ORDER BY P.Updated_Date DESC
LIMIT 1
)
LEFT JOIN USERS MostRecentPostUser ON MostRecentPostUser.ID = MostRecentPost.User_Id
GROUP BY T.ID, T.Topic
) TopicSummary
ORDER BY MostRecentPostDate DESC
I'd use a left join inside a subquery to pull back the correct topic, and then you can do a little legwork outside of that to get some of the user info.
select
s.topic_id,
s.topic,
u.id as last_updated_by_id,
u.username as last_updated_by,
s.last_post,
s.post_count
from
(
select
t.id as topic_id,
t.topic,
t.user_id as orig_poster,
max(coalesce(p.post_date, t.post_date)) as last_post,
count(*) as post_count -- would be count(p.post_id) if you don't want to count the topic
from
topics t
left join posts p on
t.id = p.topic_id
group by
t.id,
t.topic,
t.user_id
) s
left join posts p on
s.topic_id = p.topic_id
and s.last_post = p.post_date
and s.post_count > 1 -- 0 if you're using p.post_id up top
inner join users u on
u.id = coalesce(p.user_id, s.orig_poster)
order by
s.last_post desc
This query does introduce coalesce and left join, and they are very good concepts to look into. For two arguments (like used here), you can also use ifnull in MySQL, since it is functionally equivalent.
Keep in mind that ifnull is not standard SQL (if you need to port this code). Other databases have other functions for that (isnull in SQL Server, nvl in Oracle, etc.). I used coalesce so that I could keep this query all ANSI-fied.
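For a quick feel of how the two functions compare, here is a tiny sketch run on SQLite from Python (SQLite happens to provide both coalesce and ifnull; the literal values are arbitrary):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# COALESCE returns its first non-NULL argument and accepts any number of them;
# IFNULL is the two-argument form.
result = conn.execute("""
    SELECT COALESCE(NULL, 'fallback'),
           IFNULL(NULL, 'fallback'),
           COALESCE(NULL, NULL, 'third'),
           COALESCE('first', 'fallback')
""").fetchone()
print(result)
```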
