SQL: Selecting count of multiple tables - php

I don't think this will be too complicated to explain, but it is certainly complicated to get working.
First of all, I have a couple of tables holding user comments, one table for each section (forum, articles, etc.), as shown below:
site_users (id, username, ...) [Table that holds user's info]
site_articles_comments (id, user_id, comment, ...) [Where user_id = site_users.id]
site_forum_comments (id, user_id, comment, ...) [Same as site_articles_comments]
Every new row is a new comment, and users can comment multiple times, so each user can have many rows. For a ranking system, I need to count the rows per user and sort the users by that count.
I was able to make a simple forum rank by doing this simple query:
SELECT u.id, u.username, COUNT(r.id) AS rank
FROM site_users AS u
LEFT JOIN site_forum_comments AS r ON u.id = r.user_id
GROUP BY u.username, u.id
ORDER BY rank DESC
LIMIT :l
This query sorts all users from the database, so the user who has commented the most is always on top.
What I need, on the other hand, is a global ranking system that sums the number of comments in each section (articles, forum, etc.) and displays the users accordingly.
I was playing around with the SQL to do that, and the last thing I came up with was this huge query:
SELECT u.id, u.username, (COUNT(a.id) + COUNT(f.id)) AS rank
FROM site_users u
LEFT JOIN site_articles_comments a ON a.user_id = u.id
LEFT JOIN site_forum_comments f ON f.user_id = u.id
GROUP BY u.username, u.id
ORDER BY rank DESC
LIMIT :l
This, however, returns null. What could I possibly do to achieve the result I want?
Thanks in advance,
Mateus
EDIT1: Sorry for the lack of information, this is regarding MySQL.

The problem is math with nulls, and ordering with nulls (look into the "NULLS LAST" option, which overrides the default of returning nulls first in a descending sort).
In your case, with the outer joins, if the user has a ton of article comments but no forum comments, well, 100 + null = null in Oracle math. So to get the math to work you need to make null=0. That's where NVL() comes in (and also has the nice side-effect of eliminating pesky nulls from your result set)!
SELECT u.id, u.username, (NVL(COUNT(a.id),0) + NVL(COUNT(f.id),0)) AS rank
FROM site_users u
LEFT JOIN site_articles_comments a ON a.user_id = u.id
LEFT JOIN site_forum_comments f ON f.user_id = u.id
GROUP BY u.username, u.id ORDER BY rank DESC LIMIT :l
I see you have both MySQL and Oracle in your tags; the above is for Oracle. For MySQL, use COALESCE(COUNT(...), 0) instead.
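A minimal sketch of the NULL arithmetic described above, using Python's built-in sqlite3 module purely as a convenient stand-in (the table name `comments` is made up for the illustration). It also checks what COUNT() returns when nothing matches, which is worth knowing before deciding whether the wrapper is doing any work:

```python
import sqlite3

# SQLite is only a stand-in here; standard SQL NULL arithmetic and COUNT()
# semantics are the same in MySQL and Oracle.
conn = sqlite3.connect(":memory:")

# Adding NULL to a number yields NULL (None on the Python side).
null_sum = conn.execute("SELECT 100 + NULL").fetchone()[0]

# COALESCE replaces a NULL operand with 0 before the addition.
safe_sum = conn.execute("SELECT 100 + COALESCE(NULL, 0)").fetchone()[0]

# COUNT() over zero matching rows returns 0, not NULL.
conn.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, user_id INTEGER)")
empty_count = conn.execute(
    "SELECT COUNT(id) FROM comments WHERE user_id = 1"
).fetchone()[0]
```

Since COUNT() already yields 0 for a user without comments, wrapping it in NVL() or COALESCE() is harmless but does not change the sum; the wrapper matters for aggregates such as SUM() or MAX(), which do return NULL on an empty set.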

Try:
SELECT u.id, MIN(u.username) AS username, (COALESCE(COUNT(DISTINCT a.id), 0) + COALESCE(COUNT(DISTINCT f.id), 0)) AS rank
FROM site_users AS u
LEFT JOIN site_articles_comments AS a ON (a.user_id = u.id)
LEFT JOIN site_forum_comments AS f ON (f.user_id = u.id)
GROUP BY u.id
ORDER BY rank DESC
LIMIT :l
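Why the DISTINCT matters: joining both comment tables to the same user produces one row per (article comment, forum comment) pair, so a plain COUNT() on either side counts the product rather than the individual totals. The sketch below reproduces that with Python's sqlite3 module as a stand-in for MySQL; the sample users and comments are invented, and the alias `total_comments` is used instead of `rank` only for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE site_users (id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE site_articles_comments (id INTEGER PRIMARY KEY, user_id INTEGER, comment TEXT);
CREATE TABLE site_forum_comments (id INTEGER PRIMARY KEY, user_id INTEGER, comment TEXT);
INSERT INTO site_users VALUES (1, 'alice'), (2, 'bob');
-- alice has 2 article comments and 3 forum comments; bob has none.
INSERT INTO site_articles_comments (user_id, comment) VALUES (1, 'a1'), (1, 'a2');
INSERT INTO site_forum_comments (user_id, comment) VALUES (1, 'f1'), (1, 'f2'), (1, 'f3');
""")

# The two count expressions are substituted into the same joined query.
QUERY = """
SELECT u.id, u.username, {articles} + {forum} AS total_comments
FROM site_users AS u
LEFT JOIN site_articles_comments AS a ON a.user_id = u.id
LEFT JOIN site_forum_comments AS f ON f.user_id = u.id
GROUP BY u.id, u.username
ORDER BY total_comments DESC, u.id
"""

# Plain COUNT: alice's 2 x 3 = 6 joined rows are counted on both sides.
naive = conn.execute(
    QUERY.format(articles="COUNT(a.id)", forum="COUNT(f.id)")
).fetchall()

# COUNT(DISTINCT ...): each comment id is counted once per table.
distinct = conn.execute(
    QUERY.format(articles="COUNT(DISTINCT a.id)", forum="COUNT(DISTINCT f.id)")
).fetchall()
```

With this data the plain version reports 12 for alice while the DISTINCT version reports the expected 5, and bob gets 0 in both.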

Related

MySQL UNION ALL with LEFT JOIN

I am trying to work out the most efficient query to put data from two tables into one set of results and left join the user data to it. I need help with the syntax for the query. I have put together the below but it's not quite right.
$query_chat = sprintf("SELECT comment_id, user_id, comment, timestamp FROM activity
UNION ALL
SELECT comment_id, user_id, comment, timestamp FROM comments
LEFT JOIN users.company, users.contact_person, users.email ON users.user_id = comments.user_id
WHERE user_id = %s
ORDER BY timestamp DESC", GetSQLValueString($_GET['comment_id'], "INT"));
The syntax is wrong and throws up the below - I am trying to make it efficient by only selecting what I need from the users table.
1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ' users.contact_person, users.email ON users.user_id = comments.user_id WHERE us' at line 4
I have now got to this point below but get Unknown column 'users.company' in 'field list'
SELECT
comments.comment_id, comments.user_id, comments.comment,
comments.timestamp,
users.company, users.contact_person, users.email
FROM
comments
UNION ALL
SELECT
activity.comment_id, activity.user_id, activity.comment,
activity.timestamp,
users.company, users.contact_person, users.email
FROM
activity
LEFT JOIN users ON users.user_id = comments.user_id
WHERE
comments.comment_id = 69 ORDER BY timestamp ASC
Here is your problem:
LEFT JOIN
users.company, users.contact_person, users.email
ON users.user_id = comments.user_id
The syntax is:
LEFT JOIN (table) ON (join-clause)
So, in your case, I think you need something like this:
SELECT
comments.comment_id, comments.user_id, comments.comment,
comments.timestamp,
users.company, users.contact_person, users.email
FROM
comments
LEFT JOIN users ON users.user_id = comments.user_id
WHERE
users.user_id = %s
;
As you can see, when using a JOIN, the columns come first (from all tables) and then the tables come next, with their corresponding JOIN clauses.
Update: in reply to your update, you have two options. You can do this (which you are closest to):
SELECT
(users joined to activity)
UNION
(users joined to comments)
You already know how to do the (expressions) above - see the earlier part of my answer. Or, you can do things this way:
SELECT
*
FROM
(SELECT * FROM activity UNION SELECT * FROM comments) user_actions
LEFT JOIN users ON (users.user_id = user_actions.user_id)
Note that in both examples in this update I am offering pseudocode [1]; you need to fill in the blanks. You'll find that taking a generalised example (say, from a manual) and applying it to your own use case is a skill you need a great deal in computer science, so you should practice it as often as possible.
[1] That said, the second version containing the sub-query is done except for changing the columns to fetch, so you are probably 95% of the way there!
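For what it's worth, here is one way the second option could be filled in, run against an in-memory SQLite database from Python so it can be tried without a MySQL server. The sample rows and the derived-table alias `ua` are made up; the column names are taken from the question. The UNION ALL happens first inside the derived table, and the users table is joined once to the combined result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, company TEXT, contact_person TEXT, email TEXT);
CREATE TABLE activity (comment_id INTEGER, user_id INTEGER, comment TEXT, timestamp INTEGER);
CREATE TABLE comments (comment_id INTEGER, user_id INTEGER, comment TEXT, timestamp INTEGER);
INSERT INTO users VALUES (1, 'Acme', 'Ann', 'ann@example.com');
INSERT INTO activity VALUES (69, 1, 'opened', 10);
INSERT INTO comments VALUES (69, 1, 'hello', 20), (69, 2, 'orphan', 30), (70, 1, 'other', 5);
""")

# Both branches list the same columns in the same order, as UNION requires.
QUERY = """
SELECT ua.comment_id, ua.user_id, ua.comment, ua.timestamp,
       u.company, u.contact_person, u.email
FROM (
    SELECT comment_id, user_id, comment, timestamp FROM activity
    UNION ALL
    SELECT comment_id, user_id, comment, timestamp FROM comments
) AS ua
LEFT JOIN users AS u ON u.user_id = ua.user_id
WHERE ua.comment_id = ?
ORDER BY ua.timestamp ASC
"""

# The id is passed as a bound parameter rather than formatted into the string.
rows = conn.execute(QUERY, (69,)).fetchall()
```

The row written by user 2 has no match in users, so the LEFT JOIN keeps it and fills the three user columns with NULL.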
So I finally got there with the below; I appreciate the direction, halfer.
SELECT
comments.*, users.company, users.contact_person, users.email
FROM
comments
LEFT JOIN users ON users.user_id = comments.user_id
WHERE
comment_id = %s
UNION ALL
SELECT activity.*, users.company, users.contact_person, users.email
FROM
activity
LEFT JOIN users ON users.user_id = activity.user_id
WHERE
comment_id = %s
ORDER BY
timestamp ASC

MySQL Join/WHERE IN performance

I'm running the following query to select all posts liked by a user. The problem is, it takes quite a few seconds for the page to load.
SELECT p.*, a.username, a.avatar FROM user_posts p
LEFT JOIN account a ON p.uid=a.id WHERE p.pid in
(select post from user_posts_likes where `by`='$user_id')
ORDER BY `pid` DESC LIMIT $npage, 10
Is there a better way to do this instead of using WHERE IN?
Thanks.
You can try two joins like this:
SELECT p.*, a.username, a.avatar FROM user_posts_likes l
JOIN user_posts p ON l.post = p.pid
LEFT JOIN account a ON p.uid = a.id
WHERE l.by = 555
I deduced the table and foreign key names from your original query, so they might be wrong.
The 555 is an example user id, obviously.
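A quick way to convince yourself that the two forms return the same posts is to run both against a small sample. The sketch below does that with Python's sqlite3 module (a stand-in for MySQL; the sample rows are invented, and `by` is quoted with double quotes because it is a keyword). It assumes each user can like a given post at most once; if duplicates were possible, the JOIN form would return the post once per like.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, username TEXT, avatar TEXT);
CREATE TABLE user_posts (pid INTEGER PRIMARY KEY, uid INTEGER);
CREATE TABLE user_posts_likes (post INTEGER, "by" INTEGER, UNIQUE (post, "by"));
INSERT INTO account VALUES (1, 'ann', 'a.png'), (2, 'ben', 'b.png');
INSERT INTO user_posts VALUES (10, 1), (11, 2), (12, 1);
INSERT INTO user_posts_likes VALUES (10, 555), (12, 555), (11, 7);
""")

# Original shape: filter the posts with an IN subquery.
IN_QUERY = """
SELECT p.pid, a.username
FROM user_posts p
LEFT JOIN account a ON p.uid = a.id
WHERE p.pid IN (SELECT post FROM user_posts_likes WHERE "by" = ?)
ORDER BY p.pid DESC
"""

# Suggested shape: start from the likes and join outwards.
JOIN_QUERY = """
SELECT p.pid, a.username
FROM user_posts_likes l
JOIN user_posts p ON l.post = p.pid
LEFT JOIN account a ON p.uid = a.id
WHERE l."by" = ?
ORDER BY p.pid DESC
"""

in_rows = conn.execute(IN_QUERY, (555,)).fetchall()
join_rows = conn.execute(JOIN_QUERY, (555,)).fetchall()
```

Whichever form you keep, an index on the likes table covering the user column and the post column is likely to matter more for load time than the choice between IN and JOIN, though that is worth confirming with EXPLAIN on the real data.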

Using SQL JOIN and COUNT

Let there be two tables, one holding user information and one holding user records of some sort, say receipts. There is a one-to-many relationship between the users and receipts.
What would be the best SQL method of retrieving users, sorted by the greatest number of receipts?
The best way I can think of is using a join and count(?) to return an array of users and their number of associated receipts.
Is there a way to make use of the count function in this instance?
select * from `users` inner join `receipts` on `users`.`id` = `receipts`.`uId`
If OP wishes to include additional information (additional aggregations, etc...) utilizing data from users table:
SELECT `users`.`id`,
count(`receipts`.`uId`)
FROM `users`
INNER JOIN `receipts` ON `users`.`id` = `receipts`.`uId`
GROUP BY `users`.`id`
ORDER BY count(`receipts`.`uId`) DESC
Otherwise, only the receipts table is required...
SELECT `receipts`.`uId`,
count(`receipts`.`uId`)
FROM `receipts`
GROUP BY `receipts`.`uId`
ORDER BY count(`receipts`.`uId`) DESC
Two answers provided by Dave and meewoK will accomplish what you need. I'm providing an alternative, which should provide better performance and allow you to show more user information because in the case with Dave's answer you can only SELECT columns that are used by an aggregate function or in the group clause.
SELECT u.id, u.name, r.numReceipts
FROM users u
INNER JOIN (
SELECT uId, count(*) as numReceipts
FROM receipts
GROUP BY uId
) as r ON r.uId = u.id
ORDER BY r.numReceipts DESC
This creates an inline view that returns only the count of receipts for each user, and then joins that inline view on the user's ID.
Someone correct me if I'm wrong, but I've been told that the planner isn't as efficient when you use a scalar subquery in the SELECT clause; it's better to join on a derived table this way. There are multiple ways to write this query, and it all depends on how you want to use the information. Cheers!
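Here is a small runnable sketch of the inline-view approach, using Python's sqlite3 module as a stand-in for MySQL. The `name` column and the sample rows are assumptions made for the example. The receipts are aggregated per user first, and the already-aggregated rows are then joined to users, so any user column can be selected without touching GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE receipts (id INTEGER PRIMARY KEY, uId INTEGER);
INSERT INTO users VALUES (1, 'ann'), (2, 'ben'), (3, 'cat');
-- ann has 1 receipt, ben has 3, cat has none.
INSERT INTO receipts (uId) VALUES (1), (2), (2), (2);
""")

# The derived table r holds one row per user that has receipts.
ranked = conn.execute("""
SELECT u.id, u.name, r.numReceipts
FROM users u
INNER JOIN (
    SELECT uId, COUNT(*) AS numReceipts
    FROM receipts
    GROUP BY uId
) AS r ON r.uId = u.id
ORDER BY r.numReceipts DESC
""").fetchall()
```

Because this is an INNER JOIN, a user without receipts (cat here) does not appear in the result at all.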
Try this:
SELECT a.`id`, count(b.`uId`) as total_receipts
FROM `users` a
INNER JOIN `receipts` b
ON a.`id` = b.`uId`
GROUP BY a.`id`
ORDER BY count(b.`uId`) desc
SELECT users.*, (SELECT COUNT(*) FROM receipts WHERE receipts.uId = users.id) AS counter FROM users ORDER BY counter DESC
Something like this may work (not sure about the speed if the tables are big, though).
If you want to include all users, even those with no receipts, then a good way is a left outer join:
SELECT u.*, count(r.uId) as NumReceipts
FROM `users` u left outer join
`receipts` r
ON u.id = r.`uId`
GROUP BY u.`id`
ORDER BY NumReceipts DESC;
If you only want the id for users that have receipts, then the join is not even necessary:
SELECT r.uid, count(*) as NumReceipts
FROM receipts r
GROUP BY r.uid
ORDER BY NumReceipts DESC
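To see the left outer join version keep users with no receipts, here is a hedged sketch using Python's sqlite3 module in place of MySQL (the `name` column and the sample rows are invented). The count is taken on a receipts column rather than with COUNT(*), so the NULL row produced for a user without receipts contributes 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE receipts (id INTEGER PRIMARY KEY, uId INTEGER);
INSERT INTO users VALUES (1, 'ann'), (2, 'ben'), (3, 'cat');
INSERT INTO receipts (uId) VALUES (1), (2), (2), (2);
""")

# COUNT(r.uId) ignores the NULL that the outer join supplies for cat.
with_zeros = conn.execute("""
SELECT u.id, u.name, COUNT(r.uId) AS NumReceipts
FROM users u
LEFT OUTER JOIN receipts r ON u.id = r.uId
GROUP BY u.id, u.name
ORDER BY NumReceipts DESC, u.id
""").fetchall()

# Same join with COUNT(*): the NULL-extended row is counted as 1.
with_star = conn.execute("""
SELECT u.id, COUNT(*) AS NumReceipts
FROM users u
LEFT OUTER JOIN receipts r ON u.id = r.uId
GROUP BY u.id
ORDER BY u.id
""").fetchall()
```

The second query is there only to show the pitfall: with COUNT(*) the user without receipts is reported as having one.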

MySql If Record Exists Inside Select Query

Creating a forum-type site and I have 3 tables, one for storing user information (users), one for the original post (threads), and one for an upvote system like SO has (votes).
The votes table has 3 columns, id, userid, and threadid. When a user upvotes a thread, a record is inserted into the votes table. When I query for the thread I want to know if the user has upvoted for it, essentially if a record exists in the votes table with the correct userid and threadid. I can do this in two queries, but I think there has to be a way to get everything in one.
My query currently:
"SELECT t.id, t.title, t.content, u.id AS uid, u.username
FROM threads t, users u
WHERE t.id = '".$userid."'
AND t.author = '".$userid."'"
In case you need a better idea, the following returns the desired results ONLY if the user has upvoted. I need the query to still return a row if the record in the votes table doesn't exist (possibly returning the vote value as null?).
"SELECT t.id, t.title, t.content, u.id
AS uid, u.username, v.id
FROM threads t, users u, votes v
WHERE t.id = '".$threadid."'
AND t.author = '".$userid."'
AND v.threadid = t.id
AND v.userid = '".$userid."'"
Also I taught myself (and am still learning) mysql and database design so if there's a better method/approach such as joining tables, please let me know. Thanks.
Your second query is doing inner joins, which would only return records that appear on both sides of the join. You'd want to do a left/right outer join on the votes table instead, so that you'd still get user+thread records even if there's no matching vote record.
SELECT t.id, t.title, t.content, u.id, u.username, v.id
FROM threads AS t
INNER JOIN users AS u ON t.author = u.id
LEFT JOIN votes AS v ON (v.userid = u.id and t.id = v.threadid)
WHERE (u.id = $userid) AND (t.id = $threadid)
Just guessing at this, but it should be enough to get you started.
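As a rough, runnable version of the idea, the sketch below uses Python's sqlite3 module as a stand-in for MySQL. The helper name `fetch_thread` and the sample rows are hypothetical; the table and column names follow the question. The vote id comes back as NULL (None in Python) when the user has not upvoted the thread, and the values are bound as parameters instead of being concatenated into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE threads (id INTEGER PRIMARY KEY, title TEXT, content TEXT, author INTEGER);
CREATE TABLE votes (id INTEGER PRIMARY KEY, userid INTEGER, threadid INTEGER);
INSERT INTO users VALUES (1, 'ann');
INSERT INTO threads VALUES (1, 'First', 'body', 1), (2, 'Second', 'body', 1);
-- ann has upvoted thread 1 only.
INSERT INTO votes VALUES (1, 1, 1);
""")

QUERY = """
SELECT t.id, t.title, u.username, v.id AS vote_id
FROM threads AS t
INNER JOIN users AS u ON t.author = u.id
LEFT JOIN votes AS v ON v.userid = u.id AND v.threadid = t.id
WHERE u.id = ? AND t.id = ?
"""

def fetch_thread(connection, user_id, thread_id):
    """Return the thread row; vote_id is None when no vote exists."""
    return connection.execute(QUERY, (user_id, thread_id)).fetchone()

voted = fetch_thread(conn, 1, 1)
not_voted = fetch_thread(conn, 1, 2)
```

In the application, `row[3] is not None` (or `vote_id IS NOT NULL` in SQL) then tells you whether to show the thread as upvoted.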

Need help with a multiple table query in mysql

I'm working on building a forum with kohana. I know there is already good, free, forum software out there, but it's for a family site, so I thought I'd use it as a learning experience. I'm also not using the ORM that is built into Kohana, as I would like to learn more about SQL in the process of building the forum.
For my forum I have 4 main tables:
USERS
TOPICS
POSTS
COMMENTS
TOPICS table: id (auto incremented), topic column.
USERS table: username, email, first and last name and a few other unrelated columns
POSTS table: id (auto incremented), post-title, post-body, topic-id, user-id, post-date, updated-date, updated-by(which will contain the user-id of the person who made the most recent comment)
COMMENTS table: id (auto incremented), post-id, user-id and comment
On the main forum page I would like to have:
a list of all of the topics
the number of posts for each topic
the last updated post, and who updated it
the most recently updated topic to be on top, most likely an "ORDER BY updated-date"
Here is the query I have so far:
SELECT topics.id AS topic-id,
topics.topic,
post-user.id AS user-id,
CONCAT_WS(' ', post-user.first-name, post-user.last-name) AS name,
recent-post.id AS post-id,
post-num.post-total,
recent-post.title AS post-title,
recent-post.update_date AS updated-date,
recent-post.updated-by AS updated-by
FROM topics
JOIN (SELECT posts.topic-id,
COUNT(*) AS post-total
FROM POSTS
WHERE posts.topic-id = topic-id
GROUP BY posts.topic-id) AS post-num ON topics.id = post-num.topic-id
JOIN (SELECT posts.*
FROM posts
ORDER BY posts.update-date DESC) AS recent-post ON topics.id = recent-post.topic-id
JOIN (SELECT users.*,
posts.user-id
FROM users, posts
WHERE posts.user-id = users.id) as post-user ON recent-post.user_id = post-user.id
GROUP BY topics.id
This query almost works as it will get all of information for topics that have posts. But it doesn't return the topics that don't have any posts.
I'm sure that the query is inefficient and wrong since it makes two sub-selects to the posts table, but it was the only way I could get to the point I'm at.
A dash is not a valid character in unquoted SQL identifiers; use "_" instead.
You don't necessarily have to get everything from a single SQL query. In fact, trying to do so makes it harder to code, and also sometimes makes it harder for the SQL optimizer to execute.
It makes no sense to use ORDER BY in a subquery.
Name your primary key columns topic_id, user_id, and so on (instead of "id" in every table), and you won't have to alias them in the select-list.
Here's how I would solve this:
First get the most recent post per topic, with associated user information:
SELECT t.topic_id, t.topic,
u.user_id, CONCAT_WS(' ', u.first_name, u.last_name) AS full_name,
p.post_id, p.title, p.update_date, p.updated_by
FROM topics t
INNER JOIN
(posts p INNER JOIN users u ON (p.updated_by = u.user_id))
ON (t.topic_id = p.topic_id)
LEFT OUTER JOIN posts p2
ON (p.topic_id = p2.topic_id AND p.update_date < p2.update_date)
WHERE p2.post_id IS NULL;
Then get the counts of posts per topic in a separate, simpler query.
SELECT t.topic_id, COUNT(p.post_id) AS post_total
FROM topics t LEFT OUTER JOIN posts p USING (topic_id)
GROUP BY t.topic_id;
Merge the two data sets in your application.
To ensure you get results for topics without posts, you'll need to use LEFT JOIN instead of JOIN for the first join between topics and the next table. LEFT JOIN means "always return a result set row for every row in the left table, even if there's no match with the right table."
Gotta go now, but I'll try to look at the efficiency issues later.
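To make "merge the two data sets in your application" concrete, here is a minimal sketch in Python with sqlite3 standing in for MySQL. The sample rows and the name `forum_index` are invented, and `||` is used for string concatenation instead of CONCAT_WS only to keep the example portable to the stand-in database. The first query finds the newest post per topic with the "no later post exists" outer join, the second counts posts per topic, and a dictionary keyed by topic_id ties them together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE topics (topic_id INTEGER PRIMARY KEY, topic TEXT);
CREATE TABLE users (user_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE posts (post_id INTEGER PRIMARY KEY, topic_id INTEGER, title TEXT,
                    update_date TEXT, updated_by INTEGER);
INSERT INTO topics VALUES (1, 'News'), (2, 'Empty');
INSERT INTO users VALUES (1, 'Ann', 'Lee');
INSERT INTO posts VALUES (1, 1, 'Old', '2020-01-01', 1), (2, 1, 'New', '2020-02-01', 1);
""")

# Query 1: a post is the latest in its topic when no post p2 is newer.
latest_rows = conn.execute("""
SELECT t.topic_id, p.title, u.first_name || ' ' || u.last_name AS full_name
FROM topics t
INNER JOIN posts p ON t.topic_id = p.topic_id
INNER JOIN users u ON p.updated_by = u.user_id
LEFT OUTER JOIN posts p2
  ON p.topic_id = p2.topic_id AND p.update_date < p2.update_date
WHERE p2.post_id IS NULL
""").fetchall()

# Query 2: COUNT(p.post_id) so that a topic without posts counts as 0.
count_rows = conn.execute("""
SELECT t.topic_id, t.topic, COUNT(p.post_id) AS post_total
FROM topics t
LEFT OUTER JOIN posts p ON t.topic_id = p.topic_id
GROUP BY t.topic_id, t.topic
ORDER BY t.topic_id
""").fetchall()

# Merge: index the latest-post rows by topic_id, then walk the counts.
latest_by_topic = {tid: (title, name) for tid, title, name in latest_rows}
forum_index = []
for topic_id, topic, post_total in count_rows:
    title, name = latest_by_topic.get(topic_id, (None, None))
    forum_index.append({"topic": topic, "post_total": post_total,
                        "latest_title": title, "updated_by": name})
```

Topics without posts are missing from the first result set but present in the second, so driving the merge from the counts keeps them on the page with empty "latest post" fields.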
This is a very complicated query. You should note that JOIN statements will limit your topics to those that have posts. If a topic does not have a post, a JOIN statement will filter it out.
Try the following query.
SELECT *
FROM
(
SELECT TOPICS.Topic,
COUNT(AllTopicPosts.ID) NumberOfPosts,
MAX(IFNULL(MostRecentPost.Post_Title, '')) MostRecentPostTitle,
MAX(IFNULL(MostRecentPostUser.UserName, '')) MostRecentPostUser,
MAX(IFNULL(MostRecentPost.Updated_Date, '')) MostRecentPostDate
FROM TOPICS
LEFT JOIN POSTS AllTopicPosts ON AllTopicPosts.Topic_Id = TOPICS.ID
LEFT JOIN
(
-- Note: referencing TOPICS inside this derived table requires LATERAL (MySQL 8.0.14 or later)
SELECT *
FROM Posts P
WHERE P.Topic_id = TOPICS.id
ORDER BY P.Updated_Date DESC
LIMIT 1
) MostRecentPost ON MostRecentPost.Topic_Id = TOPICS.ID
LEFT JOIN USERS MostRecentPostUser ON MostRecentPostUser.ID = MostRecentPost.User_Id
GROUP BY TOPICS.Topic
) TopicSummary
ORDER BY MostRecentPostDate DESC
I'd use a left join inside a subquery to pull back the correct topic, and then you can do a little legwork outside of that to get some of the user info.
select
s.topic_id,
s.topic,
u.id as last_updated_by_id,
u.user_name as last_updated_by,
s.last_post,
s.post_count
from
(
select
t.id as topic_id,
t.topic,
t.user_id as orig_poster,
max(coalesce(p.post_date, t.post_date)) as last_post,
count(*) as post_count -- would be p.post_id if you don't want to count the topic
from
topics t
left join posts p on
t.id = p.topic_id
group by
t.id,
t.topic,
t.user_id
) s
left join posts p on
s.topic_id = p.topic_id
and s.last_post = p.post_date
and s.post_count > 1 -- 0 if you're using p.post_id up top
inner join users u on
u.id = coalesce(p.user_id, s.orig_poster)
order by
s.last_post desc
This query does introduce coalesce and left join, and they are very good concepts to look into. For two arguments (like used here), you can also use ifnull in MySQL, since it is functionally equivalent.
Keep in mind that that's exclusive to MySQL (if you need to port this code). Other databases have other functions for that (isnull in SQL Server, nvl in Oracle, etc., etc.). I used coalesce so that I could keep this query all ANSI-fied.
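A tiny check of the equivalence mentioned above, run through Python's sqlite3 module (SQLite happens to provide both functions, so it works as a stand-in for MySQL here; the literal values are arbitrary):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# With two arguments, COALESCE and IFNULL both return the first non-NULL value.
two_args = conn.execute(
    "SELECT COALESCE(NULL, 'topic date'), IFNULL(NULL, 'topic date')"
).fetchone()

# COALESCE also accepts more than two arguments; IFNULL does not.
three_args = conn.execute("SELECT COALESCE(NULL, NULL, 'fallback')").fetchone()[0]

# When the first argument is not NULL, it is returned unchanged.
first_wins = conn.execute("SELECT COALESCE('post date', 'topic date')").fetchone()[0]
```

In the query above this is what lets `coalesce(p.post_date, t.post_date)` fall back to the topic's own date when a topic has no posts.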
