On a webpage, I am displaying a number of picture collections (I show the thumbnails for each collection). Each picture has five relevant tables:
likes (id, user_id, picture_id),
views (id, user_id, picture_id),
comments (id, user_id, picture_id, comment),
pictures (id (which equals the "picture_id" in the previous tables), collection_id, picture_url and several other columns),
collections (id (equal to collection_id in the previous table), and several other columns).
When loading my page, I need to aggregate the number of likes, views and comments for all pictures in each collection, so as to show those numbers under each collection.
So basically: count the likes for each picture, count them all up, display number. Count the views for each picture, count them all up, display number. Count the comments for each picture, count them all up, display number. And then rinse and repeat for all collections.
I'm pretty new to MySQL, and I'm struggling to choose between selects, multiple joins, counts, PHP vs. MySQL, etc. I'm sure there are many ways to do this that would be very inefficient, so I'm hoping you can tell me the best/fastest/most efficient way to do it.
Thanks in advance!
You can solve this with selects and left joins.
Since you'll count entries in each table for every picture, the pictures table will be the left side of each join. So:
select
p.id as pictureId,
count(distinct l.id) as count_likes,
count(distinct v.id) as count_views,
count(distinct c.id) as count_comments
from
pictures as p
left join likes as l on p.id = l.picture_id
left join views as v on p.id = v.picture_id
left join comments as c on p.id = c.picture_id
group by
p.id
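Basically, you are counting the matching records in each table for every record in the pictures table; if a picture has no records in likes, views or comments, the corresponding count will be zero. The count(distinct ...) matters because joining several child tables at once multiplies the rows: a picture with 2 likes and 3 views produces 6 joined rows.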
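To make the effect of count(distinct ...) concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, and the sample rows are made up for illustration (picture 1 has 2 likes and 3 views, picture 2 has none).

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pictures (id INTEGER PRIMARY KEY, collection_id INTEGER);
    CREATE TABLE likes (id INTEGER PRIMARY KEY, user_id INTEGER, picture_id INTEGER);
    CREATE TABLE views (id INTEGER PRIMARY KEY, user_id INTEGER, picture_id INTEGER);
    INSERT INTO pictures VALUES (1, 10), (2, 10);
    INSERT INTO likes (user_id, picture_id) VALUES (1, 1), (2, 1);
    INSERT INTO views (user_id, picture_id) VALUES (1, 1), (2, 1), (3, 1);
""")

# naive_likes counts joined rows (likes x views); the DISTINCT counts do not.
rows = conn.execute("""
    SELECT p.id,
           COUNT(l.id)          AS naive_likes,
           COUNT(DISTINCT l.id) AS count_likes,
           COUNT(DISTINCT v.id) AS count_views
    FROM pictures AS p
    LEFT JOIN likes AS l ON l.picture_id = p.id
    LEFT JOIN views AS v ON v.picture_id = p.id
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()

for row in rows:
    print(row)
```

The naive count for picture 1 is inflated by the row multiplication, while picture 2 still appears with zeros thanks to the left joins.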
Of course, you can expand this idea for collections:
select
c.id as collection_id,
p.id as picture_id,
count(distinct l.id) as count_likes,
count(distinct v.id) as count_views,
count(distinct cm.id) as count_comments
from
collections as c
left join pictures as p on c.id = p.collection_id
left join likes as l on p.id = l.picture_id
left join views as v on p.id = v.picture_id
left join comments as cm on p.id = cm.picture_id
group by
c.id,
p.id
If you want to filter your results for each collection, you only need to add where c.id = aValue before the group by (where aValue is the collection Id you want to retrieve)
Hope this helps you.
If you only need the aggregate data for each collection:
select
c.id as collection_id,
count(distinct l.id) as count_likes,
count(distinct v.id) as count_views,
count(distinct cm.id) as count_comments
from
collections as c
left join pictures as p on c.id = p.collection_id
left join likes as l on p.id = l.picture_id
left join views as v on p.id = v.picture_id
left join comments as cm on p.id = cm.picture_id
group by
c.id
This should do the trick ;-)
You could do this with subselects:
SELECT
collections.*,
( SELECT COUNT(*) FROM pictures, likes
WHERE pictures.id = likes.picture_id
AND pictures.collection_id = collections.id
) AS like_count,
( SELECT COUNT(*) FROM pictures, views
WHERE pictures.id = views.picture_id
AND pictures.collection_id = collections.id
) AS view_count,
( SELECT COUNT(*) FROM pictures, comments
WHERE pictures.id = comments.picture_id
AND pictures.collection_id = collections.id
) AS comment_count
FROM collections
WHERE ...
This looks like it's going over the pictures table thrice, but I suspect that MySQL might be able to optimize that using the join buffer. I should note that I haven't actually tested this query, however. I also have no idea how this compares performance-wise with Barranka's LEFT JOIN solution. (Both would be pretty horrible if implemented naïvely, so it comes down to how smart MySQL's query optimizer is in each case.)
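For anyone who wants to try the subselect idea before running it against real data, here is a small sketch. It runs on SQLite through Python's sqlite3 module rather than MySQL, so it says nothing about MySQL's optimizer or performance; the tables are cut down to the columns needed and the rows are invented.

```python
import sqlite3

# In-memory SQLite stand-in for MySQL; sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE collections (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE pictures (id INTEGER PRIMARY KEY, collection_id INTEGER);
    CREATE TABLE likes (id INTEGER PRIMARY KEY, user_id INTEGER, picture_id INTEGER);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, user_id INTEGER,
                           picture_id INTEGER, comment TEXT);
    INSERT INTO collections VALUES (10, 'holiday'), (20, 'empty');
    INSERT INTO pictures VALUES (1, 10), (2, 10);
    INSERT INTO likes (user_id, picture_id) VALUES (1, 1), (2, 1), (1, 2);
    INSERT INTO comments (user_id, picture_id, comment) VALUES (1, 2, 'nice');
""")

# One correlated subquery per counter; each runs once per collection row.
rows = conn.execute("""
    SELECT c.id,
           (SELECT COUNT(*)
              FROM pictures AS p JOIN likes AS l ON l.picture_id = p.id
             WHERE p.collection_id = c.id) AS like_count,
           (SELECT COUNT(*)
              FROM pictures AS p JOIN comments AS m ON m.picture_id = p.id
             WHERE p.collection_id = c.id) AS comment_count
    FROM collections AS c
    ORDER BY c.id
""").fetchall()

for row in rows:
    print(row)
```

Because each counter is computed in its own subquery, there is no row multiplication between likes and comments, so plain COUNT(*) is enough here.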
Related
I have two tables, one for news and one for images. Each news item can have 0-3 images (the columns image_1, image_2 and image_3 in the news table hold image ids). I am trying to get all the matching rows from the images table, but I only get one back.
Like this (but it is not working):
select news.id as nid, image_1, image_2, image_3, photos.id as pid, big, small
from news
left join photos
on image_1=photos.id, image_2=photos.id, image_3=photos.id
order by nid desc
@juergen has already suggested a better option and explained how to solve the problem your way, but if you are still stuck, you can use the query below:
SELECT p.id AS pid, n1.image_1, n2.image_2, n3.image_3, big, small
FROM photos AS p
LEFT JOIN news AS n1 ON n1.image_1=p.id
LEFT JOIN news AS n2 ON n2.image_2=p.id
LEFT JOIN news AS n3 ON n3.image_3=p.id
ORDER BY COALESCE(n1.id, n2.id, n3.id) DESC;
You have to join the photos table 3 times with different aliases.
But you actually should rather change your table design. Add another table called news_photos
news_photos table
-----------------
news_id
photo_id
Then you can remove the image columns from the news table.
After the changes you can select a news item with all of its photos like this:
select n.*, p.big, p.small
from news n
left join news_photos np on n.id = np.news_id
left join photos p on p.id = np.photo_id
where n.id = 1234
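As a rough illustration of the news_photos design, the sketch below builds the three tables in an in-memory SQLite database (via Python's sqlite3, standing in for MySQL) and lists every news item with its photos. The title column, the composite primary key and the sample rows are assumptions added for the example.

```python
import sqlite3

# In-memory SQLite stand-in for MySQL; schema details beyond the column
# names mentioned in the thread are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE news (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE photos (id INTEGER PRIMARY KEY, big TEXT, small TEXT);
    CREATE TABLE news_photos (
        news_id  INTEGER NOT NULL REFERENCES news(id),
        photo_id INTEGER NOT NULL REFERENCES photos(id),
        PRIMARY KEY (news_id, photo_id)
    );
    INSERT INTO news VALUES (1, 'with photos'), (2, 'without photos');
    INSERT INTO photos VALUES (7, 'big7.jpg', 'small7.jpg'),
                              (8, 'big8.jpg', 'small8.jpg');
    INSERT INTO news_photos VALUES (1, 7), (1, 8);
""")

# One output row per (news, photo) pair; news without photos get NULLs.
rows = conn.execute("""
    SELECT n.id, n.title, p.id, p.small
    FROM news AS n
    LEFT JOIN news_photos AS np ON np.news_id = n.id
    LEFT JOIN photos AS p ON p.id = np.photo_id
    ORDER BY n.id DESC, p.id
""").fetchall()

for row in rows:
    print(row)
```

With this layout a news item is no longer limited to three images, and a single join condition replaces the three image_N comparisons.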
I'm attempting to pull the latest pricing data from a table with an inner join. Prices get updated throughout the day but aren't necessarily updated at midnight.
The following query works great when the price data has been updated by the end of the day. But how do I get it to return yesterday's data if today's data is missing?
I'm indexing off of a column that is formatted like this date_itemnumber => 2015-05-22_12341234
SELECT h.*, collection.*, history.price
FROM collection
INNER JOIN h ON collection.itemid=h.id
INNER JOIN history ON collection.itemid=history.itemid
AND concat('2015-05-23_',collection.itemid)=history.date_itemid
WHERE h.description LIKE '%Awesome%'
Production Query time: .046 sec
To be clear, I want it to find the most up-to-date record for each item, regardless of whether it is from today, yesterday or earlier.
SQLFiddle1
The following query gives me the desired results but with my production dataset it takes over 3 minutes to return results. As my dataset gets larger, it would take longer. So this can't be the most efficient way to do this.
SELECT h.*, collection.*, history.price
FROM collection
INNER JOIN h ON collection.itemid=h.id
INNER JOIN history ON collection.itemid=history.itemid
AND (select history.date_itemid from history WHERE itemid=collection.itemid GROUP BY date_itemid DESC LIMIT 1)=history.date_itemid
WHERE h.description LIKE '%Awesome%'
Production Query time: 181.140 sec
SQLFiddle2
SELECT x.*
FROM history x
JOIN
( SELECT itemid
, MAX(date_itemid) max_date_itemid
FROM history
-- optional JOINS and WHERE here --
GROUP
BY itemid
) y
ON y.itemid = x.itemid
AND y.max_date_itemid = x.date_itemid;
http://sqlfiddle.com/#!9/975f5/13
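A minimal way to see this greatest-per-group join in action is sketched below. It uses SQLite through Python's sqlite3 module in place of MySQL, with invented rows. It relies on the date_itemid format from the question: because the value starts with an ISO date and the suffix is the same for all rows of an item, MAX() on the string picks the latest date.

```python
import sqlite3

# In-memory SQLite stand-in for MySQL; rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE history (id INTEGER PRIMARY KEY, itemid INTEGER,
                          date_itemid TEXT, price REAL);
    INSERT INTO history (itemid, date_itemid, price) VALUES
        (1, '2015-05-21_1', 9.0),
        (1, '2015-05-23_1', 11.0),
        (2, '2015-05-22_2', 5.0);
""")

# Inner query: latest date_itemid per item. Outer join: fetch that full row.
rows = conn.execute("""
    SELECT x.itemid, x.date_itemid, x.price
    FROM history AS x
    JOIN (SELECT itemid, MAX(date_itemid) AS max_date_itemid
            FROM history
           GROUP BY itemid) AS y
      ON y.itemid = x.itemid
     AND y.max_date_itemid = x.date_itemid
    ORDER BY x.itemid
""").fetchall()

for row in rows:
    print(row)
```

Item 2 has no row for the 23rd, so its most recent earlier price is returned, which is the fallback behaviour the question asks for.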
This should work:
SELECT h.*, collection.*, history.price
FROM collection
INNER JOIN h ON collection.itemid=h.id
INNER JOIN(
SELECT a.*
FROM history a
INNER JOIN
( SELECT itemid,MAX(date_itemid) max_date_itemid
FROM history
GROUP BY itemid
) b ON b.itemid = a.itemid AND b.max_date_itemid = a.date_itemid
) AS history ON history.itemid = collection.itemid
WHERE h.description LIKE '%Awesome%'
I don't know whether this takes a lot of execution time. Please try it; since you have more data in your tables, it will be a good test of the query's execution time.
This is actually a fairly common problem in SQL, at least I feel like I run into it a lot. What you want to do is join a one to many table, but only join to the latest or oldest record in that table.
The trick to this is to do a self LEFT join on the table with many records, specifying the foreign key and also that the id should be greater or less than the other records' ids (or dates or whatever you're using). Then in the WHERE conditions, you just add a condition that the left joined table has a NULL id - it wasn't able to be joined with a more recent record because it was the latest.
In your case the SQL should look something like this:
SELECT h.*, collection.*, history.price
FROM collection
INNER JOIN h ON collection.itemid=h.id
INNER JOIN history ON collection.itemid=history.itemid
-- left join history table again
LEFT JOIN history AS history2 ON history.itemid = history2.itemid AND history2.id > history.id
-- filter left join results to the most recent record
WHERE history2.id IS NULL
AND h.description LIKE '%Awesome%'
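The self left join can be tried in isolation with the sketch below, which again uses SQLite via Python's sqlite3 as a stand-in for MySQL and made-up rows. It assumes, as the query above does, that a higher id means a more recent record.

```python
import sqlite3

# In-memory SQLite stand-in for MySQL; ids are assumed to grow over time.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE history (id INTEGER PRIMARY KEY, itemid INTEGER, price REAL);
    INSERT INTO history (itemid, price) VALUES (1, 9.0), (1, 11.0), (2, 5.0);
""")

# h2 looks for a newer row of the same item; if none exists (h2.id IS NULL),
# h1 is the latest row for that item.
latest = conn.execute("""
    SELECT h1.itemid, h1.id, h1.price
    FROM history AS h1
    LEFT JOIN history AS h2
           ON h2.itemid = h1.itemid
          AND h2.id > h1.id
    WHERE h2.id IS NULL
    ORDER BY h1.itemid
""").fetchall()

for row in latest:
    print(row)
```

If ids are not guaranteed to follow chronological order, the comparison would need to use a date column instead.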
This is another approach that cuts one inner join statement
select h.*,his.date_itemid, his.price from history his
INNER JOIN h ON his.itemid=h.id
WHERE his.itemid IN (select itemid from collection) AND h.description LIKE '%Awesome%' and his.id IN (select max(id) from history group by history.itemid)
you can try it here http://sqlfiddle.com/#!9/837a8/1
I am not sure if this is what you want, but I'll give it a try.
EDIT: modified
CREATE VIEW LatestDatesforIds
AS
SELECT
MAX(`history`.`date_itemid`) AS `lastPriceDate`,
MAX(`history`.`id`) AS `matchingId`
FROM `history`
GROUP BY `history`.`itemid`;
CREATE VIEW MatchDatesToPrices
AS
SELECT
`ldi`.`lastPriceDate` AS `lastPriceDate`,
`ldi`.`matchingId` AS `matchingId`,
`h`.`id` AS `id`,
`h`.`itemid` AS `itemid`,
`h`.`price` AS `price`,
`h`.`date_itemid` AS `date_itemid`
FROM (`LatestDatesforIds` `ldi`
JOIN `history` `h`
ON ((`ldi`.`matchingId` = `h`.`id`)));
SELECT c.itemid,price,lastpriceDate,description
FROM collection c
INNER JOIN MatchDatesToPrices mp
ON c.itemid = mp.itemid
INNER JOIN h ON c.itemid = h.id
Difficult to test the speed on such a small dataset but avoiding 'Group By' might speed things up. You could try conditionally joining the history table to itself instead of Grouping?
e.g.
SELECT h.*, c.*, h1.price
FROM h
INNER JOIN history h1 ON h1.itemid = h.id
LEFT OUTER JOIN history h2 ON h2.itemid = h.id
AND h1.date_itemid < h2.date_itemid
INNER JOIN collection c ON c.itemid = h.id
WHERE h2.id IS NULL
AND h.description LIKE '%Awesome%'
Changing this line
AND h1.date_itemid < h2.date_itemid
to compare a sequential, indexed (preferably unique) column instead will speed things up too, e.g. AND h1.id < h2.id, provided ids are assigned in chronological order.
I know this question has been asked multiple times (however, I could still not find a solution):
PHP MYSQL showing posts with comments
mysql query - blog posts and comments with limit
mysql structure for posts and comments
...
Basic question: given the tables posts, comments and user, can you select and show all posts and all comments (with comment.user, comment.text, comment.timestamp) with one single select statement? What would such a select statement look like? If not, what is the easiest solution?
I also tried to JOIN the comments table with the posts table and use GROUP BY, but I got either only one comment in each row or each comment but also those posts multiple times!?
I tried the solution of the first link (nested mysql_query and then fetch) as well as the second link (with arrays). However, the first caused a bunch of errors (the syntax in that post seems to be not correct and I could not figure out how to solve it) and in the second I had problems with the arrays.
My query looks like this till now:
SELECT p.id, p.title, p.text, u.username, c.country_name,
(SELECT SUM(vote_type) FROM votes v WHERE v.post_id = p.id) AS sum_vote_type
FROM posts p
LEFT JOIN user u ON ( p.user_id = u.id )
LEFT JOIN countries c ON ( c.country_id = u.country_id )
ORDER BY $orderby DESC
I was wondering if this issue was not very common, having posts and comments to show...?
Thank you for every help in advance!
Not knowing your database structure, it should look something like this. Note that you should replace the * characters with more explicit lists of columns you actually need.
SELECT p.*, c.*, u.* FROM posts p
LEFT JOIN comments c ON c.post_id = p.id
LEFT JOIN users u ON u.id = p.author_id
Note that if you're just trying to get counts, sums and things like that it's a good idea to cache some of that information. For instance, you may want to cache the comment count in the post table instead of counting them every query. Only count and update the comment count when adding/removing a comment.
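One possible shape for such a cached counter is sketched below. It uses Python's sqlite3 with an in-memory database instead of MySQL; the comment_count column and the add_comment/remove_comment helpers are hypothetical names introduced for the example.

```python
import sqlite3

# In-memory SQLite stand-in for MySQL; comment_count is a hypothetical
# cache column on posts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT,
                        comment_count INTEGER NOT NULL DEFAULT 0);
    CREATE TABLE comments (id INTEGER PRIMARY KEY,
                           post_id INTEGER NOT NULL, text TEXT);
    INSERT INTO posts (id, title) VALUES (1, 'hello');
""")

def add_comment(conn, post_id, text):
    """Insert a comment and bump the cached count in one transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        cur = conn.execute(
            "INSERT INTO comments (post_id, text) VALUES (?, ?)",
            (post_id, text))
        conn.execute(
            "UPDATE posts SET comment_count = comment_count + 1 WHERE id = ?",
            (post_id,))
    return cur.lastrowid

def remove_comment(conn, comment_id):
    """Delete a comment and decrement the cached count in one transaction."""
    with conn:
        row = conn.execute(
            "SELECT post_id FROM comments WHERE id = ?",
            (comment_id,)).fetchone()
        if row is None:
            return  # nothing to delete, leave the counter alone
        conn.execute("DELETE FROM comments WHERE id = ?", (comment_id,))
        conn.execute(
            "UPDATE posts SET comment_count = comment_count - 1 WHERE id = ?",
            (row[0],))

first_id = add_comment(conn, 1, "first")
add_comment(conn, 1, "second")
remove_comment(conn, first_id)

cached = conn.execute(
    "SELECT comment_count FROM posts WHERE id = 1").fetchone()[0]
actual = conn.execute(
    "SELECT COUNT(*) FROM comments WHERE post_id = 1").fetchone()[0]
print(cached, actual)
```

Keeping the insert or delete and the counter update in the same transaction is what stops the cached value from drifting away from the real count.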
EDIT:
Realized that you also wanted to attach user data to each comment. You can JOIN the same table more than once but it gets ugly. This could turn into a really expensive query. I also am including an example of how to alias columns so it's less confusing:
SELECT p.*, c.*, u.name as post_author, u2.name as comment_author FROM posts p
LEFT JOIN comments c ON c.post_id = p.id
LEFT JOIN users u ON u.id = p.author_id
LEFT JOIN users u2 ON u2.id = c.author_id
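A join like this returns one row per comment, so each post repeats. One way to deal with that is to regroup the rows in application code, as in the sketch below. It uses Python's sqlite3 with an in-memory database as a stand-in for MySQL, and the column names (author_id, name, text) follow the guesses made in this answer rather than the asker's real schema.

```python
import sqlite3

# In-memory SQLite stand-in for MySQL; schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read columns by alias
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER,
                           author_id INTEGER, text TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO posts VALUES (1, 1, 'first'), (2, 2, 'second');
    INSERT INTO comments (post_id, author_id, text)
        VALUES (1, 2, 'hi'), (1, 1, 'thanks');
""")

rows = conn.execute("""
    SELECT p.id AS post_id, p.title, u.name AS post_author,
           c.id AS comment_id, c.text AS comment_text,
           u2.name AS comment_author
    FROM posts AS p
    LEFT JOIN users AS u ON u.id = p.author_id
    LEFT JOIN comments AS c ON c.post_id = p.id
    LEFT JOIN users AS u2 ON u2.id = c.author_id
    ORDER BY p.id, c.id
""").fetchall()

# Collapse the repeated post rows into one entry per post.
posts = {}
for row in rows:
    post = posts.setdefault(row["post_id"], {
        "title": row["title"],
        "author": row["post_author"],
        "comments": [],
    })
    if row["comment_id"] is not None:  # posts without comments yield NULLs
        post["comments"].append((row["comment_author"], row["comment_text"]))

for post_id, post in posts.items():
    print(post_id, post)
```

Each post then appears once, with its comments nested underneath, which is usually the structure a template needs.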
I am building a blog with Codeigniter and MySQL. The question I have is this, I have a table with posts and one with categories. I also have a cross reference table with post_categories. What I am trying to do is get all the categories with their names and the number of posts they have under their name.
Example output would be: Hello World(1) Test(0) etc.
What I am having a hard time finding is a SQL query that will join the three tables and get me the counts, and I am also having a hard time wrapping my head around how to make that query.
Here is my table schema:
blgpost
====
id
*Other schema unimportant
blgpostcategories
=================
postid
categoryid
blgcategories
==========
id
name
*Other schema unimportant
This should give you the output you want....
SELECT c.name, COUNT(p.id) FROM
blgcategories c
INNER JOIN blgpostcategories pc ON c.id = pc.categoryid
INNER JOIN blgpost p ON pc.postid = p.id
GROUP BY c.id
You don't need to join the three tables - the blgpost table doesn't have any information in it that you need.
SELECT COUNT(*), blgcategories.name
FROM blgcategories INNER JOIN blgpostcategories
ON blgcategories.id=blgpostcategories.categoryid
GROUP BY blgcategories.id;
SELECT c.name, COUNT(pc.postid)
FROM blgcategories c
LEFT JOIN
blgpostcategories pc
ON pc.categoryid = c.id
GROUP BY
c.id
Using LEFT JOIN will show 0 for empty categories (those without posts linked to them) rather than omitting them.
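The difference between the two join types is easy to check on a toy data set. The sketch below runs the same counting query with INNER and LEFT joins on an in-memory SQLite database (Python's sqlite3, standing in for MySQL); the two categories mirror the example output from the question, and the post id is made up.

```python
import sqlite3

# In-memory SQLite stand-in for MySQL; only 'Hello World' has a post.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blgcategories (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE blgpostcategories (postid INTEGER, categoryid INTEGER);
    INSERT INTO blgcategories VALUES (1, 'Hello World'), (2, 'Test');
    INSERT INTO blgpostcategories VALUES (100, 1);
""")

COUNT_SQL = """
    SELECT c.name, COUNT(pc.postid)
    FROM blgcategories AS c
    {join} JOIN blgpostcategories AS pc ON pc.categoryid = c.id
    GROUP BY c.id, c.name
    ORDER BY c.id
"""

inner_rows = conn.execute(COUNT_SQL.format(join="INNER")).fetchall()
left_rows = conn.execute(COUNT_SQL.format(join="LEFT")).fetchall()

print("INNER:", inner_rows)
print("LEFT: ", left_rows)
```

Counting pc.postid rather than * is what turns the unmatched row from the left join into a 0 instead of a 1.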
I'm working on building a forum with kohana. I know there is already good, free, forum software out there, but it's for a family site, so I thought I'd use it as a learning experience. I'm also not using the ORM that is built into Kohana, as I would like to learn more about SQL in the process of building the forum.
For my forum I have 4 main tables:
USERS
TOPICS
POSTS
COMMENTS
TOPICS table: id (auto incremented), topic row.
USERS table: username, email, first and last name and a few other non related rows
POSTS table: id (auto incremented), post-title, post-body, topic-id, user-id, post-date, updated-date, updated-by(which will contain the user-id of the person who made the most recent comment)
COMMENTS table: id (auto incremented), post-id, user-id and comment
On the main forum page I would like to have:
a list of all of the topics
the number of posts for each topic
the last updated post, and who updated it
the most recently updated topic to be on top, most likely an "ORDER BY updated-date"
Here is the query I have so far:
SELECT topics.id AS topic-id,
topics.topic,
post-user.id AS user-id,
CONCAT_WS(' ', post-user.first-name, post-user.last-name) AS name,
recent-post.id AS post-id,
post-num.post-total,
recent-post.title AS post-title,
recent-post.update_date AS updated-date,
recent-post.updated-by AS updated-by
FROM topics
JOIN (SELECT posts.topic-id,
COUNT(*) AS post-total
FROM POSTS
WHERE posts.topic-id = topic-id
GROUP BY posts.topic-id) AS post-num ON topics.id = post-num.topic-id
JOIN (SELECT posts.*
FROM posts
ORDER BY posts.update-date DESC) AS recent-post ON topics.id = recent-post.topic-id
JOIN (SELECT users.*,
posts.user-id
FROM users, posts
WHERE posts.user-id = users.id) as post-user ON recent-post.user_id = post-user.id
GROUP BY topics.id
This query almost works as it will get all of information for topics that have posts. But it doesn't return the topics that don't have any posts.
I'm sure that the query is inefficient and wrong since it makes two sub-selects to the posts table, but it was the only way I could get to the point I'm at.
A dash is not a valid character in unquoted SQL identifiers (MySQL would require backticks around such names); use "_" instead.
You don't necessarily have to get everything from a single SQL query. In fact, trying to do so makes it harder to code, and also sometimes makes it harder for the SQL optimizer to execute.
It makes no sense to use ORDER BY in a subquery without a LIMIT: the order of a derived table is not guaranteed to carry over to the outer query.
Name your primary key columns topic_id, user_id, and so on (instead of "id" in every table), and you won't have to alias them in the select-list.
Here's how I would solve this:
First get the most recent post per topic, with associated user information:
SELECT t.topic_id, t.topic,
u.user_id, CONCAT_WS(' ', u.first_name, u.last_name) AS full_name,
p.post_id, p.title, p.update_date, p.updated_by
FROM topics t
INNER JOIN
(posts p INNER JOIN users u ON (p.updated_by = u.user_id))
ON (t.topic_id = p.topic_id)
LEFT OUTER JOIN posts p2
ON (p.topic_id = p2.topic_id AND p.update_date < p2.update_date)
WHERE p2.post_id IS NULL;
Then get the counts of posts per topic in a separate, simpler query.
SELECT t.topic_id, COUNT(p.post_id) AS post_total
FROM topics t LEFT OUTER JOIN posts p USING (topic_id)
GROUP BY t.topic_id;
Merge the two data sets in your application.
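The merge step might look roughly like the sketch below. It is plain Python with no database involved; the two lists stand for rows already fetched by the two queries, and the sample values and the merge_topic_rows helper are hypothetical.

```python
# Rows as they might come back from the two queries (hypothetical data).
latest_posts = [
    {"topic_id": 1, "topic": "General", "post_id": 42,
     "title": "Welcome", "full_name": "Ann Lee"},
]
post_counts = [
    {"topic_id": 1, "post_total": 3},
    {"topic_id": 2, "post_total": 0},
]

def merge_topic_rows(latest_posts, post_counts):
    """Attach the latest-post row (or None) to each per-topic count row."""
    latest_by_topic = {row["topic_id"]: row for row in latest_posts}
    merged = []
    for count_row in post_counts:
        combined = dict(count_row)
        # Topics without posts have no entry in latest_by_topic.
        combined["latest"] = latest_by_topic.get(count_row["topic_id"])
        merged.append(combined)
    return merged

forum_index = merge_topic_rows(latest_posts, post_counts)
for entry in forum_index:
    print(entry)
```

Driving the loop from the count rows keeps topics without any posts in the final list, since the count query is the one built on a left join.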
To ensure you get results for topics without posts, you'll need to use LEFT JOIN instead of JOIN for the first join between topics and the next table. LEFT JOIN means "always return a result set row for every row in the left table, even if there's no match with the right table."
Gotta go now, but I'll try to look at the efficiency issues later.
This is a very complicated query. You should note that JOIN statements will limit your topics to those that have posts. If a topic does not have a post, a JOIN statement will filter it out.
Try the following query.
SELECT *
FROM
(
SELECT T.Topic,
COUNT(AllTopicPosts.ID) AS NumberOfPosts,
MAX(IFNULL(MostRecentPost.Post_Title, '')) AS MostRecentPostTitle,
MAX(IFNULL(MostRecentPostUser.UserName, '')) AS MostRecentPostUser,
MAX(IFNULL(MostRecentPost.Updated_Date, '')) AS MostRecentPostDate
FROM TOPICS T
LEFT JOIN POSTS AllTopicPosts ON AllTopicPosts.Topic_Id = T.ID
-- join only the post carrying the latest Updated_Date of the topic
LEFT JOIN POSTS MostRecentPost
ON MostRecentPost.Topic_Id = T.ID
AND MostRecentPost.Updated_Date =
(
SELECT MAX(P.Updated_Date)
FROM POSTS P
WHERE P.Topic_Id = T.ID
)
LEFT JOIN USERS MostRecentPostUser ON MostRecentPostUser.ID = MostRecentPost.User_Id
GROUP BY T.Topic
) AS TopicSummary
ORDER BY MostRecentPostDate DESC
I'd use a left join inside a subquery to pull back the correct topic, and then you can do a little legwork outside of that to get some of the user info.
select
s.topic_id,
s.topic,
u.id as last_updated_by_id,
u.user_name as last_updated_by,
s.last_post,
s.post_count
from
(
select
t.id as topic_id,
t.topic,
t.user_id as orig_poster,
max(coalesce(p.post_date, t.post_date)) as last_post,
count(*) as post_count -- would be count(p.post_id) if you don't want to count the topic
from
topics t
left join posts p on
t.id = p.topic_id
group by
t.id,
t.topic,
t.user_id
) s
left join posts p on
s.topic_id = p.topic_id
and s.last_post = p.post_date
and s.post_count > 1 -- 0 if you're using p.post_id up top
inner join users u on
u.id = coalesce(p.user_id, s.orig_poster)
order by
s.last_post desc
This query introduces coalesce and left join, both of which are worth looking into. With two arguments (as used here), you can also use ifnull in MySQL, which is functionally equivalent.
Keep in mind that ifnull is not standard SQL (relevant if you need to port this code). Other databases have their own functions for this (isnull in SQL Server, nvl in Oracle, etc.). I used coalesce so that I could keep this query all ANSI-fied.
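A quick way to see how the two functions behave is the sketch below. It runs on SQLite through Python's sqlite3 module rather than MySQL; SQLite happens to accept both COALESCE and IFNULL, so the comparison can be run without a MySQL server.

```python
import sqlite3

# In-memory SQLite stand-in for MySQL.
conn = sqlite3.connect(":memory:")

# With two arguments, COALESCE and IFNULL return the same value.
two_arg = conn.execute(
    "SELECT COALESCE(NULL, 'fallback'), IFNULL(NULL, 'fallback')"
).fetchone()

# COALESCE also accepts more than two arguments and returns the first
# one that is not NULL.
three_arg = conn.execute(
    "SELECT COALESCE(NULL, NULL, 'third')"
).fetchone()[0]

print(two_arg, three_arg)
```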