MySQL query limitation - PHP

I use a query like this:
$querym = mysql_query("SELECT * FROM allmembers a LEFT JOIN favorites f ON (f.memberid=a.memberid) order by f.date desc LIMIT 10");
while($row = mysql_fetch_array($querym,MYSQL_ASSOC)) {
$dataArray[$row['memberid']][$row['favoriteid']]=$row;
}
My goal is to get 10 members, each with their last 5 favorites, in an array. As you can guess, this query instead returns 10 joined rows in total, favorites included. That means if one member has 15 favorites, I get a single member with 10 favorites instead of 10 members with their favorites.
I couldn't find an easy way to limit the number of favorites fetched per member in that query. How can I limit it?
Thanks in advance

This is not a limitation of mysql_fetch_array; it is a limitation of your query.
Try something along these lines:
SELECT a.memberid, GROUP_CONCAT(f.favorites_field) FROM allmembers a LEFT JOIN favorites f ON (f.memberid=a.memberid) GROUP BY a.memberid ORDER BY MAX(f.date) DESC LIMIT 10

This query may give you a better approach:
SELECT f.* FROM favorites f JOIN (SELECT memberid FROM allmembers LIMIT 10) m ON m.memberid = f.memberid ORDER BY f.date DESC
The 10 members are picked in a derived table because MySQL rejects LIMIT inside an IN (...) subquery, and the subquery has to return a single column anyway. You would still have to cut each member down to 5 favorites afterwards, in PHP or wherever. This could work if you don't expect something like 10,000 favorites for one member (basically because you would be fetching 10,000+ rows just to display, at most, 10*5 = 50 rows).
Alternatively, you can run one query per member, limiting each to 5 results (this would mean doing 11 queries in total).

The following query returns the 5 most recent favorites of every member (it does not cap the number of members):
SELECT
a.id as member_id,
a.name,
f1.id as favorite_id,
f1.link
FROM
allmembers a
JOIN favorites f1 ON f1.member = a.id
LEFT JOIN favorites f2 ON f2.member = f1.member AND f2.date > f1.date
GROUP BY
a.id, a.name, f1.id, f1.link
HAVING
COUNT(f2.id) < 5
ORDER BY
a.name, f1.id
It works by counting, for each favorite f1, the newer favorites f2 of the same member, and keeping f1 only when fewer than 5 are newer.
It assumes the following database schema:
allmembers:
id INT
name VARCHAR
favorites:
id INT
member INT
link VARCHAR
date DATETIME
Obviously you need to update the query according to your own database schema.
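If you want to try the "count the newer rows" idea end to end, here is a minimal sketch. It uses Python's built-in sqlite3 module in place of MySQL/PHP purely so it runs anywhere; the function name latest_favorites, the sample data, and the date column on favorites are assumptions for illustration, not part of the original schema.

```python
import sqlite3

def latest_favorites(conn, per_member=5, members=10):
    """Return (member_id, favorite_id) pairs, newest favorites first per member."""
    sql = """
        SELECT f1.member, f1.id
        FROM favorites f1
        -- pick the members first (derived table), then attach their favorites
        JOIN (SELECT id FROM allmembers ORDER BY id LIMIT ?) m ON m.id = f1.member
        -- keep a favorite only if fewer than per_member newer ones exist
        WHERE (SELECT COUNT(*) FROM favorites f2
               WHERE f2.member = f1.member AND f2.date > f1.date) < ?
        ORDER BY f1.member, f1.date DESC
    """
    return conn.execute(sql, (members, per_member)).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE allmembers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE favorites (id INTEGER PRIMARY KEY, member INTEGER,
                            link TEXT, date TEXT);
""")
conn.executemany("INSERT INTO allmembers VALUES (?, ?)", [(1, "ann"), (2, "bob")])
# hypothetical data: member 1 has 7 favorites, member 2 has 2
rows = [(i, 1, f"link{i}", f"2020-01-{i:02d}") for i in range(1, 8)]
rows += [(8, 2, "link8", "2020-02-01"), (9, 2, "link9", "2020-02-02")]
conn.executemany("INSERT INTO favorites VALUES (?, ?, ?, ?)", rows)

result = latest_favorites(conn)
```

The correlated count is easy to read but does more work as the number of favorites per member grows, so treat it as a starting point and check it against your real data volume.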

There are many ways to do this, and they all perform well up to a point. What I'd do in this specific case is this:
Get the favorite ids as separate fields through multiple left joins; you can then easily loop with for(...) over each row and read the id from each field. The query would look like this; adapt it to your own use:
SELECT
a.*,
f1.favoriteid as f1id,
f2.favoriteid as f2id,
f3.favoriteid as f3id,
f4.favoriteid as f4id,
f5.favoriteid as f5id
FROM allmembers a
LEFT JOIN favorites f1 ON (f1.memberid=a.memberid)
LEFT JOIN favorites f2 ON (f2.memberid=a.memberid) AND f1.favoriteid <> f2.favoriteid
LEFT JOIN favorites f3 ON (f3.memberid=a.memberid) AND f1.favoriteid <> f3.favoriteid AND f2.favoriteid <> f3.favoriteid
LEFT JOIN favorites f4 ON (f4.memberid=a.memberid) AND f1.favoriteid <> f4.favoriteid AND f2.favoriteid <> f4.favoriteid AND f3.favoriteid <> f4.favoriteid
LEFT JOIN favorites f5 ON (f5.memberid=a.memberid) AND f1.favoriteid <> f5.favoriteid AND f2.favoriteid <> f5.favoriteid AND f3.favoriteid <> f5.favoriteid AND f4.favoriteid <> f5.favoriteid
order by f1.date desc LIMIT 10
Using this method you can also read any other column from the favorites table, or even left join another table X times to get more information about each favorite. As long as you set up the correct indexes, this method is extremely fast even with thousands of members and nearly millions of favorites.
You can also apply this strategy to many other scenarios. For example, we work with WordPress at my job, where a lot of user information is kept in meta fields, so selecting it all as one wide row is impossible without this method.
Good luck

Related

EXISTS query optimization on mysql query

I have a big data problem with MySQL.
I have:
a users table with 59033 rows, and
a user_notes table with 8753 rows.
The problem shows up when I search for users who have a user note within a date range.
My query looks like this:
SELECT u.*, rep.name as rep_name FROM users as u
LEFT JOIN users as rep on rep.id = u.add_user
LEFT JOIN authorization on authorization.id = u.authorization
LEFT JOIN user_situation_list on user_situation_list.user_situation_id = u.user_situation
WHERE
EXISTS(
select * from user_notes
where user_notes.note_user_id = u.id AND user_notes.create_date
BETWEEN "2017-10-20" AND "2017-10-22"
)
ORDER BY u.lp_modify_date DESC, u.id DESC
Turn it around -- find the ids first; deal with the joins later.
SELECT u.*,
( SELECT rep.name
FROM users AS rep
WHERE rep.id = u.add_user ) AS rep_name
FROM (
SELECT DISTINCT note_user_id
FROM user_notes
WHERE create_date >= "2017-10-20"
AND create_date < "2017-10-20" + INTERVAL 3 DAY
) AS un
JOIN users AS u ON u.id = un.note_user_id
ORDER BY lp_modify_date DESC, id DESC
Notes
No GROUP BY needed;
2 tables seem to be unused; I removed them;
I changed the date range;
user_notes needs INDEX(create_date, note_user_id);
Notice how I turned a LEFT JOIN into a subquery in the SELECT list.
If there can be multiple rep_names, then the original query is "wrong" in that the GROUP BY will pick a random name. My Answer can be 'fixed' by changing rep.name to one of these:
MAX(rep.name) -- deliver only one; arbitrarily the max
GROUP_CONCAT(rep.name) -- deliver a commalist of names
Rewriting your query to use a JOIN rather than an EXISTS check in the WHERE clause should speed it up. If you then group the results by u.id, it should give you the same result:
SELECT u.*, rep.name as rep_name FROM users as u
LEFT JOIN users as rep on rep.id = u.add_user
LEFT JOIN authorization on authorization.id = u.authorization
LEFT JOIN user_situation_list on user_situation_list.user_situation_id = u.user_situation
JOIN user_notes AS un
ON un.note_user_id = u.id
AND un.create_date BETWEEN "2017-10-20" AND "2017-10-22"
GROUP BY u.id
ORDER BY u.lp_modify_date DESC, u.id DESC
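To see the "find the ids first" shape and the half-open date range in action, here is a small sketch using Python's sqlite3 as a stand-in for MySQL. The helper name users_with_notes, the trimmed-down columns, and the sample rows are assumptions. One thing the sketch makes visible: if create_date holds a time of day, BETWEEN "2017-10-20" AND "2017-10-22" stops at midnight at the start of the 22nd, while the range create_date < "2017-10-23" covers the whole day.

```python
import sqlite3

def users_with_notes(conn, start, end_exclusive):
    """Users having at least one note with start <= create_date < end_exclusive."""
    sql = """
        SELECT u.id, u.name,
               (SELECT rep.name FROM users rep WHERE rep.id = u.add_user) AS rep_name
        FROM (SELECT DISTINCT note_user_id
              FROM user_notes
              WHERE create_date >= ? AND create_date < ?) un
        JOIN users u ON u.id = un.note_user_id
        ORDER BY u.lp_modify_date DESC, u.id DESC
    """
    return conn.execute(sql, (start, end_exclusive)).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT,
                        add_user INTEGER, lp_modify_date TEXT);
    CREATE TABLE user_notes (note_user_id INTEGER, create_date TEXT);
    -- the index suggested in the notes above
    CREATE INDEX idx_notes ON user_notes (create_date, note_user_id);
""")
conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?)", [
    (1, "rep", None, "2017-01-01"),
    (2, "amy", 1, "2017-10-01"),
    (3, "ben", 1, "2017-10-02"),
    (4, "cat", 2, "2017-10-03"),
])
conn.executemany("INSERT INTO user_notes VALUES (?, ?)", [
    (2, "2017-10-20 10:00:00"),
    (2, "2017-10-22 23:00:00"),   # late on the last day, still inside the range
    (3, "2017-10-23 00:00:00"),   # first instant outside the range
    (4, "2017-10-21 09:00:00"),
])

result = users_with_notes(conn, "2017-10-20", "2017-10-23")
```

Because the derived table already returns each note_user_id once, amy appears a single time even though she has two notes in the range, so no GROUP BY is needed on the outer query.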

MySQL Using SUM with multiple joins

I have a projects table and a tasks table. I want a query that gets all projects along with the sum of the time_spent column, grouped by project id. In other words: list all projects, each with the total of the time_spent values of the tasks belonging to that project.
With the query posted below I get the most recently added time_spent value, not the sum of all of them.
Below is the query I have at the moment:
SELECT `projects`.`id`, `projects`.`description`, `projects`.`created`,
`users`.`title`, `users`.`firstname`, `users`.`lastname`, `users2`.`title`
as assignee_title, `users2`.`firstname` as assignee_firstname,
`users2`.`lastname` as assignee_lastname,
(select sum(tasks2.time_spent)
from tasks tasks2
where tasks2.id = tasks.id)
as project_duration
FROM (`projects`)
LEFT JOIN `users`
ON `users`.`id` = `projects`.`user_id`
LEFT JOIN `users` as users2
ON `users2`.`id` = `projects`.`assignee_id`
LEFT JOIN `tasks` ON `tasks`.`project_id` = `projects`.`id`
GROUP BY `projects`.`id`
ORDER BY `projects`.`created` DESC
Below is my projects table:
Below is my tasks table:
Thanks in advance!
Usually this query will help you:
SELECT p.*, (SELECT SUM(t.time_spent) FROM tasks as t WHERE t.project_id = p.id) as project_fulltime FROM projects as p
Your question doesn't say anything about users. Do you need them?
You are on the right track; maybe your JOINs just can't fetch all the data.
This query should do it for you.
Note: whenever you use GROUP BY, every non-aggregated column that you select or order by should appear in the GROUP BY clause. MySQL only enforces this when the ONLY_FULL_GROUP_BY SQL mode is enabled; without it the query runs, but the values of the unlisted columns are picked arbitrarily, which can give an incorrect result set.
Also, avoid a correlated subquery in the SELECT list where a join will do: it may be evaluated once per returned row, so 1,000 rows can mean up to 1,000 extra lookups on top of the main query.
SELECT
p.id,
p.description,
p.created,
u.title,
u.firstname,
u.lastname,
a.title assignee_title,
a.firstname assignee_firstname,
a.lastname assignee_lastname,
SUM(t.time_spent) project_duration
FROM
projects p
LEFT JOIN
users u ON
u.id = p.user_id
LEFT JOIN
users a ON
a.id = p.assignee_id
LEFT JOIN
tasks t ON
t.project_id = p.id
GROUP BY
p.id,
p.description,
p.created,
u.title,
u.firstname,
u.lastname,
a.title,
a.firstname,
a.lastname
ORDER BY
p.created DESC
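Another way to get one total per project, without listing every selected column in GROUP BY, is to aggregate tasks in a derived table first and join that to projects. Below is a minimal sketch with Python's sqlite3 standing in for MySQL; the function name project_durations, the reduced column list, and the sample rows are assumptions for illustration.

```python
import sqlite3

def project_durations(conn):
    """Return (project_id, description, total time_spent), 0 when no tasks exist."""
    sql = """
        SELECT p.id, p.description, COALESCE(t.total, 0) AS project_duration
        FROM projects p
        -- one row per project_id, so the join cannot multiply project rows
        LEFT JOIN (SELECT project_id, SUM(time_spent) AS total
                   FROM tasks
                   GROUP BY project_id) t ON t.project_id = p.id
        ORDER BY p.id
    """
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE projects (id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE tasks (id INTEGER PRIMARY KEY, project_id INTEGER,
                        time_spent INTEGER);
""")
conn.executemany("INSERT INTO projects VALUES (?, ?)",
                 [(1, "site"), (2, "app"), (3, "empty")])
conn.executemany("INSERT INTO tasks VALUES (?, ?, ?)",
                 [(1, 1, 30), (2, 1, 45), (3, 2, 10)])

result = project_durations(conn)
```

The users joins from the question can be added to the outer query unchanged, since they match at most one row per project.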

MySql inner join takes more than 10 seconds

I have two tables, posts and followings:
posts (id,userid,post,timestamp) 30 000 rows
and
followings(id_me,userid) 90 000 rows
I want to get the latest 10 posts from the posts table, written either by the people I follow or by me:
SELECT p.*
FROM posts as p INNER JOIN
followings as f
ON (f.id_me=(my user id) AND p.userid=f.userid )
OR
p.userid=(my user id)
ORDER BY id DESC LIMIT 10
But it takes about 10-15 seconds to return. Thanks in advance!
First, remove the filter from the join clause; let the join just correlate the two tables.
(
SELECT p.*
FROM posts as p
INNER JOIN followings as f ON p.userid=f.userid
where f.id_me=(my user id)
UNION
SELECT p.*
FROM posts as p
where p.userid=(my user id)
)
ORDER BY id DESC LIMIT 10
Second, verify your indexes. If those id columns are not indexed, MySQL will do a full scan over the cartesian product of both tables (30k x 90k = 2.7 billion pairs to compare).
Third, if you don't follow yourself, you need a UNION of the posts of the people you follow and your own posts.
Using an OR in SQL is often a performance killer. Try this:
SELECT p.*
FROM posts as p INNER JOIN
followings as f
ON (f.id_me=(my user id) AND p.userid IN (f.userid,(my user id)))
ORDER BY id DESC LIMIT 10
Do this query using union:
(SELECT p.*
FROM posts p INNER JOIN
followings f
ON (f.id_me=(my user id) AND p.userid=f.userid)
)
union
(select p.*
from posts p
where p.userid=(my user id)
)
ORDER BY id DESC
LIMIT 10
If the two conditions never overlap, then use union all instead.
An OR condition like that prevents the query optimizer from making use of indexes. Use a UNION instead:
SELECT *
FROM (SELECT p.*
FROM posts as p
INNER JOIN followings as f
ON f.id_me=(my user id) AND p.userid=f.userid
UNION
SELECT *
FROM posts
WHERE userid = (my user id)) u
ORDER BY id DESC
LIMIT 10
It might be just me but I think your WHERE clause is in an inefficient location:
SELECT
p.*
FROM
posts p
INNER JOIN
followings f
ON p.userid=f.userid
WHERE
MyUserID IN (p.userid, f.id_me)
ORDER BY
id DESC
LIMIT
10
I read in comments that you have the required indexes. The problem is the query. Combining OR with a JOIN confuses the poor and (often) dumb optimizer. The LIMIT 10 should be helpful but the optimizer is not (yet) smart enough to make the best plan.
Try this query:
( SELECT p.*
FROM posts AS p
JOIN followings AS f
ON f.id_me = (my_user_id)
AND p.userid = f.userid
ORDER BY p.id DESC
LIMIT 10
)
UNION ALL
( SELECT p.*
FROM posts AS p
WHERE p.userid = (my_user_id)
ORDER BY p.id DESC
LIMIT 10
)
ORDER BY id DESC
LIMIT 10 ;
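Here is a runnable sketch of the "limit each branch, then limit the union" idea, using Python's sqlite3 as a stand-in for MySQL. SQLite wants each limited branch wrapped in a subquery, so the syntax differs slightly from the MySQL version above. The function name latest_feed, the index names, and the sample rows are assumptions; UNION ALL is only safe here on the assumption that you never follow yourself.

```python
import sqlite3

def latest_feed(conn, me, limit=10):
    """Newest posts written by `me` or by users `me` follows, newest first."""
    sql = """
        SELECT id, userid, post FROM (
            SELECT p.id AS id, p.userid AS userid, p.post AS post
            FROM posts p
            JOIN followings f ON f.userid = p.userid
            WHERE f.id_me = :me
            ORDER BY p.id DESC LIMIT :n          -- at most n rows from follows
        )
        UNION ALL
        SELECT id, userid, post FROM (
            SELECT p.id AS id, p.userid AS userid, p.post AS post
            FROM posts p
            WHERE p.userid = :me
            ORDER BY p.id DESC LIMIT :n          -- at most n of my own rows
        )
        ORDER BY id DESC LIMIT :n                -- final cut over <= 2n rows
    """
    return conn.execute(sql, {"me": me, "n": limit}).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, userid INTEGER, post TEXT);
    CREATE TABLE followings (id_me INTEGER, userid INTEGER);
    CREATE INDEX idx_posts_user ON posts (userid, id);
    CREATE INDEX idx_follow ON followings (id_me, userid);
""")
conn.executemany("INSERT INTO followings VALUES (?, ?)", [(1, 2), (1, 3)])
conn.executemany("INSERT INTO posts VALUES (?, ?, ?)", [
    (1, 1, "p1"), (2, 2, "p2"), (3, 3, "p3"),
    (4, 4, "p4"), (5, 1, "p5"), (6, 2, "p6"),
])

top3 = [row[0] for row in latest_feed(conn, me=1, limit=3)]
everything = [row[0] for row in latest_feed(conn, me=1)]
```

The point of the inner limits is that the final sort only ever sees at most twice the requested number of rows, no matter how many posts exist.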

Reducing MySQL queries and time

I am currently working on speeding up a website that is returning 300,000+ rows from a query. While I don't think this is too much of a load on the DB server, the query runs inside a while loop, once for each 'gallery' a user has.
For example Joe has 10 galleries in his account. Each of those galleries has x number of images, which have x number of comments on those images. So the query that is currently being run...
SELECT count(*) as total
FROM galleryimage a
INNER JOIN imagecomments b ON a.id=b.imgId
WHERE a.galleryId='".$row['id']."'
AND b.note <> ''
...is looking through all 334,000 rows of the galleryimage table and all 76,000 rows of the imagecomments table, and returns the result for each gallery. Run on a single gallery, the query returns in about 578 ms, but with many galleries, say 30-40, you could be looking at a page load time of 17+ seconds. Any suggestions on how to deal with this issue?
I cannot change the DB architecture.
Query for the gallery ids:
SELECT a.id,
a.created,
a.name,
b.clientName,
a.isFeatured,
a.views,
a.clientId
FROM gallery a
INNER JOIN client b
ON a.clientId = b.id
WHERE a.isTemp = 0
AND a.clientRef = '{$clientRef}'
AND a.finish='1'
AND a.isArchive='0'
ORDER BY created
DESC
You can consolidate the queries and eliminate the need for looping:
SELECT
a.id,
a.created,
a.name,
b.clientName,
a.isFeatured,
a.views,
a.clientId,
COALESCE(c.img_cnt, 0) AS gallery_image_count,
COALESCE(c.comment_cnt, 0) AS gallery_comment_count
FROM
gallery a
INNER JOIN
client b ON a.clientId = b.id
LEFT JOIN
(
SELECT aa.galleryId,
COUNT(DISTINCT aa.id) AS img_cnt,
COUNT(1) AS comment_cnt
FROM galleryimage aa
INNER JOIN imagecomments bb ON aa.id = bb.imgId
WHERE bb.note <> ''
GROUP BY aa.galleryId
) c ON a.id = c.galleryId
WHERE
a.isTemp = 0 AND
a.clientRef = '{$clientRef}' AND
a.finish = 1 AND
a.isArchive = 0
ORDER BY
a.created DESC
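If you would rather keep the two queries separate, the per-gallery loop can still be collapsed into a single grouped query that returns all the counts at once. Below is a minimal sketch with Python's sqlite3 in place of MySQL/PHP. The function name comment_counts, the id column on imagecomments, and the sample rows are assumptions; it also passes the ids as bound parameters instead of building them into the SQL string.

```python
import sqlite3

def comment_counts(conn, gallery_ids):
    """Map gallery id -> number of non-empty comments, in one round trip."""
    if not gallery_ids:
        return {}
    placeholders = ",".join("?" * len(gallery_ids))
    sql = f"""
        SELECT g.id, COUNT(c.id) AS total
        FROM gallery g
        LEFT JOIN galleryimage i ON i.galleryId = g.id
        -- the note filter lives in the join so empty galleries still show up as 0
        LEFT JOIN imagecomments c ON c.imgId = i.id AND c.note <> ''
        WHERE g.id IN ({placeholders})
        GROUP BY g.id
    """
    return dict(conn.execute(sql, list(gallery_ids)).fetchall())

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gallery (id INTEGER PRIMARY KEY);
    CREATE TABLE galleryimage (id INTEGER PRIMARY KEY, galleryId INTEGER);
    CREATE TABLE imagecomments (id INTEGER PRIMARY KEY, imgId INTEGER, note TEXT);
""")
conn.executemany("INSERT INTO gallery VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO galleryimage VALUES (?, ?)",
                 [(10, 1), (11, 1), (20, 2)])
conn.executemany("INSERT INTO imagecomments VALUES (?, ?, ?)",
                 [(1, 10, "nice"), (2, 10, ""), (3, 11, "wow"), (4, 20, "")])

counts = comment_counts(conn, [1, 2, 3])
```

With 30-40 galleries this replaces 30-40 round trips with one; indexes on galleryimage.galleryId and imagecomments.imgId would be the first thing to check if it is still slow.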

SQL query problem

I record a user's score every time it happens. Now I want to show the best score each user has had. The tables are set up like this:
Player(id, name)
PlayerHasAchievement(id, playerId, achievementId)
Achievement(id, type, amount, time)
This is what I have right now:
$query = "SELECT MAX(ach.amount) as amount, p.username, ach.time
FROM achievement as ach
INNER JOIN playerHasAchievement as playAch ON ach.id = playAch.id
INNER JOIN player as p ON p.userId = playAch.userid
WHERE ach.type = 2
GROUP BY amount
ORDER by `amount` DESC
LIMIT $amount";
I tried to select it distinctly but it didn't work. I'm stumped, it's supposed to be so easy! Thanks for reading, I'll be grateful for any help!
The problem is that the ach.time you are getting does not come from the same row as the MAX(amount). Join another subquery to get the MAX(amount) first.
Note: In the table definitions you posted, playerHasAchievement has a field playerId, not userId.
SELECT MAX(ach.amount) as amount, p.username, MAX(ach.time) MaxTime
FROM achievement as ach
INNER JOIN playerHasAchievement as playAch ON ach.id = playAch.id
INNER JOIN player as p ON p.userId = playAch.playerId
INNER JOIN (
SELECT playAch.playerId, MAX(ach.amount) as MaxAmount
FROM achievement as ach
INNER JOIN playerHasAchievement as playAch ON ach.id = playAch.id
WHERE ach.type = 2
GROUP BY playAch.playerId
) g ON playAch.playerId = g.playerId AND ach.amount = g.MaxAmount
WHERE ach.type = 2
GROUP BY playAch.playerId, p.username
ORDER by `amount` DESC
LIMIT $amount
The reason we group the outer query is to collapse ties, say a player who had the same top score twice.
In your join on line 3, don't you really want this?
INNER JOIN playerHasAchievement as playAch ON ach.id = playAch.achievementId
The others are correct as well: you need to group by your non-aggregate columns, not the aggregate one.
Assuming your db layout is as specified in the question here is the query I would use.
SELECT ach.amount, p.Name, ach.time
FROM achievement as ach
JOIN playerHasAchievement as playAch ON ach.id=playAch.achievementId
JOIN player AS p ON p.id = playAch.playerId
WHERE ach.type = 2
AND ach.amount = (SELECT MAX(ach.amount)
FROM achievement as ach
JOIN playerHasAchievement as playAch ON ach.id=playAch.achievementId
JOIN player AS p ON p.id = playAch.playerId
WHERE ach.type = 2)
GROUP BY ach.amount
ORDER by ach.time
Taking the first result (in case there are multiple rows with the same score) will give you the high score with the lowest time.
Hope that helps!
You are not using group by appropriately, as you are only grouping by amount.
What about the user name and the time?
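Putting the points above together (join on achievementId, find each player's maximum first, then group by the player), here is a small sketch using Python's sqlite3 in place of MySQL/PHP. It follows the table definitions as posted in the question; the function name best_scores, the integer time values, the choice of the earliest time on ties, and the sample rows are assumptions.

```python
import sqlite3

def best_scores(conn, score_type=2, limit=10):
    """Return (player name, best amount, earliest time that amount was reached)."""
    sql = """
        SELECT p.name, a.amount, MIN(a.time) AS time
        FROM achievement a
        JOIN playerHasAchievement pa ON pa.achievementId = a.id
        JOIN player p ON p.id = pa.playerId
        -- per-player maximum, computed first
        JOIN (SELECT pa2.playerId, MAX(a2.amount) AS best
              FROM achievement a2
              JOIN playerHasAchievement pa2 ON pa2.achievementId = a2.id
              WHERE a2.type = ?
              GROUP BY pa2.playerId) g
          ON g.playerId = pa.playerId AND g.best = a.amount
        WHERE a.type = ?
        GROUP BY p.id, p.name, a.amount      -- collapses tied top scores
        ORDER BY a.amount DESC, p.name
        LIMIT ?
    """
    return conn.execute(sql, (score_type, score_type, limit)).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE achievement (id INTEGER PRIMARY KEY, type INTEGER,
                              amount INTEGER, time INTEGER);
    CREATE TABLE playerHasAchievement (id INTEGER PRIMARY KEY,
                                       playerId INTEGER, achievementId INTEGER);
""")
conn.executemany("INSERT INTO player VALUES (?, ?)", [(1, "ann"), (2, "bob")])
conn.executemany("INSERT INTO achievement VALUES (?, ?, ?, ?)", [
    (1, 2, 50, 100), (2, 2, 80, 200), (3, 2, 80, 300),   # ann ties at 80
    (4, 2, 70, 150), (5, 1, 999, 50),                    # bob; type 1 is ignored
])
conn.executemany("INSERT INTO playerHasAchievement VALUES (?, ?, ?)", [
    (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 2, 4), (5, 2, 5),
])

result = best_scores(conn)
```

Each player shows up once with their own best score, and the time comes from a row that actually holds that score rather than from an arbitrary row in the group.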
