Slow mysql query with subqueries - php

I'm experiencing a problem with this query in PHP. It is so slow that I cannot connect to the server for a few minutes after executing it. I removed the subqueries from the main query one at a time and found that it runs smoothly once I remove the last inner query.
Table "u" has 30.000 rows, table "u_r" around 17.000, table "u_s" around 13.000 and table "s" around 100. Even though tables "u_r" and "u_s" have lots of rows, only 2-3 rows have the same "id_u" that matches the condition.
I hope I have provided enough information. If you need to know anything else, feel free to ask in the comments.
SELECT DISTINCT id_p
FROM u
WHERE
'$x' IN(
SELECT id_p
FROM u_p
WHERE id_u=u.id_u
)
AND
(
'$y' IN (
SELECT id_r
FROM u_r
WHERE id_u=u.id_u
)
OR
'$y' IN (
SELECT DISTINCT id_r
FROM s
WHERE id_s IN (
SELECT id_s -- without this query, everything works fine
FROM u_s
WHERE id_u=u.id_u
)
)
)

I think you can change:
SELECT DISTINCT id_r
FROM s
WHERE id_s IN (
SELECT id_s -- without this query, everything works fine
FROM u_s
WHERE id_u=u.id_u
)
to
SELECT s.id_r
FROM u_s
INNER JOIN s ON s.id_s = u_s.id_s
WHERE u_s.id_u = u.id_u
GROUP BY s.id_r
Not tested, as I'm still trying to wrap my head around the structure.
Also remember to check the indexes. As a rough guide, you should have anything in the "WHERE" and anything in the "ON" indexed:
u_p.id_u
u_r.id_u
u_s.id_u
s.id_s
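A minimal sketch of checking that such indexes get used. It runs against an in-memory SQLite database as a stand-in for MySQL (where you would use CREATE INDEX and EXPLAIN in the same spirit). The table and column names come from the question; the index names and the plan() helper are made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here; index names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE u_p (id_u INTEGER, id_p INTEGER);
    CREATE TABLE u_r (id_u INTEGER, id_r INTEGER);
    CREATE TABLE u_s (id_u INTEGER, id_s INTEGER);
    CREATE TABLE s   (id_s INTEGER, id_r INTEGER);

    CREATE INDEX idx_u_p_id_u ON u_p (id_u);
    CREATE INDEX idx_u_r_id_u ON u_r (id_u);
    CREATE INDEX idx_u_s_id_u ON u_s (id_u);
    CREATE INDEX idx_s_id_s   ON s (id_s);
""")


def plan(sql, params=()):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row is the human-readable detail text.
    return " | ".join(row[-1] for row in rows)


# The same kind of lookup the correlated subqueries perform per row of u.
lookup_plan = plan("SELECT id_r FROM u_r WHERE id_u = ?", (1,))
print(lookup_plan)
```

If the index name does not show up in the plan, the lookup is scanning the whole table for every outer row, which is the pattern that makes correlated subqueries slow.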

Test that this produces the same results; it should be much faster.
Replace the IN subqueries with inner joins. This gives the query optimiser more opportunities to optimise.
Eliminate the OR. I did this using a UNION; you could probably also eliminate it using a LEFT JOIN if you wanted.
You'll need to test this, as I don't have any sample data.
select distinct u.id_p
from u
inner join u_p
on u.id_u = u_p.id_u
and u_p.id_p = '$x'
inner join
(
select u_r.id_u
from u_r
where u_r.id_r = '$y'
union
select u_s.id_u
from s
inner join u_s
on s.id_s = u_s.id_s
where s.id_r = '$y'
) as subq
on u.id_u = subq.id_u
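Since the rewrite has to be tested against the original, here is a small sketch of such a check. It assumes an in-memory SQLite database in place of MySQL, a guessed schema (the question never shows the column lists) and invented sample rows. Bound parameters replace the interpolated '$x' and '$y'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema and made-up rows; only the names come from the question.
    CREATE TABLE u   (id_u INTEGER, id_p INTEGER);
    CREATE TABLE u_p (id_u INTEGER, id_p INTEGER);
    CREATE TABLE u_r (id_u INTEGER, id_r INTEGER);
    CREATE TABLE u_s (id_u INTEGER, id_s INTEGER);
    CREATE TABLE s   (id_s INTEGER, id_r INTEGER);

    INSERT INTO u   VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO u_p VALUES (1, 7), (2, 7), (3, 8);
    INSERT INTO u_r VALUES (1, 5);
    INSERT INTO u_s VALUES (2, 100), (3, 100);
    INSERT INTO s   VALUES (100, 5), (101, 6);
""")

# Original shape: correlated IN subqueries combined with OR.
SUBQUERY_SQL = """
    SELECT DISTINCT id_p FROM u
    WHERE ? IN (SELECT id_p FROM u_p WHERE id_u = u.id_u)
      AND (
            ? IN (SELECT id_r FROM u_r WHERE id_u = u.id_u)
         OR ? IN (SELECT id_r FROM s WHERE id_s IN
                    (SELECT id_s FROM u_s WHERE id_u = u.id_u))
      )
"""

# Rewritten shape: inner joins, with the OR expressed as a UNION.
JOIN_SQL = """
    SELECT DISTINCT u.id_p
    FROM u
    INNER JOIN u_p ON u.id_u = u_p.id_u AND u_p.id_p = ?
    INNER JOIN (
        SELECT u_r.id_u FROM u_r WHERE u_r.id_r = ?
        UNION
        SELECT u_s.id_u FROM s
        INNER JOIN u_s ON s.id_s = u_s.id_s
        WHERE s.id_r = ?
    ) AS subq ON u.id_u = subq.id_u
"""

x, y = 7, 5
subquery_result = sorted(r[0] for r in conn.execute(SUBQUERY_SQL, (x, y, y)))
join_result = sorted(r[0] for r in conn.execute(JOIN_SQL, (x, y, y)))
print(subquery_result, join_result)
```

Matching results on a toy data set do not prove equivalence on real data, so run the same comparison against a copy of the production tables.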

Related

Paginate items from different sources

Here's a problem I'm facing: I need to list some items. Those items come from different sources (let's say table A, table B, table C), with different attributes and nature (although some are common).
How can I merge them together in a list that is paginated?
The options I've considered:
Get them all first, then sort and paginate them afterwards in the code. This doesn't work well because there are too many items (thousands) and performance is a mess.
Join them in a SQL view with their shared attributes; once the SQL query is done, reload only the paginated items to get the rest of their attributes. This works so far, but might become difficult to maintain if the sources change or increase.
Do you know any other option? Basically, what is the most used/recommended way to paginate items from two data sources (either in SQL or directly in the code).
Thanks.
If UNION solves the problem, here are some syntax and optimization tips.
This will provide page 21 of 10-row pages:
(
( SELECT ... LIMIT 210 )
UNION [ALL|DISTINCT]
( SELECT ... LIMIT 210 )
) ORDER BY ... LIMIT 10 OFFSET 200
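Note that 210 = 200+10. You can't trust using OFFSET in the inner SELECTs, because all the rows for the requested page might come from a single SELECT; each inner SELECT therefore has to supply its first 210 rows in the final sort order.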
Use UNION ALL for speed, but if there could be repeated rows between the SELECTs, then explicitly say UNION DISTINCT.
If you take away too many parentheses, you will get either syntax errors or the 'wrong' results.
If you end up with a subquery, repeat the ORDER BY but not the LIMIT:
SELECT ...
FROM (
( SELECT ... LIMIT 210 )
UNION [ALL|DISTINCT]
( SELECT ... LIMIT 210 )
ORDER BY ... LIMIT 10 OFFSET 200
) AS u
JOIN something_else ON ...
ORDER BY ...
One reason that might include a JOIN is for performance -- The subquery u has boiled the resultset down to only 10 rows, hence the JOIN will have only 10 things to look up. Putting the JOIN inside would lead to lots of joining before whittling down to only 10.
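A small sketch of the pagination pattern above, using an in-memory SQLite database rather than MySQL. SQLite does not accept parenthesised SELECTs with their own LIMIT inside a UNION, so each branch is wrapped in a derived table; the idea is otherwise the same. The tables a and b, the val column and the fetch_page helper are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (val INTEGER);
    CREATE TABLE b (val INTEGER);
""")
# Two made-up sources: even numbers in a, odd numbers in b.
conn.executemany("INSERT INTO a VALUES (?)", [(n,) for n in range(0, 600, 2)])
conn.executemany("INSERT INTO b VALUES (?)", [(n,) for n in range(1, 600, 2)])


def fetch_page(page, page_size=10):
    """Return one page of the merged, sorted rows from both sources."""
    offset = (page - 1) * page_size
    # Each branch must supply offset + page_size rows in the final sort order.
    inner_limit = offset + page_size
    sql = """
        SELECT val, src FROM
            (SELECT val, 'a' AS src FROM a ORDER BY val LIMIT ?)
        UNION ALL
        SELECT val, src FROM
            (SELECT val, 'b' AS src FROM b ORDER BY val LIMIT ?)
        ORDER BY val
        LIMIT ? OFFSET ?
    """
    params = (inner_limit, inner_limit, page_size, offset)
    return conn.execute(sql, params).fetchall()


page_21 = fetch_page(21)
print(page_21)
```

Note that the inner limit grows with the page number, so deep pages still read many rows from every source.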
I actually had to answer a similar situation very recently, specifically reporting across two large tables and paginating across both of them. The answer I came to was to use subqueries, like so:
SELECT
t1.id as 't1_id',
t1.name as 't1_name',
t1.attribute as 't1_attribute',
t2.id as 't2_id',
t2.name as 't2_name',
t2.attribute as 't2_attribute',
l.attribute as 'l_attribute'
FROM (
SELECT
id, name, attribute
FROM
table1
/* You can perform joins in here if you want, just make sure you're using your aliases right */
/* You can also put where statements here */
ORDER BY
name DESC, id ASC
LIMIT 0,50
) as t1
INNER JOIN (
SELECT
id,
name,
attribute
FROM
table2
ORDER BY
attribute ASC
LIMIT 250,50
) as t2
ON t2.id IS NOT NULL
LEFT JOIN
linkingTable as l
ON l.t1Id = t1.id
AND l.t2Id = t2.id
/* Do your wheres and stuff here */
/* You shouldn't need to do any additional ordering or limiting */

MySQL - Joining another table with multiple rows, inserting a query into another query?

I've been racking my brain for hours trying to work out how to join these two queries.
My goal is to return multiple venue rows (from venues) based on certain criteria, which is what my current query does:
SELECT venues.id AS ven_id,
venues.venue_name,
venues.sub_category_id,
venues.score,
venues.lat,
venues.lng,
venues.short_description,
sub_categories.id,
sub_categories.sub_cat_name,
sub_categories.category_id,
categories.id,
categories.category_name,
((ACOS( SIN(51.44*PI()/180)*SIN(lat*PI()/180) + COS(51.44*PI()/180)*COS(lat*PI()/180)*COS((-2.60796 - lng)*PI()/180)) * 180/PI())*60 * 1.1515) AS dist
FROM venues,
sub_categories,
categories
WHERE
venues.sub_category_id = sub_categories.id
AND sub_categories.category_id = categories.id
HAVING
dist < 5
ORDER BY score DESC
LIMIT 0, 100
However, I need to include another field in this query (thumbnail), which comes from another table (venue_images). The idea is to extract one image row based on which venue it's related to and its order. Only one image needs to be extracted, hence LIMIT 1.
I basically need to insert this query:
SELECT
venue_images.thumb_image_filename,
venue_images.image_venue_id,
venue_images.image_order
FROM venue_images
WHERE venue_images.image_venue_id = ven_id -- id from above query
ORDER BY venue_images.image_order
LIMIT 1
Into my first query, and label this new field as "thumbnail".
Any help would really be appreciated. Thanks!
First of all, you could write the first query using INNER JOIN:
SELECT
...
FROM
venues INNER JOIN sub_categories ON venues.sub_category_id = sub_categories.id
INNER JOIN categories ON sub_categories.category_id = categories.id
HAVING
...
The result should be identical, but I like this one more.
What I'd like to do next is to JOIN a subquery, something like this:
...
INNER JOIN (SELECT ... FROM venue_images
WHERE venue_images.image_venue_id = ven_id -- id from above query
ORDER BY venue_images.image_order
LIMIT 1) first_image
Unfortunately, this subquery can't see ven_id, because a derived table is evaluated before the outer query (I think this is a MySQL limitation), so we have to find another solution. And since you are using LIMIT 1, it's not easy to rewrite the condition using just JOINs.
It would be easier if MySQL provided a FIRST() aggregate function, but since it doesn't, we have to simulate it; see for example this question: How to fetch the first and last record of a grouped record in a MySQL query with aggregate functions?
So using this trick, you can write a query that extracts the first image_id for every image_venue_id:
SELECT
image_venue_id,
SUBSTRING_INDEX(
GROUP_CONCAT(image_id order by venue_images.image_order),',',1) as first_image_id
FROM venue_images
GROUP BY image_venue_id
and this query could be integrated in your query above:
SELECT
...
FROM
venues INNER JOIN sub_categories ON venues.sub_category_id = sub_categories.id
INNER JOIN categories ON sub_categories.category_id = categories.id
INNER JOIN (the query above) first_image on first_image.image_venue_id = venues.id
INNER JOIN venue_images on first_image.first_image_id = venue_images.image_id
HAVING
...
I also added one more JOIN, to join the first image id with the actual image row. I couldn't test the query, but the idea is to proceed like this.
Since the query is now becoming more complicated and difficult to maintain, I think it would be better to create a view that extracts the first image for every venue, and then join just the view in your query. This is just an idea. Let me know if it works or if you need any help!
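As a runnable illustration of the "first image per venue" idea, the sketch below swaps the GROUP_CONCAT trick for a plainer variant: a derived table with MIN(image_order) per venue, joined back to venue_images. It is not the same technique as above, and it assumes that (image_venue_id, image_order) is unique. It uses an in-memory SQLite database instead of MySQL, invented sample rows, and LEFT JOINs so that venues without images are kept with a NULL thumbnail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Trimmed-down, assumed versions of the tables from the question.
    CREATE TABLE venues (id INTEGER PRIMARY KEY, venue_name TEXT);
    CREATE TABLE venue_images (
        image_id INTEGER PRIMARY KEY,
        image_venue_id INTEGER,
        image_order INTEGER,
        thumb_image_filename TEXT
    );

    INSERT INTO venues VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO venue_images VALUES
        (1, 1, 2, 'a2.jpg'),
        (2, 1, 1, 'a1.jpg'),
        (3, 2, 5, 'b5.jpg');
""")

FIRST_IMAGE_SQL = """
    SELECT v.id, v.venue_name, vi.thumb_image_filename AS thumbnail
    FROM venues AS v
    LEFT JOIN (
        -- Lowest image_order per venue identifies the first image.
        SELECT image_venue_id, MIN(image_order) AS first_order
        FROM venue_images
        GROUP BY image_venue_id
    ) AS first_image ON first_image.image_venue_id = v.id
    LEFT JOIN venue_images AS vi
           ON vi.image_venue_id = first_image.image_venue_id
          AND vi.image_order = first_image.first_order
    ORDER BY v.id
"""

thumbnails = conn.execute(FIRST_IMAGE_SQL).fetchall()
print(thumbnails)
```

If two images of one venue can share the same image_order, this returns both of them, and a tie-breaker such as image_id would be needed.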
I'm not too sure about your data, but a JOIN with the thumbnails table and a GROUP BY on your large query would probably work.
GROUP BY venues.id

Inner Join and Outer Joins Producing the same Result

I have 2 tables in MySQL: one holds contractors and the other holds projects. I want to produce a contractor-project report showing the apportioning of the projects. The problem is that INNER JOIN, LEFT OUTER JOIN and RIGHT OUTER JOIN all produce the same result, showing only the contractor with a project, even when I leave out the condition, which seems weird. Here are my statements:
SELECT DISTINCT (tbl_contractor.name_v), count( tbl_project.name_v )
FROM tbl_contractor
INNER JOIN tbl_project
ON tbl_project.Contractor = tbl_contractor.contractor_id_v
LIMIT 0 , 30;
SELECT DISTINCT (tbl_contractor.name_v), count( tbl_project.name_v )
FROM tbl_contractor
LEFT OUTER JOIN tbl_project
ON tbl_project.Contractor = tbl_contractor.contractor_id_v
LIMIT 0 , 30;
You have an aggregate function, COUNT(), without a GROUP BY. This means your query will return one row only.
You probably need a GROUP BY on the contractor:
SELECT tbl_contractor.name_v, COUNT( tbl_project.name_v )
FROM tbl_contractor
LEFT OUTER JOIN tbl_project
ON tbl_project.Contractor = tbl_contractor.contractor_id_v
GROUP BY tbl_contractor.contractor_id_v
LIMIT 0 , 30;
SELECT DISTINCT (tbl_contractor.name_v) makes the query return only one row for each contractor name. Try removing the DISTINCT and see if you get a better contractor-project result.
These queries are really group by queries on the contractor. If every contractor has at least one project, then the inner and left outer joins will return the same results. If there are contractors without projects, then the results are affected by the LIMIT clause. You are only getting the first 30, and, for whatever reason, the matches are appearing first.
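A small sketch of the difference between the joins once GROUP BY is in place. It uses an in-memory SQLite database instead of MySQL and invented contractors and projects; the table and column names follow the question, and the ORDER BY is added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_contractor (contractor_id_v INTEGER PRIMARY KEY, name_v TEXT);
    CREATE TABLE tbl_project (name_v TEXT, Contractor INTEGER);

    -- Made-up rows: Cobalt has no projects.
    INSERT INTO tbl_contractor VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Cobalt');
    INSERT INTO tbl_project VALUES ('Bridge', 1), ('Road', 1), ('Dam', 2);
""")

REPORT_SQL = """
    SELECT tbl_contractor.name_v, COUNT(tbl_project.name_v)
    FROM tbl_contractor
    {join} JOIN tbl_project
        ON tbl_project.Contractor = tbl_contractor.contractor_id_v
    GROUP BY tbl_contractor.contractor_id_v
    ORDER BY tbl_contractor.contractor_id_v
"""

inner_rows = conn.execute(REPORT_SQL.format(join="INNER")).fetchall()
left_rows = conn.execute(REPORT_SQL.format(join="LEFT OUTER")).fetchall()
print(inner_rows)
print(left_rows)
```

COUNT(tbl_project.name_v) skips NULLs, so a contractor without projects shows up with 0 in the LEFT OUTER JOIN and is absent from the INNER JOIN.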

Include sorting in query

I recently added a sorting field and now I want to sort the result. I could do it in PHP or directly on the database. So I tried the second one:
SELECT *
FROM constructions
AS whole
INNER JOIN
(
SELECT DISTINCT construction
FROM data
AS results
WHERE product =2
AND application =1
AND requirement =1
)
ON whole.id = results
ORDER BY whole.sorting
I tried to use an inner join to match the complete table with the result set, but I can't get it working (#1248 - Every derived table must have its own alias). I tried to use the alias, but something is still wrong. Perhaps I shouldn't use INNER JOIN and should use IN() instead.
There are 2 obvious errors in the syntax: the alias results belongs on the selected column, not on the data table, and the derived table needs its own alias:
SELECT *
FROM constructions
AS whole
INNER JOIN
(
SELECT DISTINCT construction AS results
FROM data
WHERE product =2
AND application =1
AND requirement =1
) AS a
ON whole.id = a.results
ORDER BY whole.sorting
Try to get the data like this:
SELECT *
FROM constructions
WHERE id IN
(
SELECT DISTINCT construction
FROM data
WHERE product =2
AND application =1
AND requirement =1
)
ORDER BY sorting
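To compare the two approaches, here is a short sketch that runs an aliased derived-table join and an IN() filter side by side. It assumes an in-memory SQLite database in place of MySQL, trimmed-down tables with guessed columns (name is invented) and made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed columns; only id, sorting, construction, product,
    -- application and requirement appear in the question.
    CREATE TABLE constructions (id INTEGER PRIMARY KEY, name TEXT, sorting INTEGER);
    CREATE TABLE data (construction INTEGER, product INTEGER,
                       application INTEGER, requirement INTEGER);

    INSERT INTO constructions VALUES
        (1, 'wall', 3), (2, 'roof', 1), (3, 'floor', 2), (4, 'door', 4);
    INSERT INTO data VALUES
        (1, 2, 1, 1), (1, 2, 1, 1), (3, 2, 1, 1), (2, 9, 1, 1);
""")

# Derived table with its own alias (a) and an aliased column (results).
JOIN_SQL = """
    SELECT whole.id
    FROM constructions AS whole
    INNER JOIN (
        SELECT DISTINCT construction AS results
        FROM data
        WHERE product = 2 AND application = 1 AND requirement = 1
    ) AS a ON whole.id = a.results
    ORDER BY whole.sorting
"""

# Same filter expressed with IN(); no alias is needed here.
IN_SQL = """
    SELECT id
    FROM constructions
    WHERE id IN (
        SELECT construction
        FROM data
        WHERE product = 2 AND application = 1 AND requirement = 1
    )
    ORDER BY sorting
"""

join_ids = [row[0] for row in conn.execute(JOIN_SQL)]
in_ids = [row[0] for row in conn.execute(IN_SQL)]
print(join_ids, in_ids)
```

Both forms return the matching constructions in sorting order; which one performs better on MySQL depends on the version and the indexes, so check with EXPLAIN.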

Any way to optimize this mysql query?

This is the query. I'm mostly interested in whether there is a better way to grab the stuff I use GROUP_CONCAT for, or whether that's a fairly good way of grabbing this data. I then explode it, put the ids/names into an array, and use a for loop to echo them out.
SELECT
mov_id,
mov_title,
GROUP_CONCAT(DISTINCT categories.cat_name) as all_genres,
GROUP_CONCAT(DISTINCT cat_id) as all_genres_ids,
GROUP_CONCAT(DISTINCT case when gen_dominant = 1 then gen_catid else 0 end) as dominant_genre_ids,
GROUP_CONCAT(DISTINCT actors.act_name) as all_actors,
GROUP_CONCAT(DISTINCT actors.act_id) as all_actor_ids,
mov_desc,
mov_added,
mov_thumb,
mov_hits,
mov_numvotes,
mov_totalvote,
mov_imdb,
mov_release,
mov_html,
mov_type,
mov_buytickets,
ep_summary,
ep_airdate,
ep_id,
ep_hits,
ep_totalNs,
ep_totalRs,
mov_rating,
mov_rating_reason,
mrate_name,
dir_id,
dir_name
FROM movies
LEFT JOIN _genres
ON movies.mov_id = _genres.gen_movieid
LEFT JOIN categories
ON _genres.gen_catid = categories.cat_id
LEFT JOIN _actors
ON (movies.mov_id = _actors.ac_movid)
LEFT JOIN actors
ON (_actors.ac_actorid = actors.act_id AND act_famous = 1)
LEFT JOIN directors
ON movies.mov_director = directors.dir_id
LEFT JOIN movie_ratings
ON movies.mov_rating = movie_ratings.mrate_id
LEFT JOIN episodes
ON mov_id = ep_showid AND ep_season = 0 AND ep_num = 0
WHERE mov_id = *MOVIE_ID* AND mov_status = 1
GROUP BY mov_id
EXPLAIN of the query is here: http://www.krayvee.com/o2/explain.gif
Personally, I would try to break the query up into multiple queries. Mostly I would recommend removing the Actor and Genre Joins so that you can get rid of all those group_concat functions. Then do separate queries to pull this data out. Not sure if it would speed things up, but it's probably worth a shot.
You've basically done a Cartesian product between genres, actors, directors, movie_ratings and episodes. That's why you have to use DISTINCT inside your GROUP_CONCAT(), because the pre-grouped result set has a number of rows equal to the product of the number of matching rows in each related table.
Note that this query wouldn't work at all in standard SQL; it only runs because MySQL is permissive about the single-value rule.
Like @Kibbee, I usually recommend running separate queries in cases like this. It's not always better to run a single query. Try breaking up the query and doing some profiling to be sure.
PS: What? No _directors table? So you can't represent a movie with more than one director? :-)
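A sketch of the row multiplication described above, and of the separate-queries alternative. It uses an in-memory SQLite database instead of MySQL, heavily trimmed tables and invented rows: one movie with 3 genres and 4 actors.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Trimmed-down, assumed versions of the tables from the question.
    CREATE TABLE movies (mov_id INTEGER PRIMARY KEY, mov_title TEXT);
    CREATE TABLE categories (cat_id INTEGER PRIMARY KEY, cat_name TEXT);
    CREATE TABLE actors (act_id INTEGER PRIMARY KEY, act_name TEXT);
    CREATE TABLE _genres (gen_movieid INTEGER, gen_catid INTEGER);
    CREATE TABLE _actors (ac_movid INTEGER, ac_actorid INTEGER);

    INSERT INTO movies VALUES (1, 'Example Movie');
    INSERT INTO categories VALUES (1, 'Drama'), (2, 'Crime'), (3, 'Thriller');
    INSERT INTO actors VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
    INSERT INTO _genres VALUES (1, 1), (1, 2), (1, 3);
    INSERT INTO _actors VALUES (1, 1), (1, 2), (1, 3), (1, 4);
""")

movie_id = 1

# Joining both one-to-many tables at once multiplies the rows: 3 x 4.
joined_rows = conn.execute("""
    SELECT COUNT(*)
    FROM movies
    LEFT JOIN _genres ON movies.mov_id = _genres.gen_movieid
    LEFT JOIN _actors ON movies.mov_id = _actors.ac_movid
    WHERE movies.mov_id = ?
""", (movie_id,)).fetchone()[0]

# Separate queries return exactly the rows needed, with no DISTINCT.
genres = conn.execute("""
    SELECT cat_id, cat_name
    FROM _genres
    JOIN categories ON _genres.gen_catid = categories.cat_id
    WHERE _genres.gen_movieid = ?
    ORDER BY cat_id
""", (movie_id,)).fetchall()

actors = conn.execute("""
    SELECT act_id, act_name
    FROM _actors
    JOIN actors ON _actors.ac_actorid = actors.act_id
    WHERE _actors.ac_movid = ?
    ORDER BY act_id
""", (movie_id,)).fetchall()

print(joined_rows, len(genres), len(actors))
```

The separate queries also hand back ids and names as rows, so there is nothing to explode on the PHP side.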
