MySQL INNER vs LEFT JOIN different order - php

Why do these two queries return different result sets when they have the same ORDER BY?
The only difference between the queries is that the first time I used INNER JOIN, and it took about 5 seconds.
The second time I used LEFT JOIN, and it took 0.05 seconds. In both cases they return exactly 43,000 rows, but the tck.id order is different and I can't figure out why, or in what way.
SELECT tck.*, acc.ac_name
FROM support_tickets tck
INNER JOIN support_ticket_accounts acc USING (id_support_ticket_account)
WHERE tck.id_company = 2 AND tck.st_status = 1 ORDER BY tck.st_priority DESC
Edit:
SELECT tck.*, acc.ac_name
FROM support_tickets tck
LEFT JOIN support_ticket_accounts acc ON tck.id_support_ticket_account = acc.id_support_ticket_account
WHERE tck.id_company = 2 AND tck.st_status = 1
ORDER BY tck.st_priority DESC;
+----+-------------+-------+--------+---------------+------------+---------+-------------------------------+-------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+------------+---------+-------------------------------+-------+-----------------------------+
| 1 | SIMPLE | tck | ref | id_company | id_company | 5 | const | 37586 | Using where; Using filesort |
| 1 | SIMPLE | acc | eq_ref | PRIMARY | PRIMARY | 4 | tck.id_support_ticket_account | 1 | |
+----+-------------+-------+--------+---------------+------------+---------+-------------------------------+-------+-----------------------------+
SELECT tck.*, acc.ac_name
FROM support_tickets tck
INNER JOIN support_ticket_accounts acc ON tck.id_support_ticket_account = acc.id_support_ticket_account
WHERE tck.id_company = 2 AND tck.st_status = 1
ORDER BY tck.st_priority DESC;
+----+-------------+-------+------+--------------------------------------+---------------------------+---------+-------------------------------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+--------------------------------------+---------------------------+---------+-------------------------------+------+---------------------------------+
| 1 | SIMPLE | acc | ALL | PRIMARY | NULL | NULL | NULL | 5 | Using temporary; Using filesort |
| 1 | SIMPLE | tck | ref | id_company,id_support_ticket_account | id_support_ticket_account | 5 | acc.id_support_ticket_account | 2085 | Using where |
+----+-------------+-------+------+--------------------------------------+---------------------------+---------+-------------------------------+------+---------------------------------+

I think Using temporary is responsible for the delay (but I don't see why it's necessary for one query and not the other). I think creating a multi-column index should help:
CREATE INDEX filter
ON support_tickets(id_company, st_status, st_priority)
USING BTREE;

If you just ORDER BY tck.st_priority DESC, several different row orders are possible and any of them can be returned, in both cases (LEFT or INNER). That is because you must have many records that share the same st_priority, so those records can come back in no particular order relative to each other.
Add more columns to the ORDER BY clause so that every record has a unique position (for example, tck.id as a tie-breaker), and you will get the same order from both queries.
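To make the tie-breaker point concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The five sample rows are invented for illustration; only the table and column names come from the question.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE support_tickets (id INTEGER PRIMARY KEY, st_priority INTEGER)"
)
# Invented sample rows: ids 2 and 4 tie on priority 9, ids 1 and 3 tie on 5.
conn.executemany(
    "INSERT INTO support_tickets (id, st_priority) VALUES (?, ?)",
    [(1, 5), (2, 9), (3, 5), (4, 9), (5, 1)],
)

# Ordering by st_priority alone would leave the tied rows in an
# unspecified order; adding id as a second key fixes every position.
rows = conn.execute(
    "SELECT id, st_priority FROM support_tickets "
    "ORDER BY st_priority DESC, id ASC"
).fetchall()
ordered_ids = [row[0] for row in rows]
print(ordered_ids)
```

With the second sort key in place, the order no longer depends on which join plan the optimizer happens to pick.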

Related

This Mysql SQL Select query takes 33 mins to execute

The title says it all. Could anyone help me untangle this, or point out what else could be causing it to take so much time? The query takes about half an hour to run. The person who wrote it tried doing it in a loop instead, removing the table from the last join and then querying field.title for each vote. I was hoping to bring the run time down to about 5 minutes.
some extra info:
The query result is 83,531 rows
The vote table size is 30 MB (261,169 rows)
SELECT `vote`.`id` `vote_id`, `branch`.`name` `branch`, `brand`.`name` `brand`, DATE(vote.created_at) `date`, HOUR(vote.created_at) `time_hour`,
MINUTE(vote.created_at) `time_minute`, `vote`.`is_like`, `voter`.`name`, `voter`.`telephone`, `voter`.`email`, popups_votes.title `popup_title`,
popups_votes.value `popup_value`, GROUP_CONCAT(dis.field SEPARATOR '|') `reasons`
FROM (`vote`)
LEFT JOIN `voter` ON `voter`.`id` = `vote`.`voter_id`
LEFT JOIN `device` ON `device`.`id` = `vote`.`device_id`
LEFT JOIN `branch` ON `branch`.`id` = `device`.`branch_id`
LEFT JOIN `brand` ON `brand`.`id` = `branch`.`brand_id`
LEFT JOIN `popups_votes` ON popups_votes.vote_id = vote.id
LEFT JOIN (SELECT vote_dislike.vote_id `vote_id`, field.title `field` FROM vote_dislike
LEFT JOIN branch_dislike_field ON branch_dislike_field.id = vote_dislike.branch_dislike_id
LEFT JOIN field ON field.id = branch_dislike_field.field_id) dis
ON dis.vote_id = vote.id
WHERE (vote.device_id in
(
Select d.id
From device d
WHERE d.branch_id IN (SELECT id FROM branch WHERE brand_id = 7)
)
)
AND (vote.created_at >= FROM_UNIXTIME('$from_time') AND vote.created_at <= FROM_UNIXTIME('$to_time') )
GROUP BY vote.id
EDIT: this is the explain {query} output:
+------+-------------+----------------------+--------+----------------------+-----------+---------+-------------------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+----------------------+--------+----------------------+-----------+---------+-------------------------------------------+------+----------------------------------------------+
| 1 | PRIMARY | branch | ref | PRIMARY,brand_id | brand_id | 4 | const | 20 | Using index; Using temporary; Using filesort |
| 1 | PRIMARY | d | ref | PRIMARY,branch_id | branch_id | 4 | river_back.branch.id | 1 | Using index |
| 1 | PRIMARY | vote | ref | device_id,created_at | device_id | 4 | river_back.d.id | 1200 | Using where |
| 1 | PRIMARY | voter | eq_ref | PRIMARY | PRIMARY | 4 | river_back.vote.voter_id | 1 | |
| 1 | PRIMARY | device | eq_ref | PRIMARY | PRIMARY | 4 | river_back.d.id | 1 | |
| 1 | PRIMARY | branch | eq_ref | PRIMARY | PRIMARY | 4 | river_back.device.branch_id | 1 | Using where |
| 1 | PRIMARY | brand | eq_ref | PRIMARY | PRIMARY | 4 | river_back.branch.brand_id | 1 | Using where |
| 1 | PRIMARY | popups_votes | ref | vote_id | vote_id | 5 | river_back.vote.id | 602 | |
| 1 | PRIMARY | vote_dislike | ref | vote_id | vote_id | 4 | river_back.vote.id | 1 | |
| 1 | PRIMARY | branch_dislike_field | eq_ref | PRIMARY | PRIMARY | 4 | river_back.vote_dislike.branch_dislike_id | 1 | Using where |
| 1 | PRIMARY | field | eq_ref | PRIMARY | PRIMARY | 4 | river_back.branch_dislike_field.field_id | 1 | Using where |
+------+-------------+----------------------+--------+----------------------+-----------+---------+-------------------------------------------+------+----------------------------------------------+
You should check that the columns you join and filter on are indexed, and that you have foreign keys defined.
How do MySQL indexes work?
Basically an index on a table works like an index in a book (that's
where the name came from):
Let's say you have a book about databases and you want to find some
information about, say, storage. Without an index (assuming no other
aid, such as a table of contents) you'd have to go through the pages
one by one, until you found the topic (that's a full table scan). On
the other hand, an index has a list of keywords, so you'd consult the
index and see that storage is mentioned on pages 113-120,231 and 354.
Then you could flip to those pages directly, without searching (that's
a search with an index, somewhat faster).
Basics of Foreign Keys in MySQL?
FOREIGN KEYS just ensure your data are consistent.
They do not improve queries in sense of efficiency, they just make
some wrong queries fail.
Do not use LEFT unless you are expecting the "right" table to have missing rows.
In particular, the Optimizer probably cannot start with the 'derived table' since it is hiding to the right of a LEFT.
Do not use IN ( SELECT ... ); if possible change to EXISTS ( SELECT * ...) or JOIN.
Try to avoid the "inflate-deflate" caused by JOIN ... GROUP BY. If possible find the ids of interest without needing a GROUP BY, then JOIN to the other tables.
Putting many of those together, does this get you close to the desired result, at least in the sense of getting the correct vote.id values?
SELECT v.id
FROM vote AS v
JOIN (
SELECT vd.vote_id `vote_id`, field.title `field`
FROM vote_dislike AS vd
LEFT JOIN branch_dislike_field AS bd
ON bd.id = vd.branch_dislike_id
LEFT JOIN field
ON field.id = bd.field_id
) AS dis ON dis.vote_id = v.id
JOIN device AS d ON v.device_id = d.id
JOIN branch AS b ON d.branch_id = b.id
WHERE b.brand_id = 7
AND v.created_at >= ...
AND v.created_at <= ...
Then:
SELECT lots of stuff
FROM ( the above query ) AS x
JOIN vote v ON x.id = v.id -- yes, dig back into `vote` for the other stuff
JOIN voter ...
JOIN ...
but with no GROUP BY.
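The "inflate-deflate" effect mentioned above can be seen on a toy schema. This is only a sketch: it uses Python's sqlite3 module instead of MySQL, the rows are invented, and vote_dislike is simplified to carry the reason text directly in a hypothetical field column rather than going through branch_dislike_field and field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vote (id INTEGER PRIMARY KEY);
CREATE TABLE popups_votes (vote_id INTEGER, title TEXT);
-- Simplified: the real schema reaches field.title through two more joins.
CREATE TABLE vote_dislike (vote_id INTEGER, field TEXT);
INSERT INTO vote VALUES (1);
INSERT INTO popups_votes VALUES (1, 'p1'), (1, 'p2');
INSERT INTO vote_dislike VALUES (1, 'slow'), (1, 'rude');
""")

# Joining two one-to-many tables first multiplies the rows (2 x 2 = 4),
# and GROUP BY then has to squeeze them back into one row per vote.
inflated = conn.execute("""
    SELECT COUNT(vd.field)
    FROM vote v
    LEFT JOIN popups_votes pv ON pv.vote_id = v.id
    LEFT JOIN vote_dislike vd ON vd.vote_id = v.id
    GROUP BY v.id
""").fetchone()[0]

# Aggregating the dislikes in a derived table before joining keeps
# one row per vote, so nothing is counted twice.
pre_aggregated = conn.execute("""
    SELECT d.reason_count
    FROM vote v
    LEFT JOIN (
        SELECT vote_id, COUNT(*) AS reason_count
        FROM vote_dislike
        GROUP BY vote_id
    ) d ON d.vote_id = v.id
""").fetchone()[0]

print(inflated, pre_aggregated)
```

The same multiplication would affect GROUP_CONCAT in the original query, where each reason could appear once per matching popups_votes row.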

Joining two MySQL tables correctly based on conditions

My database has the following two tables
jobs:
+----+--------------+
| id | name |
+----+--------------+
| 1 | Mechanic |
| 2 | Programmer |
| 3 | Cleaner |
| 4 | Truck driver |
+----+--------------+
qualifications:
+--------+--------------------+
| job_id | qualification |
+--------+--------------------+
| 1 | drivers_license |
| 1 | engine_certificate |
| 2 | mysql_certificate |
| 4 | drivers_license |
+--------+--------------------+
Let's say that I have a drivers_license and a mysql_certificate. I want to create an SQL query that returns all jobs that have no requirement I lack. So the result of the query should be job ids 2, 3 and 4.
I have tried the following query:
select *
from jobs j
join qualifications q
on j.id = q.job_id
where q.qualification = 'drivers_license' ||
q.qualification = 'mysql_certificate';
This returns id 1, 2 and 4 and therefore obviously doesn't work.
How can this be achieved in SQL? Any help is greatly appreciated.
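You can use group by and having. A left join keeps jobs that have no requirements at all (such as Cleaner), and coalesce turns their NULL sum into 0:
select j.name, j.id
from jobs j left join
qualifications q
on j.id = q.job_id
group by j.name, j.id
having coalesce(sum(q.qualification not in ('drivers_license' , 'mysql_certificate')), 0) = 0;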
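As a quick check of the group by / having idea against the sample data, here is a sketch using Python's sqlite3 module in place of MySQL. It uses a LEFT JOIN with COALESCE so that a job with no listed requirements (Cleaner) is also returned; that adjustment is an assumption based on the expected result of 2, 3 and 4 stated in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE jobs (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE qualifications (job_id INTEGER, qualification TEXT);
INSERT INTO jobs VALUES
    (1, 'Mechanic'), (2, 'Programmer'), (3, 'Cleaner'), (4, 'Truck driver');
INSERT INTO qualifications VALUES
    (1, 'drivers_license'), (1, 'engine_certificate'),
    (2, 'mysql_certificate'), (4, 'drivers_license');
""")

# The boolean expression is 1 for every requirement the applicant lacks.
# A job qualifies when that sum is 0; for a job with no requirements the
# sum is NULL, which COALESCE maps to 0.
job_ids = [
    row[0]
    for row in conn.execute("""
        SELECT j.id
        FROM jobs j
        LEFT JOIN qualifications q ON j.id = q.job_id
        GROUP BY j.id
        HAVING COALESCE(
            SUM(q.qualification NOT IN ('drivers_license', 'mysql_certificate')),
            0
        ) = 0
        ORDER BY j.id
    """)
]
print(job_ids)
```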

Optimization of SQL with subquery and Having

Currently we are using a custom CI library to generate PDF files from documents which exist as database records in our database.
Each document is related to its contents (== rows) in a one-to-many relation. Each row has a number (field: row_span) to indicate how many lines it will use once it gets printed in the PDF.
For each PDF page that gets built, only the rows needed for that page are selected, using a subquery:
$where = $this->docType."_id = ".$this->data['doc']->id." AND visible = 1";
$sql = "SELECT *,
(SELECT
sum(row_span) FROM app_".$this->docType."_rows X
WHERE X.position <= O.position
AND ".$where."
ORDER BY position ASC) 'span_total'
FROM app_".$this->docType."_rows O
WHERE ".$where."
HAVING span_total > ".(($i-1)*$this->maxRows)." AND span_total <= ".($i*$this->maxRows)." ORDER BY O.position ASC ";
$rows = $rows->query($sql);
In the code $i is the page number and $this->maxRows is loaded from the document template record which indicates how many available lines the PDF template has.
So when the SQL renders it might look like this for page 1 of an order with ID 834:
SELECT `app_order_rows`.*,
(SELECT SUM(`app_order_rows_subquery`.`row_span`) AS row_span
FROM `app_order_rows` `app_order_rows_subquery`
WHERE `app_order_rows_subquery`.`position` <= 'app_order_rows.position'
AND `app_order_rows_subquery`.`order_id` = 834
AND `app_order_rows_subquery`.`visible` = 1
ORDER BY `app_order_rows_subquery`.`position` asc) AS span_total
FROM (`app_order_rows`)
WHERE `app_order_rows`.`order_id` = 834
AND `app_order_rows`.`visible` = 1
HAVING span_total > 0
AND span_total <= 45
ORDER BY `app_order_rows`.`position` asc
And running this with EXPLAIN gives this as output:
+====+=============+=========================+======+===============+======+=========+======+======+=============================+===+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | |
+====+=============+=========================+======+===============+======+=========+======+======+=============================+===+
| 1 | PRIMARY | app_order_rows | ALL | NULL | NULL | NULL | NULL | 1809 | Using where; Using filesort | 1 |
+----+-------------+-------------------------+------+---------------+------+---------+------+------+-----------------------------+---+
| 2 | SUBQUERY | app_order_rows_subquery | ALL | NULL | NULL | NULL | NULL | 1809 | Using where | 2 |
+====+=============+=========================+======+===============+======+=========+======+======+=============================+===+
This works, but with large orders or invoices the documents render very slowly. This might be due to the subquery.
Does anyone have an idea how to do the same select without a subquery? Maybe we will have to take a whole new approach to selecting rows and building the PDF. We are open to suggestions ^^
Thanks in advance
------------------------------- edit ------------------------------
The EXPLAIN after index creation:
+====+=============+=========================+=======+===============+============+=========+=======+======+=============+===+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | |
+====+=============+=========================+=======+===============+============+=========+=======+======+=============+===+
| 1 | PRIMARY | app_order_rows | ref | index_main | index_main | 5 | const | 9 | Using where | 1 |
+----+-------------+-------------------------+-------+---------------+------------+---------+-------+------+-------------+---+
| 2 | SUBQUERY | app_order_rows_subquery | range | index_main | index_main | 10 | NULL | 1 | Using where | 2 |
+====+=============+=========================+=======+===============+============+=========+=======+======+=============+===+
As you confirmed in the comments, the tables have no indexes.
The immediate solution would be:
create index index_main on app_order_rows (order_id, position);
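Separately from the index, one possible way to avoid the subquery altogether is to read the rows once, in position order, and keep the running total in application code. The sketch below is not from the answer; it uses Python's sqlite3 module with invented rows, and MAX_ROWS is a hypothetical stand-in for $this->maxRows.

```python
import sqlite3

MAX_ROWS = 45  # hypothetical number of lines available per PDF page

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE app_order_rows "
    "(order_id INTEGER, position INTEGER, row_span INTEGER, visible INTEGER)"
)
# Invented rows for order 834: (position, row_span).
conn.executemany(
    "INSERT INTO app_order_rows VALUES (834, ?, ?, 1)",
    [(1, 20), (2, 25), (3, 10), (4, 40)],
)

pages = {}
span_total = 0
cursor = conn.execute(
    "SELECT position, row_span FROM app_order_rows "
    "WHERE order_id = ? AND visible = 1 ORDER BY position",
    (834,),
)
for position, row_span in cursor:
    span_total += row_span
    # Same rule as the HAVING clause: page i holds the rows with
    # (i - 1) * MAX_ROWS < span_total <= i * MAX_ROWS.
    page = (span_total - 1) // MAX_ROWS + 1
    pages.setdefault(page, []).append(position)

print(pages)
```

This issues one query per document instead of one per page, at the cost of doing the page assignment outside SQL.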

Select foreign key (group) where is the biggest match

I have three tables: group_sentences, group_sentences_attributes and group_sentences_categories.
I have an attributes array which I use in the query with IN (after implode).
Then I have one category ID; because categories are stored recursively, there is no need for an array.
I need to select the one group number that has the most matches for $attributesArray and, of course, also matches the category.
Here is table group_sentences_attributes
+-----+-------+-----------+
| id | group | attribute |
+-----+-------+-----------+
| 1 | 1 | 3564 |
| 2 | 1 | 3687 |
| 3 | 1 | 3689 |
| 4 | 2 | 3687 |
| 5 | 2 | 3564 |
+-----+-------+-----------+
Here is group_sentences_category
+-----+-------+----------+
| id | group | category |
+-----+-------+----------+
| 1 | 1 | 1564 |
| 2 | 1 | 1221 |
| 3 | 1 | 1756 |
| 4 | 2 | 1358 |
| 5 | 2 | 1125 |
+-----+-------+----------+
Here is my query, but I am afraid that it won't get the job done.
SELECT group_categories.group
FROM group_categories, group_attributes
WHERE group_categories.category = '$category'
AND group_attributes.attribute IN ($attributesArray)
GROUP BY group_categories.group
ORDER BY count(group_attributes.attribute)
Any help would be appreciated, thanks.
First, the tables in your query do not match the tables in the question. I am guessing they are simply missing the "sentences" part. Then, you have no join condition, so the query produces a cross join. Simple rule: never use commas in the FROM clause; always use explicit JOIN syntax.
group is a lousy name for a column, because it is a keyword in SQL, so the query below assumes it has been renamed to groupid. The following may be what you are looking for:
SELECT sa.groupid
FROM group_sentences_attributes sa JOIN
group_sentences_category sc
ON sa.groupid = sc.groupid
WHERE sc.category = '$category' AND
sa.attribute IN ($attributesArray)
GROUP BY sa.groupid
ORDER BY count(sa.attribute) DESC;
If you only want one row, then add LIMIT 1 to the end.
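A small sketch of the "biggest match" query, run with Python's sqlite3 module instead of MySQL. It assumes the column has been renamed to groupid, sorts by match count in descending order with LIMIT 1, and binds the values as parameters rather than interpolating them into the SQL string. The extra category row for group 2 is hypothetical, added so that both groups compete.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE group_sentences_attributes
    (id INTEGER PRIMARY KEY, groupid INTEGER, attribute INTEGER);
CREATE TABLE group_sentences_category
    (id INTEGER PRIMARY KEY, groupid INTEGER, category INTEGER);
INSERT INTO group_sentences_attributes VALUES
    (1, 1, 3564), (2, 1, 3687), (3, 1, 3689), (4, 2, 3687), (5, 2, 3564);
INSERT INTO group_sentences_category VALUES
    (1, 1, 1564), (2, 1, 1221), (3, 1, 1756), (4, 2, 1358), (5, 2, 1125),
    (6, 2, 1564);  -- hypothetical row so group 2 also matches the category
""")

category = 1564
attributes = [3564, 3687, 3689]

# One "?" placeholder per attribute, so values are bound, not concatenated.
placeholders = ", ".join("?" for _ in attributes)
sql = f"""
    SELECT sa.groupid
    FROM group_sentences_attributes sa
    JOIN group_sentences_category sc ON sa.groupid = sc.groupid
    WHERE sc.category = ?
      AND sa.attribute IN ({placeholders})
    GROUP BY sa.groupid
    ORDER BY COUNT(sa.attribute) DESC
    LIMIT 1
"""
row = conn.execute(sql, [category, *attributes]).fetchone()
best_group = row[0] if row else None
print(best_group)
```

Group 1 matches all three attributes and group 2 only two, so group 1 is returned.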

SQL: get data spread over 3 tables

I am trying to get some statistics for an online game I maintain. I am looking for an SQL statement that produces the result shown at the bottom.
There are three tables:
A table with teams, each having a unique identifier.
table teams
---------------------
| teamid | teamname |
|--------|----------|
| 1 | team_a |
| 2 | team_x |
---------------------
A table with players, each having a unique identifier and optionally an affiliation to one team by its unique teamid.
table players
--------------------------------
| playerid | teamid | username |
|----------|--------|----------|
| 1 | 1 | user_a |
| 2 | | user_b |
| 3 | 2 | user_c |
| 4 | 2 | user_d |
| 5 | 1 | user_e |
--------------------------------
Finally a table with events. The event (duration in seconds) is related to one of the players through their playerid.
table events.
-----------------------
| playerid | duration |
|----------|----------|
| 1 | 2 |
| 2 | 5 |
| 3 | 3 |
| 4 | 8 |
| 5 | 12 |
| 3 | 4 |
-----------------------
I am trying to get a result where the durations of all team members are summed up.
result
--------------------------
| teamid | SUM(duration) |
|--------|---------------|
| 1 | 14 | (2+12)
| 2 | 15 | (3+8+4)
--------------------------
I tried several combinations of UNION, WHERE IN, JOIN and GROUP but could not get it right. I am using PostgreSQL and PHP. Can anyone help me?
Just use sum with group by:
select t.teamid, sum(e.duration)
from teams t
join players p on t.teamid = p.teamid
join events e on p.playerid = e.playerid
group by t.teamid
If you need all teams to be returned even if they don't have events, then use an outer join instead.
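The sum with group by approach can be checked against the sample data from the question. This sketch uses Python's sqlite3 module as a stand-in for PostgreSQL; the player without a team is stored with a NULL teamid, which is an assumption about how the blank cell is represented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teams (teamid INTEGER PRIMARY KEY, teamname TEXT);
CREATE TABLE players (playerid INTEGER PRIMARY KEY, teamid INTEGER, username TEXT);
CREATE TABLE events (playerid INTEGER, duration INTEGER);
INSERT INTO teams VALUES (1, 'team_a'), (2, 'team_x');
INSERT INTO players VALUES
    (1, 1, 'user_a'), (2, NULL, 'user_b'), (3, 2, 'user_c'),
    (4, 2, 'user_d'), (5, 1, 'user_e');
INSERT INTO events VALUES (1, 2), (2, 5), (3, 3), (4, 8), (5, 12), (3, 4);
""")

# The inner joins drop user_b, who has no team, so only durations of
# team members are summed.
totals = conn.execute("""
    SELECT t.teamid, SUM(e.duration)
    FROM teams t
    JOIN players p ON t.teamid = p.teamid
    JOIN events e ON p.playerid = e.playerid
    GROUP BY t.teamid
    ORDER BY t.teamid
""").fetchall()
print(totals)
```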
Try this
SELECT teams.teamid, SUM(events.duration) AS total_duration
FROM teams
JOIN players ON teams.teamid = players.teamid
JOIN events ON players.playerid = events.playerid
GROUP BY teams.teamid
http://www.w3computing.com/sqlserver/inner-joins-join-two-tables/
