How to avoid "Using temporary" in many-to-many queries?

This query is very simple. All I want to do is get all the articles in a given category, ordered by the last_updated field:
SELECT
`articles`.*
FROM
`articles`,
`articles_to_categories`
WHERE
`articles`.`id` = `articles_to_categories`.`article_id`
AND `articles_to_categories`.`category_id` = 1
ORDER BY `articles`.`last_updated` DESC
LIMIT 0, 20;
But it runs very slowly. Here is what EXPLAIN said:
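+-------------+------------------------+--------+------------------------+------------+---------+-----------------------------------+------+----------------------------------------------+
| select_type | table                  | type   | possible_keys          | key        | key_len | ref                               | rows | Extra                                        |
+-------------+------------------------+--------+------------------------+------------+---------+-----------------------------------+------+----------------------------------------------+
| SIMPLE      | articles_to_categories | ref    | article_id,category_id | article_id | 5       | const                             | 5016 | Using where; Using temporary; Using filesort |
| SIMPLE      | articles               | eq_ref | PRIMARY                | PRIMARY    | 4       | articles_to_categories.article_id | 1    |                                              |
+-------------+------------------------+--------+------------------------+------------+---------+-----------------------------------+------+----------------------------------------------+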
Is there a way to rewrite this query, or add additional logic to my PHP scripts, to avoid Using temporary; Using filesort and speed things up?
The table structure:
*articles*
id | title | content | last_updated
*articles_to_categories*
article_id | category_id
UPDATE
I have last_updated indexed. I guess my situation is explained in the documentation:
In some cases, MySQL cannot use indexes to resolve the ORDER BY, although it still uses indexes to find the rows that match the WHERE clause. These cases include the following:
The key used to fetch the rows is not the same as the one used in the ORDER BY:
SELECT * FROM t1 WHERE key2=constant ORDER BY key1;
You are joining many tables, and the columns in the ORDER BY are not all from the first nonconstant table that is used to retrieve rows. (This is the first table in the EXPLAIN output that does not have a const join type.)
but I still have no idea how to fix this.

Here's a simplified example I put together for a similar performance-related question some time ago. It takes advantage of InnoDB clustered primary key indexes (obviously only available with InnoDB!).
http://dev.mysql.com/doc/refman/5.0/en/innodb-index-types.html
http://www.xaprb.com/blog/2006/07/04/how-to-exploit-mysql-index-optimizations/
You have 3 tables: category, product and product_category as follows:
drop table if exists product;
create table product
(
prod_id int unsigned not null auto_increment primary key,
name varchar(255) not null unique
)
engine = innodb;
drop table if exists category;
create table category
(
cat_id mediumint unsigned not null auto_increment primary key,
name varchar(255) not null unique
)
engine = innodb;
drop table if exists product_category;
create table product_category
(
cat_id mediumint unsigned not null,
prod_id int unsigned not null,
primary key (cat_id, prod_id) -- **note the clustered composite index** !!
)
engine = innodb;
The most important thing is the column order of the product_category clustered composite primary key, because typical queries for this scenario always lead with cat_id = x or cat_id in (x,y,z...).
We have 500K categories, 1 million products and 125 million product categories.
select count(*) from category;
+----------+
| count(*) |
+----------+
| 500000 |
+----------+
select count(*) from product;
+----------+
| count(*) |
+----------+
| 1000000 |
+----------+
select count(*) from product_category;
+-----------+
| count(*) |
+-----------+
| 125611877 |
+-----------+
So let's see how this schema performs for a query similar to yours. All queries are run cold (after mysql restart) with empty buffers and no query caching.
select
p.*
from
product p
inner join product_category pc on
pc.cat_id = 4104 and pc.prod_id = p.prod_id
order by
p.prod_id desc -- sorry, no date field in this sample table; it won't make any difference though
limit 20;
+---------+----------------+
| prod_id | name |
+---------+----------------+
| 993561 | Product 993561 |
| 991215 | Product 991215 |
| 989222 | Product 989222 |
| 986589 | Product 986589 |
| 983593 | Product 983593 |
| 982507 | Product 982507 |
| 981505 | Product 981505 |
| 981320 | Product 981320 |
| 978576 | Product 978576 |
| 973428 | Product 973428 |
| 959384 | Product 959384 |
| 954829 | Product 954829 |
| 953369 | Product 953369 |
| 951891 | Product 951891 |
| 949413 | Product 949413 |
| 947855 | Product 947855 |
| 947080 | Product 947080 |
| 945115 | Product 945115 |
| 943833 | Product 943833 |
| 942309 | Product 942309 |
+---------+----------------+
20 rows in set (0.70 sec)
explain
select
p.*
from
product p
inner join product_category pc on
pc.cat_id = 4104 and pc.prod_id = p.prod_id
order by
p.prod_id desc -- sorry, no date field in this sample table; it won't make any difference though
limit 20;
+----+-------------+-------+--------+---------------+---------+---------+------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+------------------+------+----------------------------------------------+
| 1 | SIMPLE | pc | ref | PRIMARY | PRIMARY | 3 | const | 499 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | p | eq_ref | PRIMARY | PRIMARY | 4 | vl_db.pc.prod_id | 1 | |
+----+-------------+-------+--------+---------------+---------+---------+------------------+------+----------------------------------------------+
2 rows in set (0.00 sec)
So that's 0.70 seconds cold - ouch.
Hope this helps :)
EDIT
Having just read your reply to my comment above it seems you have one of two choices to make:
create table articles_to_categories
(
article_id int unsigned not null,
category_id mediumint unsigned not null,
primary key(article_id, category_id), -- good for queries that lead with article_id = x
key (category_id)
)
engine=innodb;
or
create table categories_to_articles
(
article_id int unsigned not null,
category_id mediumint unsigned not null,
primary key(category_id, article_id), -- good for queries that lead with category_id = x
key (article_id)
)
engine=innodb;
How you define your clustered PK depends on your typical queries.
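As a rough sketch of what the second layout is meant to help with (the query shape below is an assumption and has not been run against the asker's data): with the primary key leading on category_id, all rows for one category sit together in the clustered index, so the join table can be read with a single index range lookup. Ordering by articles.last_updated may still need a sort, because that column lives in the other table, so check the plan with EXPLAIN on your own data.

```sql
-- Assumes the categories_to_articles layout above,
-- i.e. primary key (category_id, article_id).
EXPLAIN
SELECT a.*
FROM categories_to_articles AS ca
INNER JOIN articles AS a ON a.id = ca.article_id
WHERE ca.category_id = 1          -- leads with the first PK column
ORDER BY a.last_updated DESC
LIMIT 20;
```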

You should be able to avoid the filesort by adding a key on articles.last_updated. MySQL normally needs a filesort to satisfy the ORDER BY, but it can skip the filesort when you order by an indexed column (with some limitations).
For much more info, see here: http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html

I assume you have made the following in your db:
1) articles -> id is a primary key
2) articles_to_categories -> article_id is a foreign key of articles -> id
3) you can create an index on category_id

ALTER TABLE articles ADD INDEX (last_updated);
ALTER TABLE articles_to_categories ADD INDEX (article_id);
should do it. The right plan is to find the first few records using the first index and do the JOIN using the second one. If it doesn't work, try STRAIGHT_JOIN or something to enforce proper index usage.
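A minimal sketch of the STRAIGHT_JOIN idea, assuming the index on articles.last_updated from the ALTER TABLE statements above exists. STRAIGHT_JOIN makes MySQL read the left-hand table first, so it can walk the last_updated index in descending order and stop once 20 matching rows are found, instead of collecting and sorting every article in the category. The table aliases are only for illustration.

```sql
-- Force articles to be read first so the ORDER BY can follow
-- the last_updated index; each article is then probed in
-- articles_to_categories via the article_id index.
SELECT a.*
FROM articles AS a
STRAIGHT_JOIN articles_to_categories AS ac
    ON ac.article_id = a.id
WHERE ac.category_id = 1
ORDER BY a.last_updated DESC
LIMIT 20;
```

Whether this is faster depends on the data: if the category contains only a small share of all articles, many rows are probed before 20 matches turn up, so compare both plans with EXPLAIN.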

Related

MySQL fulltext index query does not use any index?

I have a simple table created like this
CREATE TABLE IF NOT EXISTS metadata (
id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
title varchar(500),
category varchar(50),
uuid varchar(20),
FULLTEXT(title, category)
) ENGINE=InnoDB;
When I execute a fulltext search, it takes 2.5s with 1M rows. So I ran EXPLAIN on the query, and it does not use any index:
mysql> explain SELECT uuid, title, category, MATCH(title, category) AGAINST ('grimm' IN NATURAL LANGUAGE MODE) AS score FROM metadata HAVING score > 0 limit 20;
+----+-------------+----------+------------+------+---------------+------+---------+------+---------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+---------------+------+---------+------+---------+----------+-------+
| 1 | SIMPLE | metadata | NULL | ALL | NULL | NULL | NULL | NULL | 1036202 | 100.00 | NULL |
+----+-------------+----------+------------+------+---------------+------+---------+------+---------+----------+-------+
Is that expected? How can I speed this up?

Your query fetches every row in the table, calculates the natural language match, and then passes the results (still for every row) to the HAVING clause. This is a table-scan.
You should try putting the fulltext-indexed search into the WHERE clause instead, to reduce the number of matching rows.
mysql> explain SELECT uuid, title, category FROM metadata
WHERE MATCH(title, category) AGAINST ('grimm' IN NATURAL LANGUAGE MODE)
LIMIT 20;
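If the relevance score is still needed in the result, a sketch like the following keeps it while letting the FULLTEXT index do the filtering. The MySQL manual describes this pattern and notes that an identical MATCH() expression in the select list and in the WHERE clause is evaluated only once, so repeating it should not add overhead.

```sql
SELECT uuid, title, category,
       MATCH(title, category) AGAINST ('grimm' IN NATURAL LANGUAGE MODE) AS score
FROM metadata
-- The same expression in WHERE restricts the rows via the FULLTEXT index.
WHERE MATCH(title, category) AGAINST ('grimm' IN NATURAL LANGUAGE MODE)
ORDER BY score DESC
LIMIT 20;
```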

Why this MySQL query is faster without index?

I am having trouble understanding why my MySQL query runs faster when I change it to use no indexes.
My first query takes 0.236s to run:
SELECT
u.id,
u.email,
CONCAT(u.first_name, ' ', u.last_name) AS u_name
FROM
tbl_user AS u
WHERE
u.site_id=1
AND u.role_id=5
AND u.removed_date IS NULL
ORDER BY
u_name ASC
LIMIT 0, 20
My second query takes 0.147s to run:
SELECT
u.id,
u.email,
CONCAT(u.first_name, ' ', u.last_name) AS u_name
FROM
tbl_user AS u USE INDEX ()
WHERE
u.site_id=1
AND u.role_id=5
AND u.removed_date IS NULL
ORDER BY
u_name ASC
LIMIT 0, 20
I have a unique index named idx_1 on columns site_id, role_id and email.
The EXPLAIN statement shows that it will use idx_1.
+----+-------------+-------+------+-------------------------------------+-------+---------+-------------+-------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+-------------------------------------+-------+---------+-------------+-------+----------------------------------------------------+
| 1 | SIMPLE | u | ref | idx_1,idx_import,tbl_user_ibfk_2 | idx_1 | 8 | const,const | 55006 | Using index condition; Using where; Using filesort |
+----+-------------+-------+------+-------------------------------------+-------+---------+-------------+-------+----------------------------------------------------+
The table has about 110000 records.
Thanks
UPDATE 1:
Below is the list of my table indexes:
Name Fields Type Method
---------------------------------------------------------------
idx_1 site_id, role_id, email Unique BTREE
idx_import site_id, external_id Unique BTREE
tbl_user_ibfk_2 role_id Normal BTREE
tbl_user_ibfk_3 country_id Normal BTREE
tbl_user_ibfk_4 preferred_country_id Normal BTREE
---------------------------------------------------------------

You haven't specified which MySQL version you are using. Does this explain it?
Prior to MySQL 5.1.17, USE INDEX, IGNORE INDEX, and FORCE INDEX affect only which indexes are used when MySQL decides how to find rows in the table and how to process joins. They do not affect whether an index is used when resolving an ORDER BY or GROUP BY clause.
from https://dev.mysql.com/doc/refman/5.1/en/index-hints.html
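To illustrate the scoping that the quoted passage refers to: from MySQL 5.1.17 onward, an index hint can carry an optional FOR clause that limits it to one phase of the query. The sketch below reuses tbl_user and idx_1 from the question, but the ORDER BY on email is a hypothetical variation chosen so that an index could actually help with the ordering.

```sql
-- Hint limited to row lookup and join processing.
SELECT u.id, u.email
FROM tbl_user AS u USE INDEX FOR JOIN (idx_1)
WHERE u.site_id = 1 AND u.role_id = 5
ORDER BY u.email
LIMIT 20;

-- Hint limited to ORDER BY resolution: do not use idx_1 for sorting.
SELECT u.id, u.email
FROM tbl_user AS u IGNORE INDEX FOR ORDER BY (idx_1)
WHERE u.site_id = 1 AND u.role_id = 5
ORDER BY u.email
LIMIT 20;
```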

MySQL Select from multiple tables

I'm new to MySQL. I am creating a checkout page in PHP. When the users select the items they want to buy and click "Add to Cart", a temporary table gets created which has the following fields (table name is temp):
+--------------+-----------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+-----------+------+-----+-------------------+----------------+
| Cart_Item_ID | int(11) | NO | PRI | NULL | auto_increment |
| Item_ID | int(11) | NO | | | |
| Added_On | timestamp | YES | | CURRENT_TIMESTAMP | |
+--------------+-----------+------+-----+-------------------+----------------+
I'm only inserting to the Item_ID field which contains the ID of each item they bought (I'm populating the forms with item IDs). What I want to do is look up the item's name and price that's stored in the Inventory table. Here's how that looks:
+--------------+----------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+----------------+------+-----+-------------------+----------------+
| Inventory_ID | int(11) | NO | PRI | NULL | auto_increment |
| Item_Name | varchar(40) | NO | | | |
| Item_Price | float unsigned | NO | | 0 | |
| Added_On | timestamp | YES | | CURRENT_TIMESTAMP | |
+--------------+----------------+------+-----+-------------------+----------------+
So how would I pull out the Item_name and Item_Price fields from the Inventory table based on the Item_ID field from the temp table so I can display it on the page? I just don't understand how to formulate the query. I'd appreciate any help. Thank you.

It's called a JOIN:
SELECT Inventory.Item_Name, Inventory.Item_Price
FROM Inventory, temp WHERE Inventory.Inventory_ID = temp.Item_ID

What I understand is that Item_ID in the temp table references Inventory_ID in the Inventory table. Based on this assumption, you can use the following query:
SELECT Item_Name, Item_Price FROM Inventory, temp WHERE temp.Item_ID = Inventory.Inventory_ID
I guess this is what you want to do.
Thanks

As it stands, you can't (unless Inventory_ID = Item_ID).
What you need is a way of JOINing the two tables together. In this instance, if Inventory_ID = Item_ID, then the following is possible:
SELECT Item_Name,
Item_Price
FROM Inventory
INNER JOIN temp ON (Inventory.Inventory_ID = temp.Item_ID)
If you want to filter for a particular item, you can add the constraint:
WHERE temp.Item_ID = 27 -- for example
That will join all the rows in your Inventory table with matching rows in the temp table.
Jeff Atwood has a great (IMO) visual explanation of how JOINs work.
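Putting the pieces together with the table names from the question (temp and Inventory), a complete query for the cart page might look like the sketch below. It assumes, as the answers do, that temp.Item_ID holds values of Inventory.Inventory_ID; the aliases t and i are only for readability.

```sql
-- One output row per cart entry, with the name and price looked up
-- from Inventory.
SELECT t.Cart_Item_ID,
       i.Item_Name,
       i.Item_Price
FROM temp AS t
INNER JOIN Inventory AS i
    ON i.Inventory_ID = t.Item_ID
ORDER BY t.Cart_Item_ID;
```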

Recursive-ish query for tags?

I have a table of tags that can be linked to other tags, and I want to "recursively" select the tags in order of arrangement. When a search is made, we get the immediate (1-level) results and then carry on down to, say, 5 levels, so that we always have a list of tags even if there weren't enough exact matches on level 1.
I can manage this fine with making multiple queries until I get enough results, but surely there is a better, optimized, way via a one-trip query?
Any tips will be appreciated.
Thanks!
Results:
tagId, tagWord, child, child tagId
'513', 'Slap', 'Hog Slapper', '1518'
'513', 'Slap', 'Corporal Punishment', '147'
'513', 'Slap', 'Impact Play', '1394'
Query:
SELECT t.tagId, t.tagWord as tag, tt.tagWord as child, tt.tagId as childId
FROM platform.tagWords t
INNER JOIN platform.tagsLinks l ON l.parentId = t.tagId
INNER JOIN platform.tagWords tt ON tt.tagId = l.tagId
WHERE t.tagWord = 'slap'
Table Layouts:
mysql> explain tagWords;
+---------+---------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------+---------------------+------+-----+---------+----------------+
| tagId | bigint(20) unsigned | NO | PRI | NULL | auto_increment |
| tagWord | varchar(45) | YES | UNI | NULL | |
+---------+---------------------+------+-----+---------+----------------+
2 rows in set (0.00 sec)
mysql> explain tagsLinks;
+----------+---------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------+---------------------+------+-----+---------+-------+
| tagId | bigint(20) unsigned | NO | | NULL | |
| parentId | bigint(20) | YES | | NULL | |
+----------+---------------------+------+-----+---------+-------+
2 rows in set (0.00 sec)

AFAIK MySQL doesn't have any mechanism for querying data recursively (this held before MySQL 8.0, which added recursive common table expressions).
Oracle has the CONNECT BY construct and SQL Server has CTEs (Common Table Expressions).
But Mysql,
Read Here and Here
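For readers on MySQL 8.0 or later, which supports WITH RECURSIVE, a one-trip query could be sketched as follows. It uses the tagWords and tagsLinks tables from the question; the CTE name tag_tree, the depth column, and the cap of 5 levels are assumptions for illustration.

```sql
WITH RECURSIVE tag_tree (tagId, tagWord, depth) AS (
    -- Anchor: the tag that was searched for.
    SELECT t.tagId, t.tagWord, 0
    FROM platform.tagWords AS t
    WHERE t.tagWord = 'slap'
  UNION ALL
    -- Recursive step: children of the tags found so far.
    SELECT c.tagId, c.tagWord, tree.depth + 1
    FROM tag_tree AS tree
    INNER JOIN platform.tagsLinks AS l ON l.parentId = tree.tagId
    INNER JOIN platform.tagWords  AS c ON c.tagId = l.tagId
    WHERE tree.depth < 5          -- stop after 5 levels
)
SELECT tagId, tagWord, depth
FROM tag_tree
WHERE depth > 0                   -- skip the searched tag itself
ORDER BY depth, tagWord
LIMIT 20;
```

The depth cap also keeps the recursion finite if the links happen to contain a cycle, although a tag reachable by several paths can then appear more than once.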

Here are the options that I consider each time I find myself in a situation when I need to query hierarchical data.
Nested Sets
Path enumeration
Explicit joins (when the maximum level is known)
Vendor Extensions (SQL Server CTE, Oracle Connect by etc)
Stored Procedures
Suck it up
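As an illustration of the "explicit joins" option from the list above, here is a sketch that fetches two levels below a tag using the tagWords and tagsLinks tables from the question. The aliases and the choice of two levels are assumptions; each further level needs another pair of LEFT JOINs.

```sql
SELECT t.tagWord  AS tag,
       c1.tagWord AS child,
       c2.tagWord AS grandchild
FROM platform.tagWords AS t
-- Level 1: direct children of the searched tag.
INNER JOIN platform.tagsLinks AS l1 ON l1.parentId = t.tagId
INNER JOIN platform.tagWords  AS c1 ON c1.tagId = l1.tagId
-- Level 2: children of those children, if any (NULL otherwise).
LEFT JOIN  platform.tagsLinks AS l2 ON l2.parentId = c1.tagId
LEFT JOIN  platform.tagWords  AS c2 ON c2.tagId = l2.tagId
WHERE t.tagWord = 'slap';
```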

Finding mySQL duplicates, then merging data

I have a mySQL database with a tad under 2 million rows. The database is non-interactive, so efficiency isn't key.
The (simplified) structure I have is:
`id` int(11) NOT NULL auto_increment
`category` varchar(64) NOT NULL
`productListing` varchar(256) NOT NULL
The problem I would like to solve: I want to find duplicates on the productListing field and merge their category values into a single row, deleting the duplicates.
So given the following data:
+----+-----------+---------------------------+
| id | category | productListing |
+----+-----------+---------------------------+
| 1 | Category1 | productGroup1 |
| 2 | Category2 | productGroup1 |
| 3 | Category3 | anotherGroup9 |
+----+-----------+---------------------------+
What I want to end up with is:
+----+----------------------+---------------------------+
| id | category | productListing |
+----+----------------------+---------------------------+
| 1 | Category1,Category2 | productGroup1 |
| 3 | Category3 | anotherGroup9 |
+----+----------------------+---------------------------+
What's the most efficient way to do this either in pure mySQL query or php?

I think you're looking for GROUP_CONCAT:
SELECT GROUP_CONCAT(category), productListing
FROM YourTable
GROUP BY productListing
I would create a new table, inserting the updated values, delete the old one and rename the new table to the old one's name:
CREATE TABLE new_YourTable SELECT GROUP_CONCAT(...;
DROP TABLE YourTable;
RENAME TABLE new_YourTable TO YourTable;
-- don't forget to add triggers, indexes, foreign keys, etc. to new table

SELECT MIN(id), GROUP_CONCAT(category ORDER BY id SEPARATOR ','), productListing
FROM mytable
GROUP BY
productListing
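Combining the two answers above, the whole merge could be sketched like this. The table names YourTable, new_YourTable and old_YourTable are placeholders, and the wider TEXT type for category is an assumption made so that merged lists are not cut off at 64 characters.

```sql
-- GROUP_CONCAT output is truncated at group_concat_max_len
-- (1024 by default), so raise it for this session first.
SET SESSION group_concat_max_len = 1000000;

CREATE TABLE new_YourTable (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    category TEXT NOT NULL,
    productListing VARCHAR(256) NOT NULL
);

-- One row per productListing, keeping the lowest id of each group.
INSERT INTO new_YourTable (id, category, productListing)
SELECT MIN(id),
       GROUP_CONCAT(category ORDER BY id SEPARATOR ','),
       productListing
FROM YourTable
GROUP BY productListing;

-- Swap the tables; the old data stays available until it is dropped.
RENAME TABLE YourTable TO old_YourTable,
             new_YourTable TO YourTable;
```

Keeping old_YourTable around until the result has been checked is a cautious variation on the DROP TABLE step suggested above.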
