MySQL fulltext index query does not use any index?

I have a simple table created like this:
CREATE TABLE IF NOT EXISTS metadata (
id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
title varchar(500),
category varchar(50),
uuid varchar(20),
FULLTEXT(title, category)
) ENGINE=InnoDB;
When I run a fulltext search, it takes 2.5s on 1M rows. EXPLAIN shows that the query does not use any index:
mysql> explain SELECT uuid, title, category, MATCH(title, category) AGAINST ('grimm' IN NATURAL LANGUAGE MODE) AS score FROM metadata HAVING score > 0 limit 20;
+----+-------------+----------+------------+------+---------------+------+---------+------+---------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+---------------+------+---------+------+---------+----------+-------+
| 1 | SIMPLE | metadata | NULL | ALL | NULL | NULL | NULL | NULL | 1036202 | 100.00 | NULL |
+----+-------------+----------+------------+------+---------------+------+---------+------+---------+----------+-------+
Is that expected? How can I speed this up?

Your query reads every row in the table, computes the natural language match score for each one, and only then filters the results (still one per row) in the HAVING clause. That is a full table scan, which is what type ALL in the EXPLAIN output means.
Put the fulltext search into the WHERE clause instead, so that the fulltext index is used to find the matching rows.
mysql> explain SELECT uuid, title, category FROM metadata
WHERE MATCH(title, category) AGAINST ('grimm' IN NATURAL LANGUAGE MODE)
LIMIT 20;
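If the relevance score from the original query is still needed, a sketch along these lines keeps it while the WHERE clause does the filtering. It assumes the metadata table and the FULLTEXT(title, category) index from the question. The MySQL manual notes that the optimizer recognizes the two identical MATCH() expressions and runs the fulltext search only once.

```sql
-- Sketch: filter through the fulltext index in WHERE, keep the score for ordering.
SELECT uuid, title, category,
       MATCH(title, category) AGAINST ('grimm' IN NATURAL LANGUAGE MODE) AS score
FROM metadata
WHERE MATCH(title, category) AGAINST ('grimm' IN NATURAL LANGUAGE MODE)
ORDER BY score DESC
LIMIT 20;
```

Running EXPLAIN on this form should report type fulltext rather than ALL if the index is being used.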

Related

Why this MySQL query is faster without index?

I am having trouble understanding why my MySQL query runs faster when I change it to use no indexes.
My first query takes 0.236s to run:
SELECT
u.id,
u.email,
CONCAT(u.first_name, ' ', u.last_name) AS u_name
FROM
tbl_user AS u
WHERE
u.site_id=1
AND u.role_id=5
AND u.removed_date IS NULL
ORDER BY
u_name ASC
LIMIT 0, 20
My second query takes 0.147s to run:
SELECT
u.id,
u.email,
CONCAT(u.first_name, ' ', u.last_name) AS u_name
FROM
tbl_user AS u USE INDEX ()
WHERE
u.site_id=1
AND u.role_id=5
AND u.removed_date IS NULL
ORDER BY
u_name ASC
LIMIT 0, 20
I have a unique index named idx_1 on columns site_id, role_id and email.
The EXPLAIN statement tells that it will use idx_1.
+----+-------------+-------+------+-------------------------------------+-------+---------+-------------+-------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+-------------------------------------+-------+---------+-------------+-------+----------------------------------------------------+
| 1 | SIMPLE | u | ref | idx_1,idx_import,tbl_user_ibfk_2 | idx_1 | 8 | const,const | 55006 | Using index condition; Using where; Using filesort |
+----+-------------+-------+------+-------------------------------------+-------+---------+-------------+-------+----------------------------------------------------+
The table has about 110000 records.
Thanks
UPDATE 1:
Below is the list of my table indexes:
Name Fields Type Method
---------------------------------------------------------------
idx_1 site_id, role_id, email Unique BTREE
idx_import site_id, external_id Unique BTREE
tbl_user_ibfk_2 role_id Normal BTREE
tbl_user_ibfk_3 country_id Normal BTREE
tbl_user_ibfk_4 preferred_country_id Normal BTREE
---------------------------------------------------------------
You haven't said which MySQL version you are using. Does this explain it?
Prior to MySQL 5.1.17, USE INDEX, IGNORE INDEX, and FORCE INDEX affect only which indexes are used when MySQL decides how to find rows in the table and how to process joins. They do not affect whether an index is used when resolving an ORDER BY or GROUP BY clause.
from https://dev.mysql.com/doc/refman/5.1/en/index-hints.html
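To check whether that note applies, the server version can be queried directly. On 5.1.17 and later, a hint can also be scoped with a FOR clause. The following is only a sketch against the tbl_user table and idx_1 index from the question; ordering by email is an assumption made for illustration, and the optimizer may still pick a different plan.

```sql
-- Which server version is this?
SELECT VERSION();

-- Sketch: idx_1 stays available for finding rows,
-- but is ignored when resolving the ORDER BY.
SELECT u.id, u.email
FROM tbl_user AS u IGNORE INDEX FOR ORDER BY (idx_1)
WHERE u.site_id = 1
  AND u.role_id = 5
  AND u.removed_date IS NULL
ORDER BY u.email
LIMIT 0, 20;
```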

Search feature for my site

I have a requirement to add a search feature to a site I'm building and was wondering if anyone has done something similar.
I have a sample table that contains the details of cats in this format:
Name, place, type, age, gender and size.
And I only have one search box where users can enter their search terms. My question is, how do I search the table if, for example someone types in "cat in Paris"?
I want to be able to search all the fields and return something if found.
Is there any way to achieve this rather than having lots of boxes for them to select a search criteria? Any help or suggestion would be appreciated.
One of the simpler approaches that works very well in this situation is a fulltext search in MySQL. You can have it index all of the text columns and do a natural language search.
If you had a mysql table called cats with the following schema:
mysql> desc cats;
+--------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| name | varchar(100) | YES | MUL | NULL | |
| place | varchar(100) | YES | | NULL | |
| type | varchar(100) | YES | | NULL | |
| age | int(11) | YES | | NULL | |
| gender | varchar(100) | YES | | NULL | |
| size | varchar(100) | YES | | NULL | |
+--------+--------------+------+-----+---------+----------------+
You can run the following SQL to create the index:
CREATE FULLTEXT INDEX cats_search ON cats (name, type, place, gender);
Then when you get the search string 'male tabby in paris' you can search the table with this SQL:
SELECT *
, MATCH(name, type, place, gender)
AGAINST ('male tabby in paris' IN BOOLEAN MODE) relevance
FROM cats
WHERE MATCH(name, type, place, gender)
AGAINST ('male tabby in paris' IN BOOLEAN MODE)
ORDER BY relevance DESC;
This will return all of the rows that match those terms, in the order MySQL decides is most relevant.
You will have to research MySQL fulltext searches to fine-tune the results the way you want, but this should get you off the ground.
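As one example of such fine-tuning: in BOOLEAN MODE, words without an operator are optional, while a leading + makes a word required. The sketch below assumes the cats table and cats_search index from above, and assumes the application has already split the user's input and dropped filler words such as "in" (which, under default settings, is shorter than the minimum indexed word length anyway).

```sql
-- Sketch: '+' marks a word as required in BOOLEAN MODE,
-- so only rows containing all three words are returned.
SELECT *,
       MATCH(name, type, place, gender)
           AGAINST ('+male +tabby +paris' IN BOOLEAN MODE) AS relevance
FROM cats
WHERE MATCH(name, type, place, gender)
      AGAINST ('+male +tabby +paris' IN BOOLEAN MODE)
ORDER BY relevance DESC;
```

Requiring every word gives fewer but tighter results; leaving the operators off, as in the query above, returns rows that match any of the words.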

How to avoid "Using temporary" in many-to-many queries?

This query is very simple, all I want to do, is get all the articles in given category ordered by last_updated field:
SELECT
`articles`.*
FROM
`articles`,
`articles_to_categories`
WHERE
`articles`.`id` = `articles_to_categories`.`article_id`
AND `articles_to_categories`.`category_id` = 1
ORDER BY `articles`.`last_updated` DESC
LIMIT 0, 20;
But it runs very slow. Here is what EXPLAIN said:
select_type table type possible_keys key key_len ref rows Extra
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SIMPLE articles_to_categories ref article_id,category_id article_id 5 const 5016 Using where; Using temporary; Using filesort
SIMPLE articles eq_ref PRIMARY PRIMARY 4 articles_to_categories.article_id 1
Is there a way to rewrite this query or add additional logic to my PHP scripts to avoid Using temporary; Using filesort and speed things up?
The table structure:
*articles*
id | title | content | last_updated
*articles_to_categories*
article_id | category_id
UPDATE
I have last_updated indexed. I guess my situation is explained in documentation:
In some cases, MySQL cannot use indexes to resolve the ORDER BY, although it still uses indexes to find the rows that match the WHERE clause. These cases include the following:
The key used to fetch the rows is not the same as the one used in the ORDER BY:
SELECT * FROM t1 WHERE key2=constant ORDER BY key1;
You are joining many tables, and the columns in the ORDER BY are not all from the first nonconstant table that is used to retrieve rows. (This is the first table in the EXPLAIN output that does not have a const join type.)
but I still have no idea how to fix this.
Here's a simplified example I put together some time ago for a similar performance question. It takes advantage of InnoDB clustered primary key indexes (which are only available with InnoDB).
http://dev.mysql.com/doc/refman/5.0/en/innodb-index-types.html
http://www.xaprb.com/blog/2006/07/04/how-to-exploit-mysql-index-optimizations/
You have 3 tables: category, product and product_category as follows:
drop table if exists product;
create table product
(
prod_id int unsigned not null auto_increment primary key,
name varchar(255) not null unique
)
engine = innodb;
drop table if exists category;
create table category
(
cat_id mediumint unsigned not null auto_increment primary key,
name varchar(255) not null unique
)
engine = innodb;
drop table if exists product_category;
create table product_category
(
cat_id mediumint unsigned not null,
prod_id int unsigned not null,
primary key (cat_id, prod_id) -- **note the clustered composite index** !!
)
engine = innodb;
The most important thing is the column order of the product_category clustered composite primary key, as typical queries for this scenario always lead with cat_id = x or cat_id in (x,y,z...).
We have 500K categories, 1 million products and 125 million product categories.
select count(*) from category;
+----------+
| count(*) |
+----------+
| 500000 |
+----------+
select count(*) from product;
+----------+
| count(*) |
+----------+
| 1000000 |
+----------+
select count(*) from product_category;
+-----------+
| count(*) |
+-----------+
| 125611877 |
+-----------+
So let's see how this schema performs for a query similar to yours. All queries are run cold (after mysql restart) with empty buffers and no query caching.
select
p.*
from
product p
inner join product_category pc on
pc.cat_id = 4104 and pc.prod_id = p.prod_id
order by
p.prod_id desc -- sorry, there is no date field in this sample table; it won't make any difference though
limit 20;
+---------+----------------+
| prod_id | name |
+---------+----------------+
| 993561 | Product 993561 |
| 991215 | Product 991215 |
| 989222 | Product 989222 |
| 986589 | Product 986589 |
| 983593 | Product 983593 |
| 982507 | Product 982507 |
| 981505 | Product 981505 |
| 981320 | Product 981320 |
| 978576 | Product 978576 |
| 973428 | Product 973428 |
| 959384 | Product 959384 |
| 954829 | Product 954829 |
| 953369 | Product 953369 |
| 951891 | Product 951891 |
| 949413 | Product 949413 |
| 947855 | Product 947855 |
| 947080 | Product 947080 |
| 945115 | Product 945115 |
| 943833 | Product 943833 |
| 942309 | Product 942309 |
+---------+----------------+
20 rows in set (0.70 sec)
explain
select
p.*
from
product p
inner join product_category pc on
pc.cat_id = 4104 and pc.prod_id = p.prod_id
order by
p.prod_id desc -- sorry, there is no date field in this sample table; it won't make any difference though
limit 20;
+----+-------------+-------+--------+---------------+---------+---------+------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+------------------+------+----------------------------------------------+
| 1 | SIMPLE | pc | ref | PRIMARY | PRIMARY | 3 | const | 499 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | p | eq_ref | PRIMARY | PRIMARY | 4 | vl_db.pc.prod_id | 1 | |
+----+-------------+-------+--------+---------------+---------+---------+------------------+------+----------------------------------------------+
2 rows in set (0.00 sec)
So that's 0.70 seconds cold - ouch.
Hope this helps :)
EDIT
Having just read your reply to my comment above it seems you have one of two choices to make:
create table articles_to_categories
(
article_id int unsigned not null,
category_id mediumint unsigned not null,
primary key(article_id, category_id), -- good for queries that lead with article_id = x
key (category_id)
)
engine=innodb;
or
create table categories_to_articles
(
article_id int unsigned not null,
category_id mediumint unsigned not null,
primary key(category_id, article_id), -- good for queries that lead with category_id = x
key (article_id)
)
engine=innodb;
How you define your clustered PK depends on your typical queries.
You should be able to avoid the filesort by adding a key on articles.last_updated. MySQL normally uses a filesort to satisfy the ORDER BY, but it can skip it when the rows can be read in order from an index on the ORDER BY column (with some limitations).
For much more info, see here: http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html
I assume you have made the following in your db:
1) articles -> id is a primary key
2) articles_to_categories -> article_id is a foreign key of articles -> id
3) you can create index on category_id
ALTER TABLE articles ADD INDEX (last_updated);
ALTER TABLE articles_to_categories ADD INDEX (article_id);
That should do it. The intended plan is to find the first few records using the first index and do the JOIN using the second one. If the optimizer does not choose that plan, try STRAIGHT_JOIN to force the join order.
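A sketch of the STRAIGHT_JOIN variant, using the table and column names from the question. It assumes the index on articles.last_updated exists and that articles_to_categories can be probed by article_id; whether it is actually faster depends on how many of the most recent articles fall into the category, so compare the EXPLAIN output and timings.

```sql
-- Sketch: SELECT STRAIGHT_JOIN makes MySQL join the tables in the order listed,
-- so articles is read first (ideally in last_updated order via its index)
-- and articles_to_categories is only probed per article.
SELECT STRAIGHT_JOIN a.*
FROM articles AS a
INNER JOIN articles_to_categories AS ac
        ON ac.article_id = a.id
WHERE ac.category_id = 1
ORDER BY a.last_updated DESC
LIMIT 0, 20;
```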

Recursive-ish query for tags?

I have a table of tags that can be linked to other tags, and I want to "recursively" select the tags in order of arrangement. When a search is made, we get the immediate (1-level) results and then carry on down to, say, 5 levels, so that we always have a list of tags even if there weren't enough exact matches on level 1.
I can manage this fine with making multiple queries until I get enough results, but surely there is a better, optimized, way via a one-trip query?
Any tips will be appreciated.
Thanks!
Results:
tagId, tagWord, child, child tagId
'513', 'Slap', 'Hog Slapper', '1518'
'513', 'Slap', 'Corporal Punishment', '147'
'513', 'Slap', 'Impact Play', '1394'
Query:
SELECT t.tagId, t.tagWord as tag, tt.tagWord as child, tt.tagId as childId
FROM platform.tagWords t
INNER JOIN platform.tagsLinks l ON l.parentId = t.tagId
INNER JOIN platform.tagWords tt ON tt.tagId = l.tagId
WHERE t.tagWord = 'slap'
Table Layouts:
mysql> explain tagWords;
+---------+---------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------+---------------------+------+-----+---------+----------------+
| tagId | bigint(20) unsigned | NO | PRI | NULL | auto_increment |
| tagWord | varchar(45) | YES | UNI | NULL | |
+---------+---------------------+------+-----+---------+----------------+
2 rows in set (0.00 sec)
mysql> explain tagsLinks;
+----------+---------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------+---------------------+------+-----+---------+-------+
| tagId | bigint(20) unsigned | NO | | NULL | |
| parentId | bigint(20) | YES | | NULL | |
+----------+---------------------+------+-----+---------+-------+
2 rows in set (0.00 sec)
AFAIK MySQL doesn't have any mechanism for querying data recursively (this holds for versions before 8.0, which added recursive common table expressions).
Oracle has the CONNECT BY construct and SQL Server has CTEs (Common Table Expressions).
But for MySQL,
Read Here and Here
Here are the options that I consider each time I find myself in a situation when I need to query hierarchical data.
Nested Sets
Path enumeration
Explicit joins (when the maximum level is known)
Vendor Extensions (SQL Server CTE, Oracle Connect by etc)
Stored Procedures
Suck it up
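On MySQL 8.0 and later, a recursive common table expression can do this in one round trip. The following is a sketch only, using the tagWords and tagsLinks tables from the question; the CTE name tagTree and the depth column are made up for illustration, and the limit of 5 levels comes from the question.

```sql
-- Sketch (MySQL 8.0+): walk the tag links level by level, up to 5 levels deep.
WITH RECURSIVE tagTree (tagId, tagWord, depth) AS (
    -- Anchor: the tag that was searched for (level 0).
    SELECT t.tagId, t.tagWord, 0
    FROM platform.tagWords AS t
    WHERE t.tagWord = 'slap'
    UNION ALL
    -- Recursive step: children of the tags found on the previous level.
    SELECT c.tagId, c.tagWord, tt.depth + 1
    FROM tagTree AS tt
    INNER JOIN platform.tagsLinks AS l ON l.parentId = tt.tagId
    INNER JOIN platform.tagWords AS c ON c.tagId = l.tagId
    WHERE tt.depth < 5
)
SELECT tagId, tagWord, depth
FROM tagTree
WHERE depth > 0
ORDER BY depth, tagWord;
```

The depth check also bounds the recursion if the links contain cycles, although a tag reachable by several paths can then appear more than once. On older versions, the explicit-joins option from the list above is the closest equivalent.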

Finding mySQL duplicates, then merging data

I have a mySQL database with a tad under 2 million rows. The database is non-interactive, so efficiency isn't key.
The (simplified) structure I have is:
`id` int(11) NOT NULL auto_increment
`category` varchar(64) NOT NULL
`productListing` varchar(256) NOT NULL
The problem I would like to solve is this: I want to find duplicates on the productListing field and merge their category values into a single row, deleting the duplicates.
So given the following data:
+----+-----------+---------------------------+
| id | category | productListing |
+----+-----------+---------------------------+
| 1 | Category1 | productGroup1 |
| 2 | Category2 | productGroup1 |
| 3 | Category3 | anotherGroup9 |
+----+-----------+---------------------------+
What I want to end up with is:
+----+----------------------+---------------------------+
| id | category | productListing |
+----+----------------------+---------------------------+
| 1 | Category1,Category2 | productGroup1 |
| 3 | Category3 | anotherGroup9 |
+----+----------------------+---------------------------+
What's the most efficient way to do this either in pure mySQL query or php?
I think you're looking for GROUP_CONCAT:
SELECT GROUP_CONCAT(category), productListing
FROM YourTable
GROUP BY productListing
I would create a new table, inserting the updated values, delete the old one and rename the new table to the old one's name:
CREATE TABLE new_YourTable SELECT GROUP_CONCAT(...;
DROP TABLE YourTable;
RENAME TABLE new_YourTable TO YourTable;
-- don't forget to add triggers, indexes, foreign keys, etc. to new table
SELECT MIN(id), GROUP_CONCAT(category ORDER BY id SEPARATOR ','), productListing
FROM mytable
GROUP BY
productListing
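Putting the pieces from these answers together, a possible end-to-end sketch looks like this. The table names YourTable and new_YourTable follow the answer above; old_YourTable is a hypothetical backup name, and the raised group_concat_max_len value is an assumption to be sized to the data.

```sql
-- GROUP_CONCAT output is truncated at group_concat_max_len (default 1024),
-- so raise it for this session before merging.
SET SESSION group_concat_max_len = 65535;

-- Build the merged copy: one row per productListing, keeping the lowest id.
CREATE TABLE new_YourTable AS
SELECT MIN(id) AS id,
       GROUP_CONCAT(category ORDER BY id SEPARATOR ',') AS category,
       productListing
FROM YourTable
GROUP BY productListing;

-- Swap the tables in one statement; the original stays around as a backup.
RENAME TABLE YourTable TO old_YourTable,
             new_YourTable TO YourTable;
```

Renaming instead of dropping keeps the original data until the result has been checked. The new table has no primary key, AUTO_INCREMENT, or indexes, so those still need to be added afterwards.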
