MySQL WHERE clause slow performance on big table - PHP

I have a performance issue when working with a huge table.
I added an index on a column like this:
ALTER TABLE `table` ADD INDEX (`column`);
and on the text/blob column:
ALTER TABLE `table` ADD INDEX (cat(200));
My table has about 6M rows and I am using the InnoDB engine (MySQL 5.5).
This query is very fast now that I have indexed the ORDER BY column:
SELECT * FROM `table` ORDER BY `column` DESC LIMIT 0,40
But when I add a WHERE clause to this query it becomes very slow, taking about 10 seconds to load, even with the "cat" column indexed as above:
SELECT * FROM `table` WHERE cat = 'electronic' ORDER BY `column` DESC LIMIT 0,40
The EXPLAIN of this slow query:
EXPLAIN SELECT * FROM `table` WHERE cat = 'electronic' ORDER BY `id` DESC LIMIT 0,40
id: 1
select_type: SIMPLE
table: product
type: ref
possible_keys: cat
key: cat
key_len: 203
ref: const
rows: 1732184
Extra: Using where
The query works fine on a small table with 50k rows, but with 6M rows it is slow. Why?

Do not use prefixing, such as cat(200); it usually makes the index unusable. I have never seen a case where the Optimizer, when faced with INDEX(a(10), b), gets past a and makes any use of b.
Change cat to be VARCHAR(255). That is probably more than sufficient for "categories".
The best index (if it is possible) is
INDEX(cat, `column`)
Note that cat is in the WHERE with =. The index handles the entire WHERE, so it can move on to the ORDER BY; hence `column` can be used, too. More discussion of index creation.
If cat must be TEXT, then the best you can do is
INDEX(`column`)
Then the Optimizer may decide to use it to avoid a filesort. But if there are fewer than 40 (see LIMIT) 'electronic' rows, it will take a big scan and probably be slower than not using the index. So I am not sure it is even worth having INDEX(`column`).
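A hedged sketch of those two changes, assuming the table is named product (as the EXPLAIN output suggests) and that no existing cat value exceeds 255 characters:

-- convert the TEXT column so it can be indexed without a prefix
ALTER TABLE product MODIFY cat VARCHAR(255);
-- composite index: equality column first, then the ORDER BY column
ALTER TABLE product ADD INDEX cat_column (cat, `column`);

With that index in place, the WHERE is satisfied by the first part and the ORDER BY ... DESC LIMIT 40 can be read straight from the index (scanned backwards).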

For this query:
SELECT t.*
FROM table t
WHERE cat = 'electronic'
ORDER BY column DESC
LIMIT 0, 40;
The best index is a composite index on table(cat, column). You can use a prefix if column is too wide: table(cat, column(200)).
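As a hedged sketch (names taken from the question; TABLE and COLUMN are reserved words in MySQL, hence the backticks):

CREATE INDEX cat_column ON `table` (cat, `column`);
-- or, if `column` is too wide to index in full:
CREATE INDEX cat_column ON `table` (cat, `column`(200));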

The best option is to index the table; if you don't know how to do it, you can check the MySQL documentation.
Then, when you run the query, MySQL will search the indexed values, skipping a lot of data that is useless for that request.

Related

count() takes lots of time when using WHERE clause in MySQL

The table has approximately 100,000 records (tuples). Without a WHERE clause the query takes only a few milliseconds, whereas it takes 4-5 seconds with a WHERE clause.
SELECT COUNT(DISTINCT id) FROM tablename WHERE shippable = '1'
I also tried this one, but it takes more time than the previous one:
SELECT count(rowsss) FROM (SELECT count(*) as rowsss FROM tablename WHERE shippable = '1' GROUP BY id) as T
This is the output when I put the EXPLAIN keyword before the MySQL query:
If you need a filter, you could use an index on shippable, e.g.:
create index shippable_idx on tablename (shippable);
This way the scan is limited to the matching values
and a scan of the entire table is avoided.
Since you also need the column id, you could alternatively try a composite index:
create index shippable_id_idx on tablename (shippable, id);
The SQL optimizer should then retrieve the needed info directly from the index.
In this case the composite index (with id, which is redundant for the WHERE clause) is useful because the SQL engine can retrieve all the data the query needs just by scanning the index, avoiding access to the data in the table. This technique is frequently used for query tuning.
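With the composite index in place, a quick way to confirm the covering behavior (a sketch reusing the names from the question):

EXPLAIN SELECT COUNT(DISTINCT id) FROM tablename WHERE shippable = '1';
-- the Extra column should now show "Using index": the count is answered
-- from the index alone, without touching the table rows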
When you check a condition, both values should be of the same type; then the query will execute quickly.
SELECT count(rowsss) FROM (SELECT count(*) as rowsss FROM tablename WHERE CAST(shippable AS CHAR) = '1' GROUP BY id) as T

MySQL count slow when filtering by category

Why is my query fast when I run:
select count(*) as aggregate from `news` where `news`.`deleted_at` is null and `status` = '1'
but slow when I run:
select count(*) as aggregate from `news` where `news`.`deleted_at` is null and `status` = '1' and `newscategory_id` = '17'
Here is an image of my news table structure; have a look.
Sorry, because my reputation is less than 8 I can't attach the image.
Try adding a composite index on the three columns you are using in your select:
ALTER TABLE news ADD INDEX comp_index (deleted_at, status, newscategory_id);
and check it again.
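To verify the index is picked up, a sketch using the query from the question:

EXPLAIN select count(*) as aggregate from `news`
where `news`.`deleted_at` is null and `status` = '1' and `newscategory_id` = '17';
-- the key column of the EXPLAIN output should now show comp_index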
Use EXPLAIN to see whether any of the indexes you have are used.
Indexes are used to find rows with specific column values quickly.
Without an index, MySQL must begin with the first row and then read
through the entire table to find the relevant rows. The larger the
table, the more this costs. If the table has an index for the columns
in question, MySQL can quickly determine the position to seek to in
the middle of the data file without having to look at all the data.
This is much faster than reading every row sequentially.
Try to add this in your DB:
CREATE INDEX newCategory_indx ON news (newscategory_id)
CREATE INDEX status_indx ON news (status)
This will give you quick results compared to the previous (non-indexed column) queries.
To learn more about indexes and their importance, see the MySQL documentation.

Getting random results from large tables

I'm trying to get 4 random results from a table that holds approx 7 million records. Additionally, I also want to get 4 random records from the same table that are filtered by category.
Now, as you would imagine doing random sorting on a table this large causes the queries to take a few seconds, which is not ideal.
One other method I thought of for the non-filtered result set would be to just get PHP to select some random numbers between 1 and 7,000,000 or so and then do an IN(...) with the query to only grab those rows - and yes, I know this method has a caveat in that you may get fewer than 4 results if a record with that ID no longer exists.
However, the above method obviously will not work with the category filtering as PHP doesn't know which record numbers belong to which category and hence cannot select the record numbers to select from.
Are there any better ways I can do this? The only way I can think of would be to store the record IDs for each category in another table, select random results from that, and then select only those record IDs from the main table in a secondary query; but I'm sure there is a better way!?
You could of course use the RAND() function in a query with a LIMIT and a WHERE (for the category). That, however, as you pointed out, entails a scan of the table, which takes time, especially in your case due to the volume of data.
Your other alternative, again as you pointed out, to store id/category_id in another table might prove a bit faster, but again there has to be a LIMIT and WHERE on that table, which will also contain the same number of records as the master table.
A different approach (if applicable) would be to have a table per category and store in that the IDs. If your categories are fixed or do not change that often, then you should be able to use that approach. In that case you will effectively remove the WHERE from the clause and getting a RAND() with a LIMIT on each category table would be faster since each category table will contain a subset of records from your main table.
Some other alternatives would be to use a key/value store just for that operation. MongoDB or Google App Engine can help with that and are really fast.
You could also go towards the approach of a Master/Slave in your MySQL. The slave replicates content in real time but when you need to perform the expensive query you query the slave instead of the master, thus passing the load to a different machine.
Finally you could go with Sphinx which is a lot easier to install and maintain. You can then treat each of those category queries as a document search and let Sphinx randomize the results. This way you offset this expensive operation to a different layer and let MySQL continue with other operations.
Just some issues to consider.
Working off your random number approach:
1. Get the max ID in the database.
2. Create a temp table to store your matches.
3. Loop n times, doing the following:
   - Generate a random number between 1 and maxId.
   - Get the first record with an ID greater than the random number and insert it into your temp table.
Your temp table now contains your random results.
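A minimal PHP sketch of that loop, assuming a PDO connection in $pdo and a hypothetical items table with an auto-increment id and a category column; it collects results in an array instead of a temp table, which is equivalent for a handful of rows:

<?php
// $pdo is an existing PDO connection (assumption); table/column names are placeholders
$maxId = (int) $pdo->query("SELECT MAX(id) FROM items")->fetchColumn();

$picked = [];
$stmt = $pdo->prepare(
    "SELECT * FROM items WHERE id >= ? AND category = ? ORDER BY id LIMIT 1"
);
for ($i = 0; $i < 4; $i++) {
    // random point in the id range; gaps left by deleted rows are skipped over
    $stmt->execute([mt_rand(1, $maxId), 'some-category']);
    if ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
        $picked[$row['id']] = $row; // keyed by id to avoid duplicates
    }
}
// $picked now holds up to 4 random rows from the category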
Or you could dynamically generate sql with a union to do the query in one step.
(SELECT * FROM myTable WHERE ID >= FLOOR(1 + RAND() * (SELECT MAX(ID) FROM myTable)) AND Category = 'zzz' ORDER BY ID LIMIT 1)
UNION
(SELECT * FROM myTable WHERE ID >= FLOOR(1 + RAND() * (SELECT MAX(ID) FROM myTable)) AND Category = 'zzz' ORDER BY ID LIMIT 1)
UNION
(SELECT * FROM myTable WHERE ID >= FLOOR(1 + RAND() * (SELECT MAX(ID) FROM myTable)) AND Category = 'zzz' ORDER BY ID LIMIT 1)
UNION
(SELECT * FROM myTable WHERE ID >= FLOOR(1 + RAND() * (SELECT MAX(ID) FROM myTable)) AND Category = 'zzz' ORDER BY ID LIMIT 1)
Note: my SQL may not be valid, as I'm not a MySQL guy, but the theory should be sound.
First you need to get the number of rows, something like this:
select count(1) from tbl where category = ?
Then select a random offset:
$offset = rand(0, $rowsNum - 1);
and select a row at that offset (note the same category filter):
select * FROM tbl WHERE category = ? LIMIT $offset, 1
This way you avoid missing IDs. The only problem is that you need to run the second query several times to fetch several rows. UNION may help in this case.
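A short PHP sketch of this count-then-offset approach (again assuming a PDO connection in $pdo; table and column names are the placeholders from the answer):

<?php
$count = $pdo->prepare("SELECT COUNT(1) FROM tbl WHERE category = ?");
$count->execute(['some-category']);
$rowsNum = (int) $count->fetchColumn();

if ($rowsNum > 0) {
    // LIMIT offsets are 0-based; $offset is a plain int, so interpolation is safe
    $offset = rand(0, $rowsNum - 1);
    $stmt = $pdo->prepare("SELECT * FROM tbl WHERE category = ? LIMIT $offset, 1");
    $stmt->execute(['some-category']);
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
}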
For MySQL you can use RAND():
SELECT column FROM table
ORDER BY RAND()
LIMIT 4

php + mysql: finding all rows that contain string "ABC" in some field

I'm relatively new to this stuff, so forgive me if the question is dumb. Suppose a table with fields id, str_field. Values of str_field are something like "12:17:1246:90". I want to get all rows where str_field contains e.g. "17". To do this I'll need to execute the following command:
SELECT id FROM `table_name` WHERE INSTR(str_field, '17')>0
If there's a large number of rows in the table, the query can be slow. Here's the question: if I index str_field, will it increase the speed of query execution?
Thank you in advance!
PS. In other terms I'm asking: does an index increase the speed of queries like
SELECT * FROM `table_name` WHERE str_field='value'
?
UPD: str_field contains only numbers separated by colons.
If there's a large number of rows in the table, the query can be slow. Here's the question: if I index str_field, will it increase the speed of query execution?
Not much. If there are many other columns in your table, then you can make a covering index on the columns you use: (id, str_field). This will be slightly faster because the index is smaller than the original table and can therefore be read faster. However, it will still require a full scan of the index (instead of a full scan of the entire table).
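A sketch of that covering index, using the names from the question and assuming str_field is a VARCHAR short enough to index in full (a TEXT column would need a prefix, which breaks the covering property):

ALTER TABLE `table_name` ADD INDEX cover_idx (id, str_field);
-- the query still scans everything, but it scans this smaller index:
SELECT id FROM `table_name` WHERE INSTR(str_field, '17') > 0;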
But other than that, you can improve the speed of the query by using a separate table with three columns to store the separate integers, normalizing the data so that each value sits in its own row.
parent  sortorder  value
1       1          12
1       2          17
1       3          1246
1       4          90
Query like this:
SELECT parent AS id
FROM table_values
WHERE `value` = 17
You can then add an index on (value, parent) for this table, which will speed up the query. Note that the sortorder column is not required for this query. If you don't think you will ever need it then you don't need to include this column in your table.
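A sketch of that table and its index (names follow the answer; the primary key is an assumption):

CREATE TABLE table_values (
    parent    INT NOT NULL,   -- id of the row in table_name
    sortorder INT NOT NULL,   -- position within the original string
    `value`   INT NOT NULL,   -- one number from the colon-separated list
    PRIMARY KEY (parent, sortorder),
    INDEX value_parent (`value`, parent)
);

The lookup then becomes an index range read on value_parent instead of a full scan.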
You could create a fulltext index and use the special MATCH functionality: http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html
Still, searching for values instead of expressions is much faster...
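A hedged sketch of the fulltext approach (MyISAM only before MySQL 5.6; note that a short token like "17" is below the default ft_min_word_len of 4, so that server variable would need lowering and the index rebuilding):

ALTER TABLE `table_name` ADD FULLTEXT INDEX ft_str (str_field);
SELECT id FROM `table_name` WHERE MATCH(str_field) AGAINST ('17');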
Indexes can increase performance, that is true.
If you have a field which contains string data like 12:17:1246:90, you can use the query
SELECT id FROM `table_name` WHERE `str_field` LIKE '%17%';
% is a wildcard character that matches any sequence of characters (including none).
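One caveat worth adding: a pattern with a leading % cannot use an ordinary index, and '%17%' also matches rows containing 117 or 170. Since the field holds colon-separated numbers, a delimiter-aware sketch (names from the question):

SELECT id FROM `table_name`
WHERE CONCAT(':', str_field, ':') LIKE '%:17:%';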

optimizing MySQL tables with indexes

SELECT *
FROM sms_report
WHERE R_uid = '159'
AND R_show = '1'
ORDER BY R_timestamp DESC , R_numbers
This is my query. It is currently doing a filesort; I need to add an index so it is optimized.
Below is the output of explain
id  select_type  table       type  possible_keys  key    key_len  ref    rows    Extra
1   SIMPLE       sms_report  ref   R_uid,R_show   R_uid  4        const  765993  Using where; Using filesort
The table is MyISAM and I have created indexes on R_smppid, R_uid, R_show, R_timedate, R_numbers.
Someone advised me to add a composite index. Can you tell me which fields I should index, and how?
Try using a composite index on R_uid, R_show, R_timestamp, R_numbers - that way it should be able to find exactly the rows you are looking for in one index, and have the results already sorted.
EDIT - the DESC may throw off that optimization... but it may be worth a try.
Since MySQL says possible keys == R_uid,R_show, try creating a composite index over just those two.
Try running ANALYZE TABLE sms_report; Maybe also OPTIMIZE TABLE
Also try using EXPLAIN EXTENDED ... to see if it gives you more info.
If you are only interested in some of the columns, only specify those columns instead of *. Some databases (I don't know if MySQL is one of them) can skip reading the table and return the results straight from the index if the index includes all the columns you're interested in. e.g. if you're only interested in R_uid and R_show, then doing SELECT R_uid, R_show FROM ... instead of SELECT * FROM ... could speed things up. (Again, I don't know if this applies to MySQL.)
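MySQL does support this: when an index contains every column the query touches, EXPLAIN shows "Using index" and the table itself is never read. A sketch against the composite index suggested above (the sort may still be a filesort because of the mixed DESC/ASC order):

SELECT R_uid, R_show, R_timestamp, R_numbers
FROM sms_report
WHERE R_uid = 159 AND R_show = 1
ORDER BY R_timestamp DESC, R_numbers;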
How to add the index:
alter table sms_report add index new_index (R_uid, R_show, R_timestamp, R_numbers);
How to force the query to use the new index:
SELECT *
FROM sms_report
USE INDEX (new_index)
WHERE R_uid = 159 AND R_show = 1
ORDER BY R_timestamp DESC, R_numbers;
