COUNT() takes a long time when using a WHERE clause in MySQL - PHP

The table has approximately 100,000 records (tuples). Without a WHERE clause the query takes only a few milliseconds, whereas it takes 4-5 seconds with a WHERE clause.
SELECT COUNT(DISTINCT id) FROM tablename WHERE shippable = '1'
I also tried this one, but it takes even longer than the previous one.
SELECT count(rowsss) FROM (SELECT count(*) as rowsss FROM tablename WHERE shippable = '1' GROUP BY id) as T
This is the output when I put the EXPLAIN keyword in front of the MySQL query

If you need a filter, you could use an index on shippable, e.g.:
create index shippable_idx on tablename (shippable);
This way the scan is limited to the rows that match the filter
and a scan of the entire table is avoided.
Since you also need the column id, you could alternatively try a composite index:
create index shippable_id_idx on tablename (shippable, id);
The SQL optimizer should then retrieve the information it needs directly from the index.
In this case the composite index (with a redundant id that the WHERE clause does not need) is useful because the SQL engine can retrieve all the data the query needs just by scanning the index, without accessing the data in the table. This technique is frequently used for tuning database queries.
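To check whether the composite index really is used as a covering index, you can run EXPLAIN on the original query once the index exists. This is only a sketch and assumes the index on (shippable, id) described above has already been created:

```sql
-- Assumes the composite index on tablename (shippable, id) already exists.
EXPLAIN
SELECT COUNT(DISTINCT id)
FROM tablename
WHERE shippable = '1';
-- If the index covers the query, the Extra column of the EXPLAIN output
-- includes "Using index", meaning the table rows themselves are not read.
```

If id turns out to be the primary key (an assumption, the question does not say so), every id is already unique and COUNT(*) would return the same number as COUNT(DISTINCT id).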

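When you check a condition, both sides of the comparison should have the same type; the query will then execute faster.
SELECT count(rowsss) FROM (SELECT count(*) as rowsss FROM tablename WHERE CAST(shippable AS CHAR) = '1' GROUP BY id) as T
Note that wrapping the column in CAST() generally prevents MySQL from using an ordinary index on shippable, so where possible change the literal to match the column's type rather than casting the column.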

Related

MYSQL where clause slow performance on big table

I have a performance issue when working with a huge table.
I added an index on a column using this:
ALTER TABLE `table` ADD INDEX (`column`);
and on the text/blob column:
ALTER TABLE `table` ADD INDEX (cat(200));
My table has about 6M rows and I am working with the InnoDB engine (MySQL 5.5).
This query is very fast now that I have added an index on the "order by" column:
SELECT * FROM `table` ORDER BY `column` DESC LIMIT 0,40
But when I add a WHERE clause to this query it is very slow and takes about 10 seconds to load, even with the column "cat" indexed as above.
SELECT * FROM `table` WHERE cat = 'electronic' ORDER BY `column` DESC LIMIT 0,40
The EXPLAIN of this slow query:
EXPLAIN SELECT * from table WHERE cat = 'electronic' order by 'id' DESC LIMIT 0,40
id : 1
select_type : SIMPLE
table : product
type : ref
possible_keys: cat
key: cat
Key_len: 203
ref: const
row : 1732184
extra: using where
The query works fine on a small table with 50k rows, but with 6M rows it is slow. Why?
Do not use prefixing, such as cat(200); it usually makes the index unusable. I have never seen a case where the Optimizer, when faced with INDEX(a(10), b), gets past a and makes any use of b.
Change cat to be VARCHAR(255). That is probably more than sufficient for "categories".
The best index (if it is possible) is
INDEX(cat, `column`)
Note that cat is in the WHERE with =. It handles the entire WHERE, so the index can move on to the ORDER BY. Hence `column` can be used, too. More discussion of index making.
If cat must be TEXT, then the best you can do is
INDEX(`column`)
Then the Optimizer may decide to use it for avoiding a filesort. But if there are fewer than 40 (see LIMIT) 'electronic' rows, it will take a big scan and probably be slower than not using the index. So, I am not sure that it is even worth having INDEX(`column`).
For this query:
SELECT t.*
FROM table t
WHERE cat = 'electronic'
ORDER BY column DESC
LIMIT 0, 40;
The best index is a composite index on table(cat, column). You can use a prefix if column is too wide: table(cat, column(200)).
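A minimal sketch of that composite index, under several assumptions: the table is called product (the name shown in the EXPLAIN output), the existing prefix index is named cat (the key shown there), sort_col is a hypothetical stand-in for the ORDER BY column, and every cat value fits in 255 characters.

```sql
-- product and the index name cat come from the EXPLAIN output in the question;
-- sort_col and idx_cat_sort are hypothetical names used for illustration.
ALTER TABLE product
  DROP INDEX cat,                          -- remove the old prefix index
  MODIFY cat VARCHAR(255),                 -- assumes no value is longer than 255 characters
  ADD INDEX idx_cat_sort (cat, sort_col);  -- equality column first, then the sort column

-- Re-check the plan: with the composite index, Extra should no longer
-- show "Using filesort" for this query.
EXPLAIN
SELECT *
FROM product
WHERE cat = 'electronic'
ORDER BY sort_col DESC
LIMIT 0, 40;
```

On MySQL 5.5 a single InnoDB index column is limited to 767 bytes, so with a multi-byte character set a shorter VARCHAR may be needed.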
The best option is to index the table. If you don't know how to do it, you can check this doc.
That way, when you run the query, MySQL will search the indexed values first, skipping a lot of data that is irrelevant to the request.

Update Current Row in MySQL Loop

I have a MySQL table with over 16 million rows and there is no primary key. Whenever I try to add one, my connection crashes. I have tried adding one as an auto increment in PHPMyAdmin and in shell but the connection is always lost after about 10 minutes.
What I would like to do is loop through the table's rows in PHP so I can limit the number of results and add an auto-incremented ID number to each returned row. Since each query would touch fewer rows, the load on MySQL would be lower and I would not lose my connection.
I want to do something like
SELECT * FROM MYTABLE LIMIT 1000001, 2000000;
Then, in the loop, update the current row
UPDATE (current row) SET ID='$i++'
How do I do this?
Note: the original data was given to me as a txt file. I don't know if there are duplicates but I cannot eliminate any rows. Also, no rows will be added. This table is going to be used only for querying purposes. When I have added indexes, however, there were no problems.
I suspect you are trying to use phpMyAdmin to add the index. As handy as it is, it is a PHP script and is limited to the same resources as any PHP script on your server, typically 30-60 seconds of run time and a limited amount of RAM.
I suggest you get the MySQL statement you need to add the index, then use SSH to open a shell and use the command-line mysql client to add your indexes.
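For reference, the statement to run from the command-line client could look like the sketch below. The column name ID and its type are assumptions; the table name MYTABLE is taken from the question.

```sql
-- Run this in the mysql command-line client (for example over SSH),
-- so that no PHP time or memory limit applies.
-- The column name ID and the type INT UNSIGNED are assumptions.
ALTER TABLE MYTABLE
  ADD COLUMN ID INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST;
```

MySQL numbers the existing rows itself while it rebuilds the table, so no loop over the rows is needed.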
If you don't have duplicate rows, then the following approach might shed some light.
Suppose you want to set the auto-incremented value for the first 10000 rows:
UPDATE
MYTABLE
INNER JOIN
(SELECT
*,
@rn := @rn + 1 AS row_num
FROM MYTABLE,(SELECT @rn := 0) var
ORDER BY SOME_OF_YOUR_FIELD
LIMIT 0,10000 ) t
ON t.field1 = MYTABLE.field1 AND t.field2 = MYTABLE.field2 AND .... t.fieldN = MYTABLE.fieldN
SET MYTABLE.ID = t.row_num;
For the next 10000 rows you just need to change two things:
(SELECT @rn := 10000) var
LIMIT 10000,10000
Repeat.
Note: ORDER BY SOME_OF_YOUR_FIELD is important; otherwise you would get results in random order. It is better to create a function that takes the limit and offset as parameters and does this job, since you need to repeat the process.
Explanation:
The idea is to create a temporary (derived) table t with N rows, assigning a unique row number to each row. Then make an inner join between your main table MYTABLE and this table t, matching on all the fields, and update the ID field of the corresponding row (in MYTABLE) with the incremented value (in this case row_num).
Another idea:
You may use multithreading in PHP to do this job.
Create N threads.
Assign each thread a non-overlapping region (1 to 10000, 10001 to 20000, etc.) like the above query.
Caution: the query will get slower at higher offsets.

PHP & Mysql select from big data

My database has 10,000,000 records.
I want to select from the database, but the query is heavy.
Query I have tried:
SELECT * FROM `table` USE INDEX (id) JOIN `new` AS p1
USE INDEX (pid) ON `table`.id = p1.pid
WHERE p1.`date` > '2015-02-01' AND p1.`date` < '2016-02-01'
You need an index on columns new.date and table.id.
You probably don't need the USE INDEX hints.
I am assuming that there are not too many rows in the date range. If a large proportion of your rows are in that range, it will obviously take a long time.
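The indexes mentioned above could be created as in the sketch below. The index names are made up, and the second statement assumes that id is not already the primary key of `table` (if it is, that statement is unnecessary).

```sql
-- Index names are hypothetical; table and column names are from the question.
CREATE INDEX idx_new_date ON `new` (`date`);
CREATE INDEX idx_table_id ON `table` (id);  -- skip if id is already the primary key

-- The same query without the USE INDEX hints, leaving the choice to the optimizer.
SELECT *
FROM `table` AS t
JOIN `new` AS p1 ON t.id = p1.pid
WHERE p1.`date` > '2015-02-01'
  AND p1.`date` < '2016-02-01';
```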
Use "LIKE" instead of "=".

How to reduce subquery execution time...?

I want a per-day sales item count. I have already written a query for it, but it takes far too long, around 55.585 s.
Query:
SELECT
td.db_date,
(
select count(*) from `order` as o where DATE(o.created_on) = td.db_date
)as day_contribute
FROM time_dimension as td
Can anyone please let me know how I can optimize this query and reduce the execution time?
You can rewrite your query as a join:
SELECT
td.db_date, count(`order`.id) as day_contribute
FROM time_dimension as td
LEFT JOIN `order` ON DATE(`order`.created_on) = td.db_date
GROUP BY td.db_date;
I do not know the primary key of your order table, so I just used "order.id". Replace it with yours.
It is also very important to check that you have an index on the td.db_date field.
One more important thing: it is better to avoid DATE(order.created_on), because it means that the DATE() function is called every time the database compares dates. If possible, convert order.created_on to the same format as td.db_date, or join on other fields. That will add speed too.
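One way to avoid calling DATE() on the column is to compare created_on against a one-day range instead. The sketch below assumes that td.db_date is a DATE, that created_on is a DATETIME or TIMESTAMP, and that the primary key of the order table is called id; none of this is confirmed by the question.

```sql
-- Assumptions: db_date is DATE, created_on is DATETIME/TIMESTAMP,
-- and the primary key of `order` is id.
SELECT td.db_date,
       COUNT(o.id) AS day_contribute
FROM time_dimension AS td
LEFT JOIN `order` AS o
       ON o.created_on >= td.db_date
      AND o.created_on <  td.db_date + INTERVAL 1 DAY
GROUP BY td.db_date;
```

Because created_on is no longer wrapped in a function, an index on created_on at least becomes usable; whether the optimizer actually picks it is best checked with EXPLAIN.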
First, you should make sure you have an index on the created_on column in the order table.
However, if you have many records in time_dimension and many records in the order table, it might be hard to optimize the query, because for each record from time_dimension you need to search the order table.
You can also change count(*) to count(order_id) (assuming the primary key of the order table is order_id) or add an extra date-only column to the order table (created_on_date, with an index on this column), so your query could look like this:
SELECT
td.db_date,
(
select count(order_id) from `order` where `order`.created_on_date = td.db_date
)as day_contribute
FROM time_dimension as td
However, the execution time might still be too high if you have many records in both tables, so it might be necessary to create an extra table that holds the number of orders for each day and to update it from cron or whenever records in the order table are added, updated or deleted.
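Such a summary table could be sketched as follows. The table name order_daily_count and its columns are hypothetical, and the refresh statement is meant to be run periodically (for example from cron).

```sql
-- Hypothetical summary table: one row per day.
CREATE TABLE order_daily_count (
  order_date  DATE         NOT NULL PRIMARY KEY,
  order_count INT UNSIGNED NOT NULL
);

-- Refresh the counts. REPLACE overwrites rows for days that already exist;
-- it does not remove days whose orders have all been deleted.
REPLACE INTO order_daily_count (order_date, order_count)
SELECT DATE(created_on), COUNT(*)
FROM `order`
WHERE created_on IS NOT NULL
GROUP BY DATE(created_on);

-- The report then becomes a cheap join on the primary key.
SELECT td.db_date,
       COALESCE(c.order_count, 0) AS day_contribute
FROM time_dimension AS td
LEFT JOIN order_daily_count AS c ON c.order_date = td.db_date;
```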

Insert Records All At Once

I have a working table to which I added a column. After adding the column, I want to store the result of a query (the query is the same for all rows, but the results differ) in that column all at once, instead of one row at a time, which would be time consuming. How can I achieve that? After updating, I have the same single result in every row of the column. I cannot use a WHERE clause because it would require me to do the update one row at a time.
$stmt = $pdo->prepare("UPDATE `table` SET my_value = :my_value");
$stmt->execute([':my_value' => $myValue]);
UPDATE table
SET my_value = (select col from some_table where ...)
If the value is the same for all rows, I would advise using a cross join:
update table t cross join
(select newval . . .) x
set t.col = x.newval;
Note: this is better than a subquery in the SET clause, because the joined subquery is guaranteed to be evaluated only once.
If you are trying to say that the value is the same within groups of rows, then extend this to a join:
update table t join
(select grp, newval . . .) x
on t.grp = x.grp
set t.col = x.newval;
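As a concrete illustration of the grouped form, here is a sketch with made-up tables: customers(id, order_count) and orders(customer_id). These names are assumptions for the example only and do not come from the question.

```sql
-- Hypothetical schema: customers(id, order_count), orders(customer_id).
-- The derived table x is computed once, then joined to every customer row.
UPDATE customers AS c
JOIN (
  SELECT customer_id, COUNT(*) AS n
  FROM orders
  GROUP BY customer_id
) AS x ON c.id = x.customer_id
SET c.order_count = x.n;
```

Customers without any orders are not matched by the inner join and keep their previous order_count.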
"After adding the column I want to add the result of a query (query result is same for all) into that column all at once instead of one at a time which will be time consuming."
The solution depends on what you mean by "is the same for all".
If you have one value that is exactly the same for all rows, you can fetch it with a separate query first and then run the update. This is usually faster (and easier to debug) than doing everything in pure SQL.
If, on the other hand, you mean that the values of that column are retrieved by the same query but differ from row to row, then a subquery or a join as Gordon suggested will do the trick.
