MySQL query taking a long time on Join - php

I have the following MySQL query, which takes a long time:
SELECT `A`.*, max(B.timestamp) as timestamp2
FROM (`A`)
JOIN `B` ON `A`.`column1` = `B`.`column1`
WHERE `column2` = 'Player'
GROUP BY `column1`
ORDER BY `timestamp2` desc
I have an index on table A on column1, and the indexes on table B are (column1, timestamp, column2), timestamp, and column1.
When I use EXPLAIN, it does not use the timestamp index.

Try adding an index...
... ON `B` (`column2`,`column1`,`timestamp`)
with the columns in that order.
Without any information about datatype, we're going to guess that column2 is character type (and we're going to assume that the column is in table B, given the information about the current indexes.)
Absent any information about cardinality, we're going to guess that the number of rows that satisfy the equality predicate on column2 (in the WHERE clause) is a small subset of the total rows in B.
We expect that MySQL will make use of a "range" scan operation, using an index that has column2 as a leading column.
Given that the new index is a "covering" index for the query, we also expect the EXPLAIN output to show "Using index" in the Extra column.
We also expect that MySQL can use the index to satisfy the GROUP BY operation and the MAX aggregate, without requiring a filesort operation.
But we are still going to see a filesort operation, used to satisfy the ORDER BY.

Related

count() takes lots of time when use WHERE clause in mysql

The table has approximately 100,000 records (tuples). Without the WHERE clause the query takes only a few milliseconds, whereas with the WHERE clause it takes 4-5 seconds.
SELECT COUNT(DISTINCT id) FROM tablename WHERE shippable = '1'
I also tried this one, but it takes even longer than the previous one.
SELECT count(rowsss) FROM (SELECT count(*) as rowsss FROM tablename WHERE shippable = '1' GROUP BY id) as T
This is the output when I put the EXPLAIN keyword in front of the query
If you need a filter, you could use an index on shippable, e.g.:
create index shippable_ixd on tablename (shippable);
This way the scan of the table is limited to the values that match, and a scan of the entire table is avoided.
Because you also need the column id, you could alternatively try a composite index:
create index shippable_id_ixd on tablename (shippable, id);
The SQL optimizer should then retrieve the needed information directly from the index.
In this case the composite index (with a redundant id that is not needed by the WHERE clause) is useful because the SQL engine retrieves all the data the query needs just by scanning the index, avoiding access to the data in the table. This technique is used frequently for tuning database queries.
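A minimal sketch of the covering-index idea, under loud assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL (the planner and its output differ from MySQL's EXPLAIN), and the payload column, the sample rows and the index name shippable_id_idx are made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here; table contents and index name are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER, shippable TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO tablename (id, shippable, payload) VALUES (?, ?, ?)",
    [(i, "1" if i % 4 == 0 else "0", "x" * 50) for i in range(1000)],
)

# Composite index: the filter column first, then the column the query reads.
conn.execute("CREATE INDEX shippable_id_idx ON tablename (shippable, id)")

query = "SELECT COUNT(DISTINCT id) FROM tablename WHERE shippable = '1'"

# The last field of each EXPLAIN QUERY PLAN row is a text description of the step.
plan = " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query))
shippable_count = conn.execute(query).fetchone()[0]

print(plan)             # should mention shippable_id_idx
print(shippable_count)  # every fourth row of the 1000 is shippable
```

On MySQL the equivalent check would be to run EXPLAIN on the real query and look for "Using index" in the Extra column.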
When you check a condition, both values should be of the same type; the query will then execute faster.
SELECT count(rowsss) FROM (SELECT count(*) as rowsss FROM tablename WHERE CAST(shippable AS CHAR) = '1' GROUP BY id) as T

MYSQL where clause slow performance on big table

I have a performance issue when working with a huge table
I added an index on a column using this:
ALTER TABLE `table` ADD INDEX (`column`);
and on the text/blob column:
ALTER TABLE `table` ADD INDEX (cat(200));
My table has about 6M rows and I am working with the InnoDB engine (MySQL 5.5).
This query is very fast now that I have added an index on the "order by" column:
SELECT * from table order by column DESC LIMIT 0,40
But when I add a WHERE clause to this query it is very slow and takes about 10 seconds to load, even with the index on the column "cat" shown above.
SELECT * from table WHERE cat = 'electronic' order by column DESC LIMIT 0,40
the EXPLAIN of this slow query :
EXPLAIN SELECT * from table WHERE cat = 'electronic' order by 'id' DESC LIMIT 0,40
id : 1
select_type : SIMPLE
table : product
type : ref
possible_keys: cat
key: cat
key_len: 203
ref: const
rows: 1732184
Extra: Using where
The query works fine on a small table with 50k rows, but with 6M rows it is slow. Why?
Do not use prefixing, such as cat(200); it usually makes the index unusable. I have never seen a case where the Optimizer, when faced with INDEX(a(10), b), gets past a and makes any use of b.
Change cat to be VARCHAR(255). That is probably more than sufficient for "categories".
The best index (if it is possible) is
INDEX(cat, `column`)
Note that cat is in the WHERE with =. It handles the entire WHERE, so the index can move on to the ORDER BY. Hence column can be used, too. More discussion of index making.
If cat must be TEXT, then the best you can do is
INDEX(`column`)
Then the Optimizer may decide to use it for avoiding a filesort. But if there are fewer than 40 (see LIMIT) 'electronic' rows, it will take a big scan and probably be slower than not using the index. So, I am not sure that it is even worth having INDEX(column).
For this query:
SELECT t.*
FROM table t
WHERE cat = 'electronic'
ORDER BY column DESC
LIMIT 0, 40;
The best index is a composite index on table(cat, column). You can use a prefix if column is too wide: table(cat, column(200)).
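To see why the (cat, column) order matters, here is a small sketch, not a MySQL benchmark. It assumes SQLite (via Python's sqlite3) as a stand-in for MySQL, a hypothetical sort_col in place of the placeholder `column`, and made-up index names. It compares the plan with an index on cat alone against the plan with the composite index.

```python
import sqlite3

QUERY = (
    "SELECT * FROM product WHERE cat = 'electronic' "
    "ORDER BY sort_col DESC LIMIT 40"
)

def query_plan(index_ddl):
    """Build a fresh table with exactly one index and return the plan text for QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE product (id INTEGER PRIMARY KEY, cat TEXT, sort_col INTEGER)"
    )
    conn.execute(index_ddl)
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    conn.close()
    # The last field of each plan row is a text description of the step.
    return " | ".join(row[3] for row in rows)

# Index on the filter column only: matching rows still have to be sorted.
plan_cat_only = query_plan("CREATE INDEX cat_idx ON product (cat)")

# Composite index: equality column first, then the ORDER BY column.
plan_composite = query_plan("CREATE INDEX cat_sort_idx ON product (cat, sort_col)")

print(plan_cat_only)
print(plan_composite)
```

With the single-column index the plan includes a separate sorting step for the ORDER BY; with the composite index the rows come out of the index already ordered. The MySQL counterpart of that sorting step is "Using filesort" in the Extra column of EXPLAIN.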
The best option is to index the table; if you don't know how to do it, you can check this doc.
Then, when you perform the query, MySQL will search the indexed values first, skipping a lot of data that is irrelevant to the request.

Mysql query taking 50 seconds

The table reuniao (EN: meeting) has 401,000 records, and it has an index on all columns. I'm using XAMPP.
I think the problem is the ORDER BY on a COUNT together with the join, but 50 seconds is too much.
Columns: nome is a varchar (EN: name), presenca is a varchar (EN: presence), partido is a varchar (EN: political party), id_deputado is the INT primary key, and data is a date.
SELECT D.nome, COUNT(*) as count_dep_faltas
FROM reuniao R, deputado D
WHERE R.partido LIKE '%$_POST[partido]%' AND
R.presenca LIKE '%Injustific%' AND
data BETWEEN '$data_incio' AND '$data_fim' and
R.id_deputado=D.id_deputado
GROUP BY D.nome
ORDER BY count_dep_faltas DESC
LIMIT 5
This is your query, written using proper JOIN syntax:
SELECT D.nome, COUNT(*) as count_dep_faltas
FROM deputado D JOIN
reuniao R
ON R.id_deputado = D.id_deputado
WHERE R.partido LIKE '%$_POST[partido]%' AND
R.presenca LIKE '%Injustific%' AND
R.data BETWEEN '$data_incio' AND '$data_fim'
GROUP BY D.nome
ORDER BY count_dep_faltas DESC
LIMIT 5;
First, you need to learn to use parameters for passing values into queries, rather than munging the query string. This prevents unexpected syntax errors and SQL injection. But that is not related to performance.
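As a rough illustration of what "use parameters" means, here is a sketch with bound placeholders. The assumptions are significant: it is written in Python with the built-in sqlite3 module standing in for MySQL (the question itself uses PHP, where prepared statements play the same role), and the sample rows and the function name top_absences are invented.

```python
import sqlite3

# SQLite is only a stand-in for MySQL; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE deputado (id_deputado INTEGER PRIMARY KEY, nome TEXT);
    CREATE TABLE reuniao (id_deputado INTEGER, partido TEXT, presenca TEXT, data TEXT);
    INSERT INTO deputado VALUES (1, 'Ana'), (2, 'Rui');
    INSERT INTO reuniao VALUES
        (1, 'PS', 'Falta Injustificada', '2015-01-10'),
        (1, 'PS', 'Falta Injustificada', '2015-02-10'),
        (2, 'PS', 'Falta Injustificada', '2015-01-15'),
        (2, 'PS', 'Presente',            '2015-01-20'),
        (1, 'PS', 'Falta Injustificada', '2016-01-01');
""")

def top_absences(conn, partido, data_inicio, data_fim, limit=5):
    """Run the query with bound parameters instead of string interpolation."""
    sql = """
        SELECT D.nome, COUNT(*) AS count_dep_faltas
        FROM deputado D
        JOIN reuniao R ON R.id_deputado = D.id_deputado
        WHERE R.partido LIKE ?
          AND R.presenca LIKE ?
          AND R.data BETWEEN ? AND ?
        GROUP BY D.nome
        ORDER BY count_dep_faltas DESC
        LIMIT ?
    """
    # The wildcards are added to the value, never to the SQL text itself.
    params = (f"%{partido}%", "%Injustific%", data_inicio, data_fim, limit)
    return conn.execute(sql, params).fetchall()

normal = top_absences(conn, "PS", "2015-01-01", "2015-12-31")
hostile = top_absences(conn, "' OR '1'='1", "2015-01-01", "2015-12-31")
print(normal)   # [('Ana', 2), ('Rui', 1)]
print(hostile)  # [] because the quote characters are treated as plain data
```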
This is hard to optimize in MySQL because of the wildcards in the LIKE patterns. You can approach this by creating indexes on reuniao(data, partido, presenca, id_deputado) and deputado(id_deputado, nome). These are covering indexes for the query, so they should give some improvement.
I would also recommend that you consider full text indexes, if you really need matches on the strings with wildcards.

Mysql where between query optimization

Below is the format of the database of Autonomous System Numbers (downloaded and parsed from this site).
range_start  range_end  number  cc  provider
-----------  ---------  ------  --  -------------------------------------
16778240     16778495   56203   AU  AS56203 - BIGRED-NET-AU Big Red Group
16793600     16809983   18144       AS18144
745465 total rows
A normal query looks like this:
select * from table where 3232235520 BETWEEN range_start AND range_end
This works properly, but I need to look up the AS information for a huge number of IPs, which ends up taking too many calls and too much time.
Profiler Snapshot:
Blackfire profiler snapshot
I have two indexes:
the id column
a combined index on the range_start and range_end columns, as together they make a row unique.
Questions:
Is there a way to query a huge number of IPs in a single query?
Multiple conditions of the form WHERE (IP BETWEEN range_start AND range_end) OR (IP BETWEEN range_start AND range_end) OR ... work, but I can't get the IP -> row mapping, i.e. which rows were retrieved for which IP.
Any suggestions to change the database structure to optimize the query speed and decrease the time?
Any help will be appreciated! Thanks!
It is possible to query more than one IP address, and there are several approaches we could take. The following assumes range_start and range_end are defined as integer types.
For a reasonable number of ip addresses, we could use an inline view:
SELECT i.ip, a.*
FROM ( SELECT 3232235520 AS ip
UNION ALL SELECT 3232235521
UNION ALL SELECT 3232235522
UNION ALL SELECT 3232235523
UNION ALL SELECT 3232235524
UNION ALL SELECT 3232235525
) i
LEFT
JOIN ip_to_asn a
ON a.range_start <= i.ip
AND a.range_end >= i.ip
ORDER BY i.ip
This approach will work for a reasonable number of IP addresses. The inline view could be extended with more UNION ALL SELECT to add additional IP addresses. But that's not necessarily going to work for a "huge" number.
When we get "huge", we're going to run into limitations in MySQL... maximum size of a SQL statement limited by max_allowed_packet, there may be a limit on the number of SELECT that can appear.
The inline view could be replaced with a temporary table, built first.
DROP TEMPORARY TABLE IF EXISTS _ip_list_;
CREATE TEMPORARY TABLE _ip_list_ (ip BIGINT NOT NULL PRIMARY KEY) ENGINE=InnoDB;
INSERT INTO _ip_list_ (ip) VALUES (3232235520),(3232235521),(3232235522),...;
...
INSERT INTO _ip_list_ (ip) VALUES (3232237989),(3232237990);
Then reference the temporary table in place of the inline view:
SELECT i.ip, a.*
FROM _ip_list_ i
LEFT
JOIN ip_to_asn a
ON a.range_start <= i.ip
AND a.range_end >= i.ip
ORDER BY i.ip ;
And then drop the temporary table:
DROP TEMPORARY TABLE IF EXISTS _ip_list_ ;
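From the application side, the temporary-table approach might be driven roughly like this. This is only a sketch under stated assumptions: Python's sqlite3 stands in for a MySQL connection (so the DDL is SQLite's CREATE TEMP TABLE rather than the MySQL statements above), the two sample rows are taken from the question, and lookup_many is a hypothetical helper name.

```python
import sqlite3

# SQLite stand-in for the MySQL server; only two sample ranges are loaded.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ip_to_asn (
        range_start INTEGER NOT NULL,
        range_end   INTEGER NOT NULL,
        number      INTEGER,
        PRIMARY KEY (range_start, range_end)
    );
    INSERT INTO ip_to_asn VALUES
        (16778240, 16778495, 56203),
        (16793600, 16809983, 18144);
""")

def lookup_many(conn, ips):
    """Return (ip, AS number) pairs for a whole batch of IPs using one SELECT."""
    conn.execute("DROP TABLE IF EXISTS _ip_list_")
    conn.execute("CREATE TEMP TABLE _ip_list_ (ip INTEGER NOT NULL PRIMARY KEY)")
    # De-duplicate first so the primary key on ip is not violated.
    conn.executemany(
        "INSERT INTO _ip_list_ (ip) VALUES (?)", [(ip,) for ip in set(ips)]
    )
    rows = conn.execute("""
        SELECT i.ip, a.number
        FROM _ip_list_ i
        LEFT JOIN ip_to_asn a
          ON a.range_start <= i.ip
         AND a.range_end >= i.ip
        ORDER BY i.ip
    """).fetchall()
    conn.execute("DROP TABLE IF EXISTS _ip_list_")
    return rows

result = lookup_many(conn, [16778300, 16800000, 1])
print(result)  # [(1, None), (16778300, 56203), (16800000, 18144)]
```

Because of the LEFT JOIN, an IP that falls in no range still comes back, with NULL (None) for the AS number, which gives the IP -> row mapping asked for in the question.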
Some other notes:
Churning database connections is going to degrade performance. There's a significant amount of overhead in establishing and tearing down a connection. That overhead gets noticeable if the application is repeatedly connecting and disconnecting, for example if it's doing that for every SQL statement being issued.
And running an individual SQL statement also has overhead... the statement has to be sent to the server, parsed for syntax and evaluated for semantics; then the server has to choose an execution plan, execute the plan, prepare a resultset, and return the resultset to the client. And this is why it's more efficient to process set wise rather than row wise. Processing RBAR (row by agonizing row) can be very slow, compared to sending a statement to the database and letting it process a set in one fell swoop.
But there's a tradeoff there. With ginormous sets, things can start to get slow again.
Even if you can process two IP addresses in each statement, that halves the number of statements that need to be executed. If you do 20 IP addresses in each statement, that cuts down the number of statements to 5% of the number that would be required a row at a time.
And the composite index already defined on (range_start,range_end) is appropriate for this query.
FOLLOWUP
As Rick James points out in a comment, the index I earlier said was "appropriate" is less than ideal.
We could write the query a little differently, that might make more effective use of that index.
If (range_start,range_end) is UNIQUE (or PRIMARY) KEY, then this will return one row per IP address, even when there are "overlapping" ranges. (The previous query would return all of the rows that had a range_start and range_end that overlapped with the IP address.)
SELECT t.ip, a.*
FROM ( SELECT s.ip
, s.range_start
, MIN(e.range_end) AS range_end
FROM ( SELECT i.ip
, MAX(r.range_start) AS range_start
FROM _ip_list_ i
LEFT
JOIN ip_to_asn r
ON r.range_start <= i.ip
GROUP BY i.ip
) s
LEFT
JOIN ip_to_asn e
ON e.range_start = s.range_start
AND e.range_end >= s.ip
GROUP BY s.ip, s.range_start
) t
LEFT
JOIN ip_to_asn a
ON a.range_start = t.range_start
AND a.range_end = t.range_end
ORDER BY t.ip ;
With this query, for the innermost inline view query s, the optimizer might be able to make effective use of an index with a leading column of range_start, to quickly identify the "highest" value of range_start (that is less than or equal to the IP address). (But with that outer join, and with the GROUP BY on i.ip, I'd really need to look at the EXPLAIN output; it's only conjecture what the optimizer might do; what is important is what the optimizer actually does.)
Then, for inline view query e, MySQL might be able to make more effective use of the composite index on (range_start,range_end), because of the equality predicate on the first column, and the inequality condition and MIN aggregate on the second column.
For the outermost query, MySQL will surely be able to make effective use of the composite index, due to the equality predicates on both columns.
A query of this form might show improved performance, or performance might go to hell in a handbasket. The output of EXPLAIN should give a good indication of what's going on. We'd like to see "Using index for group-by" in the Extra column, and we only want to see a "Using filesort" for the ORDER BY on the outermost query. (If we remove the ORDER BY clause, we want to not see "Using filesort" in the Extra column.)
Another approach is to make use of correlated subqueries in the SELECT list. The execution of correlated subqueries can get expensive when the resultset contains a large number of rows. But this approach can give satisfactory performance for some use cases.
This query depends on no overlapping ranges in the ip_to_asn table, and this query will not produce the expected results when overlapping ranges exist.
SELECT t.ip, a.*
FROM ( SELECT i.ip
, ( SELECT MAX(s.range_start)
FROM ip_to_asn s
WHERE s.range_start <= i.ip
) AS range_start
, ( SELECT MIN(e.range_end)
FROM ip_to_asn e
WHERE e.range_end >= i.ip
) AS range_end
FROM _ip_list_ i
) r
LEFT
JOIN ip_to_asn a
ON a.range_start = r.range_start
AND a.range_end = r.range_end
As a demonstration of why overlapping ranges will be a problem for this query, consider a totally goofy, made-up example:
range_start range_end
----------- ---------
.101 .160
.128 .244
Given an IP address of .140, the MAX(range_start) subquery will find .128, the MIN(range_end) subquery will find .160, and then the outer query will attempt to find a matching row range_start=.128 AND range_end=.160. And that row just doesn't exist.
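The failure can be reproduced outside the database with a few lines of Python. This is just a sketch of the logic of the two correlated subqueries, not of how MySQL executes them; the last octets are used as plain integers and correlated_lookup is a made-up name.

```python
# The two overlapping ranges from the made-up example, as (range_start, range_end).
ranges = [(101, 160), (128, 244)]

def correlated_lookup(ranges, ip):
    """Mimic the MAX(range_start) / MIN(range_end) subqueries and the outer join."""
    # SELECT MAX(s.range_start) ... WHERE s.range_start <= ip
    start = max((s for s, _ in ranges if s <= ip), default=None)
    # SELECT MIN(e.range_end) ... WHERE e.range_end >= ip
    end = min((e for _, e in ranges if e >= ip), default=None)
    # Outer join: find a row with exactly this (range_start, range_end) pair.
    match = [r for r in ranges if r == (start, end)]
    return start, end, match

print(correlated_lookup(ranges, 140))  # (128, 160, []) -> no such row exists
print(correlated_lookup(ranges, 110))  # (101, 160, [(101, 160)]) -> outside the overlap it works
```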
This is a duplicate of the question here; however, I'm not voting to close it, as the accepted answer in that question is not very helpful. The answer by Quassnoi is much better (but it only links to the solution).
A linear index is not going to help resolve a database of ranges. The solution is to use geospatial indexing (available in MySQL and other DBMS). An added complication is that MySQL geospatial indexing only works in 2 dimensions (while you have a 1-D dataset) so you need to map this to 2-dimensions.
Hence:
CREATE TABLE IF NOT EXISTS `inetnum` (
`from_ip` int(11) unsigned NOT NULL,
`to_ip` int(11) unsigned NOT NULL,
`netname` varchar(40) default NULL,
`ip_txt` varchar(60) default NULL,
`descr` varchar(60) default NULL,
`country` varchar(2) default NULL,
`rir` enum('APNIC','AFRINIC','ARIN','RIPE','LACNIC') NOT NULL default 'RIPE',
`netrange` linestring NOT NULL,
PRIMARY KEY (`from_ip`,`to_ip`),
SPATIAL KEY `rangelookup` (`netrange`)
) ENGINE=MyISAM DEFAULT CHARSET=ascii;
Which might be populated with....
INSERT INTO inetnum
(from_ip, to_ip
, netname, ip_txt, descr, country
, netrange)
VALUES
(INET_ATON('127.0.0.0'), INET_ATON('127.0.0.2')
, 'localhost','127.0.0.0-127.0.0.2', 'Local Machine', '.',
GEOMFROMWKB(POLYGON(LINESTRING(
POINT(INET_ATON('127.0.0.0'), -1),
POINT(INET_ATON('127.0.0.2'), -1),
POINT(INET_ATON('127.0.0.2'), 1),
POINT(INET_ATON('127.0.0.0'), 1),
POINT(INET_ATON('127.0.0.0'), -1))))
);
Then you might want to create a function to wrap the rather verbose SQL....
DROP FUNCTION `netname2`//
CREATE DEFINER=`root`@`localhost` FUNCTION `netname2`(p_ip VARCHAR(20) CHARACTER SET ascii) RETURNS varchar(80) CHARSET ascii
READS SQL DATA
DETERMINISTIC
BEGIN
DECLARE l_netname varchar(80);
SELECT CONCAT(country, '/',netname)
INTO l_netname
FROM inetnum
WHERE MBRCONTAINS(netrange, GEOMFROMTEXT(CONCAT('POINT(', INET_ATON(p_ip), ' 0)')))
ORDER BY (to_ip-from_ip)
LIMIT 0,1;
RETURN l_netname;
END
And therefore:
SELECT netname2('127.0.0.1');
./localhost
Which uses the index:
id  select_type  table    type   possible_keys  key          key_len  ref   rows  Extra
1   SIMPLE       inetnum  range  rangelookup    rangelookup  34       NULL  1     Using where; Using filesort
(and takes around 10msec to find a record from the combined APNIC,AFRINIC,ARIN,RIPE and LACNIC datasets on the very low spec VM I'm using here)
You can compare IP ranges using MySQL. This question might contain an answer you're looking for: MySQL check if an IP-address is in range?
SELECT * FROM TABLE_NAME WHERE (INET_ATON("193.235.19.255") BETWEEN INET_ATON(ipStart) AND INET_ATON(ipEnd));
You will likely want to index your database. This optimizes the time it takes to search your database, similar to the index you will find in the back of a textbook, but for databases:
ALTER TABLE `table` ADD INDEX `name` (`column_id`)
EDIT: Apparently an index cannot be used on a column that is wrapped in INET_ATON, so you would have to pick one of these!

Adding a Row into an alphabetically ordered SQL table

I have a SQL table with two columns:
'id' int Auto_Increment
instancename varchar
The current 114 rows are ordered alphabetically by instancename.
Now I want to insert a new row that fits into the order.
So say it starts with a 'B': it would be at around id 14 and would therefore have to 'push down' all of the rows after id 14. How do I do this?
An SQL table is not inherently ordered! (It is just a set.) You would simply add the new row and view it using something like:
select instancename
from thetable
order by instancename;
I think you're going about this the wrong way. IDs shouldn't be changed. If you have tables that reference these IDs as foreign keys then the DBMS wouldn't let you change them, anyway.
Instead, if you need results from a specific query to be ordered alphabetically, tell SQL to order it for you:
SELECT * FROM table ORDER BY instancename
As an aside, sometimes you want something that can seemingly be a key (read: needs to be unique for each row) but does have to change from time to time (such as something like a SKU in a product table). This should not be the primary key for the same reason (there are undoubtedly other tables that may refer to these entries, each of which would also need to be updated).
Keeping this information distinct will help keep you and everyone else working on the project from going insane.
Try using an OVER clause and joining the table to itself.
Update thetable
Set ID = r.ID
From thetable c Join
( Select instancename, Row_Number() Over(Order By instancename) As ID
From thetable) r On c.instancename = r.instancename
This should update the id column to the ordered number. You may have to disable its identity first.
