I have this query:
SELECT * FROM projects WHERE current = 1 ORDER BY id DESC LIMIT 0,20
id is the primary key, and there is an index on the current and status fields:
CREATE INDEX currentstatus ON projects (current, status)
The table has 30,000+ rows, and this query runs in about 4.0s
I would like to use FORCE INDEX FOR ORDER BY (PRIMARY), but MySQL 5.0 does not support it.
Explain shows the currentstatus is used for the query, but I would have liked to use currentstatus for the WHERE clause, and PRIMARY for the ORDER BY clause
We cannot upgrade the database as this is heavily used and we cannot have any downtime. I am wondering if there is a way to optimize this query in MySQL 5.0 so it will use the primary key index when sorting?
EDIT: The ideal query, as an example, in MySQL 5.1+ would be
SELECT * FROM projects FORCE INDEX FOR ORDER BY (PRIMARY) WHERE current = 1 ORDER BY id DESC LIMIT 0,20;
In MySQL 5.0, as you have mentioned, it is not possible to force or use a specific index for only the ORDER BY clause.
http://dev.mysql.com/doc/refman/5.0/en/index-hints.html
MySQL 5.1+ does allow this; the limitation in 5.0 is probably why the feature was added in 5.1.
I have found a solution that is more of a workaround for the issue. This will at least allow for much faster queries:
SELECT * FROM projects WHERE current = 1 AND id > ((SELECT MAX(id) FROM projects) - 1000) ORDER BY id DESC LIMIT 20;
This defaults to using the PRIMARY key on id for both the ORDER BY and the WHERE clause, and the subset to scan is small enough that the query runs in 0.00s vs 4.35s on my current data set. For my data, the last 1000 rows are guaranteed to contain at least 20 records with current = 1; with sparser data the window would need to be larger.
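To make the windowing idea concrete, here is a minimal sketch using Python's built-in sqlite3 as a stand-in for MySQL (so it says nothing about MySQL 5.0's optimizer). The helper name newest_current and the generated test data are assumptions for illustration only.

```python
import sqlite3

def newest_current(conn, window=1000, limit=20):
    """Return up to `limit` newest rows with current = 1,
    looking only at the last `window` ids (hypothetical helper)."""
    sql = """
        SELECT id, current FROM projects
        WHERE current = 1
          AND id > (SELECT MAX(id) FROM projects) - ?
        ORDER BY id DESC
        LIMIT ?
    """
    return conn.execute(sql, (window, limit)).fetchall()

# Made-up data: 30,000 rows, every third row has current = 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE projects (id INTEGER PRIMARY KEY, current INTEGER, status INTEGER)")
conn.executemany(
    "INSERT INTO projects (id, current, status) VALUES (?, ?, 0)",
    [(i, 1 if i % 3 == 0 else 0) for i in range(1, 30001)],
)
rows = newest_current(conn)
```

If the window holds fewer than `limit` matching rows, the query simply returns fewer rows, so the caller would have to retry with a larger window.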
I have a performance issue when working with a huge table
I added an index on the column like this:
ALTER TABLE `table` ADD INDEX (`column`);
and on the text/blob column:
ALTER TABLE `table` ADD INDEX (cat(200));
My table has about 6M rows and I am using the InnoDB engine (MySQL 5.5).
This query is very fast now that I have added an index on the "order by" column:
SELECT * from table order by column DESC LIMIT 0,40
But when I add a WHERE clause to this query it is very slow and takes about 10 seconds to load, even with the index on the "cat" column as above.
SELECT * from table WHERE cat = 'electronic' order by column DESC LIMIT 0,40
the EXPLAIN of this slow query :
EXPLAIN SELECT * from table WHERE cat = 'electronic' order by 'id' DESC LIMIT 0,40
id : 1
select_type : SIMPLE
table : product
type : ref
possible_keys: cat
key: cat
Key_len: 203
ref: const
row : 1732184
extra: using where
The query works fine on a small table with 50k rows, but with 6M rows it is slow. Why?
Do not use prefixing, such as cat(200); it usually makes the index unusable. I have never seen a case where the Optimizer, when faced with INDEX(a(10), b), gets past a and makes any use of b.
Change cat to be VARCHAR(255). That is probably more than sufficient for "categories".
The best index (if it is possible) is
INDEX(cat, `column`)
Note that cat is in the WHERE with =. It handles the entire WHERE, so the index can move on to the ORDER BY. Hence column can be used, too. More discussion of index making.
If cat must be TEXT, then the best you can do is
INDEX(`column`)
Then the Optimizer may decide to use it to avoid a filesort. But if there are fewer than 40 (see LIMIT) 'electronic' rows, it will take a big scan and probably be slower than not using the index. So, I am not sure that it is even worth having INDEX(column).
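As a rough illustration of why a composite index on (cat, column) helps, the sketch below builds one in SQLite (via Python's sqlite3) and inspects the query plan. SQLite's planner is not MySQL's, so treat this only as a demonstration of the principle; the column name sort_col and the index name idx_cat_sort are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, cat TEXT, sort_col INTEGER)")
# Equality column first, then the ORDER BY column.
conn.execute("CREATE INDEX idx_cat_sort ON product (cat, sort_col)")
conn.executemany(
    "INSERT INTO product (cat, sort_col) VALUES (?, ?)",
    [("electronic" if i % 4 == 0 else "other", i) for i in range(1, 2001)],
)

query = "SELECT * FROM product WHERE cat = 'electronic' ORDER BY sort_col DESC LIMIT 40"
# Each EXPLAIN QUERY PLAN row is (id, parent, notused, detail); keep the detail text.
plan = " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query))
rows = conn.execute(query).fetchall()
print(plan)
```

In this sketch the plan mentions the composite index and no separate sort step, because the index already delivers the matching rows in sort_col order.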
For this query:
SELECT t.*
FROM table t
WHERE cat = 'electronic'
ORDER BY column DESC
LIMIT 0, 40;
The best index is a composite index on table(cat, column). You can use a prefix if column is too wide: table(cat, column(200)).
The best option is to index the table; if you don't know how to do it, you can check this doc.
Then, when you run the query, MySQL will search the indexed values, skipping a lot of data that is useless for that request.
I have a MySQL script that takes a database query and cuts off a certain amount of rows depending on some settings. So if I have a user with a subscription of 100,000 things, and the user uploads 110,000, the script cuts off the last 10,000.
Here is the MySQL script:
DELETE FROM `my_table`
WHERE id <= (
SELECT id
FROM (
SELECT id
FROM `my_table`
WHERE some_id = $this->id
ORDER BY id DESC
LIMIT 1 OFFSET $max
) sp
);
Where max is 100,000
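For reference, the trimming logic of that DELETE can be sketched against SQLite with Python's sqlite3. This only illustrates the SQL side, not Elasticsearch. The helper name trim_to_newest is hypothetical, and the extra some_id filter on the outer DELETE is my assumption (the query above does not have it).

```python
import sqlite3

def trim_to_newest(conn, some_id, max_rows):
    """Delete all but the `max_rows` newest rows (highest ids) for one some_id."""
    conn.execute(
        """
        DELETE FROM my_table
        WHERE some_id = ?              -- assumption: restrict the delete to this owner
          AND id <= (
              SELECT id FROM my_table
              WHERE some_id = ?
              ORDER BY id DESC
              LIMIT 1 OFFSET ?         -- first row past the allowed maximum
          )
        """,
        (some_id, some_id, max_rows),
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, some_id INTEGER)")
# 110 rows for owner 7 (ids 1..110), 5 rows for owner 8 (ids 111..115).
conn.executemany("INSERT INTO my_table (some_id) VALUES (?)", [(7,)] * 110 + [(8,)] * 5)
trim_to_newest(conn, some_id=7, max_rows=100)
remaining = conn.execute("SELECT COUNT(*), MIN(id) FROM my_table WHERE some_id = 7").fetchone()
untouched = conn.execute("SELECT COUNT(*) FROM my_table WHERE some_id = 8").fetchone()[0]
```

When the owner has max_rows rows or fewer, the subquery yields NULL and nothing is deleted.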
That deletes any extra rows. I have since started implementing Elasticsearch and am trying to duplicate this functionality, but I don't know where to start because I am not yet well versed in this software.
I have been looking at the deleteByQuery method in the PHP API, but I don't see anything about offsets or anything like that.
Can someone point me in the right direction?
Try this one, it will delete extra records
DELETE FROM my_table WHERE id IN (
SELECT id FROM (
SELECT id
FROM my_table
WHERE some_id = $this->id
ORDER BY id ASC
LIMIT $maxRecordsAllowed, $countHowManyToDelete
) extra
)
I need to create an InnoDB table that I will use to add data to and constantly fetch the most recent 10 rows added to it. To avoid having to do an ORDER BY with every SELECT query to get those last 10 rows, I would like to have the table itself ordered by the Primary Key in DESC order so that I can skip the ORDER BY entirely and just do a SELECT ... LIMIT 10, which should automatically pull the most recent 10 rows added to the table.
How can I do that? Is it as simple as adding ORDER BY [PRIMARYKEY] DESC to the CREATE TABLE query? Will the table continue to be sorted in DESC order even after INSERTing new rows?
An RDBMS never provides any guarantee about the order of the rows in the tables it manages. The only way to get a specific order is to ask for one. In MySQL, rows often happen to come back sorted by primary key in ascending order when that key is auto-incremented, but this is not a guaranteed property.
Use ORDER BY on your queries to get the desired result.
On the other hand, the ordering will be faster if the primary key index is a BTREE (which is the default on most engines).
The sort direction of an index isn't yet used in MySQL 5.5 (an ASC or DESC keyword in the index definition is parsed but ignored).
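A small sketch of the recommended pattern, using Python's sqlite3 as a stand-in for InnoDB: the order is requested explicitly on every read and the primary key index makes that cheap. The table name events and its columns are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT)")
conn.executemany(
    "INSERT INTO events (payload) VALUES (?)",
    [(f"event {i}",) for i in range(1, 51)],
)

# Always ask for the order; never rely on how the table happens to be stored.
latest = conn.execute(
    "SELECT id, payload FROM events ORDER BY id DESC LIMIT 10"
).fetchall()
```

Here latest holds the ten rows with the highest ids, newest first.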
I would like to be able to pull back 15 or so records from a database. I've seen that using WHERE id = rand() can cause performance issues as my database gets larger. All solutions I've seen are geared towards selecting a single random record. I would like to get multiples.
Does anyone know of an efficient way to do this for large databases?
Further Edit and Testing:
I made a fairly simple table, on a new database using MyISAM. I gave this 3 fields: autokey (unsigned auto number key) bigdata (a large blob) and somemore (a medium int). I then applied random data to the table and ran a series of queries using Navicat. Here are the results:
Query 1: select * from test order by rand() limit 15
Query 2: select *
from
test
join
(select round(rand()*(select max(autokey) from test)) as val from test limit 15) as rnd
on
rnd.val=test.autokey;
(I tried both select and select distinct and it made no discernible difference)
and:
Query 3 (I only ran this on the second test):
SELECT *
FROM (
SELECT @cnt := COUNT(*) + 1,
@lim := 10
FROM test
) vars
STRAIGHT_JOIN
(
SELECT r.*,
@lim := @lim - 1
FROM test r
WHERE (@cnt := @cnt - 1)
AND RAND(20090301) < @lim / @cnt
) i
ROWS        QUERY 1   QUERY 2   QUERY 3
2,060,922   2.977s    0.002s    N/A
3,043,406   5.334s    0.001s    1.260s
I would like to do more rows so I can see how query 3 scales, but at the moment, it seems as though the clear winner is query 2.
Before I wrap up this testing and declare an answer, and while I have all this data and the test environment set up, can anyone recommend any further testing?
Try:
select * from table order by rand() limit 15
Another (and possibly more efficient) way would be to join against a set of random values. This should work if there is a contiguous integer key in the table. Here is how I would do it in Postgres (my MySQL is a bit rusty):
select * from table join
(select (random()*maxid)::integer as val from generate_series(1,15)) as rnd
on rnd.val=table.id;
where maxid is the highest id in table. If id has an index, this means only 15 index lookups, so it is very fast.
UPDATE:
Looks like there is no such thing as generate_series in MySQL. My fault. We don't actually need it:
select *
from
table
join
-- this just returns 15 random numbers.
-- I need `table` here only to produce rows for rand()
(select round(rand()*(select max(id) from table)) as val from table limit 15) as rnd
on
rnd.val=table.id;
P.S. If I don't want duplicates returned, I can use (select distinct [...]) in the random generator expression.
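The same idea can also be sketched with the random keys drawn in the application instead of in SQL. This is a variation, not the query above: it uses Python's sqlite3 as a stand-in for MySQL, the helper name random_rows is hypothetical, and the table mirrors the asker's test table (autokey, somemore).

```python
import random
import sqlite3

def random_rows(conn, count=15, rng=random):
    """Fetch rows for up to `count` random keys; gaps or duplicate draws yield fewer rows."""
    max_key = conn.execute("SELECT MAX(autokey) FROM test").fetchone()[0]
    if max_key is None:
        return []
    # A set removes duplicate draws, so no row can come back twice.
    candidates = {rng.randint(1, max_key) for _ in range(count)}
    placeholders = ",".join("?" * len(candidates))
    sql = f"SELECT autokey, somemore FROM test WHERE autokey IN ({placeholders})"
    return conn.execute(sql, tuple(candidates)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (autokey INTEGER PRIMARY KEY, somemore INTEGER)")
conn.executemany("INSERT INTO test (autokey, somemore) VALUES (?, ?)",
                 [(i, i * 10) for i in range(1, 1001)])
rows = random_rows(conn)
```

Each key is a primary key lookup, so the cost stays small regardless of table size, at the price of sometimes getting fewer than 15 rows.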
Update: Check out the accepted answer in this question. It's pure MySQL and even deals with even distribution.
The problem with id = rand() or anything comparable in PHP is that you can't be sure whether that particular ID still exists. Therefore, you need to work with LIMIT, and that can become slow for large amounts of data.
As an alternative to that, you could try using a loop in PHP.
What the loop does is
1. Create a random integer number using rand(), with a scope between 0 and the number of records in the database
2. Query the database whether a record with that ID exists
3. If it exists, add the number to an array
4. If it doesn't, go back to step 1
5. End the loop when the array of random numbers contains the desired number of elements
This method could cause a lot of queries in a fragmented table, but they should be pretty fast to execute. It may be faster than LIMIT rand() in certain situations.
The LIMIT method, as outlined by @Luther, is certainly the simplest code-wise.
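The loop described above could look roughly like this. It is a sketch in Python with sqlite3 rather than PHP with MySQL, and it deviates from the steps in two labeled ways: it draws from 1..MAX(id) instead of the record count (so ids beyond the count stay reachable in a fragmented table), and it skips ids already picked. The names pick_random_ids and items are hypothetical.

```python
import random
import sqlite3

def pick_random_ids(conn, wanted=15, max_attempts=10_000, rng=random):
    """Collect `wanted` distinct existing ids by guessing random ids and checking each one."""
    max_id = conn.execute("SELECT MAX(id) FROM items").fetchone()[0]
    picked = []
    if max_id is None:
        return picked
    for _ in range(max_attempts):          # bounded, so a tiny table cannot loop forever
        if len(picked) >= wanted:
            break
        candidate = rng.randint(1, max_id)
        if candidate in picked:
            continue                        # assumption: duplicates are not wanted
        exists = conn.execute("SELECT 1 FROM items WHERE id = ?", (candidate,)).fetchone()
        if exists:
            picked.append(candidate)
    return picked

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
# Only even ids exist, to simulate a fragmented table.
conn.executemany("INSERT INTO items (id, name) VALUES (?, ?)",
                 [(i, f"item {i}") for i in range(2, 401, 2)])
ids = pick_random_ids(conn, rng=random.Random(42))
```

The more gaps the id range has, the more wasted lookups there are, which is the trade-off mentioned above.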
You could do a query with all the results or however many limited, then use mysqli_fetch_all followed by:
shuffle($a);
$a = array_slice($a, 0, 15);
For a large dataset doing
select * from table order by rand() limit 15
can be quite time and memory consuming.
If your data records happen to be numbered, you can put an index on the numbering column and do a
select * from table where no >= rand() limit 15
Or even better do the random number generation in your application and do
select * from table where no >= $rand and no <= $rand+15
If your data doesn't change too often, it might be worth adding such a numbering column to make the selection efficient.
Assuming MySQL supports nested queries and that operations on the primary key are fast, I'd try something like
select * from table where id in (select id from table order by rand() limit 15)
What SQL query would I use to display the newest entry?
Details:
id is the primary key. I have other fields, but they are not related to when the rows were added.
ORDER BY SomeColumn DESC
LIMIT 1
or
use the MAX() function
Since you didn't give any details about your table it is hard to answer
SELECT * from yourTable ORDER BY `id` DESC LIMIT 1;
Another (better) way would be to have a "date_added" column (date_added TIMESTAMP DEFAULT CURRENT_TIMESTAMP) so you could order by that column descending instead. Dates are more reliable than ID-assignment.
Not sure if this is what you're looking for, but I use mysql_insert_id() after inserting a new row.
Auto-incremented ID columns do not always reflect which records were inserted last; I remember a really painful experience with this behavior. The conditions were specific: it was MySQL 4.1.x at the time, and there were almost 1 million records, of which 1 out of 3 was deleted every day and others re-inserted within the next 24 hours. It made a huge mess when I realized that ordering them by ID was not ordering them by age.
Since then, I use a dedicated column for age-related sorts and populate it with date = NOW() on each row insert.
Of course, this will find the latest record as you want, by adding ORDER BY date DESC LIMIT 0,1 to your query.
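To show the difference between "highest id" and "most recently added", here is a small sketch with Python's sqlite3. The ids and timestamps are inserted explicitly and are made up; how ids end up out of insertion order in a real MySQL table is not modeled here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (id INTEGER PRIMARY KEY, name TEXT, date_added TEXT NOT NULL)"
)
# Hypothetical history: the row with id 2 was written last, after id 3 already existed.
conn.executemany(
    "INSERT INTO records (id, name, date_added) VALUES (?, ?, ?)",
    [
        (1, "first", "2010-01-01 09:00:00"),
        (3, "second", "2010-01-02 09:00:00"),
        (2, "written last", "2010-01-03 09:00:00"),
    ],
)

newest_by_date = conn.execute(
    "SELECT id, name FROM records ORDER BY date_added DESC LIMIT 1"
).fetchone()
newest_by_id = conn.execute(
    "SELECT id, name FROM records ORDER BY id DESC LIMIT 1"
).fetchone()
```

In this made-up data the two queries disagree, and only the date column identifies the row that was actually added last.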
SELECT Primary_Key_Field FROM table ORDER BY Primary_Key_Field DESC LIMIT 1
Replace Primary_Key_Field and table obviously :)