I want to know about indexes.
I want to create an index on one of my MySQL tables (about 300,000 rows). Here are some of its columns:
item_id (primary key)
item_name
categoryid
date_added
impressions
visits
I want to create an index on the categoryid column. I have read in other posts that UPDATE, INSERT and DELETE become slower because MySQL recreates indexes on every update.
So here are my questions:
What does "update" mean here?
Does it mean an update to any column of any row, or an update to that
particular indexed column (categoryid here)?
I ask because whenever an item is shown in search results its impressions counter is incremented, and if a user visits the item's page then visits is incremented.
So, will these updates to impressions and visits recreate the
index on categoryid every time (categoryid does not change on
these updates), or will the index only be touched when categoryid is updated
or a new row is added?
The index on categoryid is updated only when categoryid itself changes or when a row is added or deleted. Updating impressions or visits does not touch it. What gets updated is the index structure that maps categoryid values to rows.
Whether the index is worth it depends on your queries; indexing is a way of optimization.
First of all you can identify slow queries by enabling the slow query log in my.cnf: log-slow-queries together with long_query_time = value (newer MySQL versions call the first option slow_query_log).
This gives you an idea if there is a need for optimization.
Next let MySQL explain the query with EXPLAIN.
The most valuable columns are possible_keys (e.g. item_name, categoryid) and key, the index actually used (e.g. item_name). Also look for "Using filesort" in the Extra column: it says the rows are sorted in an extra pass, which an index matching the ORDER BY could save.
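But because usually only one index per table is used for a query, it is also worth thinking about combining columns in one index:
Combined index:
(categoryid, item_name) when your WHERE part is categoryid="iao" AND item_name="xyz"
(item_name, categoryid) when your WHERE part is item_name="xyz" AND categoryid="iao"
---> the order of the columns in the index is important! The order in which the conditions are written in WHERE does not matter, so either index serves both queries above, but an index can only be used when the query constrains its leftmost column: (categoryid, item_name) helps a query on categoryid alone, not one on item_name alone.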
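The effect of column order in a combined index can be checked with a query plan. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL (SQLite's planner differs in detail, and EXPLAIN QUERY PLAN is SQLite's counterpart of MySQL's EXPLAIN), and the index name idx_cat_name and the helper query_plan are made up for the illustration.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the detail text of SQLite's EXPLAIN QUERY PLAN for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of every plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items ("
    " item_id INTEGER PRIMARY KEY,"
    " item_name TEXT, categoryid TEXT, date_added TEXT,"
    " impressions INTEGER, visits INTEGER)"
)
# Combined index with categoryid as the leftmost column (hypothetical name).
conn.execute("CREATE INDEX idx_cat_name ON items (categoryid, item_name)")

# Both columns constrained; the conditions are written in the "wrong" order on purpose.
both_columns = query_plan(
    conn,
    "SELECT * FROM items WHERE item_name = ? AND categoryid = ?",
    ("xyz", "iao"),
)

# Only the second index column constrained: the leftmost column is missing.
second_only = query_plan(
    conn, "SELECT * FROM items WHERE item_name = ?", ("xyz",)
)

print(both_columns)
print(second_only)
```

In this sketch the first plan mentions idx_cat_name and the second does not. On MySQL the same experiment would be run with EXPLAIN, looking at the key column.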
If your WHERE part is item_name="xyz" AND categoryid="iao" and you have two separate single-column indexes:
1. index: item_name helps save time
2. index: categoryid is lost time, because typically only one of them is used
You get the most benefit from a combined index when the query also has an ORDER BY, e.g. WHERE item_name="xyz" AND categoryid="iao" ORDER BY date_added. In this case the combined index (item_name, categoryid, date_added) saves time.
And yes, weigh it up properly:
indexes cost time on every write (DELETE, UPDATE, REPLACE, INSERT) and save time on each SELECT that can use them.
When you index a column, any changes to values in that column will take "longer" to process because the index has to be updated as well, although this will hardly be noticeable unless you are dealing with millions of records. On the flip side, having the index means searching on those columns will be many times faster. It's a tradeoff you need to make, but usually the index is worthwhile if you are searching on those fields.
I have a table called INVOICES that receives entries from a PHP script. It has many columns, but the two most relevant are INVOICE_ID and INVOICE_TYPE. Basically the INVOICE_TYPE is a number from 0 to 3, which designates different types of invoices.
Up to this point, everything ran smoothly until two users submitted invoices while the server had a hiccup and wrote both in as the same INVOICE_ID. The reason for this is the PHP script reads the MAX INVOICE_ID of the INVOICE_TYPE, then adds 1, then inserts the new row with that INVOICE_ID. In essence, it is programmatically a primary key. 99.9% of the time it worked, but that one time it was a problem.
I have tried finding SQL solutions but do not have sufficient knowledge of it. I have tried doing it myself in a single SQL query that reads the MAX, increments it, and then inserts, but it just throws an exception saying that you cannot select from and insert into the same table at once.
What I'm wondering is if there is an auto-increment that could be conditional to the INVOICE_TYPE, to only increment if the type is matched. Any suggestions would help at this point.
A unique index over the two columns (INVOICE_ID, INVOICE_TYPE) will make one of two such colliding queries fail.
CREATE UNIQUE INDEX id_type_unique ON INVOICES (INVOICE_ID, INVOICE_TYPE);
INSERT INTO INVOICES (INVOICE_ID, INVOICE_TYPE) VALUES (1, 5); -- okay
INSERT INTO INVOICES (INVOICE_ID, INVOICE_TYPE) VALUES (1, 5); -- error
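Once the unique index rejects the colliding row, the application can simply retry with a fresh MAX. Below is a minimal sketch of that idea, assuming Python's built-in sqlite3 module as a stand-in for MySQL and a reduced two-column table; the function name insert_next_invoice and the retry limit are invented for the example and are not part of the original setup.

```python
import sqlite3


def make_db():
    """Create an in-memory table with the unique index from the answer above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoices ("
        " invoice_id INTEGER NOT NULL,"
        " invoice_type INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE UNIQUE INDEX id_type_unique ON invoices (invoice_id, invoice_type)"
    )
    return conn


def insert_next_invoice(conn, invoice_type, max_attempts=5):
    """Insert a row with the next free invoice_id for this type and return it."""
    for _ in range(max_attempts):
        (current_max,) = conn.execute(
            "SELECT COALESCE(MAX(invoice_id), 0) FROM invoices WHERE invoice_type = ?",
            (invoice_type,),
        ).fetchone()
        candidate = current_max + 1
        try:
            with conn:  # commits on success, rolls back on an exception
                conn.execute(
                    "INSERT INTO invoices (invoice_id, invoice_type) VALUES (?, ?)",
                    (candidate, invoice_type),
                )
            return candidate
        except sqlite3.IntegrityError:
            # Another writer took this id first; read MAX again and retry.
            continue
    raise RuntimeError("could not allocate an invoice id")


conn = make_db()
print(insert_next_invoice(conn, 1))
print(insert_next_invoice(conn, 1))
print(insert_next_invoice(conn, 2))
```

The numbering stays separate per invoice type, and the unique index turns a race into an error that is retried instead of a silent duplicate.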
If you insert only one row into one table at a time, the simplest solution is to put a unique index on both columns.
CREATE UNIQUE INDEX invoice_id_type_unique
ON INVOICES(INVOICE_ID,INVOICE_TYPE);
But if you execute several queries based on the same data, you need to use a transaction to prevent only part of the data from being modified or inserted.
START TRANSACTION;
-- @ introduces a user variable (# would start a comment); FOR UPDATE asks InnoDB to lock the rows it reads
SELECT @invoice_id := MAX(INVOICE_ID) + 1 FROM INVOICES WHERE INVOICE_TYPE=1 FOR UPDATE;
INSERT INTO INVOICES (...,@invoice_id,...);
...
... #OTHER QUERIES UPDATING DATA
COMMIT;
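Make INVOICE_ID UNIQUE; this will solve your problem, because MySQL will not allow a duplicate value in that column. (If the same INVOICE_ID may legitimately appear under different INVOICE_TYPE values, as the per-type numbering suggests, the unique constraint has to cover both columns instead.)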
The way I like to do it is to create a table that holds all the sequences, plus a stored procedure that I call with the name of the sequence whose next value I want, something similar to:
START TRANSACTION;
SELECT value INTO result FROM Sequences WHERE name = paramSeqName FOR UPDATE;
UPDATE Sequences SET value = value + 1 WHERE name = paramSeqName;
COMMIT;
There is a good example here:
http://www.microshell.com/database/mysql/emulating-nextval-function-to-get-sequence-in-mysql/
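The sequence-table idea can be sketched outside a stored procedure as well. The example below is an illustration under assumptions, not the linked article's code: it uses Python's built-in sqlite3 module instead of MySQL, BEGIN IMMEDIATE plays the role that SELECT ... FOR UPDATE plays in the snippet above, and the table name sequences, the sequence name invoice_type_1 and the function next_value are hypothetical.

```python
import sqlite3


def open_db():
    # isolation_level=None puts the connection in autocommit mode,
    # so the BEGIN/COMMIT statements below are the only transaction control.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE sequences (name TEXT PRIMARY KEY, value INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO sequences (name, value) VALUES ('invoice_type_1', 1)")
    return conn


def next_value(conn, name):
    """Return the current value of the named sequence and advance it by one."""
    # Take the write lock up front so that read and increment happen atomically.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT value FROM sequences WHERE name = ?", (name,)
        ).fetchone()
        if row is None:
            raise KeyError(name)
        conn.execute(
            "UPDATE sequences SET value = value + 1 WHERE name = ?", (name,)
        )
        conn.execute("COMMIT")
        return row[0]
    except Exception:
        conn.execute("ROLLBACK")
        raise


conn = open_db()
print(next_value(conn, "invoice_type_1"))
print(next_value(conn, "invoice_type_1"))
```

One row per invoice type in such a table would give each type its own counter, which is what the question asks for.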
I have a MyISAM table in MySQL with three columns - an auto-incrementing ID, an integer (customer id) and a decimal (account balance).
At this time, whenever I initialize my system, I completely wipe the table using:
truncate table xyz;
alter table xyz auto_increment=1001;
Then I repopulate the table with data from PHP. I usually end up having up to 10,000 entries in that table.
However, due to new requirements, I now need to be able to update the table while the system is running. So I can no longer wipe the table; I have to use UPDATE instead of INSERT and update the balances one by one, which will be much slower than inserting 20 new records at a time as I'm doing now.
PHP only sends the customer id and the amount to MySQL - the other id is not actually in use.
So my question is this:
Does it make sense to put an index on the customer id to speed up updating given that the input from PHP is most likely not going to be in order? Or will adding the index slow it down enough to not make it worthwhile?
I also don't know if the index is used at all for the UPDATE command or not ...
Any other suggestions on how to speed this up?
It depends on what your update query is. Presumably it is like:
update xyz
set val = <something>
where id = <something else>
If so, an index on id will definitely help speed things up, because you are looking for one record.
If your query looks like:
update xyz
set val = 1.001 * val;
An index will neither help, nor hurt. The entire table will need to be scanned and the index does not get involved.
If your query is like:
update xyz
set id = id+1;
Then an index will be slower. You have to read and write to every row of the table, plus you then have the overhead of maintaining the index.
Ok I'll make this into an answer. If you are saying:
Update xyz set balance=100 where customer_id = 123;
Then yes an index on customer_id will definitely increase the speed since it will find the row to update much quicker.
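Whether an UPDATE can use the index is something a query plan shows directly. This is a rough sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL (where you would run EXPLAIN on the statement instead); the column names follow the question, while the index name idx_customer and the helper query_plan are made up.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the detail text of SQLite's EXPLAIN QUERY PLAN for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE xyz ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " customer_id INTEGER NOT NULL,"
    " balance NUMERIC NOT NULL)"
)

update_sql = "UPDATE xyz SET balance = ? WHERE customer_id = ?"

# Without an index the row to update can only be found by scanning the table.
plan_before = query_plan(conn, update_sql, (100, 123))

conn.execute("CREATE INDEX idx_customer ON xyz (customer_id)")

# With the index the WHERE clause becomes an index lookup.
plan_after = query_plan(conn, update_sql, (100, 123))

print(plan_before)
print(plan_after)
```

The index is not modified by this UPDATE, because only balance changes; it is used purely to locate the row.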
As a further improvement: if you have columns (id, customer_id, balance), customer_id is unique and id is just an auto-incremented column, get rid of the id column and make customer_id the primary key. Primary keys do not have to be auto-incremented integer columns.
I have three or four tables in a MySQL database associated with an upcoming Android app that potentially may explode to thousands of rows very fast. At this time, I have about 6 - 8 SELECT and 2 INSERT SQL commands that will need to be done.
After doing research, I have found that I will have to use indexing to cut down on load time. I have searched for several tutorials on different sites to see if I can pick this up, but I have found nothing that explains very clearly what it is and how to do it.
Here's the situation:
First and foremost, it will be using a Godaddy MySQL server. Unlimited bandwidth and 150,000 MB. Here is one table that will be getting lots of use:
items_id (int 11)
item (100 varchar)
cat_id (int 11)
In PHPMyAdmin it says for indexes:
Keyname/PRIMARY type/PRIMARY Cardinality/576 items_id
So it appears there is an index established, correct?
Here is one SQL Query (via PHP) related to this table (SELECT):
"SELECT * FROM items WHERE cat_id = ' ".$_REQUEST['category_id']."' ORDER BY TRIM(LEADING 'The ' FROM item) ASC;"
And another (INSERT):
"INSERT INTO items (item, cat_id) VALUES ('{$newItem}', '{$cat_id}')"
My main questions are: With these methods, am I utilizing the best speed possible and making use of the established indexes? Or does this have "slow" written all over it?
Simple selects / inserts cannot be changed to take advantage of indexes.
But indexes can be added to the tables to make the queries run faster.
Well, actually inserts do not benefit from indexes (apart from the lookups that unique checks and InnoDB foreign key constraints require); each insert has to add an entry to every index on the table.
If you're using a column in the WHERE / GROUP BY / ORDER BY clauses of a SELECT statement, you may consider adding an index on it. A good idea would be to run EXPLAIN on the queries in question and see how the database engine uses the columns in the WHERE clause.
If a column has a small set of non-unique possible values (gender: male/ female) it makes little sense to add an index for it because you won't be searching for all the females or all the males (and half a table search is not very different than a full table search). But if you use that column along with another column to filter / group / sort you may want to add a composite index (multi-column index) on them.
Databases within MySQL are organized as folders. The folders contain multiple files for each table.
For a MyISAM table there's a table definition file, a table data file and an index file. If you define an index for a column or multiple columns, it is stored in that index file.
If you don't have any indexes, not even a primary key, any SELECT statement is going to do a full table scan, which for hundreds of thousands of entries becomes noticeably slow.
If you define an index, MySQL reads all the values in the table for that column or set of columns and writes a sorted structure that maps each value of that column (or those columns) to the records that contain it.
That structure should be much smaller than the data file and should usually fit into memory entirely alongside the other indexes. MySQL then has to look up (and, when several indexes are used, intersect) the matching record lists in it to find out which records match the select criteria, and then cherry-pick the data it needs from the data table.
Primary and Unique indexes have a direct correspondence between one value and one record. So searching by unique value is fast.
Say I have a table called materials.
The materials table contains these columns:
item name | item description | stock date | sale date | price
What I am looking into is that sometimes I may want to sort results by item name, sometimes by item description, sometimes by stock date, sometimes by sale date, and sometimes by price.
So how should I design the table for these criteria? And how do I add an index to all columns? Is it necessary to add an index to all of them?
Any help?
Well, my table will have more than a million rows.
I am using PHP and MySQL.
There's no reason why you can't have an index on every column if it will help. You have to bear in mind the consequences of indexing, like slowing down inserts/deletes. You need to weigh up the pros and cons.
To create index...
http://dev.mysql.com/doc/refman/5.0/en/create-index.html
It's worth reading http://use-the-index-luke.com/ - yes, you can index every column; it will rarely do any good, because you have to tune the indices for the queries you're actually running.
Take a look at ORDER BY
Adding an index on a column speeds up search queries, but slows down inserting / deleting. However, first make your application work, and optimize afterwards.
If you want an index-assisted sorting on each column, you should create an index on each column.
Note that index scan is not necessarily faster: if you want all (or even a significant part) of the records returned, then the filesort will most probably be more efficient (unless your tables are really large in which case you don't want all records anyway).
Indexes will only help if you are using ORDER BY along with LIMIT.
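To see what index-assisted sorting looks like in a plan, here is a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, so the plan wording is SQLite's (a temporary B-tree there corresponds roughly to MySQL's "Using filesort"); the index name i_price and the helper query_plan are made up for the example.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the detail text of SQLite's EXPLAIN QUERY PLAN for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE materials ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT, description TEXT,"
    " stockdate TEXT, saledate TEXT, price NUMERIC)"
)

sql = "SELECT * FROM materials ORDER BY price LIMIT 20"

# No index on price yet: the rows have to be sorted in a separate step.
plan_without = query_plan(conn, sql)

conn.execute("CREATE INDEX i_price ON materials (price)")

# With the index the rows can be read in price order and the scan can stop at the LIMIT.
plan_with = query_plan(conn, sql)

print(plan_without)
print(plan_with)
```

On MySQL the comparable check is whether "Using filesort" disappears from the Extra column of EXPLAIN once the index exists.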
Indexes should be created based on your search criteria. A simple example in your case: you might want to search your table for items that are of a particular price. The index for that should be item name, item description and price.
I tend to plan the table so that I know what I'm likely to do with it and then create the indexes accordingly. So you might have functions such as getItemsBySaleDate() or getItemsCheaperThan(); these would need different indexes, as they search different columns of the table. I would suggest for now simply creating an index for each column on the table.
I'd also add one for:
item_search => itemname, itemdescription, price
Sure, you should create one index for each sort criterion. Don't forget to add secondary columns to the indexes, e.g. for the price index you should add item name, so the items will be sorted not only by price, but then by name (within the same price group).
As others have said already, be careful with the number of indexes: all the indexes must be updated upon each insert or update. Do you really want to sort items on description?
Why use an index
Indexes are used for two things.
Selection of items
Sorting of the list
For selection of items you need an index, because searching through all records is not an option.
However if you only ever select 100 items at a time, MySQL can easily sort those items in place without using an index.
So first put indexes on the items that are in your where and join clauses.
Then see how many items you select per query. If it's fewer than say 200, I would not bother with indexes for sorting.
Adding an index
CREATE INDEX index_name ON tbl_name (price)
See http://dev.mysql.com/doc/refman/5.1/en/create-index.html for all the options you can give an index.
Creating the table
My suggestion:
CREATE TABLE materials (
id integer not null auto_increment primary key,
/*MySQL requires a length for varchar; 255 is only a placeholder*/
name varchar(255) not null,
description varchar(255) not null,
stockdate date not null,
saledate date not null,
price decimal(10,2) not null,
/*my suggestion, put an index on all, but not on description*/
INDEX `i_name` (name),
INDEX `i_stockdate` (stockdate),
INDEX `i_saledate` (saledate),
INDEX `i_price` (price)) ENGINE = MyISAM;
If you select on the description in the where clause, then add a fulltext index on description.
CREATE FULLTEXT INDEX i_description ON materials (description);
If you only sort on description do not add an index, it's not worth it IMO.
Ok so I've a SQL query here:
SELECT a.id,... FROM article AS a WHERE a.type=1 AND a.id=3765 ORDER BY a.datetime DESC LIMIT 1
I wanted to get the exact article by country and id, and for that I created an index with two columns, type and id. id is also the primary key.
I used the EXPLAIN keyword to see which index is used, and instead of the multiple-column index it used the primary key index, even though I wrote the WHERE conditions in exactly the order in which the index was created.
Does MySQL use the primary key index instead of the multiple-column index because the primary one is faster? Or should I force MySQL to use the multiple-column index?
P.S. Just noticed it was stupid to use ORDER BY when there is 1 result row. Haha. It increased the search time by 0.0001 seconds. :P
I don't KNOW, but I would THINK that the primary key index would be the fastest available. And if it is, there's not much use in using any other index. You're either going to have an article with an id of 3765 or you're not. Checking that single row to determine if the type matches is trivial.
If you're only returning one row, there's no point to your ORDER BY clause. And the only point to the a.type=1 is to reject an article with the right id if the type is not correct.
MySQL allows for up to 32 indexes for each table in older versions (the exact limit depends on the version and storage engine; recent versions allow 64), and each index can incorporate up to 16 columns. A multiple-column / composite index can be considered a sorted array containing values that are created by concatenating the values of the indexed columns. MySQL uses multiple-column indexes in such a way that queries are fast when you specify a known quantity for the first column of the index in a WHERE clause, even if you do not specify values for the other columns.
If you look very carefully in how MySQL uses indexes, you will find that indexes are used to find rows with specific column values quickly. Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows.
In MySQL, a primary key column is automatically indexed for efficiency, whether or not it uses the built-in AUTO_INCREMENT feature. On the other hand, one should not go overboard with indexing. While it does improve the speed of reading from databases, it slows down the process of altering data in a database (because the changes need to be recorded in the index). Indexes are best used on columns:
that are frequently used in the WHERE part of a query
that are frequently used in an ORDER BY part of a query
that have many different values (columns with numerous repeating values ought not to be indexed).
So I try to use the primary key whenever it suffices for my queries. Only when more indexing is required to fetch records faster do I use composite indexes.
Hope it helps.
The primary key is unique, so there's no need for MySQL to check any other index. a.id=3765 guarantees that there will be no more than one row returned. If a.type=1 is false for that row, then nothing will be returned.