Say I have a table called materials.
The materials table contains these columns:
item name | item description | stock date | sale date | price
Sometimes I want to sort the results by item name, sometimes by item description, by stock date, by sale date, or by price.
So how should I design the table to support that? How do I add an index to every column, and is it necessary to index all of them?
Any help?
My table will have more than a million rows.
I am using PHP and MySQL.
There's no reason why you can't have an index on every column if it will help. You have to bear in mind the consequences of indexing, such as slower inserts and deletes. You need to weigh up the pros and cons.
To create an index...
http://dev.mysql.com/doc/refman/5.0/en/create-index.html
It's worth reading http://use-the-index-luke.com/. Yes, you can index every column, but it will rarely do any good, because you have to tune the indexes for the queries you're actually running.
Take a look at ORDER BY
Adding an index on a column speeds up search queries but slows down inserts and deletes. However, first make your application work, and optimize afterwards.
If you want index-assisted sorting on each column, you should create an index on each column.
Note that an index scan is not necessarily faster: if you want all (or even a significant part) of the records returned, then a filesort will most probably be more efficient (unless your tables are really large, in which case you don't want all records anyway).
For sorting, indexes will only help if you are using ORDER BY along with LIMIT.
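A minimal sketch of the ORDER BY plus LIMIT case, assuming the materials table has a column named price and using an index name (i_price) chosen only for illustration. With such an index MySQL can read rows in index order and stop after the requested number; whether it actually does so on your data is something EXPLAIN should confirm.

```sql
-- Assumed index name; the price column comes from the question's table.
CREATE INDEX i_price ON materials (price);

-- The 20 cheapest items: rows can be read in index order and the scan can stop early.
SELECT * FROM materials ORDER BY price LIMIT 20;

-- Inspect the plan: check the "key" and "Extra" columns of the output.
EXPLAIN SELECT * FROM materials ORDER BY price LIMIT 20;
```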
Indexes should be created based on your search criteria. A simple example in your case: you might want to search your table for items of a particular price. The index for that should be item name, item description and price.
I tend to plan the table so that I know what I'm likely to do with it and then create the indexes accordingly. So you might have functions such as getItemsBySaleDate() or getItemsCheaperThan(); these would need different indexes, as they search different columns of the table. For now, I would suggest simply creating an index on each column of the table.
I'd also add one for:
item_search => itemname, itemdescription, price
Sure, you should create one index for each sort criterion. Don't forget to add secondary columns to the indexes: e.g. the price index should also include item name, so that items are sorted not only by price but then by name (within the same price group).
As others have said already, be careful with the number of indexes: every index must be maintained on each insert, and on each update that touches its columns. Do you really want to sort items by description?
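The price-then-name idea could look like the sketch below. The index name i_price_name is hypothetical, and the column names assume the table uses name and price.

```sql
-- Hypothetical index name; price is the leading sort key, name breaks ties.
CREATE INDEX i_price_name ON materials (price, name);

-- Both sort keys appear in the index in the same order as in ORDER BY.
SELECT name, price
FROM materials
ORDER BY price, name
LIMIT 50;
```

Because both selected columns are in the index, a query like this may be answered from the index alone, which is the covering-index case.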
Why use an index
Indexes are used for two things.
Selection of items
Sorting of the list
For selection of items you need an index, because searching through all records is not an option.
However if you only ever select 100 items at a time, MySQL can easily sort those items in place without using an index.
So first put indexes on the columns that appear in your WHERE and JOIN clauses.
Then see how many rows you select per query. If it's fewer than, say, 200, I would not bother with indexes for sorting.
Adding an index
CREATE INDEX index_name ON tbl_name (price);
See http://dev.mysql.com/doc/refman/5.1/en/create-index.html for all the options you can give an index.
Creating the table
My suggestion:
CREATE TABLE materials (
  id integer not null auto_increment primary key,
  name varchar(100) not null,          /* length is an assumption; adjust to your data */
  description varchar(1000) not null,  /* length is an assumption as well */
  stockdate date not null,
  saledate date not null,
  price decimal(10,2) not null,
  /* my suggestion: put an index on all, but not on description */
  INDEX `i_name` (name),
  INDEX `i_stockdate` (stockdate),
  INDEX `i_saledate` (saledate),
  INDEX `i_price` (price)) ENGINE = MyISAM;
If you select on the description in the WHERE clause, then add a fulltext index on description.
CREATE FULLTEXT INDEX i_description ON materials (description);
If you only sort on description, do not add an index; it's not worth it IMO.
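For reference, a fulltext index is queried with MATCH ... AGAINST rather than LIKE. The sketch below assumes the i_description fulltext index already exists and uses made-up search words.

```sql
-- Assumes a FULLTEXT index on materials(description) is already in place.
-- The column list in MATCH() must match the indexed column list.
SELECT id, name
FROM materials
WHERE MATCH(description) AGAINST('steel bolt');
```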
Related
I'm using Laravel and MySQL. My database table has over 1M records and continues to grow fast. I constantly need to filter or count by date range on the created_at or updated_at column. I'm wondering whether I should create indexes on created_at and updated_at. Do the indexes make the queries faster? And how much slower will inserts be if I create the two indexes?
Thanks, all.
You should add the indexes if your business requirements need you to query or order the records by updated_at or created_at, which is often the case.
Otherwise, if you only need the columns for occasional manual checks, there is no need to index them.
Indexes are used to find rows with specific column values quickly. Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows.
The larger the table, the more this costs. If the table has an index for the columns in question, MySQL can quickly determine the position to seek to in the middle of the data file without having to look at all the data. This is much faster than reading every row sequentially.
Indexes will degrade insert/delete performance since indexes have to be updated. In case of update it depends on whether you update indexed columns. If not, performance should not be affected.
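As a rough illustration of the date-range case, assuming a hypothetical table named orders with a created_at column (neither the table name nor the index name comes from the question):

```sql
-- Hypothetical table and index names.
CREATE INDEX idx_created_at ON orders (created_at);

-- A half-open range lets MySQL do a range scan on the index
-- instead of reading every row.
SELECT COUNT(*)
FROM orders
WHERE created_at >= '2024-01-01'
  AND created_at <  '2024-02-01';
```

How much slower inserts become is best measured on your own workload, since it depends on the storage engine and on how many indexes the table already has.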
I want to remove an index from a table whose PHP access code never uses the indexed column. The index takes up extra space and I am trying to trim it. It's a table of phone numbers. A phone number is linked to a user profile's id, so the table has 3 columns: id (index), number and person. I was wondering whether removing the index will affect the queries that use number or person in the WHERE clause. My gut feeling is that it shouldn't, but I am afraid computer science doesn't work on gut feelings. The data is accessed via joins. For example...
SELECT *
FROM people ... LEFT JOIN
phoneNumbers
ON people.id = phoneNumbers.person
Edit: Apparently no one seems to be able to answer the question in the title.
In the case you show, only the person column would benefit from an index.
Indexes help in basically four cases:
Row restriction, that is finding the rows by value instead of examining every row in the table.
Joining is a subset of row restriction, i.e. each distinct value in the first table looks up matching rows in the second table. Indexing a column that is referenced in the ON clause is done in the same way you would index a column referenced in the WHERE clause.
Sorting, to retrieve rows in index order instead of having to sort the result set as an additional step.
Distinct and Group By, to scan each distinct value in an index.
Covering index, that is when the query needs only the columns that are found in the index.
In the case of InnoDB, every table is treated as an index-organized table based on its primary key, and we should take advantage of this because primary key lookups are very efficient. So if you can redefine a primary key on your phoneNumbers.person column (in part), that would be best.
I think it is a good idea for all tables to have explicit primary keys, and an index necessarily comes with these. For instance, without a primary key it becomes difficult to delete specific rows if unwanted duplicates appear.
In general, indexes are used for WHERE clauses, ON clauses, and ORDER BY. If you have an id column, then foreign key references to the table should use that column, and not the other two columns. The index might also be used for a select count(*) from table query, but I'm not 100% sure if MySQL does this.
If removing an index on a column makes that big a difference, then you should be investigating other ways to make your database more efficient. One method would be using partitioning to store different parts of the database in different files.
If the id column is an auto-incrementing integer, you have already indexed the table in the most efficient way possible. Removing it will make MySQL treat (number, person) as the table's primary key, which will cause less efficient look-ups.
Additionally, any index you create in the future will contain two columns, the first being the indexed field in the desired order, the second being the table's primary key. If you remove the id column and later decide to index the table on person, then your index will be larger than the table itself: each row would be: | person | (number, person) |.
Given that you're querying on this relationship, the person column should be indexed, and leaving the id column in place will ensure that the person index is as small and as quick as possible.
The column "id" seems useless. If I've understood you correctly, I'd
drop the "id" column,
add a primary key constraint on {person, number}, and
a foreign key reference from "person" to people.id.
I'm assuming each person can have more than one phone number.
Creating a primary key constraint has a side-effect that you might not want. It creates an internal index on the key columns.
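A sketch of that layout, with several assumptions: the column types and the VARCHAR length are guesses, people.id is assumed to be an INT, and InnoDB is used so that the foreign key is actually enforced.

```sql
-- Column types are assumptions; person must match the type of people.id.
CREATE TABLE phoneNumbers (
  person INT NOT NULL,
  number VARCHAR(20) NOT NULL,
  PRIMARY KEY (person, number),
  FOREIGN KEY (person) REFERENCES people (id)
) ENGINE = InnoDB;
```

With person as the leading column of the primary key, the join on people.id = phoneNumbers.person can use that key directly.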
I want to create a table with this info:
ID bigint(20) PK AI
FID bigint(20) unique
points int(10) index
birthday date index
current_city varchar(175) index
current_country varchar(100) index
home_city varchar(175) index
home_country varchar(100) index
Engine = MyISAM
In school I learned: create two extra tables, one for cities and one for countries, and reference them by FK when inserting data. The reason I have doubts is:
This table will get around 10M inserts an hour. I'm afraid that if I insert a row and have to look up the city FK and country FK on every insert, I might lose a lot of speed. And is that worth the gain I get when selecting rows, which only happens with WHERE ID = id? There will be around 25M of those selects an hour.
Premature optimization is the root of all evil. Design cleanly first, and optimize next, when you have actual performance data.
A clean design would be a properly normalized table, i.e. with separate city and country tables.
I'm afraid if I Insert a row and have to lookup the city FK and country FK every insert, I might lose a lot of speed?
Actually, inserting just small IDs instead of raw country/city names in a varchar column may be more efficient:
This will result in fewer disk writes
You have a MyISAM table, so it has no FK support and doesn't do any foreign key lookup/check
Replacing the varchar columns with integers will put the table in fixed-length row format, which may be faster than the dynamic-length format
Benchmark with real data/workload, and see if de-normalizing is really worth it.
There's a reason why db normalization exists.
Use a table for cities, one for countries and join them with your master table via FK's.
Also, what country do you know having 100 chars in the name?
What city do you know having 175 chars in the name?
ID can be a bigint, but are you sure you need BIGINT(20)? Wouldn't INT(11) suffice? Anyway, make it AUTO_INCREMENT and don't add UNIQUE to it; that doesn't make any sense.
Also, you have indexes on every column but no composite index. This is wrong for so many reasons. Do not pre-index; index depending on your queries. Use EXPLAIN to see what needs to be indexed.
Also, don't be afraid to use composite indexes and avoid creating indexes for every column that you have.
Do all the above steps and you will have fast queries (let's hope at least)
The City and Country tables will be small (relatively) and probably fit nice in memory so lookups will be fast.
If that isn't fast enough, try to cache the lookup on the client side (i.e. in your PHP app).
Since your rows will be smaller (int instead of varchar) you can fit more rows on each page making index lookups faster.
Try to do it normalized first, it will probably be fast enough.
And make sure you use InnoDB instead of MyISAM. It has much better locking and your application looks very concurrent.
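A normalized version of the table might look roughly like the sketch below. The table names (countries, cities, users), the key names and the integer sizes are all assumptions made for illustration; the country is reached through the city rather than stored on the main table.

```sql
-- All table and key names here are hypothetical.
CREATE TABLE countries (
  id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(100) NOT NULL,
  UNIQUE KEY uq_country_name (name)
) ENGINE = InnoDB;

CREATE TABLE cities (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  country_id SMALLINT UNSIGNED NOT NULL,
  name VARCHAR(175) NOT NULL,
  UNIQUE KEY uq_city (country_id, name),
  FOREIGN KEY (country_id) REFERENCES countries (id)
) ENGINE = InnoDB;

-- The main table stores small integer ids instead of repeated strings.
CREATE TABLE users (
  ID BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  FID BIGINT UNSIGNED NOT NULL,
  points INT NOT NULL,
  birthday DATE NOT NULL,
  current_city_id INT UNSIGNED NOT NULL,
  home_city_id INT UNSIGNED NOT NULL,
  UNIQUE KEY uq_fid (FID),
  FOREIGN KEY (current_city_id) REFERENCES cities (id),
  FOREIGN KEY (home_city_id) REFERENCES cities (id)
) ENGINE = InnoDB;
```

Whether the extra lookups cost more than the smaller rows save is exactly what a benchmark with real data should answer.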
I have a question about indexes.
I want to create an index on one of my MySQL tables (it has 300,000 rows). Here are some of its columns:
item_id (primary key)
item_name
categoryid
date_added
impressions
visits
I want to create an index on the categoryid column. I have read in other posts that indexes make UPDATE, INSERT and DELETE slower because MySQL recreates the indexes on every update.
So here are my questions:
What does "update" mean here? Does it mean an update to any column of any row, or an update to that particular indexed column (categoryid here)?
In my case, whenever an item is shown in search results its impressions count is incremented, and if the user visits the item's page then visits is incremented.
So, will these updates to impressions and visits recreate the index on categoryid every time (categoryid does not change on those updates), or will the index only be touched when categoryid is updated or a new row is added?
If you have created the index on categoryid, it will only be updated when categoryid itself changes or when a new row is added. In that case MySQL updates the internal structure that maps index values to rows; it does not rebuild the whole index.
However, it depends... indexing is one way of optimizing.
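To make the distinction concrete, here is a small sketch. The table name items and the index name are assumptions; the column names come from the question.

```sql
-- Assumed table and index names.
CREATE INDEX idx_categoryid ON items (categoryid);

-- Only a non-indexed column changes: the categoryid index
-- normally does not need to be modified.
UPDATE items SET impressions = impressions + 1 WHERE item_id = 42;

-- The indexed column changes: the index entry for this row has to move.
UPDATE items SET categoryid = 7 WHERE item_id = 42;

-- A new row: one new entry is added to every index on the table.
INSERT INTO items (item_name, categoryid, date_added, impressions, visits)
VALUES ('example item', 7, NOW(), 0, 0);
```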
First of all, you can identify slow queries with these entries in my.cnf: log-slow-queries and long_query_time = value
This gives you an idea if there is a need for optimization.
Next, let MySQL explain the query with EXPLAIN.
The most valuable columns are possible_keys (e.g. item_name, categoryid) and key, the index actually used (e.g. item_name). Also check whether Extra shows "Using filesort"; if it does, an index could save response time.
But because MySQL generally uses only one index per table in a query, it is also worth thinking about combining columns:
Combined index:
(categoryid, item_name) when your WHERE part is categoryid="iao" AND item_name="xyz"
(item_name, categoryid) when your WHERE part is item_name="xyz" AND categoryid="iao"
---> the order is important!
If your WHERE part is item_name="xyz" AND categoryid="iao" and you have two separate indexes:
1. index: item_name helps to save time
2. index: categoryid is wasted
You get the most benefit from a combined index when the query also uses ORDER BY, e.g. WHERE item_name="xyz" AND categoryid="iao" ORDER BY date_added. In this case the combined index (item_name, categoryid, date_added) saves time.
And yes, do it the right way:
indexing costs time on every write (DELETE, UPDATE, REPLACE, INSERT) and saves time on each SELECT.
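The combined-index advice can be sketched as follows, assuming a table named items (the table and index names are hypothetical). One point worth keeping in mind is that MySQL can use a composite index only through its leading columns, so the column order in the index definition matters more than the order in which the conditions are written in WHERE.

```sql
-- Hypothetical names; equality columns first, then the sort column.
CREATE INDEX idx_name_cat_date ON items (item_name, categoryid, date_added);

-- Can use the full index: both equality columns, then rows come back
-- already ordered by date_added.
SELECT item_id
FROM items
WHERE item_name = 'xyz' AND categoryid = 'iao'
ORDER BY date_added;

-- Can use the leading column only.
SELECT item_id FROM items WHERE item_name = 'xyz';

-- Cannot seek on this index: the leading column item_name is not constrained.
SELECT item_id FROM items WHERE categoryid = 'iao';
```

Running EXPLAIN on each statement is the way to confirm which of them actually use the index on your data.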
When you index a column, any change to values in that column will take "longer" to process because the index has to be updated, although this will hardly be noticeable unless you are dealing with millions of records. On the flip side, having the index means searching on those columns will be many times faster. It's a tradeoff you need to make, but usually the index is worthwhile if you are searching on those fields.
OK, so I have an SQL query here:
SELECT a.id,... FROM article AS a WHERE a.type=1 AND a.id=3765 ORDER BY a.datetime DESC LIMIT 1
I wanted to get an exact article by country and id, and for that I created an index with two columns, type and id. id is also the primary key.
I used the EXPLAIN keyword to see which index is used, and instead of the multiple-column index it used the primary key index, even though I wrote the WHERE conditions in exactly the order in which the index is defined.
Does MySQL use the primary key index instead of the multiple-column index because the primary one is faster? Or should I force MySQL to use the multiple-column index?
P.S. Just noticed it was stupid to use order when there is 1 result row. Haha. It increased the search time for 0.0001 seconds. :P
I don't KNOW, but I would THINK that the primary key index would be the fastest available. And if it is, there's not much use in using any other index. You're either going to have an article with an id of 3765 or you're not. Scanning that single row to determine if the type matches is trivial.
If you're only returning one row, there's no point to your ORDER BY clause. And the only point to the a.type=1 is to reject an article with the right id if the type is not correct.
MySQL allows up to 64 indexes per table in current versions of MyISAM and InnoDB (for InnoDB the limit counts secondary indexes), and each index can incorporate up to 16 columns. A multiple-column / composite index is considered a sorted array containing values that are created by concatenating the values of the indexed columns. MySQL uses multiple-column indexes in such a way that queries are fast when you specify a known quantity for the first column of the index in a WHERE clause, even if you do not specify values for the other columns.
If you look very carefully in how MySQL uses indexes, you will find that indexes are used to find rows with specific column values quickly. Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows.
In MySQL, a primary key column is automatically indexed for efficiency; it is often an AUTO_INCREMENT column as well. On the other hand, one should not go overboard with indexing. While it does improve the speed of reading from databases, it slows down the process of altering data in a database (because the changes need to be recorded in the index). Indexes are best used on columns:
that are frequently used in the WHERE part of a query
that are frequently used in an ORDER BY part of a query
that have many different values (columns with numerous repeating values ought not to be indexed).
So I try to use the primary key whenever it is sufficient for my queries. Only when additional indexing is really required to fetch records faster do I use composite indexes.
Hope it helps.
The primary key is unique, so there's no need for MySQL to check any other index. a.id=3765 guarantees that there will be no more than one row returned. If a.type=1 is false for that row, then nothing will be returned.
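This can be checked with EXPLAIN. The sketch below reuses the table and columns from the question; the values mentioned in the comments are what one would typically expect for a primary key lookup, not output captured from a real run.

```sql
-- For an equality match on the primary key, EXPLAIN typically reports
-- type = const and key = PRIMARY, meaning at most one row is read.
EXPLAIN
SELECT a.id
FROM article AS a
WHERE a.type = 1 AND a.id = 3765;
```

Forcing the (type, id) index with an index hint is possible, but it is unlikely to beat a single-row primary key lookup.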