I have a MyISAM table in MySQL with three columns - an auto-incrementing ID, an integer (customer id) and a decimal (account balance).
At this time, whenever I initialize my system, I completely wipe the table using:
truncate table xyz;
alter table xyz auto_increment=1001;
Then I repopulate the table with data from PHP. I usually end up having up to 10,000 entries in that table.
However, due to new requirements, I now also need to update the table while the system is running. I can no longer wipe the table, so I have to use UPDATE instead of INSERT and update the balances one by one, which will be much slower than inserting 20 new records at a time as I do now.
PHP only sends the customer id and the amount to MySQL - the other id is not actually in use.
So my question is this:
Does it make sense to put an index on the customer id to speed up updating given that the input from PHP is most likely not going to be in order? Or will adding the index slow it down enough to not make it worthwhile?
I also don't know if the index is used at all for the UPDATE command or not ...
Any other suggestions on how to speed this up?
It depends on what your update query is. Presumably it is like:
update xyz
set val = <something>
where id = <something else>
If so, an index on id will definitely help speed things up, because you are looking for one record.
If your query looks like:
update xyz
set val = 1.001 * val;
An index on id will neither help nor hurt. The entire table has to be scanned, and the index does not get involved.
If your query is like:
update xyz
set id = id+1;
Then an index on id will make it slower. You have to read and write every row of the table, and on top of that you have the overhead of maintaining the index.
Ok I'll make this into an answer. If you are saying:
Update xyz set balance=100 where customer_id = 123;
Then yes, an index on customer_id will definitely increase the speed, since it will find the row to update much more quickly.
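A minimal sketch of that, assuming the table is called xyz and the columns are named customer_id and balance as in the statement above. The index name idx_customer_id is made up for illustration. EXPLAIN on the equivalent SELECT is one way to check that the lookup uses the index:

```sql
-- Add a secondary index on the column used in the WHERE clause.
-- (Use ADD UNIQUE INDEX instead if each customer has exactly one row.)
ALTER TABLE xyz ADD INDEX idx_customer_id (customer_id);

-- Check the access path: the "key" column of the output should name
-- idx_customer_id instead of showing a full table scan.
EXPLAIN SELECT balance FROM xyz WHERE customer_id = 123;

-- The UPDATE locates its row through the same index.
UPDATE xyz SET balance = 100 WHERE customer_id = 123;
```

The index has to be maintained on every INSERT and on any UPDATE that changes customer_id, but an UPDATE that only changes balance does not modify it.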
As a further improvement: if you have the columns (id, customer_id, balance), customer_id is unique, and id is just an auto-incremented column, get rid of the id column and make customer_id the primary key. Primary keys do not have to be auto-incremented integer columns.
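With customer_id as the primary key, one possible way to keep sending rows in batches is MySQL's INSERT ... ON DUPLICATE KEY UPDATE. This is only a sketch: the DECIMAL precision and the sample values are assumptions, not taken from the question.

```sql
-- Assumed layout once the surrogate id column is dropped.
CREATE TABLE xyz (
    customer_id INT NOT NULL,
    balance     DECIMAL(12,2) NOT NULL,
    PRIMARY KEY (customer_id)
) ENGINE=MyISAM;

-- One statement for a whole batch: new customers are inserted,
-- existing customers get their balance overwritten.
INSERT INTO xyz (customer_id, balance)
VALUES (123, 100.00),
       (456, 250.50),
       (789,   0.00)
ON DUPLICATE KEY UPDATE balance = VALUES(balance);
```

This keeps the multi-row batching of the original INSERT approach without wiping the table first. Note that the VALUES() function in the UPDATE clause is deprecated as of MySQL 8.0.20 in favour of a row alias, although it still works.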
I have some tables in my phpmyadmin with one column that is auto incremented.
The problem is that when I delete some rows from the table (for example the element Car, with auto-incremented index 1) and then create another row, the new row gets index 2, even though the table contains only one row.
I want the newly created element to have an index equal to its row position; for example, if I have 3 rows, the third element should have index 3.
I was looking for a way to make phpMyAdmin behave like this, but I couldn't find anything.
Is it possible? If so, what do I have to do? Do I have to create the table again?
This is generally a bad idea. Auto increment is used to create a unique ID for each row. Imagine you have a record "1 - John". Then you delete it and add another, "1 - Jack". From the point of view of common database logic, it will look as if John was renamed to Jack (same ID = same entity) rather than being a different record. You should let the DB assign a new ID to each new record, even if that leaves gaps after deleted records.
If you really want to do so, you can modify auto increment value using this query:
ALTER TABLE users AUTO_INCREMENT=123
but that is still not what auto increment is designed for.
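To illustrate what the statement does and does not do, here is a small sketch. The users table layout and the names are assumptions for the example. As far as the MySQL manual describes it, the counter cannot be set to a value that is less than or equal to the largest id still in the table; in that case it is reset to the current maximum plus one.

```sql
-- Hypothetical table for the demonstration.
CREATE TABLE users (
    id   INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(50) NOT NULL
);

INSERT INTO users (name) VALUES ('John'), ('Jack'), ('Jane');  -- ids 1, 2, 3
DELETE FROM users WHERE id = 3;                                -- ids 1, 2 remain

-- Without this statement the next id would be 4, leaving a gap at 3.
-- Asking for 1 cannot work because 1 and 2 are still in use, so the
-- counter should end up at MAX(id) + 1 = 3 rather than 1.
ALTER TABLE users AUTO_INCREMENT = 1;

INSERT INTO users (name) VALUES ('Jill');
SELECT id, name FROM users ORDER BY id;
```

So the statement can close a gap at the end of the table, but it does not renumber existing rows or fill gaps in the middle.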
It is possible, but you shouldn't do that in production.
The SQL query is:
ALTER TABLE tablename AUTO_INCREMENT = 1
The most important thing is to prevent ids from being reused. For example, suppose a user has the id 81 and you delete this user. If the database doesn't remember that id 81 was ever taken (and you have relations, such as friend lists, that still refer to it), the next user who gets the same id will probably inherit the old user's data.
So basically, you don't want to reset auto increment values.
Execute this SQL statement:
ALTER TABLE tablename AUTO_INCREMENT = 1
I struggled with this a bit too, then found in the "Operations" tab, there is a setting under "Auto Increment" where you can set the number it will start from, so you can roll it back if you want to do that.
It is possible. It is not a feature of phpMyAdmin, but of MySQL. Execute this:
ALTER TABLE tablename AUTO_INCREMENT = 1
More info on this on stackoverflow
And in the mysql reference
I am using mysql with PHP. I have a students table like this. I am using InnoDB engine.
id int AUTO_INCREMENT
regno int
name varchar
Whenever a new student is inserted, I want to assign the next available regno. For example, if the regno of the previous student is 1, the value should be 2 for the next entry. Auto increment does not work here because it may create gaps. (I am using transactions, so after inserting a row into the students table there are a few more queries that may cause a rollback, in which case the auto-increment id is incremented although no record is actually inserted.) I don't care about gaps that already exist between old regnos, e.g. the regnos may be 1,2,3,5,10,11,12; when the next student is inserted I would like 12+1=13 for this student. I also want to make sure the regno is not duplicated. (regno has a UNIQUE index, but I don't want an error to be thrown. The insert should simply get the next number.)
I've two solutions in mind.
1: (pseudo-code)
a. Query Database for the newregno = max(regno)+1
b. assign newregno to student while inserting the row.
In this case my only concern is that two instances of the application may query the database at the same time and get the same newregno, causing a duplicate.
2: Use triggers... Update the regno after the actual row insertion. (I have not read much about triggers, but if anyone suggests this is the better approach, I'll go for it.)
Any suggestions?
EDIT---
The regno (registration number) may not be unique by itself in the future, but will be unique in combination with some other columns, e.g. course/session. So please don't offer me an 'auto increment' index type solution.
Have a look at this:
http://www.mysqlperformanceblog.com/2011/11/29/avoiding-auto-increment-holes-on-innodb-with-insert-ignore/
InnoDB's auto increment can use different locking algorithms for calculating the id (controlled by innodb_autoinc_lock_mode). You need to set it to avoid holes.
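Since the question rules out auto increment, another option is a small counter table that is updated inside the same transaction as the insert. This is only a sketch under assumptions: the regno_counter table, its scope column and the sample name are hypothetical, and only the students columns come from the question. It uses the LAST_INSERT_ID(expr) idiom from the MySQL manual.

```sql
-- Hypothetical counter table: one row per numbering scope
-- (could later be one row per course/session).
CREATE TABLE regno_counter (
    scope      VARCHAR(64) NOT NULL PRIMARY KEY,
    last_regno INT NOT NULL
) ENGINE=InnoDB;

-- Seed once from the highest regno already in use (0 if there is none).
INSERT INTO regno_counter (scope, last_regno)
SELECT 'students', COALESCE(MAX(regno), 0) FROM students;

START TRANSACTION;

-- The row lock taken here makes a concurrent transaction wait until this
-- one commits or rolls back, so two sessions cannot get the same number.
UPDATE regno_counter
   SET last_regno = LAST_INSERT_ID(last_regno + 1)
 WHERE scope = 'students';

-- Capture the new number before any other statement runs.
SET @regno = LAST_INSERT_ID();

INSERT INTO students (regno, name) VALUES (@regno, 'Alice');

-- ... the other queries that may fail go here ...

COMMIT;  -- on ROLLBACK the counter update is undone too, so no gap appears
```

The trade-off is that inserts for the same scope are serialized for the duration of the transaction, which is the price of gap-free numbering.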
I want to know about indexes.
I want to create an index on one of my MySQL tables (the number of rows is 300,000). Here are some of its columns:
item_id (primary key)
item_name
categoryid
date_added
impressions
visits
I want to create an index on the categoryid column. I have read in other posts that an index makes updates, inserts and deletes slow, because MySQL recreates indexes on every update.
So here are my questions:
What does "update" mean here? Does it mean an update to any column of any row, or an update to that particular indexed column (categoryid here)?
I ask because, in my case, whenever an item is shown in search results its impressions value is incremented, and if the user visits the item's page its visits value is incremented.
So will these updates to impressions and visits recreate the index (categoryid) every time, even though categoryid does not change, or will the index only be recreated when categoryid is updated or a new row is added?
If you have created the index on categoryid, the index is updated only when categoryid changes or when a row is added or deleted. MySQL then updates the mapping structure in which the index is maintained.
However, it depends: indexing is a means of optimization.
First of all, you can identify slow queries with these entries in my.cnf: log-slow-queries and long_query_time = value
This gives you an idea if there is a need for optimization.
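On newer servers the same logging can also be switched on at runtime. The statements below are a sketch that assumes MySQL 5.1 or later, where the slow_query_log system variable exists, and an account with the privilege to set global variables. The one-second threshold is an arbitrary example value.

```sql
-- Turn the slow query log on without restarting the server.
SET GLOBAL slow_query_log = 'ON';

-- Log every statement that runs longer than 1 second.
-- The new global value applies to connections opened afterwards.
SET GLOBAL long_query_time = 1;

-- Show where the log file is written.
SHOW VARIABLES LIKE 'slow_query_log_file';
```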
Next, let MySQL explain the query.
The most valuable parts of the output are possible_keys (e.g. item_name, categoryid), the key that is actually used (e.g. key: item_name), and whether the Extra column shows "Using filesort". The last one means MySQL has to sort the result in an extra pass, and that you should use an index to save response time.
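But because MySQL generally uses only one index per table in a query, it is also worth thinking about combining columns in one index:
Combined index:
(categoryid, item_name) serves WHERE categoryid="iao" AND item_name="xyz", and also WHERE categoryid="iao" alone
(item_name, categoryid) serves the same two-column WHERE, and also WHERE item_name="xyz" alone
---> the order of the columns in the index is important: only a leftmost prefix of the index can be used. The order of the conditions in the WHERE clause does not matter.
If your WHERE part is item_name="xyz" AND categoryid="iao" and you have two separate single-column indexes instead, typically only one of them is used:
1. index: item_name helps to save time
2. index: categoryid is lost time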
You get the most benefit from a combined index when your query also uses ORDER BY, e.g. WHERE item_name="xyz" AND categoryid="iao" ORDER BY date_added. In this case the combined index (item_name, categoryid, date_added) saves time.
And yes, do it the right way:
indexes cost time on every write (DELETE, UPDATE, REPLACE, INSERT) and save time on every SELECT.
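The combined-index advice can be tried out roughly like this. The table name items and the index name are assumptions, since the question does not name the table; the column names and the sample values are the ones used above.

```sql
-- Equality columns first, then the column used for sorting.
CREATE INDEX idx_name_cat_date
    ON items (item_name, categoryid, date_added);

-- With this index the rows matching both conditions are already stored in
-- date_added order, so the Extra column should no longer show
-- "Using filesort" and "key" should name idx_name_cat_date.
EXPLAIN
SELECT item_id, item_name, date_added
  FROM items
 WHERE item_name = 'xyz'
   AND categoryid = 'iao'
 ORDER BY date_added;
```

Because impressions and visits are not part of the index, incrementing them does not touch it.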
When you index a column, any change to the values in that column takes "longer" to process because the index has to be updated as well, although the difference will hardly be noticeable unless you are dealing with millions of records. On the flip side, having the index means that searching on that column will be many times faster. It's a tradeoff you need to make, but the index is usually worthwhile if you are searching on those fields.
I decided back when I was coding to have different tables for each type of content. Now I am stuck solving this. Basically, my notification system currently ranks the newest content by its timestamp. This is inaccurate, however, because there is a small chance that two people submit content at the same time, in which case the ranking would be incorrect.
Now if I had all my content in a single table, I would simply rank it by an auto-incrementing variable. Is there a way to implement this auto-increment integer across multiple tables (e.g. when something is inserted into table1, id=0; when something is inserted into table2, id=1)? Or do I have to recode everything to use a single table?
NOTE:
The reason I have content in multiple tables is that it's organized and it should reduce load stress. I don't really care about the organization anymore, because I can just access the data through a GUI I coded. I'm just wondering about the load stress.
EDIT:
I'm using PHP 5 with MySQL.
Your question, particularly the need for an ID spanning multiple tables, clearly signals that your database design needs to change. You should make one table for all content types (as a generalization), with an auto-incrementing ID. Then, for each particular content type, you can define another table (the equivalent of inheritance in OOP) with the extra fields and a foreign key pointing to the base table.
In other words, you need something like inheritance in SQL.
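A rough sketch of such a layout follows. All table and column names here (content, content_article, content_type, created_at and so on) are hypothetical and only illustrate the base-table-plus-subtype idea.

```sql
-- Base table: one row per piece of content, whatever its type.
CREATE TABLE content (
    id           INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    content_type VARCHAR(20)  NOT NULL,
    created_at   DATETIME     NOT NULL
) ENGINE=InnoDB;

-- One table per content type, holding only the type-specific fields.
CREATE TABLE content_article (
    content_id INT UNSIGNED NOT NULL PRIMARY KEY,
    title      VARCHAR(200) NOT NULL,
    body       TEXT         NOT NULL,
    FOREIGN KEY (content_id) REFERENCES content (id)
) ENGINE=InnoDB;

-- Inserting an article: base row first, then the subtype row with the same id.
START TRANSACTION;
INSERT INTO content (content_type, created_at) VALUES ('article', NOW());
INSERT INTO content_article (content_id, title, body)
VALUES (LAST_INSERT_ID(), 'Hello', 'First post');
COMMIT;

-- Newest content across all types, ranked by the shared id.
SELECT id, content_type FROM content ORDER BY id DESC LIMIT 20;
```

The notification query then only needs the base table; the subtype tables are joined in when the type-specific fields are required.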
You can create a table with auto increment id just to keep track of ids. Your program would do an insert on that table, get the id, use it as necessary.
Something along the lines of:
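function getNextId() {
    // Inserting NULL makes MySQL generate the next auto-increment value.
    $res = mysql_query("INSERT INTO seq_table(id) VALUES (NULL)");
    $id = mysql_insert_id();
    // Every 10th id, clear out the rows that are no longer needed.
    if ($id % 10 == 0) {
        mysql_query("DELETE FROM seq_table");
    }
    return $id;
}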
Here seq_table is a table that you have to create just to get the ids. Make it a function so it can be used whenever you need it. Every 10 generated ids I delete all the generated ids, since you don't need them there anyway. I don't delete every time since that would slow things down. If another insert happens in the meantime and I delete 11 or more records, it doesn't affect the behaviour of this procedure. It is safe for the purpose it serves.
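Even if the table is empty, new ids will just keep on growing since you've declared id as auto-increment. (One caveat: before MySQL 8.0, InnoDB re-initializes the counter from the highest id left in the table after a server restart, so an emptied InnoDB table would start again from 1. MyISAM keeps the counter.)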
UPDATE: I want to clarify why the ID generation is not wrapped in a transaction and why it shouldn't be.
If you generate an auto id and you rollback the transaction, the next auto id, will be incremented anyway. Excerpt from a MySQL bug report:
[...] this is not a bug but expected behavior that happens in every RDBMS we know. Generated values are not a part of transaction and they don't care about other statements.
Getting the ID with this procedure is perfectly thread safe. Your logic after the ID is obtained should be wrapped in a transaction, especially if you deal with multiple tables.
Getting a sequence in this way isn't a new concept, for instance, the code of metabase_mysql.php which is a stable DB access library has a method called GetSequenceNextValue() which is quite similar.
In a single table, you could have a field for the content type and clustered index that includes the content type field. This effectively keeps all of one content type in one place on the disc, and another content type in another place, etc. (It's actually organised into pages, but this physical organisation is still true.)
Assuming that each content type has the same fields, this would likely meet your needs and behave similarly to multiple tables. In some cases you may even find that, with appropriate indexes, a single-table solution is faster, more convenient and more maintainable, for example when you need globally unique identifiers across all content types.
If you're unable to merge these back into a single table, you could create a central link table...
CREATE TABLE content_link (
id INT IDENTITY(1,1), -- MS SQL SERVER syntax
content_type INT,
content_id INT -- The id from the real table
)
As you insert into the content tables, also insert into the link table to create your globally unique id.
More simply, but even more manually, just hold a single value somewhere in the database. Whenever you need a new id, use that centrally stored value and increment it by one. Be sure to wrap the increment and collection in a single transaction to stop race conditions. (This can be done in a number of ways, depending on your flavor of SQL.)
EDIT
A couple of MySQL example lines of code from the web...
START TRANSACTION;
INSERT INTO foo (auto,text)
VALUES(NULL,'text'); # generate ID by inserting NULL
INSERT INTO foo2 (id,text)
VALUES(LAST_INSERT_ID(),'text'); # use ID in second table
COMMIT;
Personally, I'd actually store the value in a variable, commit the transaction, and then continue with my business logic. This would keep the locks on the tables to a minimum.
You could have a separate ID table, insert into that, and use the newly-inserted ID.
e.g.
CREATE TABLE ids (id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY, timeadded DATETIME);
In the script:
<?php
$r = mysql_query('INSERT INTO ids (timeadded) VALUES (NOW())');
$id = mysql_insert_id();
mysql_query("INSERT INTO someOtherTable (id, data) VALUES ('$id', '$data')");
Add error checking etc. to taste.
The MySQL manual states:
The ID that was generated is maintained in the server on a
per-connection basis. This means that the value returned by the
function to a given client is the first AUTO_INCREMENT value generated
for most recent statement affecting an AUTO_INCREMENT column by that
client. This value cannot be affected by other clients, even if they
generate AUTO_INCREMENT values of their own. This behavior ensures
that each client can retrieve its own ID without concern for the
activity of other clients, and without the need for locks or
transactions.
(Source) So I don't think concerns about ACID compliance are a problem.
I have a table called "posts" that contains 500 posts, but the ids are not sequential,
like:
1
3
9
22
446
....
etc.
That's because I deleted some of the posts from the table.
So how can I renumber the ids?
Primary Key IDs are not supposed to be changed, especially when they are referenced in other tables.
If you need a property that is like a row number, you can add another field for that.
For example invoices are numbered, but the invoice number should not be the primary key, since you want the freedom to re-number one of them without losing other connected information, such as invoice details in other tables.
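If the row number is only needed for display, it can also be computed at query time instead of being stored. The sketch below assumes MySQL 8.0 or later, where window functions are available, and a hypothetical title column in the posts table.

```sql
-- row_num is 1, 2, 3, ... with no gaps, while id keeps its original value.
SELECT ROW_NUMBER() OVER (ORDER BY id) AS row_num,
       id,
       title
  FROM posts
 ORDER BY id;
```

Deleting a post then shifts the displayed numbers automatically, and nothing that references posts.id has to change.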
The easiest way to fix it is to create a quick script that loops through the table and updates the id column, and then to run this on your database: ALTER TABLE tbl AUTO_INCREMENT = 100;
NEVER EVER CHANGE THE ID!
An id is something a record is born with and dies with. That's why it's called id: it is an IDENTITY!
Just as you cannot change the identity of things in real life, you shouldn't do it in a database.
It is a very bad idea from a philosophical perspective, and it also results in practical problems. Even if you renumbered the ID in all the tables of your database, the old IDs might still survive somewhere (and make a big mess then):
in URLs all over the internet
in your logs
in your backups
in other database copies.
Also, an ID must serve only for identification and nothing else. For example: you use IDs to define the order of some dictionary, which you normally present sorted. Then you need to add a new item, which must be presented between the items with id 20 and 21. The BAD solution would be to change the ID of all records with ID >= 21. The GOOD solution is to add a new column Order, which defines the order of the items and can be changed whenever needed.
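The GOOD solution could look roughly like this. The dictionary table and its term column are hypothetical, and the column is called sort_order here because ORDER is a reserved word in SQL.

```sql
-- Hypothetical table; id is never touched after insertion.
CREATE TABLE dictionary (
    id         INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    term       VARCHAR(100) NOT NULL,
    sort_order INT NOT NULL DEFAULT 0
);

-- For rows that already exist, start with the current id order.
UPDATE dictionary SET sort_order = id;

-- Make room at position 21, then insert the new item there.
UPDATE dictionary SET sort_order = sort_order + 1 WHERE sort_order >= 21;
INSERT INTO dictionary (term, sort_order) VALUES ('new item', 21);

-- Present the items by sort_order; the ids stay stable.
SELECT id, term FROM dictionary ORDER BY sort_order;
```

The new row simply receives the next free id, while its position in the listing is controlled entirely by sort_order.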
Remember:
ID must serve only for identification and nothing else!
NEVER CHANGE THE ID!