I am having a strange issue with a MySQL query. I have a table with the fields
slno, mobileno, contractor, with slno as the primary key, auto-incrementing from 1.
While testing with, say, up to 100 records, the count and the auto-increment values were the same,
so I truncated the table to reset the auto-increment and inserted a huge Excel file with around 40k rows via PHP. I then issued a select query, which yields
a max slno of 40000 as expected, but the count shows 39920.
I am puzzled and tried to search on Google; maybe my lack of keyword-search ability prevented me from finding a result, so I am posting here. For reference I have added a screenshot. Any ideas or clarifications? Thanks.
EDIT:
min slno is 1
EDIT:
A related question, with a solution for finding gaps in an auto-number column in MySQL, has been asked and answered here.
There are specific cases in which auto-incremented values can be lost. One example is if you roll back an insertion. As per the doco:
"Lost" auto-increment values and sequence gaps
In all lock modes (0, 1, and 2), if a transaction that generated auto-increment values rolls back, those auto-increment values are "lost". Once a value is generated for an auto-increment column, it cannot be rolled back, whether or not the "INSERT-like" statement is completed, and whether or not the containing transaction is rolled back. Such lost values are not reused. Thus, there may be gaps in the values stored in an AUTO_INCREMENT column of a table.
In that case, although the insert is backed out, the auto-increment may not be. That would certainly allow for the possibility that your bulk insertion from Excel is occasionally failing and retrying, with the subsequent retry working. It really depends on how your insertion process works.
In any case, assuming those values will always be contiguous is actually a bad assumption to make.
This is because, even if insertions were guaranteed to be contiguous, it's possible to delete rows which would result in gaps appearing. You can certainly fix this each time you delete (or bulk insert for that matter) but the workload is high - you basically have to find gaps and then "move" higher entries into those gaps.
This movement is likely to be non-trivial as it's most likely that there will be other tables holding key look-ups to that column, and each of those will need to be changed as well.
So the best use of an auto-increment field is simply to provide a unique identifier for a row where no other one exists; it is not meant to be contiguous.
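If you do want to locate the gaps yourself, a self-join along these lines will find them. This is a minimal sketch; the table name contacts stands in for your table:

SELECT t1.slno + 1 AS gap_starts_at
FROM contacts t1
LEFT JOIN contacts t2 ON t2.slno = t1.slno + 1
WHERE t2.slno IS NULL
  AND t1.slno < (SELECT MAX(slno) FROM contacts);

Each row returned is the first missing value after an existing slno.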
Related
Together with my team, I am working on a functionality to generate invoice numbers. The requirements says that:
there should be no gaps between invoice numbers
the numbers should start from 0 every year (then, together with the year, we will have a unique key)
the invoice numbers should grow according to the time of creation of the invoices
We are using PHP and Postgres. We thought to implement this in the following way:
each time a new invoice is persisted on the database we use a BEFORE INSERT trigger
the trigger executes a function that retrieves a new value from a postgres sequence and writes it on the invoice as its number
Considering that multiple invoices could be created during the same transaction, my question is: is this a sufficiently safe approach? What are its flaws? How would you suggest improving it?
Introduction
I believe the most crucial point here is:
there should be no gaps between invoice numbers
In this case you cannot use a sequence or an auto-increment field (as others propose in the comments). An auto-increment field uses a sequence under the hood, and the nextval(regclass) function increments the sequence's counter regardless of whether the transaction succeeds or fails (you point that out yourself).
Update:
What I mean is that you shouldn't use sequences at all; in particular, the solution you propose doesn't eliminate the possibility of gaps. Your trigger gets a new sequence value, but the INSERT could still fail.
Sequences work this way because they are mainly meant to be used for generating PRIMARY KEY and OID values, where uniqueness and a non-blocking mechanism are the ultimate goals and gaps between values are really no big deal.
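A minimal demonstration of that behaviour (the sequence name s is just for illustration):

CREATE SEQUENCE s;
BEGIN;
SELECT nextval('s');  -- returns 1
ROLLBACK;
SELECT nextval('s');  -- returns 2: the value consumed in the rolled-back transaction is gone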
In your case, however, the priorities may be different, and there are a couple of things to consider.
Simple solution
The first possible solution to your problem could be assigning the new number as the maximum of the currently existing ones plus one. It can be done in your trigger:
NEW.invoice_number :=
    COALESCE((SELECT foo.invoice_number
              FROM invoices foo
              WHERE foo._year = NEW._year
              ORDER BY foo.invoice_number DESC NULLS LAST
              LIMIT 1),
             0) + 1; /* query 1: the current max for the year (0 if none), plus one */
This query could use your composite UNIQUE INDEX if it was created with the "proper" syntax and column order, with the "year" column in first place, e.g.:
CREATE UNIQUE INDEX invoice_number_unique
ON invoices (_year, invoice_number DESC NULLS LAST);
In PostgreSQL, UNIQUE CONSTRAINTs are implemented simply as UNIQUE INDEXes, so most of the time there is no difference in which command you use. However, using the particular syntax presented above makes it possible to define an order on the index. It's a really nice trick which makes /*query 1*/ quicker than a simple SELECT max(invoice_number) FROM invoices WHERE _year = NEW._year as the invoice table gets bigger.
This is a simple solution, but it has one big drawback: there is a possibility of a race condition when two transactions try to insert an invoice at the same time. Both could acquire the same max value, and the UNIQUE CONSTRAINT will prevent the second one from committing. Despite that, it could be sufficient in some small system with a special insert policy.
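For completeness, a sketch of how query 1 could be wired into the BEFORE INSERT trigger (the function and trigger names are just for illustration):

CREATE FUNCTION assign_invoice_number() RETURNS trigger AS $$
BEGIN
    NEW.invoice_number :=
        COALESCE((SELECT foo.invoice_number
                  FROM invoices foo
                  WHERE foo._year = NEW._year
                  ORDER BY foo.invoice_number DESC NULLS LAST
                  LIMIT 1),
                 0) + 1;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER invoice_number_trigger
    BEFORE INSERT ON invoices
    FOR EACH ROW EXECUTE PROCEDURE assign_invoice_number();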
Better solution
You may create a table
CREATE TABLE invoice_numbers(
_year INTEGER NOT NULL PRIMARY KEY,
next_number_within_year INTEGER
);
to store the next possible number for a certain year. Then, in an AFTER INSERT trigger, you could:
Lock invoice_numbers so that no other transaction can even read the number: LOCK TABLE invoice_numbers IN ACCESS EXCLUSIVE MODE;
Get the new invoice number: new_invoice_number = (SELECT foo.next_number_within_year FROM invoice_numbers foo WHERE foo._year = NEW._year);
Update the number value of the newly added invoice row
Increment the counter: UPDATE invoice_numbers SET next_number_within_year = next_number_within_year + 1 WHERE _year = NEW._year;
Because the table lock is held by the transaction until it commits, this should probably be the last trigger fired (read more about trigger execution order here). A sketch of such a trigger function follows.
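A minimal sketch with the step numbers from the list above (names are illustrative, and it assumes invoices has a surrogate primary key id):

CREATE FUNCTION set_invoice_number() RETURNS trigger AS $$
DECLARE
    new_invoice_number INTEGER;
BEGIN
    -- 1. lock so no other transaction can even read the counter
    LOCK TABLE invoice_numbers IN ACCESS EXCLUSIVE MODE;
    -- 2. get the new invoice number for this year
    SELECT foo.next_number_within_year INTO new_invoice_number
    FROM invoice_numbers foo
    WHERE foo._year = NEW._year;
    -- 3. write it onto the freshly inserted invoice row
    UPDATE invoices
    SET invoice_number = new_invoice_number
    WHERE id = NEW.id;
    -- 4. increment the counter
    UPDATE invoice_numbers
    SET next_number_within_year = next_number_within_year + 1
    WHERE _year = NEW._year;
    RETURN NULL;  -- the return value of an AFTER trigger is ignored
END;
$$ LANGUAGE plpgsql;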
Update:
Instead of locking the whole table with the LOCK command, check the link provided by Craig Ringer.
The drawback in this case is a drop in INSERT performance: only one transaction at a time can perform an insert.
I have a billion-row table that no longer fits in memory.
When I insert new rows in bulk, the overhead of updating the primary index kills the performance. I HAVE to have this index because otherwise SELECT statements are really slow. But since the inserts come in random order, with each row inserted the data has to be written to a different area of the disk.
And since the HDD is capped at 200 I/O operations per second, this slows inserting to a crawl.
Can I "have my cake and eat it" at the same time in this situation? Maybe by creating another table in which the data would be grouped by different column ( by having a different primary key )? But this seems wasteful to me and I don't even know if that would help...
Or maybe I could use a staging table? Insert 1,000,000 rows there and then insert them into the target table, sorted by the primary key?
Am I doomed?
EDIT:
I've partitioned the table horizontally.
When I removed the primary key on this field that I need and placed it on the autoincrement field, the inserts were blazingly fast.
Unfortunately, since the data on disk is placed by the primary key value, this killed the select performance... because selects don't query based on the autoincrement value but rather on the PK value.
So either I insert rows fast or I select them fast. Isn't there any solution that could help in both cases?
Each time you insert a new row, the index is updated after the data is inserted, and that takes more time.
You can use
START TRANSACTION
...your INSERT queries...
COMMIT
Try it like this:
mysql_query("START TRANSACTION");
// your INSERT queries
mysql_query("COMMIT");
Note: I'm new to databases and PHP
I have an order column that is set to auto-increment and unique.
In my PHP script, I am using AJAX to get new data. The problem is that the order column skips numbers and ends up substantially higher than it should be, forcing me to manually update the numbers after the data is inserted. In this case I would end up changing 782 to 38.
$SQL = "INSERT IGNORE INTO `read`(`title`,`url`) VALUES\n ".implode( "\n,",array_reverse( $sql_values ) );
How can I get it to increment +1?
The default auto_increment behavior in MySQL 5.1 and later will "lose" auto-increment values if the INSERT fails. That is, it increments by 1 each time but doesn't undo the increment if the INSERT fails. It's uncommon to lose ~750 values, but not impossible (I consulted for a site that was skipping 1500 for every INSERT that succeeded).
You can change innodb_autoinc_lock_mode=0 to use MySQL 5.0 behavior and avoid losing values in some cases. See http://dev.mysql.com/doc/refman/5.1/en/innodb-auto-increment-handling.html for more details.
Another thing to check is the value of the auto_increment_increment config variable. It's 1 by default, but you may have changed this. Again, very uncommon to set it to something higher than 1 or 2, but possible.
I agree with the other commenters: auto-inc columns are intended to be unique, not necessarily consecutive. You probably shouldn't worry about it so much unless you're advancing the auto-inc value so rapidly that you could run out of the range of an INT (this has happened to me).
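A minimal demonstration of how a failed INSERT burns a value under the default InnoDB settings (table and column names are illustrative):

CREATE TABLE t (id INT AUTO_INCREMENT PRIMARY KEY, u INT UNIQUE);
INSERT INTO t (u) VALUES (1);  -- gets id 1
INSERT INTO t (u) VALUES (1);  -- fails with a duplicate-key error, but id 2 is consumed
INSERT INTO t (u) VALUES (2);  -- gets id 3, leaving a gap at 2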
How exactly did you fix it skipping 1500 for every insert?
The cause of the INSERT failing was that there was another column with a UNIQUE constraint on it, and the INSERT was trying to insert duplicate values in that column. Read the manual page I linked to for details on why this matters.
The fix was to do a SELECT first to check for existence of the value before attempting to INSERT it. This goes against common wisdom, which is to just try the INSERT and handle any duplicate key exception. But in this case, the side-effect of the failed INSERT caused an auto-inc value to be lost. Doing a SELECT first eliminated almost all such exceptions.
But you also have to handle a possible exception, even if you SELECT first. You still have a race condition.
You're right! innodb_autoinc_lock_mode=0 worked like a charm.
In your case, I would want to know why so many inserts are failing. I suspect that, like many SQL developers, you aren't checking the success status after your INSERTs in your AJAX handler, so you never know that so many of them are failing.
They're probably still failing; you just aren't losing auto-inc ids as a side effect. You should really diagnose why so many failures occur. You could be either generating incomplete data or running many more transactions than necessary.
After you change 782 to 38, you can reset the auto-increment with ALTER TABLE mytable AUTO_INCREMENT = 39. This way you continue at 39.
However, you should check why your gap is so high and change your design accordingly. Changing the auto-increment should not be "default" behaviour.
I know the question has been answered already, but if you have deleted rows from the table before, MySQL remembers the used IDs/numbers, because your auto-increment column is typically unique, and so it will not create duplicate values. To re-seed the increment from the current max ID, note that MySQL does not accept a subquery inside ALTER TABLE, so the value has to be computed first, e.g. via a prepared statement:
SET @next := (SELECT MAX(`order`) + 1 FROM tablename);
SET @sql  := CONCAT('ALTER TABLE tablename AUTO_INCREMENT = ', @next);
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
Auto-increment doesn't care if you delete some rows: every time you insert a row, the value is incremented.
If you want numbering without gaps, don't use auto-increment and do it yourself. You could use something like this to achieve it when inserting:
INSERT INTO tablename SET
`order` = (SELECT COALESCE(MAX(`order`), 0) + 1 FROM (SELECT * FROM tablename) t),
...
and if you delete a row, you have to rearrange the order column manually.
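A minimal sketch of that manual rearrangement using a user variable (it assumes the goal is simply to pack the order values back into 1..n):

SET @n := 0;
UPDATE tablename SET `order` = (@n := @n + 1) ORDER BY `order`;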
In my application, I have a lot of foreign-key dependencies and often insert large numbers of rows. What I have done up until now is run a single insert at a time and record the insert ID. This tends to take a long time when inserting a large number of rows, even when Apache and MySQL run on the same server.
My question is, if I were to alter my application to add a number of rows with a single INSERT, would I be able to assume the ids of each row based strictly on the last insert id returned by the MySQL connection? The issue is that there is the occasional situation where more than one person will be putting large amounts of information into the database at a time.
From what I have been able to determine, it seems safe to assume that when you insert 500 rows, your insert ids will range from (lastInsertID) to (lastInsertID+499), regardless of whether a query from another connection has begun or ended in the time it took to complete, but I want to be sure this is accepted as safe practice.
I am primarily running InnoDB, but there is the occasional MyISAM in there as well.
Thanks All!
-Jer
The mysql_insert_id function, and the now-recommended alternative mysqli_insert_id, returns the first id generated for the affected table's AUTO_INCREMENT column by the last query you ran.
MySQL inserts the rows of a grouped (multi-row) INSERT statement iteratively, starting from the first assigned AUTO_INCREMENT value. So yes, it is safe to assume that the entries inserted by a single INSERT statement are contiguous.
From the MySQL reference manual: "The ID that was generated is maintained in the server on a per-connection basis. This means that the value returned by the function to a given client is the first AUTO_INCREMENT value generated for the most recent statement affecting an AUTO_INCREMENT column by that client. This value cannot be affected by other clients, even if they generate AUTO_INCREMENT values of their own. This behavior ensures that each client can retrieve its own ID without concern for the activity of other clients, and without the need for locks or transactions."
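A minimal illustration of the point in plain SQL (the table t is just for demonstration):

CREATE TABLE t (id INT AUTO_INCREMENT PRIMARY KEY, name VARCHAR(10));
INSERT INTO t (name) VALUES ('a'), ('b'), ('c');
SELECT LAST_INSERT_ID();  -- the id of 'a', the FIRST row of the batch;
                          -- the three rows occupy that id through id + 2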
I'm working on a project for which I need to frequently insert ~500 or so records at a remote location. I will be doing this in a single INSERT to minimize network traffic. I do, however, need to know the exact id field (AUTO_INCREMENT) values.
My web searches seem to indicate I could probably use last_insert_id and calculate all the id values from there. However, this wouldn't work if the rows were given ids that are all over the place.
Could anyone please clarify what would or should happen, and if the mathematical solution is safe?
A multi-row insert is an atomic operation in MySQL (in both MyISAM and InnoDB). Since the table is locked for writing during this operation, no other rows will be inserted/updated during its execution.
This means the IDs will in fact be consecutive (unless the auto_increment_increment option is set to something other than 1).
Auto-increment does exactly that: it auto-increments, i.e. each new row gets the numerically next ID. MySQL does not re-use the IDs of rows that were deleted.
Your solution is safe because write operations acquire a table lock, so no other inserts can happen while your operation completes; you will get n contiguous auto-increment values for n inserted rows.