Are MySQL inserts always consecutive? - php

INSERT INTO tbl (name) VALUES ('john'), ('bale'), ('ron')
If one person runs this query, and another person in a different part of the world runs
INSERT INTO tbl (name) VALUES ('johnny'), ('baleton'), ('ronny')
this query a few seconds after the previous one but before it completes on the server, will the inserted values be consecutive? Like this:
'john','bale','ron','johnny','baleton','ronny'
Or might they not be? The table tbl has id|name as its columns.

I believe each query in MySQL runs in its own transaction (if autocommit is enabled). If you manage your transactions yourself, then the situation is even more obvious.
I believe that for this reason the records will be inserted in order.
Edit:
I assume this is all about auto-increment; otherwise the question doesn't make sense, as explained in a comment under the original question.
I stand corrected. The doc states:
When accessing the auto-increment counter, InnoDB uses a special table-level AUTO-INC lock that it keeps to the end of the current SQL statement, not to the end of the transaction.
So yes, still in order: not for the scope of the transaction, but for a single SQL statement.

MySQL (at least with the InnoDB engine) will assign the next Auto Increment value (ai_max + ai_increment) to the first statement (not transaction) that reads the table on INSERT. So if another statement comes along, attempts to INSERT, and finalizes before the first, it gets the NEXT AI value (ai_max + 2*ai_increment), not the one assigned to the first statement.
This is about as "in order" as databases get.
For more information on MySQL InnoDB Auto Increment, see the MySQL developer's reference
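To make that concrete with the statements from the question (a sketch; the ids assume a fresh AUTO_INCREMENT counter starting at 1 and auto_increment_increment = 1):

-- Session A runs first and holds the AUTO-INC lock for the whole statement:
INSERT INTO tbl (name) VALUES ('john'), ('bale'), ('ron');        -- ids 1, 2, 3
-- Session B arrives a moment later; even if it finishes first,
-- it gets the next block of values:
INSERT INTO tbl (name) VALUES ('johnny'), ('baleton'), ('ronny'); -- ids 4, 5, 6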

Related

SQL - auto increment within group inside one table [duplicate]

I have got a table which has an id (primary key with auto increment), a uid (a key referring to a user's id, for example), and other columns which don't matter for this question.
I want to make, let's call it, a different auto-increment key on id for each uid entry.
So, I will add an entry with uid 10, and the id field for this entry will be 1 because there were no previous entries with a value of 10 in uid. I will add a new one with uid 4, and its id will be 3 because there were already two entries with uid 4.
...A very obvious explanation, but I am trying to be as explanatory and clear as I can to demonstrate the idea... clearly.
Which SQL engine can provide such functionality natively? (non-Microsoft/Oracle based)
If there is none, how could I best replicate it? Triggers perhaps?
Does this functionality have a more suitable name?
In case you know about a non-SQL database engine providing such functionality, name it anyway; I am curious.
Thanks.
MySQL's MyISAM engine can do this. See their manual, in section Using AUTO_INCREMENT:
For MyISAM tables you can specify AUTO_INCREMENT on a secondary column in a multiple-column index. In this case, the generated value for the AUTO_INCREMENT column is calculated as MAX(auto_increment_column) + 1 WHERE prefix=given-prefix. This is useful when you want to put data into ordered groups.
The docs go on after that paragraph, showing an example.
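That example is worth reproducing here; this sketch is adapted from the manual's animals table:

CREATE TABLE animals (
    grp ENUM('fish','mammal','bird') NOT NULL,
    id MEDIUMINT NOT NULL AUTO_INCREMENT,
    name CHAR(30) NOT NULL,
    PRIMARY KEY (grp, id)
) ENGINE = MyISAM;

INSERT INTO animals (grp, name) VALUES
    ('mammal','dog'), ('mammal','cat'),
    ('bird','penguin'), ('fish','lax');

-- Each grp counts up independently:
-- ('mammal',1,'dog'), ('mammal',2,'cat'), ('bird',1,'penguin'), ('fish',1,'lax')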
The InnoDB engine in MySQL does not support this feature, which is unfortunate because it's better to use InnoDB in almost all cases.
You can't emulate this behavior using triggers (or any SQL statements limited to transaction scope) without locking tables on INSERT. Consider this sequence of actions:
Mario starts transaction and inserts a new row for user 4.
Bill starts transaction and inserts a new row for user 4.
Mario's session fires a trigger to compute MAX(id)+1 for user 4. Mario gets 3.
Bill's session fires a trigger to compute MAX(id)+1 for user 4. Bill also gets 3.
Bill's session finishes his INSERT and commits.
Mario's session tries to finish his INSERT, but the row with (userid=4, id=3) now exists, so Mario gets a primary key conflict.
In general, you can't control the order of execution of these steps without some kind of synchronization.
The solutions to this are either:
Get an exclusive table lock. Before trying an INSERT, lock the table (see the sketch after this list). This prevents concurrent INSERTs from creating a race condition like the one in the example above. You have to lock the whole table: since you're trying to restrict INSERT, there's no specific row to lock (if you were governing access to a given row with UPDATE, you could lock just that row). But locking the table makes access to it serial, which limits your throughput.
Do it outside transaction scope. Generate the id number in a way that won't be hidden from two concurrent transactions. By the way, this is what AUTO_INCREMENT does. Two concurrent sessions will each get a unique id value, regardless of their order of execution or order of commit. But tracking the last generated id per userid requires access to the database, or a duplicate data store. For example, a memcached key per userid, which can be incremented atomically.
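A minimal sketch of the table-lock option, assuming a table tbl(uid, id) with PRIMARY KEY (uid, id); the alias is needed because LOCK TABLES requires every reference to the table, including the one in the SELECT, to be locked:

LOCK TABLES tbl WRITE, tbl AS t READ;

INSERT INTO tbl (uid, id)
SELECT 4, COALESCE(MAX(t.id), 0) + 1
FROM tbl AS t
WHERE t.uid = 4;

UNLOCK TABLES;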
It's relatively easy to ensure that inserts get unique values. But it's hard to ensure they will get consecutive ordinal values. Also consider:
What happens if you INSERT in a transaction but then roll back? You've allocated id value 3 in that transaction, and then I allocated value 4, so if you roll back and I commit, now there's a gap.
What happens if an INSERT fails because of other constraints on the table (e.g. another column is NOT NULL)? You could get gaps this way too.
If you ever DELETE a row, do you need to renumber all the following rows for the same userid? What does that do to your memcached entries if you use that solution?
SQL Server should allow you to do this. If you can't implement this using a computed column (probably not - there are some restrictions), surely you can implement it in a trigger.
MySQL also would allow you to implement this via triggers.
In a comment you ask about efficiency. Unless you are dealing with extreme volumes, storing an 8-byte DATETIME isn't much of an overhead compared to using, for example, a 4-byte INT.
It also massively simplifies your data inserts, as well as being able to cope with records being deleted without creating 'holes' in your sequence.
If you DO need this, be careful with the field names. If you have uid and id in a table, I'd expect id to be unique in that table, and uid to refer to something else. Perhaps, instead, use the field names property_id and amendment_id.
In terms of implementation, there are generally two options.
1). A trigger
Implementations vary, but the logic remains the same. As you don't specify an RDBMS (other than NOT MS/Oracle), the general logic is simple (see the sketch after the lists below)...
Start a transaction (often this is implicitly already started inside triggers)
Find the MAX(amendment_id) for the property_id being inserted
Update the newly inserted value with MAX(amendment_id) + 1
Commit the transaction
Things to be aware of are...
- multiple records being inserted at the same time
- records being inserted with amendment_id being already populated
- updates altering existing records
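A MySQL-flavored sketch of that logic, using the property_id/amendment_id names suggested above. In MySQL the value is set in a BEFORE INSERT trigger via NEW rather than by updating the row afterwards, and the IF guard covers rows arriving with amendment_id already populated. The race conditions from the accepted answer still apply:

DELIMITER //
CREATE TRIGGER amendments_bi BEFORE INSERT ON amendments
FOR EACH ROW
BEGIN
    -- only fill amendment_id when the caller didn't supply one
    IF NEW.amendment_id IS NULL THEN
        SET NEW.amendment_id = (
            SELECT COALESCE(MAX(amendment_id), 0) + 1
            FROM amendments
            WHERE property_id = NEW.property_id
        );
    END IF;
END//
DELIMITER ;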
2). A Stored Procedure
If you use a stored procedure to control writes to the table, you gain a lot more control.
Implicitly, you know you're only dealing with one record.
You simply don't provide a parameter for DEFAULT fields.
You know what updates / deletes can and can't happen.
You can implement all the business logic you like without hidden triggers
I personally recommend the Stored Procedure route, but triggers do work.
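A sketch of the stored-procedure route (hypothetical names; the SELECT ... FOR UPDATE makes a second concurrent caller for the same property_id wait, though the very first insert for a property can still race and may need a duplicate-key retry):

DELIMITER //
CREATE PROCEDURE add_amendment (IN p_property_id INT, IN p_note TEXT)
BEGIN
    DECLARE v_next INT;
    START TRANSACTION;
    -- lock this property's rows while we compute the next amendment_id
    SELECT COALESCE(MAX(amendment_id), 0) + 1 INTO v_next
    FROM amendments
    WHERE property_id = p_property_id
    FOR UPDATE;
    INSERT INTO amendments (property_id, amendment_id, note)
    VALUES (p_property_id, v_next, p_note);
    COMMIT;
END//
DELIMITER ;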
It is important to get your data types right.
What you are describing is a multi-part key. So use a multi-part key. Don't try to encode everything into a magic integer, you will poison the rest of your code.
If a record is identified by (entity_id,version_number) then embrace that description and use it directly instead of mangling the meaning of your keys. You will have to write queries which constrain the version number but that's OK. Databases are good at this sort of thing.
version_number could be a timestamp, as a_horse_with_no_name suggests. This is quite a good idea. There is no meaningful performance disadvantage to using timestamps instead of plain integers. What you gain is meaning, which is more important.
You could maintain a "latest version" table which contains, for each entity_id, only the record with the most-recent version_number. This will be more work for you, so only do it if you really need the performance.
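For instance, a sketch of such a multi-part key (hypothetical names, with version_number as a timestamp per the suggestion above):

CREATE TABLE entity_versions (
    entity_id      INT NOT NULL,
    version_number TIMESTAMP(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
    payload        TEXT,
    PRIMARY KEY (entity_id, version_number)
);

-- fetch the latest version of one entity
SELECT *
FROM entity_versions
WHERE entity_id = 42
ORDER BY version_number DESC
LIMIT 1;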

How to determine next insert id value (inside a transaction) before actually inserting a new record?

If I have a table with an id column (auto-increment), how do I get the value of the next insert id before actually inserting a new record, in the same transaction?
You can do this query:
SELECT AUTO_INCREMENT FROM INFORMATION_SCHEMA.TABLES
WHERE (TABLE_SCHEMA, TABLE_NAME) = ('mydatabase', 'mytable');
(Of course you would name your database and table, I just used examples.)
This would give you the next auto-increment value at the moment you run that query. But in the next instant, some other client could insert a row, causing that id to be used. So the value you get can be wrong almost as soon as you query it.
You could prevent concurrent inserts by locking your table, but most people prefer not to do that, because it blocks concurrent operations.
Other than that, the only way to guarantee you know the next auto-increment value in advance is to execute the INSERT, then query LAST_INSERT_ID() or a similar function provided by your connector. For example, PDO::lastInsertId().
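In SQL terms, the reliable pattern looks like this (hypothetical table; both statements must run on the same connection):

INSERT INTO mytable (name) VALUES ('example');
SELECT LAST_INSERT_ID();   -- the id generated by the INSERT above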

Is it possible for stacked queries to sneak in between each other when run by multiple users in short time windows?

I find this kind of hard to explain, but consider the following situation:
You have a website with two insert queries that get run right after each other; there might be some variable declarations and for loops in between, but no other queries besides these two:
$mysqli->doQuery("INSERT INTO `company_order`(`customer_id`, `item_id`)
VALUES ($givenid, $givenproduct")
// This table has a primary key that gets defined using auto_increment.
/* (loop that defines array with 50~ variables, few names that get defined in object variables) */
$mysqli->doQuery("INSERT INTO `customer_orderlist`(`customer_id`,`order_id`)
VALUES ($givenid, (LAST_INSERT_ID()) ")
Imagine two users loading the same function that executes these queries almost right after each other. Is there a risk that one user might get the last inserted ID from the other user, or is it guaranteed that the queries will be executed in order, without any other queries getting called in between them?
The MySQL function LAST_INSERT_ID() returns the first auto-generated value successfully inserted for an AUTO_INCREMENT column as a result of the most recent successful INSERT statement executed on the current connection.
The value is stored by the server on a per-connection basis. This guarantees that the value returned by LAST_INSERT_ID() in your second query is the value generated for the AUTO_INCREMENT column by your first query, no matter how many other INSERT queries run between these two queries on other connections.
To answer your questions:
is it guaranteed that the queries will be executed in order without any other queries getting called in between them?
No, there is no such guarantee. The queries are executed in the order you send them to the server but your connection does not block other connections to execute their own queries (and vice-versa).
Is there a risk that one user might get the last inserted ID from the other user
No. There is no such risk as long as the two queries (the INSERT that auto-generates an AUTO_INCREMENT value and the query that calls LAST_INSERT_ID()) run on the same connection (and no other INSERT query runs between them on that connection).
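An illustration of the per-connection guarantee, with hypothetical id values:

-- Connection A:
INSERT INTO company_order (customer_id, item_id) VALUES (7, 99);  -- generates id 1000
-- Connection B, interleaved between A's two queries:
INSERT INTO company_order (customer_id, item_id) VALUES (8, 42);  -- generates id 1001
-- Connection A:
SELECT LAST_INSERT_ID();  -- still returns 1000, unaffected by connection B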

MySQL concurrent multi-row inserts - Insert ID Assumption

In my application, I have a lot of foreign key dependencies, and often insert large numbers of rows. What I have done up until now is run a single insert at a time, and record the insert ID. This tends to take a long time when inserting a large number of rows, even when apache and mysql are run on the same server.
My question is, if I were to alter my application to add a number of rows with a single INSERT, would I be able to assume the ids of each row based strictly upon the last insert id returned by the mysql connection? The issue is that there is the occasional situation where more than one person will be putting large amounts of information into the database at a time.
From what I have been able to determine, it seems safe to assume that when you insert 500 rows, your insert ids will range from (lastInsertID) to (lastInsertID+499), regardless of whether a query from another connection has begun or ended in the time it took to complete, but I want to be sure this is accepted as safe practice.
I am primarily running InnoDB, but there is the occasional MyISAM in there as well.
Thanks All!
-Jer
The mysql_insert_id function and its now-recommended alternative mysqli_insert_id return the id generated for the affected table's AUTO_INCREMENT column by the first inserted row of the last query you ran.
MySQL assigns AUTO_INCREMENT values to the rows of a multi-row INSERT sequentially, starting from that first value. So yes, it is safe to assume that the entries inserted by a single INSERT statement are contiguous.
From the MySQL reference: "The ID that was generated is maintained in the server on a per-connection basis. This means that the value returned by the function to a given client is the first AUTO_INCREMENT value generated for the most recent statement affecting an AUTO_INCREMENT column by that client. This value cannot be affected by other clients, even if they generate AUTO_INCREMENT values of their own. This behavior ensures that each client can retrieve its own ID without concern for the activity of other clients, and without the need for locks or transactions."
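So the arithmetic from the question is safe (a sketch, assuming auto_increment_increment = 1 and a hypothetical table):

INSERT INTO orders (customer_id) VALUES (1), (2), (3);
SELECT LAST_INSERT_ID();  -- id of the FIRST of the three rows
-- the three rows received LAST_INSERT_ID(), LAST_INSERT_ID()+1, LAST_INSERT_ID()+2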

Does MySQL (MyISAM) fill table holes in a multirow insert?

I'm working on a project for which I need to frequently insert ~500 or so records at a remote location. I will be doing this in a single INSERT to minimize network traffic. I do, however, need to know the exact id field (AUTO_INCREMENT) values.
My web searches seem to indicate I could probably use the last_insert_id and calculate all the id values from there. However, this wouldn't work if the rows get ids that are all over the place.
Could anyone please clarify what would or should happen, and if the mathematical solution is safe?
A multirow insert is an atomic operation in MySQL (both MyISAM and InnoDB). Since the table is locked for writing during this operation, no other rows will be inserted or updated while it executes.
This means the IDs will in fact be consecutive (unless the auto_increment_increment option is set to something other than 1).
Auto increment does exactly that: it auto-increments, i.e. each new row gets the numerically next ID. MySQL does not re-use the IDs of rows that were deleted.
Your solution is safe because write operations acquire a table lock, so no other inserts can happen while your operation completes; you will get n contiguous auto-increment values for n inserted rows.
