I'm trying to keep the database tables for a project I'm working on nice and normalized, but I've run into a problem. I'm trying to figure out how I can insert a row into a table and then find out what value the auto-incremented id column was set to, so that I can insert additional data into another table. I know there are functions such as mysql_insert_id which "get the ID generated from the previous INSERT operation". However, if I'm not mistaken, mysql_insert_id just returns the ID of the very last operation. So, if the site has enough traffic, this wouldn't necessarily return the ID of the query you want, since another query could have run between the moment you insert the row and the moment you look up the ID. Is this understanding of mysql_insert_id correct? Any suggestions on how to do this are greatly appreciated. Thanks.
LAST_INSERT_ID() has session scope.
It will return the identity value inserted in the current session.
If you don't run another INSERT in the same session between your INSERT and the call to LAST_INSERT_ID(), it will work correctly, regardless of what other sessions insert in the meantime.
Note though that for multiple value inserts, it will return the identity of the first row inserted, not the last one:
INSERT INTO mytable (identity_column)
VALUES (NULL);

SELECT LAST_INSERT_ID();
-- returns 1

/* This inserts rows 2 and 3 */
INSERT INTO mytable (identity_column)
VALUES (NULL), (NULL);

SELECT LAST_INSERT_ID();
-- returns 2, not 3
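To tie this back to the original question, here is a minimal sketch of using LAST_INSERT_ID() to link a child row to its parent. The tables authors and books and their columns are hypothetical names made up for illustration, and the sketch assumes a MySQL server with the InnoDB engine.

```sql
-- Hypothetical schema, for illustration only.
CREATE TABLE authors (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE books (
    id        INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    author_id INT UNSIGNED NOT NULL,
    title     VARCHAR(200) NOT NULL,
    FOREIGN KEY (author_id) REFERENCES authors (id)
) ENGINE=InnoDB;

START TRANSACTION;

INSERT INTO authors (name) VALUES ('Jane Doe');

-- LAST_INSERT_ID() is evaluated for this session only, so it refers to
-- the authors row inserted just above, whatever other clients are doing.
INSERT INTO books (author_id, title)
VALUES (LAST_INSERT_ID(), 'First Book');

COMMIT;
```

Because the value is tracked per session, this pattern should not need table locks to stay correct under concurrent traffic.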
You could:
A. Assume that won't be a problem and use mysql_insert_id
or
B. Include a timestamp in the row and retrieve the last inserted ID before inserting into another table.
The general solution to this is to do one of two things:
1. Create a procedural query (a stored procedure) that does the insert, then retrieves the last inserted ID (using, e.g., LAST_INSERT_ID()) and returns it as output.
2. Do the insert, then do another insert where the id value comes from a subquery such as (select myid from table where somecolumnval='val').
2b. Or make the select explicit and standalone, and then do the other inserts using that value.
The disadvantage to the first is that you have to write a proc for each of these cases. The disadvantage to the second is that some db engines don't accept that, and it clutters your code, and can be slow if you have to do a where on multiple columns.
This assumes that there may be inserts between your calls that you have no control over. If you have explicit control, one of the other solutions above is probably better.
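As a rough sketch of the first option, a stored procedure can do the insert and hand the generated ID back through an OUT parameter. The table customers, the procedure add_customer, and the parameter names are all hypothetical. DELIMITER is a command of the mysql command-line client, so other clients may need a different way to send the procedure body.

```sql
-- Hypothetical table; any table with an AUTO_INCREMENT key would do.
CREATE TABLE customers (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100) NOT NULL
) ENGINE=InnoDB;

DELIMITER //

CREATE PROCEDURE add_customer(IN p_name VARCHAR(100), OUT p_id INT UNSIGNED)
BEGIN
    INSERT INTO customers (name) VALUES (p_name);
    -- Capture the ID generated by the INSERT above, in this session.
    SET p_id = LAST_INSERT_ID();
END //

DELIMITER ;

-- Usage: the new ID comes back in a session variable.
CALL add_customer('Alice', @new_id);
SELECT @new_id;
```

The caller then uses @new_id for the follow-up inserts without issuing a separate lookup query.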
I have a table with an auto-increment id column. How can I get the value of the next insert ID before actually inserting a new record, within the same transaction?
You can do this query:
SELECT AUTO_INCREMENT FROM INFORMATION_SCHEMA.TABLES
WHERE (TABLE_SCHEMA, TABLE_NAME) = ('mydatabase', 'mytable');
(Of course you would name your database and table, I just used examples.)
This would give you the next auto-increment value at the moment you run that query. But in the next instant, some other client could insert a row, causing that id to be used. So the value you get can be wrong almost as soon as you query it.
You could prevent concurrent inserts by locking your table, but most people prefer not to do that, because it blocks concurrent operations.
Other than that, the only reliable way to learn the auto-increment value is to execute the INSERT, then query LAST_INSERT_ID() or an equivalent function provided by your connector, for example PDO::lastInsertId().
I have a table with id, name, and surname columns. When I add a new row, id increases by 1 since it is the auto-increment primary key. How do I get the newly generated id back with an OUTPUT clause?
"INSERT INTO table (name, surname) VALUES ('mike', 'hensen') OUTPUT ?????? how to continue ????"
Edit: LAST_INSERT_ID() does not seem like a very good method, since on a busy site there could be many inserts per second.
I'm pretty sure that LAST_INSERT_ID() is exactly what you want. It returns the last id inserted on a per-connection basis, not the last one inserted across all connections (documented here). Presumably, different web users would have different connections, so using the function does what you want.
If you want the last id that was inserted over all connections, but not necessarily from your most recent insert, then you can look at the auto_increment value in the metadata.
Just looking for some tips and pointers for a small project I am doing. I have some ideas but I am not sure if they are the best practice. I am using mysql and php.
I have a table called nomsing in the database.
It has a primary key called row_id, which is an integer.
Then I have about 8 other tables referencing this table.
They are called nomplu, accsing, accplu, datsing, and datplu, for instance.
Each has a column that references the primary key of nomsing.
Within my PHP code I have all the information needed to insert into the tables except one thing: the row id primary key of the nomsing table. The PHP code generates a series of inserts like the following.
INSERT INTO nomsing(word,postress,gender) VALUES ('велосипед','8','mask');
INSERT INTO nomplu(word,postress,NOMSING?REFERENCE) VALUES ('велосипеды','2',#the reference to the id of the first insert#);
There are more inserts, but this one gets the point across. The second insert should reference the auto-generated id of the first insert. I want this to work as a transaction, so either all inserts complete or none do.
One idea I have is to not auto-generate the id and instead generate it myself in PHP. That way I would know the id before the transaction, but then I would have to check whether the id was already in the db.
Another idea I have is to do the first insert and then query for the row id of that insert in php and then make the second insert. I mean both should work but they don't seem like an optimal solution. I am not too familiar with the database transactional features but what would be the best approach to do in this case. I don't like the idea of inserting then querying for the id and then running the rest of the queries. Just seems very inefficient or perhaps I am wrong.
Just insert a row in the master table. Then you can fetch the insert id (lastInsertId when on PDO) and use that to populate your other queries.
You could use the PHP version as given by JvdBerg, or MySQL's LAST_INSERT_ID. I usually use the former option.
See a similar SO question here.
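A minimal sketch of the SQL side of this, assuming InnoDB tables and an AUTO_INCREMENT row_id on nomsing. The table definitions below are guesses made only so the example is self-contained: the column types, the referencing column name nomsing_id, and the columns of accsing are all assumptions, and the inserted values are placeholders.

```sql
-- Assumed definitions; the real tables will differ.
CREATE TABLE nomsing (
    row_id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    word     VARCHAR(100) NOT NULL,
    postress VARCHAR(10)  NOT NULL,
    gender   VARCHAR(10)  NOT NULL
) ENGINE=InnoDB;

CREATE TABLE nomplu (
    row_id     INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    word       VARCHAR(100) NOT NULL,
    postress   VARCHAR(10)  NOT NULL,
    nomsing_id INT UNSIGNED NOT NULL,  -- assumed name of the reference column
    FOREIGN KEY (nomsing_id) REFERENCES nomsing (row_id)
) ENGINE=InnoDB;

CREATE TABLE accsing LIKE nomplu;  -- LIKE does not copy the foreign key

START TRANSACTION;

INSERT INTO nomsing (word, postress, gender)
VALUES ('велосипед', '8', 'mask');

-- Save the generated row_id right away: LAST_INSERT_ID() changes as soon
-- as a later INSERT generates an AUTO_INCREMENT value of its own.
SET @nomsing_id = LAST_INSERT_ID();

INSERT INTO nomplu (word, postress, nomsing_id)
VALUES ('велосипеды', '2', @nomsing_id);

INSERT INTO accsing (word, postress, nomsing_id)
VALUES ('велосипед', '8', @nomsing_id);

COMMIT;
```

If any statement fails, issuing ROLLBACK instead of COMMIT discards all of the inserts, which gives the all-or-nothing behaviour the question asks for. The ID lookup is a value the server already holds for the session, so it adds little work beyond the inserts themselves.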
You could add a new column to the nomsing table, called 'insert_order' (or similar), with a default value of 0. Then, instead of generating one SQL statement per insert, create a bulk insert statement, e.g.
INSERT INTO nomsing(word,postress,gender, insert_order)
VALUES ('велосипед','8','mask',1), ('abcd','9','hat',2).....
You generate the insert_order number with a counter in your loop, starting at one. Then you can perform one SELECT on the table to get the ids, e.g.
SELECT row_id
FROM nomsing
WHERE insert_order > 0;
Now that you have all the IDs, you can do a bulk insert for your following queries. At the end of your script, run an update to reset the insert_order column back to 0:
UPDATE nomsing SET insert_order = 0 WHERE insert_order > 0;
It may seem messy to add an extra column to do this but it will add a significant speed increase over performing one query at a time.
This is my db structure:
ID  NAME   SOMEVAL  API_ID
1   TEST   123456   A123
2   TEST2  223232   A123
3   TEST3  918922   A999
4   TEST4  118922   A999
I'm filling it using a function that calls an API and gets some data from an external service.
On the first run, I want to insert all the data I get back from the API. After that, each time I run the function, I just want to update the existing rows and add any rows returned by the API call that are not yet in the db.
So my initial thought regarding the update process is to go through each row I get from the API and SELECT to see if it already exists.
I'm just wondering if this is the most efficient way to do it, or maybe it's better to DELETE the relevant rows from the db and just re-inserting them all.
NOTE: each batch of rows I get from the API has an API_ID, so when I say delete the rows i mean something like DELETE FROM table WHERE API_ID = 'A999' for example.
If you are retrieving all the rows from the service, I recommend that you drop all indexes, truncate the table, insert all the data, and then recreate the indexes.
If you are retrieving only some of the data from the service, I would drop all indexes, remove all relevant rows, insert all rows, and then recreate all indexes.
In such scenarios I usually go with:
start transaction
get row from external source
select local store to check if it's there
if it's there: update its values, remember local row id in list
if it's not there: insert it, remember local row id in list
at the end delete all rows that are not in remembered list of local row ids (NOT IN clause if the count of ids allows for this, or other ways if it's possible that there will be many deleted rows)
commit transaction
Why? Because usually I have local rows referenced by other tables, and deleting them all would break the references (not to mention delete cascades).
I don't see any problem in performing SELECT, then deciding between an INSERT or UPDATE. However, MySQL has the ability to perform so-called "upserts", where it will insert a row if it does not exist, or update an existing row otherwise.
This SO answer shows how to do that.
I would recommend using INSERT...ON DUPLICATE KEY UPDATE.
If you use INSERT IGNORE, then the row won't actually be inserted if it results in a duplicate key on API_ID.
Add unique key index on API_ID column.
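A minimal sketch of the upsert, using a hypothetical table api_rows modeled on the question's columns. One assumption is worth flagging: in the sample data each API_ID appears on several rows, so a unique key on API_ID alone would reject them. The sketch therefore assumes that a row is identified by API_ID and NAME together.

```sql
-- Hypothetical table modeled on the question's columns.
CREATE TABLE api_rows (
    ID      INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    NAME    VARCHAR(50)  NOT NULL,
    SOMEVAL INT          NOT NULL,
    API_ID  VARCHAR(10)  NOT NULL,
    -- Assumption: batch id plus name identifies a row.
    UNIQUE KEY uq_api_name (API_ID, NAME)
) ENGINE=InnoDB;

-- First run: plain insert of everything the API returned.
INSERT INTO api_rows (NAME, SOMEVAL, API_ID) VALUES
    ('TEST3', 918922, 'A999'),
    ('TEST4', 118922, 'A999');

-- Later run: TEST3 already exists, so its SOMEVAL is updated;
-- TEST5 is new, so it is inserted.
INSERT INTO api_rows (NAME, SOMEVAL, API_ID) VALUES
    ('TEST3', 111111, 'A999'),
    ('TEST5', 222222, 'A999')
ON DUPLICATE KEY UPDATE SOMEVAL = VALUES(SOMEVAL);
```

Existing rows keep their ID values, so references from other tables stay valid. Recent MySQL 8.0 releases deprecate the VALUES() function in this position in favor of a row alias, so the exact spelling may depend on the server version.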
If you have all of the data returned from the API that you need to completely reconstruct the rows after you delete them, then go ahead and delete them, and insert afterwards.
Be sure, though, that you do this in a transaction, and that you are using an engine that supports transactions properly, such as InnoDB, so that other clients of the database don't see rows missing from the table just because they are going to be updated.
For efficiency, you should insert as many rows as you can in a single query. Much faster that way.
BEGIN;
DELETE FROM table WHERE API_ID = 'A987';
INSERT INTO table (NAME, SOMEVAL, API_ID) VALUES
('TEST5', 12345, 'A987'),
('TEST6', 23456, 'A987'),
('TEST7', 34567, 'A987'),
...
('TEST123', 123321, 'A987');
COMMIT;
I am using adodb for PHP library.
For fetching the id at which the record is inserted I use this function "$db->Insert_ID()"
I want to know if there are multiple and simultaneous inserts into the database table, will this method return me the correct inserted id for each record inserted ?
The reason I am asking this is because I use this last insert id for further processing of other records and making subsequent entries in the related tables.
Is this approach safe enough or am I missing something.
Please help me formulate a proper working plan for this so that I can use the last insert id for further inserts into the other table safely without having to mess up with the existing data.
Thanks
Yes, it's safe for concurrent use. That's because LAST_INSERT_ID() is per-connection, as explained here:
"The ID that was generated is maintained in the server on a per-connection basis. This means that the value returned by the function to a given client is the first AUTO_INCREMENT value generated for most recent statement affecting an AUTO_INCREMENT column by that client. This value cannot be affected by other clients, even if they generate AUTO_INCREMENT values of their own. This behavior ensures that each client can retrieve its own ID without concern for the activity of other clients, and without the need for locks or transactions."
$db->Insert_ID() returns only the last insert id. So if you are inserting many records one at a time and read the id after each insert, this will work.
I want to know if there are multiple and simultaneous inserts into the database table, will this method return me the correct inserted id for each record inserted ?
It will return only the most recently inserted id.
In order to get ids for multiple inserts, you will have to call INSERT_ID() after each statement is executed. That is:
INSERT INTO TABLE ....
INSERT_ID()
INSERT INTO TABLE ....
INSERT_ID()
...to get the id value for each insert. If you ran:
INSERT INTO TABLE ....
INSERT INTO TABLE ....
INSERT_ID()
...it will only return the id for the last insert statement.
Is this approach safe enough or am I missing something.
It's safer than using SELECT MAX(id) FROM TABLE, which risks returning a value inserted by someone else among other things relating to isolation levels.