I have this query in PHP. It's an INSERT ... SELECT that copies from table2, but I need to get the IDs of the newly created rows and store them in an array. Here is my code:
$sql = "INSERT INTO table1 SELECT distinct * from table2";
$db->query($sql);
I could reverse the flow, starting with a SELECT on table2 and doing individual inserts, but that would slow the script down on a big table. Any ideas?
You could lock the table, insert the rows, read the insert ID, and then unlock. With the table locked, no other concurrent user can insert rows in between, so the new IDs form one contiguous block (note that in MySQL, LAST_INSERT_ID() reports the first ID generated by a multi-row insert, not the last). Use locking with caution, though, because it blocks every other writer while the lock is held.
An alternative approach could be to use one of the columns in the table - either an 'updated' datetime column, or an insert-id column (for which you put in a value that will be the same across all of your rows.)
That way you can do a subsequent SELECT of the IDs back out of the database matching either the updated time or your chosen insert ID.
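A minimal sketch of the insert-ID-column idea, written in Python against an in-memory SQLite database purely so it can be run as-is; SQLite stands in for MySQL here, and the `name` and `batch_id` columns and the `copy_with_batch_id` helper are assumptions made for illustration, not part of the original schema.

```python
import sqlite3
import uuid


def copy_with_batch_id(conn):
    """Copy distinct rows from table2 into table1 and return the new IDs."""
    # One marker value shared by every row inserted by this copy.
    batch_id = uuid.uuid4().hex
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO table1 (name, batch_id) "
            "SELECT DISTINCT name, ? FROM table2",
            (batch_id,),
        )
    # Read back exactly the rows tagged with this batch's marker.
    rows = conn.execute(
        "SELECT id FROM table1 WHERE batch_id = ? ORDER BY id", (batch_id,)
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table2 (name TEXT);
    CREATE TABLE table1 (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT,
        batch_id TEXT  -- hypothetical marker column
    );
    INSERT INTO table2 (name) VALUES ('a'), ('b'), ('b'), ('c');
    """
)
new_ids = copy_with_batch_id(conn)
```

Because the lookup is by marker rather than by ID range, it does not depend on the new IDs being contiguous, so no table lock is needed. An index on the marker column would probably be worthwhile on a big table.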
Related
I have 2 tables with similar columns in MySQL. I am copying data from one to the other with INSERT INTO table2 SELECT * FROM table1 WHERE column1=smth. The tables have different columns as AUTO_INCREMENT and KEY. When I use mysqli_insert_id I get the first inserted ID rather than the last one. Is there any way to get the last one?
Thanks
There is no inherent ordering of data in a relational database. You have to specify which field you wish to order by, like:
INSERT INTO table2
SELECT *
FROM table1
WHERE column1=smth
ORDER BY <field to sort by here>
LIMIT 1;
Relying on the order a record is written to a table is a very bad idea. If you have an auto-numbered id on table1 then just use ORDER BY id DESC LIMIT 1 to sort the result set by ID in descending order and pick the last one.
Updated to address OP's question about mysqli_insert_id
According to the MySQL reference manual, the function called here is LAST_INSERT_ID(), about which it states:
Important If you insert multiple rows using a single INSERT statement,
LAST_INSERT_ID() returns the value generated for the first inserted
row only. The reason for this is to make it possible to reproduce
easily the same INSERT statement against some other server.
Unfortunately, you'll have to do a second query to get the true "last inserted id". Your best bet might be to run SELECT COUNT(*) FROM table1 WHERE column1=smth; and then add that count, minus 1, to the mysqli_insert_id value: the first ID plus the number of rows inserted, minus one, is the last ID, provided the generated IDs are consecutive. That's not great, but if you have high volume where this one function is getting hit a lot, this is probably the safest route.
The less safe route would be SELECT MAX(id) FROM table2 or SELECT MAX(id) FROM table2 WHERE column1=smth. But... again, depending on your keys and the number of times this insert is getting hit, this might be risky.
So I'm creating a system that will pull 50-150 records at a time from a table and display them to the user, and I'm trying to keep a view count for each record.
I figured the most efficient way would be to create a MEMORY table and use an INSERT INTO to pull the IDs of the viewed rows into it. A cron job would then run regularly to aggregate the view counts per ID, update the original table with the latest counts, and clear out the MEMORY table. This avoids constantly updating the table that'll likely be accessed the most, so I'm not locking 150 rows at a time with each query (or the whole table if I'm using MyISAM).
Basically, the method explained here.
However, I would of course like to do this at the same time as I pull the records information for viewing, and I'd like to avoid running a second, separate query just to get the same set of data for its counts.
Is there any way to SELECT a dataset, return that dataset, and simultaneously insert a single column from that dataset into another table?
It looks like PostgreSQL might have something similar to what I want with the RETURNING keyword, but I'm using MySQL.
First of all, I would not add a counter column to the Main table. I would create a separate Audit table that holds the ID of the item from the Main table plus, at least, a timestamp of when that ID was requested. In essence, the Audit table would store a history of requests. With this approach you can easily generate much more interesting reports. You can always calculate grand totals per item, and you can also calculate summaries by day, week, month, etc., per item or across all items. Depending on the volume of data, you can periodically delete Audit entries older than some threshold (a month, a year, etc.).
Also, you can easily store more information in the Audit table as needed, for example a user ID to calculate stats per user.
To populate the Audit table "automatically" I would create a stored procedure. The client code would call this stored procedure instead of performing the original SELECT. The stored procedure would return exactly the same result as the original SELECT, but would also add the necessary details to the Audit table, transparently to the client code.
So, let's assume that Audit table looks like this:
CREATE TABLE AuditTable
(
    -- Auto-generated key, MySQL syntax shown.
    -- SQL Server: ID int IDENTITY NOT NULL
    -- Postgres:   ID SERIAL NOT NULL
    ID int NOT NULL AUTO_INCREMENT PRIMARY KEY,
    ItemID int NOT NULL,
    RequestDateTime datetime NOT NULL
);
and your main SELECT looks like this:
SELECT ItemID, Col1, Col2, ...
FROM MainTable
WHERE <complex criteria>
To perform both the INSERT and the SELECT in one statement, I'd use the OUTPUT clause in SQL Server and the RETURNING clause in Postgres. I don't think MySQL has anything like this, so the MySQL procedure would consist of several separate statements.
MySQL
First, run your SELECT and insert the results into a temporary (possibly MEMORY) table. Then copy the item IDs from the temporary table into the Audit table. Finally, SELECT from the temporary table to return the result to the client.
CREATE TEMPORARY TABLE TempTable
(
ItemID int NOT NULL,
Col1 ...,
Col2 ...,
...
)
ENGINE = MEMORY
SELECT ItemID, Col1, Col2, ...
FROM MainTable
WHERE <complex criteria>
;
INSERT INTO AuditTable (ItemID, RequestDateTime)
SELECT ItemID, NOW()
FROM TempTable;
SELECT ItemID, Col1, Col2, ...
FROM TempTable
ORDER BY ...;
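The three statements above can be sketched end to end as follows. This is an illustration only: it uses Python with an in-memory SQLite database as a stand-in for a MySQL stored procedure, and the `Name` and `Price` columns, the `Price >= ?` filter, and the `fetch_and_audit` helper are all hypothetical placeholders for the real columns and the `<complex criteria>`.

```python
import sqlite3


def fetch_and_audit(conn, min_price):
    """Return the matching rows and log one audit row per returned item."""
    with conn:  # one transaction around all steps
        conn.execute("DROP TABLE IF EXISTS temp.TempTable")
        conn.execute(
            "CREATE TEMP TABLE TempTable (ItemID INTEGER NOT NULL, Name TEXT)"
        )
        # 1. Run the expensive SELECT once and keep its result.
        conn.execute(
            "INSERT INTO TempTable (ItemID, Name) "
            "SELECT ItemID, Name FROM MainTable WHERE Price >= ?",
            (min_price,),
        )
        # 2. Copy the item IDs into the audit table.
        conn.execute(
            "INSERT INTO AuditTable (ItemID, RequestDateTime) "
            "SELECT ItemID, datetime('now') FROM TempTable"
        )
        # 3. Return the saved result to the caller.
        return conn.execute(
            "SELECT ItemID, Name FROM TempTable ORDER BY ItemID"
        ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE MainTable (ItemID INTEGER PRIMARY KEY, Name TEXT, Price REAL);
    CREATE TABLE AuditTable (
        ID INTEGER PRIMARY KEY AUTOINCREMENT,
        ItemID INTEGER NOT NULL,
        RequestDateTime TEXT NOT NULL
    );
    INSERT INTO MainTable VALUES (1, 'pen', 2.0), (2, 'book', 12.0), (3, 'lamp', 30.0);
    """
)
rows = fetch_and_audit(conn, 10)
```

The main query runs only once; the audit insert and the final read both work from the saved copy, so they are guaranteed to see the same set of rows.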
SQL Server (just to tease you: this single statement does both the INSERT and the SELECT)
MERGE INTO AuditTable
USING
(
SELECT ItemID, Col1, Col2, ...
FROM MainTable
WHERE <complex criteria>
) AS Src
ON 1 = 0
WHEN NOT MATCHED BY TARGET THEN
INSERT
(ItemID, RequestDateTime)
VALUES
(Src.ItemID, GETDATE())
OUTPUT
Src.ItemID, Src.Col1, Src.Col2, ...
;
You can leave the Audit table as it is, or you can set up cron to summarize it periodically. It really depends on the volume of data. In our system we store individual rows for a week, plus we summarize stats per hour and keep them for 6 weeks, plus we keep a daily summary for 18 months. The important part is that all these summaries are separate Audit tables: we don't keep auditing information in the Main table, so we don't need to update it.
Joe Celko explained it very well in SQL Style Habits: Attack of the Skeuomorphs:
Now go to any SQL Forum text search the postings. You will find
thousands of postings with DDL that include columns named createdby,
createddate, modifiedby and modifieddate with that particular
meta data on the end of the row declaration. It is the old mag tape
header label written in a new language! Deja Vu!
The header records appeared only once on a tape. But these meta data
values appear over and over on every row in the table. One of the main
reasons for using databases (not just SQL) was to remove redundancy
from the data; this just adds more redundancy. But now think about
what happens to the audit trail when a row is deleted? What happens to
the audit trail when a row is updated? The trail is destroyed. The
audit data should be separated from the schema. Would you put the log
file on the same disk drive as the database? Would an accountant let
the same person approve and receive a payment?
You're kind of asking if MySQL supports a SELECT trigger. It doesn't. You'll need to do this as two queries, however you can stick those inside a stored procedure - then you can pass in the range you're fetching, have it both return the results AND do the INSERT into the other table.
Updated answer with skeleton example for stored procedure:
DELIMITER $$
CREATE PROCEDURE `FetchRows`(IN StartID INT, IN EndID INT)
BEGIN
UPDATE Blah SET ViewCount = ViewCount+1 WHERE id >= StartID AND id <= EndID;
# ^ Assumes counts are stored in the same table. If they're in a separate table, do an INSERT INTO ... ON DUPLICATE KEY UPDATE ViewCount = ViewCount+1 instead.
SELECT * FROM Blah WHERE id >= StartID AND id <= EndID;
END$$
DELIMITER ;
I'm using PHP to insert groups of records into a MySQL DB.
Whenever I insert a group of records, I want to give that group a unique set ID that is incremented by 1 for each group of records in the DB.
Currently, I'm checking the latest set ID in the DB and incrementing it by 1 for each new set of records.
The thing that scares me though is what happens if I query the DB to get the latest set ID, and before I can insert a new set of records with that set ID + 1, another insert occurs on the table thus taking the set ID I was about to use?
While fairly unlikely, something like that could greatly sacrifice the integrity of the data.
What can I do to prevent such a thing from happening? Is there any way to temporarily lock the DB table so that no other inserts can occur until I have performed a SELECT/INSERT combo?
Locking the table is one option, but that approach impacts concurrency.
The approach I would recommend is that you use a separate table with AUTO_INCREMENT column, and use a separate INSERT into that table, and a SELECT LAST_INSERT_ID() to retrieve the auto_increment value.
And then use that value as the group identifier for the group of rows you insert into your original table.
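A rough sketch of the separate-table approach, using Python and an in-memory SQLite database as a stand-in for MySQL so that it runs on its own. The `record_groups` and `records` tables and the `insert_group` helper are made-up names, and `cursor.lastrowid` plays the role that SELECT LAST_INSERT_ID() plays in MySQL.

```python
import sqlite3


def insert_group(conn, values):
    """Insert `values` as one group of records and return the group's ID."""
    with conn:  # group row and member rows commit together
        # The dedicated table hands out the next group ID; no MAX(id)+1 race.
        cur = conn.execute("INSERT INTO record_groups DEFAULT VALUES")
        group_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO records (group_id, value) VALUES (?, ?)",
            [(group_id, value) for value in values],
        )
    return group_id


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE record_groups (id INTEGER PRIMARY KEY AUTOINCREMENT);
    CREATE TABLE records (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        group_id INTEGER NOT NULL REFERENCES record_groups (id),
        value TEXT
    );
    """
)
first_group = insert_group(conn, ["a", "b"])
second_group = insert_group(conn, ["c"])
```

The insert ID is tracked per connection, so two clients inserting groups at the same time each get their own value without any explicit table lock.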
The basic approach is:
LOCK TABLES foo WRITE;
SELECT MAX(id) + 1 FROM foo;
INSERT ...
INSERT ...
UNLOCK TABLES;
Locking the table prevents any other process from changing the table until you explicitly unlock it.
Having said that, seriously consider just using a table with an AUTO_INCREMENT column. MySQL will do the work of maintaining unique keys wholly automatically, and then you can simply refer to those keys from your existing table.
Using PHP and MySQL, I have a query that will look something like this:
UPDATE mytable
SET status='$newstatus'
WHERE (col1='$col1[0]' AND col2='$col2[0]')
OR (col1='$col1[1]' AND col2='$col2[1]')
OR (...);
I actually need to record the current 'status' of each of these rows before the update. Do I need to do a separate SELECT before this, or can (should / how would) I combine the two queries?
You cannot get that from this query (you can only get the number of affected rows, and that's it). If you need the old values, first run a SELECT with the same conditions, like:
SELECT `id`, `status` FROM `mytable`
WHERE (`col1`='$col1[0]' AND `col2`='$col2[0]')
OR (`col1`='$col1[1]' AND `col2`='$col2[1]')
OR (...)
and then run the UPDATE with a WHERE clause on the fetched ids. I do not recommend running the UPDATE with your current WHERE clause, because the database content could change between your SELECT and your UPDATE, so you could end up updating different rows than the ones you selected. Alternatively, use table locking (but I do not think it makes sense here).
There is no OUTPUT clause in MySQL. You need to either read the status prior to the update or create a trigger that stores the value of OLD.status in another table.
You can't have a single query to update the row and record the current status before updating.
It is better to have a "log table" with the same schema as your table plus a timestamp. It would store only historical data, that is, the status of a row at a single point in time, like a versioning system.
Example:
Table User: Id, Username, Email, Telephone
Table UserLog: Id, Username, Email, Telephone, Timestamp
So, before updating a row on table User, you'd first do a SELECT and an INSERT, like this:
insert into UserLog
select Id, Username, Email, Telephone, Now() from User where Id=$Id
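As a hedged illustration of the log-then-update sequence, here is a small Python sketch against an in-memory SQLite database (standing in for MySQL). The `update_email` helper and the sample data are invented for the example; the two statements run in one transaction so the logged snapshot and the update cannot drift apart, and the ID is passed as a bound parameter rather than interpolated into the SQL string.

```python
import sqlite3


def update_email(conn, user_id, new_email):
    """Snapshot the current row into UserLog, then apply the update."""
    with conn:  # both statements commit together or not at all
        conn.execute(
            "INSERT INTO UserLog (Id, Username, Email, Telephone, Timestamp) "
            "SELECT Id, Username, Email, Telephone, datetime('now') "
            "FROM User WHERE Id = ?",
            (user_id,),
        )
        conn.execute(
            "UPDATE User SET Email = ? WHERE Id = ?", (new_email, user_id)
        )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE User (
        Id INTEGER PRIMARY KEY, Username TEXT, Email TEXT, Telephone TEXT
    );
    -- Id is not a key here: each user can have many log rows.
    CREATE TABLE UserLog (
        Id INTEGER, Username TEXT, Email TEXT, Telephone TEXT, Timestamp TEXT
    );
    INSERT INTO User VALUES (1, 'ann', 'ann@old.example', '555-0100');
    """
)
update_email(conn, 1, "ann@new.example")
logged = conn.execute("SELECT Email FROM UserLog WHERE Id = 1").fetchall()
current = conn.execute("SELECT Email FROM User WHERE Id = 1").fetchone()[0]
```

After the call, `logged` holds the pre-update email and `current` the new one.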
I am writing a converter to transfer data from old systems to new systems. I am using php+mysql.
I have one table that contains millions of records with duplicate entries. I want to transfer that data into a new table and remove all duplicate entries. I am using the following queries and pseudocode to perform this task:
select *
from table1
insert into table2
ON DUPLICATE KEY UPDATE customer_information = concat('$firstName',',','$lastName')
It takes ages to process one table :(
I am wondering whether it is possible to use GROUP BY and get all the grouped records automatically,
rather than going through each record and checking for duplicates, etc.?
For example
select *
from table1
group by firstName, lastName
insert into table 2 only one record and add all users'
first last name into column ALL_NAMES with comma
EDIT
There are different records for each customers with different information. Each row is called duplicated if first and last name of user is same. In new table, we will just add one customer and their bought product in different columns (we have only 4 products).
I don't know what you are trying to do with customer_information, but if you just want to transfer the non-duplicated set of data from one table to another, this will work:
INSERT IGNORE INTO table2(field1, field2, ... fieldx)
SELECT DISTINCT field1, field2, ... fieldx
FROM table1;
DISTINCT will take care of rows that are exact duplicates. But if you have rows that are only partial duplicates (like the same last and first names but a different email) then IGNORE can help. If you put a unique index on table2(lastname,firstname) then IGNORE will make sure that only the first record with lastnameX, firstnameY from table1 is inserted. Of course, you might not like which record of a pair of partial duplicates is chosen.
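To make the unique-index behaviour concrete, here is a small sketch in Python with an in-memory SQLite database standing in for MySQL. SQLite spells the statement INSERT OR IGNORE rather than INSERT IGNORE, and the sample rows and column names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (lastname TEXT, firstname TEXT, email TEXT);
    CREATE TABLE table2 (
        lastname TEXT,
        firstname TEXT,
        email TEXT,
        UNIQUE (lastname, firstname)  -- at most one row per person
    );
    INSERT INTO table1 VALUES
        ('Smith', 'Ann', 'ann@a.example'),
        ('Smith', 'Ann', 'ann@b.example'),  -- partial duplicate
        ('Jones', 'Bob', 'bob@a.example'),
        ('Jones', 'Bob', 'bob@a.example');  -- exact duplicate
    """
)

with conn:
    # DISTINCT drops the exact duplicate; the unique index plus IGNORE
    # silently skips the second row for an already-inserted name.
    conn.execute(
        "INSERT OR IGNORE INTO table2 (lastname, firstname, email) "
        "SELECT DISTINCT lastname, firstname, email FROM table1"
    )

names = conn.execute(
    "SELECT lastname, firstname FROM table2 ORDER BY lastname, firstname"
).fetchall()
```

Four source rows end up as two rows in table2. Which of Ann Smith's two emails survives depends on the order in which the SELECT happens to return them, which is exactly the caveat mentioned above.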
ETA
Now that you've updated your question, it appears that you want to put the values of multiple rows into one field. This is, generally speaking, a bad idea, because when you denormalize your data this way you make it much less accessible. Also, if you are grouping by (lastname, firstname), each group has exactly one name, so there would be nothing to collect in an allnames column. Because of this, my example uses allemails instead. In any event, if you really need to do this, here's how:
INSERT INTO table2(lastname, firstname, allemails)
SELECT lastname, firstname, GROUP_CONCAT(email) as allemails
FROM table1
GROUP BY lastname, firstname;
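The same grouping can be tried out with this sketch, again using Python and in-memory SQLite as a stand-in for MySQL (SQLite also provides group_concat). The sample data is invented, and the DISTINCT inside the aggregate is an addition not in the query above; it keeps exact duplicate rows from repeating an email.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (lastname TEXT, firstname TEXT, email TEXT);
    CREATE TABLE table2 (lastname TEXT, firstname TEXT, allemails TEXT);
    INSERT INTO table1 VALUES
        ('Smith', 'Ann', 'ann@a.example'),
        ('Smith', 'Ann', 'ann@b.example'),
        ('Jones', 'Bob', 'bob@a.example'),
        ('Jones', 'Bob', 'bob@a.example');
    """
)

with conn:
    # One output row per (lastname, firstname), emails joined with commas.
    conn.execute(
        "INSERT INTO table2 (lastname, firstname, allemails) "
        "SELECT lastname, firstname, GROUP_CONCAT(DISTINCT email) "
        "FROM table1 GROUP BY lastname, firstname"
    )

# The order of emails inside each list is not guaranteed, so sort them.
merged = {
    (lastname, firstname): sorted(allemails.split(","))
    for lastname, firstname, allemails in conn.execute(
        "SELECT lastname, firstname, allemails FROM table2"
    )
}
```

The whole deduplication happens in one set-based statement inside the database, instead of a per-row loop in the client, which is where most of the time in the original approach is likely going.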
If they are really duplicate rows (every field is the same) then you can use:
select DISTINCT * from table1
instead of:
select * from table1