Transforming large imports of data - PHP
I have a question about the best way to import a long CSV file and then transform it into two tables for efficiency.
Below is a very scaled-down version of the data, but it suits the purpose:
each row is a unique record, but the first three columns contain highly repetitive data across a large number of rows.
I figured the best way to manage this data was to build two tables:
The first has an auto-increment id field and the result of grouping by the first three columns.
This gives a nice compact table of the main groupings of my data.
The second table holds every row, but instead of repeating columns a, b and c it holds only the variable data columns d, e and f, along with the auto-increment id I generate when importing into the first table.
My question is really: how do I get the id from the first table? Is my only option to requery that table to find the id and then do the insert into the second table?
a,b,c,d,e,f
09/02/2013,A1,1,18503112043123,11,2.1219
09/02/2013,A1,1,44102576116476,73,14.0817
09/02/2013,A1,1,66918345446536,134,25.8486
09/02/2013,A1,2,62009507978229,10,1.929
09/02/2013,A1,2,92278593945574,55,10.6095
09/02/2013,B1,1,50474606002324,90,17.361
09/02/2013,B1,1,59697581427675,7,1.3503
09/02/2013,B1,1,86298530583467,51,9.8379
09/02/2013,B1,2,34885481077847,80,15.432
09/02/2013,B1,2,25479347211047,164,31.6356
09/02/2013,B1,3,56270556524425,6,1.1574
09/02/2013,C1,1,57680166803098,24,4.6296
09/02/2013,C1,1,72778510788287,77,14.8533
09/02/2013,C1,1,26084111080146,140,27.006
09/02/2013,C1,1,31435464483578,361,65.5937
09/02/2013,C1,2,29457756254473,492,89.3964
09/02/2013,C1,2,68414218104066,293,53.2381
EDIT
I have two queries in mind:
1. My parent table, which has an auto-increment id:
insert into parent_table
select null, a, b, c
from `table`
group by a, b, c
2. My child table, which holds all my rows of data but includes the corresponding auto-increment id from the parent table.
I don't understand how to pull the id back again, without doing a query back to the parent table, as I insert the data into the child table.
You can use PDO::lastInsertId() or mysqli::$insert_id to retrieve the auto-generated id used in the last query. Just do the insert and then fetch the id:
$sth = $pdo->prepare("insert into first_table (a, b, c) values (?, ?, ?)");
$sth->execute(array('2013-02-09', 'A1', 1));
$id = $pdo->lastInsertId();
There is also MySQL's LAST_INSERT_ID() function. You could try
insert into second_table (first_table_id, d, e, f) values (LAST_INSERT_ID(), ...)
but I have never tried this myself.
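For the bulk import itself, here is a minimal sketch of one approach (the names parent_table, child_table and import.csv are assumptions, not from the question): cache the (a, b, c) => id mapping in a PHP array as you read the CSV, so each distinct group is inserted, and its id fetched via lastInsertId(), only once; every later row in the same group reuses the cached id instead of requerying the parent table.
<?php
// Sketch: single-pass CSV import into a parent/child pair of tables.
// Assumed schema: parent_table(id AUTO_INCREMENT, a, b, c) and
// child_table(parent_id, d, e, f).
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$insertParent = $pdo->prepare(
    "insert into parent_table (a, b, c) values (?, ?, ?)");
$insertChild = $pdo->prepare(
    "insert into child_table (parent_id, d, e, f) values (?, ?, ?, ?)");

$groupIds = array();           // "a|b|c" => parent id
$fh = fopen('import.csv', 'r');
fgetcsv($fh);                  // skip the header row (a,b,c,d,e,f)

$pdo->beginTransaction();      // one transaction keeps the import fast
while (($row = fgetcsv($fh)) !== false) {
    list($a, $b, $c, $d, $e, $f) = $row;
    $key = "$a|$b|$c";
    if (!isset($groupIds[$key])) {
        // First time this group appears: insert it and remember its id.
        $insertParent->execute(array($a, $b, $c));
        $groupIds[$key] = $pdo->lastInsertId();
    }
    $insertChild->execute(array($groupIds[$key], $d, $e, $f));
}
$pdo->commit();
fclose($fh);
With the sample data above this yields 7 parent rows and 17 child rows, and the parent table is never requeried: every id comes either from lastInsertId() or from the cache array.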
Related
Mysql Insert .... select, obtain last insert ID
I have this query in PHP. It's an INSERT ... SELECT copying from table2, but I need to get the IDs of the newly created rows and store them in an array. Here is my code:
$sql = "INSERT INTO table1 SELECT distinct * from table2";
$db->query($sql);
I could reverse the flow, starting with a SELECT on table2 and making single inserts, but that would slow down the script on a big table. Ideas?
You could lock the table, insert the rows, get the ID of the last item inserted, and then unlock; that way you know the IDs will be contiguous, as no other concurrent user could have changed them. Locking and unlocking is something you want to use with caution, though.
An alternative approach is to use one of the columns in the table: either an 'updated' datetime column, or an insert-id column into which you put a value that is the same across all of your rows. That way you can do a subsequent SELECT of the IDs back out of the database, matching either the updated time or your chosen insert ID.
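A minimal sketch of the locking approach, assuming table1 has an AUTO_INCREMENT id: for a single multi-row INSERT, MySQL's LAST_INSERT_ID() returns the id generated for the first inserted row, so while the lock is held the new ids form the contiguous range from that value up to that value plus ROW_COUNT() - 1.
LOCK TABLES table1 WRITE, table2 READ;

INSERT INTO table1 SELECT DISTINCT * FROM table2;

-- first new id and how many rows went in; the new ids are
-- [first_id, first_id + row_count - 1]
SELECT LAST_INSERT_ID() AS first_id, ROW_COUNT() AS row_count;

UNLOCK TABLES;
Note that every table touched inside the locked section has to appear in the LOCK TABLES list, which is why table2 is locked for reading as well.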
Efficient way to check and insert data into MySQL database node
First way: fetch all the records from the database and build a result array. Loop over the array of new rows and check whether each unique id already exists in that result array before inserting into the DB.
Second way: loop over the array of new rows and check whether each unique id already exists in the database, then insert into the database.
Note: the data being inserted is very small compared to the data already in the database table. Please suggest the best way to do this.
An implementation of #NigelRen's idea:
insert into destTable (id, coln, ...)
select st.id, st.coln, ...
from srcTable st
left join destTable dt on st.id = dt.id
where dt.id is null
Here id is the column that relates the two tables one-to-one.
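A concrete version of the same pattern, with hypothetical column names (name, qty) standing in for the elided list; only source rows whose id is not already present in destTable get inserted:
-- The LEFT JOIN ... IS NULL filter keeps out every id that
-- destTable already contains.
INSERT INTO destTable (id, name, qty)
SELECT st.id, st.name, st.qty
FROM srcTable st
LEFT JOIN destTable dt ON st.id = dt.id
WHERE dt.id IS NULL;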
Controlled table querying based on existing rows
I have a product_group table with the following fields: group_id, product_id, order. The table will be queried against a lot: a single-form view will make it possible to insert new records and/or update existing ones with one submit. I'm trying to figure out the optimal solution to cover the following three cases:
1. The user tries to insert an existing row: do nothing. Here a unique index over the three columns could be useful.
2. The user changes only the order column: perform an update.
3. The user inserts a completely new set of values: perform an insert.
Is there a way to put all of this together in one MySQL query? If not, what would be the best approach here? The goal is to limit database queries as much as possible.
Does this do what you want?
insert into product_group (group_id, product_id, `order`)
    values (#group_id, #product_id, #order)
    on duplicate key update `order` = values(`order`);
along with a unique index on (group_id, product_id):
create unique index idx_product_group_2 on product_group(group_id, product_id);
This handles your three cases:
1. Inserting an existing row is a no-op, because the value assignment changes nothing when the values are the same.
2. The order column is updated when the other two columns match an existing row.
3. A new row with a different group_id or product_id is inserted.
As a note, order is a lousy name for a column, because it is a SQL keyword.
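In PHP, a sketch of how the upsert might look with a PDO prepared statement (the $pdo connection and the variable names are assumed):
<?php
// Upsert via INSERT ... ON DUPLICATE KEY UPDATE; relies on the
// unique index on (group_id, product_id) described above.
$sth = $pdo->prepare(
    "insert into product_group (group_id, product_id, `order`)
     values (?, ?, ?)
     on duplicate key update `order` = values(`order`)");
$sth->execute(array($groupId, $productId, $order));

// rowCount() is 1 for a fresh insert, 2 when an existing row was
// updated, and 0 when the row already held exactly these values.
$affected = $sth->rowCount();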
How to automatically add a new row to another table PHP MYSQL
I have two tables with a one-to-one relationship in MySQL. They both have the same user_id as primary key, and I need a way, when I insert a post into my first table, to automatically insert a row with the same user_id into my second table. Is there any MySQL command or PHP script I can use for this?
You might set up a TRIGGER in the database. Triggers are stored in the database's structure; with PHP you only need to execute the CREATE TRIGGER query once to create one.
However, two tables having the exact same data as their PRIMARY KEY suggests your database structure is a bit badly modelled. You should take the time to remodel the database, essentially merging the two tables together if possible.
And if you are using PHP to INSERT INTO the database, you can simply run the two queries one after the other:
INSERT INTO table1(field1, field2...) VALUES (value1, value2...)
INSERT INTO table2(field1, field2...) VALUES (value1, value2...)
But relying on two consecutive queries requires pinpoint accuracy, as the primary keys might otherwise go out of sync, breaking the relations.
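A minimal sketch of such a trigger, with hypothetical table names posts and post_meta that share the user_id key (the DELIMITER lines are only needed in the mysql command-line client, not when the statement is sent from PHP):
DELIMITER $$
CREATE TRIGGER copy_user_id
AFTER INSERT ON posts
FOR EACH ROW
BEGIN
    -- mirror the new user_id into the second table automatically
    INSERT INTO post_meta (user_id) VALUES (NEW.user_id);
END$$
DELIMITER ;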
If I understand your question right, you can fetch the last inserted user_id from the first table and use it to insert a new row into the second table:
$query = mysql_query("SELECT * FROM first_table ORDER BY user_id DESC LIMIT 1");
$row = mysql_fetch_assoc($query);
Now $row['user_id'] can be inserted into the second table. This could of course be risky if many rows are inserted at the same time.
group by mysql option
I am writing a converter to transfer data from old systems to new systems, using PHP + MySQL. I have one table that contains millions of records with duplicate entries. I want to transfer that data into a new table with the duplicates removed. I am using the following queries and pseudo-code to perform this task:
select * from table1
insert into table2 ON DUPLICATE KEY UPDATE customer_information = concat('$firstName', ',', '$lastName')
It takes ages to process one table :( I am wondering: is it possible to use GROUP BY and get all the grouped records automatically, rather than going through each record and checking for duplicates? For example:
select * from table1 group by firstName, lastName
and then insert only one record into table2, adding all the users' first and last names into a column ALL_NAMES, separated by commas.
EDIT
There are different records for each customer, each with different information. A row counts as a duplicate if the user's first and last name are the same. In the new table we will add just one row per customer, with the products they bought in separate columns (we have only 4 products).
I don't know what you are trying to do with customer_information, but if you just want to transfer the non-duplicated set of data from one table to another, this will work:
INSERT IGNORE INTO table2 (field1, field2, ... fieldx)
SELECT DISTINCT field1, field2, ... fieldx
FROM table1;
DISTINCT takes care of rows that are exact duplicates. But if you have rows that are only partial duplicates (like the same last and first names but a different email), then IGNORE can help: if you put a unique index on table2(lastname, firstname), IGNORE makes sure that only the first record with lastnameX, firstnameY from table1 is inserted. Of course, you might not like which record of a pair of partial duplicates gets chosen.
ETA: Now that you've updated your question, it appears that you want to put the values of multiple rows into one field. Generally speaking, this is a bad idea: denormalizing your data this way makes it much less accessible. Also, if you are grouping by (lastname, firstname), there will be no extra names to collect into allnames; because of this, my example uses allemails instead. In any event, if you really need to do this, here's how:
INSERT INTO table2 (lastname, firstname, allemails)
SELECT lastname, firstname, GROUP_CONCAT(email) AS allemails
FROM table1
GROUP BY lastname, firstname;
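One caveat if you go the GROUP_CONCAT route: MySQL silently truncates the concatenated result at group_concat_max_len bytes (1024 by default), so for large groups you may want to raise the limit first; a custom separator is also available:
-- Raise the GROUP_CONCAT limit for this session; results longer
-- than the limit are otherwise silently truncated.
SET SESSION group_concat_max_len = 1000000;

INSERT INTO table2 (lastname, firstname, allemails)
SELECT lastname, firstname,
       GROUP_CONCAT(email SEPARATOR ', ') AS allemails
FROM table1
GROUP BY lastname, firstname;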
If they are really duplicate rows (every field is the same) then you can use
select DISTINCT * from table1
instead of
select * from table1