I need to load a CSV file into a database. A few weeks after loading one file, I need to load an updated version of it, and the problem is that duplicate rows get added.
I don't have an ID for the rows, so I have to check whether two rows are the same using 'City', 'Address' and 'Location Name'. Only if all three match should the new row be skipped.
I tried IGNORE, but it seems to only work with an ID as primary key (and I don't have a primary key).
I also read a thread about composite ('multiple column') primary keys, but I did not manage to create one.
My current code is (CodeIgniter):
$query = $this->db->query('
LOAD DATA INFILE "'.$path.'fichier/'.$fichier.'"
INTO TABLE location FIELDS TERMINATED BY ";"
LINES TERMINATED BY "'.$os2.'"
IGNORE 1 LINES ('.$name[1].','.$name[2].','.$name[3].','.$name[4].','.$name[5].','.$name[6].','.$name[7].','.$name[8].','.$name[9].','.$name[10].','.$name[11].','.$name[12].','.$name[13].','.$name[14].','.$name[15].','.$name[16].','.$name[17].','.$name[18].','.$name[19].')');
I would say the best approach is to load your updated CSV into a staging table. Once all the data is loaded, either do a LEFT JOIN against your actual table to find all the new records and insert only those into your main table, or simply replace the main table's contents with the staging table.
Per your comment:
Yes, once you have loaded the data into the new table, perform a LEFT JOIN with your main table (something like below):
select staging_table.id
from staging_table
left join main_table on staging_table.id = main_table.id
where main_table.id is null;
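Since you said you don't have an ID, a sketch of the insert step using your three matching columns instead might look like this (the column names city, address and location_name are assumptions; adjust them to your schema):
INSERT INTO main_table (city, address, location_name)
SELECT s.city, s.address, s.location_name
FROM staging_table s
LEFT JOIN main_table m
       ON m.city = s.city
      AND m.address = s.address
      AND m.location_name = s.location_name
WHERE m.city IS NULL;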
In case someone wants to know how I finally solved it:
I created a unique index in phpMyAdmin and then used IGNORE in my query.
ALTER TABLE location
ADD CONSTRAINT iu_location UNIQUE( col1, col2, col3 );
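With that constraint in place, the IGNORE keyword on LOAD DATA skips any row that would violate it. A minimal sketch (file path, delimiters and column names are placeholders):
LOAD DATA INFILE '/path/to/fichier.csv'
IGNORE
INTO TABLE location
FIELDS TERMINATED BY ';'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(col1, col2, col3);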
Related
In MySQL, is there a way to evaluate during LOAD DATA whether a record exists in the database but not in the imported data, based on a multi-column index?
Example:
Updating a record in the database if the Name + UID exists in the database, the Name with other UIDs exists in the import, but the import does not include some Name + UID combination that is in the database.
If not, perhaps it is easier to periodically run a query that updates records for Name + UID combos where the matching Name has records with a newer create date or update date for other UIDs?
You would start by putting a unique index on UniqueID and Name. This makes sure the database knows that the combination of those two would be a duplicate key. Then your PDO query would look something like:
INSERT INTO `myTbl` (`UniqueID`,`Name`,`FixedDate`)
VALUES (:UniqueID, :Name, :FixedDate)
ON DUPLICATE KEY UPDATE `FixedDate` = VALUES(FixedDate)
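Creating that unique index might look something like this (the index name is only an example):
ALTER TABLE `myTbl` ADD UNIQUE KEY `uq_uniqueid_name` (`UniqueID`, `Name`);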
I ended up doing this in a relatively straightforward manner. That being said, I am still interested to know if anyone has a simpler / more efficient way of doing this in MySQL (MariaDB 5.7).
I have a multi-column index on Host + CVE to catch duplicates. I also have a createDate and an updateDate column. The createDate is set automatically on import, and the updateDate updates automatically on import or on record update, except that it must not change during the import process below (I want to keep track of the last time we actually touched the record with our GUI).
LOAD DATA LOCAL INFILE '/tmp/Example.csv' INTO TABLE ExampleImport
FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\r\n'
IGNORE 1 LINES;
INSERT INTO ExampleTable (PluginID,CVE,CVSS,Risk,Host,Protocol,Port,Name,Synopsis,Description,Solution,SeeAlso,PluginOutPut)
SELECT PluginID, CVE, CVSS, Risk, Host, Protocol, Port, Name, Synopsis, Description, Solution, SeeAlso, PluginOutput
FROM ExampleImport
ON DUPLICATE KEY UPDATE ImportDate = CURRENT_TIMESTAMP, UpdateDate = UpdateDate;
UPDATE ExampleTable x4
INNER JOIN (SELECT Host, MAX(UpdateDate) MaxDate
FROM ExampleTable
GROUP BY Host
) x2 ON x4.Host = x2.Host
SET FixDate = CURDATE(), x4.UpdateDate = x4.UpdateDate
WHERE x4.UpdateDate < x2.MaxDate;
I have a large database with many tables with several thousand records.
I have done some Excel work to identify the records / rows I want to delete from one of my large tables, because when I tried to do the query within phpMyAdmin, the table kept locking as the query was too big.
Anyway.... Long story short.
I now have a list of 1500 records I need to delete from one of my tables.
Is there a way to "paste" these 1500 values into a query, so I can bring back the matching records, select them all at once and delete them?
Obviously, I don't want to do this manually one at a time.
So the query I have in mind is something like this:
Find any records which match these IDs (WHERE ID = )
Paste in list of IDs from Ms Excel
Results returned
Can select all rows and delete
Any tips?
Thanks
Just use the keyword IN in your query with your list of values, like:
Select Name
From Users
Where ID IN (1,2,3,4 .....) ;
There is more than one way to do this.
First:
You can directly paste the list of comma-separated IDs in the WHERE clause.
DELETE FROM tablename WHERE ID IN (1,2,3,4);
If you get the error 'Packet Too Large', you can increase max_allowed_packet:
The largest possible packet that can be transmitted to or from a MySQL
5.0 server or client is 1GB.
Second:
You can export your Excel file to a CSV file, load the data into a temp table, and then delete from the table using the tmp table as a reference.
LOAD DATA INFILE 'X:\filename.csv'
INTO TABLE tmptable
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n';
DELETE t1
FROM tablename t1
JOIN tmptable t2 ON t1.ID = t2.ID;
Reference: MySQL LOAD DATA INFILE Syntax
Don't forget to remove your tmp table.
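For example:
DROP TABLE IF EXISTS tmptable;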
I am stuck on a simple update query.
I have a table, say table1, containing a 'name' and a 'phone_no' column. Now when I upload a CSV file containing a list of names and contact numbers, I want to update the name of a duplicate number with the previous one. For example, I have a row containing 'max', '8569589652'. Now when I upload the same number with another name, say 'stela', '8569589652', then 'stela' should get updated to 'max'.
For this purpose I created another table, say table2. I collected all duplicate entries from table1 into table2, and after that updated the new entries with the previous name.
Following are my queries.
To collect all duplicate entries:
INSERT INTO table2 SELECT phone_no,name FROM table1
GROUP BY phone_no HAVING COUNT(*)>1;
To update duplicate entries in table1:
UPDATE table1, table2 SET table1.name = table2.name
WHERE table1.phone_no = table2.phone_no;
My problem is that when I run these two queries, they take too much time; it takes more than half an hour to upload a CSV file of 1000 numbers.
Please suggest how to optimize the queries so the CSV uploads in less time.
Does the upload speed depend on the size of the database?
Please help.
Thanks in advance.
Here are the steps from the link I suggested.
1) Create a new temporary table.
CREATE TEMPORARY TABLE temporary_table LIKE target_table;
2) Optionally, drop all indices from the temporary table to speed things up.
SHOW INDEX FROM temporary_table;
DROP INDEX `PRIMARY` ON temporary_table;
DROP INDEX `some_other_index` ON temporary_table;
3) Load the CSV into the temporary table
LOAD DATA INFILE 'your_file.csv'
INTO TABLE temporary_table
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(field1, field2);
4) Copy the data using ON DUPLICATE KEY UPDATE
SHOW COLUMNS FROM target_table;
INSERT INTO target_table
SELECT * FROM temporary_table
ON DUPLICATE KEY UPDATE field1 = VALUES(field1), field2 = VALUES(field2);
5) Remove the temporary table
DROP TEMPORARY TABLE temporary_table;
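Applied to your case, and assuming phone_no is made a UNIQUE key on table1 so that a re-uploaded number counts as a duplicate, the whole flow might look roughly like this (the file name, delimiter and column order are placeholders):
-- One-time: make phone_no unique so duplicates trigger ON DUPLICATE KEY UPDATE
ALTER TABLE table1 ADD UNIQUE (phone_no);

CREATE TEMPORARY TABLE csv_import LIKE table1;

LOAD DATA INFILE 'numbers.csv'
INTO TABLE csv_import
FIELDS TERMINATED BY ','
(name, phone_no);

-- table1.name on the right-hand side keeps the previously stored name for existing numbers
INSERT INTO table1 (name, phone_no)
SELECT name, phone_no FROM csv_import
ON DUPLICATE KEY UPDATE name = table1.name;

DROP TEMPORARY TABLE csv_import;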
You can update duplicate entries of phone_no like below:
INSERT INTO table2 (phone_no, name)
VALUES
('11111', 'aaa'),
('22222', 'bbb'),
('33333', 'cccc')
ON DUPLICATE KEY UPDATE
phone_no = VALUES(phone_no);
Dump your CSV file into a temp table.
Then simply apply a MERGE statement (your_main_table and your_temp_table are placeholders for your actual table names):
MERGE your_main_table AS MAIN
USING your_temp_table AS TEMP
ON MAIN.CONTACT_NO = TEMP.CONTACT_NO
WHEN MATCHED THEN UPDATE
SET MAIN.NAME = TEMP.NAME;
If you want to insert non-matching records, use this:
WHEN NOT MATCHED
THEN INSERT
(NAME,
CONTACT_NO)
VALUES
(
TEMP.NAME,
TEMP.CONTACT_NO
);
Please note that the MERGE command must end with ';'.
I have used ';' after the UPDATE part above; remove that, add the part below, and end the whole MERGE with ';'.
Hope this helps.
Please update if any more help is needed.
I am using MySQL + PHP. The PHP provides an interface to import an XLSX table of goods into MySQL table(s). I am using a temporary table created with CREATE TABLE ... LIKE, i.e. an empty clone of the live table, for user evaluation before it is merged with the live table.
I am using INSERT INTO finalTable SELECT * FROM importTable ON DUPLICATE KEY UPDATE ... to avoid an INSERT for every record. The fields to be updated on duplicate key are determined by PHP with SHOW COLUMNS, i.e. all columns from the temporary table.
My problem is that the XLSX does not necessarily contain ALL the fields (name, price, category), i.e. it may be just an update of existing records. The current method will update ALL fields whether or not they were set during the import, so if the XLSX contained only a price update, the rest of the fields are NULL and would overwrite the current values.
I thought of altering the temporary table by removing the columns (not rows!) that are NULL for all rows. This way SHOW COLUMNS would return only the update-able columns. If a field needs to be zeroed, it would be easy to set a special value, e.g. "!X!", and use a single UPDATE statement to do so.
Is there a method for this, or do you have a better suggestion (I am open to abandoning ON DUPLICATE KEY too)?
I'm a bit confused by your question.
I am using INSERT ... ON DUPLICATE KEY UPDATE to avoid INSERT for every record.
Like it says, it inserts or updates; you don't avoid anything. If you want to avoid inserts, use INSERT IGNORE.
If I were you, I'd merge the tables in two steps.
First step, the insert (just the necessary ones):
INSERT INTO finalTable (col1, col2, ...)
SELECT i.col1, i.col2
FROM importTable i
LEFT JOIN finalTable f ON i.ID = f.ID
WHERE f.ID IS NULL;
Second step, the update:
I don't understand why you want to delete columns. It's an unnecessary step which might take a while, and most importantly, your problem of updating with NULL still persists if just a few rows in a column are NULL in your import table. So, here's the solution.
UPDATE finalTable f
INNER JOIN importTable i ON f.ID = i.ID
SET f.col1 = COALESCE(i.col1, f.col1),
f.col2 = COALESCE(i.col2, f.col2),
...;
The trick in the second step is to use COALESCE(). This function returns the first of its parameters which is not NULL. So if a column is NULL in your import table, the value in the final table stays as it is.
UPDATE:
If you insist on having just one statement, you can of course do
INSERT INTO final
SELECT * FROM import i
ON DUPLICATE KEY UPDATE
final.col1 = COALESCE(i.col1, final.col1),
final.col2 = COALESCE(i.col2, final.col2),
...;
I have a contactnumber column in a MySQL database. In the contactnumber column there are more than 20,000 entries. Now when I upload new numbers through a .csv file, I don't want duplicate numbers in the database.
How can I avoid duplicate numbers while inserting into the database?
I initially implemented logic that checks each number in the .csv file against each number in the database.
This works, but it takes a lot of time to upload a .csv file containing 1000 numbers.
Please suggest how to minimize the time required to upload the .csv file while not uploading duplicate values.
Simply add a UNIQUE constraint to the contactnumber column:
ALTER TABLE `mytable` ADD UNIQUE (`contactnumber`);
From there you can use the IGNORE option to ignore the error you'd usually be shown when inserting a duplicate:
INSERT IGNORE INTO `mytable` VALUES ('0123456789');
Alternatively, you could use ON DUPLICATE KEY UPDATE to do something with the dupe, as detailed in this question: MySQL - ignore insert error: duplicate entry
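For instance, a small sketch of the ON DUPLICATE KEY UPDATE variant (the last_seen column is hypothetical, just to show where the dupe handling would go):
INSERT INTO `mytable` (`contactnumber`) VALUES ('0123456789')
ON DUPLICATE KEY UPDATE `last_seen` = NOW();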
If your contactnumber should not be repeated, then make it PRIMARY or at least a UNIQUE key. That way, when a duplicate value is being inserted, the insert will fail automatically and you won't have to check beforehand.
The way I would do it is to create a temporary table.
create table my_dateasyyyymmddhhiiss as select * from mytable where 1=0;
Do your inserts into that table.
Then query out the orphans between mytable and the temp table based on contactnumber, i.e. the new numbers that are not yet in mytable.
Then run an inner join query between the two tables to fetch the duplicates for your telecaller tracking.
Finally, drop the temporary table. (All three steps are sketched below.)
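A minimal sketch of those steps, assuming contactnumber is the only column that matters and the temp table has been filled from your CSV (the duplicate report is taken before inserting the orphans, so it only lists pre-existing numbers):
-- Numbers that already exist in mytable (duplicates), for telecaller tracking
SELECT t.contactnumber
FROM my_dateasyyyymmddhhiiss t
INNER JOIN mytable m ON m.contactnumber = t.contactnumber;

-- Insert only the orphans, i.e. numbers not yet in mytable
INSERT INTO mytable (contactnumber)
SELECT t.contactnumber
FROM my_dateasyyyymmddhhiiss t
LEFT JOIN mytable m ON m.contactnumber = t.contactnumber
WHERE m.contactnumber IS NULL;

DROP TABLE my_dateasyyyymmddhhiiss;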
One thing this does not address is duplicates within the supplied file itself (I don't know if that would be an issue in this problem).
Hope this helps.
If you don't want to insert duplicate values into the table and would rather keep those values in a different table, you can create a trigger on the table, like this:
DELIMITER $$
CREATE TRIGGER unique_key BEFORE INSERT ON table1
FOR EACH ROW BEGIN
DECLARE c INT;
SELECT COUNT(*) INTO c FROM table1 WHERE itemid = NEW.itemid;
IF (c > 0) THEN
insert into table2 (column_name) values (NEW.itemid);
END IF;
END$$
DELIMITER ;
I would recommend this approach:
Alter the contactnumber column to be a UNIQUE key.
Using phpMyAdmin, import the .csv file and check the option 'Do not abort on INSERT error' under Format-Specific Options before submitting.