Back-filling tables from another table; cannot have duplicates - php

We have two tables on two different MySQL servers. We have a unique key, which is the combination of invoice and date.
We need to grab all of the records from a certain time period and put them into the other table. The caveat is that some of those records may already exist there, so we want to exclude them from the records we are back-filling.
What queries, ideas, scripts, etc. would be the most helpful in accomplishing this?

If you put a unique key on a field (or a combination of fields) that uniquely identifies a record, you can use INSERT IGNORE as your MySQL statement. This will insert records, but if a key conflict arises (i.e. the record already exists), it will simply skip that record and proceed to the next.
You could also use REPLACE INTO instead of INSERT INTO, which is similar to INSERT IGNORE, but rather than skipping the conflicting record, it will overwrite the existing row.
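A minimal sketch of both statements, assuming the remote records have first been pulled into a local staging table (all table and column names here are hypothetical):

-- Insert only rows whose (invoice, date) key is not already present
INSERT IGNORE INTO invoices (invoice, invoice_date, amount)
SELECT invoice, invoice_date, amount
FROM staging_invoices
WHERE invoice_date BETWEEN '2013-01-01' AND '2013-03-31';

-- Or overwrite any existing row with the incoming version
REPLACE INTO invoices (invoice, invoice_date, amount)
SELECT invoice, invoice_date, amount
FROM staging_invoices
WHERE invoice_date BETWEEN '2013-01-01' AND '2013-03-31';

Both rely on a UNIQUE key existing on (invoice, invoice_date).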

Look at the MERGE syntax: T-SQL MERGE.
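For reference, a minimal sketch of that statement in SQL Server syntax; MySQL has no MERGE, so the INSERT IGNORE / REPLACE approaches above are its closest equivalents. The table and column names are hypothetical:

MERGE INTO target_invoices AS t
USING source_invoices AS s
    ON t.invoice = s.invoice AND t.invoice_date = s.invoice_date
WHEN NOT MATCHED THEN
    INSERT (invoice, invoice_date, amount)
    VALUES (s.invoice, s.invoice_date, s.amount);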

Related

Trigger to multiple tables on INSERT

Quick question.
In my user database I have 5 separate tables, all containing different information. 4 of the tables are connected by foreign key to the primary key of the first table.
I want to trigger row inserts on the other 4 tables when I do an insert on the first (primary) table. I thought ON UPDATE CASCADE would do this for me, but after trying it I realised it did not... I know the clue is in the name: ON UPDATE!
I also tried multiple triggers on the same table, but found this was not possible either.
What I am planning to do is put a trigger on the first table to INSERT on the second, then put a trigger on the second to insert on the third... etc.
I would just like to know whether this is a wise thing to do, or if I am missing a better and simpler way of doing this.
Any help/advice much appreciated.
Based on the given information, it "feels" as if there might be a flaw in the database design if each of the child tables requires a row for every single row in the parent table. There is a reason that "ON INSERT CASCADE" does not exist; it is typically not considered meaningful.
The first thought that comes to mind is that the child tables should actually be part of the parent table; it sounds as if there is a one-to-one relationship. It still may make sense to have separate tables from an organizational standpoint (and size of records), but it is something to think about.
If there is not a one-to-one relationship, then needing to add meaningful data beyond default values to the child tables would imply that a bit more normalization of the data is required. If the only values to be added are NULLs, then one could argue that there is no real point in having the record, because a LEFT JOIN could produce the same results without it.
Having said all that, if it is required, it would be better to have a single trigger on the parent table add all the records to the child tables, rather than chaining several triggers together; that way the logic is contained in a single location. A sketch follows.
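A minimal sketch of that single-trigger approach, using hypothetical table names (a parent users table with two child tables keyed by user_id):

DELIMITER //
CREATE TRIGGER users_after_insert
AFTER INSERT ON users
FOR EACH ROW
BEGIN
    -- Create a default row in each child table, keyed to the new parent id
    INSERT INTO user_settings (user_id) VALUES (NEW.id);
    INSERT INTO user_stats (user_id) VALUES (NEW.id);
END//
DELIMITER ;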
Not knowing your structure (the information you need in each of these tables is pertinent to a correct answer), I can only guess that a trigger might not be what you want here. If your tables have fields beyond what is in table 1 and they do not have default values, how will you get the values for those other fields in the trigger? Personally, I would use a stored procedure that inserts into table 1, gets the id value back from the insert, and then inserts into the other tables with the additional information needed, all wrapped in a transaction so that if one insert fails, all are rolled back.
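A rough sketch of that stored-procedure approach (table, column, and parameter names are hypothetical; RESIGNAL requires MySQL 5.5+):

DELIMITER //
CREATE PROCEDURE add_user(IN p_name VARCHAR(100), IN p_bio TEXT)
BEGIN
    DECLARE EXIT HANDLER FOR SQLEXCEPTION
    BEGIN
        ROLLBACK;   -- any failed insert undoes the whole batch
        RESIGNAL;
    END;
    START TRANSACTION;
    INSERT INTO users (name) VALUES (p_name);
    -- LAST_INSERT_ID() returns the auto-increment id generated above
    INSERT INTO user_profiles (user_id, bio) VALUES (LAST_INSERT_ID(), p_bio);
    COMMIT;
END//
DELIMITER ;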

Check for existing entries in Database or recreate table?

I've got a PHP script pulling a file from a server and plugging its values into a database every 4 hours.
This file can, and most likely will, change within those 4 hours (or whatever timeframe I finally choose). It's a list of properties and their owners.
Would it be better to check the file against each DB entry and update any that need it, or to create a temp table and then compare the two using an SQL query?
Neither.
What I'd personally do is run the INSERT command using ON DUPLICATE KEY UPDATE (assuming your table is properly designed and that you are using at least one piece of information from your file as a UNIQUE key, which, based on your comment, you should be).
Reasons:
Creating a temp table is a hassle.
Comparing is a hassle too. You need to select a record, compare it, update it if the two differ, and so on; comparing piece by piece is a giant waste of time when there's a better way to do it.
It is much easier to simply insert everything you find; if a clash occurs, that means the record already exists and most likely needs updating.
That way you take care of everything with one query, your data integrity is preserved, and you can just keep filling your table with new records or updating existing ones. A sketch follows.
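A minimal example, assuming a hypothetical properties table with a UNIQUE key on property_id:

INSERT INTO properties (property_id, owner)
VALUES (42, 'Jane Doe')
ON DUPLICATE KEY UPDATE owner = VALUES(owner);

If property_id 42 is new, the row is inserted; if it already exists, only the owner column is updated.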
I think it would be best to download the file and update the existing table, maybe using REPLACE or REPLACE INTO. "REPLACE works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted." http://dev.mysql.com/doc/refman/5.0/en/replace.html
Presumably you have a list of columns that have to match in order for you to decide that two rows match.
If you create a UNIQUE index over those columns, then you can use either INSERT ... ON DUPLICATE KEY UPDATE or REPLACE INTO ... (both are described in the MySQL manual).

Possible to have a mixed overwrite / append batch write into MySQL?

I am setting up an uploader (using PHP) for my client where they can select a CSV (in a pre-determined format) on their machine to upload. The CSV will likely have 4000-5000 rows. PHP will process the file by reading each line of the CSV and inserting it directly into the DB table. That part is easy.
However, ideally, before appending this data to the database table, I'd like to look at 3 of the columns (A, B, and C) and check whether I already have a matching combo of those 3 fields in the table; if so, I would rather UPDATE that row than append. If I do NOT have a matching combo of those 3 columns, I want to go ahead and INSERT the row, appending the data to the table.
My first thought is that I could make columns A, B, and C a unique index in my table, then just INSERT every row, somehow detect a failed INSERT (due to the restriction of my unique index), and then make the update. It seems this method could be more efficient than making a separate SELECT query for each row just to see if a matching combo is already in the table.
A third approach may be to simply append EVERYTHING, using no MySQL unique index, and then only grab the latest unique combo when the client later queries the table. However, I am trying to avoid having a ton of useless data in that table.
Thoughts on best practices or clever approaches?
If you make the 3 columns a unique key, you can do an INSERT with ON DUPLICATE KEY UPDATE.
INSERT INTO `table` (a,b,c,d,e,f) VALUES (1,2,3,5,6,7)
ON DUPLICATE KEY UPDATE d=5, e=6, f=7;
You can read more about this handy technique in the MySQL manual.
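For completeness, a sketch of creating the composite unique key this relies on (matching the hypothetical columns above):

ALTER TABLE `table` ADD UNIQUE KEY uniq_abc (a, b, c);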
If you add a unique index on the (A, B, C) columns, then you can use REPLACE to do this in one statement:
REPLACE works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted...

How to archive changes to table rows?

I have a MySQL database where I am storing information that is entered from a PHP web page. I have a page that allows the user to view an existing row, and make changes and save them to the database. I want to know the best way to keep the original entries, as well as the new update and any subsequent updates.
My thought is to make a new table with the same columns as the first, plus an additional timestamp field. When a user submits an update, the script would take the contents of the main table's row, enter them into the archive table with a timestamp of when it was done, and then enter the new values into the main table. I'd also add a new field to the main table to specify whether the row has ever been edited.
This way, I can do a query of the main table and get the most current data, and I can also query the archive table to see the change history. Is this the best way to accomplish this, or is there a better way?
You can use triggers on update, delete, or insert to keep track of all changes, who made them and at what time.
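A minimal sketch of such an audit trigger, assuming a hypothetical main table records and an archive table records_history that mirrors its columns plus change metadata:

DELIMITER //
CREATE TRIGGER records_before_update
BEFORE UPDATE ON records
FOR EACH ROW
BEGIN
    -- Copy the row as it looked before the change into the archive
    INSERT INTO records_history (record_id, data, changed_by, changed_at)
    VALUES (OLD.id, OLD.data, CURRENT_USER(), NOW());
END//
DELIMITER ;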
Look up database audit tables. There are several methods; I like the active column, which gets set to 0 when you 'delete' or 'update' and the new record gets inserted. It does make a headache for unique key checking. The alternative I've used is the one you mentioned: a separate table.
As buckbova mentions, you can use a trigger to do the secondary insert on 'delete' or 'update'. Otherwise, manage it in your PHP code if you don't have that ability.
You don't need a second table. Just have a start and end date on each row. The row without an end date is the active record. I've built entire systems using this method, and as long as you index the date fields, it's very fast.
When retrieving the current record, AND end_date IS NULL gets added to the WHERE clause.
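A sketch of the pattern with hypothetical column names:

-- Fetch the active version of a record
SELECT * FROM records WHERE record_key = 123 AND end_date IS NULL;

-- To "update": close the current row, then insert the new version
UPDATE records SET end_date = NOW() WHERE record_key = 123 AND end_date IS NULL;
INSERT INTO records (record_key, data, start_date) VALUES (123, 'new value', NOW());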
In this situation, I would recommend you consider keeping all versions in one table, after adding a few columns to it:
an active / not active flag
the ID of the person who saved these values
a timestamp of when the row was added

IDs from MySQL massive insert from simultaneous sources

I've got an application in PHP & MySQL where users write to and read from a particular table. One of the write modes is a batch, doing only one query with multiple values. The table has an ID which auto-increments.
The idea is that for each row inserted into the table, a copy is inserted into a separate table as a history log, including the ID that was generated.
The problem is that multiple users can do this at once, and I need to be sure that the ID logged is the correct one.
Can I be sure that if I do, for example:
INSERT INTO table1 VALUES (NULL,'test1'),(NULL,'test2')
the IDs generated are sequential?
How can I get the IDs that were just inserted, and be sure that those are the right ones?
I've thought of LOCK TABLES, but the users shouldn't notice this.
Hope I made myself clear...
Building an application that requires generated IDs to be sequential usually means you're taking the wrong approach. What happens when you have to delete a value some day? Are you going to re-sequence the entire table? It is much better to just let the values fall as they may, using a primary key to prevent duplication.
Based on the current implementations of MyISAM and InnoDB, yes. However, this is not guaranteed to remain so in the future, so I would not rely on it.
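If you do rely on it, note that LAST_INSERT_ID() is documented to return the id generated for the first row of a multi-row INSERT, so under the assumption of consecutive allocation (which holds for simple multi-row inserts under InnoDB's traditional "consecutive" auto-increment lock mode) you can derive the whole range:

INSERT INTO table1 VALUES (NULL,'test1'),(NULL,'test2');
-- first generated id, plus the number of rows just inserted
SELECT LAST_INSERT_ID() AS first_id,
       LAST_INSERT_ID() + ROW_COUNT() - 1 AS last_id;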
