Check for existing entries in Database or recreate table? - php

I've got a PHP script pulling a file from a server and plugging the values in it into a Database every 4 hours.
This file can, and most likely will, change within those 4 hours (or whatever timeframe I finally choose). It's a list of properties and their owners.
Would it be better to check the file against each DB entry and update any that need it, or to create a temp table and then compare the two with an SQL query?

What I'd personally do is run an INSERT using ON DUPLICATE KEY UPDATE (assuming your table is properly designed and that you are using at least one piece of information from your file as a UNIQUE key, which you should be, based on your comment).
Reasons:
Creating a temp table is a hassle.
Comparing is a hassle too. You need to select a record, compare it, update it if the two don't match, and so on - it's a giant waste of time to compare piece by piece when there's a better way to do it.
It is much easier to just insert everything you find; if a clash occurs, that means the record already exists and most likely needs updating.
That way you take care of everything with one query, your data integrity is preserved, and you can keep filling the table with new records or updating existing ones.
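For illustration, a minimal PDO sketch of that single-query approach (the properties table, its columns, and the connection details are assumptions, not from the question):

<?php
// Minimal sketch: one INSERT ... ON DUPLICATE KEY UPDATE per row.
// Assumes a `properties` table with a UNIQUE key on property_id.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');

// Rows as they might come out of the downloaded file (sample data).
$rows = [
    ['property_id' => 101, 'owner' => 'Alice'],
    ['property_id' => 102, 'owner' => 'Bob'],
];

$stmt = $pdo->prepare(
    'INSERT INTO properties (property_id, owner)
     VALUES (:property_id, :owner)
     ON DUPLICATE KEY UPDATE owner = VALUES(owner)'
);

// New property_ids are inserted; existing ones get their owner updated.
foreach ($rows as $row) {
    $stmt->execute($row);
}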

I think it would be best to download the file and update the existing table using REPLACE (also written REPLACE INTO). "REPLACE works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted." http://dev.mysql.com/doc/refman/5.0/en/replace.html

Presumably you have a list of columns that will have to match in order for you to decide that the two things match.
If you create a UNIQUE index over those columns, then you can use either INSERT ... ON DUPLICATE KEY UPDATE (manual) or REPLACE INTO ... (manual).

Related

MariaDB Update Table Contents From Changing File

I have the following problem:
I have got a dataset inside a text file (not XML- or CSV-encoded or anything, just field values separated by \t and \n) which is updated every 2 minutes. I need to put the data from the file into a MariaDB database, which by itself is not very difficult to do.
What I am unsure about, however, is how I would go about updating the table when the file's contents change. I thought about truncating the table and then filling it again, but doing that every 2 minutes with about 1000 datasets would leave the database incomplete during each update, which makes it unusable as a solution (and it wouldn't have been one with fewer datasets either :D)
Another solution I thought about was to append the new data to the existing table and split the unique column into ranges (e.g. use IDs 1-1000 before the update, append the new data as IDs 1001-2000, then remove 1-1000, and after 2 or so updates start at ID 1 again).
Updating only the changed fields is not an option, because the raw data format would make it really difficult to keep track of which columns have changed (or haven't).
I am, however, unsure about best practices, as I am relatively new to SQL and stuff, and would like to hear your opinion; maybe I am just overlooking something obvious...
Even better...
CREATE TABLE `new` LIKE `real`; -- permanent, not TEMPORARY
Load `new` from the incoming data.
RENAME TABLE `real` TO `old`, `new` TO `real`;
DROP TABLE `old`;
Advantages:
The table real is never invisible, nor empty, to the application.
The RENAME is "instantaneous" and "atomic".
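In PHP this might look like the following sketch (the answer's table names are kept; the connection details and the loading step are assumptions):

<?php
// Sketch of the build-then-swap pattern. `real` is a reserved word in
// MySQL, so all three table names are backtick-quoted.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');

$pdo->exec('DROP TABLE IF EXISTS `new`');
$pdo->exec('CREATE TABLE `new` LIKE `real`');

// ... bulk-load `new` here, e.g. with LOAD DATA INFILE or multi-row INSERTs ...

// The atomic swap: readers always see a complete `real`, never an empty one.
$pdo->exec('RENAME TABLE `real` TO `old`, `new` TO `real`');
$pdo->exec('DROP TABLE `old`');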
As suggested by Alex, I will create a temporary table, insert my data into the temporary table, truncate the production table and then insert from the temporary table. Works like a charm!

PHP Upload CSV to update database records but exclude duplicates

I'm trying to figure out the best way to do this. I have a MySQL database with around 500 records/rows and I need to update it daily. The problem is that the file I upload will be an Excel file (I probably need to convert it to CSV before uploading?). Also, I need to upload only records that don't yet exist in the current MySQL database. The unique field is named "MemberID".
What is the best way to achieve this? If I insert the rows one by one in a loop (so I can check first whether each record/row should be inserted into the database), will that be a slow process for 500 records?
I'm new to PHP, coming from VBA programming, and I only know how to insert records one at a time. Your suggestions are most appreciated.
You've got three options:
REPLACE (recommended since your database is updated daily - you never know whether old records changed since the last update):
REPLACE INTO db_name (id,value) VALUES (1,1),(1,2),(1,3),(1,4)
It will affect all rows.
ON DUPLICATE KEY UPDATE (that's probably what you've been searching for; it will update whatever you want, or simply leave the row as is):
INSERT INTO db_name (id,value) VALUES (1,1) ON DUPLICATE KEY UPDATE id=id
INSERT IGNORE INTO (it will insert only the rows that are new, skipping duplicates - but note that on key violations MySQL will NOT raise an error):
INSERT IGNORE INTO db_name (id,value) VALUES (1,1);
Also, some alternatives: SQL Merge.
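As a rough illustration of the loop question: 500 prepared-statement inserts inside one transaction are fast. A sketch, where the table name, the columns besides MemberID, and the file path are assumptions:

<?php
// Sketch: load a CSV and insert only rows whose MemberID is new.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');

$stmt = $pdo->prepare(
    'INSERT IGNORE INTO members (MemberID, name, email)
     VALUES (:id, :name, :email)'
);

$pdo->beginTransaction();            // one transaction keeps 500 inserts fast
if (($fh = fopen('members.csv', 'r')) !== false) {
    fgetcsv($fh);                    // skip the header row
    while (($row = fgetcsv($fh)) !== false) {
        $stmt->execute([
            ':id'    => $row[0],
            ':name'  => $row[1],
            ':email' => $row[2],
        ]);
    }
    fclose($fh);
}
$pdo->commit();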

Back-filling tables from another table; cannot have duplicates

We have two tables on two different MySQL servers. We have a unique key, which is invoice and date.
We need to grab all of the records from a certain time period and put them into another table. The caveat is that there may be records that exist already so we want to exclude those from the records we are back-filling.
What queries, ideas, scripts, etc. would be the most helpful in accomplishing this?
If you put a unique key onto a field that will uniquely identify a record (or a combination of fields), you can use INSERT IGNORE INTO as your MySQL statement. This will insert records, but if a key conflict arises (such as when that record already exists), it will simply proceed to the next record.
You could also use REPLACE INTO, instead of INSERT INTO, which is similar to INSERT IGNORE INTO, but rather than proceeding to the next record, it will overwrite the conflicted row.
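A sketch of that back-fill across the two servers, assuming a UNIQUE key on (invoice, date); the host names, credentials, and extra columns are assumptions:

<?php
// Copy a date range from server1 to server2, skipping rows that already exist.
$src = new PDO('mysql:host=server1;dbname=billing', 'user', 'pass');
$dst = new PDO('mysql:host=server2;dbname=billing', 'user', 'pass');

$select = $src->prepare(
    'SELECT invoice, date, amount FROM invoices
     WHERE date BETWEEN :from AND :to'
);
$select->execute([':from' => '2023-01-01', ':to' => '2023-01-31']);

$insert = $dst->prepare(
    'INSERT IGNORE INTO invoices (invoice, date, amount)
     VALUES (:invoice, :date, :amount)'
);

// INSERT IGNORE silently skips rows that violate the (invoice, date) key,
// so records that already exist on the target are left untouched.
while ($row = $select->fetch(PDO::FETCH_ASSOC)) {
    $insert->execute($row);
}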
Look at the MERGE syntax (T-SQL MERGE).

Possible to have a mixed overwrite / append batch write into MySQL?

I am setting up an uploader (using PHP) for my client where they can select a CSV (in a pre-determined format) on their machine to upload. The CSV will likely have 4000-5000 rows. PHP will process the file by reading each line of the CSV and inserting it directly into the DB table. That part is easy.
However, ideally before appending this data to the database table, I'd like to review 3 of the columns (A, B, and C) and check to see if I already have a matching combo of those 3 fields in the table AND IF SO I would rather UPDATE that row rather than appending. If I DO NOT have a matching combo of those 3 columns I want to go ahead and INSERT the row, appending the data to the table.
My first thought is that I could make columns A, B, and C a unique index in my table and then just INSERT every row, detect a 'failed' INSERT (due to the restriction of my unique index) somehow, and then make the update. It seems this method could be more efficient than making a separate SELECT query for each row just to see if I already have a matching combo in my table.
A third approach may be to simply append EVERYTHING, using no MySQL unique index and then only grab the latest unique combo when the client later queries that table. However I am trying to avoid having a ton of useless data in that table.
Thoughts on best practices or clever approaches?
If you make the 3 columns a unique key, you can do an INSERT with ON DUPLICATE KEY UPDATE.
INSERT INTO table (a,b,c,d,e,f) VALUES (1,2,3,5,6,7)
ON DUPLICATE KEY UPDATE d=5,e=6,f=7;
You can read more about this handy technique here in the MySQL manual.
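A sketch of that technique, assuming a table named uploads (only the columns a-f come from the answer; everything else is illustrative):

<?php
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');

// One-time setup: the UNIQUE index over (a, b, c) is what triggers the
// ON DUPLICATE KEY branch.
$pdo->exec('ALTER TABLE uploads ADD UNIQUE KEY uniq_abc (a, b, c)');

$stmt = $pdo->prepare(
    'INSERT INTO uploads (a, b, c, d, e, f)
     VALUES (:a, :b, :c, :d, :e, :f)
     ON DUPLICATE KEY UPDATE d = VALUES(d), e = VALUES(e), f = VALUES(f)'
);
$stmt->execute([':a' => 1, ':b' => 2, ':c' => 3, ':d' => 5, ':e' => 6, ':f' => 7]);

// MySQL reports 1 affected row for a fresh insert, 2 for an update
// (and 0 when the update changed nothing), so rowCount() tells you
// which branch was taken.
echo $stmt->rowCount() === 1 ? "inserted\n" : "updated\n";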
If you add a unique index on the ( A, B, C ) columns, then you can use REPLACE to do this in one statement:
"REPLACE works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted..."

Ids from mysql massive insert from simultaneous sources

I've got an application in PHP & MySQL where users write to and read from a particular table. One of the write modes is a batch: a single query with multiple value lists. The table has an ID which auto-increments.
The idea is that for each row inserted into the table, a copy is inserted into a separate table as a history log, including the ID that was generated.
The problem is that multiple users can do this at once, and I need to be sure that the IDs logged are the correct ones.
Can I be sure that if I do for example:
INSERT INTO table1 VALUES ('','test1'),('','test2')
that the ids generated are sequential?
How can I get the Id's that were just loaded, and be sure that those are the ones that were just loaded?
I've thought of LOCK TABLES, but the users shouldn't notice this.
Hope I made myself clear...
Building an application that requires generated IDs to be sequential usually means you're taking a wrong approach - what happens when you have to delete a value some day, are you going to re-sequence the entire table? Much better to just let the values fall as they may, using a primary key to prevent duplication.
Based on the current implementation of MyISAM and InnoDB, yes. However, this is not guaranteed to remain so in the future, so I would not rely on it.
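If you do rely on it, a common pattern is to read the first generated ID and the affected-row count from the same statement. A minimal sketch, assuming an InnoDB table1 with an AUTO_INCREMENT id and an innodb_autoinc_lock_mode under which a single multi-row INSERT gets one contiguous block:

<?php
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');

$stmt = $pdo->prepare('INSERT INTO table1 (name) VALUES (?), (?)');
$stmt->execute(['test1', 'test2']);

// LAST_INSERT_ID() (which lastInsertId() wraps) returns the FIRST id
// generated by the statement; with a contiguous block the rest follow it.
$firstId = (int) $pdo->lastInsertId();
$lastId  = $firstId + $stmt->rowCount() - 1;

echo "inserted ids $firstId through $lastId\n";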
