Possible to have a mixed overwrite / append batch write into MySQL? - php

I am setting up an uploader (using PHP) for my client where they can select a CSV (in a pre-determined format) on their machine to upload. The CSV will likely have 4000-5000 rows. PHP will process the file by reading each line of the CSV and inserting it directly into the DB table. That part is easy.
However, ideally before appending this data to the database table, I'd like to check 3 of the columns (A, B, and C) to see if I already have a matching combo of those 3 fields in the table, AND IF SO I would rather UPDATE that row than append. If I DO NOT have a matching combo of those 3 columns, I want to go ahead and INSERT the row, appending the data to the table.
My first thought is that I could make columns A, B, and C a unique index in my table, then just INSERT every row, somehow detect a 'failed' INSERT (due to the restriction of my unique index), and then make the update. This method seems like it could be more efficient than making a separate SELECT query for each row just to see if I already have a matching combo in my table.
A third approach may be to simply append EVERYTHING, using no MySQL unique index and then only grab the latest unique combo when the client later queries that table. However I am trying to avoid having a ton of useless data in that table.
Thoughts on best practices or clever approaches?

If you make the 3 columns a unique index, you can do an INSERT with ON DUPLICATE KEY UPDATE:
INSERT INTO table (a,b,c,d,e,f) VALUES (1,2,3,5,6,7)
ON DUPLICATE KEY UPDATE d=5,e=6,f=7;
You can read more about this handy technique in the MySQL manual.
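Since the CSV has 4000-5000 rows, you can also batch the upsert into multi-row statements. A minimal sketch, assuming a unique index on (a, b, c) and using mytable as a placeholder name; the VALUES() function refers to the value that would have been inserted into that column:
INSERT INTO mytable (a,b,c,d,e,f) VALUES
    (1,2,3,5,6,7),
    (1,2,4,8,9,10)
ON DUPLICATE KEY UPDATE d = VALUES(d), e = VALUES(e), f = VALUES(f);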

If you add a unique index on the ( A, B, C ) columns, then you can use REPLACE to do this in one statement:
REPLACE works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted...
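For example, a minimal sketch against the same hypothetical table and unique index as above:
REPLACE INTO mytable (a,b,c,d,e,f) VALUES (1,2,3,5,6,7);
Keep in mind that REPLACE really is a delete followed by an insert: columns you don't supply are reset to their defaults, and DELETE triggers and ON DELETE CASCADE rules will fire on the old row.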

Related

Is there a way to add multiple values with same ID in two different tables?

Using PHP and MySQL, I need to add rows to 2 MySQL tables: the first table would hold the more important information and the second one the less important information about each item.
To be more clear, ONE element would have its information split across two tables.
(I need this for several reasons, but two of them are: keeping the first table as light as possible, and having the second table store data that will be erased after a short time, while the first table keeps all the data it stored.)
In the best scenario, I'd like to add a row in each table for one item/element, with the same id in each table. Something like this:
Table 1: id | data_1_a | data_1_b | ...
Table 2: id | data_2_a | data_2_b | ...
So if I add an element that gets the ID "12345" in table 1, its data is added to table 2 with the same ID "12345".
To achieve this, I can think of two solutions:
Create the ID myself for each element (instead of having an auto_increment on table 1). The con is that I would probably have to check that the ID doesn't already exist in the tables every time I generate an ID...
Add the element to table 1, get its ID with $db->lastInsertId();, and use it to add the element's data to table 2. The con is that I have to add elements one by one to get all the IDs, while most of the time I want to add a lot of elements (one, two, or three hundred!) at once.
Maybe there's a better way to achieve this?
lastInsertId() reports the first value generated by the last INSERT statement executed. It's safe to assume that when you insert many rows in one statement, they are given consecutive id values following that first value. For example, the MySQL JDBC driver relies on this assumption so it can report the set of id values generated.
This assumption breaks only if you deliberately set innodb_autoinc_lock_mode=2 (interleaved). See https://dev.mysql.com/doc/refman/8.0/en/innodb-auto-increment-handling.html for details about that.
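A minimal sketch of the multi-row case, using the hypothetical table and column names from the question:
START TRANSACTION;
INSERT INTO table1 (data_1_a, data_1_b) VALUES ('a1','b1'), ('a2','b2'), ('a3','b3');
SET @first_id = LAST_INSERT_ID();  -- id of the FIRST of the three rows
INSERT INTO table2 (id, data_2_a) VALUES
    (@first_id,     'x1'),
    (@first_id + 1, 'x2'),
    (@first_id + 2, 'x3');
COMMIT;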
But if it were my task, I would still choose to use a single table. When you find you don't need some of the columns anymore, use UPDATE to set them to NULL. This eliminates the problem of ensuring the same id is used across two tables.

Merging Two Databases - How to Skip Same ID or Generate New ID

I have two MySQL databases. I would like to copy data from one database to another. Both have the same structure and entries, except that the two databases use the same IDs for different items within the same tables. I don't want the data from the old database to replace data in the new one. If an ID is already there, I would like the new database to skip that row; if it's a duplicate ID, I would like a new ID to be generated.
I'd like to use phpMyAdmin for this, but have no idea if this is even possible.
0.) Make a backup of both tables.
phpMyAdmin will be sufficient for your request.
First you need to ensure there are no duplicate IDs or primary keys.
Assuming the two tables testtable1 and testtable2 have the columns testtable_id, name:
1.) Run this query on the second table:
UPDATE testtable2 SET testtable2.testtable_id = testtable2.testtable_id + (SELECT MAX( testtable1.testtable_id ) FROM testtable1);
2.) Then, still on testtable2, use the Copy table to (database.table) tool under the Operations menu: set the DB name and testtable1 as the table name (the DB name should already be set), select the Data only radio button, and click Go.
3.) Now you have all the data from both tables in testtable1.
Edit: At first I thought this was a matter of two tables in the same database, but you can use step two for tables in other databases too. Just set the correct DB and table name in step two. Also, before that, adjust the query so the new IDs are higher than the MAX ID of the table you want to extend. You can hard-code the parenthesized part with the exact MAX ID of the first DB's corresponding table.
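For example, if MAX(testtable_id) in the first DB's corresponding table turned out to be 5000, the hard-coded version of step 1 would be (the value 5000 is hypothetical):
UPDATE testtable2 SET testtable2.testtable_id = testtable2.testtable_id + 5000;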

Inserting MySQL foreign keys and primary keys in a transaction

Just looking for some tips and pointers for a small project I am doing. I have some ideas but I am not sure if they are the best practice. I am using mysql and php.
I have a table called nomsing in the database.
It has a primary key called row_id, which is an integer.
Then I have about 8 other tables referencing this table.
They are called nomplu, accsing, accplu, datsing, datplu, for instance.
Each has a column that references the primary key of nomsing.
Within my PHP code I have all the information to insert into the tables except one thing: the row_id primary key of the nomsing table. So PHP generates a series of inserts like the following:
INSERT INTO nomsing (word, postress, gender) VALUES ('велосипед', '8', 'mask');
INSERT INTO nomplu (word, postress, NOMSING_REFERENCE) VALUES ('велосипеды', '2', #the reference to the id of the first insert#);
There are more inserts, but this one gets the point across. The second insert should reference the auto-generated id of the first insert. I want this to work as a transaction, so all inserts should complete or none should.
One idea I have is to not auto-generate the id, and instead generate it myself in PHP. That way I would know the id before the transaction, but then I would have to check if the id was already in the db.
Another idea I have is to do the first insert, then query for the row id of that insert in PHP, and then make the second insert. Both should work, but they don't seem like optimal solutions. I am not too familiar with the database's transactional features, so what would be the best approach in this case? I don't like the idea of inserting, then querying for the id, and then running the rest of the queries. It just seems very inefficient, or perhaps I am wrong.
Just insert a row in the master table. Then you can fetch the insert id (lastInsertId() when on PDO) and use that to populate your other queries.
You could use the PHP version as given by JvdBerg, or MySQL's LAST_INSERT_ID(). I usually use the former.
See a similar SO question.
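If you prefer to keep it in SQL, here is a minimal sketch of the transactional version; nomsing_id is a hypothetical name for the referencing column in nomplu:
START TRANSACTION;
INSERT INTO nomsing (word, postress, gender) VALUES ('велосипед', '8', 'mask');
SET @nomsing_id = LAST_INSERT_ID();  -- id generated by the insert above
INSERT INTO nomplu (word, postress, nomsing_id) VALUES ('велосипеды', '2', @nomsing_id);
COMMIT;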
You could add a new column to the nomsing table, called 'insert_order' (or similar), with a default value of 0. Then, instead of generating one SQL statement per insert, create a bulk insert statement, e.g.
INSERT INTO nomsing (word, postress, gender, insert_order)
VALUES ('велосипед', '8', 'mask', 1), ('abcd', '9', 'hat', 2), ...
You generate the insert_order number with a counter in your loop, starting at one. Then you can perform one SELECT on the table to get the ids, e.g.
SELECT row_id
FROM nomsing
WHERE insert_order > 0
ORDER BY insert_order;
Now that you have all the IDs, you can do a bulk insert for your following queries. At the end of your script, just do an update to reset the insert_order column back to 0:
UPDATE nomsing SET insert_order = 0 WHERE insert_order > 0;
It may seem messy to add an extra column to do this but it will add a significant speed increase over performing one query at a time.

Proper way of 'updating' rows in MySQL

This is my db structure:
ID | NAME  | SOMEVAL | API_ID
 1 | TEST  | 123456  | A123
 2 | TEST2 | 223232  | A123
 3 | TEST3 | 918922  | A999
 4 | TEST4 | 118922  | A999
I'm filling it using a function that calls an API and gets some data from an external service.
On the first run, I want to insert all the data I get back from the API. After that, each time I run the function, I just want to update existing rows and add any rows that came back from the API call but are not yet in the db.
So my initial thought regarding the update process is to go through each row I get from the API and SELECT to see if it already exists.
I'm just wondering if this is the most efficient way to do it, or maybe it's better to DELETE the relevant rows from the db and just re-inserting them all.
NOTE: each batch of rows I get from the API has an API_ID, so when I say delete the rows, I mean something like DELETE FROM table WHERE API_ID = 'A999', for example.
If you are retrieving all the rows from the service, I recommend you drop all indexes, truncate the table, insert all the data, and then recreate the indexes.
If you are retrieving only some data from the service, I would drop all indexes, remove all relevant rows, insert the new rows, then recreate all indexes.
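A sketch of the full-refresh variant, with hypothetical table and index names:
ALTER TABLE mytable DROP INDEX idx_api_id;   -- repeat for each secondary index
TRUNCATE TABLE mytable;
-- ... bulk INSERT all rows fetched from the service ...
ALTER TABLE mytable ADD INDEX idx_api_id (api_id);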
In such scenarios I'm usually going with:
start transaction
get row from external source
select local store to check if it's there
if it's there: update its values, remember local row id in list
if it's not there: insert it, remember local row id in list
at the end delete all rows that are not in remembered list of local row ids (NOT IN clause if the count of ids allows for this, or other ways if it's possible that there will be many deleted rows)
commit transaction
Why? Because usually I have local rows referenced by other tables, and deleting them all would break the references (not to mention DELETE CASCADE).
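A minimal SQL sketch of the final delete step, assuming the loop remembered local ids 1, 2, and 5 for batch A999 (all names and values hypothetical):
START TRANSACTION;
-- ... per-row SELECT, then UPDATE or INSERT, remembering each local row id ...
DELETE FROM mytable
WHERE api_id = 'A999'
  AND id NOT IN (1, 2, 5);  -- the remembered ids
COMMIT;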
I don't see any problem in performing SELECT, then deciding between an INSERT or UPDATE. However, MySQL has the ability to perform so-called "upserts", where it will insert a row if it does not exist, or update an existing row otherwise.
This SO answer shows how to do that.
I would recommend using INSERT...ON DUPLICATE KEY UPDATE.
If you use INSERT IGNORE, then the row won't actually be inserted if it results in a duplicate key on API_ID.
Add a unique key index on the API_ID column.
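A quick sketch of the INSERT IGNORE variant, assuming the unique key described above is in place (mytable is a placeholder):
INSERT IGNORE INTO mytable (name, someval, api_id)
VALUES ('TEST5', 12345, 'A999');  -- silently skipped, not updated, on a duplicate key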
If you have all of the data returned from the API that you need to completely reconstruct the rows after you delete them, then go ahead and delete them, and insert afterwards.
Be sure, though, that you do this in a transaction, and that you are using an engine that supports transactions properly, such as InnoDB, so that other clients of the database don't see rows missing from the table just because they are going to be updated.
For efficiency, you should insert as many rows as you can in a single query. Much faster that way.
BEGIN;
DELETE FROM table WHERE API_ID = 'A987';
INSERT INTO table (NAME, SOMEVAL, API_ID) VALUES
('TEST5', 12345, 'A987'),
('TEST6', 23456, 'A987'),
('TEST7', 34567, 'A987'),
...
('TEST123', 123321, 'A987');
COMMIT;

Check for existing entries in Database or recreate table?

I've got a PHP script pulling a file from a server and plugging the values in it into a Database every 4 hours.
This file can and most likely will change within the 4 hours (or whatever timeframe I finally choose). It's a list of properties and their owners.
Would it be better to check the file and compare it to each DB entry and update any if they need it, or create a temp table and then compare the two using an SQL query?
Neither.
What I'd personally do is run the INSERT command using ON DUPLICATE KEY UPDATE (assuming your table is properly designed and you are using at least one piece of information from your file as a UNIQUE key, which you should be, based on your comment).
Reasons
Creating a temp table is a hassle.
Comparing is a hassle too. You need to select a record, compare it, update it if not equal, and so on. It's just a giant waste of time to compare each piece of info, and there's a better way to do it.
It would be so much easier if you just insert everything you find, and if a clash occurs, that means the record exists and most likely needs updating.
That way you take care of everything with 1 query, your data integrity is preserved, and you can just keep filling your table with new records or updating existing ones.
I think it would be best to download the file and update the existing table, maybe using REPLACE or REPLACE INTO. "REPLACE works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted." http://dev.mysql.com/doc/refman/5.0/en/replace.html
Presumably you have a list of columns that will have to match in order for you to decide that the two things match.
If you create a UNIQUE index over those columns, then you can use either INSERT ... ON DUPLICATE KEY UPDATE (see the manual) or REPLACE INTO ... (see the manual).
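A minimal sketch of creating such an index, with hypothetical table and column names:
ALTER TABLE properties ADD UNIQUE KEY file_match (parcel_no, owner_name);
-- once the index exists, both of these resolve duplicates against it:
--   INSERT ... ON DUPLICATE KEY UPDATE ...
--   REPLACE INTO ...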
