Avoid Duplicates of Unique Key Within INSERT Query

I have a MySQL query that looks like this:
INSERT INTO beer(name, type, alcohol_by_volume, description, image_url) VALUES('{$name}', {$type}, '{$alcohol_by_volume}', '{$description}', '{$image_url}')
The only problem is that name is a unique value, which means if I ever run into duplicates, I get an error like this:
Error storing beer data: Duplicate entry 'Hocus Pocus' for key 2
Is there a way to ensure that the SQL query does not attempt to add a unique value that already exists without running a SELECT query for the entire database?

You could of course use INSERT IGNORE INTO, like this:
INSERT IGNORE INTO beer(name, type, alcohol_by_volume, description, image_url) VALUES('{$name}', {$type}, '{$alcohol_by_volume}', '{$description}', '{$image_url}')
You could use ON DUPLICATE KEY UPDATE as well, but if you just don't want to add a row, INSERT IGNORE INTO is the better choice. ON DUPLICATE KEY UPDATE is better suited if you want to do something more specific when there is a duplicate.
If you decide to use ON DUPLICATE KEY UPDATE, avoid using it on tables with multiple unique indexes. On such a table the clause can give unexpected results, because you don't fully control which of the conflicting rows gets updated.
Example: if both type and alcohol_by_volume had unique indexes, an insert with type=1 and alcohol_by_volume=1 followed by ON DUPLICATE KEY UPDATE type=3 would behave like the statement below, which updates only ONE row even if two different rows match:
UPDATE beer SET type=3 WHERE type=1 OR alcohol_by_volume=1 LIMIT 1
To sum it up:
ON DUPLICATE KEY UPDATE just does the work without warnings or errors when there are duplicates.
INSERT IGNORE INTO raises a warning when there are duplicates, but apart from that it simply skips inserting the duplicate row.
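To make the difference between the two approaches concrete, here is a minimal sketch. It is not MySQL or PHP: it uses Python's built-in sqlite3 module as a stand-in so it can run anywhere, with SQLite's INSERT OR IGNORE playing the role of INSERT IGNORE and ON CONFLICT ... DO UPDATE (SQLite 3.24 or later) playing the role of ON DUPLICATE KEY UPDATE. The reduced two-column beer table is an assumption made for brevity.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL `beer` table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE beer (name TEXT UNIQUE, type INTEGER)")
conn.execute("INSERT INTO beer (name, type) VALUES (?, ?)", ("Hocus Pocus", 1))

# Counterpart of MySQL's INSERT IGNORE: the duplicate row is skipped.
conn.execute(
    "INSERT OR IGNORE INTO beer (name, type) VALUES (?, ?)", ("Hocus Pocus", 2)
)
type_after_ignore = conn.execute(
    "SELECT type FROM beer WHERE name = ?", ("Hocus Pocus",)
).fetchone()[0]

# Counterpart of ON DUPLICATE KEY UPDATE: the existing row is updated instead.
conn.execute(
    "INSERT INTO beer (name, type) VALUES (?, ?) "
    "ON CONFLICT(name) DO UPDATE SET type = excluded.type",
    ("Hocus Pocus", 3),
)
type_after_upsert = conn.execute(
    "SELECT type FROM beer WHERE name = ?", ("Hocus Pocus",)
).fetchone()[0]

# Either way, the table still holds a single 'Hocus Pocus' row.
row_count = conn.execute("SELECT COUNT(*) FROM beer").fetchone()[0]
print(type_after_ignore, type_after_upsert, row_count)
```

The ignoring insert leaves the stored type untouched, while the upsert overwrites it. Which one you want depends on whether the existing row or the incoming data should win.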

As it just so happens, there is a way in MySQL by using ON DUPLICATE KEY UPDATE. This is available since MySQL 4.1
INSERT INTO beer(name, type, alcohol_by_volume, description, image_url)
VALUES('{$name}', {$type}, '{$alcohol_by_volume}', '{$description}',
'{$image_url}')
ON DUPLICATE KEY UPDATE type=type;
You could also use INSERT IGNORE INTO... as an alternative, but the statement would still raise a warning (albeit a warning instead of an error).

Yes, there is. You can use the ON DUPLICATE KEY UPDATE clause of the MySQL INSERT statement. The syntax is explained in the MySQL documentation.
INSERT INTO beer(name, type, alcohol_by_volume, ...)
VALUES('{$name}', {$type}, '{$alcohol_by_volume}', ...)
ON DUPLICATE KEY UPDATE
type={$type}, alcohol_by_volume = '{$alcohol_by_volume}', ... ;

Yes, by first selecting the name from the database. If the query returns any records (that is, the result is not empty), then the name already exists, and you have to pick another name.

Quite simply - your code needs to figure out what it wants to do if something's trying to insert a duplicate name. As such, what you need to do first is run a select statement:
SELECT * FROM beer WHERE name='{$name}'
And then run an 'if' statement off of that to determine if you got a result.
if results = 0, then go ahead and run your insert.
Else ... whatever you want to do. Throw an error back to the user? Modify the database in a different way? Completely ignore it? How is this insert statement coming about? A mass update from a file? User input from a web page?
The way you're reaching this insert statement, and how it should affect your work flow, should determine exactly how you're handling that 'else'. But you should definitely handle it.
But just make sure that the select and insert statements are in a transaction together, so that other sessions doing the same thing at the same time don't cause problems.

Related

Improve the performance of select first then insert to avoid duplicate record (in mysql & php)?

To avoid duplicate insert I know I can use “INSERT IGNORE” or “INSERT … ON DUPLICATE KEY UPDATE” in mysql.
But I am using Laravel and I know firstOrCreate does not do that. It first makes a SELECT to see if the entry is there and runs the INSERT only when the SELECT returns no records. I guess it is because ON DUPLICATE KEY UPDATE is specific to MySQL.
Is it really bad from a performance point of view that I select first and then insert? How much performance impact will that cause compared to “INSERT IGNORE” or “INSERT … ON DUPLICATE KEY UPDATE”?
Is it worth the trouble to not use firstOrCreate and write my own PHP code to “INSERT IGNORE” or “INSERT … ON DUPLICATE KEY UPDATE”?

Assuming you only want INSERT IGNORE functionality, I would suggest just creating a unique index on the table in question.
ALTER TABLE yourTable ADD CONSTRAINT cnstr UNIQUE KEY (col1, col2, ...);
With this in place, any attempt to insert a record which is duplicate would result in an application error on the PHP side, which you could easily catch and handle.
The problem with doing a select first to check whether a duplicate exists before attempting an insert is that such logic only works if you ensure that no other DML activity occurs on the table between the time you select and the time the insert completes. Otherwise, something like the following would be possible:
process1: SELECT to check if record exists (assume it does not)
process2: INSERT record (which process1 wants to insert)
process1: INSERT same record (now a duplicate exists)
In other words, your PHP application would "think" that no duplicate exists, and insert the record. But, in the time in between its select and insert, another process happened to insert the same record.
To avoid this, I believe you would need something like a serializable transaction. But, using a unique constraint is much cleaner, and leaves this responsibility primarily up to the database to handle.
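As a rough illustration of leaving the check to the database, the sketch below attempts the insert directly and treats the constraint violation as the "already exists" signal. It uses Python's built-in sqlite3 module as a stand-in for PHP and MySQL, so the exception type (sqlite3.IntegrityError) is SQLite's counterpart of MySQL's duplicate-key error. The helper insert_if_new and the two-column layout of yourTable are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The UNIQUE constraint makes the database itself reject duplicates.
conn.execute("CREATE TABLE yourTable (col1 TEXT, col2 TEXT, UNIQUE (col1, col2))")


def insert_if_new(col1, col2):
    """Hypothetical helper: True if the row was stored, False on a duplicate."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "INSERT INTO yourTable (col1, col2) VALUES (?, ?)", (col1, col2)
            )
        return True
    except sqlite3.IntegrityError:
        # Raised on a constraint violation; the unique constraint is the
        # only constraint on this table, so this means "duplicate".
        return False


first = insert_if_new("a", "b")
second = insert_if_new("a", "b")
total = conn.execute("SELECT COUNT(*) FROM yourTable").fetchone()[0]
print(first, second, total)
```

Because there is no separate SELECT, there is no window between the check and the insert for another process to slip into.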

Maybe try REPLACE in mysql:
REPLACE ...

Get last_insert_id for INSERT DUPLICATE KEY UPDATE statement when there is no update

I have a database that needs to handle 2 special scenarios in case of duplicate key. In each case I need to obtain the unique id of the row whether there is duplicate record or not
Scenario #1 the record_count field has to be incremented by 1. So the mysql syntax looks like this
INSERT INTO (id, value, record_count) VALUES ('foo','bar', 1) ON DUPLICATE KEY UPDATE record_count=record_count+1
In this case $mysqli->insert_id method gives me correct value whether it is a new record or a duplicate record. Everything is good.
Scenario #2 the record_count field does not need to be incremented. I tried following 3 statements
INSERT IGNORE INTO (id, value, record_count) VALUES ('foo','bar', 1)
INSERT INTO (id, value, record_count) VALUES ('foo','bar', 1) ON DUPLICATE KEY UPDATE record_count=record_count
INSERT INTO (id, value, record_count) VALUES ('foo','bar', 1) ON DUPLICATE KEY UPDATE record_count=record_count+1-1
But in each case $mysqli->insert_id is yielding me 0 because I am not updating or inserting anything. I thought I could fool it by adding and subtracting 1 but no luck.
What is the workaround? I really do not want to use a SELECT statement (I am sending hundreds of queries per minute and I really do not want to increase the system load)

Since, I have not received any answers, here is what I am doing.
I created another dummy column which stores an md5 hash. When I don't want to update the record_count, I update the dummy column so I can get the id of the row.
As @miken32 suggested, I can use a timestamp column as well (which might be more efficient).
Still looking for a better answer, if it exists.

The mysql_insert_id() documentation covers all the cases, and it explicitly describes INSERT ... ON DUPLICATE KEY UPDATE as only updating the insert id when there is an actual change to the database.
So, to get an insert id out of MySQL, you have to make something in the database change. As has been suggested, a timestamp column would fit the bill. Alternatively, you could use an incrementing counter column, if that would provide you more useful information later.

Mysterious INSERT UPDATE error

I have a table that looks like this (irrelevant columns omitted):
PRIMARY KEY(AUTO-INCREMENT,INT),
CLIENTID(INT),
CLIENTENTRYID(INT),
COUNT1(INT),
COUNT2(INT)
Now, CLIENTID and CLIENTENTRYID form a combined unique index that serves as duplicate prevention.
I use PHP post input to the server. My query looks like:
$stmt = $sql->prepare('INSERT INTO table (COUNT1,COUNT2,CLIENTID,CLIENTENTRYID) VALUES (?,?,?,?) ON DUPLICATE KEY UPDATE COUNT1=VALUES(COUNT1),COUNT2=VALUES(COUNT2)');
$stmt->bind_param("iiii",$value,$value,$clientid,$cliententryid);
The SQL object has auto commit enabled. The "value" variable is reused because the values in COUNT1 and COUNT2 should ALWAYS be the same.
Okay - that works fine, most of the time, but randomly, and I cannot figure out why, it will post 0 in COUNT2 - for an entirely different row.
Any ideas how that might occur? I can't see a pattern (it doesn't happen after a failed attempt, which is why the unique index exists, so that a new attempt will not cause duplicates). It seems to be completely random.
Is there something I've misunderstood about ON DUPLICATE KEY UPDATE? The VERY weird thing is that it updates A DIFFERENT row incorrectly - not the one you insert.
I realize other factors might affect this, but now I'm trying to rule out my SQL logic as a source of error.

Aside from the PRIMARY KEY on the auto_increment column, there is only ONE UNIQUE key defined on the table, and that's defined on (CLIENTID,CLIENTENTRYID), right?
And there are no triggers defined on the table, right?
And you are (obviously) using a prepared statement with bind placeholders.
It doesn't really matter if those two columns (CLIENTID and CLIENTENTRYID) are defined as NOT NULL or not; MySQL will allow multiple rows with NULL values; that doesn't violate the "uniqueness" enforced by a UNIQUE constraint. (This is the same as how Oracle treats "uniqueness" of NULL values, but it is different from how SQL Server enforces it.)
I just don't see any way that the statement you show, that is:
INSERT INTO `mytable` (COUNT1,COUNT2,CLIENTID,CLIENTENTRYID) VALUES (?,?,?,?)
ON DUPLICATE KEY
UPDATE COUNT1 = VALUES(COUNT1)
, COUNT2 = VALUES(COUNT2)
... there's no way that would cause some other row in the table to be updated.
Either the insert action succeeds, or it throws a "duplicate key" exception. If the "duplicate key" exception is thrown, the statement catches that, and performs the UPDATE action.
Given that (CLIENTID,CLIENTENTRYID) is the only unique key on the table (apart from the auto_increment column, not referenced by this statement), the update action will be equivalent to this statement:
UPDATE `mytable`
SET COUNT1 = ?
, COUNT2 = ?
WHERE CLIENTID = ?
AND CLIENTENTRYID = ?
... using the values supplied in the VALUES clause of the INSERT statement.
Bottom line, there isn't an issue in anything OP showed us. The logic is sound. There is something else going on, apart from this SQL statement.
OP code shows as using scalars (and not array elements) as arguments in the bind_param call, so that whole messiness of passing by reference shouldn't be an issue.
There's not an issue with the SQL statement OP has shown, based on everything OP told us and shown us. The issue reported has to be something other than the SQL statement.

Looking at the MySQL doc, it says that given an insert statement
INSERT INTO table (a,b,c) VALUES (1,2,3) ON DUPLICATE KEY UPDATE c=c+1;
if column a and b are unique, the insert is equivalent to an update statement with a WHERE clause containing an OR instead of an AND:
UPDATE table SET c=c+1 WHERE a=1 OR b=2 LIMIT 1;
And to quote from the documentation,
If a=1 OR b=2 matches several rows, only one row is updated. In
general, you should try to avoid using an ON DUPLICATE KEY UPDATE
clause on tables with multiple unique indexes.
Hope this helps.
UPDATE:
As per further discussion, OP will consider re-visiting existing database design. OP also has another table with similar multiple unique index spec, but without the same problem by utilizing INSERT IGNORE.

I found the answer.
As everyone here correctly suggested, this was something else. For some completely bizarre reason, the button I used to open the "add new entry" somehow POST'ed to set arrived = 0 on a selected object in a table view that has nothing to do with the button.
This must have been a UI linking somewhere in my Storyboard.
I'm sorry I wasted so much of your time guys. At least I learned a little more about SQL and indexes.

I think the problem is that you are using VALUES() in UPDATE COUNT1=VALUES(COUNT1),COUNT2=VALUES(COUNT2). Try it like this instead:
ON DUPLICATE KEY UPDATE COUNT1 = $v1,COUNT2 = $v2;

Searching for duplicate entries with PDO

I'm having a spot of trouble with a bit of code meant to find duplicates of a name along with the platform. This will also be adapted to find unique IDs later on.
So for example, if there is a server named "Apple" on the Xbox and you try to insert a record with the name "Apple" with the same platform it will reject it. However, another platform with the same name is allowed, such as "Apple" with PS3.
I've tried coming up with ideas and searching for answers, but I'm kind of in the dark as to what is the best way to go about checking for duplicates.
So far this is what I have:
$nameDuplicate_sql = $db->prepare("SELECT * FROM `servers` WHERE name=':name' AND platform=':platform'");
$nameDuplicate_sql->bindValue(':name', $name);
$nameDuplicate_sql->bindValue(':platform', $platform);
$nameDuplicate_sql->execute();
I've tried a bunch of different solutions, some from here, others from the PHP's manual and etc. None appear to work though.
I'm trying to stick with PDO, however, this is one instance where I cannot figure out where to turn. If this was in mysql_* I probably could just use mysql_affected_rows, but with PDO I have no clue. rowCount seemed promising, but it always returns 0 since this is neither an INSERT, UPDATE, or DELETE statement.
Oh, and I've tried the SQL statement in phpMyAdmin and it works; I tried it with a simple name/platform and it found rows properly.
If anyone can help me out here I'd appreciate it.

For most databases, PDOStatement::rowCount() does not return the
number of rows affected by a SELECT statement.
Instead, use PDO::query() to issue a SELECT COUNT(*) statement with the same predicates as your intended SELECT statement, then use
PDOStatement::fetchColumn() to retrieve the number of rows that will
be returned.
Your application can then perform the correct action.
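The COUNT(*) approach quoted above can be sketched as follows. This is an illustration in Python's built-in sqlite3 module rather than PDO, so the `?` placeholders and fetchone() are stand-ins for PDO's named parameters and fetchColumn(). The helper count_servers and the sample row are hypothetical. Note that the placeholders are not wrapped in quotes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE servers (name TEXT, platform TEXT)")
conn.execute("INSERT INTO servers (name, platform) VALUES ('Apple', 'Xbox')")


def count_servers(name, platform):
    """Hypothetical helper: number of rows with this name/platform pair."""
    # The placeholders are bare; the driver binds and escapes the values.
    row = conn.execute(
        "SELECT COUNT(*) FROM servers WHERE name = ? AND platform = ?",
        (name, platform),
    ).fetchone()
    return row[0]  # COUNT(*) always yields exactly one row with one column


print(count_servers("Apple", "Xbox"), count_servers("Apple", "PS3"))
```

A result greater than zero means the name is already taken on that platform, while the same name on another platform counts as zero.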

Instead of checking for duplicates, why not just enforce it on the database table directly? Create a composite key that will prohibit entries being made if they are already there?
CREATE TABLE servers (
serverName varchar(50),
platform varchar(50),
PRIMARY KEY (serverName, platform)
)
This way, you will never get duplicates, and it also allows you to use the mysql insert... on duplicate key update... syntax which sounds like it might be rather handy for you.
If you already have a Primary Key on it or you don't want to make a new table, you can use the following:
ALTER TABLE servers DROP PRIMARY KEY, ADD PRIMARY KEY(serverName, platform);
Edit: A primary key is either a single column or a number of columns that have to hold unique data. A single-column key cannot contain the same value twice, but a composite key (which is what I am suggesting here) means that the same combination of values cannot appear twice across the two columns.
In this case, you want to add a server name and have it associated with a platform. The table will let you add as many rows containing the same server name as you like, as long as each one has a unique platform associated with it. And vice versa: you can have a platform listed as many times as you like, as long as all the server names are unique.
If you try to insert a record where the same servername/platform combination exists, the database simply won't let you do it. There is another golden benefit though. Due to this key constraint - mysql allows a special type of query to be used. It is the insert... on duplicate key update syntax. That means if you try to insert the same data twice (ie, database says no) you can catch it and update the row you already have in the table. For example:
You have a row with serverName=Fluffeh and it is on platform=Boosh but you don't know about it right now, so you try to insert a record with the intention of updating the server IP address.
Normally you would simply write something like this:
insert into servers (serverName, platform, IPAddress)
values ('$serverName', '$platform', '$IPAddy')
But with a nice primary key identified you can do this:
insert into servers (serverName, platform, IPAddress)
values ('$serverName', '$platform', '$IPAddy')
on duplicate key update IPAddress='$IPAddy';
The second query will insert the row with all the data if it doesn't exist already. If it does, bam! It will update the IP address of the server, which was your intention all along.

Remove the single quotes from the parameter tokens in your query. The values will be quoted once they are bound; that's part of the reason for using a prepared statement.
$nameDuplicate_sql = $db->prepare("SELECT * FROM `servers` WHERE name= :name AND platform= :platform");

MySQL only insert if there isn't an exact row in the Table

I need to insert this in a table but only if there isn't a replica of the row already. (both values should be equal). How can I change the code to work this way? Thanks
<?php
mysql_select_db("cyberworlddb", $con);
mysql_query("INSERT INTO Badges (UID, Website)
VALUES ('1', 'www.taringa.net')");
mysql_close($con)
?>

You could create a single index for the UID and Website columns and make that index unique, then use INSERT IGNORE. The result will be that if it is a duplicate, it will just be ignored.
If you need to be able to tell if the SQL inserted a row, then follow it up with a call to mysql_affected_rows() which should return 0 if it didn't do anything and 1 if it inserted the record.
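Here is a small sketch of that idea, assuming a Badges table with just the two columns from the question. It runs on Python's built-in sqlite3 module instead of the mysql_* functions, so INSERT OR IGNORE stands in for MySQL's INSERT IGNORE and cursor.rowcount stands in for mysql_affected_rows(). The index name and the add_badge helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Badges (UID INTEGER, Website TEXT)")
# One unique index covering both columns: only the exact pair is rejected.
conn.execute("CREATE UNIQUE INDEX badges_uid_website ON Badges (UID, Website)")


def add_badge(uid, website):
    """Hypothetical helper: returns 1 if a row was inserted, 0 if ignored."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO Badges (UID, Website) VALUES (?, ?)",
        (uid, website),
    )
    return cur.rowcount  # rows changed by this statement


first = add_badge(1, "www.taringa.net")
second = add_badge(1, "www.taringa.net")   # exact duplicate of the pair
other_user = add_badge(2, "www.taringa.net")  # same site, different UID
print(first, second, other_user)
```

Only the repeated (UID, Website) pair is skipped; the same website under a different UID is still inserted.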

Easiest thing to do is use INSERT IGNORE and have a unique key on the fields. It will insert if no row exists, otherwise do nothing.

What about a unique index on (UID, Website), which would cause the insert to fail?

First up, about the question. It is simply bad to check for "an exact" replica of a row in an RDBMS. That is just too costly. The right question to ask is what makes my row unique and what is the minimum I can get away with. Putting unique constraints on big columns is a bad idea.
Answers saying that you should include UID in unique constraint are again just BAD. UID is most likely a generated key and the only input coming from outside is website name. So the only sane thing to do here is to put a unique constraint on website column.
Then the insert code should handle unique constraint errors coming out from the database. You can get the error number from DB handle, like
$errorNo = $mysql->errno ;
Then check for the particular code (1062 in the case of MySQL) that corresponds to a unique key violation.
