Searching for duplicate entries with PDO - php

I'm having a spot of trouble with a bit of code meant to find duplicates of a name along with the platform. This will also be adapted to find unique IDs later on.
So for example, if there is a server named "Apple" on the Xbox and you try to insert a record with the name "Apple" with the same platform it will reject it. However, another platform with the same name is allowed, such as "Apple" with PS3.
I've tried coming up with ideas and searching for answers, but I'm kind of in the dark as to what is the best way to go about checking for duplicates.
So far this is what I have:
$nameDuplicate_sql = $db->prepare("SELECT * FROM `servers` WHERE name=':name' AND platform=':platform'");
$nameDuplicate_sql->bindValue(':name', $name);
$nameDuplicate_sql->bindValue(':platform', $platform);
$nameDuplicate_sql->execute();
I've tried a bunch of different solutions, some from here, others from the PHP's manual and etc. None appear to work though.
I'm trying to stick with PDO; however, this is one instance where I cannot figure out where to turn. If this were mysql_* I could probably just use mysql_affected_rows, but with PDO I have no clue. rowCount seemed promising, but it always returns 0 since this is not an INSERT, UPDATE, or DELETE statement.
Oh, and I've tried the SQL statement in phpMyAdmin and it works; I tried it with a simple name/platform and it found rows properly.
If anyone can help me out here I'd appreciate it.

For most databases, PDOStatement::rowCount() does not return the number of rows affected by a SELECT statement. Instead, use PDO::query() to issue a SELECT COUNT(*) statement with the same predicates as your intended SELECT statement, then use PDOStatement::fetchColumn() to retrieve the number of rows that will be returned. Your application can then perform the correct action.
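Applied to the query in the question, that advice looks something like this (a sketch; names other than those in the question are illustrative):
$dupeCheck = $db->prepare(
    'SELECT COUNT(*) FROM `servers` WHERE name = :name AND platform = :platform'
);
$dupeCheck->bindValue(':name', $name);
$dupeCheck->bindValue(':platform', $platform);
$dupeCheck->execute();

if ($dupeCheck->fetchColumn() > 0) {
    // A server with this name already exists on this platform: reject it.
} else {
    // No duplicate: safe to insert.
}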

Instead of checking for duplicates, why not just enforce it on the database table directly? Create a composite key that will prohibit entries being made if they are already there?
CREATE TABLE servers (
serverName varchar(50),
platform varchar(50),
PRIMARY KEY (serverName, platform)
)
This way, you will never get duplicates, and it also allows you to use MySQL's INSERT ... ON DUPLICATE KEY UPDATE syntax, which sounds like it might be rather handy for you.
If you already have a Primary Key on it or you don't want to make a new table, you can use the following:
ALTER TABLE servers DROP PRIMARY KEY, ADD PRIMARY KEY(serverName, platform);
Edit: A primary key is either a single column or a combination of columns that must hold unique data. A single-column key cannot contain the same value twice, but a composite key (which is what I am suggesting here) means that the same combination of values cannot appear in the two columns more than once.
In this case, that is exactly what you want: the table will let you add as many rows containing the same server name as you like, as long as each one has a different platform associated with it. And vice versa: you can have a platform listed as many times as you like, as long as each row pairs it with a different server name.
If you try to insert a record where the same serverName/platform combination already exists, the database simply won't let you do it. There is another golden benefit, though. Because of this key constraint, MySQL allows a special type of query: the INSERT ... ON DUPLICATE KEY UPDATE syntax. That means if you try to insert the same data twice (i.e., the database says no), you can catch it and update the row you already have in the table. For example:
You have a row with serverName=Fluffeh and it is on platform=Boosh but you don't know about it right now, so you try to insert a record with the intention of updating the server IP address.
Normally you would simply write something like this:
insert into servers (serverName, platform, IPAddress)
values ('$serverName', '$platform', '$IPAddy')
But with a nice primary key identified you can do this:
insert into servers (serverName, platform, IPAddress)
values ('$serverName', '$platform', '$IPAddy')
on duplicate key update IPAddress='$IPAddy';
The second query will insert the row with all the data if it doesn't exist already. If it does, bam! It will update the IP address of the server, which was your intention all along.
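Since the question uses PDO, here is the same upsert as a prepared statement rather than interpolated strings (a sketch; the IPAddress column is this answer's example, and VALUES() in the UPDATE clause refers back to the values being inserted):
$upsert = $db->prepare(
    'INSERT INTO servers (serverName, platform, IPAddress)
     VALUES (:name, :platform, :ip)
     ON DUPLICATE KEY UPDATE IPAddress = VALUES(IPAddress)'
);
$upsert->execute([
    ':name'     => $serverName,
    ':platform' => $platform,
    ':ip'       => $IPAddy,
]);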

Remove the single quotes around the parameter tokens in your query; the values will be quoted when they are bound. That's part of the point of a prepared statement.
$nameDuplicate_sql = $db->prepare("SELECT * FROM `servers` WHERE name = :name AND platform = :platform");

Related

PHP create a copy command like phpmyadmin

I am new to PHP development and I'm just wondering if there's an existing function in PHP that duplicates the copy command in phpMyAdmin. I know the query sequence is below, but that makes for a long query/code since the table has a lot of columns. I mean, if phpMyAdmin has this feature, maybe it's calling a built-in function?
SELECT * FROM table where id = X
INSERT INTO table (XXX) VALUES (XXX)
where the inserted information comes from the SELECT query.
Note: The id is primary and auto increment.
Here is the copy command in phpMyAdmin: (screenshot omitted)
i mean if phpmyadmin has this feature maybe its calling a build in function?
There is no built-in functionality in MySQL to duplicate a row other than an INSERT statement of the form: INSERT INTO tableName ( columns-specification ) SELECT columns-specification FROM tableName WHERE primaryKeyColumns = primaryKeyValue.
The problem is you need to know the names of the columns beforehand, you also need to exclude auto_increment columns, as well as primary-key columns, and know how to come up with "smart defaults" for non-auto_increment primary key columns, especially composite keys. You'll also need to consider if any triggers should be executed too - and how to handle any constraints and indexes that may be designed to prevent duplicate values that a "copy" operation might introduce.
You can still do it in PHP, or even pure-MySQL (inside a sproc, using Dynamic SQL) but you'll need to query information_schema to get metadata about your database - which may be more trouble than it's worth.
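For what it's worth, a rough PDO sketch of that approach (assumes a single auto_increment primary key named id and no triggers; the table name is illustrative and must come from trusted code, since identifiers cannot be bound as parameters):
// Fetch the column names, skipping the auto_increment column.
$cols = $db->query(
    "SELECT COLUMN_NAME FROM information_schema.COLUMNS
     WHERE TABLE_SCHEMA = DATABASE()
       AND TABLE_NAME = 'mytable'
       AND EXTRA NOT LIKE '%auto_increment%'"
)->fetchAll(PDO::FETCH_COLUMN);

$colList = '`' . implode('`, `', $cols) . '`';

// Duplicate row X; the copy receives a fresh auto_increment id.
$copy = $db->prepare(
    "INSERT INTO mytable ($colList) SELECT $colList FROM mytable WHERE id = :id"
);
$copy->execute([':id' => $x]);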

MySql - how can you create a unique constraint on a combination of two values in two columns

I have a problem with creating index described in answer for this question: sql unique constraint on a 2 columns combination
I am using MySQL, and I received a syntax error; my version of the query is as follows:
CREATE UNIQUE INDEX ON friends (LEAST(userID, friendID), GREATEST(userID, friendID));
LEAST and GREATEST functions are available in MySql, but maybe the syntax should be different?
I tried an ALTER TABLE version as well, but it did not work either.
In MySQL, you can't use functions as the values for indexes.
The documentation does not explicitly state this, however, it is a basic characteristic of an index to only support "fixed" data:
Indexes are used to find rows with specific column values quickly. Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows.
Generally, this "fixed" data is an individual column/field; with string-fields (such as varchar or text) you can have a prefix-index and not the entire column. Check out CREATE INDEX for more info on that.
The unique index that you're trying to create in your example would only ever hold a single record; that's not really a beneficial index since it doesn't help with searching the entire table. However, if you index your table on userID, friendID, using the LEAST() and GREATEST() functions in a SELECT statement will be optimized thanks to the index itself, so that may be what you're after in this case.
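A common workaround is to normalize the pair in application code before inserting, so that a plain composite unique index is sufficient. A sketch (assumes the friends table from the question and a PDO connection; the index name is illustrative):
// One-time setup (plain composite unique index, no functions needed):
//   ALTER TABLE friends ADD UNIQUE KEY uq_friend_pair (userID, friendID);

// Always store the smaller id first, so (1,2) and (2,1) map to the
// same row and collide on the unique index.
$low  = min($userID, $friendID);
$high = max($userID, $friendID);

$stmt = $db->prepare('INSERT INTO friends (userID, friendID) VALUES (:low, :high)');
$stmt->execute([':low' => $low, ':high' => $high]);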

Mysterious INSERT UPDATE error

I have a table that looks like (irrelevant columns subtracted):
PRIMARY KEY(AUTO-INCREMENT,INT),
CLIENTID(INT),
CLIENTENTRYID(INT),
COUNT1(INT),
COUNT2(INT)
Now, CLIENTID and CLIENTENTRYID form a combined unique index serving as duplication prevention.
I use PHP to POST input to the server. My query looks like:
$stmt = $sql->prepare('INSERT INTO table (COUNT1,COUNT2,CLIENTID,CLIENTENTRYID) VALUES (?,?,?,?) ON DUPLICATE KEY UPDATE COUNT1=VALUES(COUNT1),COUNT2=VALUES(COUNT2)');
$stmt->bind_param("iiii",$value,$value,$clientid,$cliententryid);
The SQL object has autocommit enabled. The $value variable is reused because the values in COUNT1 and COUNT2 should ALWAYS be the same.
Okay - that works fine, most of the time, but randomly, and I cannot figure out why, it will post 0 in COUNT2 - for an entirely different row.
Any ideas how that might occur? I can't see a pattern (it doesn't happen after a failed attempt, which is why the unique index exists, so that a new attempt will not cause duplicates). It seems to be completely random.
Is there something I've misunderstood about ON DUPLICATE KEY UPDATE? The VERY weird thing is that it updates A DIFFERENT row incorrectly - not the one you insert.
I realize other factors might affect this, but now I'm trying to rule out my SQL logic as a source of error.
Aside from the PRIMARY KEY on the auto_increment column, there is only ONE UNIQUE key defined on the table, and that's defined on (CLIENTID, CLIENTENTRYID), right?
And there are no triggers defined on the table, right?
And you are (obviously) using a prepared statement with bind placeholders.
It doesn't really matter whether those two columns (CLIENTID and CLIENTENTRYID) are defined as NOT NULL or not; MySQL will allow multiple rows with NULL values; that doesn't violate the "uniqueness" enforced by a UNIQUE constraint. (This is the same as how Oracle treats "uniqueness" of NULL values, but it is different from how SQL Server enforces it.)
I just don't see any way that the statement you show, that is:
INSERT INTO `mytable` (COUNT1,COUNT2,CLIENTID,CLIENTENTRYID) VALUES (?,?,?,?)
ON DUPLICATE KEY
UPDATE COUNT1 = VALUES(COUNT1)
, COUNT2 = VALUES(COUNT2)
... there's no way that it would cause some other row in the table to be updated.
Either the insert action succeeds, or it throws a "duplicate key" exception. If the "duplicate key" exception is thrown, the statement catches that, and performs the UPDATE action.
Given that (CLIENTID,CLIENTENTRYID) is the only unique key on the table (apart from the auto_increment column, not referenced by this statement), the update action will be equivalent to this statement:
UPDATE `mytable`
SET COUNT1 = ?
, COUNT2 = ?
WHERE CLIENTID = ?
AND CLIENTENTRYID = ?
... using the values supplied in the VALUES clause of the INSERT statement.
Bottom line, there isn't an issue in anything OP showed us. The logic is sound. There is something else going on, apart from this SQL statement.
OP's code shows scalars (and not array elements) being used as arguments in the bind_param call, so that whole messiness of passing by reference shouldn't be an issue.
There's not an issue with the SQL statement OP has shown, based on everything OP told us and shown us. The issue reported has to be something other than the SQL statement.
Looking at the MySQL doc, it says that given an insert statement
INSERT INTO table (a,b,c) VALUES (1,2,3) ON DUPLICATE KEY UPDATE c=c+1;
if columns a and b are unique, the insert is equivalent to an update statement with a WHERE clause containing an OR instead of an AND:
UPDATE table SET c=c+1 WHERE a=1 OR b=2 LIMIT 1;
And to quote from the documentation,
If a=1 OR b=2 matches several rows, only one row is updated. In general, you should try to avoid using an ON DUPLICATE KEY UPDATE clause on tables with multiple unique indexes.
Hope this helps.
UPDATE:
As per further discussion, OP will consider revisiting the existing database design. OP also has another table with a similar multiple-unique-index spec, which avoids the same problem by using INSERT IGNORE.
I found the answer.
As everyone here correctly suggested, this was something else. For some completely bizarre reason, the button I used to open the "add new entry" somehow POST'ed to set arrived = 0 on a selected object in a table view that has nothing to do with the button.
This must have been a UI linking somewhere in my Storyboard.
I'm sorry I wasted so much of your time guys. At least I learned a little more about SQL and indexes.
I think the problem is that you are using VALUES() in UPDATE COUNT1=VALUES(COUNT1), COUNT2=VALUES(COUNT2). Try it like this:
ON DUPLICATE KEY UPDATE COUNT1 = $v1,COUNT2 = $v2;

Avoid Duplicates of Unique Key Within INSERT Query

I have a MySQL query that looks like this:
INSERT INTO beer(name, type, alcohol_by_volume, description, image_url) VALUES('{$name}', {$type}, '{$alcohol_by_volume}', '{$description}', '{$image_url}')
The only problem is that name is a unique value, which means if I ever run into duplicates, I get an error like this:
Error storing beer data: Duplicate entry 'Hocus Pocus' for key 2
Is there a way to ensure that the SQL query does not attempt to add a unique value that already exists without running a SELECT query for the entire database?
You could of course use INSERT IGNORE INTO, like this:
INSERT IGNORE INTO beer(name, type, alcohol_by_volume, description, image_url) VALUES('{$name}', {$type}, '{$alcohol_by_volume}', '{$description}', '{$image_url}')
You could use ON DUPLICATE KEY as well, but if you just don't want to add a row, INSERT IGNORE INTO is a better choice. ON DUPLICATE KEY is better suited if you want to do something more specific when there is a duplicate.
If you decide to use ON DUPLICATE KEY, avoid using this clause on tables with multiple unique indexes. With multiple unique indexes, an ON DUPLICATE KEY clause can give unexpected results (you really don't have 100% control over what's going to happen).
Example: the conceptual update below only affects ONE row, even if type=1 and alcohol_by_volume=1 match different rows (assuming both columns are unique indexes):
UPDATE beer SET type=3 WHERE type=1 OR alcohol_by_volume=1 LIMIT 1;
To sum it up:
ON DUPLICATE KEY just does the work without warnings or errors when there are duplicates.
INSERT IGNORE INTO throws a warning when there are duplicates, but otherwise simply skips inserting the duplicate into the database.
As it just so happens, there is a way in MySQL by using ON DUPLICATE KEY UPDATE. This is available since MySQL 4.1
INSERT INTO beer(name, type, alcohol_by_volume, description, image_url)
VALUES('{$name}', {$type}, '{$alcohol_by_volume}', '{$description}',
'{$image_url}')
ON DUPLICATE KEY UPDATE type=type;
You could also use INSERT IGNORE INTO... as an alternative, but the statement would still throw a warning (albeit, instead of an error).
Yes, there is. You can use the ON DUPLICATE KEY clause of mysql INSERT statement. The syntax is explained here
INSERT INTO beer(name, type, alcohol_by_volume, ...)
VALUES('{$name}', {$type}, '{$alcohol_by_volume}', ...)
ON DUPLICATE KEY UPDATE
type={$type}, alcohol_by_volume = '{$alcohol_by_volume}', ... ;
Yes: first select the name from the database, and if the query returns a record (rather than zero records), the name already exists and you have to pick another name.
Quite simply - your code needs to figure out what it wants to do if something's trying to insert a duplicate name. As such, what you need to do first is run a select statement:
SELECT * FROM beer WHERE name='{$name}'
And then run an 'if' statement off of that to determine if you got a result.
if results = 0, then go ahead and run your insert.
Else ... whatever you want to do. Throw an error back to the user? Modify the database in a different way? Completely ignore it? How is this insert statement coming about? A mass update from a file? User input from a web page?
The way you're reaching this insert statement, and how it should affect your work flow, should determine exactly how you're handling that 'else'. But you should definitely handle it.
But just make sure that the select and insert statements run in a transaction together, so that other clients doing the same sort of thing at the same time aren't an issue.
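A sketch of that flow with PDO (table and columns from the question; the FOR UPDATE lock and the error handling are assumptions about how you'd wire it up):
$db->beginTransaction();
try {
    // Lock any existing row with this name for the duration of the transaction.
    $check = $db->prepare('SELECT name FROM beer WHERE name = :name FOR UPDATE');
    $check->execute([':name' => $name]);

    if ($check->fetchColumn() === false) {
        // No duplicate found: run the insert.
        $insert = $db->prepare(
            'INSERT INTO beer (name, type, alcohol_by_volume, description, image_url)
             VALUES (:name, :type, :abv, :description, :image_url)'
        );
        $insert->execute([
            ':name'        => $name,
            ':type'        => $type,
            ':abv'         => $alcohol_by_volume,
            ':description' => $description,
            ':image_url'   => $image_url,
        ]);
    } else {
        // Duplicate name: report an error, pick a new name, ignore it... your call.
    }
    $db->commit();
} catch (Exception $e) {
    $db->rollBack();
    throw $e;
}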

Key problem: Which key strategy should I use in my database?

Problem: When I use an auto-incrementing primary key in my database, this happens all the time:
I want to store an Order with 10 Items. The ordered Items belong to the Order. So I store the order, ask the database for the last inserted id (which is dangerous when it comes to concurrency, right?), and then store the 10 Items with the foreign key (order_id).
So I always have to do:
INSERT ...
last_inserted_id = db.lastInsertId();
INSERT ...
INSERT ...
INSERT ...
and I believe this prevents me from using transactions in almost all INSERT cases where I need a foreign key.
So... here some solutions, and I don't know if they're really good:
A) Don't use auto_increment keys! Use a key table?
The key table would have two fields: table_name, next_key. Every time I need a key for a table to insert a new dataset, I first ask for the next_key by accessing a special static KeyGenerator class method. This does a SELECT and an UPDATE, if possible in one transaction (would that work?). Of course I would request that for every affected table. Then I can INSERT my entire object graph in one transaction without playing ping-pong with the database, because I already know the keys in advance.
B) Use a GUID / UUID algorithm for keys?
These are supposed to be truly unique worldwide, and they're LARGE. I mean ... L_A_R_G_E. So a big amount of memory would go into these gigantic keys. Indexing will be hard, right? And data retrieval will be a pain for the database - at least I guess - integer keys are much faster to handle. On the other hand, they also provide some security: visitors can't iterate over all orders or all users or all pictures just by incrementing the id parameter.
C) Stick with auto_incremented keys?
OK, if so, what about transactions like the one described in the example above? How can I solve that? Maybe by inserting a ghost row first and then doing a transaction with one UPDATE + n INSERTs?
D) What else?
When storing orders, you need transactions to prevent situations where only half your products are added to the database.
Depending on your database and your connector, the value returned by the last-insert-id function might be transaction-independent. For instance, with MySQL, mysql_insert_id returns the identifier for the last query from that particular client (without being affected by what other clients are doing concurrently).
Which database are you using?
Yes, typically inserting a record and then trying to select it again to find the auto-generated key is bad, especially if you are using a naive select max(id) from table query. This is because, as soon as two threads are creating records, max(id) may not actually return the last id your current thread used.
One way to avoid this is to create a sequence in the database. From your code you select sequence.NextValue, then use that value to execute your inserts (or you can craft a more complex SQL statement that does the selection and the inserts in one go). Sequences are atomic / thread-safe.
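For example, with PostgreSQL (which has sequences; MySQL does not), the key can be fetched before any insert runs. A sketch with hypothetical table and sequence names:
// PostgreSQL sketch (orders_id_seq is a hypothetical sequence name).
// The key is known before any INSERT, so the whole object graph can
// be written in one transaction with the foreign keys already in hand.
$orderId = $db->query("SELECT nextval('orders_id_seq')")->fetchColumn();

$db->beginTransaction();
$db->prepare('INSERT INTO orders (id, customer) VALUES (:id, :c)')
   ->execute([':id' => $orderId, ':c' => $customer]);
$db->prepare('INSERT INTO items (order_id, name) VALUES (:o, :n)')
   ->execute([':o' => $orderId, ':n' => $itemName]);
$db->commit();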
In MySQL you can ask for the last inserted id from the execution results, which I believe will always give you the correct answer.
Sql Server supports SCOPE_IDENTITY (Transact-SQL) which should take care of your transaction issue and concurrency issue.
I would say stick with auto_increment.
(Assuming you are using MySQL)
"ask the database for the last inserted id (which is dangerous when it comes to concurrency, right?)"
If you use MySQL's last_insert_id() function, you only see what happened in your session. So this is safe. You mention this:
db.last_insert_id()
I don't know what framework or language that is, but I would assume it uses MySQL's last_insert_id() under the covers (if not, it is a pretty useless database abstraction framework).
"I believe this prevents me from using transactions in almost all INSERT cases where I need a foreign key."
I don't see why. Please explain.
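For what it's worth, last_insert_id() also works fine inside a transaction, since it is scoped to your connection. A PDO sketch with hypothetical table names:
$db->beginTransaction();
try {
    $db->prepare('INSERT INTO orders (customer) VALUES (:c)')
       ->execute([':c' => $customer]);

    // Scoped to this connection: unaffected by concurrent clients.
    $orderId = $db->lastInsertId();

    $item = $db->prepare('INSERT INTO items (order_id, name) VALUES (:o, :n)');
    foreach ($itemNames as $n) {
        $item->execute([':o' => $orderId, ':n' => $n]);
    }
    $db->commit();
} catch (Exception $e) {
    $db->rollBack();
    throw $e;
}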
D) Sequence: may not be available in your DBMS, but if it is, it solves your problem elegantly.
For Postgresql, have a look at Sequence Functions
There is no final and general answer to this question.
Auto-incrementing columns are easy to use when you add new records, but using them as foreign keys within the same transaction is not so straightforward: you need database-specific commands to get the newly created key. This approach is common for certain databases, for instance SQL Server.
Sequences seem harder to use, because you need to get a key before you insert a row, but in the end it's easier to use them as foreign keys. This approach is common for certain databases, for instance Oracle.
When you use Hibernate or NHibernate, auto-incrementing keys are discouraged, because some optimizations are no longer possible. Using a hi-lo algorithm, which uses an additional table, is recommended instead.
GUIDs are strong, for instance when sharing data between different databases, systems, disconnected scenarios, import/export, etc. In many databases, most of the tables contain only a few hundred records, so memory and performance are not such an issue. When using NHibernate, you get a GUID generator that produces sequential GUIDs, because some databases perform better when keys are sequential.
