Is there a way to delete duplicate records based on two fields?
I have a system where people can register for sport events. In the table:
event_registrations
• unique_id
• eventname
• id (person's id number)
• Name and Surname
One person can apply for many events, so id may be duplicated.
An event may have multiple participants, so eventname may be duplicated:
--Johnsmith-- --Mountain Cycle--
--Johnsmith-- --Marathon Walk--
--Linda-- --Mountain Cycle--
--Johnsmith-- --Mountain Cycle--
But a person may not register for an event they have already registered for:
--Johnsmith-- --Mountain Cycle--
--Johnsmith-- --Mountain Cycle--
They select an event name through a form. Then the form data and their user details are stored in the table event_registrations.
Any help would be appreciated
First delete any rows with duplicate (eventname, id) combinations.
Then add the UNIQUE constraint:
ALTER TABLE yourTable
ADD CONSTRAINT eventname_person_Unique
UNIQUE INDEX eventname_id_U
(eventname, id) ;
Your form that adds registrations should be adjusted accordingly to handle the error it will get from MySQL when a duplicate row is rejected.
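If it is acceptable for a duplicate registration attempt to be skipped silently instead of being reported, one alternative sketch (not part of the original answer; the column name name_and_surname is an assumption) is to let MySQL ignore the duplicate once the unique constraint exists:
-- relies on the UNIQUE (eventname, id) constraint added above;
-- duplicate attempts are skipped instead of raising error 1062
INSERT IGNORE INTO event_registrations (eventname, id, name_and_surname)
VALUES ('Mountain Cycle', 12345, 'John Smith');
-- ROW_COUNT() returns 0 if the row was ignored, 1 if it was inserted
SELECT ROW_COUNT();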
A UNIQUE INDEX is the way to prevent this, as ypercube suggests. To identify/delete existing duplicates you could use this:
SELECT
    eventname,
    id    -- You should consider using a less ambiguous name here
FROM
    Event_Registrations ER1
WHERE
    EXISTS (
        SELECT *
        FROM Event_Registrations ER2
        WHERE
            ER2.eventname = ER1.eventname AND
            ER2.id = ER1.id AND
            (ER2.registration_datetime < ER1.registration_datetime OR
             (ER2.registration_datetime = ER1.registration_datetime AND
              ER2.unique_id < ER1.unique_id
             )
            )
    )
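If you want to delete those duplicates directly rather than just list them, a hedged sketch written as a MySQL multi-table delete (so the table can be self-joined, assuming the same registration_datetime column as above) would be:
DELETE ER1
FROM Event_Registrations ER1
JOIN Event_Registrations ER2
  ON ER2.eventname = ER1.eventname
 AND ER2.id = ER1.id
 AND (ER2.registration_datetime < ER1.registration_datetime OR
      (ER2.registration_datetime = ER1.registration_datetime AND
       ER2.unique_id < ER1.unique_id));
This keeps the earliest registration for each (eventname, id) pair and removes the rest.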
If you need to do some data tidy up before adding the unique constraint then you could use the following (one good reason why always having a unique id column is a good idea):
create table id_for_deletion (id int unsigned not null);
insert into id_for_deletion (id)
(
select a.delete_me_id
from (
select eventname,id,max(unique_id) as delete_me_id
from event_registrations
group by eventname,id
having count(*) > 1
) a);
delete from event_registrations where unique_id in (select id from id_for_deletion);
drop table id_for_deletion;
I have 3 tables in a Postgres database, created with this code:
CREATE TABLE AUTHOR(
ID SERIAL PRIMARY KEY,
NAME TEXT
);
CREATE TABLE BOOK(
ID SERIAL PRIMARY KEY,
NAME TEXT
);
CREATE TABLE BOOK_AUTHOR(
BOOK_ID INTEGER REFERENCES BOOK(ID),
AUTHOR_ID INTEGER REFERENCES AUTHOR(ID)
);
A book can have multiple authors.
I want to insert multiple authors into the AUTHOR table.
A book into the BOOK table.
And the pairs into the BOOK_AUTHOR table.
For example: If BOOK X is written by Mr. A and Mr. B
I want the table contents to look like this:
AUTHOR
ID-NAME
1, Mr. A
2, Mr. B
BOOK
ID-NAME
1, X
BOOK_AUTHOR
BOOK_ID-AUTHOR_ID
1,1
1,2
I am using postgres-php.
I know I can insert data into the author table, insert data into the book table, and query them to get the ids.
Then insert into the book_author table.
But is there any way to insert this data more efficiently?
What is the best way to do it?
PostgreSQL has a very handy RETURNING clause you can use here, like this:
WITH authors AS (
INSERT INTO
author (name)
VALUES
('Mr. A'), ('Mr. B')
RETURNING
id
), books AS (
INSERT INTO
book (name)
VALUES
('X')
RETURNING
id
)
INSERT INTO
book_author
SELECT
b.id
, a.id
FROM
books b
, authors a;
Just make a Cartesian product of the output and use it as input for the third insert.
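A small hedged variation (not in the original answer): putting RETURNING on the final INSERT as well lets the application read the generated pairs back in the same round trip:
WITH authors AS (
  INSERT INTO author (name)
  VALUES ('Mr. A'), ('Mr. B')
  RETURNING id
), books AS (
  INSERT INTO book (name)
  VALUES ('X')
  RETURNING id
)
INSERT INTO book_author (book_id, author_id)
SELECT b.id, a.id
FROM books b
CROSS JOIN authors a
RETURNING book_id, author_id;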
I have an SQL Table that contains a list of items that users can have linked to their profile. The SQL table looks something like this:
Item_Activity_ID   Item_ID   User_ID   Status   Date_Added
1                  1         1         1        2015-06-08
2                  2         2         1        2015-06-08
3                  1         1         0        2015-06-09
These entries show that the user with id 1 added item 1 twice; the only things that changed were the date and status. I want the insert to behave like this pseudo-statement:
INSERT INTO items (Item_ID, User_ID, Status, Date_Added) VALUES ('$x', '$y', 1, CURDATE()) IF EXISTS SOME Item_ID = $x AND User_ID = $y UPDATE items SET Status = 1, Date_Added = CURDATE() WHERE Item_ID = $x AND User_ID = $y
Item_Activity_ID is an auto-incremented primary key index. How can I accomplish this in one query? Two users can have the same item, but there should never be repeat entries with the same user id and item id.
First, create a unique index for Item_ID, UserID combination,
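A sketch of that first step (the index name item_user_unique is an assumption; the columns come from the question):
ALTER TABLE items
  ADD UNIQUE INDEX item_user_unique (Item_ID, User_ID);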
Then, use the INSERT ... ON DUPLICATE KEY UPDATE statement:
INSERT INTO items (Item_ID, User_ID, Status, Date_Added)
VALUES ('$x', '$y', 1, CURDATE())
ON DUPLICATE KEY UPDATE Status = VALUES(Status), Date_Added = VALUES(Date_Added);
P.S. make sure to sanitize $x and $y to prevent SQL injections!
I would start by adding a unique key index:
ALTER TABLE items
ADD CONSTRAINT uc_UserItem UNIQUE (Item_ID,User_ID);
Then, you can just modify your insert query:
INSERT INTO items (Item_ID, User_ID, Status, Date_Added) VALUES ('$x', '$y', 1, CURDATE()) ON DUPLICATE KEY UPDATE Status = 1, Date_Added = CURDATE();
Try to perform the update first, supposing that the user and item already exist. Then check whether this update affected any rows (using @@ROWCOUNT in SQL Server).
If not then perform an insert.
Don't forget to put the above two statements in a transaction... ;)
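For MySQL, a minimal sketch of that pattern might look like the following (MySQL exposes the affected-row count as ROW_COUNT() rather than @@ROWCOUNT; the literal 1 values stand in for $x and $y):
START TRANSACTION;

UPDATE items
SET Status = 1, Date_Added = CURDATE()
WHERE Item_ID = 1 AND User_ID = 1;

-- note: by default ROW_COUNT() counts rows actually changed,
-- not rows merely matched by the WHERE clause
SET @updated = ROW_COUNT();

INSERT INTO items (Item_ID, User_ID, Status, Date_Added)
SELECT 1, 1, 1, CURDATE() FROM DUAL WHERE @updated = 0;

COMMIT;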
The normal way would be to set a composite UNIQUE constraint at the db level. If you are using MySQL and phpMyAdmin you can do this in the table structure view:
check both fields (presumably 'user_id' and 'item_id') and click the 'Unique' button.
After that is set you can just append
ON DUPLICATE KEY UPDATE status =1, date_added=CURDATE().
It will update the row that would have violated the constraint you created.
I have two tables, one containing user information and the other their time entries for the weeks, as below:
Table1
------
UserID (PK)
Username
Email
Phone
Table2
------
Timesheetid (PK)
UserID (FK)
Weekenddate(date)
Totaltimeworked
From the above tables I want to retrieve the UserID, Username and Email from Table1 for the users who have not entered information in Table2 for a given weekend date (the weekend date is selected in a search field and not hardcoded).
Please help me with the SQL query to create this table.
Try this working code on SQL Fiddle. As you haven't posted any data or column definitions, I assumed weekenddate to be a varchar.
In the condition Weekenddate = 'sunday' OR Weekenddate = 'saturday', substitute the values sunday and saturday with your parameter value. In fact, you will only need one of the two clauses, as you have only one parameter with the weekend value. Then just wrap the code into an
`INSERT INTO your_new_table (UserID, Username)`
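A rough sketch of the kind of anti-join such a query might use (an assumption on my part, based on the Table1/Table2 layout from the question and 'sunday' as the selected value):
SELECT t1.UserID, t1.Username, t1.Email
FROM Table1 t1
LEFT JOIN Table2 t2
       ON t2.UserID = t1.UserID
      AND t2.Weekenddate = 'sunday'
WHERE t2.Timesheetid IS NULL;
Rows come back only for users with no Table2 entry for that date; the statement can then be wrapped in the INSERT mentioned above.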
Considering your Totaltimeworked column is up to date:
CREATE TABLE newTable(
UserID int not null,
Weekenddate date not null,
FOREIGN KEY(UserID) REFERENCES Table1(UserID)
ON DELETE CASCADE ON UPDATE CASCADE
);
INSERT INTO newTable (UserID, Weekenddate)
(SELECT UserID, Weekenddate FROM Table2 WHERE Totaltimeworked > 0);
Or if you meant just for 1 specific week don't add the Weekenddate to newTable and modify query as
INSERT INTO newTable (UserID)
(SELECT UserID FROM Table2 WHERE Totaltimeworked > 0
AND Weekenddate = 'year-month-day');
Hello, I have a question about picking random entries from a database. I have 4 tables: products, bids, autobids, and users.
Products
-------
id 20,21,22,23,24 (primary key)
price...........
etc...........
users
-------
id (primary key)
name user1,user2,user3
etc
bids
-------
product_id
user_id
created
autobids
--------
user_id
product_id
Now multiple users can have an autobid on a product. So for the next bidder I want to select a random user from the autobid table.
An example of the query in plain language:
for each product in the autobid table I want a random user who is not the last bidder.
Product 20 has autobids from user1, user2 and user3.
Product 21 has autobids from user1, user2 and user3.
Then I want a resultset that looks for example like this
20 – user2
21 – user3
Just a random user. I tried mixing GROUP BY (product_id) with RAND(), but I just can't get the right values from it. Now I am getting a random user, but all the values that go with it don't match.
Can someone please help me construct this query? I am using PHP and MySQL.
The first part of the solution is concerned with identifying the latest bid for each product: these eventually wind up in temporary table "latest_bid".
Then, we assign random rank values to each autobid for each product, excluding the latest bid for each product. We then choose the highest rank value for each product, and finally output the user_id and product_id of the autobids with those highest rank values.
create temporary table lastbids (product_id int not null,
created datetime not null,
primary key( product_id, created ) );
insert into lastbids
select product_id, max(created)
from bids
group by product_id;
create temporary table latest_bid ( user_id int not null,
product_id int not null,
primary key( user_id, product_id) );
insert into latest_bid
select b.user_id, b.product_id
from bids b
join lastbids lb on lb.product_id = b.product_id and lb.created = b.created;
create temporary table rank ( user_id int not null,
product_id int not null,
rank float not null,
primary key( product_id, rank ));
# "ignore" duplicates - it should not matter
# left join on latest_bid to exclude latest_bid for each product
insert ignore into rank
select user_id, product_id, rand()
from autobids a
left join latest_bid lb on a.user_id = lb.user_id and a.product_id = lb.product_id
where lb.user_id is null;
create temporary table choice
as select product_id,max(rank) choice
from rank group by product_id;
select user_id, res.product_id from rank res
join choice on res.product_id = choice.product_id and res.rank = choice.choice;
You can use the LIMIT statement in conjunction with server-side PREPARE.
Here is an example that selects a random row from the table mysql.help_category:
select @choice := (rand() * count(*)) from mysql.help_category;
prepare rand_msg from 'select * from mysql.help_category limit ?,1';
execute rand_msg using @choice;
deallocate prepare rand_msg;
This will need refining to prevent @choice becoming zero, but the general idea works.
Alternatively, your application can construct the count itself by running the first select, and constructing the second select with a hard-coded limit value:
select count(*) from mysql.help_category;
# application then calculates limit value and constructs the select statement:
select * from mysql.help_category limit 5,1;
Is there any way I can erase all the duplicate entries from a certain table (users)? Here is a sample of the type of entries I have. I should say the users table consists of 3 fields: ID, user, and pass.
mysql_query("DELETE FROM users WHERE ???") or die(mysql_error());
randomtest
randomtest
randomtest
nextfile
baby
randomtest
dog
anothertest
randomtest
baby
nextfile
dog
anothertest
randomtest
randomtest
I want to be able to find the duplicate entries, and then delete all of the duplicates, and leave one.
You can solve it with only one query.
If your table has the following structure:
CREATE TABLE `users` (
`id` int(10) unsigned NOT NULL auto_increment,
`username` varchar(45) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=8 DEFAULT CHARSET=latin1;
you could do something like this (it will delete all duplicate users based on username with an ID greater than the smallest ID for that username):
DELETE users
FROM users INNER JOIN
(SELECT MIN(id) as id, username FROM users GROUP BY username) AS t
ON users.username = t.username AND users.id > t.id
It works, and I've already used something similar to delete duplicates.
You can do it with three SQL statements (note that this rebuilds the table without the original ID column):
create table tmp as select distinct user, pass from users;
drop table users;
alter table tmp rename users;
This delete script (SQL Server syntax) should work:
DELETE FROM Users
WHERE ID NOT IN (
SELECT MIN(ID)
FROM Users
GROUP BY User
)
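Note that MySQL will reject a subquery that reads from the DELETE's own target table; a commonly used workaround (a sketch, assuming the ID and User columns from the question) is to wrap the subquery in a derived table:
DELETE FROM Users
WHERE ID NOT IN (
    SELECT keep_id FROM (
        SELECT MIN(ID) AS keep_id
        FROM Users
        GROUP BY User
    ) AS keeper
);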
I assume that you have a structure like the following:
users
-----------------
| id | username |
-----------------
| 1 | joe |
| 2 | bob |
| 3 | jane |
| 4 | bob |
| 5 | bob |
| 6 | jane |
-----------------
The trick with the temporary table is required because MySQL cannot use a sub-select in a DELETE query that reads from the delete's target table.
CREATE TEMPORARY TABLE IF NOT EXISTS users_to_delete (id INTEGER);

-- despite the table name, this collects the smallest id per username,
-- i.e. the rows to KEEP; everything else is deleted below
INSERT INTO users_to_delete (id)
SELECT MIN(u1.id) as id
FROM users u1
INNER JOIN users u2 ON u1.username = u2.username
GROUP BY u1.username;

DELETE FROM users WHERE id NOT IN (SELECT id FROM users_to_delete);
I know the query is a bit hairy but it does the work, even if the users table has more than 2 columns.
You need to be a bit careful about how the data in your table is used. If this really is a users table, there are likely other tables with FKs pointing to the ID column, in which case you need to update those tables to use the ID you have selected to keep.
If it's just a standalone table (no other table references it):
CREATE TEMPORARY TABLE Tmp (ID int);
INSERT INTO Tmp SELECT ID FROM USERS GROUP BY User;
DELETE FROM Users WHERE ID NOT IN (SELECT ID FROM Tmp);
Users table linked from other tables
Create the temporary tables including a link table that holds all the old id's and the respective new ids which other tables should reference instead.
CREATE TEMPORARY TABLE Keep (ID int, User varchar(45));
CREATE TEMPORARY TABLE Remove (OldID int, NewID int);
INSERT INTO Keep SELECT ID, User FROM USERS GROUP BY User;
INSERT INTO Remove
SELECT u1.ID, u2.ID
FROM Users u1
INNER JOIN Keep u2 ON u2.User = u1.User
WHERE u1.ID NOT IN (SELECT ID FROM Users GROUP BY User);
Go through any tables which reference your users table and update their FK column (likely called UserID) to point to the New unique ID which you have selected, like so...
UPDATE MYTABLE t INNER JOIN Remove r ON t.UserID = r.OldID
SET t.UserID = r.NewID;
Finally go back to your users table and remove the no longer referenced duplicates:
DELETE FROM Users WHERE ID NOT IN (SELECT ID FROM Keep);
Clean up those Tmp tables:
DROP TABLE Keep;
DROP TABLE Remove;
A very simple solution would be to set a UNIQUE index on the table column you wish to keep unique. Note that you subsequently cannot insert the same key twice.
Edit: My mistake, I hadn't read that last line: "I want to be able to find the duplicate entries".
I would get all the results, put them in an array of IDs and VALUES. Use a PHP function to work out the dupes, log all the IDs in an array, and use those values to delete the records.
I don't know your db schema, but the simplest solution seems to be to do a SELECT DISTINCT on that table, keep the result in a variable (e.g. an array), delete all records from the table, and then reinsert the list returned by the SELECT DISTINCT.
The temporary table is an excellent solution, but I'd like to provide a SELECT query that grabs duplicate rows from the table as an alternative:
SELECT * FROM `users` LEFT JOIN (
    SELECT `name`, COUNT(`name`) AS `count`
    FROM `users` GROUP BY `name`
) AS `grouped` ON `grouped`.`name` = `users`.`name`
WHERE `grouped`.`count` > 1
Select your 3 columns as per your table structure and apply the condition as per your requirements.
SELECT user.userId, user.username, user.password
FROM user AS user
GROUP BY user.username
HAVING COUNT(user.username) > 1;
None of the other answers worked for me, so I decided to write my own little script. It's not the best, but it gets the job done.
Comments are included throughout; this script is customized for my needs, but I hope the idea helps you.
I basically wrote the database contents to a temp file, read the temp file back in, applied a function to it to remove the duplicates, truncated the table, and then inserted the data right back into the table. Sounds like a lot, I know.
If you're confused as to what $setprofile is, it's a session that's created upon logging into my script (to establish a profile), and is cleared upon logging out.
<?php
// session and includes, you know the drill.
session_start();
include_once('connect/config.php');
// create a temp file with session id and current date
$datefile = date("m-j-Y");
$file = "temp/$setprofile-$datefile.txt";
$f = fopen($file, 'w'); // Open in write mode
// call the user and pass via SQL and write them to $file
$sql = mysql_query("SELECT * FROM _$setprofile ORDER BY user DESC");
while($row = mysql_fetch_array($sql))
{
$user = $row['user'];
$pass = $row['pass'];
$accounts = "$user:$pass "; // the white space right here is important, it defines the separator for the dupe check function
fwrite($f, $accounts);
}
fclose($f);
// **** Dupe Function **** //
// removes duplicate substrings between the seperator
function uniqueStrs($seperator, $str) {
// convert string to an array using ' ' as the seperator
$str_arr = explode($seperator, $str);
// remove duplicate array values
$result = array_unique($str_arr);
// convert array back to string, using ' ' to glue it back
$unique_str = implode(' ', $result);
// return the unique string
return $unique_str;
}
// **** END Dupe Function **** //
// call the list we made earlier, so we can use the function above to remove dupes
$str = file_get_contents($file);
// seperator
$seperator = ' ';
// use the function to save a unique string
$new_str = uniqueStrs($seperator, $str);
// empty the table
mysql_query("TRUNCATE TABLE _$setprofile") or die(mysql_error());
// prep for SQL by replacing test:test with ('test','test'), etc.
// this isn't a sufficient way of converting, as i said, it works for me.
$patterns = array("/([^\s:]+):([^\s:]+)/", "/\s++\(/");
$replacements = array("('$1', '$2')", ", (");
// insert the values into your table, and presto! no more dupes.
$sql = 'INSERT INTO `_'.$setprofile.'` (`user`, `pass`) VALUES ' . preg_replace($patterns, $replacements, $new_str) . ';';
$product = mysql_query($sql) or die(mysql_error()); // put $new_str here so it will replace new list with SQL formatting
// if all goes well.... OR wrong? :)
if($product){ echo "Completed!";
} else {
echo "Failed!";
}
unlink($file); // delete the temp file/list we made earlier
?>
This will work:
create table tmp like users;
-- keep one full row per user: the one with the smallest id
insert into tmp
select u.* from users u
join (select min(id) as id from users group by user) k on k.id = u.id;
drop table users;
alter table tmp rename users;
If you have a Unique ID / Primary key on the table then:
DELETE FROM MyTable AS T1
WHERE MyID <
(
SELECT MAX(MyID)
FROM MyTable AS T2
WHERE T2.Col1 = T1.Col1
AND T2.Col2 = T1.Col2
... repeat for all columns to consider duplicates ...
)
If you don't have a unique key, select all distinct values into a temporary table, delete all original rows, and copy back from the temporary table. But this will be problematic if you have foreign keys referring to this table.
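A rough sketch of that fallback (assuming a simple two-column table with no foreign keys pointing at it):
CREATE TEMPORARY TABLE tmp_distinct AS
SELECT DISTINCT Col1, Col2 FROM MyTable;

DELETE FROM MyTable;

INSERT INTO MyTable (Col1, Col2)
SELECT Col1, Col2 FROM tmp_distinct;

DROP TEMPORARY TABLE tmp_distinct;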