I am trying to find duplicate entries in my MySQL table. I would like to compare the different fields with each other. Here is the structure of my table:
ID FirstName LastName Street ZIP City IpAddress
1 Jack Smith 2nd 12345 Sample1 12.21.24.212
2 Paul Miller 3rd 45685 Sample2 78.54.85.654
3 Jenny Smith 3rd 77273 Sample3 84.91.67.311
4 Frank Jackson 1st 27819 Sample1 78.54.85.654
5 Jack Smith 3rd 72891 Sample2 94.79.99.465
Now I would like to compare the street and IP columns individually, and then find duplicates of the combination of first and last name. There are actually a few more columns in my table that I would like to search, but the example above should give you an idea of what I am planning.
I need the ID numbers of the entries that are potential duplicates.
In the example above, the output should be the ID numbers 1 and 5 when I compare the combination of first and last name.
The output should be the ID numbers 2, 3 and 5 if I compare the street names.
And the output for the IP addresses should be ID numbers 2 and 4.
Does anyone have ideas about how I should do this? What is the best way to compare those different columns? I don't mind if I have to run several queries.
Use GROUP_CONCAT() to get all the IDs within a group, and GROUP BY to specify the columns that you're looking for duplicates of. Add HAVING COUNT(*) > 1 so that you only return the groups that have duplicates.
For streets:
SELECT street, GROUP_CONCAT(id)
FROM yourTable
GROUP BY street
HAVING COUNT(*) > 1
For names:
SELECT firstname, lastname, GROUP_CONCAT(id)
FROM yourTable
GROUP BY firstname, lastname
HAVING COUNT(*) > 1
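The same pattern also covers the IP address column from the question. Here is a minimal sketch that runs all three checks against the sample rows. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; SQLite also provides GROUP_CONCAT), and the table name people and the helper duplicate_ids are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT,"
    " street TEXT, zip TEXT, city TEXT, ipaddress TEXT)"
)
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "Jack", "Smith", "2nd", "12345", "Sample1", "12.21.24.212"),
        (2, "Paul", "Miller", "3rd", "45685", "Sample2", "78.54.85.654"),
        (3, "Jenny", "Smith", "3rd", "77273", "Sample3", "84.91.67.311"),
        (4, "Frank", "Jackson", "1st", "27819", "Sample1", "78.54.85.654"),
        (5, "Jack", "Smith", "3rd", "72891", "Sample2", "94.79.99.465"),
    ],
)


def duplicate_ids(columns):
    """Return {group values: sorted ids} for every group with more than one row.

    Hypothetical helper: only pass fixed, known column names, never user input,
    because the names are interpolated into the SQL text.
    """
    cols = ", ".join(columns)
    query = (
        f"SELECT {cols}, GROUP_CONCAT(id) FROM people "
        f"GROUP BY {cols} HAVING COUNT(*) > 1"
    )
    result = {}
    for row in conn.execute(query):
        # The last column is the comma-separated id list built by GROUP_CONCAT.
        result[row[:-1]] = sorted(int(i) for i in row[-1].split(","))
    return result


street_dupes = duplicate_ids(["street"])
name_dupes = duplicate_ids(["firstname", "lastname"])
ip_dupes = duplicate_ids(["ipaddress"])
print(street_dupes, name_dupes, ip_dupes)
```

The ids are sorted in Python because the order inside GROUP_CONCAT is not guaranteed unless the query asks for one.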
Related
I'd like to know how I can count and display duplicate rows on my PHP website. Let's assume that my website allows one to submit a name. How do I then display these names on my website? Let's say that the names in the database are:
John
Derrick
Billy
Jason
Wesley
Billy
John
Billy
I'd then like the website to display the following:
John: 2
Derrick: 1
Billy: 3
Jason: 1
Wesley: 1
How can I achieve this sort of structure?
Yes, you can do this:
select name, count(*) from your_table group by name
And that would return something like this:
name | count
John | 2
If you want the name and the count in a single column, you can use CONCAT():
select concat(name, ': ', count(*)) from your_table group by name
And it would return something like this:
name
John: 2
(just one column)
<?php
$ps=$db->prepare("SELECT column_name, count(*) as nb FROM table_name GROUP BY column_name");
$ps->execute();
while($data=$ps->fetch()){
echo $data['column_name'].':'.$data['nb'].'<br/>';
}
?>
$db is the connection instance
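For comparison, here is a small runnable sketch of the same GROUP BY count using the names from the question. It uses Python's sqlite3 module instead of PHP and MySQL (an assumption made so the example is self-contained), and the table name submissions is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE submissions (id INTEGER PRIMARY KEY, name TEXT)")
names = ["John", "Derrick", "Billy", "Jason", "Wesley", "Billy", "John", "Billy"]
conn.executemany("INSERT INTO submissions (name) VALUES (?)", [(n,) for n in names])

# ORDER BY MIN(id) lists each name in order of its first appearance,
# which matches the layout asked for in the question.
rows = conn.execute(
    "SELECT name, COUNT(*) AS nb FROM submissions GROUP BY name ORDER BY MIN(id)"
).fetchall()

lines = [f"{name}: {nb}" for name, nb in rows]
print("\n".join(lines))
```

Without an ORDER BY clause the order of the groups is up to the database, so the sketch sorts explicitly.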
I want to count how often each name is repeated and get the most repeated name, but I cannot get it to work. I can count how many rows Michael has with $fix set, but I need to know who has the most repeats.
SELECT COUNT(*) AS count FROM (SELECT * FROM names WHERE league='books' and $position='Michael' ORDER BY id LIMIT $limit) AS last12 WHERE $fix='1'
I want it to print Michael if he is repeated the most:
Michael 1
Jack 1
Jack 1
Jack 1
Michael 1
Michael 1
Michael 1
Juni 1
Let's say you have a table called names with columns id, name, league, etc. You want to know how many rows there are for Michael, Erick, or any other name. To do that, group by that column and use count(*), as follows:
Select name, count(*) as count from names group by name
This returns each name together with its count.
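To get only the most repeated name, the grouped result can be sorted by the count and limited to one row. A minimal sketch, assuming a simplified names table with just id and name (the league, $position and $fix filters from the question are left out) and using Python's sqlite3 module in place of MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT)")
sample = ["Michael", "Jack", "Jack", "Jack", "Michael", "Michael", "Michael", "Juni"]
conn.executemany("INSERT INTO names (name) VALUES (?)", [(n,) for n in sample])

# Highest count first; the name is a tie-breaker so the result is deterministic.
top = conn.execute(
    "SELECT name, COUNT(*) AS cnt FROM names "
    "GROUP BY name ORDER BY cnt DESC, name LIMIT 1"
).fetchone()
print(top)
```

The ORDER BY ... LIMIT 1 clause is also valid in MySQL; any extra conditions would go in a WHERE clause before the GROUP BY.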
I have a situation,
My MySQL table (company) contains duplicate records, i.e. repeated companies. Some records have values in most columns and some do not. I want to remove the duplicate companies that have the least information. Any ideas?
Id Company_name column column2 column3 column4
-------------------------------------------------
1 A xyz
2 B pqr abc tcv aaa
3 A bnm xyz ccc
4 A bnm xyz
5 B aaa
I need to get my table as follows
Id Company_name column column2 column3 column4
-------------------------------------------------
2 B pqr abc tcv aaa
3 A bnm xyz ccc
You could do this in PHP: retrieve all the records grouped by the column you want to deduplicate on (Company_name in this case) and remove the repeated rows yourself. Note that rows with the same Company_name may have different values in the other columns, so it is ambiguous how an algorithm should treat such rows.
It is good practice to check the data before inserting it so that no duplicates occur. If you already have such records, the following query may help.
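-- Keeps the row with the lowest Id for each Company_name and deletes the rest.
-- The derived table X lets MySQL read from the table it is deleting from.
DELETE FROM TABLENAME WHERE Id
NOT IN
(
SELECT Id FROM
(
SELECT MIN(Id) AS Id FROM TABLENAME GROUP BY Company_name
)
X
);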
This deletes duplicates based on one column; you can combine several such queries to remove duplicates across other columns.
It's possible to get a "score" of each row and base the decision on that. Here is a quick example that shows where to start.
SELECT id,
name,
length(concat_ws('', col1, col2, col3, col4)) AS score
FROM company
ORDER BY score DESC;
See it on sqlfiddle
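Building on the score idea, here is a rough sketch of how the lower-scoring duplicates could then be removed. It makes several assumptions: Python's sqlite3 module stands in for MySQL, the columns are named col1 to col4, the sample values are filled in from left to right (the original column placement is not clear), and the score is the number of non-NULL columns rather than the concatenated length used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE company (id INTEGER PRIMARY KEY, company_name TEXT,"
    " col1 TEXT, col2 TEXT, col3 TEXT, col4 TEXT)"
)
# Column placement of the sample values is assumed (filled left to right).
conn.executemany(
    "INSERT INTO company VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "A", "xyz", None, None, None),
        (2, "B", "pqr", "abc", "tcv", "aaa"),
        (3, "A", "bnm", "xyz", "ccc", None),
        (4, "A", "bnm", "xyz", None, None),
        (5, "B", "aaa", None, None, None),
    ],
)

# Score = number of filled columns; the best row of each company sorts first.
rows = conn.execute(
    "SELECT id, company_name,"
    " (col1 IS NOT NULL) + (col2 IS NOT NULL)"
    " + (col3 IS NOT NULL) + (col4 IS NOT NULL) AS score"
    " FROM company ORDER BY company_name, score DESC, id"
).fetchall()

keep = {}
for row_id, name, _score in rows:
    # The first row seen for a company has its top score, so keep that id.
    keep.setdefault(name, row_id)

drop = [(row_id,) for row_id, name, _score in rows if keep[name] != row_id]
conn.executemany("DELETE FROM company WHERE id = ?", drop)

remaining = [r[0] for r in conn.execute("SELECT id FROM company ORDER BY id")]
print(remaining)
```

Ties are broken in favour of the lower id here; a different rule may suit the real data better.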
Say I have a database with two tables: "food", and "whatToEat".
I query the "whatToEat" and find 3 rows:
id Food username
1 Apple John
2 Banana John
3 Milk Linda
If I want to get those from the "food" table, I guess I can do something like this:
SELECT *
FROM food
WHERE username='John' AND typeOfFood = 'apple'
OR typeOfFood = 'Banana' OR typeOfFood = 'Milk'
... but is it possible to dynamically write this, since the "whatToEat" table will change all the time, or do I need a loop and query the "food" table one by one for each of the objects in "whatToEat"?
EDIT
The above is just an example; the real scenario is an online game. When it's a player's turn in a game, he is put in the "matches_updated" table. This table just holds his name and the ID of the match (or matches, since he can be in several at the same time). When a player receives an update, I would like to check whether he has any matches that need to be updated (query the "matches_updated" table), and then pull the data from the "matches" table, where all the information about the matches is stored, and return it to him.
Example:
The player Tim queries the "matches_updated" table and finds he has 2 new matches that need to be updated:
match_id username
1 Tim
2 Tim
2 Lisa
1 John
3 John
... He now wants to get the information about these matches, which is stored in the "matches" table:
match_id match_status player1Name Player1Score Player2Name Player2Score
1 1 John 123 Tim 12
2 1 Lisa 4 Tim 15
3 1 John 0 Lisa 0
I am not sure whether I understand the question correctly.
It depends on what the queried tables have in common.
If, say, food is a common column, then you could query something like this:
select * from food where food in (select food from whattoeat where username = ?)
Try it out and see if it solves your problem.
SELECT * FROM matches_updated JOIN matches
ON matches_updated.match_id = matches.match_id
WHERE matches_updated.username = 'Tim'
--
Perhaps you want a JOIN statement?
SELECT * FROM food JOIN whattoeat
ON food.username = whattoeat.username
WHERE food.username = 'John'
I'm having a really hard time trying to understand what your desired result is. Posting an example of both tables in question, and the desired result of your query, might help.
SELECT * FROM food WHERE username='John'
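For the matches scenario from the question's edit, a join between the two tables can be tried out with the sample rows. This is only a sketch: it uses Python's sqlite3 module in place of MySQL, and the helper name updated_matches is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE matches_updated (match_id INTEGER, username TEXT)")
conn.execute(
    "CREATE TABLE matches (match_id INTEGER PRIMARY KEY, match_status INTEGER,"
    " player1Name TEXT, player1Score INTEGER,"
    " player2Name TEXT, player2Score INTEGER)"
)
conn.executemany(
    "INSERT INTO matches_updated VALUES (?, ?)",
    [(1, "Tim"), (2, "Tim"), (2, "Lisa"), (1, "John"), (3, "John")],
)
conn.executemany(
    "INSERT INTO matches VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 1, "John", 123, "Tim", 12),
        (2, 1, "Lisa", 4, "Tim", 15),
        (3, 1, "John", 0, "Lisa", 0),
    ],
)


def updated_matches(username):
    """Return the full match rows that are flagged as updated for one player."""
    return conn.execute(
        "SELECT m.* FROM matches_updated AS u"
        " JOIN matches AS m ON u.match_id = m.match_id"
        " WHERE u.username = ? ORDER BY m.match_id",
        (username,),
    ).fetchall()


tim_matches = updated_matches("Tim")
print(tim_matches)
```

A single query like this replaces the loop over "matches_updated", and the placeholder keeps the player name out of the SQL text.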
Say I have 2 tables. The first one holds user IDs and their first names. The second one holds user IDs and their last names, but the rows in this table may or may not exist depending on whether the user has given their last name or not.
I want to select both the first name and the last name, but if only the first name exists, then just select that on its own.
I can't use something like this because if the second table's row doesn't exist, it returns nothing:
$db->query("select firstname.fname, lastname.lname from firstname, lastname where firstname.userid = lastname.userid");
Thanks.
SELECT f.fname, l.lname
FROM firstname f
LEFT JOIN lastname l
ON f.userid = l.userid
This will return something like:
fname | lname
John | Doe
Bob | NULL
where NULL means that Bob has no last name.
The comma-separated FROM in your example, together with its WHERE condition, behaves like an inner join: it only returns users whose userid appears in both tables, which is why users without a last name are missing. LEFT JOIN keeps every row of firstname and fills lname with NULL when no row in lastname has the same userid.
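The difference between the two queries can be checked with a tiny data set. A minimal sketch, assuming Python's sqlite3 module as a stand-in for MySQL and two made-up users:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE firstname (userid INTEGER PRIMARY KEY, fname TEXT)")
conn.execute("CREATE TABLE lastname (userid INTEGER PRIMARY KEY, lname TEXT)")
conn.executemany("INSERT INTO firstname VALUES (?, ?)", [(1, "John"), (2, "Bob")])
conn.execute("INSERT INTO lastname VALUES (1, 'Doe')")  # Bob has no last name

# Query from the question: only users present in both tables come back.
implicit = conn.execute(
    "SELECT f.fname, l.lname FROM firstname f, lastname l"
    " WHERE f.userid = l.userid"
).fetchall()

# LEFT JOIN: every user from firstname, with None (NULL) for a missing lname.
left = conn.execute(
    "SELECT f.fname, l.lname FROM firstname f"
    " LEFT JOIN lastname l ON f.userid = l.userid"
    " ORDER BY f.userid"
).fetchall()

print(implicit)
print(left)
```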