Count duplicates and update table with a single query - php

I have a table which has several thousand records.
I want to update all the records which have a duplicate firstname.
How can I achieve this with a single query?
Sample table structure:
Fname varchar(100)
Lname varchar(100)
Duplicates int
The Duplicates column must be updated with the total number of duplicates, using a single query.
Is this possible without running in a loop?

update table as t1
inner join (
select
fname,
count(fname) as total
from table
group by fname) as t2
on t1.fname = t2.fname
set t1.duplicates = t2.total
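To see what the single-statement update produces, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The table name `people` and the sample rows are made up for illustration. SQLite has no `UPDATE ... JOIN`, so the sketch uses a correlated subquery instead; MySQL would need the join form above.

```python
import sqlite3

# Hypothetical table "people"; the columns follow the question's sample structure.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (fname TEXT, lname TEXT, duplicates INTEGER)")
conn.executemany(
    "INSERT INTO people (fname, lname) VALUES (?, ?)",
    [("Ann", "Lee"), ("Ann", "Ray"), ("Bob", "Kim"), ("Ann", "Poe")],
)

# One statement, no loop: each row receives the number of rows sharing its fname.
# SQLite accepts a correlated subquery on the table being updated.
conn.execute(
    """
    UPDATE people
    SET duplicates = (
        SELECT COUNT(*) FROM people AS p2 WHERE p2.fname = people.fname
    )
    """
)

counts = conn.execute(
    "SELECT fname, lname, duplicates FROM people ORDER BY lname"
).fetchall()
```

Every "Ann" row ends up with 3 and the single "Bob" row with 1, so a value of 1 means "no duplicates".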

I have a table which has several thousand records. I want to update all the records which have a duplicate firstname. How can I achieve this with a single query?
Are you absolutely sure you want to store the number of the so-called duplicates? If not, it's a rather simple query:
SELECT fname, COUNT(1) AS number FROM yourtable GROUP BY fname;
I don't see why you would want to store that number though. What if there's another record inserted? What if there are records deleted? The "number of duplicates" will remain the same, and therefore will become incorrect at the first mutation.

Create the column first, then write a query like:
UPDATE table SET duplicates = (SELECT r.total FROM (SELECT Fname, COUNT(*) AS total FROM table GROUP BY Fname) AS r WHERE r.Fname = table.Fname)
Maybe this other SO question will help?
How do I UPDATE from a SELECT in SQL Server?

You might not be able to do this with a plain subquery. MySQL does not let a subquery read directly from the table that the same statement is updating. Joining against a derived table avoids that restriction.

Related

Duplicate records in MySQL. EXISTS check for the same data not working properly?

SELECT EXISTS
(SELECT * FROM table WHERE deleted_at IS NULL AND the_date = '$the_date' AND company_name = '$company_name' AND purchase_country = '$p_country' AND lot = '$lot_no') AS numofrecords
What is wrong with this mysql query?
It is still allowing duplicate inserts (1 out of 1000 records). Around 100 users are making entries, so the traffic is not that big, I assume. I do not have access to the database metrics, so I cannot be sure.
The EXISTS condition is normally used in a WHERE clause. In your case, the first select doesn't specify the table and the condition.
One example:
SELECT *
FROM customers
WHERE EXISTS (SELECT *
FROM order_details
WHERE customers.customer_id = order_details.customer_id);
Try writing your statement like this, and if it returns duplicated data, just use DISTINCT. (SELECT DISTINCT * .....)
Another approach for you:
INSERT INTO your_table SELECT * FROM table GROUP BY your_column_want_to_dupplicate;
The answer from @Nick gave the clues to solve the issue. A separate EXISTS check followed by an INSERT was not the best way: two users could both run the INSERT if each of them got 0 from the check. A single-statement query with INSERT ... ON DUPLICATE KEY UPDATE ... was the way to go.
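As a rough illustration of why a single statement closes the gap, here is a sketch with Python's sqlite3 module standing in for MySQL. The table name `entries`, the helper `add_entry`, and the sample row are hypothetical, and the soft-delete column from the question is left out for brevity. It assumes a unique index over the four columns, which is also what MySQL needs before ON DUPLICATE KEY can fire.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the columns mirror the ones in the question's WHERE clause.
conn.execute(
    """
    CREATE TABLE entries (
        the_date TEXT, company_name TEXT, purchase_country TEXT, lot TEXT,
        UNIQUE (the_date, company_name, purchase_country, lot)
    )
    """
)

def add_entry(row):
    """Insert the row unless the same key already exists; return True if it was added."""
    # INSERT OR IGNORE is SQLite's spelling. The check and the write are one
    # statement, so no second writer can slip in between them.
    cur = conn.execute("INSERT OR IGNORE INTO entries VALUES (?, ?, ?, ?)", row)
    return cur.rowcount == 1

row = ("2020-01-01", "Acme", "DE", "L-42")
first = add_entry(row)   # the key is new, so the row is stored
second = add_entry(row)  # same key again, rejected by the unique index
total = conn.execute("SELECT COUNT(*) FROM entries").fetchone()[0]
```

The important part is the unique index: the database itself refuses the second row, no matter how many clients race to insert it.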

MYSQL Delete From Based on Multiple Distinct Columns

I have this problem that's been killing me for a couple days now.
So we have a table of all processed orders.
We have a table for all orders that come in.
We need to effectively cross-reference the orders in the new table, which is continually updating, against the orders already completed in the primary table so that we don't complete the same order multiple times.
After we get a batch of new orders, this is the query that I currently run in an attempt to cross reference it with the table of completed orders:
$sql = "DELETE
FROM
`orders_new`
WHERE
`order` IN (
SELECT DISTINCT
`order`
FROM
`orders_all`
)
AND `name` IN (
SELECT DISTINCT
`name`
FROM
`orders_all`
)
AND `jurisdiction` IN (
SELECT DISTINCT
`jurisdiction`
FROM
`orders_all`
)";
As you can probably tell, I want to delete rows from the "orders_new" table where a row with the same order, name, and jurisdiction already exists in the "orders_all" table.
Is this the right way to handle this sort of query?
Well, the right way depends on many things.
But first, I do not like your division into two tables. In that case I would introduce a column identifying the state, which would reference a table of possible states: "new", "in process", "completed". That way each order is stored as only one record, as it should be.
Your query might be OK, but you should check the performance.
Take a look at: https://sqlperformance.com/2012/12/t-sql-queries/left-anti-semi-join
Not exactly your case but very similar.
Another thing: why do you use DISTINCT? That would imply that "order" is not a unique identifier.
Based on your edit, you identify the order with the composite key "order", "name", "jurisdiction". Is this really the key, the whole key and nothing but the key, so help you Codd? If not, you could delete a bunch of records. But even so, your query would delete all orders for which the order, name and jurisdiction can be found in the orders_all table IN DIFFERENT RECORDS. So your query is wrong.
Saying that, a variant of your query might be
DELETE orders_new
FROM
orders_new
INNER JOIN
orders_all ON orders_all.`order` = orders_new.`order`
AND orders_all.name = orders_new.name
AND orders_all.jurisdiction = orders_new.jurisdiction
But, the real problem is your ER model.
No, your query will delete any record where there are any records with the same order, name, and jurisdiction, even if those records are different from one another. In other words, a row in orders_new will be deleted if one row in orders_all has the same order, a different one has the same name, and a third one has the same jurisdiction. You are very likely to delete way more than you want to. Instead, this would be more appropriate:
DELETE FROM `orders_new`
WHERE (`order`, `name`, `jurisdiction`) IN (
SELECT `order`, `name`, `jurisdiction`
FROM `orders_all`
)
or maybe
DELETE FROM `orders_new`
WHERE EXISTS (
SELECT 1
FROM `orders_all` AS oa
WHERE oa.`order` = `orders_new`.`order`
AND oa.`name` = `orders_new`.`name`
AND oa.`jurisdiction` = `orders_new`.`jurisdiction`
)
You should convert that to a DELETE - JOIN construct like
DELETE `orders_new`
FROM `orders_new`
INNER JOIN `orders_all` ON `orders_new`.`order` = `orders_all`.`order`
AND `orders_new`.`name` = `orders_all`.`name`
AND `orders_new`.`jurisdiction` = `orders_all`.`jurisdiction`;
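To make the difference between the three independent IN checks and a row-wise match concrete, here is a small sketch using Python's sqlite3 module in place of MySQL. The sample rows are invented, and the queries count the rows each condition would delete instead of deleting them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders_all ("order" TEXT, name TEXT, jurisdiction TEXT);
    CREATE TABLE orders_new ("order" TEXT, name TEXT, jurisdiction TEXT);
    -- Three completed orders; no single one equals ('A1', 'Bob', 'TX').
    INSERT INTO orders_all VALUES ('A1', 'Ann', 'NY'), ('B2', 'Bob', 'CA'), ('C3', 'Cy', 'TX');
    -- The first new order is genuinely new, the second is a true repeat.
    INSERT INTO orders_new VALUES ('A1', 'Bob', 'TX'), ('B2', 'Bob', 'CA');
    """
)

# The question's condition: each column is checked against orders_all separately.
independent_in = conn.execute(
    """
    SELECT COUNT(*) FROM orders_new
    WHERE "order" IN (SELECT "order" FROM orders_all)
      AND name IN (SELECT name FROM orders_all)
      AND jurisdiction IN (SELECT jurisdiction FROM orders_all)
    """
).fetchone()[0]

# The row-wise condition: all three columns must match within one orders_all row.
row_wise = conn.execute(
    """
    SELECT COUNT(*) FROM orders_new AS n
    WHERE EXISTS (
        SELECT 1 FROM orders_all AS oa
        WHERE oa."order" = n."order"
          AND oa.name = n.name
          AND oa.jurisdiction = n.jurisdiction
    )
    """
).fetchone()[0]
```

With this data the independent checks match both new orders, including the one that was never completed, while the row-wise condition matches only the true repeat.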

Insert distinct records in the table while updating the remaining columns

This is actually a form to update the team members who work for a specific client. When I deselect a member, its status turns to 0.
I have a table with all unique records. The table consists of four columns:
The first column is `id`, which is unique and auto-incremented.
The second column is `client_id`.
The third column is `member_id` (the second and third columns together make the primary key).
The fourth column is `current`, which shows the status (default is 1).
Now I have a form which sends the values of client_id and member_id. But this form also contains values that are already in the table, BUT NOT ALL of them.
I need a query which
(i) `INSERT`s the values that are not already in the table,
(ii) `UPDATE`s the `current` column to `0` for rows which are in the table but not in the form values.
First of all, check whether the value exists in the table by using a SELECT query.
Then, if the result contains no matching row, insert the value; otherwise show an error.
This would be a great time to create a database stored procedure that flows something like...
select user
if exists update row
else insert new row
Stored procedures don't improve transaction times, but they are a great addition to any piece of software.
If this doesn't solve your problem then a database trigger might help out.
Doing a little research on this matter might open up some great ideas!
Add the logic below to your stored procedure:
IF (SELECT COUNT(*) FROM yourtable WHERE client_id = <value> AND member_id = <value>) > 0 THEN
UPDATE yourtable SET current = 0 WHERE client_id = <value> AND member_id = <value>;
ELSE
INSERT INTO yourtable (client_id, member_id, current) VALUES (value1, value2, value3);
END IF;
If you want a simple solution, follow this:
*) Run a SELECT for each entry in the selected team.
If the SELECT returns a row
then run an UPDATE
else
run an INSERT.
In your case member_id and client_id together make the primary key.
So you can use MySQL's INSERT ... ON DUPLICATE KEY UPDATE syntax.
Example:
$sql="INSERT INTO table_name SET
client_id='".$clientId."',
member_id='".$member_id."',
current='".$current."'
ON DUPLICATE KEY
UPDATE
current = '".$current."'
";
In this case, when a member_id and client_id combination repeats, MySQL automatically executes the update for that particular row instead of inserting a new one.
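Here is one possible way to cover both requirements (i) and (ii) from the question, sketched with Python's sqlite3 module instead of PHP and MySQL. The table name `client_members` and the helper `save_team` are hypothetical. SQLite spells the upsert as `ON CONFLICT ... DO UPDATE` (available from SQLite 3.24), which plays the role of MySQL's ON DUPLICATE KEY UPDATE here. Values are passed as bound parameters, not concatenated into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE client_members (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        client_id INTEGER NOT NULL,
        member_id INTEGER NOT NULL,
        current INTEGER NOT NULL DEFAULT 1,
        UNIQUE (client_id, member_id)
    )
    """
)
# Existing team of (made-up) client 7: members 1 and 2.
conn.executemany(
    "INSERT INTO client_members (client_id, member_id) VALUES (?, ?)",
    [(7, 1), (7, 2)],
)

def save_team(client_id, selected_member_ids):
    """Make `selected_member_ids` the current team of `client_id`."""
    with conn:  # one transaction: both steps commit together or not at all
        # (ii) Switch every member of this client off first ...
        conn.execute(
            "UPDATE client_members SET current = 0 WHERE client_id = ?",
            (client_id,),
        )
        # (i) ... then insert each selected member, or switch an existing row back on.
        conn.executemany(
            """
            INSERT INTO client_members (client_id, member_id, current) VALUES (?, ?, 1)
            ON CONFLICT (client_id, member_id) DO UPDATE SET current = 1
            """,
            [(client_id, m) for m in selected_member_ids],
        )

save_team(7, [2, 3])  # member 1 deselected, member 3 newly selected
team = conn.execute(
    "SELECT member_id, current FROM client_members WHERE client_id = 7 ORDER BY member_id"
).fetchall()
```

After the call, member 1 is kept with `current = 0`, member 2 stays at 1, and member 3 is inserted with 1.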

Updating column with sum(data) from other table

OK, I'm drawing a blank here and in dire need of your help!
3 tables:
matches (id, goals_slot_1, goals_slot_2, won, draw)
teams (id, name, score_for, score_against, won, lost, draw, points)
team-match (junction table) (team_id, match_id)
So what I want to achieve is to update the 'draw' column in the teams table, setting it to the 'sum(draw)' of the matches table for the corresponding teams.
The value of 'draw' in the matches table is '1' when it's a draw, '0' when not.
I just can't figure it out anymore. Stuck on it for days...
Can someone put me on the right track?
You would need to use a correlated sub query to get the values from the other tables. Something like:
UPDATE `teams`
SET `draw`=(SELECT SUM(`draw`)
FROM `matches`
WHERE `id` IN (SELECT `match_id`
FROM `team-match`
WHERE `team_id`=`teams`.`id`))
Or even a single sub query with a join:
UPDATE `teams`
SET `draw`=(SELECT SUM(`draw`)
FROM `matches`
JOIN `team-match`
ON `team-match`.`match_id`=`matches`.`id`
WHERE `team-match`.`team_id`=`teams`.`id`)
Both should do the work. I would assume the first is better for performance, but I haven't tested it, and really they should be within a few milliseconds of each other. Other than this, you would need to use PHP to query the values and update the individual rows. Really though, the won/lost/draw columns could be calculated on the fly with similar performance, and you wouldn't have to update the values after every match.
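A quick way to try the join variant is the sketch below, which uses Python's sqlite3 module as a stand-in for MySQL, with invented teams and matches. One detail it adds as an assumption about what is wanted: SUM over zero rows yields NULL, so the subquery is wrapped in COALESCE to give teams without matches a 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE matches (id INTEGER PRIMARY KEY, draw INTEGER);
    CREATE TABLE teams (id INTEGER PRIMARY KEY, name TEXT, draw INTEGER);
    CREATE TABLE "team-match" (team_id INTEGER, match_id INTEGER);
    -- Matches 1 and 3 ended in a draw, match 2 did not.
    INSERT INTO matches VALUES (1, 1), (2, 0), (3, 1);
    INSERT INTO teams (id, name) VALUES (10, 'Reds'), (20, 'Blues'), (30, 'Greens');
    -- Reds played all three matches, Blues played 1 and 2, Greens have not played yet.
    INSERT INTO "team-match" VALUES (10, 1), (10, 2), (10, 3), (20, 1), (20, 2);
    """
)

# Correlated subquery: for each team row, sum the draw flags of its own matches.
conn.execute(
    """
    UPDATE teams
    SET draw = COALESCE((
        SELECT SUM(m.draw)
        FROM matches AS m
        JOIN "team-match" AS tm ON tm.match_id = m.id
        WHERE tm.team_id = teams.id
    ), 0)
    """
)
draws = conn.execute("SELECT name, draw FROM teams ORDER BY id").fetchall()
```

The same SELECT, run without the UPDATE around it, is what "calculated on the fly" would look like.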

mysql join two tables likes and posts

Here is my php/MySQL task:
I have a table POSTS that contains num field that is the primary key and other information fields about the post (author, title, etc.). I also have a table LIKE that contains a userId field that is the primary key and a field POST that corresponds to the num field in posts. Given a specific userID, I need to get all of the rows from the POSTS table that the userId 'likes'.
Table 1 - posts
-num
-author
-title
Table 2 - likes
-userId
-postId
This is all in PHP, so my first idea was to get all of the rows from the LIKES table where the userId matches the one given and store those rows in an array. Then I would iterate through the array and, for each row, fetch the row of the POSTS table where postId=POSTS.num. However, this seems like it would be rather slow, especially since each iteration through the array would be a separate MySQL query.
I am assuming there is a faster way. Would it be to use a temporary table, or is there a better way to join the tables? I have to assume that both tables contain many rows. I am a MySQL novice, so if there is a better solution please explain why it is better. Thank you in advance for your help!
Try the following query:
SELECT
`posts`.*
FROM
`likes`
INNER JOIN
`posts` ON
`posts`.num = `likes`.postId
WHERE
`likes`.Userid = {insert user id here}
Depending on your schema (not sure if each record in 'likes' has to be unique), you may want to use the DISTINCT keyword on your select to filter out duplicates.
SELECT poli.* FROM (
SELECT po.* FROM posts po
JOIN likes li
ON li.postId = po.num
WHERE li.userId = '$yourGivenUserId'
) AS poli
$yourGivenUserId is the given userId.
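For comparison with the loop described in the question, here is a minimal sketch of the single-query join, written with Python's sqlite3 module in place of PHP and MySQL. The sample rows and the helper `liked_posts` are made up. The user id is passed as a bound parameter, which also avoids pasting it into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE posts (num INTEGER PRIMARY KEY, author TEXT, title TEXT);
    CREATE TABLE likes (userId INTEGER, postId INTEGER);
    INSERT INTO posts VALUES (1, 'ann', 'First'), (2, 'bob', 'Second'), (3, 'cy', 'Third');
    INSERT INTO likes VALUES (42, 1), (42, 3), (99, 2);
    """
)

def liked_posts(user_id):
    """Return every post row liked by `user_id`, fetched with a single query."""
    return conn.execute(
        """
        SELECT posts.*
        FROM likes
        INNER JOIN posts ON posts.num = likes.postId
        WHERE likes.userId = ?
        ORDER BY posts.num
        """,
        (user_id,),
    ).fetchall()

liked = liked_posts(42)
```

One round trip to the database replaces the one-query-per-like loop, and the database can use the primary key on posts.num to find the matching rows.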
