Find and remove MySQL row duplicates that are right after each other

I'm trying to find and remove MySQL row duplicates that are right after each other, instead of finding all, even if they're not straight after each other.
SELECT DISTINCT(content) AS contentMsg, COUNT(*) AS cnt, `ticketId`,`date`
FROM ticketsReplies
WHERE username = 'X'
GROUP BY contentMsg, ticketId
HAVING cnt > 1
ORDER BY cnt DESC
This is my current query. However, it flags a reply as a duplicate whenever the same answer appears twice in one ticket, even when the two rows do not have consecutive IDs. Consecutive duplicates are the ones I'm after: they can happen if you send a POST request, it fails, and you refresh.
How would I go about finding only the ones whose IDs are 1 apart?
So finding e.g. 1,2,3,4,5,6,7 instead of 1,3,9,11
E.g. if you have
ID EMAIL
---------------------- --------------------
1 aaa
2 bbb
3 bbb
4 bbb
5 ddd
6 eee
7 aaa
8 aaa
9 bbb
If you have this, it should find the following IDs:
2,3,4 but not 9, as it's not directly after 4 even though it's a duplicate.
It should also find 7,8 but not 1, as they are not right after each other.

E.g.:
SELECT id
FROM
( SELECT x.id FROM my_table x JOIN my_table y ON y.email = x.email AND y.id = x.id + 1 ) a
UNION
( SELECT y.id FROM my_table x JOIN my_table y ON y.email = x.email AND y.id = x.id + 1 );

If there are gaps in your id list (eg 5, 6, 9, 11), simply comparing id = id+1 wouldn't work. The solution I came up with is to create two identical temporary tables with sequential row-numbers. In that case you can safely compare the rows based on their number, even if the id's have gaps.
DELETE FROM tab WHERE id IN (
  SELECT A.id
  FROM
  (
    SELECT row_nr, id, email FROM (
      SELECT
        (@cnt1 := @cnt1 + 1) AS row_nr,
        t.id, t.email
      FROM tab AS t
      CROSS JOIN (SELECT @cnt1 := 0) AS d
      ORDER BY t.id
    ) x
  ) A
  INNER JOIN
  (
    SELECT row_nr, id, email FROM (
      SELECT
        (@cnt2 := @cnt2 + 1) AS row_nr,
        t.id, t.email
      FROM tab AS t
      CROSS JOIN (SELECT @cnt2 := 0) AS d
      ORDER BY t.id
    ) x
  ) B
  ON A.row_nr-1 = B.row_nr AND A.email=B.email
)
The two (SELECT row_nr, id, email FROM ... ) x parts create two identical tables A and B like
row_nr id email
1 1 aaa
2 4 aaa
3 5 bbb
4 9 aaa
5 11 aaa
Then you can compare the sequential row-nr's and email:
ON A.row_nr-1 = B.row_nr AND A.email=B.email
Selecting the result-id's gives you the id's 4, 11 which are the duplicates. Then you can delete those id's:
DELETE FROM tab WHERE id IN ( ... )
Here is a Fiddle to test the SELECT part.
NOTE: Before you try this at home, please backup your table!
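
As a rough, self-contained way to try the idea on the sample rows above (ids 1, 4, 5, 9, 11), here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL. It is not the query from this answer: instead of numbering rows with user variables, it looks up the closest smaller id with a correlated subquery, which is an assumption of mine that should also be valid MySQL. The table name `tab` comes from the answer; everything else is illustrative.

```python
import sqlite3

# Hypothetical in-memory table mirroring the example above: ids with gaps.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO tab (id, email) VALUES (?, ?)",
    [(1, "aaa"), (4, "aaa"), (5, "bbb"), (9, "aaa"), (11, "aaa")],
)

# For each row, find the row with the closest smaller id and compare emails.
# The first row has no predecessor (the subquery yields NULL), so it never matches.
FIND_CONSECUTIVE_DUPLICATES = """
    SELECT cur.id
    FROM tab AS cur
    JOIN tab AS prev
      ON prev.id = (SELECT MAX(p.id) FROM tab AS p WHERE p.id < cur.id)
     AND prev.email = cur.email
    ORDER BY cur.id
"""
duplicate_ids = [row[0] for row in conn.execute(FIND_CONSECUTIVE_DUPLICATES)]

# Delete in a separate step, so the SELECT never reads a table that is being modified.
conn.executemany("DELETE FROM tab WHERE id = ?", [(i,) for i in duplicate_ids])
remaining_ids = [row[0] for row in conn.execute("SELECT id FROM tab ORDER BY id")]

print(duplicate_ids, remaining_ids)
```

Collecting the ids first and deleting afterwards also makes it easy to inspect what would be removed before anything is changed.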

Related

Using a column from 2 levels above in a union subquery

I have a very complicated query which involves a subquery, and this subquery uses a union as its table. I want to use a column from the first level (a field outside the subquery) as part of the WHERE clause inside the union. Like this:
SELECT
type,
registered_number - (
SELECT
MAX(last)
FROM (
SELECT
MAX(b) as last
FROM
x
WHERE
a = type
UNION ALL
SELECT
MAX(b) as last
FROM
y
WHERE
a = type
) as last_table
) as last
FROM `x`;
Sample data
Table x
a b
1 25
2 26
3 27
Table y
a b
1 25
2 24
3 31
Table s
id type registered_number
1 1 7
2 2 8
3 3 9
Expected result
type last
1 18
2 18
3 22

I suggest doing a union of the x and y tables first, then joining s to an aggregate of the union subquery. Subtracting registered_number from the maximum gives the values in your expected result.
SELECT s.type, t.b - s.registered_number AS last
FROM s
INNER JOIN
(
SELECT a, MAX(b) AS b
FROM
(
SELECT a, b FROM x
UNION ALL
SELECT a, b FROM y
) u
GROUP BY a
) t
ON t.a = s.type
ORDER BY s.type;
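
To check the approach against the sample data from the question, here is a small sketch using Python's sqlite3 module (an assumption for easy local testing; the question targets MySQL). The tables x, y and s and their columns come from the question; the aliases `u`, `m` and `max_b` are illustrative. It treats "last" as MAX(b) minus registered_number, which is what the expected result (18, 18, 22) implies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x (a INTEGER, b INTEGER);
    CREATE TABLE y (a INTEGER, b INTEGER);
    CREATE TABLE s (id INTEGER, type INTEGER, registered_number INTEGER);
    INSERT INTO x VALUES (1, 25), (2, 26), (3, 27);
    INSERT INTO y VALUES (1, 25), (2, 24), (3, 31);
    INSERT INTO s VALUES (1, 1, 7), (2, 2, 8), (3, 3, 9);
""")

# Union first, aggregate once per value of a, then join the aggregate to s.
# No correlated reference from inside the union is needed this way.
QUERY = """
    SELECT s.type, m.max_b - s.registered_number AS last
    FROM s
    JOIN (
        SELECT a, MAX(b) AS max_b
        FROM (SELECT a, b FROM x UNION ALL SELECT a, b FROM y) AS u
        GROUP BY a
    ) AS m ON m.a = s.type
    ORDER BY s.type
"""
rows = conn.execute(QUERY).fetchall()
print(rows)
```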

I want to sort my CatId in this order 1 2 3 4 1 2 3 4 but there is a catch that it first check allocationId last 14 digits if catid is same

My query is this:-
SELECT m.allocationID,mt.CatId,mt.CatSName
FROM msttransaction m,msttemp mt WHERE m.isPending='Y'
AND m.allocationID IN (
SELECT mt.AllocationId FROM msttemp WHERE mt.quarterId='010100001'
) ORDER BY SUBSTRING(m.AllocationId, -14)
output:-
12980013120170919125006 1 A
12980013320170919125404 3 C
12980013420170919125603 4 D
12980013820170919130113 2 B
12980013920170919130315 3 C
12980014020170919130519 4 D
12980013220170919130613 2 B
12980013720170919130722 1 A
In the 129800 series, the last 14 digits of the allocationId are a date and time. I want the output ordered by CatId in repeating cycles of 1 2 3 4, and within the same CatId the row with the earlier date and time (last 14 digits) must come first.
Expected output
12980013120170919125006 1 A
12980013820170919130113 2 B
12980013320170919125404 3 C
12980013420170919125603 4 D
12980013720170919130722 1 A
12980013220170919130613 2 B
12980013920170919130315 3 C
12980014020170919130519 4 D

Does this do it?
ORDER BY mt.CatId, SUBSTRING(m.AllocationId, -14)
The point is to order first by CatId, then by the datestamp embedded in the AllocationId.
If it were me I'd be more formal about the datestamp ordering, by extracting it from your strings using STR_TO_DATE() like this. This isn't strictly necessary, but it's good practice anyhow.
ORDER BY mt.CatId, STR_TO_DATE(SUBSTRING(m.AllocationId, -14), '%Y%m%d%H%i%s')
Plus, then you could use date manipulation like
ORDER BY mt.CatId, LAST_DAY(STR_TO_DATE(SUBSTRING(m.AllocationId, -14), '%Y%m%d%H%i%s'))
to gather all the datestamps in a month together for ordering, or some such thing.
Here's an example: http://sqlfiddle.com/#!9/1c87ee/6/0

You can introduce a variable that will give you the order of the CatId when the result set would have been ordered by CatId and then the 14 last digits of the AllocationId. So in your example this variable would be either 1 or 2.
Once you have that variable value for each record, you can sort by that value first, and then by CatId:
select t.allocationId, CatId
from (
select t.*,
@rn := if(@CatId = CatId, @rn+1, if(@CatId := CatId, 1, 1)) rn
from (
SELECT m.allocationID, mt.CatId, mt.CatSName
FROM msttransaction m
INNER JOIN msttemp mt
ON m.allocationID = mt.AllocationId
WHERE m.isPending='Y'
AND mt.quarterId='010100001'
ORDER BY CatId, SUBSTRING(m.AllocationId, -14)
) t,
(select @CatId := -1, @rn := -1) init
) t
order by rn, CatId;
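
The same "rank within CatId, then sort by rank and CatId" idea can be tried without user variables. The sketch below uses Python's sqlite3 module and a correlated COUNT(*) to compute the rank; both are assumptions of mine for a quick local check, not the query from this answer. The single table `alloc` is a hypothetical stand-in for the joined result, loaded with the rows from the question's output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flat table standing in for the msttransaction/msttemp join.
conn.execute("CREATE TABLE alloc (allocationId TEXT, CatId INTEGER)")
conn.executemany(
    "INSERT INTO alloc VALUES (?, ?)",
    [
        ("12980013120170919125006", 1),
        ("12980013320170919125404", 3),
        ("12980013420170919125603", 4),
        ("12980013820170919130113", 2),
        ("12980013920170919130315", 3),
        ("12980014020170919130519", 4),
        ("12980013220170919130613", 2),
        ("12980013720170919130722", 1),
    ],
)

# rn = position of the row within its CatId when ordered by the last 14 digits.
# Sorting by (rn, CatId) then yields the cycles 1 2 3 4, 1 2 3 4.
QUERY = """
    SELECT t.allocationId, t.CatId,
           (SELECT COUNT(*)
            FROM alloc AS o
            WHERE o.CatId = t.CatId
              AND substr(o.allocationId, -14) <= substr(t.allocationId, -14)) AS rn
    FROM alloc AS t
    ORDER BY rn, t.CatId
"""
ordered = [(row[0], row[1]) for row in conn.execute(QUERY)]
for allocation_id, cat_id in ordered:
    print(allocation_id, cat_id)
```

The correlated count assumes the datestamps are unique within a CatId; ties would receive the same rank.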

PHP - Deleting duplicate values with same ref

I have the following MySQL table called store
id ref item_no supplier
1 10 x1 usa
2 10 x1 usa
3 11 x1 china
4 12 x2 uk
5 12 x3 uk
6 13 x3 uk
7 13 x3 uk
Now what I'm expecting the output to be is as follows:
id ref item_no supplier
1 10 x1 usa
3 11 x1 china
4 12 x2 uk
5 12 x3 uk
6 13 x3 uk
As you can see, item_no x1 and x3 each appear twice with the same ref and supplier, so what I want is to delete the duplicate records in order to keep only one row per item_no and ref.
I've created this PHP code, which only SELECTs results:
$query1 = "SELECT
DISTINCT(item_no) AS field,
COUNT(item_no) AS fieldCount,
COUNT(ref) AS refcount
FROM
store
GROUP BY item_no HAVING fieldCount > 1";
$result1 = mysql_query($query1);
if(mysql_num_rows($result1)>0){
while ($row1=mysql_fetch_assoc($result1)) {
echo $row1['field']."<br /><br />";
}
} else {
//IGNORE
}
How can I make the query SELECT the duplicate records according to my needs, before I create the DELETE query?
Thanks Guys

You can use the following query to produce the required result set:
SELECT t1.*
FROM store AS t1
JOIN (
SELECT MIN(id) AS id, ref, item_no
FROM store
GROUP BY ref, item_no
) AS t2 ON t1.id > t2.id AND t1.ref = t2.ref AND t1.item_no = t2.item_no
Demo here
To DELETE you can use:
DELETE t1
FROM store AS t1
JOIN (
SELECT MIN(id) AS id, ref, item_no
FROM store
GROUP BY ref, item_no
) AS t2 ON t1.id > t2.id AND t1.ref = t2.ref AND t1.item_no = t2.item_no
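
A quick way to convince yourself that this keeps exactly the rows from the expected output is to run the same join against the sample data. The sketch below does that with Python's sqlite3 module, which is an assumption for local testing (the question uses MySQL and PHP); the table and column names come from the question, while the two-step preview-then-delete flow is only illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE store (id INTEGER PRIMARY KEY, ref INTEGER, item_no TEXT, supplier TEXT)"
)
conn.executemany(
    "INSERT INTO store VALUES (?, ?, ?, ?)",
    [
        (1, 10, "x1", "usa"),
        (2, 10, "x1", "usa"),
        (3, 11, "x1", "china"),
        (4, 12, "x2", "uk"),
        (5, 12, "x3", "uk"),
        (6, 13, "x3", "uk"),
        (7, 13, "x3", "uk"),
    ],
)

# Preview: every row whose id is larger than the smallest id of its (ref, item_no) group.
PREVIEW = """
    SELECT t1.id
    FROM store AS t1
    JOIN (
        SELECT MIN(id) AS id, ref, item_no
        FROM store
        GROUP BY ref, item_no
    ) AS t2 ON t1.id > t2.id AND t1.ref = t2.ref AND t1.item_no = t2.item_no
    ORDER BY t1.id
"""
to_delete = [row[0] for row in conn.execute(PREVIEW)]

# Delete only the previewed ids, then list what is left.
conn.executemany("DELETE FROM store WHERE id = ?", [(i,) for i in to_delete])
kept = [row[0] for row in conn.execute("SELECT id FROM store ORDER BY id")]
print(to_delete, kept)
```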

To find only duplicate records you can use
SELECT * FROM store WHERE id NOT IN
(SELECT id FROM store AS outerStore WHERE id =
(SELECT MAX(id) FROM store AS innerStore
WHERE outerStore.ref = innerStore.ref AND
outerStore.supplier = innerStore.supplier AND outerStore.item_no = innerStore.item_no))
Maybe long, but it should work.

If you want to select the rows to delete, use
select * from store
where id not in (
select max(id) from store
group by ref, item_no, supplier);
Or you can delete them directly. MySQL does not let a DELETE read from its own target table in a plain subquery, so wrap the subquery in a derived table:
delete from store
where id not in (
select id from (
select max(id) as id from store
group by ref, item_no, supplier
) as keep_rows);

Calculate net change in ranking after "update", MYSQL

I have two MySQL tables:
Table-1
id catid title user_rating
123 8 title-one 3
321 8 title-two 5
and
Table-2
listing_id title user_rating
123 title-one 3
321 title-two 5
Plus, I have this query that calculates the current rank of each "title" based on "user_rating".
SELECT
MAX(x.rank) AS rank
FROM
(SELECT
a.id,
a.catid,
a.title,
b.listing_id,
@rank:=@rank + 1 AS rank
FROM
`table-1` a
INNER JOIN `table-2` b ON a.id = b.listing_id
CROSS JOIN (SELECT @rank:=0) r
WHERE
catid = '8'
ORDER BY user_rating DESC) x
WHERE
id = 123
Now, my issue: I want to calculate the difference in "ranking" (rank) when I update the "user_rating" value.
Please note: the "user_rating" value is updated by a PHP script that allows users to vote for a specific piece of content (range 1 to 5, step 0.5).
What's the best way to get the difference between the "previous rank" and "current rank" after the update?
Thanks in advance to all.

get SUM of another row of GROUP BY'd rows

I have this table:
This selection is duplicated many times for different var_lines (each of which works like one row of data, or one respondent to a survey) and set_codes (different survey codes).
With this query:
SELECT
*, COUNT(*) AS total
FROM
`data`
WHERE
`var_name` = 'GND.NEWS.INT'
AND(
`set_code` = 'BAN11A-GND'
OR `set_code` = 'BAN09A-GND'
OR `set_code` = 'ALG11A-GND'
)
AND `country_id` = '5'
GROUP BY
`data_content`,
`set_code`
ORDER BY
`set_code`,
`data_content`
The query basically counts the number of answers for a specific question, then groups them by survey (set_code).
What I need is for each of the grouped data_content answers for GND.NEWS.INT to also show the SUM of all the corresponding GND_WT with the same var_line.
For example if I had this:
data_id data_content var_name var_line
1 2 GND.NEW.INT 1
2 1.4 GND_WT 1
3 2 GND.NEW.INT 2
4 1.6 GND_WT 2
5 3 GND.NEW.INT 3
6 0.6 GND_WT 3
I would get something like this:
data_id data_content var_name var_line total weight
1 2 GND.NEW.INT 1 2 3
5 3 GND.NEW.INT 3 1 0.6
Thanks for any help.

Your requirements are not exactly clear, but I think the following gives you what you want:
select d1.data_id,
d1.data_content,
d1.var_name,
d1.var_line,
t.total,
w.weight
from data d1
inner join
(
select data_content,
count(data_content) Total
from data
group by data_content
) t
on d1.data_content = t.data_content
inner join
(
select var_line,
sum(case when var_name = 'GND_WT' then data_content end) weight
from data
group by var_line
) w
on d1.var_line = w.var_line
where d1.var_name = 'GND.NEW.INT'
See SQL Fiddle with Demo

This Query can be suitable for your specific example:
select st.data_id,
st.data_content,
st.var_name,
st.var_line,
count(st.data_id) as total,
sum(st1.data_content) as weight
from data st
left join data st1 on st1.var_name = 'GND_WT' AND st1.var_line=st.var_line
where st.var_name='GND.NEW.INT'
group by st.data_content
Regards,
Luis.
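
For anyone who wants to check the self-join idea against the six sample rows from the question, here is a minimal sketch using Python's sqlite3 module (an assumption for local testing, not MySQL). It assumes each var_line has exactly one GND_WT row, stores data_content as a number for simplicity, and uses the illustrative aliases `q`, `w` and `first_id`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (data_id INTEGER, data_content REAL, var_name TEXT, var_line INTEGER)"
)
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?)",
    [
        (1, 2, "GND.NEW.INT", 1),
        (2, 1.4, "GND_WT", 1),
        (3, 2, "GND.NEW.INT", 2),
        (4, 1.6, "GND_WT", 2),
        (5, 3, "GND.NEW.INT", 3),
        (6, 0.6, "GND_WT", 3),
    ],
)

# Pair every answer row (q) with the weight row (w) of the same var_line,
# then count the answers and sum the weights per distinct answer value.
QUERY = """
    SELECT MIN(q.data_id) AS first_id,
           q.data_content,
           COUNT(*) AS total,
           SUM(w.data_content) AS weight
    FROM data AS q
    JOIN data AS w
      ON w.var_line = q.var_line AND w.var_name = 'GND_WT'
    WHERE q.var_name = 'GND.NEW.INT'
    GROUP BY q.data_content
    ORDER BY first_id
"""
result = conn.execute(QUERY).fetchall()
for first_id, content, total, weight in result:
    print(first_id, content, total, round(weight, 2))
```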
