SQL Removing duplicates one row at a time


I have a table where I save all row-changes that have ever occurred. The problem is that in the beginning of the application there was a bug that made a bunch of copies of every row.
The table looks something like this:
copies
|ID |CID |DATA
| 1 | 1 | DA
| 2 | 2 | DO
| 2 | 3 | DO (copy of CID 2)
| 1 | 4 | DA (copy of CID 1)
| 2 | 5 | DA
| 1 | 6 | DA (copy of CID 1)
| 2 | 7 | DO
CID is UNIQUE in table copies.
What I want is to remove, within each ID, every row whose DATA duplicates the row directly before it when sorted by CID.
As you can see in the table, CID 2 and 3 are the same and they are after one another. I would want to remove CID 3. The same with CID 4 and CID 6; they have no ID 1 between them and are copies of CID 1.
After duplicates removal, I would like the table to look like this:
copies
|ID |CID |DATA
| 1 | 1 | DA
| 2 | 2 | DO
| 2 | 5 | DA
| 2 | 7 | DO
Any suggestions? :)
I think my question was badly asked because the answer everybody seems to think is the best gives this result:
ID | DATA | DATA | DATA | DATA | DATA | DATA | CID (expected) | CID (Quassnoi) |
1809 | 1 | 0 | 1 | 0 | 0 | NULL | 252227 | 252227 |
1809 | 1 | 0 | 1 | 1 | 0 | NULL | 381530 | 381530 |
1809 | 1 | 0 | 1 | 0 | 0 | NULL | 438158 | (missing) |
1809 | 1 | 0 | 1 | 0 | 1535 | 20090113 | 581418 | 581418 |
1809 | 1 | 1 | 1 | 0 | 1535 | 20090113 | 581421 | 581421 |
CID 252227 and CID 438158 are duplicates, but because CID 381530 comes between them, I want to keep CID 438158. Only duplicates that directly follow one another, when ordered by ID and CID, should be removed.

DELETE c.*
FROM copies c
JOIN (
SELECT id, data, MIN(cid) AS minc
FROM copies
GROUP BY
id, data
) q
ON c.id = q.id
AND c.data = q.data
AND c.cid <> q.minc
Update:
DELETE c.*
FROM (
SELECT cid
FROM (
SELECT cid,
COALESCE(data1 = @data1 AND data2 = @data2, FALSE) AS dup,
@data1 := data1,
@data2 := data2
FROM (
SELECT @data1 := NULL,
@data2 := NULL
) vars, copies ci
ORDER BY
id, cid
) qi
WHERE dup
) q
JOIN copies c
ON c.cid = q.cid
This solution employs MySQL session variables.
There is a pure ANSI solution that would use NOT EXISTS; however, it would be slow because of the way the MySQL optimizer works (it won't use the range access method in a correlated subquery).
See this article in my blog for performance details for quite a close task:
MySQL: difference between sets
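For comparison, here is a minimal sketch of the "compare each row with its immediate predecessor for the same ID" idea from the question. It is not the answer author's query: it runs against an in-memory SQLite database through Python's sqlite3 module (an assumption made so the example is self-contained), and the variable names doomed and remaining are made up for illustration. The rows to delete are selected first and removed by key afterwards, so the comparison always sees the original table.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE copies (id INTEGER, cid INTEGER PRIMARY KEY, data TEXT)")
conn.executemany(
    "INSERT INTO copies (id, cid, data) VALUES (?, ?, ?)",
    [(1, 1, "DA"), (2, 2, "DO"), (2, 3, "DO"), (1, 4, "DA"),
     (2, 5, "DA"), (1, 6, "DA"), (2, 7, "DO")],
)

# Step 1: find rows whose immediate predecessor (same id, next lower cid)
# carries the same data.
doomed = [row[0] for row in conn.execute("""
    SELECT c.cid
    FROM copies c
    JOIN copies p
      ON p.id = c.id
     AND p.cid = (SELECT MAX(q.cid) FROM copies q
                  WHERE q.id = c.id AND q.cid < c.cid)
    WHERE p.data = c.data
    ORDER BY c.cid
""")]

# Step 2: delete them by key, after the comparison has finished.
conn.executemany("DELETE FROM copies WHERE cid = ?", [(cid,) for cid in doomed])

remaining = [row[0] for row in conn.execute("SELECT cid FROM copies ORDER BY cid")]
print(doomed)     # [3, 4, 6]
print(remaining)  # [1, 2, 5, 7]
```

On the sample table from the question this leaves CID 1, 2, 5 and 7, which matches the expected result. A duplicate that is separated from its twin by a different row for the same ID has a non-matching predecessor, so it is kept.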

You can use a count in a subquery for this:
delete from copies
where
(select count(*) from copies s where s.id = copies.id
and s.data = copies.data
and s.cid > copies.cid) > 0

// EDITED for @Jonathan Leffler's comment
//$sql = "SELECT ID,CID,DATA FROM copies ORDER BY CID, ID";
$sql = "SELECT ID,CID,DATA FROM copies ORDER BY ID, CID";
$result = mysql_query($sql, $link);
$data = null;
$id = null;
while ($row = mysql_fetch_row($result)) {
    // Same ID and same DATA as the previous row: a consecutive duplicate.
    if ($id !== null && $row[0] === $id && $row[2] === $data) {
        $sql2 = "DELETE FROM copies WHERE CID=" . (int) $row[1];
        $res = mysql_query($sql2, $link);
    }
    $id = $row[0];
    $data = $row[2];
}
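The single-pass idea behind this loop (walk the rows in ID, CID order and remember the previous row) can be sketched without a database at all. The function name consecutive_duplicate_cids and the tuple layout below are hypothetical, chosen only for this illustration.

```python
def consecutive_duplicate_cids(rows):
    """Return the cids whose data repeats the previous row of the same id.

    rows is an iterable of (id, cid, data) tuples in any order.
    """
    to_delete = []
    previous = None  # (id, data) of the row seen just before, in (id, cid) order
    for row_id, cid, data in sorted(rows, key=lambda r: (r[0], r[1])):
        if previous == (row_id, data):
            to_delete.append(cid)
        previous = (row_id, data)
    return to_delete


# Sample rows from the question, as (ID, CID, DATA).
sample = [(1, 1, "DA"), (2, 2, "DO"), (2, 3, "DO"), (1, 4, "DA"),
          (2, 5, "DA"), (1, 6, "DA"), (2, 7, "DO")]
to_delete = consecutive_duplicate_cids(sample)
print(to_delete)  # [4, 6, 3]
```

The returned CIDs are the ones a DELETE ... WHERE CID = ? statement would then remove. CID 5 and CID 7 are not reported because the row before each of them (for the same ID) holds different data.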

delete from copies where cid in (select max_cid from (select max(cid) as max_cid from copies group by id, data having count(*) > 1) as d)

Related

for each value of one table get and display count number of corresponding values from another table

Unfortunately I have to do this in MySQL / PHP. I looked for three days, and there are like 10,000 explanations of this, but NONE (and I repeat, NONE) works for me. I tried them all. I have to ask, sorry.
I have two tables - articles and control.
table "articles"
------------------
art_id | name |
------------------
1 | aaa |
2 | bbb |
3 | ccc |
4 | ddd |
table "control"
--------------------------------------------
con_id | art_id | data |
--------------------------------------------
1 | 1 | something-a |
2 | 2 | something-b |
3 | 1 | something-a |
4 | 2 | something-c |
5 | 3 | something-f |
art_id exists in both tables. What I want is for the query
"select * from articles order by art_id ASC", displayed in a table,
to also have one cell showing the count for each art_id from table CONTROL.
So I tried join, left join, and inner join, and I get errors. I also tried a for-each and got only one result (for example 2 for everything). The following is semi-right, but it displays the whole array of correct results in every row, and it's not even using a join:
$query = "SELECT art_id, count(*) as counting
FROM control GROUP BY art_id ORDER BY con_id ASC";
$result = mysql_query($query);
while($row=mysql_fetch_array($result)) {
echo $row['counting'];
}
this displays 221 -
-------------------------------------------------
art_id | name | count (this one from control) |
-------------------------------------------------
1 | aaa | 221 |
2 | bbb | 221 |
3 | ccc | 221 |
and it should be:
for art_id(1)=2,
for art_id(2)=2,
for art_id(3)=1
it should be simple - like a count of values from CONTROL table displayed in query regarding the "articles" table...
The result query on page for table articles should be:
"select * from articles order by art_id ASC"
-------------------------------------------------
art_id | name | count (this one from control) |
-------------------------------------------------
1 | aaa | 2 |
2 | bbb | 2 |
3 | ccc | 1 |
So maybe I should go with JOIN, or with JOIN plus a for-each. I tried that too, but then I'm not sure what the proper thing to echo is. All in all, I'm completely lost here. Please help. Thank you.

So imagine this in two steps:
1. Get the counts per art_id from the control table.
2. Using your articles table, pick up the counts from step 1.
That will give you a query that looks like this:
SELECT a.art_id, a.name, b.control_count
FROM articles a
INNER JOIN
(
SELECT art_id, COUNT(*) AS control_count
FROM control
GROUP BY art_id
) b
ON a.art_id = b.art_id;
Which will give you the results you're looking for.
However, instead of using a subquery, you can do it all in one shot:
SELECT a.art_id, a.name, COUNT(b.art_id) AS control_count
FROM articles a
INNER JOIN control b
ON a.art_id = b.art_id
GROUP BY a.art_id, a.name;
SQL Fiddle demo
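One detail worth checking: with INNER JOIN, an article that has no rows in control (art_id 4 in the sample data) drops out of the result. If such articles should appear with a count of 0, a LEFT JOIN variant is one option. The sketch below runs it against in-memory SQLite through Python's sqlite3 module, which is an assumption made to keep the example self-contained; the variable name counts is made up.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE articles (art_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE control (con_id INTEGER PRIMARY KEY, art_id INTEGER, data TEXT);
INSERT INTO articles VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc'), (4, 'ddd');
INSERT INTO control VALUES
    (1, 1, 'something-a'), (2, 2, 'something-b'), (3, 1, 'something-a'),
    (4, 2, 'something-c'), (5, 3, 'something-f');
""")

# COUNT(b.art_id) ignores the NULLs produced by LEFT JOIN for unmatched
# articles, so those articles get a count of 0 instead of disappearing.
counts = conn.execute("""
    SELECT a.art_id, a.name, COUNT(b.art_id) AS control_count
    FROM articles a
    LEFT JOIN control b ON a.art_id = b.art_id
    GROUP BY a.art_id, a.name
    ORDER BY a.art_id ASC
""").fetchall()

for art_id, name, control_count in counts:
    print(art_id, name, control_count)
# 1 aaa 2
# 2 bbb 2
# 3 ccc 1
# 4 ddd 0
```

Each fetched row carries the article columns and its own count, so the page loop only needs to echo the three values of the current row.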

SELECT *, (SELECT COUNT(control.con_id) FROM control WHERE control.art_id = articles.art_id) AS count_from_con FROM articles ORDER BY art_id ASC;
If I understood your question right, this query should do the trick.
Edit: Created the tables you have described, and it works.
SELECT * FROM articles;
+--------+------+
| art_id | name |
+--------+------+
| 1 | aaa |
| 2 | bbb |
| 3 | ccc |
| 4 | ddd |
+--------+------+
4 rows in set (0.00 sec)
SELECT * FROM control;
+--------+--------+------+
| con_id | art_id | data |
+--------+--------+------+
| 1 | 1 | NULL |
| 2 | 2 | NULL |
| 3 | 1 | NULL |
| 4 | 2 | NULL |
| 5 | 3 | NULL |
+--------+--------+------+
5 rows in set (0.00 sec)
SELECT *, (SELECT COUNT(control.con_id) FROM control WHERE control.art_id = articles.art_id) AS count_from_con FROM articles ORDER BY art_id ASC;
+--------+------+----------------+
| art_id | name | count_from_con |
+--------+------+----------------+
| 1 | aaa | 2 |
| 2 | bbb | 2 |
| 3 | ccc | 1 |
| 4 | ddd | 0 |
+--------+------+----------------+
You haven't quite explained what you want to accomplish with the print out but here is an example in PHP: (Use PDO instead of mysql_)
$pdo = new PDO(); // Make your connection here
$stm = $pdo->query('SELECT *, (SELECT COUNT(control.con_id) FROM control WHERE control.art_id = articles.art_id) AS count_from_con FROM articles ORDER BY art_id ASC');
while( $row = $stm->fetch(PDO::FETCH_ASSOC) )
{
echo "Article with id: ".$row['art_id']. " has " .$row['count_from_con'].' connected rows in control.';
}
Alternatively with the mysql_ extension:
$result = mysql_query('SELECT *, (SELECT COUNT(control.con_id) FROM control WHERE control.art_id = articles.art_id) AS count_from_con FROM articles ORDER BY art_id ASC');
while( $row = mysql_fetch_assoc($result) )
{
echo "Article with id: ".$row['art_id']. " has " .$row['count_from_con'].' connected rows in control.';
}
This should be enough examples to help you accomplish what you need.

mysql fetch data in where condition

I have 3 tables now:
First is : member_username
+-------------+------------------+
| uid | username |
+-------------+------------------+
| 1 | theone |
| 2 | ohno |
| 3 | prayforpr |
+-------------+------------------+
Second is : member_data
+-------------+-------------------+-----------------+
| uid | talk | etc |
+-------------+-------------------+-----------------+
| 1 | talk1 | |
| 2 | talkeee | |
| 3 | iojdfnl | |
+---------------------------------------------------+
Third is : member_level
+-------------+-------------------+-----------------+
| uid | level | fid |
+-------------+-------------------+-----------------+
| 1 | 2 | 1 |
| 1 | 10 | 2 |
| 2 | 1 | 1 |
| 2 | 99 | 2 |
| 1 | 40 | 3 |
| 3 | 50 | 1 |
| 1 | 44 | 4 |
+---------------------------------------------------+
I would like to query the data and display the single uid whose SUM of member_level.level is highest, counting only fid 1, 2 and 3.
My query is below, but it sums all levels, including fid 4. How do I restrict the sum to fid 1, 2 and 3? And how do I assign that SUM of member_level.level to $levelKingTotalLevel?
$levelKing = DB::query("SELECT t1.uid,t1.username,t2.talk FROM ".DB::table('member_level')." t3 JOIN ".DB::table('member_username')." t1 ON(t3.uid = t1.uid) JOIN ".DB::table('member_data')." t2 ON (t1.uid = t2.uid) GROUP BY t3.uid ORDER BY SUM(t3.level) DESC LIMIT 1");
while($rowlevelKing = DB::fetch($levelKing)) {
$levelKingTotalLevel = $rowlevelKing['???'];
$levelKingN = $rowlevelKing['username'];
$levelKingUID = $rowlevelKing['uid'];
$levelKingT = $rowlevelKing['talk'];
};
echo "The ".$levelKingN." total level is ".$levelKingTotalLevel." and he talk about ".$levelKingT;
Thank you.

To keep only records whose fid is 1, 2 or 3, use the IN operator in the WHERE clause. The alias totalLevel in the SELECT list gives you the total level for a user; it is the column name to read in place of '???'.
SELECT t1.uid, t1.username, t2.talk, SUM(t3.level) AS totalLevel
FROM member_level t3
JOIN member_username t1
ON (t3.uid = t1.uid)
JOIN member_data t2
ON (t1.uid = t2.uid)
WHERE t3.fid IN (1,2,3)
GROUP BY t3.uid
ORDER BY totalLevel DESC
LIMIT 1
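To check the query against the sample rows from the question, here is a small sketch using Python's sqlite3 module with an in-memory database (an assumption, the original uses MySQL through a DB:: wrapper). The etc column is left out because the query never touches it, and the variable name king is made up. Grouping by all three selected columns is a small change that keeps the query valid under stricter GROUP BY rules.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # read columns by name, like an associative fetch
conn.executescript("""
CREATE TABLE member_username (uid INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE member_data (uid INTEGER PRIMARY KEY, talk TEXT);
CREATE TABLE member_level (uid INTEGER, level INTEGER, fid INTEGER);
INSERT INTO member_username VALUES (1, 'theone'), (2, 'ohno'), (3, 'prayforpr');
INSERT INTO member_data VALUES (1, 'talk1'), (2, 'talkeee'), (3, 'iojdfnl');
INSERT INTO member_level VALUES
    (1, 2, 1), (1, 10, 2), (2, 1, 1), (2, 99, 2),
    (1, 40, 3), (3, 50, 1), (1, 44, 4);
""")

# Only fid 1, 2 and 3 contribute to the sum; the fid 4 row is filtered out
# before grouping.
king = conn.execute("""
    SELECT t1.uid, t1.username, t2.talk, SUM(t3.level) AS totalLevel
    FROM member_level t3
    JOIN member_username t1 ON t3.uid = t1.uid
    JOIN member_data t2 ON t1.uid = t2.uid
    WHERE t3.fid IN (1, 2, 3)
    GROUP BY t1.uid, t1.username, t2.talk
    ORDER BY totalLevel DESC
    LIMIT 1
""").fetchone()

print(king["username"], king["totalLevel"], king["talk"])  # ohno 100 talkeee
```

With the sample data, uid 1 totals 52 over fid 1 to 3 and uid 2 totals 100, so uid 2 is returned.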

How to use an ORDER BY clause with a GROUP BY statement

I have two table tbl_issue_log, tbl_magazine_issue
tbl_issue_log
============
+----------+--------+--------+-----------+---------------------+
| issue_id | mag_id | log_id | operation | updated_time |
+----------+--------+--------+-----------+---------------------+
| 2 | 1 | 1 | 1 | 2014-01-30 21:29:44 |
| 3 | 4 | 1 | 1 | 2015-01-30 21:29:44 |
| 2 | 1 | 1 | 3 | 2015-01-31 21:29:44 |
+----------+--------+--------+-----------+---------------------+
tbl_magazine_issue
=================
+----------+-------------+-------------+------------------+------------+------------+-------------------+---------------+
| ISSUE_ID | ISSUE_NAME | MAGAZINE_ID | COVER_PAGE_THUMB | FROM_DATE | TO_DATE | issue_description | login_page_no |
+----------+-------------+-------------+------------------+------------+------------+-------------------+---------------+
| 2 | test issue | 1 | cover page | 2014-01-30 | 2015-01-30 | sdssdg fsdf | 20 |
| 3 | test issue1 | 4 | cover page1 | 2014-01-30 | 2015-01-30 | sdssdg fsdf | 20 |
+----------+-------------+-------------+------------------+------------+------------+-------------------+---------------+
tbl_issue_log contains multiple records for the same issue id. I want only one row per issue,
and it must be the one with the latest updated time.
My query is this
SELECT
`tbl_issue_log`.`operation`,
`tbl_magazine_issue`.`ISSUE_ID`,
`tbl_magazine_issue`.`ISSUE_NAME`,
`tbl_magazine_issue`.`MAGAZINE_ID`,
`tbl_magazine_issue`.`COVER_PAGE_THUMB`,
`tbl_magazine_issue`.`FROM_DATE`,
`tbl_magazine_issue`.`TO_DATE`,
`tbl_magazine_issue`.`issue_description`,
`tbl_magazine_issue`.`login_page_no`
FROM
`tbl_issue_log`
LEFT JOIN
`tbl_magazine_issue` ON tbl_magazine_issue.ISSUE_ID = tbl_issue_log.issue_id
WHERE
(tbl_issue_log.mag_id = '1')
AND (tbl_magazine_issue.ISSUE_STATUS = 3)
AND (tbl_issue_log.updated_time > '2014-02-25 00:42:22')
GROUP BY tbl_issue_log.issue_id
ORDER BY tbl_issue_log updated_time DESC;
Here I get one row per issue id, but not the record with the latest updated time.
If anyone knows about this, please help me.

Try this
SELECT * FROM
(SELECT
`tbl_issue_log`.`operation`,
`tbl_magazine_issue`.`ISSUE_ID`,
`tbl_magazine_issue`.`ISSUE_NAME`,
`tbl_magazine_issue`.`MAGAZINE_ID`,
`tbl_magazine_issue`.`COVER_PAGE_THUMB`,
`tbl_magazine_issue`.`FROM_DATE`,
`tbl_magazine_issue`.`TO_DATE`,
`tbl_magazine_issue`.`issue_description`,
`tbl_magazine_issue`.`login_page_no`
FROM
`tbl_issue_log`
LEFT JOIN
`tbl_magazine_issue` ON tbl_magazine_issue.ISSUE_ID = tbl_issue_log.issue_id
WHERE
(tbl_issue_log.mag_id = '1')
AND (tbl_magazine_issue.ISSUE_STATUS = 3)
AND (tbl_issue_log.updated_time > '2014-02-25 00:42:22')
ORDER BY tbl_issue_log.updated_time DESC ) TEMP_TABLE
GROUP BY ISSUE_ID

Can you try changing the ORDER BY clause to
ORDER BY tbl_issue_log.updated_time DESC;
Edit ---
As you are grouping on issue_id, MySQL is free to pick any row for each issue_id (in practice often the first one it encounters). The ORDER BY runs after the grouping, so it cannot influence which row is picked and does not return what you are looking for. You may need to use a subquery approach for this.
select some_table.* FROM
(
SELECT
MAX(tbl_issue_log.updated_time) AS updated_time,
`tbl_issue_log`.`operation`,
`tbl_magazine_issue`.`ISSUE_ID`,
`tbl_magazine_issue`.`ISSUE_NAME`,
`tbl_magazine_issue`.`MAGAZINE_ID`,
`tbl_magazine_issue`.`COVER_PAGE_THUMB`,
`tbl_magazine_issue`.`FROM_DATE`,
`tbl_magazine_issue`.`TO_DATE`,
`tbl_magazine_issue`.`issue_description`,
`tbl_magazine_issue`.`login_page_no`
FROM
`tbl_issue_log`
LEFT JOIN
`tbl_magazine_issue` ON tbl_magazine_issue.ISSUE_ID = tbl_issue_log.issue_id
WHERE
(tbl_issue_log.mag_id = '1')
AND (tbl_magazine_issue.ISSUE_STATUS = 3)
AND (tbl_issue_log.updated_time > '2014-02-25 00:42:22')
GROUP BY tbl_issue_log.issue_id
) some_table
ORDER BY some_table.updated_time DESC;
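Another common way to get the whole latest row per issue, including its operation value, is to compute MAX(updated_time) per issue_id in a derived table and join back to the log on both columns. The sketch below is a swapped-in technique rather than the query above, reduced to tbl_issue_log only and run on in-memory SQLite through Python's sqlite3 module (both assumptions made for a self-contained demo); the alias m and the variable latest are made up.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_issue_log (
    issue_id INTEGER, mag_id INTEGER, log_id INTEGER,
    operation INTEGER, updated_time TEXT
);
INSERT INTO tbl_issue_log VALUES
    (2, 1, 1, 1, '2014-01-30 21:29:44'),
    (3, 4, 1, 1, '2015-01-30 21:29:44'),
    (2, 1, 1, 3, '2015-01-31 21:29:44');
""")

# The derived table m holds the newest timestamp per issue; joining back on
# (issue_id, updated_time) returns the complete row that owns that timestamp.
latest = conn.execute("""
    SELECT l.issue_id, l.operation, l.updated_time
    FROM tbl_issue_log l
    JOIN (
        SELECT issue_id, MAX(updated_time) AS max_time
        FROM tbl_issue_log
        GROUP BY issue_id
    ) m ON m.issue_id = l.issue_id AND m.max_time = l.updated_time
    ORDER BY l.updated_time DESC
""").fetchall()

print(latest)
# [(2, 3, '2015-01-31 21:29:44'), (3, 1, '2015-01-30 21:29:44')]
```

For issue 2 this returns operation 3, the value stored on the newest log row. If two log rows for one issue share the same updated_time, both come back, so a tie-breaker would be needed in that case.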

Compare column from two rows and if different change values of another column

I have a table with an auto-increment key id. Each item_no appears on either one row or two consecutive rows (so the pair always has consecutive ids) that share the same ref but differ in right/left. Technically an item_no can be repeated elsewhere in the table, but that's not an issue. The description is sometimes the same on the two consecutive rows and sometimes different:
id | item_no | description | right\left | ref
1 | 1 | a1 | right | aaa
2 | 1 | a1 | left | aaa
3 | 2 | b1 | right | bbb
4 | 3 | c1 | right | ccc
5 | 3 | c2 | left | ccc
6 | 4 | d1 | right | ddd
7 | 4 | d1 | left | ddd
My issue is that I need item_no to append a -r or -l on to its value if the description of its 'matching' row is different.
So the result I am looking for is:
id | item_no | description | right\left | ref
1 | 1 | a1 | right | aaa
2 | 1 | a1 | left | aaa
3 | 2 | b1 | right | bbb
4 | 3-r | c1 | right | ccc
5 | 3-l | c2 | left | ccc
6 | 4 | d1 | right | ddd
7 | 4 | d1 | left | ddd
I am exporting the table to a CSV without much PHP, just a MySQL statement and a loop over the results. Is this possible within the MySQL statement, or will I have to rely on a PHP loop?

I would use this:
update
items inner join
(select item_no from items
group by item_no
having count(distinct description)>1) dup
on items.item_no=dup.item_no
set
items.item_no=concat(items.item_no, '-', substr(rightleft, 1,1))
If rows are always consecutive, you could also use this:
update
items i1 inner join items i2
on (i1.id=i2.id+1 or i1.id=i2.id-1)
and (i1.item_no=i2.item_no)
and (i1.description<>i2.description)
set i1.item_no=concat(i1.item_no, '-', substr(i1.rightleft, 1,1))
EDIT: if rows are always consecutive, and you just need a select and not an update, you could use this:
select
i1.id,
case when i1.description=i2.description or i2.id is null then i1.item_no else
concat(i1.item_no, '-', substr(i1.rightleft, 1,1)) end,
i1.description, i1.rightleft, i1.ref
from
items i1 left join items i2
on (i1.id=i2.id+1 or i1.id=i2.id-1) and (i1.item_no=i2.item_no)
order by i1.id
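As a quick check of the select-only approach, the sketch below combines the "more than one distinct description" test from the first update with a LEFT JOIN, so every row is returned and only mismatched pairs get a suffix. It runs on in-memory SQLite through Python's sqlite3 module (an assumption), which is why it uses the || operator where MySQL would use CONCAT. The column name rightleft follows the queries above, and the alias d and the variable exported are made up.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (
    id INTEGER PRIMARY KEY, item_no TEXT, description TEXT,
    rightleft TEXT, ref TEXT
);
INSERT INTO items VALUES
    (1, '1', 'a1', 'right', 'aaa'), (2, '1', 'a1', 'left', 'aaa'),
    (3, '2', 'b1', 'right', 'bbb'),
    (4, '3', 'c1', 'right', 'ccc'), (5, '3', 'c2', 'left', 'ccc'),
    (6, '4', 'd1', 'right', 'ddd'), (7, '4', 'd1', 'left', 'ddd');
""")

# d lists the item numbers that carry more than one distinct description.
# Rows that match d get '-r' or '-l' appended; all other rows pass through.
exported = conn.execute("""
    SELECT i.id,
           CASE WHEN d.item_no IS NULL THEN i.item_no
                ELSE i.item_no || '-' || substr(i.rightleft, 1, 1)
           END AS item_no,
           i.description, i.rightleft, i.ref
    FROM items i
    LEFT JOIN (
        SELECT item_no
        FROM items
        GROUP BY item_no
        HAVING COUNT(DISTINCT description) > 1
    ) d ON d.item_no = i.item_no
    ORDER BY i.id
""").fetchall()

print([row[1] for row in exported])
# ['1', '1', '2', '3-r', '3-l', '4', '4']
```

Because this is a plain SELECT, the stored item_no values stay untouched and the suffix only exists in the exported result.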

Try this:
SELECT
id,
CASE RightLeft
WHEN 'right' THEN CONCAT(item_no, '-r' )
WHEN 'left' THEN CONCAT(item_no, '-l' )
END AS item_no,
DESCRIPTION,
Rightleft,
ref
FROM Items
WHERE item_no IN
(
SELECT i1.item_no
FROM items i1
GROUP BY i1.item_no
HAVING(COUNT(DISTINCT description)) > 1);
SQL Fiddle Demo
This will give you:
| ID | ITEM_NO | DESCRIPTION | RIGHTLEFT | REF |
------------------------------------------------
| 4 | 3-r | c1 | right | ccc |
| 5 | 3-l | c2 | left | ccc |

I would rely on a PHP loop if you're using MySQL. If you were using Oracle or SQL Server, you could program a stored procedure.
Your script should look something like this:
$dbh = new PDO('mysql:host='.DATABASE_HOST.';dbname='.DATABASE_NAME, DATABASE_USER, DATABASE_PASSWORD);
$dbh->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
$data = $dbh->query("SELECT * FROM ExampleTable");
$dbh->beginTransaction();
foreach($data as $row)
{
$append = $row["right\left"] == "left" ? $row["item_no"]."-l" : $row["item_no"]."-r";
$stmnt = $dbh->prepare("UPDATE ExampleTable SET item_no = :item WHERE id = :id");
$stmnt->execute(array(":item" => $append,":id" => $row["id"]));
}
// Do some exception handling; if something goes wrong you can always do a rollback
// with PDO: $dbh->rollBack();
$dbh->commit();
$dbh = null;

Something like this
UPDATE [dbo].[maTable] SET [item_no] = [item_no]+'r' WHERE not distinct [description] from [dbo].[maTable]
Should add an 'r' in the registration line where [description] is not identical (coded for SQL Server)

Count data in same sentence

I have table :
==========================================================
|id | before | after | freq | id_sentence | document_id |
==========================================================
| 1 | a | b | 1 | 0 | 1 |
| 2 | c | d | 1 | 1 | 1 |
| 3 | e | f | 1 | 1 | 1 |
| 4 | g | h | 2 | 0 | 2 |
==========================================================
I want to count the rows for each combination of id_sentence and freq, so the result must be 1 2 1.
Here's the code:
$query = mysql_query("SELECT freq FROM tb where document_id='$doc_id' ");
while ($row = mysql_fetch_array($query)) {
$stem_freq = $row['freq'];
$total = $total+$stem_freq;
}
But the result is still wrong. Please help me. Thank you :)

If I understand your question, you are trying to calculate the sum of freq for each distinct id_sentence for a particular document_id.
Try the following SQL:
SELECT id_sentence, SUM(freq) as freq FROM tb WHERE document_id = 1 GROUP BY(id_sentence)
The result will be rows of data with the id_sentence and corresponding total freq. No need to manually sum things up afterwards.
See this SQL Fiddle: http://sqlfiddle.com/#!2/691ed/8

I think you could do something like
SELECT count(*) AS count FROM tb GROUP BY id_sentence, freq
to do the counting you want. You could even do something like
SELECT count(*) AS count, id_sentence, freq FROM tb GROUP BY id_sentence, freq
to know which id_sentence,freq combination the count is for.
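To see how counting rows differs from summing freq on the sample table, here is a small sketch on in-memory SQLite through Python's sqlite3 module (an assumption made for a self-contained demo). It groups by document_id and id_sentence, which is one reading of the question and not necessarily the asker's intent. The before and after columns are omitted because neither aggregate uses them, and the names per_sentence, n and total_freq are made up.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tb (
    id INTEGER PRIMARY KEY, freq INTEGER,
    id_sentence INTEGER, document_id INTEGER
);
INSERT INTO tb VALUES (1, 1, 0, 1), (2, 1, 1, 1), (3, 1, 1, 1), (4, 2, 0, 2);
""")

# COUNT(*) counts rows per group, SUM(freq) adds up their freq values.
per_sentence = conn.execute("""
    SELECT document_id, id_sentence, COUNT(*) AS n, SUM(freq) AS total_freq
    FROM tb
    GROUP BY document_id, id_sentence
    ORDER BY document_id, id_sentence
""").fetchall()

print([row[2] for row in per_sentence])  # [1, 2, 1]
print([row[3] for row in per_sentence])  # [1, 2, 2]
```

The row counts come out as 1 2 1, the result the question asks for, while the sums differ for document 2 because its single row has freq 2.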
