SQL Removing duplicates one row at a time - php

I have a table where I save all row-changes that have ever occurred. The problem is that in the beginning of the application there was a bug that made a bunch of copies of every row.
The table looks something like this:
copies
|ID |CID |DATA
| 1 | 1 | DA
| 2 | 2 | DO
| 2 | 3 | DO (copy of CID 2)
| 1 | 4 | DA (copy of CID 1)
| 2 | 5 | DA
| 1 | 6 | DA (copy of CID 1)
| 2 | 7 | DO
CID is UNIQUE in table copies.
What I want is to remove, within each ID, every row whose DATA duplicates the row directly before it when sorted by CID.
As you can see in the table, CID 2 and 3 are the same and they are after one another. I would want to remove CID 3. The same with CID 4 and CID 6; they have no ID 1 between them and are copies of CID 1.
After duplicates removal, I would like the table to look like this:
copies
|ID |CID |DATA
| 1 | 1 | DA
| 2 | 2 | DO
| 2 | 5 | DA
| 2 | 7 | DO
Any suggestions? :)
I think my question was badly asked because the answer everybody seems to think is the best gives this result:
ID   | DATA | DATA | DATA | DATA | DATA | DATA     | CID (expected) | CID (Quassnoi)
1809 | 1    | 0    | 1    | 0    | 0    | NULL     | 252227         | 252227
1809 | 1    | 0    | 1    | 1    | 0    | NULL     | 381530         | 381530
1809 | 1    | 0    | 1    | 0    | 0    | NULL     | 438158         | (missing)
1809 | 1    | 0    | 1    | 0    | 1535 | 20090113 | 581418         | 581418
1809 | 1    | 1    | 1    | 0    | 1535 | 20090113 | 581421         | 581421
CID 252227 and CID 438158 are duplicates, but because CID 381530 comes between them, I want to keep CID 438158. Only duplicates that come directly after one another when ordering by ID and CID should be removed.

DELETE c.*
FROM copies c
JOIN (
    SELECT id, data, MIN(cid) AS minc
    FROM copies
    GROUP BY id, data
) q
ON c.id = q.id
AND c.data = q.data
AND c.cid <> q.minc
Update:
DELETE c.*
FROM (
    SELECT cid
    FROM (
        -- id is compared too, so a duplicate is never flagged across two different ids
        SELECT cid,
               COALESCE(id = @id AND data1 = @data1 AND data2 = @data2, FALSE) AS dup,
               @id := id,
               @data1 := data1,
               @data2 := data2
        FROM (
            SELECT @id := NULL,
                   @data1 := NULL,
                   @data2 := NULL
        ) vars, copies ci
        ORDER BY id, cid
    ) qi
    WHERE dup
) q
JOIN copies c
ON c.cid = q.cid
This solution employs MySQL session variables.
There is a pure ANSI solution that would use NOT EXISTS, however, it would be slow due to the way MySQL optimizer works (it won't use range access method in a correlated subquery).
See this article in my blog for performance details for quite a close task:
MySQL: difference between sets
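For comparison, here is a minimal sketch of a set-based version that avoids session variables. It is written in Python against an in-memory SQLite database purely so it can be run as-is; that choice, and the two-step "select then delete" structure, are my assumptions and not part of the answer above. On MySQL the same lookup would need adapting, since MySQL restricts subqueries on the table being deleted from. It also assumes DATA is never NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE copies (id INTEGER, cid INTEGER PRIMARY KEY, data TEXT)")
conn.executemany(
    "INSERT INTO copies (id, cid, data) VALUES (?, ?, ?)",
    [(1, 1, "DA"), (2, 2, "DO"), (2, 3, "DO"), (1, 4, "DA"),
     (2, 5, "DA"), (1, 6, "DA"), (2, 7, "DO")],
)

# A row is a consecutive duplicate when the closest earlier row with the
# same id (the largest smaller cid) carries the same data.
dup_cids = [
    row[0]
    for row in conn.execute(
        """
        SELECT c.cid
        FROM copies c
        JOIN copies p
          ON p.id = c.id
         AND p.cid = (SELECT MAX(x.cid) FROM copies x
                      WHERE x.id = c.id AND x.cid < c.cid)
        WHERE p.data = c.data
        ORDER BY c.cid
        """
    )
]

# Delete in a second step so the lookup above always sees the original rows.
conn.executemany("DELETE FROM copies WHERE cid = ?", [(cid,) for cid in dup_cids])

remaining = conn.execute("SELECT id, cid, data FROM copies ORDER BY cid").fetchall()
print(dup_cids)
print(remaining)
```

On the sample table from the question this leaves CID 1, 2, 5 and 7, which matches the expected result.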

You can use a count in a subquery for this:
delete from copies
where
(select count(*) from copies s where s.id = copies.id
and s.data = copies.data
and s.cid > copies.cid) > 0

// EDITED for @Jonathan Leffler's comment
//$sql = "SELECT ID,CID,DATA FROM copies ORDER BY CID, ID";
$sql = "SELECT ID,CID,DATA FROM copies ORDER BY ID, CID";
$result = mysql_query($sql, $link); // mysql_* was removed in PHP 7; use mysqli or PDO there
$data = null;
$id = null;
while ($row = mysql_fetch_row($result)) {
    // Same ID and same DATA as the previous row: a consecutive duplicate
    if ($id !== null && $row[0] == $id && $row[2] == $data) {
        $sql2 = "DELETE FROM copies WHERE CID=" . (int) $row[1];
        $res = mysql_query($sql2, $link);
    }
    $id = $row[0];
    $data = $row[2];
}

DELETE FROM copies
WHERE cid IN (
    SELECT max_cid FROM (
        SELECT MAX(cid) AS max_cid FROM copies GROUP BY id, data HAVING COUNT(*) > 1
    ) t
)

Related

for each value of one table get and display count number of corresponding values from another table

Unfortunately I have to do this in MySQL / PHP. I looked for three days, and there are something like 10,000 explanations of this, but NONE (and I repeat, NONE) works for me. I tried them all. I have to ask, sorry.
I have two tables - articles and control.
table "articles"
------------------
art_id | name |
------------------
1 | aaa |
2 | bbb |
3 | ccc |
4 | ddd |
table "control"
--------------------------------------------
con_id | art_id | data |
--------------------------------------------
1 | 1 | something-a |
2 | 2 | something-b |
3 | 1 | something-a |
4 | 2 | something-c |
5 | 3 | something-f |
art_id exists in both tables. What I want is for the query
"select * from articles order by art_id ASC" displayed in a table
to also have one cell showing the count of rows for each art_id in table CONTROL.
So I tried join, left join, inner join, and I get errors. I also tried a for-each and got only one result (for example 2 for everything). The following is semi-right, but it prints the whole array of correct results into every row, and it doesn't even use a join:
$query = "SELECT art_id, count(*) as counting
FROM control GROUP BY art_id ORDER BY con_id ASC";
$result = mysql_query($query);
while($row=mysql_fetch_array($result)) {
echo $row['counting'];
}
this displays 221 -
-------------------------------------------------
art_id | name | count (this one from control) |
-------------------------------------------------
1 | aaa | 221 |
2 | bbb | 221 |
3 | ccc | 221 |
and it should be:
2 for art_id 1,
2 for art_id 2,
1 for art_id 3
It should be simple: a count of values from the CONTROL table displayed alongside the query on the "articles" table.
The result query on page for table articles should be:
"select * from articles order by art_id ASC"
-------------------------------------------------
art_id | name | count (this one from control) |
-------------------------------------------------
1 | aaa | 2 |
2 | bbb | 2 |
3 | ccc | 1 |
So maybe I should go with JOIN, or with JOIN plus for-each. I tried that too, but then I'm not sure what the proper thing to echo is. All in all, I'm completely lost here. Please help. Thank you.
So imagine this in two steps:
Get the counts per art_id from the control table
Using your articles table, pick up the counts from step 1
That will give you a query that looks like this:
SELECT a.art_id, a.name, b.control_count
FROM articles a
INNER JOIN
(
SELECT art_id, COUNT(*) AS control_count
FROM control
GROUP BY art_id
) b
ON a.art_id = b.art_id;
Which will give you the results you're looking for.
However, instead of using a subquery, you can do it all in one shot:
SELECT a.art_id, a.name, COUNT(b.art_id) AS control_count
FROM articles a
INNER JOIN control b
ON a.art_id = b.art_id
GROUP BY a.art_id, a.name;
SQL Fiddle demo
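One thing to watch with the INNER JOIN versions: an article with no rows in control (art_id 4 here) drops out of the result. A rough sketch of a LEFT JOIN variant that keeps it with a count of 0 is below. It uses Python with an in-memory SQLite database only so it runs on its own; the query itself should carry over to MySQL, though I have not run it there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE articles (art_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE control (con_id INTEGER PRIMARY KEY, art_id INTEGER, data TEXT);
INSERT INTO articles VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc'), (4, 'ddd');
INSERT INTO control VALUES
    (1, 1, 'something-a'), (2, 2, 'something-b'), (3, 1, 'something-a'),
    (4, 2, 'something-c'), (5, 3, 'something-f');
""")

# COUNT(c.con_id) ignores the NULLs produced by unmatched LEFT JOIN rows,
# so articles without control rows get 0 instead of disappearing.
counts = conn.execute("""
    SELECT a.art_id, a.name, COUNT(c.con_id) AS control_count
    FROM articles a
    LEFT JOIN control c ON c.art_id = a.art_id
    GROUP BY a.art_id, a.name
    ORDER BY a.art_id ASC
""").fetchall()

for art_id, name, control_count in counts:
    print(art_id, name, control_count)
```

Whether art_id 4 should appear at all depends on what the page needs; the question's expected output leaves it out, in which case the INNER JOIN is the right choice.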
SELECT *, (SELECT COUNT(control.con_id) FROM control WHERE control.art_id = articles.art_id) AS count_from_con FROM articles ORDER BY art_id ASC;
If I understood your question right, this query should do the trick.
Edit: Created the tables you have described, and it works.
SELECT * FROM articles;
+--------+------+
| art_id | name |
+--------+------+
| 1 | aaa |
| 2 | bbb |
| 3 | ccc |
| 4 | ddd |
+--------+------+
4 rows in set (0.00 sec)
SELECT * FROM control;
+--------+--------+------+
| con_id | art_id | data |
+--------+--------+------+
| 1 | 1 | NULL |
| 2 | 2 | NULL |
| 3 | 1 | NULL |
| 4 | 2 | NULL |
| 5 | 3 | NULL |
+--------+--------+------+
5 rows in set (0.00 sec)
SELECT *, (SELECT COUNT(control.con_id) FROM control WHERE control.art_id = articles.art_id) AS count_from_con FROM articles ORDER BY art_id ASC;
+--------+------+----------------+
| art_id | name | count_from_con |
+--------+------+----------------+
| 1 | aaa | 2 |
| 2 | bbb | 2 |
| 3 | ccc | 1 |
| 4 | ddd | 0 |
+--------+------+----------------+
You haven't quite explained what you want to accomplish with the print out but here is an example in PHP: (Use PDO instead of mysql_)
$pdo = new PDO(); // Make your connection here
$stm = $pdo->query('SELECT *, (SELECT COUNT(control.con_id) FROM control WHERE control.art_id = articles.art_id) AS count_from_con FROM articles ORDER BY art_id ASC');
while( $row = $stm->fetch(PDO::FETCH_ASSOC) )
{
echo "Article with id: ".$row['art_id']. " has " .$row['count_from_con'].' connected rows in control.';
}
Alternatively with the mysql_ extension:
$result = mysql_query('SELECT *, (SELECT COUNT(control.con_id) FROM control WHERE control.art_id = articles.art_id) AS count_from_con FROM articles ORDER BY art_id ASC');
while( $row = mysql_fetch_assoc($result) )
{
echo "Article with id: ".$row['art_id']. " has " .$row['count_from_con'].' connected rows in control.';
}
This should be enough examples to help you accomplish what you need.

mysql fetch data in where condition

I have 3 tables now:
First is : member_username
+-------------+------------------+
| uid | username |
+-------------+------------------+
| 1 | theone |
| 2 | ohno |
| 3 | prayforpr |
+-------------+------------------+
Second is : member_data
+-------------+-------------------+-----------------+
| uid | talk | etc |
+-------------+-------------------+-----------------+
| 1 | talk1 | |
| 2 | talkeee | |
| 3 | iojdfnl | |
+---------------------------------------------------+
Third is : member_level
+-------------+-------------------+-----------------+
| uid | level | fid |
+-------------+-------------------+-----------------+
| 1 | 2 | 1 |
| 1 | 10 | 2 |
| 2 | 1 | 1 |
| 2 | 99 | 2 |
| 1 | 40 | 3 |
| 3 | 50 | 1 |
| 1 | 44 | 4 |
+---------------------------------------------------+
I would like to query the data and display only the one uid whose SUM of member_level.level is highest, counting only rows where fid is 1, 2 or 3.
My current query is below, but it sums all the levels, including fid 4. How do I restrict the sum to fid 1, 2, 3? And how do I assign that SUM of member_level.level to $levelKingTotalLevel?
$levelKing = DB::query("SELECT t1.uid,t1.username,t2.talk FROM ".DB::table('member_level')." t3 JOIN ".DB::table('member_username')." t1 ON(t3.uid = t1.uid) JOIN ".DB::table('member_data')." t2 ON (t1.uid = t2.uid) GROUP BY t3.uid ORDER BY SUM(t3.level) DESC LIMIT 1");
while($rowlevelKing = DB::fetch($levelKing)) {
$levelKingTotalLevel = $rowlevelKing['???'];
$levelKingN = $rowlevelKing['username'];
$levelKingUID = $rowlevelKing['uid'];
$levelKingT = $rowlevelKing['talk'];
};
echo "The ".$levelKingN." total level is ".$levelKingTotalLevel." and he talk about ".$levelKingT;
Thank you.
To keep only the records whose fid is 1, 2 or 3, use the IN operator in the WHERE clause. The alias totalLevel in the SELECT list gives you the total level for a user, so in PHP you can read it as $rowlevelKing['totalLevel'].
SELECT t1.uid, t1.username, t2.talk, SUM(t3.level) AS totalLevel
FROM member_level t3
JOIN member_username t1
ON (t3.uid = t1.uid)
JOIN member_data t2
ON (t1.uid = t2.uid)
WHERE t3.fid IN (1,2,3)
GROUP BY t3.uid
ORDER BY totalLevel DESC
LIMIT 1
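To see what the filter changes, here is a small sketch that runs the same kind of query on the sample rows. It uses Python with in-memory SQLite rather than MySQL and the DB:: wrapper from the question (my assumption, only to keep it self-contained), and it drops the LIMIT so every user's total is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE member_username (uid INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE member_data (uid INTEGER PRIMARY KEY, talk TEXT);
CREATE TABLE member_level (uid INTEGER, level INTEGER, fid INTEGER);
INSERT INTO member_username VALUES (1, 'theone'), (2, 'ohno'), (3, 'prayforpr');
INSERT INTO member_data VALUES (1, 'talk1'), (2, 'talkeee'), (3, 'iojdfnl');
INSERT INTO member_level VALUES
    (1, 2, 1), (1, 10, 2), (2, 1, 1), (2, 99, 2), (1, 40, 3), (3, 50, 1), (1, 44, 4);
""")

# Only fid 1, 2 and 3 contribute to the sum; the fid 4 row of uid 1 is ignored.
totals = conn.execute("""
    SELECT t1.uid, t1.username, t2.talk, SUM(t3.level) AS totalLevel
    FROM member_level t3
    JOIN member_username t1 ON t3.uid = t1.uid
    JOIN member_data t2 ON t1.uid = t2.uid
    WHERE t3.fid IN (1, 2, 3)
    GROUP BY t1.uid, t1.username, t2.talk
    ORDER BY totalLevel DESC
""").fetchall()

level_king = totals[0]  # what LIMIT 1 would return
print(totals)
print(level_king)
```

Without the WHERE clause, uid 1 would total 96 instead of 52; uid 2 stays on top either way with 100.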

how use order by clause in group by statement

I have two table tbl_issue_log, tbl_magazine_issue
tbl_issue_log
============
+----------+--------+--------+-----------+---------------------+
| issue_id | mag_id | log_id | operation | updated_time |
+----------+--------+--------+-----------+---------------------+
| 2 | 1 | 1 | 1 | 2014-01-30 21:29:44 |
| 3 | 4 | 1 | 1 | 2015-01-30 21:29:44 |
| 2 | 1 | 1 | 3 | 2015-01-31 21:29:44 |
+----------+--------+--------+-----------+---------------------+
tbl_magazine_issue
=================
+----------+-------------+-------------+------------------+------------+------------+-------------------+---------------+
| ISSUE_ID | ISSUE_NAME | MAGAZINE_ID | COVER_PAGE_THUMB | FROM_DATE | TO_DATE | issue_description | login_page_no |
+----------+-------------+-------------+------------------+------------+------------+-------------------+---------------+
| 2 | test issue | 1 | cover page | 2014-01-30 | 2015-01-30 | sdssdg fsdf | 20 |
| 3 | test issue1 | 4 | cover page1 | 2014-01-30 | 2015-01-30 | sdssdg fsdf | 20 |
+----------+-------------+-------------+------------------+------------+------------+-------------------+---------------+
tbl_issue_log contains multiple records for the same issue id. I want only one row per issue,
and it must be the one with the latest updated time.
My query is this
SELECT
`tbl_issue_log`.`operation`,
`tbl_magazine_issue`.`ISSUE_ID`,
`tbl_magazine_issue`.`ISSUE_NAME`,
`tbl_magazine_issue`.`MAGAZINE_ID`,
`tbl_magazine_issue`.`COVER_PAGE_THUMB`,
`tbl_magazine_issue`.`FROM_DATE`,
`tbl_magazine_issue`.`TO_DATE`,
`tbl_magazine_issue`.`issue_description`,
`tbl_magazine_issue`.`login_page_no`
FROM
`tbl_issue_log`
LEFT JOIN
`tbl_magazine_issue` ON tbl_magazine_issue.ISSUE_ID = tbl_issue_log.issue_id
WHERE
(tbl_issue_log.mag_id = '1')
AND (tbl_magazine_issue.ISSUE_STATUS = 3)
AND (tbl_issue_log.updated_time > '2014-02-25 00:42:22')
GROUP BY tbl_issue_log.issue_id
ORDER BY tbl_issue_log updated_time DESC;
This gives me one row per issue id, but not the record with the latest updated time.
If anyone knows about this, please help me.
Try this
SELECT * FROM
(SELECT
`tbl_issue_log`.`operation`,
`tbl_magazine_issue`.`ISSUE_ID`,
`tbl_magazine_issue`.`ISSUE_NAME`,
`tbl_magazine_issue`.`MAGAZINE_ID`,
`tbl_magazine_issue`.`COVER_PAGE_THUMB`,
`tbl_magazine_issue`.`FROM_DATE`,
`tbl_magazine_issue`.`TO_DATE`,
`tbl_magazine_issue`.`issue_description`,
`tbl_magazine_issue`.`login_page_no`
FROM
`tbl_issue_log`
LEFT JOIN
`tbl_magazine_issue` ON tbl_magazine_issue.ISSUE_ID = tbl_issue_log.issue_id
WHERE
(tbl_issue_log.mag_id = '1')
AND (tbl_magazine_issue.ISSUE_STATUS = 3)
AND (tbl_issue_log.updated_time > '2014-02-25 00:42:22')
ORDER BY tbl_issue_log.updated_time DESC ) TEMP_TABLE
GROUP BY ISSUE_ID
Can you try changing order by clause as
ORDER BY tbl_issue_log.updated_time DESC;
Edit ---
As you are grouping on issue_id, mysql will select first row that matches the issue_id. The order by runs later which essentially does not return what you are looking for. You may need to use a subquery approach for this.
select some_table.* FROM
(
SELECT
MAX(tbl_issue_log.updated_time) AS updated_time,
`tbl_issue_log`.`operation`,
`tbl_magazine_issue`.`ISSUE_ID`,
`tbl_magazine_issue`.`ISSUE_NAME`,
`tbl_magazine_issue`.`MAGAZINE_ID`,
`tbl_magazine_issue`.`COVER_PAGE_THUMB`,
`tbl_magazine_issue`.`FROM_DATE`,
`tbl_magazine_issue`.`TO_DATE`,
`tbl_magazine_issue`.`issue_description`,
`tbl_magazine_issue`.`login_page_no`
FROM
`tbl_issue_log`
LEFT JOIN
`tbl_magazine_issue` ON tbl_magazine_issue.ISSUE_ID = tbl_issue_log.issue_id
WHERE
(tbl_issue_log.mag_id = '1')
AND (tbl_magazine_issue.ISSUE_STATUS = 3)
AND (tbl_issue_log.updated_time > '2014-02-25 00:42:22')
GROUP BY tbl_issue_log.issue_id
) some_table
ORDER BY some_table.updated_time DESC;
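A caveat on the queries above: selecting operation next to MAX(updated_time) under GROUP BY issue_id does not guarantee that operation comes from the newest row. A common alternative is to compute the newest time per issue first and join back to the log on it. The sketch below shows that idea in Python with in-memory SQLite (my choice, to keep it runnable on its own) on a cut-down version of the two tables; the WHERE filters from the question are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_issue_log (
    issue_id INTEGER, mag_id INTEGER, log_id INTEGER,
    operation INTEGER, updated_time TEXT
);
CREATE TABLE tbl_magazine_issue (
    ISSUE_ID INTEGER PRIMARY KEY, ISSUE_NAME TEXT, MAGAZINE_ID INTEGER
);
INSERT INTO tbl_issue_log VALUES
    (2, 1, 1, 1, '2014-01-30 21:29:44'),
    (3, 4, 1, 1, '2015-01-30 21:29:44'),
    (2, 1, 1, 3, '2015-01-31 21:29:44');
INSERT INTO tbl_magazine_issue VALUES (2, 'test issue', 1), (3, 'test issue1', 4);
""")

# Step 1 (derived table "newest"): latest updated_time per issue_id.
# Step 2: join back to the log so every column comes from that newest row.
latest = conn.execute("""
    SELECT l.issue_id, l.operation, l.updated_time, m.ISSUE_NAME
    FROM tbl_issue_log l
    JOIN (
        SELECT issue_id, MAX(updated_time) AS max_time
        FROM tbl_issue_log
        GROUP BY issue_id
    ) newest
      ON newest.issue_id = l.issue_id
     AND newest.max_time = l.updated_time
    LEFT JOIN tbl_magazine_issue m ON m.ISSUE_ID = l.issue_id
    ORDER BY l.updated_time DESC
""").fetchall()

for row in latest:
    print(row)
```

If two log rows for one issue share the same updated_time, both come back, so a tie-breaker column would be needed in that case.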

Compare column from two rows and if different change values of another column

I have a table with an auto-increment key id. Each item_no appears on either one row or two consecutive rows (so a pair always has consecutive ids); paired rows share the same ref but differ in right/left. Technically an item_no can recur elsewhere in the table, but that's not an issue. The description is sometimes the same on the two consecutive rows and sometimes different:
id | item_no | description | right\left | ref
1 | 1 | a1 | right | aaa
2 | 1 | a1 | left | aaa
3 | 2 | b1 | right | bbb
4 | 3 | c1 | right | ccc
5 | 3 | c2 | left | ccc
6 | 4 | d1 | right | ddd
7 | 4 | d1 | left | ddd
My issue is that I need item_no to append a -r or -l on to its value if the description of its 'matching' row is different.
So the result I am looking for is:
id | item_no | description | right\left | ref
1 | 1 | a1 | right | aaa
2 | 1 | a1 | left | aaa
3 | 2 | b1 | right | bbb
4 | 3-r | c1 | right | ccc
5 | 3-l | c2 | left | ccc
6 | 4 | d1 | right | ddd
7 | 4 | d1 | left | ddd
I am exporting the table to a csv but am not using much php, just a mysql statement and then looping out the results, is this possible within the mysql statement or will I have to rely on a php loop?
I would use this:
update
items inner join
(select item_no from items
group by item_no
having count(distinct description)>1) dup
on items.item_no=dup.item_no
set
items.item_no=concat(items.item_no, '-', substr(rightleft, 1,1))
If rows are always consecutive, you could also use this:
update
items i1 inner join items i2
on (i1.id=i2.id+1 or i1.id=i2.id-1)
and (i1.item_no=i2.item_no)
and (i1.description<>i2.description)
set i1.item_no=concat(i1.item_no, '-', substr(i1.rightleft, 1,1))
EDIT: if rows are always consecutive, and you just need a select and not an update, you could use this:
select
i1.id,
case when i1.description=i2.description or i2.id is null then i1.item_no else
concat(i1.item_no, '-', substr(i1.rightleft, 1,1)) end,
i1.description, i1.rightleft, i1.ref
from
items i1 left join items i2
on (i1.id=i2.id+1 or i1.id=i2.id-1) and (i1.item_no=i2.item_no)
order by i1.id
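If the rows of a pair are not guaranteed to be consecutive, the grouping idea from the first UPDATE can also be used in a plain SELECT. Here is a minimal sketch in Python with in-memory SQLite (my assumption, so it is self-contained); it stores item_no as text and names the column rightleft, as the queries above do. In MySQL the `||` concatenation would become CONCAT().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (
    id INTEGER PRIMARY KEY, item_no TEXT, description TEXT,
    rightleft TEXT, ref TEXT
);
INSERT INTO items VALUES
    (1, '1', 'a1', 'right', 'aaa'), (2, '1', 'a1', 'left', 'aaa'),
    (3, '2', 'b1', 'right', 'bbb'),
    (4, '3', 'c1', 'right', 'ccc'), (5, '3', 'c2', 'left', 'ccc'),
    (6, '4', 'd1', 'right', 'ddd'), (7, '4', 'd1', 'left', 'ddd');
""")

# "d" lists the item_no values that have more than one distinct description;
# only rows matching it get the -r / -l suffix.
rows = conn.execute("""
    SELECT i.id,
           CASE WHEN d.item_no IS NULL THEN i.item_no
                ELSE i.item_no || '-' || substr(i.rightleft, 1, 1)
           END AS item_no,
           i.description, i.rightleft, i.ref
    FROM items i
    LEFT JOIN (
        SELECT item_no
        FROM items
        GROUP BY item_no
        HAVING COUNT(DISTINCT description) > 1
    ) d ON d.item_no = i.item_no
    ORDER BY i.id
""").fetchall()

item_nos = [row[1] for row in rows]
print(item_nos)
```

This returns all seven rows, with only ids 4 and 5 renamed to 3-r and 3-l, which is the shape the CSV export needs.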
Try this:
SELECT
id,
CASE RightLeft
WHEN 'right' THEN CONCAT(item_no, '-r' )
WHEN 'left' THEN CONCAT(item_no, '-l' )
END AS item_no,
DESCRIPTION,
Rightleft,
ref
FROM Items
WHERE item_no IN
(
SELECT i1.item_no
FROM items i1
GROUP BY i1.item_no
HAVING(COUNT(DISTINCT description)) > 1);
SQL Fiddle Demo
This will give you:
| ID | ITEM_NO | DESCRIPTION | RIGHTLEFT | REF |
------------------------------------------------
| 4 | 3-r | c1 | right | ccc |
| 5 | 3-l | c2 | left | ccc |
I would rely on a PHP loop here, although a stored procedure is another option (MySQL has supported them since 5.0, as do Oracle and SQL Server).
Your script should look something like this:
$dbh = new PDO('mysql:host='.DATABASE_HOST.';dbname='.DATABASE_NAME, DATABASE_USER, DATABASE_PASSWORD);
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
// Fetch all rows first so the result set is closed before the updates run
$data = $dbh->query("SELECT * FROM ExampleTable")->fetchAll(PDO::FETCH_ASSOC);
// Prepare once, execute once per row
$stmnt = $dbh->prepare("UPDATE ExampleTable SET item_no = :item WHERE id = :id");
$dbh->beginTransaction();
try {
    foreach ($data as $row) {
        // Note: this suffixes every row; add a description check to limit it to differing pairs
        $append = $row["right\left"] == "left" ? $row["item_no"]."-l" : $row["item_no"]."-r";
        $stmnt->execute(array(":item" => $append, ":id" => $row["id"]));
    }
    $dbh->commit();
} catch (PDOException $e) {
    // If something goes wrong, undo every update made so far
    $dbh->rollBack();
    throw $e;
}
$dbh = null;
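Something like this:
UPDATE [dbo].[maTable] SET [item_no] = [item_no] + 'r' WHERE [item_no] IN (SELECT [item_no] FROM [dbo].[maTable] GROUP BY [item_no] HAVING COUNT(DISTINCT [description]) > 1)
This should append an 'r' to item_no on the records whose [description] is not identical (coded for SQL Server, assuming item_no is a character column).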

Count data in same sentence

I have table :
==========================================================
|id | before | after | freq | id_sentence | document_id |
==========================================================
| 1 | a | b | 1 | 0 | 1 |
| 2 | c | d | 1 | 1 | 1 |
| 3 | e | f | 1 | 1 | 1 |
| 4 | g | h | 2 | 0 | 2 |
==========================================================
I want to get the number of rows depending on id_sentence and freq, so the result must be 1 2 1.
Here's the code:
$query = mysql_query("SELECT freq FROM tb where document_id='$doc_id' ");
while ($row = mysql_fetch_array($query)) {
$stem_freq = $row['freq'];
$total = $total+$stem_freq;
But the result is still wrong. Please help me. Thank you :)
If I understand your question, you are trying to calculate the sum of freq for each distinct id_sentence for a particular document_id.
Try the following SQL:
SELECT id_sentence, SUM(freq) as freq FROM tb WHERE document_id = 1 GROUP BY(id_sentence)
The result will be rows of data with the id_sentence and corresponding total freq. No need to manually sum things up afterwards.
See this SQL Fiddle: http://sqlfiddle.com/#!2/691ed/8
I think you could do something like
SELECT count(*) AS count FROM tb GROUP BY id_sentence, freq
to do the counting you want. You could even do something like
SELECT count(*) AS count, id_sentence, freq FROM tb GROUP BY id_sentence, freq
to know which id_sentence, freq combination the count is for.
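As a quick check of the counting version against the sample table, here is a sketch in Python with in-memory SQLite (my assumption, so it runs standalone). The before and after columns are left out because they play no part in the count, and the ORDER BY MIN(id) is only there to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tb (
    id INTEGER PRIMARY KEY, freq INTEGER,
    id_sentence INTEGER, document_id INTEGER
);
INSERT INTO tb VALUES (1, 1, 0, 1), (2, 1, 1, 1), (3, 1, 1, 1), (4, 2, 0, 2);
""")

# One output row per (id_sentence, freq) combination, with the number of
# table rows that fall into it.
groups = conn.execute("""
    SELECT id_sentence, freq, COUNT(*) AS n
    FROM tb
    GROUP BY id_sentence, freq
    ORDER BY MIN(id)
""").fetchall()

counts = [n for _, _, n in groups]
print(groups)
print(counts)
```

The counts come out as 1, 2, 1, which is the result the question asks for.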
