Select duplicates with PHP & MySql for merging process

I wrote some code to select duplicates and group them using first and last names. I gather them into a multidimensional array and dedupe/merge them using jQuery/Ajax on the resulting page. I would like to ask if there is a better method of creating the array than how I'm doing it. Here is my code. Thank you.
$dataArr=fetchDups($conn, 13, 5); // get a few at a time
print '<div style="clear:both;"></div><pre>';
print_r($dataArr);
print '</pre><div style="clear:both;"></div>';
function fetchDups($conn, $client_id, $limit='')
{
$sql=' SELECT * FROM `contacts` WHERE `clientid`=\''.(int)$client_id.'\' GROUP BY fname, lname ';
//$sql=' SELECT DISTINCT fname, lname, * FROM `clients` WHERE `clientid`=\'13\' ';
$res=mysql_query($sql, $conn)or die(mysql_error());
$contactsRow=array();
while($row=mysql_fetch_array($res)){
echo $row['fname'].'<br>';
$contactsRow[]=$row;
}
mysql_freeresult($res);
$dataArr=array();
$i=0;
$limitNum=0;
//----------------------------------
foreach($contactsRow AS $rowNew){
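// use the $client_id argument and escape the names before building the query
$sql=' SELECT * FROM `contacts` WHERE `clientid`=\''.(int)$client_id.'\' AND `id`!=\''.(int)$rowNew['id'].'\'
AND (`fname` = \''.mysql_real_escape_string($rowNew['fname'], $conn).'\' OR `lname` = \''.mysql_real_escape_string($rowNew['lname'], $conn).'\')
';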
//echo $sql;
$res=mysql_query($sql, $conn)or die(mysql_error());
$rowCountDup=mysql_num_rows($res);
if($rowCountDup>0){
$d=0;
$dataArr[$i]=array();
$dataArr[$i][$d]=$rowNew;
while($rowNew=mysql_fetch_array($res)){
$dataArr[$i][($d+1)]=$rowNew;
$d++;
}
$i++;
$limitNum++;
}
// limit the results. too many crashes the browser
if($limitNum==$limit){
break;
}
}
mysql_freeresult($res);
return $dataArr;
}

For this kind of thing, you should probably try using:
SELECT * FROM contacts refC JOIN contacts allC USING (fname, lname) WHERE refC.clientid='13'
This does a self-join on contacts based on first and last name, so allC aliases the list of all contacts that share refC's first and last names (including the contact itself).
This way, you get all the information you're looking for in only one SQL query. Tuning may be achieved on the query by adding an index on columns fname and lname of table contacts, so the join doesn't have to parse the whole table to match.
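The self-join above can be sketched end to end as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server, the sample rows and the index name idx_contacts_name are made up, and the extra conditions (same client on both sides, different id) are additions to the query above so that a contact is not reported as its own duplicate.

```python
import sqlite3
from collections import defaultdict

# In-memory SQLite database standing in for the MySQL `contacts` table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (
        id INTEGER PRIMARY KEY, clientid INTEGER, fname TEXT, lname TEXT
    );
    -- Index on the join columns, as suggested above (name is hypothetical).
    CREATE INDEX idx_contacts_name ON contacts (fname, lname);
    INSERT INTO contacts (id, clientid, fname, lname) VALUES
        (1, 13, 'Annie', 'Haddock'),
        (2, 13, 'Annie', 'Haddock'),
        (3, 13, 'Ginger', 'Mole'),
        (4, 13, 'Ted', 'Ted'),
        (5, 13, 'Ted', 'Ted');
""")

# One query returns every (contact, duplicate) pair for the client.
rows = conn.execute("""
    SELECT refC.id, allC.id
    FROM contacts refC
    JOIN contacts allC USING (fname, lname)
    WHERE refC.clientid = ? AND allC.clientid = ? AND allC.id <> refC.id
    ORDER BY refC.id, allC.id
""", (13, 13)).fetchall()

# Group the pairs into {contact id: [ids of its duplicates]}.
dups = defaultdict(list)
for ref_id, dup_id in rows:
    dups[ref_id].append(dup_id)

print(dict(dups))
```

Contact 3 has no duplicate, so it does not appear in the result at all; no second query per row is needed.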
Edit: you can also specify the join condition more finely, for instance:
SELECT *
FROM contacts refC
JOIN contacts allC ON (allC.fname LIKE CONCAT(refC.fname, '%') AND allC.lname LIKE CONCAT(refC.lname, '%'))
WHERE refC.clientid='13'
Which is strictly equivalent to (but IMO easier to read than):
SELECT *
FROM contacts refC,contacts allC
WHERE allC.fname LIKE CONCAT(refC.fname, '%')
AND allC.lname LIKE CONCAT(refC.lname, '%')
AND refC.clientid='13'

If you just want to avoid displaying duplicates, and not actually remove them from your db, use the SQL DISTINCT keyword.

Or you could try something like the second query here which uses a derived table:
mysql> select * from contacts ;
+----+--------+---------+
| id | fname | lname |
+----+--------+---------+
| 1 | Annie | Haddock |
| 2 | Annie | Haddock |
| 3 | Ginger | Mole |
| 4 | Ted | Ted |
| 5 | Ted | Ted |
+----+--------+---------+
5 rows in set (0.01 sec)
mysql> select id, fname, lname, total from
(select *, count(*) as total
from contacts group by fname, lname) people
where total > 1;
+-----------+--------------+--------------+--------------+
| people.id | people.fname | people.lname | people.total |
+-----------+--------------+--------------+--------------+
| 1 | Annie | Haddock | 2 |
| 4 | Ted | Ted | 2 |
+-----------+--------------+--------------+--------------+
2 rows in set (0.01 sec)
Then just iterate through the result with foreach. Note that "people" above is an alias for the derived table created by the inner select.
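Since the question asks for one array per group of duplicates, the derived-table idea can be extended to return every row of each duplicated name and then be grouped in a single pass. A minimal sketch, assuming Python's sqlite3 as a stand-in for MySQL and the sample rows shown above; the HAVING clause replaces the outer "where total > 1" filter and is a swapped-in variant, not the query from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (id INTEGER PRIMARY KEY, fname TEXT, lname TEXT);
    INSERT INTO contacts (id, fname, lname) VALUES
        (1, 'Annie', 'Haddock'),
        (2, 'Annie', 'Haddock'),
        (3, 'Ginger', 'Mole'),
        (4, 'Ted', 'Ted'),
        (5, 'Ted', 'Ted');
""")

# The derived table `d` lists the names that occur more than once;
# joining back to contacts returns every row that carries such a name.
rows = conn.execute("""
    SELECT c.fname, c.lname, c.id
    FROM contacts c
    JOIN (
        SELECT fname, lname
        FROM contacts
        GROUP BY fname, lname
        HAVING COUNT(*) > 1
    ) d ON d.fname = c.fname AND d.lname = c.lname
    ORDER BY c.fname, c.lname, c.id
""").fetchall()

# Build {(fname, lname): [ids]} in one pass over the ordered result.
groups = {}
for fname, lname, contact_id in rows:
    groups.setdefault((fname, lname), []).append(contact_id)

print(groups)
```

In PHP the same grouping would be a foreach that appends each row to $dataArr keyed by the name pair.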

Related

SQL need to improve run speed

I am trying to select data from mysql by a date field in the database. (Users can enter start date and end date)
For each selected row between user selected dates, I need to select from the same table to produce a result.
Example:
$query = "SELECT * FROM table WHERE date BETWEEN '$begindate' AND '$enddate'"; //Select by date
$result = mysqli_query($dbc,$query);
while($row = mysqli_fetch_array($result)){
var_dump($row); //user needs to see all data between date selection
$query = "SELECT * FROM table WHERE field = '{$row['field']}'";
// and then do calculations with the data
}
This runs very slowly and I can see why. How can I improve the run speed?
Edit:
The original purpose was to generate a sales report between dates. Now the user wants the report to produce another result. This result can only be produced by searching the same table, and the rows that I need are not within the date selection.
Edit 2:
I do need to output the entire table between the selected dates. For each row, I need to find ALL other rows where field = field, inside or outside the date selection.
Edit 3: Solved the problem. All the answers were helpful, though I think the chosen answer was the most relevant to my question. However, I believe using a join when working with two tables is the right way to go. I actually solved my problem by duplicating the table and running my search against the duplicate. The chosen answer did not work for me because the second query's selection is not a part of the first query's selection. I hope this helps anyone looking at this post. Again, thanks for all the help!

If you really need to match rows against other rows of the same table, I suggest using an IN subquery like the following:
$query = "SELECT * FROM table
WHERE field IN
(SELECT DISTINCT field FROM table
WHERE
date BETWEEN '$begindate' AND '$enddate')";
So the final code will look something like this:
$query = "SELECT * FROM table
WHERE field IN
(SELECT DISTINCT field FROM table
WHERE
date BETWEEN '$begindate' AND '$enddate')";
$result = mysqli_query($dbc,$query);
while($row = mysqli_fetch_array($result)){
// do calculations with the $row
}
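To make the IN subquery concrete, here is a small runnable sketch. It assumes Python's sqlite3 in place of MySQL, and a hypothetical table named sales with made-up rows; the dates are passed as bound parameters rather than interpolated into the string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (id INTEGER PRIMARY KEY, field TEXT, date TEXT);
    INSERT INTO sales (id, field, date) VALUES
        (1, 'a', '2020-01-05'),   -- inside the date range
        (2, 'a', '2019-06-01'),   -- outside the range, same field as row 1
        (3, 'b', '2020-02-01'),   -- inside the date range
        (4, 'c', '2018-01-01');   -- outside the range, field never in range
""")

begin_date, end_date = "2020-01-01", "2020-12-31"

# The subquery finds the fields seen in the date range; the outer query
# returns every row sharing one of those fields, whatever its date.
rows = conn.execute("""
    SELECT id, field, date
    FROM sales
    WHERE field IN (
        SELECT DISTINCT field FROM sales WHERE date BETWEEN ? AND ?
    )
    ORDER BY id
""", (begin_date, end_date)).fetchall()

matched_ids = [row[0] for row in rows]
print(matched_ids)
```

Row 2 is returned even though its date is outside the range, which is the "within or outside the date selection" behaviour the question asks for, and it takes one round trip instead of one query per row.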

I guess your tables aren't really named TABLE. Just use an inner join:
$query = "SELECT *
FROM table1
JOIN table2
ON table1.field = table2.field
WHERE date BETWEEN '$begindate' AND '$enddate'
ORDER BY table1.field";

Stop writing pseudo-SQL
SELECT * FROM is effectively pseudo-SQL (a SQL command whose column list the server has to expand before the command can be executed). It is best to get in the habit of specifying columns in the SELECT statement.
Use SQL joins
Joins are what makes relational databases so useful, and powerful. Learn them. Love them.
Your set of SQL queries, combined into a single query:
SELECT
table1.id as Aid, table1.name as Aname, table1.field as Afield,
table2.id as Bid, table2.name as Bname, table2.field
FROM table table1
LEFT JOIN table table2
ON table1.field = table2.field
WHERE table1.date BETWEEN $begindate AND $enddate
ORDER BY table1.id, table2.id
Your code for printing the data should then access each set of rows along these lines:
$previous_table1_id = 0;
while($row = mysqli_fetch_array($result)){
if ($row['Aid'] != $previous_table1_id) {
echo 'Table1: ' . $row['Aid'] . ' - ' . $row['Aname'] . ' - '. $row['Afield'] . "\n";
$previous_table1_id = $row['Aid'];
}
echo 'Table2: ' . $row['Bid'] . ' - ' . $row['Bname'] . "\n";
}
Dealing with aggregated data
Data aggregation (multiple matches for table1/table2 on field) is a complex subject, but an important one to get to know. For now, I'll leave you with this:
What follows is a simplified example of what aggregated data is, and one of the myriad approaches to working with it.
Contents of Table
id | name | field
--------------------
1 | foos | whoag
2 | doh | whoag
3 | rah | whoag
4 | fun | wat
5 | ish | wat
Result of query I gave you
Aid | Aname | Afield | Bid | Bname
----------------------------------
1 | foos | whoag | 1 | foos
1 | foos | whoag | 2 | doh
1 | foos | whoag | 3 | rah
2 | doh | whoag | 1 | foos
2 | doh | whoag | 2 | doh
2 | doh | whoag | 3 | rah
3 | rah | whoag | 1 | foos
3 | rah | whoag | 2 | doh
3 | rah | whoag | 3 | rah
4 | fun | wat | 4 | fun
4 | fun | wat | 5 | ish
5 | ish | wat | 4 | fun
5 | ish | wat | 5 | ish
GROUP BY example of shrinking result set
SELECT table1.id as Aid, table1.name as Aname,
group_concat(table2.name) as field
FROM table table1
LEFT JOIN table table2
ON table1.field = table2.field
WHERE table1.date BETWEEN $begindate AND $enddate
GROUP BY Aid
ORDER BY Aid
Aid | Aname | field
----------------------------------
1 | foos | foos,doh,rah
2 | doh | foos,doh,rah
3 | rah | foos,doh,rah
4 | fun | fun,ish
5 | ish | fun,ish
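The GROUP BY example can be tried out with the sketch below. It assumes Python's sqlite3 as a stand-in for MySQL (SQLite also provides group_concat), a hypothetical table name items because "table" is a reserved word, and made-up dates that all fall inside the selected range.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, field TEXT, date TEXT);
    INSERT INTO items (id, name, field, date) VALUES
        (1, 'foos', 'whoag', '2020-01-01'),
        (2, 'doh',  'whoag', '2020-01-02'),
        (3, 'rah',  'whoag', '2020-01-03'),
        (4, 'fun',  'wat',   '2020-01-04'),
        (5, 'ish',  'wat',   '2020-01-05');
""")

# Self-join on `field`, then collapse the matches for each row of `a`
# into one comma-separated string.
rows = conn.execute("""
    SELECT a.id, a.name, group_concat(b.name) AS related
    FROM items a
    LEFT JOIN items b ON a.field = b.field
    WHERE a.date BETWEEN ? AND ?
    GROUP BY a.id
    ORDER BY a.id
""", ("2020-01-01", "2020-12-31")).fetchall()

# The order inside group_concat is not guaranteed, so sort after splitting.
related = {row_id: sorted(names.split(",")) for row_id, _name, names in rows}
print(related)
```

One row per id comes back instead of the thirteen rows of the plain join.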

SELECT same row as repeated result from mysql DB

I have a table like
+------+----------+
| id | location |
+------+----------+
| 1 | TVM |
| 2 | KLM |
| 3 | EKM |
+------+----------+
And I have an array of id like [1,2,1,3,1]. I need to get the result as
+------+----------+
| id | location |
+------+----------+
| 1 | TVM |
| 2 | KLM |
| 1 | TVM |
| 3 | EKM |
| 1 | TVM |
+------+----------+
I have already tried conditions like WHERE IN, but with no luck.

A where statement cannot multiply the number of rows. It can only filter rows out. You want a join:
select tl.*
from tablelike tl join
(select 1 as n union all select 2 union all select 1 union all
select 3 union all select 1
) n
on tl.id = n.n;
Note: if you are already generating the list via a query or from a table, then use that for the query rather than passing the list out of the database and then back in.
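A join does not by itself promise that rows come back in the order of the array, so one option is to send each id together with its position and sort on that. The sketch below assumes Python's sqlite3 as a stand-in for MySQL; the CTE name wanted and its pos column are made up for the illustration, and the VALUES syntax shown is SQLite's (MySQL 8 writes the rows as VALUES ROW(...), and older MySQL needs the UNION ALL form above).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablelike (id INTEGER PRIMARY KEY, location TEXT);
    INSERT INTO tablelike (id, location) VALUES (1, 'TVM'), (2, 'KLM'), (3, 'EKM');
""")

ids = [1, 2, 1, 3, 1]

# One "(?, ?)" pair per array element: (position, id).
placeholders = ", ".join("(?, ?)" for _ in ids)
params = [value for pair in enumerate(ids) for value in pair]

sql = f"""
    WITH wanted(pos, id) AS (VALUES {placeholders})
    SELECT t.id, t.location
    FROM wanted w
    JOIN tablelike t ON t.id = w.id
    ORDER BY w.pos
"""
rows = conn.execute(sql, params).fetchall()
print(rows)
```

Only placeholders are formatted into the SQL string; the ids themselves travel as bound parameters.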

You could also return this result with a query like this; this uses a separate SELECT to return each occurrence of row with id=1.
( SELECT id, location FROM mytable WHERE id IN (1,2)
ORDER BY id
)
UNION ALL
( SELECT id, location FROM mytable WHERE id IN (1,3)
ORDER BY id
)
UNION ALL
( SELECT id, location FROM mytable WHERE id IN (1)
ORDER BY id
)
Following a similar pattern, the result could be obtained by combining the results from five SELECT statements, each returning a separate row. That would probably be a little simpler to achieve from a small array, e.g.
$glue = ") ) UNION ALL
(SELECT id, location FROM mytable WHERE id IN (";
$sql = "(SELECT id, location FROM mytable WHERE id IN ("
. implode($glue, $myarray)
. ") )";

SQL Count most used value in one column where other column equals something.

This is a bit of a weird one and I didn't know how to word the title, so please bear with me.
So I have a table like this which stores data on different jobs:
id | company | contact
----------------------
0 | name1 | Bob
1 | name1 | Mark
2 | name3 | Sam
3 | name1 | Bob
4 | name2 | Nigel
5 | name1 | Bob
6 | name3 | Donald
7 | name1 | Sandy
8 | name3 | Nigel
Is there a SQL query I can use to find out the most commonly used contact for a particular company?
So the theoretical code I would be looking for would be something like:
SELECT "Most Commonly used Contact" FROM table WHERE company = "$company";
Is it possible in a single query or is this a multi query job?

Try this SQL query:
SELECT contact, COUNT(*) AS total
FROM table
WHERE company = '$company'
GROUP BY contact
ORDER BY total DESC
LIMIT 1

Basically you want to find the number of contacts grouped by each company, and then grouped by the actual contact. So in other words:
SELECT COUNT(`id`) as num_contacts, `contact`, `company` FROM `jobtable` GROUP BY `company`, `contact` ORDER BY `company`, num_contacts DESC
Or for a single company:
SELECT COUNT(`id`) as num_contacts, `contact` FROM `jobtable` WHERE `company`='$company' GROUP BY `contact` ORDER BY num_contacts DESC

This gives you the most used contact for $company (every one of them, if several are tied) when you can't use LIMIT, e.g. if you are using an Oracle database:
SELECT contact, used_by
FROM (
SELECT contact, COUNT(*) AS used_by
FROM table
WHERE company = '$company'
GROUP BY contact
) t
WHERE used_by = (
SELECT MAX(cnt) FROM (
SELECT COUNT(*) AS cnt
FROM table
WHERE company = '$company'
GROUP BY contact
) m
)
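The counting approach from these answers can be checked against the sample data in the question. A minimal sketch, assuming Python's sqlite3 as a stand-in for MySQL and the table name jobtable used in one of the answers; the helper most_used is hypothetical and returns every contact tied for first place instead of cutting off with LIMIT 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobtable (id INTEGER PRIMARY KEY, company TEXT, contact TEXT);
    INSERT INTO jobtable (id, company, contact) VALUES
        (0, 'name1', 'Bob'),    (1, 'name1', 'Mark'),  (2, 'name3', 'Sam'),
        (3, 'name1', 'Bob'),    (4, 'name2', 'Nigel'), (5, 'name1', 'Bob'),
        (6, 'name3', 'Donald'), (7, 'name1', 'Sandy'), (8, 'name3', 'Nigel');
""")


def most_used(conn, company):
    """Return [(contact, count)] for the contact(s) used most by `company`."""
    return conn.execute("""
        SELECT contact, COUNT(*) AS used_by
        FROM jobtable
        WHERE company = ?
        GROUP BY contact
        HAVING COUNT(*) = (
            SELECT MAX(cnt) FROM (
                SELECT COUNT(*) AS cnt
                FROM jobtable
                WHERE company = ?
                GROUP BY contact
            )
        )
        ORDER BY contact
    """, (company, company)).fetchall()


print(most_used(conn, "name1"))
print(most_used(conn, "name3"))
```

For name1 a single contact wins outright; for name3 all three contacts are used once, so all three are returned.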

Exploding in php

My table 'users' has a 'friends' column, like this:
+----+------+---------+
| id | name | friends |
+----+------+---------+
| 1 | a | 0,1,2 |
| 2 | b | 0,1,3 |
| 3 | c | 0,1 |
+----+------+---------+
How do I use the explode function to get the friend ids one by one (not as 0,1,2), given that they are separated by commas?
How do I then select each id? For example:
$sql = "SELECT id FROM users WHERE id = (exploded)"; // (exploded) = one id from the friends list
$res = mysql_query($sql);
if (mysql_num_rows($res) > 0 ) {
$TPL->addbutton('Unfriend');
}else{
$TPL->addbutton('Add as Friend');
}

The solution here is actually a slight change in your database structure. I recommend you create a "many-to-many" relational table containing all of the users' friends, referenced by user.
+---------+-----------+
| user_id | friend_id |
+---------+-----------+
| 1 | 2 |
| 1 | 3 |
| 1 | 4 |
| 2 | 1 |
| 2 | 5 |
+---------+-----------+
If you are storing lists of values within one field then that is the first sign that your database design is not quite optimal. If you need to search for a numerical value, it'll always be better to place an index on that field to increase efficiency and make the database work for you and not the other way around :)
Then to find out if a user is a friend of someone, you'll query this table -
SELECT * FROM users_friends WHERE
`user_id` = CURRENT_USER AND `friend_id` = OTHER_USER
To get all the friends of a certain user you would do this -
SELECT * FROM users_friends WHERE `user_id` = CURRENT_USER
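A small sketch of this many-to-many table in use, assuming Python's sqlite3 as a stand-in for MySQL and the sample rows from the table above. The composite primary key and the helper names is_friend and friends_of are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users_friends (
        user_id   INTEGER NOT NULL,
        friend_id INTEGER NOT NULL,
        -- The composite key doubles as the index used by both lookups
        -- and prevents the same friendship from being stored twice.
        PRIMARY KEY (user_id, friend_id)
    );
    INSERT INTO users_friends (user_id, friend_id) VALUES
        (1, 2), (1, 3), (1, 4), (2, 1), (2, 5);
""")


def is_friend(conn, user_id, other_id):
    """True if `other_id` is listed as a friend of `user_id`."""
    row = conn.execute(
        "SELECT 1 FROM users_friends WHERE user_id = ? AND friend_id = ?",
        (user_id, other_id),
    ).fetchone()
    return row is not None


def friends_of(conn, user_id):
    """All friend ids of `user_id`, in ascending order."""
    rows = conn.execute(
        "SELECT friend_id FROM users_friends WHERE user_id = ? ORDER BY friend_id",
        (user_id,),
    )
    return [row[0] for row in rows]


# Decide which button to show, as in the question.
button = "Unfriend" if is_friend(conn, 1, 2) else "Add as Friend"
print(button, friends_of(conn, 1))
```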

Here is a simple example that should make clear how to proceed:
// Obtain an array of single values from data like "1,2,3"...
$friends = explode(',', $row['friends']);
Then, back in your query:
// Rebuild a list like "1,2,3" from the array, casting each value to int...
$friendslist = implode(',', array_map('intval', $friends));
// No quotes around the list: IN ('1,2,3') would compare against one string
$sql = "SELECT * FROM users WHERE id IN (" . $friendslist . ")";

To get an array of ids from your string, explode would be used like this:
$my_array = explode("," , $friends);
But you'd probably be better off using the MySQL IN clause:
$sql = "Select id from users where id in (".$row['friends'].")";
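If the comma-separated column has to stay, the split-then-IN idea from these answers can be written with bound parameters instead of pasting the stored string into the SQL. A minimal sketch, assuming Python's sqlite3 as a stand-in for MySQL and the users table from the question; split(",") plays the role of PHP's explode.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, friends TEXT);
    INSERT INTO users (id, name, friends) VALUES
        (1, 'a', '0,1,2'),
        (2, 'b', '0,1,3'),
        (3, 'c', '0,1');
""")

# Fetch the stored list for user 1 and split it into integers.
friends_csv = conn.execute(
    "SELECT friends FROM users WHERE id = ?", (1,)
).fetchone()[0]
friend_ids = [int(part) for part in friends_csv.split(",")]

# One "?" per id; the values are bound, never formatted into the string.
placeholders = ", ".join("?" for _ in friend_ids)
rows = conn.execute(
    f"SELECT id FROM users WHERE id IN ({placeholders}) ORDER BY id",
    friend_ids,
).fetchall()
existing = [row[0] for row in rows]
print(friend_ids, existing)
```

Id 0 is in the stored list but has no row in users, so it drops out of the result.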

Just a quick idea: change your database table. With the current design, problems are certain to arise after a while.
You could have something like this.
id hasfriend
1 2
1 3
2 1 (no need for this row; you already have 1 2)
2 4
.....
You can enforce this with a unique index or in application code. You may think of something better, but change your approach to the problem to something like this.

Need to update missing info in mySQL table column

I'm back again. I've been searching and trying this for hours and haven't found an answer, or even the right question.
I want to fix a crashed table that I recreated from memory (and the members list in Works) using a query in phpMyAdmin. I need to populate each member's total posts.
forum_messages
member_id | message |
--------------------
1 | Hello |
3 | One, Two, Three |
1 | Howdy! |
2 | Here we are again! |
2 | To answer your question... |
forum_members
member_id | posts |
--------------------
1 | 0 |
2 | 0 |
From forum_messages, forum_members should end up looking like this:
forum_members
member_id | posts |
--------------------
1 | 2 |
2 | 2 |
3 | 1 |
Thanks!

Using an INSERT ... SELECT query, you should be able to rebuild the data you lost in the forum_members table.
This would return the number of messages per member_id:
SELECT member_id, COUNT(*) FROM forum_messages GROUP BY member_id;
Combining it with an INSERT puts the result into the table instead of displaying it, as a plain SELECT query normally would.
INSERT INTO forum_members (member_id, posts) SELECT member_id, COUNT(*) FROM forum_messages GROUP BY member_id;

Try this:
UPDATE forum_members SET posts = (SELECT COUNT(*) FROM forum_messages where forum_messages.member_id = forum_members.member_id GROUP BY forum_messages.member_id)

I think you just need to count the messages per member, don't you?
If so, use this SQL:
TRUNCATE TABLE forum_members;
INSERT INTO forum_members(member_id, posts)
SELECT member_id, COUNT(1) FROM forum_messages GROUP BY member_id;
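The rebuild can be rehearsed on the sample data from the question before running it for real. A minimal sketch, assuming Python's sqlite3 as a stand-in for MySQL; SQLite has no TRUNCATE, so a plain DELETE takes its place here, and member_id is assumed to be the primary key of forum_members.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE forum_messages (member_id INTEGER, message TEXT);
    CREATE TABLE forum_members (member_id INTEGER PRIMARY KEY, posts INTEGER);
    INSERT INTO forum_messages (member_id, message) VALUES
        (1, 'Hello'),
        (3, 'One, Two, Three'),
        (1, 'Howdy!'),
        (2, 'Here we are again!'),
        (2, 'To answer your question...');
    INSERT INTO forum_members (member_id, posts) VALUES (1, 0), (2, 0);
""")

# Empty the table and refill it from the message counts in one transaction,
# so a failure in the INSERT does not leave the table empty.
with conn:
    conn.execute("DELETE FROM forum_members")
    conn.execute("""
        INSERT INTO forum_members (member_id, posts)
        SELECT member_id, COUNT(*) FROM forum_messages GROUP BY member_id
    """)

members = conn.execute(
    "SELECT member_id, posts FROM forum_members ORDER BY member_id"
).fetchall()
print(members)
```

Member 3, who was missing from the recreated table, is restored along with the counts. Note that emptying the table first would also discard any other columns forum_members might hold, so this only suits a table that can be rebuilt entirely from forum_messages.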

This should fix it. Please note that the code is NOT TESTED. You should echo the generated update query to check that it's correct before executing it.
$get_memberid = "SELECT distinct(member_id) as member_id FROM forum_members;";
$Rget_memberid = mysql_query($get_memberid) or die(mysql_error());
while($row_get_memberid = mysql_fetch_array($Rget_memberid)) {
$arr_get_memberid[] = array( "member_id" => $row_get_memberid['member_id'] );
}
for ($c = 0; $c < count($arr_get_memberid); $c++){
$update_count = "UPDATE forum_members set posts = (SELECT count(member_id) from forum_messages where member_id = '".$arr_get_memberid[$c]['member_id']."') where member_id = '".$arr_get_memberid[$c]['member_id']."';";
$Rupdate_count = mysql_query($update_count) or die(mysql_error());
}
