I have a set of queries that I am trying to run but I am having issues getting them to run together.
My setup is as follows, with column names in parentheses:
Table 1 (Email / Date_Submitted)
Table 2 (Email / Date)
I have written 3 queries which each work perfectly, independent of each other, but I cannot seem to figure out how to connect them.
Query 1 - Distinct Emails from Table 1 (rfi_log)
SELECT DISTINCT email, date_submitted
FROM rfi_log
WHERE date_submitted BETWEEN '[start_date]' AND '[end_date]'
Query 2 - Distinct Emails from Table 2 (mastersstudies)
SELECT DISTINCT email
FROM orutrimdb.mastersstudies
WHERE date BETWEEN '[start_date]' AND '[end_date]'
Query 3 - Join Query looking for duplicate emails from Table 1 & Table 2
SELECT rfi_log.email as emails, orutrimdb.mastersstudies.email
FROM rfi_log
CROSS JOIN orutrimdb.mastersstudies
ON orutrimdb.mastersstudies.email=rfi_log.email
WHERE date_submitted BETWEEN '[start_date]' AND '[end_date]';
My issue now is that I need to combine these queries in some fashion so that I can get a count of DISTINCT emails from both tables during the date range while EXCLUDING the emails identified by Query 3.
I need the following:
Query 3 = Count of Distinct Emails
Query 2 = Count of Distinct Emails (not identified in Query 3)
Query 1 = Count of Distinct Emails (not identified in Query 3)
Ultimately I need to get a total count of distinct emails during the date range that is "de-duplicated" since there are duplicates located in both tables.
How can this be accomplished?
One method for doing this is union all with aggregation. The following gets duplication information about each email:
select email, sum(isrfi) as numrfi, sum(isms) as numms
from ((select email, 1 as isrfi, 0 as isms
from rfi_log
) union all
(select email, 0, 1
from orutrimdb.mastersstudies
)
) e
group by email;
An aggregation on top gives you the information you are looking for:
select numrfi, numms, count(*), min(email), max(email)
from (select email, sum(isrfi) as numrfi, sum(isms) as numms
from ((select email, 1 as isrfi, 0 as isms
from rfi_log
) union all
(select email, 0, 1
from orutrimdb.mastersstudies
)
) e
group by email
) e
group by numrfi, numms;
Note that this also finds duplicates within a single table.
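As a rough illustration of the union all plus aggregation idea, here is a minimal sketch that runs the same shape of query against an in-memory SQLite database from Python. The sample rows, the date range, and the summary column names (in_both, only_rfi, only_ms, total_distinct) are made up for the example, and SQLite stands in for MySQL, so treat it as a sketch rather than a drop-in query. It also adds the date filter from the question, which the queries above leave out.

```python
import sqlite3

# In-memory SQLite stand-in for the two MySQL tables; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rfi_log (email TEXT, date_submitted TEXT);
    CREATE TABLE mastersstudies (email TEXT, date TEXT);
    INSERT INTO rfi_log VALUES
        ('a@x.com', '2014-01-05'), ('a@x.com', '2014-01-06'),
        ('b@x.com', '2014-01-07'), ('old@x.com', '2013-12-01');
    INSERT INTO mastersstudies VALUES
        ('b@x.com', '2014-01-08'), ('c@x.com', '2014-01-09');
""")

start_date, end_date = "2014-01-01", "2014-01-31"

# Inner query: one row per email with how often it appears in each table.
# Outer query: count emails that are in both tables, in only one, and overall.
row = conn.execute("""
    SELECT SUM(numrfi > 0 AND numms > 0) AS in_both,
           SUM(numrfi > 0 AND numms = 0) AS only_rfi,
           SUM(numrfi = 0 AND numms > 0) AS only_ms,
           COUNT(*)                      AS total_distinct
    FROM (SELECT email, SUM(isrfi) AS numrfi, SUM(isms) AS numms
          FROM (SELECT email, 1 AS isrfi, 0 AS isms
                FROM rfi_log
                WHERE date_submitted BETWEEN :s AND :e
                UNION ALL
                SELECT email, 0, 1
                FROM mastersstudies
                WHERE date BETWEEN :s AND :e) e
          GROUP BY email) per_email
""", {"s": start_date, "e": end_date}).fetchone()

in_both, only_rfi, only_ms, total_distinct = row
print(in_both, only_rfi, only_ms, total_distinct)
```

Here in_both corresponds to the count from Query 3, only_rfi and only_ms to Queries 1 and 2 with the Query 3 emails excluded, and total_distinct to the de-duplicated total.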
Related
I have 5 databases for 5 websites. Each database has 2 tables that represent users in an HTML5 game.
Right now, I'm doing a UNION of 2 tables from ONE database. This is what I'm doing for that:
$query = "SELECT email, SUM(score) as score, username
FROM
(
(SELECT play2helpdb.users.email, play2helpdb.users.score, play2helpdb.users.username FROM play2helpdb.users)
UNION ALL
(SELECT scoreboard.yum.email, scoreboard.yum.score, scoreboard.yum.name FROM scoreboard.yum)
) AS tt
GROUP BY email ORDER BY score DESC";
Now, how do I extend this query for 4 other databases? I basically want to UNION all this data and display in a table using FOREACH.
Kindly help me out here.
EDIT:
I've been trying to fetch a list of chats from a table in MySQL.
This is how the table looks:
|id|sender |receiver |date |
| 1|u1 |u2 |2014-06-12|
| 2|u2 |u1 |2014-06-13|
| 3|u3 |u2 |2014-06-14|
| 4|u1 |u2 |2014-06-15|
| 5|u1 |u3 |2014-06-16|
I want the query to fetch all ids where u1 is the sender or the receiver, but show only the most recent id per conversation, ordered by the date column.
The expected result is something like this:
5 4
This shows that u1 is chatting with u3 and that u1 is chatting with u2 (u2 is also the sender in id 2, but that row is older, so it is not shown).
I tried to build the query with GROUP BY and joins, but I could not get it to work.
Thanks
You can use an inner query to find the max id per conversation partner among the rows where u1 is the sender or the receiver, then use an outer query to fetch those rows and sort them:
SELECT id, date FROM mytable WHERE id IN (
SELECT MAX(id) id
FROM mytable
WHERE sender='u1' OR receiver='u1'
GROUP BY CASE WHEN sender='u1' THEN receiver ELSE sender END
)
ORDER BY date DESC;
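To check the max-id-per-partner query against the sample table from the question, here is a small sketch using an in-memory SQLite database from Python. The helper name latest_chat_ids is hypothetical, and SQLite is only a stand-in for MySQL. Note that it relies on ids increasing with date, as they do in the sample data.

```python
import sqlite3

# Sample table from the question, loaded into an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER PRIMARY KEY, sender TEXT, receiver TEXT, date TEXT);
    INSERT INTO mytable VALUES
        (1, 'u1', 'u2', '2014-06-12'),
        (2, 'u2', 'u1', '2014-06-13'),
        (3, 'u3', 'u2', '2014-06-14'),
        (4, 'u1', 'u2', '2014-06-15'),
        (5, 'u1', 'u3', '2014-06-16');
""")

def latest_chat_ids(conn, user):
    """Return the newest message id per conversation partner, newest first."""
    rows = conn.execute("""
        SELECT id FROM mytable WHERE id IN (
            SELECT MAX(id)
            FROM mytable
            WHERE sender = :u OR receiver = :u
            -- group by "the other person" in each row
            GROUP BY CASE WHEN sender = :u THEN receiver ELSE sender END
        )
        ORDER BY date DESC
    """, {"u": user}).fetchall()
    return [r[0] for r in rows]

print(latest_chat_ids(conn, "u1"))  # [5, 4]
```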
You can run a subquery to get the max date and unique records, then join it back to your table to get the IDs of those rows like this:
select
ba.id,
ba.sender,
ba.receiver
from
yourTable ba
join (
select
sender,
receiver,
max(`date`) as mdate
from
yourTable
where
sender='u1'
or receiver='u1'
group by
sender,
receiver
) sub
on ba.sender=sub.sender
and ba.receiver=sub.receiver
and ba.`date`=sub.mdate
Your table uses a column name I would suggest changing, namely date, which is a MySQL keyword (a non-reserved one, so it works unquoted, but it is easy to confuse with the DATE type and function). You could keep it and wrap it in backticks every time to be safe, but I would strongly suggest renaming it to something that needs no special treatment.
select x.*
from mytable as x
join (
SELECT MAX(date) as d
FROM mytable
WHERE 'u1' in (sender, receiver)
) as t
on x.date = t.d
and 'u1' in (x.sender, x.receiver)
Alternative
select x.*
from mytable as x
WHERE 'u1' in (sender, receiver)
order by date desc
limit 1;
SELECT DISTINCT id FROM TableName
WHERE
Sender = 'u1'
OR
Receiver = 'u1'
ORDER BY date DESC
I have searched all over for an answer and although people say not to use the ORDER BY RAND() clause, I think for my purposes it is ok as this is for a competition which barely has more than a few hundred records at a time PER competition.
So basically I need to retrieve 5 random records from a competition entries table. However, loyalty customers receive one EXTRA entry, for example:
compEntryid | firstName | lastName | compID
1           | bob       | smith    | 100
2           | bob       | smith    | 100
3           | jane      | doe      | 100
4           | sam       | citizen  | 100
etc.
So we are giving the loyalty members a better chance at winning a prize. However, I'm a little worried that the result of a plain ORDER BY RAND() can include 2 entries for the SAME person. What is an optimised method to ensure that we get 5 random, distinct people while still giving those extra entrants a better (weighted) chance? I'm happy to use multiple queries, sub-queries or even a mix of MySQL and PHP. Any advice is deeply appreciated, thank you!
Bass
EDIT:
These 2 queries both work!
query1
SELECT concat(firstName, " ", lastName) name,id, email
FROM t WHERE
RAND()<(SELECT ((5/COUNT(id))*10) FROM t)
group by email ORDER BY RAND() limit 5;
query2
select distinct
email, id, firstName, lastName from
(
select id ,
email, firstName , lastName , compID, rand()/(select count(*) from t where
email=t1.email
) as rank
from t t1
where compID = 100
order by rank) t2 limit 5;
http://sqlfiddle.com/#!2/73470c/2
If you have a few hundred records, I think an ORDER BY RAND() solution should be fine.
The subquery orders the rows by a random value weighted by the number of entries, but duplicates remain. The parent SELECT then takes the first 5 distinct rows.
SELECT DISTINCT firstName ,
lastName ,
compID
FROM
( SELECT compEntryid ,firstName , lastName , compID, rand()/(select count(*)
FROM t
WHERE firstName=t1.firstName AND
lastName = t1.lastName) AS rnk -- "rank" is a reserved word in MySQL 8.0+
FROM t t1
WHERE compID = 100
ORDER BY rnk) t2
LIMIT 5
I think you will need to use a sub query if you want to return a compEntryid.
SELECT t.firstName, t.lastName, t.compID, MIN(compEntryid)
FROM t
INNER JOIN
(
SELECT DISTINCT firstName, lastName, compID
FROM t
ORDER by rand()
LIMIT 5
) t2
ON t.firstName = t2.firstName
AND t.lastName = t2.lastName
AND t.compID = t2.compID
GROUP BY t.firstName, t.lastName, t.compID;
This uses a sub query to get 5 random firstName / lastName / compID. Then joins against the table to get the MIN compEntryId.
However, I am not certain about this. I think it will eliminate the duplicates in the sub query before performing the ORDER BY / LIMIT, which would prevent someone with more entries from having more chances.
EDIT
After some more experimenting, I think I have found a solution, although efficiency is not one of its strong points.
SELECT MIN(compEntryid), firstName, lastName, compID
FROM
(
SELECT firstName, lastName, compID, compEntryid, @seq:=@seq+1 AS seq
FROM
(
SELECT firstName, lastName, compID, compEntryid
FROM t
ORDER by rand()
) sub0
CROSS JOIN (SELECT @seq:=0) sub1
) sub2
GROUP BY sub2.firstName, sub2.lastName, sub2.compID
ORDER BY MIN(seq)
LIMIT 5
This has an inner sub query that gets all the records in a random order. Around that another sub query adds a sequence number to the records. The outer query groups by the name, etc, and orders by the min sequence number for that name. The compEntryId is just grabbed as the MIN for the name / competition (I am assuming you don't care too much about this).
This way, if someone had 5 entries, the inner sub query would mix them up in the list. The next sub query would add a sequence number. At this stage those 5 entries could have sequence numbers 1 to 5. The outer query would order by the lowest sequence number for the name and ignore the others, so of those 5 only sequence number 1 would be used and 2 to 5 ignored, with the next selected person being the one with sequence number 6.
This way the more entries they have the more likely they are to be a winner, but can't be 2 of the 5 winners.
With thanks to kiks73 for setting up some sqlfiddle data:-
http://sqlfiddle.com/#!2/cd777/1
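Since the question allows mixing MySQL with application code, the same shuffle-then-skip-repeats idea can be sketched outside SQL. The following is a minimal Python sketch, not the answer's query: the function name draw_winners and the sample entries are hypothetical, and unlike the SQL above it returns the entry id that was actually drawn rather than MIN(compEntryid).

```python
import random

def draw_winners(entries, k=5, rng=random):
    """Pick up to k distinct people; every entry row counts as one ticket."""
    shuffled = list(entries)
    rng.shuffle(shuffled)      # plays the role of ORDER BY RAND() in the inner sub query
    winners = []
    seen = set()
    for entry_id, first, last in shuffled:
        person = (first, last)
        if person in seen:     # a later ticket for an already drawn person is ignored
            continue
        seen.add(person)
        winners.append((entry_id, first, last))
        if len(winners) == k:
            break
    return winners

# Invented sample data: bob smith is a loyalty member with two entries.
entries = [
    (1, "bob", "smith"), (2, "bob", "smith"),
    (3, "jane", "doe"), (4, "sam", "citizen"),
    (5, "ann", "lee"), (6, "tom", "ray"), (7, "kim", "park"),
]
winners = draw_winners(entries, k=5, rng=random.Random(42))
print(winners)
```

A person with more entries is more likely to appear early in the shuffled list, so they are more likely to win, but they can only take one of the k places.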
EDIT
A solution based on the one above by @kiks73, tweaked to use a non-correlated sub query for the counts and to eliminate a few uncertainties. For example, with his solution I am not quite sure whether MySQL will choose to do the DISTINCT by implicitly doing a GROUP BY, which would also implicitly order the results prior to applying the LIMIT (it doesn't seem to, but I am not sure this behaviour is defined).
SELECT t.firstName ,
t.lastName ,
t.compID,
MIN(rand() / t1.entry_count) AS rnk -- "rank" is a reserved word in MySQL 8.0+
FROM
(
SELECT firstName, lastName, compID, COUNT(*) AS entry_count
FROM t
GROUP BY firstName, lastName, compID
) t1
INNER JOIN t
ON t.firstName=t1.firstName
AND t.lastName = t1.lastName
AND t.compID = t1.compID
GROUP BY t.firstName, t.lastName, t.compID
ORDER BY rnk
LIMIT 5
I have a table with customer info. Normally, the PHP checks for duplicates before new rows are inserted. However, I had to dump a lot of older rows in manually, and now that they are all in my table, I need to check for duplicates.
Example rows:
id, name, email, phone, fax
I would like to do a MySQL query that will show all IDs with matching emails. I can modify the query later for phone, fax, etc.
I have a feeling I will be using DISTINCT, but I am not quite sure how it's done.
You can GROUP BY email with HAVING COUNT(*) > 1 to find all duplicate email addresses, then join the resulting duplicate emails with your table to fetch the ids:
SELECT id FROM my_table NATURAL JOIN (
SELECT email FROM my_table GROUP BY email HAVING COUNT(*) > 1
) t
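Here is a small sketch of the GROUP BY / HAVING approach run against an in-memory SQLite database from Python. The sample rows are invented, and the example swaps the NATURAL JOIN for an explicit join on email so that the join column is visible; it also returns the email next to each id to make the duplicate groups easier to read.

```python
import sqlite3

# Invented sample customers; ids 1 and 3 share an email address.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (id INTEGER PRIMARY KEY, name TEXT, email TEXT, phone TEXT, fax TEXT);
    INSERT INTO my_table VALUES
        (1, 'Ann',   'ann@x.com', '111', NULL),
        (2, 'Bob',   'bob@x.com', '222', NULL),
        (3, 'Ann B', 'ann@x.com', '333', NULL),
        (4, 'Cy',    'cy@x.com',  '444', NULL);
""")

# Derived table d holds every email that occurs more than once;
# joining back to my_table fetches the ids of those rows.
dupes = conn.execute("""
    SELECT m.id, m.email
    FROM my_table AS m
    JOIN (SELECT email FROM my_table GROUP BY email HAVING COUNT(*) > 1) AS d
      ON m.email = d.email
    ORDER BY m.email, m.id
""").fetchall()
print(dupes)  # [(1, 'ann@x.com'), (3, 'ann@x.com')]
```

To check phone or fax instead, replace email in both the derived table and the join condition.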
I have two tables. One tracks Part Shipments and the other tracks System shipments.
I am trying to count the customer contacts in each table with the result showing me the total customer contacts for both parts and systems combined.
I am trying to use UNION, and I would guess from my results that I am doing this all wrong. My results end up with two entries per customer: Cust A will have a total of 9 and then another entry of 1. So I am guessing the customer contacts are not merged and the query is just stacking both result sets.
The code I am using:
SELECT Count(part_shipment.Customer_Station_ID) AS Contact,
part_shipment.Customer_Station_ID AS Customer
FROM part_shipment
GROUP BY part_shipment.Customer_Station_ID
UNION
SELECT Count(system_shipments.Customer_Station_ID) AS Contact,
system_shipments.Customer_Station_ID AS Customer
FROM system_shipments
GROUP BY system_shipments.Customer_Station_ID
ORDER BY Contact DESC
You can't do it like that. UNION just takes the rows from the first query and the rows from the second query and returns them one after another.
To work with the combined rows, wrap the UNION in a derived table (a table created from a query) and give it an alias:
SELECT *
FROM (
SELECT col1, col2
FROM table1
UNION
SELECT col1, col2
FROM otherTable
) AS combined
GROUP BY is allowed inside each SELECT that makes up the UNION, but on its own it will not merge the rows for the same customer across the two tables.
Have you tried using GROUP BY and SUM on the results of the UNION query?
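The GROUP BY and SUM suggestion could look roughly like the following sketch, run against an in-memory SQLite database from Python. The sample shipments and the Total and combined aliases are made up for the example. It uses UNION ALL rather than UNION, because UNION would silently drop a row if a customer happened to have the same count in both tables.

```python
import sqlite3

# Invented sample data: customer A has 3 part shipments and 1 system shipment.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part_shipment (Customer_Station_ID TEXT);
    CREATE TABLE system_shipments (Customer_Station_ID TEXT);
    INSERT INTO part_shipment VALUES ('A'), ('A'), ('A'), ('B');
    INSERT INTO system_shipments VALUES ('A'), ('C'), ('C');
""")

# Inner queries: per-table counts. Outer query: one merged total per customer.
totals = conn.execute("""
    SELECT Customer, SUM(Contact) AS Total
    FROM (SELECT Customer_Station_ID AS Customer, COUNT(*) AS Contact
          FROM part_shipment
          GROUP BY Customer_Station_ID
          UNION ALL
          SELECT Customer_Station_ID, COUNT(*)
          FROM system_shipments
          GROUP BY Customer_Station_ID) AS combined
    GROUP BY Customer
    ORDER BY Total DESC, Customer
""").fetchall()
print(totals)  # [('A', 4), ('C', 2), ('B', 1)]
```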