I've always struggled with MySQL joins. I've started incorporating them more, but I'm still struggling to understand them despite reading dozens of tutorials and the MySQL manual.
My situation is I have 3 tables:
/* BASICALLY A TABLE THAT HOLDS FAN RECORDS */
CREATE TABLE `fans` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`first_name` varchar(255) DEFAULT NULL,
`middle_name` varchar(255) DEFAULT NULL,
`last_name` varchar(255) DEFAULT NULL,
`email` varchar(255) DEFAULT NULL,
`join_date` datetime DEFAULT NULL,
`twitter` varchar(255) DEFAULT NULL,
`twitterCrawled` datetime DEFAULT NULL,
`twitterImage` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `email` (`email`)
) ENGINE=MyISAM AUTO_INCREMENT=20413 DEFAULT CHARSET=latin1;
/* A TABLE OF OUR TWITTER FOLLOWERS */
CREATE TABLE `twitterFollowers` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`screenName` varchar(25) DEFAULT NULL,
`twitterId` varchar(25) DEFAULT NULL,
`customerId` int(11) DEFAULT NULL,
`uniqueStr` varchar(50) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `unique` (`uniqueStr`)
) ENGINE=InnoDB AUTO_INCREMENT=13426 DEFAULT CHARSET=utf8;
/* TABLE THAT SUGGESTS A LIKELY MATCH OF A TWITTER FOLLOWER BASED ON THE EMAIL / SCREEN NAME COMPARISON OF THE FAN vs OUR FOLLOWERS
IF SOMEONE (ie. a moderator) CONFIRMS OR DENIES THAT IT'S A GOOD MATCH THEY PUT A DATESTAMP IN `dismissed` */
CREATE TABLE `contentSuggestion` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`userId` int(11) DEFAULT NULL,
`fanId` int(11) DEFAULT NULL,
`twitterAccountId` int(11) DEFAULT NULL,
`contentType` varchar(50) DEFAULT NULL,
`contentString` varchar(255) DEFAULT NULL,
`added` datetime DEFAULT NULL,
`dismissed` datetime DEFAULT NULL,
`uniqueStr` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `unstr` (`uniqueStr`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8;
What I'm trying to get is:
SELECT [fan columns]
WHERE fan screen name IS IN twitterFollowers
AND fan screen name IS NOT IN contentSuggestion (with a datestamp in dismissed)
My attempts so far:
~33 seconds
SELECT fans.id, tf.screenName as col1, tf.twitterId as col2 FROM fans
LEFT JOIN twitterFollowers tf ON tf.screenName = fans.emailUsername
LEFT JOIN contentSuggestion cs ON cs.contentString = tf.screenName WHERE dismissed IS NULL
GROUP BY(fans.id) HAVING col1 != ''
~14 seconds
SELECT id, emailUsername
FROM fans
WHERE emailUsername IN (SELECT DISTINCT(screenName) FROM twitterFollowers)
  AND emailUsername NOT IN (SELECT DISTINCT(contentString) FROM contentSuggestion WHERE dismissed IS NULL)
GROUP BY (fans.id);
9.53 seconds
SELECT fans.id, tf.screenName as col1, tf.twitterId as col2 FROM fans
LEFT JOIN twitterFollowers tf ON tf.screenName = fans.emailUsername WHERE tf.uniqueStr NOT IN(SELECT uniqueStr FROM contentSuggestion WHERE dismissed IS NULL)
I hope there is a better way. I've been struggling to really use JOINs beyond a single LEFT JOIN, which has already helped me speed up other queries by a significant amount.
Thanks for any help you can give me.
I would go with a variation of the second method. Instead of IN, use EXISTS. Then add the correct indexes and remove the aggregation:
SELECT f.id, f.emailUsername
FROM fans f
WHERE EXISTS (SELECT 1
FROM twitterFollowers tf
WHERE f.emailUsername = tf.screenName
) AND
NOT EXISTS (SELECT 1
FROM contentSuggestion cs
WHERE f.emailUsername = cs.contentString AND
cs.dismissed IS NULL
) ;
Then be sure you have the following indexes: twitterFollowers(screenName) and contentSuggestion(contentString, dismissed).
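As a sketch, the DDL for those indexes could look like this (index names are illustrative):
-- Supports the EXISTS lookup on screenName
ALTER TABLE twitterFollowers
    ADD INDEX idx_tf_screenName (screenName);
-- Supports the NOT EXISTS lookup on contentString with the dismissed filter
ALTER TABLE contentSuggestion
    ADD INDEX idx_cs_contentString_dismissed (contentString, dismissed);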
Some notes:
When using IN, don't use SELECT DISTINCT. I'm not 100% sure that MySQL is always smart enough to ignore the DISTINCT in the subquery (it is redundant).
Historically, EXISTS was faster than IN in MySQL. The optimizer has improved in recent versions.
For performance, you need the correct indexes.
Assuming that fan.id is unique (a very reasonable assumption), you don't need the final group by.
I need to pull the data and write it to a CSV file, but it's taking too much time and too much RAM. What is wrong with it, and what can I do? Also, I feel like there's a redundancy in the query itself. I'm doing this with PHP.
Here's the query
CREATE TEMPORARY TABLE temp1 SELECT * FROM vicidial_closer_log
USE INDEX(call_date)
WHERE call_date BETWEEN '1980-01-01 00:00:00' AND '2019-03-12 23:59:59';
CREATE TEMPORARY TABLE temp2 SELECT * FROM vicidial_closer_log
USE INDEX(call_date)
WHERE call_date BETWEEN '1980-01-01 00:00:00' AND '2019-03-12 23:59:59';
SELECT a.call_date,
       a.lead_id,
       a.phone_number AS customer_number,
       IF(a.status != 'DROP', 'ANSWERED', 'UNANSWERED') AS status,
       IF(a.lead_id IS NOT NULL, 'inbound', 'outbound') AS call_type,
       a.USER AS agent,
       a.campaign_id AS skill,
       NULL AS campaign,
       a.status AS disposition,
       a.term_reason AS Hangup,
       a.uniqueid,
       Sec_to_time(a.queue_seconds) AS time_to_answer,
       Sec_to_time(a.length_in_sec - a.queue_seconds) AS talk_time,
       Sec_to_time(a.park_sec) AS hold_sec,
       Sec_to_time(a.dispo_sec) AS wrapup_sec,
       From_unixtime(a.start_epoch) AS start_time,
       From_unixtime(a.end_epoch) AS end_time,
       c.USER AS transfered,
       a.comments,
       IF(a.length_in_sec IS NULL, Sec_to_time(a.queue_seconds),
          Sec_to_time(a.length_in_sec + a.dispo_sec)) AS duration,
       Sec_to_time(a.length_in_sec - a.queue_seconds + a.dispo_sec) AS handling_time
FROM temp1 a
LEFT OUTER JOIN temp2 c
       ON a.uniqueid = c.uniqueid
      AND a.closecallid < c.closecallid
GROUP BY a.closecallid
I've uploaded screenshots of the table structure and the indices.
Table Structure
Indices of Table
Thanks.
UPDATE:
SHOW CREATE TABLE vicidial_closer_log
CREATE TABLE `vicidial_closer_log` (
`closecallid` int(9) unsigned NOT NULL AUTO_INCREMENT,
`lead_id` int(9) unsigned NOT NULL,
`list_id` bigint(14) unsigned DEFAULT NULL,
`campaign_id` varchar(20) COLLATE utf8_unicode_ci DEFAULT NULL,
`call_date` datetime DEFAULT NULL,
`start_epoch` int(10) unsigned DEFAULT NULL,
`end_epoch` int(10) unsigned DEFAULT NULL,
`length_in_sec` int(10) DEFAULT NULL,
`status` varchar(6) COLLATE utf8_unicode_ci DEFAULT NULL,
`phone_code` varchar(10) COLLATE utf8_unicode_ci DEFAULT NULL,
`phone_number` varchar(18) COLLATE utf8_unicode_ci DEFAULT NULL,
`user` varchar(20) COLLATE utf8_unicode_ci DEFAULT NULL,
`comments` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`processed` enum('Y','N') COLLATE utf8_unicode_ci DEFAULT NULL,
`queue_seconds` decimal(7,2) DEFAULT 0.00,
`user_group` varchar(20) COLLATE utf8_unicode_ci DEFAULT NULL,
`xfercallid` int(9) unsigned DEFAULT NULL,
`term_reason` enum('CALLER','AGENT','QUEUETIMEOUT','ABANDON','AFTERHOURS','HOLDRECALLXFER', 'HOLDTIME','NOAGENT','NONE','MAXCALLS') COLLATE utf8_unicode_ci DEFAULT 'NONE',
`uniqueid` varchar(20) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
`agent_only` varchar(20) COLLATE utf8_unicode_ci DEFAULT '',
`queue_position` smallint(4) unsigned DEFAULT 1,
`called_count` smallint(5) unsigned DEFAULT 0,
`nopaperform` varchar(5) COLLATE utf8_unicode_ci NOT NULL DEFAULT 'NO',
`park_sec` int(3) DEFAULT 0,
`dispo_sec` int(3) DEFAULT 0,
`record_file` text COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`closecallid`),
KEY `lead_id` (`lead_id`),
KEY `call_date` (`call_date`),
KEY `campaign_id` (`campaign_id`),
KEY `uniqueid` (`uniqueid`),
KEY `phone_number` (`phone_number`),
KEY `date_user` (`call_date`,`user`),
KEY `closecallid` (`closecallid`)
) ENGINE=MyISAM AUTO_INCREMENT=1850672 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
EXPLAIN (on the third query only):
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE a ALL NULL NULL NULL NULL 664640 Using temporary; Using filesort
1 SIMPLE c ALL NULL NULL NULL NULL 662480 Using where; Using join buffer (flat, BNL join)
UPDATE (updated query):
SELECT a.call_date,
       a.lead_id,
       a.phone_number AS customer_number,
       IF(a.status != 'DROP', 'ANSWERED', 'UNANSWERED') AS status,
       IF(a.lead_id IS NOT NULL, 'inbound', 'outbound') AS call_type,
       a.user AS agent,
       a.campaign_id AS skill,
       NULL AS campaign,
       a.status AS disposition,
       a.term_reason AS Hangup,
       a.uniqueid,
       Sec_to_time(a.queue_seconds) AS time_to_answer,
       Sec_to_time(a.length_in_sec - a.queue_seconds) AS talk_time,
       Sec_to_time(a.park_sec) AS hold_sec,
       Sec_to_time(a.dispo_sec) AS wrapup_sec,
       From_unixtime(a.start_epoch) AS start_time,
       From_unixtime(a.end_epoch) AS end_time,
       c.user AS transfered,
       a.comments,
       IF(a.length_in_sec IS NULL, Sec_to_time(a.queue_seconds),
          Sec_to_time(a.length_in_sec + a.dispo_sec)) AS duration,
       Sec_to_time(a.length_in_sec - a.queue_seconds + a.dispo_sec) AS handling_time
FROM vicidial_closer_log a
LEFT OUTER JOIN vicidial_closer_log c
       ON a.closecallid <> c.closecallid
      AND a.uniqueid = c.uniqueid
      AND a.closecallid < c.closecallid
WHERE a.call_date BETWEEN '2018-01-01 00:00:00' AND '2019-03-13 23:59:59'
EXPLAIN on updated query:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE a ALL call_date,date_user NULL NULL NULL 662829 Using where
1 SIMPLE c ref PRIMARY,uniqueid,closecallid uniqueid 62 aastell_bliss.a.uniqueid 1 Using where
Updated Query Execution Result:
Number of rows present in the given time range: 155016
Time taken: 0.0149 secs
It works!
Summary of comments that led to an answer:
CREATE TEMPORARY TABLE ... SELECT doesn't create indexes on the temporary table
Explicit use of a temporary table, particularly of a large size, will rarely give a performance gain.
Using table aliases in a join allows for a self join
GROUP BY on the primary key of the left side of a join doesn't add much, as it's already unique and the join had no aggregate expressions. GROUP BY also adds an implicit ORDER BY (in MySQL before 8.0), so the query could end up slower if a secondary index was used to join the table.
While the date range in the query was large, preparing for it to be a significant filter when it is small makes call_date more favourable as an index. To make this even more favourable, the join key is added to the end of the index so that most of the work of the join can happen by looking only at the index (see the sketch after this list).
When PK is on a column, a secondary index on the same column isn't needed.
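A minimal sketch of that composite index, with an illustrative name:
-- call_date handles the range filter; the join key uniqueid is appended
-- so more of the self-join work can be done from the index itself.
ALTER TABLE vicidial_closer_log
    ADD INDEX idx_call_date_uniqueid (call_date, uniqueid);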
I have the following two tables: api_analytics_data and telecordia.
CREATE TABLE `api_analytics_data` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`upload_file_id` bigint(20) NOT NULL,
`partNumber` varchar(100) DEFAULT NULL,
`clei` varchar(45) DEFAULT NULL,
`description` varchar(150) DEFAULT NULL,
`processed` tinyint(1) DEFAULT '0',
PRIMARY KEY (`id`),
KEY `idx_aad_clei` (`clei`),
KEY `idx_aad_pn` (`partNumber`),
KEY `id_aad_processed` (`processed`),
KEY `idx_combo1` (`partNumber`,`clei`,`upload_file_id`)
) ENGINE=InnoDB CHARSET=latin1;
CREATE TABLE `telecordia` (
`tid` int(11) NOT NULL AUTO_INCREMENT,
`ProdID` varchar(50) DEFAULT NULL,
`Mfg` varchar(20) DEFAULT NULL,
`Pn` varchar(50) DEFAULT NULL,
`Clei` varchar(50) DEFAULT NULL,
`Series` varchar(50) DEFAULT NULL,
`Dsc` varchar(50) DEFAULT NULL,
`Eci` varchar(50) DEFAULT NULL,
`AddDate` date DEFAULT NULL,
`ChangeDate` date DEFAULT NULL,
`Cost` float DEFAULT NULL,
PRIMARY KEY (`tid`),
KEY `telecordia.ProdID` (`ProdID`) USING BTREE,
KEY `telecordia.clei` (`Clei`),
KEY `telecordia.pn` (`Pn`),
KEY `telcordia.eci` (`Eci`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Users upload data via a web interface using Excel/CSV files into api_analytics_data. The data contains EITHER the partNumbers or the CLEIs. I then update the api_analytics_data table by joining the telecordia table. The telecordia table is the master list of partNumbers and CLEIs.
So if a user uploads a file of CLEIs, the update/join I use is:
update api_analytics_data aad
inner join telecordia t on aad.clei = t.Clei
set aad.partNumber = t.Pn
where aad.partNumber is null
and aad.upload_file_id = 5;
It works quickly, but not very thoroughly. The problem I have is that the CLEI uploaded may only be a substring of the CLEI in the telecordia table.
For example, the uploaded CLEI may be "5SC1DX0". In the telecordia table, the correct matching row is:
tid: 184324
ProdID: 472467
Mfg: PLSE
Pn: AUA58-2-REV-E
Clei: 5SC1DX04AA
Series: null
Dsc: DL SGL-PTY POTS CU RT
Eci: 205756
AddDate: 1994-03-18
ChangeDate: 1998-04-13
Cost: null
So obviously my update doesn't work in this case, even though 5SC1DX0 and 5SC1DX04AA are the same part.
What I need is a wildcard search. However, when I try this, it is crazy slow. With about 4500 rows uploaded into the api_analytics_data table, it runs for about 10 minutes, and then loses the connection with the server.
update api_analytics_data aad
inner join telecordia t on aad.clei like concat(t.Clei,'%')
set aad.partNumber = t.Pn
where aad.partNumber is null
and aad.upload_file_id = 5;
Is there a way to optimize this so that it runs quickly?
The correct answer is "no". The better course of action is to create a new column in telecordia with the correct Clei value in it, one that can be used for joining the tables. In the most recent versions of MySQL, this can even be a computed column and be indexed.
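For example, here is a minimal sketch of that approach, assuming (as the example data suggests) that the matching portion is always the first 7 characters; the column and index names are hypothetical, and generated columns need MySQL 5.7+:
-- Hypothetical generated column holding the short form of the CLEI,
-- plus an index on it; Pn is appended so the SET clause can be
-- satisfied from the index as well.
ALTER TABLE telecordia
    ADD COLUMN clei_short VARCHAR(7) GENERATED ALWAYS AS (LEFT(Clei, 7)) STORED,
    ADD INDEX idx_telecordia_clei_short (clei_short, Pn);
The update could then join on aad.clei = t.clei_short and use that index.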
That said, you might be able to do something if the matching portion is always the same length. If so, try this:
update api_analytics_data aad inner join
telecordia t
on t.Clei = left(aad.clei, 7)
set aad.partNumber = t.Pn
where aad.partNumber is null and aad.upload_file_id = 5;
For this query, you want an index on api_analytics_data(upload_file_id, partNumber, clei) and telecordia(clei, pn).
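As a sketch, that DDL might be (index names illustrative):
ALTER TABLE api_analytics_data
    ADD INDEX idx_aad_upload_pn_clei (upload_file_id, partNumber, clei);
ALTER TABLE telecordia
    ADD INDEX idx_telecordia_clei_pn (Clei, Pn);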
These are my tables. The first one is the appusers table.
CREATE TABLE IF NOT EXISTS `appusers` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`email` varchar(50) NOT NULL,
`is_active` tinyint(2) NOT NULL DEFAULT '0',
`zip` varchar(20) NOT NULL,
`city` text NOT NULL,
`country` text NOT NULL,
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=23 ;
The second table is the stickeruses table.
CREATE TABLE IF NOT EXISTS `stickeruses` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`sticker_id` int(11) NOT NULL,
`count` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=24 ;
The third table is the devices table.
CREATE TABLE IF NOT EXISTS `devices` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`regid` varchar(300) NOT NULL,
`imei` varchar(50) NOT NULL,
`device_type` tinyint(2) NOT NULL,
`notification` tinyint(2) NOT NULL DEFAULT '1',
`is_active` tinyint(2) NOT NULL DEFAULT '0',
`activationcode` int(6) NOT NULL,
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=28 ;
I want to find SUM(stickeruses.count) and COUNT(devices.id) for all appusers.
Here is my query.
SELECT `Appuser`.`id`, `Appuser`.`email`, `Appuser`.`country`, `Appuser`.`created`,
`Appuser`.`is_active`, SUM(`Stickeruse`.`count`) AS total, COUNT(`Device`.`id`)
AS tdevice
FROM `stickerapp`.`appusers` AS `Appuser`
LEFT JOIN `stickerapp`.`stickeruses` AS `Stickeruse`
ON (`Stickeruse`.`user_id`=`Appuser`.`id`)
INNER JOIN `stickerapp`.`devices` AS `Device`
ON (`Device`.`user_id`=`Appuser`.`id`)
WHERE `Appuser`.`is_active` = 1
GROUP BY `Appuser`.`id`
LIMIT 10
When I apply each join separately the results are right, but when I combine both joins the results are wrong. Please help.
When mixing JOIN and LEFT JOIN it is a good idea to use parentheses to make it clear what your intent is.
I don't know what you need, but these syntaxes might give you different results:
FROM a LEFT JOIN ( b JOIN c ON b..c.. ) bc ON a..bc..
FROM ( a LEFT JOIN b ON a..b.. ) ab JOIN c ON ab..c..
Also, you can rearrange them to FROM a JOIN c LEFT JOIN b (plus parentheses) or any of several other arrangements. Granted, some pairs of rearrangements are equivalent. A concrete version of the first pattern is sketched below.
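As an illustration only, here is the first pattern written out with the question's tables; whether it is what you want depends on the semantics you need:
-- stickeruses and devices are joined first; the LEFT JOIN then keeps
-- every active appuser even when that inner join produces no rows,
-- in which case the su and d columns come back NULL.
SELECT au.id, au.email, su.`count`, d.id AS device_id
FROM stickerapp.appusers AS au
LEFT JOIN ( stickerapp.stickeruses AS su
            JOIN stickerapp.devices AS d ON d.user_id = su.user_id )
       ON su.user_id = au.id
WHERE au.is_active = 1;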
Also, beware: aggregates (such as SUM()) get inflated values when JOINing. Think of it this way: first the JOINs get all appropriate combinations of rows from the tables, then the SUM adds them up. With that in mind, see if this works better:
SELECT a.`id`, a.`email`, a.`country`, a.`created`, a.`is_active`,
( SELECT SUM(`count`)
FROM stickerapp.stickeruses
WHERE user_id = a.id
) AS total,
( SELECT COUNT(*)
FROM stickerapp.devices
WHERE user_id = a.id
) AS tdevice
FROM stickerapp.`appusers` AS a
WHERE a.`is_active` = 1
GROUP BY a.`id`
LIMIT 10
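An alternative sketch that also avoids the inflation: pre-aggregate each child table in a derived table and join the one-row-per-user results back to appusers:
-- Each derived table collapses to one row per user before the join,
-- so SUM() and COUNT() cannot be inflated by the other table.
SELECT a.id, a.email, a.country, a.created, a.is_active,
       s.total, d.tdevice
FROM stickerapp.appusers AS a
LEFT JOIN ( SELECT user_id, SUM(`count`) AS total
            FROM stickerapp.stickeruses
            GROUP BY user_id ) AS s ON s.user_id = a.id
LEFT JOIN ( SELECT user_id, COUNT(*) AS tdevice
            FROM stickerapp.devices
            GROUP BY user_id ) AS d ON d.user_id = a.id
WHERE a.is_active = 1
LIMIT 10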
Abstract:
Every client is given a specific XML ad feed (publisher_feed table). Every time there is a query or a click on that feed, it gets recorded (publisher_stats_raw table); each query/click can have multiple rows depending on the subid passed by the client (we can sum the clicks together). The next day, we pull stats from an API to grab the previous day's revenue numbers (rev_stats table); each revenue stat might have multiple rows depending on the country of the click (we can sum the revenue together). I've been having a hard time trying to link these three tables together to find the average RPC for each client for the previous day.
Table Structure:
CREATE TABLE `publisher_feed` (
`publisher_feed_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`alias` varchar(45) DEFAULT NULL,
`user_id` int(10) unsigned DEFAULT NULL,
`remote_feed_id` int(10) unsigned DEFAULT NULL,
`subid` varchar(255) DEFAULT '',
`requirement` enum('tq','tier2','ron','cpv','tos1','tos2','tos3','pv1','pv2','pv3','ar','ht') DEFAULT NULL,
`status` enum('enabled','disabled') DEFAULT 'enabled',
`tq` decimal(4,2) DEFAULT '0.00',
`clicklimit` int(11) DEFAULT '0',
`prev_rpc` decimal(20,10) DEFAULT '0.0000000000',
PRIMARY KEY (`publisher_feed_id`),
UNIQUE KEY `alias_UNIQUE` (`alias`),
KEY `publisher_feed_idx` (`remote_feed_id`),
KEY `publisher_feed_user` (`user_id`),
CONSTRAINT `publisher_feed_feed` FOREIGN KEY (`remote_feed_id`) REFERENCES `remote_feed` (`remote_feed_id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `publisher_feed_user` FOREIGN KEY (`user_id`) REFERENCES `user` (`user_id`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=124 DEFAULT CHARSET=latin1$$
CREATE TABLE `publisher_stats_raw` (
`publisher_stats_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`unique_data` varchar(350) NOT NULL,
`publisher_feed_id` int(10) unsigned DEFAULT NULL,
`date` date DEFAULT NULL,
`subid` varchar(255) DEFAULT NULL,
`queries` int(10) unsigned DEFAULT '0',
`impressions` int(10) unsigned DEFAULT '0',
`clicks` int(10) unsigned DEFAULT '0',
`filtered` int(10) unsigned DEFAULT '0',
`revenue` decimal(20,10) unsigned DEFAULT '0.0000000000',
PRIMARY KEY (`publisher_stats_id`),
UNIQUE KEY `unique_data_UNIQUE` (`unique_data`),
KEY `publisher_stats_raw_remote_feed_idx` (`publisher_feed_id`)
) ENGINE=InnoDB AUTO_INCREMENT=472 DEFAULT CHARSET=latin1$$
CREATE TABLE `rev_stats` (
`rev_stats_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`date` date DEFAULT NULL,
`remote_feed_id` int(10) unsigned DEFAULT NULL,
`typetag` varchar(255) DEFAULT NULL,
`subid` varchar(255) DEFAULT NULL,
`country` varchar(2) DEFAULT NULL,
`revenue` decimal(20,10) DEFAULT NULL,
`tq` decimal(4,2) DEFAULT NULL,
`finalized` int(11) DEFAULT '0',
PRIMARY KEY (`rev_stats_id`),
KEY `rev_stats_remote_feed_idx` (`remote_feed_id`),
CONSTRAINT `rev_stats_remote_feed` FOREIGN KEY (`remote_feed_id`) REFERENCES `remote_feed` (`remote_feed_id`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=58 DEFAULT CHARSET=latin1$$
Context:
Each remote_feed has a specific subid/typetag given to it. So we need to match up both the remote_feed_id and subid columns from the publisher_feed table to the remote_feed_id and typetag columns in the rev_stats table.
My current, non-working implementation:
SELECT
pf.publisher_feed_id, psr.date, sum(clicks), sum(rs.revenue)
FROM
xml_network.publisher_feed pf
JOIN
xml_network.publisher_stats_raw psr
ON
psr.publisher_feed_id = pf.publisher_feed_id
JOIN
xml_network.rev_stats rs
ON
rs.remote_feed_id = pf.remote_feed_id
WHERE
pf.requirement = 'tq'
AND
pf.subid = rs.typetag
AND
psr.date <> date(curdate())
GROUP BY
psr.date
ORDER BY
psr.date DESC
LIMIT 1;
The above keeps pulling the wrong data out of the rev_stats table (it pulls the sum of the correct stats, but repeats it over and over because of the join). Any help with how to properly pull the correct data would be greatly appreciated (I could use multiple queries and PHP to get the correct results, but what's the fun in that!).
Figured out a way to get this accomplished. It's definitely not a fast method by any means, needing 4 selects to get it done, but it works flawlessly =)
SELECT
pf.publisher_feed_id,
round(
(
SELECT
SUM(rs.revenue)
FROM
xml_network.rev_stats rs
WHERE
rs.remote_feed_id = pf.remote_feed_id
AND
rs.typetag = pf.subid
AND
rs.date = subdate(current_date, 1)
), 10) AS revenue,
(
SELECT
MAX(rs.tq)
FROM
xml_network.rev_stats rs
WHERE
rs.remote_feed_id = pf.remote_feed_id
AND
rs.typetag = pf.subid
AND
rs.date = subdate(current_date, 1)
) as tq,
(
SELECT
SUM(psr.clicks)-SUM(psr.filtered)
FROM
xml_network.publisher_stats_raw psr
WHERE
psr.publisher_feed_id = pf.publisher_feed_id
AND
psr.date = subdate(current_date, 1)
) as clicks
FROM
xml_network.publisher_feed pf
WHERE
pf.requirement = 'tq';
Trying to track outbound clicks on advertisements, but I'm having trouble constructing the query to compile all the statistics for the user to view and track.
I have two tables: one to hold all of the advertisements, the other to track clicks and basic details on the user (IP address, timestamp, user agent).
I need to pull all of the map_advertisements information along with unique clicks based on IP address, and total clicks based on map_advertisements.id, to be shown in a table with one row per advertisement; two of its columns will be totalClicks and totalUniqueClicks.
Aside from running three separate queries for each advertisement, is there a better way to go about this?
I am using MySQL 5, PHP 5.3, and CodeIgniter 2.1.
#example of an advertisements id
$aid = 13;
SELECT
*
count(acl.aid)
count(acl.DISTINCT(ip_address))
FROM
map_advertisements a
LEFT JOIN map_advertisements_click_log acl ON a.id = acl.aid
WHERE
a.id = $aid;
map_advertisements
-- ----------------------------
-- Table structure for `map_advertisements`
-- ----------------------------
DROP TABLE IF EXISTS `map_advertisements`;
CREATE TABLE `map_advertisements` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`youtube_id` varchar(255) NOT NULL,
`status` int(11) NOT NULL DEFAULT '1',
`timestamp` int(11) NOT NULL,
`type` enum('video','picture') NOT NULL DEFAULT 'video',
`filename` varchar(255) NOT NULL,
`url` varchar(255) NOT NULL,
`description` varchar(64) NOT NULL,
`title` varchar(64) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=8 DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT;
map_advertisements_click_log
-- ----------------------------
-- Table structure for `map_advertisements_click_log`
-- ----------------------------
DROP TABLE IF EXISTS `map_advertisements_click_log`;
CREATE TABLE `map_advertisements_click_log` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`aid` int(11) NOT NULL,
`ip_address` varchar(15) NOT NULL DEFAULT '',
`browser` varchar(255) NOT NULL,
`timestamp` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=26 DEFAULT CHARSET=latin1;
The problem seems to be in your query: there is no column named totalClicks in your table, and the DISTINCT keyword is also used incorrectly. Try this:
SELECT *, count(acl.id) as totalClicks, count(DISTINCT acl.ip_address) as uniqueClicks
FROM map_advertisements a
LEFT JOIN map_advertisements_click_log acl ON a.id = acl.aid
WHERE a.id = $aid;
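If you want the report described in the question (one row per advertisement with both counts), the same pattern can be grouped over all ads; a sketch:
-- COUNT(acl.id) and COUNT(DISTINCT acl.ip_address) ignore the NULLs
-- produced by the LEFT JOIN, so ads with no clicks report 0.
SELECT a.id, a.title,
       COUNT(acl.id) AS totalClicks,
       COUNT(DISTINCT acl.ip_address) AS totalUniqueClicks
FROM map_advertisements a
LEFT JOIN map_advertisements_click_log acl ON acl.aid = a.id
GROUP BY a.id, a.title;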