I have invoice, invoice_items, order, and order_items tables. The invoice and order tables each contain around 1 million records. The invoice_items and order_items tables each contain more than 2 million records. The items table contains about 200,000 records. Now I want to generate a report based on filters such as customers, item categories, and more.
Please refer to the queries below.
Running on PHP 5.6, MySQL 5.7, and Apache 2.
SELECT
`si_items`.`item_id`
, SUM(qty) AS `qty`
, IFNULL(SUM(selling_price * (qty)), 0) AS `salestotal`
, GROUP_CONCAT(si.id) AS `siso_id`
, MAX(si.date_transaction) AS `date_transaction`
FROM
`invoice_items` AS `si_items`
LEFT JOIN `invoice` AS `si`
ON si.id = si_items.parent_id
LEFT JOIN `items`
ON si_items.item_id = items.id
WHERE (
DATE_FORMAT(si.date_transaction, '%Y-%m-%d') BETWEEN '2019-01-01'
AND '2019-02-15'
)
AND (si.approved = 1)
AND (si.deleted = 0)
AND (items.deleted = 0)
GROUP BY `item_id`
UNION
SELECT
`so_items`.`item_id`
, SUM(qty) AS `qty`
, IFNULL(SUM(selling_price * (qty)), 0) AS `salestotal`
, GROUP_CONCAT(so.id) AS `soso_id`
, MAX(so.date_transaction) AS `date_transaction`
FROM
`order_items` AS `so_items`
LEFT JOIN `order` AS `so`
ON so.id = so_items.parent_id
LEFT JOIN `items`
ON so_items.item_id = items.id
WHERE (
DATE_FORMAT(so.date_transaction, '%Y-%m-%d') BETWEEN '2019-01-01'
AND '2019-02-15'
)
AND (so.approved = 1)
AND (so.deleted = 0)
AND (items.deleted = 0)
GROUP BY `item_id`
When I executed this query for a 50-day date range, it took 1 minute 20 seconds.
The following indexes are present on the tables:
Invoice & Order Tables
PRIMARY KEY (`id`),
KEY `account_id` (`account_id`),
KEY `approved` (`approved`),
KEY `deleted` (`deleted`),
KEY `finalised` (`finalised`),
KEY `rp_status` (`rp_status`),
KEY `sales_types_id` (`sales_types_id`),
KEY `account_type_id` (`account_type_id`),
KEY `company_id` (`company_id`),
KEY `date_transaction` (`date_transaction`)
Invoice_items & Order_items
PRIMARY KEY (`id`),
KEY `deleted` (`deleted`),
KEY `item_id` (`item_id`),
KEY `parent_id` (`parent_id`),
KEY `vat_id` (`vat_id`),
KEY `qty` (`qty`)
Explain Query
I need to improve the performance of this query. Could you please guide me on how to proceed?
Show Create Tables
CREATE TABLE `invoice` (
`id` char(36) NOT NULL,
`reference` varchar(25) DEFAULT NULL,
`company_id` char(36) DEFAULT NULL,
`branch_id` char(36) DEFAULT NULL,
`account_id` char(36) DEFAULT NULL,
`contact_id` char(36) DEFAULT NULL,
`transaction_type` varchar(10) DEFAULT NULL,
`sales_types_id` int(11) DEFAULT '0',
`quote_validity` int(11) DEFAULT '0',
`delivery_method_id` int(11) DEFAULT '0',
`sales_representative_id` int(11) DEFAULT '0',
`account_type_id` char(36) DEFAULT NULL,
`vat_exempted` tinyint(1) DEFAULT '0',
`description` text,
`finalised` tinyint(1) DEFAULT '0' COMMENT 'Not Yet finalised - status=1; Need Approval - status = 2; Approved - status = 3',
`approved` tinyint(1) DEFAULT '0',
`approved_user_id` int(11) DEFAULT '0',
`default_sales_location_id` char(36) DEFAULT NULL COMMENT '0-Yes; 1-No',
`generate_do` tinyint(1) DEFAULT '1',
`generate_dn` tinyint(4) DEFAULT '1',
`do_status` tinyint(1) DEFAULT '0',
`cn_status` tinyint(1) DEFAULT '0',
`rp_status` tinyint(1) DEFAULT '0',
`dm_status` tinyint(1) DEFAULT '0',
`currency_id` char(36),
`exchange_rate_id` tinyint(1) DEFAULT '0',
`exchange_rate` double DEFAULT '1',
`date_transaction` datetime DEFAULT NULL,
`date_created` datetime DEFAULT NULL,
`date_modified` datetime DEFAULT NULL,
`created_user_id` int(11) DEFAULT '0',
`modified_user_id` int(11) DEFAULT '0',
`deleted` tinyint(1) DEFAULT '0',
PRIMARY KEY (`id`),
KEY `account_id` (`account_id`),
KEY `approved` (`approved`),
KEY `branch_id` (`branch_id`),
KEY `cn_status` (`cn_status`),
KEY `created_user_id` (`created_user_id`),
KEY `date_created` (`date_created`),
KEY `deleted` (`deleted`),
KEY `do_status` (`do_status`),
KEY `finalised` (`finalised`),
KEY `reference` (`reference`),
KEY `rp_status` (`rp_status`),
KEY `sales_types_id` (`sales_types_id`),
KEY `account_type_id` (`account_type_id`),
KEY `company_id` (`company_id`),
KEY `date_transaction` (`date_transaction`),
KEY `default_sales_location_id` (`default_sales_location_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
CREATE TABLE `invoice_items` (
`id` char(36) NOT NULL,
`parent_id` char(36) DEFAULT NULL,
`item_id` char(36) DEFAULT NULL,
`qty` double DEFAULT '0',
`cost_price` double DEFAULT '0',
`list_price` double DEFAULT '0',
`selling_price` double DEFAULT '0',
`unit_price` double DEFAULT '0',
`vat` double DEFAULT '0',
`amount` double DEFAULT '0',
`special_discount` double DEFAULT '0',
`price_change_status` tinyint(1) DEFAULT '0',
`remarks` text,
`vat_id` int(11) DEFAULT '1',
`stock_category_id` tinyint(2) DEFAULT '0' COMMENT '1: Stockable 2: Service',
`is_giftitem` tinyint(1) DEFAULT '0' COMMENT '1: Gift Item 0: NO Gift',
`item_type_status` tinyint(1) DEFAULT '0',
`date_created` datetime DEFAULT NULL,
`date_modified` datetime DEFAULT NULL,
`created_user_id` int(11) DEFAULT '0',
`modified_user_id` int(11) DEFAULT '0',
`deleted` tinyint(1) DEFAULT '0',
PRIMARY KEY (`id`),
KEY `deleted` (`deleted`),
KEY `item_id` (`item_id`),
KEY `parent_id` (`parent_id`),
KEY `stock_category_id` (`stock_category_id`),
KEY `item_type_status` (`item_type_status`),
KEY `vat_id` (`vat_id`),
KEY `amount` (`amount`),
KEY `qty` (`qty`),
KEY `unit_price` (`unit_price`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Don't use LEFT JOIN when you mean JOIN. In particular, for the join to si: the WHERE clause filters on si columns, which turns the LEFT JOIN into a plain inner join anyway.
WHERE (
DATE_FORMAT(si.date_transaction, '%Y-%m-%d') BETWEEN '2019-01-01'
AND '2019-02-15'
)
-->
WHERE si.date_transaction >= '2019-01-01'
AND si.date_transaction < '2019-02-16'
so that an index (see below) can use that column. Note the end bound: BETWEEN is inclusive, so the equivalent half-open range runs to the day after '2019-02-15'.
WHERE si.date_transaction ...
AND (si.approved = 1)
AND (si.deleted = 0)
Add a composite index:
INDEX(deleted, approved, -- in either order
date_transaction) -- last
Make similar changes to so. Then let's hear how the performance is and see what the EXPLAIN has changed to.
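Putting those changes together, the invoice half of the UNION might look like this sketch (the `order` half is analogous). UNION ALL is also worth trying here: the two halves come from different tables, so the duplicate-elimination pass that plain UNION performs is wasted work:

SELECT si_items.item_id
     , SUM(si_items.qty) AS qty
     , IFNULL(SUM(si_items.selling_price * si_items.qty), 0) AS salestotal
     , GROUP_CONCAT(si.id) AS siso_id
     , MAX(si.date_transaction) AS date_transaction
FROM invoice_items AS si_items
JOIN invoice AS si ON si.id = si_items.parent_id
JOIN items ON si_items.item_id = items.id
WHERE si.date_transaction >= '2019-01-01'
  AND si.date_transaction < '2019-02-16'   -- day after the inclusive end date
  AND si.approved = 1
  AND si.deleted = 0
  AND items.deleted = 0
GROUP BY si_items.item_id
UNION ALL
-- ... the same shape against order_items / `order` ...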
UUIDs
Beware of UUIDs, they are bulky and slow. They are especially slow if the entire table cannot be cached.
I suspect you have UUIDs because I see CHAR(36).
With CHARACTER SET utf8, that means 108 bytes are being used! A UUID can be packed into a 16-byte BINARY(16). This would help with space (and hence speed).
But the real problem with UUIDs is with the randomness. Once the table becomes huge, the system becomes I/O-bound since the 'next' UUID is unlikely to be cached.
Consider switching to AUTO_INCREMENT ids. This is much preferred for single-server systems. If you need to generate ids from multiple locations, you may still need UUIDs.
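A sketch of that packing on MySQL 5.7 (UUID_TO_BIN() exists only in 8.0, so UNHEX/REPLACE is the usual workaround; the id_bin column name is an assumption for illustration):

-- Hypothetical migration sketch, shown for the invoice table.
ALTER TABLE invoice ADD COLUMN id_bin BINARY(16);
UPDATE invoice SET id_bin = UNHEX(REPLACE(id, '-', ''));
-- Converting back to the text form for display:
SELECT LOWER(CONCAT_WS('-',
    HEX(SUBSTR(id_bin,  1, 4)),
    HEX(SUBSTR(id_bin,  5, 2)),
    HEX(SUBSTR(id_bin,  7, 2)),
    HEX(SUBSTR(id_bin,  9, 2)),
    HEX(SUBSTR(id_bin, 11, 6)))) AS uuid_text
FROM invoice LIMIT 1;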
More on UUIDs.
Related
I am working on a news website; the table (news) contains about 200,000 rows.
I use Zend Framework 1.
The website was working very well, but since a week ago I have been getting errors in the data retrieved by the queries.
I use Zend_Paginator, like this:
$paginator = Zend_Paginator::factory($select);
$paginator->setItemCountPerPage(10);
$pageCounter = $paginator->count();
So $pageCounter returns 0.
While debugging, I tried dropping the indexes from the table and creating them again; that resolved the problem and the website worked again.
But every couple of days the problem comes back.
The table I use is:
CREATE TABLE `news` (
`news_id` int(11) NOT NULL AUTO_INCREMENT,
`category_id` int(11) DEFAULT NULL,
`news_type` enum('text','photo','video') DEFAULT 'text',
`news_title` varchar(255) NOT NULL,
`news_url` varchar(255) NOT NULL,
`news_date` datetime NOT NULL,
`news_image` varchar(255) DEFAULT NULL,
`video` varchar(255) DEFAULT NULL,
`news_source` int(11) NOT NULL DEFAULT '0',
`author_id` int(11) NOT NULL DEFAULT '0',
`news_brief` text,
`news_content` mediumtext,
`meta_title` varchar(255) NOT NULL,
`meta_description` varchar(255) DEFAULT NULL,
`meta_keywords` varchar(255) DEFAULT NULL,
`status` tinyint(2) NOT NULL DEFAULT '1',
`news_read` int(11) DEFAULT '0',
`old_id` int(11) DEFAULT NULL,
`old_section_id` int(11) DEFAULT NULL,
`old_section` varchar(50) DEFAULT NULL,
`lang` varchar(10) DEFAULT NULL,
`featured` int(11) NOT NULL DEFAULT '0',
`in_timeline` tinyint(2) NOT NULL DEFAULT '0',
`is_highlight` tinyint(2) NOT NULL DEFAULT '0',
`home_exclusive` tinyint(2) NOT NULL DEFAULT '0',
`home_articles` tinyint(4) NOT NULL DEFAULT '0',
`home_articles_selected` tinyint(4) NOT NULL DEFAULT '0',
`home_mostread` tinyint(4) NOT NULL DEFAULT '0',
`video_featured` tinyint(4) NOT NULL DEFAULT '0',
`photos_featured` tinyint(4) NOT NULL DEFAULT '0',
`party_featured` tinyint(4) NOT NULL DEFAULT '0',
`party_activity` tinyint(4) NOT NULL DEFAULT '0',
`sport_featured` tinyint(4) NOT NULL DEFAULT '0',
`fnoun_featured` tinyint(4) NOT NULL DEFAULT '0',
`special_reports` tinyint(4) NOT NULL DEFAULT '0',
`mainnav` tinyint(4) NOT NULL DEFAULT '0',
PRIMARY KEY (`news_id`),
KEY `News_Category_idx` (`category_id`),
KEY `News_Type` (`news_type`),
KEY `News_Date` (`news_date`),
KEY `News_Read` (`news_read`),
KEY `old_id` (`old_id`),
KEY `Sources` (`news_source`),
KEY `Authors` (`author_id`),
KEY `Featured` (`featured`),
KEY `Articles` (`home_articles`),
KEY `Mostread` (`home_mostread`),
KEY `video_featured` (`video_featured`,`photos_featured`),
KEY `party_activity` (`party_activity`),
KEY `sport_featured` (`sport_featured`,`fnoun_featured`),
KEY `old_section_id` (`old_section_id`),
KEY `old_section` (`old_section`),
KEY `old section id` (`old_section_id`),
KEY `old section` (`old_section`),
KEY `articles_selected` (`home_articles_selected`),
KEY `URL` (`news_url`),
KEY `special_reports` (`special_reports`),
KEY `mainnav` (`mainnav`),
KEY `status` (`status`),
KEY `in_timeline` (`in_timeline`),
KEY `is_highlight` (`is_highlight`),
KEY `home_exclusive` (`home_exclusive`),
KEY `party_featured` (`party_featured`),
FULLTEXT KEY `news_title` (`news_title`)
) ENGINE=InnoDB AUTO_INCREMENT=167614 DEFAULT CHARSET=utf8
The columns (status, featured, in_timeline, is_highlight) were of type ENUM, but I have changed them to TINYINT.
It works for a few days, but then the problem comes back again,
so I have to drop the indexes and create them again.
I don't know what the problem is.
Can anyone help, please?
The EXPLAIN output is the same when the problem happens and after I drop the indexes and create them again.
SQL query:
EXPLAIN SELECT `N`.`news_id`, `N`.`news_title`, `N`.`news_url`, `N`.`news_image`,
       `N`.`news_type`, `N`.`news_date`, `N`.`is_highlight`, `N`.`news_brief`,
       `C`.`category_name`, `C`.`category_url`, `C`.`category_color`, `NT`.*,
       GROUP_CONCAT(T.tag_name SEPARATOR ",") AS tags,
       GROUP_CONCAT(T.tag_url SEPARATOR ",") AS `tags_urls`
FROM `news` AS `N`
INNER JOIN `news_categories` AS `C` ON C.category_id = N.category_id AND C.status = "1"
LEFT JOIN `news_tags_relation` AS `NT` ON NT.news_id = N.news_id
LEFT JOIN `news_tags` AS `T` ON T.tag_id = NT.tag_id
WHERE (N.status = "1" AND N.category_id = "22")
GROUP BY `N`.`news_id`
ORDER BY `N`.`news_date` DESC;
The table is shown in the attached images.
I am getting the following error:
SQLSTATE[23000]: Integrity constraint violation: 1052 Column 'id' in field list is ambiguous
And I have no idea where it is coming from. The stack trace is not clear about it either. The query that throws this error:
Database::executeQuery('CREATE TEMPORARY TABLE tmp_inventory ENGINE=MEMORY '
. 'SELECT id, email_hash, mailing_list_id, ttl, price, last_click, last_view, extra_data '
. 'FROM inventory i INNER JOIN mailing_list ml on i.mailing_list_id = ml.id '
. 'WHERE i.active = 0 AND i.deleted = 1 AND i.completely_deleted = 1 AND i.resting_to < NOW() AND i.next_sync_at < NOW() AND ml.active = 0 '
. 'LIMIT 10;');
So I have two tables: inventory and mailing_list. The inventory table has the following structure:
CREATE TABLE `inventory` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`email_hash` char(32) NOT NULL,
`mailing_list_id` int(6) DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
`last_send_at` datetime DEFAULT NULL,
`resting_to` datetime DEFAULT NULL,
`next_sync_at` datetime DEFAULT NULL,
`ttl` datetime DEFAULT NULL,
`active` tinyint(1) NOT NULL DEFAULT '1',
`deleted` tinyint(1) NOT NULL DEFAULT '0',
`completely_deleted` tinyint(1) NOT NULL DEFAULT '0',
`price` int(10) unsigned NOT NULL,
`last_view` datetime DEFAULT NULL,
`last_view_at` datetime DEFAULT NULL,
`last_updated_at` datetime DEFAULT NULL,
PRIMARY KEY (`id`)
)
And the mailing_list:
CREATE TABLE `mailing_list` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(255) DEFAULT NULL,
`active` tinyint(1) NOT NULL DEFAULT '0',
`created_at` datetime NOT NULL,
`price` int(10) unsigned NOT NULL DEFAULT '1000',
`ttl` int(10) unsigned NOT NULL DEFAULT '604800',
`resting_time` int(10) unsigned NOT NULL DEFAULT '0',
`email_from` varchar(255) DEFAULT NULL,
`email_return_path` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
)
What's wrong?
In your query you are accessing two tables: inventory in your FROM clause and mailing_list in your INNER JOIN clause.
Both of these tables have a column named id, so your DB doesn't know which column you are referring to.
To fix this, specify the table in your select by replacing id with i.id or ml.id.
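For example, assuming the inventory side's id, ttl, and price are the ones wanted (all three column names exist in both tables, so they all need qualifying; last_click and extra_data appear in neither CREATE TABLE shown, so they are left as-is):

CREATE TEMPORARY TABLE tmp_inventory ENGINE=MEMORY
SELECT i.id, i.email_hash, i.mailing_list_id, i.ttl, i.price,
       last_click, i.last_view, extra_data
FROM inventory i
INNER JOIN mailing_list ml ON i.mailing_list_id = ml.id
WHERE i.active = 0 AND i.deleted = 1 AND i.completely_deleted = 1
  AND i.resting_to < NOW() AND i.next_sync_at < NOW() AND ml.active = 0
LIMIT 10;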
I have a table called "tables_data_info" where some data relating to different tables is stored: data like "created time", "editing time", "editing user", "created by user id", etc. I'm using this table because a dynamic PHP script generated it automatically.
But when I have a large number of records (15k in this case) the query gets very, very slow and takes minutes to do its job! And I'm not selecting all 15k records; I'm limiting the select to just 10 records!
A simple query:
SELECT pd.id, pd.title, pd.sell_price, pd.available_qt,
       tdi.createtime, tdi.lastupdatetime, tdi.create_member_id, tdi.create_group_id,
       tdi.last_update_member_id, tdi.last_update_group_id
FROM zd_products AS pd
LEFT JOIN zd_tables_data_info AS tdi
       ON (tdi.targetid = pd.id AND tdi.table_name = 'products')
ORDER BY pd.title ASC
LIMIT 0, 10
How can I run this query differently but more efficiently?
Here is the table structure:
zd_products
CREATE TABLE `zd_products` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`title` varchar(256) NOT NULL DEFAULT '',
`internalcode` varchar(256) DEFAULT NULL,
`ean13_jan_code` varchar(256) DEFAULT NULL,
`upc_code` varchar(11) DEFAULT NULL,
`active` tinyint(1) NOT NULL DEFAULT '0',
`status` varchar(12) NOT NULL DEFAULT 'new',
`product_tags` longtext,
`buy_price` double NOT NULL DEFAULT '0',
`sell_price` double NOT NULL DEFAULT '0',
`fiscal_tax_id` int(11) NOT NULL DEFAULT '0',
`box_width` double NOT NULL DEFAULT '0',
`box_height` double NOT NULL DEFAULT '0',
`box_depth` double NOT NULL DEFAULT '0',
`box_weight` double NOT NULL DEFAULT '0',
`shipment_extra_price` double NOT NULL DEFAULT '0',
`available_qt` int(11) NOT NULL DEFAULT '0',
`allow_purchase_out_stock` tinyint(1) NOT NULL DEFAULT '0',
`meta_title` varchar(256) DEFAULT NULL,
`meta_description` varchar(256) DEFAULT NULL,
`meta_keywords` longtext,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=15730 DEFAULT CHARSET=utf8;
zd_tables_data_info
CREATE TABLE `zd_tables_data_info` (
`table_name` varchar(256) NOT NULL DEFAULT '',
`targetid` int(11) DEFAULT NULL,
`create_member_id` int(11) unsigned DEFAULT NULL,
`create_group_id` int(11) unsigned DEFAULT NULL,
`last_update_member_id` int(11) unsigned DEFAULT NULL,
`last_update_group_id` int(11) unsigned DEFAULT NULL,
`createtime` int(11) unsigned DEFAULT NULL,
`lastupdatetime` int(11) unsigned DEFAULT NULL,
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`),
KEY `INDEX` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=19692 DEFAULT CHARSET=utf8;
Your data is not particularly big. Here is the query:
SELECT pd.id, pd.title, pd.sell_price, pd.available_qt,
tdi.createtime, tdi.lastupdatetime, tdi.create_member_id, tdi.create_group_id,
tdi.last_update_member_id, tdi.last_update_group_id
FROM zd_products pd LEFT JOIN
zd_tables_data_info tdi
ON tdi.targetid = pd.id and tdi.table_name = 'products'
ORDER by pd.title ASC
LIMIT 0, 10;
You can improve the performance of this query with indexes. The two that come to mind are zd_products(title, id) and zd_tables_data_info(targetid, table_name). Try these and see if they help. You can create these indexes either in the CREATE TABLE statement (or via ALTER TABLE) or by using:
create index zd_products_title_id on zd_products(title, id);
create index zd_tables_data_info_targetid_table_name on zd_tables_data_info(targetid, table_name);
If not, put EXPLAIN in front of your query and then edit your question with the resulting plan.
Abstract:
Every client is given a specific XML ad feed (publisher_feed table). Every time there is a query or a click on that feed, it gets recorded (publisher_stats_raw table); each query/click can produce multiple rows depending on the subid passed by the client (we can sum the clicks together). The next day, we pull stats from an API to grab the previous day's revenue numbers (rev_stats table); each revenue stat might have multiple rows depending on the country of the click (we can sum the revenue together). I've been having a hard time trying to link these three tables together to find the average RPC for each client for the previous day.
Table Structure:
CREATE TABLE `publisher_feed` (
`publisher_feed_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`alias` varchar(45) DEFAULT NULL,
`user_id` int(10) unsigned DEFAULT NULL,
`remote_feed_id` int(10) unsigned DEFAULT NULL,
`subid` varchar(255) DEFAULT '',
`requirement` enum('tq','tier2','ron','cpv','tos1','tos2','tos3','pv1','pv2','pv3','ar','ht') DEFAULT NULL,
`status` enum('enabled','disabled') DEFAULT 'enabled',
`tq` decimal(4,2) DEFAULT '0.00',
`clicklimit` int(11) DEFAULT '0',
`prev_rpc` decimal(20,10) DEFAULT '0.0000000000',
PRIMARY KEY (`publisher_feed_id`),
UNIQUE KEY `alias_UNIQUE` (`alias`),
KEY `publisher_feed_idx` (`remote_feed_id`),
KEY `publisher_feed_user` (`user_id`),
CONSTRAINT `publisher_feed_feed` FOREIGN KEY (`remote_feed_id`) REFERENCES `remote_feed` (`remote_feed_id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `publisher_feed_user` FOREIGN KEY (`user_id`) REFERENCES `user` (`user_id`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=124 DEFAULT CHARSET=latin1$$
CREATE TABLE `publisher_stats_raw` (
`publisher_stats_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`unique_data` varchar(350) NOT NULL,
`publisher_feed_id` int(10) unsigned DEFAULT NULL,
`date` date DEFAULT NULL,
`subid` varchar(255) DEFAULT NULL,
`queries` int(10) unsigned DEFAULT '0',
`impressions` int(10) unsigned DEFAULT '0',
`clicks` int(10) unsigned DEFAULT '0',
`filtered` int(10) unsigned DEFAULT '0',
`revenue` decimal(20,10) unsigned DEFAULT '0.0000000000',
PRIMARY KEY (`publisher_stats_id`),
UNIQUE KEY `unique_data_UNIQUE` (`unique_data`),
KEY `publisher_stats_raw_remote_feed_idx` (`publisher_feed_id`)
) ENGINE=InnoDB AUTO_INCREMENT=472 DEFAULT CHARSET=latin1$$
CREATE TABLE `rev_stats` (
`rev_stats_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`date` date DEFAULT NULL,
`remote_feed_id` int(10) unsigned DEFAULT NULL,
`typetag` varchar(255) DEFAULT NULL,
`subid` varchar(255) DEFAULT NULL,
`country` varchar(2) DEFAULT NULL,
`revenue` decimal(20,10) DEFAULT NULL,
`tq` decimal(4,2) DEFAULT NULL,
`finalized` int(11) DEFAULT '0',
PRIMARY KEY (`rev_stats_id`),
KEY `rev_stats_remote_feed_idx` (`remote_feed_id`),
CONSTRAINT `rev_stats_remote_feed` FOREIGN KEY (`remote_feed_id`) REFERENCES `remote_feed` (`remote_feed_id`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=58 DEFAULT CHARSET=latin1$$
Context:
Each remote_feed has a specific subid/typetag given to it. So we need to match up both the remote_feed_id and subid columns from the publisher_feed table to the remote_feed_id and typetag columns in the rev_stats table.
My current, non-working, implementation:
SELECT
pf.publisher_feed_id, psr.date, sum(clicks), sum(rs.revenue)
FROM
xml_network.publisher_feed pf
JOIN
xml_network.publisher_stats_raw psr
ON
psr.publisher_feed_id = pf.publisher_feed_id
JOIN
xml_network.rev_stats rs
ON
rs.remote_feed_id = pf.remote_feed_id
WHERE
pf.requirement = 'tq'
AND
pf.subid = rs.typetag
AND
psr.date <> date(curdate())
GROUP BY
psr.date
ORDER BY
psr.date DESC
LIMIT 1;
The above keeps pulling the wrong data out of the rev_stats table: it sums the correct stats, but repeats them because of the join. Any help with how to properly pull the correct data would be greatly appreciated. (I could use multiple queries and PHP to get the correct results, but what's the fun in that!)
Figured out a way to get this accomplished. It's definitely not a fast method, needing 4 selects to get it done, but it works flawlessly =)
SELECT
pf.publisher_feed_id,
round(
(
SELECT
SUM(rs.revenue)
FROM
xml_network.rev_stats rs
WHERE
rs.remote_feed_id = pf.remote_feed_id
AND
rs.typetag = pf.subid
AND
rs.date = subdate(current_date, 1)
),10)as revenue,
(
SELECT
MAX(rs.tq)
FROM
xml_network.rev_stats rs
WHERE
rs.remote_feed_id = pf.remote_feed_id
AND
rs.typetag = pf.subid
AND
rs.date = subdate(current_date, 1)
) as tq,
(
SELECT
SUM(psr.clicks)-SUM(psr.filtered)
FROM
xml_network.publisher_stats_raw psr
WHERE
psr.publisher_feed_id = pf.publisher_feed_id
AND
psr.date = subdate(current_date, 1)
) as clicks
FROM
xml_network.publisher_feed pf
WHERE
pf.requirement = 'tq';
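An alternative sketch (untested against the real data): the repetition comes from joining per-country rev_stats rows against per-subid publisher_stats_raw rows, so each side can be pre-aggregated in a derived table and joined exactly once:

SELECT pf.publisher_feed_id,
       rs.revenue,
       rs.tq,
       psr.clicks
FROM publisher_feed pf
LEFT JOIN (
        SELECT remote_feed_id, typetag,
               ROUND(SUM(revenue), 10) AS revenue,
               MAX(tq) AS tq
        FROM rev_stats
        WHERE date = SUBDATE(CURRENT_DATE, 1)
        GROUP BY remote_feed_id, typetag
     ) rs ON rs.remote_feed_id = pf.remote_feed_id
         AND rs.typetag = pf.subid
LEFT JOIN (
        SELECT publisher_feed_id,
               SUM(clicks) - SUM(filtered) AS clicks
        FROM publisher_stats_raw
        WHERE date = SUBDATE(CURRENT_DATE, 1)
        GROUP BY publisher_feed_id
     ) psr ON psr.publisher_feed_id = pf.publisher_feed_id
WHERE pf.requirement = 'tq';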
I need to speed up my query a bit, as it's taking way too long on a large DB.
I have the following tables
vb_user
+++++++++++++++++++++++++++++++++
++ userid ++ username ++ posts ++
+++++++++++++++++++++++++++++++++
vb_post
++++++++++++++++++++++++
++ userid ++ dateline ++
++++++++++++++++++++++++
I use this query:
SELECT VBU.userid AS USER_ID
, VBU.username AS USER_NAME
, COUNT(VBP.userid) AS NUMBER_OF_POSTS_FOR_30_DAYS
, FROM_UNIXTIME(VBU.joindate) as JOIN_DATE
FROM vb_user AS VBU
LEFT JOIN vb_post AS VBP
ON VBP.userid = VBU.userid
WHERE VBU.joindate BETWEEN '__START_DATE__' AND '__END_DATE__'
AND VBP.dateline BETWEEN VBU.joindate AND DATE_ADD(FROM_UNIXTIME(VBU.joindate), INTERVAL 30 DAY)
GROUP BY VBP.userid
ORDER BY NUMBER_OF_POSTS_FOR_30_DAYS DESC
I have to select the users who have posted the most from when they joined until 30 days after, and I can't figure out how to do it without the FROM_UNIXTIME function.
But it takes a lot of time. Any thoughts on how to improve the performance of the query?
Here is the output of EXPLAIN:
id,select_type,table,type,possible_keys,key,key_len,ref,rows,Extra
1,SIMPLE,VBP,index,userid,threadid_visible_dateline,18,NULL,2968000,"Using where; Using index; Using temporary; Using filesort"
1,SIMPLE,VBU,eq_ref,PRIMARY,PRIMARY,4,vb_copilul.VBP.userid,1,"Using where"
And here is the info about the tables
Table,"Create Table"
vb_user,"CREATE TABLE `vb_user` (
`userid` int(10) unsigned NOT NULL AUTO_INCREMENT,
`username` varchar(100) NOT NULL DEFAULT '',
`posts` int(10) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`userid`),
KEY `usergroupid` (`usergroupid`)
) ENGINE=MyISAM AUTO_INCREMENT=101076 DEFAULT CHARSET=latin1"
Table,"Create Table"
vb_post,"CREATE TABLE `vb_post` (
`postid` int(10) unsigned NOT NULL AUTO_INCREMENT,
`threadid` int(10) unsigned NOT NULL DEFAULT '0',
`parentid` int(10) unsigned NOT NULL DEFAULT '0',
`username` varchar(100) NOT NULL DEFAULT '',
`userid` int(10) unsigned NOT NULL DEFAULT '0',
`title` varchar(250) NOT NULL DEFAULT '',
`dateline` int(10) unsigned NOT NULL DEFAULT '0',
`pagetext` mediumtext,
`allowsmilie` smallint(6) NOT NULL DEFAULT '0',
`showsignature` smallint(6) NOT NULL DEFAULT '0',
`ipaddress` char(15) NOT NULL DEFAULT '',
`iconid` smallint(5) unsigned NOT NULL DEFAULT '0',
`visible` smallint(6) NOT NULL DEFAULT '0',
`attach` smallint(5) unsigned NOT NULL DEFAULT '0',
`infraction` smallint(5) unsigned NOT NULL DEFAULT '0',
`reportthreadid` int(10) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`postid`),
KEY `userid` (`userid`),
KEY `threadid` (`threadid`,`userid`),
KEY `threadid_visible_dateline` (`threadid`,`visible`,`dateline`,`userid`,`postid`),
KEY `dateline` (`dateline`),
KEY `ipaddress` (`ipaddress`)
) ENGINE=MyISAM AUTO_INCREMENT=3009320 DEFAULT CHARSET=latin1"
Two things you can do to improve the query:
Do not convert back and forth between unix time and DATETIME inside the WHERE clause. VBP.dateline and VBU.joindate are already unix timestamps stored as integers, so compare them directly with BETWEEN (for example, VBU.joindate + 30 * 86400 instead of DATE_ADD(FROM_UNIXTIME(VBU.joindate), INTERVAL 30 DAY)); otherwise the server has to convert values row by row instead of comparing the native type. (Storing unix timestamps in integer columns, as you already do, is fine for that usage.)
Make sure an index covers the dateline comparison so the BETWEEN predicate is fast; vb_post already has KEY `dateline`, and a composite index on (userid, dateline) would suit this query even better.
Everything else looks OK. A sketch of the rewritten query follows.
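A sketch of that rewrite (untested; all arithmetic stays in unix seconds, so __START_DATE__ and __END_DATE__ must be unix timestamps too, and the LEFT JOIN becomes a plain JOIN because the WHERE condition on VBP.dateline discards unmatched rows anyway):

SELECT VBU.userid AS USER_ID
     , VBU.username AS USER_NAME
     , COUNT(VBP.userid) AS NUMBER_OF_POSTS_FOR_30_DAYS
     , FROM_UNIXTIME(VBU.joindate) AS JOIN_DATE  -- conversion only in the SELECT list
FROM vb_user AS VBU
JOIN vb_post AS VBP
    ON VBP.userid = VBU.userid
WHERE VBU.joindate BETWEEN __START_DATE__ AND __END_DATE__
  AND VBP.dateline BETWEEN VBU.joindate AND VBU.joindate + 30 * 86400  -- 30 days in seconds
GROUP BY VBU.userid
ORDER BY NUMBER_OF_POSTS_FOR_30_DAYS DESC;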