I have a PHP/MySQL application
The application uses a query to get the values of a table leads, with 2 sub-queries to return the SUM and COUNT of values in a second table refunds
The 2 tables are linked with a foreign key lead_id
SELECT l.*,
IFNULL(
(SELECT SUM(amount)
FROM refunds r
WHERE l.lead_id = r.lead_id),0) amount_refunded,
IFNULL(
(SELECT COUNT(*)
FROM refunds r
WHERE l.lead_id = r.lead_id),0) number_refunded
FROM leads l
I would like to increase the performance of this query.
My thought was to:
Combine the 2 sub-queries into a single sub-query using CONCAT
with a pipe delimiter
Explode the returned string using PHP at the application level to
get the 2 values.
Example below:
SELECT l.*,
(SELECT CONCAT(IFNULL(COUNT(*),0),'|', IFNULL(SUM(amount),0))
FROM fee_refunds r
WHERE l.lead_id = r.lead_id) values_refunded
FROM fee_leads l
Then in the application, within the loop:
list($number_refunded, $amount_refunded) = explode('|', $row->values_refunded);
This approach works, however my questions are:
Is this bad form?
Is there any reason I should not do it this way?
Is there a better solution?
Use join!
SELECT l.*, COALESCE(r.amount_refunded, 0) AS amount_refunded, COALESCE(r.number_refunded, 0) AS number_refunded
FROM leads l LEFT JOIN
(SELECT lead_id, COUNT(*) as number_refunded, SUM(amount) as amount_refunded
FROM refunds r
GROUP BY lead_id
) r
ON l.lead_id = r.lead_id;
You may find it faster, under some circumstances, to join before the aggregation.
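To make the join approach concrete, here is a minimal sketch that runs against an in-memory SQLite database rather than MySQL. The sample rows, the integer amounts and the helper name refund_totals are assumptions for illustration only; the point is that one pre-aggregated derived table supplies both values, and leads without refunds come back as 0.

```python
import sqlite3


def refund_totals(conn):
    """Return (lead_id, amount_refunded, number_refunded) for every lead."""
    sql = """
        SELECT l.lead_id,
               COALESCE(r.amount_refunded, 0) AS amount_refunded,
               COALESCE(r.number_refunded, 0) AS number_refunded
        FROM leads l
        LEFT JOIN (SELECT lead_id,
                          SUM(amount) AS amount_refunded,
                          COUNT(*)    AS number_refunded
                   FROM refunds
                   GROUP BY lead_id) r
               ON r.lead_id = l.lead_id
        ORDER BY l.lead_id
    """
    return conn.execute(sql).fetchall()


# Hypothetical sample data: lead 2 has no refunds at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE leads (lead_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE refunds (refund_id INTEGER PRIMARY KEY, lead_id INTEGER, amount INTEGER);
    INSERT INTO leads VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO refunds (lead_id, amount) VALUES (1, 10), (1, 5), (3, 7);
""")
rows = refund_totals(conn)
print(rows)
```

Each row already carries both numbers as separate columns, so nothing needs to be concatenated in SQL and split again in the application.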
I'm learning MySQL and am having a difficult time getting my head around more complicated outputs, mainly the logic part. I have a simple database that contains 2 tables with 1 relationship; the design is here: https://prnt.sc/mfmwji
I need to create a report that displays the daily balance of only negative states (so only when a person has a negative balance) for the past 6 months.
I've put together a query that displays only the differences when they're negative, but it does not 'connect' them to the rows before them; it only displays the withdrawals, so to speak.
I've played around with the query, but this is the 'best' thing I've come up with. I've tried wrapping the difference in the SUM function, but that just sums the whole thing and doesn't return the daily difference.
SELECT
T1.name AS Name,
T2.withdraw - T2.deposit AS Difference,
DATE_FORMAT(T2.date, '%Y-%m-%d') AS Date
FROM
users AS T1
INNER JOIN transactions AS T2
ON
T1.id = T2.user_id
WHERE
(T2.withdraw - T2.deposit) > 0
The query returns this output (it's just part of the result, since I got 100 rows):
http://prntscr.com/mfn0xf
The deposits and withdrawals for Pearl Champlin, so you get the idea, are:
http://prntscr.com/mfn15a
I've tried to check other questions on SO but they usually point to other problems and are not specific to my problem.
Thanks in advance for any information you think I should check out!
You could use a subquery to retrieve the running balance up to each transaction's date. Then, in an outer query, you can filter for where that balance is negative:
select *
from (
select u.name,
t.date,
t.deposit - t.withdraw action,
( select sum(deposit - withdraw)
from transactions
where user_id = u.id
and date <= t.date ) as balance
from users as u
inner join transactions as t
on u.id = t.user_id
) balances
where balance < 0
order by 1, 2
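Here is a small runnable sketch of the same running-balance idea, using an in-memory SQLite database in place of MySQL. The users, dates and amounts are made up for illustration, and it assumes at most one transaction per user per day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE transactions (
        id INTEGER PRIMARY KEY, user_id INTEGER, date TEXT,
        deposit INTEGER, withdraw INTEGER);
    INSERT INTO users VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO transactions (user_id, date, deposit, withdraw) VALUES
        (1, '2019-01-01', 100, 0),
        (1, '2019-01-02', 0, 150),
        (1, '2019-01-03', 80, 0),
        (2, '2019-01-01', 50, 0),
        (2, '2019-01-02', 0, 20);
""")

# The correlated subquery sums everything up to and including each
# transaction's date; the outer query keeps only the negative balances.
NEGATIVE_BALANCES = """
    SELECT name, date, balance
    FROM (SELECT u.name, t.date,
                 (SELECT SUM(t2.deposit - t2.withdraw)
                  FROM transactions t2
                  WHERE t2.user_id = u.id
                    AND t2.date <= t.date) AS balance
          FROM users u
          JOIN transactions t ON t.user_id = u.id) balances
    WHERE balance < 0
    ORDER BY name, date
"""
negative = conn.execute(NEGATIVE_BALANCES).fetchall()
print(negative)
```

Ann's balance goes 100, -50, 30, so only her second day is reported; Bob never goes negative and does not appear.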
This is what you asked for. It shows the report for one user. I don't know if there is a way to do this for all users at the same time, but maybe it can help you find what you want.
SELECT
PreAgg.name,
(PreAgg.withdraw - PreAgg.deposit) AS Difference,
@PrevBal := @PrevBal + (PreAgg.withdraw - PreAgg.deposit) AS Balance
FROM
(SELECT
T1.name,
T2.deposit,
T2.withdraw,
(T2.withdraw - T2.deposit) AS Difference,
T1.id
FROM
users AS T1
INNER JOIN transactions AS T2
ON
T1.id = T2.user_id
ORDER BY
T2.id ) AS PreAgg,
(SELECT @PrevBal := 0) as InitialVar
WHERE PreAgg.id = 1
I'm attempting to pull the latest pricing data from a table on an inner join. Prices get updated throughout the day but aren't necessarily updated at midnight.
The following query works great when the price data has been updated by the end of the day. But how do I get it to return yesterday's data if today's data is blank?
I'm indexing off of a column that is formatted like this date_itemnumber => 2015-05-22_12341234
SELECT h.*, collection.*, history.price
FROM collection
INNER JOIN h ON collection.itemid=h.id
INNER JOIN history ON collection.itemid=history.itemid
AND concat('2015-05-23_',collection.itemid)=history.date_itemid
WHERE h.description LIKE '%Awesome%'
Production Query time: .046 sec
To be clear, I want it to check for the most up-to-date record for that item, regardless of whether it is from today, yesterday or before that.
SQLFiddle1
The following query gives me the desired results but with my production dataset it takes over 3 minutes to return results. As my dataset gets larger, it would take longer. So this can't be the most efficient way to do this.
SELECT h.*, collection.*, history.price
FROM collection
INNER JOIN h ON collection.itemid=h.id
INNER JOIN history ON collection.itemid=history.itemid
AND (select history.date_itemid from history WHERE itemid=collection.itemid GROUP BY date_itemid DESC LIMIT 1)=history.date_itemid
WHERE h.description LIKE '%Awesome%'
Production Query time: 181.140 sec
SQLFiddle2
SELECT x.*
FROM history x
JOIN
( SELECT itemid
, MAX(date_itemid) max_date_itemid
FROM history
-- optional JOINS and WHERE here --
GROUP
BY itemid
) y
ON y.itemid = x.itemid
AND y.max_date_itemid = x.date_itemid;
http://sqlfiddle.com/#!9/975f5/13
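For anyone who wants to try the pattern without the fiddle, here is a minimal sketch against an in-memory SQLite database instead of MySQL. The three history rows and integer prices are invented sample data. It relies on date_itemid starting with an ISO date, so MAX() on the string picks the latest day within each itemid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE history (
        id INTEGER PRIMARY KEY, itemid INTEGER,
        date_itemid TEXT, price INTEGER);
    INSERT INTO history (itemid, date_itemid, price) VALUES
        (1, '2015-05-22_1', 10),
        (1, '2015-05-23_1', 12),
        (2, '2015-05-21_2', 5);
""")

# Find the newest date_itemid per item once, then join back to fetch the
# full row that carries the matching price.
LATEST_PER_ITEM = """
    SELECT x.itemid, x.date_itemid, x.price
    FROM history x
    JOIN (SELECT itemid, MAX(date_itemid) AS max_date_itemid
          FROM history
          GROUP BY itemid) y
      ON y.itemid = x.itemid
     AND y.max_date_itemid = x.date_itemid
    ORDER BY x.itemid
"""
latest = conn.execute(LATEST_PER_ITEM).fetchall()
print(latest)
```

Item 2 has no row for the 22nd or 23rd, yet it still comes back with its most recent price, which is the behaviour the question asks for.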
This should work:
SELECT h.*, collection.*, history.price
FROM collection
INNER JOIN h ON collection.itemid=h.id
INNER JOIN(
SELECT a.*
FROM history a
INNER JOIN
( SELECT itemid,MAX(date_itemid) max_date_itemid
FROM history
GROUP BY itemid
) b ON b.itemid = a.itemid AND b.max_date_itemid = a.date_itemid
) AS history ON history.itemid = collection.itemid
WHERE h.description LIKE '%Awesome%'
I don't know whether this takes a lot of execution time. Please do try it; since you have more data in your tables, it will be a good test of the query execution time.
This is actually a fairly common problem in SQL, at least I feel like I run into it a lot. What you want to do is join a one to many table, but only join to the latest or oldest record in that table.
The trick to this is to do a self LEFT join on the table with many records, specifying the foreign key and also that the id should be greater or less than the other records' ids (or dates or whatever you're using). Then in the WHERE conditions, you just add a condition that the left joined table has a NULL id - it wasn't able to be joined with a more recent record because it was the latest.
In your case the SQL should look something like this:
SELECT h.*, collection.*, history.price
FROM collection
INNER JOIN h ON collection.itemid=h.id
INNER JOIN history ON collection.itemid=history.itemid
-- left join history table again
LEFT JOIN history AS history2 ON history.itemid = history2.itemid AND history2.id > history.id
-- filter left join results to the most recent record
WHERE history2.id IS NULL
AND h.description LIKE '%Awesome%'
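A stripped-down sketch of the self LEFT JOIN trick, run against an in-memory SQLite database rather than MySQL. The rows are invented, and it assumes that a larger history.id always means a more recent record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE history (
        id INTEGER PRIMARY KEY, itemid INTEGER,
        date_itemid TEXT, price INTEGER);
    INSERT INTO history (itemid, date_itemid, price) VALUES
        (1, '2015-05-22_1', 10),
        (1, '2015-05-23_1', 12),
        (2, '2015-05-21_2', 5);
""")

# h2 matches any newer row for the same item. A row of h1 with no such
# match (h2.id IS NULL) is therefore the latest one for its item.
LATEST_BY_ANTI_JOIN = """
    SELECT h1.itemid, h1.price
    FROM history h1
    LEFT JOIN history h2
           ON h2.itemid = h1.itemid
          AND h2.id > h1.id
    WHERE h2.id IS NULL
    ORDER BY h1.itemid
"""
latest_prices = conn.execute(LATEST_BY_ANTI_JOIN).fetchall()
print(latest_prices)
```

No GROUP BY is involved; whether that is actually faster on a large table depends on the indexes available on (itemid, id), so it is worth timing both variants.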
This is another approach that cuts one inner join statement
select h.*,his.date_itemid, his.price from history his
INNER JOIN h ON his.itemid=h.id
WHERE his.itemid IN (select itemid from collection) AND h.description LIKE '%Awesome%' and his.id IN (select max(id) from history group by history.itemid)
You can try it here: http://sqlfiddle.com/#!9/837a8/1
I am not sure if this is what you want, but I'll give it a try.
EDIT: modified
CREATE VIEW LatestDatesforIds
AS
SELECT
MAX(`history`.`date_itemid`) AS `lastPriceDate`,
MAX(`history`.`id`) AS `matchingId`
FROM `history`
GROUP BY `history`.`itemid`;
CREATE VIEW MatchDatesToPrices
AS
SELECT
`ldi`.`lastPriceDate` AS `lastPriceDate`,
`ldi`.`matchingId` AS `matchingId`,
`h`.`id` AS `id`,
`h`.`itemid` AS `itemid`,
`h`.`price` AS `price`,
`h`.`date_itemid` AS `date_itemid`
FROM (`LatestDatesforIds` `ldi`
JOIN `history` `h`
ON ((`ldi`.`matchingId` = `h`.`id`)));
SELECT c.itemid,price,lastpriceDate,description
FROM collection c
INNER JOIN MatchDatesToPrices mp
ON c.itemid = mp.itemid
INNER JOIN h ON c.itemid = h.id
Difficult to test the speed on such a small dataset but avoiding 'Group By' might speed things up. You could try conditionally joining the history table to itself instead of Grouping?
e.g.
SELECT h.*, c.*, h1.price
FROM h
INNER JOIN history h1 ON h1.itemid = h.id
LEFT OUTER JOIN history h2 ON h2.itemid = h.id
AND h1.date_itemid < h2.date_itemid
INNER JOIN collection c ON c.itemid = h.id
WHERE h2.id IS NULL
AND h.description LIKE '%Awesome%'
Changing this line
AND h1.date_itemid < h2.date_itemid
to compare on a sequential, indexed field (preferably unique), e.g. id, will speed things up too.
I have 2 tables in MySQL: one holds contractors and the other holds projects. I want to produce a contractor-project report showing the apportioning of the projects. The problem is that INNER JOIN and LEFT and RIGHT OUTER JOINs all produce the same result, showing only the contractor with a project, even when I leave out the condition, which seems weird. Here are my statements:
SELECT DISTINCT (tbl_contractor.name_v), count( tbl_project.name_v )
FROM tbl_contractor
INNER JOIN tbl_project
ON tbl_project.Contractor = tbl_contractor.contractor_id_v
LIMIT 0 , 30;
SELECT DISTINCT (tbl_contractor.name_v), count( tbl_project.name_v )
FROM tbl_contractor
LEFT OUTER JOIN tbl_project
ON tbl_project.Contractor = tbl_contractor.contractor_id_v
LIMIT 0 , 30;
You have an aggregate function, COUNT(), without a GROUP BY. This means your query will return one row only.
You probably need a GROUP BY (contractor):
SELECT tbl_contractor.name_v, COUNT( tbl_project.name_v )
FROM tbl_contractor
LEFT OUTER JOIN tbl_project
ON tbl_project.Contractor = tbl_contractor.contractor_id_v
GROUP BY tbl_contractor.contractor_id_v
LIMIT 0 , 30;
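To see the difference between the join types once GROUP BY is in place, here is a small sketch using an in-memory SQLite database instead of MySQL. The two contractors, the two projects, the project_id column and the helper project_counts are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_contractor (contractor_id_v INTEGER PRIMARY KEY, name_v TEXT);
    CREATE TABLE tbl_project (
        project_id INTEGER PRIMARY KEY, name_v TEXT, Contractor INTEGER);
    INSERT INTO tbl_contractor VALUES (1, 'Acme'), (2, 'Bolt');
    INSERT INTO tbl_project (name_v, Contractor) VALUES ('Bridge', 1), ('Road', 1);
""")


def project_counts(join_type):
    """Count projects per contractor using the given join type."""
    if join_type not in ("INNER", "LEFT OUTER"):
        raise ValueError("unsupported join type")
    sql = f"""
        SELECT c.name_v, COUNT(p.name_v)
        FROM tbl_contractor c
        {join_type} JOIN tbl_project p
               ON p.Contractor = c.contractor_id_v
        GROUP BY c.contractor_id_v
        ORDER BY c.name_v
    """
    return conn.execute(sql).fetchall()


inner_counts = project_counts("INNER")
left_counts = project_counts("LEFT OUTER")
print(inner_counts)
print(left_counts)
```

With the GROUP BY, the inner join drops the contractor without projects, while the left join keeps it with a count of 0, because COUNT(p.name_v) ignores the NULLs produced by the outer join.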
By doing SELECT DISTINCT (tbl_contractor.name_v) the query will only return one row for each contractor name. Try removing the DISTINCT and see if you get a better contractor-project result.
These queries are really group by queries on the contractor. If every contractor has at least one project, then the inner and left outer joins will return the same results. If there are contractors without projects, then the results are affected by the LIMIT clause. You are only getting the first 30, and, for whatever reason, the matches are appearing first.
This is my query:
SELECT U.id AS user_id,C.name AS country,
CASE
WHEN U.facebook_id > 0 THEN CONCAT(F.first_name,' ',F.last_name)
WHEN U.twitter_id > 0 THEN T.name
WHEN U.regular_id > 0 THEN CONCAT(R.first,' ',R.last)
END AS name
FROM user U LEFT OUTER JOIN regular R
ON U.regular_id = R.id
LEFT OUTER JOIN twitter T
ON U.twitter_id = T.id
LEFT OUTER JOIN facebook F
ON U.facebook_id = F.id
LEFT OUTER JOIN country C
ON U.country_id = C.id
WHERE (CONCAT(F.first_name,' ',F.last_name) LIKE '%' OR T.name LIKE '%' OR CONCAT(R.first,' ',R.last) LIKE '%') AND U.active = 1
LIMIT 100
It's really fast, but EXPLAIN doesn't show it using any indexes (there are indexes).
But when I add ORDER BY name before the LIMIT, it takes a long time. Why? Is there a way to solve it?
Tables: users 150,000 rows, regular 50,000, facebook 50,000, twitter 50,000, country 250, and growing!
It takes a long time because it's a computed column, not a table column. The name column is the result of a CASE expression, and unlike simple selects with multiple joins, MySQL cannot use an index to sort this kind of data and has to sort the whole result itself.
I'm talking from ignorance here, but you could store the data in a temporary table and then sort it. It may go faster since you can create indexes for it but it won't be as fast, because of the different storage type.
UPDATE 2011-01-26
CREATE TEMPORARY TABLE `short_select`
SELECT U.id AS user_id,C.name AS country,
CASE
WHEN U.facebook_id > 0 THEN CONCAT(F.first_name,' ',F.last_name)
WHEN U.twitter_id > 0 THEN T.name
WHEN U.regular_id > 0 THEN CONCAT(R.first,' ',R.last)
END AS name
FROM user U LEFT OUTER JOIN regular R
ON U.regular_id = R.id
LEFT OUTER JOIN twitter T
ON U.twitter_id = T.id
LEFT OUTER JOIN facebook F
ON U.facebook_id = F.id
LEFT OUTER JOIN country C
ON U.country_id = C.id
WHERE (CONCAT(F.first_name,' ',F.last_name) LIKE '%' OR T.name LIKE '%' OR CONCAT(R.first,' ',R.last) LIKE '%') AND U.active = 1
LIMIT 100;
ALTER TABLE `short_select` ADD INDEX(`name`); -- add successive columns if you are going to order by them as well.
SELECT * FROM `short_select`
ORDER BY `name`; -- same as above
Remember temporary tables are dropped upon connection termination, so you don't have to clean them, but you should anyway.
Without actually knowing your DB structure, and assuming you have all of the proper indexes on everything: an ORDER BY takes a variable amount of time to sort the rows returned by a query (indexed or not). If it is only 10 rows, it will seem almost instant; with 2,000 rows it will be a little slower; if you are sorting 15k rows joined across multiple tables, the sort is going to take some time. Also make sure you're adding indexes to the fields you're sorting by. If you query this sorted result set often, you may also want to store the desired result in a presorted stub table for faster querying later.
You need to select the first 100 records from each name table separately, then union the results, join them with user and country, order and limit the output:
SELECT u.id AS user_id, c.name AS country, n.name
FROM (
SELECT facebook_id AS id, CONCAT(F.first_name, ' ', F.last_name) AS name
FROM facebook F
ORDER BY
first_name, last_name
LIMIT 100
UNION ALL
SELECT twitter_id, name
FROM twitter
WHERE twitter_id NOT IN
(
SELECT facebook_id
FROM facebook
)
ORDER BY
name
LIMIT 100
UNION ALL
SELECT regular_id, CONCAT(R.first, ' ', R.last)
FROM regular R
WHERE regular_id NOT IN
(
SELECT facebook_id
FROM facebook
)
AND
regular_id NOT IN
(
SELECT twitter_id
FROM twitter
)
ORDER BY
first, last
LIMIT 100
) n
JOIN user u
ON u.id = n.id
JOIN country c
ON c.id = u.country_id
Create the following indexes:
facebook (first_name, last_name)
twitter (name)
regular (first, last)
Note that this query orders slightly differently from your original one: in this query, 'Ronnie James Dio' would be sorted after 'Ronnie Scott'.
The use of functions on the columns prevents indexes from being used.
CONCAT(F.first_name,' ',F.last_name)
The result of the function is not indexed, even though the individual columns may be. Either you have to rewrite the conditions to query the name columns individually, or you have to store and index the result of that function (such as a "full name" column).
The index on [user.active] is unlikely to help you if most of the users are active.
I don't know what your application is all about, but I wonder whether it would have been easier to ditch the foreign keys in the User table and instead put the UserID as a foreign key in the other tables.
Dear PHP and MySQL experts,
I have two tables: a large one for posts/articles with 200,000 records (indexed column: sid), and a small one for topics with 20 records (indexed column: topicid). They share the same topicid.
Currently I'm using this (it takes around 0.4s):
+ First, get the last 50 records from the table:
SELECT sid, aid, title, time, topic, informant, ihome, alanguage, counter, type, images, chainid FROM veryzoo_stories ORDER BY sid DESC LIMIT 0,50
+ Then loop over each record to find the matching topic name for each post:
while ( .. ) {
SELECT topicname FROM veryzoo_topics WHERE topicid='$topic'
....
}
+ Now
I'm trying to use an INNER JOIN to speed up the process, but in my tests it takes much longer, from 1.5s up to 3.5s:
SELECT a.sid, a.aid, a.title, a.time, a.topic, a.informant, a.ihome, a.alanguage, a.counter, a.type, a.images, a.chainid, t.topicname FROM veryzoo_stories a INNER JOIN veryzoo_topics t ON a.topic = t.topicid ORDER BY sid DESC LIMIT 0,50
It looks like the inner join first joins all 200k records from the two tables and only then limits the result to 50, which takes a long time.
Please point me to the right way of doing this,
e.g. take the last 50 records from table one, then join them to table two, etc.
Do not use inner join unless the two tables share the same primary key, or you'll get duplicate values (and of course a slower query).
Please try this :
SELECT *
FROM (
SELECT a.sid, a.aid, a.title, a.time, a.topic, a.informant, a.ihome, a.alanguage, a.counter, a.type, a.images, a.chainid
FROM veryzoo_stories a
ORDER BY sid DESC
LIMIT 0 , 50
)b
INNER JOIN veryzoo_topics t ON b.topic = t.topicid
I made a small test and it seems to be faster. It uses a subquery (nested query) to first select the 50 records and then join.
Also make sure that veryzoo_stories.sid, veryzoo_stories.topic and veryzoo_topics.topicid are indexes (and that the relation exists if you use InnoDB). It should improve the performance.
Now it leaves the problem of the ORDER BY LIMIT. It is heavy because it orders the 200,000 records before selecting. I guess it's necessary. The indexes are very important when using ORDER BY.
Here is an article on the problem : ORDER BY … LIMIT Performance Optimization
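The "limit first, then join" shape can be tried out with this sketch, which uses an in-memory SQLite database instead of MySQL. The ten generated stories, the two topics and the limit of 3 are placeholders for the real 200,000 rows and LIMIT 50. The outer ORDER BY is an addition: the order of rows coming out of a derived table is not guaranteed after the join, so it is repeated there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE veryzoo_topics (topicid INTEGER PRIMARY KEY, topicname TEXT);
    CREATE TABLE veryzoo_stories (sid INTEGER PRIMARY KEY, title TEXT, topic INTEGER);
    INSERT INTO veryzoo_topics VALUES (1, 'news'), (2, 'sport');
""")
# Hypothetical stories 1..10, alternating between the two topics.
conn.executemany(
    "INSERT INTO veryzoo_stories (sid, title, topic) VALUES (?, ?, ?)",
    [(sid, f"story {sid}", sid % 2 + 1) for sid in range(1, 11)],
)

# The derived table b picks the newest rows first; only those few rows
# are then joined to the topics table.
NEWEST_WITH_TOPIC = """
    SELECT b.sid, t.topicname
    FROM (SELECT sid, title, topic
          FROM veryzoo_stories
          ORDER BY sid DESC
          LIMIT 3) b
    INNER JOIN veryzoo_topics t ON b.topic = t.topicid
    ORDER BY b.sid DESC
"""
newest = conn.execute(NEWEST_WITH_TOPIC).fetchall()
print(newest)
```

Because sid is the primary key, the inner ORDER BY ... LIMIT can walk the index backwards and stop early, which is what keeps the subquery cheap.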
I just tested the nested query + inner join and was surprised that performance improved a lot: it now takes only 0.22s. Here is my query:
SELECT a.*, t.topicname
FROM (SELECT sid, aid, title, TIME, topic, informant, ihome, alanguage, counter, TYPE, images, chainid
FROM veryzoo_stories
ORDER BY sid DESC
LIMIT 0, 50) a
INNER JOIN veryzoo_topics t ON a.topic = t.topicid
If no other solution comes up, I may use this one. Thanks to anyone who looks at this post.