I am new to coding and this is the most complex query I have tried writing.
I am trying to create a query that will find the first and last entry for meters sorted by bunker_id within a date range. There are 12 different systems that have their meters captured when they are used.
I have several MySQL tables to track usage of systems and component configurations.
One table has a log of meters for all the systems. What I am trying to do is query this table between Date A and Date B, and receive the first and last meter values within the date range for each system. The systems may not be used every day, but will occasionally have multiple entries in a day.
I am looking to have the query run through POST on a web page with selectors for the dates and the system IDs. The data will be output into an HTML table.
date       | bunker_id | power_on_hours
01-01-2022 | A         | 26115.50
01-02-2022 | B         | 28535.13
01-02-2022 | A         | 26257.38
01-03-2022 | B         | 28682.73
What I am trying to return
bunker_id | starting_meters | ending_meters
A         | 26115.50        | 26257.38
B         | 28535.13        | 28682.73
The query that I have sets the starting and ending hours as the same value. I tried using MAX and MIN, but everything breaks if someone were to enter 0 for the meter.
SELECT
lu_bunkers.bunker_name as 'bunker_name',
lu_bunkers.bunker_sn,
SUM(system_utilization.hours_used) as 'total_hours',
SUM(CASE WHEN system_utilization.use_id = '1' THEN system_utilization.hours_used ELSE 0 END) as 'maintenance_hours',
SUM(CASE WHEN system_utilization.use_id = '2' THEN system_utilization.hours_used ELSE 0 END) as 'working_rf_hours',
SUM(CASE WHEN system_utilization.use_id = '3' THEN system_utilization.hours_used ELSE 0 END) as 'working_no_rf_hours',
SUM(CASE WHEN system_utilization.use_id = '4' THEN system_utilization.hours_used ELSE 0 END) as 'acd_hours',
((SUM(system_utilization.hours_used))/ ((DATEDIFF('2022-02-24', '2021-01-01')+1)*5/7*12))*100 as net_utilization,
((DATEDIFF('2022-02-24', '2021-01-01')+1)*(5/7)*12) as num_hours,
(SELECT system_meters.power_on_hours WHERE system_utilization.date_used BETWEEN '2021-01-01' AND '2022-02-24' ORDER BY system_utilization.date_used DESC LIMIT 1) as 'ending_hours',
(SELECT system_meters.power_on_hours WHERE system_utilization.date_used BETWEEN '2021-01-01' AND '2022-02-24' ORDER BY system_utilization.date_used ASC LIMIT 1) as 'starting_hours'
FROM system_utilization
LEFT JOIN lu_bunkers ON system_utilization.bunker_id = lu_bunkers.bunker_id
LEFT JOIN lu_use ON system_utilization.use_id = lu_use.use_id
LEFT JOIN system_meters ON system_utilization.id = system_meters.utilization_id
WHERE system_utilization.date_used BETWEEN '2021-01-01' AND '2022-02-24' AND system_utilization.bunker_id LIKE '%'
GROUP BY system_utilization.bunker_id
ORDER BY lu_bunkers.bunker_name
Two possible solutions:
Using Joins
SELECT
a.`bunker_id`,
b.`power_on_hours` as `starting_meters`,
c.`power_on_hours` as `ending_meters`
FROM `yourtable` a
LEFT JOIN `yourtable` b
ON (
SELECT `date`
FROM `yourtable`
WHERE `bunker_id` = a.`bunker_id`
ORDER BY `date`
LIMIT 1
) = b.`date`
AND a.`bunker_id` = b.`bunker_id`
LEFT JOIN `yourtable` c
ON (
SELECT `date`
FROM `yourtable`
WHERE `bunker_id` = a.`bunker_id`
ORDER BY `date` DESC
LIMIT 1
) = c.`date`
AND a.`bunker_id` = c.`bunker_id`
GROUP BY a.`bunker_id`
Using subqueries on columns
SELECT
a.`bunker_id`,
(
SELECT `power_on_hours`
FROM `yourtable`
WHERE `bunker_id` = a.`bunker_id`
ORDER BY `date` LIMIT 1
) as `starting_meters`,
(
SELECT `power_on_hours`
FROM `yourtable`
WHERE `bunker_id` = a.`bunker_id`
ORDER BY `date` DESC LIMIT 1
) as `ending_meters`
FROM `yourtable` a
GROUP BY a.`bunker_id`
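Neither query above restricts rows to the requested date range. Below is a minimal sketch of the subquery variant with the range added. It runs against SQLite from Python only so that it is self-contained; the table name `meters`, the ISO date strings, and the sample rows are assumptions for illustration. The same SQL shape should carry over to MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table mirroring the sample data in the question.
conn.execute("CREATE TABLE meters (date TEXT, bunker_id TEXT, power_on_hours REAL)")
conn.executemany(
    "INSERT INTO meters VALUES (?, ?, ?)",
    [
        ("2022-01-01", "A", 26115.50),
        ("2022-01-02", "B", 28535.13),
        ("2022-01-02", "A", 26257.38),
        ("2022-01-03", "B", 28682.73),
        ("2022-02-01", "A", 0.0),  # outside the range below, so it is ignored
    ],
)

# Each subquery picks the reading with the earliest / latest date in range,
# so a mistaken 0 entry cannot win the way it would with MIN() or MAX().
QUERY = """
SELECT a.bunker_id,
       (SELECT power_on_hours FROM meters
         WHERE bunker_id = a.bunker_id
           AND date BETWEEN :date_from AND :date_to
         ORDER BY date ASC LIMIT 1) AS starting_meters,
       (SELECT power_on_hours FROM meters
         WHERE bunker_id = a.bunker_id
           AND date BETWEEN :date_from AND :date_to
         ORDER BY date DESC LIMIT 1) AS ending_meters
FROM meters a
WHERE a.date BETWEEN :date_from AND :date_to
GROUP BY a.bunker_id
ORDER BY a.bunker_id
"""

def first_last(connection, date_from, date_to):
    """Return (bunker_id, starting_meters, ending_meters) per bunker."""
    params = {"date_from": date_from, "date_to": date_to}
    return connection.execute(QUERY, params).fetchall()

rows = first_last(conn, "2022-01-01", "2022-01-31")
```

If several readings can share the same date, the ORDER BY probably needs a tiebreaker (for example an auto-increment id or a timestamp column) to make "first" and "last" well defined.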
I would like to better optimize my code. I'd like to have a single query that allows an alias name to have its own limit and also include a result with no limit.
Currently I'm using two queries like this:
// ALL TIME //
$mikep = mysqli_query($link, "SELECT tasks.EID, reports.how_did_gig_go FROM tasks INNER JOIN reports ON tasks.EID=reports.eid WHERE `priority` IS NOT NULL AND `partners_name` IS NOT NULL AND mike IS NOT NULL GROUP BY EID ORDER BY tasks.show_date DESC;");
$num_rows_mikep = mysqli_num_rows($mikep);
$rating_sum_mikep = 0;
while ($row = mysqli_fetch_assoc($mikep)) {
$rating_mikep = $row['how_did_gig_go'];
$rating_sum_mikep += $rating_mikep;
}
$average_mikep = $rating_sum_mikep/$num_rows_mikep;
// AND NOW WITH A LIMIT 10 //
$mikep_limit = mysqli_query($link, "SELECT tasks.EID, reports.how_did_gig_go FROM tasks INNER JOIN reports ON tasks.EID=reports.eid WHERE `priority` IS NOT NULL AND `partners_name` IS NOT NULL AND mike IS NOT NULL GROUP BY EID ORDER BY tasks.show_date DESC LIMIT 10;");
$num_rows_mikep_limit = mysqli_num_rows($mikep_limit);
$rating_sum_mikep_limit = 0;
while ($row = mysqli_fetch_assoc($mikep_limit)) {
$rating_mikep_limit = $row['how_did_gig_go'];
$rating_sum_mikep_limit += $rating_mikep_limit;
}
$average_mikep_limit = $rating_sum_mikep_limit/$num_rows_mikep_limit;
This allows me to show an all-time average and also an average over the last 10 reviews. Is it really necessary for me to set up two queries?
Also, I understand I could get the sum in the query, but not all the values are numbers, so I've actually converted them in PHP, but left out that code in order to try and simplify what is displayed in the code.
All-time average and average over the last 10 reviews
In the best-case scenario, where your column how_did_gig_go is 100% numeric, a single query like this could work:
SELECT
AVG(how_did_gig_go) AS avg_how_did_gig_go
, SUM(CASE
WHEN rn <= 10 THEN how_did_gig_go
ELSE 0
END) / 10 AS latest10_avg
FROM (
SELECT
@num := @num + 1 AS rn
, tasks.show_date
, reports.how_did_gig_go
FROM tasks
INNER JOIN reports ON tasks.EID = reports.eid
CROSS JOIN ( SELECT @num := 0 AS n ) AS v
WHERE priority IS NOT NULL
AND partners_name IS NOT NULL
AND mike IS NOT NULL
ORDER BY tasks.show_date DESC
) AS d
However, unless all the "numbers" are in fact numeric, you are doomed to sending every row back from the server for PHP to process, unless you can clean up the data in MySQL somehow.
You might avoid sending all that data twice if you establish a way for your PHP code to use only the top 10 rows from the whole list. There are probably ways of doing that in PHP.
If you wanted assistance in SQL to do that, then maybe having two columns would help, since a single query would reduce the number of table scans.
SELECT
EID
, how_did_gig_go
, CASE
WHEN rn <= 10 THEN how_did_gig_go
ELSE 0
END AS latest10_how_did_gig_go
FROM (
SELECT
@num := @num + 1 AS rn
, tasks.EID
, reports.how_did_gig_go
FROM tasks
INNER JOIN reports ON tasks.EID = reports.eid
CROSS JOIN ( SELECT @num := 0 AS n ) AS v
WHERE priority IS NOT NULL
AND partners_name IS NOT NULL
AND mike IS NOT NULL
ORDER BY tasks.show_date DESC
) AS d
In MySQL 8.0 and later, ROW_NUMBER() OVER (ORDER BY tasks.show_date DESC) would be a better method than the "roll your own" row numbering (using @num := @num + 1) shown before.
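As a rough illustration of the window-function approach, here is a sketch that computes both averages in one pass. It uses Python's sqlite3 module (SQLite 3.25 or newer is needed for window functions) rather than MySQL, and the `reviews` table, its `rating` column, and the sample rows are hypothetical. It also assumes the ratings are already numeric, and it uses AVG over a CASE expression, which ignores the NULLs produced for older rows and so does not depend on there being exactly 10 reviews.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, already-numeric ratings table.
conn.execute("CREATE TABLE reviews (show_date TEXT, rating INTEGER)")
# Twelve reviews: the two oldest are rated 1, the ten newest are rated 5.
sample = [(f"2022-01-{day:02d}", 1 if day <= 2 else 5) for day in range(1, 13)]
conn.executemany("INSERT INTO reviews VALUES (?, ?)", sample)

QUERY = """
SELECT AVG(rating) AS avg_all,
       AVG(CASE WHEN rn <= 10 THEN rating END) AS avg_latest10
FROM (
    SELECT rating,
           ROW_NUMBER() OVER (ORDER BY show_date DESC) AS rn
    FROM reviews
) AS d
"""

# One row comes back: the all-time average and the average of the 10 newest.
avg_all, avg_latest10 = conn.execute(QUERY).fetchone()
```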
Can anyone help me optimise this query? I have the following table:
cdu_user_progress:
--------------------------------------------------------------
|id |uid |lesson_id |game_id |date |score |
--------------------------------------------------------------
For each user, I'm trying to obtain the difference between the best and first scores for a particular game_id for a particular lesson_id, and order the results by that difference ('progress' in my query):
SELECT ms.uid AS id, ms.max_score - fs.first_score AS progress
FROM (
SELECT up.uid, MAX(CASE WHEN game_id = 3 THEN score ELSE NULL END) AS max_score
FROM cdu_user_progress up
WHERE (up.uid IN ('1671', '1672', '1673', '1674', '1675', '1676', '1679', '1716', '1725', '1726', '1937', '1964', '1996', '2062', '2065', '2066', '2085', '2086')) AND (up.lesson_id = '65') AND (up.score > '-1')
GROUP BY up.uid
) ms
LEFT JOIN (
SELECT up.uid, up.score AS first_score
FROM cdu_user_progress up
INNER JOIN (
SELECT up.uid, MIN(CASE WHEN game_id = 3 THEN date ELSE NULL END) AS first_date
FROM cdu_user_progress up
WHERE (up.uid IN ('1671', '1672', '1673', '1674', '1675', '1676', '1679', '1716', '1725', '1726', '1937', '1964', '1996', '2062', '2065', '2066', '2085', '2086')) AND (up.lesson_id = '65') AND (up.score > '-1')
GROUP BY up.uid
) fd ON fd.uid = up.uid AND fd.first_date = up.date
) fs ON fs.uid = ms.uid
ORDER BY progress DESC
Any help would be most appreciated!
Absent any EXPLAIN output or index definitions, we can't make any recommendations. (I noted in a comment that it looks like some join predicates are missing. If we don't have guaranteed uniqueness on the (uid,date) tuple in cdu_user_progress, there's potential that we are going to get rows that are for a different lesson_id or a score that isn't greater than '-1'.)
In the query text, immediately before ) fs , I'd be adding
AND up.lesson_id = '65'
AND up.score > '-1'
GROUP BY up.uid
I'd also wrap the up.score column (in the SELECT list of the fs view) in an aggregate function, either MIN() or MAX(), for compliance with the ANSI standard (even though it isn't required by MySQL when SQL_MODE doesn't include ONLY_FULL_GROUP_BY).
If I didn't have a suitable index defined, I'd consider adding an index:
... ON cdu_user_progress (lesson_id, uid, score, game_id, date)
There's some overhead for the derived tables (materializing the inline views), and those derived tables aren't going to have indexes on them (in MySQL 5.5 and earlier). But the GROUP BY in each inline view ensures that we'll have fewer than 20 rows, so that's not really going to be a problem.
So, if there's a performance issue, it's in the view queries. Again, we'd really need to see the output from EXPLAIN and the index definitions, and some cardinality estimates, in order to make recommendations.
FOLLOWUP
Given that there's not a unique constraint on (uid,date), I'd add those predicates in the fs view query. I'd also use unique table aliases in the query (for each reference to cdu_user_progress) to make both the statement and the EXPLAIN output easier to read. With the GROUP BY clause and the aggregate function added in the fs view as well, I'd write the query like this:
SELECT ms.uid AS id
, ms.max_score - fs.first_score AS progress
FROM ( SELECT up.uid
, MAX(CASE WHEN up.game_id = 3 THEN up.score ELSE NULL END) AS max_score
FROM cdu_user_progress up
WHERE up.uid IN ('1671','1672','1673','1674','1675','1676','1679','1716','1725','1726','1937','1964','1996','2062','2065','2066','2085','2086')
AND up.lesson_id = '65'
AND up.score > '-1'
GROUP BY up.uid
) ms
LEFT
JOIN ( SELECT uo.uid
, MIN(uo.score) AS first_score
FROM ( SELECT un.uid
, MIN(CASE WHEN un.game_id = 3 THEN un.date ELSE NULL END) AS first_date
FROM cdu_user_progress un
WHERE un.uid IN ('1671','1672','1673','1674','1675','1676','1679','1716','1725','1726','1937','1964','1996','2062','2065','2066','2085','2086')
AND un.lesson_id = '65'
AND un.score > '-1'
GROUP BY un.uid
) fd
JOIN cdu_user_progress uo
ON uo.uid = fd.uid
AND uo.date = fd.first_date
AND uo.lesson_id = '65'
AND uo.score > '-1'
GROUP BY uo.uid
) fs
ON fs.uid = ms.uid
ORDER BY progress DESC
And I believe that would make the index I recommended above suitable for all of the references to cdu_user_progress.
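To sanity-check the shape of the rewritten query, here is a small sketch that runs it against SQLite from Python. The sample rows, the shortened uid list, and the integer column types are assumptions for illustration; only the table and column names come from the question. The lesson 66 row shares a date with a lesson 65 row, which is the case the extra predicates in the fs view are meant to guard against.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE cdu_user_progress (
    id INTEGER PRIMARY KEY, uid INTEGER, lesson_id INTEGER,
    game_id INTEGER, date TEXT, score INTEGER)""")
conn.executemany(
    "INSERT INTO cdu_user_progress (uid, lesson_id, game_id, date, score) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (1671, 65, 3, "2014-01-01", 10),  # first score for 1671
        (1671, 65, 3, "2014-01-02", 40),  # best score for 1671
        (1671, 65, 3, "2014-01-03", 25),
        (1672, 65, 3, "2014-01-01", 30),  # first score for 1672
        (1672, 65, 3, "2014-01-05", 35),  # best score for 1672
        (1672, 66, 3, "2014-01-01", 5),   # other lesson, same date: must be ignored
    ],
)

QUERY = """
SELECT ms.uid AS id, ms.max_score - fs.first_score AS progress
FROM ( SELECT up.uid,
              MAX(CASE WHEN up.game_id = 3 THEN up.score END) AS max_score
       FROM cdu_user_progress up
       WHERE up.uid IN (1671, 1672) AND up.lesson_id = 65 AND up.score > -1
       GROUP BY up.uid ) ms
LEFT JOIN ( SELECT uo.uid, MIN(uo.score) AS first_score
            FROM ( SELECT un.uid,
                          MIN(CASE WHEN un.game_id = 3 THEN un.date END) AS first_date
                   FROM cdu_user_progress un
                   WHERE un.uid IN (1671, 1672) AND un.lesson_id = 65 AND un.score > -1
                   GROUP BY un.uid ) fd
            JOIN cdu_user_progress uo
              ON uo.uid = fd.uid AND uo.date = fd.first_date
             AND uo.lesson_id = 65 AND uo.score > -1
            GROUP BY uo.uid ) fs
  ON fs.uid = ms.uid
ORDER BY progress DESC
"""

progress = conn.execute(QUERY).fetchall()
```

Without the `uo.lesson_id = 65` predicate, the lesson 66 row would be picked up as the first score for uid 1672 in this sample data.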
I need to show empty rows for BRANDS too. I mean, there is a third brand not shown in this query, look:
SELECT
da_brands.name AS brand_name,
COUNT(DISTINCT da_deals.id) AS total_deals,
0 AS total_downloaded_coupons,
0 AS total_validated_coupons,
COUNT(da_logs.id) AS total_likes
FROM
da_brands,
da_deals
LEFT JOIN da_logs
ON da_logs.fk_deal_id = da_deals.id
AND da_logs.fk_deal_id = da_deals.id
AND da_logs.type = 'deal_like'
WHERE da_brands.fk_club_id = 6
AND da_deals.fk_brand_id = da_brands.id
AND da_brands.date <= NOW()
GROUP BY da_brands.name
ORDER BY da_brands.name ASC
RESULTS:
brand_name total_deals total_downloaded_coupons total_validated_coupons total_likes
Marca2 2 0 0 1
Marca1 9 0 0 4
This condition shows only brands that have deals, but I want all brands:
AND da_deals.fk_brand_id = da_brands.id
Any idea what statement I should use?
Thank you so much!
The following line in the WHERE clause is the problem:
AND da_deals.fk_brand_id = da_brands.id
You need to LEFT JOIN to da_deals, the same way you did to da_logs, and move that condition into the ON clause of the join.
See below...
SELECT
da_brands.name AS brand_name,
COUNT(DISTINCT da_deals.id) AS total_deals,
0 AS total_downloaded_coupons,
0 AS total_validated_coupons,
COUNT(da_logs.id) AS total_likes
FROM da_brands
LEFT JOIN da_deals
ON da_brands.id = da_deals.fk_brand_id
LEFT JOIN da_logs
ON da_logs.fk_deal_id = da_deals.id
AND da_logs.type = 'deal_like'
WHERE da_brands.fk_club_id = 6
AND da_brands.date <= NOW()
GROUP BY da_brands.name
ORDER BY da_brands.name ASC
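To see the effect of the LEFT JOIN on a brand with no deals, here is a small sketch using Python's sqlite3 module in place of MySQL. The table and column names follow the question, but the rows are made up, the date column is left out for brevity, and the short aliases b, d, and l are mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE da_brands (id INTEGER PRIMARY KEY, name TEXT, fk_club_id INTEGER);
CREATE TABLE da_deals  (id INTEGER PRIMARY KEY, fk_brand_id INTEGER);
CREATE TABLE da_logs   (id INTEGER PRIMARY KEY, fk_deal_id INTEGER, type TEXT);
-- Marca3 has no deals at all; Marca2 has a deal but no likes.
INSERT INTO da_brands VALUES (1, 'Marca1', 6), (2, 'Marca2', 6), (3, 'Marca3', 6);
INSERT INTO da_deals  VALUES (10, 1), (11, 1), (20, 2);
INSERT INTO da_logs   VALUES (100, 10, 'deal_like'), (101, 10, 'deal_like'),
                             (102, 20, 'deal_view');
""")

QUERY = """
SELECT b.name AS brand_name,
       COUNT(DISTINCT d.id) AS total_deals,
       COUNT(l.id) AS total_likes
FROM da_brands b
LEFT JOIN da_deals d ON d.fk_brand_id = b.id
LEFT JOIN da_logs l ON l.fk_deal_id = d.id AND l.type = 'deal_like'
WHERE b.fk_club_id = 6
GROUP BY b.name
ORDER BY b.name ASC
"""

result = conn.execute(QUERY).fetchall()
```

Because COUNT skips NULLs, the brand without deals comes back with zeros instead of disappearing from the result.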
I have the following MySQL query which returns the number of photos found for each record where the number of photos is greater than 0.
SELECT advert_id, (SELECT COUNT( * ) FROM advert_images b WHERE b.advert_id = adverts.advert_id) AS num_photos
FROM adverts
WHERE adverts.approve = '1'
HAVING num_photos > 0
The query works fine, but I just want to return a count of the records found, i.e. the number of records which have at least one photo. I've tried to wrap the whole query in a COUNT, but it gives an error. I want to do this in the query, and not as a separate count of the records found in PHP.
SELECT COUNT(*) AS TotalRecords
FROM
(
SELECT a.advert_id, COUNT(*) AS num_photos
FROM adverts AS a
JOIN advert_images AS i
ON i.advert_id = a.advert_id
WHERE a.approve = '1'
GROUP BY a.advert_id
HAVING num_photos > 0
) AS mq
SELECT COUNT(*) FROM (SELECT advert_id, (SELECT COUNT( * ) FROM advert_images b WHERE b.advert_id = adverts.advert_id) AS num_photos
FROM adverts
WHERE adverts.approve = '1'
HAVING num_photos > 0) AS c
This should do the trick
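Another way to express "approved adverts with at least one photo" is an EXISTS test, which avoids counting the photos per advert at all. This is an alternative to the two queries above, not a restatement of them. The sketch below runs against SQLite from Python with made-up rows; the `image_id` column is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE adverts (advert_id INTEGER PRIMARY KEY, approve TEXT);
CREATE TABLE advert_images (image_id INTEGER PRIMARY KEY, advert_id INTEGER);
-- Advert 3 is not approved; advert 4 is approved but has no photos.
INSERT INTO adverts VALUES (1, '1'), (2, '1'), (3, '0'), (4, '1');
INSERT INTO advert_images (advert_id) VALUES (1), (1), (2), (3);
""")

QUERY = """
SELECT COUNT(*) AS total_records
FROM adverts a
WHERE a.approve = '1'
  AND EXISTS (SELECT 1 FROM advert_images i WHERE i.advert_id = a.advert_id)
"""

# Each approved advert is counted once, however many photos it has.
(total_records,) = conn.execute(QUERY).fetchone()
```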