I am using a query that takes an average of all the records for each given id...
$query = "SELECT bline_id, AVG(flow) as flowavg
FROM blf
WHERE bline_id BETWEEN 1 AND 30
GROUP BY bline_id
ORDER BY bline_id ASC";
These records are each updated once daily. I would like to use only the 10 most recent records for each id in my average.
Any help would be greatly appreciated.
blf table structure is:
id | bline_id | flow | date
If these are really updated every day, then use date arithmetic:
SELECT bline_id, AVG(flow) as flowavg
FROM blf
WHERE bline_id BETWEEN 1 AND 30 and
date >= date_sub(now(), interval 10 day)
GROUP BY bline_id
ORDER BY bline_id ASC
Otherwise, you have to put in a counter, which you can do with a correlated subquery:
SELECT bline_id, AVG(flow) as flowavg
FROM (select blf.*,
(select COUNT(*) from blf blf2 where blf2.bline_id = blf.bline_id and blf2.date >= blf.date
) seqnum
from blf
) blf
WHERE bline_id BETWEEN 1 AND 30 and
seqnum <= 10
GROUP BY bline_id
ORDER BY bline_id ASC
Another option is to simulate ROW_NUMBER().
This statement creates a counter and resets it every time it encounters a new bline_id. It then filters out any records that aren't in the first 10 rows.
SELECT bline_id,
Avg(flow) avg
FROM (SELECT id,
bline_id,
flow,
date,
CASE
WHEN @previous IS NULL
OR @previous = bline_id THEN @rownum := @rownum + 1
ELSE @rownum := 1
end rn,
@previous := bline_id
FROM blf,
(SELECT @rownum := 0,
@previous := NULL) t
WHERE bline_id > 0 and bline_id < 31
ORDER BY bline_id,
date DESC,
id) t
WHERE rn < 11
GROUP BY bline_id
It's worthwhile seeing this in action by removing the group by and looking at intermediate results
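On MySQL 8.0 or later, ROW_NUMBER() is available directly, so the counter does not have to be simulated. Below is a minimal sketch of that variant, run from Python against an in-memory SQLite database as a stand-in for MySQL (the window-function syntax is the same in both). The sample rows are made up for illustration.

```python
import sqlite3

# SQLite (3.25+) stands in for MySQL 8+ here; this is an assumption for the demo.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blf (id INTEGER PRIMARY KEY, bline_id INTEGER, flow REAL, date TEXT)"
)

# Hypothetical sample data: 12 daily readings for bline_id 1, 3 for bline_id 2.
rows = [(1, float(day), f"2024-01-{day:02d}") for day in range(1, 13)]
rows += [(2, 5.0, "2024-01-01"), (2, 7.0, "2024-01-02"), (2, 9.0, "2024-01-03")]
conn.executemany("INSERT INTO blf (bline_id, flow, date) VALUES (?, ?, ?)", rows)

# Number each bline_id's rows from newest to oldest, then average the first 10.
TOP_N_AVG = """
SELECT bline_id, AVG(flow) AS flowavg
FROM (SELECT bline_id, flow,
             ROW_NUMBER() OVER (PARTITION BY bline_id
                                ORDER BY date DESC, id DESC) AS rn
      FROM blf
      WHERE bline_id BETWEEN 1 AND 30) ranked
WHERE rn <= 10
GROUP BY bline_id
ORDER BY bline_id
"""
result = conn.execute(TOP_N_AVG).fetchall()
print(result)
```

For bline_id 1 only the readings 3 through 12 are averaged; bline_id 2 has fewer than 10 rows, so all of them are used.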
I have a database with columns I am working on. What I am looking for is the date associated with the row where the running sum of number reaches 6 in a query. The query I have now will give the date when the number in the column is six, but not when the sum of the previous rows reaches it. Example below:
Date number
---- ------
6mar16 1
8mar16 4
10mar16 6
12mar16 2
I would like a query that returns the 10mar16 date, because on that date the running total first reaches 6 or more. The earlier dates don't total up to six.
Here is an example of a query I have been working on:
SELECT max(date) FROM `numbers` WHERE `number` > 60
You could use this query, which tracks the accumulated sum and then returns the first one that meets the condition:
select date
from (select * from mytable order by date) as base,
(select @sum := 0) init
where (@sum := @sum + number) >= 6
limit 1
Most databases support ANSI standard window functions. In this case, cumulative sum is your friend:
select t.*
from (select t.*, sum(number) over (order by date) as sumnumber
from t
) t
where sumnumber >= 6
order by sumnumber
fetch first 1 row only;
In MySQL, you need variables:
select t.*
from (select t.*, (@sumn := @sumn + number) as sumnumber
from t cross join (select @sumn := 0) params
order by date
) t
where sumnumber >= 6
order by sumnumber
limit 1;
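To see the cumulative sum at work on the question's four rows, here is a small sketch run from Python with an in-memory SQLite database standing in for the real server. The table name numbers comes from the question; the dates are rewritten as ISO strings (an assumption) so that they sort chronologically.

```python
import sqlite3

# SQLite (3.25+) is used as a stand-in; SUM() OVER has the same syntax in MySQL 8+.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE numbers (date TEXT, number INTEGER)")
conn.executemany(
    "INSERT INTO numbers VALUES (?, ?)",
    [("2016-03-06", 1), ("2016-03-08", 4), ("2016-03-10", 6), ("2016-03-12", 2)],
)

# Running totals by date are 1, 5, 11, 13; keep the first row at or above the target.
FIRST_DATE_REACHING = """
SELECT date, running_total
FROM (SELECT date, SUM(number) OVER (ORDER BY date) AS running_total
      FROM numbers) t
WHERE running_total >= ?
ORDER BY date
LIMIT 1
"""
first_hit = conn.execute(FIRST_DATE_REACHING, (6,)).fetchone()
print(first_hit)
```

The row for 10 March is returned because the total jumps from 5 to 11 on that date.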
Awesome!!!! It seems to be working great. Here is the code that I used.
SELECT date, id, crewname
FROM (select * FROM flightrecord WHERE `crewname` = 'brayn'
ORDER BY dutyTimeArrive DESC) as base,
(select @sum := 0) init
WHERE (@sum := @sum + tankDropCount) >= 6
limit 1
I have no idea how to solve the following problem: I have several rows in my database with one timestamp per row. Now I would like to filter all rows for entries until the date interval for any two dates is bigger than 30 days. I have no defined date interval for specific dates, like between 12/01/2017 and 11/01/2017, that would be easy, even for me. All I know is that the timestamp interval from one row to the next row (query must be sorted by timestamp desc) must not be bigger than 30 days.
Please see my db at http://sqlfiddle.com/#!9/55a521/2
In this case the last entry shown should be the one with id 65404844. I would appreciate if you might give me a small hint for this.
Thank you very much!
You can use this query to build a filter.
SELECT
t.id,
from_unixtime(timestamp)
, IF(@pt < timestamp - 30*24*60*60, 1, 0) AS filter
, @pt := timestamp
FROM
t
, (SELECT @pt := MIN(timestamp) FROM t) v
ORDER BY timestamp
see it working live in an sqlfiddle
Ordering by timestamp is important here. The @pt variable is initialized with the lowest timestamp. It is also important to have the expressions in the select clause in the right order.
First you compare the current record with the variable in the IF() function. Then you assign the current record to the variable. This way, when the next row is evaluated, the variable still holds the value of the previous row in the IF() function.
To get the rows you want, use above query in a subquery to filter.
SELECT id, ts FROM (
SELECT
t.id,
from_unixtime(timestamp) as ts
, IF(@pt < timestamp - 30*24*60*60, 1, 0) AS filter
, @pt := timestamp
FROM
t
, (SELECT @pt := MIN(timestamp) FROM t) v
ORDER BY timestamp
) sq
WHERE sq.filter = 1
The first solution filters out the rows that are more than 30 days apart from the previous row. It only works if the id column has consecutive values.
SELECT t.id, t.timestamp, DATEDIFF(FROM_UNIXTIME(t1.timestamp), FROM_UNIXTIME(t.timestamp)) AS days_diff
FROM tbl t
LEFT JOIN tbl t1
ON t.id = t1.id + 1
HAVING days_diff <= 30
ORDER BY t.timestamp DESC;
The second solution keeps all the rows that are within 30 days of at least one other entry.
SELECT *
FROM tbl t
WHERE EXISTS (
SELECT id
FROM tbl t1
WHERE DATEDIFF(FROM_UNIXTIME(t1.timestamp), FROM_UNIXTIME(t.timestamp)) < 30
AND t1.id <> t.id
)
ORDER BY t.timestamp desc;
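If window functions are available (MySQL 8.0+), the "stop at the first gap of more than 30 days" rule can also be written with LAG() and a running count of breaks. This is only a sketch under assumptions: it runs against an in-memory SQLite database as a stand-in, and the ids and Unix timestamps are invented for the demo.

```python
import sqlite3

DAY = 24 * 60 * 60
BASE = 1_700_000_000  # arbitrary "newest" Unix timestamp for the demo

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, timestamp INTEGER)")
# Newest to oldest: ids 5, 4, 3 are 10 days apart, then a 40-day gap before id 2.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, BASE - 65 * DAY), (2, BASE - 60 * DAY), (3, BASE - 20 * DAY),
     (4, BASE - 10 * DAY), (5, BASE)],
)

# Walking from newest to oldest, flag a row when the gap to the newer neighbour
# exceeds the limit, then keep only rows seen before the first flag.
ROWS_BEFORE_FIRST_GAP = """
SELECT id
FROM (SELECT id, timestamp,
             SUM(is_break) OVER (ORDER BY timestamp DESC) AS breaks_so_far
      FROM (SELECT id, timestamp,
                   CASE WHEN LAG(timestamp) OVER (ORDER BY timestamp DESC)
                             - timestamp > :max_gap
                        THEN 1 ELSE 0 END AS is_break
            FROM t) flagged) counted
WHERE breaks_so_far = 0
ORDER BY timestamp DESC
"""
kept_ids = [row[0] for row in
            conn.execute(ROWS_BEFORE_FIRST_GAP, {"max_gap": 30 * DAY})]
print(kept_ids)
```

Id 1 is only 5 days older than id 2, but it is still dropped because it lies behind the first break.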
I have a question. Suppose I have this table in SQL:
date user_id
2015-03-17 00:06:12 143
2015-03-17 01:06:12 143
2015-03-17 02:06:12 143
2015-03-17 09:06:12 143
2015-03-17 10:10:10 200
I want to get the number of consecutive hours. For example, for user 143, I want to get 2 hours, for user 200 0 hours. I tried like this :
select user_id, TIMESTAMPDIFF(HOUR,min(date), max(date)) as hours
from myTable
group by user_id
But this query fetches all non-consecutive hours. Is it possible to solve the problem with a query, or do I need to post-process the results in PHP?
Use a variable to compare with the previous row.
SELECT user_id, SUM(cont_hour) FROM (
SELECT
user_id,
IF(CONCAT(DATE(@prev_date), ' ', HOUR(@prev_date), ':00:00') - INTERVAL 1 HOUR = CONCAT(DATE(t.date), ' ', HOUR(t.date), ':00:00')
AND @prev_user = t.user_id, 1, 0) AS cont_hour
, @prev_date := t.date
, @prev_user := t.user_id
FROM
table t
, (SELECT @prev_date := NULL, @prev_user := NULL) var_init_subquery
WHERE t.date BETWEEN <this> AND <that>
ORDER BY t.date
) sq
GROUP BY user_id;
I made the comparison a bit more complicated than you might expect, but I think it's necessary: you shouldn't just compare the hour, but also check that it's the same date (or the previous day, when it's around midnight).
You can read more about user variables here.
As a short explanation: the ORDER BY is very important, as is the order of the expressions in the SELECT clause. @prev_date holds the value from the previous row, because we assign the current row's value only after we have made our comparison.
Another version using temporary variables:
SET @u := 0;
SET @pt := 0;
SET @s := 0;
SELECT `user_id`, MAX(`s`) `conseq` FROM
(
SELECT
@s := IF(@u = `user_id`,
IF(UNIX_TIMESTAMP(`date`) - @pt = 3600, @s + 1, @s),
0) s,
@u := `user_id` `user_id`,
@pt := UNIX_TIMESTAMP(`date`) pt
FROM `users`
ORDER BY `date`
) AS t
GROUP BY `user_id`
The subquery sorts the rows by date, then compares user_id with the previous value. If the user IDs are equal, it calculates the difference between date and the previous timestamp @pt. If the difference is an hour (3600 seconds), the @s counter is incremented by one; otherwise it keeps its value. When the user_id changes, the counter is reset to 0:
s user_id pt
0 143 1426529172
1 143 1426532772
2 143 1426536372
2 143 1426561572
0 200 1426565410
The outer query collects the maximum counter values per user_id, since the maximum counter value corresponds to the last counter value per user_id.
Output
user_id conseq
143 2
200 0
Note, the query accepts the difference of exactly 1 hour. If you want a more flexible condition, simply adjust the comparison. For example, you can accept a difference in interval between 3000 and 4000 seconds as follows:
@s := IF(@u = `user_id`,
IF( (UNIX_TIMESTAMP(`date`) - @pt) BETWEEN 3000 AND 4000, @s + 1, @s),
0) s
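For comparison, the same count can be sketched without user variables by using LAG() (MySQL 8.0+). The example below runs the question's five rows through an in-memory SQLite database as a stand-in. The strftime('%s', ...) call is SQLite's way of getting epoch seconds; in MySQL you would use UNIX_TIMESTAMP(`date`) as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (date TEXT, user_id INTEGER)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [("2015-03-17 00:06:12", 143), ("2015-03-17 01:06:12", 143),
     ("2015-03-17 02:06:12", 143), ("2015-03-17 09:06:12", 143),
     ("2015-03-17 10:10:10", 200)],
)

# For each row, fetch the same user's previous timestamp and count the rows
# that follow their predecessor by exactly one hour (3600 seconds).
CONSECUTIVE_HOURS = """
SELECT user_id,
       SUM(CASE WHEN secs - prev_secs = 3600 THEN 1 ELSE 0 END) AS conseq
FROM (SELECT user_id,
             CAST(strftime('%s', date) AS INTEGER) AS secs,
             LAG(CAST(strftime('%s', date) AS INTEGER))
                 OVER (PARTITION BY user_id ORDER BY date) AS prev_secs
      FROM myTable) t
GROUP BY user_id
ORDER BY user_id
"""
hours = conn.execute(CONSECUTIVE_HOURS).fetchall()
print(hours)
```

The first row of each user has no predecessor, so its difference is NULL and it contributes 0, which is why user 200 ends up with 0 hours.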
I have a table in my db, which contains following data:
————————————————————————————————————————————————————————————————————————
Id startDate availabilityStatus Hotel_Id
————————————————————————————————————————————————————————————————————————
1 2016-07-01 available 2
2 2016-07-02 available 2
3 2016-07-03 unavailable 2
4 2016-07-04 available 3
5 2016-07-05 available 3
6 2016-07-06 available 3
7 2016-07-07 unavailable 4
8 2016-07-08 available 4
9 2016-07-09 available 4
10 2016-07-10 available 4
Now the user wants to see all the hotels that have 3 continuous days of availability in July’16.
I am able to write the query to get the availability, but I am not sure how to fetch the continuous date availability.
As per the above data, in July only Hotel Ids 3 and 4 have 3 continuous available dates, but Hotel Id 2 also has available dates. How can we exclude 2 and show just 3 and 4 via a MySQL query?
Please advise?
You can use the following query:
SELECT DISTINCT t1.hotel_id
FROM mytable AS t1
JOIN mytable AS t2
ON t1.hotel_id = t2.hotel_id AND
DATEDIFF(t1.startDate, t2.startDate) = 2 AND
t1.availabilityStatus = 'available' AND
t2.availabilityStatus = 'available'
LEFT JOIN mytable AS t3
ON t1.hotel_id = t3.hotel_id AND
t3.startDate > t2.startDate AND t3.startDate < t1.startDate AND
t3.availabilityStatus = 'unavailable'
WHERE t3.hotel_id IS NULL
The query is written in such a way, so that it can easily be adjusted in order to accommodate longer availability periods.
Edit:
Here's a solution using variables:
SELECT DISTINCT hotel_id
FROM (
SELECT hotel_id,
@seq := IF(@hid = hotel_id,
IF(availabilityStatus = 'available', @seq + 1, 0),
IF(@hid := hotel_id,
IF(availabilityStatus = 'available', 1, 0),
IF(availabilityStatus = 'available', 1, 0))) AS seq
FROM mytable
CROSS JOIN (SELECT @seq := 0, @hid := 0) AS vars
ORDER BY hotel_id, startDate) AS t
WHERE t.seq >= 3
You can test it with your actual data set and tell us how it compares with the first solution.
Try something like that. It works for any number of days. Replace N with 3.
SELECT DISTINCT A.Hotel_Id FROM table A
WHERE
A.availabilityStatus = 'available' AND
N-1 = (
SELECT count(DISTINCT startDate) FROM table B
WHERE B.availabilityStatus = 'available'
AND A.Hotel_Id = B.Hotel_Id
AND B.startDate
BETWEEN DATE_ADD(A.startDate, INTERVAL 1 DAY)
AND DATE_ADD(A.startDate, INTERVAL N-1 DAY)
)
It works like that: for each available date, count available dates in N-1 next days. If their count is N-1, add hotel_id to results.
Try this. I didn't get a chance to test it as sqlfiddle is not working, but the general idea is to take 2 more instances of the table, adding 1 and 2 days to the start date respectively.
Then join them based on the derived dates and hotel id.
select t1.Hotel_Id from
(select * from Table1 where availabilityStatus='available' ) t1
inner join
(select a.*, DATE_ADD(startDate,INTERVAL 1 DAY) as date_plus_one
from Table1 a where availabilityStatus='available' ) t2
on t1.startDate=t2.date_plus_one and t1.Hotel_Id=t2.Hotel_Id
inner join
(select a.*, DATE_ADD(startDate,INTERVAL 2 DAY) as date_plus_two
from Table1 a where availabilityStatus='available' ) t3
on t1.startDate=t3.date_plus_two and t1.Hotel_Id=t3.Hotel_Id
This query uses double self-join to find the same hotel available at day a, b and c, split by a day (function ADDDATE).
SELECT DISTINCT a.Hotel_Id
FROM table a
INNER JOIN table b ON a.Hotel_Id=b.Hotel_Id
INNER JOIN table c ON a.Hotel_Id=c.Hotel_Id
WHERE ADDDATE(a.startDate , INTERVAL 1 DAY) = b.startDate
AND ADDDATE(a.startDate , INTERVAL 2 DAY) = c.startDate
AND a.availabilityStatus = 'available'
AND b.availabilityStatus = 'available'
AND c.availabilityStatus = 'available'
It's working fine...
SELECT a.hotel_id FROM `mytable` as a WHERE
(select COUNT(id) from mytable as a1 where
DATE(a1.startDate)=DATE_ADD(a.startDate,INTERVAL 1 DAY) and
a1.hotel_id=a.hotel_id and
a1.availabilityStatus="Available"
) >0
and
(select COUNT(id) from mytable as a1 where
DATE(a1.startDate)=DATE_ADD(a.startDate,INTERVAL -1 DAY) and
a1.hotel_id=a.hotel_id and
a1.availabilityStatus="Available"
) >0
and
(select COUNT(id) from mytable as a1 where
DATE(a1.startDate)=DATE(a.startDate) and
a1.hotel_id=a.hotel_id and
a1.availabilityStatus="Available"
) >0
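Another way to look at the problem is the "gaps and islands" trick: among a hotel's available rows, consecutive dates minus their row number give the same value, so each run of consecutive days forms one group. Here is a rough sketch on the question's data, run through an in-memory SQLite database as a stand-in for MySQL 8.0+. julianday() is SQLite-specific; in MySQL, TO_DAYS(startDate) would play the same role. It assumes at most one row per hotel per date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (Id INTEGER PRIMARY KEY, startDate TEXT, "
    "availabilityStatus TEXT, Hotel_Id INTEGER)"
)
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", [
    (1, "2016-07-01", "available", 2), (2, "2016-07-02", "available", 2),
    (3, "2016-07-03", "unavailable", 2), (4, "2016-07-04", "available", 3),
    (5, "2016-07-05", "available", 3), (6, "2016-07-06", "available", 3),
    (7, "2016-07-07", "unavailable", 4), (8, "2016-07-08", "available", 4),
    (9, "2016-07-09", "available", 4), (10, "2016-07-10", "available", 4),
])

# day number minus row number is constant within a run of consecutive
# available days, so grouping by it isolates each run ("island").
CONSECUTIVE_AVAILABILITY = """
SELECT DISTINCT Hotel_Id
FROM (SELECT Hotel_Id,
             CAST(julianday(startDate) AS INTEGER)
               - ROW_NUMBER() OVER (PARTITION BY Hotel_Id
                                    ORDER BY startDate) AS island
      FROM mytable
      WHERE availabilityStatus = 'available') runs
GROUP BY Hotel_Id, island
HAVING COUNT(*) >= :n
ORDER BY Hotel_Id
"""
hotels = [row[0] for row in conn.execute(CONSECUTIVE_AVAILABILITY, {"n": 3})]
print(hotels)
```

Changing the n parameter adjusts the required run length without touching the query.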
//This is my query
SELECT bline_id, ROUND(Avg(flow),3) avg
FROM (SELECT id, bline_id, flow, date, CASE
WHEN @previous IS NULL
OR @previous = bline_id THEN @rownum := @rownum + 1
ELSE @rownum := 1
end rn,
@previous := bline_id
FROM blf,
(SELECT @rownum := 0,
@previous := NULL) t
WHERE bline_id > 0 and bline_id < 31
ORDER BY bline_id,
date DESC,
id) t
WHERE rn < 11
GROUP BY bline_id
This query takes the average of the last 10 records. I would like to be able to save these results back into the db, and compare them to the next group of 10 when a new record is added.
The end result I am looking for is to be able to tell whether the average has changed by more than plus or minus 2%.
Does this make sense?
You could create a table with the following fields:
id, bline_id, avg, timestamp
Every time you add a record, insert the results of your query above into this table.
You can then compare the latest record in this table with the previous one.
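A rough sketch of that comparison follows, again using an in-memory SQLite database as a stand-in for MySQL 8.0+. The table name blf_avg and the column names avg_flow and recorded_at are placeholders I made up for the suggested id, bline_id, avg, timestamp layout, and the stored averages are invented sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical history table; one row is added per bline_id after each new reading.
conn.execute(
    "CREATE TABLE blf_avg (id INTEGER PRIMARY KEY, bline_id INTEGER, "
    "avg_flow REAL, recorded_at TEXT)"
)
conn.executemany(
    "INSERT INTO blf_avg (bline_id, avg_flow, recorded_at) VALUES (?, ?, ?)",
    [(1, 100.0, "2024-01-01"), (1, 103.0, "2024-01-02"),
     (2, 50.0, "2024-01-01"), (2, 50.5, "2024-01-02")],
)

# Take the latest stored average per bline_id, pair it with the one before it,
# and flag a change of more than 2% relative to the previous value.
LATEST_CHANGE = """
SELECT bline_id, avg_flow, prev_avg,
       ABS(avg_flow - prev_avg) > 0.02 * ABS(prev_avg) AS changed
FROM (SELECT bline_id, avg_flow,
             LAG(avg_flow) OVER (PARTITION BY bline_id
                                 ORDER BY recorded_at, id) AS prev_avg,
             ROW_NUMBER() OVER (PARTITION BY bline_id
                                ORDER BY recorded_at DESC, id DESC) AS rn
      FROM blf_avg) t
WHERE rn = 1
ORDER BY bline_id
"""
changes = conn.execute(LATEST_CHANGE).fetchall()
print(changes)
```

Here bline_id 1 moved by 3% and is flagged with 1, while bline_id 2 moved by 1% and is flagged with 0.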