SQL for grouping by timeframe with empty frames

My goal is to create a SQL query that counts all items in a certain time frame (e.g. 5 minutes).
That's my code so far:
SELECT FROM_UNIXTIME(FLOOR(timestamp_stop/(5*60))*(5*60), '%H:%i') AS timekey, timestamp_stop, count(item) AS performance
FROM task
WHERE done = 1
GROUP BY timekey
ORDER BY timestamp_stop ASC
That works great, but doesn't include time frames in which there aren't any records in the database.
I would like to also get these 0-count-ones, up to the current time.
Currently I have no simple/elegant solution in mind. Any ideas?
A little post-processing in PHP would also be possible.

As Gordon mentioned, you probably want a secondary table as a basis for ALL 5-minute intervals. I have done similar with a query to self-build using MySQL variables.
select
YourTable.WhateverFields
from
( select
@startTime RangeStart,
@startTime := date_add( @startTime, interval 5 MINUTE ) RangeEnd
from
( select @startTime := '2014-10-20' ) sqlvars,
AnyTableThatHasAsManyDaysYouExpectToReport
limit
12 * numberOfHoursYouNeed * numberOfDaysYouNeed ) DynamicTimeRange
LEFT JOIN YourTable
on YourTable.DateTimeField >= DynamicTimeRange.RangeStart
AND YourTable.DateTimeField < DynamicTimeRange.RangeEnd
So, in this example, the innermost query declares a variable "startTime" set to Oct 20, 2014, which defaults to midnight (00:00:00). The query one level out from that creates a result set with two columns, RangeStart and RangeEnd, which might look something like...
RangeStart        RangeEnd
2014-10-20 00:00  2014-10-20 00:05
2014-10-20 00:05  2014-10-20 00:10
2014-10-20 00:10  2014-10-20 00:15
2014-10-20 00:15  2014-10-20 00:20
2014-10-20 00:20  2014-10-20 00:25
The table reference "AnyTableThatHasAsManyDaysYouExpectToReport" is just that... any table in your database that has at least as many records as you need to generate your 5-minute intervals for however many hours and days. For 1 day's worth you need 12 records per hour (60 / 5 minutes) * 24 hours = 288 records. If you wanted a week, multiply that by 7. My sample just has placeholders to help clarify the intent...
But with the LEFT JOIN, you get all the intervals, including those with no matching records...

If such time frames exist in the table, but the WHERE clause filters out their records, you can do:
SELECT FROM_UNIXTIME(FLOOR(timestamp_stop/(5*60))*(5*60), '%H:%i') AS timekey,
timestamp_stop,
sum(item is not null and done = 1) AS performance
FROM task
GROUP BY timekey
ORDER BY timestamp_stop ASC;
If you still have gaps, then you need to generate a table (or subquery) containing the list of the time frames that you want and use left join.
EDIT:
A subquery is not pleasant. You have to list all the time values. Something like:
SELECT q.timekey, t.timestamp_stop, coalesce(t.performance, 0) as performance
FROM (SELECT '00:00' as timekey UNION ALL
SELECT '00:05' UNION ALL
. . .
) q LEFT JOIN
(SELECT FROM_UNIXTIME(FLOOR(timestamp_stop/(5*60))*(5*60), '%H:%i') AS timekey,
timestamp_stop,
COUNT(item) AS performance
FROM task
WHERE done = 1
GROUP BY timekey
) t
ON t.timekey = q.timekey
ORDER BY timestamp_stop ASC;
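Since the question allows some post-processing in application code, the empty frames can also be filled in after the query has run. The thread uses PHP; the following is only a minimal sketch in Python of the same idea, assuming the timestamps are Unix seconds. The function name count_per_bucket and the sample values are hypothetical.

```python
from collections import Counter

BUCKET = 5 * 60  # bucket width in seconds (5 minutes)


def count_per_bucket(stop_times, start, now):
    """Count items per 5-minute bucket from start up to now, keeping empty buckets."""
    # Floor each timestamp to the start of its bucket, like
    # FLOOR(timestamp_stop / (5*60)) * (5*60) in the SQL.
    counts = Counter((ts // BUCKET) * BUCKET for ts in stop_times)
    first = (start // BUCKET) * BUCKET
    last = (now // BUCKET) * BUCKET
    # Walk every bucket in the range; buckets without records get 0.
    return [(b, counts.get(b, 0)) for b in range(first, last + BUCKET, BUCKET)]


# Hypothetical sample: two tasks in one bucket, one task later, gaps in between.
rows = count_per_bucket([1000, 1100, 1900], start=900, now=2400)
for bucket_start, performance in rows:
    print(bucket_start, performance)
```

The bucket start is kept as a plain integer here; formatting it as HH:MM would be a separate display step.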

Related

postgresql max(count(*)) - php

I have a problem in PostgreSQL.
I have one cohort (a group of people) and I would like to count the persons in this cohort.
Begin date : "2014-09-01", End date : "2014-11-30".
I have 5 persons between 09/01 and 09/22
I have 5 persons between 09/20 and 09/25
I have 5 persons between 09/26 and 10/05
I have 5 persons between 10/01 and 11/30
I want the maximum number of persons accommodated for each month between the begin date and the end date, in SQL (or PHP). Expected max person count:
September(09) => 10
October(10) => 10
November(11) => 5

Find the maximum of simultaneously present persons on a single day for every month in a given period.
I suggest generate_series() to produce the series of days in your period. Then aggregate twice:
First to get a count for each day. A single day can be handled with a plain BETWEEN. Your ranges are obviously meant to have inclusive bounds.
Second to get the maximum per month.
SELECT date_trunc('month', day)::date AS month, max(ct) AS max_ct
FROM (
SELECT g.day, count(*) AS ct
FROM cohorte
,generate_series('2014-09-01'::date -- first of Sept.
,'2014-11-30'::date -- last of Nov.
,'1 day'::interval) g(day)
WHERE g.day BETWEEN t_begin AND t_end
GROUP BY 1
) sub
GROUP BY 1
ORDER BY 1;
Returns:
month | max_ct
-----------+--------
2014-09-01 | 10
2014-10-01 | 10
2014-11-01 | 5
Use to_char() to prettify the month output.
SQL Fiddle .. is down ATM. Here is my test case (that you should have provided):
CREATE TEMP TABLE cohorte (
cohorte_id serial PRIMARY KEY
,person_id int NOT NULL
,t_begin date NOT NULL -- inclusive
,t_end date NOT NULL -- inclusive
);
INSERT INTO cohorte(person_id, t_begin, t_end)
SELECT g, '2014-09-01'::date, '2014-09-22'::date
FROM generate_series (1,5) g
UNION ALL
SELECT g+5, '2014-09-20', '2014-09-25'
FROM generate_series (1,5) g
UNION ALL
SELECT g+10, '2014-09-26', '2014-10-05'
FROM generate_series (1,5) g
UNION ALL
SELECT g+15, '2014-10-01', '2014-11-30'
FROM generate_series (1,5) g;
For more complex checks I'd suggest the OVERLAPS operator:
Find overlapping date ranges in PostgreSQL
For more complex scenarios I'd also consider range types:
Preventing adjacent/overlapping entries with EXCLUDE in PostgreSQL
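To make the two aggregation steps easier to follow, here is a minimal sketch of the same logic outside the database, in Python rather than SQL. It is only an illustration under the test data above; the function name max_present_per_month is hypothetical. Unlike the query, it also reports months in which nobody is present (with 0).

```python
from datetime import date, timedelta


def max_present_per_month(stays, first_day, last_day):
    """Return {(year, month): max number of persons present on any single day}."""
    result = {}
    day = first_day
    while day <= last_day:
        # Step 1: count per day, with inclusive bounds like BETWEEN.
        present = sum(1 for begin, end in stays if begin <= day <= end)
        # Step 2: keep the maximum per month.
        key = (day.year, day.month)
        result[key] = max(result.get(key, 0), present)
        day += timedelta(days=1)
    return result


# Same data as the test case: four groups of five persons each.
stays = (
    [(date(2014, 9, 1), date(2014, 9, 22))] * 5
    + [(date(2014, 9, 20), date(2014, 9, 25))] * 5
    + [(date(2014, 9, 26), date(2014, 10, 5))] * 5
    + [(date(2014, 10, 1), date(2014, 11, 30))] * 5
)
monthly_max = max_present_per_month(stays, date(2014, 9, 1), date(2014, 11, 30))
print(monthly_max)
```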

Can't you use a window function?
I'd try something like this (I have not tested this code, I am just laying out my thoughts)
SELECT max(count) FROM (
SELECT count(*) OVER (PARTITION BY ???) as count
FROM contract
WHERE daterange(dateStart, dateEnd, '[]') && daterange('2014-09-01', '2014-10-01', '[)')
) as max
Here, my remaining problem is that I can't find a way to partition by each day of the interval. Maybe this is the wrong approach, but I would be interested in a solution based on window functions.
Edit: with this query, you get the maximum number of people present simultaneously, but over the whole time span, not just a given month
with presence as (
SELECT id, generate_series(begin_date, end_date, '1 day'::interval) AS date
FROM test
),
presents as (
SELECT count(*) OVER (PARTITION BY date) AS count
FROM presence
)
SELECT max(count) from presents;
Here we come, I think
Imagine your person table has 3 columns :
id
entrance_date
leaving_date
the request would look like
WITH presents as (
SELECT id,
daterange(entrance_date, leaving_date, '[]') * daterange('2014-09-01', '2014-11-30', '[]') as range
FROM person
WHERE daterange(entrance_date, leaving_date, '[]') && daterange('2014-09-01', '2014-11-30', '[]')
),
present_per_day as (
SELECT id,
generate_series(lower(range), upper(range) - 1, '1 day'::interval) AS date -- upper() of a canonical daterange is exclusive
FROM presents
),
count_per_day as (
SELECT count(*) OVER (PARTITION BY date) AS count,
date
FROM present_per_day
)
SELECT max(count) OVER (PARTITION BY date_part('year', date), date_part('month', date)) as max,
date_part('year', date),
date_part('month', date)
FROM count_per_day;
(I have to leave, I hope I'll have time to test it later)
In fact, @erwin's solution is much easier and more efficient than this one.

mysql total working hours

id title start end
1 Doing Coding for this project. 2013-04-02 02:00:00 2013-04-02 04:00:00
2 Doing Coding for this project. 2013-04-02 04:00:00 2013-04-02 06:00:00
3 Doing Coding for this project. 2013-04-02 06:00:00 2013-04-02 06:30:00
I have the above records in a MySQL table. Now I want to get the total number of hours.
I am developing a timesheet management application and we need to display an employee's total working hours with minutes and seconds (i.e. 04:30:00 according to the data I shared).
What have I tried?
SELECT HOUR(TIMEDIFF(end,start)) AS 'totalHour', but that works only for each row, not across all records.
I have also tried TIMESTAMPDIFF.
Is this possible?
EDIT
From the answers I have received, I have tried every single one of them, but every time I just get 4 or 4.5000 when it should return 06:30:00.

HOUR() returns only the hour part of a time value (0 to 23 for time-of-day values) and drops the minutes and seconds, so it is not the right tool for a total number of hours in a difference.
For a single value you could use TIMESTAMPDIFF() like:
SELECT TIMESTAMPDIFF(HOUR, start, end) AS `totalHour` FROM ...
If you want to calculate it for the whole project, you have to sum up all the time differences and then print the result formatted, probably with a function like TIME_FORMAT(), which can print hours larger than 24:
If the time value contains an hour part that is greater than 23, the %H and %k hour format specifiers produce a value larger than the usual range of 0..23. The other hour format specifiers produce the hour value modulo 12.
So you can use:
SELECT TIME_FORMAT( SEC_TO_TIME( SUM( TIME_TO_SEC(end) - TIME_TO_SEC(start))), "%H")
AS `totalHour`
FROM ...
GROUP BY sort_of_project_id
If you need seconds/minutes too (as suggested in a comment), use either:
time_to_sec( <left side of select>)/3600 which will return a value like 4.84 hours
TIME_FORMAT( ..., "%H:%i:%s") which will display 4:38:24 (note that %i is minutes; %m would be the month)
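The sum-then-format idea can be checked against the question's three rows outside the database. This is only a sketch in Python, not MySQL; the function name total_worked is hypothetical, and the rows are copied from the question.

```python
from datetime import datetime, timedelta


def total_worked(rows):
    """Sum (start, end) pairs and format the total as HH:MM:SS; hours may exceed 23."""
    total = sum((end - start for start, end in rows), timedelta())
    seconds = int(total.total_seconds())
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


FMT = "%Y-%m-%d %H:%M:%S"
rows = [
    (datetime.strptime("2013-04-02 02:00:00", FMT), datetime.strptime("2013-04-02 04:00:00", FMT)),
    (datetime.strptime("2013-04-02 04:00:00", FMT), datetime.strptime("2013-04-02 06:00:00", FMT)),
    (datetime.strptime("2013-04-02 06:00:00", FMT), datetime.strptime("2013-04-02 06:30:00", FMT)),
]
total = total_worked(rows)
print(total)  # 04:30:00
```

Summing seconds first and formatting once at the end avoids the per-row truncation that HOUR() introduces.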

Try this query
SELECT
id,
title,
TIME_FORMAT(SEC_TO_TIME(sum(TIME_TO_SEC(TIMEDIFF( end, start)))), "%H:%i") AS diff
FROM
tbl1
GROUP BY
title
According to the data that you have given, the answer should be 4:30. Please cross-check your records.
FIDDLE

Try it like this; it will give you the total number of hours:
For example:
SELECT sum(time_to_sec(timediff(end, start ))/ 3600) AS 'totalHour' from test;
If you run this query against the table you gave above, the output is 4.5 hours.
Hope it helps.

I already answered this in another thread: https://stackoverflow.com/questions/44560345/query-is-not-working/44567322#44567322
I just created a temporary table called dataimport
[Table Format][1]
and wrote a query as,
SELECT `EnNo`, work_dt,
SEC_TO_TIME(sum(TIMESTAMPDIFF(SECOND,login,logout))) as time_worked
from (
SELECT `EnNo`, date(`DateTime`) as work_dt, `DateTime` as login
, coalesce(
(SELECT MIN(`DateTime`)
FROM `dataimport` as b
WHERE a.EnNo = b.EnNo
and date(a.`DateTime`) = date(b.`DateTime`)
and b.`DateTime` >= a.`DateTime`
and b.`INOUT` = 'E'
), now()) AS logout
FROM `dataimport` AS a
WHERE a.`INOUT` = 'S'
) as t
GROUP BY `EnNo`, work_dt
Finally got the output as,
[Output][2]
Hope this is what you are looking for.
[1]: https://i.stack.imgur.com/OEEMe.png
[2]: https://i.stack.imgur.com/g6ivm.png

Change your
SELECT HOUR(TIMEDIFF(end,start)) AS 'totalHour'
to
SELECT sum(TIMESTAMPDIFF(HOUR, start, end)) AS 'totalHour'

SELECT IF(
  DATE(datetime_end) = DATE(datetime_start),
  TIMEDIFF(datetime_end, datetime_start),
  IF(
    DATEDIFF(datetime_end, datetime_start) > 1,
    ADDTIME(
      TIME_FORMAT(CONCAT((DATEDIFF(datetime_end, datetime_start) - 1) * 8, ':00:00'), "%H:%i:%s"),
      ADDTIME(
        TIMEDIFF(datetime_end, CONCAT(DATE(datetime_end), ' 08:00:00')),
        TIMEDIFF(CONCAT(DATE(datetime_start), ' 17:00:00'), datetime_start)
      )
    ),
    ADDTIME(
      TIMEDIFF(datetime_end, CONCAT(DATE(datetime_end), ' 08:00:00')),
      TIMEDIFF(CONCAT(DATE(datetime_start), ' 17:00:00'), datetime_start)
    )
  )
) AS total_working_hrs
FROM table

Working out the amount of free dates in a given time period

I have a fun one for you. I have a database with the date columns free_from and free_until. What I need to find is the number of days between now and one month from today that are free. For example, if the current date was 2013/01/15 and the columns were as follows:
free_from | free_until
2013/01/12| 2013/01/17
2013/01/22| 2013/01/26
2013/01/29| 2013/02/04
2013/02/09| 2013/02/11
2013/02/14| 2013/02/17
2013/02/19| 2013/02/30
The answer would be 16
as 2 + 4 + 6 + 2 + 2 + 0 = 16
The first row only starts counting at the 15th rather than the 12th
since the 15th is the current date.
The last row is discounted because none of the dates are within a
month of the current date.
The dates must be counted as if the free_from date is inclusive and
the free_until date is exclusive.
I'm assuming DATEDIFF() will be used somewhere along the line, but I can't, for the life of me, work this one out.
Thanks for your time!
Edit: This is going into PHP mysql_query so that might restrict you a little concerning what you can do with MYSQL.

SET @today = "2013-01-15";
SET @nextm = DATE_ADD(@today, INTERVAL 1 month);
SET @lastd = DATE_ADD(@nextm, INTERVAL 1 day);
SELECT
DATEDIFF(
IF(@lastd> free_until, free_until, @lastd),
IF(@today > free_from, @today, free_from)
)
FROM `test`
WHERE free_until >= @today AND free_from < @nextm
That should work. At least for your test data. But what day is 2013/02/30? :-)
Don't forget to change it to SET @today = CURDATE();
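The clamping done by the two IF() expressions can be checked against the question's numbers. Below is a minimal sketch in Python rather than MySQL; the function name free_days is hypothetical, the window end of 2013-02-16 mirrors the last-day variable in the query, and the impossible date 2013/02/30 from the question is replaced by 2013-02-28 as an assumption.

```python
from datetime import date


def free_days(intervals, window_start, window_end):
    """Count free days inside [window_start, window_end); free_until is exclusive."""
    total = 0
    for free_from, free_until in intervals:
        # Clamp each interval to the window, like the two IF() calls in the query.
        start = max(free_from, window_start)
        end = min(free_until, window_end)
        if end > start:
            total += (end - start).days
    return total


intervals = [
    (date(2013, 1, 12), date(2013, 1, 17)),
    (date(2013, 1, 22), date(2013, 1, 26)),
    (date(2013, 1, 29), date(2013, 2, 4)),
    (date(2013, 2, 9), date(2013, 2, 11)),
    (date(2013, 2, 14), date(2013, 2, 17)),
    (date(2013, 2, 19), date(2013, 2, 28)),  # assumed; the question's 02/30 is not a real date
]
total = free_days(intervals, date(2013, 1, 15), date(2013, 2, 16))
print(total)  # 16
```

In SQL the same total would come from wrapping the DATEDIFF() expression in SUM().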

The best I can think of is something like:
WHERE free_until > CURDATE()
AND free_from < CURDATE() + INTERVAL '1' MONTH
That will get rid of any unnecessary rows. Then on the first row do in PHP:
date_diff(date(), free_until)
On the last row, do:
date_diff(free_from, strtotime(date("Y-m-d", strtotime($todayDate)) . "+1 month"))
Then on intermediate dates do:
date_diff(free_from, free_until)
Something to that effect, but this seems extremely clunky and convoluted...

Off the top of my head... first do a:
SELECT a.free_from AS a_from, a.free_until AS a_until, b.free_from AS b_from
FROM availability a
INNER JOIN availability b ON b.free_from > a.free_until
ORDER BY a_from, b_from
This probably will return a set of rows where for each row interval you have next i.e. greater intervals. The results are ordered strategically. You can then wrap the results in a partial group by:
SELECT * FROM (
SELECT a.free_from AS a_from, a.free_until AS a_until, b.free_from AS b_from
FROM availability a
INNER JOIN availability b ON b.free_from > a.free_until
ORDER BY a_from, b_from
) AS NextInterval
GROUP BY a_from, b_until
In the above query, add a DATEDIFF expression (MySQL has no DATE_DIFF; wrap it in SUM() if necessary):
DATEDIFF(b_until, a_from)

MySQL Query Problem with INTERVAL, need 0 if no data provided

I have the following statement:
SELECT
count(rs.rsc_id) as counter
FROM shots as rs
where rsc_rs_id = 345354
AND YEAR(rs.timestamp) = YEAR(DATE_SUB(CURDATE(), INTERVAL 6 MONTH))
GROUP BY DATE_FORMAT(rs.timestamp,'%Y%m')
rs.timestamp is a Unix timestamp.
The output would be one row per month with a number like '28'.
It works fine, but if I have incomplete data, e.g. only for the past three months (not for all six months), I get no rows back for the missing months. I would like to have 0 returned for every month that has no data...
Any suggestions?
I thought about some CASE statements, but that does not seem like a good approach...
Thanks!!

For only 6 months, a date table seems unnecessary, although this looks complicated (it really isn't!)
SELECT DATE_FORMAT(N.PivotDate,'%Y%m'), count(rs.rsc_id) as counter
FROM (
select ADDDATE(CURDATE(), INTERVAL -N MONTH) PivotDate
FROM (
select 0 N union all
select 1 union all
select 2 union all
select 3 union all
select 4 union all
select 5 union all
select 6) N) N
LEFT JOIN shots as rs
ON rsc_rs_id = 345354
AND DATE_FORMAT(N.PivotDate,'%Y%m')=DATE_FORMAT(FROM_UNIXTIME(rs.timestamp),'%Y%m')
GROUP BY DATE_FORMAT(N.PivotDate,'%Y%m')

In such cases it's common to use a table of dates with all dates (e.g. from 1/1/1970 to 31/12/2999) and LEFT JOIN your data to that table.
See an example in the answer here: mysql joins tables creating missing dates
If you create a dates table you can use:
SELECT
DATE_FORMAT(d.date,'%Y%m') AS `month`, count(rs.rsc_id) AS `counter`
FROM dates d
LEFT JOIN shots as rs
ON d.date = DATE(FROM_UNIXTIME(rs.timestamp))
AND rs.rsc_rs_id = 345354
WHERE d.date > DATE_SUB(CURDATE(), INTERVAL 5 MONTH)
AND d.date < CURDATE()
GROUP BY DATE_FORMAT(d.date,'%Y%m');

How to minimize the load in queries that need grouping with different intervals?

I'm looking for best-practice advice on how to speed up queries while minimizing the overhead of calling date/mktime functions. To simplify the problem, assume the following table layout:
CREATE TABLE my_table(
id INTEGER PRIMARY KEY NOT NULL AUTO_INCREMENT,
important_data INTEGER,
date INTEGER);
The user can choose to show 1) all entries between two dates:
SELECT * FROM my_table
WHERE date >= ? AND date <= ?
ORDER BY date DESC;
Output:
10-21-2009 12:12:12, 10002
10-21-2009 14:12:12, 15002
10-22-2009 14:05:01, 20030
10-23-2009 15:23:35, 300
....
I don't think there is much to improve in this case.
2) Summarize/group the output by day, week, month, year:
SELECT COUNT(*) AS count, SUM(important_data) AS important_data
FROM my_table
WHERE date >= ? AND date <= ?
ORDER BY date DESC;
Example output by month:
10-2009, 100002
11-2009, 200030
12-2009, 3000
01-2010, 0 /* <- very important to show empty dates, with no entries in the table! */
....
To accomplish option 2) I'm currently running a very costly for-loop with mktime/date like the following:
for(...){ /* example for group by day */
$span_from = (int)mktime(0, 0, 0, date("m", $time_min), date("d", $time_min)+$i, date("Y", $time_min));
$span_to = (int)mktime(0, 0, 0, date("m", $time_min), date("d", $time_min)+$i+1, date("Y", $time_min));
$query = "..";
$output = date("m-d-y", ..);
}
What are my ideas so far? Add additional, redundant INTEGER columns for day (20091212), month (200912), week (200942) and year (2009). This way I can get rid of all the unnecessary queries in the for loop. However, I still face the problem of quickly calculating all the dates that have no counterpart in the database. One way to simply move the problem would be to let MySQL do the job with one big query (calculate all the dates using MySQL date functions) and a left join against the data. Would it be wise to let MySQL take the extra load? Either way, I'm reluctant to keep all these mktime/date calls in the for loop. Since I have complete control over the table layout and the code, even suggestions requiring major changes are welcome!
Update
Thanks to Greg I came up with the following SQL query. However, it still bugs me to use 50 lines of SQL, built up with PHP, for something that could perhaps be done faster and more elegantly another way:
SELECT * FROM (
SELECT DATE_ADD('2009-01-30', INTERVAL 0 DAY) AS day UNION ALL
SELECT DATE_ADD('2009-01-30', INTERVAL 1 DAY) AS day UNION ALL
SELECT DATE_ADD('2009-01-30', INTERVAL 2 DAY) AS day UNION ALL
SELECT DATE_ADD('2009-01-30', INTERVAL 3 DAY) AS day UNION ALL
......
SELECT DATE_ADD('2009-01-30', INTERVAL 50 DAY) AS day ) AS dates
LEFT JOIN (
SELECT DATE_FORMAT(date, '%Y-%m-%d') AS date, SUM(data) AS data
FROM test
GROUP BY date
) AS results
ON DATE_FORMAT(dates.day, '%Y-%m-%d') = results.date;

You definitely shouldn't be doing a query inside a loop.
You can group like this:
SELECT COUNT(*) AS count, SUM(important_data) AS important_data, DATE_FORMAT(FROM_UNIXTIME(date), '%Y-%m') AS month
FROM my_table
WHERE date BETWEEN ? AND ? -- This should be the min and max of the whole range
GROUP BY DATE_FORMAT(FROM_UNIXTIME(date), '%Y-%m')
ORDER BY month DESC;
Then pull these into an array keyed by date and loop over your data range as you are doing (that loop should be pretty light on CPU).
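The "array keyed by date, then loop over the range" step might look roughly like this. The thread uses PHP; this is only a sketch in Python of the same idea, with a hypothetical function name fill_months and the sums taken from the question's example output.

```python
def fill_months(counts, first, last):
    """Return ("YYYY-MM", value) for every month from first to last, 0 when absent.

    first and last are (year, month) tuples; counts is keyed by "YYYY-MM".
    """
    rows = []
    year, month = first
    while (year, month) <= last:
        key = f"{year:04d}-{month:02d}"
        rows.append((key, counts.get(key, 0)))
        # Advance one month, rolling over into the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return rows


# Result of the single grouped query, keyed by month (values from the question).
sums = {"2009-10": 100002, "2009-11": 200030, "2009-12": 3000}
rows = fill_months(sums, (2009, 10), (2010, 1))
for month_key, important_data in rows:
    print(month_key, important_data)
```

The loop only does integer arithmetic and one dictionary lookup per month, so no mktime/date calls or extra queries are needed per interval.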

Another idea is to avoid strings inside the query: convert the string parameter to a datetime on the MySQL side.
STR_TO_DATE(str,format)
http://dev.mysql.com/doc/refman/5.0/en/date-and-time-functions.html
