MYSQL & PHP calculate total hours in specific date and exclude overlaps hours - php

I have a MySQL table for tasks where each task has a date, start time, end time and user_id. I want to calculate the total number of hours on a specific date.
Table Structure
CREATE TABLE tasks
(`id` int,`user_id` int, `title` varchar(30), `task_date` datetime, `start` time, `end` time)
;
INSERT INTO tasks
(`id`,`user_id`, `title`,`task_date`, `start`, `end`)
VALUES
(1,10, 'Task one','2013-04-02', '02:00:00', '04:00:00'),
(2,10, 'Task two','2013-04-02', '03:00:00', '06:00:00'),
(3,10, 'Task three.','2013-04-02','06:00:00', '07:00:00');
MYSQL Query
select TIME_FORMAT(SEC_TO_TIME(sum(TIME_TO_SEC(TIMEDIFF( end, start)))), "%h:%i") AS diff
FROM tasks
where task_date="2013-04-02"
The result I am getting is "06:00" hours, which is fine, but I want to exclude the overlapping hours. In the example I gave, the result should be "05:00" hours: the hour between 3 and 4 in the 2nd record is excluded because it is already covered by the 1st record (2 to 4).
1st record 2=>4 = 2 Hours
2nd record 3=>6 = 3 hours 3-1 hour=2 (The 1 hour is the overlap hour between 1st and 2nd record )
3rd 6=>7=1
Total is 5 Hours
I hope I made my question clear. Example http://sqlfiddle.com/#!2/05dd8/2

Here is an idea that uses variables (in other databases, CTEs and window functions would make this much easier). The idea is to first list all the times -- starts and ends. Then, keep track of the cumulative number of starts and stops.
When the cumulative number is greater than 0, then include the difference from the previous time. If equal to 0, then it is the beginning of a new time period, so nothing is added.
Here is an example of the query, which is simplified a bit for your data by not keeping track of changes in user_id:
select user_id, TIME_FORMAT(SEC_TO_TIME(sum(secs)), '%H:%i')
from (select t.*,
@time := if(@sum = 0, 0, TIME_TO_SEC(TIMEDIFF(start, @prevtime))) as secs,
@prevtime := start,
@sum := @sum + isstart
from ((select user_id, start, 1 as isstart
from tasks t
) union all
(select user_id, end, -1
from tasks t
)
) t cross join
(select @sum := 0, @time := 0, @prevtime := 0) vars
order by 1, 2
) t
group by user_id;
Here is a SQL Fiddle showing it working.
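If the rows are post-processed in application code instead, the same start/stop counting idea can be sketched as follows. This is only an illustration in Python (not PHP), and the names covered_seconds and at are made up for the example; the three intervals are the sample tasks from the question.

```python
from datetime import datetime

def covered_seconds(intervals):
    """Return the seconds covered by at least one interval, counting overlaps once."""
    # One +1 event per start and one -1 event per end, sorted by time.
    events = sorted(
        [(start, 1) for start, _ in intervals] + [(end, -1) for _, end in intervals]
    )
    total = 0.0
    open_count = 0  # number of intervals open just before the current event
    prev = None
    for moment, delta in events:
        # Time since the previous event only counts while something was open.
        if open_count > 0:
            total += (moment - prev).total_seconds()
        prev = moment
        open_count += delta
    return total

def at(hhmm):
    # Hypothetical helper: build a datetime on the sample date.
    return datetime.strptime("2013-04-02 " + hhmm, "%Y-%m-%d %H:%M")

tasks = [
    (at("02:00"), at("04:00")),
    (at("03:00"), at("06:00")),
    (at("06:00"), at("07:00")),
]
print(covered_seconds(tasks) / 3600)  # 5.0
```

As in the query, nothing is added when the running count is 0, because that event opens a new period.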

Related

Time Record Calculations in PHP

I have a database table full of time records and I need to calculate the quantity of hours that exist between them...
A time record has the following fields: 'created' (e.g. 2017-08-30 11:15:00) and 'direction' (1 represents "clock in" and 0 represents "clock out"). So I need to set a start and end date, then select all time records within that time frame and calculate the number of hours worked (the hours between each record where direction=1 and the following record where direction=0).
Any idea how to create the logic for this? The result must be a measurement of "hours" in decimal format (1 decimal place, i.e. '26.7' hours).
I started by establishing variables:
$query_start_date = "2017-08-29 00:00:00";
$query_end_date = "2017-08-30 23:59:59";
Let's assume these are the time records that exist in that time frame:
time record 1: 'created'="2017-08-29 08:00:00", 'direction'=1;
time record 2: 'created'="2017-08-29 16:30:00", 'direction'=0;
time record 3: 'created'="2017-08-30 08:00:00", 'direction'=1;
time record 4: 'created'="2017-08-30 16:00:00", 'direction'=0;
But I don't know how to begin the calculation. Do I select the records first and assign each record to a variable as an array...? Any help is appreciated!
Put the start and end times in two columns (a clock-in row, direction = 1, paired with the next clock-out row, direction = 0):
SELECT
created as start,
(select created from test t2 where t2.created > t.created
and direction = 0 order by created limit 1) as end
FROM `test` t where direction = 1
Compute the time difference in hours:
select TIME_TO_SEC(TIMEDIFF(end, start))/60/60 as diff, start, end from
(SELECT
created as start,
(select created from test t2 where
t2.created > t.created
and direction = 0 order by created limit 1) as end
FROM `test` t where direction = 1) intervals
And finally, sum the differences:
select sum(TIME_TO_SEC(TIMEDIFF(end, start))/60/60) as total from
(SELECT
created as start,
(select created from test t2 where
t2.created > t.created
and direction = 0 order by created limit 1) as end
FROM `test` t where direction = 1) intervals
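For comparison, the pairing can also be done after fetching the rows. The sketch below is Python rather than PHP and assumes the records arrive as (created, direction) tuples; hours_worked and ts are hypothetical names, and the data are the four sample records from the question.

```python
from datetime import datetime

def hours_worked(records):
    """Pair each clock-in (1) with the next clock-out (0); return hours to 1 decimal."""
    total_seconds = 0.0
    clock_in = None
    for created, direction in sorted(records):
        if direction == 1:
            clock_in = created
        elif clock_in is not None:
            total_seconds += (created - clock_in).total_seconds()
            clock_in = None  # a clock-out without a new clock-in is ignored
    return round(total_seconds / 3600, 1)

def ts(text):
    # Hypothetical helper: parse the 'created' format used in the question.
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")

records = [
    (ts("2017-08-29 08:00:00"), 1),
    (ts("2017-08-29 16:30:00"), 0),
    (ts("2017-08-30 08:00:00"), 1),
    (ts("2017-08-30 16:00:00"), 0),
]
print(hours_worked(records))  # 16.5
```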

Get sql by consecutive hours

I have a question. Suppose I have this table in SQL:
date user_id
2015-03-17 00:06:12 143
2015-03-17 01:06:12 143
2015-03-17 02:06:12 143
2015-03-17 09:06:12 143
2015-03-17 10:10:10 200
I want to get the number of consecutive hours. For example, for user 143, I want to get 2 hours, for user 200 0 hours. I tried like this :
select user_id, TIMESTAMPDIFF(HOUR,min(date), max(date)) as hours
from myTable
group by user_id
But this query fetches all non-consecutive hours. Is it possible to solve the problem with a query, or do I need to post-process the results in PHP?
Use a variable to compare with the previous row.
SELECT user_id, SUM(cont_hour) FROM (
SELECT
user_id,
IF(DATE_FORMAT(@prev_date, '%Y-%m-%d %H:00:00') + INTERVAL 1 HOUR = DATE_FORMAT(t.date, '%Y-%m-%d %H:00:00')
AND @prev_user = t.user_id, 1, 0) AS cont_hour
, @prev_date := t.date
, @prev_user := t.user_id
FROM
myTable t
, (SELECT @prev_date := NULL, @prev_user := NULL) var_init_subquery
WHERE t.date BETWEEN <this> AND <that>
ORDER BY t.user_id, t.date
) sq
GROUP BY user_id;
I made the comparison a bit more complicated than you expected, but I thought it's necessary, that you don't just compare the hour, but also, that it's the same date (or the previous day, when it's around midnight).
You can read more about user variables here
As a short explanation: The ORDER BY is very important, as well as the order in the SELECT clause. The @prev_date holds the "previous row", because we assign the value of the current row only after we have made our comparison.
Another version using temporary variables:
SET @u := 0;
SET @pt := 0;
SET @s := 0;
SELECT `user_id`, MAX(`s`) `conseq` FROM
(
SELECT
@s := IF(@u = `user_id`,
IF(UNIX_TIMESTAMP(`date`) - @pt = 3600, @s + 1, @s),
0) s,
@u := `user_id` `user_id`,
@pt := UNIX_TIMESTAMP(`date`) pt
FROM `users`
ORDER BY `date`
) AS t
GROUP BY `user_id`
The subquery sorts the rows by date, then compares user_id with the previous value. If the user IDs are equal, it calculates the difference between date and the previous timestamp @pt. If the difference is an hour (3600 seconds), the @s counter is incremented by one; otherwise it keeps its value. When the user ID changes, the counter is reset to 0:
s user_id pt
0 143 1426529172
1 143 1426532772
2 143 1426536372
2 143 1426561572
0 200 1426565410
The outer query collects the maximum counter values per user_id, since the maximum counter value corresponds to the last counter value per user_id.
Output
user_id conseq
143 2
200 0
Note, the query accepts the difference of exactly 1 hour. If you want a more flexible condition, simply adjust the comparison. For example, you can accept a difference in interval between 3000 and 4000 seconds as follows:
@s := IF(@u = `user_id`,
IF( (UNIX_TIMESTAMP(`date`) - @pt) BETWEEN 3000 AND 4000, @s + 1, @s),
0) s
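Since the question also asks about post-processing the results, here is a rough sketch of the same counting rule outside SQL. It is written in Python rather than PHP, the function name consecutive_hours is invented for the example, and it assumes rows are (date, user_id) pairs; like the query above, it counts steps of exactly one hour.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def consecutive_hours(rows, step=timedelta(hours=1)):
    """Count, per user, how many rows follow the previous row by exactly `step`."""
    by_user = defaultdict(list)
    for stamp, user_id in rows:
        by_user[user_id].append(stamp)
    result = {}
    for user_id, stamps in by_user.items():
        stamps.sort()
        # Compare each timestamp with the one right before it.
        result[user_id] = sum(1 for a, b in zip(stamps, stamps[1:]) if b - a == step)
    return result

def ts(text):
    # Hypothetical helper: parse the timestamp format from the question.
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")

rows = [
    (ts("2015-03-17 00:06:12"), 143),
    (ts("2015-03-17 01:06:12"), 143),
    (ts("2015-03-17 02:06:12"), 143),
    (ts("2015-03-17 09:06:12"), 143),
    (ts("2015-03-17 10:10:10"), 200),
]
print(consecutive_hours(rows))  # {143: 2, 200: 0}
```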

How to skip other OR condition if first is matched in SELECT Query?

I am having trouble with the OR condition inside a SELECT.
I want only the rows of the first condition that matches; the remaining OR conditions should then not be used.
What I want is:
Users share records on my website, and I would like to email them the newest shared items.
For me, the newest items are at least two days old.
For example, if today is the 9th, I would like to pull all records of the 7th. If there are no records for the 7th, I would like to pull all records of the 6th (3 days before today). If there are no records for the 6th either, I would like to pull the records from 1 day before today.
For all this I have used OR in my SELECT query like this:
SELECT `tg`.* FROM `tblgallery` AS `tg` WHERE (
(tg.added_date BETWEEN '2014-07-07 00:00:00' AND '2014-07-08 00:00:00') OR
(tg.added_date BETWEEN '2014-07-06 00:00:00' AND '2014-07-07 00:00:00') OR
(tg.added_date BETWEEN '2014-07-08 00:00:00' AND '2014-07-09 00:00:00') )
And i have records in my database for dates:
2014-07-06
2014-07-07
When I run this query, it gives me all records of both dates.
But I need to pull only the records of 2014-07-07, not of both, as described above.
I know I can do this with multiple SELECTs, but I don't think querying the database again and again is a good idea.
My question is: how do I pull the data for the first date that matches and skip the data of the remaining dates?
OR
Is there any other way to do this?
Please Help
Usually one would just work with LIMIT, which is not applicable here, since there might be many rows per day. What I do is quite similar to LIMIT.
SELECT * FROM (
SELECT
tg.*,
@gn := IF(DATE(tg.added_date) != @prev_date, @gn + 1, @gn) AS my_group_number,
@prev_date := DATE(tg.added_date)
FROM tblgallery tg
, (SELECT @gn := 0, @prev_date := CURDATE()) var_init
ORDER BY FIELD(DATE(tg.added_date), CURDATE() - INTERVAL 1 DAY, CURDATE() - INTERVAL 3 DAY, CURDATE() - INTERVAL 2 DAY) DESC
) sq
WHERE my_group_number = 1;
Here's how it works.
With this line
, (SELECT @gn := 0, @prev_date := CURDATE()) var_init
the variables are initialized.
Then the ORDER BY is important! The FIELD() function sorts the rows from 2 days ago (gets value 3), to 3 days ago (gets value 2), to 1 day ago (gets value 1). Everything else gets value 0.
Then in the SELECT clause the order is also important.
With this line
@gn := IF(DATE(tg.added_date) != @prev_date, @gn + 1, @gn) AS my_group_number,
the variable @gn is incremented when the date of the current row is different from the date of the previous row.
With this line
@prev_date := DATE(tg.added_date)
the date of the current row is assigned to the variable @prev_date. In the line above it still has the value of the previous row.
Now those entries have a 1 in column my_group_number that have the most recent date in the order
2 days ago
3 days ago
yesterday
4 days ago
5 days ago
...
Try this Query:
SELECT GalleryID, PixName, A.added_date
FROM tblGallery A
INNER JOIN (
SELECT added_date FROM tblGallery
WHERE added_date <= DATE_SUB('2014-07-09 00:00:00', interval 2 day)
GROUP BY added_date
ORDER BY added_date DESC
LIMIT 1 ) B
ON A.added_date = B.added_date
See my SQL Fiddle Demo
It still works even if the latest date is more than 2 days old.
See the demo below, in which the latest date is 4 days before July 9, 2014:
See the 2nd Demo
And if you want the current date instead of a literal date, you can use the CURDATE() function, like this:
SELECT GalleryID, PixName, A.added_date
FROM tblGallery A
INNER JOIN (
SELECT added_date FROM tblGallery
WHERE added_date <= DATE_SUB(CURDATE(), interval 2 day)
GROUP BY added_date
ORDER BY added_date DESC
LIMIT 1 ) B
ON A.added_date = B.added_date
See 3rd Demo
Well, I wasn't able to solve the multiple-OR issue, but this is how you could get the records added in the last two days. Change the interval or the CURDATE() to fit your needs.
SELECT id, date_added
FROM gallery
WHERE date_added BETWEEN CURDATE() - INTERVAL 2 DAY AND CURDATE()
ORDER BY date_added
Check the SQL Fiddle
It is not about how OR works in MySQL.
Judging by your discussion with @B.T., I think you are misunderstanding the WHERE clause.
It is evaluated for each record.
So if a record evaluates to false for the first condition, the second condition is evaluated for that record, and so on. If any of the conditions evaluates to true, the record becomes part of your result set.
Try this query.
SELECT `tg`.* FROM `tblgallery` AS `tg` WHERE tg.added_date = (
select date from (
select distinct(tg.added_date) date from tblgallery as tg
) as t1 order by case
when date between '2014-07-07 00:00:00' AND '2014-07-08 00:00:00'
then 1
when date between '2014-07-06 00:00:00' AND '2014-07-07 00:00:00'
then 2
when date between '2014-07-08 00:00:00' AND '2014-07-09 00:00:00'
then 3
else 4
end limit 1);
Here's what I am doing in this query.
I get all the distinct dates.
Then I order them by condition: 1 if the first condition is true, 2 if the second is true, and so on.
I limit the result to 1, so after ordering, the first row is selected; it is a date, and it is used in the outer condition.
Note: I have not tested it yet, so you may need to make some changes to the query.
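The "first date that has rows wins" rule shared by these answers can also be written as a small loop over the preferred days. The sketch below is Python rather than PHP and assumes the candidate rows were already fetched in one query as (added_date, item) pairs; pick_newest_day and the item names are made up for the example.

```python
from datetime import date, timedelta

def pick_newest_day(rows, today, preference=(2, 3, 1)):
    """Return the rows of the first preferred day (in days before today) that has any."""
    for days_back in preference:
        wanted = today - timedelta(days=days_back)
        matches = [row for row in rows if row[0] == wanted]
        if matches:
            return matches  # stop at the first day with data
    return []

rows = [
    (date(2014, 7, 6), "item a"),
    (date(2014, 7, 7), "item b"),
    (date(2014, 7, 7), "item c"),
]
print(pick_newest_day(rows, date(2014, 7, 9)))
```

With records on the 6th and the 7th and today being the 9th, only the two rows from the 7th are returned.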

postgresql max(count(*)) - php

I have a problem in PostgreSQL.
I have one cohorte (a gathering of people) and I would like to count the persons in this cohorte.
Begin date: "2014-09-01", end date: "2014-11-30".
I have 5 persons between 09/01 and 09/22
I have 5 persons between 09/20 and 09/25
I have 5 persons between 09/26 and 10/05
I have 5 persons between 10/01 and 11/30
I want the maximum number of persons accommodated in each month between the begin date and the end date, in SQL (or PHP). Expected max person count:
September(09) => 10
October(10) => 10
November(11) => 5
Find the maximum of simultaneously present persons on a single day for every month in a given period.
I suggest generate_series() to produce the series of days in your period. Then aggregate twice:
First to get a count for each day. A single day can be handled with a plain BETWEEN. Your ranges are obviously meant to include both bounds.
Second to get the maximum per month.
SELECT date_trunc('month', day)::date AS month, max(ct) AS max_ct
FROM (
SELECT g.day, count(*) AS ct
FROM cohorte
,generate_series('2014-09-01'::date -- first of Sept.
,'2014-11-30'::date -- last of Nov.
,'1 day'::interval) g(day)
WHERE g.day BETWEEN t_begin AND t_end
GROUP BY 1
) sub
GROUP BY 1
ORDER BY 1;
Returns:
month | max_ct
-----------+--------
2014-09-01 | 10
2014-10-01 | 10
2014-11-01 | 5
Use to_char() to prettify the month output.
SQL Fiddle .. is down ATM. Here is my test case (that you should have provided):
CREATE TEMP TABLE cohorte (
cohorte_id serial PRIMARY KEY
,person_id int NOT NULL
,t_begin date NOT NULL -- inclusive
,t_end date NOT NULL -- inclusive
);
INSERT INTO cohorte(person_id, t_begin, t_end)
SELECT g, '2014-09-01'::date, '2014-09-22'::date
FROM generate_series (1,5) g
UNION ALL
SELECT g+5, '2014-09-20', '2014-09-25'
FROM generate_series (1,5) g
UNION ALL
SELECT g+10, '2014-09-26', '2014-10-05'
FROM generate_series (1,5) g
UNION ALL
SELECT g+15, '2014-10-01', '2014-11-30'
FROM generate_series (1,5) g;
For more complex checks I'd suggest the OVERLAPS operator:
Find overlapping date ranges in PostgreSQL
For more complex scenarios I'd also consider range types:
Preventing adjacent/overlapping entries with EXCLUDE in PostgreSQL
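Since the question allows PHP as well, the two aggregation steps (count per day, then maximum per month) can be sketched outside SQL too. This is Python rather than PHP, purely as an illustration; max_present_per_month is a hypothetical name, and the stays list mirrors the test data above with one inclusive (t_begin, t_end) pair per person.

```python
from datetime import date, timedelta

def max_present_per_month(stays, period_start, period_end):
    """Map (year, month) to the highest number of persons present on any single day."""
    result = {}
    day = period_start
    while day <= period_end:
        # First aggregation: how many stays cover this day (bounds inclusive).
        present = sum(1 for begin, end in stays if begin <= day <= end)
        # Second aggregation: keep the maximum per month.
        key = (day.year, day.month)
        result[key] = max(result.get(key, 0), present)
        day += timedelta(days=1)
    return result

stays = (
    [(date(2014, 9, 1), date(2014, 9, 22))] * 5
    + [(date(2014, 9, 20), date(2014, 9, 25))] * 5
    + [(date(2014, 9, 26), date(2014, 10, 5))] * 5
    + [(date(2014, 10, 1), date(2014, 11, 30))] * 5
)
print(max_present_per_month(stays, date(2014, 9, 1), date(2014, 11, 30)))
# {(2014, 9): 10, (2014, 10): 10, (2014, 11): 5}
```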
Can't you use a window function?
I'd try something like this (I have not tested this code, it just shows my line of thought):
SELECT max(count) FROM (
SELECT count(*) OVER (PARTITION BY ???) as count
FROM contract
WHERE daterange(dateStart, dateEnd, '[]') && daterange('2014-09-01', '2014-10-01', '[)')
) as max
Here, my problem remains that I can't find a way to partition by each day of the interval. Maybe this is the wrong approach, but I would be interested in a solution based on window functions.
Edit: this query gives the maximum number of people present simultaneously, but over the whole period, not per month:
with presence as (
SELECT id, generate_series(begin_date, end_date, '1 day'::interval) AS date
FROM test
),
presents as (
SELECT count(*) OVER (PARTITION BY date) AS count
FROM presence
)
SELECT max(count) from presents;
Here we go, I think.
Imagine your person table has 3 columns:
id
entrance_date
leaving_date
The query would look like this:
WITH presents as (
SELECT id,
daterange(entrance_date, leaving_date, '[]') * daterange('2014-09-01', '2014-11-30', '[]') as range
FROM person
WHERE daterange(entrance_date, leaving_date, '[]') && daterange('2014-09-01', '2014-11-30', '[]')
),
present_per_day as (
SELECT id,
generate_series(lower(range), upper(range) - 1, '1 day'::interval) AS date
FROM presents
),
count_per_day as (
SELECT count(*) OVER (PARTITION BY date) AS count,
date
FROM present_per_day
)
SELECT max(count) OVER (PARTITION BY date_part('year', date), date_part('month', date)) as max,
date_part('year', date),
date_part('month', date)
FROM count_per_day;
(I have to leave, I hope I'll have time to test it later)
In fact, @erwin's solution is much easier and more efficient than this one.

How to minimize the load in queries that need grouping with different intervals?

I'm looking for best-practice advice on how to speed up queries while minimizing the overhead of calling date/mktime functions. To simplify the problem, I'm dealing with the following table layout:
CREATE TABLE my_table(
id INTEGER PRIMARY KEY NOT NULL AUTO_INCREMENT,
important_data INTEGER,
date INTEGER);
The user can choose to show 1) all entries between two dates:
SELECT * FROM my_table
WHERE date >= ? AND date <= ?
ORDER BY date DESC;
Output:
10-21-2009 12:12:12, 10002
10-21-2009 14:12:12, 15002
10-22-2009 14:05:01, 20030
10-23-2009 15:23:35, 300
....
I don't think there is much to improve in this case.
2) Summarize/group the output by day, week, month, year:
SELECT COUNT(*) AS count, SUM(important_data) AS important_data
FROM my_table
WHERE date >= ? AND date <= ?
ORDER BY date DESC;
Example output by month:
10-2009, 100002
11-2009, 200030
12-2009, 3000
01-2010, 0 /* <- very important to show empty dates, with no entries in the table! */
....
To accomplish option 2) I'm currently running a very costly for-loop with mktime/date like the following:
for(...){ /* example for group by day */
$span_from = (int)mktime(0, 0, 0, date("m", $time_min), date("d", $time_min)+$i, date("Y", $time_min));
$span_to = (int)mktime(0, 0, 0, date("m", $time_min), date("d", $time_min)+$i+1, date("Y", $time_min));
$query = "..";
$output = date("m-d-y", ..);
}
What are my ideas so far? Add additional, redundant columns (INTEGER) for day (20091212), month (200912), week (200942) and year (2009). This way I can get rid of all the unnecessary queries in the for loop. However, I still face the problem of quickly calculating all the dates that have no matching rows in the database. One way to simply move the problem would be to let MySQL do the job and use one big query (calculating all the dates with MySQL date functions) with a left join to the data. Would it be wise to let MySQL take the extra load? Either way, I'm reluctant to keep all these mktime/date calls in the for loop. Since I have complete control over the table layout and code, even suggestions involving major changes are welcome!
Update
Thanks to Greg I came up with the following SQL query. However, it still bugs me to use 50 lines of SQL statements, built up with PHP, for something that could perhaps be done faster and more elegantly:
SELECT * FROM (
SELECT DATE_ADD('2009-01-30', INTERVAL 0 DAY) AS day UNION ALL
SELECT DATE_ADD('2009-01-30', INTERVAL 1 DAY) AS day UNION ALL
SELECT DATE_ADD('2009-01-30', INTERVAL 2 DAY) AS day UNION ALL
SELECT DATE_ADD('2009-01-30', INTERVAL 3 DAY) AS day UNION ALL
......
SELECT DATE_ADD('2009-01-30', INTERVAL 50 DAY) AS day ) AS dates
LEFT JOIN (
SELECT DATE_FORMAT(date, '%Y-%m-%d') AS date, SUM(data) AS data
FROM test
GROUP BY date
) AS results
ON DATE_FORMAT(dates.day, '%Y-%m-%d') = results.date;
You definitely shouldn't be doing a query inside a loop.
You can group like this:
SELECT COUNT(*) AS count, SUM(important_data) AS important_data, DATE_FORMAT(FROM_UNIXTIME(date), '%Y-%m') AS month
FROM my_table
WHERE date BETWEEN ? AND ? -- This should be the min and max of the whole range
GROUP BY month
ORDER BY month DESC;
Then pull these into an array keyed by date and loop over your data range as you are doing (that loop should be pretty light on CPU).
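A minimal sketch of that last step, in Python rather than PHP for illustration: it assumes the grouped query result has already been loaded into a dictionary keyed by 'YYYY-MM' (the name fill_months and the sample totals, taken from the question's example output, are only illustrative) and fills months without rows with 0.

```python
def fill_months(totals, first, last):
    """Return (month, total) pairs for every month from first to last, inclusive.

    totals maps 'YYYY-MM' to a sum; first and last are (year, month) tuples.
    """
    filled = []
    year, month = first
    while (year, month) <= last:
        key = f"{year:04d}-{month:02d}"
        filled.append((key, totals.get(key, 0)))  # months without rows become 0
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return filled

totals = {"2009-10": 100002, "2009-11": 200030, "2009-12": 3000}
for month, total in fill_months(totals, (2009, 10), (2010, 1)):
    print(month, total)
```

The loop only does dictionary lookups, so no query and no mktime/date call is needed per interval.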
Another idea is not to use strings inside the query. Convert the string parameter to a datetime in MySQL:
STR_TO_DATE(str,format)
http://dev.mysql.com/doc/refman/5.0/en/date-and-time-functions.html
