MySQL: Count no of consecutive week days from datetime column - php

I have the below table:
studentid VARCHAR(12)
latetime DATETIME
attendance CHAR(1)
latetime only contains weekdays.
On some days a student has a "Parents letter", indicated by V in the attendance column.
I need to group these V rows by consecutive weekdays and then count the groups.
Each group of consecutive days counts as one letter.
My SQLFIDDLE: http://sqlfiddle.com/#!2/55d5b/1
This SQLFIDDLE sample data should return
STUDENTID LETTERCOUNT
a1111 3
b2222 2
a1111 - 3 counts
-----
1. 2014-01-02
2. 2014-01-27
3. 2014-01-29 and 2014-01-30
b2222 - 2 counts
-----
1. 2014-01-02 and 2014-01-03
2. 2014-01-24, 2014-01-27 and 2014-01-28
I tried various methods from the SO questions below, without a proper result so far:
How to GROUP BY consecutive data (date in this case)
MySQL: group by consecutive days and count groups
I can do this programmatically in PHP by looping through the results and manually checking each record against its next date, but I was trying to achieve the same with SQL.
Any help / direction towards finding a solution will be much appreciated.

This is derived from one of the answers in MySQL: group by consecutive days and count groups. I added the WITH ROLLUP option to get the letter count into the same query, and used GROUP_CONCAT to show all the dates. I made the INTERVAL conditional on the weekday, to skip over weekends; holidays aren't taken into account, though.
In my version of the fiddle I changed the latetime column to date, so I could remove all the DATE() functions from the SQL.
SELECT studentid, IFNULL(dates, '') dates, IF(dates IS NULL, lettercount, '') lettercount
FROM (
SELECT studentid, dates, COUNT(*) lettercount
FROM (
SELECT v.studentid,
GROUP_CONCAT(latetime ORDER BY latetime SEPARATOR ', ') dates
FROM
(SELECT studentid, latetime,
@start_date := IF(@last_student IS NULL OR @last_student <> studentid,
1,
IF(@last_latetime IS NULL
OR (latetime - INTERVAL IF(WEEKDAY(latetime) = 0, 3, 1) DAY) > @last_latetime, latetime, @start_date)) AS start_date,
@last_latetime := latetime,
@last_student := studentid
FROM
studentattendance, (SELECT @start_date := NULL, @last_latetime := NULL, @last_student := NULL) vars
WHERE attendance = 'V'
ORDER BY
studentid, latetime
) v
GROUP BY
v.studentid, start_date) x
GROUP BY studentid, dates WITH ROLLUP) y
WHERE studentid IS NOT NULL
ORDER BY studentid, dates
http://sqlfiddle.com/#!2/6c944/12
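For comparison, the programmatic route the question mentions (looping over the rows and checking each date against the previous one) could look roughly like this. It is a minimal sketch in Python rather than PHP; the function names `previous_weekday` and `count_letters` are made up for illustration, the rows are the 'V' dates listed in the question, and like the query above it does not take holidays into account.

```python
from datetime import date, timedelta


def previous_weekday(day):
    """Return the weekday before `day`: Friday for a Monday, else the day before."""
    return day - timedelta(days=3 if day.weekday() == 0 else 1)


def count_letters(rows):
    """Count runs of consecutive weekdays per student.

    `rows` is an iterable of (studentid, date) pairs with attendance = 'V'.
    """
    counts = {}
    last_seen = {}
    for student, day in sorted(rows):
        last = last_seen.get(student)
        # A new letter starts when the student has no earlier date, or when
        # the weekday before `day` is later than the last date seen.
        if last is None or previous_weekday(day) > last:
            counts[student] = counts.get(student, 0) + 1
        last_seen[student] = day
    return counts


rows = [
    ("a1111", date(2014, 1, 2)),
    ("a1111", date(2014, 1, 27)),
    ("a1111", date(2014, 1, 29)),
    ("a1111", date(2014, 1, 30)),
    ("b2222", date(2014, 1, 2)),
    ("b2222", date(2014, 1, 3)),
    ("b2222", date(2014, 1, 24)),
    ("b2222", date(2014, 1, 27)),
    ("b2222", date(2014, 1, 28)),
]
print(count_letters(rows))  # {'a1111': 3, 'b2222': 2}
```

The Monday check plays the same role as the conditional INTERVAL in the query: a Monday is treated as following the previous Friday.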

Related

MySQL query rows with defined date interval

I have no idea how to solve the following problem: I have several rows in my database with one timestamp per row. Now I would like to filter all rows for entries until the date interval for any two dates is bigger than 30 days. I have no defined date interval for specific dates, like between 12/01/2017 and 11/01/2017, that would be easy, even for me. All I know is that the timestamp interval from one row to the next row (query must be sorted by timestamp desc) must not be bigger than 30 days.
Please see my db at http://sqlfiddle.com/#!9/55a521/2
In this case the last entry shown should be the one with id 65404844. I would appreciate if you might give me a small hint for this.
Thank you very much!
You can use this query to build a filter.
SELECT
t.id,
from_unixtime(timestamp)
, IF(@pt < timestamp - 30*24*60*60, 1, 0) AS filter
, @pt := timestamp
FROM
t
, (SELECT @pt := MIN(timestamp) FROM t) v
ORDER BY timestamp
see it working live in an sqlfiddle
Important here is to order by timestamp. Then you initialize the @pt variable with the lowest value. Another important thing is to have the select clause in the right order.
First you compare the current record with the variable in the IF() function. Then you assign the current record to the variable. This way when the next row is evaluated, the variable still holds the value of the previous row in the IF() function.
To get the rows you want, use above query in a subquery to filter.
SELECT id, ts FROM (
SELECT
t.id,
from_unixtime(timestamp) as ts
, IF(@pt < timestamp - 30*24*60*60, 1, 0) AS filter
, @pt := timestamp
FROM
t
, (SELECT @pt := MIN(timestamp) FROM t) v
ORDER BY timestamp
) sq
WHERE sq.filter = 1
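The compare-then-assign order described above can be illustrated outside SQL as well. The following is only a sketch in Python: `flag_gaps` is a hypothetical helper, the timestamps are assumed to be Unix seconds as in the fiddle, and the sample values are invented.

```python
THIRTY_DAYS = 30 * 24 * 60 * 60  # 30 days in seconds


def flag_gaps(timestamps):
    """Return (timestamp, flag) pairs in ascending order.

    flag is 1 when the row lies more than 30 days after the previous row.
    """
    ordered = sorted(timestamps)
    flagged = []
    # Start with the lowest value, like @pt := MIN(timestamp).
    previous = ordered[0] if ordered else None
    for ts in ordered:
        # Compare first: `previous` still holds the previous row's value.
        flag = 1 if previous < ts - THIRTY_DAYS else 0
        flagged.append((ts, flag))
        # Assign afterwards, so the next iteration sees this row.
        previous = ts
    return flagged


DAY = 24 * 60 * 60
sample = [0, 10 * DAY, 50 * DAY, 55 * DAY]
for ts, flag in flag_gaps(sample):
    print(ts // DAY, flag)
```

Swapping the two steps inside the loop would compare each row with itself, which is why the order of the expressions in the SELECT clause matters.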
First solution (only works if the id column has consecutive values): this filters out the rows that differ by more than 30 days from the previous row.
SELECT t.id, t.timestamp, DATEDIFF(FROM_UNIXTIME(t1.timestamp), FROM_UNIXTIME(t.timestamp)) AS days_diff
FROM tbl t
LEFT JOIN tbl t1
ON t.id = t1.id + 1
HAVING days_diff <= 30
ORDER BY t.timestamp DESC;
Second solution: this keeps only the rows that are within 30 days of at least one other entry.
SELECT *
FROM tbl t
WHERE EXISTS (
SELECT id
FROM tbl t1
WHERE DATEDIFF(FROM_UNIXTIME(t1.timestamp), FROM_UNIXTIME(t.timestamp)) < 30
AND t1.id <> t.id
)
ORDER BY t.timestamp desc;

mySQL query to not skip "empty" results

I have a query like this
SELECT count(Distinct name) as total, DATE_FORMAT(date_added, '%M %d, %Y') AS date_added
FROM `submitted_changes`
WHERE date_added >= NOW() - INTERVAL 1 WEEK
GROUP BY DATE(date_added)
It works great and returns rows that have a nicely formatted date and a total. Basically, this represents the number of submissions per day.
The problem I have is dealing with days with 0 submissions. I don't want to skip these days, but rather have the date shown and 0 for the total. Is there a way to ensure that when I do the query above (which only includes dates from the past week [7 days]) that I always get 7 rows back?
To do this, you need a row for every date. Since your submitted_changes table does not have one, I suggest creating a date table (if you don't already have one).
Note - for the shortest version, check the last edit at the bottom:
Here is an example with a temporary table. First run:
CREATE TEMPORARY TABLE IF NOT EXISTS dates AS
SELECT CURDATE() - INTERVAL num DAY as date_col -- plain curdate()-num is numeric and breaks across month boundaries
FROM
(
SELECT 0 as num
UNION
SELECT 1
UNION
SELECT 2
UNION
SELECT 3
UNION
SELECT 4
UNION
SELECT 5
UNION
SELECT 6) sub
This will create a table with 7 relevant dates.
Now left join it with your data:
SELECT
count(Distinct name) as total,
DATE_FORMAT(date_col, '%M %d, %Y') AS date_added
FROM dates LEFT JOIN submitted_changes
on (dates.date_col = DATE(submitted_changes.date_added))
GROUP BY date_col
You can also run it as a one-shot query (with no create statement):
SELECT
count(Distinct name) as total,
DATE_FORMAT(date_col, '%M %d, %Y') AS date_added
FROM
(SELECT CURDATE() - INTERVAL num DAY as date_col -- plain curdate()-num is numeric and breaks across month boundaries
FROM
(
SELECT 0 as num
UNION
SELECT 1
UNION
SELECT 2
UNION
SELECT 3
UNION
SELECT 4
UNION
SELECT 5
UNION
SELECT 6) sub) dates
LEFT JOIN submitted_changes
on (dates.date_col = DATE(submitted_changes.date_added))
GROUP BY date_col
Another approach is a permanent dim_date table. Here is sample code for a static table (with a few extra fields):
CREATE TABLE dim_date (
id int(11) NOT NULL AUTO_INCREMENT,
date date,
day int(11),
month int(11),
year int(11),
day_name varchar(45),
PRIMARY KEY (id),
INDEX date_index (date)
)
and then populate it:
SET @currdate := "2015-01-01";
SET @enddate := "2025-01-01";
delimiter $$
DROP PROCEDURE IF EXISTS BuildDate$$
CREATE PROCEDURE BuildDate()
BEGIN
WHILE @currdate < @enddate DO
INSERT INTO dim_date (date, day, month, year, day_name)
VALUES (
@currdate, DAY(@currdate), MONTH(@currdate),
YEAR(@currdate), DAYNAME(@currdate)
);
SET @currdate := DATE_ADD(@currdate, INTERVAL 1 DAY);
END WHILE;
END$$
delimiter ;
CALL BuildDate();
Then you can finally run your query with a left join:
SELECT
count(Distinct name) as total,
DATE_FORMAT(date, '%M %d, %Y') AS date_added
FROM dim_date LEFT JOIN submitted_changes
on (dim_date.date = DATE(submitted_changes.date_added))
WHERE date > CURDATE() - INTERVAL 1 WEEK
AND date <= CURDATE() -- dim_date also holds future dates, so cap the range at today
GROUP BY date
This returns one row per date, even if there are no records in submitted_changes for that date.
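The idea behind the date table (start from the full list of days, then attach whatever data exists) can be sketched in a few lines of Python. This is an illustration only: `daily_totals` is a hypothetical helper, and the submissions are assumed to arrive as (name, date_added) pairs with date_added already reduced to a date.

```python
from datetime import date, timedelta


def daily_totals(submissions, today, days=7):
    """Return (day, distinct_name_count) for each of the last `days` days.

    Days without submissions are kept with a count of 0, which is what the
    LEFT JOIN against the date table achieves in SQL.
    """
    names_by_day = {}
    for name, added in submissions:
        names_by_day.setdefault(added, set()).add(name)
    # The "date table": today, yesterday, ... going back `days` days.
    calendar = [today - timedelta(days=n) for n in range(days)]
    return [(day, len(names_by_day.get(day, set()))) for day in calendar]


today = date(2015, 3, 3)
submissions = [
    ("alice", date(2015, 3, 3)),
    ("alice", date(2015, 3, 3)),  # same name twice counts once
    ("bob", date(2015, 3, 3)),
    ("carol", date(2015, 3, 1)),
]
for day, total in daily_totals(submissions, today):
    print(day.isoformat(), total)
```

Because the list of days is built first, the result always has seven entries regardless of how many days actually had submissions.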
Edit: another one-shot super short version inspired by this post:
SELECT
count(Distinct name) as total,
DATE_FORMAT(date, '%M %d, %Y') AS date_added
FROM
-- relies on submitted_changes.id covering all seven remainders of id % 7
(SELECT CURDATE() - INTERVAL (id % 7) DAY as date
FROM submitted_changes
GROUP BY date) dates LEFT JOIN submitted_changes
on (dates.date = DATE(submitted_changes.date_added))
GROUP BY date

How to skip other OR condition if first is matched in SELECT Query?

I am having trouble with an OR condition inside a SELECT.
I want only the result of the first condition that matches; the remaining OR conditions should then not be used.
What I want is:
Some users have shared records, and I would like to email them the newest items shared on my website.
For me, the newest items are at least two days old.
For example, today is the 9th, so I would like to pull all records from the 7th. If there are no records from the 7th, I would like to pull all records from the 6th (3 days before today). If there are no records from the 6th either, I would like to pull the records from 1 day before today.
For all this I used OR in my SELECT query, like this:
SELECT `tg`.* FROM `tblgallery` AS `tg` WHERE (
(tg.added_date BETWEEN '2014-07-07 00:00:00' AND '2014-07-08 00:00:00') OR
(tg.added_date BETWEEN '2014-07-06 00:00:00' AND '2014-07-07 00:00:00') OR
(tg.added_date BETWEEN '2014-07-08 00:00:00' AND '2014-07-09 00:00:00') )
And I have records in my database for these dates:
2014-07-06
2014-07-07
When I run this query it gives me all records of both dates.
But I need to pull only the records of 2014-07-07, not of both (as described above).
I know I can do this with multiple SELECTs, but I don't think querying the database again and again is a good idea.
My question is: how do I pull only the data for the first date that matches, and skip the data of the remaining dates?
Or is there any other way to do this?
Please help.
Usually one would just work with LIMIT, which is not applicable here, since there might be many rows per day. What I do is quite similar to LIMIT.
SELECT * FROM (
SELECT
tg.*,
@gn := IF(DATE(tg.added_date) != @prev_date, @gn + 1, @gn) AS my_group_number,
@prev_date := DATE(tg.added_date)
FROM tblgallery tg
, (SELECT @gn := 0, @prev_date := CURDATE()) var_init
ORDER BY FIELD(DATE(tg.added_date), CURDATE() - INTERVAL 1 DAY, CURDATE() - INTERVAL 3 DAY, CURDATE() - INTERVAL 2 DAY) DESC
) sq
WHERE my_group_number = 1;
Here's how it works.
With this line
, (SELECT @gn := 0, @prev_date := CURDATE()) var_init
the variables are initialized.
Then the ORDER BY is important! The FIELD() function sorts the rows from 2 days ago (gets value 3), to 3 days ago (gets value 2), to 1 day ago (gets value 1). Everything else gets value 0.
Then in the SELECT clause the order is also important.
With this line
@gn := IF(DATE(tg.added_date) != @prev_date, @gn + 1, @gn) AS my_group_number,
the variable @gn is incremented when the date of the current row is different from the date of the previous row.
With this line
@prev_date := DATE(tg.added_date)
the date of the current row is assigned to the variable @prev_date. In the line above it still has the value of the previous row.
Now the entries belonging to the first date that has records, checked in the following order, have a 1 in the my_group_number column:
2 days ago
3 days ago
yesterday
4 days ago
5 days ago
...
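The priority order listed above can also be expressed directly in application code. The following Python sketch is only an illustration: `pick_first_match` is a hypothetical helper, rows are assumed to be (item, added_date) pairs, and unlike the query it returns nothing when none of the three preferred dates has records.

```python
from datetime import date, timedelta


def pick_first_match(rows, today):
    """Return the rows of the first preferred date that has any records.

    Preference order: 2 days ago, then 3 days ago, then yesterday.
    """
    preferred = [today - timedelta(days=n) for n in (2, 3, 1)]
    for wanted in preferred:
        matches = [row for row in rows if row[1] == wanted]
        if matches:
            # Stop at the first date with data; later dates are skipped.
            return matches
    return []


today = date(2014, 7, 9)
rows = [
    ("pic1", date(2014, 7, 6)),
    ("pic2", date(2014, 7, 7)),
    ("pic3", date(2014, 7, 7)),
]
print(pick_first_match(rows, today))
```

With the sample rows only the two records from 2014-07-07 are returned, even though 2014-07-06 also has data.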
Try this Query:
SELECT GalleryID, PixName, A.added_date
FROM tblGallery A
INNER JOIN (
SELECT added_date FROM tblGallery
WHERE added_date <= DATE_SUB('2014-07-09 00:00:00', interval 2 day)
GROUP BY added_date
ORDER BY added_date DESC
LIMIT 1 ) B
ON A.added_date = B.added_date
See my SQL Fiddle Demo
It still works even if the latest date is more than 2 days old.
See the demo below, in which the latest date is 4 days before July 9, 2014.
See the 2nd Demo
If you want the current date instead of a literal date as above, you can use the CURDATE() function instead, like this:
SELECT GalleryID, PixName, A.added_date
FROM tblGallery A
INNER JOIN (
SELECT added_date FROM tblGallery
WHERE added_date <= DATE_SUB(CURDATE(), interval 2 day)
GROUP BY added_date
ORDER BY added_date DESC
LIMIT 1 ) B
ON A.added_date = B.added_date
See 3rd Demo
Well, I wasn't able to solve the multiple-OR issue, but this is how you could get the records added in the last two days. Change the interval or the CURDATE() to fit your needs.
SELECT id, date_added
FROM gallery
WHERE date_added BETWEEN CURDATE() - INTERVAL 2 DAY AND CURDATE()
ORDER BY date_added
Check the SQL Fiddle
This is not about how OR works in MySQL.
Judging by your discussion with @B.T., I think you are misunderstanding the WHERE clause.
It is evaluated for each record.
So if the first condition evaluates to false for a record, the second condition is evaluated for that particular record, and so on. If any of the conditions evaluates to true, the record becomes part of your result set.
Try this query.
SELECT `tg`.* FROM `tblgallery` AS `tg` WHERE tg.added_date = (
select date from (
select distinct(tg.added_date) date from tblgallery as tg
) as t1 order by case
when date between '2014-07-07 00:00:00' AND '2014-07-08 00:00:00'
then 1
when date between '2014-07-06 00:00:00' AND '2014-07-07 00:00:00'
then 2
when date between '2014-07-08 00:00:00' AND '2014-07-09 00:00:00'
then 3
else 4
end limit 1);
Here's what I am doing in this query.
I am getting all the distinct dates.
Then I am ordering them by the conditions, i.e. 1 if the first condition is true, 2 if the second is true, and so on.
I am limiting the result to 1, so after the ordering the first row is selected; that row is a date, and it is used in the outer condition.
Note: I have not tested it yet, so you may need to make some changes to the query.

postgresql max(count(*)) - php

I have a problem in PostgreSQL.
I have one cohorte (a gathering of people) and I would like to count the persons in this cohorte.
Begin date: "2014-09-01", end date: "2014-11-30".
I have 5 persons between 09/01 and 09/22
I have 5 persons between 09/20 and 09/25
I have 5 persons between 09/26 and 10/05
I have 5 persons between 10/01 and 11/30
I want the maximum occupancy for each month between the begin date and the end date, in SQL (or PHP). Expected max person count:
September(09) => 10
October(10) => 10
November(11) => 5
Find the maximum of simultaneously present persons on a single day for every month in a given period.
I suggest generate_series() to produce the series of days in your period. Then aggregate twice:
First to get a count for each day. A single day can be checked with a plain BETWEEN. Your ranges are obviously meant to include both bounds.
Second to get the maximum per month.
SELECT date_trunc('month', day)::date AS month, max(ct) AS max_ct
FROM (
SELECT g.day, count(*) AS ct
FROM cohorte
,generate_series('2014-09-01'::date -- first of Sept.
,'2014-11-30'::date -- last of Nov.
,'1 day'::interval) g(day)
WHERE g.day BETWEEN t_begin AND t_end
GROUP BY 1
) sub
GROUP BY 1
ORDER BY 1;
Returns:
month | max_ct
-----------+--------
2014-09-01 | 10
2014-10-01 | 10
2014-11-01 | 5
Use to_char() to prettify the month output.
SQL Fiddle .. is down ATM. Here is my test case (that you should have provided):
CREATE TEMP TABLE cohorte (
cohorte_id serial PRIMARY KEY
,person_id int NOT NULL
,t_begin date NOT NULL -- inclusive
,t_end date NOT NULL -- inclusive
);
INSERT INTO cohorte(person_id, t_begin, t_end)
SELECT g, '2014-09-01'::date, '2014-09-22'::date
FROM generate_series (1,5) g
UNION ALL
SELECT g+5, '2014-09-20', '2014-09-25'
FROM generate_series (1,5) g
UNION ALL
SELECT g+10, '2014-09-26', '2014-10-05'
FROM generate_series (1,5) g
UNION ALL
SELECT g+15, '2014-10-01', '2014-11-30'
FROM generate_series (1,5) g;
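The two-step aggregation (count per day, then maximum per month) can be checked against the same test data outside the database. This Python sketch is only an illustration: `max_per_month` is a hypothetical helper, stays are assumed to be (t_begin, t_end) pairs with inclusive bounds, and unlike the query it also reports months in which every day has a count of 0.

```python
from datetime import date, timedelta


def max_per_month(stays, first_day, last_day):
    """Return {(year, month): max simultaneous persons on any single day}."""
    per_month = {}
    day = first_day
    while day <= last_day:
        # Step 1: count the stays covering this day (bounds inclusive).
        count = sum(1 for begin, end in stays if begin <= day <= end)
        # Step 2: keep the highest daily count seen for the month.
        key = (day.year, day.month)
        per_month[key] = max(per_month.get(key, 0), count)
        day += timedelta(days=1)
    return per_month


# Same four groups of five persons as in the test case above.
stays = (
    [(date(2014, 9, 1), date(2014, 9, 22))] * 5
    + [(date(2014, 9, 20), date(2014, 9, 25))] * 5
    + [(date(2014, 9, 26), date(2014, 10, 5))] * 5
    + [(date(2014, 10, 1), date(2014, 11, 30))] * 5
)
result = max_per_month(stays, date(2014, 9, 1), date(2014, 11, 30))
print(result)  # {(2014, 9): 10, (2014, 10): 10, (2014, 11): 5}
```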
For more complex checks I'd suggest the OVERLAPS operator:
Find overlapping date ranges in PostgreSQL
For more complex scenarios I'd also consider range types:
Preventing adjacent/overlapping entries with EXCLUDE in PostgreSQL
Can't you use a window function?
I'd try something like this (I have not tested this code, I'm just laying out my thoughts):
SELECT max(count) FROM (
SELECT count(*) OVER (PARTITION BY ???) as count
FROM contract
WHERE daterange(dateStart, dateEnd, '[]') && daterange('2014-09-01', '2014-10-01', '[)')
) as max
My remaining problem here is that I can't find a way to partition by each day of the interval. Maybe this is the wrong approach, but I would be interested in a solution based on window functions.
Edit: this query gives you the maximum number of persons present simultaneously, but over the whole time span, not only for a given month:
with presence as (
SELECT id, generate_series(begin_date, end_date, '1 day'::interval) AS date
FROM test
),
presents as (
SELECT count(*) OVER (PARTITION BY date) AS count
FROM presence
)
SELECT max(count) from presents;
Here we go, I think.
Imagine your person table has 3 columns:
id
entrance_date
leaving_date
The query would then look like this:
WITH presents as (
SELECT id,
daterange(entrance_date, leaving_date, '[]') * daterange('2014-09-01', '2014-11-30', '[]') as range
FROM person
WHERE daterange(entrance_date, leaving_date, '[]') && daterange('2014-09-01', '2014-11-30', '[]')
),
present_per_day as (
SELECT id,
generate_series(lower(range), upper(range) - 1, '1 day'::interval) AS date -- upper() of a daterange is exclusive
FROM presents
),
count_per_day as (
SELECT count(*) OVER (PARTITION BY date) AS count,
date
FROM present_per_day
)
SELECT max(count) OVER (PARTITION BY date_part('year', date), date_part('month', date)) as max,
date_part('year', date),
date_part('month', date)
FROM count_per_day;
(I have to leave, I hope I'll have time to test it later)
In fact, @erwin's solution is much easier and more efficient than this one.

MySQL BETWEEN DATE RANGE

I have a scenario where I need to pull up delivery dates based on a table below (Example)
job_id | delivery_date
1 | 2013-01-12
2 | 2013-01-25
3 | 2013-02-15
What I'm trying to do is show the user all the delivery dates from the earliest one (in this case 2013-01-12) up to 21 days after it. For the table above I would expect the output to be the starting date 2013-01-12 and 2013-01-25. The February date is of no importance, since it is outside my 21-day range. If it were a 5-day range, for example, 2013-01-25 would not be included and only the earliest date would appear.
Here is the main SQL query I have, which only shows jobs from this year onward:
SELECT date, delivery_date
FROM `job_sheet`
WHERE print_status IS NULL
AND job_sheet.date>'2013-01-01'
Is it possible to accomplish this with 1 SQL query, or must I go with a mix of PHP as well?
You can use the following:
select *
from job_sheet
where print_status IS NULL
and delivery_date >= (select min(delivery_date)
from job_sheet)
and delivery_date <= (select date_add(min(delivery_date), interval 21 day)
from job_sheet)
See SQL Fiddle with Demo
If you are worried about the dates computed by the subqueries not being correct, it might be best to pass the start date in to your query and then add 21 days to get the end date, similar to this:
set @a='2013-01-01';
select *
from job_sheet
where delivery_date >= @a
and delivery_date <= date_add(@a, interval 21 day)
See SQL Fiddle with Demo
SELECT date,
delivery_date
FROM job_sheet
WHERE print_status IS NULL
AND job_sheet.date BETWEEN (SELECT MIN(date) FROM job_sheet) AND
(SELECT MIN(date) FROM job_sheet) + INTERVAL 21 DAY
SELECT j.job_id
, j.delivery_date
FROM `job_sheet` j
JOIN ( SELECT MIN(d.delivery_date) AS earliest_date
FROM `job_sheet` d
WHERE d.delivery_date >= '2013-01-01'
) e
ON j.delivery_date >= e.earliest_date
AND j.delivery_date < DATE_ADD(e.earliest_date, INTERVAL 22 DAY)
AND j.print_status IS NULL
ORDER BY j.delivery_date
(The original query has a predicate on job_sheet.date; the query above references d.delivery_date. Change that if it is supposed to reference the date column instead.)
If the intent is to only show delivery_date values from today forward, then change the literal '2013-01-01' to an expression that returns the current date, e.g. DATE(NOW())
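To double-check the window logic that these queries share, here is a small Python sketch using the dates from the question. `deliveries_in_window` is a hypothetical helper, and the window is assumed to be inclusive on both ends, which for whole dates matches both `<= earliest + 21 days` and `< earliest + 22 days`.

```python
from datetime import date, timedelta


def deliveries_in_window(delivery_dates, days=21):
    """Return the dates from the earliest one up to `days` days after it."""
    if not delivery_dates:
        return []
    earliest = min(delivery_dates)
    cutoff = earliest + timedelta(days=days)  # inclusive upper bound
    return sorted(d for d in delivery_dates if earliest <= d <= cutoff)


delivery_dates = [date(2013, 1, 12), date(2013, 1, 25), date(2013, 2, 15)]
print(deliveries_in_window(delivery_dates))          # the two January dates
print(deliveries_in_window(delivery_dates, days=5))  # only 2013-01-12
```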
