I want to display user statistics on my website, showing the percentage of users in each age group, like:
-13 years : $percent %
13-15 years : $percent %
15-20 years : $percent %
23+ : $percent %
In my MySQL table I have a column birth_date that holds a datetime (yyyy-mm-dd).
Do you have any hints or ideas on how to do that?
Pure SQL:
SELECT
`group`,
COUNT(*) as `count`
FROM
`user`
INNER JOIN (
SELECT
0 as `start`, 12 as `end`, '0-12' as `group`
UNION ALL
SELECT
13, 14, '13-14'
UNION ALL
SELECT
15, 19, '15-19'
UNION ALL
SELECT
20, 150, '20+'
) `sub`
ON TIMESTAMPDIFF(YEAR, `birth_date`, NOW()) BETWEEN `start` AND `end`
GROUP BY `group` WITH ROLLUP;
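The WITH ROLLUP modifier adds a final row (with `group` set to NULL) that holds the total count. Anything else, such as the percentages, can be calculated in PHP.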
select case when timestampdiff(year, birth_date, curdate()) < 13
then '< 13 years'
when timestampdiff(year, birth_date, curdate()) between 13 and 14
then '13 - 14 years'
when timestampdiff(year, birth_date, curdate()) between 15 and 20
then '15 - 20 years'
when timestampdiff(year, birth_date, curdate()) >= 23
then '+23 years'
end as `description`,
-- share of all users in this group, as a percentage
100 * count(*) / (select count(*) from your_table) as `percent`
from your_table
-- ages 21 and 22 match none of the bands and end up in a NULL group
group by `description`
If you want pure SQL:
SELECT COUNT(*)
FROM [table_name]
WHERE birth_date > DATE_SUB(NOW(),INTERVAL 13 YEAR)
-- for the next group, use this WHERE clause instead:
-- WHERE birth_date <= DATE_SUB(NOW(),INTERVAL 13 YEAR)
-- AND birth_date > DATE_SUB(NOW(),INTERVAL 15 YEAR)
-- etc.
Otherwise, I would suggest using PHP to create the dates.
You could union all the queries together and create an SQL view, e.g.:
CREATE VIEW statistics AS
SELECT "0-13" as age ,COUNT(*) as total
FROM table_name
WHERE birth_date > DATE_SUB(NOW(),INTERVAL 13 YEAR)
UNION
SELECT "13-15" as age ,COUNT(*) as total
FROM table_name
WHERE birth_date <= DATE_SUB(NOW(),INTERVAL 13 YEAR)
AND birth_date > DATE_SUB(NOW(),INTERVAL 15 YEAR)
UNION
SELECT "15-20" as age ,COUNT(*) as total
FROM table_name
WHERE birth_date <= DATE_SUB(NOW(),INTERVAL 15 YEAR)
AND birth_date > DATE_SUB(NOW(),INTERVAL 20 YEAR)
UNION
SELECT "20-23" as age ,COUNT(*) as total
FROM table_name
WHERE birth_date <= DATE_SUB(NOW(),INTERVAL 20 YEAR)
AND birth_date > DATE_SUB(NOW(),INTERVAL 23 YEAR)
UNION
SELECT "23+" as age ,COUNT(*) as total
FROM table_name
WHERE birth_date <= DATE_SUB(NOW(),INTERVAL 23 YEAR)
Then you can just query:
SELECT * from statistics
Return the datetime as a timestamp, compare it with the current date shifted back by each age limit (something like now.offset(years=??)), sort the records into arrays, then count the records in each array.
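For anyone who prefers to do the grouping in application code, here is a minimal Python sketch of that idea. It assumes the birth dates have already been fetched as date objects. The band limits are borrowed from the first answer, and the function names (age_in_years, band_for, age_group_percentages) are made up for illustration.

```python
from collections import Counter
from datetime import date

# Bands borrowed from the first answer: (label, minimum age, maximum age).
# A maximum of None means the band is open-ended.
BANDS = [("0-12", 0, 12), ("13-14", 13, 14), ("15-19", 15, 19), ("20+", 20, None)]


def age_in_years(birth, today):
    """Return the number of completed years between birth and today."""
    had_birthday = (today.month, today.day) >= (birth.month, birth.day)
    return today.year - birth.year - (0 if had_birthday else 1)


def band_for(age):
    """Return the label of the band that contains age, or 'unknown'."""
    for label, low, high in BANDS:
        if age >= low and (high is None or age <= high):
            return label
    return "unknown"


def age_group_percentages(birth_dates, today):
    """Map each band label to the percentage of birth_dates that fall in it."""
    counts = Counter(band_for(age_in_years(b, today)) for b in birth_dates)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * counts[label] / total for label, _, _ in BANDS}
```

Passing today in as a parameter, instead of calling date.today() inside the function, keeps the result reproducible. Birth dates that match no band (for example, dates in the future) still count toward the total but are not reported.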
I have a table with 2 columns, date and score. It has at most 30 entries, one for each of the last 30 days.
date score
-----------------
1.8.2010 19
2.8.2010 21
4.8.2010 14
7.8.2010 10
10.8.2010 14
My problem is that some dates are missing - I want to see:
date score
-----------------
1.8.2010 19
2.8.2010 21
3.8.2010 0
4.8.2010 14
5.8.2010 0
6.8.2010 0
7.8.2010 10
...
What I need from a single query is: 19,21,0,14,0,0,10,0,0,14... That is, the missing dates are filled with 0.
I know how to fetch all the values and, in a server-side language, iterate through the dates and fill in the blanks. But is it possible to do this in MySQL, so that I sort the result by date and get the missing pieces?
EDIT: In this table there is another column named UserID, so I have 30.000 users and some of them have the score in this table. I delete the dates every day if date < 30 days ago because I need last 30 days score for each user. The reason is I am making a graph of the user activity over the last 30 days and to plot a chart I need the 30 values separated by comma. So I can say in query get me the USERID=10203 activity and the query would get me the 30 scores, one for each of the last 30 days. I hope I am more clear now.
MySQL versions before 8.0 don't have recursive functionality, so you're left with using the NUMBERS table trick:
Create a table that only holds incrementing numbers - easy to do using an auto_increment:
DROP TABLE IF EXISTS `example`.`numbers`;
CREATE TABLE `example`.`numbers` (
`id` int(10) unsigned NOT NULL auto_increment,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Populate the table using:
INSERT INTO `example`.`numbers`
( `id` )
VALUES
( NULL )
...for as many values as you need.
Use DATE_ADD to construct a list of dates, increasing the days based on the NUMBERS.id value. Replace "2010-06-06" and "2010-06-14" with your respective start and end dates (but use the same format, YYYY-MM-DD) -
SELECT `x`.*
FROM (SELECT DATE_ADD('2010-06-06', INTERVAL `n`.`id` - 1 DAY)
FROM `numbers` `n`
WHERE DATE_ADD('2010-06-06', INTERVAL `n`.`id` -1 DAY) <= '2010-06-14' ) x
LEFT JOIN onto your table of data based on the date:
SELECT `x`.`ts` AS `timestamp`,
COALESCE(`y`.`score`, 0) AS `cnt`
FROM (SELECT DATE_ADD('2010-06-06', INTERVAL `n`.`id` - 1 DAY) AS `ts`
FROM `numbers` `n`
WHERE DATE_ADD('2010-06-06', INTERVAL `n`.`id` - 1 DAY) <= '2010-06-14') x
-- `your_table` is a placeholder for the name of your data table
LEFT JOIN `your_table` `y` ON STR_TO_DATE(`y`.`date`, '%d.%m.%Y') = `x`.`ts`
If you want to maintain the date format, use the DATE_FORMAT function:
DATE_FORMAT(`x`.`ts`, '%d.%m.%Y') AS `timestamp`
I'm not a fan of the other answers, requiring tables to be created and such. This query does it efficiently without helper tables.
SELECT
IF(score IS NULL, 0, score) AS score,
b.Days AS date
FROM
(SELECT a.Days
FROM (
SELECT curdate() - INTERVAL (a.a + (10 * b.a) + (100 * c.a)) DAY AS Days
FROM (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS a
CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS b
CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS c
) a
WHERE a.Days >= curdate() - INTERVAL 30 DAY) b
LEFT JOIN your_table
ON date = b.Days
ORDER BY b.Days;
So let's dissect this.
SELECT
IF(score IS NULL, 0, score) AS score,
b.Days AS date
The IF detects days that had no score and sets them to 0. b.Days holds the dates in the range you chose to go back from the current date, up to 1000 days.
(SELECT a.Days
FROM (
SELECT curdate() - INTERVAL (a.a + (10 * b.a) + (100 * c.a)) DAY AS Days
FROM (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS a
CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS b
CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS c
) a
WHERE a.Days >= curdate() - INTERVAL 30 DAY) b
This subquery is something I saw on Stack Overflow. It efficiently generates a list of the past 1000 days, counting back from the current date. The interval (currently 30) in the WHERE clause at the end determines which days are returned; the maximum is 1000. The query could easily be modified to return hundreds of years' worth of dates, but 1000 days should be enough for most things.
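To see why three digit tables are enough, here is a small Python sketch of the same arithmetic. It is only an illustration of the idea, not part of the query, and the function name past_days is made up.

```python
from datetime import date, timedelta
from itertools import product


def past_days(today, window):
    """Return the dates from today - window up to today, oldest first."""
    digits = range(10)
    # a + 10*b + 100*c yields every offset from 0 to 999 exactly once,
    # just like the three cross-joined digit tables in the SQL.
    offsets = (a + 10 * b + 100 * c for a, b, c in product(digits, repeat=3))
    cutoff = today - timedelta(days=window)
    # Mirrors: WHERE Days >= curdate() - INTERVAL 30 DAY
    candidates = (today - timedelta(days=n) for n in offsets)
    return sorted(day for day in candidates if day >= cutoff)
```

Note that a window of 30 returns 31 dates, today plus the 30 days before it, because the comparison is inclusive. The SQL above behaves the same way.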
LEFT JOIN your_table
ON date = b.Days
ORDER BY b.Days;
This is the part that brings your table that contains the score into it. You compare to the selected date range from the date generator query to be able to fill in 0s where needed (the score will be set to NULL initially, because it is a LEFT JOIN; this is fixed in the select statement). I also order it by the dates, just because. This is preference, you could also order by score.
Before the ORDER BY you could easily join with your table about user info you mentioned with your edit, to add that last requirement.
I hope this version of the query helps someone. Thanks for reading.
Time went by since this question was asked. MySQL 8.0 was released in 2018 and added support for recursive common table expressions, which provide an elegant, state-of-the-art solution to this question.
The following query can be used to generate a list of dates, say for the first 15 days of August 2010:
with recursive all_dates(dt) as (
-- anchor
select '2010-08-01' dt
union all
-- recursion with stop condition
select dt + interval 1 day from all_dates where dt < '2010-08-15'
)
select * from all_dates order by dt
You can then left join this resultset with your table to generate the expected output:
with recursive all_dates(dt) as (
select '2010-08-01' dt
union all
select dt + interval 1 day from all_dates where dt < '2010-08-15'
)
select d.dt date, coalesce(t.score, 0) score
from all_dates d
left join mytable t on t.date = d.dt
order by d.dt
Demo on DB Fiddle:
date | score
:--------- | ----:
2010-08-01 | 19
2010-08-02 | 21
2010-08-03 | 0
2010-08-04 | 14
2010-08-05 | 0
2010-08-06 | 0
2010-08-07 | 10
2010-08-08 | 0
2010-08-09 | 0
2010-08-10 | 14
2010-08-11 | 0
2010-08-12 | 0
2010-08-13 | 0
2010-08-14 | 0
2010-08-15 | 0
Note that it is very easy to adapt the recursive CTE for other intervals or periods. As an example, say we want a row every 15 minutes from 4 AM to 8 AM on August 1st, 2010; we can do:
with recursive all_dates(dt) as (
select '2010-08-01 04:00:00' dt
union all
select dt + interval 15 minute from all_dates where dt < '2010-08-01 08:00:00'
)
...
You can accomplish this by using a calendar table. That's a table which you create once and fill with a date range (e.g. one row for each day from 2000 to 2050; the range depends on your data). Then you can make an outer join of the calendar table against your table. If a date is missing in your table, you return 0 for the score.
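A runnable sketch of the calendar table idea follows. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so the SQL dialect differs slightly. The table names (calendar, scores) and the function name are assumptions made for this example.

```python
import sqlite3
from datetime import date, timedelta


def scores_with_gaps_filled(rows, start, end):
    """Return (day, score) for every day from start to end, using 0 for gaps.

    rows is a list of (ISO date string, score) tuples.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE calendar (day TEXT PRIMARY KEY)")
    con.execute("CREATE TABLE scores (day TEXT PRIMARY KEY, score INTEGER)")
    # Fill the calendar once with one row per day in the range.
    n_days = (end - start).days + 1
    con.executemany(
        "INSERT INTO calendar VALUES (?)",
        [((start + timedelta(days=i)).isoformat(),) for i in range(n_days)],
    )
    con.executemany("INSERT INTO scores VALUES (?, ?)", rows)
    # The outer join keeps every calendar day; COALESCE turns the NULL
    # score of a missing day into 0.
    result = con.execute(
        "SELECT c.day, COALESCE(s.score, 0) "
        "FROM calendar c LEFT JOIN scores s ON s.day = c.day "
        "ORDER BY c.day"
    ).fetchall()
    con.close()
    return result
```

In a real setup the calendar table would be created and filled once and then reused. The sketch rebuilds it on every call only to stay self-contained.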
Michael Conard's answer is great, but I needed intervals of 15 minutes where each time always falls on a quarter-hour boundary:
SELECT a.Days
FROM (
SELECT FROM_UNIXTIME( FLOOR( UNIX_TIMESTAMP() / (15 * 60) ) * (15 * 60)) - INTERVAL 15 * (a.a + (10 * b.a) + (100 * c.a)) MINUTE AS Days
FROM (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS a
CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS b
CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS c
) a
WHERE a.Days >= curdate() - INTERVAL 30 DAY
This rounds the current time down to the previous 15-minute boundary:
FROM_UNIXTIME( FLOOR( UNIX_TIMESTAMP() / (15 * 60) ) * (15 * 60))
And this steps back in time in 15-minute increments:
- INTERVAL 15 * (a.a + (10 * b.a) + (100 * c.a)) MINUTE
If there's a simpler way to do it, please let me know.
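For comparison, the same rounding and stepping can be sketched in Python. This is an illustration of the arithmetic only; the function name quarter_hour_slots is made up, and the timestamps are handled in UTC as an assumption.

```python
from datetime import datetime, timedelta, timezone

STEP_SECONDS = 15 * 60


def quarter_hour_slots(now, count):
    """Return count timestamps, newest first, aligned to 15-minute boundaries."""
    # Same idea as FLOOR(UNIX_TIMESTAMP() / (15 * 60)) * (15 * 60):
    # integer division drops the seconds past the last boundary.
    floored = int(now.timestamp()) // STEP_SECONDS * STEP_SECONDS
    start = datetime.fromtimestamp(floored, tz=timezone.utc)
    # Step backwards one slot at a time, like "- INTERVAL 15 * n MINUTE".
    return [start - timedelta(seconds=STEP_SECONDS * i) for i in range(count)]
```

Called with 04:37:12 UTC and a count of 3, this yields 04:30, 04:15 and 04:00.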
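You can also insert the missing dates directly, from a start date up to today:
-- INSERT IGNORE skips dates that already exist, provided `date` has a unique key
-- for ranges longer than 1000 days, raise cte_max_recursion_depth first
INSERT IGNORE INTO mytable (date, score)
with recursive all_dates(dt) as (
-- anchor
select '2021-01-01' dt
union all
-- recursion with stop condition
select dt + interval 1 day from all_dates where dt + interval 1 day <= curdate()
)
select dt, 0 from all_dates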
I've an activity tracker table with activity_id (Primary key, auto increment), user_id, api_function and date_added fields (please find the screenshot attached).
By using the below query I was able to count the number of entries per user in the last 28 days:
SELECT COUNT( DISTINCT date(date_Added) ) AS day_of_activity, user_id
FROM activity_tracker
WHERE date_added >= DATE( NOW() ) - INTERVAL 28 DAY
GROUP BY user_id
LIMIT 0 , 30
like:
day_of_activity user_id
34 1
1 3
13 9
2 10
1 11
8 12
I need to track the count of users who have:
more than 16 entries in the past 28 days
between 6 to 16 in the past 28 days,
1 to 6 in the past 28 days,
no entries in the past 30 days,
no entries in the past 90 days and
no entries in the past 180 days.
Is it possible to do this in single mysql query?
Please help me.
Thanks in advance.
Please find the answer below:
SELECT
SUM(
CASE WHEN day_of_activity>16 AND last_activity_date >= DATE(NOW()) - INTERVAL 28 DAY
THEN 1 ELSE 0
END)
as daily_users_count,
SUM(
CASE WHEN day_of_activity>6 AND day_of_activity <=16 AND last_activity_date >= DATE(NOW()) - INTERVAL 28 DAY
THEN 1 ELSE 0
END)
as weekly_users_count,
SUM(
CASE WHEN day_of_activity>=1 AND day_of_activity <=6 AND last_activity_date >= DATE(NOW()) - INTERVAL 28 DAY
THEN 1 ELSE 0
END)
as monthly_users_count,
SUM(
CASE WHEN DATE_SUB(CURDATE(),INTERVAL 30 DAY) >= last_activity_date OR last_activity_date IS NULL
THEN 1 ELSE 0
END)
as not_in_30,
SUM(
CASE WHEN DATE_SUB(CURDATE(),INTERVAL 90 DAY) >= last_activity_date OR last_activity_date IS NULL
THEN 1 ELSE 0
END)
as not_in_90,
SUM(
CASE WHEN DATE_SUB(CURDATE(),INTERVAL 180 DAY) >= last_activity_date OR last_activity_date IS NULL
THEN 1 ELSE 0
END)
as not_in_180
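FROM (
-- day_of_activity counts only the active days within the past 28 days;
-- last_activity_date still looks at the user's full history
SELECT COUNT(DISTINCT CASE WHEN at.date_added >= CURDATE() - INTERVAL 28 DAY THEN date(at.date_added) END) as day_of_activity, a.user_id, max(at.date_added) as last_activity_date
FROM accounts a
LEFT JOIN activity_tracker at ON a.user_id = at.user_id
WHERE a.user_role_id = 2
GROUP BY a.user_id
) temp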
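The bucketing rules of this query can also be written out in Python, which may make them easier to check against sample data. This is only a sketch: it assumes the active days per user have already been loaded into sets of dates, and the function name and bucket keys are made up.

```python
from datetime import date, timedelta


def activity_summary(activity_days, today):
    """Count users per activity bucket.

    activity_days maps user_id -> set of dates on which the user has at
    least one entry (an empty set for users without any entries).
    """
    summary = {"more_than_16": 0, "7_to_16": 0, "1_to_6": 0,
               "not_in_30": 0, "not_in_90": 0, "not_in_180": 0}
    window_start = today - timedelta(days=28)
    for days in activity_days.values():
        # Active days inside the 28-day window decide the first three buckets.
        recent = sum(1 for d in days if d >= window_start)
        if recent > 16:
            summary["more_than_16"] += 1
        elif recent > 6:
            summary["7_to_16"] += 1
        elif recent >= 1:
            summary["1_to_6"] += 1
        # The "no entries" buckets depend on the most recent activity only.
        last = max(days) if days else None
        for limit in (30, 90, 180):
            if last is None or last <= today - timedelta(days=limit):
                summary["not_in_%d" % limit] += 1
    return summary
```

As in the SQL, a user with no entries at all is counted in all three "not in" buckets, and a user whose last entry is 100 days old is counted in the 30-day and 90-day buckets but not the 180-day one.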
Try this query:
select count(*) as count,
CASE WHEN dates_of_activity > 16 THEN 'Above 16'
WHEN dates_of_activity > 6 THEN '>6 and <=16'
WHEN dates_of_activity > 0 THEN '>0 and <=6'
ELSE 'never use'
END as typeday
from (SELECT COUNT( DISTINCT date( date_Added ) ) AS dates_of_activity
, user_id
FROM activity_tracker
WHERE date_added >= DATE( NOW( ) ) - INTERVAL 28 DAY
GROUP BY user_id)a
-- one row per activity band
GROUP BY typeday
I made the following query to select persons by age group, as a count and as a percentage. Dates of birth are stored in the format 0000-00-00 in my database.
SELECT AgeGroup, count(*) AS count, ROUND(sum( 100 ) / total) AS percentage
FROM (
SELECT case
when age between 0 and 17 then '00 - 17'
when age between 18 and 24 then '18 − 24'
when age between 25 and 34 then '25 − 34'
when age between 35 and 44 then '35 − 44'
when age between 45 and 54 then '45 − 54'
when age between 55 and 64 then '55 − 64'
when age between 65 and 125 then '65+'
else 'Unknown'
end AS AgeGroup
FROM (
SELECT ROUND(DATEDIFF(Cast(NOW() as Date),
Cast(dateofbirth as Date)) / 365, 0) as age
FROM people
) as SubQueryAlias
) as SubQueryAlias2
CROSS JOIN (SELECT count( * ) AS total FROM people)x
group by
AgeGroup
The current result is:
AgeGroup | count | percentage
00 - 17  | 33    | 1
18 − 24  | 235   | 5
..       | ..    | ..
What I need is an addition to the query that splits the results by gender (male/female/unknown):
AgeGroup | gender | count | percentage
00 - 17  | M      | 33    | 1
00 - 17  | F      | 33    | 1
..       | ..     | ..    | ..
You might have the easiest time by defining range-tables. This also prevents you from needing to do date math on every entry, and so may be more efficient for the grouping.
First, a range table for ages:
SELECT '00 - 17' AS ageGroup, CURRENT_DATE AS lower, CURRENT_DATE - INTERVAL 18 YEAR AS upper
UNION ALL
SELECT '18 - 24', CURRENT_DATE - INTERVAL 18 YEAR, CURRENT_DATE - INTERVAL 25 YEAR
UNION ALL
SELECT '25 - 34', CURRENT_DATE - INTERVAL 25 YEAR, CURRENT_DATE - INTERVAL 35 YEAR
UNION ALL
SELECT '35 - 44', CURRENT_DATE - INTERVAL 35 YEAR, CURRENT_DATE - INTERVAL 45 YEAR
UNION ALL
SELECT '45 - 54', CURRENT_DATE - INTERVAL 45 YEAR, CURRENT_DATE - INTERVAL 55 YEAR
UNION ALL
SELECT '55 - 64', CURRENT_DATE - INTERVAL 55 YEAR, CURRENT_DATE - INTERVAL 65 YEAR
UNION ALL
SELECT '65+', CURRENT_DATE - INTERVAL 65 YEAR, null
UNION ALL
SELECT 'Unknown', null, null
SQL Fiddle Demo
...which generates roughly the table you'd expect. Here "lower" and "upper" refer to the age, not the date: lower holds the latest birth date in the bracket and upper the earliest. Note that the upper bound is exclusive, which is why it uses the same value as the lower bound of the next row. Note also that 1) the '65+' bracket has no upper bound, and 2) the 'Unknown' bracket has neither.
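To illustrate the exclusive upper bound, here is a small sketch that looks up a single birth date against the first two range rows. It assumes a fixed reference date of '2014-07-21' in place of CURRENT_DATE so the result stays stable, and the birth date '1996-07-21' is a made-up value chosen so the person turns 18 exactly on that day.

```sql
-- Fixed reference date instead of CURRENT_DATE (assumption, for a stable result).
SELECT r.ageGroup
FROM (
  SELECT '00 - 17' AS ageGroup,
         DATE '2014-07-21' AS lower,
         DATE '2014-07-21' - INTERVAL 18 YEAR AS upper
  UNION ALL
  SELECT '18 - 24',
         DATE '2014-07-21' - INTERVAL 18 YEAR,
         DATE '2014-07-21' - INTERVAL 25 YEAR
) r
-- Same half-open test the full join uses: upper < dateOfBirth <= lower.
WHERE DATE '1996-07-21' >  r.upper
  AND DATE '1996-07-21' <= r.lower;
-- returns the single row '18 - 24'
```

Because the birth date equals the upper bound of '00 - 17', that row is excluded, and the date matches only '18 - 24', where it is the inclusive lower bound.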
Of course, we also need a Gender table:
SELECT 'M' AS gender
UNION ALL
SELECT 'F'
UNION ALL
SELECT 'Unknown'
(As a side note, I'd normally use a multi-row VALUES(...) statement, but SQL Fiddle seems to dislike that syntax in subqueries for MySQL for some reason. Use whichever you're comfortable with.)
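There's one last piece of knowledge we need:
Specifically, COUNT(<expression>) ignores rows where the expression is NULL. With a LEFT JOIN, an age/gender combination that matches no person therefore counts as 0 rather than 1. We can thus stitch together the full query similarly to: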
SELECT AgeRange.ageGroup, Gender.gender,
COUNT(People.id), ROUND(100 * COUNT(People.id) / Total.countOfPeople) AS percentage
FROM (SELECT '00 - 17' AS ageGroup, CURRENT_DATE AS lower, CURRENT_DATE - INTERVAL 18 YEAR AS upper
UNION ALL
SELECT '18 - 24', CURRENT_DATE - INTERVAL 18 YEAR, CURRENT_DATE - INTERVAL 25 YEAR
UNION ALL
SELECT '25 - 34', CURRENT_DATE - INTERVAL 25 YEAR, CURRENT_DATE - INTERVAL 35 YEAR
UNION ALL
SELECT '35 - 44', CURRENT_DATE - INTERVAL 35 YEAR, CURRENT_DATE - INTERVAL 45 YEAR
UNION ALL
SELECT '45 - 54', CURRENT_DATE - INTERVAL 45 YEAR, CURRENT_DATE - INTERVAL 55 YEAR
UNION ALL
SELECT '55 - 64', CURRENT_DATE - INTERVAL 55 YEAR, CURRENT_DATE - INTERVAL 65 YEAR
UNION ALL
SELECT '65+', CURRENT_DATE - INTERVAL 65 YEAR, null
UNION ALL
SELECT 'Unknown', null, null) AgeRange
CROSS JOIN (SELECT 'M' AS Gender
UNION ALL
SELECT 'F'
UNION ALL
SELECT 'Unknown') Gender
CROSS JOIN (SELECT COUNT(*) countOfPeople
FROM People) Total
LEFT JOIN People
ON ((People.dateOfBirth > AgeRange.upper AND dateOfBirth <= AgeRange.lower)
OR (People.dateOfBirth <= AgeRange.lower AND AgeRange.upper IS NULL)
OR (AgeRange.lower IS NULL AND AgeRange.upper IS NULL AND People.dateOfBirth IS NULL))
AND (Gender.gender = People.gender
OR Gender.gender = 'Unknown' AND People.gender IS NULL)
GROUP BY AgeRange.ageGroup, Gender.gender
SQL Fiddle Demo
(Note: the Fiddle demo uses the date of this post, '2014-07-21', as CURRENT_DATE, to make the age range query stable for future readers.)
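The COUNT behaviour the full query relies on can be checked in isolation. This is a minimal sketch with made-up inline rows (one person, gender 'M'), not the real People table:

```sql
-- One hypothetical person, left-joined against two gender rows.
SELECT g.gender,
       COUNT(*)    AS count_star,  -- counts the NULL-extended row too
       COUNT(p.id) AS count_id     -- skips rows where p.id IS NULL
FROM (SELECT 'M' AS gender UNION ALL SELECT 'F') g
LEFT JOIN (SELECT 1 AS id, 'M' AS gender) p
       ON p.gender = g.gender
GROUP BY g.gender;
-- M: count_star = 1, count_id = 1
-- F: count_star = 1, count_id = 0
```

This is why the full query counts People.id rather than using COUNT(*): empty brackets show up with a count of 0 instead of disappearing or being reported as 1.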
I really hope I'm wrong about this, but could the reason for the error simply be that you didn't select the gender column?
Also, a nerdy side note: a year isn't exactly 365 days, it's roughly 365.25, which means your age calculation is slightly off.
SELECT AgeGroup, gender, count(*) AS count, ROUND(sum( 100 ) / total) AS percentage
FROM (
SELECT case
when age between 0 and 17 then '00 - 17'
when age between 18 and 24 then '18 − 24'
when age between 25 and 34 then '25 − 34'
when age between 35 and 44 then '35 − 44'
when age between 45 and 54 then '45 − 54'
when age between 55 and 64 then '55 − 64'
when age between 65 and 125 then '65+'
else 'Unknown'
end AS AgeGroup, gender
FROM (
SELECT ROUND(DATEDIFF(Cast(NOW() as Date),
Cast(dateofbirth as Date)) / 365, 0) as age,
gender
FROM people
) as SubQueryAlias
) as SubQueryAlias2
CROSS JOIN (SELECT count( * ) AS total FROM people)x
group by
AgeGroup, gender
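On the side note about 365 days: besides the leap-year drift, ROUND(..., 0) rounds to the nearest year, so anyone past their half-birthday is counted a year older. One possible alternative, sketched here with made-up fixed dates, is TIMESTAMPDIFF(YEAR, ...), which counts completed years only:

```sql
-- Hypothetical person born 1996-12-01, evaluated on 2014-07-21.
SELECT
  ROUND(DATEDIFF(DATE '2014-07-21', DATE '1996-12-01') / 365, 0) AS rounded_age, -- 18
  TIMESTAMPDIFF(YEAR, DATE '1996-12-01', DATE '2014-07-21')       AS full_years; -- 17
```

With the rounded value this 17-year-old lands in '18 − 24'. In the query above, the inner expression could be swapped for TIMESTAMPDIFF(YEAR, dateofbirth, CURDATE()) to avoid that, assuming dateofbirth holds valid dates.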