Efficient way to sum measurements / time series by given interval in php

I have a series of measurement data (a time series) recorded at a fixed interval of 15 minutes. Furthermore, I have a given period (e.g. one day, the current week, month, year, ...) and I need to summarize the values by hour, day, month, and so on.
E.g. summarize all values of the last month, by day.
My approach is to generate a temporary array with the needed interval per period in the first step. E.g. here in PHP (PHP is not strictly required; I would prefer Python or JavaScript if it provides a faster method):
$this->tempArray = array(
'2014-10-01T00:00:00+0100' => array(),
'2014-10-02T00:00:00+0100' => array(),
'2014-10-03T00:00:00+0100' => array(),
'2014-10-04T00:00:00+0100' => array(),
(...)
'2014-10-31T00:00:00+0100' => array()
);
In the second step, I loop through each date/value pair (in this example 4*24*31, (96 per day)) and assign them to my temporary array. For each date, I override some values from the datetime object. In this example the hour and the minutes to match the keys in the temp array.
$insert = array(
'datetime' => $datetime,
'value' => $value
);
if ($interval == "d") {
$this->tempArray[date('Y-m-d\T00:00:00O', $datetime)][] = $insert;
}
In the last step, I loop through the temp array and sum each sub-array. As a result, I receive an array with 31 new date/value pairs, one per day. This works fine. However, is there a faster or more efficient way? With this approach it takes nearly 0.5 seconds for one month. (If someone is interested in the source code, I will add a gist.) The data is stored in a MySQL database with 15 million entries.
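The three steps described above can be collapsed into a single pass that adds each value straight onto a running total for its bucket, so no intermediate per-day arrays are built. A minimal Python sketch, assuming the readings arrive as (Unix timestamp, value) pairs and that UTC is an acceptable time zone; the names `sum_by_interval` and `BUCKET_FORMATS` are illustrative and not part of the original code:

```python
from collections import defaultdict
from datetime import datetime, timezone

# strftime patterns that truncate a timestamp to the start of its bucket
# (hypothetical helper table, chosen to mirror the keys of the temp array)
BUCKET_FORMATS = {
    "h": "%Y-%m-%dT%H:00:00",
    "d": "%Y-%m-%dT00:00:00",
    "m": "%Y-%m-01T00:00:00",
}

def sum_by_interval(readings, interval="d", tz=timezone.utc):
    """Sum (unix_timestamp, value) pairs per hour, day, or month."""
    fmt = BUCKET_FORMATS[interval]
    totals = defaultdict(float)
    for ts, value in readings:
        # Truncate the timestamp to its bucket key and accumulate
        key = datetime.fromtimestamp(ts, tz).strftime(fmt)
        totals[key] += value
    return dict(sorted(totals.items()))

# Four 15-minute readings: three on 2014-10-01, one on 2014-10-02 (UTC)
readings = [
    (1412121600, 1.0),  # 2014-10-01T00:00:00Z
    (1412122500, 2.0),  # 2014-10-01T00:15:00Z
    (1412125200, 3.0),  # 2014-10-01T01:00:00Z
    (1412208000, 4.0),  # 2014-10-02T00:00:00Z
]
daily = sum_by_interval(readings, "d")
hourly = sum_by_interval(readings, "h")
```

Days without any readings do not appear in the result; if they must show up as zero, they can be pre-seeded in the dictionary.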
// Edit: I think the best way is to group this with mysql.
My current SQL query to fetch data from one year:
SELECT
FROM_UNIXTIME(PointOfTime) as `date`,
value
FROM data
WHERE EnergyMeterId="0ca64479-bddf-4b91-9e35-bf81f4bfa84c"
and PointOfTime >= unix_timestamp('2013-01-01T00:00:00')
and PointOfTime <= unix_timestamp('2013-12-31T23:45:00')
order by `date` asc;

If the data lies in MySQL, then that is where I would implement my solution. It is trivial to use various MySQL date/time functions to aggregate this data. Let's take a simplistic example assuming a table structure like this:
id: autoincrement primary key
your_datetime: datetime or timestamp field
the_data: the data items you are trying to summarize
A query to summarize by day (most recent first) would look like this:
SELECT
DATE(your_datetime) as `day`,
SUM(the_data) as `data_sum`
FROM table
GROUP BY `day`
ORDER BY `day` DESC
If you wanted to limit it by some period of time (last 7 days for example) you can simply add a where condition
SELECT
DATE(your_datetime) as `day`,
SUM(the_data) as `data_sum`
FROM table
WHERE your_datetime > DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
GROUP BY `day`
ORDER BY `day` DESC
Here is another example where you specify a range of datetimes
SELECT
DATE(your_datetime) as `day`,
SUM(the_data) as `data_sum`
FROM table
WHERE your_datetime BETWEEN '2014-08-01 00:00:00' AND '2014-08-31 23:59:59'
GROUP BY `day`
ORDER BY `day` DESC
Sum by hour:
SELECT
DATE(your_datetime) as `day`,
HOUR(your_datetime) as `hour`,
SUM(the_data) as `data_sum`
FROM table
WHERE your_datetime BETWEEN '2014-08-01 00:00:00' AND '2014-08-31 23:59:59'
GROUP BY `day`, `hour`
ORDER BY `day` DESC, `hour` DESC
Sum by month:
SELECT
YEAR(your_datetime) as `year`,
MONTH(your_datetime) as `month`,
SUM(the_data) as `data_sum`
FROM table
GROUP BY `year`, `month`
ORDER BY `year` DESC, `month` DESC
Here is a reference to the MySQL Date/Time functions:
http://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html#function_date-sub

Related

Display payment made by mark every 3 month for the last 3 years

I have a table called payment with a date field, and a customer called Mark who has been making a payment every day for 3 years.
Table: Payment
Fields: Name, Amountpaid, date
I want to display the payment records made by Mark in groups of 3 months, and also the total Amountpaid for the 3 years.
How I want the result to look:
First 3 months payment record table
total Amountpaid at the bottom of the table
Second 3 months payment record table
total Amountpaid at the bottom of the table
Third 3 months payment record table
total Amountpaid at the bottom of the table
and so on for 3 years
Please help.
It seems like you're looking for a SQL solution for this, but databases are for holding data; they aren't for formatting it into a report. To this end my advice would be: don't try to do this in the database, do it in the front-end code instead.
It will be very simple to run a query like
SELECT * FROM payment WHERE
name = 'mark' and
`date` between date_sub(now(), interval 3 year) and now()
ORDER BY date
And then put the results into an HTML table using a loop and a variable that keeps track of the running Amountpaid total. Every 3 months, reset the variable. If you want MySQL to do a bit more data processing to help out, you can do this:
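The front-end loop with a resetting total could look roughly like the following. This is only a sketch in Python, assuming the rows come back ordered by date as (date, amount) pairs and that "every 3 months" means calendar quarters; `quarter_key` and `group_with_totals` are hypothetical names:

```python
from datetime import date

def quarter_key(d):
    """Return (year, quarter) with quarter in 1..4."""
    return (d.year, (d.month - 1) // 3 + 1)

def group_with_totals(payments):
    """Split date-ordered (date, amount) rows into quarters, each with its total."""
    sections = []
    current_key = None
    for paid_on, amount in payments:
        key = quarter_key(paid_on)
        if key != current_key:
            # A new quarter starts: open a section and reset the running total
            sections.append({"quarter": key, "rows": [], "total": 0})
            current_key = key
        sections[-1]["rows"].append((paid_on, amount))
        sections[-1]["total"] += amount
    return sections

# Made-up sample rows, already sorted by date
payments = [
    (date(2017, 1, 15), 10),
    (date(2017, 3, 31), 20),
    (date(2017, 4, 1), 5),
    (date(2017, 12, 31), 7),
]
sections = group_with_totals(payments)
grand_total = sum(s["total"] for s in sections)
```

Each section can then be rendered as one table with its total at the bottom, and `grand_total` gives the figure for the whole period.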
SELECT * FROM
payment
INNER JOIN
(SELECT YEAR(`date`) + (QUARTER(`date`)/10) as qd, SUM(amountpaid) as qp FROM payment WHERE name = 'mark' GROUP BY YEAR(`date`), QUARTER(`date`)) qpt
ON
qpt.qd = YEAR(`date`) + (QUARTER(`date`)/10)
WHERE
name = 'mark' AND
`date` between date_sub(now(), interval 3 year) and now()
ORDER BY `date`
This will give all of Mark's data row by row, plus two extra columns (that mostly repeat themselves over and over) showing the year and quarter (3 months) of the year, like 2017.1, 2017.2, together with a sum of all payments made in that quarter. Formatting it in the front end then won't need a variable to keep a running total of the amount paid.
This is about the limit of what you should do with formatting the data in the database (personal opinion). If, however, you're determined to have MySQL do pretty much all this, read on..
Ysth mentioned rollup, which is intended for summarising data.. such a solution would look like this:
SELECT
Name, `date`, SUM(amountpaid) as amountpaid
FROM
payment
WHERE
name = 'mark' AND
`date` between date_sub(now(), interval 3 year) and now()
GROUP BY
name,
YEAR(`date`) + (QUARTER(`date`)/10),
`date`
WITH ROLLUP
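The only downside with this approach is that you also get a totals row for all payments by Mark. To suppress that, use grouping sets instead. Note that MySQL does not support GROUPING SETS, so the following form only applies to databases that do (for example SQL Server, Oracle, or PostgreSQL, with the identifier quoting adapted):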
SELECT
Name, `date`, SUM(amountpaid) as amountpaid
FROM
payment
WHERE
name = 'mark' AND
`date` between date_sub(now(), interval 3 year) and now()
GROUP BY GROUPING SETS
(
(
name,
YEAR(`date`) + (QUARTER(`date`)/10),
`date`
),
(
name,
YEAR(`date`) + (QUARTER(`date`)/10)
)
)
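You can use a group by on the year and on the month divided by 3, truncated using floor. Subtract 1 from the month first, so that January to March fall into the same group; without it, FLOOR(month / 3) puts January and February in one group and December in a group of its own:
SELECT
EXTRACT(YEAR_MONTH FROM `date`),
SUM(`Amountpaid`)
FROM
`Payment`
WHERE
`Name` = 'Mark'
AND `date` >= DATE_SUB(NOW(), INTERVAL 3 YEAR)
GROUP BY
EXTRACT(YEAR FROM `date`),
FLOOR((EXTRACT(MONTH FROM `date`) - 1) / 3)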
For the total you will need to iterate the result set and sum up the amounts paid. If you want it as the final record, you could add a UNION SELECT; this would be inefficient, but for completeness it is below:
SELECT
EXTRACT(YEAR_MONTH FROM `date`),
SUM(`Amountpaid`)
FROM
`Payment`
WHERE
`Name` = 'Mark'
AND `date` >= DATE_SUB(NOW(), INTERVAL 3 YEAR)
GROUP BY
EXTRACT(YEAR FROM `date`),
FLOOR((EXTRACT(MONTH FROM `date`) - 1) / 3)
UNION SELECT
NULL,
SUM(`Amountpaid`)
FROM
`Payment`
WHERE
`Name` = 'Mark'
AND `date` >= DATE_SUB(NOW(), INTERVAL 3 YEAR)
This gets the summary per 3 months:
select year(date)*100+floor((month(date)-1)/3) as period, sum(amountpaid)
from payment
where name = 'mark' and (date between '2014-01-01' and '2017-01-01')
group by year(date)*100+floor((month(date)-1)/3)
order by period
And this is how to get the summary for the 3 years:
select sum(amountpaid) from payment where name = 'mark' and (date between '2014-01-01' and '2017-01-01')
You can change the dates in the between clause to suit your needs.

Average day values of last month as a SQL statement

I got a table with two columns, timestamp (like '1405184196') and value.
I've saved some measured values.
$day = time() - 86400; // 24 hours = 86400 seconds
$result = mysql_query('SELECT timestamp, value FROM table WHERE timestamp >= "'.$day.'" ORDER BY timestamp ASC');
This is how I get all values for the last 24h.
But is it possible to get average day values for the last month with a SQL statement or do I have to select all values of the last month and calculate the average of each day via PHP?
Several issues with Anish's answer:
1) It won't work if date and time are stored in the timestamp field.
2) It assumes the OP means the previous calendar month (i.e. June, May, etc.) and not, say, the last 30 days.
This solves those issues:
SELECT DATE(`timestamp`) as `timestamp`, AVG(value)
FROM table
WHERE `timestamp` >= CURDATE() - INTERVAL 1 MONTH
GROUP BY DATE(`timestamp`)
EDIT
Since the timestamp is a unix timestamp and the OP would like a calendar month:
SELECT DATE(FROM_UNIXTIME(`timestamp`)) as `timestamp`, AVG(value)
FROM table
WHERE MONTH(FROM_UNIXTIME(`timestamp`)) = MONTH(NOW() - INTERVAL 1 MONTH)
AND YEAR(FROM_UNIXTIME(`timestamp`)) = YEAR(NOW() - INTERVAL 1 MONTH)
GROUP BY DATE(FROM_UNIXTIME(`timestamp`))
You can do this:-
SELECT timestamp, AVG(value)
FROM table
GROUP BY timestamp
HAVING MONTH(timestamp) = MONTH(NOW()) - 1;
This query calculates average for last month.

How to skip other OR condition if first is matched in SELECT Query?

I am having trouble with an OR condition inside a SELECT.
I want a simple result: if one condition is matched, the remaining OR conditions should not be used.
What I want is:
I have some records shared by users, and I would like to email them the newest items shared on my website.
For me, the newest items are at least two days old.
For example, today is the 9th, so I would like to pull all records of the 7th. But if I
don't get any records for the 7th, then I would like to pull all records of the
6th (3 days before today). If I don't get any records for the 6th, then
I would like to pull the records from 1 day before today.
For all this I have used OR in my SELECT query like this:
SELECT `tg`.* FROM `tblgallery` AS `tg` WHERE (
(tg.added_date BETWEEN '2014-07-07 00:00:00' AND '2014-07-08 00:00:00') OR
(tg.added_date BETWEEN '2014-07-06 00:00:00' AND '2014-07-07 00:00:00') OR
(tg.added_date BETWEEN '2014-07-08 00:00:00' AND '2014-07-09 00:00:00') )
And I have records in my database for the dates:
2014-07-06
2014-07-07
and when I run this query it gives me all records of both dates.
But I need to pull only the records of 2014-07-07, not of both (as mentioned above).
I know I can do this by using multiple SELECTs, but I think it is not a good idea to query the database again and again.
My question is: how do I pull data from the database for the first condition that matches, and skip the data for all the remaining dates?
OR
Is there any other way to do this?
Please Help
Usually one would just work with LIMIT, which is not applicable here, since there might be many rows per day. What I do is quite similar to LIMIT.
SELECT * FROM (
SELECT
tg.*,
@gn := IF(DATE(tg.added_date) != @prev_date, @gn + 1, @gn) AS my_group_number,
@prev_date := DATE(tg.added_date)
FROM tblgallery tg
, (SELECT @gn := 0, @prev_date := CURDATE()) var_init
ORDER BY FIELD(DATE(tg.added_date), CURDATE() - INTERVAL 1 DAY, CURDATE() - INTERVAL 3 DAY, CURDATE() - INTERVAL 2 DAY) DESC
) sq
WHERE my_group_number = 1;
Here's how it works.
With this line
, (SELECT @gn := 0, @prev_date := CURDATE()) var_init
the variables are initialized.
Then the ORDER BY is important! The FIELD() function sorts the rows from 2 days ago (gets value 3), to 3 days ago (gets value 2), to 1 day ago (gets value 1). Everything else gets value 0.
Then in the SELECT clause the order is also important.
With this line
@gn := IF(DATE(tg.added_date) != @prev_date, @gn + 1, @gn) AS my_group_number,
the variable @gn is incremented when the date of the current row is different from the date of the previous row.
With this line
@prev_date := DATE(tg.added_date)
the date of the current row is assigned to the variable @prev_date. In the line above it still has the value of the previous row.
Now those entries have a 1 in column my_group_number that have the most recent date in the order
2 days ago
3 days ago
yesterday
4 days ago
5 days ago
...
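For comparison, the same preference order can be applied in application code once the candidate rows have been fetched with a single query (for instance the OR query from the question). This is a different technique from the user-variable query above, shown only as a sketch in Python; `pick_fallback_rows` is a hypothetical helper, and the rows are assumed to be dictionaries whose `added_date` is already reduced to a plain date:

```python
from datetime import date, timedelta

def pick_fallback_rows(rows, today, offsets=(2, 3, 1)):
    """Return the rows of the first preferred day that has any records.

    offsets lists how many days back to look, in order of preference:
    2 days ago first, then 3 days ago, then yesterday.
    """
    by_day = {}
    for row in rows:
        by_day.setdefault(row["added_date"], []).append(row)
    for days_back in offsets:
        wanted = today - timedelta(days=days_back)
        if wanted in by_day:
            return by_day[wanted]
    return []

# Made-up sample data matching the dates mentioned in the question
rows = [
    {"id": 1, "added_date": date(2014, 7, 6)},
    {"id": 2, "added_date": date(2014, 7, 6)},
    {"id": 3, "added_date": date(2014, 7, 7)},
]
picked = pick_fallback_rows(rows, today=date(2014, 7, 9))
```

With records on both the 6th and the 7th, only the rows of the 7th are returned; the 6th is used only when the 7th is empty.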
Try this Query:
SELECT GalleryID, PixName, A.added_date
FROM tblGallery A
INNER JOIN (
SELECT added_date FROM tblGallery
WHERE added_date <= DATE_SUB('2014-07-09 00:00:00', interval 2 day)
GROUP BY added_date
ORDER BY added_date DESC
LIMIT 1 ) B
ON A.added_date = B.added_date
See my SQL Fiddle Demo
And even if the date is more than 2 days older it will still work.
See here the Demo below wherein the latest is 4 days older from July 9, 2014
See the 2nd Demo
And if you want the current date instead of literal date like here then you could use CURDATE() function instead. Like one below:
SELECT GalleryID, PixName, A.added_date
FROM tblGallery A
INNER JOIN (
SELECT added_date FROM tblGallery
WHERE added_date <= DATE_SUB(CURDATE(), interval 2 day)
GROUP BY added_date
ORDER BY added_date DESC
LIMIT 1 ) B
ON A.added_date = B.added_date
See 3rd Demo
Well, I am not able to solve the multi-OR issue, but this is how you could get the records added in the last two days. Change the interval or the CURDATE() to fit your needs.
SELECT id, date_added
FROM gallery
WHERE date_added BETWEEN CURDATE() - INTERVAL 2 DAY AND CURDATE()
ORDER BY date_added
Check the SQL Fiddle
It is not about how OR works in MySQL.
Looking at your discussion with @B.T., I think you are misunderstanding the WHERE part.
It is evaluated for each record.
So if a record evaluates to false for the first condition, the second condition is evaluated for that particular record, and so on. If any of the conditions evaluates to true, the record becomes part of your result set.
Try this query.
SELECT `tg`.* FROM `tblgallery` AS `tg` WHERE tg.added_date = (
select date from (
select distinct(tg.added_date) date from tblgallery as tg
) as t1 order by case
when date between '2014-07-07 00:00:00' AND '2014-07-08 00:00:00'
then 1
when date between '2014-07-06 00:00:00' AND '2014-07-07 00:00:00'
then 2
when date between '2014-07-08 00:00:00' AND '2014-07-09 00:00:00'
then 3
else 4
end limit 1);
Here's what I am doing in this query.
I am getting all the distinct dates.
Then I order them by which condition they match, i.e. 1 if the first condition is true, 2 if the second is true, and so on.
I limit the result to 1, so after the ordering the first row is selected; it is a date, and it is used in the outer condition.
Note: I have not tested it yet, so you may need to make some changes to the query.

Very specific MySQL query I want to improve

This is my scenario: I have a table that contains events, and every event has a field called 'created' with the timestamp at which that event was created. Now I need to sort the events from newest to oldest, but I do not want MySQL to return them all. I need only the latest one in a given interval, for example in a range of 24 hours (EDIT: I'd like to have a flexible solution, not only for a 24-hour range, but maybe every few hours). And I only need this for the last 10 days. I have achieved that, but I'm sure in the most inefficient way possible, with something like this:
$timestamp = time();
for($i = 0; $i < 10; $i++) {
$query = "SELECT * FROM `eventos` WHERE ... AND `created` < '{$timestamp}' ORDER BY `created` DESC LIMIT 1";
$return = $database->query( $query );
if($database->num( $return ) > 0) {
$event = $database->fetch( $return );
$events[] = $event;
$timestamp = $timestamp - 86400;
}
}
I hope I was clear enough. Thanks,
Jesús.
If you have an index with created as the leading column, MySQL may be able to do a reverse scan. If you have a 24 hour period that doesn't have any events, you could be returning a row that is NOT from that period. To make sure you're getting a row in that period, you would really need to include a lower bound on the created column as well, something like this:
SELECT * FROM `eventos`
WHERE ...
AND `created` < FROM_UNIXTIME( {$timestamp} )
AND `created` >= DATE_ADD(FROM_UNIXTIME( {$timestamp} ),INTERVAL -24 HOUR)
ORDER BY `created` DESC
LIMIT 1
I think the big key to performance here is an index with created as the leading column, along with all (or most) of the other columns referenced in the WHERE clause, and making sure that index is used by your query.
If you need a different time interval, down to the second, this approach could be easily generalized.
SELECT * FROM `eventos`
WHERE ...
AND `created` < DATE_ADD(FROM_UNIXTIME({$timestamp}),INTERVAL 0*{$nsecs} SECOND)
AND `created` >= DATE_ADD(FROM_UNIXTIME({$timestamp}),INTERVAL -1*{$nsecs} SECOND)
ORDER BY `created` DESC
LIMIT 1
From your code, it looks like the 24-hour periods are bounded at an arbitrary time... if the time function returns e.g. 1341580800 ('2012-07-06 13:20'), then your ten periods would all be from 13:20 on a given day to 13:20 the following day.
(NOTE: be sure that if your parameter is a unix timestamp integer, that this is being interpreted correctly by the database.)
It might be more efficient to pull the ten rows in a single query. If there is a guarantee that 'timestamp' is unique, then it's possible to craft such a query, but the query text will be considerably more complex than what you have now. We could mess with getting MAX(timestamp_) within each period, and then joining that back to get the row... but that's going to be really messy.
If I were going to try to pull all ten rows I would probably try going with a UNION ALL approach; not very pretty, but at least it could be tuned.
SELECT p0.*
FROM ( SELECT * FROM `eventos` WHERE ...
AND `created` < DATE_ADD(FROM_UNIXTIME({$timestamp}),INTERVAL 0*24 HOUR)
AND `created` >= DATE_ADD(FROM_UNIXTIME({$timestamp}),INTERVAL -1*24 HOUR)
ORDER BY `created` DESC LIMIT 1
) p0
UNION ALL
SELECT p1.*
FROM ( SELECT * FROM `eventos` WHERE ...
AND `created` < DATE_ADD(FROM_UNIXTIME({$timestamp}),INTERVAL -1*24 HOUR)
AND `created` >= DATE_ADD(FROM_UNIXTIME({$timestamp}),INTERVAL -2*24 HOUR)
ORDER BY `created` DESC LIMIT 1
) p1
UNION ALL
SELECT p2.*
FROM ( SELECT * FROM `eventos` WHERE ...
AND `created` < DATE_ADD(FROM_UNIXTIME({$timestamp}),INTERVAL -2*24 HOUR)
AND `created` >= DATE_ADD(FROM_UNIXTIME({$timestamp}),INTERVAL -3*24 HOUR)
ORDER BY `created` DESC LIMIT 1
) p2
UNION ALL
SELECT p3.*
FROM ...
Again, this could be generalized, to pass in a number of seconds as an argument. Replace HOUR with SECOND, and replace the '24' with a bind parameter that has a number of seconds.
It's rather long winded, but it should run okay.
Another really messy and complicated way to get this back in a single result set would be to use an inline view to get the end timestamp for the ten periods, something like this:
SELECT p.period_end
FROM (SELECT DATE_ADD(t.t_,INTERVAL -1 * i.i_* {$nsecs} SECOND) AS period_end
FROM (SELECT FROM_UNIXTIME( {$timestamp} ) AS t_) t
JOIN (SELECT 0 AS i_
UNION ALL SELECT 1
UNION ALL SELECT 2
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 5
UNION ALL SELECT 6
UNION ALL SELECT 7
UNION ALL SELECT 8
UNION ALL SELECT 9
) i
) p
And then join that to your table ...
ON `created` < p.period_end
AND `created` >= DATE_ADD(p.period_end,INTERVAL -1 * {$nsecs} SECOND)
And pull back MAX(created) for each period GROUP BY p.period_end, wrap that in an inline view.
And then join that back to your table to get each row.
But that is really, really messy, hard to understand, and not likely to be any faster (or more efficient) than what you are already doing. The most improvement you could make is the time it takes to run 9 of your queries.
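If the number of rows in the ten periods is modest, another option is to fetch them all with one range query and pick the latest row per period in application code. This is not what the answer above proposes, only a sketch in Python under that assumption; `latest_per_period` is a hypothetical helper and `created` is assumed to be an integer Unix timestamp:

```python
def latest_per_period(events, end_ts, period_secs, periods=10):
    """Keep the newest event in each of `periods` windows ending at end_ts.

    Window i covers [end_ts - (i + 1) * period_secs, end_ts - i * period_secs),
    the same half-open bounds used in the queries above.
    """
    latest = {}
    start_ts = end_ts - periods * period_secs
    for event in events:
        created = event["created"]
        if not (start_ts <= created < end_ts):
            continue  # outside the ten periods
        index = (end_ts - 1 - created) // period_secs
        if index not in latest or created > latest[index]["created"]:
            latest[index] = event
    # Window 0 is the most recent, so this is newest first
    return [latest[i] for i in sorted(latest)]

end_ts = 1341580800  # the example timestamp mentioned above
day = 86400
# Made-up sample events
events = [
    {"id": 1, "created": end_ts - 10},
    {"id": 2, "created": end_ts - 5000},
    {"id": 3, "created": end_ts - day - 1},
    {"id": 4, "created": end_ts - 3 * day},
    {"id": 5, "created": end_ts - 11 * day},
]
result = latest_per_period(events, end_ts, day)
```

Periods without any event simply produce no row, which matches the behaviour of the bounded queries.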
Assuming you want the latest (having the greatest created date) event per day for the last 10 days.
so let's get the latest timestamp per day
$today = date('Y-m-d');
$tenDaysAgo = date('Y-m-d', strtotime('-10 day'));
$innerSql = "SELECT date_format(created, '%Y-%m-%d') day, MAX(created) max_created FROM eventos WHERE date_format(created, '%Y-%m-%d') BETWEEN '$tenDaysAgo' AND '$today' GROUP BY date_format(created, '%Y-%m-%d')";
Then we can select all the events that match those created dates
$outerSql = "SELECT * FROM eventos INNER JOIN ($innerSql) as A ON eventos.created = A.max_created";
I haven't had a chance to test this, but the principles should be sound enough.
If you want to group by some other arbitrary number of hours you would change innerSql:
$fromDate = '2012-07-06'; // or if you want a specific time '2012-07-06 12:00:00'
$intervalInHours = 5;
$numberOfIntervals = 10;
$innerSql = "SELECT FLOOR(TIMESTAMPDIFF(HOUR, created, '$fromDate') / $intervalInHours) as interval_group, MAX(created) as max_created FROM eventos WHERE created BETWEEN DATE_SUB('$fromDate', INTERVAL ($intervalInHours * $numberOfIntervals) HOUR) AND '$fromDate' GROUP BY FLOOR(TIMESTAMPDIFF(HOUR, created, '$fromDate') / $intervalInHours)";
I'd add another column that is the date(not time) and then use MySQL "group by" to get the most recent for each date.
http://www.tizag.com/mysqlTutorial/mysqlgroupby.php/
This tutorial does just that, but by product type instead of date. This should help!
Do you want all of the events within the 10 days, or just one event per day within the 10 day period?
Either way, consider MySQL's date functions for assistance. It should help you get the date range you want.
Here's one that will get you the first event of the day for the last 10 days.
SELECT *
FROM eventos
WHERE created BETWEEN DATE_SUB(DATE(NOW()), INTERVAL 10 DAY) AND DATE_ADD(DATE(NOW()), INTERVAL 1 DAY)
GROUP BY DATE(created)
ORDER BY MAX(created) DESC
LIMIT 10
Try this:
SELECT *
FROM eventos
WHERE created BETWEEN DATE_SUB(DATE(NOW()), INTERVAL 10 DAY) AND DATE_ADD(DATE(NOW()), INTERVAL 1 DAY)
ORDER BY created DESC
LIMIT 10

How to minimize the load in queries that need grouping with different intervals?

I'm looking for a best practice advice how to speed up queries and at the same time to minimize the overhead needed to invoke date/mktime functions. To trivialize the problem I'm dealing with the following table layout:
CREATE TABLE my_table(
id INTEGER PRIMARY KEY NOT NULL AUTO_INCREMENT,
important_data INTEGER,
date INTEGER);
The user can choose to show 1) all entries between two dates:
SELECT * FROM my_table
WHERE date >= ? AND date <= ?
ORDER BY date DESC;
Output:
10-21-2009 12:12:12, 10002
10-21-2009 14:12:12, 15002
10-22-2009 14:05:01, 20030
10-23-2009 15:23:35, 300
....
I don't think there is much to improve in this case.
2) Summarize/group the output by day, week, month, year:
SELECT COUNT(*) AS count, SUM(important_data) AS important_data
FROM my_table
WHERE date >= ? AND date <= ?
ORDER BY date DESC;
Example output by month:
10-2009, 100002
11-2009, 200030
12-2009, 3000
01-2010, 0 /* <- very important to show empty dates, with no entries in the table! */
....
To accomplish option 2) I'm currently running a very costly for-loop with mktime/date like the following:
for(...){ /* example for group by day */
$span_from = (int)mktime(0, 0, 0, date("m", $time_min), date("d", $time_min)+$i, date("Y", $time_min));
$span_to = (int)mktime(0, 0, 0, date("m", $time_min), date("d", $time_min)+$i+1, date("Y", $time_min));
$query = "..";
$output = date("m-d-y", ..);
}
What are my ideas so far? Add additional, redundant columns (INTEGER) for day (20091212), month (200912), week (200942) and year (2009). This way I can get rid of all the unnecessary queries in the for loop. However, I'm still facing the problem of quickly calculating all the dates that don't have any equivalent in the database. One way to simply move the problem could be to let MySQL do the job and use one big query (calculate all the dates with MySQL date functions) with a left join (the data). Would it be wise to let MySQL take the extra load? Either way, I'm reluctant to use all these mktime/date calls in the for loop. Since I have complete control over the table layout and code, even suggestions with major changes are welcome!
Update
Thanks to Greg I came up with the following SQL query. However, it still bugs me to use 50 lines of SQL statements, built up with PHP, that could maybe be done faster and more elegantly another way:
SELECT * FROM (
SELECT DATE_ADD('2009-01-30', INTERVAL 0 DAY) AS day UNION ALL
SELECT DATE_ADD('2009-01-30', INTERVAL 1 DAY) AS day UNION ALL
SELECT DATE_ADD('2009-01-30', INTERVAL 2 DAY) AS day UNION ALL
SELECT DATE_ADD('2009-01-30', INTERVAL 3 DAY) AS day UNION ALL
......
SELECT DATE_ADD('2009-01-30', INTERVAL 50 DAY) AS day ) AS dates
LEFT JOIN (
SELECT DATE_FORMAT(date, '%Y-%m-%d') AS date, SUM(data) AS data
FROM test
GROUP BY date
) AS results
ON DATE_FORMAT(dates.day, '%Y-%m-%d') = results.date;
You definitely shouldn't be doing a query inside a loop.
You can group like this:
SELECT COUNT(*) AS count, SUM(important_data) AS important_data, DATE_FORMAT(FROM_UNIXTIME(date), '%Y-%m') AS month
FROM my_table
WHERE date BETWEEN ? AND ? -- This should be the min and max of the whole range
GROUP BY DATE_FORMAT(FROM_UNIXTIME(date), '%Y-%m') -- FROM_UNIXTIME assumes date holds a Unix timestamp, as in the question's INTEGER column
ORDER BY month DESC;
Then pull these into an array keyed by date and loop over your data range as you are doing (that loop should be pretty light on CPU).
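The "array keyed by date, then loop over the range" step might be sketched like this. The example is in Python rather than PHP, the helper names `month_range` and `fill_missing_months` are hypothetical, and the counts in the sample data are made up (only the sums come from the example output in the question):

```python
def month_range(start, end):
    """Yield 'YYYY-MM' keys from start to end inclusive; both are (year, month)."""
    year, month = start
    while (year, month) <= end:
        yield f"{year:04d}-{month:02d}"
        month += 1
        if month > 12:
            year, month = year + 1, 1

def fill_missing_months(grouped, start, end):
    """Return one (month, count, total) row per month, using zeros for gaps."""
    return [(key, *grouped.get(key, (0, 0))) for key in month_range(start, end)]

# Result of the grouped query, keyed by month: month -> (count, sum)
grouped = {
    "2009-10": (3, 100002),
    "2009-11": (2, 200030),
    "2009-12": (1, 3000),
}
rows = fill_missing_months(grouped, (2009, 10), (2010, 1))
```

Here January 2010 has no entry in the query result, so it is emitted with a count and sum of zero. The loop only generates keys and does dictionary lookups, so no per-period query or mktime call is needed.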
Another idea is not to use string inside the query. Transform the string parameter to datetime, on mysql.
STR_TO_DATE(str,format)
http://dev.mysql.com/doc/refman/5.0/en/date-and-time-functions.html
