How to Efficiently use SQL to Retrieve Data on Half Hour Intervals?


Problem - Retrieve sum of subtotals on a half hour interval efficiently
I am using MySQL and I have a table containing subtotals with different times. I want to retrieve the sum of these sales on a half hour interval from 7 am through 12 am. My current solution (below) works but takes 13 seconds to query about 150,000 records. I intend to have several million records in the future and my current method is too slow.
How can I make this more efficient or, if possible, replace the PHP component with pure SQL? Also, would your solution be even more efficient if I used Unix timestamps instead of separate date and time columns?
Table Name - Receipts
subtotal  date        time      sale_id
--------------------------------------------
6         09/10/2011  07:20:33  1
5         09/10/2011  07:28:22  2
3         09/10/2011  07:40:00  3
5         09/10/2011  08:05:00  4
8         09/10/2011  08:44:00  5
...............
10        09/10/2011  18:40:00  6
5         09/10/2011  23:05:00  7
Desired Result
An array like this:
Half hour 1 ::: (7:00 to 7:30) => Sum of Subtotal is 11
Half hour 2 ::: (7:30 to 8:00) => Sum of Subtotal is 3
Half hour 3 ::: (8:00 to 8:30) => Sum of Subtotal is 5
Half hour 4 ::: (8:30 to 9:00) => Sum of Subtotal is 8
Current Method
The current way uses a for loop which starts at 7 am and increments 1800 seconds, equivalent to a half hour. As a result, this makes about 34 queries to the database.
for ($n = strtotime("07:00:00"), $e = strtotime("23:59:59"); $n <= $e; $n += 1800) {
    $timeA = date("H:i:s", $n);
    $timeB = date("H:i:s", $n + 1799);
    $query = $mySQL->query("SELECT SUM(subtotal)
                            FROM Receipts WHERE time > '$timeA'
                            AND time < '$timeB'");
    while ($row = $query->fetch_object()) {
        $sum[] = $row;
    }
}
Current Output
Output is just an array where:
[0] represents 7 am to 7:30 am
[1] represents 7:30 am to 8:00 am
[33] represents 11:30 pm to 11:59:59 pm.
array ("0" => 10000,
"1" => 20000,
..............
"33" => 5000);

You can try this single query as well; it should return a result set with the totals in 30-minute groupings:
SELECT date, MIN(time) as time, SUM(subtotal) as total
FROM `Receipts`
WHERE `date` = '2012-07-30'
GROUP BY hour(time), floor(minute(time)/30)
To run this efficiently, add a composite index on the date and time columns.
You should get back a result set like:
+---------------------+--------------------+
| time | total |
+---------------------+--------------------+
| 2012-07-30 00:00:00 | 0.000000000 |
| 2012-07-30 00:30:00 | 0.000000000 |
| 2012-07-30 01:00:00 | 0.000000000 |
| 2012-07-30 01:30:00 | 0.000000000 |
| 2012-07-30 02:00:00 | 0.000000000 |
| 2012-07-30 02:30:00 | 0.000000000 |
| 2012-07-30 03:00:00 | 0.000000000 |
| 2012-07-30 03:30:00 | 0.000000000 |
| 2012-07-30 04:00:00 | 0.000000000 |
| 2012-07-30 04:30:00 | 0.000000000 |
| 2012-07-30 05:00:00 | 0.000000000 |
| ...
+---------------------+--------------------+

First, I would use a single DATETIME column, but using a DATE and TIME column will work.
You can do all the work in one pass using a single query:
select date,
hour(`time`) hour_num,
IF(MINUTE(`time`) < 30, 0, 1) interval_num,
min(`time`) interval_begin,
max(`time`) interval_end,
sum(subtotal) sum_subtotal
from receipts
where date='2012-07-31'
group by date, hour_num, interval_num;

UPDATE:
Since you aren't concerned with any "missing" rows, I'm also going to assume (probably wrongly) that you aren't concerned that the query might possibly return rows for periods that are not from 7AM to 12AM. This query will return your specified result set:
SELECT (HOUR(r.time)-7)*2+(MINUTE(r.time) DIV 30) AS i
, SUM(r.subtotal) AS sum_subtotal
FROM Receipts r
GROUP BY i
ORDER BY i
This returns the period index (i) derived from an expression referencing the time column. For best performance of this query, you probably want to have a "covering" index available, for example:
ON Receipts(`time`,`subtotal`)
If you are going to include an equality predicate on the date column (which does not appear in your solution, but which does appear in the solution of the "selected" answer), then it would be good to have that column as the leading column in the "covering" index.
ON Receipts(`date`,`time`,`subtotal`)
If you want to ensure that you are not returning any rows for periods before 7AM, then you could simply add a HAVING i >= 0 clause to the query. (Rows for periods before 7AM would generate a negative number for i.)
SELECT (HOUR(r.time)-7)*2+(MINUTE(r.time) DIV 30) AS i
, SUM(r.subtotal) AS sum_subtotal
FROM Receipts r
GROUP BY i
HAVING i >= 0
ORDER BY i
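To see what the period index expression does, here is a minimal sketch in Python (used only as a stand-in, since it runs without a MySQL server). The function names half_hour_index and sum_by_half_hour are made up for illustration; the rows are the sample rows from the question.

```python
from collections import defaultdict
from datetime import time


def half_hour_index(t: time, first_hour: int = 7) -> int:
    """Mirror (HOUR(t) - 7) * 2 + (MINUTE(t) DIV 30) from the query above."""
    return (t.hour - first_hour) * 2 + t.minute // 30


def sum_by_half_hour(rows):
    """rows: iterable of (subtotal, time). Returns {index: total} for index >= 0."""
    totals = defaultdict(int)
    for subtotal, t in rows:
        i = half_hour_index(t)
        if i >= 0:  # plays the same role as HAVING i >= 0
            totals[i] += subtotal
    return dict(sorted(totals.items()))


# Sample rows from the question: (subtotal, time)
sample = [
    (6, time(7, 20, 33)),
    (5, time(7, 28, 22)),
    (3, time(7, 40, 0)),
    (5, time(8, 5, 0)),
    (8, time(8, 44, 0)),
]

print(sum_by_half_hour(sample))  # {0: 11, 1: 3, 2: 5, 3: 8}
```

The totals match the "Desired Result" in the question. Periods with no rows simply do not appear, which is the "missing rows" behavior discussed in this answer.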
PREVIOUSLY:
I've assumed that you want a result set similar to the one you are currently returning, but in one fell swoop. This query will return the same 34 rows you are currently retrieving, but with an extra column identifying the period (0 - 33). This is as close to your current solution as I could get:
SELECT t.i
, IFNULL(SUM(r.subtotal),0) AS sum_subtotal
FROM (SELECT (d1.i + d2.i + d4.i + d8.i + d16.i + d32.i) AS i
, ADDTIME('07:00:00',SEC_TO_TIME((d1.i+d2.i+d4.i+d8.i+d16.i+d32.i)*1800)) AS b_time
, ADDTIME('07:30:00',SEC_TO_TIME((d1.i+d2.i+d4.i+d8.i+d16.i+d32.i)*1800)) AS e_time
FROM (SELECT 0 i UNION ALL SELECT 1) d1 CROSS
JOIN (SELECT 0 i UNION ALL SELECT 2) d2 CROSS
JOIN (SELECT 0 i UNION ALL SELECT 4) d4 CROSS
JOIN (SELECT 0 i UNION ALL SELECT 8) d8 CROSS
JOIN (SELECT 0 i UNION ALL SELECT 16) d16 CROSS
JOIN (SELECT 0 i UNION ALL SELECT 32) d32
HAVING i <= 33
) t
LEFT
JOIN Receipts r ON r.time >= t.b_time AND r.time < t.e_time
GROUP BY t.i
ORDER BY t.i
Some important notes:
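It looks like your current solution may be "missing" rows from Receipts whenever the time falls exactly on a period boundary, that is, whenever the seconds value is exactly '59' or '00' at the end or start of a half hour (for example 07:29:59 or 07:30:00).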
It also looks like you aren't concerned with the date component, you are just getting a single value for all dates. (I may have misread that.) If so, the separation of the DATE and TIME columns helps with this, because you can reference the bare TIME column in your query.
It's easy to add a WHERE clause on the date column. e.g. to get the subtotal rollups for just a single day e.g. add a WHERE clause before the GROUP BY.
WHERE r.date = '2011-09-10'
A covering index ON Receipts(time,subtotal) (if you don't already have a covering index) may help with performance. (If you include an equality predicate on the date column, as in the WHERE clause above, the most suitable covering index would likely be ON Receipts(date,time,subtotal).)
I've made an assumption that the time column is of datatype TIME. (If it isn't, then a small adjustment to the query, in the inline view aliased as t, is probably called for, so that the datatype of the derived b_time and e_time columns matches the datatype of the time column in Receipts.)
Some of the proposed solutions in other answers are not guaranteed to return 34 rows when there are no rows in Receipts within a given time period. "Missing rows" may not be an issue for you, but it is a frequent issue with timeseries and timeperiod data.
I've made the assumption that you would prefer to have a guarantee of 34 rows returned. The query above returns a subtotal of zero when no rows are found matching a time period. (I note that your current solution will return a NULL in that case. I've gone and wrapped that SUM aggregate in an IFNULL function, so that it will return a 0 when the SUM is NULL.)
So, the inline query aliased as t is an ugly mess, but it works fast. What it's doing is generating 34 rows, with distinct integer values 0 thru 33. At the same time, it derives a "begin time" and an "end time" that will be used to "match" each period to the time column on the Receipts table.
We take care not to wrap the time column from the Receipts table in any functions, but reference just the bare column. And we want to ensure we don't have any implicit conversion going on (which is why we want the datatypes of b_time and e_time to match the time column). The ADDTIME and SEC_TO_TIME functions both return TIME datatype. (We can't get around doing the matching and the GROUP BY operations.)
The "end time" value for that last period is returned as "24:00:00", and we verify that this is a valid time for matching by running this test:
SELECT MAKETIME(23,59,59) < MAKETIME(24,0,0)
which is successful (returns a 1) so we're good there.
The derived columns (t.b_time and t.e_time) could be included in the resultset as well, but they aren't needed to create your array, and it's (likely) more efficient if you don't include them.
And one final note: for optimal performance, it may be beneficial to load the inline view aliased as t into an actual table (a temporary table would be fine), and then reference that table in place of the inline view. The advantage of doing that is that you could create an index on that table.
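The "generate every period, then LEFT JOIN" idea can be sketched outside the database as well. The following Python is only an illustration of the logic (the names period_bounds and rollup are invented here, and the nested scan is deliberately naive); it is not a replacement for the query.

```python
from datetime import timedelta

HALF_HOUR = timedelta(minutes=30)
START = timedelta(hours=7)  # 07:00:00 expressed as an offset from midnight


def period_bounds(n_periods=34):
    """Yield (i, begin, end) like the inline view t: 07:00 plus i half hours."""
    for i in range(n_periods):
        begin = START + i * HALF_HOUR
        yield i, begin, begin + HALF_HOUR


def rollup(rows, n_periods=34):
    """rows: list of (subtotal, time of day as timedelta).
    Returns one total per period; periods without rows stay at 0."""
    totals = [0] * n_periods
    for i, begin, end in period_bounds(n_periods):
        # begin inclusive, end exclusive, as in the LEFT JOIN condition
        totals[i] = sum(s for s, t in rows if begin <= t < end)
    return totals


rows = [
    (6, timedelta(hours=7)),                            # exactly 07:00:00
    (5, timedelta(hours=7, minutes=29, seconds=59)),
    (3, timedelta(hours=7, minutes=30)),                # exactly 07:30:00
    (4, timedelta(hours=23, minutes=59, seconds=59)),
    (9, timedelta(hours=6)),                            # before 07:00, ignored
]
totals = rollup(rows)
print(len(totals), totals[0], totals[1], totals[33])
```

Every period index from 0 to 33 is present in the output, and rows that sit exactly on a boundary land in exactly one period.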

One way to make it pure SQL is to use a lookup table. I don't know MySQL that well, so there may be a lot of room for improvement in the code. All my code is for MS SQL Server.
I would do it something like this:
/* Mock salesTable */
DECLARE @SalesTable TABLE (SubTotal int, SaleDate datetime)
INSERT INTO @SalesTable (SubTotal, SaleDate) VALUES (1, '2012-08-01 12:00')
INSERT INTO @SalesTable (SubTotal, SaleDate) VALUES (2, '2012-08-01 12:10')
INSERT INTO @SalesTable (SubTotal, SaleDate) VALUES (3, '2012-08-01 12:15')
INSERT INTO @SalesTable (SubTotal, SaleDate) VALUES (4, '2012-08-01 12:30')
INSERT INTO @SalesTable (SubTotal, SaleDate) VALUES (5, '2012-08-01 12:35')
INSERT INTO @SalesTable (SubTotal, SaleDate) VALUES (6, '2012-08-01 13:00')
INSERT INTO @SalesTable (SubTotal, SaleDate) VALUES (7, '2012-08-01 14:00')
/* input data */
DECLARE @From datetime, @To datetime, @intervall int
SET @From = '2012-08-01'
SET @To = '2012-08-02'
SET @intervall = 30
/* Create lookup table */
DECLARE @lookup TABLE (StartTime datetime, EndTime datetime)
DECLARE @tmpTime datetime
SET @tmpTime = @From
WHILE (@tmpTime <= @To)
BEGIN
    INSERT INTO @lookup (StartTime, EndTime) VALUES (@tmpTime, DATEADD(mi, @intervall, @tmpTime))
    SET @tmpTime = DATEADD(mi, @intervall, @tmpTime)
END
/* Get data */
SELECT l.StartTime, l.EndTime, SUM(SubTotal)
FROM @SalesTable AS SalesTable
JOIN @lookup AS l ON SalesTable.SaleDate >= l.StartTime AND SalesTable.SaleDate < l.EndTime
GROUP BY l.StartTime, l.EndTime

In my query, I'm assuming one datetime field named date. This will give you all the groups starting at whatever datetime you give it to start with:
SELECT
ABS(FLOOR(TIMESTAMPDIFF(MINUTE, date, '2011-08-01 00:00:00') / 30)) AS GROUPING
, SUM(subtotal) AS subtotals
FROM
Receipts
GROUP BY
ABS(FLOOR(TIMESTAMPDIFF(MINUTE, date, '2011-08-01 00:00:00') / 30))
ORDER BY
GROUPING

Always use the proper datatypes for your data. In the case of your date/time columns, it's best to store them as (preferably UTC) timestamps. This is especially true because some times don't exist on some dates in some time zones (hence UTC). You will want an index on this column.
Also, your date/time range isn't going to give you what you want - namely, you're missing anything exactly on the hour (because you use a strict greater-than comparison). Always define ranges as 'lower-bound inclusive, upper-bound exclusive' (so, time >= '07:00:00' AND time < '07:30:00'). This is especially important for timestamps, which have an additional number of fields to deal with.
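A small check of that boundary rule, written in Python purely for illustration (the helper names and the two example slots are assumptions of this sketch, not part of the original code):

```python
from datetime import time


def strict_match(t, lo, hi):
    """The question's filter: time > lo AND time < hi, with hi = lo + 1799 seconds."""
    return lo < t < hi


def half_open_match(t, lo, hi):
    """Lower bound inclusive, upper bound exclusive."""
    return lo <= t < hi


# Two consecutive half-hour slots, expressed both ways
SLOTS_STRICT = [(time(7, 0, 0), time(7, 29, 59)), (time(7, 30, 0), time(7, 59, 59))]
SLOTS_HALF_OPEN = [(time(7, 0, 0), time(7, 30, 0)), (time(7, 30, 0), time(8, 0, 0))]


def slots_hit(t, slots, match):
    """Count how many slots a given time falls into."""
    return sum(1 for lo, hi in slots if match(t, lo, hi))


boundary_times = [time(7, 0, 0), time(7, 29, 59), time(7, 30, 0)]
for t in boundary_times:
    print(t, slots_hit(t, SLOTS_STRICT, strict_match),
          slots_hit(t, SLOTS_HALF_OPEN, half_open_match))
```

With the strict comparisons each of these boundary times matches no slot at all, while the half-open ranges assign each of them to exactly one slot.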
Because MySQL doesn't have recursive queries (recursive common table expressions only arrived in MySQL 8.0), you're going to want a couple of extra tables to pull this off. I'm referencing them as 'permanent' tables, but it would certainly be possible to define them in-line, if necessary.
You're going to want a Calendar table. These are useful for a number of reasons, but here we want them for their listing of dates. This will allow us to show dates that have subtotals of 0, if necessary. You're also going to want a Clock table listing times in half-hour increments, for the same reasons.
This should allow you to query your data like so:
SELECT division, COALESCE(SUM(subtotal), 0)
FROM (SELECT TIMESTAMP(calendar_date, clock_time) as division
FROM Calendar
CROSS JOIN Clock
WHERE calendar_date >= DATE('2011-09-10')
AND calendar_date < DATE('2011-09-11')) as divisions
LEFT JOIN Sales_Data
ON occurredAt >= division
AND occurredAt < division + INTERVAL 30 MINUTE
GROUP BY division
(Working example on SQLFiddle, which uses a regular JOIN for brevity)

I found a different solution too and am posting it here for reference should anyone stumble upon this. It groups by half hour intervals.
SELECT SUM(total), time, date
FROM tableName
GROUP BY (2*HOUR(time) + FLOOR(MINUTE(time)/30))
Link for more info
http://www.artfulsoftware.com/infotree/queries.php#106

Related

How to select a corresponding column in mysql for a MAX DATE grouped by DATE

mysql table: stats
columns:
date | stats
05-05-2015 22:25:00 | 78
05-05-2015 09:25:00 | 21
05-05-2015 05:25:00 | 25
05-04-2015 09:25:00 | 29
05-04-2015 05:25:00 | 15
sql query:
SELECT MAX(date) as date, stats FROM stats GROUP BY date(date) ORDER BY date DESC
When I do this, it does select one row per date (grouped by date, regardless of the time), and it selects the largest date with MAX, but it does not select the corresponding stats value.
For example, it returns 05-05-2015 22:25:00 as the date, and 25 as the stats. It should be selecting 78 as the stats. I've done my research and it seems solutions to this are out there, but I am not familiar with JOIN or other less common MySQL features needed to achieve this, and it's hard for me to understand other examples/solutions, so I decided to post my own specific scenario.

This question is asked every single day in SO. Sometimes, it's correctly answered too. Anyway, purists won't like it but here's one option:
Select x.* from stats x join (SELECT MAX(date) max_date FROM stats GROUP BY date(date)) y on y.max_date = x.date;
Obviously, for this to work dates need to be stored using a datetime data type.
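For anyone who wants to try the join without a MySQL server, here is a rough sketch using Python's built-in sqlite3 module. SQLite is only a stand-in here (its date() function happens to behave like MySQL's DATE() for this purpose), and the sample dates are rewritten in ISO format on the assumption that 05-04-2015 means 4 May 2015.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stats (date TEXT, stats INTEGER)")
conn.executemany(
    "INSERT INTO stats VALUES (?, ?)",
    [
        ("2015-05-05 22:25:00", 78),
        ("2015-05-05 09:25:00", 21),
        ("2015-05-05 05:25:00", 25),
        ("2015-05-04 09:25:00", 29),
        ("2015-05-04 05:25:00", 15),
    ],
)

# Inner query: latest timestamp per calendar day.
# Outer join: fetch the full row that carries that timestamp.
latest_per_day = conn.execute(
    """
    SELECT x.date, x.stats
    FROM stats x
    JOIN (SELECT MAX(date) AS max_date FROM stats GROUP BY date(date)) y
      ON y.max_date = x.date
    ORDER BY x.date DESC
    """
).fetchall()

print(latest_per_day)
# [('2015-05-05 22:25:00', 78), ('2015-05-04 09:25:00', 29)]
```

The stats value now comes from the same row as the maximum date, which is what the original GROUP BY query could not guarantee.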

Optimising PHP/mysql algorithm

I have to compute some statistics for my application, so I need an algorithm that performs as well as possible. I have several questions.
I have a data structure like this in the mysql database:
user_id group_id date
1 5 2012-11-20
1 2 2012-11-01
1 4 2012-11-01
1 3 2012-10-15
1 9 2013-01-18
...
So I need to find the group of a given user at a specific date. For example, the group of user 1 at date 2012-11-15 (15 November 2012) should be the most recent group, which is 2 and 4 (several groups at the same time) at date 2012-11-01 (the closest earlier date).
Normally, I could do a SELECT where date <= chosen date ORDER BY date DESC, etc., but that's not the point, because if I have 1000 users it will need 1000 requests to get all the results.
So here are some questions:
I am already using PHP to loop through the array to avoid the high number of MySQL requests, but it's still not good because the array size may be 10000+. Using a foreach (or for?) is quite costly.
So my question is: given an array ordered by date (desc or asc), what's the fastest way to find the closest index of the element that contains a date smaller (or greater) than a given date, besides using a for or foreach loop over every element?
If there is no solution for the first question, then what kind of data structure would you suggest for this kind of problem?
Note: the date is in MySQL format; it is not converted to a timestamp when it is stored in the array.
EDIT: this is a sql fiddle http://sqlfiddle.com/#!2/dc28d/1
For dos_id = 6, t="2012-11-01" it should return only 2 and 5 at date "2010-12-10 13:16:58"

Not sure why you'd want to do this in php. Here's some SQL using joins instead to get most recent group(s) for all users given a date. Make sure you've got indexes on date and userid.
SELECT *
FROM test t1
LEFT JOIN test t2
ON t1.userid = t2.userid AND t2.thedate <= '2012-11-15' AND t2.thedate > t1.thedate
WHERE t1.thedate <= '2012-11-15' AND t2.userid IS NULL;
SQLfiddle
Or using your SQLFiddle
SELECT t1.*
FROM dossier_dans_groupe t1
LEFT JOIN dossier_dans_groupe t2
ON t1.dos_id = t2.dos_id AND t2.updated_at <= '2012-11-01'
AND t2.updated_at > t1.updated_at
WHERE t1.updated_at <= '2012-11-01' AND t2.dos_id IS NULL;

This would give you a list of all users and their groups (1 row per group) for the latest date that is smaller than the one you specify (2012-11-15 below).
SELECT user_id, group_id, date FROM table WHERE date <= '2012-11-15' AND NOT EXISTS (SELECT 1 FROM table test WHERE test.user_id = table.user_id AND test.date > table.date and test.date <= '2012-11-15')
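On the array side of the question (finding the closest earlier date in a sorted array without scanning every element), the usual tool is a binary search. Here is a minimal sketch in Python using the standard bisect module; the parallel lists dates and groups and the helper names are assumptions made for this example, and the same idea can be hand-written in PHP.

```python
from bisect import bisect_right


def latest_on_or_before(sorted_dates, target):
    """Index of the last element <= target, or -1 if there is none.
    sorted_dates must be ascending; 'YYYY-MM-DD' strings compare correctly
    as plain strings, so no timestamp conversion is needed."""
    return bisect_right(sorted_dates, target) - 1


# Rows for user 1 from the question, sorted by date ascending
dates = ["2012-10-15", "2012-11-01", "2012-11-01", "2012-11-20", "2013-01-18"]
groups = [3, 2, 4, 5, 9]


def groups_at(target):
    """All groups that share the most recent date on or before target."""
    i = latest_on_or_before(dates, target)
    if i < 0:
        return []
    day = dates[i]
    result = []
    # several rows may share the winning date; i points at the last of them
    while i >= 0 and dates[i] == day:
        result.append(groups[i])
        i -= 1
    return sorted(result)


print(groups_at("2012-11-15"))  # [2, 4]
```

The lookup is O(log n) per query instead of O(n), which matters once the array holds 10000+ entries.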

PHP/MYSQL datetime ranges overlapping for users

Please, I need help with this (for better understanding, see the attached image) because I am completely stuck.
As you can see, I have users and they store their starting and ending datetimes in my DB as YYYY-mm-dd H:i:s. Now I need to find the time ranges in which the most users overlap. I would like to get the 3 most frequent datetime overlaps across users. How can I do it?
I have no idea which MySQL query I should use, or whether it would be better to select all datetimes (start and end) from the database and process them in PHP (but how?). As shown in the image, the result should be, for example, that the time 8.30 - 10.00 is the result for users A+B+C+D.
Table structure:
UserID | Start datetime | End datetime
--------------------------------------
A | 2012-04-03 4:00:00 | 2012-04-03 10:00:00
A | 2012-04-03 16:00:00 | 2012-04-03 20:00:00
B | 2012-04-03 8:30:00 | 2012-04-03 14:00:00
B | 2012-04-06 21:30:00 | 2012-04-06 23:00:00
C | 2012-04-03 12:00:00 | 2012-04-03 13:00:00
D | 2012-04-01 01:00:01 | 2012-04-05 12:00:59
E | 2012-04-03 8:30:00 | 2012-04-03 11:00:00
E | 2012-04-03 21:00:00 | 2012-04-03 23:00:00

What you effectively have is a collection of sets and want to determine if any of them have non-zero intersections. This is the exact question one asks when trying to find all the ancestors of a node in a nested set.
We can prove that for every overlap, at least one time window will have a start time that falls within all other overlapping time windows. Using this tidbit, we don't need to actually construct artificial timeslots in the day. Simply take a start time and see if it intersects any of the other time windows and then just count up the number of intersections.
So what's the query?
/*SELECT*/
SELECT DISTINCT
MAX(overlapping_windows.start_time) AS overlap_start_time,
MIN(overlapping_windows.end_time) AS overlap_end_time ,
(COUNT(overlapping_windows.id) - 1) AS num_overlaps
FROM user_times AS windows
INNER JOIN user_times AS overlapping_windows
ON windows.start_time BETWEEN overlapping_windows.start_time AND overlapping_windows.end_time
GROUP BY windows.id
ORDER BY num_overlaps DESC;
Depending on your table size and how often you plan on running this query, it might be worthwhile to add a spatial index (see below).
UPDATE
If you're running this query often, you'll need to use a spatial index. Because of the range-based traversal (i.e. does start_time fall within the start/end range), a BTREE index will not do anything for you. IT HAS TO BE SPATIAL.
ALTER TABLE user_times ADD COLUMN time_window GEOMETRY NOT NULL DEFAULT 0;
UPDATE user_times SET time_window = GeomFromText(CONCAT('LineString( -1 ', start_time, ', 1 ', end_time, ')'));
CREATE SPATIAL INDEX time_window ON user_times (time_window);
Then you can update the ON clause in the above query to read
ON MBRWithin( Point(0,windows.start_time), overlapping_windows.time_window )
This will get you an indexed traversal for the query. Again, only do this if you're planning on running the query often.
Credit for the spatial index to Quassnoi's blog.
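The "check each start time against every window" idea is easy to try outside the database. Below is a rough Python sketch using the rows from the question's table; the function name overlaps_at_starts is made up, and unlike the SQL it reports the number of windows covering the instant rather than that number minus one.

```python
from datetime import datetime


def parse(s):
    return datetime.strptime(s, "%Y-%m-%d %H:%M:%S")


# (user, start, end) rows from the question
windows = [
    ("A", parse("2012-04-03 04:00:00"), parse("2012-04-03 10:00:00")),
    ("A", parse("2012-04-03 16:00:00"), parse("2012-04-03 20:00:00")),
    ("B", parse("2012-04-03 08:30:00"), parse("2012-04-03 14:00:00")),
    ("B", parse("2012-04-06 21:30:00"), parse("2012-04-06 23:00:00")),
    ("C", parse("2012-04-03 12:00:00"), parse("2012-04-03 13:00:00")),
    ("D", parse("2012-04-01 01:00:01"), parse("2012-04-05 12:00:59")),
    ("E", parse("2012-04-03 08:30:00"), parse("2012-04-03 11:00:00")),
    ("E", parse("2012-04-03 21:00:00"), parse("2012-04-03 23:00:00")),
]


def overlaps_at_starts(windows):
    """For each window start, find the windows containing that instant.
    Returns unique (count, overlap_start, overlap_end, users), busiest first."""
    found = set()
    for _, start, _ in windows:
        # BETWEEN in the SQL is inclusive on both ends
        covering = [w for w in windows if w[1] <= start <= w[2]]
        overlap_start = max(w[1] for w in covering)
        overlap_end = min(w[2] for w in covering)
        users = tuple(sorted({w[0] for w in covering}))
        found.add((len(covering), overlap_start, overlap_end, users))
    return sorted(found, key=lambda r: (-r[0], r[1]))


best = overlaps_at_starts(windows)[0]
print(best[0], best[1], best[2], best[3])
```

For the table as posted, the busiest overlap comes out as 08:30 to 10:00 on 2012-04-03 with four windows, belonging to users A, B, D and E.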

Something like this should get you started -
SELECT slots.time_slot, COUNT(*) AS num_users, GROUP_CONCAT(DISTINCT user_bookings.user_id ORDER BY user_bookings.user_id) AS user_list
FROM (
SELECT CURRENT_DATE + INTERVAL ((id-1)*30) MINUTE AS time_slot
FROM dummy
WHERE id BETWEEN 1 AND 48
) AS slots
LEFT JOIN user_bookings
ON slots.time_slot BETWEEN `user_bookings`.`start` AND `user_bookings`.`end`
GROUP BY slots.time_slot
ORDER BY num_users DESC
The idea is to create a derived table that consists of time slots for the day. In this example I have used dummy (which can be any table with an auto-increment id that is contiguous for the required set) to create a list of timeslots by adding 30 minutes incrementally. The result of this is then joined to the bookings to be able to count the number of bookings for each time slot.
UPDATE For entire date/time range you could use a query like this to get the other data required -
SELECT MIN(`start`) AS `min_start`, MAX(`end`) AS `max_end`, DATEDIFF(MAX(`end`), MIN(`start`)) + 1 AS `num_days`
FROM user_bookings
These values can then be substituted into the original query or the two can be combined -
SELECT slots.time_slot, COUNT(*) AS num_users, GROUP_CONCAT(DISTINCT user_bookings.user_id ORDER BY user_bookings.user_id) AS user_list
FROM (
SELECT DATE(tmp.min_start) + INTERVAL ((id-1)*30) MINUTE AS time_slot
FROM dummy
INNER JOIN (
SELECT MIN(`start`) AS `min_start`, MAX(`end`) AS `max_end`, DATEDIFF(MAX(`end`), MIN(`start`)) + 1 AS `num_days`
FROM user_bookings
) AS tmp
WHERE dummy.id BETWEEN 1 AND (48 * tmp.num_days)
) AS slots
LEFT JOIN user_bookings
ON slots.time_slot BETWEEN `user_bookings`.`start` AND `user_bookings`.`end`
GROUP BY slots.time_slot
ORDER BY num_users DESC
EDIT I have added DISTINCT and ORDER BY clauses in the GROUP_CONCAT() in response to your last query.
Please note that you will need a much greater range of ids in the dummy table. I have not tested this query so it may have syntax errors.

I would not do much in SQL, this is so much simpler in a programming language, SQL is not made for something like this.
Of course, it's just sensible to break the day down into "timeslots" - this is statistics. But as soon as you start handling dates over the 00:00 border, things start to get icky when you use joins and inner selects. Especially with MySQL which does not quite like inner selects.
Here's a possible SQL query
SELECT count(*) FROM `times`
WHERE
( DATEDIFF(`End`,`Start`) = 0 AND
TIME(`Start`) < TIME('$SLOT_HIGH') AND
TIME(`End`) > TIME('$SLOT_LOW'))
OR
( DATEDIFF(`End`,`Start`) > 0 AND
( TIME(`Start`) < TIME('$SLOT_HIGH') OR
TIME(`End`) > TIME('$SLOT_LOW')))
Here's some pseudo code
granularity = 30*60; // 30 minutes
numslots = 24*60*60 / granularity;
stats = CreateArray(numslots);
for i=0, i < numslots, i++ do
stats[i] = GetCountFromSQL(i*granularity, (i+1)*granularity); // low, high
end
Yes, that makes numslots queries, but no joins no nothing, hence it should be quite fast. Also you can easily change the resolution.
And another positive thing is, you could "ask yourself", "I have two possible timeslots, and I need the one where more people are here, which one should I use?" and just run the query twice with respective ranges and you are not stuck with predefined time slots.
To only find full overlaps (an entry only counts if it covers the full slot) you have to switch low and high ranges in the query.
You might have noticed that I do not add times between entries that could span multiple days, however, adding a whole day, will just increase all slots by one, making that quite useless.
You could however add them by selecting sum(DAY(End) - DAY(Start)) and just add the return value to all slots.

Table seems pretty simple. I would keep your SQL query pretty simple:
SELECT * FROM tablename
Then when you have the info saved in your PHP object. Do the processing with PHP using loops and comparisons.
In simplest form:
for ($x = 0, $numrows = mysql_num_rows($query); $x < $numrows; $x++) {
    /*Grab a row*/
    $row = mysql_fetch_assoc($query);
    /*store userID, START, END*/
    $userID = $row['userID'];
    $start = $row['START'];
    $end = $row['END'];
    /*Have an array for each user in which you store start and end times*/
    if (!strcmp($userID, "A"))
    {
        /*Store info in array_a*/
    }
    else if (!strcmp($userID, "B"))
    {
        /*etc......*/
    }
}
/*Now you have an array for each user with their start/stop times*/
/*Do your loops and comparisons to find common time slots. */
/*Also, use strtotime() to switch date/time entries into comparable values*/
Of course this is in very basic form. You'll probably want to do one loop through the array to first get all of the userIDs before you compare them in the loop shown above.

Minus one date from another in MySQL request?

I have table with start_date and end_date. I need to find the duration (end_date-start_date). Can someone suggest how I can do so in the query? Will I get a new variable with this somehow like duration=end_date-start_date in the query?
EDIT
If I use minus it gives me:
2012-07-01 minus 2012-01-01 = 600
How can it be 600 days in 6 months when 2011-07-27 - 2011-07-06 = 21? So I assume it's days?
Is there a function to get the number of months, even if a date is in the middle of a month?
E.g. "3rd June" to "27 July" is 2 months.

use PERIOD_DIFF
Returns the number of months between periods P1 and P2. P1 and P2 should be in the format YYMM or YYYYMM. Note that the period arguments P1 and P2 are not date values. so try
SELECT PERIOD_DIFF(
DATE_FORMAT('2011-07-27','%Y%m'),
DATE_FORMAT('2011-06-03','%Y%m')
) AS durationInMonths
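The arithmetic behind a month difference of this kind can be sketched in a few lines of Python (shown only to make the calculation concrete; the function names are invented for this example). Note that for 3 June to 27 July a year-month difference is 1; counting every calendar month the range touches, which seems to be what the question means by "2 months", is that value plus one.

```python
from datetime import date


def period_diff_months(later: date, earlier: date) -> int:
    """Calendar-month difference that ignores the day of the month,
    the same idea as PERIOD_DIFF on YYYYMM values."""
    return (later.year * 12 + later.month) - (earlier.year * 12 + earlier.month)


def months_touched(later: date, earlier: date) -> int:
    """Number of calendar months the range touches (June and July -> 2)."""
    return period_diff_months(later, earlier) + 1


print(period_diff_months(date(2011, 7, 27), date(2011, 6, 3)))  # 1
print(months_touched(date(2011, 7, 27), date(2011, 6, 3)))      # 2
```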

If you want to know the difference between two DATETIME values, you can use TIMEDIFF()
select
(Hour(Duration) / 24)/365 as Year,
(Hour(Duration) / 24)%365 as Day,
(Hour(Duration) % 24) as Hours,
MINUTE(Duration) as Minutes,
SECOND(Duration) as Seconds
from
(
SELECT ADDTIME(NOW(),'1000:27:50') as end_datetime, NOW() as start_datetime,
TIMEDIFF(ADDTIME(NOW(),'1000:27:50'), NOW()) AS Duration
) x;
If you want to know the difference between two dates in days, you can use DATEDIFF()
select DATEDIFF('2011-06-06','2011-05-01');

You can use *, but to get the duration you need to add an extra column after it:
select *, DATEDIFF(updated_at, created_at) from users;
As a best practice, it is better to alias users as u and name the extra column duration,
so that you see a duration heading at the top of the result. This is handy if you export the query result to Excel.
select u.*, DATEDIFF(updated_at, created_at) as duration from users u;
You can also refer to that column name from your PHP or Rails code. In Rails:
users = User.find_by_sql("select u.*, DATEDIFF(updated_at, created_at) as duration from users u")
users.first.duration

SELECT end_date - start_date AS duration FROM table ...
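A caveat on plain subtraction: as far as I know, MySQL treats a DATE in numeric context as the number YYYYMMDD, which would explain the 600 and the 21 reported in the question. The Python below only reproduces that arithmetic next to a real day count; the helper names are made up for illustration.

```python
from datetime import date


def as_yyyymmdd(d: date) -> int:
    """The date written as a plain number, e.g. 2012-07-01 -> 20120701."""
    return d.year * 10000 + d.month * 100 + d.day


def naive_minus(end: date, start: date) -> int:
    """Subtract the YYYYMMDD numbers (assumed to mirror end_date - start_date)."""
    return as_yyyymmdd(end) - as_yyyymmdd(start)


def real_days(end: date, start: date) -> int:
    """Actual number of days between the two dates."""
    return (end - start).days


print(naive_minus(date(2012, 7, 1), date(2012, 1, 1)))    # 600
print(real_days(date(2012, 7, 1), date(2012, 1, 1)))      # 182
print(naive_minus(date(2011, 7, 27), date(2011, 7, 6)))   # 21
```

The two results only agree when both dates fall in the same month, so a function such as DATEDIFF is the safer choice for a duration in days.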

MySQL find first available weekend

I have a table which holds restaurant reservations. It just has a restaurant_id and a date column which specifies the actual date (we are talking about whole day reservations).
I want to find out when is the next available weekend for a particular restaurant. A "weekend" is either Saturday or Sunday. If one of them is available, then we have an available weekend.
The query should, of course, consider the current time to calculate the next weekend.
Can anyone help?
Here's the table structure and data for the "dates" table which holds all reservations made so far:
id id_venue date
12 1 2011-04-22
13 1 2011-04-23
14 1 2011-04-24
15 1 2011-04-30
16 1 2011-05-07
17 1 2011-05-08
As you can see, the weekend of 23-24 is full, and so is the one of 7-8 May. What I need to find is the date 2011-05-01, which is the first available Saturday OR Sunday after today's date.

I think the others are missing the question... They think your table may already be POPULATED with all weekends and some status as to open or not... My guess is that your table only HAS a record IF it is reserved... thus you need to find records that DO NOT EXIST AT ALL... based on some automated Look for dates...
This is a modification to another post I've done here
Although I didn't change the context of the query, I only put in the columns associated to YOUR table. I understand you are only going against a single venue table and so am I (actually). However, to understand the "JustDates" alias, this INNER PRE-QUERY is creating a dynamically populated table of ALL DATES by doing a Cartesian join against ANY other table.. in this case, your "Venue" table of reservations (I didn't see your actual table name reference explicitly, so you'll have to change that). So, this in essence creates a table of all dates starting from whatever "today" is and goes forward for 30 days (via limit), but could be 40, 50, 300 or as many as you need.. provided the "YourVenueTable" has at least as many records as days you want to test for. (same clarification in post this was derived from). This result set going out 30, 40 or however many days is pre-filtered for ONLY the given day of week of 1-Sunday or 7-Saturday... So it should return a result set of only Apr 23, Apr 24, Apr 30, May 1, May 7, May 8, May 14, May 15, May 21, May 28, etc.
So NOW you have a dynamically created result set of all possible days you are considering moving forward. Now, that gets joined to your actual Venue Reservations table and is filtered to ONLY return those DATES where it is NOT found for the id_venue you are concerned about. In your data example it WOULD find a match on Apr 23 and 24 and NOT return those records. Same with Apr 30... However, it WILL find that the record in the prequalifying list that includes May 1 will NOT find the date match in the venue table and thus include that as you are anticipating... It will then continue to skip May 7 and 8, then return May 14, 15, 21, 28, etc...
select JustDates.OpenDate
from
    ( select
        @r := date_add( @r, interval 1 day ) OpenDate
      from
        ( select @r := current_date() ) vars,
        Venue
      LIMIT 30 ) JustDates
where
    DAYOFWEEK( JustDates.OpenDate ) IN ( 1, 7 )
    AND JustDates.OpenDate NOT IN
        ( select Venue.date
          from Venue
          where Venue.id_venue = IDYouAreInterestedIn
          and Venue.Date = JustDates.OpenDate )
order by
    JustDates.OpenDate
Note, and per the other reservations posting, the query for reservation date availability dates doing a limit of 30 above can be ANY table in the system as long as it has AT LEAST as many days out as you want to look forward for reservations... If you want all availability for an upcoming year, you would want 365 records in the table used for a Cartesian result to get the @r cycling through dynamically created "date" records.
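The same logic (walk forward day by day, keep only Saturdays and Sundays, skip reserved dates) can be sketched in Python to check the expected answer. The function name next_free_weekend_day and the 30-day horizon are assumptions of this example; "today" is taken to be 2011-04-22 to match the sample data.

```python
from datetime import date, timedelta


def next_free_weekend_day(reserved, today, horizon_days=30):
    """First Saturday or Sunday after `today` that is not in `reserved`,
    or None if nothing is free within `horizon_days`."""
    for offset in range(1, horizon_days + 1):
        day = today + timedelta(days=offset)
        # Python's weekday(): 5 = Saturday, 6 = Sunday
        if day.weekday() >= 5 and day not in reserved:
            return day
    return None


# Reserved dates for id_venue 1 from the question
reserved = {
    date(2011, 4, 22),
    date(2011, 4, 23),
    date(2011, 4, 24),
    date(2011, 4, 30),
    date(2011, 5, 7),
    date(2011, 5, 8),
}

print(next_free_weekend_day(reserved, date(2011, 4, 22)))  # 2011-05-01
```

Note that Python numbers weekdays differently from MySQL's DAYOFWEEK, where 1 is Sunday and 7 is Saturday.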

SELECT ...... DAYOFWEEK(`date`) as `num` FROM .... HAVING num = 1 OR num = 7
(A column alias cannot be referenced in WHERE, so the filter goes in HAVING.) I don't know how you want to check "availability".

How about?:
SELECT * FROM table WHERE (DAYOFWEEK(date)=1 OR DAYOFWEEK(date)=7) AND restaurant_id =$RESTAURANTID AND date > CURDATE() ORDER BY date ASC LIMIT 1

Set the number of days from today until the next Saturday (if 0 then today is Saturday)
Assuming that if today is Sunday you only want reservations for the next full weekend.
select @OffsetSaturday := mod(7 - DayOfWeek(CurDate()), 7);
You have not supplied enough info to know how the reservation database looks, so I'm going to guess here.
Every restaurant has seats:
Table seats
id: integer primary key
rest_id: integer #link to restaurant
desc: varchar(20) # description of the seat.
Table restaurant
id: integer primary key
other fields.....
Table Reservation
id: integer primary key
reservation_date: date
seat_id: integer
The select statement to get all available seats for next weekend is:
select @OffsetSaturday := mod(7 - DayOfWeek(CurDate()), 7);
select s.*, rst.* from seats s
inner join restaurant rst on (rst.id = s.rest_id)
left join reservation r on (r.seat_id = s.id
and r.reservation_date between
date_add(curdate(), interval @OffsetSaturday day) and
date_add(curdate(), interval @OffsetSaturday+1 day))
where r.id is null
order by s.rest_id, s.`desc`;
You might be able to combine the two selects into one, but MySQL does not guarantee the order in which expressions get evaluated, so I would recommend against that.
