I'm working on a module where the system would be able to determine where the logs of a flexi-time schedule belong...
Here's what I'm trying to do. I have a table called office_schedule with fields and values:
emp_ID time_in time_out
1 8:00:00 9:00:00
1 9:30:00 12:00:00
1 13:30:00 17:00:00
The example table above, 'office_schedule', contains the schedule of a single employee for a single day. I also have another table called 'office_logs' with one row:
emp_ID log_in log_out
1 8:40:00 11:30:00
I am searching for a query that takes the employee's logs and determines which row in the 'office_schedule' table they belong to, by calculating which schedule entry the logs cover for the longest time.
For example, if I query using the logs in the 'office_logs' table, it should match the second row of the 'office_schedule' table, because the logs cover a longer span of time in that row than in the others.
i hope this is understandable enough.
please help...
Assuming the time cells are defined as TIME and not as VARCHAR, I would try something like that (but maybe there is a better way):
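SELECT *
FROM `office_logs` AS log
LEFT JOIN `office_schedule` AS sched ON log.`emp_ID` = sched.`emp_ID`
WHERE log.`emp_ID` = 1
ORDER BY (ABS(TIME_TO_SEC(sched.`time_in`) - TIME_TO_SEC(log.`log_in`)) + ABS(TIME_TO_SEC(sched.`time_out`) - TIME_TO_SEC(log.`log_out`))) ASC
LIMIT 1;
It calculates the absolute difference, in seconds, between the employee's log-in and log-out times and each of his scheduled time-in and time-out values. TIME_TO_SEC() is needed because subtracting TIME values directly compares them as HHMMSS numbers rather than as durations. The rows are ordered by the smallest difference, so the first row is the closest schedule entry.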
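The question asks for the schedule entry that the log covers for the longest time, which is slightly different from the closest-endpoints ordering. Here is a minimal Python sketch of that overlap calculation, using the sample rows from the question. The helper names (to_seconds, overlap_seconds, best_schedule) are made up for illustration.

```python
from datetime import datetime


def to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' string to seconds since midnight."""
    t = datetime.strptime(hms, "%H:%M:%S")
    return t.hour * 3600 + t.minute * 60 + t.second


def overlap_seconds(log_in: str, log_out: str, time_in: str, time_out: str) -> int:
    """Seconds shared by the log span and the schedule span (0 if they do not meet)."""
    start = max(to_seconds(log_in), to_seconds(time_in))
    end = min(to_seconds(log_out), to_seconds(time_out))
    return max(0, end - start)


def best_schedule(log, schedule):
    """Return the schedule row that the log covers for the longest time."""
    return max(schedule, key=lambda row: overlap_seconds(log[0], log[1], row[0], row[1]))


# Sample data copied from the question (emp_ID 1).
schedule = [("8:00:00", "9:00:00"), ("9:30:00", "12:00:00"), ("13:30:00", "17:00:00")]
log = ("8:40:00", "11:30:00")

best = best_schedule(log, schedule)
print(best)
```

For the sample log this picks the 9:30:00 to 12:00:00 row, which is the match the question expects.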
Maybe this helps.
(sorry for bad english and poor skill)
Hello! I've got a MySQL database table with four columns and a script, run as a cron job, which requests the status of a user every 10 minutes.
DataBase columns:
ID UID STATUS CHECK_AT
ID - just a sequence number (1, 2, 3 and so on). Each time the script writes something into the DB, the number goes up.
UID - Key value. Let's say it's the ID of a user. The whole DB contains about 3-5 different UIDs.
STATUS - with values 1 or 0. Let's say 1 is online, 0 is offline. Online status timeout is 10 minutes.
CHECK_AT - Time and date of script work, like 2013-10-01 00:30:01
Logic: every 10 minutes script is checking specific UIDs (written in other table) for online (1) or offline (0).
What I'm trying to do:
To output the total online time of specific UIDs for a day, week, month etc.
I guess it should be elementary, like
select count(id) from DB_NAME where date(check_at) = '2013-10-01';
for a one day
select count(uid) from user_activity where date(check_at) between '2013-10-01' and '2013-10-07';
For a few days and so on.
But my skill is too low to know how I can count only online time (status=1) for a date.
Can you give me some advice, please?
you could add your conditions in WHERE clause like:
select count(id) from your_table where date(check_at) = '2013-10-01' AND status = 1;
OR
select count(uid) from user_activity where
date(check_at) between '2013-10-01' AND '2013-10-07'
AND status = 1;
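Since each row stands for one 10-minute check, the row count can be turned into online time by multiplying by the check interval. A rough Python sketch of that arithmetic, assuming the rows have already been fetched as (uid, status, check_at) tuples; the function name and sample rows are made up:

```python
from datetime import date, datetime

CHECK_INTERVAL_MINUTES = 10  # the cron job runs every 10 minutes


def online_minutes(rows, uid, first_day, last_day):
    """Total online minutes for one UID between two dates (inclusive)."""
    hits = sum(
        1
        for row_uid, status, check_at in rows
        if row_uid == uid and status == 1 and first_day <= check_at.date() <= last_day
    )
    return hits * CHECK_INTERVAL_MINUTES


# Made-up sample rows: (UID, STATUS, CHECK_AT)
rows = [
    (7, 1, datetime(2013, 10, 1, 0, 30, 1)),
    (7, 1, datetime(2013, 10, 1, 0, 40, 1)),
    (7, 0, datetime(2013, 10, 1, 0, 50, 1)),
    (7, 1, datetime(2013, 10, 2, 9, 0, 1)),
    (8, 1, datetime(2013, 10, 1, 0, 30, 1)),
]

print(online_minutes(rows, 7, date(2013, 10, 1), date(2013, 10, 1)))
print(online_minutes(rows, 7, date(2013, 10, 1), date(2013, 10, 7)))
```

The same multiplication can be done directly in SQL by selecting count(id) * 10 instead of count(id).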
Scenario
UPDATE
Please ignore the commented section. After thinking for an alternative, I came up with this:
Let's say I have
$date = '2012-10-03 13:00:00'
The time interval range is
2012-10-03 12:00:00 to 2012-10-03 14:00:00
Now $date falls between the time range mentioned above. Any ideas on how to compare a date time with a range of date time? I've come across functions which compare either just date or just time but not both at the same time. Any help much appreciated.
/*I'm building a school timetable and want to make sure that a room cannot be assigned to two different periods if it is already occupied. I have datetime values of **`2012-10-03 13:00:00`** (the start time of a period. Let's call it **abc** for reference) and **`2012-10-03 13:30:00`** (the end time of a period. Let's call it **xyz** for reference).
My database table contains columns for room number assigned for a period and the start and end time of that period. Something like this:
room_no | start_time | end_time
5 2012-10-03 13:00:00 2012-10-03 14:30:00
This means for October 3, 2012 room 5 is occupied between 1pm and 2:30pm. So the datetime values that I have (abc & xyz) will have to be assigned to a room other than 5.
I'm at a loss of ideas on how to go about validating this scenario, i.e. make sure that the period with time interval between abc & xyz cannot be assigned room number 5.
Any help would be appreciated. Thanks in advance
PS : I'm not asking for code. I'm looking for ideas on how to proceed with the issue at hand. Also, is there a way a query can be build to return a row if `abc` or `xyz` lie between `start_time` and `end_time` as that would be great and reduce a lot of workload. I could simply use the number of rows returned to validate (if greater than 0, then get the room number and exclude it from the result)*/
if(StartTime - BookingTime < 0 && BookingTime - EndTime < 0)
{
// Booking time is already taken
}
You can do this in SQL with TIMEDIFF().
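The same check can be sketched outside SQL. Here is a minimal Python version, assuming the values arrive as 'YYYY-MM-DD HH:MM:SS' strings like the ones in the question; in_range is a made-up helper name. Parsing to datetime objects compares date and time together.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def in_range(moment: str, start: str, end: str) -> bool:
    """True when moment lies strictly between start and end."""
    m, s, e = (datetime.strptime(value, FMT) for value in (moment, start, end))
    return s < m < e


print(in_range("2012-10-03 13:00:00", "2012-10-03 12:00:00", "2012-10-03 14:00:00"))
```

Use <= instead of < if a booking that starts exactly on a boundary should also count as taken.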
I'm working on something similar, and perhaps an easier way to code it would be to use timeslots instead of times. The way I thought of doing it was a table bookings (date, slot IDs, room) and a table slots (with maybe slot ID and TIME), where each booking uses a certain number of slots. Then, when you look for when the room is available, it shows you which slots are free per date. Just an idea.
Basically, I think you need the first available room_no to be assigned to your abc-xyz timespan. So you should be fetching the first value that is not in the already-booked set.
An example query could be something like this:
select room_no
from bookings
where room_no not in (
    select room_no
    from bookings
    -- an existing booking clashes when it starts before xyz and ends after abc
    where start_time < 'xyz' and end_time > 'abc'
)
limit 1
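To make the clash test concrete: two periods overlap when each one starts before the other ends. A small Python sketch of that rule, using the room 5 booking from the question; the list of all rooms and the function names are assumptions for illustration.

```python
from datetime import datetime


def overlaps(start_a, end_a, start_b, end_b):
    """Two periods clash when each one starts before the other ends."""
    return start_a < end_b and start_b < end_a


def free_rooms(all_rooms, bookings, start, end):
    """Rooms with no booking that clashes with the requested period."""
    busy = {
        room
        for room, booked_start, booked_end in bookings
        if overlaps(booked_start, booked_end, start, end)
    }
    return [room for room in all_rooms if room not in busy]


# Room 5 is occupied 13:00 to 14:30, as in the question.
bookings = [(5, datetime(2012, 10, 3, 13, 0), datetime(2012, 10, 3, 14, 30))]
abc = datetime(2012, 10, 3, 13, 0)   # start of the new period
xyz = datetime(2012, 10, 3, 13, 30)  # end of the new period

print(free_rooms([4, 5, 6], bookings, abc, xyz))
```

Keeping the room list separate from the bookings also covers rooms that have never been booked, which a query against the bookings table alone would miss.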
I need to come up with a software able to store employees' schedules in a database. I currently have this design:
CREATE TABLE IF NOT EXISTS `schedules` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`employee_id` varchar(32) NOT NULL,
`day_of_week` int(2) NOT NULL,
`starting_time` time DEFAULT NULL,
`ending_time` time DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ;
Pretty straightforward: get today's date, work out the day of the week, and retrieve all matching rows from the database. But I'd like to know a better way to achieve the same thing.
I need to display on a calendar employees' schedules and I need to pass a date to the calendar in order to be displayed. However, I cannot pass a date to the calendar if the only information I know is that a given employee is going to work on Wednesday.
Given the latter database design, is there a way to retrieve all the Wednesdays, or Mondays of a month/year?
Thank you!
EDIT
I need help going backwards (or a better alternative) to change day of the week to actual dates.
These are my "restrictions":
My calendar control requires a date in order to display an "event" in the UI.
Employees should submit their schedule just once.
For instance:
John Doe works Mondays, Wednesdays and Fridays. 8:00 - 12:00.
That information, with my current database design, is represented in the following fashion:
ID, employee_id, day_of_week, starting_time, ending_time
1, John Doe, 1, 8:00, 12:00
2, John Doe, 3, 8:00, 12:00
3, John Doe, 5, 8:00, 12:00
I need to be able to pass a date to the calendar UI for a given month.
For instance I should be able to come up with a way to tell my calendar control:
John Doe is going to work on the 6th, 13th, 20th, and 27th taking in consideration that the only information I have is "day_of_week" = 1 (Mondays)
EDIT 2
My current but ugly solution is:
Loop through all the days of the month, and query the database day by day.
EDIT 3 - SOLUTION
Thanks to rs, I was able to solve my problem.
I kept the schedules table as it was and created the tables described in the article rs shared. After that, the following query did the job:
SELECT CONCAT(users.firstname, ' ', users.lastname) AS employee_name,
DATE_FORMAT(DATE, '%a %b %d %Y') AS date, schedules.starting_time,
schedules.ending_time FROM dates_d
RIGHT JOIN schedules ON schedules.day_of_week = dates_d.day_of_week
LEFT JOIN users ON schedules.employee_id = users.ID
I JSON-Encoded an array, and the Calendar Control finally worked like a charm.
You can create a date table and store its key in your schedules table. The date table will allow you to query data by different date attributes, e.g. day, week, day of week, quarter etc - dwhworld.com/2010/08/date-dimension-sql-scripts-mysql
Your schedules table will then have id, emplid and date_id, and you can join schedules with the date table on schedules.date_id = datetble.date_id to get the date and day of week in one query. You can then use this date field with your control.
Why not have your day_of_week column a date type instead, that way if you want to get the day from this date you can use MySQL's DayOfWeek function. You could use this same date to pass to your calendar control (I think that's what you were getting at) in your UI.
To elaborate on your comment: wouldn't the date that John Doe is scheduled to work be entered by the user (possibly via jQuery UI's datepicker)? Then, when rendering the schedule at the UI level, you'd use the following query to retrieve their Monday schedule:
SELECT *
FROM schedules
WHERE DAYOFWEEK(schedule_date) = 2 -- MySQL's DAYOFWEEK() returns 1 for Sunday, 2 for Monday
AND employee_id = (
    SELECT employee_id
    FROM employees
    WHERE employee_name = 'John Doe'
)
Obviously this query makes some assumptions on your employees table, and it's not particularly elegant but it serves to explain my meaning.
RE: EDIT
OK, now I see what you mean. Basically you're after a function that will give you all dates between two dates (perhaps) that land on a specified day (let's say Monday). It seems this has already been done in PHP, so that might be useful? Or do you have a specific technology in mind that this would need to be done in?
EDIT 3
This seems a more elegant solution: Get mondays tuesdays etc - from this you can query your DayOfWeek in the SQL and return an array of integers that you can pass as the third argument to their function.
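For reference, here is a minimal Python sketch of the "day of week to actual dates" step. It assumes the question's numbering (1 = Monday ... 7 = Sunday), which happens to match Python's isoweekday(). The function name is made up, and August 2012 is chosen only because its Mondays match the 6th, 13th, 20th and 27th from the example.

```python
import calendar
from datetime import date


def dates_for_weekday(year: int, month: int, day_of_week: int):
    """All dates in the month that fall on day_of_week (1 = Monday ... 7 = Sunday)."""
    days_in_month = calendar.monthrange(year, month)[1]
    return [
        date(year, month, day)
        for day in range(1, days_in_month + 1)
        if date(year, month, day).isoweekday() == day_of_week
    ]


mondays = dates_for_weekday(2012, 8, 1)
print([d.isoformat() for d in mondays])
```

Each returned date can be paired with the employee's starting_time and ending_time and passed to the calendar control, so the database is queried once per month instead of once per day.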
please I need help with this (for better understanding please see attached image) because I am completely helpless.
As you can see, I have users who store their start and end datetimes in my DB as YYYY-mm-dd H:i:s. Now I need to find the time ranges that overlap for the most users. I would like to get the 3 datetime overlaps shared by the most users. How can I do it?
I have no idea which MySQL query I should use, or maybe it would be better to select all datetimes (start and end) from the database and process them in PHP (but how?). As shown in the image, the result should be, for example, that the time 8.30 - 10.00 is the overlap for users A+B+C+D.
Table structure:
UserID | Start datetime | End datetime
--------------------------------------
A | 2012-04-03 4:00:00 | 2012-04-03 10:00:00
A | 2012-04-03 16:00:00 | 2012-04-03 20:00:00
B | 2012-04-03 8:30:00 | 2012-04-03 14:00:00
B | 2012-04-06 21:30:00 | 2012-04-06 23:00:00
C | 2012-04-03 12:00:00 | 2012-04-03 13:00:00
D | 2012-04-01 01:00:01 | 2012-04-05 12:00:59
E | 2012-04-03 8:30:00 | 2012-04-03 11:00:00
E | 2012-04-03 21:00:00 | 2012-04-03 23:00:00
What you effectively have is a collection of sets and want to determine if any of them have non-zero intersections. This is the exact question one asks when trying to find all the ancestors of a node in a nested set.
We can prove that for every overlap, at least one time window will have a start time that falls within all other overlapping time windows. Using this tidbit, we don't need to actually construct artificial timeslots in the day. Simply take a start time and see if it intersects any of the other time windows and then just count up the number of intersections.
So what's the query?
/*SELECT*/
SELECT DISTINCT
MAX(overlapping_windows.start_time) AS overlap_start_time,
MIN(overlapping_windows.end_time) AS overlap_end_time ,
(COUNT(overlapping_windows.id) - 1) AS num_overlaps
FROM user_times AS windows
INNER JOIN user_times AS overlapping_windows
ON windows.start_time BETWEEN overlapping_windows.start_time AND overlapping_windows.end_time
GROUP BY windows.id
ORDER BY num_overlaps DESC;
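To see what this self-join computes, here is a small Python sketch of the same idea run over the table from the question: for each window's start time, collect every window that contains it, then report the earliest end among them and the number of other windows. The function names are made up for illustration.

```python
from datetime import datetime


def parse(ts: str) -> datetime:
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")


def overlaps_by_start(windows):
    """For each window start, count the other windows that contain it."""
    results = []
    for _, start, _ in windows:
        covering = [(u, s, e) for u, s, e in windows if s <= start <= e]
        overlap_end = min(e for _, _, e in covering)
        # Subtract 1 because every window contains its own start time.
        results.append((start, overlap_end, len(covering) - 1))
    return sorted(results, key=lambda r: r[2], reverse=True)


# Rows copied from the question's table.
raw = [
    ("A", "2012-04-03 4:00:00", "2012-04-03 10:00:00"),
    ("A", "2012-04-03 16:00:00", "2012-04-03 20:00:00"),
    ("B", "2012-04-03 8:30:00", "2012-04-03 14:00:00"),
    ("B", "2012-04-06 21:30:00", "2012-04-06 23:00:00"),
    ("C", "2012-04-03 12:00:00", "2012-04-03 13:00:00"),
    ("D", "2012-04-01 01:00:01", "2012-04-05 12:00:59"),
    ("E", "2012-04-03 8:30:00", "2012-04-03 11:00:00"),
    ("E", "2012-04-03 21:00:00", "2012-04-03 23:00:00"),
]
windows = [(u, parse(s), parse(e)) for u, s, e in raw]

top = overlaps_by_start(windows)
print(top[0])
```

On this data the top row is the 8:30 to 10:00 overlap with 3 other windows, which corresponds to the overlap_start_time, overlap_end_time and num_overlaps columns of the query.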
Depending on your table size and how often you plan on running this query, it might be worthwhile to put a spatial index on it (see below).
UPDATE
If you're running this query often, you'll need to use a spatial index. Because of the range-based traversal (i.e. does start_time fall within the start/end range), a BTREE index will not do anything for you. IT HAS TO BE SPATIAL.
ALTER TABLE user_times ADD COLUMN time_window GEOMETRY NOT NULL DEFAULT 0;
UPDATE user_times SET time_window = GeomFromText(CONCAT('LineString( -1 ', UNIX_TIMESTAMP(start_time), ', 1 ', UNIX_TIMESTAMP(end_time), ')'));
CREATE SPATIAL INDEX time_window ON user_times (time_window);
Then you can update the ON clause in the above query to read
ON MBRWithin( Point(0, UNIX_TIMESTAMP(windows.start_time)), overlapping_windows.time_window )
This will get you an indexed traversal for the query. Again, only do this if you're planning on running the query often.
Credit for the spatial index to Quassnoi's blog.
Something like this should get you started -
SELECT slots.time_slot, COUNT(DISTINCT user_bookings.user_id) AS num_users, GROUP_CONCAT(DISTINCT user_bookings.user_id ORDER BY user_bookings.user_id) AS user_list
FROM (
SELECT CURRENT_DATE + INTERVAL ((id-1)*30) MINUTE AS time_slot
FROM dummy
WHERE id BETWEEN 1 AND 48
) AS slots
LEFT JOIN user_bookings
ON slots.time_slot BETWEEN `user_bookings`.`start` AND `user_bookings`.`end`
GROUP BY slots.time_slot
ORDER BY num_users DESC
The idea is to create a derived table that consists of time slots for the day. In this example I have used dummy (which can be any table with an auto-increment id that is contiguous for the required set) to create a list of timeslots by adding 30 minutes incrementally. The result of this is then joined to user_bookings so that the number of bookings for each time slot can be counted.
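The slot approach can be sketched in a few lines of Python as well. This is only an illustration of the idea, assuming the bookings are already loaded as (user_id, start, end) tuples; the function name is made up and the data is the question's table.

```python
from datetime import datetime, timedelta


def slot_counts(bookings, day_start, slot_minutes=30, num_slots=48):
    """Count the distinct users whose booking covers each slot start time."""
    counts = []
    for i in range(num_slots):
        slot = day_start + timedelta(minutes=i * slot_minutes)
        # Inclusive on both ends, like BETWEEN in the query.
        users = sorted({user for user, start, end in bookings if start <= slot <= end})
        counts.append((slot, len(users), users))
    return sorted(counts, key=lambda c: c[1], reverse=True)


bookings = [
    ("A", datetime(2012, 4, 3, 4, 0), datetime(2012, 4, 3, 10, 0)),
    ("A", datetime(2012, 4, 3, 16, 0), datetime(2012, 4, 3, 20, 0)),
    ("B", datetime(2012, 4, 3, 8, 30), datetime(2012, 4, 3, 14, 0)),
    ("B", datetime(2012, 4, 6, 21, 30), datetime(2012, 4, 6, 23, 0)),
    ("C", datetime(2012, 4, 3, 12, 0), datetime(2012, 4, 3, 13, 0)),
    ("D", datetime(2012, 4, 1, 1, 0, 1), datetime(2012, 4, 5, 12, 0, 59)),
    ("E", datetime(2012, 4, 3, 8, 30), datetime(2012, 4, 3, 11, 0)),
    ("E", datetime(2012, 4, 3, 21, 0), datetime(2012, 4, 3, 23, 0)),
]

busiest = slot_counts(bookings, datetime(2012, 4, 3))
print(busiest[0])
```

The slot length sets the resolution: a smaller slot_minutes gives finer boundaries at the cost of more slots to check.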
UPDATE For entire date/time range you could use a query like this to get the other data required -
SELECT MIN(`start`) AS `min_start`, MAX(`end`) AS `max_end`, DATEDIFF(MAX(`end`), MIN(`start`)) + 1 AS `num_days`
FROM user_bookings
These values can then be substituted into the original query or the two can be combined -
SELECT slots.time_slot, COUNT(DISTINCT user_bookings.user_id) AS num_users, GROUP_CONCAT(DISTINCT user_bookings.user_id ORDER BY user_bookings.user_id) AS user_list
FROM (
SELECT DATE(tmp.min_start) + INTERVAL ((id-1)*30) MINUTE AS time_slot
FROM dummy
INNER JOIN (
SELECT MIN(`start`) AS `min_start`, MAX(`end`) AS `max_end`, DATEDIFF(MAX(`end`), MIN(`start`)) + 1 AS `num_days`
FROM user_bookings
) AS tmp
WHERE dummy.id BETWEEN 1 AND (48 * tmp.num_days)
) AS slots
LEFT JOIN user_bookings
ON slots.time_slot BETWEEN `user_bookings`.`start` AND `user_bookings`.`end`
GROUP BY slots.time_slot
ORDER BY num_users DESC
EDIT I have added DISTINCT and ORDER BY clauses in the GROUP_CONCAT() in response to your last query.
Please note that you will need a much greater range of ids in the dummy table. I have not tested this query, so it may have syntax errors.
I would not do much in SQL, this is so much simpler in a programming language, SQL is not made for something like this.
Of course, it's just sensible to break the day down into "timeslots" - this is statistics. But as soon as you start handling dates over the 00:00 border, things start to get icky when you use joins and inner selects. Especially with MySQL which does not quite like inner selects.
Here's a possible SQL query
SELECT count(*) FROM `times`
WHERE
( DATEDIFF(`End`,`Start`) = 0 AND
TIME(`Start`) < TIME('$SLOT_HIGH') AND
TIME(`End`) > TIME('$SLOT_LOW'))
OR
( DATEDIFF(`End`,`Start`) > 0 AND
( TIME(`Start`) < TIME('$SLOT_HIGH') OR
TIME(`End`) > TIME('$SLOT_LOW')))
Here's some pseudo code
granularity = 30*60; // 30 minutes
numslots = 24*60*60 / granularity;
stats = CreateArray(numslots);
for i=0, i < numslots, i++ do
stats[i] = GetCountFromSQL(i*granularity, (i+1)*granularity); // low, high
end
Yes, that makes numslots queries, but no joins no nothing, hence it should be quite fast. Also you can easily change the resolution.
And another positive thing is, you could "ask yourself", "I have two possible timeslots, and I need the one where more people are here, which one should I use?" and just run the query twice with respective ranges and you are not stuck with predefined time slots.
To only find full overlaps (an entry only counts if it covers the full slot) you have to switch low and high ranges in the query.
You might have noticed that, for entries that span multiple days, I do not count the whole days in between. Each such day would just increase every slot by one, which makes it fairly useless.
You could however add them by selecting sum(DAY(End) - DAY(Start)) and just add the return value to all slots.
Table seems pretty simple. I would keep your SQL query pretty simple:
SELECT * FROM tablename
Then, once you have the info saved in your PHP object, do the processing in PHP using loops and comparisons.
In simplest form:
/* Note: the mysql_* functions were removed in PHP 7; mysqli or PDO work the same way here */
for ($x = 0, $numrows = mysql_num_rows($query); $x < $numrows; $x++) {
    /*Grab a row*/
    $row = mysql_fetch_assoc($query);
    /*store userID, START, END*/
    $userID = $row['userID'];
    $start = $row['START'];
    $end = $row['END'];
    /*Have an array for each user in which you store start and end times*/
    if (!strcmp($userID, "A"))
    {
        /*Store info in array_a*/
    }
    else if (!strcmp($userID, "B"))
    {
        /*etc......*/
    }
}
/*Now you have an array for each user with their start/stop times*/
/*Do your loops and comparisons to find common time slots. */
/*Also, use strtotime() to switch date/time entries into comparable values*/
Of course this is in very basic form. You'll probably want to do one loop through the array to first get all of the userIDs before you compare them in the loop shown above.
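One way to do the "loops and comparisons" step in application code is a sweep over sorted start and end events, keeping track of who is currently active. This is a different technique from per-user arrays, offered only as a sketch. It is shown in Python with the question's data, and busiest_interval is a made-up name.

```python
from collections import defaultdict
from datetime import datetime


def busiest_interval(rows):
    """Sweep over start/end events and return the interval with the most users.

    Returns (user_count, interval_start, interval_end, users).
    """
    # At equal timestamps, ends (-1) sort before starts (+1), so windows
    # that merely touch are not counted as overlapping.
    events = sorted(
        [(start, 1, user) for user, start, end in rows]
        + [(end, -1, user) for user, start, end in rows]
    )
    active = defaultdict(int)  # user -> number of that user's open windows
    best = (0, None, None, [])
    for i, (moment, delta, user) in enumerate(events):
        active[user] += delta
        if active[user] == 0:
            del active[user]
        is_last = i + 1 == len(events)
        # Only evaluate once all events at this timestamp are applied.
        if not is_last and events[i + 1][0] > moment and len(active) > best[0]:
            best = (len(active), moment, events[i + 1][0], sorted(active))
    return best


rows = [
    ("A", datetime(2012, 4, 3, 4, 0), datetime(2012, 4, 3, 10, 0)),
    ("A", datetime(2012, 4, 3, 16, 0), datetime(2012, 4, 3, 20, 0)),
    ("B", datetime(2012, 4, 3, 8, 30), datetime(2012, 4, 3, 14, 0)),
    ("B", datetime(2012, 4, 6, 21, 30), datetime(2012, 4, 6, 23, 0)),
    ("C", datetime(2012, 4, 3, 12, 0), datetime(2012, 4, 3, 13, 0)),
    ("D", datetime(2012, 4, 1, 1, 0, 1), datetime(2012, 4, 5, 12, 0, 59)),
    ("E", datetime(2012, 4, 3, 8, 30), datetime(2012, 4, 3, 11, 0)),
    ("E", datetime(2012, 4, 3, 21, 0), datetime(2012, 4, 3, 23, 0)),
]

best = busiest_interval(rows)
print(best)
```

To get the top three intervals instead of one, collect every (count, start, end, users) tuple in a list and sort it by count.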
I would like to implement a fidelity program, similar to the one on stackoverflow, on my website.
I want to be able to give some kind of reward to users who have visited my website for 30 days in a row.
[MySQL] What would be the best table architecture?
[PHP] What kind of algorithm should I use to optimize this task?
I prefer keeping more raw data in the database than the approach that @Matt H. advocates. Make a table that records all logins to the site (or, if you prefer, new session initiations) along with their time and date:
CREATE TABLE LoginLog (
    UserId INT NOT NULL REFERENCES Users (UserId),
    LoginTime TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX IX_LoginLog USING BTREE ON LoginLog (UserId ASC, LoginTime DESC);
Just insert the UserId into the table on login. I, of course, made some assumptions about your database, but I think you will be able to adapt.
Then, to check for discrete logins for each of the preceding thirty days:
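SELECT COUNT(*)
FROM (SELECT DATE(log.LoginTime) AS LoginDate,
             COUNT(*) AS LoginCount
      FROM LoginLog log
      WHERE log.UserId = 42 -- placeholder: the user being checked
        -- today plus the 29 preceding days = 30 calendar days
        AND log.LoginTime >= DATE(DATE_SUB(CURRENT_TIMESTAMP, INTERVAL 29 DAY))
      GROUP BY LoginDate
      HAVING COUNT(*) > 0) a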
If the result is 30, you're golden.
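The same check is easy to express in application code once a user's login timestamps are loaded. A minimal Python sketch, assuming "thirty days in a row" means today plus the 29 days before it; the function name and sample data are made up.

```python
from datetime import date, datetime, timedelta


def logged_in_every_day(login_times, today, days=30):
    """True if there is at least one login on each of the last `days` dates."""
    window_start = today - timedelta(days=days - 1)
    seen = {t.date() for t in login_times if window_start <= t.date() <= today}
    return len(seen) == days


today = date(2024, 3, 30)
# One login per day for 30 days, plus a duplicate login on the last day.
logins = [datetime(2024, 3, 1, 9, 0) + timedelta(days=i) for i in range(30)]
logins.append(datetime(2024, 3, 30, 18, 0))

print(logged_in_every_day(logins, today))
```

Counting distinct dates rather than rows is what keeps several logins on the same day from inflating the result.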
I will admit that I haven't touched MySQL in a while (working mainly on SQL Server and on PostgreSQL lately), so if the syntax is off a bit, I apologize. I hope the concept makes sense, though.
From your description above, this could be accomplished fairly simply and with one table.
| ID | Table PK, auto-incrementing
| EMAIL | website visitor unique ID. Ostensibly an email, but could be any piece of data that uniquely ID's the visitor
| FIRST_CONSECUTIVE_DAY | timestamp
| LAST_CONSECUTIVE_DAY | timestamp
| HAS_BEEN_REWARDED | bool, default false(0)
That's it for the table. :)
The algorithm is in three parts:
When a user logs in, once they have been verified...
1) Check the user's LAST_CONSECUTIVE_DAY. If the LAST_CONSECUTIVE_DAY is today, do nothing. If the LAST_CONSECUTIVE_DAY is yesterday, set LAST_CONSECUTIVE_DAY to today's date. Otherwise, set FIRST_CONSECUTIVE_DAY and LAST_CONSECUTIVE_DAY to today's date.
2) Use TIMESTAMPDIFF to compare LAST_CONSECUTIVE_DAY and FIRST_CONSECUTIVE_DAY by the DAY unit. If it returns 30, go on to step 3; otherwise move on with the application.
3) Your user has visited the website every single day for 30 days in a row! Congratulations! Check HAS_BEEN_REWARDED to see if they have done it before; if it is still false, give them a prize and mark HAS_BEEN_REWARDED as true.
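The three steps above can be sketched as a small Python function. The Streak class and record_visit are made-up names mirroring the table columns. One assumption to note: the sketch treats 30 consecutive calendar days as a first-to-last difference of 29 days, so adjust required_days if you want the day difference itself to reach 30.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Streak:
    first_consecutive_day: date
    last_consecutive_day: date
    has_been_rewarded: bool = False


def record_visit(streak: Streak, today: date, required_days: int = 30) -> bool:
    """Update the streak for a visit; return True if a reward is due now."""
    # Step 1: extend, keep, or reset the streak.
    if streak.last_consecutive_day == today:
        pass  # already counted today
    elif streak.last_consecutive_day == today - timedelta(days=1):
        streak.last_consecutive_day = today
    else:
        streak.first_consecutive_day = today
        streak.last_consecutive_day = today

    # Step 2: length of the current streak in days (assumption: 30 visited
    # days means the first and last day are 29 days apart).
    span = (streak.last_consecutive_day - streak.first_consecutive_day).days

    # Step 3: reward once only.
    if span >= required_days - 1 and not streak.has_been_rewarded:
        streak.has_been_rewarded = True
        return True
    return False


start = date(2024, 1, 1)
streak = Streak(start, start)
results = [record_visit(streak, start + timedelta(days=i)) for i in range(30)]
print(results[-1], streak.has_been_rewarded)
```

Because only two dates and a flag are stored per user, this stays cheap on every login, at the cost of not keeping the raw visit history.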