I have a table that stores shift records for employees.
Simply, there's the following data:
id = Shift ID
employeenum = Employee Number
start = unix timestamp of shift start time
end = unix timestamp of shift end time
date = YYYY-mm-dd description of date the shift starts on
status = shift status (numeric status identifier)
I am currently detecting conflicts with a looping PHP script, but it's far too slow. I've searched other questions and can't quite find the answer I'm looking for.
I am trying to come up with a query that will give me a list of employeenums that have conflicting shifts within a given time period.
For example: for the period 2016-07-03 to 2016-07-10, which employees have overlapping start and end timestamps for shifts with a status value of 1 or 7?
Any help would be appreciated.
Thank you!
EDIT
This is essentially the table structure.
id is an auto-incremented primary key. The table is full of numeric data.
employeenum is a 6-digit number, start and end are Unix timestamps, date is in YYYY-mm-dd format, overridden is 1 or 0, and status is 1, 2, 3, 4, 5, 6, or 7.
Current loop works by querying:
SELECT * FROM schedule WHERE overridden = 0 AND date >= '$startdate' AND date <= '$enddate' AND (status = 1 OR status = 7) AND employeenum != 0 ORDER BY date ASC
It then loops through all of those returned shifts to test whether or not another one conflicts with them by executing this query over and over (using the returned start and end values from the results of the above query):
SELECT `employeenum` FROM `schedule` WHERE `overridden` = 0 AND `date` >= '$startdate' AND `date` <= '$enddate' AND (`status` = '1' OR `status` = '7') AND ((('$start' > `start`) AND ('$start' < `end`)) OR ((`end` > '$start') AND (`end` < '$end'))) AND `employeenum` = '$employee';
If there is a result, it pushes the employee number to an array of employees with conflicts. This then prevents the loop from checking for that employee again.
At any given time there could be 10,000 records, so it's executing 10,000+ queries. These records represent only 100-200 employees, so I am looking for a way to query one time to see if there are any overlapping (start and end overlap with another start or end) records between two date values for one employeenum without having to query the database 10,000 times.
This query will give you the shift id, the date, the employee number, the conflicting employee numbers and a count of conflicting shifts. You have a ton of shifts that conflict in your dataset!!!
SELECT `schedule_test`.`id`, `schedule_test`.`date`, `schedule_test`.`employeenum`, GROUP_CONCAT(DISTINCT`join_tbl`.`employeenum`), COUNT(`join_tbl`.`employeenum`)
FROM `schedule_test`
INNER JOIN `schedule_test` AS `join_tbl` ON
`schedule_test`.`date` = `join_tbl`.`date`
AND (`join_tbl`.`status` = 1 OR `join_tbl`.`status` = 7)
AND (`join_tbl`.`start` BETWEEN `schedule_test`.`start` AND `schedule_test`.`end`
OR `join_tbl`.`end` BETWEEN `schedule_test`.`start` AND `schedule_test`.`end`)
AND `schedule_test`.`id` != `join_tbl`.`id`
WHERE (`schedule_test`.`status` = 1 OR `schedule_test`.`status` = 7)
GROUP BY `schedule_test`.`id`
ORDER BY `schedule_test`.`date`
Adapted from @cmorrissey's answer so that a shift only conflicts with other shifts of the same employee. THANK YOU!!
SELECT `schedule_test`.`id`, `schedule_test`.`date`,
`schedule_test`.`employeenum`,
GROUP_CONCAT(DISTINCT`join_tbl`.`employeenum`),
COUNT(`join_tbl`.`employeenum`)
FROM `schedule_test`
INNER JOIN `schedule_test` AS `join_tbl` ON
`schedule_test`.`date` = `join_tbl`.`date`
AND (`join_tbl`.`status` = 1 OR `join_tbl`.`status` = 7)
AND (`join_tbl`.`employeenum` = `schedule_test`.`employeenum`)
AND (`join_tbl`.`start` BETWEEN `schedule_test`.`start` AND `schedule_test`.`end`
OR `join_tbl`.`end` BETWEEN `schedule_test`.`start` AND `schedule_test`.`end`)
AND `schedule_test`.`id` != `join_tbl`.`id`
WHERE (`schedule_test`.`status` = 1 OR `schedule_test`.`status` = 7)
GROUP BY `schedule_test`.`id`
ORDER BY `schedule_test`.`date`
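A possible variation on the queries above, offered as a sketch rather than a drop-in replacement: the BETWEEN test is inclusive, so back-to-back shifts (one ends exactly when the next starts) count as conflicts, and it only matches from the side of the longer shift when one shift fully contains another. The usual interval test `a.start < b.end AND b.start < a.end` avoids both. The example below runs against an in-memory SQLite database as a stand-in for MySQL; the sample rows and the `:startdate` / `:enddate` parameter names are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL table (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE schedule (
        id INTEGER PRIMARY KEY,
        employeenum INTEGER,
        start INTEGER,
        "end" INTEGER,
        date TEXT,
        status INTEGER,
        overridden INTEGER
    )
""")

# Hypothetical sample shifts: (employeenum, start, end, date, status, overridden)
shifts = [
    (100001, 1000, 2000, "2016-07-04", 1, 0),
    (100001, 1500, 2500, "2016-07-04", 7, 0),  # partial overlap -> conflict
    (100002, 1000, 2000, "2016-07-04", 1, 0),
    (100002, 2000, 3000, "2016-07-04", 1, 0),  # back-to-back -> no conflict
    (100003, 1000, 4000, "2016-07-04", 1, 0),
    (100003, 2000, 3000, "2016-07-04", 7, 0),  # fully contained -> conflict
    (100004, 1000, 2000, "2016-07-04", 1, 0),
    (100004, 1500, 2500, "2016-07-04", 3, 0),  # status 3 is ignored
]
conn.executemany(
    'INSERT INTO schedule (employeenum, start, "end", date, status, overridden) '
    "VALUES (?, ?, ?, ?, ?, ?)",
    shifts,
)

# One self-join: a.id < b.id looks at each pair of shifts once.
query = """
    SELECT DISTINCT a.employeenum
    FROM schedule AS a
    JOIN schedule AS b
      ON a.employeenum = b.employeenum
     AND a.id < b.id
     AND a.start < b."end"
     AND b.start < a."end"
    WHERE a.overridden = 0 AND b.overridden = 0
      AND a.status IN (1, 7) AND b.status IN (1, 7)
      AND a.employeenum != 0
      AND a.date BETWEEN :startdate AND :enddate
      AND b.date BETWEEN :startdate AND :enddate
    ORDER BY a.employeenum
"""
params = {"startdate": "2016-07-03", "enddate": "2016-07-10"}
conflicts = [row[0] for row in conn.execute(query, params)]
print(conflicts)
```

In MySQL the same SELECT should work with backticks around `end`; an index on (employeenum, start, end) is the kind of thing that would likely help the self-join, though that is untested here.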
Related
I have a database table full of time records and I need to calculate the quantity of hours that exist between them...
A time record has the following fields: 'created' (e.g. 2017-08-30 11:15:00) and 'direction' (1 represents "clock in" and 0 represents "clock out"). So I need to set a start and end date, then select all time records within that time frame and calculate the number of hours worked (the hours between each record where direction=1 and the following record where direction=0).
Any idea how to create the logic for this? The result must be a measurement of "hours" in decimal format (1 decimal place, i.e. '26.7' hours).
I started by establishing variables:
$query_start_date = "2017-08-29 00:00:00";
$query_end_date = "2017-08-30 23:59:59";
Let's assume these are the time records that exist in that time frame:
time record 1: 'created'="2017-08-29 08:00:00", 'direction'=1;
time record 2: 'created'="2017-08-29 16:30:00", 'direction'=0;
time record 3: 'created'="2017-08-30 08:00:00", 'direction'=1;
time record 4: 'created'="2017-08-30 16:00:00", 'direction'=0;
But I don't know how to begin the calculation. Do I select the records first and assign each record to a variable as an array...? Any help is appreciated!
Put the start and end times in two columns (direction = 1 is the clock-in, direction = 0 is the next clock-out):
SELECT
created as start,
(select created from test t2 where t2.created > t.created
and direction = 0 order by created limit 1) as end
FROM `test` t where direction = 1
Compute the time difference in hours:
select TIME_TO_SEC(TIMEDIFF(end, start))/60/60 as diff, start, end from
(SELECT
created as start,
(select created from test t2 where
t2.created > t.created
and direction = 0 order by created limit 1) as end
FROM `test` t where direction = 1) intervals
And finally sum the differences:
select sum(TIME_TO_SEC(TIMEDIFF(end, start))/60/60) as total from
(SELECT
created as start,
(select created from test t2 where
t2.created > t.created
and direction = 0 order by created limit 1) as end
FROM `test` t where direction = 1) intervals
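To check the idea against the four sample records from the question, here is a small sketch using an in-memory SQLite database as a stand-in for MySQL. It assumes, as the question states, that direction = 1 is a clock-in and direction = 0 is a clock-out, pairs each clock-in with the next clock-out, restricts the clock-ins to the requested window, and rounds to one decimal place. `strftime('%s', ...)` is SQLite's way of getting epoch seconds; in MySQL, TIMESTAMPDIFF or TIME_TO_SEC(TIMEDIFF(...)) would play that role.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (created TEXT, direction INTEGER)")
conn.executemany(
    "INSERT INTO test (created, direction) VALUES (?, ?)",
    [
        ("2017-08-29 08:00:00", 1),  # clock in
        ("2017-08-29 16:30:00", 0),  # clock out
        ("2017-08-30 08:00:00", 1),
        ("2017-08-30 16:00:00", 0),
    ],
)

query = """
    SELECT ROUND(SUM(
        (CAST(strftime('%s', clock_out) AS INTEGER)
         - CAST(strftime('%s', clock_in) AS INTEGER)) / 3600.0
    ), 1) AS total_hours
    FROM (
        SELECT t.created AS clock_in,
               (SELECT MIN(t2.created)
                FROM test AS t2
                WHERE t2.created > t.created
                  AND t2.direction = 0) AS clock_out
        FROM test AS t
        WHERE t.direction = 1
          AND t.created BETWEEN ? AND ?
    ) AS intervals
    WHERE clock_out IS NOT NULL  -- skip a clock-in with no clock-out yet
"""
query_start_date = "2017-08-29 00:00:00"
query_end_date = "2017-08-30 23:59:59"
total_hours = conn.execute(query, (query_start_date, query_end_date)).fetchone()[0]
print(total_hours)
```

The two intervals are 8.5 and 8.0 hours, so the total for the sample data is 16.5.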
I currently have a system to track inventory items.
The sql table is set up as follows:
Unique ID | Order number | Location | TimeStamp
Every time an order moves, a new entry is created with the same order number with the new location and timestamp.
Now I need to find the average time required for order to move from one location to another, say from Location Warehouse to Pickup Depot.
I am trying to work on the query and I have this so far.
SELECT
IFNULL(TIMESTAMPDIFF(SECOND,
MIN(TimeStamp),
MAX(TimeStamp)) / NULLIF(COUNT(*) - 1, 0), 0)
FROM TableName
WHERE Status = 'Delivered'
AND TimeStamp > DATE_SUB(NOW(), INTERVAL 6 HOUR)
This works really well if the table only has one order number; the moment we add more order numbers, the average goes off.
I need it to only look at the timestamp difference for each order number, while currently I think it's looking at the whole table.
Any help would be much appreciated.
Apologies for posting this question twice; the previous post did not contain enough information.
Thanks again.
SELECT
IFNULL(TIMESTAMPDIFF(SECOND,
MIN(TimeStamp),
MAX(TimeStamp)) / NULLIF(COUNT(*) - 1, 0), 0)
FROM TableName
WHERE Status = 'Delivered'
AND TimeStamp > DATE_SUB(NOW(), INTERVAL 6 HOUR)
GROUP BY OrderNumber
The above query returns the time differences in separate rows (with the following message: "Current selection does not contain a unique column. Grid edit, checkbox, Edit, Copy and Delete features are not available."). The result has one column named "IFNULL(TIMESTAMPDIFF(SECOND, MIN(TimeStamp), MAX(TimeStamp)) / NULLIF(COUNT(*) - 1, 0), 0)", with the time differences for the various orders arranged in rows. Now I am trying to get their average with the output code.
I am outputting the results with the following code:
$row_cnt = $result2->num_rows;
$processingseconds = 0; // initialize the running total before the loop
while ($row2 = mysqli_fetch_assoc($result2)) {
$processingseconds += $row2['IFNULL(TIMESTAMPDIFF(SECOND, MIN(TimeStamp), MAX(TimeStamp)) / NULLIF(COUNT(*) - 1, 0), 0)'];
}
print "Current Processing Time: ";
$processingseconds = $row_cnt > 0 ? $processingseconds / $row_cnt : 0; // average per order, avoiding division by zero
$processingminutes = $processingseconds / 60;
echo $processingminutes;
Try adding a GROUP BY clause to your query:
GROUP BY Order_Number_column
Something like this:
SELECT
IFNULL(TIMESTAMPDIFF(SECOND,
MIN(TimeStamp),
MAX(TimeStamp)) / NULLIF(COUNT(*) - 1, 0), 0)
FROM TableName
WHERE Status = 'Delivered'
AND TimeStamp > DATE_SUB(NOW(), INTERVAL 6 HOUR)
GROUP BY Your_Order_Number_column
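If the goal is a single average across orders, one option is to wrap the grouped query in an outer AVG so the database does the averaging instead of a PHP loop. The sketch below shows that shape on an in-memory SQLite database. Everything in it is illustrative: TimeStamp is stored as integer seconds to keep the arithmetic simple, the Status filter and the 6-hour window are left out so the result is reproducible, and the sample rows are invented. Unlike the IFNULL(..., 0) version, orders with a single row come out as NULL here and are ignored by AVG rather than counted as 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE TableName (
        UniqueID INTEGER PRIMARY KEY,
        OrderNumber INTEGER,
        Location TEXT,
        TimeStamp INTEGER  -- seconds; a simplification for this demo
    )
""")
conn.executemany(
    "INSERT INTO TableName (OrderNumber, Location, TimeStamp) VALUES (?, ?, ?)",
    [
        (1001, "Warehouse", 0),
        (1001, "Truck", 600),
        (1001, "Pickup Depot", 1200),  # 2 moves in 1200 s -> 600 s per move
        (1002, "Warehouse", 100),
        (1002, "Pickup Depot", 400),   # 1 move in 300 s -> 300 s per move
        (1003, "Warehouse", 50),       # never moved -> NULL, skipped by AVG
    ],
)

query = """
    SELECT AVG(per_order_seconds)
    FROM (
        SELECT (MAX(TimeStamp) - MIN(TimeStamp)) * 1.0
               / NULLIF(COUNT(*) - 1, 0) AS per_order_seconds
        FROM TableName
        GROUP BY OrderNumber
    ) AS per_order
"""
avg_seconds = conn.execute(query).fetchone()[0]
print(avg_seconds, avg_seconds / 60)
```

With the sample rows the per-order values are 600 and 300 seconds, so the average is 450 seconds (7.5 minutes).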
I am trying to combine two MySQL queries into one. What I want to do is select the first and last row added for each day, subtract the first row's xp from the last row's xp, and output the result. This would give me the net XP gain in this game for that day.
Below are my two queries; their only difference is ordering the date by DESC vs ASC. The column in the database that I want to subtract is "xp".
$query = mysql_query("
SELECT * FROM (SELECT * FROM skills WHERE
userID='$checkID' AND
skill = '$skill' AND
date >= ".$date."
ORDER BY date ASC) as temp
GROUP BY from_unixtime(date, '%Y%m%d')
");
$query2 = mysql_query("
SELECT * FROM (SELECT * FROM skills WHERE
userID='$checkID' AND
skill = '$skill' AND
date >= ".$date."
ORDER BY date DESC) as temp
GROUP BY from_unixtime(date, '%Y%m%d')
");
SELECT FROM_UNIXTIME(date, '%Y%m%d') AS YYYYMMDD, MAX(xp)-MIN(xp) AS xp_gain
FROM skills
WHERE userID = '$checkID'
AND skill = '$skill'
AND date >= $date
GROUP BY YYYYMMDD
This assumes that XP always increases, so it doesn't need to use the times to find the beginning and ending values.
If that's not a correct assumption, what you want is something like this:
SELECT first.YYYYMMDD, last.xp - first.xp
FROM (subquery1) AS first
JOIN (subquery2) AS last
ON first.YYYYMMDD = last.YYYYMMDD
Replace subquery1 with a query that returns the first row of each day, and subquery2 with a query that returns the last row of each day. The queries you posted in your question don't do this, but there are many SO questions you can find that explain how to get the highest or lowest row per group.
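One common way to fill in those subqueries, sketched here on an in-memory SQLite database rather than MySQL: compute the earliest and latest timestamp per day in a derived table, then join back to the skills table twice to fetch the xp at those two timestamps. This assumes timestamps are unique per user and skill; the sample rows and the `:uid` / `:skill` parameter names are invented for the demo. SQLite's `date(x, 'unixepoch')` plays the role of MySQL's FROM_UNIXTIME here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE skills (userID INTEGER, skill TEXT, date INTEGER, xp INTEGER)"
)
conn.executemany(
    "INSERT INTO skills (userID, skill, date, xp) VALUES (?, ?, ?, ?)",
    [
        (1, "mining", 1000, 100),   # first row of 1970-01-01
        (1, "mining", 2000, 90),    # xp dipped, so MAX - MIN would say 60
        (1, "mining", 3000, 150),   # last row of 1970-01-01
        (1, "mining", 87400, 150),  # first row of 1970-01-02
        (1, "mining", 91400, 200),  # last row of 1970-01-02
        (2, "mining", 1500, 999),   # another user, must be ignored
    ],
)

query = """
    SELECT d.day, l.xp - f.xp AS xp_gain
    FROM (
        SELECT date(date, 'unixepoch') AS day,
               MIN(date) AS first_ts,
               MAX(date) AS last_ts
        FROM skills
        WHERE userID = :uid AND skill = :skill
        GROUP BY date(date, 'unixepoch')
    ) AS d
    JOIN skills AS f
      ON f.userID = :uid AND f.skill = :skill AND f.date = d.first_ts
    JOIN skills AS l
      ON l.userID = :uid AND l.skill = :skill AND l.date = d.last_ts
    ORDER BY d.day
"""
gains = conn.execute(query, {"uid": 1, "skill": "mining"}).fetchall()
print(gains)
```

For the first day this gives 150 - 100 = 50, whereas MAX(xp) - MIN(xp) would report 60 because of the dip to 90.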
I've been searching and I know there are similar questions, but none of them seems to answer this particular question.
I am trying to get a count of the total number of days an employee has worked on a given schedule. To do this, I am counting the total number of rows in which the employee appears in the "schedules" table. However, we run into a problem if the employee is scheduled twice on the same day.
To solve this, I want to count the rows grouped by the date part of a DATETIME field, so that each day is counted once.
Current query:
$days = mysql_query("SELECT emp_id FROM schedules
WHERE sch_id = '$sch_id'
AND emp_id = '$emp_data[emp_id]'");
$tot_days = mysql_num_rows($days);
I would like to change it to:
$days = mysql_query("SELECT emp_id FROM schedules
WHERE sch_id = '$sch_id'
AND emp_id = '$emp_data[emp_id]'
GROUP BY start_date");
// "start_date" is a datetime field. Need to group by the date part only (YYYY-MM-DD)
$tot_days = mysql_num_rows($days);
Any thoughts?
If your start_date column is a MySQL datetime type, you could use the following:
$days = mysql_query("SELECT DATE(start_date) AS day, count(*) FROM schedules
WHERE sch_id = '$sch_id'
AND emp_id = '$emp_data[emp_id]'
GROUP BY DATE(start_date)
HAVING count(*) > 1
ORDER BY DATE(start_date)");
The DATE function "Extracts the date part of the date or datetime expression"
http://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html#function_date
This will give you only those dates on which the emp_id being considered is scheduled more than once. Remove the HAVING line if you want one row per worked day, which is what you need for the total.
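If only the total is needed, another option is to let the database count distinct days directly instead of counting returned rows. A minimal sketch on an in-memory SQLite database, with invented sample rows; SQLite's `date()` stands in for MySQL's DATE(), and in MySQL the expression would be COUNT(DISTINCT DATE(start_date)).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE schedules (sch_id INTEGER, emp_id INTEGER, start_date TEXT)"
)
conn.executemany(
    "INSERT INTO schedules (sch_id, emp_id, start_date) VALUES (?, ?, ?)",
    [
        (1, 7, "2016-07-04 08:00:00"),
        (1, 7, "2016-07-04 18:00:00"),  # same day, second shift
        (1, 7, "2016-07-05 08:00:00"),
        (1, 8, "2016-07-04 08:00:00"),  # other employee
        (2, 7, "2016-07-06 08:00:00"),  # other schedule
    ],
)

# Each calendar day is counted once, however many shifts fall on it.
query = """
    SELECT COUNT(DISTINCT date(start_date))
    FROM schedules
    WHERE sch_id = ? AND emp_id = ?
"""
tot_days = conn.execute(query, (1, 7)).fetchone()[0]
print(tot_days)
```

Employee 7 has three rows on schedule 1 but only two distinct days, so the count is 2.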
I'm looking for best-practice advice on how to speed up queries and at the same time minimize the overhead of calling the date/mktime functions. To simplify the problem, assume I'm dealing with the following table layout:
CREATE TABLE my_table(
id INTEGER PRIMARY KEY NOT NULL AUTO_INCREMENT,
important_data INTEGER,
date INTEGER);
The user can choose to show 1) all entries between two dates:
SELECT * FROM my_table
WHERE date >= ? AND date <= ?
ORDER BY date DESC;
Output:
10-21-2009 12:12:12, 10002
10-21-2009 14:12:12, 15002
10-22-2009 14:05:01, 20030
10-23-2009 15:23:35, 300
....
I don't think there is much to improve in this case.
2) Summarize/group the output by day, week, month, year:
SELECT COUNT(*) AS count, SUM(important_data) AS important_data
FROM my_table
WHERE date >= ? AND date <= ?
ORDER BY date DESC;
Example output by month:
10-2009, 100002
11-2009, 200030
12-2009, 3000
01-2010, 0 /* <- very important to show empty dates, with no entries in the table! */
....
To accomplish option 2) I'm currently running a very costly for-loop with mktime/date like the following:
for(...){ /* example for group by day */
$span_from = (int)mktime(0, 0, 0, date("m", $time_min), date("d", $time_min)+$i, date("Y", $time_min));
$span_to = (int)mktime(0, 0, 0, date("m", $time_min), date("d", $time_min)+$i+1, date("Y", $time_min));
$query = "..";
$output = date("m-d-y", ..);
}
What are my ideas so far? Add additional, redundant INTEGER columns for day (20091212), month (200912), week (200942) and year (2009). This way I can get rid of all the unnecessary queries in the for loop. However, I still face the problem of quickly generating all the dates that have no matching rows in the database. One way to simply move the problem would be to let MySQL do the job: use one big query that calculates all the dates with MySQL date functions and left joins the data onto them. Would it be wise to let MySQL take the extra load? Either way, I'm reluctant to keep all these mktime/date calls in the for loop. Since I have complete control over the table layout and the code, even suggestions involving major changes are welcome!
Update
Thanks to Greg I came up with the following SQL query. However, it still bugs me to use 50 lines of SQL statements, built up with PHP, for something that could maybe be done faster and more elegantly some other way:
SELECT * FROM (
SELECT DATE_ADD('2009-01-30', INTERVAL 0 DAY) AS day UNION ALL
SELECT DATE_ADD('2009-01-30', INTERVAL 1 DAY) AS day UNION ALL
SELECT DATE_ADD('2009-01-30', INTERVAL 2 DAY) AS day UNION ALL
SELECT DATE_ADD('2009-01-30', INTERVAL 3 DAY) AS day UNION ALL
......
SELECT DATE_ADD('2009-01-30', INTERVAL 50 DAY) AS day ) AS dates
LEFT JOIN (
SELECT DATE_FORMAT(date, '%Y-%m-%d') AS date, SUM(data) AS data
FROM test
GROUP BY date
) AS results
ON DATE_FORMAT(dates.day, '%Y-%m-%d') = results.date;
You definitely shouldn't be doing a query inside a loop.
You can group like this:
SELECT COUNT(*) AS count, SUM(important_data) AS important_data, FROM_UNIXTIME(date, '%Y-%m') AS month -- date is an INTEGER Unix timestamp
FROM my_table
WHERE date BETWEEN ? AND ? -- This should be the min and max of the whole range
GROUP BY month
ORDER BY month DESC;
Then pull these into an array keyed by date and loop over your date range as you are doing (that loop should be pretty light on CPU).
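The "array keyed by date" step might look roughly like this. It is a sketch in Python against an in-memory SQLite database, not the PHP/MySQL code itself: one grouped query, a dictionary keyed by month, and a cheap loop over the month range that fills in zeros for months with no rows. The sample rows, the `ts` helper and the four-month range are made up for the demo; SQLite's `strftime(..., 'unixepoch')` stands in for MySQL's FROM_UNIXTIME.

```python
import sqlite3
from datetime import datetime, timezone


def ts(year, month, day):
    """Unix timestamp (UTC midnight) for a calendar date; demo helper."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE my_table (
        id INTEGER PRIMARY KEY,
        important_data INTEGER,
        date INTEGER
    )
""")
conn.executemany(
    "INSERT INTO my_table (important_data, date) VALUES (?, ?)",
    [
        (10002, ts(2009, 10, 21)),
        (15002, ts(2009, 10, 22)),
        (300, ts(2009, 12, 23)),  # nothing in November or January
    ],
)

# One query for the whole range, grouped by month.
query = """
    SELECT strftime('%Y-%m', date, 'unixepoch') AS month,
           COUNT(*), SUM(important_data)
    FROM my_table
    WHERE date >= ? AND date < ?
    GROUP BY month
"""
grouped = {
    month: (count, total)
    for month, count, total in conn.execute(query, (ts(2009, 10, 1), ts(2010, 2, 1)))
}

# Walk the month range without touching the database again.
months = []
year, month = 2009, 10
for _ in range(4):
    months.append(f"{year:04d}-{month:02d}")
    month += 1
    if month > 12:
        month, year = 1, year + 1

# Months missing from the query result get a count and sum of 0.
report = [(m, *grouped.get(m, (0, 0))) for m in months]
print(report)
```

The same shape carries over to PHP: fetch the grouped rows into an associative array keyed by the month string, then loop over the range and default missing keys to 0.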
Another idea is to avoid comparing strings inside the query: convert the string parameter to a datetime on the MySQL side with:
STR_TO_DATE(str,format)
http://dev.mysql.com/doc/refman/5.0/en/date-and-time-functions.html