Writing for holes in ranges

I'm writing a program that relies on date ranges. I want to alert users when there are holes in their ranges, so I need a reliable way to find those holes and handle them effectively.
My solution was to rewrite every date inserted into the app so that it becomes noon on that day. Here is the code for that:
// Normalize a Unix timestamp to 12:00pm on the same calendar day.
public function reformDate($date){
    return strtotime(date("F j, Y", $date)." 12:00pm");
}
This gives me a more regular and consistent dataset, because I only have to see how many days apart two dates are, rather than seeing how many seconds apart they are and deciding whether that amount of time represents an intentional gap or not.
I saw, however, that when one entry ends at noon on some day and the next entry starts at noon on that same day, the two values are the same, and based on my restriction:
Select * from times where :date between start and end
It triggers a response. My solution for this was to add one to the start value and subtract one from the end value, so I can easily check whether there is an overlap by asking if the difference between the start of one and the end of another is more than 2.
Anyway, my question is: is this a good way to do this? I'm particularly worried about the number 2. Do I need to worry about using such small units of time (that is Unix time, by the way)? Alternatively, should I create a test so that if two time units overlap perfectly, they are accepted?
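For illustration only, the check described above could be sketched as follows. This assumes each range is a [start, end] pair of Unix timestamps that has already had one second added to the start and one subtracted from the end, and it treats a difference of more than 2 seconds between one range's end and the next range's start as a hole. The function name findGaps and the $tolerance parameter are made up for this sketch.

```php
<?php
// Hypothetical helper: report holes between date ranges.
// Each range is [start, end] in Unix time; $tolerance is the number of
// seconds two adjacent ranges may be apart without counting as a hole.
function findGaps(array $ranges, int $tolerance = 2): array
{
    // Sort by start so each range is compared with the coverage so far.
    usort($ranges, fn($a, $b) => $a[0] <=> $b[0]);

    $gaps = [];
    $coveredUntil = null;
    foreach ($ranges as [$start, $end]) {
        if ($coveredUntil !== null && $start - $coveredUntil > $tolerance) {
            $gaps[] = [$coveredUntil, $start];
        }
        // Track the furthest end seen, so nested ranges do not create false holes.
        $coveredUntil = $coveredUntil === null ? $end : max($coveredUntil, $end);
    }
    return $gaps;
}
```

With three ranges where the first two touch at noon and the third starts a day late, only the late one is reported as a hole.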

Related

PHP - Getting the next occurrence of a date between a period

I'm working with iCalendar events that use RRULE to handle repetitions.
I'm aware that there are some PHP classes such as When and RRules that handle RRULE, and I'm already using one to generate repeating events, but the problem is performance with long date ranges.
So I thought I could speed up generating repetitions by limiting the range (start and end) to the current calendar view, which is one of [ MONTH, WEEK, DAY ].
Assuming we have a repeating event like
FREQ=DAILY;INTERVAL=1;DTSTART:2009-01-01
what I do is change DTSTART to today's date and add an UNTIL date to limit the loop to a narrow range, and it works just fine. The problem comes with rules like these:
FREQ=WEEKLY;BYDAY=SU;DTSTART:2009-01-01
or
FREQ=WEEKLY;INTERVAL=5;DTSTART:2009-01-01
With this kind of rule my trick doesn't work, because the original start date doesn't match my hardcoded today date.
I have tried without luck to iterate using DateTime, DatePeriod and DateInterval, but I can't figure it out.
So what I'm asking for is a way to find when a given date will recur within my view range, which can be a MONTH, a WEEK, or a single DAY.
Thanks in advance, hope someone can help me. ;)

I know the question is quite old already, but I'm going to answer just in case.
There is no reliable way to alter the rule like you are trying to do. As you noticed, as soon as you have anything more than basic daily/weekly/monthly/yearly repetition, your trick doesn't work anymore. And you haven't even scratched the surface yet: things like BYSETPOS and COUNT are a real nightmare.
You only have two approaches:
Either loop through all occurrences, starting from DTSTART, ignoring anything that is before the start of your period, and stopping once you reach the end of the period
Or generate the full list of dates of your period, and test each one against the RRULE. This can be done with a process of elimination (example: your RRULE only occurs on Sunday and the date is not a Sunday? Then discard and move on). However, in the most complex cases, the only solution is to revert to option 1 and compute all the occurrences.
While you can code these yourself, I suggest using a library for that. I'm the author of php-rrule, and with the library you can use getOccurrencesBetween($begin, $end) (which implements option 1) and occursAt($date) (which implements option 2).
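To make option 1 concrete, here is a minimal sketch for the simplest kind of rule only (FREQ=WEEKLY with an INTERVAL and nothing else). It is not how php-rrule is implemented; occurrencesInWindow is a hypothetical helper built on PHP's standard DatePeriod, and it assumes all dates are midnight values in the same time zone.

```php
<?php
// Hypothetical helper for a rule like FREQ=WEEKLY;INTERVAL=n (no BYDAY, COUNT, etc.).
// Walks forward from DTSTART, ignores occurrences before the view range,
// and stops at the end of the view range.
function occurrencesInWindow(
    DateTimeImmutable $dtstart,
    int $intervalWeeks,
    DateTimeImmutable $windowStart,
    DateTimeImmutable $windowEnd
): array {
    $step = new DateInterval('P' . $intervalWeeks . 'W');

    // DatePeriod excludes its end date, so extend by one day
    // to keep the last day of the view range inclusive.
    $period = new DatePeriod($dtstart, $step, $windowEnd->modify('+1 day'));

    $found = [];
    foreach ($period as $occurrence) {
        if ($occurrence < $windowStart) {
            continue; // before the view range: skip but keep stepping
        }
        $found[] = $occurrence->format('Y-m-d');
    }
    return $found;
}
```

Because the loop always starts at the original DTSTART, the phase of the repetition is preserved, which is exactly what is lost when DTSTART is replaced with today's date.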

PHP/Oracle: Time representation

Disclaimer: I'm fully aware that the best way to represent date/times is either Unix timestamps or PHP's DateTime class and Oracle's DATE data type.
With that out of the way, I'm wondering what the most appropriate data types are (in PHP, as well as Oracle) for storing just time data. I'm not interested in storing a date component; only the time.
For example, say I had an employee entity, for which I wanted to store his/her typical work schedule. This employee might work 8:00am - 5:00pm. There are no date components to these times, so what should be used to store them and represent them?
Options I have considered:
1. As strings, with a standard format (likely 24-hour HH:MM:SS+Z).
2. As numbers in the range 0 <= n < 24, with fractional parts representing minutes/seconds (not able to store timezone info?).
3. As PHP DateTime and Oracle DATE with normalized/unused date component, such as 0001-01-01.
4. Same as above, only using Unix timestamps instead (PHP integer and Oracle TIMESTAMP).
Currently I'm using #3 above, but it sort of irks me that it seems like I'm misusing these data types. However, it provides the best usability as far as I can tell. Comparisons and sorts all work as expected in both PHP and Unix, timezone data can be maintained, and there's not really any special manipulation needed for displaying the data.
Am I overlooking anything, or is there a more appropriate way?

If you don't need the date at all, then what you need is the interval day data type. I haven't had the need to actually use that, but the following should work:
interval day(0) to second(6)

The option you use (3) is the best one.
Oracle has the following types for storing times and dates:
date
timestamp (with (local) time zone)
interval year to month
interval day to second
Interval data types are not an option for you, because you care about when you start and when you finish. You could possibly use one date and one interval, but this just seems inconsistent to me, as you still have one "incorrect" date.
All the other options you mentioned need more work on your side and probably also lead to decreased performance compared to the native date type.
More information on Oracle date types: http://docs.oracle.com/cd/B19306_01/server.102/b14225/ch4datetime.htm#i1005946

I think that the most correct answer to this question is totally dependent on what you are planning to do with the data. If you are planning to do all your work in PHP and nothing in the database, the best way to store the data will be whatever makes it easiest to get the data in a format that assists you with what you are doing in PHP. That might indeed be storing them as strings. That may sound ghastly to a DBA, but you have to remember that the database is there to serve your application. On the other hand, if you are doing a lot of comparisons in the database with some fancy queries, be sure to store everything in the database in a format that makes your queries the most efficient.
If you are doing tasks like heavy loads of calculating hours worked, converting to a decimal format may make the calculations easier before a final conversion back to hours:minutes. A simple function can convert in both directions: when you get data from the database, convert it to decimal, do all your calculations, then run it back through to convert it into a time format.
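As a rough sketch of that round trip, assuming times arrive as "H:MM" or "HH:MM" strings, the two conversions might look like this. The function names timeToDecimal and decimalToTime are made up for the example.

```php
<?php
// Hypothetical helpers: "HH:MM" string <-> decimal hours.
function timeToDecimal(string $time): float
{
    [$hours, $minutes] = array_map('intval', explode(':', $time));
    return $hours + $minutes / 60;
}

function decimalToTime(float $decimal): string
{
    // Round to whole minutes to avoid float noise such as 2.4999999.
    $totalMinutes = (int) round($decimal * 60);
    return sprintf('%d:%02d', intdiv($totalMinutes, 60), $totalMinutes % 60);
}

// Hours worked for an 8:00 to 17:00 day, computed as plain numbers.
$worked = timeToDecimal('17:00') - timeToDecimal('8:00');
```

Here $worked is 9.0, and decimalToTime(2.5) gives "2:30".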
Using Unix timestamps is handy when you are calculating dates, but probably not so much when you are calculating times. While there seem to be some positives to this, such as very easily adding a timestamp to a timestamp, I have found that having to convert everything into timestamps to do calculations is pesky and annoying, so I would steer clear of this scenario.
So, to sum up:
If you want to easily store, but not manipulate data, strings can be an effective method. They are easy to read and verify. For anything else, choose something else.
Calculating as numbers makes for super easy calculations. Convert the time/date to a decimal, do all your heavy lifting, then revert to a real time format and store.
Both PHP's DateTime and Oracle's DATE are handy, and there are some fantastic functions built into Oracle and PHP to manipulate the data, but even the best functions can be more difficult than adding some decimals together. I think that storing the data in the database in a date format is probably a safe idea, especially if you want to do calculations based on the columns within a query. What you plan to do with them inside PHP will determine how you use them.
I would rule option four out right off the bat.
Edit: I just had an interesting chat with a friend about time types. Another thing you should be aware of is that sometimes time based objects can cause more problems than they solve. He was looking into an application where we track delivery dates and times. The data was in fact stored in datetime objects, but here is the catch: truck delivery times are set for a particular day and a delivery window. An acceptable delivery is either on time, or up to an hour after the time. This caused some havoc when a truck was to arrive at 11:30pm and turned up 45 minutes later. While still within the acceptable window, it was showing up as being the next day. Another issue was at a distribution center which actually works on a 4:00AM starting 24 hour day. Setting up times worked for the staff - and consolidating it to payments revolving around a normal date proved quite a headache.

user generated / user specific functions

I'm looking for the most elegant and secure method to do the following.
I have a calendar, and groups of users.
Users can add events to specific days on the calendar, and specify how long each event lasts for.
I've had a few requests from users to add the ability for them to define that events of a specific length include a break, of a certain amount of time, or require that a specific amount of time be left between events.
For example: if an event is >2 hours, include a 20 minute break. For each event, require 30 minutes before the start of the next event.
The same group that has asked for an event of >2 hours to include a 20 min break, could also require that an event >3 hours include a 30 minute break.
In the end, what the users are trying to get is an elapsed time excluding breaks calculated for them. Currently I provide them a total elapsed time, but they are looking for a running time.
However, each of these requests is different for each group. Where one group may want a 30 minute break during a 2 hour event, and another may want only 10 minutes for each 3 hour event.
I was kinda thinking I could write the functions into a php file per group, and then include that file and do the calculations via php and then return a calculated total to the user, but something about that doesn't sit right with me.
Another option is to output the group's functions to JavaScript and have them run client-side, as I'm already returning the duration of the event, but where the user is part of more than one group with different rules, this seems like it could get rather messy.
I currently store the start and end time in the database, but no 'durations', and I don't think I should be storing the calculated totals in the db, because if a group decides to change their calculations, I'd need to change it throughout the db.
Is there a better way of doing this?
I would just store the variables in MySQL, but I don't see how I can then tell MySQL to calculate based on those variables.
I'm REALLY lost here. Any suggestions? I'm hoping somebody has done something similar and can provide some insight into the best direction.
If it helps, my table contains
eventid, user, group, startDate, startTime, endDate, endTime, type
The json for the event which I return to the user is
{"eventid":"'.$eventId.'", "user":"'.$userId.'","group":"'.$groupId.'","type":"'.$type.'","startDate":"'.$startDate.'","startTime":"'.$startTime.'","endDate":"'.$endDate.'","endTime":"'.$endTime.'","durationLength":"'.$duration.'", "durationHrs":"'.$durationHrs.'"}
where for example, duration length is 2.5 and duration hours is 2:30.
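One way to picture "storing the variables" rather than per-group PHP files is to keep each group's rules as plain rows and do the arithmetic in PHP after loading them. The sketch below is only an illustration: the rule shape (minHours, breakMinutes), the function name runningMinutes, and the choice to apply the largest matching break are all assumptions, not something the schema above already has.

```php
<?php
// Hypothetical rule rows for one group, e.g. loaded from a rules table:
// "events longer than minHours include breakMinutes of break".
// Assumption: when several rules match, the largest break applies.
function runningMinutes(int $elapsedMinutes, array $rules): int
{
    $break = 0;
    foreach ($rules as $rule) {
        if ($elapsedMinutes > $rule['minHours'] * 60) {
            $break = max($break, $rule['breakMinutes']);
        }
    }
    return $elapsedMinutes - $break;
}

$groupRules = [
    ['minHours' => 2, 'breakMinutes' => 20],
    ['minHours' => 3, 'breakMinutes' => 30],
];
```

Because only start and end times are stored, changing a group's rule rows changes every calculated running time without touching the event records.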

Store only the start time and end time for the event, and a BLOB field named notes.
I've worked on several systems that suffered from feature creep of these sorts of requirements until the code and data modeling became nothing but an unmaintainable collection of exception cases. It was a lot of work to add new permutations to the code, and typically these cases were used only once.
If you need enforcement of the rules and conditions described in the notes field, it's actually more cost-effective to hire an event coordinator instead of trying to automate everything in software. A detail-oriented human can adapt to the exception cases much more rapidly than you can adapt the code to handle them.

Curve-fitting in PHP

I have a MySQL table called today_stats. It has Id, date and clicks columns. I'm trying to create a script that reads the values and tries to predict the clicks for the next 7 days. How can I predict this in PHP?

Different types of curve fitting described here:
http://en.wikipedia.org/wiki/Curve_fitting
Also: http://www.qub.buffalo.edu/wiki/index.php/Curve_Fitting

This has less to do with PHP, and more to do with math. The simplest way to calculate something like this is to take the average traffic for a given day over the past X weeks. You don't want to pull all the data, because fads and page content change.
So, for example, get the average traffic for each day over the last month. You'll be able to tell how accurate your estimates are by comparing them to actual traffic. If they aren't accurate at all, then try playing with the calculation (ex., change the time period you're sampling from). Or maybe it's a good thing that your estimate is off: your site was just featured on the front page of the New York Times!
Cheers.
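A minimal sketch of that averaging, assuming the rows from today_stats have already been fetched into an array that maps a 'Y-m-d' date string to its click count. The function name weekdayAverages is made up for the example.

```php
<?php
// Hypothetical helper: average clicks per weekday over the sampled period.
// $history maps 'Y-m-d' => clicks, e.g. the last month of today_stats rows.
function weekdayAverages(array $history): array
{
    $sums = [];
    $counts = [];
    foreach ($history as $date => $clicks) {
        $weekday = (new DateTimeImmutable($date))->format('D'); // Mon, Tue, ...
        $sums[$weekday] = ($sums[$weekday] ?? 0) + $clicks;
        $counts[$weekday] = ($counts[$weekday] ?? 0) + 1;
    }

    $averages = [];
    foreach ($sums as $weekday => $sum) {
        $averages[$weekday] = $sum / $counts[$weekday];
    }
    return $averages;
}
```

The estimate for each of the next 7 days is then simply the average for that day's weekday, and shortening or lengthening the sampled period is the knob to adjust when the estimates drift from actual traffic.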

The algorithm you are looking for is called Least Squares.
What you need to do is minimize the summed distances from each point to the function you will use to predict future values. So that each distance is always positive, the calculation uses the square of the difference rather than its absolute value. The sum of the squares of the differences has to be a minimum. By defining the function that makes up that sum, differentiating it with respect to each parameter, and solving the resulting equations, you will find the parameters for your function that are CLOSEST to the statistical values from the past.
Programs like Excel (maybe OpenOffice Spreadsheet too) have a built-in function that does this for you, using polynomial functions to define the dependence.
Basically you should take Time as the independent variable, and all the others as dependent variables.
This is called econometrics, because it's widespread in economics. This way, if you have a lot of statistical data from the past, the prediction for the next day will be quite accurate (you will also be able to determine the confidence interval, that is, the possible error that may occur). The predictions for the following days will be less and less accurate.
If you make different models for each day of week, include holidays and special days as variables, you will get a much higher precision.
This is the only RIGHT way to mathematically forecast future values. But from all this a question arises: Is it really worth it?
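For the simplest case, a straight line rather than a general polynomial, the least-squares parameters have a closed form, and a sketch in PHP could look like this. The function name linearFit and the sample numbers are illustrative only, and the code assumes the x values are not all identical.

```php
<?php
// Ordinary least squares for a straight line y = slope * x + intercept.
// Assumes count($xs) === count($ys) and at least two distinct x values.
function linearFit(array $xs, array $ys): array
{
    $n = count($xs);
    $meanX = array_sum($xs) / $n;
    $meanY = array_sum($ys) / $n;

    $sxy = 0.0; // sum of (x - meanX) * (y - meanY)
    $sxx = 0.0; // sum of (x - meanX)^2
    foreach ($xs as $i => $x) {
        $sxy += ($x - $meanX) * ($ys[$i] - $meanY);
        $sxx += ($x - $meanX) ** 2;
    }

    $slope = $sxy / $sxx;
    return [$slope, $meanY - $slope * $meanX];
}

// Day index is the independent variable, clicks the dependent one.
[$slope, $intercept] = linearFit([1, 2, 3, 4], [10, 12, 14, 16]);
$forecastDay5 = $slope * 5 + $intercept;
```

For this made-up data the fitted line is y = 2x + 8, so the forecast for day 5 is 18.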

Start off by connecting to the database and then retrieving the data for x days previously.
Then you could attempt to make a line of best fit for the previous days and then just use that and extend into the future. But depending on the application, a line of best fit isn't going to be good enough.

A simple approach would be to group by day and average each value. This can all be done in SQL.

How to handle dates that repeat indefinitely

I am implementing a fairly simple calendar on a website using PHP and MySQL. I want to be able to handle dates that repeat indefinitely and am not sure of the best way to do it.
For a time-limited repeating event it seems to make sense to just add each occurrence within the timeframe to my db table and group them with some form of recurrence id.
But when there is no limit to how often the event repeats, is it better to
a) put records in the db for a specific time frame (eg the next 2 years) and then periodically check and add new records as time goes by - The problem with this is that if someone is looking 3 years ahead, the event won't show up
b) not actually have records for each occurrence, but instead, when I check in my PHP code for events within a specified time period, calculate whether a repeated event will occur within this time period - The problem with this is that it means there isn't a specific record for each event, which I can see being a pain when I then want to associate other info (attendance etc) with that event. It also seems like it might be a bit slow
Has anyone tried either of these methods? If so, how did it work out? Or is there some other ingenious crafty method I'm missing?

I'd take approach b and if someone adds something to it, I'd create a "real" event entry.
Edit:
How many periodic events do you expect and what kind of periodic events would that be? (eg: every monday, every two weeks etc.)

I would create a single record for a repeated event. Then in case more info has to be added to a specific date, I would create a record for the attachment with a reference to the repeated event.

Third vote for option B - rationale being that the data should only ever be queried for a limited timeframe (i.e. start and end). For performance reasons I'd suggest that, in addition to storing the date/time of the first occurrence, number of occurrences and frequency that you also maintain the last occurrence in the database.
C.

From my experience, generating recurring dates and checking if a specific date is in that pattern isn't all that bad performance-wise. There are only 365 days in a year, and 10,000 days is already almost 30 years, which means the size of the input/output is relatively small in a practical scenario.
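As a rough illustration of how cheap such a check can be for the simplest pattern ("every N days"), here is a sketch that needs no row per occurrence. The function name occursOn is made up, and the code assumes both dates are midnight values in the same time zone.

```php
<?php
// Hypothetical check for option (b): does an event that repeats every
// $everyDays days, first held on $first, fall on $date?
function occursOn(DateTimeImmutable $first, int $everyDays, DateTimeImmutable $date): bool
{
    if ($date < $first) {
        return false; // nothing happens before the first occurrence
    }
    // diff()->days is the total number of whole days between the two dates.
    $daysApart = $first->diff($date)->days;
    return $daysApart % $everyDays === 0;
}
```

Checking every day of a month-long view against a fortnightly event is then about thirty calls to this function, and a "real" record only needs to be created when someone attaches extra info to a particular date.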
This library may help (but it's javascript): http://github.com/mooman/recurring_dates
