Event entries scheduling algorithm in PHP

We have an entry portal system in which we accept entries for events.
For example, Championship Event 2017 will be held on 30th Nov.
The event got about 150 entries across different classes: Junior Class, Senior Class, Pro Class, etc.
Now, the venue only has a certain number of grounds on which the competition can be held, for example Ground 1, Ground 2 and Ground 3. It's a solo performance event.
Our system needs to generate a schedule in such a way that competitors who entered multiple classes, or the same class multiple times, get the maximum break between their performances.
The input data we have are the registrations under each class.
We know the starting time of each ground, for example Ground 1 starts at 8:00 AM, Ground 2 at 8:00 AM and Ground 3 at 9:00 AM.
We also know which class will be held on which ground. For example, the Junior and Senior Classes will be held on Ground 1 and the Pro Class on Ground 2.
We know the performance times as well: a Senior Class performance is 5 minutes, a Junior Class performance is 7 minutes and a Pro performance is 9 minutes.
Now I have written the following code to generate the schedule so that competitors competing multiple times in one class, or in multiple classes, get the maximum break between their performances, but it still puts the same competitor's performances one after another.
Let me know what my mistake is.
foreach ($totalPerformanceTimeSlot as $time => $performance) {
    # $totalPerformanceTimeSlot is an array of time slots starting from 8:00 am
    foreach ($performance as $classId) {
        # there could be 2 performances at the same time in different arenas for different classes
        $totalPerformanceLeftThisClass = count($this->classRegistrationLinks[$classId]); // total performances left for this class
        # $accountPerformanceLeftArray holds how many times each account performs in this class
        arsort($accountPerformanceLeftArray);
        # for each person, estimate what their start time threshold should be based on how many times they're performing
        $accountPerformanceTimeThreshold = array();
        foreach ($accountPerformanceLeftArray as $accountId => $accountPerformancesLeft) {
            $tempPerformanceThreshold = 20 * 60;
            # reduce this person's threshold one performance at a time until the minimum threshold has been met
            while ((($totalPerformanceLeftThisClass * $this->classes[$classId]['performanceTime']) / $accountPerformancesLeft < $tempPerformanceThreshold)
                    && ($tempPerformanceThreshold > $this->minPerformanceThreshold)) {
                $tempPerformanceThreshold -= $this->classes[$classId]['performanceTime'];
            }
            $accountPerformanceTimeThreshold[$accountId] = $tempPerformanceThreshold;
        }
        $performanceLeft = $totalPerformanceLeftThisClass - $count;
        # given the number of performances left in the class,
        # calculate how important it is per account that they get placed in the next slot
        $accountToPerformNextImportanceArray = array();
        $timeLeft = $performanceLeft * $this->classes[$classId]['performanceTime'];
        foreach ($accountPerformanceLeftArray as $accountId => $accountPerformancesLeft) {
            # work out the maximum number that can be used as entropy
            $entropyMax = (20 * 60 / ($timeLeft / 1)) * 0.5;
            $entropy = mt_rand(0, $entropyMax * 1000) / 1000;
            # the absolute minimum amount of time required for this user to perform
            $minTimeRequiredForComfortableSpacing = ($accountPerformancesLeft - 1) * 20 * 60;
            # pad the absolute minimum so it doesn't instantly snap in when this person
            # suddenly has only the minimum amount of time left to perform
            $generalTimeRequiredForComfortableSpacing = $minTimeRequiredForComfortableSpacing * 1.7;
            $nearestPerformancePrior = $this->nearest_performance_prior($classDetails['date'], $currentTime, $accountId);
            $nearestPerformanceAfter = $this->nearest_performance_after($classDetails['date'], $currentTime, $accountId);
            # work out how important it is for this account to perform next, based on how many performances they have left
            $importanceRating = 20 * 60 / ($timeLeft / $accountPerformancesLeft);
            # if there's more than enough time left then don't give this person any importance rating,
            # i.e. it's not really important that they perform straight away
            if ($timeLeft > $generalTimeRequiredForComfortableSpacing)
                $importanceRating = 0;
            # add a little bit of random entropy to their importance rating
            $importanceRating += $entropy;
            # if this account has performed too recently to be placed in this slot, make them very undesirable for it
            if ((!is_null($nearestPerformancePrior)) && ($nearestPerformancePrior > $currentTime - $accountPerformanceTimeThreshold[$accountId]))
                $importanceRating = -1;
            # likewise if this account will perform too soon afterwards
            if ((!is_null($nearestPerformanceAfter)) && ($nearestPerformanceAfter < $currentTime + $accountPerformanceTimeThreshold[$accountId]))
                $importanceRating = -1;
            $accountToPerformNextImportanceArray[$accountId] = $importanceRating;
        }
        arsort($accountToPerformNextImportanceArray);
        // Then I take the first one from this array and allocate the time for that user.
        $this->set_performance_time($classDetails['date'], $accountId, $currentTime);
        $currentTime += $this->classes[$classId]['performanceTime'];
    }
}
Here is some explanation of the variables:
$accountPerformancesLeft is the total number of performances left for each user. For example, a user who has entered 2 classes might have $accountPerformancesLeft of 6, since the same class can be entered multiple times.
"Threshold" is essentially the break time.
"Rider" and "account" are conceptually the same.
I know it is hard to picture the output without the actual data, but any help would be appreciated.
Thank you

Well, first let's see what we have and simplify the problem:
There are different competitions (events), but since they are independent of each other we can consider only one.
We have C different classes (senior, junior, ...).
We have G different grounds, and each ground may hold some of the C classes.
There are some persons (competitors), say P, who register for the C classes.
Persons need to have the maximum possible break.
So, putting it all together, the problem is:
There are some grounds G = {g1, g2, ..., gm}, each of which contains some persons from P = {p1, p2, ..., pn}. We want to maximize the break time of each person across all of their competitions.
The trivial case:
First, let's assume there is only one ground g1 and a group of persons P = {p1, p2, ..., pn} who want to compete on it. Let's define a boolean method isItPossible(breaktime) that tells whether it is possible to schedule the competition so that each person has at least breaktime to rest. We can easily show that this method is monotonic, i.e. if there exists a breaktime for which isItPossible(breaktime) is true, then:
isItPossible(t) = true for every t <= breaktime
So we can use binary search to find the maximum value of breaktime. Here is the pseudo code (C++-like syntax):
double low = 0, high = INF, eps = 1e-6;
while (high - low > eps) {   // terminate once the bounds are close enough
    double mid = (low + high) / 2;
    if (isItPossible(mid))
        low = mid;
    else
        high = mid;
}
breakTime = low;
Now the only thing that remains is implementing the isItPossible(breaktime) method. There are a lot of ways to implement it, but I use a greedy algorithm and a heap-based priority queue. We need the priority queue to maintain tuples; each tuple contains a person, the number of times that person still has to compete, and the earliest time we can schedule a competition for that person. We start from time t0 (the opening time of the ground, e.g. 8:00 a.m.) and each time pick the person from the priority queue with the minimum earliest time. Here is the C++-like pseudo code:
bool isItPossible(double breaktime){
    // Tuple(personId, numberOfCompetes, earliestTime), min-heap ordered by earliestTime
    priority_queue<Tuple> pq;
    for p in Person_list
        pq.push(Tuple(p, countCompetitions(p), t0));
    for (time = t0; !pq.isEmpty() && time < end_of_ground_time;) {
        person = pq.pop();
        add_person_to_schedule_list(person.personId, max(time, person.earliestTime));
        time = max(time, person.earliestTime) + competition_time;
        if (person.numberOfCompetes > 1)
            pq.push(Tuple(person.personId, person.numberOfCompetes - 1, time + breaktime));
    }
    return pq.isEmpty();
}
The main problem:
Having solved the trivial case, we are ready for the original problem. Here there are grounds G = {g1, g2, ..., gm} and we want to schedule competitors P = {p1, p2, ..., pn}. As in the trivial case we define an isItPossible(breaktime) function; again we can prove that this function is monotonic, so we use binary search to find the maximum value (like the code above). After that we only need to implement the isItPossible(breaktime) method, and in this case implementing it is a little tricky.
For this method you can use heuristic algorithms or some creative greedy ones (for example, distribute each person's start times over all grounds based on breakTime and check whether that is possible for all persons). But again I suggest using a greedy algorithm and a priority queue, as in the trivial case. Your tuple should also contain the number of times the person competes on each ground, and when you advance time and sweep, you should iterate over all grounds and schedule them simultaneously. A rough PHP sketch of this idea follows.
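To make this concrete, here is a rough PHP sketch of the multi-ground variant, under several assumptions of mine: $grounds maps a ground id to its start time, end time and per-performance duration (in seconds), and $entries maps each person to how many performances they have left on each ground. It is one possible greedy, not a proven-optimal one.

// Feasibility check: can everyone be scheduled with at least $breakTime rest?
function isItPossible(array $grounds, array $entries, float $breakTime): bool {
    $remaining = $entries;  // $remaining[$personId][$groundId] = performances left
    $nextFree  = array_fill_keys(array_keys($entries), 0.0); // earliest time each person may perform again
    $clock     = [];
    foreach ($grounds as $gid => $g) {
        $clock[$gid] = (float)$g['start'];
    }
    while (true) {
        // Pick the ground with the lowest clock that still has work left.
        $gid = null;
        foreach ($grounds as $g => $unused) {
            $hasWork = false;
            foreach ($remaining as $perGround) {
                if (!empty($perGround[$g])) { $hasWork = true; break; }
            }
            if ($hasWork && ($gid === null || $clock[$g] < $clock[$gid])) {
                $gid = $g;
            }
        }
        if ($gid === null) {
            return true;  // everything scheduled
        }
        // Among people with performances left here, pick the one who is free soonest.
        $pid = null;
        foreach ($remaining as $p => $perGround) {
            if (!empty($perGround[$gid]) && ($pid === null || $nextFree[$p] < $nextFree[$pid])) {
                $pid = $p;
            }
        }
        $start = max($clock[$gid], $nextFree[$pid]);
        if ($start + $grounds[$gid]['perfTime'] > $grounds[$gid]['end']) {
            return false;  // this ground runs out of time
        }
        $clock[$gid]    = $start + $grounds[$gid]['perfTime'];
        $nextFree[$pid] = $clock[$gid] + $breakTime;
        if (--$remaining[$pid][$gid] === 0) {
            unset($remaining[$pid][$gid]);
        }
    }
}

// Binary search driver, mirroring the pseudo code above.
function maxBreakTime(array $grounds, array $entries): float {
    $low = 0.0;
    $high = 24 * 3600.0;               // one day as an upper bound
    for ($i = 0; $i < 40; $i++) {      // fixed iteration count instead of an epsilon test
        $mid = ($low + $high) / 2;
        if (isItPossible($grounds, $entries, $mid)) {
            $low = $mid;
        } else {
            $high = $mid;
        }
    }
    return $low;
}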
Hope this helps. Of course there are evolutionary approaches such as genetic algorithms or PSO that can solve this (I can explain them too if you want), but the method above is much simpler to implement and debug.

What an interesting problem!
Here is how I'd tackle it:
Set up a random schedule which works (but doesn't fit the criteria).
Write a function that can swap 2 performances.
Write a shuffler which uses swap() many times in order to get a new timetable.
Write a score() function: how good is this particular schedule? Does it have a lot of breaks between performances?
The score should sum all the performance gaps together; this is the function we want to maximise.
Write an algorithm that takes a search-and-backtrack approach to the problem, and let it run for a couple of hours. The backtracking should:
Swap stuff
See if the swapped stuff has a better score
if so, continue from swapped
otherwise, backtrack
It can take a while, but the program can generate a better timetable. A rough sketch follows.
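A minimal PHP sketch of the swap/score loop, under assumptions of mine: $schedule is a flat array of slots shaped like [startTime, groundId, performerId], and validity checks (same class, no double booking) are left out and must be added for real data.

// Swap the performers of two slots (no validity checks in this sketch).
function swapPerformances(array $schedule, int $i, int $j): array {
    [$schedule[$i][2], $schedule[$j][2]] = [$schedule[$j][2], $schedule[$i][2]];
    return $schedule;
}

// Score a schedule: the sum of the gaps between each performer's consecutive slots.
function score(array $schedule): int {
    $starts = [];
    foreach ($schedule as [$time, $ground, $performer]) {
        $starts[$performer][] = $time;
    }
    $total = 0;
    foreach ($starts as $times) {
        sort($times);
        for ($k = 1; $k < count($times); $k++) {
            $total += $times[$k] - $times[$k - 1];
        }
    }
    return $total;
}

// Keep swapping; keep a swap only if it improves the score.
function improve(array $schedule, int $iterations = 100000): array {
    $best = score($schedule);
    $n = count($schedule);
    for ($step = 0; $step < $iterations; $step++) {
        $candidate = swapPerformances($schedule, mt_rand(0, $n - 1), mt_rand(0, $n - 1));
        $s = score($candidate);
        if ($s > $best) {       // better spread: continue from the swapped schedule
            $schedule = $candidate;
            $best = $s;
        }                       // otherwise "backtrack" by simply discarding the swap
    }
    return $schedule;
}

One design note: the sum of consecutive gaps per performer telescopes to (last start - first start), so if you want genuinely even spacing, maximizing the minimum gap may match "maximum break" more closely.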
Let us know if this approach helps.

Related

time decay factor for posts / updates in newsfeed using neo4j

I am using neo4j to retrieve a news feed using this query:
MATCH (u:Users {user_id:140}),(p:Posts)-[:CREATED_BY]->(pu:Users)
WHERE (p)-[:CREATED_BY]->(u) OR (p:PUBLIC AND (u)-[:FOLLOW]->(pu)) OR
(p:PRIVATE AND (p)-[:SHARED_WITH]->(u))
OPTIONAL MATCH (p)-[:POST_MEDIA]->(f)
OPTIONAL MATCH (p)-[:COMMENT]->(c)<-[:COMMENT]-(u3) RETURN
(p.meta_score+0.2*p.likes+0.1*p.dislikes + 10/(((".time()."-
p.created_time)/3600)+0.1)) as score,
{user_id:pu.user_id,firstname:pu.firstname,lastname:pu.lastname,
profile_photo:pu.profile_photo,username:pu.username} as pu, p,
collect({user_id:u3.user_id,profile_photo:u3.profile_photo,text:c.text}) as comment,
collect(f) as file ORDER BY score DESC,
p.post_id DESC LIMIT 25
In this equation for getting the score, I am currently using mainly this expression: p.meta_score + 0.1*p.likes - 0.05*p.dislikes + 10/(((current_time - p.created_time)/3600) + 0.1) as score. Here I have added the 0.1 to prevent an infinity error, as current_time may be nearly equal to the post's created_time (p refers to the Posts class).
This works nicely for a single day, but after a day the time part no longer contributes well to the total score; the way I am calculating the time decay factor is not consistent. I need an equation that plays its role consistently, i.e. decreases the score at a lesser rate for the first seven days and then starts decreasing its contribution at a higher rate. One option was trigonometry's tan or cot functions, but the problem is that after some interval they change sign. I shall be thankful to everybody who gives me further suggestions.
At a basic level, it is common to use a decaying time penalty here. Something like:
score = score / elapsedTime^2
As the elapsed time since the post increases, the score falls off with the square of the elapsed time (a power-law decay). Sites like Reddit and Hacker News use much more complicated algorithms, but that is the basic idea.
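For the asker's "slow for the first seven days, then faster" requirement specifically, one shape that behaves this way is a logistic curve. A sketch in PHP; the pivot and steepness constants are assumptions to tune:

// A decay weight that stays near 1.0 for roughly the first seven days,
// then drops off quickly; multiply it into the time part of the score.
function decayWeight(int $createdTime, int $now): float {
    $ageDays   = ($now - $createdTime) / 86400;
    $pivotDays = 7.0;   // where the fast decay kicks in
    $steepness = 1.0;   // larger = sharper drop after the pivot
    return 1.0 / (1.0 + exp($steepness * ($ageDays - $pivotDays)));
}

// e.g. score = meta_score + vote terms + 10 * decayWeight($createdTime, time())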

Create fixed length non-repeating permutation within certain ranges in PHP

I've got a table with 1000 recipes in it; each recipe has calories, protein, carbs and fat values associated with it.
I need to figure out an algorithm in PHP that will allow me to specify value ranges for calories, protein, carbs and fat as well as dictating the number of recipes in each permutation. Something like:
getPermutations($recipes, $lowCal, $highCal, $lowProt, $highProt, $lowCarb, $highCarb, $lowFat, $highFat, $countRecipes)
The end goal is allowing a user to input their calorie/protein/carb/fat goals for the day (as a range, 1500-1600 calories for example), as well as how many meals they would like to eat (count of recipes in each set) and returning all the different meal combinations that fit their goals.
I've tried this previously by populating a table with every possible combination (see: Best way to create Combination of records (Order does not matter, no repetition allowed) in mySQL tables ) and querying it with the range limits, however that proved not to be efficient as I end up with billions of records to scan through and it takes an indefinite amount of time.
I've found some permutation algorithms that are close to what I need, but they don't have the value range constraint for calories/protein/carbs/fat that I'm looking for (see: Create fixed length non-repeating permutation of larger set). I'm at a loss at this point when it comes to this type of logic/math, so any help is MUCH appreciated.
Based on some comment clarification, I can suggest one way to go about it. Specifically, this is my "try the simplest thing that could possibly work" approach to a problem that is potentially quite tricky.
First, the tricky part is that the sum of all meals has to be in a certain range, and SQL has no built-in feature I'm aware of that does specifically what you want in one pass; that's OK, though, as we can implement this functionality in PHP instead.
So let's say you request 5 meals that will total 2000 calories; we leave the other variables aside for simplicity, but they work the same way. We calculate that the 'average' meal is 2000/5 = 400 calories, but obviously any one meal could be over or under that amount. I'm no dietician, but I assume you'll want no meal that takes up more than 1.25x-2x the average meal size, so we can restrict our initial query to this amount.
$maxCalPerMeal = ($highCal / $countRecipes) * 1.5;
$mealPlanCaloriesRemaining = $highCal; # more on this one in a minute
We then request 1 random meal which is less than $maxCalPerMeal and 'save' it as our first meal. We subtract its actual calorie count from $mealPlanCaloriesRemaining. We then recalculate:
$maxCalPerMeal = ($highCal / $countRecipesRemaining) * 1.5; # 1.5 being a maximum deviation-from-average multiple
Now the next query will ask for a random meal that is less than both $maxCalPerMeal AND $mealPlanCaloriesRemaining, AND NOT one of the meals you already have saved in this particular meal plan option (thus ensuring unique meals; no mac'n'cheese for breakfast, lunch, and dinner!). We update the variables as in the last query, and repeat until you reach the end. For the last meal requested we don't care about the average and its associated multiple; thanks to the compound query you'll get what you want anyway and don't need to complicate your control loops.
Assuming the worst case with the 5 meal 2000 calorie max diet:
Meal 1: 600 calories
Meal 2: 437
Meal 3: 381
Meal 4: 301
Meal 5: 281
Or something like that; in most cases you'll get something a bit nicer and more random, but in the worst case it still works! This actually just plain works for the usual case. Adding more maximums, for fat, protein, etc., is easy, so let's deal with the lows next.
All we need to do to support "minimum calories per day" is add another set of averages, as such:
$minCalPerMeal = ($lowCal / $countRecipes) * 0.5; # this time our multiplier is less than one: as we allow meals to be bigger than average, we must allow them to be smaller as well
And you restrict the query to being greater than this calculated minimum, recalculating with each loop, and happiness naturally ensues. A sketch of the whole loop follows.
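Pulling the above together, a rough PHP sketch of the loop. fetchRandomRecipe() is a hypothetical helper wrapping the compound query described, and the 1.5 and 0.5 multipliers are the ones suggested above:

// Hypothetical helper: one random recipe between $minCal and $maxCal,
// excluding ids already picked (assumes positive ids).
function fetchRandomRecipe(PDO $db, float $minCal, float $maxCal, array $excludeIds): ?array {
    $placeholders = $excludeIds ? implode(',', array_fill(0, count($excludeIds), '?')) : '0';
    $sql = "SELECT * FROM recipes WHERE calories BETWEEN ? AND ?
            AND id NOT IN ($placeholders) ORDER BY RAND() LIMIT 1";
    $stmt = $db->prepare($sql);
    $stmt->execute(array_merge([$minCal, $maxCal], $excludeIds));
    return $stmt->fetch(PDO::FETCH_ASSOC) ?: null;
}

function buildMealPlan(PDO $db, float $lowCal, float $highCal, int $countRecipes): ?array {
    $plan = [];
    $caloriesRemaining = $highCal;
    for ($left = $countRecipes; $left > 0; $left--) {
        $maxCalPerMeal = min(($caloriesRemaining / $left) * 1.5, $caloriesRemaining);
        $minCalPerMeal = ($lowCal / $countRecipes) * 0.5;
        $meal = fetchRandomRecipe($db, $minCalPerMeal, $maxCalPerMeal, array_column($plan, 'id'));
        if ($meal === null) {
            return null;  // degenerate case: the caller regenerates or relaxes (see below)
        }
        $plan[] = $meal;
        $caloriesRemaining -= $meal['calories'];
    }
    return $plan;
}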
Finally we must deal with the degenerate case: what if, using this method, you end up needing a meal that is too small or too big to fill the last slot? Well, you can handle this in a number of ways. Here's what I'd recommend.
The easiest is just returning fewer than the desired number of meals, but this might be unacceptable. You could also have special low-calorie meals that, due to the minimum average dietary content, would only be likely to be returned if someone really had to squeeze in a light meal to make the plan work. I rather like this solution.
The second easiest is to throw out the meal plan you have so far and regenerate from scratch; it might work this time, or it might not, so you'll need a control loop to make sure you don't get into an infinite work-intensive loop.
The least easy requires a max-iteration control loop again, but here you use a specific strategy to get a more acceptable meal plan: take the meal with the highest value exceeding your dietary limits, throw it out, and try pulling a smaller meal, perhaps one no greater than the newly calculated average. It might make the plan as a whole work, you might go over value elsewhere and be forced back into a loop that could be unresolvable, or it might just take a few dozen iterations to get one that works.
Though this sounds like a lot when written out, even a very slow computer should be able to churn out hundreds of thousands of suggested meal plans every few seconds without pausing. Your database will be under very little strain even if you have millions of recipes to choose from, and the meal plans you return will be as random as it gets. It would also be easy to ensure that multiple suggested meal plans are not duplicates, with a simple comparison and another call or two for an extra meal plan, without fear of noticeable delay!
By breaking things down to small steps with minimal mathematical overhead a daunting task becomes manageable - and you don't even need a degree in mathematics to figure it out :)
(As an aside, I think you have a very nice website built there, so no worries!)

Item rankings, order by confidence using Reddit Ranking Algorithms

I am interested in using this ranking class, based on an article by Evan Miller, to rank a table I have that has upvotes and downvotes. I have a system very similar to Stack Overflow's up/down voting for an events site I am working on, and by using this ranking class I feel the results will be more accurate. My question is: how do I order by the function 'hotness'?
private function _hotness($upvotes = 0, $downvotes = 0, $posted = 0) {
    $s = $this->_score($upvotes, $downvotes);
    $order = log(max(abs($s), 1), 10);
    if ($s > 0) {
        $sign = 1;
    } elseif ($s < 0) {
        $sign = -1;
    } else {
        $sign = 0;
    }
    $seconds = $posted - 1134028003;
    return round($order + (($sign * $seconds) / 45000), 7);
}
I suppose each time a user votes I could have a column in my table that has the hotness recalculated for the new vote, and order by that column on the main page. But I am interested in doing this more on the fly, incorporating the function above, and I am not sure if that is possible.
From Evan Miller, he uses:
SELECT widget_id, ((positive + 1.9208) / (positive + negative) -
1.96 * SQRT((positive * negative) / (positive + negative) + 0.9604) /
(positive + negative)) / (1 + 3.8416 / (positive + negative))
AS ci_lower_bound FROM widgets WHERE positive + negative > 0
ORDER BY ci_lower_bound DESC;
But I would rather not do this calculation in the SQL, as I feel it is messy and difficult to change down the line if I use this code on multiple pages, etc.
Accessing the corresponding Posts table for anything (reading, writing, sorting, comparing, etc.) is extremely quick, so relying on the database is the "most on-the-fly" alternative you have for non-temporary data storage (memory/sessions are quicker still but, logically, cannot be used to store this information).
You should be more worried about building a good ranking algorithm that delivers the results you want (you are proposing two different systems that deliver different results) and about making the whole code, and the code-database communication, as efficient as possible.
In principle, small pieces of code issuing simple queries offer the quickest and most reliable solution for this kind of situation. For example:
1. A ranking function (like the first one you propose, or any other built on the ranking rules you want) called every time a vote is cast. It writes to the corresponding column(s) in the Posts table (the simpler the query, the better: you can create a ranking system as complex as you wish, but try to rely on PHP rather than on queries).
2. Every time a comparison between posts is required, the Posts table is read with a simple SELECT ordering the records by ranking (you can have various "assessing columns" (e.g. up-votes, down-votes, further considerations), but it is better to have one with the definitive ranking).
A minimal sketch of this write-on-vote design follows.
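As a minimal sketch of that write-on-vote idea (the posts.hotness column, the PDO wiring, and a public wrapper around the private _hotness() method are all assumptions on top of the original class):

// Called whenever a vote is cast: recompute the score once, store it, done.
$hotness = $ranker->hotness($upvotes, $downvotes, $posted); // assumed public wrapper around _hotness()
$stmt = $pdo->prepare('UPDATE posts SET hotness = :h WHERE post_id = :id');
$stmt->execute([':h' => $hotness, ':id' => $postId]);

// The main page then needs nothing more than:
//   SELECT * FROM posts ORDER BY hotness DESC LIMIT 25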
You are right, a query like this is rather messy and expensive as well.
Mixing PHP/MySQL on the fly is a bad idea too, as you would have to select the values for all posts, calculate the hotness, and then select a list of the hottest ones. Extremely expensive.
You should consider saving at least part of your calculation to the database. The ordering should definitely go into the database. It is always better to calculate something and save it just once on every save/update, instead of recalculating it each time it is displayed. Try to benchmark how much time you save by calculating the order on save/update instead of every time you compute the hotness. The good thing is that the order never changes unless someone upvotes/downvotes, which you save to the DB anyway; the same goes for the sign.
With a score that decays against the current time you cannot entirely avoid calculating on the fly, even if you save the sign to the DB, because of the timestamp parameter (note, though, that in the function above $posted is the fixed creation time, so the stored value only changes on votes).
I would measure what difference it makes, and where, and recalculate hotness with a CLI script every x amount of time only where this is crucial, and every y amount of time where it makes less of a difference.
Taking this approach you will recalculate hotness only when necessary. This will make your application much more efficient.
I am not sure if it is possible with your DB and schema, but have you considered writing a UDF for custom sorting?
A post from Stack Overflow talks about how to do this here.

Ordering Combinations for Maximum Effectiveness

So recently I was given a problem which I have been mulling over and am still unable to solve; I was wondering if anyone here could point me in the right direction by providing me with the pseudo code (or at least a rough outline of it) for this problem. PS: I'll be building in PHP, if that makes a difference...
Specs
There are ~50 people (for this example I'll just call them a, b, c...) and the user is going to group them into groups of three (people in the groups may overlap), so in the end there will be 50-100 groups (i.e. {a,b,c}; {d,e,f}; {a,d,f}; {b,c,l}...).
So far it is easy; it is a matter of building an HTML form and processing it into a multidimensional array.
There are ~15 time slots during the day (e.g. 9:00AM, 9:20AM, 9:40AM...). Each of these groups needs to meet once during the day, and during one time slot a person cannot be double-booked (i.e. 'a' cannot be in 2 different groups at 9:40AM).
It gets tricky here, but not impossible; my best guess at how to do this would be to brute force it (pick out sets of groups that have no overlap (e.g. {a,b,c}; {l,f,g}; {q,n,d}...) and then just put each into a time slot).
Finally, the schedule which I output needs to be 'optimized'; by that I mean that 'a' should have minimal time between meetings (so if his first meeting is at 9:20AM, his second meeting shouldn't be at 2:00PM).
Here's where I am lost; my only guess would be to build many, many schedules and then rank them based on the average waiting time a person has from one meeting to the next.
However, my 'solutions' (I hesitate to call them that) require too much brute force and would take too long to compute. Are there simpler, more elegant solutions?
These are the tables laid out, modified for your scenario:
+----User_Details------+ // You may or may not need this
| UID | Particulars... |
+----------------------+

+----User_Timeslots---------+ // one time slot per column
| UID | SlotNumber(bool)... | // true/false: is the user available?
+---------------------------+ // SlotNumber is replaced by s1, s2, etc.

+----User_Arrangements--------+ // one time slot per column
| UID | SlotNumber(string)... | // group session string
+-----------------------------+
Note that the string in the Arrangements table is JSON, in the following format:
'[12,15,32]' // from SMALLEST to BIGGEST!
So what happens with the arrangements table is that a script [or an Excel column formula] goes through each slot per session and randomly creates a possible session, checking all previous sessions for conflicts.
/**
 * Randomise a session, in which data is not yet set
 **/
function randomizeSession( sessionID ) {
    for( var id = [lowest UID]; id < [highest UID]; id++ ) {
        if( id exists ) {
            randomizeSingleSession( id, sessionID );
        } //else skip
    }
}
/**
 * Randomizes a single user in a session, without conflicts with previous sessions
 **/
function randomizeSingleSession( id, sessionID ) {
    convert sessionID to its column name =)
    get the column names of all the previous sessions
    if( there is data, false, or JSON ) {
        Do nothing (already has data)
    }
    if( ID is available in the time slot table (for this session) ) {
        Get all IDs who are available and contain no data this session
        Get all the UIDs' previous sessions
        while( first time || not yet resolved ) {
            Randomly choose 2
            if( there was a conflict with the UIDs' previous sessions ) {
                try again (while) : not yet resolved
            } else {
                resolved
            }
        }
        Register all 3 users as a group in the session
    } else {
        Set the session result to false (no attendance)
    }
}
You will realize that the main part of the assignment of groups is done via randomization. However, as the number of sessions increases, there is more and more data to check against for conflicts, resulting in much slower performance; "large" here means ridiculously large, though, approaching a near-perfect permutation/combination formulation.
EDIT:
This setup will also help ensure that, as long as a user is available, they will be in a group. You may still have small pockets of users with no group; these are usually remedied by recalculating (for small session numbers), or by just manually grouping them together, even if it is a repeat (having a few here and there does not hurt). Alternatively, in your case, take the remainders and join several groups of 3 to form groups of 4. =)
And if this can work in Excel with about 100+ people and about 10 sessions, I do not see why it would not work in SQL + PHP. Just note that the calculations may take considerable time either way.
Okay, for those just joining in on this post: please read through all the comments on the question before considering the contents of this answer, as it will otherwise very likely fly over your head.
Here is some pseudo code in PHP-ish style:
/* Array with profs (this is one-dimensional here for show, but I assume
   it will be multi-dimensional, filled with availability and what not;
   for the sake of this example, the multi-dimensional array contains
   the following keys: [id]{[avail_from],[avail_to],[last_ses],[name]} */
$profs = array_fill(0, $prof_num, "assoc_ids");
// Array with time slots, let's say UNIX stamps of begin time
$times = array_fill(0, $slot_num, "time");
// First, we need to loop through all the time slots
foreach ($times as $slot) {
    // See when the session ends
    $slot_end = $slot + $session_time;
    // Now, run through the profs to see who's available
    $avail_profs = array(); // Empty
    foreach ($profs as $prof_id => $data) {
        if (($data['avail_from'] <= $slot) && ($data['avail_to'] >= $slot_end)) {
            $avail_profs[$prof_id] = $data['last_ses'];
        }
    }
    /* Reverse sort the array so that the highest numbers (profs who have been
       waiting the longest) will be up top */
    arsort($avail_profs);
    $profs_session = array_slice($avail_profs, 0, 3, true); // keep the keys!
    $profs_session_names = array(); // Empty
    // Reset the last_ses counters on those profs
    foreach ($profs_session as $prof_id => $last_ses) {
        $profs[$prof_id]['last_ses'] = 0;
        $profs_session_names[] = $profs[$prof_id]['name'];
    }
    // Now, loop through all profs to add one to their waiting time
    foreach ($profs as $prof_id => $data) {
        $profs[$prof_id]['last_ses']++;
    }
    print(sprintf('The %s session will be held by: %s, %s, and %s<br />', $slot,
                  $profs_session_names[0], $profs_session_names[1],
                  $profs_session_names[2]));
    unset($profs_session, $profs_session_names, $avail_profs);
}
That should print something like:
The 9:40am session will be held by: C. Hicks, A. Hole, and B.E.N. Dover
I see an object model consisting of:
Panelists: a fixed repository of your panelists (Tom, Dick, Harry, etc.)
Panel: consists of X Panelists (X = 3 in your case)
Timeslots: a fixed repository of your time slots. Assuming fixed duration and only occurring on a single day, all you need to track is the start time.
Meeting: consists of a Panel and a Timeslot
Schedule: consists of many Meetings
Now, as you have observed, the optimization is the key. To me the question is: "Optimized with respect to what criteria?". Optimal for Tom might mean that the Panels on which he is a member lay out without big gaps, while Harry's Panels may be all over the board. So, perhaps for a given Schedule we compute something like totalMemberDeadTime (= the sum of all members' dead-time gaps in the Schedule); an optimal Schedule is the one that is minimal with respect to this sum. A sketch follows.
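A PHP sketch of that objective, with an assumed shape for the data: each Meeting is ['start' => int, 'panel' => [panelistIds]] and every meeting lasts $slotLength seconds:

// Sum of every panelist's dead time (gaps between their consecutive meetings).
function totalMemberDeadTime(array $schedule, int $slotLength): int {
    $startsByMember = [];
    foreach ($schedule as $meeting) {
        foreach ($meeting['panel'] as $member) {
            $startsByMember[$member][] = $meeting['start'];
        }
    }
    $total = 0;
    foreach ($startsByMember as $starts) {
        sort($starts);
        for ($i = 1; $i < count($starts); $i++) {
            $total += $starts[$i] - ($starts[$i - 1] + $slotLength); // gap between meetings
        }
    }
    return $total; // an optimal Schedule minimizes this
}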
If we are interested in computing a technically optimal schedule among the universe of all schedules, I don't really see an alternative to brute force.
Perhaps that universe of Schedules does not need to be as big as it might first appear. It sounds like the panels are constituted first, and the issue is then to assign them to Meetings, which in turn constitute a Schedule. So we have removed the variability in panel composition; the full scope of variability is in the Meetings and the Schedule. Still, that sure seems like a lot of variability.
But perhaps optimal with respect to all possible Schedules is more than we really need.
Might we define a Schedule as acceptable if no panelist has total dead time of more than X? Or, failing that, if no more than Y panelists have dead time of more than X (you can't satisfy everyone, but you can keep the screwing-over to a minimum)? Then the user could assign meetings for panels containing the more "important" panelists first, and the less important folks simply take what they get. Then all we have to do is find a single acceptable Schedule.
Might it be sufficient for your purposes to compare any two Schedules? Combine that with an interface (I'm seeing drag-and-drop, but that's beside the point) that allows the user to constitute a schedule, clone it into a second schedule, and tweak the second one, looking to reduce aggregate dead time until we find one that is acceptable.
Anyway, not a complete answer. Just thinking out loud. Hope it helps.

Popularity Algorithm

I'd like to populate the homepage of my user-submitted-illustrations site with the "hottest" illustrations uploaded.
Here are the measures I have available:
How many people have favourited that illustration (the votes table includes the date voted)
When the illustration was uploaded (the illustration table has the date created)
Number of comments (not so good, as max comments total about 10 at the moment; the comments table has the comment date)
I have searched around, but I don't want user authority to play a part, and most algorithms include that.
I also need to find out whether it's better to do the calculation in the MySQL that fetches the data, or to have a PHP/cron method run every hour or so.
I only need 20 illustrations to populate the home page. I don't need any sort of paging for this data.
How do I weight age against votes? Surely a site with fewer submissions needs less weight on the date added?
Many sites that use some type of popularity ranking do so by using a standard algorithm to determine a score and then decaying it eternally over time. What I've found works better for sites with less traffic is a multiplier that gives a bonus to new content/activity; it's essentially the same, but the score stops changing after a period of time of your choosing.
For instance, here's a pseudo-example of something you might want to try. Of course, you'll want to adjust how much weight you attribute to each category based on your own experience with your site. Comments are rare, but take more effort from the user than a favourite/vote, so they should probably receive more weight.
score = (votes / 10) + comments
age = UNIX_TIMESTAMP() - UNIX_TIMESTAMP(date_created)
if(age < 86400) score = score * 1.5
This type of approach gives a bonus to new content uploaded in the past day. If you wanted to do this in a similar way only for content that has been favourited or commented on recently, you could just add some WHERE constraints to the query that grabs the score out of the DB.
There are actually two big reasons NOT to calculate this ranking on the fly:
1. Requiring your DB to fetch all of that data and do a calculation on every page load just to reorder items results in an expensive query.
2. Probably a smaller gotcha, but if you have a relatively small amount of activity on the site, small changes in the ranking can cause content to move around pretty drastically.
That leaves you with either caching the results periodically or setting up a cron job to update a new database column holding the score you rank by.
Obviously there is some subjectivity in this; there's no one "correct" algorithm for determining the proper balance, but I'd start out with something like votes per unit age. MySQL can do basic math, so you can ask it to sort by the quotient of votes over age; however, for performance reasons it might be a good idea to cache the result of the query. Maybe something like
SELECT images.url FROM images
ORDER BY (SELECT COUNT(*) FROM votes WHERE votes.image_id = images.id)
         / TIMESTAMPDIFF(SECOND, images.date, NOW()) DESC
LIMIT 20
but my SQL is rusty ;-)
Taking a simple average will, of course, bias in favor of new images showing up on the front page. If you want to remove that bias you could, say, count only those votes that occurred within a certain time limit after the image was posted. For images more recent than that time limit, you'd have to normalize by multiplying the number of votes by the time limit and then dividing by the age of the image. Alternatively, you could give the votes a continuously varying weight, something like exp(-time(vote) + time(image)); a small sketch of this follows. Depending on how particular you are about what this algorithm does, it could take some experimentation to figure out which formula gives the best results.
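A PHP sketch of that continuously varying weight, with a tunable time constant added (the text's version amounts to $tau = 1 second; a day or a week is probably more sensible):

// exp(-(vote time - image time) / tau): votes cast soon after upload count
// almost fully, later votes count progressively less.
function voteWeight(int $voteTime, int $imageTime, float $tau = 86400.0): float {
    return exp(-($voteTime - $imageTime) / $tau);
}
// An image's score is then the sum of voteWeight(...) over its votes.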
I've no useful ideas as far as the actual algorithm is concerned, but in terms of implementation I'd suggest caching the result somewhere, with a periodic update; if the computation results in an expensive query, you probably don't want it slowing your response times.
Something like:
(count favourited + k) / (time since last activity)
The higher k is, the less weight the number of people having favourited it carries.
You could also change the time to something like the time it first appeared plus the time of the last activity; this would ensure that older illustrations vanish over time.
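A one-line PHP rendering of the idea (the max() guard against division by zero is an addition):

// (favourites + k) / (time since last activity); higher k flattens the
// influence of the favourite count.
$score = ($favourites + $k) / max(1, time() - $lastActivityTime);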
