I am trying to to mark some trends, so I have 1 as the lowest and 5 as the biggest value.
So for example,
I may have the following case:
5,4,5,5 (UP)
3,4, (UP)
4,3,3 (DOWN)
4,4,4,4, (FLAT - this is OK for all same numbers)
I am planning to have unlimited number of ordered values as input, an as an output I will just show an (UP), (DOWN), or (FLAT) image.
Any ideas on how I can achieve this?
Sorry if I am not descriptive enough.
Thank you all for you time.
Use least square fit to calculate the "slope" of the values.
function leastSquareFit(array $values) {
$x_sum = array_sum(array_keys($values));
$y_sum = array_sum($values);
$meanX = $x_sum / count($values);
$meanY = $y_sum / count($values);
// calculate sums
$mBase = $mDivisor = 0.0;
foreach($values as $i => $value) {
$mBase += ($i - $meanX) * ($value - $meanY);
$mDivisor += ($i - $meanX) * ($i - $meanX);
}
// calculate slope
$slope = $mBase / $mDivisor;
return $slope;
} // function leastSquareFit()
$trend = leastSquareFit(array(5,4,5,5));
(Untested)
If the slope is positive, the trend is upwards; if negative, it's downwards. Use your own judgement to decide what margin (positive or negative) is considered flat.
A little bit hard to answer based on the limited info you provide, but assuming that:
if there's no movement at all the trend is FLAT,
otherwise, the trend is the last direction of movement,
then this code should work:
$input = array();
$previousValue = false;
$trend = 'FLAT';
foreach( $input as $currentValue ) {
if( $previousValue !== false ) {
if( $currentValue > $previousValue ) {
$trend = 'UP';
} elseif( $currentValue < $previousValue ) {
$trend = 'DOWN';
}
}
$previousValue = $currentValue;
}
For your examples :
Calculate longest increasing subsequence, A
Calulate longest decreasing subsequence , B
Going by your logic, if length of A is larger than B , its an UP , else DOWN.
You will also need to keep track of all equals using one boolean variable to mark FLAT trend.
Query :
What trend would be :
3,4,5,4,3 ?
3,4,4,4,3 ?
1,2,3,4,4,3,2,2,1 ?
Then the logic might need some alterations depending upon what your requirements are .
I'm not sure if i understand your problem totally but I would put the values in an array and use a code like this (written in pseudocode):
int i = 0;
String trend = "FLAT":
while(i<length(array)) {
if(array(i)<array(i+1)) {
trend = "UP";
}
else if(array(i)>array(i+1) {
trend = "DOWN";
}
i++;
}
EDIT: this would obviously only display the trend of the latest alteration
one would also may count the number of times the trend is up or down and determine the overall trend by that values
echo foo(array(5,4,5,5)); // UP
echo foo(array(3,4)); // UP
echo foo(array(4,3,3)); // DOWN
echo foo(array(4,4,4,4)); // FLAT
function foo($seq)
{
if (count(array_unique($seq)) === 1)
return 'FLAT';
$trend = NULL;
$count = count($seq);
$prev = $seq[0];
for ($i = 1; $i < $count; $i++)
{
if ($prev < $seq[$i])
{
$trend = 'UP';
}
if ($prev > $seq[$i])
{
$trend = 'DOWN';
}
$prev = $seq[$i];
}
return $trend;
}
I used the code from #liquorvicar to determine Google search page rank trends, but added some extra trend values to make it more accurate:
nochange - no change
better (higher google position = lower number)
worse (lower google position = higher number)
I also added extra checks when the last value had no change, but taking in account the previous changes i.e.
worsenochange (no change, previouse was worse - lower number)
betternochange (no change, previouse was better - lower number)
I used these values to display a range of trend icons:
$_trendIndicator="<img title="trend" width="16" src="/include/main/images/trend-'. $this->getTrend($_positions). '-icon.png">";
private function getTrend($_positions)
{
// calculate trend based on last value
//
$_previousValue = false;
$_trend = 'nochange';
foreach( $_positions as $_currentValue ) {
if( $_previousValue !== false ) {
if( $_currentValue > $_previousValue ) {
$_trend = 'better';
} elseif( $_currentValue < $_previousValue ) {
$_trend = 'worse';
}
if ($_trend==='worse' && ($_previousValue == $_currentValue)) {$_trend = 'worsenochange';}
if ($_trend==='better' && ($_previousValue == $_currentValue)) {$_trend = 'betternochange';}
}
$_previousValue = $_currentValue;
}
return $_trend;
}
Related
I need to find the value of x where the variance of two results (which take x into account) is the closest to 0. The problem is, the only way to do this is to cycle through all possible values of x. The equation uses currency, so I have to check in increments of 1 cent.
This might make it easier:
$previous_var = null;
$high_amount = 50;
for ($i = 0.01; $i <= $high_amount; $i += 0.01) {
$val1 = find_out_1($i);
$val2 = find_out_2();
$var = variance($val1, $val2);
if ($previous_var == null) {
$previous_var = $var;
}
// If this variance is larger, it means the previous one was the closest to
// 0 as the variance has now started increasing
if ($var > $previous_var) {
$l_s -= 0.01;
break;
}
}
$optimal_monetary_value = $i;
I feel like there is a mathematical formula that would make the "cycling through every cent" more optimal? It works fine for small values, but if you start using 1000's as the $high_amount it takes quite a few seconds to calculate.
Based on the comment in your code, it sounds like you want something similar to bisection search, but a little bit different:
function calculate_variance($i) {
$val1 = find_out_1($i);
$val2 = find_out_2();
return variance($val1, $val2);
}
function search($lo, $loVar, $hi, $hiVar) {
// find the midpoint between the hi and lo values
$mid = round($lo + ($hi - $lo) / 2, 2);
if ($mid == $hi || $mid == $lo) {
// we have converged, so pick the better value and be done
return ($hiVar > $loVar) ? $lo : $hi;
}
$midVar = calculate_variance($mid);
if ($midVar >= $loVar) {
// the optimal point must be in the lower interval
return search($lo, $loVar, $mid, $midVar);
} elseif ($midVar >= $hiVar) {
// the optimal point must be in the higher interval
return search($mid, $midVar, $hi, $hiVar);
} else {
// we don't know where the optimal point is for sure, so check
// the lower interval first
$loBest = search($lo, $loVar, $mid, $midVar);
if ($loBest == $mid) {
// we can't be sure this is the best answer, so check the hi
// interval to be sure
return search($mid, $midVar, $hi, $hiVar);
} else {
// we know this is the best answer
return $loBest;
}
}
}
$optimal_monetary_value = search(0.01, calculate_variance(0.01), 50.0, calculate_variance(50.0));
This assumes that the variance is monotonically increasing when moving away from the optimal point. In other words, if the optimal value is O, then for all X < Y < O, calculate_variance(X) >= calculate_variance(Y) >= calculate_variance(O) (and the same with all > and < flipped). The comment in your code and the way have you have it written make it seem like this is true. If this isn't true, then you can't really do much better than what you have.
Be aware that this is not as good as bisection search. There are some pathological inputs that will make it take linear time instead of logarithmic time (e.g., if the variance is the same for all values). If you can improve the requirement that calculate_variance(X) >= calculate_variance(Y) >= calculate_variance(O) to be calculate_variance(X) > calculate_variance(Y) > calculate_variance(O), you can improve this to be logarithmic in all cases by checking to see how the variance for $mid compares the the variance for $mid + 0.01 and using that to decide which interval to check.
Also, you may want to be careful about doing math with currency. You probably either want to use integers (i.e., do all math in cents instead of dollars) or use exact precision numbers.
If you known nothing at all about the behavior of the objective function, there is no other way than trying all possible values.
On the opposite if you have a guarantee that the minimum is unique, the Golden section method will converge very quickly. This is a variant of the Fibonacci search, which is known to be optimal (require the minimum number of function evaluations).
Your function may have different properties which call for other algorithms.
Why not implementing binary search ?
<?php
$high_amount = 50;
// computed val2 is placed outside the loop
// no need te recalculate it each time
$val2 = find_out_2();
$previous_var = variance(find_out_1(0.01), $val2);
$start = 0;
$end = $high_amount * 100;
$closest_variance = NULL;
while ($start <= $end) {
$section = intval(($start + $end)/2);
$cursor = $section / 100;
$val1 = find_out_1($cursor);
$variance = variance($val1, $val2);
if ($variance <= $previous_var) {
$start = $section;
}
else {
$closest_variance = $cursor;
$end = $section;
}
}
if (!is_null($closest_variance)) {
$closest_variance -= 0.01;
}
Can't call it a problem on Stack Overflow apparently, however I am currently trying to understand how to integrate constraints in the form of item groups within the Knapsack problem. My math skills are proving to be fairly limiting in this situation, however I am very motivated to both make this work as intended as well as figure out what each aspect does (in that order since things make more sense when they work).
With that said, I have found an absolutely beautiful implementation at Rosetta Code and cleaned up the variable names some to help myself better understand this from a very basic perspective.
Unfortunately I am having an incredibly difficult time figuring out how I can apply this logic to include item groups. My purpose is for building fantasy teams, supplying my own value & weight (points/salary) per player but without groups (positions in my case) I am unable to do so.
Would anyone be able to point me in the right direction for this? I'm reviewing code examples from other languages and additional descriptions of the problem as a whole, however I would like to get the groups implemented by whatever means possible.
<?php
function knapSolveFast2($itemWeight, $itemValue, $i, $availWeight, &$memoItems, &$pickedItems)
{
global $numcalls;
$numcalls++;
// Return memo if we have one
if (isset($memoItems[$i][$availWeight]))
{
return array( $memoItems[$i][$availWeight], $memoItems['picked'][$i][$availWeight] );
}
else
{
// At end of decision branch
if ($i == 0)
{
if ($itemWeight[$i] <= $availWeight)
{ // Will this item fit?
$memoItems[$i][$availWeight] = $itemValue[$i]; // Memo this item
$memoItems['picked'][$i][$availWeight] = array($i); // and the picked item
return array($itemValue[$i],array($i)); // Return the value of this item and add it to the picked list
}
else
{
// Won't fit
$memoItems[$i][$availWeight] = 0; // Memo zero
$memoItems['picked'][$i][$availWeight] = array(); // and a blank array entry...
return array(0,array()); // Return nothing
}
}
// Not at end of decision branch..
// Get the result of the next branch (without this one)
list ($without_i,$without_PI) = knapSolveFast2($itemWeight, $itemValue, $i-1, $availWeight,$memoItems,$pickedItems);
if ($itemWeight[$i] > $availWeight)
{ // Does it return too many?
$memoItems[$i][$availWeight] = $without_i; // Memo without including this one
$memoItems['picked'][$i][$availWeight] = array(); // and a blank array entry...
return array($without_i,array()); // and return it
}
else
{
// Get the result of the next branch (WITH this one picked, so available weight is reduced)
list ($with_i,$with_PI) = knapSolveFast2($itemWeight, $itemValue, ($i-1), ($availWeight - $itemWeight[$i]),$memoItems,$pickedItems);
$with_i += $itemValue[$i]; // ..and add the value of this one..
// Get the greater of WITH or WITHOUT
if ($with_i > $without_i)
{
$res = $with_i;
$picked = $with_PI;
array_push($picked,$i);
}
else
{
$res = $without_i;
$picked = $without_PI;
}
$memoItems[$i][$availWeight] = $res; // Store it in the memo
$memoItems['picked'][$i][$availWeight] = $picked; // and store the picked item
return array ($res,$picked); // and then return it
}
}
}
$items = array("map","compass","water","sandwich","glucose","tin","banana","apple","cheese","beer","suntan cream","camera","t-shirt","trousers","umbrella","waterproof trousers","waterproof overclothes","note-case","sunglasses","towel","socks","book");
$weight = array(9,13,153,50,15,68,27,39,23,52,11,32,24,48,73,42,43,22,7,18,4,30);
$value = array(150,35,200,160,60,45,60,40,30,10,70,30,15,10,40,70,75,80,20,12,50,10);
## Initialize
$numcalls = 0;
$memoItems = array();
$selectedItems = array();
## Solve
list ($m4, $selectedItems) = knapSolveFast2($weight, $value, sizeof($value)-1, 400, $memoItems, $selectedItems);
# Display Result
echo "<b>Items:</b><br>" . join(", ", $items) . "<br>";
echo "<b>Max Value Found:</b><br>$m4 (in $numcalls calls)<br>";
echo "<b>Array Indices:</b><br>". join(",", $selectedItems) . "<br>";
echo "<b>Chosen Items:</b><br>";
echo "<table border cellspacing=0>";
echo "<tr><td>Item</td><td>Value</td><td>Weight</td></tr>";
$totalValue = 0;
$totalWeight = 0;
foreach($selectedItems as $key)
{
$totalValue += $value[$key];
$totalWeight += $weight[$key];
echo "<tr><td>" . $items[$key] . "</td><td>" . $value[$key] . "</td><td>".$weight[$key] . "</td></tr>";
}
echo "<tr><td align=right><b>Totals</b></td><td>$totalValue</td><td>$totalWeight</td></tr>";
echo "</table><hr>";
?>
That knapsack program is traditional, but I think that it obscures what's going on. Let me show you how the DP can be derived more straightforwardly from a brute force solution.
In Python (sorry; this is my scripting language of choice), a brute force solution could look like this. First, there's a function for generating all subsets with breadth-first search (this is important).
def all_subsets(S): # brute force
subsets_so_far = [()]
for x in S:
new_subsets = [subset + (x,) for subset in subsets_so_far]
subsets_so_far.extend(new_subsets)
return subsets_so_far
Then there's a function that returns True if the solution is valid (within budget and with a proper position breakdown) – call it is_valid_solution – and a function that, given a solution, returns the total player value (total_player_value). Assuming that players is the list of available players, the optimal solution is this.
max(filter(is_valid_solution, all_subsets(players)), key=total_player_value)
Now, for a DP, we add a function cull to all_subsets.
def best_subsets(S): # DP
subsets_so_far = [()]
for x in S:
new_subsets = [subset + (x,) for subset in subsets_so_far]
subsets_so_far.extend(new_subsets)
subsets_so_far = cull(subsets_so_far) ### This is new.
return subsets_so_far
What cull does is to throw away the partial solutions that are clearly not going to be missed in our search for an optimal solution. If the partial solution is already over budget, or if it already has too many players at one position, then it can safely be discarded. Let is_valid_partial_solution be a function that tests these conditions (it probably looks a lot like is_valid_solution). So far we have this.
def cull(subsets): # INCOMPLETE!
return filter(is_valid_partial_solution, subsets)
The other important test is that some partial solutions are just better than others. If two partial solutions have the same position breakdown (e.g., two forwards and a center) and cost the same, then we only need to keep the more valuable one. Let cost_and_position_breakdown take a solution and produce a string that encodes the specified attributes.
def cull(subsets):
best_subset = {} # empty dictionary/map
for subset in filter(is_valid_partial_solution, subsets):
key = cost_and_position_breakdown(subset)
if (key not in best_subset or
total_value(subset) > total_value(best_subset[key])):
best_subset[key] = subset
return best_subset.values()
That's it. There's a lot of optimization to be done here (e.g., throw away partial solutions for which there's a cheaper and more valuable partial solution; modify the data structures so that we aren't always computing the value and position breakdown from scratch and to reduce the storage costs), but it can be tackled incrementally.
One potential small advantage with regard to composing recursive functions in PHP is that variables are passed by value (meaning a copy is made) rather than reference, which can save a step or two.
Perhaps you could better clarify what you are looking for by including a sample input and output. Here's an example that makes combinations from given groups - I'm not sure if that's your intention... I made the section accessing the partial result allow combinations with less value to be considered if their weight is lower - all of this can be changed to prune in the specific ways you would like.
function make_teams($players, $position_limits, $weights, $values, $max_weight){
$player_counts = array_map(function($x){
return count($x);
}, $players);
$positions = array_map(function($x){
$positions[] = [];
},$position_limits);
$num_positions = count($positions);
$combinations = [];
$hash = [];
$stack = [[$positions,0,0,0,0,0]];
while (!empty($stack)){
$params = array_pop($stack);
$positions = $params[0];
$i = $params[1];
$j = $params[2];
$len = $params[3];
$weight = $params[4];
$value = $params[5];
// too heavy
if ($weight > $max_weight){
continue;
// the variable, $positions, is accumulating so you can access the partial result
} else if ($j == 0 && $i > 0){
// remember weight and value after each position is chosen
if (!isset($hash[$i])){
$hash[$i] = [$weight,$value];
// end thread if current value is lower for similar weight
} else if ($weight >= $hash[$i][0] && $value < $hash[$i][1]){
continue;
// remember better weight and value
} else if ($weight <= $hash[$i][0] && $value > $hash[$i][1]){
$hash[$i] = [$weight,$value];
}
}
// all positions have been filled
if ($i == $num_positions){
$positions[] = $weight;
$positions[] = $value;
if (!empty($combinations)){
$last = &$combinations[count($combinations) - 1];
if ($weight < $last[$num_positions] && $value > $last[$num_positions + 1]){
$last = $positions;
} else {
$combinations[] = $positions;
}
} else {
$combinations[] = $positions;
}
// current position is filled
} else if (count($positions[$i]) == $position_limits[$i]){
$stack[] = [$positions,$i + 1,0,$len,$weight,$value];
// otherwise create two new threads: one with player $j added to
// position $i, the other thread skipping player $j
} else {
if ($j < $player_counts[$i] - 1){
$stack[] = [$positions,$i,$j + 1,$len,$weight,$value];
}
if ($j < $player_counts[$i]){
$positions[$i][] = $players[$i][$j];
$stack[] = [$positions,$i,$j + 1,$len + 1
,$weight + $weights[$i][$j],$value + $values[$i][$j]];
}
}
}
return $combinations;
}
Output:
$players = [[1,2],[3,4,5],[6,7]];
$position_limits = [1,2,1];
$weights = [[2000000,1000000],[10000000,1000500,12000000],[5000000,1234567]];
$values = [[33,5],[78,23,10],[11,101]];
$max_weight = 20000000;
echo json_encode(make_teams($players, $position_limits, $weights, $values, $max_weight));
/*
[[[1],[3,4],[7],14235067,235],[[2],[3,4],[7],13235067,207]]
*/
I have a comma delimited list of numbers which i am converting into an array and what i want to know about the list of numbers is if the numbers listed obey a natural ordering of numbers,you know,have a difference of exactly 1 between the next and the previous.
If its true the list obeys the natural ordering,i want to pick the first number of the list and if not the list obeys not the natural order,i pick the second.
This is my code.
<?php
error_reporting(0);
/**
Analyze numbers
Condition 1
if from number to the next has a difference of 1,then pick the first number in the list
Condition 2
if from one number the next,a difference of greater than 1 was found,then pick next from first
Condition 3
if list contains only one number,pick the number
*/
$number_picked = null;
$a = '5,7,8,9,10';
$b = '2,3,4,5,6,7,8,9,10';
$c = '10';
$data = explode(',', $b);
$count = count($data);
foreach($data as $index => $number)
{
/**
If array has exactly one value
*/
if($count == 1){
echo 'number is:'.$number;
exit();
}
$previous = $data[($count+$index-1) % $count];
$current = $number;
$next = $data[($index+1) % $count];
$diff = ($next - $previous);
if($diff == 1){
$number_picked = array_values($data)[0];
echo $number_picked.'correct';
}
elseif($diff > 1){
$number_picked = array_values($data)[1];
echo $number_picked.'wrong';
}
}
?>
The problem i am having is to figure out how to test the difference for all array elements.
No loops are needed, a little bit of maths will help you here. Once you have your numbers in an array:
$a = explode(',', '5,7,8,9,10');
pass them to this function:-
function isSequential(array $sequence, $diff = 1)
{
return $sequence[count($sequence) - 1] === $sequence[0] + ($diff * (count($sequence) - 1));
}
The function will return true if the numbers in the array follow a natural sequence. You should even be able to adjust it for different spacings between numbers, eg 2, 4, 6, 8, etc using the $diff parameter, although I haven't tested that thoroughly.
See it working.
Keep in mind that this will only work if your list of numbers is ordered from smallest to largest.
Try using a function to solve this... Like so:
<?php
error_reporting(0);
/**
Analyze numbers
Condition 1
if from number to the next has a difference of 1,then pick the first number in the list
Condition 2
if from one number the next,a difference of greater than 1 was found,then pick next from first
Condition 3
if list contains only one number,pick the number
*/
$number_picked = null;
$a = '5,7,8,9,10';
$b = '2,3,4,5,6,7,8,9,10';
$c = '10';
function test($string) {
$data = explode(',', $string);
if(count($data) === 1){
return 'number is:'.$number;
}
foreach($data as $index => $number)
{
$previous = $data[($count+$index-1) % $count];
$current = $number;
$next = $data[($index+1) % $count];
$diff = ($next - $previous);
if($diff == 1){
$number_picked = array_values($data)[0];
return $number_picked.'correct';
}
elseif($diff > 1){
$number_picked = array_values($data)[1];
return $number_picked.'wrong';
}
}
}
echo test($a);
echo test($b);
echo test($c);
?>
You already know how to explode the list, so I'll skip that.
You already handle a single item, so I'll skip that as well.
What is left, is checking the rest of the array. Basically; there's two possible outcome values: either the first element or the second. So we'll save those two first:
$outcome1 = $list[0];
$outcome2 = $list[1];
Next, we'll loop over the items. We'll remember the last found item, and make sure that the difference between the new and the old is 1. If it is, we continue. If it isn't, we abort and immediately return $outcome2.
If we reach the end of the list without aborting, it's naturally ordered, so we return $outcome1.
$lastNumber = null;
foreach( $items as $number ) {
if($lastNumber === null || $number - $lastNumber == 1 ) {
// continue scanning
$lastNumber = $number;
}
else {
// not ordened
return $outcome2;
}
}
return $outcome1; // scanned everything; was ordened.
(Note: code not tested)
To avoid the headache of accessing the previous or next element, and deciding whether it still is inside the array or not, use the fact that on a natural ordering the item i and the first item have a difference of i.
Also the corner case you call condition 3 is easier to handle outside the loop than inside of it. But easier still, the way we characterize a natural ordered list holds for a 1-item list :
$natural = true;
for($i=1; $i<$count && $natural; $i++)
$natural &= ($data[$i] == $data[0] + $i)
$number = $natural ? $data[0] : $data[1];
For $count == 1 the loop is never entered and thus $natural stays true : you select the first element.
Given a list of ranges ie: 1-3,5,6-4,31,9,19,10,25-20
how can i reduce it to 1-6,9-10,19-25,31 ?
Here is what i've done so far, it seems a little bit complicated, so
is there any simpler/clever method to do this.
$in = '1-3,5,6-4,31,9,19,10,25-20';
// Explode the list in ranges
$rs = explode(',', $in);
$tmp = array();
// for each range of the list
foreach($rs as $r) {
// find the start and end date of the range
if (preg_match('/(\d+)-(\d+)/', $r, $m)) {
$start = $m[1];
$end = $m[2];
} else {
// If only one date
$start = $end = $r;
}
// flag each date in an array
foreach(range($start,$end) as $i) {
$tmp[$i] = 1;
}
}
$str = '';
$prev = 999;
// for each date of a month (1-31)
for($i=1; $i<32; $i++) {
// is this date flaged ?
if (isset($tmp[$i])) {
// is output string empty ?
if ($str == '') {
$str = $i;
} else {
// if the previous date is less than the current minus 1
if ($i-1 > $prev) {
// build the new range
$str .= '-'.$prev.','.$i;
}
}
$prev = $i;
}
}
// build the last range
if ($i-1 > $prev) {
$str .= '-'.$prev;
}
echo "str=$str\n";
NB: it must run under php 5.1.6 (i can't upgrade).
FYI : the numbers represent days of month so they are limited to 1-31.
Edit:
From a given range of dates (1-3,6,7-8), i'd like obtain another list (1-3,6-8) where all the ranges are recalculated and ordered.
Perhaps not the most efficient, but shouldn't be too bad with the limited range of values you're working with:
$in = '1-3,5,6-4,31,9,19,10,25-20';
$inSets = explode(',',$in);
$outSets = array();
foreach($inSets as $inSet) {
list($start,$end) = explode('-',$inSet.'-'.$inSet);
$outSets = array_merge($outSets,range($start,$end));
}
$outSets = array_unique($outSets);
sort($outSets);
$newSets = array();
$start = $outSets[0];
$end = -1;
foreach($outSets as $outSet) {
if ($outSet == $end+1) {
$end = $outSet;
} else {
if ($start == $end) {
$newSets[] = $start;
} elseif($end > 0) {
$newSets[] = $start.'-'.$end;
}
$start = $end = $outSet;
}
}
if ($start == $end) {
$newSets[] = $start;
} else {
$newSets[] = $start.'-'.$end;
}
var_dump($newSets);
echo '<br />';
You just have to search your data to get what you want. Split the input on the delimiter, in your case ','. Then sort it somehow, this safes you searching left from the current position. Take you first element, check whether it's a range and use the highest number in this range (3 out of 1-3 range or 3 if 3 is a single element) for further comparisions. Then take the 2nd element in your list and check if it's a direct successor of the last element. If yes combine the 1st and 2nd elements/range to a new range. Repeat.
Edit: I'm not sure about PHP but a regular expression is a bit overkill for this problem. Just look for a '-' in your exploded array, then you know it's a range. Sorting the exp. array safes you the backtracking, the stuff you are doing with $prev. You could also explode every element in the exploded array on '-' and check if the resulting array has a size > 1 to learn whether an element is a range or not.
Looking at the problem from an algorithmic stand-point, let's consider the limitations that you've put on the problem. All numbers will be from 1-31. The list is a collection of "ranges", each of which is defined by two numbers (start and end). There is no rule for whether start will be more, less than, or equal to end.
Since we have an arbitrarily large list of ranges but a definite means of sorting/organizing these, a divide and conquer strategy may yield the best complexity.
At first I typed out a very long and careful explanation of how I created each step in this algorithm (the dividing portion, the conquering potion, optimizations, etc.) however the explanation got extremely long. To shorten it, here's the final answer:
<?php
$ranges = "1-3,5,6-4,31,9,19,10,25-20";
$range_array = explode(',', $ranges);
$include = array();
foreach($range_array as $range){
list($start, $end) = explode('-', $range.'-'.$range); //"1-3-1-3" or "5-5"
$include = array_merge($include, range($start, $end));
}
$include = array_unique($include);
sort($include);
$new_ranges = array();
$start = $include[0];
$count = $start;
// And begin the simple conquer algorithm
for( $i = 1; $i < count($include); $i++ ){
if( $include[$i] != ($count++) ){
if($start == $count-1){
$new_ranges[] = $start;
} else {
$new_ranges[] = $start."-".$count-1;
}
$start = $include[$i];
$count = $start;
}
}
$new_ranges = implode(',', $new_ranges);
?>
This should (theoretically) work on arrays of arbitrary length for any positive integers. Negative integers would get tripped up since - is our delimiter for the range.
I was looking for a way to calculate dynamic market values in a soccer manager game. I asked this question here and got a very good answer from Alceu Costa.
I tried to code this algorithm (90 elements, 5 clustes) but it doesn't work correctly:
In the first iteration, a high percentage of the elements changes its cluster.
From the second iteration, all elements change their cluster.
Since the algorithm normally works until convergence (no element changes its cluster), it doesn't finish in my case.
So I set the end to the 15th iteration manually. You can see that it runs infinitely.
You can see the output of my algorithm here. What's wrong with it? Can you tell me why it doesn't work correctly?
I hope you can help me. Thank you very much in advance!
Here's the code:
<?php
include 'zzserver.php';
function distance($player1, $player2) {
global $strengthMax, $maxStrengthMax, $motivationMax, $ageMax;
// $playerX = array(strength, maxStrength, motivation, age, id);
$distance = 0;
$distance += abs($player1['strength']-$player2['strength'])/$strengthMax;
$distance += abs($player1['maxStrength']-$player2['maxStrength'])/$maxStrengthMax;
$distance += abs($player1['motivation']-$player2['motivation'])/$motivationMax;
$distance += abs($player1['age']-$player2['age'])/$ageMax;
return $distance;
}
function calculateCentroids() {
global $cluster;
$clusterCentroids = array();
foreach ($cluster as $key=>$value) {
$strenthValues = array();
$maxStrenthValues = array();
$motivationValues = array();
$ageValues = array();
foreach ($value as $clusterEntries) {
$strenthValues[] = $clusterEntries['strength'];
$maxStrenthValues[] = $clusterEntries['maxStrength'];
$motivationValues[] = $clusterEntries['motivation'];
$ageValues[] = $clusterEntries['age'];
}
if (count($strenthValues) == 0) { $strenthValues[] = 0; }
if (count($maxStrenthValues) == 0) { $maxStrenthValues[] = 0; }
if (count($motivationValues) == 0) { $motivationValues[] = 0; }
if (count($ageValues) == 0) { $ageValues[] = 0; }
$clusterCentroids[$key] = array('strength'=>array_sum($strenthValues)/count($strenthValues), 'maxStrength'=>array_sum($maxStrenthValues)/count($maxStrenthValues), 'motivation'=>array_sum($motivationValues)/count($motivationValues), 'age'=>array_sum($ageValues)/count($ageValues));
}
return $clusterCentroids;
}
function assignPlayersToNearestCluster() {
global $cluster, $clusterCentroids;
$playersWhoChangedClusters = 0;
// BUILD NEW CLUSTER ARRAY WHICH ALL PLAYERS GO IN THEN START
$alte_cluster = array_keys($cluster);
$neuesClusterArray = array();
foreach ($alte_cluster as $alte_cluster_entry) {
$neuesClusterArray[$alte_cluster_entry] = array();
}
// BUILD NEW CLUSTER ARRAY WHICH ALL PLAYERS GO IN THEN END
foreach ($cluster as $oldCluster=>$clusterValues) {
// FOR EVERY SINGLE PLAYER START
foreach ($clusterValues as $player) {
// MEASURE DISTANCE TO ALL CENTROIDS START
$abstaende = array();
foreach ($clusterCentroids as $CentroidId=>$centroidValues) {
$distancePlayerCluster = distance($player, $centroidValues);
$abstaende[$CentroidId] = $distancePlayerCluster;
}
arsort($abstaende);
if ($neuesCluster = each($abstaende)) {
$neuesClusterArray[$neuesCluster['key']][] = $player; // add to new array
// player $player['id'] goes to cluster $neuesCluster['key'] since it is the nearest one
if ($neuesCluster['key'] != $oldCluster) {
$playersWhoChangedClusters++;
}
}
// MEASURE DISTANCE TO ALL CENTROIDS END
}
// FOR EVERY SINGLE PLAYER END
}
$cluster = $neuesClusterArray;
return $playersWhoChangedClusters;
}
// CREATE k CLUSTERS START
$k = 5; // Anzahl Cluster
$cluster = array();
for ($i = 0; $i < $k; $i++) {
$cluster[$i] = array();
}
// CREATE k CLUSTERS END
// PUT PLAYERS IN RANDOM CLUSTERS START
$sql1 = "SELECT ids, staerke, talent, trainingseifer, wiealt FROM ".$prefix."spieler LIMIT 0, 90";
$sql2 = mysql_abfrage($sql1);
$anzahlSpieler = mysql_num_rows($sql2);
$anzahlSpielerProCluster = $anzahlSpieler/$k;
$strengthMax = 0;
$maxStrengthMax = 0;
$motivationMax = 0;
$ageMax = 0;
$counter = 0; // for $anzahlSpielerProCluster so that all clusters get the same number of players
while ($sql3 = mysql_fetch_assoc($sql2)) {
$assignedCluster = floor($counter/$anzahlSpielerProCluster);
$cluster[$assignedCluster][] = array('strength'=>$sql3['staerke'], 'maxStrength'=>$sql3['talent'], 'motivation'=>$sql3['trainingseifer'], 'age'=>$sql3['wiealt'], 'id'=>$sql3['ids']);
if ($sql3['staerke'] > $strengthMax) { $strengthMax = $sql3['staerke']; }
if ($sql3['talent'] > $maxStrengthMax) { $maxStrengthMax = $sql3['talent']; }
if ($sql3['trainingseifer'] > $motivationMax) { $motivationMax = $sql3['trainingseifer']; }
if ($sql3['wiealt'] > $ageMax) { $ageMax = $sql3['wiealt']; }
$counter++;
}
// PUT PLAYERS IN RANDOM CLUSTERS END
$m = 1;
while ($m < 16) {
$clusterCentroids = calculateCentroids(); // calculate new centroids of the clusters
$playersWhoChangedClusters = assignPlayersToNearestCluster(); // assign each player to the nearest cluster
if ($playersWhoChangedClusters == 0) { $m = 1001; }
echo '<li>Iteration '.$m.': '.$playersWhoChangedClusters.' players have changed place</li>';
$m++;
}
print_r($cluster);
?>
It's so embarassing :D I think the whole problem is caused by only one letter:
In assignPlayersToNearestCluster() you can find arsort($abstaende);. After that, the function each() takes the first value. But it's arsort so the first value must be the highest. So it picks the cluster which has the highest distance value.
So it should be asort, of course. :) To prove that, I've tested it with asort - and I get convergence after 7 iterations. :)
Do you think that was the mistake? If it was, then my problem is solved. In that case: Sorry for annoying you with that stupid question. ;)
EDIT: disregard, I still get the same result as you, everyone winds up in cluster 4. I shall reconsider my code and try again.
I think I've realised what the problem is, k-means clustering is designed to break up differences in a set, however, because of the way you calculate averages etc. we are getting a situation where there are no large gaps in the ranges.
Might I suggest a change and only concentrate on a single value(strength appears to make most sense to me) to determine the clusters, or abandon this sorting method altogether, and adopt something different(not what you want to hear I know)?
I found a rather nice site with an example k-mean sort using integers, I'm going to try and edit that, I will get back with the results some time tomorrow.
http://code.blip.pt/2009/04/06/k-means-clustering-in-php/ <-- link I mentioned and forgot about.