Appropriate data structure for table that uses ranges - php

I have a table that looks like this:
         <22    23-27
8-10     1.3    1.8
11-13    2.2    2.8
14-16    3.2    3.8
and it goes on. I'd like to look up a value like this:
lookup(11,25)
and get the response, in this case 2.8. What is the best data structure to use for this? I have the data in CSV format.
I'm looking to program this in PHP.
Thank you.

I'm certainly not claiming this is the best or most efficient data structure, but this is how I'd map your data into a two-dimensional PHP array that very closely resembles your raw data:
$fp = fopen('data.csv', 'r');
$cols = fgetcsv($fp);
array_shift($cols); // remove empty first item
$data = array();
while ($row = fgetcsv($fp)) {
list($min, $max) = explode('-', $row[0]);
// TODO: Handle non-range values here (e.g. column header "<22")
$data["$min-$max"] = array();
for ($x = 0; $x < count($cols); $x++) {
$data["$min-$max"][$cols[$x]] = $row[$x + 1];
}
}
You'd then need to add some parsing logic in your lookup function:
function lookup($row, $col) {
global $data; // the array built above lives in the global scope
$return = null;
// Loop through all rows
foreach ($data as $row_name => $cols) {
list($min, $max) = explode('-', $row_name);
if ($min <= $row && $max >= $row) {
// If row matches, loop through columns
foreach ($cols as $col_name => $value) {
// TODO: Add support for "<22"
list($min, $max) = explode('-', $col_name);
if ($min <= $col && $max >= $col) {
$return = $value;
break;
}
}
break;
}
}
return $return;
}
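The two TODO comments above leave the "<22" label unhandled. Here is a minimal sketch of how the label parsing could be factored out, assuming "<22" means "anything up to and including 22" and assuming PHP 7.4 or later. The names parseRange and inRange are hypothetical, not part of the answer above.

```php
<?php
// Hypothetical helper: turn a label such as "23-27" or "<22" into [min, max].
// Assumption: "<22" is treated as "anything up to and including 22".
function parseRange(string $label): array
{
    $label = trim($label);
    if ($label !== '' && $label[0] === '<') {
        return [-INF, (float) substr($label, 1)];
    }
    // Labels without a "-" are not handled in this sketch.
    [$min, $max] = explode('-', $label, 2);
    return [(float) $min, (float) $max];
}

// True when $value falls inside the range described by $label.
function inRange($value, string $label): bool
{
    [$min, $max] = parseRange($label);
    return $min <= $value && $value <= $max;
}
```

With this in place, the two explode() calls inside lookup() could be replaced by inRange($row, $row_name) and inRange($col, $col_name).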

How about some kind of two-dimensional data structure?
X "coordinates" being <22, 23-27
Y "coordinates" being ...
A two-dimensional array would probably work for this purpose.
You will then need some function to map the specific X and Y values to the ranges, but that should not be too hard.

Database structure:
values
------
value
x_range_start
x_range_end
y_range_start
y_range_end
Code:
function lookup(x, y) {
sql = "
SELECT * FROM values
WHERE
x >= x_range_start
AND
x <= x_range_end
AND
y >= y_range_start
AND
y <= y_range_end
"
/---/
}
Your data would map to the database like so:
         <22    23-27
8-10     1.3    1.8
11-13    2.2    2.8
14-16    3.2    3.8
(value, x start, x end, y start, y end)
1.3, 0, 22, 8, 10
1.8, 23, 27, 8, 10
2.2, 0, 22, 11, 13
...
Basically store the x and y axis start and end numbers for each value in the table.

I'm partial to the 2 Dimensional array with a "hash" function that maps the ranges into specific addresses in the table.
So your underlying data structure would be a 2 dimensional array:
     0      1
0    1.3    1.8
1    2.2    2.8
2    3.2    3.8
Then you would write two functions:
int xhash(int);
int yhash(int);
That take the original arguments and convert them into indexes into your array. So xhash performs the conversion:
8-10   -> 0
11-13  -> 1
14-16  -> 2
Finally, your lookup operation becomes:
function lookup($x, $y)
{
global $data; // the 2-dimensional array shown above
$xIndex = xhash($x);
$yIndex = yhash($y);
// Handle invalid indices!
return $data[$xIndex][$yIndex];
}
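To make the index-mapping idea concrete, here is a minimal sketch, assuming PHP 7.4 or later. It assumes the ranges on each axis are contiguous, so only the upper bound of each range is stored, and it reads "<22" as "up to and including 22". The names rangeIndex, $rowUpperBounds and $colUpperBounds are made up for illustration, and the data is passed in as arguments rather than read from a global.

```php
<?php
// Upper bound of each row range (8-10, 11-13, 14-16) and column range (<22, 23-27).
// Assumption: "<22" is read as "up to and including 22".
$rowUpperBounds = [10, 13, 16];
$colUpperBounds = [22, 27];
$data = [
    [1.3, 1.8],
    [2.2, 2.8],
    [3.2, 3.8],
];

// Return the index of the first range whose upper bound is >= $value, or -1.
function rangeIndex($value, array $upperBounds, $lowest): int
{
    if ($value < $lowest) {
        return -1;
    }
    foreach ($upperBounds as $index => $upper) {
        if ($value <= $upper) {
            return $index;
        }
    }
    return -1;
}

function lookup($x, $y, array $data, array $rowUpperBounds, array $colUpperBounds)
{
    $xIndex = rangeIndex($x, $rowUpperBounds, 8);    // rows start at 8
    $yIndex = rangeIndex($y, $colUpperBounds, -INF); // first column is open-ended
    if ($xIndex < 0 || $yIndex < 0) {
        return null; // outside the table
    }
    return $data[$xIndex][$yIndex];
}
```

Calling lookup(11, 25, $data, $rowUpperBounds, $colUpperBounds) returns 2.8, and values outside the table return null. For long lists of ranges, the linear scan in rangeIndex could be swapped for a binary search.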

Well, the other answers all use 2D arrays, which means using a 2D loop to retrieve it. Which, if your ranges are age ranges or something similar, may be finite (there are only so many age ranges!), and not an issue (what's a few hundred iterations?). If your ranges are expected to scale to enormous numbers, a play on a hash map may be your best bet. So, you create a hashing function that turns any number into the relevant range, then you do direct lookups, instead of a loop. It would be O(1) access instead of O(n^2).
So your hash function for the column ranges could be like: function hash(n) { if (n < 22) return 1; if (n <= 27) return 2; return -1; } (with a second one along the same lines for the row ranges), and then you can specify your ranges in terms of those hash values (1, 2, etc.), and then just go $data[rowHash(11)][colHash(25)]

The simplest option: create an array of arrays, where each inner array consists of 5 elements: minX, maxX, minY, maxY, value. In your case it would be:
$data = array(
array(8, 10, 0, 22, 1.3),
array(8, 10, 23, 27, 1.8),
array(11, 13, 0, 22, 2.2),
// etc.
);
Write a loop that goes through every element and compares the min and max values with your arguments:
function find($x, $y) {
global $data;
foreach ($data as $e) {
if ($x >= $e[0] && $x <= $e[1] && $y >= $e[2] && $y <= $e[3])
return $e[4];
}
return null; // no matching range
}
With a small dataset this will work fine; if your dataset is bigger, you should consider using a database.

Related

Solve Multiple Choice Knapsack (MCKP) With Dynamic Programming?

Example Data
For this question, let's assume the following items:
Items: Apple, Banana, Carrot, Steak, Onion
Values: 2, 2, 4, 5, 3
Weights: 3, 1, 3, 4, 2
Max Weight: 7
Objective:
The MCKP is a type of Knapsack Problem with the additional constraint that "[T]he items are subdivided into k classes... and exactly one item must be taken from each class"
I have written the code to solve the 0/1 KS problem with dynamic programming using recursive calls and memoization. My question is whether it is possible to add this constraint to my current solution? Say my classes are Fruit, Vegetables, Meat (from the example), I would need to include 1 of each type. The classes could just as well be type 1, 2, 3.
Also, I think this can be solved with linear programming and a solver, but if possible, I'd like to understand the answer here.
Current Code:
<?php
$value = array(2, 2, 4, 5, 3);
$weight = array(3, 1, 3, 4, 2);
$maxWeight = 7;
$maxItems = 5;
$seen = array(array()); //2D array for memoization
$picked = array();
//Put a dummy zero at the front to make things easier later.
array_unshift($value, 0);
array_unshift($weight, 0);
//Call our Knapsack Solver and return the sum value of optimal set
$KSResult = KSTest($maxItems, $maxWeight, $value, $weight);
$maxValue = $KSResult; //copy the result so we can recreate the table
//Recreate the decision table from our memo array to determine what items were picked
//Here I am building the table backwards because I know the optimal value will be at the end
for($i=$maxItems; $i > 0; $i--) {
for($j=$maxWeight; $j > 0; $j--) {
if($seen[$i][$j] != $seen[$i-1][$j]
&& $maxValue == $seen[$i][$j]) {
array_push($picked, $i);
$maxValue -= $value[$i];
break;
}
}
}
//Print out picked items and max value
print("<pre>".print_r($picked,true)."</pre>");
echo $KSResult;
// Recursive formula to solve the KS Problem
// $n = number of items to check
// $c = total capacity of bag
function KSTest($n, $c, &$value, &$weight) {
global $seen;
if(isset($seen[$n][$c])) {
//We've seen this subproblem before
return $seen[$n][$c];
}
if($n === 0 || $c === 0){
//No more items to check or no more capacity
$result = 0;
}
elseif($weight[$n] > $c) {
//This item is too heavy, check next item without this one
$result = KSTest($n-1, $c, $value, $weight);
}
else {
//Take the higher result of keeping or not keeping the item
$tempVal1 = KSTest($n-1, $c, $value, $weight);
$tempVal2 = $value[$n] + KSTest($n-1, $c-$weight[$n], $value, $weight);
if($tempVal2 >= $tempVal1) {
$result = $tempVal2;
//some conditions could go here? otherwise use max()
}
else {
$result = $tempVal1;
}
}
//memo the results and return
$seen[$n][$c] = $result;
return $result;
}
?>
What I've Tried:
My first thought was to add a class (k) array, sort the items via class (k), and when we choose to select an item that is the same as the next item, check if it's better to keep the current item or the item without the next item. Seemed promising, but fell apart after a couple of items being checked. Something like this:
$tempVal3 = $value[$n] + KSTest($n-2, $c-$weight[$n]);
max( $tempVal2, $tempVal3);
Another thought is that at the function call, I could call a loop for each class type and solve the KS with only 1 item at a time of that type + the rest of the values. This will definitely be making some assumptions though, because the results of set 1 might still be assuming multiples of set 2, for example.
This looks to be the equation (If you are good at reading all those symbols?) :) and a C++ implementation? but I can't really see where the class constraint is happening?
The C++ implementation looks OK.
Your values and weights, which are 1-dimensional arrays in your current PHP implementation, will become 2-dimensional.
So, for example,
values[i][j] will be the value of the j-th item in class i, and similarly for weights[i][j]. You take only one item from each class i and move forward while maximizing the value.
The C++ implementation also optimizes the memo. It keeps only 2 arrays, each sized according to max_weight: the current and previous states. This is because you only need these 2 states at a time to compute the present state.
Answers to your doubts:
1)
My first thought was to add a class (k) array, sort the items via
class (k), and when we choose to select an item that is the same as
the next item, check if it's better to keep the current item or the
item without the next item. Seemed promising, but fell apart after a
couple of items being checked. Something like this: $tempVal3 =
$value[$n] + KSTest($n-2, $c-$weight[$n]); max( $tempVal2, $tempVal3);
This won't work because there could be an item in class k+1 for which you take the optimal value, and to respect the constraint you then need to take a suboptimal value for class k. So sorting and picking the best won't work when the constraint is hit. If the constraint is not hit, you can always pick the best value with the best weight.
2)
Another thought is that at the function call, I could call a loop for
each class type and solve the KS with only 1 item at a time of that
type + the rest of the values.
Yes you are on the right track here. You will assume that you had already solved for first k classes. Now you will try extending using the values of k+1 class respecting the weight constraint.
3)
... but I can't really see where the class constraint is happening?
for (int i = 1; i < weight.size(); ++i) {
fill(current.begin(), current.end(), -1);
for (int j = 0; j < weight[i].size(); ++j) {
for (int k = weight[i][j]; k <= max_weight; ++k) {
if (last[k - weight[i][j]] > 0)
current[k] = max(current[k],
last[k - weight[i][j]] + value[i][j]);
}
}
swap(current, last);
}
In the above c++ snippet, the first loop iterates on class, the second loop iterates on values of class and the third loop extends the current state current using the previous state last and only 1 item j with class i at a time. Since you are only using previous state last and 1 item of the current class to extend and maximize, you are following the constraint.
Time complexity:
O( total_items x max_weight) which is equivalent to O( class x max_number_of_items_in_a_class x max_weight)
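For readers who want the same idea in PHP, here is a rough port of the loop structure described above, assuming PHP 7.4 or later. It is a sketch, not the linked implementation: the function name solveMckp and the [value, weight] pair layout are assumptions, and it marks unreachable weights with null where the C++ snippet uses -1 and a "> 0" check.

```php
<?php
// Each class is a list of [value, weight] pairs; exactly one item per class is taken.
function solveMckp(array $classes, int $maxWeight): ?int
{
    // $last[$w] = best value using exactly one item from each class so far,
    // with total weight exactly $w; null marks an unreachable weight.
    $last = array_fill(0, $maxWeight + 1, null);
    $last[0] = 0;

    foreach ($classes as $items) {
        $current = array_fill(0, $maxWeight + 1, null);
        foreach ($items as [$value, $weight]) {
            for ($w = $weight; $w <= $maxWeight; $w++) {
                $previous = $last[$w - $weight];
                if ($previous === null) {
                    continue; // no way to reach this weight with earlier classes
                }
                $candidate = $previous + $value;
                if ($current[$w] === null || $candidate > $current[$w]) {
                    $current[$w] = $candidate;
                }
            }
        }
        $last = $current; // only the previous state is needed for the next class
    }

    // Best value over all reachable total weights, or null if no valid selection exists.
    $reachable = array_filter($last, fn($v) => $v !== null);
    return $reachable ? max($reachable) : null;
}
```

If the question's items are grouped as fruit (Apple, Banana), vegetables (Carrot, Onion) and meat (Steak), which is my assumption about the classes, then solveMckp([[[2, 3], [2, 1]], [[4, 3], [3, 2]], [[5, 4]]], 7) returns 10 (Banana, Onion, Steak).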
I am not a PHP programmer, but I will try to write pseudocode with a good explanation.
In the original problem, the meaning of each cell i, j was: "Value of filling the knapsack with items 1 to i until it reaches capacity j". The solution in the link you have provided defines each cell as "Value of filling the knapsack with items from buckets 1 to i until it reaches capacity j". Notice that in this variation there is no such thing as not taking an element from a class.
So on each step (each call to KSTest with $n, $c), we need to find which element to pick from the n-th class such that the weight of this element is at most c and its value + KSTest(n - 1, c - w) is the greatest.
So I think you should only change the else if and else statements to something like:
else {
$result = 0;
for ($i = 0; $i < $number_of_items_in_nth_class; $i++) {
if ($weight[$n][$i] > $c) {
//This item is too heavy, check next item
continue;
}
//Value of this item plus the best result for the remaining classes
$result = max($result, $value[$n][$i] + KSTest($n-1, $c - $weight[$n][$i], $value, $weight));
}
}
Now two disclaimers:
I do not code in php so this code will not run :)
This is not the implementation given in the link you provided. TBH I didn't understand why the time complexity of their algorithm is so small (and what C is), but this implementation should work since it follows the definition of the recursive formula given.
The time complexity of this should be O(max_weight * number_of_classes * size_of_largest_class).
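The recursive idea above can be sketched as runnable PHP (7.4 or later) like this. The function name ksClasses, the [value, weight] pair layout and the memo passed by reference are all assumptions for illustration. Unlike the pseudocode, it returns null rather than 0 when some class has no item that fits, so an infeasible branch is never mistaken for a valid one.

```php
<?php
// Top-down sketch: $classes[$n] is a list of [value, weight] pairs, classes numbered 1..N.
// Returns the best total value, or null when some class has no item that fits.
function ksClasses(int $n, int $c, array $classes, array &$memo): ?int
{
    if ($n === 0) {
        return 0; // no classes left to fill
    }
    // array_key_exists (not isset) because a memoized result may be null.
    if (array_key_exists($c, $memo[$n] ?? [])) {
        return $memo[$n][$c];
    }
    $best = null;
    foreach ($classes[$n] as [$value, $weight]) {
        if ($weight > $c) {
            continue; // this item is too heavy, try the next one in the class
        }
        $rest = ksClasses($n - 1, $c - $weight, $classes, $memo);
        if ($rest !== null && ($best === null || $value + $rest > $best)) {
            $best = $value + $rest;
        }
    }
    $memo[$n][$c] = $best;
    return $best;
}
```

The memo has at most number_of_classes * (max_weight + 1) entries and each entry loops over one class, which matches the complexity stated above.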
This is my PHP solution. I've tried to comment the code in a way that it's easy to follow.
Update:
I updated the code because the old script was giving unreliable results. This is cleaner and has been thoroughly tested. Key takeaways are that I use two memo arrays, one at the group level to speed up execution and one at the item level to reconstruct the results. I found any attempts to track which items are being chosen as you go are unreliable and much less efficient. Also, isset() instead of if($var) is essential for checking the memo array because the previous results might have been 0 ;)
<?php
/**
 * Multiple Choice Knapsack Solver
 *
 * @author Michael Cruz
 * @version 1.0 - 03/27/2020
 **/
class KS_Solve {
public $KS_Items;
public $maxValue;
public $maxWeight;
public $maxItems;
public $finalValue;
public $finalWeight;
public $finalItems;
public $finalGroups;
public $memo1 = array(); //Group memo
public $memo2 = array(); //Item memo for results rebuild
public function __construct() {
//some default variables as an example.
//KS_Items = array(Value, Weight, Group, Item #)
$this->KS_Items = array(
array(2, 3, 1, 1),
array(2, 1, 1, 2),
array(4, 3, 2, 3),
array(5, 4, 2, 4),
array(3, 2, 3, 5)
);
$this->maxWeight = 7;
$this->maxItems = 5;
$this->KS_Wrapper();
}
public function KS_Wrapper() {
$start_time = microtime(true);
//Put a dummy zero at the front to make things easier later.
array_unshift($this->KS_Items, array(0, 0, 0, 0));
//Call our Knapsack Solver
$this->maxValue = $this->KS_Solver($this->maxItems, $this->maxWeight);
//Recreate the decision table from our memo array to determine what items were picked
//ksort($this->memo2); //for debug
for($i=$this->maxItems; $i > 0; $i--) {
//ksort($this->memo2[$i]); //for debug
for($j=$this->maxWeight; $j > 0; $j--) {
if($this->maxValue == 0) {
break 2;
}
if($this->memo2[$i][$j] == $this->maxValue
&& $j == $this->maxWeight) {
$this->maxValue -= $this->KS_Items[$i][0];
$this->maxWeight -= $this->KS_Items[$i][1];
$this->finalValue += $this->KS_Items[$i][0];
$this->finalWeight += $this->KS_Items[$i][1];
$this->finalItems .= " " . $this->KS_Items[$i][3];
$this->finalGroups .= " " . $this->KS_Items[$i][2];
break;
}
}
}
//Print out the picked items and value. (IMPLEMENT Proper View or Return!)
echo "<pre>";
echo "RESULTS: <br>";
echo "Value: " . $this->finalValue . "<br>";
echo "Weight: " . $this->finalWeight . "<br>";
echo "Item's in KS:" . $this->finalItems . "<br>";
echo "Selected Groups:" . $this->finalGroups . "<br><br>";
$end_time = microtime(true);
$execution_time = ($end_time - $start_time);
echo "Results took " . sprintf('%f', $execution_time) . " seconds to execute<br>";
}
/**
* Recursive function to solve the MCKS Problem
* $n = number of items to check
* $c = total capacity of KS
**/
public function KS_Solver($n, $c) {
$group = $this->KS_Items[$n][2];
$groupItems = array();
$count = 0;
$result = 0;
$bestVal = 0;
if(isset($this->memo1[$group][$c])) {
$result = $this->memo1[$group][$c];
}
else {
//Sort out the items for this group
foreach($this->KS_Items as $item) {
if($item[2] == $group) {
$groupItems[] = $item;
$count++;
}
}
//$k adjusts the index for item memoization
$k = $count - 1;
//Find the results of each item + items of other groups
foreach($groupItems as $item) {
if($item[1] > $c) {
//too heavy
$result = 0;
}
elseif($item[1] >= $c && $group != 1) {
//too heavy for next group
$result = 0;
}
elseif($group == 1) {
//Just take the highest value
$result = $item[0];
}
else {
//check this item with following groups
$result = $item[0] + $this->KS_Solver($n - $count, $c - $item[1]);
}
if($result == $item[0] && $group != 1) {
//No solution with the following sets, so don't use this item.
$result = 0;
}
if($result > $bestVal) {
//Best item so far
$bestVal = $result;
}
//memo the results
$this->memo2[$n-$k][$c] = $result;
$k--;
}
$result = $bestVal;
}
//memo and return
$this->memo1[$group][$c] = $result;
return $result;
}
}
new KS_Solve();
?>

PHP: Optimizing array iteration

I am working on an algorithm for ranking teams by their highest total score. Teams are to be generated from a list of players. The conditions for creating a team are:
It should have 6 players.
The collective salary of the 6 players must be less than or equal to 50K.
Teams are to be ranked by highest collective projection.
What I did to get this result is generate all possible teams, then run checks on them to exclude those teams that have more than 50K salary, and then sort the remainder based on projection. But generating all the possibilities takes a lot of time and sometimes consumes all the memory. For a list of 160 players it takes around 90 seconds. Here is the code:
$base_array = array();
$query1 = mysqli_query($conn, "SELECT * FROM temp_players ORDER BY projection DESC");
while($row1 = mysqli_fetch_array($query1))
{
$player = array();
$mma_id = $row1['mma_player_id'];
$salary = $row1['salary'];
$projection = $row1['projection'];
$wclass = $row1['wclass'];
array_push($player, $mma_id);
array_push($player, $salary);
array_push($player, $projection);
array_push($player, $wclass);
array_push($base_array, $player);
}
$result_base_array = array();
$totalsalary = 0;
for($i=0; $i<count($base_array)-5; $i++)
{
for($j=$i+1; $j<count($base_array)-4; $j++)
{
for($k=$j+1; $k<count($base_array)-3; $k++)
{
for($l=$k+1; $l<count($base_array)-2; $l++)
{
for($m=$l+1; $m<count($base_array)-1; $m++)
{
for($n=$m+1; $n<count($base_array)-0; $n++)
{
$totalsalary = $base_array[$i][1]+$base_array[$j][1]+$base_array[$k][1]+$base_array[$l][1]+$base_array[$m][1]+$base_array[$n][1];
$totalprojection = $base_array[$i][2]+$base_array[$j][2]+$base_array[$k][2]+$base_array[$l][2]+$base_array[$m][2]+$base_array[$n][2];
if($totalsalary <= 50000)
{
array_push($result_base_array,
array($base_array[$i], $base_array[$j], $base_array[$k], $base_array[$l], $base_array[$m], $base_array[$n],
$totalprojection, $totalsalary)
);
}
}
}
}
}
}
}
usort($result_base_array, "cmp");
And the cmp function
function cmp($a, $b) {
if ($a[6] == $b[6]) {
return 0;
}
return ($a[6] < $b[6]) ? 1 : -1;
}
Is there any way to reduce the time it takes to do this task, or any other workaround for getting the desired number of teams?
Regards
Because the number of elements in the array can be very big (for example, 100 players can generate about 1.2*10^9 teams), you can't hold it in memory. Try saving the resulting array to a file in parts (truncating the array after each save). Then use an external file sort.
It will be slow, but at least it will not fail because of memory.
If you only need the top n teams (like the 10 teams with the highest projection), then you should convert the code that generates result_base_array into a generator, so that it yields the next team instead of pushing it into an array. Then iterate over this generator. On each iteration, add the new item to a sorted result array and cut off the redundant elements.
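One way to do the "add and cut" step is a bounded min-heap instead of a sorted array. That is my substitution, not something this answer prescribes. The sketch below assumes each team is an array with a total_projection key (a hypothetical shape) and uses SPL's SplMinHeap; the class name TeamHeap and function name topTeams are made up.

```php
<?php
// Min-heap ordered by total projection, so the weakest kept team sits on top.
class TeamHeap extends SplMinHeap
{
    protected function compare($a, $b): int
    {
        // SplMinHeap expects a positive result when $a is lower than $b.
        return $b['total_projection'] <=> $a['total_projection'];
    }
}

// Keep only the $limit best teams from any iterable (for example a generator).
function topTeams(iterable $teams, int $limit): array
{
    $heap = new TeamHeap();
    foreach ($teams as $team) {
        $heap->insert($team);
        if (count($heap) > $limit) {
            $heap->extract(); // drop the current lowest projection
        }
    }
    $result = [];
    while (!$heap->isEmpty()) {
        $result[] = $heap->extract(); // comes out lowest first
    }
    return array_reverse($result); // highest projection first
}
```

Memory stays proportional to $limit no matter how many teams the generator produces, and each insert costs O(log $limit).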
Depending on whether the salaries are often the cause of exclusion, you could perform tests on this in the other loops as well. If after 4 player selections their summed salaries are already above 50K, there is no use to select the remaining 2 players. This could save you some iterations.
This can be further improved by remembering the lowest 6 salaries in the pack, and then check if after selecting 4 members you would still stay under 50K if you would add the 2 lowest existing salaries. If this is not possible, then again it is of no use to try to add the two remaining players. Of course, this can be done at each stage of the selection (after selecting 1 player, 2 players, ...)
Another related improvement comes into play when you sort your data by ascending salary. If after selecting the 4th player, the above logic brings you to conclude you cannot stay under 50K by adding 2 more players, then there is no use to replace the 4th player with the next one in the data series either: that player would have a greater salary, so it would also yield to a total above 50K. So that means you can backtrack immediately and work on the 3rd player selection.
As others pointed out, the number of potential solutions is enormous. For 160 players and a team size of 6 members, the number of combinations is:
160 . 159 . 158 . 157 . 156 . 155
--------------------------------- = 21 193 254 160
6 . 5 . 4 . 3 . 2
21 billion entries is a stretch for memory, and probably not useful to you either: will you really be interested in the team at the 4 432 456 911th place?
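The arithmetic above is easy to double-check with a few lines of PHP. This is just a sketch with a made-up function name, and it assumes 64-bit integers so the intermediate products fit.

```php
<?php
// Number of ways to choose $k items from $n, computed without large factorials.
// Assumes 64-bit PHP integers.
function combinationCount(int $n, int $k): int
{
    $result = 1;
    for ($i = 1; $i <= $k; $i++) {
        // After step $i, $result equals C(n - k + i, i), so the division is exact.
        $result = intdiv($result * ($n - $k + $i), $i);
    }
    return $result;
}
```

combinationCount(160, 6) gives 21193254160, and combinationCount(100, 6) gives 1192052400, the "1.2*10^9" figure mentioned earlier in this thread.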
You'll probably be interested in something like the top-10 of those teams (in terms of projection). This you can achieve by keeping a list of 10 best teams, and then, when you get a new team with an acceptable salary, you add it to that list, keeping it sorted (via a binary search), and ejecting the entry with the lowest projection from that top-10.
Here is the code you could use:
$base_array = array();
// Order by salary, ascending, and only select what you need
$query1 = mysqli_query($conn, "
SELECT mma_player_id, salary, projection, wclass
FROM temp_players
ORDER BY salary ASC");
// Specify with option argument that you only need the associative keys:
while($row1 = mysqli_fetch_array($query1, MYSQLI_ASSOC)) {
// Keep the named keys, it makes interpreting the data easier:
$base_array[] = $row1;
}
function combinations($base_array, $salary_limit, $team_size) {
// Get lowest salaries, so we know the least value that still needs to
// be added when composing a team. This will allow an early exit when
// the cumulative salary is already too great to stay under the limit.
$remaining_salary = [];
foreach ($base_array as $i => $row) {
if ($i == $team_size) break;
array_unshift($remaining_salary, $salary_limit);
$salary_limit -= $row['salary'];
}
$result = [];
$stack = [0];
$sum_salary = [0];
$sum_projection = [0];
$index = 0;
while (true) {
$player = $base_array[$stack[$index]];
if ($sum_salary[$index] + $player['salary'] <= $remaining_salary[$index]) {
$result[$index] = $player;
if ($index == $team_size - 1) {
// Use yield so we don't need to build an enormous result array:
yield [
"total_salary" => $sum_salary[$index] + $player['salary'],
"total_projection" => $sum_projection[$index] + $player['projection'],
"members" => $result
];
} else {
$index++;
$sum_salary[$index] = $sum_salary[$index-1] + $player['salary'];
$sum_projection[$index] = $sum_projection[$index-1] + $player['projection'];
$stack[$index] = $stack[$index-1];
}
} else {
$index--;
}
while (true) {
if ($index < 0) {
return; // all done
}
$stack[$index]++;
if ($stack[$index] <= count($base_array) - $team_size + $index) break;
$index--;
}
}
}
// Helper function to quickly find where to insert a value in an ordered list
function binary_search($needle, $haystack) {
$high = count($haystack)-1;
$low = 0;
while ($high >= $low) {
$mid = (int)floor(($high + $low) / 2);
$val = $haystack[$mid];
if ($needle < $val) {
$high = $mid - 1;
} elseif ($needle > $val) {
$low = $mid + 1;
} else {
return $mid;
}
}
return $low;
}
$top_team_count = 10; // set this to the desired size of the output
$top_teams = []; // this will be the output
$top_projections = [];
foreach(combinations($base_array, 50000, 6) as $team) {
$j = binary_search($team['total_projection'], $top_projections);
array_splice($top_teams, $j, 0, [$team]);
array_splice($top_projections, $j, 0, [$team['total_projection']]);
if (count($top_teams) > $top_team_count) {
// forget about lowest projection, to keep memory usage low
array_shift($top_teams);
array_shift($top_projections);
}
}
$top_teams = array_reverse($top_teams); // Put highest projection first
print_r($top_teams);
Have a look at the demo on eval.in, which just generates 12 players with random salary and projection data.
Final remarks
Even with the above-mentioned optimisations, doing this for 160 players might still require a lot of iterations. The more often the salaries amount to more than 50K, the better the performance will be. If this never happens, the algorithm cannot escape from having to look at each of the 21 billion combinations. If you know beforehand that the 50K limit will not play any role, you would of course order the data by descending projection, like you originally did.
Another optimisation could be if you would feed back into the combination function the 10th highest team projection you have so far. The function could then eliminate combinations that would lead to a lower total projection. You could first take the 6 highest player projection values and use this to determine how high a partial team projection can still grow by adding the missing players. It might turn out that this becomes impossible after having selected a few players, and then you can skip some iterations, much like done on the basis of salaries.

Random ID/Number Generator in PHP

I am building a list of "agent id's" in my database with the following requirements:
The ID must be 9 digits long (numeric only)
The ID may not contain more than 3 of the same number.
The ID may not contain more than 2 of the same number consecutively (i.e. 887766551; cannot have 888..)
So far I have part 1 down solid but am struggling with 2 and 3 above. My code is below.
function createRandomAGTNO() {
srand ((double) microtime( )*1000000);
$random_agtno = rand(100000000,900000000);
return $random_agtno;
}
// Usage
$NEWAGTNO = createRandomAGTNO();
Any ideas?
Do not re-seed the RNG on every call like that, unless you want to completely blow the security of your random numbers.
Unless your PHP is very old, you probably don't need to re-seed the RNG at all, as PHP seeds it for you on startup and there are very few cases where you need to replace the seed with one of your own choosing.
If it's available to you, use mt_rand instead of rand. My example will use mt_rand.
As for the rest -- you could possibly come up with a very clever mapping of numbers from a linear range onto numbers of the form you want, but let's brute-force it instead. This is one of those things where yes, the theoretical upper bound on running time is infinite, but the expected running time is bounded and quite small, so don't worry too hard.
function createRandomAGTNO() {
do {
$agt_no = mt_rand(100000000,900000000);
$valid = true;
if (preg_match('/(\d)\1\1/', $agt_no))
$valid = false; // Same digit three times consecutively
elseif (preg_match('/(\d).*?\1.*?\1.*?\1/', $agt_no))
$valid = false; // Same digit four times in string
} while ($valid === false);
return $agt_no;
}
For the second condition, you can create an array like this:
$a = array( 0,0,1,1,2,2,3,3.....,9,9 );
and pick random elements with array_rand() (see the manual) to get a digit, append it to your ID, and remove the value from the source array by unsetting it at that index.
Generally, this also solves the third condition, but this solution excludes all the acceptable IDs in which a digit appears three times.
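Here is a sketch of a pool-based variant that does allow a digit to appear three times. It deviates from the description above in a few ways, all of them my assumptions: the pool holds three copies of each digit, it picks an index with mt_rand() rather than array_rand(), it rejects a draw that would create three identical digits in a row, and it rejects a leading zero so the result stays nine digits long. The function name is made up, and the result is returned as a string.

```php
<?php
// Sketch: draw 9 digits from a pool holding three copies of each digit, so no
// digit can appear more than three times, and skip draws that would put the
// same digit three times in a row or a zero in the leading position.
function createPooledAgentId(): string
{
    $pool = [];
    foreach (range(0, 9) as $digit) {
        $pool[] = $digit;
        $pool[] = $digit;
        $pool[] = $digit;
    }
    $id = '';
    while (strlen($id) < 9) {
        $index = mt_rand(0, count($pool) - 1);
        $digit = (string) $pool[$index];
        $length = strlen($id);
        if ($length === 0 && $digit === '0') {
            continue; // keep the ID nine digits long
        }
        if ($length >= 2 && $id[$length - 1] === $digit && $id[$length - 2] === $digit) {
            continue; // would be three identical digits in a row
        }
        $id .= $digit;
        array_splice($pool, $index, 1); // remove the used copy and reindex
    }
    return $id;
}
```

The retry loop always terminates in practice, because the pool always contains digits other than the rejected one. Note that the IDs are not uniformly distributed over all valid IDs, and uniqueness still has to be checked against the database.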
The first solution that comes to mind is a recursive function that simply tests your three requirements and restarts if any of them fails. Not the most efficient solution, but it would work. I wrote an untested version of this below. It may not run without errors, but you should get the basic idea from it.
function createRandomAGTNO(){
$random_agtno = rand(100000000,900000000);
$random_agtno_array = str_split((string) $random_agtno); // one digit per element
// No digit may appear more than 3 times in total
if(max(array_count_values($random_agtno_array)) > 3) return createRandomAGTNO();
foreach($random_agtno_array as $raa_index => $raa){
// No digit may appear 3 times in a row
if(isset($random_agtno_array[$raa_index + 2])
&& $raa == $random_agtno_array[$raa_index + 1]
&& $raa == $random_agtno_array[$raa_index + 2]) return createRandomAGTNO();
}
return $random_agtno;
}
Try this code:
<?php
function createRandomAGTNO() {
//srand ((double) microtime( )*1000000);
$digits = array( 1, 2, 3, 4, 5, 6, 7, 8, 9, 0 ,1, 2, 3, 4, 5, 6, 7, 8, 9, 0 );
shuffle($digits);
$random_agtno = 0;
for($i = 0; $i < 9; $i++)
{
if($i == 0)
{
while($digits[0] == 0)
shuffle($digits);
}
/*if($i >= 2)
{
while(($random_agtno % 100) == $digits[0])
shuffle($digits);
}*/
$random_agtno *= 10;
$random_agtno += $digits[0];
array_splice($digits, 0, 1);
}
return $random_agtno;
}
for($i = 0; $i < 1000; $i++)
{
$NEWAGTNO = createRandomAGTNO();
echo "<p>";
echo $NEWAGTNO;
echo "</p>";
}
?>
Good luck!
Edit:
Removed the call to srand() and commented-out the "if($i >= 2)" code, which is impossible anyway, here.

PHP Random numbers

I would like to draw a random number from the interval [1,49], but I would like to add a number as an exception (let's say 44), so I cannot simply use round(rand(1,49)). So I decided to make an array of 49 numbers (1-49), call unset($array[44]) and apply array_rand.
Now I want to draw a number from the interval [$left,49]. How can I do that using the same array that I used before? The array now misses the value 44.
The function pick takes an array as an argument with all the numbers you have already picked. It will then pick a number between the start and the end that IS NOT in that array. It will add this number into that array and return the number.
function pick(&$picked, $start, $end) {
sort($picked);
if($start > $end - count($picked)) {
return false;
}
$pick = rand($start, $end - count($picked));
foreach($picked as $p) {
if($pick >= $p) {
$pick += 1;
} else {
break;
}
}
$picked[] = $pick;
return $pick;
}
This function will efficiently get a random number that is not in the array AND WILL NEVER INFINITELY RECURSE!
To use this like you want:
$array = array(44); // you have picked 44 for example
$num = pick($array, 1, 49); // get a random number between 1 and 49 that is not in $array
// $num will be a number between 1 and 49 that is not in $array
How the function works
Say you are getting a number between 1 and 10. And you have picked two numbers (e.g. 2 and 6). This will pick a number between 1 and (10 minus 2) using rand: rand(1, 8).
It will then go through each number that has been picked and check if the number is bigger.
For Example:
If rand(1, 8) returns 2
It looks at 2 (it is >= than 2 so it increments and becomes 3)
It looks at 6 (it is not >= than 6 so it exits the loop)
The result is: 3
If rand(1, 8) returns 3
It looks at 2 (it is >= than 2 so it increments and becomes 4)
It looks at 6 (it is not >= than 6 so it exits the loop)
The result is: 4
If rand(1, 8) returns 6
It looks at 2 (it is >= than 2 so it increments and becomes 7)
It looks at 6 (it is >= than 6 so it increments and becomes 8)
The result is: 8
If rand(1, 8) returns 8
It looks at 2 (it is >= than 2 so it increments and becomes 9)
It looks at 6 (it is >= than 6 so it increments and becomes 10)
The result is: 10
Therefore a random number between 1 and 10 is returned and it will not be 2 or 6.
I implemented this a long time ago to randomly place mines in a 2-dimensional array (because I wanted random mines, but I wanted to guarantee the number of mines on the field to be a certain number)
Why not just check your exceptions:
function getRand($min, $max) {
$exceptions = array(44, 23);
do {
$rand = mt_rand($min, $max);
} while (in_array($rand, $exceptions));
return $rand;
}
Note that this could result in an infinite loop if you provide a min and max that force mt_rand to always return an exception value. So calling getRand(44,44);, while meaningless, will result in an infinite loop. You can avoid the infinite loop with a bit of logic in the function (checking that there is at least one non-exception value in the range $min to $max).
The other option would be to build the array with a loop:
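The guard could look something like this sketch (PHP 7.4 or later). The function name and the choice to pass the exceptions in as a parameter and return null when nothing can be picked are my assumptions, not part of the function above.

```php
<?php
// Sketch of the guard: bail out when every value in [$min, $max] is excluded.
function getRandExcluding(int $min, int $max, array $exceptions): ?int
{
    // Count the distinct exceptions that actually fall inside the range.
    $inRange = array_filter(
        array_unique($exceptions),
        fn($value) => $value >= $min && $value <= $max
    );
    if ($min > $max || count($inRange) >= $max - $min + 1) {
        return null; // nothing left to pick
    }
    do {
        $rand = mt_rand($min, $max);
    } while (in_array($rand, $exceptions));
    return $rand;
}
```

With this check, getRandExcluding(44, 44, array(44, 23)) returns null instead of looping forever. The retry loop can still take many attempts when almost every value in the range is excluded.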
function getRand($min, $max) {
$actualMin = min($min, $max);
$actualMax = max($min, $max);
$values = array();
$exceptions = array(44, 23);
for ($i = $actualMin; $i <= $actualMax; $i++) {
if (in_array($i, $exceptions)) {
continue;
}
$values[] = $i;
}
return $values[array_rand($values)];
}
The simplest solution would be to just search for a random number from min to max - number of exceptions. Then just add 1 to the result for each exception lower than the result.
function getRandom($min, $max)
{
$exceptions = array(23, 44); // Keep them sorted or you have to do sort() every time
$random = rand($min, $max - count($exceptions));
foreach ($exceptions as $ex)
{
if ($ex > $random) break;
++$random;
}
return $random;
}
Runtime should be O(n), with n being the number of exceptions at or below the result.
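The same idea can be sketched with the exceptions passed in as a parameter. This is only an illustration: shiftPastExceptions and getRandomExcluding are hypothetical names, and the exceptions are assumed to be integers. Exceptions outside the range are dropped first, because the shortened range only accounts for excluded values that lie between $min and $max.

```php
<?php
// Deterministic part: map a draw from the shortened range onto the allowed values.
// $sortedExceptions must be sorted ascending and contain no duplicates.
function shiftPastExceptions(int $random, array $sortedExceptions): int
{
    foreach ($sortedExceptions as $ex) {
        if ($ex > $random) {
            break;
        }
        ++$random; // step over one excluded value
    }
    return $random;
}

// Hypothetical wrapper: exceptions are passed in instead of hard-coded.
function getRandomExcluding(int $min, int $max, array $exceptions): int
{
    // Keep only the distinct exceptions that lie inside [$min, $max].
    $inRange = array();
    foreach (array_unique($exceptions) as $ex) {
        if ($ex >= $min && $ex <= $max) {
            $inRange[] = $ex;
        }
    }
    sort($inRange);
    if (count($inRange) > $max - $min) {
        throw new RangeException('Every value in the range is excluded');
    }
    return shiftPastExceptions(mt_rand($min, $max - count($inRange)), $inRange);
}
```

Splitting out the mapping makes it easy to check by hand: with exceptions 2 and 6, the draws 1 to 8 map to 1, 3, 4, 5, 7, 8, 9, 10, so every allowed value is hit exactly once.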

Replace duplicate values in array with new randomly generated values

Below is a function (from a previous question that went unanswered) that creates an array with n values. The sum of the array is equal to $max.
function randomDistinctPartition($n, $max) {
$partition= array();
for ($i = 1; $i < $n; $i++) {
$maxSingleNumber = $max - $n;
$partition[] = $number = rand(1, $maxSingleNumber);
$max -= $number;
}
$partition[] = $max;
return $partition;
}
For example, if I set $n = 4 and $max = 30, I should get something like the following:
array(5, 7, 10, 8);
However, this function does not take into account duplicates and 0s. What I would like - and have been trying to accomplish - is to generate an array with unique numbers that add up to my predetermined variable $max. No Duplicate numbers and No 0 and/or negative integers.
Ok, this problem actually revolves around linear sequences. With a minimum value of 1 consider the sequence:
f(n) = 1 + 2 + ... + (n - 1) + n
The sum of such a sequence is equal to:
f(n) = n * (n + 1) / 2
so for n = 4, as an example, the sum is 10. That means if you're selecting 4 different numbers the minimum total with no zeroes and no negatives is 10. Now go in reverse: if you have a total of 10 and 4 numbers then there is only one combination of (1,2,3,4).
So first you need to check if your total is at least as high as this lower bound. If it is less there is no combination. If it is equal, there is precisely one combination. If it is higher it gets more complicated.
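As a small sketch of that first check (the function names here are made up for illustration), the lower bound and the three cases could be written as:

```php
<?php
// Smallest possible sum of $n distinct positive integers: 1 + 2 + ... + $n.
function min_distinct_sum(int $n): int
{
    return intdiv($n * ($n + 1), 2);
}

// Hypothetical helper: how many distinct-value combinations can reach $total?
function partition_feasibility(int $total, int $n): string
{
    $lower = min_distinct_sum($n);
    if ($total < $lower) {
        return 'none';
    }
    if ($total === $lower) {
        return 'exactly one';
    }
    return 'at least one'; // the more complicated case
}
```

For n = 4 the bound is 10, so a total of 9 is impossible, a total of 10 allows only (1, 2, 3, 4), and anything above 10 needs the search described next.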
Now imagine your constraints are a total of 12 with 4 numbers. We've established that f(4) = 10. But what if the first (lowest) number is 2?
2 + 3 + 4 + 5 = 14
So the first number can't be higher than 1. You know your first number. Now you generate a sequence of 3 numbers with a total of 11 (being 12 - 1).
1 + 2 + 3 = 6
2 + 3 + 4 = 9
3 + 4 + 5 = 12
The second number has to be 2. It can't be 1, because 1 is already used, and it can't be 3, because the minimum sum of three numbers starting with 3 is 12 and we have to add up to 11.
Now we find two numbers that add up to 9 (12 - 1 - 2) with 3 being the lowest possible.
3 + 4 = 7
4 + 5 = 9
The third number can be 3 or 4. With the third number found the last is fixed. The two possible combinations are:
1, 2, 3, 6
1, 2, 4, 5
You can turn this into a general algorithm. Consider this recursive implementation:
$all = all_sequences(14, 4);
echo "\nAll sequences:\n\n";
foreach ($all as $arr) {
echo implode(', ', $arr) . "\n";
}
function all_sequences($total, $num, $start = 1) {
if ($num == 1) {
return array(array($total)); // one single-element sequence, same shape as the other branches
}
$max = lowest_maximum($start, $num);
$limit = (int)(($total - $max) / $num) + $start;
$ret = array();
if ($num == 2) {
for ($i = $start; $i <= $limit; $i++) {
$ret[] = array($i, $total - $i);
}
} else {
for ($i = $start; $i <= $limit; $i++) {
$sub = all_sequences($total - $i, $num - 1, $i + 1);
foreach ($sub as $arr) {
array_unshift($arr, $i);
$ret[] = $arr;
}
}
}
return $ret;
}
function lowest_maximum($start, $num) {
return sum_linear($num) + ($start - 1) * $num;
}
function sum_linear($num) {
return ($num + 1) * $num / 2;
}
Output:
All sequences:
1, 2, 3, 8
1, 2, 4, 7
1, 2, 5, 6
1, 3, 4, 6
2, 3, 4, 5
One implementation of this would be to get all the sequences and select one at random. This has the advantage of equally weighting all possible combinations, which may or may not be useful or necessary to what you're doing.
That will become unwieldy with large totals or large numbers of elements, in which case the above algorithm can be modified to return a random element in the range from $start to $limit instead of every value.
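A possible sketch of that modification, assuming mt_rand is an acceptable random source. random_sequence is a hypothetical name, and the bound is computed the same way as $limit in all_sequences:

```php
<?php
// Sketch of the random variant: at each level pick ONE first value in
// [$start, $limit] instead of looping over all of them.
function random_sequence(int $total, int $num, int $start = 1): ?array
{
    if ($num < 1) {
        return null;
    }
    // Smallest sum of $num distinct integers beginning at $start.
    $lowest = intdiv($num * ($num + 1), 2) + ($start - 1) * $num;
    if ($total < $lowest) {
        return null; // no valid sequence exists
    }
    $seq = array();
    while ($num > 1) {
        $lowest = intdiv($num * ($num + 1), 2) + ($start - 1) * $num;
        $limit = intdiv($total - $lowest, $num) + $start;
        $pick = mt_rand($start, $limit);
        $seq[] = $pick;
        $total -= $pick;      // what the remaining values must add up to
        $start = $pick + 1;   // remaining values must be larger
        $num--;
    }
    $seq[] = $total; // the last value is forced
    return $seq;
}
```

Two caveats: choosing uniformly at each level does not weight all complete sequences equally, and the result always comes back in ascending order, so call shuffle() on it if the order should look random too.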
I would use the 'area under a triangle' formula... like cletus(!?)
I'm really going to have to start paying more attention to things...
Anyway, I think this solution is pretty elegant now: it applies the desired minimum spacing between all elements evenly, scales the gaps (distribution) evenly to maintain the original sum, and does the job non-recursively (except for the sort):
Given an array a() of random numbers of length n
Generate a sort index s()
and work on the sorted intervals a(s(0))-a(s(1)), a(s(1))-a(s(2)) etc
increase each interval by the desired minimum separation size, e.g. 1 (this necessarily warps their 'randomness')
decrease each interval by a factor calculated to restore the series sum to what it is without the added spacing.
If we add 1 to each element of a series, we increase the series sum by 1 * len.
Adding 1 to each of the series' intervals increases the sum by:
len*(len+1)/2 // a triangular number (cf. Pascal's triangle)
Draft code:
// $series holds the input sequence (an array of integers)
$length = count($series);
$seriesum = array_sum($series); // its sum
$minsepa = 1; // minimum separation
$sorted = $series;
asort($sorted); // sort a copy by value, keeping the original keys
$sorti = array_keys($sorted); // sorted index
$sepsum = $minsepa * ($length * ($length + 1)) / 2;
// sum of extra separation
$unsepfactor100 = (int)(($seriesum * 100) / ($seriesum + $sepsum));
// draft scale factor for the original separation, intended to maintain size
// (*100 for integer arithmetic)
$px = $series[$sorti[0]]; // the loop needs the value of the previous element
for ($x = 1; $x < $length; $x++) {
    $tx = $series[$sorti[$x]]; // value of the element to adjust
    $series[$sorti[$x]] = ($minsepa * $x) // adjust relative to prev
        + $px
        + (int)((($tx - $px) * $unsepfactor100) / 100);
    $px = $tx; // store for next iteration
}
All intervals are reduced by a constant (non-random-warping) factor.
Separation can be set to values other than one.
Implementations need to be carefully tweaked (I usually test and 'calibrate') to accommodate rounding errors. Probably scale everything up by ~15, then back down after. Intervals should survive if done right.
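For comparison, here is a runnable sketch of the same steps (sort, add the minimum gap, shrink the original gaps) that sidesteps rounding by working in floating point. It is not the draft above: spread_apart is a hypothetical name, the smallest value is kept fixed, and the shrink factor is solved for exactly instead of being calibrated. With the smallest value fixed, element number k in sorted order moves up by k gaps, so the forced gaps add minSep * n * (n - 1) / 2 to the sum.

```php
<?php
// Sketch: spread values so sorted neighbours differ by at least $minSep
// while the total stays the same. Assumes $minSep > 0. Returns floats.
function spread_apart(array $series, float $minSep = 1.0): ?array
{
    $n = count($series);
    if ($n < 2) {
        return $series;
    }
    asort($series);                 // sort by value, keeping the original keys
    $keys = array_keys($series);    // sort index
    $first = $series[$keys[0]];     // the smallest value stays where it is
    $spread = array_sum($series) - $n * $first; // sum of offsets above the minimum
    $extra = $minSep * $n * ($n - 1) / 2;       // sum added by the forced gaps
    if ($spread < $extra) {
        return null; // not enough room while keeping the smallest value fixed
    }
    $factor = ($spread - $extra) / $spread;     // shrink factor for the original gaps
    $out = array();
    foreach ($keys as $rank => $key) {
        $out[$key] = $first + $minSep * $rank + ($series[$key] - $first) * $factor;
    }
    ksort($out); // back to the original element order
    return $out;
}
```

For array(5, 5, 9, 11) the factor works out to 0.4 and the values become 5, 6, 8.6 and 10.4, which still sum to 30. Getting integers back out of this is where the rounding calibration mentioned above comes in.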
After the sort index is generated, shuffle the order of the indexes that point to duplicate values, to avoid runs in the sequence of collided series
(or just shuffle the final output if order never mattered).
Shuffle indexes of dupes:
for ($x = 1; $x < $len; $x++) {
    if ($series[$srt[$x]] == $series[$srt[$x - 1]]) {
        if (mt_rand(0, 1)) { // swap neighbouring dupes half of the time
            $sw = $srt[$x];
            $srt[$x] = $srt[$x - 1];
            $srt[$x - 1] = $sw;
        }
    }
}
A kind of minimal disturbance can be applied to a 'random sequence' by parting dupes only by the minimum required, rather than moving them by more than the minimum (some 'random' amount, as the question sought).
The code here separates every element by the minimum separation, whether duplicate or not; that should be fairly even-handed, but maybe overdone. The code could be modified to separate only the dupes, by looking through series(sorti(n0:n1..len)) for them and calculating sepsum as += minsep*(len-n) for each dupe. The adjustment loop then just has to test again for a dupe before applying the adjustment.
