GFS Grib Wind Values Decode And Convert (U & V) - php

I'm writing a GRIB2 decoder in PHP, starting from a half-written library that I found. Everything works except that the values I get are incorrect after converting the packed integer values to real values. I think I am converting everything correctly, and when I test with cloud data the result matches what Panoply shows. I suspect the problem is with this formula, which is all over the internet. Below I'm using 10 m above ground GFS data from https://nomads.ncep.noaa.gov
Y*10^D = R+(X1+X2)*2^E
I'm not sure I'm plugging in the values correctly, but again, it works with cloud cover percentages.
These are the "Data Representation Values" I get from GRIB Section 5:
'Reference value (R)' => 886.25067138671875,
'Binary Scale Factor (E)' => 0,
'Decimal Scale Factor (D)' => 2,
'Number of bits used for each packed value' => 11,
'exp' => pow(2, $E), //(Equals 1) (The Library used these as the 2^E)
'base' => pow(10, $D), //(Equals 100) (And the 10^D)
'template' => 0,
As you can see below, the numbers definitely have a connection to the reference value. The number closest to 886 (R) is 892, and its actual value should be 0.05, as shown below (EX.). The numbers higher than 892 are positive and the ones lower than 892 are negative. But when I use the formula (886 + 892 * 1) / 100 it gives me 17.78, not 0.05. I seem to be missing something pretty obvious. Am I misunderstanding the formula, where Y is the value I want?
X1 = 0 (documentation says)
X2 = 892 (documentation says is scaled value, the value in the Grib from bits?)
2^0 = 1
10^2 = 100
R = 886.25067138671875
Y * 10^D = R + (X1 + X2) * 2^E
Y * 100 = R + (X1 + X2) * 1
(886 + (0 + 892) * 1) / 100
(886 + 892 * 1) / 100
= 17.78
Int Values of wind from Grib (After converting from Bits)
0 => 695,
1 => 639,
2 => 631,
3 => 0,
4 => 436,
5 => 513,
6 => 690,
7 => 570,
8 => 625,
9 => 805,
10 => 892,<-----------(EX.)
11 => 1044,
12 => 952,
13 => 1081,
14 => 1414,
15 => 997,
16 => 1106,
17 => 974,
18 => 1135,
19 => 1069,
20 => 912,
Actual decoded wind values shown in Panoply (Well known Grib App)
-1.9125067
-2.4725068
-2.5525067
-8.862507
-4.5025067
-3.7325068
-1.9625068
-3.1625068
-2.6125066
-0.81250674
0.057493284 <-----------(EX.)
1.5774933
0.6574933
1.9474933
5.2774935
1.1074933
2.1974933
0.87749326
2.4874933
1.8274933
0.2574933

y = 0.01 * (x - 886.25067138671875) seems to work for all points
so 0.01 * (892 - 886.25067138671875) = 0.0574
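A minimal PHP sketch of the relation above, applied to a few of the packed integers from the question. The function name `unpackSimple` is hypothetical. Algebraically, y = 0.01 * (x - 886.25...) is the formula from the question, Y * 10^D = R + X * 2^E, with R = -886.25067138671875. One possible explanation, which is an assumption and not confirmed in the thread, is that the sign of the reference value was lost when Section 5 was decoded.

```php
<?php
// Hypothetical helper: Y = (R + X * 2^E) / 10^D, the formula from the question.
function unpackSimple(int $x, float $r, int $e, int $d): float
{
    return ($r + $x * (2 ** $e)) / (10 ** $d);
}

// Assumption: the reference value is negative. This is what makes the
// formula agree with the empirical relation y = 0.01 * (x - 886.25...).
$r = -886.25067138671875;

// A few packed values taken from the list in the question.
foreach ([695, 0, 892, 1044] as $x) {
    printf("%4d => %.7f\n", $x, unpackSimple($x, $r, 0, 2));
}
```

The printed values can be compared against the Panoply column in the question (for example, 892 should come out close to 0.0575).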

Related

Decode a CS:GO match sharing code with PHP

I am trying to build a function which decodes a CS:GO match sharing code. I have seen enough examples, but everything is in JS or C# and nothing is in PHP.
I took the akiver demo manager as an example and tried to replicate it in PHP. I am working a bit blind because I have no idea what the output is at certain points, so I can only hope that the result will be what I expect. I think I am on the right path; the problem comes when the bytes have to be created, interpreted and converted to the desired outcome.
The code that should be decoded is the following: 'CSGO-oPRbA-uTQuR-UFkiC-hYWMB-syBcO' ($getNextGame variable)
The result should be 3418217537907720662
My code so far:
/**
* @param string $getNextGame
* @return array
*/
public function decodeDemoCode(string $getNextGame): array
{
$shareCodePattern = "/CSGO(-?[\w]{5}){5}$/";
if (preg_match($shareCodePattern, $getNextGame) === 1) {
$result = [];
$bigNumber = 0;
$matchIdBytes = $outcomeIdBytes = $tvPortIdBytes = [];
$dictionary = "ABCDEFGHJKLMNOPQRSTUVWXYZabcdefhijkmnopqrstuvwxyz23456789";
$dictionaryLength = strlen($dictionary);
$changedNextGame = str_replace(array("CSGO", "-"), "", $getNextGame);
$chars = array_reverse(str_split($changedNextGame));
foreach ($chars as $char) {
$bigNumber = ($bigNumber * $dictionaryLength) + strpos($dictionary, $char);
}
}
}
This brings me back something like that:
1.86423701402E+43 (double)
Then i have the following:
$packed = unpack("C*", $bigNumber);
$reversedPacked = array_reverse($packed);
and this brings the following back:
array(17 items)
0 => 51 (integer)
1 => 52 (integer)
2 => 43 (integer)
3 => 69 (integer)
4 => 50 (integer)
5 => 48 (integer)
6 => 52 (integer)
7 => 49 (integer)
8 => 48 (integer)
9 => 55 (integer)
10 => 51 (integer)
11 => 50 (integer)
12 => 52 (integer)
13 => 54 (integer)
14 => 56 (integer)
15 => 46 (integer)
16 => 49 (integer)
Now here I am not really sure what to do, because I do not completely understand C# and I have never worked with bytes in PHP before.
Generally the return type should be an array and would look something like that:
$result = [
matchId => 3418217537907720662,
reservationId => 3418217537907720662,
tvPort => 55788
];
Thanks in advance. Any help is deeply appreciated
I have created a PHP class which makes that possible:
CS:GO ShareCode Decoder PHP
The first problem you have to solve is the returned double value. PHP has a limit on the size of integers; see "What is the maximum value for an integer in PHP" for more on that.
Because of this limitation you lose precision, which leads to inaccurate results. To solve this you will have to use one of these libraries: GMP or BC Math. They work with arbitrary-precision numbers and can give you the result back as a string, which avoids the double value you got.
So your code has to look something like this:
foreach ($chars as $char) {
$bigNumber = gmp_add(
gmp_mul($bigNumber,$dictionaryLength),
strpos($dictionary,$char)
);
}
$number = gmp_strval($bigNumber);
This will give you back the following "18642370140230194654275126136176397505221000"
You do not really need the PHP pack and unpack functions since the results can be generated without them. The next step is to convert your number to hexadecimal. You can do that with the following:
$toHex = gmp_strval(gmp_init($number, 10), 16);
Again you need the GMP library to get the desired value. Here you make sure that the number is passed as a string and then convert its base from 10 to 16 (hexadecimal). The result is the following:
"d6010080bdf26f2fbf0100007cf76f2f5188"
The next step is to convert the hex value to an array of byte integers. It looks like this:
$bytes = [];
$byteArray= str_split($toHex, 2);
foreach ($byteArray as $byte) {
$bytes[] = (int)base_convert($byte, 16, 10);
}
Here you split the hex string into chunks of two characters. The $byteArray variable looks like this (before it enters the foreach loop):
array(18 items)
0 => 'd6' (2 chars) 1 => '01' (2 chars) 2 => '00' (2 chars) 3 => '80' (2 chars)
4 => 'bd' (2 chars) 5 => 'f2' (2 chars) 6 => '6f' (2 chars) 7 => '2f' (2 chars)
8 => 'bf' (2 chars) 9 => '01' (2 chars) 10 => '00' (2 chars) 11 => '00' (2 chars)
12 => '7c' (2 chars) 13 => 'f7' (2 chars) 14 => '6f' (2 chars) 15 => '2f' (2 chars)
16 => '51' (2 chars) 17 => '88' (2 chars)
Now you have to convert each entry into an integer. Since the values are no longer that big, you can change their base with the base_convert function, from 16 (hex) back to 10. After the foreach loop, $bytes looks like this:
array(18 items)
0 => 214 (integer) 1 => 1 (integer) 2 => 0 (integer) 3 => 128 (integer)
4 => 189 (integer) 5 => 242 (integer) 6 => 111 (integer) 7 => 47 (integer)
8 => 191 (integer) 9 => 1 (integer) 10 => 0 (integer) 11 => 0 (integer)
12 => 124 (integer) 13 => 247 (integer) 14 => 111 (integer) 15 => 47 (integer)
16 => 81 (integer) 17 => 136 (integer)
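One detail worth guarding against: gmp_strval() does not emit leading zeros, so if the most significant byte is below 0x10 the hex string has an odd length and str_split() pairs the digits incorrectly. A minimal sketch of the same step with padding, assuming the share code always decodes to 18 bytes as in this example (the helper name `hexToBytes` is hypothetical):

```php
<?php
// Hypothetical helper: turn a hex string into an array of byte values.
function hexToBytes(string $hex, int $byteCount): array
{
    // Left-pad so that leading zero digits are restored before splitting.
    $hex = str_pad($hex, $byteCount * 2, '0', STR_PAD_LEFT);

    // Two hex digits per byte; hexdec() returns an int for such short input.
    return array_map('hexdec', str_split($hex, 2));
}

$bytes = hexToBytes('d6010080bdf26f2fbf0100007cf76f2f5188', 18);
echo count($bytes), " bytes, first = ", $bytes[0], ", last = ", $bytes[17], "\n";
```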
Now you have to define which bytes are responsible for each result.
$matchIdBytes = array_reverse(array_slice($bytes, 0, 8));
$reservationIdBytes = array_reverse(array_slice($bytes, 8, 8));
$portBytes = array_reverse(array_slice($bytes, 16, 2));
For the match id, take the first 8 entries and then reverse the array.
For the reservation id, take the next 8 entries, starting from index 8, and reverse the array.
For the port, take the last 2 entries and reverse the array.
Now you have to return the values:
return [
'matchId' => $this->getResultFromBytes($matchIdBytes),
'reservationId' => $this->getResultFromBytes($reservationIdBytes),
'tvPort' => $this->getResultFromBytes($portBytes)
];
The getResultFromBytes() function:
/**
* @param array $bytes
* @return string
*/
public function getResultFromBytes(array $bytes): string
{
$chars = array_map("chr", $bytes);
$bin = implode($chars);
$hex = bin2hex($bin);
return gmp_strval($this->gmp_hexDec($hex));
}
/**
* @param string $n
* @return \GMP
*/
public function gmp_hexDec($n): \GMP
{
$gmp = gmp_init(0);
$multi = gmp_init(1);
for ($i=strlen($n)-1;$i>=0;$i--,$multi=gmp_mul($multi, 16)) {
$gmp = gmp_add($gmp, gmp_mul($multi, hexdec($n[$i])));
}
return $gmp;
}
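As a possible shortcut, gmp_init() accepts a base argument, so the hex string can be parsed directly instead of digit by digit. A minimal sketch, assuming the GMP extension is loaded (the function name `bytesToDecimal` is hypothetical and stands in for getResultFromBytes()):

```php
<?php
// Hypothetical compact variant of getResultFromBytes(); requires ext-gmp.
function bytesToDecimal(array $bytes): string
{
    // Pack the byte values into a binary string, then read it as one hex number.
    $hex = bin2hex(pack('C*', ...$bytes));

    return gmp_strval(gmp_init($hex, 16));
}

// First eight bytes from the example above, reversed as described.
$matchIdBytes = array_reverse([214, 1, 0, 128, 189, 242, 111, 47]);
echo bytesToDecimal($matchIdBytes), "\n"; // 3418217537907720662
```

The printed value matches the match id expected in the question.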
Best regards

Convert number to and from alphanumeric code

Is it possible to convert from numeric to an alphanumeric code like this:
a
b
c
d
..
z
1
2
3
4
..
aa
ab
ac
ad
..
az
a1
a2
a3
a4
..
aaa
aab
aac
aad
..
aaz
aa1
aa2
etc.
I'm trying to convert large numbers to smaller length alphanumeric strings.
Don't know why you want to do this specifically, but try changing the base from 10 to something like 32;
base_convert($number, 10, 32);
Then to convert back
base_convert($number, 32, 10);
As someone else pointed out - for very large numbers this may not work.
If you need to be able to handle very large numbers, check out this link:
How to generate random 64-bit value as decimal string in PHP
You can use base_convert() for changing the base of your number from 10 (decimal) to 36 (26 latin letters plus 10 arabic numerals).
The result will differ from your example list. You used the digit order abc..xyz012..789; base_convert uses a different order, 012..789abc..xyz.
// convert decimal to base36
echo base_convert($number_dec, 10 , 36);
// convert base36 to decimal
echo base_convert($number_b36, 36 , 10);
Translation
dec base36
0 0
1 1
...
9 9
10 a
11 b
...
34 y
35 z
36 10
37 11
..
45 19
46 1a
...
1295 zz
1296 100
1297 101
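A few rows of the table can be spot-checked with a short script. This is only an illustration; base_convert() takes and returns strings.

```php
<?php
// Pairs of (decimal, base36) taken from the translation table above.
$rows = [['10', 'a'], ['35', 'z'], ['36', '10'], ['46', '1a'], ['1295', 'zz'], ['1296', '100']];

foreach ($rows as [$dec, $b36]) {
    $encoded = base_convert($dec, 10, 36); // decimal -> base36
    $decoded = base_convert($b36, 36, 10); // base36 -> decimal
    printf("%s -> %s, %s -> %s\n", $dec, $encoded, $b36, $decoded);
}
```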
You could use dechex to convert the number to hex
http://php.net/manual/en/function.dechex.php
For example:
1000000 => f4240
1000001 => f4241
1000002 => f4242
1000003 => f4243
1000004 => f4244
1000005 => f4245
1000006 => f4246
1000007 => f4247
1000008 => f4248
1000009 => f4249
1000010 => f424a
1000011 => f424b
1000012 => f424c
1000013 => f424d
1000014 => f424e
1000015 => f424f
1000016 => f4250
1000017 => f4251
1000018 => f4252
1000019 => f4253
1000020 => f4254
To convert back, just use hexdec
http://php.net/manual/en/function.hexdec.php
base64_encode();
and for decode use
base64_decode();
Both dechex() and base_convert() will fail with large numbers. They are limited by the maximum size and precision of int and float types internally used during conversion.
The http://php.net/manual/pt_BR/function.base-convert.php discussion has some nice helper functions (see 2, 3) that can avoid this problem by using BC-functions to do the math. The BC extension can deal with arbitrarily large numbers.
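Where no arbitrary-precision extension is available, the same idea can be sketched with plain string arithmetic: schoolbook long division of the decimal string by 36. This is a different technique from the BC-based helpers mentioned above, and the function name `decToBase36` is hypothetical.

```php
<?php
// Hypothetical helper: convert a non-negative decimal string of any length
// to base 36 by repeated long division, one decimal digit at a time.
function decToBase36(string $dec): string
{
    $digits = '0123456789abcdefghijklmnopqrstuvwxyz';
    $out = '';
    $dec = ltrim($dec, '0');

    while ($dec !== '') {
        $rem = 0;
        $quot = '';
        // Divide the whole decimal string by 36, keeping the remainder.
        for ($i = 0, $n = strlen($dec); $i < $n; $i++) {
            $cur = $rem * 10 + (int) $dec[$i];
            $quot .= intdiv($cur, 36);
            $rem = $cur % 36;
        }
        $out = $digits[$rem] . $out; // remainder is the next base-36 digit
        $dec = ltrim($quot, '0');    // continue with the quotient
    }

    return $out === '' ? '0' : $out;
}

echo decToBase36('1295'), "\n"; // zz
echo decToBase36('18642370140230194654275126136176397505221000'), "\n";
```

Intermediate values never exceed 359, so the function is not limited by the int or float range.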

Randomly Generating Combinations From Variable Weights

VERY IMPORTANT EDIT: All Ai are unique.
The Question
I have a list A of n unique objects. Each object Ai has a variable percentage Pi.
I want to create an algorithm that generates a new list B of k objects (k < n/2 and in most cases k is significantly less than n/2. E.g. n=231 , k=21). List B should have no duplicates and will be populated with objects originating from list A with the following restriction:
The probability that an object Ai appears in B is Pi.
What I Have Tried
(These snippets are in PHP simply for the purposes of testing)
I first made list A
$list = [
"A" => 2.5,
"B" => 2.5,
"C" => 2.5,
"D" => 2.5,
"E" => 2.5,
"F" => 2.5,
"G" => 2.5,
"H" => 2.5,
"I" => 5,
"J" => 5,
"K" => 2.5,
"L" => 2.5,
"M" => 2.5,
"N" => 2.5,
"O" => 2.5,
"P" => 2.5,
"Q" => 2.5,
"R" => 2.5,
"S" => 2.5,
"T" => 2.5,
"U" => 5,
"V" => 5,
"W" => 5,
"X" => 5,
"Y" => 5,
"Z" => 20
];
At first I tried the following two algorithms (these are in PHP simply for the purposes of testing):
$result = [];
while (count($result) < 10) {
$rnd = rand(0,10000000) / 100000;
$sum = 0;
foreach ($list as $key => $value) {
$sum += $value;
if ($rnd <= $sum) {
if (in_array($key,$result)) {
break;
} else {
$result[] = $key;
break;
}
}
}
}
AND
$result = [];
while (count($result) < 10) {
$sum = 0;
foreach ($list as $key => $value) {
$sum += $value;
}
$rnd = rand(0,$sum * 100000) / 100000;
$sum = 0;
foreach ($list as $key => $value) {
$sum += $value;
if ($rnd <= $sum) {
$result[] = $key;
unset($list[$key]);
break;
}
}
}
The only difference between the two algorithms is that one tries again when it encounters a duplicate, and the other removes the object from list A when it is picked. As it turns out, these two algorithms have the same probability outputs.
I ran the second algorithm 100,000 times and kept track of how many times each letter was picked. The following array contains the percentage chance that a letter is picked in any list B, based on the 100,000 tests.
[A] => 30.213
[B] => 29.865
[C] => 30.357
[D] => 30.198
[E] => 30.152
[F] => 30.472
[G] => 30.343
[H] => 30.011
[I] => 51.367
[J] => 51.683
[K] => 30.271
[L] => 30.197
[M] => 30.341
[N] => 30.15
[O] => 30.225
[P] => 30.135
[Q] => 30.406
[R] => 30.083
[S] => 30.251
[T] => 30.369
[U] => 51.671
[V] => 52.098
[W] => 51.772
[X] => 51.739
[Y] => 51.891
[Z] => 93.74
When looking back at the algorithm this makes sense. The algorithm incorrectly interpreted the original percentages to be the percentage chance that an object is picked for any given location, not any list B. So for example, in reality, the chance that Z is picked in a list B is 93%, but the chance that Z is picked for an index Bn is 20%. This is NOT what I want. I want the chance that Z is picked in a list B to be 20%.
Is this even possible? How can it be done?
EDIT 1
I tried simply having the sum of all Pi = k, this worked if all Pi are equal, but after modifying their values, it started to get more and more wrong.
Initial Probabilities
$list= [
"A" => 8.4615,
"B" => 68.4615,
"C" => 13.4615,
"D" => 63.4615,
"E" => 18.4615,
"F" => 58.4615,
"G" => 23.4615,
"H" => 53.4615,
"I" => 28.4615,
"J" => 48.4615,
"K" => 33.4615,
"L" => 43.4615,
"M" => 38.4615,
"N" => 38.4615,
"O" => 38.4615,
"P" => 38.4615,
"Q" => 38.4615,
"R" => 38.4615,
"S" => 38.4615,
"T" => 38.4615,
"U" => 38.4615,
"V" => 38.4615,
"W" => 38.4615,
"X" => 38.4615,
"Y" =>38.4615,
"Z" => 38.4615
];
Results after 10,000 runs
Array
(
[A] => 10.324
[B] => 59.298
[C] => 15.902
[D] => 56.299
[E] => 21.16
[F] => 53.621
[G] => 25.907
[H] => 50.163
[I] => 30.932
[J] => 47.114
[K] => 35.344
[L] => 43.175
[M] => 39.141
[N] => 39.127
[O] => 39.346
[P] => 39.364
[Q] => 39.501
[R] => 39.05
[S] => 39.555
[T] => 39.239
[U] => 39.283
[V] => 39.408
[W] => 39.317
[X] => 39.339
[Y] => 39.569
[Z] => 39.522
)
We must have sum_i P_i = k, or else we cannot succeed.
As stated, the problem is somewhat easy, but you may not like this answer, on the grounds that it's "not random enough".
Sample a uniform random permutation Perm on the integers [0, n)
Sample X uniformly at random from [0, 1)
For i in Perm
If X < P_i, then append A_i to B and update X := X + (1 - P_i)
Else, update X := X - P_i
End
You'll want to approximate the calculations involving real numbers with fixed-point arithmetic, not floating-point.
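The steps above can be sketched in PHP using integers as fixed-point probabilities, as suggested. This is a sketch under assumptions, not a definitive implementation: the function name `systematicSample` and the example weights are made up for illustration, each weight must be at most $scale, and the weights must sum to k * $scale.

```php
<?php
/**
 * Hypothetical helper: the sampling loop described above, in fixed point.
 * $weights maps each item to an integer probability in units of 1/$scale
 * (e.g. 25 with $scale = 100 means 25%).
 */
function systematicSample(array $weights, int $scale): array
{
    $keys = array_keys($weights);
    shuffle($keys);                  // random permutation of the items
    $x = random_int(0, $scale - 1);  // X sampled from [0, 1), scaled
    $picked = [];

    foreach ($keys as $key) {
        $p = $weights[$key];
        if ($x < $p) {
            $picked[] = $key;        // append A_i to B
            $x += $scale - $p;       // X := X + (1 - P_i)
        } else {
            $x -= $p;                // X := X - P_i
        }
    }

    return $picked;
}

// Made-up example: the weights sum to 200 with scale 100, so k = 2.
$weights = ['A' => 50, 'B' => 50, 'C' => 25, 'D' => 75];
print_r(systematicSample($weights, 100));
```

Because all arithmetic is on integers, every call returns exactly k items with no duplicates, which floating-point rounding would not guarantee.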
The missing condition is that the distribution have a technical property called "maximum entropy". Like amit, I cannot think of a good way to do this. Here's a clumsy way.
My first (and wrong) instinct for solving this problem was to include each A_i in B independently with probability P_i and retry until B is the right length (there won't be too many retries, for reasons that you can ask math.SE about). The problem is that the conditioning messes up the probabilities. If P_1 = 1/3 and P_2 = 2/3 and k = 1, then the outcomes are
{}: probability 2/9
{A_1}: probability 1/9
{A_2}: probability 4/9
{A_1, A_2}: probability 2/9,
and the conditional probabilities are actually 1/5 for A_1 and 4/5 for A_2.
Instead, we should substitute new probabilities Q_i that yield the proper conditional distribution. I don't know of a closed form for Q_i, so I propose to find them using a numerical optimization algorithm like gradient descent. Initialize Q_i = P_i (why not?). Using dynamic programming, it's possible to find, for the current setting of Q_i, the probability that, given an outcome with l elements, that A_i is one of those elements. (We only care about the l = k entry, but we need the others to make the recurrences work.) With a little more work, we can get the whole gradient. Sorry this is so sketchy.
In Python 3, using a nonlinear solution method that seems to converge always (update each q_i simultaneously to its marginally correct value and normalize):
#!/usr/bin/env python3
import collections
import operator
import random


def constrained_sample(qs):
    k = round(sum(qs))
    while True:
        sample = [i for i, q in enumerate(qs) if random.random() < q]
        if len(sample) == k:
            return sample


def size_distribution(qs):
    size_dist = [1]
    for q in qs:
        size_dist.append(0)
        for j in range(len(size_dist) - 1, 0, -1):
            size_dist[j] += size_dist[j - 1] * q
            size_dist[j - 1] *= 1 - q
    assert abs(sum(size_dist) - 1) <= 1e-10
    return size_dist


def size_distribution_without(size_dist, q):
    size_dist = size_dist[:]
    if q >= 0.5:
        for j in range(len(size_dist) - 1, 0, -1):
            size_dist[j] /= q
            size_dist[j - 1] -= size_dist[j] * (1 - q)
        del size_dist[0]
    else:
        for j in range(1, len(size_dist)):
            size_dist[j - 1] /= 1 - q
            size_dist[j] -= size_dist[j - 1] * q
        del size_dist[-1]
    assert abs(sum(size_dist) - 1) <= 1e-10
    return size_dist


def test_size_distribution(qs):
    d = size_distribution(qs)
    for i, q in enumerate(qs):
        d1a = size_distribution_without(d, q)
        d1b = size_distribution(qs[:i] + qs[i + 1 :])
        assert len(d1a) == len(d1b)
        assert max(map(abs, map(operator.sub, d1a, d1b))) <= 1e-10


def normalized(qs, k):
    sum_qs = sum(qs)
    qs = [q * k / sum_qs for q in qs]
    assert abs(sum(qs) / k - 1) <= 1e-10
    return qs


def approximate_qs(ps, reps=100):
    k = round(sum(ps))
    qs = ps[:]
    for j in range(reps):
        size_dist = size_distribution(qs)
        for i, p in enumerate(ps):
            d = size_distribution_without(size_dist, qs[i])
            d.append(0)
            qs[i] = p * d[k] / ((1 - p) * d[k - 1] + p * d[k])
        qs = normalized(qs, k)
    return qs


def test(ps, reps=100000):
    print(ps)
    qs = approximate_qs(ps)
    print(qs)
    counter = collections.Counter()
    for j in range(reps):
        counter.update(constrained_sample(qs))
    test_size_distribution(qs)
    print("p", "Actual", sep="\t")
    for i, p in enumerate(ps):
        print(p, counter[i] / reps, sep="\t")


if __name__ == "__main__":
    test([2 / 3, 1 / 2, 1 / 2, 1 / 3])
Let's analyze it for a second.
With replacements: (not what you want, but simpler to analyze).
Given a list L of size k and an element a_i, the probability for a_i to be in the list is denoted by your value p_i.
Let's examine the probability of a_i to be at some index j in the list. Let's denote that probability as q_i,j. Note that for any index t in the list, q_i,j = q_i,t - so we can simply say q_i_1=q_i_2=...=q_i_k=q_i.
The probability that a_i will be anywhere in the list is denoted as:
1-(1-q_i)^k
But it is also p_i - so we need to solve the equation
1-(1-q_i)^k = p_i
1 - (1-q_i)^k - p_i = 0
One way to do it is the Newton-Raphson method.
After calculating the probability for each element, check whether it is indeed a probability space (sums to 1, all probabilities are in [0,1]). If it's not, it cannot be done for the given probabilities and k.
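For this particular equation an iterative solver is optional, because it can be rearranged directly: q_i = 1 - (1 - p_i)^(1/k). A minimal sketch of that rearrangement (the function name `perDrawProbability` is hypothetical):

```php
<?php
// Hypothetical helper: per-draw probability q such that 1 - (1 - q)^k = p.
function perDrawProbability(float $p, int $k): float
{
    return 1.0 - (1.0 - $p) ** (1.0 / $k);
}

// Example: an element that should appear in a list of k = 10 with p = 0.2.
$q = perDrawProbability(0.2, 10);

// Plugging q back into the original equation should reproduce p.
printf("q = %.6f, check p = %.6f\n", $q, 1 - (1 - $q) ** 10);
```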
Without replacement: This is trickier, since now q_i,j != q_i,t (the selections are not i.i.d.). Calculating the probabilities here is much harder, and I am not sure at the moment how to do it; I suppose it would have to be done at run time, during the creation of the list.
(Deleted a solution that I am almost certain is biased).
Unless my math skills are a lot weaker than I think, the average chance of an element from list A in your example being found in list B should be 10/26 = 0.38.
If you lower this chance for any object, there must be others with higher chances.
Also, the probabilities from your list A cannot work: they are too low, so you could not fill your list; you don't have enough elements to pick from.
Assuming the above is correct (or correct enough), the average weight in your list A would have to equal the average chance of a random pick. That, in turn, means the probabilities in list A don't sum to 100.
Unless I am completely wrong, that is...

Using unpack() to convert to a byte array in PHP

I'm trying to convert a binary string to a byte array of a specific format.
Sample binary data:
ê≤ÚEZêK
The hex version of the binary string looks like this:
00151b000000000190b2f20304455a000003900000004b0000
The Python script uses struct package and unpacks the above string (in binary) using this code:
data = unpack(">hBiiiiih",binarydata)
The desired result looks like this; it is the content of the data tuple:
(21, 27, 0, 26260210, 50611546, 912, 75, 0)
How can I unpack the same binary string using PHP's unpack() function and get the same output? That is, what's the >hBiiiiih equivalent in PHP?
So far my PHP code
$hex = "00151b000000000190b2f20304455a000003900000004b0000";
$bin = pack("H*",$hex);
print_r(unpack("x/c*", $bin));
Which gives:
Array ( [*1] => 21 [*2] => 27 [*3] => 0 [*4] => 0 [*5] => 0 [*6] => 0 [*7] => 1 [*8] => -112 [*9] => -78 [*10] => -14 [*11] => 3 [*12] => 4 [*13] => 69 [*14] => 90 [*15] => 0 [*16] => 0 [*17] => 3 [*18] => -112 [*19] => 0 [*20] => 0 [*21] => 0 [*22] => 75 [*23] => 0 [*24] => 0 )
Would also appreciate links to a PHP tutorial on working with pack/unpack.
This produces the same result as Python, but it treats signed values as unsigned, because unpack() does not have format codes for signed values with a fixed endianness. Also note that the integers are converted using long, but this is OK because both have the same size.
$hex = "00151b000000000190b2f20304455a000003900000004b0000";
$bin = pack("H*", $hex);
$x = unpack("nbe_unsigned_1/Cunsigned_char/N5be_unsigned_long/nbe_unsigned_2", $bin);
print_r($x);
Array
(
[be_unsigned_1] => 21
[unsigned_char] => 27
[be_unsigned_long1] => 0
[be_unsigned_long2] => 26260210
[be_unsigned_long3] => 50611546
[be_unsigned_long4] => 912
[be_unsigned_long5] => 75
[be_unsigned_2] => 0
)
Because this data is treated as unsigned, you will need to detect whether the original data was negative, which can be done for 2 byte shorts with something similar to this:
if ($x["be_unsigned_1"] >= pow(2, 15)) {
    $x["be_unsigned_1"] = $x["be_unsigned_1"] - pow(2, 16);
}
and for longs using
if ($x["be_unsigned_long2"] >= pow(2, 31)) {
    $x["be_unsigned_long2"] = $x["be_unsigned_long2"] - pow(2, 32);
}
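The sign fix-up can be wrapped in one helper and applied to every signed field, giving the same list as Python's ">hBiiiiih". A minimal sketch, assuming 64-bit PHP (so that 1 << 32 fits in an int); the helper name `toSigned` and the field names in the format string are hypothetical:

```php
<?php
// Hypothetical helper: reinterpret an unsigned $bits-wide value as two's complement.
function toSigned(int $value, int $bits): int
{
    return $value >= (1 << ($bits - 1)) ? $value - (1 << $bits) : $value;
}

$bin = hex2bin('00151b000000000190b2f20304455a000003900000004b0000');
$u = unpack('nfirst/Cbyte/N5int/nlast', $bin); // yields int1..int5 for N5int

// Same field order as ">hBiiiiih"; B is unsigned, so it needs no fix-up.
$data = [toSigned($u['first'], 16), $u['byte']];
for ($i = 1; $i <= 5; $i++) {
    $data[] = toSigned($u["int$i"], 32);
}
$data[] = toSigned($u['last'], 16);

print_r($data); // 21, 27, 0, 26260210, 50611546, 912, 75, 0
```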

Split an array into two evenly by its values

array
1703 => float 15916.19738
5129 => float 11799.15419
33 => float 11173.49945
1914 => float 8439.45987
2291 => float 6284.22271
5134 => float 5963.14065
5509 => float 5169.85755
4355 => float 5153.80867
2078 => float 3932.79341
31 => float 3924.09928
5433 => float 2718.7711
3172 => float 2146.1932
1896 => float 2141.36021
759 => float 1453.5501
2045 => float 1320.74681
5873 => float 1222.7448
2044 => float 1194.4903
6479 => float 1074.1714
5299 => float 950.872
3315 => float 878.06602
6193 => float 847.3372
1874 => float 813.816
1482 => float 330.6422
6395 => float 312.1545
6265 => float 165.9224
6311 => float 122.8785
6288 => float 26.5426
I would like to distribute this array into two arrays both ending up with a grand total (from the float values) to be about the same. I tried K-Clustering but that distributes higher values onto one array and lower values onto the other array. I'm pretty much trying to create a baseball team with even player skills.
Step 1: Split the players into two teams. It doesn't really matter how you do this, but you could do every other one.
Step 2: Randomly switch two players only if it makes the teams more even.
Step 3: Repeat step 2 until it converges to equality.
$diff = array_sum($teams[0]) - array_sum($teams[1]);
for ($i = 0; $i < 1000 && $diff != 0; ++$i)
{
$r1 = rand(0, 8); // assumes nine players on each team
$r2 = rand(0, 8);
$new_diff = $diff - ($teams[0][$r1] - $teams[1][$r2]) * 2;
if (abs($new_diff) < abs($diff))
{
// if the switch makes the teams more equal, then swap
$tmp = $teams[0][$r1];
$teams[0][$r1] = $teams[1][$r2];
$teams[1][$r2] = $tmp;
var_dump(abs($new_diff));
$diff = $new_diff;
}
}
You'll have to adapt that code to your own structures, but it should be simple.
Here's a sample output:
int(20)
int(4)
int(0)
I was using integers from 0 to 100 to rate each player. Notice how it gradually converges to equality, although an end result of 0 is not guaranteed.
You can stop the process after a fixed number of iterations or once it reaches some threshold.
There are more scientific methods you could use, but this works well.
This is extremely simplistic, but have you considered just doing it like a draft? With the array sorted as in your example, Team A gets array[0], Team B gets array[1] and array[2], the next two picks go to Team A, and so on.
For the example you give, I got one team with ~50,000 and the other with ~45,000.
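The draft order described above (A, B, B, A, A, B, B, ...) can be sketched as follows. The function name `snakeDraft` and the sample ratings are made up for illustration; keys are preserved so each value can still be traced back to its player id.

```php
<?php
// Hypothetical helper: split values into two teams using a snake draft.
function snakeDraft(array $values): array
{
    arsort($values);          // highest value first, keys preserved
    $teams = [[], []];
    $pick = 0;

    foreach ($values as $id => $value) {
        // Picks 0, 3, 4, 7, 8, ... go to team 0; picks 1, 2, 5, 6, ... to team 1.
        $team = intdiv($pick + 1, 2) % 2;
        $teams[$team][$id] = $value;
        $pick++;
    }

    return $teams;
}

// Made-up ratings keyed by player id.
$teams = snakeDraft([101 => 10.0, 102 => 9.0, 103 => 8.0, 104 => 7.0]);
printf("Team A: %.1f, Team B: %.1f\n", array_sum($teams[0]), array_sum($teams[1]));
```

The result can also serve as the starting split for the random-swap loop from the other answer.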
