PHP array_column with array_filter

PHP array_column with array_filter - php

I am doing this to echo the minimum value in an array...
$array = [
[
'a' => 0,
'f' => 0,
'f' => 0,
'l' => 61.60
],
[
'a' => 38,
'f' => 0,
'f' => 0,
'l' => 11.99
],
[
'a' => 28,
'f' => 0,
'f' => 0,
'l' => 3.40
]
];
$min = min(array_column($array, 'a'));
echo $min;
Now I want to exclude 0 from the results, I know I can use array_filter to achieve this but do i need to process the array twice?

Yes, this will do:
$min = min(array_filter(array_column($array, 'a')));
It will iterate the array three times, once for each function.
You can use array_reduce to do it in one iteration:
$min = array_reduce($array, function ($min, $val) {
return $min === null || ($val['a'] && $val['a'] < $min) ? $val['a'] : $min;
});
Whether that's faster or not must be benchmarked, a PHP callback function may after all be slower than three functions in C.
A somewhat more efficient solution without the overhead of a function call would be a good ol' loop:
$min = null;
foreach ($array as $val) {
if ($min === null || ($val['a'] && $val['a'] < $min)) {
$min = $val['a'];
}
}
In the end you need to benchmark and decide on the correct tradeoff of performance vs. readability. In practice, unless you have positively humongous datasets, the first one-liner will probably do just fine.

A solution using array_reduce() to walk the array only once.
$min = array_reduce(
$array,
function($acc, array $item) {
return min($acc, $item['a'] ?: INF);
},
INF
);
How it works:
It starts with +INF as the partial minimum value. All the values it encounters in array are, theoretically smaller than that.
The callback function ignores the items having 0 (or another value that is equal to FALSE when evaluated as boolean). The expression $item['a'] ?: INF uses INF (infinity) instead of $item['a'] to avoid altering the partial result (to ignore 0 values).
It returns the minimum between the current partial minimum (passed by array_reduce() in parameter $acc) and the value of the current item, as explained above.
The value in $min is the minimum of the not FALSE-ey values in column a of the items in $array. If all these values are 0 (FALSE), the value returned in $min is INF.

This is not an answer but the format of its content cannot be provided by a comment. It also cannot stay in my answer as it is technically not part of it.
I generated a benchmark for the three solutions provided by #deceze and my solution and ran it using PHP 7.0. Everything below applies only to PHP 7.x.
PHP 5 runs much slower and it requires more memory.
I started by running the code 1,000,000 times over a small list of 100 items then I iteratively divided the number of iteration by 10 while multiplied the list length by 10.
Here are the results:
$ php bench.php 100 1000000
Generating 100 elements... Done. Time: 0.000112 seconds.
array_filter(): 3.265538 seconds/1000000 iterations. 0.000003 seconds/iteration.
foreach : 3.771463 seconds/1000000 iterations. 0.000004 seconds/iteration.
reduce #deceze: 6.869162 seconds/1000000 iterations. 0.000007 seconds/iteration.
reduce #axiac : 8.599051 seconds/1000000 iterations. 0.000009 seconds/iteration.
$ php bench.php 1000 100000
Generating 1000 elements... Done. Time: 0.000750 seconds.
array_filter(): 3.024423 seconds/100000 iterations. 0.000030 seconds/iteration.
foreach : 3.997505 seconds/100000 iterations. 0.000040 seconds/iteration.
reduce #deceze: 6.669426 seconds/100000 iterations. 0.000067 seconds/iteration.
reduce #axiac : 8.342756 seconds/100000 iterations. 0.000083 seconds/iteration.
$ php bench.php 10000 10000
Generating 10000 elements... Done. Time: 0.002643 seconds.
array_filter(): 2.913948 seconds/10000 iterations. 0.000291 seconds/iteration.
foreach : 4.190049 seconds/10000 iterations. 0.000419 seconds/iteration.
reduce #deceze: 9.649768 seconds/10000 iterations. 0.000965 seconds/iteration.
reduce #axiac : 11.236113 seconds/10000 iterations. 0.001124 seconds/iteration.
$ php bench.php 100000 1000
Generating 100000 elements... Done. Time: 0.042237 seconds.
array_filter(): 90.369577 seconds/1000 iterations. 0.090370 seconds/iteration.
foreach : 15.487466 seconds/1000 iterations. 0.015487 seconds/iteration.
reduce #deceze: 19.896064 seconds/1000 iterations. 0.019896 seconds/iteration.
reduce #axiac : 15.056250 seconds/1000 iterations. 0.015056 seconds/iteration.
For lists up to about 10,000 elements, the results are consistent and they match the expectations: array_filter() is the fastest, foreach comes close then the array_reduce() solutions aligned by the number of functions they call (#deceze's is faster as it doesn't call any function, mine's calls min() once). Even the total running time feels consistent.
The value of 90 seconds for the array_filter() solution for 100,000 items in the list looks out of place but it has a simple explanation: both array_filter() and array_column() generate new arrays. They allocate memory and copy the data and this takes time. Add the time needed by the garbage collector to free all the small memory blocks used by a list of 10,000 small arrays and the running time will go up faster.
Another interesting result for 100,000 items array is that my solution using array_reduce() is as fast as the foreach solution and better than #deceze's solution using array_reduce(). I don't have an explanation for this result.
I tried to find out some thresholds when these things start to happen. For this I ran the benchmark with different list sizes, starting from 5,000 and increasing the size by 1,000 while keeping the total number of visited items to 100,000,000. The results can be found here.
The results are surprising. For some sizes of the list (8,000, 11,000, 12,000, 13,000, 17,000 items), the array_filter() solution needs about 10 times more time to complete than any solution that uses array_reduce(). For other list sizes, however, it goes back to the track and completes the 100 million node visits in about 3 seconds while the time needed by the other solutions constantly increases as the list length increases.
I suspect the culprit for the hops in the time needed by the array_filter() solution is the PHP's memory allocation strategy. For some lengths of the initial array, the temporary arrays returned by array_column() and array_filter() probably trigger more memory allocation and garbage cleanup cycles than for other sizes. Of course, it is possible that the same behaviour happens on other sizes I didn't test.
Somewhere around 16,000...17,000 items in the list, my solution starts running faster than #deceze's solution using array_reduce() and around 25.000 it starts performing equally fast as the foreach solution (and even faster sometimes).
Also for lists longer than 16,000-17,000 items the array_filter() solution consistently needs more time to complete than the others.
The benchmark code can be found here. Unfortunately it cannot be executed on 3v4l.org for lists larger than 15,000 elements because it reaches the memory limit imposed by the system.
Its results for lists larger than 5,000 items can be found here.
The code was executed using PHP 7.0.20 CLI on Linux Mint 18.1. No APC or other kind of cache was involved.
Conclusion
For small lists, up to 5,000 items, use the array_filter(array_column()) solution as it performs well for this size of the list and it looks neat.
For lists larger than 5,000 items switch to the foreach solution. It doesn't look well but it runs fast and it doesn't need extra memory. Stick to it as the list size increases.
For hackatons, interviews and to look smart to your colleagues, use any array_reduce() solution. It shows your knowledge about PHP array functions and your understanding of the "callback" programming concept.

Try with array_flip and unset() php function
like this
$array = [
[
'a' => 0,
'f' => 0,
'f' => 0,
'l' => 61.60
],
[
'a' => 38,
'f' => 0,
'f' => 0,
'l' => 11.99
],
[
'a' => 28,
'f' => 0,
'f' => 0,
'l' => 3.40
]
];
$min = array_flip(array_column($array, 'a'));
unset($min[0]);
$min=min(array_flip($min));
o/p
28

You can use sort:
$array = [
[
'a' => 0,
'f' => 0,
'f' => 0,
'l' => 61.60
],
[
'a' => 38,
'f' => 0,
'f' => 0,
'l' => 11.99
],
[
'a' => 28,
'f' => 0,
'f' => 0,
'l' => 3.40
]
];
$array = array_column($array, 'a');
sort($array);
echo $array[1];

Related

Permutations and big arrays in PHP - performance issues

I have an array of numbers (int or float) and I need to find a value by combining array values. Once the smallest possible combination is found the function returns the array values. Therefore I start with sample-size=1 and keep incrementing it.
Here's a simplified example of the given data:
$values = [10, 20, 30, 40, 50];
$lookingFor = 80;
Valid outcomes:
[30, 50] // return this
[10, 20, 50], [10, 30, 40] // just to demonstrate the possible combinations
Permutations solve this problem and I've tried many different implementations (for example: Permutations - all possible sets of numbers, Get all permutations of a PHP array?, https://github.com/drupol/phpermutations). My favourite is this one with a parameter for permutation-size using the Generator pattern: https://stackoverflow.com/a/43307800
What's my problem? Performance! My arrays have 5 - 150 numbers and sometimes the sum of 30 array numbers is needed to find the searched value. Sometimes the value can't be found, which means I needed to try all possible combinations. Basically with permutation-size > 5 the task becomes too time consuming.
An alternative, yet not precise way is to sort the array, take the first X and last X numbers and compare with the searched value. Like this:
sort($values, SORT_NUMERIC);
$countValues = count($values);
if ($sampleSize > $countValues)
{
$sampleSize = $countValues;
}
$minValues = array_slice($values, 0, $sampleSize);
$maxValues = array_slice($values, $countValues - $sampleSize, $sampleSize);
$possibleMin = array_sum($minValues);
$possibleMax = array_sum($maxValues);
if ($possibleMin === $lookingFor)
{
return $minValues;
}
if ($possibleMax === $lookingFor)
{
return $maxValues;
}
return [];
Hopefully somebody has dealt with a similar problem and can guide me in the right direction. Thank you!

you must use combination instead of permutations {ex: P(15) = 130767436800 vs C(15) = 32768}
if array_sum < target_number then no solution exists
if in_array(target_number, numbers) solution found with 1 element
sort lowest to highest
start with C(n,2) where 2 represents 1st 2nd then 1st 3rd etc (static one is 1st element)
if above loop found no solution continue with 2nd 3rd then 2nd 4th, etc)
if C(n,2) had no solution then jump to C(n,3)s but this time 2 static numbers and 1 dynamic one
if loop ended with no solution then there exists no solution
lastly, I would adjust this question and ask in statistics branch of stack exchange (crossvalidated) since mean, median and cumulative distribution of the sums of the numbers may hint to decrease the number of iterations significantly and this is their profession.

Fastest way to convert an int into a binary code string in PHP

I need the binary code representation of an integer (unsigned byte). I found this solution, where in my case $n = 8:
function _decBinDig($x, $n)
{
return substr(decbin(pow(2, $n) + $x), 1);
}
which surprisingly takes about 74 ms, while my first try - which I thought was too slow:
function getBinary(int $x)
{
return str_pad(base_convert($x, 10, 2), 8, '0', STR_PAD_LEFT);
}
only takes about 38 ms
Is there a faster solution?

Benchmarked the following five functions:
// For a baseline, returns unpadded binary
function decBinPlain(int $x) {
return decbin($x);
}
// Alas fancier than necessary:
function decBinDig(int $x) {
return substr(decbin(pow(2, 8) + $x), 1);
}
// OP's initial test
function getBinary(int $x) {
return str_pad(base_convert($x, 10, 2), 8, '0', STR_PAD_LEFT);
}
// OP's function using decbin()
function getDecBin(int $x) {
return str_pad(decbin($x), 8, '0', STR_PAD_LEFT);
}
// TimBrownlaw's method
function intToBin(int $x) {
return sprintf( "%08d", decbin($x));
}
At 500,000 iterations each, run as 10 x (50,000 # 5), here are the stats:
[average] => [
[decBinPlain] => 0.0912
[getDecBin] => 0.1355
[getBinary] => 0.1444
[intToBin] => 0.1493
[decBinDig] => 0.1687
]
[relative] => [
[decBinPlain] => 100
[getDecBin] => 148.57
[getBinary] => 158.33
[intToBin] => 163.71
[decBinDig] => 184.98
]
[ops_per_sec] => [
[decBinPlain] => 548355
[getDecBin] => 369077
[getBinary] => 346330
[intToBin] => 334963
[decBinDig] => 296443
]
The positions are consistent. OP's function, changed to use decbin in place of base_convert, is the fastest function that returns the complete result, by a very thin margin. I'd opt for decbin simply because the meaning is crystal clear. For adding in the left-padding, str_pad is less complex than sprintf. Running PHP 7.4.4 on W10 & i5-8250U, total runtime 7.11 sec.
For a baseline, calling an empty dummy function averages 0.0542 sec. Then: If you need to run this enough times to worry about minute per-op performance gains, it's more economical to have the code inline to avoid the function call. Here, the overhead from the function call is greater than the difference between the slowest and the fastest options above!
For future reference. If you're bench-marking several options, I'd recommend testing them over a single script call and over several consecutive loops of each function. That'll help even out any "lag noise" from background programs, CPU throttling (power to max if on battery!!) etc. Then, call it a couple of times and see that the numbers are stable. You'll want to do much more than 1000 iterations to get reliable numbers. Try e.g. 10K upwards for more complex functions, and 100K upwards for simpler functions. Burn it enough if you want to prove it!

There is a "nicer" method you can try out.
function intToBin(int $x)
{
return sprintf( "%08d", decbin($x));
}
or just call the sprintf inline.

Why does rand seem more random than mt_rand when only doing (1, 2)?

I have some elements that I'm trying to randomize at 50% chance of output. Wrote a quick if statement like this.
$rand = mt_rand(1, 2);
if ( $rand == 1 ) {
echo "hello";
} else {
echo "goodbye";
}
In notice that when using mt_rand, "goodbye" is output many times in a row, whereas, if I just use "rand," it's a more equal distribution.
Is there something about mt_rand that makes it worse at handling a simple 1-2 randomization like this? Or is my dataset so small that these results are just anecdotal?

To get the same value "many times in a row" is a possible outcome of a randomly generated series. It would not be completely random if such a pattern were not allowed to occur. If you would continue taking samples, you would also find that the opposite value will sometimes occur several times in a row, provided you keep going long enough.
One way to test that the generated values are indeed quite random and uniformly distributed, is to count how many times the same value is generated as the one generated before, and how many times the opposite value is generated.
Note that the strings "hello" and "goodbye" don't add much useful information; we can just look at the values 1 and 2.
Here is how you could do such a test:
// $countAfter[$i][$j] will contain the number of occurrences of
// a pair $i, $j in the randomly generated sequence.
// So there is an entry for [1][1], [1][2], [2][1] and [2][2]:
$countAfter = [1 => [1 => 0, 2 => 0],
2 => [1 => 0, 2 => 0]];
$prev = 1; // We assume for simplicity that the "previously" generated value was 1
for ($i = 0; $i < 10000; $i++) { // Produce a large enough sample
$n = mt_rand(1, 2);
$countAfter[$prev][$n]++; // Increase the counter that corresponds to the generated pair
$prev = $n;
}
print_r($countAfter);
You can see in this demo that the 4 numbers that are output do not differ that much. Output is something like:
Array (
[1] => Array (
[1] => 2464
[2] => 2558
)
[2] => Array (
[1] => 2558
[2] => 2420
)
)
This means that 1 and 2 are generated about an equal number of times and that a repetition of a value happens just as often as a toggle in the series.
Obviously these numbers are rarely exactly the same, since that would mean the last couple of generated values would not be random at all, as they would need to bring those counts to the desired value.
The important thing is that your sample needs to be large enough to see the pattern of a uniform distribution confirmed.

Weighted random pick

I have a set of items. I need to randomly pick one. The problem is that they each have a weight of 1-10. A weight of 2 means that the item is twice as likely to be picked than a weight of 1. A weight of 3 is three times as likely.
I currently fill an array with each item. If the weight is 3, I put three copies of the item in the array. Then, I pick a random item.
My method is fast, but uses a lot of memory. I am trying to think of a faster method, but nothing comes to mind. Anyone have a trick for this problem?
EDIT: My Code...
Apparently, I wasn't clear. I do not want to use (or improve) my code. This is what I did.
//Given an array $a where $a[0] is an item name and $a[1] is the weight from 1 to 100.
$b = array();
foreach($a as $t)
$b = array_merge($b, array_fill(0,$t[1],$t));
$item = $b[array_rand($b)];
This required me to check every item in $a and uses max_weight/2*size of $a memory for the array. I wanted a COMPLETELY DIFFERENT algorithm.
Further, I asked this question in the middle of the night using a phone. Typing code on a phone is nearly impossible because those silly virtual keyboards simply suck. It auto-corrects everything, ruining any code I type.
An yet further, I woke up this morning with an entirely new algorithm that uses virtual no extra memory at all and does not require checking every item in the array. I posted it as an answer below.

This ones your huckleberry.
$arr = array(
array("val" => "one", "weight" => 1),
array("val" => "two", "weight" => 2),
array("val" => "three", "weight" => 3),
array("val" => "four", "weight" => 4)
);
$weight_sum = 0;
foreach($arr as $val)
{
$weight_sum += $val['weight'];
}
$r = rand(1, $weight_sum);
print "random value is $r\n";
for($i = 0; $i < count($arr); $i++)
{
if($r <= $arr[$i]['weight'])
{
print "$r <= {$arr[$i]['weight']}, this is our match\n";
print $arr[$i]['val'] . "\n";
break;
}
else
{
print "$r > {$arr[$i]['weight']}, subtracting weight\n";
$r -= $arr[$i]['weight'];
print "new \$r is $r\n";
}
}
No need to generate arrays containing an item for every weight, no need to fill an array with n elements for a weight of n. Just generate a random number between 1 and total weight, then loop through the array until you find a weight less than your random number. If it isn't less than the number, subtract that weight from the random and continue.
Sample output:
# php wr.php
random value is 8
8 > 1, subtracting weight
new $r is 7
7 > 2, subtracting weight
new $r is 5
5 > 3, subtracting weight
new $r is 2
2 <= 4, this is our match
four
This should also support fractional weights.
modified version to use array keyed by weight, rather than by item
$arr2 = array(
);
for($i = 0; $i <= 500000; $i++)
{
$weight = rand(1, 10);
$num = rand(1, 1000);
$arr2[$weight][] = $num;
}
$start = microtime(true);
$weight_sum = 0;
foreach($arr2 as $weight => $vals) {
$weight_sum += $weight * count($vals);
}
print "weighted sum is $weight_sum\n";
$r = rand(1, $weight_sum);
print "random value is $r\n";
$found = false;
$elem = null;
foreach($arr2 as $weight => $vals)
{
if($found) break;
for($j = 0; $j < count($vals); $j ++)
{
if($r < $weight)
{
$elem = $vals[$j];
$found = true;
break;
}
else
{
$r -= $weight;
}
}
}
$end = microtime(true);
print "random element is: $elem\n";
print "total time is " . ($end - $start) . "\n";
With sample output:
# php wr2.php
weighted sum is 2751550
random value is 345713
random element is: 681
total time is 0.017189025878906
measurement is hardly scientific - and fluctuates depending on where in the array the element falls (obviously) but it seems fast enough for huge datasets.

This way requires two random calculations but they should be faster and require about 1/4 of the memory but with some reduced accuracy if weights have disproportionate counts. (See Update for increased accuracy at the cost of some memory and processing)
Store a multidimensional array where each item is stored in the an array based on its weight:
$array[$weight][] = $item;
// example: Item with a weight of 5 would be $array[5][] = 'Item'
Generate a new array with the weights (1-10) appearing n times for n weight:
foreach($array as $n=>$null) {
for ($i=1;$i<=$n;$i++) {
$weights[] = $n;
}
}
The above array would be something like: [ 1, 2, 2, 3, 3, 3, 4, 4, 4, 4 ... ]
First calculation: Get a random weight from the weighted array we just created
$weight = $weights[mt_rand(0, count($weights)-1)];
Second calculation: Get a random key from that weight array
$value = $array[$weight][mt_rand(0, count($array[$weight])-1)];
Why this works: You solve the weighted issue by using the weighted array of integers we created. Then you select randomly from that weighted group.
Update: Because of the possibility of disproportionate counts of items per weight, you could add another loop and array for the counts to increase accuracy.
foreach($array as $n=>$null) {
$counts[$n] = count($array[$n]);
}
foreach($array as $n=>$null) {
// Calculate proportionate weight (number of items in this weight opposed to minimum counted weight)
$proportion = $n * ($counts[$n] / min($counts));
for ($i=1; $i<=$proportion; $i++) {
$weights[] = $n;
}
}
What this does is if you have 2000 10's and 100 1's, it'll add 200 10's (20 * 10, 20 because it has 20x the count, and 10 because it is weighted 10) instead of 10 10's to make it proportionate to how many are in there opposed the minimum weight count. So to be accurate, instead of adding one for EVERY possible key, you are just being proportionate based on the MINIMUM count of weights.

I greatly appreciate the answers above. Please consider this answer, which does not require checking every item in the original array.
// Given $a as an array of items
// where $a[0] is the item name and $a[1] is the item weight.
// It is known that weights are integers from 1 to 100.
for($i=0; $i<sizeof($a); $i++) // Safeguard described below
{
$item = $a[array_rand($a)];
if(rand(1,100)<=$item[1]) break;
}
This algorithm only requires storage for two variables ($i and $item) as $a was already created before the algorithm kicked in. It does not require a massive array of duplicate items or an array of intervals.
In a best-case scenario, this algorithm will touch one item in the original array and be done. In a worst-case scenario, it will touch n items in an array of n items (not necessarily every item in the array as some may be touched more than once).
If there was no safeguard, this could run forever. The safeguard is there to stop the algorithm if it simply never picks an item. When the safeguard is triggered, the last item touched is the one selected. However, in millions of tests using random data sets of 100,000 items with random weights of 1 to 10 (changing rand(1,100) to rand(1,10) in my code), the safeguard was never hit.
I made histograms comparing the frequency of items selected among my original algorithm, the ones from answers above, and the one in this answer. The differences in frequencies are trivial - easy to attribute to variances in the random numbers.
EDIT... It is apparent to me that my algorithm may be combined with the algorithm pala_ posted, removing the need for a safeguard.
In pala_'s algorithm, a list is required, which I call an interval list. To simplify, you begin with a random_weight that is rather high. You step down the list of items and subtract the weight of each one until your random_weight falls to zero (or less). Then, the item you ended on is your item to return. There are variations on this interval algorithm that I've tested and pala_'s is a very good one. But, I wanted to avoid making a list. I wanted to use only the given weighted list and never touch all the items. The following algorithm merges my use of random jumping with pala_'s interval list. Instead of a list, I randomly jump around the list. I am guaranteed to get to zero eventually, so no safeguard is needed.
// Given $a as the weighted array (described above)
$weight = rand(1,100); // The bigger this is, the slower the algorithm runs.
while($weight>0)
{
$item = $a[array_rand($a)];
$weight-= $item[1];
}
// $item is the random item you want.
I wish I could select both pala_ and this answer as the correct answers.

I'm not sure if this is "faster", but I think it may be more "balance"d between memory usage and speed.
The thought is to transform your current implementation (500000 items array) into an equal-length array (100000 items), with the lowest "origin" position as key, and origin index as value:
<?php
$set=[["a",3],["b",5]];
$current_implementation=["a","a","a","b","b","b","b","b"];
// 0=>0 means the lowest "position" 0
// points to 0 in the set;
// 3=>1 means the lowest "position" 3
// points to 1 in the set;
$my_implementation=[0=>0,3=>1];
And then randomly picks a number between 0 and highest "origin" position:
// 3 is the lowest position of the last element ("b")
// and 5 the weight of that last element
$my_implemention_pick=mt_rand(0,3+5-1);
Full code:
<?php
function randomPickByWeight(array $set)
{
$low=0;
$high=0;
$candidates=[];
foreach($set as $key=>$item)
{
$candidates[$high]=$key;
$high+=$item["weight"];
}
$pick=mt_rand($low,$high-1);
while(!array_key_exists($pick,$candidates))
{
$pick--;
}
return $set[$candidates[$pick]];
}
$cache=[];
for($i=0;$i<100000;$i++)
{
$cache[]=["item"=>"item {$i}","weight"=>mt_rand(1,10)];
}
$time=time();
for($i=0;$i<100;$i++)
{
print_r(randomPickByWeight($cache));
}
$time=time()-$time;
var_dump($time);
3v4l.org demo
3v4l.org have some time limitation on codes, so the demo didn't finished. On my laptop the above demo finished in 10 seconds (i7-4700 HQ)

ere is my offer in case I've understand you right. I offer you take a look and if there are some question I'll explain.
Some words in advance:
My sample is with only 3 stages of weight - to be clear
- With outer while I'm simulating your main loop - I count only to 100.
- The array must to be init with one set of initial numbers as shown in my sample.
- In every pass of main loop I get only one random value and I'm keeping the weight at all.
<?php
$array=array(
0=>array('item' => 'A', 'weight' => 1),
1=>array('item' => 'B', 'weight' => 2),
2=>array('item' => 'C', 'weight' => 3),
);
$etalon_weights=array(1,2,3);
$current_weights=array(0,0,0);
$ii=0;
while($ii<100){ // Simulates your main loop
// Randomisation cycle
if($current_weights==$etalon_weights){
$current_weights=array(0,0,0);
}
$ft=true;
while($ft){
$curindex=rand(0,(count($array)-1));
$cur=$array[$curindex];
if($current_weights[$cur['weight']-1]<$etalon_weights[$cur['weight']-1]){
echo $cur['item'];
$array[]=$cur;
$current_weights[$cur['weight']-1]++;
$ft=false;
}
}
$ii++;
}
?>

I'll use this input array for my explanation:
$values_and_weights=array(
"one"=>1,
"two"=>8,
"three"=>10,
"four"=>4,
"five"=>3,
"six"=>10
);
The simple version isn't going to work for you because your array is so large. It requires no array modification but may need to iterate the entire array, and that's a deal breaker.
/*$pick=mt_rand(1,array_sum($values_and_weights));
$x=0;
foreach($values_and_weights as $val=>$wgt){
if(($x+=$wgt)>=$pick){
echo "$val";
break;
}
}*/
For your case, re-structuring the array will offer great benefits.
The cost in memory for generating a new array will be increasingly justified as:
array size increases and
number of selections increases.
The new array requires the replacement of "weight" with a "limit" for each value by adding the previous element's weight to the current element's weight.
Then flip the array so that the limits are the array keys and the values are the array values.
The selection logic is: the selected value will have the lowest limit that is >= $pick.
// Declare new array using array_walk one-liner:
array_walk($values_and_weights,function($v,$k)use(&$limits_and_values,&$x){$limits_and_values[$x+=$v]=$k;});
//Alternative declaration method - 4-liner, foreach() loop:
/*$x=0;
foreach($values_and_weights as $val=>$wgt){
$limits_and_values[$x+=$wgt]=$val;
}*/
var_export($limits_and_values);
$limits_and_values looks like this:
array (
1 => 'one',
9 => 'two',
19 => 'three',
23 => 'four',
26 => 'five',
36 => 'six',
)
Now to generate the random $pick and select the value:
// $x (from walk/loop) is the same as writing: end($limits_and_values); $x=key($limits_and_values);
$pick=mt_rand(1,$x); // pull random integer between 1 and highest limit/key
while(!isset($limits_and_values[$pick])){++$pick;} // smallest possible loop to find key
echo $limits_and_values[$pick]; // this is your random (weighted) value
This approach is brilliant because isset() is very fast and the maximum number of isset() calls in the while loop can only be as many as the largest weight (not to be confused with limit) in the array.
FOR YOUR CASE, THIS APPROACH WILL FIND THE VALUE IN 10 ITERATIONS OR LESS!
Here is my Demo that will accept a weighted array (like $values_and_weights), then in just four lines:
Restructure the array,
Generate a random number,
Find the correct value, and
Display it.

operating with group of indexes of an array without using loop in php

I have an array with data like below:
$data=array(
array(1,1),
array(1,2),
array(1,3),
array(1,4),
array(1,5),
array(1,6),
array(1,7)
);
And I want to apply some operands to group of data, for example
(pseudo-code)
// get all indexes , after the second index of each index
$d = $data[all indexes][1] /10 ; //devided by 10
// multiple the second index of first to fifteenth indexes by 2
$d2=$data[0-15][1] * 2;
I know I can use foreach or any other loops , but i'm looking for a better way .

Well, I don't really get why you don't want to use loops, but if it makes you feel better you can use a "loop in disguise"!
PHP offers some functions to perform a repetitive task in an array, for example:
arrray_map - Keeps the base array intact
array_walk - Changes the base array
Example with array_map
Code
$newData = array_map (function($subArray)
{
return $subArray[1] / 10;
}, $data);
output
array (size=7)
0 => float 0.1
1 => float 0.2
2 => float 0.3
3 => float 0.4
4 => float 0.5
5 => float 0.6
6 => float 0.7
Answer to OP comment:
If it's a performance problem, then array_map is actually slower than foreach or for loops.
I don't think PHP has built in library to deal with matrices operations. However, a quick search in PEAR revealed this extension:
http://pear.php.net/package/Math_Matrix
I've never used so I don't know if it's any good, and the package isn't maintained anymore

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.