PHP usort() order in case of equality - php

In the PHP manual for usort(), it states:
If two members compare as equal, their relative order in the sorted array is undefined.
Also,
A new sort algorithm was introduced. The cmp_function doesn't keep the original order for elements comparing as equal.
Then my question is: what happens if two elements are equal (e.g. the user defined function returns 0)?
I am using this function and apparently equal items are arranged randomly in the sorted array.

Be careful not to confuse "undefined" and "random".
A random implementation would indeed be expected to give a different ordering every time. This would imply there was specific code to shuffle the results when they're found to be equal. This would make the algorithm more complex and slower, and rarely be a desirable outcome.
What undefined means is the opposite: absolutely no care has been taken while designing the algorithm to have predictable or stable order. This means the result may be different every time you run it, if that happens to be the side effect of the algorithm on that data.
You can see the core sort implementation in the PHP source code. It consists of a mixture of "quick sort" (divide and conquer) and insertion sort (a simpler algorithm effective on short lists) with hand-optimised routines for lists of 2, 3, 4, and 5 elements.
So the exact behaviour of equal members will depend on factors like the size of the list, where in the list those equal members come, how many equal members there are in one batch, and so on. In some situations, the algorithm will see that they're identical, and not swap them (the ideal, because swapping takes time); in others, it won't directly compare them until they've already been moved relative to other things, so they'll end up in a different order.
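Whatever algorithm a given PHP version uses, the order of equal elements can be made deterministic by letting the comparison itself break ties. The following is a minimal sketch, assuming PHP 7.4+ (arrow functions); stable_usort() and the sample data are hypothetical names for illustration, not part of PHP:

```php
<?php
// Hypothetical helper (not a PHP built-in): behaves like usort(), but elements
// that compare as equal keep their original relative order.
function stable_usort(array &$items, callable $compare): void
{
    // Pair every element with its original position.
    $decorated = [];
    $position = 0;
    foreach ($items as $item) {
        $decorated[] = [$position++, $item];
    }

    usort($decorated, function (array $a, array $b) use ($compare): int {
        $result = (int) $compare($a[1], $b[1]);
        // On a tie, fall back to the original position, so no two
        // decorated elements ever compare as equal.
        return $result !== 0 ? $result : $a[0] <=> $b[0];
    });

    // Strip the positions again.
    $items = array_map(fn(array $pair) => $pair[1], $decorated);
}

$rows = [
    ['name' => 'x', 'weight' => 2],
    ['name' => 'y', 'weight' => 1],
    ['name' => 'z', 'weight' => 2],
    ['name' => 'q', 'weight' => 1],
];
stable_usort($rows, fn($a, $b) => $a['weight'] <=> $b['weight']);
echo implode(',', array_column($rows, 'name')), "\n"; // y,q,x,z
```

Because the tie-break makes every pair of elements strictly ordered, the result no longer depends on which internal code path the sort takes.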

I found the same thing while sorting my array, and then found a custom-made function, because PHP's built-in sorting functions have some limitations:
http://php.net/manual/en/function.uasort.php#Vd114535
PHP 7 uses a stable sort algorithm for small arrays (< 16),
but for larger arrays the algorithm is still not stable.
Furthermore PHP makes no guarantee whether sorting with *sort() is
stable or not https://bugs.php.net/bug.php?id=53341.

If I have an array of: ['b', 'a', 'c', 'b'] and I were to sort this I would get: ['a','b','b','c']. Since 'b' == 'b' php can not guarantee that one comes before the other so the sort order is 'undefined', however since they are equal, does this matter?
If you are using a sort function that returns 0 for unequal objects you're facing a different problem altogether.

Note that PHP does not care about the order of elements whose compared values are the same.
Example:
$temp = array("b"=>"10","c"=>"10","d"=>"10","e"=>"4");
This array has 4 elements, 3 of which have the same value (b, c, d = 10).
arsort() sorts an associative array in descending order, according to the value. It sorts in place and returns a boolean, so print the array itself afterwards.
After arsort($temp), print_r($temp) outputs: Array ( [b] => 10 [c] => 10 [d] => 10 [e] => 4 )
So here the array is sorted and the equal values keep their original relative order.
But with
$temp = array("a"=>"4","b"=>"10","c"=>"10","d"=>"10","e"=>"4");
the equal values (b, c, d = 10) are surrounded on the left and right by elements with a lower value.
The output after arsort() is:
Array ( [c] => 10 [b] => 10 [d] => 10 [a] => 4 [e] => 4 )
The middle element, [c], now comes first among the three equal values.
So in this test, when a run of equal values was bounded on both sides by lower values (or the first value was lower), the middle one of the three equal values came out first.

I was facing the same problem: two rows had the same value, and applying a sort function changed their order, which I didn't want. I want to sort keys based on their values and, if the values are equal, leave their order unchanged. Here is my solution:
// sample array
$arr = Array("a" => 0.57,"b" => 1.19,"c" => 0.57,"d" => 0.57,"e" => 0.57,"f" => 0.57,"g" => 0.57,"h" => 0.57,"i" => 0.99,"j" => 1.19,"k" => 1.19);
$multi_arr = [];
foreach ($arr as $k=>$val){
    $multi_arr["$val"][] = array($k=>$val);
}
uksort($multi_arr, function ($a, $b) {
    return $b > $a ? 1 : -1;
});
$s_arr = [];
foreach ($multi_arr as $k=>$val){
    foreach($val as $p_id){
        $p_arr = array_keys($p_id);
        $s_arr[] = $p_arr[0];
    }
}
print_r($s_arr);
output-
Array([0] => b,[1] => j,[2] => k,[3] => i,[4] => a,[5] => c,[6] => d,[7] => e,[8] => f,[9] => g,[10] => h)

This function preserves order for equal values (example from my CMS EFFCORE):
function array_sort_by_number(&$array, $key = 'weight', $order = 'a') {
$increments = [];
foreach ($array as &$c_item) {
$c_value = is_object($c_item) ? $c_item->{$key} : $c_item[$key];
if ($order === 'a') $increments[$c_value] = array_key_exists($c_value, $increments) ? $increments[$c_value] - .0001 : 0;
if ($order === 'd') $increments[$c_value] = array_key_exists($c_value, $increments) ? $increments[$c_value] + .0001 : 0;
if (is_object($c_item)) $c_item->_synthetic_weight = $c_value + $increments[$c_value];
else $c_item['_synthetic_weight'] = $c_value + $increments[$c_value];
}
uasort($array, function ($a, $b) use ($order) {
if ($order === 'a') return (is_object($b) ? $b->_synthetic_weight : $b['_synthetic_weight']) <=> (is_object($a) ? $a->_synthetic_weight : $a['_synthetic_weight']);
if ($order === 'd') return (is_object($a) ? $a->_synthetic_weight : $a['_synthetic_weight']) <=> (is_object($b) ? $b->_synthetic_weight : $b['_synthetic_weight']);
});
foreach ($array as &$c_item) {
if (is_object($c_item)) unset($c_item->_synthetic_weight);
else unset($c_item['_synthetic_weight']);
}
return $array;
}
It works with both arrays and objects.
It adds a synthetic key, sorts by it, and then removes it.
Test
$test = [
'a' => ['weight' => 4],
'b' => ['weight' => 10],
'c' => ['weight' => 10],
'd' => ['weight' => 10],
'e' => ['weight' => 4],
];
$test_result = [
'b' => ['weight' => 10],
'c' => ['weight' => 10],
'd' => ['weight' => 10],
'a' => ['weight' => 4],
'e' => ['weight' => 4],
];
array_sort_by_number($test, 'weight', 'a');
print_R($test);
var_dump($test === $test_result);
$test = [
'a' => ['weight' => 4],
'b' => ['weight' => 10],
'c' => ['weight' => 10],
'd' => ['weight' => 10],
'e' => ['weight' => 4],
];
$test_result = [
'a' => ['weight' => 4],
'e' => ['weight' => 4],
'b' => ['weight' => 10],
'c' => ['weight' => 10],
'd' => ['weight' => 10],
];
array_sort_by_number($test, 'weight', 'd');
print_R($test);
var_dump($test === $test_result);

Related

How to sort an associative array by key type and force specific order in PHP? [duplicate]

I have an associative array filled with more or less randomly mixed numeric and string keys, let's go with such example:
$arr = array('third'=>321, 4=>1, 65=>6, 'first'=>63, 5=>88, 'second'=>0);
Now I'd like to sort it in such way:
Sort the array so the numeric keys would be first, and string keys after.
The numeric keys should be in a specific order: 5, 4, 65
The string keys should be ordered by specific order as well: 'first', 'second', 'third'
Output in this case should look like: array(5=>88, 4=>1, 65=>6, 'first'=>63, 'second'=>0, 'third'=>321)
Each element might not be present at all in the original $arr, which might be additional problem...
My guess would be to split the array into one with string keys and one with numeric keys, then sort each one and glue them together... But I don't know how to do it.
Edit: Turned out to be a very poor idea, much better approach is in the answer below.
The spaceship operator inside of a uksort() call is the only way that I would do this one.
Set up two arrays containing your prioritized numeric and word values, then flip them so that they can be used as lookups whereby the respective values dictate the sorting order.
By writing two arrays of sorting criteria separated by <=> the rules will be respected from left to right.
The is_int() check is somewhat counterintuitive. Because we want true outcomes to come before false outcomes, I could have swapped $a and $b in the first element of the criteria arrays, but I decided to keep all of the variables in the same criteria array and just negate the boolean outcome. When sorting ASC, false comes before true because it is like 0 versus 1.
Code: (Demo)
$numericPriorities = array_flip([5, 4, 65]);
$numericOutlier = count($numericPriorities);
$wordPriorities = array_flip(['first', 'second', 'third']);
$wordOutlier = count($wordPriorities);
$arr = ['third' => 321, 4 => 1, 'zero' => 'last of words', 7 => 'last of nums', 65 => 6, 'first' => 63, 5 => 88, 'second' => 0];
uksort(
$arr,
function($a, $b) use ($numericPriorities, $numericOutlier, $wordPriorities, $wordOutlier) {
return [!is_int($a), $numericPriorities[$a] ?? $numericOutlier, $wordPriorities[$a] ?? $wordOutlier]
<=>
[!is_int($b), $numericPriorities[$b] ?? $numericOutlier, $wordPriorities[$b] ?? $wordOutlier];
}
);
var_export($arr);
or (demo)
uksort(
$arr,
function($a, $b) use ($numericPriorities, $numericOutlier, $wordPriorities, $wordOutlier) {
return !is_int($a) <=> !is_int($b)
?: (($numericPriorities[$a] ?? $numericOutlier) <=> ($numericPriorities[$b] ?? $numericOutlier))
?: (($wordPriorities[$a] ?? $wordOutlier) <=> ($wordPriorities[$b] ?? $wordOutlier));
}
);
Output:
array (
5 => 88,
4 => 1,
65 => 6,
7 => 'last of nums',
'first' => 63,
'second' => 0,
'third' => 321,
'zero' => 'last of words',
)
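The left-to-right rule mentioned above comes from how PHP compares arrays of equal length: element by element, stopping at the first difference. A small sketch with made-up values (the variable names are only for illustration) shows this, along with false sorting before true:

```php
<?php
// Arrays of equal length are compared element by element, left to right.
$secondRuleDecides = [0, 2] <=> [0, 5]; // first elements tie, so 2 < 5 decides
$firstRuleDecides  = [1, 0] <=> [0, 9]; // 1 > 0 decides; the second elements are never looked at
$boolOrder         = false <=> true;    // false sorts before true, like 0 versus 1

var_dump($secondRuleDecides, $firstRuleDecides, $boolOrder); // int(-1), int(1), int(-1)
```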
I was extremely overthinking the case. I came up with a solution that I was unable to implement, and unnecessarily described it and asked for help on how to do it.
It turned out that what I really wanted was extremely simple:
$arr = array('third'=>321, 4=>1, 65=>6, 'first'=>63, 5=>88, 'second'=>0);
$proper_order = array(5, 4, 65, 'first', 'second', 'third');
$newtable = array();
foreach ($proper_order as $order)
{
if (isset($arr[$order]))
$newtable[$order] = $arr[$order];
}
unset($order, $arr, $proper_order);
print_r($newtable);
The result is as expected, and the code should resist the case when the original $arr is incomplete:
Array
(
[5] => 88
[4] => 1
[65] => 6
[first] => 63
[second] => 0
[third] => 321
)

How come usort (php) works even when not returning integers?

Yesterday at work I stumbled upon a piece of code that was roughly like this:
uasort($array, function($a, $b) {
return isset($a['sort_code']) && isset($b['sort_code']) && $a['sort_code'] > $b['sort_code'];
});
this flipped a couple of switches in my head seeing as this can only return a boolean true or false instead of the 0 1 or -1 that is clearly stated in php.net. Best case scenario PHP will interpret false as 0 and true as 1. So one (the colleague that wrote this) could argue that it actually works.
I made the argument that it cannot possibly work because it will never ever ever return a -1. There is a state missing in this return-either-true-or-false from the sort callback and the way I think this is you will get either a wrong result or the process is not as performant.
I actually got down to writing a couple of tests and though I am no expert in sorting I actually got to the point where you could actually do away with the -1 state and still get a correct sorting with the same amount of calculations
effectively you could replace:
if ($a === $b) {
return 0;
}
return $a < $b ? -1 : 1;
with
return $a > $b;
Now, as I said, I am by no means an expert in sorting and my tests were less than thorough. However my results with small and bigger arrays showed that this is true. The array always got sorted correctly with the amount of calculations being the same using either method
So it really doesn't seem to matter much that there is no third state (-1).
I still stand by my opinion that you should follow the docs, especially PHP's, to the letter. When PHP says I need to return 0, a negative number or a positive number, then that is exactly what one should do. You shouldn't leave stuff like that to chance and watch how things work by accident.
However I am intrigued by all this and would like some insight from somebody who knows more
thoughts? anyone?
You do not actually have to return all of -1, 0 and 1 for the function to work. Returning 0 would not switch the elements' places. You can also return something like -1345, which acts as -1.
Returning only 0 or 1 means you want just to "sink" some elements towards end of array, like have all NULL values at end or start of array and other sorting is not important:
$input = ['c', 'a', null, 'b', null];
usort($input, function ($a, $b) {
return !is_null($b);
});
// 'b', 'a', 'c', null, null
var_dump($input);
In your case, the intent appears to be to move elements without sort_code to the end and to put the remaining elements in sort_code order.
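For comparison, here is a sketch of the same "nulls last" idea written as a full three-way comparison, so that the non-null values are ordered as well. It reuses the $input array from above; comparing two-element arrays is just one possible way to express "null flag first, then value":

```php
<?php
$input = ['c', 'a', null, 'b', null];

usort($input, function ($a, $b) {
    // First criterion: non-null (false) before null (true).
    // Second criterion: the values themselves, ascending.
    return [is_null($a), $a] <=> [is_null($b), $b];
});

var_export($input); // 'a', 'b', 'c', NULL, NULL
```

Since this callback returns a negative, zero or positive integer consistently for every pair, the result does not depend on the order in which the sort happens to compare elements.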
A comparison sort requires only a total preorder, which is basically a "less than or equal to" relation. In other words, it doesn't matter whether a < b or a = b; you don't need that information to sort, all you need to know is that a <= b. That's why most (if not all) sorting algorithms don't distinguish between values -1 and 0, they only test if the comparison function returns 1 or not 1.
However, to work correctly, the function needs to be transitive (otherwise the relation would not be a total preorder). This is NOT the case of the function given in the original post.
Here is an example:
$array = [
['sort_code' => 2],
[],
['sort_code' => 1]
];
uasort($array, function($a, $b) {
return isset($a['sort_code']) && isset($b['sort_code']) && $a['sort_code'] > $b['sort_code'];
});
var_export($array);
which gives:
array (
0 =>
array (
'sort_code' => 2,
),
1 =>
array (
),
2 =>
array (
'sort_code' => 1,
),
)
sort_code 2 comes before 1, which was not intended. That's because, according to the comparison function:
$array[0] <= $array[1]
$array[1] <= $array[2]
$array[0] > $array[2]
If the relation was transitive, the first two lines would imply:
$array[0] <= $array[2]
which contradicts with what the function says.
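The transitivity argument above can be checked mechanically by brute force. The sketch below is only an illustration: find_intransitive_triple() is a hypothetical helper, and the three-way callback assumes that elements without sort_code should sort last (the original post does not say where they belong).

```php
<?php
// Hypothetical helper: returns the keys of the first triple (a, b, c) for which
// a <= b and b <= c but a > c according to $compare, or null if there is none.
function find_intransitive_triple(array $items, callable $compare): ?array
{
    foreach ($items as $i => $a) {
        foreach ($items as $j => $b) {
            foreach ($items as $k => $c) {
                if ((int) $compare($a, $b) <= 0
                    && (int) $compare($b, $c) <= 0
                    && (int) $compare($a, $c) > 0) {
                    return [$i, $j, $k];
                }
            }
        }
    }
    return null;
}

$array = [['sort_code' => 2], [], ['sort_code' => 1]];

// The boolean callback from the original post.
$boolean = function ($a, $b) {
    return isset($a['sort_code']) && isset($b['sort_code']) && $a['sort_code'] > $b['sort_code'];
};

// Assumption for illustration: a missing sort_code sorts after every real one.
$threeWay = function ($a, $b) {
    return ($a['sort_code'] ?? PHP_INT_MAX) <=> ($b['sort_code'] ?? PHP_INT_MAX);
};

var_export(find_intransitive_triple($array, $boolean));  // the triple of keys 0, 1, 2
echo "\n";
var_export(find_intransitive_triple($array, $threeWay)); // NULL
```

The check is cubic in the number of elements, so it is only suitable for small test samples, not for production data.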
The logic in your uasort() callback caters to making fewer overall comparisons and therefore (assuming the sorting rules deliver the expected result) executes with higher efficiency.
PHP's uasort() uses Quicksort as far as I can tell, but the docs say you should not rely on this fact (perhaps in case it changes in the future). An SO reference.
Here is a demo that shows that the sort call only makes enough comparisons to potentially break all ties. Employing a three-way comparison ends up being more expensive because more comparisons are required but would likely deliver a different (and better) sort due to the sorting instructions.
$array = [
['id' => 1, 'sort_code' => 4],
['id' => 2, 'sort_code' => 2],
['id' => 3],
['id' => 4, 'sort_code' => 8],
['id' => 5],
['id' => 6, 'sort_code' => 6]
];
uasort($array, function($a, $b) {
echo "\n" . json_encode($a) . " -vs- " . json_encode($b) . " eval: ";
echo $eval = (int)(isset($a['sort_code'], $b['sort_code']) && $a['sort_code'] > $b['sort_code']);
return $eval;
});
echo "\n---\n";
var_export($array);
echo "\n======\n";
uasort($array, function($a, $b) {
echo "\n" . json_encode($a) . " -vs- " . json_encode($b) . " eval: ";
echo $eval = ($a['sort_code'] ?? 0) <=> ($b['sort_code'] ?? 0);
return $eval;
});
echo "\n---\n";
var_export($array);
Output:
{"id":1,"sort_code":4} -vs- {"id":2,"sort_code":2} eval: 1
{"id":1,"sort_code":4} -vs- {"id":3} eval: 0
{"id":3} -vs- {"id":4,"sort_code":8} eval: 0
{"id":4,"sort_code":8} -vs- {"id":5} eval: 0
{"id":5} -vs- {"id":6,"sort_code":6} eval: 0
---
array (
1 =>
array (
'id' => 2,
'sort_code' => 2,
),
0 =>
array (
'id' => 1,
'sort_code' => 4,
),
2 =>
array (
'id' => 3,
),
3 =>
array (
'id' => 4,
'sort_code' => 8,
),
4 =>
array (
'id' => 5,
),
5 =>
array (
'id' => 6,
'sort_code' => 6,
),
)
======
{"id":2,"sort_code":2} -vs- {"id":1,"sort_code":4} eval: -1
{"id":1,"sort_code":4} -vs- {"id":3} eval: 1
{"id":2,"sort_code":2} -vs- {"id":3} eval: 1
{"id":1,"sort_code":4} -vs- {"id":4,"sort_code":8} eval: -1
{"id":4,"sort_code":8} -vs- {"id":5} eval: 1
{"id":1,"sort_code":4} -vs- {"id":5} eval: 1
{"id":2,"sort_code":2} -vs- {"id":5} eval: 1
{"id":3} -vs- {"id":5} eval: 0
{"id":4,"sort_code":8} -vs- {"id":6,"sort_code":6} eval: 1
{"id":1,"sort_code":4} -vs- {"id":6,"sort_code":6} eval: -1
---
array (
2 =>
array (
'id' => 3,
),
4 =>
array (
'id' => 5,
),
1 =>
array (
'id' => 2,
'sort_code' => 2,
),
0 =>
array (
'id' => 1,
'sort_code' => 4,
),
5 =>
array (
'id' => 6,
'sort_code' => 6,
),
3 =>
array (
'id' => 4,
'sort_code' => 8,
),
)

Count unique values in a column of an array

I have an array like this:
$arr = [
1 => ['A' => '1', 'C' => 'TEMU3076746'],
2 => ['A' => '2', 'C' => 'FCIU5412720'],
3 => ['A' => '3', 'C' => 'TEMU3076746'],
4 => ['A' => '4', 'C' => 'TEMU3076746'],
5 => ['A' => '5', 'C' => 'FCIU5412720']
];
My goal is to count the distinct values in the C column of the 2-dimensional array.
The total rows in the array is found like this: count($arr) (which is 5).
How can I count the number of rows which contain a unique value in the 'C' column?
If I removed the duplicate values in the C column, there would only be: TEMU3076746 and FCIU5412720
My desired output is therefore 2.
Hope this simplest one will be helpful. Here we are using array_column, array_unique and count.
Try this code snippet here
echo count(
array_unique(
array_column($data,"C")));
Result: 2
Combine array_map, array_unique and count:
$array = [ /* your array */ ];
$count = count(
    array_unique(
        array_map(function($element) {
            return $element['C'];
        }, $array)));
Or use array_column as suggested by sahil gulati; array_map can do more, which probably isn't needed here.
I had a very similar need and I used a slightly different method.
I have several events where teams are participating and I need to know how many teams there are in each event. In other words, I don't need to only know how many distinct item "C" there are, but how many items TEMU3076746 and FCIU5412720 there are.
The code will then be as is
$nbCs = array_count_values ( array_column ( $array, 'C' ) );
$nbCs will issue an array of values = Array([TEMU3076746] => 3 [FCIU5412720] => 2)
See example in sandbox Sandbox code
$data=array();
$data=[
    1 => [
        'A' => '1',
        'C' => 'TEMU3076746'
    ],
    2 => [
        'A' => '2',
        'C' => 'FCIU5412720'
    ],
    3 => [
        'A' => '3',
        'C' => 'TEMU3076746'
    ],
    4 => [
        'A' => '4',
        'C' => 'TEMU3076746'
    ],
    5 => [
        'A' => '5',
        'C' => 'FCIU5412720'
    ]
];
$total_value=count(
    array_unique(
        array_column($data,"C")));
echo $total_value;
Most concisely, use array_column()'s special ability to assign new first level keys using the targeted column's values. This provides the desired effect of uniqueness because arrays cannot contain duplicate keys on the same level.
Code: (Demo)
echo count(array_column($arr, 'C', 'C')); // 2
To be perfectly clear, array_column($arr, 'C', 'C') produces:
array (
'TEMU3076746' => 'TEMU3076746',
'FCIU5412720' => 'FCIU5412720',
)
This would also work with array_column($arr, null, 'C'), but that creates a larger temporary array.
p.s. There is a fringe case that may concern researchers who are seeking unique float values. Assigning new associative keys using float values is inappropriate/error-prone because the keys will lose precision (become truncated to integers).
In that fringe case with floats, fall back to the less performant technique: count(array_unique(array_column($arr, 'B'))) Demo
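A short sketch of that float fringe case, using a made-up 'B' column with three rows. The key-based count collapses 1.2 and 1.7 into the same integer key, while array_unique() keeps them apart. Newer PHP versions may also emit a deprecation notice for the float-to-integer key conversion.

```php
<?php
// Made-up sample data: two distinct float values, one of them repeated.
$rows = [
    ['B' => 1.2],
    ['B' => 1.7],
    ['B' => 1.2],
];

// Float keys are truncated to integers, so 1.2 and 1.7 both become key 1.
$viaKeys = count(array_column($rows, 'B', 'B'));

// array_unique() compares the values themselves, so both floats survive.
$viaUnique = count(array_unique(array_column($rows, 'B')));

echo $viaKeys, ' vs ', $viaUnique, "\n"; // 1 vs 2
```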

Sort an associative array by value in descending and preserve order when values are same

I want to sort an associative array and there is an inbuilt function to achieve the same viz. arsort(), but the problem with this function is that it doesn't maintain the original key order when values are same.
e.g.
$l = [
'a' => 1,
'b' => 2,
'c' => 2,
'd' => 4,
'e' => 5,
'f' => 5
];
The result which I want is :
$l = [
'e' => 5,
'f' => 5,
'd' => 4,
'b' => 2,
'c' => 2,
'a' => 1
];
arsort() gives the result in descending order but it randomly arranges the element when values are same.
This question is not a duplicate of PHP array multiple sort - by value then by key?. In that question it is asking for same numeric value to be sorted alphabetically but in my question I am asking values to sorted according to the original order if they are same.
There is probably a more efficient way to do this, but I think this should work to maintain the original key order within groups of the same value. I'll start with this array for example:
$l = [ 'a' => 1, 'b' => 2, 'c' => 2, 'd' => 4, 'g' => 5, 'e' => 5, 'f' => 5 ];
Group the array by value:
foreach ($l as $k => $v) {
$groups[$v][] = $k;
}
Because the foreach loop reads the array sequentially, the keys will be inserted in their respective groups in the correct order, and this will yield:
[1 => ['a'], 2 => ['b', 'c'], 4 => ['d'], 5 => ['g', 'e', 'f'] ];
sort the groups in descending order by key:
krsort($groups);
Reassemble the sorted array from the grouped array with a nested loop:
foreach ($groups as $value => $group) {
foreach ($group as $key) {
$sorted[$key] = $value;
}
}
You can use array_multisort. The function can be a bit confusing, and really hard to explain, but it orders multiple arrays, and the first array provided gets sorted based on the order of subsequent arrays.
Try:
array_multisort($l, SORT_DESC, array_keys($l));
See the example here: https://3v4l.org/oV8Od
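It sorts the array by values in descending order; elements with equal values are then ordered by the second array, the keys, in ascending order. This matches the original order only when the keys of equal values were already in ascending order, as in the question's example.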
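Since array_multisort() breaks ties by key rather than by original position, a key such as 'g' placed before 'e' would be moved behind it. One way to tie-break on the original position instead is sketched below; it is an illustration using the sample array from the grouping answer, and $values and $positions are helper variables introduced here:

```php
<?php
$l = ['a' => 1, 'b' => 2, 'c' => 2, 'd' => 4, 'g' => 5, 'e' => 5, 'f' => 5];

$values = $l;                              // copy used for lookups inside the callback
$positions = array_flip(array_keys($l));   // key => original position (0, 1, 2, ...)

uksort($l, function ($keyA, $keyB) use ($values, $positions) {
    // Value descending (note the swapped A/B), then original position ascending.
    return [$values[$keyB], $positions[$keyA]] <=> [$values[$keyA], $positions[$keyB]];
});

echo implode(',', array_keys($l)), "\n"; // g,e,f,d,b,c,a
```

Unlike the grouping approach, this does not use the values as array keys, so it should also cope with float values.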

array values in multidimensional array

I have two arrays
they look like
$a1 = array(
array('num' => 1, 'name' => 'one'),
array('num' => 2, 'name' => 'two'),
array('num' => 3, 'name' => 'three'),
array('num' => 4, 'name' => 'four'),
array('num' => 5, 'name' => 'five')
);
$a2 = array(3,4,5,6,7,8);
I want to end up with an array that looks like
$a3 = array(3,4,5);
so basically where $a1[$i]['num'] is in $a2
I know I could do
$a3 = array();
foreach($a1 as $num)
    if(array_search($num['num'], $a2) !== false) // index 0 is a valid match, so compare against false
        $a3[] = $num['num'];
But that seems like a lot of un-needed iterations.
Is there a better way?
Ah Snap...
I just realized I asked this question the wrong way around, I want to end up with an array that looks like
$a3 = array(
array('num' => 3, 'name' => 'three'),
array('num' => 4, 'name' => 'four'),
array('num' => 5, 'name' => 'five')
);
You could extract the relevant information (the 'num' items) from $a1 :
$a1_bis = array();
foreach ($a1 as $a) {
$a1_bis[] = $a['num'];
}
And, then, use array_intersect() to find what is both in $a1_bis and $a2 :
$result = array_intersect($a1_bis, $a2);
var_dump($result);
Which would get you :
array
2 => int 3
3 => int 4
4 => int 5
With this solution :
you are going through $a1 only once
you trust PHP on using a good algorithm to find the intersection between the two arrays (and/or consider that a function developed in C will probably be faster than any equivalent you could code in pure-PHP)
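On PHP 5.5 or later, the extraction loop can be replaced with array_column(). This is a sketch of the same intersection under that assumption, reusing $a1 and $a2 from the question:

```php
<?php
$a1 = [
    ['num' => 1, 'name' => 'one'],
    ['num' => 2, 'name' => 'two'],
    ['num' => 3, 'name' => 'three'],
    ['num' => 4, 'name' => 'four'],
    ['num' => 5, 'name' => 'five'],
];
$a2 = [3, 4, 5, 6, 7, 8];

// array_column() pulls out the 'num' values; array_intersect() keeps those
// also present in $a2 and preserves the keys of its first argument.
$result = array_intersect(array_column($a1, 'num'), $a2);

var_export($result); // keys 2, 3, 4 with values 3, 4, 5
```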
EDIT after the comment : well, considering the result you want, now, I would go with another approach.
First, I would use array_flip() to flip the $a2 array, to allow faster access to its elements (accessing by key is way faster than finding a value) :
$a2_hash = array_flip($a2); // To speed things up :
// accessing by key is way faster than finding
// an item in an array by value
Then, I would use array_filter() to apply a filter to $a1, keeping the items for which num is in the $a2_hash flipped-array :
$result = array_filter($a1, function ($item) use ($a2_hash) {
if (isset($a2_hash[ $item['num'] ])) {
return true;
}
return false;
});
var_dump($result);
Note : I used an anonymous function, which only exist with PHP >= 5.3 ; if you are using PHP < 5.3, this code can be re-worked to suppress the closure.
With that, I get the array you want :
array
2 =>
array
'num' => int 3
'name' => string 'three' (length=5)
3 =>
array
'num' => int 4
'name' => string 'four' (length=4)
4 =>
array
'num' => int 5
'name' => string 'five' (length=4)
Note the keys are not corresponding to anything useful -- if you want them removed, just use the array_values() function on that $result :
$final_result = array_values($result);
But that's probably not necessary :-)
