Check if each value is in a different array - php

I need to check whether element 'a' is present twice, 'b' is present twice and 'c' is present once in a group of arrays. Each of these five elements should come from a different array.
That is, 'a' in array1, another 'a' in array2, 'b' in array3, another 'b' in array4 and 'c' in array5. There should be at least five arrays, and each element should be in a different array, using PHP. My code so far is:
$arr = array(
$branch1 = array('a', 'b'),
$branch2 = array('b','c'),
$branch3 = array('a', 'c'),
$branch4 = array('c', 'a'),
$branch5 = array('b', 'a'),
$branch6 = array('b', 'c', 'a')
);//This may have any number of branches and any kind of combinations of a, b and c(but each element only once in each array).
$reqd_branch_count = 5;//required branch count
As I am new to PHP, I have written some very long code, but it fails when I try new combinations. Please help if you know how to do this.
From asker's comment:
The condition is that a user has some ranks in his/her downline, which are
in branches, for example rank1, rank2 and rank3, ... Those rank counts
are shown in that array. If that user needs to get rank8, he/she should
have one rank 7, two rank 5s or 7s, and two rank 4s, which should be
1+2+2=5 branches; that is, each rank should be in a different branch.
Hope you understood the question

If I understood correctly:
merge all the arrays, count the occurrences of each item across them, and test for the counts you want.
$arr = array_merge($branch1,$branch2,$branch3,$branch4,$branch5,$branch6);
$count = array_count_values($arr);
echo $count[7]; // 4
echo $count[4]; // 2
So you can write a condition:
if (($count[7] == 1) or ($count[7] == 2) or ($count[5] == 2) or ($count[4] == 2))
{ /* any code to run when the condition is true */ }
UPDATE
$arr = array(
$branch1=array('3'),
$branch2=array('4'),
$branch3=array('3','5','7'),
$branch4=array('3','4','7'),
$branch5=array('7'),
$branch6=array('4','7'));
// find rank per branch
$ranks = array_map('max', $arr);
// make array rank => amount (keys 0..7, so $count[7] always exists)
$count = array_replace(array_fill(0,8,0), array_count_values($ranks));
if (($count[7] >= 1) and (($count[7] + $count[5]) >= 2) and (($count[7] + $count[5] + $count[4]) >= 5)) {
echo "Satisfy ";
}

Because it is another question, this is another answer
$arr = array(
$branch1 = array('a'),
$branch2 = array('b','c'),
$branch3 = array('a', 'c'),
$branch4 = array('c', 'a'),
$branch5 = array('a'),
$branch6 = array( 'c', 'a')
);
$names = array('a', 'b', 'c'); // for convenience only
var_dump(goNext($arr, $names)); // watch result
function goNext($arr, $names) {
// for each name, build the list of branches that contain it
$in = array(array(), array(), array());
foreach($arr as $k1 => $branch)
foreach($names as $k2 => $letter)
if(in_array($letter, $branch)) $in[$k2][] = $k1;
foreach ($in[0] as $i1) // 1st a
foreach (array_diff($in[0], array($i1)) as $i2) // 2nd a
foreach (array_diff($in[1], array($i1,$i2)) as $i3) // 1st b
foreach (array_diff($in[1], array($i1,$i2,$i3)) as $i4) // 2nd b
foreach(array_diff($in[2], array($i1,$i2,$i3,$i4)) as $i5) {
// if here we find combination we need
// next line only for debug
// it shows set of branches that give true
// a a b b c
echo $i1 . " " . $i2 . " " . $i3 . " " . $i4 . " " . $i5;
return(true);
}
return(false); // no matching combination was found
}
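The five nested loops above are tied to exactly "a a b b c". As a rough sketch only, the same search can be written recursively so that the list of required elements can have any length. The function name canAssign and its parameters are made up for this illustration and are not part of the answer above.

```php
<?php
// Hypothetical helper: can every required element be taken from its own branch?
// $used holds the indexes of branches already claimed by earlier elements.
function canAssign(array $branches, array $required, array $used = array()) {
    if (count($required) === 0) {
        return true; // every requirement has been given a distinct branch
    }
    $element = array_shift($required); // next requirement to place
    foreach ($branches as $index => $branch) {
        if (!isset($used[$index]) && in_array($element, $branch, true)) {
            $used[$index] = true; // tentatively claim this branch
            if (canAssign($branches, $required, $used)) {
                return true;
            }
            unset($used[$index]); // backtrack and try another branch
        }
    }
    return false; // no unused branch contains this element
}

$branches = array(
    array('a'),
    array('b', 'c'),
    array('a', 'c'),
    array('c', 'a'),
    array('a'),
    array('c', 'a'),
);
// Only one branch contains 'b', so two b's cannot be placed.
var_dump(canAssign($branches, array('a', 'a', 'b', 'b', 'c'))); // bool(false)
```

Like the nested loops, this tries combinations until one fits, so it can get slow when there are many branches and many required elements.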


Get the name of the variable which has the second lowest/smallest value with PHP

I have 4 variables and each of those have an integer assigned to them. Could anybody please let me know how I can get the name of the variable which has the second smallest value?
Thanks.
Use compact() to collect the variables into one array, sort the array with asort(), then use array_slice() to get the second value.
Then optionally echo the key of the second value.
$a = 2;
$b = 7;
$c = 6;
$d = 1;
$arr = compact('a', 'b', 'c', 'd');
asort($arr);
$second = array_slice($arr,1,1);
echo "variable name " . key($second) . "\n";
echo "value " . ${key($second)};
https://3v4l.org/SVdCq
Updated the code with how to access the original variable from the array
Unless you have a structured way of naming your variables, e.g. prefix_x, there is no real way.
The recommended way is to use an array like this:
$array = array(
"a" => 3,
"b" => 2,
"c" => 1,
"d" => 6
);
// Sort the array ascending but keep the keys.
// http://php.net/manual/en/function.asort.php.
asort($array);
// Fetch the keys and get the second item (index 1).
// This is the key you are looking for per your question.
$second_key = array_keys($array)[1];
// Dumping the result to show it's the second lowest value.
var_dump($array[$second_key]); // int(2).
To be more in line with your question you can create your array like this.
$array = array();
$array['variable_one'] = $variable_one;
$array['some_random_var'] = $some_random_var;
$array['foo'] = $foo;
$array['bar']= $bar;
// Same code as above here.
Instead of using 4 variables for 4 integer values, you can use an array to store these values. Sort the array and print the element at the second index, i.e. index 1.
<?php
$x = array(2,3,1,6);
$i = 0; $j = 0; $temp = 0;
for($i = 0; $i < 4; $i++){
for($j = 0; $j < 3 - $i; $j++){ // stop one short so that $x[$j+1] stays in range
if($x[$j] > $x[$j+1]){
$temp = $x[$j];
$x[$j] = $x[$j+1];
$x[$j+1] = $temp;
}
}
}
for($j = 0; $j < 4; $j++){
echo $x[$j];
}
echo $x[1];
?>
First you need to have all the variables in an array. You can do it this way:
$array = array(
'a' => 3,
'b' => 6,
'c' => 2,
'd' => 1
);
or this way:
$array['a'] = 3;
$array['b'] = 6;
// etc
Then sort the items with natsort() to get a natural ordering.
natsort($array);
Then flip the array keys with the values (if you want the value rather than the name, skip this line):
$array = array_flip($array);
After this, move to the next item in the array (position 1) using next():
echo next($array);
In total, that makes a pretty short script:
$array = array(
'a' => 3,
'b' => 6,
'c' => 2,
'd' => 1
);
natsort($array);
$array = array_flip($array);
echo next($array);

PHP Script and max() Value

Sorry for my bad English and thanks for your help in advance! I have kind of a tricky problem I've encountered while coding. Here's the point:
I need a script that essentially extracts the 5 max values of 5 arrays, that are "mixed", i.e. they contain "recurrent" values. Here is an example:
array1(a, b)
array2(a, c, d, e, g)
array3(b, d, g, h)
array4(e, t, z)
array5(b, c, d, k)
The 2 essential requirements are:
1) the sum of the values taken from those 5 arrays (array1+array2+array3...) MUST be the highest possible...
2) ...without repeating ANY value previously used (e.g. if in array1 the top value was "b", this cannot be re-used as the max value in arrays 3 or 5).
Currently I have this...:
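// $a, $b, $c and $d are assumed to hold numeric values
$group1 = array($a, $b);
$group2 = array($a, $b, $c, $d);
$max1a = max($group1);
$max2a = max(array_diff($group2, array($max1a))); // skip the value already used
$sum1 = $max1a + $max2a;
$max2b = max($group2);
$max1b = max(array_diff($group1, array($max2b))); // skip the value already used
$sum2 = $max1b + $max2b;
if($sum1 > $sum2) {
echo $sum1;
} else {
echo $sum2;
}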
... but it's kinda impossible to use this code with 5 arrays, because I should compare 5! (120...!!!) combinations in order to find the max sum value.
I know the problem is quite difficult to explain and to solve, but I really need your help and I hope you can save me!!!
Cheers
I'm adding this as another answer to leave the previous one intact for someone coming across this looking for that variation on this behaviour.
Given the 2 arrays:
$array1 = array(30, 29, 20);
$array2 = array(30, 20, 10);
The maximum sum using one element from each is 59. This is dramatically different from my previous approach (and the other answers), which took the max element of the first array and then the highest element of the next array that is not equal to any previously used value; that would give 50 instead.
The code you want is this:
$mainArray = array();
$mainArray[] = array(30, 29, 20);
$mainArray[] = array(30, 20, 10);
$tempArray = array();
$newArray = array();
foreach($mainArray as $innerArray) {
$newArray = array();
if (count($tempArray) == 0) {
foreach ($innerArray as $value) {
$newArray[] = array('total' => $value, 'used' => array($value));
}
}
else {
foreach ($tempArray as $key => $innerTempArray) {
$placed = FALSE;
foreach ($innerArray as $value) {
if (!(in_array($value, $innerTempArray['used']))) {
$newArray[] = array('total' => $tempArray[$key]['total'] + $value, 'used' => $tempArray[$key]['used']);
$newArray[count($newArray) - 1]['used'][] = $value;
$placed = TRUE;
}
}
if (!($placed)) {
echo "An array had no elements that had not already been used";
die();
}
}
}
$tempArray = $newArray;
}
$total = 0;
if (count($newArray) == 0) {
echo "No data passed";
die();
}
else {
$total = $newArray[0]['total'];
}
for ($i = 0; $i < count($newArray); $i++) {
if ($newArray[$i]['total'] > $total) {
$total = $newArray[$i]['total'];
}
}
var_dump($total);
EDIT - Do not repeat used variables (but repeated values are ok):
Let
//$a = 30, $b = 30, $c = 25, $d = 20, $e = 19
$array1 = array($a, $c, $d);
$array2 = array($b, $d, $e);
We want to choose $a from $array1 and $b from $array2 as these give the largest sum. Although their values are the same, that is allowed, because we only care whether the names of the variables used are the same.
With the arrays in the above format there is no way of achieving the desired behaviour: the arrays do not know the name of the variable whose value was assigned to their elements, only its value. Therefore we must change the first part of the original answer to:
$mainArray[] = array('a', 'c', 'd');
$mainArray[] = array('b', 'd', 'e');
and also have either of the following before the first foreach loop (to declare $a, $b, $c, $d, $e):
//either
extract(array(
'a' => 30,
'b' => 30,
'c' => 25,
'd' => 20,
'e' => 19
));
//or
$a = 30; $b = 30; $c = 25; $d = 20; $e = 19;
The above both do exactly the same thing, I just prefer the first for neatness.
Then replace the line below
$newArray[] = array('total' => $value, 'used' => array($value));
with
$newArray[] = array('total' => ${$value}, 'used' => array($value));
The change is the ${...} around the first $value, so that the string in $value is used as the name of the variable to read (as in the example below):
$test = 'hello';
$var = 'test';
echo ${$var}; //prints 'hello'
A similar change replaces
$newArray[] = array('total' => $tempArray[$key]['total'] + $value, 'used' => $tempArray[$key]['used']);
with
$newArray[] = array('total' => $tempArray[$key]['total'] + ${$value}, 'used' => $tempArray[$key]['used']);
Now the code will function as wanted :)
If you are dynamically building the arrays you are comparing and can't build the array of strings instead of variables, then there is no way to do it. You would need some way of extracting "$a" or "a" from $a = 30, which PHP is not meant to do (there are hacks, but they are complicated and only work in certain situations; google "get variable name as string in php" to see what I mean).
If by the top value you mean the first alphabetically then the following would work:
$array1 = array('a', 'b');
$array2 = array('a', 'c', 'd', 'e', 'g');
$array3 = array('b', 'd', 'g', 'h');
$array4 = array('e', 't', 'z');
$array5 = array('b', 'c', 'd', 'k');
$mainArray = array($array1, $array2, $array3, $array4, $array5);
foreach ($mainArray as $key => $value) {
sort($mainArray[$key]);
}
$resultArray = array();
foreach($mainArray as $key1 => $value1) {
$placed = FALSE;
foreach ($value1 as $value2) {
if (!(in_array($value2, $resultArray))) {
$resultArray[] = $value2;
$placed = TRUE;
break;
}
}
if (!($placed)) {
echo "All the values in the " . ($key1 + 1) . "th array are already max values in other arrays";
die();
}
}
var_dump($resultArray);
I'm not sure if I really understood your problem correctly; these are my assumptions:
You have five arrays containing numbers
These numbers can occur multiple times across the arrays
You want to find the highest possible sum of elements across your arrays
The sum uses one single value of each array
But the sum must not use the same number twice
Is that correct?
If Yes, then:
The highest possible sum across all arrays is always the sum of the largest elements. If you do not want to use the same number twice, you can just get the maximum from the first array, remove it from all the others and then sum up all the remaining maxima.
Like so:
$arrays = array();
$arrays[] = array(1, 2);
$arrays[] = array(1, 3, 4, 5, 7);
$arrays[] = array(2, 4, 7, 8);
$arrays[] = array(5, 20, 26);
$arrays[] = array(2, 3, 4, 11);
for($i=0, $n=count($arrays); $i<$n; $i++) {
if($i===0) {
$a1max = max($arrays[$i]);
$sum = $a1max;
} else {
$duplicate_pos = array_search($a1max, $arrays[$i]);
if($duplicate_pos !== FALSE) {
unset($arrays[$i][$duplicate_pos]);
}
$sum += max($arrays[$i]);
}
}
echo "sum: " . $sum . "\n";
Assuming you have grouped together all your values in one array like this,
$array = array(
array(1,2,3),
array(1,2,3,4),
array(1,2,3,4,5,6),
array(1,2,3,4,5,6),
array(1,2,3,4,5,6,7)
);
Loop through $array, and get the highest value which has not been used previously,
$max = array();
foreach($array as $value)
$max[] = max(array_diff($value, $max));
Calculate the sum of all values with array_sum(),
echo "The maximal sum is: ".array_sum($max);
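The greedy answers above take each array's largest unused value in order, which can miss the best total: for array(30, 29, 20) and array(30, 20, 10) that gives 50, while 59 is reachable. Below is a rough recursive sketch of an exhaustive search, assuming the goal is one value per array with no value used twice. The name maxDistinctSum is invented for this illustration.

```php
<?php
// Hypothetical helper: best sum taking one value per array, no value reused.
// Returns null when some array has no unused value left to pick.
function maxDistinctSum(array $arrays, array $used = array()) {
    if (count($arrays) === 0) {
        return 0; // nothing left to pick from
    }
    $first = array_shift($arrays); // handle the first array, recurse on the rest
    $best = null;
    foreach ($first as $value) {
        if (in_array($value, $used, true)) {
            continue; // value already taken by an earlier array
        }
        $rest = maxDistinctSum($arrays, array_merge($used, array($value)));
        if ($rest !== null && ($best === null || $value + $rest > $best)) {
            $best = $value + $rest;
        }
    }
    return $best;
}

echo maxDistinctSum(array(array(30, 29, 20), array(30, 20, 10))); // 59
```

This tries every allowed combination, so the work grows quickly with the number and size of the arrays; for five short arrays that should not be a problem.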

foreach over two arrays when one is twice as big as the other

Ok so I have two arrays, and the second one is always twice as large as the first one:
$items1 = array('1', '2', '3');
$items2 = array('a', 'b', 'c', 'd', 'e', 'f');
I know that I can loop over two arrays with the same item count like this:
foreach ($items1 as $key => $item1)
{
echo $item1 . $items2[$key] . ', ';
}
This will give a result like this: 1a, 2b, 3c....
But how do I loop over those two arrays to get a result like this:
1ab, 2cd, 3ef?
That is, echo the first item from the $items1 array, and then two from $items2.
foreach (array_combine($items1, array_chunk($items2, 2)) as $key => $value) {
echo $key.implode($value)."\n";
}
You could write:
foreach ($items1 as $i => $item1)
{
echo $item1 . $items2[$i * 2] . $items2[$i * 2 + 1] . ', ';
}

php - string replacement

I am trying to do chord transposition in PHP. The array of chord values is as follows...
$chords1 = array('C','C#','D','D#','E','F','F#','G','G#','A','A#','B','C','Db','D','Eb','E','F','Gb','G','Ab','A','Bb','B','C');
An example would be D6/F#. I want to match the array value and then transpose it by a given number position in the array. Here is what I have so far...
function splitChord($chord){ // The chord comes into the function
preg_match_all("/C#|D#|F#|G#|A#|Db|Eb|Gb|Ab|Bb|C|D|E|F|G|A|B/", $chord, $notes); // match the item
$notes = $notes[0];
$newArray = array();
foreach($notes as $note){ // for each found item as a note
$note = switchNotes($note); // switch the note out
array_push($newArray, $note); // and push it into the new array
}
$chord = str_replace($notes, $newArray, $chord); // then string replace the chord with the new notes available
return($chord);
}
function switchNotes($note){
$chords1 = array('C','C#','D','D#','E','F','F#','G','G#','A','A#','B','C','Db','D','Eb','E','F','Gb','G','Ab','A','Bb','B','C');
$search = array_search($note, $chords1);////////////////Search the array position D=2 & F#=6
$note = $chords1[$search + 4];///////////////////////then make the new position add 4 = F# and A#
return($note);
}
This works, except that if I use a split chord like (D6/F#), the chord is transposed to A#6/A#. It replaces the first note (D) with an (F#), then both (F#'s) with an (A#).
The question is: how can I keep this double replacement from happening? The desired output would be F#6/A#. Thank you for your help. If the solution is posted, I WILL mark it as answered.
You can use preg_replace_callback function
function transposeNoteCallback($match) {
$chords = array('C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B', 'C', 'Db', 'D', 'Eb', 'E', 'F', 'Gb', 'G', 'Ab', 'A', 'Bb', 'B', 'C');
$pos = array_search($match[0], $chords) + 4;
if ($pos >= count($chords)) {
$pos = $pos - count($chords);
}
return $chords[$pos];
}
function transposeNote($noteStr) {
return preg_replace_callback("/C#|D#|F#|G#|A#|Db|Eb|Gb|Ab|Bb|C|D|E|F|G|A|B/", 'transposeNoteCallback', $noteStr);
}
Test
echo transposeNote("Eb6 Bb B Ab D6/F#");
returns
G6 C# Eb C F#6/A#
Cheap advice: move into the domain of natural numbers [0-11] and associate them with the corresponding note names at display time only; it will save you a lot of time.
The only problem will be enharmonic equivalents [e.g. C-sharp / D-flat], but hopefully you can deduce the right name from the tonality.
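A rough sketch of that numeric idea, assuming the twelve pitch classes are numbered 0-11 starting at C and that output always uses sharp names (picking sharps or flats properly would depend on the key). The function name transposeChord is made up for this example.

```php
<?php
// Hypothetical sketch: transpose note names by working on pitch classes 0-11.
function transposeChord($chord, $semitones) {
    $names = array('C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B');
    $flats = array('Db' => 1, 'Eb' => 3, 'Gb' => 6, 'Ab' => 8, 'Bb' => 10);
    $toNumber = array_flip($names) + $flats; // note name => pitch class

    return preg_replace_callback(
        '/C#|D#|F#|G#|A#|Db|Eb|Gb|Ab|Bb|[A-G]/',
        function ($match) use ($names, $toNumber, $semitones) {
            $pitch = ($toNumber[$match[0]] + $semitones) % 12;
            if ($pitch < 0) {
                $pitch += 12; // PHP's % keeps the sign of the left operand
            }
            return $names[$pitch]; // assumption: always display sharps
        },
        $chord
    );
}

echo transposeChord('D6/F#', 4); // F#6/A#
```

Because every matched note is replaced in a single pass, an already transposed note is never transposed again, and the modulo handles wrapping past B in either direction.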

PHP Arrays - Remove duplicates ( Time complexity )

Okay this is not a question of "how to get all uniques" or "How to remove duplicates from my array in php". This is a question about the time complexity.
I figured that array_unique is roughly O(n^2 - n), and here's my implementation:
function array_unique2($array)
{
$to_return = array();
$current_index = 0;
for ( $i = 0 ; $i < count($array); $i++ )
{
$current_is_unique = true;
for ( $a = $i+1; $a < count($array); $a++ )
{
if ( $array[$i] == $array[$a] )
{
$current_is_unique = false;
break;
}
}
if ( $current_is_unique )
{
$to_return[$current_index++] = $array[$i]; // advance the index so earlier results are not overwritten
}
}
return $to_return;
}
However, when benchmarking this against array_unique I got the following result:
Testing (array_unique2)... Operation took 0.52146291732788 s.
Testing (array_unique)... Operation took 0.28323101997375 s.
This makes array_unique about twice as fast. My question is: why? (Both had the same random data.)
And a friend of mine wrote the following:
function array_unique2($a)
{
$n = array();
foreach ($a as $k=>$v)
if (!in_array($v,$n))
$n[$k]=$v;
return $n;
}
which is twice as fast as the built-in one in PHP.
I'd like to know, why?
What is the time-complexity of array_unique and in_array?
Edit
I removed the count($array) from both loops and just used a variable at the top of the function; that gained 2 seconds on 100 000 elements!
While I can't speak for the native array_unique function, I can tell you that your friend's algorithm is faster because:
He uses a single foreach loop as opposed to your double for() loop.
Foreach loops tend to perform faster than for loops in PHP.
He used a single if(! ) comparison while you used two if() structures.
The only additional function call your friend made was in_array, whereas you called count() twice.
You made three variable declarations that your friend didn't have to ($a, $current_is_unique, $current_index).
While none of these factors alone is huge, I can see where the cumulative effect would make your algorithm take longer than your friend's.
The time complexity of in_array() is O(n). To see this, we'll take a look at the PHP source code.
The in_array() function is implemented in ext/standard/array.c. All it does is call php_search_array(), which contains the following loop:
while (zend_hash_get_current_data_ex(target_hash, (void **)&entry, &pos) == SUCCESS) {
// checking the value...
zend_hash_move_forward_ex(target_hash, &pos);
}
That's where the linear characteristic comes from.
This is the overall characteristic of the algorithm, because zend_hash_move_forward_ex() has constant behaviour: looking at Zend/zend_hash.c, we see that it's basically just
*current = (*current)->pListNext;
As for the time complexity of array_unique():
first, a copy of the array will be created, which is an operation with linear characteristic
then, a C array of struct bucketindex will be created and pointers into our array's copy will be put into these buckets - linear characteristic again
then, the bucketindex array will be sorted using quicksort - n log n on average
and lastly, the sorted array will be walked and duplicate entries will be removed from our array's copy - this should be linear again, assuming that deletion from our array is a constant-time operation
Hope this helps ;)
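To make the steps above more concrete, here is a userland sketch of the same sort-then-scan idea. It only illustrates the approach and is not PHP's actual C implementation; the name sortedUnique is invented, values are assumed to be scalars, and comparison is done on their string form.

```php
<?php
// Hypothetical userland sketch of "copy, sort, walk and drop duplicates".
function sortedUnique(array $input) {
    $sorted = $input;              // copy of the array: linear
    asort($sorted, SORT_STRING);   // sort by value, keeping keys: n log n on average
    $result = $input;              // second copy that duplicates are removed from
    $hasPrevious = false;
    $previous = null;
    foreach ($sorted as $key => $value) {   // one linear walk over the sorted copy
        if ($hasPrevious && (string)$value === (string)$previous) {
            unset($result[$key]);  // same value as the preceding entry: drop it
        }
        $previous = $value;
        $hasPrevious = true;
    }
    return $result;
}

print_r(sortedUnique(array('b', 'a', 'b', 'c', 'a')));
```

Which of several equal entries survives depends on whether the sort is stable (PHP's sorting has been stable since 8.0), so only the set of remaining values should be relied on here. The sort dominates the cost, which gives O(n log n) overall.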
Try this algorithm. It takes advantage of the fact that the key lookup is faster than in_array():
function array_unique_mine($A) {
$keys = Array();
$values = Array();
foreach ($A as $k => $v) {
if (!array_key_exists($v, $values)) {
$keys[] = $k;
$values[$v] = $v;
}
}
return array_combine($keys, $values);
}
Gabriel's answer has some great points about why your friend's method beats yours. Intrigued by the conversation following Christoph's answer, I decided to run some tests of my own.
Also, I tried this with differing lengths of random strings and although the results were different, the order was the same. I used 6 chars in this example for brevity.
Notice that array_unique5 actually has the same keys as native, 2 and 3, but just outputs in a different order.
Results...
Testing 10000 array items of data over 1000 iterations:
array_unique6: 1.7561039924622 array ( 9998 => 'b', 9992 => 'a', 9994 => 'f', 9997 => 'e', 9993 => 'c', 9999 => 'd', )
array_unique4: 1.8798060417175 array ( 0 => 'b', 1 => 'a', 2 => 'f', 3 => 'e', 4 => 'c', 5 => 'd', )
array_unique5: 7.5023629665375 array ( 10 => 'd', 0 => 'b', 3 => 'e', 2 => 'f', 9 => 'c', 1 => 'a', )
array_unique3: 11.356487989426 array ( 0 => 'b', 1 => 'a', 2 => 'f', 3 => 'e', 9 => 'c', 10 => 'd', )
array_unique: 22.535032987595 array ( 0 => 'b', 1 => 'a', 2 => 'f', 3 => 'e', 9 => 'c', 10 => 'd', )
array_unique2: 62.107122898102 array ( 0 => 'b', 1 => 'a', 2 => 'f', 3 => 'e', 9 => 'c', 10 => 'd', )
array_unique7: 71.557286024094 array ( 0 => 'b', 1 => 'a', 2 => 'f', 3 => 'e', 9 => 'c', 10 => 'd', )
And The Code...
set_time_limit(0);
define('HASH_TIMES', 1000);
header('Content-Type: text/plain');
$aInput = array();
for ($i = 0; $i < 10000; $i++) {
array_push($aInput, chr(rand(97, 102)));
}
function array_unique2($a) {
$n = array();
foreach ($a as $k=>$v)
if (!in_array($v,$n))
$n[$k]=$v;
return $n;
}
function array_unique3($aOriginal) {
$aUnique = array();
foreach ($aOriginal as $sKey => $sValue) {
if (!isset($aUnique[$sValue])) {
$aUnique[$sValue] = $sKey;
}
}
return array_flip($aUnique);
}
function array_unique4($aOriginal) {
return array_keys(array_flip($aOriginal));
}
function array_unique5($aOriginal) {
return array_flip(array_flip(array_reverse($aOriginal, true)));
}
function array_unique6($aOriginal) {
return array_flip(array_flip($aOriginal));
}
function array_unique7($A) {
$keys = Array();
$values = Array();
foreach ($A as $k => $v) {
if (!array_key_exists($v, $values)) {
$keys[] = $k;
$values[$v] = $v;
}
}
return array_combine($keys, $values);
}
function showResults($sMethod, $fTime, $aInput) {
echo $sMethod . ":\t" . $fTime . "\t" . implode("\t", array_map('trim', explode("\n", var_export(call_user_func($sMethod, $aInput), 1)))) . "\n";
}
echo 'Testing ' . (count($aInput)) . ' array items of data over ' . HASH_TIMES . " iterations:\n";
$fTime = microtime(1);
for ($i = 0; $i < HASH_TIMES; $i++) array_unique($aInput);
$aResults['array_unique'] = microtime(1) - $fTime;
$fTime = microtime(1);
for ($i = 0; $i < HASH_TIMES; $i++) array_unique2($aInput);
$aResults['array_unique2'] = microtime(1) - $fTime;
$fTime = microtime(1);
for ($i = 0; $i < HASH_TIMES; $i++) array_unique3($aInput);
$aResults['array_unique3'] = microtime(1) - $fTime;
$fTime = microtime(1);
for ($i = 0; $i < HASH_TIMES; $i++) array_unique4($aInput);
$aResults['array_unique4'] = microtime(1) - $fTime;
$fTime = microtime(1);
for ($i = 0; $i < HASH_TIMES; $i++) array_unique5($aInput);
$aResults['array_unique5'] = microtime(1) - $fTime;
$fTime = microtime(1);
for ($i = 0; $i < HASH_TIMES; $i++) array_unique6($aInput);
$aResults['array_unique6'] = microtime(1) - $fTime;
$fTime = microtime(1);
for ($i = 0; $i < HASH_TIMES; $i++) array_unique7($aInput);
$aResults['array_unique7'] = microtime(1) - $fTime;
asort($aResults, SORT_NUMERIC);
foreach ($aResults as $sMethod => $fTime) {
showResults($sMethod, $fTime, $aInput);
}
Results using Christoph's data set from the comments:
$aInput = array(); for($i = 0; $i < 1000; ++$i) $aInput[$i] = $i; for($i = 500; $i < 700; ++$i) $aInput[10000 + $i] = $i;
Testing 1200 array items of data over 1000 iterations:
array_unique6: 0.83235597610474
array_unique4: 0.84050011634827
array_unique5: 1.1954448223114
array_unique3: 2.2937450408936
array_unique7: 8.4412341117859
array_unique: 15.225166797638
array_unique2: 48.685120105743
PHP's arrays are implemented as hash tables, i.e. their performance characteristics are different from what you'd expect from 'real' arrays. An array's key-value-pairs are additionally stored in a linked list to allow fast iteration.
This explains why your implementation is so slow compared to your friend's: For every numeric index, your algorithm has to do a hash table lookup, whereas a foreach()-loop will just iterate over a linked list.
The following implementation uses a reverse hash table and might be the fastest of the crowd (double-flipping courtesy of joe_mucchiello):
function array_unique2($array) {
return array_flip(array_flip($array));
}
This will only work if the values of $array are valid keys, i.e. integers or strings.
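A small sketch of what the double flip does to the keys, using made-up sample data. array_flip() keeps the latest key when a value occurs more than once, so the result keeps the last key of each value, whereas array_unique() keeps the first.

```php
<?php
$input = array(0 => 'b', 1 => 'a', 2 => 'b', 3 => 'a');

// First flip: value => key; a later duplicate overwrites the earlier key.
$flipped = array_flip($input);   // array('b' => 2, 'a' => 3)

// Second flip: back to key => value, now without duplicates.
$unique = array_flip($flipped);  // array(2 => 'b', 3 => 'a')

var_dump($unique === array(2 => 'b', 3 => 'a'));               // bool(true)
var_dump(array_unique($input) === array(0 => 'b', 1 => 'a'));  // bool(true)
```

So the two approaches agree on the set of values but not on which keys are kept, which matters if the keys carry meaning.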
I also reimplemented your algorithm using foreach()-loops. Now, it will actually be faster than your friend's for small data sets, but still slower than the solution via array_flip():
function array_unique3($array) {
$unique_array = array();
foreach($array as $current_key => $current_value) {
foreach($unique_array as $old_value) {
if($current_value === $old_value)
continue 2;
}
$unique_array[$current_key] = $current_value;
}
return $unique_array;
}
For large data sets, the built-in version array_unique() will outperform all others except the double-flipping one. Also, the version using in_array() by your friend will be faster than array_unique3().
To summarize: Native code for the win!
Yet another version, which should preserve keys and their ordering:
function array_flop($array) {
$flopped_array = array();
foreach($array as $key => $value) {
if(!isset($flopped_array[$value]))
$flopped_array[$value] = $key;
}
return $flopped_array;
}
function array_unique4($array) {
return array_flip(array_flop($array));
}
This is actually enobrev's array_unique3() - I didn't check his implementations as thoroughly as I should have...
PHP is slower to execute than raw machine code (which is most likely executed by array_unique).
Your second example function (the one your friend wrote) is interesting. I do not see how it would be faster than the native implementation, unless the native one is removing elements instead of building a new array.
I'll admit I don't understand the native code very well, but it seems to copy the entire array, sort it, then loop through it removing duplicates. In that case your second piece of code is actually a more efficient algorithm, since adding to the end of an array is cheaper than deleting from the middle of it.
Keep in mind the PHP developers probably had a good reason for doing it the way they do. Does anyone want to ask them?
The native PHP function array_unique is implemented in C. Thus it is faster than PHP code, which has to be translated first. What’s more, PHP uses a different algorithm than you do. As I see it, PHP first uses quicksort to sort the elements and then deletes the duplicates in one pass.
Why is his friend’s implementation faster than his own? Because it uses more built-in functionality rather than trying to recreate it.
