I have officially hit a wall and cannot figure out the solution to this issue. Any help would be much appreciated! I have tried array_intersect(), but it just keeps comparing against the first array passed to the function, so that won't work.
I have an arbitrary number of arrays (I'll show 4 for demonstration purposes), for example:
// 1.
array(1,2,3,4,5);
// 2.
array(1,3,5);
// 3.
array(1,3,4,5);
// 4.
array(1,3,5,6,7,8,9);
I need to figure out how to search all the arrays and find only the numbers that exist in all 4 arrays. In this example I need to pull out only the values 1, 3 and 5.
PS: Realistically, it would be best if the function could search a multidimensional array and extract only the numbers that appear in every array within it.
Thanks so much for your help!
Fun question! This worked:
function arrayCommonFind($multiArray) {
$result = $multiArray[0];
$count = count($multiArray);
for($i=1; $i<$count; $i++) {
foreach($result as $key => $val) {
if (!in_array($val, $multiArray[$i])) {
unset($result[$key]);
}
}
}
return $result;
}
Note that you can just use $multiArray[0] (or any sub-array) as a baseline and check all the others against that since any values that will be in the final result must necessarily be in all individual subarrays.
How about this?
Find the numbers that exist in both array 1 and 2. Then compare those results with array 3 to find the common numbers again. Keep going as long as you want.
Is this what you are getting at?
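The pairwise idea described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name intersect_all is hypothetical, and it assumes the arrays are packaged into one indexed array of arrays.

```php
<?php
// Hypothetical helper: keeps only the values present in every sub-array.
function intersect_all(array $arrays): array
{
    if ($arrays === []) {
        return [];
    }
    // Start with the first sub-array as the running result.
    $common = array_shift($arrays);
    foreach ($arrays as $next) {
        // Narrow the running result against each following sub-array.
        $common = array_intersect($common, $next);
    }
    // array_intersect preserves keys, so reindex from 0.
    return array_values($common);
}

print_r(intersect_all([[1, 2, 3, 4, 5], [1, 3, 5], [1, 3, 4, 5], [1, 3, 5, 6, 7, 8, 9]]));
```

Because each step can only shrink the running result, the order in which the sub-arrays are processed does not change the final set of values.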
If it's in a multidimensional array, you could do this:
$multiDimensional = array(/* Your arrays*/);
$found = array_pop($multiDimensional);
foreach($multiDimensional as $subArray)
{
foreach($found as $key=>$element)
{
if(!in_array($element, $subArray))
{
unset($found[$key]);
}
}
}
Per your comment on my other question, here is a better solution:
<?php
// 1. merge the arrays
$merged_arrays = array_merge( $arr1, $arr2, $arr3, $arr4, ...);
// 2. count the values
$merged_count = array_count_values( $merged_arrays );
// 3. loop over the counts to find elements that only matched once
foreach( $merged_count as $key => $value ){
if ($value == 1) {
// 4. unset the values that didn't intersect
unset($merged_count[$key]);
}
}
// 5. print the resulting array
print_r( $merged_count );
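One caveat with the merge-and-count idea: removing only the values counted once still keeps values that appear in some, but not all, of the arrays, and a value repeated inside a single array inflates its count. A hedged sketch of a stricter variant follows; the function name intersect_by_count is hypothetical, and it assumes the values are integers or strings (the only types array_count_values can count).

```php
<?php
// Hypothetical helper: a value is kept only if it occurs in every sub-array.
function intersect_by_count(array $arrays): array
{
    $total = count($arrays);
    if ($total === 0) {
        return [];
    }
    // De-duplicate each sub-array first so one array cannot count a value twice.
    $merged = array_merge(...array_map('array_unique', array_values($arrays)));
    // Map each value to the number of sub-arrays it appeared in.
    $counts = array_count_values($merged);
    // Keep the values whose count equals the number of sub-arrays.
    return array_keys(array_filter($counts, fn($n) => $n === $total));
}

print_r(intersect_by_count([[1, 2, 3, 4, 5], [1, 3, 5], [1, 3, 4, 5], [1, 3, 5, 6, 7, 8, 9]]));
```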
Performing iterated in_array() calls followed by unset() is excessive handling and it overlooks the magic of array_intersect() which really should be the hero of any solid solution for this case.
Here is a lean iterative function:
Code: (Demo)
function array_intersect_multi($arrays){ // iterative method
while(sizeof($arrays)>1){
$arrays[1]=array_intersect($arrays[0],$arrays[1]); // find common values from first and second subarray, store as (overwrite) second subarray
array_shift($arrays); // discard first subarray (reindex $arrays)
}
return implode(', ',$arrays[0]);
}
echo array_intersect_multi([[1,2,3,4,5],[1,3,5],[1,3,4,5],[1,3,5,6,7,8,9]]);
// output: 1, 3, 5
This assumes you will package the individual arrays into an indexed array of arrays.
I also considered a recursive function, but recursion is slower and uses more memory.
function array_intersect_multi($arrays){ // recursive method
if(sizeof($arrays)>1){
$arrays[1]=array_intersect($arrays[0],$arrays[1]); // find common values from first and second subarray, store as (overwrite) second subarray
array_shift($arrays); // discard first subarray (reindex $arrays)
return array_intersect_multi($arrays); // recurse
}
return implode(', ',$arrays[0]);
}
Furthermore, if you are happy to flatten your arrays into one with array_merge() and declare the number of individual arrays being processed, you can use this:
(fastest method)
Code: (Demo)
function flattened_array_intersect($array,$total_arrays){
return implode(', ',array_keys(array_intersect(array_count_values($array),[$total_arrays])));
}
echo flattened_array_intersect(array_merge([1,2,3,4,5],[1,3,5],[1,3,4,5],[1,3,5,6,7,8,9]),4);
or replace array_intersect() with array_filter() (slightly slower and more verbose):
function flattened_array_intersect($array,$total_arrays){
return implode(', ',array_keys(array_filter(array_count_values($array),function($v)use($total_arrays){return $v==$total_arrays;})));
}
echo flattened_array_intersect(array_merge([1,2,3,4,5],[1,3,5],[1,3,4,5],[1,3,5,6,7,8,9]),4);
EDIT: Thanks to nice_dev for providing an excellent solution below!
I had to rewrite part of his query for correct results:
From:
$tmpmatch['unmatched'] = array_filter($deck['main_deck'],
function ($val) use (&$tmparray, &$matched) {
$matched[ $val ] = $matched [$val] ?? 0;
$matched[ $val ]++;
return $matched[ $val ] <= $tmparray[ $val ];
}
);
To:
$tmpmatch['unmatched'] = array_filter($deck['main_deck'],
function ($val) use (&$tmparray) {
return empty($tmparray[$val]) || !$tmparray[$val]++;
}
);
Original:
I'm attempting to compare arrays to get a new array of matched and unmatched items. I have a loop that goes through roughly 55,000 items. The script can take 20+ minutes to complete, and I've narrowed it down to the use of both array_intersect and array_filter within the foreach. Ideally, I need it to complete much faster. If I limit the foreach to 1000 items, it still takes ~3 minutes to complete, which is slow for the client-side experience.
If I remove them, the script completes almost immediately. As such, I will include only these relevant pieces in the code below.
I'm using a custom array_intersect_fixed function as regular array_intersect returned wrong results with duplicate values as per here.
Explanations:
$totalidsarray = An array of numbers such as ['11233', '35353', '3432433', '123323']. Could contain thousands of items.
$deck['main_deck'] = An array of numbers to compare against $totalidsarray. Similar structure. Max length is 60 items.
foreach($dbdeckarray as $deck){
$tmparray = $totalidsarray;
//Get an array of unmatched cards between the deck and the collection
//Collection = $tmparray
$tmpmatch['unmatched'] = array_filter($deck['main_deck'],
function ($val) use (&$tmparray) {
$key = array_search($val, $tmparray);
if ( $key === false ) return true;
unset($tmparray[$key]);
return false;
}
);
//Get an array of matched cards between the deck and the collection
$tmpmatch['matched'] = array_intersect_fixed($deck['main_deck'], $totalidsarray);
//Push results to matcharray
$matcharray[] = $tmpmatch;
}
//Built in array_intersect function returns wrong result when input arrays have duplicate values.
function array_intersect_fixed($array1, $array2) {
$result = array();
foreach ($array1 as $val) {
if (($key = array_search($val, $array2, FALSE))!==false) {
$result[] = $val;
unset($array2[$key]);
}
}
return $result;
}
To make matters worse, I have to do 2 further matched/unmatched checks within that same foreach loop against another array extra_deck, further increasing processing time.
Is there a more optimized approach I can take to this?
EDIT: Explanation of what the code needs to achieve.
The script retrieves the user's collection of cards that they own from a card game. This is assigned to $totalidsarray. It then queries every deck in the database (~55,000) and compares the collection you own against the built deck of cards (main_deck). It then attempts to extract all owned cards (matched) and all unowned cards (unmatched) into two arrays. Once the full foreach loop is done, the client side receives a list of each deck alongside its matched and unmatched cards (with a % match for each).
A couple of optimizations I can suggest:
The array_intersect_fixed routine you have is quadratic, because it is effectively 2 nested loops under the hood (array_search scans $array2 for every element). We can use array_count_values, which builds a map, to make it run in linear time.
json_decode() doesn't need to be done twice for every deck. If you do it once and reuse the result wherever needed, it should work just fine (unless you make any in-place edits, which I don't see right now). It also needs to be decoded to an array, not an object, using the true flag.
For your array_filter, the comparison is also quadratic. We will use array_count_values again to optimize it, along with a $matched array. We keep counting the frequency of elements; if a count surpasses the count in $tmparray, we return false, otherwise we return true.
Snippet:
<?php
$tmparray = array_count_values($totalidsarray);
foreach($dbdeckarray as $deck){
$matched = [];
$deck['main_deck'] = json_decode($deck['main_deck'], true);
$tmpmatch['unmatched'] = array_filter($deck['main_deck'],
function ($val) use (&$tmparray, &$matched) {
$matched[ $val ] = $matched [$val] ?? 0;
$matched[ $val ]++;
return $matched[ $val ] <= $tmparray[ $val ];
}
);
$tmpmatch['matched'] = array_intersect_fixed($deck['main_deck'], $tmparray);
$matcharray[] = $tmpmatch;
}
function array_intersect_fixed($array1, $array2) {
$result = array();
$matched = [];
foreach ($array1 as $val) {
$matched[ $val ] = $matched[ $val ] ?? 0;
$matched[ $val ]++;
if (isset($array2[ $val ]) && $matched[ $val ] <= $array2[ $val ]) {
$result[] = $val;
}
}
return $result;
}
Note: array_intersect_fixed now expects $array2 to already be a value => count map. If you wish to use it elsewhere, make sure to pass array_count_values() of the array as the 2nd parameter, or add a third parameter as a flag to indicate whether that conversion is still needed.
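As a hedged illustration of the counting idea, the matched and unmatched lists can also be built in a single pass over the deck. The function name split_deck and the parameter $ownedCounts are hypothetical; the sketch assumes $ownedCounts is the result of array_count_values() on the collection and that duplicates in a deck should only match as many copies as the collection holds.

```php
<?php
// Hypothetical helper: splits a deck into owned (matched) and missing (unmatched) cards.
function split_deck(array $deck, array $ownedCounts): array
{
    $used = [];       // how many copies of each card this deck has consumed so far
    $matched = [];
    $unmatched = [];
    foreach ($deck as $card) {
        $used[$card] = ($used[$card] ?? 0) + 1;
        // A copy is matched only while the collection still has enough copies.
        if ($used[$card] <= ($ownedCounts[$card] ?? 0)) {
            $matched[] = $card;
        } else {
            $unmatched[] = $card;
        }
    }
    return ['matched' => $matched, 'unmatched' => $unmatched];
}

$owned = array_count_values(['11233', '35353', '35353', '123323']);
print_r(split_deck(['35353', '35353', '35353', '999'], $owned));
```

Each deck costs time proportional to its own length (at most 60 cards here), and the collection counts are computed once outside the loop.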
Besides @nice_dev's suggestion, your code can be simplified.
The unmatched part is an array diff:
array_diff($deck['main_deck'], $tmparray);
If the problem is duplicated values in the array, array_intersect_fixed() can be avoided by running array_unique() on the array (I guess that is $deck['main_deck']) before calling array_intersect().
This will also speed up array_diff(), as it will have fewer array elements to compare.
Suppose you have two arrays $a=array('apple','banana','canaple'); and $b=array('apple');, how do you (elegantly) extract the numeric indices of elements in array a that aren't in array b? (in this case, indices: 1 and 2).
In this case, array a will always have more elements than b.
Note, this is not asking for array_diff_key, but rather the numeric indices in the array with more elements that don't exist in the array with fewer elements.
array_diff gets you half way there. Using array_keys on the diff gets you the rest of what you want.
$a = ['apple','banana','canaple'];
$b = ['apple'];
$diff = array_diff($a, $b);
$keys = array_keys($diff);
var_dump($keys); // [1, 2]
This is because array_diff returns both the element and its key from the first array. If you wanted to write a PHP implementation of array_diff, it might look something like this...
// Named my_array_diff here because the built-in array_diff cannot be redeclared.
function my_array_diff(array ...$arrays) {
$return = [];
$cmp = array_shift($arrays);
foreach ($cmp as $key => $value) {
foreach($arrays as $array) {
// a value found in any of the other arrays is excluded from the result
if (in_array($value, $array)) {
continue 2;
}
}
$return[$key] = $value;
}
return $return;
}
This gives you an idea of how you might achieve the result, but internally PHP implements this as a sort, because it's much faster than the aforementioned implementation.
Perhaps this has been asked several times but I can't find the right answer so here goes.
I have two arrays: one with ~135732 elements and the other with ~135730. I need to find which items are in the first but not in the second and vice versa, and I don't know if there is an easy way to achieve that.
This is how I would do it:
$countArr1 = count($arr1);
$countArr2 = count($arr2);
for($i=0; $i < $countArr1; $i++) {
// Check whether current element on $arr1 is on $arr2 or not
if (!in_array($arr1[$i], $arr2)) {
// if it doesn't then add it to $newArr
$newArr[] = $arr1[$i];
}
}
Then I would do the same in reverse for $arr2. With huge arrays this could take a while and could also exhaust memory or server resources, even if it's executed from the CLI. So what is the best and most resource-efficient way to achieve this?
EDIT
Let's clear this up a bit. I get $arr1 from the DB and $arr2 comes from another place. The big idea is to find which items need to be updated, which ones need to be added, and which ones need to be marked as obsolete. In plainer words:
if an element is in $arr1 but doesn't exist in $arr2, it should be marked as obsolete
if an element comes in $arr2 but doesn't exist in $arr1, it needs to be added (created)
otherwise that element just needs to be updated
Clear enough? Feel free to ask anything that would help with this.
EDIT 2
Based on @dakkaron's answer I made this code:
// $arr1 and $arr2 are previously built
$sortArr1 = asort($arr1);
$sortArr2 = asort($arr2);
$countArr1 = count($sortArr1);
$countArr2 = count($sortArr2);
$i = $j = 0;
$updArr = $inactiveArr = $newArr = [];
echo "original arr1 count: ", count($arr1), "\n";
echo "original arr2 count: ", count($arr2), "\n";
echo "arr1 count: ", $countArr1, "\n";
echo "arr2 count: ", $countArr2, "\n";
while ( $i < $countArr1 && $j < $countArr2) {
if ($sortArr1[$i] == $sortArr2[$j]) {
//Handle equal values
$updArr[] = $sortArr1[$i];
$i++; $j++;
} else if ($sortArr1[$i] < $sortArr2[$j]) {
//Handle values that are in arr1 but not in arr2
$inactiveArr[] = $sortArr1[$i];
$i++;
} else {
//Handle values that are in arr2 but not in arr1
$newArr[] = $sortArr2[$j];
$j++;
}
}
echo "items update: ", count($updArr), "\n", "items inactive: ", count($inactiveArr), "\n", "items new: ", count($newArr), "\n";
And I got this output:
original arr1 count: 135732
original arr2 count: 135730
arr1 count: 1
arr2 count: 1
items update: 1
items inactive: 0
items new: 0
Why does count() on the sorted arrays return 1?
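A likely explanation, offered as a sketch rather than a definitive diagnosis: asort() and sort() reorder the array passed to them in place and return a boolean, so $sortArr1 and $sortArr2 above hold true rather than sorted arrays, and counting a boolean is what produces 1 on older PHP versions. The example below uses sort() instead of asort(), on the assumption that the loop needs 0-based consecutive indexes (asort() keeps the original keys).

```php
<?php
$arr1 = ['b' => 3, 'a' => 1, 'c' => 2];

// sort() modifies $arr1 itself and reindexes it from 0; its return value is a boolean.
$returned = sort($arr1);

var_dump($returned); // a boolean, not the sorted array
print_r($arr1);      // the sorted data lives in the original variable
```

With that change, the comparison loop would read from $arr1 and $arr2 directly after sorting them.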
You could take advantage of array_diff: http://php.net/manual/en/function.array-diff.php
Edit
A built-in PHP function is likely to perform better than an equivalent user-defined one. While searching I found this, but the size of your array is way smaller, and in the end I believe you should benchmark a prototype script with the candidate solutions.
See my last comment.
The best solution I can think of would be to first sort both arrays and then compare them from the bottom up.
Start with the lowest element in both arrays and compare them.
If they are equal, take them and move up one element on both arrays.
If they are different, move up one element on the array with the lower value.
If you reached the end of one of the arrays you are done.
After sorting, the comparison pass itself takes about O(n).
This is a bit of code in pseudocode:
arr1 = ...
arr2 = ...
arr1.sort();
arr2.sort();
i1 = 0;
i2 = 0;
while (i1<arr1.length() && i2<arr2.length()) {
if (arr1[i1]==arr2[i2]) {
//Handle equal values
i1++; i2++;
} else if (arr1[i1]<arr2[i2]) {
//Handle values that are in arr1 but not in arr2
i1++;
} else {
//Handle values that are in arr2 but not in arr1
i2++;
}
}
Other than that, if you don't want to implement it yourself, just use array_diff
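The pseudocode above could be turned into PHP roughly as follows. This is a sketch under stated assumptions: the function name compare_sorted is hypothetical, the values are treated as strings, and each array is assumed to contain no duplicates. It also drains whatever is left in either array once the other one is exhausted, which the pseudocode leaves out.

```php
<?php
// Hypothetical helper: classifies values as present in both arrays, only in $a, or only in $b.
function compare_sorted(array $a, array $b): array
{
    // sort() works in place and reindexes from 0; SORT_STRING matches strcmp() ordering.
    sort($a, SORT_STRING);
    sort($b, SORT_STRING);

    $i = $j = 0;
    $na = count($a);
    $nb = count($b);
    $both = $onlyA = $onlyB = [];

    while ($i < $na && $j < $nb) {
        $cmp = strcmp($a[$i], $b[$j]);
        if ($cmp === 0) {          // equal values: advance both cursors
            $both[] = $a[$i];
            $i++;
            $j++;
        } elseif ($cmp < 0) {      // $a[$i] is smaller, so it cannot be in $b
            $onlyA[] = $a[$i++];
        } else {                   // $b[$j] is smaller, so it cannot be in $a
            $onlyB[] = $b[$j++];
        }
    }
    // Whatever remains in either array has no counterpart in the other.
    while ($i < $na) {
        $onlyA[] = $a[$i++];
    }
    while ($j < $nb) {
        $onlyB[] = $b[$j++];
    }
    return ['both' => $both, 'onlyA' => $onlyA, 'onlyB' => $onlyB];
}

print_r(compare_sorted(['b', 'a', 'd'], ['c', 'a', 'e']));
```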
The best solution I can think of is to sort the second array and then look up each value from the first array using binary search.
This would take O(n log n) time.
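A minimal sketch of that approach, assuming string values; both function names are hypothetical, and PHP has no built-in binary search, so one is written out here.

```php
<?php
// Hypothetical helper: binary search over a list sorted with SORT_STRING.
function binary_contains(array $sorted, string $needle): bool
{
    $lo = 0;
    $hi = count($sorted) - 1;
    while ($lo <= $hi) {
        $mid = intdiv($lo + $hi, 2);
        $cmp = strcmp($sorted[$mid], $needle);
        if ($cmp === 0) {
            return true;
        }
        if ($cmp < 0) {
            $lo = $mid + 1;   // needle can only be in the upper half
        } else {
            $hi = $mid - 1;   // needle can only be in the lower half
        }
    }
    return false;
}

// Hypothetical helper: values of $a that do not occur in $b.
function diff_by_binary_search(array $a, array $b): array
{
    sort($b, SORT_STRING);    // O(n log n), done once
    $missing = [];
    foreach ($a as $value) {  // n lookups of O(log n) each
        if (!binary_contains($b, (string) $value)) {
            $missing[] = $value;
        }
    }
    return $missing;
}

print_r(diff_by_binary_search(['x', 'a', 'm'], ['m', 'b', 'a']));
```

Running it once in each direction gives both sides of the difference.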
Since your values are strings, you could take the advantage of PHP’s implementation of arrays using a hash-table internally with O(1) for key lookups:
$diff = [];
// A \ B
$lookup = array_flip($b); // O(n)
foreach ($a as $value) { // O(n)
if (!isset($lookup[$value])) $diff[] = $value;
}
// B \ A
$lookup = array_flip($a); // O(n)
foreach ($b as $value) { // O(n)
if (!isset($lookup[$value])) $diff[] = $value;
}
So in total, it’s O(n) in both space and time.
Of course, in the end you should benchmark it to see if it’s actually more efficient than other solutions here.
Fill a hash-table-based dictionary/map (I don't know what it is called in PHP) with the elements of the second array, and check whether every element of the first array is present in this dictionary.
Typical complexity is O(N).
for A in arr2
map.insert(A)
for B in arr1
if not map.contains(B) then
element B is in $arr1 but doesn't exist in $arr2
Note that this approach doesn't address all the problems in your edited question.
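In PHP, a plain array can serve as that map. The sketch below extends the idea to all three cases from the edited question (obsolete, update, new). The function name classify_items and the result keys are hypothetical, and the values are assumed to be unique integers or strings, since array_flip() can only flip those.

```php
<?php
// Hypothetical helper: $dbItems plays the role of $arr1, $incoming the role of $arr2.
function classify_items(array $dbItems, array $incoming): array
{
    // array_flip() turns values into keys, giving O(1) average isset() lookups.
    $inDb = array_flip($dbItems);
    $inIncoming = array_flip($incoming);

    $obsolete = $update = $new = [];
    foreach ($dbItems as $value) {
        if (isset($inIncoming[$value])) {
            $update[] = $value;    // present on both sides
        } else {
            $obsolete[] = $value;  // in the DB but no longer incoming
        }
    }
    foreach ($incoming as $value) {
        if (!isset($inDb[$value])) {
            $new[] = $value;       // incoming but not yet in the DB
        }
    }
    return ['obsolete' => $obsolete, 'update' => $update, 'new' => $new];
}

print_r(classify_items(['a', 'b', 'c'], ['b', 'c', 'd']));
```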
I have 4 arrays and use array_multisort to sort them all at the same time relative to one another.
Problem is, in the first array, there can be empty values and I want to put them at the end, not the beginning.
Example : http://codepad.org/V6TjCsS5
Is there a way to:
Pass a custom function to array_multisort
or
Sort the first array with a custom function then use the result order to sort the other arrays
or
Use a certain argument with array_multisort to achieve what I want
Thank you very much
Unfortunately none of the approaches you propose is possible, which means you have to take a step back and look for alternatives. I am assuming you want a normal ascending sort, with the explicit exception that empty elements (which would be "smallest") need to be considered "largest".
Option 1: Manually rearrange elements after sorting
Do your array_multisort as usual, and then make the modifications you require:
// $arr1, $arr2 etc have been sorted with array_multisort
while(reset($arr1) == '') {
$k = key($arr1);
unset($arr1[$k]); // remove empty element from beginning of array
$arr1[$k] = ''; // add it to end of array
// and now do the same for $arr2
$v = reset($arr2);
$k = key($arr2);
unset($arr2[$k]);
$arr2[$k] = $v;
// the same for $arr3, etc
}
You can pull out part of the code in a function to make this prettier:
function shift_and_push(&$arr) {
$v = reset($arr);
$k = key($arr);
unset($arr[$k]);
$arr[$k] = $v;
}
Option 2: Condense everything inside one array so you can use usort
The idea here is to pull all your arrays into one so you can specify the comparison function by using usort:
$allArrays = array_map(function() { return func_get_args(); },
$array1, $array2 /* , as many arrays as you want */);
You can now sort:
// writing this as a free function so that it looks presentable
function cmp($row1, $row2) {
// $row1[0] is the item in your first array, etc
if($row1[0] == $row2[0]) {
return 0;
}
else if($row1[0] == '') {
return 1;
}
else if($row2[0] == '') {
return -1;
}
return $row1[0] < $row2[0] ? -1 : 1;
}
usort($allArrays, "cmp");
At this point you are left with an array, each element (row) of which is an array. The first elements of each are what was originally inside $array1, second elements are what was in $array2, etc. By placing those elements inside "rows", we have managed to keep the sort order among all your original arrays synchronized.
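To get separate arrays back after sorting the rows, each column can be extracted again. The following is a hedged end-to-end sketch of Option 2: the function name multisort_empty_last is hypothetical, it assumes at least two arrays of equal length, and it uses array_map(null, ...) to build the rows and array_column() to split them again.

```php
<?php
// Hypothetical helper: sorts parallel arrays by the first one, ascending,
// with empty strings in the first array placed last.
function multisort_empty_last(array ...$columns): array
{
    // Zip the parallel arrays into rows: [[a0, b0, ...], [a1, b1, ...], ...]
    $rows = array_map(null, ...$columns);

    usort($rows, function (array $x, array $y): int {
        if ($x[0] === $y[0]) {
            return 0;
        }
        if ($x[0] === '') {
            return 1;   // empty sorts after everything else
        }
        if ($y[0] === '') {
            return -1;
        }
        return $x[0] <=> $y[0];
    });

    // Unzip the sorted rows back into one array per original column.
    $sorted = [];
    foreach (array_keys($columns) as $i) {
        $sorted[] = array_column($rows, $i);
    }
    return $sorted;
}

[$names, $qty] = multisort_empty_last(['', 'pear', 'apple', ''], [7, 2, 5, 9]);
print_r($names);
print_r($qty);
```

Rows that compare as equal (for example two rows whose first element is empty) may come out in either order on PHP versions where usort() is not stable.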
The second argument to array_multisort can be the options to the sort. So you could pass SORT_DESC if SORT_ASC is doing the opposite of what you want.
array_multisort($arr, SORT_DESC);
For completeness' sake, here are the available options: SORT_ASC, SORT_DESC, SORT_REGULAR, SORT_NUMERIC, SORT_STRING. SORT_STRING may be helpful.
Do you actually want the empty elements? If not, just remove them before sorting:
foreach($array1 as $k => $v) {
if($v==='')
{
unset($array1[$k]); // unset($v) would only destroy the loop variable, not the array element
}
}
Let's say I have an array as follows:
$sampArray = array (1,4,2,1,6,4,9,7,2,9);
I want to remove all the duplicates from this array, so the result should be as follows:
$resultArray = array(1,4,2,6,9,7);
But here is the catch! I don't want to use any PHP built-in functions like array_unique().
How would you do it? :)
Here is a simple O(n)-time solution:
$uniqueme = array();
foreach ($array as $key => $value) {
$uniqueme[$value] = $key;
}
$final = array();
foreach ($uniqueme as $key => $value) {
$final[] = $key;
}
You cannot have duplicate keys, and this will retain the order.
A serious (working) answer:
$inputArray = array(1, 4, 2, 1, 6, 4, 9, 7, 2, 9);
$outputArray = array();
foreach($inputArray as $inputArrayItem) {
foreach($outputArray as $outputArrayItem) {
if($inputArrayItem == $outputArrayItem) {
continue 2;
}
}
$outputArray[] = $inputArrayItem;
}
print_r($outputArray);
This depends on the operations you have available.
If all you have to detect duplicates is a function that takes two elements and tells you whether they are equal (one example is the == operation in PHP), then you must compare every new element with all the non-duplicates you have found before. The solution will be quadratic: in the worst case (there are no duplicates), you need to do (1/2)(n*(n-1)) comparisons.
If your arrays can have any kind of value, this is more or less the only solution available (see below).
If you have a total order for your values, you can sort the array (n*log(n)) and then eliminate consecutive duplicates (linear). Note that you cannot use the <, >, etc. operators from PHP, they do not introduce a total order. Unfortunately, array_unique does this, and it can fail because of that.
If you have a hash function that you can apply to your values, then you can do it in average linear time with a hash table (which is the data structure behind a PHP array). See tandu's answer.
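The sort-then-skip option described above might look like this. It is only a sketch: the function name unique_sorted is hypothetical, it assumes values of a single comparable type such as integers, it relies on the built-in sort() (so it illustrates the complexity argument rather than the no-built-ins constraint), and the result comes back in sorted order instead of first-seen order.

```php
<?php
// Hypothetical helper: de-duplicates by sorting, then dropping consecutive repeats.
function unique_sorted(array $values): array
{
    sort($values);            // O(n log n); equal values end up adjacent
    $out = [];
    $last = null;
    $hasLast = false;
    foreach ($values as $v) { // single linear pass
        if (!$hasLast || $v !== $last) {
            $out[] = $v;
            $last = $v;
            $hasLast = true;
        }
    }
    return $out;
}

print_r(unique_sorted([1, 4, 2, 1, 6, 4, 9, 7, 2, 9]));
```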
Edit2: The versions below use a hashmap to determine if a value already exists. In case this is not possible, here is another variant that safely works with all PHP values and does a strict comparison (Demo):
$array = array (1,4,2,1,6,4,9,7,2,9);
$unique = function($a)
{
$u = array();
foreach($a as $v)
{
foreach($u as $vu)
if ($vu===$v) continue 2;
$u[] = $v;
}
return $u;
};
var_dump($unique($array)); # array(1,4,2,6,9,7)
Edit: Same version as below, but without built-in functions, only language constructs (Demo):
$array = array (1,4,2,1,6,4,9,7,2,9);
$unique = array();
foreach($array as $v)
isset($k[$v]) || ($k[$v]=1) && $unique[] = $v;
var_dump($unique); # array(1,4,2,6,9,7)
And in case you don't want to have the temporary arrays spread around, here is a variant with an anonymous function:
$array = array (1,4,2,1,6,4,9,7,2,9);
$unique = function($a) /* similar as above but more expressive ... ... you have been warned: */ {for($v=reset($a);$v&&(isset($k[$v])||($k[$v]=1)&&$u[]=$v);$v=next($a));return$u;};
var_dump($unique($array)); # array(1,4,2,6,9,7)
At first I read that you don't want to use array_unique or similar functions (array_intersect etc.), so this was just a start; maybe it's still of some use:
You can use array_flip in combination with array_keys for your array of integers (Demo):
$array = array (1,4,2,1,6,4,9,7,2,9);
$array = array_keys(array_flip($array));
var_dump($array); # array(1,4,2,6,9,7)
As keys can only exist once in a PHP array and array_flip retains the order, you will get your result. As those are built-in functions, it's pretty fast, and there is not much to iterate over to get the job done.
<?php
$inputArray = array(1, 4, 2, 1, 6, 4, 9, 7, 2, 9);
$outputArray = array();
foreach ($inputArray as $val){
if(!in_array($val,$outputArray)){
$outputArray[] = $val;
}
}
print_r($outputArray);
You could use an intermediate array into which you add each item in turn. Prior to adding an item, you could check whether it already exists by looping through the new array.