I am looping through a large dataset (contained in a multidimensional associative array $values in this example) with many duplicate index values, with the goal of producing an array containing only the unique values from a given index 'data'.
Currently I am doing this like:
foreach ($values as $value) {
    $unique[$value['data']] = true;
}
This accomplishes the objective because duplicate array keys simply get replaced. But it feels a bit odd, since the elements themselves don't actually contain any data; the data lives in the keys.
It was suggested that I build the array first and then use array_unique() to remove duplicates. I'm inclined to stick with the former method, but are there pitfalls or problems I should be aware of with this approach? Or any benefits to using array_unique() instead?
I would do it like this.
$unique = array();
foreach ($values as $value) {
    if (!in_array($value, $unique)) {
        $unique[] = $value;
    }
}
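For comparison, here is a minimal sketch that puts the key-based approach from the question next to array_column() plus array_unique(). The sample rows are hypothetical, and array_column() assumes PHP 5.5 or later. One pitfall of the key-based approach is that PHP casts numeric-string keys to integers, so the values you read back with array_keys() may change type.

```php
<?php
// Hypothetical sample data; only the 'data' column matters here.
$values = [
    ['id' => 1, 'data' => 'a'],
    ['id' => 2, 'data' => 'b'],
    ['id' => 3, 'data' => 'a'],
    ['id' => 4, 'data' => '10'],
];

// Key-based approach: duplicates collapse because array keys are unique.
$seen = [];
foreach ($values as $value) {
    $seen[$value['data']] = true;
}
$uniqueFromKeys = array_keys($seen);

// Built-in approach: extract the column, drop duplicates, reindex from 0.
$uniqueFromBuiltin = array_values(array_unique(array_column($values, 'data')));

var_dump($uniqueFromKeys);    // the numeric string '10' comes back as int 10
var_dump($uniqueFromBuiltin); // '10' stays a string
```

The key-based version does a constant-time lookup per row, whereas the in_array() loop above rescans the result array for every row, so the difference tends to matter only on large inputs.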
Related
I need to determine the keys of all values in an array that have duplicates.
What I came up with is:
$duplicates_keys = array();
$unique = array_unique($in);
$duplicates = array_diff_assoc($in, $unique);
foreach ($in as $key => $val) {
    if (in_array($val, $duplicates)) {
        $duplicates_keys[] = $key;
    }
}
This works, but it's pretty resource intensive. Is there a faster way to do this?
As per my comment, I doubt this is a bottleneck. However, you can reduce your iterations to a single pass through the array, as follows:
$temp = [];
$dup = [];
foreach ($in as $key => $val) {
    if (isset($temp[$val])) {
        $dup[] = $key;
    } else {
        $temp[$val] = 0;
    }
}
Note that the value is set as an array key in temp, so you can use O(1) isset rather than in_array, which must search the full array until the value is found.
This is theoretically faster than your example, but you would need to profile it to be sure (as you should have already done to ascertain your current code is slow).
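Note that the single-pass loop above records only the keys of the second and later occurrences, while the code in the question also includes the key of the first occurrence. If you need every key whose value is duplicated, one possible sketch uses array_count_values() and a second pass. The sample array is hypothetical, and this assumes all values are strings or integers, since array_count_values() only counts those.

```php
<?php
// Hypothetical input: 'x' and 'y' each appear twice, 'z' once.
$in = ['a' => 'x', 'b' => 'y', 'c' => 'x', 'd' => 'z', 'e' => 'y'];

// First pass (inside the built-in): count how often each value occurs.
$counts = array_count_values($in);

// Second pass: keep every key whose value occurs more than once.
$duplicateKeys = [];
foreach ($in as $key => $val) {
    if ($counts[$val] > 1) {
        $duplicateKeys[] = $key;
    }
}

print_r($duplicateKeys); // keys a, b, c, e in their original order
```

Both passes are linear, and each lookup in $counts is a hash lookup rather than an in_array() scan.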
You can probably do something else that has a far greater impact, such as caching or a better database query.
Use array_intersect() for this. It preserves the keys of $in, so wrap it in array_keys() to get the keys themselves:
$duplicates_keys = array_keys(array_intersect($in, $duplicates));
I can't find an answer to this anywhere.
foreach ($multiarr as $array) {
    foreach ($array as $key => $val) {
        $newarray[$key] = $val;
    }
}
Say $key has duplicate names, so when I try to push into $newarray it actually looks like this:
$newarray['Fruit'] = 'Apples';
$newarray['Fruit'] = 'Bananas';
$newarray['Fruit'] = 'Oranges';
The problem is, the above example just replaces the old value, instead of pushing into it.
Is it possible to push values like this?
Yes, notice the new pair of square brackets:
foreach ($multiarr as $array) {
    foreach ($array as $key => $val) {
        $newarray[$key][] = $val;
    }
}
You may also use array_push(), introducing a bit of overhead, but I'd stick with the shorthand most of the time.
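For reference, a sketch of the array_push() variant. Unlike the `[]` shorthand, array_push() needs its first argument to already be an array (PHP 8 throws a TypeError otherwise), so the sub-array is initialized first. The sample data is hypothetical.

```php
<?php
// Hypothetical input with a repeated 'Fruit' key.
$multiarr = [
    ['Fruit' => 'Apples'],
    ['Fruit' => 'Bananas'],
    ['Veg'   => 'Carrot'],
];

$newarray = [];
foreach ($multiarr as $array) {
    foreach ($array as $key => $val) {
        // array_push() will not create the sub-array for us.
        if (!isset($newarray[$key])) {
            $newarray[$key] = [];
        }
        array_push($newarray[$key], $val);
    }
}

print_r($newarray);
```

The extra isset() check is one more reason the shorthand is usually the more convenient choice.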
I'll offer an alternative to moonwave99's answer and explain how it is subtly different.
The following technique unpacks the indexed array of associative arrays and serves each subarray as a separate parameter to array_merge_recursive() which performs the merging "magic".
Code:
$multiarr = [
    ['Fruit' => 'Apples'],
    ['Fruit' => 'Bananas'],
    ['Fruit' => 'Oranges'],
    ['Veg' => 'Carrot'],
    //['Veg' => 'Leek'],
];
var_export(
    array_merge_recursive(...$multiarr)
);
As you recursively merge, a subarray is not used if there is only one value for a given key; if there are multiple values for a key, a subarray is used.
See this in action by uncommenting the Leek element.
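If that mixed shape is inconvenient, one possible way to even it out is to cast every merged entry to an array afterwards. This is only a sketch: it assumes the leaf values are scalars (an (array) cast behaves differently for objects), and the trimmed-down sample data is hypothetical.

```php
<?php
// Hypothetical input: 'Fruit' occurs twice, 'Veg' only once.
$multiarr = [
    ['Fruit' => 'Apples'],
    ['Fruit' => 'Bananas'],
    ['Veg'   => 'Carrot'],
];

// 'Fruit' becomes a subarray, 'Veg' stays a plain string.
$merged = array_merge_recursive(...$multiarr);

// Cast each entry so every key maps to an array of values.
// array_map() preserves string keys when given a single array.
$normalized = array_map(function ($item) {
    return (array) $item;
}, $merged);

var_export($normalized);
```

After the cast, consumers can loop over every entry the same way regardless of how many values a key collected.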
P.S. If you know that you are only targeting a single column of data and you know the key that you are targeting, then array_column() would be a wise choice.
Code:
var_export(
    ['Fruit' => array_column($multiarr, 'Fruit')]
);
I have a large array of arrays, and each of these sub-arrays has an ID and some other info. Is there a way to access an array of just the IDs without using a loop?
Sort of like
$array[ALLOFTHEITEMS][Id];
I want to eventually compare these IDs to another flat array of IDs.
I would usually do a for loop and then just add the id of each item to a new array and then compare them. But is there a faster way?
I'm not sure if it's faster than foreach, as I've never benchmarked it, but an alternative to foreach would be:
PHP >= 5.3
$ids = array_map(function($data) { return $data['id']; }, $array);
PHP < 5.3
function reduceToIds($data) {
    return $data['id'];
}
$ids = array_map('reduceToIds', $array);
I normally use the foreach approach myself though.
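On PHP 5.5 and later, array_column() is another option for pulling out the IDs, and array_intersect() / array_diff() can then do the comparison against the flat list. A minimal sketch, with hypothetical sample data and variable names:

```php
<?php
// Hypothetical rows and a hypothetical flat list of IDs to compare against.
$array = [
    ['id' => 3, 'name' => 'a'],
    ['id' => 7, 'name' => 'b'],
    ['id' => 9, 'name' => 'c'],
];
$otherIds = [7, 9, 11];

// Extract the 'id' column without writing an explicit loop.
$ids = array_column($array, 'id');

// IDs present in both lists, and IDs that appear only in $array.
$inBoth   = array_values(array_intersect($ids, $otherIds));
$onlyHere = array_values(array_diff($ids, $otherIds));

print_r($inBoth);
print_r($onlyHere);
```

array_values() is there because array_intersect() and array_diff() keep the original keys, which can leave gaps.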
I'm trying to pass key/value pairs within PHP:
// "initialize"
private $variables;
// append
$this->variables[] = array($key => $value);
// parse
foreach ( $variables as $key => $value ) {
//..
}
But it seems that new arrays are added instead of the key/value pair being appended, and the iteration does not work as expected. Please let me know what the proper way is.
Solution
$this->variables[$key] = $value;
did the trick - the iteration worked as described above.
I think you may be looking for:
$this->variables[$key] = $value;
The way you have it right now you are creating an array of arrays, so you would have to do this:
foreach ($this->variables as $tuple) {
    // Each $tuple is a one-element array, so read its key and value directly.
    $key = key($tuple);
    $value = current($tuple);
}
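Putting the pieces together, a minimal sketch of a class that stores pairs by key and iterates over them. The class name Template and the methods set() and render() are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical class; only the $variables handling mirrors the question.
class Template
{
    // Initialize to an empty array so foreach works even before any set().
    private $variables = array();

    public function set($key, $value)
    {
        // Assign by key: one flat key => value map, not an array of arrays.
        $this->variables[$key] = $value;
    }

    public function render()
    {
        $parts = array();
        foreach ($this->variables as $key => $value) {
            $parts[] = "$key=$value";
        }
        return implode('&', $parts);
    }
}

$t = new Template();
$t->set('a', 1);
$t->set('b', 2);
echo $t->render();
```

Setting the same key twice overwrites the earlier value, which is usually what you want for named variables.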
The following refers to Perl, but it helps explain the difference between hashes and arrays:
Some people think that hashes are like arrays (the old name 'associative array' also indicates this, and in some other languages, such as PHP, there is no difference between arrays and hashes.), but there are two major differences between arrays and hashes. Arrays are ordered, and you access an element of an array using its numerical index. Hashes are un-ordered and you access a value using a key which is a string.
Source: http://perlmaven.com/perl-hashes
After using array_unique(), an array without the duplicate values is returned. However, it appears that the keys of the removed elements are dropped as well, which leaves gaps in an array with numerical indexes (although this is fine for an associative array). If I iterate using a for loop, I have to account for the missing indexes, or else copy the values to a new array, but that seems clumsy.
$foo = array_values($foo); will renumber an array for you.
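A short sketch of the gap and the renumbering, using a hypothetical input array:

```php
<?php
// Hypothetical input with duplicates at indexes 2 and 4.
$foo = ['a', 'b', 'a', 'c', 'b'];

// array_unique() keeps the first occurrence of each value and its key,
// so index 2 is missing and 'c' stays at index 3.
$unique = array_unique($foo);

// array_values() discards the old keys and renumbers from 0.
$renumbered = array_values($unique);

var_export($unique);
var_export($renumbered);
```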
Instead of using for loops it sounds like you should use foreach loops. Apparently you don't care about indexes anyway since you are renumbering them.
This loop:
for ($i = 0; $i < $loopSize; $i++)
{
    process($myArray[$i]);
}
turns into
foreach ($myArray as $key => $value)
{
    process($value);
    /** or process($myArray[$key]); */
}
or even more simply
foreach ($myArray as $value)
{
    process($value);
}
In the few cases where I've tried using for instead of foreach, I soon regretted it.
It really can always be avoided. You can even use foreach but ignore the values and work with the keys, almost forgetting that it's a foreach rather than a for, while avoiding any gaps in your keys and having your bounds taken care of automatically, without length/min/max functions or anything.
For example:
foreach ($myArray as $key => $val)
{
    $myArray[$key] = myFunction($myArray[$key]);
}
I've particularly found this useful with parallel arrays.
$a = getA(); $b = getB();
foreach ($a as $key => $val)
{
    $sql = "INSERT INTO table (field1, field2) VALUES ($a[$key], $b[$key])";
}
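To illustrate the parallel-array case without a database, here is a sketch that pairs up two arrays sharing the same keys, including a gap. The data is hypothetical, and the array_key_exists() guard is an added assumption for the case where the second array is missing a key.

```php
<?php
// Hypothetical parallel arrays with a gap at index 1 (e.g. after unset()).
$a = [0 => 'x0', 2 => 'x2', 3 => 'x3'];
$b = [0 => 'y0', 2 => 'y2', 3 => 'y3'];

$pairs = [];
foreach ($a as $key => $val) {
    // Skip keys that exist in $a but not in $b.
    if (!array_key_exists($key, $b)) {
        continue;
    }
    $pairs[] = [$val, $b[$key]];
}

print_r($pairs);
```

A for loop counting from 0 to count($a) - 1 would hit the missing index 1 and never reach index 3, which is the problem foreach sidesteps here.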