Efficiency of accessing element in multi-dimensional array - php

I can't see to find an answer for this, but it's been bugging me for a while.
When using multidimensional arrays in php (or any language for that matter), do you get any efficiency and productivity out of saving an array of a multidimensional array as its separate array or accessing that "array" element directly from the whole array?
For example,
print_r($myArray);
Array
(
[2] => Array
(
[7.0] => Array
(
[0] => 1
[1] => 23
)
[16.0] => Array
(
[0] => 4
[1] => 28
)
)
[5] => Array
(
[17.0] => Array
(
[0] => 1
[1] => 3
)
)
)
If I needed to REPEATABLY access the array of [16.0], would it be better to just save that entry as its own array until I don't need it anymore, or is it better just to access it directly?
Option 1:
$tempArray=$myArray[2]["16.0"];
echo "Index=".$tempArray[0].";
or
Option 2:
echo "Index=".$myArray[2]["16.0"][0];
Of course this is just a tiny example, but if the values in (arbitrary) $array[.][.][n][...] are being accessed more than once, and the value of n depends on the index of a loop, is there any difference between accessing the element directly or saving that array (deep down on the layers of the array) as its own array and accessing its values that way?

If you look at the bytecode generated for these statements:
$tempArray=$myArray[2]["16.0"];
echo $tempArray[0];
FETCH_DIM_R $3 $myArray, 2
FETCH_DIM_R $4 $3, '16.0'
ASSIGN $tempArray, $4
FETCH_DIM_R $6 $tempArray, 0
ECHO $6
echo $myArray[2]["16.0"][0];
FETCH_DIM_R $7 $myArray, 2
FETCH_DIM_R $8 $7, '16.0'
FETCH_DIM_R $9 $8, 0
ECHO $9
(I've reformatted and removed the EXT_STMT markers). You can see that the only difference is the actual assignment to $tempArray. You might think that an array assignment is expensive. However PHP uses a referencing model and does a lazy copy-on-write, so it is not, and the cost is the pretty much same whatever the array size.
(Except of course if you change elements after the assignemt, so $tempArray is no longer the same as its originating slice, and at that point the memory usage jumps as the assign triggers the clone of the slice as the references need to be split.)
OK, this approach might be worthwhile if you are doing a lot of localised readonly access to a slice to save the repeated FETCH_DIM_R lookups (the PHP compile does absolutely no local optimisation of repeated use of indexes). However, the opportunities to shoot yourself in the foot on update are significant.
Why not benchmark this yourself using a loop and microtime() and memory_get_usage()?

You can easily test this yourself by creating an immense multidimensional array with a for loop, but accessing it directly will generally be faster when indexing large arrays.
With smaller arrays (~500 items or less) it isn't going to make a noticeable difference.

Related

Check, if array A contains all items from array B, when both arrays are multidimensional

I want to check, if array A contains all the items from array B (may contain others, but must contain all), when both arrays are multidimensional, i.e. can contains different variable types.
I've seen a lot (particularly this, this, this, this, this and this, also this, this and this as well). I've read PHP doc. Everything, that I checked, fails with "Array to string conversion" notice. Especially wen using array_intersect() or array_diff().
I'm using strict error checking, so notices actually holds further execution of entire script and are something, I don't generally like and want to avoid. Is it possible in this case?
My array A is:
Array
(
[0] => content/manage/index
[Content] => Array
(
[title] =>
[type] => 5
[category] =>
[recommended] =>
[featured] =>
[status] =>
[views] =>
[last_access_date] =>
[creation_date] =>
[modification_date] =>
[availability_date] =>
[author_id] =>
)
)
My array B is:
Array
(
[0] => /content/manage/index
[Content] => Array
(
[type] => 1
)
)
So, is there any way I can if I can use array_intersect on multidimensional arrays containing different variable types without getting notice?
My problem (and question) came out of misunderstanding, what "Array to string conversion" notice really means. In my case, it was trying to tell me, that I'm trying to walk multidimensional array with functions designed to be used on single dimension array.
Understanding that led me to a solution within few seconds. There are many of them here, on SO, but the one given by deceze here looked the best for me. So I adopted it into the form of such function:
function recursiveArrayIntersect($array1, $array2)
{
$array1 = array_intersect_key($array1, $array2);
foreach($array1 as $key=>&$value)
{
if(is_array($value)) $value = recursiveArrayIntersect($value, $array2[$key]);
}
return $array1;
}
I adopted it to my project and my way of coding, but all the credits still goes to deceze (his answer here)!
Now I can find an intersection of virtually any array, no matter what kind of variable types it contain and no matter of, how deep it is (how many subarrays it contains).

Why does the sort order of multidimensional child arrays revert as soon as foreach loop used for sorting ends?

I have a very strange array sorting related problem in PHP that is driving me completely crazy. I have googled for hours, and still NOTHING indicates that other people have this problem, or that this should happen to begin with, so a solution to this mystery would be GREATLY appreciated!
To describe the problem/question in as few words as possible: When sorting an array based on values inside a multiple levels deeply nested array, using a foreach loop, the resulting array sort order reverts as soon as execution leaves the loop, even though it works fine inside the loop. Why is this, and how do I work around it?
Here is sample code for my problem, which should hopefully be a little more clear than the sentence above:
$top_level_array = array('key_1' => array('sub_array' => array('sub_sub_array_1' => array(1),
'sub_sub_array_2' => array(3),
'sub_sub_array_3' => array(2)
)
)
);
function mycmp($arr_1, $arr_2)
{
if ($arr_1[0] == $arr_2[0])
{
return 0;
}
return ($arr_1[0] < $arr_2[0]) ? -1 : 1;
}
foreach($top_level_array as $current_top_level_member)
{
//This loop will only have one iteration, but never mind that...
print("Inside loop before sort operation:\n\n");
print_r($current_top_level_member['sub_array']);
uasort($current_top_level_member['sub_array'], 'mycmp');
print("\nInside loop after sort operation:\n\n");
print_r($current_top_level_member['sub_array']);
}
print("\nOutside of loop (i.e. after all sort operations finished):\n\n");
print_r($top_level_array);
The output of this is as follows:
Inside loop before sort operation:
Array
(
[sub_sub_array_1] => Array
(
[0] => 1
)
[sub_sub_array_2] => Array
(
[0] => 3
)
[sub_sub_array_3] => Array
(
[0] => 2
)
)
Inside loop after sort operation:
Array
(
[sub_sub_array_1] => Array
(
[0] => 1
)
[sub_sub_array_3] => Array
(
[0] => 2
)
[sub_sub_array_2] => Array
(
[0] => 3
)
)
Outside of loop (i.e. after all sort operations finished):
Array
(
[key_1] => Array
(
[sub_array] => Array
(
[sub_sub_array_1] => Array
(
[0] => 1
)
[sub_sub_array_2] => Array
(
[0] => 3
)
[sub_sub_array_3] => Array
(
[0] => 2
)
)
)
)
As you can see, the sort order is "wrong" (i.e. not ordered by the desired value in the innermost array) before the sort operation inside the loop (as expected), then is becomes "correct" after the sort operation inside the loop (as expected).
So far so good.
But THEN, once we're outside the loop again, all of a sudden the order has reverted to its original state, as if the sort loop didn't execute at all?!?
How come this happens, and how will I ever be able to sort this array in the desired way then?
I was under the impression that neither foreach loops nor the uasort() function operated on separate instances of the items in question (but rather on references, i.e. in place), but the result above seems to indicate otherwise? And if so, how will I ever be able to perform the desired sort operation?
(and WHY doesn't anyone else than me on the entire internet seem to have this problem?)
PS.
Never mind the reason behind the design of the strange array to be sorted in this example, it is of course only a simplified PoC of a real problem in much more complex code.
Your problem is a misunderstanding of how PHP provides your "value" in the foreach construct.
foreach($top_level_array as $current_top_level_member)
The variable $current_top_level_member is a copy of the value in the array, not a reference to inside the $top_level_array. Therefore all your work happens on the copy and is discarded after the loop completes. (Actually it is in the $current_top_level_member variable, but $top_level_array never sees the changes.)
You want a reference instead:
foreach($top_level_array as $key => $value)
{
$current_top_level_member =& $top_level_array[$key];
EDIT:
You can also use the foreach by reference notation (hat tip to air4x) to avoid the extra assignment. Note that if you are working with an array of Objects, they are already passed by reference.
foreach($top_level_array as &$current_top_level_member)
To answer you question as to why PHP defaults to a copy instead of a reference, it's simply because of the rules of the language. Scalar values and arrays are assigned by value, unless the & prefix is used, and objects are always assigned by reference (as of PHP 5). And that is likely due to a general consensus that it's generally better to work with copies of everything expect objects. BUT--it is not slow like you might expect. PHP uses a lazy copy called copy on write, where it is really a read-only reference. On the first write, the copy is made.
PHP uses a lazy-copy mechanism (also called copy-on-write) that does
not actually create a copy of a variable until it is modified.
Source: http://www.thedeveloperday.com/php-lazy-copy/
You can add & before $current_top_level_member and use it as reference to the variable in the original array. Then you would be making changes to the original array.
foreach ($top_level_array as &$current_top_level_member) {

Which is more efficent PHP: array_intersect() OR array_intersect_key()

Is it more efficient to use the key intersect or the value intersect if both key and value have the same contents for example:
Array
(
[743] => 743
[744] => 744
[745] => 745
[746] => 746
[747] => 747
[748] => 748
)
Is there any difference in performance in using one or the other with the same values. Similar to the difference of using double or single quotes?
From another post: I have two unordered integer arrays, and i need to know how many integers these arrays have in common
Depending on your data (size) you
might want to use
array_intersect_key() instead of
array_intersect(). Apparently the
implementation of array_intersect
(testing php 5.3) does not use any
optimization/caching/whatsoever but
loops through the array and compares
the values one by one for each element
in array A. The hashtable lookup is
incredibly faster than that.

CakePHP returning double array from find('list) query

I'm using cakephp and am getting back a "double array" where it is giving me 2 arrays where it should be 1, I have looked into the issue as far as cakephp and can't figure it out and just want to move past this for now so I am wondering if anyone knows how to unset a second array if a variable has 2 arrays.. below is the print_r of the array, its just one variable that has this, which I find odd.. so I want to make it so there is not a 2nd set of duplicate values, if I do an array_push it pushes both values for that index into the resulting new array index so that won't work
one variable is equal to the following:
Array ( [0] => 42 [1] => 62 ) Array ( [0] => 42 [1] => 62 )
EDIT:
This is not an issue of my printing out the array twice accidentally, as I said above, with a foreach array_push of the variable, i end up with this, which is odd:
Array ( [0] => 4242 [1] => 6262 )
EDIT:
This is the cakephp database call that I am using, I know I didn't ask this in regards to cakephp but since some people think this is impossible i am posting this just so you can see what it does if you want
$specificfields_array = $this->Mymodel->find('list', array('fields' =>'Mymodel.id'),
'conditions' => array('emailgroup' => $categorynumber, 'sent' => '0');));
EDIT:
This is what a "foreach" array_push is:
$mynewarray = array();
foreach ($specificfields as $specificfields_current) {
array_push ($mynewarray, $specificfields_current);
}
A variable cannot "have two arrays". It can be one array that has two arrays nested. The scenario you describe is impossible (probably there are two print_r there or there is a < character hiding stuff – check the HTML source).
Can you post the controller, the model and the view file with your print_r calls to the http://bin.cakephp.org/ site and post the links back here so we can see all of your code?

PHP Array Efficiency and Memory Clarification

When declaring an Array in PHP, the index's may be created out of order...I.e
Array[1] = 1
Array[19] = 2
Array[4] = 3
My question. In creating an array like this, is the length 19 with nulls in between? If I attempted to get Array[3] would it come as undefined or throw an error? Also, how does this affect memory. Would the memory of 3 index's be taken up or 19?
Also currently a developer wrote a script with 3 arrays FailedUpdates[] FailedDeletes[] FailedInserts[]
Is it more efficient to do it this way, or do it in the case of an associative array controlling several sub arrays
"Failures" array(){
["Updates"] => array(){
[0] => 12
[1] => 41
}
["Deletes"] => array(){
[0] => 122
[1] => 414
[1] => 43
}
["Inserts"] => array(){
[0] => 12
}
}
Memory effiency isn't really something you need to worry about in PHP unless you're dealing with really huge arrays / huge numbers of variables.
An array in PHP isn't really like an array in C++ or a similar lower-level language; an array in PHP is a map. You have a list of keys (which must be unique and all of type string or integer), and a list of values corresponding to the keys. So the following is a legal array:
array(0 => 'butt', 1 => 'potato', 2 => 'tulip')
but so is
array(5 => 'i', 'barry' => 6, 19 => array(-1 => array(), 7 => 'smock'))
In both cases there are 3 entries in the array, hence 3 keys and 3 values.
In addition to the keys and values in the array, one array may be distinguished from another by the order in which the key/value pairs occur. If you define an array so that it has nonnegative integers as keys, this will often be the expected order. The order matters when you use constructs like foreach().
array[3] will be undefined/unset but not causing an error, and the array will use only memory for that 3 values - php isn't like C where you have to look at those things.
Accessing $arr[3] gives a notice: Notice: Undefined offset: 3 in /data/home/sjoerd/public_html/svnreps/test/a.php on line 3. You can avoid this by checking with isset() or array_key_exists().
There are no nulls stored.
Having empty elements won't take up extra memory.
Whether you should use multiple variables or an array depends on the context and how you use the variables.

Categories