Optimizing big array manipulation in PHP inside a function

I want to modify a big array inside a function, so I'm pretty sure I need to use references there, but I'm not sure which of these two alternatives is better (more performant, but also maybe with side effects?):
Option 1:
$array1 = getSomeBigArray();
$array2 = getAnotherBigArray();
$results[] = combineArrays($array1, $array2);

function combineArrays(&$array1, $array2) {
    // this is not important, just an example of a modification
    foreach ($array2 as $value) {
        if ($value > 0) {
            $array1[] = $value;
        }
    }
    return $array1; // will returning $array1 make a copy?
}
Option 2:
$array1 = getSomeBigArray();
$array2 = getAnotherBigArray();
combineArrays($array1, $array2);
$results[] = $array1;

function combineArrays(&$array1, $array2) {
    foreach ($array2 as $value) {
        if ($value > 0) {
            $array1[] = $value;
        }
    }
    // void function
}
EDIT:
I have run some tests and now I'm more confused.
This is the test:
https://ideone.com/v7sepC
From those results it seems faster not to use references at all, and when references are used, Option 1 (with return) is faster.
But in my local env using references seems to be faster (though not by much).
EDIT 2:
Maybe there is a problem with ideone.com? Because running the same test here:
https://3v4l.org/LaffP
the result is:
Option 1 and Option 2 (references) are almost equal and faster than passing by value.

code 1: 1000000 values, resources: 32
code 1: 10000000 values, resources: 67
code 2: 1000000 values, resources: 27
code 2: 2000000 values, resources: 49
I calculated the system resource usage by calling getrusage(). Code 2 seems to be more performant. You can use the following code to run some tests yourself:
<?php
function getSomeBigArray() {
    $arr = [];
    for ($i = 0; $i < 2000000; $i++) {
        $arr[] = $i;
    }
    return $arr;
}

function rutime($ru, $rus, $index) {
    return ($ru["ru_$index.tv_sec"] * 1000 + intval($ru["ru_$index.tv_usec"] / 1000))
         - ($rus["ru_$index.tv_sec"] * 1000 + intval($rus["ru_$index.tv_usec"] / 1000));
}

$array1 = getSomeBigArray();
$array2 = getSomeBigArray();
$rustart = getrusage();
$results[] = combineArrays($array1, $array2);
$ru = getrusage();
echo rutime($ru, $rustart, "utime");

function combineArrays(&$array1, $array2) {
    // The array combining method.
}
Note: the rutime() method was copied from the accepted answer of the following Stack Overflow post: Tracking the script execution time in PHP

When you do return $array1; (in the first option), it does not copy the array; it only increases the reference counter and returns a reference to the same array.
I.e. the function's return value and $array1 will point to the same array in memory. Only when you modify one of them will the data actually be copied (copy-on-write).
The same happens when you assign $results[] = $array1;: no data is actually copied, only a reference is put into a new element of $results.
In the end, both options have the same result: you'll have references to the same data in $array1 and in the last item of $results. Therefore, there is no notable performance difference between the two options.
Also, consider using native functions to perform typical actions, e.g. array_merge().
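For the combineArrays() example in the question, a native-function version might look like this (just a sketch; it assumes the goal is only to append the positive values of $array2 to $array1, and the combineArraysNative() name is mine, not from the question):
function combineArraysNative(array $array1, array $array2) {
    // keep only the positive values of $array2 ...
    $positives = array_filter($array2, function ($value) {
        return $value > 0;
    });
    // ... and append them to $array1 (array_values() renumbers the filtered keys)
    return array_merge($array1, array_values($positives));
}

$results[] = combineArraysNative($array1, $array2);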

Related

Search Multiple Arrays for

I have officially hit a wall and I cannot figure out the solution to this issue. Any help would be much appreciated! I have tried array_intersect(), but it just keeps running against the first array in the function, so that won't work.
I have an arbitrary number of arrays (I'll show 4 for demonstration purposes), for example:
// 1.
array(1,2,3,4,5);
// 2.
array(1,3,5);
// 3.
array(1,3,4,5);
// 4.
array(1,3,5,6,7,8,9);
I need to figure out how to search all the arrays and find only the numbers that exist in all 4 arrays. In this example I need to only pull out the values from the arrays - 1, 3 & 5.
PS: In all reality, it would be best if the function could search against a multi dimensional array and extract only the numbers that match in all the arrays within the array.
Thanks so much for your help!
Fun question! This worked:
function arrayCommonFind($multiArray) {
    $result = $multiArray[0];
    $count = count($multiArray);
    for ($i = 1; $i < $count; $i++) {
        foreach ($result as $key => $val) {
            if (!in_array($val, $multiArray[$i])) {
                unset($result[$key]);
            }
        }
    }
    return $result;
}
Note that you can just use $multiArray[0] (or any sub-array) as a baseline and check all the others against that since any values that will be in the final result must necessarily be in all individual subarrays.
How about this?
Find the numbers that exist in both array 1 and 2. Then compare those results with array 3 to find the common numbers again. Keep going as long as you want.
Is this what you are getting at?
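A rough sketch of that idea, assuming the arrays from the question live in hypothetical variables $array1 through $array4 (array_intersect() does each pairwise comparison):
$common = array_intersect($array1, $array2); // values present in both array 1 and 2
$common = array_intersect($common, $array3); // narrow against array 3
$common = array_intersect($common, $array4); // ...and so on for as many arrays as you have
print_r(array_values($common));              // 1, 3, 5 for the example arrays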
If it's in a multidimensional array you could do:
$multiDimensional = array(/* Your arrays */);
$found = array_pop($multiDimensional);
foreach ($multiDimensional as $subArray) {
    foreach ($found as $key => $element) {
        if (!in_array($element, $subArray)) {
            unset($found[$key]);
        }
    }
}
Per your comment on my other question here is a better solution:
<?php
// 1. merge the arrays
$merged_arrays = array_merge($arr1, $arr2, $arr3, $arr4 /* , ... */);
// 2. count the values
$merged_count = array_count_values($merged_arrays);
// 3. walk the result looking for elements that only matched once
foreach ($merged_count as $key => $value) {
    if ($value == 1) {
        // 4. unset the values that didn't intersect
        unset($merged_count[$key]);
    }
}
// 5. print the resulting array
print_r($merged_count);
Performing iterated in_array() calls followed by unset() is excessive handling and it overlooks the magic of array_intersect() which really should be the hero of any solid solution for this case.
Here is a lean iterative function:
Code: (Demo)
function array_intersect_multi($arrays) { // iterative method
    while (sizeof($arrays) > 1) {
        // find common values from the first and second subarrays, store as (overwrite) the second subarray
        $arrays[1] = array_intersect($arrays[0], $arrays[1]);
        array_shift($arrays); // discard the first subarray (reindex $arrays)
    }
    return implode(', ', $arrays[0]);
}
echo array_intersect_multi([[1,2,3,4,5], [1,3,5], [1,3,4,5], [1,3,5,6,7,8,9]]);
// output: 1, 3, 5
This assumes you will package the individual arrays into an indexed array of arrays.
I also considered a recursive function, but recursion is slower and uses more memory.
function array_intersect_multi($arrays) { // recursive method
    if (sizeof($arrays) > 1) {
        // find common values from the first and second subarrays, store as (overwrite) the second subarray
        $arrays[1] = array_intersect($arrays[0], $arrays[1]);
        array_shift($arrays); // discard the first subarray (reindex $arrays)
        return array_intersect_multi($arrays); // recurse
    }
    return implode(', ', $arrays[0]);
}
Furthermore, if you are happy to flatten your arrays into one with array_merge() and declare the number of individual arrays being processed, you can use this:
(fastest method)
Code: (Demo)
function flattened_array_intersect($array, $total_arrays) {
    return implode(', ', array_keys(array_intersect(array_count_values($array), [$total_arrays])));
}
echo flattened_array_intersect(array_merge([1,2,3,4,5], [1,3,5], [1,3,4,5], [1,3,5,6,7,8,9]), 4);
or replace array_intersect() with array_filter() (slightly slower and more verbose):
function flattened_array_intersect($array, $total_arrays) {
    return implode(', ', array_keys(array_filter(array_count_values($array), function ($v) use ($total_arrays) {
        return $v == $total_arrays;
    })));
}
echo flattened_array_intersect(array_merge([1,2,3,4,5], [1,3,5], [1,3,4,5], [1,3,5,6,7,8,9]), 4);

Searching multi-dimensional array's keys using another array

Is there an elegant way of getting values from a massive multi-dimensional array using another array for the keys to lookup?
e.g.
$cats['A']['A1']['A11']['A111'] = $val;
$cats['A']['A1']['A11']['A112'] = $val;
$cats['A']['A1']['A12'] = $val;
$cats['A']['A1']['A12']['A121'] = $val;
$cats['A']['A2'] = $val;
$cats['A']['A2']['A21'] = $val;
$cats['A']['A2']['A22'] = $val;
$cats['A']['A2']['A22']['A221'] = $val;
$cats['A']['A2']['A22']['A222'] = $val;
access values from $cats using $keys = Array ('A', 'A2', 'A22', 'A221');
without checking the length of $keys and doing something like...
switch (count($keys)) {
    case 1: $val = $cats[$keys[0]]; break;
    case 2: $val = $cats[$keys[0]][$keys[1]]; break;
    case 3: $val = $cats[$keys[0]][$keys[1]][$keys[2]]; break;
    // ...
}
many thanks.
Why not use recursion? Something like this:
function get_val($array, $keys) {
    if (empty($keys) || !is_array($keys) || !is_array($array)) {
        return $array;
    } else {
        $first_key = array_shift($keys);
        return get_val($array[$first_key], $keys);
    }
}
I originally had this written as a loop, but changed it to a recursive version for some reason. It's true, as yeoman said, that a recursive function is more likely than a loop to cause a stack overflow, especially if your array is sufficiently deep (PHP does not support tail-call optimization), so here's a loop that should accomplish the same purpose:
// given a multidimensional array $array and a single-dimensional array of keys $keys
$desired_value = $array;
while (count($keys) > 0) {
    $first_key = array_shift($keys);
    $desired_value = $desired_value[$first_key];
}
That's fine so far. Otherwise you would need to iterate through the array and check its depth. To make it dynamic, I assume you add the keys to the $keys array while constructing $cats. A recursive solution will also take more steps and more memory.
jburbage's suggestion of using recursion is OK in principle, but as far as I know, PHP doesn't support tail-recursion optimization.
And the question was about a "massive" multidimensional array.
As "massive" suggests great depth in addition to great overall size, it's possible to run into a stack overflow with this solution, as it's usually possible to create data structures on the heap that reach deeper than the stack can cope with via recursion.
The approach is also not desirable from the performance point of view in this case.
Simply refactor jburbage's recursive solution to work in a loop, and you're almost there :-)
Here's jburbage's original suggested code once again:
function get_val($array, $keys) {
    if (empty($keys) || !is_array($keys) || !is_array($array)) {
        return $array;
    } else {
        $first_key = array_shift($keys);
        return get_val($array[$first_key], $keys);
    }
}
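For completeness, a loop-based refactor along those lines might look like this (a sketch; the get_val_iterative() name and the null fallback for a missing key are my own additions, not from the original answer):
function get_val_iterative($array, $keys) {
    // walk the key path one level at a time instead of recursing
    foreach ($keys as $key) {
        if (!is_array($array) || !array_key_exists($key, $array)) {
            return null; // the path does not exist
        }
        $array = $array[$key];
    }
    return $array;
}

// usage: $val = get_val_iterative($cats, array('A', 'A2', 'A22', 'A221'));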

Using PHP, remove duplicates from an array without using any built-in functions?

Let's say I have an array as follows:
$sampArray = array(1, 4, 2, 1, 6, 4, 9, 7, 2, 9);
I want to remove all the duplicates from this array, so the result should be as follows:
$resultArray = array(1, 4, 2, 6, 9, 7);
But here is the catch!!! I don't want to use any PHP built-in functions like array_unique().
How would you do it ? :)
Here is a simple O(n)-time solution:
$uniqueme = array();
foreach ($array as $key => $value) {
    $uniqueme[$value] = $key;
}
$final = array();
foreach ($uniqueme as $key => $value) {
    $final[] = $key;
}
You cannot have duplicate keys, and this will retain the order.
A serious (working) answer:
$inputArray = array(1, 4, 2, 1, 6, 4, 9, 7, 2, 9);
$outputArray = array();
foreach ($inputArray as $inputArrayItem) {
    foreach ($outputArray as $outputArrayItem) {
        if ($inputArrayItem == $outputArrayItem) {
            continue 2;
        }
    }
    $outputArray[] = $inputArrayItem;
}
print_r($outputArray);
This depends on the operations you have available.
If all you have to detect duplicates is a function that takes two elements and tells whether they are equal (one example is the == operator in PHP), then you must compare every new element with all the non-duplicates you have found before. The solution is quadratic: in the worst case (there are no duplicates), you need to do n(n-1)/2 comparisons.
If your arrays can hold any kind of value, this is more or less the only solution available (see below).
If you have a total order for your values, you can sort the array (n*log(n)) and then eliminate consecutive duplicates (linear). Note that you cannot use the <, >, etc. operators from PHP for this; they do not define a total order. Unfortunately, array_unique does this, and it can fail because of that.
If you have a hash function that you can apply to your values, then you can do it in average linear time with a hash table (which is the data structure behind a PHP array). See
tandu's answer.
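As an illustration of the sort-and-eliminate approach described above (note it uses the built-in sort(), so it only demonstrates the algorithm, not the question's "no built-in functions" constraint, and it does not preserve the original order):
$array = array(1, 4, 2, 1, 6, 4, 9, 7, 2, 9);
sort($array);                       // O(n log n)

$unique = array();
$previous = null;
$hasPrevious = false;
foreach ($array as $value) {        // linear scan: skip consecutive duplicates
    if (!$hasPrevious || $value !== $previous) {
        $unique[] = $value;
    }
    $previous = $value;
    $hasPrevious = true;
}
var_dump($unique);                  // array(1, 2, 4, 6, 7, 9)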
Edit2: The versions below use a hashmap to determine if a value already exists. In case this is not possible, here is another variant that safely works with all PHP values and does a strict comparison (Demo):
$array = array(1, 4, 2, 1, 6, 4, 9, 7, 2, 9);
$unique = function ($a) {
    $u = array();
    foreach ($a as $v) {
        foreach ($u as $vu) {
            if ($vu === $v) {
                continue 2;
            }
        }
        $u[] = $v;
    }
    return $u;
};
var_dump($unique($array)); # array(1,4,2,6,9,7)
Edit: Same version as below, but without built-in functions, only language constructs (Demo):
$array = array(1, 4, 2, 1, 6, 4, 9, 7, 2, 9);
$unique = array();
foreach ($array as $v) {
    isset($k[$v]) || ($k[$v] = 1) && $unique[] = $v;
}
var_dump($unique); # array(1,4,2,6,9,7)
And in case you don't want to have the temporary arrays spread around, here is a variant with an anonymous function:
$array = array (1,4,2,1,6,4,9,7,2,9);
$unique = function($a) /* similar as above but more expressive ... ... you have been warned: */ {for($v=reset($a);$v&&(isset($k[$v])||($k[$v]=1)&&$u[]=$v);$v=next($a));return$u;};
var_dump($unique($array)); # array(1,4,2,6,9,7)
At first I missed that you don't want to use array_unique() or similar functions (array_intersect(), etc.), so this was just a start; maybe it's still of some use:
You can use array_flip() in combination with array_keys() for your array of integers (Demo):
$array = array (1,4,2,1,6,4,9,7,2,9);
$array = array_keys(array_flip($array));
var_dump($array); # array(1,4,2,6,9,7)
As keys can only exist once in a PHP array and array_flip() retains the order, you will get your result. As those are built-in functions it's pretty fast and there is not much to iterate over to get the job done.
<?php
$inputArray = array(1, 4, 2, 1, 6, 4, 9, 7, 2, 9);
$outputArray = array();
foreach ($inputArray as $val) {
    if (!in_array($val, $outputArray)) {
        $outputArray[] = $val;
    }
}
print_r($outputArray);
You could use an intermediate array into which you add each item in turn. Prior to adding the item, you could check if it already exists by looping through the new array.
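A sketch of that approach without any built-in functions (the inner loop plays the role of in_array()), using the arrays from the question:
$sampArray = array(1, 4, 2, 1, 6, 4, 9, 7, 2, 9);
$resultArray = array();

foreach ($sampArray as $item) {
    $exists = false;
    // loop through the new array to see if the item was already added
    foreach ($resultArray as $existing) {
        if ($existing == $item) {
            $exists = true;
            break;
        }
    }
    if (!$exists) {
        $resultArray[] = $item;
    }
}
// $resultArray is now array(1, 4, 2, 6, 9, 7)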

PHP best way to MD5 multi-dimensional array?

What is the best way to generate an MD5 (or any other hash) of a multi-dimensional array?
I could easily write a loop which would traverse through each level of the array, concatenating each value into a string, and simply performing the MD5 on the string.
However, this seems cumbersome at best and I wondered if there was a funky function which would take a multi-dimensional array, and hash it.
(Copy-n-paste-able function at the bottom)
As mentioned prior, the following will work.
md5(serialize($array));
However, it's worth noting that (ironically) json_encode performs noticeably faster:
md5(json_encode($array));
In fact, the speed increase is two-fold here as (1) json_encode alone performs faster than serialize, and (2) json_encode produces a smaller string and therefore less for md5 to handle.
Edit: Here is evidence to support this claim:
<?php //this is the array I'm using -- it's multidimensional.
$array = unserialize('a:6:{i:0;a:0:{}i:1;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:0:{}}}i:2;s:5:"hello";i:3;a:2:{i:0;a:0:{}i:1;a:0:{}}i:4;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:0:{}}}}}}}i:5;a:5:{i:0;a:0:{}i:1;a:4:{i:0;a:0:{}i:1;a:0:{}i:2;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:0:{}}i:3;a:6:{i:0;a:0:{}i:1;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:0:{}}}i:2;s:5:"hello";i:3;a:2:{i:0;a:0:{}i:1;a:0:{}}i:4;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:0:{}}}}}}}i:5;a:5:{i:0;a:0:{}i:1;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:0:{}}}i:2;s:5:"hello";i:3;a:2:{i:0;a:0:{}i:1;a:0:{}}i:4;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:0:{}}}}}}}}}}i:2;s:5:"hello";i:3;a:2:{i:0;a:0:{}i:1;a:0:{}}i:4;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:0:{}}}}}}}}}');
//The serialize test
$b4_s = microtime(1);
for ($i = 0; $i < 10000; $i++) {
    $serial = md5(serialize($array));
}
echo 'serialize() w/ md5() took: ' . ($sTime = microtime(1) - $b4_s) . ' sec<br/>';

//The json test
$b4_j = microtime(1);
for ($i = 0; $i < 10000; $i++) {
    $serial = md5(json_encode($array));
}
echo 'json_encode() w/ md5() took: ' . ($jTime = microtime(1) - $b4_j) . ' sec<br/><br/>';

echo 'json_encode is <strong>' . round(($sTime / $jTime) * 100, 1) . '%</strong> faster with a difference of <strong>' . ($sTime - $jTime) . ' seconds</strong>';
JSON_ENCODE is consistently over 250% (2.5x) faster (often over 300%) -- this is not a trivial difference. You may see the results of the test with this live script here:
http://nathanbrauer.com/playground/serialize-vs-json.php
http://nathanbrauer.com/playground/plain-text/serialize-vs-json.php
Now, one thing to note is that array(1,2,3) will produce a different MD5 than array(3,2,1). If this is NOT what you want, try the following code:
//Optionally make a copy of the array (if you want to preserve the original order)
$original = $array;
array_multisort($array);
$hash = md5(json_encode($array));
Edit: There's been some question as to whether reversing the order would produce the same results. So, I've done that (correctly) here:
http://nathanbrauer.com/playground/json-vs-serialize.php
http://nathanbrauer.com/playground/plain-text/json-vs-serialize.php
As you can see, the results are exactly the same. Here's the (corrected) test originally created by someone related to Drupal:
http://nathanjbrauer.com/playground/drupal-calculation.php
http://nathanjbrauer.com/playground/plain-text/drupal-calculation.php
And for good measure, here's a function/method you can copy and paste (tested in 5.3.3-1ubuntu9.5):
function array_md5(array $array) {
    // since we're inside a function (which uses a copied array, not
    // a referenced array), you shouldn't need to copy the array
    array_multisort($array);
    return md5(json_encode($array));
}
md5(serialize($array));
I'm joining a very crowded party by answering, but there is an important consideration that none of the extant answers address. The value of json_encode() and serialize() both depend upon the order of elements in the array!
Here are the results of not sorting and sorting the arrays, on two arrays with identical values but added in a different order (code at bottom of post):
serialize()
1c4f1064ab79e4722f41ab5a8141b210
1ad0f2c7e690c8e3cd5c34f7c9b8573a
json_encode()
db7178ba34f9271bfca3a05c5dddf502
c9661c0852c2bd0e26ef7951b4ca9e6f
Sorted serialize()
1c4f1064ab79e4722f41ab5a8141b210
1c4f1064ab79e4722f41ab5a8141b210
Sorted json_encode()
db7178ba34f9271bfca3a05c5dddf502
db7178ba34f9271bfca3a05c5dddf502
Therefore, the two methods that I would recommend to hash an array would be:
// You will need to write your own deep_ksort(), or see
// my example below
md5( serialize(deep_ksort($array)) );
md5( json_encode(deep_ksort($array)) );
The choice of json_encode() or serialize() should be determined by testing on the type of data that you are using. By my own testing on purely textual and numerical data, if the code is not running a tight loop thousands of times then the difference is not even worth benchmarking. I personally use json_encode() for that type of data.
Here is the code used to generate the sorting test above:
$a = array();
$a['aa'] = array( 'aaa'=>'AAA', 'bbb'=>'ooo', 'qqq'=>'fff',);
$a['bb'] = array( 'aaa'=>'BBBB', 'iii'=>'dd',);
$b = array();
$b['aa'] = array( 'aaa'=>'AAA', 'qqq'=>'fff', 'bbb'=>'ooo',);
$b['bb'] = array( 'iii'=>'dd', 'aaa'=>'BBBB',);
echo " serialize()\n";
echo md5(serialize($a))."\n";
echo md5(serialize($b))."\n";
echo "\n json_encode()\n";
echo md5(json_encode($a))."\n";
echo md5(json_encode($b))."\n";
$a = deep_ksort($a);
$b = deep_ksort($b);
echo "\n Sorted serialize()\n";
echo md5(serialize($a))."\n";
echo md5(serialize($b))."\n";
echo "\n Sorted json_encode()\n";
echo md5(json_encode($a))."\n";
echo md5(json_encode($b))."\n";
My quick deep_ksort() implementation fits this case, but check it before using it in your own projects:
/*
 * Sort an array by keys, and additionally sort its array values by keys.
 *
 * Does not try to sort an object, but does iterate its properties to
 * sort arrays in properties.
 */
function deep_ksort($input)
{
    if (!is_object($input) && !is_array($input)) {
        return $input;
    }

    foreach ($input as $k => $v) {
        if (is_object($v) || is_array($v)) {
            $input[$k] = deep_ksort($v);
        }
    }

    if (is_array($input)) {
        ksort($input);
    }

    // Do not sort objects
    return $input;
}
The answer highly depends on the data types of the array values.
For big strings use:
md5(serialize($array));
For short strings and integers use:
md5(json_encode($array));
Four built-in PHP functions can transform an array into a string:
serialize(), json_encode(), var_export(), print_r().
Notice: json_encode() slows down while processing associative arrays with strings as values. In that case, consider using serialize().
Test results for multi-dimensional array with md5-hashes (32 char) in keys and values:
Test name Repeats Result Performance
serialize 10000 0.761195 sec +0.00%
print_r 10000 1.669689 sec -119.35%
json_encode 10000 1.712214 sec -124.94%
var_export 10000 1.735023 sec -127.93%
Test result for numeric multi-dimensional array:
Test name Repeats Result Performance
json_encode 10000 1.040612 sec +0.00%
var_export 10000 1.753170 sec -68.47%
serialize 10000 1.947791 sec -87.18%
print_r 10000 9.084989 sec -773.04%
Associative array test source.
Numeric array test source.
Aside from Brock's excellent answer (+1), any decent hashing library allows you to update the hash in increments, so you should be able to update with each string sequentially, instead of having to build up one giant string.
See: hash_update
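A sketch of that incremental idea using PHP's hash extension (hash_init() / hash_update() / hash_final()); the array_md5_incremental() name is mine, and note that array_walk_recursive() only visits leaf values, so this ignores nesting structure and the ordering caveats discussed elsewhere on this page:
function array_md5_incremental(array $array) {
    $context = hash_init('md5');
    // feed each leaf key/value pair into the hash as it is visited,
    // instead of building one giant concatenated string first
    array_walk_recursive($array, function ($value, $key) use ($context) {
        hash_update($context, $key . '=' . $value);
    });
    return hash_final($context);
}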
md5(serialize($array));
Will work, but the hash will change depending on the order of the array (that might not matter though).
Note that serialize and json_encode act differently when it comes to numeric arrays whose keys don't start at 0, or associative arrays.
json_encode() will store such arrays as an object, so json_decode() returns an object, whereas unserialize() will return an array with exactly the same keys.
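A small illustration of that difference (the outputs in the comments are what I would expect; worth verifying on your PHP version):
$arr = array(1 => 'a', 2 => 'b');   // numeric keys that don't start at 0

$json = json_encode($arr);          // {"1":"a","2":"b"} -- encoded as a JSON object
var_dump(json_decode($json));       // stdClass object, not an array

$ser = serialize($arr);             // a:2:{i:1;s:1:"a";i:2;s:1:"b";}
var_dump(unserialize($ser));        // array with exactly the same integer keys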
I think that this could be a good tip:
class hasharray {

    public function array_flat($in, $keys = array(), $out = array()) {
        foreach ($in as $k => $v) {
            $keys[] = $k;
            if (is_array($v)) {
                $out = $this->array_flat($v, $keys, $out);
            } else {
                $out[implode("/", $keys)] = $v;
            }
            array_pop($keys);
        }
        return $out;
    }

    public function array_hash($in) {
        $a = $this->array_flat($in);
        ksort($a);
        return md5(json_encode($a));
    }
}

$h = new hasharray;
echo $h->array_hash($multi_dimensional_array);
Important note about serialize()
I don't recommend using it as part of a hashing function because it can return different results for the following examples. Check the example below:
Simple example:
$a = new \stdClass;
$a->test = 'sample';
$b = new \stdClass;
$b->one = $a;
$b->two = clone $a;
Produces
"O:8:"stdClass":2:{s:3:"one";O:8:"stdClass":1:{s:4:"test";s:6:"sample";}s:3:"two";O:8:"stdClass":1:{s:4:"test";s:6:"sample";}}"
But the following code:
<?php
$a = new \stdClass;
$a->test = 'sample';
$b = new \stdClass;
$b->one = $a;
$b->two = $a;
Output:
"O:8:"stdClass":2:{s:3:"one";O:8:"stdClass":1:{s:4:"test";s:6:"sample";}s:3:"two";r:2;}"
So instead of serializing the second object again, PHP just creates a reference "r:2;" to the first instance. It's definitely a good and correct way to serialize data, but it can lead to issues with your hashing function.
// Convert nested arrays to a simple array
$array = array();
array_walk_recursive($input, function ($a) use (&$array) {
$array[] = $a;
});
sort($array);
$hash = md5(json_encode($array));
----
These arrays have the same hash:
$arr1 = array(0 => array(1, 2, 3), 1, 2);
$arr2 = array(0 => array(1, 3, 2), 1, 2);
I didn't see this solution spelled out above, so I wanted to contribute a simpler answer. For me, I was getting the same hash until I used ksort() (key sort):
Sorted first with ksort(), then performed sha1() on a json_encode():
ksort($array);
$hash = sha1(json_encode($array)); // be mindful of UTF8
example:
$arr1 = array( 'dealer' => '100', 'direction' => 'ASC', 'dist' => '500', 'limit' => '1', 'zip' => '10601');
ksort($arr1);
$arr2 = array( 'direction' => 'ASC', 'limit' => '1', 'zip' => '10601', 'dealer' => '100', 'dist' => '5000');
ksort($arr2);
var_dump(sha1(json_encode($arr1)));
var_dump(sha1(json_encode($arr2)));
Output of altered arrays and hashes:
string(40) "502c2cbfbe62e47eb0fe96306ecb2e6c7e6d014c"
string(40) "b3319c58edadab3513832ceeb5d68bfce2fb3983"
There are several answers telling you to use json_encode(), but json_encode() doesn't work well with ISO-8859-1 strings: as soon as there is a special character, the string is cropped.
I would advise using var_export():
md5(var_export($array, true))
It is not as slow as serialize, and not as bugged as json_encode.
Currently the most up-voted answer md5(serialize($array)); doesn't work well with objects.
Consider code:
$a = array(new \stdClass());
$b = array(new \stdClass());
Even though the arrays are different (they contain different objects), they have the same hash when using md5(serialize($array));. So your hash is useless!
To avoid that problem, you can replace objects with the result of spl_object_hash() before serializing. You should also do it recursively if your array has multiple levels.
The code below also sorts arrays by keys, as dotancohen has suggested.
function replaceObjectsWithHashes(array $array)
{
    foreach ($array as &$value) {
        if (is_array($value)) {
            $value = replaceObjectsWithHashes($value);
        } elseif (is_object($value)) {
            $value = spl_object_hash($value);
        }
    }
    ksort($array);
    return $array;
}
Now you can use md5(serialize(replaceObjectsWithHashes($array))).
(Note that arrays in PHP are value types, so the replaceObjectsWithHashes() function does NOT change the original array.)
In some cases it may be better to use http_build_query() to convert the array to a string:
md5( http_build_query( $array ) );

Which is faster in PHP, $array[] = $value or array_push($array, $value)?

What's better to use in PHP for appending an array member,
$array[] = $value;
or
array_push($array, $value);
?
Though the manual says you're better off avoiding a function call, I've also read that $array[] is much slower than array_push(). What are some clarifications or benchmarks?
I personally feel like $array[] is cleaner to look at, and honestly splitting hairs over milliseconds is pretty irrelevant unless you plan on appending hundreds of thousands of strings to your array.
I ran this code:
$t = microtime(true);
$array = array();
for ($i = 0; $i < 10000; $i++) {
    $array[] = $i;
}
print microtime(true) - $t;
print '<br>';

$t = microtime(true);
$array = array();
for ($i = 0; $i < 10000; $i++) {
    array_push($array, $i);
}
print microtime(true) - $t;
The first method using $array[] is almost 50% faster than the second one.
Some benchmark results:
Run 1
0.0054171085357666 // array_push
0.0028800964355469 // array[]
Run 2
0.0054559707641602 // array_push
0.002892017364502 // array[]
Run 3
0.0055501461029053 // array_push
0.0028610229492188 // array[]
This shouldn't be surprising, as the PHP manual notes this:
If you use array_push() to add one element to the array it's better to use $array[] = because in that way there is no overhead of calling a function.
The way it is phrased I wouldn't be surprised if array_push is more efficient when adding multiple values. Out of curiosity, I did some further testing, and even for a large amount of additions, individual $array[] calls are faster than one big array_push. Interesting.
The main use of array_push() is that you can push multiple values onto the end of the array.
It says in the documentation:
If you use array_push() to add one element to the array it's better to use $array[] = because in that way there is no overhead of calling a function.
From the PHP documentation for array_push:
Note: If you use array_push() to add one element to the array it's better to use $array[] = because in that way there is no overhead of calling a function.
Word on the street is that [] is faster because no overhead for the function call. Plus, no one really likes PHP's array functions...
"Is it...haystack, needle....or is it needle haystack...ah, f*** it...[] = "
One difference is that you can call array_push() with more than two parameters, i.e. you can push more than one element at a time to an array.
$myArray = array();
array_push($myArray, 1,2,3,4);
echo join(',', $myArray);
prints 1,2,3,4
A simple $myarray[] assignment will be quicker, as you are just pushing an item onto the array without the overhead that a function call would bring.
Since "array_push" is a function and it called multiple times when it is inside the loop, it will allocate memory into the stack.
But when we are using $array[] = $value then we are just assigning a value to the array.
The second one is a function call, so generally it should be slower than using the core array-access feature. But I think even one database query within your script will outweigh 1000000 calls to array_push().
See here for a quick benchmark using 1000000 inserts: https://3v4l.org/sekeV
I just want to add: array_push(...) returns
the new number of elements in the array (PHP documentation), which can be useful and more compact than $myArray[] = ...; $total = count($myArray);.
Also, array_push(...) is meaningful when the variable is used as a stack.
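A small illustration of both points (the return value and stack-style use); the variable names are just for the example:
$stack = array();
$total = array_push($stack, 'a', 'b', 'c'); // push several values at once; returns the new count
echo $total;                                // 3

$top = array_pop($stack);                   // stack-style use: pop the last element
echo $top;                                  // "c"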
