This question already has answers here:
How to check if PHP array is associative or sequential?
(60 answers)
Closed last year.
I'd like to be able to pass an array to a function and have the function behave differently depending on whether it's a "list" style array or a "hash" style array. E.g.:
myfunc(array("One", "Two", "Three")); // works
myfunc(array(1=>"One", 2=>"Two", 3=>"Three")); also works, but understands it's a hash
Might output something like:
One, Two, Three
1=One, 2=Two, 3=Three
ie: the function does something differently when it "detects" it's being passed a hash rather than an array. Can you tell I'm coming from a Perl background where %hashes are different references from #arrays?
I believe my example is significant because we can't just test to see whether the key is numeric, because you could very well be using numeric keys in your hash.
I'm specifically looking to avoid having to use the messier construct of myfunc(array(array(1=>"One"), array(2=>"Two"), array(3=>"Three")))
Pulled right out of the kohana framework.
public static function is_assoc(array $array)
{
// Keys of the array
$keys = array_keys($array);
// If the array keys of the keys match the keys, then the array must
// not be associative (e.g. the keys array looked like {0:0, 1:1...}).
return array_keys($keys) !== $keys;
}
This benchmark gives 3 methods.
Here's a summary, sorted from fastest to slowest. For more informations, read the complete benchmark here.
1. Using array_values()
function($array) {
return (array_values($array) !== $array);
}
2. Using array_keys()
function($array){
$array = array_keys($array); return ($array !== array_keys($array));
}
3. Using array_filter()
function($array){
return count(array_filter(array_keys($array), 'is_string')) > 0;
}
PHP treats all arrays as hashes, technically, so there is not an exact way to do this. Your best bet would be the following I believe:
if (array_keys($array) === range(0, count($array) - 1)) {
//it is a hash
}
No, PHP does not differentiate arrays where the keys are numeric strings from the arrays where the keys are integers in cases like the following:
$a = array("0"=>'a', "1"=>'b', "2"=>'c');
$b = array(0=>'a', 1=>'b', 2=>'c');
var_dump(array_keys($a), array_keys($b));
It outputs:
array(3) {
[0]=> int(0) [1]=> int(1) [2]=> int(2)
}
array(3) {
[0]=> int(0) [1]=> int(1) [2]=> int(2)
}
(above formatted for readability)
My solution is to get keys of an array like below and check that if the key is not integer:
private function is_hash($array) {
foreach($array as $key => $value) {
return ! is_int($key);
}
return false;
}
It is wrong to get array_keys of a hash array like below:
array_keys(array(
"abc" => "gfb",
"bdc" => "dbc"
)
);
will output:
array(
0 => "abc",
1 => "bdc"
)
So, it is not a good idea to compare it with a range of numbers as mentioned in top rated answer. It will always say that it is a hash array if you try to compare keys with a range.
Being a little frustrated, trying to write a function to address all combinations, an idea clicked in my mind: parse json_encode result.
When a json string contains a curly brace, then it must contain an object!
Of course, after reading the solutions here, mine is a bit funny...
Anyway, I want to share it with the community, just to present an attempt to solve the problem from another prospective (more "visual").
function isAssociative(array $arr): bool
{
// consider empty, and [0, 1, 2, ...] sequential
if(empty($arr) || array_is_list($arr)) {
return false;
}
// first scenario:
// [ 1 => [*any*] ]
// [ 'a' => [*any*] ]
foreach ($arr as $key => $value) {
if(is_array($value)) {
return true;
}
}
// second scenario: read the json string
$jsonNest = json_encode($arr, JSON_THROW_ON_ERROR);
return str_contains($jsonNest, '{'); // {} assoc, [] sequential
}
NOTES
php#8.1 is required, check out the gist on github containing the unit test of this method + Polyfills (php>=7.3).
I've tested also Hussard's posted solutions, A & B are passing all tests, C fails to recognize: {"1":0,"2":1}.
BENCHMARKS
Here json parsing is ~200 ms behind B, but still 1.7 seconds faster than solution C!
What do you think about this version? Improvements are welcome!
Related
I have an array containing links. I am trying to cut a part of those links. For example:
$array = [
"https://eksisozluk.com/merve-sanayin-pizzaciya-kapiyi-ciplak-acmasi--5868043?a=popular",
"https://eksisozluk.com/merve-sanayin-pizzaciya-kapiyi-ciplak-acmasi--5868043?a=popular"
];
I want change these links like below:
array(2) {
[0]=>
string(91) "https://eksisozluk.com/merve-sanayin"
[1]=>
string(91) "https://eksisozluk.com/merve-sanayin"
[2]=>
}
Is there any possible way to edit array items?
Given the array:
$array = [
"https://eksisozluk.com/merve-sanayin-pizzaciya-kapiyi-ciplak-acmasi--5868043?a=popular",
"https://eksisozluk.com/merve-sanayin-pizzaciya-kapiyi-ciplak-acmasi--5868043?a=popular"
];
Using array_walk() (modifies the array in place).
Using a regular expression this time:
function filter_url(&$item)
{
preg_match('|(https:\/\/\w+\.\w{2,4}\/\w+-\w+)-.+|', $item, $matches);
$item = $matches[1];
}
array_walk($array, 'filter_url');
(See it working here).
Note that filter_url passes the first parameter by reference, as explained in the documentation, so changes to each of the array items are performed in place and affect the original array.
Using array_map() (returns a modified array)
Simply using substr, since we know next to nothing about your actual requirements:
function clean_url($item)
{
return substr($item, 0, 36);
}
$new_array = array_map('clean_url', $array);
Working here.
The specifics of how actually filter the array elements are up to you.
The example shown here seems kinda pointless, since you are setting all elements exactly to the same value. If you know the lenght you can use substr, or you could just could write a more robust regex.
Since all the elements of your input array are the same in the example, I am going to assume this doesn't represent your actual input.
You could also iterate the array using either for, foreach or while, but either of those options seems less elegant when you have specific array functions to deal with this kind of situation.
There are multiple ways. One way is to iterate over the items and clip them with substr().
$arr = array("https://eksisozluk.com/merve-sanayin-pizzaciya-kapiyi-ciplak-acmasi--5868043?a=popular",
"https://eksisozluk.com/merve-sanayin-pizzaciya-kapiyi-ciplak-acmasi--5868043?a=popular");
for ($i = 0; $i < count($arr); $i++)
{
$arr[$i] = substr($arr[$i], 0, 36);
}
I currently have a column in mysql database that stores a string with delimiters so I can convert it into an array in php.
An example string looks like this:
23,1,1|72,2,0|16,3,1|...etc.
It essentially divides it into 3 major groups with |, and 3 smaller ones with , (if there's a cleaner way, let me know):
1st number is an ID number for an article from a different table
2nd is just a number for indenting purposes
3rd is for visible or not (0 or 1).
I will have an admin section where we'll be able to re-order the major groups (i.e., move group 3 to position 2) and modify specific numbers from the sub-groups (e.g. change 72,1,0 to 72,2,0) I'm not sure how I can accomplish this.
How do I loop through these modifications while keeping the order (or new order) when reinserting into the database?
I was thinking of adding a another number to my string that would determine the position of each major group? Something like this:
1[23,1,1]2[72,2,0]3[16,3,1]
But how do I loop through this and move things around?
Any help would be greatly appreciated.
I agree with the comments about normalization, but if you insist on doing it this way, or are stuck with an existing schema you cannot alter, use the PHP serialize/unserialize functions if you can, rather than string parsing. This will at least allow you to retrieve the data into PHP and modify the array and then save it back.
http://php.net/manual/en/function.serialize.php
I'm joining all comments about the approach used to store data, but nevertheless.
This what can help you to move forward:
/** #var string $s Assumed packet */
$s = "23,3,1|72,1,0|16,2,1"; // Indent is not ordered
/** #var array $parsed Parsed packet */
$parsed = array_map(function ($segment) {
list($artId, $indent, $visibility) = explode(',', $segment);
return [
'ArticleId' => (int)$artId,
'Indent' => (int)$indent,
'Visible' => (int)$visibility !== 0
];
}, explode('|', $s));
usort($parsed, function (array $a, array $b) {
return ($a['Indent'] < $b['Indent']) ? -1 : 1;
});
You'll get following $parsed structure sorted by Indent key:
array(3) {
[0] => array(3) {
["ArticleId"]=> int(23)
["Indent"]=> int(1)
["Visible"]=> bool(true)
}
[1] => array(3) {
["ArticleId"]=> int(72)
["Indent"]=> int(2)
["Visible"]=> bool(false)
}
[2] => array(3) {
["ArticleId"]=> int(16)
["Indent"]=> int(3)
["Visible"]=> bool(true)
}
}
Thus you can alter Indent as you want just applying usort() before/after parsing.
Regarding storing this structure in database the way you decided, you can use JSON format (json_encode(),json_decode()). Such "serialization" way faster than proposed serialize() method and way more faster than $parsed approach + more readable. If you worry about redundancy you can json_encode() array without array keys and add them on parsing or use directly [0], [1], [2] knowing the correspondence beforehand.
If you use json_*() functions you can omit structure parsing bec. it will be decoded right the same you've encoded it for saving. Order can be defined on save using same usort(). This can be considered as improvement by reducing excessive sorts bec. readings/decodings will occur more frequently than saves.
I was interested with your question. So i create this. It may not like what you desired.
<?php
$string = "23,1,1|72,2,0|16,3,1|";
$explode = explode("|", $string);
$a = array();
for($x = 0;$x<count($explode)-1;$x++)
{
$a[] = $explode[$x];
$b[] =explode(',',substr($explode[$x], strpos($explode[$x], ",") + 1));
}
for($y=0;$y<count($b);$y++){
echo $b[$y][0]. '=>'. $a[$y] . '<br>';
}
?>
is there a command to know if an array has their keys as string or plain int?
Like:
$array1 = array('value1','value2','value3');
checkArr($array1); //> Returns true because there aren't keys as string
And:
$array2 = array('key1'=>'value1','key2'=>'value2','value3');
checkArr($array2); //> Returns false because there are keys as string
Note: I know I can parse all the array to check it.
The "compact" version to test this is:
$allnumeric =
array_sum(array_map("is_numeric", array_keys($array))) == count($array);
#Gumbo's suggestion is 1 letter shorter and could very well be a bit speedier for huge arrays:
count(array_filter(array_keys($array), "is_numeric")) == count($array);
You can use array_keys to obtain the keys for the array and then analyse the resultant array.
look at array_keys() if values are int - you got ints if strings -> strings
If you want to check the array's keys, it would be probably better to use something like this:
reset($array);
while (($key = key($array)) !== null) {
// check the key, for example:
if (is_string($key)) {
// ...
}
next($array);
}
This will be most performant, as there are no extraneous copies made of variables that you are not going to use.
On the other hand, this way is probably the most readable and makes the intent crystal clear:
$keys = array_keys($array);
foreach($keys as $key) {
// check the key, for example:
if (is_string($key)) {
// ...
}
}
Take your pick.
Important note:
Last time I checked, PHP would not let you have keys which are string representations of integers. For example:
$array = array();
$array["5"] = "foo";
echo $array[5]; // You might think this will not work, but it will
So keep that in mind when you are checking what the types of such keys are: they might have been created as strings, but PHP will have converted them to integers behind the scenes.
This question already has answers here:
A numeric string as array key in PHP
(11 answers)
Closed 2 years ago.
The community reviewed whether to reopen this question 1 year ago and left it closed:
Original close reason(s) were not resolved
I've come across an old app that uses an id to name type array, for example...
array(1) {
[280]=>
string(3) "abc"
}
Now I need to reorder these, and a var_dump() would make it appear that that isn't going to happen while the keys are integers.
If I add an a to every index, var_dump() will show double quotes around the key, my guess to show it is now a string...
array(1) {
["280a"]=>
string(3) "abc"
}
This would let me easily reorder them, without having to touch more code.
This does not work.
$newArray = array();
foreach($array as $key => $value) {
$newArray[(string) $key] = $value;
}
A var_dump() still shows them as integer array indexes.
Is there a way to force the keys to be strings, so I can reorder them without ruining the array?
YOU CAN'T!!
Strings containing valid integers will be cast to the integer type. E.g. the key "8" will actually be stored under 8. On the other hand "08" will not be cast, as it isn't a valid decimal integer.
Edit:
ACTUALLY YOU CAN!!
Cast sequential array to associative array
$obj = new stdClass;
foreach($array as $key => $value){
$obj->{$key} = $value;
}
$array = (array) $obj;
In most cases, the following quote is true:
Strings containing valid integers will be cast to the integer type. E.g. the key "8" will actually be stored under 8. On the other hand "08" will not be cast, as it isn't a valid decimal integer.
This examples from the PHP Docs
<?php
$array = array(
1 => "a",
"1" => "b",
1.5 => "c",
true => "d",
);
var_dump($array);
?>
The above example will output:
array(1) {
[1]=> string(1) "d"
}
So even if you were to create an array with numbered keys they would just get casted back to integers.
Unfortunately for me I was not aware of this until recently but I thought I would share my failed attempts.
Failed attempts
$arr = array_change_key_case($arr); // worth a try.
Returns an array with all keys from array lowercased or uppercased. Numbered indices are left as is.
My next attempts was to create a new array by array_combineing the old values the new (string)keys.
I tried several ways of making the $keys array contain numeric values of type string.
range("A", "Z" ) works for the alphabet so I though I would try it with a numeric string.
$keys = range("0", (string) count($arr) ); // integers
This resulted in an array full of keys but were all of int type.
Here's a couple of successful attempts of creating an array with the values of type string.
$keys = explode(',', implode(",", array_keys($arr))); // values strings
$keys = array_map('strval', array_keys($arr)); // values strings
Now just to combine the two.
$arr = array_combine( $keys, $arr);
This is when I discovered numeric strings are casted to integers.
$arr = array_combine( $keys, $arr); // int strings
//assert($arr === array_values($arr)) // true.
The only way to change the keys to strings and maintain their literal values would be to prefix the key with a suffix it with a decimal point "00","01","02" or "0.","1.","2.".
You can achieve this like so.
$keys = explode(',', implode(".,", array_keys($arr)) . '.'); // added decimal point
$arr = array_combine($keys, $arr);
Of course this is less than ideal as you will need to target array elements like this.
$arr["280."]
I've created a little function which will target the correct array element even if you only enter the integer and not the new string.
function array_value($array, $key){
if(array_key_exists($key, $array)){
return $array[ $key ];
}
if(is_numeric($key) && array_key_exists('.' . $key, $array)){
return $array[ '.' . $key ];
}
return null;
}
Usage
echo array_value($array, "208"); // "abc"
Edit:
ACTUALLY YOU CAN!!
Cast sequential array to associative array
All that for nothing
You can append the null character "\0" to the end of the array key. This makes it so PHP can't interpret the string as an integer. All of the array functions (like array_merge()) work on it. Also not even var_dump() will show anything extra after the string of integers.
Example:
$numbers1 = array();
$numbers2 = array();
$numbers = array();
$pool1 = array(111, 222, 333, 444);
$pool2 = array(555, 666, 777, 888);
foreach($pool1 as $p1)
{
$numbers1[$p1 . "\0"] = $p1;
}
foreach($pool2 as $p2)
{
$numbers2[$p2 . "\0"] = $p2;
}
$numbers = array_merge($numbers1, $numbers2);
var_dump($numbers);
The resulting output will be:
array(8) {
["111"] => string(3) "111"
["222"] => string(3) "222"
["333"] => string(3) "333"
["444"] => string(3) "444"
["555"] => string(3) "555"
["666"] => string(3) "666"
["777"] => string(3) "777"
["888"] => string(3) "888"
}
Without the . "\0" part the resulting array would be:
array(8) {
[0] => string(3) "111"
[1] => string(3) "222"
[2] => string(3) "333"
[3] => string(3) "444"
[4] => string(3) "555"
[5] => string(3) "666"
[6] => string(3) "777"
[7] => string(3) "888"
}
Also ksort() will also ignore the null character meaning $numbers[111] and $numbers["111\0"] will both have the same weight in the sorting algorithm.
The only downside to this method is that to access, for example $numbers["444"], you would actually have to access it via $numbers["444\0"] and since not even var_dump() will show you there's a null character at the end, there's no clue as to why you get "Undefined offset". So only use this hack if iterating via a foreach() or whoever ends up maintaining your code will hate you.
Use an object instead of an array $object = (object)$array;
EDIT:
I assumed that if they are integers, I
can't reorder them without changing
the key (which is significant in this
example). However, if they were
strings, I can reorder them how they
like as the index shouldn't be
interpreted to have any special
meaning. Anyway, see my question
update for how I did it (I went down a
different route).
Actually they dont have to be in numeric order...
array(208=>'a', 0=> 'b', 99=>'c');
Is perfectly valid if youre assigning them manually. Though i agree the integer keys might be misinterpreted as having a sequential meaning by someone although you would think if they were in a non-numeric order it would be evident they werent. That said i think since you had the leeway to change the code as you updated that is the better approach.
Probably not the most efficient way but easy as pie:
$keys = array_keys($data);
$values = array_values($data);
$stringKeys = array_map('strval', $keys);
$data = array_combine($stringKeys, $values);
//sort your data
I was able to get this to work by adding '.0' onto the end of each key, as such:
$options = [];
for ($i = 1; $i <= 4; $i++) {
$options[$i.'.0'] = $i;
}
Will return:
array("1.0" => 1, "2.0" => 2, "3.0" => 3, "4.0" => 4)
It may not be completely optimal but it does allow you to sort the array and extract (an equivalent of) the original key without having to truncate anything.
Edit:
This should work
foreach($array as $key => $value) {
$newkey = sprintf('%s',$key);
$newArray["'$newkey'"] = $value;
}
Hi we can make the index of the array a string using the following way. If we convert an array to xml then indexes like [0] may create issue so convert to string like [sample_0]
$newArray = array();
foreach($array as $key => $value) {
$newArray["sample_".$key] = $value;
}
All other answers thus far are hacks that either use fragile workarounds that could break between major PHP versions, create unnecessary gotchas by deliberately corrupting keys, or just slow down your code for no benefit. The various functions to sort arrays yet maintain the crucial key associations have existed since PHP 4.
It is pointless stop PHP from using integer keys, it only does so when the integer representation is exactly the same as the string, thus casting an integer key back to string when reading from the array is guaranteed to return the original data. PHP's internal representation of your data is completely irrelevant as long as you avoid the functions that rewrite integer keys. The docs clearly state which array functions will do that.
An example of sorting, without any hacks, that demonstrates how data remains uncorrupted:
<?php
# use string keys to define as populating from a db, etc. would,
# even though PHP will convert the keys to integers
$in = array(
'347' => 'ghi',
'176' => 'def',
'280' => 'abc',
);
# sort by key
ksort($in);
echo "K:\n";
$i = 1;
foreach ($in as $k => $v) {
echo $i++, "\n";
$k = (string) $k; # convert back to original
var_dump($k, $v);
}
# sort by value
asort($in, SORT_STRING);
echo "\nV:\n";
$i = 1;
foreach ($in as $k => $v) {
echo $i++, "\n";
$k = (string) $k;
var_dump($k, $v);
}
# unnecessary to cast as object unless keys could be sequential, gapless, and start with 0
if (function_exists('json_encode')) {
echo "\nJSON:\n", json_encode($in);
}
The output it produces hasn't changed since v5.2 (with only the JSON missing prior to that):
K:
1
string(3) "176"
string(3) "def"
2
string(3) "280"
string(3) "abc"
3
string(3) "347"
string(3) "ghi"
V:
1
string(3) "280"
string(3) "abc"
2
string(3) "176"
string(3) "def"
3
string(3) "347"
string(3) "ghi"
JSON:
{"280":"abc","176":"def","347":"ghi"}
What is the best way to generate an MD5 (or any other hash) of a multi-dimensional array?
I could easily write a loop which would traverse through each level of the array, concatenating each value into a string, and simply performing the MD5 on the string.
However, this seems cumbersome at best and I wondered if there was a funky function which would take a multi-dimensional array, and hash it.
(Copy-n-paste-able function at the bottom)
As mentioned prior, the following will work.
md5(serialize($array));
However, it's worth noting that (ironically) json_encode performs noticeably faster:
md5(json_encode($array));
In fact, the speed increase is two-fold here as (1) json_encode alone performs faster than serialize, and (2) json_encode produces a smaller string and therefore less for md5 to handle.
Edit: Here is evidence to support this claim:
<?php //this is the array I'm using -- it's multidimensional.
$array = unserialize('a:6:{i:0;a:0:{}i:1;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:0:{}}}i:2;s:5:"hello";i:3;a:2:{i:0;a:0:{}i:1;a:0:{}}i:4;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:0:{}}}}}}}i:5;a:5:{i:0;a:0:{}i:1;a:4:{i:0;a:0:{}i:1;a:0:{}i:2;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:0:{}}i:3;a:6:{i:0;a:0:{}i:1;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:0:{}}}i:2;s:5:"hello";i:3;a:2:{i:0;a:0:{}i:1;a:0:{}}i:4;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:0:{}}}}}}}i:5;a:5:{i:0;a:0:{}i:1;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:0:{}}}i:2;s:5:"hello";i:3;a:2:{i:0;a:0:{}i:1;a:0:{}}i:4;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:0:{}}}}}}}}}}i:2;s:5:"hello";i:3;a:2:{i:0;a:0:{}i:1;a:0:{}}i:4;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:0:{}}}}}}}}}');
//The serialize test
$b4_s = microtime(1);
for ($i=0;$i<10000;$i++) {
$serial = md5(serialize($array));
}
echo 'serialize() w/ md5() took: '.($sTime = microtime(1)-$b4_s).' sec<br/>';
//The json test
$b4_j = microtime(1);
for ($i=0;$i<10000;$i++) {
$serial = md5(json_encode($array));
}
echo 'json_encode() w/ md5() took: '.($jTime = microtime(1)-$b4_j).' sec<br/><br/>';
echo 'json_encode is <strong>'.( round(($sTime/$jTime)*100,1) ).'%</strong> faster with a difference of <strong>'.($sTime-$jTime).' seconds</strong>';
JSON_ENCODE is consistently over 250% (2.5x) faster (often over 300%) -- this is not a trivial difference. You may see the results of the test with this live script here:
http://nathanbrauer.com/playground/serialize-vs-json.php
http://nathanbrauer.com/playground/plain-text/serialize-vs-json.php
Now, one thing to note is array(1,2,3) will produce a different MD5 as array(3,2,1). If this is NOT what you want. Try the following code:
//Optionally make a copy of the array (if you want to preserve the original order)
$original = $array;
array_multisort($array);
$hash = md5(json_encode($array));
Edit: There's been some question as to whether reversing the order would produce the same results. So, I've done that (correctly) here:
http://nathanbrauer.com/playground/json-vs-serialize.php
http://nathanbrauer.com/playground/plain-text/json-vs-serialize.php
As you can see, the results are exactly the same. Here's the (corrected) test originally created by someone related to Drupal:
http://nathanjbrauer.com/playground/drupal-calculation.php
http://nathanjbrauer.com/playground/plain-text/drupal-calculation.php
And for good measure, here's a function/method you can copy and paste (tested in 5.3.3-1ubuntu9.5):
function array_md5(Array $array) {
//since we're inside a function (which uses a copied array, not
//a referenced array), you shouldn't need to copy the array
array_multisort($array);
return md5(json_encode($array));
}
md5(serialize($array));
I'm joining a very crowded party by answering, but there is an important consideration that none of the extant answers address. The value of json_encode() and serialize() both depend upon the order of elements in the array!
Here are the results of not sorting and sorting the arrays, on two arrays with identical values but added in a different order (code at bottom of post):
serialize()
1c4f1064ab79e4722f41ab5a8141b210
1ad0f2c7e690c8e3cd5c34f7c9b8573a
json_encode()
db7178ba34f9271bfca3a05c5dddf502
c9661c0852c2bd0e26ef7951b4ca9e6f
Sorted serialize()
1c4f1064ab79e4722f41ab5a8141b210
1c4f1064ab79e4722f41ab5a8141b210
Sorted json_encode()
db7178ba34f9271bfca3a05c5dddf502
db7178ba34f9271bfca3a05c5dddf502
Therefore, the two methods that I would recommend to hash an array would be:
// You will need to write your own deep_ksort(), or see
// my example below
md5( serialize(deep_ksort($array)) );
md5( json_encode(deep_ksort($array)) );
The choice of json_encode() or serialize() should be determined by testing on the type of data that you are using. By my own testing on purely textual and numerical data, if the code is not running a tight loop thousands of times then the difference is not even worth benchmarking. I personally use json_encode() for that type of data.
Here is the code used to generate the sorting test above:
$a = array();
$a['aa'] = array( 'aaa'=>'AAA', 'bbb'=>'ooo', 'qqq'=>'fff',);
$a['bb'] = array( 'aaa'=>'BBBB', 'iii'=>'dd',);
$b = array();
$b['aa'] = array( 'aaa'=>'AAA', 'qqq'=>'fff', 'bbb'=>'ooo',);
$b['bb'] = array( 'iii'=>'dd', 'aaa'=>'BBBB',);
echo " serialize()\n";
echo md5(serialize($a))."\n";
echo md5(serialize($b))."\n";
echo "\n json_encode()\n";
echo md5(json_encode($a))."\n";
echo md5(json_encode($b))."\n";
$a = deep_ksort($a);
$b = deep_ksort($b);
echo "\n Sorted serialize()\n";
echo md5(serialize($a))."\n";
echo md5(serialize($b))."\n";
echo "\n Sorted json_encode()\n";
echo md5(json_encode($a))."\n";
echo md5(json_encode($b))."\n";
My quick deep_ksort() implementation, fits this case but check it before using on your own projects:
/*
* Sort an array by keys, and additionall sort its array values by keys
*
* Does not try to sort an object, but does iterate its properties to
* sort arrays in properties
*/
function deep_ksort($input)
{
if ( !is_object($input) && !is_array($input) ) {
return $input;
}
foreach ( $input as $k=>$v ) {
if ( is_object($v) || is_array($v) ) {
$input[$k] = deep_ksort($v);
}
}
if ( is_array($input) ) {
ksort($input);
}
// Do not sort objects
return $input;
}
Answer is highly depends on data types of array values.
For big strings use:
md5(serialize($array));
For short strings and integers use:
md5(json_encode($array));
4 built-in PHP functions can transform array to string:
serialize(), json_encode(), var_export(), print_r().
Notice: json_encode() function slows down while processing associative arrays with strings as values. In this case consider to use serialize() function.
Test results for multi-dimensional array with md5-hashes (32 char) in keys and values:
Test name Repeats Result Performance
serialize 10000 0.761195 sec +0.00%
print_r 10000 1.669689 sec -119.35%
json_encode 10000 1.712214 sec -124.94%
var_export 10000 1.735023 sec -127.93%
Test result for numeric multi-dimensional array:
Test name Repeats Result Performance
json_encode 10000 1.040612 sec +0.00%
var_export 10000 1.753170 sec -68.47%
serialize 10000 1.947791 sec -87.18%
print_r 10000 9.084989 sec -773.04%
Associative array test source.
Numeric array test source.
Aside from Brock's excellent answer (+1), any decent hashing library allows you to update the hash in increments, so you should be able to update with each string sequentially, instead having to build up one giant string.
See: hash_update
md5(serialize($array));
Will work, but the hash will change depending on the order of the array (that might not matter though).
Note that serialize and json_encode act differently when it comes to numeric arrays where the keys don't start at 0, or associative arrays.
json_encode will store such arrays as an Object, so json_decode returns an Object, where unserialize will return an array with exact the same keys.
I think that this could be a good tip:
Class hasharray {
public function array_flat($in,$keys=array(),$out=array()){
foreach($in as $k => $v){
$keys[] = $k;
if(is_array($v)){
$out = $this->array_flat($v,$keys,$out);
}else{
$out[implode("/",$keys)] = $v;
}
array_pop($keys);
}
return $out;
}
public function array_hash($in){
$a = $this->array_flat($in);
ksort($a);
return md5(json_encode($a));
}
}
$h = new hasharray;
echo $h->array_hash($multi_dimensional_array);
Important note about serialize()
I don't recommend to use it as part of hashing function because it can return different result for the following examples. Check the example below:
Simple example:
$a = new \stdClass;
$a->test = 'sample';
$b = new \stdClass;
$b->one = $a;
$b->two = clone $a;
Produces
"O:8:"stdClass":2:{s:3:"one";O:8:"stdClass":1:{s:4:"test";s:6:"sample";}s:3:"two";O:8:"stdClass":1:{s:4:"test";s:6:"sample";}}"
But the following code:
<?php
$a = new \stdClass;
$a->test = 'sample';
$b = new \stdClass;
$b->one = $a;
$b->two = $a;
Output:
"O:8:"stdClass":2:{s:3:"one";O:8:"stdClass":1:{s:4:"test";s:6:"sample";}s:3:"two";r:2;}"
So instead of second object php just create link "r:2;" to the first instance. It's definitely good and correct way to serialize data, but it can lead to the issues with your hashing function.
// Convert nested arrays to a simple array
$array = array();
array_walk_recursive($input, function ($a) use (&$array) {
$array[] = $a;
});
sort($array);
$hash = md5(json_encode($array));
----
These arrays have the same hash:
$arr1 = array(0 => array(1, 2, 3), 1, 2);
$arr2 = array(0 => array(1, 3, 2), 1, 2);
I didn't see the solution so easily above so I wanted to contribute a simpler answer. For me, I was getting the same key until I used ksort (key sort):
Sorted first with Ksort, then performed sha1 on a json_encode:
ksort($array)
$hash = sha1(json_encode($array) //be mindful of UTF8
example:
$arr1 = array( 'dealer' => '100', 'direction' => 'ASC', 'dist' => '500', 'limit' => '1', 'zip' => '10601');
ksort($arr1);
$arr2 = array( 'direction' => 'ASC', 'limit' => '1', 'zip' => '10601', 'dealer' => '100', 'dist' => '5000');
ksort($arr2);
var_dump(sha1(json_encode($arr1)));
var_dump(sha1(json_encode($arr2)));
Output of altered arrays and hashes:
string(40) "502c2cbfbe62e47eb0fe96306ecb2e6c7e6d014c"
string(40) "b3319c58edadab3513832ceeb5d68bfce2fb3983"
there are several answers telling to use json_code,
but json_encode don't work fine with iso-8859-1 string, as soon as there is a special char, the string is cropped.
i would advice to use var_export :
md5(var_export($array, true))
not as slow as serialize, not as bugged as json_encode
Currently the most up-voted answer md5(serialize($array)); doesn't work well with objects.
Consider code:
$a = array(new \stdClass());
$b = array(new \stdClass());
Even though arrays are different (they contain different objects), they have same hash when using md5(serialize($array));. So your hash is useless!
To avoid that problem, you can replace objects with result of spl_object_hash() before serializing. You also should do it recursively if your array has multiple levels.
Code below also sorts arrays by keys, as dotancohen have suggested.
function replaceObjectsWithHashes(array $array)
{
foreach ($array as &$value) {
if (is_array($value)) {
$value = $this->replaceObjectsInArrayWithHashes($value);
} elseif (is_object($value)) {
$value = spl_object_hash($value);
}
}
ksort($array);
return $array;
}
Now you can use md5(serialize(replaceObjectsWithHashes($array))).
(Note that the array in PHP is value type. So replaceObjectsWithHashes function DO NOT change original array.)
in some case maybe it's better to use http_build_query to convert array to string :
md5( http_build_query( $array ) );