Can an array key be a string with embedded zero bytes? - php

Can an array key in PHP be a string with embedded zero-bytes?
I wanted to implode a multi-part key with embedded zero-bytes as the delimiter and use it as the key in an associative array, but it doesn't seem to be working. I'm not sure whether this is a problem with the array access or with array_key_exists().
Does anybody have any idea? Am I going about this the wrong way? Should I be creating a multi-part key in another way?
To clarify, I am trying to eliminate duplicates from user-entered data. The data consists of a product ID, a variation ID, and N fields of textual data. Each of the N fields has a label and a value. To be considered a duplicate, everything must match exactly (product ID, variation ID, all the labels and all the values).
I thought that if I create a string key by concatenating the information with null bytes, I could keep an associative array to check for the presence of duplicates.

From the PHP string documentation:
There are no limitations on the values the string can be composed of;
in particular, bytes with value 0 (“NUL bytes”) are allowed anywhere
in the string (however, a few functions, said in this manual not to be
“binary safe”, may hand off the strings to libraries that ignore data
after a NUL byte.)
From the PHP arrays documentation:
A key may be either an integer or a string.
No mention is made of any special case for strings that are array keys.
So, yes.
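A minimal sketch of the approach described in the question, assuming each submission has already been flattened into a list of strings. The helper name makeKey() and the sample data are hypothetical, and the scheme only holds if no field can itself contain a NUL byte.

```php
<?php
// Hypothetical helper: join the parts of one submission with NUL bytes.
// Assumes no part contains a NUL byte, otherwise two different
// submissions could produce the same key.
function makeKey(array $parts): string
{
    return implode("\0", $parts);
}

$submissions = [
    ['1', '2', 'Name', 'Alice'],
    ['1', '2', 'Name', 'Alice'], // exact duplicate of the first entry
    ['1', '3', 'Name', 'Alice'], // different variation ID
];

$seen = [];
$unique = [];
foreach ($submissions as $parts) {
    $key = makeKey($parts);
    if (array_key_exists($key, $seen)) {
        continue; // this combination was already recorded
    }
    $seen[$key] = true;
    $unique[] = $parts;
}

echo count($unique), "\n";
```

If a lookup like this fails in practice, it may be worth checking how the key is built or printed before suspecting the array itself, since NUL bytes are invisible in most output.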

Like I already said in the comments
print_r(array("\00foo\00bar" => 'works'));
works. However, there is no reason for any of the gymnastics you are doing with implode or serialize or null byte keys.
If you want to see whether arrays are identical, then you can just compare them:
$input1 = array('product_id' => 1, 'variation_id' => 2, 'foo' => 'bar');
$input2 = array('product_id' => 1, 'variation_id' => 2, 'foo' => 'bar');
var_dump($input1 === $input2);
will output true whereas
$input3 = array('product_id' => 1, 'variation_id' => 2, 'foo' => 'foobarbaz');
var_dump($input1 === $input3);
will give false.
Quoting the PHP Manual on Array Operators:
$a == $b Equality TRUE if $a and $b have the same key/value pairs.
$a === $b Identity TRUE if $a and $b have the same key/value pairs in the same order and of the same types.
Also, PHP has a function for deleting duplicate values from arrays:
array_unique — Removes duplicate values from an array
And when you set the second argument to SORT_REGULAR, PHP will compare the arrays for equality, e.g.
$data = array(
array('product_id' => 1, 'variation_id' => 2, 'foo' => 'bar'),
array('product_id' => 1, 'variation_id' => 2, 'foo' => 'bar'),
array('product_id' => 2, 'variation_id' => 2, 'foo' => 'bar')
);
print_r(array_unique($data, SORT_REGULAR));
will reduce the array to only the first and the third element.

Related

Why does array_uintersect_assoc() need the comparison function to return non-boolean values?

I wondered why array_uintersect_assoc()'s custom comparison function:
must return an integer less than, equal to, or greater than zero if the first argument is considered to be respectively less than, equal to, or greater than the second
When I compare two arrays, I only need a boolean return value: the elements either match or they don't.
What's the actual reason of this behavior?
The function has been implemented this way to allow the usage of "classical" comparison functions which use such a return strategy. Such a function typically needs to be able to express three cases which is not possible with a boolean return value, for obvious reasons.
You can, however, also use a comparison function which returns a boolean result, since PHP, as a weakly typed language, will automatically convert it for you. Take a look at this example, which is a slightly modified version of the one given in the function documentation:
<?php
function mystrcasecmp($a, $b) {
return strcasecmp($a, $b) ? true : false;
}
$array1 = array("a" => "green", "b" => "brown", "c" => "blue", "red");
$array2 = array("a" => "GREEN", "B" => "brown", "yellow", "red");
print_r(array_uintersect_assoc($array1, $array2, "mystrcasecmp"));
You can see that the comparison function used here returns a boolean, yet the result is exactly the same.
Bottom line: the existing implementation is more flexible, whilst allowing for the use of a comparison function returning a boolean result too.
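For comparison, here is a sketch of the "classical" three-way style mentioned above, reusing the arrays from the documentation example. The variable names are my own; strcasecmp() and the spaceship operator are standard PHP.

```php
<?php
$array1 = ["a" => "green", "b" => "brown", "c" => "blue", "red"];
$array2 = ["a" => "GREEN", "B" => "brown", "yellow", "red"];

// strcasecmp() already returns a value below, equal to, or above zero,
// so it can be passed as the callback directly.
$viaStrcasecmp = array_uintersect_assoc($array1, $array2, 'strcasecmp');

// An equivalent three-way callback built on the spaceship operator.
$viaSpaceship = array_uintersect_assoc(
    $array1,
    $array2,
    fn(string $a, string $b): int => strtolower($a) <=> strtolower($b)
);

print_r($viaStrcasecmp);
```

Both calls keep only the a element here, because it is the only one whose key exists in both arrays and whose values compare as equal.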
I only need a boolean value: the elements either match or they don't.
TL;DR
You need to invert the boolean value which will be converted to an int because these array_intersect and array_diff functions that call custom functions only qualify data that returns a zero-ish result (i.e.: null, false, "0", 0, empty string, empty array). Here is a ternary implementation:
array_uintersect_assoc($array1, $array2, fn($a, $b) => str_contains($a, $b) ? 0 : 1)
The explanation...
It's easy to get confused with this operation. Your question led me to post Unexpected results with array_uintersect_assoc() when callback returns non-numeric string.
Let's say you want to use array_uintersect_assoc() and you have these two input arrays:
$array1 = ["a" => "green", "b" => "brown", "c" => "blue", 0 => "red"];
$array2 = ["a" => "GREEN", "B" => "brown", 0 => "yellow", 1 => "red", "c" => "blue"];
Now let's say you want to make a custom function call which returns a boolean value. I'll nominate PHP8's str_contains() which will be sufficient for this demonstration. The first array will contain the haystack strings and the second array will contain the needle strings.
var_export(array_uintersect_assoc($array1, $array2, 'str_contains'));
This will check for identical keys from the first array in the second array THEN among those qualifying elements, it will check if the second array's value is found within the first array's string. Being an "intersect" call, you would intuitively expect:
['c' => 'blue']
because c is the only key present in both arrays for which the second array's value occurs (case-sensitively) within the first array's value.
However, what you actually get is:
['a' => 'green', 0 => 'red']
What?!? The reason that you get elements with keys a and 0 in the result is because any of the array_ diff/intersect functions that include u in their name are making a qualifying match when a 0 result is returned.
When the c elements' values are fed to str_contains(), a true boolean is returned. array_uintersect_assoc() then forces the boolean value to be converted to an int type. When converting booleans to ints, false becomes 0 and true becomes 1.
To fix this behavior to get the intended result, you cannot simply change the intersect word inside of the function's name to diff -- that creates:
['b' => 'brown', 'c' => 'blue']
This is because b doesn't have an identical corresponding key in the second array, so it is kept. c does have an identical corresponding key, but the true result from str_contains() becomes 1, which array_udiff_assoc() treats as "not equal", so c is kept as well.
Finally, the fix is to keep array_uintersect_assoc() and invert the boolean value so that true becomes 0 and false becomes a non-zero value.
var_export(
array_uintersect_assoc(
$array1,
$array2,
fn($a, $b) => str_contains($a, $b) ? 0 : 1
// or !str_contains($a, $b) until the day when PHP throws a DEPRECATED warning for returning a boolean
)
);
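If the same inversion is needed in several places, it could be wrapped once. The following is only a sketch: matchToComparator() is a hypothetical helper name, not a PHP built-in, and str_contains() requires PHP 8.

```php
<?php
// Hypothetical helper: adapt a boolean "does it match?" predicate to the
// zero-means-match convention expected by the array_u* functions.
function matchToComparator(callable $predicate): callable
{
    return fn($a, $b): int => $predicate($a, $b) ? 0 : 1;
}

$array1 = ["a" => "green", "b" => "brown", "c" => "blue", 0 => "red"];
$array2 = ["a" => "GREEN", "B" => "brown", 0 => "yellow", 1 => "red", "c" => "blue"];

$result = array_uintersect_assoc(
    $array1,
    $array2,
    matchToComparator('str_contains')
);

var_export($result);
```

A callback that only ever returns 0 or 1 does not define an ordering, so this kind of wrapper may give unreliable results with functions that sort values internally, such as array_uintersect() or array_udiff().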

array_diff() strange behaviour

I have a routine in my code that computes the differences between two arrays in order to create an SQL UPDATE statement.
As soon as the routine starts, I create a copy of the input array in order to manipulate it while the input one is kept untouched. When I'm ready to create the UPDATE statement, I compute the difference between them, using the modified array as the leading one.
In every test I ran both arrays were filled with key=value pairs and all values were strings, even when they're integers in the database (PDO behavior) and until now everything worked flawlessly.
But right now I've found something strange. Two different actions, in two different Controllers, but with the same logic are producing different differences:
This one works as expected:
$input = array(
'tid' => '3',
'enabled' => '1'
);
$modified = array(
'tid' => '3',
'enabled' => 0,
'modified' => '2014-11-26 15:17:55'
);
$diff = array(
'enabled' => 0,
'modified' => '2014-11-26 15:17:55'
);
$input is the result of a query. $modified is a copy of the first array manipulated through class methods. When I'm ready to create the statement, $diff is computed in order to send to PDO (or other driver) the correct statement.
Here, array_diff() works. The tid index is present in both arrays and is ignored. The enabled value, a simple on/off flag, is different and is included, as is the datetime representation.
But look at the variables in this other case:
$input2 = array(
'sid' => '1',
'finished' => '0'
);
$modified2 = array(
'sid' => '1',
'finished' => 1,
'modified' => '2014-11-26 15:21:58'
);
$diff2 = array(
'modified' => '2014-11-26 15:21:58'
);
This is the same as before but with different field names. The sid is ignored, but finished is ignored too, although it shouldn't be, because its new value is not present in $input2 under that key.
By replacing array_diff() with array_diff_assoc(), as expected, everything works, but why?
From the docs:
array array_diff ( array $array1 , array $array2 [, array $... ] )
Compares array1 against one or more other arrays and returns the
values in array1 that are not present in any of the other arrays.
In your example, $modified2 has an entry 'finished' which has value 1. But $input2 also has value 1 for key 'sid'. Thus, using array_diff($modified2, $input2) will result in the removal of every entry with value 1, no matter what the key is.
Using array_diff_assoc, it will only check to see if $input2 has the entry 'finished' => 1, which it does not, so the entry will not be removed.
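The difference can be reproduced with the two arrays from the question. This is just a sketch; the variable names $byValue and $byKeyAndValue are my own.

```php
<?php
$input2 = ['sid' => '1', 'finished' => '0'];
$modified2 = [
    'sid'      => '1',
    'finished' => 1,
    'modified' => '2014-11-26 15:21:58',
];

// Values only: the 1 under 'finished' matches the '1' stored under 'sid'
// in $input2, so 'finished' is dropped.
$byValue = array_diff($modified2, $input2);

// Key and value together: $input2 has no 'finished' => 1 pair, so it stays.
$byKeyAndValue = array_diff_assoc($modified2, $input2);

var_export($byValue);
echo "\n";
var_export($byKeyAndValue);
```

Both functions compare values by their string representation, which is why the integer 1 and the string '1' count as the same value.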

Return all records that match an array of conditions using CakePHP's find

I'm currently attempting to return all records that match an array of conditions that I have. Currently I can get my code to work, but instead of returning all records that match an array of conditions that I've passed, it just returns the first one and then stops, instead of the four that I know exist in the table I'm accessing. This is with the all parameter set for find.
Here's the code snippet for a better view:
$array = implode(',', array('1','2','3','4'));
$a = $this->Assets->find('all', array(
'conditions' => array(
'id' => $array
)
)
);
var_dump($a);
var_dumping $a will just provide the record that has id 1, when there are records for 2, 3, and 4 as well.
That is the expected result.
You are working against the ORM's auto-magic. Passing a string will result in an equality comparison, i.e. WHERE x = y, and since id is most probably an integer, the casting will turn the string 1,2,3,4 into 1, so ultimately the condition will be WHERE id = 1.
You should pass the array instead
'conditions' => array(
'id' => array(1, 2, 3, 4)
)
that way the ORM will generate an IN condition, ie WHERE id IN (1,2,3,4).
This is also documented in the cookbook:
http://book.cakephp.org/2.0/en/models/retrieving-your-data.html#complex-find-conditions
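The truncation itself can be illustrated in plain PHP. This is only an analogy: whether the cast happens in the ORM or in the database depends on the CakePHP version and driver, and the $conditions array below is built but not run, since that would need a real model.

```php
<?php
$ids = ['1', '2', '3', '4'];

// What the question passes as the condition value.
$asString = implode(',', $ids); // "1,2,3,4"

// Casting the joined string to an integer keeps only the leading digits.
$castValue = (int) $asString;

// What the conditions should contain instead: the array itself.
// (Not executed here, as that requires a CakePHP model.)
$conditions = ['id' => array_map('intval', $ids)];

var_dump($castValue);
```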

php - Why does key of first element of an associative array cannot be zero?

I am new to the concept of associative arrays in PHP and have never used them before. I came across an example and got the following code:
function isAssoc($arr)
{
return array_keys($arr) !== range(0, count($arr) - 1);
}
echo(var_dump(isAssoc(array("0" => 'a', "1" => 'b', "2" => 'c'))).'<br />'); // false
echo(var_dump(isAssoc(array("1" => 'a', "1" => 'b', "2" => 'c'))).'<br />'); //true
echo(var_dump(isAssoc(array("1" => 'a', "0" => 'b', "2" => 'c'))).'<br />'); // true
echo(var_dump(isAssoc(array("a" => 'a', "b" => 'b', "c" => 'c'))).'<br />'); // true
The above function is used to tell whether the array is an associative array or not.
My question is why:
array("0" => 'a', "1" => 'b', "2" => 'c')
is not an associative array as it returns false. Whereas,
array("1" => 'a', "0" => 'b', "2" => 'c') //OR
array("1" => 'a', "1" => 'b', "2" => 'c')
is an associative array?
The term "associative array" in the PHP manual is used to differentiate from "indexed array". In PHP, all arrays are by definition associative in that they contain key/value pairs (the association). However, the documentation aims to refer to "associative" when it means "non-indexed", for clarity. This is the important point.
So what is an "indexed array"? By "indexed", it is meant that the keys of the array are starting at 0 and incrementing by one. Whether the keys are explicitly set at definition (array(0 => 'a', 1 => 'b')) or implicit (array('a', 'b')) makes no difference. Any other sequence of keys would be referred to as "associative".
Note that how the PHP manual uses the term "associative" doesn't necessarily correlate precisely with the same term's use elsewhere.
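The "starts at 0 and increments by one" rule above is what the built-in array_is_list() checks from PHP 8.1 onwards. The sketch below applies it to the arrays from the question; the fallback definition is my own approximation for older versions, not the engine's implementation.

```php
<?php
// array_is_list() is built in from PHP 8.1. This fallback is only a
// sketch for older versions.
if (!function_exists('array_is_list')) {
    function array_is_list(array $array): bool
    {
        $expected = 0;
        foreach ($array as $key => $_) {
            if ($key !== $expected++) {
                return false;
            }
        }
        return true;
    }
}

$indexed   = ["0" => 'a', "1" => 'b', "2" => 'c']; // keys 0, 1, 2 in order
$reordered = ["1" => 'a', "0" => 'b', "2" => 'c']; // keys 1, 0, 2
$named     = ["a" => 'a', "b" => 'b', "c" => 'c']; // string keys

var_dump(array_is_list($indexed));
var_dump(array_is_list($reordered));
var_dump(array_is_list($named));
```

Note that key order matters: the second array has the keys 0, 1 and 2, but not in that order, so it does not count as a list.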
All arrays in PHP are associative. You can consider one to be a tuple if all keys are numeric (integers, but not necessarily of that type when written), continuous, and start from 0.
A simple check would be:
function is_assoc(array $array)
{
$keys = array_keys($array);
$keys_keys = array_keys($keys);
return $keys_keys !== $keys;
}
It would yield the same results as the one you've linked to/used.
A hint here would be this excerpt from the json_decode documentation:
assoc
When TRUE, returned objects will be converted into associative arrays.
Even if it returns a "numeric", "indexed" array, that array is still associative.
Another example would be:
$array = ["0" => "a", "1" => "b", "2" => "c"]; # numeric, continuous, starts from 0
json_encode($array); # (array) ["a","b","c"]
While:
$array = ["0" => "a", "2" => "b", "3" => "c"]; # numeric, NOT continuous, starts from 0
json_encode($array); # (object) {"0":"a","2":"b","3":"c"}
The function you're referring to has flaws, and it is not authoritative about whether or not an array in PHP is associative.
In fact, PHP does not strictly define what the term associative array actually means.
However, it is important to understand that:
PHP is loosely typed.
PHP distinguishes between integer keys (bool, float, integer, and numeric strings that represent an integer are all cast to integer) and string keys (non-numeric strings; NULL is cast to the empty string).
Most likely, an associative array in PHP is one in which all keys are strings once the array has been created (that is, after key casting, not as the keys appear when writing the array definition).
But keep in mind that no set-in-stone definition exists. PHP arrays are a mixture of array and hash map, and the two are not properly differentiated, so you need to know both. You also need to keep in mind the difference between numeric and non-numeric keys and how PHP's loose type system converts values used as array keys.
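The key conversion can be observed directly. The sketch below uses a made-up array and sticks to conversions that do not trigger deprecation notices on current PHP versions.

```php
<?php
$array = [
    "1"  => 'a', // numeric string: stored under the integer key 1
    "01" => 'b', // not a canonical integer string: stays a string key
    true => 'c', // bool: cast to integer 1, overwriting 'a'
    "x"  => 'd', // non-numeric string: stays a string key
];

// Shows which keys the array actually ended up with, and their types.
var_dump(array_keys($array));
```

Only three elements remain, because "1" and true both end up as the integer key 1.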
First page to go as usual:
http://php.net/language.types.array
For those who can really only understand it with code, here you go:
function is_array_assoc(array $array) {
return is_array($array);
}
If you can ensure that you're not using an error handler that catches catchable errors, then the following code is correct, too:
function is_array_assoc(array $array) {
return TRUE;
}
The latter example perhaps makes it a bit clearer that all arrays in PHP are associative.

PHP: Double square brackets after a variable?

echo $a['b']['b2'];
What does the value in the brackets refer to? Thanks.
This is an array.
what you are seeing is
<?php
$a = array(
'b' => array(
'b2' => 'x'
)
);
So in this case, $a['b']['b2'] will have a value of 'x'.
This is just my example though, there could be more arrays in the tree. Refer to the PHP Manual
Those are keys of a multidimensional array.
It may refer to this array:
$a = array(
"a" => array(
"a1" => "foo",
"a2" => "bar"
),
"b" => array(
"b1" => "baz",
"b2" => "bin"
)
);
In this case, $a['b']['b2'] would refer to 'bin'
This refers to a two dimensional array, and the value inside the bracket shows the key of the array
That means the variable $a holds an array. The values inside of the brackets refer the array keys.
$a = array('b' => 'somevalue', 'b2' => 'somevalue2');
In this case, echoing $a['b'] would output its value of 'somevalue' and $a['b2'] would output its value of 'somevalue2'.
In your example, it's referring to a multi-dimensional array (an array inside of an array)
$a = array('b' => array('b2' => 'b2 value'));
where calling b2 would output 'b2 value'
Sorry if my answer is too simplistic, not sure your level of knowledge :)
$a is an array, a list of items. Most programming languages allow you to access items in the array using a number, but PHP also allows you to access them by a string, like 'b' or 'b2'.
Additionally, you have a two-dimensional array there - an array of arrays. So in that example, you are printing out the 'b2' element of the 'b' element in the $a array.
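The two lookups can also be written out one step at a time. This is a small sketch with made-up values; the key b3 is deliberately absent to show one way of handling a missing key.

```php
<?php
$a = [
    'b' => [
        'b1' => 'baz',
        'b2' => 'bin',
    ],
];

$inner = $a['b'];      // first pair of brackets: the inner array
$value = $inner['b2']; // second pair: one element of that inner array

// The chained form performs the same two lookups in one expression.
var_dump($value === $a['b']['b2']);

// The null coalescing operator supplies a fallback for a missing key
// instead of raising a warning.
$missing = $a['b']['b3'] ?? 'default';
var_dump($missing);
```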
