PHP algorithm (loose) equality comparison

There are some SO questions on this subject, but none of them answers it with an algorithm description like the one that exists for JS (ECMAScript). It doesn't seem to exist in the PHP documentation.
I'm not a C developer and I could not even find the corresponding code in the PHP sources. I will not sleep well anymore if I can't tell why a loose (==) comparison of a string/number/resource with an object/array seems to always return false.
E.g. why '' == [] is false, or why 'foo' == ['foo'] is false.

There are multiple pages in the PHP documentation dedicated to loose comparison with the == operator. For objects, see Comparing Objects:
When using the comparison operator (==), object variables are compared in a simple manner, namely: Two object instances are equal if they have the same attributes and values (values are compared with ==), and are instances of the same class.
For loose comparison between other types, see PHP type comparison tables.
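The rule quoted above can be checked with a minimal sketch. PointA and PointB are hypothetical classes invented for this illustration; they are not taken from the documentation.

```php
<?php
// Hypothetical classes for illustration only.
class PointA { public $x = 1; public $y = '2'; }
class PointB { public $x = 1; public $y = '2'; }

$a1 = new PointA();
$a2 = new PointA();
$a2->y = 2;                 // int 2 instead of string '2'
$b  = new PointB();

$sameClassLoose  = ($a1 == $a2);   // attributes are compared with ==, so '2' == 2 matches
$sameClassStrict = ($a1 === $a2);  // two distinct instances
$otherClassLoose = ($a1 == $b);    // same attributes, different classes

var_dump($sameClassLoose);   // bool(true)
var_dump($sameClassStrict);  // bool(false)
var_dump($otherClassLoose);  // bool(false)
```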

I finally found an almost satisfying answer from this blog of a security expert (Gynvael) and by reading source code. From the former, I'm only quoting the parts that answer my initial question: why (loose, ==) comparing a string/number/resource to an object/array seems to always return false? The algorithm in charge of equivalent comparison (==) can be found here.
The main mechanics of the equality operator are implemented in the compare_function in php-src/Zend/zend_operators.c, however many cases call other functions or use big macros (which then call other functions that use even more macros), so reading this isn't too pleasant.
The operator basically works in two steps:
If both operands are of a type that the compare_function knows how to compare they are compared. This behavior includes the following pairs of types (please note the equality operator is symmetrical so comparison of A vs B is the same as B vs A):
• LONG vs LONG
• LONG vs DOUBLE (+ symmetrical)
• DOUBLE vs DOUBLE
• ARRAY vs ARRAY
• NULL vs NULL
• NULL vs BOOL (+ symmetrical)
• NULL vs OBJECT (+ symmetrical)
• BOOL vs BOOL
• STRING vs STRING
• and OBJECT vs OBJECT
In case the pair of types is not on the above list the compare_function tries to cast the operands to either the type of the second operand (in case of OBJECTs with cast_object handler), cast to BOOL (in case the second type is either NULL or BOOL), or cast to either LONG or DOUBLE in most other cases. After the cast the compare_function is rerun.
I think that all other cases return false.
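As a rough check of the cases from the question, here is a small sketch. The variable names are illustrative, and the results are what I would expect on PHP 7 and PHP 8 alike: arrays never loosely equal strings or numbers, while a NULL or BOOL operand switches to a boolean comparison.

```php
<?php
// Array versus string or number: never equal.
$emptyStringVsArray = ('' == []);
$stringVsArray      = ('foo' == ['foo']);
$zeroVsArray        = (0 == []);

// NULL or BOOL on one side: both operands are compared as booleans.
$nullVsEmptyArray   = (null == []);
$falseVsEmptyArray  = (false == []);
$trueVsFilledArray  = (true == ['foo']);

// An object without __toString() cannot be cast to string, so it is not equal.
$objectVsString     = (new stdClass() == 'foo');

var_dump($emptyStringVsArray, $stringVsArray, $zeroVsArray);        // false, false, false
var_dump($nullVsEmptyArray, $falseVsEmptyArray, $trueVsFilledArray); // true, true, true
var_dump($objectVsString);                                           // false
```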

PHP, Objects are automatically converted to 1 on comparison operators

According to documentation, this comparison should return false because "object is always greater"! But instead, the object is automatically converted to 1! Even so, it says that "the object could not be converted to int"! So why is it happening?
http://php.net/manual/en/language.operators.comparison.php#language.operators.comparison.types
// php code
$obj=new stdClass();
var_dump($obj==1);
// output
NOTICE Object of class stdClass could not be converted to int on line number 3
bool(true)
you can test it on http://phptester.net/
I think the documentation is either wrong or poorly worded.
See this site where he dove into the source code...
https://gynvael.coldwind.pl/?id=492
The operator basically works in two steps:
If both operands are of a type that the compare_function knows how to compare they
are compared. This behavior includes the following pairs of types (please note the
equality operator is symmetrical so comparison of A vs B is the same as B vs A):
• LONG vs LONG
• LONG vs DOUBLE (+ symmetrical)
• DOUBLE vs DOUBLE
• ARRAY vs ARRAY
• NULL vs NULL
• NULL vs BOOL (+ symmetrical)
• NULL vs OBJECT (+ symmetrical)
• BOOL vs BOOL
• STRING vs STRING
• and OBJECT vs OBJECT
In case the pair of types is not on the above list the compare_function tries to cast
the operands to either the type of the second operand (in case of OBJECTs with
cast_object handler), cast to BOOL (in case the second type is either NULL or BOOL), or
cast to either LONG or DOUBLE in most other cases. After the cast the compare_function
is rerun.
See my PHP equal operator == reference table for details on each specific case.
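The behaviour reported in the question can be reproduced with a short sketch. The error handler below only collects the diagnostic text so the script stays quiet; the severity of the message is, as far as I know, a notice on PHP 7 and a warning on PHP 8, so the sketch does not depend on it.

```php
<?php
$messages = [];
// Collect engine diagnostics instead of printing them.
set_error_handler(function (int $errno, string $errstr) use (&$messages): bool {
    $messages[] = $errstr;
    return true;
});

$obj = new stdClass();
$equalsOne = ($obj == 1);  // the object is treated as 1 for the comparison
$equalsTwo = ($obj == 2);  // 1 versus 2

restore_error_handler();

var_dump($equalsOne);  // bool(true)
var_dump($equalsTwo);  // bool(false)
print_r($messages);    // the "could not be converted to int" diagnostics
```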

PHP MongoIDs objects comparison - best practice

Can somebody explain to me why the strict comparison (===) of two MongoDB\BSON\ObjectIds in PHP returns FALSE although both ids are of type MongoDB\BSON\ObjectId with the same oid?
Next question is about best practice to handle this case. Is it safe to do it via non strict comparison (==) or is there another way to do it e.g. (string)$id1 === (string)$id2?
From the relevant PHP documentation:
When using the identity operator (===), object variables are identical if and only if they refer to the same instance of the same class.
So you should just use the standard comparison operator (==). No string casting required.
Per @jh1711:
BSON\ObjectId ... implements a custom object_compare handler. But the handler just compares the ids

Why is this twig statement behaving weird? [duplicate]

I have a problem baffling me terribly. I noticed this before but didn't give it any heed until today.
I was trying to write my own check for integer strings. I know of is_numeric(), but it does not suffice since it counts floats as numeric, not only integers, and of is_int(), which does not work on numeric strings.
I did something similar to this
$var1 = 'string';
$var2 = '123';
var_dump( (int)$var1 == $var1);// boolean true
var_dump((int)$var2 == $var2);// boolean true
var_dump((int)$var1);//int 0
var_dump($var1);//string 'string' (length=6)
As expected, the second var_dump outputs true, since I expect that with PHP's loose comparison the string and integer versions are equal.
However, with the first one I don't get why this is so. I have tried casting to bool and it still gives me the same result.
I have tried assigning the cast value to a new variable and comparing the two; still the same result.
Is this something I am doing wrong, or is it a PHP bug?
***Note
I am not comparing types here. I'm actually trying to take advantage of the fact that int 0 is not equal to string 'string'.
I wrote my integer check differently so I don't really need alternatives for that.
***Edit
I did some extra checking and it turns out that 0 == 'string' is true as well. How is that possible?
***Edit 2
There are multiple correct answers below to the question. Thanks to everyone who answered.
It's not a bug, it's a feature. Any string can be cast to an integer, but the cast returns 0 if the string doesn't start with an integer value. Also, when comparing an integer and a string, the string is cast to an integer and the check is then done against the two integers. Because of that rule, just about any random string is "equal" to zero. (To bypass this behavior, you should use strcmp, as it performs an explicit string comparison by casting anything passed to a string.)
To make sure I'm dealing with an integer, I would use is_numeric first, then convert the string to an int, and verify that the stringified int corresponds to the input value.
if (is_numeric($value) && strcmp((string)(int)$value, (string)$value) == 0)
{
    // $value is an integer value represented as a string
}
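The round-trip idea from this answer can also be written as a small function. This is only a sketch: isCanonicalIntString is a hypothetical name, and it deliberately rejects forms such as '007' whose string form changes after the cast.

```php
<?php
// Hypothetical helper: true only for strings that survive an int round trip unchanged.
function isCanonicalIntString($value): bool
{
    return is_string($value) && (string) (int) $value === $value;
}

var_dump(isCanonicalIntString('123'));     // bool(true)
var_dump(isCanonicalIntString('-5'));      // bool(true)
var_dump(isCanonicalIntString('12.5'));    // bool(false), casts to 12
var_dump(isCanonicalIntString('007'));     // bool(false), casts to 7
var_dump(isCanonicalIntString('string'));  // bool(false), casts to 0
var_dump(isCanonicalIntString(123));       // bool(false), not a string
```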
According to php.net http://php.net/manual/en/language.operators.comparison.php:
var_dump(0 == "a"); // 0 == 0 -> true
So, I think it is juggling the types and actually casting both sides to int. The non-numeric string "a" becomes 0, so the comparison ends up as 0 == 0; the ASCII values of the characters play no part in it.
First of all, in mathematics '=' is called transitive because (A=B and B=C => A=C) is valid.
This is not the case with PHP's "=="!
(int)$var1 == $var1
In that case PHP will cast 'string' to 0 - that's a convention.
Then the == operator will implicitly have the second operand 'string' cast to an integer as well, which is also 0.
That leads to true.
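The missing transitivity can be shown with values whose results, to my knowledge, do not depend on the PHP version. A minimal sketch:

```php
<?php
$a = '0';
$b = false;
$c = '';

$aEqualsB = ($a == $b);  // '0' converts to false
$bEqualsC = ($b == $c);  // '' converts to false
$aEqualsC = ($a == $c);  // two strings, compared as strings

var_dump($aEqualsB);  // bool(true)
var_dump($bEqualsC);  // bool(true)
var_dump($aEqualsC);  // bool(false): A == B and B == C, yet A != C
```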
You made an error with your post, the correct output is this:
bool(true)
bool(true)
int(0)
string(6) "string"
What happens is this:
1) Because you cast the variable to an integer and then compare the result to the original string with a loose comparison ==, PHP will first implicitly cast the string to an integer. A more explicit but 100% equivalent form would be: if((int)$var1 == (int) $var1)
2) See 1), the same thing applies here
3) It prints int(0), as it should: because it fails to parse a number, the cast returns 0 instead.
4) Prints string(6) "string" - as expected
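One caveat when re-running the question's code today: the answers in this thread describe PHP 7 and earlier. From PHP 8.0 onwards, comparing a number with a non-numeric string converts the number to a string instead, so 0 == 'string' becomes false. The sketch below records the version-dependent result rather than assuming one.

```php
<?php
$var1 = 'string';
$var2 = '123';

$castFirst  = (int) $var1;  // 0, because no leading number can be parsed
$castSecond = (int) $var2;  // 123

$numericLoose    = ($castSecond == $var2);  // numeric string: true on every version
$nonNumericLoose = ($castFirst == $var1);   // true before PHP 8, false from PHP 8.0
$strict          = ($castFirst === $var1);  // int versus string: always false

var_dump($castFirst, $castSecond, $numericLoose, $strict);
printf("0 == 'string' on PHP %s: %s\n", PHP_VERSION, var_export($nonNumericLoose, true));
```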
Note: This answer is in response to a related question about the Twig template engine, that was marked as a duplicate, and redirects here.
Because the context is different, this answer is provided to those members of the SO community who may benefit from additional details specifically related to twig exclusively.
TL;DR: see this post How do the PHP equality (== double equals) and identity (=== triple equals) comparison operators differ?
Problem
Context
Twig template engine (latest version as of Fri 2017-01-27T05:12:25)
Scenario
DeveloperYlohEnrohK uses comparison operator in twig expression
DeveloperYlohEnrohK notices unexpected results when using equality comparison operator
Questions
Why does the equality comparison operator (==) produce unexpected results in Twig?
Why do the following produce different results?
{{ dump(0 == 'somekey') }} ==> true
{{ dump(0|lower == 'somekey') }} ==> false
Solution
Since Twig is based on PHP, the casting, implicit type-conversion and comparison semantics of PHP apply to Twig templates as well.
Unless DeveloperYlohEnrohK is intentionally and specifically leveraging the behavior of implicit type-conversion in PHP, the comparison expression will almost certainly produce counterintuitive and unexpected results.
This is a well-known circumstance that is described in detail in this SO post on PHP equality.
Solution: Just as is the case with standard PHP, unless the well-known circumstance is accounted for, using === in Twig is much less likely to produce unexpected results.
Pitfalls
As of this writing, the Twig template engine does not support === in the same way as standard PHP
Twig does have a substitute for === using same as
Because of this, the treatment of this well-known circumstance differs slightly between PHP and Twig.
See also
How do the PHP equality (== double equals) and identity (=== triple equals) comparison operators differ?
http://twig.sensiolabs.org/doc/2.x/tests/sameas.html
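The two Twig expressions from the question can be modelled in plain PHP. This is a sketch under two assumptions: that Twig evaluates == with PHP's == (as stated above), and that the lower filter can be approximated by strtolower on the string form of its input.

```php
<?php
$key = 'somekey';

// {{ 0 == 'somekey' }}: an int compared with a non-numeric string.
// The result depends on the PHP version running Twig (true before PHP 8).
$plain = (0 == $key);

// {{ 0|lower == 'somekey' }}: the filter yields the string '0',
// so two strings are compared as strings.
$lowered = (strtolower((string) 0) == $key);

// {{ 0 is same as('somekey') }}: documented as equivalent to === in PHP.
$sameAs = (0 === $key);

var_dump($lowered);  // bool(false)
var_dump($sameAs);   // bool(false)
```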
If you want to compare types of variables too you should use ===.
Here's a function that more rigorously tests for either an int or an integer string.
function isIntegerNumeric($val) {
return (
is_int($val)
|| (
!is_float($val)
&& is_numeric($val)
&& strpos($val, ".") === false
)
);
}
It's relatively quick and avoids doing any string checking if it doesn't have to.
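A few calls show what the function above accepts. The function is repeated so the sketch runs on its own; note that an exponent form such as '1e3' contains no "." and therefore passes, which may or may not be what you want.

```php
<?php
// Copy of the function from the answer, repeated so this sketch is self-contained.
function isIntegerNumeric($val) {
    return (
        is_int($val)
        || (
            !is_float($val)
            && is_numeric($val)
            && strpos($val, ".") === false
        )
    );
}

var_dump(isIntegerNumeric(12));     // bool(true)
var_dump(isIntegerNumeric('12'));   // bool(true)
var_dump(isIntegerNumeric(12.0));   // bool(false), a float
var_dump(isIntegerNumeric('12.0')); // bool(false), contains "."
var_dump(isIntegerNumeric('abc'));  // bool(false), not numeric
var_dump(isIntegerNumeric('1e3'));  // bool(true), exponent form slips through
```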

Underlying philosophy behind php type comparisons

So there's this page on the php site which shows the result of comparing different values:
http://php.net/manual/en/types.comparisons.php
This is a helpful reference, but I would rather not have to visit this page every time I want to make sure that I'm doing type comparison right. So my question is
Is there some kind of underlying philosophy/reasoning behind the logic of type comparisons on PHP?
For example, I can see that for loose comparisons:
1, -1, "1" and "-1" can be treated as TRUE and 0 and "0" can be treated as FALSE;
Comparing the string value of a number against the number itself will yield TRUE;
but it becomes a bit hairy from then on trying to establish a pattern.
For casting directly to a boolean this is how it works.
All strings with a length > 0 are true, except the string "0"
All non 0 numbers are true
All non-empty arrays are true
All objects are true
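These casting rules are easy to verify directly. A small sketch, including the string "0", which is false even though it is not empty:

```php
<?php
$casts = [
    'non-empty string' => (bool) 'a',
    'empty string'     => (bool) '',
    'string "0"'       => (bool) '0',
    'string "0.0"'     => (bool) '0.0',
    'int 0'            => (bool) 0,
    'float 0.0'        => (bool) 0.0,
    'int -1'           => (bool) -1,
    'empty array'      => (bool) [],
    'array(0)'         => (bool) [0],
    'object'           => (bool) new stdClass(),
];

foreach ($casts as $label => $result) {
    printf("%-17s => %s\n", $label, var_export($result, true));
}
```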
Then these rules for comparing variables of the same type:
Objects are equivalent if their properties are equal
Arrays are equivalent if their keys and elements are equal
Strings are equivalent if they would produce the same output
Numbers are equivalent if they are mathematically equivalent
Booleans are equivalent if they have the same value.
For variables of different types, the type that is higher on the above list is cast to the one that is lower, and then the comparison is made.
=== and !== operators don't cast prior to comparing but you should note objects are only === if they are the same instance.
The really odd one is arrays, they are === if they have the same keys and values defined in the same order.
$a = array("a"=>1, "b"=>2);
$b = array("b"=>2, "a"=>1);
$a == $b; // true
$a === $b; // false
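Key order is not the only difference between the two array operators: == also compares the values loosely, while === requires identical types. A brief sketch with illustrative variable names:

```php
<?php
$ints    = [1, 2];
$strings = ['1', '2'];
$swapped = [1 => 2, 0 => 1];  // same key/value pairs as $ints, different order

$looseValues  = ($ints == $strings);   // values compared with ==
$strictValues = ($ints === $strings);  // int versus string values
$looseOrder   = ($ints == $swapped);   // order is ignored
$strictOrder  = ($ints === $swapped);  // order matters

var_dump($looseValues);   // bool(true)
var_dump($strictValues);  // bool(false)
var_dump($looseOrder);    // bool(true)
var_dump($strictOrder);   // bool(false)
```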
and empty() is equivalent to !(bool)$var
EXCEPTIONS
Casting an array to a string will trigger a notice and unhelpfully cast as the text Array
Casting an object without a __toString method to a string will get you a fatal error.
Objects will not implicitly cast to an array, so any time you compare an object to an array it will yield false (UPDATE: confirmed that this is true even if the object implements the ArrayAccess interface)
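The object-versus-array rule can be checked with a short sketch. It uses the built-in ArrayObject as an example of a class that implements ArrayAccess; an explicit conversion is needed before the contents can be compared.

```php
<?php
$data    = ['a' => 1];
$plain   = (object) $data;         // stdClass with the same data
$wrapped = new ArrayObject($data); // implements ArrayAccess

$plainVsArray   = ($plain == $data);
$wrappedVsArray = ($wrapped == $data);

// Explicit conversion makes the contents comparable.
$castVsArray = ((array) $plain == $data);
$copyVsArray = ($wrapped->getArrayCopy() == $data);

var_dump($plainVsArray);    // bool(false)
var_dump($wrappedVsArray);  // bool(false)
var_dump($castVsArray);     // bool(true)
var_dump($copyVsArray);     // bool(true)
```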
For strict === comparison, the logic is easy: each value entity is equal only to itself, so TRUE === TRUE, "1" === "1", but "1" !== 1 etc.
When it comes to == comparison, unfortunately there is neither a rule of thumb nor a clear logic. This is probably because the various forms of the operator were implemented by different programmers, without a central design decision. The best I can do is provide you with this graph to print and stick over the monitor:
The key of the graph is: A == B will be TRUE if and only if A and B are of two types directly connected by a line in the graph above. For instance, array() == NULL is TRUE because array() and NULL are directly connected, while array() == 0 is FALSE because there is no line connecting the two.
Lines marked in red are the tricky (non obvious) equalities.
I've omitted that each entity will be equal to itself (e.g. "1" == "1" etc.) but that should be easy to remember.
As a final note, I'd like to explain why "php" == 0 is TRUE (a non-empty, non-numeric string is equal to 0): because PHP casts "php" to a number before the comparison and, since it's not a number, it defaults to 0 and makes the test TRUE.
Fun fact: there is no partition in this relation! If ever a transitive closure was allowed, you could easily say that True is False and False is True, destroying millennia of philosophy in four easy PHP statements :D
If the value contains something then it can be said to be true. For example, 1, 1.123, array("value"), etc. are all treated as true.
If the value can be said to be empty or void (i.e. lacking something) then it is seen as false. For example, 0, 0.0, array(), and so on.
This way of thinking about variables is not special to PHP. Many other languages do it in the same or similar way. E.g. Perl, C and Javascript, just to name a few.
There is imo a very straightforward guideline and a bug in the specification, which might be confusing.
Strict comparison checks equality in datatype and value.
Loose comparison checks equality in value only.
For an object (not part of the comparison table), PHP is quite straightforward:
if the object is the same instance as the other one, then it is strictly equal; otherwise it might be loosely equal.
Therefore 0 and "0" are loosely equal to each other and to false (and to any string). The latter can be understood as follows: all strings are not numeric, hence false, and the number that is equal to false is 0, hence all strings are equal to 0.
The comparison between null and array() is more complicated. If you loosely compare an array created with array() against null, it returns true (the strict comparison === returns false). If, however, you check it with is_null, it returns false. I think the latter is more logical, because an array created with array() is not equal to '', whereas null is. I would call this functional inconsistency between the function is_null() and the check
== null a bug, because two different valid methods of checking for a value should not return different results.
Null is also not an array according to the function is_array(), which is correct. An empty array is an array according to is_array(), which is also correct. Hence it should never be true that null is equal to array().
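The checks discussed in this answer can be run side by side. A minimal sketch; the results are what I would expect on PHP 7 and PHP 8 alike:

```php
<?php
$empty = array();

$looseNull    = ($empty == null);   // both sides convert to boolean false
$strictNull   = ($empty === null);  // array versus null
$isNull       = is_null($empty);    // same answer as === null
$arrayVsEmpty = ($empty == '');     // array versus string
$nullVsEmpty  = (null == '');       // null converts to ''
$nullIsArray  = is_array(null);

var_dump($looseNull);     // bool(true)
var_dump($strictNull);    // bool(false)
var_dump($isNull);        // bool(false)
var_dump($arrayVsEmpty);  // bool(false)
var_dump($nullVsEmpty);   // bool(true)
var_dump($nullIsArray);   // bool(false)
```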
There is no particular logic, but you can figure out some patterns.
"empty" values (null, false, 0, empty string and string '0') evaluate to false
comparison of numeric values was done by implicitly converting them to numbers until some version (there was a bug where two actually different long numeric strings counted as equal; it is now fixed)
when working with arrays, there is no difference between integer and numeric indexes, except when you call array_key_exists with explicit strict parameter
comparing a number with a numeric string implicitly converts the string to a number, whichever side it appears on
return ($something); implicitly converts $something to string if it is not scalar
The base pattern is the same to the one used in C: anything non-zero is true for the sake of boolean comparisons.
In this sense, an empty string or array is also false.
The hairy scalar to look out for is '0', which is (very inconveniently) treated as empty too because it gets converted to an integer. array(0) is just as thorny on the array front.
When using strict comparisons (=== and !==), things are a lot more sane. In practice, it's often a good idea to cast input coming from superglobals and the database as appropriate, and to use these operators from that point forward.
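One way to follow this advice is to validate and convert each input once, then use only strict operators afterwards. The sketch below uses filter_var with FILTER_VALIDATE_INT; the $input array is a hypothetical stand-in for a superglobal such as $_GET.

```php
<?php
// Hypothetical request data; in real code this would come from $_GET or a database row.
$input = ['page' => '2', 'limit' => '10abc'];

// filter_var returns an int on success and false on failure.
$page  = filter_var($input['page'], FILTER_VALIDATE_INT);
$limit = filter_var($input['limit'], FILTER_VALIDATE_INT);

var_dump($page === 2);       // bool(true): a real int from here on
var_dump($limit === false);  // bool(true): '10abc' is rejected instead of becoming 10

if ($limit === false) {
    $limit = 20;  // fall back to an illustrative default
}
```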
I look at it the following way:
PHP is designed as a web programming language and all the input of the pages is based on strings [human-like perception] [This, by the way, is also true for JavaScript]
Hence, all the strings which look like numbers (is_numeric() function), preliminary behave like numbers [comparison, casting].
That explains why extreme cases, like "0" are first implicitly thought to be cast to (int)0 and only then to false.

PHP array_key_exists - loose on some types, strict on another

Even being relatively well aware of PHP peculiarities, the following strange behaviour still got me confused today:
// loose
$a = array(true => 'foo');
var_dump(array_key_exists(1, $a));
// strict
$a = array('7.1' => 'foo');
var_dump(array_key_exists('7.10', $a));
I wonder what could be the technical reason of this effect, so the question is, what in the process behind this function is causing values of some types to be compared loosely while others are compared strictly? I'm not complaining about the behaviour, but trying to understand that, so there is no point for "PHP sucks" comments.
In your first case, a boolean value is not a valid array key, so it is immediately turned into a 1 when you initialize the array, making your search match.
In your second case, the array key is a string, and '7.1' is not the same string as '7.10'
In your second example, '7.1' and '7.10' are strings. They are compared as string, so they don't match.
Now why do you have a match in the first example? Array keys can be either strings or integer. So true is converted to integer, which evaluates as 1.
This is documented here. Note that, keys are integers or strings. Specific key casts are mentioned in the documentation, in particular (for your case) that bools are cast to integers (ie. true as 1 and false as 0). As noted elsewhere, your other examples are strings (remove the quotes to make them floats, which would then be truncated to integers as per the docs).
Maybe you could add the script output there? At first glance, though: a Boolean as an array key? I don't think that's going to help in any way! Second: 7.10 is not the same as 7.1, and declaring it in '' makes it a string...
If you want true as a key name, then you need to enclose it in either single or double quotes. In case you don't know about PHP and single/double quotes: they cause the contents to be treated as a string value rather than an Integer or Boolean (True/False).
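The key casting described in these answers happens when the array is built, which array_keys makes visible. A minimal sketch with illustrative keys:

```php
<?php
$boolKeys = [true => 'foo', false => 'bar'];
$strKeys  = ['7' => 'int-like', '07' => 'padded', '7.1' => 'decimal'];

var_dump(array_keys($boolKeys));  // int(1) and int(0): booleans became integers
var_dump(array_keys($strKeys));   // int(7), string "07", string "7.1"

// Lookups then compare against the stored keys; no further juggling of '7.10'.
$foundOne      = array_key_exists(1, $boolKeys);
$foundSeven    = array_key_exists(7, $strKeys);
$foundDecimal  = array_key_exists('7.1', $strKeys);
$foundTrailing = array_key_exists('7.10', $strKeys);

var_dump($foundOne, $foundSeven, $foundDecimal, $foundTrailing);  // true, true, true, false
```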
