Underlying philosophy behind php type comparisons - php

So there's this page on the php site which shows the result of comparing different values:
http://php.net/manual/en/types.comparisons.php
This is a helpful reference, but I would rather not have to visit this page every time I want to make sure that I'm doing type comparison right. So my question is
Is there some kind of underlying philosophy/reasoning behind the logic of type comparisons on PHP?
For example, I can see that for loose comparisons:
1, -1, "1" and "-1" can be treated as TRUE and 0 and "0" can be treated as FALSE;
Comparing the string value of a number against the number itself will yield TRUE;
but it becomes a bit hairy from then on trying to establish a pattern.

For casting directly to a boolean this is how it works.
All strings with a length > 0 are true, except the string "0"
All non 0 numbers are true
All non-empty arrays are true
All objects are true
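The casting rules above can be checked directly. This is a minimal sketch; the labels and the $values / $truthy names are only illustrative. It also shows the one string exception, "0", which casts to false even though its length is 1.

```php
<?php
// Label => value to cast. The labels are only for display.
$values = array(
    'empty string ""' => "",
    'string "0"'      => "0",
    'string "00"'     => "00",
    'string " "'      => " ",
    'int 0'           => 0,
    'float 0.0'       => 0.0,
    'int -1'          => -1,
    'array()'         => array(),
    'array(0)'        => array(0),
    'new stdClass'    => new stdClass(),
    'null'            => null,
);

$truthy = array();
foreach ($values as $label => $value) {
    // An explicit (bool) cast follows the same rules as if ($value).
    $truthy[$label] = (bool) $value;
    echo str_pad($label, 18), var_export($truthy[$label], true), "\n";
}
```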
Then these rules for comparing variables of the same type:
Objects are equivalent if their properties are equal
Arrays are equivalent if their keys and elements are equal
Strings are equivalent if they would produce the same output
Numbers are equivalent if they are mathematically equivalent
Booleans are equivalent if they have the same value.
For variables of different types, the type that is higher on the above list is cast to the one that is lower, then the comparison is made.
=== and !== operators don't cast prior to comparing but you should note objects are only === if they are the same instance.
The really odd one is arrays, they are === if they have the same keys and values defined in the same order.
$a = array("a"=>1, "b"=>2);
$b = array("b"=>2, "a"=>1);
$a == $b; // true
$a === $b; // false
and empty() is equivalent to !(bool)$var
EXCEPTIONS
Casting an array to a string will trigger a notice and unhelpfully cast as the text Array
Casting an object without a __toString method to a string will get you a fatal error.
Objects will not implicitly cast to an array, so any time you compare an object to an array it will yield false (UPDATE: confirmed that this is true even if the object implements the ArrayAccess interface)
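A short sketch of the object rules from this answer, using plain stdClass instances (the variable names are made up for the example): == compares class and properties, === only accepts the same instance, and an object never loosely equals an array.

```php
<?php
$a = new stdClass();
$a->x = 1;

$b = new stdClass();
$b->x = 1;

$c = $a; // a second handle to the same instance as $a

$looseEqual    = ($a == $b);                 // same class, equal properties
$strictEqual   = ($a === $b);                // two distinct instances
$sameInstance  = ($a === $c);                // one instance, two variables
$objectVsArray = ($a == array('x' => 1));    // object compared to an array

var_dump($looseEqual, $strictEqual, $sameInstance, $objectVsArray);
// bool(true) bool(false) bool(true) bool(false)
```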

For strict === comparison, the logic is easy: each value entity is equal only to itself, so TRUE === TRUE, "1" === "1", but "1" !== 1 etc.
When it comes to == comparison, unfortunately there is no rule of thumb nor a clear logic. This is probably because the various forms of the operator were implemented by different programmers, without a central design decision. The best I can do is provide you with this graph to print and stick over the monitor:
The key of the graph is: A == B will be TRUE if and only if A and B are of two types directly connected by a line in the graph above. For instance, array() == NULL is TRUE because array() and NULL are directly connected, while array() == 0 is FALSE because there is no line connecting the two.
Lines marked in red are the tricky (non obvious) equalities.
I've omitted that each entity will be equal to itself (e.g. "1" == "1" etc.) but that should be easy to remember.
As a final note, I'd like to explain why "php" == 0 is TRUE (non empty, non number string is equal to 0): because PHP casts "php" to a number before comparison and, since it's not a number, it defaults to 0 and makes the test TRUE.
Fun fact: there is no partition in this relation! If ever a transitive closure was allowed, you could easily say that True is False and False is True, destroying millennia of philosophy in four easy PHP statements :D
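The lack of transitivity can be shown with a chain that does not depend on the "php" == 0 case. That case behaves as described on the PHP versions this answer was written for; PHP 8.0 changed how numbers are compared with non-numeric strings. The sketch below only uses string, boolean and null comparisons; the variable names are illustrative.

```php
<?php
$zeroString = "0";

// "0" converts to boolean false, so the two are loosely equal.
$stringEqualsFalse = ($zeroString == false);

// null converts to boolean false as well.
$falseEqualsNull = (false == null);

// null is compared to a string as if it were "", and "0" is not "".
$stringEqualsNull = ($zeroString == null);

var_dump($stringEqualsFalse, $falseEqualsNull, $stringEqualsNull);
// bool(true) bool(true) bool(false)
```

So "0" == false and false == null both hold, yet "0" == null does not.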

If the value contains something then it can be said to be true. For example, 1, 1.123, array("value"), etc. are all treated as true.
If the value can be said to be empty or void (i.e. lacking something) then it is seen as false. For example, 0, 0.0, array(), and so on.
This way of thinking about variables is not special to PHP. Many other languages do it in the same or similar way. E.g. Perl, C and Javascript, just to name a few.

There is imo a very straightforward guideline and a bug in the specification, which might be confusing.
Strict comparison checks equality in datatype and value.
Loose comparison checks equality in value only.
For an object (not part of the comparison table) PHP is quite straightforward:
if the object is the same instance as the other one, then it is strictly equal; otherwise it might be loosely equal.
Therefore 0 and "0" are loosely equal to each other and to false (and to any string). The latter can be understood as follows: all strings are not numeric, hence false, and the number that is equal to false is 0, hence all strings are equal to 0.
The comparison between null and array() is more complicated. If you take an array created with array() and compare it loosely to null, the result is true (a strict comparison returns false). If however you check it with is_null, it returns false. I think the latter is more logical, because an array created with array() is not equal to '', whereas null is. I would call this functional inconsistency between the function is_null() and the == null check a bug, because two different valid methods of checking for a value should not return different results.
Null is also not an array according to the function is_array(), which is true. An empty array is an array according to the function is_array(), which should be true too. Hence it should never be true that null is equal to array().

There is no particular logic, but you can figure out some patterns.
"empty" values (null, false, 0, empty string and string '0') evaluate to false
comparison of numeric strings was done by implicitly converting them to integers until some version (there was a bug where two different long numeric strings counted as equal; it is now fixed)
when working with arrays, there is no difference between integer and numeric indexes, except when you call array_key_exists with explicit strict parameter
comparing number with string implicitly converts right argument to the type of the left one
return ($something); implicitly converts $something to string if it is not scalar

The base pattern is the same as the one used in C: anything non-zero is true for the sake of boolean comparisons.
In this sense, an empty string or array is also false.
The hairy scalar to look out for is '0', which is (very inconveniently) treated as empty too because it gets converted to an integer. array(0) is just as thorny on the array front.
When using strict comparisons (=== and !==), things are a lot more sane. In practice, it's often a good idea to cast input coming from superglobals and the database as appropriate, and to use these operators from that point forward.
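As a rough illustration of "cast first, then compare strictly": the helper below is hypothetical (pageFromQuery is not a PHP function) and assumes PHP 7 or later for the ?? operator and the return type. It reads a value from an array shaped like $_GET, casts it once, and from then on only === is used.

```php
<?php
// Hypothetical helper: $query stands in for a superglobal such as $_GET.
function pageFromQuery(array $query): int
{
    // Request values arrive as strings; fall back to '1' when missing.
    $raw = $query['page'] ?? '1';

    // Cast once at the boundary so later code deals with a real integer.
    return (int) $raw;
}

$page = pageFromQuery(array('page' => '0'));

// Strict comparison: no type juggling is involved from here on.
if ($page === 0) {
    echo "page zero requested\n";
}
```

Note that a non-numeric value such as 'abc' also casts to 0, so validation may still be needed before the cast.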

I look at it the following way:
PHP is designed as a web programming language and all the input of the pages is based on strings [human-like perception] [By the way, this is also true for JavaScript]
Hence, all the strings which look like numbers (is_numeric() function) primarily behave like numbers [comparison, casting].
That explains why extreme cases, like "0" are first implicitly thought to be cast to (int)0 and only then to false.
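A small sketch of that "strings that look like numbers behave like numbers" idea, using only string operands; the $checks array and its labels are illustrative.

```php
<?php
$checks = array(
    // Both sides are numeric strings, so they are compared as numbers.
    '"0" == "00"'    => ("0" == "00"),
    '"10" == "1e1"'  => ("10" == "1e1"),
    // Not numeric: compared as plain strings.
    '"abc" == "ABC"' => ("abc" == "ABC"),
    // Strict comparison never treats strings as numbers.
    '"0" === "00"'   => ("0" === "00"),
    // is_numeric() tells which strings qualify.
    'is_numeric("1e1")' => is_numeric("1e1"),
    'is_numeric("abc")' => is_numeric("abc"),
);

foreach ($checks as $label => $result) {
    echo str_pad($label, 20), var_export($result, true), "\n";
}
```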

Related

PHP algorithm (loose) equality comparison

There are some SO questions on this subject, but none of them answer it with an algorithm description, as it exists in JS (ECMAScript). It doesn't seem to exist in the PHP documentation.
I'm not a C developer and I could not even find the corresponding code in PHP sources. I will not sleep well anymore if I can't tell why (loose, ==) comparing a string/number/resource to an object/array seems to always return false?
Eg. why '' == [] is false, or why 'foo' == ['foo'] is false.
There are multiple pages in the PHP documentation dedicated to loose comparison with the == operator. For objects, see Comparing Objects:
When using the comparison operator (==), object variables are compared in a simple manner, namely: Two object instances are equal if they have the same attributes and values (values are compared with ==), and are instances of the same class.
For loose comparison between other types, see PHP type comparison tables.
I finally found an almost satisfying answer from this blog of a security expert (Gynvael) and by reading source code. From the former, I'm only quoting the parts that answer my initial question: why (loose, ==) comparing a string/number/resource to an object/array seems to always return false? The algorithm in charge of equivalent comparison (==) can be found here.
The main mechanics of the equality operator are implemented in the compare_function in php-src/Zend/zend_operators.c, however many cases call other functions or use big macros (which then call other functions that use even more macros), so reading this isn't too pleasant.
The operator basically works in two steps:
If both operands are of a type that the compare_function knows how to compare they are compared. This behavior includes the following pairs of types (please note the equality operator is symmetrical so comparison of A vs B is the same as B vs A):
• LONG vs LONG
• LONG vs DOUBLE (+ symmetrical)
• DOUBLE vs DOUBLE
• ARRAY vs ARRAY
• NULL vs NULL
• NULL vs BOOL (+ symmetrical)
• NULL vs OBJECT (+ symmetrical)
• BOOL vs BOOL
• STRING vs STRING
• and OBJECT vs OBJECT
In case the pair of types is not on the above list the compare_function tries to cast the operands to either the type of the second operand (in case of OBJECTs with cast_object handler), cast to BOOL (in case the second type is either NULL or BOOL), or cast to either LONG or DOUBLE in most other cases. After the cast the compare_function is rerun.
I think that all other cases return false.
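The observable results for a few of these type pairs can be checked as below. This only demonstrates outcomes, not the internal steps of compare_function; the $results array and its labels are made up for the example.

```php
<?php
$results = array(
    // STRING vs ARRAY is not a directly comparable pair: never equal.
    "'' == []"         => ('' == array()),
    "'foo' == ['foo']" => ('foo' == array('foo')),
    // NULL or BOOL on one side: the other operand is compared as a boolean.
    "null == []"       => (null == array()),
    "false == []"      => (false == array()),
    // LONG vs DOUBLE is compared numerically.
    "1 == 1.0"         => (1 == 1.0),
    // ARRAY vs ARRAY compares keys and values.
    "[1, 2] == [1, 2]" => (array(1, 2) == array(1, 2)),
);

foreach ($results as $label => $result) {
    echo str_pad($label, 20), var_export($result, true), "\n";
}
```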

Why is this twig statement behaving weird? [duplicate]

I have a problem baffling me terribly. I noticed this before but didn't give it any heed until today.
I was trying to write my own check for integer strings. I know of is_numeric(), but it does not suffice since it counts floats as numeric, not only integers, and of is_int(), which does not work on numeric strings.
I did something similar to this
$var1 = 'string';
$var2 = '123';
var_dump( (int)$var1 == $var1);// boolean true
var_dump((int)$var2 == $var2);// boolean true
var_dump((int)$var1);//int 0
var_dump($var1);//string 'string' (length=6)
As expected the second var dump outputs true since I expect with php's loose comparison that the string and integer versions be equal.
However with the first, I don't get why this is so. I have tried casting to bool and it still gives me the same result.
I have tried assigning the cast var to a new variable and comparing the two, still the same result
Is this something I am doing wrong or it is a php bug?
***Note
I am not comparing types here. I'm actually trying to take advantage of the fact that int 0 is not equal to string 'string'.
I wrote my integer check differently so I don't really need alternatives for that.
***Edit
I did some extra checking and it turns out that 0 == 'string' is true as well. How is that possible?
***Edit 2
There are multiple correct answers below to the question. Thanks to everyone who answered.
It's not a bug, it's a feature. Any string can be cast to an integer, but the cast will return 0 if the string doesn't start with an integer value. Also, when comparing an integer and a string, the string is cast to an integer and then the check is done against the two integers. Because of that rule, just about any random string is "equal" to zero. (To bypass this behavior, you should use strcmp, as it performs an explicit string comparison by casting anything passed to a string.)
To make sure I'm dealing with an integer, I would use is_numeric first, then convert the string to an int, and verify that the stringified int corresponds to the input value.
if (is_numeric($value) && strcmp((int)$value, $value) == 0)
{
// $value is an integer value represented as a string
}
According to php.net http://php.net/manual/en/language.operators.comparison.php:
var_dump(0 == "a"); // 0 == 0 -> true
So, I think it is juggling the types, and actually casting both sides to int. Then comparing either the sum of the ascii values or the ascii values of each respective index in the string.
First of all, in mathematics '=' is called transitive because (A=B and B=C => A=C) is valid.
This is not the case with PHPs "=="!
(int)$var1 == $var1
In that case PHP will cast 'string' to 0 - that's a convention.
Then the == operator will implicitly have the second operand 'string' cast to an integer as well -> also 0.
That leads to true.
You made an error with your post, the correct output is this:
bool(true)
bool(true)
int(0)
string(6) "string"
What happens is this:
1) Because you cast the variable to an integer, and you compare it to the string with a loose comparison ==, PHP will first implicitly cast the string to an integer; a more explicit but 100% equivalent form would be: if((int)$var1 == (int) $var1)
2) See 1), the same thing applies here
3) It prints int(0), as it should: because it fails to parse a number, the cast returns 0 instead.
4) Prints string(6) "string", as expected
Note: This answer is in response to a related question about the Twig template engine, that was marked as a duplicate, and redirects here.
Because the context is different, this answer is provided to those members of the SO community who may benefit from additional details specifically related to twig exclusively.
TL;DR: see this post How do the PHP equality (== double equals) and identity (=== triple equals) comparison operators differ?
Problem
Context
Twig template engine (latest version as of Fri 2017-01-27T05:12:25)
Scenario
DeveloperYlohEnrohK uses comparison operator in twig expression
DeveloperYlohEnrohK notices unexpected results when using equality comparison operator
Questions
Why does the equality comparison operator (==) produce unexpected results in Twig?
Why do the following produce different results?
{{ dump(0 == 'somekey') }} ==> true
{{ dump(0|lower == 'somekey') }} ==> false
Solution
Since Twig is based on PHP, the casting, implicit type-conversion and comparison semantics of PHP apply to Twig templates as well.
Unless DeveloperYlohEnrohK is intentionally and specifically leveraging the behavior of implicit type-conversion in PHP, the comparison expression will almost certainly produce counterintuitive and unexpected results.
This is a well-known circumstance that is described in detail in this SO post on PHP equality.
Solution: Just as is the case with standard PHP, unless the well-known circumstance is accounted for, using === in Twig is much less likely to produce unexpected results.
Pitfalls
As of this writing, the Twig template engine does not support === in the same way as standard PHP
Twig does have a substitute for === using same as
Because of this, the treatment of this well-known circumstance differs slightly between PHP and Twig.
See also
How do the PHP equality (== double equals) and identity (=== triple equals) comparison operators differ?
http://twig.sensiolabs.org/doc/2.x/tests/sameas.html
If you want to compare types of variables too you should use ===.
Here's a function that more rigorously tests for either an int or an integer string.
function isIntegerNumeric($val) {
    return (
        is_int($val)
        || (
            !is_float($val)
            && is_numeric($val)
            && strpos($val, ".") === false
        )
    );
}
It's relatively quick and avoids doing any string checking if it doesn't have to.
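One thing to keep in mind with the function above: is_numeric() also accepts exponent notation, so a string like "1e5" contains no "." and passes the check. A possible alternative, sketched here under the assumption that only plain decimal integer strings should pass, leans on filter_var() with FILTER_VALIDATE_INT. The name isIntegerString is made up for this example, and the bool return type assumes PHP 7 or later.

```php
<?php
// Hypothetical helper, not taken from the answers above.
function isIntegerString($val): bool
{
    // Real integers pass immediately.
    if (is_int($val)) {
        return true;
    }

    // Anything that is not a string (floats, arrays, null...) is rejected.
    if (!is_string($val)) {
        return false;
    }

    // filter_var() returns the parsed integer, or false on failure.
    // The strict !== is needed because "0" parses to int 0.
    return filter_var($val, FILTER_VALIDATE_INT) !== false;
}

var_dump(isIntegerString('123'));    // bool(true)
var_dump(isIntegerString('12.5'));   // bool(false)
var_dump(isIntegerString('1e5'));    // bool(false)
var_dump(isIntegerString('string')); // bool(false)
```

Integer strings outside the platform's integer range are rejected by this approach, which may or may not be what you want.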

Several PHP type-juggling comparisons, such as empty string and an empty array, return unexpected results

The triple equal I think everyone understands; my doubts are about the double equal. Please read the code below.
<?php
// function to improve readability
function compare($a,$b,$rep)
{
    if($a == $b)
        echo "$rep is true<br>";
    else
        echo "$rep is false<br>";
}
echo "this makes sense to me<br>";
compare(NULL,0,'NULL==0');
compare(NULL,"",'NULL==""');
compare(NULL,[],'NULL==[]');
compare(0,"",'0==""');
echo "now this is what I don't understand<br>";
compare("",[],'""==[]');
compare(0,[],'0==[]');
compare(0,"foo",'0=="foo"');
echo "if I cast to boolean then it makes sense again<br>";
compare("",(bool)[],'""==(bool)[]');
compare(0,(bool)[],'0==(bool)[]');
?>
Output:
this makes sense to me
NULL==0 is true
NULL=="" is true
NULL==[] is true
0=="" is true
now this is what I don't understand
""==[] is false
0==[] is false
0=="foo" is true
if I cast to boolean then it makes sense again
""==(bool)[] is true
0==(bool)[] is true
I would expect an empty array to be "equal" to an empty string or to the integer 0. And I wouldn't expect that the integer 0 would be "equal" to the string "foo". To be honest, I am not really understanding what PHP is doing behind the scenes. Can someone please explain to me what is going on here?
The simple answer is that this is the way php has been designed to work.
The outcomes are well defined in the docs comparison operators and comparison tables.
A == comparison between an array (your first two queries) and a string always results in false.
In a == comparison between a number and a string (your third query) the string is converted to a number and then a numeric comparison made. In the case of 0=='foo' the string 'foo' evaluates numerically to 0 and the test becomes 0==0 and returns true. If the string had been 'numeric' e.g. "3" then the result in your case would be false (0 not equal to 3).
Whether the design is "correct" (whatever that may mean) is arguable. It is certainly not always immediately obvious. An illustrative example of the potential fury of the debate can be found in Bug#54547 where the devs argue strongly that the design is rooted in php's history as a web language where everything is a string and should be left alone, and others argue php "violates the principle of least surprise".
To avoid uncertainty use === wherever possible, with the added benefit of potentially showing up assumptions in your code that may not be valid.
As someone has already said, the PHP automatic casting rules can be quite tricky, and it is worth using === unless you know both sides will be of the same type. However I believe I can explain this one:
""==[] (returns false)
The initial string "" indicates the comparison will be a string one, and thus [] is cast to a string. When that happens, the right hand side of the comparison will be set to the word Array. You are therefore doing this comparison:
"" == "Array" (returns false)
and thus false is the correct result.
Edit: a helpful comment below casts doubt on my answer via this live code example. I should be interested to see what other answers are supplied.

if($val) vs. if($val != "") vs. if(!empty($val)) -- which one?

I see a lot of people using a variety of different methods to check whether a variable is empty; there really seems to be no consensus. I've heard that if($foo) is exactly the same as if(!empty($foo)) or if($foo != ""). Is this true?
I realize it's a really simple question, but I'd really like to know. Are there any differences? Which method should I use?
Difference between bare test and comparison to empty string
if($foo != "") is equivalent to if($foo) most of the time, but not always.
To see where the differences are, consider the comparison operator behavior along with the conversion to string rules for the first case, and the conversion to boolean rules for the second case.
What I found out is that:
if $foo === array(), the if($foo != "") test will succeed (arrays are "greater than" strings), but the if($foo) test will fail (empty arrays convert to boolean false)
if $foo === "0" (a string), the if($foo != "") test will again succeed (obviously), but the if($foo) test will fail (the string "0" converts to boolean false)
if $foo is a SimpleXML object created from an empty tag, the if($foo != "") test will again succeed (objects are "greater than" strings), but the if($foo) test will fail (such objects convert to boolean false)
See the differences in action.
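The differences listed above can be reproduced with a small helper. This is only a sketch; threeTests is a made-up name, and the SimpleXML case is left out to keep the example dependency-free.

```php
<?php
// Runs the three tests from the question on one value.
function threeTests($foo)
{
    return array(
        'if($foo)'         => (bool) $foo,
        'if($foo != "")'   => ($foo != ""),
        'if(!empty($foo))' => !empty($foo),
    );
}

$samples = array(
    'empty string' => "",
    'string "0"'   => "0",
    'array()'      => array(),
    'string "a"'   => "a",
    'null'         => null,
);

foreach ($samples as $label => $value) {
    echo $label, "\n";
    foreach (threeTests($value) as $test => $passes) {
        echo '  ', str_pad($test, 18), var_export($passes, true), "\n";
    }
}
```

For "0" and array() the != "" test passes while the other two fail, which is the inconsistency described above.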
The better way to test
The preferred method to test is if(!empty($foo)), which is not exactly equal to the above in that:
It does not suffer from the inconsistencies of if($foo != "") (which IMHO is simply horrible).
It will not generate an E_NOTICE if $foo is not present in the current scope, which is its main advantage over if($foo).
There's a caveat here though: if $foo === '0' (a string of length 1) then empty($foo) will return true, which usually is (but may not always be) what you want. This is also the case with if($foo) though.
Sometimes you need to test with the identical operator
Finally, an exception to the above must be made when there is a specific type of value you want to test for. As an example, strpos might return 0 and also might return false. Both of these values will fail the if(strpos(...)) test, but they have totally different meanings. In these cases, a test with the identical operator is in order: if(strpos() === false).
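A minimal sketch of the strpos case: a match at position 0 and no match at all both look false to a bare if, so only the identical operator tells them apart. The contains helper and the sample strings are invented for the example.

```php
<?php
$haystack = 'php type juggling';

$atStart = strpos($haystack, 'php');  // found at offset 0
$missing = strpos($haystack, 'perl'); // not found

var_dump($atStart); // int(0)
var_dump($missing); // bool(false)

// Hypothetical helper: true when $needle occurs anywhere in $haystack.
function contains($haystack, $needle)
{
    // !== false keeps a match at offset 0 from being mistaken for "not found".
    return strpos($haystack, $needle) !== false;
}

var_dump(contains($haystack, 'php'));  // bool(true)
var_dump(contains($haystack, 'perl')); // bool(false)
```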
No it's not always true. When you do if($foo) PHP casts the variable to Boolean. An empty string, a Zero integer or an empty array will then be false. Sometimes this can be an issue.
You should always try to use the most specific comparison possible: if you're expecting a string which could be empty, use if($foo==='') (note the three equal signs). If you're expecting either (boolean) false or a resource (from a DB query for instance), use if($foo===false){...} else {...}
You may read the documentation about casting to boolean to find the answer to this question. There's a list in there with which values are converted to true and false, respectively.
Note that empty also checks if the variable is set, which regular comparison does not. An unset variable will trigger an error of type E_NOTICE during comparison, but not when using empty. You can work around this using the isset call before your comparison, like this:
if(isset($foo) && $foo != '')
if() "converts" the statement given to a bool, so taking a look at the documentation for boolean seems to be what you're looking for. in general:
empty strings (""), empty arrays (array()), zero (0) and boolean false (false) are treated as false
everything else ("foo", 1, array('foo'), true, ...) is treated as true
EDIT :
for more information, you could also check the type comparison tables.
empty($foo) should return true in all of these cases: 0,"", NULL.
For a more complete list check this page: http://php.net/manual/en/function.empty.php
No, it's not equal. When variable is not defined, expression without empty will generate notice about non-defined variable.

Why does comparison and empty() behave like this in PHP?

PHP:
$a = "0";
$b = "00";
var_dump(empty($a)); # True (wtf?)
var_dump($a == $b); # True... WTF???
var_dump(empty($b)); # False WWWTTTFFFF!!??
I've read the docs. But the docs don't give explanation as to why they designed it this way. I'm not looking for workarounds (I already know them), I'm looking for an explanation.
Why is it like this? Does this make certain things easier somehow?
As for "0" == "00" resolving to true, the answer lies in Comparison Operators:
If you compare an integer with a string, the string is converted to a number. If you compare two numerical strings, they are compared as integers. These rules also apply to the switch statement.
(emphasis added)
Both "0" and "00" are numerical strings so a numerical comparison is performed and obviously 0 == 0.
I'd suggest using === instead if you don't want any implicit type conversion.
As for empty():
The following things are considered to be empty:
"" (an empty string)
0 (0 as an integer)
"0" (0 as a string)
NULL
FALSE
array() (an empty array)
var $var; (a variable declared, but without a value in a class)
http://au2.php.net/empty
The following things are considered to be empty:
"0" (0 as a string)
but "00" will not be considered empty.
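To see the three results from the question next to their stricter alternatives, here is a short sketch (variable names are illustrative). The loose comparison treats both strings as numbers, while ===, strcmp() and an explicit === '' check treat them as text.

```php
<?php
$a = "0";
$b = "00";

$looseEqual  = ($a == $b);              // numeric strings: 0 == 0
$strictEqual = ($a === $b);             // different strings
$sameText    = (strcmp($a, $b) === 0);  // byte-wise string comparison

$emptyA = empty($a);   // "0" is on the list of empty values
$emptyB = empty($b);   // "00" is not
$blankA = ($a === ''); // explicit check for the empty string only

var_dump($looseEqual, $strictEqual, $sameText);
// bool(true) bool(false) bool(false)
var_dump($emptyA, $emptyB, $blankA);
// bool(true) bool(false) bool(false)
```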
It all stems from the language designers' goal of "doing the right thing".
That is, a given piece of code should do what the naive programmer or casual viewer of a piece of code would expect it to. This was not an easy goal to achieve.
PHP has avoided most of the worst pitfalls of other languages (like C's if (a = b) { ... or Perl's if ( "xxxx" == 0) { print "True!"; }).
The 0 == 0000 and if ("000") { echo "True!"; } are two of the few cases where code might not do exactly what you expect, but in practice it is seldom a problem. In my experience the "cure" of using the exact comparison operator === is the one thing guaranteed to have novice PHP programmers scratching their heads and searching the manual.
It has to do with what PHP considers empty, and, as #Shadow imagined, it's a dynamic typing issue. 0 and 00 are equal in PHP's eyes. Consider using the strict equality instead:
($a === $b) // is a equal to b AND the same type (strings)
Check the docs for empty http://us.php.net/empty. That should take care of the first and third lines.
For the second, it's because PHP is dynamically typed. The interpreter infers the type of the variables from the context in which you have used them. In this case the interpreter is probably thinking that you are trying to compare numbers, and converts the strings to ints before comparing.
From the documentation one can assume that 0 can be either an int or a 1 char string to signify empty. 00 would be more of a formatting assumption since there's no such thing as 00, but there is 0. 00 would be implying a 2 integer format, but the empty() function is only written for 0.
FWIW IANA php developer.
