Related
There are some SO questions on this subject, but none of them answers with a description of the algorithm, like the one that exists for JS (ECMAScript). Such a description doesn't seem to exist in the PHP documentation.
I'm not a C developer and I could not even find the corresponding code in the PHP sources. I will not sleep well anymore if I can't tell why (loose, ==) comparing a string/number/resource to an object/array seems to always return false.
E.g. why '' == [] is false, or why 'foo' == ['foo'] is false.
There are multiple pages in the PHP documentation dedicated to loose comparison with the == operator. For objects, see Comparing Objects:
When using the comparison operator (==), object variables are compared in a simple manner, namely: Two object instances are equal if they have the same attributes and values (values are compared with ==), and are instances of the same class.
For loose comparison between other types, see PHP type comparison tables.
I finally found an almost satisfying answer from this blog of a security expert (Gynvael) and by reading source code. From the former, I'm only quoting the parts that answer my initial question: why (loose, ==) comparing a string/number/resource to an object/array seems to always return false? The algorithm in charge of equivalent comparison (==) can be found here.
The main mechanics of the equality operator are implemented in the compare_function in php-src/Zend/zend_operators.c, however many cases call other functions or use big macros (which then call other functions that use even more macros), so reading this isn't too pleasant.
The operator basically works in two steps:
If both operands are of a type that the compare_function knows how to compare they are compared. This behavior includes the following pairs of types (please note the equality operator is symmetrical so comparison of A vs B is the same as B vs A):
• LONG vs LONG
• LONG vs DOUBLE (+ symmetrical)
• DOUBLE vs DOUBLE
• ARRAY vs ARRAY
• NULL vs NULL
• NULL vs BOOL (+ symmetrical)
• NULL vs OBJECT (+ symmetrical)
• BOOL vs BOOL
• STRING vs STRING
• and OBJECT vs OBJECT
In case the pair of types is not on the above list the compare_function tries to cast the operands to either the type of the second operand (in case of OBJECTs with cast_object handler), cast to BOOL (in case the second type is either NULL or BOOL), or cast to either LONG or DOUBLE in most other cases. After the cast the compare_function is rerun.
I think that all other cases return false.
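To make the two steps above concrete, here is a small sketch of the cases from the question. It assumes PHP 7.0 or later (for the anonymous class); the variable name $named is only illustrative.

```php
<?php
// Array vs. string or number: the pair is not on the list and the array
// cannot be cast to a number, so the comparison is never "equal".
var_dump('' == []);           // bool(false)
var_dump('foo' == ['foo']);   // bool(false)
var_dump(0 == []);            // bool(false)

// NULL and BOOL are the exceptions: the array is converted to bool first.
var_dump(null == []);         // bool(true)
var_dump(false == []);        // bool(true)
var_dump(true == ['foo']);    // bool(true)

// Object vs. string: the object is cast to a string when it defines __toString().
$named = new class {
    public function __toString(): string
    {
        return 'foo';
    }
};
var_dump($named == 'foo');    // bool(true)
```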
I have a problem baffling me terribly. I noticed this before but didn't give it any heed until today.
I was trying to write my own check for integer strings. I know of is_numeric(), but it does not suffice since it accepts floats as well as integers, and of is_int(), which does not work on numeric strings.
I did something similar to this
$var1 = 'string';
$var2 = '123';
var_dump( (int)$var1 == $var1);// boolean true
var_dump((int)$var2 == $var2);// boolean true
var_dump((int)$var1);//int 0
var_dump($var1);//string 'string' (length=6)
As expected, the second var_dump outputs true, since with PHP's loose comparison I expect the string and integer versions to be equal.
However, I don't get why the first one is true as well. I have tried casting to bool and it still gives me the same result.
I have tried assigning the cast value to a new variable and comparing the two; still the same result.
Is this something I am doing wrong, or is it a PHP bug?
***Note
I am not comparing types here. I'm actually trying to take advantage of the fact that int 0 is not equal to string 'string'.
I wrote my integer check differently so I don't really need alternatives for that.
***Edit
I did some extra checking and it turns out that 0 == 'string' is true as well. How is that possible?
***Edit 2
There are multiple correct answers below to the question. Thanks to everyone who answered.
It's not a bug, it's a feature. Any string can be cast to an integer, but the cast will return 0 if the string doesn't start with an integer value. Also, when comparing an integer and a string, the string is cast to an integer and then the check is done against the two integers. Because of that rule, just about any random string is "equal" to zero. (To bypass this behavior, you should use strcmp, as it performs an explicit string comparison by casting anything passed to a string.)
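A short sketch of the cast and comparison rules just described. One assumption worth flagging: the int-versus-string rule changed in PHP 8.0, where a non-numeric string is no longer cast to an integer for ==, so the example checks the running version instead of hard-coding a result. The variable names are illustrative.

```php
<?php
// Explicit casts read a leading integer and fall back to 0.
var_dump((int) '123');      // int(123)
var_dump((int) '12abc');    // int(12)
var_dump((int) 'string');   // int(0)

// Loose comparison of an int with a non-numeric string depends on the version:
// before PHP 8.0 the string is cast to int, from 8.0 on the int is cast to string.
$looseResult = (0 == 'string');
$expected    = PHP_MAJOR_VERSION < 8;
var_dump($looseResult === $expected);    // bool(true)

// strcmp() compares as strings and === compares type and value,
// so neither reports a match here.
var_dump(strcmp('0', 'string') === 0);   // bool(false)
var_dump(0 === 'string');                // bool(false)
```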
To make sure I'm dealing with an integer, I would use is_numeric first, then convert the string to an int, and verify that the stringified int corresponds to the input value.
if (is_numeric($value) && strcmp((int)$value, $value) == 0)
{
// $value is an integer value represented as a string
}
According to php.net http://php.net/manual/en/language.operators.comparison.php:
var_dump(0 == "a"); // 0 == 0 -> true
So, I think it is juggling the types: the string is cast to an integer, and "a" becomes 0 because it does not start with a digit. The two integers are then compared; no ASCII values are involved.
First of all, in mathematics '=' is called transitive because (A=B and B=C => A=C) is valid.
This is not the case with PHP's "=="!
(int)$var1 == $var1
In that case PHP will cast 'string' to 0; that's a convention.
Then the == operator will implicitly cast the second operand, 'string', to an integer as well, which also gives 0.
That leads to true.
That leads to true.
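The lack of transitivity can be shown with three values whose results, as far as I know, are the same in PHP 7 and PHP 8. This is only an illustration; the variable names are arbitrary.

```php
<?php
$a = '0';
$b = false;
$c = '';

var_dump($a == $b);   // bool(true):  '0' converts to false
var_dump($b == $c);   // bool(true):  '' converts to false
var_dump($a == $c);   // bool(false): two strings, compared as strings
```

So A == B and B == C hold, yet A == C does not.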
You made an error with your post, the correct output is this:
bool(true)
bool(true)
int(0)
string(6) "string"
What happens is this:
1) Because you cast the variable to an integer and compare it to the original string with a loose comparison (==), PHP will first implicitly cast the string to an integer. A more explicit but 100% equivalent form would be: if((int)$var1 == (int) $var1)
2) See 1), the same thing applies here.
3) It prints int(0), as it should: because it fails to parse a number, the cast returns 0 instead.
4) It prints string(6) "string", as expected.
Note: This answer is in response to a related question about the Twig template engine, that was marked as a duplicate, and redirects here.
Because the context is different, this answer is provided to those members of the SO community who may benefit from additional details specifically related to twig exclusively.
TL;DR: see this post How do the PHP equality (== double equals) and identity (=== triple equals) comparison operators differ?
Problem
Context
Twig template engine (latest version as of Fri 2017-01-27T05:12:25)
Scenario
DeveloperYlohEnrohK uses comparison operator in twig expression
DeveloperYlohEnrohK notices unexpected results when using equality comparison operator
Questions
Why does the equality comparison operator (==) produce unexpected results in Twig?
Why do the following produce different results?
{{ dump(0 == 'somekey') }} ==> true
{{ dump(0|lower == 'somekey') }} ==> false
Solution
Since Twig is based on PHP, the casting, implicit type-conversion and comparison semantics of PHP apply to Twig templates as well.
Unless DeveloperYlohEnrohK is intentionally and specifically leveraging the behavior of implicit type-conversion in PHP, the comparison expression will almost certainly produce counterintuitive and unexpected results.
This is a well-known circumstance that is described in detail in this SO post on PHP equality.
Solution: Just as is the case with standard PHP, unless the well-known circumstance is accounted for, using === in Twig is much less likely to produce unexpected results.
Pitfalls
As of this writing, the Twig template engine does not support === in the same way as standard PHP
Twig does have a substitute for === using same as
Because of this, the treatment of this well-known circumstance differs slightly between PHP and Twig.
See also
How do the PHP equality (== double equals) and identity (=== triple equals) comparison operators differ?
http://twig.sensiolabs.org/doc/2.x/tests/sameas.html
If you want to compare types of variables too you should use ===.
Here's a function that more rigorously tests for either an int or an integer string.
function isIntegerNumeric($val) {
return (
is_int($val)
|| (
!is_float($val)
&& is_numeric($val)
&& strpos($val, ".") === false
)
);
}
It's relatively quick and avoids doing any string checking if it doesn't have to.
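A few sample calls may help to see what the function accepts. The function body is copied from above so the snippet runs on its own; the inputs are just examples. One caveat that follows from the code: exponent notation such as '1e3' contains no "." and passes is_numeric(), so it is accepted as well.

```php
<?php
function isIntegerNumeric($val) {
    return (
        is_int($val)
        || (
            !is_float($val)
            && is_numeric($val)
            && strpos($val, ".") === false
        )
    );
}

var_dump(isIntegerNumeric(123));      // bool(true)
var_dump(isIntegerNumeric('123'));    // bool(true)
var_dump(isIntegerNumeric('-7'));     // bool(true)
var_dump(isIntegerNumeric(12.5));     // bool(false)
var_dump(isIntegerNumeric('12.5'));   // bool(false)
var_dump(isIntegerNumeric('abc'));    // bool(false)
var_dump(isIntegerNumeric('1e3'));    // bool(true): caveat, no "." in the string
```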
So there's this page on the php site which shows the result of comparing different values:
http://php.net/manual/en/types.comparisons.php
This is a helpful reference, but I would rather not have to visit this page every time I want to make sure that I'm doing type comparison right. So my question is
Is there some kind of underlying philosophy/reasoning behind the logic of type comparisons on PHP?
For example, I can see that for loose comparisons:
1, -1, "1" and "-1" can be treated as TRUE and 0 and "0" can be treated as FALSE;
Comparing the string value of a number against the number itself will yield TRUE;
but it becomes a bit hairy from then on trying to establish a pattern.
For casting directly to a boolean this is how it works.
All strings with a length > 0 are true, except the string "0"
All non 0 numbers are true
All non-empty arrays are true
All objects are true
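The boolean casting rules can be checked quickly with explicit casts. This is a minimal sketch; note that the string "0" is the one non-empty string that casts to false.

```php
<?php
var_dump((bool) '');               // bool(false)
var_dump((bool) '0');              // bool(false): the only non-empty string that is false
var_dump((bool) '0.0');            // bool(true)
var_dump((bool) ' ');              // bool(true)
var_dump((bool) 0);                // bool(false)
var_dump((bool) 0.0);              // bool(false)
var_dump((bool) -1);               // bool(true)
var_dump((bool) []);               // bool(false)
var_dump((bool) [0]);              // bool(true)
var_dump((bool) new stdClass());   // bool(true)
```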
Then these rules for comparing variables of the same type:
Objects are equivalent if their properties are equal
Arrays are equivalent if their keys and elements are equal
Strings are equivalent if they would produce the same output
Numbers are equivalent if they are mathematically equivalent
Booleans are equivalent if they have the same value.
For variables of different types, the type that is higher on the above list is cast to the one that is lower, then the comparison is made.
=== and !== operators don't cast prior to comparing but you should note objects are only === if they are the same instance.
The really odd one is arrays, they are === if they have the same keys and values defined in the same order.
$a = array("a"=>1, "b"=>2);
$b = array("b"=>2, "a"=>1);
$a == $b; // true
$a === $b; // false
and empty() is equivalent to !(bool)$var
EXCEPTIONS
Casting an array to a string will trigger a notice and unhelpfully cast as the text Array
Casting an object without a __toString method to a string will get you a fatal error.
Objects will not implicitly cast to an array, so any time you compare an object to an array it will yield false (UPDATE: confirmed that this is true even if the object implements the ArrayAccess interface)
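The last exception can be illustrated with a plain stdClass object. This is a small sketch with arbitrary variable names: the loose comparison fails until the object is cast to an array explicitly.

```php
<?php
$object = new stdClass();
$object->a = 1;
$array = ['a' => 1];

var_dump($object == $array);           // bool(false): no implicit object-to-array cast
var_dump((array) $object == $array);   // bool(true):  explicit cast, then array comparison
```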
For strict === comparison, the logic is easy: each value entity is equal only to itself, so TRUE === TRUE, "1" === "1", but "1" !== 1 etc.
When it comes to == comparison, unfortunately there is no rule of thumb nor a clear logic. This is probably because the various forms of the operator were implemented by different programmers, without a central design decision. The best I can do is to provide you with this graph to print and stick over the monitor:
The key of the graph is: A == B will be TRUE if and only if A and B are of two types directly connected by a line in the graph above. For instance, array() == NULL is TRUE because array() and NULL are directly connected, while array() == 0 is FALSE because there is no line connecting the two.
Lines marked in red are the tricky (non obvious) equalities.
I've omitted that each entity will be equal to itself (e.g. "1" == "1" etc.) but that should be easy to remember.
As a final note, I'd like to explain why "php" == 0 is TRUE (a non-empty, non-numeric string is equal to 0): because PHP casts "php" to a number before the comparison and, since it's not a number, it defaults to 0 and makes the test TRUE.
Fun fact: there is no partition in this relation! If a transitive closure were ever allowed, you could easily say that True is False and False is True, destroying millennia of philosophy in four easy PHP statements :D
If the value contains something then it can be said to be true. For example, 1, 1.123, array("value"), etc. are all treated as true.
If the value can be said to be empty or void (i.e. lacking something) then it is seen as false. For example, 0, 0.0, array(), and so on.
This way of thinking about variables is not special to PHP. Many other languages do it in the same or similar way. E.g. Perl, C and Javascript, just to name a few.
There is imo a very straightforward guideline and a bug in the specification, which might be confusing.
Strict comparison checks equality in datatype and value.
Loose comparison checks equality in value only.
For objects (not part of the comparison table), PHP is quite straightforward:
if the object is the same instance as the other one, then it is strictly equal; otherwise it might be loosely equal.
Therefore 0 and "0" are loosely equal to each other and to false (and 0 to any non-numeric string). The latter can be understood as follows: such strings are not numeric, hence false, and the number that is equal to false is 0, hence all of these strings are equal to 0.
The comparison between null and array() is more complicated. If you compare an array created with array() to null loosely, it returns true (strictly, it returns false). If you check it with is_null(), however, it returns false. I think the latter is more logical, because an array created with array() is not equal to '', whereas null is. I would call this inconsistency between is_null() and the check == null a bug, because two different valid methods of checking for a value should not return different results.
Null is also not an array according to is_array(), which is correct. An empty array is an array according to is_array(), which is correct too. Hence it should never be true that null is equal to array().
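The checks discussed in this answer can be put side by side. This is a minimal sketch; $empty is just an illustrative name.

```php
<?php
$empty = array();

var_dump($empty == null);     // bool(true):  both sides convert to false
var_dump($empty === null);    // bool(false): different types
var_dump(is_null($empty));    // bool(false)
var_dump(is_array($empty));   // bool(true)
var_dump($empty == '');       // bool(false)
var_dump(null == '');         // bool(true)
```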
There is no particular logic, but you can figure out some patterns.
"empty" values (null, false, 0, empty string and string '0') evaluate to false
comparison of numeric values was done by implicitly converting them to integers until some version (there was a bug where two different long numeric strings counted as equal; it is now fixed)
when working with arrays, there is no difference between integer and numeric indexes, except when you call array_key_exists with explicit strict parameter
comparing number with string implicitly converts right argument to the type of the left one
return ($something); implicitly converts $something to string if it is not scalar
The base pattern is the same to the one used in C: anything non-zero is true for the sake of boolean comparisons.
In this sense, an empty string or array is also false.
The hairy scalar to look out for is '0', which is (very inconveniently) treated as empty too because it gets converted to an integer. array(0) is just as thorny on the array front.
When using strict comparisons (=== and !==), things are a lot more sane. In practice, it's often a good idea to cast input coming from superglobals and the database as appropriate, and to use these operators from that point forward.
I look at it the following way:
PHP is designed as a web programming language, and all page input is based on strings [human-like perception] [by the way, this is also true for JavaScript].
Hence, all strings which look like numbers (see the is_numeric() function) primarily behave like numbers [comparison, casting].
That explains why extreme cases like "0" are implicitly cast to (int)0 first, and only then to false.
Just discovered that type-hinting is allowed in PHP, but not for ints, strings, bools, or floats.
Why does PHP not allow type hinting for types like integer, strings, ... ?
As of PHP 7.0 there is support for bool, int, float and string. Support was added in the scalar types RFC. There are two modes: weak and strict. I recommend reading the RFC to fully understand the differences, but essentially type conversions only happen in weak mode.
Given this code:
function mul2(int $x) {
return $x * 2;
}
mul2("1");
Under weak mode (which is the default mode) this will convert "1" to 1.
To enable strict types add declare(strict_types=1); to the top of the file. Under strict mode an error will be emitted:
Fatal error: Argument 1 passed to mul2() must be of the type integer, string given
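Here is a sketch of the weak-mode behaviour, assuming PHP 7.0 or later and a file without declare(strict_types=1). It also shows that weak mode still rejects a string that is not numeric at all; the $caught flag is only there for illustration.

```php
<?php
// No declare(strict_types=1) here, so calls made from this file use weak mode.
function mul2(int $x) {
    return $x * 2;
}

var_dump(mul2("1"));   // int(2): the numeric string is converted to 1

// A non-numeric string cannot be converted, so a TypeError is thrown even in weak mode.
$caught = false;
try {
    mul2("abc");
} catch (TypeError $e) {
    $caught = true;
}
var_dump($caught);     // bool(true)
```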
PHP's loosely typed, where your "primitive" types are automatically type-juggled based on the context in which they're used. Type-hinting wouldn't really change that, since a string could be used as an int, or vice versa. Type-hinting would only really be helpful for complex types like arrays and objects, which can't be cleanly juggled as ints, strings, or other primitives.
To put it another way, since PHP has no concept of specific types, you couldn't require an int somewhere because it doesn't know what an int really is. On the other hand, an object is of a certain type, since a MyClass is not interchangeable with a MyOtherClass.
Just for reference, here's what happens when you try to convert between such types (not an exhaustive list):
Converting to Object (ref)
"If an object is converted to an object, it is not modified. If a value of any other type is converted to an object, a new instance of the stdClass built-in class is created. If the value was NULL, the new instance will be empty. Arrays convert to an object with properties named by keys, and corresponding values. For any other value, a member variable named scalar will contain the value."
Object to int/float (ref)
undefined behavior
Object to boolean (ref)
in PHP5, always TRUE
Object to string (ref)
The object's __toString() magic method will be called, if applicable.
Object to array (ref)
"If an object is converted to an array, the result is an array whose elements are the object's properties. The keys are the member variable names, with a few notable exceptions: integer properties are unaccessible; private variables have the class name prepended to the variable name; protected variables have a '*' prepended to the variable name. These prepended values have null bytes on either side. This can result in some unexpected behaviour."
Array to int/float (ref)
undefined behavior
Array to boolean (ref)
If the array is empty (i.e., no elements), it evaluates to FALSE -- otherwise, TRUE.
Array to string (ref)
the string "Array"; use print_r() or var_dump() to print the contents of an array
Array to object (ref)
"Arrays convert to an object with properties named by keys, and corresponding values."
It is not proper to call it "type hinting". "Hinting" implies it's some optional typing, a mere hint rather than requirement, however typed function parameters are not optional at all - if you give it wrong type, you get the fatal error. Calling it "type hinting" was a mistake.
Now to the reasons why there's no primitive typing for function parameters in PHP. PHP does not have a barrier between primitive types - i.e., string, integer, float, bool - are more or less interchangeable, you can have $a = "1"; and then echo $a+3; and get 4. All internal functions also work this way - if the function expects a string and you pass an integer, it's converted to string, if the function expects a float and gets an integer, it is converted to float, etc. This is unlike object types - there's no conversion between, say, SimpleXMLElement and DirectoryIterator - neither there could be, it wouldn't make any sense.
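The interchangeability described here can be seen with a few built-in functions. This sketch assumes the default weak mode (no declare(strict_types=1) in the file); the chosen functions are just examples.

```php
<?php
$a = "1";
var_dump($a + 3);                  // int(4)

// Internal functions convert scalar arguments to the type they expect.
var_dump(strlen(12345));           // int(5): the integer is converted to "12345"
var_dump(str_repeat("ab", "3"));   // string(6) "ababab"
var_dump(sqrt("16"));              // float(4)
```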
Thus, if you introduce a function that accepts integer 1 and not string 1, you create incompatibility between internal functions and user functions and create problems for any code that assumes those are pretty much the same. This would be a big change in how PHP programs behave, and this change will need to be propagated through all the code using such functions - since otherwise you risk errors when transitioning between "strict" and "non-strict" code. This would imply a necessity of typed variables, typed properties, typed return values, etc. - big change. And since PHP is not a compiled language, you do not get benefits of static type control with the downsides of it - you only get inconveniences but not the added safety. This is the reason why parameter typing is not accepted in PHP.
There is another option - coercive typing, i.e. behavior analogous to what internal functions do - convert between types. Unfortunately, this one does not satisfy the proponents of strict typing, and thus no consensus is found so far, and thus none is there.
On the other hand, object types were never controversial - it is clear that there's no conversion between them, no code assumes they are interchangeable and checks can only be strict for them, and it is the case with both internal and external functions. Thus, introducing strict object types was not a problem.
I have a colleague who constantly assigns variables and forces their type. For example, he would declare something like so:
$this->id = (int)$this->getId();
or when returning he will always return values as such:
return (int)$id;
I understand that PHP is a loosely typed language, so I am not asking what the casting is doing. I am really wondering what the benefits of doing this are, if any, or if he is just wasting time and effort.
There are a few benefits.
Type-checking. Without type-checking, 0 == false and 1 == true.
Sanitizing. If you're inserting the value into an SQL query, you can't have SQL injection through that value, because the cast always yields an integer (non-numeric strings are converted to zero).
Integrity. It prevents inserting invalid database data. Again, it converts to zero, so you won't be trying to insert a string into a integer field in a database.
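What the cast actually yields for a few inputs is sketched below. The input strings are made up for illustration. Note that the cast does not reject bad input; it silently keeps a leading integer or falls back to 0.

```php
<?php
var_dump((int) "42");                    // int(42)
var_dump((int) "42 OR 1=1");             // int(42): only the leading digits survive
var_dump((int) "1; DROP TABLE users");   // int(1)
var_dump((int) "abc");                   // int(0)

// After the cast, strict comparisons behave predictably.
$id = (int) "0";
var_dump($id === 0);       // bool(true)
var_dump($id === false);   // bool(false)
```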
To explicitly convert a value to integer, use either the (int) or (integer) casts. However, in most cases the cast is not needed, since a value will be automatically converted if an operator, function or control structure requires an integer argument. A value can also be converted to integer with the intval() function.
http://www.php.net/manual/en/language.types.integer.php#language.types.integer.casting
PHP does not require (or support) explicit type definition in variable declaration; a variable's type is determined by the context in which the variable is used. That is to say, if a string value is assigned to variable $var, $var becomes a string. If an integer value is then assigned to $var, it becomes an integer.
An example of PHP's automatic type conversion is the addition operator '+'. If either operand is a float, then both operands are evaluated as floats, and the result will be a float. Otherwise, the operands will be interpreted as integers, and the result will also be an integer. Note that this does not change the types of the operands themselves; the only change is in how the operands are evaluated and what the type of the expression itself is.
http://www.php.net/manual/en/language.types.type-juggling.php
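The addition rule from the quoted manual text can be tried directly. This is a small sketch; $text and $sum are illustrative names.

```php
<?php
var_dump(1 + 2);       // int(3)
var_dump(1 + 1.5);     // float(2.5): one float operand makes the result a float
var_dump("1" + "2");   // int(3):     numeric strings are evaluated as integers
var_dump("1.5" + 1);   // float(2.5)

// The operands themselves keep their types.
$text = "10";
$sum  = $text + 5;
var_dump($sum);        // int(15)
var_dump($text);       // string(2) "10"
```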
You can do something like this instead.
function hello($foo, $bar) {
assert(is_int($foo));
assert(is_int($bar));
}
http://php.net/manual/en/function.assert.php
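Since assertions can be switched off through configuration (the zend.assertions setting in PHP 7 and later), a variant that throws an exception may be preferable when the check must always run. The following is only a sketch of that idea; the return value and the error message are invented for the example.

```php
<?php
function hello($foo, $bar) {
    // Check each argument explicitly instead of relying on assert().
    foreach (['foo' => $foo, 'bar' => $bar] as $name => $value) {
        if (!is_int($value)) {
            throw new InvalidArgumentException("\$$name must be an integer");
        }
    }
    return $foo + $bar;   // hypothetical body, just so there is a result
}

var_dump(hello(1, 2));    // int(3)

$rejected = false;
try {
    hello("1", 2);        // a numeric string is not an int
} catch (InvalidArgumentException $e) {
    $rejected = true;
}
var_dump($rejected);      // bool(true)
```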