This is my code:
$a = 5;
$b = &$a;
echo ++$a.$b++;
Shouldn't it print 66?
Why does it print 76?
Alright. This is actually pretty straightforward behavior, and it has to do with how references work in PHP. It is not a bug, just unexpected behavior.
PHP internally uses copy-on-write, which means that a variable's underlying value is only copied when you write to it (so $a = $b; doesn't copy memory until you actually change one of them). With references, it never actually copies. That's important for later.
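As a small illustration of the difference (a sketch only; the variable names are made up for this example), compare a plain assignment with a reference assignment:

```php
<?php
// Plain assignment: $copy may share storage with $original at first, but the
// first write to $copy separates it, so $original is left untouched.
$original = 5;
$copy = $original;
$copy++;
echo $original, ' ', $copy, "\n"; // 5 6

// Reference assignment: both names are bound to the same value, so a write
// through either name is visible through the other. Nothing is separated.
$source = 5;
$alias = &$source;
$alias++;
echo $source, ' ', $alias, "\n"; // 6 6
```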
Let's look at those opcodes:
line  #  *  op          fetch  ext  return  operands
---------------------------------------------------------------------------------
   2  0  >  ASSIGN                          !0, 5
   3  1     ASSIGN_REF                      !1, !0
   4  2     PRE_INC                 $2      !0
      3     POST_INC                ~3      !1
      4     CONCAT                  ~4      $2, ~3
      5     ECHO                            ~4
      6  >  RETURN                          1
The first two should be pretty easy to understand.
ASSIGN - Basically, we're assigning the value of 5 into the compiled variable named !0.
ASSIGN_REF - We're creating a reference from !0 to !1 (the direction doesn't matter)
So far, that's straightforward. Now comes the interesting bit:
PRE_INC - This is the opcode that actually increments the variable. Of note is that it returns its result into a temporary variable named $2.
So let's look at the source code behind PRE_INC when called with a variable:
static int ZEND_FASTCALL ZEND_PRE_INC_SPEC_VAR_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
    USE_OPLINE
    zend_free_op free_op1;
    zval **var_ptr;

    SAVE_OPLINE();
    var_ptr = _get_zval_ptr_ptr_var(opline->op1.var, execute_data, &free_op1 TSRMLS_CC);

    if (IS_VAR == IS_VAR && UNEXPECTED(var_ptr == NULL)) {
        zend_error_noreturn(E_ERROR, "Cannot increment/decrement overloaded objects nor string offsets");
    }
    if (IS_VAR == IS_VAR && UNEXPECTED(*var_ptr == &EG(error_zval))) {
        if (RETURN_VALUE_USED(opline)) {
            PZVAL_LOCK(&EG(uninitialized_zval));
            AI_SET_PTR(&EX_T(opline->result.var), &EG(uninitialized_zval));
        }
        if (free_op1.var) {zval_ptr_dtor(&free_op1.var);};
        CHECK_EXCEPTION();
        ZEND_VM_NEXT_OPCODE();
    }

    SEPARATE_ZVAL_IF_NOT_REF(var_ptr);

    if (UNEXPECTED(Z_TYPE_PP(var_ptr) == IS_OBJECT)
       && Z_OBJ_HANDLER_PP(var_ptr, get)
       && Z_OBJ_HANDLER_PP(var_ptr, set)) {
        /* proxy object */
        zval *val = Z_OBJ_HANDLER_PP(var_ptr, get)(*var_ptr TSRMLS_CC);
        Z_ADDREF_P(val);
        fast_increment_function(val);
        Z_OBJ_HANDLER_PP(var_ptr, set)(var_ptr, val TSRMLS_CC);
        zval_ptr_dtor(&val);
    } else {
        fast_increment_function(*var_ptr);
    }

    if (RETURN_VALUE_USED(opline)) {
        PZVAL_LOCK(*var_ptr);
        AI_SET_PTR(&EX_T(opline->result.var), *var_ptr);
    }

    if (free_op1.var) {zval_ptr_dtor(&free_op1.var);};
    CHECK_EXCEPTION();
    ZEND_VM_NEXT_OPCODE();
}
Now I don't expect you to understand what that's doing right away (this is deep engine voodoo), but let's walk through it.
The first two if statements check to see if the variable is "safe" to increment (the first checks to see if it's an overloaded object, the second checks to see if the variable is the engine's special error placeholder, EG(error_zval)).
Next is the really interesting bit for us. Since we're modifying the value, it needs to perform copy-on-write. So it calls:
SEPARATE_ZVAL_IF_NOT_REF(var_ptr);
Now, remember, we already set the variable to be a reference above. So the variable is not separated... Which means everything we do to it here will happen to $b as well...
Next, the variable is incremented (fast_increment_function()).
Finally, it sets the result as itself. This is copy-on-write again. It's not returning the value of the operation, but the actual variable. So what PRE_INC returns is still a reference to $a and $b.
POST_INC - This behaves similarly to PRE_INC, except for one VERY important fact.
Let's check out the source code again:
static int ZEND_FASTCALL ZEND_POST_INC_SPEC_VAR_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
    retval = &EX_T(opline->result.var).tmp_var;
    ZVAL_COPY_VALUE(retval, *var_ptr);
    zendi_zval_copy_ctor(*retval);
    SEPARATE_ZVAL_IF_NOT_REF(var_ptr);
    fast_increment_function(*var_ptr);
}
This time I cut away all of the non-interesting stuff. So let's look at what it's doing.
First, it gets the return temporary variable (~3 in our code above).
Then it copies the value from its argument (!1 or $b) into the result (and hence the reference is broken).
Then it increments the argument.
Now remember, the argument !1 is the variable $b, which has a reference to !0 ($a) and $2, which if you remember was the result from PRE_INC.
So there you have it. It returns 76 because the reference is maintained in PRE_INC's result.
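The sequence can be imitated in plain PHP. This is only a sketch of the explanation above, not what the engine literally runs: $preResult and $postResult are made-up names standing in for $2 and ~3, and binding $preResult by reference is an assumption used to mimic the fact that PRE_INC's result is still the variable itself.

```php
<?php
$a = 5;
$b = &$a;

++$a;                 // PRE_INC: the shared value becomes 6
$preResult = &$a;     // stand-in for $2: still bound to the shared value

$postResult = $b;     // POST_INC, step 1: copy the current value (6)
$b++;                 // POST_INC, step 2: the shared value becomes 7

// CONCAT reads $preResult only now, after the second increment.
echo $preResult . $postResult, "\n"; // 76
```

Because $preResult is only read at concatenation time, it already shows 7, while $postResult kept its copied 6.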
We can prove this by forcing a copy, by assigning the pre-inc to a temporary variable first (through normal assignment, which will break the reference):
$a = 5;
$b = &$a;
$c = ++$a;
$d = $b++;
echo $c.$d;
Which works as you expected: it prints 66.
And we can reproduce the other behavior (your bug) by introducing a function to maintain the reference:
function &pre_inc(&$a) {
return ++$a;
}
$a = 5;
$b = &$a;
$c = &pre_inc($a);
$d = $b++;
echo $c.$d;
Which works as you're seeing it (76).
Note: the only reason for the separate function here is that PHP's parser doesn't like $c = &++$a;. So we need to add a level of indirection through the function call to do it...
The reason I don't consider this a bug is that it's how references are supposed to work. Pre-incrementing a referenced variable will return that variable. Even a non-referenced variable should return that variable. It may not be what you expect here, but it works quite well in almost every other case...
The Underlying Point
If you're using references, you're doing it wrong about 99% of the time. So don't use references unless you absolutely need them. PHP is a lot smarter than you may think at memory optimizations. And your use of references really hinders how it can work. So while you think you may be writing smart code, you're really going to be writing less efficient and less friendly code the vast majority of the time...
And if you want to know more about References and how variables work in PHP, check out One Of My YouTube Videos on the subject...
I think the full concatenation is evaluated first, and only then is the result sent to echo.
For example:
$a = 5;
$b = &$a;
echo ++$a.$b++;
// output 76
$a = 5;
$b = &$a;
echo ++$a;
echo $b++;
// output 66
EDIT: Also very important, $b is equal to 7, but echoed before adding:
$a = 5;
$b = &$a;
echo ++$a.$b++; //76
echo $b;
// output 767
EDIT: adding Corbin example: https://eval.in/34067
There's obviously a bug in PHP. If you execute this code:
<?php
{
$a = 5;
echo ++$a.$a++;
}
echo "\n";
{
$a = 5;
$b = &$a;
echo ++$a.$b++;
}
echo "\n";
{
$a = 5;
echo ++$a.$a++;
}
You get:
66
76
76
Which means the same block (1st and 3rd one are identical) of code doesn't always return the same result. Apparently the reference and the increment are putting PHP in a bogus state.
https://eval.in/34023
Related
I need to render some Smarty template for different values inside a loop (a PHP loop, not a Smarty foreach), in the following way (just an example):
$a = 0;
$b = 0;
$output = "";
$tmpl = "$a, $b";
$smarty->assignByRef('a', $a['a']);
$smarty->assignByRef('b', $b['b']);
for ($i = 0; $i < 10; ++$i) {
++$a;
++$b;
$output .= $smarty->fetch("string:" . $tmpl);
}
My doubt is about assignByRef. The Smarty v3 docs say:
With the introduction of PHP5, assignByRef() is not necessary for most
intents and purposes. assignByRef() is useful if you want a PHP array
index value to be affected by its reassignment from a template.
Assigned object properties behave this way by default.
but I don't fully understand what that technical note means. So, can I use assignByRef that way or not? Or will using just assign produce the same output?
In PHP 4, objects were passed by value unless the user explicitly requested a reference by prepending an ampersand: &$variable. For this reason, function arguments that were likely to consume a large amount of memory were passed by reference in order to optimize memory usage:
function f(&$huge) {
// ...
}
In PHP 5, objects are effectively passed by reference even if the user didn't specify it explicitly (no ampersand is needed): a variable holds only an object identifier, not the object itself. Assigning one variable to another only creates a new container (internally called a zval) for the same data in memory. Consider this:
$a = new stdClass;
$b = $a;
The first line allocates memory for variable $a and an object of stdClass, and stores the object's identifier into the variable. The second line allocates memory for variable $b, stores the object's identifier into the $b variable, and increments an internal reference counter. The reference counter value shows how many times the object is referenced in the code. When $b variable is destroyed, the reference counter is decremented by one. When the value of reference counter becomes equal to zero, the object's memory is deallocated. The following code demonstrates the idea:
$a = new stdClass;
debug_zval_dump($a);
$b = $a;
debug_zval_dump($a);
$c = $a;
debug_zval_dump($a);
$c = null; // destroy $c
debug_zval_dump($a);
$b = null; // destroy $b
debug_zval_dump($a);
Output
object(stdClass)#1 (0) refcount(2){
}
object(stdClass)#1 (0) refcount(3){
}
object(stdClass)#1 (0) refcount(4){
}
object(stdClass)#1 (0) refcount(3){
}
object(stdClass)#1 (0) refcount(2){
}
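A short sketch of what sharing an object identifier means in practice (the property name count is made up for this example): a change made through one variable is visible through the other, and only clone produces a separate object.

```php
<?php
$first = new stdClass;
$first->count = 1;

$second = $first;       // copies the object identifier, not the object
$second->count = 2;     // so this change is visible through $first as well

$third = clone $first;  // an explicit copy: a genuinely separate object
$third->count = 3;

echo $first->count, ' ', $second->count, ' ', $third->count, "\n"; // 2 2 3
var_dump($first === $second); // bool(true): same object
var_dump($first === $third);  // bool(false): different objects
```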
But when a variable is modified, PHP versions 5 and 7 create a copy of the variable in order to keep the original value (variable) intact.
$m1 = memory_get_usage();
$a = str_repeat('a', 1 << 24);
echo number_format(memory_get_usage() - $m1), PHP_EOL;
// 16,781,408
$b = $a;
$c = $a;
echo number_format(memory_get_usage() - $m1), PHP_EOL;
// 16,781,472
$b[0] = 'x';
echo number_format(memory_get_usage() - $m1), PHP_EOL;
// 33,562,880
$c[0] = 'x';
echo number_format(memory_get_usage() - $m1), PHP_EOL;
// 50,344,288
The same applies to function arguments. Thus, if a variable is only going to be read, there is no need to pass it by reference explicitly. The words in the Smarty documentation mean that in most cases you pass variables to the templates and usually do not expect the template to change them. You need to pass a variable by reference only when you really want the variable to be modified in the template. The same concept applies to any function arguments in PHP 5 and newer.
Operator precedence says the order should be: +, &, =. But executing this code shows that the order is: &, =, +
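The distinction can be sketched without Smarty. The two functions below are hypothetical stand-ins for a template that reassigns an array element; they are not part of Smarty's API.

```php
<?php
// Hypothetical stand-in for a template that reassigns a value it was given.
function renderByValue(array $vars)
{
    $vars['title'] = 'changed in template';
    return 'Title: ' . $vars['title'];
}

// Same body, but the argument is taken by reference.
function renderByRef(array &$vars)
{
    $vars['title'] = 'changed in template';
    return 'Title: ' . $vars['title'];
}

$data = array('title' => 'original');

renderByValue($data);
echo $data['title'], "\n"; // original (the caller's array is untouched)

renderByRef($data);
echo $data['title'], "\n"; // changed in template
```

If the caller never needs to see changes made inside the template, the by-value form is enough.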
$b = 1;
$a = & $b + print('print executed');
if ($a == 1)
echo ' but one was not added and error was not raised';
Output: print executed but one was not added and error was not raised
Why is the precedence changed in this case?
P.S.
$a = new stdClass();
$c = &$a instanceof $a;
var_dump($c); // class stdClass#1 (0) {}
$b = $a instanceof $a;
var_dump($b); // bool(true)
Arguably, this doesn't really answer your question but consider this code:
$b = 1;
$a = &$b + 123;
The opcodes reveal the following execution strategy:
compiled vars:  !0 = $b, !1 = $a
line  #  *  op          fetch  ext  return  operands
-----------------------------------------------------------------------------
   3  0  >  ASSIGN                          !0, 1
   4  1     ASSIGN_REF              $1      !1, !0
      2     ADD                     ~2      $1, 123
      3     FREE                            ~2
As you can see, the assignment by reference takes place and the addition gets stored in a temporary variable and then freed; basically, a no-op.
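Written out as separate statements, the opcodes above correspond roughly to the following sketch ($sum is a made-up name standing in for the temporary ~2):

```php
<?php
$b = 1;
$a = &$b;         // ASSIGN_REF: $a and $b now name the same value
$sum = $a + 123;  // ADD: the sum lands in a separate value ...
unset($sum);      // FREE: ... which is thrown away immediately

var_dump($a);     // int(1): the addition never touched $a

$b = 10;
var_dump($a);     // int(10): $a is still a reference to $b
```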
Perhaps the documentation could be clearer, but I can't imagine a scenario in which this particular code would ever make sense :)
http://www.php.net/manual/en/language.operators.precedence.php#example-115
<?php
$a = 1;
echo $a + $a++; // may print either 2 or 3
?>
The example from the PHP manual doesn't explain this very well. Why isn't $a++ evaluated to 2 and then added to 1, so that it always becomes echo 1 + 2 // equals 3? I don't understand how it "may print either 2 or 3". I thought the increment operator ++ has "higher precedence" than addition +?
In other words, I don't understand why it isn't...
$a = 1;
1) echo $a + $a++;
2) echo 1 + ($a = 1 + 1);
3) echo 1 + (2);
4) echo 3;
It can be either 2 or 3, although most of the time it will be 3. So why might it be 2? Because PHP does not specify the order in which expressions are evaluated, so it may depend on the PHP version.
Operator precedence in PHP is a mess, and it's liable to change between versions. For that reason, it's always best to use parentheses to group your in-line equations so that there is no ambiguity in their execution.
The example I usually give when asked this question is to ask in turn what the answer to this equation would be:
$a = 2;
$b = 4;
$c = 6;
$val = $a++ + ++$b - 0 - $c - -++$a;
echo $val;
:)
Depending where I run it now, I get anything between 4 and 7, or a parser error.
This will load $a (1) into memory, then load it into memory again and increment it (1 + 1), then it will add the two together, giving you 3:
$a = 1;
$val = $a + ($a++);
This, however, is a parser error:
$a = 1;
$val = ($a + $a)++;
Anyway, long story short, your example 2) is the way that most versions will interpret it unless you add parentheses around ($a++) as in the example above, which will make it run the same way in all PHP versions that support the increment operator. :)
Order of evaluation isn't a precedence issue. It has nothing to do with operators. The problem also happens with function calls.
By the way, $a++ returns the old value of $a. In your example, $a++ evaluates to 1, not 2.
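A quick way to see the difference between the two increment forms (variable names here are only for illustration):

```php
<?php
$a = 1;
$old = $a++;   // post-increment: yields the old value, then increments
echo $old, ' ', $a, "\n"; // 1 2

$c = 1;
$new = ++$c;   // pre-increment: increments first, then yields the new value
echo $new, ' ', $c, "\n"; // 2 2
```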
In the following example, PHP does not define which subexpression is evaluated first: $a or $a++.
$a = 1;
f($a, $a++); //either f(1,1) or f(2,1)
Precedence is about where you put in parentheses. Order of evaluation can't be changed by parentheses. To fix order of evaluation problems, you need to break the code up into multiple lines.
$a = 1;
$a0 = $a;
$a1 = $a++;
f($a0, $a1); //only f(1,1)
Order of evaluation only matters when your subexpressions can have side-effects on each other: the value of one subexpression can change if another subexpression is evaluated first.
I was reading PHP manual and I came across type juggling
I was confused, because I've never come across such a thing.
$foo = 5 + "10 Little Piggies"; // $foo is integer (15)
When I used this code, it returned 15: it adds 10 + 5, and is_int() returns true (i.e. 1), where I was expecting an error. The manual then referred me to String conversion to numbers, where I read: If the string starts with valid numeric data, this will be the value used. Otherwise, the value will be 0 (zero).
$foo = 1 + "bob3"; /* $foo is int though this doesn't add up 3+1
but as stated this adds 1+0 */
Now, what should I do if I want to treat 10 Little Piggies or bob3 as a string and not as an int? Using settype() doesn't work either. I want an error saying that I cannot add 5 to a string.
If you want an error, you need to trigger an error:
$string = "bob3";
if (is_string($string))
{
trigger_error('Does not work on a string.');
}
$foo = 1 + $string;
Or if you like to have some interface:
class IntegerAddition
{
    private $a, $b;
    public function __construct($a, $b) {
        if (!is_int($a)) throw new InvalidArgumentException('$a needs to be integer');
        if (!is_int($b)) throw new InvalidArgumentException('$b needs to be integer');
        $this->a = $a; $this->b = $b;
    }
    public function calculate() {
        return $this->a + $this->b;
    }
}
$add = new IntegerAddition(1, 'bob3');
echo $add->calculate();
This is by design as a result of PHP's dynamically typed nature and of course lack of an explicit type declaration requirement. Variable types are determined based on context.
Based on your example, when you do:
$a = 10;
$b = "10 Pigs";
$c = $a + $b; // $c === 20 (int)
Calling is_int($c) will of course always evaluate to a boolean true because PHP has decided to convert the result of the statement to an integer.
If you're looking for an error by the interpreter, you won't get it since this is, like I mentioned, something built into the language. You might have to write a lot of ugly conditional code to test your data types.
Or, if you want to do that for testing arguments passed to your functions - that's the only scenario which I can think of where you might want to do this - you can trust the client invoking your function to know what they are doing. Otherwise, the return value can simply be documented to be undefined.
I know coming from other platforms and languages, that might be hard to accept, but believe it or not a lot of great libraries written in PHP follow that same approach.
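If you do want the check, it need not be very ugly. Here is a minimal sketch, assuming you want to accept integers and integer-like strings such as "10" and reject everything else; strictAdd is a hypothetical helper name, while filter_var() and FILTER_VALIDATE_INT are standard PHP.

```php
<?php
// Hypothetical helper: adds two operands only if both are integers or
// integer-like strings; anything else raises an exception.
function strictAdd($left, $right)
{
    foreach (array($left, $right) as $operand) {
        // FILTER_VALIDATE_INT yields false for strings such as
        // "10 Little Piggies" or "bob3" instead of silently truncating them.
        if (filter_var($operand, FILTER_VALIDATE_INT) === false) {
            throw new InvalidArgumentException(
                'Not an integer: ' . var_export($operand, true)
            );
        }
    }
    return (int) $left + (int) $right;
}

echo strictAdd(5, '10'), "\n"; // 15

try {
    strictAdd(5, '10 Little Piggies');
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n"; // Not an integer: '10 Little Piggies'
}
```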
This code
<?php
$a = 10;
$arr1 = array(&$a);
$arr1[0] = 20;
echo $a; echo "\n";
$arr2 = $arr1;
$arr2[0] = 30;
echo $a;
produces
20
30
Obviously reference array members are "preserved", which can lead, for example, to some interesting/strange behavior, like
<?php
function f($arr) {
$arr[0] = 20;
}
$val = 10;
$a = array(&$val);
f($a);
echo $a[0];
?>
outputting
20
My question is: what is it for, and where is it documented (other than a user comment at http://www.php.net/manual/en/language.types.array.php#50036 and the Zend Engine source code itself)?
PHP's assignment by reference behavior is documented on the manual page "PHP: What References Do". You'll find a paragraph on array value references there, too, starting with:
While not being strictly an assignment by reference, expressions created with the language construct array() can also behave as such by prefixing & to the array element to add.
The page also explains why your first code behaves the way it does:
Note, however, that references inside arrays are potentially dangerous. Doing a normal (not by reference) assignment with a reference on the right side does not turn the left side into a reference, but references inside arrays are preserved in these normal assignments. This also applies to function calls where the array is passed by value.
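If you need a copy that is detached from the referenced variable, one option is to rebuild the array element by element. This is a sketch with made-up variable names, not the only way to do it:

```php
<?php
$val = 10;
$withRef = array(&$val);

// A plain array copy keeps element 0 bound to $val.
$shared = $withRef;
$shared[0] = 20;
echo $val, "\n"; // 20

// Rebuilding the array assigns plain values element by element,
// so the new array is no longer tied to $val.
$detached = array();
foreach ($withRef as $key => $value) {
    $detached[$key] = $value;
}
$detached[0] = 30;
echo $val, "\n"; // 20
```

The foreach loop iterates by value, so each $value is an ordinary copy and the assignment into $detached does not carry the reference along.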