Understanding PHP op code in an if statement


I'm trying to understand the op code for a simple piece of code.
The code is:
<?php
$a = TRUE;
$b = FALSE;
if($a && $b) {
echo 'done';
}
The op code for the above code is:
php -dvld.active=1 test.php
Finding entry points
Branch analysis from position: 0
Jump found. Position 1 = 3, Position 2 = 4
Branch analysis from position: 3
Jump found. Position 1 = 5, Position 2 = 7
Branch analysis from position: 5
Jump found. Position 1 = 7
Branch analysis from position: 7
Return found
Branch analysis from position: 7
Branch analysis from position: 4
filename: /home/starlays/learning/test.php
function name: (null)
number of ops: 8
compiled vars: !0 = $a, !1 = $b
line # * op fetch ext return operands
---------------------------------------------------------------------------------
3 0 > ASSIGN !0, true
5 1 ASSIGN !1, false
7 2 > JMPZ_EX ~2 !0, ->4
3 > BOOL ~2 !1
4 > > JMPZ ~2, ->7
8 5 > ECHO 'done'
9 6 > JMP ->7
10 7 > > RETURN 1
branch: # 0; line: 3- 7; sop: 0; eop: 2; out1: 3; out2: 4
branch: # 3; line: 7- 7; sop: 3; eop: 3; out1: 4
branch: # 4; line: 7- 7; sop: 4; eop: 4; out1: 5; out2: 7
branch: # 5; line: 8- 9; sop: 5; eop: 6; out1: 7
branch: # 7; line: 10- 10; sop: 7; eop: 7
path #1: 0, 3, 4, 5, 7,
path #2: 0, 3, 4, 7,
path #3: 0, 4, 5, 7,
path #4: 0, 4, 7,
I'm trying to understand what happens on line 7: how is the evaluation done? How many values go into the if expression? Are three values evaluated, or are just the two values of $a and $b fetched, with the expression in the parentheses of the if evaluated afterwards?
I have read the manual entry for JMPZ_EX, and I understand the op code up to step 2. After that it gets confusing, and it is very hard for me to work out the exact steps PHP performs.
I would also like to understand what all the branches in the op code are, and which of them are actually taken in the end.

Unless you are proficient at ASM, I think the easiest way to understand what is happening is looking at the same code by reading its (almost) 1:1 representation in PHP:
if(!$a) goto end;
if(!$b) goto end;
echo 'done';
end: return 1;
The intermediate representation is based on the negations of your actual clauses to make a jump over the code contained in the if block.
If you really want to understand how PHP transforms its input into this opcode array, you'll have to learn about PHP internals, but not before studying the dragon book, particularly the parts about intermediate representation, which is one stage of the compilation pipeline.
The rest of the opcodes are "background noise": intermediate values, or even one instruction that looks pointless here, 9 6 > JMP ->7, which probably exists only because it wasn't worth the effort to make the PHP compiler emit the most optimal opcode array for the Zend VM to run.
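To make the goto rewrite above concrete, here is a minimal runnable sketch. The function names withIf and withGoto are made up for illustration, and the echo is replaced by a returned string so the two versions can be compared directly.

```php
<?php
// Hypothetical helper: the original if statement, returning its output.
function withIf(bool $a, bool $b): string
{
    $out = '';
    if ($a && $b) {
        $out = 'done';
    }
    return $out;
}

// Hypothetical helper: the same logic written as negated tests plus jumps,
// which is roughly the shape of the opcode array.
function withGoto(bool $a, bool $b): string
{
    $out = '';
    if (!$a) goto end;
    if (!$b) goto end;
    $out = 'done';
    end:
    return $out;
}
```

Calling both functions with every combination of true and false should give the same string each time, which is the sense in which the goto form is an (almost) 1:1 representation.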

line # * op fetch ext return operands
---------------------------------------------------------------------------------
3 0 > ASSIGN !0, true
5 1 ASSIGN !1, false
7 2 > JMPZ_EX ~2 !0, ->4
3 > BOOL ~2 !1
4 > > JMPZ ~2, ->7
8 5 > ECHO 'done'
9 6 > JMP ->7
10 7 > > RETURN 1
Going by the op numbers (the # column):
0 assigns true to !0; !0 is just the internal representation of $a
1 assigns false to !1; !1 is $b
JMPZ means jump to the given op if the value is 0 (false). I'm not sure of the exact difference with JMPZ_EX; it looks like it additionally stores the boolean result in a temporary.
So:
2 JMPZ_EX, jump to #4 (->4) if !0 ($a) is 0 (FALSE), and assign the boolean result to ~2
3 BOOL !1, returning ~2. ~2 is now equal to the BOOLean value of !1 ($b)
4 JMPZ ~2, jump to #7 if ~2 is zero
5 ECHO, our echo statement. If either of the jumps had been taken, this part would be skipped.
6 JMP ->7, jumps to #7
7 RETURN, ends the function call (here, the main script)
Some notes:
It seems like the JMPZ_EX is unnecessary in this case, but it would be useful in more complex if statements where the value is needed to calculate further values.
6 JMP ->7 is probably in there to allow for an else block. At the end of the if body, it would jump over the code that makes up the else block.
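The walk-through above can be replayed with a toy trace. This is only a sketch of the control flow, not how the Zend VM is implemented; traceIf, $pc and $tmp are names invented for this example, and ops 0 and 1 (the two ASSIGNs) are modelled as the function arguments.

```php
<?php
// Toy replay of ops 2..7 from the dump. $tmp stands in for ~2.
function traceIf(bool $a, bool $b): array
{
    $visited = [];
    $tmp = null;
    $output = '';
    $pc = 2;
    while (true) {
        $visited[] = $pc;
        switch ($pc) {
            case 2: // JMPZ_EX ~2 !0, ->4 : store bool of $a, jump if false
                $tmp = $a;
                $pc = $tmp ? 3 : 4;
                break;
            case 3: // BOOL ~2 !1 : overwrite ~2 with bool of $b
                $tmp = $b;
                $pc = 4;
                break;
            case 4: // JMPZ ~2, ->7 : skip the body if ~2 is false
                $pc = $tmp ? 5 : 7;
                break;
            case 5: // ECHO 'done'
                $output .= 'done';
                $pc = 6;
                break;
            case 6: // JMP ->7
                $pc = 7;
                break;
            case 7: // RETURN 1
                return ['ops' => $visited, 'output' => $output];
        }
    }
}
```

With the values from the question ($a = TRUE, $b = FALSE) the trace visits ops 2, 3, 4, 7, which lines up with "path #2: 0, 3, 4, 7" in the dump, since branch 0 covers ops 0 to 2. The other listed paths are simply the routes that other values of $a and $b could take.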

Related

Which is faster between (string)$value and "$value" when casting to a string

In PHP, assuming $value = 12345;(an integer), which is faster when casting $value from an integer to a string;
$value = (string)$value;
or
$value = "$value";
This is a kind of performance measure question and specifically for this case. Thanks for your help!

Your question is really about the efficiency of the PHP interpreter, and how it converts PHP code (what you write) into PHP bytecode (what actually runs and consumes time and resources). If I take p01ymath's experiment and decompose it:
implicit.php
<?php
$str = 12345;
for($i=0;$i<=200000000;$i++){
$str2 = "$str";
}
implicit.php bytecode
DarkMax:temp yvesleborg$ php -dvld.active=1 -dvld.verbosity=0 -dvld.execute=0 implicit.php
filename: /Users/yvesleborg/temp/implicit.php
function name: (null)
number of ops: 14
compiled vars: !0 = $str, !1 = $i, !2 = $str2
line #* E I O op fetch ext return operands
-------------------------------------------------------------------------------------
2 0 E > EXT_STMT
1 ASSIGN !0, 12345
3 2 EXT_STMT
3 ASSIGN !1, 0
4 > JMP ->10
4 5 > EXT_STMT
6 CAST 6 ~5 !0
7 ASSIGN !2, ~5
3 8 POST_INC ~7 !1
9 FREE ~7
10 > IS_SMALLER_OR_EQUAL ~8 !1, 200000000
11 EXT_STMT
12 > JMPNZ ~8, ->5
7 13 > > RETURN 1
branch: # 0; line: 2- 3; sop: 0; eop: 4; out0: 10
branch: # 5; line: 4- 3; sop: 5; eop: 9; out0: 10
branch: # 10; line: 3- 3; sop: 10; eop: 12; out0: 13; out1: 5; out2: 13; out3: 5
branch: # 13; line: 7- 7; sop: 13; eop: 13; out0: -2
path #1: 0, 10, 13,
path #2: 0, 10, 5, 10, 13,
path #3: 0, 10, 5, 10, 13,
path #4: 0, 10, 13,
path #5: 0, 10, 5, 10, 13,
path #6: 0, 10, 5, 10, 13,
explicit.php
<?php
$str = 12345;
for($i=0;$i<=200000000;$i++){
$str2 = (string)$str;
}
explicit.php bytecode
DarkMax:temp yvesleborg$ php -dvld.active=1 -dvld.verbosity=0 -dvld.execute=0 explicit.php
filename: /Users/yvesleborg/temp/explicit.php
function name: (null)
number of ops: 14
compiled vars: !0 = $str, !1 = $i, !2 = $str2
line #* E I O op fetch ext return operands
-------------------------------------------------------------------------------------
2 0 E > EXT_STMT
1 ASSIGN !0, 12345
3 2 EXT_STMT
3 ASSIGN !1, 0
4 > JMP ->10
4 5 > EXT_STMT
6 CAST 6 ~5 !0
7 ASSIGN !2, ~5
3 8 POST_INC ~7 !1
9 FREE ~7
10 > IS_SMALLER_OR_EQUAL ~8 !1, 200000000
11 EXT_STMT
12 > JMPNZ ~8, ->5
7 13 > > RETURN 1
branch: # 0; line: 2- 3; sop: 0; eop: 4; out0: 10
branch: # 5; line: 4- 3; sop: 5; eop: 9; out0: 10
branch: # 10; line: 3- 3; sop: 10; eop: 12; out0: 13; out1: 5; out2: 13; out3: 5
branch: # 13; line: 7- 7; sop: 13; eop: 13; out0: -2
path #1: 0, 10, 13,
path #2: 0, 10, 5, 10, 13,
path #3: 0, 10, 5, 10, 13,
path #4: 0, 10, 13,
path #5: 0, 10, 5, 10, 13,
path #6: 0, 10, 5, 10, 13,
As you can see, both snippets produce exactly the same bytecode (which you would expect from any well-crafted compiler/interpreter). Thus, p01ymath's experiment merely measures the run-time performance of the engine as it sequences the bytecode, at the time it was run and on the box (chipset) it was run on.
To really answer your own question, you have to ponder the trickier question:
Under which circumstances does an explicit cast produce different bytecode from an implicit cast?
If you find such circumstances, instrument a test to measure their respective performance.
If you want to pursue this quest, you will need the PECL vld extension. You can follow this interesting post to get familiar with vld (be sure to check PECL and install the appropriate version for the PHP build under test).
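As a quick sanity check alongside the bytecode comparison, a small sketch can confirm that the two spellings also produce the same string for a handful of scalar values. The sample list is arbitrary and chosen only for illustration; it says nothing about speed.

```php
<?php
// Arbitrary sample values (an assumption of this sketch, not from the question).
$samples = [12345, -7, 0, 1.5, true, false, null, 'abc'];

$allIdentical = true;
foreach ($samples as $value) {
    // Compare the explicit cast with string interpolation.
    if ((string) $value !== "$value") {
        $allIdentical = false;
    }
}
```

After the loop, $allIdentical should still be true, which is what you would expect if both forms compile down to the same CAST operation.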

Interesting question. I'm no expert at testing this kind of thing, but I do it anyway. After seeing your question, I decided to check which one is faster.
The idea is simple: a script that casts an integer to a string 200 million times with each method, implicit and explicit, to see if there is a difference.
What I found is that there is not much difference in speed. Here is the script I used for the tests.
<?php
$str = 12345;
$startTimeExplicit = microtime(true);
for($i=0;$i<=200000000;$i++){
$str2 = (string)$str;
}
$endTimeExplicit = microtime(true);
$explicit = round($endTimeExplicit-$startTimeExplicit,6);
$startTimeImplicit = microtime(true);
for($i=0;$i<=200000000;$i++){
$str2 = "$str";
}
$endTimeImplicit = microtime(true);
$implicit = round($endTimeImplicit-$startTimeImplicit,6);
echo "Average time (Implicit type casting): ".$implicit."<br>";
echo "Average time (Explicit type casting): ".$explicit."<br>";
?>
And here are the results I got after executing this script multiple times.
Average time (Implicit type casting): 14.815689
Average time (Explicit type casting): 14.614734
Average time (Implicit type casting): 14.56812
Average time (Explicit type casting): 15.190028
Average time (Implicit type casting): 14.649186
Average time (Explicit type casting): 15.587608
Average time (Implicit type casting): 15.522457
Average time (Explicit type casting): 15.566786
Average time (Implicit type casting): 15.235483
Average time (Explicit type casting): 15.333145
Average time (Implicit type casting): 15.972657
Average time (Explicit type casting): 16.161957
As you can see, both are about equally fast. Sometimes implicit casting is a few hundred milliseconds faster, and sometimes explicit casting is. On average, you can't see much difference.
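For anyone who wants to repeat the measurement without waiting for 200 million iterations, here is a smaller sketch using hrtime(), which is available from PHP 7.3. The function name timeCasts and the default iteration count are assumptions made for this example, and the numbers it returns are elapsed totals in nanoseconds, not averages.

```php
<?php
// Hypothetical helper: times both cast styles over the same number of iterations.
function timeCasts(int $iterations = 100000): array
{
    $value = 12345;

    $start = hrtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $s = (string) $value;   // explicit cast
    }
    $explicit = hrtime(true) - $start;

    $start = hrtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $s = "$value";          // implicit cast via interpolation
    }
    $implicit = hrtime(true) - $start;

    return ['explicit_ns' => $explicit, 'implicit_ns' => $implicit];
}
```

Running it several times and comparing the two entries should show the same kind of run-to-run noise as the results above.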

Is it more expensive to set a variable twice or once but inside an else block?

Setting a variable twice
$var = 2;
if ($someThing) {
$var = 1;
}
VS
Using an else
if ($someThing) {
$var = 1;
}
else {
$var = 2;
}
I know that $someThing will get evaluated in both cases. $var is also going to be set in both cases. In the former, it's set once and then has a 50/50 chance of being set again. In the latter, it's set only once, but there's an else block.
I was just curious whether anyone has done any sort of testing with something similar. I know this is really micro-optimizing, but it was just a random thought I had.

You can look at the opcode steps for each option using the Vulcan Logic Dump
Option #1
compiled vars: !0 = $var, !1 = $someThing
line #* E I O op fetch ext return operands
-------------------------------------------------------------------------------------
3 0 E > ASSIGN !0, 2
4 1 > JMPZ !1, ->3
5 2 > ASSIGN !0, 1
3 > > RETURN 1
Option #2
compiled vars: !0 = $someThing, !1 = $var
line #* E I O op fetch ext return operands
-------------------------------------------------------------------------------------
3 0 E > > JMPZ !0, ->3
4 1 > ASSIGN !1, 1
2 > JMP ->4
7 3 > ASSIGN !1, 2
4 > > RETURN 1
and also using the Ternary operator
compiled vars: !0 = $var, !1 = $someThing
line #* E I O op fetch ext return operands
-------------------------------------------------------------------------------------
3 0 E > > JMPZ !1, ->3
1 > QM_ASSIGN ~2 1
2 > JMP ->4
3 > QM_ASSIGN ~2 2
4 > ASSIGN !0, ~2
5 > RETURN 1
EDIT
As Barmar points out, not every method executes all of its steps, because of the jumps, and not all steps have an equal processing cost either.
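For reference, here is a minimal sketch of the three source forms the dumps above correspond to, wrapped in functions so their results can be compared. The function names are invented for this example.

```php
<?php
// Option #1: assign a default, then maybe overwrite it.
function setTwice(bool $someThing): int
{
    $var = 2;
    if ($someThing) {
        $var = 1;
    }
    return $var;
}

// Option #2: assign exactly once, inside if or else.
function setWithElse(bool $someThing): int
{
    if ($someThing) {
        $var = 1;
    } else {
        $var = 2;
    }
    return $var;
}

// Ternary form: pick the value, then assign once.
function setWithTernary(bool $someThing): int
{
    $var = $someThing ? 1 : 2;
    return $var;
}
```

All three should return 1 for true and 2 for false; only the opcode sequence that gets there differs.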

It's not about chance whether $someThing is fulfilled, but about the effort the machine has to make. Let's see:
$var = 2;
if ($someThing) {
$var = 1;
}
means min. 1 assignment and 1 check and max. 2 assignments and 1 check.
if ($someThing) {
$var = 1;
}
else {
$var = 2;
}
means min. 1 assignment and 1 check and max. 1 assignment and 1 check.
The else is optimal.
Note: If $someThing depends on $var outside this piece of code, the code is called repeatedly, and you want to be absolutely sure that it is optimal, you'll have to do an amortized cost analysis, which isn't trivial.
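The min/max counts above can be checked with a small sketch that increments a counter next to every assignment to $var. The function names and the $writes counter are made up for illustration and are not part of the original code.

```php
<?php
// Counts assignments to $var in the "set twice" variant.
function countWritesTwice(bool $someThing): int
{
    $writes = 0;
    $var = 2;
    $writes++;
    if ($someThing) {
        $var = 1;
        $writes++;
    }
    return $writes;
}

// Counts assignments to $var in the if/else variant.
function countWritesElse(bool $someThing): int
{
    $writes = 0;
    if ($someThing) {
        $var = 1;
        $writes++;
    } else {
        $var = 2;
        $writes++;
    }
    return $writes;
}
```

The first variant reports one or two writes depending on $someThing, while the else variant always reports exactly one.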

You have to imagine it in assembly, which can differ based on architectures.
Here's some basic pseudocode:
write $var 2
beqz $something x   ; branch to x if $something is false
write $var 1
x:
or
beqz $something x   ; branch to x if $something is false
write $var 1
jmp y
x:
write $var 2
y:
The first one has an extra write operation sometimes. The second one has an extra jump instruction sometimes. I would think that the jump instruction would be faster, and branch prediction would allow it to be optimised further under the hood, so I'd go for the second option.

What is the precedence and associativity for increment operator and assignment operator for the block of code

What is the precedence and associativity of the increment operator and the assignment operator in this block of code?
$a=array(1,2,3);
$b=array(4,5,6);
$c=1;
$a[$c++]=$b[$c++];
print_r($a);
As per the execution it outputs
Array
(
[0] => 1
[1] => 6
[2] => 3
)
But I am not able to understand how array $a index 1 holds the value of array $b index 2 value. Can anybody explain the scenario how the execution happens?

PHP is (once again) different from other languages in that the left part of an assignment evaluates first. Simple proof:
$a[print 1] = $b[print 2]; // what does this print?
According to http://3v4l.org/, this code:
$a = array(); $b = array(); $c = 1;
$a[$c++]=$b[$c++];
generates the following opcodes:
compiled vars: !0 = $a, !1 = $b, !2 = $c
line # * op fetch ext return operands
---------------------------------------------------------------------------------
2 0 > INIT_ARRAY ~0
1 ASSIGN !0, ~0
2 INIT_ARRAY ~2
3 ASSIGN !1, ~2
4 ASSIGN !2, 1
3 5 POST_INC ~5 !2
6 POST_INC ~7 !2
7 FETCH_DIM_R $8 !1, ~7
8 ASSIGN_DIM !0, ~5
9 OP_DATA $8, $9
10 > RETURN 1
The opcode 5 is the left $c++, and the opcode 6 is the right $c++. So the final assignment (opcode 8) is evaluated as
$a[1] = $b[2];
which results in (1,6,3).
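The order can also be observed from PHP itself. The sketch below wraps each index in a small closure that records which side ran; $mark and $order are names made up for this example.

```php
<?php
$order = [];

// Hypothetical helper: records which side was evaluated, then returns the index.
$mark = function (string $side, int $index) use (&$order): int {
    $order[] = $side;
    return $index;
};

$a = [1, 2, 3];
$b = [4, 5, 6];
$c = 1;

// Same shape as $a[$c++] = $b[$c++]; with each index passed through $mark.
$a[$mark('left', $c++)] = $b[$mark('right', $c++)];
```

Afterwards $order should read left, then right, and $a should hold 1, 6, 3, matching the opcode order shown above (op 5 before op 6).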

The ++ post increment operator first returns the value and afterwards (post) increments the value. I.e. $c++ returns the value of $c, then increments $c.
It is then obviously executing like this:
$a[$c++] =
Here the value of $c++ is taken as 1, but $c is then post-incremented to 2.
$b[$c++]
Here the value of $c++ is taken as 2, and then $c is post-incremented to 3 (which nobody cares about anymore though).
So the expression is equivalent to:
$a[1] = $b[2];
For contrast, the pre-increment operator ++$var first increments the value, then returns the new incremented value. So $a[++$c] = $b[++$c] would result in an Undefined offset 3 error in $b.

It's called undefined order of evaluation.
Operator precedence and associativity only determine how expressions
are grouped, they do not specify an order of evaluation. PHP does not
(in the general case) specify in which order an expression is
evaluated and code that assumes a specific order of evaluation should
be avoided, because the behavior can change between versions of PHP or
depending on the surrounding code.
http://php.net/manual/en/language.operators.precedence.php#example-130
But the current behaviour has never changed: http://3v4l.org/b1Y1X

Order of precedence with nested ternary operators [duplicate]

I was in class the other day and this snippet of code was presented:
<?php
//Initialize the input
$score=rand(50,100);
//Determine the Grade
$grade=($score>=90)?'A':(
($score>=80)?'B':(
($score>=70)?'C':(
($score>=60)?'D':'F')));
//Output the Results
echo "<h1>A score of $score = $grade</h1>";
?>
At the time I questioned the order of operations within the nested ternary operators, thinking that they would evaluate from the inside out, that is it would evaluate if $score were >= 60 first, then if $score >= 70, etc -- working through the whole stack every time regardless of the score.
To me it seems that this construct should follow the same order of precedence given to mathematical operators -- resolving the inner-most set of parentheses first and then working out, unless there is some order of operations unique to the ternary.
Unfortunately the discussion in class quickly became about winning an argument, when I really just wanted to understand how it works. So my questions are two:
(1) How would this statement be interpreted, and why?
and
(2) Is there some sort of stack trace or step-through tool that would allow me to watch how this code executes?

PHP respects brackets: they determine how the expression is grouped, so each inner ( ... ) forms the else-branch of the ternary that encloses it.
PHP is unusual in that ternary operators are left-associative. This means that without brackets, a chain of ternaries is grouped from left to right.
But in this particular case, the brackets force the expression to be grouped right to left, and the conditions are still tested from the outermost one inwards. This code is equivalent to:
if ($score >= 90) {
$grade = 'A';
}
elseif ($score >= 80) {
$grade = 'B';
}
elseif ($score >= 70) {
$grade = 'C';
}
...

The ternary operator is left-associative, but with the brackets applied, the expression is grouped right to left.
You might use xdebug or phpdbg as a step debugger to step through your code and see how it evaluates.
There is also VulcanLogicDumper around, which shows the instructions:
http://3v4l.org/QeF9i/vld#tabs compared to an if-elseif-else structure http://3v4l.org/bZE6M/vld#tabs
line # * op fetch ext return operands
---------------------------------------------------------------------------------
3 0 > SEND_VAL 50
1 SEND_VAL 100
2 DO_FCALL 2 $0 'rand'
3 ASSIGN !0, $0
5 4 IS_SMALLER_OR_EQUAL ~2 90, !0
5 > JMPZ ~2, ->8
6 > QM_ASSIGN ~3 'A'
7 > JMP ->24
6 8 > IS_SMALLER_OR_EQUAL ~4 80, !0
9 > JMPZ ~4, ->12
10 > QM_ASSIGN ~5 'B'
11 > JMP ->23
7 12 > IS_SMALLER_OR_EQUAL ~6 70, !0
13 > JMPZ ~6, ->16
14 > QM_ASSIGN ~7 'C'
15 > JMP ->22
8 16 > IS_SMALLER_OR_EQUAL ~8 60, !0
17 > JMPZ ~8, ->20
18 > QM_ASSIGN ~9 'D'
19 > JMP ->21
20 > QM_ASSIGN ~9 'F'
21 > QM_ASSIGN ~7 ~9
22 > QM_ASSIGN ~5 ~7
23 > QM_ASSIGN ~3 ~5
24 > ASSIGN !1, ~3
10 25 ADD_STRING ~11 '%3Ch1%3EA+score+of+'
26 ADD_VAR ~11 ~11, !0
27 ADD_STRING ~11 ~11, '+%3D+'
28 ADD_VAR ~11 ~11, !1
29 ADD_STRING ~11 ~11, '%3C%2Fh1%3E'
30 ECHO ~11
31 > RETURN 1
How to read these Opcodes
I will try to explain the first JMPZ in the opcode, in order to understand how it evaluates:
Of interest is Line 5, Opcode Number 5:
5 > JMPZ ~2, ->8
This means: If compare with 90 (opcode 4) is false, then JUMP to Opcode 8.
Warning: ->8 doesn't mean jump to Line 8.
Now, what is Opcode 8? The comparison with 80
6 8 > IS_SMALLER_OR_EQUAL ~4 80, !0
And now it's safe to say that this doesn't evaluate from the inside out as you expected (60->70->80->90), but like an if-elseif-else structure (90->80->70->60).

The ternary operator short-circuits. Only the appropriate operand is evaluated. This means that the parens do not matter until they are actually tested.
echo false ? (crash() / 0) : "Worked.";

Ternaries are evaluated left to right:
http://php.net/manual/en/language.operators.comparison.php#language.operators.comparison.ternary
So PHP evaluates what is to the left of the ?, and based on that, evaluates only one of the two operands around the :.
You could put in a function call that has a side effect to demonstrate this:
function p($a,$b) { echo $a . " >= " . $b; return $a>=$b; }
$grade=(p($score,90))?'A':(
p($score,80)?'B':(
p($score,70)?'C':(
p($score,60)?'D':'F')));
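A variation on the side-effect idea that collects the tested thresholds in an array instead of echoing them makes the order easy to inspect. The function gradeWithTrace and its return shape are assumptions made for this sketch.

```php
<?php
// Hypothetical helper: returns the grade plus the thresholds that were actually compared.
function gradeWithTrace(int $score): array
{
    $tested = [];
    $p = function (int $threshold) use ($score, &$tested): bool {
        $tested[] = $threshold;   // side effect: remember this comparison ran
        return $score >= $threshold;
    };

    $grade = $p(90) ? 'A' : (
        $p(80) ? 'B' : (
        $p(70) ? 'C' : (
        $p(60) ? 'D' : 'F')));

    return ['grade' => $grade, 'tested' => $tested];
}
```

For a score of 85 only the comparisons with 90 and 80 run, and for a score of 95 only the comparison with 90 runs, so the inner brackets are never reached unless the outer test fails.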

What usage of if's is faster?

I am quite new to PHP, and I have a question about IF statements.
Example 1:
<?php
if($a == 1){
if($b == 1){
echo 'test';
}
}
?>
Example 2:
<?php
if($a == 1 && $b ==1){
echo 'test';
}
?>
Both have the same result, but which one is faster? Does it even matter?

This is premature optimization and micro-benchmarking; you really should read "Don't be STUPID: GRASP SOLID!" to understand why I say so.
But if you want to know, if($a == 1 && $b ==1) seems faster in most PHP versions.
See Benchmark
If you want to know the real difference, then look at the opcodes.
First Code :
line # * op fetch ext return operands
---------------------------------------------------------------------------------
2 0 > IS_EQUAL ~0 !0, 1
1 > JMPZ ~0, ->7
3 2 > IS_EQUAL ~1 !1, 1
3 > JMPZ ~1, ->6
4 4 > ECHO 'test'
5 5 > JMP ->6
6 6 > > JMP ->7
7 > > RETURN 1
Second Code
line # * op fetch ext return operands
---------------------------------------------------------------------------------
2 0 > IS_EQUAL ~0 !0, 1
1 > JMPZ_EX ~0 ~0, ->4
2 > IS_EQUAL ~1 !1, 1
3 BOOL ~0 ~1
4 > > JMPZ ~0, ->7
3 5 > ECHO 'test'
4 6 > JMP ->7
7 > > RETURN 1
Can you see how similar they are, with only a minimal difference? That is why it does not make sense to worry about something like this; just write good, readable code.

They are the same - in both cases if the first condition is false, the second will not be tested.
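The short-circuit claim can be checked with a sketch that counts how often the second condition is evaluated. The function countSecondChecks and the $second closure are invented for this example.

```php
<?php
// Hypothetical helper: returns how many times the second condition ran.
function countSecondChecks(bool $first, bool $nested): int
{
    $calls = 0;
    $second = function () use (&$calls): bool {
        $calls++;          // side effect: the second condition was tested
        return true;
    };

    if ($nested) {
        // Example 1 shape: nested ifs
        if ($first) {
            if ($second()) {
                // body
            }
        }
    } else {
        // Example 2 shape: combined with &&
        if ($first && $second()) {
            // body
        }
    }

    return $calls;
}
```

When $first is false, the count stays at 0 for both shapes; when it is true, the second condition runs exactly once in both.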

Both are the same. There is not much code to optimize; you can even write it with a shorter syntax.
<?php
echo $a == 1 && $b == 1 ? 'test' : '';
?>
Does the same.
I've modified Baba's benchmark a bit to check the results for the shorthand syntax.
Results

Premature optimization is the root of all evil.
That said, your first piece of code is a tiny bit faster (but again, minimally; don't bother changing your code for this, since readability is far more important than the tiny speed gain you get from restructuring your conditions).
3,000,000 iterations of the first piece of code: ~ 0.9861679077 seconds
3,000,000 iterations of the second piece of code: ~ 1.0684559345 seconds
Difference: ~ 0.0822880268 seconds
Difference per iteration: ~ 0.0000000274 seconds (or about 27.4 nanoseconds).

No, it does not matter. Such tiny performance tweaks are usually dwarfed by other factors, such as inefficient algorithms or ignored client-side best practices.

Both are the same, because the PHP interpreter is "smart enough" to figure that out.
