Let's say I run the following code:
function isLucky() : bool
{
    for ($i = 0; $i < 50; ++$i) {
        try {
            if (!rand(0, 9)) {
                return true;
            }
            throw new Exception();
        } catch (Exception $e) {
        }
    }
}
The Vulcan Logic Dumper (VLD) extension shows me the generated opcodes:
line #* E I O op fetch ext return operands
-------------------------------------------------------------------------------------
5 0 E > ASSIGN !0, 0
1 > JMP ->15
7 2 > INIT_FCALL 'rand'
3 SEND_VAL 0
4 SEND_VAL 9
5 DO_ICALL $3
6 BOOL_XOR ~4 $3
7 > JMPZ ~4, ->9
8 8 > > RETURN <true>
10 9 > NEW $5 :20
10 DO_FCALL 0
11 > THROW 0 $5
12* JMP ->14
11 13 E > > CATCH last 'Exception'
5 14 > PRE_INC !0
15 > IS_SMALLER_OR_EQUAL ~8 !0, 50
16 > JMPNZ ~8, ->2
14 17 > VERIFY_RETURN_TYPE
18 > RETURN null
This is nice, but is there a way to see what the system really does?
Is the Zend engine re-interpreting each token?
Where do the system calls actually happen?
Does each opcode instruction cost the same?
When I inspect the output of a compiled C++ program with objdump, the program is a list of instructions, and jumps are made to memory addresses.
Here is a dummy C++ function from a C++ file compiled with -O0 and -c, then disassembled with objdump:
0000000000000000 <_Z14dummy_functionb>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 89 f8 mov %edi,%eax
6: 88 45 fc mov %al,-0x4(%rbp)
9: 80 7d fc 00 cmpb $0x0,-0x4(%rbp)
d: 74 07 je 16 <_Z14dummy_functionb+0x16>
f: b8 01 00 00 00 mov $0x1,%eax
14: eb 05 jmp 1b <_Z14dummy_functionb+0x1b>
16: b8 00 00 00 00 mov $0x0,%eax
1b: 5d pop %rbp
1c: c3 retq
Is that also the case with Zend opcodes?
For instance, take a simple function:
<?php
(function () {
    return true;
    throw new Exception();
});
line #* E I O op fetch ext return operands
-------------------------------------------------------------------------------------
4 0 E > > RETURN <true>
5 1* NEW $0 :4
2* DO_FCALL 0
3* THROW 0 $0
6 4* > RETURN null
Will the throw expression ever be read by anything, or will the executor ignore it completely?
Opcodes are executed by the Zend executor. If you want to know how it works exactly, you need to read its source files.
You will find a general presentation here:
http://blog.jpauli.tech/2015-02-05-zend-vm-executor-html/
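To make the idea of "executing opcodes" more concrete, here is a toy dispatch loop. It is only a sketch under loud assumptions: the `$ops` layout and the `run()` function are hypothetical and are not how the Zend executor is actually implemented. It mirrors the second dump from the question and shows that an instruction placed after an unconditional RETURN is never fetched.

```php
<?php
// Toy model only: NOT the real Zend executor. Each entry is [name, operand].
$ops = [
    ['RETURN', true],        // 0
    ['NEW', 'Exception'],    // 1 (unreachable, marked with * in the dump)
    ['DO_FCALL', null],      // 2 (unreachable)
    ['THROW', null],         // 3 (unreachable)
    ['RETURN', null],        // 4 (unreachable)
];

// Walks the opcode array with an instruction pointer and records which
// indexes were actually executed.
function run(array $ops): array
{
    $executed = [];
    $ip = 0;
    while (isset($ops[$ip])) {
        [$name, $operand] = $ops[$ip];
        $executed[] = $ip;
        switch ($name) {
            case 'RETURN':
                return ['value' => $operand, 'executed' => $executed];
            case 'JMP':
                $ip = $operand;   // jump to another opcode index
                break;
            default:
                $ip++;            // fall through to the next opcode
        }
    }
    return ['value' => null, 'executed' => $executed];
}

$result = run($ops);
echo 'executed opcode indexes: ', implode(',', $result['executed']), "\n";
```

In this model a jump target is simply an index into the opcode array, which is also how the `->15` style operands in the VLD dump read.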
Some time ago I base64-encoded random bytes from an OpenSSL SHA-256 digest in C (as uint8_t), fed them into a shell script, and used its output.
What I can recreate from my data now is:
Content of file.txt:
uvjWEHTUk1LnzVZul9ynRpezWfKYN3bvlx103wxACxo
test#test:~# base64 -d file.txt | od -t x1
0000000 ba f8 d6 10 74 d4 93 52 e7 cd 56 6e 97 dc a7 46
0000020 97 b3 59 f2 98 37 76 ef 97 1d 74 df 0c 40 0b 1a
The output is the same as calling in PHP:
echo bin2hex(base64_decode("uvjWEHTUk1LnzVZul9ynRpezWfKYN3bvlx103wxACxo="));
baf8d61074d49352e7cd566e97dca74697b359f2983776ef971d74df0c400b1a
What I did all the time in shell and need to do now in PHP is the following:
Again, same content of file.txt:
uvjWEHTUk1LnzVZul9ynRpezWfKYN3bvlx103wxACxo
test#test:~# base64 -d file.txt | od -t x8
0000000 5293d47410d6f8ba 46a7dc976e56cde7
0000020 ef763798f259b397 1a0b400cdf741d97
My question: what is the PHP equivalent of od -t x8 in the shell?
I tried pack / unpack / bin2hex / ... and can't get the same result.
I'm trying to get a string with this content:
"5293d47410d6f8ba46a7dc976e56cde7ef763798f259b3971a0b400cdf741d97"
from a starting point of base64_decode("uvjWEHTUk1LnzVZul9ynRpezWfKYN3bvlx103wxACxo="). Any ideas?
If x8 (8-byte units) is really what you need, then the implementation is as simple as:
<?php
$str = 'uvjWEHTUk1LnzVZul9ynRpezWfKYN3bvlx103wxACxo';
$bin = base64_decode($str);
if (strlen($bin) % 8 !== 0) {
    throw new \RuntimeException('data length should be divisible by 8');
}
$result = '';
for ($i = 0; $i < strlen($bin); $i += 8) {
    for ($j = $i + 7; $j >= $i; --$j) {
        $result .= bin2hex($bin[$j]);
    }
}
echo $result;
It iterates over blocks of 8 bytes and dumps each block in reverse byte order, which is how od -t x8 prints 8-byte units on a little-endian machine.
Ideone: https://ideone.com/hBanqi
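The same block-wise reversal can also be sketched with built-in string functions instead of nested loops. This is a minimal variant, not the answer's own code: it starts from the hex dump shown in the question so that it is self-contained, and it assumes the input length is a multiple of 8 (a shorter final chunk would simply be reversed as-is).

```php
<?php
// Bytes from the question, written as hex so the example is self-contained.
$bin = hex2bin('baf8d61074d49352e7cd566e97dca74697b359f2983776ef971d74df0c400b1a');

// Split into 8-byte chunks, reverse each chunk, then hex-encode it.
$result = implode('', array_map(
    fn(string $chunk): string => bin2hex(strrev($chunk)),
    str_split($bin, 8)
));

echo $result, "\n";
// 5293d47410d6f8ba46a7dc976e56cde7ef763798f259b3971a0b400cdf741d97
```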
I'm trying to learn binary parsing by creating a simple WebM parser in PHP based on Matroska.
I read TimecodeScale, MuxingApp, WritingApp, etc. with unpack(format, data). My problem is that when I reach Duration (0x4489) in Segment Information (0x1549a966), I must read a float and, based on TimecodeScale, convert it to seconds (261.564s -> 00:04:21.564), and I don't know how.
This is a sample sequence:
`2A D7 B1 83 0F 42 40 4D 80 86 67 6F 6F 67 6C 65 57 41 86 67 6F 6F 67 6C 65 44 89 88 41 0F ED E0 00 00 00 00 16 54 AE 6B`
TimecodeScale := 2ad7b1 uint [ def:1000000; ]
MuxingApp := 4d80 string; ("google")
WritingApp := 5741 string; ("google")
Duration := 4489 float [ range:>0.0; ]
Tracks := 1654ae6b container [ card:*; ]{...}
I must read a float after (0x4489) and return 261.564s.
The duration is a double-precision floating-point value (64 bits) in IEEE 754 format, stored big-endian; in the sample, the size byte 88 after the ID 44 89 says that 8 data bytes follow. If you want to see how the conversion is done check this.
The TimecodeScale is the timestamp scale in nanoseconds.
In PHP you can do:
$bin = hex2bin('410fede000000000');
$timecode_scale = 1e6;
// endianness: 'd' uses machine byte order, so reverse the
// big-endian bytes when running on a little-endian machine
if (unpack('S', "\xff\x00")[1] === 0xff) {
    $bytes = unpack('C8', $bin);
    $bytes = array_reverse($bytes);
    $bin = implode('', array_map('chr', $bytes));
}
$duration = unpack('d', $bin)[1];
$duration_s = $duration * $timecode_scale / 1e9;
echo "duration={$duration_s}s\n";
Result:
duration=261.564s
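The question also asks for the 00:04:21.564 form. A possible sketch follows, under two assumptions: the PHP version supports the 'E' format code of unpack() (big-endian double, available since PHP 7.0.15 / 7.1.1), which removes the manual endianness check, and `formatDuration()` is a hypothetical helper name, not part of any library.

```php
<?php
// Hypothetical helper: turn whole milliseconds into HH:MM:SS.mmm.
function formatDuration(int $totalMs): string
{
    $ms = $totalMs % 1000;
    $totalSeconds = intdiv($totalMs, 1000);
    $seconds = $totalSeconds % 60;
    $minutes = intdiv($totalSeconds, 60) % 60;
    $hours = intdiv($totalSeconds, 3600);
    return sprintf('%02d:%02d:%02d.%03d', $hours, $minutes, $seconds, $ms);
}

// The 8 data bytes that follow 44 89 88 in the sample sequence.
$bin = hex2bin('410fede000000000');
$timecodeScale = 1000000; // nanoseconds per tick, taken from the sample

// 'E' reads a big-endian double regardless of the host byte order.
$ticks = unpack('E', $bin)[1];

// Convert to whole milliseconds first, so the split uses integer arithmetic.
$totalMs = (int) round($ticks * $timecodeScale / 1e6);

echo formatDuration($totalMs), "\n"; // 00:04:21.564
```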
I was in class the other day and this snippet of code was presented:
<?php
//Initialize the input
$score=rand(50,100);
//Determine the Grade
$grade=($score>=90)?'A':(
($score>=80)?'B':(
($score>=70)?'C':(
($score>=60)?'D':'F')));
//Output the Results
echo "<h1>A score of $score = $grade</h1>";
?>
At the time I questioned the order of operations within the nested ternary operators, thinking that they would evaluate from the inside out, that is it would evaluate if $score were >= 60 first, then if $score >= 70, etc -- working through the whole stack every time regardless of the score.
To me it seems that this construct should follow the same order of precedence given to mathematical operators -- resolving the inner-most set of parentheses first and then working out, unless there is some order of operations unique to the ternary.
Unfortunately the discussion in class quickly became about winning an argument, when I really just wanted to understand how it works. So I have two questions:
(1) How would this statement be interpreted, and why?
and
(2) Is there some sort of stack trace or step-through tool that would allow me to watch how this code executes?
PHP respects brackets: each ( ... ) group is treated as a single operand of the ternary that encloses it. Grouping is not the same as evaluation order, though; a bracketed operand is only evaluated if its branch is actually taken.
Without brackets, PHP's ternary operator was unusual in being left-associative, so a chain was grouped from left to right. Relying on that was deprecated in PHP 7.4, and since PHP 8.0 a nested ternary without brackets is a compile error.
In this particular case, the brackets force the expression to be grouped right to left. This code is equivalent to:
if ($score >= 90) {
$grade = 'A';
}
elseif ($score >= 80) {
$grade = 'B';
}
elseif ($score >= 70) {
$grade = 'C';
}
...
The ternary operator is left-associative (before PHP 8), but with the brackets applied here it groups right to left.
You might use xdebug or phpdbg as a step debugger to step through your code and see how it evaluates.
There is also VulcanLogicDumper around, which shows the instructions:
http://3v4l.org/QeF9i/vld#tabs compared to an if-elseif-else structure http://3v4l.org/bZE6M/vld#tabs
line # * op fetch ext return operands
---------------------------------------------------------------------------------
3 0 > SEND_VAL 50
1 SEND_VAL 100
2 DO_FCALL 2 $0 'rand'
3 ASSIGN !0, $0
5 4 IS_SMALLER_OR_EQUAL ~2 90, !0
5 > JMPZ ~2, ->8
6 > QM_ASSIGN ~3 'A'
7 > JMP ->24
6 8 > IS_SMALLER_OR_EQUAL ~4 80, !0
9 > JMPZ ~4, ->12
10 > QM_ASSIGN ~5 'B'
11 > JMP ->23
7 12 > IS_SMALLER_OR_EQUAL ~6 70, !0
13 > JMPZ ~6, ->16
14 > QM_ASSIGN ~7 'C'
15 > JMP ->22
8 16 > IS_SMALLER_OR_EQUAL ~8 60, !0
17 > JMPZ ~8, ->20
18 > QM_ASSIGN ~9 'D'
19 > JMP ->21
20 > QM_ASSIGN ~9 'F'
21 > QM_ASSIGN ~7 ~9
22 > QM_ASSIGN ~5 ~7
23 > QM_ASSIGN ~3 ~5
24 > ASSIGN !1, ~3
10 25 ADD_STRING ~11 '%3Ch1%3EA+score+of+'
26 ADD_VAR ~11 ~11, !0
27 ADD_STRING ~11 ~11, '+%3D+'
28 ADD_VAR ~11 ~11, !1
29 ADD_STRING ~11 ~11, '%3C%2Fh1%3E'
30 ECHO ~11
31 > RETURN 1
How to read these Opcodes
I will try to explain the first JMPZ in the opcode, in order to understand how it evaluates:
Of interest is Line 5, Opcode Number 5:
5 > JMPZ ~2, ->8
This means: If compare with 90 (opcode 4) is false, then JUMP to Opcode 8.
Warning: ->8 doesn't mean jump to Line 8.
Now, what is Opcode 8? The comparison with 80
6 8 > IS_SMALLER_OR_EQUAL ~4 80, !0
And now it's safe to say that this doesn't evaluate from the inside out as you expected (60->70->80->90), but like an if-elseif-else structure (90->80->70->60).
The ternary operator short-circuits: only the selected operand is evaluated. This means a parenthesized operand does not matter until its branch is actually reached.
echo false ? (crash() / 0) : "Worked.";
Ternaries are evaluated left to right:
http://php.net/manual/en/language.operators.comparison.php#language.operators.comparison.ternary
So it will evaluate what is to the left of the ? and based on that, evaluate either the 1st or 2nd side of the :.
You could put in a function call that has a side effect to demonstrate this:
function p($a, $b) { echo $a . " >= " . $b . "\n"; return $a >= $b; }
$grade = (p($score, 90)) ? 'A' : (
    p($score, 80) ? 'B' : (
    p($score, 70) ? 'C' : (
    p($score, 60) ? 'D' : 'F')));
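A self-contained variant of the same experiment, as a sketch: instead of echoing, it records which thresholds were actually compared. The closure name `$atLeast` and the fixed score of 85 are assumptions made so the trace is reproducible.

```php
<?php
// Records every threshold that is actually compared.
$checked = [];
$atLeast = function (int $score, int $threshold) use (&$checked): bool {
    $checked[] = $threshold;
    return $score >= $threshold;
};

$score = 85; // fixed value instead of rand() so the result is repeatable
$grade = $atLeast($score, 90) ? 'A' : (
    $atLeast($score, 80) ? 'B' : (
    $atLeast($score, 70) ? 'C' : (
    $atLeast($score, 60) ? 'D' : 'F')));

echo "grade=$grade checked=", implode(',', $checked), "\n";
// grade=B checked=90,80
```

The comparisons against 70 and 60 never run for this score, which matches the if-elseif-else reading of the opcodes.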
My app has people put 24 groups of 4 statements in order. In each group of 4 one is the "D" statement, one is the "I" statement, one is the "S" statement, and one is the "C" statement.
So the end result looks something like ['ISCD','CISD','DISC','CISD','CISD','ISCD'...] because they are essentially rearranging the 4 letters.
In the end, they get a "score" for each letter using the following algorithm.
For each of I,S,C and D
Find the number of times that letter is first and multiply by 3
Find the number of times that letter is second and multiply by 2
Find the number of times that letter is third and multiply by 1
Total it up, and that is the score for that letter
The end result is that each letter (I,S,D,C) gets a score from 0 to 72, and there are always 144 total points given out (24 groups x 6 points per group).
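The scoring steps above can be sketched as follows. `discScores()` is a hypothetical helper name, and the three-group input is only a small illustration (a real response has 24 groups).

```php
<?php
/**
 * Hypothetical helper: tally the D/I/S/C scores from the ordered groups.
 * Each response is a 4-letter string such as 'ISCD' (first letter ranked first).
 */
function discScores(array $responses): array
{
    $scores = ['D' => 0, 'I' => 0, 'S' => 0, 'C' => 0];
    $weights = [3, 2, 1, 0]; // points for 1st, 2nd, 3rd and 4th position
    foreach ($responses as $response) {
        foreach (str_split($response) as $position => $letter) {
            $scores[$letter] += $weights[$position];
        }
    }
    return $scores;
}

$scores = discScores(['ISCD', 'CISD', 'DISC']);
print_r($scores); // D => 3, I => 7, S => 4, C => 4
```

Each group hands out 3 + 2 + 1 = 6 points, so three groups total 18 and the full 24 groups total 144.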
I want to map the results to 14 reports:
D
I
S
C
DI
IS
SC
CD
DS
IC
DIS
ISC
SCD
CDI
The idea is that if S is dominant, we choose the S report. If Both D and I are dominant, we choose the DI report. If none is particularly dominant, we choose the top 3. (there is no difference between DI and ID meaning which one is most dominant is irrelevant if they are both high)
So if the scores are D=50, I=48, S=20, C=26 then I want it to choose "DI" since D and I are dominant. There are (4!)^24 = 24^24 possible responses from the user that I need to map to 14 reports.
I understand that I will have to set the thresholds for what "dominant" means, but for starters, I want to assume all possible responses are equally likely, and to map all possible responses to the 14 reports to where each of the 14 reports is equally likely, given random input.
I expect it's 1 to 5 lines of code. It'll be in php but any language including math or pseudo code should be fine.
UPDATE:
I figured out a way to do it in one line of code, but it's not evenly distributed. Here's the PHP (no dependencies):
<?php
$totals=array();
$lets=array('D','I','S','C');
for($j=0;$j<100000;$j++)
{
    $vals=array('D'=>0,'I'=>0,'S'=>0,'C'=>0);
    for($i=0;$i<24;$i++)
    {
        shuffle($lets);
        $vals[$lets[0]]+=3;
        $vals[$lets[1]]+=2;
        $vals[$lets[2]]+=1;
    }
    $D=$vals['D'];$I=$vals['I'];$S=$vals['S'];$C=$vals['C'];
    //calculate which report
    $reportKey=($D>36?'D':'').($I>36?'I':'').($S>36?'S':'').($C>36?'C':'');
    if(!$reportKey)
        $reportKey="DIS";
    if(isset($totals[$reportKey]))
        $totals[$reportKey]+=1;
    else
        $totals[$reportKey]=1;
    echo $reportKey." $D $I $S $C <br>";
}
echo "<br>";
foreach ($totals as $k=>$v)
    echo "$k: $v<br>";
The magic line is
$reportKey=($D>36?'D':'').($I>36?'I':'').($S>36?'S':'').($C>36?'C':'');
That line says: if any value is over 36, include that letter. The output of the script looks like this:
SC 35 33 38 38
IC 33 42 32 37
DI 44 39 29 32
...
...
DC 46 21 35 42
DIS 38 37 40 29
IC 36 39 28 41
DS 41 36 42 25
C 36 34 29 45
IS 29 41 38 36
IS 28 46 41 29
DS 38 33 40 33
DS 41 33 40 30
DS: 1444
D: 889
IS: 1466
S: 910
C: 874
SC: 1442
IC: 1467
DI: 1569
ISC: 407
DSC: 386
DIS: 388
DC: 1487
DIC: 396
I: 875
As you can see, it automatically splits the results into 14 categories, but the distribution is uneven, with the 2-letter ones being far more likely.
You can do this recursively using Haskell e.g. as follows:
combinationsOf _ 0 = [[]]
combinationsOf [] _ = []
combinationsOf (x:xs) k = map (x:) (combinationsOf xs (k-1) ) ++ combinationsOf xs k
The results from GHCI:
*Main> concatMap (combinationsOf "DISC") [1,2,3]
["D","I","S","C","DI","DS","DC","IS","IC","SC","DIS","DIC","DSC","ISC"]
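Since the question is about PHP, here is a sketch of the same recursion in that language. It is a direct port of the Haskell definition above, using strings instead of character lists; it only enumerates the 14 report keys and does not address how to balance their likelihood.

```php
<?php
// All k-element combinations of $items, in the same order as the Haskell version.
function combinationsOf(array $items, int $k): array
{
    if ($k === 0) {
        return ['']; // exactly one way to pick nothing
    }
    if ($items === []) {
        return [];   // nothing left to pick from
    }
    $head = $items[0];
    $tail = array_slice($items, 1);
    // Either take the head and choose k-1 from the tail, or skip the head.
    $withHead = array_map(
        fn(string $rest): string => $head . $rest,
        combinationsOf($tail, $k - 1)
    );
    return array_merge($withHead, combinationsOf($tail, $k));
}

$reports = [];
foreach ([1, 2, 3] as $k) {
    $reports = array_merge($reports, combinationsOf(['D', 'I', 'S', 'C'], $k));
}
echo implode(',', $reports), "\n";
// D,I,S,C,DI,DS,DC,IS,IC,SC,DIS,DIC,DSC,ISC
```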
In this question a code bit is presented and the questioner wants to make it faster by eliminating the use of variables. Seems to me he's looking in the wrong place, but far be it from me to know. Here's the code:
while ($item = current($data))
{
    echo '<ATTR>',$item, '</ATTR>', "\n";
    next($data);
}
It seems to me that recreating the strings <ATTR> etc. (more than once on each line, and every time the line is processed) would have a cost, both in speed and in memory. Or is the PHP processor smart enough that there's no penalty for not putting the strings into variables before the loop?
I use variables for clarity and centralization in any case, but: is there a cost associated with using variables, not using variables, or what? (Anybody who wants to answer for other similar languages please feel free.)
If you really want to micro-optimize this way (I don't think it is that relevant or useful, btw, but I understand it's fun ^^), you can have a look at a PHP extension called Vulcan Logic Disassembler (VLD).
It allows you to get the bytecode generated for a PHP script.
To use it, launch the script from the command line with a command like this one:
php -dextension=vld.so -dvld.active=1 tests/temp/temp.php
For instance, with this script:
$data = array('a', 'b', 'c', 'd');
while ($item = current($data))
{
    echo '<ATTR>',$item, '</ATTR>', "\n";
    next($data);
}
You will get this bytecode dump :
line # op fetch ext return operands
-------------------------------------------------------------------------------
8 0 EXT_STMT
1 INIT_ARRAY ~0 'a'
2 ADD_ARRAY_ELEMENT ~0 'b'
3 ADD_ARRAY_ELEMENT ~0 'c'
4 ADD_ARRAY_ELEMENT ~0 'd'
5 ASSIGN !0, ~0
9 6 EXT_STMT
7 EXT_FCALL_BEGIN
8 SEND_REF !0
9 DO_FCALL 1 'current'
10 EXT_FCALL_END
11 ASSIGN $3 !1, $2
12 JMPZ $3, ->24
11 13 EXT_STMT
14 ECHO '%3CATTR%3E'
15 ECHO !1
16 ECHO '%3C%2FATTR%3E'
17 ECHO '%0A'
12 18 EXT_STMT
19 EXT_FCALL_BEGIN
20 SEND_REF !0
21 DO_FCALL 1 'next'
22 EXT_FCALL_END
13 23 JMP ->7
37 24 RETURN 1
25* ZEND_HANDLE_EXCEPTION
And with this script :
$data = array('a', 'b', 'c', 'd');
while ($item = current($data))
{
    echo "<ATTR>$item</ATTR>\n";
    next($data);
}
You will get :
line # op fetch ext return operands
-------------------------------------------------------------------------------
19 0 EXT_STMT
1 INIT_ARRAY ~0 'a'
2 ADD_ARRAY_ELEMENT ~0 'b'
3 ADD_ARRAY_ELEMENT ~0 'c'
4 ADD_ARRAY_ELEMENT ~0 'd'
5 ASSIGN !0, ~0
20 6 EXT_STMT
7 EXT_FCALL_BEGIN
8 SEND_REF !0
9 DO_FCALL 1 'current'
10 EXT_FCALL_END
11 ASSIGN $3 !1, $2
12 JMPZ $3, ->25
22 13 EXT_STMT
14 INIT_STRING ~4
15 ADD_STRING ~4 ~4, '%3CATTR%3E'
16 ADD_VAR ~4 ~4, !1
17 ADD_STRING ~4 ~4, '%3C%2FATTR%3E%0A'
18 ECHO ~4
23 19 EXT_STMT
20 EXT_FCALL_BEGIN
21 SEND_REF !0
22 DO_FCALL 1 'next'
23 EXT_FCALL_END
24 24 JMP ->7
39 25 RETURN 1
26* ZEND_HANDLE_EXCEPTION
(This output is from PHP 5.2.6, which is the default on Ubuntu Jaunty.)
In the end, you will probably notice there is not much of a difference, and that it's often really just micro-optimisation ^^
What might be more interesting is to look at the differences between versions of PHP: you might see that some operations have been optimized between PHP 5.1 and 5.2, for instance.
For more information, you can also have a look at Understanding Opcodes.
Have fun!
EDIT : adding another test :
With this code :
$attr_open = '<ATTR>';
$attr_close = '</ATTR>';
$eol = "\n";
$data = array('a', 'b', 'c', 'd');
while ($item = current($data))
{
    echo $attr_open, $item, $attr_close, $eol;
    next($data);
}
You get :
line # op fetch ext return operands
-------------------------------------------------------------------------------
19 0 EXT_STMT
1 ASSIGN !0, '%3CATTR%3E'
20 2 EXT_STMT
3 ASSIGN !1, '%3C%2FATTR%3E'
21 4 EXT_STMT
5 ASSIGN !2, '%0A'
23 6 EXT_STMT
7 INIT_ARRAY ~3 'a'
8 ADD_ARRAY_ELEMENT ~3 'b'
9 ADD_ARRAY_ELEMENT ~3 'c'
10 ADD_ARRAY_ELEMENT ~3 'd'
11 ASSIGN !3, ~3
24 12 EXT_STMT
13 EXT_FCALL_BEGIN
14 SEND_REF !3
15 DO_FCALL 1 'current'
16 EXT_FCALL_END
17 ASSIGN $6 !4, $5
18 JMPZ $6, ->30
26 19 EXT_STMT
20 ECHO !0
21 ECHO !4
22 ECHO !1
23 ECHO !2
27 24 EXT_STMT
25 EXT_FCALL_BEGIN
26 SEND_REF !3
27 DO_FCALL 1 'next'
28 EXT_FCALL_END
28 29 JMP ->13
43 30 RETURN 1
31* ZEND_HANDLE_EXCEPTION
And, with this one (concatenations instead of ',') :
$attr_open = '<ATTR>';
$attr_close = '</ATTR>';
$eol = "\n";
$data = array('a', 'b', 'c', 'd');
while ($item = current($data))
{
    echo $attr_open . $item . $attr_close . $eol;
    next($data);
}
you get :
line # op fetch ext return operands
-------------------------------------------------------------------------------
19 0 EXT_STMT
1 ASSIGN !0, '%3CATTR%3E'
20 2 EXT_STMT
3 ASSIGN !1, '%3C%2FATTR%3E'
21 4 EXT_STMT
5 ASSIGN !2, '%0A'
23 6 EXT_STMT
7 INIT_ARRAY ~3 'a'
8 ADD_ARRAY_ELEMENT ~3 'b'
9 ADD_ARRAY_ELEMENT ~3 'c'
10 ADD_ARRAY_ELEMENT ~3 'd'
11 ASSIGN !3, ~3
24 12 EXT_STMT
13 EXT_FCALL_BEGIN
14 SEND_REF !3
15 DO_FCALL 1 'current'
16 EXT_FCALL_END
17 ASSIGN $6 !4, $5
18 JMPZ $6, ->30
26 19 EXT_STMT
20 CONCAT ~7 !0, !4
21 CONCAT ~8 ~7, !1
22 CONCAT ~9 ~8, !2
23 ECHO ~9
27 24 EXT_STMT
25 EXT_FCALL_BEGIN
26 SEND_REF !3
27 DO_FCALL 1 'next'
28 EXT_FCALL_END
28 29 JMP ->13
43 30 RETURN 1
31* ZEND_HANDLE_EXCEPTION
So, never much of a difference ^^
Here is an interesting one, my initial tests show that storing the newline char into a variable instead of PHP parsing it with each iteration is faster. See below:
$nl = "\n";
while ($item = current($data))
{
    echo '<ATTR>',$item, '</ATTR>',$nl;
    next($data);
}
There seems to be no measurable difference in using the string literals inside the loop vs. moving them to variables outside the loop. I threw together the following simple script to test this:
$length = 100000;
$data = array();
$totals = array();
for ($i = 0; $i < $length; $i++) {
    $data[] = rand(1,1000);
}
$start = xdebug_time_index();
while ($item = current($data))
{
    echo '<ATTR>',$item,'</ATTR>',PHP_EOL;
    next($data);
}
$end = xdebug_time_index();
$total = $end - $start;
$totals["Warmup:"] = $total;
reset($data);
$start = xdebug_time_index();
while ($item = current($data))
{
    echo '<ATTR>',$item,'</ATTR>',PHP_EOL;
    next($data);
}
$end = xdebug_time_index();
$total = $end - $start;
$totals["First:"] = $total;
reset($data);
$startTag = '<ATTR>';
$endTag = '</ATTR>';
$start = xdebug_time_index();
while ($item = current($data))
{
    echo $startTag,$item,$endTag,PHP_EOL;
    next($data);
}
$end = xdebug_time_index();
$total = $end - $start;
$totals["Second:"] = $total;
foreach ($totals as $label => $time) {
    echo $label,' ', $time,PHP_EOL;
}
I ran this several times and saw no discernible difference between the methods. In fact, sometimes the warmup was the fastest of the three.
When trying to microoptimize things such as this you really end up measuring the performance of the machine you are on more often than the actual code. Of note, you may want to use PHP_EOL instead of \n or defining a variable containing such.
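For readers without Xdebug installed, a comparable measurement can be sketched with the built-in hrtime() function (PHP 7.3+). This is an assumption-laden variant, not the script above: `timeNs()` is a hypothetical helper, the loops use foreach rather than current()/next(), and output is buffered and discarded so that terminal I/O does not dominate the timings.

```php
<?php
// Hypothetical helper: run a callable and return the elapsed nanoseconds.
// Output is captured and thrown away so only the loop itself is timed.
function timeNs(callable $fn): int
{
    ob_start();
    $start = hrtime(true);
    $fn();
    $elapsed = hrtime(true) - $start;
    ob_end_clean();
    return $elapsed;
}

$data = range(1, 100000);

// Variant 1: string literals inside the loop.
$literals = function () use ($data) {
    foreach ($data as $item) {
        echo '<ATTR>', $item, '</ATTR>', "\n";
    }
};

// Variant 2: the same strings held in variables defined before the loop.
$open = '<ATTR>';
$close = '</ATTR>';
$eol = "\n";
$variables = function () use ($data, $open, $close, $eol) {
    foreach ($data as $item) {
        echo $open, $item, $close, $eol;
    }
};

$timings = [
    'warmup' => timeNs($literals),
    'literals' => timeNs($literals),
    'variables' => timeNs($variables),
];

foreach ($timings as $label => $ns) {
    printf("%-10s %.3f ms\n", $label, $ns / 1e6);
}
```

The absolute numbers depend on the machine and PHP build, so they are only meaningful relative to each other and across repeated runs.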
Actually this is probably the fastest implementation. You could try to concatenate everything into one string, but all of the concat operations are pretty expensive.
Everything has a cost. The goal is to minimize that cost as much as possible.
If you were thinking about concatenation check this resource for information on its performance. It's probably best to leave the code as-is.
If you really want to speed this up, use this instead:
ob_start();
while ($item = current($data))
{
    echo '<ATTR>',$item, '</ATTR>', "\n";
    next($data);
}
Output buffering flushes content more efficiently to the client, which speeds up your code much more than any micro-optimization can.
As an aside, in my experience micro-optimization is a useless endeavour when it comes to PHP code. I've never seen a performance problem get solved by clever use of a particular concatenation or variable declaration method. Real solutions tend to involve change to design or architecture or the use of less complicated algorithms.