PHP accelerator vs Just-in-Time Compilation - PHP

From Wikipedia:
Most PHP accelerators work by caching the compiled bytecode of PHP
scripts to avoid the overhead of parsing and compiling source code on
each request (some or all of which may never even be executed). To
further improve performance, the cached code is stored in shared
memory and directly executed from there, minimizing the amount of slow
disk reads and memory copying at runtime.
Just-in-time compilation:
JIT compilers represent a hybrid approach, with translation occurring
continuously, as with interpreters, but with caching of translated
code to minimize performance degradation.
So does using a PHP accelerator such as APC have equivalent performance implications to "just-in-time" compiling PHP (assuming that's even possible)? In fact, are they actually the same thing?

Same concept, different execution.
When JIT is spoken of in most circles, it refers to compiling virtual machine bytecode into native machine code. For example, Facebook's HHVM is a PHP implementation that uses a JIT engine.
However, PHP's native virtual machine doesn't JIT-compile to native machine code. In fact, it doesn't do JIT at all in the traditional sense: whole files are compiled to PHP bytecode on demand, but that isn't JIT.
Be careful with the term "PHP accelerator." Back in the PHP 4 days, the bytecode created by the PHP parser could be optimized a bit to get better performance. This hasn't been needed since early PHP 5. The only thing that APC, the Zend "Optimizer" and other "accelerator" products now do to increase PHP performance is cache bytecode. To avoid ambiguity, the term "accelerator" should no longer be used.
If you're using standard PHP, then you do want a bytecode cache; just steer clear of products claiming to do PHP bytecode optimization.

Related

Do I pay a performance penalty as PHP recompiles my source code for every request?

I know PHP is mostly an interpreted language. Does the PHP interpreter (php.exe on Windows, the php binary on Linux) do the interpretation every time my script executes, or only when I change the source? To put it another way, does the PHP interpreter cache interpreted scripts or not?
Yes, you pay a performance penalty, because PHP does the interpretation every time. However, if you have APC (Alternative PHP Cache: http://php.net/apc) installed and configured, it will keep the whole bytecode in memory and rebuild it only when the source changes.
This is in essence what happens every time a request arrives:
1. PHP reads the file
2. PHP compiles the file into a form it can execute, the so-called opcode
3. PHP runs the opcode
There is some overhead in compiling the file into opcode, as many have already pointed out, and PHP by default has no cache, so it performs this "compilation" step on every request even if the file didn't change.
There are some optional modules that provide opcode caches to avoid that overhead; the most commonly recommended is APC, since it will ship by default with PHP 6.
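For illustration, enabling such an opcode cache is typically just a php.ini change. A minimal sketch, assuming the classic APC extension (the directive names come from APC's documentation; the values are illustrative, not recommendations):

```ini
; php.ini excerpt: load and enable the APC opcode cache
extension=apc.so
apc.enabled=1
apc.shm_size=64M   ; shared memory segment that holds the cached opcode
apc.stat=1         ; re-check file mtimes so edited scripts are recompiled
```

With apc.stat=0 the mtime check is skipped as well, which is faster still, but then the cache must be flushed manually after every deployment.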
Yes.
Because PHP is an interpreted language, you do pay a performance penalty.
However, there is some research in the direction of compiling PHP and running the result.
Take a look at PHP Accelerator.
Most PHP accelerators work by caching the compiled bytecode of PHP
scripts to avoid the overhead of parsing and
compiling source code on each request (some or even most of which may
never be executed). To further improve performance, the cached code is
stored in shared memory and directly executed from there, minimizing
the amount of slow disk reads and memory copying at runtime.

Compiling PHP to opcode and executing the opcode

PHP is usually compiled to opcode by the Zend engine at execution time.
To skip compiling on every request, one can use an opcode cache like APC to save the opcode in shared memory and reuse it.
Okay, so it seems there is no solution yet for simply compiling PHP to opcode ahead of time and running that, similar to how Java works.
But why? I am wondering about this because it is quite an obvious idea, so I guess there is a reason for it.
EDIT:
the core question is this:
wouldn't ahead-of-time compilation of PHP make opcode caching superfluous?
The only "reason" against it would be that you couldn't just fix something on the live system ... which is bad practice anyway.
You've given one reason against it.
Another very important one is that if you separate the compile step from the runtime, both in time and in terms of the hardware where each runs, you quickly run into complex dependency problems: what happens when you try to run opcode generated by PHP 5.1 on a PHP 5.3 runtime?
It also makes debugging of code harder - since the debugger has to map the opcode back to the source code.
But a very important question you don't seem to have asked, let alone answered, is: what is the benefit of pre-generating the opcode?
Would compiling the opcode before runtime have a significant benefit over caching the opcode? The difference would be unmeasurably small.
Certainly the raison d'être for HipHop is that natively compiled PHP code runs faster than PHP with opcode caching, at the expense of some functionality. But that's something quite different.
Do you think that having only the opcodes on the server improves the security (by obscurity)?

Question of PHP cache vs compile

From my understanding, if you use a PHP caching program like APC, eAccelerator, etc., then opcodes will be stored in memory for faster execution on subsequent requests. My question is: why wouldn't it ALWAYS be better/faster to compile your scripts, assuming you're using a compiler like phc or even HPHP (although I know they have issues with dynamic constructs)? Why bother storing opcodes, which the Zend Engine still has to read and execute via its C functions, when you can just compile and skip that step?
You cannot simply compile PHP to C and have your script execute the same way. HPHP does real compilation, but it doesn't support the full set of PHP features.
Other "compilers" actually just embed a PHP interpreter in the binary, so you aren't really compiling the code anyway.
PHP is not meant to be compiled. Opcode caching is very fast and good enough for 99% of applications out there. If you have Facebook levels of traffic, and you have already optimized your back-end database, compilation might be the only way left to increase performance.
PHP is not a thin layer over the standard C library.
If PHP didn't have eval(), it would probably be possible to do a straight PHP-to-compiled-binary translation with (relative) ease. But since PHP can itself dynamically build and execute scripts on the fly via eval(), a full-on binary isn't possible. Any binary would necessarily have to contain the entirety of PHP, because the compiler has no idea what your dynamic code could do. You'd go from a small 1 or 2 KB script to a massive multi-megabyte binary.
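To make the eval() point concrete, here is a minimal sketch (plain PHP, no extensions assumed) of code whose behaviour only exists at runtime, which is exactly what defeats a straight PHP-to-binary translation:

```php
<?php
// The expression below is assembled as a string at runtime. A static
// compiler cannot know what it will contain, so it would have to bundle
// the entire PHP engine into the binary just in case.
$op = '+';                    // imagine this arrived from user input
$src = "return 2 $op 3;";     // PHP source built on the fly
$result = eval($src);         // parsed, compiled and executed right now
echo $result . "\n";          // prints 5
```

Because $src could just as easily have been read from a file or a database, no ahead-of-time compiler can see it.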

Use APC to cache source files in PHP, does it work?

I was browsing through the docs of APC (Alternative PHP Cache) and saw that it has a function called apc_compile_file. The docs say that this function:
Stores a file in the bytecode cache,
bypassing all filters.
Is this like HipHop's idea of turning PHP code into more optimized code? If not, can someone educate me, because I'm somewhat lost here. If it is indeed like that, then why does APC, which is older than HipHop, not get all the fuss that HipHop gets?
Best regards!
The two are very, very different.
APC isn't a bytecode optimizer, just a bytecode cache. It saves the need for the PHP script to be parsed (or even read from the .php file on disk) on subsequent accesses, but the result is still executed as PHP bytecode.
HipHop doesn't just optimize PHP code; it transforms it into compilable C++ code, then compiles that into a native executable on the server. By its nature as compiled code, it then runs significantly faster than any scripted language.

Can Ruby, PHP, or Perl create a pre-compiled file for the code like Python?

Python can create a pre-compiled version, file.pyc, so that the program can be run without being parsed and compiled again. Can Ruby, PHP, and Perl do the same from the command line?
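For reference, the Python behaviour the question describes can be reproduced explicitly with the standard-library py_compile module:

```python
# Explicitly produce the .pyc bytecode file that the Python
# interpreter can later load instead of re-parsing the source.
import os
import py_compile
import tempfile

src = os.path.join(tempfile.mkdtemp(), "hello.py")
with open(src, "w") as f:
    f.write("print('hello')\n")

pyc = py_compile.compile(src)   # writes the compiled bytecode file
print(os.path.exists(pyc))      # True
```

The same effect is available from the command line via `python -m py_compile hello.py`.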
There is no portable bytecode specification for Ruby, and thus also no standard way to load precompiled bytecode archives. However, almost all Ruby implementations use some kind of bytecode or intcode format, and several of them can dump and reload bytecode archives.
YARV always compiles to bytecode before executing the code, however that is usually only done in memory. There are ways to dump out the bytecode to disk. At the moment, there is no way to read it back in, however. This will change in the future: work is underway on a bytecode verifier for YARV, and once that is done, bytecode can safely be loaded into the VM, without fear of corruption. Also, the JRuby developers have indicated that they are willing to implement a YARV VM emulator inside JRuby, once the YARV bytecode format and verifier are stabilized, so that you could load YARV bytecode into JRuby. (Note that this version is obsolete.)
Rubinius also always compiles to bytecode, and it has a format for compiled files (.rbc files, analogous to JVM .class files) and there is talk about a bytecode archive format (.rba files, analogous to JVM .jar files). There is a chance that Rubinius might implement a YARV emulator, if deploying apps as YARV bytecode ever becomes popular. Also, the JRuby developers have indicated that they are willing to implement a Rubinius bytecode emulator inside JRuby, if Rubinius bytecode becomes a popular way of deploying Ruby apps. (Note that this version is obsolete.)
XRuby is a pure compiler, it compiles Ruby sourcecode straight to JVM bytecode (.class files). You can deploy these .class files just like any other Java application.
JRuby started out as an interpreter, but it has both a JIT compiler and an AOT compiler (jrubyc) that can compile Ruby sourcecode to JVM bytecode (.class files). Also, work is underway to create a new compiler that can compile (type-annotated) Ruby code to JVM bytecode that actually looks like a Java class and can be used from Java code without barriers.
Ruby.NET is a pure compiler that compiles Ruby sourcecode to CIL bytecode (PE .dll or .exe files). You can deploy these just like any other CLI application.
IronRuby also compiles to CIL bytecode, but typically does this in-memory. However, you can pass commandline switches to it, so it dumps the .dll and .exe files out to disk. Once you have those, they can be deployed normally.
BlueRuby automatically pre-parses Ruby sourcecode into BRIL (BlueRuby Intermediate Language), which is basically a serialized parsetree. (See Blue Ruby - A Ruby VM in SAP ABAP(PDF) for details.)
I think (but I am definitely not sure) that there is a way to get Cardinal to dump out Parrot bytecode archives. (Actually, Cardinal only compiles to PAST, and then Parrot takes over, so it would be Parrot's job to dump and load bytecode archives.)
Perl 5 can dump the bytecodes to disk, but it is buggy and nasty. Perl 6 has a very clean method of creating bytecode executables that Parrot can run.
Perl's just-in-time compilation is fast enough that this doesn't matter in most circumstances. One place where it does matter is in a CGI environment which is what mod_perl is for.
For hysterical raisins, Perl 5 looks for .pmc files ahead of .pm files when searching for a module. These files could contain bytecode, though Perl doesn't write bytecode out by default (unlike Python).
Module::Compile (or: what's this PMC thingy?) goes into some more depth about this obscure feature. They're not frequently used, but...
The clever folks who wrote Module::Compile take advantage of this, to pre-compile Perl code into... well, it's still Perl, but it's preprocessed.
Among other benefits, this speeds up loading time and makes debugging easier when using source filters (Perl code modifying Perl source code before being loaded by the interpreter).
Not for PHP, although most PHP setups incorporate a Bytecode Cache that will cache the compiled bytecode so that next time the script runs, the compiled version is run. This speeds up execution considerably.
There's no way I'm aware of to actually get at the bytecode through the command line.
For Perl you can try using B::Bytecode and perlcc. However, both of these are highly experimental. And Perl 6 is coming out soon (theoretically) and will be on Parrot and will use a different bytecode and so all of this will be somewhat moot then.
here are some example magic words for the command-line
perl -MO=Bytecode,-H,-oModule.pmc Module.pm
According to the third edition of Programming Perl, it is possible to approximate this in some experimental ways.
If you use Zend Guard on your PHP scripts, it essentially precompiles the scripts to byte-code, which can then be run by the PHP engine if the Zend Optimizer extension is loaded.
So, yes, Zend Guard/Optimizer permits pre-compiled PHP scripts to be used.
For PHP, the Phalanger project compiles down to .NET assemblies. I'm not sure if that's what you were looking for, though.
Has anyone considered using LLVM's bytecode, instead of yet another custom bytecode?
Ruby 1.8 doesn't actually use bytecode at all (even internally), so there is no pre-compilation step.