Is PHP compiled or interpreted?
PHP is an interpreted language. The binary that lets you interpret PHP is compiled, but what you write is interpreted.
You can see more on the Wikipedia page for interpreted languages.
Both. PHP is compiled down to an intermediate bytecode that is then interpreted by the runtime engine.
The PHP compiler's job is to parse your PHP code and convert it into a form suitable for the runtime engine. Among its tasks:
Ignore comments
Resolve variables, function names, and so forth and create the symbol table
Construct the abstract syntax tree of your program
Write the bytecode
Depending on your PHP setup, this step is typically done just once, the first time the script is called. The compiler output is cached to speed up access on subsequent uses. If the script is modified, however, the compilation step is done again.
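If you want to check whether this caching is actually happening on your setup, something like the following might help. It is only a sketch: it assumes the cache in question is the OPcache extension (which may not be loaded at all, and is often disabled for the CLI), and describeOpcodeCache() is a helper name made up for this example. Whether a modified script gets recompiled is governed by settings such as opcache.validate_timestamps.

```php
<?php
// Report whether OPcache currently holds a compiled copy of a script.
// describeOpcodeCache() is a hypothetical helper, not part of PHP.
function describeOpcodeCache(string $file): string
{
    if (!function_exists('opcache_get_status')) {
        return 'OPcache extension not loaded: the script is compiled on each run';
    }
    $status = opcache_get_status(false); // false: skip per-script details
    if ($status === false || empty($status['opcache_enabled'])) {
        return 'OPcache loaded but not enabled here';
    }
    return opcache_is_script_cached($file)
        ? 'compiled opcodes for this file are cached'
        : 'file not cached yet (it is compiled on first use)';
}

echo describeOpcodeCache(__FILE__), PHP_EOL;
```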
The runtime engine walks the AST and bytecode when the script is called. The symbol table is used to store the values of variables and provide the bytecode addresses for functions.
This process of compiling to bytecode and interpreting it at runtime is typical for languages that run on some kind of virtual machine, including Perl, Java, Ruby, Smalltalk, and others.
Compiled code can be executed directly by the computer's CPU. That is, the executable code is expressed in the CPU's native language.
The code of interpreted languages must be translated at run-time from any format to CPU machine instructions. This translation is done by an interpreter.
It would not be proper to say that a language is interpreted or compiled, because interpretation and compilation are both properties of the implementation of that particular language and not a property of the language as such. So, any language can be compiled or interpreted — it just depends on what the particular implementation that you are using does.
The most widely used PHP implementation is powered by the Zend Engine and is known simply as PHP. The Zend Engine compiles PHP source into a format that it can execute, thus the Zend Engine works as an interpreter.
In general it is interpreted, but it can sometimes be compiled, which really increases performance.
Open source tool to perform this operation:
hhvm.com
PHP is an interpreted language. It can be compiled to bytecode by third-party tools, though.
I know this question is old but it's linked all over the place and I think all answers here are incorrect (maybe because they're old).
There is NO such thing as an interpreted language or a compiled language. Any programming language can be interpreted and/or compiled.
First of all a language is just a set of rules, so when we are talking about compilation we refer to specific implementations of that language.
HHVM, for example, is an implementation of PHP. It first transforms the code into intermediate HipHop bytecode, which is then translated into machine code by a JIT compiler. Is that enough to say it is compiled? Some Java implementations (not all) also use JIT. Google's V8 also uses JIT.
Using the old definitions of compiled vs. interpreted does not make sense nowadays.
"Is PHP compiled?" is a nonsensical question given that there are no longer clear and agreed delimiters between what is a compiled language vs an interpreted one.
One possible way to delimit them is (I don't find any meaning in this dichotomy):
compiled languages use Ahead of Time compilation (C, C++);
interpreted languages use Just in Time compilation or no compilation at all (Python, Ruby, PHP, Java).
This is a meaningless question. PHP uses yacc (bison), just like GCC. yacc is a "compiler compiler". The output of yacc is a compiler. The output of a compiler is "compiled". PHP is parsed by the output of yacc. So it is, by definition, compiled.
If that doesn't satisfy, consider the following. Both php (the binary) and gcc read your source code and produce an abstract syntax tree. Under versions 4 and 5, php then walks the tree to translate the program to bytecode (the compilation step). You can see the bytecode translated to opcodes (which are analogous to assembly) using the Vulcan Logic Dumper. Finally, php (in particular, the Zend engine) interprets the bytecode. gcc, in comparison, walks the tree and outputs assembly; it can also run assemblers and linkers to finish the process. Calling a program handled by one "interpreted" and another program handled by the other "compiled" is meaningless. After all, programs are both run through a "compiler" with both.
You should actually ask the question you want to ask instead. ("Do I pay a performance penalty as PHP recompiles my source code for every request?", etc.)
Just keep in mind: if you need the source code every time you run the program, it is using an interpreter. So it's an interpreted language.
On the other hand, if you compile the source code and generate compiled code which you can execute, then it is using a compiler, as you no longer need the source code. Examples are C and Java.
At least it doesn't compile (or should I say optimize) the code as much as one might want it.
This code...
for($i=0;$i<100000000;$i++);
echo $i;
...delays the program equally much each time it is run.
It could have detected that it is a calculation that only needs to be done the first time.
The accepted answer is blatantly false. PHP IS compiled. End of story. Maybe not to native instructions but to an interpreted bytecode.
As per Wikipedia:
Scripts are loaded into memory and compiled into Zend opcodes
One line below, it says:
The interpreter part analyzes the input code, translates it, and executes it.
As far as I know, the code is loaded into memory, then goes through lexical analysis, gets parsed, and is compiled to opcodes. I'm still totally confused even after reading a ton of articles about the engine. So in the end, is PHP code compiled or interpreted?
I think the distinction between "compiling" and "interpreting" is less clear in practice than Computer Science lessons would imply, as is the distinction between a "runtime environment" and a "virtual machine".
The answer is essentially that it is both: the Zend Engine first compiles your PHP code to an intermediate representation called "opcodes"; it then interprets these opcodes to execute the code.
In some ways, this is similar to the way Java is first compiled to bytecode, and then executed on the Java Virtual Machine; however, the "VM" which executes the code in the Zend Engine is not defined like a real processor, and is closely tied to the PHP language. It therefore acts more like a traditional interpreter, but of a language that no human would write.
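To make the "compile first, then interpret" idea concrete, here is a toy sketch written in PHP itself. The opcode names (PUSH, ADD, MUL) and both functions are invented for illustration and have nothing to do with the real Zend opcodes; the point is only that the source is translated once into a flat instruction list, and a separate loop then executes that list.

```php
<?php
// Toy "compiler": turns a postfix expression such as "2 3 + 4 *"
// into a flat list of made-up opcodes. These are NOT Zend opcodes.
function compileToOpcodes(string $source): array
{
    $ops = [];
    foreach (preg_split('/\s+/', trim($source)) as $token) {
        if (is_numeric($token)) {
            $ops[] = ['PUSH', (float) $token];
        } elseif ($token === '+') {
            $ops[] = ['ADD'];
        } elseif ($token === '*') {
            $ops[] = ['MUL'];
        } else {
            throw new InvalidArgumentException("Unknown token: $token");
        }
    }
    return $ops;
}

// Toy "virtual machine": interprets the opcode list using a stack.
function execute(array $ops): float
{
    $stack = [];
    foreach ($ops as $op) {
        switch ($op[0]) {
            case 'PUSH':
                $stack[] = $op[1];
                break;
            case 'ADD':
                $b = array_pop($stack);
                $a = array_pop($stack);
                $stack[] = $a + $b;
                break;
            case 'MUL':
                $b = array_pop($stack);
                $a = array_pop($stack);
                $stack[] = $a * $b;
                break;
        }
    }
    return array_pop($stack);
}

$opcodes = compileToOpcodes('2 3 + 4 *'); // compile once
echo execute($opcodes), PHP_EOL;           // interpret: (2 + 3) * 4
```

The compiled list could be kept around and executed again without re-reading the source, which is roughly the idea behind an opcode cache.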
The Zend Engine is responsible for the following tasks in PHP:
High performance parsing (including syntax checking), in-memory compilation and execution of PHP scripts [..]
Source: http://www.zend.com/products/zend_engine/in_depth
I'm going through the PHP manual and found the word 'userland' a couple of times. What does that usually mean? I found it in this page; I think it's the source code itself but I'm not sure.
From PHP manual:
While executing in a debug environment, configured with --enable-debug, the leak function used in the next example is actually implemented by the engine and is available to call in userland.
Since this question does not have a correct answer yet (despite an answer being selected), I'll go ahead and answer this.
The PHP core development team makes three main distinctions when referring to PHP:
PHP core. This refers to the Zend engine that powers PHP. It does things like tokenize userland code, handle memory management, process built-in keywords (if-else, while, isset, etc), and more. That last bit is why built-ins are many times faster than function calls. What PHP core generally does not do is implement functions like substr(), fopen(), etc., which is left to...
PHP extensions. This refers to the majority of the PHP source code but also PECL extensions and other PHP extensions written in C (and sometimes C++). All of the core functions and classes that are always available with PHP are actually implemented in extensions with the biggest extension being 'ext/standard'.
PHP userland. This refers to code that users of PHP generally write that leverage various PHP extensions and the core.
When you see the phrase "pure PHP userland" usually in reference to a PHP userland library that someone writes, they generally mean without dependencies on anything outside of PHP built-ins, extensions that may not be compiled in or available on a host, or external software not in the PHP ecosystem.
The use of any of these phrases may be an indicator that the person tends to lurk on the PHP internals mailing list. Most PHP developers are userland devs and have little to no knowledge of the inner workings of PHP itself.
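If you are curious where a given function lives, reflection can tell you. A small sketch follows; shout() and whereDefined() are names made up for this example, while ReflectionFunction and its methods are part of PHP.

```php
<?php
// A userland function: written in PHP by the application developer.
function shout(string $text): string
{
    return strtoupper($text) . '!';
}

// Classify a function as userland or extension-provided via reflection.
function whereDefined(string $function): string
{
    $ref = new ReflectionFunction($function);
    if ($ref->isUserDefined()) {
        return 'userland';
    }
    // Internal functions report the extension that registers them.
    return 'extension: ' . $ref->getExtensionName();
}

echo whereDefined('shout'), PHP_EOL;  // userland
echo whereDefined('substr'), PHP_EOL; // extension: standard
```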
It's not a PHP term, but a general computing one:
The term userland (or user space) refers to all code which runs outside the operating system's kernel.
https://en.wikipedia.org/wiki/User_space
I was just thinking to myself "How exactly is a PHP script executed?" I thought it was parsed first for syntax errors etc, and then interpreted and executed.
However, I don't know why I believe that is correct. I'm probably wrong.
So, how exactly is a PHP file interpreted and executed? What stages does this involve? How do included files fit into the parsing of the script?
This is just to help me get my head around it. I'm interested and can not find a good answer with Google.
PHP is a compiled language since PHP 4.0
The idea of what is a compiler seems to be a subject that causes great confusion. Some people assume that a compiler is a program that converts source code in one language into an executable program. The definition of what is a compiler is actually broader than that.
A compiler is a program that transforms source code into another representation of the code. The target representation is often machine code, but it may as well be source code in another language or even in the same language.
PHP became a compiled language in the year 2000, when PHP 4 was released for the first time. Until version 3, PHP source code was parsed and executed right away by the PHP interpreter.
PHP 4 introduced the Zend engine. This engine splits the processing of PHP code into several phases. The first phase parses PHP source code and generates a binary representation of the PHP code known as Zend opcodes. Opcodes are sets of instructions similar to Java bytecodes. These opcodes are stored in memory. The second phase of Zend engine processing consists of executing the generated opcodes.
For more information go to http://www.phpclasses.org/blog/post/117-PHP-compiler-performance.html
Basically, each time a PHP script is loaded, it goes through two steps:
The PHP source code is parsed, and converted to what's called opcodes
Kind of an equivalent of Java's bytecode
If you want to see what those look like, you can use the VLD extension
Then, those opcodes are executed
These slides from Sebastian Bergmann, on slideshare, might help you understand that process a bit better: PHP Compiler Internals
Here is also a list of all the parser tokens.
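You can also look at the tokens for your own code from within PHP, using the tokenizer extension (normally enabled by default). This only shows the lexing stage, not the opcodes; tokenNames() is a helper made up for this sketch.

```php
<?php
// List the lexer tokens for a small snippet using the tokenizer extension.
function tokenNames(string $code): array
{
    $names = [];
    foreach (token_get_all($code) as $token) {
        // Multi-character tokens are arrays [id, text, line];
        // single-character tokens such as ";" are plain strings.
        $names[] = is_array($token) ? token_name($token[0]) : $token;
    }
    return $names;
}

print_r(tokenNames('<?php echo 1 + 2;'));
```

The result starts with T_OPEN_TAG and T_ECHO, and the two numbers show up as T_LNUMBER.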
From my understanding, if you use a PHP caching program like APC, eAccelerator, etc. then opcodes will be stored in memory for faster execution upon subsequent requests. My question is, why wouldn't it ALWAYS be better/faster to compile your scripts, assuming you're using a compiler like phc or even HPHP (although I know they have issues with dynamic constructs)? Why bother storing opcodes since they have to be re-read by the Zend Engine, which uses C functions to execute them, when you can just compile and skip that step?
You cannot simply compile to C and have your PHP script execute the same way. HPHP does real compilation, but it doesn't support the full set of PHP features.
Other compilers actually just embed a PHP interpreter in the binary so you aren't really compiling the code anyway.
PHP is not meant to be compiled. Opcode caching is very fast and good enough for 99% of applications out there. If you have Facebook levels of traffic, and you have already optimized your back-end DB, compilation might be the only way to increase performance.
PHP is not a thin layer over the standard C library.
If PHP didn't have eval(), it probably would be possible to do a straight PHP->compiled binary translation with (relative) ease. But since PHP can itself dynamically build/execute scripts on the fly via eval(), it's not possible to do a full-on binary. Any binary would necessarily have to contain the entirety of PHP because the compiler would have no idea what your dynamic code could do. You'd go from a small 1 or 2k script into a massive multi-megabyte binary.
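A quick illustration of the kind of code that gets in the way: here the source being executed does not exist until run time, so whatever runs it needs the parser and compiler on hand. makeMultiplier() is just a name invented for this sketch.

```php
<?php
// Build PHP source at run time and execute it with eval().
// The generated code is not known before the program runs.
function makeMultiplier(int $factor): callable
{
    $source = 'return function ($x) { return $x * ' . $factor . '; };';
    return eval($source);
}

$triple = makeMultiplier(3);
echo $triple(14), PHP_EOL; // 42
```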
This is something I've always wondered: Why is PHP slower than Java or C#, if all 3 of these languages get compiled down to bytecode and then executed from there? I know that normally PHP recompiles each file with each request, but even when you bring APC (a bytecode cache) into the picture, the performance is nowhere near that of Java or C# (although APC greatly improves it).
Edit:
I'm not even talking about these languages on the web level. I am talking about the comparison of them when they're number crunching. Not even including startup time or anything like that.
Also, I am not making some kind of decision based on the replies here. PHP is my language of choice; I was simply curious about its design.
One reason is the lack of a JIT compiler in PHP, as others have mentioned.
Another big reason is PHP's dynamic typing. A dynamically typed language is always going to be slower than a statically typed language, because variable types are checked at run-time instead of compile-time. As a result, statically typed languages like C# and Java are going to be significantly faster at run-time, though they typically have to be compiled ahead of time. A JIT compiler makes this less of an issue for dynamically typed languages, but alas, PHP does not have one built-in. (Edit: PHP 8 will come with a built-in JIT compiler.)
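To see what "types are checked at run-time" means in practice, consider this small sketch (add() is a made-up example function). The same "+" has to do something different depending on what it is handed, and that can only be decided while the code is running.

```php
<?php
// The same function body handles different operand types; the engine
// has to inspect the operands each time the "+" executes.
function add($a, $b)
{
    return $a + $b;
}

var_dump(add(1, 2));        // int
var_dump(add(1.5, 2));      // float
var_dump(add('3', 4));      // numeric string is converted at run time
var_dump(add([1], [2, 3])); // arrays: "+" is array union
```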
I'm guessing you are a little bit into the comparing of apples and oranges here - assuming that you are using all these languages to create web applications there is quite a bit more to it than just the language. (And lots of the time it is the database that is slowing you down ;-)
I would never suggest choosing one of these languages over the other on the basis of a speed argument.
Both Java and C# have JIT compilers, which take the bytecode and compile into true machine code. The act of compiling it can take time, hence C# and Java can suffer from slower startup times, but once the code is JIT compiled, its performance is in the same ballpark as any "truly compiled" language like C++.
The biggest single reason is that Java's HotSpot JVM and C#'s CLR both use Just-In-Time (JIT) compilation. JIT compilation compiles the bytecodes down to native code that runs directly on the processor.
Also, I think Java bytecode and CIL are lower-level than PHP's internal bytecode, which might make a lot of JIT optimizations easier and more effective.
A wild guess might be that Java depends on some kind of "application" server, while PHP doesn't, which means a new environment has to be created each time a PHP page is called.
(This was especially true when PHP was/is used as a CGI, and not as an Apache module or via FastCGI)
Another idea might be that C# and Java compilers can do some heavy optimisations at compile time. On the other side, as PHP scripts are compiled each time a page is called (at least, if you don't "cheat" with an opcode cache), the compilation phase has to be really quick, which means it's not possible to spend much time optimizing.
Still: each version of PHP generally comes with some performance improvements; for instance, you can gain between 15% and 25% of CPU when switching from PHP 5.2 to 5.3.
For instance, take a look at those benchmarks:
Benchmark of PHP Branches 3.0 through 5.3-CVS
Performance PHP 5.2 vs PHP 5.3 - huge gain
Bench PHP 5.2 vs PHP 5.3 -- disclaimer: it's in French, and I'm the one who did it.
One important thing, also, is that PHP is quite easy to scale: just add a couple of web servers, and voila!
The problem you often meet when going from 1 to several servers is with sessions -- store those in DB or memcached (very easy), and problem solved!
As a sidenote: I would not recommend choosing a technology because there is a couple of percent difference in speed on some benchmark. There are far more important factors, like how well your team knows each technology or, even, the algorithms you are going to use!
There is no way an interpreted language can be faster than a compiled language or even a JIT language under trivial conditions.
Unless your test program consists of printing out "Hello World", if you are concerned about speed, stick with C# or Java.
Depends on what you want to do. In some cases, PHP is definitely faster. PHP is (pretty) good at file manipulation and other basic stuff (also XML stuff). Java or C# might be slower in those cases (though I didn't benchmark).
Also, the PHP output (HTML or whatever) needs to be downloaded to the browser, which also consumes time.
Also, the speed of Java / C# depends very much on the machine it runs on (which could be multiple). Java / C# could be slow on your computer, while PHP just runs on one server from which it is available and is always as fast as the server is (except for download times, etc.).
I don't think they are comparable in a general manner. I think you need to take a task which could be accomplished with those three programming languages, and then compare that. That is basically always what you should do when choosing a programming language; find the one that fits the task. Don't shape the task until it fits the programming language.
According to Wikipedia, PHP uses the Zend Engine, which does not have a JIT.