Command Line PHP - Is it Thread Safe? [duplicate] - php

I saw different binaries for PHP, such as non-thread-safe and thread-safe.
What does this mean?
What is the difference between these packages?

Needed background on concurrency approaches:
Different web servers implement different techniques for handling incoming HTTP requests in parallel. A pretty popular technique is using threads -- that is, the web server will create/dedicate a single thread for each incoming request. The Apache HTTP web server supports multiple models for handling requests, one of which (called the worker MPM) uses threads. But it supports another concurrency model called the prefork MPM which uses processes -- that is, the web server will create/dedicate a single process for each request.
There are also other, completely different concurrency models (using asynchronous sockets and I/O), as well as ones that mix two or even three models together. For the purpose of answering this question, we are only concerned with the two models above, taking Apache HTTP Server as an example.
Needed background on how PHP "integrates" with web servers:
PHP itself does not respond to the actual HTTP requests -- this is the job of the web server. So we configure the web server to forward requests to PHP for processing, then receive the result and send it back to the user. There are multiple ways to chain the web server with PHP. For Apache HTTP Server, the most popular is "mod_php". This module is actually PHP itself, but compiled as a module for the web server, and so it gets loaded right inside it.
There are other methods for chaining PHP with Apache and other web servers, but mod_php is the most popular one and will also serve for answering your question.
You may not have needed to understand these details before, because hosting companies and GNU/Linux distros come with everything prepared for us.
Now, onto your question!
Since with mod_php, PHP gets loaded right into Apache, if Apache is going to handle concurrency using its worker MPM (that is, using threads) then PHP must be able to operate within this same multi-threaded environment -- meaning, PHP has to be thread-safe to be able to play ball correctly with Apache!
At this point, you should be thinking "OK, so if I'm using a multi-threaded web server and I'm going to embed PHP right into it, then I must use the thread-safe version of PHP". And this would be correct thinking. However, as it happens, PHP's thread-safety is highly disputed. It's a use-if-you-really-really-know-what-you-are-doing ground.
Final notes
In case you are wondering, my personal advice would be to not use PHP in a multi-threaded environment if you have the choice!
Speaking only of Unix-based environments, I'd say that fortunately, you only have to think about this if you are going to use PHP with the Apache web server. In that case you are advised to go with Apache's prefork MPM (which doesn't use threads, so PHP thread-safety doesn't matter). All GNU/Linux distributions that I know of will make that decision for you when you install Apache + PHP through their package system, without even prompting you for a choice.
If you are going to use other web servers such as nginx or lighttpd, you won't have the option to embed PHP into them anyway. You will be looking at FastCGI or something equivalent, which works in a different model: PHP sits entirely outside the web server, and multiple PHP processes answer the requests. For such cases, thread-safety also doesn't matter.
To see which version your website is using, put a file containing <?php phpinfo(); ?> on your site and look for the Server API entry. This could say something like CGI/FastCGI or Apache 2.0 Handler.
The same goes for the command-line version of PHP: thread safety does not matter there.
Finally, if thread-safety doesn't matter, which version should you use -- the thread-safe or the non-thread-safe? Frankly, I don't have a scientific answer! But I'd guess that the non-thread-safe version is faster and/or less buggy, or otherwise they would have just offered the thread-safe version and not bothered to give us the choice!
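If you would rather check from a script than read the phpinfo() page, the following is a minimal sketch of the same lookup. PHP_SAPI and PHP_ZTS are predefined PHP constants; describeBuild() is only a hypothetical helper name used for this illustration.

```php
<?php
// Minimal sketch: report how the running PHP binary was built and invoked.
// describeBuild() is a hypothetical helper name, not part of PHP.
function describeBuild(): array
{
    return [
        'sapi'        => PHP_SAPI,        // e.g. "cli", "fpm-fcgi", "apache2handler"
        'thread_safe' => (bool) PHP_ZTS,  // true for a thread-safe (ZTS) build
    ];
}

$build = describeBuild();
printf(
    "Server API: %s, thread safety: %s\n",
    $build['sapi'],
    $build['thread_safe'] ? 'enabled' : 'disabled'
);
```

Run from the command line, this reports the CLI binary, which may be a different build from the one your web server uses, so run it through the web server if that is the one you care about.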

For me, I always choose non-thread safe version because I always use nginx, or run PHP from the command line.
The non-thread safe version should be used if you install PHP as a CGI binary, command line interface or other environment where only a single thread is used.
A thread-safe version should be used if you install PHP as an Apache module with the worker MPM (Multi-Processing Module) or in any other environment where multiple PHP threads run concurrently. Simply put, any CGI/FastCGI build of PHP does not require thread safety.

Apache MPM prefork with mod_php is used because it is easy to configure and install. Performance-wise it is fairly inefficient. My preferred way to build the stack is FastCGI with PHP-FPM. That way you can use the much faster worker MPM: PHP as a whole remains non-threaded, but Apache serves requests threaded (like it should).
So basically, from bottom to top
Linux
Apache + MPM Worker + ModFastCGI (NOT FCGI) |(or)| Cherokee |(or)| Nginx
PHP-FPM + APC
ModFCGI does not correctly support PHP-FPM, or any external FastCGI applications. It only supports non-process managed FastCGI scripts. PHP-FPM is the PHP FastCGI process manager.

As per PHP Documentation,
What does thread safety mean when downloading PHP?
Thread Safety means that binary can work in a multithreaded webserver
context, such as Apache 2 on Windows. Thread Safety works by creating
a local storage copy in each thread, so that the data won't collide
with another thread.
So what do I choose? If you choose to run PHP as a CGI binary, then
you won't need thread safety, because the binary is invoked at each
request. For multithreaded webservers, such as IIS5 and IIS6, you
should use the threaded version of PHP.
The following libraries are not thread safe. They are not recommended for use in a multi-threaded environment.
SNMP (Unix)
mSQL (Unix)
IMAP (Win/Unix)
Sybase-CT (Linux, libc5)

The other answers address SAPI implementations, and while this is relevant, the question asks about the difference between the thread-safe and non-thread-safe distributions.
First, PHP is compiled as an embeddable library, such as libphp.so on *NIX and php.dll on Windows. This library can be embedded into any C/C++ application, but obviously it is primarily used in web servers. At its core, PHP starts up in two major phases: the module init phase and then the request init phase. Module init initializes the PHP core and all extensions, whereas request init initializes PHP userspace, both native userspace features and the PHP code itself.
The PHP library is set up so that the module phase only has to be called once, but the request phase has to be reinitialized for each HTTP request. Note that the CLI links to the same library as mod_php etc., and still has to go through these phases internally even though it may not be used in the context of processing HTTP requests. Also, it's important to note that PHP isn't literally designed for processing HTTP requests; more accurately, it is designed for processing CGI events. Again, this applies not just to php-cgi, but to all SAPIs/applications including php-fpm, mod_php, CLI and even the exceedingly rare PHP desktop application.
Web servers (or more typically SAPIs) that link to libphp tend to follow one of four patterns:
1. create a completely new instance of PHP per request (the old CGI pattern, not common and obviously not recommended)
2. create one PHP instance, but go through both initialization phases (together) in separate forked child processes
3. create one PHP instance, do module init once in the parent process pre-fork, and then the individual request phase post-fork per HTTP request
Note that in examples 2 and 3 the child process is typically terminated after each request. In example 2 the child process must be terminated at the end of each request.
The fourth example is related to threaded implementations:
4. call module init once in the main thread, then call request init within other threads.
In the threaded case, request-handling threads tend to utilize a thread pool, with each thread running in a loop, initializing the request phase at the beginning and then destroying it at the end, which is more efficient than spawning a new thread per request.
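To make the "module init once, request init per request" idea more concrete, here is a toy model written in plain PHP. It is only a sketch of the lifecycle described above, not how libphp is implemented: ToyEngine and all of its methods are hypothetical names, and a simple loop stands in for a worker process or pool thread.

```php
<?php
// Toy model of the two start-up phases described above. ToyEngine and its
// methods are hypothetical names for illustration, not PHP internals.
final class ToyEngine
{
    public int $moduleInits = 0;
    public int $requestInits = 0;

    /** @var array<string, string> state built once and shared by all requests */
    private array $moduleState = [];

    /** @var array<string, mixed> per-request userspace state */
    private array $requestState = [];

    public function moduleInit(): void
    {
        $this->moduleInits++;
        $this->moduleState = ['extensions' => 'loaded'];
    }

    public function requestInit(): void
    {
        $this->requestInits++;
        $this->requestState = [];       // fresh userspace for each request
    }

    public function handle(string $uri): string
    {
        $this->requestState['uri'] = $uri;
        return "handled {$uri} with extensions {$this->moduleState['extensions']}";
    }

    public function requestShutdown(): void
    {
        $this->requestState = [];       // nothing leaks into the next request
    }
}

$engine = new ToyEngine();
$engine->moduleInit();                  // once per process (or main thread)

$responses = [];
foreach (['/a', '/b', '/c'] as $uri) {  // stands in for the worker's request loop
    $engine->requestInit();
    $responses[] = $engine->handle($uri);
    $engine->requestShutdown();
}

echo implode("\n", $responses), "\n";
```

The point of the sketch is the asymmetry: the module state is built once and read by every request, which is exactly the part that has to be synchronized when requests run in different threads.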
Regardless of how threaded implementations utilize libphp, if the module phase is initialized in one thread and request phases are called in different threads (which is the case PHP was designed for), it requires a non-trivial amount of synchronization, not just within the PHP core, but also within all native PHP extensions. Note that this is not just a matter of a “request” at this point, but of synchronization that is invoked per PHP opcode whenever that opcode relies on any form of resource within the PHP core (or any PHP extension) that lives in a different thread than PHP userspace.
This places a huge demand on synchronization within thread-safe PHP distributions, which is why PHP tends to follow a "share nothing" rule that helps minimize the impact. But there is no such thing as truly "sharing nothing" in this pattern unless each thread contains a completely separate PHP context, where the module phase and request phase are both done within the same thread per request, which is not suggested or supported. If the context built within the module init phase lives in a different thread from the request init phase, there will most definitely be sharing between threads. So the best attempt is made to minimize the context within the module init phase that must be shared between threads, but this is not easy and in some cases not possible.
This is especially true of more complicated extensions, which have their own requirements for how their own context must be shared between threads. OpenSSL is a major culprit here, and the effect extends outward to any extension that uses it, whether internal, such as PHP stream handlers, or external, such as sockets, curl, database extensions, etc.
If not obvious at this point, thread-safe vs non thread-safe is not just a matter of how PHP works internally as a “request handler” within an SAPI implementation, but a matter of how PHP works internally as an embedded virtual machine for a programming language.
This is all made possible by the TSRM, or Thread Safe Resource Manager, which is well made and handles a very large amount of synchronization with little perceived overhead. But the overhead is definitely there, and it grows not just with how many requests per second the server must handle (the deciding factor in how many threads the SAPI requires), but also with how much PHP code is run per request (or per execution). In other words, large bloated frameworks can make a real difference when it comes specifically to TSRM overhead. This isn't to speak of overall PHP performance and resource requirements within thread-safe PHP, but just of the additional overhead of TSRM itself.
As such, precompiled PHP is distributed in two flavors: one built with TSRM active in libphp (thread-safe), and one where libphp does not use any TSRM features (non-thread-safe) and thus does not have the overhead of TSRM.
Also note that the flag used to compile PHP with TSRM (--enable-maintainer-zts, renamed to --enable-zts in PHP 8) causes phpize to extend this outward into the compilation of extensions and how they initialize their own libraries (libssl, libzip, libcurl, etc.), which will often have their own way of compiling for thread-safe vs non-thread-safe implementations, i.e. their own synchronization mechanisms outside of TSRM and PHP as a whole. While this is not exactly PHP related, in the end it will still have an effect on PHP performance on top of TSRM. As such, PHP extensions (and their dependents, as well as external libraries that PHP or its extensions link to or otherwise depend on) will often have different attributes in thread-safe PHP distributions.

Related

Why is PHP CLI considered as a kind of SAPI?

According to Wikipedia's definition:
In other words, SAPI is actually an application programming interface
(API) provided by the web server to help other developers in extending
the web server capabilities.
Different kinds of SAPIs exist for various web-server extensions. For example, in addition to those listed above, other SAPIs for the PHP language include the Common Gateway Interface (CGI) and command-line interface (CLI).
But I think that when running a PHP file from the terminal (command line), or using the php-cli interactive shell, only the Zend Engine will be involved in interpreting the code.
Is there something that I'm wrong about or don't know?
Can it be because of interacting with client through the web server?
Edit: Further explanations for a better understanding of the subject in addition to the accepted answer:
Advanced PHP Programming By George Schlossnagle
The outermost layer, where PHP interacts with other applications, is the Server Abstraction API (SAPI) layer. The SAPI layer partially handles the startup and shutdown of PHP inside an application, and it provides hooks for handling data such as cookies and POST data in an application-agnostic manner.
Below the SAPI layer lies the PHP engine itself. The core PHP code handles setting up the running environment (populating global variables and setting default .ini options), providing interfaces such as the stream's I/O interface, parsing of data, and most importantly, providing an interface for loading extensions (both statically compiled extensions and dynamically loaded extensions).
At the core of PHP lies the Zend Engine, which we have discussed in depth here. As you've seen, the Zend Engine fully handles the parsing and execution of scripts. The Zend Engine was also designed for extensibility and allows for entirely overriding its basic functionality (compilation, execution, and error handling), overriding selective portions of its behavior (overriding op_handlers in particular ops), and having functions called on registerable hooks (on every function call, on every opcode, and so on). These features allow for easy integration of caches, profilers, debuggers, and semantics-altering extensions.
The SAPI Layer
The SAPI layer is the abstraction layer that allows for easy embedding of PHP into other applications. Some SAPIs include the following:
mod_php5 This is the PHP module for Apache, and it is a SAPI that
embeds PHP into the Apache Web server.
fastcgi This is an implementation of FastCGI that provides a scalable
extension to the CGI standard. FastCGI is a persistent CGI daemon
that can handle multiple requests. FastCGI is the preferred method of
running PHP under IIS and shows performance almost as good as that of
mod_php5.
CLI This is the standalone interpreter for running PHP scripts from
the command line, and it is a thin wrapper around a SAPI layer.
embed This is a general-purpose SAPI that is designed to provide a C
library interface for embedding a PHP interpreter in an arbitrary
application.
The idea is that regardless of the application, PHP needs to communicate with an application in a number of common places, so the SAPI interface provides a hook for each of those places. When an application needs to start up PHP, for instance, it calls the startup hook. Conversely, when PHP wants to output information, it uses the provided ub_write hook, which the SAPI layer author has coded to use the correct output method for the application PHP is running in.
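As a rough illustration of the hook idea in the excerpt above, here is a toy model in plain PHP. The real ub_write hook is a C function pointer inside the SAPI layer; the OutputSapi interface, both classes and engineEcho() below are hypothetical names invented for this sketch.

```php
<?php
// Toy model of a SAPI output hook. All names here are hypothetical; the real
// hook is a C function pointer called ub_write in the SAPI layer.
interface OutputSapi
{
    public function ubWrite(string $data): int;   // returns bytes accepted
}

// A host that writes output to a stream, the way a console host would.
final class StreamSapi implements OutputSapi
{
    /** @var resource */
    private $stream;

    /** @param resource $stream */
    public function __construct($stream)
    {
        $this->stream = $stream;
    }

    public function ubWrite(string $data): int
    {
        return (int) fwrite($this->stream, $data);
    }
}

// A host that collects output into a response body, the way a web server would.
final class HttpResponseSapi implements OutputSapi
{
    public string $body = '';

    public function ubWrite(string $data): int
    {
        $this->body .= $data;
        return strlen($data);
    }
}

// The "engine" never knows where output goes; it only calls the hook.
function engineEcho(OutputSapi $sapi, string $text): int
{
    return $sapi->ubWrite($text);
}

$console = new StreamSapi(fopen('php://stdout', 'w'));
$web = new HttpResponseSapi();

engineEcho($console, "hello\n");   // appears on standard output
engineEcho($web, "hello\n");       // ends up in $web->body instead
```

The same engineEcho() call produces console output in one case and a buffered response body in the other, which is the role the excerpt assigns to the SAPI layer.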
"Why" is always a slippery, somewhat philosophical, question; ultimately, the answer to "why is CLI considered a SAPI" is "because that's how the developers defined it". If they'd called it "CLI mode", would you still have asked "why"?
But you do ask a more concrete question, which can be paraphrased as:
When running a program on the CLI, why do you need a SAPI as well as the Zend Engine?
Or even more succinctly:
What does the CLI SAPI do?
The Zend Engine on its own takes a series of instructions, and executes them. It can manage variables, arithmetic, function definitions and calls, class definitions, and so on. But none of those are very useful if you can't get any output from the program; and most commonly you want to provide some variable input as well.
Some forms of input and output are based only on the operating system you're running on, not the SAPI - reading or writing a file based on an absolute path, or accessing something over a network connection. You could theoretically have a running mode that only gave access to those, but it would feel very limited. (Would that still be a "SAPI"? Definitely a philosophical question.)
Consider something as commonplace as echo: there's no absolute definition of "output" that the Zend Engine can manage directly. In a web server context, you expect echo to add some output to the HTTP response the server will send the client; in a command-line context, you expect it to show the output in the console where you ran the command, or be "piped" to another command if you run something like php foo.php | grep error.
The CLI SAPI doesn't provide all the same facilities that a web server SAPI would, but it fills a similar role in interfacing your program, running in the Zend Engine, to the outside world. Here are a few of the things it needs to do:
Attach output to the parent console or "standard output" stream
Make the "standard input" and "standard error" streams available
Populate $argv and $argc based on the arguments the script was invoked with
Populate $_ENV with the environment variables the process inherited
Define an initial value for "current working directory" for use with relative file paths
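The items in this list can be observed from a script. The following is a minimal sketch, assuming it is run with the php command-line binary; cliEnvironment() is a hypothetical helper name, and getenv() is used rather than $_ENV because $_ENV can be empty depending on the variables_order setting.

```php
<?php
// Minimal sketch of what the CLI SAPI makes available to a script.
// cliEnvironment() is a hypothetical helper name, not part of PHP.
function cliEnvironment(): array
{
    return [
        'sapi' => PHP_SAPI,                  // "cli" when run from the command line
        'argv' => $_SERVER['argv'] ?? [],    // arguments the script was invoked with
        'argc' => $_SERVER['argc'] ?? 0,
        'cwd'  => getcwd(),                  // starting point for relative paths
        'path' => getenv('PATH'),            // one inherited environment variable
        // STDIN/STDOUT/STDERR are constants that the CLI SAPI defines.
        'has_std_streams' => defined('STDIN') && defined('STDOUT') && defined('STDERR'),
    ];
}

$env = cliEnvironment();

if ($env['has_std_streams']) {
    // Diagnostics go to standard error, so they stay out of a pipe such as
    // `php foo.php | grep error`.
    fwrite(STDERR, "running under {$env['sapi']} in {$env['cwd']}\n");
}

echo 'arguments: ', implode(' ', $env['argv']), "\n";
```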

Gwan or nginx for php

I have large photo files and would like to use G-WAN because it is fast. Are there any performance benefits compared with nginx and FastCGI? Is G-WAN faster at time to first byte? Is it faster at connection time? Is computing time faster? Is throughput higher? Furthermore, can you install HHVM on G-WAN? If so, how would you install it? Would it give a performance benefit to PHP, and how much?
The only way you are going to know whether nginx or G-WAN is better for your use case is to actually use them for your site and benchmark them. The speed of software like this depends very much on your configuration, usage patterns, site structure, etc., and is not something where a single blanket answer is appropriate or useful.
HHVM can be used behind any webserver which can serve FastCGI requests. A quick google search indicates that G-WAN may not support FastCGI, but rather has its own custom scripting interface? If so, it may still be possible to use that interface to integrate HHVM, though it is likely to require some work and not be officially supported.
The downside of FastCGI is that it is itself a backend server: instead of having only G-WAN as a server, you are limited by the speed of the backend server when G-WAN sends it requests and waits for its replies:
            Internet                  LAN
[clients] ============ [G-WAN] ----------------- [FastCGI + PHP]
latency1    latency2   latency3    latency4         latency5
In this case, the latency of a FastCGI server and the extra LAN latency are slowing-down G-WAN.
A more efficient way is to have G-WAN load and run the HHVM itself, which has been done with PH7, another thread-safe PHP runtime provided with G-WAN v4+:
            Internet
[clients] ============ [G-WAN + PHP]
latency1    latency2   latency3
It is technically possible to implement *.hhvm G-WAN scripts, as has been done for G-WAN *.ph7, *.java, *.scala and *.cs (C#) scripts. This requires writing a G-WAN C module to load HHVM into the G-WAN memory space (something that may take time depending on the level of support provided by the Facebook HHVM team).
One could also use the G-WAN CGI interface to invoke HHVM as a local process (like G-WAN was forced to do for the thread-unsafe Zend PHP). But the results in terms of performance greatly depend on the initialization and processing times of the HHVM executable (not to mention the extra per-request overhead). This third way is simpler to implement but necessarily slower than a native HHVM G-WAN module.

How does PHP work and what is its architecture?

Guys, I recently decided to go back to PHP and do some more complex stuff than a simple login page. For 3 years I've been programming with Java/JavaEE and have a good understanding of the architecture of Java applications: basically, a virtual machine (a simple OS process) that runs compiled code called bytecode. A simple Java web server is basically a Java application that listens on a given TCP port for HTTP requests and responds accordingly. Of course it is more complicated than that, but this is its basic job.
Now, what about PHP? How does it work? What, in a nutshell, is its architecture?
I googled this question, but 90% of the articles explain how to implement and construct a web application with PHP, which is not what I am looking for.
The biggest difference between a Java web server and PHP is that PHP doesn't have its own built-in web server. (Well, newer versions do, but it's supposed to be for testing only, it's not a production ready web server.) PHP itself is basically one executable which reads in a source code file of PHP code and interprets/executes the commands written in that file. That's it. That's PHP's architecture in a nutshell.
That executable supports a default API which the userland PHP code can call, and it's possible to add extensions to provide more APIs. Those extensions are typically written in C and compiled together with the PHP executable at install time. Some extensions can only be added by recompiling PHP with additional flags, others can be compiled against a PHP install and activated via a configuration file after the fact. PHP offers the PEAR and PECL side projects as an effort to standardise and ease such after-the-fact installs. Userland PHP code will often also include additional third party libraries simply written in PHP code. The advantage of C extensions is their execution speed and low-level system access, the advantage of userland code libraries is their trivial inclusion. If you're administering your own PHP install, it's often simple enough to add new PHP extensions; however on the very popular shared-host model there's often a tension between what the host wants to install and what the developer needs.
In practice a web service written in PHP runs on a third party web server, very often Apache, which handles any incoming requests and invokes the PHP interpreter with the given requested PHP source code file as argument, then delivers any output of that process back to the HTTP client. This also means there's no persistent PHP process running at all times with a persistent state, like Java typically does, but each request is handled by starting up and then tearing down a new PHP instance.
While Java simply saves persistent data in memory, data persistence between requests in PHP is handled via a number of methods like memcache, sessions, databases, files etc.; depending on the specific needs of the situation. PHP does have opcode cache addons, which kind of work like Java byte code, simply so PHP doesn't have to repeat the same parse and compile process every single time it's executing the same file.
Do keep in mind that it's entirely feasible to write a persistent PHP program which keeps running just like Java, it's simply not PHP's default modus operandi. Personally I'm quite a fan of writing workers for specific tasks on Gearman or ZMQ which run persistently, and have some ephemeral scripts running on the web server as "frontend" which delegate work to those workers as needed.
If this sounds like a typical PHP app is much more of a glued-together accumulation of several disparate components, you'd be correct. Java is pretty self-contained, except for external products like RDBMS servers. PHP on the other hand often tends to rely on a bunch of third party products; which can work to its advantage in the sense that you can use best-of-breed products for specific tasks, but also requires more overhead of dealing with different systems.
This is how PHP works:
In general terms, PHP as an engine interprets the content of PHP files (typically *.php, although alternative extensions are used occasionally) into an abstract syntax tree. The PHP engine then processes the translated AST and then returns the result given whatever inputs and processing are required.
[Image omitted in this copy. Source: freecodecamp.org]

What is thread safe or non-thread safe in PHP?

I saw different binaries for PHP, like non-thread or thread safe?
What does this mean?
What is the difference between these packages?
Needed background on concurrency approaches:
Different web servers implement different techniques for handling incoming HTTP requests in parallel. A pretty popular technique is using threads -- that is, the web server will create/dedicate a single thread for each incoming request. The Apache HTTP web server supports multiple models for handling requests, one of which (called the worker MPM) uses threads. But it supports another concurrency model called the prefork MPM which uses processes -- that is, the web server will create/dedicate a single process for each request.
There are also other completely different concurrency models (using Asynchronous sockets and I/O), as well as ones that mix two or even three models together. For the purpose of answering this question, we are only concerned with the two models above, and taking Apache HTTP server as an example.
Needed background on how PHP "integrates" with web servers:
PHP itself does not respond to the actual HTTP requests -- this is the job of the web server. So we configure the web server to forward requests to PHP for processing, then receive the result and send it back to the user. There are multiple ways to chain the web server with PHP. For Apache HTTP Server, the most popular is "mod_php". This module is actually PHP itself, but compiled as a module for the web server, and so it gets loaded right inside it.
There are other methods for chaining PHP with Apache and other web servers, but mod_php is the most popular one and will also serve for answering your question.
You may not have needed to understand these details before, because hosting companies and GNU/Linux distros come with everything prepared for us.
Now, onto your question!
Since with mod_php, PHP gets loaded right into Apache, if Apache is going to handle concurrency using its Worker MPM (that is, using Threads) then PHP must be able to operate within this same multi-threaded environment -- meaning, PHP has to be thread-safe to be able to play ball correctly with Apache!
At this point, you should be thinking "OK, so if I'm using a multi-threaded web server and I'm going to embed PHP right into it, then I must use the thread-safe version of PHP". And this would be correct thinking. However, as it happens, PHP's thread-safety is highly disputed. It's a use-if-you-really-really-know-what-you-are-doing ground.
Final notes
In case you are wondering, my personal advice would be to not use PHP in a multi-threaded environment if you have the choice!
Speaking only of Unix-based environments, I'd say that fortunately, you only have to think of this if you are going to use PHP with Apache web server, in which case you are advised to go with the prefork MPM of Apache (which doesn't use threads, and therefore, PHP thread-safety doesn't matter) and all GNU/Linux distributions that I know of will take that decision for you when you are installing Apache + PHP through their package system, without even prompting you for a choice. If you are going to use other webservers such as nginx or lighttpd, you won't have the option to embed PHP into them anyway. You will be looking at using FastCGI or something equal which works in a different model where PHP is totally outside of the web server with multiple PHP processes used for answering requests through e.g. FastCGI. For such cases, thread-safety also doesn't matter. To see which version your website is using put a file containing <?php phpinfo(); ?> on your site and look for the Server API entry. This could say something like CGI/FastCGI or Apache 2.0 Handler.
If you also look at the command-line version of PHP -- thread safety does not matter.
Finally, if thread-safety doesn't matter so which version should you use -- the thread-safe or the non-thread-safe? Frankly, I don't have a scientific answer! But I'd guess that the non-thread-safe version is faster and/or less buggy, or otherwise they would have just offered the thread-safe version and not bothered to give us the choice!
For me, I always choose non-thread safe version because I always use nginx, or run PHP from the command line.
The non-thread safe version should be used if you install PHP as a CGI binary, command line interface or other environment where only a single thread is used.
A thread-safe version should be used if you install PHP as an Apache module in a worker MPM (multi-processing model) or other environment where multiple PHP threads run concurrently - simply put, any CGI/FastCGI build of PHP does not require thread safety.
Apache MPM prefork with modphp is used because it is easy to configure/install. Performance-wise it is fairly inefficient. My preferred way to do the stack, FastCGI/PHP-FPM. That way you can use the much faster MPM Worker. The whole PHP remains non-threaded, but Apache serves threaded (like it should).
So basically, from bottom to top
Linux
Apache + MPM Worker + ModFastCGI (NOT FCGI) |(or)| Cherokee |(or)| Nginx
PHP-FPM + APC
ModFCGI does not correctly support PHP-FPM, or any external FastCGI applications. It only supports non-process managed FastCGI scripts. PHP-FPM is the PHP FastCGI process manager.
As per PHP Documentation,
What does thread safety mean when downloading PHP?
Thread Safety means that binary can work in a multithreaded webserver
context, such as Apache 2 on Windows. Thread Safety works by creating
a local storage copy in each thread, so that the data won't collide
with another thread.
So what do I choose? If you choose to run PHP as a CGI binary, then
you won't need thread safety, because the binary is invoked at each
request. For multithreaded webservers, such as IIS5 and IIS6, you
should use the threaded version of PHP.
Following Libraries are not thread safe. They are not recommended for use in a multi-threaded environment.
SNMP (Unix)
mSQL (Unix)
IMAP (Win/Unix)
Sybase-CT (Linux, libc5)
The other answers address SAPIs implementations, and while this is relevant the question asks the difference between the thread-safe vs non thread-safe distributions.
First, PHP is compiled as an embeddable library, such as libphp.so on *NIX and php.dll on Windows. This library can be embedded into any C/CPP application, but obviously it is primarily used on web servers. At it's core PHP starts up in in two major phases, the module init phase and then request init phase. Module init initializes the PHP core and all extensions, where request init initializes PHP userspace - both native userspace features as well as PHP code itself.
The PHP library is setup to where the module phase only has to be called on once, but the request phase has to be reinitialized for each HTTP request. Note that CLI links to the same library as mod_php ect, and still has to go through these phases internally even though it may not be used in the context of processing HTTP requests. Also, it's important to note that PHP isn't literally designed for processing HTTP requests - most accurately, it is designed for processing CGI events. Again, this isn't just php-cgi, but all SAPI/applications including php-fpm, mod_php, CLI and even the exceedingly rare PHP desktop application.
Webservers (or more typically SAPIs) that link to libphp tend to follow one of four patterns:
create a completely new instance of PHP per request (old CGI pattern, not common and obviously not recommended)
create one PHP instance, but go through both initialization phases (together) in separate forked child processes
create one PHP instance, do module init once in parent process pre-fork, and then the individual request phase post-fork per HTTP request
Note that in examples 2 and 3 the child process is typically terminated after each request. In example 2 the child process must be terminated at the end of each request.
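A rough C sketch of the third pattern (module init once in the parent, request phase in a forked child) might look like the following. `module_init`, `handle_request` and `serve_prefork` are hypothetical stand-ins, not the libphp API, and the loop serves requests one after another for simplicity, whereas a real server would keep several children running at once.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical module-level state: set once in the parent process and
   inherited by every child through fork(). */
static int module_ready = 0;

static void module_init(void)
{
    module_ready = 1;
}

/* Hypothetical request phase, run inside the child. The return value is
   used as the child's exit status so the parent can check it. */
static int handle_request(int request_id)
{
    if (!module_ready) {
        return 255; /* module state was not inherited */
    }
    return request_id;
}

/* Serve `count` requests with one forked child per request.
   Returns how many children finished with the expected status,
   or -1 if fork() failed. */
int serve_prefork(int count)
{
    int ok = 0;

    assert(count >= 0 && count < 255); /* must fit in an exit status */
    module_init(); /* once, before any fork */

    for (int id = 0; id < count; id++) {
        pid_t pid = fork();
        if (pid < 0) {
            return -1;
        }
        if (pid == 0) {
            /* Child: run the request phase, then terminate. */
            _exit(handle_request(id));
        }
        int status = 0;
        if (waitpid(pid, &status, 0) == pid
            && WIFEXITED(status) && WEXITSTATUS(status) == id) {
            ok++;
        }
    }
    return ok;
}
```

Because each child is a separate process with its own copy of the parent's memory, nothing has to be synchronized between requests, which is why this pattern works with the non thread-safe build.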
The fourth example is related to threaded implementations:
call module init once in main thread, then call request init within other threads.
In the threaded case, request-handling threads tend to use a thread pool, with each thread running in a loop that initializes the request phase at the beginning and then destroys it at the end, which is more efficient than spawning a new thread per request.
Regardless of how threaded implementations utilize libphp, if the module phase is initialized in one thread and request phases are called in different threads (which is the case PHP was designed for), it requires a non-trivial amount of synchronization, not just within the PHP core, but also within all native PHP extensions. Note that this is not just a matter of a “request” at this point, but of synchronization that is invoked per PHP opcode whenever it relies on any form of resource within the PHP core (or any PHP extension) that lives in a different thread than PHP userspace.
This places a huge demand on synchronization within thread-safe PHP distributions, which is why PHP tends to follow a "share nothing" rule that helps minimize the impact. But there is no such thing as truly "sharing nothing" in this pattern, unless each thread contains a completely separate PHP context, where the module phase and request phase are both done within the same thread per request, which is not suggested or supported. If the context built within the module init phase lives in a different thread from the request init phase, there will most definitely be sharing between threads. So the best attempt is made to minimize the context within the module init phase that must be shared between threads, but this is not easy and in some cases not possible.
This is especially true of more complicated extensions that have their own requirements for how their context must be shared between threads. OpenSSL is a major culprit here, and the problem effectively extends outward to any extension that uses it, whether internal, such as PHP stream handlers, or external, such as sockets, curl, database extensions, etc.
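The threaded pattern can be sketched in C roughly as follows, using POSIX threads. Again, `module_init`, `request_init`, `request_shutdown` and `serve_threaded` are hypothetical names for illustration and not the libphp or TSRM API. The point is only to show which state is per thread and which state is shared and therefore needs a lock.

```c
#include <assert.h>
#include <pthread.h>

#define WORKERS 4

/* Shared state: written once by the main thread before workers start. */
static int module_inits = 0;
/* Shared state: updated by every worker, so it is guarded by a mutex. */
static int requests_served = 0;
static pthread_mutex_t served_lock = PTHREAD_MUTEX_INITIALIZER;
/* Per-thread request state: each worker has its own copy. */
static _Thread_local int request_active = 0;

static void module_init(void)      { module_inits++; }
static void request_init(void)     { request_active = 1; }
static void request_shutdown(void) { request_active = 0; }

/* Worker loop: one request phase per iteration, thread is reused. */
static void *worker(void *arg)
{
    int todo = *(const int *)arg;
    for (int i = 0; i < todo; i++) {
        request_init();
        /* Touching anything shared between threads needs a lock. */
        pthread_mutex_lock(&served_lock);
        requests_served++;
        pthread_mutex_unlock(&served_lock);
        request_shutdown();
    }
    return NULL;
}

/* Module init once in the calling thread, then WORKERS pool threads each
   handle `per_worker` requests. Returns the total number of requests
   served, or -1 if a thread could not be started. */
int serve_threaded(int per_worker)
{
    pthread_t tids[WORKERS];
    int started = 0;

    assert(per_worker >= 0);
    module_inits = 0;
    requests_served = 0;
    module_init(); /* once, in the main thread */

    for (int i = 0; i < WORKERS; i++) {
        if (pthread_create(&tids[i], NULL, worker, &per_worker) != 0) {
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
    }
    if (started != WORKERS || module_inits != 1) {
        return -1;
    }
    return requests_served;
}
```

The mutex around `requests_served` is the toy equivalent of the synchronization cost discussed here: every access to state that outlives a single request has to pay for it, while the per-thread `request_active` flag does not.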
If not obvious at this point, thread-safe vs non thread-safe is not just a matter of how PHP works internally as a “request handler” within an SAPI implementation, but a matter of how PHP works internally as an embedded virtual machine for a programming language.
This is all made possible by the TSRM, or Thread Safe Resource Manager, which is well made and handles a very large amount of synchronization with little perceived overhead. The overhead is definitely there, though, and it grows not just with how many requests per second the server must handle (the deciding factor in how many threads the SAPI requires), but also with how much PHP code is executed per request (or per execution). In other words, large bloated frameworks can make a real difference when it comes specifically to TSRM overhead. This isn't to speak of overall PHP performance and resource requirements within thread-safe PHP, but just of the additional overhead of TSRM itself.
As such, precompiled PHP is distributed in two flavors: one built with TSRM active in libphp (thread-safe), and one where libphp does not use any TSRM features (non thread-safe) and thus does not carry the overhead of TSRM.
Also note that the flag used to compile PHP with TSRM (--enable-maintainer-zts, or --enable-zts from PHP 8.0 on) causes phpize to extend this outward into the compilation of extensions and into how they initialize their own libraries (libssl, libzip, libcurl, etc.), which often have their own way of compiling for thread-safe vs non thread-safe use, i.e. their own synchronization mechanisms outside of TSRM and PHP as a whole. While this is not exactly PHP related, in the end it still affects PHP performance outside of TSRM (meaning on top of TSRM). As such, PHP extensions (and their dependents, as well as external libraries that PHP or its extensions link to or otherwise depend on) will often have different attributes in thread-safe PHP distributions.

PHP as a thttpd module vs CGI in terms of memory usage

I am planning to use PHP in an embedded environment. Our current web server is thttpd. I am considering two options: running it as CGI or as a SAPI module. I know CGI has an advantage in terms of security. But if we use PHP as CGI, an instance of PHP has to be loaded into memory for each request.
I have tried compiling it as a SAPI module of thttpd, and I have observed that thttpd's memory usage, specifically RSS, does not grow as the number of requests increases.
Can anybody explain how thttpd loads PHP? Is it loaded just once, staying resident in memory as long as thttpd is running? If so, we may consider this as an alternative to CGI.
Does it perform multi-threading, i.e. if there are multiple HTTP requests at the same time? Or does it process requests one at a time?
Is there good documentation discussing the behavior of PHP as a module of thttpd?
I have no experience with thttpd, but here are some pointers:
the PHP engine is thread safe, but some extensions aren't, so usually people shy away from using it in a multi-threaded environment and rather go with the one-process - one-request method
yes, usually webserver modules (like the Apache mod_* stuff) work by staying resident, but the big speedbump for PHP is that it needs to parse the source file (or even multiple source files if you use include / require) for each request. You can cut down on this by using something like APC, which caches the parsed version of the files
there is also a protocol called FastCGI which you might want to look at - it is basically a crossover between the module and CGI solutions - it spins up a couple of processes, each process hosts a single instance of the CGI program (PHP in this case), and it uses them to process requests. Instances are recycled (i.e. they can process multiple requests, one after the other).
