TL;DR: I'm not sure this topic belongs on StackOverflow, but basically it's a topic for debate: building PHP apps the way we would with NodeJS, for example (stateless request flow, asynchronous calls, etc.).
The situation
We know NodeJS can be used as both a web-server and web-app.
But for PHP, the internal web-server is not recommended for production (so says the documentation).
But, as Symfony full-stack is based on the Kernel which handles Request objects, it means we should be able to send lots of requests to the same kernel, only if we could "bootstrap" the php web-server (not the app) by creating a kernel before listening to HTTP requests. And our router would only create a Request object and make the kernel handle it.
But for this, a Symfony app has to be stateless, for example we need Doctrine to effectively clear its unit of work after a request, or maybe we would need to sort of isolate some components based on a request (By identifying a request with its unique PHP class reference id? Or by using other php processes?), and obviously, we would need more asynchronous things in PHP or in the way we use the internal web-server.
The main questions I sometimes ask myself, and now ask the community
To clarify this, I have some questions about PHP:
Why exactly is the internal PHP webserver not recommended for production?
I mean, if we can configure how the server is run and its "router" file, we should be able to use it like any PHP server, yes or no?
How does it behave internally? Is memory shared between two requests?
By using the router, it seems obvious to me that variables are not shared, else we could make nodejs-like apps, but it seems PHP is not capable of doing something like this.
Is it really possible to make a full-stateless application with Symfony?
e.g. I send two different requests to the same kernel object, in this case, is there any possibility that the two requests create a conflict in Symfony core components?
Actually, the "Create a kernel -> start server -> on request, make the kernel handle it" behavior would be awesome, because it would be quite similar to NodeJS. The PHP paradigm is not compatible with this, though, because we would need each request to be handled asynchronously. But if a kernel and its container are stateless, then there should be a way to do something like that, shouldn't there?
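To make the statelessness concern concrete, here is a minimal sketch of one long-lived kernel handling several requests. `TinyKernel` and `RequestLog` are hypothetical stand-ins, not Symfony classes; the point is only that any service keeping per-request data has to be reset between requests, otherwise the second request sees leftovers from the first.

```php
<?php
// Hypothetical stand-ins: none of these classes are Symfony's real ones.
final class RequestLog
{
    /** @var string[] data that accumulates while handling ONE request */
    private array $entries = [];

    public function record(string $entry): void { $this->entries[] = $entry; }
    public function entries(): array { return $this->entries; }

    // Must run between requests, or entries leak into the next one.
    public function reset(): void { $this->entries = []; }
}

final class TinyKernel
{
    private RequestLog $log;

    public function __construct()
    {
        $this->log = new RequestLog(); // "booted" once, reused for every request
    }

    public function handle(string $path): string
    {
        $this->log->record($path);
        return $path . ' saw ' . implode(',', $this->log->entries());
    }

    public function terminate(): void
    {
        $this->log->reset();
    }
}

// One kernel, two requests, state cleared after each response.
$kernel = new TinyKernel();
$responses = [];
foreach (['/a', '/b'] as $path) {
    $responses[] = $kernel->handle($path);
    $kernel->terminate();
}

// Same kernel class, but nobody resets the service: the second request is polluted.
$leaky = new TinyKernel();
$leaky->handle('/a');
$leakedResponse = $leaky->handle('/b');
```

With the reset in place each response only mentions its own path; without it, the response for `/b` also contains `/a`. That is the kind of conflict a shared kernel would have to rule out in every service.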
Some thoughts
I've heard about React PHP, Ratchet PHP for Websocket integration, Icicle, PHP-PM but never experienced them, it seems a bit too complex to me for now (I may lack some concepts about asynchronicity in apps, that's why my brain won't understand until I have some more answers :D ).
Is there any way that these libraries could be used as "wrappers" for our kernel request handling?
I mean, let's create this reactphp/icicle/whatever environment setup, create our kernel like we would do in any Symfony app, and run the app as a web-server. When a request comes in, we send it asynchronously to our kernel, and as long as the kernel has not sent the response, the client waits for it, even if the response is also sent asynchronously (from nested callbacks, etc., like in NodeJS).
This would make any existing Symfony app compatible with this paradigm, as long as the app is stateless, obviously. (if the app config changes based on a request, there's a paradigm issue in the app itself...)
Is it even a possible reality with PHP libraries rather than using PHP internal web-server in another way?
Why ask these questions?
Actually, it would be kind of a revolution if PHP could implement real asynchronous stuff internally, like JavaScript has, but this would also have a big impact on performance in PHP, because of persistent data in our web-server and less bootstrapping (require autoloader, instantiate kernel, get heavy things from cached files, resolve routing, etc.).
In my thoughts, only the $kernel->handleRaw($request); would consume CPU, the whole rest (container, parameters, services, etc.) would be already in the memory, or, for the case of services, "awaiting to be instantiated". Then, performance boost, I think.
And it may troll a bit the people who still think PHP is a very bad and slow language to use :D
For readers and responders ;)
If a core PHP contributor reads this: is there any way PHP could be more asynchronous internally, even through a specific new internal API based on functions or classes?
I'm not a pro of all of these concepts, and I hope really good experts are going to read this and answer me!
It could be a great advance in the PHP world if all of this was possible in any way.
Why exactly is the internal PHP webserver not recommended for
production? I mean, if we can configure how the server is run and its
"router" file, we should be able to use it like any PHP server, yes or
no?
Because it's not written to behave well under load, and there are no configuration options that let you handle HTTP request processing before it reaches PHP.
Basically, it lacks features if you compare it to nginx. It would be equal to comparing a skateboard to a Lamborghini.
It can get you from A to B but.. you get the gist.
How does it behave internally? Is memory shared between two requests?
By using the router, it seems obvious to me that variables are not
shared, else we could make nodejs-like apps, but it seems PHP is not
capable of doing something like this.
The documentation states it's single-threaded, so it appears that it would behave the same as if you wrote while(true) { // all your processing here }.
It's a playtoy designed to quickly check a few things if you can't be bothered to set up a proper web server before trying out your code.
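A rough way to picture what that while(true) loop means for concurrent clients is the simulation below. It is only a sketch: `serveSequentially` is a made-up helper and the processing times are invented numbers, not measurements of the built-in server.

```php
<?php
/**
 * Simulates a single-threaded server: requests are handled strictly one
 * after another. $requests maps a request name to its (simulated)
 * processing time in milliseconds; the result maps each name to the
 * moment its response is finished.
 */
function serveSequentially(array $requests): array
{
    $clock = 0;
    $finishedAt = [];
    foreach ($requests as $name => $costMs) {
        $clock += $costMs;            // nothing else runs while this request is processed
        $finishedAt[$name] = $clock;
    }
    return $finishedAt;
}

// A slow request arrives just before a tiny one.
$finishedAt = serveSequentially(['slow-report' => 2000, 'tiny-ping' => 5]);
```

The tiny request needs 5 ms of work but is only answered after 2005 ms, because it has to wait behind the slow one. That queueing is what makes a single-threaded, blocking server a poor fit for production load.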
Is it really possible to make a full-stateless application with
Symfony? e.g. I send two different requests to the same kernel object,
in this case, is there any possibility that the two requests create a
conflict in Symfony core components?
Why would it go to the same kernel object? Why not design your app in such a way that it's not relevant which object or even processing server gets the request? Why not design for redundancy and high availability from the get go? HTTP = stateless by default. Your task = make it irrelevant what processes the request. It's not difficult to do so, if you avoid coupling with the actual processing server (example: don't store sessions to local filesystem etc.)
Actually, the idea of "Create a kernel -> start server -> on request,
make the kernel handle it" behavior would be awesome, because it would
be something quite similar to NodeJS, but actually, the PHP paradigm
is not compatible with this because we would need each request to be
handled asynchronously. But if a kernel and its container are
stateless, then, there should be a way to do something like that,
shouldn't it?
Actually, nginx + php-fpm behave almost identically to node.js.
nginx uses a reactor to handle all connections on the same thread. Node.js does the exact same thing. What you do is create a closure / callback that is fed into Node's libraries and I/O is handled in a threaded environment. Multithreading is abstracted from you (related to I/O, not CPU). That's why you can experience that Node.js blocks when it's asked to do a CPU intensive task.
nginx implements the exact same concept, except this callback isn't a closure written in JavaScript. It's a callback that expects an answer from php-fpm within <timeout> seconds. nginx takes care of async for you. Your task is to write what you want in PHP. Now, if you're reading a huge file, then async code in your PHP would make sense, except it's not really needed.
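For readers who have not seen a reactor before, here is a minimal sketch of the idea in plain PHP, using only stream_select(). `TinyReactor` is a made-up class for illustration and has nothing to do with how nginx or Node.js are actually implemented; it assumes a Unix-like system because it uses stream_socket_pair() with STREAM_PF_UNIX to fake two connections.

```php
<?php
final class TinyReactor
{
    /** @var resource[] streams we are waiting on, keyed by resource id */
    private array $streams = [];
    /** @var callable[] callback to run when the matching stream is readable */
    private array $callbacks = [];

    public function onReadable($stream, callable $callback): void
    {
        $id = (int) $stream;
        $this->streams[$id] = $stream;
        $this->callbacks[$id] = $callback;
    }

    public function remove($stream): void
    {
        unset($this->streams[(int) $stream], $this->callbacks[(int) $stream]);
    }

    public function run(): void
    {
        while ($this->streams !== []) {
            $read = $this->streams;
            $write = null;
            $except = null;
            // One thread waits on ALL registered streams at once.
            if (stream_select($read, $write, $except, 1) < 1) {
                break; // timeout or error: stop the sketch instead of spinning
            }
            foreach ($read as $stream) {
                ($this->callbacks[(int) $stream])($stream, $this);
            }
        }
    }
}

// Two fake "connections", each a connected socket pair.
[$clientA, $serverA] = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
[$clientB, $serverB] = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);

$received = [];
$reactor = new TinyReactor();
foreach ([$serverA, $serverB] as $server) {
    $reactor->onReadable($server, function ($stream, TinyReactor $reactor) use (&$received) {
        $received[] = fread($stream, 1024);
        $reactor->remove($stream); // done with this connection
    });
}

fwrite($clientB, 'second connection');
fwrite($clientA, 'first connection');
$reactor->run();
```

Both connections are served by the same thread, and the callbacks only run when data is actually there. Replace the callback with "wait for php-fpm to answer" and you have, conceptually, the arrangement described above.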
With nginx and sending off requests for processing to a fastcgi worker, scaling becomes trivial. For example, let's assume that 1 PHP machine isn't enough to deal with the amount of requests you're dealing with. No problem, add more machines to nginx's pool.
This is taken from nginx docs:
upstream backend {
    server backend1.example.com weight=5;
    server backend2.example.com:8080;
    server unix:/tmp/backend3;

    server backup1.example.com:8080 backup;
    server backup2.example.com:8080 backup;
}

server {
    location / {
        proxy_pass http://backend;
    }
}
You define a pool of servers and then assign various weights / proxying options related to balancing how requests are handled.
However, the important part is that you can add more servers to cope with availability requirements.
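To give a feel for what a weight means, here is a deliberately naive weighted round-robin in PHP. This is not nginx's algorithm (nginx spreads the picks more evenly); `weightedSchedule` and `pickServer` are hypothetical helpers, and the second server's weight of 1 mirrors nginx's default when no weight is given.

```php
<?php
/**
 * Builds one round of a naive weighted round-robin schedule:
 * a server with weight 5 appears five times, a server with weight 1 once.
 */
function weightedSchedule(array $weights): array
{
    $schedule = [];
    foreach ($weights as $server => $weight) {
        for ($i = 0; $i < $weight; $i++) {
            $schedule[] = $server;
        }
    }
    return $schedule;
}

/** Picks the server for the n-th request (0-based) by cycling through the schedule. */
function pickServer(array $schedule, int $requestNumber): string
{
    return $schedule[$requestNumber % count($schedule)];
}

$schedule = weightedSchedule([
    'backend1.example.com'      => 5,
    'backend2.example.com:8080' => 1, // no explicit weight in the config means 1
]);

$sixth   = pickServer($schedule, 5); // last slot of the first round
$seventh = pickServer($schedule, 6); // the schedule wraps around
```

Out of every six requests, five go to backend1 and one to backend2. Adding a machine to the pool is just one more entry in the map.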
This is the reason why the nginx + php-fpm stack is appealing. Since nginx acts as a proxy, it can proxy requests to node.js as well, letting you handle web socket related operations in node.js (which, in turn, can perform an HTTP request to a PHP endpoint, allowing you to contain your entire app logic in PHP).
I know this answer might not be what you're after, but what I wanted to highlight is that the way node.js works is (conceptually) identical to what nginx does when it comes to handling incoming requests. You could make PHP work as node does, but there's no need for that.
Your questions can be summed up as this:
"Could PHP be more like Node?"
to which the answer is of course "Yes." But that leads us to another question:
"Should PHP be more like Node?"
and now the answer is not that obvious.
Of course in theory PHP could be made more like Node - even to a point to make it exactly the same. Just take the next version of Node and call it PHP 6.0 or something.
I would argue that it would be harmful to both Node and PHP. There is a diversity in the runtime environments for a reason. One of the variations is the concurrency model used in a given environment. Making one like the other would mean less choice for the programmer. And less choice is less freedom of expression.
PHP and Node were created in different times and for different reasons.
PHP was developed in 1995 and the name stood for Personal Home Page. The use case was to add some server-side dynamic features to HTML. We already had SSI and CGI at that point but people wanted to be able to inject right into the HTML - synchronously, as it wouldn't make much sense otherwise - results of database queries and other computations. It isn't a surprise how good it is at this job even today.
Node, on the other hand, was developed in 2009 - almost 15 years later - to create high performance network servers. So it shouldn't surprise us that writing such servers in Node is easy and that they have great performance characteristics. This is why Node was created in the first place. One of the choices it had to make was a 100% non-blocking environment of single-threaded, asynchronous event loops.
Now, single-threading concurrency is conceptually more difficult than multi-threading. But if you want performance for I/O-heavy operations then currently you have no other options. You will not be able to create 10,000 threads but you can easily handle 10,000 connections with Node in a single thread. There is a reason why nginx is single-threaded and why Redis is single threaded. And one common characteristic of nginx and Redis is amazing performance - but both of those were hard to write.
Now, as far as Node and PHP go, those technologies are so far from each other that it's hard to even comprehend what their fusion would look like. It reminds me of the old April Fool's joke about unifying Perl and Python that so many people believed in.
PHP has its strengths and Node has its strengths. And just like it would be hard to imagine Node with blocking I/O, it would be equally hard to imagine PHP with non-blocking I/O.
To summarize: it could be possible to make PHP like Node, but I wouldn't expect it to happen any time soon - if ever.
I was thinking of implementing real time chat using a PHP backend, but I ran across this comment on a site discussing comet:
My understanding is that PHP is a
terrible language for Comet, because
Comet requires you to keep a
persistent connection open to each
browser client. Using mod_php this
means tying up an Apache child
full-time for each client which
doesn’t scale at all. The people I
know doing Comet stuff are mostly
using Twisted Python which is designed
to handle hundreds or thousands of
simultaneous connections.
Is this true? Or is it something that can be configured around?
Agreeing/expanding what has already been said, I don't think FastCGI will solve the problem.
Apache
Each request into Apache will use one worker thread until the request completes, which may be a long time for COMET requests.
This article on Ajaxian mentions using COMET on Apache, and that it is difficult. The problem isn't specific to PHP, and applies to any back-end CGI module you may want to use on Apache.
The suggested solution was to use the 'event' MPM module which changes the way requests are dispatched to worker threads.
This MPM tries to fix
the 'keep alive problem' in HTTP.
After a client completes the first
request, the client can keep the
connection open, and send further
requests using the same socket. This
can save significant overhead in
creating TCP connections. However,
Apache traditionally keeps an entire
child process/thread waiting for data
from the client, which brings its own
disadvantages. To solve this problem,
this MPM uses a dedicated thread to
handle both the Listening sockets, and
all sockets that are in a Keep Alive
state.
Unfortunately, that doesn't work either, because it will only 'snooze' after a request is complete, waiting for a new request from the client.
PHP
Now, considering the other side of the problem, even if you resolve the issue with holding up one thread per comet request, you will still need one PHP thread per request - this is why FastCGI won't help.
You need something like Continuations which allow the comet requests to be resumed when the event they are triggered by is observed. AFAIK, this isn't something that's possible in PHP. I've only seen it in Java - see the Apache Tomcat server.
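For what it's worth, PHP versions released after this answer gained generators (PHP 5.5, with return values readable via getReturn() since PHP 7.0), which can suspend and resume a function in a continuation-like way. The sketch below only shows that mechanism inside one process; `longPollRequest` is a made-up function, and you would still need an event-loop based server around it rather than Apache with one worker per request.

```php
<?php
/**
 * A "parked" long-poll request written as a generator.
 * `yield` suspends it without blocking; send() resumes it with the event.
 */
function longPollRequest(string $client): Generator
{
    $event = yield "waiting:$client"; // suspended here until an event is sent in
    return "to $client: $event";
}

// Two clients connect; each request runs up to its yield and is parked.
$parked = [];
foreach (['alice', 'bob'] as $client) {
    $request = longPollRequest($client);
    $request->current();               // advance to the first yield
    $parked[$client] = $request;
}

// Later, a chat message arrives: resume every parked request with it.
$responses = [];
foreach ($parked as $client => $request) {
    $request->send('hello');
    $responses[] = $request->getReturn();
}
```

While parked, a request costs one generator object rather than one blocked thread, which is the property the continuation-based servers rely on.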
Edit:
There's an article here about using a load balancer (HAProxy) to allow you to run both an apache server and a comet-enabled server (e.g. jetty, tomcat for Java) on port 80 of the same server.
You could use Nginx and JavaScript to implement a Comet based chat system that is very scalable with little memory or CPU utilization.
I have a very simple example here that can get you started. It covers compiling Nginx with the NHPM module and includes code for simple publisher/subscriber roles in jQuery, PHP, and Bash.
http://blog.jamieisaacs.com/2010/08/27/comet-with-nginx-and-jquery/
PHP
I found this funny little screencast explaining simple comet. As a side note, I really think this approach is going to kill your server under any real load. If you only have a couple of users, I would say just go for this solution. It is really simple to implement (the screencast only takes 5 minutes of your time :)). But as I said, I don't think it is good for a lot of concurrent users (guess you should benchmark it ;)) because:
It uses file I/O, which is much slower than just getting data from memory, for example the function filemtime().
Second, but not least, PHP does not have a decent thread model. PHP was not designed for this anyway because of its share-nothing model. As the slides say, "Shared data is pushed down to the data-store layer", for example MySQL.
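The file-based approach boils down to something like the sketch below. I have not copied it from the screencast; `waitForChange` and the temp file are my own illustrative names. Note that every waiting client runs this loop in its own PHP process, which is exactly why it falls over with many users.

```php
<?php
/**
 * Polls a file's modification time until it changes or the timeout expires.
 * Returns the new mtime, or null if nothing changed in time.
 */
function waitForChange(string $file, int $lastSeen, float $timeoutSeconds, int $pollMicroseconds = 10000): ?int
{
    $deadline = microtime(true) + $timeoutSeconds;
    do {
        clearstatcache(true, $file);   // filemtime() results are cached within a request
        $mtime = filemtime($file);
        if ($mtime !== false && $mtime > $lastSeen) {
            return $mtime;             // new data: answer the long-poll request now
        }
        usleep($pollMicroseconds);     // the worker is tied up while it sleeps
    } while (microtime(true) < $deadline);

    return null;                       // timeout: the client simply reconnects
}

// Demo with a temp file standing in for the chat log.
$file = tempnam(sys_get_temp_dir(), 'chat');
touch($file, 1000);
$unchanged = waitForChange($file, 1000, 0.05); // nobody wrote anything
touch($file, 2000);                            // a "new message" arrives
$changed = waitForChange($file, 1000, 0.05);
unlink($file);
```

Each poll is a stat call on disk, and each waiting client holds a whole PHP worker for the duration of the wait.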
Alternatives
I really think you should try the alternatives if you want to do any comet/long polling. You could use many languages like for example:
Java/JVM: Jetty continuations.
Python: Dustin's slosh.
Erlang: Popular language for comet/etc.
Lua, Ruby, C, Perl just to name a few.
A simple Google search will show you a lot of alternatives, including ones for PHP (which I still think will kill your server under any big load).
mod_php is not the only way to use PHP. You can use fastcgi. PHP must be compiled with --enable-fastcgi.
PHP as FastCGI: http://www.fastcgi.com/drupal/node/5?q=node/10
You may also try https://github.com/reactphp/react
React is a low-level library for event-driven programming in PHP. At its core is an event loop, on top of which it provides low-level utilities, such as: Streams abstraction, async dns resolver, network client/server, http client/server, interaction with processes. Third-party libraries can use these components to create async network clients/servers and more.
The event loop is based on the reactor pattern (hence the name) and strongly inspired by libraries such as EventMachine (Ruby), Twisted (Python) and Node.js (V8).
The introductory example shows a simple HTTP server listening on port 1337:
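<?php
// Assumes ReactPHP was installed with Composer. This is the early (0.x) ReactPHP
// API the answer was written against; later releases changed these classes.
require 'vendor/autoload.php';

$i = 0; // lives as long as the server process, so it is shared by all requests

$app = function ($request, $response) use (&$i) {
    $i++;
    $text = "This is request number $i.\n";
    $headers = array('Content-Type' => 'text/plain');

    $response->writeHead(200, $headers);
    $response->end($text);
};

$loop = React\EventLoop\Factory::create();
$socket = new React\Socket\Server($loop);
$http = new React\Http\Server($socket);

$http->on('request', $app);

$socket->listen(1337);
$loop->run(); // blocks here, dispatching incoming requests to $app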
I'm having a similar issue. One option I'm finding interesting is to use an existing Comet server, like cometd-java or cometd-python, as the core message hub. Your PHP code is then just a client to the Comet server -- it can post or read messages from channels, just like other clients.
There's an interesting code snippet linked here: http://morglog.org/?p=22=1 that implements part of this method (although there are bits of debug code spread around, too).
I'm currently implementing a scalable PHP Comet server using socket functions. It is called 'phet' ( [ph]p com[et] )
Project page: http://github.com/Tim-Smart/phet
Feel free to join in on development. I have currently managed to get most of the server logic done, and just need to finish off the client side stuff.
EDIT: Recently added 'Multi-threading' capabilities using the pcntl_fork method :)
You'll have a hard time implementing comet in PHP, just because of its inherent single-threaded-ness.
Check out Websync On-Demand - the service lets you integrate PHP via server-side publishing, offloading the heavy concurrent connection stuff, and will let you create a real-time chat app in no time.
A new module just came out for the nginx web server that'll allow Comet with any language, including PHP.
http://www.igvita.com/2009/10/21/nginx-comet-low-latency-server-push/
You will have to create your own server in PHP. Using Apache/mod_php or even fastcgi will not scale at all. A few years old, but can get you started:
PHP-Comet-Server:
http://sourceforge.net/projects/comet/
I think the issue is more that having a lot of Apache threads running all the time is a problem. That will exist with any language that works via Apache in the same way PHP (usually) does.
This is a bit complicated, so please don't jump to conclusions, feel free to ask about anything that is not clear enough.
Basically, I have a websocket server written in PHP. Please note that websocket messages are asynchronous, that is, a response to a request might take a lot of time, all the while the client keeps on working (if applicable).
Clients are supposed to ask the server for access to files on other servers. This can be an FTP service, or Dropbox, for the matter.
Here, please take note of two issues: connections should be shared and reused and the server actually 'freezes' while it does its work, hence any requests are processed after the server has 'unfrozen'.
Therefore, I thought, why not offload file access (which is what freezes the server) to PHP threads?
The problem here is twofold;
how do I make a connection resource in the main thread (the server) available to the sub threads (not possible with the above threading model)?
what would happen if two threads end up needing the same resource? It's perfectly fine if one is locked until the other one finishes, but we still need to figure out issue #1.
Perhaps my train of thought is all screwed up, if you can find a better solution, I'm eager to hear it out. I've also had the idea of having a PHP thread hosting a connection resource, but it's pretty memory intensive.
PHP supports no threads. The purpose of PHP is to respond to web requests quickly. That's what the architecture was built for. Different libraries try to do something like threads but they usually cause more issues than they solve.
In general there are two ways to achieve what you want:
Off-load the long processes to an external process. A common approach is using a system like Gearman: http://php.net/gearman
Use asynchronous operations. Some stream operations and such provide an "async" flag or "non-blocking" mode. http://php.net/stream-set-blocking
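As a small illustration of the second option, here is what non-blocking mode does to a read. It is only a sketch: the socket pair stands in for your FTP or Dropbox connection and assumes a Unix-like system (STREAM_PF_UNIX).

```php
<?php
// A connected pair of sockets stands in for a slow remote connection.
[$reader, $writer] = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);

stream_set_blocking($reader, false);   // reads now return immediately

// Nothing has been written yet: in blocking mode this call would freeze
// the server, in non-blocking mode it comes straight back without data.
$beforeData = fread($reader, 1024);

fwrite($writer, 'file chunk');         // the remote side finally answers

// stream_select() tells us when the stream is worth reading again.
$read = [$reader];
$write = null;
$except = null;
stream_select($read, $write, $except, 1);
$afterData = fread($reader, 1024);

fclose($reader);
fclose($writer);
```

In a websocket server you would put the remote connections into the same stream_select() call as the client sockets, so the server keeps answering clients while a transfer is pending.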
I have a PHP page that gets its content by making an HTTP request to another site on the same server, using file_get_contents. Both sites run in Apache 2 which calls PHP using suPHP (which is FastCGI, right?)
How significant is the overhead of this call? Does Apache do a lot of processing before sending a request to PHP?
An alternative way to make the call would be for the first site to exec('php /the/other/script.php some parameters'). Would this be faster, or is the overhead of spawning a process bigger than that of going through Apache?
Apache's overhead is going to depend on what's configured for that site host, for example HTTPS, .htaccess checks, rewriting, etc. Those things can stack up. I don't think it would be much strain overhead-wise, comparatively, but you are also going to have the time it takes to generate the response, which, depending on the nature of the external pages you're calling, could be significant in some situations.
With that said, I don't necessarily see a problem with making the calls through Apache. But I do think that, as you suggest, exposing the PHP directly would be better. Reading up on SOA in general might help you gain some insight into how best to implement this.
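If you want numbers for your own setup rather than guesses, a tiny timing helper is enough to compare the two approaches. This is a sketch: `averageMs` is a made-up helper, it needs PHP 7.3+ for hrtime(), and the usleep() call is only a placeholder for the call you actually want to measure.

```php
<?php
/**
 * Returns the average wall-clock time of $call in milliseconds over $runs runs.
 */
function averageMs(callable $call, int $runs = 5): float
{
    $start = hrtime(true);             // monotonic clock, in nanoseconds
    for ($i = 0; $i < $runs; $i++) {
        $call();
    }
    return (hrtime(true) - $start) / $runs / 1e6;
}

// Placeholder workload: sleeps for at least one millisecond.
$baseline = averageMs(function () {
    usleep(1000);
});

// To compare the two options from the question, swap in closures such as
// (URL and script path are yours to fill in):
//   function () { file_get_contents('http://localhost/other-site/page.php'); }
//   function () { exec('php /the/other/script.php some parameters'); }
```

Run both variants a few dozen times on the real server; the Apache configuration matters enough that measurements from another machine will not tell you much.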
Unfortunately, if you install PHP as CGI you will lose a lot of performance, because a new process has to be created for it each time.
So the best method is to install PHP as an Apache module.