Should Nginx Be Combined With a Language Supporting an Asynchronous Programming Model? - php

I have found a lot of articles on the Internet comparing Nginx and Apache. However, all these comparisons are based on stress tests against web servers running PHP code. I suppose this is mainly because Apache is generally deployed with PHP in the LAMP architecture.
In my understanding, Nginx was created to solve the C10K problem with an event-based architecture. That is, Nginx is supposed to serve M concurrent requests with N threads/processes, where N is much less than M. This is a big difference from Apache, which needs M threads/processes to serve M concurrent requests.
For PHP code, the programming model is not asynchronous: each web request occupies one thread/process while PHP handles it. So I don't see the point of comparing Nginx and Apache using PHP code.
The event-based architecture of Nginx should outperform Apache especially when requests involve I/O operations, for example requests that need to merge results from multiple other web services. With Apache+PHP, each request might spend seconds just waiting for I/O operations to complete, which ties up a lot of threads/processes. With Nginx, this is not a problem if asynchronous programming is used.
Would it make more sense to deploy Nginx with a language that supports an asynchronous programming model?
I'm not sure which programming language could unlock the most potential from Nginx, but it is definitely not PHP.

First and foremost, nginx does not support any application execution directly.
It can serve static files, proxy requests to any other web server, and do some other small things.
Historically, nginx aimed to handle many network connections, true, but the rationale was this: until apache responds to the request of someone on a slow connection, the worker handling that request can do nothing else. Apache has a limited number of workers, so when there are lots of slow clients, every new client has to wait until a worker finishes a transfer and resumes accepting new requests.
So the classic setup is nginx accepting external requests and proxying them to the local apache; apache handles the requests and gives the responses back to nginx to transfer to the clients. Thus apache is freed from dealing with slow clients.
Regarding the question, and nginx in the picture: it's not that hard to utilize system event frameworks these days. That's epoll on Linux, kqueue on FreeBSD, and others. At the application level there are lots of choices, Twisted for Python for example. So all you have to do is write your application with one of these frameworks, which 1) usually puts you in the async world and 2) gives you a way to build an HTTP service ready to be a backend for nginx. That's probably what you are aiming at. So c10k doesn't seem to be a problem for nginx, nor for applications built around these frameworks.
An example at hand is FriendFeed's Tornado server: written in Python, it uses epoll or kqueue depending on the system, and handles up to 8k connections easily, as I recall. There were some benchmarks and further thought on scaling it beyond that.
Something must be brewing in the Ruby world around the async trend too, so they can come up with something similar, if they haven't already. Ruby's Passenger and Mongrel, whatever they are in essence (I'm blanking on this), do work with nginx, and this required writing modules for nginx. So the community takes nginx into account and does the extra work when it needs to.
PHP, by the way, stays relevant for push once WebSockets are massively deployed. Oh well.

The point is that potential doesn't matter. PHP is something of a standard for web development and is what people usually care about when choosing a server, so the fact that Nginx or Apache is optimised to run some obscure programming language y times faster than the other is irrelevant unless that language is PHP.

Related

Deploying Laravel site via Nginx Vs. PHP Artisan Serve

Locally, I only ran php artisan serve and it works fine.
In my production VM, I am not sure if I should just do the same (php artisan serve &) so that I don't have to install Nginx, configure the document root, and so on.
Are there any disadvantages to doing that?
nginx
designed to solve c10k problem
performs extremely well, even under huge load
is a reverse proxy
uses a state-of-the-art HTTP parser to check whether a request is even valid
uses an extremely powerful yet simple config syntax
comes with a plethora of modules to deal with HTTP traffic (auth module, mirror module)
can terminate ssl/tls
can load balance between multiple php serving endpoints (or any other endpoints that speak http)
can be reloaded to apply new config, without losing current connections
php artisan serve
designed to quickly fiddle with laravel based website
written in php, isn't designed to solve c10k problem
will crash once available memory is exceeded (128 MB by default, which gets filled up quickly)
isn't a reverse proxy
doesn't use a state-of-the-art HTTP parser
isn't stress tested
can't scale to other machines the way nginx does
doesn't terminate SSL. Even if it did, it would be painfully slow compared to a pure compiled solution
isn't event-based or threaded the way php-fpm/nginx are, so everything executes in the same process. There's no reactor pattern for offloading work to workers, which would let you scale across CPU cores and protect against a messed-up piece of code bringing the server down. This means that if you load too much data from MySQL, the process goes down, and therefore so does the server.
Configuring nginx takes about 30 seconds on average for an experienced person. I'm speaking from experience, since it's my daily job. Using automation tools like Ansible makes this even easier; you can almost forget about it.
Using, in production, a web server designed for fiddling and quickly testing a part of your code comes with risks. Your site will be slower. Your site will be prone to crashing if any script kiddie decides to run a curl request in a foreach loop.
If you think installing and configuring nginx is a hassle and you want to go with php artisan serve, make sure you run it supervised (supervisord is my go-to tool). If it crashes, it'll boot back up again.
In my opinion, it's not worth running a PHP-based server to serve your app. The amount of time spent configuring nginx / php-fpm isn't humongous, even if you're new to it.
Everything comes with risks and gains, but in this particular case the gain doesn't exist, while it's close to certain that something will go wrong.
TL;DR
Don't do it; spend those few minutes configuring nginx. The best software is the kind that does its job so well you can forget about it, and nginx is one of those tools. PHP excels in many areas, but its built-in webserver is not something you should use in production. Go with battle-proven tools.
php artisan serve should never be used in a production environment, as it uses PHP's built-in server functionality, which is designed for development purposes only.
See this page
So please avoid using it in production. Instead, use Apache or Nginx; both are good choices depending on your needs. Nginx is usually faster (though not always).

Stateless & asynchronous web-server with PHP (and Symfony)

TL;DR: I'm not sure this topic has its place on StackOverflow, but basically it's just a topic of debate and thinking about making PHP apps like we would do with NodeJS for example (stateless request flow, asynchronous calls, etc.)
The situation
We know NodeJS can be used as both a web-server and web-app.
But for PHP, the internal web-server is not recommended for production (so says the documentation).
But, as the Symfony full-stack framework is based on the Kernel, which handles Request objects, we should be able to send lots of requests to the same kernel, if only we could "bootstrap" the PHP web server (not the app) by creating a kernel before listening for HTTP requests. Our router would then only create a Request object and have the kernel handle it.
But for this, a Symfony app has to be stateless: for example, we need Doctrine to effectively clear its unit of work after each request, or maybe we would need to somehow isolate some components per request (by identifying a request with its unique PHP class reference id? or by using other PHP processes?), and obviously we would need more asynchronous facilities in PHP or in the way we use the internal web server.
The main questions I sometimes ask myself, and now ask the community
To clarify this, I have some questions about PHP:
Why exactly is the internal PHP webserver not recommended for production?
I mean, if we can configure how the server is run and its "router" file, we should be able to use it like any PHP server, yes or no?
How does it behave internally? Is memory shared between two requests?
By using the router, it seems obvious to me that variables are not shared, else we could make nodejs-like apps, but it seems PHP is not capable of doing something like this.
Is it really possible to make a full-stateless application with Symfony?
e.g. I send two different requests to the same kernel object, in this case, is there any possibility that the two requests create a conflict in Symfony core components?
Actually, the idea of "Create a kernel -> start server -> on request, make the kernel handle it" behavior would be awesome, because it would be something quite similar to NodeJS, but actually, the PHP paradigm is not compatible with this because we would need each request to be handled asynchronously. But if a kernel and its container are stateless, then there should be a way to do something like that, shouldn't there?
Some thoughts
I've heard about ReactPHP, Ratchet PHP for WebSocket integration, Icicle, and PHP-PM, but I've never tried them; it all seems a bit too complex to me for now (I may be missing some concepts about asynchronicity in apps, which is why my brain won't understand until I have some more answers :D).
Is there any way that these libraries could be used as "wrappers" for our kernel request handling?
I mean, let's create this ReactPHP/Icicle/whatever environment setup, create our kernel like we would do in any Symfony app, and run the app as a web server; when a request is received, we send it asynchronously to our kernel, and as long as the kernel has not sent the response, the client waits for it, even if the response is also sent asynchronously (from nested callbacks, etc., like in NodeJS).
This would make any existing Symfony app compatible with this paradigm, as long as the app is stateless, obviously. (if the app config changes based on a request, there's a paradigm issue in the app itself...)
Is this even achievable with PHP libraries, rather than by using the PHP internal web server in another way?
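To make the idea more concrete, here is roughly what such a "wrapper" could look like in my head. This is only a sketch based on my (possibly wrong) understanding of the react/http and Symfony HttpFoundation APIs; the AppKernel name, the port, and the PSR-7 conversion are illustrative, and I have not tested any of this:

    <?php
    // Sketch: boot the Symfony kernel once, then let ReactPHP's event loop
    // feed it requests. Conceptually this is what tools like PHP-PM do.
    require __DIR__.'/vendor/autoload.php';

    use Psr\Http\Message\ServerRequestInterface;
    use React\EventLoop\Loop;
    use React\Http\HttpServer;
    use React\Http\Message\Response;
    use React\Socket\SocketServer;
    use Symfony\Component\HttpFoundation\Request;

    $kernel = new AppKernel('prod', false); // long-lived kernel, booted once
    $kernel->boot();

    $http = new HttpServer(function (ServerRequestInterface $psrRequest) use ($kernel) {
        // Translate the incoming PSR-7 request into a Symfony Request...
        $request = Request::create(
            (string) $psrRequest->getUri(),
            $psrRequest->getMethod(),
            [], [], [], [],
            (string) $psrRequest->getBody()
        );

        // ...let the already-bootstrapped kernel handle it...
        $response = $kernel->handle($request);
        $kernel->terminate($request, $response);

        // ...and translate the Symfony Response back to PSR-7 for ReactPHP.
        return new Response(
            $response->getStatusCode(),
            $response->headers->all(),
            $response->getContent()
        );
    });

    $http->listen(new SocketServer('127.0.0.1:8080'));
    Loop::run(); // one process, many requests, shared memory

The point would be that the autoloader, container and routing are paid for once at startup instead of on every request, which is exactly where the statelessness question bites: anything one request leaves behind in the kernel leaks into the next one.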
Why ask these questions?
Actually, it would be kind of a revolution if PHP could implement real asynchronous behavior internally, like Javascript has, but this would also have a big impact on performance in PHP, because of persistent data in our web server and less bootstrapping (requiring the autoloader, instantiating the kernel, getting heavy things from cached files, resolving routing, etc.).
In my mind, only $kernel->handleRaw($request); would consume CPU; everything else (container, parameters, services, etc.) would already be in memory or, in the case of services, "waiting to be instantiated". A performance boost, I think.
And it may troll a bit the people who still think PHP is a very bad and slow language to use :D
For readers and responders ;)
If a PHP core contributor reads this: is there any way PHP could be made more asynchronous internally, even with a specific new internal API based on functions or classes?
I'm not a pro at all of these concepts, and I hope some real experts are going to read this and answer!
It could be a great advance in the PHP world if all of this was possible in any way.
Why exactly is the internal PHP webserver not recommended for production? I mean, if we can configure how the server is run and its "router" file, we should be able to use it like any PHP server, yes or no?
Because it's not written to behave well under load, and there are no configuration options that let you handle HTTP request processing before it reaches PHP.
Basically, it lacks features if you compare it to nginx. It would be like comparing a skateboard to a Lamborghini.
It can get you from A to B but.. you get the gist.
How does it behave internally? Is memory shared between two requests? By using the router, it seems obvious to me that variables are not shared, else we could make nodejs-like apps, but it seems PHP is not capable of doing something like this.
The documentation states it's single-threaded, so it appears that it would behave the same as if you wrote while(true) { // all your processing here }.
It's a playtoy designed to quickly check a few things if you can't be bothered to set up a proper web server before trying out your code.
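If you want to convince yourself that nothing is shared between requests, run the built-in server with a router script (php -S 127.0.0.1:8000 router.php) and watch a variable that would persist in a long-lived Node process get reset every time. A throwaway sketch, file name made up:

    <?php
    // router.php: the built-in server re-runs this script from a clean slate
    // on every request, unlike a long-lived Node process.
    $counter = ($counter ?? 0) + 1;   // re-initialised each request, never shared

    header('Content-Type: text/plain');
    echo "counter = {$counter}\n";    // prints "counter = 1" on every request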
Is it really possible to make a full-stateless application with Symfony? e.g. I send two different requests to the same kernel object, in this case, is there any possibility that the two requests create a conflict in Symfony core components?
Why would it go to the same kernel object? Why not design your app in such a way that it doesn't matter which object, or even which processing server, gets the request? Why not design for redundancy and high availability from the get-go? HTTP = stateless by default. Your task = make it irrelevant what processes the request. It's not difficult to do so if you avoid coupling with the actual processing server (example: don't store sessions on the local filesystem, etc.).
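To make that last point concrete, moving sessions off the local filesystem is mostly configuration. A minimal sketch, assuming the phpredis extension is installed and a Redis host is reachable (the hostname here is a placeholder):

    <?php
    // Store sessions in Redis instead of the local filesystem, so any PHP
    // backend behind nginx can pick up any request.
    ini_set('session.save_handler', 'redis');
    ini_set('session.save_path', 'tcp://sessions.internal:6379');

    session_start();
    $_SESSION['user_id'] = 42;  // now visible to every machine in the pool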
Actually, the idea of "Create a kernel -> start server -> on request,
make the kernel handle it" behavior would be awesome, because it would
be something quite similar to NodeJS, but actually, the PHP paradigm
is not compatible with this because we would need each request to be
handled asynchronously. But if a kernel and its container is
stateless, then, there should be a way to do something like that,
shouldn't it?
Actually, nginx + php-fpm behaves almost identically to node.js.
nginx uses a reactor to handle all connections on the same thread. Node.js does the exact same thing. What you do is create a closure / callback that is fed into Node's libraries, and I/O is handled in a threaded environment. Multithreading is abstracted away from you (for I/O, not CPU). That's why you can see Node.js block when it's asked to do a CPU-intensive task.
nginx implements the exact same concept, except the callback isn't a closure written in javascript; it's a callback that expects an answer from php-fpm within <timeout> seconds. Nginx takes care of the async part for you; your task is to write what you want in PHP. Now, if you're reading a huge file, then async code in your PHP would make sense, except it's not really needed.
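For completeness, "async code in your PHP" for something like reading a huge file could look roughly like the sketch below; this assumes the react/stream package, and the file path is made up:

    <?php
    // Sketch: non-blocking read of a large file with react/stream. The event
    // loop keeps serving other work while chunks trickle in.
    require __DIR__.'/vendor/autoload.php';

    use React\EventLoop\Loop;
    use React\Stream\ReadableResourceStream;

    $stream = new ReadableResourceStream(fopen('/var/log/huge-access.log', 'r'));

    $bytes = 0;
    $stream->on('data', function (string $chunk) use (&$bytes) {
        $bytes += strlen($chunk);   // called whenever the descriptor is readable
    });

    $stream->on('end', function () use (&$bytes) {
        echo "read {$bytes} bytes without blocking the loop\n";
    });

    Loop::run(); // run the event loop until all streams are finished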
With nginx and sending off requests for processing to a fastcgi worker, scaling becomes trivial. For example, let's assume that 1 PHP machine isn't enough to deal with the amount of requests you're dealing with. No problem, add more machines to nginx's pool.
This is taken from nginx docs:
upstream backend {
    server backend1.example.com weight=5;
    server backend2.example.com:8080;
    server unix:/tmp/backend3;
    server backup1.example.com:8080 backup;
    server backup2.example.com:8080 backup;
}

server {
    location / {
        proxy_pass http://backend;
    }
}
You define a pool of servers and then assign various weights / proxying options related to balancing how requests are handled.
However, the important part is that you can add more servers to cope with availability requirements.
This is the reason why the nginx + php-fpm stack is appealing. Since nginx acts as a proxy, it can proxy requests to node.js as well, letting you handle websocket-related operations in node.js (which, in turn, can perform an HTTP request to a PHP endpoint, allowing you to keep your entire app logic in PHP).
I know this answer might not be what you're after, but what I wanted to highlight is that the way node.js works (conceptually) is identical to what nginx does when it comes to handling incoming requests. You could make PHP work the way Node does, but there's no need for that.
Your questions can be summed up as this:
"Could PHP be more like Node?"
to which the answer is of course "Yes." But that leads us to another question:
"Should PHP be more like Node?"
and now the answer is not that obvious.
Of course in theory PHP could be made more like Node - even to a point to make it exactly the same. Just take the next version of Node and call it PHP 6.0 or something.
I would argue that it would be harmful to both Node and PHP. There is a diversity in the runtime environments for a reason. One of the variations is the concurrency model used in a given environment. Making one like the other would mean less choice for the programmer. And less choice is less freedom of expression.
PHP and Node were created in different times and for different reasons.
PHP was developed in 1995 and the name stood for Personal Home Page. The use case was to add some server-side dynamic features to HTML. We already had SSI and CGI at that point, but people wanted to be able to inject the results of database queries and other computations right into the HTML (synchronously, as it wouldn't make much sense otherwise). It isn't a surprise how good it is at this job even today.
Node, on the other hand, was developed in 2009, almost 15 years later, to create high-performance network servers. So it shouldn't surprise us that writing such servers in Node is easy and that they have great performance characteristics. This is why Node was created in the first place. One of the choices it had to make was a 100% non-blocking environment of single-threaded, asynchronous event loops.
Now, single-threading concurrency is conceptually more difficult than multi-threading. But if you want performance for I/O-heavy operations then currently you have no other options. You will not be able to create 10,000 threads but you can easily handle 10,000 connections with Node in a single thread. There is a reason why nginx is single-threaded and why Redis is single threaded. And one common characteristic of nginx and Redis is amazing performance - but both of those were hard to write.
Now, as far as Node and PHP go, those technologies are so far from each other that it's hard to even imagine what their fusion would look like. It reminds me of the old April Fools' joke about unifying Perl and Python that so many people believed.
PHP has its strengths and Node has its strengths. And just as it would be hard to imagine Node with blocking I/O, it would be equally hard to imagine PHP with non-blocking I/O.
To summarize: it could be possible to make PHP like Node, but I wouldn't expect it to happen any time soon - if ever.

Is it more performant to use Proxygen or NGINX + FastCGI local socket with HHVM?

HHVM has a built in Server, Proxygen. You can run HHVM with the Proxygen server or run it in FastCGI mode, using another server such as nginx or apache to handle web requests.
I cannot find any benchmarks or authoritative sources that give any indication of which of the two options performs best. Obviously I could provision two systems, manually test various loads under different concurrency combinations, and put together a benchmark, but I'd rather avoid the work if someone has already done such a comparison.
Does anyone know in general which is the better option from a sheer performance standpoint?
I have not done any measurements. But in theory, the Proxygen server should be more performant because it runs in the same process as the PHP worker threads, thus avoiding some inter-process communication overhead. The Proxygen server is used at Facebook, and some efforts have been made to make it more reliable, e.g., protection mechanisms for when the JIT compiler isn't fully warmed up. However, these should not matter much for other users. If you already have your favorite apache/nginx setup and do not want to spend the time tuning settings for another HTTP server, use FastCGI.

How Can a LAMP Guy Easily Implement WebSockets?

I've always worked with Apache, MySQL, and PHP. I'd like to eventually branch out to Python/Django or Ruby/Ruby on Rails, but that's another discussion. Two great things about Apache, MySQL, and PHP are that all three are ubiquitous and that it's very easy to launch a website. Just set up an Apache virtual host, import the database into MySQL, and copy the PHP files onto the server. That's it. This is all I've ever done and all I've ever known. Please keep this in mind.
These days, it's becoming increasingly important for websites to be able to deliver data in real-time to the users. Users expect this too due to the live nature of Facebook and Gmail. This effect can be faked with Ajax polling, but that has a lot of overhead, as explained here. I'd like to use WebSockets. Now remember that I've always been a LAMP guy. I've only ever launched websites using the method I described earlier. So if I have, say, a CakePHP site, how can I "add on" the feature of WebSockets? Do I need to install some other server or something or can I get it to work smoothly with Apache? Will it require Apache 2.4? Please explain the process to me keeping in mind that I only know about LAMP. Thanks!
One key thing to keep in mind, is that a realtime websockets server needs to be "long running", so that it can push stuff to clients. In the classic LAMP setup, Apache spawns a PHP interpreter on each request. Between requests the PHP interpreter is not running, and the only protocol state kept between requests is sessions.
One nice property of the LAMP way is that memory management is easy. You just implicitly allocate whatever memory you need, and it is automatically reclaimed when the request is done and the PHP process exits. As soon as you want the server to keep running, you need to consider memory management. In some languages, like C++, you manage allocation and deallocation explicitly. In other languages, like Java or Javascript, you have garbage collection. In PHP you throw everything away and start with a fresh slate on each request.
I think you will have a hard time making long-running servers with something like Cake or any other classic PHP framework. Those frameworks work by basically taking an HTTP request and turning it into an HTTP response.
My advice is that you should look into something like Node.JS and SocketIO. If you know Javascript, or don't mind learning, these technologies allow you to easily implement real-time servers and clients. If necessary you could run a reverse proxy like nginx, so that your existing LAMP stack would get some requests, and one or more NodeJS servers would get some.
This answer came out a bit fluffy, but I hope that it helps a little.. :-)

2011 Web Scripting Languages and Dynamic Reloading

This has been bugging me for a while now.
In a deployed PHP web application one can upload a changed php script and have the updated file picked up by the web server without having to restart.
The problem? Ruby, Groovy, & Python, etc. are all "better" than PHP in terms of language expressiveness, concision, power, ...your-reason-here.
Currently, I am really enjoying Groovy (via Grails), but the reality is that the JVM does not do well (at all) with dynamic reloading of application code in production. Basically, PermGen out-of-memory errors are a virtual guarantee, and that means the application can crash at any time, which is not good.
Ruby frameworks seem to have this solved somewhat from what I have read: Passenger has an option to dynamically reload changed files in polled directories on the next request (thus preventing connected users from being disconnected, session lost, etc.).
Standalone Python I am not sure about at all; it may, like PHP, allow dynamic reloading of Python scripts without a web server restart.
As far as our web work is concerned, invariably clients wind up wanting to make changes to a deployed application regardless of how detailed and well planned the spec was. Telling the client, "sure, we'll implement that [simple] change at 4AM tomorrow [so as to not wreak havoc with connected users]", won't go over too well.
As of 2011 where are we at in terms of dynamic reloading and scripting languages? Are we forever doomed, relegated to the convenience of PHP, or the joys of non-PHP and being forced to restart a deployed application?
BTW, I am not at all a fan of JSPs, GSPs, or the Ruby and Python templating equivalents, despite their reloadability. This is a have-your-cake-and-eat-it-too thread, where we want to be able to change any aspect of the application and not have to restart.
You haven't specified a web server. If you're using Apache, mod_wsgi is your best bet for running Python web apps, and it has a reloading mechanism that doesn't require a server restart.
I think you're making a bigger deal out of this than it really is.
Any application for which it is that important that it never be down for 1/2 a minute (which is all it takes to reboot a server to pick up a file change) really needs to have multiple application server instances in order to handle potential failures of individual instances. Once you have multiple application servers to handle failures, you can also safely restart individual instances for maintenance without causing a problem.
