NGINX reverse proxy / PHP pre-processing - php

I need to solve the following scenario: we have an nginx server acting as a reverse proxy in front of some Apache servers. When a request reaches the nginx proxy, it must be pre-processed by a PHP script that sets some HTTP request headers based on the URL, and the request must then be passed on to the Apache server.
We need to avoid redirects in this process, but I have no idea how to do it.
Thanks a lot...
[EDIT]
Sorry for the vague question. Our setup is as follows: nginx is used as a load balancer in front of several Apache web servers. The web servers run an application that generates e-commerce content (and page categories) based on an analysis of the submitted URL. We use a third-party analytics tool that requires a request header populated with the category, but the categories are calculated by the application's PHP code... I need the request processed by nginx to carry that header before it reaches Apache. I could extract the code from the PHP application and create an intermediate layer, but I have no idea how to manage the whole process.
Here is a simple diagram: black is the as-is setup, green the to-be (or maybe to-be):
[diagram: simple solution]

Your question is very vague - and will probably be closed on that basis. My response here is intended as a comment - but it's a bit long for the comment box.
That you are using nginx as a reverse proxy implies that you are somewhat concerned with performance. While it would be quite possible to implement what you describe, the nature of PHP running in a webserver means it will be rather inefficient at this task - each incoming web request will require a new connection to the backend webserver.
Presumably there is some application running on or behind the apache webservers - is there a reason you don't implement the required functionality there?
Can you provide examples of the changes you need to apply to requests and responses? It's possible that some of this could be handled by nginx or apache.
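For illustration only: one way to keep PHP in the loop without redirects is to expose the category logic as a tiny PHP endpoint that nginx calls as a subrequest (for example with the auth_request module, copying the endpoint's X-Category response header onto the proxied request via auth_request_set and proxy_set_header). A minimal sketch of that endpoint, where the X-Original-URI header name and lookup_category() are placeholders for whatever your application already does:
<?php
// category.php - called by nginx as a subrequest, never by end users.
// Assumes nginx passes the original URL in an X-Original-URI request header.
$uri = isset($_SERVER['HTTP_X_ORIGINAL_URI']) ? $_SERVER['HTTP_X_ORIGINAL_URI'] : '/';

// Placeholder for the category logic extracted from the e-commerce application.
function lookup_category($uri)
{
    if (strpos($uri, '/shoes/') !== false) {
        return 'shoes';
    }
    return 'default';
}

// Return the category as a response header; nginx copies it onto the request
// it forwards to Apache. A 200 status lets the original request proceed.
header('X-Category: ' . lookup_category($uri));
http_response_code(200);
Whether this is fast enough depends on how heavy the category lookup is; every request now costs one extra local subrequest.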
Alternatively you might have a look at ICAP (RFC 3507), which is a protocol designed to support this kind of transformation. Although there are server implementations written in PHP, I suspect they will have most of the same performance issues referenced above.

Related

Stateless & asynchronous web-server with PHP (and Symfony)

TL;DR: I'm not sure this topic has its place on StackOverflow, but basically it's just a topic for debate and thinking about making PHP apps the way we would with NodeJS, for example (stateless request flow, asynchronous calls, etc.).
The situation
We know NodeJS can be used as both a web-server and web-app.
But for PHP, the internal web-server is not recommended for production (so says the documentation).
But, since the Symfony full-stack is based on the Kernel, which handles Request objects, we should be able to send lots of requests to the same kernel, if only we could "bootstrap" the PHP web-server (not the app) by creating a kernel before listening for HTTP requests. Our router would then only create a Request object and have the kernel handle it.
But for this, a Symfony app has to be stateless; for example, we would need Doctrine to effectively clear its unit of work after each request, or we might need to isolate some components per request (by identifying a request with a unique PHP object reference id? by using separate PHP processes?), and obviously we would need more asynchronous facilities in PHP, or in the way we use the internal web-server.
The main questions I sometimes ask myself, and now ask the community
To clarify this, I have some questions about PHP:
Why exactly is the internal PHP webserver not recommended for production?
I mean, if we can configure how the server is run and its "router" file, we should be able to use it like any PHP server, yes or no?
How does it behave internally? Is memory shared between two requests?
By using the router, it seems obvious to me that variables are not shared, otherwise we could make NodeJS-like apps; but it seems PHP is not capable of doing something like this.
Is it really possible to make a full-stateless application with Symfony?
e.g. I send two different requests to the same kernel object, in this case, is there any possibility that the two requests create a conflict in Symfony core components?
Actually, a "create a kernel -> start the server -> on request, have the kernel handle it" workflow would be awesome, because it would be quite similar to NodeJS; but the PHP paradigm is not compatible with this, because we would need each request to be handled asynchronously. Still, if a kernel and its container are stateless, there should be a way to do something like that, shouldn't there?
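As a purely conceptual sketch of that idea (not working code): Symfony's public handle()/terminate() API is real, but receive_next_request() and send_response() below are hypothetical stand-ins for whatever event loop or socket layer would feed requests in, and the file layout assumes a standard-edition project:
<?php
// Conceptual sketch only - a long-running worker that boots the kernel once.
require __DIR__.'/app/autoload.php';
require __DIR__.'/app/AppKernel.php';

$kernel = new AppKernel('prod', false);
$kernel->boot();                                 // bootstrap once, outside the loop

while ($request = receive_next_request()) {      // hypothetical request source
    $response = $kernel->handle($request);       // real HttpKernelInterface method
    send_response($response);                    // hypothetical transport back to the client
    $kernel->terminate($request, $response);     // run kernel.terminate listeners
}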
Some thoughts
I've heard about React PHP, Ratchet PHP for WebSocket integration, Icicle, and PHP-PM, but I've never tried them; it all seems a bit too complex to me for now (I may be missing some concepts about asynchronicity in apps, which is why my brain won't get it until I have some more answers :D).
Is there any way that these libraries could be used as "wrappers" for our kernel request handling?
I mean, let's set up this ReactPHP/Icicle/whatever environment, create our kernel like we would in any Symfony app, and run the app as a web-server; when a request is received, we send it asynchronously to our kernel, and as long as the kernel has not sent the response, the client waits for it, even if the response is also sent asynchronously (from nested callbacks, etc., as in NodeJS).
This would make any existing Symfony app compatible with this paradigm, as long as the app is stateless, obviously. (if the app config changes based on a request, there's a paradigm issue in the app itself...)
Is this even realistically possible with PHP libraries, rather than by using the PHP internal web-server in another way?
Why ask these questions?
Actually, it would be kind of a revolution if PHP could implement real asynchronous behaviour internally, like JavaScript has, and this would also have a big impact on performance in PHP, because of persistent data in our web-server and less bootstrapping (requiring the autoloader, instantiating the kernel, loading heavy things from cached files, resolving routing, etc.).
In my mind, only the $kernel->handleRaw($request); call would consume CPU; everything else (container, parameters, services, etc.) would already be in memory or, in the case of services, waiting to be instantiated. A performance boost, I think.
And it might troll the people who still think PHP is a very bad and slow language to use :D
For readers and responders ;)
If a core PHP contributor reads this: is there any way PHP could internally be made more asynchronous, even via a specific new internal API based on functions or classes?
I'm not a pro at all of these concepts, and I really hope some good experts are going to read this and answer me!
It could be a great advance in the PHP world if all of this was possible in any way.
Why exactly is the internal PHP webserver not recommended for production? I mean, if we can configure how the server is run and its "router" file, we should be able to use it like any PHP server, yes or no?
Because it's not written to behave well under load, and there are no configuration options that let you handle HTTP request processing before it reaches PHP.
Basically, it lacks features compared to nginx. It's like comparing a skateboard to a Lamborghini: it can get you from A to B, but... you get the gist.
How does it behave internally? Is memory shared between two requests? By using the router, it seems obvious to me that variables are not shared, otherwise we could make NodeJS-like apps; but it seems PHP is not capable of doing something like this.
The documentation states it's single-threaded, so it appears it would behave the same as if you wrote while(true) { // all your processing here }.
It's a toy, designed to quickly check a few things when you can't be bothered to set up a proper web server before trying out your code.
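For context, the built-in server is started with something like php -S 127.0.0.1:8080 router.php, and the "router" file mentioned in the question is just a script like the sketch below (my own example, not from the docs):
<?php
// router.php for the built-in server: php -S 127.0.0.1:8080 router.php
if (php_sapi_name() === 'cli-server') {
    $path = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
    if (is_file(__DIR__ . $path)) {
        return false;                    // returning false serves the static file as-is
    }
}

require __DIR__ . '/index.php';          // everything else goes to the front controller
Every request still runs in that one process, one after the other, which is the single-threaded behaviour described above.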
Is it really possible to make a full-stateless application with Symfony? e.g. I send two different requests to the same kernel object, in this case, is there any possibility that the two requests create a conflict in Symfony core components?
Why would it go to the same kernel object? Why not design your app in such a way that it's not relevant which object, or even which processing server, gets the request? Why not design for redundancy and high availability from the get-go? HTTP is stateless by default; your task is to make it irrelevant what processes the request. That's not difficult to do, as long as you avoid coupling with the actual processing server (for example, don't store sessions on the local filesystem; see the sketch below).
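For example, a common way to decouple sessions from the local filesystem is to point PHP's session handler at a shared store. A sketch assuming the phpredis extension, with the host name being a placeholder:
<?php
// Keep session state off the local disk so any PHP node can serve any request.
// Assumes the phpredis extension is installed; host/port are placeholders.
ini_set('session.save_handler', 'redis');
ini_set('session.save_path', 'tcp://redis.internal:6379');

session_start();
if (!isset($_SESSION['cart_id'])) {
    $_SESSION['cart_id'] = uniqid('cart_', true);
}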
Actually, a "create a kernel -> start the server -> on request, have the kernel handle it" workflow would be awesome, because it would be quite similar to NodeJS; but the PHP paradigm is not compatible with this, because we would need each request to be handled asynchronously. Still, if a kernel and its container are stateless, there should be a way to do something like that, shouldn't there?
Actually, nginx + php-fpm behaves almost identically to node.js.
nginx uses a reactor to handle all connections on the same thread, and Node.js does the exact same thing. What you do is create a closure / callback that is fed into Node's libraries, and I/O is handled in a threaded environment; multithreading is abstracted away from you (for I/O, not CPU). That's why you can see Node.js block when it's asked to do a CPU-intensive task.
nginx implements the exact same concept, except the callback isn't a closure written in JavaScript: it's a callback that expects an answer from php-fpm within <timeout> seconds. Nginx takes care of the async part for you; your task is to write what you want in PHP. Now, if you're reading a huge file, async code in your PHP would make sense, except it's not really needed.
With nginx sending requests off to a FastCGI worker for processing, scaling becomes trivial. For example, let's assume one PHP machine isn't enough to deal with the number of requests you're receiving. No problem: add more machines to nginx's pool.
This is taken from the nginx docs:
upstream backend {
    server backend1.example.com weight=5;
    server backend2.example.com:8080;
    server unix:/tmp/backend3;

    server backup1.example.com:8080 backup;
    server backup2.example.com:8080 backup;
}

server {
    location / {
        proxy_pass http://backend;
    }
}
You define a pool of servers and then assign various weights / proxying options related to balancing how requests are handled.
However, the important part is that you can add more servers to cope with availability requirements.
This is the reason the nginx + php-fpm stack is appealing. Since nginx acts as a proxy, it can proxy requests to node.js as well, letting you handle websocket-related operations in node.js (which, in turn, can perform an HTTP request to a PHP endpoint, allowing you to keep your entire app logic in PHP).
I know this answer might not be what you're after, but what I wanted to highlight is that the way node.js works (conceptually) is identical to what nginx does when handling incoming requests. You could make PHP work the way Node does, but there's no need for that.
Your questions can be summed up as this:
"Could PHP be more like Node?"
to which the answer is of course "Yes." But that leads us to another question:
"Should PHP be more like Node?"
and now the answer is not that obvious.
Of course in theory PHP could be made more like Node - even to a point to make it exactly the same. Just take the next version of Node and call it PHP 6.0 or something.
I would argue that it would be harmful to both Node and PHP. There is a diversity in the runtime environments for a reason. One of the variations is the concurrency model used in a given environment. Making one like the other would mean less choice for the programmer. And less choice is less freedom of expression.
PHP and Node were created in different times and for different reasons.
PHP was developed in 1995, and the name stood for Personal Home Page. The use case was to add some server-side dynamic features to HTML. We already had SSI and CGI at that point, but people wanted to inject the results of database queries and other computations right into the HTML - synchronously, as it wouldn't make much sense otherwise. It isn't a surprise how good it is at this job even today.
Node, on the other hand, was developed in 2009 - almost 15 years later - to create high performance network servers. So it shouldn't surprise us that writing such servers in Node is easy and that they have great performance characteristics. This is why Node was created in the first place. One of the choices it had to make was a 100% non-blocking environment of single-threaded, asynchronous event loops.
Now, single-threaded concurrency is conceptually more difficult than multi-threading. But if you want performance for I/O-heavy operations, then currently you have no other option. You will not be able to create 10,000 threads, but you can easily handle 10,000 connections with Node in a single thread. There is a reason why nginx is single-threaded and why Redis is single-threaded. And one common characteristic of nginx and Redis is amazing performance - but both of those were hard to write.
Now, as far as Node and PHP go, those technologies are so far from each other that it's hard to even comprehend what their fusion would look like. It reminds me of the old April Fools' joke about unifying Perl and Python that so many people believed.
PHP has its strengths and Node has its strengths. And just like it would be hard to imagine Node with blocking I/O, it would be equally hard to imagine PHP with non-blocking I/O.
To summarize: it could be possible to make PHP like Node, but I wouldn't expect it to happen any time soon - if ever.

How to get $_SERVER['HTTP_RANGE'] available on the PHP side with Nginx + php-fpm

I've seen several links where people somehow made $_SERVER['HTTP_RANGE'] available on the PHP side.
For example this link:
NGINX - PHP-FPM Serving Movies Seek & Connection Handle
It sounds like FastCGI consumes the Range header. See this link:
https://forum.nginx.org/read.php?2,247932,247936#REPLY
Maybe I missed something? How can I resolve this problem?
My current config is:
nginx 1.8
php5-fpm 5.5.9
yii2 based application
My original problem was to organize secure access to video files hosted on our servers. We previously used Apache, and all logic related to supporting HTTP Range requests was done on the PHP side. When we migrated to nginx, this logic broke, because the FastCGI cache applies its own logic to ranges. That is why there is no way to get HTTP_RANGE at the PHP level, and why I asked the question here.
I resolved the problem by using the X-Accel-Redirect nginx header.
This feature performs internal redirects: the PHP part can still do the security validation, but nginx does all the other work itself. It supports ranges, caching and partial content. Cool stuff; a sketch is below.
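The PHP side of that ends up looking roughly like this; the is_allowed() check stands in for your own Yii2 authorization logic, and it assumes an nginx location such as location /protected_videos/ { internal; alias /data/videos/; }:
<?php
// download.php - PHP only validates access; nginx streams the file itself.
if (empty($_GET['video']) || !is_allowed($_GET['video'])) {   // placeholder auth check
    http_response_code(403);
    exit;
}

$name = basename($_GET['video']);                 // avoid path traversal
header('Content-Type: video/mp4');
header('X-Accel-Redirect: /protected_videos/' . $name);
// No body is sent: nginx picks up the internal redirect and handles
// Range requests, caching and partial content on its own.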
I hope this post saves you a lot of time. Thanks.

How can I make a PHP script run without redirecting to it?

I'm making a simple web-based control panel, and I figured the easiest way to accomplish it would be to have PHP on both machines (one being the web-facing machine, the other being behind a VPN). Basically, when I press a button on the site on the externally facing IP of machine 1, it needs to send a request to the internally facing IP (e.g. 192.168.100.1) of machine 2 and run the PHP file there (test.php plus some $_GET data), without actually redirecting the end user to 192.168.100.1, because obviously that will time out since they have no access to it.
If all you want is to make certain internal PHP pages accessible on the external server, you should consider setting up a reverse proxy instead of manually proxying requests with PHP.
See the Apache documentation for an example: http://httpd.apache.org/docs/2.2/mod/mod_proxy.html
Of course this won't work if you do your authentication on the external server and/or need to execute additional PHP code on the external server before/after the internal PHP code. In that case refer to Mihai's or Louis's answer.
You can use cURL to send or forward HTTP requests from machine 1 to machine 2, receive the responses machine 2 gives you and (if needed) process those responses before showing them to the user, as sketched below.
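A rough sketch of that forwarding on machine 1, where the internal URL and the action parameter are just examples:
<?php
// Forward the button press from machine 1 to machine 2 over the VPN.
$action = isset($_GET['action']) ? $_GET['action'] : '';
$ch = curl_init('http://192.168.100.1/test.php?' . http_build_query(array('action' => $action)));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);   // return the body instead of printing it
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$body = curl_exec($ch);

if ($body === false) {
    echo 'Request to machine 2 failed: ' . curl_error($ch);
} else {
    echo $body;                                   // or process it before showing the user
}
curl_close($ch);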
You could also use (XML-/JSON-)RPC or SOAP, which would be a bit more elegant and extensible (more commonplace than raw cURL), but it comes with a steeper learning curve and more setup time/work.
You should also be able to use file_get_contents (which normally supports the http protocol) or http_get, a function designed for simple HTTP GET requests.
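For a plain GET with no custom headers (requires allow_url_fopen to be enabled; same placeholder URL as above), that can be as short as:
<?php
// Simple alternative on machine 1: a one-line GET via the http stream wrapper.
$body = file_get_contents('http://192.168.100.1/test.php?action=restart');
echo $body !== false ? $body : 'Request to machine 2 failed';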
It might not be the most ideal way, but should be fairly easy to do.

Node + now.js + Model-View-Control-Pattern

I'm using a forum software which is based on MVC-Pattern (Templates and PHP-Classes).
Pages look like this: domain.com/index.php?page=Test
I want to setup a chatserver on one page (domain.com/index.php?page=Chat) with node and now.js.
Now I encounter a problem: how do I tell the server-side code that the chat server has to work at index.php?page=Chat?
Obviously I can't do something like that:
fs.readFile('index.php?page=Chat')
Any ideas how to setup a node server on URLs like that?
Thanks!
I would dive a little deeper into node.js. As Node is itself a webserver, you have to learn a little about how routing and server configuration work. Basically, anything coming in on port 80 is listened to by your (likely) Apache service. Apache looks at the URI, decides which script in your application to run, and kicks off a PHP process that runs your code and generates a web page to be sent to the user.
So when you see:
domain.com/chat
vs
domain.com/index.php?page=Chat
That's Apache saying, "hey, you configured me to read '/chat' as /index.php?page=Chat, so I'll fire that script off".
Node.js is like both Apache AND PHP balled up into one: it handles the requests and builds the pages. So you would have node.js and Apache stepping on each other's toes when requests come in. To have both applications listening on port 80, you would have to use something like:
https://github.com/nodejitsu/node-http-proxy
This node module forwards unhandled server requests to Apache, which would allow you to have a mixed nodejs/apache+php application.
As far as the templating goes, PHP and JavaScript templates can't be intermingled, as they're built on completely different languages. So you're out of luck, almost. Node has a very rich list of templating engines, some of which are likely to have near-identical syntax to whatever you're using, so porting would be simple.
https://github.com/joyent/node/wiki/modules#wiki-templating
I hope this answers your question. I would still, as commented, use an iFrame, put node on a different port, and keep the two architectures clean and separate. Or, use a chat service and don't bother setting up a whole separate application. Unless you want to learn, in which case, go crazy. :)
You can run the Node server on a port, say 8080, and include the client-side JS as normal JavaScript in any of the view files; it will work.

PHP running as a FastCGI application (php-cgi) - how to issue concurrent requests?

EDIT: Update - scroll down
EDIT 2: Update - problem solved
Some background information:
I'm writing my own webserver in Java, and a couple of days ago I asked on SO how exactly Apache interfaces with PHP so that I could implement PHP support. I learnt that FastCGI is the best approach (since mod_php is not an option). So I looked at the FastCGI protocol specification and managed to write a working FastCGI wrapper for my server. I have tested phpinfo() and it works; in fact, all PHP functionality seems to work just fine (posting data, sessions, date/time, etc.).
My webserver is able to serve requests concurrently (i.e. user1 can retrieve file1.html at the same time as user2 requests some_large_binary_file.zip); it does this by spawning a new Java thread for each user request (terminating when the request completes or the connection with the client is closed).
However, it cannot deal with two (or more) FastCGI requests at the same time. Instead it queues them up, so when request 1 is completed it immediately starts processing request 2. I tested this with two PHP pages, one containing sleep(10) and the other phpinfo().
How would I go about dealing with multiple requests? I know it can be done (PHP under IIS runs as FastCGI and deals with multiple requests just fine).
Some more info:
I am coding under Windows, and the batch file I use to execute php-cgi.exe contains:
set PHP_FCGI_CHILDREN=8
set PHP_FCGI_MAX_REQUESTS=500
php-cgi.exe -b 9000
But it does not spawn 8 children; the process simply terminates after 500 requests.
I have done some research, and from Wikipedia:
Processing of multiple requests simultaneously is achieved either by using a single connection with internal multiplexing (i.e. multiple requests over a single connection) and/or by using multiple connections.
Now clearly the multiple-connections approach isn't working for me: every time a client requests something that involves FastCGI, my server creates a new socket to the FastCGI application, but the requests are not processed concurrently (they are queued up instead).
I know that internal multiplexing of FastCGI requests under the same connection is accomplished by issuing each unique FastCGI request with a different request ID.
(also see the last 3 paragraphs of 'The Communication Protocol' heading in this article).
I have not tested this, but how would I go about implementing it? I take it I need some kind of FastCGI Java thread containing a Map of some sort and a static function I can use to add requests. In the thread's run() function, a while loop would check on every cycle whether the Map contains new requests; if so, it would assign them a request ID, write them to the FastCGI stream, and then wait for input, etc. As you can see, this gets complicated quickly.
Does anyone know the correct way of doing this? Or any thoughts at all? Thanks very much.
Note, if required I can supply the code for my FastCGI wrapper.
Update:
Basically, I downloaded nginx and set it up to use PHP as a FastCGI application, and it suffered from the same problem as my server: it could not handle concurrent PHP requests. This leads me to believe my code is in fact correct, so something is wrong with PHP or I am not setting it up correctly. Maybe it is because I am using Windows, as some lighttpd users claim Windows can't handle FastCGI properly (this doesn't make much sense). I'll install Linux sometime soon and report any progress.
Okay, I managed to find the cause of the problem. It wasn't my code at all; it's PHP. It cannot spawn additional php-cgi processes under Windows when running in FastCGI mode. Under Linux it works perfectly: I simply pointed my server at my Linux box's IP and it had no problems with concurrent FCGI requests. Sucks, but I guess that's the way it is...
I did look deeper into the PHP source code after that and found that the section of code which responds to PHP_FCGI_CHILDREN is wrapped in #ifndef WIN32, so the developers must be aware of the issue.
Hi, this comes a little late: I've written a spawner for php-cgi.exe on Windows. It's not perfect, but it might be what you needed. Check it out here.
Re: the spawn-php Python script...
Thanks #nosam, that really helped.
For those wanting to get it working quickly, you'll need the following (on a 64-bit system): ActivePython-2.7.2.5-win64-x64.msi and pywin32-217.win-amd64-py2.7.exe.
ActivePython does not have older versions of these on their site, so you will need to do a bit of googling to find a working mirror (there are plenty out there).
Once you have downloaded the source from Bitbucket, you may need to edit spawn-php.py (to fix up the tab spacing), as Bitbucket seemed to mess up the tabs in the file, preventing it from running.
All in all, that saved my day for a busy little Windows website using nginx + FastCGI.
Thanks mate!
