I'm using lighttpd as the web server for a PHP application server. The average load on this server is about 2-3. The MySQL database has been moved to another server (its load is ~0.4). How can I scale the PHP application server?
Thank you.
In a few words, the general solution is to:
Have several PHP servers
Have another load-balancing server in front of those
If you have 3 PHP servers, this load balancer will send a third of the requests it receives to each PHP server.
The load-balancing server can be specialized hardware, or a reverse proxy such as Apache (see mod_proxy_balancer, for example), nginx (see Load Balancing with Nginx, for example), Varnish, ...
The most important problems you'll generally face when using several PHP servers instead of one are related to the filesystem: with several servers, each server has its own disks and filesystem.
For example, if a user is randomly balanced to server 1 for one page and server 2 for another page, you cannot use file-based sessions: the session will be created on server 1, but will not be found on server 2 later.
In this specific case, you'll have to use another mechanism to store sessions -- store them in a database, or in memcached, for example.
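As a minimal sketch (assuming the memcached PECL extension is installed, and two made-up cache hosts mc1/mc2), switching from file-based sessions to a shared memcached pool is mostly configuration:

    // Hypothetical hosts; with the "memcached" extension, session.save_path
    // is a comma-separated list of host:port entries.
    ini_set('session.save_handler', 'memcached');
    ini_set('session.save_path', 'mc1.example.com:11211,mc2.example.com:11211');

    session_start();
    $_SESSION['user_id'] = 42;   // now readable from any PHP server behind the balancer

The same two settings can of course go into php.ini instead, so every server in the pool uses the same configuration.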
The same goes for images (uploaded by users, for example): you'll have to either:
Synchronise them between servers
Or use some kind of network drive
Edit after the comment: For the deployment, with several servers, I generally do exactly as I would with only one server.
For more information on the kind of process I often use, you can take a look at the answer I gave a while back to this question: Updating a web app without any downtime.
The explanation given there is for one server, but to do the same thing on several servers, you can :
Upload the archive to each one of your servers
Extract it
And, at almost the same instant, switch the symlink on all servers
At worst, you'll have 2 or 3 seconds of delay between servers, which means maybe 10 seconds if you have 5 servers? It's probably not a big problem :-)
(I've done this with up to 7 servers, for one application, and never had any problem)
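For what it's worth, the symlink switch itself can be made atomic; here's a rough sketch in PHP (the paths and release name are made up for the example), though a couple of lines of shell on each server would do just as well:

    // Hypothetical layout: /var/www/releases/<version>, with /var/www/current
    // being the symlink the web server's document root points at.
    $release = '/var/www/releases/20240501';
    $current = '/var/www/current';
    $tmp     = $current . '.tmp';

    // Build the new symlink next to the old one, then rename() it over the old link:
    // on POSIX filesystems the rename is atomic, so requests never see a half-switched state.
    if (is_link($tmp)) {
        unlink($tmp);
    }
    symlink($release, $tmp);
    rename($tmp, $current);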
I'm designing a system that will require several web servers (NGINX) behind a load balancer.
My question is: what techniques do you suggest for sharing settings among all web servers (hosting a PHP app)? Let's say I have to change the credentials for the database connection. In that case I don't want to log in to every single server and change all of the config files.
What do you suggest I do to be able to update those variables in one place so they're accessible to all web servers? I've considered having a small server in the middle which all servers read from (through an scp connection or similar), but I don't want a single point of failure.
There are various solutions to automate server management, like Puppet and Chef, but if you are just getting started with two or three servers, consider a tool that broadcasts the same SSH session to multiple hosts. Terminator for GNOME is great, and there's also csshX for Mac (which I haven't tried).
It's great to be able to see both terminals at once in case there are small differences between the servers from prior manual maintenance. In Terminator, you can easily switch between broadcasting your commands to a group of terminals and typing into a specific one. You can set up a layout that automatically starts two joined sessions and SSHes into all the servers in a given cluster. In other words, it's as easy to use as an SSH session normally would be, but it works for multiple servers.
When project size and complexity grows, consider switching to more fully automated server management.
I have a web application written in PHP; it is already deployed on an Apache server and works perfectly.
The application uses MySQL as its database, and sessions are saved in a memcached server.
I am planning to move to an HAProxy environment with 2 servers.
What I know: I will deploy the application to the servers and configure HAProxy.
My question is: is there anything I have to take care of or change in the code?
It depends.
Are you trying to solve a performance or redundancy problem?
If your database (MySQL) and session handler (memcached) are running on one or more servers separate from the two Apache servers, then the only major thing your code will have to do differently is manage the forwarded IP addresses (via the X-Forwarded-For header), and HAProxy will happily round-robin your requests between the Apache servers.
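For example, a small sketch of what that code change can look like (assuming HAProxy is configured with option forwardfor and is the only proxy in front of PHP):

    // Behind the load balancer, REMOTE_ADDR is HAProxy's address;
    // the real client IP arrives in the X-Forwarded-For header.
    function client_ip(): string
    {
        if (!empty($_SERVER['HTTP_X_FORWARDED_FOR'])) {
            // The header can contain a comma-separated chain; the first entry is the
            // original client. Only trust it when the header was set by your own proxy.
            $chain = explode(',', $_SERVER['HTTP_X_FORWARDED_FOR']);
            return trim($chain[0]);
        }
        return $_SERVER['REMOTE_ADDR'];
    }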
If your database and session handler are currently running on the same server, then you need to decide if the performance or redundancy problem you are trying to solve is with the database, the session management, or Apache itself.
The easiest solution, for a performance problem with a database/session-heavy web app, is to simply start by putting MySQL and memcached on the second server to separate your concerns. If this solves the performance problem you were having with one server, then you could consider the situation resolved.
If the above solution does not solve the performance problem, and you notice that Apache is having trouble serving your website files, then you would have the option of a "hybrid" approach where Apache would exist on both servers, but then you would also run MySQL/memcached on one of the servers. If you decided to use this approach, then you could use HAProxy and set a lower weight to the hybrid server.
If you are attempting to solve a redundancy issue, then your best bet will be to isolate each piece into logical groups (e.g. database cluster, memcached cluster, Apache cluster, and a redundant HAProxy pair), and add redundancy to each logical group as you see fit.
The biggest issue that you are going to run into is related to PHP sessions. By default, PHP sessions maintain state on a single server. When you add the second server into the mix and start load balancing connections to both of them, the PHP session will not be valid on the second server that gets hit.
Load balancers like HAProxy expect a "stateless" application. To make PHP stateless you will more than likely need to use a different mechanism for your sessions. If you do not or cannot make your application stateless, then you can configure HAProxy to do sticky sessions, either based on cookies or on stick tables (source IP, etc.).
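To illustrate the "different mechanism" part: PHP lets you plug in your own storage via SessionHandlerInterface, so $_SESSION keeps working unchanged while the data lives in a shared store that both Apache servers can reach. A minimal, illustrative sketch (PHP 8 syntax, with a hypothetical MySQL table sessions(id, data, updated_at)) might look like this:

    class DbSessionHandler implements SessionHandlerInterface
    {
        public function __construct(private PDO $pdo) {}

        public function open(string $path, string $name): bool { return true; }
        public function close(): bool { return true; }

        public function read(string $id): string
        {
            $stmt = $this->pdo->prepare('SELECT data FROM sessions WHERE id = ?');
            $stmt->execute([$id]);
            return (string) $stmt->fetchColumn();   // '' when the session does not exist yet
        }

        public function write(string $id, string $data): bool
        {
            $stmt = $this->pdo->prepare(
                'REPLACE INTO sessions (id, data, updated_at) VALUES (?, ?, NOW())'
            );
            return $stmt->execute([$id, $data]);
        }

        public function destroy(string $id): bool
        {
            return $this->pdo->prepare('DELETE FROM sessions WHERE id = ?')->execute([$id]);
        }

        public function gc(int $max_lifetime): int|false
        {
            $stmt = $this->pdo->prepare(
                'DELETE FROM sessions WHERE updated_at < NOW() - INTERVAL ? SECOND'
            );
            $stmt->execute([$max_lifetime]);
            return $stmt->rowCount();
        }
    }

    // Both Apache servers register the same handler, so a session started on one
    // server is readable from the other and no sticky sessions are needed.
    session_set_save_handler(new DbSessionHandler($pdo), true);
    session_start();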
The next thing that you will run into is that you will lose the original requestor's IP address. This is because HAProxy (the load balancer) terminates the TCP session and then creates a new TCP session to Apache. In order to continue to see the original requestor's IP address, you will need to look at using something like X-Forwarded-For. In the HAProxy config the option is:
option forwardfor
The last thing that you are likely to run into is how HAProxy handles keep-alives. HAProxy has ACLs, rules that determine where to route the traffic. If keep-alives are enabled, HAProxy will only make the decision on where to send traffic based on the first request.
For example, let's say you have two paths and you want to send traffic to two different server farms (backends):
somedomain/foo -> BACKEND_serverfarm-foo
somedomain/bar -> BACKEND_serverfarm-bar
The first request for somedomain/foo goes to BACKEND_serverfarm-foo. The next request for somedomain/bar also goes to BACKEND_serverfarm-foo. This is because HAProxy only processes the ACLs for the first request when keep-alives are used. This may not be an issue for you because you only have 2 Apache servers, but if it is, then you will need to have HAProxy terminate the keep-alive session. HAProxy has several options for this, but these two make the most sense in this scenario:
option forceclose
option http-server-close
The high-level difference is that forceclose closes both the server-side and the client-side keep-alive sessions. http-server-close only closes the server-side keep-alive session, which allows the client to maintain a keep-alive with HAProxy.
Here it gets a little complicated. I'm in the last few months of finishing a larger web-based project, and since I'm trying to keep the budget low (and learn some stuff myself), I'm now touching an issue that I have never touched before: load balancing with NGINX, and scalability for the future.
The setup is the following:
1 Web server
1 Database server
1 File server (also used to store backups)
Using PHP 5.4+ over FastCGI
Now, all those servers should be 'scalable' - in the sense that I can add a new file server if the free disk space is getting low, or a new web server if I need to handle more requests than expected.
Another thing is: I would like to do everything over one domain, so that the access to different backend servers isn't really noticed in the frontend (some backend servers are basically called via subdomain - for example, the file server over 'http://file.myserver.com/...', where load balancing happens only between the file servers).
Do I need an additional, separate server for load balancing? Or can I just use one of the web servers? If so:
How much power (CPU / RAM) do I require for such a load-balancing server? Does it have to be the same as the web server, or is it enough to have a 'lighter' server for that?
Does the 'load balancing' server have to be scalable too? Will I need more than one if there are too many requests?
How exactly does the whole load balancing work anyway? What I mean:
I've seen many entries stating that there are some problems like session handling/synchronisation on load-balanced systems. I could find 2 solutions that might fit my needs: either the user is always directed to the same machine, or the data is stored in a database. But with the second, I would basically have to rebuild parts of the $_SESSION functionality PHP already has, right? (How do I know which user gets which session, are cookies really enough?)
What problems do I have to expect, except the unsynchronized sessions?
Write scalable code - that's a sentence I read a lot. But in terms of PHP, for example, what does it really mean? Usually, all the calculations for one user happen on one server only (the one NGINX directed the user to) - so how can PHP itself be scalable, since it's not actually redirected by NGINX?
Are different 'load balancing' pools possible? What I mean is that all file servers are in one 'pool' and all web servers are in another 'pool', and basically, if you request an image on a file server that has too much to do, it redirects to a less busy file server.
SSL - I'll only need one certificate for the load-balancing server, right? Since the data always goes back over the load-balancing server - or how exactly does that work?
I know it's a huge question - basically, I'm really just searching for some advice and a bit of a helping hand; I'm a bit lost in the whole thing. I can read snippets that partially answer the above questions, but really 'doing' it is completely another thing. So I already know that there won't be a clear, definitive answer, but maybe some experiences.
The end goal is to be easily scalable in the future, and to plan ahead for it (and even buy things like the load-balancer server) in time.
You can use one of the web servers for load balancing, but it'll be more reliable to set up the balancing on a separate machine. If your web servers don't respond very quickly and you're getting many requests, the load balancer will put the requests in a queue. For a big queue you need a sufficient amount of RAM.
You don't generally need to scale a load balancer.
Alternatively, you can create two or more A (address) records for your domain, each pointing to a different web server's address. This gives you 'DNS load balancing' without a balancing server. Consider this option.
I have the following question regarding the memcached module in PHP:
Intro:
We're using the module to prevent the same queries from being sent to the database server, on a site with 500+ users at any given moment.
Sometimes (very rarely) the memcached process becomes defunct and all active users start sending queries to the database, so everything stops working.
Question:
I know that memcached supports multiple servers, but I want to know what happens when one of them dies. Is there some sort of balancer in the background that can say "Oh! Server 1 is dead. I'll send everything to server 2 until server 1 comes back online.", or is the load sent equally to each one?
Possible solutions:
I need to know this, because if it's not supported, our sysadmin can set the current memcached server up as a load balancer and balance the load between several other servers.
Should I ask him to set up the load balancing manually, or is this feature supported by default, and what are the risks of both methods?
Thank you!
You add multiple servers in your PHP script, not in memcached's configuration.
When you use Memcached::addServers(), you can specify a weight for every server. In your case, you might set one memcached server's weight higher than the other and have the second act only as a failover.
Using Memcached::setOption(), you can set how often a connection should be retried and set the timeout. If you know your memcached servers die a lot, it might be worth setting these lower than the defaults, but it shouldn't be necessary.
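A small sketch of what that can look like, with made-up host names and deliberately conservative, illustrative option values:

    $mc = new Memcached();

    // Weight 100 vs. 1: almost all keys hash to the primary server,
    // and the second box mostly sits idle as a spare.
    $mc->addServers([
        ['memcache1.example.com', 11211, 100],
        ['memcache2.example.com', 11211, 1],
    ]);

    // Fail over to the remaining server when one stops responding,
    // and don't hang the page waiting for a dead box.
    $mc->setOption(Memcached::OPT_LIBKETAMA_COMPATIBLE, true);  // consistent hashing
    $mc->setOption(Memcached::OPT_CONNECT_TIMEOUT, 100);        // in milliseconds
    $mc->setOption(Memcached::OPT_RETRY_TIMEOUT, 10);           // seconds before retrying a dead server
    $mc->setOption(Memcached::OPT_SERVER_FAILURE_LIMIT, 3);
    $mc->setOption(Memcached::OPT_REMOVE_FAILED_SERVERS, true);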
This could be a question for serverfault as well, but it also includes topics from here.
I am building a new web site that consists of 6 servers: 1 MySQL, 1 web, 2 file-processing servers, 2 file servers. In short, the file-processing servers process files and copy them to the file servers. In this case I have two options;
I can set up a web server on each file server and serve files directly from there, like file1.domain.com/file.zip. Some files (not all of them) will need authentication, so I will authenticate users via memcache from those servers. 90% of the requests won't need any authentication.
Or I can set up NFS and serve files directly from the web server, like www.domain.com/fileserve.php?id=2323 (it's a basic example).
As the project is heavily based on the files, the second option might not be as effective as the first, as it will consume more memory (even if I split files into chunks while serving them).
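Roughly what I have in mind for fileserve.php, heavily simplified (the path lookup and the 8 KB chunk size are just placeholders):

    // Hypothetical lookup: map the id to a path on the NFS mount.
    $id   = (int) ($_GET['id'] ?? 0);
    $path = '/mnt/fileserver/' . $id . '.zip';   // a real app would consult the database here

    if (!is_file($path)) {
        http_response_code(404);
        exit;
    }

    header('Content-Type: application/octet-stream');
    header('Content-Length: ' . filesize($path));
    header('Content-Disposition: attachment; filename="' . basename($path) . '"');

    // Stream in small chunks so a large download never sits fully in PHP's memory.
    $fh = fopen($path, 'rb');
    while (!feof($fh)) {
        echo fread($fh, 8192);
        flush();
    }
    fclose($fh);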
The setup will stay the same for a long time, so we won't be adding new file servers into the setup.
What are your ideas, which one is better? Or do you have a different idea?
Thanks in advance,
Just me, but I would actually put a set of reverse proxy rules on the "web server" and then proxy HTTP requests (possibly load balanced if they have equal filesystems) back to a lightweight HTTP server on the file servers.
This gives you flexibility and the ability to implement future caching, logging, filter chains, rewrite rules, authentication, etc. I find having a fronting web server as a proxy layer a very effective solution.
I recommend your option #1: allow the file servers to act as web servers. I have personally found NFS to be a little flaky when used under high volume.
You can also use a content delivery network such as simplecdn.com; they can solve bandwidth and server-load issues.