Load balancing and APC - PHP

I am interested in a scenario where the web servers serving a PHP application are set up behind a load balancer.
There will be multiple web servers with APC behind the load balancer. Every request has to go through the load balancer, which then sends it to one of the web servers to process.
I understand that memcached should be used for distributed caching, but I think having the APC cache on each machine hold things like application configuration and other objects that will NOT differ across any of the servers would yield even better performance.
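For illustration, this is the kind of per-server caching I have in mind (just a sketch; the 'app_config' key and the loadConfigurationFromDisk() helper are made-up names):

$config = apc_fetch('app_config', $hit);
if (!$hit) {
    $config = loadConfigurationFromDisk();   // e.g. parse the config file once
    apc_store('app_config', $config, 3600);  // keep it in this server's APC for an hour
}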
There is also an administrator area for this application. It is also accessed via the load balancer (for example, site.com/admin). In a case like this, how can I call apc_clear_cache to clear the APC object cache on ALL servers?

Externally, you have a single public IP that routes all requests to your load balancer, which distributes load round robin, so from outside you cannot ask each server to clear its cache one at a time because you never know which server will handle any given request. However, within your network each machine has its own internal IP and can be called directly. Knowing this, you can do something slightly unusual that still works when triggered externally.
A solution I like is to be able to hit a single URL and get everything done, such as http://www.mywebsite/clearcache.php or something like that. If you like that as well, read on. Remember that you can put this behind authentication so only your admin can hit it, or protect it however you prefer.
You could set it up so that a single external request clears the cache on all servers: whichever server receives the request runs the same logic to tell every server to clear its cache. This sounds weird and a bit Frankenstein-like, but here is the logic, assuming we have three servers with internal IPs 10.232.12.1, 10.232.12.2 and 10.232.12.3:
1) All servers have two files, "initiate_clear_cache.php" and "clear_cache.php", which are identical copies on every server.
2) "initiate_clear_cache.php" does a file_get_contents against "clear_cache.php" on each machine in the network, including its own,
for example:
file_get_contents('http://10.232.12.1/clear_cache.php');
file_get_contents('http://10.232.12.2/clear_cache.php');
file_get_contents('http://10.232.12.3/clear_cache.php');
3) "clear_cache.php" does the actual cache clearing for its respective machine.
4) Now you only need to make a single request, such as http://www.mywebsite/initiate_clear_cache.php, and you are done.
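A rough sketch of both files (assuming APC is installed and that the two scripts are only reachable from inside your network or behind authentication):

// initiate_clear_cache.php - hit this once and it fans out to every server
$servers = array('10.232.12.1', '10.232.12.2', '10.232.12.3');
foreach ($servers as $ip) {
    echo $ip . ': ' . file_get_contents('http://' . $ip . '/clear_cache.php') . "\n";
}

// clear_cache.php - clears the APC caches on the machine that serves it
apc_clear_cache();        // opcode cache
apc_clear_cache('user');  // user/object cache
echo 'cleared';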
Let me know if this works for you. I've done something similar in .NET and Node.js but haven't tried it in PHP yet; I'm sure the concept is the same. :)

Related

How do I share objects between multiple GET requests in PHP?

I created a small and very simple REST-based webservice with PHP.
This service gets data from a different server and returns the result. It's more like a proxy than a full service.
Client --(REST call)--> PHP Webservice --(Relay call)--> Remote server
Client <--(Return data)-- PHP Webservice <--(Return data)-- Remote server
In order to keep costs as low as possible I want to implement a caching table on the PHP webservice system by keeping data in server memory for a period of time and only re-requesting the data after a timeout (let's say after 30 mins).
In pseudo-code I basically want to do this:
$id = $_GET["id"];
$result = null;
if (isInCache($id) && !cacheExpired($id, 30)) {
    $result = getFromCache($id);
} else {
    $result = getDataFromRemoteServer($id);
    saveToCache($id, $result);
}
printData($result);
The code above should get data from a remote server, identified by an id. If the data is in the cache and 30 minutes have not yet passed, it should be read from the cache and returned as the result of the webservice call. Otherwise, the remote server should be queried.
While thinking about how to do this I realized two important aspects:
I don't want to use filesystem I/O operations because of performance concerns. Instead, I want to keep the cache in memory. So, no MySQL or local file operations.
I can't use sessions because the cached data must be shared across different users, browsers and internet connections worldwide.
So, if I could somehow share objects in memory between multiple GET requests, I think I would be able to implement this caching system pretty easily.
But how could I do that?
Edit: I forgot to mention that I cannot install any modules on that PHP server. It's a pure "webhosting-only" service.
I would not implement the cache at the (PHP) application level. REST is HTTP, therefore you should use a caching HTTP proxy between the internet and the web server. The web server and the proxy could live on the same machine until the application grows (if you are worried about costs).
I see two fundamental problems when it comes to application or server level caching:
using memcached would lead to a situation where each user session is bound to the physical server where that memcache instance lives. This makes horizontal scaling a lot more complicated (and expensive)
software should be developed in layers. Caching should not be part of the application layer (and/or business logic); it is a different layer using specialized components. And as there are well-known solutions for this (an HTTP caching proxy), they should be used in favour of self-crafted solutions.
Well, if you do have to use PHP, you cannot modify the server, and you do want in-memory caching for performance reasons (without first measuring whether any other solution performs well enough), then the only solution left for you is to change the web hosting.
Otherwise, you won't be able to do it. PHP does not really have any memory-sharing facilities available. The usual approach is to use Memcached or Redis or something else that runs separately.
And as a starter and proof of concept, I'd really go with a file-based cache. Accessing a file instead of requesting a remote resource is WAY faster. In fact, you'd probably not notice the difference between a file cache and a memory cache.
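Something along these lines, as a sketch only (the function names mirror the pseudo-code above; the temp-dir location and serialize()-based storage are my assumptions):

function cacheFile($id) {
    return sys_get_temp_dir() . '/cache_' . md5($id);
}
function isInCache($id) {
    return is_file(cacheFile($id));
}
function cacheExpired($id, $minutes) {
    return (time() - filemtime(cacheFile($id))) > $minutes * 60;
}
function getFromCache($id) {
    return unserialize(file_get_contents(cacheFile($id)));
}
function saveToCache($id, $result) {
    file_put_contents(cacheFile($id), serialize($result));
}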

What to care about when using a load balancer?

I have a web application written in PHP that is already deployed on an Apache server and works perfectly.
The application uses MySQL as its database, and sessions are saved in a memcached server.
I am planning to move to an HAProxy environment with 2 servers.
What I know: I will deploy the application to the servers and configure HAProxy.
My question is: is there anything I have to take care of or change in the code?
It depends.
Are you trying to solve a performance or redundancy problem?
If your database (MySQL) and session handler (memcached) are running on one or more servers separate from the two Apache servers, then the only major thing your code will have to do differently is manage the forwarded IP addresses (via X-FORWARDED-FOR), and HAProxy will happily round robin your requests between Apache servers.
If your database and session handler are currently running on the same server, then you need to decide if the performance or redundancy problem you are trying to solve is with the database, the session management, or Apache itself.
The easiest solution, for a performance problem with a database/session-heavy web app, is to simply start by putting MySQL and memcached on the second server to separate your concerns. If this solves the performance problem you were having with one server, then you could consider the situation resolved.
If the above solution does not solve the performance problem, and you notice that Apache is having trouble serving your website files, then you would have the option of a "hybrid" approach where Apache would exist on both servers, but then you would also run MySQL/memcached on one of the servers. If you decided to use this approach, then you could use HAProxy and set a lower weight to the hybrid server.
If you are attempting to solve a redundancy issue, then your best bet will be to isolate each piece into logical groups (e.g. database cluster, memcached cluster, Apache cluster, and a redundant HAProxy pair), and add redundancy to each logical group as you see fit.
The biggest issue you are going to run into is PHP sessions. By default, PHP sessions maintain state on a single server. When you add the second server into the mix and start load balancing connections across both of them, a PHP session created on one server will not be valid on the other server that gets hit.
Load balancers like HAProxy expect a "stateless" application. To make PHP stateless you will more than likely need to use a different mechanism for your sessions. If you do not or cannot make your application stateless, you can configure HAProxy to do sticky sessions, either off of cookies or stick tables (source IP, etc.).
The next thing you will run into is that you will lose the original requestor's IP address. This is because HAProxy (the load balancer) terminates the TCP session and then creates a new TCP session to Apache. In order to keep seeing the original requestor's IP address you will need to use something like X-Forwarded-For. In the HAProxy config the option is:
option forwardfor
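On the PHP side the original address then shows up in the X-Forwarded-For header instead of REMOTE_ADDR. A sketch of reading it (only trust the header when the request really did come through your load balancer):

$clientIp = $_SERVER['REMOTE_ADDR']; // behind HAProxy this is the balancer's IP
if (!empty($_SERVER['HTTP_X_FORWARDED_FOR'])) {
    // the header can be a comma-separated chain; the first entry is the original client
    $parts = explode(',', $_SERVER['HTTP_X_FORWARDED_FOR']);
    $clientIp = trim($parts[0]);
}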
The last thing you are likely to run into is how HAProxy handles keep-alives. HAProxy has ACLs, rules that determine where to route the traffic. If keep-alives are enabled, HAProxy will only make the routing decision based on the first request.
For example, let's say you have two paths and you want to send traffic to two different server farms (backends):
somedomain/foo -> BACKEND_serverfarm-foo
somedomain/bar -> BACKEND_serverfarm-bar
The first request for somedomain/foo goes to BACKEND_serverfarm-foo. The next request on the same connection for somedomain/bar also goes to BACKEND_serverfarm-foo, because HAProxy only processes the ACLs for the first request when keep-alives are used. This may not be an issue for you because you only have 2 Apache servers, but if it is, then you will need to have HAProxy terminate the keep-alive session. HAProxy has several options for this, but these two make the most sense in this scenario:
option forceclose
option http-server-close
The high-level difference is that forceclose closes both the server-side and the client-side keep-alive sessions, while http-server-close only closes the server-side keep-alive session, which allows the client to maintain a keep-alive with HAProxy.
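Putting the pieces together, a minimal haproxy.cfg fragment for the example above might look roughly like this (sketch only; backend server names and addresses are placeholders):

frontend www
    bind *:80
    option forwardfor          # append X-Forwarded-For with the client IP
    option http-server-close   # close the server-side keep-alive so ACLs run per request
    acl is_foo path_beg /foo
    acl is_bar path_beg /bar
    use_backend BACKEND_serverfarm-foo if is_foo
    use_backend BACKEND_serverfarm-bar if is_bar

backend BACKEND_serverfarm-foo
    server foo1 10.0.0.1:80 check

backend BACKEND_serverfarm-bar
    server bar1 10.0.0.2:80 check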

Using Laravel behind a load balancer

I have been working on a Laravel 4 site for a while now and the company just put it behind a load balancer. Now when I try to log in it basically just refreshes the page. I tried using fideloper's proxy package at https://github.com/fideloper/proxy but see no change. I even opened it up to allow all IP addresses with proxies => '*'. I need some help with what needs to be done to get Laravel to work behind a load balancer, especially with sessions. Please note that I am using Laravel's database session driver.
The load balancer is a KEMP LM-3600.
Thank you to everyone for the useful information you provided. After further testing I found that the reason this wasn't working is that we force https through the load balancer but allow http when not going through it. The login form was actually posting to http instead of https, so the form posted but the session data never made it back to the client. Changing the form to post to https fixed the issue.
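For anyone hitting the same thing, the change boiled down to generating an https action for the form, for example (Laravel 4 sketch; 'login' is a placeholder route):

{{-- build the form action as an absolute https URL; URL::to()'s third argument forces the secure scheme --}}
{{ Form::open(array('url' => URL::to('login', array(), true))) }}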
We use a load balancer where I work and I ran into similar problems accessing cPanel dashboards: the page would just reload every time I tried to open a section and log me out, because my IP address kept changing from their point of view. The solution was to find which port cPanel was using and configure the load balancer to bind that port to one WAN. Sorry, I am not familiar with Laravel, and if it is just using port 80 then this might not be a solution.
Note that the session handling in Laravel 4 uses Symfony 2 code, which lacks proper session locking in all self-coded handlers that do not use the PHP provided session save handlers like "files", "memcached" etc.
This will create errors when used in a web application with parallel requests, such as Ajax, but that happens regardless of any load balancer.
You really should do some more investigation. HTTP load balancers do have some impact on the information flow, but the only effect on a PHP application would be that a single user surfing the site will send requests at random to any one of the connected servers, and not always to the same one.
Do you also use any fancy database setup, like master-slave replication? That would more likely affect sessions, if writes only go to the master, reads only come from a slave, and that slave lags behind the master on the last write operation. Such a configuration is not recommended as session storage. I'd rather use Memcached instead; its PHP session save handler does implement proper locking as well...
Using fideloper's proxy does not make sense. A load balancer should be transparent to the web server, i.e. it should not act as a reverse proxy unless configured to do so.
Use a shared resource to store the session data. File and memcached will surely not work. DB should be OK. That's what I'm using on a load balanced setup with a common database.
I have been using TrustedProxy for a while now and it's working fine.
The main issue with load balancers is proxy routing. The following is from the README file and it's what I was looking for:
If your site sits behind a load balancer, gateway cache or other "reverse proxy", each web request has the potential to appear to always come from that proxy, rather than the client actually making requests on your site.
To fix that, this package allows you to take advantage of Symfony's knowledge of proxies. See below for more explanation on the topic of "trusted proxies".

Load Balancing - How to set it up correctly?

Here it gets a little complicated. I'm in the last few months of finishing a larger web-based project, and since I'm trying to keep the budget low (and learn some stuff myself) I'm now touching an issue I've never touched before: load balancing with NGINX, and scalability for the future.
The setup is the following:
1 Web server
1 Database server
1 File server (also used to store backups)
Using PHP 5.4+ over FastCGI
Now, all those servers should be 'scalable' - in the sense that I can add a new file server if free disk space is getting low, or a new web server if I need to handle more requests than expected.
Another thing: I would like to do everything over one domain, so that access to the different backend servers isn't really noticeable in the frontend (some backend servers are basically called via subdomain - for example the file server, over 'http://file.myserver.com/...', where load balancing happens only between the file servers).
Do I need an additional, separate server for load balancing? Or can I just use one of the web servers? If I do need a separate one:
How much power (CPU/RAM) does such a load-balancing server require? Does it have to match the web server, or is a 'lighter' server enough?
Does the load-balancing server have to be scalable too? Will I need more than one if there are too many requests?
How exactly does the whole load balancing work anyway? What I mean:
I've seen many entries stating that there are problems like session handling/synchronisation on load-balanced systems. I could find two solutions that might fit my needs: either the user is always directed to the same machine, or the session data is stored in a database. But with the second, I would basically have to rebuild parts of the $_SESSION functionality PHP already has, right? (How do I know which user gets which session - are cookies really enough?)
What problems do I have to expect, except the unsynchronized sessions?
Write scalable code - that's a sentence I read a lot. But in terms of PHP, for example, what does it really mean? Usually, all the calculations for one user happen on one server only (the one NGINX directed the user to) - so how can PHP itself be scalable, since it's not actually redirected by NGINX?
Are different 'load balancing' pools possible? What I mean is that all file servers are in one 'pool' and all web servers are in another 'pool', and if you request an image from a file server that has too much to do, it redirects to a less busy file server.
SSL - I'll only need one certificate, for the load-balancing server, right? Since the data always goes back over the load-balancing server - or how exactly does that work?
I know it's a huge question - basically, I'm really just looking for some advice and a bit of a helping hand; I'm a bit lost in the whole thing. I can read snippets that partially answer the above questions, but really 'doing' it is another thing entirely. So I already know there won't be a clear, definitive answer, but maybe some experiences.
The end target is to be easily scalable in the future and to plan ahead for it (and even buy things like the load balancer server) in time.
You can use one of the web servers for load balancing, but it is more reliable to do the balancing on a separate machine. If your web servers do not respond very quickly and you are getting many requests, the load balancer will put the requests in a queue; for a big queue you need a sufficient amount of RAM.
You don't generally need to scale a load balancer.
Alternatively, you can create two or more A (address) records for your domain, each pointing to a different web server's address. This gives you 'DNS load balancing' without a balancing server. Consider this option.
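Sketch of what those records would look like in the zone file (the addresses are placeholders):

example.com.   300   IN   A   203.0.113.10
example.com.   300   IN   A   203.0.113.11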

PHP Code on Separate Server From Apache?

This is something I've never seen done, and I'm not turning anything up in my research, but my boss is interested in the idea. We are looking at some load balancing options and wonder if it is possible to have Apache and PHP installed on multiple servers, managed by a load balancer, but have all the actual PHP code on one server, with the various Apache servers pointing to the one central code base?
NFS mounts, for instance, are certainly possible, but I wouldn't recommend them. A lot of the advantage of load balancing is lost, and you're reintroducing a single point of failure. For syncing code, an rsync cronjob can handle it very nicely, or a manual rsync can be done on deployment.
What is the reason you would want this central code base? I'm about 99% sure there is a better solution than a single server dishing out code.
I believe it is possible. To add to Wrikken's answer, I can imagine NFS could be a good choice. However, there are some drawbacks and caveats. For one, when Apache tries to access files on an NFS share that has gone away (connection dropped, host failed, etc.) very bad things happen. Apache locks up and keeps trying to retrieve the file; the processes attempting to access the share, for whatever reason, do not die, and it becomes necessary to reboot the server.
If you do wind up doing this, I would recommend an opcode cache such as APC. APC will cache the pre-processed PHP locally and eliminate round trips to your storage. Just be prepared to clear the opcode cache whenever you update your application.
PHP has to run under something that acts as a web processor; Apache is the most popular. I've done NFS mounts across servers without problems. Chances are that if NFS is down, the network is down. But it doesn't take long to do an rsync across servers to replicate files, and that is really a better idea.
I'm not sure what your content is like, but you can separate static files like JavaScript, CSS and images so they are on their own server. lighttpd is a good, lightweight web server for things like this. Then you end up with a "dedicated" PHP server. You don't even need a load balancer for this setup.
Keep in mind that PHP stores sessions on the local filesystem by default. So if you are using sessions, you need to make sure users always return to the same server. Otherwise you need to do something like store sessions in memcache.
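For example, with the memcached extension installed, pointing the session handler at a shared memcached instance is just configuration (sketch; the host and port are placeholders):

// store sessions in a memcached instance reachable from every web server
ini_set('session.save_handler', 'memcached');
ini_set('session.save_path', '10.0.0.5:11211');
session_start();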
