Memcached on multiple servers in PHP

I have the following question regarding the memcached module in PHP:
Intro:
We're using the module to prevent the same queries from being sent to the database server repeatedly, on a site with 500+ concurrent users at any moment.
Sometimes (very rarely) the memcached process becomes defunct, and all active users start sending queries straight to the database, so everything grinds to a halt.
Question:
I know that memcached supports multiple servers, but I want to know what happens when one of them dies. Is there some kind of balancing built in that can say "server 1 is dead, I'll send everything to server 2 until server 1 comes back online", or is the load always spread equally across all of them?
Possible solutions:
I need to know this because, if it's not supported, our sysadmin can set up the current memcached server as a load balancer that balances the load between several other servers.
Should I ask him to set up the load balancing manually, or is this feature supported by default, and what are the risks of each approach?
Thank you!

You add multiple servers in your PHP script, not in Memcached's configuration.
When you use Memcached::addServers(), you can specify a weight for every server. In your case, you might give one server a higher weight than the other so that the second one mostly acts as a failover.
Using Memcached::setOption(), you can control how often a dead server is retried and set the connect timeout. If you know your memcached servers die a lot, it might be worth setting these lower than the defaults, but it shouldn't be necessary.
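A minimal sketch of that setup, assuming hypothetical host names, with weights and timeouts chosen purely for illustration:
// Weighted pool: most traffic goes to the first server; the second mostly acts as a failover.
$mc = new Memcached();
$mc->setOption(Memcached::OPT_CONNECT_TIMEOUT, 500);    // milliseconds before a connect attempt fails
$mc->setOption(Memcached::OPT_RETRY_TIMEOUT, 5);        // seconds to wait before retrying a dead server
$mc->setOption(Memcached::OPT_SERVER_FAILURE_LIMIT, 2); // failed attempts before a server is marked dead
$mc->addServers(array(
    array('memc1.example.com', 11211, 75), // host, port, weight
    array('memc2.example.com', 11211, 25),
));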

Related

PHP/Mysqli caching mysql host? Not respecting low TTL

We're using PHP on AWS, with RDS/Aurora.
This works by exposing an endpoint for the cluster, which is a CNAME record pointing at the currently active MySQL node.
As we add/remove reader nodes, or in the case of a failover, this endpoint is updated automatically, with a 5-second TTL.
As such, our app should see and respond to the new nodes very quickly.
We're noticing that after a failover we get 'MySQL server has gone away' for much longer than the 5-second period. We've had instances of this lasting 30 minutes, at which point we restarted Apache and the issue resolved.
It seems as though, somewhere in the application, the endpoint's DNS is not being re-resolved, so connections still point at a node that is no longer there.
We do use persistent connections (for performance), which were the obvious culprit; however, we've tested with them turned off and the same behaviour persists.
We're using PHP 7.1 with mysqli. We have a singleton class around the mysqli connection, but even if this kept the same connection open, it would only last for the duration of a single script execution, which is typically a few milliseconds.
Any guidance as to where the caching may be occurring?
It's unclear if your issue is DNS related on the remote services or caching on your own (AWS) local network/services. This is the first thing to look into.
To my knowledge, Linux does not cache DNS lookups by default (unless something like nscd or systemd-resolved is in use), and neither does Apache/PHP (unless you're using mod_proxy, in which case look into the disablereuse=On setting). With this in mind, I expect that your Apache restart appearing to fix things was likely a coincidence.
My first suggestion would be to force a failover and then immediately query the name servers from several different geographic locations, as well as from a terminal on your AWS server, to see how long it takes each one to report updated results. The chances are that resolvers are simply ignoring your TTL, or perhaps just treating it as a suggestion.
The long and the short of it is that a DNS TTL is just a hint to resolving name servers about how long to cache. There is nothing forcing resolvers to abide by what you set, and in reality many don't follow your setting exactly, or even at all.
If the name servers update as quickly as expected elsewhere, but not on your AWS server, and MySQL still can't connect, this suggests caching somewhere on your server or, more likely, within the AWS network. Unless the caching is directly on your server, which as discussed above I believe is unlikely, there probably isn't much you can do about it.
Ultimately, updating a DNS record with a low TTL as a failover mechanism is likely never going to achieve consistently sub-minute failover.
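If you need a code-level stopgap while you investigate, one possibility is to resolve the endpoint yourself immediately before connecting, so every new connection uses whatever DNS currently returns. A rough sketch with a placeholder endpoint; note that gethostbyname() still goes through the system resolver, and that connecting by IP will break TLS hostname verification if you use it:
// Force a fresh lookup per connection attempt and retry once on failure.
function connect_fresh($endpoint, $user, $pass, $dbname)
{
    for ($attempt = 0; $attempt < 2; $attempt++) {
        $ip = gethostbyname($endpoint);               // re-resolve on every attempt
        $link = @mysqli_connect($ip, $user, $pass, $dbname);
        if ($link !== false) {
            return $link;
        }
        usleep(250000);                               // 250 ms before the retry
    }
    throw new RuntimeException("Could not connect to $endpoint");
}

$conn = connect_fresh('mycluster.cluster-xyz.us-east-1.rds.amazonaws.com', $user, $pass, 'app');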
You may want to look into alternative methods such as ClusterControl or a proxy method such as ProxySQL.

What to care about when using a load balancer?

I have a web application written in PHP that is already deployed on an Apache server and works perfectly.
The application uses MySQL as its database, and sessions are saved in a memcached server.
I am planning to move to an HAProxy environment with 2 servers.
What I know: I will deploy the application to the servers and configure HAproxy.
My question is: is there anything I have to take care of or change in the code?
It depends.
Are you trying to solve a performance or redundancy problem?
If your database (MySQL) and session handler (memcached) are running on one or more servers separate from the two Apache servers, then the only major thing your code will have to do differently is manage the forwarded IP addresses (via X-FORWARDED-FOR), and HAProxy will happily round robin your requests between Apache servers.
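On the PHP side, honouring the forwarded address might look roughly like this (the trusted proxy address is a placeholder for your HAProxy box):
// Trust X-Forwarded-For only when the direct peer is the load balancer.
$trustedProxy = '10.0.0.5'; // placeholder HAProxy address
$clientIp = $_SERVER['REMOTE_ADDR'];
if ($clientIp === $trustedProxy && !empty($_SERVER['HTTP_X_FORWARDED_FOR'])) {
    $hops = explode(',', $_SERVER['HTTP_X_FORWARDED_FOR']);
    $clientIp = trim($hops[0]); // left-most entry is the original client
}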
If your database and session handler are currently running on the same server, then you need to decide if the performance or redundancy problem you are trying to solve is with the database, the session management, or Apache itself.
The easiest solution, for a performance problem with a database/session-heavy web app, is to simply start by putting MySQL and memcached on the second server to separate your concerns. If this solves the performance problem you were having with one server, then you could consider the situation resolved.
If the above solution does not solve the performance problem, and you notice that Apache is having trouble serving your website files, then you would have the option of a "hybrid" approach where Apache would exist on both servers, but then you would also run MySQL/memcached on one of the servers. If you decided to use this approach, then you could use HAProxy and set a lower weight to the hybrid server.
If you are attempting to solve a redundancy issue, then your best bet will be to isolate each piece into logical groups (e.g. database cluster, memcached cluster, Apache cluster, and a redundant HAProxy pair), and add redundancy to each logical group as you see fit.
The biggest issue that you are going to run into is related to PHP sessions. By default, PHP sessions maintain state with a single server. When you add the second server into the mix and start load balancing connections to both of them, a PHP session created on one server will not be valid on the second server that gets hit.
Load balancers like HAProxy expect a "stateless" application. To make PHP stateless you will more than likely need to use a different mechanism for your sessions. If you do not or cannot make your application stateless, then you can configure HAProxy to do sticky sessions, either off of cookies or via stick tables (source IP, etc.), as sketched below.
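For illustration, cookie-based stickiness in the HAProxy config might look like this (backend name, server names, and addresses are placeholders):
backend apache_servers
    balance roundrobin
    cookie SERVERID insert indirect nocache
    server web1 10.0.0.11:80 check cookie web1
    server web2 10.0.0.12:80 check cookie web2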
The next thing that you will run into is that you will lose the original requester's IP address. This is because HAProxy (the load balancer) terminates the TCP session and then creates a new TCP session to Apache. To continue to see the original requester's IP address, you will need to look at using something like X-Forwarded-For. In the HAProxy config the option is:
option forwardfor
The last thing that you are likely to run into is how HAProxy handles keep-alives. HAProxy has ACLs, rules that determine where to route traffic. If keep-alives are enabled, HAProxy will only make the decision on where to send traffic based on the first request on a connection.
For example, let's say you have two paths and you want to send traffic to two different server farms (backends):
somedomain/foo -> BACKEND_serverfarm-foo
somedomain/bar -> BACKEND_serverfarm-bar
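In HAProxy configuration terms, that mapping would look roughly like this (the frontend name and bind address are assumptions):
frontend www
    bind *:80
    acl is_foo path_beg /foo
    acl is_bar path_beg /bar
    use_backend BACKEND_serverfarm-foo if is_foo
    use_backend BACKEND_serverfarm-bar if is_bar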
The first request for somedomain/foo goes to BACKEND_serverfarm-foo. The next request on the same connection, for somedomain/bar, also goes to BACKEND_serverfarm-foo. This is because HAProxy only processes the ACLs for the first request when keep-alives are used. This may not be an issue for you because you only have 2 Apache servers, but if it is, then you will need to have HAProxy terminate the keep-alive session. HAProxy has several options for this, but these two make the most sense in this scenario:
option forceclose
option http-server-close
The high-level difference is that forceclose closes both the server-side and the client-side keep-alive sessions, while http-server-close only closes the server-side keep-alive session, which allows the client to maintain a keep-alive with HAProxy.

Load Balancing - How to set it up correctly?

Here it gets a little complicated. I'm in the last few months of finishing a larger web-based project, and since I'm trying to keep the budget low (and learn some things myself), I'm now touching an issue I've never touched before: load balancing with NGINX, and scalability for the future.
The setup is the following:
1 Web server
1 Database server
1 File server (also used to store backups)
Using PHP 5.4+ over FastCGI
Now, all those servers should be 'scalable' - in the sense that I can add a new file server if free disk space is getting low, or a new web server if I need to handle more requests than expected.
Another thing: I would like to do everything over one domain, so that access to the different backend servers isn't really noticeable in the frontend. (Some backend servers are basically called via subdomain - for example the file server, over 'http://file.myserver.com/...', where load balancing happens only between the file servers.)
Do I need an additional, separate server for load balancing, or can I just use one of the web servers? If yes:
How much power (CPU / RAM) do I require for such a load-balancing server? Does it have to be the same like the webserver, or is it enough to have a 'lighter' server for that?
Does the 'load balancing' server have to be scalable too? Will I need more than one if there are too many requests?
How exactly does the whole load balancing work anyway? What I mean:
I've seen many entries stating that there are problems like session handling/synchronisation on load-balanced systems. I could find 2 solutions that might fit my needs: either the user is always directed to the same machine, or the session data is stored in a database. But with the second, I would basically have to rebuild parts of the $_SESSION functionality PHP already has, right? (How do I know which user gets which session - are cookies really enough?)
What problems do I have to expect, apart from the unsynchronised sessions?
Write scalable code - that's a sentence I read a lot. But in terms of PHP, for example, what does it really mean? Usually, all the calculations for one user happen on one server only (the one NGINX directed the user to) - so how can PHP itself be scalable, since it isn't actually distributed by NGINX?
Are different 'load balancing' pools possible? What I mean is that all file servers are in one 'pool' and all web servers are in another 'pool', and basically, if you request an image from a file server that has too much to do, it redirects to a less busy file server.
SSL - I'll only need one certificate for the load-balancing server, right? Since the data always goes back through the load-balancing server - or how exactly does that work?
I know it's a huge question - basically, I'm really just looking for some advice and a bit of a helping hand; I'm a bit lost in the whole thing. I can read snippets that partially answer the questions above, but really 'doing' it is completely another thing. So I already know that there won't be a clear, definitive answer, but maybe some experiences.
The end goal is to be easily scalable in the future, and to plan ahead for it (and even buy things like the load-balancer server) in time.
You can use one of the web servers for load balancing, but it will be more reliable to set up the balancing on a separate machine. If your web servers don't respond very quickly and you're getting many requests, the load balancer will put the requests into a queue; for a big queue you need a sufficient amount of RAM.
You don't generally need to scale a load balancer.
Alternatively, you can create two or more A (address) records for your domain, each pointing to a different web server's address. This gives you 'DNS load balancing' without a balancing server. Consider this option.
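On the session question: you don't have to rebuild $_SESSION. Since PHP 5.4 you can register a custom storage backend via session_set_save_handler() and keep using $_SESSION unchanged. A rough sketch against a shared database (the sessions table and its schema are assumptions, not something PHP provides):
// Store sessions in a shared DB so any web server behind the balancer sees them.
// Assumes a table: sessions(id PRIMARY KEY, data, updated_at).
class PdoSessionHandler implements SessionHandlerInterface
{
    private $pdo;
    public function __construct(PDO $pdo) { $this->pdo = $pdo; }
    public function open($savePath, $name) { return true; }
    public function close() { return true; }
    public function read($id)
    {
        $stmt = $this->pdo->prepare('SELECT data FROM sessions WHERE id = ?');
        $stmt->execute(array($id));
        $row = $stmt->fetch(PDO::FETCH_ASSOC);
        return $row ? $row['data'] : ''; // empty string means "no session yet"
    }
    public function write($id, $data)
    {
        $stmt = $this->pdo->prepare('REPLACE INTO sessions (id, data, updated_at) VALUES (?, ?, NOW())');
        return $stmt->execute(array($id, $data));
    }
    public function destroy($id)
    {
        return $this->pdo->prepare('DELETE FROM sessions WHERE id = ?')->execute(array($id));
    }
    public function gc($maxlifetime)
    {
        return $this->pdo->prepare('DELETE FROM sessions WHERE updated_at < NOW() - INTERVAL ? SECOND')->execute(array($maxlifetime));
    }
}

session_set_save_handler(new PdoSessionHandler($pdo), true);
session_start(); // $_SESSION now reads and writes through the shared table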

PHP Sessions to handle Multiple Servers

All,
I have a PHP5 web application written with Zend Framework and MVC. This application is installed on 2 servers with the same setup: Server X has PHP5/MySQL/Apache, and Server Y has the same. We don't have a common DB server between the two.
My application works when accessed individually via HTTPS on Server X and Server Y. But when we turn on load balancing and have both servers up, the sessions get lost.
How can I make sure my sessions persist across servers? Should I maintain my db on a third server and write sessions to it? If so, what's the easiest and most secure way to do it?
Thanks
memcached is a popular way to solve this problem. You just need to get it up and running (easy) and update your php.ini file to tell PHP to use memcached as the session storage.
In php.ini you would modify the following (the save path must point at your memcached server rather than being left empty; the host here is a placeholder):
session.save_handler = memcache
session.save_path = "tcp://127.0.0.1:11211"
For the general idea: PHP Sessions in Memcached.
There are any number of tutorials on setting up the Zend session handler to work with memcached. Take your pick.
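If you are on the newer memcached (libmemcached) extension rather than memcache, the handler name and save-path format differ slightly. A quick runtime check, with placeholder hosts:
ini_set('session.save_handler', 'memcached');
ini_set('session.save_path', 'memc1.example.com:11211,memc2.example.com:11211'); // comma-separated host:port list
session_start();
$_SESSION['probe'] = time(); // round-trips through memcached if the handler is working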
Should I maintain my db on a third server and write sessions to it?
Yes, one way to handle it is to have a third machine running the database that both web servers use for the application. I've done that for several projects in the past and it's worked well. The question with that approach is whether the bottleneck is at the web servers or at the database. If it's at the database, you won't see much improvement by throwing load balancing of the web servers into the mix; you may instead need to think about replication or mirroring schemes for the database.
Another option is to use the sticky sessions feature on your load balancer. What this will do is keep users on certain servers. So when user 1 comes to the site, they will be directed to server X. Every subsequent request will also be directed to server X. This allows you to not worry about persisting sessions between servers, as each user will continue to be directed to the server they have their session on.
The one downside of this is that when you take a web server out of the pool, half the users with a session will be logged out. So the effectiveness of this solution depends on how often you take servers out of the pool.

Should $new_link be used in mysql_connect()?

I'm maintaining an inherited site built on Drupal. We are currently experiencing "too many connections" to the database.
In the /includes/database.mysql.inc file, @mysql_connect($url['host'], $url['user'], $url['pass'], TRUE, 2) (see the mysql_connect() documentation) is used to connect to the database.
Should $new_link = TRUE be used? My understanding is that it will "always open a new link." Could this be causing the "too many connections"?
Editing core is a no-no. You'll forget you did, upgrade the version, and bam, your changes are gone.
Drupal runs several high-performance sites without problems. For instance, the Grammy Awards site switched to Drupal this year, and for the first time the site didn't go down during the ceremony! Some configuration needs tweaking on your setup, probably MySQL.
Edit your my.cnf and restart your MySQL server (/etc/my.cnf on Fedora, RHEL, and CentOS; /etc/mysql/my.cnf on *buntu):
[mysqld]
max_connections=some-number-here
Alternatively, to first try the change without restarting the server, log in to MySQL and run:
show variables like 'max_connections'; # this tells you the current number
set global max_connections = some-number-here;
Oh, and like another person said: DO. NOT. EDIT. DRUPAL. CORE. It does not pay off: if you want to keep your site updated, it will cause headaches, make upgrades inflexible, and bring you a world of hurt.
MySQL, just like any RDBMS out there, will limit the number of connections it accepts at any time. The my.cnf configuration file specifies this value for the server under the max_connections setting. You can change this configuration, but there are real limitations depending on the capacity of your server.
Persistent connections may help reduce the time it takes to connect to the database, but they have no impact on the total number of connections MySQL will accept.
Connect to MySQL and use SHOW PROCESSLIST. It will show you the currently open connections and what they are doing. You might have multiple connections sitting idle, or queries that run far too long. For idle connections, it might just be a matter of making sure your code does not keep connections open when it doesn't need them. For the long-running queries, there may be parts of your code that need to be optimized so the queries don't take too long.
If all connections are legitimate, you simply have more load than your current configuration allows for. If your MySQL load is low even with the current connection count, you can increase the limit a little and see how it evolves.
If you are not on a dedicated server, you might not be able to do much about this. It may just be someone else's code causing trouble.
Sometimes those failures are just temporary. When a connection fails, you can simply retry it a few milliseconds later. If it still fails, it might be a real problem, and stopping the script is the right thing to do. Don't put the retry in an infinite loop (seen it before, terrible idea); bound it, as in the sketch below.
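A bounded retry, in the spirit of the legacy mysql_* API the question uses, might look like this (host and credentials are placeholders):
// Try up to 3 times with a short pause instead of looping forever.
$link = false;
for ($i = 0; $i < 3 && $link === false; $i++) {
    $link = @mysql_connect('db.example.com', 'user', 'pass');
    if ($link === false) {
        usleep(50000); // wait 50 ms before the next attempt
    }
}
if ($link === false) {
    die('Could not connect: ' . mysql_error());
}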
FYI: using persistent connections with Drupal is asking for trouble. No one does that as far as I know.
The new_link parameter only has an effect if you have multiple calls to mysql_connect() during a single request, which is probably not the case here.
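To illustrate (host and credentials are placeholders):
// With $new_link left FALSE, identical calls return the already-open link:
$a = mysql_connect('db.example.com', 'user', 'pass');
$b = mysql_connect('db.example.com', 'user', 'pass');       // reuses $a's connection
// With $new_link = TRUE, every call opens a brand-new connection:
$c = mysql_connect('db.example.com', 'user', 'pass', TRUE);  // separate connection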
I suspect it is caused by too many users visiting your site simultaneously, because a new connection to the DB is made for each visitor.
If you can confirm that this is the case, mysql_pconnect() might help, because your problem is not the stress on your database server but the sheer number of connections. You should also read Persistent database connections to see whether they are applicable to your webserver setup if you choose to go this route.
