PHP sessions in a load balancing cluster - how? - php

OK, so I've got this totally rare and unique scenario of a load balanced PHP website. The bummer is - it didn't use to be load balanced. Now we're starting to get issues...
Currently the only issue is with PHP sessions. Naturally nobody thought of this issue at first so the PHP session configuration was left at its defaults. Thus both servers have their own little stash of session files, and woe is the user who gets the next request thrown to the other server, because that doesn't have the session he created on the first one.
Now, I've been reading the PHP manual on how to solve this situation. There I found the nice function session_set_save_handler(). (And, coincidentally, this topic on SO.) Neat. Except I'll have to call this function in all the pages of the website. And developers of future pages would have to remember to call it all the time as well. Feels kinda clumsy, not to mention probably violating a dozen best coding practices. It would be much nicer if I could just flip some global configuration option and voilà - the sessions all get magically stored in a DB or a memory cache or something.
Any ideas on how to do this?
Added: To clarify - I expect this to be a standard situation with a standard solution. FYI - I have a MySQL DB available. Surely there must be some ready-to-use code out there that solves this? I can, of course, write my own session saving stuff and auto_prepend option pointed out by Greg seems promising - but that would feel like reinventing the wheel. :P
Added 2: The load balancing is DNS based. I'm not sure how this works, but I guess it should be something like this.
Added 3: OK, I see that one solution is to use auto_prepend option to insert a call to session_set_save_handler() in every script and write my own DB persister, perhaps throwing in calls to memcached for better performance. Fair enough.
Is there also some way that I could avoid coding all this myself? Like some famous and well-tested PHP plugin?
Added much, much later: This is the way I went in the end: How to properly implement a custom session persister in PHP + MySQL?
Also, I simply included the session handler manually in all pages.

You could set PHP to handle the sessions in the database, so that all your servers share the same session information because they all use the same database for it.
A good tutorial for that can be found here.

The way we handle this is through memcached. All it takes is changing the php.ini similar to the following:
session.save_handler = memcache
session.save_path = "tcp://path.to.memcached.server:11211"
We use AWS ElastiCache, so the server path is a domain, but I'm sure it'd be similar for local memcached as well.
This method doesn't require any application code changes.
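One detail that often trips people up with this approach: the handler name and the save_path format depend on which PHP extension is installed. The sketch below collects the values as I understand them for pecl/memcache, pecl/memcached and phpredis; the function name, host and port are placeholders of mine, not something from this answer.

```php
<?php
/**
 * Returns the session.* ini values for a given PHP extension.
 * sessionSettingsFor() is a hypothetical helper; host and port are placeholders.
 */
function sessionSettingsFor(string $extension, string $host, int $port): array
{
    return match ($extension) {
        // pecl/memcache: handler "memcache", URL-style save_path.
        'memcache' => [
            'session.save_handler' => 'memcache',
            'session.save_path'    => "tcp://$host:$port",
        ],
        // pecl/memcached: handler "memcached", plain host:port without a scheme.
        'memcached' => [
            'session.save_handler' => 'memcached',
            'session.save_path'    => "$host:$port",
        ],
        // phpredis: handler "redis", URL-style save_path.
        'redis' => [
            'session.save_handler' => 'redis',
            'session.save_path'    => "tcp://$host:$port",
        ],
        default => throw new InvalidArgumentException("Unsupported extension: $extension"),
    };
}

// The same key/value pairs can go straight into php.ini, or be applied
// at runtime before session_start() (the extension must be loaded):
// foreach (sessionSettingsFor('memcached', 'cache.example.internal', 11211) as $key => $value) {
//     ini_set($key, $value);
// }
```

Putting the values in php.ini keeps the "no application code changes" property; the ini_set() variant is only useful when you cannot edit the server configuration.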

You don't mention what technology you are using for load balancing (software, hardware etc.); but in any case, the solution to your problem is to employ "sticky sessions" on the load balancer.
In summary, this means that when the first request from a "new" visitor comes in, they are assigned a specific server from the cluster: all future requests for the lifetime of their session are then directed to that server. In practice this means that applications written to work on a single server can be up-scaled to a balanced environment with zero/few code changes.
If you are using a hardware balancer, such as a Radware device, then sticky sessions are configured as part of the cluster setup. Hardware devices usually give you more fine-grained control, such as which server a new user is assigned to (they can check health status etc. and pick the most healthy / least utilised server), and more control over what happens when a server fails and drops out of the cluster. The drawback of hardware balancers is the cost - but they are worth it imho.
As for software balancers, it comes down to what you are using. For Apache there is the stickysession property on mod_proxy - and plenty of articles via google to get this working with the php session ( for example )
Edit:
From other comments posted after the original question, it sounds like your "balancing" is done via Round Robin DNS, so the above probably won't apply. I'll refrain from commenting further and starting a flame against round robin dns.

The easiest thing to do is configure your load balancer to always send the same session to the same server.
If you still want to use session_set_save_handler then maybe take a look at auto_prepend.
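For reference, here is a minimal sketch of what a database-backed handler registered from a prepend file could look like. It assumes PHP 8 and PDO; the class name, the sessions table layout, the DSN and the credentials are all made up for illustration. It also does no row locking, so two concurrent requests for the same session can overwrite each other's changes.

```php
<?php
/**
 * Sketch of a database-backed session handler (PHP 8+ signatures).
 * Assumed table, not from this thread:
 *   CREATE TABLE sessions (
 *     id VARCHAR(128) PRIMARY KEY, data TEXT NOT NULL, updated_at INT NOT NULL
 *   )
 */
final class PdoSessionHandler implements SessionHandlerInterface
{
    public function __construct(private PDO $pdo) {}

    public function open(string $path, string $name): bool { return true; }

    public function close(): bool { return true; }

    public function read(string $id): string|false
    {
        $stmt = $this->pdo->prepare('SELECT data FROM sessions WHERE id = ?');
        $stmt->execute([$id]);
        $data = $stmt->fetchColumn();
        // An unknown session id must yield an empty string, not false.
        return $data === false ? '' : (string) $data;
    }

    public function write(string $id, string $data): bool
    {
        // REPLACE INTO is MySQL syntax (SQLite happens to accept it too).
        $stmt = $this->pdo->prepare(
            'REPLACE INTO sessions (id, data, updated_at) VALUES (?, ?, ?)'
        );
        return $stmt->execute([$id, $data, time()]);
    }

    public function destroy(string $id): bool
    {
        return $this->pdo->prepare('DELETE FROM sessions WHERE id = ?')->execute([$id]);
    }

    public function gc(int $max_lifetime): int|false
    {
        // Delete sessions that have not been written for $max_lifetime seconds.
        $stmt = $this->pdo->prepare('DELETE FROM sessions WHERE updated_at < ?');
        $stmt->execute([time() - $max_lifetime]);
        return $stmt->rowCount();
    }
}

// In the file that php.ini's auto_prepend_file points at (hypothetical DSN):
// $pdo = new PDO('mysql:host=db.example.internal;dbname=app', 'user', 'secret');
// session_set_save_handler(new PdoSessionHandler($pdo), true);
// session_start();
```

With the registration living in the prepend file, individual pages keep using $_SESSION as before and never call session_set_save_handler() themselves.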

If you have time and you still want to check more solutions, take a look at
http://redis4you.com/articles.php?id=01..
Using redis you are fault tolerant. From my point of view, it could be better than memcache solutions because of this robustness.

If you are using PHP sessions, you could use NFS to share the /tmp directory, where I think the sessions are stored, between all the servers in the cluster. That way you don't need a database.
Edited: You can also use an external service like memcachedb (persistent and fast), store the session info in the memcachedb index, and identify it with a hash of the content or even the session ID.

When we had this situation we implemented some code that lives in a common header.
Essentially, for each page we check whether we know the session ID. If we don't, we check whether we're in the situation which you describe by checking whether we have stored session data in the DB. Otherwise we just start a new session.
Obviously this requires all relevant data to be copied to the DB, but if you encapsulate your session data in a separate class then it works OK.

You could also try using memcache as the session handler.

Might be too late, but check this out: http://www.pureftpd.org/project/sharedance
Sharedance is a high-performance server to centralize ephemeral key/data
pairs on remote hosts, without the overhead and the complexity of an SQL
database.
It was mainly designed to share caches and sessions between a pool of web
servers. Access to a sharedance server is trivial through a simple PHP API and
it is compatible with the expectations of PHP 4 and PHP 5 session handlers.

When it comes to PHP session handling in a load balancing cluster, it's best to have sticky sessions. For that, ask whoever maintains the load balancer at your network or datacenter to enable sticky sessions. Once that is enabled, you won't need to worry about sessions at the PHP end.

Related

Handling $_SESSION object in a load-balanced setup, PHP

I have a setup where I have a host which routes multiple requests in a load-balanced fashion. My backend uses PHP. Now, I need to use the $_SESSION object for some of my processing.
Will $_SESSION work where I have 3 backend servers which can receive any request at any time?
If not, Can one suggest alternatives to handle such cases?
EDIT: I do understand that we can store sessions in a database and find a way to track them. But the problem in a realtime load-balanced production scenario is the number of calls that go to the DB. That can be a real bummer for my performance. I'm kind of hoping that we can handle this at the web server level.
I'm not sure if it is possible, but if two web servers had some kind of replication mechanism like databases do, it would be brilliant. I wouldn't have to do a thing.
If such a thing does not exist, PHP should be modified to support it. That will actually, make it a seriously robust language.
My suggestion is to set up PHP to handle the sessions in the database (this way all servers can access the session data regardless of which server is requesting it).
A good tutorial for that can be found HERE
Look into either memcache (Is it recommended to store PHP Sessions in MemCache?) or REDIS (https://joshtronic.com/2013/06/20/redis-as-a-php-session-handler/).
There is a good tutorial on setting up memcache on Ubuntu at https://www.globo.tech/learning-center/php-memcached-instances-ubuntu-16/, which also covers using haproxy as a load balancer (although you may already have a solution).
Perhaps have a read of https://blog.newtonhq.com/session-handling-for-1-million-requests-per-hour-68cdece15030.
You can store sessions in memcache in order to share session among servers.
Please have a look at the documentation here.

PHP sessions, cookieless domains, and performance

I'm on board with the whole cookieless domains / CDN thing, and I understand how just sending cookies for requests to www.yourdomain.com, while setting up a separate domain like cdn.yourdomain.com to keep unnecessary cookies from being sent can help performance.
What I'm curious about is if using PHP's native sessions have a negative effect on performance, and if so, how? I know the session key is kept track of in a cookie, which is small, and so that seems fine.
I'm prompted to ask this question because in the past I've written my web apps and stored a lot of the user's active data, preferences, and authentication information in the $_SESSION variable. However, I notice that some popular web applications out there, like Wordpress, don't use $_SESSION at all. But sessions are easy to use and seem fairly secure, especially if you combine them with tracking user-agent / IP changes to prevent session hijacking. So why don't Wordpress and other web apps use PHP's sessions? Should I also stop using sessions?
Also, let me clarify that I do realize the server must load the session data to process a page request, but that's not what I'm asking about here. My question is about if / how it impacts the network performance, especially in regard to the headers being sent / received. For example, does using sessions prevent pages or images on the site from being served from the browser's cache? Is the PHPSESSID cookie the only additional header that is being sent? These sorts of things.
The standard store for $_SESSION is the file-system with one file per session. This comes with a price:
When two requests access the same session, one request will win over the other and the other request needs to wait until the first request has finished. A race condition controlled by file-locking.
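If that lock is the bottleneck (typically with several parallel AJAX requests from one user), a request that only needs to read the session can give the lock back immediately. A small sketch, assuming PHP 7.0 or later for the read_and_close option; the function name is mine:

```php
<?php
/**
 * Reads a snapshot of the session and releases the session lock right away.
 * 'read_and_close' is a session_start() option available since PHP 7.0.
 * readSessionSnapshot() is a hypothetical helper name.
 */
function readSessionSnapshot(): array
{
    // Opens the session, loads the data, then closes it without holding the lock.
    session_start(['read_and_close' => true]);

    // $_SESSION stays readable after the close; later changes are not saved.
    return $_SESSION ?? [];
}

// For requests that do write, the usual pattern is to finish the writes
// early and then release the lock before any slow work:
// session_start();
// $_SESSION['last_seen'] = time();
// session_write_close();
// ... long-running work that no longer blocks other requests ...
```

Either way the trade-off is the same: anything assigned to $_SESSION after the session has been closed is silently lost.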
When cookies are used to store the session data (Wordpress, Codeigniter), the race condition is the same, but the locking is not as inherent, although a browser might do locking within its cookie management.
Using cookies has the downside that you cannot store that much data and that the data gets passed with each request and response. This is likely to trigger security issues as well. Steal the cookie and you've got the data. If it's encrypted, an attacker can try to decrypt it to gain the data stored therein.
The historical reason for Wordpress is that the platform never used PHP sessions. The root project started around 2000 and got a lot of traction in 2002 and 2004. Session handling was only available from PHP 4 onwards, and PHP 3 was much more popular at that time.
Later on, when $_SESSION was available, the main design of the application was already done, and it worked. Next to that, in 2004/2005 Wordpress decided to start a commercial multi-blog hosting service. This created a need to scale the application(s) across servers, and cookies+database looked easier for the session/user handling than using the $_SESSION implementation. In fact, this is pretty easy and just works, so there never was a need to change it.
For Codeigniter I can not say that much. I know that it stores all session information inside a cookie by default. So session is just another name for cookie. Optionally it can be encrypted but this needs configuration. IIRC it was said that this has been done because "most users do not need sessions". For those who need, there is a database backend (requires additional configuration) so users can change from cookie to database store transparently within their application. There is a new implementation available as well that allows you to change to any store you like, e.g. to native PHP sessions as well. This is done with so called drivers.
However this does not mean that you can't achieve the same based on $_SESSION nowadays. You can replace the store with whatever you like (even cookies :) ) and the PHP implementation of it should be encapsulated anyway in a good program design.
That done you can implement a store you can better control locking on (e.g. a database) and that works across servers in a load balanced infrastructure that does not support sticky sessions.
Wordpress is a good example of a home-grown implementation of session handling that is totally agnostic to whatever PHP offers. That means the wheel has been re-invented. With a view from today, I would not call their design explicitly innovative, so it fulfills a very specific need in a very specific environment that you can only understand if you know about the project's roots.
Codeigniter is maybe a little step ahead (in an interface sense) as it offers some sort of (unstable) interface to sessions and it's possible to replace it with any implementation you like. That's much better for new developers but it's also sort of re-inventing the wheel because PHP does this already out of the box.
The best thing you can do in an application design is to make the implementation independent from system needs, so to make the storage mechanism of your session data independent from the rest of the program flow. PHP offers this with a pretty direct interface, the $_SESSION array and the session configuration.
As $_SESSION is a superglobal array, you might want to prevent your application from accessing it directly, as this would introduce global state. So in a good design you would have an interface to it, to be able to fully abstract away from the superglobal.
With that done, plus abstraction of the store plus configuration (e.g. all in one session dependency container), you should be able to scale and maintain your application well over as many servers as you like for whatever reason. Your implementation then can just use cookies if you think that's it for you. However, you will be able to switch to database-based sessions in case you need them - without the need to rewrite large parts of your application.
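To make that a bit more concrete, here is one possible shape of such an interface. All names (SessionStore, NativeSessionStore, ArraySessionStore, currentUserId) are invented for this sketch and it assumes PHP 8 for the mixed type; treat it as an illustration of the idea rather than a reference design.

```php
<?php
/** Hypothetical interface; the names are illustrative, not from any framework. */
interface SessionStore
{
    public function get(string $key, mixed $default = null): mixed;
    public function set(string $key, mixed $value): void;
    public function remove(string $key): void;
}

/** Backed by PHP's native session; the only class that touches $_SESSION. */
final class NativeSessionStore implements SessionStore
{
    public function __construct()
    {
        if (session_status() !== PHP_SESSION_ACTIVE) {
            session_start();
        }
    }

    public function get(string $key, mixed $default = null): mixed
    {
        return $_SESSION[$key] ?? $default;
    }

    public function set(string $key, mixed $value): void
    {
        $_SESSION[$key] = $value;
    }

    public function remove(string $key): void
    {
        unset($_SESSION[$key]);
    }
}

/** In-memory stand-in, e.g. for unit tests or CLI scripts without cookies. */
final class ArraySessionStore implements SessionStore
{
    private array $data = [];

    public function get(string $key, mixed $default = null): mixed
    {
        return $this->data[$key] ?? $default;
    }

    public function set(string $key, mixed $value): void
    {
        $this->data[$key] = $value;
    }

    public function remove(string $key): void
    {
        unset($this->data[$key]);
    }
}

/** Application code depends on the interface only, never on the superglobal. */
function currentUserId(SessionStore $session): ?int
{
    $id = $session->get('user_id');
    return $id === null ? null : (int) $id;
}
```

Where the native sessions actually live (files, database, memcached) is then purely a matter of session configuration, and the rest of the application never notices a change of store.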
I'm not 100% confident this is the case but one reason to avoid the built-in $_SESSION mechanism in PHP is if you want to deploy your web application in a high-availability web farm scenario.
Because the default session behavior in PHP is to store session data in files on the local disk of whichever server handled the request, it is hard (if not impossible) to have multiple servers processing requests from the same user without changing that default. You would only run into this if you wanted to deploy your web application in a web farm environment where a number of PHP web servers process requests for your app to balance the load.
So, while local session state is generally much faster than a database-based solution, the latter is preferable when you need to process a huge number of requests and a web farm is used to provide the capacity.
As for whether PHP supports configuring the session store to be a database or a session state server instead of the local default: it does, through the session.save_handler setting and session_set_save_handler().

Are there limitations in PHP session handling?

I've seen many sites give up the use of the default handling of sessions in PHP for their own method and I still have no clue why.
They are definitely running PHP and it just seems pointless to me that people would design their own method. Is there some sort of limitation that I do not know of or is it purely so they have control of everything?
(I tried asking them and yeah they either didn't have a way of contacting them or they "saw something somewhere against using PHP sessions" without knowing what it actually was)
Default sessions are stored on the hard drive, usually in the /tmp directory.
When your site gets larger, 1 computer isn't sufficient to run it.
Therefore, people resort to load balancing (among other solutions).
A load balancer effectively switches between a cluster of computers. Therefore, if by any chance you got served by computer #1 on your first request and then by computer #2 on your second request, the second computer cannot read the session since it's not in its /tmp folder.
This is a simplified scenario of course since there's much more to application scaling but this is one of the reasons why people resort to overriding the default session mechanism.
The other thing of interest is storing sessions in the db thus making them searchable and what not. You can also create an interface for effectively forcefully logging people out, which is something that the default mechanism cannot provide.
I would have thought a principal reason for rolling your own session-handling functionality is for the purposes of testing. If you're running unit tests, you won't necessarily have a browser environment going. You won't be able to set cookies, and so PHP won't set $_SESSION variables for you.
If, however, you wrote your own session handling class(es), then you could create a mock class for running unit tests. The object would behave like a "real" session, but you won't have to faff about with browsers, cookies and human beings.
Well with the standard setup you are tied to using the file system, saving session data unencrypted etc.
Writing your own session handling using session_set_save_handler, you can adjust the session management to your needs ... applying encryption, saving sessions in a database, synchronizing the sessions with separate software systems ...
1) Sessions are still widely used. They work and do the job, so there is no point in changing unless you have a special case.
2) However, a session is weak: it relies on a single session ID (that can be stolen). It is possible, though, to protect a session using different methods, such as cookie + IP + expiration.
So yes and no. Sessions are still widely used but require fine-tuning.
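A rough sketch of the "IP + expiration" part of that fine-tuning follows. The array keys ('ip', 'user_agent', 'last_seen'), the function name and the 30-minute idle limit are my own assumptions. Keep in mind that pinning a session to an IP address can log out legitimate users whose address changes, for example on mobile networks.

```php
<?php
/**
 * Returns true if the metadata stored in the session still matches the request.
 * The keys 'ip', 'user_agent' and 'last_seen' are hypothetical names.
 */
function sessionLooksValid(
    array $session,
    string $ip,
    string $userAgent,
    int $now,
    int $maxIdleSeconds = 1800
): bool {
    if (!isset($session['ip'], $session['user_agent'], $session['last_seen'])) {
        return false; // nothing to compare against
    }
    if ($now - $session['last_seen'] > $maxIdleSeconds) {
        return false; // idle for too long
    }
    return $session['ip'] === $ip && $session['user_agent'] === $userAgent;
}

// Typical use near the top of a request:
// session_start();
// if (!sessionLooksValid($_SESSION, $_SERVER['REMOTE_ADDR'],
//                        $_SERVER['HTTP_USER_AGENT'] ?? '', time())) {
//     $_SESSION = [];
//     session_regenerate_id(true); // drop the old session id
// }
// $_SESSION['last_seen'] = time();
```

Calling session_regenerate_id(true) right after a successful login is another cheap measure, since it makes a session ID captured before login useless.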

Are there problems using PHP sessions in a server cluster?

We are developing a web site in PHP, and we have to use sessions. The site will be published in a server cluster. How can we make that work?
Thanks.
Yes this is possible, you need to store your sessions in a central location like a database though. This is pretty simple and just requires you to make some changes to session_set_save_handler - there's a good example of the process you need to follow here
I would use memcache to store your sessions. It will be much faster than storing them in a database or disk.
Database storage is good but you will need more databases when your site becomes very high traffic. Sessions on disk will also cause a lot of IO issues when your site gets a lot of traffic. Memcache on the other hand scales much better than a DB and files.
I personally use memcache and the sites I work on get millions of hits a day. I have never had any issues with storing sessions in memcache.
If you've got multiple PHP boxes, you'll want a central session store.
Your best choices are probably database (that link from seengee's answer is a good explanation) or a dedicated memcache box.
A shared NFS mount for the session directory would be an option, though I've always found NFS performance a bit slow. Alternatives are to write your own session handler using memcache or a database for the sessions.
An alternative option is to load balance your web servers using sticky sessions, which will ensure that requests from the same client always go to the same server during the course of the session.

Is there a way to share object between php pages?

I am new to PHP, but in other web technologies you can share objects between page instances. For example, in Java JSP pages you can easily have one class that exists as a static class for the whole server instance. How do you do this in PHP?
I am not referring to session variables (at least I don't think so). This is more for the purpose of resource pooling (perhaps a socket to share, or database connections etc). So a whole class needs to be shared between subsequent loads, not just some primitive variables that I can store in the session.
I have also looked into doing php singleton classes but I believe that class is only shared within the same page and not across pages.
To make things even more clear, I'm looking for something that can help me share, say, a socket connected to a server for a connectSocket.php page, such that all users who load that page use the same socket and do not open a new one.
This is a bit of a difficult answer, and might not be exactly what you are looking for.
PHP is built upon a 'shared-nothing' architecture. If you require some type of state across your application, you must do this through other means.
First I would recommend looking into the core of the problem. Do you really need it? If you assume the PHP application could die (and lose state), is it OK to lose the data?
If you must maintain the state, even after the application dies or otherwise, you should assume probably the best place to put the data is in MySQL. PHP is intended as a thin layer around your business logic, so I can highly recommend this.
If you don't care about losing the data after a restart, the problem domain you're looking for is probably caching. I would recommend looking into memcached or if you're on a single machine, apc. APC will definitely work for you with Apache on a single machine, but you will still have to code your application assuming you might lose the data.
If you're worried your underlying datastore (MySQL) is too slow, but you still need to maintain the data after a restart, you should look into a combination of these 2 systems. You can always push and pull your data from the cache, but only send it over to MySQL when it updates.
If the data is purely user or session-bound, you probably want to just look into the sessions system.
I've personally developed a reasonably large multi-tenant application, and although it's a pretty complex application, I've never needed the true state you're looking for.
Update: Sorry, I did not read your note about sharing a socket. You will need a separate daemon to handle this, perhaps if you can explain your problem further, there might be other approaches. What type of socket is this?
There's a fundamental difference between web-served Java and web-served interpreted languages like PHP and Perl. In Java, your web server will have an operating environment that maintains state (ie. Tomcat). With interpreted languages, a request to your web server will generally spawn a new web server thread, which in turn loads a fresh operating environment for that thread, in this case, the PHP environment.
Therefore, in PHP, there is no concept of page instances. Every request to the web server is a fresh start. All the classes are re-loaded, so there is no concept of class sharing, nor is there a concept of resource pooling, unless it is implemented externally.
Sharing sockets between web requests therefore isn't really possible.
This is likely a partial answer but you can save an instance of a class into a Session variable and access it at another time.
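What happens under the hood is that PHP serializes the object when the session is written and rebuilds it when the session is read again, so you get a copy with the same data, not the same live instance. A small sketch with a made-up Cart class:

```php
<?php
/** Hypothetical example class; any plain data object behaves the same way. */
final class Cart
{
    /** @var array<string, int> quantity per SKU */
    private array $items = [];

    public function add(string $sku, int $quantity): void
    {
        $this->items[$sku] = ($this->items[$sku] ?? 0) + $quantity;
    }

    public function count(): int
    {
        return array_sum($this->items);
    }
}

$cart = new Cart();
$cart->add('sku-1', 2);

// Roughly what the session layer does between two requests:
$stored   = serialize($cart);     // end of request 1: object becomes a string
$restored = unserialize($stored); // start of request 2: a new object is built

// In real code this is simply:
// session_start();
// $_SESSION['cart'] = $cart;        // request 1
// $cart = $_SESSION['cart'] ?? new Cart(); // request 2, class must be loadable
```

This works for plain data, but not for live resources such as sockets or database connections, which cannot be carried across requests this way. It is also per user, so it does not give you one shared object for all visitors.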
Most of the PHP database libraries already reuse connections. You call, for example, pg_connect as if you were requesting a new connection, but if the connection string is the same as that of a connection that already exists, you will get the established connection back instead. Note that this reuse applies within a single request; to keep a connection open across requests you need a persistent connection (for example pg_pconnect, or PDO::ATTR_PERSISTENT with PDO). If you only care about pooling for database access, then you can just confirm that it exists in the db library you're using.
Another horrible solution may be to load the data of the object into a $_SESSION variable and then use it to rebuild the object on the other page.
In fact, this is the solution I'm going to follow in my project, until I get some better one.
Regards!
