Magento + Varnish + Memcache: session_start() is very slow

We run a Magento shop with Varnish. Everything works fine except for the following problem: if you open a shop page, leave the browser open for a long time (say 12 to 24 hours), and then reload the page, it loads very slowly (about 15 seconds).
We traced the problem to the session_start() call in app/code/core/Mage/Core/Model/Session/Abstract.php. This call takes about 15 seconds.
We use memcache (not memcached) for session management.
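For reference, a typical php.ini wiring for the ext/memcache session handler looks like this (host, port and tuning parameters are illustrative, not taken from our setup):
session.save_handler = memcache
session.save_path = "tcp://127.0.0.1:11211?persistent=1&weight=1&timeout=1&retry_interval=15"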
We googled a lot, and found many posts about slow session starts, but none about this particular issue.
Can anybody help?
Many thanks in advance,
Tilman

I've seen this quite a lot on New Relic as well.
From what I've seen there are a few different causes. I don't have a complete understanding of this issue, but it is something I've been looking into recently. Here are my findings.
Sessions in Magento, Locking, and New Relic
Every controller action in Magento uses the session, whether it needs to or not: the session is eagerly instantiated in Mage_Core_Controller_Varien_Action::preDispatch.
If you have session locking enabled, this means that your session is locked down for the duration of the request, until the request completes. I haven't found the bit of code that releases the session lock yet, but I'm pretty sure it's in there somewhere.
Ultimately this means that if you fire off multiple concurrent requests to Magento controller actions from one location using the same session, some of those requests will have to wait for the others to complete and unlock the session before they can proceed. I usually see this as a slow transaction in New Relic, stuck at Mage_Core_Model_Session_Abstract_Varien::start for ~30 seconds (my session lock wait timeout, I think).
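You can see the lock outside of Magento with a minimal self-contained script; run it twice in parallel with the same session cookie and the second request stalls inside session_start() until the first finishes:
<?php
// Demo of per-session locking: the second concurrent request blocks
// at session_start() until the first request releases the lock.
$before = microtime(true);
session_start();                        // may block here
$waited = microtime(true) - $before;

sleep(10);                              // stand-in for slow controller work
$_SESSION['hits'] = ($_SESSION['hits'] ?? 0) + 1;
session_write_close();                  // lock released here

printf("lock wait: %.1fs, hits: %d\n", $waited, $_SESSION['hits']);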
This behaviour has multiple downsides in New Relic, as I see it:
It slows down the total average response time, because these requests are slower than they otherwise would have been.
New Relic records a sample of the slowest transactions. If I have a genuine performance bottleneck that takes, for example, 20 seconds, New Relic will not report it automatically when the same URL is plagued by session-locking timeouts; the timeouts hide the useful data.
Causes
I've seen a few common causes for this, not a definitive list by any means
Bots
Crawlers like Baidu and Yandex being a bit rude and battering the website. They run from one location, firing off numerous requests with the same session and tripping the session-locking mechanism, hence the slow transactions in New Relic.
Ajax calls to Magento controller actions
On Varnish-cached websites, customer-specific data must be loaded with care; some sites manage this by making Ajax calls to the Magento backend to fetch the required data. I have also seen sites use Ajax calls to the backend for product-specific information, such as the amount left in stock when an item is on sale.
If a single page triggers multiple Ajax calls to the backend on page load, it can trip the session-locking mechanism. The more Ajax calls back to the Magento backend, the more likely you are to experience locking.
Varnish ESI
Much the same as above, except that instead of Ajax calls it uses Edge Side Includes, which also result in additional requests to the backend.
My Plan
I have not actioned this yet, so it's still purely theoretical, but it's something I'm looking into doing over the next few months.
I brought this problem up during the Mage Titans UK 2016 conference and Fabrizio Branca pointed me towards the following module: https://github.com/AOEpeople/Aoe_BlackHoleSession.
Based on a regular expression, the module prevents bots from creating real sessions. This should have the benefit that no session lock is hit and that your session storage isn't battered by rude bots; they should no longer pollute your New Relic readings.
For Ajax/ESI calls that fetch customer data on cached pages, there's nothing you can do that I can see: you need access to the session in order to retrieve customer-specific data.
However, for Ajax/ESI calls that fetch catalogue-specific data (such as limited stock), I don't see any need for a session to exist on that request at all. My plan for the future is to trial an extension to the Aoe_BlackHoleSession module so that I can silo off requests to specific URLs as sessionless (sketched below).
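To make that concrete, here is a hypothetical sketch of the idea; this is not Aoe_BlackHoleSession's actual code, and the bot patterns and path prefix are made up:
<?php
// Route bot or sessionless traffic to a handler that persists nothing,
// so no session is stored and no lock is taken.
class BlackHoleSessionHandler implements SessionHandlerInterface
{
    public function open($path, $name): bool { return true; }
    public function close(): bool { return true; }
    public function read($id): string|false { return ''; }    // always empty
    public function write($id, $data): bool { return true; }  // discard
    public function destroy($id): bool { return true; }
    public function gc($maxLifetime): int|false { return 0; }
}

$ua  = $_SERVER['HTTP_USER_AGENT'] ?? '';
$uri = $_SERVER['REQUEST_URI'] ?? '';
$isBot         = (bool) preg_match('/baiduspider|yandexbot|bingbot/i', $ua);
$isSessionless = strpos($uri, '/ajax/stock/') === 0;   // illustrative prefix

if ($isBot || $isSessionless) {
    session_set_save_handler(new BlackHoleSessionHandler(), true);
}
session_start();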
I'm less familiar with the internals of ESI, so sadly I don't have too much to comment there.
An alternative
During the conference, Fabrizio Branca said he was able to disable session locking completely without any ill effects; test at your own risk.
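For completeness, both memcache-family handlers expose locking knobs in php.ini. A hedged sketch (directive names from ext-memcached and pecl/memcache 3.x respectively; I believe the memcache lock wait defaults to 15 seconds, which matches the delay in the question at the top; values are illustrative):
; ext/memcached (note the trailing "d"): session locking can be disabled
memcached.sess_locking = Off
; pecl/memcache 3.x: no outright switch, but the lock wait can be shortened
memcache.lock_timeout = 3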

Related

Session regenerate causes expired session with fast AJAX calls

My application is a full AJAX web page using Codeigniter Framework and memcached session handler.
Sometimes it sends a lot of asynchronous calls, and if the session has to regenerate its ID (to avoid the session-fixation security issue), the session cookie is not renewed fast enough, so some AJAX calls fail because their session ID has expired.
Here is a schematic picture I made to show the problem clearly:
I have read similar threads (for example this one), but the answers don't really solve my problem; I can't disable the security measure, as my application consists entirely of AJAX calls.
Nevertheless, I have an idea, and I would like an opinion on it before hacking into the CodeIgniter session handler classes.
The idea is to accept two simultaneous session IDs for a short while, for example 30 seconds, which would be a maximum request execution time. After session regeneration, the server would still accept the previous session ID during that window, then switch entirely to the new one.
Using the same picture, that would give something like this:
First of all, your proposed solution is quite reasonable. In fact, the people at OWASP advise just that:
The web application can implement an additional renewal timeout after which the session ID is automatically renewed. (...) The previous session ID value would still be valid for some time, accommodating a safety interval, before the client is aware of the new ID and starts using it. At that time, when the client switches to the new ID inside the current session, the application invalidates the previous ID.
Unfortunately this cannot be implemented with PHP's standard session management (or I don't know how to do it). Nevertheless, implementing this behaviour in a custom session driver [1] should not pose any serious problem.
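A minimal sketch of such a driver, assuming PHP 8 and a files-backed store (the class name and the alias-file mechanism are mine, not part of PHP or CodeIgniter):
<?php
// After regeneration, keep a short-lived alias from the old session ID
// to the new one, so late-arriving requests still resolve.
class GraceSessionHandler extends SessionHandler
{
    private $dir;
    private $grace;

    public function __construct($dir, $grace = 30)
    {
        $this->dir = rtrim($dir, '/');
        $this->grace = $grace;
    }

    public function read($id): string|false
    {
        $alias = $this->dir . '/alias_' . $id;
        if (is_file($alias) && filemtime($alias) > time() - $this->grace) {
            $id = trim(file_get_contents($alias));  // follow old -> new
        }
        return parent::read($id);
    }

    // Call this instead of session_regenerate_id() directly.
    // (A complete driver would alias write() and destroy() the same way.)
    public function regenerateWithGrace()
    {
        $old = session_id();
        session_regenerate_id(false);   // keep the old data file briefly
        file_put_contents($this->dir . '/alias_' . $old, session_id());
    }
}

session_set_save_handler(new GraceSessionHandler(session_save_path()), true);
session_start();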
I am now going to make a bold statement: the whole idea of regenerating the session ID periodically is broken. Now don't get me wrong: regenerating the session ID on login (or, more accurately, as OWASP put it, on "privilege level change") is indeed a very good defense against session fixation.
But regenerating session IDs regularly poses more problems than it solves: during the interval when the two sessions co-exist, they must be synchronised, or else one runs the risk of losing information from the expiring session.
There are better (and easier) defenses against simple session theft: use SSL (HTTPS). Periodic session renewal should be regarded as the poor man's workaround to this attack vector.
[1] link to the standard PHP way
Your problem seems to be less about the actual speed of the requests (though it is a contributing factor) and more about concurrency.
If I understand correctly, your JavaScript application makes many asynchronous AJAX calls, fast and presumably in bursts, and sometimes some of them fail due to session invalidation, which you attribute to request speed.
I think the actual problem is that you have several concurrent requests in flight: while the first one has its session renewed, the others essentially cannot see the new ID, because their requests were already made and are waiting to be processed by the server.
This problem will of course manifest itself only when making several simultaneous requests for the same user.
Now the real question here: what in your application's business logic demands this?
It looks to me like you are trying to find a technical solution to a "business" problem. What I mean is that either you've misinterpreted your requirements, or the requirements are just not that well thought out/specified.
I would advise you to try some of the following:
ask yourself whether these multiple simultaneous requests can be combined into one
look deeply into the requirements and try to find the real reason why you do what you do; maybe there is no real business reason for it
before you fire off a series of requests, first fire a single "refresh" AJAX request to get the new session, and only on its success proceed with all the other requests (see the sketch below)
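A sketch of that third option's server side (the endpoint path and response shape are made up; the point is that the client awaits this one call before firing its burst):
<?php
// /session/refresh: rotate the ID in isolation, then release the lock.
session_start();
session_regenerate_id(true);        // old ID invalidated safely: no other
session_write_close();              // request from this user is in flight

header('Content-Type: application/json');
echo json_encode(array('ok' => true)); // Set-Cookie carries the new ID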
I hope some of what I've written helps guide you to a solution.
Good luck.

Handling PHP Sessions With Multiple Front Ends

We have 2x pfSense firewalls in HA; behind those, 2x Zen Load Balancers in a master/slave cluster; and behind those, 3x front-end web servers running Nginx, PHP-FPM and PHP-APC. In the same network segment there are 2x MySQL DB servers in master/slave replication.
PHP sessions on the front ends should be "cleaned up" after 1440 seconds:
session.gc_maxlifetime = 1440
Cookies are expired when the user's browser closes:
session.cookie_lifetime = 0
Today we were alerted by an end user that they logged in (via the PHP-based login form on the website) but were authenticated as a completely different user. This is inconvenient, to say the least.
The ZLBs are set to use Hash: Sticky Client; they should stick users to a single front end (FE) for the duration of their session. The only reason I can think of for this happening is that two of the FEs generated the same PHP session ID, and the user was then unlucky enough to be directed to that other FE by the LBs.
My questions are plentiful, but for now, I only have a few:
Could I perhaps set a different session cookie name (SESSID) per front-end server? Would this stop identical session IDs generated on different FEs from colliding? It would at least result in the user being logged out rather than logged in as a different user! (See the sketch below.)
We sync the site data using lsyncd and a whole bunch of inotifywatch processes, but we do not sync the /var/lib/php directories that contain the sessions. I deliberately didn't do this, but I'm now thinking perhaps I should. lsyncd would be able to duplicate session files across all three front ends within about 10 seconds of a session being modified. Is that a good idea as a temporary fix?
Lastly, I know full well that the client should be using the DB to store sessions; that would completely eliminate the possibility of duplicated session IDs. But right now they are unwilling to prioritise it in the development timeline.
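Regarding the first question above, the per-front-end cookie name is nearly a one-liner; a sketch (the suffix scheme is illustrative):
<?php
// A distinct cookie name per front end: an ID minted on FE1 is never
// presented to FE2, so a collision degrades to a logout, not a cross-login.
session_name('PHPSESSID_' . gethostname());
session_start();
Note the trade-off: if the load balancer ever re-sticks a user to a different front end, they are logged out rather than cross-authenticated, which is the failure mode described as acceptable above.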
Ideas very much welcome, as I'm struggling to see an easy way out, even as a temporary measure. I can't let another client get logged in as a different user; it's a massive no-no.
Thanks!!
Judging by your question you are somewhat confused by the problem, and it's not clear exactly what you are trying to fix.
Today, we were alerted by an end user that they logged in (PHP based login form on the website), but were authenticated as a completely different user
There's potentially several things happening here.
Cookies are expired when the users browser closes:
Not so. Depending on how the browser is configured, most will retain session cookies across restarts. Since this is controlled at the client, it's not something you can do much about.
PHP sessions on the front ends should be "cleaned up" after 1440 seconds
The magic word here is "after": garbage collection is triggered on a random basis, session files can persist for much longer, and the default handler will happily retrieve and unserialize session data after the TTL has expired.
Do you control the application code? (If not, your post is off-topic here.) If so, then it's possible you have session fixation and hijacking vulnerabilities in your code (but that's based on the description provided by the user, which is typically imprecise and misleading).
It's also possible that content is being cached somewhere in the stack inappropriately.
You didn't say if the site is running on HTTP, HTTPS or mixed, and if HTTPS is involved, where the SSL is terminated. These are key to understanding where the issue may have arisen.
Your next steps are to ensure that:
you have logout functionality in your code which destroys the session data and changes the session id
that you change the session id on authentication
that your session-based scripts return appropriate caching information (including a Vary: Cookie header)
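In code, those three steps look roughly like this (a minimal sketch assuming plain PHP sessions; the function names are illustrative):
<?php
// 1. Logout: destroy the data AND rotate the ID.
function logout()
{
    session_start();
    $_SESSION = array();
    session_regenerate_id(true);
    session_destroy();
}

// 2. On successful authentication, rotate the ID before storing identity.
function on_login($userId)
{
    session_regenerate_id(true);
    $_SESSION['user_id'] = $userId;
}

// 3. Tell shared caches to key on the cookie, and keep personalised
//    responses out of them entirely.
header('Vary: Cookie');
header('Cache-Control: private, no-store');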
It is highly improbable that two systems would generate the same session ID at around the same time.
Really you want to get away from using sticky sessions. It's not hard.
You've got two layers at your front end that are adding no functional or performance value, and since you are using sticky sessions, effectively no capacity or resilience value! Whoever sold you this is laughing all the way to the bank.

PHP concurrent changes on session variable for same user?

Suppose we have a session variable $_SESSION['variable'] that may or may not be modified as the user accesses a page.
Suppose the same user has several browser windows open and somehow makes simultaneous requests to the server that result on changes to the session variable.
Questions:
How does the server "queue" these changes, since they are targeted at the same variable? Is there a potential for server error here?
Is there a way to "lock" the session variable for reading/writing, in order to implement some kind of status check before changing its value?
EDIT
(Thanks Unheilig for the cleanup.)
Regarding the "queueing", I am interested in what happens if two requests arrive at the same time:
Change X to 1
Change X to 2
I know this doesn't seem like a real-world scenario, but it came to mind while I was designing something. It could become a problem if the system allows many concurrent requests from the same user.
Each individual PHP Session is 'locked' between the call to session_start() and either the call to session_write_close() or the end of the request, whichever is sooner.
Most of the time you would never notice this behaviour.
However, if you have a site which does make many concurrent requests* then those requests will appear to queue in first-come-first-served order.
To be clear: in a typical Apache/PHP setup, your requests will come in to the server and start their PHP executions concurrently. It is the session_start() call that will block/pause/wait/queue, because it is waiting to gain the file lock on the session file (or similar, depending on your session handler).
To increase request throughput or reduce waiting requests therefore:
Open and close the session (session_start(), do_work(), session_write_close()) as rapidly as possible, to reduce the time the session is locked for writing (see the sketch after this list).
Make sure you're not leaving the session open on requests that do long work (accessing third-party APIs, generating or manipulating documents, running long database queries, etc.) unless absolutely necessary.
Avoid, where possible, touching the session at all. Be as RESTful as possible.
Manage the queuing and debouncing of requests as elegantly as possible on the client side of the application.
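The first point is worth a sketch, since it's usually the cheapest fix (generate_big_report() is a hypothetical stand-in for your slow work):
<?php
// Hold the lock only while actually touching $_SESSION.
session_start();
$userId = $_SESSION['user_id'] ?? null;
$_SESSION['last_seen'] = time();
session_write_close();              // lock released here

// Long work below no longer blocks this user's other requests.
$report = generate_big_report($userId);
echo $report;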
Hope that helps.
J.
*Ajax- and frame/iframe-heavy applications often exhibit this problem.

Will caching be appropriate for this scenario?

So I have a PHP CodeIgniter webapp and am trying to decide whether to incorporate caching.
Please bear with me on this one, since I'll happily admit I don't fully understand caching!
So the first user loads a page of user-submitted content. It takes 0.8 seconds of processing to load it: "slow". The next user then loads the same page; it takes 0.1 seconds to load from the cache: "fast".
The third user loads it up, also with a 0.1-second execution time. This user decides to comment on the page.
The fourth user loads it up 2 minutes later but doesn't see the third user's comment, because there are still another 50 minutes left before the cache expires.
What do you do in this situation? Is it worth incorporating caching on pages like this?
The reason I'd like to use caching is because I ran some tests. Without caching, my page took an average of 0.7864 seconds execution time. With caching, it took an average of 0.0138 seconds. That's an improvement of 5599%!
I understand it's still only a matter of milliseconds, but even so...
Jack
You want a better cache.
Typically, you should never reach your cache's timeout. Instead, some user-driven action will invalidate the cache.
So if you have a scenario like this:
Joe loads the page for the first time (ever). There is no cache, so it takes a while, but the result is cached along the way.
Mary loads the page, and it loads quickly, from the cache.
Mary adds a comment. The comment is added to the database (or whatever), and the software invalidates the cache
Pete comes along and loads the page, the cache is invalid, so it takes a second to render the page, and the result is cached (as a valid cache entry)
Sam comes along, page loads fast
Jenny comes along, page loads fast.
I'm not a CodeIgniter guy, so I'm not sure what that framework will do for you, but the above is generally what should happen. Your application should have enough smarts built-in to invalidate cache entries when data gets written that requires cache invalidation.
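As a sketch of what that "smarts" looks like (assuming a PSR-16 style cache; render_page() and save_comment_to_db() are hypothetical helpers):
<?php
use Psr\SimpleCache\CacheInterface;

function get_page($pageId, CacheInterface $cache)
{
    $key = 'page:' . $pageId;
    $html = $cache->get($key);
    if ($html === null) {
        $html = render_page($pageId);      // the slow ~0.8 s path
        $cache->set($key, $html, 3600);    // TTL is only a safety net
    }
    return $html;                          // the fast ~0.01 s path
}

function add_comment($pageId, $comment, CacheInterface $cache)
{
    save_comment_to_db($pageId, $comment);
    $cache->delete('page:' . $pageId);     // next reader rebuilds the page
}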
Try CI's query caching instead. The page is still rendered every time, but the DB results are cached, and they can be deleted using native CI functionality (i.e. no third-party libraries).
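A hedged sketch, assuming CodeIgniter 3's database caching API (controller/method names are illustrative):
// In the controller that renders the page:
$this->db->cache_on();
$query = $this->db->get('comments');       // result set cached on first run
$this->db->cache_off();

// After a new comment is saved, drop the cached results created by the
// 'pages/view' controller/method pair:
$this->db->cache_delete('pages', 'view');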
While CI offers only page-level caching without invalidation, I used to handle this issue somewhat differently: the simplest approach was to load all the heavy content from the cache while the comments were loaded via non-cacheable Ajax calls.
Or you might look into custom plugins which solve this, like the one you pointed out earlier.
It all comes down to the granularity at which you want to control the cache. For simple things like blogs, loading comments via external ajax calls (on demand - as in user explicitly requests the comments) is the best approach.

Do PHP sessions conflict with shared-nothing architecture?

When I first met PHP, I was amazed by the idea of shared-nothing architecture. I was once on a project whose scalability suffered from sharing data among different HTTP requests.
However, as I proceeded with my PHP learning, I found that PHP has sessions. This seems to conflict with the idea of sharing nothing.
So, were PHP sessions just invented as a counterpart to the session technology of ASP/ASP.NET/J2EE? Should highly scalable web sites use PHP sessions?
The default PHP model locks sessions on a per-user basis. That means that if user A is loading pages 1 and 2, and user B is loading page 3, the only delay that will occur is that page 2 will have to wait until page 1 is done - page 3 will still load independently of pages 1 and 2 because there is nothing shared for separate users; only within a given session.
So it's basically a half-and-half solution that works out okay in the end: most users aren't loading multiple pages simultaneously, so session lock delays are typically low, and as far as requests from different users are concerned, there's still nothing shared.
PHP allows you to write your own session handler, so you can build in your own semantics using the standard hooks. Or, if you prefer, you can use the built-in functionality to generate the session ID and deal with the browser side of things, then write your own code to store and fetch the session data (for example, if you only wanted the login page, and not other pages, to lock the session data during processing; this is a bit tricky, though not impossible, using the standard hooks).
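One concrete way to get "lock only where needed" with the standard machinery (PHP 7+; is_login_page() is a hypothetical stand-in for your routing check):
<?php
if (is_login_page()) {
    session_start();                      // full lock until write close
} else {
    // read_and_close loads $_SESSION, then releases the lock immediately;
    // writes made afterwards are silently discarded.
    session_start(array('read_and_close' => true));
}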
I don't know enough about the Microsoft architecture for session handling to comment on that, but there's a huge difference between PHP's session handling, and what actually gets stored in the session, compared with J2EE.
Not using sessions in most of your pages will tend to make the application perform a lot faster and potentially scale more easily, but you could say that about any data used by the application.
C.
