Yes, I've read "session_start seems to be very slow (but only sometimes)", but my problem is slightly different.
We have a PHP application that stores very simple sessions in memcached (ElastiCache, to be specific), and we have been monitoring our slowest-performing pageloads. Almost all of the slow ones spend a majority of their time in Zend_Session::Start, and we can't figure out why. It's a very AJAX-y front end, moving more and more toward a single-page app, making a number of simultaneous requests to the backend per pageload, and some of the requests take three to four times as long as they should based solely on this.
It's not every request, obviously, but enough of them that we're concerned. Has anyone else seen this behavior? We were under the impression that memcache is not blocking (how could it be?), so the very worst would be a user has a bum session, but multiple-second wait times in session_start seems inexplicable.
Take a look at your session garbage collection mechanism (play with session.gc_probability and session.gc_divisor).
If gc is slowing down the app, consider cleaning the old sessions manually (e.g. setting gc probability to 1 and running gc script via cron).
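For reference, a minimal sketch of the knobs involved (the values are illustrative assumptions, not recommendations; you can set them in php.ini instead of ini_set()):

```php
<?php
// Session GC runs on roughly gc_probability / gc_divisor of all session_start() calls.
ini_set('session.gc_probability', 1);
ini_set('session.gc_divisor', 100);       // 1/100 => ~1% of requests trigger GC
ini_set('session.gc_maxlifetime', 1440);  // seconds of inactivity before a session is garbage

session_start();
```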
Other possibilities:
the session is locked (so the app waits for the lock to be released, then writes)
too much session data being saved
We have a client-server architecture with Angular on the client side and Apache2, PHP, PDO and MySQL on the server side. The server side exposes an API that gives the clients the data they need to show.
Some observations:
some API calls can take a very long time to compute and return a response.
the server side seems to handle only a single request per client at any given time (I'm seeing only one corresponding query being executed in MySQL); that limit comes either from Apache or from MySQL, since the front end is definitely sending requests in parallel.
the front end cancels requests that are no longer relevant (the data being fetched would not be visible anymore).
requests cancelled by the front end don't seem to be cancelled on the server side and continue to run anyway; I think even queued requests will still run when their turn arrives (even though they were cancelled on the client side).
Need help understanding:
What exactly is the cause of not having all requests (or at least more than one) run in parallel? Can it be changed?
What configuration should I change in either Apache or MySQL to overcome this?
Is there a way to make Apache drop cancelled requests, at least those that are still queued and not yet started?
Thanks!
EDIT
Following #Markus AO's comment (thanks Markus!!!), this was session-blocking related... wish I had known about that before!
OP has a number of tangled problems on the table. However I feel these are worthwhile concerns (having wrestled with them myself), so let's take this apart. For great justice; main screen turn on:
Solving Concurrent Request Problems
There are several possible problems and solutions with concurrent connections in a (L)AMP stack. Before looking at tuning Apache and MySQL, however, let me gloss a common "mystery" issue that creates concurrency problems; namely, a necessary evil called "PHP Session Locking".
PHP Session Blocking & Concurrent Requests
In a nutshell: When you use sessions in your application, after calling session_start(), PHP locks the session file stored at your session.save_path directory. This file lock will remain in place until the script ends, or session_write_close() is called. Result: Any subsequent calls by the same user will be queued, rather than concurrently processed, to ensure there's no session data corruption. (Imagine parallel scripts writing into the same $_SESSION!)
An easy way to demonstrate this is to create a long-running script; then call it in your browser; and then open a new tab, and call it again (or in fact, call any script sharing the same session cookie/ID). You'll see that the second call won't execute until the first one is concluded. This is a common cause of strange AJAX lags, especially with parallel AJAX requests from a single page. Processing will be consecutive instead of concurrent. Then, 10 calls at 0.3 sec each will take a total of 3 sec to conclude, and so on. We don't want that, do we!
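A quick way to see it for yourself, assuming file-based sessions (the script name is made up):

```php
<?php
// slow.php -- hypothetical demo: holds the session lock for ten seconds.
session_start();                  // acquires the lock on this session's file
sleep(10);                        // simulate a long-running request
echo 'done at ' . date('H:i:s');  // the lock is released only when the script ends
```

Open it in two tabs: the second tab's timestamp lands roughly ten seconds after the first, because it was queued behind the session lock.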
You can remedy request blocking caused by PHP session lock by ensuring that:
Scripts using sessions should call session_write_close() once done storing session data. The session lock will be immediately released.
Scripts that don't require sessions shouldn't start sessions to begin with.
Scripts that need to only read session data: Using session_start() with ['read_and_close' => true] option will give you a read-only (non-persistent) $_SESSION variable without session locking. (Available since PHP 7.)
Options 1 and 3 will leave you with read access to the $_SESSION variable and release/avoid the session lock. Any changes made to $_SESSION after the session is closed will be silently discarded; no warnings/errors are displayed.
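A minimal sketch of options 1 and 3 (the session keys are just examples; read_and_close needs PHP 7+):

```php
<?php
// Option 1: write what you need, then release the lock early.
session_start();
$_SESSION['last_seen'] = time();  // illustrative key
session_write_close();            // lock released; later writes to $_SESSION are discarded
// ... long-running work continues here without blocking the user's other requests ...
```

```php
<?php
// Option 3: read-only access, no lasting session lock (PHP 7+).
session_start(['read_and_close' => true]);
$userId = $_SESSION['user_id'] ?? null;   // illustrative key; changes won't persist
```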
The session-lock request blocking issue is only consequential for a single user (using the same session). It has no impact on multi-user concurrency. For further reading, please see:
SO: Session (Auto)-Start, Performance & Session Locking
SO: PHP & Sessions: Is there any way to disable PHP session locking?
In-Depth: PHP Session Locking: How To Prevent Sessions Blocking in PHP requests.
Apache & MySQL Concurrent Requests
Once upon a time, before realizing PHP was the culprit behind blocking/queuing my concurrent calls, I spent a small aeon in tweaking Apache and MySQL and wondering, what happen?
Apache 2.4 supports 150 concurrent requests by default; any further requests will queue up. There are several settings under the MPM/Multi-Processing Module that you can tune to support the desired level of concurrent connections. Please see:
MPM Docs
Worker Docs
Overview at Oxpedia
MySQL has options for max_connections (default 151) and max_user_connections (default unlimited). If your application sends a lot of concurrent requests per user, you'll want to make sure the global maximum is high enough that a handful of users can't hog the entire DBMS.
Obviously, you'll further want to tune these settings in light of your server CPU/RAM specs. (The calculations for which are beyond this answer.) Your concurrency issues probably aren't caused by too many open TCP sockets, but hey, you never know...
Canceling Requests to Apache/PHP/MySQL
We don't have much to go on as far as your application's specific wiring, but I understand from the comments that, as it stands, a user can cancel a request at the front-end, but no back-end action is taken. (I.e. any back-end response is simply ignored/discarded.)
"Is there a way to make Apache drop cancelled requests?" I'm assuming that your front-end sends the requests directly and without delay to Apache; and onward to PHP > MySQL > PHP > Apache. In that case, no, you can't really have Apache cancel the request that it's already received; or you could hit "stop", but chances are PHP and MySQL are already munching it away...
Holding a "Cancel Window"
However, you could program a "cancel window" lag into your front-end, where requests are only passed on to Apache after e.g. a 0.5-second sleep waiting for a possible cancel. This may or may not have a negative impact on the UX, but it may be worth implementing to save server resources if a significant portion of requests are cancelled. This assumes a UI with JavaScript. If you're getting direct HTTP calls to the API, you could have a "sleepy proxy receiver" instead.
Using a "Cancel Controller"
How would one cancel PHP/MySQL processes? This is obviously only feasible/doable if calls to your API result in a processing time of any significant duration. If the back-end takes 0.28 sec to process, and user cancels after 0.3 seconds, then there isn't much left to cancel, is there.
However, if you do have scripts that may run for longer, say into a couple of seconds, you could always find relevant break-points in your code where you have a "not-canceled" check or a kill/rollback routine. Basically, you'd have the following flow:
Front-end sends request with unique ID to main script
PHP script begins the long march for building a response
On cancel: Front-end re-sends the ID to a light-weight cancel controller
Cancel controller logs ID to temporary file/database/wherever
PHP checks at break-points if there's a cancel request for current process
On cancel, PHP executes a kill/rollback routine instead of further processing
This sort of "cancel watch" will obviously create some overhead, and as such you may want to only incorporate this into heavier scripts, to ensure you actually save some processing time in the big picture. Further, you'd only want at most a couple of breakpoints at significant junctions. For read requests, you could just kill the process; but for write requests, you'd probably want to have a graceful rollback to ensure data integrity in your system.
You can also cancel/kill a long-running MySQL thread, already initiated by PHP, with mysqli::kill. For this to make sense, you'd want to run it as MYSQLI_ASYNC, so PHP's around to pull the plug. PDO doesn't seem to have a native equivalent for either async queries or kill. Came across $pdo->query('KILL CONNECTION_ID()'); and PHP Asynchronous MySQL Query (see answer for PDO). Haven't tested these myself. Also see: Kill MySQL query on user abort
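For the mysqli route, a rough and untested sketch (credentials and the query are placeholders; MYSQLI_ASYNC requires mysqlnd):

```php
<?php
// Hypothetical sketch: fire a query asynchronously, remember its thread id,
// and kill it from a second connection if the request gets cancelled.
$db = new mysqli('localhost', 'user', 'pass', 'mydb');
$threadId = $db->thread_id;                      // id of the connection running the query
$db->query('SELECT SLEEP(30)', MYSQLI_ASYNC);    // returns immediately

// ... PHP is free to poll for a cancel signal here; on a normal finish you'd
// collect the result with mysqli_poll() / reap_async_query() ...

if ($userCancelled ?? false) {                   // $userCancelled comes from your own cancel check
    $killer = new mysqli('localhost', 'user', 'pass', 'mydb');
    $killer->kill($threadId);                    // terminates the thread running the async query
    $killer->close();
}
```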
PHP Connection Handling
As an alternative to a controller that passes the cancel signal "from the side", you could look into PHP Connection Handling and poll for aborted connection status at your cancel check-points with connection_aborted(). (See "MySQL kill" link above for a code example.)
A CONNECTION_ABORTED state follows if a user clicks the "stop" button in their browser. PHP has an ignore_user_abort() setting, default "Off", meaning the script should be aborted on user abort. (In my experience though, if I have a rogue script and the session lock is on, I can't do anything until it times out, even when I hit "stop" in the browser. Go figure.)
If you have "ignore user abort" set to false, i.e. the PHP script terminates on user abort, be aware that this will be a wholly uncontrolled termination unless you have register_shutdown_function() implemented. Even so, you'd have to flag check-points in your code for your shutdown function to be able to "rewind the clock" from the termination point onward. Also note this caveat:
PHP will not detect that the user has aborted the connection until an attempt is made to send information to the client. Simply using an echo statement does not guarantee that information is sent, see flush(). ~ PHP Manual on ignore_user_abort
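A rough sketch of that approach, with the caveat above in mind (the chunk loop is just a placeholder for your own processing steps):

```php
<?php
// Hypothetical sketch: keep running after a user abort, but bail out gracefully
// at our own check-points instead of being terminated mid-write.
ignore_user_abort(true);        // we decide ourselves when to stop

for ($step = 0; $step < 10; $step++) {
    // ... one chunk of heavy processing ...

    echo ' ';                   // try to push something out so PHP notices a dropped connection
    flush();

    if (connection_aborted()) {
        // graceful kill/rollback routine here
        exit;
    }
}
```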
I have no experience with implementing "user abort" over AJAX/JS. For a starting point, see: Abort AJAX Request and Cancel an HTTP fetch() request. Not sure how/if they register with PHP. If you decide to travel down this road, please return and update us with your code / research!
I have a standard scenario where multiple parallel requests try to access the same key in a Redis-based cache.
When this key is expired the requesting process notifies some external worker that it needs to be recomputed (the worker might possibly be on another server). The worker recomputes it and updates the cache.
When the cache is hot, everything is fine because I can keep serving the stale data from the cache until the new value is recomputed.
The problem is when the cache is cold and there is no data in Redis to serve yet. The requesting process needs to wait until the value is generated by the external worker. I can't use a cache warm-up in this case because, due to the limited cache size, only the keys that are actually requested should be cached.
So the question is how can I make PHP requests to wait until the computed value is available in Redis? Or what would be the common solution in this case?
The possible solutions I have already thought about:
The Redis blpop command probably would not work, because the value being recomputed is not in a list, and it feels a bit like a workaround.
Maybe it is possible to implement some kind of file-based lock? However, the web app and the worker are on separate servers, and NFS, for example, does not support file locks.
The only working solution I could think of is a while loop that polls Redis every X milliseconds, up to some Y max wait time (see the sketch below). However, is this really a good and practical solution? I am not a fan of having polling loops in supposedly short-lived web requests. Besides, hundreds of requests could potentially be looping and waiting while the value is being recomputed.
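Roughly what I have in mind, as a sketch, assuming the phpredis extension (the function name and timings are placeholders):

```php
<?php
// Hypothetical sketch: wait up to $maxWaitMs for the worker to populate the key.
function waitForKey(Redis $redis, string $key, int $maxWaitMs = 5000, int $stepMs = 100)
{
    $waited = 0;
    while ($waited < $maxWaitMs) {
        $value = $redis->get($key);
        if ($value !== false) {
            return $value;              // worker has recomputed and stored the value
        }
        usleep($stepMs * 1000);         // back off before polling again
        $waited += $stepMs;
    }
    return null;                        // timed out; caller serves an error/fallback
}
```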
I'm new to web design. My concern is that if I track anonymous users by session to keep the correct language and so on, then I would save data for each user who visits my website (for example 2 KB). Wouldn't that make my website vulnerable to attackers overflowing the session storage by creating bogus sessions?
thanks
Why not use local storage, cookies or some other solution instead of sessions? I am not saying that's the best solution, but cookies might be a better fit for just keeping preferences. They "could" be longer-lasting than a session and less intensive on the server side.
PHP saves sessions to disk by default; session data is only in memory while the program is actually running, so it would only be a memory issue if you had a lot of visitors simultaneously -- i.e. running your PHP code at exactly the same time on the same server.
But the amount of memory used by your session array is small compared with the memory used overall by your whole PHP process, so if you had sufficient simultaneous visitors for that to cause a problem, then it's unlikely that having a session for each of them would make much of a difference.
The real way to mitigate against this kind of thing is to make your programs run fast, so that they exit quickly, and thus there is less chance of having large numbers of copies of it running simultaneously.
The first page I load from my site after not visiting it for 20+ mins is very slow. Subsequent page loads are 10-20x faster. What are the common causes of this symptom? Could my server be sleeping or something when it's not receiving http requests?
I will answer this question generally because I'm sure it's something that confuses a lot of newcomers.
The really short answer is: caching.
Just about every program in your computer uses some form of caching to remember data that has already been loaded/processed recently, so it doesn't have to do the work again.
The size of the cache is invariably limited, so stuff has to be thrown out. And 99% of the time the main criterion for expiring cache entries is: how long ago was this last used?
Your operating system caches file data that is read from disk
PHP caches pages and keeps them compiled in memory
The CPU caches memory in its own special faster memory (although this may be less obvious to most users)
And some things that are not actually a cache, work in the same way as cache:
virtual memory, aka swap. When there is not enough memory available for certain programs, the operating system has to make room for them by moving chunks of memory onto disk. On more recent operating systems, the OS will do this just so it can make the disk cache bigger.
Some web servers like to run multiple copies of themselves, and share the workload of requests between them. The copies individually cache stuff too, depending on the setup. When the workload is low enough the server can terminate some of these processes to free up memory and be nice to the rest of the computer. Later on if the workload increases, new processes have to be started, and their memory loaded with various data.
(Note, the wikipedia links above go into a LOT of detail. I'm not expecting everyone to read them, but they're there if you really want to know more)
It's probably not sleeping. It just hasn't been visited for a while, so it has released its resources. It takes time to get it started again.
If the site is visited frequently by many users, it should respond quickly every time.
It sounds like it could be caching. Is the server running on the same machine as your browser? If not, what's the network configuration (same LAN, etc...)?
I own a community website of about 12,000 users (write-heavy), 100 concurrent users max, on a single VPS with 1 GB RAM. The load rarely goes above 3 and response is quite good.
Currently a simple file cache is used to store DB query results to ease the load on the DB, but the website still slows down above 220 concurrent users (load test).
How can I find out what the bottleneck is?
I assume that the DB is fine since the cache is working fine; however, disk IO could be causing problems. Each pageload has about 10 includes and 10-20 queries from the DB or from the file cache, plus lots of PHP processing.
I tried using memcache instead of the file cache, but to my surprise the load test seemed to like the file cache more.
I plan to use Alternative PHP Cache, but I still don't really understand how that cache is invalidated. I have a single index.php that handles all requests. Will the cache store the result for each individual request? Will it clear the cache automatically if one of my includes (or a query result from the cache) changes?
Any other suggestions for finding bottlenecks (tried xdebug)?
Thanks,
Hamlet
I plan to use Alternative PHP Cache, but I still don't really understand how that cache is invalidated. I have a single index.php that handles all requests. Will the cache store the result for each individual request? Will it clear the cache automatically if one of my includes (or a query result from the cache) changes?
APC doesn't cache output. It caches your compiled bytecode.
Essentially, a normal PHP request looks like this:
PHP files are parsed and compiled to bytecode
The PHP interpreter executes the bytecode
APC caches the result of the first step, so you aren't reparsing/recompiling the same code over and over again. By default, it still stat()s your PHP files on every request, to see if the file has been modified since its cached copy was compiled -- so any changes to your code will automatically invalidate the cached copy.
You can also use APC much like you'd use memcached, for storing arbitrary user data. Keep in mind, however:
A memcached server can serve data to multiple servers; data cached in APC can only really be used locally. Better to serve a gig of data from one memcached box to four servers, than to have 4 copies of that gig of data in APC on each individual server.
Memcached, in my experience, is better at handling large numbers of concurrent writes to a single cache key.
APC doesn't seem to cope very well with its cache filling up. Fragmentation increases, and performance drops.
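For completeness, a minimal sketch of the "arbitrary user data" use mentioned above, with the classic apc_* functions (on current PHP/APCu these would be apcu_fetch()/apcu_store(); the key, TTL and compute function are made up):

```php
<?php
// Hypothetical sketch: cache a computed value in APC's user cache for 5 minutes.
$key   = 'frontpage_stats';
$stats = apc_fetch($key, $success);

if (!$success) {
    $stats = computeFrontpageStats();   // placeholder for your expensive work
    apc_store($key, $stats, 300);       // TTL in seconds
}
```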
Also, beware: unless you've set up some sort of locking mechanism, your file-based cache is likely to become corrupt due to simultaneous writes. If you have implemented locking, that may become a bottleneck of its own. IMO, concurrency is tricky -- let memcached/APC/the database deal with it.
You mention you used XDebug - what weren't you able to do? Typically, to start tracking down a bottleneck you enable profiling of a request and then view the resulting "cachegrind" file in KCacheGrind or WinCacheGrind.
As for using a cache system, a dynamic script such as yours will generally do something like this
construct a cache "key" from the unique inputs to the script
ask the caching system if it has data for that key. If it has, you're good to go!
otherwise, do all the hard work to generate the data, and ask the caching system to store it under the desired key for next time.
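A minimal sketch of that pattern with the Memcached extension (the key scheme, TTL and compute function are just illustrative):

```php
<?php
// Hypothetical sketch of the "check cache, else compute and store" pattern.
$memcached = new Memcached();
$memcached->addServer('127.0.0.1', 11211);

$key  = 'page_' . md5($_SERVER['REQUEST_URI']);   // unique inputs -> cache key
$data = $memcached->get($key);

if ($data === false) {
    $data = buildPageData();              // placeholder for the expensive work
    $memcached->set($key, $data, 300);    // store for next time (300 s TTL)
}
```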
APC Cache can help to speed things up further by caching the parsed version of the PHP code.
MySQL has its own query cache.
You can enable it by setting query_cache_size to more than 0.
The query results are taken from the cache if the query is repeated verbatim and does not contain certain things like non-deterministic functions, session variables, and some other things described in the MySQL manual.
The cache for a query is invalidated by issuing any DML operation against any of the underlying tables.
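If you want to try it, something like the following, as a sketch (requires a privileged MySQL account; the size is just an example, and depending on your MySQL version you may also need query_cache_type = ON in my.cnf):

```php
<?php
// Hypothetical sketch: give the MySQL query cache some room at runtime via PDO.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$pdo->exec('SET GLOBAL query_cache_size = 16777216');  // 16 MB; 0 disables the cache
```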
I turned on and configured APC on the test server and got a performance increase of about 400%
300 concurrent users with response time 1.4 secs max :) Good for a start.
Update:
Live server test results
Original:
No APC: 220 concurrent users, server load 20, response time 5000ms
No APC: 250 concurrent users, server load 20+, site is unavailable
New:
APC enabled: 250 concurrent users, server load 2, response time is 600ms
APC enabled: 350 concurrent users, server load 10, response time is 1500ms
APC enabled: 500 concurrent users, server load 20, response time is 5000ms; the site is fully operational, a bit slow, but can be used normally
Thanks for the suggestions, this is pretty great improvement.
The query cache is disabled, as the site is write-heavy and thus the cache would be invalidated constantly for whole tables.
I would say it's likely that your database is IO-bound. I don't know exactly what a "VPS" is, but if it's some kind of VM, then the IO is almost guaranteed to perform very poorly.
Get it onto real hardware ASAP, and get a sensible amount of RAM (1G is tiny; 16G sounds more reasonable).
Then you may be able to tune your db so it can behave properly. How big are your data in total? If you can get all of them (or most of them) to fit in your database cache (not the dodgy query cache, the proper innodb buffer pool one), then do so.
I'm assuming you're using the innodb engine; if so, then set up the buffer pool to be big enough for all your data - if you don't have enough ram, buy more until you do (No, really!).
Then your db queries should be fast even if they're fairly bad (yes).
The tricky bit is, if you have a single machine, how to carve up ram usage between mysql and PHP - the web server (I assume Apache), particularly if you use prefork and lots of MaxClients, can use up loads of ram and deprive your database of it.
Get some decent monitoring on the job (with trending), and make changes carefully and record exactly when you made them.