I am working on a quick survey for a company that will be getting about 200k (at peak) visitors hourly for about 2 days straight. I was just wondering if using $_SESSION variables would tie up the server. All that we are storing in those variables is at most a 6-character string or a single-digit integer. I'm new to the PHP world, so I'm not sure how reliable $_SESSION variables are or how much they will tie up the servers. The servers we are using will be cloud servers.
One final note is that the sessions will only last maybe 6 - 10 minutes tops for each visitor before I close them out.
Any help will be greatly appreciated!
By default, data in $_SESSION will be written to disk upon each call to session_write_close(), or upon script termination. There is no way to know for sure how this will perform without testing the final application on the server hardware you will be using. Since the volume of data is small, the real worry is disk latency. An easy workaround for this would be to point PHP's session.save_path setting at an in-memory filesystem.
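A minimal sketch of that workaround, assuming a Linux host where /dev/shm is a RAM-backed tmpfs mount. The directory name php_sessions is made up for illustration, and the code falls back to the system temp directory when /dev/shm does not exist:

```php
<?php
// Assumption: /dev/shm is a RAM-backed tmpfs mount (common on Linux).
// "php_sessions" is a hypothetical directory name.
$base = is_dir('/dev/shm') ? '/dev/shm' : sys_get_temp_dir();
$dir  = $base . '/php_sessions';

if (!is_dir($dir)) {
    mkdir($dir, 0700, true); // only the web server user needs access
}

// Must be called before session_start().
session_save_path($dir);
session_start();
$_SESSION['answer'] = 'abc123';
session_write_close(); // the session file is written to $dir here
```

The same setting can be made once in php.ini through session.save_path. Keep in mind that anything stored in RAM is lost on reboot, which is usually acceptable for sessions that live only a few minutes.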
Tie up how? Disk space? Storing a simple 6-char string using the default file-based session handler will take up about 6 + length-of-variable-name + ~6 chars of space on the disk. There'll be some overhead to load/unserialize the data in the session file, but it'll be much less than the initial overhead of loading/compiling the script that's using the session data.
Remember, PHP's default sessions use the disk as their storage media - they're not persisted in memory after the script exits.
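To check the size estimate above on a real value, session_encode() returns the exact string PHP writes to the session file. This is only a sketch: the key name answer is made up, and the serialization format is pinned to the classic "php" handler so the result does not depend on local configuration:

```php
<?php
// Use the classic "php" serialization format, which is PHP's default.
ini_set('session.serialize_handler', 'php');
session_start();
$_SESSION = ['answer' => 'abc123']; // hypothetical key and value

// session_encode() returns the string that ends up in the session file.
$encoded = session_encode();
$bytes   = strlen($encoded);
session_write_close();
```

For this key and value the encoded string is answer|s:6:"abc123"; which is 20 bytes: 6 for the name, 6 for the value and 8 of framing.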
I think you don't want to store data in sessions, because it writes to disk. If someone hits the app with multiple requests, are you able to guarantee that they hit the same machine in the cloud? That's rather complicated to write. I would cookie the user instead.
http://php.about.com/od/learnphp/qt/session_cookie.htm
http://www.quora.com/Does-PHP-handle-sessions-by-writing-session-variable-data-to-disc-or-does-this-information-persist-only-in-RAM-Will-accessing-session-data-cause-a-disc-read-in-PHP
Like the others said, I'd use Memcached if you want to scale, but to answer your question directly, I think your server should be able to handle the usage you describe.
In PHP you can change the session handler. The default session handler writes data to a temp file, with one file per session. It works okay, but has limitations when running high-traffic apps (although with 200K/hour you shouldn't have problems with the default handler).
An easy solution is to use the session handler for Memcached, with the PECL/Memcache extension (not to be confused with the PECL/Memcached extension):
http://www.php.net/manual/en/memcache.examples-overview.php (see example #2)
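Roughly, switching the handler looks like the sketch below. It assumes the PECL memcache extension is installed and that a memcached server listens on 127.0.0.1:11211 (host and port are placeholders). When the extension is missing, the code simply stays on the default file handler:

```php
<?php
// Assumptions: PECL memcache is installed and a memcached server is
// reachable at 127.0.0.1:11211 (placeholder host and port).
if (extension_loaded('memcache')) {
    ini_set('session.save_handler', 'memcache');
    ini_set('session.save_path', 'tcp://127.0.0.1:11211');
}

session_start();
$_SESSION['answer'] = 'abc123';
```

With the other extension, PECL memcached, the handler name is memcached and the save path is written without the tcp:// prefix, for example 127.0.0.1:11211.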
Related
Question basically says it all. I get a lot of traffic, about 200k hits a day. I want to store the original referrer (where they came from) in a session variable for various purposes. Is this a good idea or should I stick in a database instead?
You can do both at once :). PHP allows you to define the storage logic of your sessions in scripts. This way it is possible to store sessions in a database as well. Check the manual for session_set_save_handler().
Using a database would have its advantages if you use load balancing (or plan to at some point). This way all web servers could read the session data from the same database (or cluster), and the load balancer would not have to worry about which request should be forwarded to which web server. If session data is stored in files, which is the default mechanism, then a load balancer has to forward each request of a session to the same physical web server, which is much more complex, as the load balancer has to work at the HTTP level.
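A sketch of what such a database-backed handler could look like on PHP 8.0 or later. The class name, table name and columns are made up for illustration, and an in-memory SQLite database stands in for the shared database server. It does no row locking, so treat it as a starting point rather than a finished handler:

```php
<?php
// Hypothetical database-backed session handler (PHP 8.0+).
// Assumption: PDO with the SQLite driver is available.
final class PdoSessionHandler implements SessionHandlerInterface
{
    public function __construct(private PDO $pdo)
    {
        $pdo->exec('CREATE TABLE IF NOT EXISTS sessions (
            id TEXT PRIMARY KEY, data TEXT NOT NULL, updated_at INTEGER NOT NULL)');
    }

    public function open(string $path, string $name): bool { return true; }

    public function close(): bool { return true; }

    public function read(string $id): string|false
    {
        $stmt = $this->pdo->prepare('SELECT data FROM sessions WHERE id = ?');
        $stmt->execute([$id]);
        $data = $stmt->fetchColumn();
        return $data === false ? '' : $data; // unknown id means empty session
    }

    public function write(string $id, string $data): bool
    {
        $stmt = $this->pdo->prepare(
            'REPLACE INTO sessions (id, data, updated_at) VALUES (?, ?, ?)');
        return $stmt->execute([$id, $data, time()]);
    }

    public function destroy(string $id): bool
    {
        return $this->pdo->prepare('DELETE FROM sessions WHERE id = ?')->execute([$id]);
    }

    public function gc(int $max_lifetime): int|false
    {
        $stmt = $this->pdo->prepare('DELETE FROM sessions WHERE updated_at < ?');
        $stmt->execute([time() - $max_lifetime]);
        return $stmt->rowCount(); // number of expired sessions removed
    }
}

$handler = new PdoSessionHandler(new PDO('sqlite::memory:'));
session_set_save_handler($handler, true); // later session_start() calls use it
```

Every web server that registers the same handler against the same database sees the same sessions, which is what removes the need for sticky routing in the load balancer.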
You could just store the information in a cookie if you only need it for the user's current session. Then you don't need to store it at all on your end.
There are a few down sides as well:
They may have cookies disabled, so you may not be able to save it.
If you need the information next time you may not be able to get it, as it could have been deleted.
Not super secure so don't save passwords, bank info, etc.
So if this information is required no matter what, maybe it's not the way to go. If the information is optional, then this will work.
The default PHP session handler is the file handler. So, the pertinent questions are:
Are you using more than 1 webserver without sticky sessions (load balancing)?
Are you running out of disk space?
Do you ever intend to do those?
If yes (to any), then store it in a database. Or, even better, calculate the stuff on every request (or cache it somewhere like Memcached). You could also store the stuff in a signed cookie (to prevent tampering).
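A minimal sketch of the signed-cookie idea using an HMAC. The helper names signValue and verifyValue and the secret are hypothetical. The secret is assumed to be a long random key that never leaves the server:

```php
<?php
// Hypothetical helpers for a tamper-evident cookie value.
// Assumption: $secret is a long random key kept on the server only.
function signValue(string $value, string $secret): string
{
    $mac = hash_hmac('sha256', $value, $secret);
    return $mac . '.' . base64_encode($value);
}

function verifyValue(string $signed, string $secret): ?string
{
    $parts = explode('.', $signed, 2);
    if (count($parts) !== 2) {
        return null; // malformed cookie
    }
    [$mac, $encoded] = $parts;
    $value = base64_decode($encoded, true);
    if ($value === false) {
        return null; // payload is not valid base64
    }
    $expected = hash_hmac('sha256', $value, $secret);
    // hash_equals() compares in constant time to avoid timing leaks.
    return hash_equals($expected, $mac) ? $value : null;
}

$secret = 'replace-with-a-long-random-key';
$cookie = signValue('https://example.com/landing', $secret);
// In a web request: setcookie('referrer', $cookie, ['httponly' => true]);
$original = verifyValue($cookie, $secret);
```

Signing only detects tampering. The value itself is still readable by the visitor, so nothing secret should go into it.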
There is certain userdata read from the (MySQL) database that will be needed in subsequent page-requests, say the name of the user or some preferences.
Is it beneficial to store this data in the $_SESSION variable to save on database lookups?
We're talking (potentially) lots of users. I'd imagine storing in $_SESSION contributes to RAM usage (very-small-amount times very-many-users) while accessing the database on every page request for the same data again and again should increase disk activity.
The irony of your question is that, for most systems, once you get a large number of users, you need to find a way to get your sessions out of the default on-disk storage and into a separate persistence layer (i.e. database, in-memory cache, etc.). This is because at some point you need multiple application servers, and it is usually a lot easier not to have to maintain state on the application servers themselves.
A number of large systems utilize in-memory caching (memcached or similar) for session persistence, as it can provide a common persistence layer available to multiple front-end servers and doesn't require long time persistence (on-disk storage) of the data.
Well-designed database tables or other disk-based key-value stores can also be successfully used, though they might not be as performant as in-memory storage. However, they may be cheaper to operate depending on how much data you are expecting to store with each session key (holding large quantities of data in RAM is typically more expensive than storing on disk).
Understanding the size of session data (average size and maximum size), the number of concurrent sessions you expect to support, and the frequency with which the session data will need to be accessed will be important in helping you decide what solution is best for your situation.
You can use multiple storage backends for session data in PHP. By default it's saved to files, one file per session. You can also use a database as the session backend, or whatever you want, by implementing your own session save handler.
If you want your application to be as scalable as possible, I would not use sessions on the file system. Imagine you have a setup with multiple web servers all serving your site as a farm. When using sessions on the filesystem, a user has to be routed to the same server for each request, because the session data is only available on that server's filesystem. If you are not using sessions on the filesystem, it does not matter which server is used for a request. This makes load balancing much easier.
Instead of using session on filesystem I would suggest
use cookies
use request vars across multiple requests
or (if data is security critical)
use sessions with a database save handler. So data would be available to each webserver that reads from the database (or cluster).
Using sessions has one major drawback: You cannot serve concurrent requests to the user if they all try to start the session to access data. This is because PHP locks the session once it is started by a script to prevent data from getting overwritten by another script. The usual thinking when using session data is that after your call to session_start(), the data is available in $_SESSION and will get written back to the storage after the script ends. Before this happens, you can happily read and write to the session array as you like. PHP ensures this will not destroy or overwrite data by locking it.
Locking the session will kill performance if you want to do a Web2.0 site with plenty of Ajax calls to the server, because every request that needs the session will be executed serially. If you can avoid using the session, it will be beneficial to user's perceived performance.
There are some tricks that might work around the problem:
You can try to release the lock as soon as possible with a call to session_write_close(), but you then have to deal with not being able to write to the session after this call.
If you know some script calls will only read from the session, you might try to implement code that only reads the session data without calling session_start(), and avoid the lock altogether.
If I/O is a problem, using a Memcache server for storage might get you some more performance, but does not help you with the locking issue.
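The first two tricks can be sketched as follows, assuming PHP 7.0 or later for the read_and_close option and the default file handler. The key name step is hypothetical. Both halves are shown in one script only for brevity. In practice they would be separate requests:

```php
<?php
// Writer request: take the lock, change the data, release the lock early.
session_start();
$_SESSION['step'] = 3;
session_write_close(); // lock released; later changes to $_SESSION are not saved
// ... slow work (API calls, rendering) can run here without blocking
// other requests that belong to the same session.

// Read-only request: load the data and release the lock right away.
session_start(['read_and_close' => true]);
$step = $_SESSION['step'] ?? null;
```

With read_and_close the session data is still loaded into $_SESSION, but the lock is held only for the duration of the read.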
Note that the database also has this locking issue with all data it stores in any table. If your DB storage engine is not wisely chosen (like MyISAM instead of InnoDB), you'll lose more performance than you might win with avoiding sessions.
All these discussions are moot if you do not have any performance issues at all right now. Do whatever serves your intentions best. Whatever performance issues you'll run into later we cannot know today, and it would be premature optimization (which is the root of evil) trying to avoid them.
Always obey the first rule of optimization, though: Measure it, and see if a change improved it.
I am developing a website. Currently, I'm on cheapo shared hosting. But a boy can dream and I'm already thinking about what happens with larger numbers of users on my site.
The visitors will require occasional database writes, as their progress in the game on the site is logged.
I thought of minimizing the queries by writing progress and other info live into the $_SESSION variable. And only when the session is destroyed (log out, browser close or timeout), I want to write the contents of $_SESSION to the database.
Questions:
Is that possible? Is there a way to execute a function when the session is destroyed by timeout or closing of the browser?
Is that sensical? Are a couple of hundred concurrent SQL queries going to be a problem for a shared server and is the idea of using $_SESSION as a buffer going to alleviate some of this.
Is there a way to execute a function when the session is destroyed by
timeout or closing of the browser?
Yes, but it might not work the way you imagine. You can define your own custom session handler using session_set_save_handler, and part of the definition is supplying the destroy and gc callback functions. These two are invoked when a session is destroyed explicitly and when it is destroyed due to having expired, so they do exactly what you ask.
However, session expiration due to timeout does not occur with clockwork precision; it might be a whole lot of time before an expired session is actually "garbage-collected". In addition, garbage collection triggers probabilistically so in theory there is the chance that expired sessions will never be garbage collected.
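How probabilistic this is on a given server can be read from the ini settings. A small sketch follows. The function name sessionGcChance is made up:

```php
<?php
// Chance that a given session_start() call triggers garbage collection.
function sessionGcChance(): float
{
    $probability = (int) ini_get('session.gc_probability');
    $divisor     = (int) ini_get('session.gc_divisor');
    return $divisor > 0 ? (float) ($probability / $divisor) : 0.0;
}

$chance   = sessionGcChance();
// Seconds of inactivity after which a session counts as expired.
$lifetime = (int) ini_get('session.gc_maxlifetime');
```

With PHP's built-in defaults (gc_probability 1, gc_divisor 100) roughly one request in a hundred runs the collector. Some distributions set gc_probability to 0 and clean up session files with a cron job instead, in which case the gc callback of a custom handler is never triggered by PHP itself.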
Is that sensical? Are a couple of hundred concurrent SQL queries going
to be a problem for a shared server and is the idea of using $_SESSION
as a buffer going to alleviate some of this.
I really wouldn't do this for several reasons:
Premature optimization (before you measure, don't just assume that it will be "better").
Session might never be garbage collected; even if this doesn't happen, you don't control when they are collected. This could be a problem.
There is a possibility of losing everything a session contains (e.g. server reboots), which includes player progress. Players do not like losing progress.
Concurrent sessions for the same user would be impossible (whose "saved data" wins and remains persisted to the database?).
What about alternatives?
Well, since we're talking about el cheapo shared hosting, you are definitely not going to be in control of the server, so anything that involves PHP extensions (e.g. memcached) is conditional. Database-side caching is also not going to fly. Moreover, the load on your server is going to be affected by variables outside your control, so you can't really do any capacity planning.
In any case, I'd start by making sure that the database itself is structured optimally and that the code is written in a way that minimizes load on the database (free performance just by typing stuff in an editor).
After that, you could introduce read-only caching: usually there is a lot of stuff that you need to display but don't intend to modify. For data that "almost never" gets updated, a session cache that you invalidate whenever you need to could be an easy and very effective improvement (you can even have false positives as regards the invalidation, as long as they are not too many in the grand scheme of things).
Finally, you can add per-request caching (in variables) if you are worried about pulling the same data from the database twice during a single request.
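Per-request caching can be as small as a static array inside a lookup function. In this sketch (PHP 8.0+) the function name cachedLookup is hypothetical and the closure stands in for the real database query:

```php
<?php
// Hypothetical per-request cache: results live in a static array and
// vanish when the script ends. $loader stands in for the real query.
function cachedLookup(string $key, callable $loader): mixed
{
    static $cache = [];
    if (!array_key_exists($key, $cache)) {
        $cache[$key] = $loader($key);
    }
    return $cache[$key];
}

$queries  = 0;
$loadUser = function (string $key) use (&$queries): array {
    $queries++; // counts how often the "database" is actually hit
    return ['id' => $key, 'name' => 'Alice'];
};

$first  = cachedLookup('user:42', $loadUser);
$second = cachedLookup('user:42', $loadUser); // served from the static array
```

Because the array is rebuilt on every request, there is nothing to invalidate and nothing is shared between users.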
It's not a good idea to write data when the session is destroyed. Since the session data could be destroyed by a garbage collector configured by your host, you have no idea when the session is really closed until the user's cookie expires.
So I suggest you use either a shared-memory (RAM) cache system like memcache (if your host offers it) or a disk-based cache system.
By the way, if your queries are optimized, columns correctly indexed, etc., your shared hosting could take tons of queries at the "same time".
Is that sensical? Are a couple of hundred concurrent SQL queries going to be a problem for a shared server and is the idea of using $_SESSION as a buffer going to alleviate some of this.
No. First and foremost, you never know what happens to a session (a logout is obvious, whereas a time-out is nearly undetectable), so it's not a trustworthy caching mechanism at any rate. If there are results that you query multiple times over the span of a few requests and that don't change all too often, save the results of those queries to a dedicated caching mechanism, such as APC or memcached.
Now, I understand your webhost will not provide these caching systems, but then, you probably can do different things to optimise your site. For starters, my most complex software products (which are fairly complex) query the database about 6 times per page, on average. If the result is reusable, I tend to use caching, so that lowers the number of queries.
On top of that, writing decent queries is more important; the quality of your design and queries is more important than the quantity. If you get your schema, indexes and queries right, 10 queries are faster than one query that's not optimised. I'd invest my time investigating how to write efficient queries, and read up on indexing, rather than trying to overcome the problem with a "workaround", such as caching in a session.
Good luck, I hope your site becomes that big of a success you will actually need the advice above ;)
Actually you could use $_SESSION as a buffer to avoid duplicate reads; that seems a good idea to me (memcached would be even better). I would surely not use it for delaying writes, though (that is much more complex and should be handled by the DB).
You could use a simple hash that you save in $_SESSION:
// create the cache only once, so later requests do not wipe it
if (!isset($_SESSION['cache'])) {
    $_SESSION['cache'] = array();
}
then when you have to make a query
if (isset($_SESSION['cache'][$id])) {
    // cache hit: reuse the stored result
    $question = $_SESSION['cache'][$id];
} else {
    // cache miss: retrieve your $question from the database, then save it in the cache
    $_SESSION['cache'][$id] = $question;
}
I'm currently storing a fair amount of data in the $_SESSION variable. I'm doing this so I don't need to keep accessing the database.
Should I be worried about memory issues on a shared server?
Can servers cope with large amounts of data stored in the $_SESSION variable?
Should I be worried about memory issues on a shared server?
Yes - session data is loaded into the script's memory on every request. Hence, you are at risk of breaking the individual per-script memory limit. Even if you don't hit the limit, this is really inefficient.
Accessing data from the database on demand is much better.
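Before deciding, it helps to measure how large the session payload actually is. A rough sketch: serialize() is used here only as an approximation of PHP's session format, and the array is a hypothetical stand-in for $_SESSION:

```php
<?php
// Rough estimate of how many bytes a session payload occupies when stored.
// serialize() only approximates PHP's session serialization format.
function approxSessionBytes(array $session): int
{
    return strlen(serialize($session));
}

// Hypothetical payload standing in for $_SESSION.
$session = ['user_id' => 42, 'name' => 'Alice', 'prefs' => ['theme' => 'dark']];
$bytes   = approxSessionBytes($session);

// Per-script limit the payload counts against, e.g. "128M" ("-1" = unlimited).
$limit = ini_get('memory_limit');
```

The in-memory footprint of the unserialized array is larger than the serialized size, so the number is a lower bound rather than an exact figure.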
... in addition to what @Pekka wrote:
PHP sessions are not an alternative to a caching solution!
You should investigate if your server has APC available. You should use that on top of the layer which accesses information from the database (assuming you actually have OO code).
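A sketch of such a caching layer (PHP 8.0+). It uses APCu, the user-cache successor to APC, which is an assumption on my part since the answer only says APC. The function name fetchCached is hypothetical, and the code falls back to the loader when the extension is missing or disabled:

```php
<?php
// Read-through cache in front of a database lookup. Assumption: the APCu
// extension may or may not be present, so the loader is always a fallback.
function fetchCached(string $key, callable $loader, int $ttl = 300): mixed
{
    if (function_exists('apcu_fetch')) {
        $value = apcu_fetch($key, $hit);
        if ($hit) {
            return $value; // served from shared memory
        }
    }

    $value = $loader(); // stands in for the real database query

    if (function_exists('apcu_store')) {
        apcu_store($key, $value, $ttl); // keep it for $ttl seconds
    }
    return $value;
}

$prefs = fetchCached('prefs:42', fn () => ['theme' => 'dark']);
```

Unlike $_SESSION, this cache is shared by all requests on the server and is only loaded for the keys a script actually asks for.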
I have been working on a social-network-type app for at least 2 years now. I need it to scale well, so I have put a lot of effort into perfecting its code. I use sessions very often to cache database results for a user: when a user logs into the site, I cache their userID number, username, user status/role, photo URL, online status, last activity time, and several other things in session variables/arrays. I currently have 2 separate servers to handle this site: 1 server for the Apache webserver and a separate server for MySQL. I am starting to use memcache in some areas to cut down on database load.
My sessions are stored on disk, and I know some people use a database to store session data. To me it would seem that storing session data that I cached from MySQL would kind of defeat the purpose if I were to switch to storing sessions in MySQL. So what am I missing here? Why do people choose to use a database for sessions?
Here is my idea: using a database for sessions would make it easier to store and access sessions across multiple servers. Is this the main reason for using a database?
Also should I be using memcache to store temp variables instead of storing them into a session?
PHP has the ability to use memcached to store sessions.
That may just be the winning ticket for you.
Take a look at this google search.
One part of the Zend Server package is a session daemon.
Be careful using memcache for that purpose. Once the memory bucket is full, it starts evicting the least recently used entries, so sessions can disappear before they expire.
Found this on slideshare about creating your own session server with php-cli.
The single best reason to store sessions in the database is so you can load-balance your website. That way it doesn't matter which server hands out the next page because they are all using the same database for storing their sessions.
Have a look at PHP's session_set_save_handler() for how to install a custom session handler. It takes about 30 lines to set one up that puts the session in the database, though that doesn't count the lines to make a decent database handler. :-) You will need to do:
ini_set('session.save_handler', 'user');
ini_set('session.auto_start', '0');
... although session.auto_start will need to be in your php.ini (and set to 0).
If the database access is going to be a bit expensive, there are some other things you can do to mitigate that. The obvious one is to have a DB server that is just for sessions. Another trick is to have it poke stuff into both memcache and the DB, so when it checks, if the memcache record is missing, it just falls back to the DB. You could get fancy with that, too, and split the session up so some of it is in memcache but the rest lives in the database. I think you'd need to put your own access functions on top of PHP's session API, though.
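The memcache-plus-DB fallback can be sketched like this. Both tiers are plain PHP arrays here so the example runs anywhere. In a real setup the fast tier would be memcached and the durable tier a sessions table, and the class name TwoTierStore is made up:

```php
<?php
// Two-tier session storage sketch. Both tiers are plain arrays here; in a
// real setup $cache would be memcached and $db a sessions table.
final class TwoTierStore
{
    private array $cache = [];
    private array $db = [];

    public function write(string $id, string $data): void
    {
        $this->db[$id]    = $data; // durable copy
        $this->cache[$id] = $data; // fast copy
    }

    public function read(string $id): ?string
    {
        if (isset($this->cache[$id])) {
            return $this->cache[$id]; // fast path
        }
        $data = $this->db[$id] ?? null; // fall back to the durable tier
        if ($data !== null) {
            $this->cache[$id] = $data; // repopulate after an eviction
        }
        return $data;
    }

    public function evictFromCache(string $id): void
    {
        unset($this->cache[$id]); // simulates memcached dropping the entry
    }
}

$store = new TwoTierStore();
$store->write('sess1', 'step|i:3;');
$store->evictFromCache('sess1');
$data = $store->read('sess1'); // still found, via the durable tier
```

Wrapping this logic in the read and write methods of a custom session handler would keep the rest of the application on the normal $_SESSION API.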
The main reason to store session data in a database is security because otherwise you have no way to validate it. You'd store the session ID along with the data in the database and match them to see if the session has been tampered with but you can't use the server's (apache) default session mechanism anymore.
As for storing variables in memcache instead of the session: have you set up your database query cache? I'd have a look there first, as it's far easier to deal with than memcache.