If I have a login system or something similar, I store a session_id and a user_id in sessions, but any other data pertaining to a certain user is stored in a database. I've seen other scripts where people store other data (username, email etc) in sessions.
I was just wondering, which would be "better"? Saving data from the DB in sessions, or keeping fewer session variables and grabbing the data from the database?
Thanks!
You can store whatever information you like in $_SESSION. I believe it can be up to 128 MB: the limit is governed by memory_limit, which is 128M by default. You could change this.
However, as a rule of thumb, I'd store only information that is pertinent and/or cheaper to keep in the session than to query from the database. Put another way: as little as possible.
It will no-doubt vary widely by use, but often, sessions contain things like:
Username
Full display names
Email address
IDs (user or otherwise)
Permissions
User groups
Hashes
Form input errors (temporarily, to highlight form errors)
Storing large blocks of data/info isn't advised though for reasons of speed/scale.
If your site/platform needs to scale at a later date, at the right point, you'd be better off looking at write-through caching or similar for frequently used/required data (e.g. Memcached) and storing the vast majority of data in your DB, where it should be.
Hope this helps.
The answer is it depends and in your case, it probably doesn't even matter.
Session approach
Fewer queries = faster
DB approach
Less data in session prevents clobbering
Updates to the DB are reflected immediately without having to worry about simultaneously updating the session
Practice shows it is better to keep data in the database, both for medium-sized and larger projects (server farms, or really large amounts of session data) and to improve security on any sort of project (e.g. on shared hosting). Even the user ID should not be kept in $_SESSION. Hashes, flash messages, quick settings: that's what ought to be in $_SESSION.
But if you are still asking "Do I need to save the session in the DB?", then most probably you should not keep it in the DB.
As part of an effort to implement username / password authentication for websites, I am using PHP sessions. I read a very interesting article called PHP Sessions in Depth
One thing that was not addressed in the article (as it was not for novices like me) is why data would be stored in the session as opposed to a non-session database other than for the reason of knowing which user the session is for.
For example the following is given as an example of data that might be in session storage:
[ "theme" => "blue",
  "volume" => 100 ]
The article goes on to discuss race conditions which can result when a large amount of this type of data is being changed in session storage. What I want to know is: why not just store it in a non-session database, keyed by the user's ID? Only the user's ID would then be kept in session storage, so that it persists between HTTP requests. I have taken the latter approach, but my websites are very simple and I am wondering if I might run into problems if they become more complex.
You are absolutely correct. Storing too much data directly in a session can cause multiple problems:
Performance, since PHP serialises the entire session to a string when saving, and unserialises the whole thing on the next page load
Stale data, if the session isn't the "source of truth" for some information, just an extra copy of data retrieved from somewhere else, like a primary database
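To get a rough feel for the serialisation cost in the first point, the sketch below compares the serialised size of a minimal session payload with a bloated one. It uses serialize() as a stand-in for the session serialiser (the default session format differs slightly), and the function name and array contents are made up for illustration.

```php
<?php
// Rough illustration only: serialize() approximates what PHP writes
// at the end of each request; the payload contents are hypothetical.

/** Return the number of bytes a payload occupies once serialised. */
function serializedSize(array $payload): int
{
    return strlen(serialize($payload));
}

// A minimal session: just enough to identify the user.
$lean = ['user_id' => 42];

// A bloated session: a copy of data that also lives in the database.
$bloated = [
    'user_id' => 42,
    'search_results' => array_fill(0, 1000, ['id' => 1, 'title' => 'Example item']),
];

printf(
    "lean: %d bytes, bloated: %d bytes\n",
    serializedSize($lean),
    serializedSize($bloated)
);
```

Every request that starts the session pays for the larger payload, whether or not the page uses the extra data.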
Race conditions as such are not generally an issue with sessions as designed, because they are intended to be "locked" for the entire duration of a request. If the same user makes two requests at once, the second one just waits for the first to complete and unlock the session. However, if you write your own session handler that doesn't lock, or make heavy use of session_start and session_write_close, you won't have that guarantee.
The main reason for putting extra data into the session is to use it as a sort of cache: if every single page says "Hello $username" at the top, storing the username as well as the ID in the session saves you a round trip to the database to fetch it every time. So there is a balance to be found, and you can't 100% say that storing the absolute minimum is always optimal.
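A minimal sketch of that cache idea, assuming the username is looked up by user ID. The function cachedUsername and the $loadFromDb callable are hypothetical names; the session is passed in as a plain array so the example runs without session_start(), and in a real page you would pass $_SESSION instead.

```php
<?php
/**
 * Return the display name for a user, using the session array as a cache.
 *
 * $session is passed by reference so the sketch can run without
 * session_start(); $loadFromDb is a hypothetical stand-in for your query.
 */
function cachedUsername(array &$session, int $userId, callable $loadFromDb): string
{
    if (!isset($session['username'])) {
        // Cache miss: one round trip, then remember the result.
        $session['username'] = $loadFromDb($userId);
    }
    return $session['username'];
}

// Usage with a fake "database" that counts how often it is queried.
$queries = 0;
$fakeDb = function (int $id) use (&$queries): string {
    $queries++;
    return 'alice';
};

$session = ['user_id' => 7];
cachedUsername($session, 7, $fakeDb); // queries the "database"
cachedUsername($session, 7, $fakeDb); // served from the session copy
```

The trade-off is the stale-data problem: if the user changes their name, the code that updates the database also has to update or unset the session copy.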
I'm planning on saving my users' frequently used parameters, i.e. name, picture, etc., in session variables as opposed to pulling them from the MySQL table each time they are needed. Saving frequently used parameters in variables as opposed to a database should in theory be more efficient, but because I'm not sure how SESSION variables are saved, I'm not too sure if this is true. Does anyone know if pulling info from a SESSION variable is more efficient than querying the MySQL table?
The term variable is used loosely as SESSION "variables" are stored in files in the server's temporary directory.
You would think reading files is more costly than reading a database. After all, a database is essentially a file too, but one optimized for this purpose, as opposed to "temporary session files".
Yes, pulling information from a session variable is more efficient than querying a database for that info. However, loading the information INTO the session variables requires reading a file off of your server's file system and into RAM, which depending on many factors (disk speed, IO load, DB speed, etc.) might be slower or faster than reading the same information from a DB. Without information on your specific setup, it's hard to say. One thing to keep in mind: if you plan on growing and using more than one web server, you will need to write some custom session handlers to store your sessions on a central server (possibly a database), in memcache, or on a shared mount point where all your web servers can go to fetch the session files.
In the end, putting something into the session and using it from there can be more efficient than loading it from the DB every time, but you are still loading it from somewhere, and so, knowledge of your hardware and your setup will be your best guide.
The default Session handler for PHP stores that info to disk; one unique temporary file per session. The issues you may come across are if the disk/file system gets overloaded, or if your data becomes stale.
If you're making a trip to disk to access the session, there is slightly less overhead than accessing MySQL, but you're still making a trip to disk upon every page request. You can try to use an in-memory Session handler.
Session variables are preferred for persisting a relatively small amount of temporary data. They're good for "sessions".
Use a database for everything else. Especially for:
larger amounts of data,
for any kind of "transaction", or
for data that needs to be persisted between "sessions".
This article is somewhat dated, and it doesn't apply to PHP per se ... but it should give you some idea about the relative efficiencies of filesystem (e.g. NTFS) vs database (e.g. MSSQL):
To Blob or Not To Blob: MS Research white paper
Yes it's more efficient to use session variables.
Typically, session variables are stored on the server in the /tmp directory (you can check your phpinfo() output to see how yours is configured).
And because they're stored on the server, you can assume they're just as secure as the rest of your server.
Yes, it is more efficient. The session is saved on the server. However, with or without sessions you need to check whether the user is logged in and whether the user has the correct session ID. It depends on the number of your columns, rows and many other things.
Assumption
I understand that it's not good to store too much data and that it should be kept as simple as possible.
State today
Right now I store the minimum needed, using simple data types (ints and strings),
mainly the user's ID and whether he is logged in.
Most of my functions are static, or singletons that have to be rebuilt on each POST/GET.
I have trouble representing the current state and changing it,
and I end up with a largely static site.
Most of the state representation goes into JavaScript.
Target
On the other hand, if I create an object that represents the entire website, it will be much easier for me to maintain the user's input, including database interaction.
Simple question: how much data should be stored there?
example
One of the things I want to implement is
objects that relate to database tables.
Let's take a page for a "car.update()".
Now suppose I store an object for it that extends a database connection, with methods
for CRUD.
When I handle a postback from that page with details, I could just put them in the relevant properties and call the update method.
Situation now: I need to create a new object with those details and make a static update.
Another example
Storing the previous search results and filtering them using new data.
In many cases the ideal amount would be none. Store the username in a cookie along with an HMAC hash used to verify the cookie was created by your site, and get everything else from the database (or cache) as needed. This makes it easy to load balance across servers because any server can handle any request and there's no state that needs to be shared between them.
This approach wouldn't be appropriate for banking or other top-security uses because if someone gets your cookie they connect as you. But for sites where you're not doing anything super critical it's great. The risk can also be mitigated somewhat by adding an expiration mechanism to your cookie handling. See chubbard's great answer related to another HMAC question for more info.
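A minimal sketch of such a signed cookie with an expiry, using PHP's built-in hash_hmac() and hash_equals(). The cookie layout ("username|expiry|hmac"), the function names and the COOKIE_SECRET constant are assumptions for illustration, and the sketch assumes usernames never contain the "|" character.

```php
<?php
// Hypothetical secret; in practice load it from configuration, not source code.
const COOKIE_SECRET = 'change-me';

/** Build a cookie value of the form "username|expiry|hmac". */
function makeAuthCookie(string $username, int $expiresAt, string $secret = COOKIE_SECRET): string
{
    $payload = $username . '|' . $expiresAt;
    $mac = hash_hmac('sha256', $payload, $secret);
    return $payload . '|' . $mac;
}

/** Return the username if the cookie is authentic and unexpired, else null. */
function verifyAuthCookie(string $cookie, int $now, string $secret = COOKIE_SECRET): ?string
{
    $parts = explode('|', $cookie);
    if (count($parts) !== 3) {
        return null; // malformed cookie
    }
    [$username, $expiresAt, $mac] = $parts;

    $expected = hash_hmac('sha256', $username . '|' . $expiresAt, $secret);
    // hash_equals() compares in constant time to avoid timing leaks.
    if (!hash_equals($expected, $mac)) {
        return null; // tampered with, or signed with a different secret
    }
    if ((int) $expiresAt < $now) {
        return null; // authentic but expired
    }
    return $username;
}

// Usage: issue a cookie valid for one hour, then check it on a later request.
$cookie = makeAuthCookie('alice', time() + 3600);
$user = verifyAuthCookie($cookie, time()); // 'alice'
```

Because the expiry is part of the signed payload, a client cannot extend it without invalidating the HMAC.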
Note that you can switch the way PHP stores session data using session_set_save_handler. You then don't have to change any of your session calls, and you gain the performance and maintenance benefits of a database.
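As a sketch of what a custom handler looks like, the class below implements PHP's SessionHandlerInterface with PHP 8 signatures. The class name is hypothetical, and a plain array stands in for the database table; it does no locking and forgets everything when the process ends, so it only shows the shape of the interface.

```php
<?php
/**
 * Minimal custom session handler (PHP 8 signatures).
 * The array $rows stands in for a database table; a real handler
 * would run SELECT / REPLACE / DELETE statements instead.
 */
final class ArraySessionHandler implements SessionHandlerInterface
{
    /** @var array<string, array{data: string, time: int}> */
    private array $rows = [];

    public function open(string $path, string $name): bool
    {
        return true; // nothing to connect to in this sketch
    }

    public function close(): bool
    {
        return true;
    }

    public function read(string $id): string|false
    {
        // An unknown session ID must yield an empty string, not an error.
        return $this->rows[$id]['data'] ?? '';
    }

    public function write(string $id, string $data): bool
    {
        $this->rows[$id] = ['data' => $data, 'time' => time()];
        return true;
    }

    public function destroy(string $id): bool
    {
        unset($this->rows[$id]);
        return true;
    }

    public function gc(int $max_lifetime): int|false
    {
        // Drop sessions not written within the last $max_lifetime seconds.
        $before = count($this->rows);
        $cutoff = time() - $max_lifetime;
        $this->rows = array_filter(
            $this->rows,
            fn (array $row): bool => $row['time'] >= $cutoff
        );
        return $before - count($this->rows);
    }
}

// Registering it (before session_start()) leaves all $_SESSION code unchanged:
// session_set_save_handler(new ArraySessionHandler(), true);
```

PHP calls read() when the session starts and write() when the request ends, which is why the rest of the application keeps using $_SESSION as before.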
The minimum would be the user ID, assuming it is a login-type interface. But it is often helpful to include the most commonly used related data, like the user's permissions and other items which are stored in the database but are frequently referenced when constructing pages.
You shouldn't store an enormous amount of data, but you can without problems store some user information if it helps you serve your pages faster.
But if you want to build a more dynamic website, you will probably retrieve more and more data from the database. So when you're connecting to a database anyway, you could skip storing all kinds of information in the session, because you can just as well get it from the database. Repeated lookups are usually cheap: older MySQL versions had a query cache that made repeated queries very fast (it was removed in MySQL 8.0), and a lookup by primary key is fast in any case.
So in that case you'll need to save little more than the userid and maybe a small amount of flags.
I am thinking about using a NoSQL database (MongoDB) paired with memcached to store sessions within my webapp. The idea is that upon each page load, the user data is compared to the data in memcached, and if something has changed, the data would be written to both memcached and MySQL. This way the reads would be greatly reduced and memcached utilized to do what it does best.
However, I am a bit concerned about using a non-ACID database for session storage, especially with the memcached layer. Let's say something goes wrong while updating the session in the DB, and our users get an instant headache wondering why the product they put in the cart doesn't show up...
What's an appropriate approach to this? Should we go for MySQL session storage, or is it fine to keep a non-ACID database for sessions?
Thanks!
I'm using MongoDB as session storage currently. It is possible to avoid race conditions mentioned by pilif. I found a class that implements a session handler for MongoDB (http://www.jqueryin.com/projects/mongo-session/) and forked it on github to suit my needs (http://github.com/halfdan/MongoSession).
If you don't want to lose your data, stick with ACID tested databases.
What's the payoff you're looking for?
If you want a secure system, you can't trust anything from the user, save for perhaps selected integers, so letting them store the information is typically a really bad idea.
I don't see the payoff for storing sessions outside of your MySQL database. You can cron cleanup on the tables if that's your concern, but why bother? Some users will shop on a site and then get distracted for a while. They would then come back a day or two later.
If you use cookies or something really temporary to store their session info, there is a really good chance their shopping time was wasted. Users really value their time... so if you stored their session info in the database, you can write something sexy to manage that data.
Plus, the nice side effect of this is that you'll generate a lot of residual information about what people like on your website that wouldn't perhaps be available to you later on. Like you could even consider some of it to be like a poll or something where the items people are adding to their cart could impact how you manage your business, order inventory or focus your marketing.
If you go with something really temporary then you lose out on getting residual benefits.
Without any locking on the session, be really, really careful of what you are storing. Never ever store anything that is dependent on what you have read before as the data might change between you reading and writing - especially in case of ajax where multiple requests can go out at once.
An example of what you must not store in a non-locked session would be a shopping cart as, to add a product, you have to read, unserialize, add the product and then serialize again. If any other request does the same thing between the first request's read and write, you lose the second request's data.
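The lost update can be reproduced without a web server. In this sketch a plain variable, $store, plays the role of an unlocked session backend, and the two "requests" are interleaved by hand; the product names are made up.

```php
<?php
// $store plays the role of a session backend WITHOUT locking:
// it just holds the last serialized blob written for the session.
$store = serialize(['cart' => []]);

// The first request reads the session...
$first = unserialize($store);

// ...and before it writes, a second request reads, adds a product and saves.
$second = unserialize($store);
$second['cart'][] = 'lamp';
$store = serialize($second);

// The first request now adds its own product to its stale copy and saves,
// overwriting what the second request wrote.
$first['cart'][] = 'book';
$store = serialize($first);

$final = unserialize($store);
print_r($final['cart']); // contains only 'book'; the 'lamp' is gone
```

With PHP's default file handler the second request would block at session_start() until the first one finished, so both products would survive.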
Have a look at this article for detail: http://thwartedefforts.org/2006/11/11/race-conditions-with-ajax-and-php-sessions/
Keep sessions on your filesystem (where PHP locks them for you), in your database (where you have to do manual locking) or never, ever, write anything of value to your session if that value is derived from a previous read.
When using memcached as a cache for a database, it is up to you to ensure data consistency between the database and the cache. If you want to scale up and add more servers, there is a chance of the cache getting out of sync with the database even if everything seems OK.
Instead you may consider Hazelcast. As of 1.9 it also supports the memcache protocol. Unlike memcached, Hazelcast has you implement a Map Persister, and Hazelcast itself then updates the database for the changed entries. This way you don't have to handle "check cache, if data changed update database" kind of stuff.
If you write your app so that the user stores all session information client side, then you just verify that information as needed, you won't need to worry about sessions on the server side. This is one of the principles in REST style architecture. For instance, if the user is requesting adding an item to their shopping cart, just store the itemID list and count on the client side. When you hit the cart page, you can easily look up the item information from the list of itemIDs they are telling you are in their cart.
During checkout, go directly against the database with transactions to ensure you aren't getting any race conditions, and check your live inventory. If inventory isn't there when they go to check out, just say, "sorry, we just sold out". Of course, at that point you should go update any caches you have out there that are telling people you have inventory.
I would look at how much the user costs to acquire and then ask what the cost is for implementing a really good system. Keep in mind that users are a biological retry method. "I'm bored... press reload again..." While this isn't the most perfect solution, it is sometimes acceptable when weighed against the cost of "not lose anything - ever".
If you want additional security, you can have your sessions cached to a separate set of memcache servers so there are no accidental flushes. :)
There are a number of other systems, such as membase.org and some other persistent memcache solutions (Java implementations), that will persist storage to disk. If you want to modify your client somewhat, or how you access memcache, you can do your own replication of memcache session objects.
-daniel
I'm at a crossroads, not exactly sure what is better to use. Right now I'm using session arrays to store calculations on the fly, but I need to switch to objects because I would like to have some functions. I was also considering using AJAX to send the data back and forth to the database on the fly, but I'm worried it might be too many calculations under heavy traffic.
What I'm trying to do is have a cart where items are added, and all of the items in the cart will need to be recalculated if a quantity field is changed. So I was wondering which is the better solution: to use objects and sessions, or to update a large table in the DB with multiple users manipulating the data.
I'm using a MySQL DB, if that helps in the decision.
The session should be much faster, but it will not persist if the user closes their browser. I use sessions for that sort of thing (shopping carts, last accessed searches, etc) most of the time. On a site with a medium-large number of users, you'll need to do maintenance on a table to clean out old session information. That's too much of a headache for me unless I really need the data to be there days later.
Well keep in mind that sessions themselves can be stored in the database. So regardless of what you choose, you shouldn't try to roll your own database-based sessions.
Keep in mind, though, that standard PHP sessions store the session data on disk. If you have many users on your site with many sessions, it's possible that you could bog down your box with too much disk I/O. This would be less of a problem if you had an SSD, however.
Another option I've used before is to simply have a MySQL database on the same box as the server, and store the data in a memory only table. I think this is the best of both worlds:
It's faster than a traditional MyISAM or InnoDB database because it doesn't ever get dumped to disk (this mostly only affects session writes)
Assuming your database is on a different box than your web server, it's faster because your session DB is on the same box.
You can perform some SQL wizardry on your session store if you need (can't do this with files)
I create a cart table which stores a SESSION ID and the corresponding item id.
The session id would be used to associate the browsing user. Calculating the price would be as easy as retrieving all rows with the users session id and summing the price from the item table (for example). Any changes to the cart involve a simple insert or delete for each item.
For doing many different calculations, I'd go with session... This really is the most bandwidth friendly way. If you would like to persist their cart without using cookies, then you can store their cart in the database when they leave and reload it when they come back (store the cart based on userID obviously).