Maintaining state - object serialization through $_SESSION - php

I'm developing my first project and I've done some reading on maintaining state across pages. From the few hours I've spent on the subject, serializing and unserializing objects through $_SESSION seems a pretty simple and effective approach.
However, it seems to get frowned upon because of two main (and often disputed) reasons:
Performance
Security
I have a very typical three-page process within my project: 1> Select Category 2> Input Details 3> Confirmation => Add to database. It makes sense to store my object information within the $_SESSION.
On the performance front, the time to serialize the object was around 4 microseconds and to unserialize was 5 microseconds.
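Timings like these depend heavily on the hardware and on the size of the object, so it is worth measuring on your own data. A minimal sketch of such a measurement follows; the `Listing` class and the function name are hypothetical stand-ins, not the asker's code.

```php
<?php
// Hypothetical value object standing in for the asker's form data.
final class Listing
{
    public string $category = 'books';
    public array $details = ['title' => 'Example', 'price' => 9.99];
}

/** Average time in microseconds for one serialize + unserialize round trip. */
function averageRoundTripMicros(object $value, int $iterations = 10000): float
{
    $start = hrtime(true); // nanoseconds from a monotonic clock (PHP 7.3+)
    for ($i = 0; $i < $iterations; $i++) {
        $copy = unserialize(serialize($value));
    }
    $elapsedNs = hrtime(true) - $start;

    return $elapsedNs / $iterations / 1000;
}

printf("%.2f microseconds per round trip\n", averageRoundTripMicros(new Listing()));
```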
It would appear (from my reading) the preferred approach would be to use the actual datastore but surely to ask the database to save and retrieve this (often partial) information would take far longer and would result in a lot more code?
On the security front I understand the actual session information is stored on the server so isn't this secure?
I realise this has been asked before but the closest I found was asked 7 years ago
PHP: Storing 'objects' inside the $_SESSION
And was looking for more up to date opinions.

TL;DR:
IMHO storing objects in the session is not that much of a big deal if within system requirements
Longer story:
It really depends on the requirements / specifications of the system. E.g. if the webserver is shared and everybody can read the session storage, then I would not store it in the session :P... I would not store a session at all.
On the requirements point of view: is it required that a user can pick up where he left off during the process? If so, I would use a "persistent" store (i.e. the data store). If it is a process which cannot be resumed (only valid during the current session), I would not bother storing it in the datastore.
As security is involved: If your webserver is compromised, chances are they can access the database server as well (as in: get the DB login information from your scripts etc.). Meaning your data is compromised anyway. Either way around they could get the data if they would want to.
My suggestion in your case, based on what you describe: just use the session for it, since it is much easier to implement (less code).
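As a rough illustration of the session-only approach for a three-step form, here is a sketch under some assumptions: the `ListingDraft` class and `currentDraft()` helper are made up for the example, and a plain array stands in for $_SESSION so the snippet runs on its own.

```php
<?php
// Hypothetical object accumulating the three steps of the form.
final class ListingDraft
{
    public ?string $category = null;
    public array $details = [];

    public function isComplete(): bool
    {
        return $this->category !== null && $this->details !== [];
    }
}

/** Fetch the draft from the session array, creating it on first use. */
function currentDraft(array &$session): ListingDraft
{
    if (!isset($session['draft']) || !$session['draft'] instanceof ListingDraft) {
        $session['draft'] = new ListingDraft();
    }
    return $session['draft'];
}

// In a real page, call session_start() first and pass $_SESSION instead.
$session = [];                                              // stand-in for $_SESSION
currentDraft($session)->category = 'books';                 // step 1
currentDraft($session)->details = ['title' => 'Example'];   // step 2

if (currentDraft($session)->isComplete()) {                 // step 3
    // ...insert into the database here, then drop the draft
    unset($session['draft']);
}
```

If the user abandons the form, the draft simply disappears with the session, which matches the "cannot be resumed" case described above.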

Related

Is it a good idea to store small amounts non user data in a PHP session

I am wondering what the best practice way is to store small amounts of data across a site. For example you access an API you retrieve some non sensitive, non user data and you want to use it across your site to add functionality, reference dates of a last event etc.
I have an idea that I could do this via the session variable, to avoid hitting the API every request etc. Is that a good idea or is it bad practice? If it's bad practice what other approaches should I take?
That's fine to do it that way, in fact, that's part of what sessions are all about. If you want to store it long term (after the user destroys the session), store it in the database and reference it when needed.
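One way this could look is sketched below, assuming a hypothetical `rememberInSession()` helper and a stand-in closure instead of the real API call. The expiry time is an addition of the example, so that stale data is eventually refreshed.

```php
<?php
/**
 * Return a cached value from the session array, refreshing it via $fetch
 * when it is missing or older than $ttl seconds.
 */
function rememberInSession(array &$session, string $key, int $ttl, callable $fetch, ?int $now = null)
{
    $now = $now ?? time();
    $entry = $session[$key] ?? null;

    if (is_array($entry) && ($now - $entry['stored_at']) < $ttl) {
        return $entry['value'];
    }

    $value = $fetch();
    $session[$key] = ['value' => $value, 'stored_at' => $now];
    return $value;
}

// Stand-in for the real API call (hypothetical data).
$fetchLastEvent = function (): array {
    return ['name' => 'Launch', 'date' => '2024-01-01'];
};

$session = []; // use $_SESSION after session_start() in a real page
$event = rememberInSession($session, 'last_event', 300, $fetchLastEvent);
echo $event['name'], "\n";
```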
A nice alternative is Memcached, because information put in a session is visible only to the current client, whereas a shared cache is available to every request.
Free & open source, high-performance, distributed memory object
caching system, generic in nature, but intended for use in speeding up
dynamic web applications by alleviating database load.
Here you can find more information about memcache: http://php.net/manual/en/book.memcache.php

PHP - What data should we include in the session?

This is a beginner question...
In a website, what type of data should or should not be included inside the session? I understand that I should not include any info that needs to remain secure. I'm more interested in programming best practice. For example, it is possible to include into the session some data which would otherwise be sent from page to page as dependency injection. Wouldn't that correspond to creating a global variable?
Generally speaking, what kind of data has or hasn't its place inside a session table?
Thanks,
JDelage
The minimum amount of information needed to maintain needed state information between requests.
You should treat your session as a write-once, read many storage. But one which is rather volatile - e.g. the state of your underlying application data should be consistent (or recoverable) if all the sessions suddenly disappeared.
There are some exceptions to this (normally the shopping basket would be stored in the session - but you might want to perform stock adjustments to 'reserve' items prior to checkout). Here items may be added/edited/changed multiple times - so it's not really write-once - but by pre-reserving stock items you are maintaining the recoverability of the database. An implication of this is that you should reverse the stock adjustments when the session expires without the checkout being completed.
If you start trying to store information about the data relating to individual page turns, you're quickly going to get into problems when the user starts clicking on the forward/back buttons or opens a new window.
In general you can put anything you like in a session. It's bad practice to put information in a session that has to be present to make your page run without (technical) errors.
I suggest to minimize the amount of data in your session as much as possible.
Stuff you can save in the session so that you don't have to make another database query for info that isn't going to change: their username, address, phone number, account balance, security permissions on your site, etc.
(This is perhaps more than you're looking for, but might make for good additional information to add to the good answers already posted.)
Since you mention best practices, you may want to look into some projects/technologies which can be used to take the idea of session state a bit further. One common pitfall with horizontally scaling web applications across multiple servers is maintaining session state between them. (User A logs in to Server A which stores the user's session, but on the next request hits Server B which doesn't know about User A's session, etc.)
One of the things I always end up saying to myself and to colleagues is that session by itself isn't really the best place to store data, even if that data is highly transient in nature. A web server is a request/response system, not a data store. It's highly tuned to the former, but not always so great for the latter.
Thus, there are ways to externalize your application's session data (or any stateful data, which should really be kept to a design minimum in the RESTful stateless nature of the web) from your web server and to another system. Memcached is a very common tool for this. There are also drop-in session replacements (or configurable session options for various frameworks/environments) which store the session in a SQL database such as MySQL.
One idea I've been toying with lately is to store session data (well, any transient data where it's OK to lose it in a catastrophe) in a NoSQL database. CouchDB and MongoDB are my current top choices for this, but there's no shortage of other options. CouchDB has excellent horizontal scaling, MongoDB is ridiculously fast when run entirely in-memory, etc.
One of the major benefits of something like this, at least for me, is that deployments can easily become non-events. The web services on any given server can be re-started and the applications therein re-initialized without losing stateful data. If the data is persisted to the disk (that is, not entirely run in-memory) then the server can even be rebooted without losing it. Servers/services can drop in and out of the farm and users would never know the difference.
Additionally, externalizing this data allows you to analyze the data in potentially useful ways. Query it, run metrics on it, interface with it via other web applications or entirely offline tools, etc. It really opens up the options as a project grows in complexity.
(Again, this isn't really intended to answer your question, but rather to just add information that you may find useful. It's something my colleagues and I have been tinkering with as of late and your question seemed like a good place to mention it.)

Approach for authentication and storing user details

I am using the Zend Framework but my question is broadly about sessions / databases / auth (PHP MySQL).
Currently this is my approach to authentication:
1) User signs in, the details are checked in database.
- Standard stuff really.
2) If the details are correct, only the user's unique ID and a security token (user unique ID + IP + Browser info + salt) are stored in the session. The session is written to the filesystem.
I've been reading around and many are saying that storing stuff in sessions is not a good idea, and that you should really only write a unique ID which refers back to the user's details and a security token to prevent session hijacking. So this is the approach I've taken: I used to write the user's details in the session, but I've moved that out. I wanted to know your opinions on this.
I'm keeping sessions in the filesystem since I don't run on multiple servers, and since I'm only writing a tiny bit of data to sessions, I thought that performance would be better keeping sessions in the filesystem to reduce load on the database. Once the session is written on authentication, it is read-only from then on.
3) The rest of the user's details (like subscription details, permissions, account info etc) are cached in the filesystem (this can always be easily moved to memory if i wanted even more performance).
So rather than keeping the user's details in the session, the user's details are cached in the filesystem. I'm using Zend_Cache and the unique cache ID is something like md5(/cache/auth/2892), where the number is the unique ID of the user. I guess the benefit of this method is that once the user is logged in, there are essentially no database queries being run to get the user's details. I just wonder if this approach is better than keeping the whole lot in the session...
4) As the user moves throughout the site the only thing that is checked is the ID in the session and the security token.
So, overall my questions are: 1) is the filesystem more efficient than a database for this purpose, 2) have I taken enough security precautions, 3) is separating the user's details from the session into a cached file a pointless task?
Thanks.
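The security token from step 2 (user ID + IP + browser info + salt) could be sketched roughly as follows. This is an illustration rather than the asker's actual code: the function names are hypothetical, and HMAC-SHA256 is swapped in for whatever hash the asker uses.

```php
<?php
/** Build a token binding a session to a user, IP address and browser. */
function sessionFingerprint(int $userId, string $ip, string $userAgent, string $secret): string
{
    // HMAC keeps the secret ("salt") out of the hashed message itself.
    return hash_hmac('sha256', $userId . '|' . $ip . '|' . $userAgent, $secret);
}

/** True if the token stored at login still matches the current request. */
function fingerprintMatches(array $session, string $ip, string $userAgent, string $secret): bool
{
    if (!isset($session['user_id'], $session['token'])) {
        return false;
    }
    $expected = sessionFingerprint((int) $session['user_id'], $ip, $userAgent, $secret);
    return hash_equals($expected, $session['token']); // constant-time comparison
}

$secret = 'change-me'; // assumption: loaded from configuration in practice
$session = ['user_id' => 2892]; // stand-in for $_SESSION
$session['token'] = sessionFingerprint(2892, '203.0.113.7', 'ExampleBrowser/1.0', $secret);

var_dump(fingerprintMatches($session, '203.0.113.7', 'ExampleBrowser/1.0', $secret));  // same client
var_dump(fingerprintMatches($session, '198.51.100.9', 'ExampleBrowser/1.0', $secret)); // different IP
```

Binding to the IP address can log out legitimate users whose address changes mid-session (mobile networks, for instance), so whether to include it is a judgment call.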
You're asking a range of things.
Sessions
Sessions in PHP are fast and efficient. Thousands of small disk-based sessions on a moderately up-to-date server are not going to be a performance bottleneck. Neither is writing your own handlers (very easy; the PHP manual has examples) to put them in a database.
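To give an idea of what a custom handler involves, here is a minimal sketch of a class implementing PHP's SessionHandlerInterface. It assumes PHP 8 (for the union return types), and it keeps rows in an in-memory array purely as a stand-in for a database table; a real handler would run queries in `read()`, `write()`, `destroy()` and `gc()`.

```php
<?php
final class ArraySessionHandler implements SessionHandlerInterface
{
    /** @var array<string, array{data: string, touched: int}> stand-in for a table */
    private array $rows = [];

    public function open(string $path, string $name): bool
    {
        return true; // a database handler would connect here
    }

    public function close(): bool
    {
        return true;
    }

    public function read(string $id): string|false
    {
        return $this->rows[$id]['data'] ?? ''; // empty string means "no session yet"
    }

    public function write(string $id, string $data): bool
    {
        $this->rows[$id] = ['data' => $data, 'touched' => time()];
        return true;
    }

    public function destroy(string $id): bool
    {
        unset($this->rows[$id]);
        return true;
    }

    public function gc(int $max_lifetime): int|false
    {
        $before = count($this->rows);
        $cutoff = time() - $max_lifetime;
        $this->rows = array_filter(
            $this->rows,
            fn (array $row): bool => $row['touched'] >= $cutoff
        );
        return $before - count($this->rows); // number of sessions removed
    }
}

// Registration has to happen before session_start():
// session_set_save_handler(new ArraySessionHandler(), true);
```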
About the only best-practice rule about sessions is: only give the web browser one thing, the session ID. Putting just the logged-in user ID in the session and retrieving those details from the DB when you need them is also best practice. It also means that user information can be changed and they get it on the next page update.
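A small sketch of the "only the user ID in the session" pattern follows. The `currentUser()` helper and the `$findUser` lookup are hypothetical; an array stands in for the users table so the example is self-contained.

```php
<?php
/**
 * Resolve the logged-in user from the session, or null if nobody is logged in.
 * $findUser is a hypothetical lookup, e.g. a wrapper around a SELECT by primary key.
 */
function currentUser(array $session, callable $findUser): ?array
{
    if (!isset($session['user_id'])) {
        return null;
    }
    return $findUser((int) $session['user_id']);
}

// Stand-in for the users table.
$users = [2892 => ['id' => 2892, 'name' => 'alice', 'role' => 'editor']];
$findUser = function (int $id) use (&$users): ?array {
    return $users[$id] ?? null;
};

$session = ['user_id' => 2892]; // the only thing kept in $_SESSION
echo currentUser($session, $findUser)['role'], "\n";

// A change in the "database" is visible on the very next request.
$users[2892]['role'] = 'admin';
echo currentUser($session, $findUser)['role'], "\n";
```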
It doesn't sound like you will have this problem but beware of just throwing a lot of stuff into a session. A few K of data (say, a few dozen scalars) is fine. Tossing many objects and large arrays of data in there will be noticed. If you do this for a specific page, remember to throw it away in the session once the page is done with it.
You may also want to implement your own login timeout with a session variable. The garbage collection settings in php.ini are intended for managing the storage of session data, not for doing login timeouts.
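A login timeout driven by a session variable might look like the sketch below. The `touchLogin()` name and the `last_activity` key are assumptions made for the example, and a plain array stands in for $_SESSION.

```php
<?php
/**
 * Enforce an idle timeout. Returns true if the login is still valid and
 * refreshes the activity timestamp; clears the session array otherwise.
 */
function touchLogin(array &$session, int $maxIdleSeconds, ?int $now = null): bool
{
    $now = $now ?? time();

    if (!isset($session['user_id'], $session['last_activity'])) {
        return false;
    }
    if ($now - $session['last_activity'] > $maxIdleSeconds) {
        $session = []; // in a real page, also call session_destroy()
        return false;
    }
    $session['last_activity'] = $now;
    return true;
}

$session = ['user_id' => 2892, 'last_activity' => 1000]; // stand-in for $_SESSION
var_dump(touchLogin($session, 1800, 1600)); // 600 seconds idle: still logged in
var_dump(touchLogin($session, 1800, 5000)); // 3400 seconds idle: logged out
```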
Caching
This is a complex topic and you will probably need to start gathering metrics (generally page load times) before implementing anything.
To implement any sort of caching, you do need to consider the lifetime of the data you're caching and how expensive re-generating it will be on a cache miss. Just throwing memcache at the problem is not a solution; you still need to understand your caching parameters and how memcache interprets them. This also applies to any persistent storage solution, including disk-based sessions, but I'm highlighting memcache because it is high-profile and has quite an aggressive expiry mechanism.
An often overlooked example is loading the same data from the database multiple times in a page: a good ORM will do that for you without relying on MySQL query caching. Another overlooked example are small queries that run on every page: caching these for just a few seconds on a moderately busy server and the database load will drop considerably.
Finally, caching at multiple levels is often much more effective and scalable than caching at one level, because the levels can leverage each other's expiries. It also abstracts well: for example, hide it in your ORM and it's theoretically available invisibly and automatically for all your objects.
1) You can easily test which is faster by writing a loop script. Anyway, a drawback of using the filesystem is that you need to update the cached file every time you update the DB. Copies of data are in general a bad thing. Also, unless you have millions of visitors, I don't think there will be any practical difference in speed between the strategies. And, not to forget, sessions are also stored in the filesystem by default: one file for each session.
Is a query faster than the filesystem? It depends. Is query caching enabled? Where MySQL's query cache is available and enabled (it was removed in MySQL 8.0), you might be lucky and only need a memory access. If not, the DB needs to do a filesystem access anyway. Second, how well is your query optimized with indexes? How busy is the server's hard disk?
3) Depends on the speed of fetching it from the DB. In general, caching can do magic for performance, but caching in memory using memcached or something similar would be even better. In general I would avoid copies of the data in files. But of course, if it takes seconds to query the data from the DB, then go for filesystem caching. Also, if you have many users, like 10,000+, you have to set up some folder structure, since putting 10,000 cached files in the same folder slows down the access time.
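The folder structure mentioned in the last point could be derived from the cache ID itself. A sketch, reusing the md5(/cache/auth/<id>) scheme from the question; the function name, base directory and file extension are assumptions:

```php
<?php
/** Spread cache files over subfolders derived from the first hex digits of a hash. */
function shardedCachePath(string $baseDir, int $userId): string
{
    $hash = md5('/cache/auth/' . $userId);

    // Two levels of up to 256 folders each keep any single directory small.
    return sprintf(
        '%s/%s/%s/%s.cache',
        rtrim($baseDir, '/'),
        substr($hash, 0, 2),
        substr($hash, 2, 2),
        $hash
    );
}

echo shardedCachePath('/var/cache/app', 2892), "\n";
```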

PHP: Storing 'objects' inside the $_SESSION

I just figured out that I can actually store objects in the $_SESSION and I find it quite cool because when I jump to another page I still have my object. Now before I start using this approach I would like to find out if it is really such a good idea or if there are potential pitfalls involved.
I know that if I had a single point of entry I wouldn't need to do that but I'm not there yet so I don't have a single point of entry and I would really like to keep my object because I don't lose my state like that. (Now I've also read that I should program stateless sites but I don't understand that concept yet.)
So in short: Is it ok to store objects in the session, are there any problems with it?
Edit:
Temporary summary: By now I understand that it is probably better to recreate the object even if it involves querying the database again.
Further answers could maybe elaborate on that aspect a bit more!
I know this topic is old, but this issue keeps coming up and has not been addressed to my satisfaction:
Whether you save objects in $_SESSION, or reconstruct them whole cloth based on data stashed in hidden form fields, or re-query them from the DB each time, you are using state. HTTP is stateless (more or less; but see GET vs. PUT) but almost everything anybody cares to do with a web app requires state to be maintained somewhere. Acting as if pushing the state into nooks and crannies amounts to some kind of theoretical win is just wrong. State is state. If you use state, you lose the various technical advantages gained by being stateless. This is not something to lose sleep over unless you know in advance that you ought to be losing sleep over it.
I am especially flummoxed by the blessing received by the "double whammy" arguments put forth by Hank Gay. Is the OP building a distributed and load-balanced e-commerce system? My guess is no; and I will further posit that serializing his $User class, or whatever, will not cripple his server beyond repair. My advice: use techniques that are sensible to your application. Objects in $_SESSION are fine, subject to common sense precautions. If your app suddenly turns into something rivaling Amazon in traffic served, you will need to re-adapt. That's life.
It's OK as long as, by the time the session_start() call is made, the class declaration/definition has already been encountered by PHP or can be found by an already-installed autoloader. Otherwise PHP would not be able to deserialize the object from the session store.
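The effect can be reproduced without a session by unserializing a string directly, since that is essentially what session_start() does with stored objects. In this sketch the `Cart` class is hypothetical, and it is declared inside an `if` block only so that PHP does not hoist the declaration above the first unserialize() call.

```php
<?php
// What the session store might contain for an object of a class named "Cart".
$stored = 'O:4:"Cart":1:{s:5:"items";a:1:{i:0;s:4:"book";}}';

// No Cart class is known yet, so PHP falls back to a placeholder object.
$broken = unserialize($stored);
var_dump($broken instanceof __PHP_Incomplete_Class); // bool(true)

// In a real application this would be a require or an autoloader hit
// that happens BEFORE session_start().
if (!class_exists('Cart', false)) {
    final class Cart
    {
        public array $items = [];
    }
}

// With the class available, the same data turns into a usable object.
$cart = unserialize($stored);
var_dump($cart instanceof Cart); // bool(true)
```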
HTTP is a stateless protocol for a reason. Sessions weld state onto HTTP. As a rule of thumb, avoid using session state.
UPDATE:
There is no concept of a session at the HTTP level; servers provide this by giving the client a unique ID and telling the client to resubmit it on every request. Then the server uses that ID as a key into a big hashtable of Session objects. Whenever the server gets a request, it looks up the Session info out of its hashtable of session objects based on the ID the client submitted with the request. All this extra work is a double whammy on scalability (a big reason HTTP is stateless).
Whammy One: It reduces the work a single server can do.
Whammy Two: It makes it harder to scale out because now you can't just route a request to any old server - they don't all have the same session. You can pin all the requests with a given session ID to the same server. That's not easy, and it's a single point of failure (not for the system as a whole, but for big chunks of your users). Or, you could share the session storage across all servers in the cluster, but now you have more complexity: network-attached memory, a stand-alone session server, etc.
Given all that, the more info you put in the session, the bigger the impact on performance (as Vinko points out). Also as Vinko points out, if your object isn't serializable, the session will misbehave. So, as a rule of thumb, avoid putting more than absolutely necessary in the session.
@Vinko You can usually work around having the server store state by embedding the data you're tracking in the response you send back and having the client resubmit it, e.g., sending the data down in a hidden input. If you really need server-side tracking of state, it should probably be in your backing datastore.
(Vinko adds: PHP can use a database for storing session information, and having the client resubmit the data each time might solve potential scalability issues, but opens a big can of security issues you must pay attention to now that the client's in control of all your state)
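One common way to address part of that security concern is to sign the state before sending it to the client, so tampering can be detected when it comes back. The sketch below is one possible approach, not something prescribed in this thread; the function names and the secret are assumptions.

```php
<?php
/** Encode state for a hidden input, with an HMAC so tampering can be detected. */
function packState(array $state, string $secret): string
{
    $payload = base64_encode(json_encode($state));
    return $payload . '.' . hash_hmac('sha256', $payload, $secret);
}

/** Decode state sent back by the client, or null if the signature does not match. */
function unpackState(string $packed, string $secret): ?array
{
    $parts = explode('.', $packed, 2);
    if (count($parts) !== 2) {
        return null;
    }
    [$payload, $signature] = $parts;
    if (!hash_equals(hash_hmac('sha256', $payload, $secret), $signature)) {
        return null;
    }
    $state = json_decode((string) base64_decode($payload, true), true);
    return is_array($state) ? $state : null;
}

$secret = 'change-me'; // assumption: a server-side secret from configuration
$field = packState(['category' => 'books', 'step' => 2], $secret);
echo '<input type="hidden" name="state" value="', htmlspecialchars($field), '">', "\n";

var_dump(unpackState($field, $secret) !== null);       // untouched value: bool(true)
var_dump(unpackState('x' . $field, $secret) !== null); // tampered value: bool(false)
```

Note that the data is signed, not encrypted: the client can still read it, so nothing secret should go in there.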
Objects which cannot be serialized (or which contain unserializable members) will not come out of the $_SESSION as you would expect
Huge sessions put a burden on the server (serializing and deserializing megs of state each time is expensive)
Other than that I've seen no problems.
In my experience, it's generally not worth it for anything more complicated than an StdClass with some properties. The cost of unserializing has always been more than recreating from a database given a session-stored Identifier. It seems cool, but (as always), profiling is the key.
I would suggest don't use state unless you absolutely need it. If you can rebuild the object without using sessions do it.
Having state in your web application makes the application more complex to build: for every request you have to check what state the user is in. Of course there are times when you cannot avoid using the session (example: the user has to be kept logged in during their session on the web application).
Last I would suggest keeping your session object as small as possible as it impacts performance to serialize and unserialize large objects.
You'll have to remember that resource types (such as DB connections or file pointers) won't persist between page loads, and you'll need to invisibly re-create these.
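PHP's __sleep() and __wakeup() magic methods are one way to re-create such resources invisibly. A sketch using a file handle; the `FileLogger` class is hypothetical, and a serialize/unserialize round trip stands in for what the session does between requests.

```php
<?php
/** Hypothetical logger that owns a file handle, which cannot be serialized. */
final class FileLogger
{
    private string $path;
    /** @var resource|null */
    private $handle = null;

    public function __construct(string $path)
    {
        $this->path = $path;
        $this->open();
    }

    private function open(): void
    {
        $this->handle = fopen($this->path, 'a');
    }

    /** Serialize only the path; the handle is left out. */
    public function __sleep(): array
    {
        return ['path'];
    }

    /** Re-create the handle after the object comes back from the session. */
    public function __wakeup(): void
    {
        $this->open();
    }

    public function isOpen(): bool
    {
        return is_resource($this->handle);
    }
}

$logger = new FileLogger(tempnam(sys_get_temp_dir(), 'log'));
$restored = unserialize(serialize($logger));
var_dump($restored->isOpen()); // bool(true)
```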
Also consider the size of the session, depending how it is stored, you may have size restrictions, or latency issues.
I would also bring up upgrading software libraries. We upgraded our software, and the old version had objects in the session with the V1 software's class names. The new software was crashing when it tried to build the objects that were in the session: as the V2 software didn't use those same classes anymore, it couldn't find them. We had to put in some fix code to detect session objects, delete the session if found, and reload the page. The biggest pain initially, mind you, was recreating this bug when it was first reported (all too familiar, "well, it works for me" :) as it only affected people who had been in and out of the old and new systems recently. However, good job we did find it before launch, as all of our users would surely have had the old session variables in their sessions and it would potentially have crashed for all of them. It would have been a terrible launch :)
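Fix code of that kind might look roughly like the sketch below. It is a guess at the general shape, not the poster's actual code: it drops only the affected entries rather than the whole session, the `V1Basket` class name is made up, and a plain array stands in for $_SESSION.

```php
<?php
/**
 * Remove top-level session entries whose class no longer exists.
 * Returns the keys that were dropped.
 */
function dropStaleObjects(array &$session): array
{
    $dropped = [];
    foreach ($session as $key => $value) {
        // PHP substitutes this placeholder when the original class is unknown.
        if ($value instanceof __PHP_Incomplete_Class) {
            unset($session[$key]);
            $dropped[] = $key;
        }
    }
    return $dropped;
}

// Simulate a session written by the old release, which had a class "V1Basket".
$session = [
    'user_id' => 2892,
    'basket'  => unserialize('O:8:"V1Basket":0:{}'),
];

print_r(dropStaleObjects($session));
```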
Anyway, as you suggest in your amendment, I also think it's better to re-create the object. So maybe just storing id and then on each request pulling the object from the database, is better/safer.

Session variables

Hi guys. I'm currently working on a project which uses a lot of data stored in session variables. My question is how reliable this method is and whether it affects server performance and memory usage. Basically, which would you choose between session variables and cookies?
In general, session variables are going to be a lot more secure in the fact that the user cannot edit them locally on his/her machine.
But the real question is: what are you looking to store? With a bit more information we might be able to give you a better answer as to where you would want to store it :)
Edit:
If you are looking to store user actions, I might recommend building a UserActions table or something along those lines. A table that contains the following:
id INT (generic ID for the record),
timestamp TIMESTAMP/DATETIME (whatever your DB supports),
userid INT (lookup to the user table),
action VARCHAR (what action you want to record),
etc etc (whatever else you want to store)
Then when a user performs an action you want to record, just log it into the table itself, instead of making it travel along with the user in a session/cookie. Really the page itself doesn't need to know what actions the user has performed in the past, unless its a "multi-step wizard" type application. In that case, it probably would be best to pass them as a session variable.
Then you are pushing the storage into a true storage component (being the database) instead of session/cookie as storage.
I mean we still don't really have an idea of exactly what you are developing, but I hope it helps.
Session variables are generally preferable to cookies. That said, they are usually stored in the /tmp directory on your web server, which is world-readable and world-writable. This could be breeding ground for mischief if you don't control your server or you run in a shared environment. Not storing sensitive information in session variables, and not relying on them for stuff that has to work is a good practice.
You should only use cookies if you need the data-per-user to persist across sessions. That is to say, if they revisit the site outside of the session expiry time and you need the data there.
Otherwise, if the data is only for their current session, then go ahead and put it in $_SESSION. That's what it's there for.
Session data is usually stored in files on the server or in a database. So how much data there is depends only on your scripts. If you want to store big binary files in the sessions you will probably reach memory limits quickly.
Storing the data in cookies is not always a good idea. This data is visible to the client, who can easily change it, and in some cases that's just something you mustn't allow.
Session variables do not require submission by the user, they are simply loaded based on a session key. Memory usage depends on the session implementation since there is the cost of retrieving the session from your database (or file system, or memory, or w/e).
It's always a tradeoff between keeping information on the server (more memory used) and pushing some of that data off to the client machine (more bandwidth and less secure). As a rule of thumb, I prefer sessions; they are more secure and easy to manage.
When I wrote this question I was thinking of non-sensitive data, with the application being logging user activity on the website. I think that, for a busy server with a large number of users, it's better to use cookies instead, since that offloads server resources (memory, hard-drive I/O). In terms of performance, I think that session variables are a better solution.
Anyway, I don't know how well the session-variable solution will scale.
Using any session variables - at all - means that your application servers need to maintain session state with appropriate synchronisation.
This has an overhead and may negatively affect the scalability of your application, because every server needs to know about (potentially) every session - which is going to mean a lot of cross-server traffic for session data and synchronisation.
While you only have one server, it's ok.
When you get more and more servers geographically distributed, it gets more and more painful.
There is some overhead for the serialisation / unserialisation of the session, but in practice that's not such a problem as it will be relatively fixed per request, hence scalable to high traffic applications.
