Session storage vs non-session storage in PHP - php

As part of an effort to implement username / password authentication for websites, I am using PHP sessions. I read a very interesting article called PHP Sessions in Depth.
One thing that was not addressed in the article (as it was not for novices like me) is why data would be stored in the session as opposed to a non-session database other than for the reason of knowing which user the session is for.
For example the following is given as an example of data that might be in session storage:
[ "theme" => "blue",
"volume" => 100 ]
The article goes on to discuss race conditions that can result when a large amount of this type of data is being changed in session storage. What I want to know is: why not just store it in a non-session database along with the user's ID? Only the user's ID would then be kept in session storage, so that it persists between HTTP requests. I have taken the latter approach, but my websites are very simple and I am wondering whether I might run into problems if they become more complex.

You are absolutely correct. Storing too much data directly in a session can cause multiple problems:
Performance, since PHP serialises the entire session to a string when saving, and unserialises the whole thing on the next page load
Stale data, if the session isn't the "source of truth" for some information, just an extra copy of data retrieved from somewhere else, like a primary database
Race conditions as such are not generally an issue with sessions as designed, because they are intended to be "locked" for the entire duration of a request. If the same user makes two requests at once, the second one just waits for the first to complete and unlock the session. However, if you write your own session handler that doesn't lock, or make heavy use of session_start and session_write_close, you won't have that guarantee.
The main reason for putting extra data into the session is to use it as a sort of cache: if every single page says "Hello $username" at the top, storing the username as well as the ID in the session saves you a round trip to the database to fetch it every time. So there is a balance to be found, and you can't 100% say that storing the absolute minimum is always optimal.
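As a minimal sketch of that cache idea: the session holds the user ID, and the username is filled in on first use. Here fetchUsernameFromDb and its hard-coded user list are hypothetical stand-ins for a real database query, not something taken from the article.

```php
<?php
// Hypothetical lookup standing in for a real database query.
function fetchUsernameFromDb(int $userId): string
{
    $users = [1 => 'alice', 2 => 'bob']; // stand-in for a users table
    return $users[$userId] ?? 'guest';
}

// Returns the display name, using the session array as a cache.
// $session is passed by reference so the function works with a plain
// array (as here) as well as with $_SESSION.
function currentUsername(array &$session): string
{
    if (!isset($session['user_id'])) {
        return 'guest'; // nobody is logged in
    }
    if (!isset($session['username'])) {
        // Cache miss: one database round trip, then remember the result.
        $session['username'] = fetchUsernameFromDb($session['user_id']);
    }
    return $session['username'];
}

$session = ['user_id' => 2];
echo 'Hello ', currentUsername($session), "\n";
```

On a real page you would call session_start() first and pass $_SESSION instead of the local array. Whenever the username changes in the database, the cached copy has to be cleared as well (for example with unset($_SESSION['username'])), otherwise you get exactly the stale-data problem listed above.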

Related

My Cache solution to store data before they are sent to the DB is unreliable

So I have a website with Flash games; each time a user gets an answer correct or wrong, the values and the timestamps are sent to the database.
I wanted to reduce the number of database accesses, so I built a solution with APC Cache: it accumulates the values, and it also has a table that stores the references between the users and the sessions that have data to send. When the user logs out/in or changes game, the values are sent to the MySQL DB. For example, if the user shuts down the PC without logging out, the data is stored in the cache until he logs in again. But I found out that APC Cache is very unreliable; sometimes it deletes the values without warning.
Is there any PHP cache where I can achieve that, or another similar solution?
Caches are... well, they are caches. They're not meant for information that is going to live forever, and write caches need to be flushed at regular intervals. Both memcached and APC will evict entries when the cache is full (or when the TTL expires); if they didn't, they'd be in-memory databases instead.
Without knowing the details of why you're having trouble storing data directly in the database, you could use a simple table without indices and one row for each entry to store data. The insert time should be negligible, and you can process the data every five / ten / fifteen minutes and move it into its proper location. This will give you more permanence, at least. That's probably the simplest way of doing stuff with the stack you have today.
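The staging-table idea could look roughly like the sketch below. Everything in it is an assumption made for illustration: the table names answer_log and user_scores, the column layout, and the in-memory SQLite connection (used only so the sketch is self-contained; on MySQL the id column would be AUTO_INCREMENT instead of AUTOINCREMENT).

```php
<?php
// Assumption: any PDO connection works; in-memory SQLite keeps the sketch self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical schema. answer_log is the staging table (its auto-increment id
// is the only index); user_scores stands in for the "proper location".
$pdo->exec('CREATE TABLE answer_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id INTEGER NOT NULL,
    correct INTEGER NOT NULL,
    answered_at INTEGER NOT NULL
)');
$pdo->exec('CREATE TABLE user_scores (
    user_id INTEGER PRIMARY KEY,
    correct INTEGER NOT NULL,
    wrong INTEGER NOT NULL
)');

// Called by the game on every answer: one cheap INSERT and nothing else.
function logAnswer(PDO $pdo, int $userId, bool $correct, int $timestamp): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO answer_log (user_id, correct, answered_at) VALUES (?, ?, ?)'
    );
    $stmt->execute([$userId, (int) $correct, $timestamp]);
}

// Called every few minutes (e.g. from cron). Returns the number of users updated.
function flushAnswerLog(PDO $pdo): int
{
    // Only process rows that exist right now; rows logged while the flush is
    // running get a higher id and are left for the next run.
    $maxId = $pdo->query('SELECT MAX(id) FROM answer_log')->fetchColumn();
    if ($maxId === null || $maxId === false) {
        return 0; // nothing staged
    }
    $maxId = (int) $maxId;

    $pdo->beginTransaction();
    try {
        $select = $pdo->prepare(
            'SELECT user_id, SUM(correct) AS c, SUM(1 - correct) AS w
             FROM answer_log WHERE id <= ? GROUP BY user_id'
        );
        $select->execute([$maxId]);
        $rows = $select->fetchAll(PDO::FETCH_ASSOC);

        $update = $pdo->prepare(
            'UPDATE user_scores SET correct = correct + ?, wrong = wrong + ? WHERE user_id = ?'
        );
        $insert = $pdo->prepare(
            'INSERT INTO user_scores (user_id, correct, wrong) VALUES (?, ?, ?)'
        );
        foreach ($rows as $row) {
            $update->execute([$row['c'], $row['w'], $row['user_id']]);
            if ($update->rowCount() === 0) {
                // First time we see this user: create the totals row.
                $insert->execute([$row['user_id'], $row['c'], $row['w']]);
            }
        }

        $delete = $pdo->prepare('DELETE FROM answer_log WHERE id <= ?');
        $delete->execute([$maxId]);
        $pdo->commit();
        return count($rows);
    } catch (Throwable $e) {
        $pdo->rollBack();
        throw $e;
    }
}
```

The game code would call logAnswer($pdo, $userId, true, time()) on every answer, and a scheduled script would call flushAnswerLog($pdo). Because the staged rows live in a real table, nothing is lost if the user closes the browser or the cache is evicted.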
You can also look into other solutions, such as Redis, message queues (RabbitMQ, Gearman, etc.) and a whole slew of other technologies. The important part is to avoid using technology made for non-permanent data to store permanent data.

Database storage vs Cookies - To store form data

Good day to all,
I have a form with around 90 to 100 fields, divided into sub forms, which are loaded using AJAX each time a form has to be displayed. I would like to retain the data in the form fields every time a sub form is loaded (say, on an accidental page refresh, or when the user switches between sub forms). What is the best way to do this?
I was thinking that I could store it in cookies or in the database. But storing it in the database would mean that I would have to query for the field data every time a sub form is loaded. And with cookies, the data stored in the cookie files has to be read each time. I need some help deciding which is the most efficient way, mostly in terms of speed.
What is the best way among these, or is there any other possibility to retain the fields data in the sub forms, every time they are loaded (which are loaded via AJAX each time.)
I am working with PHP and Codeigniter framework.
Thanks!!
A form like that needs to be durably stored. I would consider session state to smooth out the sub form loads, with writes to the database whenever the user updates something of consequence. Personally, I would start with a database-only solution, and then add session state if performance is an issue.
Cookies aren't meant to store large amounts of data. Even if it were possible, they bloat the request considerably (imagine 100 form fields all being transmitted with every request).
Local storage in the browser is also an option, though I would consider other options first.
I would first simplify it by using serialize:
$data = serialize(array_merge($olddata, $_POST)); // later arrays win, so newly posted values override the old ones
Then that may be enough for you, but it's now super easy to store it anywhere since it is just a string. To reform it into its original state:
$data = unserialize($data);
.. wherever you end up pulling it from - database,session,etc..
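A small round trip shows the idea end to end. This is only a sketch: mergeFormData and the sample field names are made up for illustration, and passing allowed_classes => false to unserialize is an extra precaution I am assuming you want, since the stored string should only ever contain plain arrays and scalars.

```php
<?php
// Merge one sub form's fields into the data saved so far.
// Later values win, so re-submitting a sub form overwrites its old fields.
function mergeFormData(array $saved, array $submitted): array
{
    return array_merge($saved, $submitted);
}

$saved     = ['first_name' => 'Ann', 'city' => 'Oslo'];     // e.g. loaded from storage
$submitted = ['city' => 'Bergen', 'phone' => '555-0100'];   // e.g. taken from $_POST

$merged = mergeFormData($saved, $submitted);

// A plain string, ready for a TEXT column or any other storage.
$stored = serialize($merged);

// Restore the array; refuse to instantiate objects from the stored string.
$restored = unserialize($stored, ['allowed_classes' => false]);

print_r($restored);
```

If the data ends up in $_SESSION you can skip the explicit serialize call and assign the merged array directly, because PHP serialises the session contents for you when the session is written.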
Pros of database
It can be accessed from other computers too
You can store far more data than in a cookie
Cons of database
If you retrieve the data via AJAX, it could cause more load on the server
Pros of cookies
Faster than a database: no query, fetch and so on
Cons of cookies
Limited amount of space
However, you can also use local storage
So my answer is database storage

Storing user data in sessions - from DB

If I have a login system or something similar, I store a session_id and a user_id in sessions, but any other data pertaining to a certain user is stored in a database. I've seen other scripts where people store other data (username, email etc) in sessions.
I was just wondering, which would be "better"? Copying data from the DB into the session, or keeping less in the session and grabbing the data from the database?
Thanks!
You can store whatever information you like in $_SESSION. I believe it can be up to 128 MB: the limit is governed by memory_limit, which is 128M by default. You could change this.
However, as a rule of thumb, I'd only store information that is pertinent and that is cheaper to keep in the session than to query from the database. Put another way: as little as possible.
It will no-doubt vary widely by use, but often, sessions contain things like:
Username
Full display names
Email address
Id's (user or otherwise)
Permissions
User groups
Hashes
Form input errors (temporarily, to highlight form errors)
Storing large blocks of data/info isn't advised though for reasons of speed/scale.
If your site/platform needs to scale at a later date, at the right point, you'd be better off looking at write-through caching or similar for frequently used/required data (e.g. Memcached) and storing the vast majority of data in your DB, where it should be.
Hope this helps.
The answer is it depends and in your case, it probably doesn't even matter.
Session approach
Fewer queries = faster
DB approach
Less data in session prevents clobbering
Updates to the DB are reflected immediately without having to worry about simultaneously updating the session
Practice shows it is better to keep data in the database, both for medium-sized and larger projects (a server farm, or really LOTS of data in the session) and to enhance security for any sort of project (e.g. on shared hosting). Even the user ID should not be kept in $_SESSION. Hashes, flash messages, quick settings: that's what ought to be in $_SESSION.
But if you still have a question "Do I need to save session in DB" then most probably you should not keep it in DB.

Session variables vs. Mysql table

I'm planning on saving my users' often-used parameters, i.e. name, picture, etc., in session variables as opposed to pulling them from the MySQL table each time they are needed. Saving often-used parameters in variables as opposed to a database should in theory be more efficient, but because I'm not sure how SESSION variables are saved I'm not too sure if this is true. Does anyone know if pulling info from a SESSION variable is more efficient than querying the MySQL table?
The term variable is used loosely as SESSION "variables" are stored in files in the server's temporary directory.
You would think reading files is more costly than reading a database, I mean that is what a database is essentially, a file, but it is optimized for this purpose as opposed to "temporary session files"
Yes, pulling information from a session variable is more efficient than querying a database for that info. However, loading the information INTO the session variables requires reading a file off your server's file system and into RAM, which, depending on many factors (disk speed, IO load, DB speed, etc.), might be slower or faster than reading the same information from a DB. Without information on your specific setup, it's hard to say. One thing to keep in mind: if you plan on growing and using more than one web server, you will need to write a custom session handler that stores your sessions on a central server (possibly a database), in memcached, or on a shared mount point where all your web servers can fetch the session files.
In the end, putting something into the session and using it from there can be more efficient than loading it from the DB every time, but you are still loading it from somewhere, and so, knowledge of your hardware and your setup will be your best guide.
The default Session handler for PHP stores that info to disk; one unique temporary file per session. The issues you may come across are if the disk/file system gets overloaded, or if your data becomes stale.
If you're making a trip to disk to access the session, there is slightly less overhead than accessing MySQL, but you're still making a trip to disk upon every page request. You can try to use an in-memory Session handler.
Session variables are preferred for persisting a relatively small amount of temporary data. They're good for "sessions".
Use a database for everything else. Especially for:
larger amounts of data,
for any kind of "transaction", or
for data that needs to be persisted between "sessions".
This article is somewhat dated, and it doesn't apply to PHP per se ... but it should give you some idea about the relative efficiencies of filesystem (e.g. NTFS) vs database (e.g. MSSQL):
To Blob or Not To Blob: MS Research white paper
Yes it's more efficient to use session variables.
Typically session variables are stored on the server in the /tmp directory (you can check your PHP Info file to see how yours is configured).
And because they're stored on the server, you can assume they're just as secure as the rest of your server.
Yes, it is more efficient. The session is saved on the server. However, with or without sessions you need to check whether the user is logged in and whether the user has the correct SESSION ID. It also depends on the number of your columns, rows and many other things.

Handling sessions without ACID database?

I am thinking about using a NoSQL database (MongoDB) paired with memcached to store sessions within my webapp. The idea is that upon each page load, the user data is compared to the data in memcached and, if something has changed, the data would be written to both memcached and MySQL. This way the reads would be greatly reduced and memcached utilized to do what it does best.
However I am a bit concerned about using a non-ACID database for session storage especially with the memcached layer. Let's say something goes wrong while updating the session to the DB and our users got instant headache wondering why their product that they put in the cart doesn't show up...
What's an appropriate approach to this? Should we go for MySQL session storage, or is it fine to keep a non-ACID database for sessions?
Thanks!
I'm using MongoDB as session storage currently. It is possible to avoid race conditions mentioned by pilif. I found a class that implements a session handler for MongoDB (http://www.jqueryin.com/projects/mongo-session/) and forked it on github to suit my needs (http://github.com/halfdan/MongoSession).
If you don't want to lose your data, stick with ACID tested databases.
What's the payoff you're looking for?
If you want a secure system, you can't trust anything from the user, save for perhaps selected integers, so letting them store the information is typically a really bad idea.
I don't see the payoff for storing sessions outside of your MySQL database. You can cron cleanup on the tables if that's your concern, but why bother? Some users will shop on a site and then get distracted for a while. They would then come back a day or two later.
If you use cookies or something really temporary to store their session info, there is a really good chance their shopping time was wasted. Users really value their time... so if you stored their session info in the database, you can write something sexy to manage that data.
Plus, the nice side effect of this is that you'll generate a lot of residual information about what people like on your website that wouldn't perhaps be available to you later on. Like you could even consider some of it to be like a poll or something where the items people are adding to their cart could impact how you manage your business, order inventory or focus your marketing.
If you go with something really temporary then you lose out on getting residual benefits.
Without any locking on the session, be really, really careful of what you are storing. Never ever store anything that is dependent on what you have read before as the data might change between you reading and writing - especially in case of ajax where multiple requests can go out at once.
An example of what you must not store in a non-locked session would be a shopping cart, because to add a product you have to read, unserialize, add the product and then serialize again. If any other request does the same thing between the first request's read and write, you lose the second request's data.
Have a look at this article for detail: http://thwartedefforts.org/2006/11/11/race-conditions-with-ajax-and-php-sessions/
Keep sessions on your filesystem (where PHP locks them for you), in your database (where you have to do manual locking), or never, ever write anything of value to your session if that value is derived from a previous read.
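The lost update is easy to reproduce without any real session at all. The sketch below simulates a non-locking session store with a plain array of serialized strings; readSession, writeSession and the session ID 'sess1' are made-up names for illustration only.

```php
<?php
// Stand-in for a session store without locking: it only holds serialized strings.
$store = ['sess1' => serialize(['cart' => []])];

function readSession(array $store, string $id): array
{
    return unserialize($store[$id], ['allowed_classes' => false]);
}

function writeSession(array &$store, string $id, array $data): void
{
    $store[$id] = serialize($data);
}

// Request 1 reads the session...
$first = readSession($store, 'sess1');

// ...and before it writes, request 2 reads, adds a product and writes.
$second = readSession($store, 'sess1');
$second['cart'][] = 'lamp';
writeSession($store, 'sess1', $second);

// Request 1 now adds its product to the stale copy and writes it back,
// silently overwriting what request 2 stored.
$first['cart'][] = 'book';
writeSession($store, 'sess1', $first);

$final = readSession($store, 'sess1');
print_r($final['cart']); // only 'book' is left; the 'lamp' from request 2 is gone
```

With PHP's default file-based sessions this interleaving cannot happen, because request 2 would block in session_start() until request 1 has written and released the session.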
When using memcached as a cache for a database, it is the user who has to ensure data consistency between the database and the cache. If you want to scale up and add more servers, there is a chance that the cache gets out of sync with the database even if everything seems OK.
Instead you may consider Hazelcast. As of 1.9 it also supports the memcache protocol. Unlike memcached, Hazelcast has you implement a Map Persister and then updates the database itself for the changed entries. This way you don't have to handle the "check cache, if data changed update database" kind of stuff.
If you write your app so that the user stores all session information client side, then you just verify that information as needed, you won't need to worry about sessions on the server side. This is one of the principles in REST style architecture. For instance, if the user is requesting adding an item to their shopping cart, just store the itemID list and count on the client side. When you hit the cart page, you can easily look up the item information from the list of itemIDs they are telling you are in their cart.
During checkout, go directly against the database with transactions to ensure you aren't getting any race conditions, and check your live inventory. If inventory isn't there when they go to check out, just say, "sorry, we just sold out". Of course, at that point you should go update any caches you have out there that are telling people you have inventory.
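One way to do that check, sketched under assumptions: the inventory table, its columns and the checkout function are all hypothetical, and the technique shown (a conditional UPDATE whose affected-row count tells you whether stock was available) is my choice, not something the answer prescribes. SQLite in memory is used only to keep the sketch self-contained.

```php
<?php
// Assumption: any PDO connection works; in-memory SQLite keeps the sketch self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical inventory table with two items in stock.
$pdo->exec('CREATE TABLE inventory (item_id INTEGER PRIMARY KEY, stock INTEGER NOT NULL)');
$pdo->exec('INSERT INTO inventory (item_id, stock) VALUES (1, 5), (2, 1)');

// $cart maps item ID => quantity, as sent by the client.
// Returns true if every item could be reserved, false if something sold out.
function checkout(PDO $pdo, array $cart): bool
{
    // The WHERE clause makes "check stock" and "decrement stock" one atomic step.
    $reserve = $pdo->prepare(
        'UPDATE inventory SET stock = stock - ? WHERE item_id = ? AND stock >= ?'
    );

    $pdo->beginTransaction();
    try {
        foreach ($cart as $itemId => $quantity) {
            $reserve->bindValue(1, (int) $quantity, PDO::PARAM_INT);
            $reserve->bindValue(2, (int) $itemId, PDO::PARAM_INT);
            $reserve->bindValue(3, (int) $quantity, PDO::PARAM_INT);
            $reserve->execute();
            if ($reserve->rowCount() !== 1) {
                // Unknown item or not enough stock: undo earlier reservations.
                $pdo->rollBack();
                return false; // "sorry, we just sold out"
            }
        }
        $pdo->commit();
        return true;
    } catch (Throwable $e) {
        if ($pdo->inTransaction()) {
            $pdo->rollBack();
        }
        throw $e;
    }
}
```

Calling checkout($pdo, [1 => 2, 2 => 1]) would reserve both items; a second identical call fails on item 2 and rolls back the decrement of item 1, so no partial order is left behind. Since the cart comes from the client, the quantities should also be validated as positive integers before this point.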
I would look at how much a user costs to acquire and then ask what it costs to implement a really good system. Keep in mind that users are a biological retry method. "I'm bored... press reload again..." While this isn't the most perfect solution, it is sometimes acceptable when compared with the cost of "not lose anything - ever".
If you want additional security, you can have your sessions cached to a separate set of memcache servers so there are no accidental flushes. :)
There are a number of other systems, such as membase.org and some other persistent memcache solutions (Java implementations), that will persist storage to disk. If you want to modify your client somewhat, or how you access memcache, you can do your own replication of memcache session objects.
-daniel
