Right now I'm stuck between using PHP's native session management, or creating my own (MySQL-based) session system, and I have a few questions regarding both.
Other than session fixation and session hijacking, what other concerns are there with using PHP's native session handling code? Both of these have easy fixes, but yet I keep seeing people writing their own systems to handle sessions so I'm wondering why.
Would a MySQL-based session handler be faster than PHP's native sessions? Assuming a standard (Not 'memory') table.
Are there any major downsides to using session_set_save_handler? I can make it fit my standards for the most part (Other than naming). Plus I personally like the idea of using $_SESSION['blah'] = 'blah' vs $session->assign('blah', 'blah'), or something to that extent.
Are there any good php session resources out there that I should take a look at? The last time I worked with sessions was 10 years ago, so my knowledge is a little stagnant. Google and Stackoverflow searches yield a lot of basic, obviously poorly written tutorials and examples (Store username + md5(password) in a cookie then create a session!), so I'm hoping someone here has some legitimate, higher-brow resources.
Regardless of my choice, I will be forcing a cookie-only approach. Is this wrong in any way? The sites that this code will power have average users, in an average security environment. I remember this being a huge problem the last time I used sessions, but the idea of using in-url sessions makes me extremely nervous.
The answer to 2) is - id depends.
Let me explain: in order for the session handler to function properly you really should implement some type of lock and unlock mechanism. MySQL conveniently have the functions to lock table and unclock table. If you don't implement table locking in session handler then you risk having race conditions in ajax-based requests. Believe me, you don't want those.
Read this detailed article that explains race condition in custom session handler :
Ok then, if you add LOCK TABLE and UNLOCK TABLE to every session call like you should, then the your custom session handler will become a bit slower.
One thing you can do to make it much faster is to use HEAP table to store session. That means data will be stored in RAM only and never written to disk. This will work blindingly fast but if server goes down, all session data is lost.
If you OK with that possibility of session being lost when server goes down, then you should instead use memcache as session handler. Memcache already has all necessary functions to be used a php session handler, all you need to do it install memcache server, install php's memcache extension and then add something like this to you php.ini
[Session]
; Handler used to store/retrieve data.
; http://php.net/session.save-handler
;session.save_handler = files
session.save_handler = memcache
session.save_path="tcp://127.0.0.1:11215?persistent=1"
This will definetely be much faster than default file based session handler
The advantages of using MySQL as session handler is that you can write custom class that does other things, extra things when data is saved to session. For example, let's say you save an object the represents the USER into session. You can have a custom session handler to extract username, userid, avatar from that OBJECT and write them to MySQL SESSION table into their own dedicated columns, making it possible to easily show Who's online
If you don't need extra methods in your session handler then there is no reason to use MySQL to store session data
PHP application loses session information between requests. Since Stanford has multiple web servers, different requests may be directed to different servers, and session information is often lost as a result.
The web infrastructure at Stanford consists of multiple web servers which do not share session data with each other. For this reason, sessions may be lost between requests. Using MySQL effectively counteracts this problem, as all session data is directed to and from the database server rather than the web cluster. Storing sessions in a database also has the effect of added privacy and security, as accessing the database requires authentication. We suggest that all web developers at Stanford with access to MySQL use this method for session handling.
Referred from http://www.stanford.edu/dept/its/communications/webservices/wiki/index.php/How_to_use_MySQL-based_sessions
This may help you.
Most session implementations of languages’ standard libraries do only support the basic key-value association where the key is provided by the client (i.e. session ID) and the value is stored on the server side (i.e. session storage) and then associated to the client’s request.
Anything beyond that (especially security measures) are also beyond that essential session mechanism of a key-value association and needs be added. Especially because these security measures mostly come along with faults: How to determine the authenticity of a request of a certain session? By IP address? By user agent identification? Or both? Or no session authentication at all? This is always a trade-off between security and usability that the developer needs to deal with.
But if I would need to implement a session handler, I would not just look for pure speed but – depending on the requirements – also for reliability. Memcache might be fast but if the server crashed all session data is lost. In opposite to that, a database is more reliable but might have speed downsides opposed to memcache. But you won’t know until you test and benchmark it on your own. You could even use two different session handlers that handle different session reliability levels (unreliable in memcache and reliable in MySQL/files).
But it all depends on your requirements.
They are both great methods, there are some downsides to using MySQL which would be traffic levels in some cases could actually crash a server if done incorrectly or the load is just too high!
I recommend personally given that standard sessions are already handled properly to just use them and encrypted the data you do not want the hackers to see.
cookies are not safe to use without proper precaution. Unless you need "Remember me" then stick with sessions! :)
EDIT
Do not use MD5, it's poor encryption and decryptable... I recommend salting the data and encrypting it all together.
$salt = 'dsasda90742308408324708324832';
$password = sha1(sha1(md5('username').md5($salt));
This encryption will not be decrypted anytime soon, I would considered it impossible as of now.
Related
Is there a more contemporary alternative to PHP session or is PHP session still the main choice to store information? I read this: https://pasztor.at/blog/stop-using-php-sessions. I'm still learning PHP and frankly, I'm quite clueless.
Your first assumption is incorrect. PHP Sessions are not where you store data. Databases, files, Document stores, etc. are where you store your data.
Session "data" is simply the variables included in the $_SESSION array in serialized form. You can run serialize() and unserialize() on variables to gain some insight into what these look like.
In your script, once you have started a session using session_start(), when you add change or delete variables in $_SESSION, php serializes this and stores it for you.
Once a session exists, and a user makes another request that is identified as being the same user (having the same session id) which has typically passed to the client via a cookie, then upon issuing session_start(), PHP reads the serialized data in the session file, and unserializes it, and stores it back into $_SESSION.
By default, PHP will store the individual session data as files on the filesystem. For a small or medium size application, this is highly performant.
So to be clear, what people store in PHP sessions is basically variables read out of whatever other persistent storage you might have, so that you can avoid doing things like re-querying a database to get the name and user_id for a user who has already logged into your application.
It is not the master version of that data, nor the place through which you will update that data should it change. That will be the original database or mongodb collection.
The article you posted has a number of stated and unstated assumptions including:
Devops/Sysadmins just decide to reconfigure PHP applications to change the session handlers (misleading/false)
The deployment involves a load balancer (possibly)
The load balancer doesn't support or use sticky sessions
He then goes on into some detail as to several alternatives that allow for shared session handlers to solve the race conditions he describes
As you stated, you aren't clear yet what sessions actually are, or how they work or what they do for you. The important thing to know about PHP scripts is that they are tied to a single request and sessions are a way of not repeating expensive database reads. It's essentially variable cache for PHP to use (or not) when it suits your design.
At the point you have a cluster, as pointed out in the article people often store data into shared resources which can be a relational database, or any of many other backends which each have different properties that match their goals.
Again, in order to change the session handlers, there is typically code changes being made to implement the session handler functions required, and there are ways to code things that mitigate the issues brought up in the article you posted, for just about every persistence product that people use.
Last but not least, the problems described exist to whatever degree with pretty much any clustered serverside process and are not unique to PHP or its session mechanism.
Usually, that will depends on the use case and other requirements of your application and most of the time people will use PHP frameworks to handle sessions.
Take for example, for Yii 2 framework, the framework provides different session classes for implementing the different types of session storage. Take a look at here https://www.yiiframework.com/doc/guide/2.0/en/runtime-sessions-cookies.
Learning the different types of sessions available out there, allows you to be able to make decisions by weighing the pros and cons. You can also read here for more understanding https://www.phparch.com/2018/01/php-sessions-in-depth/
According to the documentation (http://ellislab.com/codeigniter/user-guide/libraries/sessions.html), the CodeIgniter session library has the following behavior:
"When a page is loaded, the session class will check to see if valid session data exists in the user's session cookie. If sessions data does not exist (or if it has expired) a new session will be created and saved in the cookie. If a session does exist, its information will be updated and the cookie will be updated. With each update, the session_id will be regenerated."
I think this behavior can be dangerous from a security point of view, because somebody could flood the site with requests and that way pollute the session store (which, in my case, is a mysql database). And my app is running on an ordinary web host..
Is there any easy solution to this which does not require too much additional coding? Maybe a library that could substitute for the one that ships with the core? I don't want to code it all myself because I think that would defend the purpose of using a framework.. and I actually don't want to use another PHP framework, since, for my specific requirements, CI is perfect as regards the freedom it gives you...
because somebody could flood the site with requests and that way pollute the session store
So? Then you just have a bunch of sessions in the db. This doesn't affect the validity of sessions. If there is a mechanism to delete old session based on space/time, then those sessions are gone and the former owners of those sessions will need to re-authenticate.
If you are worried about collisions, do a little research and you will find that any collision probability is a function of the underlying operating system and/or PHP itself, so CodeIgniter can't help you there.
Also, maybe disk space fills up but that is an operations/architecture problem, not a CodeIgniter problem and not a security issue in and of itself.
There is certain userdata read from the (MySQL) database that will be needed in subsequent page-requests, say the name of the user or some preferences.
Is it beneficial to store this data in the $_SESSION variable to save on database lookups?
We're talking (potentially) lots of users. I'd imagine storing in $_SESSION contributes to RAM usage (very-small-amount times very-many-users) while accessing the database on every page request for the same data again and again should increase disk activity.
The irony of your question is that, for most systems, once you get a large number of users, you need to find a way to get your sessions out of the default on-disk storage and into a separate persistence layer (i.e. database, in-memory cache, etc.). This is because at some point you need multiple application servers, and it is usually a lot easier not to have to maintain state on the application servers themselves.
A number of large systems utilize in-memory caching (memcached or similar) for session persistence, as it can provide a common persistence layer available to multiple front-end servers and doesn't require long time persistence (on-disk storage) of the data.
Well-designed database tables or other disk-based key-value stores can also be successfully used, though they might not be as performant as in-memory storage. However, they may be cheaper to operate depending on how much data you are expecting to store with each session key (holding large quantities of data in RAM is typically more expensive than storing on disk).
Understanding the size of session data (average size and maximum size), the number of concurrent sessions you expect to support, and the frequency with which the session data will need to be accessed will be important in helping you decide what solution is best for your situation.
You can use multiple storage backends for session data in PHP. Per default its saved to files. One file for one session. You can also use a database as session backend or whatever you wan't by implementing you own session save handler
If you want your application most scalable I would not use sessions on file system. Imagine you have a setup with mutiple web servers all serving your site as a farm. When using session on filesystem a user had to be redirected to the same server for each request because the session data is only available on that servers filesystem. If you not using sessions on filesystem it would not matter which server is being used for a request. This makes the load balancing much easier.
Instead of using session on filesystem I would suggest
use cookies
use request vars across multiple requests
or (if data is security critical)
use sessions with a database save handler. So data would be available to each webserver that reads from the database (or cluster).
Using sessions has one major drawback: You cannot serve concurrent requests to the user if they all try to start the session to access data. This is because PHP locks the session once it is started by a script to prevent data from getting overwritten by another script. The usual thinking when using session data is that after your call to session_start(), the data is available in $_SESSION and will get written back to the storage after the script ends. Before this happens, you can happily read and write to the session array as you like. PHP ensures this will not destroy or overwrite data by locking it.
Locking the session will kill performance if you want to do a Web2.0 site with plenty of Ajax calls to the server, because every request that needs the session will be executed serially. If you can avoid using the session, it will be beneficial to user's perceived performance.
There are some tricks that might work around the problem:
You can try to release the lock as soon as possible with a call to session_write_close(), but you then have to deal with not being able to write to the session after this call.
If you know some script calls will only read from the session, you might try to implement code that only reads the session data without calling session_start(), and avoid the lock at all.
If I/O is a problem, using a Memcache server for storage might get you some more performance, but does not help you with the locking issue.
Note that the database also has this locking issue with all data it stores in any table. If your DB storage engine is not wisely chosen (like MyISAM instead of InnoDB), you'll lose more performance than you might win with avoiding sessions.
All these discussions are moot if you do not have any performance issues at all right now. Do whatever serves your intentions best. Whatever performance issues you'll run into later we cannot know today, and it would be premature optimization (which is the root of evil) trying to avoid them.
Always obey the first rule of optimization, though: Measure it, and see if a change improved it.
This question may expose my ignorance as a web developer, but that wouldn't exactly be a bad thing for me now would it?
I have the need to store user-state information. Examples of information that I need to store per user. (define user: unauthenticated visitor)
User arrived to the site from google/bing/yahoo
User utilized the search feature (true/false)
List of previous visited product pages on current visit
It is my understanding that I could store this in the view state, but that causes a problem with page load from the end-users' perspective because a significant amount of non-viewable information is being transferred to and from the end-users even though the server is the only side that needs the info.
On a similar note, it is my understanding that the session state can be used to store such information, but does not this also result in the same information being transferred to the user and stored in their cookie? (Not quite as bad as viewstate, but it does not feel ideal).
This leaves me with either a server-only-session storage system or a mem-caching solution.
Is memcached the only good option here?
Items 1 and 2 seem to be logging information - i.e. there's no obvious requirement to refer back them, unless you want to keep their search terms. So just write it to a log somewhere and forget about it?
I could store this in the view state
I'm guessing that's a asp.net thing - certainly it doesn't mean anything to me, so I can't comment.
it is my understanding that the session state can be used to store such information
Yes - it would be my repository of choice for item 3.
does not this also result in the same information being transferred to the user and stored in their cookie?
No - a session cookie is just a handle to data stored server-side in PHP (and every other HTTP sesion imlpementation I've come across).
This leaves me with either a server-only-session storage system or a mem-caching solution
memcache is only really any use as a session storage substrate.
Whether you should use memcache as your session storage substrate....on a single server, there's not a lot of advantage here - although NT (IIRC) uses fixed size caches/buffers (whereas Posix/Unix/Linux will use all available memory) I'd still expect most the session I/O to be to memory rather than disk unless your site is overloaded. The real advantage is if you're running a cluster (but in that scenario you might want to consider something like Cassandra).
IME, sessions are not that great of an overhead, however if you want to keep an indefinite history of the activity in a session, then you should overflow the data (or keep it seperate from) the session to prevent it getting too large.
C.
Memcached is good for read often/write rarely key/value data, it is a cache as the name implies. e.g. if you are serving up product info that changes infrequently you store it in memcached so you are not querying the database repeatedly for semi static data. You should use session state to store your info, the only thing that will be passed to and fro is the session identifier, not the actual data, that stays on the server.
I have a social network type app I have been working on for at least 2 years now, I need it to scale well so I have put a lot of effort into perfecting the code of this app. I use sessions very often to cache database results for a user when a user logs into the site I cache there userID number, username, urer status/role, photo URL, online status, last activity time, and several other things into session variables/array. Now I currently have 2 seperate servers to handle this site, 1 server for apache webserver and a seperate server for mysql. Now I am starting to use memcache in some areas to cut down on database load.
Now my sessions are stored on disk and I know some people use a database to store sessions data, for me it would seem that storing session data that I cached from mysql would kind of defeat the purpose if I were to switch to storingsessions in mysql. SO what am I missing here? Why do people choose to use a database for sessions?
Here are my ideas, using a database for sessions would make it easiar to store and access sessions across multiple servers, is this the main reason for using a database?
Also should I be using memcache to store temp variables instead of storing them into a session?
PHP has the ability to use memcached to store sessions.
That may just be the winning ticket for you.
Take a look at this google search.
One part of the Zend Server package is a session daemon.
Be careful using memcache for that purpose. Once the memory bucket is full, it starts throwing away stuff in a FIFO fashion.
Found this on slideshare about creating your own session server with php-cli.
The single best reason to store sessions in the database is so you can load-balance your website. That way it doesn't matter which server hands out the next page because they are all using the same database for storing their sessions.
Have a look at PHP's set_save_handler() for how to install a custom session handler. It takes about 30 lines to set one up that puts the session in the database, though that doesn't count the lines to make a decent database handler. :-) You will need to do:
ini_set('session.save_handler', 'user');
ini_set('session.auto_start', '0');
... although session.auto_start will need to be in your php.ini (and set to 0).
If the database access is going to be a bit expensive, there are some other things you can do to mitigate that. The obvious one is to have a DB server that is just for sessions. Another trick is to have it poke stuff into both memcache and the DB, so when it checks, if the memcache record is missing, it just falls back to the DB. You could get fancy with that, too, and split the session up so some of it is in memcache but the rest lives in the database. I think you'd need to put your own access functions on top of PHP's session API, though.
The main reason to store session data in a database is security because otherwise you have no way to validate it. You'd store the session ID along with the data in the database and match them to see if the session has been tampered with but you can't use the server's (apache) default session mechanism anymore.
For Storing variables in memcache instead of the session.. have you set up your database query cache? I'd have a look there instead first as it's far easier to deal with than with memcache.