Custom Sessions: file or database?


I'm designing my own session handler for my web app, because the built-in PHP sessions are too limited when it comes to controlling how long a session should last.
Anyway, my first tests were like this: a session_id stored in a MySQL row and also in a cookie, with the rest of my session vars in that same MySQL row.
On every request to the server I run a query, get these vars and put them in an array so I can use the necessary ones at runtime.
Last night I was wondering whether I could write the vars to a file on the server once, at the login stage, and later just include that file instead of running a MySQL query on every request.
So, my question is: which is less resource-consuming, doing this in MySQL or in a file?
I know, I know, I already read several threads on stackoverflow about this issue, but I have something different from all those cases (I hope I didn't miss something):
I need to keep track of the time that has passed since the user last used the app, so on every call to the server I not only read the entire database row, I also update a timestamp on that same row.
So, in both cases I need to write to the session on every request...
FYI: the entire app runs on one server, so the multi-server problem with file-based sessions does not apply.
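
For the idle-time tracking described above, the bookkeeping is the same whichever storage ends up being used. A minimal sketch, assuming the session row has already been loaded into an array; touchSession(), the last_activity key and the 1800-second limit are made-up names and values for illustration, not part of the original setup:

```php
<?php
// Hypothetical helper: decides whether a session is still alive and, if so,
// returns the row that should be written back to the file or the MySQL table.
function touchSession(array $row, int $now, int $maxIdleSeconds): ?array
{
    // Expired: more idle time has passed than the allowed window.
    if ($now - $row['last_activity'] > $maxIdleSeconds) {
        return null;
    }

    // Still valid: refresh the timestamp so the idle window restarts.
    $row['last_activity'] = $now;
    return $row;
}
```

Usage would be something like $row = touchSession($row, time(), 1800); a null result means the user has to log in again, and anything else is the row to save back.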

It's easier to work with when it's done in a database and I've been using sessions in database mostly for scalability.
You can use MySQL because a well-configured MySQL server keeps frequently used data in memory, and you can even use MEMORY tables to speed things up if all the sessions fit in memory. If you get near your memory limit, it's easy to switch to a normal table.
I'd say MySQL wins over files for performance on medium to large sites, and also for customization/options. For smaller websites I don't think it makes much of a difference, but you will use more of the hard drive when using files.

Related

Getting all data once for future use

Well this is kind of a question of how to design a website which uses less resources than normal websites. Mobile optimized as well.
Here it goes: I was about to display a specific overview of e.g. 5 posts (from e.g. a blog). Then if I'd click for example on the first post, I'd load this post in a new window. But instead of connecting to the Database again and getting this specific post with the specific id, I'd just look up that post (in PHP) in my array of 5 posts, that I've created earlier, when I fetched the website for the first time.
Would it save data to download? Because PHP works server-side as well, so that's why I'm not sure.
Ok, I'll explain again:
Method 1:
User connects to my website
5 Posts become displayed & saved to an array (with all its data)
User clicks on the first Post and expects more Information about this post.
My program looks up the post in my array and displays it.
Method 2:
User connects to my website
5 Posts become displayed
User clicks on the first Post and expects more Information about this post.
My program connects to MySQL again and fetches the post from the server.

First off, this sounds like a case of premature optimization. I would not start caching anything outside of the database until measurements prove that it's a wise thing to do. Caching takes your focus away from the core task at hand, and introduces complexity.
If you do want to keep DB results in memory, just using an array allocated in a PHP-processed HTTP request will not be sufficient. Once the page is processed, memory allocated at that scope is no longer available.
You could certainly put the results in SESSION scope. The advantage of saving some DB results in the SESSION is that you avoid DB round trips. Disadvantages include the increased complexity of programming the solution, use of memory in the web server for data that may never be accessed, and increased initial load on the DB to retrieve the extra pages that may or may not ever be requested by the user.
If DB performance, after measurement, really is causing you to miss your performance objectives you can use a well-proven caching system such as memcached to keep frequently accessed data in the web server's (or dedicated cache server's) memory.
Final note: You say
PHP works server-side as well
That's not accurate. PHP works server-side only.

Have you thought about putting the posts in divs and making each one visible only when the user clicks on it?

Put some sort of cache between your code and the database.
So your code will look like
if (isPostInCache()) {
    loadPostFromCache();
} else {
    loadPostFromDatabase();
}
Go for a caching system; the web is full of them. You can use memcached, or a static cache you build yourself (e.g. saving posts as text files on the server).
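
To make the "static cache in files" idea concrete, here is a rough sketch of that check-then-load pattern. loadPost(), the $cacheDir layout and the $fetchFromDatabase callback are assumptions made up for this example; the callback stands in for whatever query code you already have.

```php
<?php
// Hypothetical cache-aside helper: one JSON file per post in $cacheDir.
function loadPost(int $id, string $cacheDir, callable $fetchFromDatabase): array
{
    $file = $cacheDir . '/post_' . $id . '.json';

    // Cache hit: decode the stored copy and skip the database.
    if (is_file($file)) {
        $cached = json_decode(file_get_contents($file), true);
        if (is_array($cached)) {
            return $cached;
        }
    }

    // Cache miss (or unreadable file): ask the database, then store the result.
    $post = $fetchFromDatabase($id);
    file_put_contents($file, json_encode($post), LOCK_EX);
    return $post;
}
```

Note that nothing here ever removes a cached file, so an edited post would keep showing its old version until the file is deleted.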

To me, this is a little less efficient than making a second call to the database, and here is why.
The first query should only pull the fields you want, like title, author and date. The content of the post may make for a heavy query, so I'd exclude it (you can pull a teaser if you'd like).
Then, if the user wants the details of the post, I would query for the content using an indexed key column.
That way you're not pulling content for 5 posts that may never be seen.

If your PHP code is constantly re-connecting to the database you've configured it wrong and aren't using connection pooling properly. The execution time of a query should be a few milliseconds at most if you've got your stack properly tuned. Do not cache unless you absolutely have to.
What you're advocating here is side-stepping a serious problem. Database queries should be effortless provided your database is properly configured. Fix that issue and you won't need to go down the caching road.
Saving data from one request to the other is a broken design and if not done perfectly could lead to embarrassing data bleed situations where one user is seeing content intended for another. This is why caching is an option usually pursued after all other avenues have been exhausted.

PHP One time database query on application startup

I have a ajax based PHP app (without any frameworks etc.).
I need to retrieve some records from the database (html select element items) ONCE, and once only, during application startup, store it in a PHP array, and have this array available for future use to prevent future database calls, for ALL future users.
I could do this easily in Spring with initializing beans. And this bean would have the application scope (context) so that it could be used for ALL future user threads needing the data. That means the database retrieval would be once, only during app boot, and then some bean would hold the dropdown data permanently.
I can't understand how to replicate the usecase in PHP.
There's no "application" bootstrapping as such, not until the first user actually does something to invoke my php files.
Moreover, there is no application context - records retrieved for the first user will not be available to another user.
How do I solve this problem? (Note: I don't want to use any library like memcache or whatever.)

If you truly need to get the data only the first time the app is loaded by any user, then you could write something that gets the data from your database and rewrites the HTML page that you want those values in. That way, when the next user comes along, they are viewing a static page that has been written by a program.
I'm not so sure that one call to the database every time a user hits your app is going to kill you, though. Maybe you've got a good reason, but avoiding the database all but one time seems ridiculous IMO.
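
A variation on the same idea, sketched under assumptions: instead of rewriting the HTML page, write the records to a small PHP file the first time and include that file on later requests, for every user. loadSelectItems(), $cacheFile and the $queryDatabase callback are invented names for illustration only.

```php
<?php
// Hypothetical helper: the first caller queries the database and writes the
// result as a PHP array file; everyone after that just includes the file.
function loadSelectItems(string $cacheFile, callable $queryDatabase): array
{
    if (!is_file($cacheFile)) {
        $items = $queryDatabase();
        $code = '<?php return ' . var_export($items, true) . ';';

        // Write to a temporary name first, then rename, so another request
        // never includes a half-written file.
        $tmp = $cacheFile . '.' . uniqid('', true) . '.tmp';
        file_put_contents($tmp, $code);
        rename($tmp, $cacheFile);

        return $items;
    }

    // The generated file returns the array, so include hands it back directly.
    return include $cacheFile;
}
```

Deleting the file forces a fresh query on the next request, which is the only "invalidation" this sketch offers.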

If you need to hit the database one time per visitor, you could use $_SESSION. At the beginning of your script(s) you would start up a session and check to see if there are values in it from the database. If not, it's the user's first visit and you need to query the database. Store the database values in the $_SESSION superglobal and carry on. If the data is in the session, use it and don't query the database.
Would that cover you?
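
Roughly, the check described above could look like the sketch below. getSelectItems() and the select_items key are names invented for the example, and the session array is passed in as a parameter only so the function can be tried without a real session.

```php
<?php
// Hypothetical helper: queries the database once per visitor and keeps the
// result in the session array for that visitor's later requests.
function getSelectItems(array &$session, callable $queryDatabase): array
{
    // Nothing stored yet: this is the visitor's first request.
    if (!isset($session['select_items'])) {
        $session['select_items'] = $queryDatabase();
    }

    return $session['select_items'];
}
```

In a real script you would call session_start() first and then pass $_SESSION as the first argument. This is still one query per visitor, not one query for the whole application.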

Session variables vs. Mysql table

I'm planning on saving my users' often-used parameters (name, picture, etc.) in session variables as opposed to pulling them from the MySQL table each time they are needed. Saving often-used parameters in variables rather than a database should in theory be more efficient, but because I'm not sure how SESSION variables are saved, I'm not sure this is true. Does anyone know if pulling info from a SESSION variable is more efficient than querying the MySQL table?
The term variable is used loosely as SESSION "variables" are stored in files in the server's temporary directory.
You would think reading files is more costly than reading a database, I mean that is what a database is essentially, a file, but it is optimized for this purpose as opposed to "temporary session files"

Yes, pulling information from a session variable is more efficient than querying a database for that info. However, loading the information INTO the session variables requires reading a file off your server's file system and into RAM, which depending on many factors (disk speed, IO load, DB speed, etc.) might be slower or faster than reading the same information from a DB. Without information on your specific setup, it's hard to say. One thing to keep in mind: if you plan on growing and using more than one web server, you will need to write some custom session handlers to store your sessions on a central server (possibly a database), in memcache, or on a shared mount point where all your web servers can fetch the session files.
In the end, putting something into the session and using it from there can be more efficient than loading it from the DB every time, but you are still loading it from somewhere, and so, knowledge of your hardware and your setup will be your best guide.

The default Session handler for PHP stores that info to disk; one unique temporary file per session. The issues you may come across are if the disk/file system gets overloaded, or if your data becomes stale.
If you're making a trip to disk to access the session, there is slightly less overhead than accessing MySQL, but you're still making a trip to disk upon every page request. You can try to use an in-memory Session handler.

Session variables are preferred for persisting a relatively small amount of temporary data. They're good for "sessions".
Use a database for everything else. Especially for:
larger amounts of data,
for any kind of "transaction", or
for data that needs to be persisted between "sessions".
This article is somewhat dated, and it doesn't apply to PHP per se ... but it should give you some idea about the relative efficiencies of filesystem (e.g. NTFS) vs database (e.g. MSSQL):
To Blob or Not To Blob: MS Research white paper

Yes it's more efficient to use session variables.
Typically, session variables are stored on the server in the /tmp directory (you can check your PHP Info file to see how yours is configured).
And because they're stored on the server, you can assume they're just as secure as the rest of your server.

Yes, it is more efficient. The session is saved on the server. However, with or without sessions, you need to check whether the user is logged in and has the correct SESSION ID. How much you gain depends on the number of your columns and rows, and on many other things.

different databases for handling sessions...am I doing the right thing?

I'm looking for some advice on whether or not I should use a separate database to handle my sessions.
We are writing a web app for multiple users to login and check/update their account specific information. We didn't want to use the file storage method on the webserver for storing session information, so we decided to use a database (MySQL). It's working fine, but I'm wondering about performance when this gets into production.
Currently, we have two databases (rst_sessions, and rst). The "RST" database is where all the tables are stored for the webapp...they are all MYSQL InnoDB using Referential Integrity/foreign keys to link the tables. The "RST_SESSIONS" database simply has one table and all the session information gets stored there.
Here's one of my concerns. In the PHP code, if I want to run a query against "RST" then I have to select that database inside PHP ( $db->select("RST") )...when I'm done with the query I have to re-select "RST_SESSIONS" ( $db->select("RST_SESSIONS") ) or else the session-specific information doesn't get set. So, throughout the webapp the code is doing a lot of selecting and reselecting of the two databases. Is this likely to cause performance issues with a user base of, say, 10,000 - 15,000? Would we be better off moving the RST_SESSIONS table into the RST database to avoid all the selecting?
One reason we initially set things up this way was to be able to store the sessions information on a separate database server so it didn't interfere with the operations of the webapp database.
What are some of the pro's and con's of both methods and what would you suggest we do for performance? Thanks in advance.

If you're worried about performance, an alternative would be to not store your sessions in a database at all, but to use something like memcached; the PHP library that talks to memcached already provides a handler for sessions.
A couple of advantages of using memcached:
No hit to the disk: everything is in RAM
Of course, this means sessions will be lost if your server crashes; but if a crash happens, you'll probably have bigger troubles than just losing sessions, and this is not likely to happen often
Used in production by many websites, and works well (I'm using it for a couple of websites)
Better scalability: if you need more RAM or more CPU power for your memcached cluster, just add a couple of servers
And I would add: once you've started using memcached, you can also use it as a caching mechanism ;-)
Now, to answer your specific questions:
Instead of selecting the DB, I would use two distinct connections:
One for the DB that's used for the application,
And another for the DB that's used for the sessions.
Of course, this means a bit more load on the server (it doubles the number of open connections), but it makes sure that, the day it becomes necessary, you'll be able to move the "session" database to another server: you'll just have to reconfigure a connection string, and as the application already uses two separate connections, it'll still work fine.

If you can live with it, just open a second connection to the database. That way you won't have to switch between databases at all. Of course, now you consume twice as many connections, and may need to bump the limit.

Unless there's some overriding reason to put your auth information in a separate database, why not put it with the rest of your data? You may find it convenient to have everything in one place.
Notice also that you can qualify your table names in your sql queries with a schema (database) name e.g.
SELECT ACTIVE
FROM RST_SESSIONS.SESSION
WHERE SID=*whatever*
This may get you out of the need to switch dbs explicitly, if they're both on the same server.

Caching table results for better performance... how?

First of all, the website I run is hosted and I don't have access to be able to install anything interesting like memcached.
I have several web pages displaying HTML tables. The data for these HTML tables are generated using expensive and complex MySQL queries. I've optimized the queries as far as I can, and put indexes in place to improve performance. The problem is if I have high traffic to my site the MySQL server gets hammered, and struggles.
Interestingly - the data within the MySQL tables doesn't change very often. In fact it changes only after a certain 'event' that takes place every few weeks.
So what I have done now is this:
Save the HTML table once generated to a file
When the URL is accessed check the saved file if it exists
If the file is older than 1hr, run the query and save a new file, if not output the file
This ensures that for the vast majority of requests the page loads very fast, and the data can at most be 1hr old. For my purpose this isn't too bad.
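
For reference, the three steps above can be sketched as a single function. cachedHtml(), its parameters and the $buildHtml callback are assumed names; the callback stands in for the code that runs the expensive queries and renders the table.

```php
<?php
// Hypothetical helper: serves the saved HTML while it is younger than
// $maxAgeSeconds, otherwise rebuilds it and saves a new copy.
function cachedHtml(string $cacheFile, int $maxAgeSeconds, callable $buildHtml): string
{
    // PHP caches stat() results within a request; drop any stale entry first.
    clearstatcache(true, $cacheFile);

    // Fresh copy on disk: output it without touching MySQL.
    if (is_file($cacheFile) && time() - filemtime($cacheFile) < $maxAgeSeconds) {
        return file_get_contents($cacheFile);
    }

    // Missing or too old: run the expensive queries and save a new file.
    $html = $buildHtml();
    file_put_contents($cacheFile, $html, LOCK_EX);
    return $html;
}
```

With $maxAgeSeconds set to 3600 this matches the one-hour rule, and the data can still be up to an hour out of date.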
What I would really like is to guarantee that if any data changes in the database, the cache file is deleted. This could be done by finding all scripts that do any change queries on the table and adding code to remove the cache file, but it's flimsy as all future changes need to also take care of this mechanism.
Is there an elegant way to do this?
I don't have anything but vanilla PHP and MySQL (recent versions) - I'd like to play with memcached, but I can't.

Ok - serious answer.
If you have any sort of database abstraction layer (hopefully you will), you could maintain a field in the database for the last time anything was updated, and manage that from a single point in your abstraction layer.
e.g. (pseudocode): On any update set last_updated.value = Time.now()
Then compare this to the time of the cached file at runtime to see if you need to re-query.
If you don't have an abstraction layer, create a wrapper function to any SQL update call that does this, and always use the wrapper function for any future functionality.
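
The comparison step might look something like this. cacheIsStale() is a made-up name, and $lastUpdated is assumed to be the Unix timestamp your wrapper stored on the most recent update.

```php
<?php
// Hypothetical check: is the cache file older than the last recorded update?
function cacheIsStale(string $cacheFile, int $lastUpdated): bool
{
    // PHP caches stat() results within a request; drop any stale entry first.
    clearstatcache(true, $cacheFile);

    // No cache file yet means there is nothing to serve.
    if (!is_file($cacheFile)) {
        return true;
    }

    // Equal timestamps count as stale: the update may have landed in the
    // same second, just after the file was written.
    return filemtime($cacheFile) <= $lastUpdated;
}
```

When it returns true, re-run the query and rewrite the file; otherwise output the file as it is.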

"There are only two hard things in Computer Science: cache invalidation and naming things."
- Phil Karlton
Sorry, doesn't help much, but it is sooooo true.

You have most of the ends covered, but a last_modified field and cron job might help.
There's no way of deleting files from MySQL, Postgres would give you that facility, but MySQL can't.

You can cache your output to a string using PHP's output buffering functions. Google it and you'll find a nice collection of websites explaining how this is done.
I'm wondering, however: how do you know that the data expires after an hour? Or are you assuming the data won't change dramatically enough in 60 minutes to warrant constant page generation?
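
A minimal sketch of the output buffering approach, using PHP's ob_start() and ob_get_clean(); captureOutput() and the $render callback are names invented for the example.

```php
<?php
// Hypothetical helper: runs code that echoes HTML and returns that output
// as a string instead of sending it to the browser.
function captureOutput(callable $render): string
{
    ob_start();
    try {
        $render();
    } catch (Throwable $e) {
        // Discard the partial output and close the buffer before re-throwing.
        ob_end_clean();
        throw $e;
    }

    // Returns the buffered output and turns this buffer off.
    return (string) ob_get_clean();
}
```

The returned string can then be saved to the cache file and echoed to the visitor.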
