I've inherited a PHP/JS project that creates audio-playing widgets. The requirement is that they stand up to some pretty heavy load at "peak times": when a new track is first announced there may be a lot of people rushing to play it at once.
Unfortunately, the widgets tend to do pretty badly under such stressful conditions. We had considered that saving and looking up an access key in a SQLite database might have been causing fatal errors due to locking. Experimentally I changed the access keys to be stored in session variables, but I'm now worried this may just be creating a new kind of bottleneck: does every request have to wait for the session to free up before it can go ahead?
I downloaded Pylot and did some basic load tests: it doesn't take many agents trying to access the same widget to make it glitchy or completely unusable, maybe 10 or 20. Ideally we'd like to be able to handle a considerably greater volume of traffic than this. What strategies can I sensibly adopt to be able to field many times more requests?
A PHP file-based session will lock the session file until the script exits, or you call session_write_close(). You can do a quick session_start(); session_write_close(). The $_SESSION array will still be available, but any subsequent changes will NOT be written to disk, as PHP has been told the session is closed.
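A minimal sketch of that pattern, assuming the default file-based handler. The helper name read_session_snapshot() is made up for illustration and is not part of PHP:

```php
<?php
// Sketch: read what you need from the session, then release the lock early.
// read_session_snapshot() is a hypothetical helper name, not a PHP built-in.
function read_session_snapshot(): array
{
    session_start();          // acquires the lock on the session file
    $snapshot = $_SESSION;    // copy the values this request needs
    session_write_close();    // saves the session and releases the lock
    return $snapshot;
}

$user = read_session_snapshot();
// Slow work (for example streaming audio) can now run without blocking the
// visitor's other requests. Changes made to $_SESSION from here on are not saved.
```

Anything that must be stored in the session has to be written before the session_write_close() call.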
Store your sessions in a database that supports MVCC, such as MySQL with the InnoDB backend, so that the only locking happens while writing the row for the specific session id (that is, the primary key). You can further optimize this by improving the file system beneath it.
Once this is done you might run into race conditions, but no longer into locks. Have fun!
I am developing a website and, after login authentication, I am using the $_SESSION superglobal array to pass my data to other pages and display it when required. This is how I am doing it. It's my own little MVC framework.
//please ignore the syntax errors
$received_data = $this->registry->auth_login($username, $password);
//$received_data holds records like (fname,lname,email,username,password)
$_SESSION = $received_data;
//Or should I choose PHP cache instead at this point?
My website will have huge traffic after some time. In this particular case, should I choose a PHP cache or continue using $_SESSION?
I know I cannot avoid sessions completely, but what are the right options in my case?
Today I was surprised: I set the $_SESSION array with different index names in each of my projects, then used print_r($_SESSION) in one of the projects to check what $_SESSION contained.
It showed me all the active session data belonging to the different project folders. Is it fine that $_SESSION is globally available across all the other projects, or is it my fault somewhere?
I am using XAMPP 1.8.3 with PHP version 5.5.3 and NetBeans 7.4 (release candidate) for writing code. I would be thankful for expert guidance.
Basic rule: Don't abuse the session as a cache!
Here's why: The session data is read every time a request is made. All of it. And it is written back every time a request ends.
So if you write some data into it as a cache, that isn't used in every request, you are constantly reading and writing data that is not needed in this request.
A small amount of data will not affect performance significantly, but serializing, unserializing and disk or network I/O of huge amounts of data will. And you miss the opportunity to share that data between multiple sessions.
On the other hand, a cache is no session storage, for obvious reasons: It is shared between all sessions and cannot really contain private data.
Regarding optimization for more traffic: You cannot optimize right now. Whatever usage pattern will evolve, you will only then see where performance is really needed. And you probably will want to scale - with the easiest way being to scale with some sort of cloud service instead of hosting it on your own hardware.
There is certain userdata read from the (MySQL) database that will be needed in subsequent page-requests, say the name of the user or some preferences.
Is it beneficial to store this data in the $_SESSION variable to save on database lookups?
We're talking (potentially) lots of users. I'd imagine storing in $_SESSION contributes to RAM usage (very-small-amount times very-many-users) while accessing the database on every page request for the same data again and again should increase disk activity.
The irony of your question is that, for most systems, once you get a large number of users, you need to find a way to get your sessions out of the default on-disk storage and into a separate persistence layer (i.e. database, in-memory cache, etc.). This is because at some point you need multiple application servers, and it is usually a lot easier not to have to maintain state on the application servers themselves.
A number of large systems utilize in-memory caching (memcached or similar) for session persistence, as it can provide a common persistence layer available to multiple front-end servers and doesn't require long time persistence (on-disk storage) of the data.
Well-designed database tables or other disk-based key-value stores can also be successfully used, though they might not be as performant as in-memory storage. However, they may be cheaper to operate depending on how much data you are expecting to store with each session key (holding large quantities of data in RAM is typically more expensive than storing on disk).
Understanding the size of session data (average size and maximum size), the number of concurrent sessions you expect to support, and the frequency with which the session data will need to be accessed will be important in helping you decide what solution is best for your situation.
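As a rough back-of-envelope illustration of that sizing exercise: every figure below is a made-up input, not a measurement, and serialize() only approximates what the session handler actually stores.

```php
<?php
// Back-of-envelope sizing. The sample data and the session count are
// assumptions chosen for illustration only.
$sample = [
    'fname' => 'Ada',
    'lname' => 'Lovelace',
    'prefs' => ['theme' => 'dark', 'volume' => 80],
];

$bytesPerSession    = strlen(serialize($sample)); // approximate stored size
$concurrentSessions = 50000;                      // assumed peak
$totalMiB = $bytesPerSession * $concurrentSessions / (1024 * 1024);

printf(
    "%d bytes per session, about %.1f MiB for %d sessions\n",
    $bytesPerSession,
    $totalMiB,
    $concurrentSessions
);
```

Running the same calculation with your real average and maximum session sizes shows quickly whether holding everything in RAM is plausible.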
You can use multiple storage backends for session data in PHP. By default it's saved to files, one file per session. You can also use a database as the session backend, or whatever you want, by implementing your own session save handler.
If you want your application to be as scalable as possible, I would not use sessions on the file system. Imagine you have a setup with multiple web servers all serving your site as a farm. With sessions on the file system, a user has to be routed to the same server for each request, because the session data is only available on that server's file system. If you are not using sessions on the file system, it does not matter which server handles a request. This makes load balancing much easier.
Instead of using sessions on the file system, I would suggest:
use cookies
use request vars across multiple requests
or (if data is security critical)
use sessions with a database save handler. So data would be available to each webserver that reads from the database (or cluster).
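A database save handler could look roughly like the sketch below. It assumes PHP 8 and PDO; the class name, table name and columns are hypothetical. It also does no locking of its own, so concurrent requests for the same session can overwrite each other's data.

```php
<?php
// Sketch of a database-backed session handler (assumes PHP 8 and PDO).
// Hypothetical table:
//   CREATE TABLE sessions (id VARCHAR(128) PRIMARY KEY, data TEXT, updated_at INTEGER)
final class PdoSessionHandler implements SessionHandlerInterface
{
    public function __construct(private PDO $pdo) {}

    public function open(string $path, string $name): bool { return true; }

    public function close(): bool { return true; }

    public function read(string $id): string|false
    {
        $stmt = $this->pdo->prepare('SELECT data FROM sessions WHERE id = ?');
        $stmt->execute([$id]);
        $data = $stmt->fetchColumn();
        return $data === false ? '' : (string) $data; // empty string = new session
    }

    public function write(string $id, string $data): bool
    {
        // REPLACE INTO works on MySQL and SQLite; other databases need their own upsert.
        $stmt = $this->pdo->prepare(
            'REPLACE INTO sessions (id, data, updated_at) VALUES (?, ?, ?)'
        );
        return $stmt->execute([$id, $data, time()]);
    }

    public function destroy(string $id): bool
    {
        return $this->pdo->prepare('DELETE FROM sessions WHERE id = ?')->execute([$id]);
    }

    public function gc(int $max_lifetime): int|false
    {
        // Remove sessions that have not been written for longer than the lifetime.
        $stmt = $this->pdo->prepare('DELETE FROM sessions WHERE updated_at < ?');
        $stmt->execute([time() - $max_lifetime]);
        return $stmt->rowCount();
    }
}

// Usage, before session_start():
// session_set_save_handler(new PdoSessionHandler($pdo), true);
```

Every web server that points at the same database then sees the same session data.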
Using sessions has one major drawback: You cannot serve concurrent requests to the user if they all try to start the session to access data. This is because PHP locks the session once it is started by a script to prevent data from getting overwritten by another script. The usual thinking when using session data is that after your call to session_start(), the data is available in $_SESSION and will get written back to the storage after the script ends. Before this happens, you can happily read and write to the session array as you like. PHP ensures this will not destroy or overwrite data by locking it.
Locking the session will kill performance if you want to do a Web2.0 site with plenty of Ajax calls to the server, because every request that needs the session will be executed serially. If you can avoid using the session, it will be beneficial to user's perceived performance.
There are some tricks that might work around the problem:
You can try to release the lock as soon as possible with a call to session_write_close(), but you then have to deal with not being able to write to the session after this call.
If you know some script calls will only read from the session, you might try to implement code that only reads the session data without calling session_start(), and avoid the lock altogether.
If I/O is a problem, using a Memcache server for storage might get you some more performance, but does not help you with the locking issue.
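For the read-only case, a built-in shortcut exists on PHP 7.0 or later. It is not the hand-rolled reader described above, since it still calls session_start(), but it gives the lock back as soon as the data is loaded. The helper name below is hypothetical:

```php
<?php
// Sketch: read-only session access via the session_start() options array
// (PHP 7.0+). load_session_read_only() is a hypothetical helper name.
function load_session_read_only(): array
{
    // 'read_and_close' loads the data and closes the session immediately,
    // so the lock is held only for the duration of the read.
    session_start(['read_and_close' => true]);
    return $_SESSION;
}
```

Scripts that call this cannot save changes to the session, so it only suits requests that are known to be read-only.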
Note that the database also has this locking issue with all data it stores in any table. If your DB storage engine is not wisely chosen (like MyISAM instead of InnoDB), you'll lose more performance than you might win with avoiding sessions.
All these discussions are moot if you do not have any performance issues at all right now. Do whatever serves your intentions best. Whatever performance issues you'll run into later we cannot know today, and it would be premature optimization (which is the root of evil) trying to avoid them.
Always obey the first rule of optimization, though: Measure it, and see if a change improved it.
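A minimal way to measure, assuming PHP 7.4 or later. The helper median_ms() and the two workloads are made up for illustration; replace them with the code path you actually want to compare before and after a change.

```php
<?php
// Sketch: time a callable over several runs and return the median in milliseconds.
// median_ms() is a hypothetical helper; hrtime() needs PHP 7.3+.
function median_ms(callable $fn, int $runs = 20): float
{
    $samples = [];
    for ($i = 0; $i < $runs; $i++) {
        $start = hrtime(true);                       // monotonic clock, nanoseconds
        $fn();
        $samples[] = (hrtime(true) - $start) / 1e6;  // convert to milliseconds
    }
    sort($samples);
    return $samples[intdiv(count($samples), 2)];
}

// Illustrative workloads: round-tripping a large and a small payload,
// similar to what happens to session data on every request.
$large = median_ms(fn () => unserialize(serialize(range(1, 50000))));
$small = median_ms(fn () => unserialize(serialize(range(1, 500))));
printf("large payload: %.3f ms, small payload: %.3f ms\n", $large, $small);
```

For whole-page behaviour under concurrency, a load-testing tool gives a more realistic picture than in-process timing.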
I am developing a website. Currently, I'm on cheapo shared hosting. But a boy can dream and I'm already thinking about what happens with larger numbers of users on my site.
The visitors will require occasional database writes, as their progress in the game on the site is logged.
I thought of minimizing the queries by writing progress and other info live into the $_SESSION variable. And only when the session is destroyed (log out, browser close or timeout), I want to write the contents of $_SESSION to the database.
Questions:
Is that possible? Is there a way to execute a function when the session is destroyed by timeout or closing of the browser?
Is that sensical? Are a couple of hundred concurrent SQL queries going to be a problem for a shared server, and is the idea of using $_SESSION as a buffer going to alleviate some of this?
Is there a way to execute a function when the session is destroyed by timeout or closing of the browser?
Yes, but it might not work the way you imagine. You can define your own custom session handler using session_set_save_handler, and part of the definition is supplying the destroy and gc callback functions. These two are invoked when a session is destroyed explicitly and when it is destroyed due to having expired, so they do exactly what you ask.
However, session expiration due to timeout does not occur with clockwork precision; it might be a whole lot of time before an expired session is actually "garbage-collected". In addition, garbage collection triggers probabilistically so in theory there is the chance that expired sessions will never be garbage collected.
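How likely garbage collection is on any given request depends on the session.gc_probability and session.gc_divisor settings. A small sketch that reports the chance on the current installation (gc_chance() is a hypothetical helper name):

```php
<?php
// Sketch: chance that a single session_start() call triggers garbage collection.
// gc_chance() is a hypothetical helper name, not a PHP built-in.
function gc_chance(int $probability, int $divisor): float
{
    return $divisor > 0 ? min(1.0, $probability / $divisor) : 0.0;
}

$chance = gc_chance(
    (int) ini_get('session.gc_probability'),
    (int) ini_get('session.gc_divisor')
);

printf("GC chance per session start: %.2f%%\n", $chance * 100);
printf(
    "Seconds before session data counts as garbage: %d\n",
    (int) ini_get('session.gc_maxlifetime')
);
```

A probability of 0 means PHP never runs the collector itself, and cleanup is left to something external such as a scheduled job.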
Is that sensical? Are a couple of hundred concurrent SQL queries going to be a problem for a shared server and is the idea of using $_SESSION as a buffer going to alleviate some of this?
I really wouldn't do this for several reasons:
Premature optimization (before you measure, don't just assume that it will be "better").
Sessions might never be garbage collected; even if this doesn't happen, you don't control when they are collected. This could be a problem.
There is a possibility of losing everything a session contains (e.g. server reboots), which includes player progress. Players do not like losing progress.
Concurrent sessions for the same user would be impossible (whose "saved data" wins and remains persisted to the database?).
What about alternatives?
Well, since we're talking about el cheapo shared hosting you are definitely not going to be in control of the server, so anything that involves PHP extensions (e.g. memcached) is conditional. Database-side caching is also not going to fly. Moreover, the load on your server is going to be affected by variables outside your control, so you can't really do any capacity planning.
In any case, I'd start by making sure that the database itself is structured optimally and that the code is written in a way that minimizes load on the database (free performance just by typing stuff in an editor).
After that, you could introduce read-only caching: usually there is a lot of stuff that you need to display but don't intend to modify. For data that "almost never" gets updated, a session cache that you invalidate whenever you need to could be an easy and very effective improvement (you can even have false positives as regards the invalidation, as long as they are not too many in the grand scheme of things).
Finally, you can add per-request caching (in variables) if you are worried about pulling the same data from the database twice during a single request.
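Per-request caching can be as simple as a static variable. In this sketch load_user_name() and the $fetch callback are hypothetical stand-ins for a real query:

```php
<?php
// Sketch: per-request cache kept in a static variable.
// load_user_name() and $fetch are hypothetical stand-ins for a real database query.
function load_user_name(int $userId, callable $fetch): string
{
    static $cache = [];                    // lives only for the current request
    if (!array_key_exists($userId, $cache)) {
        $cache[$userId] = $fetch($userId); // hit the database once per id
    }
    return $cache[$userId];
}

$queries = 0;
$fetch = function (int $id) use (&$queries): string {
    $queries++;                            // stands in for a SELECT
    return "user-$id";
};

load_user_name(7, $fetch);
load_user_name(7, $fetch);
echo $queries, "\n"; // 1: the second call was served from the static cache
```

Because the cache disappears when the request ends, there is nothing to invalidate.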
It's not a good idea to write data when the session is destroyed. Since the session data could be destroyed by a garbage collector configured by your host, you don't have any idea when the session is really closed until the user's cookie is out of date.
So... I suggest you use either a shared memory (RAM) cache system like memcache (if your host offers it) or a disk-based cache system.
By the way, if your queries are optimized, columns correctly indexed, etc., your shared hosting could take tons of queries at the "same time".
Is that sensical? Are a couple of hundred concurrent SQL queries going to be a problem for a shared server and is the idea of using $_SESSION as a buffer going to alleviate some of this?
No. First and foremost, you never know what happens to a session (logout is obvious, whereas a time-out is nearly undetectable), so it's not a trustworthy caching mechanism at any rate. If there are results that you query multiple times over the span of a few requests, and which don't change all too often, save the results of those queries to a dedicated caching mechanism, such as APC or memcached.
Now, I understand your webhost will not provide these caching systems, but then, you probably can do different things to optimise your site. For starters, my most complex software products (which are fairly complex) query the database about 6 times per page, on average. If the result is reusable, I tend to use caching, so that lowers the number of queries.
On top of that, writing decent queries is more important; the quality of your design and queries is more important than the quantity. If you get your schema, indexes and queries right, 10 queries are faster than one query that's not optimised. I'd invest my time investigating how to write efficient queries, and read up on indexing, rather than trying to overcome the problem with a "workaround", such as caching in a session.
Good luck, I hope your site becomes such a success that you will actually need the advice above ;)
Actually you could use $_SESSION as a buffer to avoid duplicate reads; that seems a good idea to me (memcached would be even better). It is surely not suited to delaying writes, though (that is much more complex and should be handled by the DB).
You could use a simple hash that you save in $_SESSION:
if (!isset($_SESSION['cache'])) {
    $_SESSION['cache'] = array(); // create the cache only once per session
}
Then, when you have to make a query:
if (isset($_SESSION['cache'][$id])) {
    // cache hit: reuse the stored result
    $question = $_SESSION['cache'][$id];
} else {
    // cache miss: retrieve $question from the database here, then save it in the cache
    $_SESSION['cache'][$id] = $question;
}
When I first met PHP, I was amazed by the idea of a shared-nothing architecture. I once worked on a project whose scalability suffered from sharing data among different HTTP requests.
However, as I continued learning PHP, I found that PHP has sessions. This looks like it conflicts with the idea of sharing nothing.
So, were PHP sessions just invented as a counterpart to the same technology in ASP/ASP.NET/J2EE? Should highly scalable web sites use PHP sessions?
The default PHP model locks sessions on a per-user basis. That means that if user A is loading pages 1 and 2, and user B is loading page 3, the only delay that will occur is that page 2 will have to wait until page 1 is done - page 3 will still load independently of pages 1 and 2 because there is nothing shared for separate users; only within a given session.
So it's basically a half-and-half solution that works out okay in the end - most users aren't loading multiple pages simultaneously; thus session lock delays are typically low. As far as requests from different users are concerned, there's still nothing shared.
PHP allows you to write your own session handler - so you can build in your own semantics using the default hooks - or, if you prefer you could use the built in functionality to generate the session id and deal with the browser side of things then write your own code to store/fetch the session data (e.g. if you only wanted the login page and not other pages to lock the session data during processing, then this is a bit tricky though not impossible using the standard hooks).
I don't know enough about the Microsoft architecture for session handling to comment on that, but there's a huge difference between PHP and J2EE in the way session handling works and in what actually gets stored in the session.
Not using sessions in most of your pages will make the application tend to perform a lot faster and potentially scale more easily - but you could say that about any data used by the application.
C.
I'm interested in what is the more efficient way of handling user data in my game. When someone is playing the game, data from the server will constantly need to be queried from the database.
So I want to know if it is more efficient to keep querying the database or to store the data from the first query in a session and then keep using the session every time I need the data.
This is probably a stupid question as I think it is going to be sessions that are better, but it's best to be 100% sure :)
If the data will only be updated by the client session in question, then sure, cache it in the session. If other processes will be updating it, then you need to either reobtain it from the database or work out some method for invalidating your session's cached version.
Shared state goes in the database, unless you are ready to manage shared access yourself, which is a big pain.
Often-updated user-specific state goes into the session (if you issue an UPDATE every time anyone presses a key in your game, your database is dead).
If you need a superfast session architecture, try memcached.
Using sessions will be more efficient. But (assuming the data in the session as cache) any other script not invoked by the user, which updates the dataset you're using, should invalidate the cache somehow.
This means that the cache (now maintained in a session) should be accessible to other scripts. So it might be easier to maintain the cache in files (or you could use php_apc or memcached) instead of sessions.
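A file-based cache that any script can read or invalidate might look like the sketch below. It assumes PHP 8; the function names, the file naming scheme and the use of the system temp directory are all made up for illustration.

```php
<?php
// Sketch of a tiny file cache shared by all sessions and scripts (assumes PHP 8).
// Function names and the cache location are hypothetical.
function cache_path(string $key): string
{
    return sys_get_temp_dir() . '/appcache_' . sha1($key) . '.json';
}

function cache_put(string $key, mixed $value, int $ttl): void
{
    $payload = json_encode(['expires' => time() + $ttl, 'value' => $value]);
    file_put_contents(cache_path($key), $payload, LOCK_EX); // exclusive lock while writing
}

function cache_get(string $key): mixed
{
    $raw   = @file_get_contents(cache_path($key));
    $entry = $raw === false ? null : json_decode($raw, true);
    if (!is_array($entry) || $entry['expires'] < time()) {
        return null; // miss or expired
    }
    return $entry['value'];
}

function cache_invalidate(string $key): void
{
    @unlink(cache_path($key)); // called by any script that updates the underlying data
}
```

A script that changes the data calls cache_invalidate() with the same key, and the next reader falls through to the database.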
I think there are many caching classes that are good but the only experience I have is with Zend_Cache and it is really easy to use. It supports APC, memcached, file, etc as backends (a.k.a storage)