I just finished taking a final exam on web applications. Capping off what had been a rather easy (albeit lengthy - 12 pages) exam was a question asking us to code an implementation of sessions, similar to the one provided by javax.servlet.http.HttpSession.
I hate to admit it, but it stumped me. I cranked out a rather BS implementation using a HashMap and did some craziness with a random cookie string mapping to a serialized HashMap on the server, but I'm pretty sure it's bogus...and now I'm dying to know how it's actually done.
Particularly as someone who has used PHP extensively but for whatever reason never bothered to learn the magic behind the convenience, I'm very interested to learn more about the underlying implementations of sessions. J2EE and PHP for sure, but any other languages/frameworks are great, too. Thanks!
From my understanding - you're close.
As I understand it, a session ID (essentially an MD5-style hash) is stored on the client side and sent back with each request, either in a cookie or as a parameter added to the GET URL.
On the server side, the session data matching that session ID is saved in a temp file (on Linux it defaults to /tmp). I believe the session directory can be set in php.ini (the session.save_path setting).
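A minimal sketch of that mechanism in PHP, for illustration only. The class name `SimpleSessionStore`, its methods, and the `sess_` file prefix used here are assumptions made for this example; PHP's real session handler differs in its locking, garbage collection, and serialization format.

```php
<?php
declare(strict_types=1);

// Hypothetical file-backed session store; not PHP's actual implementation.
final class SimpleSessionStore
{
    private string $dir;

    public function __construct(string $dir)
    {
        $this->dir = rtrim($dir, '/');
    }

    // Generate an unguessable ID; this is the value the client keeps in its cookie.
    public function newId(): string
    {
        return bin2hex(random_bytes(16)); // 32 hex characters
    }

    // Load the data stored for an ID; unknown or malformed IDs yield an empty array.
    public function load(string $id): array
    {
        $file = $this->path($id);
        if ($file === null || !is_file($file)) {
            return [];
        }
        $data = unserialize((string) file_get_contents($file), ['allowed_classes' => false]);
        return is_array($data) ? $data : [];
    }

    // Persist the data for an ID, typically at the end of the request.
    public function save(string $id, array $data): void
    {
        $file = $this->path($id);
        if ($file === null) {
            throw new InvalidArgumentException('Malformed session ID');
        }
        file_put_contents($file, serialize($data), LOCK_EX);
    }

    // Accept only IDs in the format we issue, so a client cannot smuggle in a path.
    private function path(string $id): ?string
    {
        return preg_match('/^[0-9a-f]{32}$/', $id) === 1
            ? $this->dir . '/sess_' . $id
            : null;
    }
}
```

A front controller would read the ID from `$_COOKIE` (or issue a new one with `newId()` and `setcookie()`), call `load()` at the start of the request, and call `save()` at the end.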
As it is an interface, you may look at the class(es) implementing it in an open-source web container like Tomcat, and see for yourself.
Related
I developed a standalone flat-file based CMS. Now, in order to protect my code from being stolen by clients, I've been looking around but didn't find much that is useful for protecting my code. I have found ionCube, but I don't really like that...
I am wondering if it is possible to create a file within the CMS, a PHP file, with a unique ID for every sale I make. That file transmits a signal to a webserver with that ID, so I can track the online versions of my CMS. If it gets copied I will see 2 or more versions with that ID online and I will know which company or user distributed my code. But is it possible to make the CMS dependent on that file, so that if a user erases the ID-transmitting file no code is sent out? How can I make that file so it can't be deleted, or make the CMS dependent on it? Any ideas?
I am wondering if it is possible to create a file within the CMS, a PHP file, with a unique ID for every sale I make.
Yes, it's pretty easy to do. Just create a file and include it in your PHP code.
That file transmits a signal to a webserver with that ID.
No! That file will transmit an ID in an HTTP header to YOUR webserver. Not just any webserver.
So i can track the online versions of my cms.
Yes you can, until a sysadmin checks the logs and sees that their new CMS is transmitting an ID to your server. Then someone might ask why you didn't warn them, what else you are sending to your server, etc.
But is it possible to make the CMS dependent on that file, so that if a user erases the ID-transmitting file no code is sent out?
Yes, that's fairly easy. Just add a check: if the file is there, work; if it is not, don't work.
And be prepared to obfuscate that code AS MUCH AS YOU CAN.
You are delivering PHP as source code and any decent programmer will deobfuscate almost any code.
How can I make that file so it can't be deleted, or make the CMS dependent on it? Any ideas?
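As a rough sketch of the "if the file is there, work" check: the function names and the idea of storing the ID as the file's only content are assumptions for this example. Since the check ships as source, anyone who can read it can also remove it.

```php
<?php
declare(strict_types=1);

// Hypothetical licence check: returns the ID stored in the file,
// or null if the file is missing or empty.
function readLicenceId(string $path): ?string
{
    if (!is_file($path)) {
        return null;
    }
    $id = trim((string) file_get_contents($path));
    return $id === '' ? null : $id;
}

// Refuse to run without a licence ID.
function requireLicence(string $path): string
{
    $id = readLicenceId($path);
    if ($id === null) {
        throw new RuntimeException('Licence file missing');
    }
    return $id;
}
```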
As far as I know, you can't. Almost anything can be deleted.
One idea is to create some nasty PGP public keys with some hashes that are calculated and recalculated all over your CMS. But that will make your code hard to maintain and it will put some additional load on the server...
The other solution is to host the code on your own server. That's the only way to keep it safe.
p.s.
It would not be fair if I didn't mention that reading and editing someone's PHP code (adding new features to it) is hard. It's really hard if the code is bad (speaking from experience here). It's extremely hard if it's very, very bad code!
Many 'programmers' wouldn't touch core code of some app. 'Just gimme my framework...'
Obfuscated code is next to impossible to change if you don't have excellent coding skills + experience + lots of time.
Provided you really created your own CMS (that's not an easy task) you will be able to create OK protection :) Some guidelines:
never create 2 links. Always link 3 or more features (functions, classes, globals, etc) in some ludicrous way. One f() returns a result that is used in a class that creates a fake object which checks some global used in the first f().
use what you already have! but add some checks and tests within it, logical and illogical.
use long time tests. In odd lunar months goto this {code}. In even goto this {code}.
use different tests for same thing. copy/paste is search friendly.
be shamelessly creative :)
prepare yourself for your own traps. You are doomed without heavy documentation of your work.
So, I know that the general idea in PHP is that the whole application is loaded and executed every time the page loads. But for optimization of a sizable object-oriented PHP app that needs to be portable... is it possible to load an object into memory that can be used for every request, not recreated for each one?
I've seen people using the $_SESSION variable for something like this, but that seems like it is a) ugly, b) will take up a lot of space on the server, and c) doesn't really do what I need it to as it's a session by session sort of thing.
Is there some sort of $_ALL_SESSIONS? ;)
(Or, approaching the question from a different angle, are purely static objects loaded into memory each time you load the page with a standard Apache mod-php install?)
You are more or less looking for an equivalent of ASP/IIS's Application object in PHP. AFAIK there isn't one.
There is EG(persistent_list), a list of "objects" that are not (necessarily) removed after a request is served. It's used by functions like mysql_pconnect(), pg_pconnect(), ... But it is not directly accessible by script code.
memcache has already been mentioned. Can you please elaborate on "purely static objects"?
Maybe you could serialize it and store it in memcache? I don't know if that would be any faster though.
Not by default, no. You'll have to use some workaround, whether it be a 3rd party tool (memcached, DBMS, etc.), or a built-in mechanism (sessions, serializing to a file, etc.). Whether it's faster than recreating the object for each request is something you'll have to measure.
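The "serializing to a file" workaround could look roughly like this. `cachedObject` and its parameters are hypothetical names for this sketch, and it skips the cross-process locking that a busy site would need. If the cached value is an object, its class must already be loaded when the cache is read.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: return the value cached in $file if it is fresh enough,
// otherwise build it with $factory and write it back.
function cachedObject(string $file, int $ttlSeconds, callable $factory)
{
    clearstatcache(true, $file); // avoid stale is_file()/filemtime() results
    if (is_file($file) && (time() - filemtime($file)) < $ttlSeconds) {
        $value = unserialize((string) file_get_contents($file));
        if ($value !== false) { // false means a corrupt file (or a cached literal false)
            return $value;
        }
    }
    $value = $factory();
    file_put_contents($file, serialize($value), LOCK_EX);
    return $value;
}
```

Timing this against simply rebuilding the object on every request is the only way to know whether it helps.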
You could also write a PHP plugin for this. :) Or maybe there already is one. A quick google search revealed nothing but I didn't try very hard.
If you do decide on writing one yourself know that it's not as straightforward as it sounds. For example, webservers such as Apache spawn several child processes for handling requests in parallel. You'll have to be tricky to get data across to them. Not to mention proper locking (and lock breaking if a request hangs), handling of webserver clusters, etc.
What you can do is use the CLI version of PHP to write a 'daemon' app which persists across requests and maintains state etc, and then have a regular web-based script which can communicate with it via sockets or some other mechanism (here's one example)
If the server is your own machine then it should be possible to run a process in the background that would do the "global thing". You could communicate with it using SOAP.
You would only need to create a SOAP object.
That's the only way I see to really create a long-lived object for php. Everything else is just serialization. There might be a technology outside PHP for that purpose though.
Honestly, I don't think your object is big and complicated enough that creating and populating it takes longer than making a SOAP call. But if creating this object requires lots of DB connections, it's plausible that my idea could help...
I am working on a project that has me constantly pinging a PHP script for new data, so if I understand this correctly that means that the PHP script being pinged gets run over and over indefinitely. It works, but I'm guessing it's a huge strain on the server, and is probably considered ugly and bad practice. Am I right about that?
Is there any way I could keep the connection to the script alive and make use of PHP's built-in output buffering to flush the contents I need, but keep the script running indefinitely using some sort of loop, so that when new data is available it can be output? Is this a bad idea as well?
I'm just looking for input from developers out there with more experience.
One last thing...
Are there any other ways to keep a constant flow of data going (excluding technologies such as flash or silverlight)?
If what you have currently works and continues to work when tested against the kind of load you might expect in this application, it is not really considered bad practice. It is not a crime to keep it simple if it works. Anything that does what you are describing is going to go against the grain of the original model of the web, so you're venturing into shaky territory.
I do recommend you check out the Comet technique. It is mostly popular for the inverse of what you want - the server pushing information to a page continuously - but it can obviously work both ways. Although your mileage may vary, I've heard good things. As Wikipedia describes it:
In web development, Comet is a neologism to describe a web application model in which a long-held HTTP request allows a web server to push data to a browser, without the browser explicitly requesting it. Comet is an umbrella term for multiple techniques for achieving this interaction. All these methods rely on features included by default in browsers, such as Javascript, rather than on non-default plugins.
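One Comet variant, long polling, can be sketched in PHP along these lines. `waitForData` and its parameters are made-up names for this example, and `$fetch` stands in for whatever check for new data the application performs.

```php
<?php
declare(strict_types=1);

// Hypothetical long-poll helper: call $fetch repeatedly until it returns
// something other than null, or give up after $timeoutSeconds.
function waitForData(callable $fetch, float $timeoutSeconds, int $sleepMicros = 200000)
{
    $deadline = microtime(true) + $timeoutSeconds;
    do {
        $data = $fetch();
        if ($data !== null) {
            return $data;
        }
        usleep($sleepMicros); // pause between checks instead of spinning
    } while (microtime(true) < $deadline);

    return null; // timed out; the client is expected to reconnect
}
```

The endpoint would send back the result of `waitForData()` (for example with `json_encode`) and the client would immediately issue the next request. Each waiting client holds a PHP worker for the whole wait, so this trades request volume for concurrently held connections.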
It almost seems like PHP wouldn't be the best choice of language for this. Possibly consider something like Scala or Erlang, which are set up to handle this type of long-lived messaging better.
You have to learn how to use sockets in php.
Start from here: http://php.net/manual/en/book.sockets.php
And afair here is useful manual about writing standalone php apps: Advanced PHP Programming
I'd say that depends. If you want the data transfers to be started by the client, your best choice here would be some ajax (like getxmlhttpobject or just iframes if you feel like cheating :P). If you want the transfers to be started by the server, then, perhaps php is not the language you want to use.
You can use ajax to have http-streaming. Take a look at ajaxpatterns.
I currently have a custom session handler class which simply builds on PHP's session functionality (and ties in some MySQL tables).
I have a wide variety of session variables that best suit my application (primarily kept on the server side). However, I am also using jQuery to improve the usability of the front-end, and I was wondering if feeding some of the session variables (some basics and some browse preference IDs) to a JS object would be a bad way to go.
Currently, if I need to access any of this information at the front-end, I do an Ajax request to a PHP page specifically written to provide the appropriate response, although I am unsure if this is the best practice (actually I'm pretty sure this just creates an excess number of Ajax requests).
Has anyone got any comments on this? Would this be the best way to have this sort of information available to the client side?
I really guess it depends on many factors. I'm always having "premature optimization ..." in the back of my head.
In earlier years I rushed every little idea that came to my mind into the app. That often led to "I made it cool but I didn't take time to fully grasp the problem I'm trying to solve; was there a problem anyway?"
Nowadays I use the obvious approach (like yours) which is fast (without sacrificing performance completely on the first try) and then analyze whether I'm getting into problems or not.
In other words:
1. How often do you need to access this information from different kinds of loaded pages (because if you load the information once and the user doesn't reload, there's probably not much point in re-fetching it anyway), multiplied by the number of concurrent clients?
2. If you write the information into a client-side cookie for fast JS access, can your application be harmed if it is abused (modified without the application's consent)? Replace "JS" and "cookie" with any kind of offline storage like the one WHATWG proposes, if #1 applies.
The "fast" approach suits me, because often there's not the big investment into prior-development research. If you've done that carefully ... but then you would probably know that answer already ;)
As 3. you could always push the HTML to your client already including the data you need in JS, maybe that can work in your case. Will be interesting to see what other suggestions will come!
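A sketch of that third option, embedding the data in the page as it is rendered. `sessionForClient` and the key names in the usage note are hypothetical; the point is to expose only an explicit whitelist of keys, never the whole session.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: expose only whitelisted session keys to the page as JSON.
function sessionForClient(array $session, array $allowedKeys): string
{
    $public = array_intersect_key($session, array_flip($allowedKeys));

    // The JSON_HEX_* flags escape <, >, &, ' and " inside string values so the
    // result can be printed inside a <script> element.
    return json_encode(
        $public,
        JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_THROW_ON_ERROR
    );
}
```

In a template this might be used as `<script>var appSession = <?= sessionForClient($_SESSION, ['user_id', 'theme']) ?>;</script>`, which saves the extra Ajax round trip for values that are already known at render time.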
As a side note: I've had PHP sessions stored in a DB too, until I moved them over to memcached (alert: it's a cache and not a persistent store, so it may not be a good idea for your case; I can live with it, I just make sure it's always running). I saw an average drop of 20% in database queries and, through this, a 90% drop in write queries. And I wasn't even using any fancy Ajax yet, just the number of concurrent users.
I would say that's definitely overkill for AJAX. Are these session values private, or important not to show to a visitor? Just to throw it out there: a cookie is the easiest option on both counts. Having the data in a JavaScript object makes it just as easily readable to a visitor, and as for cookies being enabled or not, without cookies you wouldn't have sessions anyway.
http://www.quirksmode.org/js/cookies.html is a good source about cookie handling in JS and includes two functions for reading and writing cookies.
I just figured out that I can actually store objects in the $_SESSION and I find it quite cool because when I jump to another page I still have my object. Now before I start using this approach I would like to find out if it is really such a good idea or if there are potential pitfalls involved.
I know that if I had a single point of entry I wouldn't need to do that but I'm not there yet so I don't have a single point of entry and I would really like to keep my object because I don't lose my state like that. (Now I've also read that I should program stateless sites but I don't understand that concept yet.)
So in short: Is it ok to store objects in the session, are there any problems with it?
Edit:
Temporary summary: By now I understand that it is probably better to recreate the object even if it involves querying the database again.
Further answers could maybe elaborate on that aspect a bit more!
I know this topic is old, but this issue keeps coming up and has not been addressed to my satisfaction:
Whether you save objects in $_SESSION, or reconstruct them whole cloth based on data stashed in hidden form fields, or re-query them from the DB each time, you are using state. HTTP is stateless (more or less; but see GET vs. PUT) but almost everything anybody cares to do with a web app requires state to be maintained somewhere. Acting as if pushing the state into nooks and crannies amounts to some kind of theoretical win is just wrong. State is state. If you use state, you lose the various technical advantages gained by being stateless. This is not something to lose sleep over unless you know in advance that you ought to be losing sleep over it.
I am especially flummoxed by the blessing received by the "double whammy" arguments put forth by Hank Gay. Is the OP building a distributed and load-balanced e-commerce system? My guess is no; and I will further posit that serializing his $User class, or whatever, will not cripple his server beyond repair. My advice: use techniques that are sensible to your application. Objects in $_SESSION are fine, subject to common sense precautions. If your app suddenly turns into something rivaling Amazon in traffic served, you will need to re-adapt. That's life.
It's OK as long as, by the time the session_start() call is made, the class declaration/definition has already been encountered by PHP or can be found by an already-installed autoloader. Otherwise PHP would not be able to deserialize the object from the session store.
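The effect can be reproduced outside of sessions with plain unserialize(), which performs the same kind of class lookup. `Cart` and `Missing` are made-up class names for this sketch; the serialized strings imitate what might sit in a session store.

```php
<?php
declare(strict_types=1);

// Cart is defined before the data is read back; Missing never is.
class Cart
{
    public $items = [];
}

// Class known at unserialize time: we get a real Cart back.
$ok = unserialize('O:4:"Cart":1:{s:5:"items";a:0:{}}');

// Class unknown and no autoloader can find it: PHP substitutes a placeholder
// object of class __PHP_Incomplete_Class, whose methods cannot be called.
$broken = unserialize('O:7:"Missing":1:{s:5:"items";a:0:{}}');

$okClass = get_class($ok);
$brokenClass = get_class($broken);
```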
HTTP is a stateless protocol for a reason. Sessions weld state onto HTTP. As a rule of thumb, avoid using session state.
UPDATE:
There is no concept of a session at the HTTP level; servers provide this by giving the client a unique ID and telling the client to resubmit it on every request. Then the server uses that ID as a key into a big hashtable of Session objects. Whenever the server gets a request, it looks up the Session info out of its hashtable of session objects based on the ID the client submitted with the request. All this extra work is a double whammy on scalability (a big reason HTTP is stateless).
Whammy One: It reduces the work a single server can do.
Whammy Two: It makes it harder to scale out because now you can't just route a request to any old server - they don't all have the same session. You can pin all the requests with a given session ID to the same server. That's not easy, and it's a single point of failure (not for the system as a whole, but for big chunks of your users). Or, you could share the session storage across all servers in the cluster, but now you have more complexity: network-attached memory, a stand-alone session server, etc.
Given all that, the more info you put in the session, the bigger the impact on performance (as Vinko points out). Also as Vinko points out, if your object isn't serializable, the session will misbehave. So, as a rule of thumb, avoid putting more than absolutely necessary in the session.
@Vinko You can usually work around having the server store state by embedding the data you're tracking in the response you send back and having the client resubmit it, e.g., sending the data down in a hidden input. If you really need server-side tracking of state, it should probably be in your backing datastore.
(Vinko adds: PHP can use a database for storing session information, and having the client resubmit the data each time might solve potential scalability issues, but opens a big can of security issues you must pay attention to now that the client's in control of all your state)
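One common way to reduce the tampering risk when the client carries the state is to sign it. This is a sketch under assumptions: `packState` and `unpackState` are hypothetical helpers, and `$secret` is a server-side key that is never sent to the browser. Signing only detects modification; the client can still read the data and replay an old value.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: encode state and append an HMAC so changes are detectable.
function packState(array $state, string $secret): string
{
    $payload = base64_encode(json_encode($state, JSON_THROW_ON_ERROR));
    return $payload . '.' . hash_hmac('sha256', $payload, $secret);
}

// Returns the state array, or null if the value is malformed or was tampered with.
function unpackState(string $packed, string $secret): ?array
{
    $parts = explode('.', $packed, 2);
    if (count($parts) !== 2) {
        return null;
    }
    [$payload, $mac] = $parts;

    // hash_equals compares in constant time to avoid leaking the MAC via timing.
    if (!hash_equals(hash_hmac('sha256', $payload, $secret), $mac)) {
        return null;
    }

    $json = base64_decode($payload, true);
    $state = $json === false ? null : json_decode($json, true);
    return is_array($state) ? $state : null;
}
```

The packed string would go into the hidden input, and the server would call `unpackState()` on the resubmitted value before trusting it.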
Objects which cannot be serialized (or which contain unserializable members) will not come out of the $_SESSION as you would expect
Huge sessions put a burden on the server (serializing and deserializing megs of state each time is expensive)
Other than that I've seen no problems.
In my experience, it's generally not worth it for anything more complicated than a stdClass with some properties. The cost of unserializing has always been more than that of recreating the object from a database given a session-stored identifier. It seems cool, but (as always) profiling is the key.
I would suggest don't use state unless you absolutely need it. If you can rebuild the object without using sessions do it.
Having state in your web application makes the application more complex to build: for every request you have to work out what state the user is in. Of course, there are times when you cannot avoid using a session (example: a user has to be kept logged in during their session on the web application).
Last I would suggest keeping your session object as small as possible as it impacts performance to serialize and unserialize large objects.
You'll have to remember that resource types (such as DB connections or file pointers) won't persist between page loads, and you'll need to invisibly re-create these.
Also consider the size of the session, depending how it is stored, you may have size restrictions, or latency issues.
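Re-creating a resource invisibly can be done with PHP's serialization hooks. In this sketch `LogWriter` is a made-up class that holds a file handle; it uses `__serialize()` and `__unserialize()`, which require PHP 7.4 or later.

```php
<?php
declare(strict_types=1);

// Hypothetical class holding a file handle, a resource that cannot be serialized.
class LogWriter
{
    private $path;
    private $handle;

    public function __construct(string $path)
    {
        $this->path = $path;
        $this->open();
    }

    private function open(): void
    {
        $this->handle = fopen($this->path, 'a');
    }

    // Store only the path; the handle is deliberately left out.
    public function __serialize(): array
    {
        return ['path' => $this->path];
    }

    // Reopen the handle when the object comes back out of the session.
    public function __unserialize(array $data): void
    {
        $this->path = $data['path'];
        $this->open();
    }

    public function isOpen(): bool
    {
        return is_resource($this->handle);
    }
}
```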
I would also bring up upgrading software libraries. We upgraded our software, and the old version had stored objects in the session under the V1 software's class names. The new software crashed when it tried to rebuild the objects that were in the session, because the V2 software didn't use those classes anymore and couldn't find them. We had to put in some fix code to detect the stale session objects, delete the session if any were found, and reload the page. The biggest pain initially was recreating the bug when it was first reported (all too familiar, "well, it works for me" :) as it only affected people who had recently been in and out of both the old and new systems. Good job we found it before launch, though: all of our users would surely have had the old session variables in their sessions, and it could potentially have crashed for everyone. It would have been a terrible launch :)
Anyway, as you suggest in your amendment, I also think it's better to re-create the object. So maybe just storing id and then on each request pulling the object from the database, is better/safer.
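Storing only the ID might look like this. `User`, `loadUser`, the `user_id` session key, and the array standing in for a database table are all assumptions made for this sketch.

```php
<?php
declare(strict_types=1);

// Hypothetical user object, rebuilt on every request instead of being serialized.
final class User
{
    public $id;
    public $name;

    public function __construct(int $id, string $name)
    {
        $this->id = $id;
        $this->name = $name;
    }
}

// Rebuild the user from the ID kept in the session. $userTable stands in for
// a database lookup keyed by user ID.
function loadUser(array $session, array $userTable): ?User
{
    $id = $session['user_id'] ?? null;
    if (!is_int($id) || !isset($userTable[$id])) {
        return null; // not logged in, or the user no longer exists
    }
    return new User($id, $userTable[$id]['name']);
}
```

At login the application would set `$_SESSION['user_id']` and nothing else, so renaming or restructuring the class later cannot break sessions that already exist.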