I need to detect when a user leaves my site in order to accurately record that user's session length. I have thought of a couple of possible solutions for this:
I first thought I could use onbeforeunload and send a simple Ajax request to record the last activity, but practice has shown me that onbeforeunload is unreliable for now, and it's a bad idea to use it since it doesn't behave consistently across browsers.
Then I thought I could use cookies to record the user's session length, increasing the cookie value every time the user shows activity. The problem here is that I cannot detect which activity is the user's last, so the only way I can safely store the session length and know it's accurate is on a later visit, when the user hasn't been active for quite some time and the cookie's value is the length of the previous session. This wouldn't be suitable for me because many users may open the site once and never visit it again, for example, and none of those users would be recorded.
Does anyone have a solution for this issue? I have searched, but none of the answers I found were satisfying.
Thank you in advance!
You can't tell when a user leaves your site; this is fundamentally unsupported by the underlying technology of the Internet. All you can do is tell the time of the last request made before the session expired.
You could poll using Javascript. For instance if you have a ticker text or image scroller that loads new information using Ajax, you can use the times of those requests to guess the last activity. You could also do dedicated requests for it, but to the visitor that is a waste of bandwidth, and they might not like the idea of such strict monitoring.
You could also measure the times between page views, and leave the last page view (the exit page) out of the equation.
This question seems to be getting some attention, so I thought I'd update the answer as it has been nearly 10 years.
WebSockets have come a long way since 2013, and a socket can be used to determine the (almost) exact moment the user leaves a site.
It might not be such a good idea on a website with a lot of traffic as it might require sophisticated management and is more than likely to not be available on shared hosting, so make sure you are familiar with the technology before using it.
If you need to do this for specific users then the best bet is to do an Ajax request, say, every 60 seconds, and record the timestamp against their session in your database. When you're measuring the responses you can classify anything beyond a reasonable cut-off as having 'left' (for example 20 minutes for a typical browser session).
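As a rough sketch of the cut-off idea, assuming the heartbeat timestamps for one user have already been loaded from the database, the session lengths could be derived like this. The function name and the 20-minute constant are illustrative choices, not part of any library.

```python
from datetime import datetime, timedelta

# Assumption: a gap longer than this between two heartbeats means the user left.
CUTOFF = timedelta(minutes=20)


def session_lengths(heartbeats, cutoff=CUTOFF):
    """Group heartbeat timestamps into sessions and return each session's length."""
    lengths = []
    start = previous = None
    for ts in sorted(heartbeats):
        if start is None:
            start = previous = ts
        elif ts - previous > cutoff:
            # The gap exceeds the cut-off: close the old session, open a new one.
            lengths.append(previous - start)
            start = previous = ts
        else:
            previous = ts
    if start is not None:
        lengths.append(previous - start)
    return lengths
```

Each length is accurate only to within one heartbeat interval, since the user may have left at any point after the last recorded request.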
If you don't need this data for individual users and just want an average metric for your visitors, you may as well just use Google Analytics.
This issue has been quite the brain teaser for me for a little while. Apologies if I write quite a lot, I just want to be clear on what I've already tried etc.
I will explain the idea of my problem as simply as possible, as the complexities are pretty irrelevant.
We may have up to 80-90 users on the site at any one time. They will likely all be accessing the same page, which I will call result.php. They will be accessing different results, however, via a GET variable for the ID (result.php?ID=456). It is likely that fewer than 3 or 4 users will be on an individual record at any one time, and there are upwards of 10000 records.
I need to know, with less than a 20-25 second margin of error (this is very important), who is on that particular ID on that page, and update the page accordingly. Their name should be removed once they are no longer on the page, again as soon as possible.
At the moment, I am using a jQuery script which calls a PHP file. That file reads from a database table of "Currently Accessing" usernames and returns those who are accessing this particular ID, but only if the time at which they accessed it is within the last 25 seconds. The file also removes all entries older than 5 minutes, to keep the table tidy.
This was alright with 20 or 30 users, but now that load has more than doubled, I am noticing this is a particularly slow method.
What other methods are available to me? Has anyone had any experience in a similar situation?
Everything we use at the moment is coded in PHP with a little jQuery. We are running on a server managed offsite by a hosting company, if that matters.
I have come across something called Comet or a Comet Server which sounds like it could potentially be of assistance, but it also sounds extremely complicated for my purposes and far beyond my understanding at the moment.
Look into WebSockets for a real-time socket connection. You could use WebSockets to push out updates in real time (instead of polling) to ensure changes in the 'currently online users' list are sent within milliseconds.
What you want is an in-memory cache with a service layer that maintains the state of activity on the site. Using memcached might be a good starting point. Your pseudo-code would be something like:
On page access, make a call to CurrentUserService
CurrentUserService takes as a parameter the page you're accessing and who you are.
Each time you call it, it removes whatever you were accessing before from the cache.
Then it adds what you're currently accessing.
Then it compiles a list of who else is accessing the same thing based on the current state in the cache.
It returns this list, which your page processes and displays.
If you record when someone accesses a page, you can set a timeout for when the service stops 'counting' them as accessing the page.
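The steps above could be sketched roughly as follows. This is a single-process stand-in that keeps the state in a plain dict rather than memcached, and the class layout, the 25-second default, and the injectable clock are all assumptions made for illustration.

```python
import time


class CurrentUserService:
    """In-memory stand-in for the cache-backed service described above."""

    def __init__(self, timeout_seconds=25, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock
        self._location = {}  # user -> (page, last_seen)

    def access(self, user, page):
        now = self.clock()
        # Overwriting the entry removes whatever the user was viewing before.
        self._location[user] = (page, now)
        # Everyone else seen on the same page within the timeout window.
        return sorted(
            other
            for other, (other_page, seen) in self._location.items()
            if other != user and other_page == page and now - seen <= self.timeout
        )
```

With several PHP workers the dict would have to live in a shared store such as memcached, which is the point of the original suggestion; the logic stays the same.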
I want to have an online user counter, but one that works in real time. I mean that when someone comes in, the counter updates, and when someone leaves the site, the counter decreases.
I can't find anything like this on the net. Is there any script for this?
You could probably keep a list of all sessions in a database and update the "online time" every time someone hits a page. Then check how many sessions were updated in the last x minutes. However, this won't be very real time: depending on the amount of minutes you defined it will be a little bit off.
Even Google Analytics (the new real time version) gets it wrong sometimes. Don't worry too much if you can't get it right either. ;-)
You should have a look at WebSocket. There are a lot of demos out there, mostly real-time chat applications; you could hack something on top of one :)
In my opinion, WebSocket seems like a bit of overkill in your case (you just want a number, not real two-way communication), but it's a good way to build "real-time" apps.
Here are some links:
Socket.IO (node.js backend)
WebSocket and Socket.IO
Introduction to Server-Sent Events (another technique)
phpwebsocket
As far as I know, there is no way to track when a user leaves your site, short of a log out button (which one can easily avoid by simply closing the window).
To expand on Tom's answer, you could create a table that tracks sessions in a database. At minimum the fields should be session_id, ip_address, activity_time. Name them whatever you'd like. You would need a function that executes on every page load that matches a record on session_id and ip_address. If a matching record does not exist, you create one; if a match is made, then update the time.
A few caveats:
1) Getting the right IP address can be tricky, especially with AOL users and/or proxy users. You need to look for the X-Forwarded-For header ($_SERVER['HTTP_X_FORWARDED_FOR'] in PHP). If it exists for a user, use that address; otherwise use $_SERVER['REMOTE_ADDR']. I would suggest looking up X-Forwarded-For for your setup because I'm not sure it's available for all setups.
1a) If you don't get the right IP address, some users will create a new entry on every page view.
2) You need a way to remove stale sessions. I suggest that, as part of the function that updates the activity time, it also checks for any activity_time that is older than a cut-off such as 5 minutes (I use 15 minutes) and, if so, removes the corresponding record.
Then all you need to do is a simple count on the table, and that will give you a reasonably accurate representation of the number of users currently online. With very little coding you could put this to a lot of uses. On a dating site I created, I added an extra column to the table and was able to display an online icon next to users who were logged in, and did the same in search results to show users doing searches which users were currently online. With a bit of imagination it could be used for a few more scenarios.
Also, with a membership feature, when a user logs in you can update the session table to show that they are a member rather than a guest, and if the user logs out you can remove the session from the table. When a user logs out but stays on the site, it's best to generate a new session for them for security purposes. This is a bit more than what you were asking for.
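A minimal sketch of the session table described in this answer, using an in-memory SQLite database so it runs anywhere. The function names and the 15-minute cut-off are assumptions for illustration; a real site would call the update from its page-load hook and use its own database.

```python
import sqlite3

STALE_AFTER = 15 * 60  # seconds; the cut-off is a tunable assumption


def open_store():
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE sessions ("
        " session_id TEXT, ip_address TEXT, activity_time REAL,"
        " PRIMARY KEY (session_id, ip_address))"
    )
    return db


def touch(db, session_id, ip_address, now):
    """Call on every page load: upsert this session, then prune stale ones."""
    db.execute(
        "INSERT OR REPLACE INTO sessions VALUES (?, ?, ?)",
        (session_id, ip_address, now),
    )
    db.execute("DELETE FROM sessions WHERE activity_time < ?", (now - STALE_AFTER,))


def online_count(db):
    """The 'simple count on the table' from the answer."""
    return db.execute("SELECT COUNT(*) FROM sessions").fetchone()[0]
```

Because pruning only happens when someone loads a page, the count can lag on a quiet site; running the DELETE just before the COUNT would tighten that.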
You have the browser leave an HTTP connection open to your server for some unused resource (like /usercounter) that your server never responds to. Then you count the number of open connections. You can have the request send a cookie associated with the user's session so you can know if the connections are all unique users. This solution is very difficult and you will likely not find any ready-to-go solutions for implementing this.
The solution above will get a count of users who have javascript enabled. For other users, you would have to have some sort of guesstimate of how long a user would be around and update that timer on each page load.
I'm working on a game, which has score based on a JavaScript countdown: the faster you finish the level before the countdown reaches zero, the bigger your score is.
How can I make sure it is not somehow altered when I finally receive it from client-side on server-side?
My initial idea is to make two checkpoints: one at the beginning of a level and another at the end. A checkpoint is basically the session sent via AJAX to a server-side PHP script, which timestamps it. After the game is finished on the client side, the score is verified against the one on the server side. Is this kind of protection any good?
Thank you in advance!
EDIT:
I'm also open to any other ways to achieve the desired functionality.
Simply, you store the value in a datetime field in your database. Then, you seed your javascript with that value. Thus, any change on the client side, will not have an effect on the stored time.
However, if you depend on the client side to get a value, you cannot do anything to make sure it's correct. The user can still spoof the Ajax request with no real problem. It makes it a bit harder, but certainly doable.
Once your countdown is somehow related to the client side, there is no escape :)
As others have pointed out, there's no way you can be certain that the times have not been tampered with; however, there are ways to mitigate the consequences:
If you have a (server-side) system that suspects that scores have been tampered with, you can blacklist that IP address or cookie, and not show those scores to other users. Do show the scores to the hacker, though. This has several effects. Firstly, if they think they've beaten you, they may move on and leave your code alone. Secondly, if your cheat detection wrongly thinks that a ninja player is hacking, the player will still see their score in the tables as normal (even if other players don't). Consequently, false positives don't matter so much, and you can use fuzzier algorithms, e.g.: How does this player's rate of improvement compare to the average? Have they got a bunch of poor scores and then suddenly an incredible one? Has my server seen an unusual pattern of hits from this user?
You could probably set this up so that you could refine your detection algorithms incrementally, and blacklist players after you've got suspicious about them (and un-blacklist false positives).
There are 2 possible scenarios which you might be facing. Let me start with the easy one:
a) If the web application is designed such that the game starts as soon as the page is loaded, your life is going to be simple. The script which sends out the game should timestamp the database with the time at which the game was sent out. This would be the start time. The end time would be recorded when the client sends in a "level completed" message. As time is being recorded at the server side in both the cases, you do not need the client to keep time. However, there is a catch. See The Catch section below.
b) If the client loads the application but the game begins much later when the user hits 'play' etc., your life is going to be a little more difficult. In this scenario, you would need a "level began" as well as a "level completed" message coming from the client. Again, it would be a better idea to keep time at the server and not the client. However, you would need to ensure that the client receives an ACK to the "level began" message before starting the game to ensure that the user does not play a game which is not being recorded by the server. (The "level began" message might never have reached the server).
The Catch: You need to realise that no protection is possible against a user cheating on their scores! JS is completely open, and no matter how you implement your start/end calls to the server, any user can write a script to send similar calls to the server at whatever time interval they wish. Even if you use a session or cookie, these can easily be replicated (using a sniffer, for instance). Thus, you must accept the design limitations imposed by the HTML/JS architecture and code within these limits. Hence, the best idea is to write code for the users rather than to prevent hackers from sending rogue calls. Make your game fun for the people who will be playing it and do not worry about hackers cheating on their scores; they would not be your target audience anyway.
First of all, forget getting the elapsed time from the client side. Any malicious user can alter the sent data.
Server side must be the only authority for storing the time. At the beginning of the level, store the current time in the $_SESSION. At the end of the level, subtract it from the current time and it is the elapsed time for the level.
// At the beginning of the level:
$_SESSION['start_time'] = time();
// At the end of the level:
$elapsed_time = time() - $_SESSION['start_time'];
You can still show the elapsed time with JavaScript for the user's convenience. To handle timing differences between the client and the server (which are perfectly possible), you can synchronize by fetching the elapsed time whenever your client hits the server.
If completing the level spans multiple sessions (for example, you start the level, leave the site, and come back later to finish it), you have to store the start time in a persistent data store (a database).
You can use a timestamp in a session to store the start date and then make JavaScript send a request when the player's done (but the second timestamp should come from PHP, or another server-side language, too).
The only really bullet-proof way is to show nothing to the user and to ask them to tell you every single move, check it on the server, and send back only what the server allows them to know. But this means delay.
You could issue a unique token that is stored within the user's session and is available to your JavaScript code. When starting an AJAX request, pass this token as an additional parameter, so the server can distinguish between legitimate and spurious requests.
This token should be valid for a single request only of course.
In combination with the mentioned solutions (server-based time checks etc.) you should be able to build a solid scoring system.
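For what it's worth, the single-use token idea could look something like this on the server. It is sketched in Python with a plain dict standing in for the session store, and the class and method names are made up for illustration.

```python
import secrets


class OneTimeTokens:
    """Issues tokens that are accepted exactly once (hypothetical helper)."""

    def __init__(self):
        self._pending = {}  # session_id -> token awaiting use

    def issue(self, session_id):
        token = secrets.token_urlsafe(32)
        self._pending[session_id] = token
        return token

    def redeem(self, session_id, token):
        # pop() discards the token whether or not it matches, so each issued
        # token allows at most one attempt.
        expected = self._pending.pop(session_id, None)
        # compare_digest avoids leaking information through timing differences.
        return expected is not None and secrets.compare_digest(expected, token)
```

A token like this only shows that the request follows a page the server actually rendered; a user can still read it out of the page, so it complements the server-side time checks rather than replacing them.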
Well, thinking about this problem gives me two ideas:
Attack your own server.
By that I mean: send a request every second that saves the score.
This way, the "hacker" cannot just send a start/end time and cheat.
Make the requests at specific time differences.
OK, so let's say we started playing; you can send a request at specific time intervals (3.4 sec?).
If a request does not arrive in that time frame, then the user is cheating,
or is at least marked as a possible cheater.
Use a simple string. XD
That is, for the start/end time sent to the server, encrypted of course.
You can try jCryption for encryption.
As the others said, it is not totally fail-proof (since we are talking about a client-side script), but at least it will make it a lot harder to cheat.
Dunno, it's just my two cents.
It is not possible to make it 100% bulletproof; you can only make it harder to hack if it is based on the client side.
You can generate a GUID when the page is rendered. You can concatenate this GUID, the start datetime ticks, and the session ID, and calculate a hash of them to validate the data when the user returns.
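Here is one way the hash idea might be sketched. Note that this swaps the plain hash for an HMAC keyed with a server-side secret, which is my assumption rather than something the answer specifies: a plain hash of values the client can see could simply be recomputed by the client. The function names and the secret are placeholders.

```python
import hashlib
import hmac
import uuid

# Assumption: a secret known only to the server (placeholder value).
SERVER_SECRET = b"replace-with-a-real-secret"


def sign_start(session_id, start_ticks, guid=None):
    """Return (guid, signature) to embed in the rendered page."""
    guid = guid or uuid.uuid4().hex
    message = f"{guid}|{start_ticks}|{session_id}".encode()
    signature = hmac.new(SERVER_SECRET, message, hashlib.sha256).hexdigest()
    return guid, signature


def verify_start(session_id, start_ticks, guid, signature):
    """Check that the values the client sent back are the ones we signed."""
    _, expected = sign_start(session_id, start_ticks, guid)
    return hmac.compare_digest(expected, signature)
```

This shows that the start time was not altered after the page was rendered. It does not stop a user from replaying a valid page, so the GUID would also need to be marked as used on the server.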
Take my profile, for example, or the view count of any question on this site: what is the process for logging the number of visits per page or object on a website? I presume it includes:
Counting registered users once (this must be reflected in the DB: which pages/objects the user has visited). This will not include unregistered users.
IP: log the visit of each IP per page / object; this could be troublesome as you might have 2 different people checking the same website; or you really do want to track repeat visitors.
Cookie: this will probably result in that people with multiple computers would be counted twice
other method goes here ....
The question is, what is the process and best practice to count user requests?
EDIT
I've added the computer languages to the list of tags as they are of interest to me. Feel free to include any libraries, modules, and/or extensions that achieve the task.
The question could be rephrased into:
How does someone go about measuring the number of imprints when a user goes on a page? The question is not intended to be similar to what Google analytics does, rather it should be something similar to when you click on a stackoverflow question or profile and see the number of views.
The "correct" answer varies according to the situation; primarily the most desired statistic and the availability of resources to gather and process them:
eg:
Server Side
Raw web server logs
All webservers have some facility to log requests. The trouble with them is that it requires a lot of processing to get meaningful data out and, for your example scenario, they won't record application specific details; like whether or not the request was associated with a registered user.
This option won't work for what you're interested in.
File based application logs
The application programmer can add custom code to the application to record the stuff you're most interested in to a log file. This is similar to the webserver log, except that it can be application aware and record things like the member making the request.
The programmers may also need to build scripts which extract from these logs the stuff you're most interested in. This option might be suited to a high traffic site with lots of disk space and sysadmins who know how to ensure the logs get rotated and pruned from the production servers before bad things happen.
Database based application logs
The application programmer can write custom code for the application which records every request in a database. This makes it relatively easy to run reports and makes the data instantly accessible. This solution incurs more system overhead at the time of each request so better suited to lesser traffic sites, or scenarios where the data is highly valued.
Client Side
Javascript postback
This is a consideration on top of the above options. Google Analytics does this.
Each page includes some JavaScript code which tells the client to report back to the webserver that the page was viewed. The data might be recorded in a database, or written to a file.
This has the strong advantage of improving accuracy in scenarios where impressions get lost due to heavy caching/proxying between the client and server.
Cookies
Every time a request is received from someone who doesn't present a cookie, you assume they're new, record that hit as 'anonymous', and return a uniquely identifying cookie after they log in. How accurate this proves depends on your application. Some applications don't lend themselves to caching, so it will be quite accurate; others (high traffic) encourage caching, which will reduce the accuracy. Obviously it's not much use when users switch browsers or locations, at least until they re-authenticate.
What's most interesting to you?
Then there's the question of what statistics are important to you. For example, in some situations you're keen to know:
how many times a page was viewed, period,
how many times a page was viewed, by a known user
how many of your known users have viewed a specific page
Thence you typically want to break it down into periods of time to see trending.
Respectively:
are we getting more views from random people?
or are we getting more views from registered users?
or has pretty much every one who is going to see the page now seen it?
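To make the three statistics concrete, here is a small sketch against an in-memory SQLite table. The schema (a views table with a nullable user_id for anonymous hits) and the sample rows are assumptions for illustration only.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# user_id is NULL for anonymous visitors (schema is an assumption).
db.execute("CREATE TABLE views (page TEXT, user_id INTEGER, viewed_at TEXT)")
db.executemany(
    "INSERT INTO views VALUES (?, ?, ?)",
    [
        ("/q/1", None, "2024-01-01"),
        ("/q/1", 7, "2024-01-01"),
        ("/q/1", 7, "2024-01-02"),
        ("/q/1", 9, "2024-01-02"),
    ],
)


def page_stats(page):
    """Return (all views, views by known users, distinct known users)."""
    return db.execute(
        "SELECT COUNT(*), COUNT(user_id), COUNT(DISTINCT user_id)"
        " FROM views WHERE page = ?",
        (page,),
    ).fetchone()
```

Adding a GROUP BY on viewed_at would give the same three numbers per period, which is what the trend questions above need.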
So back to your question: best practice for "number of imprints when a user goes on a page"?
It depends on your application.
My guess is that you're best off with a database backed application which records what is most interesting to your application and uses cookies to trace the member's sessions.
The best practice for a hit counter depends on how much traffic you expect your site to receive. As wybiral suggested, you can implement something that writes to a database after every request. This might include the IP address if you want to count unique visitors, or it could be as simple as just incrementing a running total for each page or for each (page, user) pair.
But that requires a database write for every request, even if you just want to serve a static page. Ideally speaking, a scalable web app should serve as much as possible from an in-memory cache. Database or disk I/O should be avoided as much as possible.
So the ideal set up would be to build up some representation of the server's activity in-memory and then occasionally (say every 15 minutes) write those events to the database. You could conceivably queue up thousands of requests and then store them with a single database write.
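A bare-bones sketch of that batching idea, without Celery: hits accumulate in a Counter and are written out in one transaction when flush() is called. The class name, table layout, and the use of SQLite are assumptions for illustration; in a multi-process deployment the in-memory part would need to be a shared queue or cache.

```python
import sqlite3
from collections import Counter


class BufferedHitCounter:
    """Counts hits in memory and writes them to the database in batches."""

    def __init__(self, db):
        self.db = db
        self.pending = Counter()
        db.execute(
            "CREATE TABLE IF NOT EXISTS hits (page TEXT PRIMARY KEY, total INTEGER)"
        )

    def hit(self, page):
        self.pending[page] += 1  # no database I/O on the request path

    def flush(self):
        """Call from a timer or background job, e.g. every 15 minutes."""
        with self.db:  # one transaction for the whole batch
            self.db.executemany(
                "INSERT OR IGNORE INTO hits (page, total) VALUES (?, 0)",
                [(page,) for page in self.pending],
            )
            self.db.executemany(
                "UPDATE hits SET total = total + ? WHERE page = ?",
                [(count, page) for page, count in self.pending.items()],
            )
        self.pending.clear()
```

The trade-off is that any hits still in memory are lost if the process dies before the next flush.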
There's a tutorial describing how to do exactly this in python using Celery and Carrot: http://packages.python.org/celery/tutorials/clickcounter.html. It also includes some examples of how to set up your database tables using Django models and what code to call whenever someone accesses a page.
This tutorial will certainly be helpful to you regardless of what you choose to implement, although this level of architecture might be overkill if you don't expect thousands of hits each hour.
Use a database to keep a record of the unique IPs (if the IP doesn't exist in the DB, create it; otherwise continue as planned) and then query the database for the number of those entities. Index this with IP and URL to store views for individual pages. You won't have to worry about tracking registered users this way; they will be totaled into the unique IP count. As far as multiple people from one IP, there's not much you can do there short of requiring an account and counting user->to->page-views similarly.
I would suggest using a persistent key/value store like Redis. If you use a list with the list key being the serialized identifier, you can store other serialized entries and use llen to find the list size.
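Example (Python); redisStore is assumed to be a redis-py client:
    import redis

    # Assumes a Redis server on localhost with default settings.
    # decode_responses=True makes lrange return str instead of bytes.
    redisStore = redis.Redis(decode_responses=True)

    def initializeAndPush(serializedKey, serializedValue):
        # Append the value only if the list does not already contain it.
        # lrange on a missing key returns an empty list, so no exists() check is needed.
        if serializedValue not in redisStore.lrange(serializedKey, 0, -1):
            redisStore.rpush(serializedKey, serializedValue)

    def getSizeOf(serializedKey):
        # llen returns 0 for a key that does not exist.
        return redisStore.llen(serializedKey)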
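Using this technique, you can use anything as serializedKey or serializedValue. If you want to store IPs with today's date or serialized login information, both are just as simple. Also, only unique serializedValues are stored because of the membership check before each push, although that check-then-push is not atomic, so two concurrent requests could still insert a duplicate; a Redis set (SADD, then SCARD for the size) would avoid that.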
I would try implementing pixel tracking to track views on your page/object. This method is used by Google (Google Analytics) and other high-profile media companies.
Pixel tracking works well because you can point the tracking pixel at an HttpHandler dedicated to that purpose. That way you can separate the load and even use some kind of queue for high-load scenarios.
Also, you can incorporate user specific information in the tracking pixel such as WHO has visited the page.
eg:
<img src="fakeimages/imba.gif?uid=123&info2=a&info3=b" style="height:1px;width:1px;" />
Then you need to handle the requests going to fakeimages/*.gif with a specific HttpHandler, PHP redirect, or controller (whatever language you're using) and process the parameters.
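As an illustration of what such a handler does, here is a minimal sketch written as a Python WSGI app rather than an HttpHandler. The hits list stands in for whatever queue or database you would really use, and all names are hypothetical.

```python
from urllib.parse import parse_qs

# A 1x1 transparent GIF (43 bytes).
PIXEL = (
    b"GIF89a\x01\x00\x01\x00\x80\x00\x00\x00\x00\x00\xff\xff\xff"
    b"!\xf9\x04\x01\x00\x00\x00\x00,\x00\x00\x00\x00\x01\x00\x01\x00"
    b"\x00\x02\x02D\x01\x00;"
)

hits = []  # stand-in for a queue or database (assumption)


def pixel_app(environ, start_response):
    """WSGI app: record the query parameters, then return the image."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    # parse_qs returns lists; keep the first value of each parameter.
    hits.append({key: values[0] for key, values in params.items()})
    start_response(
        "200 OK",
        [
            ("Content-Type", "image/gif"),
            ("Content-Length", str(len(PIXEL))),
            ("Cache-Control", "no-store"),  # every view should reach the server
        ],
    )
    return [PIXEL]
```

Keep in mind that anything in the query string, such as uid, is supplied by the client and can be altered, so treat it as a hint rather than proof of who viewed the page.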
regards
I am making a game where the battle system uses javascript to battle. At the end of the battle you either win or lose. If the user wins, I need to update the mysql database with the XP they earned.
The best way I can think of doing this is to have the JavaScript run an Ajax function when the user wins that POSTs to something like addxp.php?amount=235, but if I do that, the user can easily look at the source and see that they can just request that page themselves to update their XP without battling. But this is the only way I know how to do it.
Help please :-/
If you rely on the code running on the client's web browser to update the battle results, you do not have control over that code. Many javascript and flash games that have a high score board that depend on the browser sending in the high score registration are vulnerable to this. There is no real easy way around this.
You can try to obfuscate things somewhat, but someone who's interested enough is going to be able to fairly easily get around this.
As knoopx mentioned in his comments, the only sure-fire way to get around this is to do computations server-side. For example, the client browser sends user actions to the server, and the server is the one that determines the outcome of the battle, inserts the result into the mySQL db, and sends the result back to the client. This is obviously a major architectural change and you'll have to decide whether it's worth it.
This one is tricky and unfortunately there is no easy solution. I can give you some advice that helped me when I was creating a Flash game with a cash prize. It worked quite well for me, but again - it was by no means foolproof.
First of all do some thinking about the highest score it would be possible to achieve over a given time period. For example, you could say that the highest score you could reasonably get after playing for 1 minute is 200 points.
Each time someone starts playing the game, you do an AJAX call to your server to obtain a game ID. At set intervals (say 10 seconds), you make your game phone home with the game ID and the latest score. This way the only way to cheat would be to create a script that periodically contacts the server with a slowly incrementing score that falls under your maximum. Not a difficult thing to do, but at least now we're entering the territory where we've eliminated the casual louts with TamperData and a few minutes to kill boredom with.
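The server side of that "phone home" check might look roughly like this. The 200 points per minute ceiling comes from the example above, while the class, its fields, and passing timestamps in as plain seconds are assumptions for the sketch.

```python
MAX_POINTS_PER_MINUTE = 200  # from the example above; tune per game


class GameSession:
    """Tracks periodic score reports for one game ID (hypothetical helper)."""

    def __init__(self, started_at):
        self.started_at = started_at  # server time in seconds at game start
        self.last_score = 0
        self.suspicious = False

    def report(self, score, now):
        elapsed_minutes = (now - self.started_at) / 60
        ceiling = MAX_POINTS_PER_MINUTE * elapsed_minutes
        # Flag silently; the flag is for the server only and never sent back.
        if score > ceiling:
            self.suspicious = True
        self.last_score = score
        return self.suspicious
```

Both timestamps should come from the server's own clock when the requests arrive, not from anything the client sends.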
Another thing you can do when you send back the current score is include the current status of the gameboard. This isn't so useful for catching cheats live, but it's a very good tool you can use when awarding a prize to check that the high score is a genuine one. This adds another layer of complexity to your system and hopefully makes some of the more slightly-hard-core louts get bored and find something else to do.
My last suggestion is this: in no way make your users immediately aware of what you're doing. That is to say, if someone's score rises above your high-score/time threshold, you do nothing to let them know that they've tripped your cheat detector. In the game I created, I even recorded their high score along with their cookie. When getting the high scores from your database you SELECT * FROM scores WHERE cheated = FALSE OR cookie = userscookie. This way, unless they clear their cookie and check again, it will appear (only to them) that their hack attempt was successful.
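That query can be tried out with a few made-up rows. This sketch uses an in-memory SQLite table whose columns follow the query above; the sample data and the function name are invented for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE scores (player TEXT, score INTEGER, cheated INTEGER, cookie TEXT)"
)
db.executemany(
    "INSERT INTO scores VALUES (?, ?, ?, ?)",
    [
        ("ann", 120, 0, "c-ann"),
        ("bob", 9999, 1, "c-bob"),  # flagged by the cheat detector
        ("cy", 150, 0, "c-cy"),
    ],
)


def high_scores(viewer_cookie):
    """Honest scores for everyone, plus the viewer's own flagged scores."""
    rows = db.execute(
        "SELECT player, score FROM scores"
        " WHERE cheated = 0 OR cookie = ?"
        " ORDER BY score DESC",
        (viewer_cookie,),
    )
    return rows.fetchall()
```

Here a viewer with cookie c-bob sees their own 9999 at the top of the table, while everyone else sees only the two unflagged scores.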
Oh and one last thing; minify your javascript. This will obfuscate the code and make it very hard to read. Again, someone determined enough can easily circumvent this and look at your code, but it's all about making your system complex enough that people won't bother.
Unfortunately the web's strongest point can sometimes also be its weakest. It is the nature of the WWW that source code is open and available for anyone to read, which means that keeping secrets from your users is very hard if not impossible.