Thumbs system on Urban Dictionary - php

I was thinking of implementing a thumbs system, but mine would require a registration thus ruling out the possibility of people voting more than once unless they create a new account to do so. So I was wondering about Urban Dictionary's thumb system. How does it work? I would imagine that my IP would be stored in a database, so people would not be able to vote more than once however IPs do change pretty often and especially when you're on an iPhone. Probably a combination of cookies and IP checking. Can anyone give me a better insight? What would they check for to ensure you don't vote more than once?
The reason I ask is that I may want to make mine a public system instead. Maybe even a hybrid, similar to SO, where you can ask a question before creating an account and then have the two linked together. I am using PHP and MySQL.

Almost always it's done with cookies. As you say, IPs can't be used (naively) as they change, or cover too many people (i.e. everyone in a given office, etc).
But online polls are not reliable anyway, so don't get too concerned about solving a problem no-one cares about. You can implement more 'intelligent' rules, but then you need to ask what benefit you are getting for all your work.
Personally, I would go with:
Cookies
Forced signup voting
Some sort of analysis of voting patterns
Because it goes without saying that people can just sign up constantly, to submit more votes. It really depends on what benefit people get from voting, and how much you care (in terms of time, which is, obviously, money).

I know Urban Dictionary allows more than one vote per day: once every six hours, to be exact.
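A cooldown like the six-hour one mentioned above could be sketched roughly as follows. This is only an illustration in Python (the logic ports directly to PHP and MySQL). The `VoteCooldown` class, the in-memory dict and the voter token (for example a random value stored in a cookie) are assumptions made for the example, not how Urban Dictionary actually does it.

```python
import time

SIX_HOURS = 6 * 60 * 60  # cooldown in seconds (the six-hour figure from the answer above)


class VoteCooldown:
    """Hypothetical helper: one vote per (voter, item) per cooldown window."""

    def __init__(self, cooldown=SIX_HOURS):
        self.cooldown = cooldown
        self.last_vote = {}  # (voter_token, item_id) -> timestamp of last accepted vote

    def try_vote(self, voter_token, item_id, now=None):
        now = time.time() if now is None else now
        key = (voter_token, item_id)
        previous = self.last_vote.get(key)
        if previous is not None and now - previous < self.cooldown:
            return False  # still inside the cooldown window, reject
        self.last_vote[key] = now
        return True
```

In a real site the dict would be a database table, and the token is only as trustworthy as the cookie it comes from.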

Related

Secure voting system with php without login

Is there a way to make a reasonably secure voting system without requiring a login? I currently use cookies to record whether the person has voted yet, and I also insert the user's IP in the database.
If that user removes his cookies, he will be able to vote again. That's why I do a check if the user's ip exists in the database and if that IP has voted in the last 30 seconds. That way he'll have to remove his cookies and change his IP address to vote again.
I know there's no 100% failproof solution to this, but
is there a more secure way to do this?
There are two ways that could improve your results, but read and judge for yourself, if you need them:
More persistent cookies
There is the Evercookie project, which stores cookie-like information in a lot of places. It is much harder to delete than just normal cookies.
I personally think that this project should be considered a proof of concept and actually using it would be unethical
Better user recognition
Instead of just looking at the IP address in order to identify a returning visitor, you could use Browser fingerprinting. The EFF has shown with their Panopticlick project, that the combination of Browser version, OS version, installed add-ons etc. is often unique. The Piwik web analytics tool also uses this kind of user heuristics to tell visitors apart. I don't know the implementation, but it's FOSS and in PHP, so you should be able to find that part.
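A very rough sketch of the fingerprinting idea, in Python for brevity: hash a handful of request attributes into one identifier. The function name and the choice of headers are assumptions for illustration. Panopticlick combines far more signals (many collected via JavaScript), so this is much coarser than what the EFF project measures.

```python
import hashlib


def request_fingerprint(headers, ip):
    """Hash a few request attributes into one identifier (hypothetical helper)."""
    parts = [
        ip,
        headers.get("User-Agent", ""),
        headers.get("Accept-Language", ""),
        headers.get("Accept-Encoding", ""),
    ]
    # Newlines cannot occur inside header values, so they make a safe separator.
    raw = "\n".join(parts).encode("utf-8")
    return hashlib.sha256(raw).hexdigest()
```

Two visitors with identical browsers behind the same IP still collide, so treat a match as a hint, not as proof.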
You can run with both of those solutions in unison - but it's still not very secure. You could go as far as blocking a subnet from voting (192.168.1.xxx) to prevent against dynamic IP changes, but then you're also blocking up to 254 people from voting - and it won't prevent against a proxy.
One method I've seen used quite a bit is making it look like you allow duplicate votes; i.e: show it on the end user's end that their duplicate vote has been counted, but don't actually count it in your own database.
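The "pretend it counted" trick could look something like this minimal sketch (Python here, but the idea is language independent). `ShadowPoll` and its in-memory storage are made up for the example, and how you derive `voter_id` (cookie, IP, fingerprint) is left open.

```python
class ShadowPoll:
    """Hypothetical poll that shows duplicate voters a count that was never stored."""

    def __init__(self):
        self.real_counts = {}  # option -> number of genuinely counted votes
        self.seen = set()      # voter ids that have already voted

    def vote(self, voter_id, option):
        """Record the vote if it is new and return the count to display to this voter."""
        is_duplicate = voter_id in self.seen
        if not is_duplicate:
            self.seen.add(voter_id)
            self.real_counts[option] = self.real_counts.get(option, 0) + 1
        shown = self.real_counts.get(option, 0)
        # A duplicate voter sees the number go up by one, but nothing is saved.
        return shown + 1 if is_duplicate else shown
```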
But realistically, a login system is about the only relatively "secure" way of doing this - but if someone is determined enough, that can obviously be gamed too.
Hope this helps.
Eoghan
You could add the user agent to the check: over short periods there is little chance that two visitors have exactly the same one (see https://panopticlick.eff.org/index.php?action=log&js=yes).
But again, 'if someone is determined enough, that can obviously be gamed too.'

Detect if user is human without captcha or useragent

I've a website where I'm providing email encryption to users and I'm trying to figure out if there's a way to detect if a user is human or a bot.
I've been digging into $_SESSION in php but it's easy to bypass, I'm also not interested in captcha, useragent or login solutions, any idea of what I need ?
There are other questions very similar to this one in SO but I couldn't find any straight answer...
Any help will be very welcome, thank you all !
This is a hard problem, and no solution I know of is going to be 100% perfect from a bot-defending and usability perspective. If your attacker is really determined to use a bot on your site, they probably will be able to. If you take things far enough to make it impractical for a computer program to access anything on your site, it's likely no human will want to either, but you can strike a good balance.
My point of view on this is partially as a web developer, but more so from the other side of things, having written numerous web crawler programs for clients all over the world. Not all bots have malicious intent, and can be used for things from automating form submissions to populating databases of doctors office addresses or analyzing stock market data. If your site is well designed from a usability standpoint, there should be no need for a bot that "makes things easier" for a user, but there are cases where there are special needs you can't plan for.
Of course there are those who do have malicious intent, which you definitely want to protect your site against as well as possible. There is virtually no site that can't be automated in some way. Most sites are not difficult at all, but here are a few ideas off the top of my head, from other answers or comments on this page, and from my experience writing (non-malicious) bots.
Types of bots
First I should mention that there are two different categories I would put bots into:
General purpose crawlers, indexers, or bots
Special purpose bots, made specifically for your site to perform some task
Usually a general-purpose bot is going to be something like a search engine's indexer, or possibly some hacker's script that looks for a form to submit, uses a dictionary attack to search for a vulnerable URL, or something like this. They can also attack "engine sites", such as Wordpress blogs. If your site is properly secured with good passwords and the like, these aren't usually going to pose much of a risk to you (unless you do use Wordpress, in which case you have to keep up with the latest versions and security updates).
Special purpose "personalized" bots are the kind I've written. A bot made specifically for your site can be made to act very much like a human user of your site, including inserting time delays between form submissions, setting cookies, and so on, so they can be hard to detect. For the most part this is the kind I'm talking about in the rest of this answer.
Captchas
Captchas are probably the most common approach to making sure a user is humanoid, and generally they are difficult to automatically get around. However, if you simply require the captcha as a one-time thing when the user creates an account, for example, it's easy for a human to get past it and then give their shiny new account credentials to a bot to automate usage of the system.
I remember a few years ago reading about a pretty elaborate system to "automate" breaking captchas on a popular gaming site: a separate site was set up that loaded captchas from the gaming site, and presented them to users, where they were essentially crowd-sourced. Users on the second site would get some sort of reward for each correct captcha, and the owners of the site were able to automate tasks on the gaming site using their crowd-sourced captcha data.
Generally the use of a good captcha system will pretty well guarantee one thing: somewhere there is a human who typed the captcha text. What happens before and after that depends on how often you require captcha verification, and how determined the person making a bot is.
Cell-phone / credit-card verification
If you don't want to use Captchas, this type of verification is probably going to be pretty effective against all but the most determined bot-writer. While (just as with the captcha) it won't prevent an already-verified user from creating and using a bot, you can verify that a human being created the account, and if abused block that phone number/credit-card from being used to create another account.
Sites like Facebook and Craigslist have started using cell-phone verification to prevent spamming from bots. For example, in order to create apps on Facebook, you have to have a phone number on record, confirmed via text message or an automated phone call. Unless your attacker has access to a whole lot of active phone numbers, this could be an effective way to verify that a human created the account and that he only creates a limited number of accounts (one for most people).
Credit cards can also be used to confirm that a human is performing an action and limit the number of accounts a single human can create.
Other [less-effective] solutions
Log analysis
Analyzing your request logs will often reveal bots doing the same actions repeatedly, or sometimes using dictionary attacks to look for holes in your site's configuration. So logs will tell you after-the-fact whether a request was made by a bot or a human. This may or may not be useful to you, but if the requests were made on a cell-phone or credit-card verified account, you can lock the account associated with the offending requests to prevent further abuse.
Math/other questions
Math problems or other questions can be answered by a quick google or wolfram alpha search, which can be automated by a bot. Some questions will be harder than others, but the big search companies are working against you here, making their engines better at understanding questions like this, and in turn making this a less viable option for verifying that a user is human.
Hidden form fields
Some sites employ a mechanism where parameters such as the coordinates of the mouse when they clicked the "submit" button are added to the form submission via javascript. These are extremely easy to fake in most cases, but if you see in your logs a whole bunch of requests using the same coordinates, it's likely they are a bot (although a smart bot could easily give different coordinates with each request).
Javascript Cookies
Since most bots don't load or execute javascript, cookies set using javascript instead of a set-cookie HTTP header will make life slightly more difficult for most would-be bot makers. But not so hard as to prevent the bot from manually setting the cookie as well, once the developer figures out how to generate the same value the javascript generates.
IP address
An IP address alone isn't going to tell you if a user is a human. Some sites use IP addresses to try to detect bots though, and it's true that a simple bot might show up as a bunch of requests from the same IP. But IP addresses are cheap, and with Amazon's EC2 service or similar cloud services, you can spawn a server and use it as a proxy. Or spawn 10 or 100 and use them all as proxies.
UserAgent string
This is so easy to manipulate in a crawler that you can't count on it to mark a bot that's trying not to be detected. It's easy to set the UserAgent to the same string one of the major browsers sends, and a bot may even rotate between several different browsers.
Complicated markup
The most difficult site I ever wrote a bot for consisted of frames within frames within frames....about 10 layers deep, on each page, where each frame's src was the same base controller page, but had different parameters as to which actions to perform. The order of the actions was important, so it was tough to keep straight everything that was going on, but eventually (after a week or so) my bot worked, so while this might deter some bot makers, it won't be useful against all. And will probably make your site about a gazillion times harder to maintain.
Disclaimer & Conclusion
Not all bots are "bad". Most of the crawlers/bots I have made were for users who wanted to automate some process on the site, such as data entry, that was too tedious to do manually. So make tedious tasks easy! Or, provide an API for your users. Probably one of the easiest ways to discourage someone from writing a bot for your site is to provide API access. If you provide an API, it's a lot less likely someone will go to the effort to create a crawler for it. And you could use API keys to control how heavily someone uses it.
For the purpose of preventing spammers, some combination of captchas and account verification through cell numbers or credit cards is probably going to be the most effective approach. Add some logging analysis to identify and disable any malicious personalized bots, and you should be in pretty good shape.
My favorite way is presenting the "user" with a picture of a cat or a dog and asking, "Is this a cat or a dog?" No human ever gets that wrong; the computer gets it right perhaps 60% of the time (so you have to run it several times). There's a project that will give you bunches of pictures of cats and dogs -- plus, all the animals are available for adoption so if the user likes the pet, he can have it.
It's a Microsoft corporate project, which puts me in a state of cognitive dissonance, as if I found out that Harry Reid likes zydeco music or that George Bush smokes pot. Oh, wait...
I've seen/used a simple arithmetic problem with written numbers ie:
Please answer the following question to prove you are human:
"What is two plus four?"
and similar simple questions which require reading:
"What is man's best friend?"
you can supply an endless stream of questions, should the person attempting access be unfamiliar with the subject matter, and it is accessible to all readers, etc.
There's a reason why companies use captchas or logins. As ugly of a solution as captchas are, they're currently the best (most accurate, least disruptive to users) way of weeding out bots. If a login solution doesn't work for you, I'm afraid the only realistic solution is a captcha.
If users will be filling in a form, honeypot fields are simple to implement, and can be reasonably effective, but nothing is perfect. Create one or more hidden fields in the form, and if they contain anything when the form is posted, reject the form. Spambots will usually attempt to fill in everything.
You do need to be aware of accessibility. Hidden fields probably won't be filled in by those using a standard browser (where the field is not visible), but those using screen readers may be presented with the field. Be sure to label it correctly so that these users do not fill it in. Perhaps with something like "Please help us to prevent spam by leaving this field empty". Also, if you do reject the form, be sure to reject it with helpful error messages, just in case it has been filled in by a human.
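As a small sketch of the honeypot check (in Python, but the server-side test is a one-liner in any language): the field name `website` and both function names are hypothetical, and `form` is assumed to be a plain dict of submitted values.

```python
def is_probable_spam(form, honeypot_field="website"):
    """Return True when the hidden honeypot field was filled in."""
    return bool(form.get(honeypot_field, "").strip())


def handle_submission(form):
    """Reject filled-in honeypots with a message a human could act on."""
    if is_probable_spam(form):
        return False, "Please leave the 'website' field empty and submit again."
    return True, "Thanks, your submission was received."
```

The friendly error message matters for the accessibility case described above, where a real person fills in the hidden field.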
I suggest getting the Growmap Anti Spambot Wordpress plugin and seeing what code you can borrow from it or just using the same technique. I've found this plugin to be very effective for curtailing automated spam on my WordPress sites and I've started adapting the same technique for my ASP.NET sites.
The only thing it doesn't deal with are human cut-and-paste spammers.

Best method to prevent gaming with anonymous voting

I am about to write a voting method for my site. I want a method to stop people voting for the same thing twice. So far my thoughts have been:
Drop a cookie once the vote is complete (susceptible to multi browser gaming)
Log IP address per vote (this will fail in proxy / corporate environments)
Force logins
My site is not account based as such, although it aggregates Twitter data, so there is scope for using Twitter OAuth as a means of identification.
What existing systems exist and how do they do this?
The best thing would be to disallow anonymous voting. If the user is forced to log in you can save the userid with each vote and make sure that he/she only votes once.
The cookie approach is very fragile since cookies can be deleted easily. The IP address approach has the shortcoming you yourself describe.
One step towards a user auth system but not all of the complications:
Get the user to enter their email address and confirm their vote. You would not eradicate gaming, but you would make it harder, since a gamer would have to register another email address before voting again.
Might be worth the extra step.
Let us know what you end up going for.
If you want to go with cookies after all, use an evercookie.
evercookie is a javascript API available that produces extremely persistent cookies in a browser. Its goal is to identify a client even after they've removed standard cookies, Flash cookies (Local Shared Objects or LSOs), and others.
evercookie accomplishes this by storing the cookie data in several types of storage mechanisms that are available on the local browser. Additionally, if evercookie has found the user has removed any of the types of cookies in question, it recreates them using each mechanism available.
Multi-browser cheating won't be affected, of course.
What type of gaming do you want to protect yourself against? Someone creating a couple of bots and bombing you with thousands (millions) of requests? Or someone with nothing better to do who tries to cast 10-20 votes?
Yes, I know: both - but which one is your main concern in here?
Using CAPTCHA together with email based voting (send a link to the email to validate the vote) might work well against bots. But a human can more or less easily exploit the email system (as I comment in one answer and post here again)
I own a custom domain and I can have any email I want within it.
Another example: if your email is myuser@gmail.com, you could use myuser+1@gmail.com, myuser+2@gmail.com, etc. (the plus sign and the text after it are ignored and the mail is delivered to your account). You can also include dots in your username (my.user@gmail.com). (This only works on Gmail addresses!)
To protect against humans, I don't know ever-cookie but it might be a good choice. Using OAuth integrated with twitter, FB and other networks might also work well.
Also, remember: requiring an email address to vote will scare many people off! You will get far fewer votes!
Another option is to limit the number of votes your system accepts from each ip per minute (or hour or anything else). To protect against distributed attacks, limit the total number of votes your system accepts within a timeframe.
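The two limits described above (per IP and overall) might be combined roughly like this. It is a sketch in Python with an in-memory sliding window. The class name, the parameters and the explicit `now` timestamp are assumptions for the example, and a PHP/MySQL version would count rows in a votes table instead.

```python
from collections import defaultdict, deque


class VoteRateLimiter:
    """Hypothetical sliding-window limiter with a per-IP and a global cap."""

    def __init__(self, per_ip_limit, global_limit, window_seconds):
        self.per_ip_limit = per_ip_limit
        self.global_limit = global_limit
        self.window = window_seconds
        self.by_ip = defaultdict(deque)  # ip -> timestamps of accepted votes
        self.all_votes = deque()         # timestamps of all accepted votes

    def _prune(self, timestamps, now):
        # Drop timestamps that have fallen out of the window.
        while timestamps and now - timestamps[0] >= self.window:
            timestamps.popleft()

    def allow(self, ip, now):
        self._prune(self.all_votes, now)
        self._prune(self.by_ip[ip], now)
        if len(self.all_votes) >= self.global_limit:
            return False  # too many votes overall: possible distributed attack
        if len(self.by_ip[ip]) >= self.per_ip_limit:
            return False  # this IP has used up its allowance
        self.all_votes.append(now)
        self.by_ip[ip].append(now)
        return True
```

Note that the global cap also blocks honest voters while an attack is running, which is the price of this approach.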
Different approach, just to provide an alternative:
Assuming most people know how to behave or just can't be bothered to misbehave, just retroactively clean the votes. This would also keep voting unobtrusive for the voters.
So, set cookies, log every vote and afterwards (or on a time interval?) go through the results and remove duplicates based on the cookie values, IP/UserAgent combinations etc.
I'd assume that not actively blocking multiple votes from same person keeps the usage of highly technical circumvention methods to a minimum and the results are easy to clean.
As a downside, you probably can't show the actual vote counts live in the user interface, or eyebrows will be raised when a bunch of votes just happen to go missing.
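The clean-up pass could be sketched like this (Python for illustration). The record layout with `item`, `cookie`, `ip` and `user_agent` keys is an assumption, as is the rule of keeping the first vote and dropping later ones.

```python
def clean_votes(votes):
    """Keep the first vote per cookie and per (ip, user_agent) pair for each item.

    `votes` is assumed to be a chronologically ordered list of dicts.
    """
    seen_cookies = set()
    seen_clients = set()
    kept = []
    for vote in votes:
        cookie_key = (vote["item"], vote["cookie"])
        client_key = (vote["item"], vote["ip"], vote["user_agent"])
        same_cookie = bool(vote["cookie"]) and cookie_key in seen_cookies
        if same_cookie or client_key in seen_clients:
            continue  # duplicate: drop it from the cleaned result
        if vote["cookie"]:
            seen_cookies.add(cookie_key)
        seen_clients.add(client_key)
        kept.append(vote)
    return kept
```

The IP/UserAgent rule will also discard some honest votes from shared networks, so how strict to be is a judgment call.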
I probably wouldn't do this myself, but look at these cookies; they are pretty hard to get rid of:
http://samy.pl/evercookie/
A different way that I had to approach this problem and fight voting fraud, was to require an email address, then a person could still vote, but the votes wouldn't count until they clicked on a link in the email. This was easier than full on registration, but was still very effective in eliminating most of the fraudulent votes.
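One possible way to build the confirmation link is to sign the email and item with a server-side secret, so the link cannot be forged. This is a sketch under assumptions: the function names, the `pending` and `counted` sets and the `SECRET_KEY` placeholder are all made up for the example, and it is written in Python although the same HMAC approach works in PHP.

```python
import hashlib
import hmac

SECRET_KEY = b"change-me"  # placeholder; load a private random key from config in practice


def confirmation_token(email, item_id):
    """Token to embed in the emailed link (hypothetical helper)."""
    message = f"{email.lower()}|{item_id}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def confirm_vote(email, item_id, token, pending, counted):
    """Move a pending vote to the counted set if the token from the link matches."""
    expected = confirmation_token(email, item_id)
    if not hmac.compare_digest(expected, token):
        return False  # forged or mistyped link
    key = (email.lower(), item_id)
    if key not in pending:
        return False  # no such pending vote, or already confirmed
    pending.discard(key)
    counted.add(key)
    return True
```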
If you don't want to force users to log in, consider the evercookie, but require JavaScript to be enabled in order to vote!
On its own the evercookie is trivial to block because it is JavaScript based. An attacker would probably not use a browser at all; with curl he could generate thousands of requests. However, such tools usually have poor JavaScript support.
Email is even easier to cheat. If you run your own mail server, you can accept all email addresses, so you have a practically unlimited pool of addresses to use.

How to disable the same person to play my RPG game as two different persons?

Of course, I store all players' ip addresses in mysql and I can check if there is a person with the same ip address before he registers, but then, he can register to my page at school or wherever he wants. So, any suggestions?
The only way that proves particularly effective is to make people pay for accessing your game.
Looking behind the question:
Why do you want to stop the same person registering and playing twice?
What advantage will they have if they do?
If there's no (or only a minimal) advantage then don't waste your time and effort trying to solve a non-problem. Also putting up barriers to something will make some people more determined to break or circumvent them. This could make your problem worse.
If there is an advantage then you need to think of other, more creative, solutions to that problem.
You can't. There is no way to uniquely identify users over the internet. Don't use ip addresses because there could be many people using the same ip, or people using dynamic ip's.
Even if somehow you made them give you a piece of legal identification, you still wouldn't be absolutely sure that they were not registered on the site twice as two different accounts.
I would check the user's IP every time they log onto the game, then log users who come from the same IP and how much they interact. You may find that you get some users from the same IP (i.e., roommates or spouses who play together and are not actually the same person). You may just have to flag these users and monitor their interactions. For example, is there a chat service in the game? If they never talk to each other, they're more than likely the same person. Review such cases on an individual basis.
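Finding the accounts worth reviewing could start from something as simple as this sketch (Python; in MySQL it would be a GROUP BY on the IP column). The `(account, ip)` log format and the function name are assumptions.

```python
from collections import defaultdict


def accounts_sharing_ips(login_log):
    """Map each IP seen with more than one account to the set of those accounts."""
    by_ip = defaultdict(set)
    for account, ip in login_log:
        by_ip[ip].add(account)
    return {ip: accounts for ip, accounts in by_ip.items() if len(accounts) > 1}
```

The result is only a list of candidates for manual review, since roommates and spouses show up here too.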
If it runs in a web browser, you could also record information such as the OS and browser. This still does not make it safe, but it makes it safer.
It would cost attackers only a little more time, and you have to allow for the possibility that different people play on systems with the same OS and browser.
The safest approach would be to stop people on the same IP from interacting with each other, for example by trading, or, as in the poker game PKR, by not letting them sit at the same table.
Another sensible measure is to use CAPTCHAs. They are very user-unfriendly, but they keep a lot of bots out.
If it is a browser-based game, Flash cookies are a relatively resilient way to identify a computer. Or have them pay a minimal amount, and identify them by credit card number - that way, it still won't be hard to make multiple account (friends' & family members' cards), but it will be hard to make a lot of them. Depending on your target demographic, it might prohibit potential players from registering, though.
The best approach is probably not worrying much about it and setting the game balance in such a way that progress is proportional to time spent playing (and use a strong captcha to keep bots away). That way, using multiple accounts will offer no advantage.
There are far too many ways to circumvent any restrictions to limit to a single player. FAR too many.
Unless the additional player is causing some sort of problem it is not worth the attempt. You will spend most of your time chasing 'ghosts' instead of concentrating on improving the game and making more money.
IP bans do not work as a control mechanism, and neither do Flash cookies.
Browser fingerprinting does not work either. People can easily use a second browser.
Even UUID's will not work as those too can be spoofed.
And if you actually did manage to discover and implement a working method, the user could simply use a second computer or laptop and what then?
People can also sandbox a browser so as to use the same browser twice thus defeating browser identification.
And then there are virtual machines....
We have an extreme amount of control freaks out there wanting to control every aspect of computing. And the losers are the people who do the computing.
Every tracking issue I ever had I can circumvent easily. Be it UUID's, mac addresses, ip addresses, fingerprinting, etc. And it is very easy to do too.
Best suggestion is to simply watch for any TOU violations and address the problem accordingly.

Hunting cheaters in a voting competition

Currently we are running a competition which is going very well. Unfortunately we have all those cheaters back in business who are running scripts which automatically vote for their entries. We have already spotted some cheaters by looking at the database entries by hand, for example 5-star ratings from the same browser exactly every 70 minutes. Now, as the user base grows, it gets harder and harder to identify them.
What we do until now:
We store the IP and the browser and block that combination to a one hour timeframe. Cookies won't help against these guys.
We are also using a Captcha, which has been broken
Does anyone know how we could find patterns in our database with a PHP script or how we could block them more efficiently?
Any help would be very appreciated...
Direct feedback elimination
This is more of a general strategy that can be combined with many of the other methods. Don't let the spammer know if he succeeds.
You can either hide the current results altogether, only show percentages without absolute number of votes or delay the display of the votes.
Pro: good against all methods
Con: if the fraud is massive, percentage display and delay won't be effective
Vote flagging
Also a general strategy. If you have some reason to assume that the vote is by a spammer, count their vote and mark it as invalid and delete the invalid votes at the end.
Pro: good against all detectable spam attacks
Con: skews the vote, harder to set up, false positives
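One heuristic for deciding which votes to flag is to look for clients that vote at suspiciously regular intervals, like the "every 70 minutes" pattern in the question. The following is a minimal sketch in Python. The function name and both thresholds are assumptions chosen for the example, and a real check would need tuning to avoid false positives.

```python
from statistics import pstdev


def looks_scripted(timestamps, min_votes=4, max_jitter_seconds=5.0):
    """Flag a client whose votes arrive at almost perfectly regular intervals.

    `timestamps` are the vote times (in seconds) for one IP/browser combination.
    """
    if len(timestamps) < min_votes:
        return False  # too few votes to judge
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    # Humans vote irregularly; a cron-style script produces nearly identical gaps.
    return pstdev(gaps) <= max_jitter_seconds
```

A script that adds random delays defeats this, so it only catches the lazier cheaters.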
Captcha
Use a CAPTCHA. If your Captcha is broken, use a better one.
Pro: good against all automated scripts.
Con: useless against Pharyngulation
IP checking
Limit the number of votes an IP address can cast in a timespan.
Pro: Good against random dudes who constantly hit F5 in their browser
Pro: Easy to implement
Con: Useless against Pharyngulation and elaborate scripts which use proxy servers.
Con: An IP address sometimes maps to many different users
Referrer checking
If you assume that one user maps to one IP address, you can limit the number of votes by that IP address. However, this assumption usually only holds true for private households.
Pro: Easy to implement
Pro: Good against simple pharyngulation to some extent
Con: Very easy to circumvent by automated scripts
Email Confirmation
Use Email confirmation and only allow one vote per Email. Check your database manually to see if they are using throwaway-emails.
Note that you can add +foo to your username in an email address. username@example.com and username+foo@example.com will both deliver the mail to the same account, so remember that when checking if somebody has already voted.
Pro: good against simple spam scripts
Con: harder to implement
Con: Some users won't like it
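To account for the +foo trick when checking whether an address has already voted, you could normalize addresses before comparing them. This is a rough sketch in Python with a made-up function name. It assumes plus-addressing is in use (not every mail provider supports it) and only strips dots for gmail.com.

```python
def normalize_email(address):
    """Collapse common aliases of one mailbox into a single canonical form."""
    local, _, domain = address.strip().lower().rpartition("@")
    local = local.split("+", 1)[0]  # drop a "+tag" suffix
    if domain == "gmail.com":
        local = local.replace(".", "")  # Gmail ignores dots in the local part
    return f"{local}@{domain}"
```

Store the normalized form next to the original and put the uniqueness check on the normalized column.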
HTML Form Randomization
Randomize the order of choices. This might take a while for them to find out.
Pro: nice to have anyways
Con: once detected, very easy to circumvent
HTTPS
One method of vote faking is to capture the HTTP request from a valid browser like Firefox and mimic it with a script. This doesn't work as easily when you use encryption.
Pro: nice to have anyway
Pro: good against very simple scripts
Con: more difficult to set up
Proxy checking
If the spammer votes via proxy, you can check for the X-Forwarded-For header.
Pro: good against more advanced scripts that use proxies
Con: some legitimate users can be affected
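Checking for proxy headers might look like the sketch below (Python, with `headers` assumed to be a dict of request headers). Only X-Forwarded-For comes from the answer above. Via and Forwarded are other standard headers added here as an assumption, and the function name is made up.

```python
PROXY_HEADERS = ("X-Forwarded-For", "Via", "Forwarded")


def proxy_hints(headers):
    """Return the proxy-related headers present in a request (case-insensitive)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        name: lowered[name.lower()]
        for name in PROXY_HEADERS
        if name.lower() in lowered
    }
```

Anonymizing proxies simply omit these headers and clients can send fake ones, so an empty result proves nothing.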
Cache checking
Try to see if the client loads all the uncached resources.
Many spambots don't do this. I never tried this, I just know that this isn't checked usually by voting sites.
An example would be embedding <img src="a.gif" /> in your html, with a.gif being some 1x1 pixel image. Then you have to set the http header for the request GET /a.gif with Cache-Control "no-cache, must-revalidate". You can set the http headers in Apache with your .htaccess file like this. (thanks Jacco)
Pro: uncommon method as far as I know
Con: slightly harder to set up
[Edit 2010-09-22]
Evercookie
A so-called evercookie can be useful to track browser-based spammers
Have you tried to do browser fingerprinting?
Check this open source from EFF:
https://panopticlick.eff.org/
It could be used to single out one person among some 500-1500 similar ones in the world (!).
You may add captcha to voting form. Also requiring e-mail confirmation will be useful
If you're really worried about it then you have to do something like email verification, which might be sufficient to block most cheaters.
Also it depends whether multiple people behind a NAT are likely to want to vote for the same option (e.g. favourite school).
Any scheme you create can be gamed.
EDIT: As everyone else has suggested, you can use a CAPTCHA such as reCAPTCHA to block automated bots, and make humans less likely to repeat vote. At the cost of making humans less likely to vote at all.
The Vote to Promote pattern (you may be aware of it) has a section on how to mitigate against gaming, but it is a tricky one to avoid altogether. Given your actions to date I would consider using weighting. For example, consider a reasonable level of voting over a time period, say 10 votes per thing per hour (just an example, not a guide), and for surplus votes weight the next 10 at 90% (i.e. only count 9), the next 10 at 80% and so on. This is Yahoo's advice on gaming within this pattern:
Community voting systems do present a
number of challenges. Particularly the
possibility that members of the
community may try to game the system,
out of any number of motivations:
malice - perhaps against another member of the community and that
member's contributions.
gain - to realize some reward, monetary or otherwise, from
influencing the placement of certain
items in the pool)
or an overarching agenda - always promoting certain viewpoints or
political statements, with little
regard for the actual quality of the
content being voted for.
There are a number of ways to attempt
to safeguard against this type of
abuse. Though nothing can stop gaming
altogether. Here are some ways to
minimize or hinder abusers in their
efforts:
Vote for things, not people. In keeping with Yahoo's general strategy,
don't offer users the ability to
directly vote on another user: their
looks, their likeability,
intelligence, or anything else. It's
OK for the community to vote on a
person's contributions, but not on the
quality of their character.
Consider rate-limiting of votes.
o Only allow the user a certain number of votes within a given
time-period.
o Limit the number of times (or the rate at which) a user votes
down a particular user's content. (To
prevent ad-hominem attacks.)
Weigh other factors besides just the number of votes. Digg, for
instance, does not calculate their
Digg-score solely on the number of
votes a submission receives. Their
algorithm also considers: "story
source (is it a blog repost, or the
original story), user history, traffic
levels of the category the story falls
under, and user reports." They update
this algorithm frequently. Consider
keeping the exact algorithm a secret
from the community, or only discuss
the factored inputs in general terms.
If relationship information is available consider weighting user
votes accordingly. Perhaps prohibit
users with formal relationships from
voting for each other's submissions.
While this is currently a popular
pattern on the Web, it is important to
consider the contexts in which we use
it. Very active and popular
communities (Digg is an excellent
example) that enable community-voting
can also engender a certain negativity
of spirit (mean comments, opinionated
cliques, group attacks on 'outlier'
viewpoints).
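As a rough illustration of the "rate-limiting of votes" point in the quote above, here is a minimal PHP sketch. The function name `canVote`, the default limit of 5 votes per hour, and the idea of passing in a list of timestamps are all assumptions made for the example; in a real PHP/MySQL setup the timestamps would come from a votes table keyed by user or cookie ID.

```php
<?php
// Hypothetical helper: decide whether a voter may cast another vote.
// $previousVotes holds the Unix timestamps of that voter's earlier votes.
function canVote(array $previousVotes, int $now, int $maxVotes = 5, int $windowSeconds = 3600): bool
{
    $windowStart = $now - $windowSeconds;

    // Keep only the votes that fall inside the current time window.
    $recent = array_filter(
        $previousVotes,
        fn (int $timestamp): bool => $timestamp > $windowStart
    );

    // Allow the vote only while the voter is still under the limit.
    return count($recent) < $maxVotes;
}
```

The same check works for the "votes down a particular user's content" case if you only pass in the down-votes aimed at that one author.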
Check out Asirra: http://research.microsoft.com/en-us/um/redmond/projects/asirra/
It's still in beta, but it's pretty cool.
To prevent the bots from voting you can use CAPTCHA.
The only thing that comes to mind is using a CAPTCHA. Either an elaborate one with pictures and noise like the reCAPTCHA service, or a very simple and unobtrusive one like "What is seven plus three?" or (if you're located in the US) "What is the last name of our President?": simple common-sense questions everybody can answer. If you change them often enough, this could even be more effective than a classic image-based CAPTCHA.
CAPTCHAs aren't a silver bullet: the user could have their script display each CAPTCHA to them and solve it manually, which still allows at least several votes per minute.
You need to use them in combination with other techniques mentioned here.
You could add a honeypot field like in Django. Most likely, this will not protect you from cheaters who deliberately want to change the outcome of your competition, but at least you will have fewer 'drive-by' spammers to take care of on top of that.
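A honeypot field can be sketched in PHP roughly like this. The field name `website`, the form action `vote.php`, and both function names are made up for the example. The idea is just that a human never sees the hidden field and leaves it empty, while a naive bot fills in every input it finds.

```php
<?php
// Render the vote form with a decoy text field hidden from human visitors.
function renderVoteForm(int $itemId): string
{
    return '<form method="post" action="vote.php">'
        . '<input type="hidden" name="item_id" value="' . $itemId . '">'
        . '<div style="display:none"><label>Leave this empty'
        . '<input type="text" name="website" value="" autocomplete="off"></label></div>'
        . '<button type="submit">Vote</button>'
        . '</form>';
}

// Returns true when the decoy field was filled in (or tampered with),
// which suggests the submission came from a script.
function isHoneypotTriggered(array $post): bool
{
    $value = $post['website'] ?? '';

    return !is_string($value) || trim($value) !== '';
}
```

In the form handler you would call `isHoneypotTriggered($_POST)` and silently drop the vote when it returns true.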
Sorry for the double post, but I wasn't allowed to post two URLs in the same post...
If you're looking at building your own tracking, maybe this link might provide some inspiration: https://panopticlick.eff.org/
Turns out that a lot of browsers can be uniquely identified, even without any form of tracking cookies. I'm guessing a vote-bot might give a very specific fingerprint?
So if anyone ever wants to run a competition where people can win something and wants to use a community-driven rating system... here I share some experiences:
The bad:
1) First, it can't be made 100% secure
2) Reaching a mass of users large enough to filter out all the nonsense ratings is very hard
3) Forget about star ratings in that case... there is always either 5 stars or 1 star
The good:
1) Don't give them any orientation about where they stand... We replaced the "Order by place" view with a random presentation of the TOP 100 (only the top 30 will win a prize)... This really helped because a lot of users lost interest as soon as they couldn't see where they stood.
2) Don't allow voting patterns like: 1x5_Stars 40x1_Star... Only count users who vote in a fair way...
3) Most of them act a little bit stupid... You'll see them in your logs and can trace who votes fairly and who doesn't... Search for patterns...
**GOOD LUCK ;-)**
CAPTCHA is always good, might be "disturbing" for some users though.
reCAPTCHA is a fairly widely used service.
How about only allowing users who have logged in with OpenID and solved a reCAPTCHA to submit a vote, and monitoring the submitter list for repeated IP addresses?
We use a combination of CAPTCHA and email. The user receives a link with a GUID by mail.
The GUID must be unique for each user who tries to vote.
www.votesite.com/vote.aspx?guid=.....
Opening this link confirms the vote; votes that are never confirmed are not counted. In the database we check that the combination of email address and GUID is unique.
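Translated to PHP, the email-plus-GUID idea could look something like the sketch below. The class `PendingVotes` is hypothetical and keeps everything in an array so the example is self-contained; in practice it would be a MySQL table with a unique index on the email column, and the token would be mailed out as part of the confirmation link.

```php
<?php
// In-memory stand-in for a database table with a UNIQUE constraint on email.
final class PendingVotes
{
    /** @var array<string, array{token: string, confirmed: bool}> */
    private array $rows = [];

    // Register a vote for an email address and return the token to mail out.
    // Returns null if that address has already been used.
    public function register(string $email): ?string
    {
        $key = strtolower(trim($email));
        if (isset($this->rows[$key])) {
            return null;
        }

        $token = bin2hex(random_bytes(16)); // 32 hex characters, hard to guess
        $this->rows[$key] = ['token' => $token, 'confirmed' => false];

        return $token;
    }

    // Confirm the vote when the emailed link is opened. Each token works once.
    public function confirm(string $email, string $token): bool
    {
        $key = strtolower(trim($email));
        $row = $this->rows[$key] ?? null;

        if ($row === null || $row['confirmed'] || !hash_equals($row['token'], $token)) {
            return false;
        }

        $this->rows[$key]['confirmed'] = true;

        return true;
    }
}
```

Only votes for which `confirm()` returned true would be counted.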
I use a combination of CAPTCHA, IP verification and LSO (Flash Local Shared Objects, hard to find and delete for common people).
1. Use reCAPTCHA.
2. Yes, randomize your voting options, but not like this:
-> from vote_id_1 to asdsasd_1, grdsgsdg_2,
Instead, use session variables to set a mask from vote_id_1 to asgjdas87th2ad in the vote form.
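One way to read the session-mask suggestion is sketched below. The function names and the `vote_mask` session key are invented for the example, and the session is passed in as a plain array so the code runs on its own; on a real page you would pass `$_SESSION` after `session_start()`.

```php
<?php
// Build a per-session map from random tokens to the real option IDs and
// return the tokens, which go into the form instead of vote_id_1, vote_id_2...
function maskOptions(array $optionIds, array &$session): array
{
    $mask = [];
    foreach ($optionIds as $id) {
        // The 'v' prefix keeps the key a string even if the hex is all digits.
        $mask['v' . bin2hex(random_bytes(8))] = $id;
    }
    $session['vote_mask'] = $mask;

    return array_keys($mask);
}

// Translate a submitted token back to the real option ID.
// Returns null for tokens that were never issued in this session.
function unmaskOption(string $token, array $session): ?int
{
    return $session['vote_mask'][$token] ?? null;
}
```

Because the tokens differ per session, a script cannot simply replay a fixed field name such as `vote_id_1`; it has to load the form first.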
What about some post hoc stochastic analysis, like time series analysis - looking for periodicity in events of particular (ip, browser, vote)? You could then assign probability to each such group of events that it belongs to 1 person and either discard all such groups of events beyond some probability level, or use some kind of weighting to lower the weight according to the probability.
Look into R; it contains a lot of useful analysis packages.
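If you want a first, very crude version of the periodicity check without leaving PHP, something like the following might do. It is not a proper time series analysis, just the coefficient of variation of the gaps between votes from one (ip, browser, vote) group; the function name and the idea of treating a low value as suspicious are assumptions for the example.

```php
<?php
// Coefficient of variation (standard deviation / mean) of the gaps between
// consecutive votes. Values near 0 mean the votes arrive at almost perfectly
// regular intervals, which is more typical of a script than of a person.
// Returns null when there are too few votes to judge.
function intervalVariation(array $timestamps): ?float
{
    sort($timestamps);

    $gaps = [];
    for ($i = 1; $i < count($timestamps); $i++) {
        $gaps[] = $timestamps[$i] - $timestamps[$i - 1];
    }
    if (count($gaps) < 2) {
        return null;
    }

    $mean = array_sum($gaps) / count($gaps);
    if ($mean == 0) {
        return 0.0; // all votes share one timestamp
    }

    $variance = 0.0;
    foreach ($gaps as $gap) {
        $variance += ($gap - $mean) ** 2;
    }
    $variance /= count($gaps);

    return sqrt($variance) / $mean;
}
```

You could then down-weight or discard groups whose value falls below some threshold you pick from your own logs.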
Check the domain details of the email they are using. I had the same problem and found that all of them were registered to the same registrant. I wrote it up here: http://tincan.co.uk/659/news/competition-spammers.html
Now, I filter on the DNS information for the email used in the registration.
