I have an algorithm that receives input and delivers output, which I would like developers to use like an API. To prevent denial of service attacks and excessive overuse, I want some rate limits or protection. What options do I have? Do I provide accounts and API keys? How would that generally work? And what other ideas are possible for this scenario?
Accounts and API keys do sound like a good idea; if nothing else, it stops people other than your intended developers from being able to access your API.
It should be fairly straightforward to keep a simple database table logging the last time a particular client (or key) accessed the API, and to deny re-use if it is hit too many times within a certain time frame. If possible, return the time at which the API will next be available for re-use in the response, so developers can throttle accordingly instead of having to resort to trial and error.
Are you expecting the same inputs to be used over and over again, or will they be completely random? What about caching the output and only serving the cache to the developer(s) until the API is ready for re-use? This approach is far less dependent on accounts and keys, too.
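For illustration, a minimal PHP/PDO sketch of that kind of check, assuming a table such as api_access(api_key, last_access) and a MySQL-style REPLACE (the table and column names are just placeholders):

```php
<?php
// Sketch only: deny a request if the same key was used within the last
// $cooldown seconds, and report when the key becomes usable again.
// Assumes api_access(api_key PRIMARY KEY, last_access INT) in MySQL.
function check_rate_limit(PDO $db, string $apiKey, int $cooldown = 10): array
{
    $stmt = $db->prepare('SELECT last_access FROM api_access WHERE api_key = ?');
    $stmt->execute([$apiKey]);
    $last = $stmt->fetchColumn();

    $now = time();
    if ($last !== false && ($now - (int)$last) < $cooldown) {
        // Too soon: tell the developer when to retry.
        return ['allowed' => false, 'retry_at' => (int)$last + $cooldown];
    }

    // Record this access (MySQL REPLACE keeps one row per key).
    $db->prepare('REPLACE INTO api_access (api_key, last_access) VALUES (?, ?)')
       ->execute([$apiKey, $now]);

    return ['allowed' => true, 'retry_at' => null];
}
```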
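As a rough sketch of that caching idea (the api_cache table and the $compute callback are assumptions for illustration, not real details of your API):

```php
<?php
// Sketch: serve a cached result for identical inputs while it is still fresh,
// otherwise run the real algorithm and store its output.
// Assumes api_cache(input_hash PRIMARY KEY, output, created_at) in MySQL.
function get_or_compute(PDO $db, string $input, callable $compute, int $ttl = 60): string
{
    $hash = sha1($input);
    $stmt = $db->prepare('SELECT output FROM api_cache WHERE input_hash = ? AND created_at > ?');
    $stmt->execute([$hash, time() - $ttl]);
    $cached = $stmt->fetchColumn();
    if ($cached !== false) {
        return $cached;              // fresh enough: skip the algorithm entirely
    }

    $output = $compute($input);      // run the real algorithm
    $db->prepare('REPLACE INTO api_cache (input_hash, output, created_at) VALUES (?, ?, ?)')
       ->execute([$hash, $output, time()]);
    return $output;
}
```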
API keys can definitely be a good way to go. There is also OAuth (http://oauth.net) if you have scenarios where end users will be accessing the service via apps built by third parties.
If you don't want to code the rate limits / key management yourself, it's worth taking a look at http://www.3scale.net/ which does a lot of this out of the box, for free, as a service (plus other things including a developer portal, billing and so on). As a disclaimer, I work there, so I might have some bias, but we try to make exactly this as simple as possible!
I should add, there's a PHP plugin for 3scale which you can drop into your code and that'll enable all the rate limits etc.
Another option that is slightly less complex, at the expense of accuracy, is using the IP address. Obviously this is easier to circumvent, but for the average user who does not know what an IP address is, it works. It is also easy to set up.
It all depends on the complexity of the app and the amount of time you have to do it in.
I've decided the best way to handle authentication for my apps is to write my own session handler from the ground up. Just like in Aliens, it's the only way to be sure a thing is done the way you want it to be.
That being said, I've hit a bit of a roadblock when it comes to fleshing out the initial design. I was originally going to use PHP's session handler in a hybrid fashion, but I'm worried about concurrency issues with my database. Here's what I was planning:
The first thing I'm doing is checking IPs (or possibly even sessions) to honeypot unauthorized attempts. I've written up some conditionals that sleep() on naughty requests. The big problem here is obviously WHERE to store my blacklist for optimal read speed.
A session_id is generated, hashed, and stored in $_SESSION['myid']. A separate piece of the same token gets stored in a second $_SESSION['mytoken']. The corresponding data is then stored in TABLE X, whose location I'm not settled on (which is the root of this question).
Each subsequent request then verifies that [myid] and [mytoken] are what we expect them to be, then reissues new credentials for the next request.
Depending on the status of the session, more obvious ACL functions could then be performed.
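In rough PHP, the verify-and-reissue cycle I have in mind looks something like this (table_x, its columns and the $db handle are placeholders, not a settled design):

```php
<?php
// Very rough sketch of steps 2-3 above; table_x and its columns are
// placeholders, not final code.
session_start();

function issue_credentials(PDO $db): void
{
    // Clear the previous server-side record, if any.
    if (isset($_SESSION['myid'])) {
        $db->prepare('DELETE FROM table_x WHERE sid = ?')->execute([$_SESSION['myid']]);
    }
    $_SESSION['myid']    = hash('sha256', session_id() . bin2hex(random_bytes(16)));
    $_SESSION['mytoken'] = bin2hex(random_bytes(16));
    $db->prepare('INSERT INTO table_x (sid, token) VALUES (?, ?)')
       ->execute([$_SESSION['myid'], $_SESSION['mytoken']]);
}

function verify_request(PDO $db): bool
{
    if (!isset($_SESSION['myid'], $_SESSION['mytoken'])) {
        return false;
    }
    $stmt = $db->prepare('SELECT token FROM table_x WHERE sid = ?');
    $stmt->execute([$_SESSION['myid']]);
    $expected = $stmt->fetchColumn();

    if ($expected === false || !hash_equals($expected, $_SESSION['mytoken'])) {
        return false;           // possible hijack attempt: handle accordingly
    }
    issue_credentials($db);     // rotate credentials for the next request
    return true;
}
```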
So that is a high level overview of my paranoid session handler. Here are the questions I'm really stuck on:
I. What's the optimal way of storing an IP ACL? Should I be writing/reading to hosts.deny? Are there any performance concerns with my methodology?
II. Does my MitM prevention method seem ok, or am I being overly paranoid with comparing multiple indexes? What's the best way to store this information so I don't run into brick walls at 80-100 users?
III. Am I hammering on my servers unnecessarily with constant session regeneration + writebacks? Is there a better way?
I'm writing this for a small application initially, but I'd prefer to keep it a reusable component I could share with the world, so I want to make sure I make it as accessible and safe as possible.
Thanks in advance!
Writing to hosts.deny
While this is an alright idea if you want to completely IP-ban a user from your server, it will only work with a single server. Unless you have some kind of safe propagation across multiple servers (oh man, it sounds horrible already) you're going to be stuck on a single server forever.
You'll have to consider these points about using hosts.deny too:
Security: Opening up access to as important a file as hosts.deny to the web server user
Pain in the A: Managing multiple writes from different processes (denyhosts for example)
Pain in the A: Safely making amendments to the file if you'd later like to re-grant access to an IP that was previously banned
I'd suggest you simply ban the IP address at the application level. You could even store the banned IP addresses in a central database so they can be shared by multiple subsystems while still being enforced at the application level.
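As a sketch of that application-level check in PHP/PDO (the banned_ips table, its columns and the MySQL-specific INSERT IGNORE are assumptions for illustration):

```php
<?php
// Sketch: consult a shared banned_ips(ip PRIMARY KEY, banned_at INT) table
// early in every request, and add offenders to it from any subsystem.
function is_banned(PDO $db, string $ip): bool
{
    $stmt = $db->prepare('SELECT 1 FROM banned_ips WHERE ip = ?');
    $stmt->execute([$ip]);
    return (bool) $stmt->fetchColumn();
}

function ban(PDO $db, string $ip): void
{
    // MySQL-specific INSERT IGNORE keeps the call idempotent.
    $db->prepare('INSERT IGNORE INTO banned_ips (ip, banned_at) VALUES (?, ?)')
       ->execute([$ip, time()]);
}

// Early in the request cycle:
// if (is_banned($db, $_SERVER['REMOTE_ADDR'])) { http_response_code(403); exit; }
```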
I. The optimal way of storing an IP ACL would be pushing banned IPs to an SQL database, which does not suffer from the concurrency problems of writing to files. An external script can then, on a regular basis or via a trigger, generate IPTABLES rules. You do not need to re-read your database on every access; you only write when you detect misbehavior.
II. Binding to the IP is not a good idea on the public Internet if you offer the service to clients behind transparent proxies or on mobile devices - their IP changes. Let users choose in their preferences whether they want this feature (it depends on your audience and whether they know what an IP is...). My solution is to generate a unique token per (page) request, re-used by that page's AJAX requests (so as not to run into resource problems - random numbers, session data store, ...). The tokens I generate are stored within the session and remembered for several minutes. This lets a user open several tabs, go back, and submit in an earlier-opened tab. I do not bind to the IP.
III. It depends... there is not enough data from you to answer. The above may perfectly suit your needs for a ~500 user base coming to your site for 5 minutes a day. Or it may fit even for 1000 unique concurrent users per hour on a chat site/game - it depends on what your application is doing and how well you cache data that can be cached.
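A minimal sketch of that per-page-token scheme (the TOKEN_TTL constant and the array layout inside $_SESSION are illustrative choices, not the exact code described above):

```php
<?php
// Sketch: each rendered page gets its own token, kept in the session for a
// few minutes, so several open tabs all stay valid.
session_start();

const TOKEN_TTL = 600; // keep tokens for ~10 minutes

function issue_page_token(): string
{
    $token = bin2hex(random_bytes(16));
    $_SESSION['page_tokens'][$token] = time();

    // Prune expired tokens so the session does not grow without bound.
    foreach ($_SESSION['page_tokens'] as $t => $issued) {
        if (time() - $issued > TOKEN_TTL) {
            unset($_SESSION['page_tokens'][$t]);
        }
    }
    return $token; // embed this in the page; its AJAX requests send it back
}

function validate_page_token(string $token): bool
{
    $issued = $_SESSION['page_tokens'][$token] ?? null;
    return $issued !== null && (time() - $issued) <= TOKEN_TTL;
}
```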
Design well, test, benchmark. Check whether session handling really is your resource problem, and not something else. Good algorithms should not throw you into resource problems. That includes DoS defense, which should not be in-application code. Applications may hint to DoS prevention mechanisms about what to do, but leave the defense to specialized tools (see answer I.).
Anyway, if you run into resource problems in the future, the best way out is new hardware. It may sound rude or even incompetent to some, but compare the price of a new server in 6 months, which will be practically 30% better, with the price of your work: pay $600 for a new server and get an additional 130% of horsepower, or pay yourself $100 a month to improve things by 5% (okay, maybe by 40%, but how much a week of your work is worth can vary seriously).
If you design from scratch, read https://www.owasp.org/index.php/Session_Management first, then search for session hijacking, session fixation and similar strings on Google.
I am working on an API for my web application written in CodeIgniter. This is my first time writing an API.
What is the best way of imposing a usage limit on the API?
Thanks for your time
Log the user's credentials (if they have to provide any) or their IP address, the request (optionally) and a timestamp in a database.
Now, for every request, delete the records whose timestamp is more than an hour old, check how many requests for that user are still in the table, and if that number exceeds your limit, deny the request.
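A minimal sketch of that approach in plain PDO (the api_requests table and its columns are assumptions; CodeIgniter's query builder would work just as well):

```php
<?php
// Sketch of the delete-count-deny approach.
// Assumes api_requests(user_id, requested_at INT).
function allow_request(PDO $db, string $userId, int $limit = 100): bool
{
    $cutoff = time() - 3600;

    // Drop records older than an hour.
    $db->prepare('DELETE FROM api_requests WHERE requested_at < ?')
       ->execute([$cutoff]);

    // Count this user's remaining requests in the window.
    $stmt = $db->prepare('SELECT COUNT(*) FROM api_requests WHERE user_id = ?');
    $stmt->execute([$userId]);
    if ((int)$stmt->fetchColumn() >= $limit) {
        return false; // over the limit: deny
    }

    // Record this request and allow it.
    $db->prepare('INSERT INTO api_requests (user_id, requested_at) VALUES (?, ?)')
       ->execute([$userId, time()]);
    return true;
}
```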
It's a simple solution; keep in mind, though, there may be more performant solutions out there.
Pretty straightforward. If that doesn't answer your question, please provide more details.
I don't see how this is CodeIgniter-related, for example.
You can use my REST_Controller to do basically all of this for you:
http://net.tutsplus.com/tutorials/php/working-with-restful-services-in-codeigniter-2/
I recently added some key logging and request limiting features, so all of this can be done through config.
One thing you can do is consider using an external service to impose API limits and provide API management functionality in general.
For example, my company, WebServius ( http://www.webservius.com ) provides a layer that sits in front of your API and can provide per-user throttling (e.g. requests per API key per hour), API-wide throttling (e.g. total requests per hour), adaptive throttling (where throttling limits decrease as API response time increases), etc, with other features coming soon (e.g. IP-address-based throttling). It also provides a page for user registration / issuing API keys, and many other useful features.
Of course, you may also want to look at our competitors, such as Mashery or Apigee.
Of course, I store all players' IP addresses in MySQL, and I can check whether there is already a person with the same IP address before someone registers, but then they can register to my page at school or wherever they want. So, any suggestions?
The only way that proves particularly effective is to make people pay for accessing your game.
Looking behind the question:
Why do you want to stop the same person registering and playing twice?
What advantage will they have if they do?
If there's no (or only a minimal) advantage then don't waste your time and effort trying to solve a non-problem. Also putting up barriers to something will make some people more determined to break or circumvent them. This could make your problem worse.
If there is an advantage then you need to think of other, more creative, solutions to that problem.
You can't. There is no way to uniquely identify users over the internet. Don't use IP addresses, because there could be many people using the same IP, or people using dynamic IPs.
Even if somehow you made them give you a piece of legal identification, you still wouldn't be absolutely sure that they were not registered on the site twice as two different accounts.
I would check the user's IP every time they log onto the game, then log users who come from the same IP and how much they interact. You may find that you get some users from the same IP (i.e. roommates or spouses who play together and are not actually the same person). You may just have to flag these users and monitor their interactions - for example, is there a chat service in the game? If they don't ever talk to each other, they're more than likely the same person, and you can review them on an individual basis.
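A rough sketch of that kind of review query, assuming logins are recorded in a logins(user_id, ip, logged_at) table and $db is a PDO connection to MySQL (both are assumptions):

```php
<?php
// Sketch: flag IPs shared by more than one distinct account for manual review.
$stmt = $db->query(
    'SELECT ip, COUNT(DISTINCT user_id) AS accounts,
            GROUP_CONCAT(DISTINCT user_id) AS user_ids
     FROM logins
     GROUP BY ip
     HAVING accounts > 1'
);
foreach ($stmt as $row) {
    // e.g. push to a moderation queue rather than banning automatically
    printf("IP %s shared by accounts: %s\n", $row['ip'], $row['user_ids']);
}
```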
If it's in a web browser, you could bring in information like the OS or browser. Even this doesn't make it safe, but it is still safer.
It would only take the hackers a little more time, and you have to allow for the possibility that some people play on systems with the same OS and browser.
The safest thing would be for people on the same IP to not be able to do things with each other, like trading, or, as in the game PKR (a poker game), not being able to sit at the same table.
Another thing that would be wise to do is use captchas; they are very user-unfriendly, but they keep a lot of bots out.
If it is a browser-based game, Flash cookies are a relatively resilient way to identify a computer. Or have them pay a minimal amount and identify them by credit card number - that way it still won't be hard to make multiple accounts (friends' and family members' cards), but it will be hard to make a lot of them. Depending on your target demographic, it might put potential players off registering, though.
The best approach is probably not worrying much about it and setting the game balance in such a way that progress is proportional to time spent playing (and use a strong captcha to keep bots away). That way, using multiple accounts will offer no advantage.
There are far too many ways to circumvent any restrictions to limit to a single player. FAR too many.
Unless the additional player is causing some sort of problem it is not worth the attempt. You will spend most of your time chasing 'ghosts' instead of concentrating on improving the game and making more money.
IP bans do not work, and neither do Flash cookies as a control mechanism.
Browser fingerprinting does not work either. People can easily use a second browser.
Even UUIDs will not work, as those too can be spoofed.
And if you actually did manage to discover and implement a working method, the user could simply use a second computer or laptop and what then?
People can also sandbox a browser so as to use the same browser twice thus defeating browser identification.
And then there are virtual machines....
We have an extreme amount of control freaks out there wanting to control every aspect of computing. And the losers are the people who do the computing.
Every tracking issue I have ever hit I can circumvent easily, be it UUIDs, MAC addresses, IP addresses, fingerprinting, etc. And it is very easy to do.
Best suggestion is to simply watch for any TOU violations and address the problem accordingly.
I am currently selling time-based access passes to an online service at micropayment prices.
After payment the customer gets a set of credentials that is only valid for the purchased period. When the access pass expires the customer has to buy a new set of credentials.
So basically the credentials are one-time(period) use only.
I would like to offer a free trial of x minutes to this service so potential customers can see that it works fine, possibly increasing total sales.
My question is, how would you stop abusers?
That is, people should only be allowed to try for free once, and if that is not possible at least make them go through a process/test which (as in shareware) is too cumbersome or annoying for them to keep trying it.
Obviously there is always someone who will bypass it. I am looking for a solution for the majority of people who are either not IT savvy, time constrained, or simply too lazy to bother abusing it, instead of simply paying the tiny fee.
I have some approaches in mind but would like to be inspired by other people too.
The service is developed with LAMP.
Put a cookie in their browser. Force a small delay before they can re-use your service, or make them go to the trouble of deleting the cookie. If they block cookies, politely ask them to allow them. You might have more business success if you allow several trials, with a minimum of hassle.
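A minimal PHP sketch of the cookie approach (the cookie name and the 30-day lifetime are arbitrary choices for illustration):

```php
<?php
// Sketch: mark the browser once the trial starts and refuse a second trial
// while the cookie is present. Deleting the cookie takes deliberate effort.
const TRIAL_COOKIE = 'trial_used';

function trial_available(): bool
{
    return !isset($_COOKIE[TRIAL_COOKIE]);
}

function start_trial(): void
{
    setcookie(TRIAL_COOKIE, '1', [
        'expires'  => time() + 30 * 86400, // remember the trial for 30 days
        'path'     => '/',
        'httponly' => true,
    ]);
}

if (!trial_available()) {
    echo 'Your free trial has already been used. Please purchase an access pass.';
    exit;
}
start_trial();
```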
If you look around, everybody who gives out free trials binds them to a credit card - not to charge them, but to verify the user's identity. That's about the only feasible way to prevent abuse I can think of.
Any other idea will depend on the kind of service you are offering. StackExchange for example can offer a 45 day trial without a credit card no problem, simply because the effort to build a SE site is so huge, starting multiple trial periods (and having to configure a new site and build a new community every time) just wouldn't work.
Something similar could be unique login names that you can register during your trial period, and would have to give away if you don't convert it into a pay subscription, things like that. Really depends on the nature of your service.
The users who want to try your product again via a trial are highly convertible users because they already know the value of the product.
The challenge is detecting them and then converting them to paying users.
Detection can be done using a variety of signals, including:
IP
Cookies
Device fingerprints
Credit card or payment information
Email verification and validation
Each individual signal has its challenges, e.g. IPs can change and are legitimately shared among large audiences, such as behind carrier-grade NAT.
SMS verification is good in most markets but adds friction and potentially cost for you and your users.
Something like Upollo.ai solves all the hard parts for you, so it is worth a look for people facing these problems in the future.
I ended up using the smallest possible payment amount for a short time span, but enough to get the user satisfied at very low monetary risk.
In the time that has passed since I asked, I actually seriously considered using Flash cookies, which very few people know how to remove (or even that they exist).
The other simple (although not free) option is SMS confirmation, which binds the user's mobile phone number. Since you don't throw away a mobile phone number the way you do an email address, this is also a fairly safe limitation method.
I ask this because I am creating a spider to collect data from blogger.com for a data visualisation project for university.
The spider will look for about 17,000 values on the browse function of blogger and (anonymously) save certain ones if they fit the right criteria.
I've been running the spider (written in PHP) and it works fine, but I don't want to have my IP blacklisted or anything like that. Does anyone have any knowledge on enterprise sites and the restrictions they have on things like this?
Furthermore, if there are restrictions in place, is there anything I can do to circumvent them? At the moment all I can think of to help the problem slightly is adding a random delay between calls to the site (between 0 and 5 seconds) or running the script through random proxies to disguise the requests.
Having to resort to methods like the above makes me feel as if I'm doing the wrong thing. I would be annoyed if they were to block me for whatever reason, because blogger.com is owned by Google and their main product is a web spider. Admittedly, their spider does not send all of its requests to just one website.
It's likely they have some kind of restriction, and yes there are ways to circumvent them (bot farms and using random proxies for example) but it is likely that none of them would be exactly legal, nor very feasible technically :)
If you are accessing Blogger, can't you log in using an API key and query the data directly anyway? It would be more reliable and less trouble-prone than scraping their pages, which may be prohibited anyway and could lead to trouble once the number of requests gets big enough that they start to care. Google is very generous with the amount of traffic they allow per API key.
If all else fails, why not write an e-mail to them? Google has a reputation for being friendly towards academic projects, and they might well grant you more traffic if needed.
Since you are writing a spider, make sure it reads the robots.txt file and behaves accordingly. Also, one of the conventions of HTTP is not to make more than 2 concurrent requests to the same server. Don't worry, Google's servers are really powerful. If you only read pages one at a time, they probably won't even notice. If you add a 1 second interval between requests, it will be completely harmless.
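A sketch of what such a polite fetch loop might look like in PHP (the $urlsToFetch list is assumed to be built elsewhere, and the robots.txt check here is deliberately naive - it ignores User-agent sections, so a real spider should use a proper parser):

```php
<?php
// Sketch: one request at a time, a fixed pause between requests, and a very
// naive robots.txt check.
function path_disallowed(string $robotsTxt, string $path): bool
{
    foreach (explode("\n", $robotsTxt) as $line) {
        if (preg_match('/^Disallow:\s*(\S+)/i', trim($line), $m)
            && strpos($path, $m[1]) === 0) {
            return true;
        }
    }
    return false;
}

$base   = 'https://www.blogger.com';
$robots = @file_get_contents($base . '/robots.txt') ?: '';

foreach ($urlsToFetch as $path) {          // $urlsToFetch built elsewhere
    if (path_disallowed($robots, $path)) {
        continue;                          // respect robots.txt
    }
    $html = file_get_contents($base . $path);
    // ... extract the values you need from $html ...
    sleep(1);                              // one request per second is harmless
}
```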
On the other hand, using a botnet or another distributed approach is considered harmful behavior, because it looks like a DDoS attack. You really shouldn't be thinking in that direction.
If you want to know for sure, write an eMail to blogger.com and ask them.
You could request it through Tor; you would get a different IP each time, at a performance cost.