I am working on a API for my web application written in CodeIgniter. This is my first time writing a API.
What is the best way of imposing a API limit on the API?
Thanks for your time
Log the user's credentials (if he has to provide them) or his IP address, the request (optional) and a timestamp in a database.
Now, for every request, you delete records where the timestamp is more than an hour ago, check how many requests for that user are still in the table, and if that is more than your limit, deny the request.
Simple solution, keep in mind, though, there might be more performant solutions out there.
Pretty straight forward. If that doesn't answer your question, please provide more details.
I don't see how this is codeigniter related, for example.
You can use my REST_Controller to do basically all of this for you:
http://net.tutsplus.com/tutorials/php/working-with-restful-services-in-codeigniter-2/
I recently added in some key logging, request limiting features in so this can all be done through config.
One thing you can do is consider using an external service to impose API limits and provide API management functionality in general.
For example, my company, WebServius ( http://www.webservius.com ) provides a layer that sits in front of your API and can provide per-user throttling (e.g. requests per API key per hour), API-wide throttling (e.g. total requests per hour), adaptive throttling (where throttling limits decrease as API response time increases), etc, with other features coming soon (e.g. IP-address-based throttling). It also provides a page for user registration / issuing API keys, and many other useful features.
Of course, you may also want to look at our competitors, such as Mashery or Apigee.
Related
I am building a restful API for users of my Laravel application to retrieve their data.
The current plan is that they can generate an API Token within the application to then authenticate their API requests. I do not know from where they will be making the requests.
The main reason I want to implement rate limiting is to reduce the impact of accidental/intentional DDOS, as well as part of the users current subscription package (necessary). Because of the latter, different users may have different rates.
Laravel already provides a rate limiter built in, including access to dynamic user limits specified in the User table.
I'm wondering though how the session is handled. From what I can see the Laravel TokenGuard class does not store the the user between requests. Therefore the user is being retrieved between every request, even to retrieve the rate limit. This seems to defeat the point of the rate limiter if we are still making database queries each time.
What is the appropriate way to handle this?
If I write my own authentication middleware, and store the user in the session, would that work? Do requests sent from another server (not a browser) even handle sessions?
Thanks.
Every time anyone accesses your site, you are spinning up an entire Laravel instance which is already putting stress on your server. DDOS doesn't depend on distressing your DB only. If someone is determined to DDOS you, you are going to notice! All you can do is mitigate the problem, so don't worry too much that each request has an associated DB call.
You could have a local session, but in the long run this is a bad design decision, since it introduces state to your server, which will make scaling in the future much harder. (https://12factor.net/ for more info on that.) This is why Laravel uses the user stored in the DB instead.
Unless you are doing something pretty special, it's generally safe to assume that Laravel is using an adequate solution. They do frameworks so that you can worry about business logic!
Finally, there are many websites out there. The chances are that by the time you're big enough to attract attention of people trying to DDOS you (remember it takes resources, and, therefore, money) you'll probably be using a much more sophisticated system.
If a request with some kind of token reaches your application, you should not need any kind of session. As you've assumed: a session is usually handled through cookies, but raw HTTP calls (like cURL does it) usually don't use them.
Don't overestimate the cost of getting the current user from the database - if your application performs some more actions, these additional actions will make the difference! Getting one entity from the database is fairly cheap, compared to everything else, and to check for the proper permissions and rate limits, this is obviously neccessary.
Everything else looks like you're looking for something like Laravel Passport (see https://laravel.com/docs/5.7/passport). Additional tools like the Throttle package (see https://github.com/GrahamCampbell/Laravel-Throttle) will help you to enable the rate limiting for your routes
I would like to implement an API using php and mysql technologies that can handle several thousands of requests per second.
I haven't did this kind of API before. If you have an experienced to implement similar task could you please tell me what are the steps?
How can I implement such kind of API that can handle thousands of request per second?
I would be glad if you could explain with sample codes.
Thanks in advance for your help.
Based on the details described in the post, you likely want to use an asynchronous, stateless architecture. So requests don’t block resources and can scale easier (always sounds easier than actually doing it ;)).
Without knowing to what other services these servers would connect (it certainly doesn’t make things easier), I’d go for Elixir/Erlang as programming language and use Phoenix as a framework.
You get a robust functional language which comes with a lot of great built-in features/modules (e.g. mnesia, roll/unroll versions while being live) and scales well (good in utilizing all cores of your server).
If you need to queue up requests to the 2nd tier servers AMQP client/server (e.g. RabbitMQ) might be a good option (holds/stores the requests for the servers).
That works pretty okay if it’s stateless, in form of client asks one thing and the server responds once and is done with the task. If you have many requests because the clients ask for updates every single second, then you’re better switching to a stateful connection and use WebSockets so the server can push updates back to a lot of clients and cuts a lot of chatter/screaming.
All of this writing is from a ‘high up view’. In the end, it depends on what kind of services you want to provide. As that narrows down what the ‘suitable tool’ would be. My suggestion is one possibility which I think isn’t far off (Node.js mentioned before is also more than valid).
Well you need to consider several factors such as:
Authenticating the API. Your API should be called by valid users that are authorized and authenticated
Caching API results. Your API should cache the results of API call. This will allow your API to handle requests more quickly, and it will be able to handle more requests per second. Memcache can be used to cache results of API call
The API architecture. RESTFul APIs have less overhead as compared to SOAP based APIs. SOAP based APIs have better support for authentication. They are also better structured then RESTFul APIs.
API documentation. Your API should be well documented and easy for users to understand.
API scope. Your API should have a well defined scope. For example will it be used over the internet as a public API or will it be used as private API inside corporate intranet.
Device support. When designing your API you should keep in mind the devices that will consume your API. For example smart phones, desktop application, browser based application, server application etc
API output format. When designing your API you should keep in mind the format of the output. For example will the output contain user interface related data or just plain data. One popular approach is known as separation of concerns (https://en.wikipedia.org/wiki/Separation_of_concerns). For example separating the backend and frontend logic.
Rate limiting and throttling. Your API should implement rate limiting and throttling to prevent overuse and misuse of the API.
API versioning and backward compatibility. Your API should be carefully versioned. For example if you update your API, then the new version of your API should support older version of API clients. Your API should continue to support the old API clients until all the API clients have migrated to the new version of your API.
API pricing and monitoring. The usage of your API should be monitored, so you know who is using your API and how it is being used. You may also charge users for using your API.
Metric for success. You should also decide which metric to use for measuring the success of your API. For example number of API calls per second or monitory earnings from your API. Development activities such as research, publication of articles, open source code, participation in online forums etc may also be considered when determining the success of your API.
Estimation of cost involved. You should also calculate the cost of developing and deploying your API. For example how much time it will take you to produce a usable version of your API. How much of your development time the API takes etc.
Updating your API. You should also decide how often to update your API. For example how often should new features be added. You should also keep in mind the backward compatibility of your API, so updating your API should not negatively affect your clients.
Good answer, I think one thing to keep in mind is where the bottleneck is. Many times, the bottleneck isn't the API server itself but the data access patterns with the persistence layer.
Think about how you access your data. For posting new items, a lot of times the processing can be delayed and processed async to the original request. For example if resizing an image or sending an email, you can integrate RabmitMQ or SQS to queue up a job, which can be processed later by workers. Queues are great in buffering work so that if a server goes down, stuff is just queued up to be processed once back online.
On the query side, it is important to understand how indexing works and how data is stored. There are different types of indices, for example hash tables can give you constant access time, but you cannot perform range queries with hash tables. The easiest is if you have simple decentralized data objects queried by identifiers which can be stored in a index. If you're data is more complex where you need to do heavy joins or aggregations, then you can look at precomputed values stored in something like Redis or memcache.
I am currently trying to make a syncing operation between my Database and Gmail's contacts.
My first initial sync, downloading/uploading over 1,000 contacts per user might throw some errors up in gmails face.
Is there any work-arounds? What is the limitations to having many contacts?
My understanding is that it is limited per IP, and not per User... is this correct?
I hope that someone can share some info on this, I have searched the web, but haven't found the best of resources... Thoughts?!
I actually received a response from Google.
The query is currently per user and is quite high though there is a
limit in the number of queries per second, per hour and per half a day
you can send. Unfortunately, we don't publicly communicate on these
values but I can assure you that normal applications (not spamming
Google's servers) shouldn't have any issues.
Also, when syncing, please make sure to use the updated-min query
parameter to only request contacts that have been updated since the
provided time and use batch-request when sending requests to the API
as it will perform multiple operations while consuming only one
request on the user's quota.
Hopefully this helps someone else if in need.
Yes, there is a limitation on accessing the Google API (at least on Google Maps API) on an IP basis. The only workaround I was abble to find is to use proxy servers (or tor).
I have an algorithm that receives input and delivers output which I would like developers to use like an API. To prevent denial of service attack and excessive overuse, I want some rate limits or protection. What options do I have? Do I provide accounts and API keys? How would that generally work? And what other ideas are possible for this scenario?
Accounts and API keys does sound like a good idea, if nothing else it stops people other than your intended developers being able to access your API.
It should be fairly straightforward to have a simple database table logging the last time a particular API was accessed, and denying re-use if it is accessed too many times in a certain time frame. If possible, return the next time the API will be available for re-use in the output, so developers can throttle accordingly, instead of having to go for a trial and error approach.
Are you expecting the same inputs to be used over and over again or will it be completely random? What about caching the output and only serving the cache to the developer(s) until the API is ready for re-use? This approach is far less dependent on accounts and keys too.
API keys can definitely be a good way to go, there is also openAuth (http://oauth.net) if you scenarios where end users will be accessing the service via apps built by third parties.
If you don't want to code the rate limits / key management yourself, it's worth taking a look at http://www.3scale.net/ which does a lot of this free out of the box as a service (plus other stuff including a developer portal, billing and so on). As a disclaimer, I work there so I might have some bias but we try to make exactly this as simple as possible!
I should add, there's a PHP plugin for 3scale which you can drop into your code and that'll enable all the rate limits etc.
other options that are slightly less complex at the expense of accuracy is using the ip address. obviously this is easier to overcome, but for the average user that does not know what an ip address is it works. Also easy to set up.
it all depends on the complexity of the app and the amount of time you got to do it in
Let's say I have a website that has a lot of information on our products. I'd like some of our customers (including us!) to be able to look up our products for various methods, including:
1) Pulling data from AJAX calls that return data in cool, JavaScripty-ways
2) Creating iPhone applications that use that data;
3) Having other web applications use that data for their own end.
Normally, I'd just create an API and be done with it. However, this data is in fact mildly confidential - which is to say that we don't want our competitors to be able to look up all our products every morning and then automatically set their prices to undercut us. And we also want to be able to look at who might be abusing the system, so if someone's making ten million complex calls to our API a day and bogging down our server, we can cut them off.
My next logical step would be then to create a developers' key to restrict access - which would work fine for web apps, but not so much for any AJAX calls. (As I see it, they'd need to provide the key in the JavaScript, which is in plaintext and easily seen, and hence there's actually no security at all. Particularly if we'd be using our own developers' keys on our site to make these AJAX calls.)
So my question: after looking around at Oauth and OpenID for some time, I'm not sure there is a solution that would handle all three of the above. Is there some sort of canonical "best practices" for developers' keys, or can Oauth and OpenID handle AJAX calls easily in some fashion I have yet to grok, or am I missing something entirely?
I think that 2-legged OAuth is what you want to satisfy #2 and #3. For #1 I would suggest that instead of the customer making JS requests directly against your application, they could instead proxy those requests through their own web application.
A midway solution is to require an API key; and then demand that whomsoever uses it doesn't actually use it directly with the AJAX; but wrap their calls in a server-side request, e.g.:
AJAX -> customer server -> your server -> customer server -> user
Creating a simple PHP API for interested parties shouldn't be too tricky, and your own iPhone applications would obviously cut out the middle man, shipping with their own API key.
OAuth and OpenID are unlikely to have much to do with the AJAX calls directly. Most likely, you'll have some sort of authorization filter in front of your AJAX handler that checks a cookie, and maybe that cookie is set as a result of an OpenID authentication.
It seems like this is coming down to a question of "how do I prevent screen scraping." If only logged-in customers get to see the prices, that's one thing, but assuming you're like most retail sites and your barrier to customer sign-up is as low as possible, that doesn't really help.
And, hey, if your prices aren't available, you don't get to show up in search engines like Froogle or Nextag or PriceGrabber. But that's more of a business strategy decision, not a programming one.