Ajax calls vs. server-side calls - PHP

I am building a Twitter feed widget for WordPress, and one of the issues I have to deal with is Twitter's rate limits (150 requests per hour per account). I have noticed that when I'm fetching the tweets using server-side calls (e.g. file_get_contents()) the limit is reached very quickly, especially on a shared host. I've tried to fetch the tweets using client-side calls with jQuery's getJSON function, and the rate limit took a lot longer to reach.
What is the reason for this difference between client-side and
server-side calls when it comes to Twitter rate limits?
Which method would be preferable for this case?
Update
I should note that the tweets are being cached to avoid hitting the rate limits, but that does not help when the calls are made from a shared host.

When you use server-side calls, all the calls come from the same IP; all your users share the same 150 requests/hour quota.
When you use client-side calls, the calls come from a different IP for each client. Each client gets 150 requests per hour, so all the clients combined can fetch a much larger volume.
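If the widget has to stay server-side (shared host, one public IP), the usual workaround is to spend that single quota only on cache refreshes and serve every visitor from the cache. A minimal sketch, assuming a writable cache file next to the script; the screen name, cache path and the since-retired unauthenticated v1 endpoint are placeholders:
// Refresh the cached feed at most every 5 minutes, so the shared IP spends
// only ~12 of its 150 hourly requests regardless of how many visitors you have.
$screenName = 'screenname';
$cacheFile  = __DIR__ . '/tweets_cache.json';
$maxAge     = 300; // seconds

if (!file_exists($cacheFile) || time() - filemtime($cacheFile) > $maxAge) {
    $json = @file_get_contents(
        'http://api.twitter.com/1/statuses/user_timeline.json?screen_name=' . urlencode($screenName)
    );
    if ($json !== false) {
        file_put_contents($cacheFile, $json); // only overwrite the cache on success
    }
}

// Serve the cached copy (null if the very first fetch failed).
$tweets = json_decode((string) @file_get_contents($cacheFile), true);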

Related

cURL: from an ethical point of view, what call frequency is not harmful?

Question:
Could anyone please let me know at what frequency calling someone's website via cURL is not considered harmful?
Explanation:
I am building a small web app where I fetch WordPress posts and some of their information from a client's website.
(!) Not as a web scraper, as they have to install a mini-plugin that supplies only the relevant information using my authkey.
Because the number of pages can vary from 10 to 1000+, I am not doing it in one call. Instead I have made a page with an Ajax script that pulls at most 50 pages per call. The Ajax URL calls my fetch.php, which verifies the URL each time (including the header) and then gets the information via cURL, repeating until finished.
Scenario:
Let's imagine the client's website has 1000 pages. I would need to make 20 calls (without delays, that's likely to happen within 30 s).
I also have to verify the domain URL before each call, which is another cURL request fetching headers only (as a faster alternative to get_headers()).
That effectively doubles the number of calls to 40.
So, ethically, do I need to add a delay, or is this volume of calls not considered harmful to the client's website?
Thank you
This is likely to vary a lot, but as long as you make your calls sequentially, one at a time, I can't see how it could be harmful even for a small site. If you make them run at the same time, that is another story.
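If you want to be extra polite, a short pause between batches costs you almost nothing and keeps the calls strictly sequential. A rough sketch; the endpoint, auth key and batch size are made up for illustration and would have to match the mini-plugin's real interface:
// Pull the posts 50 pages at a time, one cURL request after another,
// with a half-second breather between batches.
$totalPages = 1000;            // e.g. reported by the plugin beforehand
$perBatch   = 50;
$authKey    = 'YOUR_AUTH_KEY'; // hypothetical key accepted by the mini-plugin

for ($offset = 0; $offset < $totalPages; $offset += $perBatch) {
    $url = "https://client-site.example/?mini_plugin_feed=1&offset=$offset&limit=$perBatch&key=$authKey";
    $ch  = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => 10,
    ));
    $batch = curl_exec($ch);
    curl_close($ch);

    // ... validate and store $batch ...

    usleep(500000); // 0.5 s pause before the next batch
}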

Query free API, IP blocking

I am using an API which is free.
I am using a PHP script which uses fopen to download JSON from the API.
When I make too many requests (e.g. 2 requests every minute), the API blocks my PHP server's IP.
Is there a way to solve this and make more requests (I don't want to launch a DDoS attack)?
Is there a better solution than using many PHP servers with different IPs?
This is quite an abstract question, as we don't know the actual API you are talking about.
But usually, if an API implements a rate limit, it includes this kind of header in its response:
X-Rate-Limit-Limit: the rate limit ceiling for that given request
X-Rate-Limit-Remaining: the number of requests left for the 15 minute window
X-Rate-Limit-Reset: the remaining window before the rate limit resets in UTC epoch seconds
Please check the docs (this example is from Twitter: https://dev.twitter.com/rest/public/rate-limiting).
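If the API does send such headers, you can read them and back off before your IP gets blocked. A sketch using cURL (the question uses fopen, but cURL makes the headers easy to capture); the header names follow the X-Rate-Limit-* convention above and may be spelled differently by your API:
// Fetch JSON and pause until the window resets when the remaining quota is used up.
function fetchJson($url)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HEADER         => true, // keep the headers in the returned string
    ));
    $response   = curl_exec($ch);
    $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
    curl_close($ch);

    // Split the raw headers from the body and index them by lower-cased name.
    $headers = array();
    foreach (explode("\r\n", substr($response, 0, $headerSize)) as $line) {
        if (strpos($line, ':') !== false) {
            list($name, $value) = explode(':', $line, 2);
            $headers[strtolower(trim($name))] = trim($value);
        }
    }

    if (isset($headers['x-rate-limit-remaining']) && (int) $headers['x-rate-limit-remaining'] === 0) {
        // Out of quota: sleep until the advertised reset time instead of getting blocked.
        sleep(max(0, (int) $headers['x-rate-limit-reset'] - time()));
    }

    return json_decode(substr($response, $headerSize), true);
}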

Google Safe Browsing API limits

Does anyone know how many URLs I can check and what delay I need between requests with the Safe Browsing API? I use it with PHP, but after checking 2k URLs, I got:
Sorry but your computer or network may be sending automated queries. To protect our users, we can't process your request right now.
It's supposed to be 10,000 per day, with both the Safe Browsing Lookup API
https://developers.google.com/safe-browsing/lookup_guide#UsageRestrictions
and the Safe Browsing API v2
https://developers.google.com/safe-browsing/developers_guide_v2#Overview
but they say you can ask for more, and it's free.
I understand that they allow you to make 10k requests per day. In each request you can query up to 500 URLs, so in total they let you look up 5M URLs daily, which is not bad.
I currently use the Google Safe Browsing API, and the following are its limits:
A single API key can make requests for up to 10,000 clients per 24-hour period.
You can query up to 500 URLs in a single POST request.
I previously sent one URL per request and ended up exceeding the quota defined by the API. Now I put up to 500 URLs in each request. That keeps me under the API's limit, and it is super fast too.
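In PHP the batching can be as simple as chunking the URL list and sending one POST per chunk. A sketch loosely following the Lookup API request format from the linked guide (the endpoint, parameters and response handling should be double-checked there before relying on them):
// Check URLs 500 at a time instead of one request per URL.
$apiKey = 'YOUR_API_KEY';
$urls   = array(/* ... the URLs you want to check ... */);

foreach (array_chunk($urls, 500) as $chunk) {
    // Lookup API body: the URL count on the first line, then one URL per line.
    $body = count($chunk) . "\n" . implode("\n", $chunk);

    $ch = curl_init(
        'https://sb-ssl.google.com/safebrowsing/api/lookup'
        . '?client=myapp&key=' . $apiKey . '&appver=1.0&pver=3.1'
    );
    curl_setopt_array($ch, array(
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $body,
        CURLOPT_RETURNTRANSFER => true,
    ));
    $verdicts = curl_exec($ch); // see the guide for how verdicts map back to $chunk
    curl_close($ch);
}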

Server-side or client-side for fetching tweets?

I run this website for my dad which pulls tweets from his Twitter feed and displays them in an alternative format. Currently the tweets are pulled using JavaScript, so entirely client-side. Is this the most efficient way of doing things? The website has next to no hit rate, but I'm just interested in what would be the best way to scale it. Any advice would be great. I'm also thinking of including articles in the stream at some point. What would be the best way to implement that?
Twitter API requests are rate limited to 150 an hour. If your page is requested more than that, you will get an error from the Twitter API (an HTTP 400 error). Therefore, it is probably a better idea to request the tweets on the server and cache the response for a certain period of time. You could request the latest tweets up to 150 times an hour, and any time your page is requested it receives the cached tweets from your server side script, rather than calling the API directly.
From the Twitter docs:
Unauthenticated calls are permitted 150 requests per hour. Unauthenticated calls are measured against the public facing IP of the server or device making the request.
I recently did some work integrating with the Twitter API in exactly the same way you have. We ended up hitting the rate limit very quickly, even just while testing the app. That app does now cache tweets at the server, and updates the cache a few times every hour.
I would recommend using client-side calls to the Twitter API and avoiding calls to your server. The only downside to using client-side JS is that you cannot control whether or not the viewer has JavaScript disabled.
What kind of article did you want to include in the stream? Like blog posts directly on your website or external articles?
By pulling the tweets server-side, you're routing all tweet traffic through your server. All of that traffic then comes from your server, potentially degrading the performance of your website.
If you don't do anything with those tweets that isn't possible client-side, I would stick with your current solution. There's nothing wrong with it, and it scales tremendously (assuming you don't outperform Twitter's servers, of course ;)).
Pulling your tweets from the client side is definitely better in terms of scalability. I don't understand what you are looking for in your second question about adding articles.
I think if you can do them client-side, go for it! It pushes the bandwidth usage to the browser and means less load on your server. I think it is scalable too: as long as clients can make a web request, they can display your site. It doesn't get any easier than that, and your server will never be a bottleneck for them.
If you can get the articles through an API, I would stick with the current setup and keep everything client-side.
For really low-demand stuff like that, it's really not going to matter a whole lot. If you have a large number of tasks per user, then you might want to consider server-side. If you have a large number of users and only a few tasks (tweets to be pulled in or whatever) per user, client-side AJAX is probably the way to go. As for including articles, I'd probably go server-side there because of the size of the data you'll be working with.

cURL call to Twitter API meeting "Rate Limit" without making more than 5 requests

I'm just starting to mess around with a very, very basic call to the Twitter API (http://api.twitter.com/1/statuses/user_timeline.json) to pull my tweets to my website through cURL. However, using a page that nobody knows exists yet (thus eliminating the possibility of inadvertent traffic), I'm getting a Rate Limit Exceeded error before I've had the chance to even test it. It says it resets at 5 past the hour, so I check again, and for a minute it works, but then it's back to telling me my rate limit is exceeded. A few questions for anyone who knows about the Twitter API and/or cURL:
First, is the rate limit applied to my server (instead of the user)? I would assume so, but that could make it tough, of course. Even one API call per visitor could, on a site with marginal traffic, easily surpass the rate limit in an hour. Is there a way to associate the call with the visitor, not the server? Seems like probably not, but I'm not entirely sure how the whole API works, and cURL does seem to be advocated in a number of places. I'm aware that if I use JSON and AJAX to pull the data in, I can make that request from the user, but just for the sake of argument, what about cURL?
Second, any idea how I could be surpassing my rate limit without even refreshing the page? I pay for hosting at another location, so I might be sharing server space with another site, but my site definitely has a unique IP, so that should … that should be OK, right? So how is it that I'm surpassing the rate limit without even running the code (or by running it once)?
Here's what I've got for code, if it helps:
// Request the user's timeline from the (unauthenticated) v1 endpoint.
$ch = curl_init("http://api.twitter.com/1/statuses/user_timeline.json?screen_name=screenname");
curl_setopt_array($ch, array(
    CURLOPT_RETURNTRANSFER => true, // return the response as a string instead of printing it
    CURLOPT_TIMEOUT        => 5,    // give up after 5 seconds
));
$temp = curl_exec($ch);
curl_close($ch);
// Decode the JSON response into an associative array of tweets.
$results = json_decode($temp, true);
Also, I've now got it so that if Twitter returns a Rate Limit error, it records the error in a text file, along with the time that the limit will reset. Looking at that file, the only time it updates (I don't have it overwrite; it just appends) is when I've loaded the page (which is maybe once or twice in an hour), so it's not like something else is using this page and calling this URL.
Any help?
Authenticated requests should count against the user's 350/hour limit. Non-authenticated requests get counted against your IP address's 150/hour limit.
If you're running into the limits during development, Twitter has generally been quite willing to whitelist dev server IPs.
http://dev.twitter.com/pages/rate-limiting
Some applications find that the default limit proves insufficient. Under such circumstances, we offer whitelisting. It is possible to whitelist both user accounts and IP addresses. Each whitelisted entity, whether an account or IP address, is allowed 20,000 requests per hour. If you are developing an application and would like to be considered for whitelisting you fill out the whitelisting request form. Due to the volume of requests, we cannot respond to them all. If your request has been approved, you'll receive an email.
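To confirm which quota you are actually hitting and when it resets, it helps to log the HTTP status of each call; the old unauthenticated endpoint answered rate-limited requests with HTTP 400. A small variation on the snippet above, with an assumed log-file name and error-body shape:
$ch = curl_init("http://api.twitter.com/1/statuses/user_timeline.json?screen_name=screenname");
curl_setopt_array($ch, array(
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_TIMEOUT        => 5,
));
$temp   = curl_exec($ch);
$status = curl_getinfo($ch, CURLINFO_HTTP_CODE); // 200 on success, 400 when rate limited
curl_close($ch);

if ($status !== 200) {
    $error = json_decode($temp, true);
    $line  = date('c') . " HTTP $status " . (isset($error['error']) ? $error['error'] : 'unknown error') . "\n";
    file_put_contents('twitter_errors.log', $line, FILE_APPEND); // append, don't overwrite
} else {
    $results = json_decode($temp, true);
}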
