This question may be a little vague, but I'd like to hear people's opinions on the subject, based on their experience.
I'm using a RESTful API which returns a JSON string; this could potentially receive hundreds of hits a second.
Now my question is, which is the best method to handle a high volume of requests?
I've done a benchmark test for file_get_contents and curl, based on 50 requests each, and found curl to range anywhere between 0.06s and 0.07s per request, whereas file_get_contents ranges from 0.159s to 0.18s per request.
So from a very basic test it looks like curl would be the best option, but then there are other methods, plus many variables which could affect the results, especially when you're talking about hundreds of requests hitting the server every second.
I don't need the whole functionality of curl (the error handlers are great, but I'll only ever be dealing with simple GET requests), so would it be worthwhile using something else, like fopen?
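For reference, something like the following is the kind of simple GET I'm comparing; the URL is just a placeholder, and whether the curl handle is created once and reused is exactly the sort of variable that can skew numbers like the ones above:

<?php
// Placeholder endpoint, only for illustration.
$url = 'https://api.example.com/resource';

// file_get_contents: a plain GET with a stream context so there is at least a timeout.
$context = stream_context_create(['http' => ['method' => 'GET', 'timeout' => 5]]);
$jsonA = file_get_contents($url, false, $context);

// curl: the handle can be created once and reused for many GETs,
// which avoids repeating connection setup on every request.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
$jsonB = curl_exec($ch);
curl_close($ch);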
First of all, I'm not sure about this question title, so please correct it if it's not right, thanks.
About:
I have two PHP projects: the first project (CLIENT) connects to the second (API) via curl. The API project performs some calculations on the data sent by the CLIENT.
Problem:
If the API project has downtime for any reason, or just slows down, the CLIENT must wait until the API returns results, so it slows down too. The projects are under intensive development, so the calculations will grow, and the delay with them.
Question:
How can I avoid the mentioned problem? Ideally the API should not impact the performance of the CLIENT at all. Maybe there are design patterns or something for this?
I have read about async PHP and caching patterns but still haven't found a solution. If there are any solutions (patterns), it would be great to have examples in practice!
P.S. The request itself isn't slow; the calculations are. And I agree that they should be optimized first of all.
P.P.S. Total requests are more than 60 per minute (> ~60/min).
There are two approaches; both work but have different pros and cons...
Asynchronous processing, meaning that the client does not wait for each single call to return (for its response to come back), but moves on and relies on a mechanism like a callback to handle the response once it comes in. This is, for example, what is typically done in web clients using JavaScript and Ajax for remote calls. This makes the client considerably more fluent, but obviously involves higher complexity in code and UI.
Queue-based processing, meaning that the client does not make any such potentially blocking requests directly at all, but instead only creates jobs inside some queuing mechanism. Those jobs can then be handled one by one by a scheduler, which must also take care of handling the response. This is extremely powerful when it comes to scaling and robustness against load peaks and outages of the API, but the implementation is much more expensive. Also, the overall task must accept that response times are not guaranteed at all; typically the responses will take longer than in the first approach, so they cannot be shown interactively.
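As a rough sketch of the second approach, assuming the phpredis extension and a Redis list as the queue (the queue name, the result keys and the call_api() helper below are placeholders of mine, not anything prescribed):

<?php
// Hypothetical helper wrapping the curl call to the API project.
function call_api(array $data)
{
    // ... perform the curl request to the API here ...
    return ['status' => 'done', 'input' => $data];
}

$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

// CLIENT side: instead of calling the API synchronously,
// push a job onto the queue and return to the user immediately.
$redis->lPush('api_jobs', json_encode(['user_id' => 42, 'payload' => '...']));

// Worker side (a separate long-running process): pop jobs one by one,
// call the slow API, and store the result where the CLIENT can pick it up later.
while (true) {
    $job = $redis->brPop(['api_jobs'], 5);   // block for up to 5 seconds
    if (!$job) {
        continue;                            // timed out, loop again
    }
    $data   = json_decode($job[1], true);
    $result = call_api($data);
    $redis->set('api_result:' . $data['user_id'], json_encode($result));
}

The CLIENT then polls for (or is notified about) the stored result instead of blocking on the calculation itself.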
We have a not so RESTful API used by our single page app and mobile apps.
It is not so RESTful since, in the past, URIs have returned whatever was useful for a particular page. This has led to a large number of endpoints, many of them with very similar responses.
Data-wise we have resources with tens of related resources. In some cases we want those related resources returned, or some of them, and in other cases we do not. Some are slow to return, and therefore we only want them in specific cases.
The problem we've had has been with splitting the data up into meaningful URIs without needing another request to get each related resource.
We therefore considered a /batch endpoint, where a POST request containing a number of requests in the body could execute those requests in parallel on the server, like this: https://developers.facebook.com/docs/graph-api/making-multiple-requests
That way we could split the data into meaningful URIs and not have to make 20 API requests for each page.
Is this an acceptable way of handling related resources? Or would it be better to have a URI for each response that we may want?
HTTP/2 will make this problem go away by allowing you to multiplex requests on a single connection.
In the meantime, I would suggest that you are not violating any REST constraints with your current solution. However, creating a batch of requests breaks the resource identification constraint, which will have a major impact on the cacheability of representations.
Batching is fine -- don't let RESTful design patterns steer you down a path to bad performance.
It is likely that your batching solution can be designed such that each piece that can be batched can be called separately. If so, it might be straightforward to create RESTful endpoints for each as well, for those who would want that at the expense of multiple round-trips.
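As a rough sketch of what such a /batch endpoint could look like in PHP (the request format loosely follows the Facebook example linked in the question, and dispatch() is a hypothetical stand-in for routing a sub-request through the application internally):

<?php
// Hypothetical stand-in for the application's internal router.
function dispatch($method, $relativeUrl)
{
    // ... resolve the controller for $relativeUrl and return its data ...
    return ['method' => $method, 'url' => $relativeUrl, 'data' => null];
}

// POST /batch with a JSON body such as:
// { "requests": [ { "method": "GET", "relative_url": "/v1/users/1" },
//                 { "method": "GET", "relative_url": "/v1/users/1/addresses" } ] }
$batch = json_decode(file_get_contents('php://input'), true);

$responses = [];
foreach ($batch['requests'] ?? [] as $request) {
    // Each sub-request is handled internally, without a new HTTP round-trip.
    $responses[] = [
        'relative_url' => $request['relative_url'],
        'body'         => dispatch($request['method'], $request['relative_url']),
    ];
}

header('Content-Type: application/json');
echo json_encode(['responses' => $responses]);

Each sub-request still maps to a URI that could also be exposed as its own endpoint, which keeps the two options compatible.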
You can also use query parameters to select different, packaged returns of resources. For example, for a user, you could use something like:
GET /v1/users/1?related={none,all,basic}
You could also use a field selector:
GET /v1/users/1?data=addresses,history
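And a minimal sketch of handling that field selector, where load_addresses() and load_history() are hypothetical placeholders for the slow related-resource queries:

<?php
// Hypothetical loaders standing in for the slow related-resource queries.
function load_addresses($userId) { return []; }
function load_history($userId)   { return []; }

// GET /v1/users/1?data=addresses,history
$requested = array_filter(explode(',', $_GET['data'] ?? ''));

$user = ['id' => 1, 'name' => 'Example User'];   // placeholder user record

// Only pay the cost of a related resource when it was explicitly requested.
if (in_array('addresses', $requested, true)) {
    $user['addresses'] = load_addresses($user['id']);
}
if (in_array('history', $requested, true)) {
    $user['history'] = load_history($user['id']);
}

header('Content-Type: application/json');
echo json_encode($user);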
I'm working on a few projects using Node, and every line of code I write brings up lots of ideas about how someone could destroy my Node process.
Right now I'm thinking of this:
require('http').createServer(function(req, res) {
    //DEAL WITH REQUEST HERE
}).listen(port, net);
That's standard code for setting up a server and dealing with requests.
Let's say that I want to bring down that Node process: I could send POST requests with loads of data in them, and Node.js would spend lots of time (and bandwidth) receiving all of them.
Is there a way to avoid this?
PHP Pros: How do you normally deal with this?
Is there a way to tell Node or PHP (maybe Apache) to just ignore requests from certain IPs?
You can limit the http request size.
Here is the middleware you can use.
https://github.com/senchalabs/connect/blob/master/lib/middleware/limit.js
http://www.senchalabs.org/connect/middleware-limit.html
P.S. Possibly a duplicate of maximum request lengths in node.js.
For getting the IP address in node.js, you can try request.connection.remoteAddress.
Hi all, I have a site developed in CodeIgniter.
In this site I have to make requests to 10 servers, with code like this in JavaScript, for example:
var XML_req = new ActiveXObject("Microsoft.XMLHTTP");
XML_req.open("POST", link_server, false);
XML_req.send(unescape(XMLdata));
I make a request and the result is returned to me in XML_req.responseText.
The problem is that:
The XML response can have thousands of nodes, and I have to make the same request 10 times, one for each server. So I have 10 requests (the same for every server) and 10 responses, each a single big XML.
I know that these requests are asynchronous, but in this case I don't know if there is a method to handle these requests and these big responses, because multiple users can make requests at the same time. I have a server with good specs, but I don't know whether these requests and the big responses might be very slow.
Is there a method to handle this?
I have thought about making a cron job every hour for every server and storing the results in a database, but in that case I don't have real-time data, and I would have to buy a very big database because I could have millions of records across many tables.
I don't have much experience with this kind of work with lots and lots of data.
I hope that someone can explain to me the right, best, and fastest way to handle these requests and responses.
Honestly, if you are returning millions of rows in a single request, something is wrong. Your queries should be small and should return only the bits of information people are looking for, not the full set:
Dealing with Large Data in Ajax
There is always the option to use local storage in the browser:
jQuery Ajax LocalStorage
Here is another answer that may provide some insight, too:
LocalStorage, several Ajax requests or huge Ajax request?
Good luck!
I have done a fair amount of reading on this and I am not quite sure what the correct way to go about this is.
I am accessing a website's API that provides information I am using on my site. On average I will be making over 400 different API requests, which means over 400 curl requests. What is the proper way to make my code pause for an amount of time and then continue? The site does not limit the number of hits, so I will not get banned for just pulling all of the stuff at once, but I would not want to be that server when 10,000 people like me do the same thing. What I am trying to do is pause my code and politely use the service they offer.
What is the best method to pause php execution with resource consumption in mind?
What is the most courteous amount of requests per wait cycle?
What is the most courteous amount of wait per cycle?
With all of these questions in mind, I would also like to obtain the information as fast as possible while staying within the bounds of the above.
sample eve central API response
Thank you in advance for your time and patience.
Here's a thought: have you asked? If an API has trouble handling a high load, the provider usually includes a limit in their terms. If not, I'd recommend emailing the service provider, explaining what you want to do, and asking what they think would be a reasonable load. Though it's quite possible that their servers are quite capable of handling any load you might reasonably want to give them, which is why they don't specify.
If you want to do right by the service provider, don't just guess what they want. Ask, and then you'll know exactly how far you can go without upsetting the people who built the API.
For the actual mechanics of pausing, I'd use the method alex suggested (but has since deleted): PHP's usleep.
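A minimal sketch of that, where the batch size and the pause are placeholders to be replaced by whatever the provider actually suggests:

<?php
// Hypothetical list of the ~400 API URLs to fetch.
$urls = [/* ... */];

$batchSize   = 10;       // requests per batch -- placeholder, confirm with the provider
$pauseMicros = 500000;   // 0.5 second pause between batches -- placeholder as well

$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);

foreach (array_chunk($urls, $batchSize) as $batch) {
    foreach ($batch as $url) {
        curl_setopt($ch, CURLOPT_URL, $url);
        $response = curl_exec($ch);
        // ... store or process $response here ...
    }
    usleep($pauseMicros);   // yield between batches instead of hammering the API
}
curl_close($ch);

usleep suspends the process without burning CPU while it waits, which also addresses the resource-consumption part of the question.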