Request data design pattern - php

First of all, I'm not sure about this question's title, so please correct me if it's wrong, thanks.
About:
I have two PHP projects: the first (CLIENT) connects to the second (API) via cURL. The API project performs calculations on the data sent by the CLIENT.
Problem:
If the API project has downtime for any reason, or simply slows down, the CLIENT must wait until the API returns its results, so it slows down too. Both projects are under intensive development, so the calculations will grow and so will the delay.
Question:
How can I avoid this problem? Ideally the API should not impact the CLIENT's performance at all. Is there a design pattern or something similar for this?
I have read about async PHP and caching patterns but still haven't found a solution. If there are any solutions (patterns), it would be great to see examples in practice!
P.S. The request itself isn't slow; the calculations are. And I agree that they should be optimized first of all.
P.P.S. Total requests are more than 60 per minute (> ~60/min).

There are two approaches; both work but have different pros and cons...
Asynchronous processing, meaning that the client does not wait for each single call until its response returns, but moves on and relies on a mechanism like a callback to handle the response once it comes in. This is, for example, what is typically done in web clients using JavaScript and Ajax for remote calls. It makes the client considerably more responsive, but obviously involves more complex code and UI.
Queue-based processing, meaning that the client does not make any such potentially blocking requests directly at all, but only creates jobs inside some queuing mechanism. Those jobs can then be handled one by one by a scheduler, which must also take care of handling the response. This is extremely powerful when it comes to scaling and robustness against load peaks and outages of the API, but the implementation is much more expensive. Also, the overall task must accept that response times are not guaranteed at all; typically the responses will take longer than in the first approach, so they cannot be shown interactively.
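A minimal sketch of the queue-based approach, assuming a simple jobs table in the CLIENT's own database (the table, column and endpoint names are invented for this example): the CLIENT only inserts a job and returns immediately, and a separate worker script makes the blocking cURL call to the API and stores the result.

<?php
// --- in the CLIENT, instead of calling the API with cURL directly: ---
function enqueueCalculation(PDO $pdo, array $payload): int {
    $stmt = $pdo->prepare(
        "INSERT INTO api_jobs (payload, status, created_at)
         VALUES (:payload, 'pending', NOW())"
    );
    $stmt->execute([':payload' => json_encode($payload)]);
    return (int)$pdo->lastInsertId(); // the result can be picked up later by this id
}

// --- in a separate worker script, run from cron or as a loop: ---
function processPendingJobs(PDO $pdo): void {
    $jobs = $pdo->query(
        "SELECT id, payload FROM api_jobs WHERE status = 'pending' ORDER BY id LIMIT 10"
    )->fetchAll(PDO::FETCH_ASSOC);

    foreach ($jobs as $job) {
        $ch = curl_init('https://api.example.com/calculate'); // hypothetical API endpoint
        curl_setopt_array($ch, [
            CURLOPT_POST           => true,
            CURLOPT_POSTFIELDS     => $job['payload'],
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_TIMEOUT        => 30,
        ]);
        $result = curl_exec($ch);
        curl_close($ch);

        $update = $pdo->prepare(
            "UPDATE api_jobs SET status = :status, result = :result WHERE id = :id"
        );
        $update->execute([
            ':status' => $result === false ? 'failed' : 'done',
            ':result' => $result === false ? null : $result,
            ':id'     => $job['id'],
        ]);
    }
}

Because the job row is persisted before anything else happens, an API outage only delays the results instead of stalling the CLIENT.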

Related

RESTful API in PHP - optimising successive requests?

I'm working (for the first time) on developing a PHP application driven by a PHP RESTful API (probably using peej/Tonic). Coming from an application with direct access which might make 20 different database calls during the course of a page load, I am trying to reconcile the fact that 20 API calls = 20x handshakes (which can be improved by Guzzle persistent connections) but also 20x connections to the database.
I believe that with better programming and planning, I can get my required API calls down to 4-5 per page. At this point:
a) Is it not worth considering the latency of 5x database connections + 5x handshakes per page load on account of all the other available optimisations?
b) Is there an existing method by which this can be mitigated that I've thus far failed to find?
c) I believe it violates the principles of RESTful programming, but if I had a single API method which itself gathered information from other API endpoints (for instance, GET suppliers WHERE x=y, then GET products for each supplier), is there a documented method for internal API interaction (particularly within peej/Tonic or other frameworks)?
Thank you all for your wisdom in advance.
Remember that the client should be "making 'a request' of the server," which is obliged to fulfill "that request." The server might "execute 20 different database queries" to prepare its response, and the client need not know nor care.
The client's point-of-view becomes, "I care what you tell me, not how you did it."
If you did wish to send query-responses directly to the client, for the client to "do the dirty-work" with these data, then you could still design your server request so that the server did many queries, all at once, and sent all of the result-sets back ... in just one exchange.
Your first priority should be to effectively minimize the number of exchanges that take place. The amount of data returned (within reason) is secondary.
Also consider that, "when the server does it, the work is naturally synchronized." When the client issues multiple asynchronous requests, those requests are, well, "asynchronous." Consider which strategy will be easier for you to debug.
If the server is given "a request to do," it can validate the request (thus checking for client bugs), and perform any number of database operations, perhaps in a TRANSACTION. This strategy, which puts the server into a very active role, is often much less complex than an interaction that is driven by the client with the server taking a passive role.
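As a rough illustration of that last point, here is a sketch of a single endpoint that validates the request, runs several queries inside one transaction, and returns everything in one exchange (the endpoint path, table and column names are invented for this example):

<?php
// GET /suppliers-with-products?region=eu  -- hypothetical combined endpoint
$pdo = new PDO('mysql:host=localhost;dbname=shop', 'user', 'pass', [
    PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
]);

$region = $_GET['region'] ?? '';
if ($region === '') {                // validate before touching the database
    http_response_code(400);
    exit(json_encode(['error' => 'region is required']));
}

$pdo->beginTransaction();            // both reads see a consistent snapshot

$stmt = $pdo->prepare('SELECT id, name FROM suppliers WHERE region = ?');
$stmt->execute([$region]);
$suppliers = $stmt->fetchAll(PDO::FETCH_ASSOC);

$products = [];
if ($suppliers) {
    $ids = array_column($suppliers, 'id');
    $in  = implode(',', array_fill(0, count($ids), '?'));
    $stmt = $pdo->prepare("SELECT supplier_id, id, name FROM products WHERE supplier_id IN ($in)");
    $stmt->execute($ids);
    $products = $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$pdo->commit();

// one response, both result sets -- the client makes a single call
header('Content-Type: application/json');
echo json_encode(['suppliers' => $suppliers, 'products' => $products]);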

How to throttle API requests using PHP

We plan to use the SEMrush API, which allows access to SEO data relating to domain names and search keywords. Under their Terms of Use, they limit their usage to avoid killing their servers:
You may not perform more than 10 requests per second, nor more than 2 simultaneous requests.
We are going to be building a simple tool in PHP that aggregates data based on a domain name and are looking for the basics on how to fulfill that requirement. We are planning for hundreds/thousands of potential simultaneous users.
Maybe someone can provide some pseudo code in PHP that would let us do this - or is it really just as simple as forcing the actual API request function to sleep for 1 second in between each command? I don't have a lot of experience with APIs and large amounts of concurrent users so any help is appreciated.
PHP is really not the best language to use for concurrent programming. However, there are some third-party solutions that you can use alongside PHP to help you achieve your goals.
What you need is a job manager or a queue system that can handle the actual requests for you. Since this is a back-end tool (at least that's what I gathered from your question), PHP doesn't have to control the jobs themselves; instead, some controlling process schedules the individual jobs and hands them to your PHP scripts so that you can effectively impose these limits.
My first suggestion would be to try something like Gearman, which is a great job manager and has a PHP extension to help you interface with the library.
Another suggestion is to take a look at queue systems like AMQP or ZeroMQ, some of which also have PHP extensions.
So here's an example scenario for you...
You have a PHP script that accepts these requests and hands them off to your job manager or queue over a socket. The job manager or queue will store the request and distribute it to the individual workers in a way that can be centralized and controlled to impose these limits. The links above include examples that can help you get there. However, doing it purely in PHP without the aid of these tools will prove quite tricky and could wind up in some very edge-case buggy behavior if not carefully crafted and considered.
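A rough sketch of that scenario with the Gearman PHP extension, assuming a Gearman job server is running on localhost:4730 (the function name 'semrush_request' and the payload format are made up for this example):

<?php
// producer.php -- the web-facing script only hands the work off
$client = new GearmanClient();
$client->addServer('127.0.0.1', 4730);
// fire-and-forget: the worker pool enforces the rate limit, not the web request
$client->doBackground('semrush_request', json_encode(['domain' => 'example.com']));

// worker.php -- run a small, fixed number of these processes (e.g. two)
$worker = new GearmanWorker();
$worker->addServer('127.0.0.1', 4730);
$worker->addFunction('semrush_request', function (GearmanJob $job) {
    $params = json_decode($job->workload(), true);

    // ... perform the actual API call here with cURL ...

    usleep(200000); // at least 0.2 s per job per worker
});
while ($worker->work()) {
    // process one job at a time, forever
}

Running a fixed pool of two workers satisfies the "no more than 2 simultaneous requests" rule, and the per-worker sleep keeps the combined rate at or below 10 requests per second.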
Some APIs return rate limit information in the response header.
Check out:
Examples of HTTP API Rate Limiting HTTP Response headers
This information will help you decide how long to wait before continuing with your next request, for example using PHP's time_nanosleep().
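As a sketch of that idea (header names vary between APIs, so the X-RateLimit-Remaining and Retry-After headers below are assumptions, not something every service sends):

<?php
// Collect response headers while making the request
$headers = [];
$ch = curl_init('https://api.example.com/resource'); // hypothetical endpoint
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADERFUNCTION, function ($ch, $line) use (&$headers) {
    if (strpos($line, ':') !== false) {
        [$name, $value] = explode(':', $line, 2);
        $headers[strtolower(trim($name))] = trim($value);
    }
    return strlen($line); // cURL requires the number of bytes handled
});
$body = curl_exec($ch);
curl_close($ch);

// Back off if the API reports that the quota is exhausted
if (isset($headers['x-ratelimit-remaining']) && (int)$headers['x-ratelimit-remaining'] === 0) {
    $wait = isset($headers['retry-after']) ? (int)$headers['retry-after'] : 1;
    time_nanosleep($wait, 0); // whole seconds, zero extra nanoseconds
}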
Some PHP libraries go pretty in-depth with their ways of rate-limiting.
The token bucket algorithm is pretty common across the web:
https://github.com/bandwidth-throttle/token-bucket
Now I find this a bit overkill when it comes down to throttling some URL requests that don't have something like X-RateLimit-Remaining in their return header. API requests in general are usually pretty slow. So I've built the PHP script below.
This PHP script will just wait for a few milliseconds, based on a $throttlerID. Higher requestsInSeconds values will result in shorter wait times... If the same $throttlerID is used across simultaneous requests, each request will wait for the others using file locking (flock()).
function Throttler($requestsInSeconds, $throttlerID) {
    // Use flock() to create a system-wide lock (it's crash-safe :))
    $fp = fopen(sys_get_temp_dir() . "/$throttlerID", "w+");
    // An exclusive lock will block and wait until it is obtained
    if (flock($fp, LOCK_EX)) {
        // Sleep for a while ($requestsInSeconds should be 1 or higher)
        $time_to_sleep = (int)(999999999 / $requestsInSeconds); // nanoseconds
        time_nanosleep(0, $time_to_sleep);
        flock($fp, LOCK_UN); // unlock
    }
    fclose($fp);
}
Put the call to Throttler() right before each CURL call. That's it!
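For example (the URL and the limit of 10 requests per second are just placeholders):

Throttler(10, 'semrush_api'); // at most ~10 calls per second, shared via the lock file

$ch = curl_init('https://api.example.com/endpoint'); // hypothetical API endpoint
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
curl_close($ch);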

Inter process pushing of captured events

I have a PHP-based web application that captures certain events in a database table. It also features a visualization of those captured events: an HTML table listing the events, which is controlled by Ajax.
I would like to add an optional 'live' feature: after pressing a button ('switch on'), all events captured from that moment on will be inserted into the already visible table. Three things have to happen: noticing the event, fetching the event's data and inserting it into the table. To keep the server load within sane limits I do not want to poll for new events with Ajax requests; instead I would prefer the long polling strategy.
The problem with this is obviously that when doing a long polling Ajax call, the server's counterpart has to monitor for an event. Since the events are registered by PHP scripts, there is no easy way to notice that event without polling the database for changes again, because the capturing action runs in another process than the observing long polling request. I looked around to find a usable mechanism for such inter-process communication as I know it from rich clients under Linux. Indeed there are PHP extensions for semaphores, shared memory or even POSIX, but they only exist on Linux (or Unix-like) systems. Though not typical, the application might be used on MS Windows systems in rare cases.
So my simple question is: is there any means, typically available on all (or most) systems, that can push such events to a PHP script servicing the long polling Ajax request? Something that doesn't poll a file or a database constantly, since I already have the event elsewhere?
So, the initial caveat: without doing something "special", trying to do long polling with vanilla PHP will eat up resources until you kill your server.
Here is a good basic guide to basic PHP based long polling and some of the challenges associated with going the "simple" road:
How do I implement basic "Long Polling"?
As far as doing this really cross-platform (and simple enough to start), you may need to fall back to some sort of simple internal polling - but the goal should be to ensure that this action is much lower-cost than having the client poll.
One route would be to essentially treat it like you're caching database calls (which you are at this point), and go with some standard caching approaches. Everything from APC, to memcached, to polling a file, will all likely put less load on the server than having the client set up and tear down a connection every second. Have one process place data in the correct keys, and then poll them in your script on a regular basis.
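A minimal sketch of that idea using APCu (assuming the APCu extension is available on the target systems, which covers both Linux and Windows; the cache key and script names are invented):

<?php
// capture.php -- whichever script records an event also publishes its id to APCu
$insertedId = 42; // in reality, the id of the row you just inserted
apcu_store('events_last_id', $insertedId);

// longpoll.php -- services the long-polling Ajax request
$lastSeen = (int)($_GET['last_id'] ?? 0);
$timeout  = 30;   // give up after 30 seconds; the client simply reconnects
$start    = time();

while (time() - $start < $timeout) {
    $current = (int)apcu_fetch('events_last_id');
    if ($current > $lastSeen) {
        // only now hit the database, and only for the new rows
        header('Content-Type: application/json');
        exit(json_encode(['last_id' => $current /*, 'events' => $newRows */]));
    }
    usleep(250000); // poll the cache four times a second, not the database
}

header('Content-Type: application/json');
echo json_encode(['last_id' => $lastSeen, 'events' => []]);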
Here is a pretty good overview of a variety of caching options that might be cross-platform enough for you:
http://simas.posterous.com/php-data-caching-techniques
Once you reach the limits of this approach, you'll probably have to move onto a different server architecture anyhow.

Guidance on the number of http calls; is too much AJAX a problem?

I'm developing a website where every front-end feature is written in JavaScript and communication with the server is done through JSON. So I'm hesitating: is it OK to ask for every single piece of data with its own HTTP request, or is that completely unacceptable? (After all, many web developers combine multiple image requests into CSS sprites.)
Can you give me a hint please?
Thanks
It really depends upon the overall server load and bandwidth use.
If your site is very low traffic and is under no CPU or bandwidth burden, write your application in whatever manner is (a) most maintainable (b) lowest chance to introduce bugs.
Of course, if the latency involved in making thirty HTTP requests for data is too awful, your users will hate you :) even if your server is very lightly loaded. Thirty times even 30 milliseconds equals an unhappy experience. So it depends very much on how much data each client needs to render each page or action.
If your application starts to suffer from too many HTTP connections, then you should look at bundling together the data that is always used together -- it wouldn't make sense to send your entire database to every client on every connection :) -- so go for the 'lowest hanging fruit' first and combine that data to reduce extra connections.
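For instance, a page that would otherwise fire several Ajax calls can fetch all of its data from one combined endpoint (the endpoint, tables and hard-coded user id below are invented for the example):

<?php
// GET /page-data.php -- one request instead of several small ones
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

$response = [
    // each entry would otherwise have been its own Ajax call
    'user'          => $pdo->query('SELECT id, name FROM users WHERE id = 1')
                           ->fetch(PDO::FETCH_ASSOC),
    'notifications' => $pdo->query('SELECT id, text FROM notifications WHERE user_id = 1 LIMIT 10')
                           ->fetchAll(PDO::FETCH_ASSOC),
    'settings'      => $pdo->query('SELECT `key`, `value` FROM settings WHERE user_id = 1')
                           ->fetchAll(PDO::FETCH_KEY_PAIR),
];

header('Content-Type: application/json');
echo json_encode($response);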
If you can request multiple related things at once, do it.
But there's no real reason against sending multiple HTTP requests - that's how AJAX apps usually work. ;)
The reason for using sprites instead of single small images is to reduce loading times, since only one file has to be loaded instead of tons of small files, and the image is already available at the moment it needs to be displayed.
My personal philosophy is:
The initial page load should be AJAX-free.
The page should operate without JavaScript well enough for the user to do all basic tasks.
With JavaScript, use AJAX in response to user actions and to replace full page reloads with targeted AJAX calls. After that, use as many AJAX calls as seem reasonable.

System with two asynchronous processes

I'm planning to write a system which should accept input from users (from browser), make some calculations and show updated data to all users, currently visiting certain website.
Input can come once an hour, but it can also come 100 times each second. It is VERY important not to lose any of the user inputs, but to really register and process ALL of them.
So the idea was to create two programs. One will receive data (input) from the browser and store it somehow in a queue (maybe an array, to be really fast?). The second program should wait until there are new items in the queue (saving resources) and then become active and begin processing the queue items. Both programs should run asynchronously.
I know PHP, so I would write the first program in PHP. But I'm not sure about the second part: how do I send an event from the first program to the second? I need some advice at this point. Are threads not possible with PHP? I need some ideas on how to create a system like I described.
I would use a Comet server to communicate feedback to the website the input came from (this part is already tested).
As per the comments above, on the surface you appear to be describing a message queueing/processing system; however, looking at your question in more depth, this is probably not the case:
Both programs should run asynchronously.
Having a program which processes a request from a browser but does it asynchronously is an oxymoron. While you could handle the enqueueing of a message after dealing with the HTTP request, it's still a synchronous process.
It is VERY important not to loose any of user inputs
PHP is not a good language for writing control systems for nuclear reactors (nor, according to Microsoft, is Java). HTTP and TCP/IP are not ideal for real time systems either.
100 times each second
Sorry - I thought you meant there could be a lot of concurrent requests. This is not a huge amount.
You seem to be confusing the objective of using Comet/Ajax with asynchronous processing of the application. Even with very large amounts of data, it should be possible to handle the interaction using a single PHP script working synchronously.
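If you do want to make sure no input is lost while keeping the browser response fast, one sketch (assuming PHP-FPM and an invented 'inputs' table) is to store the input synchronously and finish the heavier work after the response has been flushed:

<?php
// receive.php -- capture the input synchronously so nothing is lost,
// then do the heavier processing after the response has gone out.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

$stmt = $pdo->prepare('INSERT INTO inputs (payload, created_at) VALUES (:p, NOW())');
$stmt->execute([':p' => json_encode($_POST)]);
$id = (int)$pdo->lastInsertId();

echo json_encode(['accepted' => true, 'id' => $id]);

// Only available under PHP-FPM: flush the response to the browser now...
if (function_exists('fastcgi_finish_request')) {
    fastcgi_finish_request();
}

// ...and continue with the calculation in the same script, after the
// client has already received its answer.
$result = strlen($_POST['data'] ?? ''); // stand-in for the real calculation
$pdo->prepare('UPDATE inputs SET result = :r WHERE id = :id')
    ->execute([':r' => $result, ':id' => $id]);

fastcgi_finish_request() only exists under PHP-FPM; without it the calculation simply runs before the response is sent, which matches the synchronous handling suggested above.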
