Google Translate API POST 11-second delay? - PHP

EDIT: 11.5 seconds for 28 messages
Single requests work fine. The code below takes 11 seconds, measured with Postman hitting the API route that runs it.
Am I doing something wrong? I feel as though it shouldn't take 11 seconds even without caching.
$xs = ChatMessage::where('chat_room_id', '=', $roomId)
    ->with('user')
    ->orderBy('created_at', 'DESC')
    ->get();

foreach ($xs as $r) {
    $translate = new TranslateClient([
        'key' => 'xxxxxxxxxxxxxxxxxxxxxxx'
    ]);
    $result = $translate->translate($r->message_english, [
        'target' => 'es',
        'source' => 'en',
    ]);
    $r->message = $result['text'];
}
return $xs;

You can easily speed this up by moving the client out of the foreach loop. You are creating a new client on every iteration, which is not optimal; the client can be reused across translate calls, and that alone should speed up your translation process. You can find samples of this usage in the official GitHub client project.
Here is a simplified sample:
$client = new TranslateClient(['key' => 'xxxxxxxxxxxxxxxxxxxxxxx']);
foreach ($messages as $message) {
    $result = $client->translate($message, ['source' => 'en', 'target' => 'es']);
    print_r($result['text']);
}
Also, how long are your messages? You should pass as much text as possible in a single call (as far as the library allows), so you also reduce the number of calls made to the API.
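For example, assuming the V2 TranslateClient from google/cloud-translate (the same client you already construct), translateBatch() accepts an array of strings and translates them in one API call. A minimal sketch:
$translate = new TranslateClient([
    'key' => 'xxxxxxxxxxxxxxxxxxxxxxx'
]);

// Collect all messages and translate them in a single request.
$texts   = $xs->pluck('message_english')->all();
$results = $translate->translateBatch($texts, [
    'source' => 'en',
    'target' => 'es',
]);

// Each result contains the translated 'text' (and the original 'input').
// Mapping back by index assumes the results keep the input order;
// if in doubt, match on the 'input' key instead.
foreach ($results as $i => $result) {
    $xs[$i]->message = $result['text'];
}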
If you still have issues, you can try making multiple requests in parallel, as mentioned in the comments.
Some useful links about this:
PHP Documentation Overview
Translate Client


Rate-limiting Guzzle Requests in Symfony

This actually follows on from a previous question I had that, unfortunately, did not receive any answers so I'm not exactly holding my breath for a response but I understand this can be a bit of a tricky issue to solve.
I am currently trying to implement rate limiting on outgoing requests to an external API to match the limit on their end. I have tried to implement a token bucket library (https://github.com/bandwidth-throttle/token-bucket) into the class we are using to manage Guzzle requests for this particular API.
Initially, this seemed to be working as intended but we have now started seeing 429 responses from the API as it no longer seems to be correctly rate limiting the requests.
I have a feeling what is happening is that the number of tokens in the bucket is now being reset every time the API is called due to how Symfony handles services.
I am currently setting up the bucket location, rate and starting amount in the service's constructor:
public function __construct()
{
    $storage = new FileStorage(__DIR__ . "/api.bucket");
    $rate = new Rate(50, Rate::MINUTE);
    $bucket = new TokenBucket(50, $rate, $storage);
    $this->consumer = new BlockingConsumer($bucket);
    $bucket->bootstrap(50);
}
I'm then attempting to consume a token before each request:
public function fetch(): array
{
    try {
        $this->consumer->consume(1);
        $response = $this->client->request(
            'GET', $this->buildQuery(), [
                'query' => array_merge($this->params, ['api_key' => $this->apiKey]),
                'headers' => ['Content-type' => 'application/json']
            ]
        );
    } catch (ServerException $e) {
        // Process Server Exception
    } catch (ClientException $e) {
        // Process Client Exception
    }

    return $this->checkResponse($response);
}
I can't see anything obvious there that would allow it to request more than 50 times per minute, unless the number of available tokens is being reset on each request.
This is being supplied to a set of repository services that handle converting the data from each endpoint into objects used within the system. Consumers use the appropriate repository to request the data needed to complete their process.
If the number of tokens is being reset because the bootstrap call sits in the service constructor, where should it be moved within the Symfony framework so that it still works for the consumers?
I assume it should work, but maybe try moving the ->bootstrap(50) call so it doesn't run on every request? I'm not sure, but it could be the reason.
Anyway, it's better to do that only once, as part of your deployment (every time you deploy a new version). It doesn't really have anything to do with Symfony, because the framework doesn't put any restrictions on the deployment procedure, so it depends on how you do your deployments.
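For example, a minimal sketch of a one-off Symfony console command that bootstraps the bucket at deploy time (the command name and class are made up for illustration; the use statements assume the namespaces of the bandwidth-throttle/token-bucket package), so the service constructor only opens the existing storage:
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;
use bandwidthThrottle\tokenBucket\storage\FileStorage;
use bandwidthThrottle\tokenBucket\Rate;
use bandwidthThrottle\tokenBucket\TokenBucket;

class BootstrapBucketCommand extends Command
{
    protected function configure()
    {
        $this->setName('bucket:bootstrap');
    }

    protected function execute(InputInterface $input, OutputInterface $output)
    {
        $storage = new FileStorage(__DIR__ . '/api.bucket');
        $rate    = new Rate(50, Rate::MINUTE);
        $bucket  = new TokenBucket(50, $rate, $storage);
        $bucket->bootstrap(50); // set the initial token amount once, at deploy time
        $output->writeln('Token bucket bootstrapped.');

        return 0;
    }
}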
P.S. Have you considered just handling 429 errors from the server? IMO you can simply wait (that's what BlockingConsumer does internally) when you receive a 429 error. It's simpler and doesn't require an additional layer in your system.
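A rough sketch of that approach with Guzzle (the Retry-After header and the single sleep-and-retry are assumptions; adjust to whatever the API actually sends):
use GuzzleHttp\Exception\ClientException;

// $options: the same 'query'/'headers' array used in fetch() above
try {
    $response = $this->client->request('GET', $this->buildQuery(), $options);
} catch (ClientException $e) {
    $res = $e->getResponse();
    if ($res !== null && $res->getStatusCode() === 429) {
        // Wait as long as the server asks (fall back to 60s), then retry once.
        $wait = ((int) $res->getHeaderLine('Retry-After')) ?: 60;
        sleep($wait);
        $response = $this->client->request('GET', $this->buildQuery(), $options);
    } else {
        throw $e;
    }
}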
BTW, have you considered nginx's ngx_http_limit_req_module as an alternative solution? It usually comes with nginx by default, so there is nothing extra to install; only a small configuration is required.
You can place an nginx proxy between your code and the target web service and enable limits on it. Then your code handles 429s as usual, but the requests will be throttled by your local nginx proxy rather than by the external web service, so the final destination only ever receives a limited number of requests.
I found a trick using the Guzzle bundle for Symfony.
I had to improve a sequential program sending GET requests to a Google API; in the code example it is a PageSpeed URL.
To respect a rate limit, there is an option to delay the requests before they are sent asynchronously.
The PageSpeed rate limit is 200 requests per minute.
A quick calculation gives 60/200 = 0.3s per request.
Here is the code I tested on 300 URLs, with the fantastic result of no errors at all, except when the URL passed as a parameter in the GET request returns a 400 HTTP error (Bad Request).
I put a delay of 0.4s and the average time is less than 0.2s per request, whereas the sequential program took more than a minute.
use GuzzleHttp;
use GuzzleHttp\Client;
use GuzzleHttp\Promise\EachPromise;
use GuzzleHttp\Exception\ClientException;

// ... Now inside class code ... //

$client = new GuzzleHttp\Client();
$promises = [];

foreach ($requetes as $i => $google_request) {
    // delay is the trick not to exceed the rate limit (in ms)
    $promises[] = $client->requestAsync('GET', $google_request, ['delay' => 0.4 * $i * 1000]);
}

GuzzleHttp\Promise\each_limit(
    $promises,
    function () { // function returning the number of concurrent requests
        return 100; // 1 or 100 concurrent request(s) don't really change execution time
    },
    // Fulfilled function
    function ($response, $index) use ($urls, $fp) { // $urls holds the url passed as a GET parameter, $fp is a csv file pointer
        $feed = json_decode($response->getBody(), true); // Get array of results
        $this->write_to_csv($feed, $fp, $urls[$index]);   // Write to csv
    },
    // Rejected function
    function ($reason, $index) {
        if ($reason instanceof GuzzleHttp\Exception\ClientException) {
            $message = $reason->getMessage();
            var_dump(array("error" => "error", "id" => $index, "message" => $message)); // You could also write the errors to a file or database
        }
    }
)->wait();

Pulling Bright Local API Data into my Ruby on Rails App - API Docs written in PHP

I'm trying to build a Rails app that pulls data from several different SEO tool APIs. For Bright Local (see their API docs here - http://apidocs.brightlocal.com/) all the API doc examples are written in PHP, which I can't read all that well.
So first, to ask a specific question, how would I write this batch request in Ruby:
<?php
use BrightLocal\Api;
use BrightLocal\Batches\V4 as BatchApi;

$api = new Api('[INSERT_API_KEY]', '[INSERT_API_SECRET]');
$batchApi = new BatchApi($api);
$result = $batchApi->create();

if ($result['success']) {
    $batchId = $result['batch-id'];
}
Also, any suggestions for how I can bring myself up to snuff on using APIs in my Rails apps?
Our docs currently only show PHP examples, although we are planning to expand on that, and Ruby is one of the languages we'll be looking to add.
A simple command line CURL request for the above PHP code would look like this:
curl --data "api-key=<YOUR API KEY HERE>" https://tools.brightlocal.com/seo-tools/api/v4/batch
and would return a response like this:
{"success":true,"batch-id":<RETURNED BATCH ID>}
All our API endpoints respond to either POST, PUT, GET or DELETE. It's also important to note that whenever data is posted with POST or PUT it's passed like "param1=value1&param2=value2" in the body of the request rather than JSON encoded.
I don't know Ruby at all, I'm afraid, but something like this might make the request you want:
require "net/http"

params = { "api-key" => "<YOUR API KEY>" }
response = Net::HTTP.post_form(URI("https://tools.brightlocal.com/seo-tools/api/v4/batch"), params)
I'm also implementing BrightLocal in my Rails app. I'm using the HTTParty gem; this is what I have so far, and I am able to make successful calls.
This is to obtain your batch id:
api_key = YOUR_API_KEY
secret_key = YOUR_SECRET_KEY

request_url = "https://tools.brightlocal.com/seo-tools/api/v4/batch?api-key=#{api_key}"
response = HTTParty.post(request_url)

if response.code == 201
  batch_id = response['batch-id']
end
this is an example of running one job in the batch (the query parameters go inside the body):
rank_url = "https://tools.brightlocal.com/seo-tools/api/v4/rankings/search"
response = HTTParty.post(rank_url, {
  :body => {
    "api-key" => api_key,
    "batch-id" => batch_id,
    "search-engine" => "google",
    "country" => "USA",
    "search-term" => "restaurant"
  }
})
I have not tested this next part, but theoretically, this is how you would deal with signatures and expirations
expires = Time.now.to_i + 1800
string_to_sign = "#{api_key}.#{expires}"
binary_signature = OpenSSL::HMAC.digest('sha1', string_to_sign, secret_key)
url_safe_signature = CGI::escape(Base64.encode64(binary_signature).chomp)
All that would be left is to use a PUT request to commit the batch, and a GET request to retrieve the data inside the batch.
EDIT: Figured out how to correctly get a passing signature for the jobs that require one. (this example is for local search rank checker http://apidocs.brightlocal.com/#local-search-rank-checker)
expires = Time.now.to_i + 1800
concat = api_key + expires.to_s
sig = OpenSSL::HMAC.digest('sha1', secret_key, concat)
sig = CGI::escape(Base64.encode64(sig).chomp)
local_rank = "https://tools.brightlocal.com/seo-tools/api/v2/lsrc/add?api-key=#{api_key}&sig=#{sig}&expires=#{expires}"
response = HTTParty.post(local_rank, {
  :body => {
    "name" => "pizza hut",
    "search-terms" => "restaurant"
  }
})
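For anyone coming from BrightLocal's PHP docs, the equivalent signing in PHP would look roughly like this (a sketch mirroring the Ruby above, not copied from the docs):
$expires = time() + 1800;
// raw binary HMAC-SHA1 of api_key concatenated with expires, keyed by the secret
$sig = hash_hmac('sha1', $apiKey . $expires, $secretKey, true);
$sig = urlencode(base64_encode($sig));

$url = "https://tools.brightlocal.com/seo-tools/api/v2/lsrc/add"
     . "?api-key={$apiKey}&sig={$sig}&expires={$expires}";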
Since you are using Ruby and not PHP you will have to implement everything yourself. The example you give shows the use of a PHP wrapper created by BrightLocal (and it seems they only have it in PHP).
Basically you will have to make calls to the endpoints yourself and manage the data yourself instead of using their wrapper.

Will my old Guzzle code run with a new version, and what should I look for?

I am a Java programmer and new to PHP. I am experiencing high CPU usage and long transaction times when I access services using Guzzle. Sending a small message costs me half a second on average.
The code below costs me 0.249 seconds:
// Create the REST client
$client = new Client(URL, array(
    'request.options' => array(
        'auth' => array($lgUser, $lgPassword, 'Basic')
    )
));

$time_start = microtime(true);

// Login to the web service
$request = $client->get('/PartnerInformation.svc/Login');
try {
    $response = $request->send();
    $lgSID = $response->xml();
    echo("Logged in successfully; SID: ".$lgSID);
} catch (Exception $e) {
    echo("Error while logging in: ".$e);
}

$time_end = microtime(true);
$time_total = $time_end - $time_start;
echo('login time: '.$time_total);
Are there things I can do to speed things up or find the problem?
By looking into the guzzle.phar file I found out that we are using version 3.8.1. Would moving to a newer version boost performance and lower the CPU usage? What kind of problems can I expect when installing a new Guzzle version? Will it be enough to swap the guzzle.phar file?
You can read about the changes and breaking changes in the official documentation. However, as far as I can see from the pasted code, no changes would be needed.
Architecturally speaking, there are some huge differences between 3.8 and 5.2. 5.x makes more use of closures and anonymous functions. I've found it to be more resource friendly.
Guzzle, by default, will utilize libcurl. Ultimately, any performance increase observed will be marginal due to the common underpinning.
I'd recommend upgrading to the 5.x series, and possibly even starting to look at 6.x (still in development), if for nothing else than the fact that it is actively being developed and maintained.
There are some significant changes you will need to be aware of. The most prominent of these is that the "lazy methods" (get, post, head, etc.) now perform the request and return a Response object.
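Concretely, the login snippet above would change roughly like this (a sketch; the 6.x line assumes the PSR-7 API, which drops the xml() helper):
// Guzzle 3.x: get() only builds the request, send() executes it
$request  = $client->get('/PartnerInformation.svc/Login');
$response = $request->send();
$lgSID    = $response->xml();

// Guzzle 5.x: get() performs the request and returns the response directly
$response = $client->get('/PartnerInformation.svc/Login');
$lgSID    = $response->xml();

// Guzzle 6.x: responses are PSR-7 messages, so parse the body yourself
$response = $client->get('/PartnerInformation.svc/Login');
$lgSID    = simplexml_load_string((string) $response->getBody());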
I've found the Guzzle Docs a vital resource.

Facebook Batch API insight requests

For a project I have to grab the insights of a Facebook page over a long period of time (e.g. 1-2 years).
I first tried to do a single request but it turned out that only requesting
/PAGE_ID/insights?since=xxx&until=xxx
doesn't return all the data I want (it somehow suppresses data, as if there's some limit to the size of the answer).
I then tried to split up the date range (e.g. 01.04.2011-01.04.2012 -> 01.04.2011-01.08.2011-01.12.2011-01.04.2012), which didn't quite work like I wanted it to either.
My next approach was to request only the insight values I need, like 'page_stories, page_impressions, ...'. The requests looked like this
/PAGE_ID/insights/page_impressions/day?since=xxx&until=xxx
This actually worked, but not with AJAX. It sometimes seemed to drop some requests (especially if I changed the browser tab in Google Chrome), and I need to be sure that all requests return an answer. A synchronous solution would just take way too much time: one request needs at least 2 seconds, and with a date range of 2 years I may have about 300 single requests, which takes far too long to complete.
Lastly, I stumbled over Facebook's ability to do batch requests, which is exactly what I need. It can pack up to 50 requests into one call, which significantly lowers the bandwidth. And here's where I'm stuck. The Facebook API gives some examples of how to use it, but none of them worked when I tested them in the Graph Explorer and via the PHP Facebook SDK. I tried to pack this request
PAGE_ID/insights/page_fan_adds/day?since=1332486000&until=1333695600
into a batch request but failed.
It seems that the API is bugged. It always gives me this error when I use a question mark '?' in the 'relative_url' field:
{
    "error": {
        "message": "batch parameter must be a JSON array",
        "type": "GraphBatchException"
    }
}
Here is what I tried:
This gives the 'must be a JSON array' error:
?batch=[{"method":"GET","relative_url":"/PAGE_ID/insights/page_fan_adds/day?since=1332486000&until=1333695600"}]
These actually return data, but they ignore the parameters:
?batch=[{"method":"GET","relative_url":"/PAGE_ID/insights/page_fan_adds/day","body":"since=1332486000 until=1333695600"}]
?batch=[{"method":"GET","relative_url":"/PAGE_ID/insights/page_fan_adds/day","body":"since=1332486000,until=1333695600"}]
?batch=[{"method":"GET","relative_url":"/PAGE_ID/insights/page_fan_adds/day","body":{"since":"1332486000","until":"1333695600"}}]
And this one tells me that this is an 'Unsupported post request':
?batch=[{"method":"POST","relative_url":"/PAGE_ID/insights/page_fan_adds/day","body":"since=1332486000 until=1333695600"}]
Can someone help?
I finally found the solution to my problem. It's not mentioned in the Facebook documentation, but for this request
?batch=[{"method":"GET","relative_url":"/PAGE_ID/insights/page_fan_adds/day?since=1332486000&until=1333695600"}]
to work properly, we have to use a function like urlencode() to encode the JSON part. This way the queries work like a charm. A PHP example:
$insights = $facebook->api(
    '?batch=['.urlencode('{"method":"GET","relative_url":"/PAGE_ID/insights/page_fan_adds/day?since=1332572400&until=1333782000"}').']',
    'post',
    array('access_token' => $this->facebook->getAccessToken())
);
which results in this:
?batch=[%7B%22method%22%3A%22GET%22%2C%22relative_url%22%3A%22%2FPAGE_ID%2Finsights%2Fpage_fan_adds%2Fday%3Fsince%3D1300086000%26until%3D1307862000%22%7D]
This example is for using an array of IDs to make a batch request with urlencoding.
$postIds = [
    'XXXXXXXXXXXXXXX_XXXXXXXXXXXXXXX',
    'XXXXXXXXXXXXXXX_XXXXXXXXXXXXXXX',
    'XXXXXXXXXXXXXXX_XXXXXXXXXXXXXXX',
    'XXXXXXXXXXXXXXX_XXXXXXXXXXXXXXX',
    'XXXXXXXXXXXXXXX_XXXXXXXXXXXXXXX',
];

$queries = [];
foreach ($postIds as $postId) {
    $queries[] = [
        'method' => 'GET',
        'relative_url' => '/' . $postId . '/comments?summary=1&filter=stream&order=reverse_chronological',
    ];
}

$requests = $facebook->post('?batch=' . urlencode(json_encode($queries)))->getGraphNode();

Speeding up a SOAP-powered website

We're currently looking into doing some performance tweaking on a website that relies heavily on a SOAP web service. But our servers are located in Belgium and the web service we connect to is located in San Francisco, so it's a long-distance connection, to say the least.
Our website is PHP powered, using PHP's built-in SoapClient class.
On average a call to the web service takes 0.7 seconds and we are doing about 3-5 requests per page. All possible request/response caching is already implemented, so we are now looking at other ways to improve the connection speed.
This is the code which instantiates the SoapClient. What I'm looking for now are other ways/methods to improve the speed of single requests. Does anyone have ideas or suggestions?
private function _createClient()
{
    try {
        $wsdl = sprintf($this->config->wsUrl.'?wsdl', $this->wsdl);
        $client = new SoapClient($wsdl, array(
            'soap_version'       => SOAP_1_1,
            'encoding'           => 'utf-8',
            'connection_timeout' => 5,
            'cache_wsdl'         => 1,
            'trace'              => 1,
            'features'           => SOAP_SINGLE_ELEMENT_ARRAYS
        ));

        $header_tags = array(
            'username' => new SOAPVar($this->config->wsUsername, XSD_STRING, null, null, null, $this->ns),
            'password' => new SOAPVar(md5($this->config->wsPassword), XSD_STRING, null, null, null, $this->ns)
        );
        $header_body = new SOAPVar($header_tags, SOAP_ENC_OBJECT);
        $header = new SOAPHeader($this->ns, 'AuthHeaderElement', $header_body);
        $client->__setSoapHeaders($header);
    } catch (SoapFault $e) {
        controller('Error')->error($id.': Webservice connection error '.$e->getCode());
        exit;
    }

    $this->client = $client;
    return $this->client;
}
So, the root problem is the number of requests you have to make. What about creating grouped services?
If you are in charge of the web services, you could create specialized web services that perform multiple operations at the same time, so your main app only has to make one request per page.
If not, you can relocate your app server near SF.
If relocating the whole server is not possible and you cannot create new specialized web services, you could add a bridge located near the web service's server. This bridge would provide the specialized web services and be in charge of calling the atomic web services. Instead of 0.7s * 5 you'd have 0.7s + 5 * 0.1, for example.
PHP.INI
output_buffering = On
output_handler = ob_gzhandler
zlib.output_compression = Off
Do you know for sure that it is network latency slowing down each request? 0.7s seems a long round-trip time, as Benoit says. I'd look at doing some benchmarking - you can do this with cURL, although I'm not sure how this would work with your SOAP client.
Something like:
$ch = curl_init('http://path/to/sanfrancisco/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$output = curl_exec($ch);
$info = curl_getinfo($ch);
$info will return an array including elements for total_time, namelookup_time, connect_time, pretransfer_time, starttransfer_time and redirect_time. From these you should be able to work out whether it's the DNS, the request, the actual SOAP server or the response that's taking up the time.
One obvious thing that's just occurred to me: are you requesting the SOAP server via a domain or an IP? If you're using a domain, your DNS might be slowing things down significantly (although it will be cached at several stages). Check your local DNS lookup time (in your SOAP client or php.ini - not sure) and the TTL of your domain (in your DNS zone). Set up a static IP for your San Francisco server and reference it that way if you haven't already.
Optimize the server's (not the client's!) HTTP response by using caching and HTTP compression. Check out the tips from Yahoo: http://developer.yahoo.com/performance/rules.html
1. You can check that your SOAP server uses gzip compression for HTTP content, just like your site output does. A 0.7s round trip to SF seems a bit long; either the web service is slow to answer, or there is significant network latency.
If you can, give other hosting companies a try for your Belgian server; in France some have far better connectivity to the US than others.
I once moved a website from one host to another and the network latency between Paris and New York almost doubled! It's huge, and my client, who has a lot of US visitors, was unhappy with it.
Relocating the web server to SF can be an option; you'll get far better connectivity between the servers, but be careful about latency if your visitors are mainly located in Europe.
2. You can use an opcode cache mechanism, such as XCache or APC. It will not change the SOAP latency, but it will improve PHP execution time.
3. Depending on whether the SOAP requests are repetitive, and on how long a content update can be deferred, you can get a real improvement by caching the SOAP results. I suggest you use an in-memory caching system (like XCache/memcached or similar) because they're much faster than file or DB cache systems.
In your class, the _createClient method isn't the best example of functionality to cache, but for any read operation it's simply the best way to improve performance:
private function _createClient()
{
    $xcache_key = 'clientcache';
    if (!xcache_isset($xcache_key)) {
        $ttl = 3600; // one hour cache lifetime
        $client = $this->_getClient(); // private method embedding your soap request
        xcache_set($xcache_key, $client, $ttl);
        return $client;
    }
    // return result from mem cache
    return xcache_get($xcache_key);
}
The example is for the XCache extension, but you can use other systems in a very similar manner.
4. To go further you can use a similar mechanism to cache your PHP processing results (like template rendering output and other resource-consuming operations). The key to success with this technique is knowing exactly which part is cached and for how long it stays there without refreshing.
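As a sketch of point 4, using the same XCache calls as above to cache a rendered block (the key, TTL and render method are made up for the example):
$cache_key = 'home_product_block';
$ttl       = 600; // ten minutes without refreshing

if (xcache_isset($cache_key)) {
    $html = xcache_get($cache_key);
} else {
    $html = $this->renderProductBlock(); // hypothetical expensive rendering call
    xcache_set($cache_key, $html, $ttl);
}

echo $html;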
Any chance of using an AJAX interface? If the requests can happen in the background, the user won't be left waiting for the response.
