How to gracefully handle a downed API - PHP

With Twitter being down today, I was thinking about how best to handle calls to an API when it is down. If I am using cURL to call their API, how do I make the script fail quickly and handle the errors so as not to slow down the application?

Perhaps use a sort of cache of whether Twitter is up or down. Log invalid responses from the API in a database or a server-side file. Once you get two, three, or some other number of invalid responses in a row, disable all requests to the API for x amount of time.
After x amount of time, attempt a request; if it's still down, disable requests for another x minutes.
If your server can run cron jobs, consider a script that checks the API for a valid response every few minutes. If it finds the API is down, disable requests until it's back up. At least in that case the server does the testing and your users aren't the guinea pigs.
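A minimal sketch of that idea, using a local JSON state file as the cache; the file path, failure threshold, and cool-off period are illustrative values, not part of any library:

define('API_STATE_FILE', '/tmp/twitter_api_state.json'); // hypothetical path
define('FAILURE_THRESHOLD', 3);  // consecutive failures before we stop calling
define('COOL_OFF_SECONDS', 300); // how long to disable requests

function api_is_disabled() {
    if (!file_exists(API_STATE_FILE)) {
        return false;
    }
    $state = json_decode(file_get_contents(API_STATE_FILE), true);
    return $state['failures'] >= FAILURE_THRESHOLD
        && (time() - $state['last_failure']) < COOL_OFF_SECONDS;
}

function api_record_result($success) {
    $state = array('failures' => 0, 'last_failure' => 0);
    if (file_exists(API_STATE_FILE)) {
        $state = json_decode(file_get_contents(API_STATE_FILE), true);
    }
    if ($success) {
        $state['failures'] = 0;
    } else {
        $state['failures']++;
        $state['last_failure'] = time();
    }
    file_put_contents(API_STATE_FILE, json_encode($state));
}

Before each API call, check api_is_disabled() and fall back to cached data if it returns true; after each call, pass the outcome to api_record_result().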

Use curl_setopt
curl_setopt($yourCurlHandle, CURLOPT_CONNECTTIMEOUT, 1); // 1 second
If you are using curl >= 7.16.2 and PHP >= 5.2.3, there is also CURLOPT_CONNECTTIMEOUT_MS, which takes milliseconds.
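For example, a hedged sketch (the URL is just a placeholder):

$ch = curl_init('http://api.twitter.com/example-endpoint'); // placeholder URL
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// give up on the connection after 500 ms instead of a whole second
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT_MS, 500);
$response = curl_exec($ch);
if ($response === false) {
    // connection failed or timed out: treat the API as down
}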

Use curl_getinfo to get the HTTP response code or content length and check against those.
$HttpCode = curl_getinfo($curl, CURLINFO_HTTP_CODE);
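For example, continuing from that line (assuming $response holds the result of curl_exec() with CURLOPT_RETURNTRANSFER enabled):

if ($HttpCode != 200 || $response === false || $response === '') {
    // error status or empty body: treat the API as down and fall back to cached data
}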

Related

PHP: Right process to avoid timeout issue

I have a PHP website and one of the pages makes a cURL call to another server. That server needs about 45 seconds to respond, and there is nothing I can do about it. There are actually two steps to get the information: the first step is to send a request to update the information (this takes about 43 seconds), and then I need to send another request to get the data back (normally 2-5 seconds).
My server is on GoDaddy, and sometimes it times out (CGI timeout) because the limit is, I think, normally 30 seconds.
This script (sending the request + getting the data back) is normally triggered overnight via a cron job, but it can also be triggered during the day.
So I was wondering: what would be the best way to split the information to avoid timeout issues?
I was thinking of just sending the update request and not caring about the result. Then, about a minute later, I would send a request to get the data back. However, I have no idea if it's even possible to set a timer in PHP, and if so, would the page time out anyway?
Thanks!
You can set a timeout value in your PHP code to allow more time.
Setting Curl's Timeout in PHP
If you want to run the files separately, I would set up a separate cron job for the second file.
Use CURLOPT_CONNECTTIMEOUT to allow more time for the connection to be established.
CURLOPT_CONNECTTIMEOUT
The number of seconds to wait while trying to connect. Use 0 to wait indefinitely.
You also need CURLOPT_TIMEOUT, which limits how long the whole transfer may take; CURLOPT_CONNECTTIMEOUT on its own only covers the connection phase.
Something like this,
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 0); // 0 = wait indefinitely; not good practice
curl_setopt($ch, CURLOPT_TIMEOUT, 400); // in seconds
You can set it in milliseconds as well, using
CURLOPT_TIMEOUT_MS
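For example (the values are illustrative):

curl_setopt($ch, CURLOPT_TIMEOUT_MS, 45000);       // whole transfer must finish within 45 s
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT_MS, 2000); // connection phase alone capped at 2 s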

CURL and DDOS Problems

I need to get some data from a remote HTTP server. I'm using cURL classes for multi-requests.
My problem is the remote server's firewall. I'm sending between 1,000 and 10,000 GET and POST requests, and the server bans me for DDoS.
I have taken these measures:
The requests are sent with header information:
curl_setopt($this->ch, CURLOPT_HTTPHEADER, $header);
The requests are sent with a random referer:
curl_setopt($this->ch, CURLOPT_REFERER, $refs[rand(0, count($refs) - 1)]);
The requests are sent with a random user agent:
curl_setopt($this->ch, CURLOPT_USERAGENT, $agents[rand(0, count($agents) - 1)]);
I send the requests at random intervals using sleep:
sleep(rand(0,10));
But the server still bans me for an hour each time.
Sorry for my bad English :)
Thanks to all.
Sending a large number of requests to a server in a short space of time is likely to have the same impact as a DoS attack, whether that is what you intended or not. A quick fix would be to change the sleep line from sleep(rand(0,10)); (which gives a 1 in 11 chance of sending the next request instantly) to sleep(3); (which always leaves roughly 3 seconds between requests). 3 seconds should be enough of a gap to keep most servers happy. Once you've verified this works, you can reduce the value to 2 or 1 to see if you can speed things up.
A far better solution would be to create an API on the server that allows you to get the data you need in 1, or at least only a few, requests. Obviously this is only possible if you're able to make changes to the server (or can persuade those who can to make the changes on your behalf).
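A minimal illustration of that fixed-gap approach, assuming $urls holds the list of URLs to fetch:

$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
foreach ($urls as $url) {
    curl_setopt($ch, CURLOPT_URL, $url);
    $data = curl_exec($ch);
    // predictable 3-second gap between requests instead of rand(0,10)
    sleep(3);
}
curl_close($ch);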

What is a practical use for PHP's sleep()?

I just had a look at the docs on sleep().
Where would you use this function?
Is it there to give the CPU a break in an expensive function?
Any common pitfalls?
One place where it finds use is to create a delay.
Let's say you've built a crawler that uses curl/file_get_contents to get remote pages. You don't want to bombard the remote server with too many requests in a short time, so you introduce a delay between consecutive requests.
sleep takes its argument in seconds; its friend usleep takes microseconds and is more suitable in some cases.
Another example: You're running some sort of batch process that makes heavy use of a resource. Maybe you're walking the database of 9,000,000 book titles and updating about 10% of them. That process has to run in the middle of the day, but there are so many updates to be done that running your batch program drags the database server down to a crawl for other users.
So you modify the batch process to submit, say, 1000 updates, then sleep for 5 seconds to give the database server a chance to finish processing any requests from other users that have backed up.
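A rough sketch of that pattern, assuming $db is an existing PDO connection to MySQL; the books table, needs_update flag, and TRIM() normalisation are made up for illustration:

$batchSize = 1000;
do {
    $updated = $db->exec(
        'UPDATE books
            SET title = TRIM(title), needs_update = 0
          WHERE needs_update = 1
          LIMIT ' . $batchSize
    );
    // give the database server a breather so other users' queries can catch up
    sleep(5);
} while ($updated > 0);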
Here's a snippet of how I use sleep in one of my projects:
foreach ($addresses as $address)
{
    $url = "http://maps.google.com/maps/geo?q={$address}&output=json...etc...";
    $result = file_get_contents($url);
    $geo = json_decode($result, TRUE);
    // Do stuff with $geo
    sleep(1);
}
In this case sleep helps me prevent being blocked by Google maps, because I am sending too many requests to the server.
Old question, I know, but another reason for using sleep/usleep can be when you are writing security/cryptography code, such as an authentication script. A couple of examples:
You may wish to reduce the effectiveness of a potential brute force attack by making your login script purposefully slow, especially after a few failed attempts.
Also you might wish to add an artificial delay during encryption to mitigate against timing attacks. I know that the chances are slim that you're going to be writing such in-depth encryption code in a language like PHP, but still valid I reckon.
EDIT
Using sleep/usleep against timing attacks is not a good solution. You can still extract the important data in a timing attack; you just need more samples to filter out the noise that the sleep adds.
You can find more information about this topic in: Could a random sleep prevent timing attacks?
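For the brute-force login case above, a simple sketch might look like this; getFailedAttempts(), checkPassword(), and recordFailedAttempt() are placeholders for however you store and verify this data:

$failures = getFailedAttempts($username);   // placeholder: count recent failed logins
if ($failures > 3) {
    // make each additional guess progressively more expensive, capped at 10 seconds
    sleep(min($failures, 10));
}
if (!checkPassword($username, $password)) { // placeholder credential check
    recordFailedAttempt($username);         // placeholder
    exit('Invalid credentials');
}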
Another way to use it: when you want to execute a cron job more often than every minute. I use the following code for this:
sleep(30);
include 'cronjob.php';
I call this file every minute, as well as cronjob.php itself, so the job effectively runs every 30 seconds.
This is a bit of an odd case...file transfer throttling.
In a file transfer service we ran a long time ago, the files were served from 10Mbps uplink servers. To prevent the network from bogging down, the download script tracked how many users were downloading at once, and then calculated how many bytes it could send per second per user. It would send part of this amount, then sleep a moment (1/4 second, I think) then send more...etc.
In this way, the servers ran continuously at about 9.5Mbps, without having uplink saturation issues...and always dynamically adjusting speeds of the downloads.
I wouldn't do it this way, or in PHP, now...but it worked great at the time.
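A much simplified version of that throttling loop, assuming $bytesPerSecond has already been calculated from the current number of downloaders and $pathToFile points at the file being served:

$chunk = (int) ($bytesPerSecond / 4); // send a quarter of the per-second budget at a time
$fp = fopen($pathToFile, 'rb');
while (!feof($fp)) {
    echo fread($fp, $chunk);
    flush();
    usleep(250000);                   // pause 1/4 second between chunks
}
fclose($fp);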
You can use sleep to pause script execution... for example to delay an AJAX response on the server side or to implement an observer. You can also use it to simulate delays.
I also use it to delay sendmail() and the like.
Some people use sleep() to prevent DoS attacks and login brute-forcing; I don't agree, because you also need checks to stop the user from running the script multiple times in parallel.
Also check out usleep.
I had to use it recently when I was utilising Google's Geolocation API. Every address in a loop needed to call Google's server so it needed a bit of time to receive a response. I used usleep(500000) to give everything involved enough time.
I wouldn't typically use it for serving web pages, but it's useful for command line scripts.
$ready = false;
do {
    $ready = some_monitor_function();
    sleep(2);
} while (!$ready);
Super old posts, but I thought I would comment as well.
I recently had to check on a VERY long-running process that creates some files. So I made a function that wraps a cURL request: if the file I'm looking for doesn't exist yet, I sleep the PHP script and check again a bit later:
function remoteFileExists() {
    $curl = curl_init('domain.com/file.ext');
    // don't fetch the actual page, you only want to check the connection is ok
    curl_setopt($curl, CURLOPT_NOBODY, true);
    // do request
    $result = curl_exec($curl);
    $statusCode = curl_getinfo($curl, CURLINFO_HTTP_CODE);
    curl_close($curl);
    // if the request failed or the file is not there yet, wait and try again
    if ($result === false || $statusCode == 404) {
        sleep(7);
        return remoteFileExists();
    }
    return 'exists';
}
echo remoteFileExists();
One of its applications: if I am sending mails from a script to 100+ customers, the whole operation takes at most 1-2 seconds, so providers like Hotmail and Yahoo treat the burst as spam. To avoid this, we need to add some delay in execution after every mail.
Among the others: you are testing a web application that makes asynchronous requests (AJAX calls, lazy image loading, ...).
You are testing it locally, so responses are immediate since there is only one user (you) and no network latency.
Using sleep lets you see/test how the web app behaves when load and network latency delay requests.
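For example, a tiny test endpoint that fakes network latency before answering an AJAX call; the 800 ms delay and the file name are arbitrary:

<?php
// fake-latency.php - simulate a slow backend while testing locally
usleep(800000);                              // pretend the request took 800 ms
header('Content-Type: application/json');
echo json_encode(array('status' => 'ok'));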
A quick pseudo-code example of where you may not want to get millions of alert emails for a single event but you do want your script to keep running (CheckSystemCPU and SendMeAnEmail are placeholders):
if (CheckSystemCPU() > 95) {
    SendMeAnEmail();
    sleep(1800); // suppress further alerts for 30 minutes
}

Faster alternative to file_get_contents()

Currently I'm using file_get_contents() to submit GET data to an array of sites, but upon execution of the page I get this error:
Fatal error: Maximum execution time of 30 seconds exceeded
All I really want the script to do is start loading the webpage, and then leave. Each webpage may take up to 5 minutes to load fully, and I don't need it to load fully.
Here is what I currently have:
foreach ($sites as $s) // Create one line to read from a wide array
{
    file_get_contents($s['url']); // Send to the shells
}
EDIT: To clear up any confusion, this script is being used to start scripts on other servers, which return no data.
EDIT: I'm now attempting to use cURL to do the trick, by setting a timeout of one second to make it send the data and then stop. Here is my code:
$ch = curl_init($s['url']); //load the urls
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 1); //Only send the data, don't wait.
curl_exec($ch); //Execute
curl_close($ch); //Close it off.
Perhaps I've set the option wrong. I'm looking through some manuals as we speak. Just giving you an update. Thank you to all of you who are helping me so far.
EDIT: Ah, found the problem. I was using CURLOPT_CONNECTTIMEOUT instead of CURLOPT_TIMEOUT. Whoops.
However, now the scripts aren't triggering. They each use ignore_user_abort(TRUE); so I can't understand the problem.
Hah, scratch that. Works now. Thanks a lot everyone
There are many ways to solve this.
You could use cURL with its curl_multi_* functions to execute the requests asynchronously. Or use cURL the normal way but with a timeout limit of 1 second, so the call returns after the timeout while the remote request still gets executed.
If you don't have cURL installed, you could keep using file_get_contents but fork processes (not so elegant, but it works) using something like ZendX_Console_Process_Unix, so you avoid waiting between requests.
As Franco mentioned (and I'm not sure it was picked up on), you specifically want to use the curl_multi functions, not the regular curl ones. They pack multiple curl handles into a curl_multi handle and execute them simultaneously, returning (or not, in your case) the responses as they arrive.
Example at http://php.net/curl_multi_init
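A condensed sketch of that approach, reusing the $sites array from the question:

$mh = curl_multi_init();
$handles = array();
foreach ($sites as $s) {
    $ch = curl_init($s['url']);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 1); // we only need to trigger the remote script
    curl_multi_add_handle($mh, $ch);
    $handles[] = $ch;
}
// run all handles until every transfer has finished (or timed out)
do {
    curl_multi_exec($mh, $running);
    if ($running) {
        curl_multi_select($mh);           // wait for activity instead of busy-looping
    }
} while ($running > 0);
foreach ($handles as $ch) {
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);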
Re your update that you only need to trigger the operation:
You could try using file_get_contents with a timeout. This would lead to the remote script being called, but the connection being terminated after n seconds (e.g. 1).
If the remote script is configured so it continues to run even if the connection is aborted (in PHP that would be ignore_user_abort), it should work.
Try it out. If it doesn't work, you won't get around increasing your time_limit or using an external executable. But from what you're saying - you just need to make the request - this should work. You could even try to set the timeout to 0 but I wouldn't trust that.
From here:
<?php
$ctx = stream_context_create(array(
    'http' => array(
        'timeout' => 1
    )
));
file_get_contents("http://example.com/", 0, $ctx);
?>
To be fair, Chris's answer already includes this possibility: curl also has a timeout switch.
It is not file_get_contents() that consumes that much time but the network connection itself.
Consider not submitting GET data to an array of sites, but creating an RSS feed and letting them fetch the RSS data.
I don't fully understand the meaning behind your script.
But here is what you can do:
To avoid the fatal error quickly, you can just add set_time_limit(120) at the beginning of the file. This will allow the script to run for 2 minutes. Of course you can use any number you want, and 0 for infinite.
If you just need to call the URL and you don't "care" about the result, you should use cURL in asynchronous mode. That way no call to the URL waits until it is finished, and you can fire them all very quickly.
BR.
If the remote pages take up to 5 minutes to load, your file_get_contents will sit and wait for that 5 minutes. Is there any way you could modify the remote scripts to fork into a background process and do the heavy processing there? That way your initial hit will return almost immediately, and not have to wait for the startup period.
Another possibility is to investigate if a HEAD request would do the trick. HEAD does not return any data, just headers, so it may be enough to trigger the remote jobs and not wait for the full output.
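If you want to experiment with that, file_get_contents() can send a HEAD request through a stream context; this is a sketch, not tested against your remote scripts:

$ctx = stream_context_create(array(
    'http' => array(
        'method'  => 'HEAD',
        'timeout' => 1
    )
));
// triggers the remote script but asks for headers only
file_get_contents($s['url'], false, $ctx);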

Improving cURL performance (PHP Library)

Here is a brief overview of what I am doing, it is quite simple really:
Go out and fetch records from a database table.
Walk through all those records and for each column that contains a URL go out (using cURL) and make sure the URL is still valid.
For each record a column is updated with a current time stamp indicating when it was last checked and some other db processing takes place.
Anyhow, all this works well and does exactly what it is supposed to. The problem is that I think performance could be greatly improved in terms of how I am validating the URLs with cURL.
Here is a brief (over simplified) excerpt from my code which demonstrates how cURL is being used:
$ch = curl_init();
while ($dbo = pg_fetch_object($dbres))
{
    // for each iteration set url to db record url
    curl_setopt($ch, CURLOPT_URL, $dbo->url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_exec($ch); // perform a cURL session
    $ihttp_code = intval(curl_getinfo($ch, CURLINFO_HTTP_CODE));
    // do checks on $ihttp_code and update db
}
// do other stuff here
curl_close($ch);
As you can see, I am just reusing the same cURL handle the entire time, but even if I strip out all of the processing (database or otherwise), the script still takes incredibly long to run. Would changing any of the cURL options help improve performance? Tuning timeout values, etc.? Any input would be appreciated.
Thank you,
Nicholas
Set CURLOPT_NOBODY to 1 (see the curl documentation) to tell curl not to ask for the body of the response. This will contact the web server and issue a HEAD request. The response code will tell you if the URL is valid or not, without transferring the bulk of the data back.
If that's still too slow, then you'll likely see a vast improvement by running N threads (or processes) each doing 1/Nth of the work. The bottleneck may not be in your code, but in the response times of the remote servers. If they're slow to respond, then your loop will be slow to run.
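A hedged sketch combining both suggestions, checking a batch of URLs in parallel with HEAD-style requests; the function name, batch size, and timeouts are illustrative:

function check_urls_parallel(array $urls, $batchSize = 10) {
    $results = array();
    foreach (array_chunk($urls, $batchSize, true) as $chunk) {
        $mh = curl_multi_init();
        $handles = array();
        foreach ($chunk as $key => $url) {
            $ch = curl_init($url);
            curl_setopt($ch, CURLOPT_NOBODY, true);      // HEAD request, no body transfer
            curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
            curl_setopt($ch, CURLOPT_TIMEOUT, 10);
            curl_multi_add_handle($mh, $ch);
            $handles[$key] = $ch;
        }
        do {
            curl_multi_exec($mh, $running);
            if ($running) {
                curl_multi_select($mh);
            }
        } while ($running > 0);
        foreach ($handles as $key => $ch) {
            $results[$key] = intval(curl_getinfo($ch, CURLINFO_HTTP_CODE));
            curl_multi_remove_handle($mh, $ch);
            curl_close($ch);
        }
        curl_multi_close($mh);
    }
    return $results;
}
// $results maps each URL's key to its HTTP status code (0 means the request failed)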
