PHP: stop a URL data fetch after a particular time

I am using PHP 5.2 and fetching data from URLs with the file_get_contents function. The loop runs over 5000 URLs, which I have divided into slots of 500, and the script below handles one slot.
Each slot of 500 takes about 3 hours to complete, because some URLs take far too long while others finish in about 1 second, which is fine.
What I want: if a URL takes more than 30 seconds, skip it and go on to the next one.
In other words, I want to stop the fetch after 30 seconds.
<?php
// Create the stream context
$context = stream_context_create(array(
    'http' => array(
        'timeout' => 1 // Timeout in seconds
    )
));

// Fetch the URL's contents
echo date("Y-m-d H:i:s")."\n";
$contents = file_get_contents('http://example.com', 0, $context);
echo date("Y-m-d H:i:s")."\n";

// Check for empties
if (!empty($contents)) {
    // Woohoo
    // echo $contents;
    echo "file fetched";
} else {
    echo $contents;
    echo "more than 30 sec";
}
?>
I have already tried exactly that, but it is not working for me: file_get_contents does not stop, it keeps going. The only difference is that I get no result after the timeout, yet the call still takes the same amount of time, as you can see in the output.
PHP output:
2012-03-09 11:26:38
2012-03-09 11:26:40
more than 30 sec

You can set the HTTP timeout. (Not tested)
<?php
$ctx = stream_context_create(array(
    'http' => array(
        'timeout' => 30
    )
));
file_get_contents("http://example.com/", 0, $ctx);
Edit: I don't know why it isn't working for you with this code. If you can't get it to work this way, you may also want to give cURL a try; it might even be faster for this, though I haven't measured it.
If that works for you, you can then use the curl_setopt function to set the timeout with the CURLOPT_TIMEOUT option.
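For example, here is a minimal cURL sketch of that idea; the helper name fetchWithTimeout and the 30-second limit are illustrative only and not part of the original question:

// Hypothetical helper (not from the question): fetch one URL with a hard
// 30-second limit and return false so the surrounding loop can skip it.
function fetchWithTimeout($url, $timeoutSeconds = 30)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,            // return the body instead of printing it
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds, // limit on establishing the connection
        CURLOPT_TIMEOUT        => $timeoutSeconds  // limit on the whole transfer
    ));
    $contents = curl_exec($ch);
    curl_close($ch);
    return $contents; // false on timeout or any other cURL error
}

$contents = fetchWithTimeout('http://example.com', 30);
if ($contents === false) {
    echo "more than 30 sec, skipping\n";
}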

There is some info in the PHP manual about timeouts:
http://php.net/manual/en/function.file-get-contents.php
It mentions the following as of PHP 5.2.1:
ini_set('default_socket_timeout', 120);
$a = file_get_contents("http://abcxyz.com");
Or add a context, which is more or less the same thing:
// Create the stream context
$context = stream_context_create(array(
    'http' => array(
        'timeout' => 3 // Timeout in seconds
    )
));

// Fetch the URL's contents
$contents = file_get_contents('http://abcxyz.com', 0, $context);
A third option is PHP's fsockopen(), which has an explicit timeout parameter:
http://www.php.net/manual/en/function.fsockopen.php
$timeout = 2; // seconds
$fp = fsockopen($url, 80, $errNo, $errStr, $timeout);
/* stops connecting after 2 seconds,
   stores the error number in $errNo
   and the error string in $errStr */
To save writing a lot of code, you could use it as a quick check whether the host is up, e.g.:
if (pingLink($domain, $timeout)) {
    $contents = file_get_contents('http://'.$domain); // host answered, safe to fetch
}

function pingLink($domain, $timeout = 30) {
    $status = 0; // default: site is down
    $file = @fsockopen($domain, 80, $errNo, $errStr, $timeout);
    if ($file) {
        $status = 1; // site is up
        fclose($file);
    }
    return $status;
}

Related

HTTP POST in a loop keeps timing out

I am trying to send cURL requests from a source to a destination in a loop. The loop runs 2 times: the first request lasts 32 seconds, the second 50 seconds, and then the script times out. Controlling the timeout is not in my control, as this is shared hosting.
The Source section below is run in the browser; the error below appears after the 120 seconds have been used up:
Error Details: Fatal error: Maximum execution time of 120 seconds
exceeded
Question
I assumed the requests would not hit the timeout, since each one is submitted separately through its own cURL call. Still, it seems like they are being consolidated into one total limit.
If I run the loop only once, everything works, as it takes 30 seconds.
Am I missing anything?
Source
for ($i = 0; $i <= 200; $i += 100) {
    $postData = array(
        'start' => $i,
        'end'   => $i + 100
    );
    $ch = curl_init('Server url');
    curl_setopt_array($ch, array(
        CURLOPT_POST           => TRUE,
        CURLOPT_RETURNTRANSFER => TRUE,
        CURLOPT_HTTPHEADER     => array(
            'Content-Type: application/json'
        ),
        CURLOPT_POSTFIELDS     => json_encode($postData)
    ));
    $response = curl_exec($ch);
    $responseData = json_decode($response, TRUE);
    curl_close($ch);
    echo $response;
}
Destination
public function methodname()
{
    $json = json_decode(file_get_contents('php://input'), true);
    // .
    // .
    // Logic that runs for 32 seconds
    // .
    // .
    header('Content-type: application/json');
    echo json_encode("message");
}
Try adding a sleep(1) call inside your loop. It could be that the server you are requesting doesn't like multiple POST requests in a short time.
Try using cURL's CURLOPT_TIMEOUT or similar configuration options. More information is in the manual: https://www.php.net/manual/en/function.curl-setopt.php
In short: read the documentation.
Later edit: you could also use set_time_limit(0); (or a value greater than 120) to increase your script's execution timeout.
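A rough sketch of how both suggestions could be combined inside the loop from the question; the 60-second per-request cap is only an assumed value, and set_time_limit() may be disabled on shared hosting:

set_time_limit(0); // lift the 120-second script limit, if the host allows it

for ($i = 0; $i <= 200; $i += 100) {
    $ch = curl_init('Server url');
    curl_setopt_array($ch, array(
        CURLOPT_POST           => TRUE,
        CURLOPT_RETURNTRANSFER => TRUE,
        CURLOPT_TIMEOUT        => 60, // assumed cap: give up on a single request after 60 seconds
        CURLOPT_HTTPHEADER     => array('Content-Type: application/json'),
        CURLOPT_POSTFIELDS     => json_encode(array('start' => $i, 'end' => $i + 100))
    ));
    $response = curl_exec($ch);
    curl_close($ch);
    echo $response;

    sleep(1); // brief pause between requests, as suggested above
}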

PHP - which is the fastest and best way to get HTML content

I have to read the HTML content of a URL of about ~1 MB (exactly 926 KB), and I have already written 2 functions for it.
A URL with ~1 MB of content:
https://example.com/html_1MB_Content.html
And here are the 2 functions I created:
function getContent1($url) {
    $file_handle = fopen($url, "r");
    while (!feof($file_handle)) {
        $line = fgets($file_handle);
        echo $line;
    }
    fclose($file_handle);
}

function getContent2($url) {
    $handle = curl_init($url);
    curl_setopt_array($handle, array(
        CURLOPT_USERAGENT      => $_SERVER['HTTP_USER_AGENT'],
        CURLOPT_ENCODING       => '',
        CURLOPT_RETURNTRANSFER => 1,
        CURLOPT_SSL_VERIFYPEER => 0,
        CURLOPT_FOLLOWLOCATION => 1
    ));
    $curl_response = curl_exec($handle);
    curl_close($handle);
    return $curl_response;
}
$testUrl = 'https://example.com/html_1MB_Content.html';
$result1 = getContent1 ($testUrl);
$result2 = getContent2 ($testUrl);
What I want is the fastest approach with the least memory usage. Which is best in this case?
One more question: is there any way to read the page content from bottom to top and stop reading once the data I'm looking for is found?
If you want to know how long it takes to execute your code, you can use this couple of lines:
// put this at the start of your code..
$time_start = microtime(true);

// here goes your code...

// and put this at the end of your code...
echo 'Total execution time in seconds: ' . (microtime(true) - $time_start)
   . '. Memory used in bytes: ' . memory_get_usage();
This will show you the execution time in seconds and the amount of memory used in bytes.
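Applied to the two functions above, a quick (and admittedly rough) comparison could look like this; note that getContent1() echoes the page instead of returning it, so its output is captured here with output buffering:

$testUrl = 'https://example.com/html_1MB_Content.html';

$time_start = microtime(true);
ob_start();                 // capture the echoed output of getContent1()
getContent1($testUrl);
$result1 = ob_get_clean();
echo 'getContent1: ' . (microtime(true) - $time_start) . " s\n";

$time_start = microtime(true);
$result2 = getContent2($testUrl);
echo 'getContent2: ' . (microtime(true) - $time_start) . ' s, peak memory: '
   . memory_get_peak_usage() . " bytes\n";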

Check if a file_get_contents URL is working

I have a website where I need to determine the user's location, so I use a webservice which gives me detailed information about the user (based on their IP address).
My code looks like this:
$user_ip = $_SERVER['REMOTE_ADDR'];
$json_url = 'http://example.com/'.$user_ip;
$json = file_get_contents($json_url);
$obj = json_decode($json);
This morning the webservice had problems (500 errors, too many connections, bad gateway...) and my website took a very long time to load.
So I have a question: is it possible to set a timeout for the file_get_contents function? Or is there a way to find out quickly that the server is not working?
You can set the timeout option of the http context:
$opts = array('http' =>
    array(
        'timeout' => 5
    )
);
$result = file_get_contents($url, false, stream_context_create($opts));
Check the docs of:
stream_context_create()
HTTP context options
An alternative would be to set the default socket timeout via ini_set():
$st = ini_get("default_socket_timeout"); // back up the current value
ini_set("default_socket_timeout", 5);    // 5 seconds
$content = file_get_contents($url);
if ($content === false) {
    // error handling
}
ini_set("default_socket_timeout", $st);  // restore the previous value

file_get_contents or fsockopen - timeout issue

I have a PHP file called testResponse.php which contains only:
<?php
sleep(5);
echo "go";
?>
Now, I'm calling this file from another page using file_get_contents like this:
$start = microtime(true);
$opts = array('http' =>
    array(
        'method'  => 'GET',
        'timeout' => 1
    )
);
$context = stream_context_create($opts);
$loc = @file_get_contents("http://www.mywebsite.com/testResponse.php", false, $context);
$end = microtime(true);
echo $end - $start, "\n";
The output is more than 5 seconds, which means my timeout has been ignored...
I followed the advice of this post: stackoverflow.com/questions/3689371
But it seems the hostname cannot be a path (like www.mywebsite.com/testResponse.php); it has to be just the hostname, like www.mywebsite.com.
So I'm stuck trying to achieve this goal:
Get the content of the page www.test.com/x.php, with these constraints:
if test.com doesn't exist or the page x.php doesn't exist, return nothing quickly
if the page exists but takes more than 1 second to load, abort
otherwise get the content of the file
Edit: By the way, it seems to work when I call this page (testResponse.php) from my local server, except that it multiplies the timeout by 2: for instance, with a timeout of 1, it echoes something like "2.0054645". But only locally...
The solution is to use PHP's cURL functions. The other question you linked to explains things properly, about read timeouts vs. connection timeouts, and so on, but neither of those is truly what you're looking for here. Even the connection timeout won't work, because the connection to testResponse.php is always successful; after that it's waiting, so what you need is an execution timeout. This is where cURL comes in handy.
So, testResponse.php doesn't need to be altered. In your main file, though, try the following code (this is tested and it works on my server):
$start = microtime(true);
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.mywebsite.com/testResponse.php");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 1);
$output = curl_exec($ch);
$errno = curl_errno($ch);
if ($errno > 0) {
    if ($errno === 28) {
        echo "Connection timed out.";
    }
    else {
        echo "Error #" . $errno . ": " . curl_error($ch);
    }
}
else {
    echo $output;
}
$end = microtime(true);
echo "<br><br>" . ($end - $start);
curl_close($ch);
This sets the execution time of the cURL session via the CURLOPT_TIMEOUT option you see on line 5. So, when the request times out, $errno will equal 28, the code for cURL's operation-timeout error. The rest of the error codes are listed in the cURL documentation, so you can expand the script above to act accordingly.
Finally, because of the CURLOPT_RETURNTRANSFER option that's set, curl_exec($ch) will return the content of the retrieved page if the session succeeds. Otherwise, it will return false.
Hope this helps!
Edit: Removed the statement setting CURLOPT_HEADER. I also, for some reason, was under the impression that curl_exec($ch) set the value of $ch to the returned contents, forgetting that the contents are returned by curl_exec().

Timing out a script portion and allowing the rest to continue

I have a widget on my homepage which loads XML data from an external source. I want the XML load to time out after x seconds (lately the other site has been having load issues). Here is the function I have so far; I can't figure out how to make the timer interact with simplexml_load_file().
Am I on the right track? Is there a way to make this work, or a better way to do this? If it does time out, I still need the rest of the page to continue loading, so I can't use set_time_limit(), because that would end all script execution, right?
function timer($end) {
    $count = 0;
    while ($end > $count) {
        sleep(1);
        $count++;
    }
    return true;
}
$we = simplexml_load_file('http://forecast.weather.gov/MapClick.php?lat=44.08920&lon=-70.17250&FcstType=xml');
if (timer(3)) return;
So you want to set a timeout for simplexml_load_file(). You can't set it specifically, but you can just set it globally (for all socket based streams) before using the function:
ini_set('default_socket_timeout', 3);
$we = simplexml_load_file($url);
// you can restore the default value after use, if you want
ini_restore('default_socket_timeout');
I would use cURL instead of loading the URL directly...
function getXml($url, $timeout = 0) {
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => (int) $timeout
    ));
    if ($xml = curl_exec($ch)) {
        return new SimpleXmlElement($xml);
    }
    else {
        return null;
    }
}

// Example
$xmlData = getXml('http://yoururl.com', 2); // 2 second timeout
You could first read the content of the file with a more controllable function that supports a timeout (like fopen, fsockopen or cURL, whichever works best for you), and then pass the content to simplexml_load_string() instead of simplexml_load_file().
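For instance, a sketch of that approach, reusing the cURL timeout idea from the answer above (the 3-second limit is arbitrary):

// Fetch the feed with a hard timeout, then parse the string locally.
$ch = curl_init('http://forecast.weather.gov/MapClick.php?lat=44.08920&lon=-70.17250&FcstType=xml');
curl_setopt_array($ch, array(
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_TIMEOUT        => 3 // give the remote site at most 3 seconds
));
$xml = curl_exec($ch);
curl_close($ch);

// simplexml_load_string() only parses text that is already in memory,
// so a slow remote site can no longer hold up the rest of the page.
$we = ($xml !== false) ? simplexml_load_string($xml) : null;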
