I'm new to cURL and I can't find my answer.
I want to get the HTTP status of a page, so I'm using:
$httpCode = curl_getinfo($curl, CURLINFO_HTTP_CODE);
But this does not work if there is a redirection (for example, a 301 that redirects to another page will give me a 200 answer), because CURLINFO_HTTP_CODE gives the last received HTTP code.
Any idea how I can get the first received HTTP code?
Thanks.
You need to configure cURL with CURLOPT_FOLLOWLOCATION set to 0, so it stops at the first response instead of following the redirect.
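For example (a minimal sketch; the URL is a placeholder):
// With redirects disabled, CURLINFO_HTTP_CODE reports the status of the
// first response (e.g. 301), not the 200 of the page it redirects to.
$curl = curl_init('http://example.com/old-page'); // placeholder URL
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 0);    // stop at the first response
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true); // don't echo the body
curl_exec($curl);
$httpCode = curl_getinfo($curl, CURLINFO_HTTP_CODE); // e.g. 301
curl_close($curl);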
The way I am accomplishing this is by having PHP consult an INI file with the different return codes, replacing them with something human-readable:
<?php
$ch = curl_init(); // create cURL handle (ch)
if (!$ch) {
    die("Couldn't initialize a cURL handle");
}
// set some cURL options
curl_setopt($ch, CURLOPT_URL, "http://mail.yahoo.com");
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0); // stay on the first response so we see its code
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 0);
curl_setopt($ch, CURLOPT_TIMEOUT, 30);
// execute
$ret = curl_exec($ch);
if ($ret === false) {
    // some kind of error happened
    $error = curl_error($ch);
    curl_close($ch); // close cURL handle
    die($error);
} else {
    $info = curl_getinfo($ch);
    curl_close($ch); // close cURL handle
    if (empty($info['http_code'])) {
        die("No HTTP code was returned");
    } else {
        // load the HTTP codes
        $http_codes = parse_ini_file("HTMLCodes.ini");
        // echo results
        echo "The server responded: <br />";
        echo $http_codes[$info['http_code']];
    }
}
?>
An example of the INI file with the return codes:
[Informational 1xx]
100="Continue"
101="Switching Protocols"
[Successful 2xx]
200="OK"
201="Created"
202="Accepted"
203="Non-Authoritative Information"
204="No Content"
205="Reset Content"
206="Partial Content"
[Redirection 3xx]
300="Multiple Choices"
301="Moved Permanently"
302="Found"
303="See Other"
304="Not Modified"
305="Use Proxy"
306="(Unused)"
307="Temporary Redirect"
I would like to have a lottery check page written in PHP. The code does not work with the Hungarian lottery database ($url2) but works with the other ($url1). Is too much data the problem?
<?php
echo "CURL - function test <br>";
$url1 = "http://www.example.com";
$url2 = "https://bet.szerencsejatek.hu/cmsfiles/otos.html";

function curl_download($Url){
    // is cURL installed yet?
    if (!function_exists('curl_init')){
        die('Sorry cURL is not installed!');
    }
    // OK cool - then let's create a new cURL resource handle
    $ch = curl_init();
    // Now set some options (most are optional)
    // Set URL to download
    curl_setopt($ch, CURLOPT_URL, $Url);
    // Set a referer
    curl_setopt($ch, CURLOPT_REFERER, "http://www.example.org/yay.htm");
    // User agent
    curl_setopt($ch, CURLOPT_USERAGENT, "MozillaXYZ/1.0");
    // Include header in result? (0 = no, 1 = yes)
    curl_setopt($ch, CURLOPT_HEADER, 0);
    // Should cURL return or print out the data? (true = return, false = print)
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Timeout in seconds
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);
    // Download the given URL, and return output
    $output = curl_exec($ch);
    // Close the cURL resource, and free system resources
    curl_close($ch);
    return $output;
}

echo curl_download($url2);
echo strlen(curl_download($url2));
The first thing is that it depends on what the error is.
I think you should dump the result of the cURL call. Something like:
if (!curl_errno($ch)) {
    switch ($http_code = curl_getinfo($ch, CURLINFO_HTTP_CODE)) {
        case 200: # OK
            $return = ['result' => 'ok', 'response_text' => $result];
            break;
        default:
            $return = [
                'result'        => 'unexpected_http_code',
                'http_code'     => $http_code,
                'response_text' => $result,
            ];
    }
} else {
    $return = ['result' => 'curl_error', 'curl_error' => curl_error($ch)];
}
Maybe it's because you didn't configure your SSL settings; note that the second URL starts with https://.
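If that is the cause, these are the relevant options (a sketch; the CA bundle path is a placeholder, and the lines belong with the other curl_setopt calls):
// If curl_error($ch) reports an SSL certificate problem, point cURL at a
// CA bundle it can verify against. The path below is a placeholder.
curl_setopt($ch, CURLOPT_CAINFO, '/path/to/cacert.pem');
// Insecure quick test only - never leave this in production code:
// curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
// curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);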
If I get the title of the page, I can tell whether the download link is active or dead.
For example: "Free online storage" is the title of a dead link and "[file name]" is the title of an active link (MediaFire). But my page takes too long to respond, so is there any other way to check if a download link is active or dead?
This is what I have done:
<?php
function getTitle($Url){
    $str = file_get_contents($Url);
    if ($str !== false && strlen($str) > 0){
        // grab whatever sits between <title> and </title>
        preg_match("/\<title\>(.*)\<\/title\>/i", $str, $title);
        return isset($title[1]) ? $title[1] : null;
    }
}
?>
Do not perform a GET request, which downloads the whole page/file; perform a HEAD request instead, which fetches only the HTTP headers, then check that the status is 200 and the Content-Type is not text/html.
Something like this...
function url_validate($link)
{
    // http://www.example.com/determining-if-a-url-exists-with-curl/
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $link);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
    curl_setopt($ch, CURLOPT_HEADER, true);
    curl_setopt($ch, CURLOPT_NOBODY, true);          // HEAD request: headers only
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 10);         // follow up to 10 redirections - avoids loops
    $data = curl_exec($ch);
    curl_close($ch);
    if (!$data) {
        return false;
    }
    // grab the status code of the last response in the header dump
    preg_match_all("/HTTP\/1\.[01]\s(\d{3})/", $data, $matches);
    $code = end($matches[1]);
    if ($code == 200) {
        return true;
    } elseif ($code == 404) {
        return false;
    }
}
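The function above only checks the status code. To also apply the Content-Type rule from the start of this answer (dead MediaFire links serve an HTML page), a sketch along these lines could work; the function name is my own:
function is_download_link($link)
{
    $ch = curl_init($link);
    curl_setopt($ch, CURLOPT_NOBODY, true);          // HEAD request: headers only
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
    curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    $type = (string) curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
    curl_close($ch);
    // a live download answers 200 with a non-HTML Content-Type
    return $code == 200 && stripos($type, 'text/html') === false;
}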
You can safely use any cURL library function. It is legitimate and thus would not be regarded as a hacking attempt. The only requirement is that your web hosting company has the cURL extension installed, which is very likely.
cURL should do the job. You can check the headers returned and the text content as well if you want.
If you just enter the URLs into the browser you can see that both work; CDON works even without JavaScript. Have they blocked cURL somehow?
I'm trying to build a scraper to benefit legal movies online, which would benefit them a whole lot; it seems stupid to block scrapers in general, IMHO. Although I'm far from sure that's what's going on here! It might just be an error somewhere...
// Works
get_file1('http://sfanytime.com/sv-SE/Sokresultat/?field=all&q=The+Matrix', '/', 'sfanytime.html');
// Saves a blank 0 KB file
get_file1('http://downloads.cdon.com/index.phtml?action=search&search_terms=The+Matrix', '/', 'cdon.html');
function get_file1($file, $local_path, $newfilename) {
    $out = fopen($newfilename, 'wb');
    if ($out === FALSE) {
        return false;
    }
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_FILE, $out);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_URL, $file);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);
    curl_exec($ch);
    $error = curl_error($ch);
    curl_close($ch);
    fclose($out);
    if (strlen($error) > 0) {
        echo "<br>Error is : " . $error;
        return false;
    }
    return true;
}
You should change the line
curl_setopt($ch, CURLOPT_FAILONERROR, true);
...to...
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
CURLOPT_FAILONERROR will cause a "silent fail", which, from what you say, is not what you want. I have replaced it with CURLOPT_FOLLOWLOCATION, because when I visit the second URL, I get redirected to a "choose your country" type page, and that redirect response has an empty body - which is why you get an empty file.
There is no problem with your code as such, simply a problem with the way you handle the response from the second URL. You don't see an error because, technically, there wasn't one.
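If you want to see this for yourself, a small diagnostic sketch (URL taken from the question):
$ch = curl_init('http://downloads.cdon.com/index.phtml?action=search&search_terms=The+Matrix');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$body = curl_exec($ch);
// where the redirects ended up, and the status of the final response
echo curl_getinfo($ch, CURLINFO_EFFECTIVE_URL) . "\n";
echo curl_getinfo($ch, CURLINFO_HTTP_CODE) . "\n";
echo strlen((string) $body) . " bytes\n";
curl_close($ch);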
I want to do a 301 redirect in PHP, but first I want the script to check whether the site is online or not: if it is online, it will redirect; otherwise it will show "unable to redirect". Can anyone tell me how this is possible?
Thanks.
This should do it (edited for clarity):
$destination = 'http://www.google.com';
$ch = curl_init($destination);
// use request type HEAD because it's faster and lighter
curl_setopt($ch, CURLOPT_NOBODY, true);
// prevent curl from dumping any output
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// prevent curl from producing any output
curl_setopt($ch, CURLOPT_HEADER, false);
// run request
curl_exec($ch);
// consider request a success if the HTTP Code is less than 400 (start of errors)
// change this to whatever you expect to get, e.g. "equal to 200 (OK)"
$success = (bool) ((int)curl_getinfo($ch, CURLINFO_HTTP_CODE) < 400);
curl_close($ch);
// redirect or die
if ($success) {
    header('Location: ' . $destination, true, 301);
    exit; // stop the script once the redirect header is sent
} else {
    die('Unable to redirect to ' . $destination);
}
How to check if a URL exists or not (error 404) using PHP?
<?php
$url = "http://www.faressoft.org/";
?>
If you have allow_url_fopen, you can do:
$exists = ($fp = @fopen("http://www.faressoft.org/", "r")) !== FALSE;
if ($fp) fclose($fp);
although strictly speaking, this won't return false only for 404 errors. It's possible to use stream contexts to get that information, but a better option is to use the curl extension:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.example.com/notfound");
curl_setopt($ch, CURLOPT_NOBODY, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_exec($ch);
$is404 = curl_getinfo($ch, CURLINFO_HTTP_CODE) == 404;
curl_close($ch);
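For completeness, the stream-context route mentioned above could look like this (a sketch: ignore_errors keeps file_get_contents() from bailing out on 4xx/5xx responses so the status line can be inspected):
$context = stream_context_create([
    'http' => ['method' => 'HEAD', 'ignore_errors' => true],
]);
@file_get_contents("http://www.faressoft.org/", false, $context);
// $http_response_header holds the raw response headers, e.g.
// "HTTP/1.1 404 Not Found" as the first entry
$status = isset($http_response_header[0]) ? $http_response_header[0] : '';
$is404 = strpos($status, ' 404 ') !== false;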
The simplest way to check for a 404/200 etc.:
<?php
$mylink = "http://site.com";
$handler = curl_init($mylink);
curl_setopt($handler, CURLOPT_RETURNTRANSFER, TRUE);
$re = curl_exec($handler);
$httpcdd = curl_getinfo($handler, CURLINFO_HTTP_CODE);
curl_close($handler);
if ($httpcdd == 404) {
    echo 'it is 404';
} else {
    echo 'it is not 404';
}
?>
You could use cURL, which is a PHP extension. With cURL, you can query the page and then check for the error code:
CURLE_HTTP_RETURNED_ERROR (22)
This is returned if CURLOPT_FAILONERROR is set to TRUE and the HTTP server returns an error code that is >= 400.
From the cURL documentation at php.net:
<?php
// Create a curl handle to a non-existing location
$ch = curl_init('http://404.php.net/');
// Fail on HTTP codes >= 400 so curl_errno() can report CURLE_HTTP_RETURNED_ERROR
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// Execute
curl_exec($ch);
// Check if any error occurred
if (curl_errno($ch)) {
    echo 'Curl error: ' . curl_error($ch);
}
// Close handle
curl_close($ch);
?>
http://www.php.net/manual/en/function.curl-errno.php