curl to get headers - php

I want to check if a page gives a 200 header using curl.
I am using the following script:
public static function isUrlExist($url)
{
$curl = curl_init($url);
curl_setopt($curl, CURLOPT_NOBODY, TRUE);
curl_exec($curl);
$code = curl_getinfo($curl, CURLINFO_HTTP_CODE);
if ($code == 200) {
$status = true;
} else {
$status = false;
}
curl_close($curl);
return $status;
}
The strange thing is it evaluates to true if I query a website such as vimeo: https://vimeo.com/api/oembed.json?url=https://vimeo.com/11896354
But websites such as Facebook or Google return false.
Am I missing something?

My best guess is that facebook/google is redirecting you, causing a 3xx redirect status code. Try adding curl_setopt($curl, CURLOPT_FOLLOWLOCATION, TRUE); before curl_exec() to make curl follow redirects.

Related

How to check whether an image is present or not at a specific URL using PHP?

Suppose I have a URL that is supposed to point to an image, i.e. if I enter the URL in the address bar, the image should display in the browser window.
If there is no image at the URL, the check should return false; otherwise it should return true.
How can this be done efficiently and reliably in PHP?
I use this little guy:
function remoteFileExists($url){
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_NOBODY, 1);
curl_setopt($ch, CURLOPT_FAILONERROR, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// curl_exec() returns false on failure; an empty string still means success.
$exists = curl_exec($ch) !== false;
curl_close($ch);
return $exists;
}
Use like:
if (remoteFileExists('https://www.google.com/images/srpr/logo11w.png')){
echo 'Yay! Photo is there.';
} else {
echo 'Photo no home.';
}
There are two options:
You can use curl, it is explained here : How can one check to see if a remote file exists using PHP?
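Use PHP get_headers(). Note that file_exists() (http://php.net/manual/en/function.file-exists.php) does not work with http:// URLs, because the HTTP wrapper does not support stat().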
Example :
$file = 'http://www.domain.com/somefile.jpg';
$file_headers = @get_headers($file);
if ($file_headers === false || strpos($file_headers[0], ' 404') !== false) {
$exists = false;
}
else {
$exists = true;
}
Try this
$ch = curl_init("https://www.google.com/images/srpr/logo11w.png");
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_exec($ch);
$retcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
if($retcode==200)
echo 'File Exist';

How can I get HTTP headers (Location) from some URL?

I have an address (for example: http://example.com/b-out/3456/3212/) that I must request through curl. I know that this URL redirects to another URL (like http://sdss.co/go/36a7fe71189fec14c85636f33501f6d2/?...). That second URL is in the Location header of the first URL's response. How can I get the second URL into a variable?
Perform a request to the first URL, confirm a redirect takes place and read the Location header. From PHP cURL retrieving response headers AND body in a single request? and Check headers in PHP cURL server response:
$curlHandle = curl_init();
curl_setopt($curlHandle, CURLOPT_URL, $url);
curl_setopt($curlHandle, CURLOPT_HEADER, 1);
curl_setopt($curlHandle, CURLOPT_NOBODY, 1);
curl_setopt($curlHandle, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt($curlHandle, CURLOPT_RETURNTRANSFER, 1);
$redirectResponse = curl_exec($curlHandle);
The options being set there mean: return the response headers, don't return the response body, don't automatically follow redirects and return the result in the exec-call.
Now you've got the HTTP response headers, without body, in $redirectResponse. You'll now need to verify that it's a redirect:
$statusCode = curl_getinfo($curlHandle, CURLINFO_HTTP_CODE);
if (in_array($statusCode, array(301, 302, 303, 307, 308), true))
{
$headerLength = curl_getinfo($curlHandle, CURLINFO_HEADER_SIZE);
$responseHeaders = substr($redirectResponse, 0, $headerLength);
$redirectUrl = getLocationHeader($responseHeaders);
}
Then create a function to do that:
function getLocationHeader($responseHeaders)
{
}
In there you'll want to explode() the $responseHeaders on HTTP newline (\r\n) and find the header starting with location.
Alternatively, you can use a more abstract HTTP client library like Zend_Http_Client, where it is a little easier to obtain the headers.
I did it like CodeCaster said. This is my function 'getLocationHeader':
function getLocationHeader($responseHeaders)
{
if (preg_match('/^Location:\s*(.+?)\s*$/mi', $responseHeaders, $loc))
{
$location = trim($loc[1]);
return $location;
}
return FALSE;
}
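As an alternative to the regular expression, the explode() approach described earlier (split on \r\n and look for the header starting with location) can be sketched as follows. This is only a minimal sketch: the function name findLocationHeader and the sample header block are assumptions made for illustration, not part of the original answers.

```php
<?php
/**
 * Return the value of the Location header from a raw header block,
 * or null when no such header is present.
 * (findLocationHeader is a hypothetical name used for this sketch.)
 */
function findLocationHeader(string $responseHeaders): ?string
{
    // HTTP header lines are separated by CRLF.
    foreach (explode("\r\n", $responseHeaders) as $line) {
        // Header names are case-insensitive, so match without regard to case.
        if (stripos($line, 'location:') === 0) {
            return trim(substr($line, strlen('location:')));
        }
    }
    return null;
}

// Made-up sample response headers, for illustration only.
$sample = "HTTP/1.1 302 Found\r\nContent-Type: text/html\r\nLocation: http://example.org/target\r\n\r\n";
echo findLocationHeader($sample), PHP_EOL;
```

Splitting line by line avoids depending on which header happens to follow Location in the response.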

Check if site is up or down

I have a problem with this script: if I check http://google.com/ or some other websites it does not work, but with http://stackoverflow.com or cnn.com it works.
$url = 'http://google.com/';
function urlExists($url=NULL)
{
if($url == NULL) return false;
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$data = curl_exec($ch);
$httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
if($httpcode>=200 && $httpcode<300){
return true;
} else {
return false;
}
}
if(urlExists($url))
{
echo "ok";
}
else
{
echo "no";
}
I have tested with @fopen and that does not work either.
Do websites like Google have some kind of blocker? Thanks.
You should change:
if($httpcode>=200 && $httpcode<300){
to:
if($httpcode>=200 && $httpcode<303){
The reason is that many sites use a 301 Moved Permanently by default as well as 302 Found.
You are curling http://google.com, which responds with a 301 response code to redirect visitors to http://www.google.com. In your logic, the 301 response code is treated as offline.
You will need to adjust the URL you are checking, adjust the logic to accept this response code, or add the code below to make cURL follow redirects.
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
It should probably read
if($httpcode>=200 && $httpcode<400){
instead of
if($httpcode>=200 && $httpcode<300){
Otherwise, a redirect would be regarded as "server down".
With the above change, the result for http://google.com is ok.
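The two suggestions in this thread (accept 3xx codes, or let cURL follow redirects) can be combined. The sketch below is one possible arrangement, not the original poster's code: the names statusMeansUp and siteIsUp, the redirect limit of 5, and the 10-second timeouts are all assumptions chosen for illustration.

```php
<?php
/**
 * Decide whether an HTTP status code means "the server answered usefully".
 * 2xx and 3xx both count as up. curl_getinfo() reports 0 when no
 * response code was received at all, which counts as down.
 */
function statusMeansUp(int $httpCode): bool
{
    return $httpCode >= 200 && $httpCode < 400;
}

/**
 * Hypothetical wrapper: send a HEAD request, follow redirects,
 * and classify the final status code.
 */
function siteIsUp(string $url): bool
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);          // HEAD request, no body
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow 3xx redirects
    curl_setopt($ch, CURLOPT_MAXREDIRS, 5);          // assumed limit, avoids loops
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // do not echo output
    curl_exec($ch);
    $code = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    // The handle is released when $ch goes out of scope.
    return statusMeansUp($code);
}
```

Keeping the status check in its own function makes the policy (which codes count as "up") easy to change without touching the request code. Some servers answer HEAD requests differently from GET, so a HEAD-based check may need to fall back to GET.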

Check if tweet status exists?

I need a way to check whether a tweet exists. I have a link to the tweet, like https://twitter.com/darknille/status/355651101657280512 . I would prefer a fast check (a HEAD request only, without retrieving the page body), so I tried something like this
function if_curl_exists($url)
{
$resURL = curl_init();
curl_setopt($resURL, CURLOPT_URL, $url);
curl_setopt($resURL, CURLOPT_BINARYTRANSFER, 1);
curl_setopt($resURL, CURLOPT_HEADERFUNCTION, 'curlHeaderCallback');
curl_setopt($resURL, CURLOPT_FAILONERROR, 1);
$x = curl_exec ($resURL);
//var_dump($x);
echo $intReturnCode = curl_getinfo($resURL, CURLINFO_HTTP_CODE);
curl_close ($resURL);
if ($intReturnCode != 200 && $intReturnCode != 302 && $intReturnCode != 304) {
return false;
}
else return true;
}
or like this
function if_curl_exists_1($url)
{
$curl = curl_init($url);
curl_setopt($curl, CURLOPT_NOBODY, true);//head request
$result = curl_exec($curl);
$ret = false;
if ($result !== false) {
//if request was ok, check response code
echo $statusCode = curl_getinfo($curl, CURLINFO_HTTP_CODE);
if ($statusCode == 200) {
$ret = true;
}
}
curl_close($curl);
return $ret;
}
but both of these return null from curl_exec(), so there is no HTTP status code to check.
The other way is to use the Twitter API, like GET statuses/show/:id https://dev.twitter.com/docs/api/1.1/get/statuses/show/%3Aid but there is no special return value if the tweet doesn't exist, as said here https://dev.twitter.com/discussions/8802
I need advice on the fastest way to check this; I am working in PHP.
You probably have to set the Return Transfer flag
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
If the code returns as 30x status you probably have to add the Follow Location flag as well
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
You can use @get_headers(). It will return an array in which the first item has the response code:
$response = @get_headers($url);
print_r($response[0]);
if($response[0]=='HTTP/1.0 404 Not Found'){
echo 'Not Found';
}else{
echo 'Found';
}
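Comparing the first header line against one exact string is fragile, because the protocol version and reason phrase vary between servers, and get_headers() follows redirects by default, so the array may contain several status lines. A minimal sketch of a more tolerant check is shown below; the helper name lastStatusCode is an assumption introduced here for illustration.

```php
<?php
/**
 * Extract the last HTTP status code from a list of header lines,
 * such as the array returned by get_headers().
 * Returns null when no status line is found.
 * (lastStatusCode is a hypothetical helper name.)
 */
function lastStatusCode(array $headerLines): ?int
{
    $code = null;
    foreach ($headerLines as $line) {
        // Matches "HTTP/1.0 404 Not Found", "HTTP/1.1 200 OK", "HTTP/2 200", ...
        if (is_string($line) && preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', $line, $m)) {
            $code = (int) $m[1]; // later status lines overwrite earlier ones
        }
    }
    return $code;
}

// Possible usage with get_headers(), which returns false on failure:
// $headers = @get_headers($url);
// $found = $headers !== false && lastStatusCode($headers) === 200;
```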

Check if a remote page exists using PHP?

In PHP, how can I determine if any remote file (accessed via HTTP) exists?
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.example.com/");
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_MAXREDIRS, 10); //follow up to 10 redirections - avoids loops
$data = curl_exec($ch);
curl_close($ch);
if (!$data) {
echo "Domain could not be found";
}
else {
preg_match_all("/HTTP\/\d(?:\.\d)?\s(\d{3})/",$data,$matches);
$code = end($matches[1]);
if ($code == 200) {
echo "Page Found";
}
elseif ($code == 404) {
echo "Page Not Found";
}
}
Modified version of code from here.
I like curl or fsockopen to solve this problem. Either one can provide header data regarding the status of the file requested. Specifically, you would be looking for a 404 (File Not Found) response. Here is an example I've used with fsockopen:
http://www.php.net/manual/en/function.fsockopen.php#39948
This function will return the response code (the last one in case of redirection), or false in case of a dns or other error. If one argument (the url) is supplied a HEAD request is made. If a second argument is given, a full request is made and the content, if any, of the response is stored by reference in the variable passed as the second argument.
function url_response_code($url, & $contents = null)
{
$context = null;
if (func_num_args() == 1) {
$context = stream_context_create(array('http' => array('method' => 'HEAD')));
}
$contents = @file_get_contents($url, false, $context);
$code = false;
if (isset($http_response_header)) {
foreach ($http_response_header as $header) {
if (strpos($header, 'HTTP/') === 0) {
list(, $code) = explode(' ', $header);
}
}
}
return $code;
}
I recently was looking for the same info. Found some really nice code here: http://php.assistprogramming.com/check-website-status-using-php-and-curl-library.html
function Visit($url){
$agent = "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)";
$ch = curl_init();
curl_setopt ($ch, CURLOPT_URL,$url );
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch,CURLOPT_VERBOSE,false);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
$page=curl_exec($ch);
//echo curl_error($ch);
$httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
if($httpcode >= 200 && $httpcode < 300){
return true;
}
else {
return false;
}
}
if(Visit("http://www.site.com")){
echo "Website OK";
}
else{
echo "Website DOWN";
}
Use Curl, and check if the request went through successfully.
http://w-shadow.com/blog/2007/08/02/how-to-check-if-page-exists-with-curl/
Just a note that these solutions will not work on a site that does not give an appropriate response for a page that is not found. For example, I just had a problem testing for a page on a site that simply loads its main page when it gets a request it cannot handle. So the site will nearly always give a 200 response, even for non-existent pages.
Some sites will show a custom error on a standard page and still not give a 404 header.
There is not much you can do in these situations unless you know the expected content of the page and test that the expected content exists, or test for some expected error text within the page, and that all gets a bit messy...
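For sites like this, a content check along the lines described above might look like the following sketch. It assumes you have already fetched the status code and body (for example with cURL), and that you know some text that should appear on a real page; the function name pageLooksReal and its parameters are assumptions made for illustration.

```php
<?php
/**
 * Heuristic check for "soft 404" pages: the status must be 2xx,
 * none of the known error phrases may appear in the body, and the
 * expected marker text must be present.
 * (pageLooksReal and its parameters are hypothetical.)
 */
function pageLooksReal(int $statusCode, string $body, string $expectedMarker, array $errorMarkers = []): bool
{
    if ($statusCode < 200 || $statusCode >= 300) {
        return false;
    }
    foreach ($errorMarkers as $errorText) {
        // Case-insensitive search for known error text inside the page.
        if ($errorText !== '' && stripos($body, $errorText) !== false) {
            return false;
        }
    }
    // An empty marker means "no content requirement".
    return $expectedMarker === '' || stripos($body, $expectedMarker) !== false;
}
```

This stays a heuristic: it only works when the expected text or the error text is known in advance and stable.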
