I was testing a script that checks whether a domain is working or not. I tested 50 sites (that should all work), but one site (Yahoo) returned "Died" (not working) with an HTTP 404. I'm not sure whether the problem is in my code or on the website; I suspect it's a redirection, but I don't know how to handle it.
How can I follow the redirect?
$host = "www.yahoo.com";
$ip = gethostbyname($host);
$domain = $ip;
//Starting process to check the domain
$check = curl_init($domain);
curl_setopt($check, CURLOPT_TIMEOUT, 10);
curl_setopt($check, CURLOPT_CONNECTTIMEOUT, 10);
curl_setopt($check, CURLOPT_RETURNTRANSFER, true);
$data = curl_exec($check);
$httpcode = curl_getinfo($check, CURLINFO_HTTP_CODE);
curl_close($check);
if ($httpcode >= 200 && $httpcode <= 350) {
$data_array[$key]['status'] = "Alive";
} else {
$data_array[$key]['status'] = "Died";
}
}
Update: I changed the URL to $host = "www.yahoo.com/"; and added curl_setopt($check, CURLOPT_FOLLOWLOCATION, 1);, and it works for me now.
I am trying to make a redirect PHP script. I want it to check whether a link exists and, if so, redirect the user to it; if it doesn't exist, it should try the next link, and so on. For some reason it's not working; maybe you could give me some help with this.
<?php
$URL = 'http://www.site1.com';
$URL = 'http://www.site2.com';
$URL = 'http://www.site3.com';
$handlerr = curl_init($URL);
curl_setopt($handlerr, CURLOPT_RETURNTRANSFER, TRUE);
$resp = curl_exec($handlerr);
$ht = curl_getinfo($handlerr, CURLINFO_HTTP_CODE);
if ($ht == '404') {
    echo "Sorry the website is down atm, please come back later!";
} else {
    header('Location: ' . $URL);
}
?>
You are overwriting your $URL variable:
$URL = 'http://www.site1.com';
$URL = 'http://www.site2.com';
$URL = 'http://www.site3.com';
Put these URLs in an array and loop over them with foreach.
You have a few issues in your code. First, $URL keeps overwriting itself, so only the last URL is ever checked. The URLs need to be in an array:
array( 'http://www.site1.com', 'http://www.site2.com', 'http://www.site3.com' );
Second, you can get many responses besides a 404, so you should tell cURL to follow redirects. If the URL is itself a redirect, you could get a 301 that leads to a 200, and you want to follow that.
Try this:
<?php
function curlGet($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_HEADER, true);
    curl_setopt($ch, CURLOPT_NOBODY, true);         // headers only, no body download
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_exec($ch);
    $httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch); // free the handle
    return $httpcode == 200;
}
$urlArray = array( 'http://www.site1.com', 'http://www.site2.com', 'http://www.site3.com' );
foreach ( $urlArray as $url ) {
    if ( curlGet($url) ) {
        header('Location: ' . $url);
        exit;
    }
}
// if we made it here, we looped through every url
// and none of them worked
echo "No valid URLs found...";
http://php.net/manual/en/function.file-exists.php#74469
<?php
function url_exists($url) {
    if (!$ch = curl_init($url)) return false;   // malformed URL
    curl_setopt($ch, CURLOPT_NOBODY, true);     // headers only
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    return $code >= 200 && $code < 400;
}
?>
This will give you the URL-exists check. Note that curl_init() alone only validates the URL string; the request has to be executed to learn whether the server actually responds.
To check multiple URLs, though, you need an array:
<?php
$url_array = [];
$url_array[] = 'http://www.site1.com';
$url_array[] = 'http://www.site2.com';
$url_array[] = 'http://www.site3.com';
foreach ($url_array as $url) {
    if (url_exists($url)) {
        // do what you need;
        break;
    }
}
?>
PS - this is completely untested, but should theoretically do what you need.
I am pulling some content into a text file and then using cURL or file_get_contents to display it.
Here it works perfectly fine:
http://www.dev.phosting.eu/
but here it returns 404:
http://dev5.gozenhost.com/index.php/shortcodes/114-testing
even though the file is accessible:
http://dev5.gozenhost.com/media/plg_system_yjsg/yjsgparsed/raw-githubusercontent-com/yjsgframework/demo-docs/master/shortcodes/Icons.txt
$getContent holds the accessible link above, and this is the cURL code:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $getContent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 2);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$data = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
if (empty($data)) {
    $content = 'Error processing url. ' . $httpCode;
} else if ($httpCode >= 200 && $httpCode < 300) {
    if ($local) {
        $content = $data;
    } else {
        $content = yjsg_clean_html($data);
        JFile::write($filepath, $content);
    }
} else {
    $content = 'Error processing url. ' . $httpCode;
}
I mean, all the files are in the right places and accessible.
The funny thing is that if I use cURL or file_get_contents to access someone else's site, it works fine; if I access a file on my own domain, it fails. Again, only on CloudLinux.
Does anyone know what the issue is and a possible fix?
Thank you!
I have a problem with Google's Geocoding API:
I have an address which should be converted to latitude and longitude by the Google Geocoding API using cURL. You will see my code below. It worked fine, but suddenly it stopped working and I started getting an "OVER_QUERY_LIMIT" answer. I looked it up, and that normally happens when the API is requested more than 2,500 times a day. That is impossible here, because my website is just about finished and gets about 20-40 geocode requests per day.
So what is really the problem when "OVER_QUERY_LIMIT" occurs? Is something wrong with my code, so that Google somehow blocks it?
ini_set('allow_url_fopen', true);
$CookiePath = 'mycookiepath/geocookie.txt';
$userAgent = "mysite.com";
$ListURLRoot = "http://maps.googleapis.com/maps/api/geocode/json";
$ListURLSuffix = '&sensor=false';
$Curl_Obj = curl_init(); // Setup cURL
curl_setopt($Curl_Obj, CURLOPT_COOKIEJAR, $CookiePath);
curl_setopt($Curl_Obj, CURLOPT_USERAGENT, $userAgent);
curl_setopt($Curl_Obj, CURLOPT_HEADER, 0);
curl_setopt($Curl_Obj, CURLOPT_AUTOREFERER, TRUE);
curl_setopt($Curl_Obj, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt($Curl_Obj, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($Curl_Obj, CURLOPT_TIMEOUT, 30);
curl_setopt($Curl_Obj, CURLOPT_POST, 0);
$stAddr = str_replace(' ','+', $ast);
$City = str_replace(' ','+', $ac);
$address = "$stAddr,+$City,+{$aco},+{$ap}";
$address = str_replace(' ', '+', $address);
$ListURL = "{$ListURLRoot}?address=$address$ListURLSuffix";
curl_setopt ($Curl_Obj, CURLOPT_URL, $ListURL);
$output = curl_exec ($Curl_Obj);
GetLocation($output);
function GetLocation($output) {
    global $db_connect;
    $Loc = json_decode($output, true);
    if (isset($Loc)) {
        $i = 1;
        while ($Loc['status'] == "OVER_QUERY_LIMIT") {
            if ($i > 14) {
                echo "Geocode failed!";
                exit();
            }
            sleep(1);
            $i++;
        }
        if (isset($Loc['status']) && stristr($Loc['status'], 'OK')) {
            if (isset($Loc['results'][0]['geometry']['location'])) {
                $Lat = $Loc['results'][0]['geometry']['location']['lat'];
                $Lng = $Loc['results'][0]['geometry']['location']['lng'];
            }
        } else {
            error_log($Loc['status'], 0);
            echo "Unknown error occurred!";
            exit();
        }
    }
}
Thanks in advance!
Mostly it's a Google problem. Try to avoid server-side geocoding; use a client-side JavaScript solution instead.
I am developing a PHP script for a music company that has different servers, so the script needs to display whether a file exists on an external server.
They have three versions of each music file (mp3, mp4, etc.), and each version is served from its own external server. I have made three solutions for this; all of them worked like a charm, but they make the server slow.
First Method:
$handle = curl_init($url);
curl_setopt($handle, CURLOPT_RETURNTRANSFER, TRUE);
/* Get the HTML or whatever is linked in $url. */
$response = curl_exec($handle);
/* Check for 404 (file not found). */
$httpCode = curl_getinfo($handle, CURLINFO_HTTP_CODE);
if ($httpCode == 404) {
    /* Handle 404 here. */
}
curl_close($handle);
/* Handle $response here. */
Second Method: Using NuSOAP, I made an API which internally checks the file and returns yes/no.
Third Method:
function checkurl($url)
{
    $file_headers = @get_headers($url);
    //var_dump($file_headers);
    if ($file_headers[0] == 'HTTP/1.1 302 Moved Temporarily' || $file_headers[0] == 'HTTP/1.1 302 Found') {
        $exists = false;
    } else {
        $exists = true;
    }
    return $exists;
}
So I need a solution that doesn't slow the server down. Any suggestions?
Be sure to issue a HEAD request, not GET, since you don't want to get the file contents. And maybe you need to follow redirects, or not...
Example with curl (thanks to this blog post):
<?php
$url = 'http://localhost/c.txt';
echo "\n checking: $url";
$c = curl_init();
curl_setopt( $c, CURLOPT_RETURNTRANSFER, true );
curl_setopt( $c, CURLOPT_FOLLOWLOCATION, true );
curl_setopt( $c, CURLOPT_MAXREDIRS, 5 );
curl_setopt( $c, CURLOPT_CUSTOMREQUEST, 'HEAD' );
curl_setopt( $c, CURLOPT_HEADER, 1 );
curl_setopt( $c, CURLOPT_NOBODY, true );
curl_setopt( $c, CURLOPT_URL, $url );
$res = curl_exec( $c );
echo "\n\ncurl:\n";
var_dump($res);
echo "\nis 200: ";
var_dump(false !== strpos($res, 'HTTP/1.1 200 OK'));
A SOAP or other web service implementation can be an option if the file is not available over HTTP.
If you want to use get_headers(), note that by default it is slow because it issues a GET request. To make it send a HEAD request instead, change the default stream context (see get_headers() in the PHP manual):
stream_context_set_default(
array(
'http' => array(
'method' => 'HEAD'
)
)
);
I thought it worked with the above answers, but it wasn't working where there were too many requests, so I finally tried again and again and found this solution. It works perfectly. The problem was actually redirects, too many of them, so I set a timeout of 15 in cURL and it worked:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 15);
$r = curl_exec($ch);
curl_close($ch);
$r = explode("\n", $r); // split() is removed in PHP 7; explode() does the job
var_dump($r);
In PHP, how can I determine if any remote file (accessed via HTTP) exists?
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.example.com/");
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_MAXREDIRS, 10); //follow up to 10 redirections - avoids loops
$data = curl_exec($ch);
curl_close($ch);
if (!$data) {
    echo "Domain could not be found";
} else {
    preg_match_all("/HTTP\/1\.[01]\s(\d{3})/", $data, $matches);
    $code = end($matches[1]);
    if ($code == 200) {
        echo "Page Found";
    } elseif ($code == 404) {
        echo "Page Not Found";
    }
}
Modified version of code from here.
I like curl or fsockopen to solve this problem. Either one can provide header data regarding the status of the file requested. Specifically, you would be looking for a 404 (File Not Found) response. Here is an example I've used with fsockopen:
http://www.php.net/manual/en/function.fsockopen.php#39948
This function will return the response code (the last one in case of redirection), or false in case of a DNS or other error. If one argument (the URL) is supplied, a HEAD request is made. If a second argument is given, a full GET request is made, and the content, if any, of the response is stored by reference in the variable passed as the second argument.
function url_response_code($url, &$contents = null)
{
    $context = null;
    if (func_num_args() == 1) {
        $context = stream_context_create(array('http' => array('method' => 'HEAD')));
    }
    $contents = @file_get_contents($url, false, $context);
    $code = false;
    if (isset($http_response_header)) {
        foreach ($http_response_header as $header) {
            if (strpos($header, 'HTTP/') === 0) {
                list(, $code) = explode(' ', $header);
            }
        }
    }
    return $code;
}
I recently was looking for the same info. Found some really nice code here: http://php.assistprogramming.com/check-website-status-using-php-and-curl-library.html
function Visit($url) {
    $agent = "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)";
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_USERAGENT, $agent);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_VERBOSE, false);
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);
    $page = curl_exec($ch);
    //echo curl_error($ch);
    $httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    if ($httpcode >= 200 && $httpcode < 300) {
        return true;
    } else {
        return false;
    }
}
if (Visit("http://www.site.com")) {
    echo "Website OK";
} else {
    echo "Website DOWN";
}
Use cURL and check whether the request went through successfully.
http://w-shadow.com/blog/2007/08/02/how-to-check-if-page-exists-with-curl/
Just a note that these solutions will not work on a site that does not give an appropriate response for a page not found. E.g., I just had a problem testing for a page on a site that simply loads its main page whenever it gets a request it cannot handle, so the site nearly always gives a 200 response, even for non-existent pages.
Some sites will show a custom error on a standard page and still not send a 404 header.
There is not much you can do in these situations unless you know the expected content of the page and test that the expected content exists, or test for some expected error text within the page; and that all gets a bit messy...