Check if page exists or is a 404 using PHP - php

Hi, I have created a PHP script to verify whether a page exists or is a 404 Not Found (i.e. an invalid URL). The problem I am facing at the moment is that in both cases, whether the link exists or not, the script reports that the URL is valid. I am not sure if I am using the wrong approach or missing something. Below is the code I am using:
$file_headers = @get_headers('https://instagram.com/p/xyz');
echo "<pre>";
print_r($file_headers);
echo "</pre>";
if(!$file_headers || $file_headers[0] == 'HTTP/1.1 404 Not Found') {
$exists = 0;
$error = '<div class="message">Provided url does not exist.</div>';
} else {
$error = '';
$exists = 1;
}
echo $exists;
I also tried to achieve this with cURL, but the problem is that with cURL I get a null response: a clean blank page, nothing less, nothing more. Below is how I am using cURL:
$handle = curl_init($url);
curl_setopt($handle, CURLOPT_RETURNTRANSFER, TRUE);
$response = curl_exec($handle);
$httpCode = curl_getinfo($handle, CURLINFO_HTTP_CODE);
print_r($httpCode);
curl_close($handle);
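One possible reason for the blank page is that curl_exec() fails outright (for example a DNS or SSL error) and the error is never checked, leaving the status code at 0. A minimal sketch of a check that inspects curl_errno() as well as the status code, assuming a 2xx/3xx response means the page exists (the helper names `statusIndicatesExists` and `urlExists` are my own):

```php
<?php
// Treat 2xx and 3xx responses as "page exists"; 4xx/5xx as missing.
function statusIndicatesExists(int $code): bool {
    return $code >= 200 && $code < 400;
}

// HEAD-style check that also surfaces transport errors (DNS, SSL, timeout).
function urlExists(string $url): bool {
    $handle = curl_init($url);
    curl_setopt($handle, CURLOPT_NOBODY, true);         // skip the body
    curl_setopt($handle, CURLOPT_RETURNTRANSFER, true); // do not echo output
    curl_setopt($handle, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($handle, CURLOPT_TIMEOUT, 10);
    curl_exec($handle);
    $failed = (bool) curl_errno($handle);               // transport-level failure
    $code = curl_getinfo($handle, CURLINFO_HTTP_CODE);
    curl_close($handle);
    return !$failed && statusIndicatesExists($code);
}
```

Note that some sites, Instagram included, may answer 200 even for missing pages or block non-browser clients, so a status-code check alone is not always conclusive.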

Related

Why is my get_headers not working? - Checking to see if a URL exists or not

I am trying to validate whether an input URL actually exists. I have been trying the following code, but with no success:
Using cURL:
<?php
$url = 'https://github.com';
$handle = curl_init($url);
curl_setopt($handle, CURLOPT_RETURNTRANSFER, TRUE);
/* Get the HTML or whatever is linked in $url. */
$response = curl_exec($handle);
/* Check for 404 (file not found). */
$httpCode = curl_getinfo($handle, CURLINFO_HTTP_CODE);
if($httpCode == 404) {
echo "Url not working";
}
echo "true";
?>
Using get_headers
<?php
// Initialize an URL to the variable
$url = "https://www.geeksforgeeks.org";
// Use get_headers() function
$headers = @get_headers($url);
// Use condition to check the existence of URL
if($headers && strpos( $headers[0], '200')) {
$status = "URL Exist";
}
else {
$status = "URL Doesn't Exist";
}
// Display result
echo($status);
?>
I have searched for both these answers on Stack Overflow and other websites, and used them to check whether the URL I give actually exists. When I enter a non-existing URL, I would like the output to say that the website does not exist; however, I always end up with the same output as for an existing URL, meaning that something must be wrong, although I cannot see what. With the first code, the output is always 'true', even if the URL does not exist. With the second code, the output is always 'URL Doesn't Exist', even if the URL actually exists. Am I doing something wrong? I am using PHP version 7.4; is this tool still working? I apologize if the question is not clear.
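For what it's worth, in the first snippet the `echo "true";` is outside any else branch, so it runs in every case; and in the second, strpos() needs a strict `!== false` comparison, because a match at offset 0 is falsy. A sketch with the logic of both checks tightened (the helper names `headersIndicateSuccess` and `urlNotWorking` are my own):

```php
<?php
// Strict success check on the status line from get_headers():
// strpos() may return 0 (falsy) or false, so compare with !== false.
function headersIndicateSuccess($headers): bool {
    return is_array($headers)
        && isset($headers[0])
        && strpos($headers[0], ' 200') !== false;
}

// The cURL variant with the missing else branch restored.
function urlNotWorking(string $url): bool {
    $handle = curl_init($url);
    curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
    curl_exec($handle);
    $httpCode = curl_getinfo($handle, CURLINFO_HTTP_CODE);
    curl_close($handle);
    if ($httpCode == 404) {
        return true;   // "Url not working"
    } else {
        return false;  // without this else the original echoed "true" unconditionally
    }
}
```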

Retrieve file information from URL using PHP

I am trying to retrieve information about a file from the URL containing the file. But how can I get the information about the file before downloading it to my server?
I need file information like file size, file type, etc.
I found the code below to validate and download the file, but how do I get the information from it before actually downloading the file to the server?
<?php
function is_url_exist($url)
{
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_exec($ch);
$code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
$status = ($code == 200); // boolean; previously $status was undefined when the code was not 200
curl_close($ch);
if ($status == true)
{
$name = "abc.png";
if (file_put_contents("uploads/$name", file_get_contents($url)))
echo "file uploaded";
else
echo "error check upload link";
}
}
$url = "http://theonlytutorials.com/wp-content/uploads/2015/06/blog-logo1.png";
echo is_url_exist($url);
?>
You can get all the information about a remote file with the get_headers() function. Try the following code to find out the type, content length, etc.
$url = "http://theonlytutorials.com/wp-content/uploads/2015/06/blog-logo1.png";
$headers = get_headers($url,1);
print_r($headers);
To learn more about get_headers(), see http://php.net/manual/en/function.get-headers.php
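Building on that, the associative form `get_headers($url, 1)` lets you pick out the interesting fields directly. A small sketch (the helper name `fileInfoFromHeaders` is my own; header-name casing varies by server, and when the URL redirects, get_headers() can return an array per header, which this sketch treats as missing):

```php
<?php
// Pick file metadata out of a get_headers($url, 1) style array.
// Header names are case-insensitive on the wire, so normalise keys first.
function fileInfoFromHeaders(array $headers): array {
    $norm = array_change_key_case($headers, CASE_LOWER);
    return [
        'size' => isset($norm['content-length']) && !is_array($norm['content-length'])
            ? (int) $norm['content-length'] : null,
        'type' => isset($norm['content-type']) && !is_array($norm['content-type'])
            ? $norm['content-type'] : null,
    ];
}
```

Usage would be along the lines of `$info = fileInfoFromHeaders(get_headers($url, 1));`, giving something like `['size' => 5120, 'type' => 'image/png']`.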

WP - PHP cURL or get_headers() function leads to the 404 error

1) I am using the WordPress engine.
2) I have a numeric array() with 800+ links in it, like this.
What I'm trying to do is run a foreach() loop and check whether each link still exists (i.e. does not return a 404 error).
I tried 2 functions:
1)
<?php
foreach($links as $link) {
$file_headers = @get_headers($link);
if(strpos($file_headers[0],'404') !== false) {
$toDeleteLinks[] = $link;
}
}
?>
so according to this first function, the $toDeleteLinks array should contain all the links that return a 404 error. Using the get_headers() function here...
2)
<?php
foreach($links as $link) {
$handle = curl_init($link);
curl_setopt($handle, CURLOPT_RETURNTRANSFER, TRUE);
$response = curl_exec($handle);
$httpCode = curl_getinfo($handle, CURLINFO_HTTP_CODE);
if($httpCode == 404) {
$toDeleteLinks[] = $link;
}
curl_close($handle);
}
?>
This second one should do the same, just using cURL.
BUT in both cases I get redirected to the WordPress 404.php page. I think that's because of the big number of links.
Can you please help me find a solution for this? Using another function instead, or however it can be done...
Thanks.
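Checking 800+ links inside one web request can easily exceed PHP's max_execution_time, which on a WordPress site often surfaces as the theme's 404/error page. A hedged sketch (the function name `find404Links` is my own) that lifts the limit, reuses one cURL handle, and caps each request with a timeout; ideally run it from WP-Cron or the CLI rather than a page load:

```php
<?php
// Collect the links that answer 404. One cURL handle is reused and each
// request is bounded so a single slow host cannot stall the whole run.
function find404Links(array $links): array {
    set_time_limit(0); // lift max_execution_time for a long batch
    $dead = [];
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_NOBODY, true);          // headers only, no body
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);     // seconds to connect
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);           // seconds per request
    foreach ($links as $link) {
        curl_setopt($ch, CURLOPT_URL, $link);
        curl_exec($ch);
        if (curl_getinfo($ch, CURLINFO_HTTP_CODE) == 404) {
            $dead[] = $link;
        }
    }
    curl_close($ch);
    return $dead;
}
```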

How to detect if an external website is working or not?

How can I detect if an external website is working or not? I have thought about using the HTTP status code. In general, something like:
if ( <<something(url)>> == 200 ) {
// website defined in url working (up)
} else {
// website defined in url not working (down)
}
200 is the code that indicates a successful request for a URL. At least that is how I understood it from reading here: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
Try using cURL. The example is adapted from curl_getinfo():
// Create a curl handle
$ch = curl_init('http://stackoverflow.com/');
//Return only headers
curl_setopt($ch, CURLOPT_NOBODY, true);
// Execute
curl_exec($ch);
// Check if any error occurred
if(!curl_errno($ch))
$info = curl_getinfo($ch);
// Close handle
curl_close($ch);
if ( isset($info) && $info['http_code'] == 200 )
echo "Website is up!";
else
echo "Website is down.";

PHP Curl check for file existence before downloading

I am writing a PHP program that downloads a PDF from a backend and saves it to a local drive. Now, how do I check whether the file exists before downloading it?
Currently I am using cURL (see the code below) to check and download, but it still downloads the file, which is 1 KB in size.
$url = "http://wedsite/test.pdf";
$path = "C:\\test.pdf";
downloadAndSave($url,$path);
function downloadAndSave($urlS,$pathS)
{
$fp = fopen($pathS, 'w');
$ch = curl_init($urlS);
curl_setopt($ch, CURLOPT_FILE, $fp);
$data = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
echo $httpCode;
//If 404 is returned, then file is not found.
if($httpCode == 404) // strcmp() returns 0 on equality, so comparing it to 1 was wrong
{
echo $httpCode;
echo $urlS;
}
fclose($fp);
}
I want to check whether the file exists before even downloading. Any idea how to do it?
You can do this with a separate curl HEAD request:
curl_setopt($ch, CURLOPT_NOBODY, true);
$data = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
When you actually want to download, you can set NOBODY to false.
Call this before your download function and it's done:
<?php function remoteFileExists($url) {
$curl = curl_init($url);
//don't fetch the actual page, you only want to check the connection is ok
curl_setopt($curl, CURLOPT_NOBODY, true);
//do request
$result = curl_exec($curl);
$ret = false;
//if request did not fail
if ($result !== false) {
//if request was ok, check response code
$statusCode = curl_getinfo($curl, CURLINFO_HTTP_CODE);
if ($statusCode == 200) {
$ret = true;
}
}
curl_close($curl);
return $ret;
}
?>
Since you are using HTTP to fetch a resource on the internet, what you really want to check is whether the return code is a 404.
On some PHP installations, you can just use file_exists($url) out of the box. This does not work in all environments, however. http://www.php.net/manual/en/wrappers.http.php
Here is a function much like file_exists but for URLs, using curl:
<?php function curl_exists($url) {
$file_headers = @get_headers($url);
if($file_headers[0] == 'HTTP/1.1 404 Not Found') {
$exists = false;
}
else {
$exists = true;
}
return $exists;
} ?>
source: http://www.php.net/manual/en/function.file-exists.php#75064
Sometimes the CURL extension isn't installed with PHP. In that case you can still use the socket library in the PHP core:
<?php function url_exists($url) {
$a_url = parse_url($url);
if (!isset($a_url['port'])) $a_url['port'] = 80;
$errno = 0;
$errstr = '';
$timeout = 30;
if(isset($a_url['host']) && $a_url['host']!=gethostbyname($a_url['host'])){
$fid = fsockopen($a_url['host'], $a_url['port'], $errno, $errstr, $timeout);
if (!$fid) return false;
$page = isset($a_url['path']) ?$a_url['path']:'';
$page .= isset($a_url['query'])?'?'.$a_url['query']:'';
fputs($fid, 'HEAD '.$page.' HTTP/1.0'."\r\n".'Host: '.$a_url['host']."\r\n\r\n");
$head = fread($fid, 4096);
$head = substr($head,0,strpos($head, 'Connection: close'));
fclose($fid);
if (preg_match('#^HTTP/.*\s+(200|302)\s#i', $head)) {
$pos = strpos($head, 'Content-Type');
return $pos !== false;
}
return false;
} else {
return false;
}
} ?>
source: http://www.php.net/manual/en/function.file-exists.php#73175
An even faster function can be found here:
http://www.php.net/manual/en/function.file-exists.php#76246
In the first example above, $file_headers[0] may contain more than, or something other than, 'HTTP/1.1 404 Not Found', e.g.:
HTTP/1.1 404 Document+%2Fdb%2Fscotbiz%2Freports%2FR20131212%2Exml+not+found
So it's important to use some other test, e.g. a regex, as '==' is not reliable.
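A regex along those lines, tolerant of both the HTTP version and the reason phrase, might look like this (the helper name `isNotFoundStatusLine` is my own):

```php
<?php
// Match a 404 status line regardless of HTTP version or reason phrase,
// e.g. "HTTP/1.1 404 Not Found" or "HTTP/2 404".
function isNotFoundStatusLine(string $line): bool {
    return (bool) preg_match('#^HTTP/[\d.]+\s+404\b#', $line);
}
```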
