How to check if a link is a downloadable file in PHP? [duplicate]

I'm trying to make a broken link checker with PHP.
I modified some PHP code I found online; I'm not a PHP programmer.
It lets some unbroken links into the report, but that's OK.
However, I have a problem with all presentations, zips, and so on.
Basically, if the link is a download, the algorithm thinks it's a dead link.
$servername = "localhost";
$username = "";
$password = "";
try {
    $conn = new PDO("mysql:host=$servername;dbname=test", $username, $password);
    // set the PDO error mode to exception
    $conn->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    echo "Connected successfully" . "<br />";
    echo "----------------------------------------------------<br />";
} catch (PDOException $e) {
    echo "Connection failed: " . $e->getMessage();
}
$sql = "SELECT object,value FROM metadata where xpath = 'lom/technical/location'";
$result = $conn->query($sql)->fetchAll(PDO::FETCH_ASSOC);
$array_length = sizeof($result); //26373
//$array_length = 26373;
$i = 0;
$myfile = fopen("Lom_Link_patikra1.csv", "w") or die("Unable to open file!");
// CSV header, in English: "Object;Link;Error code;"
$menu_juosta = "Objektas;Nuoroda;Klaidos kodas;\n";
for ($i; $i < $array_length; $i++) {
    $new_id = $result[$i]["object"];
    $sql1 = "SELECT published from objects where id ='$new_id'";
    $result_published = $conn->query($sql1)->fetchAll(PDO::FETCH_ASSOC);
    //print_r ($result_published);
    if ($result_published[0]["published"] != 0) {
        $var1 = $result[$i]["value"];
        $var1 = str_replace('|experience|902', '', $var1);
        $var1 = str_replace('|packed_in|897', '', $var1);
        $var1 = str_replace('|packed_in|911', '', $var1);
        $var1 = str_replace('|packed_in|895', '', $var1);
        $request_response = check_url($var1); // page response
        if ($request_response != 200) {
            $my_object = $result[$i]["object"] . ";" . $var1 . ";" . $request_response . ";\n";
            fwrite($myfile, $my_object);
        }
    }
}
$conn = null;
function check_url($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 1);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $data = curl_exec($ch);
    $headers = curl_getinfo($ch);
    return $headers['http_code'];
}
Link example :
Any solutions, advice?
Thank you all for the help. Now it works much faster. It seems there is a problem with blank spaces, but that's even intriguing.
As it turns out, my problem was in understanding how HTTP status codes work: what they return and why. The links that I had marked as bad, but that were actually working, returned 301 or 302 (redirects).
Thank you all for the help.
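The status-code point in the update can be made concrete. Here is a minimal sketch, assuming that any 2xx or 3xx response should count as a working link; is_live_status() is a hypothetical helper name and is not part of the original script.

```php
<?php
// Hypothetical helper: decide whether an HTTP status code means the link is alive.
// Assumption: 2xx (success) and 3xx (redirect) both count as "not broken".
function is_live_status(int $code): bool
{
    return $code >= 200 && $code < 400;
}

foreach ([200, 301, 302, 404, 0] as $code) {
    // A code of 0 is what curl_getinfo() reports when no HTTP response was received.
    echo $code, ' => ', is_live_status($code) ? 'ok' : 'broken', "\n";
}
```

In the loop from the question, a test such as `if (!is_live_status($request_response))` could take the place of the `!= 200` comparison.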

Using CURL for remote file
function checkRemoteFile($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL,$url);
    // don't download content
    curl_setopt($ch, CURLOPT_NOBODY, 1);
    curl_setopt($ch, CURLOPT_FAILONERROR, 1);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    // with CURLOPT_FAILONERROR set, curl_exec() returns false for HTTP codes >= 400
    if (curl_exec($ch) !== false) {
        return true;
    }
    return false;
}
EDIT: I may have misunderstood you, but if you just want to check whether the URL actually exists, then the code below will be all you need.
function url_exists($url) {
{return 1;}
{return 0;}

Setting CURLOPT_NOBODY to TRUE makes cURL send an HTTP HEAD request instead of a GET request, so try using curl_setopt( $ch, CURLOPT_NOBODY, true );
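As an illustration of the HEAD approach, here is a hedged sketch that also follows redirects and returns the final status code. The function name head_status() and the timeout values are assumptions made for the example.

```php
<?php
// Sketch: send a HEAD request, follow redirects, and return the final status code.
// head_status() is a hypothetical name; the timeout values are arbitrary assumptions.
function head_status(string $url): int
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);          // HEAD instead of GET
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow 301/302 to the final target
    curl_setopt($ch, CURLOPT_MAXREDIRS, 5);          // give up on redirect loops
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // do not echo anything
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_exec($ch);
    // 0 means no HTTP response was received (DNS failure, timeout, refused connection).
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}
```

Because no body is transferred, large files such as zips or presentations are not downloaded during the check. Some servers answer HEAD differently from GET, so a fallback GET request may still be needed for those.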

Try to use the file_exists method:


curl command not working on bigrock server…?

$db = new database;
if($row["phn_no1"]==$phn || $row["phn_no2"]==$phn || $row["phn_no3"]==$phn)
$formatted = "".substr($phn,6,10)." ";
$password = $formatted + $adm;
echo $password;
$pre = 'PREFIX';
$suf = '%20ThankYou';
$sms = $pre.$password.$suf;
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
$result = curl_exec($ch);
This code works perfectly on my localhost, but when I put it on the server it takes a lot of time, and the code in the cURL command does not work; it only goes on to the next page. I checked that cURL is enabled. If I use only the SMS API without the cURL command, it sends the SMS immediately. But I want to run both the header call and the cURL request, and I also want to hide my SMS API. Is there any alternative to this?
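When a cURL call works locally but stalls on a hosted server, a useful first step is to make the failure visible. The sketch below assumes nothing about the hosting setup; fetch_with_diagnostics() is a hypothetical helper and the timeout values are arbitrary.

```php
<?php
// Sketch: run a cURL request with explicit timeouts and report why it failed.
// fetch_with_diagnostics() is a hypothetical helper; the timeouts are assumptions.
function fetch_with_diagnostics(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);   // fail fast if no connection can be made
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);         // upper bound for the whole request
    $body = curl_exec($ch);
    return [
        'ok'    => $body !== false,
        'body'  => $body === false ? '' : $body,
        'errno' => curl_errno($ch),                // 0 when there was no cURL-level error
        'error' => curl_error($ch),                // human-readable error text
        'code'  => (int) curl_getinfo($ch, CURLINFO_HTTP_CODE),
    ];
}
```

Logging the 'errno' and 'error' values on the server shows whether the request timed out, failed to resolve the host, or was refused, which narrows down whether the host restricts outgoing connections.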

Using cURL and PHP for CACTI in Windows

I was recently tasked with monitoring external webpage response/loading time via CACTI. I found some PHP scripts that worked (pageload-agent.php and class.pageload.php) using cURL. All was working fine until they requested that it be transferred from Linux to a Windows 2012 R2 server. I'm having a very hard time modifying the scripts to work on Windows. I have already installed PHP and cURL, and both work as tested. Here are the scripts, taken from askaboutphp.
class PageLoad {
    var $siteURL = "";
    var $pageInfo = "";
    // sets the URL to check for loadtime
    function setURL($url) {
        if (!empty($url)) {
            $this->siteURL = $url;
            return true;
        }
        return false;
    }
    // extract the header information of the url
    function doPageLoad() {
        $u = $this->siteURL;
        if(function_exists('curl_init') && !empty($u)) {
            $ch = curl_init($u);
            curl_setopt($ch, CURLOPT_HEADER, true);
            curl_setopt($ch, CURLOPT_ENCODING, "gzip");
            curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
            curl_setopt($ch, CURLOPT_NOBODY, false);
            curl_setopt($ch, CURLOPT_FRESH_CONNECT, false);
            curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)");
            $pageBody = curl_exec($ch);
            $this->pageInfo = curl_getinfo($ch);
            curl_close ($ch);
            return true;
        }
        return false;
    }
    // compile the page load statistics only
    function getPageLoadStats() {
        $info = $this->pageInfo;
        //stats from info
        $s = array();
        $s['dest_url'] = $info['url'];
        $s['content_type'] = $info['content_type'];
        $s['http_code'] = $info['http_code'];
        $s['total_time'] = $info['total_time'];
        $s['size_download'] = $info['size_download'];
        $s['speed_download'] = $info['speed_download'];
        $s['redirect_count'] = $info['redirect_count'];
        $s['namelookup_time'] = $info['namelookup_time'];
        $s['connect_time'] = $info['connect_time'];
        $s['pretransfer_time'] = $info['pretransfer_time'];
        $s['starttransfer_time'] = $info['starttransfer_time'];
        return $s;
    }
}
#! /usr/bin/php -q
//include the class
include_once 'class.pageload.php';
// read in an argument - must make sure there's an argument to use
if ($argc==2) {
    //read in the arg.
    $url_argv = $argv[1];
    if (!eregi('^http://', $url_argv)) {
        $url_argv = "http://$url_argv";
    }
    // check that the arg is not empty
    if ($url_argv!="") {
        //initiate the results array
        $results = array();
        //initiate the class
        $lt = new PageLoad();
        //set the page to check the loadtime
        $lt->setURL($url_argv);
        //load the page
        if ($lt->doPageLoad()) {
            //load the page stats into the results array
            $results = $lt->getPageLoadStats();
        } else {
            //do nothing
            print "";
        }
        //print out the results
        if (is_array($results)) {
            //expecting only one record as we only passed in 1 page.
            $output = $results;
            print "dns:".$output['namelookup_time'];
            print " con:".$output['connect_time'];
            print " pre:".$output['pretransfer_time'];
            print " str:".$output['starttransfer_time'];
            print " ttl:".$output['total_time'];
            print " sze:".$output['size_download'];
            print " spd:".$output['speed_download'];
        } else {
            //do nothing
            print "";
        }
    } else {
        //do nothing
        print "";
    }
}
Thank you. Any type of assistance is greatly appreciated.
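One possible cause worth checking, assuming the Windows server runs PHP 7 or later: eregi() was removed in PHP 7.0, so the agent script would stop with a fatal error at that call. A minimal sketch of an equivalent check with preg_match() follows; ensure_http_scheme() is a hypothetical helper name.

```php
<?php
// Hypothetical helper reproducing the eregi('^http://', ...) check with preg_match().
function ensure_http_scheme(string $url): string
{
    // The "i" flag keeps the match case-insensitive, as eregi() was.
    if (!preg_match('#^http://#i', $url)) {
        $url = "http://$url";
    }
    return $url;
}

echo ensure_http_scheme('example.com'), "\n";        // http://example.com
echo ensure_http_scheme('HTTP://example.com'), "\n"; // HTTP://example.com
```

Widening the pattern to '#^https?://#i' would also leave https URLs untouched, which the original check does not do. Note also that Windows does not use the `#!` line to launch a script, so Cacti would have to invoke the PHP executable explicitly with the script path and the URL as arguments.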

PHP's strlen function behaving strangely

Please consider the following code:
$imagePath = "";
$imagedata = getData($imagePath); // get the image data through cURL and store it in this variable
echo strlen($imagedata); // outputs 4699
if(strlen($imagedata) == 4699 ) {
    echo "length is 4699";
}
The above if-condition never becomes true even though the strlen value is 4600. It seems very strange; am I missing anything? I've already tried mb_strlen, but to no avail.
It seems to work for certain images, but not for the following image.
$strImageURL = "";
$strImageData = getData($strImageURL);
echo "<br />" . strlen($strImageData);
if(strlen($strImageData) === 4699) {
    echo "true";
}
function getData($strSubmitURL)
{
    $strData = null;
    $ch = curl_init();
    //return parameter
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 120);
    curl_setopt($ch, CURLOPT_TIMEOUT, 140);
    //site name
    curl_setopt($ch, CURLOPT_URL,$strSubmitURL);
    // don't verify the SSL peer
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER,false);
    $strData = curl_exec ($ch);
    if (!$strData) {
        //die ("cURL error: " . curl_error ($ch) . "\n");
        return '';
    }
    curl_close ($ch);
    return $strData;
}
This works for me:
$url = "";
$imagedata = file_get_contents($url);
//echo strlen($imagedata); // outputs 1020
if(strlen($imagedata) == 1020 ) {
    echo "length is 1020";
}
And as further troubleshooting, I would try a var_dump(get_defined_vars()); at the end of your code and inside the if statement to see what's going on.
Using your url, and also putting in a var dump twice:
$url = "";
$imagedata = file_get_contents($url);
$strlen = strlen($imagedata); // outputs 4669
if($strlen == 4669 ) {
    echo "length is 4669 \n";
}
PhpMate running PHP 5.3.15 with (/usr/bin/php)
>>> untitled
length is 4669
strlen() counts the bytes in a string, and one character does not necessarily equal 1 byte in UTF-8.
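To illustrate the byte-versus-character distinction, here is a small sketch, assuming PHP 7 or later for the "\u{...}" escape. The code-point count uses a UTF-8 regular expression so that it does not depend on the mbstring extension.

```php
<?php
// "\u{00E9}" is the letter e with an acute accent: one character, two bytes in UTF-8.
$text = "caf\u{00E9}";

$bytes = strlen($text);                   // counts bytes
$chars = preg_match_all('/./us', $text);  // counts UTF-8 code points

echo "bytes: $bytes, characters: $chars\n"; // bytes: 5, characters: 4
```

For binary data such as an image, the byte count from strlen() is the figure that matters, so a multibyte-aware function is not expected to change the result there.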
Another issue could be type-casting since PHP has such loose rules about this. Would this work for you?
if((int)strlen($imagedata) == 4600 ) {
    echo "length is 4600";
}
if(strlen($imagedata) == '4600' ) {
    echo "length is 4600";
}

parsing multiple urls from the form

I'm trying to write a script that searches a list of URLs given in a form for email addresses. Could anyone advise me how to do it? Is there some alternative to cURL?
I tried to do it with file_get_contents, but the script analyzes only the last URL given in the form: when I enter, for example, two URLs in the form, the first "print_r("show current_url:". $current_url); is empty, and for the second it shows the page (URL) content (without pictures).
I asked on different forums but received no answer. I would really appreciate your help.
Thank you
$urls = explode("\n", $_POST['urls']);
$db = new mysqli('localhost', 'root', 'root', 'urls');
if (mysqli_connect_errno()) {
    echo 'Błąd: '; // "Error: "
}
for ($i=0; $i<count($urls); $i++){
    print_r("show link:". $urls[$i]."<br>");
    $current_url = file_get_contents($urls[$i]);
    print_r("show current_url:". $current_url);
    preg_match( "/[\._a-zA-Z0-9-]+@[\._a-zA-Z0-9-]+/i", $current_url, $email);//email
    print_r ("show email:".$email[0]);
    $query = "INSERT INTO urle set adres = '$email[0]' ";
    $result = $db->query($query);
    if ($query) {
        echo $db->affected_rows ."pozycji dodano."; // "rows added."
    } else {
        echo mysql_errno() . ":" . mysql_error() . "Wystąpił błąd przy dodawaniu urli "; // "An error occurred while adding URLs"
    }
}
I have tried with cURL. var_dump($email); shows: array(0) { }
The script now displays all of the URLs given in the form in the browser, but preg_match doesn't work, so it doesn't extract the email addresses.
$urls = explode("\n", $_POST['urls']);
$db = new mysqli('localhost', 'root', 'root', 'linki');
if (mysqli_connect_errno()) {
    echo 'Błąd: '; // "Error: "
}
for ($i=0; $i<count($urls); $i++){
    $url = $urls[$i];
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_URL, $url);
    $output = curl_exec($ch);
    preg_match( "/[\._a-zA-Z0-9-]+@[\._a-zA-Z0-9-]+/i", $output, $email);//email
    $query = "INSERT INTO urle set adres = '$email[0]' ";
    $result = $db->query($query);
    if ($result) {
        echo $db->affected_rows ."pozycji dodano."; // "rows added."
    } else {
        echo mysql_errno() . ":" . mysql_error() . "Wystąpił błąd przy dodawaniu urli "; // "An error occurred while adding URLs"
    }
}
Is there some alternative to cURL?
file_get_contents, which doesn't give you any error messages (unless error_reporting is raised), and which is often blocked unless ini_set("user_agent", ...) was set.
Alternatively HttpRequest on newer PHP installations.
Still curl is not difficult to use. The manual is full of examples.
the first "print_r("show current_url:". $current_url); is empty
Nobody can tell. It's your duty to debug that (especially since you haven't mentioned the affected url in your question). Use curl or httprequest.
OK, I've fixed it! :)
Here is the code:
for ($i=0; $i<count($linki); $i++){
    $url = $linki[$i];
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_URL, $url);
    // return the page as a string instead of printing it, so preg_match has something to search
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $result =curl_exec($ch);
    preg_match("/[-a-z0-9\._]+@[-a-z0-9\._]+\.[a-z]{2,4}/", $result, $email);//email
    $zapytanie = "INSERT INTO urle set adres = '$email[0]' ";
    $wynik = $db->query($zapytanie);
}
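One plausible explanation for the symptom in the question (every URL except the last comes back empty), assuming the form field is a textarea: browsers submit line breaks as "\r\n", so explode("\n", ...) leaves a trailing "\r" on every URL but the last. A minimal sketch of a more tolerant split follows; parse_url_list() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: split textarea input into clean URLs.
function parse_url_list(string $raw): array
{
    // \R matches any line break (\r\n, \n or \r), so no stray "\r" is left behind.
    $lines = preg_split('/\R/', $raw);
    // Remove surrounding whitespace from every line.
    $lines = array_map('trim', $lines);
    // Drop empty lines and reindex the array from 0.
    return array_values(array_filter($lines, 'strlen'));
}

print_r(parse_url_list("http://a.example/\r\nhttp://b.example/\r\n"));
```

With this helper, `$urls = parse_url_list($_POST['urls']);` could stand in for the explode() call at the top of the script.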

PHP: Check if URL redirects?

I have implemented a function that runs on each page that I want to restrict from non-logged-in users. The function automatically redirects the visitor to the login page if he or she is not logged in.
I would like to make a PHP function that runs from an external server and iterates through a set of URLs (an array with the URL of each protected page) to see whether they are redirected or not. That way I could easily check that protection is up and running on every page.
How could this be done?
$urls = array(
);
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
foreach($urls as $url) {
    curl_setopt($ch, CURLOPT_URL, $url);
    $out = curl_exec($ch);
    // line endings is the wonkiest piece of this whole thing
    $out = str_replace("\r", "", $out);
    // only look at the headers
    $headers_end = strpos($out, "\n\n");
    if( $headers_end !== false ) {
        $out = substr($out, 0, $headers_end);
    }
    $headers = explode("\n", $out);
    foreach($headers as $header) {
        if( substr($header, 0, 10) == "Location: " ) {
            $target = substr($header, 10);
            echo "[$url] redirects to [$target]<br>";
            continue 2;
        }
    }
    echo "[$url] does not redirect<br>";
}
I use cURL and fetch only the headers, then compare my URL with the effective URL that cURL reports:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_TIMEOUT, '60'); // in seconds
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_NOBODY, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$res = curl_exec($ch);
if(curl_getinfo($ch)['url'] == $url){
    echo "not redirect";
}else {
    echo "redirect";
}
You could always try adding:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
since 302 means it moved, allow the curl call to follow it and return whatever the moved url returns.
Getting the headers with get_headers() and checking if Location is set is much simpler.
$urls = [
];
foreach ($urls as $key => $url) {
    $is_redirect = does_url_redirect($url) ? 'yes' : 'no';
    echo $url . ' is redirected: ' . $is_redirect . PHP_EOL;
}
function does_url_redirect($url){
    $headers = get_headers($url, 1);
    if (!empty($headers['Location'])) {
        return true;
    } else {
        return false;
    }
}
I'm not sure whether this really makes sense as a security check.
If you are worried about files getting called directly without your "is the user logged in?" checks being run, you could do what many big PHP projects do: In the central include file (where the security check is being done) define a constant BOOTSTRAP_LOADED or whatever, and in every file, check for whether that constant is set.
Testing is great and security testing is even better, but I'm not sure what kind of flaw you are looking to uncover with this? To me, this idea feels like a waste of time that will not bring any real additional security.
Just make sure your script die()s after the header("Location:...") redirect. That is essential to stop additional content from being displayed after the header command (a missing die() wouldn't be caught by your idea, by the way, as the redirect header would still be issued).
If you really want to do this, you could also use a tool like wget and feed it a list of URLs. Have it fetch the results into a directory, and check (e.g. by looking at the file sizes that should be identical) whether every page contains the login dialog. Just to add another option...
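The constant-based guard described in this answer could look roughly like the following. This is a sketch under assumptions: the two parts would normally live in separate files (a central include and each protected page), and the 403 response is an illustrative choice rather than anything the answer prescribes.

```php
<?php
// Part 1, in the central include file (hypothetical bootstrap.php):
// run the login check there, then mark the bootstrap as loaded.
define('BOOTSTRAP_LOADED', true);

// Part 2, at the top of every other file:
// refuse to run when the file is requested directly, without the bootstrap.
if (!defined('BOOTSTRAP_LOADED')) {
    http_response_code(403);
    exit('Direct access is not allowed.');
}

echo "bootstrap guard passed\n";
```

The same exit() discipline applies to the login redirect itself: after header("Location: ...") the script should stop immediately so that no protected content is sent along with the redirect.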
Do you want to check the HTTP code to see if it's a redirect?
$params = array('http' => array(
    'method' => 'HEAD',
    'ignore_errors' => true
));
$context = stream_context_create($params);
foreach(array('', '') as $url) {
    $fp = fopen($url, 'rb', false, $context);
    $result = stream_get_contents($fp);
    if ($result === false) {
        throw new Exception("Could not read data from {$url}");
    } else if (! strstr($http_response_header[0], '301')) {
        // Do something here
    }
}
I hope it will help you:
function checkRedirect($url)
{
    $headers = get_headers($url);
    if ($headers) {
        if (isset($headers[0])) {
            if ($headers[0] == 'HTTP/1.1 302 Found') {
                //this is the URL where it's redirecting
                return str_replace("Location: ", "", $headers[9]);
            }
        }
    }
    return false;
}
$isRedirect = checkRedirect($url);
if(!$isRedirect )
{
    echo "URL Not Redirected";
}
else
{
    echo "URL Redirected to: ".$isRedirect;
}
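The fixed index $headers[9] only works when the server happens to send the Location header in that position. A more tolerant sketch, assuming the input is a list of raw header lines such as the array returned by get_headers($url); find_location() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: find the redirect target in a list of raw header lines,
// such as the array returned by get_headers($url).
function find_location(array $headers): ?string
{
    foreach ($headers as $line) {
        // Header names are case-insensitive, so compare with stripos().
        if (is_string($line) && stripos($line, 'Location:') === 0) {
            return trim(substr($line, strlen('Location:')));
        }
    }
    return null; // no Location header, so no redirect was announced
}

$sample = ['HTTP/1.1 302 Found', 'Content-Type: text/html', 'location: https://example.com/login'];
echo find_location($sample), "\n"; // https://example.com/login
```

Searching by header name also covers 301, 303, 307 and 308 responses, which the exact comparison with 'HTTP/1.1 302 Found' would miss.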
You can use a session: if the session array is not set, the URL is redirected to a login page.
I modified Adam Backstrom's answer and implemented chiborg's suggestion (download only the HEAD). It does one more thing: it checks whether the redirection goes to a page on the same server or to a different one. For example, when an address redirects to another page on the same site, PHP considers it a redirect, and that is correct. But I only wanted to list the URLs that redirect to another URL. My English is not good, so if someone finds something really difficult to understand and can edit this, you're welcome.
function RedirectURL() {
    $urls = array('','');
    foreach ($urls as $url) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_HEADER, true);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        // chiborg suggestion
        curl_setopt($ch, CURLOPT_NOBODY, true);
        // ================================
        // ================================
        curl_setopt($ch, CURLOPT_URL, $url);
        $out = curl_exec($ch);
        // line endings is the wonkiest piece of this whole thing
        $out = str_replace("\r", "", $out);
        echo $out;
        $headers = explode("\n", $out);
        foreach($headers as $header) {
            if(substr(strtolower($header), 0, 9) == "location:") {
                // read the URL to check whether it redirects to some page on the same server or to another one.
                // a redirect to a page on the same server is valid,
                // but a redirect to a different server is invalid.
                // what we want is to check whether the address stays the same or changes. if it changes, print it on the page.
                // if it contains http, we will check whether the url changes or not.
                // some servers, to redirect to a folder available on them, cite only the folder. Example: redirect only to /heiden
                // only execute if the location contains http
                if ( strpos(strtolower($header), "http") !== false) {
                    $address = explode("/", $header);
                    // $address['0'] = http
                    // $address['1'] =
                    // $address['2'] =
                    // $address['3'] = portal
                    echo "url (address from array) = " . $url . "<br>";
                    echo "address[2] = " . $address['2'] . "<br><br>";
                    // url:
                    // address['2'] =
                    // check whether the original address is still present. It indicates that the server did not redirect to some page away from here.
                    if(strpos(strtolower($address['2']), strtolower($url)) !== false) {
                    } else {
                        // not the same. (areiaebrita)
                    }
                }
            }
        }
    }
}
function unshorten_url($url){
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_HEADER, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_URL, $url);
    $out = curl_exec($ch);
    $real_url = $url;//default.. (if no redirect)
    if (preg_match("/location: (.*)/i", $out, $redirect))
        $real_url = trim($redirect[1]);//trim the trailing "\r" of the header line
    if (strstr($real_url, ""))//the redirect is another shortened url
        $real_url = unshorten_url($real_url);
    return $real_url;
}
I have just made a function that checks if a URL exists or not
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
function url_exists($url, $ch) {
    curl_setopt($ch, CURLOPT_URL, $url);
    $out = curl_exec($ch);
    // line endings is the wonkiest piece of this whole thing
    $out = str_replace("\r", "", $out);
    // only look at the headers
    $headers_end = strpos($out, "\n\n");
    if( $headers_end !== false ) {
        $out = substr($out, 0, $headers_end);
    }
    //echo $out."====<br>";
    $headers = explode("\n", $out);
    //echo "<pre>";
    foreach($headers as $header) {
        //echo $header."---<br>";
        if( strpos($header, 'HTTP/1.1 200 OK') !== false ) {
            return true;
        }
    }
    return false;
}
Now I have used an array of URLs to check if a URL exists as following:
$my_url_array = array('', '', '', '');
for($j = 0; $j < count($my_url_array); $j++){
    if(url_exists($my_url_array[$j], $ch)){
        echo 'This URL "'.$my_url_array[$j].'" exists. <br>';
    }
}
I can't understand your question.
You have an array of URLs and you want to know whether the user comes from one of the listed URLs?
If I've understood your question correctly:
$urls = array('','','');
echo 'FROM ARRAY';
} else {
echo 'NOT FROM ARR';
