Converting a filepath to a url securely and reliably - php

I'm using php and I have the following code to convert an absolute path to a url.
function make_url($path, $secure = false){
return (!$secure ? 'http://' : 'https://').str_replace($_SERVER['DOCUMENT_ROOT'], $_SERVER['HTTP_HOST'], $path);
}
My question is basically, is there a better way to do this in terms of security / reliability that is portable between locations and servers?

The HTTP_HOST variable is not a reliable or secure value as it is also being sent by the client. So be sure to validate its value before using it.

I don't think security is going to be effected, simply because this is a url, being printed to a browser... the worst that can happen is exposing the full directory path to the file, and potentially creating a broken link.
As a little side note, if this is being printed in a HTML document, I presume you are passing the output though something like htmlentities... just in-case the input $path contains something like a [script] tag (XSS).
To make this a little more reliable though, I wouldn't recommend matching on 'DOCUMENT_ROOT', as sometimes its either not set, or won't match (e.g. when Apache rewrite rules start getting in the way).
If I was to re-write it, I would simply ensure that 'HTTP_HOST' is always printed...
function make_url($path, $secure = false){
return (!$secure ? 'http://' : 'https://').$_SERVER['HTTP_HOST'].str_replace($_SERVER['DOCUMENT_ROOT'], '', $path);
}
... and if possible, update the calling code so that it just passes the path, so I don't need to even consider removing the 'DOCUMENT_ROOT' (i.e. what happens if the path does not match the 'DOCUMENT_ROOT')...
function make_url($path, $secure = false){
return (!$secure ? 'http://' : 'https://').$_SERVER['HTTP_HOST'].$path;
}
Which does leave the question... why have this function?
On my websites, I simply have a variable defined at the beggining of script execution which sets:
$GLOBALS['webDomain'] = 'http://' . (isset($_SERVER['HTTP_HOST']) ? $_SERVER['HTTP_HOST'] : '');
$GLOBALS['webDomainSSL'] = $GLOBALS['webDomain'];
Where I use GLOBALS so it can be accessed anywhere (e.g. in functions)... but you may also want to consider making a constant (define), if you know this value won't change (I sometimes change these values later in a site wide configuration file, for example, if I have an HTTPS/SSL certificate for the website).

I think this is the wrong approach.
URLs in a HTML support relative locations. That is, you can do link to refer to a page that has the same path in its URL as the corrent page. You can also do link to provide a full path to the same website. These two tricks mean your website code doesn't really need to know where it is to provide working URLs.
That said, you might need some tricks so you can have one website on http://www.example.com/dev/site.php and another on http://www.example.com/testing/site.php. You'll need some code to figure out which directory prefix is being used, but you can use a configuration value to do that. By which I mean a value that belongs to that (sub-)site's configuration, not the version-controlled code!

Related

PHP: Get current directory index file name

I must not be phrasing this question right because I couldn't find an answer to this but surely it's been asked before. How do I get the current filename from a URL if it's the directory's index file?
I.e. This will get index.html if I'm on www.example.com/index.html
$url = basename($_SERVER['REQUEST_URI']);
But that won't work if i'm on www.example.com. The only thing I've come up with so far is something like this:
$url = basename($_SERVER['REQUEST_URI']);
if($url == "") {
$filename = "index.html";
}
But that's obviously a bad solution because I may actually be on index.htm or index.php. Is there a way to determine this accurately?
$_SERVER['SCRIPT_FILENAME'] will determine the full path of the currently executing PHP file. And $_SERVER['SCRIPT_NAME'] returns just the file name.
This is one of the other methods.
$url = basename($_SERVER['REQUEST_URI']);
$urlArray = explode("/",$url));
$urlArray = array_reverse($urlArray);
echo $urlArray[0];
Unfortunately, the $_SERVER array entries may not always be available by your server. Some may be omitted, some not. With a little testing though, you can easily find out what your server will output for these entries. On my servers (usually Apache) I find that $_SERVER['DOCUMENT_ROOT'] usually gets me the base of the URI I'm after. This also works well for me in the production environment I work on (XAMPP). As my URI will have a localhost root. I have seen people encourage DOCUMENT_ROOT before in this situation. You can read all about the $_SERVER array here.
In this example, I get the following results:
echo $_SERVER['DOCUMENT_ROOT'] ; // outputs http://example.com
If you are working in a production environment this is very helpful because you won't have to modify your URL's when you go live:
echo $_SERVER['DOCUMENT_ROOT'] ; // outputs C:/xampp/htdocs/example
'DOCUMENT_ROOT' The document root directory under which the current
script is executing, as defined in the server's configuration file.
You could find the last occurrence of the slash / using strrchr() and simply extract the rest using substr(). The optional parameter in substr() tells where to begin. With one we skip the slash /. If you want to keep it, just set the parameter to 0.
echo substr( strrchr( "http://example.com/index.html" , "/" ) , 1 ) ; // outputs index.html
EDIT: Considering that not every server will provide $_SERVER with entities, my approach might be more reliable. That is, if the URL you pass to strrchr() is reliable. In either case, make sure you test the different outputs from $_SERVER, or your paths you provide.

Get current domain

I have my site on the server http://www.myserver.uk.com.
On this server I have two domains:
one.com and two.com
I would like to get the current domain using PHP, but if I use $_SERVER['HTTP_HOST'] then it is showing me
myserver.uk.com
instead of:
one.com or two.com
How can I get the domain, and not the server name?
Try using this:
$_SERVER['SERVER_NAME']
Or parse:
$_SERVER['REQUEST_URI']
Reference: apache_request_headers()
The best use would be
echo $_SERVER['HTTP_HOST'];
And it can be used like this:
if (strpos($_SERVER['HTTP_HOST'], 'banana.com') !== false) {
echo "Yes this is indeed the banana.com domain";
}
This code below is a good way to see all the variables in $_SERVER in a structured HTML output with your keywords highlighted that halts directly after execution. Since I do sometimes forget which one to use myself - I think this can be nifty.
<?php
// Change banana.com to the domain you were looking for..
$wordToHighlight = "banana.com";
$serverVarHighlighted = str_replace( $wordToHighlight, '<span style=\'background-color:#883399; color: #FFFFFF;\'>'. $wordToHighlight .'</span>', $_SERVER );
echo "<pre>";
print_r($serverVarHighlighted);
echo "</pre>";
exit();
?>
The only secure way of doing this
The only guaranteed secure method of retrieving the current domain is to store it in a secure location yourself.
Most frameworks take care of storing the domain for you, so you will want to consult the documentation for your particular framework. If you're not using a framework, consider storing the domain in one of the following places:
   Secure methods of storing the domain   
  Used By
A configuration file  
Joomla, Drupal/Symfony
The database  
WordPress
An environmental variable
Laravel  
A service registry  
Kubernetes DNS
The following work... but they're not secure
Hackers can make the following variables output whatever domain they want. This can lead to cache poisoning and barely noticeable phishing attacks.
$_SERVER['HTTP_HOST']
This gets the domain from the request headers which are open to manipulation by hackers. Same with:
$_SERVER['SERVER_NAME']
This one can be made better if the Apache setting usecanonicalname is turned off; in which case $_SERVER['SERVER_NAME'] will no longer be allowed to be populated with arbitrary values and will be secure. This is, however, non-default and not as common of a setup.
In popular systems
Below is how you can get the current domain in the following frameworks/systems:
WordPress
$urlparts = parse_url(home_url());
$domain = $urlparts['host'];
If you're constructing a URL in WordPress, just use home_url or site_url, or any of the other URL functions.
Laravel
request()->getHost()
The request()->getHost function is inherited from Symfony, and has been secure since the 2013 CVE-2013-4752 was patched.
Drupal
The installer does not yet take care of making this secure (issue #2404259). But in Drupal 8 there is documentation you can you can follow at Trusted Host Settings to secure your Drupal installation after which the following can be used:
\Drupal::request()->getHost();
Other frameworks
Feel free to edit this answer to include how to get the current domain in your favorite framework. When doing so, please include a link to the relevant source code or to anything else that would help me verify that the framework is doing things securely.
Addendum
Exploitation examples:
Cache poisoning can happen if a botnet continuously requests a page using the wrong hosts header. The resulting HTML will then include links to the attackers website where they can phish your users. At first the malicious links will only be sent back to the hacker, but if the hacker does enough requests, the malicious version of the page will end up in your cache where it will be distributed to other users.
A phishing attack can happen if you store links in the database based on the hosts header. For example, let say you store the absolute URL to a user's profiles on a forum. By using the wrong header, a hacker could get anyone who clicks on their profile link to be sent a phishing site.
Password reset poisoning can happen if a hacker uses a malicious hosts header when filling out the password reset form for a different user. That user will then get an email containing a password reset link that leads to a phishing site. Another more complex form of this skips the user having to do anything by getting the email to bounce and resend to one of the hacker's SMTP servers (for example CVE-2017-8295.)
Here are some more malicious examples
Additional Caveats and Notes:
When usecanonicalname is turned off the $_SERVER['SERVER_NAME'] is populated with the same header $_SERVER['HTTP_HOST'] would have used anyway (plus the port). This is Apache's default setup. If you or DevOps turns this on then you're okay -- ish -- but do you really want to rely on a separate team, or yourself three years in the future, to keep what would appear to be a minor configuration at a non-default value? Even though this makes things secure, I would caution against relying on this setup.
Red Hat, however, does turn usecanonical on by default [source].
If serverAlias is used in the virtual hosts entry, and the aliased domain is requested, $_SERVER['SERVER_NAME'] will not return the current domain, but will return the value of the serverName directive.
If the serverName cannot be resolved, the operating system's hostname command is used in its place [source].
If the host header is left out, the server will behave as if usecanonical
was on [source].
Lastly, I just tried exploiting this on my local server, and was unable to spoof the hosts header. I'm not sure if there was an update to Apache that addressed this, or if I was just doing something wrong. Regardless, this header would still be exploitable in environments where virtual hosts are not being used.
A Little Rant:
     This question received hundreds of thousands of views without a single mention of the security problems at hand! It shouldn't be this way, but just because a Stack Overflow answer is popular, that doesn't mean it is secure.
Using $_SERVER['HTTP_HOST'] gets me (subdomain.)maindomain.extension. It seems like the easiest solution to me.
If you're actually 'redirecting' through an iFrame, you could add a GET parameter which states the domain.
<iframe src="myserver.uk.com?domain=one.com"/>
And then you could set a session variable that persists this data throughout your application.
Try $_SERVER['SERVER_NAME'].
Tips: Create a PHP file that calls the function phpinfo() and see the "PHP Variables" section. There are a bunch of useful variables we never think of there.
To get the domain:
$_SERVER['HTTP_HOST']
Domain with protocol:
$protocol = strpos(strtolower($_SERVER['SERVER_PROTOCOL']), 'https') === FALSE ? 'http' : 'https';
$domainLink = $protocol . '://' . $_SERVER['HTTP_HOST'];
Protocol, domain, and queryString total:
$url = $protocol . '://' . $_SERVER['HTTP_HOST'] . '?' . $_SERVER['QUERY_STRING'];
**As the $_SERVER['SERVER_NAME'] is not reliable for multi-domain hosting!
I know this might not be entirely on the subject, but in my experience, I find storing the WWW-ness of the current URL in a variable useful.
In addition, please see my comment below, to see what this is getting at.
This is important when determining whether to dispatch Ajax calls with "www", or without:
$.ajax("url" : "www.site.com/script.php", ...
$.ajax("url" : "site.com/script.php", ...
When dispatching an Ajax call the domain name must match that of in the browser's address bar, and otherwise you will have an Uncaught SecurityError in the console.
So I came up with this solution to address the issue:
<?php
substr($_SERVER['SERVER_NAME'], 0, 3) == "www" ? $WWW = true : $WWW = false;
if ($WWW) {
/* We have www.example.com */
} else {
/* We have example.com */
}
?>
Then, based on whether $WWW is true, or false run the proper Ajax call.
I know this might sound trivial, but this is such a common problem that is easy to trip over.
Everybody is using the parse_url function, but sometimes a user may pass the argument in different formats.
So as to fix that, I have created a function. Check this out:
function fixDomainName($url='')
{
$strToLower = strtolower(trim($url));
$httpPregReplace = preg_replace('/^http:\/\//i', '', $strToLower);
$httpsPregReplace = preg_replace('/^https:\/\//i', '', $httpPregReplace);
$wwwPregReplace = preg_replace('/^www\./i', '', $httpsPregReplace);
$explodeToArray = explode('/', $wwwPregReplace);
$finalDomainName = trim($explodeToArray[0]);
return $finalDomainName;
}
Just pass the URL and get the domain.
For example,
echo fixDomainName('https://stackoverflow.com');
will return:
stackoverflow.com
And in some situation:
echo fixDomainName('stackoverflow.com/questions/id/slug');
And it will also return stackoverflow.com.
This quick & dirty works for me.
Whichever way you get the string containing the domain you want to extract, i.e. using a super global -$_SERVER['SERVER_NAME']- or, say, in Drupal: global $base_url, regex is your friend:
global $base_url;
preg_match("/\w+\.\w+$/", $base_url, $matches);
$domain = $matches[0];
The particular regex string I am using in the example will only capture the last two components of the $base_url string, of course, but you can add as many "\w+." as desired.
Hope it helps.

To convert an absolute path to a relative path in php

I would like to convert an absolute path into a relative path.
This is what the current absolute code looks like
$sitefolder = "/wmt/";
$adminfolder = "/wmt/admin/";
$site_path = $_SERVER["DOCUMENT_ROOT"]."$sitefolder";
// $site_path ="//winam/refiller/";
$admin_path = $site_path . "$adminfolder";
$site_url = "http://".$_SERVER["HTTP_HOST"]."$sitefolder";
$admin_url = $site_url . "$adminfolder";
$site_images = $site_url."images/";
so for example, the code above would give you a site url of
www.temiremi.com/wmt
and accessing a file in that would give
www.temiremi.com/wmt/folder1.php
What I want to do is this I want to mask the temiremi.com/wmt and replace it with dolapo.com, so it would say www.dolapo.com/folder1.php
Is it possible to do that with relative path.
I'm a beginner coder. I paid someone to do something for me, but I want to get into doing it myself now.
The problem is that your question, although it seems very specific, is missing some crucial details.
If the script you posted is always being executed, and you always want it to go to delapo.com instead of temiremi.com, then all you would have to do is replace
$site_url = "http://".$_SERVER["HTTP_HOST"]."$sitefolder";
with
$site_url = "http://www.delapo.com/$sitefolder";
The $_SERVER["HTTP_HOST"] variable will return the domain for whatever site was requested. Therefore, if the user goes to www.temiremi.com/myscript.php (assuming that the script you posted is saved in a file called myscript.php) then $_SERVER["HTTP_HOST"] just returns www.temiremi.com.
On the other hand, you may not always be redirecting to the same domain or you may want the script to be able to adapt easily to go to different domains without having to dig through layers of code in the future. If this is the case, then you will need a way to figuring out what domain you wish to link to.
If you have a website hosted on temiremi.com but you want it to look like you are accessing from delapo.com, this is not an issue that can be resolved by PHP. You would have to have delapo.com redirect to temiremi.com or simply host on delapo.com in the first place.
If the situation is the other way around and you want a website hosted on delapo.com but you want users to access temiremi.com, then simply re-writing links isn't a sophisticated enough answer. This strategy would redirect the user to the other domain when they clicked the link. Instead you would need to have a proxy set up to forward the information. Proxy scripts vary in complexity, but the simplest one would be something like:
<?php
$site = file_get_contents("http://www.delapo.com/$sitefolder");
echo $site;
?>
So you see, we really need a little more information on why you need this script and its intended purpose in order to assist you.
This would be a lot easier to do in the HTTP server configuration. For example, using Apache's VHost
I'm not really sure what you're going for bc this doesnt look like absolute path to relative path, but rather one absolute path to another.
Are you always trying to simply change "www.temiremi.com/wmt/" to "delapo.com"? If thats the case, you just want simple string replacement rather than $_SERVER variables or path functions.
$alteredPath = str_replace("www.temiremi.com/wmt/", "delapo.com", $oldPath);
OR
$alteredParth "www.delapo.com/" . basename($oldPath)
If i misunderstand please explain, I don't know if you need this to be more robust/generic, and you kind of threw me for a loop with "dolapo.com" (when i first thought your title, i originally thought of comparing path to a value from $_SERVER and removing common parts,)
And as mentioned, if you are just trying to make the URL displayed the the user (in the address bar or links) look different PHP can't do this.

Using PHP to grab the absolute URL of the script?

Basically, I want my script to output its absolute URL, but I don't want to statically program it into the script. For example, if my current URL is http://example.com/script.php I want to be able to store it as a variable, or echo it. i.e. $url = http://example.com/script.php;
But if I move the script to a different server/domain, I want it to automatically adjust to that, i.e. $url = http://example2.com/newscript.php;
But I have no idea how to go about doing this. Any ideas?
$url = "http://" . $_SERVER['HTTP_HOST'] . '/script.php';
If there's a possibility the protocol will change as well (i.e. https instead of http), use this:
$url = ($_SERVER['HTTPS'] ? "https://" : "http://") . $_SERVER['HTTP_HOST'] . '/script.php';
$_SERVER['HTTP_HOST'] and $_SERVER['SCRIPT_NAME'] contain this information.
UPDATE: As #Col. Shrapnel points out, SCRIPT_NAME returns the actual path of the script relative to the host, not the requested URL, which may be different if using URL rewrite. Also, unlike REQUEST_URI, it doesn't include the possibly appended variables.
Note that SCRIPT_NAME is equivalent in content to PHP_SELF, the difference is that:
SCRIPT_NAME is defined in the CGI 1.1
specification, and is thus a standard.
However, not all web servers actually
implement it, and thus it isn't
necessarily portable. PHP_SELF, on the
other hand, is implemented directly by
PHP, and as long as you're programming
in PHP, will always be present.
by bet (:
$_SERVER['HTTP_HOST'] and $_SERVER["REQUEST_URI"];
however, $_SERVER['HTTP_PORT'] and $_SERVER['HTTPS'] could be used in the critical case
however, most of time you do not need all of these, save for $_SERVER["REQUEST_URI"]
because browser knows the rest already: port, host and everything.
Try using
$url = "http://{$_SERVER['SERVER_NAME']}{$_SERVER['REQUEST_URI']}";
I have a library that helps me do this across webservers and is also agnostic to mod_rewrite.
The library is called Bombay: http://github.com/sandeepshetty/bombay
To use it you need to do this:
<?php
require '/path/to/bombay.php';
requires ('uri');
echo absolute_uri('script.php');
//prints http://example.com/script.php if hosted on example.com and accessed over http
//prints https://example2.com/script.php if hosted on example2.com and accessed over https
?>
You could also study the code, and take what you need.

Is this code vulnerable to hacker attack?

I am really new to online web application. I am using php, I got this code:
if(isset($_GET['return']) && !empty($_GET['return'])){
return = $_GET['return'];
header("Location: ./index.php?" . $return);
} else {
header("Location: ./index.php");
}
the $return variable is URL variable which can be easily changed by hacker.
E.g i get the $return variable from this : www.web.com/verify.php?return=profile.php
Is there anything I should take care? Should I use htmlentities in this line:
header("Location: ./index.php?" . htmlentities($return));
Is it vulnerable to attack by hacker?
What should i do to prevent hacking?
Apart from that typo on line 2 (should be $return = $_GET['return'];) you should do $return = urlencode($return) to make sure that $return is a valid QueryString as it's passed as parameter to index.php.
index.php should then verify that return is a valid URL that the user has access to. I do not know how your index.php works, but if it simply displays a page then you could end up with someting like index.php?/etc/passwd or similar, which could indeed be a security problem.
Edit: What security hole do you get? There are two possible problems that I could see, depending how index.php uses the return value:
If index.php redirects the user to the target page, then I could use your site as a relay to redirect the user to a site I control. This could be either used for phishing (I make a site that looks exactly like yours and asks the user for username/password) or simply for advertising.
For example, http://yoursite/index.php?return?http%3A%2F%2Fwww.example.com looks like the user accesses YourSite, but then gets redirected to www.example.com. As I can encode any character using the %xx notation, this may not even be obvious to the user.
If index.php displays the file from the return-parameter, I could try to pass in the name of some system file like /etc/passwd and get a list of all users. Or I could pass something like ../config.php and get your database connection
I don't think that's the case here, but this is such a common security hole I'd still like to point it out.
As said, you want to make sure that the URL passed in through the querystring is valid. Some ways to do that could be:
$newurl = "http://yoursite/" . $return;
this could ensure that you are always only on your domain and never redirect to any other domain
$valid = file_exists($return)
This works if $return is always a page that exists on the hard drive. By checking that return indeed points to a valid file you can filter out bogus entries
If return would accept querystrings (i.e. return=profile.php?step=2) then you would need to parse out the "profile.php" path
Have a list of valid values for $return and compare against it
this is usually impractical, unless you really designed your application so that index.php can only return t a given set of pages
There are many ways to skin this cat, but generally you want to somehow validate that $return points to a valid target. What those valid targets are depends on your specification.
If you're running an older version of both PHP 4 or 5, then I think you will be vulnerable to header injection - someone can set return to a URL, followed by a line return, followed by any other headers they want to make your server send.
You could avoid this by sanitising the string first. It might be enough to strip line returns but it would be better to have an allowed list of characters - this might be impractical.
4.4.2 and 5.1.2: This function now prevents more than one header to be
sent at once as a protection against
header injection attacks.
http://php.net/manual/en/function.header.php
What would happen if you put in a page that didn't exist. For example:
return=blowup.php
or
return=http://www.google.co.uk
or
return=http%3A%2F%2Fwww.google.co.uk%2F
You could obfuscate the reference by not including the .php in the variable. You could then append it in the code-behind and check for the existence of the file in your directory / use a switch statement of allowable values before redirecting to it.
In this case, it more depends on what's done with that part of the query string on index.php. If it's being sent to a database query, output, eval(), or exec() yes, its a very common security hole. Most other situations will be safe unfiltered, but its best to write your own general purpose sanitizing function which converts quotes of all varieties to their HTML entity, as well as equals symbols.
The things I would do are:
Define, what type of return values are allowed?
Write down all types of possible return values.
Then, make conclusions: what characters are not allowed, what is the maximum url length, what domains are allowed, etc.
Finally: make a filter function according to above conclusions.
I Thing hacker can do this
you will redirect if $_GET['return'] Contain any thing
the hacker can use it as xss
redirect to virus or any thing like it
but there is no ability to make any thing else

Categories