Using PHP to grab the absolute URL of the script?

Basically, I want my script to output its absolute URL, but I don't want to statically program it into the script. For example, if my current URL is http://example.com/script.php I want to be able to store it as a variable, or echo it. i.e. $url = http://example.com/script.php;
But if I move the script to a different server/domain, I want it to automatically adjust to that, i.e. $url = http://example2.com/newscript.php;
But I have no idea how to go about doing this. Any ideas?

$url = "http://" . $_SERVER['HTTP_HOST'] . '/script.php';
If there's a possibility the protocol will change as well (i.e. https instead of http), use this:
$url = ((!empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] !== 'off') ? "https://" : "http://") . $_SERVER['HTTP_HOST'] . '/script.php';
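The same idea can be sketched as a small function that also takes the script path from SCRIPT_NAME instead of hard-coding '/script.php'. The name current_script_url and the $server parameter (a stand-in for $_SERVER, so the function can be tried from the command line) are illustrative assumptions, as is the 'localhost' fallback.

```php
<?php
// Hypothetical helper: builds scheme://host/path for the running script.
// $server stands in for $_SERVER so the function can be exercised outside a web server.
function current_script_url(array $server): string
{
    // HTTPS is usually unset on plain HTTP; some servers set it to 'off' instead.
    $https  = !empty($server['HTTPS']) && strtolower($server['HTTPS']) !== 'off';
    $scheme = $https ? 'https' : 'http';

    // Assumed fallbacks for requests that carry no Host header or script name.
    $host = $server['HTTP_HOST'] ?? 'localhost';
    $path = $server['SCRIPT_NAME'] ?? '/';

    return $scheme . '://' . $host . $path;
}
```

In a page you would call it as $url = current_script_url($_SERVER); moving the script to another host or renaming it needs no code change.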

$_SERVER['HTTP_HOST'] and $_SERVER['SCRIPT_NAME'] contain this information.
UPDATE: As @Col. Shrapnel points out, SCRIPT_NAME returns the actual path of the script relative to the host, not the requested URL, which may be different if URL rewriting is in use. Also, unlike REQUEST_URI, it doesn't include any appended query string.
Note that SCRIPT_NAME is equivalent in content to PHP_SELF; the difference is that:
SCRIPT_NAME is defined in the CGI 1.1 specification, and is thus a standard. However, not all web servers actually implement it, and thus it isn't necessarily portable. PHP_SELF, on the other hand, is implemented directly by PHP, and as long as you're programming in PHP, will always be present.
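To make the SCRIPT_NAME versus REQUEST_URI distinction concrete, here is a small sketch. The $server values are assumed sample data for a rewritten request, not output captured from a real server.

```php
<?php
// Assumed sample values: the visitor requested /articles/42?ref=home and a
// rewrite rule handed the request to /index.php.
$server = [
    'REQUEST_URI' => '/articles/42?ref=home',
    'SCRIPT_NAME' => '/index.php',
];

// The path the visitor asked for, with the query string stripped.
$requestedPath = parse_url($server['REQUEST_URI'], PHP_URL_PATH);

// The path of the script that actually handles the request.
$scriptPath = $server['SCRIPT_NAME'];
```

Which one belongs in the generated URL depends on whether you want the address the visitor typed or the address of the file that ran.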

My bet:
$_SERVER['HTTP_HOST'] and $_SERVER['REQUEST_URI'].
$_SERVER['SERVER_PORT'] and $_SERVER['HTTPS'] can be added in the rare cases where they matter.
However, most of the time you do not need any of these except $_SERVER['REQUEST_URI'],
because the browser already knows the rest: port, host and everything.

Try using
$url = "http://{$_SERVER['SERVER_NAME']}{$_SERVER['REQUEST_URI']}";

I have a library that helps me do this across webservers and is also agnostic to mod_rewrite.
The library is called Bombay: http://github.com/sandeepshetty/bombay
To use it you need to do this:
<?php
require '/path/to/bombay.php';
requires ('uri');
echo absolute_uri('script.php');
//prints http://example.com/script.php if hosted on example.com and accessed over http
//prints https://example2.com/script.php if hosted on example2.com and accessed over https
?>
You could also study the code, and take what you need.

Related

PHP: Get current directory index file name

I must not be phrasing this question right because I couldn't find an answer to this but surely it's been asked before. How do I get the current filename from a URL if it's the directory's index file?
I.e. This will get index.html if I'm on www.example.com/index.html
$url = basename($_SERVER['REQUEST_URI']);
But that won't work if I'm on www.example.com. The only thing I've come up with so far is something like this:
$url = basename($_SERVER['REQUEST_URI']);
if($url == "") {
$filename = "index.html";
}
But that's obviously a bad solution because I may actually be on index.htm or index.php. Is there a way to determine this accurately?

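$_SERVER['SCRIPT_FILENAME'] holds the full filesystem path of the currently executing PHP file, and $_SERVER['SCRIPT_NAME'] holds its path relative to the document root (for example /index.php). Wrap either one in basename() to get just the file name.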
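A minimal sketch of that approach follows. The function name executing_file_name and the $server parameter (standing in for $_SERVER) are illustrative. Note the assumption that the directory index is itself a PHP script: if the server delivers a static index.html, no PHP code runs at all.

```php
<?php
// Hypothetical helper: name of the PHP file that is actually executing,
// even when the URL was just "/" and the server picked the directory index.
function executing_file_name(array $server): string
{
    // SCRIPT_NAME is a path such as /blog/index.php; basename() keeps the last part.
    return basename($server['SCRIPT_NAME'] ?? '');
}
```

Because it reads SCRIPT_NAME rather than REQUEST_URI, the result does not depend on whether the visitor typed the file name.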

This is one of the other methods.
$url = basename($_SERVER['REQUEST_URI']);
$urlArray = explode("/", $url);
$urlArray = array_reverse($urlArray);
echo $urlArray[0];

Unfortunately, not every $_SERVER entry is guaranteed to be provided by your server. Some may be omitted, some not. With a little testing, though, you can easily find out what your server outputs for these entries. On my servers (usually Apache) I find that $_SERVER['DOCUMENT_ROOT'] usually gets me the base of the URI I'm after. This also works well for me in the development environment I work in (XAMPP), where my URI has a localhost root. I have seen people encourage DOCUMENT_ROOT before in this situation. You can read all about the $_SERVER array in the PHP manual.
In this example, I get the following results:
echo $_SERVER['DOCUMENT_ROOT'] ; // outputs http://example.com
If you are working in a development environment this is very helpful because you won't have to modify your URLs when you go live:
echo $_SERVER['DOCUMENT_ROOT'] ; // outputs C:/xampp/htdocs/example
'DOCUMENT_ROOT': The document root directory under which the current script is executing, as defined in the server's configuration file.

You could find the last occurrence of the slash / using strrchr() and simply extract the rest using substr(). The second argument to substr() tells it where to begin; with 1 we skip the slash. If you want to keep the slash, just set it to 0.
echo substr( strrchr( "http://example.com/index.html" , "/" ) , 1 ) ; // outputs index.html
EDIT: Considering that not every server will populate every $_SERVER entry, my approach might be more reliable. That is, if the URL you pass to strrchr() is reliable. In either case, make sure you test the different outputs from $_SERVER, or the paths you provide.
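The strrchr()/substr() idea can be wrapped so that a query string or a missing path does not trip it up. This is only a sketch; the name last_path_segment is made up for illustration, and stripping the query with parse_url() is an addition to the approach described above.

```php
<?php
// Hypothetical helper: returns whatever follows the last slash of the URL's path,
// or an empty string when the URL ends in a slash or has no path at all.
function last_path_segment(string $url): string
{
    // Keep only the path so that "?a=b/c" cannot confuse the slash search.
    $path = (string) parse_url($url, PHP_URL_PATH);

    // strrchr() returns the tail starting at the last slash, or false if none.
    $tail = strrchr($path, '/');

    return $tail === false ? $path : (string) substr($tail, 1);
}
```

An empty result is the signal that the request was for a directory, which is exactly the case the question is about.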

Difference between $_SERVER['DOCUMENT_ROOT'] and $_SERVER['HTTP_HOST']

I am back with a simple question (or related question).
The question is simple however I have not received an answer yet. I have asked many people with different experience in PHP. But the response I get is: "I don't have any idea. I've never thought about that." Using Google I have not been able to find any article on this. I hope that I will get a satisfying answer here.
So the question is:
What is the difference between $_SERVER['DOCUMENT_ROOT'] and $_SERVER['HTTP_HOST'] ?
Are there any advantages of one over the other?
Where should we use HTTP_HOST & where to use DOCUMENT_ROOT?

DOCUMENT_ROOT
The root directory of this site, defined by the 'DocumentRoot' directive in the general section or in a <VirtualHost> section, e.g.
DOCUMENT_ROOT=/var/www/example
HTTP_HOST
The base URL of the host e.g.
HTTP_HOST=www.example.com
The document root is the local path to your website, on your server; The http host is the hostname of the server. They are rather different; perhaps you can clarify your question?
Edit:
You said:
Case 1 : header('Location: '. $_SERVER['DOCUMENT_ROOT'] . '/abc.php')
Case 2: header('Location: '. $_SERVER['HTTP_HOST'] . '/abc.php')
I suspect the first is only going to work if you run your browser on the same machine that's serving the pages.
Imagine if someone else visits your website, using their Windows machine. And your webserver tells them in the HTTP headers, "hey, actually, redirect this location: /var/www/example/abc.php." What do you expect the user's machine to do?
Now, if you're talking about something like
<?php include($_SERVER['DOCUMENT_ROOT'] . '/include/abc.php') ?>
vs
<?php include($_SERVER['HTTP_HOST'] . '/include/abc.php') ?>
That might make sense. I suspect in this case the former is probably preferred, although I am not a PHP Guru.

<?php include($_SERVER['DOCUMENT_ROOT'] . '/include/abc.php') ?>
should be used for including the files in another file.
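header('Location: http://' . $_SERVER['HTTP_HOST'] . '/abc.php')
should be used for redirects and hyperlinks (note the scheme in front of the host; without it the browser treats the value as a relative path).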
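Side by side, the two uses look like this. The $server values are assumed sample data; in a page you would read them from $_SERVER.

```php
<?php
// Assumed sample values: a filesystem root and a host name.
$server = [
    'DOCUMENT_ROOT' => '/var/www/example',
    'HTTP_HOST'     => 'www.example.com',
];

// Filesystem path: what include/require expects.
$includePath = rtrim($server['DOCUMENT_ROOT'], '/') . '/include/abc.php';

// URL: what a browser expects in a Location header or a link.
$redirectUrl = 'http://' . $server['HTTP_HOST'] . '/abc.php';
```

The first string only means something on the server's own disk; the second only means something to a client on the network.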

Eh, what's the question? DOCUMENT_ROOT contains the path to current web, in my case /home/www. HTTP_HOST contains testing.local, as it runs on local domain. The difference is obvious, isn't it?
I cannot figure out where you could interchange those two, so why should you consider advantages?

HTTP_HOST will give you URL of the host, e.g. domain.com
DOCUMENT_ROOT will give you absolute path to document root of the website in server's file system, e.g. /var/www/domain/
Btw, have you tried looking at PHP's manual, specifically $_SERVER? Everything is explained there.

If you want the domain, like 'example.com', you can use "HTTP_HOST".
If you want the folder path, like '/public_html/foldername/', you can use "DOCUMENT_ROOT".

$_SERVER['HTTP_HOST'] is defined by the client and may not even be set! For local testing you can repeat a request and withhold the header using developer tools such as those in Waterfox/Firefox. You must determine whether this header is set and whether the requested host exists (one of the very first things you do, even before starting to send any of your headers); otherwise the appropriate action is to kill the entire process and respond with an HTTP 400 Bad Request. This goes for all server-side programming languages.
$_SERVER['DOCUMENT_ROOT'] is defined by the server as the document root directory under which the executing script is located. Examples, assuming public_html is the configured document root:
public_html/example.php = public_html/
public_html/test1/example.php = public_html/
Keep in mind that if you're using Apache rewrites that there is a difference between the $_SERVER['REQUEST_URI'] (the URL requested) and $_SERVER['PHP_SELF'] (the file handling the request).

The title question is perfectly answered by John Ledbetter.
This answer is intended to expand on it and offer additional information about what seem to be the original poster's underlying concerns:
Where would it make sense to use the URL-based location, $_SERVER['HTTP_HOST']?
Where would it make sense to use the local location, $_SERVER['DOCUMENT_ROOT']?
Where both can be used, what are the advantages and disadvantages of each?
My answers follow:
By using HTTP_HOST you can abstract yourself from the machine's folder structure, which means that where portability is a concern and you are expected to install the application on multiple servers, potentially with different operating systems, this approach could be easier to maintain.
You can also take advantage of HTTP_HOST if your server is going to become unavailable and you want a different one from the cluster to handle the request.
By using DOCUMENT_ROOT you can access the whole filesystem (depending on the permissions you give to PHP). It makes sense if you want to access a program which you don't want to be accessible from the web, or when the folder structure is relevant to your application.
You can also take advantage of DOCUMENT_ROOT to get the subsite root instead of the host.
$_SERVER['HTTP_HOST'] = "www.example.com";
$_SERVER['DOCUMENT_ROOT'] = "/var/www/domain/subsite1"; // equivalent to www.example.com/subsite1

$_SERVER['HTTP_HOST'] returns the domain name,
e.g. www.example.com
while $_SERVER['DOCUMENT_ROOT'] returns the root of the current website.
Such as

Other answers have alluded to it, but I wanted to add an answer to be as sharp as a grizzly bear's tooth on one point: don't trust $_SERVER['HTTP_HOST'] as safe the way the following code does:
<?php
header('Location: '. $_SERVER['HTTP_HOST'] . '/abc.php');
#Or
include($_SERVER['HTTP_HOST'] . '/include/abc.php');
?>
The variable is subject to manipulation by the incoming request and could contribute to an exploit. This may depend on your server configuration, but you don't want something filling out this variable for you :)
See also:
https://security.stackexchange.com/questions/32299/is-server-a-safe-source-of-data-in-php
https://expressionengine.com/blog/http-host-and-server-name-security-issues
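One common way to defuse this is to compare the header against a list of hosts you actually serve. The sketch below assumes such a list exists in your configuration; the function name trusted_host, its parameters, and the host names in the usage note are placeholders.

```php
<?php
// Hypothetical helper: returns the requested host only if it is on an allowlist,
// otherwise a configured default. $server stands in for $_SERVER.
function trusted_host(array $server, array $allowedHosts, string $default): string
{
    // Host names are case-insensitive, so normalise before comparing.
    $host = strtolower($server['HTTP_HOST'] ?? '');

    // Strict comparison against known hosts; anything else is ignored.
    return in_array($host, $allowedHosts, true) ? $host : $default;
}
```

For example, trusted_host($_SERVER, ['example.com', 'www.example.com'], 'example.com') never returns a value the client invented. A Host header that carries a port (example.com:8080) would need its own allowlist entry in this simple version.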

Is it possible to use curl with relative path in PHP?

I have two php pages. I want to fetch b.php in a.php.
In my a.php:
$ch = curl_init("b.php");
echo(curl_exec($ch));
curl_close($ch);
Doesn't work;
But:
$ch = curl_init("www.site.com/b.php");
echo(curl_exec($ch));
curl_close($ch);
is OK. I'm sure a.php is under www.site.com.
Why curl can't work with relative path? Is there a workaround?

cURL is a separate library which does not really know anything about web servers, where it's coming from or (philosophically) why it is there. So you may 'fake' relative URLs using one of these two $_SERVER variables:
$_SERVER['SERVER_NAME']
The name of the server host under which the current script is executing. If the script is running on a virtual host, this will be the value defined for that virtual host.
$_SERVER['HTTP_HOST']
Contents of the Host: header from the current request, if there is one.
See: http://php.net/manual/en/reserved.variables.server.php
Edit update:
I thought a moment longer about this: do you really need to fetch it with curl?
You usually may also fetch any output of another script like this and save the overhead of loading it through a new http request:
ob_start();
require "b.php";
$output = ob_get_clean();
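The output-buffering approach can be wrapped in a function so the buffer is always closed, even if the included file throws. The name capture_script_output is illustrative. Keep in mind that the included file runs inside the function's scope, so it does not see the caller's local variables.

```php
<?php
// Hypothetical helper: runs a local PHP file and returns what it printed,
// without going through a new HTTP request.
function capture_script_output(string $file): string
{
    ob_start();
    try {
        require $file;
    } finally {
        // Always close the buffer, even when the included file throws.
        $output = ob_get_clean();
    }
    return (string) $output;
}
```

In a.php this would be used as $output = capture_script_output(__DIR__ . '/b.php'); which also sidesteps the relative-path question, because __DIR__ is already absolute on disk.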

How about taking the domain from HTTP_HOST?
$domain = $_SERVER['HTTP_HOST'];
$prefix = (!empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] !== 'off') ? 'https://' : 'http://';
$relative = '/b.php';
$ch = curl_init($prefix.$domain.$relative);
echo(curl_exec($ch));
curl_close($ch);

cURL needs an absolute URI to operate on.
A relative URI does not work because no base URI is given against which it could be resolved.
You can, however, if you have both the base URI and the relative URI, build the absolute URI from them and use it with cURL.
See 12.4.1 Resolving relative URIs.
A PHP class that can build an absolute URI from a relative URI and its base is the Net_URL2 package in PEAR.
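If pulling in a package is not an option, the resolution step can be approximated by hand. The following is a simplified sketch, not a full implementation of the standard algorithm and not how Net_URL2 works internally: it assumes an absolute http(s) base URL and only handles absolute URLs, root-relative paths, and plain relative paths containing "." and ".." segments. The function name resolve_url is made up.

```php
<?php
// Simplified sketch of relative-reference resolution.
// Assumes $base is an absolute URL with a scheme and a host.
function resolve_url(string $base, string $relative): string
{
    // Already absolute: nothing to resolve.
    if (parse_url($relative, PHP_URL_SCHEME) !== null) {
        return $relative;
    }

    $parts  = parse_url($base);
    $origin = $parts['scheme'] . '://' . $parts['host']
            . (isset($parts['port']) ? ':' . $parts['port'] : '');

    if (strpos($relative, '/') === 0) {
        // Root-relative: replaces the whole base path.
        $path = $relative;
    } else {
        $basePath = $parts['path'] ?? '/';
        // Drop the last segment of the base path (the "file"), keep its directory.
        $dir  = substr($basePath, 0, strrpos($basePath, '/') + 1);
        $path = $dir . $relative;
    }

    // Remove "." and ".." segments; never climb above the root.
    $out = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            if (count($out) > 1) {
                array_pop($out);
            }
        } elseif ($segment !== '.') {
            $out[] = $segment;
        }
    }

    return $origin . implode('/', $out);
}
```

The result can then be handed to cURL, for instance curl_init(resolve_url('http://www.site.com/a.php', 'b.php')). Edge cases such as query strings that contain slashes or protocol-relative references are deliberately left out of this sketch.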

cURL would primarily be used to retrieve data from external domains, therefore it wouldn't make too much sense to allow relative paths. The easiest thing to do would just be to append your current domain to the URL.
$domain = $_SERVER['HTTP_HOST'] . "/";
$ch = curl_init($domain . "b.php");
echo(curl_exec($ch));
curl_close($ch);

CURL has absolutely no knowledge of its operating environment. There is no way for it to know where 'b.php' is. Should it turn that into example.org/b.php or some.wonky.multi.level.domain.co.uk/b.php?
Even treating it as a local file reference would be useless... CURL wouldn't know that a .php file is actually a PHP script. Even if it did a local file fetch, you'd just get PHP source, not the output of the script after PHP's run it. What if you've got a site like arstechnica.com where all its pages are actually .ars scripts? Is that .asp? .aspx? .html? PHP script? perl? ruby?
So.. simple answer: you must always specify a complete URL, with protocol, for CURL to operate on.

Self-referential URLs

What's the most reliable, generic way to construct a self-referential URL? In other words, I want to generate the http://www.site.com[:port] portion of the URL that the user's browser is hitting. I'm using PHP running under Apache.
A few complications:
Relying on $_SERVER["HTTP_HOST"] is dangerous, because that seems to come straight from the HTTP Host header, which someone can forge.
There may or may not be virtual hosts.
There may be a port specified using Apache's Port directive, but that might not be the port that the user specified, if it's behind a load-balancer or proxy.
The port may not actually be part of the URL. For example, 80 and 443 are usually omitted.
PHP's $_SERVER["HTTPS"] doesn't always give a reliable value, especially if you're behind a load-balancer or proxy.
Apache has a UseCanonicalName directive, which affects the values of the SERVER_NAME and SERVER_PORT environment variables. We can assume this is turned on, if that helps.

I would suggest that the only way to be sure and to be secure is to define a constant for the url in some kind of config file for the site. You could generate the constant with $_SERVER['HTTP_HOST'] as a default and replace with a hard coded definition on deployments where security really matters.
define('SITE_URL', 'http://' . $_SERVER['HTTP_HOST'] . '/');
and replace as needed:
define('SITE_URL', 'http://foo.bar.com:8080/');
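A sketch of how the two cases might be combined in one place: use the configured value when a deployment provides one, and fall back to the Host header otherwise. The $config array, its 'site_url' key, and the function name site_url are assumptions made for this example.

```php
<?php
// Hypothetical config helper: prefer an explicitly configured site URL and
// fall back to the request's Host header only when none is configured.
function site_url(array $config, array $server): string
{
    if (!empty($config['site_url'])) {
        // Normalise to exactly one trailing slash.
        return rtrim($config['site_url'], '/') . '/';
    }

    // Convenience default for development; not safe where the Host header is untrusted.
    return 'http://' . ($server['HTTP_HOST'] ?? 'localhost') . '/';
}
```

A deployment where security matters would set the value in its own config file, and define('SITE_URL', site_url($config, $_SERVER)) would then ignore the client-supplied header entirely.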

As I recall, you want to do something like this:
$protocol = 'http';
$port = '';
if (!empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] != 'off') {
    $protocol = 'https';
    // Only mention the port if it is not the HTTPS default
    if ($_SERVER['SERVER_PORT'] != 443)
        $port = $_SERVER['SERVER_PORT'];
} else if ($_SERVER['SERVER_PORT'] != 80) {
    $port = $_SERVER['SERVER_PORT'];
}
// Server name is going to be whatever the virtual host name is set to in your configuration
$address = $protocol . '://' . $_SERVER['SERVER_NAME'];
if (!empty($port))
    $address .= ':' . $port;
// REQUEST_URI already includes the query string, so there is no need to append QUERY_STRING
$address .= $_SERVER['REQUEST_URI'];
I haven't tested this code, because I don't have PHP handy at the moment.

The most reliable way is to provide it yourself.
The site should be coded to be hostname neutral, but to know about a special configuration file. This file doesn't get put into source control for the codebase because it belongs to the webserver's configuration. The file is used to set things like the hostname and other webserver-specific parameters. You can accommodate load balancers, changing ports, etc., because you're saying that if an HTTP request hits that code, then it can assume however much you will let it assume.
This trick also helps development, incidentally. :-)

$_SERVER["HTTP_HOST"] is probably the best way, after some validation of course.
Yes, the user specifies it and so it cannot be trusted, but you can easily detect when the user is playing games with it.

One idea for checking that $_SERVER['HTTP_HOST'] is valid could be to validate it by DNS. I've used this method in one or two cases without serious consequences for speed, and I believe it fails silently if given an IP address.
http://www.php.net/manual/en/function.gethostbyname.php
Pseudocode might be:
define('SITEHOME', in_array(gethostbyname($_SERVER['HTTP_HOST']), array(/* ... valid IPs */))
    ? $_SERVER['HTTP_HOST']
    : 'default_hostname');

Why generate full URLs at all, if you want the user to stay on the http://host:port/ they are already on? You can use relative URLs instead.
Say you are on the page http://xxx:yy/zzz/fff/
You could use either
../graphics/whatever.jpg
(to go back one directory from the current one and get http://xxx:yy/zzz/graphics/whatever.jpg)
or
/zzz/graphics/whatever.jpg
(to go to the site root and work down through the directories as specified).
Both avoid mentioning the host:port part and inherit it from the one currently in use.

Converting a filepath to a url securely and reliably

I'm using php and I have the following code to convert an absolute path to a url.
function make_url($path, $secure = false){
return (!$secure ? 'http://' : 'https://').str_replace($_SERVER['DOCUMENT_ROOT'], $_SERVER['HTTP_HOST'], $path);
}
My question is basically, is there a better way to do this in terms of security / reliability that is portable between locations and servers?

The HTTP_HOST variable is not a reliable or secure value as it is also being sent by the client. So be sure to validate its value before using it.

I don't think security is going to be affected, simply because this is a URL being printed to a browser... the worst that can happen is exposing the full directory path to the file, and potentially creating a broken link.
As a little side note, if this is being printed in an HTML document, I presume you are passing the output through something like htmlentities... just in case the input $path contains something like a [script] tag (XSS).
To make this a little more reliable though, I wouldn't recommend matching on 'DOCUMENT_ROOT', as sometimes it's either not set, or won't match (e.g. when Apache rewrite rules start getting in the way).
If I was to re-write it, I would simply ensure that 'HTTP_HOST' is always printed...
function make_url($path, $secure = false){
return (!$secure ? 'http://' : 'https://').$_SERVER['HTTP_HOST'].str_replace($_SERVER['DOCUMENT_ROOT'], '', $path);
}
... and if possible, update the calling code so that it just passes the path, so I don't need to even consider removing the 'DOCUMENT_ROOT' (i.e. what happens if the path does not match the 'DOCUMENT_ROOT')...
function make_url($path, $secure = false){
return (!$secure ? 'http://' : 'https://').$_SERVER['HTTP_HOST'].$path;
}
Which does leave the question... why have this function?
On my websites, I simply have a variable defined at the beginning of script execution which sets:
$GLOBALS['webDomain'] = 'http://' . (isset($_SERVER['HTTP_HOST']) ? $_SERVER['HTTP_HOST'] : '');
$GLOBALS['webDomainSSL'] = $GLOBALS['webDomain'];
Where I use GLOBALS so it can be accessed anywhere (e.g. in functions)... but you may also want to consider making a constant (define), if you know this value won't change (I sometimes change these values later in a site wide configuration file, for example, if I have an HTTPS/SSL certificate for the website).
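For cases where the path-to-URL conversion is still wanted, a stricter variant can refuse paths that do not really sit under the document root instead of silently producing a broken link. This is a sketch only; the name path_to_url and the choice to pass the root and host in as arguments (rather than reading $_SERVER inside the function) are assumptions made here so the function is easy to test.

```php
<?php
// Hypothetical variant of make_url(): converts a filesystem path to a URL only
// when the path lies under the document root; returns null otherwise.
function path_to_url(string $path, string $docRoot, string $host, bool $secure = false): ?string
{
    // Normalise Windows separators so XAMPP-style paths compare correctly.
    $root = rtrim(str_replace('\\', '/', $docRoot), '/');
    $path = str_replace('\\', '/', $path);

    // Require the root to be a true prefix followed by a slash, so that
    // /var/www/example-old does not match a root of /var/www/example.
    if (strpos($path, $root . '/') !== 0) {
        return null;
    }

    $relative = substr($path, strlen($root));
    return ($secure ? 'https://' : 'http://') . $host . $relative;
}
```

The caller would pass $_SERVER['DOCUMENT_ROOT'] and a validated host name, and treat null as "this file is not reachable from the web".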

I think this is the wrong approach.
URLs in HTML support relative locations. That is, you can write a link whose href is just a page name to refer to a page that has the same path in its URL as the current page. You can also write a link whose href starts with a slash to provide a full path on the same website. These two tricks mean your website code doesn't really need to know where it is to provide working URLs.
That said, you might need some tricks so you can have one website on http://www.example.com/dev/site.php and another on http://www.example.com/testing/site.php. You'll need some code to figure out which directory prefix is being used, but you can use a configuration value to do that. By which I mean a value that belongs to that (sub-)site's configuration, not the version-controlled code!
