PHP: How to resolve a relative url

I need a function that given a relative URL and a base returns an absolute URL. I've searched and found many functions that do it different ways.
resolve("../abc.png", "http://example.com/path/thing?foo=bar")
# returns http://example.com/abc.png
Is there a canonical way?
On this site I see great examples for Python and C#; let's get a PHP solution.

Perhaps this article could help?
http://nashruddin.com/PHP_Script_for_Converting_Relative_to_Absolute_URL
Edit: reproduced code below for convenience
<?php
function rel2abs($rel, $base)
{
/* return if already absolute URL */
if (parse_url($rel, PHP_URL_SCHEME) != '' || substr($rel, 0, 2) == '//') return $rel;
/* an empty reference resolves to the base itself */
if ($rel === '') return $base;
/* queries and anchors */
if ($rel[0]=='#' || $rel[0]=='?') return $base.$rel;
/* parse base URL and convert to local variables:
$scheme, $host, $path */
extract(parse_url($base));
/* remove non-directory element from path */
$path = preg_replace('#/[^/]*$#', '', $path);
/* destroy path if relative url points to root */
if ($rel[0] == '/') $path = '';
/* dirty absolute URL */
$abs = "$host$path/$rel";
/* replace '//' or '/./' or '/foo/../' with '/' */
$re = array('#(/\.?/)#', '#/(?!\.\.)[^/]+/\.\./#');
for($n=1; $n>0; $abs=preg_replace($re, '/', $abs, -1, $n)) {}
/* absolute URL is ready! */
return $scheme.'://'.$abs;
}
?>

Another solution in case you already use GuzzleHttp.
This solution is based on an internal method of GuzzleHttp\Client.
use GuzzleHttp\Psr7\UriResolver;
use GuzzleHttp\Psr7\Utils;
function resolve(string $uri, ?string $base_uri): string
{
$uri = Utils::uriFor(trim($uri));
if (isset($base_uri)) {
$uri = UriResolver::resolve(Utils::uriFor(trim($base_uri)), $uri);
}
// optional: set default scheme if missing
$uri = $uri->getScheme() === '' && $uri->getHost() !== '' ? $uri->withScheme('http') : $uri;
return (string)$uri;
}
EDIT: the source code was updated as suggested by myriacl

If you have pecl-http, you can use http://php.net/manual/en/function.http-build-url.php
<?php
$url_parts = parse_url($relative_url);
$absolute = http_build_url($source_url, $url_parts, HTTP_URL_JOIN_PATH);
Ex:
<?php
function getAbsoluteURL($source_url, $relative_url)
{
$url_parts = parse_url($relative_url);
return http_build_url($source_url, $url_parts, HTTP_URL_JOIN_PATH);
}
echo getAbsoluteURL('http://foo.tw/a/b/c', '../pic.jpg') . "\n";
// http://foo.tw/a/pic.jpg
echo getAbsoluteURL('http://foo.tw/a/b/c/', '../pic.jpg') . "\n";
// http://foo.tw/a/b/pic.jpg
echo getAbsoluteURL('http://foo.tw/a/b/c/', 'http://bar.tw/a.js') . "\n";
// http://bar.tw/a.js
echo getAbsoluteURL('http://foo.tw/a/b/c/', '/robots.txt') . "\n";
// http://foo.tw/robots.txt

Other tools are already linked from the page in pguardiario's comment: http://publicmind.in/blog/urltoabsolute/ and https://github.com/monkeysuffrage/phpuri .
I also found another tool in a comment on http://nadeausoftware.com/articles/2008/05/php_tip_how_convert_relative_url_absolute_url :
require_once 'Net/URL2.php';
$base = new Net_URL2('http://example.org/foo.html');
$absolute = (string)$base->resolve('relative.html#bar');

Here is another function that can handle protocol relative urls
<?php
function getAbsoluteURL($to, $from = null) {
$arTarget = parse_url($to);
$arSource = parse_url($from);
$targetPath = isset($arTarget['path']) ? $arTarget['path'] : '';
if (isset($arTarget['host'])) {
if (!isset($arTarget['scheme'])) {
$proto = isset($arSource['scheme']) ? "{$arSource['scheme']}://" : '//';
} else {
$proto = "{$arTarget['scheme']}://";
}
$baseUrl = "{$proto}{$arTarget['host']}" . (isset($arTarget['port']) ? ":{$arTarget['port']}" : '');
} else {
if (isset($arSource['host'])) {
$proto = isset($arSource['scheme']) ? "{$arSource['scheme']}://" : '//';
$baseUrl = "{$proto}{$arSource['host']}" . (isset($arSource['port']) ? ":{$arSource['port']}" : '');
} else {
$baseUrl = '';
}
$arPath = [];
if ((empty($targetPath) || $targetPath[0] !== '/') && !empty($arSource['path'])) {
$arTargetPath = explode('/', $targetPath);
if (empty($arSource['path'])) {
$arPath = [];
} else {
$arPath = explode('/', $arSource['path']);
array_pop($arPath);
}
$len = count($arPath);
foreach ($arTargetPath as $idx => $component) {
if ($component === '..') {
if ($len > 1) {
$len--;
array_pop($arPath);
}
} elseif ($component !== '.') {
$len++;
array_push($arPath, $component);
}
}
$targetPath = implode('/', $arPath);
}
}
return $baseUrl . $targetPath;
}
// SAMPLES
// Absolute path => https://www.google.com/doubleclick/
echo getAbsoluteURL('/doubleclick/', 'https://www.google.com/doubleclick/insights/') . "\n";
// Relative path 1 => https://www.google.com/doubleclick/studio
echo getAbsoluteURL('../studio', 'https://www.google.com/doubleclick/insights/') . "\n";
// Relative path 2 => https://www.google.com/doubleclick/insights/case-studies.html
echo getAbsoluteURL('./case-studies.html', 'https://www.google.com/doubleclick/insights/') . "\n";
// Relative path 3 => https://www.google.com/doubleclick/insights/case-studies.html
echo getAbsoluteURL('case-studies.html', 'https://www.google.com/doubleclick/insights/') . "\n";
// Protocol relative url => https://www.google.com/doubleclick/
echo getAbsoluteURL('//www.google.com/doubleclick/', 'https://www.google.com/doubleclick/insights/') . "\n";
// Empty path => https://www.google.com/doubleclick/insights/
echo getAbsoluteURL('', 'https://www.google.com/doubleclick/insights/') . "\n";
// Different url => http://www.yahoo.com/
echo getAbsoluteURL('http://www.yahoo.com/', 'https://www.google.com') . "\n";

function absoluteUri($Path, $URI)
{ # Requires PHP4 or better.
$URL = parse_url($URI);
$Str = "{$URL['scheme']}://";
if (isset($URL['user']))
$Str .= $URL['user'] . (isset($URL['pass']) ? ":{$URL['pass']}" : '') . '@';
$Str .= $URL['host'];
if (isset($URL['port']))
$Str .= ":{$URL['port']}";
$Str .= realpath($URL['path'] . $Path); # Caution: realpath() resolves against the local filesystem and returns false if the path does not exist there.
if (isset($URL['query']))
$Str .= "?{$URL['query']}";
if (isset($URL['fragment']))
$Str .= "#{$URL['fragment']}";
return $Str;
}
absoluteUri("../abc.png", "http://example.com/path/thing?foo=bar");
# Should return "http://example.com/abc.png?foo=bar" on Linux boxes.

I noticed the upvoted answer above uses RegEx, which can be dangerous when dealing with URLs.
This function will resolve relative URLs to a given current page url in $pgurl without regex. It successfully resolves:
/home.php?example types,
same-dir nextpage.php types,
../...../.../parentdir types,
full http://example.net urls,
and shorthand //example.net urls
//Current base URL (you can dynamically retrieve from $_SERVER)
$pgurl = 'http://example.com/scripts/php/absurl.php';
function absurl($url) {
global $pgurl;
if(strpos($url,'://')) return $url; //already absolute
if(substr($url,0,2)=='//') return 'http:'.$url; //shorthand scheme
if($url[0]=='/') return parse_url($pgurl,PHP_URL_SCHEME).'://'.parse_url($pgurl,PHP_URL_HOST).$url; //just add domain
if(strpos($pgurl,'/',9)===false) $pgurl .= '/'; //add slash to domain if needed
return substr($pgurl,0,strrpos($pgurl,'/')+1).$url; //for relative links, gets current directory and appends new filename
}
function nodots($path) { //Resolve dot dot slashes, no regex!
$arr1 = explode('/',$path);
$arr2 = array();
foreach($arr1 as $seg) {
switch($seg) {
case '.':
break;
case '..':
array_pop($arr2);
break;
case '...':
array_pop($arr2); array_pop($arr2);
break;
case '....':
array_pop($arr2); array_pop($arr2); array_pop($arr2);
break;
case '.....':
array_pop($arr2); array_pop($arr2); array_pop($arr2); array_pop($arr2);
break;
default:
$arr2[] = $seg;
}
}
return implode('/',$arr2);
}
Usage Example:
echo nodots(absurl('../index.html'));
nodots() must be called after the URL is converted to absolute.
The nodots() function is somewhat repetitive, but it is readable, fast, doesn't use regexes, and will resolve 99% of typical URLs (if you want to be 100% sure, just extend the switch block to support 6+ dots, although I've never seen that many dots in a URL).
Hope this helps,
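For comparison, here is a minimal stack-based sketch that treats only "." and ".." as special (as RFC 3986 does), so no case per dot count is needed. The name remove_dot_segments() is made up for illustration, and it assumes it is given an absolute path (for example the 'path' value from parse_url()), not a whole URL:

```php
<?php
// A sketch, not a full RFC 3986 implementation: expects an absolute path such as "/a/b/../c".
function remove_dot_segments(string $path): string
{
    $out = [];
    $segments = explode('/', $path);
    $last = end($segments);
    foreach ($segments as $seg) {
        if ($seg === '.') {
            continue;                 // current directory: skip it
        }
        if ($seg === '..') {
            if (count($out) > 1) {    // never pop the leading empty segment (the root)
                array_pop($out);
            }
            continue;
        }
        $out[] = $seg;
    }
    // A trailing "." or ".." names a directory, so keep a trailing slash.
    if ($last === '.' || $last === '..') {
        $out[] = '';
    }
    return implode('/', $out);
}

echo remove_dot_segments('/a/b/c/./../../g'), "\n";           // /a/g
echo remove_dot_segments('/scripts/php/../index.html'), "\n"; // /scripts/index.html
```

Because it only touches the path, the scheme and host never get popped by accident when there are more ".." segments than directories.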

Related

How I can do replace correctly for Url?

I have a code with switch language on site.
When the URL looks like site.com/ru/rumanya-test, my code also changes rumanya-test to mdmanya-test. How can I prevent this without adding a slash after the language code?
My code:
if ($lang == 'ru') {
$ru_href = 'javascript:void(0);';
$en_href = str_replace("/ru", '/en', $_SERVER['REQUEST_URI']);
$md_href = str_replace("/ru", '/md', $_SERVER['REQUEST_URI']);
$logo_href = '/ru/';
} elseif ($lang == 'en') {
$ru_href = str_replace("/en", '/ru', $_SERVER['REQUEST_URI']);
$en_href = 'javascript:void(0);';
$md_href = str_replace("/en", '/md', $_SERVER['REQUEST_URI']);
$logo_href = '/en/';
} else {
$ru_href = str_replace("/md", '', $_SERVER['REQUEST_URI']);
$ru_href = '/ru' . $ru_href;
$en_href = str_replace("/md", '', $_SERVER['REQUEST_URI']);
$en_href = '/en' . $en_href;
$md_href = 'javascript:void(0);';
$logo_href = '/';
}

You can use preg_replace with its limit argument set to 1, so that only the first occurrence of the match is replaced.
$md_href = preg_replace('#/ru#', '/md', $_SERVER['REQUEST_URI'], 1);
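A slightly stricter variation, as a sketch: anchor the pattern to the start of the path and require that the language code is followed by "/", "?" or the end of the string, so a page name that merely starts with "ru" is never touched. The helper name switch_language() is hypothetical:

```php
<?php
// Hypothetical helper: swap the leading language segment of a request path.
function switch_language(string $uri, string $from, string $to): string
{
    // "^/ru" must be followed by "/", "?" or the end of the string,
    // so "/rumanya-test" on its own is not a match.
    $pattern = '#^/' . preg_quote($from, '#') . '(?=[/?]|$)#';
    return preg_replace($pattern, '/' . $to, $uri, 1);
}

echo switch_language('/ru/rumanya-test', 'ru', 'md'), "\n"; // /md/rumanya-test
echo switch_language('/rumanya-test', 'ru', 'md'), "\n";    // /rumanya-test
```

In the question's code this would be called with $_SERVER['REQUEST_URI'] as the first argument.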

If you are always going to keep the same url format (eg. domain/[language]/page) then there are a couple of options I can think of.
Option 1
Instead of replacing "/ru", replace "/ru/"
Option 2
Split the url by the "/" and replace the 2nd element with the required language.
$url = rtrim("site.com/ru/rumanya-test", '/') . '/'; // Adds a slash at the end of the url if it doesn't already exist
$parts = explode("/", $url);
// If the URL contains language information
if(count($parts) > 1) {
$newURL = $parts[0]."/en/".$parts[2];
}
// If it doesn't contain language information (ie. on the root domain)
else {
// Your logic here
}
Note: This assumes that the language will always be the second element of your url.
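Option 2 could also be applied to the path alone (what $_SERVER['REQUEST_URI'] gives you, without the domain). The sketch below assumes a fixed list of language codes and that a path without a language prefix should simply get one; the function name and the list are made up for illustration:

```php
<?php
// Hypothetical helper; the default list of language codes is an assumption.
function replace_language_segment(string $path, string $newLang, array $langs = ['ru', 'en', 'md']): string
{
    $parts = explode('/', ltrim($path, '/'));
    if (in_array($parts[0], $langs, true)) {
        $parts[0] = $newLang;            // path already starts with a language code
    } else {
        array_unshift($parts, $newLang); // no language prefix: add one
    }
    return '/' . implode('/', $parts);
}

echo replace_language_segment('/ru/rumanya-test', 'en'), "\n"; // /en/rumanya-test
echo replace_language_segment('/rumanya-test', 'en'), "\n";    // /en/rumanya-test
```

Only the first segment is ever compared against the language list, so "rumanya-test" is left alone.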

use parse_url to get entire url instead of part of it

I was trying to make this function more comprehensive to parse more of a url
Currently the function I have is this
function _pagepeeker_format_url($url = FALSE) {
if (filter_var($url, FILTER_VALIDATE_URL) === FALSE) {
return FALSE;
}
// try to parse the url
$parsed_url = parse_url($url);
if (!empty($parsed_url)) {
$host = (!empty($parsed_url['host'])) ? $parsed_url['host'] : '';
$port = (!empty($parsed_url['port'])) ? ':' . $parsed_url['port'] : '';
$path = (!empty($parsed_url['path'])) ? $parsed_url['path'] : '';
$query = (!empty($parsed_url['query'])) ? '?' . $parsed_url['query'] : '';
$fragment = (!empty($parsed_url['fragment'])) ? '#' . $parsed_url['fragment'] : '';
return $host . $port . $path . $query . $fragment;
}
return FALSE;
}
This function turns urls that look like this
http://www.google.com/url?sa=X&q=http://www.beautyjunkiesunite.com/WP/2012/05/30/whats-new-anastasia-beverly-hills-lash-genius/&ct=ga&cad=CAcQARgAIAEoATAAOABA3t-Y_gRIAlgBYgVlbi1VUw&cd=F7w9TwL-6ao&usg=AFQjCNG2rbJCENvRR2_k6pL9RntjP66Rvg
into this
http://www.google.com/url
Is there any way to make this function return the entire URL instead of just part of it?
I have looked at the parse_url page in the PHP manual, which helps, and searched Stack Overflow and found a couple of things, but I am having a bit of trouble grasping the next step here.
Let me know if I can clarify in any way
thanks!!

return $url;
Or am I missing something?
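If the goal is to take the URL apart, change something, and put it back together, note that the function in the question never reads $parsed_url['scheme'], which is why the scheme goes missing. A rough sketch of the reverse of parse_url() follows; build_url() is a made-up name, not a built-in:

```php
<?php
// Hypothetical helper: reassemble the array returned by parse_url().
function build_url(array $p): string
{
    $url = '';
    if (isset($p['scheme'])) $url .= $p['scheme'] . '://';
    if (isset($p['user'])) {
        $url .= $p['user'];
        if (isset($p['pass'])) $url .= ':' . $p['pass'];
        $url .= '@';
    }
    $url .= $p['host'] ?? '';
    if (isset($p['port'])) $url .= ':' . $p['port'];
    $url .= $p['path'] ?? '';
    if (isset($p['query'])) $url .= '?' . $p['query'];
    if (isset($p['fragment'])) $url .= '#' . $p['fragment'];
    return $url;
}

$url = 'http://www.example.com:8080/path/page.php?a=1&b=2#top';
var_dump(build_url(parse_url($url)) === $url); // bool(true)
```

It uses isset() rather than !empty() so that components such as a query of "0" are not silently dropped.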

this is what i use (getting rid of parse_url and such):
function get_full_url() {
// check SSL
$ssl = "";
if ((isset($_SERVER["HTTPS"]) && $_SERVER["HTTPS"]=="on") || (isset($_SERVER["SERVER_PORT"]) && $_SERVER["SERVER_PORT"]=="443"))
{ $ssl = "s"; }
$serverport = ($_SERVER["SERVER_PORT"]!="80"?":".$_SERVER["SERVER_PORT"]:"");
return "http".$ssl."://".$_SERVER["SERVER_NAME"].$serverport.$_SERVER["REQUEST_URI"];
}
just call get_full_url(); from anywhere in your script.

Auto-link in a string not working for short link

I need a way to auto-link URLs that users enter in text fields. I found an example on Stack Overflow which works perfectly, except for one thing: if the user enters a link without http:// or https:// and instead starts it with www., the link does not work properly.
ie a user input would be
check out our twitter!
www.twitter.com/#!/profile
and our facebook!
https://www.facebook.com/profile
the output would be
check out our twitter!
www.twitter.com/#!/profile
and our facebook!
http://www.facebook.com/profile
So the Facebook link works perfectly, but the Twitter one does not, because it is treated as relative to the page the user is currently on; i.e., if they are on www.example.com, the link becomes www.example.com/www.twitter.com/#!/profile
For the life of me, I can't figure out how to fix this by simply adding http:// to the beginning of the link. This is the function:
function auto_link_text($text) {
$pattern = '#\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))#';
return preg_replace_callback($pattern, 'auto_link_text_callback', $text);
}
function auto_link_text_callback($matches) {
$max_url_length = 50;
$max_depth_if_over_length = 2;
$ellipsis = '…';
$url_full = $matches[0];
$url_short = '';
if (strlen($url_full) > $max_url_length) {
$parts = parse_url($url_full);
$url_short = $parts['scheme'] . '://' . preg_replace('/^www\./', '', $parts['host']) . '/';
$path_components = explode('/', trim($parts['path'], '/'));
foreach ($path_components as $dir) {
$url_string_components[] = $dir . '/';
}
if (!empty($parts['query'])) {
$url_string_components[] = '?' . $parts['query'];
}
if (!empty($parts['fragment'])) {
$url_string_components[] = '#' . $parts['fragment'];
}
for ($k = 0; $k < count($url_string_components); $k++) {
$curr_component = $url_string_components[$k];
if ($k >= $max_depth_if_over_length || strlen($url_short) + strlen($curr_component) > $max_url_length) {
if ($k == 0 && strlen($url_short) < $max_url_length) {
// Always show a portion of first directory
$url_short .= substr($curr_component, 0, $max_url_length - strlen($url_short));
}
$url_short .= $ellipsis;
break;
}
$url_short .= $curr_component;
}
} else {
$url_short = $url_full;
}
return "<a rel=\"nofollow\" href=\"$url_full\">$url_short</a>";
}

Use the strpos function.
If the input already contains "http://" (or "https://"), use it as is. Otherwise, prepend "http://" before building the link.
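As a minimal sketch of that idea (ensure_scheme() is a hypothetical helper, and defaulting to "http" is an assumption):

```php
<?php
// Hypothetical helper: make a matched link usable as an href.
function ensure_scheme(string $url, string $default = 'http'): string
{
    // strpos() returns false when "://" is absent, so compare strictly.
    if (strpos($url, '://') === false) {
        return $default . '://' . $url;
    }
    return $url;
}

echo ensure_scheme('www.twitter.com/#!/profile'), "\n";       // http://www.twitter.com/#!/profile
echo ensure_scheme('https://www.facebook.com/profile'), "\n"; // https://www.facebook.com/profile
```

In the callback from the question, this would probably go on the first assignment, i.e. $url_full = ensure_scheme($matches[0]);, so that the later parse_url() call also sees a scheme and a host.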

jquery address crawling - logic issues

I'm using the jquery address plugin to build an ajax driven site, and i've got it working! Yay! For the purposes of this question we can use the test site:
http://www.asual.com/jquery/address/samples/crawling
http://www.asual.com/download/jquery/address
(I had to remove two calls to urlencode() to make the crawling example work.)
I'm encountering a problem with the $crawling->nav() call. It basically uses js and php to load parts of an xml file into the dom. I (mostly) understand how it works, and I would like to modify the example code to include sub pages.
For example, I would like to show 'subnav-project.html' at '/!#/project' and '/!#/project/blue', but not at '/!#/contact'. To do this, I figure php should 'know' what page the user is on, that way I can base my logic off of that.
Is this crazy? Can php ever know the current state of the site if I'm building it this way? If not, how does one selectively load html snippets, or modify what links are shown in navigation menus?
I've never gotten too crazy with ajax before, so any feedback at all would be helpful.
EDIT
This is the crawling class.
class Crawling {
const fragment = '_escaped_fragment_';
function Crawling(){
// Initializes the fragment value
$fragment = (!isset($_REQUEST[self::fragment]) || $_REQUEST[self::fragment] == '') ? '/' : $_REQUEST[self::fragment];
// Parses parameters if any
$this->parameters = array();
$arr = explode('?', $fragment);
if (count($arr) > 1) {
parse_str($arr[1], $this->parameters);
}
// Adds support for both /name and /?page=name
if (isset($this->parameters['page'])) {
$this->page = '/?page=' . $this->parameters['page'];
} else {
$this->page = $arr[0];
}
// Loads the data file
$this->doc = new DOMDocument();
$this->doc->load('data.xml');
$this->xp = new DOMXPath($this->doc);
$this->nodes = $this->xp->query('/data/page');
$this->node = $this->xp->query('/data/page[@href="' . $this->page . '"]')->item(0);
if (!isset($this->node)) {
header("HTTP/1.0 404 Not Found");
}
}
function base() {
$arr = explode('?', $_SERVER['REQUEST_URI']);
return $arr[0] != '/' ? preg_replace('/\/$/', '', $arr[0]) : $arr[0];
}
function title() {
if (isset($this->node)) {
$title = $this->node->getAttribute('title');
} else {
$title = 'Page not found';
}
echo($title);
}
function nav() {
$str = '';
// Prepares the navigation links
foreach ($this->nodes as $node) {
$href = $node->getAttribute('href');
$title = $node->getAttribute('title');
$str .= '<li><a href="' . $this->base() . ($href == '/' ? '' : '?' . self::fragment . '=' .html_entity_decode($href)) . '"'
. ($this->page == $href ? ' class="selected"' : '') . '>'
. $title . '</a></li>';
}
echo($str);
}
function content() {
$str = '';
// Prepares the content with support for a simple "More..." link
if (isset($this->node)) {
foreach ($this->node->childNodes as $node) {
if (!isset($this->parameters['more']) && $node->nodeType == XML_COMMENT_NODE && $node->nodeValue == ' page break ') {
$str .= '<p><a href="' . $this->page .
(count($this->parameters) == 0 ? '?' : '&') . 'more=true' . '">More...</a></p>';
break;
} else {
$str .= $this->doc->saveXML($node);
}
}
} else {
$str .= '<p>Page not found.</p>';
}
echo(preg_replace_callback('/href="(\/[^"]+|\/)"/', array(get_class($this), 'callback'), $str));
}
private function callback($m) {
return 'href="' . ($m[1] == '/' ? $this->base() : ($this->base() . '?' . self::fragment . '=' .$m[1])) . '"';
}
}
$crawling = new Crawling();

You won't be able to make server-side decisions using the fragment-identifier (i.e., everything to the right of the # character). This is because browsers don't send fragment-identifiers to the server. If you're going to want to make server-side decisions, you'll need to use some JavaScript assistance (including AJAX) to communicate what the current fragment-identifier is.
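Once the client does pass the current state along (the class in the question already reads it from the _escaped_fragment_ request parameter, and an AJAX call could send the same value), the server-side decision itself is simple. A sketch, where subnav_for() and the section-to-file mapping are assumptions made up for illustration:

```php
<?php
// Hypothetical: $page is the state value sent by the client, e.g. "/project/blue".
function subnav_for(string $page): ?string
{
    // The first path segment decides the section: "/project/blue" -> "project".
    $segments = explode('/', trim($page, '/'));
    $section = $segments[0];
    // Assumed mapping from section to snippet file; only "project" has a subnav here.
    $map = ['project' => 'subnav-project.html'];
    return $map[$section] ?? null;
}

var_dump(subnav_for('/project/blue')); // string(19) "subnav-project.html"
var_dump(subnav_for('/contact'));      // NULL
```

The returned file name could then be included, or skipped when the result is null.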

Convert URI to URL

How to convert an URI to URL if I know the current site path?
Consider these examples:
Current path is: http://www.site.com/aa/folder/page1.php
Uri: folder2/page.php
Uri: /folder2/page.php
And what if the current path is:
http://www.site.com/aa/folder/
or
http://www.site.com/aa/folder
What will the URLs look like then?
I know this should be easy and obvious, but I can't find a complete answer anywhere (and yes, I did search on Google).

Here is a block of code that has the function that you need:
http://ca.php.net/manual/en/function.parse-url.php#76682
Edit: The above linked function modified with an example
<?php
var_dump(resolve_url('http://www.site.com/aa/folder/page1.php','folder2/page.php?x=y&z=a'));
var_dump(resolve_url('http://www.site.com/aa/folder/page1.php','/folder2/page2.php'));
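function unparse_url($components) {
$url = $components['scheme'].'://'.$components['host'];
if (isset($components['port'])) $url .= ':'.$components['port'];
$url .= isset($components['path']) ? $components['path'] : '/';
if (isset($components['query'])) $url .= '?'.$components['query'];
if (isset($components['fragment'])) $url .= '#'.$components['fragment'];
return $url;
}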
/**
* Resolve a URL relative to a base path. This happens to work with POSIX
* filenames as well. This is based on RFC 2396 section 5.2.
*/
function resolve_url($base, $url) {
if (!strlen($base)) return $url;
// Step 2
if (!strlen($url)) return $base;
// Step 3
if (preg_match('!^[a-z]+:!i', $url)) return $url;
$base = parse_url($base);
if ($url[0] == "#") {
// Step 2 (fragment)
$base['fragment'] = substr($url, 1);
return unparse_url($base);
}
unset($base['fragment']);
unset($base['query']);
if (substr($url, 0, 2) == "//") {
// Step 4: network-path reference, inherit only the scheme
return $base['scheme'].':'.$url;
} else if ($url[0] == "/") {
// Step 5
$base['path'] = $url;
} else {
// Step 6
$path = explode('/', $base['path']);
$url_path = explode('/', $url);
// Step 6a: drop file from base
array_pop($path);
// Step 6b, 6c, 6e: append url while removing "." and ".." from
// the directory portion
$end = array_pop($url_path);
foreach ($url_path as $segment) {
if ($segment == '.') {
// skip
} else if ($segment == '..' && $path && $path[sizeof($path)-1] != '..') {
array_pop($path);
} else {
$path[] = $segment;
}
}
// Step 6d, 6f: remove "." and ".." from file portion
if ($end == '.') {
$path[] = '';
} else if ($end == '..' && $path && $path[sizeof($path)-1] != '..') {
$path[sizeof($path)-1] = '';
} else {
$path[] = $end;
}
// Step 6h
$base['path'] = join('/', $path);
}
// Step 7
return unparse_url($base);
}
?>
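To answer the trailing-slash part of the question directly: for a relative reference, everything after the last "/" of the base path is dropped before the reference is appended. A small sketch of just that merge step, assuming the base path is absolute (starts with "/"); merge_paths() is a made-up name:

```php
<?php
// Hypothetical helper illustrating the "merge" step; assumes $basePath starts with "/".
function merge_paths(string $basePath, string $relative): string
{
    if ($relative !== '' && $relative[0] === '/') {
        return $relative;   // root-relative reference: replaces the base path entirely
    }
    // Keep everything up to and including the last "/" of the base path.
    $dir = substr($basePath, 0, strrpos($basePath, '/') + 1);
    return $dir . $relative;
}

echo merge_paths('/aa/folder/page1.php', 'folder2/page.php'), "\n"; // /aa/folder/folder2/page.php
echo merge_paths('/aa/folder/', 'folder2/page.php'), "\n";          // /aa/folder/folder2/page.php
echo merge_paths('/aa/folder', 'folder2/page.php'), "\n";           // /aa/folder2/page.php
echo merge_paths('/aa/folder', '/folder2/page.php'), "\n";          // /folder2/page.php
```

So http://www.site.com/aa/folder (no trailing slash) is treated as a file named "folder" inside /aa/, which is why the result differs from the trailing-slash case.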

The $_SERVER superglobal will have the information you're looking for, namely $_SERVER['REQUEST_URI'] and $_SERVER['SERVER_NAME']. $_SERVER['QUERY_STRING'] might also be useful.
Please see:
http://php.net/manual/en/reserved.variables.server.php

PHP has pathinfo(), realpath(), parse_url() and other filesystem and URL path functions. Used together with info from the $_SERVER superglobal (as mentioned by andre), you should be able to do what you need.

$uri = "http://www.site.com/aa/folder/";
$url = explode("/", $uri);
$url = $url[2];
echo $url; //www.site.com
Is this what you are looking for?

If you install PECL pecl_http, you can make use of http_build_url:
http_build_url("http://www.site.com/aa/folder/page1.php",
array("path" => "folder2/page.php"));
Pass any of your relative URIs as the path, and the function takes care of building the resulting URL.
