Convert URI to URL - php

How do I convert a URI to a URL if I know the current site path?
Consider these examples:
Current path is: http://www.site.com/aa/folder/page1.php
Uri: folder2/page.php
Uri: /folder2/page.php
And what if the current path is:
http://www.site.com/aa/folder/
or
http://www.site.com/aa/folder
What will the URLs look like then?
I know this should be easy and obvious, but I can't find a complete answer anywhere (and yes, I did search on Google).

Here is a block of code that has the function that you need:
http://ca.php.net/manual/en/function.parse-url.php#76682
Edit: The above linked function modified with an example
<?php
var_dump(resolve_url('http://www.site.com/aa/folder/page1.php','folder2/page.php?x=y&z=a'));
var_dump(resolve_url('http://www.site.com/aa/folder/page1.php','/folder2/page2.php'));
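function unparse_url($components) {
$url = $components['scheme'].'://'.$components['host'].$components['path'];
// Keep the query and fragment when they are present
if (isset($components['query'])) $url .= '?'.$components['query'];
if (isset($components['fragment'])) $url .= '#'.$components['fragment'];
return $url;
}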
/**
* Resolve a URL relative to a base path. This happens to work with POSIX
* filenames as well. This is based on RFC 2396 section 5.2.
*/
function resolve_url($base, $url) {
if (!strlen($base)) return $url;
// Step 2
if (!strlen($url)) return $base;
// Step 3
if (preg_match('!^[a-z]+:!i', $url)) return $url;
$base = parse_url($base);
if ($url[0] == "#") {
// Step 2 (fragment)
$base['fragment'] = substr($url, 1);
return unparse_url($base);
}
unset($base['fragment']);
unset($base['query']);
if (substr($url, 0, 2) == "//") {
// Step 4: network-path reference, keep only the base scheme
return $base['scheme'].':'.$url;
} else if ($url[0] == "/") {
// Step 5
$base['path'] = $url;
} else {
// Step 6
$path = explode('/', $base['path']);
$url_path = explode('/', $url);
// Step 6a: drop file from base
array_pop($path);
// Step 6b, 6c, 6e: append url while removing "." and ".." from
// the directory portion
$end = array_pop($url_path);
foreach ($url_path as $segment) {
if ($segment == '.') {
// skip
} else if ($segment == '..' && $path && $path[sizeof($path)-1] != '..') {
array_pop($path);
} else {
$path[] = $segment;
}
}
// Step 6d, 6f: remove "." and ".." from file portion
if ($end == '.') {
$path[] = '';
} else if ($end == '..' && $path && $path[sizeof($path)-1] != '..') {
$path[sizeof($path)-1] = '';
} else {
$path[] = $end;
}
// Step 6h
$base['path'] = join('/', $path);
}
// Step 7
return unparse_url($base);
}
?>
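For the three base paths in the question, the only thing that changes is what counts as the base directory: everything up to and including the last slash of the base path. Here is a minimal sketch of that rule. The helper names `base_directory` and `join_relative` are hypothetical (they are not part of the linked function), the base is assumed to be an absolute URL, and `.`/`..` segments and ports are ignored:

```php
<?php
// Hypothetical helper: returns the directory part of a base URL's path,
// i.e. everything up to and including the last "/".
function base_directory(string $base): string
{
    $path = parse_url($base, PHP_URL_PATH);
    if (!is_string($path) || $path === '') {
        $path = '/';
    }
    return substr($path, 0, strrpos($path, '/') + 1);
}

// Hypothetical helper: joins a relative reference onto an absolute base URL.
// Only handles "folder/page.php" and "/folder/page.php" style references.
function join_relative(string $base, string $rel): string
{
    $origin = parse_url($base, PHP_URL_SCHEME) . '://' . parse_url($base, PHP_URL_HOST);
    if ($rel !== '' && $rel[0] === '/') {
        // Root-relative: replaces the whole base path
        return $origin . $rel;
    }
    return $origin . base_directory($base) . $rel;
}

echo join_relative('http://www.site.com/aa/folder/page1.php', 'folder2/page.php'), "\n";
// http://www.site.com/aa/folder/folder2/page.php
echo join_relative('http://www.site.com/aa/folder/', 'folder2/page.php'), "\n";
// http://www.site.com/aa/folder/folder2/page.php
echo join_relative('http://www.site.com/aa/folder', 'folder2/page.php'), "\n";
// http://www.site.com/aa/folder2/page.php
echo join_relative('http://www.site.com/aa/folder', '/folder2/page.php'), "\n";
// http://www.site.com/folder2/page.php
```

Note that without the trailing slash, "folder" is treated as a file name and dropped, which is the same thing a browser does.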

The $_SERVER superglobal will have the information you're looking for, namely $_SERVER['REQUEST_URI'] and $_SERVER['SERVER_NAME']. $_SERVER['QUERY_STRING'] might also be useful.
Please see:
http://php.net/manual/en/reserved.variables.server.php
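As a rough illustration of how those $_SERVER entries fit together, here is a sketch that rebuilds the current URL. The function name `current_url` is hypothetical, and it assumes the server sets `HTTPS` to a non-empty value other than "off" for secure requests; it takes the array as a parameter so it can be tried without a web server:

```php
<?php
// Hypothetical helper: builds the current absolute URL from a $_SERVER-like array.
function current_url(array $server): string
{
    // Assumption: HTTPS is set and not "off" when the request is secure
    $https  = !empty($server['HTTPS']) && $server['HTTPS'] !== 'off';
    $scheme = $https ? 'https' : 'http';
    $host   = $server['SERVER_NAME'] ?? 'localhost';
    // REQUEST_URI already contains the path and the query string
    $uri    = $server['REQUEST_URI'] ?? '/';
    return $scheme . '://' . $host . $uri;
}

// In a real request you would call current_url($_SERVER).
echo current_url([
    'SERVER_NAME' => 'www.site.com',
    'REQUEST_URI' => '/aa/folder/page1.php?x=1',
]), "\n";
// http://www.site.com/aa/folder/page1.php?x=1
```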

PHP has pathinfo(), realpath(), parse_url() and other filesystem and URL path functions. Used together with the information from the $_SERVER superglobal (as mentioned by andre), they should let you do what you need.

$uri = "http://www.site.com/aa/folder/";
$url = explode("/", $uri);
$url = $url[2];
echo $url; //www.site.com
Is this what you are looking for?
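If only the host is needed, the same result can be obtained with parse_url() instead of counting slashes. A small sketch, using the same example URL:

```php
<?php
$uri = "http://www.site.com/aa/folder/";
// PHP_URL_HOST asks parse_url() for just the host component
$host = parse_url($uri, PHP_URL_HOST);
echo $host, "\n"; // www.site.com
```

This also keeps working when the URL contains a port or user info, where splitting on "/" would return those parts along with the host.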

If you install the PECL extension pecl_http, you can make use of http_build_url:
http_build_url("http://www.site.com/aa/folder/page1.php",
array("path" => "folder2/page.php"),
HTTP_URL_JOIN_PATH);
Pass any of your relative URIs as the path. With the HTTP_URL_JOIN_PATH flag, the function joins the relative path onto the base path instead of replacing it.

Related

How can I replace the language correctly in a URL?

I have code that switches the language on a site.
When the URL looks like site.com/ru/rumanya-test, my code also turns rumanya-test into mdmanya-test. How can I prevent this without adding a slash after the language?
My code:
if ($lang == 'ru') {
$ru_href = 'javascript:void(0);';
$en_href = str_replace("/ru", '/en', $_SERVER['REQUEST_URI']);
$md_href = str_replace("/ru", '/md', $_SERVER['REQUEST_URI']);
$logo_href = '/ru/';
} elseif ($lang == 'en') {
$ru_href = str_replace("/en", '/ru', $_SERVER['REQUEST_URI']);
$en_href = 'javascript:void(0);';
$md_href = str_replace("/en", '/md', $_SERVER['REQUEST_URI']);
$logo_href = '/en/';
} else {
$ru_href = str_replace("/md", '', $_SERVER['REQUEST_URI']);
$ru_href = '/ru' . $ru_href;
$en_href = str_replace("/md", '', $_SERVER['REQUEST_URI']);
$en_href = '/en' . $en_href;
$md_href = 'javascript:void(0);';
$logo_href = '/';
}
You can use preg_replace with a limit of 1, so that only the first occurrence of the match is replaced.
$md_href = preg_replace('#/ru#', '/md', $_SERVER['REQUEST_URI'], 1);
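A slightly stricter variant anchors the pattern to the start of the path and requires the language code to be a whole segment, so a page slug that merely begins with "ru" is left alone. This is only a sketch: the function name `switch_language` and the default list of language codes are assumptions, not part of the original code.

```php
<?php
// Hypothetical helper: swaps (or adds) the leading language segment of a request URI.
function switch_language(string $requestUri, string $newLang, array $langs = ['ru', 'en', 'md']): string
{
    $alternatives = implode('|', array_map('preg_quote', $langs));
    // ^ anchors to the start; the lookahead requires "/", "?" or end of string
    // after the code, so "/rumanya-test" does not count as "/ru".
    $pattern = '#^/(?:' . $alternatives . ')(?=[/?]|$)#';
    if (preg_match($pattern, $requestUri)) {
        return preg_replace($pattern, '/' . $newLang, $requestUri, 1);
    }
    // No language prefix present: add one
    return '/' . $newLang . $requestUri;
}

echo switch_language('/ru/rumanya-test', 'md'), "\n"; // /md/rumanya-test
echo switch_language('/rumanya-test', 'en'), "\n";    // /en/rumanya-test
```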
If you are always going to keep the same url format (eg. domain/[language]/page) then there are a couple of options I can think of.
Option 1
Instead of replacing "/ru", replace "/ru/"
Option 2
Split the url by the "/" and replace the 2nd element with the required language.
$url = rtrim("site.com/ru/rumanya-test", '/') . '/'; // Adds a slash at the end of the url if it doesn't already exist
$parts = explode("/", $url);
// If the URL contains language information
if(count($parts) > 1) {
$newURL = $parts[0]."/en/".$parts[2];
}
// If it doesn't contain language information (ie. on the root domain)
else {
// Your logic here
}
Note: This assumes that the language will always be the second element of your url.

PHP - Get Website Title From User Site Input

I'm trying to get the title of a website that is entered by the user.
Text input: website link, entered by user is sent to the server via AJAX.
The user can input anything: an actual existing link, or just single word, or something weird like 'po392#*#8'
Here is a part of my PHP script:
// Make sure the url is on another host
if(substr($url, 0, 7) !== "http://" AND substr($url, 0, 8) !== "https://") {
$url = "http://".$url;
}
// Extra confirmation for security
if (filter_var($url, FILTER_VALIDATE_URL, FILTER_FLAG_HOST_REQUIRED)) {
$urlIsValid = "1";
} else {
$urlIsValid = "0";
}
// Make sure there is a dot in the url
if (strpos($url, '.') !== false) {
$urlIsValid = "1";
} else {
$urlIsValid = "0";
}
// Retrieve title if no title is entered
if($title == "" AND $urlIsValid == "1") {
function get_http_response_code($theURL) {
$headers = get_headers($theURL);
if($headers) {
return substr($headers[0], 9, 3);
} else {
return 'error';
}
}
if(get_http_response_code($url) != "200") {
$urlIsValid = "0";
} else {
$file = file_get_contents($url);
$res = preg_match("/<title>(.*)<\/title>/siU", $file, $title_matches);
if($res === 1) {
$title = preg_replace('/\s+/', ' ', $title_matches[1]);
$title = trim($title);
$title = addslashes($title);
}
// If title is still empty, make title the url
if($title == "") {
$title = $url;
}
}
}
However, there are still errors occurring in this script.
It works perfectly if an existing URL such as 'https://www.youtube.com/watch?v=eB1HfI-nIRg' is entered, and when a non-existing page such as 'https://www.youtube.com/watch?v=NON-EXISTING' is entered, but it doesn't work when the user enters something like 'twitter.com' (without http) or something like 'yikes'.
I tried literally everything: cURL, DOMDocument...
The problem is that when an invalid link is entered, the AJAX call never completes (it keeps loading), while it should set $urlIsValid = "0" whenever an error occurs.
I hope someone can help me - it's appreciated.
Nathan
You have a relatively simple problem but your solution is too complex and also buggy.
These are the problems that I've identified with your code:
// Make sure the url is on another host
if(substr($url, 0, 7) !== "http://" AND substr($url, 0, 8) !== "https://") {
$url = "http://".$url;
}
This check does not ensure that the URL is on another host (it could still be localhost). You should remove this code.
// Make sure there is a dot in the url
if (strpos($url, '.') !== false) {
$urlIsValid = "1";
} else {
$urlIsValid = "0";
}
This code overwrites the result of the check above it, where you validate that the string is indeed a valid URL, so remove it.
The definition of the additional function get_http_response_code is pointless. You could use only file_get_contents to get the HTML of the remote page and check it against false to detect the error.
Also, from your code I conclude that, if the (external to context) variable $title is empty then you won't execute any external fetch so why not check it first?
To sum it up, your code should look something like this:
if('' === $title && filter_var($url, FILTER_VALIDATE_URL))
{
// @ means we suppress warnings, as we won't need them
// (this could also be done with error_reporting(0) or a similar side-effect method)
$html = getContentsFromUrl($url);
if(false !== $html && preg_match("/<title>(.*)<\/title>/siU", $html, $title_matches))
{
$title = preg_replace('/\s+/', ' ', $title_matches[1]);
$title = trim($title);
$title = addslashes($title);
}
// If title is still empty, make title the url
if($title == "") {
$title = $url;
}
}
function getContentsFromUrl($url)
{
//if not full/complete url
if(!preg_match('#^https?://#ims', $url))
{
$completeUrl = 'http://' . $url;
$result = @file_get_contents($completeUrl);
if(false !== $result)
{
return $result;
}
//we try with https://
$url = 'https://' . $url;
}
return @file_get_contents($url);
}
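Since the reported symptom is a request that never completes, it may also help to reject obviously bad input before any network call and to put a read timeout on the fetch. The sketch below is one way to do that. The names `normalize_url` and `fetch_html` are hypothetical, the 5-second default is arbitrary, and requiring a dot in the host is an assumption that deliberately rejects hosts such as localhost:

```php
<?php
// Hypothetical helper: returns a usable http(s) URL, or null if the input is not one.
function normalize_url(string $input): ?string
{
    $url = trim($input);
    // Add a scheme when the user typed something like "twitter.com"
    if (!preg_match('#^https?://#i', $url)) {
        $url = 'http://' . $url;
    }
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return null;
    }
    // Assumption: a real site has a dot in its host name
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || strpos($host, '.') === false) {
        return null;
    }
    return $url;
}

// Hypothetical helper: fetches a page with a read timeout; returns false on failure.
function fetch_html(string $url, int $timeoutSeconds = 5)
{
    $context = stream_context_create([
        'http' => ['timeout' => $timeoutSeconds],
    ]);
    return @file_get_contents($url, false, $context);
}

var_dump(normalize_url('twitter.com')); // string(18) "http://twitter.com"
var_dump(normalize_url('yikes'));       // NULL
```

A caller would only call fetch_html() when normalize_url() returns a string, and treat both null and false as an invalid URL.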

Replace underscores with dashes in CodeIgniter URL structure

I have a website developed with CodeIgniter. My URL structure has around 4 segments and over 500 pages. Currently my URLs contain underscores (_), but I need to replace them with dashes (-).
Current structure:
mywebsite.com/app/standalone_apps/high_runner_beta/download
Required URL structure:
mywebsite.com/app/standalone-apps/high-runner-beta/download
I can't use the routing feature in CI because there are too many links that need to be redirected.
I did try the solutions in
Codeigniter Routes regex - using dashes in controller/method names
but they didn't work.
Can someone suggest a method to replace the underscores with dashes in my URLs?
This works for me: create a PHP file named MY_Router.php
with this code and upload it to the 'application/core' directory:
<?php class MY_Router extends CI_Router {
function set_class($class)
{
$this->class = str_replace('-', '_', $class);
}
function set_method($method)
{
$this->method = str_replace('-', '_', $method);
}
function set_directory($dir) {
$this->directory = $dir.'/';
}
function _validate_request($segments)
{
if (count($segments) == 0)
{
return $segments;
}
// Does the requested controller exist in the root folder?
if (file_exists(APPPATH.'controllers/'.str_replace('-', '_', $segments[0]).'.php'))
{
return $segments;
}
// Is the controller in a sub-folder?
if (is_dir(APPPATH.'controllers/'.$segments[0]))
{
// Set the directory and remove it from the segment array
$this->set_directory($segments[0]);
$segments = array_slice($segments, 1);
while(count($segments) > 0 && is_dir(APPPATH.'controllers/'.$this->directory.$segments[0]))
{
// Set the directory and remove it from the segment array
$this->set_directory($this->directory . $segments[0]);
$segments = array_slice($segments, 1);
}
if (count($segments) > 0)
{
// Does the requested controller exist in the sub-folder?
if ( ! file_exists(APPPATH.'controllers/'.$this->fetch_directory().str_replace('-', '_', $segments[0]).'.php'))
{
if ( ! empty($this->routes['404_override']))
{
$x = explode('/', $this->routes['404_override']);
$this->set_directory('');
$this->set_class($x[0]);
$this->set_method(isset($x[1]) ? $x[1] : 'index');
return $x;
}
else
{
show_404($this->fetch_directory().$segments[0]);
}
}
}
else
{
// Is the method being specified in the route?
if (strpos($this->default_controller, '/') !== FALSE)
{
$x = explode('/', $this->default_controller);
$this->set_class($x[0]);
$this->set_method($x[1]);
}
else
{
$this->set_class($this->default_controller);
$this->set_method('index');
}
// Does the default controller exist in the sub-folder?
if ( ! file_exists(APPPATH.'controllers/'.$this->fetch_directory().$this->default_controller.'.php'))
{
$this->directory = '';
return array();
}
}
return $segments;
}
// If we've gotten this far it means that the URI does not correlate to a valid
// controller class. We will now see if there is an override
if ( ! empty($this->routes['404_override']))
{
$x = explode('/', $this->routes['404_override']);
$this->set_class($x[0]);
$this->set_method(isset($x[1]) ? $x[1] : 'index');
return $x;
}
// Nothing else to do at this point but show a 404
show_404($segments[0]);
} }
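The core idea of the router above is a plain dash-to-underscore mapping applied before PHP identifiers are looked up. The standalone sketch below shows that mapping outside CodeIgniter. The function name `dashed_to_identifiers` is hypothetical, and unlike the router (which converts only the class and method names) it converts every segment, which is a simplification:

```php
<?php
// Hypothetical standalone helper, independent of CodeIgniter:
// splits a URI path into segments and maps dashes to underscores.
function dashed_to_identifiers(string $uriPath): array
{
    // Drop empty segments caused by leading, trailing, or doubled slashes
    $segments = array_values(array_filter(explode('/', trim($uriPath, '/')), 'strlen'));
    return array_map(function (string $segment): string {
        return str_replace('-', '_', $segment);
    }, $segments);
}

print_r(dashed_to_identifiers('app/standalone-apps/high-runner-beta/download'));
// app, standalone_apps, high_runner_beta, download
```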

Simplifying PHP if's and arrays

I have a piece of code I just wrote that detects if there is a user logged in and if [1] and [2] have any specific text in the string and then will relocate that person to another page if the values are met.
But I think my code is a little long winded. Is there a way to simplify what I have or is this the best I'll get?
if (!isset($_SESSION['user_id'])){
$dir = dirname($_SERVER['PHP_SELF']);
$dirs = explode('/', $dir);
if(isset($dirs[1])){
if (($dirs[1] == "account") || ($dirs[1] == "admin")){
header('Location: /');
}
}
if(isset($dirs[2])){
if(($dirs[2] == "account")){
header('Location: /');
}
}
}
Thanks in advance
A simple way is to use a closure:
$dir = explode('/', dirname($_SERVER['PHP_SELF']));
$is = function($pos, $check) use($dir) {
return array_key_exists($pos, $dir) && $dir[$pos] == $check;
};
if($is(1, 'account')
|| $is(1, 'admin')
|| $is(2, 'account')) {
header('Location: /');
}
You could do that for instance:
$dir = dirname($_SERVER['PHP_SELF']);
$dirs = explode('/', $dir);
if(in_array('account',$dirs) || in_array('admin', $dirs)){
header('Location: /');
}
One of a few simpler solutions could be to use PHP's array_intersect($array1, $array2) function. This is well documented on the php.net website, but here's a little example:
// Define all the 'search' needles
$needles = array('account', 'admin');
// Get all the dirs
$dirs = explode('/', dirname( $_SERVER['PHP_SELF'] ));
// Check for needles in the hay
if( array_intersect($needles, $dirs) )
{
// Redirect
header('Location: /');
}
ADDED: You could of course make the above very simple by combining multiple lines into one, this would leave you with:
if( array_intersect(array('account', 'admin'), explode('/', dirname($_SERVER['PHP_SELF']))) )
{
header('Location: /');
}
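Note that in_array() and array_intersect() match "account" or "admin" at any depth, while the original code only checked specific positions. If the positions matter, a sketch that keeps them could look like this (the function name `is_protected_path` is hypothetical):

```php
<?php
// Hypothetical helper: true when the first directory is "account" or "admin",
// or the second directory is "account" (the same rules as the original code).
function is_protected_path(string $scriptPath): bool
{
    $dirs   = explode('/', dirname($scriptPath));
    $first  = $dirs[1] ?? '';
    $second = $dirs[2] ?? '';
    return in_array($first, ['account', 'admin'], true) || $second === 'account';
}

var_dump(is_protected_path('/account/index.php'));     // bool(true)
var_dump(is_protected_path('/shop/admin/index.php'));  // bool(false)

// Usage inside the page (exit stops the script after sending the redirect header):
// if (!isset($_SESSION['user_id']) && is_protected_path($_SERVER['PHP_SELF'])) {
//     header('Location: /');
//     exit;
// }
```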

PHP: How to resolve a relative url

I need a function that given a relative URL and a base returns an absolute URL. I've searched and found many functions that do it different ways.
resolve("../abc.png", "http://example.com/path/thing?foo=bar")
# returns http://example.com/abc.png
Is there a canonical way?
On this site I see great examples for Python and C#; let's get a PHP solution.
Perhaps this article could help?
http://nashruddin.com/PHP_Script_for_Converting_Relative_to_Absolute_URL
Edit: reproduced code below for convenience
<?php
function rel2abs($rel, $base)
{
/* return if already absolute URL */
if (parse_url($rel, PHP_URL_SCHEME) != '' || substr($rel, 0, 2) == '//') return $rel;
/* queries and anchors */
if ($rel[0]=='#' || $rel[0]=='?') return $base.$rel;
/* parse base URL and convert to local variables:
$scheme, $host, $path */
extract(parse_url($base));
/* remove non-directory element from path */
$path = preg_replace('#/[^/]*$#', '', $path);
/* destroy path if relative url points to root */
if ($rel[0] == '/') $path = '';
/* dirty absolute URL */
$abs = "$host$path/$rel";
/* replace '//' or '/./' or '/foo/../' with '/' */
$re = array('#(/\.?/)#', '#/(?!\.\.)[^/]+/\.\./#');
for($n=1; $n>0; $abs=preg_replace($re, '/', $abs, -1, $n)) {}
/* absolute URL is ready! */
return $scheme.'://'.$abs;
}
?>
Another solution in case you already use GuzzleHttp.
This solution is based on an internal method of GuzzleHttp\Client.
use GuzzleHttp\Psr7\UriResolver;
use GuzzleHttp\Psr7\Utils;
function resolve(string $uri, ?string $base_uri): string
{
$uri = Utils::uriFor(trim($uri));
if (isset($base_uri)) {
$uri = UriResolver::resolve(Utils::uriFor(trim($base_uri)), $uri);
}
// optional: set default scheme if missing
$uri = $uri->getScheme() === '' && $uri->getHost() !== '' ? $uri->withScheme('http') : $uri;
return (string)$uri;
}
EDIT: the source code was updated as suggested by myriacl
If you have pecl-http, you can use http://php.net/manual/en/function.http-build-url.php
<?php
$url_parts = parse_url($relative_url);
$absolute = http_build_url($source_url, $url_parts, HTTP_URL_JOIN_PATH);
Ex:
<?php
function getAbsoluteURL($source_url, $relative_url)
{
$url_parts = parse_url($relative_url);
return http_build_url($source_url, $url_parts, HTTP_URL_JOIN_PATH);
}
echo getAbsoluteURL('http://foo.tw/a/b/c', '../pic.jpg') . "\n";
// http://foo.tw/a/pic.jpg
echo getAbsoluteURL('http://foo.tw/a/b/c/', '../pic.jpg') . "\n";
// http://foo.tw/a/b/pic.jpg
echo getAbsoluteURL('http://foo.tw/a/b/c/', 'http://bar.tw/a.js') . "\n";
// http://bar.tw/a.js
echo getAbsoluteURL('http://foo.tw/a/b/c/', '/robots.txt') . "\n";
// http://foo.tw/robots.txt
Other tools are already linked on the page referenced in pguardiario's comment: http://publicmind.in/blog/urltoabsolute/ and https://github.com/monkeysuffrage/phpuri .
I also found another tool in a comment on http://nadeausoftware.com/articles/2008/05/php_tip_how_convert_relative_url_absolute_url :
require_once 'Net/URL2.php';
$base = new Net_URL2('http://example.org/foo.html');
$absolute = (string)$base->resolve('relative.html#bar');
Here is another function that can handle protocol relative urls
<?php
function getAbsoluteURL($to, $from = null) {
$arTarget = parse_url($to);
$arSource = parse_url($from);
$targetPath = isset($arTarget['path']) ? $arTarget['path'] : '';
if (isset($arTarget['host'])) {
if (!isset($arTarget['scheme'])) {
$proto = isset($arSource['scheme']) ? "{$arSource['scheme']}://" : '//';
} else {
$proto = "{$arTarget['scheme']}://";
}
$baseUrl = "{$proto}{$arTarget['host']}" . (isset($arTarget['port']) ? ":{$arTarget['port']}" : '');
} else {
if (isset($arSource['host'])) {
$proto = isset($arSource['scheme']) ? "{$arSource['scheme']}://" : '//';
$baseUrl = "{$proto}{$arSource['host']}" . (isset($arSource['port']) ? ":{$arSource['port']}" : '');
} else {
$baseUrl = '';
}
$arPath = [];
if ((empty($targetPath) || $targetPath[0] !== '/') && !empty($arSource['path'])) {
$arTargetPath = explode('/', $targetPath);
if (empty($arSource['path'])) {
$arPath = [];
} else {
$arPath = explode('/', $arSource['path']);
array_pop($arPath);
}
$len = count($arPath);
foreach ($arTargetPath as $idx => $component) {
if ($component === '..') {
if ($len > 1) {
$len--;
array_pop($arPath);
}
} elseif ($component !== '.') {
$len++;
array_push($arPath, $component);
}
}
$targetPath = implode('/', $arPath);
}
}
return $baseUrl . $targetPath;
}
// SAMPLES
// Absolute path => https://www.google.com/doubleclick/
echo getAbsoluteURL('/doubleclick/', 'https://www.google.com/doubleclick/insights/') . "\n";
// Relative path 1 => https://www.google.com/doubleclick/studio
echo getAbsoluteURL('../studio', 'https://www.google.com/doubleclick/insights/') . "\n";
// Relative path 2 => https://www.google.com/doubleclick/insights/case-studies.html
echo getAbsoluteURL('./case-studies.html', 'https://www.google.com/doubleclick/insights/') . "\n";
// Relative path 3 => https://www.google.com/doubleclick/insights/case-studies.html
echo getAbsoluteURL('case-studies.html', 'https://www.google.com/doubleclick/insights/') . "\n";
// Protocol relative url => https://www.google.com/doubleclick/
echo getAbsoluteURL('//www.google.com/doubleclick/', 'https://www.google.com/doubleclick/insights/') . "\n";
// Empty path => https://www.google.com/doubleclick/insights/
echo getAbsoluteURL('', 'https://www.google.com/doubleclick/insights/') . "\n";
// Different url => http://www.yahoo.com/
echo getAbsoluteURL('http://www.yahoo.com/', 'https://www.google.com') . "\n";
function absoluteUri($Path, $URI)
{ # Requires PHP4 or better.
$URL = parse_url($URI);
$Str = "{$URL['scheme']}://";
if (isset($URL['user']) || isset($URL['pass']))
$Str .= "{$URL['user']}:{$URL['pass']}@";
$Str .= $URL['host'];
if (isset($URL['port']))
$Str .= ":{$URL['port']}";
$Str .= realpath($URL['path'] . $Path); # This part might have an issue on windows boxes.
if (isset($URL['query']))
$Str .= "?{$URL['query']}";
if (isset($URL['fragment']))
$Str .= "#{$URL['fragment']}";
return $Str;
}
absoluteUri("../abc.png", "http://example.com/path/thing?foo=bar");
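# Intended to return "http://example.com/abc.png?foo=bar" on Linux boxes.
# Note: realpath() only resolves paths that exist on the local filesystem and returns false otherwise.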
I noticed the upvoted answer above uses RegEx, which can be dangerous when dealing with URLs.
This function will resolve relative URLs to a given current page url in $pgurl without regex. It successfully resolves:
/home.php?example types,
same-dir nextpage.php types,
../...../.../parentdir types,
full http://example.net urls,
and shorthand //example.net urls
//Current base URL (you can dynamically retrieve from $_SERVER)
$pgurl = 'http://example.com/scripts/php/absurl.php';
function absurl($url) {
global $pgurl;
if(strpos($url,'://')) return $url; //already absolute
if(substr($url,0,2)=='//') return 'http:'.$url; //shorthand scheme
if($url[0]=='/') return parse_url($pgurl,PHP_URL_SCHEME).'://'.parse_url($pgurl,PHP_URL_HOST).$url; //just add domain
if(strpos($pgurl,'/',9)===false) $pgurl .= '/'; //add slash to domain if needed
return substr($pgurl,0,strrpos($pgurl,'/')+1).$url; //for relative links, gets current directory and appends new filename
}
function nodots($path) { //Resolve dot dot slashes, no regex!
$arr1 = explode('/',$path);
$arr2 = array();
foreach($arr1 as $seg) {
switch($seg) {
case '.':
break;
case '..':
array_pop($arr2);
break;
case '...':
array_pop($arr2); array_pop($arr2);
break;
case '....':
array_pop($arr2); array_pop($arr2); array_pop($arr2);
break;
case '.....':
array_pop($arr2); array_pop($arr2); array_pop($arr2); array_pop($arr2);
break;
default:
$arr2[] = $seg;
}
}
return implode('/',$arr2);
}
Usage Example:
echo nodots(absurl('../index.html'));
nodots() must be called after the URL is converted to absolute.
The dots function is kind of redundant, but it is readable, fast, doesn't use regexes, and will resolve 99% of typical URLs (if you want to be 100% sure, just extend the switch block to support 6+ dots, although I've never seen that many dots in a URL).
Hope this helps.
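For comparison, dot-segment removal can also be written along the lines of RFC 3986 section 5.2.4, where only "." and ".." are special and a trailing "." or ".." leaves a directory. The following is a sketch, not a full implementation of the RFC algorithm: the function name `remove_dot_segments` is borrowed from the RFC's terminology, and the input is assumed to be an absolute path (starting with "/") without query or fragment:

```php
<?php
// Sketch of dot-segment removal for an absolute path (one starting with "/").
function remove_dot_segments(string $path): string
{
    $segments = explode('/', $path);
    $last     = count($segments) - 1;
    $output   = [];
    foreach ($segments as $i => $segment) {
        if ($segment === '.' || $segment === '..') {
            // ".." drops the previous segment, but never the leading empty one (the root)
            if ($segment === '..' && count($output) > 1) {
                array_pop($output);
            }
            // A trailing "." or ".." refers to a directory, so keep the final slash
            if ($i === $last) {
                $output[] = '';
            }
            continue;
        }
        $output[] = $segment;
    }
    return implode('/', $output);
}

echo remove_dot_segments('/path/../abc.png'), "\n";  // /abc.png
echo remove_dot_segments('/a/b/c/./../../g'), "\n";  // /a/g
echo remove_dot_segments('/a/b/..'), "\n";           // /a/
```

The first example corresponds to the question's case: the base path "/path/thing" has the directory "/path/", and appending "../abc.png" gives "/path/../abc.png".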
