PHP - Remove the last directory from a URL

Let's say my URL is:
http://www.example.com/news/media-centre/news/17/an-example-news-post/?foo=bar
In PHP, I want to remove the last directory from the URL so that I get:
http://www.example.com/news/media-centre/news/17/?foo=bar
How do I do this while making sure I maintain any other URL parameters?
I've tried using this:
$url = parse_url( $url );
$url['path'] = str_replace( strrchr($url['path'], "/"), "", $url['path'] );
But the replace would cause issues if the last directory is also somewhere else in the path too.
Not to mention stitching the URL back together seems like a long way round...

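$url = "http://www.example.com/news/media-centre/news/17/an-example-news-post/?foo=bar";
$info = parse_url($url);
// dirname() drops the last path segment and the trailing slash, so add the slash back
$info["path"] = rtrim(dirname($info["path"]), "/\\") . "/";
$new_url = $info["scheme"]."://".$info["host"].$info["path"];
if(!empty($info["query"])) $new_url .= "?".$info["query"];
if(!empty($info["fragment"])) $new_url .= "#".$info["fragment"];
// $new_url is now http://www.example.com/news/media-centre/news/17/?foo=bar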
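The same idea can be wrapped in a small helper. This is only a minimal sketch: the function name remove_last_segment is made up for illustration, and it assumes an absolute URL that has a scheme and a host. It cuts the path at its last slash instead of using str_replace, so a directory name that also occurs earlier in the path (such as "news" in the question) is left alone.

```php
<?php
// Hypothetical helper: drop the last path segment, keep query and fragment.
function remove_last_segment(string $url): string
{
    $info = parse_url($url);
    // Ignore a trailing slash so "/a/b/" and "/a/b" behave the same.
    $path = rtrim($info['path'] ?? '', '/');
    $pos  = strrpos($path, '/');
    // Keep everything before the last slash and always end with a slash.
    $path = ($pos === false ? '' : substr($path, 0, $pos)) . '/';

    $new_url = $info['scheme'] . '://' . $info['host'];
    if (isset($info['port']))     $new_url .= ':' . $info['port'];
    $new_url .= $path;
    if (isset($info['query']))    $new_url .= '?' . $info['query'];
    if (isset($info['fragment'])) $new_url .= '#' . $info['fragment'];
    return $new_url;
}

echo remove_last_segment('http://www.example.com/news/media-centre/news/17/an-example-news-post/?foo=bar'), "\n";
// http://www.example.com/news/media-centre/news/17/?foo=bar
```

User names and passwords in the URL are not handled here; they would need the same isset() treatment as the port.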

Related

PHP: basename not working as expected on special character

Here's my code:
$url = "https://de.wikipedia.org/wiki/…_und_wenn_der_letzte_Reifen_platzt";
$base = basename($url);
echo $base . "<br>";
$url2 = urlencode($base);
echo $url2 . "<br>";
$url = dirname($url) . "/" . $url2;
echo $url;
$aHeader = @get_headers($url);
echo "<pre>" . print_r($aHeader,true) . "</pre>";
It works fine on my local machine (running Xampp with PHP v7.3.12) - $base encodes as %E2%80%A6_und_wenn_der_letzte_Reifen_platzt
But when running on my server, $base will encode as _und_wenn_der_letzte_Reifen_platzt which is wrong and will result in an error 404 (the server is running on PHP 7.2.24).
Any ideas what is causing this behaviour? Both scripts are encoded in UTF-8.
It could be a bug related to the basename function, because if you mix the … character with letters in the und_wenn_der_letzte_Reifen_platzt part, it works as expected. Note that basename() is locale-aware, so a different locale on the server may also explain the difference. You can try upgrading PHP on your server to match your local version, if possible.
If you can't do this, you can always achieve the same result with a regular expression instead:
$re = '/.+\/(.*)/m';
$str = 'https://de.wikipedia.org/wiki/…_und_wenn_der_letzte_Reifen_platzt';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
$base = $matches[0][1];
echo $base . "<br>";
$url2 = rawurlencode($base);
echo $url2 . "<br>";
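If a regular expression feels heavy for this, the same result can be reached with plain string functions, which do not depend on the locale. A minimal sketch, assuming the part to encode is everything after the last "/" (the function name encode_last_segment is hypothetical):

```php
<?php
// Hypothetical helper: percent-encode only the part after the last "/".
function encode_last_segment(string $url): string
{
    $pos = strrpos($url, '/');
    if ($pos === false) {
        // No slash at all: treat the whole string as the segment.
        return rawurlencode($url);
    }
    return substr($url, 0, $pos + 1) . rawurlencode(substr($url, $pos + 1));
}

// "\xE2\x80\xA6" is the UTF-8 byte sequence of the "…" character.
echo encode_last_segment("https://de.wikipedia.org/wiki/\xE2\x80\xA6_und_wenn_der_letzte_Reifen_platzt"), "\n";
// https://de.wikipedia.org/wiki/%E2%80%A6_und_wenn_der_letzte_Reifen_platzt
```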
I just ran into the same problem while processing some MP3 files of French songs I listen to. I set up a webpage where I can download an M3U playlist filtered according to what I want to listen to on my phone. I simply download the playlist and it finds the songs in an MP3 folder on my phone. The problem was that the base filenames were truncated. Frustrated, I tracked it down to the "basename" function in PHP. I found a simple solution by creating a new basename function once I realized that paths as well as URLs use "/" as a separator, and that it is the final "/" that defines what the base name is ...
function basename_x($url, $ext = NULL ) {
    // Paths and URLs both use "/" as the separator
    $url = explode("/", $url);
    $Array_Check = is_array($url);
    $key = ( $Array_Check ? count($url) - 1 : NULL );
    if ( $ext != NULL ) {
        // Quote the extension and anchor it so only a literal suffix is removed
        $pattern = '/' . preg_quote($ext, '/') . '$/';
        if ( $Array_Check ) {
            $url[$key] = preg_replace( $pattern, '', $url[$key] );
        } else {
            $url = preg_replace( $pattern, '', $url );
        }
    }
    $base_name = ( $Array_Check ? $url[$key] : $url );
    return $base_name;
}
$sample = "./MP3s/À_ton_nom_-_Collectif_Cieux_Ouverts.mp3";
$this_doesnt_work = basename($sample);
$will_this_work = basename_x($sample);
var_dump($this_doesnt_work,$will_this_work);
From the command line, this is the output ...
string(40) "À_ton_nom_-_Collectif_Cieux_Ouverts.mp3"
string(40) "À_ton_nom_-_Collectif_Cieux_Ouverts.mp3"
But, when I ran this on my Apache Server, I got this instead ...
string(38) "_ton_nom_-_Collectif_Cieux_Ouverts.mp3"
string(40) "À_ton_nom_-_Collectif_Cieux_Ouverts.mp3"
I find it interesting that the "À" in the file name accounts for two bytes, not one (compare the string(40) and string(38) lengths above). Anyway, this approach solved my problem without having to play with my locale settings in PHP. Of course, I added the feature of removing the extension as well as ensuring the URL is exploded into a true array. But it was a quick workaround with a simple solution.
Hope this helps someone with the same problem.
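The two-byte observation can be checked directly. A small sketch, assuming the file name is UTF-8 encoded: strlen() counts bytes, while a PCRE match with the /u modifier counts characters.

```php
<?php
// "\u{C0}" is "À"; written as an escape so the source file encoding does not matter.
$name = "\u{C0}_ton_nom_-_Collectif_Cieux_Ouverts.mp3";

$bytes = strlen($name);                   // number of bytes
$chars = preg_match_all('/./u', $name);   // number of UTF-8 characters

echo $bytes, " bytes, ", $chars, " characters\n";
// 40 bytes, 39 characters
```

The 40 matches the string(40) reported by var_dump above, and the truncated string(38) is exactly the two bytes of "À" shorter.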

parse_url and removing subdomain

I'm wanting to strip out everything from a URL but the domain. So http://i.imgur.com/rA81kQf.jpg becomes imgur.com.
$url = 'http://i.imgur.com/rA81kQf.jpg';
$parsedurl = parse_url($url);
$parsedurl = preg_replace('#^www\.(.+\.)#i', '$1', $parsedurl['host']);
// now if a dot exists, grab everything after it. This removes any potential subdomain
$parsedurl = preg_replace("/^(.*?)\.(.*)$/","$2",$parsedurl);
The above works, but I feel like I should only be using one preg_replace for this. Any idea how I might combine the two?
You can use parse_url() to get desired output like this,
$url = "http://i.imgur.com/rA81kQf.jpg";
$parseData = parse_url($url);
$domain = preg_replace('/^www\./', '', $parseData['host']);
$array = explode(".", $domain);
echo (array_key_exists(count($array) - 2, $array) ? $array[count($array) - 2] : "") . "." . $array[count($array) - 1];
which prints
imgur.com
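The same "keep the last two labels" logic can be written a little more compactly with array_slice. This is a sketch with a made-up function name (last_two_labels), and it shares the limitation of the code above: it knows nothing about multi-part suffixes, so a host such as www.example.co.uk comes out as co.uk.

```php
<?php
// Hypothetical helper: keep only the last two dot-separated labels of the host.
function last_two_labels(string $url): string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return '';   // no host could be parsed
    }
    return implode('.', array_slice(explode('.', $host), -2));
}

echo last_two_labels('http://i.imgur.com/rA81kQf.jpg'), "\n";   // imgur.com
echo last_two_labels('http://www.example.co.uk/'), "\n";        // co.uk (the limitation)
```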

Get what page the visitor visit in PHP

I am trying to find out which page the visitor is visiting.
Here is my code:
$url = $_SERVER["SERVER_NAME"].$_SERVER["REQUEST_URI"];
$urlcomplete = $url;
$url = explode(".com/",$url);
$urlcount = count($url);
$newurl = '';
for ($start = 1; $start < $urlcount; $start++) {
if ($newurl != '') {
$newurl .= '.com/';
}
$newurl .= $url[$start];
}
$url = explode('/',$newurl);
$urlcount = explode('?',end($url));
$url[count($url) - 1] = $urlcount[0];
$urlcount = count($url);
With the code above, all the sub-pages are stored in $url. For example, for:
https://stackoverflow.com/questions/ask
$url[0] = 'questions'
$url[1] = 'ask'
I just want to ask: is this a good way, or is there a better one?
First prepending SERVER_NAME to the REQUEST_URI, and then trying to split it off, is pointless. This should be a simpler solution:
# first, split off the query string, if any:
list( $path ) = explode( '?', $_SERVER['REQUEST_URI'], 2 );
# then just split the URL path into its components:
$url = explode( '/', ltrim( $path, '/' ) );
The ltrim removes any leading slashes from the path, so that $url[0] won't be empty.
Note that there might still be an empty element at the end of the $url array, if the path ends in a slash. You could get rid of it by using trim instead of ltrim, but you may not want to, since the trailing slash is significant for things like resolving relative URLs.
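Put together as a function, the approach could look like the sketch below. The name path_segments is hypothetical, and the request URI is passed in as an argument here so the function can be tried outside a web request. It uses trim rather than ltrim, so it deliberately discards the trailing-slash information mentioned above.

```php
<?php
// Hypothetical helper: split a request URI into its path segments.
function path_segments(string $requestUri): array
{
    // Split off the query string, if any.
    list($path) = explode('?', $requestUri, 2);
    // Drop leading and trailing slashes, so there are no empty end elements.
    $path = trim($path, '/');
    return $path === '' ? [] : explode('/', $path);
}

print_r(path_segments('/questions/ask?tab=new'));
// Array ( [0] => questions [1] => ask )
```

In a real request you would call it as path_segments($_SERVER['REQUEST_URI']).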

php - file_get_contents - Downloading files with spaces in the filename not working

I am trying to download files using file_get_contents() function.
However if the location of the file is http://www.example.com/some name.jpg, the function fails to download this.
But if the URL is given as http://www.example.com/some%20name.jpg, the same gets downloaded.
I tried rawurlencode(), but this converts all the characters in the URL and the download fails again.
Can someone please suggest a solution for this?
I think this will work for you:
function file_url($url){
$parts = parse_url($url);
$path_parts = array_map('rawurldecode', explode('/', $parts['path']));
return
$parts['scheme'] . '://' .
$parts['host'] .
implode('/', array_map('rawurlencode', $path_parts))
;
}
echo file_url("http://example.com/foo/bar bof/some file.jpg") . "\n";
echo file_url("http://example.com/foo/bar+bof/some+file.jpg") . "\n";
echo file_url("http://example.com/foo/bar%20bof/some%20file.jpg") . "\n";
Output
http://example.com/foo/bar%20bof/some%20file.jpg
http://example.com/foo/bar%2Bbof/some%2Bfile.jpg
http://example.com/foo/bar%20bof/some%20file.jpg
Note:
I'd probably use urldecode and urlencode for this as the output would be identical for each url. rawurlencode will preserve the + even when %20 is probably suitable for whatever url you're using.
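To make the difference between the two encoders concrete, here is a short comparison on a made-up file name containing both a space and a plus sign:

```php
<?php
$name = 'some file+name.jpg';   // illustrative file name, not from the question

$plus = urlencode($name);       // spaces become "+"
$raw  = rawurlencode($name);    // spaces become "%20"

echo $plus, "\n";   // some+file%2Bname.jpg
echo $raw, "\n";    // some%20file%2Bname.jpg
```

Both functions encode a literal "+" as %2B; they differ only in how they write the space.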
As you have probably already figured out urlencode() should only be used on each portion of a URL that requires escaping.
From the docs for urlencode() just apply it to the image file name giving you the problem and leave the rest of the URL alone. From your example you can safely encode everything following the last "/" character
Here is maybe a better solution. If for any reason you are using a relative url like:
//www.example.com/path
Prior to PHP 5.4.7, parse_url would not create the [scheme] array element for such a URL, which would throw off maček's function. This method may be faster as well.
$url = '//www.example.com/path';
preg_match('/(https?:\/\/|\/\/)([^\/]+)(.*)/ism', $url, $result);
// encode each path segment separately so the "/" separators are not turned into %2F
$path = implode('/', array_map('urlencode', array_map('urldecode', explode('/', $result[3]))));
$url = $result[1].$result[2].$path;
Assuming only the file name has the problem, this is a better approach: only urlencode the last section, i.e. the file name.
private function update_url($url)
{
$parts = explode('/', $url);
$new_file = urlencode(end($parts));
$parts[key($parts)] = $new_file;
return implode("/", $parts);
}
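This should work:
$file = 'some file name';
// keep the encoded result; rawurlencode() turns each space into %20
$file = rawurlencode($file);
// base URL taken from the question's example
file_get_contents('http://www.example.com/' . $file);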

Get last word from URL after a slash in PHP

I need to get the very last word from a URL. So for example I have the following URL:
http://www.mydomainname.com/m/groups/view/test
I need to get with PHP only "test", nothing else. I tried to use something like this:
$words = explode(' ', $_SERVER['REQUEST_URI']);
$showword = trim($words[count($words) - 1], '/');
echo $showword;
It does not work for me. Can you help me please?
Thank you so much!!
Use basename with parse_url:
echo basename(parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH));
By using a regex:
preg_match("/[^\/]+$/", "http://www.mydomainname.com/m/groups/view/test", $matches);
$last_word = $matches[0]; // test
I used this:
$lastWord = substr($url, strrpos($url, '/') + 1);
Thanks to: https://stackoverflow.com/a/1361752/4189000
You can use explode but you need to use / as delimiter:
$segments = explode('/', $_SERVER['REQUEST_URI']);
Note that $_SERVER['REQUEST_URI'] can contain the query string if the current URI has one. In that case you should use parse_url before to only get the path:
$_SERVER['REQUEST_URI_PATH'] = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
And to take trailing slashes into account, you can use rtrim to remove them before splitting it into its segments using explode. So:
$_SERVER['REQUEST_URI_PATH'] = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
$segments = explode('/', rtrim($_SERVER['REQUEST_URI_PATH'], '/'));
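The steps above (strip the query string, strip trailing slashes, split on "/") can be combined into one function. A minimal sketch; the name last_path_segment is hypothetical, and the URI is passed as an argument so it can be tested without a web server.

```php
<?php
// Hypothetical helper: return the last non-empty path segment of a URI.
function last_path_segment(string $uri): string
{
    // parse_url() with PHP_URL_PATH drops the query string and fragment.
    $path = parse_url($uri, PHP_URL_PATH);
    if (!is_string($path)) {
        return '';   // malformed URI or no path component
    }
    $segments = explode('/', rtrim($path, '/'));
    return end($segments);
}

echo last_path_segment('http://www.mydomainname.com/m/groups/view/test'), "\n";   // test
echo last_path_segment('/m/groups/view/test/?page=2'), "\n";                      // test
```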
To do that you can use explode on your REQUEST_URI. I've made a simple function:
function getLast()
{
    $requestUri = $_SERVER['REQUEST_URI'];
    # Remove the query string, if any (strstr() would return false when there is no "?")
    $requestUri = trim(explode('?', $requestUri, 2)[0], '/');
    # Note that the delimiter is '/'
    $arr = explode('/', $requestUri);
    $count = count($arr);
    return $arr[$count - 1];
}
echo getLast();
If you don't mind a query string being included when present, then just use basename. You don't need to use parse_url as well.
$url = 'http://www.mydomainname.com/m/groups/view/test';
$showword = basename($url);
echo htmlspecialchars($showword);
When the $url variable is generated from user input or from $_SERVER['REQUEST_URI'], pass it through htmlspecialchars or htmlentities before using echo; otherwise users could add HTML tags or run JavaScript on the webpage.
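As an illustration of why the escaping matters, here is a sketch with a made-up malicious value standing in for the last URL segment. The ENT_QUOTES flag and the explicit 'UTF-8' charset are choices made for this example; they are not part of the answer above.

```php
<?php
// Hypothetical attacker-controlled value taken from the URL.
$showword = '"><img src=x onerror=alert(1)>';

// Escape &, <, > and both quote types before printing into HTML.
$safe = htmlspecialchars($showword, ENT_QUOTES, 'UTF-8');

echo $safe, "\n";
// &quot;&gt;&lt;img src=x onerror=alert(1)&gt;
```

The browser then displays the text literally instead of interpreting it as markup.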
Use preg_match:
if ( preg_match( "~/([^/]+)/?$~", $_SERVER[ "REQUEST_URI" ], $vv ))
echo $vv[1];
else
echo "Nothing here";
This is just a sketch of the idea; it can be rewritten as a function.
PS. Generally I use mod_rewrite to handle this and then process the $_GET variables in PHP.
That is good practice, IMHO.
ex: $url = 'http://www.youtube.com/embed/ADU0QnQ4eDs';
$url = "http://$_SERVER[HTTP_HOST]$_SERVER[REQUEST_URI]";
$url_path = parse_url($url, PHP_URL_PATH);
$basename = pathinfo($url_path, PATHINFO_BASENAME);
// **output**: $basename is "ADU0QnQ4eDs"
You will find the complete solution under the title below; I found it while looking for how to get the last word from a URL after a slash in PHP:
Get last parameter of url in php
