Best way to remove trailing slashes in URLs with PHP - php

I have some URLs, like www.amazon.com/, www.digg.com or www.microsoft.com/ and I want to remove the trailing slash, if it exists, so not just the last character. Is there a trim or rtrim for this?

You put rtrim in your question, why not just look it up?
$url = rtrim($url,"/");
As a side note, look up any PHP function by doing the following:
http://php.net/functionname
http://php.net/rtrim
http://php.net/trim
(rtrim stands for 'Right trim')

Simple and works across both Windows and Unix:
$url = rtrim($url, '/\\');
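One detail worth spelling out: the second argument to rtrim() is a list of characters, not a substring, and every trailing character in that list is stripped. A minimal sketch follows; the helper name strip_trailing_slashes is made up for this illustration.

```php
<?php
// Hypothetical helper (not part of the answers above): strips every trailing
// forward slash and backslash from a string.
function strip_trailing_slashes(string $url): string
{
    // '/\\' is a character list containing "/" and "\".
    return rtrim($url, '/\\');
}

echo strip_trailing_slashes('www.amazon.com/'), "\n";     // www.amazon.com
echo strip_trailing_slashes('www.digg.com'), "\n";        // www.digg.com
echo strip_trailing_slashes('www.microsoft.com//'), "\n"; // www.microsoft.com
```

A string without a trailing slash comes back unchanged, so no prior check is needed.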

I came here looking for a way to remove the trailing slash and redirect the browser. I came up with an answer that I would like to share with anyone coming after me:
//remove trailing slash from uri
if( ($_SERVER['REQUEST_URI'] != "/") and preg_match('{/$}',$_SERVER['REQUEST_URI']) ) {
header ('Location: '.preg_replace('{/$}', '', $_SERVER['REQUEST_URI']));
exit();
}
The ($_SERVER['REQUEST_URI'] != "/") check skips the site root (e.g. www.amazon.com/), because web browsers always send a trailing slash after a bare domain name. preg_match('{/$}',$_SERVER['REQUEST_URI']) matches every other URI whose last character is a slash. preg_replace('{/$}', '', $_SERVER['REQUEST_URI']) then removes that slash and passes the result to header() for the redirect. The exit() call is important: it stops any further code from executing. Note that REQUEST_URI includes the query string, so a URI such as /page/?a=1 is not matched by this check.
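If the request may carry a query string, the same idea can be applied to the path part only. The following is a sketch under that assumption, not a drop-in replacement; the function name trailing_slash_redirect_target is hypothetical, and the nullable return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: returns the URI to redirect to, or null when no
// redirect is needed. $requestUri is assumed to look like
// $_SERVER['REQUEST_URI'], i.e. a path plus an optional "?query".
function trailing_slash_redirect_target(string $requestUri): ?string
{
    // Split off the query string so "/page/?a=1" is handled too.
    $parts = explode('?', $requestUri, 2);
    $path  = $parts[0];

    // Leave the site root and paths without a trailing slash alone.
    if ($path === '/' || substr($path, -1) !== '/') {
        return null;
    }

    $target = rtrim($path, '/');
    if ($target === '') {
        $target = '/'; // a path made only of slashes collapses to the root
    }
    if (isset($parts[1])) {
        $target .= '?' . $parts[1];
    }
    return $target;
}

// Possible usage inside a web request:
// $target = trailing_slash_redirect_target($_SERVER['REQUEST_URI']);
// if ($target !== null) {
//     header('Location: ' . $target);
//     exit();
// }
```

Keeping the decision in a function that returns a value makes it possible to check the behaviour without sending headers.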

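$urls = "www.amazon.com/ www.digg.com/ www.microsoft.com/";
// Remove a slash only when it ends a URL (followed by whitespace or the end of the string)
echo preg_replace('~/(?=\s|$)~', '', $urls);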

Related

Trailing %20 white space in URLs producing 404 errors in Codeigniter

Our URLs with a URL encoded trailing white space (%20) are producing a 404 error. The application is run on Codeigniter on Apache.
/directory/page%20 will return a 404 error
/directory/page will return a 200 OK
How can I route all URLs with a trailing %20 to the intended URL?
The problem is that some third party websites are linking to us with trailing white space in the HREF
In that case you can add something like the following at the top of your .htaccess file to redirect (canonicalise) such requests to remove the trailing space.
For example, before the Codeigniter front-controller:
RewriteCond %{REQUEST_URI} \s$
RewriteRule (.*) /$1 [R=302,L]
The "processed" URL-path matched by the RewriteRule pattern has already had the trailing space removed; the REQUEST_URI server variable has not. So we can check for the trailing space on REQUEST_URI and simply redirect to "the same" (processed) URL-path, as captured by the RewriteRule pattern.
The REQUEST_URI server variable is already %-decoded. The \s shorthand character class matches against any whitespace character and the trailing $ anchors this to the end of the URL-path.
Test first with a 302 (temporary) redirect to make sure that it works OK before changing to a 301 (permanent) redirect.
@Juan Cullen, this is a common issue when there is a space before a trailing slash. For example, take "http://example.com/directory/page /": notice the space before the trailing slash.
To solve this for all URLs that behave this way, you can use PHP's rtrim() function.
Check the code below:
<?php
function fix_url($url) {
// rtrim() treats this as a character list, so trailing spaces and slashes are both stripped
$trailing_slash = ' /'; // notice the space before the slash
return rtrim($url, $trailing_slash);
}
Now you can call it like this:
$error_url = "http://example.com/directory/page /";
$correct_url = fix_url($error_url);
As you are using Codeigniter, you can put this function in a helper file and call it wherever you need it.
This is just an idea; try it out and let me know if it works.
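The rtrim() call above handles a literal space, whereas the links in the question carry the space percent-encoded as %20. A rough sketch that covers both forms is given here; the function name strip_trailing_space is an assumption for illustration and is not part of Codeigniter.

```php
<?php
// Hypothetical helper (name and behaviour are assumptions, not Codeigniter API):
// strips trailing whitespace, literal or percent-encoded as %20, from the
// path part of a request URI and leaves the query string untouched.
function strip_trailing_space(string $requestUri): string
{
    $parts = explode('?', $requestUri, 2);
    // Remove any run of "%20" or whitespace characters at the end of the path.
    $path = preg_replace('/(?:%20|\s)+$/', '', $parts[0]);
    return isset($parts[1]) ? $path . '?' . $parts[1] : $path;
}

echo strip_trailing_space('/directory/page%20'), "\n"; // /directory/page
echo strip_trailing_space('/directory/page'), "\n";    // /directory/page
```

An encoded space in the middle of a path, as in /my%20page, is left alone because the pattern is anchored to the end of the path.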

Remove trailing slash on domain extensions without trailing directory

I'm importing data from a CSV and I've been looking high and low for a regular expression that removes the trailing slash from a domain name only when no directory follows it. See the following example:
example.com/ (remove trailing slash)
example.co.uk/ (remove trailing slash)
example.com/gb/ (do not remove trailing slash)
Can anyone help me out with this or at least point me in the right direction?
Edit: This is my progress so far, I've only matched the extension at the moment but it's picking up those domains with trailing directories.
[a-z0-9\-]+[a-z0-9]\/[a-z]
Many thanks
I don't know how it would compare to a regular expression performance-wise, but you can do it without one.
A simple example:
$string = rtrim ($string, '/');
$string .= (strpos($string, '/') === false) ? '' : '/';
In the second line I'm only adding a / at the end if the string already contains one (to separate domain from folder).
A more solid approach would probably be to rtrim only if the first / found is also the last character of the string.
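That stricter check could look roughly like this. It is a sketch that assumes the input has no scheme prefix such as http://, as in the question's examples; the function name is hypothetical.

```php
<?php
// Hypothetical helper: removes the trailing slash only from bare domains,
// i.e. when the first slash in the string is also its last character.
// Assumes values like "example.com/" with no "http://" prefix.
function strip_slash_from_bare_domain(string $s): string
{
    $first = strpos($s, '/');
    if ($first !== false && $first === strlen($s) - 1) {
        return substr($s, 0, -1);
    }
    return $s;
}

echo strip_slash_from_bare_domain('example.com/'), "\n";    // example.com
echo strip_slash_from_bare_domain('example.co.uk/'), "\n";  // example.co.uk
echo strip_slash_from_bare_domain('example.com/gb/'), "\n"; // example.com/gb/
```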
I'm not sure, but you can try this: remove the slash only if the value is just a $_SERVER['SERVER_NAME'], and keep it otherwise, because $_SERVER['SERVER_NAME'] returns the host name without any directory.
Try this:
/^(http|https|ftp)\:\/\/[a-z0-9\-\.]+\.[a-z]{2,3}(:[a-z0-9]*)?\/?([a-z0-9\-\._\?\,\'\/\\\+&%\$#\=~])*$/i
You could test for a match on a slash, some letters and another slash (the pattern below), then remove the last character if it's not found.
This is JavaScript, but it would be similar in PHP.
/\/[a-z]+\//
var txt = 'example.com/gb/';
var match = txt.match(/\/[a-z]+\//);
if (!match) {
alert(txt.substring(0, txt.length - 1));
}
else {
alert(txt);
}
http://jsfiddle.net/xjKTS/
Try this, it works:
<?php
$result = preg_replace('/^([^\/]+)(\/)$/', '$1', $your_data);
?>
I have tested it like this:
<?php
$str1 = 'example.com/';
$str2 = 'example.co.uk/';
$str3 = 'example.com/gb/';
$reg = '/^([^\/]+)(\/)$/';
echo preg_replace($reg, '$1', $str1); // example.com
echo preg_replace($reg, '$1', $str2); // example.co.uk
echo preg_replace($reg, '$1', $str3); // example.com/gb/
?>

How to fix a path with regex in php for PATHS and not break URLs?

I want to replace // but not ://. I'm using this function to fix broken urls:
function fix ($path)
{
return preg_replace( "/\/+/", "/", $path );
}
For example:
Input:
a//a//s/b/d//df//a/s/
Output (collapsed blocks of more than one slash):
a/a/s/b/d/df/a/s/
That is OK, but if I pass a URL I break the http:// part, and end up with http:/. For example:
http://www.domain.com/a/a/s/b/d/df/a/s/
I get:
http:/www.domain.com/a/a/s/b/d/df/a/s/
I want to keep the http:// intact:
http://www.domain.com/a/a/s/b/d/df/a/s/
You can solve it rather easily using a negative lookbehind:
function fix ($path)
{
return preg_replace("#(?<!:)/{2,}#", "/", $path);
}
Note that I've also changed your delimiter from / to #, so you don't have to escape slashes.
Working example: http://ideone.com/6zGBg
This can still match the second slash if you have more than two (file://// -> file://). If this is a problem, you can use #(?<![:/])/{2,}#.
Example: http://ideone.com/T2mlR
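To see the difference between the two patterns side by side, here is a small sketch. The function names collapse_slashes_loose and collapse_slashes_strict are made up for this comparison.

```php
<?php
// Hypothetical wrappers around the two patterns discussed above.

// Collapses runs of slashes unless the run starts right after a colon.
function collapse_slashes_loose(string $path): string
{
    return preg_replace('#(?<!:)/{2,}#', '/', $path);
}

// Also refuses to start a match right after another slash, so a run that
// follows "scheme:" is left completely untouched.
function collapse_slashes_strict(string $path): string
{
    return preg_replace('#(?<![:/])/{2,}#', '/', $path);
}

echo collapse_slashes_loose('a//a//s/b/d//df//a/s/'), "\n";      // a/a/s/b/d/df/a/s/
echo collapse_slashes_loose('http://www.domain.com/a//b'), "\n"; // http://www.domain.com/a/b
echo collapse_slashes_loose('file:////etc'), "\n";               // file://etc
echo collapse_slashes_strict('file:////etc'), "\n";              // file:////etc
```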
// Capture the character before the slashes so it is not lost in the replacement
return preg_replace('/([^:])\/+/', '$1/', $path);

PHP: how to add trailing slash to absolute URL

I have a list of absolute URLs. I need to make sure that they all have trailing slashes, as applicable. So:
http://www.domain.com/ <-- does not need a trailing slash
http://www.domain.com <-- needs a trailing slash
http://www.domain.com/index.php <-- does not need a trailing slash
http://www.domain.com/?message=hello <-- does not need a trailing slash
I'm guessing I need to use regex, but matching URLs is a pain. I was hoping for an easier solution. Ideas?
For this very specific problem, not using a regex at all might be an option as well. If your list is long (several thousand URLs) and time is of any concern, you could choose to hand-code this very simple manipulation.
This will do the same:
$str .= (substr($str, -1) == '/' ? '' : '/');
It is of course not nearly as elegant or flexible as a regular expression, but it avoids the overhead of parsing the regular expression string and it will run as fast as PHP is able to do it.
It is arguably less readable than the regex, though this depends on how comfortable the reader is with regex syntax (some people might actually find it more readable).
It will certainly not check that the string is really a well-formed URL (such as e.g. zerkms' regex), but you already know that your strings are URLs anyway, so that is a bit redundant.
Though, if your list is something like 10 or 20 URLs, forget this post. Use a regex, the difference will be zero.
Rather than doing this using regex, you could use parse_url() to do this.
For example:
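$url = parse_url("http://www.example.com/ab/abc.html?a=b#xyz");
if (!isset($url['path'])) $url['path'] = '/';
// Append the query string and fragment only when they are present
$surl = $url['scheme'] . "://" . $url['host'] . $url['path']
    . (isset($url['query']) ? '?' . $url['query'] : '')
    . (isset($url['fragment']) ? '#' . $url['fragment'] : '');
echo $surl;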
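For the four cases listed in the question, it may be enough to add the slash only when the URL has no path at all, without rebuilding the whole URL. A minimal sketch under that assumption follows; ensure_root_slash is a hypothetical name.

```php
<?php
// Hypothetical helper: appends "/" only when the URL has no path component.
function ensure_root_slash(string $url): string
{
    $path = parse_url($url, PHP_URL_PATH);
    // false means parse_url() could not parse the URL; a non-empty string
    // means a path is already present. In both cases return the input as-is.
    if ($path === false || ($path !== null && $path !== '')) {
        return $url;
    }
    // Insert the slash before any query string or fragment.
    $cut = strcspn($url, '?#');
    return substr($url, 0, $cut) . '/' . substr($url, $cut);
}

echo ensure_root_slash('http://www.domain.com/'), "\n";               // unchanged
echo ensure_root_slash('http://www.domain.com'), "\n";                // slash added
echo ensure_root_slash('http://www.domain.com/index.php'), "\n";      // unchanged
echo ensure_root_slash('http://www.domain.com/?message=hello'), "\n"; // unchanged
```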
$url = 'http://www.domain.com';
$need_to_add_trailing_slash = preg_match('~^https?://[^/]+$~', $url);
Try this:
if (!preg_match("/.*\/$/", $url)) {
$url = "$url" . "/";
}
This may not be the most elegant solution, but it works like a charm. First we get the full URL, then check whether it has a trailing slash. If not, we check that there is no query string and that it isn't an actual file or an actual directory. If the URL meets all these conditions, we do a 301 redirect with the trailing slash added.
If you're unfamiliar with PHP headers... note that there cannot be any output - not even whitespace - before this code.
$url = $_SERVER['REQUEST_URI'];
$lastchar = substr( $url, -1 );
if ( $lastchar != '/' ):
if ( !$_SERVER['QUERY_STRING'] and !is_file( $_SERVER['DOCUMENT_ROOT'].$url ) and !is_dir( $_SERVER['DOCUMENT_ROOT'].$url ) ):
header("HTTP/1.1 301 Moved Permanently");
header( "Location: $url/" );
exit; // stop here so nothing else runs after the redirect
endif;
endif;

Remove Trailing Slash From String PHP

Is it possible to remove the trailing slash / from a string using PHP?
Sure it is, simply check if the last character is a slash and then nuke that one.
if(substr($string, -1) == '/') {
$string = substr($string, 0, -1);
}
Another (probably better) option would be using rtrim() - this one removes all trailing slashes:
$string = rtrim($string, '/');
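The two options differ when the string ends in more than one slash. A quick sketch, using a made-up sample string:

```php
<?php
$string = 'path/to/dir//'; // sample value chosen for illustration

// The substr() check removes exactly one trailing slash.
$one = (substr($string, -1) === '/') ? substr($string, 0, -1) : $string;

// rtrim() removes every trailing slash.
$all = rtrim($string, '/');

echo $one, "\n"; // path/to/dir/
echo $all, "\n"; // path/to/dir
```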
This removes trailing slashes:
$str = rtrim($str, '/');
This was accepted long ago, but I stumbled here during a related search and am adding this for "completeness". rtrim() is great, and implemented like this:
$string = rtrim($string, '/\\'); //strip both forward and back slashes
It ensures portability from *nix to Windows, as I assume this question pertains to dealing with paths.
rtrim
Use rtrim(), because it also copes with a string that doesn't end with a trailing slash.
Yes, it is!
http://php.net/manual/en/function.rtrim.php
