Trimming more than needed off a URL in PHP

I am entering URLs into my database, and I was getting them in every possible form.
I have the following code that is meant to strip http://, https://, www., .com and .co.uk,
but there is a problem:
when I enter a site like hat.com, it takes the 'h' away. This also happens with t, p, and w, and if the URL ends in .co.uk it only removes the .uk.
$new = rtrim($url, "/");
$reverse = strrev( $new );
$new = rtrim($reverse, ".www");
$new = rtrim($reverse, "//:ptth");
$new = rtrim($reverse, ".www//:ptth");
$new = rtrim($reverse, "//:sptth");
$new = rtrim($reverse, ".www//:sptth");
$url = strrev( $new );
What have I missed, and what would I have to add?

Using a regular expression will help here:
preg_replace('~(^https?(://(www\.)?)?|\.com$|\.co\.uk$)~', '', $url);
The regular expression used will match:
http, https, http://, https://, http://www., https://www. at the beginning of the string
.com, .co.uk at the end of the string.
See this example:
php> $url = 'https://www.example.com';
'https://www.example.com'
php> preg_replace('~(^https?(://(www\.)?)?|\.com$|\.co\.uk$)~', '', $url);
'example'
php> $url = 'http://hat.com';
'http://hat.com'
php> preg_replace('~(^https?(://(www\.)?)?|\.com$|\.co\.uk$)~', '', $url);
'hat'
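As for why the original code eats letters: rtrim() (like ltrim() and trim()) treats its second argument as a set of characters to strip, not as a literal string, so every trailing h, t, p, : or / is removed until some other character is hit. The sketch below first reproduces that effect and then shows one way to strip literal prefixes and suffixes without a regex. The function name strip_url_affixes is made up for this illustration, and the list of schemes and suffixes is only the one from the question.

```php
<?php
// rtrim() strips ANY trailing characters found in the list, in any order.
// The reversed input is "moc.tah//:ptth", so the "h" of "hat" goes too.
$trimmed = rtrim(strrev('http://hat.com'), '//:ptth');

// Hypothetical helper: remove literal prefixes and one literal suffix.
function strip_url_affixes(string $url): string
{
    $url = rtrim($url, '/'); // a single-character list is safe here

    foreach (['https://', 'http://'] as $scheme) {
        if (strpos($url, $scheme) === 0) {
            $url = substr($url, strlen($scheme));
            break;
        }
    }

    if (strpos($url, 'www.') === 0) {
        $url = substr($url, 4);
    }

    // Check the longer suffix first so ".co.uk" is removed as a whole.
    foreach (['.co.uk', '.com'] as $suffix) {
        if (substr($url, -strlen($suffix)) === $suffix) {
            $url = substr($url, 0, -strlen($suffix));
            break;
        }
    }

    return $url;
}
```

With this, strip_url_affixes('http://hat.com') keeps the leading h, because only the exact strings are compared.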

Related

Extract particular point of URL in PHP

I'm trying to get a very specific part of a URL using PHP so that I can use it as a variable later on.
The URL I have is:
https://forums.mydomain.com/index.php?/clubs/11-Default-Club
The particular part I am trying to extract is the 11 part between the /clubs/ and -Default-Club bits.
I was wondering what the best way to do this was. I've seen examples on here that use a regex-esque parser but I can't wrap my head around it for this particular instance.
Thanks
Edit: this is what I've tried so far using explode, but it seems to give me all sorts of elements that are not present in the URL above:
$url = $_SERVER['REQUEST_URI'];
$url = explode('/', $url);
$url = array_filter($url);
$url = array_merge($url, array());
Which returns:
Array ( [0] => index.php?app=core&module=system&controller=widgets&do=getBlock&blockID=plugin_9_bimBlankWidget_dqtr03ssz&pageApp=core&pageModule=clubs&pageController=view&pageArea=header&orientation=horizontal&csrfKey=8e19769b95c733b05439755827a98ac8 )
If you expect that the string with dashes (11-Default-Club) will always be at the end, you can try this:
$url = $_SERVER['REQUEST_URI'];
$urlParts = explode('/', $url);
$string = end($urlParts);
$stringParts = explode('-', $string);
$theNumber = $stringParts[0]; // this will be 11
I'd rather be explicit:
<?php
$url = 'https://forums.mydomain.com/index.php?/clubs/11-Default-Club';
$query = parse_url($url, PHP_URL_QUERY);
$pattern = '#^/clubs/(\d+)[a-zA-Z-]+$#';
$digits = preg_match($pattern, $query, $matches)
? $matches[1]
: null;
var_dump($digits);
Output:
string(2) "11"
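If this URL structure is fixed for all URLs on your site and you only want the numeric part of the URL (note that FILTER_SANITIZE_NUMBER_INT keeps every digit and sign character in the string, so this only works when the ID is the only number in the URL):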
<?php
$url = 'https://forums.mydomain.com/index.php?/clubs/11-Default-Club';
$int = (int) filter_var($url, FILTER_SANITIZE_NUMBER_INT);
echo $int;
If this URL structure is fixed for all URLs on your site, then the following is a simple way to get your value.
<?php
$url = "https://forums.mydomain.com/index.php?/clubs/11-Default-Club";
$url = explode('/', $url);
$url = array_filter($url);
$end = end($url);
$end_parts = explode('-',$end);
echo $end_parts[0];
Output:
11
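The request URI printed in the question does not contain /clubs/ at all, so whichever approach is used should handle the no-match case instead of assuming the ID is there. A minimal sketch, assuming the ID always directly follows /clubs/; the function name extract_club_id is hypothetical:

```php
<?php
// Hypothetical helper: returns the numeric ID after /clubs/, or null if absent.
function extract_club_id(string $url): ?int
{
    // Search the whole string so it works whether /clubs/ sits in the
    // path or, as in the question, in the query string.
    if (preg_match('~/clubs/(\d+)(?:-|/|$)~', $url, $matches)) {
        return (int) $matches[1];
    }

    return null;
}
```

Calling it with $_SERVER['REQUEST_URI'] would then give null for requests like the widget URL shown in the question, rather than an unrelated value.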

How to use preg_replace on a URL

How do I use preg_replace on a URL? For example, I have a URL like this: http://www.web.org/dorama/1201102144/hitoya-no-toge. I just want to show web.org. The URL is not always the same; for example, sometimes it's: http://www.web.org/movies/123/no etc.
I only know the basics of it. Here is what I tried. It still does not delete the slashes.
$url = "http://www.web.org/dorama/1201102144/hitoya-no-toge";
$patterns = array();
$patterns[0] = '/http:/';
$patterns[1] = '/dorama/';
$patterns[2] = '/1201102144/';
$replacements = array();
$replacements[2] = '';
$replacements[1] = '';
$replacements[0] = '';
echo preg_replace($patterns, $replacements, $url);
Result when I run it: //www.web.org///hitoya-no-toge
For such a job, I'd use parse_url then explode:
$url = "http://www.web.org/dorama/1201102144/hitoya-no-toge";
$host = (parse_url($url))['host'];
$domain = (explode('.', $host, 2))[1];
echo $domain;
Output:
web.org
I think you should use preg_match instead of preg_replace: http://php.net/manual/en/function.preg-match.php
// get host name from URL
preg_match('#^(?:http://)?([^/]+)#i', $url, $matches);
$host = $matches[1];
// get last two segments of host name
preg_match('/[^.]+\.[^.]+$/', $host, $matches);
echo "{$matches[0]}";
For https URLs, change http to https in the pattern; writing https? instead makes the s optional, so the same pattern matches both.
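To cover both schemes in one go, the two preg_match calls can be wrapped in a function with the s made optional. This is only a sketch: the name last_two_host_labels is invented here, and it keeps the answer's "last two labels" rule as-is.

```php
<?php
// Hypothetical helper: last two dot-separated labels of the host.
function last_two_host_labels(string $url): ?string
{
    // "https?" matches both "http" and "https"; the scheme itself is optional.
    if (!preg_match('#^(?:https?://)?([^/]+)#i', $url, $matches)) {
        return null;
    }
    $host = $matches[1];

    // Keep only the last two labels, e.g. "web.org" from "www.web.org".
    if (!preg_match('/[^.]+\.[^.]+$/', $host, $matches)) {
        return null;
    }

    return $matches[0];
}
```

Be aware that the two-label rule is a simplification: for a host such as www.example.co.uk it returns co.uk, not example.co.uk.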

Removing anchor (#hash) from URL

Is there any reliable way in PHP to clean a URL of anchor tags?
So input:
http://site.com/some/#anchor
Outputs:
http://site.com/some/
Using strstr()
$url = strstr($url, '#', true);
Using strtok()
Shorter way, using strtok:
$url = strtok($url, "#");
Using explode()
Alternative way to separate the url from the hash:
list ($url, $hash) = explode('#', $url, 2);
If you don't want the $hash at all, you can omit it in list:
list ($url) = explode('#', $url);
With PHP version >= 5.4 you don't even need to use list:
$url = explode('#', $url)[0];
Using preg_replace()
Obligatory regex solution:
$url = preg_replace('/#.*/', '', $url);
Using Purl
Purl is a neat URL manipulation library:
$url = \Purl\Url::parse($url)->set('fragment', '')->getUrl();
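There is also another option using parse_url(); note that this version rebuilds the URL from scheme, host and path only, so a port or query string would be dropped too: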
$str = 'http://site.com/some/#anchor';
$arr = parse_url($str);
echo $arr['scheme'].'://'.$arr['host'].$arr['path'];
Output:
http://site.com/some/
Alternative way
$url = 'http://site.com/some/#anchor';
echo str_replace('#'.parse_url($url,PHP_URL_FRAGMENT),'',$url);
Using parse_url():
function removeURLFragment($pstr_urlAddress = '') {
$larr_urlAddress = parse_url ( $pstr_urlAddress );
// Credentials are separated from the host by '@' (not '#'); the password is optional.
$lstr_userInfo = isset($larr_urlAddress['user']) ? $larr_urlAddress['user'].(isset($larr_urlAddress['pass']) ? ':'.$larr_urlAddress['pass'] : '').'@' : '';
return $larr_urlAddress['scheme'].'://'.$lstr_userInfo.$larr_urlAddress['host'].(isset($larr_urlAddress['port']) ? ':'.$larr_urlAddress['port'] : '').$larr_urlAddress['path'].(isset($larr_urlAddress['query']) ? '?'.$larr_urlAddress['query'] : '');
}
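One thing to watch with the strstr() variant: when the URL has no # at all, strstr($url, '#', true) returns false instead of the original string. A small sketch that leaves such URLs untouched; the function name strip_fragment is made up for the example:

```php
<?php
// Hypothetical helper: drop everything from the first '#' onward,
// returning the URL unchanged when it has no fragment.
function strip_fragment(string $url): string
{
    $pos = strpos($url, '#');

    return $pos === false ? $url : substr($url, 0, $pos);
}

// For comparison: strstr() with before_needle = true yields false here,
// because the needle '#' does not occur in the string.
$withoutHash = strstr('http://site.com/some/', '#', true);
```

Unlike the scheme/host/path rebuild, this keeps the query string, port and credentials as they were.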

PHP Get end string on url between / and /

I need to get the last segment of the URL, between / and /
For example:
http://mydomain.com/get_this/
or
http://mydomain.com/lists/get_this/
I need to get the part where get_this is in the URL.
trim() removes the trailing slash, strrpos() finds the last occurrence of / (after it's trimmed), and substr() gets all content after the last occurrence of /.
$url = trim($url, '/');
echo substr($url, strrpos($url, '/')+1);
Even better, you can just use basename(), like hakre suggested:
echo basename($url);
Assuming there always is a trailing slash:
$parts = explode('/', $url);
$get_this = $parts[count($parts)-2]; // -2 since there will be an empty array element due to the trailing slash
If not:
$url = trim($url, '/'); // If there is a trailing slash in this URL instance get rid of it so we're always sure the last part is where we expect it
$parts = explode('/', $url);
$get_this = $parts[count($parts)-1];
Something like this should work.
<?php
$subject = "http://mydomain.com/lists/get_this/";
$pattern = '/\/([^\/]*)\/$/';
preg_match($pattern, $subject, $matches, PREG_OFFSET_CAPTURE, 3);
print_r($matches);
?>
Just use parse_url() and explode():
<?php
$url = "http://mydomain.com/lists/get_this/";
$path = parse_url($url, PHP_URL_PATH);
$path_array = array_values(array_filter(explode('/', $path))); // reindex, because array_filter() preserves the original keys
$last_path = $path_array[count($path_array) - 1];
echo $last_path;
?>
You can try this; the segment you want ends up in $matches[1]:
preg_match('/\/([^\/]+)\/?$/', $url, $matches);
print_r($matches);
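If the URL may carry a query string or may lack a trailing slash, it can be safer to isolate the path first and only then pick the last non-empty segment. A minimal sketch along those lines; last_path_segment is a hypothetical name, not something from the answers above:

```php
<?php
// Hypothetical helper: last non-empty path segment, or null if there is none.
function last_path_segment(string $url): ?string
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return null; // no path component, or the URL could not be parsed
    }

    // Drop empty pieces caused by leading, trailing or doubled slashes,
    // then reindex so the keys run from 0 upward.
    $segments = array_values(array_filter(
        explode('/', $path),
        function ($segment) {
            return $segment !== '';
        }
    ));

    return $segments ? end($segments) : null;
}
```

Because the query string is split off by parse_url(), slashes inside it do not affect the result.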

How to trim down a URL using regex in PHP?

I am struggling to finish this regex code in PHP. I want to trim down the following url which is held in variable $text so that it goes from:
http://www.site.net/showthread.php?tid=324&pid=...
to:
showthread.php?tid=324
Thank you kindly!
Why use a regex? The parse_url method should give you all you want: http://php.net/manual/en/function.parse-url.php
Edit: working example
$someurl = 'http://www.site.net/showthread.php?tid=324&pid=...';
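$urlParts = parse_url($someurl); // the component constants cannot be combined with |, so fetch the whole array
parse_str($urlParts['query'], $params); // parse_str() fills $params; it does not return the array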
unset($params['pid']);
$queryString = http_build_query($params);
$newUrl = $urlParts['path'] . '?' . $queryString;
Since $urlParts['path'] starts with a / and you didn't want that, you could even use
$newUrl = substr($newUrl, 1);
and be done :) Does that help at all?
This should do it:
$url = 'http://www.site.net/showthread.php?tid=324&pid=...';
$pattern = '/showthread\.php\?tid=[0-9]+/';
if (preg_match($pattern, $url, $match))
print_r($match);
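A variation on the parse_url() approach is to list the parameters to keep rather than the ones to remove, which avoids having to know every unwanted key in advance. This is a sketch only; the function name path_with_params and its $keep parameter are assumptions made for the example.

```php
<?php
// Hypothetical helper: path (without the leading slash) plus only the
// query parameters whose names appear in $keep.
function path_with_params(string $url, array $keep): string
{
    $parts = parse_url($url);

    $params = [];
    parse_str($parts['query'] ?? '', $params);

    // Keep only the whitelisted keys.
    $params = array_intersect_key($params, array_flip($keep));

    $path = ltrim($parts['path'] ?? '', '/');

    return $params ? $path . '?' . http_build_query($params) : $path;
}
```

For the URL in the question, path_with_params($text, ['tid']) gives showthread.php?tid=324.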
