Get the subdomain from a string using PHP

To get the domain name I am using this code:
<?php
$myURL = 'http://answers.yahoo.com/question/index?qid=20130406061745AAmovgl';
$pattern = '/\w+\..{2,3}(?:\..{2,3})?(?:$|(?=\/))/i';
if (preg_match($pattern, $myURL, $domain) === 1) {
$domain = $domain[0];
}
$ndomain = "http://$domain";
echo $ndomain;
?>
But it outputs http://yahoo.com. How can I output http://answers.yahoo.com, with the subdomain included?

You should use the parse_url function instead, since it exists to do this very thing.
echo parse_url($myURL, PHP_URL_HOST);

You can use parse_url() like this:
$urlData = parse_url($myURL);
$host = $urlData['host']; //domain + subdomain
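One caveat with both answers: parse_url only reports a host when the string carries a scheme (or starts with //). The following is a minimal sketch, assuming a hypothetical helper name hostFromUrl that is not part of the original answers, which adds a default scheme before parsing:

```php
<?php
// Hypothetical helper (an assumption, not from the answers above): returns the
// full host, subdomain included, or null when no host can be parsed.
function hostFromUrl(string $url): ?string
{
    // parse_url() treats a scheme-less string such as "answers.yahoo.com/x"
    // as a path, so prepend a default scheme first.
    if (!preg_match('#^[a-z][a-z0-9+.-]*://#i', $url)) {
        $url = 'http://' . ltrim($url, '/');
    }
    $host = parse_url($url, PHP_URL_HOST);
    return is_string($host) ? $host : null;
}

echo hostFromUrl('http://answers.yahoo.com/question/index?qid=20130406061745AAmovgl'), PHP_EOL;
echo hostFromUrl('answers.yahoo.com/question/index'), PHP_EOL;
```

Both calls print answers.yahoo.com, so the subdomain survives whether or not the input includes http://.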

Related

How to get word between particular symbols in string

My source string could be:
example.com or http://example.com or www.example.com or https://example.com or http://www.example.com or https://www.example.com
or
example.abc.com or http://example.abc.com or www.example.abc.com or https://example.abc.com or http://www.example.abc.com or https://www.example.abc.com
I want the result: example
How can we do this using PHP string functions, or in some other way?
Try this
$str = 'http://example.abc.com';
$last = explode("/", $str, 3);
$ans = explode('.',$last[2]);
echo $ans[0];
You can use parse_url
<?php
// Real full current URL, this can be useful for a lot of things
$url = 'http'.((isset($_SERVER['HTTPS']) && $_SERVER['HTTPS'] == 'on') ? 's' : '').'://'.$_SERVER['HTTP_HOST'].$_SERVER['REQUEST_URI'];
// Or you can put another url
$url = 'https://www.example.foo.biz/';
// Get the host name
$hostName = parse_url($url, PHP_URL_HOST);
// Get the first part of the host name, skipping a leading "www."
$host = explode('.', preg_replace('/^www\./i', '', $hostName))[0];
print_r($url);
print_r($hostName);
// Here is what you want
print_r($host);
?>
You can use strpos:
<?php
$url = "http://www.example.com";
/* Use any of you want.
$url = "https://example.com";
$url = "https://www.example.abc.com";
$url = "https://www.www.example.com"; */
if (strpos($url, 'example') !== false) {
echo "it exists";
}
?>
EDIT:
So this is what I came up with now, using explode and substr:
$url = "http://www.example.com";
/* Use any of you want.
$url = "https://example.com";
$url = "https://www.example.abc.com";
$url = "https://www.www.example.com"; */
$exp ='example';
// Keep the position so strpos() runs only once
if (($pos = strpos($url, $exp)) !== false) {
    echo $str = substr($url, $pos);
    echo "<br>" . "it exists" . "<br>";
    $finalword = explode(".", $str);
    var_dump($finalword);
}
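The answers above each cover only some of the input forms listed in the question. As a rough sketch that combines them, assuming a hypothetical helper name firstLabel, one could add a scheme when it is missing, let parse_url find the host, and then skip any leading www. label:

```php
<?php
// Hypothetical helper (an assumption for illustration): returns the first host
// label that is not "www", or null when the host cannot be parsed.
function firstLabel(string $url): ?string
{
    // Without a scheme, parse_url() would report the whole string as a path.
    if (!preg_match('#^https?://#i', $url)) {
        $url = 'http://' . $url;
    }
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null;
    }
    // Drop one or more leading "www." labels, then take the text before the first dot.
    $host = preg_replace('/^(www\.)+/i', '', $host);
    return explode('.', $host)[0];
}

foreach (['example.com', 'http://www.example.com', 'https://www.example.abc.com'] as $u) {
    echo firstLabel($u), PHP_EOL; // example
}
```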

Get pure name from a URL

I use this function to get the base name of a URL:
function get_domain($url)
{
$pieces = parse_url($url);
$domain = isset($pieces['host']) ? $pieces['host'] : '';
if (preg_match('/(?P<domain>[a-z0-9][a-z0-9\-]{1,63}\.[a-z\.]{2,6})$/i', $domain, $regs)) {
return $regs['domain'];
}
return false;
}
print get_domain("http://mail.somedomain.co.uk"); // outputs 'somedomain.co.uk'
But how can I get the pure name without '.co.uk', '.com' or anything else?
For example: somedomain without co.uk.
I know I can remove it manually via str_replace($old, $new, $string), but is there a better method?
You can use parse_url to get what you want:
$url= "http://mail.somedomain.co.uk";
$parts = parse_url($url);
$hostParts = explode('.',$parts['host']);
$main = $hostParts[1];
echo $main;
However, this will always give you the second part of domain. So, if you have a URL like http://somedomain.com/ the output will be com.
$array = explode('.', $_SERVER['SERVER_NAME']);
echo $array[1];
Or, if the URL is dynamic, try this:
$array = explode('.',$url);
echo $array[1];
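Picking a fixed index breaks as soon as the number of labels changes, as noted above. A more robust approach needs to know which suffixes count as a unit. The sketch below assumes a tiny hard-coded suffix list and a hypothetical function name pureName, both purely for illustration. A real implementation would more likely consult the Public Suffix List.

```php
<?php
// Illustrative subset only (an assumption); not a complete list of suffixes.
const KNOWN_SUFFIXES = ['co.uk', 'org.uk', 'com.au', 'com', 'net', 'org'];

// Hypothetical helper: returns the label directly before a known suffix.
function pureName(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null;
    }
    $host = strtolower($host);

    // Try longer suffixes first so multi-part suffixes are matched before shorter ones.
    $suffixes = KNOWN_SUFFIXES;
    usort($suffixes, function ($a, $b) {
        return strlen($b) - strlen($a);
    });

    foreach ($suffixes as $suffix) {
        $tail = '.' . $suffix;
        if (strlen($host) > strlen($tail) && substr($host, -strlen($tail)) === $tail) {
            $rest = substr($host, 0, -strlen($tail)); // e.g. "mail.somedomain"
            $labels = explode('.', $rest);
            return end($labels);                      // label right before the suffix
        }
    }
    return null; // suffix not in the illustrative list
}

echo pureName('http://mail.somedomain.co.uk'), PHP_EOL; // somedomain
echo pureName('http://somedomain.com/'), PHP_EOL;       // somedomain
```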

Getting the first sub directory name from a URL

I have been trying to get the first subdirectory of a URL using all kinds of string manipulation functions and have been having a lot of trouble. I was wondering if anyone knew of an easy way to accomplish this?
I appreciate any advice, thanks in advance!
http://www.domain.com/pages/images/apple.png //output: pages
www.domain.com/pages/b/c/images/car.png // output: pages
domain.com/one/apple.png // output: one
You can use the PHP function parse_url():
$url = 'domain.com/one/apple.png';
$path = parse_url($url, PHP_URL_PATH);
$firstSubDir = explode('/', $path)[1]; // [0] is the domain [1] is the first subdirectory, etc.
echo $firstSubDir; //one
function startsWith($haystack, $needle)
{
return $needle === "" || strpos($haystack, $needle) === 0;
}
$url = "http://www.domain.com/pages/images/apple.png";
$urlArr = explode('/', $url);
echo (startsWith($url, 'http')) ? $urlArr[3] : $urlArr[1]; // Should echo 'pages'
The above should work both with and without 'http' as the URL prefix.
An alternative function to get first path from URL (with or without scheme).
function domainpath($url = '')
{
$url = preg_match("#^https?://#", $url) ? $url : 'http://' . $url;
$url = parse_url($url);
$explode = explode('/', $url['path']);
return $explode[1];
}
echo domainpath('http://www.domain.com/pages/images/apple.png');
echo domainpath('https://domain.com/pages/images/apple.png');
echo domainpath('www.domain.com/pages/b/c/images/car.png');
echo domainpath('domain.com/one/apple.png');
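The functions above assume the URL always has at least one path segment. Here is a hedged variant, using a hypothetical name firstPathSegment, that also returns null when the URL has no path or only a bare slash:

```php
<?php
// Hypothetical helper (an assumption for illustration): first path segment,
// or null when the URL has no path segments at all.
function firstPathSegment(string $url): ?string
{
    // Add a scheme so parse_url() separates the host from the path.
    if (!preg_match('#^https?://#i', $url)) {
        $url = 'http://' . $url;
    }
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return null; // no path component, e.g. "domain.com"
    }
    $segments = explode('/', trim($path, '/'));
    return $segments[0] !== '' ? $segments[0] : null;
}

echo firstPathSegment('http://www.domain.com/pages/images/apple.png'), PHP_EOL; // pages
echo firstPathSegment('www.domain.com/pages/b/c/images/car.png'), PHP_EOL;      // pages
echo firstPathSegment('domain.com/one/apple.png'), PHP_EOL;                     // one
```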

Get last URI segment excluding query string

I have a URL, say: www.abc.com/blog/2012/12/register-car?w=Search&searchDmv=Go. I want to extract the string register-car from it using PHP code. Please help.
Try this:
<?php
$url = "www.abc.com/blog/2012/12/register-car?w=Search&searchDmv=Go";
$register_car = basename(parse_url($url, PHP_URL_PATH));
echo $register_car; // will echo "register-car"
?>
You can use a combination of parse_url and basename, or even pathinfo to help you with this.
In PHP < 5.4.7
$url = "www.abc.com/blog/2012/12/register-car?w=Search&searchDmv=Go";
$result = basename(parse_url("http://" . $url, PHP_URL_PATH));
echo $result; // register-car
In PHP >= 5.4.7
$url = "www.abc.com/blog/2012/12/register-car?w=Search&searchDmv=Go";
$result = basename(parse_url("//" . $url, PHP_URL_PATH));
echo $result; // register-car
You are looking for the parse_url function, followed by basename to return the last /item/. The following prints register-car.
// Without a component argument, parse_url returns an array that includes 'path'
$parsed = parse_url("http://www.abc.com/blog/2012/12/register-car?w=Search&searchDmv=Go");
echo basename($parsed['path']);
$path = parse_url("www.abc.com/blog/2012/12/register-car?w=Search&searchDmv=Go", PHP_URL_PATH);
// explode() takes the delimiter first, then the string
$pieces = explode("/", $path);
$action = $pieces[count($pieces)-1]; // register-car

How to remove http://, www and slash from URL in PHP?

I need a PHP function which produces a pure domain name from a URL. The function must remove http://, www and the / (slash) from the URL if these parts exist. Here are example inputs and outputs:
Input - > http://www.google.com/ | Output -> google.com
Input - > http://google.com/ | Output -> google.com
Input - > www.google.com/ | Output -> google.com
Input - > google.com/ | Output -> google.com
Input - > google.com | Output -> google.com
I checked the parse_url function, but it doesn't return what I need.
Since I'm a beginner in PHP, this was difficult for me. If you have any idea, please answer.
Thanks in advance.
$input = 'www.google.co.uk/';
// in case scheme relative URI is passed, e.g., //www.google.com/
$input = trim($input, '/');
// If scheme not included, prepend it
if (!preg_match('#^http(s)?://#', $input)) {
$input = 'http://' . $input;
}
$urlParts = parse_url($input);
// remove www
$domain = preg_replace('/^www\./', '', $urlParts['host']);
echo $domain;
// output: google.co.uk
Works correctly with all your example inputs.
$str = 'http://www.google.com/';
$str = preg_replace('#^https?://#', '', rtrim($str,'/'));
echo $str; // www.google.com
There are lots of ways to grab the domain out of a URL. I've posted 4 ways below, from the shortest to the longest.
#1
function urlToDomain($url) {
return implode(array_slice(explode('/', preg_replace('/https?:\/\/(www\.)?/', '', $url)), 0, 1));
}
echo urlToDomain('http://www.example.com/directory/index.php?query=true');
#2
function urlToDomain($url) {
$domain = explode('/', preg_replace('/https?:\/\/(www\.)?/', '', $url));
return $domain['0'];
}
echo urlToDomain('http://www.example.com/directory/index.php?query=true');
#3
function urlToDomain($url) {
$domain = preg_replace('/https?:\/\/(www\.)?/', '', $url);
if ( strpos($domain, '/') !== false ) {
$explode = explode('/', $domain);
$domain = $explode['0'];
}
return $domain;
}
echo urlToDomain('http://www.example.com/directory/index.php?query=true');
#4
function urlToDomain($url) {
if ( substr($url, 0, 8) == 'https://' ) {
$url = substr($url, 8);
}
if ( substr($url, 0, 7) == 'http://' ) {
$url = substr($url, 7);
}
if ( substr($url, 0, 4) == 'www.' ) {
$url = substr($url, 4);
}
if ( strpos($url, '/') !== false ) {
$explode = explode('/', $url);
$url = $explode['0'];
}
return $url;
}
echo urlToDomain('http://www.example.com/directory/index.php?query=true');
All of the functions above return the same response: example.com
Try this. It will remove what you wanted (http://, www and the trailing slash) but will retain other subdomains such as example.google.com:
$host = parse_url('http://www.google.com', PHP_URL_HOST);
$host = preg_replace('/^(www\.)/i', '', $host);
Or as a one-liner:
$host = preg_replace('/^(www\.)/i', '', parse_url('http://www.google.com', PHP_URL_HOST));
if (!preg_match('/^http(s)?:\/\//', $url))
$url = 'http://' . $url;
$host = parse_url($url, PHP_URL_HOST);
$host = explode('.', strrev($host));
$host = strrev($host[1]) . '.' . strrev($host[0]);
This would return the second-level domain, though it would be useless for, say, .co.uk domains, so you might want to do some more checking and include additional parts if strrev($host[0]) is uk, au, etc.
$value = 'https://google.ca';
$result = str_ireplace('www.', '', parse_url($value, PHP_URL_HOST));
// google.ca
The first way is to use one regular expression to trim unnecessary parts of the URL, like the protocol, www and the ending slash:
function trimUrlProtocol($url) {
return preg_replace('/((^https?:\/\/)?(www\.)?)|(\/$)/', '', trim($url));
}
echo trimUrlProtocol('http://sandbox.onlinephpfunctions.com/') . PHP_EOL;
echo trimUrlProtocol('https://sandbox.onlinephpfunctions.com/') . PHP_EOL;
echo trimUrlProtocol('http://www.sandbox.onlinephpfunctions.com/') . PHP_EOL;
echo trimUrlProtocol('https://www.sandbox.onlinephpfunctions.com/') . PHP_EOL;
echo trimUrlProtocol('http://sandbox.onlinephpfunctions.com') . PHP_EOL;
echo trimUrlProtocol('https://sandbox.onlinephpfunctions.com') . PHP_EOL;
echo trimUrlProtocol('http://www.sandbox.onlinephpfunctions.com') . PHP_EOL;
echo trimUrlProtocol('https://www.sandbox.onlinephpfunctions.com') . PHP_EOL;
echo trimUrlProtocol('sandbox.onlinephpfunctions.com') . PHP_EOL;
Alternatively, you can use parse_url, but you have to make additional checks to see whether the host part exists and then use a regular expression to trim www. Just use the first way; it is simple and lazy.
This will account for "http/https", "www" and the ending slash
$str = 'https://www.google.com/';
$str = preg_replace('#(^https?:\/\/(w{3}\.)?)|(\/$)#', '', $str);
echo $str; // google.com
Just ask if you need help understanding the regex.
Use parse_url
http://www.php.net/manual/en/function.parse-url.php
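For completeness, here is a small sketch of the parse_url route checked against the five inputs from the question. The function name pureDomain is an assumption made for this illustration. It adds a scheme when none is present, because parse_url does not report a host otherwise.

```php
<?php
// Hypothetical helper (an assumption for illustration): host without a leading "www.".
function pureDomain(string $input): ?string
{
    $input = trim($input);
    // parse_url() only fills in the host when a scheme (or leading //) is present.
    if (!preg_match('#^https?://#i', $input)) {
        $input = 'http://' . ltrim($input, '/');
    }
    $host = parse_url($input, PHP_URL_HOST);
    if (!is_string($host)) {
        return null;
    }
    return preg_replace('/^www\./i', '', $host);
}

$inputs = [
    'http://www.google.com/',
    'http://google.com/',
    'www.google.com/',
    'google.com/',
    'google.com',
];
foreach ($inputs as $in) {
    echo $in, ' -> ', pureDomain($in), PHP_EOL; // each line ends in google.com
}
```

Anchoring the www. pattern at the start of the host keeps other subdomains, and hosts that merely contain the text www., untouched.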
