PHP function to get the subdomain of a URL - php

Is there a function in PHP to get the name of the subdomain?
In the following example I would like to get the "en" part of the URL:
en.example.com

Here's a one line solution:
array_shift((explode('.', $_SERVER['HTTP_HOST'])));
Or using your example:
array_shift((explode('.', 'en.example.com')));
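EDIT: Fixed "only variables should be passed by reference" by adding double parentheses. (This workaround only helps on PHP 5; since PHP 7, redundant parentheses no longer suppress the notice.)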
EDIT 2: Starting from PHP 5.4 you can simply do:
explode('.', 'en.example.com')[0];
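One caveat with taking the first label: for a bare host such as example.com it returns "example", which is not a subdomain. A minimal sketch that guards against this, assuming the registrable domain always has exactly two labels (which does not hold for suffixes like co.uk). The function name first_subdomain_label is made up for illustration:

```php
<?php
// Hypothetical helper: returns the first label only when the host has
// more than two labels. Assumes a two-label registrable domain
// (example.com), so it gives wrong answers for suffixes such as co.uk.
function first_subdomain_label(string $host): ?string
{
    $labels = explode('.', $host);
    return count($labels) > 2 ? $labels[0] : null;
}

var_dump(first_subdomain_label('en.example.com')); // string(2) "en"
var_dump(first_subdomain_label('example.com'));    // NULL
```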

Uses the parse_url function.
$url = 'http://en.example.com';
$parsedUrl = parse_url($url);
$host = explode('.', $parsedUrl['host']);
$subdomain = $host[0];
echo $subdomain;
For multiple subdomains
$url = 'http://usa.en.example.com';
$parsedUrl = parse_url($url);
$host = explode('.', $parsedUrl['host']);
$subdomains = array_slice($host, 0, count($host) - 2 );
print_r($subdomains);

You can do this by first getting the domain name (e.g. sub.example.co.uk => example.co.uk) and then using strstr to get the subdomains.
$testArray = array(
'sub1.sub2.example.co.uk',
'sub1.example.com',
'example.com',
'sub1.sub2.sub3.example.co.uk',
'sub1.sub2.sub3.example.com',
'sub1.sub2.example.com'
);
foreach($testArray as $k => $v)
{
echo $k." => ".extract_subdomains($v)."\n";
}
function extract_domain($domain)
{
if(preg_match("/(?P<domain>[a-z0-9][a-z0-9\-]{1,63}\.[a-z\.]{2,6})$/i", $domain, $matches))
{
return $matches['domain'];
} else {
return $domain;
}
}
function extract_subdomains($domain)
{
$subdomains = $domain;
$domain = extract_domain($subdomains);
$subdomains = rtrim(strstr($subdomains, $domain, true), '.');
return $subdomains;
}
Outputs:
0 => sub1.sub2
1 => sub1
2 =>
3 => sub1.sub2.sub3
4 => sub1.sub2.sub3
5 => sub1.sub2

http://php.net/parse_url
<?php
$url = 'http://user:password@sub.hostname.tld/path?argument=value#anchor';
$array=parse_url($url);
$array['host']=explode('.', $array['host']);
echo $array['host'][0]; // returns 'sub'
?>

As the only reliable source for domain suffixes are the domain registrars, you can't find the subdomain without their knowledge.
There is a list with all domain suffixes at https://publicsuffix.org. This site also links to a PHP library: https://github.com/jeremykendall/php-domain-parser.
Please find an example below. I also added the sample for en.test.co.uk which is a domain with a multi suffix (co.uk).
<?php
require_once 'vendor/autoload.php';
$pslManager = new Pdp\PublicSuffixListManager();
$parser = new Pdp\Parser($pslManager->getList());
$host = 'http://en.example.com';
$url = $parser->parseUrl($host);
echo $url->host->subdomain;
$host = 'http://en.test.co.uk';
$url = $parser->parseUrl($host);
echo $url->host->subdomain;

PHP 7.0: Use the explode function and create a list of all the results.
list($subdomain,$host) = explode('.', $_SERVER["SERVER_NAME"]);
Example: sub.domain.com
echo $subdomain;
Result: sub
echo $host;
Result: domain

Simply...
preg_match('/(?:http[s]*\:\/\/)*(.*?)\.(?=[^\/]*\..{2,5})/i', $url, $match);
Just read $match[1]
Working example
It works perfectly with this list of urls
$url = array(
'http://www.domain.com', // www
'http://domain.com', // --nothing--
'https://domain.com', // --nothing--
'www.domain.com', // www
'domain.com', // --nothing--
'www.domain.com/some/path', // www
'http://sub.domain.com/domain.com', // sub
'опубликованному.значения.ua', // опубликованному ;)
'значения.ua', // --nothing--
'http://sub-domain.domain.net/domain.net', // sub-domain
'sub-domain.third-Level_DomaIN.domain.uk.co/domain.net' // sub-domain
);
foreach ($url as $u) {
preg_match('/(?:http[s]*\:\/\/)*(.*?)\.(?=[^\/]*\..{2,5})/i', $u, $match);
var_dump($match);
}

Simplest and fastest solution.
$sSubDomain = str_replace('.example.com','',$_SERVER['HTTP_HOST']);

$REFERRER = $_SERVER['HTTP_REFERER']; // Or other method to get a URL for decomposition
$domain = substr($REFERRER, strpos($REFERRER, '://')+3);
$domain = substr($domain, 0, strpos($domain, '/'));
// This line will return 'en' of 'en.example.com'
$subdomain = substr($domain, 0, strpos($domain, '.'));

Using regex, string functions, parse_url() or combinations of them is not a real solution. Just test any of the proposed solutions with the domain test.en.example.co.uk; none of them gives a correct result.
The correct solution is to use a package that parses domains with the Public Suffix List. I recommend TLDExtract; here is sample code:
$extract = new LayerShifter\TLDExtract\Extract();
$result = $extract->parse('test.en.example.co.uk');
$result->getSubdomain(); // will return (string) 'test.en'
$result->getSubdomains(); // will return (array) ['test', 'en']
$result->getHostname(); // will return (string) 'example'
$result->getSuffix(); // will return (string) 'co.uk'
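To make the point about test.en.example.co.uk concrete, here is a small sketch of what two of the common approaches from this thread return for that host. It uses only built-in string and array functions; the variable names are illustrative:

```php
<?php
$host = 'test.en.example.co.uk';
$labels = explode('.', $host);

// Approach 1: take the first label only.
$firstLabel = $labels[0];

// Approach 2: drop the last two labels, assuming the domain is "name.tld".
$withoutLastTwo = implode('.', array_slice($labels, 0, -2));

echo $firstLabel, "\n";     // test
echo $withoutLastTwo, "\n"; // test.en.example
```

Neither result is the expected "test.en", because neither approach knows that co.uk is a two-label suffix.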

The best and shortest solution I found is
array_shift(explode(".",$_SERVER['HTTP_HOST']));

For those who get 'Error: Strict Standards: Only variables should be passed by reference.'
Use like this:
$env = (explode(".",$_SERVER['HTTP_HOST']));
$env = array_shift($env);

$domain = 'sub.dev.example.com';
$tmp = explode('.', $domain); // split into parts
$subdomain = current($tmp);
print($subdomain); // prints "sub"
As seen in a previous question:
How to get the first subdomain with PHP?

There isn't really a 100% dynamic solution. I've just been trying to figure it out as well, and because of the different domain extensions (TLDs) this task would be really difficult without actually parsing all these extensions and checking them each time:
.com vs .co.uk vs org.uk
The most reliable option is to define a constant (or database entry etc.) that stores the actual domain name and remove it from the $_SERVER['SERVER_NAME'] using substr()
defined("DOMAIN")
|| define("DOMAIN", 'mymaindomain.co.uk');
function getSubDomain() {
if (empty($_SERVER['SERVER_NAME'])) {
return null;
}
$subDomain = substr($_SERVER['SERVER_NAME'], 0, -(strlen(DOMAIN)));
if (empty($subDomain)) {
return null;
}
return rtrim($subDomain, '.');
}
Now if you're using this function under http://test.mymaindomain.co.uk it will give you test or if you have multiple sub-domain levels http://another.test.mymaindomain.co.uk you'll get another.test - unless of course you update the DOMAIN.
I hope this helps.
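One thing the function above does not check is whether the server name really ends with DOMAIN. A hedged variant that takes both values as parameters and verifies the suffix first might look like this (sub_domain_of is a hypothetical name, not part of the original answer):

```php
<?php
// Hypothetical variant of getSubDomain(): the host and the base domain
// are passed in, and the host must really end with ".<base domain>".
function sub_domain_of(string $host, string $baseDomain): ?string
{
    $host = strtolower($host);
    $suffix = '.' . strtolower($baseDomain);

    if (strlen($host) <= strlen($suffix)) {
        return null; // too short to carry a subdomain
    }
    if (substr($host, -strlen($suffix)) !== $suffix) {
        return null; // host belongs to a different domain
    }
    // Everything before ".<base domain>" is the subdomain part.
    return substr($host, 0, -strlen($suffix));
}

var_dump(sub_domain_of('another.test.mymaindomain.co.uk', 'mymaindomain.co.uk')); // string(12) "another.test"
var_dump(sub_domain_of('mymaindomain.co.uk', 'mymaindomain.co.uk'));              // NULL
var_dump(sub_domain_of('test.otherdomain.co.uk', 'mymaindomain.co.uk'));          // NULL
```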

Simply
reset(explode(".", $_SERVER['HTTP_HOST']))

I'm doing something like this
$url = 'https://en.example.com';
$splitedBySlash = explode('/', $url);
$splitedByDot = explode('.', $splitedBySlash[2]);
$subdomain = $splitedByDot[0];

Suppose current url = sub.example.com
$host = array_reverse(explode('.', $_SERVER['SERVER_NAME']));
if (count($host) >= 3){
echo "Main domain is = ".$host[1].".".$host[0]." & subdomain is = ".$host[2];
// Main domain is = example.com & subdomain is = sub
} else {
echo "Main domain is = ".$host[1].".".$host[0]." & subdomain not found";
// "Main domain is = example.com & subdomain not found";
}

this is my solution, it works with the most common domains, you can fit the array of extensions as you need:
$SubDomain = explode('.', explode('|ext|', str_replace(array('.com', '.net', '.org'), '|ext|',$_SERVER['HTTP_HOST']))[0]);

// For www.abc.en.example.com
$host_Array = explode(".",$_SERVER['HTTP_HOST']); // Get HOST as array www, abc, en, example, com
array_pop($host_Array); array_pop($host_Array); // Remove com and example
array_shift($host_Array); // Remove www (Optional)
echo implode(".", $host_Array); // Combine array abc.en

I know I'm really late to the game, but here goes.
What I did was take the HTTP_HOST server variable ($_SERVER['HTTP_HOST']) and the number of letters in the domain (so for example.com it would be 11).
Then I used the substr function to get the subdomain. I did
$numberOfLettersInSubdomain = strlen($_SERVER['HTTP_HOST']) - 12;
$subdomain = substr($_SERVER['HTTP_HOST'], 0, $numberOfLettersInSubdomain);
I subtract 12 instead of 11 so that the dot separating the subdomain from the domain is dropped as well. So now if you entered test.example.com, the value of $subdomain would be test.
This is better than using explode because if the subdomain has a . in it, this will not cut it off.

if you are using drupal 7
this will help you:
global $base_path;
global $base_root;
$fulldomain = parse_url($base_root);
$splitdomain = explode(".", $fulldomain['host']);
$subdomain = $splitdomain[0];

$host = $_SERVER['HTTP_HOST'];
preg_match("/[^\.\/]+\.[^\.\/]+$/", $host, $matches);
$domain = $matches[0];
$url = explode($domain, $host);
$subdomain = str_replace('.', '', $url[0]);
echo 'subdomain: '.$subdomain.'<br />';
echo 'domain: '.$domain.'<br />';

From PHP 5.3 you can use strstr() with true parameter
echo strstr($_SERVER["HTTP_HOST"], '.', true); //prints en
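Note that strstr() returns false when the host contains no dot at all (for example "localhost"). A minimal sketch that falls back to the whole host in that case; label_before_first_dot is a made-up helper name:

```php
<?php
// strstr() returns false when the needle is absent, so a host without
// any dot (for example "localhost") needs a fallback.
function label_before_first_dot(string $host): string
{
    $label = strstr($host, '.', true);
    return $label === false ? $host : $label;
}

echo label_before_first_dot('en.example.com'), "\n"; // en
echo label_before_first_dot('localhost'), "\n";      // localhost
```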

Try this...
$domain = 'en.example.com';
$tmp = explode('.', $domain);
$subdomain = current($tmp);
echo($subdomain); // echo "en"

function get_subdomain($url=""){
if($url==""){
$url = $_SERVER['HTTP_HOST'];
}
$parsedUrl = parse_url($url);
$host = explode('.', $parsedUrl['path']);
$subdomains = array_slice($host, 0, count($host) - 2 );
return implode(".", $subdomains);
}

you can use this too
echo substr($_SERVER['HTTP_HOST'], 0, strrpos($_SERVER['HTTP_HOST'], '.', -5));

Maybe I'm late, but even though the post is old, just as I get to it, many others do.
Today, the wheel is already invented, with a library called php-domain-parser that is active, and in which two mechanisms can be used.
One based on the Public Suffix List and one based on the IANA list.
Simple and effective, it allows us to create simple helpers that help us in our project, with the ability to know that the data is maintained, in a world in which the extensions and their variants are very changeable.
Many of the answers given in this post do not pass a battery of unit tests that checks certain current extensions and their multi-level variants, nor do they handle the edge cases of domains with extended characters.
Maybe it serves you, as it served me.

<?php
function get_domain($host) {
$parts = explode('.',$host);
$extension = $parts[count($parts)-1];
$name = $parts[count($parts)-2];
return $name.'.'.$extension;
}
echo get_domain("https://api.neoistone.com");
?>

If you only want what comes before the first period:
list($sub) = explode('.', 'en.example.com', 2);

Related

changing a given url to return only the domain

I'm getting a url from a form, this way:
$input_website = isset($_POST['website']) ? check_plain($_POST['website']) : 'None';
I need to get back a naked domain name(for some API integration), for example: http://www.example.com will return as example.com
and www.example.com will return example.com etc.
I have this code now, that returns the correct url for the first case http://www.example.com but returns nothing for www.example.com or even example.com:
function get_domain($url)
{
$pieces = parse_url($url);
$domain = isset($pieces['host']) ? $pieces['host'] : '';
if (preg_match('/(?P<domain>[a-z0-9][a-z0-9\-]{1,63}\.[a-z\.]{2,6})$/i', $domain, $regs)) {
return $regs['domain'];
}
return false;
}
Can you please advise on the matter?
As per discussion with you:
$url = 'www.noamddd.com';
$arrUrl = explode("/", $url);
echo $arrUrl[0];
Old Answer:
Make a function with the following code block and get the domain names.
Try this
more about parse_url
$url = 'http://google.com/dhasjkdas/sadsdds/sdda/sdads.html';
$parse = parse_url($url);
print $parse['host']; //google.com
Also you can do this in another way:
echo $domain = str_ireplace('www.', '', parse_url($url, PHP_URL_HOST));//google.com
If you just have the URL (and not want the current domain name like frayne-konok suggests) and want to extract the server name, you can use a regular expression like this:
$serverName = preg_replace('|.*?://(.*?)/.*|', '$1', $url);
I ended up doing something a bit different: checking if there is http and, if not, adding it using this function:
function addHttp($website) {
if (!preg_match("~^(?:f|ht)tps?://~i", $website)) {
$website = "http://" . $website;
}
return $website;
}
and only then I'm sending it to my other function that returns the domain.
For sure not the best way, but it works.
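The two steps (add a scheme if it is missing, then extract the host) can be sketched in one function. This is only an illustration under assumptions: host_from_input is a hypothetical name, and it strips a leading "www." but does not try to handle multi-label suffixes such as co.uk:

```php
<?php
// Hypothetical helper combining the two steps: prepend a scheme when
// none is present, then let parse_url() extract the host.
function host_from_input(string $input): ?string
{
    $input = trim($input);
    if (!preg_match('~^[a-z][a-z0-9+.-]*://~i', $input)) {
        $input = 'http://' . $input;
    }

    $host = parse_url($input, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null; // parse_url() could not find a host
    }

    // Strip a leading "www." only, not every occurrence in the host.
    return preg_replace('~^www\.~i', '', $host);
}

var_dump(host_from_input('http://www.example.com')); // string(11) "example.com"
var_dump(host_from_input('www.example.com'));        // string(11) "example.com"
var_dump(host_from_input('example.com'));            // string(11) "example.com"
```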

Get domain from any php string

I am trying to get domain from inputting a url, or the domain itself (i.e "domain.com")
I tried using
function get_domain($url)
{
$pieces = parse_url($url);
$domain = isset($pieces['host']) ? $pieces['host'] : '';
if (preg_match('/(?P<domain>[a-z0-9][a-z0-9\-]{1,63}\.[a-z\.]{2,6})$/i', $domain, $regs)) {
return $regs['domain'];
}
return false;
}
But that fails when inputting simply "domain.com"
I can't seem to figure this out.
If you call parse_url with 'domain.com', it returns it as the path (as if it were a file name):
$pieces = parse_url($url); // 'path' => domain.com
Update
You can use such regex to take domain
^(https{0,1}:\/\/|)([^\/]+)(\/*.+|)$
it takes example.com from all lines below
http://example.com/dd.dfgjhdfg
https://example.com/dd.dfgjhdfg
http://example.com/
http://example.com
example.com/
example.com
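A short sketch of how that regex could be applied with preg_match(), using the example lines above as input. The wrapper function host_by_regex is hypothetical; the host ends up in capture group 2:

```php
<?php
// Hypothetical wrapper around the regex from the answer.
// Group 1 is the optional scheme, group 2 the host, group 3 the rest.
function host_by_regex(string $input): ?string
{
    $pattern = '/^(https{0,1}:\/\/|)([^\/]+)(\/*.+|)$/';
    return preg_match($pattern, $input, $m) === 1 ? $m[2] : null;
}

$inputs = [
    'http://example.com/dd.dfgjhdfg',
    'https://example.com/dd.dfgjhdfg',
    'http://example.com/',
    'http://example.com',
    'example.com/',
    'example.com',
];

foreach ($inputs as $input) {
    echo host_by_regex($input), "\n"; // example.com for every line
}
```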
Here is the working code:
$domainparse = parse_url($domainpost);
$domainpath = $domainparse['path'];
if ($domainpost == $domainpath) {
$domain = $domainpost;
}else {
$domain = get_domain($domainpost);
}
$domain = str_replace("www.", "", $domain);
Thank you splash58! I would have used your code if I had seen it in time

PHP get domain with subdomain from string

I'm using the following function to cut the domain from a string:
function get_domain($url)
{
$pieces = parse_url($url);
$domain = isset($pieces['host']) ? $pieces['host'] : '';
if (preg_match('/(?P<domain>[a-z0-9][a-z0-9\-]{1,63}\.[a-z\.]{2,6})$/i', $domain, $regs)) {
return $regs['domain'];
}
return false;
}
I need to cut out subdomain + domain. How should I change preg_match to get it?
PS: I was searching for a solution, but everyone wants to cut only the domain without the subdomain.
If you can't work out the regexp, a more procedural approach might be:
$pieces = parse_url($url);
$aDomains = explode('.', $pieces['host']);
$sub = array_shift($aDomains);
$restofdomain = implode('.', $aDomains);
...if you're always going to want just the first label (i.e. it wouldn't work with a root domain like 'somedomain.com').

php check if domain equals value, then perform action

I need to take a variable that contains a URL, and check to see if the domain equals a certain domain, if it does, echo one variable, if not, echo another.
$domain = "http://www.google.com/docs";
if ($domain == google.com)
{ echo "yes"; }
else
{ echo "no"; }
I'm not sure how to write the second line, where the if statement checks whether $domain contains the domain.
This is done by using parse_url:
$host = parse_url($domain, PHP_URL_HOST);
if($host == 'www.google.com') {
// do something
}
Slightly more advanced domain-finding function.
$address = "http://www.google.com/apis/jquery";
if (get_domain($address) == "google.com") {
print "Yes.";
}
function get_domain($url)
{
$pieces = parse_url($url);
$domain = isset($pieces['host']) ? $pieces['host'] : '';
if (preg_match('/(?P<domain>[a-z0-9][a-z0-9\-]{1,63}\.[a-z\.]{2,6})$/i', $domain, $regs)) {
return $regs['domain'];
}
return false;
}
In addition to the other answers you could use a regex, like this one which looks for google.com in the domain name
$domain = "http://www.google.com/docs";
if (preg_match('{^http://[\w\.]*google\.com/}i', $domain))
{
}
Have you tried parse_url()?
Note that you might also want to explode() the resulting domain on '.', depending on exactly what you mean by 'domain'.
You can use the parse_url function to divide the URL into the separate parts (protocol/host/path/query string/etc). If you also want to allow www.google.com to be a synonym for google.com, you'll need to add an extra substring check (with substr) that makes sure that the latter part of the host matches the domain you're looking for.
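The extra substring check described above might be sketched as follows. host_is_within is a hypothetical helper; it treats the host as matching when it equals the domain or ends with "." plus the domain, so that notgoogle.com does not count as google.com:

```php
<?php
// Hypothetical helper: true when the URL's host is the given domain
// or any subdomain of it (www.google.com counts as google.com).
function host_is_within(string $url, string $domain): bool
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return false; // no host component found
    }

    $host = strtolower($host);
    $domain = strtolower($domain);
    if ($host === $domain) {
        return true;
    }

    // Compare the tail of the host against ".domain" so that
    // "notgoogle.com" is not mistaken for "google.com".
    $suffix = '.' . $domain;
    return substr($host, -strlen($suffix)) === $suffix;
}

echo host_is_within('http://www.google.com/docs', 'google.com') ? "yes" : "no", "\n"; // yes
echo host_is_within('http://notgoogle.com/docs', 'google.com') ? "yes" : "no", "\n";  // no
```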

How do you strip out the domain name from a URL in php?

I'm looking for a method (or function) to strip out the domain.ext part of any URL that's fed into the function. The domain extension can be anything (.com, .co.uk, .nl, .whatever), and the URL that's fed into it can be anything from http://www.domain.com to www.domain.com/path/script.php?=whatever
What's the best way to go about doing this?
parse_url turns a URL into an associative array:
php > $foo = "http://www.example.com/foo/bar?hat=bowler&accessory=cane";
php > $blah = parse_url($foo);
php > print_r($blah);
Array
(
[scheme] => http
[host] => www.example.com
[path] => /foo/bar
[query] => hat=bowler&accessory=cane
)
You can also write a regular expression to get exactly what you want.
Here is my attempt at it:
$pattern = '/\w+\..{2,3}(?:\..{2,3})?(?:$|(?=\/))/i';
$url = 'http://www.example.com/foo/bar?hat=bowler&accessory=cane';
if (preg_match($pattern, $url, $matches) === 1) {
echo $matches[0];
}
The output is:
example.com
This pattern also takes into consideration domains such as 'example.com.au'.
Note: I have not consulted the relevant RFC.
You can use parse_url() to do this:
$url = 'http://www.example.com';
$domain = parse_url($url, PHP_URL_HOST);
$domain = str_replace('www.','',$domain);
In this example, $domain should contain example.com, irrespective of it having www or not. It also works for a domain such as .co.uk
Following code will trim protocol, domain and port from absolute URL:
$urlWithoutDomain = preg_replace('#^.+://[^/]+#', '', $url);
Here are a couple simple functions to get the root domain (example.com) from a normal or long domain (test.sub.domain.com) or url (http://www.example.com).
/**
* Get root domain from full domain
* @param string $domain
*/
public function getRootDomain($domain)
{
$domain = explode('.', $domain);
$tld = array_pop($domain);
$name = array_pop($domain);
$domain = "$name.$tld";
return $domain;
}
/**
* Get domain name from url
* @param string $url
*/
public function getDomainFromUrl($url)
{
$domain = parse_url($url, PHP_URL_HOST);
$domain = $this->getRootDomain($domain);
return $domain;
}
Solved this...
Say we're calling dev.mysite.com and we want to extract 'mysite.com'
$requestedServerName = $_SERVER['SERVER_NAME']; // = dev.mysite.com
$thisSite = explode('.', $requestedServerName); // site name now an array
array_shift($thisSite); //chop off the first array entry eg 'dev'
$thisSite = join('.', $thisSite); //join it back together with dots ;)
echo $thisSite; //outputs 'mysite.com'
Works with mysite.co.uk too so should work everywhere :)
I spent some time thinking about whether it makes sense to use a regular expression for this, but in the end I think not.
firstresponder's regexp came close to convincing me it was the best way, but it didn't work on anything missing a trailing slash (so http://example.com, for instance). I fixed that with the following: '/\w+\..{2,3}(?:\..{2,3})?(?=[\/\W])/i', but then I realized that matches twice for urls like 'http://example.com/index.htm'. Oops. That wouldn't be so bad (just use the first one), but it also matches twice on something like this: 'http://abc.ed.fg.hij.kl.mn/', and the first match isn't the right one. :(
A co-worker suggested just getting the host (via parse_url()), and then just taking the last two or three array bits (explode() on '.'). The two or three would be based on a list of domains, like 'co.uk', etc. Making up that list becomes the hard part.
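A rough sketch of that idea, with a tiny hand-written suffix list standing in for the real thing. The constant MULTI_LABEL_SUFFIXES and the function registrable_domain are assumptions for illustration only; a complete list would have to come from something like the Public Suffix List:

```php
<?php
// Illustrative only: a tiny hand-written list of two-label suffixes.
// A real list is far longer and changes over time.
const MULTI_LABEL_SUFFIXES = ['co.uk', 'org.uk', 'com.au'];

function registrable_domain(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null; // no host component (e.g. the scheme is missing)
    }

    $labels = explode('.', strtolower($host));
    $lastTwo = implode('.', array_slice($labels, -2));

    // Keep three labels when the last two form a known suffix, else two.
    $take = in_array($lastTwo, MULTI_LABEL_SUFFIXES, true) ? 3 : 2;
    if (count($labels) < $take) {
        return null; // the host is a bare suffix or a single label
    }
    return implode('.', array_slice($labels, -$take));
}

echo registrable_domain('http://www.example.com/foo'), "\n";   // example.com
echo registrable_domain('http://www.example.co.uk/foo'), "\n"; // example.co.uk
```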
There is only one correct way to extract domain parts: use the Public Suffix List (a database of TLDs). I recommend the TLDExtract package; here is sample code:
$extract = new LayerShifter\TLDExtract\Extract();
$result = $extract->parse('www.domain.com/path/script.php?=whatever');
$result->getSubdomain(); // will return (string) 'www'
$result->getHostname(); // will return (string) 'domain'
$result->getSuffix(); // will return (string) 'com'
This function should work:
function Delete_Domain_From_Url($Url = false)
{
if($Url)
{
$Url_Parts = parse_url($Url);
$Url = isset($Url_Parts['path']) ? $Url_Parts['path'] : '';
$Url .= isset($Url_Parts['query']) ? "?".$Url_Parts['query'] : '';
}
return $Url;
}
To use it:
$Url = "https://stackoverflow.com/questions/176284/how-do-you-strip-out-the-domain-name-from-a-url-in-php";
echo Delete_Domain_From_Url($Url);
# Output:
#/questions/176284/how-do-you-strip-out-the-domain-name-from-a-url-in-php
