Get domain (without TLD) using PHP [duplicate] - php

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Get the subdomain from a URL
I have seen posts about using parse_url to get www.domain.tld, but how can I get just "domain" using PHP?
I currently have this regex:
$pattern = '#https?://[a-z]{1,}\.{0,}([a-z]{1,})\.com(\.[a-z]{1,}){0,}#';
but this only works with .com, and I need it to work with all TLDs (.co.uk, .com, .tv, etc.).
Is there a reliable way to do this? I am not sure whether regex is the best way to go. I could explode on ".", but then subdomains would mess it up.
EDIT
so the desired outcome would be
$url = "https://stackoverflow.com/questions/11952907/get-domain-without-tld-using-php#comment15926320_11952907";
$output = "stackoverflow";
After doing more research: would anyone advise using parse_url to get www.domain.tld and then using explode to get domain?

Try this regex :
#^https?://(www\.)?([^/]*?)(\.co)?\.[^.]+?/#

You could use the parse_url function. Doc is here.
Something like:
$url = 'http://username:password@hostname/path?arg=value#anchor';
$parts = parse_url($url);
print_r($parts);
And then you can take $parts['host'] and do:
$arr = explode('.', $parts['host']);
return $arr[count($arr) - 2];

I think you don't need regex.
function getDomain($url) {
    $things_like_WWW_at_the_start = array('www');
    $urlContents = parse_url($url);
    $domain = explode('.', $urlContents['host']);
    if (!in_array($domain[0], $things_like_WWW_at_the_start))
        return $domain[0];
    else
        return $domain[1];
}
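The answers above break down on multi-part suffixes such as .co.uk. A minimal sketch of one workaround follows, assuming a small hand-maintained suffix list. The function name domainLabel and the contents of $multiPartSuffixes are hypothetical and deliberately incomplete; this is not the Public Suffix List, so treat it as an illustration only.

```php
<?php
// Hypothetical helper: returns the label just before the suffix of a URL's host.
// $multiPartSuffixes is an illustrative, incomplete list (an assumption, not the Public Suffix List).
function domainLabel(string $url, array $multiPartSuffixes = ['co.uk', 'org.uk', 'com.au', 'co.jp']): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null; // no host could be parsed
    }

    $labels = explode('.', strtolower($host));
    $count = count($labels);
    if ($count < 2) {
        return $labels[0]; // single-label host such as "localhost"
    }

    // If the last two labels form a known two-part suffix, skip both; otherwise skip one.
    $lastTwo = $labels[$count - 2] . '.' . $labels[$count - 1];
    $suffixLength = in_array($lastTwo, $multiPartSuffixes, true) ? 2 : 1;

    // The label immediately before the suffix; clamp to 0 for hosts that are only a suffix.
    return $labels[max($count - $suffixLength - 1, 0)];
}
```

With the URL from the question this yields "stackoverflow", and "http://www.example.co.uk/" yields "example". Any suffix missing from the list falls back to the single-label rule, which is why a real suffix database is the more reliable route.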

Related

get domain name from link with a fast and reliable method

Currently I am using this code to get the domain name (without www. or domain ending like .com):
explode('.', $url)[1];
Because this code runs in a loop, it takes a very long time. Furthermore, it cannot get "example" from http://example.com/asd/asd.asd.html. Is there another, faster way to solve this?
Thank you for any answer in advance!
best greetings
Use parse_url():
$host = parse_url($url, PHP_URL_HOST);
With PHP_URL_HOST, parse_url() returns only the host.
Then use a regex to get the desired part of the host:
preg_match('/^(?:www\.)?([^.]+)/', $host, $match);
$result = $match[1];
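Put together for use in a loop, the approach might look like the sketch below. The function name hostLabel and the sample URLs are assumptions for illustration. It returns the first host label after an optional "www.", so it does not handle other subdomains.

```php
<?php
// Hypothetical helper: first label of the host, ignoring a leading "www.".
function hostLabel(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null; // URL had no host or could not be parsed
    }
    // Capture everything up to the first dot, after an optional "www." prefix.
    if (preg_match('/^(?:www\.)?([^.]+)/i', $host, $match) === 1) {
        return $match[1];
    }
    return null;
}

// Illustrative input list; in practice this would be the loop's data.
$urls = ['http://example.com/asd/asd.asd.html', 'http://www.example.org/'];
$labels = array_map('hostLabel', $urls);
```

Because the dots in the path are never seen by the regex, "asd.asd.html" no longer interferes with the result.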

Php - get path after php script

I have searched high and low for an answer to my question and cannot find one. Basically, what I want to do is get the path after a PHP script, e.g. take "http://www.example.com/index.php/arg1/arg2/arg3/etc/" and get arg1, arg2, arg3, etc. in an array. How can I do this in PHP, and once I do, will "http://www.example.com/arg1/arg2/arg3/etc" still return the same results? If not, how can I achieve this?
Here is how to get the answer, and a few others you will have in the future. Make a script, e.g. "test.php", and just put this one line in it:
<?php phpinfo();
Then browse to it with http://www.example.com/test.php/one/two/three/etc
Look through the $_SERVER values for the one that matches what you are after.
I see the answer you want in this case in $_SERVER["PATH_INFO"]: "/one/two/three/etc"
Then use explode to turn it into an array:
print_r(explode('/', $_SERVER['PATH_INFO']));
However sometimes $_SERVER["REQUEST_URI"] is going to be the one you want, especially if using URL rewriting.
UPDATE:
Responding to comment, here is what I'd do:
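// REQUEST_URI starts with "/", so trim it first; otherwise $args[0] is an empty string
$args = explode('/', trim($_SERVER['REQUEST_URI'], '/'));
if ($args[0] == 'index.php') array_shift($args);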
Try like this :
$url ="http://www.example.com/index.php/arg1/arg2/arg3/etc/";
$array = explode("/",parse_url($url, PHP_URL_PATH));
if (in_array('index.php', $array))
{
unset($array[array_search('index.php',$array)]);
}
print_r(array_values(array_filter($array)));
You can try using parse_url and parse_str:
$url = $_SERVER['REQUEST_URI'];
$ParseUrl = parse_url($url);
print_r($ParseUrl);
parse_str($ParseUrl['query'] ?? '', $arr); // parse_str() fills $arr and returns nothing
print_r($arr);
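The ideas in these answers can be combined into one function. This is only a sketch: the name splitRequest, its $script parameter, and the returned array shape are assumptions, and it takes the URI as an argument instead of reading $_SERVER so it can be tried on the command line. It assumes PHP 7.4 or later for the arrow function.

```php
<?php
// Hypothetical helper: split a request URI into path segments and query arguments.
function splitRequest(string $requestUri, string $script = 'index.php'): array
{
    $path = parse_url($requestUri, PHP_URL_PATH);
    $query = parse_url($requestUri, PHP_URL_QUERY);

    // Drop empty segments produced by leading, trailing, or doubled slashes.
    $segments = array_values(array_filter(
        explode('/', is_string($path) ? $path : ''),
        static fn(string $segment): bool => $segment !== ''
    ));

    // Remove the script name when it is the first segment, so both URL styles match.
    if (($segments[0] ?? null) === $script) {
        array_shift($segments);
    }

    // parse_str() writes its result into the second argument.
    $args = [];
    parse_str(is_string($query) ? $query : '', $args);

    return ['segments' => $segments, 'query' => $args];
}
```

Calling it with "/index.php/arg1/arg2/arg3/etc/" and with "/arg1/arg2/arg3/etc" gives the same segments, which is the behaviour the question asks about. Whether the second URL reaches the script at all still depends on the server's rewrite configuration.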

How do i extract a link from a longer link using php?

I was wondering if someone knows the best method to extract a link from another link. Here's an example:
If I have links in the following format:
http://www.youtube.com/watch?v=35HBFeB4jYg OR
http://it.answers.yahoo.com/question/index?qid=20080520042405AApM2Rv OR
https://www.google.it/search?q=rap+tedesco&aq=f&oq=rap+tedesco&aqs=chrome.0.57j62l2.2287&sourceid=chrome&ie=UTF-8#hl=en&sclient=psy-ab&q=migliori+programatori&oq=migliori+programatori&gs_l=serp.3..0i19j0i13i30i19l3.9986.13880.0.14127.14.10.0.4.4.0.165.931.6j4.10.0...0.0...1c.1.7.psy-ab.tPmiWRyUVXA&pbx=1&bav=on.2,or.r_cp.r_qf.&fp=ffc0e9337f73a744&biw=1280&bih=699
How would I go about extracting only the web pages like so:
http://www.youtube.com
http://it.answers.yahoo.com
https://www.google.it
I was wondering if and what regular expression I could use with PHP to achieve this, also are regular expressions the way to go?
There is a PHP function for parsing URLs: parse_url
$url = 'http://it.answers.yahoo.com/question/index?qid=20080520042405AApM2Rv';
$p = parse_url($url);
echo $p["scheme"] . "://" . $p["host"];
Use function parse_url.
$link = "https://www.google.it/search?q=rap+tedesco";
$parseUrl = parse_url($link);
$siteName = $parseUrl['scheme']."://". $parseUrl['host'];
Using Regexp.
preg_match('#http(s?)://([\w]+\.){1}([\w]+\.?)+#',$link,$matches);
echo $matches[0];
You just want to have the domain of the page, in PHP there exists a function called parse_url that could help
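A slightly more defensive version of the parse_url answers could look like this. The function name originOf is made up for illustration, and the sketch deliberately ignores ports and credentials.

```php
<?php
// Hypothetical helper: returns "scheme://host" for a URL, or null if either part is missing.
function originOf(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return null; // malformed URL, or no scheme/host present
    }
    return $parts['scheme'] . '://' . $parts['host'];
}
```

For the three example links in the question this gives http://www.youtube.com, http://it.answers.yahoo.com and https://www.google.it, without needing a regular expression.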

Remove certain part of string in PHP [duplicate]

This question already has answers here:
Get domain name (not subdomain) in php
(18 answers)
Closed 10 years ago.
I've already seen a bunch of questions on this exact subject, but none seem to solve my problem. I want to create a function that will remove everything from a website address, except for the domain name.
For example if the user inputs: http://www.stackoverflow.com/blahblahblah I want to get stackoverflow, and the same way if the user inputs facebook.com/user/bacon I want to get facebook.
Does anyone know of a function or a way to remove certain parts of strings? Maybe it'll search for http, and when found it'll remove everything up to and including the //. Then it'll search for www, and if found it'll remove everything up to the first dot. Then it keeps everything until the next dot and removes everything after it? Looking at it now, this might cause problems with sites such as http://www.en.wikipedia.org, because I'll be left with only en.
Any ideas (preferably in PHP, but JavaScript is also welcome)?
EDIT 1:
Thanks to great feedback I think I've been able to work out a function that does what I want:
function getdomain($url) {
    $parts = parse_url($url);
    // Prepend a scheme only when none is present, so https URLs are left alone
    if (!isset($parts['scheme'])) {
        $url = 'http://'.$url;
    }
    $parts2 = parse_url($url);
    $host = $parts2['host'];
    $remove = explode('.', $host);
    $result = $remove[0];
    if ($result == 'www') {
        $result = $remove[1];
    }
    return $result;
}
It's not perfect, at least where subdomains are concerned, but I think it's possible to do something about it. Maybe add a second if statement at the end to check the length of the array: if it's bigger than two, choose item 1 instead of item 0. This obviously gives me trouble with any domain using .co.uk (because that'll be three items long, but I don't want to return co). I'll work on it a bit and see what I come up with. I'd be glad if some of you PHP gurus out there could take a look as well. I'm not as skilled or as experienced as any of you... :P
Use parse_url to split the URL into the different parts. What you need is the hostname. Then you will want to split it by the dot and get the first part:
$url = 'http://facebook.com/blahblah';
$parts = parse_url($url);
$host = $parts['host']; // facebook.com
$foo = explode('.', $host);
$result = $foo[0]; // facebook
You can use the parse_url function from PHP which returns exactly what you want - see
Use the parse_url function in PHP to get domain.com and then replace .com with an empty string.
I am a little rusty on my regular expressions, but this should work.
$url='http://www.en.wikipedia.org';
$domain = parse_url($url, PHP_URL_HOST); // Will return www.en.wikipedia.org
$domain = preg_replace('/\.(com|org)$/', '', $domain); // www.en.wikipedia
http://php.net/manual/en/function.parse-url.php
PHP REGEX: Get domain from URL
http://rubular.com/r/MvyPO9ijnQ //Check regular expressions
You're looking for info on Regular Expression. It's a bit complicated, so be prepared to read up. In your case, you'll best utilize preg_match and preg_replace. It searches for a match based on your pattern and replaces the matches with your replacement.
preg_match
preg_replace
I'd start with a pattern like this: find .com, .net or .org and delete it and everything after it. Then find the last . and delete it and everything in front of it. Finally, if // exists, delete it and everything in front of it.
if (preg_match("/^http:\/\//i", $url))
    $url = preg_replace("/^http:\/\//i", "", $url);
if (preg_match("/www\./i", $url))
    $url = preg_replace("/www\./i", "", $url);
if (preg_match("/\.com/i", $url))
    $url = preg_replace("/\.com/i", "", $url);
if (preg_match("/\/*$/", $url))
    $url = preg_replace("/\/*$/", "", $url);
^ = at the start of the string
i = case insensitive
\ = escape char
$ = the end of the string
This will have to be played around with and tweaked, but it should get you pointed in the right direction.
Javascript:
document.domain.replace(".com","")
PHP:
$url = 'http://google.com/something/something';
$parse = parse_url($url);
echo str_replace(".com","", $parse['host']); //returns google
This is quite a quick method but should do what you want in PHP:
function getDomain($URL) {
    return explode('.', $URL)[1];
}
I will update it when I get chance but basically it splits the URL into pieces by the full stop and then returns the second item which should be the domain. A bit more logic would be required for longer domains such as www.abc.xyz.com but for normal urls it would suffice.
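Several of the answers above assume the input already has a scheme, but the question also mentions input like facebook.com/user/bacon, for which parse_url reports no host. A minimal sketch of normalising such input first is shown here; the function name hostFromUserInput is hypothetical, and removing the TLD is left as a separate step.

```php
<?php
// Hypothetical helper: extract the host from user input that may lack a scheme.
function hostFromUserInput(string $input): ?string
{
    $input = trim($input);

    // parse_url() only reports a host when a scheme (or a leading "//") is present,
    // so prepend "http://" when the input does not start with "scheme://".
    if (!preg_match('#^[a-z][a-z0-9+.-]*://#i', $input)) {
        $input = 'http://' . $input;
    }

    $host = parse_url($input, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null;
    }

    // Strip a leading "www." only; other subdomains are kept.
    return preg_replace('/^www\./i', '', $host);
}
```

This returns "facebook.com" for facebook.com/user/bacon and "en.wikipedia.org" for http://www.en.wikipedia.org, leaving the decision about which label is the domain to whatever suffix handling is applied afterwards.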

PHP Remove Domain Name Extension from String

I was wondering about the best way of removing certain things from a domain using PHP.
For example:
"http://mydomain.com/" into "mydomain"
or
"http://mydomain.co.uk/" into "mydomain"
I'm looking for a quick function that will allow me to remove such things as:
"http://", "www.", ".com", ".co.uk", ".net", ".org", "/" etc
Thanks in advance :)
To get the host part of a URL use parse_url:
$host = parse_url($url, PHP_URL_HOST);
And for the rest see my answer to Remove domain extension.
Could you use string replace?
str_replace('http://', '', $url);
This would strip out 'http://' from a string. All you would have to do first is get the current url of the page and pass it through any string replace you wanted to..
I would str_replace out the 'http://' and then explode the periods in the full domain name.
You can combine parse_url() and the str_* functions, but you will never get a correct result if you need to cut the domain suffix (.com, .net, etc.) from it.
For example:
parse_url('http://mydomain.co.uk/', PHP_URL_HOST); // will return 'mydomain.co.uk'
You need to use a library that relies on the Public Suffix List to handle such situations. I recommend TLDExtract.
Here is some sample code:
$extract = new LayerShifter\TLDExtract\Extract();
$result = $extract->parse('mydomain.co.uk');
$result->getHostname(); // will return 'mydomain'
$result->getSuffix(); // will return 'co.uk'
$result->getFullHost(); // will return 'mydomain.co.uk'
$result->getRegistrableDomain(); // will return 'mydomain.co.uk'
