I have never used regex before and I was wondering how to write a regular expression in PHP that gets the domain of the URL. For example:
http://www.hegnar.no/bors/article488276.ece --> hegnar.no
You don't need a regular expression for this task.
Check PHP's built-in function, parse_url:
http://php.net/manual/en/function.parse-url.php
Just use parse_url() if you are specifically dealing with URLs.
For example:
$url = "http://www.hegnar.no/bors/article488276.ece";
$url_u_want = parse_url($url, PHP_URL_HOST);
Docs
EDIT:
To remove the leading www., use:
$url_u_want = preg_replace("/^www\./", "", $url_u_want);
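The two steps above can be combined into one small function. This is only a minimal sketch: the name host_without_www is hypothetical (it is not a PHP built-in), and it assumes the input is a full URL that includes a scheme.

```php
<?php
// Hypothetical helper: returns the host of a URL with a leading "www." removed,
// or null when parse_url() cannot find a host.
function host_without_www(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null;
    }
    // Strip a single leading "www." (case-insensitive), nothing else.
    return preg_replace('/^www\./i', '', $host);
}

echo host_without_www('http://www.hegnar.no/bors/article488276.ece'), "\n"; // hegnar.no
```

Returning null instead of an empty string makes it easier for the caller to tell "no host found" apart from a real result.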
$page = "http://google.no/page/page_1.html";
preg_match_all("/((?:[a-z][a-z\\.\\d\\-]+)\\.(?:[a-z][a-z\\-]+))(?![\\w\\.])/", $page, $result, PREG_PATTERN_ORDER);
print_r($result);
$host = parse_url($url, PHP_URL_HOST);
$host = array_reverse(explode('.', $host));
$host = $host[1].'.'.$host[0];
See
PHP Regex for extracting subdomains of arbitrary domains
and
Javascript/Regex for finding just the root domain name without sub domains
The problem with using parse_url alone is that a URL whose host has no TLD (.com, .net, etc.) still yields a host. Here the result is bannedadsense, which looks like success even though bannedadsense is not a domain.
$url = 'http://bannedadsense/isbanned'; // preg_match below rejects this host
//$url = 'http://bannedadsense.com/isbanned'; // preg_match below accepts this host
$domain = parse_url($url, PHP_URL_HOST);
// returns "bannedadsense", even though that is not a valid domain
So we also need to check that the host has a dot extension (.com, .net, .org, etc.):
if(preg_match("/^[a-zA-Z0-9][a-zA-Z0-9-]{1,61}[a-zA-Z0-9](?:\.[a-zA-Z]{2,})+$/i",$domain)) {
echo $domain;
}else{
echo "<br>";
echo "false";
}
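The check above could be wrapped into a reusable function, roughly as follows. The function name host_with_tld is hypothetical, and the pattern is the same one used above, so it shares its limits (for example, it rejects hosts whose later labels contain digits).

```php
<?php
// Hypothetical helper: returns the host only if it ends in at least one
// dot-separated alphabetic label (e.g. ".com"), otherwise null.
function host_with_tld(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null;
    }
    // Same pattern as in the answer above.
    $pattern = '/^[a-z0-9][a-z0-9-]{1,61}[a-z0-9](?:\.[a-z]{2,})+$/i';
    return preg_match($pattern, $host) === 1 ? $host : null;
}

var_dump(host_with_tld('http://bannedadsense/isbanned'));     // NULL
var_dump(host_with_tld('http://bannedadsense.com/isbanned')); // string(17) "bannedadsense.com"
```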
Related
I've a PHP file. In this file I need to check, if my URL has the following ending:
www.example.de/dashboard/2/
So the ending can be a number 1 - 99+ which is always at the end of the url between two slashes. I can't use $_GET here. If it is $_GET, it would be easy:
if ( isset($_GET['ending']) ) :
So how can I do this without a parameter in the URL? Thanks for your help!
if (preg_match('~^/dashboard/(\d+)~', $_SERVER['REQUEST_URI'])) {
foo();
}
Use a regular expression on the request URI. Note that the pattern needs delimiters (here ~), otherwise preg_match fails with a warning.
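If the number itself is needed, it can be captured from the request URI. The following is a minimal sketch, assuming the path always has the form /dashboard/<number>/; the function name dashboard_number is hypothetical, and the fallback string is only there so the snippet also runs from the command line.

```php
<?php
// Hypothetical helper: returns the number from a /dashboard/<n>/ path, or null.
function dashboard_number(string $requestUri): ?int
{
    // Drop any query string first, then require the whole path to match.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (is_string($path) && preg_match('~^/dashboard/(\d+)/?$~', $path, $m) === 1) {
        return (int) $m[1];
    }
    return null;
}

// REQUEST_URI is not set on the command line, hence the fallback value.
$number = dashboard_number($_SERVER['REQUEST_URI'] ?? '/dashboard/2/');
if ($number !== null) {
    echo "Dashboard page {$number}\n";
}
```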
You can make use of parse_url and explode:
$url = 'http://www.example.de/dashboard/2/';
$path = parse_url($url, PHP_URL_PATH); // '/dashboard/2/'
$parts = explode('/', $path); // ['', 'dashboard', '2', '']
$section = $parts[1]; // 'dashboard'
$ending = $parts[2]; // '2'
Demo: https://3v4l.org/dv6Cn
You can also make use of URL rewriting (this is for an Apache-based web server, but you can find similar resources for nginx or any other web server if need be).
A more dynamic way is to explode the URL, use array_filter to remove empty values, and then pick the last item.
explode returns strings, so is_int can't be used; ctype_digit checks that the item consists only of digits instead.
(Comparing $ending*1 with $ending is unreliable: on PHP 7 a non-numeric string passes the loose comparison, and on PHP 8 the multiplication throws a TypeError.)
$url = "http://www.example.de/dashboard/2/";
$parts = array_filter(explode("/", $url));
$ending = end($parts);
if (ctype_digit($ending)) echo $ending; // 2
First you need to target this url to script - in web server config. For nginx and index.php:
try_files $uri @rewrite_location;
location @rewrite_location {
rewrite ^/(.*) /index.php?link=$1&$args last;
}
Second, you need to parse the URI. $end then holds the value you want:
$link_as_array = array_values(array_diff(explode("/", $url), array('')));
$max = count($link_as_array) - 1;
$end = $link_as_array[$max];
I would think this way. If the URL is always the same, or the same format, I'll do the following:
Check for the approx URL.
Split the URL into pieces.
Find if there's the part I am looking for.
Extract the number.
<?php
$url = "http://www.example.de/dashboard/2/";
if (strpos($url, "www.example.de/dashboard") === 7 or strpos($url, "www.example.de/dashboard") === 8) {
$urlParts = explode("/", $url);
if (isset($urlParts[4]) && is_numeric($urlParts[4]))
echo "Yes! It is {$urlParts[4]}.";
}
?>
The strpos with 7 and 8 is for URL with http:// or https://.
The above will give you the output as the numeric part if it is set. I hope this works out.
I need to get domain name from URL excluding "www" and ".com" or ".co.uk" or anything other.
Example-
I have following urls like-
http://www.example.com
http://www.example.co.uk
http://subdomain.example.com
http://subdomain.example.co.uk
The suffix can be anything: ".com", ".org", ".co.in", ".co.uk".
I tried this and it works for me.
$original_url="http://subdomain.example.co.uk"; //try with all urls above
$pieces = parse_url($original_url);
$domain = isset($pieces['host']) ? $pieces['host'] : '';
if (preg_match('/(?P<domain>[a-z0-9][a-z0-9\-]{1,63}\.[a-z\.]{2,6})$/i', $domain, $regs)) {
echo strstr( $regs['domain'], '.', true );
}
Output- example
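The regex above guesses where the suffix starts from its length. An alternative is to compare the last two labels against a list of known two-label suffixes. The sketch below assumes a tiny, hand-written list (TWO_LABEL_SUFFIXES) purely for illustration; it is incomplete by design, and the function name domain_label is hypothetical.

```php
<?php
// Illustrative, deliberately incomplete list of two-label suffixes (an
// assumption for this sketch; real use would need a complete suffix list).
const TWO_LABEL_SUFFIXES = ['co.uk', 'co.in', 'com.br'];

// Hypothetical helper: returns the label directly before the suffix.
function domain_label(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null;
    }
    $labels = explode('.', strtolower($host));
    $lastTwo = implode('.', array_slice($labels, -2));
    // Skip two labels for a known two-label suffix, otherwise one.
    $suffixLength = in_array($lastTwo, TWO_LABEL_SUFFIXES, true) ? 2 : 1;
    $index = count($labels) - $suffixLength - 1;
    return $index >= 0 ? $labels[$index] : null;
}

$urls = ['http://www.example.com', 'http://www.example.co.uk',
         'http://subdomain.example.com', 'http://subdomain.example.co.uk'];
foreach ($urls as $u) {
    echo domain_label($u), "\n"; // prints "example" for each of the four URLs
}
```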
I got this from here:
Get domain name from full URL
(?:https?:\/\/)?(?:www\.)?(.*)\.(?=[\w.]{3,4})
Try this. See the demo. Grab the capture.
http://regex101.com/r/bW3aR1/2
You should use the PHP function parse_url() in combination with a str_replace() or regex, or maybe even an explode. It depends on a few things:
Things to note:
Will there always be a subdomain?
Will there be a specific list of allowed subdomains?
I would do something like this:
<?php
$url = 'http://www.something.com';
$parts = explode('.', parse_url($url, PHP_URL_HOST));
echo $parts[1]; // "something"
How would I translate this PHP statement: $domain = str_ireplace('www.', '', parse_url($url, PHP_URL_HOST)); to a smarty function such as:{$url|str_ireplace:'something':'etc'}
I want to print $domain in this case. $url is a smarty variable that is set for a certain URL. How do I do this?
You can pipe multiple modifiers, to first extract the host and then strip the www.:
{$url|parse_url:$smarty.const.PHP_URL_HOST|replace:'www.':''}
So for:
$url = 'http://www.example.com/foo/bar.html';
It prints:
example.com
I'm stuck trying to get the domain using preg_replace.
I have a list of URLs:
download.adwarebot.com/setup.exe
athena.vistapages.com/suspended.page/
prosearchs.com/se/tds/in.cgi?4&group=5&parameter=mail
freeserials.spb.ru/key/68703.htm
What I want is:
adwarebot.com
vistapages.com
prosearchs.com
spb.ru
Can anybody help me with preg_replace?
I'm using http://gskinner.com/RegExr/ for testing :)
Using preg_replace, if the number of TLDs is limited:
$urls = array( 'download.adwarebot.com/setup.exe',
'athena.vistapages.com/suspended.page/',
'prosearchs.com/se/tds/in.cgi?4&group=5&parameter=mail',
'freeserials.spb.ru/key/68703.htm' );
$domains = preg_replace('~^(?:[^./]+\.)*?([^./]+\.(?:com|ru))/.*$~', '$1', $urls);
This keeps the label directly before .com or .ru together with that TLD, and drops any subdomains in front of it and the path after it.
You could however use PHP's built-in parse_url function to get the host (including the subdomain), then use another regex, substr or array manipulation to get rid of the subdomain:
$host = parse_url('http://download.adwarebot.com/setup.exe', PHP_URL_HOST);
if(count($parts = explode('.', $host)) > 2)
$host = implode('.', array_slice($parts, -2));
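One caveat: the URLs in the question have no scheme, and parse_url does not report a host for such strings because it treats them as a path. A minimal sketch that prepends a scheme first might look like this. The function name last_two_labels is hypothetical, and like the array_slice approach above it simply keeps the last two labels, so it does not handle suffixes such as .co.uk.

```php
<?php
// Hypothetical helper: returns the last two labels of the host, or null.
function last_two_labels(string $url): ?string
{
    // Without a scheme, parse_url() would treat the whole string as a path.
    if (!preg_match('~^[a-z][a-z0-9+.-]*://~i', $url)) {
        $url = 'http://' . $url;
    }
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null;
    }
    // Keep only the last two dot-separated labels.
    return implode('.', array_slice(explode('.', $host), -2));
}

$urls = ['download.adwarebot.com/setup.exe',
         'athena.vistapages.com/suspended.page/',
         'prosearchs.com/se/tds/in.cgi?4&group=5&parameter=mail',
         'freeserials.spb.ru/key/68703.htm'];
print_r(array_map('last_two_labels', $urls));
```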
The following code assumes that every entry starts at the beginning of a line (hence the m modifier):
preg_match_all('#^(\w*\.)?(\w*\.\w*)/#m', $list, $m);
// var_dump($m[2]);
P.S. But the correct answer is still parse_url.
Why use a regular expression for the whole job? Of course it is possible, but using this:
$domains = array();
foreach ($url_list as $url) {
$url_parts = explode('/', $url);
// strip the first label only when at least two more labels follow
$domains[] = preg_replace('~^[^.]+\.(?=[^.]+\.)~', '', $url_parts[0]);
}
$domains = array_unique($domains);
will do just fine.
maybe a more generic solution:
tested by grep, I don't have php environment, sorry:
kent$ echo "download.adwarebot.com/setup.exe
dquote> athena.vistapages.com/suspended.page/
dquote> prosearchs.com/se/tds/in.cgi?4&group=5&parameter=mail
dquote> freeserials.spb.ru/key/68703.htm"|grep -Po '(?<!/)([^\./]+\.[^\./]+)(?=/.+)'
output:
adwarebot.com
vistapages.com
prosearchs.com
spb.ru
I have a variable, such as this:
$domain = "http://test.com";
I need to use preg_replace or str_replace to get the variable like this:
$domain = "test.com";
I have tried using the following, but they do not work.
1) $domain = preg_replace('; ((ftp|https?)://|www3?\.).+? ;', ' ', $domain);
2) $domain = preg_replace(';\b((ftp|https?)://|www3?\.).+?\b;', ' ', $domain);
Any suggestions?
Or you can use parse_url:
$domain = parse_url($domain, PHP_URL_HOST);
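// Note: ltrim() takes a list of characters, not a prefix, so ltrim($domain, "http://") would also strip the leading "t" of "test.com".
$domain = preg_replace('~^http://~', '', $domain);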
Did you try the str_replace?
$domain = "http://test.com";
$domain = str_replace('http://','',$domain);
Your regular expressions probably don't find a match for the pattern.
$domain = preg_replace('~^(https://|http://|ftp://)~', '', $domain);
preg_match('~^[a-z]+://(.+)$~', $domain, $matches);
echo($matches[1]);
This should be what you are looking for: it gives you everything after the protocol, so http://domain.com/test becomes "domain.com/test". It accepts any protocol, though; if you only want to support specific protocols such as HTTP and FTP, use this instead:
preg_match('~^(?:http|ftp)://(.+)$~', $domain, $matches);
If you only want the domain though, or similar parts of the URI, I'd recommend PHP's parse_url() instead. It does all the hard work for you and does it the proper way. Depending on your needs, I would probably recommend you use it anyway and just put it all back together instead.
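As a rough illustration of "putting it all back together", the sketch below parses the URL and reassembles everything except the scheme. The function name strip_scheme is hypothetical, and the sketch deliberately drops any user:password part, which is an assumption about what is wanted here.

```php
<?php
// Hypothetical helper: rebuilds a URL from its parse_url() parts, minus the scheme.
function strip_scheme(string $url): string
{
    $p = parse_url($url);
    if ($p === false || !isset($p['host'])) {
        return $url; // nothing recognisable to strip
    }
    $out = $p['host'];
    if (isset($p['port'])) {
        $out .= ':' . $p['port'];
    }
    $out .= $p['path'] ?? '';
    if (isset($p['query'])) {
        $out .= '?' . $p['query'];
    }
    if (isset($p['fragment'])) {
        $out .= '#' . $p['fragment'];
    }
    return $out;
}

echo strip_scheme('http://test.com'), "\n";        // test.com
echo strip_scheme('http://domain.com/test'), "\n"; // domain.com/test
```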
simple regex:
preg_replace('~^(?:f|ht)tps?://~i','', 'https://www.site.com.br');