Email validation with edu domains only

Email validation with edu domains only - php

I have been trying to accept only email addresses whose domain ends with .edu, using the code below:
$email = $_REQUEST['email'];
$school = substr($email, strpos($email, "@") + 1);
Is there a way to do this?

You just need to take a substring containing the last 3 characters of the string.
<?php
$tld = substr($email, -3); // last three characters of the string
if ($tld === "edu") { // compare with ===; a single = would assign and always be true
// do stuff
}
?>

This should work to get the domain name and the domain extension:
$email = 'test@website.edu';
$getDomain = explode('@', $email);
$explValue = explode('.', $getDomain[1], 2);
print_r($explValue);
The output is:
Array ( [0] => website [1] => edu )
After that you can check with:
if($explValue[1] == 'edu'){
//your code here
}

If .edu is the last part of the email address, you could use strlen and substr:
$email = "test@test.edu";
$end = ".edu";
$string_end = substr($email, strlen($email) - strlen($end));
if ($end === $string_end) {
// Ok
}
Another option is to use explode to split on @, then use explode again to split on a dot and check whether the returned array contains edu:
$strings = [
"test@test.edu",
"test@test.edu.pl",
"test@test.com"
];
foreach ($strings as $string) {
if (in_array("edu", explode(".", explode("@", $string)[1]))) {
// Etc..
}
}

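strpos($email, ".edu.") should work for addresses where .edu is followed by a country code, for example gensek@metu.edu.tr. Compare the result with !== false, because strpos returns the match position (or false), not true.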

You can use substr to get the last 4 characters. If they are ".edu", the email is valid for your requirement; otherwise it is not.
$string = "xyzasd.edu";
echo $txt = substr($string,-4);
if($txt == ".edu"){
//Valid
}else{
//Not Valid
}
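
The suffix checks above accept any string that happens to end in the right characters. A stricter variant could first validate the address and then look only at the last label of the domain. This is a minimal sketch, not a definitive implementation: is_edu_email is a hypothetical name, addresses are written with a literal @, and it deliberately rejects country-specific forms such as .edu.tr.

```php
<?php
// Hypothetical helper: true when the address is syntactically valid and the
// last label of its domain is exactly "edu".
function is_edu_email(string $email): bool
{
    if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
        return false;
    }
    // The domain is everything after the last "@".
    $domain = strtolower(substr($email, strrpos($email, '@') + 1));
    $labels = explode('.', $domain);
    return count($labels) >= 2 && end($labels) === 'edu';
}

var_dump(is_edu_email('student@university.edu')); // bool(true)
var_dump(is_edu_email('student@university.com')); // bool(false)
var_dump(is_edu_email('gensek@metu.edu.tr'));     // bool(false)
var_dump(is_edu_email('not-an-email.edu'));       // bool(false)
```

If addresses like gensek@metu.edu.tr should pass as well, the final comparison could be swapped for in_array('edu', $labels, true).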

Related

PHP: Check if string is part of an array

I'm working on my little ticketing-system based on PHP.
Now I would like to exclude senders from being processed.
This is a possible list of excluded senders:
Array (
"badboy@example.com",
"example.org",
"spam@spamming.org"
)
Now I would like to check whether the sender of a mail matches one of these:
$sender = "badboy@example.com";
I think this case is quite easy; I could solve it with in_array().
But what about
$sender = "me@example.org";
example.org is defined in the array, but me@example.org is not. Still, me@example.org should also be excluded, because example.org is in the forbidden-senders list.
How could I solve this?

Maybe you are looking for the stripos function.
<?php
if (!disallowedEmail($sender)) { // proceed only if the sender is not disallowed
// Do your stuff
}
function disallowedEmail($email) {
$disallowedEmails = array (
"badboy@example.com",
"example.org",
"spam@spamming.org"
);
foreach($disallowedEmails as $disallowed){
if ( stripos($email, $disallowed) !== false)
return true;
}
return false;
}

Another short alternative with stripos, implode and explode functions:
$excluded = array(
"badboy@example.com",
"example.org",
"spam@spamming.org"
);
$str = implode(",", $excluded); // joining the excluded entries into one string
$sender = "www@example.com";
//$sender = "me@example.org";
$domainPart = explode("@",$sender)[1]; // extracting domain part from a sender email
$isAllowed = stripos($str, $sender) === false && stripos($str, $domainPart) === false;
var_dump($isAllowed); // output: bool(false)
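
Both stripos approaches match substrings, so an entry like example.org would also block me@notexample.org, and the second one blocks www@example.com only because example.com appears inside badboy@example.com. If exact matching is wanted, one possible sketch (is_excluded_sender is a hypothetical name, and addresses use a literal @) treats entries containing @ as full addresses and all other entries as domains:

```php
<?php
// Hypothetical helper: an entry containing "@" must equal the whole address;
// an entry without "@" is treated as a domain and must equal the sender's domain.
function is_excluded_sender(string $sender, array $excluded): bool
{
    $sender = strtolower(trim($sender));
    $at = strrpos($sender, '@');
    $domain = $at === false ? '' : substr($sender, $at + 1);

    foreach ($excluded as $entry) {
        $entry = strtolower($entry);
        if (strpos($entry, '@') !== false) {
            if ($entry === $sender) {
                return true;
            }
        } elseif ($domain !== '' && $entry === $domain) {
            return true;
        }
    }
    return false;
}

$excluded = array('badboy@example.com', 'example.org', 'spam@spamming.org');
var_dump(is_excluded_sender('badboy@example.com', $excluded)); // bool(true)
var_dump(is_excluded_sender('me@example.org', $excluded));     // bool(true)
var_dump(is_excluded_sender('www@example.com', $excluded));    // bool(false)
var_dump(is_excluded_sender('me@notexample.org', $excluded));  // bool(false)
```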

mask mail with Alternative words using php

Below is a dummy email ID:
abcdefghij@gmail.com
How can I partially mask this email ID using PHP?
The output I need is:
a*c*e*g*i*@gmail.com
I have tried the code below, but it does not produce this output:
$prop=3;
$domain = substr(strrchr($Member_Email, "@"), 1);
$mailname=str_replace($domain,'',$Member_Email);
$name_l=strlen($mailname);
$domain_l=strlen($domain);
for($i=0;$i<=$name_l/$prop-1;$i++)
{
$start.='*';
}
for($i=0;$i<=$domain_l/$prop-1;$i++)
{
$end.='*';
}
$MaskMail = substr_replace($mailname, $start,2, $name_l/$prop).substr_replace($domain, $end, 2, $domain_l/$prop);

Try something like this:
$delimeter = '@';
$mail_id = 'abcdefghij@gmail.com';
$domain = substr(strrchr($mail_id, $delimeter), 1);
$user_id = substr($mail_id,0,strpos($mail_id, $delimeter));
$string_array = str_split($user_id);
$partial_id = NULL;
foreach($string_array as $key => $val){
if($key % 2 == 0){
$partial_id .=$val;
}else{
$partial_id .='*' ;
}
}
echo $partial_id.$delimeter.$domain;

Here's a no-loop approach that replaces every second character of an email username with a mask.
It is a custom PHP function built on the native functions explode, preg_replace with the regex /(.)./, and implode (explode is used because split() was removed in PHP 7):
echo email_mask('abcdefghij@gmail.com');
// a*c*e*g*i*@gmail.com
function email_mask($email) {
list($email_username, $email_domain) = explode('@', $email);
$masked_email_username = preg_replace('/(.)./', "$1*", $email_username);
return implode('@', array($masked_email_username, $email_domain));
}
Regex Explanation:
The regular expression starts at the beginning of the string, matches 2 characters and captures the first of those two, replaces the match with the first character followed by an asterisk *. preg_replace repeats this throughout the remaining string until it can no longer match a pair of characters.
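
To make the pairing behaviour concrete, here is a small illustration (the input strings are made up for the example). Because the pattern consumes two characters at a time, a username with an odd number of characters keeps its final character unmasked:

```php
<?php
// Even length: every pair is rewritten.
$even = preg_replace('/(.)./', '$1*', 'abcdefghij');
echo $even, "\n"; // a*c*e*g*i*

// Odd length: the final character has no partner, so it is left as-is.
$odd = preg_replace('/(.)./', '$1*', 'abcdefghi');
echo $odd, "\n"; // a*c*e*g*i
```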

$mail='abcdefghij@gmail.com';
$mail_first=explode('@',$mail);
$arr=str_split($mail_first[0]);
$mask=array();
for($i=0;$i<count($arr);$i++) {
if($i%2!=0) {
$arr[$i]='*';
}
$mask[]=$arr[$i];
}
$mask=join($mask).'@'.$mail_first[1];
echo $mask;
The result is:
a*c*e*g*i*@gmail.com

Does it need to have that many asterisks?
It's so hard to read that way.
I suggest you keep things simple.
Maybe something like this is enough:
https://github.com/fedmich/PHP_Codes/blob/master/mask_email.php
It masks an email to show the first 3 characters and then the last character before the @ sign.
ABCDEFZ@gmail.com becomes
ABC***Z@gmail.com
Here is the full code that is also in that GitHub link:
function mask_email( $email ) {
/*
Author: Fed
Simple way of masking emails
*/
$char_shown = 3;
$mail_parts = explode("@", $email);
$username = $mail_parts[0];
$len = strlen( $username );
if( $len <= $char_shown ){
return implode("@", $mail_parts );
}
//Logic: show asterisks in the middle, but also show the last character before @
$mail_parts[0] = substr( $username, 0 , $char_shown )
. str_repeat("*", $len - $char_shown - 1 )
. substr( $username, -1 ) // last character, for any $char_shown
;
return implode("@", $mail_parts );
}

PHP Get Subdomain But Not Actual Domain

I'm currently using the following to get the subdomain of my site
$subdomain = array_shift(explode(".",$_SERVER['HTTP_HOST']));
When I use this for http://www.website.com it returns "www" which is expected
However when I use this with http://website.com it returns "website" as the subdomain. How can I make absolute sure that if there is no subdomain as in that example, it returns NULL?
Thanks!

Please note that in general you should first apply parse_url to the incoming data and then use its [host] key. As for your question, you can use something like this:
preg_match('/([^\.]+)\.[^\.]+\.[^\.]+$/', 'www.domain.com', $rgMatches);
//2-nd level check:
//preg_match('/([^\.]+)\.[^\.]+\.[^\.]+$/', 'domain.com', $rgMatches);
$sDomain = count($rgMatches)?$rgMatches[1]:null;
But I'm not sure that it's exactly what you need (since a URL can contain a 4th domain level, etc.).

Do this:
function getSubdomain($domain) {
$expl = explode(".", $domain, -2);
$sub = "";
if(count($expl) > 0) {
foreach($expl as $key => $value) {
$sub = $sub.".".$value;
}
$sub = substr($sub, 1);
}
return $sub;
}
$subdomain = getSubdomain($_SERVER['HTTP_HOST']);
Works fine for me. Basically, you need to use explode's limit parameter.
Detail and source: php.net - explode manual
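
Since the question asks for NULL when there is no subdomain, a shorter variant of the same idea might look like this. It is only a sketch: subdomain_or_null is a hypothetical name, and like the function above it assumes a single-label TLD such as .com.

```php
<?php
// Hypothetical variant of getSubdomain(): returns null when the host has no
// label in front of the last two.
function subdomain_or_null(string $host): ?string
{
    // A limit of -2 makes explode() drop the last two pieces.
    $labels = explode('.', $host, -2);
    return count($labels) > 0 ? implode('.', $labels) : null;
}

var_dump(subdomain_or_null('www.website.com')); // string(3) "www"
var_dump(subdomain_or_null('website.com'));     // NULL
var_dump(subdomain_or_null('a.b.website.com')); // string(3) "a.b"
```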

If you have another domain, like a .net or .org, just change the value accordingly:
$site['uri'] = explode(".", str_replace('.com', '', $_SERVER['HTTP_HOST']) );
if( count($site['uri']) > 1 ) {
$site['subdomain'] = $site['uri'][0];
$site['domain'] = $site['uri'][1];
}
else {
$site['subdomain'] = null;
$site['domain'] = $site['uri'][0];
}
//For testing only:
print_r($site);
...or, without stripping the TLD (for more flexibility):
$site['uri'] = explode(".", $_SERVER['HTTP_HOST'] );
if( count($site['uri']) > 2 ) {
$site['subdomain'] = $site['uri'][0];
$site['domain'] = $site['uri'][1];
}
else {
$site['subdomain'] = null;
$site['domain'] = $site['uri'][0];
}
//For testing only:
print_r($site);

split full email addresses into name and email?

There seems to be many acceptable email address formats in the To: and From: raw email headers ...
person@place.com
person <person@place.com>
person
Another Person <person@place.com>
'Another Person' <person@place.com>
"Another Person" <person@place.com>
After not finding any effective PHP functions for splitting out names and addresses, I've written the following code.
You can DEMO IT ON CODEPAD to see the output...
// validate email address
function validate_email( $email ){
return (filter_var($email, FILTER_VALIDATE_EMAIL)) ? true : false;
}
// split email into name / address
function email_split( $str ){
$name = $email = '';
if (substr($str,0,1)=='<') {
// first character = <
$email = str_replace( array('<','>'), '', $str );
} else if (strpos($str,' <') !== false) {
// possibly = name <email>
list($name,$email) = explode(' <',$str);
$email = str_replace('>','',$email);
if (!validate_email($email)) $email = '';
$name = str_replace(array('"',"'"),'',$name);
} else if (validate_email($str)) {
// just the email
$email = $str;
} else {
// unknown
$name = $str;
}
return array( 'name'=>trim($name), 'email'=>trim($email) );
}
// test it
$tests = array(
'person@place.com',
'monarch <themonarch@tgoci.com>',
'blahblah',
"'doc venture' <doc@venture.com>"
);
foreach ($tests as $test){
echo print_r( email_split($test), true );
}
Am I missing anything here? Can anyone recommend a better way?

I have managed to write one regex for your test cases:
person@place.com
person <person@place.com>
person
Another Person <person@place.com>
'Another Person' <person@place.com>
"Another Person" <person@place.com>
Using preg_match with this regex should help.
function email_split( $str ){
$sPattern = "/([\w\s\'\"]+[\s]+)?(<)?(([\w.-]+)@((?:[\w]+\.)+)([a-zA-Z]{2,4}))?(>)?/";
preg_match($sPattern,$str,$aMatch);
if(isset($aMatch[1]))
{
echo $aMatch[1]; // this is the name
}
if(isset($aMatch[3]))
{
echo $aMatch[3]; // this is the email address
}
}
Note: I just noticed that a single "person", i.e. your third test case, is not matched by this regex (because of the whitespace the name group requires). So, on the first line of your email_split function, append a space to the end of your string.
Then it will be on target.
Thanks, hope this helps.
Code I tried:
<?php
// validate email address
function validate_email($email) {
return (filter_var($email, FILTER_VALIDATE_EMAIL)) ? true : false;
}
// split email into name / address
function email_split($str) {
$str .=" ";
$sPattern = '/([\w\s\'\"]+[\s]+)?(<)?(([\w.-]+)@((?:[\w]+\.)+)([a-zA-Z]{2,4}))?(>)?/';
preg_match($sPattern, $str, $aMatch);
//echo "string";
//print_r($aMatch);
$name = (isset($aMatch[1])) ? $aMatch[1] : '';
$email = (isset($aMatch[3])) ? $aMatch[3] : '';
return array('name' => trim($name), 'email' => trim($email));
}
// test it
$tests = array(
'person@place.com',
'monarch <themonarch@tgoci.com>',
'blahblah',
"'doc venture' <doc@venture.com>"
);
foreach ($tests as $test) {
echo "<pre>";
echo print_r(email_split($test), true);
echo "</pre>";
}
Output I got:
Array
(
[name] =>
[email] => person@place.com
)
Array
(
[name] => monarch
[email] => themonarch@tgoci.com
)
Array
(
[name] => blahblah
[email] =>
)
Array
(
[name] => 'doc venture'
[email] => doc@venture.com
)

How about this:
function email_split($str) {
$parts = explode(' ', trim($str));
$email = trim(array_pop($parts), "<> \t\n\r\0\x0B");
$name = trim(implode(' ', $parts), "\"\' \t\n\r\0\x0B");
if ($name == "" && strpos($email, "@") === false) { // only single string - did not contain '@'
$name = $email;
$email = "";
}
return array('name' => $name, 'email' => $email);
}
Looks like this is about twice as fast as the regex solution.
Note: the OP's third test case is not needed for my purposes. But in the interest of answering the OP, I added the if statement to produce the OP's expected results. This could have been done other ways (e.g. checking the last element of $parts for '@').
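
For comparison, the header formats listed in the question can also be covered by anchoring on the angle brackets instead of splitting on spaces. This is a rough sketch rather than a full RFC 5322 parser: split_address is a hypothetical name, addresses use a literal @, and the address inside the brackets is returned as-is without validation.

```php
<?php
// Hypothetical helper covering the formats listed in the question.
function split_address(string $str): array
{
    $str = trim($str);
    $name = '';
    $email = '';
    if (preg_match('/^(.*)<([^<>]*)>$/', $str, $m)) {
        // "name <address>" or "<address>": strip spaces and quotes around the name.
        $name = trim($m[1], " \t\"'");
        $email = trim($m[2]);
    } elseif (filter_var($str, FILTER_VALIDATE_EMAIL) !== false) {
        $email = $str; // bare address
    } else {
        $name = $str;  // no address found
    }
    return array('name' => $name, 'email' => $email);
}

print_r(split_address('"Another Person" <person@place.com>'));
print_r(split_address('blahblah'));
```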

Use preg_match in PHP: http://php.net/manual/en/function.preg-match.php
Or, in my opinion, you can write your own function (say get_email_address) that finds the @ character, then takes the left part from the @ back to the '<' character and the right part from the @ up to the '>' character.
For example, for the string monarch <themonarch@tgoci.com> the left part is themonarch and the right part is tgoci.com, so get_email_address returns themonarch@tgoci.com.
Hopefully it helps. :)

Unfortunately, the regex fails for some kinds of full name:
non-alphanumeric characters (e.g. "Amazon.it")
non-printable characters
emojis
I adjusted the expression this way:
$sPattern = '/([^<]*)?(<)?(([\w.-]+)@((?:[\w]+\.)+)([a-zA-Z]{2,4}))?(>)?/';
Now all characters are correctly recognized and split.
Tested with:
$address = "Test User # `` . !! 🔥 <test@email.com";
After 7 years, I hope this helps :)

Get domain name (not subdomain) in php

I have a URL which can be any of the following formats:
http://example.com
https://example.com
http://example.com/foo
http://example.com/foo/bar
www.example.com
example.com
foo.example.com
www.foo.example.com
foo.bar.example.com
http://foo.bar.example.com/foo/bar
example.net/foo/bar
Essentially, I need to be able to match any normal URL. How can I extract example.com (or .net, whatever the tld happens to be. I need this to work with any TLD.) from all of these via a single regex?

Well you can use parse_url to get the host:
$info = parse_url($url);
$host = $info['host'];
Then you can extract just the domain name and the TLD:
$host_names = explode(".", $host);
$bottom_host_name = $host_names[count($host_names)-2] . "." . $host_names[count($host_names)-1];
Not very elegant, but should work.
If you want an explanation, here it goes:
First we grab the host, the part between the scheme (http://, etc.) and the path, by using parse_url's ability to... well... parse URLs. :)
Then we take the host name, and separate it into an array based on where the periods fall, so test.world.hello.myname would become:
array("test", "world", "hello", "myname");
After that, we take the number of elements in the array (4).
Then, we subtract 2 from it to get the second to last string (the hostname, or example, in your example)
Then, we subtract 1 from it to get the last string (because array keys start at 0), also known as the TLD
Then we combine those two parts with a period, and you have your base host name.
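
Put together, the steps above look like this on one of the URLs from the question (a worked example only; the variable $n is introduced here for brevity):

```php
<?php
$url = 'http://foo.bar.example.com/foo/bar';
$host = parse_url($url, PHP_URL_HOST); // "foo.bar.example.com"
$host_names = explode('.', $host);     // ["foo", "bar", "example", "com"]
$n = count($host_names);               // 4
// Index $n - 2 is the second-to-last label, index $n - 1 is the TLD.
$bottom_host_name = $host_names[$n - 2] . '.' . $host_names[$n - 1];
echo $bottom_host_name, "\n";          // example.com
```

Note that this relies on the URL having a scheme, and it treats the last label as the whole TLD, so a host ending in co.uk would come out as co.uk.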

It is not possible to get the domain name without a TLD list to compare against, as there exist many cases with exactly the same structure and length:
nas.db.de (Subdomain)
bbc.co.uk (Top-Level-Domain)
www.uk.com (Subdomain)
big.uk.com (Second-Level-Domain)
Mozilla's public suffix list should be the best option as it is used by all major browsers:
https://publicsuffix.org/list/public_suffix_list.dat
Feel free to use my function:
function tld_list($cache_dir=null) {
// we use "/tmp" if $cache_dir is not set
$cache_dir = isset($cache_dir) ? $cache_dir : sys_get_temp_dir();
$lock_dir = $cache_dir . '/public_suffix_list_lock/';
$list_dir = $cache_dir . '/public_suffix_list/';
// refresh the list every 30 days
if (file_exists($list_dir) && @filemtime($list_dir) + 2592000 > time()) {
return $list_dir;
}
// use exclusive lock to avoid race conditions
if (!file_exists($lock_dir) && @mkdir($lock_dir)) {
// read from source
$list = @fopen('https://publicsuffix.org/list/public_suffix_list.dat', 'r');
if ($list) {
// the list is older than 30 days so delete everything first
if (file_exists($list_dir)) {
foreach (glob($list_dir . '*') as $filename) {
unlink($filename);
}
rmdir($list_dir);
}
// now set list directory with new timestamp
mkdir($list_dir);
// read line-by-line to avoid high memory usage
while ($line = fgets($list)) {
// skip comments and empty lines
if ($line[0] == '/' || trim($line) === '') {
continue;
}
// remove wildcard
if ($line[0] . $line[1] == '*.') {
$line = substr($line, 2);
}
// remove exclamation mark
if ($line[0] == '!') {
$line = substr($line, 1);
}
// reverse TLD and remove linebreak
$line = implode('.', array_reverse(explode('.', (trim($line)))));
// we split the TLD list to reduce memory usage
touch($list_dir . $line);
}
fclose($list);
}
@rmdir($lock_dir);
}
// repair locks (should never happen)
if (file_exists($lock_dir) && mt_rand(0, 100) == 0 && @filemtime($lock_dir) + 86400 < time()) {
@rmdir($lock_dir);
}
return $list_dir;
}
function get_domain($url=null) {
// obtain location of public suffix list
$tld_dir = tld_list();
// no url = our own host
$url = isset($url) ? $url : $_SERVER['SERVER_NAME'];
// add missing scheme ftp:// http:// ftps:// https://
$url = !isset($url[5]) || ($url[3] != ':' && $url[4] != ':' && $url[5] != ':') ? 'http://' . $url : $url;
// remove "/path/file.html", "/:80", etc.
$url = parse_url($url, PHP_URL_HOST);
// replace absolute domain name by relative (http://www.dns-sd.org/TrailingDotsInDomainNames.html)
$url = trim($url, '.');
// check if TLD exists
$url = explode('.', $url);
$parts = array_reverse($url);
foreach ($parts as $key => $part) {
$tld = implode('.', $parts);
if (file_exists($tld_dir . $tld)) {
return !$key ? '' : implode('.', array_slice($url, $key - 1));
}
// remove last part
array_pop($parts);
}
return '';
}
What makes it special:
it accepts any input: URLs, hostnames, or domains, with or without a scheme
the list is downloaded line by line to avoid high memory usage
it creates a new file per TLD in a cache folder, so get_domain() only needs a file_exists() check and does not have to load a huge database on every request, as TLDExtract does
the list is automatically updated every 30 days
Test:
$urls = array(
'http://www.example.com',// example.com
'http://subdomain.example.com',// example.com
'http://www.example.uk.com',// example.uk.com
'http://www.example.co.uk',// example.co.uk
'http://www.example.com.ac',// example.com.ac
'http://example.com.ac',// example.com.ac
'http://www.example.accident-prevention.aero',// example.accident-prevention.aero
'http://www.example.sub.ar',// sub.ar
'http://www.congresodelalengua3.ar',// congresodelalengua3.ar
'http://congresodelalengua3.ar',// congresodelalengua3.ar
'http://www.example.pvt.k12.ma.us',// example.pvt.k12.ma.us
'http://www.example.lib.wy.us',// example.lib.wy.us
'com',// empty
'.com',// empty
'http://big.uk.com',// big.uk.com
'uk.com',// empty
'www.uk.com',// www.uk.com
'.uk.com',// empty
'stackoverflow.com',// stackoverflow.com
'.foobarfoo',// empty
'',// empty
false,// empty
' ',// empty
1,// empty
'a',// empty
);
Recent version with explanations (German):
http://www.programmierer-forum.de/domainnamen-ermitteln-t244185.htm
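
The core of the lookup is a longest-suffix match: try the whole host as a suffix, then drop labels from the left until a known suffix is found, and return that suffix plus one more label. Here is a minimal in-memory sketch of that idea. The $suffixes array is a tiny hypothetical stand-in for the real public suffix list, registrable_domain is a made-up name, and wildcard and exception rules are ignored.

```php
<?php
// Hypothetical, tiny stand-in for the public suffix list.
$suffixes = array('com' => true, 'uk' => true, 'co.uk' => true, 'uk.com' => true, 'de' => true);

// Returns the registrable domain (public suffix plus one label), or '' if none.
function registrable_domain(string $host, array $suffixes): string
{
    $labels = explode('.', strtolower(trim($host, '.')));
    // Try the longest candidate suffix first: drop labels from the left one at a time.
    for ($i = 0; $i < count($labels); $i++) {
        $candidate = implode('.', array_slice($labels, $i));
        if (isset($suffixes[$candidate])) {
            // $i === 0 means the whole host is itself a suffix.
            return $i === 0 ? '' : implode('.', array_slice($labels, $i - 1));
        }
    }
    return '';
}

echo registrable_domain('www.example.co.uk', $suffixes), "\n"; // example.co.uk
echo registrable_domain('big.uk.com', $suffixes), "\n";        // big.uk.com
echo registrable_domain('nas.db.de', $suffixes), "\n";         // db.de
```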

My solution in https://gist.github.com/pocesar/5366899
and the tests are here http://codepad.viper-7.com/GAh1tP
It works with any TLD, and hideous subdomain patterns (up to 3 subdomains).
There's a test included with many domain names.
Won't paste the function here because of the weird indentation for code in StackOverflow (could have fenced code blocks like github)

echo getDomainOnly("http://example.com/foo/bar");
function getDomainOnly($host){
$host = strtolower(trim($host));
$host = preg_replace('/^www\./', '', str_replace("http://","",str_replace("https://","",$host))); // ltrim($host, "www.") would strip any leading "w" or "." characters
$count = substr_count($host, '.');
if($count === 2){
if(strlen(explode('.', $host)[1]) > 3) $host = explode('.', $host, 2)[1];
} else if($count > 2){
$host = getDomainOnly(explode('.', $host, 2)[1]);
}
$host = explode('/',$host);
return $host[0];
}

I recommend using TLDExtract library for all operations with domain name.

I think the best way to handle this problem is:
$second_level_domains_regex = '/\.asn\.au$|\.com\.au$|\.net\.au$|\.id\.au$|\.org\.au$|\.edu\.au$|\.gov\.au$|\.csiro\.au$|\.act\.au$|\.nsw\.au$|\.nt\.au$|\.qld\.au$|\.sa\.au$|\.tas\.au$|\.vic\.au$|\.wa\.au$|\.co\.at$|\.or\.at$|\.priv\.at$|\.ac\.at$|\.avocat\.fr$|\.aeroport\.fr$|\.veterinaire\.fr$|\.co\.hu$|\.film\.hu$|\.lakas\.hu$|\.ingatlan\.hu$|\.sport\.hu$|\.hotel\.hu$|\.ac\.nz$|\.co\.nz$|\.geek\.nz$|\.gen\.nz$|\.kiwi\.nz$|\.maori\.nz$|\.net\.nz$|\.org\.nz$|\.school\.nz$|\.cri\.nz$|\.govt\.nz$|\.health\.nz$|\.iwi\.nz$|\.mil\.nz$|\.parliament\.nz$|\.ac\.za$|\.gov\.za$|\.law\.za$|\.mil\.za$|\.nom\.za$|\.school\.za$|\.net\.za$|\.co\.uk$|\.org\.uk$|\.me\.uk$|\.ltd\.uk$|\.plc\.uk$|\.net\.uk$|\.sch\.uk$|\.ac\.uk$|\.gov\.uk$|\.mod\.uk$|\.mil\.uk$|\.nhs\.uk$|\.police\.uk$/';
$domain = $_SERVER['HTTP_HOST'];
$domain = explode('.', $domain);
$domain = array_reverse($domain);
if (preg_match($second_level_domains_regex, $_SERVER['HTTP_HOST'])) {
$domain = "$domain[2].$domain[1].$domain[0]";
} else {
$domain = "$domain[1].$domain[0]";
}

$onlyHostName = implode('.', array_slice(explode('.', parse_url($link, PHP_URL_HOST)), -2));
Using https://subdomain.domain.com/some/path as example
parse_url($link, PHP_URL_HOST) returns subdomain.domain.com
explode('.', parse_url($link, PHP_URL_HOST)) then breaks subdomain.domain.com into an array:
array(3) {
[0]=>
string(9) "subdomain"
[1]=>
string(6) "domain"
[2]=>
string(3) "com"
}
array_slice then slices the array so only the last 2 values are in the array (signified by the -2):
array(2) {
[0]=>
string(6) "domain"
[1]=>
string(3) "com"
}
implode then combines those two array values back together, ultimately giving you the result of domain.com
Note: this only works when the domain you expect ends in a single-label TLD, as in something.domain.com or else.something.domain.net.
It will not work for something.domain.co.uk, where you would expect domain.co.uk.

There are two ways to extract the subdomain from a host:
The first method, which is more accurate, is to use a database of TLDs (like public_suffix_list.dat) and match the domain against it. This is a little heavy in some cases. There are some PHP classes for this, like php-domain-parser and TLDExtract.
The second way is not as accurate as the first one, but it is very fast and gives the correct answer in many cases. I wrote this function for it:
function get_domaininfo($url) {
// regex can be replaced with parse_url
preg_match("/^(https|http|ftp):\/\/(.*?)\//", "$url/" , $matches);
$parts = explode(".", $matches[2]);
$tld = array_pop($parts);
$host = array_pop($parts);
if ( strlen($tld) == 2 && strlen($host) <= 3 ) {
$tld = "$host.$tld";
$host = array_pop($parts);
}
return array(
'protocol' => $matches[1],
'subdomain' => implode(".", $parts),
'domain' => "$host.$tld",
'host'=>$host,'tld'=>$tld
);
}
Example:
print_r(get_domaininfo('http://mysubdomain.domain.co.uk/index.php'));
Returns:
Array
(
[protocol] => http
[subdomain] => mysubdomain
[domain] => domain.co.uk
[host] => domain
[tld] => co.uk
)

Here's a function I wrote to grab the domain without subdomain(s), regardless of whether the domain is using a ccTLD or a new style long TLD, etc... There is no lookup or huge array of known TLDs, and there's no regex. It can be a lot shorter using the ternary operator and nesting, but I expanded it for readability.
// Per Wikipedia: "All ASCII ccTLD identifiers are two letters long,
// and all two-letter top-level domains are ccTLDs."
function topDomainFromURL($url) {
$url_parts = parse_url($url);
$domain_parts = explode('.', $url_parts['host']);
if (strlen(end($domain_parts)) == 2 ) {
// ccTLD here, get last three parts
$top_domain_parts = array_slice($domain_parts, -3);
} else {
$top_domain_parts = array_slice($domain_parts, -2);
}
$top_domain = implode('.', $top_domain_parts);
return $top_domain;
}

function getDomain($url){
$pieces = parse_url($url);
$domain = isset($pieces['host']) ? $pieces['host'] : '';
if(preg_match('/(?P<domain>[a-z0-9][a-z0-9\-]{1,63}\.[a-z\.]{2,6})$/i', $domain, $regs)){
return $regs['domain'];
}
return FALSE;
}
echo getDomain("http://example.com"); // outputs 'example.com'
echo getDomain("http://www.example.com"); // outputs 'example.com'
echo getDomain("http://mail.example.co.uk"); // outputs 'example.co.uk'

I had problems with the solution provided by pocesar.
When I used, for instance, subdomain.domain.nl, it would not return domain.nl. Instead it would return subdomain.domain.nl.
Another problem was that domain.com.br would return com.br.
I am not sure, but I fixed these issues with the following code (I hope it will help someone; if so, I am a happy man):
function get_domain($domain, $debug = false){
$original = $domain = strtolower($domain);
if (filter_var($domain, FILTER_VALIDATE_IP)) {
return $domain;
}
$debug ? print('<strong style="color:green">»</strong> Parsing: '.$original) : false;
$arr = array_slice(array_filter(explode('.', $domain, 4), function($value){
return $value !== 'www';
}), 0); //rebuild array indexes
if (count($arr) > 2){
$count = count($arr);
$_sub = explode('.', $count === 4 ? $arr[3] : $arr[2]);
$debug ? print(" (parts count: {$count})") : false;
if (count($_sub) === 2){ // two level TLD
$removed = array_shift($arr);
if ($count === 4){ // got a subdomain acting as a domain
$removed = array_shift($arr);
}
$debug ? print("<br>\n" . '[*] Two level TLD: <strong>' . join('.', $_sub) . '</strong> ') : false;
}elseif (count($_sub) === 1){ // one level TLD
$removed = array_shift($arr); //remove the subdomain
if (strlen($arr[0]) === 2 && $count === 3){ // TLD domain must be 2 letters
array_unshift($arr, $removed);
}elseif(strlen($arr[0]) === 3 && $count === 3){
array_unshift($arr, $removed);
}else{
// non country TLD according to IANA
$tlds = array(
'aero',
'arpa',
'asia',
'biz',
'cat',
'com',
'coop',
'edu',
'gov',
'info',
'jobs',
'mil',
'mobi',
'museum',
'name',
'net',
'org',
'post',
'pro',
'tel',
'travel',
'xxx',
);
if (count($arr) > 2 && in_array($_sub[0], $tlds) !== false){ //special TLD don't have a country
array_shift($arr);
}
}
$debug ? print("<br>\n" .'[*] One level TLD: <strong>'.join('.', $_sub).'</strong> ') : false;
}else{ // more than 3 levels, something is wrong
for ($i = count($_sub); $i > 1; $i--){
$removed = array_shift($arr);
}
$debug ? print("<br>\n" . '[*] Three level TLD: <strong>' . join('.', $_sub) . '</strong> ') : false;
}
}elseif (count($arr) === 2){
$arr0 = array_shift($arr);
if (strpos(join('.', $arr), '.') === false && in_array($arr[0], array('localhost','test','invalid')) === false){ // not a reserved domain
$debug ? print("<br>\n" .'Seems invalid domain: <strong>'.join('.', $arr).'</strong> re-adding: <strong>'.$arr0.'</strong> ') : false;
// seems invalid domain, restore it
array_unshift($arr, $arr0);
}
}
$debug ? print("<br>\n".'<strong style="color:gray">«</strong> Done parsing: <span style="color:red">' . $original . '</span> as <span style="color:blue">'. join('.', $arr) ."</span><br>\n") : false;
return join('.', $arr);
}

Here's one that works for all domains, including those with second level domains like "co.uk"
function strip_subdomains($url){
# credits to gavingmiller for maintaining this list
$second_level_domains = file_get_contents("https://raw.githubusercontent.com/gavingmiller/second-level-domains/master/SLDs.csv");
# presume sld first ...
$possible_sld = implode('.', array_slice(explode('.', $url), -2));
# and then verify it
if (strpos($second_level_domains, $possible_sld) !== false){ // strpos returns 0 (falsy) for a match at the very start
return implode('.', array_slice(explode('.', $url), -3));
} else {
return implode('.', array_slice(explode('.', $url), -2));
}
}
Looks like there's a duplicate question here: delete-subdomain-from-url-string-if-subdomain-is-found

Very late, but I see that you tagged the question with regex. My function works like a charm; so far I haven't found a URL that fails:
function get_domain_regex($url){
$pieces = parse_url($url);
$domain = isset($pieces['host']) ? $pieces['host'] : '';
if (preg_match('/(?P<domain>[a-z0-9][a-z0-9\-]{1,63}\.[a-z\.]{2,6})$/i', $domain, $regs)) {
return $regs['domain'];
}else{
return false;
}
}
If you want one without regex, I have this one, which I am sure I also took from this post:
function get_domain($url){
$parseUrl = parse_url($url);
$host = $parseUrl['host'];
$host_array = explode(".", $host);
$domain = $host_array[count($host_array)-2] . "." . $host_array[count($host_array)-1];
return $domain;
}
They both work well, BUT it took me a while to realize that if the URL doesn't start with http:// or https://, they will fail, so make sure the URL string starts with the protocol.
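
One way to guard against the missing-scheme case is to add a default scheme before calling parse_url. A minimal sketch, assuming http:// is an acceptable default (host_from_url is a hypothetical helper name):

```php
<?php
// Hypothetical helper: returns the host of a URL, tolerating a missing scheme.
function host_from_url(string $url): ?string
{
    $url = trim($url);
    // Without a scheme, parse_url() treats "www.example.com" as a path, not a host.
    if (!preg_match('~^[a-z][a-z0-9+.-]*://~i', $url)) {
        $url = 'http://' . $url;
    }
    $host = parse_url($url, PHP_URL_HOST);
    return is_string($host) && $host !== '' ? strtolower($host) : null;
}

var_dump(host_from_url('https://example.com/foo')); // string(11) "example.com"
var_dump(host_from_url('www.example.com'));         // string(15) "www.example.com"
var_dump(host_from_url('example.net/foo/bar'));     // string(11) "example.net"
```

The result can then be passed to either function's host-handling logic in place of $pieces['host'].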

Simply try this:
preg_match('/(www.)?([^.]+\.[^.]+)$/', $yourHost, $matches);
echo "domain name is: {$matches[0]}\n";
This works for the majority of domains.

This function returns the domain name without the extension for any URL given, even if you pass a URL without http:// or https://.
You can extend this part of the pattern
(?:\.co)?(?:\.com)?(?:\.gov)?(?:\.net)?(?:\.org)?(?:\.id)?
with more extensions if you want to handle more second-level domain names.
function get_domain_name($url){
$pieces = parse_url($url);
$domain = isset($pieces['host']) ? $pieces['host'] : $url;
$domain = strtolower($domain);
$domain = preg_replace('/.international$/', '.com', $domain);
if (preg_match('/(?P<domain>[a-z0-9][a-z0-9\-]{1,90}\.[a-z\.]{2,6})$/i', $domain, $regs)) {
if (preg_match('/(.*?)((?:\.co)?(?:\.com)?(?:\.gov)?(?:\.net)?(?:\.org)?(?:\.id)?(?:\.asn)?.[a-z]{2,6})$/i', $regs['domain'], $matches)) {
return $matches[1];
}else return $regs['domain'];
}else{
return $url;
}
}

I'm using this to achieve the same goal and it has always worked for me; I hope it will help others.
$url = 'https://use.fontawesome.com/releases/v5.11.2/css/all.css?ver=2.7.5';
$handle = pathinfo( parse_url( $url )['host'] )['filename'];
$final_handle = substr( $handle , strpos( $handle , '.' ) + 1 );
print_r($final_handle); // fontawesome

Simplest solution
@preg_replace('#\/(.)*#', '', @preg_replace('#^https?://(www.)?#', '', $url))

Simply try this:
<?php
$host = $_SERVER['HTTP_HOST'];
preg_match("/[^\.\/]+\.[^\.\/]+$/", $host, $matches);
echo "domain name is: {$matches[0]}\n";
?>
