I would appreciate any help that can be provided with this matter.
I am creating a registration form. One field is for the user's domain, which I will verify is valid with FILTER_VALIDATE_URL and that it exists with dns_check_record.
However, using these two methods also allows subdomains to be submitted to the form, which I don't want.
Does anyone know a way to allow domains but not subdomains?
I've tested the following function, from http://syntax.cwarn23.net/PHP/Strip_URL_to_Domain:
function domain($domainb)
{
    // Strip the scheme and path, keeping only the host
    $bits = explode('/', $domainb);
    if ($bits[0]=='http:' || $bits[0]=='https:')
    {
        $domainb= $bits[2];
    } else {
        $domainb= $bits[0];
    }
    unset($bits);
    // Keep the last two or three labels, depending on the length of the last one
    $bits = explode('.', $domainb);
    $idz=count($bits);
    $idz-=3;
    if (strlen($bits[($idz+2)])==2) {
        $url=$bits[$idz].'.'.$bits[($idz+1)].'.'.$bits[($idz+2)];
    } else if (strlen($bits[($idz+2)])==0) {
        $url=$bits[($idz)].'.'.$bits[($idz+1)];
    } else {
        $url=$bits[($idz+1)].'.'.$bits[($idz+2)];
    }
    return $url;
}
However, this isn't perfect: a domain such as www.domain.uk.com will come out as uk.com (I know it's not a common domain extension).
Does anyone know a method better than the above function?
As pointed out by Micheal Mior, you have to check for .co.uk, .com.br and many others.
Some browser vendors maintain a list of such non-TLDs that are effectively TLDs: http://publicsuffix.org/. The list is huge.
There is a library here that uses this effective TLD list to implement the function you are looking for (downloads are here). (Found via https://wiki.mozilla.org/Gecko:Effective_TLD_Service.)
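A minimal sketch of the lookup-table idea, assuming a tiny hand-picked subset of suffixes rather than the real list (which is far larger and also contains wildcard and exception rules that this sketch ignores). The names `registrable_domain`, `is_bare_domain` and `$suffixes` are made up for illustration and are not part of any library:

```php
<?php
// Illustrative only: a tiny, hand-picked subset of public suffixes.
// A real implementation would load the full list from publicsuffix.org.
$suffixes = ['com', 'uk', 'co.uk', 'org.uk', 'uk.com', 'br', 'com.br'];

/**
 * Return the registrable domain (public suffix plus one label),
 * or null if the host is itself a suffix or matches no known suffix.
 */
function registrable_domain(string $host, array $suffixes): ?string
{
    $host   = strtolower(trim($host, '.'));
    $known  = array_flip($suffixes);
    $labels = explode('.', $host);

    // A bare suffix such as "co.uk" has no registrable domain.
    if (isset($known[$host])) {
        return null;
    }

    // Try the longest candidate suffix first, so "co.uk" wins over "uk".
    for ($i = 1; $i < count($labels); $i++) {
        $candidate = implode('.', array_slice($labels, $i));
        if (isset($known[$candidate])) {
            return $labels[$i - 1] . '.' . $candidate;
        }
    }
    return null;
}

// A host has no subdomain when it equals its own registrable domain.
function is_bare_domain(string $host, array $suffixes): bool
{
    return registrable_domain($host, $suffixes) === strtolower(trim($host, '.'));
}
```

With this subset, `www.domain.uk.com` reduces to `domain.uk.com`, and the registration form could reject any host for which `is_bare_domain()` is false.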
Combine them.
dns_check_record will fail on '.co.uk', so you can split your string on the dots, check the domain you get when you combine the last two parts, and if that fails, use a third part too, if any.
You will do a double check for invalid domains, but I assume that won't be an issue.
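The idea of trying the last two labels first and then adding one more at a time could be sketched roughly as follows. The function name `shortest_resolving_domain` is hypothetical, and the DNS check is passed in as a callback (an assumption made here so the logic can be tried without network access); in real use the callback would wrap something like `checkdnsrr($domain, 'A')`:

```php
<?php
/**
 * Return the shortest tail of $host (at least two labels) for which
 * $resolves() reports success, or null if none resolves.
 */
function shortest_resolving_domain(string $host, callable $resolves): ?string
{
    $labels = explode('.', $host);
    for ($n = 2; $n <= count($labels); $n++) {
        // Take the last $n labels: "co.uk", then "google.co.uk", ...
        $candidate = implode('.', array_slice($labels, -$n));
        if ($resolves($candidate)) {
            return $candidate;
        }
    }
    return null;
}

// Real-world resolver (performs network lookups):
// $dns = function ($domain) { return checkdnsrr($domain, 'A'); };
```

If the returned value differs from the submitted host, the user entered a subdomain.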
First, you could use parse_url() to get only the host name: http://www.stackoverflow.com -> $url['host'] = 'www.stackoverflow.com'
Second, you could count the number of dots in the host name: explode() --> count(), or substr_count().
If the host has more than one dot, it may contain a subdomain.
Then you could use the solution mentioned by GolezTrol or arnaud576875.
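As a rough sketch of the first two steps (the helper name `host_dot_count` is invented for this example):

```php
<?php
// Extract the host from a URL and count its dots as a first, cheap filter.
function host_dot_count(string $url): ?int
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null; // no host found, e.g. because the scheme is missing
    }
    return substr_count($host, '.');
}
```

A count above 1 only signals that a subdomain might be present: google.co.uk also has two dots, so the result still needs one of the other checks.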
Related
Recently a question has been asked, how to get the Domain of any URL available as a String.
Unfortunately the question has been closed, and the answers linked so far only pointed to solutions using regex (which fails for special cases like .co.uk) and static solutions that hard-code those exceptions (which, of course, might change over time).
So I was searching for a generic solution to this question that will work at any time, and I found one. (At least a couple of tests are positive.)
If you find a domain for which the attempted solution does not work, feel free to mention it, and I'll try to improve the snippet to cover that case as well.
To find the domain of any string given, a three-step solution seems to work best:
First, get the actual Hostname, using parse_url (http://php.net/manual/en/function.parse-url.php)
Second, query any DNS-Server for the "Top-Most" A-Record available. (I used checkdnsrr for this purpose: http://php.net/manual/en/function.checkdnsrr.php)
Last but not least: Perform some validations to make sure you are not running into some "default response".
I performed only some tests and it seems like the result is as expected. The method directly generates the output, but can be modified to return the domain name instead of generating output:
<?php
getDomain("http://www.stackoverflow.com");
getDomain("http://www.google.co.uk");
getDomain("http://books.google.co.uk");
getDomain("http://a.b.c.google.co.uk");
getDomain("http://www.nominet.org.uk/intelligence/statistics/registration/");
getDomain("http://invalid.fail.pooo");
getDomain("http://AnotherOneThatShouldFail.com");
function getDomain($url){
echo "Searching Domain for '".$url."': ";
//Step 1: Get the actual hostname
$url = parse_url($url);
$actualHostname = $url["host"];
//step 2: Top-Down approach: check DNS Records for the first valid A-record.
//Re-Assemble url step-by-step, i.e. for www.google.co.uk, check:
// - uk
// - co.uk
// - google.co.uk (will match here)
// - www.google.co.uk (will be skipped)
$domainParts = explode(".", $actualHostname);
for ($i= count($domainParts)-1; $i>=0; $i--){
$domain = "";
$currentCountry = null;
for ($j = count($domainParts)-1; $j>=$i; $j--){
$domain = $domainParts[$j] . "." . $domain;
if ($currentCountry == null){
$currentCountry = $domainParts[$j];
}
}
$domain = trim($domain, ".");
$validRecord = checkdnsrr($domain, "A"); //looking for A (address) records
if ($validRecord){
//If the host can be resolved to an ip, it seems valid.
//if hostname is returned, its invalid.
$hostIp = gethostbyname($domain);
$validRecord &= ($hostIp != $domain);
if ($validRecord){
//last check: DNS server might answer with one of ISPs default server ips for invalid domains.
//perform a test on this by querying a domain of the same "country" that is invalid for sure to obtain an
//ip list of ISPs default servers. Then compare with the response of current $domain.
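//gethostbynamel() returns false when nothing resolves, so guard before calling in_array().
$defaultIps = gethostbynamel("iiiiiiiiiiiiiiiiiinvaliddomain." . $currentCountry);
$validRecord &= !(is_array($defaultIps) && in_array($hostIp, $defaultIps));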
}
}
//valid record?
if ($validRecord){
//return $domain;
echo $domain."<br />";
return;
}
}
//return null;
echo " not resolved.<br />";
}
?>
Output of the example above:
Searching Domain for 'http://www.stackoverflow.com': stackoverflow.com
Searching Domain for 'http://www.google.co.uk': google.co.uk
Searching Domain for 'http://books.google.co.uk': google.co.uk
Searching Domain for 'http://a.b.c.google.co.uk': google.co.uk
Searching Domain for 'http://www.nominet.org.uk/intelligence/statistics/registration/': nominet.org.uk
Searching Domain for 'http://invalid.fail.pooo': not resolved.
Searching Domain for 'http://AnotherOneThatShouldFail.com': not resolved.
This is only a very limited set of test cases, but I cannot imagine a case where a domain has no A record.
As a nice side effect, this also validates URLs instead of relying only on theoretically valid formats, as the last two examples show.
best,
dognose
We have a list of URLs in this format (http://www.xyz.gov.ac.in). Not all of them look like this; some of them have normal domains. I am confused about how to get the domain name from a URL with three or more dots. The code we have works fine for two-dotted domain names.
Here is the code we have:
function get_domain($url)
{
$pieces = parse_url($url);
$domain = isset($pieces['host']) ? $pieces['host'] : '';
if (preg_match('/(?P<domain>[a-z0-9][a-z0-9\-]{1,63}\.[a-z\.]{2,6})$/i', $domain, $regs)) {
return $regs['domain'];
}
return false;
}
echo get_domain($url) ;
How can we modify the above code to accommodate for 3 dotted domains as well as the other types?
The echo results should be in this format xyz.gov.ac.in
Basically, you can't. At least not without a lookup table that has all "TLDs".
For example, in my country (The Netherlands) we have .nl and .co.nl. But www.gov.nl is a normal website (I'm trying to illustrate that you can't automatically say that gov. isn't a domain). And www.edu.nl doesn't exist.
Any standard regex that would try to parse them would tell you that the domain is www.gov.nl, while the domain is actually gov.nl. Same for edu.nl.
The only way you can accomplish what you want is by getting a list of all TLDs (and sub-TLDs) and using that to parse them.
I believe that Firefox and Chrome have such a list implemented (for coloring the domain name in the URL) and constantly keep it up-to-date. Maybe look in those sources?
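The list those browsers use is published as a plain text file (the Public Suffix List), with one rule per line and `//` comment lines. A rough sketch of turning such text into a lookup table, assuming only plain rules matter and skipping the wildcard (`*.`) and exception (`!`) rules the real list also contains; the function name `parse_suffix_rules` and the sample text in the usage note are made up for illustration:

```php
<?php
// Parse Public Suffix List style text into a lookup set of plain rules.
// Wildcard ("*.") and exception ("!") rules are skipped in this sketch.
function parse_suffix_rules(string $text): array
{
    $rules = [];
    foreach (preg_split('/\R/', $text) as $line) {
        $line = trim($line);
        // Ignore blank lines and comment lines.
        if ($line === '' || strpos($line, '//') === 0) {
            continue;
        }
        // A rule ends at the first whitespace character.
        $rule = strtolower(preg_split('/\s+/', $line)[0]);
        if ($rule[0] === '*' || $rule[0] === '!') {
            continue;
        }
        $rules[$rule] = true;
    }
    return $rules;
}
```

For a sample such as `"// comment\nnl\nco.nl\n*.ck\n"`, this yields a set containing `nl` and `co.nl`, which can then be used to decide where the suffix ends and the registered name begins.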
Try this:
/(^[\w|-]+\.)(?P<domain>([\w|-]+\.)+(\w+))/i
Hope this will help..
You should be able to use this Regex instead
/(?P<domain>([a-z0-9][a-z0-9\-]{1,63}\.)+[a-z\.]{2,6})$/i
I have a domain search function. The search box accepts any kind of domain name. What I am looking for is a way to filter subdomains out of the search, or else trim the subdomain and keep only the main domain.
For example, if a user enters mail.yahoo.com, it should be converted to yahoo.com, or it can be omitted from the search.
Here's a more concise way to grab the domain and a likely subdomain from a URL.
function find_subdomain($url) {
$parts = parse_url($url);
$domain_parts = explode('.', $parts['host']);
while(count($domain_parts) > 4)
array_shift($domain_parts);
return join('.', $domain_parts);
}
Keep in mind that not everything that looks like a subdomain is really a subdomain. Some countries have their own country-specific domains that everyone uses, like .co.uk and .com.au. You can not rely on the number of dots in the URL to tell you what is and is not a subdomain. In fact, you might need the opposite approach - first remove the top-level domain, then see what's left. Unfortunately then you're left with the second-level domain problem.
Can you tell us more about what exactly you are trying to accomplish? Why are you trying to detect subdomains? You mentioned a search box. What is being searched?
Edit: I have updated the function to return up to four of the right-most parts of the domain. Given "http://one.two.three.four.five.six.com" it will return 'four.five.six.com'
I customized a utility function that I'm using. It's close to perfect (or as close as you can get without hard-coding the full list of domain extensions).
Here's the catch: it assumes that the main domain contains at least 4 characters, i.e. for sub.mail.com it returns mail.com, but for sub.aol.com it returns sub.aol.com.
function get_main_domain($host='') {
if(empty($host))$host=$_SERVER['HTTP_HOST'];
$domain_parts = explode('.',$host);
$count=count($domain_parts);
if($count<=2)return $host;
$permit=0;
for($i=$count-1;$i>=0;$i--){
$permit++;
if(strlen($domain_parts[$i])>3)break;
}
while(count($domain_parts) >$permit)array_shift($domain_parts);
return join('.', $domain_parts);
}
Well, that doesn't work for every domain if you forget to include it in the array...
Here is my solution...but I need to compress it to few lines...is it possible??
function subdomain($domainb){$bits = explode('/', $domainb);
if ($bits[0]=='http:' || $bits[0]=='https:'){
$domainb= $bits[2];
} else {$domainb= $bits[0];}
unset($bits);
$bits = explode('.', $domainb); $idz=0;
while (isset($bits[$idz])){$idz+=1;}
$idz-=4; $idy=0;
while ($idy<$idz){ unset($bits[$idy]);
$idy+=1;} $part=array();
foreach ($bits AS $bit){$part[]=$bit;}
unset($bit); unset($bits); $domainb = ''; // reset to an empty string so the .= below does not hit an undefined variable
if (strlen($part[1])>4){ unset($part[0]);}
foreach($part AS $bit){$domainb.=$bit.'.';}
unset($bit);
return preg_replace('/(.*)\./','$1',$domainb);}
Greetings,
I already have a working connection to the AD and can search and retrieve information from it. I've even developed a recursive method by which one can retrieve all groups for a given user. However, I'd like to avoid the recursion if possible. One way to do this is to get the tokenGroups attribute from the AD for the user, which should be a list of the SIDs for the groups that the specified user has membership, whether that membership be direct or indirect.
When I run a search for a user's AD information, though, the tokenGroups attribute isn't even in it. I tried specifically requesting that information (i.e., specifying it using the fourth parameter to ldap_search) but that didn't work, either.
Thanks,
David Kees
Solved my own problem and thought I'd put the answer here so that others might find it. The issue was using the ldap_search() function. The answer was to use the ldap_read() function instead of ldap_search(). The difference is the scope of the request. The search function uses a scope of "sub" (i.e., subtree) while the read function uses "base." The tokenGroups information can only be found when using a scope of "base" so using the correct PHP function was the key.
As I mentioned above, I was working from someone else's code in Perl to create my solution, and the Perl script used a function named "search" to do its LDAP requests, which led me down the wrong path.
Thanks to those who took a peek at the question!
--
As per the requests in the comments, here are the basics of the solution in code. I'm extracting from an object that I use, so this might not be 100% but it'll be close. Also, variables not declared in this snippet (e.g. $server, $user, $password) are for you to figure out; I won't know your AD credentials anyway!
$ldap = ldap_connect($server);
ldap_bind($ldap, $user, $password);
$tokengroups = ldap_read($ldap, $dn, "CN=*", array("tokengroups"));
$tokengroups = ldap_get_entries($ldap, $tokengroups);
At this point, $tokengroups holds our results as an array. It should have a count index as well as some other information. To extract the actual groups, you'll need to do something like this:
$groups = array();
if($tokengroups["count"] > 0) {
$groups = $tokengroups[0]["tokengroups"];
unset($groups["count"]);
// if you want the SID's for your groups, you can stop here.
// if you want to decode the SID's then you can do something like this.
// the sid_decode() here: http://www.php.net/manual/en/function.unpack.php#72591
foreach($groups as $i => &$sid) {
$sid = sid_decode($sid);
$sid_dn = ldap_read($ldap, "<SID=$sid>", "CN=*", array("dn"));
if($sid_dn !== false) {
$group = ldap_get_entries($ldap, $sid_dn);
$group = $group["count"] == 1 ? $group[0]["dn"] : NULL;
$groups[$i] = $group;
}
}
}
That's the basics. There's one caveat: you'll probably need to work with the individual or individuals who manage AD accounts at your organization. The first time I tried to get this running (a few years ago, so my memory is somewhat fuzzy) the account that I was given did not have the appropriate authorization to access the token groups information. I'm sure there are other ways to do this, but because I was porting someone else's code for this specific solution, this was how I did it.
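The linked sid_decode() is not reproduced here. As a rough sketch of what such a decoder does, assuming the standard binary SID layout (one revision byte, one sub-authority count byte, a 6-byte big-endian identifier authority, then that many 4-byte little-endian sub-authorities) and 64-bit PHP; the name `sid_to_string` is made up for this example:

```php
<?php
// Convert a binary SID (as returned in tokenGroups) to its "S-1-5-..." string form.
function sid_to_string(string $binary): string
{
    // Byte 0: revision, byte 1: number of sub-authorities.
    $header = unpack('Crevision/Ccount', $binary);

    // Bytes 2..7: fold the six big-endian authority bytes into one integer.
    $authority = 0;
    foreach (unpack('C6', substr($binary, 2, 6)) as $byte) {
        $authority = ($authority << 8) | $byte;
    }

    $sid = 'S-' . $header['revision'] . '-' . $authority;
    if ($header['count'] > 0) {
        // 'V' = unsigned 32-bit little-endian integer.
        $subs = unpack('V' . $header['count'], substr($binary, 8));
        $sid .= '-' . implode('-', $subs);
    }
    return $sid;
}
```

The resulting string is the form that can be placed inside a `<SID=...>` base DN as in the loop above.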
I'm using the following snippet to redirect an array of IP addresses. I was wondering how I would go about adding an entire range/block of IP addresses to my disallowed array...
<?php // Let's redirect certain IP addresses to a "Page Not Found"
$disallowed = array("76.105.99.106");
$ip = $_SERVER['REMOTE_ADDR'];
if(in_array($ip, $disallowed)) {
header("Location: http://google.com");
exit;
}
?>
I tried using "76.105.99.*", "76.105.99", "76.105.99.0-76.105.99.255" without any luck.
I need to use PHP rather than mod_rewrite and .htaccess for other reasons.
Here's an example of how you could check a particular network/mask combination:
$network=ip2long("76.105.99.0");
$mask=ip2long("255.255.255.0");
$remote=ip2long($_SERVER['REMOTE_ADDR']);
if (($remote & $mask)==$network)
{
header("Location: http://example.com");
exit;
}
This is better than a string-based match because you can also test masks whose boundary falls inside an octet, e.g. a /20 block of IPs.
Try the substr function:
$ip = '76.105.99.';
if (substr($_SERVER['REMOTE_ADDR'], 0, strlen($ip)) === $ip) {
// deny access
}
You can approach the problem in a different way.
If you want to ban 76.105.99.* you could do:
// === 0 requires the match at the start of the address, so e.g. 176.105.99.1 is not caught
if (strpos($_SERVER['REMOTE_ADDR'], "76.105.99.") === 0)
{
    header ('Location: http://google.com');
    exit;
}
Who exactly are you interested in blocking? You can use PHP or apache to block (or allow) a bunch of specific IP addresses.
If you are interested in blocking people from an entire country for example, then there are tools that give you the IP addresses you need to block. Unfortunately, it's not as simple as just specifying a range.
Check out http://www.blockacountry.com/ which generates a bunch of ip addresses you can stick in your .htaccess to block whole countries.
What you need to do is to have a test to see if a particular address lives inside a particular address range as defined by CIDR
So for instance, you need to be able to say
is 192.168.1.5
inside
192.168.1.0/24
That function is easy to write, assuming you have some basic tools to do CIDR work.
Assuming you are on a 32-bit system, this class may help: http://snipplr.com/view/15557/cidr-class-for-ipv4/
Pay attention to the IPisWithinCIDR function.
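For reference, a self-contained sketch of such a test for IPv4. This is not the linked class; the function name `ip_in_cidr` is invented here, and the code assumes 64-bit PHP, where ip2long() returns non-negative integers:

```php
<?php
// Return true if the IPv4 address $ip lies inside the block $cidr, e.g. "192.168.1.0/24".
function ip_in_cidr(string $ip, string $cidr): bool
{
    $parts = explode('/', $cidr);
    if (count($parts) !== 2 || !ctype_digit($parts[1])) {
        return false; // not in "network/prefix" form
    }
    $bits    = (int) $parts[1];
    $ipLong  = ip2long($ip);
    $netLong = ip2long($parts[0]);
    if ($ipLong === false || $netLong === false || $bits > 32) {
        return false;
    }

    // Build a mask with the top $bits bits set, e.g. /24 -> 0xFFFFFF00.
    $mask = $bits === 0 ? 0 : (~0 << (32 - $bits)) & 0xFFFFFFFF;

    // Two addresses are in the same block when their network bits agree.
    return ($ipLong & $mask) === ($netLong & $mask);
}
```

With this, the check from the example above becomes `ip_in_cidr('192.168.1.5', '192.168.1.0/24')`, and a redirect could be issued when `ip_in_cidr($_SERVER['REMOTE_ADDR'], '76.105.99.0/24')` is true.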
It would be better to do this in apache(or any other server)
I believe that you'll need to create a for loop to add each IP address (within the range) to your array.
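For example, in PHP:
// Append every address from 76.105.99.0 to 76.105.99.255 to the disallowed list
for ($i = 0; $i <= 255; $i++) {
    $disallowed[] = "76.105.99." . $i;
}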
$blocked_ip_range_array = array('109.237.108.0','109.238.0.0');
for($i=0;$i<count($blocked_ip_range_array);$i++){
$network=ip2long($blocked_ip_range_array[$i]);
$blipr = explode(".",$blocked_ip_range_array[$i]);
if($blipr[2]=='0'){
$mask=ip2long("255.255.0.0");
}
else{
$mask=ip2long("255.255.255.0");
}
$remote=ip2long($_SERVER['REMOTE_ADDR']);
if (($remote & $mask)==$network)
{
header("Location: http://xurcun.info");
exit;
}
}
Below is a URL showing something rather similar to what Mr. Dixon and Ameer are discussing:
http://www.blackdog.ie/blog/blocking-ip-ranges-with-php/
Hope this helps.
Respectfully,
Wil