I'm working on an email validation check and we need to decide whether to allow user@localhost and user@example (notice no .anything) to be validated as a valid email address. This is for an open source project that has a number of use cases on both the web at large and intranets.
RFC 2822 (Internet Message Format Standard) allows it but RFC 2821 (SMTP Standard) says it should fail.
Thoughts?
It depends on your application. If you expect several of your users to have an @localhost address and you don't mind accepting it, then go for it.
Make it a configurable option, so people can decide for themselves. I'd default it to failure, personally, as I've yet to run into a case, intranet or public internet, where someone used a valid user@localhost type address.
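A rough sketch of what such an option could look like. The function name and the $allowDotlessDomain parameter are made up for illustration, and the check is deliberately shallow (one @, sane lengths, optional dot requirement), so treat it as a starting point rather than a full validator:

```php
<?php
// Hypothetical helper: syntax-only check with a switch for dotless domains.
function is_acceptable_email(string $email, bool $allowDotlessDomain = false): bool
{
    // Exactly one @, local part of 1-64 characters, domain of 1-255 characters.
    if (!preg_match('/^[^@]{1,64}@[^@]{1,255}$/', $email)) {
        return false;
    }
    $domain = substr($email, strrpos($email, '@') + 1);
    // Default behaviour: reject domains such as "localhost" that contain no dot.
    if (!$allowDotlessDomain && strpos($domain, '.') === false) {
        return false;
    }
    return true;
}
```

Intranet deployments would call is_acceptable_email($email, true); everyone else keeps the stricter default.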
I would disable it. Very few organizations use internal domains, and those that do generally use "acme.localhost" or "intranet.com" or something else of the like. There is some sort of configuration going on in the DNS that they use to make it work.
Regardless, internal email is nearly dead anyway: with the advent of instant messaging, Twitter, and SMS, along with the increasing availability of external email for every member of a company, you will almost certainly never see a TLD-less domain in an email address.
For the folks that do require it, they can always tweak the regex themselves, as they were savvy enough to set up a custom hostname to handle internal email.
Well, if you have DNS working internally, you could always just do a DNS lookup.
But if this is going to fail with SMTP, then I would suggest making sure you don't include it.
I have seen email addresses of the form user@localhost, typically when looking at archives of a mailing list where the administrator hosted and posted from the same machine. So it can definitely occur, and I admit it broke my parsing routine! So now I am a little more flexible about email addresses.
Looking at this, it looks like we need two quick checks, as detailed:
<?php
function valid_email($email) {
    // First, check that there is exactly one @ symbol and that the lengths are right.
    // preg_match() is used because ereg() was removed in PHP 7.
    if (!preg_match('/^[^@]{1,64}@[^@]{1,255}$/', $email)) {
        // Email invalid because wrong number of characters in one section, or wrong number of @ symbols.
        return false;
    }
    // Take the given email address and split it into the username and domain.
    // explode() is used because split() was removed in PHP 7.
    list($userName, $mailDomain) = explode('@', $email);
    // The domain is accepted only if DNS has an MX record for it.
    return checkdnsrr($mailDomain, 'MX');
}
?>
(source 1, source 2)
What is the most accurate way to get user's IP address in 2017 via PHP?
I've read a lot of SO questions and answers about it, but most of the answers are old, and commenters say these approaches are unsafe.
For example, take a look at this question (2011): How to get the client IP address in PHP?
Tim Kennedy's answer contains a recommendation to use something like:
if (!empty($_SERVER['HTTP_CLIENT_IP'])) {
$ip = $_SERVER['HTTP_CLIENT_IP'];
} elseif (!empty($_SERVER['HTTP_X_FORWARDED_FOR'])) {
$ip = $_SERVER['HTTP_X_FORWARDED_FOR'];
} else {
$ip = $_SERVER['REMOTE_ADDR'];
}
But from what I've read, using X_FORWARDED_FOR is unsafe, as the comment below highlights:
Do NOT use the above code unless you know EXACTLY what it does! I've
seen MASSIVE security holes due to this. The client can set the
X-Forwarded-For or the Client-IP header to any arbitrary value it
wants. Unless you have a trusted reverse proxy, you shouldn't use any
of those values.
As I didn't know EXACTLY what it does, I don't want to take the risk. He said it is unsafe, but did not provide a safe method to get user's IP address.
I've tried the simple $_SERVER['REMOTE_ADDR'];, but this returns the wrong IP. I've tested this and my real IP follows this pattern: 78.57.xxx.xxx, but I get an IP address like: 81.7.xxx.xxx
So do you have any ideas?
Short answer:
$ip = $_SERVER['REMOTE_ADDR'];
As of 2021, $_SERVER['REMOTE_ADDR'] is still the only reliable way to get a user's IP address, but it can show erroneous results if the user is behind a proxy server.
All other solutions imply security risks or can be easily faked.
From a security POV, nothing but $_SERVER['REMOTE_ADDR'] is reliable - that's just the simple truth, unfortunately.
All the variables prefixed with HTTP_ are in fact HTTP headers sent by the client, and there's no other way to transfer that information while requests pass through different servers.
But that of course automatically means that clients can spoof those headers.
You can never, ever trust the client.
Unless it is you ... If you control the proxy or load-balancer, it is possible to configure it so that it drops such headers from the original request.
Then, and only then, you could trust an e.g. X-Client-IP header, but really, there's no need to at that point ... your webserver can also be configured to replace REMOTE_ADDR with that value and the entire process becomes transparent to you.
This will always be the case, no matter which year we are in ... for anything related to security - only trust REMOTE_ADDR.
Best case scenario is to read the HTTP_ data for statistical purposes only, and even then - make sure that the input is at least a valid IP address.
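To make the "unless it is you" case concrete, here is a minimal sketch, assuming you control the proxy, know its address, and it appends to X-Forwarded-For in the usual way. client_ip() and $trustedProxies are hypothetical names, not anything PHP provides:

```php
<?php
// Hypothetical helper: honour X-Forwarded-For only when the direct peer is a trusted proxy.
function client_ip(array $server, array $trustedProxies = []): string
{
    $remote = $server['REMOTE_ADDR'] ?? '';
    if (in_array($remote, $trustedProxies, true) && !empty($server['HTTP_X_FORWARDED_FOR'])) {
        $hops = preg_split('/\s*,\s*/', trim($server['HTTP_X_FORWARDED_FOR']));
        // Walk from the right: the right-most entries were appended by our own proxies,
        // anything further left may have been supplied by the client.
        for ($i = count($hops) - 1; $i >= 0; $i--) {
            $hop = $hops[$i];
            if (filter_var($hop, FILTER_VALIDATE_IP) === false) {
                break; // malformed entry: stop and fall back to REMOTE_ADDR
            }
            if (!in_array($hop, $trustedProxies, true)) {
                return $hop; // first address that is not one of our proxies
            }
        }
    }
    // No trusted proxy involved (or nothing usable in the header).
    return $remote;
}
```

With an empty $trustedProxies list this always returns REMOTE_ADDR, which matches the advice above for anything security-related.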
You have to collaborate with your sysops team (or, if you're wearing that hat too, you will need to do some research). The header check is used when your network infrastructure is configured in certain ways where the remote requester is one of your network appliances instead of the end user.
This sort of thing happens when your web server(s) sit behind a load balancer, firewall, or other appliance that needs to interrogate the payload to properly handle it. An example is when a load balancer terminates SSL and forwards the request on to the web server without SSL. When this occurs, the remote address becomes the load balancer's. It also happens with firewall appliances that do the same thing.
In most instances the device will offer configuration to set a header value in the request with the original remote IP address. The header is usually what you'd expect, but in some cases it can be different or even configurable.
What's more, depending on its configuration, your web server (Apache, nginx, or other) may not support, or may not currently be configured to support, certain custom headers such as the common IP header.
The point is that you will need to investigate your network configuration to ensure that the original requester's IP makes it all the way through to your application code, and in what form.
If you'd like to use a pre-built library, you can use Whip.
Using pre-made libraries is usually better because most of them will have been checked thoroughly by an active community. Some of them, especially the ones that have been around for a long time, have more features built in.
But if you want to code it yourself to learn the concept, then it's ok I guess. I recommend packaging it as a stand alone library and releasing it as open-source :)
EDIT: I do not recommend using the remote IP in security mechanisms as they are not always reliable.
First, it is impossible to reliably determine someone's source IP address if they are intent on being hidden. Even something which today seems foolproof, will soon have a loophole (if it doesn't already). As such, any answer below should be considered UNTRUSTED, which means that if you put all of your eggs in this basket, be prepared for someone to take advantage of it or circumvent it somehow.
I won't get into all the ways someone can circumvent IP tracking, because they are constantly evolving. What I will say is that it can be a useful tool for logging as long as you know that IP addresses can easily change or otherwise be masked.
Now, one big point to make is that there is a difference between a public IP address and a private IP address. In IPV4, routers are generally assigned one public IP address, which is all that a server-side language can actually grab, because it doesn't see your client-side IP address. To a server, your computer doesn't exist as a web-space. Instead, your router is all that matters. In turn, your router is the only thing that cares about your computer, and it assigns a private IP address (to which your 172...* address belongs) to make this work. This is good for you, because you can't directly access a computer behind a router.
If you want to access a private IP address, you need to use JavaScript (client-side language). You can then store the data asynchronously via AJAX. As far as I know, this is only currently possible using WebRTC-enabled Chrome and Firefox. See here for a demo.
I tested this and it returns private IP addresses. Typically I think this is used by advertisers to help track individual users in a network, in conjunction with the public IP address. I am certain that it will quickly become useless as people come up with workarounds or as public outcry forces them to offer the ability to disable the WebRTC API. However, for the time being it works for anyone who has JavaScript enabled on Chrome and Firefox.
More Reading:
What is a Private Network?
STUN IP Address requests for WebRTC
Quick Link: IP address checker
Get Client IP Address:
<?php
echo $ip = $_SERVER['REMOTE_ADDR'];
?>
Note:
This only works on a live site. On your local host the address will be one of the loopback addresses, like 127.0.0.1,
so it returns ::1 (the IPv6 loopback address).
Example : https://www.virendrachandak.com/demos/getting-real-client-ip-address-in-php.php
It shows your local IP,
like ... 78.57.xxx.xxx
Example:
<?php
$myIp= getHostByName(php_uname('n'));
echo $myIp;
?>
The real method to check the user's IP is $ip = $_SERVER['REMOTE_ADDR'];
If the user is using a VPN or any proxy, it will not detect the original IP.
How about this one -
public function getClientIP()
{
$remoteKeys = [
'HTTP_X_FORWARDED_FOR',
'HTTP_CLIENT_IP',
'HTTP_X_FORWARDED',
'HTTP_FORWARDED_FOR',
'HTTP_FORWARDED',
'REMOTE_ADDR',
'HTTP_X_CLUSTER_CLIENT_IP',
];
foreach ($remoteKeys as $key) {
if ($address = getenv($key)) {
foreach (explode(',', $address) as $ip) {
$ip = trim($ip); // list entries may be separated by ", ", so strip the space
if ($this->isValidIp($ip)) {
return $ip;
}
}
}
}
return '127.0.0.0';
}
private function isValidIp($ip)
{
if (!filter_var($ip, FILTER_VALIDATE_IP,
FILTER_FLAG_IPV4 | FILTER_FLAG_NO_PRIV_RANGE | FILTER_FLAG_NO_RES_RANGE)
&& !filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6 | FILTER_FLAG_NO_PRIV_RANGE)
) {
return false;
}
return true;
}
I use this code, and it works for me. Take a look at it.
<?php
// Gets client's IP.
$ip = getenv("HTTP_CLIENT_IP")?:
getenv("HTTP_X_FORWARDED_FOR")?:
getenv("HTTP_X_FORWARDED")?:
getenv("HTTP_FORWARDED_FOR")?:
getenv("HTTP_FORWARDED")?:
getenv("REMOTE_ADDR");
echo $ip;
?>
Here, a working example. Hope it helps!
Because of different network setups (proxy servers, private networks, etc.) and how administrators configure their networks, it is difficult to obtain the client IP address. Standards are being addressed related to this issue.
The following function worked in 4 different tests (Home Network, VPN, Remote connection, public internet). The code can be used as base code for your project. Modify as needed.
The function does validate the IP address, but does not validate IP ranges. This would be an additional test after you obtain the client IP.
$_SERVER["REMOTE_ADDR"] does not always return the true client IP address.
Because some of the parameters can be set by end users, security can be an issue.
Set Client IP address
$clientIpAddress = $du->setClientIpAddress($_SERVER);
public function setClientIpAddress($serverVars) {
# Initialization
$searchList = "HTTP_CLIENT_IP,HTTP_X_FORWARDED_FOR,HTTP_X_FORWARDED,HTTP_X_CLUSTER_CLIENT_IP,HTTP_FORWARDED_FOR,HTTP_FORWARDED,REMOTE_ADDR";
$clientIpAddress = "";
# Loop through parameters
$mylist = explode(',', $searchList);
foreach ($mylist as $myItem) {
# Is our list set?
if (isset($serverVars[trim($myItem)])) {
# Loop through IP addresses
$myIpList = explode(',', $serverVars[trim($myItem)]);
foreach ($myIpList as $myIp) {
if (filter_var(trim($myIp), FILTER_VALIDATE_IP)) {
# Set client IP address
$clientIpAddress = trim($myIp);
# Exit loop
break;
}
}
}
# Did we find any IP addresses?
if (trim($clientIpAddress) != "") {
# Exit loop
break;
}
}
# Default (if needed)
if (trim($clientIpAddress) == "") {
# IP address was not found, use "Unknown"
$clientIpAddress = "Unknown";
}
# Exit
return $clientIpAddress;
}
I'm writing an email parser for a site and I'm not sure of best practices. Specifically, I am not sure how to mark emails that I have already parsed, so I don't access them each time I access the mailbox.
PS - I've never done any email parsing.
I'm using the Flourish library (along with CodeIgniter), and so far I am calling cronjobs/parseMail with a cron job:
public function parseMail(){
// Connect to a remote imap server
$mailbox = new fMailbox('imap', 'mysite.com', 'user', 'password');
// Retrieve an overview of all messages
$messages = $mailbox->listMessages();
foreach ( $messages as $message ){
$messageBody = $message['text'];
// parse it
}
}
So once I have "dealt with" an email, should I just delete it? Or is there a better way to ensure that I am not parsing emails I have already done?
BONUS QUESTION > Don't I need to supply a specific email account somewhere? If I have "admin@mysite.com" and "addressForParsing@mysite.com", where do I specify that I am only interested in the latter? Do I just pull the "To:" out of my parsed info, or is there a better way?
Flourish: wow... this is even less helpful than the stock PHP functions. You'll have to store message UIDs externally from IMAP to track if something's been processed or not.
PHP/CodeIgniter: CI doesn't seem to have an IMAP library, so you're using PHP functions. imap_setflag_full() will let you set the \Flagged flag on the message, which you can use to track if the message has been processed.
Custom Socket Code: you can use something like this code to set/get custom IMAP flags, but you'll probably have to read a handful of IMAP RFCs to get everything else working.
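For the "store UIDs externally" route, a minimal sketch might look like the following. Everything here is an assumption for illustration: the helper names are made up, the store is a plain JSON file, and each message overview is assumed to carry a 'uid' key (check what your library actually returns before relying on that):

```php
<?php
// Hypothetical helper: load the list of already-processed UIDs from a JSON file.
function load_processed_uids(string $path): array
{
    if (!is_file($path)) {
        return [];
    }
    $decoded = json_decode((string) file_get_contents($path), true);
    return is_array($decoded) ? $decoded : [];
}

// Hypothetical helper: persist the list, de-duplicated, with an exclusive lock while writing.
function save_processed_uids(string $path, array $uids): void
{
    file_put_contents($path, json_encode(array_values(array_unique($uids))), LOCK_EX);
}

// Keep only the messages whose 'uid' has not been recorded yet.
function unprocessed_messages(array $messages, array $processedUids): array
{
    $seen = array_flip($processedUids);
    return array_filter($messages, function ($message) use ($seen) {
        return !isset($seen[$message['uid']]);
    });
}
```

In the cron job you would load the list, loop over unprocessed_messages(...), parse each one, append its UID, and save the list at the end. A database table would do the same job as the JSON file.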
I run security checks on a number of AJAX calls to see whether the requesting IP matches the one I have on record.
I use the following set of class functions to establish the IP (which can come via load balancers, hence the lengthy methodology).
private function IPMask_Match ($network, $ip) {
$ip_arr = explode('/', $network);
if (count($ip_arr) < 2) {
$ip_arr = array($ip_arr[0], null);
}
$network_long = ip2long($ip_arr[0]);
$x = ip2long($ip_arr[1]);
$mask = long2ip($x) == $ip_arr[1] ? $x : 0xffffffff << (32 - $ip_arr[1]);
$ip_long = ip2long($ip);
return ($ip_long & $mask) == ($network_long & $mask);
}
private function IPCheck_RFC1918 ($IP) {
$PrivateIP = false;
if (!$PrivateIP) {
$PrivateIP = $this->IPMask_Match('127.0.0.0/8', $IP);
}
if (!$PrivateIP) {
$PrivateIP = $this->IPMask_Match('10.0.0.0/8', $IP);
}
if (!$PrivateIP) {
$PrivateIP = $this->IPMask_Match('172.16.0.0/12', $IP);
}
if (!$PrivateIP) {
$PrivateIP = $this->IPMask_Match('192.168.0.0/16', $IP);
}
return $PrivateIP;
}
public function getIP () {
$UsesProxy = (!empty($_SERVER['HTTP_X_FORWARDED_FOR']) || !empty($_SERVER['HTTP_CLIENT_IP'])) ? true : false;
if ($UsesProxy && !empty($_SERVER['HTTP_CLIENT_IP'])) {
$UserIP = $_SERVER['HTTP_CLIENT_IP'];
}
elseif ($UsesProxy && !empty($_SERVER['HTTP_X_FORWARDED_FOR'])) {
$UserIP = $_SERVER['HTTP_X_FORWARDED_FOR'];
if (strstr($UserIP, ',')) {
$UserIPArray = explode(',', $UserIP);
foreach ($UserIPArray as $IPtoCheck) {
if (!$this->IPCheck_RFC1918($IPtoCheck)) {
$UserIP = $IPtoCheck;
break;
}
}
if ($UserIP == $_SERVER['HTTP_X_FORWARDED_FOR']) {
$UserIP = $_SERVER['REMOTE_ADDR'];
}
}
}
else{
$UserIP = $_SERVER['REMOTE_ADDR'];
}
return $UserIP;
}
The Problem is I've been getting problems with users operating via a proxy. Can anyone indicate why that might be? I've used basic free proxy's online to try and emulate, but it doesn't look to be getting variable IPs or anything - so I'm not sure why this would be saying the two IPs don't match.
I am going to explain what a proxy is first so we are both on the same page.
What Is A Proxy
A proxy is usually a single computer that accesses the internet ON BEHALF OF the user and then the proxy sends the results back to the user. The problem appears when there could be hundreds or even thousands of other people also using that one computer - and they all have the SAME IP address but normally the headers indicate that the users are via a proxy.
I am assuming (without looking properly) that your script is getting the IPs and headers mixed up.
Different Solution
There is a better way. Use sessions and save a key in the session ensuring they have previously been to the main site first BEFORE accessing the ajax page. Example:
index.php
session_start();
$_SESSION['ajax_ok'] = true;
ajax/username_check.php
session_start();
if (empty($_SESSION['ajax_ok'])) {
die("You can not access this page...");
}
That enforces that they must visit your main site first, and the client must also support sessions, which most browsers do. That is a plus, as it helps prevent bots etc. from abusing your ajax scripts.
More reliable and easier solution than using that mangle of code you got above ><
What if you can't use sessions?
If you can't use sessions on the specific machine you're on, you could set up another computer, rewrite the session handler (using the callbacks that PHP provides), or store sessions in something other than the file system, such as the database. There must be something your website uses that the sessions could also use. For example, if you have a load balancer, file-based sessions generally won't work anyway, so it's generally a good idea to move the sessions to something else, as mentioned above.
The problem unfortunately almost certainly isn't a proxy - it's almost certainly a stationary public IP router, routing traffic through a subnet.
And subnets can be HUGE (say, at universities).
And even if it was by some fluke a genuine proxy (they are rare these days - 10 year old tech), even if the proxy volunteers to forward, it won't happen to mean anything because it's almost certainly a subnet ip like 192.168.x.x anyway.. This is basically the public IP address (aka switchboard) internal extension.
You could cross your fingers and try php ipv6 Working with IPv6 Addresses in PHP or even be even more clever and try mac address How can I get the MAC and the IP address of a connected client in PHP? but both are doomed to failure. My gut instinct is to try to cheat: I would gamble that the best way is basically using a network share for the session store and allowing the load balanced PHP servers to all access it and do everything via the same dns prefix. Or perhaps set up a 3rd party dns for doing the sessional gathering.
The answer is that unless you track with "cookies" like you're an ad agency, you can't do it.
A friend identified this: basically, some proxies can send X_FORWARDED_FOR as a comma-separated value OR as comma-and-space separated.
To fix:-
after:
foreach ($UserIPArray as $IPtoCheck) {
add this line:
$IPtoCheck = trim($IPtoCheck);
Sorted.
The Problem is I've been getting problems with users operating via a
proxy. Can anyone indicate why that might be? I've used basic free
proxy's online to try and emulate, but it doesn't look to be getting
variable IPs or anything - so I'm not sure why this would be saying
the two IPs don't match.
Your parsing code explodes HTTP_X_FORWARDED_FOR on a comma, but the separator may be "comma space". If that happens, the RFC 1918 check will fail. While some proxies do not add the space, the standard is to use it:
http://en.wikipedia.org/wiki/X-Forwarded-For
The general format of the field is:
X-Forwarded-For: client, proxy1, proxy2
where the value is a comma+space separated list of IP addresses, the
left-most being the original client, and each successive proxy that
passed the request adding the IP address where it received the request
from. In this example, the request passed proxy1, proxy2 and proxy3
(proxy3 appears as remote address of the request).
So you ought to change the explode separator to ", ", or better, use preg_split with ",\s*" as a separator and cover both cases.
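A small sketch of the preg_split variant, so both "a,b" and "a, b" come out as clean entries. The function name is made up for illustration:

```php
<?php
// Hypothetical helper: split an X-Forwarded-For value on commas with optional surrounding whitespace.
function split_forwarded_for(string $header): array
{
    // PREG_SPLIT_NO_EMPTY drops empty pieces, e.g. from a trailing comma.
    $parts = preg_split('/\s*,\s*/', trim($header), -1, PREG_SPLIT_NO_EMPTY);
    return $parts === false ? [] : $parts;
}
```

In the question's getIP(), the result of this call would take the place of explode(',', $UserIP).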
Then your remaining problem is authenticating the page that makes the AJAX call to the AJAX endpoint itself.
If you don't want to use cookie-based sessions, which is the best way, you can try and do this with a nonce. That is, when you generate the page, you issue a unique ID and inject it into the HTML code, where the AJAX code will recover it and pass back to the AJAX servlet. The latter will be able to add it in Access-Control-Request, as detailed here, or just add more data to the request.
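The nonce idea can be sketched roughly as follows. This is only an illustration under assumptions: issue_nonce() and consume_nonce() are made-up names, and the $store array stands in for whatever server-side storage you have (database, cache, etc.), since the point is to avoid cookie-based sessions:

```php
<?php
// Hypothetical helper: create a random single-use token when the page is generated.
function issue_nonce(array &$store): string
{
    $nonce = bin2hex(random_bytes(16)); // 32 hex characters from a CSPRNG
    $store[$nonce] = time();            // remember when it was issued
    return $nonce;
}

// Hypothetical helper: accept a token once, and only within $ttl seconds of issue.
function consume_nonce(array &$store, string $nonce, int $ttl = 300): bool
{
    if (!isset($store[$nonce])) {
        return false; // unknown or already used
    }
    $issuedAt = $store[$nonce];
    unset($store[$nonce]); // single use: remove it whether or not it is still fresh
    return (time() - $issuedAt) <= $ttl;
}
```

The page would embed the value returned by issue_nonce() in its HTML, and the AJAX endpoint would call consume_nonce() on whatever the request sends back.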
I'd like to discuss whether your solution does anything on behalf of security.
$_SERVER['REMOTE_ADDR'] cannot be forged. It is set by the webserver because of the accessing IP address used. Any response goes to this address.
$_SERVER['HTTP_FORWARDED_FOR'] and $_SERVER['HTTP_CLIENT_IP'] can easily be forged, because they are HTTP headers sent to the webserver - neither will you know you are talking to a proxy if it is configured to omit these headers, nor will you know you are NOT talking to a proxy if a client decides to insert these headers.
Filtering based on an IP address that may be forged will not really help you, but this depends heavily on what you want to achieve, which remains unknown until you go into more detail.
If you look around, you will stumble upon the Suhosin patch and extension for PHP, and its feature to encrypt the session and cookie content. The encryption key is built from some static key plus things like the HTTP User Agent or parts of the requesting IP address. Note: that is the IP used to actually make the request, i.e. the proxy if one is used, because this is the only reliable info.
One can argue that using the full IP address is not a very good idea unless you know your users do not use proxy clusters with changing IP addresses for multiple requests, so only a part of the IP would usually be used, or none at all.
The HTTP User Agent though is a nice source of unique information. Yes, it can be forged, but that does not matter at all, because if you only want to allow requests from the same source, it is valid to assume that the user agent string does not change over time, but will be different for a bunch of other users. Hence there are statistics that show you can generate a nearly unique fingerprint of a browser if you just look at all the HTTP headers sent, because every user installs different extensions that change something, like accept-content headers.
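As a rough illustration of that fingerprint idea (not how Suhosin does it; the function name and the particular list of headers are my own assumptions), you could hash a handful of request headers and compare the result between requests:

```php
<?php
// Hypothetical helper: derive a fingerprint from a few request headers.
function request_fingerprint(array $server): string
{
    $keys = ['HTTP_USER_AGENT', 'HTTP_ACCEPT', 'HTTP_ACCEPT_LANGUAGE', 'HTTP_ACCEPT_ENCODING'];
    $parts = [];
    foreach ($keys as $key) {
        // Missing headers count as empty strings so the result stays stable.
        $parts[] = $server[$key] ?? '';
    }
    return hash('sha256', implode("\n", $parts));
}
```

Storing request_fingerprint($_SERVER) when the page is served and comparing it on the AJAX call gives a "same browser" signal. It is a consistency check, not authentication, since every one of these headers can be set by the client.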
I cannot provide working code, nor can I tell from your question whether any of this answer applies, but I'd suggest not using the IP info, especially if it can be forged. This is even more valid if you think about IPv6, where every client has multiple active addresses even on the same interface, and they are all randomly generated and highly unlikely to ever occur again later in time. (Of course this does not apply if you are never gonna host on IPv6, but at some point you'll be out of users then.)
I've found a couple online services that do this, and I found this post at stackoverflow about it:
How to check if an email address exists without sending an email?
The problem is that the PHP script linked to there requires you to populate a list of nameservers and domains, and thus (I think) only works if you are validating emails on a known domain. I want something that will work for any email (at least work with a high probability). I found a script that does it that I can buy for $40, but I'd rather find the same thing as open source.
Thanks for any advice,
Jonah
This:
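// The pattern is kept literal in a nowdoc so no backslash needs extra escaping.
// The "/" inside the character classes is escaped because "/" is also the delimiter.
// ^ and $ make the whole string match, and the "i" flag makes it case-insensitive.
$emailValidation = <<<'REGEX'
/^(?:[a-z0-9!#$%&'*+\/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+\/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])$/i
REGEX;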
if(preg_match($emailValidation, $testEmail)) {
echo "valid email.";
} else {
echo "invalid email.";
}
Is a well used email validation regex, which is PHP compatible.
Just check email addresses against it and you're done.
But note that without a confirmation postback you will never know that an email is 100% valid.
I'm evaluating a bunch of email validation techniques, and some of them report that email@domain is valid. I read somewhere that this email may be technically valid if it's being used for internal networking, but really, for almost all practical purposes on the web today, this email should not evaluate to true.
What does your email validation library or home-baked method do, and do you allow this sort of thing?
Well, it depends on what the application is supposed to do.
Is it going to be used in an intranet? If so, email@domain may be fine.
If not, you might want to explicitly require a FQDN so that they can't send mail internally on your domain (foo@localhost, etc).
It shouldn't be difficult to check the domain part:
// array_pop() takes its argument by reference, so it needs a variable.
$parts = explode('@', $email);
$domain = array_pop($parts);
Then, depending on your need, validate the domain.
You can check it for valid syntax (that it's a FQDN). There are plenty of tutorials online (and a lot of frameworks provide helpers) that validate whether a string is in FQDN format.
Or, if your needs are greater, you can just verify that your server can resolve it (via something like dns_get_record()):
if (false === dns_get_record($domain, DNS_MX)) {
//invalid domain (can't find an MX record)...
}
(Note, I said you could do this, not if you should. That will depend on your exact use case)...
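For the syntax-only route mentioned above, a minimal sketch could look like this. looks_like_fqdn() is a made-up name, and the rules are the common hostname conventions (at least two labels, letters, digits and hyphens only, no leading or trailing hyphen, label length up to 63, total length up to 253), which may be stricter or looser than what you need:

```php
<?php
// Hypothetical helper: syntax-only check that a domain has at least two well-formed labels.
function looks_like_fqdn(string $domain): bool
{
    if (strlen($domain) > 253) {
        return false;
    }
    $labels = explode('.', $domain);
    if (count($labels) < 2) {
        return false; // "localhost", "io", etc. have no dot
    }
    foreach ($labels as $label) {
        // 1-63 characters, alphanumeric at both ends, hyphens allowed in between.
        if (!preg_match('/^[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\z/i', $label)) {
            return false;
        }
    }
    return true;
}
```

This does no DNS work at all, so it says nothing about whether the domain exists, only whether it is shaped like a fully qualified name.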
The domain .io currently has an MX resource record, so yes it is valid. It is explicitly allowed by RFC 5321.
You are welcome to use my free PHP function is_email() to validate addresses. It's available to download here. Try validating http://isemail.info/jblue#io online for example.
It will ensure that an address is fully RFC 5321 compliant. It can optionally also check whether the domain actually exists and has an MX record.
You shouldn't rely on a validator to tell you whether a user's email address actually exists: some ISPs give out non-compliant addresses to their users, particularly in countries which don't use the Latin alphabet. More in my essay about email validation here: http://isemail.info/about.
See this article for a regex to match all valid email addresses:
You may want to tweak it to
Discard IP domains
Discard port numbers
And to answer your question about email@domain, you can discard that too, if you are not expecting intranet emails.
The check against false isn't the best way, because the return value can be an empty array, e.g. for $domain = '-onlinbe.de';
try empty() instead: http://us3.php.net/manual/en/function.empty.php
$dnsMx = dns_get_record($domain, DNS_MX);
$dnsA = dns_get_record($domain, DNS_A);
if (empty($dnsMx) && empty($dnsA)) {
echo 'domain not available';
}
or use
if(gethostbyname($domain) === $domain) {
echo 'no ip found';
}