I am using this absolutely amazing piece of code: https://github.com/plancake/official-library-php-email-parser/blob/master/PlancakeEmailParser.php
But the one thing it is missing is the ability to get the From email address.
I have simply added:
public function getFromEmail()
{
if (!isset($this->rawFields['from']))
{
return false;
}
return $this->rawFields['from'];
}
But how would I get only the email address part? At the moment it returns: John Smith<john#gmail.com>
Also, I would need this to work if the From address was only john#gmail.com.
Thanks to the answers this was the finished code:
public function getFromEmail()
{
$email = trim($this->rawFields['from']);
if(substr($email, -1) == '>'){
$fromarr = explode("<",$email);
$mailarr1 = explode(">",$fromarr[1]);
$email = $mailarr1[0];
}
return $email;
}
This is a very simple regular expression:
$output = array();
preg_match("/.*<(.*?)>.*?/", $this->rawFields['from'], $output);
$email_address = $output[1];
Be careful, though: if someone's name contains < or >, it might cause a security vulnerability. The greedy .* at the start skips ahead to the last <, and the lazy quantifier (.*?) then stops at the first > after it, so the last set of < > is used.
HTH
PS: Use http://gskinner.com/RegExr/ to test Regular Expressions!
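The regex above leaves $output[1] undefined when the From value is a bare address with no angle brackets. A minimal sketch of one way to cover both cases from the question follows. The function name extractFromAddress is hypothetical and not part of PlancakeEmailParser, and it assumes real addresses with an `@` separator.

```php
<?php
// Hypothetical helper, not part of PlancakeEmailParser.
// Returns the address from a From header value, or false if none is found.
function extractFromAddress($from)
{
    $from = trim($from);
    // The greedy .* skips to the last "<"; [^<>]+ captures up to the closing ">".
    if (preg_match('/.*<([^<>]+)>/', $from, $matches)) {
        $from = trim($matches[1]);
    }
    // filter_var() returns the address when it is valid and false otherwise.
    return filter_var($from, FILTER_VALIDATE_EMAIL);
}
```

Calling it with either "John Smith<john@gmail.com>" or a bare address should give back just the address, and a value with no usable address gives false.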
$mailid='John Smith<john#gmail.com>';
$mailarr=explode("<",$mailid);
$mailarr1=explode(">",$mailarr[1]);
$just_emailid=$mailarr1[0];
I have a problem that I need help fixing. I am trying to create a script that crawls websites for mailing addresses, mostly German ones, but I am unsure how to write it. I have already created one that extracts email addresses from these websites, but the postal-address one is puzzling because there isn't a fixed format. Here are a few German addresses as examples of the data I would like to extract:
Ilona Mustermann
Hauptstr. 76
27852 Musterheim
Andreas Mustermann
Schwarzwaldhochstraße 1
27812 Musterhausen
D. Mustermann
Kaiser-Wilhelm-Str.3
27852 Mustach
Those are just a few examples of what I am looking to extract from the websites. Is this possible to do with PHP?
Edit:
This is what I have so far
function extract_address($str) {
$str = strip_tags($str);
$Name = null;
$zcC = null;
$Street = null;
foreach(preg_split('/([^A-Za-z0-9üß\-\#\.\(\) .])+/', $str) as $token) {
if(preg_match('/([A-Za-z\.])+ ([A-Za-z\.])+/', $token)){
$Name = $token;
}
if(preg_match('/ /', $token)){
$Street = $token;
}
if(preg_match('/[0-9]{5} [A-Za-zü]+/', $token)){
$zcC = $token;
}
if(isset($Name) && isset($zcC) && isset($Street)){
echo($Name."<br />".$Street."<br />".$zcC."<br /><br />");
$Name = null;
$Street = null;
$zcC = null;
}
}
}
It works for retrieving $Name (i.e. Ilona Mustermann) and the zip code/city (27852 Musterheim), but I am unsure of a regex that will always retrieve streets.
Well, this is what I have come up with so far. It seems to work about 60% of the time on streets, while zip/city and name work 100% of the time. Occasionally it fails when it tries to extract the street. Any idea why?
function extract_address($str) {
$str = strip_tags($str);
$Name = null;
$zcC = null;
$Street = null;
foreach(preg_split('/([^A-Za-z0-9üß\-\#\.\(\)\& .])+/', $str) as $token) {
if(preg_match('/([A-Za-z\&.])+ ([A-Za-z.])+/', $token) && !preg_match('/([A-Za-zß])+ ([0-9])+/', $token)){
//echo("N:$token<br />");
$Name = $token;
}
if(preg_match('/(\.)+/', $token) || preg_match('/(ß)+/', $token) || preg_match('/([A-Za-zß\.])+ ([0-9])+/', $token)){
$Street = $token;
}
if(preg_match('/([0-9]){5} [A-Za-züß]+/', $token)){
$zcC = $token;
}
/*echo("<br />
N:$Name
<br />
S:$Street
<br />
Z:$zcC
<br />
");*/
if(isset($Name) && isset($zcC) && isset($Street)){
echo($Name."<br />".$Street."<br />".$zcC."<br /><br />");
$Name = null;
$Street = null;
$zcC = null;
}
}
}
Of course it is possible. You need to use the preg_match() function; it is all about making a good regex pattern.
For example, to get the postcode:
<?php
$str = "YOUR ADRESSES STRING HERE";
preg_match('/([0-9]+) ([A-Za-z]+)/', $str, $matches);
print_r($matches);
?>
This regex matches the addresses you've given. You also need to add your native characters to it:
[A-Za-züß.]+ [A-Za-z.üß]+\s[A-Za-z. 0-9ß-]+\s[0-9]+ [A-Za-züß.]+
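As a rough sketch of the same idea with Unicode letter classes instead of a hand-picked list of native characters: the version below assumes UTF-8 input and that each address sits on three consecutive lines (name, street plus house number, five-digit postcode plus city), as in the examples from the question. The function name extractGermanAddresses is hypothetical, and text in any other layout will not match.

```php
<?php
// Hypothetical helper. Assumes UTF-8 text with one address per three lines:
// name / street + house number / five-digit postcode + city.
function extractGermanAddresses($text)
{
    $pattern = '/^(?<name>[\p{L}.\- ]+)\R'
             . '(?<street>[\p{L}.\- ]+?)[ \t]*(?<number>\d+[a-zA-Z]?)\R'
             . '(?<zip>\d{5}) (?<city>[\p{L}\- ]+)(?=\R|\z)/mu';
    // PREG_SET_ORDER groups the named captures per address.
    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);
    return $matches;
}
```

Each returned entry carries the named keys name, street, number, zip and city. The u modifier makes \p{L} match letters such as ü and ß.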
It's impossible to get a reliable answer with regex with such a complicated string. That's the only correct answer to this question.
Vlad Bondarenko is right.
In CS speak: Postal addresses do not form a regular language.
Extracting information is an active research topic. Regular expressions are not completely bogus, but will have a higher failure rate than approaches that use dictionaries ("gazetteers") or more advanced machine learning algorithms.
A nice Stack Overflow Q&A is "How to parse freeform street/postal address out of text, and into components".
I have a "coming soon" form on a website where the user fills in an email address and it is emailed to me. However, a spammer has hit the site and is spamming the form with goatse links and so on. An IP ban isn't helping, so I need to stop the form from sending if the input contains goatse or something similar. Here's the mailer:
<?php
$SPOSTI =$_POST[sposti];
if ($SPOSTI=="")
{
return false;
}
if ($SPOSTI=="goatse.fr")
{
return false;
}
if ($SPOSTI=="http://www.goatse.info/hello.jpg")
{
return false;
}
else
{
$to = "xxx#gmail.com";
$subject = "xxx";
$message = "$_POST[sposti] haluaa tiedon kun kotisivut.name avautuu.
$_POST[ip]";
$from = "$_POST[sposti]";
$headers = "From:" . $from;
mail($to,$subject,$message,$headers);
}
?>
Is there some way to block it from executing the code if the email contains a certain word (goatse in this case)?
You need to use exit or die instead of return false, which works inside functions/methods:
if ($SPOSTI == "" || strpos($SPOSTI, 'goatse') !== false)
{
exit();
}
strpos() will let you find a substring, but I really recommend a captcha security system as the attacker could simply switch to another annoying word.
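A minimal sketch of the substring check extended to a list of words, in case the attacker does switch terms. The helper name containsBlockedWord and the word list are hypothetical. It uses stripos(), the case-insensitive variant of strpos().

```php
<?php
// Hypothetical helper: true if $text contains any word from $blocked,
// ignoring case.
function containsBlockedWord($text, array $blocked)
{
    foreach ($blocked as $word) {
        // stripos($haystack, $needle) returns false when there is no match;
        // compare with !== because a match at offset 0 is a valid hit.
        if ($word !== '' && stripos($text, $word) !== false) {
            return true;
        }
    }
    return false;
}
```

In the mailer this could replace the chain of equality checks, for example: if ($SPOSTI == "" || containsBlockedWord($SPOSTI, array('goatse'))) { exit(); }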
Goatse isn't your problem here; the security is.
You can use stristr (http://php.net/manual/de/function.stristr.php) to achieve this. I would recommend using a captcha, since it is more effective. A popular solution is reCaptcha: https://developers.google.com/recaptcha/docs/php. Another, weaker possibility is to add a security question to your form, for instance "What is five plus five in numbers?".
Try the following:
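function is_spam($array, $block_pattern){
    // Escape the search string so it is matched literally, ignoring case
    $regex = '/' . preg_quote($block_pattern, '/') . '/i';
    foreach($array as $k => $v){
        // Check both the field name and its value (nested arrays are skipped)
        if(preg_match($regex, (string)$k) ||
           (is_scalar($v) && preg_match($regex, (string)$v))){
            return true;
        }
    }
    return false;
}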
Usage: is_spam($_POST, 'goatse');
Returns: true if 'goatse' is found in $_POST
The function will search all keys and values of $array for the $block_pattern string and will return true if the pattern is found.
I know email validation is not the most fun thing on the block. I'm starting up a website and I want to limit my audience to only the people in my college, and I also want a preferred email address for each user. So this is a two-part question.
Is there a really solid PHP function out there for email validation?
Can I validate an email from a specific domain? I don't want to just check if the domain exists, because I know www.mycollege.edu exists already. Is there any way to validate that the user has a valid #mycollege.edu email address?
This is what I use:
function check_email_address($email) {
// First, we check that there's one # symbol, and that the lengths are right
if (!preg_match("/^[^#]{1,64}#[^#]{1,255}$/", $email)) {
// Email invalid because wrong number of characters in one section, or wrong number of # symbols.
return false;
}
// Split it into sections to make life easier
$email_array = explode("#", $email);
$local_array = explode(".", $email_array[0]);
for ($i = 0; $i < sizeof($local_array); $i++) {
if (!preg_match("/^(([A-Za-z0-9!#$%&'*+\/=?^_`{|}~-][A-Za-z0-9!#$%&'*+\/=?^_`{|}~\.-]{0,63})|(\"[^(\\|\")]{0,62}\"))$/", $local_array[$i])) {
return false;
}
}
if (!preg_match("/^\[?[0-9\.]+\]?$/", $email_array[1])) { // Check if domain is IP. If not, it should be valid domain name
$domain_array = explode(".", $email_array[1]);
if (sizeof($domain_array) < 2) {
return false; // Not enough parts to domain
}
for ($i = 0; $i < sizeof($domain_array); $i++) {
if (!preg_match("/^(([A-Za-z0-9][A-Za-z0-9-]{0,61}[A-Za-z0-9])|([A-Za-z0-9]+))$/", $domain_array[$i])) {
return false;
}
}
}
return true;
}
EDIT: Replaced deprecated ereg with preg_match for PHP 5.3 compliance.
If you really want to make sure it's valid, make your signup form send them an email containing a URL that they have to click to validate.
This way, not only do you know the address is valid (because they received the email), but you also know the owner of the account has signed up (unless someone else knows their login details).
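The confirmation-link idea can be sketched roughly as follows. All function names here are hypothetical, storing the token (for instance next to the user record) and sending the mail are left out, and random_bytes() assumes PHP 7 or later.

```php
<?php
// Hypothetical helpers for a confirmation link; token storage is left out.
function createConfirmationToken()
{
    // 16 random bytes become 32 hex characters.
    return bin2hex(random_bytes(16));
}

function buildConfirmationUrl($baseUrl, $token)
{
    // The token is URL-encoded and passed as a query parameter.
    return $baseUrl . '?token=' . urlencode($token);
}

function tokenMatches($storedToken, $submittedToken)
{
    // hash_equals() compares the two strings in constant time.
    return hash_equals($storedToken, (string) $submittedToken);
}
```

On signup you would store the token and mail the URL. When the link is clicked, compare the submitted token with the stored one and only then mark the address as confirmed.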
To make sure it ends correctly you could use explode() on the '#' and check the second part.
$arr = explode('#', $email_address);
if ($arr[1] == 'mycollege.edu')
{
// Then it's from your college
}
PHP also has its own way of validating email addresses using filter_var: http://www.w3schools.com/php/filter_validate_email.asp
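Putting the two suggestions together (validate the syntax with filter_var, then compare the domain part), a minimal sketch could look like this. The function name isCollegeEmail is hypothetical, and it assumes real addresses with an `@` separator.

```php
<?php
// Hypothetical helper: true only for syntactically valid addresses
// whose domain is exactly $domain (compared case-insensitively).
function isCollegeEmail($email, $domain = 'mycollege.edu')
{
    if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
        return false;
    }
    // Take everything after the last "@" and compare it with the domain.
    $emailDomain = substr($email, strrpos($email, '@') + 1);
    return strcasecmp($emailDomain, $domain) === 0;
}
```

Comparing the whole domain, rather than searching for it as a substring, keeps addresses such as bob@mycollege.edu.example.com out. This only checks the form of the address, so the confirmation email is still needed to prove ownership.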
This should work:
if (preg_match('/^[a-zA-Z0-9]+[a-zA-Z0-9._-]*#mycollege\.edu$/', $email)) {
// Valid
}
Read here
http://ru2.php.net/manual/en/book.filter.php
Or in short
var_dump(filter_var('bob#example.com', FILTER_VALIDATE_EMAIL));
This might be a better solution. Many have answered already, though this one is a little different.
$email = "info#stakoverflow.com";
if (!filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
echo $email ." is a valid email address";
} else {
echo $email ." is not a valid email address";
}
I hope this one is simple to use.
for any e-mail
([a-zA-Z0-9_-]+)(\#)([a-zA-Z0-9_-]+)(\.)([a-zA-Z0-9]{2,4})(\.[a-zA-Z0-9]{2,4})?
for php preg_match function
/([a-zA-Z0-9_-]+)(\#)([a-zA-Z0-9_-]+)(\.)([a-zA-Z0-9]{2,4})(\.[a-zA-Z0-9]{2,4})?/i
for #mycollege.edu
^([a-zA-Z0-9_-]+)(#mycollege.edu)$
for php preg_match function
/^([a-zA-Z0-9_-]+)(#mycollege.edu)$/i
PHP CODE
<?php
$email = 'tahir_aS-adov#mycollege.edu';
preg_match('/^([a-zA-Z0-9_-]+)(#mycollege.edu)$/i', $email, $matches);
if ($matches) {
echo "Matched";
} else {
echo "Not Matched";
}
var_dump($matches);
A simple function using filter_var in php
<?php
function email_validation($email) {
if (!filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
echo("$email is a valid email address");
} else {
echo("$email is not a valid email address");
}
}
//Test
email_validation('johnson123');
?>
How can I validate that an input value is a valid email address using PHP 5? At the moment I am using this code:
function isValidEmail($email){
$pattern = "^[_a-z0-9-]+(\.[_a-z0-9-]+)*#[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$";
if (eregi($pattern, $email)){
return true;
}
else {
return false;
}
}
but it shows a deprecated error. How can I fix this issue? Please help me.
You can use the filter_var() function, which gives you a lot of handy validation and sanitization options.
filter_var($email, FILTER_VALIDATE_EMAIL)
PHP Manual filter_var()
Available in PHP >= 5.2.0
If you don't want to change your code that relied on your function, just do:
function isValidEmail($email){
return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}
Note: For other uses (where you need regex), the deprecated ereg function family (POSIX regex functions) should be replaced by the preg family (PCRE regex functions). There are a small number of differences; reading the manual should suffice.
Update 1: As pointed out by #binaryLV:
PHP 5.3.3 and 5.2.14 had a bug related to
FILTER_VALIDATE_EMAIL, which resulted in segfault when validating
large values. Simple and safe workaround for this is using strlen()
before filter_var(). I'm not sure about 5.3.4 final, but it is
written that some 5.3.4-snapshot versions also were affected.
This bug has already been fixed.
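For anyone still on an affected version, the strlen() workaround mentioned in the quote might look like the sketch below. The function name isValidEmailGuarded is hypothetical, and the 254-character default is an assumption based on the commonly cited maximum address length, not something taken from the bug report.

```php
<?php
// Hypothetical wrapper: reject over-long input before calling filter_var().
// The default limit of 254 characters is an assumption.
function isValidEmailGuarded($email, $maxLength = 254)
{
    // Non-strings and very long values never reach the filter.
    if (!is_string($email) || strlen($email) > $maxLength) {
        return false;
    }
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}
```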
Update 2: This method will of course validate bazmega#kapa as a valid email address, because in fact it is a valid email address. But most of the time on the Internet, you also want the email address to have a TLD: bazmega#kapa.com. As suggested in this blog post (link posted by #Istiaque Ahmed), you can augment filter_var() with a regex that will check for the existence of a dot in the domain part (will not check for a valid TLD though):
function isValidEmail($email) {
return filter_var($email, FILTER_VALIDATE_EMAIL)
&& preg_match('/#.+\./', $email);
}
As #Eliseo Ocampos pointed out, this problem only exists before PHP 5.3. In that version they changed the regex and it now does this check, so you do not have to.
See the notes at http://www.php.net/manual/en/function.ereg.php:
Note:
As of PHP 5.3.0, the regex extension is deprecated in favor of
the PCRE extension. Calling this
function will issue an E_DEPRECATED
notice. See the list of differences
for help on converting to PCRE.
Note:
preg_match(), which uses a Perl-compatible regular expression
syntax, is often a faster alternative
to ereg().
This is an old post, but I will share my solution because no one here has mentioned one problem yet.
Newer email addresses can contain UTF-8 characters or special domain names like .live, .news, etc.
I also found that some email addresses can be in Cyrillic, and in all such cases a standard regex or filter_var() will fail.
That's why I made a solution for it:
function valid_email($email)
{
if(is_array($email) || is_numeric($email) || is_bool($email) || is_float($email) || is_file($email) || is_dir($email) || is_int($email))
return false;
else
{
$email=trim(strtolower($email));
if(filter_var($email, FILTER_VALIDATE_EMAIL)!==false) return $email;
else
{
$pattern = '/^(?!(?:(?:\\x22?\\x5C[\\x00-\\x7E]\\x22?)|(?:\\x22?[^\\x5C\\x22]\\x22?)){255,})(?!(?:(?:\\x22?\\x5C[\\x00-\\x7E]\\x22?)|(?:\\x22?[^\\x5C\\x22]\\x22?)){65,}#)(?:(?:[\\x21\\x23-\\x27\\x2A\\x2B\\x2D\\x2F-\\x39\\x3D\\x3F\\x5E-\\x7E]+)|(?:\\x22(?:[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F\\x21\\x23-\\x5B\\x5D-\\x7F]|(?:\\x5C[\\x00-\\x7F]))*\\x22))(?:\\.(?:(?:[\\x21\\x23-\\x27\\x2A\\x2B\\x2D\\x2F-\\x39\\x3D\\x3F\\x5E-\\x7E]+)|(?:\\x22(?:[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F\\x21\\x23-\\x5B\\x5D-\\x7F]|(?:\\x5C[\\x00-\\x7F]))*\\x22)))*#(?:(?:(?!.*[^.]{64,})(?:(?:(?:xn--)?[a-z0-9]+(?:-+[a-z0-9]+)*\\.){1,126}){1,}(?:(?:[a-z][a-z0-9]*)|(?:(?:xn--)[a-z0-9]+))(?:-+[a-z0-9]+)*)|(?:\\[(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){7})|(?:(?!(?:.*[a-f0-9][:\\]]){7,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?)))|(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){5}:)|(?:(?!(?:.*[a-f0-9]:){5,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3}:)?)))?(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))(?:\\.(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))){3}))\\]))$/iD';
return (preg_match($pattern, $email) === 1) ? $email : false;
}
}
}
This function works perfectly for all cases and email formats.
I always use this:
function validEmail($email){
// First, we check that there's one # symbol, and that the lengths are right
if (!preg_match("/^[^#]{1,64}#[^#]{1,255}$/", $email)) {
// Email invalid because wrong number of characters in one section, or wrong number of # symbols.
return false;
}
// Split it into sections to make life easier
$email_array = explode("#", $email);
$local_array = explode(".", $email_array[0]);
for ($i = 0; $i < sizeof($local_array); $i++) {
if (!preg_match("/^(([A-Za-z0-9!#$%&'*+\/=?^_`{|}~-][A-Za-z0-9!#$%&'*+\/=?^_`{|}~\.-]{0,63})|(\"[^(\\|\")]{0,62}\"))$/", $local_array[$i])) {
return false;
}
}
if (!preg_match("/^\[?[0-9\.]+\]?$/", $email_array[1])) { // Check if domain is IP. If not, it should be valid domain name
$domain_array = explode(".", $email_array[1]);
if (sizeof($domain_array) < 2) {
return false; // Not enough parts to domain
}
for ($i = 0; $i < sizeof($domain_array); $i++) {
if (!preg_match("/^(([A-Za-z0-9][A-Za-z0-9-]{0,61}[A-Za-z0-9])|([A-Za-z0-9]+))$/", $domain_array[$i])) {
return false;
}
}
}
return true;
}
User data is very important for a good developer, so don't ask again and again for the same data; use some logic to correct basic errors in the data.
Before validating the email, first remove all illegal characters from it:
// This will remove all illegal characters from the email
$email = filter_var($email, FILTER_SANITIZE_EMAIL);
After that, validate your email address using the filter_var() function:
filter_var($email, FILTER_VALIDATE_EMAIL); // To validate the email
For example:
<?php
$email = "john.doe#example.com";
// Remove all illegal characters from email
$email = filter_var($email, FILTER_SANITIZE_EMAIL);
// Validate email
if (filter_var($email, FILTER_VALIDATE_EMAIL)) {
echo $email." is a valid email address";
} else {
echo $email." is not a valid email address";
}
?>
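Use either "filter_var" from http://php.net/manual/en/function.filter-var.php:
var_dump(filter_var('bob#example.com', FILTER_VALIDATE_EMAIL));
or "EmailValidator" from https://github.com/egulias/EmailValidator:
use Egulias\EmailValidator\EmailValidator;
use Egulias\EmailValidator\Validation\DNSCheckValidation;
use Egulias\EmailValidator\Validation\MultipleValidationWithAnd;
use Egulias\EmailValidator\Validation\RFCValidation;

$validator = new EmailValidator();
$multipleValidations = new MultipleValidationWithAnd([
    new RFCValidation(),
    new DNSCheckValidation()
]);
$validator->isValid("example#example.com", $multipleValidations); //true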
Take care: an address such as iasd#x.z-----com is INVALID, but filter_var() returns true for it, and many other INVALID strings also pass filter_var().
To validate email I use this function:
function correcorre($s){// correo correcto
$x = '^([[:alnum:]](_|-|\.)*)*[[:alnum:]]+#([[:alnum:]]+(-|\.)+)*[[:alnum:]]+\.[[:alnum:]]+$';
preg_match("!$x!i", $s, $M);
if(!empty($M[0]))return($M[0]);
}
please improve and share, thanks
This is a two-part question. Help on either (or both) is appreciated!
1) What is the best php method for checking if an email string is a Gmail address
2) How to strip out everything but the username?
Thanks!
list($user, $domain) = explode('#', $email);
if ($domain == 'gmail.com') {
// use gmail
}
echo $user;
// if $email is toto#gmail.com then $user is toto
Dunno about best method, but here is one method for checking a gmail address using stristr.
if (stristr($email, '#gmail.com') !== false) {
echo 'Gmail Address!';
}
As for pulling out the username there are a ton of functions as well, one could be explode:
$parts = explode('#', $email);
$username = array_shift($parts); // array_shift() needs a variable, not a function result
There are many ways to do it, the best depends on your needs.
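One caveat with the substring checks above is that they also match addresses where gmail.com merely appears somewhere in the string. A minimal sketch that answers both parts of the question by comparing the whole domain follows. The function name gmailUsername is hypothetical, and it assumes real addresses with an `@` separator.

```php
<?php
// Hypothetical helper: returns the username of a Gmail address,
// or false for any other input.
function gmailUsername($email)
{
    $at = strrpos($email, '@');
    // No "@" at all, or nothing in front of it.
    if ($at === false || $at === 0) {
        return false;
    }
    // Compare the complete domain part, ignoring case.
    if (strcasecmp(substr($email, $at + 1), 'gmail.com') !== 0) {
        return false;
    }
    return substr($email, 0, $at);
}
```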
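For multiple email providers:
$expressions =
"/#(gmail|googlemail|yahoo|hotmail|aol|msn|live|rediff|outlook|facebook)\./i";
if (preg_match($expressions, $input_email)) {
    throw new InvalidArgumentException('Email provider not allowed');
}
// Strip the domain from a Gmail address, leaving only the username
if (preg_match('/#gmail\.com$/i', $email_address)) {
    $email_address = substr($email_address, 0, strrpos($email_address, '#'));
}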