What I need is not email validation.
It's simple.
Allow #hello.world, #hello_world, or #helloworld, but #helloworld. should be taken as #helloworld, and so should #helloworld?
In short, check for a letter or digit after . and _; if there is none, take the string before it.
My existing regex is /#.([A-Za-z0-9_]+)(?=\?|\,|\;|\s|\Z)/. It only handles #helloworld, not #hello.world or #hello_world.
Update:
So now I have a regex that deals with problem number 1, i.e. it allows #hello.world, #hello_world, and #helloworld. But what about #helloworld. being taken as #helloworld, and likewise #helloworld?
New RegEx: /#([A-Za-z0-9+_.-]+)/
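One possible way to get the remaining behaviour, sketched in JavaScript (the pattern itself should carry over to PCRE): require at least one letter or digit after every . or _, so that a trailing separator is simply left out of the match. The names TAG and extractTag are made up for this illustration, and leaving + and - out of the pattern is an assumption based on the requirement as stated.

```javascript
// Hypothetical helper: returns the tag after '#', or null if there is none.
// A '.' or '_' only counts as part of the tag when a letter or digit follows it.
const TAG = /#([A-Za-z0-9]+(?:[._][A-Za-z0-9]+)*)/;

function extractTag(text) {
  const match = TAG.exec(text);
  return match ? match[1] : null;
}

for (const sample of ["#hello.world", "#hello_world", "#helloworld.", "#helloworld?"]) {
  console.log(sample, "->", extractTag(sample));
}
```

Because the separator group is only taken when an alphanumeric run follows, no lookahead for ?, comma, or whitespace is needed in this sketch.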
Don't use a regex for that.
Use...
$valid = filter_var($str, FILTER_VALIDATE_EMAIL);
A regex will never be able to verify an email address; it can only do some very basic format checking.
The most comprehensive regex for matching email addresses was 8000 chars long, and that one is already invalid due to changes in what is accepted in email addresses.
Use a library designed for the purpose if you need real verification; otherwise just check for # and some dots. Anything more and you will probably end up rejecting perfectly legal email addresses.
Some examples of perfectly legal email addresses (the leading and trailing " only show the boundaries):
"dama#nodomain.se"
"\"dama\"#nodomain.se"
"da/ma#nodomain.se"
"dama#nõdomain.se"
"da.ma#nodomain.se"
"dama#παράδειγμα.δοκιμή"
"dama #nodomain .se"
"dama#nodomain.se "
You can use this regexp to validate email addresses:
^[A-Z0-9._%+-]+#[A-Z0-9.-]+\.[A-Z]{2,6}$
For more information and complete expressions you can check here
I hope this helps you
Try this:
\#.+(\.|\?|;|[\r\n\s]+)
Related
I am using the following regex to validate emails and just noticed some problems and don't see what the issue is :
/^[a-z0-9_.-]+#[a-z0-9.-]+.[a-z]{2,6}$/i.test(value)
support#tes is invalid
support#test is valid
support#test.c is invalid
support#test.co is valid
The {2,6} is meant to require a TLD of 2 to 6 letters at the end, and that does not appear to be working either. I am sure I had this working properly before.
In a regex, . is a wildcard (meaning any char). You need to escape it as \.
Keep in mind, though, that the regex is too restrictive. You can have non-alphanumeric chars in the address, like '
I notice you're not escaping the .. There might be more to it than that, but that jumps out at me.
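To see the effect of the missing escape, here is a small sketch that runs the pattern from the question next to a copy with the dot escaped. The variable names are made up for the illustration; the separator is written as # the way the addresses in this thread are.

```javascript
// Pattern from the question: the '.' before the TLD part is unescaped,
// so it matches any single character.
const unescaped = /^[a-z0-9_.-]+#[a-z0-9.-]+.[a-z]{2,6}$/i;
// Same pattern with the dot escaped, so it only matches a literal '.'.
const escaped = /^[a-z0-9_.-]+#[a-z0-9.-]+\.[a-z]{2,6}$/i;

const samples = ["support#tes", "support#test", "support#test.c", "support#test.co"];
for (const value of samples) {
  console.log(value, "unescaped:", unescaped.test(value), "escaped:", escaped.test(value));
}
```

With the unescaped dot, support#test passes because the wildcard consumes one of the letters of test; with the escaped dot it is rejected.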
This is a decent check for an e-mail with Regex
\w+([-+.']\w+)*#\w+([-.]\w+)*\.\w+([-.]\w+)*
However you may want to read this. Using a regular expression to validate an email address
There are many ways to regex an email address, depending on how precise and restrictive you want it to be. To rewrite a working regex closest to what you have in your question, this should work:
^[\w_.-]+#[\w]+\.[\w]{2,6}$
support#tes - Invalid
support#test - Invalid
support#test.c - Invalid
support#test.co - Valid
supp34o.rt#tes.com - Valid
But also keep in mind ALL the characters allowed in a valid email address - What characters are allowed in an email address?
This sounds strange, but I've been using this function for quite a while now and "suddenly, from one day to the other" it does not filter some addresses in the right way anymore. However, I cannot see why...
function validate_email($email)
{
/*
(Name) Letters, Numbers, Dots, Hyphens and Underscores
(# sign)
(Domain) (with possible subdomain(s) ).
Contains only letters, numbers, dots and hyphens (up to 255 characters)
(. sign)
(Extension) Letters only (up to 10 (can be increased in the future) characters)
*/
$regex = '/([a-z0-9_.-]+)'. # name
'#'. # at
'([a-z0-9.-]+){2,255}'. # domain & possibly subdomains
'.'. # period
'([a-z]+){2,10}/i'; # domain extension
if($email == '') {
return false;
}
else {
$eregi = preg_replace($regex, '', $email);
}
return empty($eregi) ? true : false;
}
E.g. "some#gmail" is reported as correct, so it seems something happened with the TLD part. Could anybody tell me why?
Thank you very much in advance!
. means any character. You should escape it if you actually mean 'dot': \.
Your regex also has some other problems:
Uppercase letters are matched only because of the i modifier; without it the classes would need to be [a-zA-Z0-9]
No unicode characters are allowed in your regex (for example email addresses with é, ç, ... etc)
Some special characters such as + are in fact allowed in an email address
...
I would keep the email validation very simple: check that there is a # present and pretty much leave it at that. If you really want to validate an email address, the regex becomes gruesome.
Check this SO answer for a more detailed explanation.
What you commented with "period":
'.'. # period
is in fact a placeholder for any character. It should be \. instead.
However, you're overcomplicating things. Such validation should exist to reject either empty fields or obviously wrong stuff (e.g. name put in the email field). So in my experience the best check is just to look whether it contains an # and don't worry too much about getting the structure right. You can in fact write a regex which will faithfully validate any valid email address and reject any invalid one. It's a monster spanning about a screen of text. Don't do that. KISS.
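A minimal sketch of such a "just look for the separator" check, written in JavaScript. The function name looksLikeAddress is hypothetical, and requiring at least one character on each side of the separator is an added assumption, not something the answers above prescribe.

```javascript
// Hypothetical minimal check: the separator must be present, with at least
// one character before it and one after it. The separator defaults to '#',
// the way addresses are written in this thread.
function looksLikeAddress(value, separator = "#") {
  const position = value.lastIndexOf(separator);
  return position > 0 && position < value.length - 1;
}

console.log(looksLikeAddress("dama#nodomain.se"));
console.log(looksLikeAddress("John Smith"));
```

This rejects empty fields and a name typed into the wrong field, which is about all the check is meant to do.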
I think the error is in this line:
'.'. # period
You mean a literal period here. But periods have a special meaning in regular expressions (they mean "any character").
You need to escape it with a backslash.
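The same behaviour can be reproduced outside PHP. The sketch below ports the pattern from the question to JavaScript unchanged, next to a copy with the period escaped, and mirrors the "remove the match and see whether anything is left" logic of validate_email(). The names original, fixed, and validateEmail are made up here, and replace() only removes the first match, which is enough for these inputs.

```javascript
// The question's pattern, ported as-is: the '.' between the domain and the
// extension is unescaped, so it matches any character.
const original = /([a-z0-9_.-]+)#([a-z0-9.-]+){2,255}.([a-z]+){2,10}/i;
// Same pattern with the period escaped.
const fixed = /([a-z0-9_.-]+)#([a-z0-9.-]+){2,255}\.([a-z]+){2,10}/i;

// Mirrors validate_email(): the address counts as valid when removing the
// matched text leaves an empty string.
function validateEmail(email, regex) {
  if (email === "") {
    return false;
  }
  return email.replace(regex, "") === "";
}

console.log(validateEmail("some#gmail", original));
console.log(validateEmail("some#gmail", fixed));
console.log(validateEmail("some#gmail.com", fixed));
```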
What about FILTER_VALIDATE_EMAIL?
I am stuck on a particular problem. I have a username field where only letters, numbers, and the characters . - and _ are allowed, and it should always start with a letter.
Here are examples of what should be accepted:
someone#mydomain.com
something1234#mydomain.com
someething.something#mydomain.com
something-something#mydomain.com
something_something#mydomain.com
something_1234#mydomain.com
something.123#mydomain.com
something-456#mydomain.com
What I have done so far is:
[a-zA-Z0-9]+[._-]{1,1}[a-zA-Z0-9]+#mydomain.com
This matches all my requirements, except that it doesn't match:
someone#mydomain.com
someont123#mydomain.com
But it even matches:
someone_someone_something#mydomain.com
which is not wanted. I really don't see how to solve this. One thing I tried is:
[a-zA-Z0-9]+[._-]{0}[a-zA-Z0-9]+#mydomain.com
But this doesn't solve my problem either; now it accepts everything, like:
something+455#mydomain.com
which is not wanted. Please help me.
If you want to make the delimiter optional, replace the {1,1} (quantifier: exactly once) with a ? (quantifier: once or not at all):
[a-zA-Z0-9]+[._-]?[a-zA-Z0-9]+#mydomain.com
The reason the regex still matches addresses it should reject, such as someone_someone_something#mydomain.com, is that you don't assert the whole string, just some part of it. Use the start ^ and end $ anchors:
^[a-zA-Z0-9]+[._-]?[a-zA-Z0-9]+#mydomain\.com$
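A quick JavaScript check of the anchored pattern against the addresses from the question. The second pattern, leadingLetter, is not from the answer above: it is an assumed variant that also enforces the "must start with a letter" rule and allows at most one delimiter.

```javascript
// Anchored pattern from the answer above.
const anchored = /^[a-zA-Z0-9]+[._-]?[a-zA-Z0-9]+#mydomain\.com$/;
// Assumed variant (not from the answer): the first character must be a
// letter, followed by at most one delimiter plus alphanumerics.
const leadingLetter = /^[a-zA-Z][a-zA-Z0-9]*(?:[._-][a-zA-Z0-9]+)?#mydomain\.com$/;

const samples = [
  "someone#mydomain.com",
  "something.123#mydomain.com",
  "someone_someone_something#mydomain.com",
  "something+455#mydomain.com",
  "1abc#mydomain.com",
];
for (const value of samples) {
  console.log(value, anchored.test(value), leadingLetter.test(value));
}
```

The two patterns only disagree on the last sample: the anchored pattern from the answer accepts a leading digit, the variant does not.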
This is why we have filter_var($email, FILTER_VALIDATE_EMAIL).
If the email address is valid, you then just have to check whether it ends with #domain.com. That could be done with strrpos($email, '#domain.com').
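The "ends with" part of that idea, sketched in JavaScript rather than PHP. The helper name hasDomain is hypothetical, and the sketch assumes the address has already passed a general validity check.

```javascript
// Hypothetical helper: true when the address ends with the given suffix.
// '#domain.com' is the suffix used in the answer above.
function hasDomain(email, suffix = "#domain.com") {
  return email.endsWith(suffix);
}

console.log(hasDomain("joe#domain.com"));
console.log(hasDomain("joe#domain.com.example.org"));
```

Checking the end of the string, rather than just whether the suffix occurs somewhere, keeps an address like the second one from slipping through.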
I'm currently using
if(preg_match('~#(semo\.edu|uni\.uu\.se|)$~', $email))
as a domain check.
However I need to only check if the e-mail ends with the domains above. So for instance, all these need to be accepted:
hello#semo.edu
hello#student.semo.edu
hello#cool.teachers.semo.edu
So I'm guessing I need something after the # but before the ( which is something like "any random string or empty string". Any regexp-ninjas out there who can help me?
([^#]*\.)? works if you already know you're dealing with a valid email address. Explanation: it's either empty, or anything that ends with a period and does not contain a #. So student.cs.semo.edu matches, as does plain semo.edu, but not me#notreallysemo.edu. So:
~#([^#]*\.)?(semo\.edu|uni\.uu\.se)$~
Note that I've removed the last | from your original regex.
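A small JavaScript check of both patterns (the regex syntax used here is the same in PCRE). The variable names are made up for the illustration.

```javascript
// Pattern suggested above: optional subdomain part, then one of the domains.
const suggested = /#([^#]*\.)?(semo\.edu|uni\.uu\.se)$/;
// Pattern from the question, including the trailing empty alternative.
const original = /#(semo\.edu|uni\.uu\.se|)$/;

const samples = [
  "hello#semo.edu",
  "hello#student.semo.edu",
  "hello#cool.teachers.semo.edu",
  "me#notreallysemo.edu",
  "hello#",
];
for (const value of samples) {
  console.log(value, "suggested:", suggested.test(value), "original:", original.test(value));
}
```

The last sample shows why the trailing | matters: the empty alternative lets the original pattern accept a string that merely ends with the separator.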
You can use [a-zA-Z0-9\.]* to match zero or more characters (letters, numbers or dot):
~#[a-zA-Z0-9\.]*(semo\.edu|uni\.uu\.se|)$~
Well, .* will match anything. But you don't actually want that. There are a number of characters that are invalid in a domain name (e.g. a space). Instead you want something more like this:
[\w.]*
I might not have all of the allowed characters, but that will get you [A-Za-z0-9_.]. The idea is that you make a list of all the allowed characters in the square brackets and then use * to say zero or more of them.
I'm trying to create a regular expression that will filter valid emails using PHP and have run into an issue that conflicts with what I understand of regular expressions. Here is the code that I am using.
if (!preg_match('/^[-a-zA-Z0-9_.]+#[-a-zA-Z0-9]+.[a-zA-Z]{2,4}$/', $string)) {
return $false;
}
Now from the materials that I've researched, this should allow content before the # to be multiple letters, numbers, underscores and periods, then afterwards to allow multiple letters and numbers, then require a period, then two to four letters for the top level domain.
However, right now it ignores the requirement for the top-level domain section. For example, a#b.c is accepted as valid (and should be), but a#b is also returned as valid, and I want it to be flagged as invalid.
I'm sure I'm missing something, but after browsing Google for an hour I'm at a loss as to what it could be. Does anyone have an answer for this conundrum?
EDIT: The speed at which answers arrive here makes this site superior to its competitors. Well done!
You should escape . when it's not inside a character class: '/^[-a-zA-Z0-9_.]+#[-a-zA-Z0-9]+\.[a-zA-Z]{2,4}$/'
Otherwise it will match any character:
. - any character (but not the newline \n unless the s modifier is used)
\. - literal dot
[.] - literal dot (inside a character class)
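The three forms can be compared side by side. This is a JavaScript sketch with made-up names; the behaviour shown for the dot is the same in PCRE when the s modifier is not set.

```javascript
// Three ways of writing a dot between 'a' and 'c'.
const forms = {
  anyChar: /^a.c$/,   // unescaped: any character except a line break
  escaped: /^a\.c$/,  // escaped: a literal dot
  inClass: /^a[.]c$/, // inside a character class: a literal dot
};

for (const [name, pattern] of Object.entries(forms)) {
  console.log(name, pattern.test("a.c"), pattern.test("abc"), pattern.test("a\nc"));
}
```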
Rather than rolling your own, perhaps you should read the article How to Find or Validate an Email Address on Regular-Expressions.info. The article also discusses reasons why you might not want to validate an email address using a regular expression and provides 3 regular expressions that you might consider using instead of your own.
From the page Comparing E-mail Address Validating Regular Expressions: Geert De Deckere from the Kohana project has developed a near perfect one:
/^[-_a-z0-9\'+*$^&%=~!?{}]++(?:\.[-_a-z0-9\'+*$^&%=~!?{}]+)*+#(?:(?![-.])[-a-z0-9.]+(?<![-.])\.[a-z]{2,6}|\d{1,3}(?:\.\d{1,3}){3})(?::\d++)?$/iD
But there is also a built-in function in PHP, filter_var($email, FILTER_VALIDATE_EMAIL), though it seems to be under development. And there is another serious solution: PEAR:Validate. I think the PEAR solution is the best one.
An RFC822-compliant e-mail regex is available.
This is the most reasonable trade off of the spec versus real life that I have seen:
[a-z0-9!#$%&'*+/=?^_`{|}~-]+
(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*
#
(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+
(?:[A-Z]{2}|com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum)\b
Of course, you have to remove the line breaks, and you have to update it if more top-level domains become available.
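Removing the line breaks can be left to the code. The sketch below, in JavaScript, keeps the five pieces as separate strings and joins them. Two things here are assumptions and not part of the answer above: the ^ and $ anchors, and the case-insensitive i flag. The name emailPattern is made up, and the separator is # as elsewhere in this thread.

```javascript
// The five lines from the answer above, one string per line.
// Backslashes are doubled because these are string literals.
const parts = [
  "[a-z0-9!#$%&'*+/=?^_`{|}~-]+",
  "(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*",
  "#",
  "(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+",
  "(?:[A-Z]{2}|com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum)\\b",
];

// Assumptions: anchors and the i flag are added here; the answer shows neither.
const emailPattern = new RegExp("^" + parts.join("") + "$", "i");

console.log(emailPattern.test("dama#nodomain.se"));
console.log(emailPattern.test("dama#nodomain"));
```

Keeping the TLD list as its own string makes it the only line that needs touching when more top-level domains have to be added.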
A single dot in a regular expression means "match any character". And that's exactly what it does when the top-level domain is missing (and also when it's present, of course).
Thus you should change your code like this:
if (!preg_match('/^[-a-zA-Z0-9_.]+#[-a-zA-Z0-9]+\.[a-zA-Z]{2,4}$/', $string)) {
    return false;
}
And by the way: a lot more characters are allowed in the local part than what your regular expression currently allows for.