Regex email - how to allow plus symbols in email? - php

I always find regular expressions a headache, and googling didn't really help. I'm currently using the following expression (preg_match): /^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/
However, if I want to allow emails with plus symbols, this obviously won't work, e.g. foo+bar@domain.com
How would I need to change my expression to allow it? Thanks in advance for all the help!

You should just use PHP's built-in regex for email validation, because it covers all the cases:
filter_var($email, FILTER_VALIDATE_EMAIL)
See filter_var and FILTER_VALIDATE_EMAIL (or https://github.com/php/php-src/blob/master/ext/filter/logical_filters.c#L499 for the actual beast).
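As a minimal sketch of that suggestion: filter_var returns the filtered string on success and false on failure, so a small wrapper can turn the result into a boolean. The helper name is_valid_email is only illustrative and not part of PHP.

```php
<?php
// Hypothetical helper name, chosen for this example only.
function is_valid_email(string $email): bool
{
    // filter_var() returns the address on success and false on failure.
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

var_dump(is_valid_email('foo+bar@domain.com')); // plus sign in the local part
var_dump(is_valid_email('foo+bar.domain.com')); // no @ at all
```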

Your wrong regex can be changed to another wrong regex:
/^[\w+-]+(\.[\w+-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/
which allows for the + character where you want it. But it's wrong anyway.
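A quick way to see both halves of that statement, as a sketch: the pattern below assumes + is allowed in every dot-separated segment of the local part, and the helper name matches_email_pattern is made up for this example. The second test address shows one reason the pattern is still wrong: it rejects top-level domains longer than three letters.

```php
<?php
// Assumption: "+" is permitted in every segment of the local part.
$pattern = '/^[\w+-]+(\.[\w+-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/';

// Hypothetical helper: true when the whole string matches the pattern.
function matches_email_pattern(string $pattern, string $email): bool
{
    return preg_match($pattern, $email) === 1;
}

var_dump(matches_email_pattern($pattern, 'foo+bar@domain.com'));    // plus alias
var_dump(matches_email_pattern($pattern, 'foo+bar@domain.museum')); // 6-letter TLD
```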

Try adding + to the character class [] (escaping it as \+ is optional there):
/^[_a-z0-9+-]+(\.[_a-z0-9+-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/

Related

preg_match for the email validation I want but somehow I don't know where I messed up

Yes, many commenters pointed out that there are plenty of email validators I could use, but for this one I have to do it as described below.
I need to validate emails like this:
alphanumeric characters followed by @ followed by alphanumeric characters followed by . followed by 2 – 4 more alphanumeric characters
This is what I have done. I know I messed up the last part after the ., but I can't find where:
preg_match("/^([0-9]|[a-z])([0-9]|[a-z]|[_-])*@([0-9]|[a-z])*\.([0-9][a-z]){2,4}$/i","")
At the start I used ([0-9]|[a-z])([0-9]|[a-z]|[_-]) because I didn't want people to be able to start with _ or -, so the first character is forced to be a letter or digit.
There must be a million different people that wrote a new regex for email validation. If you are interested in the email format you can just use
$email = filter_var($email, FILTER_VALIDATE_EMAIL);
and if the final value is false (and therefore empty), the initial one wasn't in a valid email address format.
(as an extra step you could try to validate the domain by using this function http://php.net/manual/en/function.checkdnsrr.php)
Have a try with this:
^[0-9a-z_\-]+@[0-9a-z_\-]+\.[0-9a-z]{2,4}$
But as said: there are ready-to-use regexes, which is much better than reinventing the wheel. Also, this approach does not match all valid addresses and accepts some addresses that are illegal.
What is the reason for the email validation? It is very upsetting when you try to enter your email and can't because of a stupid validation. I think it is enough to check for the presence of the '@' and '.' signs, in case the user unintentionally left them out.
$res = preg_match("/@[^@\.]*\./", $str);

conditional regex

I'm parsing an XML file that sometimes has the value <avg_cpc>some number</avg_cpc> and sometimes doesn't.
My regex looks like this:
<is_adult>(.*?)</is_adult>.*?<trademark_probability>(.*?)</trademark_probability>.*?<total_extensions_used>(.*?)</total_extensions_used> **here comes the <avg_cpc>some number</avg_cpc>** .*?</appraisal>
How can I make this regex match items that don't have a cpc value?
I've tried (<avg_cpc>.*?</avg_cpc>)? without luck.
Thanks !
Please use a real XML parser for PHP, instead of regular expressions. This will make everything much easier, not to mention less error-prone.
I would guess it's because you're not escaping your slashes, try this:
<is_adult>(.*?)<\/is_adult>.*?<trademark_probability>(.*?)<\/trademark_probability>.*?<total_extensions_used>(.*?)<\/total_extensions_used>(<avg_cpc>.*?<\/avg_cpc>)?.*?<\/appraisal>
I would also use [^<]+ instead of .*? if possible.
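For comparison, here is a rough sketch of the parser-based approach using SimpleXML. The element names come from the question; the sample values, the <appraisals> wrapper element and the function name parse_appraisals are assumptions made for illustration. The optional <avg_cpc> is handled with a plain isset check instead of an optional regex group.

```php
<?php
// Assumed sample data: element names are from the question, everything else is invented.
$xml = <<<XML
<appraisals>
  <appraisal>
    <is_adult>0</is_adult>
    <trademark_probability>0.1</trademark_probability>
    <total_extensions_used>3</total_extensions_used>
    <avg_cpc>1.25</avg_cpc>
  </appraisal>
  <appraisal>
    <is_adult>1</is_adult>
    <trademark_probability>0.7</trademark_probability>
    <total_extensions_used>5</total_extensions_used>
  </appraisal>
</appraisals>
XML;

// Hypothetical helper: one array per <appraisal>, avg_cpc is null when absent.
function parse_appraisals(string $xml): array
{
    $doc = simplexml_load_string($xml);
    if ($doc === false) {
        return []; // not well-formed XML
    }
    $rows = [];
    foreach ($doc->appraisal as $item) {
        $rows[] = [
            'is_adult' => (string) $item->is_adult,
            'trademark_probability' => (string) $item->trademark_probability,
            'total_extensions_used' => (string) $item->total_extensions_used,
            // The element is optional, so check for it before reading it.
            'avg_cpc' => isset($item->avg_cpc) ? (string) $item->avg_cpc : null,
        ];
    }
    return $rows;
}

print_r(parse_appraisals($xml));
```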

PHP custom regex

I've written this regex to check for valid emails: /^[-a-z0-9._]+@[-a-z0-9._]+\.+[a-z]{2,6}$/i
I want it to work for emails like name1+name2@domaine.com
How can I fix this regex?
I have a simpler solution.
if(filter_var($email,FILTER_VALIDATE_EMAIL))
{
//true
}
This is sufficient in most cases. It runs a regular expression check in C, which should be faster, but if you wish to have control over the regex in your application, the regex below is what's used for this check:
/^((\\\"[^\\\"\\f\\n\\r\\t\\b]+\\\")|([\\w\\!\\#\\$\\%\\&\\'\\*\\+\\-\\~\\/\\^\\`\\|\\{\\}\\=\\?]+(\\.[\\w\\!\\#\\$\\%\\&\\'\\*\\+\\-\\~\\/\\^\\`\\|\\{\\}\\=\\?]+)*))@((\\[(((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9])))\\])|(((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9])))|((([A-Za-z0-9\\-])+\\.)+[A-Za-z\\-]+))$/D
Another tip: a user may enter an email address such as invalid@dontexists.com, which would pass your checks for a valid email. If you want to make sure that dontexists.com is running an email server, do:
$has_mx_server = (bool)checkdnsrr($domain,"MX");
If the domain has a registered MX record, the chances of the email being fake are reduced by a good chunk.
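The snippet above leaves open where $domain comes from. One possible way to fill that in, as a sketch: both helper names below (email_domain and domain_has_mx) are hypothetical, and the MX lookup needs network access, so it is only defined here and not called.

```php
<?php
// Hypothetical helper: returns the part after the last "@", or null if there is none.
function email_domain(string $email): ?string
{
    $pos = strrpos($email, '@');
    if ($pos === false || $pos === strlen($email) - 1) {
        return null;
    }
    return substr($email, $pos + 1);
}

// Hypothetical helper: true when DNS lists at least one MX record for the domain.
// This needs network access and says nothing about whether the mailbox exists.
function domain_has_mx(string $email): bool
{
    $domain = email_domain($email);
    return $domain !== null && checkdnsrr($domain, 'MX');
}

var_dump(email_domain('invalid@dontexists.com'));
```

Run filter_var first and the MX lookup second, so DNS is only queried for addresses that already look well-formed.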
First part
[-a-z0-9._]+
does not accept right now plus sign. Expand it:
[-+a-z0-9._]+
Try
/^[-a-z0-9._+]+@[-a-z0-9._]+\.+[a-z]{2,6}$/i
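A small check of that change, as a sketch: the variable names are invented for the example, and the test address is the one from the question written with a real @. It also shows that writing \+ inside the character class behaves the same as a bare +.

```php
<?php
$original  = '/^[-a-z0-9._]+@[-a-z0-9._]+\.+[a-z]{2,6}$/i';   // no plus allowed
$withPlus  = '/^[-a-z0-9._+]+@[-a-z0-9._]+\.+[a-z]{2,6}$/i';  // bare + in the class
$escaped   = '/^[-a-z0-9._\+]+@[-a-z0-9._]+\.+[a-z]{2,6}$/i'; // escaped + in the class

$email = 'name1+name2@domaine.com';

var_dump(preg_match($original, $email)); // rejected: + is not in the class
var_dump(preg_match($withPlus, $email)); // accepted
var_dump(preg_match($escaped, $email));  // accepted, same as the bare +
```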
Place the + inside the square brackets; you can escape it with a backslash, although that is optional there:
/^[-a-z0-9._\+]+@[-a-z0-9._]+\.+[a-z]{2,6}$/i
Outside a character class, "+" is a metacharacter meaning one or more occurrences and must be escaped to match the literal character; inside [] it is already literal.

URL Validation?

Does anyone know an up-to-date regular expression for validating URLs? I found a few on Google, but they all allowed junk URLs (e.g. www.google_com) when testing.
My regular expression knowledge is not so vast, so I would hate to put something together that would fail under pressure.
Thanks.
You can use the filter functions in PHP
$filtered = filter_var($url, FILTER_VALIDATE_URL);
http://uk3.php.net/manual/en/function.filter-var.php
Not every problem should be answered with a regex.
http://php.net/manual/en/function.parse-url.php
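A short sketch of how the two functions mentioned above might be combined. The helper name is_valid_url is hypothetical. Note that FILTER_VALIDATE_URL requires a scheme, so the junk example from the question fails for that reason alone.

```php
<?php
// Hypothetical helper: true when filter_var() accepts the string as a URL.
function is_valid_url(string $url): bool
{
    return filter_var($url, FILTER_VALIDATE_URL) !== false;
}

var_dump(is_valid_url('http://www.google.com/')); // scheme and host present
var_dump(is_valid_url('www.google_com'));         // no scheme, rejected

// parse_url() does not validate; it only splits a URL into its components.
var_dump(parse_url('http://www.google.com/search?q=php', PHP_URL_HOST));
```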

PHP server-side validation regular expression match

I have the following part of a validation script:
$invalidEmailError .= "<br/>» You did not enter a valid E-mail address";
$match = "/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/";
That's the expression, here is the validation:
if ( !(preg_match($match,$email)) ) {
$errors .= $invalidEmailError; // checks validity of email
}
I think that's enough info, let me know if more is needed.
Basically, what happens is the message "You did not enter a valid E-mail address" gets echoed no matter what. Whether a correct email address or an incorrect email address is entered.
Does anyone have any idea or a clue as to why?
EDIT: I'm running this on localhost (using Apache), could that be the reason as to why the preg_match ain't working?
Thanks!
Amit
Your regex only includes [A-Z], not [a-z]. Try
$match = "/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i";
to make the regex case-insensitive.
You can test this live on http://regexpal.com.
However, I'd advise you to try one of the expressions on the page mentioned by strager: http://fightingforalostcause.net/misc/2006/compare-email-regex.php. They have been perfected over time and will probably behave better. But Gmail users will be satisfied with yours, since they'll be able to use plus aliases which are rejected incorrectly by many validators.
You likely got the regular expression you're using from regular-expressions.info. On that page, the author states (emphasis added):
If you want to use the regular expression above, there's two things you need to understand. First, long regexes make it difficult to nicely format paragraphs. So I didn't include a-z in any of the three character classes. This regex is intended to be used with your regex engine's "case insensitive" option turned on. (You'd be surprised how many "bug" reports I get about that.) Second, the above regex is delimited with word boundaries, which makes it suitable for extracting email addresses from files or larger blocks of text. If you want to check whether the user typed in a valid email address, replace the word boundaries with start-of-string and end-of-string anchors, like this: ^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$.
To solve this problem, add the i PCRE flag after your regular expression.
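Both points from the quote can be checked with a few lines, as a sketch. The variable names and the sample address amit@example.com are invented for the example.

```php
<?php
$sensitive   = '/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/';  // original, no flag
$insensitive = '/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i'; // with the i flag
$anchored    = '/^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/i';   // anchors for validation

// A lowercase address only matches once the i flag is set.
var_dump(preg_match($sensitive, 'amit@example.com'));
var_dump(preg_match($insensitive, 'amit@example.com'));

// Word boundaries find an address inside longer text; anchors require an exact match.
var_dump(preg_match($insensitive, 'contact amit@example.com today'));
var_dump(preg_match($anchored, 'contact amit@example.com today'));
var_dump(preg_match($anchored, 'amit@example.com'));
```

For form validation the anchored variant is the safer choice, because the word-boundary variant accepts any input that merely contains an address.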
You can always try debugging your regex using a simpler tool (I'm quite fond of using Notepad++ for this purpose) and performing iterative tests, i.e. making the expression more or less complicated and seeing if that fixes or breaks things.
