Create a regular expression with preg_match - php

I want to create a regular expression for the string below.
The dynamic portions (i.e. the parts the expression needs to match) are in bold text.
The mail system
**email address**: host mx2.hotmail.com[65.55.92.152] said:
550 Requested action not taken:
**mailbox unavailable** (in reply to RCPTTO command)
Basically, I want my regexp to search for the email address and the string "mailbox unavailable".
It should find the email address first and then the string "mailbox unavailable".
How can I do this? I need to use PHP's preg_match function for this.
Edit:
Actually, I am writing code to detect bounced mail. I match against the full source of each email, and one of my emails produces the error above. I need to check whether that error appears in the email source and, if it does, return an error accordingly. The dynamic parts of the error message are the email address, the IP in square brackets, and the string "mailbox unavailable".

The regex for e-mail address alone should give you an idea of the complexity of what you're trying to achieve:
http://fightingforalostcause.net/misc/2006/compare-email-regex.php

You can use these expressions:
preg_match("/[<]\S+@\S+[>]/i", $your_string); // this matches: <info@example.com>
preg_match("/mailbox unavailable/i", $your_string); // this matches: mailbox unavailable
The i after the closing slash makes the match case-insensitive.
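The two checks can also be combined into one pattern that finds the address first and the error text after it. The following is only a sketch: the function name findUnavailableMailbox is hypothetical, and it assumes the bounce text writes the address in angle brackets followed by a colon, as the first expression above implies.

```php
<?php
// Hypothetical helper: returns the bounced address, or null when the text
// does not contain "<address>: ... mailbox unavailable".
function findUnavailableMailbox(string $source): ?string
{
    // The s flag lets .*? cross line breaks; the i flag ignores case.
    $pattern = '/<([^<>@\s]+@[^<>@\s]+)>:.*?mailbox unavailable/is';
    if (preg_match($pattern, $source, $m) === 1) {
        return $m[1]; // the captured address without the brackets
    }
    return null;
}

// Sample text modelled on the message in the question (address is made up).
$bounce = "The mail system\n"
    . "<info@example.com>: host mx2.hotmail.com[65.55.92.152] said:\n"
    . "550 Requested action not taken:\n"
    . "mailbox unavailable (in reply to RCPTTO command)\n";

var_dump(findUnavailableMailbox($bounce));
```

Capturing the address in a group means the same call both detects the bounce and tells you which recipient failed.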

How to validate email address and website in comment

I want to validate email addresses and websites in a comment box. When someone submits a comment, I want to check whether it contains an email address or website and, if so, remove it.
I have used the regular expression below for email.
"/(?:[a-z0-9!#$%&'*+=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+=?^_`{|}~-]+)*|\"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*\")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])/"
The expression above matches email addresses, but I also want to match forms like email[at]email[dot]com, email{at}email{dot}com, and email(at)email(dot)com.
For website validation I used the expression below.
"/((((http|https|ftp|ftps)\:\/\/)|www\.)[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,4}(\/\S*)?)/"
But I want to validate website like website[dot]com, www[dot]website[dot]com
Basically, wherever your regex matches the @ character in an email, or the . character in an email or web URL, you need to replace it with an alternation that also accepts the obfuscated forms you expect. So,
@ should be written as (?:@|[[({]at[\]})])
And,
\. should be written as (?:\.|[[{(]dot[\]})])
wherever they appear in your regex, and then it will match those strings as well.
Here is a modified regex for email.
(?:[a-z0-9!#$%&'*+=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+=?^_`{|}~-]+)*|\"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*\")(?:@|[[({]at[\]})])(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?(?:\.|[[{(]dot[\]})]))+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
In the same way, you can replace the . in your website regex, and the modified regex becomes:
(?:(?:(?:http|https|ftp|ftps)\:\/\/)|www(?:\.|[[{(]dot[\]})]))(?:[a-zA-Z0-9.-]|[[{(]dot[\]})])+(?:\.|[[{(]dot[\]})])[a-zA-Z]{2,4}(\/\S*)?
Besides [dot], {dot} and (dot), the regex will also match mismatched forms such as [dot}. Since you are trying to detect exactly this kind of obfuscation, matching them is an advantage rather than a problem, unless your context requires otherwise.
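To show how the alternation idea is used for removal rather than just detection, here is a minimal sketch with preg_replace. It deliberately uses a much simpler local-part and domain pattern than the full expressions above; the names $at, $dot, $emailPattern and stripEmails are illustrative assumptions.

```php
<?php
// Building blocks: a literal @ or [at] / {at} / (at), and likewise for the dot.
$at  = '(?:@|[\[({]at[\])}])';
$dot = '(?:\.|[\[({]dot[\])}])';

// Simplified address pattern (an assumption, not the full expression above):
// local part, "at", then two or more domain labels joined by a "dot".
$emailPattern = '/[a-z0-9._%+-]+' . $at . '[a-z0-9-]+(?:' . $dot . '[a-z0-9-]+)+/i';

// Hypothetical helper: removes every match of $pattern from the comment.
function stripEmails(string $comment, string $pattern): string
{
    // preg_replace returns null on error; fall back to the original text.
    return preg_replace($pattern, '', $comment) ?? $comment;
}

$comment = 'Write to me[at]example[dot]com or me@example.com today';
echo stripEmails($comment, $emailPattern), "\n";
```

Keeping the "at" and "dot" fragments in variables makes it easy to reuse them in the website pattern as well.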

imap headers returned formatted string broken down into values

I have a string that is returned via imap_headers() which has a format like this one:
Flags Number) Date Sender Subject (Size)
and sample returned string from an array is:
A 454)22-Jun-2004 cust.serv@amazon.co.uk Your Amazon.co.uk Order (5284 chars)
1)10-Oct-2010 Gmail Team Get Gmail on your mobile (2457 chars)
5)12-Nov-2010 =?ISO-8859-1?Q?Eric_ For Website (671669 chars)
U 6)17-Nov-2010 no-reply@hostmessage host.net account : boom (2082 chars)
I want to parse this string and create an object from it,
e.g. with the fields:
Flags
Date
Sender
Subject
I find it hard to parse this with PHP string functions because the fields are separated only by spaces. Can anybody help me with this? The sender part could also be a name, so we can just treat it as an email address or a website URL.
As I mentioned above in the comments, it's pretty much impossible to create a regular expression that will work for you, because you don't provide enough information. However, I will give you an example that will match your line and maybe you can go on from there:
preg_match('_([A-Z]+) (\d+)\)(\d+-[A-Za-z]+-\d{4}) (\S+@\S+\.\S+) (.+) \((\d+) chars\)_', $header, $matches);
var_dump($matches);
Now it depends on your application if you need to extend this expression to account for other variants of the possible output. You can read up on all the details of what I did there in the official documentation.
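One possible extension, as a sketch only: make the flags optional and use named groups so the result reads like the object the question asks for. The function name parseHeaderLine is hypothetical, and the pattern assumes the sender is a single whitespace-free token, so a display name such as "Gmail Team" is split between sender and subject.

```php
<?php
// Hypothetical helper: parses one header line into an associative array,
// or returns null when the line does not fit the assumed layout.
function parseHeaderLine(string $line): ?array
{
    $pattern = '/^(?:(?<flags>[A-Z]+)\s+)?'      // optional flags, e.g. "A" or "U"
        . '(?<number>\d+)\)'                     // message number before ")"
        . '(?<date>\d{1,2}-[A-Za-z]{3}-\d{4})\s+' // date such as 22-Jun-2004
        . '(?<sender>\S+)\s+'                    // ASSUMPTION: sender has no spaces
        . '(?<subject>.*?)\s+'                   // everything up to the size
        . '\((?<size>\d+) chars\)$/';            // trailing "(N chars)"

    if (preg_match($pattern, trim($line), $m) !== 1) {
        return null;
    }
    return [
        'flags'   => $m['flags'],   // empty string when no flags are present
        'number'  => (int) $m['number'],
        'date'    => $m['date'],
        'sender'  => $m['sender'],
        'subject' => $m['subject'],
        'size'    => (int) $m['size'],
    ];
}

$parsed = parseHeaderLine('A 454)22-Jun-2004 cust.serv@amazon.co.uk Your Amazon.co.uk Order (5284 chars)');
print_r($parsed);
```

If the real output uses fixed-width columns, splitting by column position with substr() may be more reliable than a regex for the sender and subject fields.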

PHP: Strip email address only from possible <>'s

I need a little assistance getting email addresses only from within their POSSIBLY INCLUDED <> brackets.
For example I have the following 3 strings and I need each one to return only the email address:
darth@vader.com
"Darth Vader" <darth@vader.com>
"Darth Vader" <darth@vader.com> "Possible additional text" (Shouldn't be here but I need to make sure the regex gets rid of it anyway just in case.)
On every single one of those I would want $email to equal darth@vader.com
How about just matching for valid e-mail addresses? The regex we use to check validity is:
/(([a-z0-9!#$%&*+-=?^_`{|}~][a-z0-9!#$%&*+-=?^_`{|}~.]*[a-z0-9!#$%&*+-=?^_`{|}~])|[a-z0-9!#$%&*+-?^_`{|}~]|("[^"]+"))\@([-a-z0-9]+\.)+(com|net|edu|org|gov|mil|int|biz|pro|info|arpa|aero|coop|name|museum|co|co\.uk)/im
Or here's one that's completely TLD-agnostic:
/(([a-z0-9&*\+\-\=\?^_`{|\}~][a-z0-9!#$%&*+-=?^_`{|}~.]*[a-z0-9!#$%&*+-=?^_`{|}~])|[a-z0-9!#$%&*+-?^_`{|}~]|("[^"]+"))\@([-a-z0-9]+\.)+([a-z]{2,})/im
One of those should work for what you're looking for and should cover most cases. In PHP, use preg_match_all() if you need every match rather than only the first.
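As a lighter-weight alternative to the long expressions, the following sketch pulls out the first token that contains an @ and is free of spaces, quotes and angle brackets, then lets filter_var confirm it. The function name extractEmail is an assumption made for illustration.

```php
<?php
// Hypothetical helper: returns the first address-like token, or null.
function extractEmail(string $input): ?string
{
    // Characters on both sides of a single @, excluding whitespace, <, > and ".
    if (preg_match('/[^\s<>"@]+@[^\s<>"@]+/', $input, $m) !== 1) {
        return null;
    }
    // filter_var returns the address on success and false otherwise.
    $valid = filter_var($m[0], FILTER_VALIDATE_EMAIL);
    return $valid === false ? null : $valid;
}

$inputs = [
    'darth@vader.com',
    '"Darth Vader" <darth@vader.com>',
    '"Darth Vader" <darth@vader.com> "Possible additional text"',
];
foreach ($inputs as $input) {
    var_dump(extractEmail($input));
}
```

Because the brackets are excluded from the match itself, the same call works whether or not the address is wrapped in <>.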

PHP custom regex

I've written this regex to check for valid emails: /^[-a-z0-9._]+@[-a-z0-9._]+\.+[a-z]{2,6}$/i
I want it to work for emails like name1+name2@domaine.com
How can I fix this regex?
I have a simpler solution.
if (filter_var($email, FILTER_VALIDATE_EMAIL))
{
// true
}
This is sufficient in most cases. It runs a regular-expression check implemented in C, which should also be faster. But if you want control over the regex in your application, the regex below is what's used for this check:
/^((\\\"[^\\\"\\f\\n\\r\\t\\b]+\\\")|([\\w\\!\\#\\$\\%\\&\\'\\*\\+\\-\\~\\/\\^\\`\\|\\{\\}\\=\\?]+(\\.[\\w\\!\\#\\$\\%\\&\\'\\*\\+\\-\\~\\/\\^\\`\\|\\{\\}\\=\\?]+)*))@((\\[(((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9])))\\])|(((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9])))|((([A-Za-z0-9\\-])+\\.)+[A-Za-z\\-]+))$/D
Another tip: a user may enter an address such as invalid@dontexists.com, which would pass your syntax check even though the domain does not exist. If you want to make sure that dontexists.com is running a mail server, check for an MX record:
$has_mx_server = (bool)checkdnsrr($domain, "MX");
If the domain has a registered MX record, the chances of the email being fake are reduced by a good chunk.
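Putting the two checks together might look like the sketch below. The function name isPlausibleEmail and its $checkDns switch are assumptions for illustration; the DNS lookup is optional because checkdnsrr needs network access and can be slow.

```php
<?php
// Hypothetical helper: syntax check first, then an optional MX lookup.
function isPlausibleEmail(string $email, bool $checkDns = false): bool
{
    // filter_var returns false when the address is syntactically invalid.
    if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
        return false;
    }
    if (!$checkDns) {
        return true;
    }
    // Everything after the last @ is the domain part.
    $domain = substr(strrchr($email, '@'), 1);
    // True when the domain publishes at least one MX record.
    return checkdnsrr($domain, 'MX');
}

var_dump(isPlausibleEmail('name1+name2@domaine.com')); // syntax check only
var_dump(isPlausibleEmail('not an email'));
```

Note that a missing MX record does not prove the address is undeliverable, and an existing one does not prove the mailbox exists; the lookup only reduces the odds of a made-up domain.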
The first part
[-a-z0-9._]+
does not currently accept a plus sign. Expand it:
[-+a-z0-9._]+
Try
/^[-a-z0-9._+]+@[-a-z0-9._]+\.+[a-z]{2,6}$/i
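A quick way to check the suggestion, as a sketch: run the original and the extended pattern against the address from the question. The variable names are illustrative. The third pattern escapes the plus sign inside the character class, which PCRE accepts but does not require.

```php
<?php
$address = 'name1+name2@domaine.com';

// Original pattern from the question: no plus sign in the local part.
$original  = '/^[-a-z0-9._]+@[-a-z0-9._]+\.+[a-z]{2,6}$/i';
// Plus sign added to the character class, unescaped.
$unescaped = '/^[-a-z0-9._+]+@[-a-z0-9._]+\.+[a-z]{2,6}$/i';
// Same class with the plus sign escaped; behaves identically.
$escaped   = '/^[-a-z0-9._\+]+@[-a-z0-9._]+\.+[a-z]{2,6}$/i';

var_dump(preg_match($original, $address));
var_dump(preg_match($unescaped, $address));
var_dump(preg_match($escaped, $address));
```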
Place the + inside the square brackets; you can also escape it with a backslash
/^[-a-z0-9._\+]+@[-a-z0-9._]+\.+[a-z]{2,6}$/i
Outside a character class, "+" is a metacharacter meaning one or more occurrences, so there it must be escaped to match the literal character. Inside a character class it is already literal, so the backslash is optional.

PHP server-side validation regular expression match

I have the following part of a validation script:
$invalidEmailError .= "<br/>» You did not enter a valid E-mail address";
$match = "/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/";
That's the expression, here is the validation:
if ( !(preg_match($match,$email)) ) {
$errors .= $invalidEmailError; // checks validity of email
}
I think that's enough info, let me know if more is needed.
Basically, what happens is the message "You did not enter a valid E-mail address" gets echoed no matter what. Whether a correct email address or an incorrect email address is entered.
Does anyone have any idea or a clue as to why?
EDIT: I'm running this on localhost (using Apache), could that be the reason as to why the preg_match ain't working?
Thanks!
Amit
Your regex only includes [A-Z], not [a-z]. Try
$match = "/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i";
to make the regex case-insensitive.
You can test this live on http://regexpal.com.
However, I'd advise you to try one of the expressions on the page mentioned by strager: http://fightingforalostcause.net/misc/2006/compare-email-regex.php. They have been perfected over time and will probably behave better. But Gmail users will be satisfied with yours, since they'll be able to use plus aliases which are rejected incorrectly by many validators.
You likely got the regular expression you're using from regular-expressions.info. On that page, the author states (emphasis added):
If you want to use the regular expression above, there's two things you need to understand. First, long regexes make it difficult to nicely format paragraphs. So I didn't include a-z in any of the three character classes. This regex is intended to be used with your regex engine's "case insensitive" option turned on. (You'd be surprised how many "bug" reports I get about that.) Second, the above regex is delimited with word boundaries, which makes it suitable for extracting email addresses from files or larger blocks of text. If you want to check whether the user typed in a valid email address, replace the word boundaries with start-of-string and end-of-string anchors, like this: ^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$.
To solve this problem, add the i PCRE flag after your regular expression.
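The effect of the flag is easy to see in a small sketch. It uses the anchored form of the pattern, since the goal here is to validate a whole input rather than extract addresses from text; the variable names and the sample address are illustrative.

```php
<?php
// Anchored pattern without the i flag: only upper-case input can match.
$caseSensitive = '/^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/';
// Same pattern with the i flag appended.
$caseInsensitive = $caseSensitive . 'i';

$email = 'amit@example.com';

var_dump(preg_match($caseSensitive, $email));   // lower-case input, no i flag
var_dump(preg_match($caseInsensitive, $email)); // lower-case input, i flag
```

This also explains why the error appeared for every address: typical lower-case input never matches a pattern whose character classes list only A-Z.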
You can always try debugging your regex using a simpler tool (I'm quite fond of using Notepad++ for this purpose) and performing iterative tests - ie. making the expression more/less complicated and seeing if that fixes/breaks things.
