Regular expression and newline - php

I have such text:
<Neednt#email.com> If you do so, please include this problem report.
<Anotherneednt#email.com> You can delete your
own
text from the attached returned message.
The mail system
<Some#Mail.net>: connect to *.net[82.*.86.*]: Connection timed
out
I have to extract the email address from it. Could you help me with this?
Update:
There could be other email addresses in angle brackets (<%here%>). The match has to be tied to the 'The mail system' text: I need the email address that comes right after that text.

Considering this text is stored in $text, what about this :
$matches = array();
if (preg_match('/<([^>]+)>/', $text, $matches)) {
var_dump($matches[1]);
}
Which gives me (when the address after The mail system is the first one in angle brackets) :
string 'Some#Mail.net' (length=13)
Basically, I used a pretty simple regex, that matches :
a < character
anything that's not a > character : [^>]
at least one time : [^>]+
capturing it : ([^>]+)
a > character
So, it captures anything that's between < and >.
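To see what this pattern picks up across the whole sample, here is a minimal sketch that swaps preg_match() for preg_match_all(). The $text value is retyped from the question (with '#' kept as the separator, as in the post), so treat it as an assumption about the real input.

```php
<?php
// Sample text retyped from the question.
$text = "<Neednt#email.com> If you do so, please include this problem report.\n"
      . "<Anotherneednt#email.com> You can delete your\nown\ntext from the attached returned message.\n"
      . "The mail system\n"
      . "<Some#Mail.net>: connect to *.net[82.*.86.*]: Connection timed\nout";

// preg_match_all() collects every match instead of stopping at the first one.
preg_match_all('/<([^>]+)>/', $text, $matches);

// $matches[1] holds the contents of the capture group for each match.
print_r($matches[1]);
```

All three bracketed addresses come back, which is why the pattern needs an anchor when only one of them is wanted.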
Edit after comments+edit of the OP :
If you only want the e-mail address that's after The mail system, you could use this :
$matches = array();
if (preg_match('/The mail system\s*<([^>]+)>/', $text, $matches)) {
var_dump($matches[1]);
}
In addition to what I posted before, this expects :
The string The mail system
Any number of whitespace characters, newlines included : \s*
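A small sketch of why \s* matters here: it lets the pattern cross the line break between The mail system and the address, whichever line-ending style the mail uses. The helper name addressAfterMailSystem and the shortened sample strings are made up for illustration.

```php
<?php
// Shortened sample; the second variant uses Windows line endings (\r\n).
$unix    = "The mail system\n<Some#Mail.net>: connect to *.net[82.*.86.*]: Connection timed\nout";
$windows = "The mail system\r\n<Some#Mail.net>: connect to *.net[82.*.86.*]: Connection timed\r\nout";

// Hypothetical helper wrapping the pattern from the answer.
function addressAfterMailSystem(string $text): ?string
{
    // \s* matches spaces, tabs, \r and \n, so the line break is skipped.
    if (preg_match('/The mail system\s*<([^>]+)>/', $text, $matches)) {
        return $matches[1];
    }
    return null; // no address found after the marker text
}

var_dump(addressAfterMailSystem($unix));    // string(13) "Some#Mail.net"
var_dump(addressAfterMailSystem($windows)); // string(13) "Some#Mail.net"
```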

You want to use preg_match(), and looking at this input it should be simple:
<?php
if (preg_match('/<([^>]*?#[^>]*)>/', $data, $matches)) {
    var_dump($matches); // specifically look at $matches[1]
}
Other patterns would match it too; you don't have to stick to this one. The '<' and '>' in your input are helpful here.

Related

Extract e-mail from long text(PHP)

I have to find a way to extract an e-mail address from webpage source code.
$str = "<a h=ref=3D.mailto:rys#adres.pl.><img src=3D.http://www.lowiecki.pl/img/list.gif=
. border=3D.0.></a></td><td class=3D.bb.>";
$a = preg_split("/ [:] /", $str);
for ($i = 0; $i < count($a); $i++)
    echo $a[$i];
I tried that, but I don't know how to limit the substring at "pl".
E-mail addresses can be far more complex than the forms we are used to, see examples of uncommon valid addresses.
An almost perfect, but very complex, regular expression for matching most e-mail address forms is proposed at https://emailregex.com/.
You could use this shorter, but more restrictive, expression derived from one proposed by Jan Goyvaerts at https://www.regular-expressions.info/email.html: /\b[A-Z0-9][A-Z0-9._%+-]{0,63}#(?:[A-Z0-9-]{1,63}\.){1,125}[A-Z]{2,63}\b/i
In a PHP script, it could be implemented this way:
<?php
$str = "<a h=ref=3D.mailto:rys#adres.pl.><img src=3D.http://www.lowiecki.pl/img/list.gif=
. border=3D.0.></a></td><td class=3D.bb.><a h=ref=3D.mailto:second-address#example.com.>foo</a>";
preg_match_all(
'/\b[A-Z0-9][A-Z0-9._%+-]{0,63}#(?:[A-Z0-9-]{1,63}\.){1,125}[A-Z]{2,63}\b/i', # After https://www.regular-expressions.info/email.html
quoted_printable_decode($str), # An e-mail address may be corrupted by the quoted-printable encoding.
$matches
);
echo !empty($matches[0]) ? '<pre>'.print_r($matches[0], true).'</pre>' : 'No address found.';
?>
This script outputs:
Array
(
[0] => rys#adres.pl
[1] => second-address#example.com
)
Make sure to read the found addresses from $matches[0].
The following code searches for e-mail addresses and stores them in $listofemails; $email receives the number of matches. After that you can use the result as you wish.
$email = preg_match_all(
"/[a-z0-9]+([_\\.-][a-z0-9]+)*#([a-z0-9]+([\.-][a-z0-9]+)*)+\\.[a-z]{2,}/i",
$str,
$listofemails
);
if($email) {
echo "you got a match";
}

Validate url with query string containing email address using PHP

Hi, I have a problem validating URLs whose query string contains an email address, like:
https://example.com/?email=john+test1#example.com
This email is of course a correct one: john+test1#example.com is an alias of john#example.com.
I have a regex like this:
$page = trim(preg_replace('/[\\0\\s+]/', '', $page));
but it doesn't work as I expected, because it replaces + with an empty string, which is wrong. It should keep the + that belongs to the email alias and strip special characters elsewhere, so that the address stays correct.
Example of a wrong URL with +:
https://examp+le.com/?email=example#exam+ple.com
URLs without an email in the query string should still validate correctly with this regex.
Any idea how to solve it?
I think this is what you are looking for:
<?php
function replace_plus_sign($string){
    return
        preg_replace(
            '/~/',
            '+',
            preg_replace(
                '/\++/',
                '',
                preg_replace_callback(
                    '/(email([\d]+)?=)([^#]+)/i',
                    function($matches){
                        return $matches[1] . preg_replace('/\+(?!$)/', '~', $matches[3]);
                    },
                    $string
                )
            )
        );
}
$page = 'https://exam+ple.com/email=john+test1+#example.com&email2=john+test2#exam+ple.com';
echo replace_plus_sign($page);
Gives the following output:
https://example.com/email=john+test1#example.com&email2=john+test2#example.com
First, I replace the valid + signs in the local part of each email address with a temporary ~, then I remove all the remaining + signs, and finally I turn the ~ back into +.
This solution won't work if the URL already contains a ~; in that case you will need to use another character for the temporary replacement.

Retrieve full email address from string

I'm currently building a Slack bot using Laravel, and one of the features is that it can receive an email address and send a message to it.
The issue is that email addresses (e.g. bob#example.com) come through as <mailto:bob#example.com|bob#example.com> from Slack.
I currently have a function that retrieves the email from this:
public function getEmail($string)
{
$pattern = '/[a-z0-9_\-\+]+#[a-z0-9\-]+\.([a-z]{2,3})(?:\.[a-z]{2})?/i';
preg_match_all($pattern, $string, $matches);
$matches = array_filter($matches);
return $matches[0][0];
}
This seemed to work fine with email addresses like bob#example.com, but it fails with email addresses like bob.jones#example.com (which would come through as <mailto:bob.jones#example.com|bob.jones#example.com>).
In these cases, the function is returning jones#example.com as the email address.
I'm not great with regex, but is there something else I could use/change in my pattern, or a better way to fetch the email address from the string provided by Slack?
You could always take regex out of the equation if you know that's always the format it'll be in:
$testString = '<mailto:bob#example.com|bob#example.com>';
$testString = str_replace(['<mailto:', '>'], '', $testString);
$addresses = explode('|', $testString);
echo $addresses[0];
This method does the job without regular expressions, and it makes sure the returned value is a real email address by validating it with PHP's built-in filter functions.
function getEmailAddress($string)
{
$string = trim($string, '<>');
$args = explode('|', $string);
foreach ($args as $_ => $val) {
if(filter_var($val, FILTER_VALIDATE_EMAIL) !== false) {
return $val;
}
}
return null;
}
echo getEmailAddress('<mailto:bob#example.com|bob#example.com>');
Output
bob#example.com
You know the strings containing the e-mail address will always be of the form <mailto:bob#example.com|bob#example.com>, so use that. Specifically, you know the string will start with <mailto:, will contain a |, and will end with >.
An added difficulty though, is that the local part of an e-mail address may contain a pipe character as well, but the domain may not; see the following question.
What characters are allowed in an email address?
public function getEmail($string)
{
$pattern = '/^<mailto:([^#]+#[^|]+)\|(.*)>$/i'; // \| is a literal pipe; a bare | would mean alternation
preg_match_all($pattern, $string, $matches);
$matches = array_filter($matches);
return $matches[1][0];
}
This matches the full line from beginning to end, but we capture the e-mail address within the first set of parentheses. $matches[1] contains all matches from the first capturing parentheses. You could use preg_match instead, since you're not looking for all matches, just the first one.
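The preg_match() variant mentioned above could look like the following sketch. The function name getEmailFromSlack is made up for illustration, the pipe is escaped as \| so it is matched literally, and '#' is kept as the separator exactly as in the examples above.

```php
<?php
// Hypothetical helper; returns null when the input is not in the <mailto:...|...> form.
function getEmailFromSlack(string $string): ?string
{
    // Group 1: local part (anything up to the separator), then the domain (no pipe allowed).
    if (preg_match('/^<mailto:([^#]+#[^|]+)\|.*>$/i', $string, $matches)) {
        return $matches[1];
    }
    return null;
}

var_dump(getEmailFromSlack('<mailto:bob.jones#example.com|bob.jones#example.com>'));
```

Returning null for unexpected input avoids the undefined-index notice that $matches[1][0] would raise when nothing matches.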

Extract value from header string

I am writing a code to read bounced emails from inbox. I am getting the body of the email like so:
$body = imap_body($conn, $i);
After I get the body string, I split it into an array with explode.
$bodyParts = explode(PHP_EOL, $body);
The bounced emails that I am concerned with all have a particular header set, i.e. X-OBJ-ID. I can loop through $bodyParts to check whether that header is set, but how do I get its value if the header exists? Currently, the header string looks like this for the bounced emails that have the header set:
"X-OBJ-ID: 24\r"
So, basically my question is: How do I extract 24 from the above string?
Lookbehinds can be helpful in such cases
/(?<=X-OBJ-ID: )\d+/
(?<=X-OBJ-ID: ) is a lookbehind. It ensures that the digits are preceded by X-OBJ-ID:
\d+ Matches digits.
Example
preg_match("/(?<=X-OBJ-ID: )\d+/", "X-OBJ-ID: 24\r", $matches);
print_r($matches);
Output:
Array
(
[0] => 24
)
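Try
$int = filter_var($str, FILTER_SANITIZE_NUMBER_INT);
Note that this filter also keeps + and - signs, so the hyphens in X-OBJ-ID survive and the result here is "--24" rather than "24".
Or you can do it with a regular expression:
$int = preg_replace("/[^0-9]/", "", $str);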
You could do something like so:
$str = "X-OBJ-ID: 24\r";
preg_match('/X-OBJ-ID:\s+(\d+)/', $str, $re);
print($re[1]);
This should match your string and store the 24 in a capture group, which is then accessible through $re[1].
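Putting the capture-group idea together with the loop over $bodyParts described in the question, a minimal sketch might look like this. The helper name findHeaderValue and the sample lines other than the X-OBJ-ID one are invented for illustration.

```php
<?php
// Hypothetical helper: returns the value of the first "Name: value" line, or null.
function findHeaderValue(array $lines, string $name): ?string
{
    // preg_quote() escapes regex metacharacters in the header name;
    // the trailing \s*$ drops the "\r" left over from exploding on "\n".
    $pattern = '/^' . preg_quote($name, '/') . ':\s*(.*?)\s*$/i';
    foreach ($lines as $line) {
        if (preg_match($pattern, $line, $m)) {
            return $m[1];
        }
    }
    return null; // header not present
}

// Made-up body lines; only the X-OBJ-ID line comes from the question.
$bodyParts = ["Subject: Undelivered Mail\r", "X-OBJ-ID: 24\r", "\r"];
var_dump(findHeaderValue($bodyParts, 'X-OBJ-ID')); // string(2) "24"
```

The match is case-insensitive because mail header names are not case-sensitive; drop the i flag if an exact match is preferred.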
Try this code:
$number = preg_replace('/\D/', '', $str);
It removes all the non-numeric characters from the string.
My solution:
<?php
$string = '"X-OBJ-ID: 24\r"';
preg_match_all('^\X-OBJ-ID: (.*?)[$\\\r]+^', $string, $matches);
echo !empty($matches[1]) ? trim($matches[1][0]) : 'No matches found';
?>
See it working here http://viper-7.com/kuMyVh

Regex Get Email handle from Email Address

I have an email address that could either be
$email = "x#example.com"; or $email="Johnny <x#example.com>"
I want to get
$handle = "x"; for either version of the $email.
How can this be done in PHP (assuming regex)? I'm not so good at regex.
Thanks in advance
Use the regex <?([^<]+?)# then get the result from $matches[1].
Here's what it does:
<? matches an optional <.
[^<]+? does a non-greedy match of one or more characters that are not <. (The leading ^ inside the brackets negates the class; it is not itself excluded.)
# matches the # in the email address.
A non-greedy match makes the resulting match the shortest necessary for the regex to match. This prevents running past the #.
Rubular: http://www.rubular.com/r/bntNa8YVZt
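In PHP, the steps above can be sketched as follows. The function name extractHandle is made up for illustration, and '#' is kept as the separator to match the examples in the question.

```php
<?php
// Hypothetical helper: returns the part before the separator, or null if there is none.
function extractHandle(string $email): ?string
{
    // Optional "<", then the shortest run of non-"<" characters up to the separator.
    if (preg_match('/<?([^<]+?)#/', $email, $matches)) {
        return $matches[1];
    }
    return null;
}

var_dump(extractHandle('x#example.com'));          // string(1) "x"
var_dump(extractHandle('Johnny <x#example.com>')); // string(1) "x"
```

In the second input the match cannot start at "Johnny" because [^<] refuses to cross the "<", so the engine moves forward until it starts at the bracket.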
Here is a complete PHP solution based on marcog's answer
function extract_email($email_string) {
preg_match("/<?([^<]+?)#([^>]+?)>?$/", $email_string, $matches);
return $matches[1] . "#" . $matches[2];
}
echo extract_email("ice.cream.bob#gmail.com"); // outputs ice.cream.bob#gmail.com
echo extract_email("Ice Cream Bob <ice.cream.bob#gmail.com>"); // outputs ice.cream.bob#gmail.com
Just search the string using this basic email-finding regex (case-insensitive): \b[A-Z0-9._%+-]+#[A-Z0-9.-]+\.[A-Z]{2,4}\b
It will match any email in any text: in your first string it will match the whole string, and in the second, only the part of the string that is the e-mail address.
To quickly learn regexp this is the best place: http://www.regular-expressions.info
$email = 'x#gmail.com';
preg_match('/([a-zA-Z0-9\-\._\+]+#[a-z0-9A-Z\-\._]+\.[a-zA-Z]+)/', $email, $regex);
$handle = explode('#', $regex[1])[0]; // array_shift() needs a variable, so index the result instead
Try that (not tested).
