Regular expression to parse Final-Recipient email header - php

I have to get any text between:
Final-Recipient: RFC822; !HERE! Action
I need !HERE! from this example. There could be any string.
I tried something like:
$Pattern = '/Final-Recipient: RFC822; (.*) Action/';
But it doesn't work.
Update:
Here is the string I'm trying to parse: http://dpaste.com/187638/

Since you said "any string", which may contain spaces, the closest approximation would be
$Pattern = '/Final-Recipient: RFC822; (.*?) Action/s';
#                                        ^         ^
# lazy match instead of greedy match ----'         |
#                    allow . to match newline -----'
Of course, because the match is lazy it stops at the first " Action", so it cannot capture a value that itself contains " Action".

Your pattern works fine for me:
$i = 'This is a MIME-encapsulated message --o3ONXoEH01blah3:35:33 +0400 (MSD) Final-Recipient: RFC822; !HERE! Action: failed Status: 4.4.7 Lblahru> From: *
#*.ru';
$pattern = '/Final-Recipient: RFC822; (.*) Action/';
$matches = Array();
preg_match($pattern, $i, $matches);
print_r($matches);
Output:
Array
(
[0] => Final-Recipient: RFC822; !HERE! Action
[1] => !HERE!
)
Note also that your pattern will fail if the "any text" contains newlines. Use the DOTALL modifier /.../s to allow the dot to also match newlines. Also note that if the text " Action" appears again later in the message, the greedy .* will run on to the last occurrence it can reach and capture too much. Matching with a bare dot is dangerous. Try to find a more specific pattern if possible.

$Pattern = '/Final-Recipient:[^;]+[;<\s]+([^\s<>]+)/i';
This expression turned out to work best for my problem, because sometimes there are lines of the following kind:
Final-Recipient: LOCAL;<example#rambler.ru>
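The sketch below runs the pattern against both line layouts. The character classes here list the delimiters directly, without | or ^ inside the brackets (inside a class those are literal characters). The sample addresses are made up for illustration.

```php
<?php
// The delimiters (semicolon, angle bracket, whitespace) are listed directly in the classes.
$pattern = '/Final-Recipient:[^;]+[;<\s]+([^\s<>]+)/i';

// Hypothetical sample lines covering both layouts mentioned above.
$lines = array(
    'Final-Recipient: RFC822; someone@example.com',
    'Final-Recipient: LOCAL;<example@example.com>',
);

$recipients = array();
foreach ($lines as $line) {
    if (preg_match($pattern, $line, $m)) {
        $recipients[] = $m[1]; // group 1 holds the bare address
    }
}
print_r($recipients);
```

Both lines should yield the bare address, with the address type, the space and the angle brackets left out of the capture.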

I am going to suggest a method that does not use regular expressions, at the cost of some extra busywork.
<?php
$message = 'This is a MIME-encapsulated message --o3ONXoEH016763.1272152184/zvm19.host.ru The original message was received at Fri, 23 Apr 2010 03:35:33 +0400 (MSD) from roller#localhost ----- The following addresses had permanent fatal errors ----- "Flucker" ----- Transcript of session follows ----- 451 grl.unibel.by: Name server timeout Message could not be delivered for 2 days Message will be deleted from queue --o3ONXoEH016763.1272152184/*.host.ru Content-Type: message/delivery-status Reporting-MTA: dns; zvm19.host.ru Arrival-Date: Fri, 23 Apr 2010 03:35:33 +0400 (MSD) Final-Recipient: RFC822; !HERE! Action: failed Status: 4.4.7 Last-Attempt-Date: Sun, 25 Apr 2010 03:36:24 +0400 (MSD) --o3ONXoEH016763.1272152184/zvm19.host.ru Content-Type: message/rfc822 Content-Transfer-Encoding: 8bit Return-Path: Received: (from *#localhost) by *.host.ru (8.13.8/Zenon/Postman) id o3MNZX5h059932; Fri, 23 Apr 2010 03:35:33 +0400 (MSD) (envelope-from *#roller.ru) Date: Fri, 23 Apr 2010 03:35:33 +0400 (MSD) Message-Id: <201004222335.o3MNZX5h059932#*.host.ru> From: *
#*.ru';
$left_delimiter = 'Final-Recipient: RFC822; ';
$right_delimiter = ' Action';
$desired_message_fragment = '';
// Locate the last occurrence of the left delimiter.
$left_delimiter_pos = strrpos($message, $left_delimiter);
if ($left_delimiter_pos !== false) {
    $fragment_start = $left_delimiter_pos + strlen($left_delimiter);
    // Search for the right delimiter only after the left one,
    // so an earlier " Action" in the message cannot confuse the result.
    $right_delimiter_pos = strpos($message, $right_delimiter, $fragment_start);
    if ($right_delimiter_pos !== false) {
        $fragment_length = $right_delimiter_pos - $fragment_start;
        $desired_message_fragment = substr(
            $message, $fragment_start, $fragment_length
        );
    }
}
var_dump($desired_message_fragment);

A bit late, but the question is phrased in terms of a particular solution rather than the underlying requirement. Perhaps the OP has joined multiple lines into one line? (IMHO)
This might help others.
I'm assuming that op is trying to parse the Final-Recipient header field of a delivery status notification.
The spec for the Final-Recipient field can be seen here: https://www.rfc-editor.org/rfc/rfc3464#page-15
If the problem is broken down, the OP can pull out the Final-Recipient field as a single field: it runs until the next line that starts with a non-whitespace character (the start of the next field) or until a blank line.
e.g.
Original-recipient: rfc822;some-email-that-does-not-exist#gmail.com
Final-recipient: rfc822;some-email-that-does-not-exist#gmail.com
Action: failed
Status: 5.1.1 (Remote SMTP server has rejected address)
Here the Final-Recipient field is followed by the start of the next field, Action: the next line begins with the letter A, i.e. not with a space, and is not blank.
Then all he has to do is split the line on ; and take the second part, i.e.
String[] twoparts = "Final-recipient: rfc822;some-email-that-does-not-exist#gmail.com".split(";", 2); // limit 2: at most two parts, so only the first ; is split on
String email = twoparts[1];
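Since the question is about PHP, the same idea can be sketched there. This is a minimal illustration that assumes the report still has one field per line and no folded (continued) lines; the sample report and the variable names are made up.

```php
<?php
// Hypothetical delivery-status fragment, one header field per line.
$report = "Original-Recipient: rfc822;nobody@example.com\r\n"
        . "Final-Recipient: rfc822;nobody@example.com\r\n"
        . "Action: failed\r\n"
        . "Status: 5.1.1 (Remote SMTP server has rejected address)\r\n";

$addressType = null;
$finalRecipient = null;
foreach (preg_split('/\r\n|\r|\n/', $report) as $line) {
    // Field names are case-insensitive, so compare without regard to case.
    if (stripos($line, 'Final-Recipient:') === 0) {
        // Limit 2: split on the first ';' only, giving address type and address.
        $parts = explode(';', substr($line, strlen('Final-Recipient:')), 2);
        if (count($parts) === 2) {
            $addressType = trim($parts[0]);
            $finalRecipient = trim($parts[1]);
        }
        break;
    }
}
var_dump($finalRecipient);
```

Working line by line keeps the extraction independent of whatever field happens to follow Final-Recipient.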

Related

extract value from text file using php

I'm trying to extract the price 44,380.86, which sits between the date and the #, from the following line using preg_match_all(). One more thing: the date Jan 1, 2015 will be dynamic. Can someone tell me how to complete it?
start on Jan 1, 2015 44,380.86 # of count: 15 tc
You can use this regex:
start on\s[A-Za-z]+\s[0-9]+,\s[0-9]+\s+(.*?)\s+#
Example Code:
<?php
preg_match_all(
"/start on\s[A-Za-z]+\s[0-9]+,\s[0-9]+\s+(.*?)\s+#/",
"start on Jan 1, 2015 44,380.86 # of count: 15 tc",
$matches
);
var_dump($matches);
I think this should work for your problem, and for similar strings:
preg_match_all("(\S+(?:\s\S+)*?)","Your string",$matches);
For your question you can use:
preg_match_all("(\S+(?:\s\S+)*?)","start on Jan 1, 2015 44,380.86 # of count: 15 tc",$matches);
echo $matches[0][5];
This regex splits your string on whitespace, so when your string changes you only need to change the index into $matches[0] from 5 to whatever you want.
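A variation on the answers above is to describe the price itself rather than capture "anything" before the #. The sketch below assumes prices always have two decimals and optional thousands separators; the second input line is made up to show that the date can vary.

```php
<?php
// Hypothetical input: the line from the question plus a second line with another date.
$text = "start on Jan 1, 2015 44,380.86 # of count: 15 tc\n"
      . "start on Dec 25, 2015 1,250.00 # of count: 3 tc";

// Capture a number with optional thousands separators and two decimals.
$pattern = '/start on\s+[A-Za-z]+\s+\d{1,2},\s+\d{4}\s+([\d,]+\.\d{2})\s+#/';
preg_match_all($pattern, $text, $matches);

// Strip the thousands separators to get plain numeric strings.
$prices = array_map(function ($price) {
    return str_replace(',', '', $price);
}, $matches[1]);
print_r($prices);
```

Being specific about the captured value means a stray # earlier or later in the line is less likely to produce a wrong match.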

Return multiple lines from a long string

I have a large string with multiple instances of header information. For example:
HTTP/1.1 302 Found
Cache-Control: no-cache, no-store, must-revalidate
Content-Type: text/html; charset=iso-8859-1
Date: Tue, 01 Mar 2016 01:43:13 GMT
Expires: Sat, 26 Jul 1997 05:00:00 GMT
Location: http://www.google.com
Pragma: no-cache
Server: nginx/1.7.9
Content-Length: 294
Connection: keep-alive
After "Location:", I want to save all the data from that line to an array. There might be 3 or 4 lines to save from a big block of text.
How could I do this?
Thanks!
There are plenty of ways you could do this.
Here's one way:
Split the text up at the point where Location: occurs
Split the result by new lines into an array
Example:
$text = substr($text, strpos($text, 'Location:'));
$array = explode(PHP_EOL, $text);
Here's another way:
Using regex, match Location: and everything after it
As above - split the result by new lines
Example:
preg_match_all('~(Location:.+)~s', $text, $output);
$output = explode(PHP_EOL, $output[0][0]);
Note: the s modifier makes . match newlines as well; without it, . stops at the first newline and the capture ends there.
I found another way that works too; I figured I would add it in case it helps anyone:
foreach(preg_split("/((\r?\n)|(\r\n?))/", $bigString) as $line){
if (strpos($line, 'Location') !== false) {
// Do stuff with the line
}
}
Source: Iterate over each line in a string in PHP
There are a lot of other helpful approaches in there too.
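If the goal is only the Location lines from several header blocks, one more option is to let the multiline flag anchor the match at each line start. This is a sketch under that reading of the question; the input string is invented.

```php
<?php
// Hypothetical input: two response header blocks in one string.
$text = "HTTP/1.1 302 Found\r\nLocation: http://www.google.com\r\nPragma: no-cache\r\n\r\n"
      . "HTTP/1.1 301 Moved Permanently\r\nLocation: http://www.example.com/\r\n\r\n";

// m: ^ matches at the start of every line; i: header names are case-insensitive.
// The capture stops before any \r or \n, so line endings are not included.
preg_match_all('/^Location:[ \t]*([^\r\n]*)/mi', $text, $matches);
$locations = $matches[1];
print_r($locations);
```

Here $locations holds one entry per Location line, in the order they appear in the text.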

file_get_contents() with newLine for Regex

I have a file called mail.txt with the following contents :
From: elvis#tabloid.org (The King)
Subject: be seein' ya around
Date: Mon, 23 Oct 2006 11:04:13
From: The Prez <president#whitehouse.gov>
Date: Wed, 25 Oct 2006 8:36:24
Subject: now, about your vote
I'm using Sublime Text in which the Regex ^\w+: works properly.
I'm using file_get_contents() to read the content from mail.txt and then use the same Regex for preg_replace() to highlight the output.
The issue is, when I use file_get_contents(), it doesn't consider \n and for that I tried nl2br(), but that didn't work either.
Below are the outputs in Sublime and PHP:
[Screenshots: Sublime output, PHP output]
Below is the PHP code:
<?php
$path = "./mail.txt";
if(!file_exists($path))
die("File does not exist");
else {
if(!($handle = fopen($path, "r")))
die("File could not be opened");
else {
$file_data = file_get_contents($path);
}
}
$mod_file = preg_replace("/^\w+:/", "<span class='replaced'>$0</span>", $file_data);
echo "<pre>".$mod_file."</pre>";
?>
How to solve this issue?
You need to use the m (multiline) flag. See the demo:
https://regex101.com/r/cT0hV4/12
$re = "/^\\w+:/m";
$str = "From: elvis#tabloid.org (The King)\nSubject: be seein' ya around\nDate: Mon, 23 Oct 2006 11:04:13\nFrom: The Prez <president#whitehouse.gov>\nDate: Wed, 25 Oct 2006 8:36:24\nSubject: now, about your vote";
preg_match_all($re, $str, $matches);
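Applied to the preg_replace() call from the question, the same flag might look like the sketch below. The sample text is a shortened, made-up stand-in for the contents of mail.txt.

```php
<?php
// Stand-in for file_get_contents('./mail.txt'); the content is illustrative.
$file_data = "From: elvis@example.org (The King)\n"
           . "Subject: be seein' ya around\n"
           . "Date: Mon, 23 Oct 2006 11:04:13";

// With the m flag, ^ matches at the start of every line, not just the start of the string.
$mod_file = preg_replace('/^\w+:/m', '<span class="replaced">$0</span>', $file_data);
echo "<pre>" . $mod_file . "</pre>";
```

Only the field names at the start of each line are wrapped; the colons inside the time 11:04:13 are left alone because they are not at a line start.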

regex failing with no errors

I have the following text in a string called $test:
Content-Type: text/plain
Server: testapp (4.2.1 (x86_64/linux))
Content-Length: 125
{"password":"123","email_address":"","name":"j.doe","username":"jd123"}
I am trying to write a regular expression in PHP that will return everything after Content-Length: 125.
Here's what I have so far:
if (preg_match('/^Content\-Length\:[0-9\\n]+([a-zA-Z0-9\{\}\"\:])*/',$test,$result))
{
var_dump($result[1]);
}
I don't get any error messages, but it doesn't find the pattern I've defined in my string.
I've also tried this pattern:
'/^Content\-Length\:[0-9\\n]+([a-zA-Z0-9{}\"\:])*/'
where I tried to remove the escape character in front of the curly braces. But it's still a no-go.
Can you tell me what I'm missing?
Thanks.
EDIT 1
my code now looks like this:
<?php
$test = 'Content-Type: text/plain
Server: kamailio (4.2.1 (x86_64/linux))
Content-Length: 125
{"password":"test123","email_address":"","name":"j.doe","username":"jd123"}';
//if (preg_match('/Content-Length\:[0-9\\n]*([a-zA-Z0-9{}\"\:])*/',$test,$result))
//{
// var_dump($result);
//}
preg_match('/({.*})/', $str, $matches);
echo $matches[0];
?>
That gives me the following error:
Undefined offset: 0 in /var/www/html/test/test.php on line 31
Line 31 is where I'm trying to echo the matches.
$str = <<<HEREDOC
Content-Type: text/plain
Server: testapp (4.2.1 (x86_64/linux))
Content-Length: 125
{"password":"123","email_address":"","name":"j.doe","username":"jd123"}
HEREDOC;
preg_match('/(\{.*\})/', $str, $matches);
echo $matches[0];
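The regex here simply matches everything from the first { to the last } on a single line. It's a quick and loose regex, however. Note also that the code in your edit stores the text in $test but passes $str to preg_match(), which is why $matches comes back empty.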
Instead of using one big pattern to match everything (which is time-consuming), why not use preg_split to cut your string into two pieces at your desired location?
$string = 'Content-Type: text/plain
Server: testapp (4.2.1 (x86_64/linux))
Content-Length: 125
{"password":"123","email_address":"","name":"j.doe","username":"jd123"}';
$parts = preg_split ("/Content-Length:\s*\d+\s*/", $string);
echo "The string i want is '" . $parts[1] . "'";
Output:
The string i want is '{"password":"123","email_address":"","name":"j.doe","username":"jd123"}'
You can avoid the regex altogether, because the HTTP headers are always separated from the response body by two consecutive line breaks.
list($headers, $body) = explode("\n\n", $string, 2);
Or, for Windows-style (CRLF) line breaks, which are in fact the standard for HTTP headers:
list($headers, $body) = explode("\r\n\r\n", $string, 2);
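When it is not known in advance which line-ending style the input uses, a small variation is to split on either form. The sketch below is one way to do that; the sample response is assembled from the question's data, and decoding the body as JSON is an extra step that only applies because this body happens to be JSON.

```php
<?php
$response = "Content-Type: text/plain\r\n"
          . "Server: testapp (4.2.1 (x86_64/linux))\r\n"
          . "Content-Length: 125\r\n"
          . "\r\n"
          . '{"password":"123","email_address":"","name":"j.doe","username":"jd123"}';

// Split on the first blank line only (limit 2), accepting CRLF or bare LF.
$parts = preg_split('/\r?\n\r?\n/', $response, 2);
$headers = $parts[0];
$body = isset($parts[1]) ? $parts[1] : '';

// The body in this example is JSON, so decode it into an associative array.
$data = json_decode($body, true);
var_dump($data['username']);
```

The limit of 2 matters when the body itself contains blank lines: everything after the first separator stays in $body.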

how to extract full mail address in imap php

I use the code below to extract header details from a mail. I cannot get the mail addresses in From, To and Cc, as shown below.
$header = explode("\n", imap_fetchheader($mbox,$msgno));
echo "<br>";
for ($i=1; $i<count($header); $i++)
{
echo $header[$i] . "<br>";
}
output:
Delivered-To: user1#examplecom
X-WM-Delivered: user1#example.com
Received: from ElcotPC ([127.0.0.1])
(envelope-sender )
by 127.0.0.1 with ESMTP
for ; Wed, 31 Jul 2013 09:14:19 +0530
From: "user1"
To:
Cc:
Subject: testing with attachment
Date: Wed, 31 Jul 2013 09:14:18 +0530
The From, To and Cc fields are shown without the mail addresses.
I want the output to look like this:
Delivered-To: user1#examplecom
X-WM-Delivered: user1#example.com
Received: from ElcotPC ([127.0.0.1])
(envelope-sender )
by 127.0.0.1 with ESMTP
for ; Wed, 31 Jul 2013 09:14:19 +0530
From: "user1" <user1#example.com>
To: <user2#example.com>
Cc: <user1#example.com>
How can I get the email addresses into the From, To and Cc fields?
Update:
It's always best to use code that is readily available, so I checked whether an IMAP parsing function already exists. It does: imap_rfc822_parse_headers. Read the docs for details, and for links to all sorts of imap_* functions. Perhaps imap_rfc822_parse_adrlist is exactly what you need?
A basic preg_match_all call could do the job, I think:
if (preg_match_all('/^\s*(From|To|Cc):[^<]*<([^>]+)\>/m', $string, $addresses))
{
    // Pair each header name (group 1) with its address (group 2).
    $addresses = array_combine($addresses[1], $addresses[2]);
    print_r($addresses);
}
Should output:
Array
(
    [From] => user1#example.com
    [To] => user2#example.com
    [Cc] => user1#example.com
)
I think that's what you were looking for.
The regex explained:
^\s*: matches the start of the line, and zero or more whitespace chars
(From|To|Cc): matches (and groups) From, To or Cc
:[^<]*<: matches (but doesn't group) the colon, then any chars up to and including the address-delimiting <
([^>]+): matches (and groups) everything after the < that isn't a >
\>: can be left out, but matches the address-delimiting >
m: multi-line. If left out, the leading ^ means start of string; with it, ^ means start of line
Notes: This expression doesn't deal with comma-separated or multiple addresses in one header, and it might be useful to call:
filter_var($addresses['From'], FILTER_VALIDATE_EMAIL)
or use array_map to filter $addresses[2] before building the result array...
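The validation step hinted at above could be sketched as follows. The input array is a made-up example of what the extraction might produce, with one deliberately invalid entry.

```php
<?php
// Hypothetical result of the extraction step: header name => address.
$addresses = array(
    'From' => 'user1@example.com',
    'To'   => 'user2@example.com',
    'Cc'   => 'not an address',
);

// filter_var returns the address when it is valid and false otherwise,
// so array_filter keeps only the entries that pass validation.
$valid = array_filter($addresses, function ($address) {
    return filter_var($address, FILTER_VALIDATE_EMAIL) !== false;
});
print_r($valid);
```

array_filter preserves the keys, so the surviving entries are still labelled with their header names.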
