Threading without In-Reply-To: Message-ID? - php

So our system is sending us email notifications regarding specific incidents. The problem is, things get really spammy and disorganized, and we really want to have our email client (Gmail) thread notifications regarding the same incident.
So for instance:
Email 1 is sent about incident 1.
Email 2 is sent about incident 1.
Gmail displays them in their own threads while we want them to appear as "replies" in the same thread.
I thought all I needed was to append "Re: Subject Line" but that doesn't work. Then I read about the References and In-Reply-To headers, but these require storing message IDs and we'd want to avoid the overhead if we can.
Is there any workaround to have clients thread emails received about the same topic without having to store message IDs and add them to In-Reply-To headers?
Thank you so much!
Note: Using PHPMailer

You really can't get around this. You must have something that uniquely identifies a message, and as far as email messages go, that something is the message ID, and that's also exactly the thing referenced by References and In-Reply-To headers, so you really can't get away from that. It has to be implemented using message headers because that's universally supported by all email clients.
The definitive threading algorithm, is, like email itself, now very old, but you can find it here, and it's still as valid as it ever was. I used that approach to implement the email threading used in the chamsocial.com social network.
Some email clients try to do threading without using message IDs, and it's almost always a failure - I sometimes see Apple Mail doing this when IDs are missing, pulling completely unrelated messages into a thread just because they happen to have the same subject.
If you wanted to do message threading without message IDs, you'd need to think up something that acts in exactly the same way - which is pointless, so you're best off biting the bullet and implementing message ID support properly.

Related

How to track an ID on an email when the user replies?

I'm using Laravel 5.5. In my app, users can write something and send it through email to someone. The thing they wrote gets recorded as a "message" in my DB. I need to, somehow, send the ID of this message in the e-mail, so when the receiver replies through e-mail, I know which message he's replying to.
What's the easiest way to do this? I know there are APIS, but I need to implement a custom solution.
Thanks for any light on this.
Use extended addressing on your mail server - replies+specialcode#example.com ... although a lot of bad "email validation" methods think it isn't valid, so I changed the extension character to a hyphen instead of the + on my postfix setup.
All mail is delivered to replies#example.com but the to: and other headers keep the specialcode part which you can then use to route into a ticketing system, etc.
This feature is supported by Postfix (recipient_delimiter in main.cf), QMail, and several other SMTP servers. Note that you need the support for this on the incoming-to-you SMTP side, nowhere else, especially if you use a character that isn't typically declared as "wrong" by various filters. May be worth splitting off a subdomain so you can easily set up separate MX servers, etc. since I'm sure you don't want your regular corporate mail running through the same filters and processing (at least, I would...)

Are custom mail headers preserved after reply?

I'm currently trying to design a PHP webapp that allows users to send emails to other users. The recipient can then reply to the email and the message will be updated in the webapp.
Now to keep track of each individual user message, I would like to add a custom header (ie. conversation_id) in the email. When the recipient replies to the email in their email client, will the custom mail header (ie. conversation_id) be preserved?
There will be cron job that executes every minute that opens a POP3 stream to the web server to retrieve new emails (replies that the user may have sent with their mail client) to update my DB.
I'm not sure if this is a good way for designing such an app. Any suggestions?
EDIT: Also, I'm sure wondering how I can strip out the quoted messages in the reply?
You can't rely on mail headers being preserved - it is pretty much up to the individual mail client to decide what to include.
I would generally put the conversation ID within [] brackets in the subject which makes it really easy to parse out with a regular expression.
Each message already contains the Message-ID field which is used by the mail clients to create the content of the In-Reply-To field.
Wouldn't the usual way after the standards be to rely on the user's mail client setting the in-reply-to field correctly? As far as I know, all email client use this correctly. (even though according to this thread Outlook may have an occasional bug?)
So I think, emails already feature this and you don't have to worry about creating a custom mail header entry and unpredictably behaving mail clients.
EDIT: I rembember a friend telling me his frustration from work about how many people remove or even edit these Tags in [ ] brackets from the suject field. Also, it seems to be a very dirty work-around and all of your software would need to handle it without opposing it to the users ability to change it => practically impossible.
EDIT: I think it will be hard to reliably strip out the quoted message in the reply, because each mail client handles it differently.

PHP - Bulk mailing and checking for server responses

We send a lot of emails to a lot of our users (ranges in 20k+ per day). One of the major problems we face are invalid or dead emails - sometimes our users delete their accounts, change their email address without updating their profile, or our email database builders simply caught emails that are invalid or no longer active. These unresolved returned status messages keep not only piling up on our webmaster account, but also waste precious server resources and flag us as spam more often because of the repeated attempts.
Now, while our mail server is set up to keep trying to send an email to an address that returns "temporarily unavailable", I would like to be able to receive status messages into PHP immediately after sending. For example, when my Sender class sends an email, I would like to know the status message that was returned - whether the email is no longer active, or the server does not exist, or the email was simply moved to another address.
Naturally, I want to be able to receive a status message of a delayed email as well. So if an email does not get sent because the recipient email address is temporarily unavailable, I would like to get the "temporarily unavailable" message back into Php, but I would also like the real one to get passed back once the sending succeeds (for example, if the email does go through 2 days later).
Is there a library that would help me achieve this? What are the most common approaches to this problem, if any?
Like most questions about PHP and mail this is mainly about the MTA.
Bulk emailing is a science in its own right (OK, so more like a black art) and at these kinds of volumes you need to up your game a lot if you want reasonable rates of deliverability.
But back to the question.
An awful lot of this is about how you configure your mail server. AFAIK, most MTAs will only send back a failure message when the message is removed from the queue (e.g. after the last attempt at delivery). Which gives 2 options for tracking each attempt:
1) parse the log files
2) set the number of attempts to 1 (and optionally handle re-queueing yourself).
Given that a message can fail to be delivered after it has successfully departed from your server, it makes a lot of sense to use delivery status notifications (i.e. bounce emails) to track the progress of messages - so using option 2 avoids having to build different code for handling different scenarios.
Without knowing what OS this is running on, nor which MTA, it's impossible to give more specific recommendations.
symcbean's answer gives many theorical inputs and several means to handle your case.
Besides, maybe you could have a look on how other libs or builtin functions work. For instance, you can have a look at :
PHPList
Extended PHP Mailer
i used PHPList some times ago but it was already a reliable solution. I don't know the PHP Mailer class but i may be worth a try (or a least a look on how they deal with similar issues).
Regards,
Max

Threading emails by subject

We're parsing an email inbox signed up to a mailing list (Mailman) that does nothing except sit there and capture emails from other users on the mailing list. This is going to be PHP connecting to an email box, grabbing new emails and putting them into a MySQL database for use as a web archive that's searchable.
I noticed that many of the subjects have RE: FW: FWD in front of them (obviously), but wondered if I didn't need to manually strip these out to get a grouping by subject when outputting database results to the web page.
Maybe there's a PHP/Mail or PEAR class that will automatically handle message grouping/threading that I'm not aware of. Thanks for your help!
The proper way to thread them is not by subject, but rather by the Message-ID and References headers. The References header will contain a comma-delimited string of all the previously related Messgage-ID headers. By using these, the actual content of the subject line becomes less relevant since it can get modified and mangled. In other cases, you might get many separate threads with subjects like "Need help please" that should not be threaded together.
You probably want to look into the References and In-Reply-To email headers. These give you information about which email the current email is in reply to.
There's a good algorithm for threading email based on this information here: http://www.jwz.org/doc/threading.html

Safe way to send mail via PHP to many users

Let me explain what I mean in my title. Let's say, that for example I'm creating a small e-commerce system for one web shop/catalog. There's a possibility for customers to choose, do they wish to receive newsletters or not. If they do, then logically thinking the newsletters should be send immediately as the newsletter is formed and ready.
Of course it can be done simple by fetching all specified user e-mails from database and using for cycle to send mail via mail function in loop, but problem is that I was told, that this is bad practice. The simple and not cheap way would be purchasing internet service for sending newsletters, but what for the php programmer is needed, then?
So I ask you humble comrades, what from your point of view might be a solution?
NB! You probably won't believe me, but it is not for spamming.
UPD: I might have explained myself wrong, but I would like to hear a solution not only about the correct way to send mail, but also about the correct delivery. Since not every mail send is always delivered.
Of course there are some reasons, which are unpredictable. For example someplace along the way something broke and mail was lost (if such thing is possible), but there are also other reasons which are influenced maybe from server or elsewhere. Maybe there is need to talk with hoster about it?
There is no reason why you couldn't write it in PHP, although I would not make it a part of a webrequest / HTTP process. I've successfully implemented for give or take 500,000 subscribers per mailing (depending on local data available, as this was a location-specific project). It was an in-house project, so unfortunately no code/package for you, but some pointers I came across:
Setting up delivery
Started out with phpmailer itself, to take care of formatting, encoding of contents and headers, adding of attachments etc. That portion of it works well, and I wouldn't want to write that from scratch.
The 'sending' of an email itself is just setting some flag in a database whether / how / what should be sent to (a portion of) the subscribers.
After this flag is set, it will automatically be picked up by a cronjob, no more webserver involved.
I started out with a heavily polluted database with millions of email addresses, of which a lot were obvious not valid, so first thing was to validate all email addresses for format, then for host:
filter_var($email, FILTER_VALIDATE_EMAIL); over the subscribers (and storing the result obviously) got rid of the first few hundred thousand invalid emails.
Splitting out the host (and storing the host name) from the emails, and validating that (does it have an MX or at least an A record in DNS, but keep in mind: you can send email to an IP-address foo#[255.255.255.255], so do keep those valid)) got rid of a good portion more. The emailaddresses here are not permanently disabled, but with a status flag that indicates they're disabled because of the domain name / ip.
Scripts were changed to require valid emailaddresses on subscription / before insertion, this nonsense of 'you aint gonna get it#anywhere' subscription-pollution in the database was just ridiculous.
Now I ended up with a list of email addresses that had the potential of being valid. There are in essence 3 ways to detect invalid addresses (keep in mind, the all can be temporary):
They are denied immediately by the server.
The earlier determined server just doesn't listen to traffic.
They are bounced long after you thought you delivered them.
Strange thing, the bounces, which every emailserver seems to have another format forand were a hell to parse at first, ended up actually pretty easy to capture using VERP. Rather then parsing whole emails, a dedicated email address (let's call it mailer#example.com) was configured to rather then deliver to mailbox, to pipe it through a command, and if we sent email to user#server.tld, the Return-Path was set for mailer+user=server.tld#example.com. Easily parsed on receipt, and after how many bounces (mailbox could not exist, mailbox may be full (yes, still!), etc.) you declare an emailaddress unusable is up to you.
Now, the direct denial by the server. Probably we could have gone with properly configuring some MTA and/or writing plugins for those, but as the emails were time-sensitive, and we had to have absolute configurable control per mailing over last usable delivery time (after which the email was not longer of interest to the user), throttling per receiving server, and generally everything, it would take about the same time writing a mailer in PHP which we knew better, that used the SMTP protocol directly to socket 25 on receiving servers. With a minimum amount of effort the possibility of another transport then the default choices in PHPMailer was built-in. The SMTP protocol is actually quite simple, but there are some caveats:
A lot of receiving server apply Grey Listing: most spambots will not really care if a specific mail arrives, they just churn them out. So, if an unknown / not yet trusted sender send mail, it will be temporarily rejected. Catch that (usually code 451), and place the email in the queue for later retry.
A mailserver, especially of the larger ISP's and free services (gmail, hotmail/msn/live, etc.) will not stand for a torrent of mail without fighting back: after the first couple of hundred / thousand, they start rejecting you. More about that later.
Getting speed
Now, we had a delivery system that worked, but it needed to be fast. Sending a 10,000 emails in an hour is all fine if you only have 10,000 addresses to send to, but the minimum we required was about 200,000 per hour. Start of it was a dedicated server (which can actually be pretty low powered, no matter what you do, most of time taken in delivery of email is in the network, not on your server).
Caching of IP's: remember all those IP's we requested from hostnames in email addresses? We stored those obviously, and looking up their IP's again and again causes considerable lag. However, IPs may change: a DNS record there, another MX at another place... the data gets stale fast. Most of the time the server isn't actually sending anything (subscription newslettters come in bursts obviously), a low-priority cronjob is running checking all hostnames with a stale IP (we chose older then 1 day as being stale) for an IP address, including those which previously had none (new domains get registered all the time, so why shouldn't a domain become available the day after someone already enthusiastically subscribed with his/her brand new email address? Or server problems with some domain are solved, etc.). Actually sending the emails now required no more domain lookups.
Reuse of the SMTP connection: setting up a connection to a server takes relatively a large portion of the time to deliver an email when you're talking directly to port 25. You don't have to setup a new connection for every email, you can just send the next over the same connection. A bit of trail-and-error has resulted in setting the default here to about 50 emails per connection (assuming you have that many or more for the domain). However, on failure of an emailaddress closing and reopening the connection for a retry sometimes helped. All in all, this really helped to speed things along.
Some obvious one, so obvious I almost forgot to mention it: it would be a waste to have to create the body of the email on the spot: if it's a general mail, have the body ready (I altered PHPMailer somewhat to be able to use a cached email), possibly days before (if you know you're going to send a mail on Friday, and your server is idling, why not prepare them on Wednesday already? If it's personalized, you could still prepare it beforehand given enough time, if not, at least have the non-personalized portions waiting to go.
Multiple processes. Did I mention much of the time it takes to deliver email is spend on the network? One mailing process is not nearly getting the most from your emailserver, barely noticable load and the mails are trickling out. Play around with a number of processes mailing different portions of the queue to see what's right for your server/connection, but remember 2 very important things:
Different processes make you very vulnerable to race conditions: be freaking absolutely sure you have a fullproof system that will never send the same mail twice (thrice, of even more). Not only does it seriously annoy users, your spamrating goes up a notch.
Keep domains together where possible: randomly picking from the queue you will lose the advantage of keeping an open connection to the server receiving email for the domain.
Avoiding rejects
You're going to send a lot of mail. That's exactly what spammers do. However, you don't want to be seen as a spammer (after all, you aren't, are you)? There are a number of mechanisms in place which will thoroughly increase your trustworthiness to receiving servers:
Have a proper reverse DNS: processes checking the DNS belonging to the IP that is sending the email like it very much if the second level domains match: are you sending mail on behalf of example.com? Make sure your server's reverse DNS is something like somename.example.com.
Publish SPF records for your domain: explicitly indicate the machine used to send your bulk email is allowed & expected to send mail with that From / Return-Path headers.
remember rejects: servers don't like it to tell you again and again that different email addresses don't exist. Either automated mechanisms, and even human admins, blocked our server while we worked through all non-validated email addresses that did (no longer) exist. We didn't employ a double opt-in until later, so the database was polluted with typos, people switching IPs and thereby email address, prank email addresses and so on. Be sure to capture those invalids, and given enough or sever enough failures, unsubscribe them. They're doing you no good, they're hogging resources, and if they really want you mail and the mailbox becomes available later, they'll just have to resubscribe.
DKIM is another mechanism that may increase your trustworthiness, but as we haven't implemented it (yet), I cannot tell you much about that.
MX records: some servers still like it if your sending server is also the receiving server for the domain. As it was at the time, we had only 1 MX, and as the mailing server was still not that very busy, we dubbed it the fallback MX server for the domain. The normal MX server was not the server sending the subscriptions, as it is very irritating to be temporarily blocked by a server you're trying to send an important email to (clients etc.) because you already sent a load of less important mail. It does have the highest preference as receiving MX, but in the event it would fail we had the nice bonus that our subscription sending server would still be fallback for delivery, so in crisis we could still get to it, preventing awkward bounces to customers trying to reach us.
Tell them about you. Seriously. A lot of major players in free email addresses like live.com offer you the opportunity to sign up in some way, or have some point of contact to go to for help & support if your emails get rejected. I you have a legitimate reason to send so many emails, and it is believable you have that many subscribers, chances are they seriously up the number of emails you can send to their server per hour. A meager 1,000 may become somewhere in the ten-thousands or even higher if you are persuasive and honest enough. There may be contracts, requirements you have to fulfill, and promises you must make (and keep) to be allowed this. ISP's are a brand apart, and every other player is different. Don't bother calling them usually, because 99% of the time the only numbers you can find will only have people willing to troubleshoot your internet connection, which understand (or are allowed) little else. An abuse# email address is a good place to start, but see if you can delve up a more to-the-point email address up from somewhere. Be precise, honest and complete: roughly how many subscribers of you have an emailaddress with that ISP, how often are you trying to mail them, what are the errors or denials you receive, how is the subscribe & unscubscribe process like, and what is the service you actually provide to their customers. Also, be nice: how vital sending those mails may be to your business, panicking about it and claiming terrible losses does not concern them. A polite statement of facts and wishes, and asking whether they can help rather then demanding a solution goes a very long way.
Throttling: as much as you tried, some server will accept only a certain amount of mail per hour and/or day from you. Learn those numbers (we're logging successes & failures anyway), set them to a reasonable default for the normal domains, set them to agreed upon limits for bigger players.
Avoiding being tagged as spam
First rule: don't spam!
Second rule: ever! Not a 'once off', not a 'they haven't subscribed but this may be the deal of a lifetime to them', not with the best intentions, people have had to ask for your emails.
Obviously set up a correct double opt-in subscription mechanism.
PHPMailer does set proper headers on its own,
Set up an easy unsubscribe mechanism, by web (include a link to it in every mail), possibly also email and customerservice if you have it. Make sure the customerservice can unsubscribe people directly.
As said earlier: unsubscribe (excessive) fails & bounces.
Avoid spammy 'deal of a lifetime' wordings.
Use url's in your emails sparingly.
Avoid adding links to domains outside your control, unless you are absolutely sure you can trust them not to spam, if even then...
Provide value to the user: being tagged as spam by user-interaction in google/yahoo/live webmail clients seriously hurts future successes (on a site note: if you sign up for it, live/msn/hotmail will forward all mail to you send by your domain which is tagged as spam by users. Learn to love it, and as always: unsubscribe them, they clearly don't want your mall and are hurting your spam rating).
Monitor blacklists for your IP. If you appear on one of those, it's bye bye, so immidiate action in both clearing your name and determining the case is required.
Measuring success rate
With the whole process under your control, you are reasonably sure the email ended up somewhere (although it could be the MX's bitbucket or a spam folder), or you have logged a failure & the reason why. That takes care of the 'actually delivered' numbers.
Some people will try to convince you to add links to online images to your emails (either real or the famous 1x1 transparent gif) to measure how many people actually read your email. As a high percentage blocks those images, these numbers are shaky at best, and our believe is we just shouldn't bother with them, their numbers are utterly unreliable.
Your best bet for measuring actual success rate are a lot easier if you want the users to do something. Add parameters to links in the mail, so you can measure how many users arrive at the site you linked, whether they performed desired actions (watched a video, left a comment, purchased goods).
All in all, with all the logging, the user-interface, configurable settings per domain / email / user etc. It took us about 1,5 man-month to build & iron out the quirks. That may be quite an investment compared to outsourcing the emails, it may be not, it all depends on the volume & business itself.
Now, let the flaming begin that I was a fool to write an MTA in PHP, I for one thoroughly enjoyed it (which is one reason I wrote this huge amount of text), and the extremely versatile logging & settings capabilities, per-host alerts based on failure percentage etc. are making live oh so easy ;)
Using something like Swiftmailer, PHPmailer or Zend_mail are much better alternatives to using the simple mail() function as it can be easily marked as spam. There are simply too many issues with mailing that need to be considered - most of these are solved by using pre-existing libraries.
Just a few problems that need to be addressed when sending mass emails manually:
Using incorrect headers.
Processing bounced messages
Timing out of script due to an influx of emails.
Edit:
Probably not the answer you're looking for. But, I would strongly suggest you invest in something like Campaign Monitor or Mail Chimp. Since this process is not for educational purposes, but commercial, I would strongly suggest the above services.
I got your question, but before replying, let's me go to usual considerations. First, I strongly recommend using a service like Mail Chimp. It's kinda free for small jobs, and has many cool features, like tracking back how many emails were open, how many were clicked, how many failed on deliver... Think about make a favor to yourself and don't reinvent the wheel.
Now, for getting to know purpose, let's go to the reply to your question.
First thing to have in mind is to enforce your list to be a good one. How to do that? Well, for a good list, I mean a valid email addresses list. Simple put a newsletter form on your page, with just one field (maybe a captcha, but I don't think it is necessary).
Save all input to a database table, with a field "isValid" set as false by default, and any kind of unique hash. Then you send a confirmation email, with a link (with the hash generated) for confimation which when clicked will make the "isValid" flag true, and a link for cancelation (ALWAYS send this cancelation link within all your emails).
This is what stores and serious sites do. Anything that force your customers/visitors to receive is a bad moral practice (ie, spam).
Second thing, use a good hosting service. Too cheap services usually are used by spammers, and major email services blacklist everything coming from those addresses.
I know, you should be asking yourself if I get your question wrong. No, I don't, technical stuff comes now.
Why is a bad practice put a mail function inside a for loop? Simple. Because function mail do several operations everytime it is called. PHP, will open a connection to mail server, send data to be parsed, ask for sending, register mail server status, close connection, bubble up status to finish the mail function you called and clean up the memory mess.
This connection overhead is the problem people state as bad practice from programming point of view. Using a SMTP/IMAP solution is better because it optimizes this process.
A little bit down on tech stuff I see your questions about delivery. Well, as I said, you have some ways to ensure you emails list is good enough. But what if another exception occurs, like having a blackout + no-break failures on customer server?
Well, PHP keeps the status of "asking mail server to send, mail server sent". If the mail server sent your message, PHP will return true. Period.
If client was unable to receive, or reject, you should check email headers and email status. These are on email server. Once again, these informations can be accessed with SMTP/POP/IMAP extensions, not with mail function.
If you wanna go further, read IMAP docs, search for email classes (phpclasses.org, pear and pecl are best places to look into).
Extra tip: RFCs can be useful, as you can understand better what email servers really talks to each other.
Extra tip 2: Access you gmail or ymail and check for your sent/received messages for their "full version" and read its headers. You can learn a lot with them.
Via SMTP Authentication: http://www.cyberciti.biz/tips/howto-php-send-email-via-smtp-authentication.html
Just use PHP Mail and study IMF and how to build custom headers you can attach the fourth parameter, exmaple follows
<?php
// multiple recipients
$to = 'aidan#example.com' . ', '; // note the comma
$to .= 'wez#example.com';
// subject
$subject = 'Birthday Reminders for August';
// message
$message = '
<html>
...
</html>
';
// To send HTML mail, the Content-type header must be set
$headers = 'MIME-Version: 1.0' . "\r\n";
$headers .= 'Content-type: text/html; charset=iso-8859-1' . "\r\n";
// Additional headers
$headers .= 'To: Mary <mary#example.com>, Kelly <kelly#example.com>' . "\r\n";
$headers .= 'From: Birthday Reminder <birthday#example.com>' . "\r\n";
$headers .= 'Cc: birthdayarchive#example.com' . "\r\n";
$headers .= 'Bcc: birthdaycheck#example.com' . "\r\n";
// Mail it
mail($to, $subject, $message, $headers);
?>
Source: http://php.net/manual/en/function.mail.php
create a mail queue subsystem which might include tables such as mail_queue, mail_status, mail_attachments, mail_recipients and mail_templates etc...
You might consider PHPMailer
http://phpmailer.worxware.com/index.php?pg=exampleasendmail
You can add multiple recipients and a special callback function for handling the returning messages for every single mail sent. (for an example visit the link)
I Don't think that catching a "mail delivery failed" error mail is possible via php except that you are using PHPMailer via SMTP and once in a while watch for any returning message form any recipient in from your outgoing email collection.

Categories