Manipulating the content of an email message - php

I am looking for a solution that will enable me to connect to a mailbox, obtain an email, apply specific modifications to the email body (for example, change the content), and then forward the newly modified email to a new email address.
The trick is that such modification must not destroy the format and headers of the original email and I must not lose any attachments that were in the original email.
The sort of manipulation that will be performed will need to be done by an external process that knows the logic of my application.
The solution I am looking for can be external software that invokes some API for processing the content of the emails, or even just an API that my code will invoke.
Our solution is currently based on PHP, but any other solution is also acceptable.
I started working with the Zend Mail library, but I am running into the problem of having to understand the inner workings of email formats. I don't want to start messing around with the MIME objects in the email format. I only want to alter the textual content of the message and keep the rest untouched.

http://php.net/manual/en/book.imap.php - functions that let you manipulate email systems.

What mail server are you using? In qmail it's easy to process any incoming email. You can use a script in any language to process the lines of the email.
If you have IMAP access to your server you can use the php IMAP lib. http://www.php.net/manual/en/book.imap.php

I wrote a library as part of a larger open source app that may help you a bit. It's an object-oriented wrapper around the PHP imap functions and can be found at Google Code.
Unfortunately this doesn't do exactly what you want. What in the message are you trying to change? It may be possible to just grab a raw version and specifically search out what you want to change, ignoring the whole MIME type processing altogether, and then just send the whole message along again.
Resending the email is simple enough, and this (small tutorial)* on sending email with attachments can refresh you on the basics (although most of what is in there you can skip as the attachments and mimetypes will already be built).
* I can't post the link because my reputation isn't high enough for two links in a single post, so I'll add it in a comment.
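The "grab a raw version" idea above could look roughly like the sketch below. It assumes the text to change sits in a part that is sent unencoded (7bit or 8bit); a quoted-printable or base64 part would have to be decoded first. The function name replaceInRawBody is made up for illustration.

```php
<?php
// Hypothetical helper: replace text in the body of a raw message while
// leaving the header block byte-for-byte untouched.
function replaceInRawBody(string $raw, string $search, string $replace): string
{
    // Headers end at the first blank line (CRLF CRLF, or LF LF on some systems).
    if (!preg_match('/\r?\n\r?\n/', $raw, $m, PREG_OFFSET_CAPTURE)) {
        return $raw; // no body found, nothing to change
    }
    $split   = $m[0][1] + strlen($m[0][0]);
    $headers = substr($raw, 0, $split);
    $body    = substr($raw, $split);

    // Everything after the headers, including MIME boundaries and
    // attachments, is passed through; only exact matches are replaced.
    return $headers . str_replace($search, $replace, $body);
}

$raw = "Subject: Hello world\r\n"
     . "Content-Type: text/plain\r\n"
     . "\r\n"
     . "Hello world, this is the body.\r\n";

echo replaceInRawBody($raw, 'Hello world', 'Goodbye');
```

Note that the Subject header keeps its original text because only the part after the first blank line is searched.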

Related

MIME Multipart Parser

The company I work for provides bulk-mailing functionality to our clients [double opt-in, not spam, I promise] and we get a figurative ton of reports back via Feedback Loops from AOL, Comcast, Yahoo, etc. These are generally from people that signed up, don't want it anymore, have been conditioned to not click 'Unsubscribe' links, [because "that's how the spammers get you"] and simply mark all the messages as spam.
Now, these FBL emails follow a specific format where the message is multipart, there are one or two text parts, and then the original message is attached, usually with all recipient information stripped out. This attached email is also multipart and contains the unsubscribe link, but the section of the attached email in which the link occurs is quoted-printable encoded and the link is longer than what quoted-printable allows in a line, so it gets mangled. Occasionally the section seems to get base64-encoded; I think that happens if the client is using a language like Chinese or Japanese.
What I need is a mime/multipart data parser that can give me these parts. PHP has oh so helpfully not implemented any form of multipart parser that I can find outside of what's internal to either their horrid IMAP functions, or internal to PHP itself which processes multipart form data.
Does anyone know of something I can use for this short of having to write my own? I had found one script, but it relies on old PECL functionality that relies on a custom-compilation of PHP which is not an option for this server.
TL;DR: PHP's imap_* functions will parse the parts of the message received from the server, but I need to parse the parts of an email attached to the email downloaded from the server.
This guy's script is ugly as sin, but it gets the job done:
http://www.phpclasses.org/package/3169-PHP-Decode-MIME-e-mail-messages.html
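For anyone who does end up writing their own, here is a deliberately small sketch of the idea in plain PHP, with no PECL or imap extension. It is not a full MIME parser: it assumes well-formed boundaries, ignores charsets, and assumes every part has at least one header. The names splitEntity and leafParts are made up for illustration.

```php
<?php
// Hypothetical helper: split a raw entity (message or part) into
// [lower-cased header array, undecoded body string].
function splitEntity(string $raw): array
{
    $pieces = preg_split('/\r?\n\r?\n/', $raw, 2);
    // Unfold continuation lines (a line break followed by space or tab).
    $head = preg_replace('/\r?\n[ \t]+/', ' ', $pieces[0]);
    $headers = [];
    foreach (preg_split('/\r?\n/', $head) as $line) {
        if (strpos($line, ':') !== false) {
            [$name, $value] = explode(':', $line, 2);
            $headers[strtolower(trim($name))] = trim($value);
        }
    }
    return [$headers, $pieces[1] ?? ''];
}

// Hypothetical helper: return the decoded leaf parts of an entity, recursing
// into multipart/* containers and attached message/rfc822 messages.
function leafParts(string $raw): array
{
    [$headers, $body] = splitEntity($raw);
    $type = $headers['content-type'] ?? 'text/plain';

    if (stripos($type, 'multipart/') === 0
        && preg_match('/boundary="?([^";]+)"?/i', $type, $m)) {
        $parts  = [];
        $chunks = explode('--' . $m[1], $body);
        // The first chunk is the preamble; the closing delimiter leaves "--".
        foreach (array_slice($chunks, 1) as $chunk) {
            if (strncmp($chunk, '--', 2) === 0) {
                break;
            }
            // Drop the line break that belongs to the boundary line itself.
            $chunk = preg_replace('/\r?\n\z/', '', ltrim($chunk, "\r\n"));
            $parts = array_merge($parts, leafParts($chunk));
        }
        return $parts;
    }
    if (stripos($type, 'message/rfc822') === 0) {
        return leafParts($body); // the attached e-mail is a message in itself
    }

    $encoding = strtolower($headers['content-transfer-encoding'] ?? '7bit');
    if ($encoding === 'quoted-printable') {
        $body = quoted_printable_decode($body); // also rejoins soft line breaks
    } elseif ($encoding === 'base64') {
        $body = base64_decode($body);
    }
    return [['type' => $type, 'body' => $body]];
}

// Made-up sample: a report with the original message attached.
$inner = "Subject: Newsletter\r\n"
       . "Content-Type: multipart/alternative; boundary=\"INNER\"\r\n"
       . "\r\n"
       . "--INNER\r\n"
       . "Content-Type: text/html\r\n"
       . "Content-Transfer-Encoding: quoted-printable\r\n"
       . "\r\n"
       . "<a href=3D\"http://example.com/unsubscribe?id=3D12=\r\n"
       . "345\">Unsubscribe</a>\r\n"
       . "--INNER--\r\n";

$report = "Content-Type: multipart/report; boundary=\"OUTER\"\r\n"
        . "\r\n"
        . "--OUTER\r\n"
        . "Content-Type: text/plain\r\n"
        . "\r\n"
        . "This is a feedback loop report.\r\n"
        . "--OUTER\r\n"
        . "Content-Type: message/rfc822\r\n"
        . "\r\n"
        . $inner
        . "--OUTER--\r\n";

$parts = leafParts($report);
echo $parts[1]['body'], "\n";
```

The link that was split across two quoted-printable lines comes back in one piece, because quoted_printable_decode removes the soft line break ("=" at the end of a line).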

Update text on a website by an e-mail

My client is a restaurant that needs to change a paragraph section (<P>) every day for specials.
There are many people who will be handling it, so I have to make it as easy as possible.
I don't think teaching the whole staff how to use a CMS is feasible, so I thought it would be a good idea to make something like an email service, that only updates that bit of text.
So in other words the staff would just have to send an email, and the server would somehow change the text on the HTML page for that day.
Can I do this in PHP code maybe?
I am also open to other ideas to something easy, like a simple login system to just change that bit of text.
I wouldn't recommend setting the text by e-mail. E-mail is an ugly, UGLY format to process, especially if it is sent by humans using every type of broken e-mail client. Half of the e-mails will be invalid HTML, the other half will be tabulated unimaginably, the third half will contain signatures and there are so many more halves :)
And explaining the e-mail format you expect to the staff (utf-8 plain text with no signatures, etc...), and how to set it on their Outlook Expresses, Netscape Mails, and web clients you never even heard of, will be just as difficult as explaining a CMS.
What I would recommend is a simple form instead. When you open the form, the current text could show up in a text field, and upon posting back the form you save its contents on the server.
You would need to store this text somewhere. There are very few servers that host web applications without some form of database backend, so I'm pretty sure you have some kind of database to store your text in.
Also the form would need some kind of password protection. The easiest would be IMHO to password protect the folder where your php is. It's not too hard in Apache.
Check this link: http://www.groovypost.com/howto/htaccess-password-protect-apache-website-security/
I'm not familiar with your experience in PHP, but I hope you can make a form to edit a database record. If not, then please use google, there are tons of tutorials on it.
You could use the imap extension (http://php.net/manual/en/book.imap.php); it allows you to read emails from a mailbox. Usually, programmers create keywords that act like commands to the script. For example, if the subject of the email matches the pattern UPDATE pageID, the script processes the email body as the content for that page.
This script would run from crontab, the scheduler on Unix systems, so you can run it every hour to check for new mail.
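A rough sketch of the keyword check, assuming the command is the whole subject line and page IDs only contain letters, digits, underscores and hyphens. The function name parseUpdateCommand is made up, and fetching the subject itself (for example with the imap functions) is left out.

```php
<?php
// Hypothetical convention: a subject of the form "UPDATE <pageID>" means
// "use the e-mail body as the new content for that page".
function parseUpdateCommand(string $subject): ?string
{
    if (preg_match('/^\s*UPDATE\s+([A-Za-z0-9_-]+)\s*$/i', $subject, $m)) {
        return $m[1]; // the page ID
    }
    return null; // not a command; the script should ignore this message
}

var_dump(parseUpdateCommand('UPDATE specials'));
var_dump(parseUpdateCommand('Re: lunch on Friday'));
```

An hourly crontab entry for such a script could look like `0 * * * * php /path/to/check_mail.php`, where the path is a placeholder.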
Maybe your client could send an Excel sheet, and you could parse this email attachment on the server side with PHP.
https://code.google.com/p/php-excel/
One option is to use a blogging platform to post the latest specials. You could then use PHP to grab the RSS output (last feed item) and populate the website. This would take care of the form, log in and security part. It also gives the client a running history.
(if you want to go this route I can post an rss reader php script to help you out)
Alternatively, if you decide to go the email route, put the text between something like this:
<!-- PUT PARAGRAPH HERE -->
Here is today's specials.
<!-- /PUT PARAGRAPH HERE -->
It can be anything really, but bookending it with something constant you can search for in the string will help avoid many of the issues mentioned by @SoonDead above. PHP can convert it into something consistent, but you'll need some PHP knowledge to make it work.
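A small sketch of the search step, assuming the script already has the plain-text body of the e-mail in a string (an HTML mail would escape the markers, so this only covers plain text). The function name extractParagraph is made up.

```php
<?php
// Hypothetical helper: pull the text between the two constant markers,
// ignoring anything a mail client adds around them (greetings, signatures).
function extractParagraph(string $emailBody): ?string
{
    $pattern = '/<!-- PUT PARAGRAPH HERE -->(.*?)<!-- \/PUT PARAGRAPH HERE -->/s';
    if (!preg_match($pattern, $emailBody, $m)) {
        return null; // markers missing: leave the page unchanged
    }
    // Escape the text so stray "<" or "&" cannot break the page's HTML.
    return htmlspecialchars(trim($m[1]), ENT_QUOTES, 'UTF-8');
}

$body = "Hi,\n"
      . "<!-- PUT PARAGRAPH HERE -->\n"
      . "Soup & salad for lunch.\n"
      . "<!-- /PUT PARAGRAPH HERE -->\n"
      . "Sent from my phone\n";

echo extractParagraph($body), "\n";
```

Returning null when the markers are missing lets the calling script keep yesterday's text instead of publishing a broken message.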

Edit an email subject line (IMAP)

I am trying to integrate IMAP email processing with another in house system that bases what it uses off of the subject line / email content.
We need to be able to change the text of the subject line before moving the email to a new folder. What/where would be a good place to start?
I've had a look around and it IS possible in a manual sense, via a Thunderbird plugin or using Outlook. I just can't seem to find a relevant example in PHP, or any other language for that matter. I also hear the idea is flaky at best, as you need to modify the email content and upload it back to the IMAP server.
The outlook implementation seems to delete the original and save a new one to your IMAP folder on the server.
Side note: Yes, I know it is a weird requirement, and although forwarding the email to ourselves and then moving it is our fallback plan, it is not much liked, as it loses the original headers that are useful for things like reply-all.
Any suggestions appreciated.
PS If I'm blind and there is something obvious I'm missing in the manual let me know.
Do you already have any code built to handle the email processing? IMAP subject line information is stored as a header so you would need to utilize the PHP functions of imap_headerinfo() and/or imap_fetchheader() depending on the functionality you're looking for to achieve this. You could have PHP check each message header and if it matches X format, remove the message, and create a new one with the appropriately modified header information.
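A sketch of the "remove the message and create a new one" approach, assuming the imap extension is available and that the raw message can be rebuilt by joining imap_fetchheader() and imap_body(). The function names replaceSubject and moveWithNewSubject are made up; only the first one runs in the example, since the second needs a live IMAP connection.

```php
<?php
// Hypothetical helper: return a copy of a raw message with its Subject
// header replaced. Only the header block (everything before the first blank
// line) is touched, so the body and attachments stay identical.
function replaceSubject(string $raw, string $newSubject): string
{
    $head = $raw;
    $rest = '';
    if (preg_match('/\r?\n\r?\n/', $raw, $m, PREG_OFFSET_CAPTURE)) {
        $head = substr($raw, 0, $m[0][1]);
        $rest = substr($raw, $m[0][1]);
    }
    // Match the Subject line plus any folded continuation lines.
    $pattern = '/^Subject:[^\r\n]*(?:\r?\n[ \t][^\r\n]*)*/mi';
    $head = preg_replace_callback(
        $pattern,
        fn(array $match) => 'Subject: ' . $newSubject,
        $head,
        1,
        $count
    );
    if ($count === 0) {
        $head = 'Subject: ' . $newSubject . "\r\n" . $head; // no Subject yet
    }
    return $head . $rest;
}

// Sketch of the IMAP round trip (not called here). $imap is assumed to be an
// open connection from imap_open() and $mailbox the full mailbox string of
// the target folder.
function moveWithNewSubject($imap, int $msgNo, string $mailbox, string $subject): bool
{
    $raw = imap_fetchheader($imap, $msgNo) . imap_body($imap, $msgNo, FT_PEEK);
    if (!imap_append($imap, $mailbox, replaceSubject($raw, $subject))) {
        return false; // keep the original if the upload failed
    }
    imap_delete($imap, (string) $msgNo);
    return imap_expunge($imap);
}

$raw = "From: a@example.com\r\n"
     . "Subject: Old\r\n"
     . " title\r\n"
     . "To: b@example.com\r\n"
     . "\r\n"
     . "Body Subject: keep\r\n";

echo replaceSubject($raw, 'New');
```

All other headers survive unchanged, which is the point of doing this instead of forwarding. A subject with non-ASCII characters would additionally need MIME header encoding, which this sketch leaves out.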

Emails sometimes get scrambled

Folks,
I have a PHP-based site (using the QCubed framework); as a part of the site, I have a daemon that's sending out several thousand emails a day (no, I'm not a spammer, everything is opt-in :)). Emails are sent through a custom framework component; that component serves as an SMTP client. I'm using a paid SMTP gateway from DNSExit.com to get the emails actually delivered.
Those emails are simple HTML-based emails; they really have just simple links inside.
My issue is that these links sometimes (not consistently!) get scrambled in transit. Tags somehow get mixed up, and some links are non-functional in the email. The issue happens on a small percentage of all sent emails; it is not consistent (i.e. the same exact source message HTML may or may not get scrambled in transit).
Have any of you seen this? Any thoughts on how to troubleshoot?
Is it possible that you are using temp files to create the emails (or at minimum to create the variable content)? I did something vaguely similar once upon a time. The email text was generated and written to a temp file based on the exact time in seconds. Unfortunately, when sending thousands per day, we were hitting the same second more than once (since there are only 86k seconds available). That might explain a) the small error rate and b) the apparent randomness. For troubleshooting, I'd just see if the error rate increases with the number of emails and go from there.
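If temp files do turn out to be involved, one way to rule out the same-second collision is to let PHP pick the file name. This is only a sketch under that assumption; the function name writeMessageToTempFile is made up.

```php
<?php
// Hypothetical sketch: write each rendered e-mail to its own temp file.
// A name built from time() repeats for every message rendered in the same
// second; tempnam() creates a file with a name that is not already in use.
function writeMessageToTempFile(string $html): string
{
    $path = tempnam(sys_get_temp_dir(), 'mail_');
    if ($path === false) {
        throw new RuntimeException('Could not create a temporary file');
    }
    file_put_contents($path, $html);
    return $path;
}

$first  = writeMessageToTempFile('<a href="http://example.com/a">A</a>');
$second = writeMessageToTempFile('<a href="http://example.com/b">B</a>');

var_dump($first !== $second); // two messages, two different files

unlink($first);
unlink($second);
```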
I ran into a similar problem on a server running sendmail.
I was creating and testing an HTML email that would one day be mass mailed (opt-in, of course). I had a template for the email that was easy for any HTML programmer to read, but as such was heavy on the whitespace to line everything up correctly. I thought to myself, if this is going to be mass emailed, then after the template is rendered I will minimize the whitespace in the file to save on space! So I created a brilliant regular expression to strip any unnecessary whitespace from the rendered email.
Upon sending the email to myself, I opened the email and was baffled when I saw that some of the css and html were not showing up correctly, when my previous emails prior to my regexp were. By looking at the original message I noticed that every once in a while, an exclamation mark (!) was appearing seemingly randomly throughout the message, thus breaking any css and html that came in its random path.
Turns out that sendmail doesn't like it if a line in your email gets too long without a line break. When the line does get too long, sendmail will insert an exclamation mark followed by a line break right then and there, just to confuse and confound you.
Why did it not just choose a space between words to line break? Why insert the exclamation mark? Questions I'm afraid, without answers.
My solution?
sudo apt-get remove sendmail
sudo apt-get install exim4
I was having other problems with sendmail like it taking a full 60 seconds to send an email and exim4 just worked and I have never had to think about it again.
If your mail server is using sendmail, this very well could be the problem, if not, thank you for letting me share my story with you. I needed to vent.
When you're sending email you should encode it so that no line in the message body is longer than 76 characters. You could use base64 for this, but most systems use the quoted-printable encoding for text because it generates smaller messages. Base64 is usually only used for binary data.
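A minimal sketch of that encoding step using PHP's built-in quoted_printable_encode(). The HTML and URL are made-up sample data, and the sending code is assumed to add the matching Content-Transfer-Encoding header itself.

```php
<?php
// Quoted-printable keeps every encoded line within 76 characters by
// inserting soft line breaks ("=" followed by CRLF), which the receiving
// client removes again when it decodes the part.
$html = '<p><a href="http://example.com/unsubscribe?user=12345&list='
      . str_repeat('x', 120)
      . '">Unsubscribe</a></p>';

$encoded = quoted_printable_encode($html);

foreach (explode("\r\n", $encoded) as $line) {
    echo strlen($line), ': ', $line, "\n";
}

// The part must then be sent with the header:
//   Content-Transfer-Encoding: quoted-printable
```

For binary data, chunk_split(base64_encode($data)) produces base64 in 76-character lines in the same way.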
The problem is that HTML is not compatible with email. That is why I created Mail Markup Language.
HTML was created to operate with the HTTP protocol as those two technologies were invented by the same person at about the same time. The difference is that HTTP is a single session one way transfer from a server to a client. That never changes as the HTML document always originates on a server, is sent to a requesting client, and once the transfer completes the connection between the client and server is dropped.
Email does not behave in such a way. In email a communication originates at a client, is sent to one or more email servers, and then terminates at a distant client. The biggest difference, however, is that the document does not die with the finality of a single transmission instance, as is the case with a document transfer over HTTP. A document sent in SMTP can be replied to, forwarded, or copied to multiple unrequested users. This one difference is profound when an email thread is considered.
The problem is that SMTP and HTTP are different, as demonstrated in the prior two paragraphs. This difference is compounded in that SMTP and HTTP have radically different formatting methods for the creation of header data. HTML has header data that is intended to be compatible with the headers of HTTP transmissions and offers no compliance with SMTP transmissions. The HTML headers also do not account for the complexity of an email thread.
The problem is exemplified when email software corrupts a HTML document to add formatting changes necessary to fit the conforming demands of that software and to also write header data directly into the document. This exemplification becomes extremely pronounced when an HTML email becomes an email thread. Since the HTML header data has no method to account for the complexities of an email thread there is no way to supply relevant presentation definitions from a stylesheet that survive the transfer of the document. Each time a HTML document, or a document with HTML formatting, is sent from one email software to another the document is corrupted and each email software device corrupts the prior corruption. Email processing software may refer to either an email client, which certainly will corrupt a document, or an email server, that may only likely corrupt an email document.
The solution to the problem is to create a markup language convention that recognizes the requirements of email header data directly. Those requirements are defined in RFC 5321 for the SMTP protocol and RFC 5322 for the client processing. The only way to properly extend this solution to account for the complexities of an email thread is to provide a convention for a multi-agent DOM.
Paragraphs deleted due to technical inaccuracy and difference between the term multi-agent DOM and the nature of an invented feature not mentioned here even prior to the edit.
EDIT: a multi-agent DOM applies some degree of hierarchy, which may not be necessary to represent an email thread.
Had 2 problems with email data - usually "?" symbol somehow got inside some words, another was UTF and title related. First got "fixed" by changing hosting provider (so it was mail-server related) second one got fixed by changing PHPmailer library.
Try to specify how exactly data is scrambled.
Do you have any special attributes in your links? Maybe a title attribute with unescaped quotes inside?
Something like this: Link
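If unescaped quotes do turn out to be the cause, escaping every attribute value when the link is built avoids it. A small sketch; the function name buildLink and the sample values are made up.

```php
<?php
// Hypothetical helper: build a link whose attribute values are escaped, so a
// quote inside the title cannot end the attribute early and break the tag.
function buildLink(string $url, string $title, string $text): string
{
    return sprintf(
        '<a href="%s" title="%s">%s</a>',
        htmlspecialchars($url, ENT_QUOTES, 'UTF-8'),
        htmlspecialchars($title, ENT_QUOTES, 'UTF-8'),
        htmlspecialchars($text, ENT_QUOTES, 'UTF-8')
    );
}

echo buildLink('http://example.com/?a=1&b=2', 'Our "special" offer', 'Link'), "\n";
// prints: <a href="http://example.com/?a=1&amp;b=2" title="Our &quot;special&quot; offer">Link</a>
```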

PHP extracting body and attachments from piped email

I understand there are php IMAP functions to extract certain elements from an email stored in a mailbox. What I am trying to discover is whether this can translate to emails piped to a script.
The scripts that I have looked at for extracting the body and attachments are fairly inflexible and bulky. I sent my pipe script a variety of different email formats and it saved them in vastly different ways which makes me wary of starting to write a script myself.
Also as some of the emails sent from my work address attach a signature. Does anyone have any ideas how to combat this. I have a bunch of rather daft people who won't even understand the term 'don't add a signature when sending this email', or 'send in plain text only'.
AFAIK, the format for storing messages is not defined by any RFC; however, deliver, procmail and .forward all rely on the headers being separated from the body by a blank line.
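Relying on that blank line, a pipe script can at least separate headers from body without the imap functions. The sketch below does only that, plus an optional signature cut that assumes the sender's client uses the conventional "-- " delimiter line (many do not). It does not decode MIME parts or attachments. The function names are made up.

```php
<?php
// Hypothetical helper: split a piped message into a header array and the
// undecoded body, using the first blank line as the separator.
function splitPipedMessage(string $raw): array
{
    $pieces = preg_split('/\r?\n\r?\n/', $raw, 2);
    // Unfold continuation lines (a line break followed by space or tab).
    $head = preg_replace('/\r?\n[ \t]+/', ' ', $pieces[0]);
    $headers = [];
    foreach (preg_split('/\r?\n/', $head) as $line) {
        if (strpos($line, ':') === false) {
            continue; // not a "Name: value" line
        }
        [$name, $value] = explode(':', $line, 2);
        $headers[strtolower(trim($name))] = trim($value);
    }
    return ['headers' => $headers, 'body' => $pieces[1] ?? ''];
}

// Hypothetical helper: cut the body at the first line that is exactly "-- "
// (dash, dash, space), the usual signature delimiter.
function stripSignature(string $body): string
{
    $parts = preg_split('/^-- \r?$/m', $body, 2);
    return rtrim($parts[0]);
}

// In a real pipe script the message arrives on standard input:
//   $message = splitPipedMessage(stream_get_contents(STDIN));
$sample = "From: staff@example.com\n"
        . "Subject: Today's\n"
        . " specials\n"
        . "\n"
        . "Soup of the day.\n"
        . "-- \n"
        . "Sent from the office\n";

$message = splitPipedMessage($sample);
echo $message['headers']['subject'], "\n";
echo stripSignature($message['body']), "\n";
```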
