PHP: =0D, =A20 symbols - php

After obtaining info from an email body, I have a lot of symbols such as =0D, =A20, etc... How can I remove them? I do not want to use
$body = str_replace('=A20', '', $body);
because if the email body actually contains that it will be replaced.
Any ideas? Thanks!

Don't replace them with nothing: those characters aren't nothing, they are part of the text.
E-mail messages aren't always plain text; the body is often encoded. Those examples are part of the quoted-printable encoding, which you can identify by the
Content-Transfer-Encoding: quoted-printable
line in the headers of the e-mail message (or of the individual MIME part).
PHP has a built-in function to decode it: quoted_printable_decode().
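The built-in function for this is quoted_printable_decode(). Below is a minimal sketch of the decoding step, assuming you already have the body and the value of its Content-Transfer-Encoding header as strings. The helper name decodeBody and the sample strings are hypothetical, not taken from the question.

```php
<?php
// Hypothetical helper: decode a body according to its declared
// Content-Transfer-Encoding value (taken from the message headers).
function decodeBody(string $body, string $transferEncoding): string
{
    switch (strtolower(trim($transferEncoding))) {
        case 'quoted-printable':
            // Turns =XX escapes back into bytes and drops soft line breaks.
            return quoted_printable_decode($body);
        case 'base64':
            return (string) base64_decode($body);
        default:
            // 7bit, 8bit, binary: nothing to decode.
            return $body;
    }
}

// Made-up sample input: two escapes and one soft line break ("=" + CRLF).
$decoded = decodeBody("Hello=20world=0D=0Asoft-=\r\nwrapped", 'quoted-printable');

// A body that is not declared as quoted-printable is left alone,
// so literal text that merely looks like an escape survives.
$untouched = decodeBody('Price=A20', '7bit');
```

Because the decision is based on the declared encoding rather than on pattern matching, a message that really contains text such as "=A20" is not altered unless it was actually sent as quoted-printable.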

Related

Email Template in PHP

I am trying to add paragraphs to an email template, but I can't get it to work, and I'm confused because nl2br is not working in the email template.
Should I use \r\n, and where should I write it?
'email-message' => $data['mail-message']
The content type may be behind this issue. If your Content-Type is text/html, you need to use <br>. The Content-Type header is required; otherwise your e-mail will be interpreted as plain text. If you want to use \n, set Content-Type: text/plain, but then you lose any markup. So it is better to set the Content-Type to text/html and use <br>.
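As a rough sketch of the text/html approach, assuming the message text arrives as a plain string with \n line breaks: the helper name buildHtmlMail and the sample text are hypothetical, and the mail() call is left commented out because the recipient and subject are placeholders.

```php
<?php
// Hypothetical helper: returns the HTML body and the header string
// for an e-mail that should be rendered as HTML.
function buildHtmlMail(string $text): array
{
    // Escape the text first, then insert <br /> before each newline.
    $html = nl2br(htmlspecialchars($text, ENT_QUOTES, 'UTF-8'));

    $headers = implode("\r\n", [
        'MIME-Version: 1.0',
        'Content-Type: text/html; charset=UTF-8',
    ]);

    return [$html, $headers];
}

[$html, $headers] = buildHtmlMail("First paragraph.\n\nSecond paragraph.");

// Placeholder recipient and subject:
// mail('someone@example.com', 'Subject', $html, $headers);
```

Without the Content-Type header, the <br /> tags produced by nl2br would show up as literal text in most mail clients, which is one way nl2br can appear "not to work".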

PHP quoted_printable_decode not working on Fandango plain/text emails

I'm working to clean up emails before they get stored in a database. A Fandango email was sent with encoding type 4 (quoted-printable). Here is part of the message without decoding...
=0A=0A=A0=0AJohn=0A(800) 123-4567=0A=0A----- Forwarded Message =
=20=0ASent:=20Thursday,=20July=204,=202013=204:14=20PM=0ASubject:=20Your=20Despicab=
le=20Me=202=20iTunes=20Download=0A=20=0A=0A=0ADespicable=20Me=202=20=0A=20=20=0A=20Your=20purchase=20=
of=20tickets=20for=20Despicable=20Me=202=20has=20earned=20you=20a=20complimentary=20download=20of=20t= he=20song=20'Just=20a=20Cloud=20Away'=20by=20Pharrell=20from=20the=20Original=20Motion=20Picture=20So=
undtrack=20on=20iTunes.=20=0AWe=20hope=20you=20enjoy=20the=20song=20and=20the=20film!=0AIf=20you=20ha=
ve=20iTunes=20installed,=20click=20here=20to=20start=20your=20complimentary=20download.=0AIF=20=
YOU=20DO=20NOT=20HAVE=20iTunes=20INSTALLED:=0A=0A1.=20Download=20iTunes=20for=20Mac=20or=20Window=
s,=20free=20of=20charge=20at=20www.iTunes.com.=20=0A2.=20Open=20iTunes=20and=20click=20iTunes=20Sto=
re.=20=0A3.=20Click=20Redeem=20under=20Quick=20Links.=20=0A4.=20Enter=20the=20code=20below.=20Your=20= download=20will=20start=20immediately.=20Enjoy.=20=0ADownload=20Code:=20FML6H34XXTMJ=20=0AC=
But when I use quoted_printable_decode() on the variable it produces no text.
This url has a decoder that works, albeit in ASP/VB...
http://www.motobit.com/util/quoted-printable-decoder.asp
I'm guessing the code here is relevant...
http://www.motobit.com/tips/detpg_quoted-printable-decode/
It decodes the quoted-printable HTML above correctly. Hopefully this will help someone trying to help me. I'm sure I'm not the only one to encounter broken quoted-printable email messages.
It looks like there are some spaces in the quoted-printable encoded string that you posted. That's probably what's causing the problem: if it's truly quoted-printable, then the encoded string should not contain any spaces. Spaces are =20 in quoted-printable. If you use a replace function (e.g. PHP's str_replace) to replace the spaces in the encoded string with =20, then you get the following quoted-printable encoded string:
John=0D=0A(800)=20123-4567=0D=0A=0D=0A-----=20Forwarded=20Message=20
Then, this string can be decoded using PHP's quoted_printable_decode() function.
If you copy the quoted-printable encoded text above to a file, then run the following PHP script (which reads the quoted-printable text from the file, replaces the spaces with =20 using the str_replace function, then decodes the quoted-printable text using the quoted_printable_decode function), you should see that it produces the correct decoded output:
<?php
// Read the quoted-printable text from the file.
$filename = "./qp.txt";
$qp = file_get_contents($filename);
if ($qp === false) {
    die("Could not read $filename");
}
// Replace literal spaces with their quoted-printable form.
$qp = str_replace(" ", "=20", $qp);
// Legacy <plaintext> tag: the browser shows everything after it verbatim.
print "<plaintext>";
print quoted_printable_decode($qp);
?>

PHP - Special characters

I'm trying to send these characters through PHP:
áéíóúüchlñÁÉÍÓÚÜCLÑ
They show up in the received email like this:
áéÃóúüchlñÃÃÃÃÃ
I tried htmlentities but without success:
$newsubject = htmlentities($subject, ENT_COMPAT, "UTF-8");
mail($notes,$newsubject,$message,$headers);
Does anybody have an idea what I could try?
Thanks
I think you need to use MIME (Multipurpose Internet Mail Extensions).
Add the following to your mail headers:
MIME-Version: 1.0
Content-Type: text/plain;charset=utf-8
You are attempting to send them as UTF-8 but your PHP is handling them as latin-1.
Call utf8_encode on the input string to treat it as UTF-8 again.
EDIT: Misread the question. Add a header to the email you're sending:
Content-Type: text/plain; charset=utf-8
The character set of the characters themselves is wrong. Try this (on Windows): copy and paste those characters from a site served as UTF-8 back into your application. Make sure your document is saved as UTF-8 with the BOM signature disabled.
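The charset header suggested above covers the message body, but the subject line is itself a header and is usually encoded separately. Here is a minimal sketch, assuming the PHP source file is saved as UTF-8. The helper name encodeUtf8Header is hypothetical, and long subjects may need to be split into several encoded words, which this sketch does not do.

```php
<?php
// Hypothetical helper: wrap a UTF-8 header value as an RFC 2047
// "encoded-word" using base64, so non-ASCII characters survive transport.
function encodeUtf8Header(string $value): string
{
    return '=?UTF-8?B?' . base64_encode($value) . '?=';
}

// Assumes this source file is saved as UTF-8 (without BOM).
$subject = 'áéíóúüñ ÁÉÍÓÚÜÑ';

// Headers for the body, as suggested in the answers above.
$headers = "MIME-Version: 1.0\r\n"
         . "Content-Type: text/plain; charset=utf-8";

$encodedSubject = encodeUtf8Header($subject);

// $notes and $message are the variables from the question:
// mail($notes, $encodedSubject, $message, $headers);
```

htmlentities is not needed for this: HTML entities are only meaningful in an HTML body, not in a subject line or a text/plain message.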

Proper PHP way to parse email attachments from EML format

I have a file containing an email in "plain text MIME message format". I am not sure if this is the EML format. The email contains attachments, and I want to extract them and create those files again. This is what the attachment part looks like:
...
...
Receive, deliver details
...
...
From: sac ascsac <sacsac@sacascsac.ascsac>
Date: Thu, 20 Jan 2011 18:05:16 +0530
Message-ID: <AANLkTimmSL0iGW4rA3tvSJ9M3eT5yZLTGsqvCvf2fFC3@mail.gmail.com>
Subject: Test attachments
To: ascsacsa@ascsac.com
Content-Type: multipart/mixed; boundary=20cf3054ac85d97721049a465e12
--20cf3054ac85d97721049a465e12
Content-Type: multipart/alternative; boundary=20cf3054ac85d97717049a465e10
--20cf3054ac85d97717049a465e10
Content-Type: text/plain; charset=ISO-8859-1
hello this is a test mail. It contains two attachments
--20cf3054ac85d97717049a465e10
Content-Type: text/html; charset=ISO-8859-1
hello this is a test mail. It contains two attachments<br>
--20cf3054ac85d97717049a465e10--
--20cf3054ac85d97721049a465e12
Content-Type: text/plain; charset=US-ASCII; name="simple_test.txt"
Content-Disposition: attachment; filename="simple_test.txt"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_gj5n2yx60
aGVsbG8gd29ybGQKYWMgYXNj
...
encoded things here
...
ZyBmZyAKCjIKNDIzCnQ2Mwo=
--20cf3054ac85d97721049a465e12
Content-Type: application/x-httpd-php; name="oscomm_backup_code.php"
Content-Disposition: attachment; filename="oscomm_backup_code.php"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_gj5n5gxn1
PD9waHAKCg ...
...
encoded things here
...
X2xpbmsoRklMRU5BTUVfQkFDS1VQKSk7Cgo/Pgo=
--20cf3054ac85d97721049a465e12--
I can see that the part between X-Attachment-Id: f_gj5n2yx60 and ZyBmZyAKCjIKNDIzCnQ2Mwo= (both inclusive)
is the content of the first attachment. I want to parse those attachments (file names and contents) and create those files.
I got this file after parsing a dbx format file using a DBX Parser class available in PHP classes.
I searched in many places and did not find much discussion regarding this here on SO other than "Script to parse emails for attachments". Maybe I missed some terms while searching. That answer mentions:
you can use the boundries to extract
the base64 encoded information
But I am not sure which are the boundaries and how exactly to use the boundaries? There already must be some libraries or some well defined method of doing this. I guess I will commit many mistakes if I try reinventing the wheel here.
There's a PHP Mailparse extension; have you tried it?
The manual way would be to process the mail line by line. When you hit your first Content-Type header (this one in your example):
Content-Type: multipart/mixed; boundary=20cf3054ac85d97721049a465e12
You have the boundary. This string is used as the boundary between your multiple parts (that's why they call it multipart).
Every time a line starts with two dashes followed by this string, a new part begins. In your example:
--20cf3054ac85d97721049a465e12
Every part consists of headers, a blank line, and content. By looking at the headers of each part (Content-Type, Content-Disposition) you can determine which parts are attachments, what their type is, and their filename.
Read the whole content, strip the whitespace, base64_decode it, and you've got the binary contents of the file. Does this help?
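The manual steps above could be sketched roughly as follows. This is a simplified illustration, not a full MIME parser: the function name extractAttachments is hypothetical, it only looks at parts directly under the first boundary it finds, it expects the filename on the same line as Content-Disposition, and it handles base64 attachments only. For real mail, a dedicated parser such as the Mailparse extension is likely the sturdier choice.

```php
<?php
// Hypothetical helper: returns [filename => decoded bytes] for base64
// attachments found directly under the first multipart boundary.
function extractAttachments(string $raw): array
{
    // Find the first boundary parameter (quoted or unquoted).
    if (!preg_match('/boundary="?([^"\s;]+)"?/i', $raw, $m)) {
        return [];
    }

    $attachments = [];
    // Each part follows a delimiter made of "--" plus the boundary string.
    foreach (explode('--' . $m[1], $raw) as $part) {
        // Headers and content are separated by the first blank line.
        $pieces = preg_split('/\r?\n\r?\n/', ltrim($part, "\r\n"), 2);
        if (count($pieces) < 2) {
            continue;
        }
        [$head, $content] = $pieces;

        // Keep only parts marked as attachments with a filename.
        $pattern = '/^Content-Disposition:\s*attachment;.*filename="?([^"\r\n;]+)"?/mi';
        if (!preg_match($pattern, $head, $f)) {
            continue;
        }
        if (stripos($head, 'Content-Transfer-Encoding: base64') === false) {
            continue;
        }

        // Strip all whitespace, then decode to the original bytes.
        $attachments[$f[1]] = base64_decode(preg_replace('/\s+/', '', $content));
    }

    return $attachments;
}
```

Each returned entry could then be written to disk with file_put_contents(), ideally after sanitising the filename, since it comes from untrusted input.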
