Quoted-printable Email Subject lines not getting decoded by email clients - php

I've written a WordPress plugin which sends out new post notifications. There is a setting to convert subject lines from HTML entities to quoted-printable so they'll display in UTF-8 on any email client. A few weeks ago I started getting reports that the quoted-printable subject line was being kept as-is instead of being decoded.
Sample Subject header:
Subject: =?UTF-8?Q?[Pranamanasyoga]=20Foro=20Pranamanasyoga=20:=20estr?= =?UTF-8?Q?=C3=A9s=20y=20resilencia?=
I cannot replicate it locally and have not been able to find any common denominators between reporters.
The code that generates the quoted-printable line is this:
<?php
$enc = iconv_get_encoding( 'internal_encoding' ); // this is UTF-8
$preferences = ['input-charset' => $enc, 'output-charset' => "UTF-8", 'scheme' => 'Q' ];
$filtered_subject = '[Pranamanasyoga] Foro Pranamanasyoga : estrés y resilencia';
$encoded = iconv_mime_encode( 'Subject', html_entity_decode( $filtered_subject ), $preferences );
$encoded = substr( $encoded, strlen( 'Subject: ' ) );
If I try decoding it, it works fine:
$decoded = iconv_mime_decode($encoded, 0, "UTF-8");
var_dump(['encoded' => $encoded, 'decoded' => $decoded])."\n";
Result:
array(2) {
["encoded"]=>
string(102) "=?UTF-8?Q?[Pranamanasyoga]=20Foro=20Pranamanasyoga=20:=20estr?=
=?UTF-8?Q?=C3=A9s=20y=20resilencia?="
["decoded"]=>
string(59) "[Pranamanasyoga] Foro Pranamanasyoga : estrés y resilencia"
}
One thing I noticed, though I don't think it is related, is that my code actually adds a newline before the second =?UTF-8?Q? piece and the email subject header does not have it. Decoding the string works the same with and without the newline.
Does anyone have ideas/suggestions on what may be causing the email clients (Gmail included) to display the string as-is, instead of decoding it to UTF-8?
P.S. While writing this I saw a suggestion to use mb_encode_mimeheader() in a different thread. It seems to work well with iconv_mime_decode() in my test code, but the output string is indeed different from the original one:
[Pranamanasyoga] Foro Pranamanasyoga : =?UTF-8?Q?estr=C3=A9s=20y=20resile?=
=?UTF-8?Q?ncia?=
Could it be that email clients would prefer this format over the original one?
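For comparison, here is a minimal sketch of the mb_encode_mimeheader() variant mentioned in the P.S., round-tripped through iconv_mime_decode() the same way the question does. It assumes the mbstring and iconv extensions are loaded and that the source file is saved as UTF-8; it only shows how to produce and check the alternative encoding and says nothing about which form a given mail client accepts.

```php
<?php
// Sketch only: assumes the mbstring and iconv extensions are loaded.
mb_internal_encoding('UTF-8'); // mb_encode_mimeheader() reads its input in the internal encoding

$subject = '[Pranamanasyoga] Foro Pranamanasyoga : estrés y resilencia';

// 'Q' selects quoted-printable style encoded-words; long values are folded onto several lines.
$encoded = mb_encode_mimeheader($subject, 'UTF-8', 'Q');

// Round-trip check, done the same way as in the question.
$decoded = iconv_mime_decode($encoded, 0, 'UTF-8');

var_dump($encoded);
var_dump($decoded === $subject);
```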


PHP imap_search: UTF-8 / Non-ASCII characters on Microsoft Exchange mail servers

I want to fetch emails from outlook.office365.com using IMAP and PHP.
Since the most emails contain non-ASCII characters like äöü, I use UTF-8 in my imap_search() function:
imap_search($mbox_connection, 'ALL', SE_UID, "UTF-8")
With UTF-8 and the search criteria ALL I get all emails as expected. Now, I wanted to restrict it to for example only unseen (unread) emails:
imap_search($mbox_connection, 'UNSEEN', SE_UID, "UTF-8")
But this unfortunately causes the issue, that no emails can be found anymore - although there are unseen emails - and it also throws this PHP notice:
PHP Notice: Unknown: [BADCHARSET (US-ASCII)] The specified charset is not supported. (errflg=2) in Unknown on line 0
Based on this notice, I've changed the charset from UTF-8 to US-ASCII:
imap_search($mbox_connection, 'UNSEEN', SE_UID, "US-ASCII")
Now, it returns all expected unseen (unread) emails.
The problem now is that I can't search for emails with UTF-8 characters. For example, I have an email with this information:
From: Äpfel Nürnberg
Subject: Apfel vs. Äpfel
Body:
Einzahl gegen Mehrzahl.
Ein Apfel, mehrere Äpfel.
When I try to search for all emails with the subject "apfel" it works as expected - I can find the email:
imap_search($mbox_connection, 'FROM "apfel"', SE_UID, "US-ASCII")
Trying to connect to '{outlook.office365.com:993/imap/ssl}INBOX'...
Found 1 email(s)...
+------ P A R S I N G ------+
From: =?iso-8859-1?Q?=C4pfel=20N=FCrnberg?= <=?iso-8859-1?Q?=C4pfel=20N=FCrnberg?= <aepfel@nuernberg.de>>
Subject: =?iso-8859-1?Q?Apfel_vs._=C4pfel?=
But when I instead search for the word with the UTF-8 character (in this case äpfel), it does NOT find the email:
imap_search($mbox_connection, 'FROM "äpfel"', SE_UID, "US-ASCII")
Due to this fact, I've changed back the charset from US-ASCII to UTF-8, but this only ends again at the error message [BADCHARSET (US-ASCII)].
My code is very simple:
$mailbox = "{outlook.office365.com:993/imap/ssl}INBOX";
$mailbox_username = "someone@outlook.com";
$mailbox_password = "*******";
echo "Trying to connect to '$mailbox'...\n";
$mbox_connection = imap_open($mailbox, $mailbox_username, $mailbox_password);
$mailsIds = imap_search($mbox_connection, 'SUBJECT "äpfel"', SE_UID, "UTF-8");
if(!$mailsIds) {
echo "No emails found!\n";
imap_close($mbox_connection);
die();
}
echo "Found " . count($mailsIds) . " email(s)...\n";
foreach($mailsIds as $mailId) {
echo "+------ P A R S I N G ------+\n";
$headersRaw = imap_fetchheader($mbox_connection, $mailId, FT_UID);
$header = imap_rfc822_parse_headers($headersRaw);
echo "From: " . $header->from[0]->personal . " <" . $header->fromaddress . ">\n";
echo "Subject: " . $header->subject . "\n";
}
I've already tried this solution, but this returns also no matching email:
$str = "äpfel";
$str = preg_replace('/\=\?ISO\-8859\-1\?Q\?/i', '', mb_encode_mimeheader($str, "ISO-8859-1", "Q"));
$mailsIds = imap_search($mbox_connection, 'SUBJECT "'.$str.'"', SE_UID, 'US-ASCII');
Any ideas, how I can search for non-ASCII characters in the email fields From, Subject and Body when the IMAP server does not support UTF-8 and I also can NOT change this on server-side configuration?
This seems to be an issue with all Microsoft Exchange servers. As far as I could find out via Google, only those servers have this issue.
You probably can't.
Exchange doesn't seem to implement charset aware searching for IMAP, and doing so is not a requirement of RFC3501 (only US-ASCII must be supported). UTF-8 is usually supported, but this does not seem to be the case for Exchange.
You would have to switch protocols (EAS, EWS, REST services, etc.) or pull down the information, decode it yourself, and search it. If you cache it, this isn't even too bad long term. Since it's headers, you can get this all in one fetch. If you need to search bodies, the case is much harder.
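The "pull it down, decode it yourself, and search it" route could look roughly like the sketch below. headerContains() is a hypothetical helper, not part of the IMAP extension; it assumes the iconv and mbstring extensions are loaded and that the source file is saved as UTF-8. The raw subject is the one shown in the question's output.

```php
<?php
// Sketch only: assumes the iconv and mbstring extensions are loaded.

/**
 * Decode a raw (possibly RFC 2047 encoded) header value to UTF-8 and
 * test whether it contains $needle, ignoring case.
 * headerContains() is a hypothetical helper name.
 */
function headerContains(string $rawHeader, string $needle): bool
{
    $decoded = iconv_mime_decode($rawHeader, ICONV_MIME_DECODE_CONTINUE_ON_ERROR, 'UTF-8');
    if ($decoded === false) {
        $decoded = $rawHeader; // fall back to the raw value if decoding fails
    }
    return mb_stripos($decoded, $needle, 0, 'UTF-8') !== false;
}

// Raw subject exactly as the server returned it in the question.
$raw = '=?iso-8859-1?Q?Apfel_vs._=C4pfel?=';

var_dump(headerContains($raw, 'äpfel'));
var_dump(headerContains($raw, 'birne'));
```

In the question's loop, the value of $header->subject could be passed to such a helper after running imap_search() with a criterion the server does accept (for example 'UNSEEN' with US-ASCII), so the non-ASCII matching happens on the client.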

Encoding Issue with Gmail SMTP and Zend\Mail\Message- Zend Framework 2

I need to write accented letters into an email body, but the UTF-8 encoding doesn't work. In the Gmail settings I've selected the option "Use Unicode (UTF-8) for outgoing messages".
I'm using Gmail SMTP and Zend\Mail\Message. I tried 4 different methods, but none of them works.
complete function:
public function sendRegistrationEmail(){
$message = new Message();
$message->addTo($this->email)
->addFrom(self::FROM)
->setSubject($this->subject)
->setEncoding('UTF-8')
->setBody('àèéòù');
$transport = new SmtpTransport();
$options = new SmtpOptions($this->smtp);
$transport->setOptions($options);
$transport->send($message);
}
1:
->setBody('àèéòù');
output: à èéòù
2:
->setBody(utf8_encode('àèéòù'));
output: àèéòù
3:
->setEncoding('UTF-8')
->setBody('àèéòù');
output: à èéòù
4:
->setEncoding('UTF-8')
->setBody(utf8_encode('àèéòù'));
output:àèéòù
I also tried selecting "Avoid Unicode (UTF-8) encoding for outgoing messages" in the Gmail settings, but the results are the same! What am I doing wrong? Thanks for the help!
I found a solution here:
http://framework.zend.com/manual/current/en/modules/zend.mail.message.html
If I need to use html email:
$html = new MimePart($htmlMarkup);
$html->type = "text/html; charset = UTF-8";
else, pure text email:
$text = new MimePart($textContent);
$text->type = "text/plain; charset = UTF-8";
$body = new MimeMessage();
$body->setParts(array($text or $html));
.....rest of message instance....
->setBody($body);
I don't understand why setEncoding('UTF-8') doesn't work. Are there other solutions?

How to parse japanese char (utf8?) from imap_fetchbody?

I am pulling down an email which contains English, Chinese and Japanese text.
I was using PHP/EZComponents to do this, but a certain Japanese character was just not coming through, so I am switching to the PHP imap_* functions to see if they will work.
This is what I have below, and the output I am getting. I need to decode this somehow... I know this has been well (read: overly/chaotically) documented all over the web, but I don't have time to earn a PhD in this right now. Any help is greatly appreciated.
$hn='{imap.gmail.com:993/imap/ssl}INBOX';
$inbox = imap_open($hn,$username,$password,CL_EXPUNGE);
foreach($emails as $email_number) {
$ov = imap_fetch_overview($inbox,$email_number,0);
$msg = imap_fetchbody($inbox,$email_number,2);
var_dump($msg);
// doesnt work... .. but right idea?
// var_dump( utf8_decode($msg) );
}
PARTIAL OUTPUT:
<font face=3D"Arial"><span lang=3D"EN-US" style=3D"font-size:10.5pt"><br></=
span></font><font color=3D"navy" face=3D"MS Gothic"><span lang=3D"JA" style=
=3D"font-size:10.5pt">=CC=EC=9A=DD=A4=AC=A4=A4=A4=A4=A4=AB=A4=E9=A1=A2</spa=
n></font></p><p style=3D"margin-right:0pt;margin-bottom:12pt;margin-left:0p=
t">
<font color=3D"navy" face=3D"MS Gothic"><span lang=3D"JA" style=3D"font-siz=
e:10.5pt"><br></span></font></p><p style=3D"margin-right:0pt;margin-bottom:=
12pt;margin-left:0pt"><font color=3D"navy" face=3D"MS Gothic"><span lang=3D=
"JA" style=3D"font-size:10.5pt">xxend</span></font></p>
I also ran into this problem when using the imap_fetchbody function to get the mail body.
I found that the string returned by imap_fetchbody was a quoted-printable encoded string.
I resolved this by passing the fetched body through the imap_qprint function to convert it into the correct string.
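A minimal sketch of that decoding step follows. It uses quoted_printable_decode() from core PHP, which serves the same purpose as imap_qprint(), and then converts the bytes to UTF-8. decodeQpPart() is a hypothetical helper, and the charset argument is an assumption: the real value has to come from the Content-Type of the message part, which the excerpt in the question does not show. The sample input is made up for illustration.

```php
<?php
// Sketch only: assumes the mbstring extension is loaded.

/**
 * Undo the quoted-printable transfer encoding, then convert the bytes
 * from the part's declared charset to UTF-8.
 * decodeQpPart() is a hypothetical helper name.
 */
function decodeQpPart(string $body, string $charset): string
{
    $bytes = quoted_printable_decode($body); // also removes "=" soft line breaks
    return mb_convert_encoding($bytes, 'UTF-8', $charset);
}

// Made-up ISO-8859-1 sample, including a soft line break.
echo decodeQpPart("=C4pfel und N=FC=\r\nsse", 'ISO-8859-1'), "\n";
```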

Stripping received email headers in PHP

I have a PHP script which has the source of an email. I aim to split the headers into variables $To and $From
The issue comes when trying to split the to and from strings up. What I need the script to do is take
From: John <john@somesite.com>
To: Susy <susy@mysite.com>, Steven <steven@somesite.com>, Mary <mary@mysite.com>
and return only the from address and the to addresses which are on my site. I.e.
$From = 'john@somesite.com';
$To = array('susy@mysite.com', 'mary@mysite.com');
So the code needs to turn a string of email addresses into an array and then filter out the ones from other sites. It's the first part that is proving difficult because of the different ways an email address can be listed in a header.
Edit
As you've now specified that you have the headers as a string but you actually need to parse the addresses from it, there is no need to reinvent the wheel:
imap_rfc822_parse_headers
imap_rfc822_parse_adrlist
These two functions will do the job for you, the last one will give you an array with objects that have the email addresses pre-parsed, so you can easily take decisions based on the host.
It was not specifically clear to me what your actual problem is from your question.
As long as you are concerned about filtering a string containing one email address (cast it to array) or an array containing one or multiple addresses:
To filter the existing array of email addresses you can use a simple array mapping function that sets any email not matching your site's host to FALSE, and then filter the array copy:
$addresses = array(
    'mary@mysite.com',
    'mary@othersite.com',
);
$myhost = 'mysite.com';
$filtered = array_map(function($email) use ($myhost) {
    $host = '@'.$myhost;
    $is = substr($email, -strlen($host)) === $host;
    return $is ? $email : FALSE;
}, $addresses);
$filtered = array_filter($filtered);
print_r($filtered);
This code makes the assumption that you have the email addresses already gathered. You have not specified in your question how you parse the headers, so it's actually unknown what data you are dealing with, and I opted to start from the end of your problem. Let us know if you have more information available.
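To connect the two halves of this answer, here is a rough sketch of filtering the objects that imap_rfc822_parse_adrlist() returns (each has mailbox and host properties). filterByHost() is a hypothetical helper, and the stand-in objects below are made up so the sketch runs without the IMAP extension.

```php
<?php
// Sketch only. filterByHost() is a hypothetical helper; it expects objects
// shaped like those returned by imap_rfc822_parse_adrlist().
function filterByHost(array $parsed, string $myhost): array
{
    $result = [];
    foreach ($parsed as $addr) {
        // Group syntax or broken entries may lack a mailbox or host; skip them.
        if (isset($addr->mailbox, $addr->host) && strcasecmp($addr->host, $myhost) === 0) {
            $result[] = $addr->mailbox . '@' . $addr->host;
        }
    }
    return $result;
}

// With the IMAP extension loaded, the input would come from something like:
// $parsed = imap_rfc822_parse_adrlist($toHeaderValue, 'mysite.com');
// Stand-in objects (assumed data) so the sketch runs without the extension:
$parsed = [
    (object) ['mailbox' => 'susy',   'host' => 'mysite.com',   'personal' => 'Susy'],
    (object) ['mailbox' => 'steven', 'host' => 'somesite.com', 'personal' => 'Steven'],
    (object) ['mailbox' => 'mary',   'host' => 'mysite.com',   'personal' => 'Mary'],
];

print_r(filterByHost($parsed, 'mysite.com'));
```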
<?php
$k= "......Subject: Write the program any of your favorite language whenever if you feel
you are free
From: Vinay Kumar <vinaykumarjg@gmail.com>
To: msnjsk@gmail.com, mithunsatish@gmail.com,Susy <susy@mysite.com>, Steven <steven@somesite.com>, Mary <mary@mysite.com>
Content-Type: multipart/alternative; boundary=bcaec53964ec5eed2604acd0e09a
--bcaec53964ec5eed2604acd0e09a
Content-Type: text/plain; charset=ISO-8859-1
.......";
if(preg_match('/From:(?P<text>.+)\r\n/', $k, $matches1))
{
if(preg_match('/(?P<from>([a-z0-9])(([-a-z0-9._])*([a-z0-9]))*@([a-z0-9])' .'(([a-z0-9-])*([a-z0-9]))+' . '(\.([a-z0-9])([-a-z0-9_-])?([a-z0-9])+)+)/', $matches1['text'],$sender ))
{
print_r($sender['from']);
}
}
if(preg_match('/To:(?P<text>.+)\r\n/', $k, $matches2))
{
if(preg_match_all('/(?P<to>([a-z0-9])(([-a-z0-9._])*([a-z0-9]))*@([a-z0-9])' .
'(([a-z0-9-])*([a-z0-9]))+' . '(\.([a-z0-9])([-a-z0-9_-])?([a-z0-9])+)+)/', $matches2['text'], $reciever))
{
if(isset($reciever['to']))
{
print_r($reciever['to']);
}
}
}
to get the subject:
if(preg_match('/Subject:(?P<subject>.+)\r\n/', $k, $subject))
{
print_r($subject['subject']);
}
preg_match_all("/<([^><]+)/", $headers, $matches);
print_r($matches[1]);
Output:
Array
(
[0] => john@somesite.com
[1] => susy@mysite.com
[2] => steven@somesite.com
[3] => mary@mysite.com
)
The first entry is always the From email address.

json_decode returns NULL after webservice call [duplicate]

There is a strange behaviour with json_encode and json_decode and I can't find a solution:
My php application calls a php web service. The webservice returns json that looks like this:
var_dump($foo):
string(62) "{"action":"set","user":"123123123123","status":"OK"}"
now I like to decode the json in my application:
$data = json_decode($foo, true)
but it returns NULL:
var_dump($data):
NULL
I use php5.
The Content-Type of the response from the webservice: "text/html; charset=utf-8" (also tried to use "application/json; charset=utf-8")
What could be the reason?
Well, I had a similar issue and the problem was PHP magic quotes on the server. Here is my solution:
if(get_magic_quotes_gpc()){
$param = stripslashes($_POST['param']);
}else{
$param = $_POST['param'];
}
$param = json_decode($param,true);
EDIT:
Just did some quick inspection of the string provided by the OP. The small "character" in front of the curly brace is a UTF-8 B(yte) O(rder) M(ark) 0xEF 0xBB 0xBF. I don't know why this byte sequence is displayed as  here.
Essentially the system you acquire the data from sends it encoded in UTF-8 with a BOM preceding the data. You should remove the first three bytes from the string before you pass it to json_decode() (a substr($string, 3) will do).
string(62) "{"action":"set","user":"123123123123","status":"OK"}"
^
|
This is the UTF-8 BOM
As Kuroki Kaze discovered, this character surely is the reason why json_decode fails. The string in its given form is not a correctly formatted JSON structure (see RFC 4627).
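A small sketch of that fix, which strips the BOM only when it is actually present instead of always cutting three bytes. decodeJsonWithoutBom() is a hypothetical helper name, and the sample string is rebuilt by hand from the question's data.

```php
<?php
// Sketch only. decodeJsonWithoutBom() is a hypothetical helper.
function decodeJsonWithoutBom(string $raw)
{
    $bom = "\xEF\xBB\xBF"; // UTF-8 byte order mark
    if (strncmp($raw, $bom, 3) === 0) {
        $raw = substr($raw, 3);
    }
    return json_decode($raw, true);
}

// The question's payload with a BOM in front of it.
$foo = "\xEF\xBB\xBF" . '{"action":"set","user":"123123123123","status":"OK"}';

var_dump(json_decode($foo, true));    // decoding the raw string fails
var_dump(decodeJsonWithoutBom($foo)); // decoding after removing the BOM
```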
Print the last json error when debugging.
json_decode( $so, true, 9 );
$json_errors = array(
JSON_ERROR_NONE => 'No error has occurred',
JSON_ERROR_DEPTH => 'The maximum stack depth has been exceeded',
JSON_ERROR_CTRL_CHAR => 'Control character error, possibly incorrectly encoded',
JSON_ERROR_SYNTAX => 'Syntax error',
);
echo 'Last error : ', $json_errors[json_last_error()], PHP_EOL, PHP_EOL;
Also use JavaScript's JSON.stringify() function to double-check your JSON syntax.
None of the solutions above worked for me, but html_entity_decode($json_string) did the trick
Try this
$foo = utf8_encode($foo);
$data = json_decode($foo, true);
Make sure that, if you sent the data by POST / GET, the server has not escaped the quotes:
$my_array = json_decode(str_replace ('\"','"', $json_string), true);
"{"action":"set","user":"123123123123","status":"OK"}"
This little apostrophe in the beginning - what is it? First symbol after the doublequote.
I had a similar problem on a live site, while on my local site it was working fine. To fix it I just added the code below:
json_decode(stripslashes($_GET['arr']));
I just put this
$result = mb_convert_encoding($result,'UTF-8','UTF-8');
$result = json_decode($result);
and it's working
Yesterday I spent 2 hours checking and fixing that error, and finally I found that the JSON string I wanted to decode contained '\' backslashes. So the logical thing to do is to use the stripslashes function, or something similar in a different programming language.
Of course the best way is still to print the variable and see what it becomes after json_decode. If it is null, you can also use the json_last_error() function to determine the error. It returns an integer; here is what those integers mean:
0 = JSON_ERROR_NONE
1 = JSON_ERROR_DEPTH
2 = JSON_ERROR_STATE_MISMATCH
3 = JSON_ERROR_CTRL_CHAR
4 = JSON_ERROR_SYNTAX
5 = JSON_ERROR_UTF8
In my case json_last_error() returned 4, so it was JSON_ERROR_SYNTAX. Then I took a look at the string itself which I wanted to convert, and its last line contained:
'\'title\' error ...'
After that is really just an easy fix.
$json = json_decode(stripslashes($response));
if (json_last_error() === JSON_ERROR_NONE) {
    // you've got an object in $json
}
I had this problem when storing a JSON string in MySQL.
I don't really know why, but using htmlspecialchars_decode before json_decode resolved the problem.
None of these solutions worked for me.
What DID eventually work was checking the string encoding by saving it to a local file and opening with Notepad++.
I found out it was 'UTF-16', so I was able to convert it this way:
$str = mb_convert_encoding($str,'UTF-8','UTF-16');
Maybe your string contains sequences such as $ or ${: these characters should be escaped.
I was having this problem when calling a SOAP method to obtain my data and then returning a JSON string: when I tried to json_decode it, I just kept getting null.
When I skipped the nusoap call and just returned a JSON string, json_decode worked. Since I really needed to get my data with a SOAP call, what I did was add ob_start() before including nusoap, make my call, generate the JSON string, and then call ob_end_clean() before returning the JSON string. That fixed my problem :)
EXAMPLE
//HRT - SIGNED
//20130116
// checks whether a DECO member number is valid
ob_start(); // buffer any stray output produced while nusoap is loaded and called
require('/nusoap.php');
$aResponse['SimpleIsMemberResult']['IsMember'] = FALSE;
if(!empty($iNumAssociadoTmp))
{
try
{
$client = new soapclientNusoap('PartnerService.svc?wsdl',
array(
// OPTS
'trace' => 0,
'exceptions' => false,
'cache_wsdl' => WSDL_CACHE_NONE
)
);
// message to send
$sMensagem1 = '
<SimpleIsMember>
<request>
<CheckDigit>'.$iCheckDigitAssociado.'</CheckDigit>
<Country>Portugal</Country>
<MemberNumber>'.$iNumAssociadoDeco.'</MemberNumber>
</request>
</SimpleIsMember>';
$aResponse = $client->call('SimpleIsMember',$sMensagem1);
$aData = array('dados'=>$aResponse->xpto, 'success'=>$aResponse->example);
}
catch (Exception $e)
{
// handle or log the failure here
}
}
ob_end_clean(); // discard the buffered output so that only the JSON is returned
return json_encode($aData);
I don't know why, but this works:
$out = curl_exec($curl);
$out = utf8_encode($out);
$out = str_replace("?", "", $out);
if (substr($out,1,1)!='{'){
$out = substr($out,3);
}
$arResult["questions"] = json_decode($out,true);
Without utf8_encode() it doesn't work.
Check the encoding of your file. I was using netbeans and had to use iso windows 1252 encoding for an old project and netbeans was using this encoding since then for every new file. json_decode will then return NULL. Saving the file again with UTF-8 encoding solved the problem for me.
In Notepad++, select Encoding (from the top menu) and then ensure that "Encode in UTF-8" is selected.
This will display any characters that shouldn't be in your json that would cause json_decode to fail.
Try using json_encode on the string prior to using json_decode... idk if will work for you but it did for me... I'm using laravel 4 ajaxing through a route param.
$username = "{username: john}";
public function getAjaxSearchName($username)
{
$username = json_encode($username);
die(var_dump(json_decode($username, true)));
}
You should try out json_last_error_msg(). It will give you the error message and tell you what is wrong. It was introduced in PHP 5.5.
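$foo = '{"action":"set","user":"123123123123","status":"OK"}';
$data = json_decode($foo, true);
// json_decode() also returns null for the valid JSON literal "null", so check the error state too
if ($data === null && json_last_error() !== JSON_ERROR_NONE) {
throw new Exception('Decoding JSON failed with the following message: '
. json_last_error_msg());
}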
// ... JSON decode was good => Let's use the data
Before applying PHP related solutions, validate your JSON format. That may be the problem. Try below online JSON format validator. If your JSON format is invalid, correct it first, because PHP doesn't decode invalid JSON strings.
https://jsonformatter.org/
Laravel specific answer:
I got the same issue in Laravel. And this did the trick for me
$result = json_decode($result->getContent(), true);
In my case, when I printed the JSON to the screen it looked fine, and when I copied that output and decoded it with json_decode() it worked. But when I passed the JSON string directly to the function, it returned null because the quotes were arriving as &quot; entities. So I used the htmlspecialchars_decode() function and now it works fine.
I am new here, so if I am making any mistakes in writing answer then sorry for that. I hope it'll help somebody.
Sometimes the problem is generated when the content is compressed, so adding the Accept-Encoding: identity header can solve the problem without having to wrangle with the response.
$opts = array(
'http' =>
array(
'header' =>
array(
'Accept-Encoding: identity',
),
),
);
$context = stream_context_create($opts);
$contents = file_get_contents('URL', false, $context);
i had a similar problem, got it to work after adding '' (single quotes) around the json_encode string. Following from my js file:
var myJsVar = <?php echo json_encode($var); ?> ; -------> NOT WORKING
var myJsVar = '<?php echo json_encode($var); ?>' ; -------> WORKING
just thought of posting it in case someone stumbles upon this post like me :)
