imap_search and subject command - php

Hello to all and happy holidays!
I have some code that connects to my mail inbox. Here it is:
$host = '{'.SMTP_HOST.':143/novalidate-cert}INBOX';
$entrada = imap_open($host, SMTP_USER , SMTP_PASS);
$emails_mejora = imap_search($entrada, 'SUBJECT "Envíanos el tamaño de la imágen"', SE_UID, 'UTF-8');
The subject contains UTF-8 characters and the search returns 0 results. Subjects without UTF-8 characters work fine...
Any help, please.
Thanks ;)

@sergio, use header("Content-type: text/html; charset=UTF-8"); at the top of your PHP file to set the character set to UTF-8.
You should also use the collation utf8_general_ci on the database table field if you store special characters.

Related

PHP imap_search: UTF-8 / Non-ASCII characters on Microsoft Exchange mail servers

I want to fetch emails from outlook.office365.com using IMAP and PHP.
Since most emails contain non-ASCII characters like äöü, I use UTF-8 in my imap_search() function:
imap_search($mbox_connection, 'ALL', SE_UID, "UTF-8")
With UTF-8 and the search criteria ALL I get all emails as expected. Now, I wanted to restrict it to for example only unseen (unread) emails:
imap_search($mbox_connection, 'UNSEEN', SE_UID, "UTF-8")
Unfortunately, this finds no emails at all, although there are unseen emails, and it also throws this PHP notice:
PHP Notice: Unknown: [BADCHARSET (US-ASCII)] The specified charset is not supported. (errflg=2) in Unknown on line 0
Based on this notice, I've changed the charset from UTF-8 to US-ASCII:
imap_search($mbox_connection, 'UNSEEN', SE_UID, "US-ASCII")
Now, it returns all expected unseen (unread) emails.
The problem now is that I can't search for emails with UTF-8 characters. For example, I have an email with this information:
From: Äpfel Nürnberg
Subject: Apfel vs. Äpfel
Body:
Einzahl gegen Mehrzahl.
Ein Apfel, mehrere Äpfel.
When I search for all emails with "apfel" in the From field, it works as expected and I can find the email:
imap_search($mbox_connection, 'FROM "apfel"', SE_UID, "US-ASCII")
Trying to connect to '{outlook.office365.com:993/imap/ssl}INBOX'...
Found 1 email(s)...
+------ P A R S I N G ------+
From: =?iso-8859-1?Q?=C4pfel=20N=FCrnberg?= <=?iso-8859-1?Q?=C4pfel=20N=FCrnberg?= <aepfel@nuernberg.de>>
Subject: =?iso-8859-1?Q?Apfel_vs._=C4pfel?=
But when I instead search for the word with the UTF-8 character (in this case äpfel), it does NOT find the email:
imap_search($mbox_connection, 'FROM "äpfel"', SE_UID, "US-ASCII")
Due to this fact, I've changed back the charset from US-ASCII to UTF-8, but this only ends again at the error message [BADCHARSET (US-ASCII)].
My code is very simple:
$mailbox = "{outlook.office365.com:993/imap/ssl}INBOX";
$mailbox_username = "someone@outlook.com";
$mailbox_password = "*******";
echo "Trying to connect to '$mailbox'...\n";
$mbox_connection = imap_open($mailbox, $mailbox_username, $mailbox_password);
$mailsIds = imap_search($mbox_connection, 'SUBJECT "äpfel"', SE_UID, "UTF-8");
if(!$mailsIds) {
echo "No emails found!\n";
imap_close($mbox_connection);
die();
}
echo "Found " . count($mailsIds) . " email(s)...\n";
foreach($mailsIds as $mailId) {
echo "+------ P A R S I N G ------+\n";
$headersRaw = imap_fetchheader($mbox_connection, $mailId, FT_UID);
$header = imap_rfc822_parse_headers($headersRaw);
echo "From: " . $header->from[0]->personal . " <" . $header->fromaddress . ">\n";
echo "Subject: " . $header->subject . "\n";
}
I've already tried this solution, but it also returns no matching email:
$str = "äpfel";
$str = preg_replace('/\=\?ISO\-8859\-1\?Q\?/i', '', mb_encode_mimeheader($str, "ISO-8859-1", "Q"));
$mailsIds = imap_search($mbox_connection, 'SUBJECT "'.$str.'"', SE_UID, 'US-ASCII');
Any ideas on how I can search for non-ASCII characters in the email fields From, Subject and Body when the IMAP server does not support UTF-8 and I can NOT change the server-side configuration?
This seems to be an issue with all Microsoft Exchange servers. As far as I could find out via Google, only those servers have this issue.
You probably can't.
Exchange doesn't seem to implement charset-aware searching for IMAP, and doing so is not a requirement of RFC 3501 (only US-ASCII must be supported). UTF-8 is usually supported, but this does not seem to be the case for Exchange.
You would have to switch protocols (EAS, EWS, REST services, etc.) or pull down the information, decode it yourself, and search it. If you cache it, this isn't even too bad long term. Since it's headers, you can get this all in one fetch. If you need to search bodies, the case is much harder.
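The "pull it down, decode it yourself, and search it" route from the answer above could look roughly like the sketch below. It is a minimal illustration, not a tested Exchange workaround: the helper name headerContains() is hypothetical, it assumes the iconv and mbstring extensions are loaded, and it assumes the raw header values have already been fetched (for example from the subject and from properties of the objects returned by imap_fetch_overview()).

```php
<?php
// Hypothetical helper: decode a raw (RFC 2047 encoded) header value to UTF-8
// and run a case-insensitive substring search on the decoded text.
function headerContains(string $rawHeader, string $needle): bool
{
    // Turns e.g. "=?iso-8859-1?Q?Apfel_vs._=C4pfel?=" into UTF-8 text.
    $decoded = iconv_mime_decode($rawHeader, ICONV_MIME_DECODE_CONTINUE_ON_ERROR, 'UTF-8');
    if ($decoded === false) {
        // Fall back to the raw value if the header cannot be decoded.
        $decoded = $rawHeader;
    }
    return mb_stripos($decoded, $needle, 0, 'UTF-8') !== false;
}

// Raw subject copied from the output shown in the question.
$rawSubject = '=?iso-8859-1?Q?Apfel_vs._=C4pfel?=';
var_dump(headerContains($rawSubject, "\u{e4}pfel")); // search for "äpfel"
var_dump(headerContains($rawSubject, 'birne'));
```

In a real script, one would loop over the fetched overview entries, call the helper on each subject or sender, and collect the UIDs of the matches. Caching the decoded headers keeps repeated searches cheap.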

Unable to retrieve UTF-8 accented characters from Access via PDO_ODBC

I am trying to get an Access DB converted into MySQL. Everything works perfectly, except for one big monkey wrench: if the Access DB has any non-standard characters, it won't work. My query will tell me:
Incorrect string value: '\xE9d'
If I directly echo out the row's text that has the 'invalid' character, I get a question mark in a black square in my browser (so é would turn into that invalid symbol on echo).
NOTE: That same form will accept, save and display the "é" fine in a textbox that is used to title this DB upload. Also, if I 'save as' the page and re-open it, the 'é' is displayed correctly...
Here is how I connect:
$conn = new PDO("odbc:Driver={Microsoft Access Driver (*.mdb)};Dbq=$fileLocation;SystemDB=$securefilePath;Uid=developer;Pwd=pass;charset=utf;");
I have tried numerous things, including:
$conn -> exec("set names utf8");
When I check 'CurrentDb.CollatingOrder' in Access it tells me 1033, which apparently is dbSortGeneral for "English, German, French, and Portuguese collating order".
What is wrong? It is almost as if PDO is sending me a collation that my browser and PHP do not fully understand.
The Problem
When using native PHP ODBC features (PDO_ODBC or the older odbc_ functions) and the Access ODBC driver, text is not UTF-8 encoded, even though it is stored in the Access database as Unicode characters. So, for a sample table named "Teams"
Team
-----------------------
Boston Bruins
Canadiens de Montréal
Федерация хоккея России
the code
<?php
header('Content-Type: text/html; charset=utf-8');
?>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Access character test</title>
</head>
<body>
<?php
$connStr =
'odbc:' .
'Driver={Microsoft Access Driver (*.mdb)};' .
'Dbq=C:\\Users\\Public\\__SO\\28311687.mdb;' .
'Uid=Admin;';
$db = new PDO($connStr);
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$sql = "SELECT Team FROM Teams";
foreach ($db->query($sql) as $row) {
$s = $row["Team"];
echo $s . "<br/>\n";
}
?>
</body>
</html>
displays this in the browser
Boston Bruins
Canadiens de Montr�al
????????? ?????? ??????
The Easy but Incomplete Fixes
The text returned by Access ODBC actually matches the Windows-1252 character encoding for the characters in that character set, so simply changing the line
$s = $row["Team"];
to
$s = utf8_encode($row["Team"]);
will allow the second entry to be displayed correctly
Boston Bruins
Canadiens de Montréal
????????? ?????? ??????
but the utf8_encode() function converts from ISO-8859-1, not Windows-1252, so some characters (notably the Euro symbol '€') will disappear. A better solution would be to use
$s = mb_convert_encoding($row["Team"], "UTF-8", "Windows-1252");
but that still wouldn't solve the problem with the third entry in our sample table.
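As a minimal sketch of the Windows-1252 conversion described above (assuming the mbstring extension is available; the helper name accessTextToUtf8() is made up for illustration), the following shows that both é and the Euro sign survive the conversion. Note also that utf8_encode() is deprecated as of PHP 8.2, which is another reason to prefer mb_convert_encoding().

```php
<?php
// Hypothetical helper: treat text returned by the Access ODBC driver as
// Windows-1252 and convert it to UTF-8.
function accessTextToUtf8(string $raw): string
{
    return mb_convert_encoding($raw, 'UTF-8', 'Windows-1252');
}

// "\xE9" is é and "\x80" is the Euro sign in Windows-1252.
echo accessTextToUtf8("Canadiens de Montr\xE9al"), "\n";
echo accessTextToUtf8("Price: 5 \x80"), "\n";
```

This only helps for characters that exist in Windows-1252. Text such as the Cyrillic row has already been replaced by question marks before PHP sees it, so no conversion on the PHP side can recover it.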
The Complete Fix
For full UTF-8 support we need to use COM with ADODB Connection and Recordset objects like so
<?php
header('Content-Type: text/html; charset=utf-8');
?>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Access character test</title>
</head>
<body>
<?php
$connStr =
'Driver={Microsoft Access Driver (*.mdb)};' .
'Dbq=C:\\Users\\Public\\__SO\\28311687.mdb';
$con = new COM("ADODB.Connection", NULL, CP_UTF8); // specify UTF-8 code page
$con->Open($connStr);
$rst = new COM("ADODB.Recordset");
$sql = "SELECT Team FROM Teams";
$rst->Open($sql, $con, 3, 3); // adOpenStatic, adLockOptimistic
while (!$rst->EOF) {
$s = $rst->Fields("Team");
echo $s . "<br/>\n";
$rst->MoveNext();
}
$rst->Close();
$con->Close();
?>
</body>
</html>
Here is a variant that makes the data a bit easier to manipulate, by returning it as a two-dimensional array:
function consulta($sql) {
$db_path = $_SERVER["DOCUMENT_ROOT"] . '/database/Registros.accdb';
$conn = new COM('ADODB.Connection', NULL, CP_UTF8) or exit('Falha ao iniciar o ADO (objeto COM).');
$conn->Open("Persist Security Info=False;Provider=Microsoft.ACE.OLEDB.12.0;Jet OLEDB:Database Password=ifpb#10510211298;Data Source=$db_path");
$rs = $conn->Execute($sql);
$numRegistos = $rs->Fields->Count;
$resultados = array(); // make sure an empty result set still returns an array
$index = 0;
while (!$rs->EOF){
for ($n = 0; $n < $numRegistos; $n++) {
if(is_null($rs->Fields[$n]->Value)) continue;
$resultados[$index][$rs->Fields[$n]->Name] = $rs->Fields[$n]->Value;
echo '.';
}
echo '<br>';
$index = $index + 1;
$rs->MoveNext();
}
$conn->Close();
return $resultados;
}
$dados = consulta("select * from campus");
var_dump($dados);
I found the following solution. Admittedly, I have not had the opportunity to test it with PHP, but I suppose it should work.
For the native PHP ODBC features (PDO_ODBC or the older odbc_ functions) and the Access ODBC driver to correctly retrieve text that is stored in the Access database as Unicode characters, you need to enable "Beta: Use Unicode UTF-8 for worldwide language support" in the Region settings of the Windows operating system.
After I did this on my machine, many programs using the standard MS Access ODBC driver began to display Unicode text correctly.
All Settings -> Time & Language -> Language -> "Administrative Language Settings"

PHP Mail_Mime: How to properly use encodeHeader() on the body of an email

How do I fix the character encoding for this:
require_once("/Mail/mime.php");
$corpoTxt = 'Teste envio relatórios são';
$corpoHtml= '<html><body>Versão HTML do texto</body></html>';
I tried:
$corpoTxt=$mime->encodeHeader("corpo", $corpoTxt, "utf-8", "quoted-printable");
$corpoHtml=$mime->encodeHeader("corpohtmls", $corpoHtml, "utf-8", "quoted-printable");
But it doesn't work, I get:
Versão HTML do texto
Thanks in advance for any help!
Try including these lines prior to calling the $mime->headers() function.
# Ensure EML is UTF-8 compliant
$mimeparams["debug"] = "True";
$mimeparams['text_encoding'] = "8bit";
$mimeparams['text_charset'] = "UTF-8";
$mimeparams['html_charset'] = "UTF-8";
$mimeparams['head_charset'] = "UTF-8";
$mime->get($mimeparams);
This will ensure your UTF-8 characters display properly in the EML. Set the TXT and HTML bodies like so, before the call to $mime->get(), which builds the message from them:
$mime->setTXTBody($corpoTxt);
$mime->setHTMLBody($corpoHtml);
That should work. For more help, have a look at:
Pear Mail, how to send plain/text + text/html in UTF-8
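For context, encodeHeader() produces encoded words meant for header fields, whereas a message body is carried under a Content-Transfer-Encoding, which is why the answer above configures charsets and encodings instead. The sketch below deliberately uses PHP's built-in quoted_printable_encode() rather than Mail_Mime, purely to illustrate what a quoted-printable UTF-8 body looks like. The variable name is reused from the question, and the source text is assumed to be UTF-8.

```php
<?php
// Body text from the question, written with explicit UTF-8 escapes so the
// example does not depend on the encoding of this source file.
$corpoTxt = "Teste envio relat\u{f3}rios s\u{e3}o";

// Quoted-printable is a transfer encoding for bodies: each non-ASCII byte
// becomes "=XX", and the charset is declared separately in Content-Type.
$encoded = quoted_printable_encode($corpoTxt);
echo $encoded, "\n";

// Decoding restores the original UTF-8 bytes.
echo quoted_printable_decode($encoded), "\n";
```

With Mail_Mime itself, this encoding step is normally left to the library, which is what the parameters in the answer above are for.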

RedBeanPHP not saving strings correctly, skips from an accented vowel onwards

I'm trying to save some simple PHP objects using RedBeanPHP. It works fine, except that on string fields, it reaches a point where there is an accented vowel, ie á or 'í' and just skips the rest of the remaining characters in the string.
Example:
// Actual string in PHP script.
Esta es una frase mía y me gusta!
// Saved to database.
Esta es una frase m
Here's my PHP script:
// Setup RedBean to work with a database.
R::setup('mysql:host=localhost;dbname=noticias','root','');
foreach($parsedNews as &$tmpNews) {
$noticia = R::dispense('noticia');
$noticia->imagen = $tmpNews->get_image();
$noticia->fecha = $tmpNews->get_fechanoticia();
$noticia->titulo = $tmpNews->get_title();
$noticia->url = $tmpNews->get_sourceurl();
$noticia->descripcion = $tmpNews->get_description();
$id = R::store($noticia);
}
I think the correct answer is that the source encoding is not actually UTF8.
$bean->property = iconv("ISO-8859-1", "UTF-8", "Esta es una frase mía y me gusta!");
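If it is unclear whether every incoming string is really ISO-8859-1, a guarded conversion avoids double-encoding values that are already valid UTF-8. This is only a sketch: ensureUtf8() is a hypothetical helper, the ISO-8859-1 default is an assumption carried over from the answer above, and the mbstring extension is required.

```php
<?php
// Hypothetical helper: leave valid UTF-8 untouched, otherwise convert from
// an assumed legacy encoding (ISO-8859-1 unless stated otherwise).
function ensureUtf8(string $value, string $assumedSource = 'ISO-8859-1'): string
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }
    return mb_convert_encoding($value, 'UTF-8', $assumedSource);
}

// "\xED" is í in ISO-8859-1, which is not a valid UTF-8 sequence on its own.
$latin1 = "Esta es una frase m\xEDa y me gusta!";
echo ensureUtf8($latin1), "\n";
```

Each property could be passed through such a helper before being assigned to the bean, for example the title and description fields in the loop from the question.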
Set your database table collations to UTF-8. (utf8_unicode_ci is probably what you want).

How to convert 'u00e9' into a utf8 char, in mysql or php?

I'm doing some data cleansing on messy data that is being imported into MySQL.
The data contains 'pseudo' Unicode characters, which are embedded into the strings as 'u00e9' etc.
So one field might be 'Jalostotitlu00e1n'.
I need to rip out that clumsy 'u00e1' and replace it with the corresponding UTF-8 character.
I can do this either in MySQL, using substring and CHR maybe, or in PHP, since I'm preprocessing the data via PHP anyway.
I already know all about how to configure MySQL and PHP to work with UTF-8 data. The problem is really just in the source data I'm importing.
Thanks
/*
PHP function that converts literal \uXXXX escape sequences into the corresponding accented characters
*/
public static function Utf8_ansi($valor='') {
$utf8_ansi2 = array(
"\u00c0" =>"À",
"\u00c1" =>"Á",
"\u00c2" =>"Â",
"\u00c3" =>"Ã",
"\u00c4" =>"Ä",
"\u00c5" =>"Å",
"\u00c6" =>"Æ",
"\u00c7" =>"Ç",
"\u00c8" =>"È",
"\u00c9" =>"É",
"\u00ca" =>"Ê",
"\u00cb" =>"Ë",
"\u00cc" =>"Ì",
"\u00cd" =>"Í",
"\u00ce" =>"Î",
"\u00cf" =>"Ï",
"\u00d1" =>"Ñ",
"\u00d2" =>"Ò",
"\u00d3" =>"Ó",
"\u00d4" =>"Ô",
"\u00d5" =>"Õ",
"\u00d6" =>"Ö",
"\u00d8" =>"Ø",
"\u00d9" =>"Ù",
"\u00da" =>"Ú",
"\u00db" =>"Û",
"\u00dc" =>"Ü",
"\u00dd" =>"Ý",
"\u00df" =>"ß",
"\u00e0" =>"à",
"\u00e1" =>"á",
"\u00e2" =>"â",
"\u00e3" =>"ã",
"\u00e4" =>"ä",
"\u00e5" =>"å",
"\u00e6" =>"æ",
"\u00e7" =>"ç",
"\u00e8" =>"è",
"\u00e9" =>"é",
"\u00ea" =>"ê",
"\u00eb" =>"ë",
"\u00ec" =>"ì",
"\u00ed" =>"í",
"\u00ee" =>"î",
"\u00ef" =>"ï",
"\u00f0" =>"ð",
"\u00f1" =>"ñ",
"\u00f2" =>"ò",
"\u00f3" =>"ó",
"\u00f4" =>"ô",
"\u00f5" =>"õ",
"\u00f6" =>"ö",
"\u00f8" =>"ø",
"\u00f9" =>"ù",
"\u00fa" =>"ú",
"\u00fb" =>"û",
"\u00fc" =>"ü",
"\u00fd" =>"ý",
"\u00ff" =>"ÿ");
return strtr($valor, $utf8_ansi2);
}
There's a way. Replace all uXXXX with their HTML representation and do an html_entity_decode()
I.e. echo html_entity_decode("Jalostotitl&#x00e1;n");
Every UTF character in the form u1234 could be printed in HTML as &#x1234;. But doing a replace is quite hard, because there could be many false positives if there is no other character that identifies the beginning of a UTF sequence. A simple regex could be
preg_replace('/u([\da-fA-F]{4})/', '&#x\1;', $str)
My Twitter timeline script returns special characters like é as \u00e9, so I stripped the backslash and used @rubbude's preg_replace.
// Fix uxxxx charcoding to html
$text = "De #Haarstichting is h\u00e9t medium voor alles";
$str = str_replace('\u','u',$text);
$str_replaced = preg_replace('/u([\da-fA-F]{4})/', '&#x\1;', $str);
echo $str_replaced;
It works for me and it turns:
De #Haarstichting is h\u00e9t medium voor alles
Into:
De #Haarstichting is hét medium voor alles
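The regex approach from the answers above can be taken one step further so that the result is real UTF-8 text rather than HTML entities. The following is a minimal sketch, assuming PHP 7.2+ with mbstring (for mb_chr()); the function name decodePseudoUnicode() is hypothetical. The same caveat about false positives applies: any "u" followed by four hex digits is treated as an escape.

```php
<?php
// Hypothetical helper: replace "uXXXX" or "\uXXXX" sequences with the
// UTF-8 character for that code point.
function decodePseudoUnicode(string $text): string
{
    $result = preg_replace_callback(
        '/\\\\?u([0-9a-fA-F]{4})/',      // optional backslash, "u", four hex digits
        function (array $m): string {
            $char = mb_chr(hexdec($m[1]), 'UTF-8');
            // mb_chr() returns false for invalid code points (e.g. surrogates);
            // keep the original text in that case.
            return $char === false ? $m[0] : $char;
        },
        $text
    );
    // preg_replace_callback() returns null on a regex error.
    return $result ?? $text;
}

echo decodePseudoUnicode('Jalostotitlu00e1n'), "\n";
echo decodePseudoUnicode('De #Haarstichting is h\u00e9t medium voor alles'), "\n";
```

Because the output is already UTF-8, it can be stored in MySQL directly, without a later html_entity_decode() step.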
