Character encoding error in idHTTP - php

I'm having a situation with TIdHTTP and TIdMultipartFormDataStream.
My code is:
FormPHP := TIdMultiPartFormDataStream.Create;
FormPHP.AddFile('imagem',AImagem,'image/jpeg');
FormPHP.AddFormField('iduser',AIDUser,'text/plain');
FormPHP.AddFormField('nome',ANome,'text/plain');
FormPHP.AddFormField('data',AData,'text/plain');
FormPHP.AddFormField('hora',AHora,'text/plain');
FormPHP.AddFormField('mensagem',AMensagem,'text/plain');
FormPHP.AddFormField('latitude','1','text/plain');
FormPHP.AddFormField('longitude','1','text/plain');
Response := TStringStream.Create('', TEncoding.ANSI);
HTTP:= TIdHTTP.Create(self);
HTTP.Request.CustomHeaders.Clear;
HTTP.Request.Clear;
HTTP.Request.ContentType:= 'multipart/form-data'; //application/x-www-form-urlencoded
HTTP.Request.ContentEncoding:= 'MeMIME';
HTTP.Request.CharSet:= 'utf-8';
HTTP.Request.Referer:= 'http://observadordecascavel.blog.br/cadastro.php';
HTTP.Post('http://observadordecascavel.blog.br/cadastro.php',FormPHP,Response);
This is the PHP script:
<?php
#cadastro.php - Cadastra os dados enviados na tabela online.
$mysqli = new mysqli("mysqlhost","username","password","dbname");
$iduser = $_POST['iduser'];
$nome = $_POST['nome'];
$data = $_POST['data'];
$hora = $_POST['hora'];
$mensagem = $_POST['mensagem'];
$latitude = $_POST['latitude'];
$longitude = $_POST['longitude'];
$imagem = $_FILES["imagem"]['tmp_name'];
$tamanho = $_FILES['imagem']['size'];
if ( $imagem != "none" )
{
$fp = fopen($imagem, "rb");
$conteudo = fread($fp, $tamanho);
$conteudo = addslashes($conteudo);
fclose($fp);
$queryInsercao = "INSERT INTO tabpainel (iduser, nome, data, hora, mensagem, latitude, longitude, imagem) VALUES ('$iduser', '$nome', '$data','$hora','$mensagem', '$latitude', '$longitude', '$conteudo')";
mysqli_query($mysqli,$queryInsercao) or die("Algo deu errado ao inserir o registro. Tente novamente.");
if(mysqli_affected_rows($mysqli) > 0)
print "Sucesso!";
else
print "Não foi possível inserir o registro";
}
else
print "Não á foi possível carregar a imagem.";
?>
Explaining: my application posts these fields to this PHP script, and the PHP script saves the data into a MySQL database and returns a response of "Sucesso!" to the application to inform the user that the data was saved. This text response is encoded in ANSI. I discovered that I had to change the TStringStream encoding to TEncoding.ANSI so it could recognize the word "Não" when something goes wrong.
Up to the point of the post, the variable AMensagem is fine; however, when the PHP script receives the text, it is not correct. A text like "á Á é É" arrives as "=E1 =C1 =E9 =C9", and that is what gets saved in the MySQL database.
I don't know if the problem is with TIdHTTP, with TIdMultipartFormDataStream, or even with the PHP code. Everything else works fine; it's just the encoding that I can't figure out.

The text that is transmitted to the server is not being encoded in UTF-8.
All of your AddFormField() calls are specifying the text/plain media type in the ACharset parameter instead of the AContentType parameter. Unlike AddFile(), the 3rd parameter of AddFormField() is the charset, and the 4th parameter is the media type.
function AddFormField(const AFieldName, AFieldValue: string; const ACharset: string = ''; const AContentType: string = ''; const AFileName: string = ''): TIdFormDataField; overload;
By passing an invalid charset, TIdMultipartFormDataStream ends up using Indy's built-in raw 8bit encoding instead, which encodes Unicode characters U+0000 - U+00FF as bytes $00 - $FF, respectively, and all other characters as byte $3F ('?'). The text you are sending happens to fall in that first range.
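The "=E1 =C1 =E9 =C9" symptom from the question can be reproduced outside Delphi. The following PHP sketch is only an illustration and assumes the mbstring extension is available. It uses ISO-8859-1 as a stand-in for Indy's raw 8-bit mapping (the two agree for U+0000 - U+00FF) and PHP's quoted_printable_encode() as a stand-in for the quoted-printable transfer encoding that is applied to the field (see the update further down):

```php
<?php
// Illustration only: ISO-8859-1 stands in for Indy's raw 8-bit encoding,
// which maps U+0000 - U+00FF to the bytes $00 - $FF.
$text = "\u{E1} \u{C1} \u{E9} \u{C9}"; // "á Á é É" as UTF-8 (PHP 7+ escape syntax)

// What was actually sent: one byte per character, then quoted-printable.
$raw8bit   = mb_convert_encoding($text, 'ISO-8859-1', 'UTF-8');
$sentWrong = quoted_printable_encode($raw8bit);

// What should have been sent: UTF-8 bytes, then quoted-printable.
$sentRight = quoted_printable_encode($text);

echo $sentWrong, "\n"; // =E1 =C1 =E9 =C9
echo $sentRight, "\n"; // =C3=A1 =C3=81 =C3=A9 =C3=89
```

The first output line matches what ended up in the database, which is consistent with the text being sent as single bytes rather than as UTF-8.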
TIdFormDataField does not currently inherit a charset from TIdMultipartFormDataStream or TIdHTTP (work on that is in-progress), so you have to specify it on a per-field basis.
On a side note, MeMIME is not a valid ContentEncoding value. And you should not be setting any ContentEncoding value for a multipart/form-data post anyway.
Try something more like this instead:
FormPHP := TIdMultiPartFormDataStream.Create;
FormPHP.AddFile('imagem', AImagem, 'image/jpeg');
FormPHP.AddFormField('iduser', AIDUser, 'utf-8');
FormPHP.AddFormField('nome', ANome, 'utf-8');
FormPHP.AddFormField('data', AData, 'utf-8');
FormPHP.AddFormField('hora', AHora, 'utf-8');
FormPHP.AddFormField('mensagem', AMensagem, 'utf-8');
FormPHP.AddFormField('latitude', '1');
FormPHP.AddFormField('longitude', '1');
Response := TStringStream.Create('');
HTTP := TIdHTTP.Create(Self);
HTTP.Request.Referer := 'http://observadordecascavel.blog.br/cadastro.php';
HTTP.Post('http://observadordecascavel.blog.br/cadastro.php', FormPHP, Response);
Alternatively:
FormPHP := TIdMultiPartFormDataStream.Create;
FormPHP.AddFile('imagem', AImagem, 'image/jpeg');
FormPHP.AddFormField('iduser', AIDUser).Charset := 'utf-8';
FormPHP.AddFormField('nome', ANome).Charset := 'utf-8';
FormPHP.AddFormField('data', AData).Charset := 'utf-8';
FormPHP.AddFormField('hora', AHora).Charset := 'utf-8';
FormPHP.AddFormField('mensagem', AMensagem).Charset := 'utf-8';
FormPHP.AddFormField('latitude', '1');
FormPHP.AddFormField('longitude', '1');
Response := TStringStream.Create('');
HTTP := TIdHTTP.Create(Self);
HTTP.Request.Referer := 'http://observadordecascavel.blog.br/cadastro.php';
HTTP.Post('http://observadordecascavel.blog.br/cadastro.php', FormPHP, Response);
Either way, the field text will be encoded using UTF-8 instead of Ansi.
Update: Now, with that said, AddFormField() sets the TIdFormDataField.ContentTransfer property to quoted-printable by default. However, PHP's $_POST does not decode quoted-printable by default, so you would have to call quoted_printable_decode() manually:
$iduser = quoted_printable_decode($_POST['iduser']);
$nome = quoted_printable_decode($_POST['nome']);
$data = quoted_printable_decode($_POST['data']);
$hora = quoted_printable_decode($_POST['hora']);
$mensagem = quoted_printable_decode($_POST['mensagem']);
$latitude = quoted_printable_decode($_POST['latitude']);
$longitude = quoted_printable_decode($_POST['longitude']);
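If there are many fields, the decoding can also be done in a loop. A minimal sketch: decode_qp_fields() is a hypothetical helper name (it is not part of PHP or Indy), and the sample array stands in for what $_POST would contain:

```php
<?php
// Hypothetical helper: decodes the named quoted-printable fields of a
// POST-like array; fields that are missing come back as empty strings.
function decode_qp_fields(array $post, array $names): array
{
    $decoded = [];
    foreach ($names as $name) {
        $value = isset($post[$name]) ? $post[$name] : '';
        $decoded[$name] = quoted_printable_decode($value);
    }
    return $decoded;
}

// Sample input: UTF-8 text that was sent as quoted-printable.
$post   = ['nome' => 'Jo=C3=A3o', 'mensagem' => '=C3=A1 =C3=81 =C3=A9 =C3=89'];
$fields = decode_qp_fields($post, ['nome', 'mensagem', 'hora']);

echo bin2hex($fields['nome']), "\n"; // 4a6fc3a36f ("João" in UTF-8)
```

In the real script you would pass $_POST instead of the sample array.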
If you don't want TIdFormDataField to encode the UTF-8 text using quoted-printable, you can set the ContentTransfer property to 8bit instead:
FormPHP.AddFormField('iduser', AIDUser, 'utf-8').ContentTransfer := '8bit';
FormPHP.AddFormField('nome', ANome, 'utf-8').ContentTransfer := '8bit';
FormPHP.AddFormField('data', AData, 'utf-8').ContentTransfer := '8bit';
FormPHP.AddFormField('hora', AHora, 'utf-8').ContentTransfer := '8bit';
FormPHP.AddFormField('mensagem', AMensagem, 'utf-8').ContentTransfer := '8bit';
FormPHP.AddFormField('latitude', '1');
FormPHP.AddFormField('longitude', '1');
Alternatively:
with FormPHP.AddFormField('iduser', AIDUser) do begin
Charset := 'utf-8';
ContentTransfer := '8bit';
end;
with FormPHP.AddFormField('nome', ANome) do begin
Charset := 'utf-8';
ContentTransfer := '8bit';
end;
with FormPHP.AddFormField('data', AData) do begin
Charset := 'utf-8';
ContentTransfer := '8bit';
end;
with FormPHP.AddFormField('hora', AHora) do begin
Charset := 'utf-8';
ContentTransfer := '8bit';
end;
with FormPHP.AddFormField('mensagem', AMensagem) do begin
Charset := 'utf-8';
ContentTransfer := '8bit';
end;
FormPHP.AddFormField('latitude', '1');
FormPHP.AddFormField('longitude', '1');
Either way, you can then use your original PHP code again:
$iduser = $_POST['iduser'];
$nome = $_POST['nome'];
$data = $_POST['data'];
$hora = $_POST['hora'];
$mensagem = $_POST['mensagem'];
$latitude = $_POST['latitude'];
$longitude = $_POST['longitude'];
Whether you use quoted-printable or not, the PHP variables will end up holding UTF-8 encoded text. If you need the variables to be in another encoding, you will have to convert them as needed, by using either:
utf8_decode() (which decodes to ISO-8859-1):
$iduser = utf8_decode($iduser);
$nome = utf8_decode($nome);
$data = utf8_decode($data);
$hora = utf8_decode($hora);
$mensagem = utf8_decode($mensagem);
$latitude = utf8_decode($latitude);
$longitude = utf8_decode($longitude);
mb_convert_encoding():
$iduser = mb_convert_encoding($iduser, 'desired charset', 'utf-8');
$nome = mb_convert_encoding($nome, 'desired charset', 'utf-8');
$data = mb_convert_encoding($data, 'desired charset', 'utf-8');
$hora = mb_convert_encoding($hora, 'desired charset', 'utf-8');
$mensagem = mb_convert_encoding($mensagem, 'desired charset', 'utf-8');
$latitude = mb_convert_encoding($latitude, 'desired charset', 'utf-8');
$longitude = mb_convert_encoding($longitude, 'desired charset', 'utf-8');
iconv():
$iduser = iconv('utf-8', 'desired charset', $iduser);
$nome = iconv('utf-8', 'desired charset', $nome);
$data = iconv('utf-8', 'desired charset', $data);
$hora = iconv('utf-8', 'desired charset', $hora);
$mensagem = iconv('utf-8', 'desired charset', $mensagem);
$latitude = iconv('utf-8', 'desired charset', $latitude);
$longitude = iconv('utf-8', 'desired charset', $longitude);
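A quick way to check that a conversion did what you intended is to compare the byte values before and after. A minimal sketch, assuming the mbstring and iconv extensions are loaded and that ISO-8859-1 is the desired charset. Note that utf8_decode() is deprecated as of PHP 8.2, so the sketch sticks to the other two functions:

```php
<?php
$utf8 = "N\xC3\xA3o"; // "Não" encoded as UTF-8 (4 bytes)

// Both calls convert UTF-8 to ISO-8859-1; only the argument order differs.
$viaMb    = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
$viaIconv = iconv('UTF-8', 'ISO-8859-1', $utf8);

echo bin2hex($utf8), "\n";     // 4ec3a36f
echo bin2hex($viaMb), "\n";    // 4ee36f
echo bin2hex($viaIconv), "\n"; // 4ee36f
```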
Finally, when sending a response back to the client, you need to encode text when it contains non-ASCII characters. You should also be using header() to let the client know which charset is being used for that encoding:
header($_SERVER["SERVER_PROTOCOL"] . " 200 OK");
header('Content-Type: text/plain; charset="utf-8"');
if ( $imagem != "none" )
{
...
if (mysqli_affected_rows($mysqli) > 0)
print utf8_encode("Sucesso!");
else
print utf8_encode("Não foi possível inserir o registro");
}
else
print utf8_encode("Não á foi possível carregar a imagem.");

Related

league/csv problem reading file with ISO-8859-1 encoding

$data = file_get_contents($path);
$data = mb_convert_encoding($data, 'UTF-8', mb_detect_encoding($data, 'UTF-8, ISO-8859-1', true));
$csv = Reader::createFromString($data);
$csv->setDelimiter(';');
$csv->setHeaderOffset(0);
$test = $csv->getContent();
return (new Statement)->process($csv);
When I debug and look at $test, all characters are displayed correctly (no lÃ¸nn etc).
When I loop through the TabularDataReader object returned from this line:
return (new Statement)->process($csv);
the headers are displayed incorrectly, e.g. "Bil lÃ¸nn" (should be "Bil lønn").
Do I have to set encoding on the Statement object as well? I looked through the class, but couldn't find any functions related to encoding.
I've had the same issue with league/csv and ISO-8859-1 encoding. Try this workaround:
$data = file_get_contents($path);
if (!mb_check_encoding($data, 'UTF-8')) {
$data = mb_convert_encoding($data, 'UTF-8', 'ISO-8859-1'); // name the source encoding explicitly
}
$csv = Reader::createFromString($data);
$csv->setDelimiter(';');
$csv->setHeaderOffset(0);
$test = $csv->getContent();
return (new Statement)->process($csv);
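The check-then-convert step can be tried in isolation. A minimal sketch, assuming mbstring is available and that the input is either valid UTF-8 or ISO-8859-1. The helper name to_utf8() is hypothetical and is not part of league/csv:

```php
<?php
// Hypothetical helper: returns $data as UTF-8, treating anything that is
// not already valid UTF-8 as ISO-8859-1 (an assumption about the input).
function to_utf8(string $data): string
{
    if (mb_check_encoding($data, 'UTF-8')) {
        return $data; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($data, 'UTF-8', 'ISO-8859-1');
}

$latin1Header = "Bil l\xF8nn";              // "Bil lønn" in ISO-8859-1
echo bin2hex(to_utf8($latin1Header)), "\n"; // 42696c206cc3b86e6e
```

Converting only when the check fails avoids double-encoding a file that is already UTF-8.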

UTF-8 Accents not been well displayed in HTML email with PHPMailer

I've set up a PHPMailer PHP form that sends the form to our Office 365 account. I'm having an issue with French accents: they are displayed as "Ã©Ã© Ã Ã  Ã§Ã§" instead of accents like "éé àà çç".
The PHP form is encoded in UTF-8;
The PHP code is also encoded in UTF-8;
But the email received does not show the proper characters.
I have added these settings and nothing has changed:
In the PHP file
header('Content-Type: charset=utf-8');
Also
$mail->isHTML(true); // Set email format to HTML
$mail->CharSet = "UTF-8";
Php Sending Form Source Code:
<?php
header('Content-Type: charset=utf-8');
ini_set('startup_errors', 1);
ini_set('display_errors', 1);
error_reporting(E_ALL);
use PHPMailer\PHPMailer\PHPMailer;
use PHPMailer\PHPMailer\Exception;
require 'php/phpmailer/vendor/phpmailer/phpmailer/src/Exception.php';
require 'php/phpmailer/vendor/phpmailer/phpmailer/src/PHPMailer.php';
require 'php/phpmailer/vendor/phpmailer/phpmailer/src/SMTP.php';
$nom_compagnie = $_POST['nom_compagnie'];
$nom_complet = $_POST['nom_complet'];
$poste = $_POST['poste'];
$email = $_POST['email'];
$telephone = $_POST['telephone'];
$commentaire = $_POST['commentaire'];
$from = $_POST['email'];
function post_captcha($user_response) {
$fields_string = '';
$fields = array(
'secret' => 'PrivateKey',
'response' => $user_response
);
foreach($fields as $key=>$value)
$fields_string .= $key . '=' . $value . '&';
$fields_string = rtrim($fields_string, '&');
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://www.google.com/recaptcha/api/siteverify');
curl_setopt($ch, CURLOPT_POST, count($fields));
curl_setopt($ch, CURLOPT_POSTFIELDS, $fields_string);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, True);
$result = curl_exec($ch);
curl_close($ch);
return json_decode($result, true);
}
$res = post_captcha($_POST['g-recaptcha-response']);
if (!$res['success']) {
// What happens when the reCAPTCHA is not properly set up
echo 'reCAPTCHA error: Check to make sure your keys match the registered domain and are in the correct locations. You may also want to doublecheck your code for typos or syntax errors.';
} else {
// If CAPTCHA is successful...
try {
$mail = new PHPMailer(true);
$mail->isSMTP();
$mail->Host = 'smtp.office365.com';
$mail->Port = 587;
$mail->SMTPSecure = 'tls';
$mail->SMTPAuth = true;
$mail->Username = 'EmailAccount';
$mail->Password = 'Password';
$mail->addReplyTo($from, $nom_complet);
$mail->SetFrom ("Hidden", "Hidden");
$mail->addCC ("Hidden", "Hidden");
$mail->addAddress ('Hidden', 'Hidden');
//$mail->SMTPDebug = 3;
//$mail->Debutoutput = fonction($str, $level) {echo "debug level $level; message: $str";}; //
$mail->isHTML(true); // Set email format to HTML
$mail->Subject = "FORMULAIRE CONTACT";
$mail->Body = "
<html>
<head>
<meta charset=\"UTF-8\">
<title>Formulaire de Contact</title>
....
</html>";
$mail->AltBody = "";
$mail->CharSet = "UTF-8";
//$mail->msgHTML(file_get_contents('email_form_contact-fr.html'));
$mail->send();
// Paste mail function or whatever else you want to happen here!
} catch (Exception $e) {
echo $e->getMessage();
die(0);
}
header('Location: envoi_form_contact-success-fr.html');
}
?>
The email received is shown like this:
The H3 title in the email is shown as
Vous avez reÃ§u un formulaire de contact via le site Web
It is supposed to be written like this:
Vous avez reçu un formulaire de contact via le site web
The accent "é" is also displayed as "Ã©".
I don't know where the problem is.
Any clue whether my code is correct?
Thanks.
Replace
$mail -> charSet = "UTF-8";
with
$mail->CharSet = 'UTF-8';
Bonjour Stéphane. This page describes the symptom you are seeing; one character turning into two means that your data is in UTF-8, but is being displayed using an 8-bit character set. Generally speaking, PHPMailer gets this right, so you need to figure out where you are going wrong.
If you use SMTPDebug = 2 you will be able to see the message being sent (use a really short message body like é😎 that is guaranteed to only work in UTF-8).
Make sure that the encoding of the script file itself is also UTF-8 - putting an emoji in it is a good way of being sure that it is.
The problem with diagnosing this is that you'll find your OS interferes - things like copying to the clipboard are likely to alter encodings - so the way to deal with it is to use a hex dump function that lets you inspect the actual byte values. French is one of the easier ones to look at because nearly all characters will be regular single-byte ASCII-compatible characters, and accented characters will be easier to spot. In PHP the bin2hex() function will do this.
You have 2 typos: Debutoutput should be Debugoutput and fonction should be function - and that is a good place to dump hex data from:
$mail->Debugoutput = function($str, $level) {echo $str, "\n", bin2hex($str), "\n";};
Here's an example:
echo 'abc ', bin2hex('abc');
which will produce abc 616263; each input char results in 2 hex digits (1 byte) of output. If your input is abé and your charset is ISO-8859-1, it will come out as abé 6162e9, because é in that charset is e9 (in hex). The same string in UTF-8 would come out as abé 6162c3a9, because é in UTF-8 is c3a9 - two bytes, not just one. Inspecting the characters like this allows you to be absolutely certain what character set the data is in - just looking at it is not good enough!
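The "one character turning into two" symptom can be reproduced the same way. A minimal sketch, assuming the mbstring extension is available; the conversion deliberately misreads UTF-8 bytes as ISO-8859-1 to simulate what the mail client is doing:

```php
<?php
$utf8   = "ab\xC3\xA9"; // "abé" as UTF-8
$latin1 = "ab\xE9";     // "abé" as ISO-8859-1

echo bin2hex($utf8), "\n";   // 6162c3a9
echo bin2hex($latin1), "\n"; // 6162e9

// Simulate the display bug: UTF-8 bytes read as if they were ISO-8859-1.
$mojibake = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
echo bin2hex($mojibake), "\n"; // 6162c383c2a9, i.e. "abÃ©"
```

If a hex dump of your message body shows c383c2a9 where you expected c3a9, the text was converted to UTF-8 twice somewhere before it reached PHPMailer.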
So, while I can't tell exactly where your problem is, hopefully you have some better ideas of how to diagnose it.

PHP from UTF-8 source to ASCII

I would like to create a CSV file from database data with PHP 5.6. My application and the database use UTF-8, but the CSV needs to be in extended ASCII for Windows, i.e. a single-byte character set that covers Eastern European languages.
I have already tried the following functions, without success:
$data = utf8_decode($csv_data);
$data = mb_convert_encoding($csv_data, 'Windows-1252', 'UTF-8');
$data = mb_convert_encoding($csv_data, 'ASCII', 'UTF-8');
$data = mb_convert_encoding($csv_data, 'ISO-8859-2', 'UTF-8');
$data = iconv("UTF-8", "Windows-1252", $csv_data);
$data = iconv("UTF-8", "CP437//TRANSLIT", $csv_data);
$data = htmlentities($csv_data, ENT_QUOTES, 'UTF-8');
$data = html_entity_decode($data, ENT_QUOTES , 'Windows-1252');
$data = htmlentities($csv_data, ENT_QUOTES, 'UTF-8');
$data = html_entity_decode($data, ENT_QUOTES , 'ISO-8859-15');
My problem is that I am unable to convert characters like Ł (Polish L with stroke).
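Ł is not part of Windows-1252 or plain ASCII, so conversions to those targets cannot keep it. It does exist in Windows-1250, the Central European code page. A minimal sketch, assuming the target application expects Windows-1250 and that the local iconv implementation knows the name CP1250 (both are assumptions, not something stated in the question):

```php
<?php
$csvLine = "\u{141}\u{F3}d\u{17A}"; // "Łódź" as UTF-8 (PHP 7+ escape syntax)

// Convert to the Central European Windows code page (assumed target).
$converted = iconv('UTF-8', 'CP1250', $csvLine);

if ($converted === false) {
    // iconv() returns false when a character cannot be represented
    // or the charset name is unknown to this iconv build.
    echo "conversion failed\n";
} else {
    echo bin2hex($converted), "\n"; // a3f3649f
}
```

On PHP 5.6 the \u{...} escapes are not available; there the test string would have to be typed literally in a UTF-8 source file.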

PHP transforming string "&note=" to "¬e="

$to= array();
foreach($users as $v) {
$to[(string)$v['address']] = (float)($v['amount']*100000);
}
$guid = "user";
$main_password = "pw";
$second_password = "pw2";
$fee = 60000;
$recipients = urlencode(json_encode($to));
$from = "address";
$note = "public";
$json_url = "https://blockchain.info/merchant/$guid/sendmany?password=".$main_password."&second_password=".$second_password."&recipients=".$recipients."&shared=false&fee=".$fee."&note=".$note."&from=".$from;
echo $json_url;
die();
For some reason, when I echo $json_url;, the &note= is transformed to ¬e=. I can't find any PHP or HTML symbol that could be making that transformation.
The browser is reading "&not" as the HTML entity &not;, or ¬ (mathematical not). Always use &amp; to echo out an ampersand (even in URLs) if you're outputting HTML. Browsers are tolerant of sloppy coding, but in this case it can bite you.
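In PHP the escaping does not have to be done by hand. A minimal sketch, using a placeholder URL (example.com and the parameter values are assumptions for illustration, not the real API call): http_build_query() URL-encodes the values, and htmlspecialchars() turns each & into &amp; when the URL is printed into an HTML page:

```php
<?php
$note = 'public';
$from = 'address';

// Build the query string; the separator is passed explicitly so the
// result does not depend on the arg_separator.output ini setting.
$url = 'https://example.com/sendmany?' . http_build_query([
    'shared' => 'false',
    'note'   => $note,
    'from'   => $from,
], '', '&');

// Escape for HTML output so "&note" cannot be read as the "&not" entity.
echo htmlspecialchars($url, ENT_QUOTES, 'UTF-8'), "\n";
// https://example.com/sendmany?shared=false&amp;note=public&amp;from=address
```

The unescaped $url is still the one to pass to curl or file_get_contents(); the escaping is only for display in HTML.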

PHP JSON with UTF8 characters

I'm sending a JSON message using this PHP:
$result = mysql_query("select a.id, ad.nombre, a.imagen, a.lat, a.long, ad.desc, a.url, a.email, a.tel, a.direccion, a.cp, a.poblacion, a.provincia from `bck_alrededor` a, `bck_alrededor_description` ad, `bck_alrededor_partner` ap
where a.id = ad.id_alrededor
and a.id = ap.id_alrededor
and a.id_cat = '$cat'
and ad.language = '$idioma'
and ap.id_partner = '$idp'",$link);
while( $row = mysql_fetch_array($result) )
{
$id = $row['id'];
$nombre = $row['nombre'];
$imagen=$row['imagen'];
$lat=$row['lat'];
$long=$row['long'];
$desc=$row['desc'];
$url=$row['url'];
$email=$row['email'];
$tel=$row['tel'];
$direccion=$row['direccion'];
$cp=$row['cp'];
$poblacion=$row['poblacion'];
$provincia=$row['provincia'];
if ($imagen <>'')
{
$imagen = $dir.'/'.$imagen;
}
$posts[] = array('nid'=> $id , 'title'=> $nombre, 'image'=> $imagen , 'latitude'=> $lat, 'longitude'=> $long, 'html'=> $desc, 'web'=> $url, 'email'=> $email, 'phone'=> $tel, 'address'=> $direccion, 'cp'=> $cp, 'poblacion'=> $poblacion, 'provincia'=> $provincia );
}
$response['nodes'] = $posts;
$current_charset = 'ISO-8859-15';
array_walk_recursive($response,function(&$value) use ($current_charset){
$value = iconv('UTF-8//TRANSLIT',$current_charset,$value);
});
echo json_encode($response);
if(!headers_sent()) header('Content-Type: application/json; charset=utf-8', true,200);
header('Content-type: application/json');
But I've got this JSON message with UTF8 escaped characters:
{"nodes":[{"nid":"87","title":"Tienda Oficial","image":"\/tiendaoficialgbc.png","latitude":"43.3021","longitude":"-1.9721","html":"Entra y adquiere todos los productos oficiales del GBC. En 48h los tienes en casa","web":"http:\/\/www.gipuzkoabasket.com\/tienda\/tienda_es.php","email":"gipuzkoabasket#gipuzkoabasket.com.","phone":"943 44 44 28","address":"Paseo de Anoeta 22, 1a Planta","cp":"20014","poblacion":"Donostia - San Sebasti\u00e1n","provincia":"Gipuzkoa"},{"nid":"88","title":"Tienda Oficial Salaberria","image":"\/tiendaoficialgbc.png","latitude":"43.30384","longitude":"-1.9797","html":"Entra y adquiere todos los productos oficiales del GBC. En 48h los tienes en casa","web":"http:\/\/www.gipuzkoabasket.com\/tienda\/tienda_es.php","email":"gipuzkoabasket#gipuzkoabasket.com.","phone":"943 44 44 28","address":"Jos\u00e9 Maria Salaberria 88","cp":"20014","poblacion":"Donostia - San Sebasti\u00e1n","provincia":""}]}
I've tried to use echo json_encode(utf8_encode($response)); but then I got a null JSON message in the client app.
How can I get a regular JSON message without UTF8 characters?
Thanks
\u00e1 is a perfectly valid way to escape Unicode characters in JSON. It's part of the JSON spec. To decode that to UTF-8, just json_decode it. utf8_decode has nothing to do with it.
What I don't understand is this code:
iconv('UTF-8//TRANSLIT',$current_charset,$value);
This says you're trying to convert from UTF-8//TRANSLIT to ISO-8859-15, which doesn't make much sense. The //TRANSLIT should come after ISO-8859-15, or you shouldn't be doing this conversion at all.
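To see that the escaped and unescaped forms are equivalent, here is a minimal sketch. It assumes the input is already UTF-8, which json_encode() requires, and uses the JSON_UNESCAPED_UNICODE flag (available since PHP 5.4) for the unescaped variant:

```php
<?php
$city = "Donostia - San Sebasti\xC3\xA1n"; // UTF-8 input

$escaped   = json_encode($city);                         // "Donostia - San Sebasti\u00e1n"
$unescaped = json_encode($city, JSON_UNESCAPED_UNICODE); // "Donostia - San Sebastián"

// Both forms decode back to the same UTF-8 string.
var_dump(json_decode($escaped) === $city);   // bool(true)
var_dump(json_decode($unescaped) === $city); // bool(true)
```

Any conforming JSON parser on the client side should treat the two outputs identically, so the flag is a matter of readability rather than correctness.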
