PHP JSON with UTF8 characters - php

I'm sending a JSON message using this PHP:
$result = mysql_query("select a.id, ad.nombre, a.imagen, a.lat, a.long, ad.desc, a.url, a.email, a.tel, a.direccion, a.cp, a.poblacion, a.provincia from `bck_alrededor` a, `bck_alrededor_description` ad, `bck_alrededor_partner` ap
where a.id = ad.id_alrededor
and a.id = ap.id_alrededor
and a.id_cat = '$cat'
and ad.language = '$idioma'
and ap.id_partner = '$idp'",$link);
while( $row = mysql_fetch_array($result) )
{
$id = $row['id'];
$nombre = $row['nombre'];
$imagen=$row['imagen'];
$lat=$row['lat'];
$long=$row['long'];
$desc=$row['desc'];
$url=$row['url'];
$email=$row['email'];
$tel=$row['tel'];
$direccion=$row['direccion'];
$cp=$row['cp'];
$poblacion=$row['poblacion'];
$provincia=$row['provincia'];
if ($imagen <>'')
{
$imagen = $dir.'/'.$imagen;
}
$posts[] = array('nid'=> $id , 'title'=> $nombre, 'image'=> $imagen , 'latitude'=> $lat, 'longitude'=> $long, 'html'=> $desc, 'web'=> $url, 'email'=> $email, 'phone'=> $tel, 'address'=> $direccion, 'cp'=> $cp, 'poblacion'=> $poblacion, 'provincia'=> $provincia );
}
$response['nodes'] = $posts;
$current_charset = 'ISO-8859-15';
array_walk_recursive($response,function(&$value) use ($current_charset){
$value = iconv('UTF-8//TRANSLIT',$current_charset,$value);
});
echo json_encode($response);
if(!headers_sent()) header('Content-Type: application/json; charset=utf-8', true,200);
header('Content-type: application/json');
But I've got this JSON message with UTF8 escaped characters:
{"nodes":[{"nid":"87","title":"Tienda Oficial","image":"\/tiendaoficialgbc.png","latitude":"43.3021","longitude":"-1.9721","html":"Entra y adquiere todos los productos oficiales del GBC. En 48h los tienes en casa","web":"http:\/\/www.gipuzkoabasket.com\/tienda\/tienda_es.php","email":"gipuzkoabasket#gipuzkoabasket.com.","phone":"943 44 44 28","address":"Paseo de Anoeta 22, 1a Planta","cp":"20014","poblacion":"Donostia - San Sebasti\u00e1n","provincia":"Gipuzkoa"},{"nid":"88","title":"Tienda Oficial Salaberria","image":"\/tiendaoficialgbc.png","latitude":"43.30384","longitude":"-1.9797","html":"Entra y adquiere todos los productos oficiales del GBC. En 48h los tienes en casa","web":"http:\/\/www.gipuzkoabasket.com\/tienda\/tienda_es.php","email":"gipuzkoabasket#gipuzkoabasket.com.","phone":"943 44 44 28","address":"Jos\u00e9 Maria Salaberria 88","cp":"20014","poblacion":"Donostia - San Sebasti\u00e1n","provincia":""}]}
I've tried to use echo json_encode(utf8_encode($response)); but then I got a null JSON message in the client app.
How can I get a regular JSON message without UTF8 characters?
Thanks

\u00e1 is a perfectly valid way to escape Unicode characters in JSON. It's part of the JSON spec. To decode that to UTF-8, just json_decode it. utf8_decode has nothing to do with it.
What I don't understand is this code:
iconv('UTF-8//TRANSLIT',$current_charset,$value);
This says you're trying to convert from UTF-8//TRANSLIT to ISO-8859-15, which doesn't make much sense. The //TRANSLIT should come after ISO-8859-15, or you shouldn't be doing this conversion at all.
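To make both points concrete, here is a minimal sketch. The sample array and variable names are made up for illustration and are not from the question. It shows that the \u00e1 escape survives a round trip through json_decode, and where //TRANSLIT belongs if a conversion to ISO-8859-15 is really needed.

```php
<?php
// Hypothetical sample data; "\xC3\xA1" is the UTF-8 byte sequence for "á".
$response = array('nodes' => array(array('poblacion' => "Donostia - San Sebasti\xC3\xA1n")));

// Default behaviour: non-ASCII characters are written as \uXXXX escapes.
$escaped = json_encode($response);

// Any JSON parser turns the escape back into the real character.
$decoded = json_decode($escaped, true);

// If a conversion to ISO-8859-15 is really required, //TRANSLIT goes on the
// target charset (second argument), not on the source charset.
$latin = iconv('UTF-8', 'ISO-8859-15//TRANSLIT', $decoded['nodes'][0]['poblacion']);

echo $escaped, "\n";
```

Note that json_encode() expects UTF-8 input, so converting the data to ISO-8859-15 before encoding it is usually the wrong order.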

Related

PHPWord - How to render word formula into html

I have an MS Word document whose tables contain formulas/equations like this:
Gabriel, Amanda, Lucas e Joelma estão fazendo uma competição para ver quem tem mais figurinhas do álbum da Copa do mundo. Gabriel conseguiu completar 1/5, Amanda 9/10, Lucas colou em seu álbum 7/8, e Joelma 1/2. O ganhador da competição foi
My code is like this right now:
$phpword = \PhpOffice\PhpWord\IOFactory::load($file["tmp_name"]);
$objWriter = \PhpOffice\PhpWord\IOFactory::createWriter($phpword, 'HTML');
$xmlWriter = new \PhpOffice\Common\XMLWriter(\PhpOffice\Common\XMLWriter::STORAGE_MEMORY, './', \PhpOffice\PhpWord\Settings::hasCompatibility());
$contentHTML = $objWriter->getContent();
preg_match_all('/<table>(.*?)<\/table>/s', $contentHTML, $tables);
array_splice($tables[0], 0, 1);
foreach ($tables[0] as $key => $table) {
preg_match_all('/<tr>(.*?)<\/tr>/s', $table, $questionTables);
foreach ($questionTables as $key2 => $questionTable) {
// table content
}
}
Now I can get all the text and images, but not the formulas. Is it possible to render them, for example by transforming each formula into an image or base64?

Cannot get info from the CoinGecko API

I forked this repo, https://github.com/FundacionPesetacoin/Pesetacoin_WooCommerce-Plugin, and it works fine. But when I change the API URL to fetch the price from another site, the price is not updated.
I tried several different API links, with the same result.
The original code gets its data from the project's private API, and I want to use another, public API.
With the original code, the API returns this:
{"status" : "success" , "message" : "null", "ptc_btc" : "0.00000083", "btc_usd" : "5070.29", "btc_eur" : "4505.46", "supply" : "138188628.56442260", "ptc_eur" : "0.00373953", "ptc_usd" : "0.00420834" , "date" : "2019-04-13 10:20:07"}
and the code reads "ptc_eur" from it to show in the shopping cart.
Now I want to use the API of another site, https://api.coingecko.com/api/v3/simple/price?ids=reecore&vs_currencies=eur, which returns this:
{"reecore":{"eur":0.0046564}}
I want to use only the "eur" value, the same way the original code uses "ptc_eur", but it doesn't work.
Sorry for my English.
ORIGINAL CODE:
//precio en PesetaCoins
global $woocommerce;
$euros= $woocommerce->cart->total;
$xaxa= "http://nodos.pesetacoin.info/api/api.php";
$data = file_get_contents($xaxa);
$pesetas = json_decode($data, true);
$valor_ptc= $pesetas['ptc_eur'];
$ptc= $euros/$valor_ptc;
$ptc= round($ptc, 2);
//precio en PesetaCoins
$pagos= array();
$metodo= $order->get_payment_method();
$i = -1;
foreach ( $this->account_details as $account ) {
$i++;
$pagos[$i]=
$pagos[$i]= esc_attr( wp_unslash( $account['hash_name'] ) );
}
$cont= rand(0, $i);
if($metodo == "ptc") {
$description= "<span style='font-size:14px'>Para completar el pedido, debe enviar la cantidad <b>".$ptc."</b> de Pesetacoin a la siguiente dirección: <b>";
$description.= $pagos[$cont];
$description.="</b><br>Una vez se reciba la transacción se enviará el pedido.</span>";
echo wpautop(wptexturize($description));
}
}
NEW CODE:
//precio en ReecoreCoins
global $woocommerce;
$euros= $woocommerce->cart->total;
$xaxa= "https://api.coingecko.com/api/v3/simple/price?ids=reecore&vs_currencies=eur";
$data = file_get_contents($xaxa);
$pesetas = json_decode($data, true);
$valor_reex= $pesetas['eur'];
$reex= $euros/$valor_reex;
$reex= round($reex, 2);
//precio en ReecoreCoins
$pagos= array();
$metodo= $order->get_payment_method();
$i = -1;
foreach ( $this->account_details as $account ) {
$i++;
$pagos[$i]=
$pagos[$i]= esc_attr( wp_unslash( $account['hash_name'] ) );
}
$cont= rand(0, $i);
if($metodo == "reex") {
$description= "<span style='font-size:14px'>Para completar el pedido, debe enviar la cantidad <b>".$reex."</b> de Reecorecoin a la siguiente dirección: <b>";
$description.= $pagos[$cont];
$description.="</b><br>Una vez se reciba la transacción se enviará el pedido.</span>";
echo wpautop(wptexturize($description));
}
}
This happens because the CoinGecko API returns nested JSON: the value stored under the "reecore" key is itself a JSON object, and "eur" lives inside it.
You therefore need two steps to access the desired value:
$valor_reex= $pesetas['reecore']['eur'];
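As a minimal sketch of that two-step access, the snippet below decodes the exact response body shown in the question instead of calling the network. The cart total of 10.0 and the error handling are assumptions added for illustration; they are not part of the plugin.

```php
<?php
// Sample body copied from the API response shown in the question.
$data = '{"reecore":{"eur":0.0046564}}';

$prices = json_decode($data, true);

// Walk both levels; fall back to null if either key is missing.
$valor_reex = isset($prices['reecore']['eur']) ? $prices['reecore']['eur'] : null;

if (!is_numeric($valor_reex) || $valor_reex <= 0) {
    // Avoids a division by zero when the request failed or the format changed.
    throw new RuntimeException('Price not available');
}

$euros = 10.0; // hypothetical cart total
$reex  = round($euros / $valor_reex, 2);
echo $reex, "\n";
```

Checking the value before dividing matters here because file_get_contents() returns false on failure, in which case json_decode() yields null and the lookup would silently produce nothing.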
You might want to use a ready-made library for this, such as https://github.com/npabisz/coingecko-api.
Install via composer:
composer require npabisz/coingecko-api
And then get your reecore price by:
$client = new \CoinGecko\Client();
$data = $client->Simple->Price->get([
'ids' => 'reecore',
'vs_currencies' => 'eur',
]);
$reecorePrice = $data['reecore']['eur'] ?? null;

Character encoding error in idHTTP

I'm having a situation with TIdHTTP and TIdMultipartFormDataStream.
My code is:
FormPHP := TIdMultiPartFormDataStream.Create;
FormPHP.AddFile('imagem',AImagem,'image/jpeg');
FormPHP.AddFormField('iduser',AIDUser,'text/plain');
FormPHP.AddFormField('nome',ANome,'text/plain');
FormPHP.AddFormField('data',AData,'text/plain');
FormPHP.AddFormField('hora',AHora,'text/plain');
FormPHP.AddFormField('mensagem',AMensagem,'text/plain');
FormPHP.AddFormField('latitude','1','text/plain');
FormPHP.AddFormField('longitude','1','text/plain');
Response := TStringStream.Create('', TEncoding.ANSI);
HTTP:= TIdHTTP.Create(self);
HTTP.Request.CustomHeaders.Clear;
HTTP.Request.Clear;
HTTP.Request.ContentType:= 'multipart/form-data'; //application/x-www-form-urlencoded
HTTP.Request.ContentEncoding:= 'MeMIME';
HTTP.Request.CharSet:= 'utf-8';
HTTP.Request.Referer:= 'http://observadordecascavel.blog.br/cadastro.php';
HTTP.Post('http://observadordecascavel.blog.br/cadastro.php',FormPHP,Response);
This is the PHP script:
<?php
#cadastro.php - Cadastra os dados enviados na tabela online.
$mysqli = new mysqli("mysqlhost","username","password","dbname");
$iduser = $_POST['iduser'];
$nome = $_POST['nome'];
$data = $_POST['data'];
$hora = $_POST['hora'];
$mensagem = $_POST['mensagem'];
$latitude = $_POST['latitude'];
$longitude = $_POST['longitude'];
$imagem = $_FILES["imagem"]['tmp_name'];
$tamanho = $_FILES['imagem']['size'];
if ( $imagem != "none" )
{
$fp = fopen($imagem, "rb");
$conteudo = fread($fp, $tamanho);
$conteudo = addslashes($conteudo);
fclose($fp);
$queryInsercao = "INSERT INTO tabpainel (iduser, nome, data, hora, mensagem, latitude, longitude, imagem) VALUES ('$iduser', '$nome', '$data','$hora','$mensagem', '$latitude', '$longitude', '$conteudo')";
mysqli_query($mysqli,$queryInsercao) or die("Algo deu errado ao inserir o registro. Tente novamente.");
if(mysqli_affected_rows($mysqli) > 0)
print "Sucesso!";
else
print "Não foi possível inserir o registro";
}
else
print "Não á foi possível carregar a imagem.";
?>
To explain: my application posts these fields to this PHP script, the script saves the data into a MySQL database and returns a response of "Sucesso!" so the application can inform the user that the data was saved. This text response is encoded in ANSI; I discovered that I had to change the TStringStream encoding to TEncoding.ANSI so it could recognize the word "Não" when something goes wrong.
Up until the post, the variable AMensagem is fine; however, when the PHP script receives the text, it is not correct. A text like "á Á é É" arrives as "=E1 =C1 =E9 =C9", and that is what gets saved in the MySQL database.
I don't know if the problem is with TIdHTTP, with TIdMultipartFormDataStream, or even with the PHP code. Everything else works fine; I just have no clue why the encoding is not working.
The text that is transmitted to the server is not being encoded in UTF-8.
All of your AddFormField() calls are specifying the text/plain media type in the ACharset parameter instead of the AContentType parameter. Unlike AddFile(), the 3rd parameter of AddFormField() is the charset, and the 4th parameter is the media type.
function AddFormField(const AFieldName, AFieldValue: string; const ACharset: string = ''; const AContentType: string = ''; const AFileName: string = ''): TIdFormDataField; overload;
By passing an invalid charset, TIdMultipartFormDataStream ends up using Indy's built-in raw 8bit encoding instead, which encodes Unicode characters U+0000 - U+00FF as bytes $00 - $FF, respectively, and all other characters as byte $3F ('?'). The text you are sending happens to fall in that first range.
TIdFormDataField does not currently inherit a charset from TIdMultipartFormDataStream or TIdHTTP (work on that is in-progress), so you have to specify it on a per-field basis.
On a side note, MeMIME is not a valid ContentEncoding value. And you should not be setting any ContentEncoding value for a multipart/form-data post anyway.
Try something more like this instead:
FormPHP := TIdMultiPartFormDataStream.Create;
FormPHP.AddFile('imagem', AImagem, 'image/jpeg');
FormPHP.AddFormField('iduser', AIDUser, 'utf-8');
FormPHP.AddFormField('nome', ANome, 'utf-8');
FormPHP.AddFormField('data', AData, 'utf-8');
FormPHP.AddFormField('hora', AHora, 'utf-8');
FormPHP.AddFormField('mensagem', AMensagem, 'utf-8');
FormPHP.AddFormField('latitude', '1');
FormPHP.AddFormField('longitude', '1');
Response := TStringStream.Create('');
HTTP := TIdHTTP.Create(Self);
HTTP.Request.Referer := 'http://observadordecascavel.blog.br/cadastro.php';
HTTP.Post('http://observadordecascavel.blog.br/cadastro.php', FormPHP, Response);
Alternatively:
FormPHP := TIdMultiPartFormDataStream.Create;
FormPHP.AddFile('imagem', AImagem, 'image/jpeg');
FormPHP.AddFormField('iduser', AIDUser).Charset := 'utf-8';
FormPHP.AddFormField('nome', ANome).Charset := 'utf-8';
FormPHP.AddFormField('data', AData).Charset := 'utf-8';
FormPHP.AddFormField('hora', AHora).Charset := 'utf-8';
FormPHP.AddFormField('mensagem', AMensagem).Charset := 'utf-8';
FormPHP.AddFormField('latitude', '1');
FormPHP.AddFormField('longitude', '1');
Response := TStringStream.Create('');
HTTP := TIdHTTP.Create(Self);
HTTP.Request.Referer := 'http://observadordecascavel.blog.br/cadastro.php';
HTTP.Post('http://observadordecascavel.blog.br/cadastro.php', FormPHP, Response);
Either way, the field text will be encoded using UTF-8 instead of Ansi.
Update: That said, AddFormField() sets the TIdFormDataField.ContentTransfer property to quoted-printable by default. However, PHP's $_POST does not decode quoted-printable by default, so you would have to call quoted_printable_decode() manually:
$iduser = quoted_printable_decode($_POST['iduser']);
$nome = quoted_printable_decode($_POST['nome']);
$data = quoted_printable_decode($_POST['data']);
$hora = quoted_printable_decode($_POST['hora']);
$mensagem = quoted_printable_decode($_POST['mensagem']);
$latitude = quoted_printable_decode($_POST['latitude']);
$longitude = quoted_printable_decode($_POST['longitude']);
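For reference, a small sketch of what that call does to a single field. The sample string is hypothetical and stands in for a value as it would arrive in $_POST once the field is sent as UTF-8 with quoted-printable transfer encoding.

```php
<?php
// "=C3=A1" is the quoted-printable form of the two UTF-8 bytes for "á".
$raw = 'Ol=C3=A1, mundo'; // hypothetical raw field value

// Turns each =XX sequence back into the byte it represents.
$mensagem = quoted_printable_decode($raw);

echo $mensagem, "\n";
```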
If you don't want TIdFormDataField to encode the UTF-8 text using quoted-printable, you can set the ContentTransfer property to 8bit instead:
FormPHP.AddFormField('iduser', AIDUser, 'utf-8').ContentTransfer := '8bit';
FormPHP.AddFormField('nome', ANome, 'utf-8').ContentTransfer := '8bit';
FormPHP.AddFormField('data', AData, 'utf-8').ContentTransfer := '8bit';
FormPHP.AddFormField('hora', AHora, 'utf-8').ContentTransfer := '8bit';
FormPHP.AddFormField('mensagem', AMensagem, 'utf-8').ContentTransfer := '8bit';
FormPHP.AddFormField('latitude', '1');
FormPHP.AddFormField('longitude', '1');
Alternatively:
with FormPHP.AddFormField('iduser', AIDUser) do begin
Charset := 'utf-8';
ContentTransfer := '8bit';
end;
with FormPHP.AddFormField('nome', ANome) do begin
Charset := 'utf-8';
ContentTransfer := '8bit';
end;
with FormPHP.AddFormField('data', AData) do begin
Charset := 'utf-8';
ContentTransfer := '8bit';
end;
with FormPHP.AddFormField('hora', AHora) do begin
Charset := 'utf-8';
ContentTransfer := '8bit';
end;
with FormPHP.AddFormField('mensagem', AMensagem) do begin
Charset := 'utf-8';
ContentTransfer := '8bit';
end;
FormPHP.AddFormField('latitude', '1');
FormPHP.AddFormField('longitude', '1');
Either way, you can then use your original PHP code again:
$iduser = $_POST['iduser'];
$nome = $_POST['nome'];
$data = $_POST['data'];
$hora = $_POST['hora'];
$mensagem = $_POST['mensagem'];
$latitude = $_POST['latitude'];
$longitude = $_POST['longitude'];
Whether you use quoted-printable or not, the PHP variables will end up holding UTF-8 encoded text. If you need the variables to be in another encoding, you will have to convert them as needed, by using either:
utf8_decode() (which decodes to ISO-8859-1):
$iduser = utf8_decode($iduser);
$nome = utf8_decode($nome);
$data = utf8_decode($data);
$hora = utf8_decode($hora);
$mensagem = utf8_decode($mensagem);
$latitude = utf8_decode($latitude);
$longitude = utf8_decode($longitude);
mb_convert_encoding():
$iduser = mb_convert_encoding($iduser, 'desired charset', 'utf-8');
$nome = mb_convert_encoding($nome, 'desired charset', 'utf-8');
$data = mb_convert_encoding($data, 'desired charset', 'utf-8');
$hora = mb_convert_encoding($hora, 'desired charset', 'utf-8');
$mensagem = mb_convert_encoding($mensagem, 'desired charset', 'utf-8');
$latitude = mb_convert_encoding($latitude, 'desired charset', 'utf-8');
$longitude = mb_convert_encoding($longitude, 'desired charset', 'utf-8');
iconv():
$iduser = iconv('utf-8', 'desired charset', $iduser);
$nome = iconv('utf-8', 'desired charset', $nome);
$data = iconv('utf-8', 'desired charset', $data);
$hora = iconv('utf-8', 'desired charset', $hora);
$mensagem = iconv('utf-8', 'desired charset', $mensagem);
$latitude = iconv('utf-8', 'desired charset', $latitude);
$longitude = iconv('utf-8', 'desired charset', $longitude);
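As a quick sanity check of the last two options, the sketch below converts one UTF-8 string to ISO-8859-1, which is used here only as an example target charset. It assumes the mbstring and iconv extensions are loaded. Note that the two functions take the source and target charsets in opposite order.

```php
<?php
$utf8 = "N\xC3\xA3o"; // "Não" as UTF-8 bytes (4 bytes)

// mb_convert_encoding(string, to, from)
$viaMb = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

// iconv(from, to, string)
$viaIconv = iconv('UTF-8', 'ISO-8859-1', $utf8);

// Both results hold the same 3 bytes: 4e e3 6f
echo bin2hex($viaMb), "\n";
```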
Finally, when sending a response back to the client, you need to encode text when it contains non-ASCII characters. You should also be using header() to let the client know which charset is being used for that encoding:
header($_SERVER["SERVER_PROTOCOL"] . " 200 OK");
header('Content-Type: text/plain; charset="utf-8"');
if ( $imagem != "none" )
{
...
if (mysqli_affected_rows($mysqli) > 0)
print utf8_encode("Sucesso!");
else
print utf8_encode("Não foi possível inserir o registro");
}
else
print utf8_encode("Não á foi possível carregar a imagem.");

CodeIgniter and RESTful: how to remove backslashes from returned json?

I'm struggling to remove the escaped characters from the JSON response in my CodeIgniter application with Phil Sturgeon's REST Server.
Everything is working OK, but the problem comes with the response, when I access the URL to get the data in json format I get it, but with escaped characters.
Example:
http://localhost/revista_servidor/index.php/api/notas/nota/id/1
Gives me the next response:
[{"id":"1","autor":"Prueba autor","titulo":"Comprobaci\u00f3n de t\u00ed\u00edtulo.","subtitulo":"Comprobaci\u00f3n de subt\u00edtulo.","foto1":"http://link.a.foto/foto1","texto1":"Comprobaci\u00f3n de texto 1.\r\n","pauta1":"1","texto2":"Comprobaci\u00f3n de texto 2.\r\n","foto2":"http://link.a.foto/foto2","pauta2":"1","texto3":"Comprobaci\u00f3n de texto 3.","foto3":"http://link.a.foto/foto3","pauta3":"1","texto4":"Comprobaci\u00f3n de texto 4.","texto5":"Comprobaci\u00f3n de texto 5.","texto6":"Comprobaci\u00f3n de texto 6.","datosweb":"http://link.a.pagina.de.datos/","adelanto":"Comprobaci\u00f3n del texto de adelante","nrorevista":"69"}]
It escapes URLs by adding a backslash (\) and replaces special characters (ó in this example) with \u00f3.
I've tried adding stripslashes(), but it didn't work.
I checked the response in developers tools and it comes as expected: Content-Type: application/json.
How can I fix this encoding problem? I've also checked the configuration files and there seems to be nothing to change for this issue.
I hope someone can point me in the right direction, below is my code:
Controller: /application/controllers/api/notas.php
function nota_get() {
// ID verification.
if ( !$this->get('id') ) {
// NO ID.
$this->response(NULL, 400);
}
$nota = $this->Notas_model->get( $this->get('id') );
if ($nota) {
stripcslashes($this->response($nota, 200));
}
else {
$this->response(NULL, 404);
}
}
Model: /application/models/notas_model.php
function get($id = 0) {
$this->load->database();
if ( $id ) {
$query = $this->db->get_where( 'notas', array('id' => $id) );
}
else {
$query = $this->db->get('notas');
}
return $query->result();
}
I don't know if this matters, but this data will be accessed via javascript in the client side.
Thanks in advance!
It's just the way JSON works; try the json_decode function.
e.g:
$json = json_decode($json_string);
$json->autor;
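A slightly fuller sketch of the same idea, using a shortened, hypothetical version of the response from the question. It shows that both the \u00f3 escape and the \/ escape disappear once the string is parsed, so there is nothing to strip by hand.

```php
<?php
// Shortened sample; single quotes keep the backslashes exactly as sent.
$json_string = '[{"id":"1","titulo":"Comprobaci\u00f3n","foto1":"http:\/\/link.a.foto\/foto1"}]';

$notas = json_decode($json_string);

echo $notas[0]->titulo, "\n"; // real "ó" as UTF-8
echo $notas[0]->foto1, "\n";  // plain slashes
```

A JavaScript client gets the same result from JSON.parse(), since these escapes are part of the JSON format itself.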
1) You need to be using json_decode() and then urldecode(), instead of stripslashes()
2) Both urldecode() and stripslashes() take a string as an argument, while you are trying to feed into it an object -- which gets "autoreduced" into something that depends on the PHP version... Whatever it is, it's probably not what you are expecting.
In your code:
if ($nota) {
stripcslashes($this->response($nota, 200));
}
you'll need to a) save the result of decoding to the same or another variable, and b) loop through your object ($nota) and unescape the value in each key-value pair.
Try
...
$cleanObject = array();
if ($nota) {
$decodedObject = json_decode($this->response($nota, 200));
foreach ( $decodedObject as $key => $value ) {
$cleanObject[$key] = urldecode( $value );
}
}
echo "<pre>";
print_r($cleanObject);
echo "</pre>";
// output
/*
Array
(
[id] => 1
[autor] => Prueba autor
[titulo] => Comprobación de tíítulo.
[subtitulo] => Comprobación de subtítulo.
[foto1] => http://link.a.foto/foto1
[texto1] => Comprobación de texto 1.
[pauta1] => 1
[texto2] => Comprobación de texto 2.
[foto2] => http://link.a.foto/foto2
[pauta2] => 1
[texto3] => Comprobación de texto 3.
[foto3] => http://link.a.foto/foto3
[pauta3] => 1
[texto4] => Comprobación de texto 4.
[texto5] => Comprobación de texto 5.
[texto6] => Comprobación de texto 6.
[datosweb] => http://link.a.pagina.de.datos/
[adelanto] => Comprobación del texto de adelante
[nrorevista] => 69
)
*/
...
Hopefully, this is what you are looking to achieve.
Depending on your further needs, you may have to re-encode the result. Based on javascript you mention, you may then need to convert the result back to JSON:
$unescapedAndJSONencodedObject = json_encode( $cleanObject );

php's json_encode and character representation

I'll try to present it as simply as I can:
I use json_encode() to encode a number of UTF-8 strings from different languages, and I notice that characters remain unchanged when they belong to the ASCII table, but everything else is returned as '\unnnn', where 'nnnn' is a hexadecimal number.
See the code:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="content-type" content="application/xhtml+xml; charset=UTF-8" />
<title>Multibyte string functions</title>
</head>
<body>
<h3>Multibyte string functions</h3>
<p>
<?php
//present json encode errors nicely:
//assign integer values to keys and error names to values
echo '<br /><b>Define JSON errors</b><br />';
$constants = get_defined_constants(true);
$json_errors = array();
foreach ($constants["json"] as $name => $value) {
if (!strncmp($name, "JSON_ERROR_", 11)) {
$json_errors[$value] = $name;
}
}
echo nl2br(print_r($json_errors, true), true);
//Display current detection order
echo "<br /><b>Current detection order 'mb_detect_order()':</b> ", implode(", ", mb_detect_order());
//Display internal encoding
echo "<br /><b>Internal encoding 'mb_internal_encoding()':</b> ", mb_internal_encoding();
//Get current language
echo "<br /><b>Current detection language 'mb_language()' ('neutral' for utf8):</b> ", mb_language();
//our test data
//a nowdoc that can break a <input> field;
$str = <<<'STR'
O'Reilly(\n) "& 'Big\Two # <span>bo\tld</span>"
STR;
$strings = array(
$str,
"Latin: tell me the answer and I might find the question!",
"Greek: πες μου την ερώτηση και ίσως βρω την απάντηση!",
"Chinese simplified: 告诉我答复,并且我也许发现问题!",
"Arabic: أخبرني الاجابة, انا قد تجد مسالة!",
"Portuguese: mais coisas a pensar sobre diário ou dois!",
"French: plus de choses à penser à journalier ou à deux!",
"Spanish: ¡más cosas a pensar en diario o dos!",
"Italian: più cose da pensare circa giornaliere o due!",
"Danish: flere ting å tenke på hver dag eller to!",
"Chech: Další věcí, přemýšlet o každý den nebo dva!",
"German: mehr über Spaß spät schönen",
"Albanian: më vonë gjatë fun bukur",
"Hungarian: több mint szórakozás késő csodálatos kenyér"
);
//show encoding and then encode
foreach( $strings as $string ){
echo "<br /><br />$string :", mb_detect_encoding($string);
$json = json_encode($string);
echo "<br />Error? ", $json_errors[json_last_error()];
echo '<br />json=', $json;
}
The above code will output:
Define JSON errors
Array
(
[0] => JSON_ERROR_NONE
[1] => JSON_ERROR_DEPTH
[2] => JSON_ERROR_STATE_MISMATCH
[3] => JSON_ERROR_CTRL_CHAR
[4] => JSON_ERROR_SYNTAX
[5] => JSON_ERROR_UTF8
)
Current detection order 'mb_detect_order()': ASCII, UTF-8
Internal encoding 'mb_internal_encoding()': ISO-8859-1
Current detection language 'mb_language()' ('neutral' for utf8): neutral
O'Reilly(\n) "& 'Big\Two # bo\tld" :ASCII
Error? JSON_ERROR_NONE
json="O'Reilly(\\n) \"& 'Big\\Two # bo\\tld<\/span>\""
Latin: tell me the answer and I might find the question! :ASCII
Error? JSON_ERROR_NONE
json="Latin: tell me the answer and I might find the question!"
Greek: πες μου την ερώτηση και ίσως βρω την απάντηση! :UTF-8
Error? JSON_ERROR_NONE
json="Greek: \u03c0\u03b5\u03c2 \u03bc\u03bf\u03c5 \u03c4\u03b7\u03bd \u03b5\u03c1\u03ce\u03c4\u03b7\u03c3\u03b7 \u03ba\u03b1\u03b9 \u03af\u03c3\u03c9\u03c2 \u03b2\u03c1\u03c9 \u03c4\u03b7\u03bd \u03b1\u03c0\u03ac\u03bd\u03c4\u03b7\u03c3\u03b7!"
Chinese simplified: 告诉我答复,并且我也许发现问题! :UTF-8
Error? JSON_ERROR_NONE
json="Chinese simplified: \u544a\u8bc9\u6211\u7b54\u590d\uff0c\u5e76\u4e14\u6211\u4e5f\u8bb8\u53d1\u73b0\u95ee\u9898!"
Arabic: أخبرني الاجابة, انا قد تجد مسالة! :UTF-8
Error? JSON_ERROR_NONE
json="Arabic: \u0623\u062e\u0628\u0631\u0646\u064a \u0627\u0644\u0627\u062c\u0627\u0628\u0629, \u0627\u0646\u0627 \u0642\u062f \u062a\u062c\u062f \u0645\u0633\u0627\u0644\u0629!"
Portuguese: mais coisas a pensar sobre diário ou dois! :UTF-8
Error? JSON_ERROR_NONE
json="Portuguese: mais coisas a pensar sobre di\u00e1rio ou dois!"
French: plus de choses à penser à journalier ou à deux! :UTF-8
Error? JSON_ERROR_NONE
json="French: plus de choses \u00e0 penser \u00e0 journalier ou \u00e0 deux!"
Spanish: ¡más cosas a pensar en diario o dos! :UTF-8
Error? JSON_ERROR_NONE
json="Spanish: \u00a1m\u00e1s cosas a pensar en diario o dos!"
Italian: più cose da pensare circa giornaliere o due! :UTF-8
Error? JSON_ERROR_NONE
json="Italian: pi\u00f9 cose da pensare circa giornaliere o due!"
Danish: flere ting å tenke på hver dag eller to! :UTF-8
Error? JSON_ERROR_NONE
json="Danish: flere ting \u00e5 tenke p\u00e5 hver dag eller to!"
Chech: Další věcí, přemýšlet o každý den nebo dva! :UTF-8
Error? JSON_ERROR_NONE
json="Chech: Dal\u0161\u00ed v\u011bc\u00ed, p\u0159em\u00fd\u0161let o ka\u017ed\u00fd den nebo dva!"
German: mehr über Spaß spät schönen :UTF-8
Error? JSON_ERROR_NONE
json="German: mehr \u00fcber Spa\u00df sp\u00e4t sch\u00f6nen"
Albanian: më vonë gjatë fun bukur :UTF-8
Error? JSON_ERROR_NONE
json="Albanian: m\u00eb von\u00eb gjat\u00eb fun bukur"
Hungarian: több mint szórakozás késő csodálatos kenyér :UTF-8
Error? JSON_ERROR_NONE
json="Hungarian: t\u00f6bb mint sz\u00f3rakoz\u00e1s k\u00e9s\u0151 csod\u00e1latos keny\u00e9r"
As you can see, in most languages except English the UTF-8 characters are converted to hexadecimal escapes.
Is it possible to encode without replacing my Unicode characters? Is it safe? What do other people do?
Keep in mind that such strings come from user input in pages and are stored in MySQL.
Thanks.
Maybe you should try json_encode($string, JSON_UNESCAPED_UNICODE), or any of the other options listed at http://php.net/manual/fr/function.json-encode.php that may be useful for your various cases.
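A minimal sketch of the difference, assuming PHP 5.4 or later (the version in which the JSON_UNESCAPED_UNICODE flag was added). The sample string is taken from the test data in the question, written with explicit UTF-8 bytes.

```php
<?php
$string = "Portuguese: di\xC3\xA1rio"; // "diário" as UTF-8 bytes

// Default: non-ASCII characters become \uXXXX escapes.
$default = json_encode($string);

// With the flag: characters are kept as literal UTF-8.
$unescaped = json_encode($string, JSON_UNESCAPED_UNICODE);

echo $default, "\n", $unescaped, "\n";
```

Both outputs are valid JSON and decode to the same string, so the choice mainly affects readability and payload size.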
OK, thanks a lot for the answer!
The problem is that I'm on PHP version 5.3.10, so json_encode($string, JSON_UNESCAPED_UNICODE) isn't an option.
Fortunately, a user called "Mr Swordsteel" posted a comment in PHP's manual, http://www.php.net/manual/en/function.json-encode.php, which actually does the trick (thank you, Mr Swordsteel!).
A nice side effect is that it fully emulates the json_encode function, which gives a hint on how to port it to another language such as JavaScript and keep our libraries able to communicate.
function my_json_encode($in){
$_escape = function ($str) {
return addcslashes($str, "\v\t\n\r\f\"\\/");
};
$out = "";
if (is_object($in)){
$class_vars = get_object_vars(($in));
$arr = array();
foreach ($class_vars as $key => $val){
$arr[$key] = "\"{$_escape($key)}\":\"{$val}\"";
}
$val = implode(',', $arr);
$out .= "{{$val}}";
}elseif (is_array($in)){
$obj = false;
$arr = array();
foreach($in as $key => $val){
if(!is_numeric($key)){
$obj = true;
}
$arr[$key] = my_json_encode($val);
}
if($obj){
foreach($arr AS $key => $val){
$arr[$key] = "\"{$_escape($key)}\":{$val}";
}
$val = implode(',', $arr);
$out .= "{{$val}}";
}else {
$val = implode(',', $arr);
$out .= "[{$val}]";
}
}elseif (is_bool($in)){
$out .= $in ? 'true' : 'false';
}elseif (is_null($in)){
$out .= 'null';
}elseif (is_string($in)){
$out .= "\"{$_escape($in)}\"";
}else{
$out .= $in;
}
return "{$out}";
}
I gave it a lot of tests and couldn't break it!
It would be very interesting now to re-implement json_decode!
Thanks.
