I am working on an old existing website. All pages were encoded in ISO-European, including the MySQL database.
I want to add some AJAX using PHP's json_encode, which only supports UTF8.
Is there a solution to use json_encode without UTF8?
The only thing you need to do is to convert your data to UTF-8 before passing it to json_encode. That function requires UTF-8 data, and unless you want to reimplement json_encode yourself it's a lot easier to go along with its requirements:
function recursivelyConvertToUTF8($data, $from = 'ISO-8859-1') {
    if (is_array($data)) {
        return array_map(function ($value) use ($from) {
            return recursivelyConvertToUTF8($value, $from);
        }, $data);
    }
    // Only strings need converting; leave ints, floats, bools and null untouched.
    return is_string($data) ? iconv($from, 'UTF-8', $data) : $data;
}

echo json_encode(recursivelyConvertToUTF8($myData));
This is not necessarily a complete solution covering every possible use case, but it should illustrate the idea.
You can use var_export, utf8_encode and eval to convert an array to UTF-8 recursively. It's a bit of a hack, but something like the following works:
$obj = array("key" => "\xC4rger"); // "Ärger" in Latin1
eval('$utf8_obj = ' . utf8_encode(var_export($obj, TRUE)) . ';');
print json_encode($utf8_obj);
This will print
{"key":"\u00c4rger"}
I have a PHP script as below:
10. $json_sanitized = ds($json);
11. echo json_encode ( $json_sanitized );
The ds() function has a few rules to sanitize the $json data.
function ds($text, $double = true, $charset = null) {
if (is_array($text)) {
// Some code
} elseif (is_object($text)) {
// Some code
} elseif (is_bool($text)) {
// Some code
}
$defaultCharset = 'UTF-8';
if (is_string($double)) {
$charset = $double;
}
return htmlspecialchars($text, ENT_QUOTES, ($charset) ? $charset : $defaultCharset, $double);
}
But the HP Fortify scanner still reports that line #11 sends unvalidated data to a web browser, which can result in the browser executing malicious code.
Can anyone help on this?
Per a few other answers on this site, the json_encode function in PHP is generally safe, and there are some options that can make it safer through additional escaping.
Using the following helps to escape more potentially unsafe characters that Fortify picks up:
echo json_encode($json_sanitized,JSON_HEX_QUOT|JSON_HEX_TAG|JSON_HEX_AMP|JSON_HEX_APOS);
Per the json_encode docs and the json constants docs, these constants provide the following (optional) conversions:
JSON_HEX_QUOT - All " are converted to \u0022.
JSON_HEX_TAG - All < and > are converted to \u003C and \u003E.
JSON_HEX_AMP - All &s are converted to \u0026.
JSON_HEX_APOS - All ' are converted to \u0027.
You may be able to skip the escaping of the single and double quotes, as I imagine the biggest gripe is that <> are able to be printed unescaped.
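As a rough illustration of what those flags change, here is a sketch with a made-up payload (the $payload array is hypothetical and is not the data from the question):

```php
<?php
// Hypothetical payload containing characters that are risky inside HTML.
$payload = ['msg' => '<b>Tom & "Jerry\'s"</b>'];

// Default encoding: <, >, & and ' are emitted as-is inside the JSON string.
$plain = json_encode($payload);

// With the HEX flags those characters are emitted as \uXXXX escapes instead.
$hex = json_encode(
    $payload,
    JSON_HEX_QUOT | JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS
);

echo $plain, PHP_EOL;
echo $hex, PHP_EOL;
```

Both strings decode back to the same array; the flags only change how the characters are written in the JSON text.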
Json: PHP to JavaScript safe or not?
Is json_encode Sufficient XSS Protection?
I'm calling json_encode($data) on a data array, and one field contains Russian characters.
I used mb_detect_encoding() to check the encoding of that field, and it reports UTF-8.
I think json_encode fails because of some bad characters in it, like "ра▒". I tried a lot of things, including utf8_encode on the data; that bypasses the error, but then the data no longer looks correct.
What can be done with this issue?
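The issue happens when the string contains some invalid UTF-8 byte sequences, even though most of it is valid UTF-8. The following cleans out the invalid sequences (mbstring replaces each one with its substitute character, "?" by default), after which json_encode works.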
$data['name'] = mb_convert_encoding($data['name'], 'UTF-8', 'UTF-8');
If you have a multidimensional array to encode as JSON, you can use the function below.
When JSON_ERROR_UTF8 occurs:
$encoded = json_encode( utf8ize( $responseForJS ) );
The function below converts the array data recursively:
/* Cleans up corrupt UTF-8 before json_encode.
* Useful for the error "Malformed UTF-8 characters, possibly incorrectly encoded".
*/
function utf8ize( $mixed ) {
if (is_array($mixed)) {
foreach ($mixed as $key => $value) {
$mixed[$key] = utf8ize($value);
}
} elseif (is_string($mixed)) {
return mb_convert_encoding($mixed, "UTF-8", "UTF-8");
}
return $mixed;
}
Please make sure to initialize your PDO object with the charset set to utf8.
This should fix this problem avoiding any re-utf8izing dance.
$pdo = new PDO("mysql:host=localhost;dbname=mybase;charset=utf8", 'user', 'password');
As of PHP 7.2, two flags let json_encode handle invalid UTF-8 directly:
https://www.php.net/manual/en/function.json-encode
json_encode($text, JSON_INVALID_UTF8_IGNORE);
Or
json_encode($text, JSON_INVALID_UTF8_SUBSTITUTE);
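A minimal sketch of how the two flags differ, assuming PHP 7.2 or later (the $text value is a made-up example with one stray byte):

```php
<?php
// "\xB1" on its own is not a valid UTF-8 sequence.
$text = "a\xB1b";

// Without a flag, encoding fails and json_encode returns false.
$strict = json_encode($text);

// IGNORE drops the invalid byte from the output.
$ignored = json_encode($text, JSON_INVALID_UTF8_IGNORE);

// SUBSTITUTE replaces the invalid byte with U+FFFD (the replacement character).
$substituted = json_encode($text, JSON_INVALID_UTF8_SUBSTITUTE);

var_dump($strict, $ignored, $substituted);
```

Which flag is preferable depends on whether silently losing a character or showing a visible replacement character is less harmful for the data in question.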
Just add charset=utf8 to your PDO connection string,
as in the PDO connection line below:
$pdo = new PDO("mysql:host=localhost;dbname=mybase;charset=utf8", 'user', 'password');
hope this will help you
Remove HTML entities before JSON encoding. I used html_entity_decode() in PHP and the problem was solved
$json = html_entity_decode($source);
$data = json_decode($json,true);
Do you by any chance have UUIDs in your result set? In that case the following database flag will help:
PDO::DBLIB_ATTR_STRINGIFY_UNIQUEIDENTIFIER => true
If your data is correctly encoded in the database, for example, make sure to use the mb_* functions for string handling before json_encode. Byte-oriented functions like substr or strlen do not work well with multibyte (utf8mb4) text: they can cut a string in the middle of a character and leave malformed UTF-8.
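A small sketch of that failure mode, assuming the mbstring extension is loaded and the script file itself is saved as UTF-8 (the $name value is only an example):

```php
<?php
$name = "Панама"; // 6 characters, 12 bytes in UTF-8

// substr counts bytes: 3 bytes cuts the second character in half.
$byteCut = substr($name, 0, 3);

// mb_substr counts characters: 3 whole characters.
$charCut = mb_substr($name, 0, 3, 'UTF-8');

var_dump(mb_check_encoding($byteCut, 'UTF-8')); // is the byte-cut string valid UTF-8?
var_dump(json_encode($byteCut));                // result of encoding the byte-cut string
var_dump($charCut);
```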
I know this is kind of an old topic, but it was what I needed. I just had to modify the answer by 'jayashan perera'.
//...code
$stmt->execute();
$result = $stmt->fetchAll(PDO::FETCH_ASSOC);
for ($i=0; $i < sizeof($result) ; $i++) {
$tempCnpj = $result[$i]['CNPJ'];
$tempFornecedor = json_encode(html_entity_decode($result[$i]['Nome_fornecedor']),true) ;
$tempData = $result[$i]['efetivado_data'];
$tempNota = $result[$i]['valor_nota'];
$arrResposta[$i] = ["Status"=>"true", "Cnpj"=>"$tempCnpj", "Fornecedor"=>$tempFornecedor, "Data"=>"$tempData", "Nota"=>"$tempNota" ];
}
echo json_encode($arrResposta);
And in the .js I used:
obj = JSON.parse(msg);
I have this code here:
case 'resource_list':
if(file_exists('content.php')){
include('../ajax/content.php');
} else {
die('does not exist');
}
$html = render_content_page($array[1],$array[2]);
$slate = 'info_slate';
$reply_array = array(
'html' => $html,
'slate' => $slate
);
echo json_encode($reply_array);
break;
I have debugged every level right up until json_encode() is called, but the data I receive back in my AJAX call is null for the html key. This code is essentially a copy and paste of another case that just calls a function other than render_content_page(), and that one works perfectly fine.
$reply_array var_exports to:
array (
'html' => '<ol>
<li unit="quiz" identifier=""><img src="img/header/notifications.png"/>Fran�ois Vase Volute Krater</li>
</ol>',
'slate' => 'info_slate',
)
My initial thought is that the problem is the special character in Fran�ois Vase Volute Krater, as json_encode only works with UTF-8 encoded data.
Try UTF-8 encoding it before JSON encoding it like so:
json_encode(utf8_encode("Fran�ois Vase Volute Krater"));
Maybe the problem is with the encoding?
As the manual states, json_encode() works only with UTF-8 encoded data:
This function only works with UTF-8 encoded data.
http://php.net/json_encode
As documented, json_encode expects its input text in UTF-8. Most likely, your input (the ç) is not in UTF-8.
Use utf8_encode (if you're currently using ISO-8859-1) or mb_convert_encoding (otherwise) to convert input strings to UTF-8.
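A sketch of detecting the failure and converting, assuming the source text really is ISO-8859-1 and a PHP version where json_encode returns false on invalid UTF-8 (older versions instead emitted null for the offending string, which matches the symptom in the question). The byte \xE7 stands in for the ç as it would be stored in ISO-8859-1:

```php
<?php
// "Fran\xE7ois" is "François" in ISO-8859-1; the lone \xE7 byte is invalid UTF-8.
$reply = [
    'html'  => "Fran\xE7ois Vase Volute Krater",
    'slate' => 'info_slate',
];

$json = json_encode($reply);

if ($json === false && json_last_error() === JSON_ERROR_UTF8) {
    // Assumption: the data is ISO-8859-1. Convert it, then encode again.
    $reply['html'] = mb_convert_encoding($reply['html'], 'UTF-8', 'ISO-8859-1');
    $json = json_encode($reply);
}

echo $json, PHP_EOL;
```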
I'm trying to do a Latin1 to UTF-8 conversion for WordPress and had no luck with the tutorial posted in the Codex. I came up with this to check the encoding and convert:
while($row = mysql_fetch_assoc($sql)) {
if(!mb_check_encoding($row['post_content'], 'UTF-8')) {
$row = mb_convert_encoding($row['post_content'], 'ISO-8859-1', 'UTF-8');
if(!mb_check_encoding($row['post_content'], 'UTF-8')) {
echo 'Can\'t Be Converted<br/>';
}
else {
echo '<br/>'.$row.'<br/><br/>';
}
}
else {
echo 'UTF-8<br/>';
}
}
This works... sorta. I'm not getting any rows that can't be converted, but I did notice that Panamá becomes Panam.
Am I missing a step? Or am I doing this all wrong?
UPDATE
The data stored within the database is corrupt (á characters are stored). So it's looking more like a find-and-replace job than a conversion. I haven't found any great solutions so far for doing this automagically.
This will help you. http://php.net/manual/en/book.iconv.php
Furthermore, you can set your MySQL connection to utf8 this way:
mysql_set_charset ('utf8',$this->getConnection());
$this->getConnection in my code returns the variable which was returned by
mysql_connect(MYSQL_SERVER,DB_LOGIN,DB_PASS);
Refer to the PHP documentation for mb_convert_encoding:
string mb_convert_encoding ( string $str , string $to_encoding [, mixed $from_encoding ] )
Your code is attempting to convert to ISO-8859-1 from UTF-8!
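With the arguments in the documented order, the conversion could look like the sketch below. The $content variable is a hypothetical stand-in for $row['post_content'], and this only helps when the stored bytes really are ISO-8859-1; it does not repair data that was already double-encoded.

```php
<?php
// "Panam\xE1" is "Panamá" in ISO-8859-1.
$content = "Panam\xE1";

if (!mb_check_encoding($content, 'UTF-8')) {
    // Argument order: string, TO encoding, FROM encoding.
    $content = mb_convert_encoding($content, 'UTF-8', 'ISO-8859-1');
}

echo $content, PHP_EOL;
```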
I'm using the code from here: http://phlymail.com/en/downloads/idna/download/ and built a function like this (from the example):
function convert_to_punycode($inputstring)
{
$IDN = new idna_convert();
// The input string, if input is not UTF-8 or UCS-4, it must be converted before
$inputstringutf8 = utf8_encode($inputstring);
// Encode it to its punycode presentation
$outputstringpunycode = $IDN->encode($inputstringutf8);
return $outputstringpunycode;
}
However, it doesn't work properly.
For the input: Россию
It gives: РоÑÑиÑ
Whereas it should give: xn--h1alffa3f
What am I doing wrong? $inputstring which is being passed is a normal string with no special declarations/etc...
Is your string already UTF-8? Looks like it.
Or is it in ISO-8859-5?
In both cases you cannot use the PHP function utf8_encode(), since it expects your input string to be ISO-8859-1 (ISO Latin-1, Western European languages). Look into the file transcode_wrapper.php, which is delivered with the class source. This might help you.
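To see why the result is garbled, here is a sketch that applies the same Latin-1 to UTF-8 conversion to a string that is already UTF-8. It uses mb_convert_encoding as a stand-in for utf8_encode, and assumes the mbstring extension is loaded and the script file is saved as UTF-8:

```php
<?php
$input = "Россию"; // already UTF-8: 6 characters, 12 bytes

// Treats every byte as one Latin-1 character, so each byte of the
// Cyrillic text is re-encoded as a separate two-byte character.
$doubleEncoded = mb_convert_encoding($input, 'UTF-8', 'ISO-8859-1');

// Only convert when the input is not valid UTF-8 already.
$safe = mb_check_encoding($input, 'UTF-8')
    ? $input
    : mb_convert_encoding($input, 'UTF-8', 'ISO-8859-1');

var_dump(strlen($input), strlen($doubleEncoded), $safe === $input);
```

The guard is only a heuristic: text in another single-byte encoding such as ISO-8859-5 still needs to be converted from its actual encoding.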
You might need the PHP IDNA extension.
I'd add something like the following, which uses the extension if it is available and otherwise falls back to the class Dave suggested:
if (!function_exists('idn_to_ascii') and !function_exists('idn_to_utf8')) {
    define('IDN_FALLBACK_VERSION', 2008);
    require_once('idna_convert.class.php');

    function idn_to_ascii($string) {
        $IDN = new idna_convert(array('idn_version' => IDN_FALLBACK_VERSION));
        return $IDN->encode($string);
    }

    function idn_to_utf8($string) {
        $IDN = new idna_convert(array('idn_version' => IDN_FALLBACK_VERSION));
        return $IDN->decode($string);
    }

    function idn_to_unicode($string) {
        return idn_to_utf8($string);
    }
}
Try this method to convert the encoding:
//$inputstringutf8 = utf8_encode($inputstring);
$inputstringutf8 = mb_convert_encoding($inputstring, 'utf-8', mb_detect_encoding($inputstring));