The server is running with PHP 5.2.17, and I am trying to run get_html_translation_table() with three arguments. Here is how I invoke the function:
$text = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES, "UTF-8");
I am getting a warning message saying
get_html_translation_table expects at most 2 parameters, 3 given
(filename and line number).
Per the PHP documentation, the third argument is only supported as of PHP 5.3.4, but adding it is the only way I can think of to get the returned array encoded in UTF-8. (It works despite the ugly warning message.)
I need get_html_translation_table() to write a function that encodes all HTML special characters and spaces, and the following function just won't work without the third argument.
/**
* Encode all HTML special characters and spaces, then apply nl2br()
* @param string $original
* @return string
*/
function ecode_html_sp_chars($original) {
$table = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES, "UTF-8");
$table[' '] = '&nbsp;';
$encoded = strtr($original, $table);
return nl2br($encoded);
}
Two options: upgrade your PHP version, or use htmlentities() instead. The encoding parameter of htmlentities() was added in PHP 4.1.
Example:
function ecode_html_sp_chars($original) {
$encoded = htmlentities($original, ENT_QUOTES, "UTF-8");
$encoded = str_replace(' ', '&nbsp;', $encoded);
return nl2br($encoded);
}
Related
So I’m getting some text from a user in PHP and one of the characters is supposed to be an apostrophe but instead of coming in as the character apostrophe ’ it comes in as %u2019.
I tried all of the following to no avail:
$b = urldecode($a);
$c = utf8_decode($a);
$d = html_entity_decode($a);
$e = rawurldecode($a);
This %u2019 seemingly can’t be turned back to this character.
urldecode: %u2019
utf8_decode: %u2019
html_entity_decode: %u2019
rawurldecode: %u2019
It must be a JavaScript-escaped string. You can see the same string if you run escape("’") in your browser console.
Use my PHP snippet, which decodes the %uXXXX sequences the way JavaScript's unescape() does:
$str = preg_replace_callback(
'/%u([0-9a-fA-F]{4})/',
function($matches) {
return mb_convert_encoding('&#'.hexdec($matches[1]).';', 'UTF-8', 'HTML-ENTITIES');
},
$str
);
If your PHP is older than 5.3, define the callback as a normal named function, because older versions do not support closures.
I have this PHP function that converts all the HTML special characters to HTML entities, UTF-8 compatible.
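For what it's worth, here is a slightly fuller sketch of the same idea. It also handles the two-digit %XX form that escape() produces for characters below U+0100, and it uses html_entity_decode() so it does not depend on mbstring. The function name js_unescape_sketch is made up for this example, and the sketch assumes no surrogate pairs (characters outside the BMP) and no control characters in the input, since html_entity_decode() leaves those references undecoded.

```php
<?php
// Hypothetical helper, not a built-in: decodes %uXXXX and %XX sequences
// produced by JavaScript's escape() into a UTF-8 string.
function js_unescape_sketch($str) {
    return preg_replace_callback(
        '/%u([0-9a-fA-F]{4})|%([0-9a-fA-F]{2})/',
        function ($m) {
            // $m[2] is only set when the two-digit %XX form matched.
            $hex = isset($m[2]) ? $m[2] : $m[1];
            // Let PHP turn the numeric character reference into UTF-8.
            return html_entity_decode('&#x' . $hex . ';', ENT_QUOTES, 'UTF-8');
        },
        $str
    );
}

echo js_unescape_sketch('it%u2019s'), "\n"; // it’s
echo js_unescape_sketch('caf%E9'), "\n";    // café
```

The closure needs PHP 5.3 or later, same as the snippet above.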
function safe($input) {
$text = trim($input); //<-- LINE 31
$text = preg_replace("/(\r\n|\n|\r)/", "\n", $text); // cross-platform newlines
$text = preg_replace("/\n\n\n\n+/", "\n", $text); // take care of duplicates
$text = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
$text = stripslashes($text);
$text = str_replace ( "\n", " ", $text );
$text = str_replace ( "\t", " ", $text );
return $text;
}
Now, I checked my script using the Acunetix web vulnerability scanner and I see this error:
This page contains an error/warning message that may disclose sensitive information. The message can also contain the location of the file that produced the unhandled exception.
This may be a false positive if the error message is found in documentation pages.
This vulnerability affects /cms/submit.php.
Discovered by: Scripting (Error_Message.script).
Attack details
URL encoded POST input access was set to 2
Error message found:
<b>Warning</b>: trim() expects parameter 1 to be string, array given in <b>C:\xampp\htdocs\cms\includes\safefunc.php</b> on line <b>31</b><br />
How do I fix this?
As others have said, the error is self-explanatory: this function is not built to handle arrays. If you need to handle arrays, then do something like this:
function safe($input) {
if(is_array($input)) {
return array_map('safe', $input);
}
// rest of code
}
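To make the recursion concrete, here is a minimal runnable sketch. The name safe_sketch and the shortened body (just trim() plus htmlspecialchars()) are assumptions for the example, not the asker's full function. With a single array argument, array_map() keeps the keys, so nested POST arrays keep their shape.

```php
<?php
// Hypothetical, shortened stand-in for safe(): recurses into arrays so
// trim() only ever sees a string.
function safe_sketch($input) {
    if (is_array($input)) {
        // Apply the same function to every element, at any depth.
        return array_map('safe_sketch', $input);
    }
    $text = trim((string) $input);
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

$result = safe_sketch(array('a' => ' <b> ', 'n' => array(" x'y ")));
print_r($result); // 'a' => '&lt;b&gt;', 'n' => array('x&#039;y')
```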
Read the warning; your answer is there.
trim() must receive a string as its parameter, not an array.
Use
var_dump($input) to check the type of your input variable.
Could you show the code that calls safe()?
Try to stringify $input by using
json_encode($input)
It worked for me.
I have a PHP script that deals with a wide variety of languages. Unfortunately, whenever I try to use json_encode, any Unicode output is converted to hexadecimal entities. Is this the expected behavior? Is there any way to convert the output to UTF-8 characters?
Here's an example of what I'm seeing:
INPUT
echo $text;
OUTPUT
База данни грешка.
INPUT
json_encode($text);
OUTPUT
"\u0411\u0430\u0437\u0430 \u0434\u0430\u043d\u043d\u0438 \u0433\u0440\u0435\u0448\u043a\u0430."
Since PHP 5.4.0, there is an option called JSON_UNESCAPED_UNICODE. Check it out:
https://php.net/function.json-encode
Therefore you should try:
json_encode( $text, JSON_UNESCAPED_UNICODE );
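A quick side-by-side, assuming PHP 5.4 or later and a source file saved as UTF-8. Both outputs are valid JSON and decode to the same string, so the flag only changes how the text looks on the wire.

```php
<?php
$text = 'База';

// Default: non-ASCII characters are written as \uXXXX escapes.
echo json_encode($text), "\n";                         // "\u0411\u0430\u0437\u0430"

// With the flag: characters are emitted as raw UTF-8.
echo json_encode($text, JSON_UNESCAPED_UNICODE), "\n"; // "База"

// Either form decodes back to the original string.
var_dump(json_decode(json_encode($text)) === $text);   // bool(true)
```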
JSON_UNESCAPED_UNICODE is available on PHP Version 5.4 or later.
The following code is for Version 5.3.
UPDATED
html_entity_decode() is a bit more efficient than pack() + mb_convert_encoding().
(*SKIP)(*FAIL) makes the pattern skip escaped backslashes and the characters escaped by the JSON_HEX_* flags, leaving them untouched.
function raw_json_encode($input, $flags = 0) {
$fails = implode('|', array_filter(array(
'\\\\',
$flags & JSON_HEX_TAG ? 'u003[CE]' : '',
$flags & JSON_HEX_AMP ? 'u0026' : '',
$flags & JSON_HEX_APOS ? 'u0027' : '',
$flags & JSON_HEX_QUOT ? 'u0022' : '',
)));
$pattern = "/\\\\(?:(?:$fails)(*SKIP)(*FAIL)|u([0-9a-fA-F]{4}))/";
$callback = function ($m) {
return html_entity_decode("&#x$m[1];", ENT_QUOTES, 'UTF-8');
};
return preg_replace_callback($pattern, $callback, json_encode($input, $flags));
}
You probably want to set the charset and use unescaped Unicode:
header('Content-Type: application/json;charset=utf-8');
json_encode($data,JSON_UNESCAPED_UNICODE|JSON_PRETTY_PRINT);
One solution is to first encode data and then decode it in the same file:
$string =json_encode($input, JSON_UNESCAPED_UNICODE) ;
echo $decoded = html_entity_decode( $string );
Here is my combined solution for various PHP versions.
In my company we are working with different servers with various PHP versions, so I had to find solution working for all.
// version_compare() avoids parsing the version string by hand
if (version_compare(PHP_VERSION, '5.4.0', '>=')) {
$encodedValue = json_encode($value, JSON_UNESCAPED_UNICODE);
} else {
$encodedValue = preg_replace('/\\\\u([a-f0-9]{4})/e', "iconv('UCS-4LE','UTF-8',pack('V', hexdec('U$1')))", json_encode($value));
}
Credits should go to Marco Gasi & abu. The solution for PHP >= 5.4 is provided in the json_encode docs.
json_encode($text, JSON_UNESCAPED_UNICODE|JSON_UNESCAPED_SLASHES);
The raw_json_encode() function above did not solve the problem for me (for some reason, the callback function raised an error on my PHP 5.2.5 server).
But this other solution did actually work.
https://www.experts-exchange.com/questions/28628085/json-encode-fails-with-special-characters.html
Credits should go to Marco Gasi. I just call his function instead of calling json_encode():
function jsonRemoveUnicodeSequences( $json_struct )
{
return preg_replace( "/\\\\u([a-f0-9]{4})/e", "iconv('UCS-4LE','UTF-8',pack('V', hexdec('U$1')))", json_encode( $json_struct ) );
}
Is this the expected behavior?
json_encode() only works with UTF-8 encoded data.
Maybe you can find out how to convert it here: cyrillic-characters-in-phps-json-encode
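To illustrate the UTF-8 requirement, here is a small sketch. It assumes the incoming bytes are ISO-8859-1, which is just an example; use whatever encoding your data is really in. On current PHP versions json_encode() returns false for input that is not valid UTF-8.

```php
<?php
// "café" encoded as ISO-8859-1: the byte 0xE9 alone is not valid UTF-8.
$latin1 = "caf\xE9";

// Encoding the raw bytes fails.
var_dump(json_encode($latin1)); // bool(false)

// Convert to UTF-8 first (source encoding is an assumption here).
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);
echo json_encode($utf8), "\n";  // "caf\u00e9"
```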
I want to convert everything like spaces, single/double quotes, line break, etc.
Here is a sample input (Thanks som_nangia) :
Escape Check < "escape these" > <“and these”> <html><tr><td></td></tr></html> 'these will need escaping too' ‘ so will these’ <script> </script>
Here are the options I am considering:
<pre>Escape Check < "escape these" > <“and these”> <html><tr><td></td></tr></html> 'these will need escaping too' ‘ so will these’ <script> </script></pre>
/**
* Encoding html special characters, including nl2br
* @param string $original
* @return string
*/
function encode_html_sp_chars($original) {
$table = get_html_translation_table(HTML_ENTITIES);
$table[' '] = '&nbsp;';
$encoded = strtr($original, $table);
return nl2br($encoded);
}
I have tried both htmlspecialchars and htmlentities, but neither of them encodes spaces.
Use htmlspecialchars.
echo htmlspecialchars($string);
In your case, pass the two additional parameters this way:
echo htmlspecialchars($string, ENT_QUOTES, 'UTF-8');
Thanks to zerkms and Phil.
Explanation
Certain characters have special significance in HTML, and should be represented by HTML entities if they are to preserve their meanings. This function returns a string with these conversions made. If you require all input substrings that have associated named entities to be translated, use htmlentities() instead.
If the input string passed to this function and the final document share the same character set, this function is sufficient to prepare input for inclusion in most contexts of an HTML document. If, however, the input can represent characters that are not coded in the final document character set and you wish to retain those characters (as numeric or named entities), both this function and htmlentities() (which only encodes substrings that have named entity equivalents) may be insufficient. You may have to use mb_encode_numericentity() instead.
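As a rough illustration of the mb_encode_numericentity() route mentioned above (it needs the mbstring extension): the conversion map below is an assumption chosen to cover every code point from U+0080 up, so markup characters are still handled by htmlspecialchars() and only non-ASCII characters become numeric entities.

```php
<?php
// Conversion map: [start, end, offset, mask]. This one targets every
// code point from U+0080 to U+10FFFF and leaves ASCII alone.
$convmap = array(0x80, 0x10FFFF, 0, 0x1FFFFF);

// Escape the markup characters first, then the non-ASCII ones.
$escaped = htmlspecialchars('<p>café ’</p>', ENT_QUOTES, 'UTF-8');
$ascii   = mb_encode_numericentity($escaped, $convmap, 'UTF-8');

echo $ascii, "\n"; // &lt;p&gt;caf&#233; &#8217;&lt;/p&gt;
```

The result is pure ASCII, so it survives whatever character set the final document is served in.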
The best way is probably htmlentities() followed by a space replacement.
Example:
function encode_html_sp_chars($original) {
$encoded = htmlentities($original, ENT_QUOTES, "UTF-8");
$encoded = str_replace(' ', '&nbsp;', $encoded);
return nl2br($encoded);
}
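Here is what that approach produces on a small input. The function name encode_with_nbsp_sketch is made up for this demo. Doing the space replacement after htmlentities() is safe because the generated entities never contain spaces.

```php
<?php
// Hypothetical demo of the htmlentities() + non-breaking-space approach.
function encode_with_nbsp_sketch($original) {
    $encoded = htmlentities($original, ENT_QUOTES, 'UTF-8');
    // Entities contain no spaces, so this only touches real spaces.
    $encoded = str_replace(' ', '&nbsp;', $encoded);
    // Insert <br /> before each newline.
    return nl2br($encoded);
}

echo encode_with_nbsp_sketch("<b> 'hi'\nthere");
// &lt;b&gt;&nbsp;&#039;hi&#039;<br />
// there
```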
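I save the record "فحص الرسالة العربية" in PHP, and it is always stored as:
&#1601;&#1581;&#1589; &#1575;&#1604;&#1585;&#1587;&#1575;&#1604;&#1577; &#1575;&#1604;&#1593;&#1585;&#1576;&#1610;&#1577;
I want to convert this to UTF-16BE when I retrieve it, so I am using a function, which returns: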
002600230031003600300031003b002600230031003500380031003b002600230031003500380039003b0020002600230031003500370035003b002600230031003600300034003b002600230031003500380035003b002600230031003500380037003b002600230031003500370035003b002600230031003600300034003b002600230031003500370037003b0020002600230031003500370035003b002600230031003600300034003b002600230031003500390033003b002600230031003500380035003b002600230031003500370036003b002600230031003600310030003b002600230031003500370037003b
This is the function that I'm using to convert the string retrieved from the database:
function convertCharsn($string) {
$in = '';
$out = iconv('UTF-8', 'UTF-16BE', $string);
for($i=0; $i<strlen($out); $i++) {
$in .= sprintf("%02X", ord($out[$i]));
}
return $in;
}
But when I enter the same text at the URL below, it gives a different result from my string.
http://www.routesms.com/downloads/onlineunicode.asp
returning :
0641062D063500200627064406310633062706440629002006270644063906310628064A0629
I want my string to be converted the same way as at the URL above.
My database collation is utf8_general_ci.
Basically, you need to decode those HTML entities back into characters first. Just use html_entity_decode():
$rawChars = html_entity_decode($string, ENT_QUOTES | ENT_HTML401, 'UTF-8');
convertCharsn($rawChars);
Otherwise, you're just encoding the entities. & is 0026 in UTF-16 and # is 0023, which is why the sequence 00260023 keeps repeating in the output you posted. So decode first, and you should be set.
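Put together, a compact sketch of the whole conversion might look like this. The function name to_utf16be_hex_sketch is made up, and bin2hex() stands in for the manual sprintf() loop. The input is the first word of the stored record.

```php
<?php
// Hypothetical helper: entity-encoded text -> uppercase UTF-16BE hex.
function to_utf16be_hex_sketch($string) {
    // 1. Turn &#1601; style references back into real UTF-8 characters.
    $raw = html_entity_decode($string, ENT_QUOTES, 'UTF-8');
    // 2. Transcode to UTF-16BE (an explicit byte order means no BOM).
    $utf16 = iconv('UTF-8', 'UTF-16BE', $raw);
    // 3. Hex-dump the bytes.
    return strtoupper(bin2hex($utf16));
}

echo to_utf16be_hex_sketch('&#1601;&#1581;&#1589;'), "\n"; // 0641062D0635
```

That matches the start of the output from the online converter quoted in the question.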