I want to convert special characters to HTML entities, and then back to the original characters.
I have used htmlentities() and then html_entity_decode(). It worked fine for me, but { and } do not come back to the original characters, they remain { and }.
What can I do now?
My code looks like:
$custom_header = htmlentities($custom_header);
$custom_header = html_entity_decode($custom_header);
Even though nobody can replicate your problem, here's a direct way to solve it by a simple str_replace.
$input = '<p><script type="text/javascript"> $(document).ready(function() { $("#left_side_custom_image").click(function() { alert("HELLO"); }); }); </script></p> ';
$output = str_replace( array( '{', '}'), array( '{', '}'), $input);
Demo (click the 'Source' link in the top right)
Edit: I see the problem now. If your input string is:
"{hello}"
A call to htmlentities encodes the & into &, which gives you the string
"{hello}"
The & is then later decoded back into &, to output this:
"{hello}"
The fix is to send the string through html_entity_decode again, which will properly decode your entities.
$custom_header = "{hello}";
$custom_header = htmlentities($custom_header);
$custom_header = html_entity_decode($custom_header);
echo html_entity_decode( $custom_header); // Outputs {hello}
Related
I am trying to pass a string to a javascript function which opens that string in an editable text area. If the string does not contain a new line character, it is passed successfully. But when there is a new line character it fails.
My code in PHP looks like
$show_txt = sprintf("showEditTextarea('%s')", $test_string);
$output[] = '<a href="#" id="link-'.$data['test'].'" onclick="'.$show_txt.';return false;">';
And the javascript function looks like -
$output[] = '<script type="text/javascript">
var showEditTextarea = function(test_string) {
alert(test_string);
}
</script>';
The string that was successfully passed was "This is a test" and it failed for "This is a first test
This is a second test"
Javascript does not allow newline characters in strings. You need to replace them by \n before the sprintf() call.
You are getting this error because there is nothing escaping your javascript variables... json_encode is useful here. addslashes will also have to be used in the context to escape the double quotes.
$show_txt = sprintf("showEditTextarea(%s)", json_encode($test_string));
$output[] = '<a href="#" id="link-'.$data['test'].'" onclick="'.htmlspecialchars($show_txt).';return false;">';
Why don't you try replacing all spaces in the php string with \r\n before you pass it to the JavaScript function? See if that works.
If that does not work then try this:
str_replace($test, "\n", "\n");
Replacing with two \ may work as it will encapsulate.
I would avoid storing HTML or JS in PHP variables as much as possible, but if you do need to store the HTML in a PHP variable then you will need to escape the new line characters.
try
$test_string = str_replace("\n", "\\\n", $test_string);
Be sure to use double quotes in the str_replace otherwise the \n will be interpreted as literally \n instead of a new line character.
Try this code, that deletes new lines:
$show_txt = sprintf("showEditTextarea('%s')", str_replace(PHP_EOL, '', $test_string));
Or replaces with: \n.
$show_txt = sprintf("showEditTextarea('%s')", str_replace(PHP_EOL, '\n', $test_string));
I have website that's in win-1251 encoding and it needs to stay that way. But I also need to be able to echo few links that contain non latin, non cyrillic characters like šžāņūī...
I need a function that convert this
"māja un man tā patīk"
to
"māja un man tā patīk"
and that does not touch html, so if there is <b> it needs to stay as <b>, not > or <
And please no advices about the encoding and how wrong that is.
$str = "<b>Obāchan</b> おばあちゃん";
$str = preg_replace_callback('/./u', function ($matches) {
$chr = $matches[0];
if (strlen($chr) > 1) {
$chr = mb_convert_encoding($chr, 'HTML-ENTITIES', 'UTF-8');
}
return $chr;
}, $str);
This expects the original $str to be UTF-8 encoded, i.e. your PHP file should be saved in UTF-8. It encodes all non-ASCII compatible code points to HTML entities. Since all HTML special characters are ASCII characters, they remain untouched. The resulting string is pure ASCII. Since the lower Win-1251 code points are ASCII compatible, the resulting string is also a valid Win-1251 string. The above $str converts to:
<b>Obāchan</b> おばあちゃん
The main things you probably don't want to encode are <, > and &. Those are really the only special characters. So how about encoding everything first, and then just decode <, > and & I feel you should be fine.
This is untested:
$output =
htmlspecialchars_decode(
htmlentities($input, ENT_NOQUOTES, 'CP-1251')
);
let me know
What Evert suggest looks logical to me too! If you insist this is a way to do it if there are only two letters that bother you. For more letters the scrit will not be as effective and needs to change.
<?PHP
function myConvert($str)
{
$chars['ā']='ā';
$chars['ī']='ī';
foreach ($chars as $key => $value)
$output = str_replace($key, $value, $str);
echo $str;
}
myConvert("māja un man tā patīk");
?>
==================edited==============
For many characters maybe this one can help you:
<?PHP
function myConvert($str)
{
$final=null;
$parts = preg_split("/&#[0-9]*;/i", $str);//get all text parts
preg_match_all("/&#[0-9]*;/i", $str, $delimiters );//get delimiters;
$delimiters[0][]='';//make arrays equal size
foreach($parts as $key => $value)
$final.=$value.mb_convert_encoding
($delimiters[0][$key], "UTF-8", "HTML-ENTITIES");
return $final;
}
$fh = fopen("testFile.txt", 'w') ;
fwrite($fh, myConvert("māja un man tā patīkī"));
fclose($fh);
?>
The desired output is written in the text file. This code, exactly as it is -not merged in some project- does what it claims to do. Converts codes like ā to the analogous character they present.
How do I convert a string that has a - or + sign to a html friendly string?
I mean to convert those characters to html notations, like space is and so on...
ps: htmlentities doesn't work. I still see the -/+
Try this
$string = str_replace('+', '+', $string); // Convert + sign
$string = str_replace('-', '-', $string); // Convert - sign
I don't think there is entities for these symbols see: http://www.w3schools.com/tags/ref_entities.asp
I tested with
$str = "- and +"; echo htmlentities($str);
and didn't get entities. According to: http://us.php.net/manual/en/function.htmlentities.php
I would expect them to be encoded if there was encoding available.
No idea what you want to accomplish. But this escapes selected characters to html entities:
$html = preg_replace("/([+-])/e", '"&#".ord("$1").";"', $html);
As far as I am aware, - and + are fine in HTML, and dont have an entity equivalent. See http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
Are you sure you're not thinking of URL encoding?
Specify that you want it to use unicode as follows:
htmlentities($str, ENT_QUOTES | ENT_IGNORE, "UTF-8");
Have a look at the 2nd comment on this page:
http://www.php.net/manual/en/function.htmlentities.php#100388
This will enable more encoding characters.
If you just want to encode some, then this is a little lighter weight:
<?php
$ent = array(
'+'=>'+',
'-'=>'+'
);
echo strtr('+ and -', $ent);
?>
I have seen some solutions, or at least tries, but none of them really work.
How do I strip all tags except those inside <code> or [code] - and replace all the < and > with < etc. in order to let JavaScript do some syntax highlighting on the output?
Why don't you try using strpos() to get the position of [code] and [/code].
When you have the location (assuming you only have one set of the code tag) just get the contents of everything before and everything after and the strip_tags on that text.
Hope this helps.
Use a callback:
$code = 'code: <p>[code]<hi>sss</hi>[/code]</p> more code: <p>[code]<b>sadf</b>[/code]</p>';
function codeFormat($matches)
{
return htmlspecialchars($matches[0]);
}
echo preg_replace_callback('#\[code\](?:(?!\[/code\]).)*\[/code\]#', 'codeFormat', $code);
<?php
$str = '<b><code><b><a></a></b></code></b><code>asdsadas</code>';
$str = str_replace('[code]', '<code>', $str);
$str = str_replace('[/code]', '</code>', $str);
preg_match('/<code>(.*?)<\/code>/', $str, $matches);
$str = strip_tags($str, "<code>");
foreach($matches as $match)
{
$str = preg_replace('/<code><\/code>/', $str, '<code>'.htmlspecialchars($match).'</code>', 1);
}
echo $str;
?>
This searches for the code tags and captures what is within the tags. Strips the tags. Loops through the matches replacing the code tags with the text captured and replacing the < and >.
EDIT: the two str_replace lines added to allow [code] too.
$str = '[code]
<script type="text/javascript" charset="utf-8">
var foo = "bar";
</script>
[/code]
strip me';
echo formatForDisplay( $str );
function formatForDisplay( $output ){
$output = preg_replace_callback( '#\[code]((?:[^[]|\[(?!/?code])|(?R))+)\[/code]#', 'replaceWithValues', $output );
return strip_tags($output);
}
function replaceWithValues( $matches ){
return htmlentities( $matches[ 1 ] );
}
try this should work, i tested it and it seemed to have the desired effect.
Well, I tried a lot with all your given code, right now I am working with this one, but it is still not giving the expected results -
What I want is, a regular textarea, where one can put regular text, hit enter, having a new line, not allowing tags here - maybe <strong> or <b>....
Perfect would be to recognice links and have them surrounded with <a> tags
This text should automatically have <p> and <br /> where needed.
To fill in code in various languages one should type
[code lang=xxx] code [/code] - in the best case [code lang="xxx"] or <code lang=xxx> would work too.
Than typing the code or copy and paste it inside.
The code I am using at the moment, that at least does the changing of tags and output it allright except of tabs and linebreaks is:
public function formatForDisplay( $output ){
$output = preg_replace_callback( '#\[code lang=(php|js|css|html)]((?:[^[]|\[(?!/?code])|(?R))+)\[/code]#', array($this,'replaceWithValues'), $output );
return strip_tags($output,'<code>');
}
public function replaceWithValues( $matches ){
return '<code class="'.$matches[ 1 ].'">'.htmlentities( $matches[ 2 ] ).'</code>';
}
Similar like it works here.
The strip_tag syntax gives you an option to determine the allowable tags:
string strip_tags ( string $str [, string $allowable_tags ] ) -> from PHP manual.
This should give you a start on the right direction I hope.
Is there any reasons why PHP's json_encode function does not escape all JSON control characters in a string?
For example let's take a string which spans two rows and has control characters (\r \n " / \) in it:
<?php
$s = <<<END
First row.
Second row w/ "double quotes" and backslash: \.
END;
$s = json_encode($s);
echo $s;
// Will output: "First row.\r\nSecond row w\/ \"double quotes\" and backslash: \\."
?>
Note that carriage return and newline chars are unescaped. Why?
I'm using jQuery as my JS library and it's $.getJSON() function will do fine when you fully, 100% trust incoming data. Otherwise I use JSON.org's library json2.js like everybody else.
But if you try to parse that encoded string it throws an error:
<script type="text/javascript">
JSON.parse(<?php echo $s ?>); // Will throw SyntaxError
</script>
And you can't get the data! If you remove or escape \r \n " and \ in that string then JSON.parse() will not throw error.
Is there any existing, good PHP function for escaping control characters. Simple str_replace with search and replace arrays will not work.
function escapeJsonString($value) {
# list from www.json.org: (\b backspace, \f formfeed)
$escapers = array("\\", "/", "\"", "\n", "\r", "\t", "\x08", "\x0c");
$replacements = array("\\\\", "\\/", "\\\"", "\\n", "\\r", "\\t", "\\f", "\\b");
$result = str_replace($escapers, $replacements, $value);
return $result;
}
I'm using the above function which escapes a backslash (must be first in the arrays) and should deal with formfeeds and backspaces (I don't think \f and \b are supported in PHP).
D'oh - you need to double-encode: JSON.parse is expecting a string of course:
<script type="text/javascript">
JSON.parse(<?php echo json_encode($s) ?>);
</script>
I still haven't figured out any solution without str_replace..
Try this code.
$json_encoded_string = json_encode(...);
$json_encoded_string = str_replace("\r", '\r', $json_encoded_string);
$json_encoded_string = str_replace("\n", '\n', $json_encoded_string);
Hope that helps...
$search = array("\n", "\r", "\u", "\t", "\f", "\b", "/", '"');
$replace = array("\\n", "\\r", "\\u", "\\t", "\\f", "\\b", "\/", "\"");
$encoded_string = str_replace($search, $replace, $json);
This is the correct way
Converting to and fro from PHP should not be an issue.
PHP's json_encode does proper encoding but reinterpreting that inside java script can cause issues. Like
1) original string - [string with nnn newline in it] (where nnn is actual newline character)
2) json_encode will convert this to
[string with "\\n" newline in it] (control character converted to "\\n" - Literal "\n"
3) However when you print this again in a literal string using php echo then "\\n" is interpreted as "\n" and that causes heartache. Because JSON.parse will understand a literal printed "\n" as newline - a control character (nnn)
so to work around this: -
A)
First encode the json object in php using json_enocde and get a string. Then run it through a filter that makes it safe to be used inside html and java script.
B)
use the JSON string coming from PHP as a "literal" and put it inside single quotes instead of double quotes.
<?php
function form_safe_json($json) {
$json = empty($json) ? '[]' : $json ;
$search = array('\\',"\n","\r","\f","\t","\b","'") ;
$replace = array('\\\\',"\\n", "\\r","\\f","\\t","\\b", "'");
$json = str_replace($search,$replace,$json);
return $json;
}
$title = "Tiger's /new \\found \/freedom " ;
$description = <<<END
Tiger was caged
in a Zoo
And now he is in jungle
with freedom
END;
$book = new \stdClass ;
$book->title = $title ;
$book->description = $description ;
$strBook = json_encode($book);
$strBook = form_safe_json($strBook);
?>
<!DOCTYPE html>
<html>
<head>
<title> title</title>
<meta charset="utf-8">
<script type="text/javascript" src="/3p/jquery/jquery-1.7.1.min.js"></script>
<script type="text/javascript">
$(document).ready(function(){
var strBookObj = '<?php echo $strBook; ?>' ;
try{
bookObj = JSON.parse(strBookObj) ;
console.log(bookObj.title);
console.log(bookObj.description);
$("#title").html(bookObj.title);
$("#description").html(bookObj.description);
} catch(ex) {
console.log("Error parsing book object json");
}
});
</script>
</head>
<body>
<h2> Json parsing test page </h2>
<div id="title"> </div>
<div id="description"> </div>
</body>
</html>
Put the string inside single quote in java script. Putting JSON string inside double quotes would cause the parser to fail at attribute markers (something like { "id" : "value" } ). No other escaping should be required if you put the string as "literal" and let JSON parser do the work.
I don't fully understand how var_export works, so I will update if I run into trouble, but this seems to be working for me:
<script>
window.things = JSON.parse(<?php var_export(json_encode($s)); ?>);
</script>
Maybe I'm blind, but in your example they ARE escaped. What about
<script type="text/javascript">
JSON.parse("<?php echo $s ?>"); // Will throw SyntaxError
</script>
(note different quotes)
Just an addition to Greg's response: the output of json_encode() is already contained in double-quotes ("), so there is no need to surround them with quotes again:
<script type="text/javascript">
JSON.parse(<?php echo $s ?>);
</script>
Control characters have no special meaning in HTML except for new line in textarea.value . JSON_encode on PHP > 5.2 will do it like you expected.
If you just want to show text you don't need to go after JSON. JSON is for arrays and objects in JavaScript (and indexed and associative array for PHP).
If you need a line feed for the texarea-tag:
$s=preg_replace('/\r */','',$s);
echo preg_replace('/ *\n */','
',$s);
This is what I use personally and it's never not worked. Had similar problems originally.
Source script (ajax) will take an array and json_encode it. Example:
$return['value'] = 'test';
$return['value2'] = 'derp';
echo json_encode($return);
My javascript will make an AJAX call and get the echoed "json_encode($return)" as its input, and in the script I'll use the following:
myVar = jQuery.parseJSON(msg.replace(/"/ig,'"'));
with "msg" being the returned value. So, for you, something like...
var msg = '<?php echo $s ?>';
myVar = jQuery.parseJSON(msg.replace(/"/ig,'"'));
...might work for you.
There are 2 solutions unless AJAX is used:
Write data into input like and read it in JS:
<input type="hidden" value="<?= htmlencode(json_encode($data)) ?>"/>
Use addslashes
var json = '<?= addslashes(json_encode($data)) ?>';
When using any form of Ajax, detailed documentation for the format of responses received from the CGI server seems to be lacking on the Web. Some Notes here and entries at stackoverflow.com point out that newlines in returned text or json data must be escaped to prevent infinite loops (hangs) in JSON conversion (possibly created by throwing an uncaught exception), whether done automatically by jQuery or manually using Javascript system or library JSON parsing calls.
In each case where programmers post this problem, inadequate solutions are presented (most often replacing \n by \\n on the sending side) and the matter is dropped. Their inadequacy is revealed when passing string values that accidentally embed control escape sequences, such as Windows pathnames. An example is "C:\Chris\Roberts.php", which contains the control characters ^c and ^r, which can cause JSON conversion of the string {"file":"C:\Chris\Roberts.php"} to loop forever. One way of generating such values is deliberately to attempt to pass PHP warning and error messages from server to client, a reasonable idea.
By definition, Ajax uses HTTP connections behind the scenes. Such connections pass data using GET and POST, both of which require encoding sent data to avoid incorrect syntax, including control characters.
This gives enough of a hint to construct what seems to be a solution (it needs more testing): to use rawurlencode on the PHP (sending) side to encode the data, and unescape on the Javascript (receiving) side to decode the data. In some cases, you will apply these to entire text strings, in other cases you will apply them only to values inside JSON.
If this idea turns out to be correct, simple examples can be constructed to help programmers at all levels solve this problem once and for all.