Using HTMLSPECIALCHARS on JSON results - needed? - php

I have an API which sends data to a javascript which then throws the response into some input fields.
I wonder if I need to use htmlspecialchars on the json_encode? Like so:
json_encode(
array(
'some_text' => htmlspecialchars('Some special & characters'),
'maybe_html' => htmlspecialchars('some <b>html</b>'),
'etc' => htmlspecialchars('yo')
)
);

Certainly not. HTML entities make no difference or sense within JSON, and if the result is processed by JavaScript and inserted into the document through the appropriate DOM API methods, then escaping is not needed there either. Escaping should be done when data comes into contact with a specific output medium. Here the data must be correctly encoded as JSON (which json_encode does); HTML is nowhere to be found. If anything, HTML escaping should be done in JavaScript because it's closer to the HTML, but again, it's unnecessary since JavaScript interacts with the DOM API and not with HTML source.
See The Great Escapism (Or: What You Need To Know To Work With Text Within Text)
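To make the point concrete, here is a minimal sketch that round-trips the question's sample text with and without htmlspecialchars. The variable names are illustrative only, and json_decode merely stands in for whatever JSON parser the client-side script uses.

```php
<?php
// Sample text from the question.
$raw = 'Some special & characters';

// Encode once as-is, and once with HTML escaping applied first.
$plainJson   = json_encode(['some_text' => $raw]);
$escapedJson = json_encode(['some_text' => htmlspecialchars($raw)]);

// json_decode stands in for the client-side JSON parser here.
$plainValue   = json_decode($plainJson, true)['some_text'];
$escapedValue = json_decode($escapedJson, true)['some_text'];

echo $plainValue, "\n";   // Some special & characters
echo $escapedValue, "\n"; // Some special &amp; characters
```

Assigning the second value to an input's value property would show the literal text &amp; to the user, since setting a DOM property does not decode HTML entities.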

Depends on what you're doing with the string data.
What is important is the correct header for the content type.
header('Content-type: application/json');

Related

Remove double-quotes from a json_encoded string on the keys

I have a json_encoded array which is fine.
I need to strip the double-quotes on all of the keys of the json string on returning it from a function call.
How would I go about doing this and returning it successfully?
Thanks!
I do apologise, here is a snippet of the json code:
{"start_date":"2011-01-01 09:00","end_date":"2011-01-01 10:00","text":"test"}
Just to add a little more info:
I will be retrieving the JSON via an AJAX request, so if it would be easier, I am open to ideas in how to do this on the javascript side.
EDITED as per anubhava's comment
$str = '{"start_date":"2011-01-01 09:00","end_date":"2011-01-01 10:00","text":"test"}';
$str = preg_replace('/"([^"]+)"\s*:\s*/', '$1:', $str);
echo $str;
This certainly works for the above string, although there may be some edge cases that I haven't thought of for which this will not work. Whether this will suit your purposes depends on how static the format of the string and the elements/values it contains will be.
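One consequence worth checking before relying on this: once the quotes are stripped, the string is no longer valid JSON. The sketch below reuses the snippet above and feeds both versions to json_decode; the variable names $unquoted, $before and $after are only for illustration.

```php
<?php
$str = '{"start_date":"2011-01-01 09:00","end_date":"2011-01-01 10:00","text":"test"}';

// Same replacement as in the answer above.
$unquoted = preg_replace('/"([^"]+)"\s*:\s*/', '$1:', $str);

// The original string decodes; the rewritten one does not.
$before = json_decode($str, true);
$after  = json_decode($unquoted, true);

echo $unquoted, "\n";
var_dump(is_array($before)); // bool(true)
var_dump($after);            // NULL
```

The unquoted form can still be read as a JavaScript object literal, but a strict JSON parser such as JSON.parse will reject it as well.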
TL;DR: Missing quotes is how Chrome shows it is a JSON object instead of a string. Ensure that you have header('Content-Type: application/json; charset=utf-8'); in PHP's AJAX response to solve the real problem.
DETAILS:
A common reason for wanting to solve this problem is due to finding this difference while debugging the processing of returned AJAX data.
In my case I saw the difference using Chrome's debugging tools. When connected to the legacy system, upon success, the debugger showed no quotes around the keys in the response. This allowed the object to be immediately treated as an object without using a JSON.parse() call. When debugging my new AJAX destination, there were quotes shown in the response, and the variable was a string and not an object.
I finally realized the true issue when I tested the AJAX response externally and saw that the legacy system actually DID have quotes around the keys. This was not what the Chrome dev tools showed.
The only difference was that on the legacy system there was a header specifying the content type. I added this to the new (WordPress) system and the calls were now fully compatible with the original script and the success function could handle the response as an object without any parsing required. Now I can switch between the legacy and new system without any changes except the destination URL.
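A minimal sketch of what that fix might look like on the PHP side follows. The helper name json_response is hypothetical, and it assumes PHP 7.3+ for JSON_THROW_ON_ERROR; the point is only that the body keeps its quoted keys while the header tells the client how to interpret it.

```php
<?php
/**
 * Hypothetical helper: sets the JSON content type (when headers can still
 * be sent) and returns the encoded body for the caller to echo.
 */
function json_response($data): string
{
    if (!headers_sent()) {
        header('Content-Type: application/json; charset=utf-8');
    }
    // JSON_THROW_ON_ERROR raises an exception instead of returning false.
    return json_encode($data, JSON_THROW_ON_ERROR);
}

$body = json_response(['start_date' => '2011-01-01 09:00', 'text' => 'test']);
echo $body; // {"start_date":"2011-01-01 09:00","text":"test"}
```

Whether the success callback then receives an object or a string depends on how the client library chooses its parser, so treat this as one plausible setup rather than a guarantee.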

Clean php output into javascript

Due to the nature of my project, I am pulling data from my db and outputting it to JavaScript. Things were working just fine until I got to the main content. It has strings containing characters like (;, :, - ''). How do I ensure that these are displayed without crashing my script? As of now, nothing seems to work.
If all you have is a single string value then see answer by Tomalak Geret'kal.
If there is any chance of getting something more than a single value from your database, like an array, object, null, or anything more complex, then I would suggest using json_encode. By using something like this:
<script>
var your_JavaScript_variable = <?php echo json_encode($your_PHP_variable); ?>;
</script>
you can pass complex data structures, arrays, or even single strings from PHP to JavaScript with all of your backslash escaping done automatically.
Additionally when you use JSON for moving your data from PHP to JavaScript it will be easy to make your application get the data from your server asynchronously without page refreshes using AJAX in the future.
You can use the PHP addslashes function for inserting into Javascript, and htmlspecialchars for inserting into HTML.
You should be encoding that data into json. PHP has a handy function to do this, json_encode.
Be sure to use the JSON_HEX_QUOT option or the quotes in your data will break your js.
Read this: http://php.net/manual/en/function.json-encode.php
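As a rough sketch of the json_encode approach for an inline script block, the example below encodes a made-up database value (the string in $content is purely illustrative) with the JSON_HEX_* flags, so that none of the characters that are special in HTML or in a script element appear literally in the output.

```php
<?php
// Hypothetical database value containing the kinds of characters in question.
$content = "It's; a: \"test\" - </script><b>bold</b> & more";

// The JSON_HEX_* flags replace <, >, &, ' and " with \uXXXX escapes.
$flags = JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT;
$js = json_encode($content, $flags);

// The encoded value is a valid JavaScript string literal.
echo "<script>\nvar content = ", $js, ";\n</script>\n";
```

The JavaScript engine turns the \uXXXX escapes back into the original characters, so the script sees exactly the string that came from the database.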

Sending HTML Code Through JSON

I've got a php script which generates HTML content. Is there a way to send back that HTML content through JSON to my webpage from the php script?
Yes, you can use json_encode to take your HTML string and escape it as necessary to be valid JSON (it'll also do things that are unnecessary, sadly, unless you use flags to prevent it). For instance, if your original string is:
<p class="special">content</p>
...json_encode will produce this:
"<p class=\"special\">content<\/p>"
You'll notice it has an unnecessary backslash before the / near the end. You can use the JSON_UNESCAPED_SLASHES flag to prevent the unnecessary backslashes. json_encode($theString, JSON_UNESCAPED_SLASHES); produces:
"<p class=\"special\">content</p>"
Do Like this
1st put all your HTML content to array, then do json_encode
$html_content="<p>hello this is sample text";
$json_array=array(
'content'=>50,
'html_content'=>$html_content
);
echo json_encode($json_array);
All string data must be UTF-8 encoded.
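// utf8_encode() assumes ISO-8859-1 input and is deprecated as of PHP 8.2;
// skip it when the strings are already UTF-8.
$out = array(
'render' => utf8_encode($renderOutput),
'text' => utf8_encode($textOutput)
);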
$out = json_encode($out);
die($out);
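To illustrate why the encoding matters, here is a small sketch of what happens when a non-UTF-8 string reaches json_encode, and one way to convert it first. It assumes the source data is ISO-8859-1 and uses iconv in place of utf8_encode; the sample string is made up.

```php
<?php
// "caf\xE9" is "café" in ISO-8859-1; the lone 0xE9 byte is not valid UTF-8.
$latin1 = "caf\xE9";

// json_encode refuses invalid UTF-8 and returns false.
$failed    = json_encode(['text' => $latin1]);
$errorCode = json_last_error();
$reason    = json_last_error_msg();

// Convert from the (assumed) source encoding before encoding as JSON.
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);
$ok   = json_encode(['text' => $utf8], JSON_UNESCAPED_UNICODE);

var_dump($failed);
echo $reason, "\n";
echo $ok, "\n";
```

If the strings are already UTF-8, no conversion is needed, and converting them again would garble any non-ASCII characters.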
In PHP:
$data = "<html>....";
exit(json_encode($data));
Then you should use AJAX to retrieve the data and do what you want with it. I suggest using jQuery: http://api.jquery.com/jQuery.getJSON/
You can send it as a string, why not. But you are probably misusing JSON here a bit, since as far as I understand the point is to send just the data needed and wrap it in HTML on the client.
Just to expand on @T.J. Crowder's answer.
json_encode does well with simple HTML strings. In my experience, however, json_encode often becomes confused by (or it becomes quite difficult to properly escape) longer, complex, nested HTML mixed with PHP. Two options to consider if you are in this position are: encoding/decoding the markup first with something like base64_encode/base64_decode (quite a bit of a performance hit), or (and perhaps preferably) being more selective in what you pass via JSON and generating the necessary markup on the client side instead.
All these answers didn't work for me.
But this one did:
json_encode($array, JSON_HEX_QUOT | JSON_HEX_TAG);
Thanks to this answer.
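For comparison, the flag combinations mentioned in this thread can be tried side by side. The sketch below reuses the sample markup from the first answer; the variable names are illustrative. All three encodings are valid JSON and decode back to the same string, so the choice mostly affects how the payload looks on the wire.

```php
<?php
$theString = '<p class="special">content</p>';

// Default: forward slashes are escaped.
$default = json_encode($theString);
// Slashes left alone.
$unescaped = json_encode($theString, JSON_UNESCAPED_SLASHES);
// Quotes and angle brackets replaced with \uXXXX escapes.
$hex = json_encode($theString, JSON_HEX_QUOT | JSON_HEX_TAG);

echo $default, "\n";   // "<p class=\"special\">content<\/p>"
echo $unescaped, "\n"; // "<p class=\"special\">content</p>"
echo $hex, "\n";

// Every variant decodes to the original markup.
$allEqual = json_decode($default) === $theString
    && json_decode($unescaped) === $theString
    && json_decode($hex) === $theString;
var_dump($allEqual); // bool(true)
```

The default <\/ form is not purely wasteful: it keeps a literal </script> from appearing if the JSON is ever written inline into a script block.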

Do you have to format the printout from print_r() in PHP to have it validate with W3C?

This will not validate because of the output from print_r. Is it not supposed to be used "on a site", or does one have to format it in a certain way?
<?php
$stuff1 = $_POST["stuff1"];//catch variables
$stuff2 = $_POST["stuff2"];
$stuff3 = $_POST["stuff3"];
$myStuff[0] = $stuff1;//put into array
$myStuff[1] = $stuff2;
$myStuff[2] = $stuff3;
print_r($myStuff);
?>
print_r() is mainly designed as a helpful tool for developers, not for actual production use in a manner that end-users would see. Thus, you shouldn't really be trying to validate it - if you're at the stage where you're trying to get stuff to validate, you shouldn't be using print_r anyway.
The validator can't distinguish the output of print_r() from the surrounding html structure; it simply parses the whole character stream. If the output of your print_r() contains characters that have a special meaning in html (apparently < and > in your case) the validator must assume that it belongs to the html structure, not the text data. You have to mark them as "no, this is just text data, not a control character" for html parsers. One way to do this is to send entities instead of the "real" character itself, e.g. &lt; instead of <
The function htmlspecialchars() takes care of those characters that always have a special meaning in (x)html.
You might also want to enclose the output in a <pre>....</pre> element to keep the formatting of print_r().
echo '<pre>', htmlspecialchars(print_r($myStuff, true)), "</pre>\n";
A plain print_r outputs text, so there's no reason for it not to affect validation. To print it out nicely formatted on an HTML page, use a <pre>:
$printout = print_r($my_var, true); // true makes print_r return the text instead of printing it
echo "<pre>$printout</pre>";
If you don't want to display it, but only to see it as a developer, place it in an HTML comment (<!-- any text -->).
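Putting the two suggestions together (capture the print_r output, escape it, wrap it in a pre element), a small helper might look like the sketch below. The function name debug_dump and the sample array are hypothetical.

```php
<?php
/**
 * Hypothetical helper: returns print_r() output as HTML-safe markup.
 */
function debug_dump($value): string
{
    // Passing true makes print_r return its output instead of printing it.
    $text = print_r($value, true);
    return '<pre>' . htmlspecialchars($text, ENT_QUOTES, 'UTF-8') . "</pre>\n";
}

// Sample values standing in for the posted form fields.
$myStuff = ['<b>bold</b>', 'a & b', 'plain'];
$html = debug_dump($myStuff);
echo $html;
```

Because the angle brackets and ampersands come out as entities, the dump no longer interferes with the surrounding markup when the page is validated.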

How to allow certain HTML tags in a form field in Symfony 1.2

I'm playing around with Symfony and have encountered a road block.
I created a model "CmsPage" which has a field called "content" which is stored as a clob (this is specific to doctrine I believe). When I created the app I set "--escaping-strategy=on" so if I enter any html when editing a CmsPage that gets encoded with html entities or something along those lines. I would like to allow html in this field and a quick googling hasn't helped much. Maybe I'm searching for the wrong terms.
Anywho I would like to disable character escaping for this field and possibly only allow a small selection of html tags. What is the correct way to do this in Symfony?
You can use http://htmlpurifier.org/. It is a great tool for your needs.
Here is a small configuration for HTML Purifier. These rules match the TinyMCE editor well.
$purifier = new HTMLPurifier();
$purfier_config = HTMLPurifier_Config::createDefault();
$purfier_config->set('HTML.DefinitionID', 'User Content Filter');
$purfier_config->set('HTML.DefinitionRev', 1);
// these are allowed html tags, means white list
$purfier_config->set('HTML.Allowed', 'a,strong,em,p,span,img,li,ul,ol,sup,sub,small,big,code,blockquote,h1,h2,h3,h4,h5');
// these are allowed html attributes, coool!
$purfier_config->set('HTML.AllowedAttributes', 'a.href,a.title,span.style,span.class,span.id,p.style,img.src,img.style,img.alt,img.title,img.width,img.height');
// auto link given url string
$purfier_config->set('AutoFormat.Linkify', true);
// auto format \r\n lines
$purfier_config->set('AutoFormat.AutoParagraph', true);
// clean empty tags
$purfier_config->set('AutoFormat.RemoveEmpty', true);
// cache dir, just for symfony of course, you can change to another path
$purfier_config->set('Cache.SerializerPath', sfConfig::get('sf_cache_dir'));
// translation type,
$purfier_config->set('HTML.Doctype', 'XHTML 1.0 Transitional');
// allow youtube videos
$purfier_config->set('Filter.YouTube', true);
$purfier_config->set('HTML.TidyLevel', 'heavy');
// now clean your data
$clean_nice_html_data = $purifier->purify($user_input_data, $purfier_config);
Now you can insert the data into the database with HTML tags, and you don't need to escape it, because HTML Purifier strips the nasty, dangerous parts for you and only accepts your allowed tags and attributes.
I hope it helps.
From http://www.librosweb.es/symfony_1_1_en/capitulo7/output_escaping.html
Every template has access to an $sf_data variable, which is a container object referencing all the escaped variables.
[skipped]
$sf_data also gives you access to the unescaped, or raw, data. This is useful when a variable stores HTML code meant to be interpreted by the browser, provided that you trust this variable. Call the getRaw() method when you need to output the raw data.
echo $sf_data->getRaw('test');
=> <script>alert(document.cookie)</script>
You will have to access raw data each time you need variables containing HTML to be really interpreted as HTML. You can now understand why the default layout uses $sf_data->getRaw('sf_content') to include the template, rather than a simpler $sf_content, which breaks when output escaping is activated.
