How dangerous is it to output certain content without escaping it first


Following on from a question I asked about escaping content when building a custom CMS, I wanted to find out how dangerous it can be not to escape content from the db. Assume the data has been filtered/validated prior to insertion in the db.
I know it's a best practice to escape output but I'm just not sure how easy or even possible it is for someone to 'inject' a value into page content that is to be displayed.
For example let's assume this content with HTML markup is displayed using a simple echo statement:
<p>hello</p>
Admittedly it won't win any awards as far as content writing goes ;)
My question is: can someone alter that for evil purposes, assuming it was filtered/validated prior to db insertion?

Always escape for the appropriate context; it doesn't matter if it's JSON or XML/HTML or CSV or SQL (although you should be using placeholders for SQL and a library for JSON), etc.
Why? Because it's consistent. And being consistent is also a form of being lazy: you don't need to ponder if the data is "safe for HTML" because it shouldn't matter. And being lazy (in a good way) is a valuable programming trait. (In this case it's also being lazy about avoiding having to fix "bugs" due to changes in the future.)
Don't omit escaping "because it will never contain data that needs to be escaped", because one day, through some combination of circumstances, that assumption will be wrong.
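As a minimal sketch of "always escape for the context", here is what HTML-context escaping could look like in PHP. The helper name e() and the sample value are assumptions for illustration, not something from the question.

```php
<?php
// Hypothetical helper: escape a value for an HTML text or attribute context.
function e(string $value): string
{
    // ENT_QUOTES also escapes single quotes; ENT_SUBSTITUTE replaces invalid
    // byte sequences instead of returning an empty string.
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Assumed sample value, as if it had just been read from the database.
$fromDb = "<p>hello</p><script>alert('x');</script>";

// The browser now shows the markup as text instead of interpreting it.
echo e($fromDb), "\n";
```

Note that this displays the tags literally. If the stored content is meant to be rendered as markup, escaping alone is not the right tool and a whitelist-based sanitizer is needed instead.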

If you do not escape your HTML output, one could simply insert scripts into the HTML code of your page - running in the browser of every client that visits your page. It is called Cross-site scripting (XSS).
For example:
<p>hello</p><script>alert('I could run any other Javascript code here!');</script>
In the place of the alert(), you can use basically anything: access cookies, manipulate the DOM, communicate with other servers, et cetera.
Well, this is a very easy way of inserting scripts, and strip_tags can protect against this one. But there are hundreds of more sophisticated tricks, that strip_tags simply won't protect against.
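A small sketch of that limitation, assuming <p> is the only tag you decide to allow: strip_tags removes the script tags, but it leaves the attributes of allowed tags untouched, so an event handler survives. The variable names and payloads are made up for illustration.

```php
<?php
// The simple case: the <script> tags are removed, although the text that
// was between them stays in the output.
$payload  = "<p>hello</p><script>alert('x');</script>";
$stripped = strip_tags($payload, '<p>');

// A slightly less naive case: the allowed tag carries an event handler.
// strip_tags does not inspect attributes, so this passes through unchanged.
$sneaky         = '<p onmouseover="alert(document.cookie)">hello</p>';
$stillDangerous = strip_tags($sneaky, '<p>');

echo $stripped, "\n", $stillDangerous, "\n";
```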
If you really want to store and output HTML, HTMLPurifier could be your solution:
Hackers have a huge arsenal of XSS vectors hidden within the depths of
the HTML specification. HTML Purifier is effective because it
decomposes the whole document into tokens and removing non-whitelisted
elements, checking the well-formedness and nesting of tags, and
validating all attributes according to their RFCs. HTML Purifier's
comprehensive algorithms are complemented by a breadth of knowledge,
ensuring that richly formatted documents pass through unstripped.

It could also be linked with other vulnerabilities, e.g. SQL injection. Someone would then be able to bypass the filtering/validation done prior to insertion in the db and display whatever they want.

If you are pulling the word hello from the database and displaying it, nothing will happen. If the content contains <script> tags, though, it is dangerous because a user's cookies can be stolen and then used to hijack their session.

Related

Confused on htmlspecialchars, real_escape_string, etc

I've written a decent admin interface that includes inventory management, content management, and blogging. Now it's time to lock it down and make it secure (yes, I should have been doing it from the beginning...).
For blog creation/editing, I'm using CKEditor, which posts HTML output to editblog.php. I'm also using simple text inputs for Title, Author, etc...
I'm concerned because the blog will have img src="uploads/etc.jpg", as well as divs, spans, etc...
SO! When I sanitize this data, how do I make sure that all those quotes and slashes can be safely shoved into my SQL database, and what do I do to spit it back out on the frontend? I'm also concerned because if the blogger "quotes" something, I don't want that to be messed with either.
Simple input like title, author, etc I'm using $title = mysqli_real_escape_string($title)
But is that enough? How do I preserve the user's intended input while avoiding attack?
I've done my research and yet I still don't get it. I hope someone can break it down nice and simple for me...

Nice and simple...
You always sanitize for the context to which you want to write.
These techniques will preserve the user's input, but prevent that input from being interpreted as code within a specific context.
When you want to query the database, you are worried about SQL injection attacks:
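Use mysqli_real_escape_string to sanitize values for the database query (the older mysql_real_escape_string belongs to the mysql extension, which was removed in PHP 7).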
When you want to display something (as HTML) that will be parsed by the browser, you are worried about cross site scripting:
Use htmlspecialchars to sanitize for HTML output.
This will provide a basic level of security.
For more security on the database side, you should look at prepared statements and PHP PDO.
For more information about some pitfalls of htmlspecialchars, take a look at @Cheekysoft's excellent explanation: htmlspecialchars and mysql_real_escape_string
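To make the two contexts concrete, here is a rough sketch using PDO prepared statements for the database side and htmlspecialchars for the output side. It assumes an in-memory SQLite database (so that it runs without a MySQL server) and a made-up posts table; with MySQL only the DSN and credentials would differ.

```php
<?php
// Assumption: in-memory SQLite stands in for the real MySQL database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)');

// Input containing a quote and markup, stored exactly as the user typed it.
$title = "O'Reilly <script>alert(1)</script>";

// SQL context: the placeholder keeps the value out of the SQL text entirely.
$stmt = $pdo->prepare('INSERT INTO posts (title) VALUES (:title)');
$stmt->execute([':title' => $title]);

// HTML context: escape only at the moment of output.
$stored = $pdo->query('SELECT title FROM posts WHERE id = 1')->fetchColumn();
$html   = '<h1>' . htmlspecialchars($stored, ENT_QUOTES, 'UTF-8') . '</h1>';

echo $html, "\n";
```

The stored value is byte-for-byte what the user entered; nothing is mangled on the way in, and the escaping happens per context on the way out.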

Examples of XSS that I can use to test my page input?

I have had issues with XSS. Specifically, an individual injected a JS alert showing that my input had vulnerabilities. I have done research on XSS and found examples, but for some reason I can't get them to work.
Can I get example(s) of XSS that I can throw into my input so that, when I output it back to the user, I see some sort of change, like an alert, and know it's vulnerable?
I'm using PHP and I am going to implement htmlspecialchars(), but first I am trying to reproduce these vulnerabilities.
Thanks!

You can use this firefox addon:
XSS Me
XSS-Me is the Exploit-Me tool used to test for reflected Cross-Site Scripting (XSS). It does NOT currently test for stored XSS.
The tool works by submitting your HTML forms and substituting the form value with strings that are representative of an XSS attack. If the resulting HTML page sets a specific JavaScript value (document.vulnerable=true) then the tool marks the page as vulnerable to the given XSS string. The tool does not attempting to compromise the security of the given system. It looks for possible entry points for an attack against the system. There is no port scanning, packet sniffing, password hacking or firewall attacks done by the tool.
You can think of the work done by the tool as the same as the QA testers for the site manually entering all of these strings into the form fields.

For example:
<script>alert("XSS")</script>
"><b>Bold</b>
'><u>Underlined</u>

It is very good to use some of the automated tools, however you won't gain any insight or experience from those.
The point of an XSS attack is to execute JavaScript that was not supplied by the site in a browser window. So first you must look at the context in which the user-supplied data is printed on the website: it might be within a <script></script> code block, within a <style></style> block, used as an attribute of an element such as <input type="text" value="USER DATA" />, or, for instance, inside a <textarea>. Depending on that, you will see what syntax you need to escape the context (or to use it). For instance, if you are within <script> tags, it might be sufficient to close the parenthesis of a function and end the line with a semicolon, so the final injection will look like ); alert(555);. If the supplied data is used as an HTML attribute, the injection might look like " onclick="alert(1)" which will cause JS execution if you click on the element (this area is rich to play with, especially with HTML5).
The point is, the context of the XSS is just as important as any filtering/sanitizing functions that might be in place, and often there are small nuances which an automated tool will not catch. As you can see above, even without quotes and HTML tags, in a limited number of circumstances you might be able to bypass the filters and execute JS.
Browser encoding also needs to be considered; for instance, you might be able to bypass filters if the target browser uses UTF-7 encoding (and you encode your injection that way). Filter evasion is a whole other story; however, the current PHP functions are pretty bulletproof if used correctly.
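The attribute and script contexts described above can be sketched in PHP roughly as follows. The variable names and payloads are illustrative assumptions; the point is only that each context needs its own encoding.

```php
<?php
// Attribute context: the payload closes the value and adds an event handler.
$userData = '" onclick="alert(1)';
$unsafe = '<input type="text" value="' . $userData . '" />';
$safe   = '<input type="text" value="'
        . htmlspecialchars($userData, ENT_QUOTES, 'UTF-8') . '" />';

// Script context: HTML escaping is the wrong tool here. Encoding the value
// as a JavaScript literal with the JSON_HEX_* flags keeps <, >, &, ' and "
// out of the emitted source, so the payload cannot close the script block.
$name    = '</script><script>alert(555)//';
$encoded = json_encode($name, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);
$script  = '<script>var name = ' . $encoded . ';</script>';

echo $unsafe, "\n", $safe, "\n", $script, "\n";
```

In the unsafe variant the browser sees an empty value attribute followed by a real onclick attribute; in the safe variant the whole payload stays inside the value.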
Also here is a long enough list of XSS vectors
As a last thing, here is an actual example of a XSS string that was found on a site, and I guarantee you that not a single scanner would've found that (there were various filters and word blacklists, the page allowed to insert basic html formatting to customize your profile page):
<a href="Boom"><font color=a"onmouseover=alert(document.cookie);"> XSS-Try ME</span></font>

Ad-hoc testing is OK, however I also recommend trying a web application vulnerability scanning tool to ensure you haven't missed anything.
Acunetix is pretty good and offers a free trial of its application:
http://www.acunetix.com/websitesecurity/xss.htm
(Note I have no affiliation with this company, however I have used the product to test my own applications).

TinyMCE, PHP and MySQL: security and escaping questions

I'm implementing TinyMCE for a client so they can edit front-end content via a simple, familiar interface in their site's admin panel.
I have never used TinyMCE before but notice that you are able to insert whatever markup you want and it will be happily saved off to the MySQL database, assuming you don't escape the contents of the TinyMCE before running it through your query.
You can even insert single quotes and have it break your SQL query entirely.
But of course, when I do escape the contents, benign presentational stuff like paragraph tags get converted to HTML entities and so the whole point of the WYSIWYG editor is defeated, because the entities are spat back out when it comes to displaying the stored content on the front-end.
So is there a way I can "selectively escape" content from TinyMCE, to keep the innocent tags like P and BR but get rid of dangerous ones like SCRIPT, IFRAME, etc.? I really don't want to have to manually encode and decode them using str_replace() or whatever, but I'd rather not give my client a gaping security hole either.
Thanks.

Have you tried HTML Purifier? It works wonders. Its caveats: it is big and slow, but it is the best you can have.
http://htmlpurifier.org .

Sorry Dude, I'd say this is a question for the authors of TinyMCE, so I suggest you ask at: http://tinymce.moxiecode.com/enterprise/support.php ... I'm sure they'll be only too happy to answer (for a small fee), and I suspect this may even be one of their FAQs.
It's just that I'd guess you'd be very lucky to hit another TinyMCE user (let alone an authoritative one) on Stack Overflow, a "general programming forum"... although I notice there are currently 837 questions tagged "tinymce" on this forum; have you tried searching through them? Maybe there's a pointer in one of those?
Cheers. Keith.
EDIT: Yep, Making user-made HTML templates safe is more or less the same question posed in different words, and it has (what looks to ignorant me) a couple of answers which posit practical solutions. I just searched stack overflow for "Tiny MCE html security".

That's like complaining that you can write naughty words in Microsoft Word, and that Word should filter them for you. Or complain to GM that they build cars that then get used as escape vehicles in bank robberies. TinyMCE's job is to be an online editor, not to be the content police.
If you need to ban certain tags, then remove them when the document's submitted by using strip_tags(). Or better yet, HTMLpurifier for a more bullet-proof sanitization. If embedded quotes are breaking your SQL, then why weren't you passing the submitted document through mysql_real_escape_string() or using PDO prepared queries first? MCE has no idea what the server-side handling is going to be, nor should it care at all. It's up to you to decide how to handle the data, because only you know what its ultimate purpose is going to be.
In any case, remember that all those editors work on the client side. You can make TinyMCE as bulletproof and as strict an editor as you want, but it's still running on the client. Nothing says a malicious user can't bypass it entirely and submit all the embedded quotes and bad tags they want. The ultimate responsibility for cleaning the data HAS to fall on your code running on the server, as it's the last line of defense, and the only one that can ensure the database remains pristine. Anything else is lipstick on a pig.

WYSIWYG editor security question (preventing malicious input)

I'm using jWYSIWYG in a form I'm creating that posts to a database and was wondering how you can prevent a malicious user from trying to inject code in the frame?
Doesn't the editor need brackets (which I'd normally strip during the post process) in order to display styles?

If the editor allows arbitrary HTML, you're fighting a losing battle since users could simply use the editor to craft their malicious content.
If the editor only allows for a subset of markup, then it should use an alternative syntax (similar to how stackoverflow does it), or you should escape all HTML except for specific, whitelisted tags.
Note that it's pretty easy to not do this correctly so I would use a third-party solution that has been appropriately tested for security.

Ultimately, the output is in your own hands: when you insert it into the database, you need to make sure that you strip away anything malicious. The simplest way is probably to use htmlentities on such data; however, there are ways bad guys can bypass that. Here is a nice script, also used by the popular Kohana PHP framework in its input class, against possible XSS attacks:
http://svn.bitflux.ch/repos/public/popoon/trunk/classes/externalinput.php

I have encountered similar situations, and I have started using HTMLPurifier on my PHP backend which will prevent every attack vector I can think of. It is easy to install, and will allow you to whitelist the elements and attributes. It also prevents the XSS attacks that could still exist whilst using htmlentities.

How can I allow my user to insert HTML code, without risks? (not only technical risks)

I developed a web application that lets my users manage some aspects of a web site dynamically (yes, some kind of CMS) in a LAMP environment (Debian, Apache, PHP, MySQL).
For example, they create a news item in their private area on my server, and it is then published on their website via a cURL request (or by Ajax).
The news item is created with a WYSIWYG editor (FCK at the moment, probably TinyMCE in the near future).
So I can't disallow HTML tags, but how can I be safe?
What kind of tags MUST I delete (JavaScript?)?
That is about being server-safe... but how can I be 'legally' safe?
If a user uses my application to perform XSS, can I get into legal trouble?

If you are using php, an excellent solution is to use HTMLPurifier. It has many options to filter out bad stuff, and as a side effect, guarantees well formed html output. I use it to view spam which can be a hostile environment.

It doesn't really matter what you're looking to remove, someone will always find a way to get around it. As a reference take a look at this XSS Cheat Sheet.
As an example, how are you ever going to remove this valid XSS attack:
<IMG SRC=&#x6A&#x61&#x76&#x61&#x73&#x63&#x72&#x69&#x70&#x74&#x3A&#x61&#x6C&#x65&#x72&#x74&#x28&#x27&#x58&#x53&#x53&#x27&#x29>
Your best option is to allow only a subset of acceptable tags and remove everything else. This practice is known as whitelisting and is the best method for preventing XSS (besides disallowing HTML altogether).
Also use the cheat sheet in your testing; fire as much as you can at your website and try to find some ways to perform XSS.

The general best strategy here is to whitelist specific tags and attributes that you deem safe, and escape/remove everything else. For example, a sensible whitelist might be <p>, <ul>, <ol>, <li>, <strong>, <em>, <pre>, <code>, <blockquote>, <cite>. Alternatively, consider human-friendly markup like Textile or Markdown that can be easily converted into safe HTML.

Rather than allow HTML, you should have some other markup that can be converted to HTML. Trying to strip out rogue HTML from user input is nearly impossible, for example
<scr<script>ipt etc="...">
Removing <script> from this will leave
<script etc="...">
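A quick way to see this in PHP, assuming the naive filter is a single str_replace call (the variable names are illustrative):

```php
<?php
// A naive blacklist filter: remove every literal "<script>" in one pass.
$input    = '<scr<script>ipt etc="...">';
$filtered = str_replace('<script>', '', $input);

// str_replace does not rescan its own output, so the two halves around the
// removed piece join up into a brand-new opening script tag.
echo $filtered, "\n";
```

Looping until the string stops changing would close this particular hole, but it is still a blacklist, and other vectors remain.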

Kohana's security helper is pretty good. From what I remember, it was taken from a different project.
However I tested out
<IMG SRC=&#x6A&#x61&#x76&#x61&#x73&#x63&#x72&#x69&#x70&#x74&#x3A&#x61&#x6C&#x65&#x72&#x74&#x28&#x27&#x58&#x53&#x53&#x27&#x29>
From LFSR Consulting's answer, and it escaped it correctly.

For a C# example of white list approach, which stackoverflow uses, you can look at this page.

If removing the tags is too difficult, you could reject the whole HTML input until the user submits valid content.
I would reject html if it contains the following tags:
frameset,frame,iframe,script,object,embed,applet.
Also tags which you want to disallow are: head (and sub-tags),body,html because you want to provide them by yourself and you do not want the user to manipulate your metadata.
But generally speaking, allowing users to provide their own HTML code always poses some security issues.

You might want to consider, rather than allowing HTML at all, implementing some standin for HTML like BBCode or Markdown.

I use PHP's strip_tags function because I want users to be able to post safely. I allow just a few tags that can be used in a post, so that nobody can hack the website through script injection. I think strip_tags is the best option.
Click here for code for this PHP function

It is a very good function in PHP; you can use it like this:
$string = strip_tags($_POST['comment'], "<b>");
