Prohibit the posting of HTML in textarea form field - php

I have a textarea where a user can write a post. This field should allow BBCode but not HTML.
Currently HTML is accepted, but it should not be.
How can I prevent users from posting HTML tags?

There are two main choices here. You can either escape the HTML, so it's treated as plain text, or you can remove it. Either way is safe, but escaping is usually what users expect.
To escape, use htmlspecialchars() [docs] on the input, before you process the BBCode.
echo htmlspecialchars("<b>Hello [i]world![/i]</b>");
// outputs &lt;b&gt;Hello [i]world![/i]&lt;/b&gt;, which the browser displays as <b>Hello [i]world![/i]</b>
To remove the HTML tags entirely, use strip_tags() [docs] instead:
echo strip_tags("<b>Hello [i]world![/i]</b>");
Hello [i]world![/i]
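As a rough illustration of "escape first, then process the BBCode", here is a minimal sketch. The helper name render_post() and the tiny tag subset ([b] and [i] only) are assumptions made for the example; a real BBCode parser handles nesting, URLs and much more.

```php
<?php
// Hypothetical helper: escape HTML first, then translate a tiny BBCode subset.
function render_post(string $raw): string
{
    // Step 1: neutralise any HTML the user typed.
    $safe = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

    // Step 2: convert only simple paired tags such as [b]...[/b] and [i]...[/i].
    return preg_replace('#\[(b|i)\](.*?)\[/\1\]#s', '<$1>$2</$1>', $safe);
}

echo render_post('<b>Hello [i]world![/i]</b>');
// prints: &lt;b&gt;Hello <i>world!</i>&lt;/b&gt;
```

Escaping before the BBCode step matters: done the other way round, the <i> generated from [i] would itself be escaped.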

I think you should use strip_tags(). It strips the HTML tags and preserves the text, but leaves the BBCode alone.

Related

safely load HTML from user into textarea

I'm using TinyMCE 4 on a project where I need to pre-populate the textarea with HTML that was submitted through POST (for server-side error handling, without deleting all of the user's work). I know that a textarea works mostly like a tag, in that HTML inside is not parsed into DOM, so most sites show the demo:
<textarea name="demo"><?=$_POST['demo']?></textarea>
but what happens when a user submits HTML that includes an unmatched <textarea> or </textarea> tag?
Is there a standard way to manage this risk?
Use htmlspecialchars($_POST['demo']) in PHP when outputting.
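A minimal sketch of that suggestion follows. The helper name textarea_with_value() is made up for the example, and it assumes the page is served as UTF-8. The browser decodes the entities inside the textarea, so the user still sees and edits the original markup.

```php
<?php
// Hypothetical helper that builds the <textarea> markup for a given field.
function textarea_with_value(string $name, string $value): string
{
    $n = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');
    // Escaping turns a stray </textarea> into harmless text.
    $v = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
    return '<textarea name="' . $n . '">' . $v . '</textarea>';
}

$submitted = '</textarea><script>alert(1)</script>';
echo textarea_with_value('demo', $submitted);
// prints: <textarea name="demo">&lt;/textarea&gt;&lt;script&gt;alert(1)&lt;/script&gt;</textarea>
```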
Remove only the <textarea> tags from the user input. Please see this post using regular expressions. It tells you how to remove only certain tags (unlike htmlentities, which escapes every tag).
Use the xmp tag instead of textarea. It will display HTML as literal text.
Eg: http://dadinck.x10.mx/xmp.html
The htmlentities function replaces every HTML special character (such as <) with an entity that displays correctly but won't break your HTML.
http://www.php.net/manual/en/function.htmlentities.php

How to Secure Data Submitted Through CKEditor

I am using CKEditor on my site to let users post comments. CKEditor has many buttons for composing a comment. Suppose a user makes a comment bold and italic, like this:
This is comment
CKEditor will then output the following HTML:
<i><strong>This is comment</strong></i>
If I store this HTML in the MySQL database and output it on the webpage as it is, without wrapping it in htmlspecialchars(), the comment is shown bold and italic, which is what I want.
On the other hand, if I wrap the comment in htmlspecialchars() and display it on the webpage, it is shown as
<i><strong>This is comment</strong></i>
I do not want it displayed like this; I want to keep the user's formatting. But if I do not wrap it in htmlspecialchars(), it is risky and can allow XSS attacks and other security problems.
How can I achieve both goals?
(1). Keep the user's formatting
(2). Also secure the HTML contents
You need to draw up a whitelist of what elements and attributes you want to allow your users to include (eg allow <strong> but not <script>; allow <a href> but not <div onmouseover>), and then enforce it by parsing the input, removing all elements and attributes that don't fit your pattern, and serialising the results back into HTML.
This is a hard job that cannot be done with a few simple regexes or strip_tags (which is NOT an adequate solution for XSS even if it did fit your needs). You would be well advised to use an existing library to do it - HTML Purifier is one such for PHP.
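To illustrate why strip_tags falls short here, this small sketch (the input string is invented for the example) shows that an allowed tag keeps whatever attributes it arrived with, including event handlers:

```php
<?php
// strip_tags() filters by tag name only; attributes on allowed tags survive.
$input = '<strong onmouseover="alert(1)">hover me</strong><script>x()</script>';
$result = strip_tags($input, '<strong>');

echo $result;
// prints: <strong onmouseover="alert(1)">hover me</strong>x()
```

The <script> tags are gone (their text content remains), but the onmouseover handler is still there, which is the kind of gap a whitelist-based sanitizer is meant to close.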
I think you are looking for strip_tags. It removes all HTML and PHP tags from the string, except for the tags you explicitly allow, such as <strong> and <i>.
<?php
$str = "<i><strong>this is a comment</strong></i><script>here is script</script>";
echo $str = strip_tags($str,"<i><strong>");
?>
php.net documentation for strip_tags
The strip_tags function has an option to allow specific tags; see php.net for more details. You must strip tags that are unwanted or not allowed. If you don't, the page may be vulnerable to injected JavaScript.
Use htmlspecialchars when you store the comment and htmlspecialchars_decode when you display it. This keeps the formatting of the user's content. Note that decoding restores the original markup, so on its own this does not filter out tags such as <script>.
Two options spring to mind. First of all you can strip out all HTML and use a BB code parser to allow the user to post BB tags, rather than HTML - http://php.net/manual/en/book.bbcode.php
Secondly, you could strip out all HTML except a few tags. I don't know of any parser that does that personally, however I have seen it in action on sites before (Murphy's law I can't find any right now). You should be able to achieve this with a sophisticated enough RegEx replacement check.
Use this before printing it back on screen:
function html_escape($raw_input)
{
    return htmlspecialchars($raw_input, ENT_QUOTES | ENT_HTML401, 'UTF-8');
}

Outputting a snippet of arbitrary HTML document content without open tags

For example, a snippet of 50 characters. Problem is, of course, closing any opened tags. What's a good way to do this? Or else to make things easier, what's a good way to completely skim off all HTML content from the snippet?
You can strip out all HTML tags, etc. via the strip_tags() function, which is (being realistic) probably the best way to go, as otherwise you'll most likely end up with more tags than actual content.
For example:
$first50Chars = substr(trim(strip_tags($longString)), 0, 50);
If tags are generally allowed in the text (for example, text containing <b> is meant to be rendered in bold), then strip_tags() is the easiest way to remove the tags from the snippet.
If tags are generally not allowed in the text (for example, "<b>" should be displayed literally as "<b>"), then you can use the htmlentities() function.
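Putting those pieces together, a plain-text snippet could be built roughly like this. The helper name plain_snippet() is hypothetical, UTF-8 input is assumed, and mb_substr() is used only when the mbstring extension is available.

```php
<?php
// Hypothetical helper: plain-text preview of an HTML fragment.
function plain_snippet(string $html, int $length = 50): string
{
    // Drop tags, then turn entities such as &amp; back into characters.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');
    // Collapse runs of whitespace left behind by removed tags.
    $text = trim(preg_replace('/\s+/', ' ', $text));
    // Prefer mb_substr so multibyte characters are not cut in half.
    $cut = function_exists('mb_substr')
        ? mb_substr($text, 0, $length, 'UTF-8')
        : substr($text, 0, $length);
    // Escape again, because the decoded text may contain < or &.
    return htmlspecialchars($cut, ENT_QUOTES, 'UTF-8');
}

echo plain_snippet('<p>Fish &amp; <b>chips</b>,   with   salt</p>', 12);
// prints: Fish &amp; chips
```

Decoding before truncating avoids cutting an entity like &amp; in the middle, and escaping afterwards keeps the snippet safe to print.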

htmlentities displaying html safely

I have data coming in from an RSS feed. I want to be safe and use htmlentities, but if the feed contains HTML, the page ends up full of raw markup mixed with the content. I don't mind the formatting the RSS feed offers and would be glad to use it, as long as I can display it safely. I'm after the content of the feed, but I also want it to be formatted decently (if there is a break tag, paragraph or div). Does anyone know a way?
Do you want to protect from XSS in the feed? If so, you'll need an HTML sanitizer to run on the HTML prior to displaying it:
HTMLSanitizer
HTMLPurifier
If you just want to escape whatever is there, just call htmlspecialchars() on it. But any HTML will appear as escaped text...
You can use the strip_tags function and specify the allowed tags:
echo strip_tags($content, '<p><a>');
This way any tag not specified in allowed tags will be removed.
You can convert the HTML into Markdown and then back into HTML using various libraries.

How to Remove Html Tags in PHP?

I use the htmlspecialchars function on my string, but I don't want it to escape these tags:
<b>, <br>, <p>, <ul>, <li> and so on.
Example: Mystring = "<script>.....</script><br><b>test</b><p>aaaa</p>";
I want the result to be:
&lt;script&gt;.....&lt;/script&gt;<br><b>test</b><p>aaaa</p>
Have a look at HTML Purifier, and especially the whitelist feature.
This is probably the safest approach if you allow HTML tags. You can view the comparison here.
You want to remove all tags? Use strip_tags().
You can use HTML Sanitizer Class - http://www.phpclasses.org/browse/package/3746.html
You can use built in PHP function: strip_tags($text, '<p><a><li><b>');
You don't need any class or external library to remove HTML tags from user input.
In the above example I am removing all tags from $text except <p>, <a>, <li> and <b>.
This function cleans the user input so that users cannot submit <script> or </div> and break your page.
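For the case asked about here (showing <script> as escaped text while keeping a few formatting tags), one possible approach is to escape everything and then re-enable only attribute-free tags. This is a sketch, not a full sanitizer; the helper name escape_except() and its allowed list are assumptions made for the example.

```php
<?php
// Hypothetical helper: escape everything, then re-enable a few bare tags.
function escape_except(string $html, array $allowed): string
{
    $escaped = htmlspecialchars($html, ENT_QUOTES, 'UTF-8');
    $names = implode('|', array_map('preg_quote', $allowed));
    // Only attribute-free forms such as <b>, </b> or <br> are restored.
    return preg_replace('#&lt;(/?)(' . $names . ')&gt;#i', '<$1$2>', $escaped);
}

echo escape_except(
    '<script>.....</script><br><b>test</b><p>aaaa</p>',
    ['b', 'br', 'p', 'ul', 'li']
);
// prints: &lt;script&gt;.....&lt;/script&gt;<br><b>test</b><p>aaaa</p>
```

Because tags carrying attributes are never restored, something like <b onclick="..."> stays escaped. Tags that need attributes (for example <a href>) are better handled by a library such as HTML Purifier.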
