I was reading an article about form security because I have a form in which a user can add messages.
I read that it was best to use strip_tags(), htmlspecialchars() and nl2br(). Somewhere else it is being said to use html_entity_decode().
I have this code in my page which takes the user input
<?php
$topicmessage = check_input($_POST['message']); //protect against SQLinjection
$topicmessage = strip_tags($topicmessage, "<p><a><span>");
$topicmessage = htmlspecialchars($topicmessage);
$topicmessage = nl2br($topicmessage);
?>
But when I echo the message, it's all on one line, and it appears that the breaks have been removed by strip_tags() and not put back by nl2br().
To me, that makes sense why it does that, because if the break has been removed, how does it know where to put it back (or does it)?
Anyway, I'm looking for a way to protect my form from being used to hack the site, for example by entering JavaScript into the form.
You have 2 choices:
Allow absolutely no HTML. Use strip_tags() with NO allowed tags, or htmlspecialchars() to escape any tags that may be in there.
Allow HTML, but you need to sanitize the HTML. This is NOT something you can do with strip_tags. Use a library (Such as HTMLPurifier)...
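For the first choice, a minimal sketch of escaping on output while keeping the user's line breaks could look like the following. The helper name render_message() is made up for illustration; the point is only the order of the two calls, since running nl2br() before htmlspecialchars() would escape the inserted <br /> tags as well.

```php
<?php
// Hypothetical helper: escape user text for an HTML page, keeping line breaks.
function render_message(string $raw): string
{
    // Escape first so any markup in the input becomes inert text.
    $escaped = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
    // Then insert <br /> before each newline; these tags are added by us,
    // not by the user, so they are safe to leave unescaped.
    return nl2br($escaped);
}

echo render_message("Hi <script>alert(1)</script>\nsecond line");
```

The raw message would be stored as typed and only passed through this helper at the point where it is printed.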
You just need htmlspecialchars() before printing form content and mysql_real_escape_string() before inserting into SQL (you don't need it before printing), and you should be good.
Stripping tags your way is very dangerous: you need a short list of allowed tags with limited attributes, and that is not something you can do in one line. You might want to look into HTML normalizers, like Tidy.
Use HTML Purifier for HTML input and strip everything you don't want - all but paragraphs, all anchors etc.
Unrelated but important:
sprintf() for stuff like "only digits from that field".
mysql_real_escape_string() on all insert queries in general.
Related
I have html stored in the database and I need to output it to the page.
If I don't escape() it, then I get the bold formatting I want, but I run the risk of getting an XSS from the unescaped html source.
If I escape() it, then it shows the raw html code <b>bold text</b> instead of bold text.
How can I escape everything, except some tags? I'm thinking to apply the escape(), then search for the <b> and </b> and unescape them. Would that work? Any security problems you see with it? I'm also not sure how I would search for the <b></b> tags. Regex for that maybe or what?
P.S. the escape() I mean is a function in Zend. I believe it's the equivalent of htmlspecialchars().
Unescaping is the way to go. If you only whitelist a couple of tags to be converted back from the html escapes, then you won't run into XSS exploits.
Workaround markups provide no advantage regarding that, as the many failed BBcode parsers prove.
(Instead of converting back and forth it might however be sensible to utilize HTMLPurifier instead.)
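As a rough sketch of the escape-then-unescape idea, assuming only bare <b> and </b> should survive: the function name escape_allow_bold() is hypothetical, and this is not a replacement for a real sanitizer such as HTMLPurifier.

```php
<?php
// Hypothetical helper: escape everything, then restore only bare <b> and </b>.
function escape_allow_bold(string $html): string
{
    $escaped = htmlspecialchars($html, ENT_QUOTES, 'UTF-8');
    // Only the exact escaped forms are converted back, so a tag that carries
    // attributes (e.g. <b onclick="...">) stays escaped.
    return str_replace(
        ['&lt;b&gt;', '&lt;/b&gt;'],
        ['<b>', '</b>'],
        $escaped
    );
}

echo escape_allow_bold('<b>bold</b> <script>y()</script>');
```

Note that this does nothing to balance tags, so an unmatched <b> or </b> in the input will pass through as-is.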
If the HTML-markup in the database comes from users you do not trust, you should give them access to markdown or similar 'safe' editing environments, so they can prepare the markup they want and not be allowed to inject HTML.
Attempts to perform selective filtering are frequently wrong, and miss ways attackers can inject malicious code. So don't let them write raw HTML.
htmlspecialchars_decode() is the opposite of htmlspecialchars(). It is possible to unescape it, but there's no parameter for restricting tags.
If the HTML is written by the user, it is a bad idea :)
You could use the HTMLPurifier library which will take care of everything you need to do with escaping and such. Here is a nice video explaining how to install it into the zend framework
http://www.zendcasts.com/htmlpurifier-integration/2011/05/
Try strip_tags(); its second parameter is $allowable_tags.
Use Zend_Filter_StripTags class and as argument for the constructor use an array with following keys:
'allowTags' => Tags which are allowed
'allowAttribs' => Attributes which are allowed
This second part allows you to trim all unwanted attributes like 'onClick' etc., which can be as dangerous as <script> code, but you can leave 'src' for <img> or 'href' for <a>.
Create your own view helper, or you can also use setEscape() in the controller. See http://framework.zend.com/manual/en/zend.view.scripts.html#zend.view.scripts.escaping
I have an issue: when I enter text, I can enter HTML with the text. For example "I am entering text ". Now this link shows up as a link when the form is submitted. Any ideas on how to prevent this?
I am entering text Go to my site . This is the input so when i output the data it comes out as I am entering text **Go to my site** with the hyperlink.
Put the string in htmlspecialchars() or strip_tags().
And, since I feel cleaning strings for other purposes will be the next question thrown out, I should bring up this: The ultimate clean/secure function
You aren't going to easily be able to prevent a user from entering tags without JavaScript, but you can use strip_tags() on the backend to remove them.
htmlspecialchars() will not remove these tags; it will just encode the special characters.
Normally, you do not want to prevent this. You want to make sure it doesn't output HTML when you print it.
The way to do that is like so:
echo $_GET['text']; // this prints HTML links etc
echo htmlspecialchars($_GET['text']); // this does not
If I understood correctly, you want to prevent injecting HTML code. Use htmlspecialchars().
echo htmlspecialchars($_POST['myform']);
Sanitize your input data before displaying it. Displaying it unescaped opens you up to what is known as a Cross-Site Scripting (XSS) attack.
You can use htmlspecialchars() or strip_tags() to clean the data.
I have the following array:
'tagline_p' => "I'm a <a href='#showcase'>multilingual web</a> developer, designer and translator. I'm here to <a href='#contact'>help you</a> reach a worldwide audience.",
Should I escape the HTML tags inside the array to avoid hackings to my site? (How to escape them?)
or is OK to have HTML tags inside an array?
The only time it becomes a problem is when it contains user input. You know what you put in your array, and trust it. But you don't know what users are passing in, and don't trust that.
So in this particular case, escaping is not needed. But as soon as user input is involved, you should escape the input.
It's not the HTML itself that is dangerous, but the type of HTML users can pass in, like script tags which allow them to execute Javascript.
Addition
Note that it's best practice to only escape on output not on input. The output is where the data can do damage, so you want to consistently escape that. That way, you don't have to make sure that all input is escaped.
That way, you don't have problems when outputting data to different formats where maybe different rules apply. You don't have to use things like stripslashes() or htmlspecialchars_decode() if you don't need things to be output as html.
It's fine to store the data in the array.
You only need to escape the tags when you are outputting it into an HTML context, and you don't trust it, or you don't want the HTML to be interpreted.
You have to escape data in an appropriate manner to where you are sending it; for HTML if you don't want it to be read as HTML you can use htmlspecialchars(), likewise if you are putting it into an SQL statement and you don't want it to be read as SQL, you can use mysql_real_escape_string() etc.
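To illustrate "appropriate to where you are sending it", here is a small sketch that sends the same string to two different destinations. The sample input and the search.php URL are invented for the example, and the SQL case is left out because it needs a database connection.

```php
<?php
// Made-up sample input containing characters that are special in HTML and URLs.
$input = 'Tom & "Jerry" <tj@example.com>';

// HTML context: encode the HTML special characters.
$forHtml = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// URL query-string context: percent-encode instead (search.php is hypothetical).
$forUrl = 'search.php?q=' . rawurlencode($input);

echo $forHtml, "\n", $forUrl, "\n";
```

The stored value stays untouched; each destination gets its own encoding at the moment the data is written out.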
You should escape HTML when it has been entered by a user (and thus is unsafe) AND you're going to display that HTML in your site. If it's you who wrote it, it doesn't need any kind of escaping.
If you do need to escape HTML you should do so right before displaying it on your site. There is no need to escape data when you're just lugging it around (like you're presumably doing with that array). You can escape HTML with the htmlspecialchars() function.
(Use htmlspecialchars or htmlentities to escape the HTML.)
Having HTML tags is fine as long as you restrict the set of tags and attributes coming from user, if that array is dynamically generated. For example, <script> should not be allowed, nor event handlers like onmouseover.
It depends on how the HTML is getting into the array. If it's hardcoded by you, it's probably all right. If it's coming from a user, well, all user input is suspect- HTML is just more difficult to clean.
The real question might be "Why do you want to put HTML in an array?". If it's static text, put it in a template file somewhere.
make an array of allowable tags and use strip_tags($input_array[$key],$allowable_tags)
or make a function like this
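function sanitize_input($input, $allowable_tags = '<br><b><strong><p>')
{
    // $input is the array to clean; it was missing from the parameter list
    $input_array = $input;
    foreach ($input as $key => $value) {
        if (!empty($value)) {
            // remove every tag that is not in the allowed list
            $input_array[$key] = strip_tags($value, $allowable_tags);
        }
    }
    return $input_array;
}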
I have one problem regarding the data insertion in PHP.
In my site there is a message system.
So when my inbox loads it gives one JavaScript alert.
I have searched a lot in my site and finally I found that someone have send me a message with the text below.
<script>
alert(5)
</script>
So how can I restrict the script code being inserted in my database?
I am running on PHP.
There is no problem with JavaScript code being stored in the database. The actual problem is with non-HTML content being taken from the database and displayed to the user as if it were HTML. The correct approach would be to make sure your rendering code treats text as text, not as HTML.
In PHP, this would be done by calling htmlspecialchars on the inbox contents when displaying the inbox (possibly along with nl2br and maybe turning links to <a> tags).
Avoid using strip_tags() for text content: as a user, I might want to type a message like:
... and to create a link, use your-text-here ...
strip_tags() would eliminate the tag, whereas htmlspecialchars() would make the text appear as it was typed.
You should not restrict it to be inserted into the database (if StackOverflow would restrict it, we would not be able to post code examples here!)
You should better control how you display it. For instance, add htmlentities() or htmlspecialchars() to your echo call.
This is called XSS. There are numerous threads about it on SO.
How to prevent XSS with HTML/PHP?
What are the best practices for avoid xss attacks in a PHP site?
XSS Attacks Prevention
Is preventing XSS and SQL Injection as easy as does this…?
You should use strip_tags. If you still want to allow some HTML, then add a whitelist in the second parameter.
I should add a really big caveat here. If you're leaving any tags in a strip_tags whitelist, you can still be susceptible to javascript injection. Assume you're allowing the seemingly innocuous tags <strong> and <em>:
Strip tags will still allow all attributes, including event handlers
like <strong onmouseover="window.location.href='http://mydodgysite.com'">this</strong>.
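A quick way to see this for yourself is the sketch below; the input string is invented for the demonstration. The allowed tag comes through with its event handler intact, and only the <script> tags themselves are removed (their contents stay behind as text).

```php
<?php
// Made-up hostile input: an allowed tag carrying an event handler, plus a script.
$input = '<strong onmouseover="alert(1)">hover me</strong><script>alert(2)</script>';

// <strong> is on the allowed list, so it is kept together with its attribute.
$filtered = strip_tags($input, '<strong>');

echo $filtered;
// <strong onmouseover="alert(1)">hover me</strong>alert(2)
```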
You have a couple of serious options:
strip_tags with no whitelist. Safe, but doesn't allow for any formatting, and may cause problems with strings like this: "x<y, but y>4" --> "x4"
htmlentities. Use this when displaying the data on the screen (not on the data before you put it in the database). It's safe, but doesn't allow for formatting.
A different markup system than HTML, for example: Markdown, Wiki markup, BB Code. Requires rendering to convert back to HTML, but it's mostly safe and can be quite flexible.
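The difference between the first two options can be sketched on the example string from above. Variable names are arbitrary; this just compares what each function returns.

```php
<?php
$text = 'x<y, but y>4';

// strip_tags() treats "<y, but y>" as a tag and deletes it.
$stripped = strip_tags($text);

// htmlspecialchars() keeps every character and only encodes < and >.
$escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

echo $stripped, "\n"; // x4
echo $escaped, "\n";  // x&lt;y, but y&gt;4
```

In a browser the escaped version displays exactly what the user typed, while the stripped version has silently lost part of the message.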
User input should be escaped before outputting it.
Whenever you're displaying something a user submitted, run it through htmlspecialchars() first. This'll turn HTML code into safe output.
Take a look at the htmlspecialchars() function. It converts < > " and & (and ' when ENT_QUOTES is set) to their HTML entity equivalents, meaning <script> will become &lt;script&gt;
You can use strip_tags(). The second argument of this function will allow you to list an explicit list of which tags are allowable:
// Allow <p> and <a>, <script> will be stripped
echo strip_tags($text, '<p><a>');
You may also consider htmlspecialchars(), which converts characters like < into &lt;, causing the browser to interpret them as text, rather than code:
$new = htmlspecialchars("<a href='test'>Test</a>", ENT_QUOTES);
echo $new; // &lt;a href=&#039;test&#039;&gt;Test&lt;/a&gt;
If I understand you right, you're just looking for two simple commands:
$message = str_replace("<", "&lt;", $message);
$message = str_replace(">", "&gt;", $message);
What is the most secure way to stop users adding HTML or JavaScript to a field? I am adding a YouTube-style 'description' where users can explain their work, but I don't want anything other than plain text in there, and preferably none of the htmlentities rubbish like '&lt;' or '&gt;'.
Could I do something like this:
$clean = htmlentities($_POST['description']);
if ($clean != $_POST['description']) ... then return the form with an error?
Have you seen strip_tags?
strip_tags() would probably be the best bet.
You don't need to check the cleaned code vs the original and throw an error. As long as it is cleaned, you should be able to display it. Just throw away the original comment. You can put a note under the textbox saying that no html is allowed if you want to make it more user friendly.
Use strip_tags() instead of htmlentities().
And the method is ok.
htmlspecialchars(), if used properly (see comments), is the safest way to ensure plain text. There is no way to inject any HTML or JavaScript when the output has all the HTML special characters escaped. If you use strip_tags, you will prevent your users from using completely legitimate characters.
Also don't forget mysql_real_escape_string() if you are storing data in MySQL.