I created a form where users can enter HTML code and it outputs their code in another textarea. The problem is that if the HTML the user enters contains a textarea, the </textarea> in their code breaks my textarea form. I see other sites display any HTML correctly, so how is this done without breaking the form, while still letting the user copy the code so that it remains </textarea> and not some converted code, so they can paste it on their webpage?
Ah crap yeah I figured it out, in fact the problem wasn't with the htmlspecialchars code alone I forgot to add a return to one of my functions haha. Thanks guys.
Represent characters that have special meaning in HTML using entities. Since you are using PHP, use htmlspecialchars
There are many ways to do this. The easiest is to use htmlspecialchars or htmlentities on the user's input. This will show a visual </textarea> in the textarea box without closing it, because the tag is actually turned into &lt;/textarea&gt; in the page source. htmlspecialchars transforms fewer characters than htmlentities and usually makes more sense to use in a situation like this, but do your research.
strip_tags() is also a possibility.
You can also use a regular expression with PCRE, or even str_replace() or other string manipulation functions to strip off the textarea, convert the special characters, etc.
PECL also has a BBCode extension you can use if you still want your users to be able to enter some form of tags to style their output.
<textarea><?php echo htmlentities($code); ?></textarea>
You have to transform the html code into symbols, so it is not treated as html.
Use the function htmlentities() on the textarea content before echoing it.
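As a minimal sketch of the escaping approach described in these answers: the sample input and the variable names below are made up for illustration, and ENT_QUOTES and UTF-8 are passed explicitly as an assumption about the page's encoding.

```php
<?php
// Hypothetical user input that contains its own closing textarea tag.
$code = 'Try this: <textarea name="x">hi</textarea> & done';

// Escape for an HTML context. The flags and charset are given explicitly
// so the result does not depend on the PHP version's defaults.
$safe = htmlspecialchars($code, ENT_QUOTES, 'UTF-8');

// The only literal </textarea> left in the markup is the one we wrote.
$html = '<textarea rows="5" cols="40">' . $safe . '</textarea>';
echo $html, "\n";
```

The browser decodes the entities when it renders the box, so what the user sees and copies is the original markup, not the entity form.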
I have a php script, where the user inserts his name.
Users can insert anything they want, even things like <img src="....
I would like to save their input in a way it won't show any image (or any html).
I know it exists but I don't know what keywords to search in order to find what does it.
Use strip_tags($str).
http://php.net/strip_tags
htmlspecialchars() will encode the text so that the tags are not interpreted as HTML.
The easiest solution is the PHP function strip_tags(), which does exactly what the name suggests, and strips HTML tags from a string.
The other alternative is to 'escape' the input, so that HTML characters such as < and > are converted into displayable text. This would result in the HTML code being displayed.
You would do this with the function htmlentities().
It's worth pointing out that the input may contain HTML characters without actually intending to be HTML. The & character is a HTML reserved character, but can also be found in normal text. > and < are less commonly used in normal text, but still possible. All of them may cause problems when displayed on your page, without necessarily being actual HTML code.
The solution to this is as above, to escape the string using htmlentities(). You may want to run strip_tags() first, but you should also run htmlentities() afterwards, to ensure that the string is displayed correctly.
Hope that helps.
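Roughly, the combination suggested above (strip the tags, then escape whatever is left) could look like this. The function name clean_name and the sample input are hypothetical, and htmlspecialchars() is used here in place of htmlentities() because it is enough to neutralise &, < and >.

```php
<?php
// Hypothetical helper: remove tags from a name, then escape the rest.
function clean_name(string $raw): string
{
    // Drop anything that looks like an HTML tag, such as <img ...>.
    $noTags = trim(strip_tags($raw));
    // Escape remaining special characters (a plain & is legal in a name).
    return htmlspecialchars($noTags, ENT_QUOTES, 'UTF-8');
}

echo clean_name('Tom & Jerry <img src="x.png">'), "\n"; // Tom &amp; Jerry
```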
I was reading an article about form security because I have a form in which a user can add messages.
I read that it was best to use strip_tags(), htmlspecialchars() and nl2br(). Somewhere else it is being said to use html_entity_decode().
I have this code in my page which takes the user input
<?php
$topicmessage = check_input($_POST['message']); //protect against SQLinjection
$topicmessage = strip_tags($topicmessage, "<p><a><span>");
$topicmessage = htmlspecialchars($topicmessage);
$topicmessage = nl2br($topicmessage);
?>
but when I echo the message, it's all on one line and it appears that the breaks have been removed by strip_tags() and not put back by nl2br().
To me, it makes sense that it does that, because if the break has been removed, how does it know where to put it back (or does it)?
Anyway, I'm looking for a way to protect my form from being used to try to hack the site, for example by putting JavaScript in the form.
You have 2 choices:
Allow absolutely no HTML. Use strip_tags() with NO allowed tags, or htmlspecialchars() to escape any tags that may be in there.
Allow HTML, but you need to sanitize the HTML. This is NOT something you can do with strip_tags. Use a library (Such as HTMLPurifier)...
You just need htmlspecialchars() before printing form content, and mysql_real_escape_string() before putting it into SQL (you don't need it before printing), and you should be good.
Your way of stripping tags is very dangerous: you need a short list of allowed tags with limited attributes, and this is not something you can do in one line. You might want to look into HTML normalizers, like Tidy.
Use HTML Purifier for HTML input and strip everything you don't want: keep only paragraphs, anchors, etc.
Unrelated but important:
sprintf for stuff like "only digits from that field".
mysql_real_escape_string() on all insert queries in general.
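For the "allow absolutely no HTML" choice, a small sketch of the display step might look like the following. The helper name render_message is hypothetical. The order matters: if nl2br() ran before htmlspecialchars(), the inserted <br /> tags would themselves be escaped and shown as text.

```php
<?php
// Hypothetical helper: render an untrusted message as plain text.
function render_message(string $raw): string
{
    // Escape first, so any tags in the message become visible text.
    $escaped = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
    // Then add <br /> before each newline; these tags must stay unescaped.
    return nl2br($escaped);
}

echo render_message("Hello\n<script>alert(1)</script>"), "\n";
```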
I have the following array:
'tagline_p' => "I'm a <a href='#showcase'>multilingual web</a> developer, designer and translator. I'm here to <a href='#contact'>help you</a> reach a worldwide audience.",
Should I escape the HTML tags inside the array to avoid hackings to my site? (How to escape them?)
or is OK to have HTML tags inside an array?
The only time it becomes a problem is when it contains user input. You know what you put in your array, and trust it. But you don't know what users are passing in, and don't trust that.
So in this particular case, escaping is not needed. But as soon as user input is involved, you should escape the input.
It's not the HTML itself that is dangerous, but the type of HTML users can pass in, like script tags which allow them to execute Javascript.
Addition
Note that it's best practice to only escape on output not on input. The output is where the data can do damage, so you want to consistently escape that. That way, you don't have to make sure that all input is escaped.
That way, you don't have problems when outputting data to different formats where maybe different rules apply. You don't have to use things like stripslashes() or htmlspecialchars_decode() if you don't need things to be output as html.
It's fine to store the data in the array.
You only need to escape the tags when you are outputting it into an HTML context, and you don't trust it, or you don't want the HTML to be interpreted.
You have to escape data in an appropriate manner to where you are sending it; for HTML if you don't want it to be read as HTML you can use htmlspecialchars(), likewise if you are putting it into an SQL statement and you don't want it to be read as SQL, you can use mysql_real_escape_string() etc.
You should escape HTML when it has been entered by a user (and thus is unsafe) AND you're going to display that HTML on your site. If it's you who wrote it, it doesn't need any kind of escaping.
If you do need to escape HTML, you should do so right before displaying it on your site. There is no need to escape data when you're just lugging it around (like you're presumably doing with that array). You can escape HTML with the htmlspecialchars() function.
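To illustrate the "escape on output, not on input" advice, here is a short sketch. The helper name e(), the $userName value and the shortened tagline are assumptions made for the example, not part of the original question.

```php
<?php
// Hypothetical short alias for HTML-context escaping.
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Trusted markup written by the site author: printed as-is.
$content = ['tagline_p' => "I'm a <a href='#showcase'>multilingual web</a> developer."];

// Untrusted value, e.g. from a form: kept raw, escaped only when printed.
$userName = '<script>alert(5)</script>';

$page = '<p>' . $content['tagline_p'] . '</p>'
      . '<p>Hello, ' . e($userName) . '</p>';
echo $page, "\n";
```

Keeping the raw value around means the same data can later be sent to a non-HTML context (JSON, plain-text email) without having to undo any escaping first.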
(Use htmlspecialchars or htmlentities to escape the HTML.)
Having HTML tags is fine as long as you restrict the set of tags and attributes coming from user, if that array is dynamically generated. For example, <script> should not be allowed, nor event handlers like onmouseover.
It depends on how the HTML is getting into the array. If it's hardcoded by you, it's probably all right. If it's coming from a user, well, all user input is suspect- HTML is just more difficult to clean.
The real question might be "Why do you want to put HTML in an array?". If it's static text, put it in a template file somewhere.
Make a list of allowable tags and use strip_tags($input_array[$key], $allowable_tags),
or make a function like this:
function sanitize_input(array $input, $allowable_tags = '<br><b><strong><p>')
{
    // Work on a copy so the caller's array is left untouched.
    $input_array = $input;
    foreach ($input as $key => $value) {
        // Only non-empty string values need stripping.
        if (!empty($value) && is_string($value)) {
            $input_array[$key] = strip_tags($value, $allowable_tags);
        }
    }
    return $input_array;
}
I have one problem regarding the data insertion in PHP.
In my site there is a message system.
So when my inbox loads it gives one JavaScript alert.
I have searched a lot in my site and finally I found that someone have send me a message with the text below.
<script>
alert(5)
</script>
So how can I restrict the script code being inserted in my database?
I am running on PHP.
There is no problem with JavaScript code being stored in the database. The actual problem is with non-HTML content being taken from the database and displayed to the user as if it were HTML. The correct approach would be to make sure your rendering code treats text as text, not as HTML.
In PHP, this would be done by calling htmlspecialchars on the inbox contents when displaying the inbox (possibly along with nl2br and maybe turning links to <a> tags).
Avoid using strip_tags() for text content: as a user, I might want to type a message like:
... and to create a link, use <a href="...">your-text-here</a> ...
strip_tags() would eliminate the tag, whereas htmlspecialchars() would make the text appear as it was typed.
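The difference between the two functions can be seen side by side in a quick sketch (the sample message is invented for illustration):

```php
<?php
// A message in which the user talks about markup rather than injecting it.
$message = 'Wrap the word in <em>like this</em> to emphasise it.';

// strip_tags() silently removes part of what the user typed.
$stripped = strip_tags($message);

// htmlspecialchars() keeps every character, encoded so it displays as text.
$escaped = htmlspecialchars($message, ENT_QUOTES, 'UTF-8');

echo $stripped, "\n"; // Wrap the word in like this to emphasise it.
echo $escaped, "\n";  // Wrap the word in &lt;em&gt;like this&lt;/em&gt; to emphasise it.
```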
You should not restrict it to be inserted into the database (if StackOverflow would restrict it, we would not be able to post code examples here!)
You should better control how you display it. For instance, add htmlentities() or htmlspecialchars() to your echo call.
This is called XSS. There are numerous threads about it on SO.
How to prevent XSS with HTML/PHP?
What are the best practices for avoid xss attacks in a PHP site?
XSS Attacks Prevention
Is preventing XSS and SQL Injection as easy as does this…?
You should use strip_tags. If you still want to allow some HTML, then add a whitelist in the second parameter.
I should add a really big caveat here. If you're leaving any tags in a strip_tags whitelist, you can still be susceptible to javascript injection. Assume you're allowing the seemingly innocuous tags <strong> and <em>:
strip_tags() will still allow all attributes on the whitelisted tags, including event handlers
like <strong onmouseover="window.location.href='http://mydodgysite.com'">this</strong>.
You have a couple of serious options:
strip_tags with no whitelist. Safe, but doesn't allow for any formatting, and may cause problems with strings like this: "x<y, but y>4" --> "x4"
htmlentities. Use this when displaying the data on the screen (not on the data before you put it in the database). It's safe, but doesn't allow for formatting.
A different markup system than HTML, for example: Markdown, Wiki markup, BB Code. Requires rendering to convert back to HTML, but it's mostly safe and can be quite flexible.
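The caveat about whitelists is easy to check with a small experiment. The input strings below are made up, and alert() stands in for whatever an attacker would actually run.

```php
<?php
// Untrusted input: a whitelisted tag carrying an event handler, plus a script.
$input = '<strong onmouseover="alert(1)">this</strong><script>alert(2)</script>';

// Only the tag names are checked; attributes on allowed tags are kept.
$kept = strip_tags($input, '<strong><em>');
echo $kept, "\n"; // <strong onmouseover="alert(1)">this</strong>alert(2)

// With no whitelist, ordinary text containing < and > can lose characters.
$math = 'x<y, but y>4';
echo strip_tags($math), "\n";
```

Note that the script tags are removed but their text content (alert(2)) is left behind, and the onmouseover handler survives untouched.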
User input should be escaped before outputting it.
Whenever you're displaying something a user submitted, run it through htmlspecialchars() first. This'll turn HTML code into safe output.
Take a look at the htmlspecialchars() function. It converts <, >, " and & (and ' when ENT_QUOTES is set) to their HTML entity equivalents, meaning <script> will become &lt;script&gt;
You can use strip_tags(). The second argument of this function will allow you to list an explicit list of which tags are allowable:
// Allow <p> and <a>, <script> will be stripped
echo strip_tags($text, '<p><a>');
You may also consider htmlspecialchars(), which converts characters like < into &lt;, causing the browser to interpret them as text, rather than code:
$new = htmlspecialchars("<a href='test'>Test</a>", ENT_QUOTES);
echo $new; // &lt;a href=&#039;test&#039;&gt;Test&lt;/a&gt;
If I understand you right, you're just looking for two simple commands:
$message = str_replace("<", "&lt;", $message);
$message = str_replace(">", "&gt;", $message);
What is the most secure way to stop users adding HTML or JavaScript to a field? I am adding a YouTube-style 'description' where users can explain their work, but I don't want anything other than plain text in there, and preferably none of the htmlentities rubbish like '&lt;' or '&gt;'.
Could I do something like this:
$clean = htmlentities($_POST['description']);
if ($clean != $_POST['description']) ... then return the form with an error?
Have you seen strip_tags?
strip_tags() would probably be the best bet.
You don't need to check the cleaned code vs the original and throw an error. As long as it is cleaned, you should be able to display it. Just throw away the original comment. You can put a note under the textbox saying that no html is allowed if you want to make it more user friendly.
Use strip_tags() instead of htmlentities().
And the method is ok.
htmlspecialchars(), if used properly (see comments), is the safest way to ensure plain text. There is no way to inject any HTML or JavaScript when the output has all the HTML special characters escaped. If you use strip_tags, you will prevent your users from using completely legitimate characters.
Also don't forget mysql_real_escape_string() if you are storing data in MySQL.