strip_tags and htmlentities - php

Should I use htmlentities with strip_tags?
I am currently using strip_tags when adding to database and thinking about removing htmlentities on output; I want to avoid unnecessary processing while generating HTML on the server.
Is it safe to use only strip_tags without allowed tags?

First: apply each escaping method only at the point where you need it. If you insert something into a database, escape it only for the database, i.e. apply mysql_real_escape_string (or PDO->quote, or whatever your database layer provides). Don't apply any output escaping yet: no strip_tags or similar. You may want to use the stored data someplace else where HTML escaping isn't necessary and only makes the text ugly.
Second: You should not use strip_tags. It removes the tags altogether, so the user doesn't get back what they typed in. Instead use htmlspecialchars. It shows the user the same text they entered, but makes it harmless.
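One more reason strip_tags alone is not a substitute for escaping: it only removes tags and leaves quotes untouched, so a value placed inside an HTML attribute can still break out of it. A minimal sketch of the difference (the $input value is a made-up example, not taken from the question):

```php
<?php
// Hypothetical user input aimed at an attribute such as value="...".
$input = '" onmouseover="alert(1)';

// strip_tags finds no tags here, so the string passes through unchanged.
$stripped = strip_tags($input);

// htmlspecialchars with ENT_QUOTES encodes both kinds of quotes.
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

echo '<input value="' . $stripped . '">' . PHP_EOL; // the quote closes the attribute early
echo '<input value="' . $escaped . '">' . PHP_EOL;  // the text stays inside the attribute
```

So even with no allowed tags, strip_tags output still needs escaping if it can end up inside an attribute.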

strip_tags will remove all HTML tags:
"<b>foo</b><i>bar</i>" --> "foobar"
htmlentities will encode characters which are special characters in HTML:
"a & b" --> "a &amp; b"
"<b>foo</b>" --> "&lt;b&gt;foo&lt;/b&gt;"
If you use htmlentities, then when you output the string to the browser, the user should see the text as they entered it, not as HTML:
echo htmlentities("<b>foo</b>");
Visually results in: <b>foo</b>
echo strip_tags("<b>foo</b>");
Results in: foo

I wouldn't use htmlentities as this will allow you to insert the string, as is, into the database. This is no good for account details or forums.
Use mysql_real_escape_string for inserting data into the database, and strip_tags for receiving data from the database and echoing out to the screen.

Try this and compare the two results:
<?php
$d= isset($argv[1]) ? $argv[1] : "empty argv[1]".PHP_EOL;
echo strip_tags(htmlentities($d)) . PHP_EOL;
echo htmlentities(strip_tags($d)) . PHP_EOL;
?>
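Open cmd or your terminal and run something like the following:
php your_script.php "<br>foo</br>"
The first line prints the entity-encoded text (&lt;br&gt;foo&lt;/br&gt;), because after htmlentities there are no tags left for strip_tags to remove. The second line prints foo. Either result is safe to output as HTML text.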

Related

How to retrieve original text after using htmlspecialchars() and htmlentities()

I have some text that I will be saving to my DB. Text may look something like this: Welcome & This is a test paragraph. When I save this text to my DB after processing it using htmlspecialchars() and htmlentities() in PHP, the sentence will look like this: Welcome &amp; This is a test paragraph.
When I retrieve and display the same text, I want it to be in the original format. How can I do that?
This is the code that I use:
$text= htmlspecialchars(htmlentities($_POST['text']));
$text= mysqli_real_escape_string($conn,$text);
There are two problems.
First, you are double-encoding HTML characters by using both htmlentities and htmlspecialchars. Both of those functions do the same thing, but htmlspecialchars only does it with a subset of characters that have HTML character entity equivalents (the special ones.) So with your example, the ampersand would be encoded twice (since it is a special character), so what you would actually get would be:
$example = 'Welcome & This is a test paragraph';
$example = htmlentities($example);
var_dump($example); // 'Welcome &amp; This is a test paragraph'
$example = htmlspecialchars($example);
var_dump($example); // 'Welcome &amp;amp; This is a test paragraph'
Decide which one of those functions you need to use (probably htmlspecialchars will be sufficient) and use only one of them.
Second, you are using these functions at the wrong time. htmlentities and htmlspecialchars will not do anything to "sanitize" your data for input into your database. (Not saying that's what you're intending, as you haven't mentioned this, but many people do seem to try to do this.) If you want to protect yourself from SQL injection, bind your values to prepared statements. Escaping it as you are currently doing with mysqli_real_escape_string is good, but it isn't really sufficient.
htmlspecialchars and htmlentities have specific purposes: to convert characters in strings that you are going to output into an HTML document. Just wait to use them until you are ready to do that.
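For rows that were already saved double-encoded, the encoding can be undone one layer at a time. The following is only a sketch under the assumption that the text went through exactly htmlentities and then htmlspecialchars, as in the question; the variable names are illustrative.

```php
<?php
$original = 'Welcome & This is a test paragraph';

// What the question's code effectively stored: encoded twice.
$stored = htmlspecialchars(htmlentities($original, ENT_QUOTES, 'UTF-8'), ENT_QUOTES, 'UTF-8');

// Undo one layer per call to recover text that was saved this way.
$once      = html_entity_decode($stored, ENT_QUOTES, 'UTF-8');
$recovered = html_entity_decode($once, ENT_QUOTES, 'UTF-8');

// Going forward: store $original as-is and encode once, at output time.
$forHtml = htmlspecialchars($original, ENT_QUOTES, 'UTF-8');

echo $stored . PHP_EOL;
echo $recovered . PHP_EOL;
echo $forHtml . PHP_EOL;
```

If the raw text is stored instead, no decoding step is needed at all; a single htmlspecialchars call at output is enough.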

shell_exec() not running program & giving incomplete output [duplicate]

I want to display text on the page, the text should look like this:
<sometext> ... but when I echo this, nothing appears!!
How can I do this?
A "page" is written in HTML, so < means "Start a tag".
You have to represent characters with special meaning in HTML using entities.
You can write them directly, or make use of the htmlspecialchars function.
echo "&lt;sometext&gt;";
echo htmlspecialchars("<sometext>");
You probably want &lt;sometext&gt;.
If that text is coming from user input, you should definitely use htmlspecialchars() on it, to help prevent XSS.
This is because the browser assumes it is an unknown tag. If you want the browser to show it, use:
echo '&lt;sometext&gt;';
or use the htmlentities function like so:
echo htmlentities('<sometext>');
You need to call htmlentities() to convert the HTML metacharacters into something that will display properly.

Need to escape or sanitise output that is displayed in a <textarea>?

Do you have to escape or sanitise output that will be in a <textarea>?
It seems that if I sanitise it using htmlentities(), the actual &...; character replacements come up.
Well, you have to:
<?php
$content = "</textarea><script>alert('hi!')</script>";
?>
<textarea>
<?php echo $content; ?>
</textarea>
Yes, you need to sanitize. Use htmlspecialchars($str, ENT_QUOTES) instead.
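A minimal sketch of that advice applied to the example above. The render_textarea helper is a hypothetical name introduced only for illustration.

```php
<?php
// Hypothetical helper: render untrusted text as the content of a textarea.
function render_textarea(string $content): string
{
    // Encode <, >, & and both kinds of quotes before embedding the text.
    $safe = htmlspecialchars($content, ENT_QUOTES, 'UTF-8');
    return '<textarea>' . $safe . '</textarea>';
}

$content = "</textarea><script>alert('hi!')</script>";
echo render_textarea($content) . PHP_EOL;
```

The browser decodes the entities inside the textarea, so the user sees and edits the original characters. If literal &...; sequences show up, the text has most likely been encoded more than once.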
If that output was initially provided by the user or any untrusted source (i.e. not directly from your code) then it needs to be sanitized to prevent against XSS attacks.
You need to consider whether the output is editable by the user or not. If it is not, and it is trusted output (maybe coming from predefined texts that YOU wrote), you obviously don't. Otherwise, yes. The replacement of HTML characters is quite normal and nothing to worry about: when the browser renders the page, the user still sees the original characters.
Notice that the > and < characters could be used, if not sanitized, to inject other HTML code, in particular the <script> tag, which can run JavaScript.
Always escape all occurrences of < and > (with &lt; and &gt;) within the textarea's content. Otherwise one could provide the following content (example) to "escape" the textarea and inject HTML code:
</textarea><script src="http://malicious.code.is/us.js"></script>
This would result in the following code:
<textarea id="text"></textarea><script src="http://malicious.code.is/us.js"></script></textarea>
The second </textarea> at the end would be ignored and the script tag before would be executed.
Just using htmlspecialchars() is NOT enough. It still leaves you vulnerable to certain multibyte character attack vectors (even when using htmlspecialchars($string, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8')).
Perhaps look at a library like HTMLPurifier to give you a more complete solution.
Here is a pretty good summary of XSS protection in PHP.
http://www.bytetouch.com/blog/programming/protecting-php-scripts-from-cross-site-scripting-xss-attacks/

preventing php injection in forms (if a user submits php code into a form)

In a contact form, if someone enters the following into the textbox:
<?php echo 'hi'; ?>
I see that the server will not execute it because of an error. What I would like it to do instead is somehow escape it into plain text and display it correctly. I have seen other sites being able to do this. I originally thought this could be solved by the addslashes() function, but that doesn't seem to work.
Thanks,
Phil
No. Use htmlspecialchars instead. Don't use addslashes.
To be more specific, addslashes bluntly escapes all instances of ', " and \ and NUL. It was meant to prevent SQL injection, but it has no real use in proper security measures.
What you want is to prevent the browser from interpreting tags as markup (and that's entirely different from preventing SQL injection). For instance, if I want to talk about <script> elements, SO shouldn't simply send that string literally, which would start an actual script (and can lead to cross-site scripting). Instead, some characters, especially < and >, need to be encoded as HTML entities so they're shown as angle brackets (the same is true for &, which would otherwise be interpreted as the start of an HTML entity).
In your case, output after htmlspecialchars would look like:
&lt;?php echo 'hi'; ?&gt;
Use htmlspecialchars before outputting anything provided by the user. But in this case, also make sure that you do not execute anything the user inputs. Do not use eval, include or require. If you save the user data to a file, use readfile or file_get_contents+htmlspecialchars instead of include/require. If you're using eval, change it into echo and so on.
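A short sketch of the file_get_contents+htmlspecialchars approach. The temporary file here is a hypothetical stand-in for wherever the submitted message is actually saved.

```php
<?php
// Hypothetical file standing in for saved user data.
$path = tempnam(sys_get_temp_dir(), 'msg');
file_put_contents($path, "<?php echo 'hi'; ?>");

// Read the file as data (never include/require it), then escape for HTML.
$raw  = file_get_contents($path);
$html = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
unlink($path);

// With ENT_QUOTES the single quotes are encoded as well (&#039;).
echo $html . PHP_EOL;
```

Because the file is only read as a string, the PHP code inside it is never run; the browser just displays it as text.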

How do I properly encode a form in php

On my site users can add content to the database via a form. I want the users to be able to type anything in the form and for it all to be added to the database how they have entered it. At the moment I'm getting problems with a number of characters, namely slashes, &, ? etc.
What is the best way to allow all characters to be added to the database correctly?
Also, do you have to decode them when displaying them for it to work correctly? If so, how do I do that?
When saving, use mysql_real_escape_string (or PDO) to protect against SQL injection attacks. This will make it possible to write quotes and backslashes without destroying the SQL query.
<?php
$text = mysql_real_escape_string($_POST['text']);
mysql_query('INSERT INTO table(text) VALUES("'.$text.'")');
?>
When printing the data to a browser (with echo), first run it through htmlspecialchars to disable HTML and solve your current problem:
<?php
// ...fetch $text from db here...
echo htmlspecialchars($text);
?>
htmlentities() can encode HTML characters for you, and html_entity_decode() decodes them again.
You can also use nl2br() to preserve line breaks from textarea elements.
Also, you should use PDO for your database needs, as it is much more secure than the old method of escaping data with mysql_real_escape_string().
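Putting the PDO, htmlspecialchars and nl2br suggestions together, a minimal sketch might look like this. It assumes an in-memory SQLite database (which needs the pdo_sqlite extension) and a made-up table named comments; swap in your own DSN and schema.

```php
<?php
// Assumption: an in-memory SQLite database stands in for the real one,
// and the table name "comments" is made up for this sketch.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE comments (body TEXT)');

$text = "It's a \"test\" \\ with & and ?\nsecond <b>line</b>";

// The placeholder keeps the value out of the SQL string entirely.
$insert = $pdo->prepare('INSERT INTO comments (body) VALUES (:body)');
$insert->execute([':body' => $text]);

// The stored value comes back exactly as entered; no decoding is needed.
$stored = $pdo->query('SELECT body FROM comments')->fetchColumn();

// Escape for HTML only at output time, then turn newlines into <br />.
$html = nl2br(htmlspecialchars($stored, ENT_QUOTES, 'UTF-8'));
echo $html . PHP_EOL;
```

Note the order in the last step: nl2br runs after htmlspecialchars, otherwise the inserted <br /> tags would be escaped too.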
