Showing plain PHP code in a HTML page - php

And I'm talking (especially) forums here - [PHP]code here[/PHP] - style. Some forums escape double quotes or other "dangerous characters" and others don't.
What is the best method? What are you guys using?
Can it be done without the fear of code injection?
Edit: Who said anything about reinventing the wheel?

When PHP echo or print text, it never executes it. That only happens with eval. This means that if you did this:
echo '<?php ... ?>';
it would carry through to the page output and not be parsed or executed.
This means that all you need to do is escape the usual characters (<, >, &, etc.) and you should generally be safe.

Don't reinvent the wheel. I see BBCode in your question. Grab a markdown library and use it instead. SO uses this: http://daringfireball.net/projects/markdown/

There is no fear of PHP code injection (unless you are doing some unusual things like eval'ing HTML templates) but always a fear of JS code injection, often called XSS. And all danger coming only from possible JS code.
Thus, there is no special treatment for the PHP code, shown on a HTML page. Just treat it as any other data. < > brackets usually being escaped, for obvious reason.
Don't reinvent the wheel. PHP has it's highlight_string function for this

If you see escaped quotes on some page, that's most likely because their script escaped them twice (for example magic_quotes did it once, then mysql_query() again). When data sanitisation is done properly, you should not see escape characters in output.

Related

Is htmlspecialchars() needed when using mysqli prepared statement? [duplicate]

In an article http://dev.mysql.com/tech-resources/articles/4.1/prepared-statements.html, it says the followings:
There are numerous advantages to using prepared statements in your applications, both for security and performance reasons.
Prepared statements can help increase security by separating SQL logic from the data being supplied. This separation of logic and data can help prevent a very common type of vulnerability called an SQL injection attack.
Normally when you are dealing with an ad hoc query, you need to be very careful when handling the data that you received from the user. This entails using functions that escape all of the necessary trouble characters, such as the single quote, double quote, and backslash characters.
This is unnecessary when dealing with prepared statements. The separation of the data allows MySQL to automatically take into account these characters and they do not need to be escaped using any special function.
Does this mean I don't need htmlentities() or htmlspecialchars()?
But I assume I need to add strip_tags() to user input data?
Am I right?
htmlentities and htmlspecialchars are used to generate the HTML output that is sent to the browser.
Prepared statements are used to generate/send queries to the Database engine.
Both allow escaping of data; but they don't escape for the same usage.
So, no, prepared statements (for SQL queries) don't prevent you from properly using htmlspecialchars/htmlentities (for HTML generation)
About strip_tags: it will remove tags from a string, where htmlspecialchars will transform them to HTML entities.
Those two functions don't do the same thing; you should choose which one to use depending on your needs / what you want to get.
For instance, with this piece of code:
$str = 'this is a <strong>test</strong>';
var_dump(strip_tags($str));
var_dump(htmlspecialchars($str));
You'll get this kind of output:
string 'this is a test' (length=14)
string 'this is a <strong>test</strong>' (length=43)
In the first case, no tag; in the second, properly escaped ones.
And, with an HTML output:
$str = 'this is a <strong>test</strong>';
echo strip_tags($str);
echo '<br />';
echo htmlspecialchars($str);
You'll get:
this is a test
this is a <strong>test</strong>
Which one of those do you want? That is the important question ;-)
Nothing changes for htmlspecialchars(), because that's for HTML, not SQL. You still need to escape HTML properly, and it's best to do it when you actually generate the HTML, rather than tying it to the database somehow.
If you use prepared statements, then you don't need mysql_[real_]escape_string() anymore (assuming you stick to prepared statements' placeholders and resist temptation to bypass it with string manipulation).
If you want to get rid of htmlspecialchars(), then there are HTML templating engines that work similarily to prepared statements in SQL and free you from escaping everything manually, for example PHPTAL.
You don't need htmlentities() or htmlspecialchars() when inserting stuff in the database, nothing bad will happen, you will not be vulnerable to SQL injection if you're using prepared statements.
The good thing is you'll now store the pristine user input in your database.
You DO need to escape stuff on output and sending it back to a client, - when you pull stuff out of the database else you'll be vulnerable to cross site scripting attacks, and other bad things. You'll need to escape them for the output format you need, like html, so you'll still need htmlentities etc.
For that reason you could just escape things as you put them into the database, not when you output it - however you'll lose the original formatting of the user, and you'll escape the data for html use which might not pay off if you're using the data in different output formats.
prepare for SQL Injection
htmlspecialchar for XSS(redirect to another link)
<?php
$str = "this is <script> document.location.href='https://www.google.com';</script>";
echo $str;
output: this is ... and redirect to google.com
Using htmlspecialchars:
$str = "this is <script> document.location.href='https://www.google.com';</script>";
echo htmlspecialchars($str);
<i>output1</i>: this is <script> document.location.href='https://www.google.com';</script> (in output browser)<br />
<i>output2</i>: this is <script> document.location.href='https://www.google.com';</script> (in view source)<br />
If user input comment "the script" into database, then browser display
all comment from database, auto "the script" will executed and
redirect to google.com
So,
1. use htmlspecial for deactive bad script tag
2. use prepare for secure database
htmlspecialchars
htmlspecialchars_decode
php validation
I would still be inclined to encode HTML. If you're building some form of CMS or web application, it's easier to store it as encoded HTML, and then re-encode it as required.
For example, when bringing information into a TextArea modified by TinyMCE, they reccomend that the HTML should be encoded - since the HTML spec does not allow for HTML inside a text area.
I would also strip_tags() from anywhere you don't want HTML code.

unescaping some HTML tags after they've been escaped for XSS

I have html stored in the database and I need to output it to the page.
If I don't escape() it, then I get the bold formatting I want, but I run the risk of getting an XSS from the unescaped html source.
If I escape() it, then it shows the raw html code <b>bold text</b> instead of bold text.
How can I escape everything, except some tags? I'm thinking to apply the escape(), then search for the <b> and </b> and unescape them. Would that work? Any security problems you see with it? I'm also not sure how I would search for the <b></b> tags. Regex for that maybe or what?
P.S. the escape() I mean is a function in Zend. I believe it's the equivalent of htmlspecialchars().
Unescaping is the way to go. If you only whitelist a couple of tags to be converted back from the html escapes, then you won't run into XSS exploits.
Workaround markups provide no advantage regarding that, as the many failed BBcode parsers prove.
(Instead of converting back and forth it might however be sensible to utilize HTMLPurifier instead.)
If the HTML-markup in the database comes from users you do not trust, you should give them access to markdown or similar 'safe' editing environments, so they can prepare the markup they want and not be allowed to inject HTML.
Attempts to perform selective filtering are frequently wrong, and miss ways attackers can inject malicious code. So don't let them write raw HTML.
htmlspecialchars_decode() is the opposite of htmlspecialchars(). It is possible to unescape it, but there's no parameter for restricting tags.
If the html is written by the user it is bad idea :)
You could use the HTMLPurifier library which will take care of everything you need to do with escaping and such. Here is a nice video explaining how to install it into the zend framework
http://www.zendcasts.com/htmlpurifier-integration/2011/05/
try use strip_tags in the second parameter is the $ allowable_tags
Use Zend_Filter_StripTags class and as argument for the constructor use an array with following keys:
'allowTags' => Tags which are allowed
'allowAttribs' => Attributes which are allowed
This second part allows you to trim all unwanted attribs like 'onClick' etc whose can be as dangerous as <script> code, but you can leave 'src' for <img> or 'href' for <a>
Create your own view helper or you can also use setEscape() in controller See http://framework.zend.com/manual/en/zend.view.scripts.html#zend.view.scripts.escaping

preventing php injection in forms (if a user submits php code into a form)

In a contact form, if someone enters the following into the textbox:
<?php echo 'hi'; ?>
I see that the server will not execute it because of an error. What I would like it to do is instead, somehow escape it into plain text and display it correctly. I have seen other sites been able to do this. I originally thought this could be solved by the addslashes() function, but that doesn't seem to work.
Thanks,
Phil
No. Use htmlspecialchars instead. Don't use addslashes.
To be more specific, addslashes bluntly escapes all instances of ', " and \ and NUL. It was meant to prevent SQL injection, but it has no real use in proper security measures.
What you want is preventing the browser to interpret tags as is (and that's entirely different from preventing SQL injections). For instance, if I want to talk about <script> elements, SO shouldn't simply send that string literally, causing to start an actual script (that can lead to Cross-site scripting), but some characters, especially < and >, need to be encoded as HTML entities so they're shown as angle brackets (the same is true for &, that otherwise would be interpreted as the start of an HTML entity).
In your case, output after htmlspecialchars would look like:
<?php echo 'hi'; ?>
Use htmlspecialchars before outputing anything provided by the user. But in this case, also make sure that you do not execute anything the user inputs. Do not use eval, include or require. If you save the user data to a file, use readfile or file_get_contents+htmlspecialchars instead of include/require. If you're using eval, change it into echo and so on.

Do I need htmlentities() or htmlspecialchars() in prepared statements?

In an article http://dev.mysql.com/tech-resources/articles/4.1/prepared-statements.html, it says the followings:
There are numerous advantages to using prepared statements in your applications, both for security and performance reasons.
Prepared statements can help increase security by separating SQL logic from the data being supplied. This separation of logic and data can help prevent a very common type of vulnerability called an SQL injection attack.
Normally when you are dealing with an ad hoc query, you need to be very careful when handling the data that you received from the user. This entails using functions that escape all of the necessary trouble characters, such as the single quote, double quote, and backslash characters.
This is unnecessary when dealing with prepared statements. The separation of the data allows MySQL to automatically take into account these characters and they do not need to be escaped using any special function.
Does this mean I don't need htmlentities() or htmlspecialchars()?
But I assume I need to add strip_tags() to user input data?
Am I right?
htmlentities and htmlspecialchars are used to generate the HTML output that is sent to the browser.
Prepared statements are used to generate/send queries to the Database engine.
Both allow escaping of data; but they don't escape for the same usage.
So, no, prepared statements (for SQL queries) don't prevent you from properly using htmlspecialchars/htmlentities (for HTML generation)
About strip_tags: it will remove tags from a string, where htmlspecialchars will transform them to HTML entities.
Those two functions don't do the same thing; you should choose which one to use depending on your needs / what you want to get.
For instance, with this piece of code:
$str = 'this is a <strong>test</strong>';
var_dump(strip_tags($str));
var_dump(htmlspecialchars($str));
You'll get this kind of output:
string 'this is a test' (length=14)
string 'this is a <strong>test</strong>' (length=43)
In the first case, no tag; in the second, properly escaped ones.
And, with an HTML output:
$str = 'this is a <strong>test</strong>';
echo strip_tags($str);
echo '<br />';
echo htmlspecialchars($str);
You'll get:
this is a test
this is a <strong>test</strong>
Which one of those do you want? That is the important question ;-)
Nothing changes for htmlspecialchars(), because that's for HTML, not SQL. You still need to escape HTML properly, and it's best to do it when you actually generate the HTML, rather than tying it to the database somehow.
If you use prepared statements, then you don't need mysql_[real_]escape_string() anymore (assuming you stick to prepared statements' placeholders and resist temptation to bypass it with string manipulation).
If you want to get rid of htmlspecialchars(), then there are HTML templating engines that work similarily to prepared statements in SQL and free you from escaping everything manually, for example PHPTAL.
You don't need htmlentities() or htmlspecialchars() when inserting stuff in the database, nothing bad will happen, you will not be vulnerable to SQL injection if you're using prepared statements.
The good thing is you'll now store the pristine user input in your database.
You DO need to escape stuff on output and sending it back to a client, - when you pull stuff out of the database else you'll be vulnerable to cross site scripting attacks, and other bad things. You'll need to escape them for the output format you need, like html, so you'll still need htmlentities etc.
For that reason you could just escape things as you put them into the database, not when you output it - however you'll lose the original formatting of the user, and you'll escape the data for html use which might not pay off if you're using the data in different output formats.
prepare for SQL Injection
htmlspecialchar for XSS(redirect to another link)
<?php
$str = "this is <script> document.location.href='https://www.google.com';</script>";
echo $str;
output: this is ... and redirect to google.com
Using htmlspecialchars:
$str = "this is <script> document.location.href='https://www.google.com';</script>";
echo htmlspecialchars($str);
<i>output1</i>: this is <script> document.location.href='https://www.google.com';</script> (in output browser)<br />
<i>output2</i>: this is <script> document.location.href='https://www.google.com';</script> (in view source)<br />
If user input comment "the script" into database, then browser display
all comment from database, auto "the script" will executed and
redirect to google.com
So,
1. use htmlspecial for deactive bad script tag
2. use prepare for secure database
htmlspecialchars
htmlspecialchars_decode
php validation
I would still be inclined to encode HTML. If you're building some form of CMS or web application, it's easier to store it as encoded HTML, and then re-encode it as required.
For example, when bringing information into a TextArea modified by TinyMCE, they reccomend that the HTML should be encoded - since the HTML spec does not allow for HTML inside a text area.
I would also strip_tags() from anywhere you don't want HTML code.

Best way to defend against mysql injection and cross site scripting

At the moment, I apply a 'throw everything at the wall and see what sticks' method of stopping the aforementioned issues. Below is the function I have cobbled together:
function madSafety($string)
{
$string = mysql_real_escape_string($string);
$string = stripslashes($string);
$string = strip_tags($string);
return $string;
}
However, I am convinced that there is a better way to do this. I am using FILTER_ SANITIZE_STRING and this doesn't appear to to totally secure.
I guess I am asking, which methods do you guys employ and how successful are they? Thanks
Just doing a lot of stuff that you don't really understand, is not going to help you. You need to understand what injection attacks are and exactly how and where you should do what.
In bullet points:
Disable magic quotes. They are an inadequate solution, and they confuse matters.
Never embed strings directly in SQL. Use bound parameters, or escape (using mysql_real_escape_string).
Don't unescape (eg. stripslashes) when you retrieve data from the database.
When you embed strings in html (Eg. when you echo), you should default to escape the string (Using htmlentities with ENT_QUOTES).
If you need to embed html-strings in html, you must consider the source of the string. If it's untrusted, you should pipe it through a filter. strip_tags is in theory what you should use, but it's flawed; Use HtmlPurifier instead.
See also: What's the best method for sanitizing user input with PHP?
The best way against SQL injection is to bind variables, rather then "injecting" them into string.
http://www.php.net/manual/en/mysqli-stmt.bind-param.php
Don’t! Using mysql_real_escape_string is enough to protect you against SQL injection and the stropslashes you are doing after makes you vulnerable to SQL injection. If you really want it, put it before as in:
function madSafety($string)
{
$string = stripslashes($string);
$string = strip_tags($string);
$string = mysql_real_escape_string($string);
return $string;
}
stripslashes is not really useful if you are doing mysql_real_escape_string.
strip_tags protects against HTML/XML injection, not SQL.
The important thing to note is that you should escape your strings differently depending on the imediate use you have for it.
When you are doing MYSQL requests use mysql_real_escape_string. When you are outputing web pages use htmlentities. To build web links use urlencode…
As vartec noted, if you can use placeholders by all means do it.
This topic is so wrong!
You should NOT filter the input of the user! It is information that has been entered by him. What are you going to do if I want my password be like: '"'>s3cr3t<script>alert()</script>
Filter the characters and leave me with a changed password, so I cannot even succeed in my first login? This is bad.
The proper solution is to use prepared statements or mysql_real_escape_string() to avoid sql injections and use context-aware escaping of the characters to avoid your html code being messed up.
Let me remind you that the web is only one of the ways you can represent the information entered by the user. Would you accept such stripping if some desktop software do it? I hope your answer is NO and you would understand why this is not the right way.
Note that in different context different characters has to be escaped. For example, if you need to display the user first name as a tooltip, you will use something like:
<span title="{$user->firstName}">{$user->firstName}</span>
However, if the user has set his first name to be like '"><script>window.document.location.href="http://google.com"</script> what are you gonna do? Strip the quotes? This would be so wrong! Instead of doing this non-sense, consider escaping the quotes while rendering the data, not while persisting it!
Another context you should consider is while rendering the value itself. Consider the previously used html code and imagine the user first name be like <textarea>. This would wrap all html code that follows into this textarea element, thus breaking up the whole page.
Yet again - consider escaping the data depending on the context you are using it in!
P.S Not really sure how to react on those negative votes. Are you, people, actually reading my reply?

Categories