Text input sanitation and security - php

I am trying to make sure all my inputs are secure, protecting the server and XSS attacks. Is validating input with strip_tags and htmlentities a fool proof system? I have been told it was and would like to confirm. ie for example:
$re = htmlentities(strip_tags($_GET['re']), ENT_COMPAT, "UTF-8");
this should prevent any linux commands and any html links correct? are there any vulnerabilities that havent been considered with this?

This is not at all what htmlentities is for. Use htmlentites to encode your output before it is sent to the browser. It has nothing to do with sanitizing input. The only thing you need to worry about when processing input is properly escaping data being interpolated into SQL queries to prevent SQL injection. See PHP Data Objects for more on that.
strip_tags is debatably useful here, but you don't need to use both strip_tags and htmlentities. The whole purpose of htmlentites is that it prevents the tags from being interpreted. The only correct way to think about this is: Preserve the content the user entered and render it safe. Don't strip their tags, just encode them so they appear as they were typed. Otherwise you wind up stripping things like <sarcasm> and <rant> tags. The intent of the user was not to inject HTML.
"Linux commands" have nothing to do with HTML. There is no way to execute arbitrary Linux commands through HTML/script injection.
What i have in mind is something such as ";ls -la"
If you are actually taking user-supplied input and executing it via system or something in that vein, you are already in trouble. This is a terrible idea and you shouldn't do it.
</rant>

You must always choose the right tool for the job. That being said $re = htmlentities(strip_tags($_GET['re']), ENT_COMPAT, "UTF-8"); should never be used for anything. The command is redundant which means you don't understand what its doing. It not very good at preventing xss because xss is an output problem.
To sanitize shell arguments you must use escpaeshellarg(). For XSS you should use:
htmlspecialchars($_GET['re'], ENT_QUOTES, "UTF-8");. However this doesn't stop all XSS and it doesn't do anything to stop SQL Injection.
Use parametrized queries for sql.
And all of that just scratches the surface read the OWASP top 10.

this is how I filter my inputs before I'll insert it into my database
<?php
function sanitize($data){
$result = trim($data);
$result = htmlspecialchars($data);
$result = mysql_real_escape_string($data);
return $result;
}
?>

Related

Is htmlspecialchars() needed when using mysqli prepared statement? [duplicate]

In an article http://dev.mysql.com/tech-resources/articles/4.1/prepared-statements.html, it says the followings:
There are numerous advantages to using prepared statements in your applications, both for security and performance reasons.
Prepared statements can help increase security by separating SQL logic from the data being supplied. This separation of logic and data can help prevent a very common type of vulnerability called an SQL injection attack.
Normally when you are dealing with an ad hoc query, you need to be very careful when handling the data that you received from the user. This entails using functions that escape all of the necessary trouble characters, such as the single quote, double quote, and backslash characters.
This is unnecessary when dealing with prepared statements. The separation of the data allows MySQL to automatically take into account these characters and they do not need to be escaped using any special function.
Does this mean I don't need htmlentities() or htmlspecialchars()?
But I assume I need to add strip_tags() to user input data?
Am I right?
htmlentities and htmlspecialchars are used to generate the HTML output that is sent to the browser.
Prepared statements are used to generate/send queries to the Database engine.
Both allow escaping of data; but they don't escape for the same usage.
So, no, prepared statements (for SQL queries) don't prevent you from properly using htmlspecialchars/htmlentities (for HTML generation)
About strip_tags: it will remove tags from a string, where htmlspecialchars will transform them to HTML entities.
Those two functions don't do the same thing; you should choose which one to use depending on your needs / what you want to get.
For instance, with this piece of code:
$str = 'this is a <strong>test</strong>';
var_dump(strip_tags($str));
var_dump(htmlspecialchars($str));
You'll get this kind of output:
string 'this is a test' (length=14)
string 'this is a <strong>test</strong>' (length=43)
In the first case, no tag; in the second, properly escaped ones.
And, with an HTML output:
$str = 'this is a <strong>test</strong>';
echo strip_tags($str);
echo '<br />';
echo htmlspecialchars($str);
You'll get:
this is a test
this is a <strong>test</strong>
Which one of those do you want? That is the important question ;-)
Nothing changes for htmlspecialchars(), because that's for HTML, not SQL. You still need to escape HTML properly, and it's best to do it when you actually generate the HTML, rather than tying it to the database somehow.
If you use prepared statements, then you don't need mysql_[real_]escape_string() anymore (assuming you stick to prepared statements' placeholders and resist temptation to bypass it with string manipulation).
If you want to get rid of htmlspecialchars(), then there are HTML templating engines that work similarily to prepared statements in SQL and free you from escaping everything manually, for example PHPTAL.
You don't need htmlentities() or htmlspecialchars() when inserting stuff in the database, nothing bad will happen, you will not be vulnerable to SQL injection if you're using prepared statements.
The good thing is you'll now store the pristine user input in your database.
You DO need to escape stuff on output and sending it back to a client, - when you pull stuff out of the database else you'll be vulnerable to cross site scripting attacks, and other bad things. You'll need to escape them for the output format you need, like html, so you'll still need htmlentities etc.
For that reason you could just escape things as you put them into the database, not when you output it - however you'll lose the original formatting of the user, and you'll escape the data for html use which might not pay off if you're using the data in different output formats.
prepare for SQL Injection
htmlspecialchar for XSS(redirect to another link)
<?php
$str = "this is <script> document.location.href='https://www.google.com';</script>";
echo $str;
output: this is ... and redirect to google.com
Using htmlspecialchars:
$str = "this is <script> document.location.href='https://www.google.com';</script>";
echo htmlspecialchars($str);
<i>output1</i>: this is <script> document.location.href='https://www.google.com';</script> (in output browser)<br />
<i>output2</i>: this is <script> document.location.href='https://www.google.com';</script> (in view source)<br />
If user input comment "the script" into database, then browser display
all comment from database, auto "the script" will executed and
redirect to google.com
So,
1. use htmlspecial for deactive bad script tag
2. use prepare for secure database
htmlspecialchars
htmlspecialchars_decode
php validation
I would still be inclined to encode HTML. If you're building some form of CMS or web application, it's easier to store it as encoded HTML, and then re-encode it as required.
For example, when bringing information into a TextArea modified by TinyMCE, they reccomend that the HTML should be encoded - since the HTML spec does not allow for HTML inside a text area.
I would also strip_tags() from anywhere you don't want HTML code.

preventing php injection in forms (if a user submits php code into a form)

In a contact form, if someone enters the following into the textbox:
<?php echo 'hi'; ?>
I see that the server will not execute it because of an error. What I would like it to do is instead, somehow escape it into plain text and display it correctly. I have seen other sites been able to do this. I originally thought this could be solved by the addslashes() function, but that doesn't seem to work.
Thanks,
Phil
No. Use htmlspecialchars instead. Don't use addslashes.
To be more specific, addslashes bluntly escapes all instances of ', " and \ and NUL. It was meant to prevent SQL injection, but it has no real use in proper security measures.
What you want is preventing the browser to interpret tags as is (and that's entirely different from preventing SQL injections). For instance, if I want to talk about <script> elements, SO shouldn't simply send that string literally, causing to start an actual script (that can lead to Cross-site scripting), but some characters, especially < and >, need to be encoded as HTML entities so they're shown as angle brackets (the same is true for &, that otherwise would be interpreted as the start of an HTML entity).
In your case, output after htmlspecialchars would look like:
<?php echo 'hi'; ?>
Use htmlspecialchars before outputing anything provided by the user. But in this case, also make sure that you do not execute anything the user inputs. Do not use eval, include or require. If you save the user data to a file, use readfile or file_get_contents+htmlspecialchars instead of include/require. If you're using eval, change it into echo and so on.

preg_replace on xss code

Can this code help to sanitize malicious code in user submit form?
function rex($string) {
$patterns = array();
$patterns[0] = '/=/i';
$patterns[1] = '/javascript:/i';
$replacements = array();
$replacements[0] = '';
$replacements[1] = '';
return preg_replace($patterns, $replacements, $string);
I have included htmlentities() to prevent XSS on client side, is all the code shown is safe enough to prevent attack?
You don't need that if you are using htmlentities. To prevent XSS you can even just use htmlspecialchars.
Just make sure that you use htmlspecialchars on all data that is printed as plain text in your HTML response.
See also: the answers to "Does this set of regular expressions FULLY protect against cross site scripting?"
your substitutions may help. But you're better off using a pre-rolled solution like PHP's data filters. Then you can easily limit datatype to what you expect.
htmlentities alone will do the trick. No need to replace anything at all.
No.
http://ha.ckers.org/xss.html
Your first replacement rule is useless as it can be easily circumvented by using eval and character encoding (and an equal sign isn't necessary for XSS attacks anyway).
Your second rule can be very likely circumvented on at least some browsers by using things like javascript : or java\script:.
In short, it doesn't help much. If you want to show plain text, htmlentities is probably fine (there are exotic attacks which take advantage of unusual character encodings and browser stupidity to launch XSS attacks without any special characters, but that only works on specific browsers - cough IE cough - in specific situations). If you want to put user input in URLs or other attributes, it is not necessarily enough.

Do I need htmlentities() or htmlspecialchars() in prepared statements?

In an article http://dev.mysql.com/tech-resources/articles/4.1/prepared-statements.html, it says the followings:
There are numerous advantages to using prepared statements in your applications, both for security and performance reasons.
Prepared statements can help increase security by separating SQL logic from the data being supplied. This separation of logic and data can help prevent a very common type of vulnerability called an SQL injection attack.
Normally when you are dealing with an ad hoc query, you need to be very careful when handling the data that you received from the user. This entails using functions that escape all of the necessary trouble characters, such as the single quote, double quote, and backslash characters.
This is unnecessary when dealing with prepared statements. The separation of the data allows MySQL to automatically take into account these characters and they do not need to be escaped using any special function.
Does this mean I don't need htmlentities() or htmlspecialchars()?
But I assume I need to add strip_tags() to user input data?
Am I right?
htmlentities and htmlspecialchars are used to generate the HTML output that is sent to the browser.
Prepared statements are used to generate/send queries to the Database engine.
Both allow escaping of data; but they don't escape for the same usage.
So, no, prepared statements (for SQL queries) don't prevent you from properly using htmlspecialchars/htmlentities (for HTML generation)
About strip_tags: it will remove tags from a string, where htmlspecialchars will transform them to HTML entities.
Those two functions don't do the same thing; you should choose which one to use depending on your needs / what you want to get.
For instance, with this piece of code:
$str = 'this is a <strong>test</strong>';
var_dump(strip_tags($str));
var_dump(htmlspecialchars($str));
You'll get this kind of output:
string 'this is a test' (length=14)
string 'this is a <strong>test</strong>' (length=43)
In the first case, no tag; in the second, properly escaped ones.
And, with an HTML output:
$str = 'this is a <strong>test</strong>';
echo strip_tags($str);
echo '<br />';
echo htmlspecialchars($str);
You'll get:
this is a test
this is a <strong>test</strong>
Which one of those do you want? That is the important question ;-)
Nothing changes for htmlspecialchars(), because that's for HTML, not SQL. You still need to escape HTML properly, and it's best to do it when you actually generate the HTML, rather than tying it to the database somehow.
If you use prepared statements, then you don't need mysql_[real_]escape_string() anymore (assuming you stick to prepared statements' placeholders and resist temptation to bypass it with string manipulation).
If you want to get rid of htmlspecialchars(), then there are HTML templating engines that work similarily to prepared statements in SQL and free you from escaping everything manually, for example PHPTAL.
You don't need htmlentities() or htmlspecialchars() when inserting stuff in the database, nothing bad will happen, you will not be vulnerable to SQL injection if you're using prepared statements.
The good thing is you'll now store the pristine user input in your database.
You DO need to escape stuff on output and sending it back to a client, - when you pull stuff out of the database else you'll be vulnerable to cross site scripting attacks, and other bad things. You'll need to escape them for the output format you need, like html, so you'll still need htmlentities etc.
For that reason you could just escape things as you put them into the database, not when you output it - however you'll lose the original formatting of the user, and you'll escape the data for html use which might not pay off if you're using the data in different output formats.
prepare for SQL Injection
htmlspecialchar for XSS(redirect to another link)
<?php
$str = "this is <script> document.location.href='https://www.google.com';</script>";
echo $str;
output: this is ... and redirect to google.com
Using htmlspecialchars:
$str = "this is <script> document.location.href='https://www.google.com';</script>";
echo htmlspecialchars($str);
<i>output1</i>: this is <script> document.location.href='https://www.google.com';</script> (in output browser)<br />
<i>output2</i>: this is <script> document.location.href='https://www.google.com';</script> (in view source)<br />
If user input comment "the script" into database, then browser display
all comment from database, auto "the script" will executed and
redirect to google.com
So,
1. use htmlspecial for deactive bad script tag
2. use prepare for secure database
htmlspecialchars
htmlspecialchars_decode
php validation
I would still be inclined to encode HTML. If you're building some form of CMS or web application, it's easier to store it as encoded HTML, and then re-encode it as required.
For example, when bringing information into a TextArea modified by TinyMCE, they reccomend that the HTML should be encoded - since the HTML spec does not allow for HTML inside a text area.
I would also strip_tags() from anywhere you don't want HTML code.

Best way to defend against mysql injection and cross site scripting

At the moment, I apply a 'throw everything at the wall and see what sticks' method of stopping the aforementioned issues. Below is the function I have cobbled together:
function madSafety($string)
{
$string = mysql_real_escape_string($string);
$string = stripslashes($string);
$string = strip_tags($string);
return $string;
}
However, I am convinced that there is a better way to do this. I am using FILTER_ SANITIZE_STRING and this doesn't appear to to totally secure.
I guess I am asking, which methods do you guys employ and how successful are they? Thanks
Just doing a lot of stuff that you don't really understand, is not going to help you. You need to understand what injection attacks are and exactly how and where you should do what.
In bullet points:
Disable magic quotes. They are an inadequate solution, and they confuse matters.
Never embed strings directly in SQL. Use bound parameters, or escape (using mysql_real_escape_string).
Don't unescape (eg. stripslashes) when you retrieve data from the database.
When you embed strings in html (Eg. when you echo), you should default to escape the string (Using htmlentities with ENT_QUOTES).
If you need to embed html-strings in html, you must consider the source of the string. If it's untrusted, you should pipe it through a filter. strip_tags is in theory what you should use, but it's flawed; Use HtmlPurifier instead.
See also: What's the best method for sanitizing user input with PHP?
The best way against SQL injection is to bind variables, rather then "injecting" them into string.
http://www.php.net/manual/en/mysqli-stmt.bind-param.php
Don’t! Using mysql_real_escape_string is enough to protect you against SQL injection and the stropslashes you are doing after makes you vulnerable to SQL injection. If you really want it, put it before as in:
function madSafety($string)
{
$string = stripslashes($string);
$string = strip_tags($string);
$string = mysql_real_escape_string($string);
return $string;
}
stripslashes is not really useful if you are doing mysql_real_escape_string.
strip_tags protects against HTML/XML injection, not SQL.
The important thing to note is that you should escape your strings differently depending on the imediate use you have for it.
When you are doing MYSQL requests use mysql_real_escape_string. When you are outputing web pages use htmlentities. To build web links use urlencode…
As vartec noted, if you can use placeholders by all means do it.
This topic is so wrong!
You should NOT filter the input of the user! It is information that has been entered by him. What are you going to do if I want my password be like: '"'>s3cr3t<script>alert()</script>
Filter the characters and leave me with a changed password, so I cannot even succeed in my first login? This is bad.
The proper solution is to use prepared statements or mysql_real_escape_string() to avoid sql injections and use context-aware escaping of the characters to avoid your html code being messed up.
Let me remind you that the web is only one of the ways you can represent the information entered by the user. Would you accept such stripping if some desktop software do it? I hope your answer is NO and you would understand why this is not the right way.
Note that in different context different characters has to be escaped. For example, if you need to display the user first name as a tooltip, you will use something like:
<span title="{$user->firstName}">{$user->firstName}</span>
However, if the user has set his first name to be like '"><script>window.document.location.href="http://google.com"</script> what are you gonna do? Strip the quotes? This would be so wrong! Instead of doing this non-sense, consider escaping the quotes while rendering the data, not while persisting it!
Another context you should consider is while rendering the value itself. Consider the previously used html code and imagine the user first name be like <textarea>. This would wrap all html code that follows into this textarea element, thus breaking up the whole page.
Yet again - consider escaping the data depending on the context you are using it in!
P.S Not really sure how to react on those negative votes. Are you, people, actually reading my reply?

Categories