Input data processing using PHP for security

I am writing code that processes the user's text input in a registration form. I have implemented the following function, which is meant to make sure that the input data is safe:
function input_check($Indata, $dbc) { // input_check($Indata, $dbc)
$Indata = trim($Indata); // remove white spaces
$Indata = stripslashes($Indata); // remove back slashes
$Indata = strip_tags($Indata); // remove html tags
$Indata = htmlspecialchars($Indata); // convert html entities
$Indata = mysql_real_escape_string($Indata,$dbc);
return $Indata;
}
Is there any other processing that I have to do in order to ensure that the input is safe?
I meant safe from malicious input data

Using every possible escaping mechanism at once may be safe, but it makes your application too complex. Imagine what you would have to do later to take the data (which seems to be stored in a MySQL database, right?) and print it in an HTML form.
A wiser approach is to use only the escaping mechanism that fits how the data is used:
to store data in a MySQL database, use a database escaping mechanism (by the way, mysql_real_escape_string() is deprecated and was removed in PHP 7; use PDO::quote() or, even better, parameter binding, which does the escaping for you)
to print stored data in HTML text, use htmlspecialchars(), possibly in conjunction with strip_tags()
to print stored data in HTML attributes, use htmlspecialchars() with ENT_QUOTES, together with urlencode() when the value is part of a URL
... and so on. Then you will most likely be safe from SQL injection, XSS attacks and so on.
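The per-context idea could be sketched roughly like this. It is only an illustration: the sample value, the variable names and the /search.php path are made up, and the database case is left out because parameter binding takes care of it.

```php
<?php
// Hypothetical user-supplied value, stored unmodified.
$name = 'Tom & "Jerry" <b>';

// HTML text context: escape the HTML metacharacters.
$asText = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');

// HTML attribute context: same escaping, and the attribute is always quoted.
$asAttribute = '<input type="text" value="'
    . htmlspecialchars($name, ENT_QUOTES, 'UTF-8') . '">';

// URL parameter inside an HTML attribute: URL-encode the value first,
// then HTML-escape the complete URL.
$url = '/search.php?q=' . rawurlencode($name);
$asLink = '<a href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '">search</a>';

echo $asText, "\n", $asAttribute, "\n", $asLink, "\n";
```

The stored value stays untouched, so it can still be used in other contexts (e-mail, JSON, plain text) without undoing anything.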

Related

XSS output filtering in PHP by applying htmlspecialchars() on each string in the user data object beforehand

*note - this post is only about XSS attacks and not about SQL injections as we already use prepared statements
Hi all,
I plan to filter my output in regards to XSS attacks. So far, I have read that the "recommended" approach for websites in UTF-8 format is to use htmlspecialchars() to encode every output of user input data, e.g., for every relevant echo() or print() statement. (At least for websites that do not require handling user input data containing HTML)
As noted in How to prevent XSS with HTML/PHP? and How can I sanitize user input with PHP?
However, there are too many cases where user input data is being printed out on the site I'm working on, and it spreads over numerous files/web pages. It would be a mammoth project to specifically address every single related echo() and print() statement. Thus, I thought about iterating over the whole user input data object retrieved from the backend before printing out its fields with echo() or print(). For example, with this helper function:
// helper function
function xss_recursive_object_iterator(&$object)
{
if ($object === null) {
return;
}
if (is_object($object) || is_array($object)) {
foreach ($object as $key => &$field) {
if (is_string($field)) {
$cleaned_field = htmlspecialchars($field, ENT_QUOTES, 'UTF-8');
// maybe additional operations for output encoding (but which)
// ...
$field = $cleaned_field;
} else if (is_array($field) || is_object($field)) {
xss_recursive_object_iterator($field);
}
}
unset($field);
}
}
...
// clean object with user input data retrieved from backend with the above function
xss_recursive_object_iterator($user_data_object);
...
// output of user input strings from the XSS filtered object
echo($user_data_object->field_string1);
echo($user_data_object->field_string2);
...
Instead of applying it on every single echo()/print() field
echo(htmlspecialchars($user_data_object->field_string1, ENT_QUOTES, 'UTF-8'));
echo(htmlspecialchars($user_data_object->field_string2, ENT_QUOTES, 'UTF-8'));
...
Question 1
What are the drawbacks of iterating over the whole object and applying the encoding operations to every field beforehand as shown above? Would this leave any xss output filtering issues open?
Question 2
Additionally, for user data printed inside <script> tags, I would use json_encode($field_string, JSON_HEX_QUOT|JSON_HEX_TAG|JSON_HEX_AMP|JSON_HEX_APOS);
And for dynamic URLs with user input I would use htmlspecialchars(urlencode($field_string));
As suggested in Json: PHP to JavaScript safe or not? and Does urlencode() protect against XSS
Lastly, the website does not integrate user input into CSS.
Is there already a crucial aspect I am missing, or am I good so far at filtering XSS attacks apart from an additional allowlist in the Content Security Policy settings? Of course, I will also test it against the cheatsheet: https://cheatsheetseries.owasp.org/cheatsheets/XSS_Filter_Evasion_Cheat_Sheet.html, but maybe there is something obvious.
For example, are there more operations missing that could provide me additional safety against XSS attacks in terms of output encoding, for example, strip_tags() or specific regex operations?
Question 3
What about already validating the data before saving?
For example leveraging filter_input_array, will it give any additional security, or is it unnecessary as I filter the output for XSS anyway?
If you do not intend to print HTML code from user input at all, you should sanitize the input before persisting it to the database, so "Question 3" is the way to go.
Note that filter_input_array() only sanitizes when you pass it sanitizing filters; the default filter (FILTER_UNSAFE_RAW) returns the input unchanged.
If you still have dangerous user input stored in your database, either use a template engine that sanitizes output automatically or write your own function like
function _e($str) {
echo htmlspecialchars($str, ENT_QUOTES, 'UTF-8');
}
Iterating over all object properties would produce unnecessary load, because even fields that are never printed are encoded.
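Another drawback of encoding the whole object up front is double encoding. A small sketch, assuming a hypothetical helper e() that returns the escaped string instead of echoing it:

```php
<?php
// Hypothetical helper: returns the escaped string so it can be used
// inside larger expressions and templates.
function e($value): string
{
    return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
}

$comment = 'AT&T <3'; // raw value as stored

// Escaping once, at output time, gives the expected markup.
$once = e($comment);

// If the object was already encoded up front, a later e() call
// (or a template engine with auto-escaping) encodes it a second time.
$twice = e($once);

echo $once, "\n";   // AT&amp;T &lt;3
echo $twice, "\n";  // AT&amp;amp;T &amp;lt;3
```

The second line would show up in the browser as literal "AT&amp;T &lt;3", which is why escaping right at the echo is usually easier to reason about.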

Sanitize POST htmlentities or plus stripslashes and strip_tags

As always, I start by saying that I am learning.
I have seen in several books, and even here, that when it comes to sanitizing form input (Form > Input > Submit), a lot of users write:
function sanitizeexample($param)
{
$param = stripslashes($param);
$param = strip_tags($param);
$param = htmlentities($param);
return $param;
}
$name = sanitizeexample($_POST['name']);
Instead of JUST:
function sanitizeexample($param)
{
$param = htmlentities($param);
return $param;
}
$name = sanitizeexample($_POST['name']);
So here is the question: do stripslashes() and strip_tags() add anything in terms of security, or is htmlentities() enough?
I'm asking JUST to know which is best to use.
Whether strip_tags() provides a value-add is dependent on your particular use case. If you htmlentities() a string that contains html tags, you're going to get the raw html content escaped and rendered on the page. The example you give is probably making the assumption that this is not what you want, and so by doing strip_tags() first, html tags are removed.
stripslashes() is the inverse of addslashes(). In modern PHP code (PHP >= 5.4), it is not necessary. On legacy systems with magic_quotes_gpc enabled, user input from request variables is automagically escaped with addslashes() so as to make it "safe" for direct use in database queries. This has widely been considered a Bad Idea (because it's not actually safe, for many reasons), and magic_quotes was removed in PHP 5.4. Accordingly, you would not normally need to stripslashes() user input now. (Whether you actually need to depends on the PHP version and ini settings.)
(Note that you would still need to properly escape any content going into your database, but that is better done with parameterized queries or database-specific escaping functions, both of which are outside the scope of this question.)
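To make the difference concrete, here is a small sketch with a made-up input string. It only shows what each function does to the text; which result you want depends on your use case.

```php
<?php
$input = 'I <b>love</b> PHP & MySQL';

// htmlentities() keeps the markup but makes it inert: the browser
// displays the tags as literal text.
$escaped = htmlentities($input, ENT_QUOTES, 'UTF-8');

// strip_tags() removes the tags, so that part of the input is lost ...
$stripped = strip_tags($input);

// ... and the result still has to be escaped before it is printed,
// because characters such as & are left as they are.
$both = htmlentities($stripped, ENT_QUOTES, 'UTF-8');

echo $escaped, "\n";  // I &lt;b&gt;love&lt;/b&gt; PHP &amp; MySQL
echo $stripped, "\n"; // I love PHP & MySQL
echo $both, "\n";     // I love PHP &amp; MySQL
```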
It depends on your goals:
if you're getting user data passed from an HTML form, you should definitely apply the strip_tags(trim($_POST['name'])) approach to sanitize possibly insecure and excessive data.
if you are receiving uploaded file content and need to preserve its formatting, you have to consider how to safely process and store such files with some specific (selective) sanitizing.

Function escaping before inserting in mysql

I've been working on code that escapes your POST values, if they are strings, before you enter them in the DB. Is it a good idea? Here is the code: (Updated to numeric)
static function securePosts(){
$posts = array();
foreach($_POST as $key => $val){
if(!is_numeric($val)){
if(is_string($val)){
if(get_magic_quotes_gpc())
$val = stripslashes($val);
$posts[$key] = mysql_real_escape_string($val);
}
}else
$posts[$key] = $val;
}
return $posts;
}
Then in another file:
if(isset($_POST)){
$post = ChangeHandler::securePosts();
if(isset($post['user'])){
AddUserToDbOrWhatEver($post['user']);
}
}
Is this good, or will it have bad effects to escape the values before they even enter the function (AddUserToDbOrWhatEver)?
When working with user-input, one should distinguish between validation and escaping.
Validation
There you test the content of the user-input. If you expect a number, you check if this is really a numerical input. Validation can be done as early as possible. If the validation fails, you can reject it immediately and return with an error message.
Escaping
Here you bring the user input into a form that cannot damage a given target system. Escaping should be done as late as possible and only for the given system. If you want to store the user input in a database, you would use a function like mysqli_real_escape_string() or a parameterized PDO query. Later, if you want to output it on an HTML page, you would use htmlspecialchars().
It's not a good idea to escape user input preventively, or to escape it for several target systems at once. Each escaping can corrupt the original value for other target systems, so you can lose information this way.
P.S.
As YourCommonSense correctly pointed out, it is not always enough to use escape functions to be safe, but that does not mean that you should not use them. Often the character encoding is a pitfall for security efforts, and it is a good habit to declare the character encoding explicitly. In the case of mysqli this can be done with $db->set_charset('utf8'); and for HTML pages it helps to declare the charset with a meta tag.
It is ALWAYS a good idea to escape user input BEFORE inserting anything in database. However, you should also try to convert values, that you expect to be a number to integers (signed or unsigned). Or better - you should use prepared SQL statements. There is a lot of info of the latter here and on PHP docs.
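The split between early validation and late escaping might look something like this. It is a sketch with made-up form values and error messages; in a real script the values would come from $_POST.

```php
<?php
// Hypothetical raw form values.
$form = [
    'age'   => '42',
    'email' => 'not-an-email',
    'name'  => "O'Reilly <admin>",
];

$errors = [];

// Validation: as early as possible, reject what does not match expectations.
$age = filter_var($form['age'], FILTER_VALIDATE_INT, [
    'options' => ['min_range' => 0, 'max_range' => 150],
]);
if ($age === false) {
    $errors[] = 'Age must be a whole number between 0 and 150.';
}

$email = filter_var($form['email'], FILTER_VALIDATE_EMAIL);
if ($email === false) {
    $errors[] = 'Please enter a valid e-mail address.';
}

// The name is kept exactly as entered. It is escaped only when it is
// written into a specific target, here an HTML page.
$nameForHtml = htmlspecialchars($form['name'], ENT_QUOTES, 'UTF-8');

print_r($errors);
echo $nameForHtml, "\n";
```

For the database, the unescaped $form['name'] would be passed to a parameterized query, not the HTML-escaped copy.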

How to sanitize user inputs in PHP?

Is this enough?
$listing = mysql_real_escape_string(htmlspecialchars($_POST['listing']));
Depends - if you are expecting text, it's just fine, although you shouldn't put the htmlspecialchars in input. Do it in output.
You might want to read this: What's the best method for sanitizing user input with PHP?
You can use the PHP function filter_var().
There is a good tutorial at this link:
http://www.phpro.org/tutorials/Filtering-Data-with-PHP.html
Example of sanitizing an integer:
Sanitizing an integer is simple with the FILTER_SANITIZE_NUMBER_INT filter. This filter strips out all characters except digits and the + and - signs.
It is simple to use, and we no longer need to boggle our minds with regular expressions.
<?php
/*** an integer ***/
$int = "abc40def+;2";
/*** sanitize the integer ***/
echo filter_var($int, FILTER_SANITIZE_NUMBER_INT);
?>
The above code produces an output of 40+2, as the non-INT characters, as specified by the filter, have been removed.
See:
Best way to stop SQL Injection in PHP
What are the best practices for avoid xss attacks in a PHP site
And sanitise data immediately before it is used in the context it needs to be made safe for. (e.g. don't run htmlspecialchars until you are about to output HTML, you might need the unedited data before then (such as if you ever decide to send content from the database by email)).
Yes. However, you shouldn't use htmlspecialchars on input. Only on output, when you print it.
This is because, it's not certain that the output will always be through html. It could be through a terminal, so it could confuse users if weird codes suddenly show up.
It depends on what you want to achieve. Your version prevents (probably) all SQL injections and strips out HTML (more exactly: Prevents it from being interpreted when sent to the browser). You could (and probably should) apply the htmlspecialchars() on output, not input. Maybe some time in the future you want to allow simple things like <b>.
But there's more to sanitizing, e.g. if you expect an Email Address you could verify that it's indeed an email address.
As has been said, don't use htmlspecialchars on input, only on output. Another thing to take into consideration is ensuring the input is as expected. For instance, if you're expecting a number, use is_numeric(), or if you're expecting a string of at most or at least a certain length, check for this. This way you can then alert users to any errors they have made in their input.
What if your listing variable is an array?
You should sanitize this variable recursively.
Edit:
Actually, with this technique you can avoid SQL injections but you can't avoid XSS.
In order to sanitize an "unreliable" string, I usually combine strip_tags and html_entity_decode.
This way, I avoid code injection even if the characters are encoded as HTML entities (for example &#60; for <).
$cleaned_string = strip_tags( html_entity_decode( $var, ENT_QUOTES, 'UTF-8' ) );
Then, you have to build a recursive function which calls the previous functions and walks through multi-dimensional arrays.
In the end, when you want to use a variable into an SQL statement, you can use the DBMS-specific (or PDO's) escaping function.
$var_used_with_mysql = mysql_real_escape_string( $cleaned_string );
In addition to sanitizing the data, you should also validate it, for example by checking for a number after you ask for an age, or by making sure that an email address is valid. Besides the security benefit, you can also notify your users about problems with their input.
I would assume it is almost impossible to make an SQL injection if the input is definitely a number or definitely an email address so there is an added level of safety.

Is preventing XSS and SQL injection as easy as this?

Question: Is preventing XSS (cross-site scripting) as simple as using strip_tags on any saved input fields and running htmlspecialchars on any displayed output ... and preventing SQL injection by using PHP PDO prepared statements?
Here's an example:
// INPUT: Input a persons favorite color and save to database
// this should prevent SQL injection ( by using prepared statement)
// and help prevent XSS (by using strip_tags)
$sql = 'INSERT INTO favorite (person_name, color) VALUES (?,?)';
$sth = $conn->prepare($sql);
$sth->execute(array(strip_tags($_POST['person_name']), strip_tags($_POST['color'])));
// OUTPUT: Output a persons favorite color from the database
// this should prevent XSS (by using htmlspecialchars) when displaying
$sql = 'SELECT color FROM favorite WHERE person_name = ?';
$sth = $conn->prepare($sql);
$sth->execute(array(strip_tags($_POST['person_name'])));
$sth->setFetchMode(PDO::FETCH_BOTH);
while($row = $sth->fetch()){
echo htmlspecialchars($row['color'], ENT_QUOTES, 'UTF-8');
}
It's even simpler. Just using htmlspecialchars() (with quote style and character set) on user-controlled input when you display it is enough. strip_tags() is only useful if you want to sanitize data before processing it or saving it in the database, which is not often done in the real world. HTML code does no harm in PHP source, but PHP code may do so if you use eval() on non-sanitized user-controlled input or that kind of evil stuff.
This however doesn't save you from SQL injections, but that's another story.
Update: to get clean user input from the request and undo magic quotes, you can use the following function (get_magic_quotes_gpc() only exists in PHP versions before 8.0):
function get_string($array, $index, $default = null) {
if (isset($array[$index]) && strlen($value = trim($array[$index])) > 0) {
return get_magic_quotes_gpc() ? stripslashes($value) : $value;
} else {
return $default;
}
}
which can be used as:
$username = get_string($_POST, "username");
$password = get_string($_POST, "password");
(you can do similar for get_number, get_boolean, get_array, etc)
To prepare the SQL query to avoid SQL injections, do:
$sql = sprintf(
"SELECT id FROM user WHERE username = '%s' AND password = MD5('%s')",
mysql_real_escape_string($username),
mysql_real_escape_string($password)
);
To display user-controlled input to avoid XSS, do:
echo htmlspecialchars($data, ENT_QUOTES, 'UTF-8');
It depends on where and how you want to use the user data. You need to know the context you want to insert your data in and the meta characters of that context.
If you just want to allow the user to put text up on your website, htmlspecialchars suffices to escape the HTML meta characters. But if you want to allow certain HTML or want to embed user data in existing HTML elements (like a URL into an A/IMG element), htmlspecialchars is not enough as you’re not in the HTML context anymore but in the URL context.
So entering <script>alert("xss")</script> into an image URL field will yield:
<img src="&lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;" />
But entering javascript:alert("xss") will succeed:
<img src="javascript:alert(&quot;xss&quot;)" />
Here you should take a look at the fabulous XSS (Cross Site Scripting) Cheat Sheet to see what contexts your user data can be injected in.
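For the URL context, one possible approach is to allow only known schemes before escaping. This is a sketch, not a complete URL validator; safe_url_attribute() is a made-up helper name and the http/https allowlist is an assumption about what the site needs.

```php
<?php
// Hypothetical helper: returns an HTML-escaped URL that may be placed in
// a quoted src/href attribute, or null if the scheme is not allowed.
function safe_url_attribute(string $url): ?string
{
    $url = trim($url);
    $scheme = parse_url($url, PHP_URL_SCHEME);

    // Only absolute http(s) URLs pass; javascript:, data: and
    // scheme-less input are rejected.
    if (!is_string($scheme) || !in_array(strtolower($scheme), ['http', 'https'], true)) {
        return null;
    }

    return htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
}

var_dump(safe_url_attribute('javascript:alert("xss")'));            // NULL
var_dump(safe_url_attribute('https://example.com/a.png?x=1&y=2'));  // escaped URL
```

Relative URLs are rejected by this sketch as well; if you need them, that case has to be handled separately.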
strip_tags is not necessary. In most cases strip_tags is just irritating, because some of your users may want to use < and > in their texts. Just use htmlspecialchars (or htmlentities if you prefer) before you echo the texts to the browser.
(Don't forget mysql_real_escape_string before you insert anything into your database!)
The general rule/meme is "Filter Input, Escape Output." Using strip_tags on your input to remove any HTML is a good idea for input filtering, but you should be as strict as possible in what input you allow. For example, if an input parameter is only supposed to be an integer, only accept numeric input and always convert it to an integer before doing anything with it. A well-vetted input filtering library is going to help you a lot here; one that isn't specific to a particular framework is Inspekt (which I wrote, so I'm a bit biased).
For output, htmlspecialchars should be able to escape XSS attacks, but only if you pass the correct parameters. You must pass the quote escaping style and a charset.
In general, this should remove XSS attacks:
$safer_str = htmlspecialchars($unsafe_str, ENT_QUOTES, 'UTF-8');
Without passing ENT_QUOTES as the second parameter, single-quote chars are not encoded (ENT_QUOTES only became part of the default flags in PHP 8.1). Additionally, XSS attacks have been demonstrated when the correct charset is not passed (typically UTF-8 will be adequate). htmlspecialchars should always be called with ENT_QUOTES and a charset parameter.
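The effect of the quote flag can be seen with a value placed in a single-quoted attribute. A small sketch with a made-up attack string:

```php
<?php
$value = "x' onmouseover='alert(1)";

// ENT_COMPAT encodes double quotes only; single quotes pass through.
$compat = htmlspecialchars($value, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES encodes both quote styles.
$quotes = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');

echo "<input value='" . $compat . "'>\n"; // the attribute can be broken out of
echo "<input value='" . $quotes . "'>\n"; // the value stays inside the attribute
```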
Note that PHP 5.2.12 contains a fix for a multibyte XSS attack.
You may find the OWASP ESAPI PHP port interesting and useful, although the PHP version is not complete AFAIK.
Yes, using PDO prepared statements protects from SQL injection. The SQL injection attack is based on the fact that the data submitted by the attacker is treated as a part of the query. For example, the attacker submits the string "a' or 'a'='a" as his password. Instead of the whole string being compared to the passwords in the database, it is included in the query, so the query becomes "SELECT * FROM users WHERE login='joe' AND password='a' or 'a'='a'". The part of attacker input is interpreted as a part of the query. However in case of prepared statements, you are telling the SQL engine specifically, what part is the query, and what part is data (by setting the parameters), so no such confusion is possible.
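The difference can be tried out with a throwaway database. The sketch below uses an in-memory SQLite database only to stay self-contained (it assumes the pdo_sqlite extension is enabled); the table and the plain-text password just mirror the example above and are not a recommendation.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (login TEXT, password TEXT)');
$pdo->exec("INSERT INTO users (login, password) VALUES ('joe', 'secret')");

$login = 'joe';
$password = "a' or 'a'='a"; // the attack string from the text

// Vulnerable: the input becomes part of the SQL text.
$unsafeSql = "SELECT COUNT(*) FROM users WHERE login='$login' AND password='$password'";
$unsafeMatches = (int) $pdo->query($unsafeSql)->fetchColumn();

// Prepared statement: the input is bound as data and only compared
// against the stored password.
$stmt = $pdo->prepare('SELECT COUNT(*) FROM users WHERE login = ? AND password = ?');
$stmt->execute([$login, $password]);
$safeMatches = (int) $stmt->fetchColumn();

echo "concatenated query matched: $unsafeMatches\n"; // 1, the attack works
echo "prepared statement matched: $safeMatches\n";   // 0, no such password
```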
No, using strip_tags will not always protect you from cross-site scripting. Consider the following example. Let's say your page contains:
<script>
location.href='newpage_<?php echo strip_tags($_GET['language']); ?>.html';
</script>
The attacker submits the request with "language" set to "';somethingevil();'" . strip_tags() returns this data as is (there are no tags in it). The produced page code becomes:
<script>
location.href='newpage_';somethingevil();'.html';
</script>
somethingevil() gets executed. Replace somethingevil() with actual XSS exploit code.
Your last example with htmlspecialchars() will protect against this one, because it will escape single quotes. However, I have seen even weirder cases of user-supplied data inside JavaScript code, where it is not even within a quoted string. I think it was in a variable or function name. In that last case, no amount of escaping will probably help. I believe that it is best to avoid using user input to generate JavaScript code.
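If a value really has to end up inside a script block, one commonly used option is to let json_encode() build the whole string literal. This is a sketch of that idea, not the only way to do it; checking the value against a fixed list of known languages first would still be preferable.

```php
<?php
$language = "';somethingevil();'"; // the attack value from the text

// json_encode() produces a complete JavaScript string literal, including
// the surrounding double quotes. The JSON_HEX_* flags additionally
// encode <, >, &, ' and " as \uXXXX sequences.
$jsLiteral = json_encode(
    $language,
    JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT
);

echo "<script>\n";
echo "location.href = 'newpage_' + " . $jsLiteral . " + '.html';\n";
echo "</script>\n";
```

The injected quotes arrive in the page as \u0027, so they stay part of the string and do not end it.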
Simple answer: no.
Longer answer: there are ways to inject XSS that PHP's strip_tags cannot prevent.
For better protection, try HTML Purifier.
