Is preventing XSS and SQL Injection as easy as doing this? - php

Question: Is preventing XSS (cross-site scripting) as simple as using strip_tags on any saved input fields and running htmlspecialchars on any displayed output ... and preventing SQL Injection by using PHP PDO prepared statements?
Here's an example:
// INPUT: Input a person's favorite color and save it to the database
// this should prevent SQL injection (by using a prepared statement)
// and help prevent XSS (by using strip_tags)
$sql = 'INSERT INTO favorite (person_name, color) VALUES (?,?)';
$sth = $conn->prepare($sql);
$sth->execute(array(strip_tags($_POST['person_name']), strip_tags($_POST['color'])));
// OUTPUT: Output a person's favorite color from the database
// this should prevent XSS (by using htmlspecialchars) when displaying
$sql = 'SELECT color FROM favorite WHERE person_name = ?';
$sth = $conn->prepare($sql);
$sth->execute(array(strip_tags($_POST['person_name'])));
$sth->setFetchMode(PDO::FETCH_BOTH);
while ($row = $sth->fetch()) {
    // fetch() returns a row array, so escape the column value, not the row
    echo htmlspecialchars($row['color'], ENT_QUOTES, 'UTF-8');
}

It's even simpler. Just calling htmlspecialchars() (with quote style and character set) on user-controlled input when you display it is enough. strip_tags() is only useful if you want to sanitize data before processing it or saving it to the database, which is rarely done in the real world. HTML code does no harm inside PHP source, but PHP code may do so if you use eval() on unsanitized user-controlled input or that kind of evil stuff.
This however doesn't save you from SQL injections, but that's another story.
Update: to get clean user input from the request to avoid magic quotes in user-controlled input, you can use the following function:
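function get_string($array, $index, $default = null) {
    if (isset($array[$index]) && strlen($value = trim($array[$index])) > 0) {
        // get_magic_quotes_gpc() was removed in PHP 8, so check that it exists first
        $magic = function_exists('get_magic_quotes_gpc') && get_magic_quotes_gpc();
        return $magic ? stripslashes($value) : $value;
    } else {
        return $default;
    }
}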
which can be used as:
$username = get_string($_POST, "username");
$password = get_string($_POST, "password");
(you can do something similar for get_number, get_boolean, get_array, etc.)
To prepare the SQL query to avoid SQL injections, do:
$sql = sprintf(
    "SELECT id FROM user WHERE username = '%s' AND password = MD5('%s')",
    mysql_real_escape_string($username),
    mysql_real_escape_string($password)
);
To display user-controlled input to avoid XSS, do:
echo htmlspecialchars($data, ENT_QUOTES, 'UTF-8');

It depends on where and how you want to use the user data. You need to know the context you want to insert your data in and the meta characters of that context.
If you just want to allow the user to put text up on your website, htmlspecialchars suffices to escape the HTML meta characters. But if you want to allow certain HTML or want to embed user data in existing HTML elements (like a URL into an A/IMG element), htmlspecialchars is not enough, as you’re not in the HTML context anymore but in the URL context.
So entering <script>alert("xss")</script> into an image URL field will yield:
<img src="&lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;" />
But entering javascript:alert("xss") will succeed:
<img src="javascript:alert(&quot;xss&quot;)" />
Here you should take a look at the fabulous XSS (Cross Site Scripting) Cheat Sheet to see what contexts your user data can be injected in.
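One way to handle the URL context is to check the scheme against an allow-list before HTML-escaping the value. The following is a minimal sketch, not a complete URL validator; the helper name safe_url_attribute and the list of allowed schemes are assumptions made for illustration.

```php
<?php
// Hypothetical helper: returns an HTML-escaped URL that is safe to place in
// an src/href attribute, or null when the scheme is not on the allow-list.
function safe_url_attribute(string $url, array $allowedSchemes = ['http', 'https']): ?string
{
    $url = trim($url);
    $scheme = parse_url($url, PHP_URL_SCHEME);
    if (!is_string($scheme) || !in_array(strtolower($scheme), $allowedSchemes, true)) {
        return null; // rejects javascript:, data:, and scheme-less input
    }
    // The URL context is checked above; now escape for the HTML attribute context.
    return htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
}

$src = safe_url_attribute('javascript:alert("xss")');
echo $src === null ? "rejected\n" : '<img src="' . $src . '" />' . "\n";
```

Rejecting everything that is not explicitly allowed is usually easier to reason about than trying to list every dangerous scheme.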

strip_tags is not necessary. In most cases strip_tags is just irritating, because some of your users may want to use < and > in their texts. Just use htmlspecialchars (or htmlentities if you prefer) before you echo the texts to the browser.
(Don't forget mysql_real_escape_string before you insert anything into your database!)

The general rule/meme is "Filter Input, Escape Output." Using strip_tags on your input to remove any HTML is a good idea for input filtering, but you should be as strict as possible in what input you allow. For example, if an input parameter is only supposed to be an integer, only accept numeric input and always convert it to an integer before doing anything with it. A well-vetted input filtering library is going to help you a lot here; one that isn't specific to a particular framework is Inspekt (which I wrote, so I'm a bit biased).
For output, htmlspecialchars should be able to escape XSS attacks, but only if you pass the correct parameters. You must pass the quote escaping style and a charset.
In general, this should remove XSS attacks:
$safer_str = htmlspecialchars($unsafe_str, ENT_QUOTES, 'UTF-8');
Without passing ENT_QUOTES as the second parameter, single-quote chars are not encoded. Additionally, XSS attacks have been demonstrated when the correct charset is not passed (typically UTF-8 will be adequate). htmlspecialchars should always be called with ENT_QUOTES and a charset parameter.
Note that PHP 5.2.12 contains a fix for a multibyte XSS attack.
You may find the OWASP ESAPI PHP port interesting and useful, although the PHP version is not complete AFAIK.
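A minimal sketch of "Filter Input, Escape Output" using only built-in functions. The variable names and sample values are made up for illustration; $rawAge stands in for a request parameter.

```php
<?php
// Filter input: accept the value only if it is a well-formed integer.
// filter_var() returns false for anything that is not an integer.
$rawAge = '42abc'; // stands in for $_POST['age']
$age = filter_var($rawAge, FILTER_VALIDATE_INT);
var_dump($age); // bool(false)

// Escape output: ENT_QUOTES matters inside single-quoted attributes.
$unsafe = "x' onmouseover='alert(1)";
$compat = htmlspecialchars($unsafe, ENT_COMPAT, 'UTF-8'); // single quotes survive
$quotes = htmlspecialchars($unsafe, ENT_QUOTES, 'UTF-8'); // single quotes encoded

echo "<input value='$quotes'>\n";
```

With ENT_COMPAT the single quote passes through, so the value could close the attribute and add an event handler; with ENT_QUOTES it cannot.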

Yes, using PDO prepared statements protects from SQL injection. The SQL injection attack is based on the fact that the data submitted by the attacker is treated as a part of the query. For example, the attacker submits the string "a' or 'a'='a" as his password. Instead of the whole string being compared to the passwords in the database, it is included in the query, so the query becomes "SELECT * FROM users WHERE login='joe' AND password='a' or 'a'='a'". The part of attacker input is interpreted as a part of the query. However in case of prepared statements, you are telling the SQL engine specifically, what part is the query, and what part is data (by setting the parameters), so no such confusion is possible.
No, using strip_tags will not always protect you from cross-site scripting. Consider the following example. Let's say your page contains:
<script>
location.href='newpage_<?php echo strip_tags($_GET['language']); ?>.html';
</script>
The attacker submits the request with "language" set to "';somethingevil();'" . strip_tags() returns this data as is (there are no tags in it). The produced page code becomes:
<script>
location.href='newpage_';somethingevil();'.html';
</script>
somethingevil() gets executed. Replace somethingevil() with actual XSS exploit code.
Your last example with htmlspecialchars() will protect against this one, because it will escape single quotes. However, I have seen even weirder cases of user-supplied data inside JavaScript code, where it is not even within a quoted string. I think it was in a variable or function name. In that last case, probably no amount of escaping will help. I believe that it is best to avoid using user input to generate JavaScript code.
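One way to keep user input out of generated JavaScript is to map it onto a fixed allow-list first, so that only known values ever reach the page. This is a sketch under that assumption; pick_language and the language codes are hypothetical.

```php
<?php
// Hypothetical helper: only values from the allow-list are ever returned.
function pick_language(string $requested, array $allowed = ['en', 'de', 'fr'], string $default = 'en'): string
{
    return in_array($requested, $allowed, true) ? $requested : $default;
}

$language = pick_language("';somethingevil();'"); // falls back to 'en'

// json_encode() produces a quoted JavaScript string literal for the file name.
echo "<script>\nlocation.href = " . json_encode("newpage_{$language}.html") . ";\n</script>\n";
```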

Simple answer: no.
Longer answer: there are ways to inject XSS that PHP's strip_tags cannot prevent.
For better protection, try HTML Purifier.

Related

Input data processing using PHP for security

I am writing code that processes the user's text input in a registration form. I have implemented the following function, which makes sure that the input data is safe:
function input_check($Indata, $dbc) { // input_check($Indata, $dbc)
$Indata = trim($Indata); // remove white spaces
$Indata = stripslashes($Indata); // remove back slashes
$Indata = strip_tags($Indata); // remove html tags
$Indata = htmlspecialchars($Indata); // convert html entities
$Indata = mysql_real_escape_string($Indata,$dbc);
return $Indata;
}
Is there any other processing that I have to do in order to ensure that the input is safe?
I meant safe from malicious input data
Your strategy of applying every possible escaping mechanism may be safe, but it will make your application too complex: imagine what you would need to do to use the data (which seems to be stored in a MySQL database later, right?) when printing it in an HTML form later.
A wiser approach is to use only the escaping mechanism appropriate to how the data is used:
to store data in a MySQL database, use a database escaping mechanism (by the way, instead of mysql_real_escape_string(), which is deprecated, use PDO::quote() or, even better, parameter binding, which already does the escaping for you)
to print stored data in HTML text, use htmlspecialchars(), possibly in conjunction with strip_tags()
to print stored data in HTML attributes, use htmlspecialchars(), together with urlencode() when the value is part of a URL
... and so on. Then you will most likely be safe from SQL injection, XSS attacks and so on.
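The per-context rules above can be sketched as follows. The stored value and the search.php URL are made up for illustration; the point is that the same raw value gets a different encoding depending on where it is printed.

```php
<?php
$stored = 'Tom & "Jerry" <3'; // value as read back from the database, unescaped

// HTML text context: escape the HTML meta characters.
$asText = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');

// URL query value inside an HTML attribute: URL-encode first, then HTML-escape.
$asHref = htmlspecialchars('search.php?q=' . rawurlencode($stored), ENT_QUOTES, 'UTF-8');

echo '<p>' . $asText . "</p>\n";
echo '<a href="' . $asHref . '">search</a>' . "\n";
```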

A PHP function to prevent SQL Injections and XSS

I am trying to make my PHP as secure as possible, and the two main things I am trying to avoid are
MySQL Injections
Cross-Site Scripting (XSS)
This is the script I got against MySQL Injections:
function make_safe($variable) {
$variable = mysql_real_escape_string(trim($variable));
return $variable; }
http://www.addedbytes.com/writing-secure-php/writing-secure-php-1/
Against XSS, I found this:
$username = strip_tags($_POST['username']);
Now I want to unite the two into a single function. Would this be the best way to do so? :
function make_safe($variable) {
$variable = strip_tags(mysql_real_escape_string(trim($variable)));
return $variable; }
Or does the mysql_real_escape_string already prevent XSS? And lastly, is there anything else that I could add into this function to prevent other forms of hacking?
mysql_real_escape_string() doesn't prevent XSS. It only makes SQL injection impossible.
To fight XSS, you need to use htmlspecialchars() or strip_tags(). The first converts special characters like < to &lt;, which shows up as < but won't be executed. The second just strips all tags out.
I don't recommend writing a special function for this, or even one function to do it all, but I assume your given example would work.
This function:
function make_safe($variable)
{
$variable = strip_tags(mysql_real_escape_string(trim($variable)));
return $variable;
}
Will not work
SQL injection and XSS are two different beasts. Because they each require different escaping, you need to use the escape functions strip_tags and mysql_real_escape_string separately.
Joining them up will defeat the security of each.
Use the standard mysql_real_escape_string() when inputting data into the database.
Use strip_tags() when querying stuff out of the database before outputting them to the screen.
Why combining the two functions is dangerous
From the horse's mouth: http://php.net/manual/en/function.strip-tags.php
Because strip_tags() does not actually validate the HTML, partial or broken tags can result in the removal of more text/data than expected.
So by inputting malformed html into a database field a smart attacker can use your naive implementation to defeat mysql_real_escape_string() in your combo.
What you should really be looking into is using prepared statements and PDO to both provide an abstraction layer against your database as well as completely eradicate SQL injection attacks.
As for XSS, just make sure to never trust user input. Either run strip_tags or htmlentities when you store the data, or when you output it (not both as this will mess with your output), and you'll be all right.

Is mysql_real_escape_string(htmlspecialchars()) really useful ? why?

Is it really useful to have something like:
$passe = mysql_real_escape_string(htmlspecialchars($_POST['passe']));
why do we use this?
how to optimize it ?
Thank you
<?php
mysql_connect("localhost", "root", "");
mysql_select_db("nom_db");
$passe = mysql_real_escape_string(htmlspecialchars($_POST['passe']));
$passe2 = mysql_real_escape_string(htmlspecialchars($_POST['passe2']));
if($passe == $passe2)
{
// script here
}
else
{
echo 'Your password is wrong';
}
?>
In that code example, it isn't useful at all.
htmlspecialchars converts characters with special meaning in HTML into entities. That is essential if you have some text that you want to insert into an HTML document (as it stops, for instance, characters such as < being treated as the start of tags, and protects against XSS).
mysql_real_escape_string converts characters with special meaning in MySQL SQL queries into escapes. This allows you to insert arbitrary strings into a MySQL database safely (protecting against errors and injection). There are, however, better ways to do the same thing.
In this case, you are just comparing two strings. Running them through a bunch of conversions isn't going to do anything useful.
You should only use mysql_real_escape_string($var) when passing untrusted variables in to a database query like so:
$query = mysql_query("SELECT * FROM `foo` WHERE `bar` = '".mysql_real_escape_string($_POST['username'])."'");
It is important to do this to protect against SQL injection attacks.
As for htmlspecialchars(), this should be used when outputting untrusted variables to a page. It escapes any HTML so that a variable cannot inject unwanted or dangerous markup into the page (JavaScript, for example).
In your example, you need neither function, as you are just comparing the values and are not putting them in a database or on a webpage.
Using htmlspecialchars protects you from XSS, but there is a bypass method
if user input is added to a URL,
like
` link name ' ;
The bypass uses JavaScript event handlers such as onmouseout or onmouseover; otherwise it requires magic_quotes to be off.
addslashes and mysql_real_escape_string protect against SQL injection
by escaping the ' and " quotes,
but a good approach is to reject these words after converting the input to lowercase,
meaning:
$username = strtolower($_GET['ser']);
if (preg_match('/(select|and|or|union|into|from|information|schema|.user|concat|group)/', $username)) {
    die("Error : Hacking Attempt ");
}
Your full code in the pastebin shows that the variables are used later for a database query.
mysql_query("INSERT INTO validation VALUES('', '$pseudo', '$passe', '$email')");
mysql_real_escape_string() is a must here; htmlspecialchars isn't, for the reasons @Quentin explained so well above.
Use htmlspecialchars later in the output if anything of what you insert gets output on a HTML page.
Using htmlspecialchars() the way you do is pointless, because for strings:
mysql_real_escape_string(htmlspecialchars($_POST['passe'])) ==
mysql_real_escape_string(htmlspecialchars($_POST['passe2']));
is equivalent to:
$_POST['passe'] == $_POST['passe2']

Are these two functions overkill for sanitization?

function sanitizeString($var)
{
$var = stripslashes($var);
$var = htmlentities($var);
$var = strip_tags($var);
return $var;
}
function sanitizeMySQL($var)
{
$var = mysql_real_escape_string($var);
$var = sanitizeString($var);
return $var;
}
I got these two functions from a book and the author says that by using these two, I can be extra safe against XSS(the first function) and sql injections(2nd func).
Are all those necessary?
Also for sanitizing, I use prepared statements to prevent sql injections.
I would use it like this:
$variable = sanitizeString($_POST['user_input']);
$variable = sanitizeMySQL($_POST['user_input']);
EDIT: Get rid of strip_tags for the 1st function because it doesn't do anything.
Would using these two functions be enough to prevent the majority of attacks and be okay for a public site?
To be honest, I think the author of these functions has no idea either what XSS and SQL injections are or what exactly the functions used do.
Just to name two oddities:
Using stripslashes after mysql_real_escape_string removes the slashes that were added by mysql_real_escape_string.
htmlentities replaces the characters < and > that strip_tags relies on to identify tags.
Furthermore: in general, functions that protect against XSS are not suitable to protect against SQL injections, and vice versa, because each language and context has its own special characters that need to be taken care of.
My advice is to learn why and how code injection is possible and how to protect against it. Learn the languages you are working with, especially the special characters and how to escape these.
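The two oddities named above can be reproduced in a few lines. Note the assumption: addslashes() stands in for mysql_real_escape_string() here, because the mysql_* functions were removed in PHP 7; both add backslashes before quotes, which is all this sketch relies on.

```php
<?php
$input = "1' or 1=1";

// addslashes() stands in for mysql_real_escape_string() (assumption, see above).
$escaped = addslashes($input);     // a backslash is added before the quote
$undone  = stripslashes($escaped); // stripslashes() removes it again
var_dump($undone === $input);      // bool(true)

// htmlentities() leaves no literal < or >, so strip_tags() has nothing to remove.
$encoded  = htmlentities('<b>bold</b>', ENT_QUOTES, 'UTF-8');
$stripped = strip_tags($encoded);
var_dump($stripped === $encoded);  // bool(true)
```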
Edit   Here’s some (probably weird) example: Imagine you allow your users to input some value that should be used as a path segment in a URI that you use in some JavaScript code in an onclick attribute value. So the language context looks like this:
HTML attribute value
JavaScript string
URI path segment
And to make it more fun: You are storing this input value in a database.
Now to store this input value correctly into your database, you just need to use a proper encoding for the context you are about to insert that value into your database language (i.e. SQL); the rest does not matter (yet). Since you want to insert it into a SQL string declaration, the contextual special characters are the characters that allow you to change that context. As for string declarations these characters are (especially) the ", ', and \ characters that need to be escaped. But as already stated, prepared statements do all that work for you, so use them.
Now that you have the value in your database, we want to output them properly. Here we proceed from the innermost to the outermost context and apply the proper encoding in each context:
For the URI path segment context we need to escape (at least) all those characters that let us change that context; in this case / (leave current path segment), ?, and # (both leave URI path context). We can use rawurlencode for this.
For the JavaScript string context we need to take care of ", ', and \. We can use json_encode for this (if available).
For the HTML attribute value we need to take care of &, ", ', and <. We can use htmlspecialchars for this.
Now everything together:
'… onclick="'.htmlspecialchars('window.open("http://example.com/"+'.json_encode(rawurlencode($row['user-input'])).')').'" …'
Now if $row['user-input'] is "bar/baz" (including the quotes), the output is:
… onclick="window.open(&quot;http://example.com/&quot;+&quot;%22bar%2Fbaz%22&quot;)" …
But using all these functions in these contexts is not overkill: although the contexts may have similar special characters, they have different escape sequences. URI has the so-called percent-encoding, JavaScript has escape sequences like \", and HTML has character references like &quot;. Leaving out even one of these functions makes it possible to break out of the context.
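The innermost-to-outermost encoding can also be written as a small function, which makes each layer visible. This is a sketch; the function name onclick_open_attribute is hypothetical, and the JavaScript joins the base URL and the encoded segment with + so that the resulting expression is valid.

```php
<?php
// Hypothetical helper: builds an onclick attribute from untrusted input,
// encoding from the innermost context outwards.
function onclick_open_attribute(string $userInput): string
{
    $pathSegment = rawurlencode($userInput);   // 1. URI path segment
    $jsString    = json_encode($pathSegment);  // 2. JavaScript string literal
    $js          = 'window.open("http://example.com/"+' . $jsString . ')';
    return 'onclick="' . htmlspecialchars($js, ENT_QUOTES, 'UTF-8') . '"'; // 3. HTML attribute value
}

echo onclick_open_attribute('"bar/baz"'), "\n";
```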
It's true, but this level of escaping may not be appropriate in all cases. What if you want to store HTML in a database?
Best practice dictates that, rather than escaping on receiving values, you should escape them when you display them. This allows you to account for displaying both HTML from the database and non-HTML from the database, and it's really where this sort of code logically belongs, anyway.
Another advantage of sanitizing outgoing HTML is that a new attack vector may be discovered, in which case sanitizing incoming HTML won't do anything for values that are already in the database, while outgoing sanitization will apply retroactively without having to do anything special
Also, note that strip_tags in your first function will likely have no effect, if all of the < and > have become &lt; and &gt;.
You are doing htmlentities (which turns all > into &gt;) and then calling strip_tags, which at this point will not accomplish anything more, since there are no tags.
If you're using prepared statements and SQL placeholders and never interpolating user input directly into your SQL strings, you can skip the SQL sanitization entirely.
When you use placeholders, the structure of the SQL statement (SELECT foo, bar, baz FROM my_table WHERE id = ?) is sent to the database engine separately from the data values which are (eventually) bound to the placeholders. This means that, barring major bugs in the database engine, there is absolutely no way for the data values to be misinterpreted as SQL instructions, so this provides complete protection from SQL injection attacks without requiring you to mangle your data for storage.
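A minimal sketch of placeholders in action, assuming the pdo_sqlite extension is available so that an in-memory database can be used. The users table and the count_matches helper are made up for illustration, and the password is stored in plain text only to keep the sketch short.

```php
<?php
// In-memory SQLite database (assumes pdo_sqlite is installed).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (login TEXT, password TEXT)');
$pdo->exec("INSERT INTO users (login, password) VALUES ('joe', 'secret')");

// Hypothetical helper: the statement structure is fixed; values are bound separately.
function count_matches(PDO $pdo, string $login, string $password): int
{
    $sth = $pdo->prepare('SELECT COUNT(*) FROM users WHERE login = ? AND password = ?');
    $sth->execute([$login, $password]);
    return (int) $sth->fetchColumn();
}

echo count_matches($pdo, 'joe', "a' or 'a'='a"), "\n"; // 0: compared as a literal string
echo count_matches($pdo, 'joe', 'secret'), "\n";       // 1
```

The injection string is simply compared against the stored password as data, so it matches nothing.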
No, this isn't overkill; this is a vulnerability.
This code is completely vulnerable to SQL injection. You are doing a mysql_real_escape_string() and then you are doing a stripslashes(). So a " would become \" after mysql_real_escape_string() and then go back to " after the stripslashes(). mysql_real_escape_string() alone is best to stop SQL injection. Parameterized query libraries like PDO and ADODB take care of the escaping for you, and parameterized queries make it very easy to completely stop SQL injection.
Go ahead test your code:
$variable = sanitizeString($_POST['user_input']);
$variable = sanitizeMySQL($_POST['user_input']);
mysql_query("select * from mysql.user where Host='".$variable."'");
What if:
$_POST['user_input'] = "1' or 1=1 /*";
Patched:
mysql_query("select * from mysql.user where Host='".mysql_real_escape_string($variable)."'");
This code is also vulnerable to some types of XSS:
$variable = sanitizeString($_POST['user_input']);
$variable = sanitizeMySQL($_POST['user_input']);
print("<body background='http://localhost/image.php?".$variable."' >");
What if:
$_POST['user_input']="' onload=alert(/xss/)";
patched:
$variable=htmlspecialchars($variable,ENT_QUOTES);
print("<body background='http://localhost/image.php?".$variable."' >");
htmlspecialchars encodes single and double quotes; make sure the variable you are printing is also enclosed in quotes, which makes it impossible to "break out" and execute code.
Well, if you don't want to reinvent the wheel, you can use HTMLPurifier. It allows you to decide exactly what you want and what you don't want, and prevents XSS attacks and such.
I wonder about the concept of sanitization. You're telling MySQL to do exactly what you want it to do: run a query statement authored in part by the website user. You're already constructing the sentence dynamically using user input, concatenating strings with data supplied by the user. You get what you ask for.
Anyway, here's some more sanitization methods...
1) For numeric values, always manually cast at least somewhere before or while you build the query string: "SELECT field1 FROM tblTest WHERE(id = ".(int) $val.")";
2) For dates, convert the variable to a Unix timestamp first. Then, use the MySQL FROM_UNIXTIME() function to convert it back to a date. "SELECT field1 FROM tblTest WHERE(date_field >= FROM_UNIXTIME(".strtotime($val)."))";. This is actually needed sometimes anyway to deal with how MySQL interprets and stores dates differently from the script or OS layers.
3) For short and predictable strings that must follow a certain standard (username, email, phone number, etc), you can a) do prepared statements; or b) regex or other data validation.
4) For strings which wouldn't follow any real standard and which may or may not have pre- or double-escaped and executable code all over the place (text, memos, wiki markup, links, etc), you can a) do prepared statements; or b) store to and convert from binary/blob form - converting each character to binary, hex, or decimal representation before even passing the value to the query string, and converting back when extracting. This way you can focus more on just html validation when you spit the stored value back out.
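Points 1) and 2) can be sketched as small helpers that build the WHERE clause. The helper names are hypothetical; the table and column names follow the examples above. One addition beyond the text: strtotime() returns false for unparseable input, so the sketch checks for that instead of interpolating the result blindly.

```php
<?php
// 1) Numeric values: the cast guarantees that only an integer reaches the SQL string.
function id_clause($val): string
{
    return 'WHERE (id = ' . (int) $val . ')';
}

// 2) Dates: convert to a Unix timestamp; reject input strtotime() cannot parse.
function date_clause($val): ?string
{
    $timestamp = strtotime((string) $val);
    if ($timestamp === false) {
        return null;
    }
    return 'WHERE (date_field >= FROM_UNIXTIME(' . $timestamp . '))';
}

echo id_clause('1 or 1=1'), "\n"; // WHERE (id = 1)
echo date_clause('2020-01-02 00:00:00 UTC'), "\n";
```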

How to sanitize user inputs in PHP?

Is this enough?
$listing = mysql_real_escape_string(htmlspecialchars($_POST['listing']));
Depends - if you are expecting text, it's just fine, although you shouldn't put the htmlspecialchars in input. Do it in output.
You might want to read this: What's the best method for sanitizing user input with PHP?
you can use php function : filter_var()
a good tutorial in the link :
http://www.phpro.org/tutorials/Filtering-Data-with-PHP.html
example to sanitize integer :
Sanitizing an integer is simple with the FILTER_SANITIZE_NUMBER_INT filter. This filter strips out all characters except digits and the + and - signs.
It is simple to use and we no longer need to boggle our minds with regular expressions.
<?php
/*** an integer ***/
$int = "abc40def+;2";
/*** sanitize the integer ***/
echo filter_var($int, FILTER_SANITIZE_NUMBER_INT);
?>
The above code produces an output of 40+2, as the non-INT values, as specified by the filter, have been removed.
See:
Best way to stop SQL Injection in PHP
What are the best practices for avoid xss attacks in a PHP site
And sanitise data immediately before it is used in the context it needs to be made safe for. (e.g. don't run htmlspecialchars until you are about to output HTML, you might need the unedited data before then (such as if you ever decide to send content from the database by email)).
Yes. However, you shouldn't use htmlspecialchars on input. Only on output, when you print it.
This is because it's not certain that the output will always be HTML. It could go to a terminal, where stray entity codes showing up would confuse users.
It depends on what you want to achieve. Your version prevents (probably) all SQL injections and strips out HTML (more exactly: Prevents it from being interpreted when sent to the browser). You could (and probably should) apply the htmlspecialchars() on output, not input. Maybe some time in the future you want to allow simple things like <b>.
But there's more to sanitizing, e.g. if you expect an Email Address you could verify that it's indeed an email address.
As has been said, don't use htmlspecialchars on input, only on output. Another thing to take into consideration is ensuring the input is as expected. For instance, if you're expecting a number, use is_numeric(), or if you're expecting a string to be of a certain size, or at least a certain size, check for this. This way you can alert users to any errors they have made in their input.
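A sketch of this kind of validation using filter_var(). The helper names and the 0 to 150 age range are assumptions for illustration; each helper returns null when the input is not what was expected, so the caller can show an error to the user.

```php
<?php
// Hypothetical helper: returns the age as an int, or null if it is not a
// whole number in the assumed range 0..150.
function validate_age($raw): ?int
{
    $age = filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 0, 'max_range' => 150],
    ]);
    return $age === false ? null : $age;
}

// Hypothetical helper: returns the address, or null if it is not a valid email.
function validate_email($raw): ?string
{
    $email = filter_var($raw, FILTER_VALIDATE_EMAIL);
    return $email === false ? null : $email;
}

var_dump(validate_age('42'));            // int(42)
var_dump(validate_email('not an email')); // NULL
```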
What if your listing variable is an array ?
You should sanitize this variable recursively.
Edit:
Actually, with this technique you can avoid SQL injections but you can't avoid XSS.
In order to sanitize an "unreliable" string, I usually combine strip_tags and html_entity_decode.
This way, I avoid all code injection, even if characters are encoded as character references such as &#321;.
$cleaned_string = strip_tags( html_entity_decode( $var, ENT_QUOTES, 'UTF-8' ) );
Then, you have to build a recursive function which call the previous functions and walks through multi-dimensional arrays.
In the end, when you want to use a variable into an SQL statement, you can use the DBMS-specific (or PDO's) escaping function.
$var_used_with_mysql = mysql_real_escape_string( $cleaned_string );
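The recursive walk mentioned above could look like the following sketch. The function name clean_recursive is hypothetical; it applies the same strip_tags and html_entity_decode combination to every string in a nested array. The result still needs the usual escaping when it is output or used in a query.

```php
<?php
// Hypothetical helper: cleans a string, or every string inside a nested array.
function clean_recursive($value)
{
    if (is_array($value)) {
        // array_map() keeps the keys of a single input array.
        return array_map('clean_recursive', $value);
    }
    return strip_tags(html_entity_decode((string) $value, ENT_QUOTES, 'UTF-8'));
}

$cleaned = clean_recursive([
    'title' => '&lt;b&gt;x&lt;/b&gt;',
    'tags'  => ['<i>y</i>', 'plain'],
]);
print_r($cleaned);
```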
In addition to sanitizing the data, you should also validate it, like checking for numbers after you ask for an age, or making sure that an email address is valid. Besides the security benefit, you can also notify your users about problems with their input.
I would assume it is almost impossible to make an SQL injection if the input is definitely a number or definitely an email address so there is an added level of safety.
