How to sanitize user inputs in PHP? - php

Is this enough?
$listing = mysql_real_escape_string(htmlspecialchars($_POST['listing']));

It depends. If you are expecting text, it's fine, although you shouldn't apply htmlspecialchars on input. Do it on output.
You might want to read this: What's the best method for sanitizing user input with PHP?

You can use the PHP function filter_var().
There is a good tutorial at this link:
http://www.phpro.org/tutorials/Filtering-Data-with-PHP.html
An example that sanitizes an integer:
Sanitizing an integer is simple with the FILTER_SANITIZE_NUMBER_INT filter. This filter strips out all characters except digits and the + and - signs.
It is simple to use, and we no longer need to boggle our minds with regular expressions.
<?php
/*** an integer ***/
$int = "abc40def+;2";
/*** sanitize the integer ***/
echo filter_var($int, FILTER_SANITIZE_NUMBER_INT);
?>
The above code produces an output of 40+2, because the non-integer characters, as specified by the filter, have been removed.
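Sanitizing only strips characters, so the result is not necessarily a valid integer. A minimal sketch of the difference between sanitizing and validating with filter_var() follows; the sample values and the 1 to 120 range are illustrative assumptions, not part of the original answer.

```php
<?php
// Sanitizing strips disallowed characters; "40+2" is still not an integer.
$raw = "abc40def+;2";
$sanitized = filter_var($raw, FILTER_SANITIZE_NUMBER_INT);

// Validating returns the integer on success and false on failure.
$invalid = filter_var($sanitized, FILTER_VALIDATE_INT);
$valid   = filter_var("42", FILTER_VALIDATE_INT);

// Optional range check (the bounds here are only an example).
$age = filter_var("17", FILTER_VALIDATE_INT, [
    'options' => ['min_range' => 1, 'max_range' => 120],
]);
```

Validation is usually the better fit when the application needs an actual number rather than a cleaned-up string.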

See:
Best way to stop SQL Injection in PHP
What are the best practices for avoid xss attacks in a PHP site
Also, sanitise data immediately before it is used in the context it needs to be made safe for. For example, don't run htmlspecialchars until you are about to output HTML; you might need the unedited data before then, such as if you ever decide to send content from the database by email.

Yes. However, you shouldn't use htmlspecialchars on input. Only on output, when you print it.
This is because it's not certain that the output will always be HTML. It could go to a terminal, where users would be confused if HTML entity codes suddenly showed up.
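A minimal sketch of escaping at output time rather than at input time. The helper name e() and the sample string are assumptions made for illustration.

```php
<?php
// Hypothetical helper: escape a value for HTML text or a quoted attribute.
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// The raw value is stored and passed around unmodified.
$listing = 'Tom & Jerry <b>bold</b>';

// Escaped only at the point where it is written into an HTML page.
$html = '<p>' . e($listing) . '</p>';

// A terminal or plain-text email receives the original text.
$plain = "Listing: " . $listing . "\n";
```

Because the stored value stays untouched, each output channel can apply the encoding it needs.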

It depends on what you want to achieve. Your version prevents (probably) all SQL injections and strips out HTML (more exactly: Prevents it from being interpreted when sent to the browser). You could (and probably should) apply the htmlspecialchars() on output, not input. Maybe some time in the future you want to allow simple things like <b>.
But there's more to sanitizing, e.g. if you expect an Email Address you could verify that it's indeed an email address.
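For instance, an email check could look roughly like this, using PHP's built-in FILTER_VALIDATE_EMAIL filter (the addresses are made-up sample values):

```php
<?php
// filter_var() returns the address when it is syntactically valid, otherwise false.
$good = filter_var('alice@example.com', FILTER_VALIDATE_EMAIL);
$bad  = filter_var('not an email', FILTER_VALIDATE_EMAIL);
```

This only checks the syntax of the address; it does not confirm that the mailbox exists.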

As has been said, don't use htmlspecialchars on input, only on output. Another thing to take into consideration is ensuring the input is as expected. For instance, if you're expecting a number, use is_numeric(); if you're expecting a string of a certain maximum or minimum length, check for that. This way you can alert users to any errors they have made in their input.
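A rough sketch of that kind of validation, collecting messages that can be shown to the user. The function name, field names, and length limits are hypothetical.

```php
<?php
// Hypothetical validator: returns a list of error messages (empty when the input is acceptable).
function validateListing(array $input): array
{
    $errors = [];

    // is_numeric() accepts numeric strings such as "12.50" and rejects everything else.
    if (!is_numeric($input['price'] ?? null)) {
        $errors[] = 'Price must be a number.';
    }

    // Length check in bytes; the 3..80 limits are only an example.
    $title  = $input['title'] ?? '';
    $length = is_string($title) ? strlen($title) : 0;
    if ($length < 3 || $length > 80) {
        $errors[] = 'Title must be between 3 and 80 characters.';
    }

    return $errors;
}
```

Calling validateListing($_POST) before doing anything else with the data gives one place to decide whether to continue or to redisplay the form.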

What if your listing variable is an array ?
You should sanitize this variable recursively.
Edit:
Actually, with this technique you can avoid SQL injections but you can't avoid XSS.
In order to sanitize an "unreliable" string, I usually combine strip_tags and html_entity_decode.
This way, I avoid all code injection, even if characters are encoded as HTML entities.
$cleaned_string = strip_tags( html_entity_decode( $var, ENT_QUOTES, 'UTF-8' ) );
Then, you have to build a recursive function which calls the previous functions and walks through multi-dimensional arrays.
In the end, when you want to use a variable in an SQL statement, you can use the DBMS-specific (or PDO's) escaping function.
$var_used_with_mysql = mysql_real_escape_string( $cleaned_string );
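The recursive walk described above might be sketched as follows. The function name cleanRecursive() and the sample data are assumptions; the sketch only applies the strip_tags/html_entity_decode combination from this answer and is not a substitute for escaping at output time.

```php
<?php
// Hypothetical helper: cleans a string, or every string inside a (possibly nested) array.
function cleanRecursive($value)
{
    if (is_array($value)) {
        // array_map() with a single array preserves the keys.
        return array_map('cleanRecursive', $value);
    }
    if (is_string($value)) {
        // Decode entities first so that encoded tags are also stripped.
        return strip_tags(html_entity_decode($value, ENT_QUOTES, 'UTF-8'));
    }
    return $value; // leave non-string scalars untouched
}

$input = [
    'name' => '<b>Bob</b>',
    'tags' => ['&lt;script&gt;alert(1)&lt;/script&gt;x', 'plain'],
];
$clean = cleanRecursive($input);
```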

In addition to sanitizing the data, you should also validate it: for example, checking for a number when you ask for an age, or making sure that an email address is valid. Besides the security benefit, you can also notify your users about problems with their input.
I would assume it is almost impossible to perform an SQL injection if the input is definitely a number or definitely an email address, so there is an added level of safety.

Related

Sanitize POST htmlentities or plus stripslashes and strip_tags

As always I start this saying that I am learning.
I have seen in several books, and even here, that when it comes to sanitizing form input (Form > Input > Submit), a lot of users write:
function sanitizeexample($param)
{
    $param = stripslashes($param);
    $param = strip_tags($param);
    $param = htmlentities($param);
    return $param;
}
$name = sanitizeexample($_POST['name']);
Instead of JUST:
function sanitizeexample($param)
{
    $param = htmlentities($param);
    return $param;
}
$name = sanitizeexample($_POST['name']);
So here is the question: do stripslashes() and strip_tags() add anything in terms of security, or is htmlentities() enough?
I'm asking just to know which is best to use.
Whether strip_tags() provides a value-add is dependent on your particular use case. If you htmlentities() a string that contains html tags, you're going to get the raw html content escaped and rendered on the page. The example you give is probably making the assumption that this is not what you want, and so by doing strip_tags() first, html tags are removed.
stripslashes is the inverse of addslashes. In modern PHP code (PHP >= 5.4), it is not necessary. On legacy systems with magic_quotes_gpc enabled, user input from request variables was automagically escaped with addslashes so as to make it "safe" for direct use in database queries. This has widely been considered a Bad Idea (because it's not actually safe, for many reasons), and magic_quotes has been removed. Accordingly, you would not normally need to stripslashes() user input now. (Whether you actually need to depends on the PHP version and ini settings.)
(Note that you would still need to properly escape any content going into your database, but that is better done with parameterized queries or database-specific escaping functions, both of which are outside the scope of this question.)
It depends on your goals:
If you're getting user data passed from an HTML form, you should definitely apply the strip_tags(trim($_POST['name'])) approach to remove potentially insecure and excessive data.
If you are receiving the content of a user-uploaded file and need to preserve its formatting, you have to consider how to safely process and store such files, applying specific (selective) sanitizing.

Do I need to escape ALL GET arrays?

I have an important question, and I don't know what to search for, so I'm asking you guys for help.
Do I need to escape this kind of code:
<?php if(isset($_GET['hk']) && $_GET['hk'] == "loginerror") { echo "error"; } ?>
(the result will be something like index.php?hk=loginerror)
Or should I leave it un-escaped? Can hackers "hack" if I don't use escape?
Thanks.
You need to escape (or encode, depending on context) special characters in user input when you use it in generated code or data formats (e.g. if you put it in an SQL query, an HTML document, a JSON file, etc).
If you are just comparing it to a string or seeing if it exists, there is no point in escaping it.
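A short sketch of that idea: the request value is only compared against known keys, and only fixed strings are ever printed, so nothing needs escaping. The helper name and the second message are made up for illustration.

```php
<?php
// Hypothetical helper: maps an allowlisted ?hk= value to a fixed message.
function messageFor($hk): string
{
    $messages = [
        'loginerror' => 'error',
        'loggedout'  => 'You have been logged out.',
    ];

    // The user-supplied value is used only as a lookup key; it is never echoed.
    return (is_string($hk) && isset($messages[$hk])) ? $messages[$hk] : '';
}

echo messageFor($_GET['hk'] ?? null);
```

If the value itself were echoed into the page, it would need htmlspecialchars() at that point.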
It is always good practice to filter or escape your strings when sending information, to limit attackers' chances of finding security flaws.
Furthermore, never send sensitive information over the net with the GET method ($_GET); use the POST method ($_POST) instead.
With GET, the variables being passed are visible in the URL, and this information could be very valuable to an attacker.
NO.
You shouldn't escape any $_GET array at all.

$_GET['user'] security vulnerability in PHP

I'm posting it for a clarification in a specific situation, though user input sanitization/validations is a cliche subject.
A section of the code contains:
$haystack=$_GET['user'];
$haystack is never used for 'echo' or 'print', in any SQL query, or in any such thing. The only further use of the user input ($haystack) is to check whether the string contains a predefined $needle.
if (preg_match($needle, $haystack)) {
    $result = "A";
} else {
    $result = "B";
}
My worry is the execution of malicious code, rather than its mere presence in the user input.
So the question is: if the user input is used only in the context mentioned above (no usage in echo, print, SQL, etc.), is there still a possibility that malicious code in the user input gets executed?
I want to add only the security measures required for this context rather than overdoing it.
If used only in the context, there's no way to execute malicious code from the user input.
You should be careful with eval, preg_replace (with the e modifier, thanks Pelshoff), database queries and echo (and print, sprintf…).
It's not possible to execute arbitrary code just by being able to alter a string. Only when you output the string directly, or use it in SQL, should you be really worried.
preg_match won't end up executing your input. It's too simple and straightforward to have a hidden exploitable bug. If you toss $haystack after running preg_match on it, then it can't possibly hurt you.
While the $haystack may not be reflected, it can obviously affect program flow. The (extremely short) code you posted certainly doesn't look directly vulnerable, but not sanitizing your input may enable code execution in conjunction with other vulnerabilities.

The ultimate clean/secure function

I have a lot of user inputs from $_GET and $_POST. At the moment I always write mysql_real_escape_string($_GET['var']).
I would like to know whether you could make a function that secures, escapes and cleans the $_GET/$_POST arrays right away, so you won't have to deal with it each time you are working with user inputs and such.
I was thinking of a function, e.g. cleanMe($input), which would apply mysql_real_escape_string, htmlspecialchars, strip_tags and stripslashes (I think that would be all it takes to make the input clean and secure) and then return the $input.
So is this possible? Making a function that works for all $_GET and $_POST, so you would do only this:
$_GET = cleanMe($_GET);
$_POST = cleanMe($_POST);
So in your code later, when you work with e.g $_GET['blabla'] or $_POST['haha'] , they are secured, stripped and so on?
Tried myself a little:
function cleanMe($input) {
    $input = mysql_real_escape_string($input);
    $input = htmlspecialchars($input, ENT_IGNORE, 'utf-8');
    $input = strip_tags($input);
    $input = stripslashes($input);
    return $input;
}
The idea of a generic sanitation function is a broken concept.
There is one right sanitation method for every purpose. Running them all indiscriminately on a string will often break it - escaping a piece of HTML code for a SQL query will break it for use in a web page, and vice versa. Sanitation should be applied right before using the data:
before running a database query. The right sanitation method depends on the library you use; they are listed in How can I prevent SQL injection in PHP?
htmlspecialchars() for safe HTML output
preg_quote() for use in a regular expression
escapeshellarg() / escapeshellcmd() for use in an external command
etc. etc.
Using a "one size fits all" sanitation function is like using five kinds of highly toxic insecticide on a plant that can by definition only contain one kind of bug - only to find out that your plants are infested by a sixth kind, on which none of the insecticides work.
Always use that one right method, ideally straight before passing the data to the function. Never mix methods unless you need to.
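As a rough illustration of "one method per context", the sketch below escapes the same input three different ways, each applied right before use. The sample string, the grep command, and file.txt are assumptions made for the example.

```php
<?php
$input = 'a.b <c> "d"';

// HTML output: encode the characters that are special in HTML.
$forHtml = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// Regular expression: quote regex metacharacters (and the / delimiter).
$forRegex = '/^' . preg_quote($input, '/') . '$/';

// External command: quote the value as a single shell argument.
$forShell = 'grep -F -- ' . escapeshellarg($input) . ' file.txt';
```

Each result is only safe in its own context; the HTML-escaped string, for example, is the wrong thing to pass to a shell or a regular expression.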
There is no point in simply passing the input through all these functions. All these functions have different meanings. Data doesn't get "cleaner" by calling more escape-functions.
If you want to store user input in MySQL you need to use only mysql_real_escape_string. It is then fully escaped to store safely in the database.
EDIT
Also note the problems that arise from using the other functions. If the client sends, for instance, a username to the server, and the username contains an ampersand (&), you don't want to have called htmlentities before storing it in the database, because then the username in the database will contain &amp;.
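A small sketch of how escaping before storage goes wrong once the value is escaped again at output (the username is a made-up sample):

```php
<?php
$username = 'Tom & Jerry';

// Escaped before storage: the database now holds "Tom &amp; Jerry".
$storedEscaped = htmlentities($username, ENT_QUOTES, 'UTF-8');

// Escaped again at output: the browser shows a literal "&amp;".
$shownTwice = htmlspecialchars($storedEscaped, ENT_QUOTES, 'UTF-8');

// Stored raw and escaped once at output: the browser shows "Tom & Jerry".
$shownOnce = htmlspecialchars($username, ENT_QUOTES, 'UTF-8');
```

The stored value is also longer than the original, which matters for column lengths, searching, and sorting.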
You're looking for filter_input_array().
However, I suggest only using that for business-style validation/sanitisation and not SQL input filtering.
For protection against SQL injection, use parametrised queries with mysqli or PDO.
The problem is that something clean or secure for one use won't be for another: cleaning for part of a path, for part of a MySQL query, for HTML output (as HTML, in JavaScript, or in an input's value), or for XML may require different, contradictory things.
But, some global things can be done.
Try to use filter_input to get your user's input. And use prepared statements for your SQL queries.
Although, instead of a do-it-all function, you can create a class which manages your inputs. Something like this:
class inputManager {
    static function toHTML($field) {
        $data = filter_input(INPUT_GET, $field, FILTER_SANITIZE_SPECIAL_CHARS);
        return $data;
    }
    static function toSQL($field, $dbType = 'mysql') {
        $data = filter_input(INPUT_GET, $field);
        if ($dbType == 'mysql') {
            return mysql_real_escape_string($data);
        }
    }
}
With this kind of thing, if you see any $_POST, $_GET, $_REQUEST or $_COOKIE in your code, you know you have to change it. And if one day you have to change how you filter your inputs, just change the class you've made.
May I suggest to install "mod_security" if you're using apache and have full access to server?!
It did solve most of my problems. However, don't rely on just one or two solutions; always write secure code ;)
UPDATE
Found this PHP IDS (http://php-ids.org/); seems nice :)

Are these two functions overkill for sanitization?

function sanitizeString($var)
{
    $var = stripslashes($var);
    $var = htmlentities($var);
    $var = strip_tags($var);
    return $var;
}
function sanitizeMySQL($var)
{
    $var = mysql_real_escape_string($var);
    $var = sanitizeString($var);
    return $var;
}
I got these two functions from a book, and the author says that by using them I can be extra safe against XSS (the first function) and SQL injections (the second function).
Are all those necessary?
Also for sanitizing, I use prepared statements to prevent sql injections.
I would use it like this:
$variable = sanitizeString($_POST['user_input']);
$variable = sanitizeMySQL($_POST['user_input']);
EDIT: Get rid of strip_tags for the 1st function because it doesn't do anything.
Would using these two functions be enough to prevent the majority of attacks and be okay for a public site?
To be honest, I think the author of these functions has no idea either what XSS and SQL injections are or what exactly the functions used do.
Just to name two oddities:
Using stripslashes after mysql_real_escape_string removes the slashes that were added by mysql_real_escape_string.
htmlentities replaces the characters < and > that strip_tags relies on in order to identify tags.
Furthermore: in general, functions that protect against XSS are not suitable for protecting against SQL injections, and vice versa, because each language and context has its own special characters that need to be taken care of.
My advice is to learn why and how code injection is possible and how to protect against it. Learn the languages you are working with, especially the special characters and how to escape these.
Edit: Here’s a (probably weird) example: Imagine you allow your users to input some value that should be used as a path segment in a URI that you use in some JavaScript code in an onclick attribute value. So the language context looks like this:
HTML attribute value
JavaScript string
URI path segment
And to make it more fun: You are storing this input value in a database.
Now to store this input value correctly into your database, you just need to use a proper encoding for the context you are about to insert that value into your database language (i.e. SQL); the rest does not matter (yet). Since you want to insert it into a SQL string declaration, the contextual special characters are the characters that allow you to change that context. As for string declarations these characters are (especially) the ", ', and \ characters that need to be escaped. But as already stated, prepared statements do all that work for you, so use them.
Now that you have the value in your database, we want to output them properly. Here we proceed from the innermost to the outermost context and apply the proper encoding in each context:
For the URI path segment context we need to escape (at least) all those characters that let us change that context; in this case / (leave current path segment), ?, and # (both leave URI path context). We can use rawurlencode for this.
For the JavaScript string context we need to take care of ", ', and \. We can use json_encode for this (if available).
For the HTML attribute value we need to take care of &, ", ', and <. We can use htmlspecialchars for this.
Now everything together:
'… onclick="'.htmlspecialchars('window.open("http://example.com/'.json_encode(rawurlencode($row['user-input'])).'")').'" …'
Now if $row['user-input'] is "bar/baz" the output is:
… onclick="window.open(&quot;http://example.com/&quot;%22bar%2Fbaz%22&quot;&quot;)" …
But using all these functions in these contexts is not overkill, because although the contexts may have similar special characters, they have different escape sequences. URI has the so-called percent-encoding, JavaScript has escape sequences like \", and HTML has character references like &quot;. Leaving out even one of these functions allows an attacker to break out of the context.
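The innermost-to-outermost encoding can also be written step by step. The sketch below is one possible arrangement, assuming the complete URL is built first so that json_encode() emits a single JavaScript string literal; the variable names and the surrounding <a> element are illustrative.

```php
<?php
$userInput = '"bar/baz"';

// 1. URI path segment: percent-encode the value.
$url = 'http://example.com/' . rawurlencode($userInput);

// 2. JavaScript string: json_encode() adds the quotes and escapes as needed.
$js = 'window.open(' . json_encode($url) . ')';

// 3. HTML attribute value: encode &, <, > and both kinds of quotes.
$attr = htmlspecialchars($js, ENT_QUOTES, 'UTF-8');

$html = '<a href="#" onclick="' . $attr . '">open</a>';
```

When the browser parses the attribute, it decodes the character references first, then the JavaScript engine reads the string literal, and only then is the URL interpreted, which is the reverse of the order in which the encodings were applied.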
It's true, but this level of escaping may not be appropriate in all cases. What if you want to store HTML in a database?
Best practice dictates that, rather than escaping on receiving values, you should escape them when you display them. This allows you to account for displaying both HTML from the database and non-HTML from the database, and it's really where this sort of code logically belongs, anyway.
Another advantage of sanitizing outgoing HTML is that a new attack vector may be discovered, in which case sanitizing incoming HTML won't do anything for values that are already in the database, while outgoing sanitization will apply retroactively without having to do anything special
Also, note that strip_tags in your first function will likely have no effect if all of the < and > have become &lt; and &gt;.
You are calling htmlentities (which turns every > into &gt;) and then calling strip_tags, which at this point will not accomplish anything more, since there are no tags left.
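A quick sketch showing that the order of the two calls matters (the sample string is made up):

```php
<?php
$input = '<b>Hi</b> & bye';

// htmlentities() first: no literal < or > remain, so strip_tags() removes nothing.
$entitiesFirst = strip_tags(htmlentities($input, ENT_QUOTES, 'UTF-8'));

// strip_tags() first: the tags are removed, then the remaining text is encoded.
$tagsFirst = htmlentities(strip_tags($input), ENT_QUOTES, 'UTF-8');
```

In the first case the markup is shown to the user as text; in the second it is dropped.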
If you're using prepared statements and SQL placeholders and never interpolating user input directly into your SQL strings, you can skip the SQL sanitization entirely.
When you use placeholders, the structure of the SQL statement (SELECT foo, bar, baz FROM my_table WHERE id = ?) is sent to the database engine separately from the data values which are (eventually) bound to the placeholders. This means that, barring major bugs in the database engine, there is absolutely no way for the data values to be misinterpreted as SQL instructions, so this provides complete protection from SQL injection attacks without requiring you to mangle your data for storage.
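A minimal sketch of a placeholder query with PDO. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database in place of MySQL so that it is self-contained; the table and column names are made up.

```php
<?php
// Assumption: pdo_sqlite is installed; the in-memory database stands in for a real server.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (name TEXT)');
$pdo->exec("INSERT INTO users (name) VALUES ('alice'), ('bob')");

// The statement structure is fixed; the value is bound separately.
$stmt = $pdo->prepare('SELECT name FROM users WHERE name = ?');

// An injection attempt is treated as a plain (non-matching) string.
$stmt->execute(["alice' OR '1'='1"]);
$rowsAttack = $stmt->fetchAll(PDO::FETCH_COLUMN);

// A normal value matches as expected.
$stmt->execute(['alice']);
$rowsNormal = $stmt->fetchAll(PDO::FETCH_COLUMN);
```

The same prepare/execute pattern applies with a MySQL DSN; only the connection string and credentials change.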
No, this isn't overkill; this is a vulnerability.
This code is completely vulnerable to SQL injection. You are calling mysql_real_escape_string() and then stripslashes(). So a " would become \" after mysql_real_escape_string() and then go back to " after stripslashes(). mysql_real_escape_string() alone is best to stop SQL injection. Parameterized query libraries like PDO and ADODB use it, and parameterized queries make it very easy to completely stop SQL injection.
Go ahead test your code:
$variable = sanitizeString($_POST['user_input']);
$variable = sanitizeMySQL($_POST['user_input']);
mysql_query("select * from mysql.user where Host='".$variable."'");
What if:
$_POST['user_input'] = "1' or 1=1 /*";
Patched:
mysql_query("select * from mysql.user where Host='".mysql_real_escape_string($variable)."'");
This code is also vulnerable to some types of XSS:
$variable = sanitizeString($_POST['user_input']);
$variable = sanitizeMySQL($_POST['user_input']);
print("<body background='http://localhost/image.php?".$variable."' >");
What if:
$_POST['user_input']="' onload=alert(/xss/)";
patched:
$variable=htmlspecialchars($variable,ENT_QUOTES);
print("<body background='http://localhost/image.php?".$variable."' >");
htmlspecialchars with ENT_QUOTES encodes both single and double quotes. Make sure the variable you are printing is also enclosed in quotes; this makes it impossible to "break out" and execute code.
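The breakout and the fix can be compared side by side. This sketch reuses the payload from the answer; the variable names are illustrative.

```php
<?php
$payload = "' onload=alert(/xss/)";

// Unescaped: the payload's quote closes the attribute and injects an onload handler.
$unsafe = "<body background='http://localhost/image.php?" . $payload . "' >";

// Escaped with ENT_QUOTES: the single quote becomes &#039; and stays inside the attribute.
$safe = "<body background='http://localhost/image.php?"
      . htmlspecialchars($payload, ENT_QUOTES, 'UTF-8') . "' >";
```

Passing ENT_QUOTES explicitly matters here: before PHP 8.1 the default flags did not encode single quotes.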
Well, if you don't want to reinvent the wheel you can use HTMLPurifier. It allows you to decide exactly what you want and what you don't want and prevents XSS attacks and such
I wonder about the concept of sanitization. You're telling Mysql to do exactly what you want it to do: run a query statement authored in part by the website user. You're already constructing the sentence dynamically using user input - concatenating strings with data supplied by the user. You get what you ask for.
Anyway, here's some more sanitization methods...
1) For numeric values, always manually cast at least somewhere before or while you build the query string: "SELECT field1 FROM tblTest WHERE(id = ".(int) $val.")";
2) For dates, convert the variable to a Unix timestamp first. Then use the MySQL FROM_UNIXTIME() function to convert it back to a date: "SELECT field1 FROM tblTest WHERE(date_field >= FROM_UNIXTIME(".strtotime($val)."))";. This is actually needed sometimes anyway to deal with how MySQL interprets and stores dates differently from the script or OS layers.
3) For short and predictable strings that must follow a certain standard (username, email, phone number, etc), you can a) do prepared statements; or b) regex or other data validation.
4) For strings which wouldn't follow any real standard and which may or may not have pre- or double-escaped and executable code all over the place (text, memos, wiki markup, links, etc), you can a) do prepared statements; or b) store to and convert from binary/blob form - converting each character to binary, hex, or decimal representation before even passing the value to the query string, and converting back when extracting. This way you can focus more on just html validation when you spit the stored value back out.
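Points 1 and 2 might be sketched as follows. The sample values are made up, and the check for a false return from strtotime() is an addition not mentioned in the answer; without it an unparseable date would be concatenated as an empty string and produce broken SQL.

```php
<?php
// 1) Numeric value: the cast discards everything after the leading digits.
$val = "42; DROP TABLE tblTest";
$sqlById = "SELECT field1 FROM tblTest WHERE (id = " . (int) $val . ")";

// 2) Date value: strtotime() returns an integer timestamp, or false on failure.
$dateInput = "2020-01-02 03:04:05 UTC";
$timestamp = strtotime($dateInput);
if ($timestamp === false) {
    throw new InvalidArgumentException('Unparseable date');
}
$sqlByDate = "SELECT field1 FROM tblTest WHERE (date_field >= FROM_UNIXTIME(" . $timestamp . "))";
```

Prepared statements remain the more general option, since they cover strings as well as numbers and dates.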
