I'm putting together a site, (we're already using javascript to prevalidate on the client side). However after getting tired of writing mysql_real_escape_string every other line. I wrote this class that only has two functions mainly focused on sanitising data in user-input/sql. My question is, what are ways to achieve easier input-sanitizing and while improving code readability?
<?php
class Safe {
function userinput($string){
$string = strip_tags($string);
$string = htmlspecialchars($string);
return $string;
}
function sql ($string){
$sqlstuff = Array("union", "select", "update", "delete", "outfile", "create");
$string = Safe::str($string);
$string = mysql_escape_string($string);
$string = str_ireplace($sqlstuff, "", $string);
return $string;
}
}
?>
Sorry, this is going to sound harsh, but your class is completely broken.
You should not be using htmlspecialchars for sanitizing input, it is only useful for escaping output. You do not need to encode HTML for insertion to the database nor should you. Only using htmlspecialchars when sending output to the browser
You should not be stripping tags from your input, you should be leaving them alone and again using htmlspecialchars when you output that data later to insure HTML tags are escaped and not interpreted by the browser
You should not be using mysql_escape_string or mysql_real_escape_string, you should be using PDO. If you are writing a new site there is absolutely no reason not to start out correctly and use PDO. Do it.
You should not be filtering out "union", "select", etc, that's dumb. Those words can appear in regular English language, and they're harmless if you're properly escaping quotes which PDO will handle for you.
Again, sorry for the harsh tone of this answer, but scrap the entire thing and use PDO. There is literally nothing salvageable here.
It's a good idea to use a class like that, particularily if it simplifies input handling. There's however a few points I'd like to comment on:
You should use mysql_real_escape_string instead of the PHP3 mysql_escape_string.
The first function should be called html or something. userinput sounds to vague and misrepresentative.
HTML escaping needs more parameters htmlspecialchars($str, ENT_QUOTES, "UTF-8") to be perfectly safe
The blacklisting of dangerous SQL keywords is not a good idea. It hints at a wrong approach to using SQL queries (if you receive queries via HTTP requests, that's your problem).
Also you should not attempt to filter them. Instead detected them, write to the error/security log, and die() immediately. If there is an attempt to circumvent security, there's no point in attempting any "cleaning" of the request.
You can also use filter_* functions that are bundled with PHP and provide you with the mechanism to filter request parameters according to specific filtering rules.
With few extra tricks, you could even filter arrays of different types of data (thanks to erisco!).
class sanitizer {
public function sanitizeValues($values, $filters) {
$defaultOptions=FILTER_FLAG_NO_ENCODE_QUOTES | FILTER_FLAG_STRIP_LOW | FILTER_NULL_ON_FAILURE;
$filters=(array)$filters;
$values=(array)$values;
foreach ($filters as $key => $filter) {
if($parts=explode('/', $key)){
$v=&$values;
foreach ($parts as $part){
$v=&$v[$part];
}
$filter=(array)$filter;
$filter[1]=isset($filter[1])?$filter[1]:$defaultOptions;
$v=filter_var($v, $filter[0], $filter[1]);
// consider if you really need this here instead of PDO
// $v=mysql_real_escape_string($v);
}
else{
$values[$key]=isset($values[$key]) ? filter_var($values[$key], $filter[0], $filter[1]) : null;
}
}
return $values;
}
}
$manager=sanitizer::sanitizeValues($_GET['manager'], array(
'manager/managerID'=>FILTER_VALIDATE_INT,
'manager/username'=>FILTER_SANITIZE_STRING,
'manager/name'=>FILTER_SANITIZE_STRING,
'manager/email'=>FILTER_SANITIZE_STRING,
'manager/phone'=>FILTER_SANITIZE_STRING,
'manager/bio'=>FILTER_SANITIZE_STRING,
'manager/enabled'=>FILTER_VALIDATE_BOOLEAN,
'manager/password'=>FILTER_SANITIZE_STRING));
This will produce an array complete with all the needed fields based on the 'manager' parameter in _GET, with all values filtered and, optionally, escaped.
Related
As always I start this saying that I am learning.
I saw in several books and even here, that a lot of user when we are talking about sanitize, for example, Form>Input>Submit, they use
function sanitizeexample($param)
{
$param = stripslashes($param);
$param = strip_tags($param);
$param = htmlentities($param);
return $param;
}
$name = sanitizeexample($_POST['name']);
Instead of JUST:
function sanitizeexample($param)
{
$param = htmlentities($param);
return $param;
}
$name = sanitizeexample($_POST['name']);
So here the question. Do stripslashes() and strip_tags() provide something else regarding to security? Or it´s enough with htmlentities().
And I´m asking JUST to know which is the best to use.
Whether strip_tags() provides a value-add is dependent on your particular use case. If you htmlentities() a string that contains html tags, you're going to get the raw html content escaped and rendered on the page. The example you give is probably making the assumption that this is not what you want, and so by doing strip_tags() first, html tags are removed.
stripslashes is the inverse to addslashes. In modern (PHP >= 5.4) PHP code, this is not necessary. On legacy systems, with magic_quotes_gpc enabled, user input from request variables are automagically escaped with addslashes so as to make them "safe" for direct use in database queries. This has widely been considered a Bad Idea (because it's not actually safe, for many reasons) and magic_quotes has been removed. Accordingly, you would now not normally need to stripslashes() user input. (Whether you actually need to is going to be dependent on PHP version and ini settings.)
(Note that you would still need to properly escape any content going into your database, but that is better done with parameterized queries or database-specific escaping functions, both of which are outside the scope of this question.)
It depends on your goals:
if you're getting user's data passed from html form - you should
definitely apply strip_tags(trim($_POST['name'])) approach to
sanitize possible insecure and excessive data.
if you are receiving uploaded user's file content and need to save
content formatting - you have to consider how to safely process and
store such files making some specific(selective) sanitizing
I've got a simple question:
When is it best to sanitize user input?
And which one of these is considered the best practice:
Sanitize data before writing to database.
Save raw data and sanitize it in the view.
For example use HTML::entities() and save result to database.
Or by using HTML methods in the views because in this case laravel by default uses HTML::entities().
Or maybe by using the both.
EDIT: I found interesting example http://forums.laravel.com/viewtopic.php?id=1789. Are there other ways to solve this?
I would say you need both locations but for different reasons. When data comes in you should validate the data according to the domain, and reject requests that do not comply. As an example, there is no point in allowing a tag (or text for that matter) if you expect a number. For a parameter representing.a year, you may even want to check that it is within some range.
Sanitization kicks in for free text fields. You can still do simple validation for unexpected characters like 0-bytes. IMHO it's best to store raw through safe sql (parameterized queries) and then correctly encode for output. There are two reasons. The first is that if your sanitizer has a bug, what do you do with all the data in your database? Resanitizing can have unwanted consequences. Secondly you want to do contextual escaping, for whichever output you are using (JSON, HTML, HTML attributes etc.)
I have a full article on input filtering in Laravel, you might find it useful http://usman.it/xss-filter-laravel/, here is the excerpt from this article:
You can do a global XSS clean yourself, if you don’t have a library to write common methods you may need frequently then I ask you to create a new library Common in application/library. Put this two methods in your Common library:
/*
* Method to strip tags globally.
*/
public static function global_xss_clean()
{
// Recursive cleaning for array [] inputs, not just strings.
$sanitized = static::array_strip_tags(Input::get());
Input::merge($sanitized);
}
public static function array_strip_tags($array)
{
$result = array();
foreach ($array as $key => $value) {
// Don't allow tags on key either, maybe useful for dynamic forms.
$key = strip_tags($key);
// If the value is an array, we will just recurse back into the
// function to keep stripping the tags out of the array,
// otherwise we will set the stripped value.
if (is_array($value)) {
$result[$key] = static::array_strip_tags($value);
} else {
// I am using strip_tags(), you may use htmlentities(),
// also I am doing trim() here, you may remove it, if you wish.
$result[$key] = trim(strip_tags($value));
}
}
return $result;
}
Then put this code in the beginning of your before filter (in application/routes.php):
//Our own method to defend XSS attacks globally.
Common::global_xss_clean();
I just found this question. Another way to do it is to enclose dynamic output in triple brackets like this {{{ $var }}} and blade will escape the string for you. That way you can keep the potentially dangerous characters in case they are important somewhere else in the code and display them as escaped strings.
i'd found this because i was worried about xss in laravel, so this is the packages gvlatko
it is easy:
To Clear Inputs = $cleaned = Xss::clean(Input::get('comment');
To Use in views = $cleaned = Xss::clean(Input::file('profile'), TRUE);
It depends on the user input. If you're generally going to be outputting code they may provide (for example maybe it's a site that provides code snippets), then you'd sanitize on output. It depends on the context. If you're asking for a username, and they're entering HTML tags, your validation should be picking this up and going "no, this is not cool, man!"
If it's like the example I stated earlier (code snippets), then let it through as RAW (but be sure to make sure your database doesn't break), and sanitize on output. When using PHP, you can use htmlentities($string).
I'm wondering if there is a significant downside to using the following code:
if(isset($_GET)){
foreach($_GET as $v){
$v = htmlspecialchars($v);
}
}
I realize that it probably isn't necessary to use htmlspecialchars on each variable. Anyone know offhand if this is good to do?
UPDATE:
Because I don't think my above code would work, I'm updating this with the code that I'm using (despite the negativity towards the suggestions). :)
if(isset($_GET)){
foreach($_GET as $k=>$v){
$_GET[$k] = htmlspecialchars($v);
}
}
This totally depends on what you want to do.
In general, the answer is "no", and you should only escape data specifically for their intended purpose. Randomly escaping data without purpose isn't helping, and it just causes further confusion, as you have to keep track of what's been escaped and how.
In short, keep your data stored raw, and escape it specifically for its intended use when you use it:
for HTML output, use htmlentities().
for shell command names, use escapeshellcmd().
for shell arguments, use escapeshellarg().
for building a GET URL string, use urlencode() on the parameter values.
for database queries, use the respective database escape mechanism (or prepared statements).
This reasoning applies recursively. So if you want to write a link to a GET URL to the HTML output, it'd be something like this:
echo "click";
It'd be terrible if at that point you'd have to remember if $var had already previously been escaped, and how.
Blanket escaping isn't necessary, and it's possibly harmful to the data. Don't do it.
Apply htmlspecialchars() only to data that you are about to output in a HTML page - ideally immediately before, or directly when you output it.
It won't affect numbers, but it can backfire for string parameters which are not intended to be put in HTML code.
You have to treat each key different depending on its meaning. Possibility of generalization also depends on your application.
The way you're doing it won't work. You need to make $v a reference, and it breaks for anything requiring recursion ($_GET['array'][0], for example).
if(isset($_GET)) {
foreach($_GET as &$v) {
$v = htmlspecialchars($v);
}
}
I'm using $_POST and aware about mysql exploit, I decided to use this function on the top of my page, therefore all POST will be safe:
Can you tell me if I miss something and this function will really do the job as I think it will?
function clean_post(){
if ( $_POST){
foreach ($_POST as $k => $v) {
$_POST[$k]=stripslashes($v);
$_POST[$k]=mysql_real_escape_string($v);
$_POST[$k]=preg_replace('/<.*>/', "", "$v");
}
}
if ( $_COOKIE){
foreach ($_COOKIE as $k => $v) {
$_COOKIE[$k]=stripslashes($v);
$_COOKIE[$k]=mysql_real_escape_string($v);
$_COOKIE[$k]=preg_replace('/<.*>/', "", "$v");
}
}
}
It will also remove all html tag, a safest option to output the result might be to use:
<pre>
$foo
</pre>
Cheers!
Cheers!
I think it's a bad idea to do this. It will corrupt the data your users enter even before it hits the database. This approach will also encourage you to use lazy coding where you consistently don't escape data because you believe that all your data is already "clean". This will come back to bite you one day when you do need to output some unsafe characters and you either forget to escape them or you aren't really sure which function you need to call so you just try something and hope that it works.
To do it properly you should ensure that magic quotes is disabled and only escape data when necessary, using precisely the correct escaping method - no more, no less.
There are some problems with it.
First you apply functions on types that doesn't need them, your integers for example needs only a (int) cast to be secure.
Second you do not secure lenght, when you're requesting a '12 chars string' it would be a good idea to ensure you've got only 12 chars, and not 2048. Limiting size is really something your attackers will not like.
Third in your foreach loop you have a $v variable, you assign 3 times a function on $v to $_POST[$k]. So the 1st two assignements are lost when the 3rd occurs...
Then all the things previous people said are right :-)
The rule is apply the filter at the right moment for the right output. HTML output need an html filter (htmlspecialchars), but the database doesn't need it, it need a database escaping. Let's say you want to extract data from your database to build a CSV or a PDF, HTML escaping will make you life harder. You'll need CSV escaping at this time, or PDF escaping.
Finally it is effectively hard to remember if you are manipulating a data which is already well escaped for your output. And I recommend you an excellent read on Joel on Software about Apps Hungarian. The text is quite long, but very good, and the web escaping sequence is used as an example on why Apps Hungarian is good (even if System Hungarain is bad).
Hi this is my first answer for any question asked on web so please review it.
Put this code in top of your script and no need to assign these posted values to any variables for doing the same job of making the input data safe for database. Just use $_POST values as it is in your query statements.
foreach ($_POST as $k => $v) {
if(!is_array($_POST[$k]) ) { //checks for a checkbox array & so if present do not escape it to protect data from being corrupted.
if (ini_get('magic_quotes_gpc')) {
$v = stripslashes($v);
}
$v = preg_replace('/<.*>/', "", "$v"); //replaces html chars
$_POST[$k]= mysql_real_escape_string(trim($v));
}
}
Don't forget $_GET[]
if ($_POST OR $_GET)
Also you can add strip_tags()
I don't know whether your function is correct or not, but the principle is certainly incorrect. You want to escape only where you need to, i.e. just before you pass things into MySQL (in fact you don't even want to do that, ideally; use bound parameters).
There are plenty of situations where you might want the raw data as passed in over the HTTP request. With your approach, there's no ability to do so.
In general, I don't think it's that good of an idea.
Not all post data necessarily goes into MySQL, so there is no need to escape it if it doesn't. That said, using something like PDO and prepared statements is a better way, mysql_* functions are deprecated.
The regular expression could destroy a lot of potentially valid text. You should worry about things like HTML when outputting, not inputting. Furthermore, use a function like strip_tags or htmlspecilchars to handle this.
stripslashes is only necessary if magic quotes are enabled (which they shouldn't be, but always is possible)
When working with stripslashes I'd use get_magic_quotes_gpc():
if (get_magic_quotes_gpc()) {
$_POST[$k]=stripslashes($v);
}
Otherwise you'll over-strip.
I have a lot of user inputs from $_GET and $_POST... At the moment I always write mysql_real_escape_string($_GET['var'])..
I would like to know whether you could make a function that secures, escapes and cleans the $_GET/$_POST arrays right away, so you won't have to deal with it each time you are working with user inputs and such.
I was thinking of an function, e.g cleanMe($input), and inside it, it should do mysql_real_escape_string, htmlspecialchars, strip_tags, stripslashes (I think that would be all to make it clean & secure) and then return the $input.
So is this possible? Making a function that works for all $_GET and $_POST, so you would do only this:
$_GET = cleanMe($_GET);
$_POST = cleanMe($_POST);
So in your code later, when you work with e.g $_GET['blabla'] or $_POST['haha'] , they are secured, stripped and so on?
Tried myself a little:
function cleanMe($input) {
$input = mysql_real_escape_string($input);
$input = htmlspecialchars($input, ENT_IGNORE, 'utf-8');
$input = strip_tags($input);
$input = stripslashes($input);
return $input;
}
The idea of a generic sanitation function is a broken concept.
There is one right sanitation method for every purpose. Running them all indiscriminately on a string will often break it - escaping a piece of HTML code for a SQL query will break it for use in a web page, and vice versa. Sanitation should be applied right before using the data:
before running a database query. The right sanitation method depends on the library you use; they are listed in How can I prevent SQL injection in PHP?
htmlspecialchars() for safe HTML output
preg_quote() for use in a regular expression
escapeshellarg() / escapeshellcmd() for use in an external command
etc. etc.
Using a "one size fits all" sanitation function is like using five kinds of highly toxic insecticide on a plant that can by definition only contain one kind of bug - only to find out that your plants are infested by a sixth kind, on which none of the insecticides work.
Always use that one right method, ideally straight before passing the data to the function. Never mix methods unless you need to.
There is no point in simply passing the input through all these functions. All these functions have different meanings. Data doesn't get "cleaner" by calling more escape-functions.
If you want to store user input in MySQL you need to use only mysql_real_escape_string. It is then fully escaped to store safely in the database.
EDIT
Also note the problems that arise with using the other functions. If the client sends for instance a username to the server, and the username contains an ampersand (&), you don;t want to have called htmlentities before storing it in the database because then the username in the database will contain &.
You're looking for filter_input_array().
However, I suggest only using that for business-style validation/sanitisation and not SQL input filtering.
For protection against SQL injection, use parametrised queries with mysqli or PDO.
The problem is, something clean or secure for one use, won't be for another : cleaning for part of a path, for part of a mysql query, for html output (as html, or in javascript or in an input's value), for xml may require different things which contradicts.
But, some global things can be done.
Try to use filter_input to get your user's input. And use prepared statements for your SQL queries.
Although, instead of a do-it-all function, you can create some class which manages your inputs. Something like that :
class inputManager{
static function toHTML($field){
$data = filter_input(INPUT_GET, $field, FILTER_SANITIZE_SPECIAL_CHARS);
return $data;
}
static function toSQL($field, $dbType = 'mysql'){
$data = filter_input(INPUT_GET, $field);
if($dbType == 'mysql'){
return mysql_real_escape_string($data);
}
}
}
With this kind of things, if you see any $_POST, $GET, $_REQUEST or $_COOKIE in your code, you know you have to change it. And if one day you have to change how you filter your inputs, just change the class you've made.
May I suggest to install "mod_security" if you're using apache and have full access to server?!
It did solve most of my problems. However don't rely in just one or two solutions, always write secure code ;)
UPDATE
Found this PHP IDS (http://php-ids.org/); seems nice :)