PHP function to sanitize all data - php

Is it a good, or stupid idea to sanitize all the data that could be sqlinjected? I wrote a function that should do it, but I've never seen it done and was wondering if it was a poor idea.
The function I wrote:
function sanitizeData()
{
$_SERVER['HTTP_USER_AGENT'] = mysql_real_escape_string($_SERVER['HTTP_USER_AGENT']);
foreach(array_keys($_COOKIE) as $key)
{
$_COOKIE[$key] = mysql_real_escape_string($_COOKIE[$key]);
}
foreach(array_keys($_POST) as $key)
{
$_POST[$key] = mysql_real_escape_string($_POST[$key]);
}
foreach(array_keys($_GET) as $key)
{
$_GET[$key] = mysql_real_escape_string($_GET[$key]);
}
}

A bad idea; this is basically another version of the deprecated magic_quotes. Most of that data probably won't end up going into the database, so you'll end up escaping unnecessarily, and potentially double-escaping.
Instead, use prepared statements as needed. Look at mysqli_stmt (part of mysqli) and PDOStatement (part of PDO).

It is also very important to understand that mysql_real_escape_string do not sanitize anything.
By applying this function you do not make any data safe. That's very widespread misunderstanding.
This function merely escaping string delimiters. So, it works only for strings, quote delimited ones.
Thus, real sanitization could be only like this:
$_GET[$key] = "'".mysql_real_escape_string($_GET[$key])."'";
And even this one isn't suffice.
But as already Matt mentioned it would be very bad practice. And even more: as a matter of fact, not only input data should be properly formatted/paramertized. It's database function, not user input one! It has nothing to do with user input. Some data may come not from user input but from a file or other query or some service - it all should be properly formatted as well. That's very important to understand.
Also you are using an odd way to iterate arrays.
this one is more common:
foreach($_GET as $key => $value)
{
$_GET[$key] = mysql_real_escape_string($value);
}

Related

Function escaping before inserting in mysql

I've been working on a code that escapes your posts if they are strings before you enter them in DB, is it an good idea? Here is the code: (Updated to numeric)
static function securePosts(){
$posts = array();
foreach($_POST as $key => $val){
if(!is_numeric($val)){
if(is_string($val)){
if(get_magic_quotes_gpc())
$val = stripslashes($val);
$posts[$key] = mysql_real_escape_string($val);
}
}else
$posts[$key] = $val;
}
return $posts;
}
Then in an other file:
if(isset($_POST)){
$post = ChangeHandler::securePosts();
if(isset($post['user'])){
AddUserToDbOrWhatEver($post['user']);
}
}
Is this good or will it have bad effects when escaping before even entering it in the function (addtodborwhater)
When working with user-input, one should distinguish between validation and escaping.
Validation
There you test the content of the user-input. If you expect a number, you check if this is really a numerical input. Validation can be done as early as possible. If the validation fails, you can reject it immediately and return with an error message.
Escaping
Here you bring the user-input into a form, that can not damage a given target system. Escaping should be done as late as possible and only for the given system. If you want to store the user-input into a database, you would use a function like mysqli_real_escape_string() or a parameterized PDO query. Later if you want to output it on an HTML page you would use htmlspecialchars().
It's not a good idea to preventive escape the user-input, or to escape it for several target systems. Each escaping can corrupt the original value for other target systems, you can loose information this way.
P.S.
As YourCommonSense correctly pointed out, it is not always enough to use escape functions to be safe, but that does not mean that you should not use them. Often the character encoding is a pitfall for security efforts, and it is a good habit to declare the character encoding explicitely. In the case of mysqli this can be done with $db->set_charset('utf8'); and for HTML pages it helps to declare the charset with a meta tag.
It is ALWAYS a good idea to escape user input BEFORE inserting anything in database. However, you should also try to convert values, that you expect to be a number to integers (signed or unsigned). Or better - you should use prepared SQL statements. There is a lot of info of the latter here and on PHP docs.

Does this protect against injection attacks?

Does this protect against SQL injection attacks?
function sanitize($value) {
// Stripslashes
if (is_array($value)) {
if (get_magic_quotes_gpc()) {
$value = array_map("stripslashes", $value);
}
$value = array_map("mysql_real_escape_string", $value);
} else {
if (get_magic_quotes_gpc()) {
$value = stripslashes($value);
}
$value = mysql_real_escape_string($value);
}
return $value;
}
$_REQUEST = array_map('sanitize', $_REQUEST);
$_GET = array_map('sanitize', $_GET);
$_POST = array_map('sanitize', $_POST);
$_COOKIE = array_map('sanitize', $_COOKIE);
What could I add to sanitize() to protect against cross-site scripting?
What other channels would allow attackers to insert malicious code?
The one-word answer would be "yes". However:
If $value is an array that contains other arrays it won't be handled correctly. You should loop over $value make a recursive call to sanitize for each array you find.
It's preferable to use prepared statements instead of doing this. Of course, if you already have a complete application and are not building from scratch this can be problematic.
Finally, the other ways in which someone can subvert your application are cross-site scripting (aka CSS or XSS) and cross-site request forgeries (CSRF). There are lots of resources here on SO and on the internet you can use to get up to speed. As a starting point, protection against XSS involves calling htmlspecialchars on anything you output, while protection against CSRF involves requiring a session-specific id code for each operation your privileged users are allowed to perform on your site.
Array-safe sanitize version
function sanitize($value) {
if (is_array($value)) {
foreach($value as &$item) {
$item = sanitize($item);
}
} else {
if (get_magic_quotes_gpc()) {
$value = stripslashes($value);
}
$value = mysql_real_escape_string($value);
}
return $value;
}
Update:
For higher visibility: Bjoern's link to this question ( What's the best method for sanitizing user input with PHP? ) is really good.
No.
Use PHP Data Objects Or... Use a Database Abstraction Layer Or... Some framework that does this.
Don't write your own because:
Someone else has
Their code works fine
You can use their code for free
They thought of all the issues you don't know about yet.
It's a lot of work to do this, it's already been done, just spend twenty minutes and figure out someone else's code that does this.
If it is applied after the database connection was established, then it escapes the initial input data correctly.
Now you will have problems using such escaped values for HTML output however. And it does not protect against second order SQL injection (querying the database, then using those values as-is for a second query). And more importantly, most applications work on the input values. If you do any sort of rewriting or string matching, you might undo some of the escaping.
Hencewhy it is often recommended to apply the escaping right before the query is assembled. Nevertheless, the code itself is functional for the general case and advisable if you can't rewrite heaps of legacy code.
You should add html_entities. Most of the time you put $_POST variables into a textbox, like:
<textarea><?php echo $_POST['field']; ?></textarea>
They can mess up your HTML by filling in and do anything they want.

What are ways to improve my PHP data sanitation class?

I'm putting together a site, (we're already using javascript to prevalidate on the client side). However after getting tired of writing mysql_real_escape_string every other line. I wrote this class that only has two functions mainly focused on sanitising data in user-input/sql. My question is, what are ways to achieve easier input-sanitizing and while improving code readability?
<?php
class Safe {
function userinput($string){
$string = strip_tags($string);
$string = htmlspecialchars($string);
return $string;
}
function sql ($string){
$sqlstuff = Array("union", "select", "update", "delete", "outfile", "create");
$string = Safe::str($string);
$string = mysql_escape_string($string);
$string = str_ireplace($sqlstuff, "", $string);
return $string;
}
}
?>
Sorry, this is going to sound harsh, but your class is completely broken.
You should not be using htmlspecialchars for sanitizing input, it is only useful for escaping output. You do not need to encode HTML for insertion to the database nor should you. Only using htmlspecialchars when sending output to the browser
You should not be stripping tags from your input, you should be leaving them alone and again using htmlspecialchars when you output that data later to insure HTML tags are escaped and not interpreted by the browser
You should not be using mysql_escape_string or mysql_real_escape_string, you should be using PDO. If you are writing a new site there is absolutely no reason not to start out correctly and use PDO. Do it.
You should not be filtering out "union", "select", etc, that's dumb. Those words can appear in regular English language, and they're harmless if you're properly escaping quotes which PDO will handle for you.
Again, sorry for the harsh tone of this answer, but scrap the entire thing and use PDO. There is literally nothing salvageable here.
It's a good idea to use a class like that, particularily if it simplifies input handling. There's however a few points I'd like to comment on:
You should use mysql_real_escape_string instead of the PHP3 mysql_escape_string.
The first function should be called html or something. userinput sounds to vague and misrepresentative.
HTML escaping needs more parameters htmlspecialchars($str, ENT_QUOTES, "UTF-8") to be perfectly safe
The blacklisting of dangerous SQL keywords is not a good idea. It hints at a wrong approach to using SQL queries (if you receive queries via HTTP requests, that's your problem).
Also you should not attempt to filter them. Instead detected them, write to the error/security log, and die() immediately. If there is an attempt to circumvent security, there's no point in attempting any "cleaning" of the request.
You can also use filter_* functions that are bundled with PHP and provide you with the mechanism to filter request parameters according to specific filtering rules.
With few extra tricks, you could even filter arrays of different types of data (thanks to erisco!).
class sanitizer {
public function sanitizeValues($values, $filters) {
$defaultOptions=FILTER_FLAG_NO_ENCODE_QUOTES | FILTER_FLAG_STRIP_LOW | FILTER_NULL_ON_FAILURE;
$filters=(array)$filters;
$values=(array)$values;
foreach ($filters as $key => $filter) {
if($parts=explode('/', $key)){
$v=&$values;
foreach ($parts as $part){
$v=&$v[$part];
}
$filter=(array)$filter;
$filter[1]=isset($filter[1])?$filter[1]:$defaultOptions;
$v=filter_var($v, $filter[0], $filter[1]);
// consider if you really need this here instead of PDO
// $v=mysql_real_escape_string($v);
}
else{
$values[$key]=isset($values[$key]) ? filter_var($values[$key], $filter[0], $filter[1]) : null;
}
}
return $values;
}
}
$manager=sanitizer::sanitizeValues($_GET['manager'], array(
'manager/managerID'=>FILTER_VALIDATE_INT,
'manager/username'=>FILTER_SANITIZE_STRING,
'manager/name'=>FILTER_SANITIZE_STRING,
'manager/email'=>FILTER_SANITIZE_STRING,
'manager/phone'=>FILTER_SANITIZE_STRING,
'manager/bio'=>FILTER_SANITIZE_STRING,
'manager/enabled'=>FILTER_VALIDATE_BOOLEAN,
'manager/password'=>FILTER_SANITIZE_STRING));
This will produce an array complete with all the needed fields based on the 'manager' parameter in _GET, with all values filtered and, optionally, escaped.

mysql_real_escape_string() for entire $_REQUEST array, or need to loop through it?

Is there an easier way of safely extracting submitted variables other than the following?
if(isset($_REQUEST['kkld'])) $kkld=mysql_real_escape_string($_REQUEST['kkld']);
if(isset($_REQUEST['info'])) $info=mysql_real_escape_string($_REQUEST['info']);
if(isset($_REQUEST['freq'])) $freq=mysql_real_escape_string($_REQUEST['freq']);
(And: would you use isset() in this context?)
To escape all variables in one go:
$escapedGet = array_map('mysql_real_escape_string', $_GET);
To extract all variables into the current namespace (i.e. $foo = $_GET['foo']):
extract($escapedGet);
Please do not do this last step though. There's no need to, just leave the values in an array. Extracting variables can lead to name clashes and overwriting of existing variables, which is not only a hassle and a source of bugs but also a security risk. Also, as #BoltClock says, stick to $_GET or $_POST. Also2, as #zerkms points out, there's no point in mysql_real_escaping variables that are not supposed to be used in a database query, it may even lead to further problems.
Note that really none of this is a particularly good idea at all, you're just reincarnating magic_quotes and global_vars, which were horrible PHP practices from ages past. Use prepared statements with bound parameters via mysqli or PDO and use values through $_GET or filter_input. See http://www.phptherightway.com.
You can also use a recursive function like this to accomplish that
function sanitate($array) {
foreach($array as $key=>$value) {
if(is_array($value)) { sanitate($value); }
else { $array[$key] = mysql_real_escape_string($value); }
}
return $array;
}
sanitate($_POST);
To sanitize or validate any INPUT_GET, INPUT_POST, INPUT_COOKIE, INPUT_SERVER, or INPUT_ENV, you can use
filter_input_array — Gets external variables and optionally filters them
Filtering can be done with a callback, so you could supply mysql_real_escape_string.
This method does not allow filtering for $_REQUEST, because you should not work with $_REQUEST when the data is available in any of the other superglobals. It's potentially insecure.
The method also requires you to name the input keys, so it's not a generic batch filtering. If you want generic batch filtering, use array_map or array_walk or array_filter as shown elsewhere on this page.
Also, why are you using the old mysql extension instead of the mysqli (i for improved) extension. The mysqli extension will give you support for transactions, multiqueries and prepared statements (which eliminates the need for escaping) All features that can make your DB code much more reliable and secure.
As far as I'm concerned Starx' and Ryan's answer from Nov 19 '10 is the best solution here as I just needed this, too.
When you have multiple input fields with one name (e.g. names[]), meaning they will be saved into an array within the $_POST-array, you have to use a recursive function, as mysql_real_escape_string does not work for arrays.
So this is the only solution to escape such a $_POST variable.
function sanitate($array) {
foreach($array as $key=>$value) {
if(is_array($value)) { sanitate($value); }
else { $array[$key] = mysql_real_escape_string($value); }
}
return $array;
}
sanitate($_POST);
If you use mysqli extension and you like to escape all GET variables:
$escaped_get = array_map(array($mysqli, 'real_escape_string'), $_GET);
As an alternative, I can advise you to use PHP7 input filters, which provides a shortcut to sql escaping. I'd not recommend it per se, but it spares creating localized variables:
$_REQUEST->sql['kkld']
Which can be used inline in SQL query strings, and give an extra warning should you forget it:
mysql_query("SELECT x FROM y WHERE z = '{$_REQUEST->sql['kkld']}'");
It's syntactically questionable, but allows you escaping only those variables that really need it. Or to emulate what you asked for, use $_REQUEST->sql->always();

PHP / MYSQL: Sanitizing user input - is this a bad idea?

I have one "go" script that fetches any other script requested and this is what I wrote to sanitize user input:
foreach ($_REQUEST as $key => $value){
if (get_magic_quotes_gpc())
$_REQUEST[$key] = mysql_real_escape_string(stripslashes($value));
else
$_REQUEST[$key] = mysql_real_escape_string($value);
}
I haven't seen anyone else use this approach. Is there any reason not to?
EDIT - amended for to work for arrays:
function mysql_escape($thing) {
if (is_array($thing)) {
$escaped = array();
foreach ($thing as $key => $value) {
$escaped[$key] = mysql_escape($value);
}
return $escaped;
}
// else
if (get_magic_quotes_gpc()) $thing = stripslashes($thing);
return mysql_real_escape_string($thing);
}
foreach ($_REQUEST as $key => $value){
$_REQUEST[$key] = mysql_escape($value);
}
I find it much better to escape the data at the time it is used, not on the way in. You might want to use that data in JSON, XML, Shell, MySQL, Curl, or HTML and each will have it's own way of escaping the data.
Lets have a quick review of WHY escaping is needed in different contexts:
If you are in a quote delimited string, you need to be able to escape the quotes.
If you are in xml, then you need to separate "content" from "markup"
If you are in SQL, you need to separate "commands" from "data"
If you are on the command line, you need to separate "commands" from "data"
This is a really basic aspect of computing in general. Because the syntax that delimits data can occur IN THE DATA, there needs to be a way to differentiate the DATA from the SYNTAX, hence, escaping.
In web programming, the common escaping cases are:
1. Outputting text into HTML
2. Outputting data into HTML attributes
3. Outputting HTML into HTML
4. Inserting data into Javascript
5. Inserting data into SQL
6. Inserting data into a shell command
Each one has a different security implications if handled incorrectly. THIS IS REALLY IMPORTANT! Let's review this in the context of PHP:
Text into HTML:
htmlspecialchars(...)
Data into HTML attributes
htmlspecialchars(..., ENT_QUOTES)
HTML into HTML
Use a library such as HTMLPurifier to ENSURE that only valid tags are present.
Data into Javascript
I prefer json_encode. If you are placing it in an attribute, you still need to use #2, such as
Inserting data into SQL
Each driver has an escape() function of some sort. It is best. If you are running in a normal latin1 character set, addslashes(...) is suitable. Don't forget the quotes AROUND the addslashes() call:
"INSERT INTO table1 SET field1 = '" . addslashes($data) . "'"
Data on the command line
escapeshellarg() and escapeshellcmd() -- read the manual
--
Take these to heart, and you will eliminate 95%* of common web security risks! (* a guess)
If you have arrays in your $_REQUEST their values won't be sanitized.
I've made and use this one:
<?php
function _clean($var){
$pattern = array("/0x27/","/%0a/","/%0A/","/%0d/","/%0D/","/0x3a/",
"/union/i","/concat/i","/delete/i","/truncate/i","/alter/i","/information_schema/i",
"/unhex/i","/load_file/i","/outfile/i","/0xbf27/");
$value = addslashes(preg_replace($pattern, "", $var));
return $value;
}
if(isset($_GET)){
foreach($_GET as $k => $v){
$_GET[$k] = _clean($v);
}
}
if(isset($_POST)){
foreach($_POST as $k => $v){
$_POST[$k] = _clean($v);
}
}
?>
Your approach tries to sanitize all the request data for insertion into the database, but what if you just wanted to output it? You will have unnecessary backslashes in your output. Also, escaping is not a good strategy to protect from SQL exceptions anyway. By using parametrized queries (e.g. in PDO or in MySQLi) you "pass" the problem of escaping to the abstraction layer.
Apart from the lack of recursion into arrays and the unnecessary escaping of, say, integers, this approach encodes data for use in an SQL statement before sanitization. mysql_real_escape_string() escapes data, it doesn't sanitize it -- escaping and sanitizing aren't the same thing.
Sanitization is the task many PHP scripts have of scrutinizing input data for acceptability before using it. I think this is better done on data that hasn't been escaped. I generally don't escape data until it goes into the SQL. Those who prefer to use Prepared Statements achieve the same that way.
One more thing: If input data can include utf8 strings, it seems these ought to be validated before escaping. I often use a recursive utf8 cleaner on $_POST before sanitization.

Categories