PHP input GET vars sanitizing - php

For my application, written in PHP 5+, I have a common.php which is included from all other pages. Within that I have an include sanitize.php which aims to sanitise any input vars used in the URL. So, targetting $_GET[] values.
This is just to have one place where I can tidy any vars, if used, and use them in the code later.
There seems to be no tidy way, I've seen, to sanitise based on expected/desired inputs. The method I initially looked at was this sanitize.php having a foreach to loop through any vars, lookup the type of sanitization required, and then add the cleaned vars to a separate array for use within my code.
Instead of using PHP sanitization filters, to keep it standard, I thought I'd use regex. Types I want are alphaonly, alphanumeric, email, password. Although "password" would allow some special chars, I want to remove or even escape potentially "hazardous" ones like ' " to then be included into a mysql DB. We have a european userbase so different locales are possible, but I'm hoping that won't be too much of an issue.
Would this be a "good" solution to start from, or am I trying to reinvent the wheel?
Random Page
/mypage.php?c=userid&p=lkasjdlakjsdlakj&z=....
(use SANITIZED_SAFE_INPUT_VARS variable only)
sanitize.php
var aryAllowedGetParamNames = array(
"c" => "alphaonly", //login
"p" => "alphaemail", //password
"e" => "email" //email
//...
);
var sanitizeTypes = array (
"alphaonly" => "[a-zA-Z]",
"alphanumeric" => "[a-zA-Z0-9]",
"email" => "[a-zA-Z0-9]...etc"
);
var SANITIZED_SAFE_INPUT_VARS = array();
foreach ($_GET as $key => $value) {
//apply regex and add value to SANITIZED_SAFE_INPUT_VARS
}
EDIT
There seems to be some opinion about the use of passwords in the URL. I'll explain in a little more detail. Instead of using a POST login prompt with username and password, I am using an ajax async call to _db_tryLogin.php with parameters for userid and password. The username is ALWAYS a 6-ALPHA-only text string, and the password is an md5 of what was typed. I'm aware of the opinions on MD5 not being "safe enough".
The JS currently MD5s the password and sends that to the _db_tryLogin.php.
-> async : _db_login.php?c=ABCDEF&p=SLKDauwfLKASFUWPOjkjafkKoAWOIFHF2733287
This will return an async response of "1" or "0". Both will cause the page to refresh, but if the _db_tryLogin.php page detects the password and userid matches one DB record, then session variables are set and the site knows the user is logged in.
I used MD5 for the async request just to quickly hash the password so it's not transmitted in plaintext.
The _db_tryLogin.php takes the password, which is md5(plainpass) adds a SALT and MD5s again, and then this is what is compared against the usertable in the DB.
DB password stored = md5(SALT.md5(plainpass))

I would to start just regex each variable , apply null if it doesn't match the requirements. Either test what it SHOULD have only, or what it shouldn't have, whichever is smaller:
$safeValue = (preg_match('/^[a-zA-Z0-9]{0,5}$/',$value) ? $value : "");
ALONG with prepared statements with parameter input aka
$query = "SELECT x FROM table WHERE id=?";
bind_param("si",$var,$var)
PHP also comes in with built filters, such as email and others). Example: filter_var($data, FILTER_SANITIZE_EMAIL)
http://php.net/manual/en/filter.filters.sanitize.php

What are you sanitising against? If you're [only] trying to protect your SQL database you're doing it wrong, and should be looking into Prepared Statements.
USER SUBMITTED DATA SHOULD NEVER BE TRUSTED. Accepted, yes, trusted - No.
Rather than going through a long tedious process of allowing certain chararacters, simply disallow (ie remove) characters you don't want to accept, such as non-alphanumeric or backtick characters etc. It may also save you a lot of efforts to use the PHP strip_tags() function.
1) Create your function in your include file. I would recommend creating it in an abstract Static Class, but that's a little beyond the scope of this answer.
2) Within this function/class method add your definitions for what bad characters you're looking for, and the data that these checks would apply to. You seem to have a good idea of your logic process, but be aware that there is no definitively correct code answer, as each programmers' needs from a string are different.
3) using the criteria defined in (2) you can then use the Regex to remove non-valid characters to return a "safe" set of variables.
example:
// Remove backtick, single and double quotes from a variable.
// using PCRE Regex.
$data = preg_relace("/[`"']/","",$data);
4) Use the PHP function strip_tags() to do just that and remove HTML and PHP code from a string.
5) For email validation use the PHP $email = filter_var($data, FILTER_SANITIZE_EMAIL); function, it will be far better than your own simple regex. Use PHP Filter Validations they are intended exactly for your sort of situation.
6) NEVER trust the output data, even if it passes all the checks and regexes you can give it, something may still get through. ALWAYS be VERY wary of user submitted data. NEVER trust it.
7) Use Prepared Statements for your SQL interactions.
8) As a shortcut for number types (int / float) you can use PHP type-casting to force a given varibles to being a certain type and destroying any chance of it being anything else:
$number = $_GET['number']; //can be anything.
$number = (int)$_GET['number']; //must be an integer or zero.
Notes:
Passwords should not be a-z only, but should be as many characters as you are able to choose from, the more the better.
If the efforts you are actioning here are for the case of protecting database security and integrity, you're doing it wrong, and should be using Prepared Statements for your MySQL interactions.
Stop using var to declare variables as this is from PHP4 and is VERY old, it is far better to use the Variable preconditional $ (such as $variable = true;) .
You state:
We have a european userbase so different locales are possible
To which I would highly recommend exploring PHP mb_string functions because natively PHP is not mutlibyte safe.

Related

Sanitizing global arrays in PHP

I'm trying to find the best way to sanitize requests in PHP.
From what I've read I learned that GET variables should be sanitized only when they're being displayed, not at the beginning of the "request flow". Post variables (which don't come from the database) either.
I can see several problems here:
Of course I can create functions sanitizing these variables, and by calling something like Class::post('name'), or Class::get('name') everything will be safe. But what if a person who will use my code in the future will forget about it and use casual $_POST['name'] instead of my function? Can I provide, or should I provide a bit of security here?
There is never a one-size-fits-all sanitization. "Sanitization" means you manipulate a value to conform to certain properties. For example, you cast something that's supposed to be a number to a number. Or you strip <script> tags out of supposed HTML. What and how exactly to sanitize depends on what the value is supposed to be and whether you need to sanitize at all. Sanitizing HTML for whitelisted tags is really complex, for instance.
Therefore, there's no magic Class::sanitize which fits everything at once. Anybody using your code needs to think about what they're trying to do anyway. If they just blindly use $_POST values as is, they have already failed and need to turn in their programmer card.
What you always need to do is to escape based on the context. But since that depends on the context, you only do it where necessary. You don't blindly escape all all $_POST values, because you have no idea what you're escaping for. See The Great Escapism (Or: What You Need To Know To Work With Text Within Text) for more background information on the whole topic.
The variables are basically "sanitized" when PHP reads them. Meaning if I were to submit
"; exec("some evil command"); $blah="
Then it won't be a problem as far as PHP is concerned - you will get that literal string.
However, when passing it on from PHP to something else, it's important to make sure that "something else" won't misinterpret the string. So, if it's going into a MySQL database then you need to escape it according to MySQL rules (or use prepared statements, which will do this for you). If it's going into HTML, you need to encode < as < as a minimum. If it's going into JavaScript, then you need to JSON-encode it, and so on.
You can do something like this... Not foolproof, but it works..
foreach($_POST as $key => $val)
{
//do sanitization
$val = Class::sanitize($val);
$_POST[$key] = $val;
}
Edit: You'd want to put this as close to the header as you can get. I usually put mine in the controller so it's executed from the __construct() automagically.
Replace the $_POST array with a sanitizer object which is beheaving like an array.

PHP Injection from HTTP GET data used as PHP array key value

i would like to know if there is a possible injection of code (or any other security risk like reading memory blocks that you weren't supposed to etc...) in the following scenario, where unsanitized data from HTTP GET is used in code of PHP as KEY of array.
This supposed to transform letters to their order in alphabet. a to 1, b to 2, c to 3 .... HTTP GET "letter" variable supposed to have values letters, but as you can understand anything can be send to server:
HTML:
http://www.example.com/index.php?letter=[anything in here, as dirty it can gets]
PHP:
$dirty_data = $_GET['letter'];
echo "Your letter's order in alphabet is:".Letter2Number($dirty_data);
function Letter2Number($my_array_key)
{
$alphabet = array("a" => "1", "b" => "2", "c" => "3");
// And now we will eventually use HTTP GET unsanitized data
// as a KEY for a PHP array... Yikes!
return $alphabet[$my_array_key];
}
Questions:
Do you see any security risks?
How can i sanitize HTTP data to be able use them in code as KEY of an array?
How bad is this practice?
I can't see any problems with this practice. Anything you... errr... get from $_GET is a string. It will not pose any security threat whatsoever unless you call eval() on it. Any string can be used as a PHP array key, and it will have no adverse effects whatsoever (although if you use a really long string, obviously this will impact memory usage).
It's not like SQL, where you are building code to be executed later - your PHP code has already been built and is executing, and the only way you can modify the way in which it executes at runtime is by calling eval() or include()/require().
EDIT
Thinking about it there are a couple of other ways, apart from eval() and include(), that this input could affect the operation of the script, and that is to use the supplied string to dynamically call a function/method, instantiate an object, or in variable variables/properties. So for example:
$userdata = $_GET['userdata'];
$userdata();
// ...or...
$obj->$userdata();
// ...or...
$obj = new $userdata();
// ...or...
$someval = ${'a_var_called_'.$userdata};
// ...or...
$someval = $obj->$userdata;
...would be a very bad idea, if you were to do it with sanitizing $userdata first.
However, for what you are doing, you do not need to worry about it.
Any external received from GET, POST, FILE, etc. should be treated as filthy and sanitized appropriately. How and when you sanitize depends on when the data is going to be used. If you are going to store it to the DB, it needs to be escaped (to avoid SQL Injection. See PDO for example). Escaping is also necessary when running an OS command based on user data such as eval or attempting to read a file (like reading ../../../etc/passwd). If it's going to be displayed back to the user, it needs to be encoded (to avoid html injection. See htmlspecialchars for example).
You don't have to sanitize data for the way you are using it above. In fact, you should only escape for storage and encode for display, but otherwise leave data raw. Of course, you may want to perform your own validation on the data. For example, you may want dirty_data to be in the list of [a, b, c] and if not echo it back to the user. Then you would have to encode it.
Any well-known OS is not going to have a problem even if the user managed to attempt to read an invalid memory address.
Presumably this array's contents are meant to be publicly accessible in this way, so no.
Run it through array_key_exists()
Probably at least a little bad. Maybe there's something that could be done with a malformed multibyte string or something that could trigger some kind of overflow on a poorly-configured server... but that's pure (ignorant) speculation on my part.

The ultimate clean/secure function

I have a lot of user inputs from $_GET and $_POST... At the moment I always write mysql_real_escape_string($_GET['var'])..
I would like to know whether you could make a function that secures, escapes and cleans the $_GET/$_POST arrays right away, so you won't have to deal with it each time you are working with user inputs and such.
I was thinking of an function, e.g cleanMe($input), and inside it, it should do mysql_real_escape_string, htmlspecialchars, strip_tags, stripslashes (I think that would be all to make it clean & secure) and then return the $input.
So is this possible? Making a function that works for all $_GET and $_POST, so you would do only this:
$_GET = cleanMe($_GET);
$_POST = cleanMe($_POST);
So in your code later, when you work with e.g $_GET['blabla'] or $_POST['haha'] , they are secured, stripped and so on?
Tried myself a little:
function cleanMe($input) {
$input = mysql_real_escape_string($input);
$input = htmlspecialchars($input, ENT_IGNORE, 'utf-8');
$input = strip_tags($input);
$input = stripslashes($input);
return $input;
}
The idea of a generic sanitation function is a broken concept.
There is one right sanitation method for every purpose. Running them all indiscriminately on a string will often break it - escaping a piece of HTML code for a SQL query will break it for use in a web page, and vice versa. Sanitation should be applied right before using the data:
before running a database query. The right sanitation method depends on the library you use; they are listed in How can I prevent SQL injection in PHP?
htmlspecialchars() for safe HTML output
preg_quote() for use in a regular expression
escapeshellarg() / escapeshellcmd() for use in an external command
etc. etc.
Using a "one size fits all" sanitation function is like using five kinds of highly toxic insecticide on a plant that can by definition only contain one kind of bug - only to find out that your plants are infested by a sixth kind, on which none of the insecticides work.
Always use that one right method, ideally straight before passing the data to the function. Never mix methods unless you need to.
There is no point in simply passing the input through all these functions. All these functions have different meanings. Data doesn't get "cleaner" by calling more escape-functions.
If you want to store user input in MySQL you need to use only mysql_real_escape_string. It is then fully escaped to store safely in the database.
EDIT
Also note the problems that arise with using the other functions. If the client sends for instance a username to the server, and the username contains an ampersand (&), you don;t want to have called htmlentities before storing it in the database because then the username in the database will contain &.
You're looking for filter_input_array().
However, I suggest only using that for business-style validation/sanitisation and not SQL input filtering.
For protection against SQL injection, use parametrised queries with mysqli or PDO.
The problem is, something clean or secure for one use, won't be for another : cleaning for part of a path, for part of a mysql query, for html output (as html, or in javascript or in an input's value), for xml may require different things which contradicts.
But, some global things can be done.
Try to use filter_input to get your user's input. And use prepared statements for your SQL queries.
Although, instead of a do-it-all function, you can create some class which manages your inputs. Something like that :
class inputManager{
static function toHTML($field){
$data = filter_input(INPUT_GET, $field, FILTER_SANITIZE_SPECIAL_CHARS);
return $data;
}
static function toSQL($field, $dbType = 'mysql'){
$data = filter_input(INPUT_GET, $field);
if($dbType == 'mysql'){
return mysql_real_escape_string($data);
}
}
}
With this kind of things, if you see any $_POST, $GET, $_REQUEST or $_COOKIE in your code, you know you have to change it. And if one day you have to change how you filter your inputs, just change the class you've made.
May I suggest to install "mod_security" if you're using apache and have full access to server?!
It did solve most of my problems. However don't rely in just one or two solutions, always write secure code ;)
UPDATE
Found this PHP IDS (http://php-ids.org/); seems nice :)

How to santize user inputs in PHP?

Is this enough?
$listing = mysql_real_escape_string(htmlspecialchars($_POST['listing']));
Depends - if you are expecting text, it's just fine, although you shouldn't put the htmlspecialchars in input. Do it in output.
You might want to read this: What's the best method for sanitizing user input with PHP?
you can use php function : filter_var()
a good tutorial in the link :
http://www.phpro.org/tutorials/Filtering-Data-with-PHP.html
example to sanitize integer :
To sanitize an Integer is simple with the FILTER_SANITIZE_INT filter. This filter strips out all characters except for digits and . + -
It is simple to use and we no longer need to boggle our minds with regular expressions.
<?php
/*** an integer ***/
$int = "abc40def+;2";
/*** sanitize the integer ***/
echo filter_var($int, FILTER_SANITIZE_NUMBER_INT);
?>
The above code produces an output of 40+2 as the none INT values, as specified by the filter, have been removed
See:
Best way to stop SQL Injection in PHP
What are the best practices for avoid xss attacks in a PHP site
And sanitise data immediately before it is used in the context it needs to be made safe for. (e.g. don't run htmlspecialchars until you are about to output HTML, you might need the unedited data before then (such as if you ever decide to send content from the database by email)).
Yes. However, you shouldn't use htmlspecialchars on input. Only on output, when you print it.
This is because, it's not certain that the output will always be through html. It could be through a terminal, so it could confuse users if weird codes suddenly show up.
It depends on what you want to achieve. Your version prevents (probably) all SQL injections and strips out HTML (more exactly: Prevents it from being interpreted when sent to the browser). You could (and probably should) apply the htmlspecialchars() on output, not input. Maybe some time in the future you want to allow simple things like <b>.
But there's more to sanitizing, e.g. if you expect an Email Address you could verify that it's indeed an email address.
As has been said don't use htmlspecialchars on input only output. Another thing to take into consideration is ensuring the input is as expected. For instance if you're expecting a number use is_numeric() or if you're expecting a string to only be of a certain size or at least a certain size check for this. This way you can then alert users to any errors they have made in their input.
What if your listing variable is an array ?
You should sanitize this variable recursively.
Edit:
Actually, with this technique you can avoid SQL injections but you can't avoid XSS.
In order to sanitize "unreliable" string, i usually combine strip_tags and html_entity_decode.
This way, i avoid all code injection, even if characters are encoded in a Ł way.
$cleaned_string = strip_tags( html_entity_decode( $var, ENT_QUOTES, 'UTF-8' ) );
Then, you have to build a recursive function which call the previous functions and walks through multi-dimensional arrays.
In the end, when you want to use a variable into an SQL statement, you can use the DBMS-specific (or PDO's) escaping function.
$var_used_with_mysql = mysql_real_escape_string( $cleaned_string );
In addition to sanitizing the data you should also validate it. Like checking for numbers after you ask for an age. Or making sure that a email address is valid. Besides for the security benefit you can also notify your users about problems with their input.
I would assume it is almost impossible to make an SQL injection if the input is definitely a number or definitely an email address so there is an added level of safety.

A tidy way to clean your URL variables?

I'm wondering if there is a quick and easy function to clean get variables in my url, before I work with them.( or $_POST come to think of it... )
I suppose I could use a regex to replace non-permitted characters, but I'm interested to hear what people use for this sort of thing?
The concept of cleaning input never made much sense to me. It's based on the assumption that some kinds of input are dangerous, but in reality there is no such thing as dangerous input; Just code that handles input wrongly.
The culprit of it is that if you embed a variable inside some kind of string (code), which is then evaluated by any kind of interpreter, you must ensure that the variable is properly escaped. For example, if you embed a string in a SQL-statement, then you must quote and escape certain characters in this string. If you embed values in a URL, then you must escape it with urlencode. If you embed a string within a HTML document, then you must escape with htmlspecialchars. And so on and so forth.
Trying to "clean" data up front is a doomed strategy, because you can't know - at that point - which context the data is going to be used in. The infamous magic_quotes anti-feature of PHP, is a prime example of this misguided idea.
I use the PHP input filters and the function urlencode.
Regular expressions can be helpful, and also PHP 5.2.0 introduced a whole filter extension devoted to filtering input variables in different ways.
It's hard to recommend a single solution, because the nature of input variables is so... variable. :-)
I use the below method to sanitize input for MYSQL database use. To summarize, iterate through the $_POST or $_GET array via foreach, and pass each $_POST or $_GET through the DBSafe function to clean it up. The DBSafe could easily be modified for other uses of the data variables (e.g. HTML output etc..).
// Iterate POST array, pass each to DBSafe function to clean up data
foreach ($_POST as $key => $PostVal) {
// Convert POST Vars into regular vars
$$key=DBSafe($PostVal);
// Use above statement to leave POST or GET array intact, and use new individual vars
// OR, use below to update POST or GET array vars
// Update POST vars
$_POST[$key]=DBSafe($PostVal);
}
function DBSafe($InputVal) {
// Returns MySQL safe values for DB update. unquoted numeric values; NULL for empty input; escaped, 'single-quoted' string-values;
if (is_numeric($InputVal)) {
return $InputVal;
} else {
// escape_string may not be necessary depending on server PHP and MySQL (i.e. magic_quotes) setup. Uncomment below if needed.
// $InputVal=mysql_escape_string($InputVal);
$InputVal=(!$InputVal?'NULL':"'$InputVal'");
return $InputVal;
}
}

Categories