I have a php page that echoes something like this:
echo "<div>" . $_REQUEST["id"] . "</div>";
This leads to an XSS issue, which I tried to fix using HTMLPurifier through a function that cleans $_REQUEST by reference, leading to this code:
function sanitizer(array &$array) {
foreach ($array as $key => $value) {
$array[$key] = $htmlpurifierInstance->purify($value);
}
}
sanitizer($_REQUEST);
echo "<div>" . $_REQUEST["id"] . "</div>";
After another Checkmarx test, the issue still pops up. What's the fix for this issue?
Sanitising HTML should be a very rare requirement, not something you do regularly on all input.
Whenever a value has a limited range of valid values, validate it. Reject it or unset it if it's not valid. So if "id" is supposed to be a number, reject non-numeric input.
Whenever outputting or sending any variable somewhere, escape it for the relevant context. In this case, you are outputting in an HTML context, so use htmlspecialchars. This is not something you can do ahead of time, because the same variable might be used in multiple contexts.
For the particular case of database queries, don't use escaping, use parameterised queries.
In the rare cases where you really need the user to be able to enter HTML, come up with a strict whitelist of tags and attributes they can use, and sanitise the particular variable based on that, as part of your input processing. (This is what HTMLPurifier is for.)
Never, ever, try to write a "universal" sanitising or escaping function. At best, you will end up mangling data by applying too many things at once; at worst, you'll defeat your own security.
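The validate-then-escape advice above could be sketched as follows for the "id" case. The helper names valid_id, e and render_id are hypothetical, and the sketch assumes "id" is meant to be a short non-negative integer, which the question does not state.

```php
<?php
// Hypothetical helpers; assumes "id" is a non-negative integer of at most 9 digits.

// Validate: return the id as an int, or null if the input is anything else.
function valid_id($raw): ?int
{
    if (!is_string($raw) || preg_match('/^[0-9]{1,9}$/', $raw) !== 1) {
        return null; // reject instead of trying to repair the value
    }
    return (int) $raw;
}

// Escape: encode a value for an HTML text or attribute context, at output time.
function e($value): string
{
    return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
}

// Build the fragment from the question, escaping right where it is output.
function render_id(array $request): string
{
    $id = valid_id($request['id'] ?? null);
    if ($id === null) {
        return '<div>Invalid id</div>';
    }
    return '<div>' . e($id) . '</div>';
}

echo render_id($_REQUEST), "\n";
```

Escaping an already validated integer is redundant in a strict sense, but keeping the e() call at the point of output means the line stays safe if the validation rule is later relaxed.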
I wrote a little PHP script to check whether the input in a form text field
a.) is a number
b.) is secure
I tried it with a function
<?php
function checkSecurity($glob) {
if (is_numeric($glob)) {
$value = htmlspecialchars($glob);
$value = trim($glob);
return $value;
}
else
{
echo "<p style='color:red;'>Wrong Input</p>";
die;
}
}
My question: is it necessary to check the security with htmlspecialchars() and trim(), or is it enough to just use is_numeric()?
Thanks a lot
Mario
If input is numeric, neither htmlspecialchars nor trim would do anything.
Line 4 in your code is pointless, because the next line overwrites $value.
htmlspecialchars() is an output encoding function anyway; you don't usually want to use it on input.
When used correctly, htmlspecialchars() protects against some types of XSS. On its own it will not make anything secure, not even against XSS in general, let alone other vulnerabilities.
In most cases, you can't implement a generic "checkSecurity" function for your application to handle all input validation. If this was possible, why would your framework not do it? (Ok, .net does to some extent, but it's arguable whether that is overall positive or negative, considering the false sense of security it gives to developers.)
You should never concatenate html from strings, or build html responses in variables (applicable to both server and client side). You need a template engine if this ever happens.
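For the numeric check itself, a minimal sketch that validates and returns a value instead of echoing markup and calling die might look like this. The function name parse_int_field is hypothetical, and the sketch assumes the field should hold an integer rather than any numeric string (is_numeric() also accepts values such as "1.5" or "1e5").

```php
<?php
// Hypothetical helper: returns the integer, or null when the input is not an integer.
function parse_int_field($raw): ?int
{
    // FILTER_VALIDATE_INT yields the int on success and false on failure.
    // Compare strictly with false so that a valid 0 is not treated as an error.
    $value = filter_var($raw, FILTER_VALIDATE_INT);
    return $value === false ? null : $value;
}

$age = parse_int_field('12');
if ($age === null) {
    echo "Wrong input\n";
} else {
    echo "Got integer: $age\n";
}
```

Returning null leaves the decision about how to report the error to the caller, which is easier to reuse than a function that prints and terminates.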
I'm trying to find the best way to sanitize requests in PHP.
From what I've read, I learned that GET variables should be sanitized only when they're being displayed, not at the beginning of the "request flow". The same goes for POST variables (which don't come from the database).
I can see several problems here:
Of course I can create functions sanitizing these variables, and by calling something like Class::post('name'), or Class::get('name') everything will be safe. But what if a person who will use my code in the future will forget about it and use casual $_POST['name'] instead of my function? Can I provide, or should I provide a bit of security here?
There is never a one-size-fits-all sanitization. "Sanitization" means you manipulate a value to conform to certain properties. For example, you cast something that's supposed to be a number to a number. Or you strip <script> tags out of supposed HTML. What and how exactly to sanitize depends on what the value is supposed to be and whether you need to sanitize at all. Sanitizing HTML for whitelisted tags is really complex, for instance.
Therefore, there's no magic Class::sanitize which fits everything at once. Anybody using your code needs to think about what they're trying to do anyway. If they just blindly use $_POST values as is, they have already failed and need to turn in their programmer card.
What you always need to do is to escape based on the context. But since that depends on the context, you only do it where necessary. You don't blindly escape all $_POST values, because you have no idea what you're escaping for. See The Great Escapism (Or: What You Need To Know To Work With Text Within Text) for more background information on the whole topic.
The variables are basically "sanitized" when PHP reads them. Meaning if I were to submit
"; exec("some evil command"); $blah="
Then it won't be a problem as far as PHP is concerned - you will get that literal string.
However, when passing it on from PHP to something else, it's important to make sure that "something else" won't misinterpret the string. So, if it's going into a MySQL database then you need to escape it according to MySQL rules (or use prepared statements, which will do this for you). If it's going into HTML, you need to encode < as &lt; as a minimum. If it's going into JavaScript, then you need to JSON-encode it, and so on.
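As a rough sketch of the point above, the same raw string can be encoded differently for an HTML context and for a JavaScript context. The variable names are illustrative only, and the JSON_HEX_* flags are an extra precaution assumed here for values embedded inside a script block.

```php
<?php
// The same raw input, kept unmodified until the moment of output.
$input = '"; exec("some evil command"); $blah="<script>';

// HTML context: encode the characters that are special in markup.
$html = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// JavaScript context: JSON-encode; the HEX flags also hide <, >, &, ' and "
// so the value cannot close the surrounding <script> element.
$js = json_encode($input, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);

echo "<p>$html</p>\n";
echo "<script>var value = $js;</script>\n";
```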
You can do something like this... Not foolproof, but it works..
foreach($_POST as $key => $val)
{
//do sanitization
$val = Class::sanitize($val);
$_POST[$key] = $val;
}
Edit: You'd want to put this as close to the header as you can get. I usually put mine in the controller so it's executed from the __construct() automagically.
Replace the $_POST array with a sanitizer object which behaves like an array.
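One way such an array-like object might look is sketched below, assuming PHP 8 and assuming the only output context is HTML. The class name HtmlEscapedInput and its raw() method are hypothetical. Note that this bakes a single context into every read, which is exactly the limitation the other answers warn about, so the raw value is kept reachable.

```php
<?php
// Hypothetical wrapper: array-style reads are HTML-escaped, raw() returns the original.
final class HtmlEscapedInput implements ArrayAccess
{
    private array $data;

    public function __construct(array $data)
    {
        $this->data = $data;
    }

    public function offsetExists(mixed $offset): bool
    {
        return isset($this->data[$offset]);
    }

    public function offsetGet(mixed $offset): mixed
    {
        $value = $this->data[$offset] ?? null;
        // Only strings are escaped; nested arrays and null are returned untouched.
        return is_string($value) ? htmlspecialchars($value, ENT_QUOTES, 'UTF-8') : $value;
    }

    public function offsetSet(mixed $offset, mixed $value): void
    {
        if ($offset === null) {
            $this->data[] = $value;
        } else {
            $this->data[$offset] = $value;
        }
    }

    public function offsetUnset(mixed $offset): void
    {
        unset($this->data[$offset]);
    }

    // Unescaped access for non-HTML contexts (database parameters, files, and so on).
    public function raw(string $key): mixed
    {
        return $this->data[$key] ?? null;
    }
}

$post = new HtmlEscapedInput(['name' => '<b>Bob</b>']);
echo $post['name'], "\n";
```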
I am a PHP newbie and am working on a basic form validation script. I understand that input filtering and output escaping are both vital for security reasons. My question is whether or not the code I have written below is adequately secure? A few clarifying notes first.
I understand there is a difference between sanitizing and validating. In the example field below, the field is plain text, so all I need to do is sanitize it.
$clean['myfield'] is the value I would send to a MySQL database. I am using prepared statements for my database interaction.
$html['myfield'] is the value I am sending back to the client so that when s/he submits the form with invalid/incomplete data, the sanitized fields that have data in them will be repopulated so they don't have to type everything in from scratch.
Here is the (slightly cleaned up) code:
$clean = array();
$html = array();
$_POST['fname'] = filter_var($_POST['fname'], FILTER_SANITIZE_STRING);
$clean['fname'] = $_POST['fname'];
$html['fname'] = htmlentities($clean['fname'], ENT_QUOTES, 'UTF-8');
if ($_POST['fname'] == "") {
$formerrors .= 'Please enter a valid first name.<br/><br/>';
}
else {
$formerrors .= 'Name is valid!<br/><br/>';
}
Thanks for your help!
~Jared
I understand that input filtering and output escaping are both vital for security reasons.
I'd say rather that output escaping is vital for security and correctness reasons, and input filtering is a potentially useful measure for defence-in-depth and to enforce specific application rules.
The input filtering step and the output escaping step are necessarily separate concerns, and cannot be combined into one step, not least because there are many different types of output escaping, and the right one has to be chosen for each output context (eg HTML-escaping in a page, URL-escaping to make a link, SQL-escaping, and so on).
Unfortunately PHP is traditionally very hazy on these issues and so offers a bunch of mixed-message functions that are likely to mislead you.
In the example field below, the field is plain text, so all I need to do is sanitize it.
Yes. Alas, FILTER_SANITIZE_STRING is not in any way a sane sanitiser. It completely removes some content (strip_tags, which is itself highly non-sensible) whilst HTML-escaping other content, eg quotes turn into &#34;. This is a nonsense.
Instead, for input sanitisation, look at:
checking it's a valid string for the encoding you're using (hopefully UTF-8; see eg this regex for that);
removing control characters, U+0000–U+001F and U+007F–U+009F. Allow the newline through only on deliberate multi-line text fields;
removing the characters that are not suitable for use in markup;
validating the input conforms to application requirements on a field-by-field basis, for data whose content model is more specific than arbitrary text strings. Although your escaping should handle a < character correctly, it's probably a good idea to get rid of it early in fields where it makes no sense to have one.
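The first two checks in the list above could be sketched roughly as follows. The function name clean_text and its $multiline parameter are hypothetical, and the UTF-8 check uses an empty pattern with the /u modifier instead of the regex the answer links to.

```php
<?php
// Hypothetical helper: returns the cleaned string, or null if the input
// is not a string or not valid UTF-8.
function clean_text($raw, bool $multiline = false): ?string
{
    // preg_match with the /u modifier fails on byte sequences that are not valid UTF-8.
    if (!is_string($raw) || preg_match('//u', $raw) !== 1) {
        return null;
    }
    // Remove U+0000-U+001F and U+007F-U+009F; keep the newline (U+000A)
    // only when the field is deliberately multi-line.
    $pattern = $multiline
        ? '/[\x{00}-\x{09}\x{0B}-\x{1F}\x{7F}-\x{9F}]/u'
        : '/[\x{00}-\x{1F}\x{7F}-\x{9F}]/u';
    return preg_replace($pattern, '', $raw);
}

var_dump(clean_text("Jane\x00 Doe"));
```

Field-specific validation and output escaping still apply afterwards; this only normalises the text.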
For the output escaping step I'd generally prefer htmlspecialchars() to htmlentities(), though your correct use of the UTF-8 argument stops the latter function breaking in the way it usually does.
Depending on what you want to secure, the filter you call might be overactive (see comments). Injectionwise you should be safe since you're using Prepared Statements (see this answer)
On a design note you might want to filter first, then check for empty values. Doing that you can shorten your code ;)
I understand that input filtering ... is vital for security reasons.
This is a wrong statement.
Although it can be right in some circumstances, in such a generalised form it does no good and only gives a false feeling of safety.
all I need to do is sanitize it.
There is no such thing as "general sanitizing". You have to understand each particular case and its limitations. For example, for the database you need to use several different sanitization techniques, not one, while for filenames it is going to be a completely different one.
I am using prepared statements for my database interaction.
Thus, you should not touch the data at all. Just leave it as is.
Here is the (slightly cleaned up) code:
It seems there is some overkill in your code.
You are cleaning your HTML data twice, while it is possible that you won't need it at all.
And for some reason you are raising an error on success.
I'd make it rather this way
$formerrors = '';
if ($_POST['fname'] == "") {
$formerrors .= 'Please enter a valid first name.<br/><br/>';
}
if (!$formerrors) {
$html = array();
foreach ($_POST as $key => $val) {
$html[$key] = htmlspecialchars($val,ENT_QUOTES);
}
}
I'm wondering if there is a significant downside to using the following code:
if(isset($_GET)){
foreach($_GET as $v){
$v = htmlspecialchars($v);
}
}
I realize that it probably isn't necessary to use htmlspecialchars on each variable. Anyone know offhand if this is good to do?
UPDATE:
Because I don't think my above code would work, I'm updating this with the code that I'm using (despite the negativity towards the suggestions). :)
if(isset($_GET)){
foreach($_GET as $k=>$v){
$_GET[$k] = htmlspecialchars($v);
}
}
This totally depends on what you want to do.
In general, the answer is "no", and you should only escape data specifically for their intended purpose. Randomly escaping data without purpose isn't helping, and it just causes further confusion, as you have to keep track of what's been escaped and how.
In short, keep your data stored raw, and escape it specifically for its intended use when you use it:
for HTML output, use htmlentities().
for shell command names, use escapeshellcmd().
for shell arguments, use escapeshellarg().
for building a GET URL string, use urlencode() on the parameter values.
for database queries, use the respective database escape mechanism (or prepared statements).
This reasoning applies recursively. So if you want to write a link to a GET URL to the HTML output, it'd be something like this:
echo "click";
It'd be terrible if at that point you'd have to remember if $var had already previously been escaped, and how.
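A sketch of that nested escaping, assuming a hypothetical target page.php and parameter names lang and var: the value is URL-encoded for the query string first, and the finished URL is then HTML-escaped for the attribute it is written into. htmlspecialchars() is used here; htmlentities() would give the same result for this input.

```php
<?php
$var = 'Tom & "Jerry" <3'; // raw value, not escaped anywhere earlier

// Innermost context first: URL-encode the parameter value.
$url = 'page.php?lang=en&var=' . urlencode($var); // page.php is a hypothetical target

// Outer context next: HTML-escape the whole URL for the href attribute.
$link = '<a href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '">click</a>';

echo $link, "\n";
```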
Blanket escaping isn't necessary, and it's possibly harmful to the data. Don't do it.
Apply htmlspecialchars() only to data that you are about to output in a HTML page - ideally immediately before, or directly when you output it.
It won't affect numbers, but it can backfire for string parameters which are not intended to be put in HTML code.
You have to treat each key different depending on its meaning. Possibility of generalization also depends on your application.
The way you're doing it won't work. You need to make $v a reference, and it breaks for anything requiring recursion ($_GET['array'][0], for example).
if(isset($_GET)) {
foreach($_GET as &$v) {
$v = htmlspecialchars($v);
}
unset($v); // break the reference so a later write to $v doesn't change the last element
}
I'm using $_POST and, being aware of MySQL exploits, I decided to use this function at the top of my page so that all POST data will be safe:
Can you tell me if I miss something and this function will really do the job as I think it will?
function clean_post(){
if ( $_POST){
foreach ($_POST as $k => $v) {
$_POST[$k]=stripslashes($v);
$_POST[$k]=mysql_real_escape_string($v);
$_POST[$k]=preg_replace('/<.*>/', "", "$v");
}
}
if ( $_COOKIE){
foreach ($_COOKIE as $k => $v) {
$_COOKIE[$k]=stripslashes($v);
$_COOKIE[$k]=mysql_real_escape_string($v);
$_COOKIE[$k]=preg_replace('/<.*>/', "", "$v");
}
}
}
It will also remove all HTML tags; a safer option to output the result might be to use:
<pre>
$foo
</pre>
Cheers!
I think it's a bad idea to do this. It will corrupt the data your users enter even before it hits the database. This approach will also encourage you to use lazy coding where you consistently don't escape data because you believe that all your data is already "clean". This will come back to bite you one day when you do need to output some unsafe characters and you either forget to escape them or you aren't really sure which function you need to call so you just try something and hope that it works.
To do it properly you should ensure that magic quotes is disabled and only escape data when necessary, using precisely the correct escaping method - no more, no less.
There are some problems with it.
First, you apply functions to types that don't need them; your integers, for example, need only an (int) cast to be secure.
Second, you do not limit length. When you're requesting a '12 chars string', it would be a good idea to ensure you've got only 12 chars, and not 2048. Limiting size is really something your attackers will not like.
Third, in your foreach loop you have a $v variable, and you assign the result of a function on $v to $_POST[$k] three times. So the first two assignments are lost when the third occurs...
Then all the things previous people said are right :-)
The rule is: apply the filter at the right moment for the right output. HTML output needs an HTML filter (htmlspecialchars), but the database doesn't need it; it needs database escaping. Let's say you want to extract data from your database to build a CSV or a PDF: HTML escaping will make your life harder. You'll need CSV escaping at that point, or PDF escaping.
Finally, it is indeed hard to remember whether you are manipulating data which is already well escaped for your output. I recommend an excellent read on Joel on Software about Apps Hungarian. The text is quite long, but very good, and web escaping is used as an example of why Apps Hungarian is good (even if Systems Hungarian is bad).
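The first two points of this answer (a cast for integers, a length limit for strings) might be sketched like this. The helper names int_field and short_text_field are hypothetical, and the length check counts bytes, which is an assumption that only matches "characters" for single-byte text.

```php
<?php
// Hypothetical helpers illustrating per-type handling of input.

// An integer field only needs a cast; non-numeric input becomes 0.
function int_field($raw): int
{
    return is_scalar($raw) ? (int) $raw : 0;
}

// A short text field is rejected (null) when it exceeds the expected length.
// strlen() counts bytes, not characters.
function short_text_field($raw, int $maxLength): ?string
{
    if (!is_string($raw) || strlen($raw) > $maxLength) {
        return null;
    }
    return $raw;
}

var_dump(int_field('12abc'));
var_dump(short_text_field(str_repeat('a', 2048), 12));
```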
Hi, this is my first answer to any question asked on the web, so please review it.
Put this code at the top of your script; there is then no need to assign the posted values to other variables to make the input data safe for the database. Just use the $_POST values as they are in your query statements.
foreach ($_POST as $k => $v) {
if(!is_array($_POST[$k]) ) { //checks for a checkbox array & so if present do not escape it to protect data from being corrupted.
if (ini_get('magic_quotes_gpc')) {
$v = stripslashes($v);
}
$v = preg_replace('/<.*>/', "", "$v"); //replaces html chars
$_POST[$k]= mysql_real_escape_string(trim($v));
}
}
Don't forget $_GET[]
if ($_POST OR $_GET)
Also you can add strip_tags()
I don't know whether your function is correct or not, but the principle is certainly incorrect. You want to escape only where you need to, i.e. just before you pass things into MySQL (in fact you don't even want to do that, ideally; use bound parameters).
There are plenty of situations where you might want the raw data as passed in over the HTTP request. With your approach, there's no ability to do so.
In general, I don't think it's that good of an idea.
Not all post data necessarily goes into MySQL, so there is no need to escape it if it doesn't. That said, using something like PDO and prepared statements is a better way, mysql_* functions are deprecated.
The regular expression could destroy a lot of potentially valid text. You should worry about things like HTML when outputting, not inputting. Furthermore, use a function like strip_tags or htmlspecialchars to handle this.
stripslashes is only necessary if magic quotes are enabled (which they shouldn't be, but it is always possible)
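A minimal sketch of the PDO and prepared statements suggestion, using an in-memory SQLite database so it can run on its own (this assumes the pdo_sqlite extension is available; the users table and its name column are hypothetical). The raw value goes into the query as a bound parameter and is escaped for HTML only when it is printed.

```php
<?php
// In-memory SQLite database, used here only to keep the sketch self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (name TEXT)'); // hypothetical table

$name = "O'Reilly <b>"; // raw input: no stripslashes, no escaping, no tag removal

// The value is sent as a bound parameter, so quotes cannot alter the SQL.
$stmt = $pdo->prepare('INSERT INTO users (name) VALUES (:name)');
$stmt->execute([':name' => $name]);

// The stored value is identical to the input; escape only at HTML output time.
$stored = $pdo->query('SELECT name FROM users')->fetchColumn();
echo htmlspecialchars($stored, ENT_QUOTES, 'UTF-8'), "\n";
```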
When working with stripslashes I'd use get_magic_quotes_gpc():
if (get_magic_quotes_gpc()) {
$_POST[$k]=stripslashes($v);
}
Otherwise you'll over-strip.