I know finding alternatives to eval() is a common topic, but I'm trying to do something I've not done before. I am processing known-format CSV files, and I have built rules for how the data will be handled depending on the source columns available.
Here is a simplified example of what I'm doing. In this example, $linedata represents a single row of data from the CSV file. $key["type"] points to the column I need the data from. If this column holds the value of "IN", I want $newcol set to "individual", else "organization".
$key["type"] = 12;
$linedata[12] = 'IN';
$rule = '($linedata[($key["type"])] == "IN" ? "individual" : "organization");';
eval ('$newcol = ' . $rule);
So $rule stores the logic. I can run a filter on the $linedata array to try and protect from malicious code coming from the CSV files, but I wonder if there is a better way to store and process rules like this?
You cannot store arbitrary PHP in a CSV file and then expect it to work without calling eval (or similar functionality).
The safe way to do what you're asking for is to treat the file as data, not code.
This is why languages like BBCode exist: you can't have an inert language trigger active features directly, so you create an easy-to-interpret mini-scripting-language that lets you achieve what you want.
In other words, you cannot store active "rules" in the file without interpreting them somehow, and you cannot simultaneously allow them to contain arbitrary PHP and be "safe". So you can either attempt to parse and restrict PHP (don't, it's tough!) or you can give them a nice easy little language, and interpret that. Or, better yet, don't store logic in data files.
I was wrong.
I can be wrong, however create_function may be good enough.
http://www.php.net/manual/en/function.create-function.php
Related
Suppose, I have a variable $var1 which contains the followings:
$var1="../sidebar_items/uploaded_files/notices/circular.pdf";
Now, I want to create another variable $var2 which will hold the following value:
$var2="../dev/sidebar_items/uploaded_files/notices/circular.pdf";
I want to create $var2 with the help of $var1 via string manipulation. Is it possible to do so? How will I achieve that?
Just use str_replace(). It returns the changed string but doesn't change the original variable so you can store the result in a new variable:
$var2 = str_replace('../', '../dev/', $var1);
It looks like you're allowing users to upload files and saving the filenames. That's potentially very dangerous depending on how you're dealing with them; if you are taking their string value and doing file system operations against it, you could end up with a user uploading a file with a name like ../../../../../usr/bin/php and risking allowing a delete operation against that file (if your permissions are set up really, really poorly) or, perhaps more realistically, using path manipulation to delete, modify, or overwrite any file owned by the web server user. index.php would be an obvious target.
You should consider keeping both paths in separate constants rather than using string manipulation to turn one into the other at runtime. You should also consider renaming user-uploaded files, or at least being very careful about how you store them with regard to naming based on how you access them in your code.
you could also use strtr() function of PHP
$var2 = strtr($var1, '../', '../dev/');
I'd approach it by separating out the file name using basename() and then having a variable which has the path to the dev directory. This allows you to change it to all sorts rather than limiting it to a minor change...
$var1="../sidebar_items/uploaded_files/notices/circular.pdf";
$devpath = "../dev/sidebar_items/uploaded_files/notices/";
echo $devpath.basename($var1);
gives...
../dev/sidebar_items/uploaded_files/notices/circular.pdf
I've got a simple question:
When is it best to sanitize user input?
And which one of these is considered the best practice:
Sanitize data before writing to database.
Save raw data and sanitize it in the view.
For example use HTML::entities() and save result to database.
Or by using HTML methods in the views because in this case laravel by default uses HTML::entities().
Or maybe by using the both.
EDIT: I found interesting example http://forums.laravel.com/viewtopic.php?id=1789. Are there other ways to solve this?
I would say you need both locations but for different reasons. When data comes in you should validate the data according to the domain, and reject requests that do not comply. As an example, there is no point in allowing a tag (or text for that matter) if you expect a number. For a parameter representing.a year, you may even want to check that it is within some range.
Sanitization kicks in for free text fields. You can still do simple validation for unexpected characters like 0-bytes. IMHO it's best to store raw through safe sql (parameterized queries) and then correctly encode for output. There are two reasons. The first is that if your sanitizer has a bug, what do you do with all the data in your database? Resanitizing can have unwanted consequences. Secondly you want to do contextual escaping, for whichever output you are using (JSON, HTML, HTML attributes etc.)
I have a full article on input filtering in Laravel, you might find it useful http://usman.it/xss-filter-laravel/, here is the excerpt from this article:
You can do a global XSS clean yourself, if you don’t have a library to write common methods you may need frequently then I ask you to create a new library Common in application/library. Put this two methods in your Common library:
/*
* Method to strip tags globally.
*/
public static function global_xss_clean()
{
// Recursive cleaning for array [] inputs, not just strings.
$sanitized = static::array_strip_tags(Input::get());
Input::merge($sanitized);
}
public static function array_strip_tags($array)
{
$result = array();
foreach ($array as $key => $value) {
// Don't allow tags on key either, maybe useful for dynamic forms.
$key = strip_tags($key);
// If the value is an array, we will just recurse back into the
// function to keep stripping the tags out of the array,
// otherwise we will set the stripped value.
if (is_array($value)) {
$result[$key] = static::array_strip_tags($value);
} else {
// I am using strip_tags(), you may use htmlentities(),
// also I am doing trim() here, you may remove it, if you wish.
$result[$key] = trim(strip_tags($value));
}
}
return $result;
}
Then put this code in the beginning of your before filter (in application/routes.php):
//Our own method to defend XSS attacks globally.
Common::global_xss_clean();
I just found this question. Another way to do it is to enclose dynamic output in triple brackets like this {{{ $var }}} and blade will escape the string for you. That way you can keep the potentially dangerous characters in case they are important somewhere else in the code and display them as escaped strings.
i'd found this because i was worried about xss in laravel, so this is the packages gvlatko
it is easy:
To Clear Inputs = $cleaned = Xss::clean(Input::get('comment');
To Use in views = $cleaned = Xss::clean(Input::file('profile'), TRUE);
It depends on the user input. If you're generally going to be outputting code they may provide (for example maybe it's a site that provides code snippets), then you'd sanitize on output. It depends on the context. If you're asking for a username, and they're entering HTML tags, your validation should be picking this up and going "no, this is not cool, man!"
If it's like the example I stated earlier (code snippets), then let it through as RAW (but be sure to make sure your database doesn't break), and sanitize on output. When using PHP, you can use htmlentities($string).
2 short questions based on trying to make my code more efficient (I think my ultimate quest is to make my entire (fairly complex) website based on some sort of MVC framework, but not being a professional programmer, I think that's going to be a long and steep learning curve..)
In this code, is there a way to merge the if statement and for loop, to avoid the nesting:
if($fileatt['name']!=null)
{
$attachedFiles = "You uploaded the following file(s)\n";
for($i=0;$i<count($docNames);$i++)
{
$attachedFiles = $attachedFiles. " - " . $docNames[$i] . "\n";
}
}
At the moment, I do the fairly standard thing of splitting my $_POST array from a form submission, 'clean' the contents and store the elements in individual variables:
$name = cleanInput($_POST['name']);
$phone = cleanInput($_POST['phone']);
$message = cleanInput($_POST['message']);
...
(where cleanInput() contains striptags() and mysql_real_escape_string())
I had thought that keeping all the information in an array might my code more efficient, but is there a way to apply a function to all (or selected) elements of an array? For example, in R, this is what the apply() function does.
Alternatively, given that all my variables have the same name as in the $_POST array, is there a way to generate all the variables dynamically in a foreach loop? (I know the standard answer when people ask if they can dynamically generate variables is to use a hashmap or similar, but I was interested to see if there's a technique I've missed)
You can use extract and combine it with array_map
extract(array_map('cleanInput', $_POST), EXTR_SKIP);
echo $name; // outputs name
Be warned that $_POST could be anything and user can then submit anything to your server and it becomes a variable in your code, thus if you have things like
if(empty($varName)) { } // assumes $varName is empty initially
Could easily bypassed by user submitting $_POST['varName'] = 1
To avoid mishaps like this, you can have a whitelist of array and filter out only those you need:
$whitelist = array('name', 'phone', 'message');
$fields = array();
foreach($_POST as $k => $v) {
if(in_array($k, $whitelist)) $fields[$k] = $v;
}
extract(array_map('cleanInput', $fields));
1) To the first question, how to merge the if and the for loop:
Why would you want to merge this, it will only make the code more difficult to read. If your code requires an if and afterwards a for loop, then show this fact, there is nothing bad with that. If you want to make the code more readable, then you can write a function, with a fitting name, e.g. listAttachedFiles().
2) To the question about cleaning the user input:
There is a difference between input validation and escaping. It's a good thing to validate the input, e.g. if you expect a number, then only accept numbers as input. But escaping should not be done until you know the target system. So leave the input as it is and before writing to the db use the mysql_real_escape_string() function, before writing to an HTML page use the function htmlspecialchars().
Combining escape functions before needed, can lead to invalid data. It can become impossible to give it out correctly, on a certain target system.
Personally I think that the performance cost of using an "If" statement is worth the benefit of having easily readable code. Also you have to be sure that you actually use fewer cycles by combining, if there is such a way.
I'm not sure I follow your second question, but have you looked at extract() and array_walk() yet?
Point 1 is premature optimization. And you want get any better performance / readability by doing so. (similar for using arrays for everything).
Point 2 - AaaarrgghhH! You should only change the representation of data at the point where it leaves PHP, using a method approporiate to the destination - not where it arrives in PHP.
To make your for loop more efficient don't use Count() within the condition of your loops.
It's the first thing they teach in school. As the For loops are reevaluating the conditions at each iterations.
$nbOfDocs = count($docNames); //will be much faster
for($i=0;$i<$nbOfDocs;$i++)
{
$attachedFiles = $attachedFiles. " - " . $docNames[$i] . "\n";
}
i would like to know if there is a possible injection of code (or any other security risk like reading memory blocks that you weren't supposed to etc...) in the following scenario, where unsanitized data from HTTP GET is used in code of PHP as KEY of array.
This supposed to transform letters to their order in alphabet. a to 1, b to 2, c to 3 .... HTTP GET "letter" variable supposed to have values letters, but as you can understand anything can be send to server:
HTML:
http://www.example.com/index.php?letter=[anything in here, as dirty it can gets]
PHP:
$dirty_data = $_GET['letter'];
echo "Your letter's order in alphabet is:".Letter2Number($dirty_data);
function Letter2Number($my_array_key)
{
$alphabet = array("a" => "1", "b" => "2", "c" => "3");
// And now we will eventually use HTTP GET unsanitized data
// as a KEY for a PHP array... Yikes!
return $alphabet[$my_array_key];
}
Questions:
Do you see any security risks?
How can i sanitize HTTP data to be able use them in code as KEY of an array?
How bad is this practice?
I can't see any problems with this practice. Anything you... errr... get from $_GET is a string. It will not pose any security threat whatsoever unless you call eval() on it. Any string can be used as a PHP array key, and it will have no adverse effects whatsoever (although if you use a really long string, obviously this will impact memory usage).
It's not like SQL, where you are building code to be executed later - your PHP code has already been built and is executing, and the only way you can modify the way in which it executes at runtime is by calling eval() or include()/require().
EDIT
Thinking about it there are a couple of other ways, apart from eval() and include(), that this input could affect the operation of the script, and that is to use the supplied string to dynamically call a function/method, instantiate an object, or in variable variables/properties. So for example:
$userdata = $_GET['userdata'];
$userdata();
// ...or...
$obj->$userdata();
// ...or...
$obj = new $userdata();
// ...or...
$someval = ${'a_var_called_'.$userdata};
// ...or...
$someval = $obj->$userdata;
...would be a very bad idea, if you were to do it with sanitizing $userdata first.
However, for what you are doing, you do not need to worry about it.
Any external received from GET, POST, FILE, etc. should be treated as filthy and sanitized appropriately. How and when you sanitize depends on when the data is going to be used. If you are going to store it to the DB, it needs to be escaped (to avoid SQL Injection. See PDO for example). Escaping is also necessary when running an OS command based on user data such as eval or attempting to read a file (like reading ../../../etc/passwd). If it's going to be displayed back to the user, it needs to be encoded (to avoid html injection. See htmlspecialchars for example).
You don't have to sanitize data for the way you are using it above. In fact, you should only escape for storage and encode for display, but otherwise leave data raw. Of course, you may want to perform your own validation on the data. For example, you may want dirty_data to be in the list of [a, b, c] and if not echo it back to the user. Then you would have to encode it.
Any well-known OS is not going to have a problem even if the user managed to attempt to read an invalid memory address.
Presumably this array's contents are meant to be publicly accessible in this way, so no.
Run it through array_key_exists()
Probably at least a little bad. Maybe there's something that could be done with a malformed multibyte string or something that could trigger some kind of overflow on a poorly-configured server... but that's pure (ignorant) speculation on my part.
I am always thinking about validation in any kind on the webpage (PHP or ASP, it doesn't matter), but never find a good and accurate answer.
For example, a I have some GET-Parameter, which defines a SQL query like DESC oder ASC. (SQL-Injection?)
Or I have a comment-function for user, where the data is also saved in a database.
Is it enought to check for HTML-tags inside the data? Should the validation done before adding it to the database or showing it on the page?
I am searching for the ToDo's which should be always performed with any data given from "outside".
Thanks.
Have a good idea of what you want from the user.
You want them to specify ascending/descending order? That's an enumeration (or a boolean), not part of an SQL query:
$query = "SELECT [...] ORDER BY field " . escape($_GET['sortOrder']); //wrong
This is wrong no matter how much you escape and sanitize their string, because this is not the way to validate an enumeration. Compare:
if ($_GET['sortOrder'] == 'desc') {
$ascending = false;
} else {
$ascending = true;
}
if ($ascending) {
...
} else {
...
}
...which does not warrant a discussion of string escaping or SQL injection because all you want from the user is a yes/no (or ascending/descending) answer.
You want them to enter a comment? Why disallow HTML tags? What if the user wants to enter HTML code?
Again, what you want from them is, say, "a text... any text with a maximum length of 1024 characters*." What does this have to do with SQL or injection? Nothing:
$text = $_POST['commentText'];
if (mb_strlen($text, ENCODING) <= 1024) {
//valid!
}
The value in the database should reflect what the user entered verbatim; not translated, not escaped. Say you're stripping all HTML <tags> from the comment. What happens when you decide to send comments somewhere in JSON format? Do you strip JSON control characters as well? What about some other format? What happens if HTML introduces a tag called ":)"? Do you go around in your database stripping off smileys from all comments?
The answer is no, as you don't want HTML-safe, JSON-safe, some-weird-format-with-smileys-safe input from the user. You want text that is at maximum 1024 characters. Check for that. Store that.
Now, the displaying part is trickier. In order to display:
<b>I like HTML "tags"
in HTML, you need to write something like:
<b>I like HTML "tags"
In JSON, you would do:
{ "I like HTML \"tags\" }
That is why you should use your language facilities to escape the data when you're using it.
The same of course goes for SQL, which is why you should escape the data when using simple query functions like mysql_query() in PHP. (Parametrized queries, which you should really be using, on the other hand, need no escaping.)
Summary
Have a really good idea of what you want as the input, keeping in mind that you almost never need, say, "HTML-safe text." Validate against that. Escape when required, meaning escape HTML as you send to the browser, SQL as you send to the database, and so on.
*: You should also define what a "character" means here. UTF-8, for example, may use multiple bytes to encode a code point. Does "character" mean "byte" or "Unicode code point"?
If you're using PDO, be sure to use prepared statements - these clean the incoming data automatically.
If using the mysql_* functions, run each variable through mysql_real_escape_string first.
You can also do validation such as making sure the variable is one of an acceptable range:
$allowed_values = array('name', 'date', 'last_login')
if(in_array($v, $allowed_values)) {
// now we can use the variable
}
You are talking about two kinds of data sanitation. One is about putting user-generated data in your database and the other is about putting user-generated data on your webpage. For the former you should follow adam's suggestions. For the later you should look into htmlspecialchars.
Do not mix these two as they do two completely different things. For that purpose sanitation should only take place at the last moment. Use adam's suggestion just before updating the database. Use htmlspecialchars just before echoing data. Do not use htmlspecialchars on data before adding it to the database.
You might also want to look around Stackoverflow, because this sort of question has been asked and answered countless times in the past.