PHP user input data security

PHP user input data security - php

I am trying to figure out which functions are best to use in different cases when inputting data, as well as outputting data.
When I allow a user to input data into MySQL what is the best way to secure the data to prevent SQL injections and or any other type of injections or hacks someone could attempt?
When I output the data as regular html from the database what is the best way to do this so scripts and such cannot be run?
At the moment I basically only use
mysql_real_escape_string();
before inputting the data to the database, this seems to work fine, but I would like to know if this is all I need to do, or if some other method is better.
And at the moment I use
stripslashes(nl2br(htmlentities()))
(most of the time anyways) for outputting data. I find these work fine for what I usually use them for, however I have run into a problem with htmlentities, I want to be able to have some html tags output respectively, for example:
<ul></ul><li></li><bold></bold>
etc, but I can't.
any help would be great, thanks.

I agree with mikikg that you need to understand SQL injection and XSS vulnerabilities before you can try to secure applications against these types of problems.
However, I disagree with his assertions to use regular expressions to validate user input as a SQL injection preventer. Yes, do validate user input insofar as you can. But don't rely on this to prevent injections, because hackers break these kinds of filters quite often. Also, don't be too strict with your filters -- plenty of websites won't let me log in because there's an apostrophe in my name, and let me tell you, it's a pain in the a** when this happens.
There are two kinds of security problems you mention in your question. The first is a SQL injection. This vulnerability is a "solved problem." That is, if you use parameterized queries, and never pass user supplied data in as anything but a parameter, the database is going to do the "right thing" for you, no matter what happens. For many databases, if you use parameterized queries, there's no chance of injection because the data isn't actually sent embedded in the SQL -- the data is passed unescaped in a length prefixed or similar blob along the wire. This is considerably more performant than database escape functions, and can be safer. (Note: if you use stored procedures that generate dynamic SQL on the database, they might also have injection problems!)
The second problem you mention is the cross site scripting problem. If you want to allow the user to supply HTML without entity escaping it first, this problem is an open research question. Suffice to say that if you allow the user to pass some kinds of HTML, it's entirely likely that your system will suffer an XSS problem at some point to a determined attacker. Now, the state of the art for this problem is to "filter" the data on the server, using libraries like HTMLPurifier. Attackers can and do break these filters on a regular basis; but as of yet nobody has found a better way of protecting the application from these kinds of things. You may be better off only allowing a specific whitelist of HTML tags, and entity encoding anything else.

This is one of the most problematic task today :)
You need to know how SQL injection and other attackers methods works. There are very detailed explanation of each method in https://www.owasp.org/index.php/Main_Page and also whole security framework for PHP.
Using specific security libraries from some framework are also good choice like in CodeIgniter or Zend.
Next, use REGEXP as much as you can and stick pattern rules to specific input format.
Use prepared statements or active records class of your framework.
Always cast your input with (int)$_GET['myvar'] if you really need numeric values.
There are so many other rules and methods to secure your application, but one golden rule is "never trust user's input".

In your php configuration, magic_quotes_gpc should be off. So you won't need stripslashes.
For SQL, take a look at PDO's prepared statements.
And for your custom tags, as there are only three of them, you can do a preg_replace call after the call of htmlentities to convert those back before your insert them into the database.

Related

How sanitize and store user input, that contains HTML regex pattern in WordPress

I working on some WordPress plugin that one of its features is ability to store HTML regex pattern, entered by user, to DB and then display it on settings page.
My method is actually work but I wonder if that code is secure enough:
That's the user entered pattern:
<div(.+?)class='sharedaddy sd-sharing-enabled'(.*?)>(.+?)<\div><\div><\div>
That's the way I'm storing HTML pattern in DB:
$print_options['custom_exclude_pattern'] = htmlentities(stripslashes($_POST['custom_exclude_pattern']),ENT_QUOTES,"UTF-8");
That's how it's actually stored in WordPress DB:
s:22:"custom_exclude_pattern";s:109:"<div(.+?)class="sharedaddy sd-sharing-enabled"(.*?)>(.+?)<\div><\div><\div>";
And that's how the output is displayed on settings page:
<input type="text" name="custom_exclude_pattern" value="<?php echo str_replace('"',"'",html_entity_decode($print_options['custom_exclude_pattern'])); ?>" size="30" />
Thanks for help :)

From the comments, it sounds like you are concerned about two separate issues (and possibly unaware of a third one that I will mention in a minute) and looking for one solution for both: SQL Injection and Cross-Site Scripting. You have to treat each one separately. I implore you to read this article by Defuse Security.
How to Prevent SQL Injection
This has been answered before on StackOverflow with respect to PHP applications in general. WordPress's $wpdb supports prepared statements, so you don't necessarily have to figure out how to work with PDO or MySQLi either. (However, any vulnerabilities in their driver WILL affect your plugin. Make sure you read the $wpdb documentation thoroughly.
You should not escape the parameters before passing them to a prepared statement. You'll just end up with munged data.
Cross-Site Scripting
As of this writing (June 2015), there are two general situations you need to consider:
The user should not be allowed to submit any HTML, CSS, etc. to this input.
The user is allowed to submit some HTML, CSS, etc. to this input, but we don't want them to be able to hack us by doing so.
The first problem is straightforward enough to solve:
echo htmlentities($dbresult['field'], ENT_QUOTES | ENT_HTML5, 'UTF-8');
The second problem is a bit tricky. It involves allowing only certain markup while not accidentally allowing other markup that can be leveraged to get Javascript to run in the user's browser. The current gold standard in XSS defense while allowing some HTML is HTML Purifier.
Important!
Whatever your requirements, you should always apply your XSS defense on output, not before inserting stuff into the database. Recently, Wordpress core had a stored cross-site scripting vulnerability that resulted from the decision to escape before storing rather than to escape before rendering. By supplying a sufficiently long comment, attackers could trigger a MySQL truncation bug on the escaped text, which allowed them to bypass their defense.
Bonus: PHP Object Injection from unserialize()
That's how it's actually stored in WordPress DB:
s:22:"custom_exclude_pattern";s:109:"<div(.+?)class="sharedaddy sd-sharing-enabled"(.*?)>(.+?)<\div><\div><\div>";
It looks like you're using serialize() when storing this data and, presumably, using unserialize() when retrieving it. Be careful with unserialize(); if you let users have any control over the string, they can inject PHP objects into your code, which can also lead to Remote Code Execution.
Remote Code Execution, for the record, means they can take over your entire website and possibly the server that hosts your blog. If there is any chance that a user can alter this record directly, I highly recommend using json_encode() and json_decode() instead.

I hope I got the point, if not then correct me: you are trying to dynamically insert a pattern for an input field, based on the same pattern being stored in your db, right?
Well, personally I think patterns are a good help for usability, in that the user knows his input format is not correct without needing to submit and refresh every time.
The big problem of patterns is, HTML code can be modified client-side. I believe the only safe solution would be to check server-side for the correctness of the input... There is no way a client side procedure can be safer than a server-side one!

Well, if you are gonna let your user input a regex, you could just do something like prepared statement + htmlentities($input, ENT_COMPAT, "UTF-I"); to sanitize the input, and then do the opposite, that is html_entity_decode($dataFromDb, ENT_COMPAT, " UTF-8");. A must is the prepared statement, all the other ways to work around a malicious input can be combined in lots of different ways!

Sanitizing PHP Variables, am I overusing it?

I've been working with PHP for some time and I began asking myself if I'm developing good habits.
One of these is what I belive consists of overusing PHP sanitizing methods, for example, one user registers through a form, and I get the following post variables:
$_POST['name'], $_POST['email'] and $_POST['captcha']. Now, what I usually do is obviously sanitize the data I am going to place into MySQL, but when comparing the captcha, I also sanitize it.
Therefore I belive I misunderstood PHP sanitizing, I'm curious, are there any other cases when you need to sanitize data except when using it to place something in MySQL (note I know sanitizing is also needed to prevent XSS attacks). And moreover, is my habit to sanitize almost every variable coming from user-input, a bad one ?

Whenever you store your data someplace, and if that data will be read/available to (unsuspecting) users, then you have to sanitize it. So something that could possibly change the user experience (not necessarily only the database) should be taken care of. Generally, all user input is considered unsafe, but you'll see in the next paragraph that some things might still be ignored, although I don't recommend it whatsoever.
Stuff that happens on the client only is sanitized just for a better UX (user experience, think about JS validation of the form - from the security standpoint it's useless because it's easily avoidable, but it helps non-malicious users to have a better interaction with the website) but basically, it can't do any harm because that data (good or bad) is lost as soon as the session is closed. You can always destroy a webpage for yourself (on your machine), but the problem is when someone can do it for others.
To answer your question more directly - never worry about overdoing it. It's always better to be safe than sorry, and the cost is usually not more than a couple of milliseconds.

The term you need to search for is FIEO. Filter Input, Escape Output.
You can easily confound yourself if you do not understand this basic principle.
Imagine PHP is the man in the middle, it receives with the left hand and doles out with the right.
A user uses your form and fills in a date form, so it should only accept digits and maybe, dashes. e.g. nnnnn-nn-nn. if you get something which does not match that, then reject it.
That is an example of filtering.
Next PHP, does something with it, lets say storing it in a Mysql database.
What Mysql needs is to be protected from SQL injection, so you use PDO, or Mysqli's prepared statements to make sure that EVEN IF your filter failed you cannot permit an attack on your database. This is an example of Escaping, in this case escaping for SQL storage.
Later, PHP gets the data from your db and displays it onto a HTML page. So you need to Escape the data for the next medium, HTML (this is where you can permit XSS attacks).
In your head you have to divide each of the PHP 'protective' functions into one or other of these two families, Filtering or Escaping.
Freetext fields are of course more complex than filtering for a date, but never mind, stick to the principles and you will be OK.
Hoping this helps http://phpsec.org/projects/guide/

Is FILTER_SANITIZE_STRING enough to avoid SQL injection and XSS attacks?

I'm using PHP 5 with SQLite 3 class and I'm wondering if using PHP built-in data filtering function with the flag FILTER_SANITIZE_STRING is enough to stop SQL injection and XSS attacks.
I know I can go grab a large ugly PHP class to filter everything but I like to keep my code as clean and as short as possible.
Please advise.

The SQLite3 class allows you to prepare statements and bind values to them. That would be the correct tool for your database queries.
As for XSS, well that is entirely unrelated to your use of SQLite.

It's never wise to use the same sanitization function for both XSS and SQLI. For XSS you can use htmlentities to filter user input before output to HTML. For SQLI on SQLite you can either use prepared statements (which is better) or use escapeString to filter user input before constructing SQL queries with them.

If you don't trust your own understanding of the security issues enough to need to ask this question, how can you trust someone here to give you a good answer?
If you go down the path of stripping out unwanted characters sooner or later you're going to be stripping out characters that users want to type. It's better to encode for the specific context that the data is used.
Check out OWASP ESAPI, it contains plenty of encoding functions. If you don't want to pull in that big of a library, check out what the functions do and copy the relevant parts to your codebase.

If you are just trying to build a simple form and dont want to introduce any heavy or even light frameworks, then go with php filters + and use PDO for the database. This should protect you from everything but cross site request forgeries.

FILTER_SANITIZE_STRING will remove HTML tags not special characters like &. If you want to convert a special character to entity code prevent malicious users to do anything.
filter_input(INPUT_GET, 'input_name', FILTER_SANITIZE_SPECIAL_CHARS);
OR
filter_input($var_name, FILTER_SANITIZE_SPECIAL_CHARS);
If you want to encode everything it's worth using for
FILTER_SANITIZE_ENCODED
For more info:
https://www.php.net/manual/en/function.filter-var.php

I think its good enough to secure your string data inputs, but there are many other options available which you can choose. e.g. other libraries would increase your application process time but will help you to process/parse other types of data.

I'm learning PHP on my own and I've become aware of the strip_tags() function. Is this the only way to increase security?

I'm new to PHP and I'm following a tutorial here:
Link
It's pretty scary that a user can write php code in an input and basically screw your site, right?
Well, now I'm a bit paranoid and I'd rather learn security best practices right off the bat than try to cram them in once I have some habits in me.
Since I'm brand new to PHP (literally picked it up two days ago), I can learn pretty much anything easily without getting confused.
What other way can I prevent shenanigans on my site? :D

There are several things to keep in mind when developing a PHP application, strip_tags() only helps with one of those. Actually strip_tags(), while effective, might even do more than needed: converting possibly dangerous characters with htmlspecialchars() should even be preferrable, depending on the situation.
Generally it all comes down to two simple rules: filter all input, escape all output. Now you need to understand what exactly constitutes input and output.
Output is easy, everything your application sends to the browser is output, so use htmlspecialchars() or any other escaping function every time you output data you didn't write yourself.
Input is any data not hardcoded in your PHP code: things coming from a form via POST, from a query string via GET, from cookies, all those must be filtered in the most appropriate way depending on your needs. Even data coming from a database should be considered potentially dangerous; especially on shared server you never know if the database was compromised elsewhere in a way that could affect your app too.
There are different ways to filter data: white lists to allow only selected values, validation based on expcted input format and so on. One thing I never suggest is try fixing the data you get from users: have them play by your rules, if you don't get what you expect, reject the request instead of trying to clean it up.
Special attention, if you deal with a database, must be paid to SQL injections: that kind of attack relies on you not properly constructing query strings you send to the database, so that the attacker can forge them trying to execute malicious instruction. You should always use an escaping function such as mysql_real_escape_string() or, better, use prepared statements with the mysqli extension or using PDO.
There's more to say on this topic, but these points should get you started.
HTH
EDIT: to clarify, by "filtering input" I mean decide what's good and what's bad, not modify input data in any way. As I said I'd never modify user data unless it's output to the browser.

strip_tags is not the best thing to use really, it doesn't protect in all cases.
HTML Purify:
http://htmlpurifier.org/
Is a real good option for processing incoming data, however it itself still will not cater for all use cases - but it's definitely a good starting point.

I have to say that the tutorial you mentioned is a little misleading about security:
It is important to note that you never want to directly work with the $_GET & $_POST values. Always send their value to a local variable, & work with it there. There are several security implications involved with the values when you directly access (or
output) $_GET & $_POST.
This is nonsense. Copying a value to a local variable is no more safe than using the $_GET or $_POST variables directly.
In fact, there's nothing inherently unsafe about any data. What matters is what you do with it. There are perfectly legitimate reasons why you might have a $_POST variable that contains ; rm -rf /. This is fine for outputting on an HTML page or storing in a database, for example.
The only time it's unsafe is when you're using a command like system or exec. And that's the time you need to worry about what variables you're using. In this case, you'd probably want to use something like a whitelist, or at least run your values through escapeshellarg.
Similarly with sending queries to databases, sending HTML to browsers, and so on. Escape the data right before you send it somewhere else, using the appropriate escaping method for the destination.

strip_tags removes every piece of html. more sophisticated solutions are based on whitelisting (i.e. allowing specific html tags). a good whitelisting library is htmlpurifyer http://htmlpurifier.org/
and of course on the database side of things use functions like mysql_real_escape_string or pg_escape_string

Well, probably I'm wrong, but... In all literature, I've read, people say It's much better to use htmlspellchars.
Also, rather necessary to cast input data. (for int for example, if you are sure it's user id).
Well, beforehand, when you'll start using database - use mysql_real_escape_string instead of mysql_escape_string to prevent SQL injections (in some old books it's written mysql_escape_string still).

Which Type of Input is Least Vulnerable to Attack?

Which type of input is least vulnerable to Cross-Site Scripting (XSS) and SQL Injection attacks.
PHP, HTML, BBCode, etc. I need to know for a forum I'm helping a friend set up.

(I just posted this in a comment, but it seems a few people are under the impression that select lists, radio buttons, etc don't need to be sanitized.)
Don't count on radio buttons being secure. You should still sanitize the data on the server. People could create an html page on their local machine, and make a text box with the same name as your radio button, and have that data get posted back.
A more advanced user could use a proxy like WebScarab, and just tweak the parameters as they are posted back to the server.
A good rule of thumb is to always use parameterized SQL statements, and always escape user-generated data before putting it into the HTML.

We need to know more about your situation. Vulnerable how? Some things you should always do:
Escape strings before storing them in a database to guard against SQL injections
HTML encode strings when printing them back to the user from an unknown source, to prevent malicious html/javascript
I would never execute php provided by a user. BBCode/UBBCode are fine, because they are converted to semantically correct html, though you may want to look into XSS vulnerabilities related to malformed image tags. If you allow HTML input, you can whitelist certain elements, but this will be a complicated approach that is prone to errors. So, given all of the preceding, I would say that using a good off-the-shelf BBCode library would be your best bet.

None of them are. All data that is expected at the server can be manipulated by those with the knowledge and motivation. The browser and form that you expect people to be using is only one of several valid ways to submit data to your server/script.
Please familiarize yourself with the topic of XSS and related issues
http://shiflett.org/articles/input-filtering
http://shiflett.org/blog/2007/mar/allowing-html-and-preventing-xss

Any kind of boolean.
You can even filter invalid input quite easily.
;-)

There's lots of BB code parsers that sanitize input for HTML and so on. If there's not one available as a package, then you could look at one of the open source forum software packages for guidance.
BB code makes sense as it's the "standard" for forums.

The input that is the least vulnerable to attack is the "non-input".
Are you asking the right question?

For Odin's sake, please don't sanitize inputs. Don't be afraid of users entering whatever they want into your forms.
User input is not inherently unsafe. The accepted answer leads to those kinds of web interfaces like my bank's, where Mr. O'Reilly cannot open an account, because he has an illegal character in his name. What is unsafe is always how you use the user input.
The correct way to avoid SQL injections is to use prepared statements. If your database abstraction layer doesn't let you use those, use the correct escaping functions rigorously (myslq_escape et al).
The correct way to prevent XSS attacks is never something like striptags(). Escape everything - in PHP, something like htmlentities() is what you're looking for, but it depends on whether you are outputing the string as part of HTML text, an HTML attribute, or inside of Javascript, etc. Use the right tool for the right context. And NEVER just print the user's input directly to the page.
Finally, have a look at the Top 10 vulnerabilities of web applications, and do the right thing to prevent them. http://www.applicure.com/blog/owasp-top-10-2010

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.