possibly dangerous text input handling

I've read up on SQL Injection, XSS and other security issues and am trying to figure out what to use to safeguard the company's site.
We are about to deploy a simple 'User feedback' form with a textarea so users can tell us how to improve the site to enhance their user experience.
When the user pushes 'submit' on the form, we read the textarea comments from the user, and then programmatically create a filename in that user's subfolder and save their comments to a file. Then we add the filename and path to that user's database record.
The team is not worried about security issues here but I am. Their thinking is "we create the filename, it is 0% based on any user input, and since we write this 'UserX comments' filename and path to the database with no direct user influence -- there is no risk."
My concern is NOT the database activity -- because they're right, the user has no role in WHAT we write to their database record since we're just creating our own filename and storing it in their db record.
My concern is the text file!
So I'm petitioning our small team to rewrite the code to use security to read then write the user's comments in the textarea to the text file.
My concern is -- since we plan to actually READ our user's feedback and open these text files to read them later on -- there might be bad stuff in the textarea that (unless we clean it) could hurt us somehow.
I'm insisting we use strip_tags() but I need to sound informed about the manner in which we sanitize the textarea input -- I'm thinking strip_tags() is the way to go here but I'm 100% new to sanitizing user input. I looked at htmlspecialchars() but that just converts certain characters like '&' to '&amp;' and so forth.
Are there other ways to sanitize/make safe any text the user types into a textarea before we write it out to a file on our web server?

Looks like strip_tags is a good way to go. I'd also suggest writing the file outside of the webroot so that it can't be accessed by a browser.
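A minimal sketch of the "outside the webroot" idea, under some assumptions: the helper names, the directory layout and the filename scheme are hypothetical, and a temp directory stands in for a real directory outside the document root. Note that this sketch swaps in escaping at display time (htmlspecialchars) rather than strip_tags() at write time; strip_tags() could be applied before writing instead if you prefer to discard markup.

```php
<?php
// Hypothetical helper: the directory layout and naming scheme are assumptions.
function save_feedback(string $baseDir, int $userId, string $comment): string
{
    // The path is built only from server-side values (an integer id).
    $dir = $baseDir . '/' . $userId;
    if (!is_dir($dir) && !mkdir($dir, 0750, true)) {
        throw new RuntimeException("Cannot create $dir");
    }
    // The filename is generated by us, never taken from the request.
    $path = $dir . '/feedback_' . bin2hex(random_bytes(8)) . '.txt';
    if (file_put_contents($path, $comment, LOCK_EX) === false) {
        throw new RuntimeException("Cannot write $path");
    }
    return $path;
}

// Hypothetical helper: escape the stored text when it is shown in an HTML page.
function render_feedback(string $path): string
{
    $raw = file_get_contents($path);
    if ($raw === false) {
        throw new RuntimeException("Cannot read $path");
    }
    return htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
}

// Demo only: in production $base would be a directory outside the document root.
$base = sys_get_temp_dir() . '/feedback_demo_' . bin2hex(random_bytes(4));
$path = save_feedback($base, 42, '<script>alert(1)</script> Nice site!');
echo render_feedback($path), "\n";
```

The file on disk holds the comment exactly as typed; it only becomes harmless markup-free text at the moment it is rendered into HTML.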

If you're not worried about SQL injection, and it seems you're not (either because you know the SQL is sanitized or because you're saving to a text file), then the other problem is the possible XSS attack.
It's easy to ignore those, they don't affect you directly. An XSS attack is an attack that allows one to inject client-side scripts into a webpage. Your database works fine, your server files are not modified, your session files aren't modified either.
This vulnerability is completely client-side. Like I said, it doesn't affect your server. But then someone (i.e.: me) goes on your website, and all of a sudden is redirected to a Warez site while viewing a totally SFW, trusted website. You lose trust from your users. The search engines that crawl your site also mark you as possibly harmful. You lose traffic. You lose revenues. Then again, your server is perfectly fine.
You definitely need to sanitize the user input that is outputted back to the user because of this. Yes, strip_tags is a solution and so is htmlspecialchars or htmlentities.
strip_tags is a little less restrictive however, because it allows you to define some tags you'd like your users to be able to insert in their posts, like bold, links, or italic.
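To make the difference concrete, here is a small sketch comparing the two approaches on the same input. The sample string and the list of allowed tags are made up for illustration.

```php
<?php
// Sample input, invented for this illustration.
$input = 'Great site! <b>Love it</b> <script>alert("x")</script>';

// strip_tags() removes markup; the second argument lists tags to keep.
// The text between removed tags stays behind.
$stripped = strip_tags($input, '<b><i><a>');

// htmlspecialchars() keeps everything but neutralises it for HTML output.
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

echo $stripped, "\n";
echo $escaped, "\n";
```

The first line printed still contains a live `<b>` element; the second shows the user's text literally when viewed in a browser.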
In conclusion, you are absolutely right on insisting on this practice. It doesn't affect you (i.e.: your company's server) directly, but it will affect you at some point if you want a trusted presence on the world wide web.
I know this may be a longer answer than the others, which simply suggest strip_tags. They are absolutely right, which is why I upvoted them. Just trying to give you some "corporate" arguments there. :)

It depends on how you are creating a file, and what are you doing with the text after reading it.
If you are using PHP's native functions to write the file, then you should have no problem with remote code execution.
If all you do after reading it is display to a user via HTML, htmlentities(), which effectively makes HTML tags inside the text powerless while still displaying correctly to the user, should be enough.
If you are using it as a part of some query to a database, you should use that database cleaning routine before concatenating it to your SQL. (ie. mysql_real_escape_string() for MySQL, or pg_escape_string() for PostgreSQL).
You may also want to take a look at some info on the OWASP page.
Edit: I forgot to mention, you should also use ENT_QUOTES with htmlentities to prevent single quote injections.
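A short sketch of why ENT_QUOTES matters when the value lands inside a single-quoted attribute. The payload is invented for illustration. The flags are passed explicitly because the default was ENT_COMPAT before PHP 8.1, which leaves single quotes untouched.

```php
<?php
// Invented payload that tries to break out of a single-quoted attribute.
$name = "x' onmouseover='alert(1)";

// ENT_COMPAT converts double quotes only, so the single quotes survive.
$compat = htmlentities($name, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES converts both kinds of quote.
$quotes = htmlentities($name, ENT_QUOTES, 'UTF-8');

echo "<input value='$compat'>\n"; // attribute is broken open
echo "<input value='$quotes'>\n"; // attribute stays intact
```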

simply use mysql_real_escape_string() to get rid of quotes. htmlentities() if you are worried about js files. That should be about as good as it gets there.

Sanitizing input only protects you from sql injection by removing certain characters. These characters, however, cannot act in a malicious manner from within a text file. I happen to know quite a bit about malware, and trust me, you are not at risk here.
Edit:
If I somehow missed the point of this post through my rambling, do let me know so I can update my answer.

I have a solution that follows the mainstream ideology of your development team:
Do not use any user authorization on your site, including admins. Thus, no XSS will harm you either.

How to sanitize and store user input that contains an HTML regex pattern in WordPress

I'm working on a WordPress plugin, and one of its features is the ability to store an HTML regex pattern, entered by the user, in the DB and then display it on the settings page.
My method actually works, but I wonder if the code is secure enough:
That's the user entered pattern:
<div(.+?)class='sharedaddy sd-sharing-enabled'(.*?)>(.+?)<\div><\div><\div>
That's the way I'm storing HTML pattern in DB:
$print_options['custom_exclude_pattern'] = htmlentities(stripslashes($_POST['custom_exclude_pattern']),ENT_QUOTES,"UTF-8");
That's how it's actually stored in WordPress DB:
s:22:"custom_exclude_pattern";s:109:"<div(.+?)class="sharedaddy sd-sharing-enabled"(.*?)>(.+?)<\div><\div><\div>";
And that's how the output is displayed on settings page:
<input type="text" name="custom_exclude_pattern" value="<?php echo str_replace('"',"'",html_entity_decode($print_options['custom_exclude_pattern'])); ?>" size="30" />
Thanks for help :)

From the comments, it sounds like you are concerned about two separate issues (and possibly unaware of a third one that I will mention in a minute) and looking for one solution for both: SQL Injection and Cross-Site Scripting. You have to treat each one separately. I implore you to read this article by Defuse Security.
How to Prevent SQL Injection
This has been answered before on StackOverflow with respect to PHP applications in general. WordPress's $wpdb supports prepared statements, so you don't necessarily have to figure out how to work with PDO or MySQLi either. (However, any vulnerabilities in their driver WILL affect your plugin. Make sure you read the $wpdb documentation thoroughly.)
You should not escape the parameters before passing them to a prepared statement. You'll just end up with munged data.
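As a standalone illustration of the point about not pre-escaping, here is a sketch that swaps in PDO with an in-memory SQLite database so it can run outside WordPress (it assumes the pdo_sqlite extension is available). The table and column names are hypothetical. Inside a plugin you would go through $wpdb->prepare() instead.

```php
<?php
// Stand-in database: in-memory SQLite via PDO, not WordPress's $wpdb.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// Hypothetical table for this sketch.
$pdo->exec('CREATE TABLE options (name TEXT PRIMARY KEY, value TEXT)');

// Raw user input: no htmlentities(), no addslashes(), no manual escaping.
$pattern = "<div(.+?)class='sharedaddy sd-sharing-enabled'(.*?)>";

// Values travel separately from the SQL text, so quotes cannot alter the query.
$insert = $pdo->prepare('INSERT INTO options (name, value) VALUES (:name, :value)');
$insert->execute([':name' => 'custom_exclude_pattern', ':value' => $pattern]);

$select = $pdo->prepare('SELECT value FROM options WHERE name = :name');
$select->execute([':name' => 'custom_exclude_pattern']);
$stored = $select->fetchColumn();

var_dump($stored === $pattern);
```

What comes back from the database is byte-for-byte what the user typed, which is exactly what you want to feed into an output-time escaping function later.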
Cross-Site Scripting
As of this writing (June 2015), there are two general situations you need to consider:
The user should not be allowed to submit any HTML, CSS, etc. to this input.
The user is allowed to submit some HTML, CSS, etc. to this input, but we don't want them to be able to hack us by doing so.
The first problem is straightforward enough to solve:
echo htmlentities($dbresult['field'], ENT_QUOTES | ENT_HTML5, 'UTF-8');
The second problem is a bit tricky. It involves allowing only certain markup while not accidentally allowing other markup that can be leveraged to get Javascript to run in the user's browser. The current gold standard in XSS defense while allowing some HTML is HTML Purifier.
Important!
Whatever your requirements, you should always apply your XSS defense on output, not before inserting stuff into the database. Recently, WordPress core had a stored cross-site scripting vulnerability that resulted from the decision to escape before storing rather than to escape before rendering. By supplying a sufficiently long comment, attackers could trigger a MySQL truncation bug on the escaped text, which allowed them to bypass their defense.
Bonus: PHP Object Injection from unserialize()
That's how it's actually stored in WordPress DB:
s:22:"custom_exclude_pattern";s:109:"<div(.+?)class="sharedaddy sd-sharing-enabled"(.*?)>(.+?)<\div><\div><\div>";
It looks like you're using serialize() when storing this data and, presumably, using unserialize() when retrieving it. Be careful with unserialize(); if you let users have any control over the string, they can inject PHP objects into your code, which can also lead to Remote Code Execution.
Remote Code Execution, for the record, means they can take over your entire website and possibly the server that hosts your blog. If there is any chance that a user can alter this record directly, I highly recommend using json_encode() and json_decode() instead.
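A minimal sketch of the json_encode()/json_decode() alternative, assuming the options are a plain array of strings (the array contents here are invented). JSON_THROW_ON_ERROR requires PHP 7.3 or later.

```php
<?php
// Invented options array for illustration.
$options = ['custom_exclude_pattern' => "<div(.+?)class='sharedaddy'(.*?)>"];

// Store as JSON text instead of PHP's serialize() format.
$stored = json_encode($options, JSON_THROW_ON_ERROR);

// Decoding to an associative array only ever yields arrays and scalars,
// never PHP objects with magic methods.
$loaded = json_decode($stored, true, 512, JSON_THROW_ON_ERROR);

// A serialize()-style object payload is simply invalid JSON.
try {
    json_decode('O:8:"stdClass":0:{}', true, 512, JSON_THROW_ON_ERROR);
    $rejected = false;
} catch (JsonException $e) {
    $rejected = true;
}

var_dump($loaded === $options, $rejected);
```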

I hope I got the point, if not then correct me: you are trying to dynamically insert a pattern for an input field, based on the same pattern being stored in your db, right?
Well, personally I think patterns are a good help for usability, in that the user knows his input format is not correct without needing to submit and refresh every time.
The big problem of patterns is, HTML code can be modified client-side. I believe the only safe solution would be to check server-side for the correctness of the input... There is no way a client side procedure can be safer than a server-side one!

Well, if you are going to let your user input a regex, you could just do something like a prepared statement + htmlentities($input, ENT_COMPAT, "UTF-8"); to sanitize the input, and then do the opposite, that is html_entity_decode($dataFromDb, ENT_COMPAT, "UTF-8");. The prepared statement is a must; all the other ways to work around a malicious input can be combined in lots of different ways!

XSS in URI in page without any input

Is the XSS attack made by user input?
I have received attacks like this:
'"--></style></script><script>alert(0x002357)</script>
when scanning a php page without any html content with acunetix or netsparker.
Thanks in advance

Remember that even if you had just a static collection of HTML files without any server-side or client-side scripting whatsoever, you may still store your logs in an SQL database or view them as HTML using some log analyzer which may be vulnerable to this kind of URI. I have seen URIs in logs that were using escape sequences to run malicious commands in command line terminals – google for escape sequence injection and you may be surprised how popular they are. Attacking web-based log analyzing tools is even more common – google for log injection. I am not saying that this particular attack was targeted at your logs, but I'm just saying that not displaying any user input on your web pages doesn't mean that you are safe from malicious payloads in your URIs.
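One way to blunt the terminal escape sequence and log injection problems described above is to neutralise control characters before a URI is written to a log. This is a rough sketch; the function name and the \xNN output format are assumptions, and a web-based log viewer would still need to HTML-escape the line when displaying it.

```php
<?php
// Hypothetical helper: make control characters visible instead of active.
function sanitize_for_log(string $value): string
{
    // Matches ASCII control characters, including ESC, CR and LF.
    return preg_replace_callback(
        '/[\x00-\x1F\x7F]/',
        static fn(array $m): string => sprintf('\\x%02X', ord($m[0])),
        $value
    );
}

// Invented request URI carrying a terminal escape and a forged log line.
$uri = "/index.php?q=\x1b[2Jfake\r\n127.0.0.1 - admin";
echo sanitize_for_log($uri), "\n";
```

The ESC byte can no longer clear a terminal, and the CR/LF pair can no longer start a fake second log entry.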

I'm not 100% sure I understand your question. If I understood you correctly, you used a security scanner to check your web application for XSS vulnerabilities and it did show a problem about which you aren't sure if it really is a problem.
XSS is pretty simple: whenever there is a way to force an application to display unfiltered code a user provided, there is a vulnerability.
The attack code you show above seems to target a style tag that includes certain user-provided data (e.g. a template variable or something similar). You should check if there's such a thing in your app and make sure it's properly filtered.

Blackbox scanners will try this attack even when your HTML doesn't expect any parameter (because there is no easy way for them to know what's going on in your source code). If you don't echo anything or use stuff like PHP_SELF, you are fine.
Also take a look at DOM Based XSS to understand how XSS might happen without any server-side flaw.
If the scanner reports a vulnerability, take a look at the description and source code; generally it will highlight the vulnerable part of the source code so you can see it.
Secondly, you can test manually, and if it executes JS then you can investigate whether it's about your framework, a vulnerability in the JavaScript code, or URL rewriting (maybe you echo your current path in the page) or something like that.

Where did you find this XSS? As far as I am aware, if a page does not take any user input (and process/display it), it cannot be vulnerable to XSS.
Edit:
I think I misunderstood your question - did you mean can XSS occur by entering Javascript in the address bar in the browser? Or by appending Javascript to the URI? If the latter - then the page is susceptible to XSS and you should use a whitelist for any variables passed to your URI. If the former, then no, any client-side changes in the address bar will only be visible to that single user.

Security and php

So I have a website that, given a user's input, will generate a /home/content/s/a/m/p/l/e/users/profile/index.php. My real question is, is this safe? This is what I do to try to sanitize the user's input; if there is more, please let me know.
strip_tags(html_entity_decode($mysqli->real_escape_string($title)), ALLOWED_TAGS);
ALLOWED_TAGS = "<br><p><b><i><hr>";
Since I am relatively new to this website development, I am wondering if this is a good approach, because it takes the strain off using the database to get the same information over and over again, instead just have a static page with the information on it, or is this a HUGE security hole? I do not know! :) I do not know if they could do some sort of XSS attack with what I have setup here. Please help!
Michael
P.S. If you have any answers or suggestions, could you please give me some insight into why it is. I have a degree in computer science so I am curious on how it works, not just the quick and dirty solution. Thanks.

This is a PHP security checklist I compiled for my company's internal knowledge base. Maybe it helps.
Do not use deprecated functions and practices
Always validate user input
Use place holders when using variable values in an SQL query.
Always escape variables used in SQL queries.
Set proper directory permissions
Always regenerate session id when the user logs in each time. (To avoid session id hijacking)
Never store passwords in plain text. Store only their hashed values.
When outputting user input in a web page, always check for HTML special characters. (HTML tags like <script> may be used for XSS attacks)
Know the specs of your deployment server before you move to it
Protect directories where log entries are saved.
Set register_globals to off
PHP safe mode can be useful, but it is deprecated since version 5.3
If not used in the code, disable the functions system and exec using the disable_functions setting in php.ini
Set display_errors to off in production/live servers.
Validate Cookie Data
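As an illustration of the "store only hashed values" item, here is a minimal sketch using PHP's built-in password API (available since PHP 5.5). The sample password is invented. The session item would typically be handled by calling session_regenerate_id(true) right after a successful login, which is not shown here because it needs an active session.

```php
<?php
// Invented sample password for this sketch.
$password = 'correct horse battery staple';

// password_hash() picks a salt and a strong algorithm for you.
$hash = password_hash($password, PASSWORD_DEFAULT);

// Store $hash in the database; verify a login attempt like this:
$ok  = password_verify($password, $hash);
$bad = password_verify('wrong guess', $hash);

var_dump($ok, $bad);
```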

This XSS input validation is awful. html_entity_decode() is the opposite of what you need. Furthermore, some of these tags, such as the <p> tag, allow you to execute JavaScript in an event handler. So in short this code doesn't do shit to stop XSS.
You should use htmlspecialchars($var,ENT_QUOTES); or HTML Purifier. If you go the HTML Purifier route make sure you keep that shit up to date, it gets bypassed every couple of weeks. Oh, and HTML Purifier is very computationally expensive because it uses THOUSANDS of regexes.
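The event handler point can be checked quickly. This sketch reuses the ALLOWED_TAGS list from the question; the $title payload is invented. strip_tags() keeps allowed tags together with all of their attributes.

```php
<?php
// Same allow-list as in the question.
const ALLOWED_TAGS = '<br><p><b><i><hr>';

// Invented payload: an allowed tag carrying an event handler.
$title = '<p onmouseover="alert(1)">hover me</p><script>x</script>';

// The <script> tags go away, but the onmouseover attribute survives.
$stripped = strip_tags($title, ALLOWED_TAGS);

// Escaping turns the whole string into inert text.
$escaped = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');

echo $stripped, "\n";
echo $escaped, "\n";
```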

security issues with echoing a user entered text

I am accepting user text in a form and echoing it back on the page (the text goes to the database as well, but that uses prepared queries so no worries there). I wanted to know if there are any possible security implications that can be caused by it? On the server side I mean; I know on the client side you can break things, but can you reach the server side?
I need to know if something like eval can be possibly done with this case.

The scenario you explained is called XSS. It is possible to compromise your server with the help of an XSS vulnerability, but it does need other things to fall in place.
Say you have an administrator account that has permissions to make configuration changes to your server over the web. Now, if an attacker creates a XSS link and somehow gets the administrator to click it, his account would be compromised.
Once the attacker has administrator access, he can systematically take control of the entire system. This happened recently with Apache - read their article on it. It is the best write-up on a security incident I have ever seen, you will learn a lot from it.

use htmlspecialchars($yourstring) in php, or strip characters, no need to open possibilities for exploits.

If you use the user input directly to query an SQL database, you can be subjected to SQL injections. Just google it for examples.
EDIT: Oh, I missed the text saying that you just echo the text. Hm, well, maybe the user can issue PHP commands if you evaluate the user input. But I don't know why you should do that because then the user could issue any PHP commands to the server (which is a clear security risk)...

Use:
echo htmlentities($string);
Everywhere. Unless you want to open your application to dozens of possible attacks:
http://ha.ckers.org/xss.html
If you need to echo a HTML markup:
1) Use HTMLPurifier on the HTML before saving it to the database.
2) I recommend to use XHTML STRICT filtering.
3) Disallow tags like script and frame, and attributes like onclick etc. The list of tags and attributes users entering HTML should never need is quite long. Just restrict them to what they might need, e.g.: p, ol, ul, h1, h2, h3, dl, abbr, img (these can be dangerous, many possible attacks through the img tag, be careful), a (ditto), table, maybe a few more.

Which Type of Input is Least Vulnerable to Attack?

Which type of input is least vulnerable to Cross-Site Scripting (XSS) and SQL Injection attacks?
PHP, HTML, BBCode, etc. I need to know for a forum I'm helping a friend set up.

(I just posted this in a comment, but it seems a few people are under the impression that select lists, radio buttons, etc don't need to be sanitized.)
Don't count on radio buttons being secure. You should still sanitize the data on the server. People could create an html page on their local machine, and make a text box with the same name as your radio button, and have that data get posted back.
A more advanced user could use a proxy like WebScarab, and just tweak the parameters as they are posted back to the server.
A good rule of thumb is to always use parameterized SQL statements, and always escape user-generated data before putting it into the HTML.
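A sketch of what server-side checking of a radio button might look like. The field name "rating" and its allowed values are hypothetical; the point is that the server compares the posted value against its own list rather than trusting the form.

```php
<?php
// Hypothetical field: a radio group named "rating".
function read_rating(array $post): ?string
{
    $choices = ['poor', 'ok', 'great']; // the only values the form offers
    $value = $post['rating'] ?? null;

    // Strict comparison; anything else (including arrays) is rejected.
    return is_string($value) && in_array($value, $choices, true) ? $value : null;
}

var_dump(read_rating(['rating' => 'great']));
var_dump(read_rating(['rating' => "great'; DROP TABLE users; --"]));
```

In a real handler you would pass $_POST and treat a null result as an invalid submission.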

We need to know more about your situation. Vulnerable how? Some things you should always do:
Escape strings before storing them in a database to guard against SQL injections
HTML encode strings when printing them back to the user from an unknown source, to prevent malicious html/javascript
I would never execute php provided by a user. BBCode/UBBCode are fine, because they are converted to semantically correct html, though you may want to look into XSS vulnerabilities related to malformed image tags. If you allow HTML input, you can whitelist certain elements, but this will be a complicated approach that is prone to errors. So, given all of the preceding, I would say that using a good off-the-shelf BBCode library would be your best bet.

None of them are. All data that is expected at the server can be manipulated by those with the knowledge and motivation. The browser and form that you expect people to be using is only one of several valid ways to submit data to your server/script.
Please familiarize yourself with the topic of XSS and related issues
http://shiflett.org/articles/input-filtering
http://shiflett.org/blog/2007/mar/allowing-html-and-preventing-xss

Any kind of boolean.
You can even filter invalid input quite easily.
;-)
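Joke aside, PHP does make the boolean case easy. A small sketch using filter_var(); the wrapper name is made up.

```php
<?php
// Hypothetical wrapper: true/false for recognised values, null for anything else.
function read_flag($raw): ?bool
{
    return filter_var($raw, FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE);
}

var_dump(read_flag('yes'));            // recognised as true
var_dump(read_flag('off'));            // recognised as false
var_dump(read_flag('1; DROP TABLE'));  // not a boolean: null
```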

There's lots of BB code parsers that sanitize input for HTML and so on. If there's not one available as a package, then you could look at one of the open source forum software packages for guidance.
BB code makes sense as it's the "standard" for forums.

The input that is the least vulnerable to attack is the "non-input".
Are you asking the right question?

For Odin's sake, please don't sanitize inputs. Don't be afraid of users entering whatever they want into your forms.
User input is not inherently unsafe. The accepted answer leads to those kinds of web interfaces like my bank's, where Mr. O'Reilly cannot open an account, because he has an illegal character in his name. What is unsafe is always how you use the user input.
The correct way to avoid SQL injections is to use prepared statements. If your database abstraction layer doesn't let you use those, use the correct escaping functions rigorously (mysql_real_escape_string et al).
The correct way to prevent XSS attacks is never something like strip_tags(). Escape everything - in PHP, something like htmlentities() is what you're looking for, but it depends on whether you are outputting the string as part of HTML text, an HTML attribute, or inside of Javascript, etc. Use the right tool for the right context. And NEVER just print the user's input directly to the page.
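A rough sketch of "the right tool for the right context", using Mr. O'Reilly as the sample input with an invented script payload appended. For HTML text and attributes it uses htmlspecialchars(); for a value embedded in an inline script it uses json_encode() with the JSON_HEX_* flags. This is one common approach, not the only one.

```php
<?php
// The name is accepted as typed; only the output step differs by context.
$name = "O'Reilly <script>alert(1)</script>";

// HTML text or attribute context.
$html = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');

// JavaScript context: emit a JS string literal with <, >, &, ' and " hex-escaped.
$js = json_encode(
    $name,
    JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_HEX_AMP | JSON_THROW_ON_ERROR
);

echo "<p>Hello, $html</p>\n";
echo "<input type=\"text\" value=\"$html\">\n";
echo "<script>var userName = $js;</script>\n";
```

The stored value never changes, so Mr. O'Reilly keeps his apostrophe; each output context just gets its own encoding.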
Finally, have a look at the Top 10 vulnerabilities of web applications, and do the right thing to prevent them. http://www.applicure.com/blog/owasp-top-10-2010
