Should I escape characters in my GET and POST requests?

Should I escape characters in my GET and POST requests? - php

I have just read that PHP escapes incoming GET and POST requests on its own for some time. Double escaping does no good. Should I escape the strings at all?
For example I process a simple input like this:
$contact = mysqli_real_escape_string($dbLink, strip_tags($_POST['contact']));
Later, when saved and retrieved from the database I fill the input with last values, like:
echo '<input type="text" class="form-control" id="inputContact" name="contact" value="'.$contact.'">'.PHP_EOL;
When someone enters quotes in the field, it returns something like this, which destroys the form:
<input type="text" class="form-control" id="inputContact" name="contact" value="0900 123 456, jozefmat" ejkasdfadsf"="">

I have just read that PHP escapes incoming GET and POST requests on its own for some time
This is magic quotes, they were always ineffective and more trouble then they were worth. They have been deprecated and modern versions of PHP do not support them at all.
Should I escape the strings at all?
Yes. You should perform suitable sanitization of untrusted data (either via escaping, white list filtering or some other suitable means) as is applicable for the place you are putting the data (which is different depending on if you are inserting it into a database query (search for SQL injection), an HTML document (search for XSS or Cross-Site Scripting) or somewhere else).
As you have noticed, the options you have available to do even within an HTML document vary - what is suitable for "Inside an element" is not always suitable for "Inside an attribute value" or "Inside a script element".

It really depends upon what you are using the responses for and if you are manipulating them after.
For instance, if you are inserting the data into a DB then you should escape your data to prevent SQL injections. If you are just displaying the data then there should be no need unless you are displaying specific characters. But you can also exchange them for HTML entities
in the example you gave, you should escape them because that is how SQL injection works

it depends.If your retrieving or sending anything to a database then yes you should escape characters to avoid mysql injection.But if its just dealing with the client side(browser) aspect then it wouldnt be a big deal

Related

Is escaping output from MySQL server necessary if data being retrieved has already been sanitized?

I'm interested to know whether or not it is necessary to escape output from a MySQL server if the data that is being retrieved has already been filtered when the user submitted a form.
Example:
1. The user submits a form with a comment for a blog post.
2. On form submission, prior to sending data to MySQL server, their input is filtered with FILTER_SANITIZE_SPECIAL_CHARS to prevent injection attacks.
3. Once the data has been posted to server, the user is rerouted to another screen where they can view their comment.
4. When retrieving their comment from the server (which has stored the filtered input), is it necessary to escape this output as well?
Here's the main issue for me. I'm taking user input from a form (for a blog post), sanitizing it with FILTER_SANITIZE_SPECIAL_CHARS, and then posting it to the MySQL server. If I retrieve this information from the server and display it in html, there are no issues. HOWEVER, I have been reading that you should ALWAYS escape output from servers as well. So I escaped the same post with htmlspecialchars(). Now, I have the issue that ALL special chars (including parentheses, and any quotes that are used by the user in their post) are coming back in their escaped html format. Not user friendly whatsoever.
What is the best work around for this, or is it even necessary to escape the output if it is coming from the server and has already been sanitized on user input?

Sanitization is not the same as escaping, and you should make sure not to confuse the two.
Sanitization is removing unwanted input. That is, if the user adds a <script> tag to their input, and you don't want their input to include <script> tags, then removing that <script> tag would be sanitization. Sanitization is not escaping data for an output context.
Escaping is properly encoding data for an output context. For example, to prevent HTML injection, you might call htmlspecialchars() to correctly encode & as &. To prevent SQL injection, you might use mysqli::real_escape_string() to convert ' to \'. (Though it would be highly preferable to use prepared statements / parameterized queries to prevent having to worry about sql injection or escaping at all.)
Importantly, escaping is context-specific. An escaping you use for HTML is not necessarily valid or sufficient for SQL (or vice-versa, or any other output context).
The problem with FILTER_SANITIZE_SPECIAL_CHARS is that that it's poorly named: it's doing both in one step, which is confusing for your database (since your database now has html-encoded data), and confusing for output (because now you have already-escaped data that is vulnerable to being multiply-escaped).
Instead, you should explicitly separate your sanitization and escaping efforts. Only sanitize data on input that you don't want to persist. Only escape data on output, and according to its proper output context.
The reason you want to store raw (pre-output-escaped) data in the database is so that if you ever need to output to a different context (e.g. now you're dong JSON output, or you need to write it to a file, or actually see what the raw data is), you won't need to unescape it first. (If you really have to, you might reasonably store a pre-escaped copy in a separate column, but you should always have your original data available.) It also makes the rule simple: always sanitize input; always escape output.

Zend\escap data before inserting into database

I'm adding some xss protection to the website I'm working on, the platform is zendFrameWork 2 and therefor I'm using Zend\escaper. from zend documentation i knew that:
Zend\Escaper is meant to be used only for escaping data that is to be
output, and as such should not be misused for filtering input data.
For such tasks, the Zend\Filter component, HTMLPurifier.
but what are the riskes if i escaped the data before inserting it into the database, am i so wrong to do that? please explane to me as im somehow new to this topic.
thanks

When encoding data before storing it you will have to decode it before you can do anything sensible with it before outputting it. That's why I'd not do it.
Let's say you have an international application and you want to store the escaped value of a form field which might contain any NON-ASCII characters those might become escaped into HTML-Entities. So what if you have to quantify the content of that field? Like counting the characters? You will always have to de-escape the content before counting it. and then you have to re-escape it again. Much work done but nothing gained.
The same applies to search-operations in your database. You will have to escape the search-phrase the same way then your input for the database to understand what you are looking for.
I'd use one character-set throughout the application and database (I prefer UTF-8, beware of the MySQL-Connection....) and only escape content on output. Thant way I can then do whatever I like with the data and are on the safe side on output. And escaping is done in my view-layer automaticaly so I don't even have to think about it every time I handle data as it works automaticaly. That way you can't forget it.
That does not prevent me from filtering and sanitizing the input. And it doesn't prevent me from escaping the database-content using the appropriate database-escaping mechanisms like mysqli_real_escape_string or similar or using prepared statements!
But that's just my opinion, others might think otherwise!

"Output" here refers to the web page. A form field ( HTML tag) is an INPUT (from the webpage), any text is an OUTPUT (to the webpage). You need to ensure any output (to the webpage) does not contain dangerous characters that could be used to forge XSS attack vectors.
This said, if you have DANGEROUS_INPUT_X given by the user and then
$NOT_DANGEROUS_ANYMORE = ZED.HtmlPurifier(DANGEROUS_INPUT_X)
DBSave($NOT_DANGEROUS_ANYMORE)
and somewhere else
$OUTPUT = DBLoad($NOT_DANGEROUS_ANYMORE)
echo $OUTPUT
you should be fine, as long as you do not apply any additional encoding/decoding to this output. It will be displayed in the way it is saved, that was safe.
I would suggest to look at output encoding more than validation: HtmlPurifier cleans the HTML, while you could accept any kind of bad characters if you ensure your output is encoded in the page.
Here https://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet some general rules, here the PHP example
echo htmlspecialchars($DANGEROUS_INPUT_X_NOW_OUTPUT, ENT_QUOTES, "UTF-8");
Remember to set the Character Set and be consistent with the same one throughout your pages/scripts/binaries and in the database as well.

html sanitization makes difficulties

I read user profiles from database and show them. Before I show them I use HTML sanitizing through php htmlentities. It shows them correctly. But, while allowing user to edit it, it is shown like double filtered.
echo '<input id="about" name="about" value="'.$php_filtered_value>.'">';
Then inside the input, ampersand would look like &
If I don't filter the variable there is worry about html injection.
What should I do?

I prefer to follow OWASP RULE#2:
> RULE #2 - Attribute Escape Before Inserting Untrusted Data into HTML
Requirements:
-Aggressive HTML Entity Encoding
-Only place untrusted data into a whitelist of safe attributes (listed below).
-Strictly validate unsafe attributes such as background, id and name.
Please see XSS (Cross Site Scripting) Prevention Cheat Sheet

Don't double escape the text then (as in: once before storing it in the database and again before echoing it).
Unescaped (what you typed): Able & Baker
Escaped once (what is being stored in the DB): Able & Baker
Double-escaped (what is ending up in your HTML): Able &amp; Baker
Rather, escape the text only once: generally on the output side, not on the input side.

Saving textarea in the database

I've been searching about this, but I can't find the most important part - what field to use.
I want to save a textarea without allowing any kind of javascript, html or php. What functions should I run the posted textarea through before saving it in the database? And what field type should I use for it in the database? It'll be a description, max 1000 chars.

There are a number of ways to go around in removing/handling code so that it can be saved in your database.
Regular Expressions
One way (but may be hard and unreliable) is to remove/ detect code using regular expressions.
For example, the following removes all script tags using php code (Taken from here):
$mystring = preg_replace('/<script\b[^>]*>(.*?)<\/script>/is', "", $mystring)
The stip_tags PHP function
You can also make use of the built in stip_tags function which strips HTML and PHP tags from a string. The manual provides several examples, one shown below for your convenience:
<?php
$text = '<p>Test paragraph.</p><!-- Comment --> Other text';
echo strip_tags($text);
echo "\n";
// Allow <p> and <a>
echo strip_tags($text, '<p><a>');
?>
HTML Purifier
You can check out HTML Purifier, which is a common HTML filter PHP library intended to detect and remove dangerous code.
Simple code found on their Getting Started Section:
require_once '/path/to/HTMLPurifier.auto.php';
$config = HTMLPurifier_Config::createDefault();
$purifier = new HTMLPurifier($config);
$clean_html = $purifier->purify($dirty_html);
In Practice (Safe Output)
If you are trying to avoid XSS attacks or Injection attacks, cleaning user data is the wrong way to go about it. Removing tags is not a 100 % guarantee for keeping your service safe from these attacks. Therefore, in practice, user data containing code is not usually filtered/ cleaned, but rather escaped during output. More specifically, the special characters within the string are escaped, where these characters are based on the syntax of the language. An example of this is making use of PHP's htmlspecialchars function in order to convert special characters to their respective HTML entities. A Code Snippet taken from manual is shown below:
<?php
$new = htmlspecialchars("<a href='test'>Test</a>", ENT_QUOTES);
echo $new; // <a href='test'>Test</a>
?>
For more information about escaping and a very good explanation related to your question, look at this page. It shows you other forms of output escaping. Also, for a question and answer related to escaping, click here.
Furthermore, one more short but VITAL point I want to throw at you is that ANY data received from a user CANNOT be trusted.
SQL Injection Attacks
Definition (From here)
A SQL injection attack consists of insertion or "injection" of a SQL
query via the input data from the client to the application. A
successful SQL injection exploit can read sensitive data from the
database, modify database data (Insert/Update/Delete), execute
administration operations on the database (such as shutdown the DBMS),
recover the content of a given file present on the DBMS file system
and in some cases issue commands to the operating system.
For SQL Injection attacks: Use prepared statements and parameterized queries when storing information to the database. (Question and Answer found here) A tutorial of prepared statements using PDO can be found here.
Cross-site Scripting (XSS)
Definition (from here):
Cross-Site Scripting attacks are a type of injection problem, in which
malicious scripts are injected into the otherwise benign and trusted
web sites. Cross-site scripting (XSS) attacks occur when an attacker
uses a web application to send malicious code, generally in the form
of a browser side script, to a different end user.
I personally like this image for a better understanding.
For XSS attacks: you should consult this famous page, which describes rule by rule on what needs to be done.

TLDR:
It is conventional to use htmlspecialchars() to encode text on output, rather than filter the text on input. A text field is fine for this purpose.
What you need to defend against
You are trying to protect yourself from XSS. XSS happens when users can stored HTML control characters on your site. Other users will see this HTML markup, so a malicious user can use your page to redirect people to other sites or steal cookies and so on.
You need to consider this for all of your inputs: this should include any varchar or text field that can be stored in your database; not just your textareas. I can add malicious content to an input field just as easily as I can add it to a textarea.
How do we defend against this?
Let's say that a user claims that their username is:
<script src="http://example.com/malicious.js"></script>
The simplest way to handle this is to save this into the database "as is". However, whenever you echo it on the site, you should filter it through the PHP htmlspecialchars() function:
echo 'Hi, my name is ' . htmlspecialchars($user->username) . '!';
htmlspecialchars turns the HTML control characters (<, >, &, ', and ") into their HTML Entities (<, >, &, &apos;, and "). This would look like the original character in a browser (i.e.: to normal users), but it would not act like actual HTML markup.
The result is that instead of malicious JavaScript, the user's name would literally look like <script src="http: //example.com/malicious.js"></script>.
Why filter on output? Why not on input?
1 - OWASP recommends this way
2 - If you forget to protect an input field, and someone figures it out and adds malicious content, you now need to find the malicious content in the database and repair the fault code on your site.
3 - If you forget to encode an output field, and someone manages to sneak in malicious input, then you only need to repair the faulty code on your site.
4 - It is possible for users to write usernames that would break the HTML fields used to edit the usernames. If you encode the content before you store it in the database, then you need to display it "as is" in the appropriate input fields (let's assume that an admin or the user can change their username later). But, let's suppose that a user found a way to inject malicious code into the database. What if they said that their username is: " style="display:none;" />. The input field that would let the administrator change this username now looks like:
<input type="text" name="username" value="" style="display:none;" />" />
malicious content -> ^^^^^^^^^^^^^^^^^^^^^^^^^^
Now, the admins can't fix the problem: the input field has disappeared. But, if you encode the text on output, then all of your input fields will have protection against malicous content. Now, your inputs will look like this:
<input type="text" name="username" value="" style="display:none;" />" />
safe content -> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Input text and special characters and MySQL

I have a simple textbox in a form and I want to safely store special characters in the database after POST or GET and I use the code below.
$text=mysql_real_escape_string(htmlspecialchars_decode(stripslashes(trim($_GET["text"])),ENT_QUOTES));
When I read the text from the database and put it in the text value I use the code above.
$text=htmlspecialchars($text_from_DB,ENT_QUOTES,'UTF-8',false);
<input type="text" value="<?=$text?>" />
I am trying to save in the database with no special characters (meaning I don't want to write in database field " or ')
Actually when writing to the database do htmlspecialchars_decode to the text.
When writing to the form text box do htmlspecialchars to the text.
Is this the best approach for safe writing special chars to the database?

You have the right idea of keeping the text in the database as raw. Not sure what all the HTML entity stuff is for; you shouldn't need to be doing that for a database insertion.
[The only reason I can think of why you might try to entity-decode incoming input for the database would be if you find you are getting character references like Š in your form submission input. If that's happening, it's because the user is inputting characters that don't exist in the encoding used by the page with the form. This form of encoding is totally bogus because you then can't distinguish between the user typing Š and literally typing Š! You should avoid this by using the UTF-8 encoding for all your pages and content, as every possible character fits in this encoding.]
Strings in your script should always be raw text with no escaping. That means you don't do anything to them until the time you output them into a context that isn't plain-text. So for putting them into an SQL string:
$category= trim($_POST['category']);
mysql_query("SELECT * FROM things WHERE category='".mysql_real_escape_string($category)."'");
(or use parameterised queries to avoid having to manually escape it.) When putting content into HTML:
<input type="text" name="category" value="<?php echo htmlspecialchars($category); ?>" />
(you can define a helper function with a shorter name like function h($s) { echo htmlspecialchars($s, ENT_QUOTES); } if you want to cut down on the amount of typing you have to do in templates.)
And... that's pretty much it. You don't need to process strings that come out of the database, as they're already raw strings. You don't need to process input strings(*), other than any application-specific field validation you want to do.
*: well, except if magic_quotes_gpc is turned on, in which case you do either need to stripslashes() everything that comes in from get/post/cookie, or, my favoured option, just immediately fail:
if (get_magic_quotes_gpc())
die(
'Magic quotes are turned on. They are utterly bogus and no-one should use them. '.
'Turn them off, you idiot, or I refuse to run. So there!'
);

When you write to db, use htmlentities but when you read back, use html_entity_decode function.
As a sidenote, if you are looking for some security, then for strings use mysql_real_escape_string and for numbers use intval.

I'd like to point out a couple of things:
there is nothing wrong in saving characters like ' and " in a database, SQL injections are just a matter of string manipulation, they actually have nothing to do with SQL or databases -- the problem only relies in how the query string is built. If you want to write your own queries (not recommended) you don't have to encode every apostrophe or double quote: just escape them once to build a safe string, and save them in the database. A better approach is using PDO as mentioned, or using the mysqli extension which allows queries with prepared statements
htmlentities() and similar functions should be used when sending data as output to the browser, not for encoding data to be stored in a database for at least two reasons: first of all it's useless, the DB doesn't care about html entities, it just contains data; secondly you should always treat data coming from the database as potentially insecure, so you should save it in "raw" format and encode it when using it.

The best approach to safe write to a DB is to use the PDO abstraction layer and make use of prepared statements.
http://www.php.net/manual/en/intro.pdo.php
A good tutorial (I learned from this one) is
http://www.phpro.org/tutorials/Introduction-to-PHP-PDO.html
However, you might have to rewrite alot of your site just to implement this. But this is no doubt the most elegant method than having to make use of all those functions. Plus, prepared statements are becoming the de facto now. Another benefit of this is that you do not have to rewrite your queries if you switch to a different database (such as from MySQL to PostgreSQL). But I would say consider this if you plan to scale your site.

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.