Do I need to escape/filter data that is coming from the database? Even if said data has already been "escaped" once (at the point in time where it was inserted into the database).
For example, say I allow users to submit blog posts via a form that has a title input and a textarea input.
A malicious user submits the blog post
title: Attackposttitle');DROP TABLE posts;--
textarea: Hahaha nuked ur site noobzors!
Now as this is being inserted into my database, I am going to escape it with mysql_real_escape_string, but once it is in the database I will later reference this data in my php blog application with something like this:
$sql="SELECT posttitle FROM posts WHERE id=50";
$posttitlearray = mysql_fetch_array(mysql_query($sql));
This is where my concern is, what if I, for example, run the following query to get the post content:
$sql="SELECT postcontent FROM posts WHERE posttitle=$posttitlearray[posttitle]";
In theory am I not sql injecting myself? IE, am I not effectively running the query:
$sql="SELECT postcontent FROM posts WHERE posttitle=Attackposttitle');DROP TABLE posts;--";
Or does the "Attackposttitle');DROP TABLE posts;--" data continue to be escaped once it is in the database?
Do I need to continually escape it like so:
$sql="SELECT postcontent FROM posts WHERE posttitle=mysql_real_escape_string($posttitlearray[posttitle])";
Or is the data safe once it has been escaped initially upon first being inserted into the database?
Thanks Stack!
It does not continue to be escaped once it's put in the database. You'll have to escape it again.
$sql="SELECT postcontent FROM posts WHERE posttitle='".mysql_real_escape_string($posttitlearray['posttitle'])."'";
The value should be escaped every time, just before it is inserted into an SQL query. Not for magical security reasons, but simply to make sure that the syntax of the resulting query is valid.
Escaping a string sounds magical to many people, something like a shield against some mysterious danger, but in fact it is nothing magical. It is just the way to let special characters pass through the query parser as data.
The best would be just to have a look what escaping really does. Say the input string is:
Attackposttitle');DROP TABLE posts;--
after escaping:
Attackposttitle\');DROP TABLE posts;--
In fact, it escaped only the single quote. That's the only thing you need to ensure: that when you insert the string into the query, the syntax will be OK!
insert into posts set title = 'Attackposttitle\');DROP TABLE posts;--'
It's nothing magical like danger shield or something, it is just to ensure that the resultant query has the right syntax! (of course if it doesn't, it can be exploited)
The query parser then looks at the \' sequence and knows that the quote is still part of the value, not the end of it. It will remove the backslash, and the following will be stored in the database:
Attackposttitle');DROP TABLE posts;--
which is exactly the same value the user entered, and exactly what you wanted to have in the database!
So this means that if you fetch that string from the database and want to use it in a query again, you need to escape it again to be sure that the resulting query has the right syntax.
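The round trip described above can be sketched as follows. This is only an illustration under stated assumptions: PDO with an in-memory SQLite database stands in for the mysql_* functions used in the question (so it runs without a server), and the helper name insert_and_reuse() and the table layout are made up for the example. SQLite quotes by doubling the quote rather than adding a backslash, but the principle is the same: the stored value is the original text, so it has to be quoted again for the next query.

```php
<?php
// Assumption: PDO + in-memory SQLite stand in for the question's MySQL setup.
function insert_and_reuse(string $title): array
{
    $pdo = new PDO('sqlite::memory:');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $pdo->exec('CREATE TABLE posts (id INTEGER PRIMARY KEY, posttitle TEXT, postcontent TEXT)');

    // First query: quote the user input for this one statement.
    $pdo->exec(
        'INSERT INTO posts (posttitle, postcontent) VALUES ('
        . $pdo->quote($title) . ', ' . $pdo->quote('Hahaha') . ')'
    );

    // The database stores the original characters, not the escaped form.
    $stored = $pdo->query('SELECT posttitle FROM posts WHERE id = 1')->fetchColumn();

    // Second query: the fetched value is plain data again, so quote it again.
    $content = $pdo->query(
        'SELECT postcontent FROM posts WHERE posttitle = ' . $pdo->quote($stored)
    )->fetchColumn();

    return ['stored' => $stored, 'content' => $content];
}

$result = insert_and_reuse("Attackposttitle');DROP TABLE posts;--");
// Compares the stored value with what the user originally entered.
var_dump($result['stored'] === "Attackposttitle');DROP TABLE posts;--");
```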
But in your example, a very important thing to mention is the magic_quotes_gpc directive!
This feature (deprecated in PHP 5.3 and removed in PHP 5.4) escapes all the user input automatically (gpc stands for _GET, _POST and _COOKIE). It is an evil feature made for people not aware of SQL injection. It is evil for two reasons. The first reason is that you then have to distinguish between your first and second query: in the first you don't escape and in the second you do. What most people do is either switch the "feature" off (I prefer this solution) or unescape the user input first and then escape it again when needed. The unescape code could look like:
function stripslashes_deep($value)
{
    return is_array($value) ?
        array_map('stripslashes_deep', $value) :
        stripslashes($value);
}
if (get_magic_quotes_gpc()) {
    $_POST = stripslashes_deep($_POST);
    $_GET = stripslashes_deep($_GET);
    $_COOKIE = stripslashes_deep($_COOKIE);
}
The second reason why this is evil is because there is nothing like "universal quoting".
When quoting, you always quote text for some particular output, like:
1. string value for mysql query
2. like expression for mysql query
3. html code
4. json
5. mysql regular expression
6. php regular expression
For each case you need different quoting, because each usage sits in a different syntax context. This also implies that the quoting shouldn't be done at the input into PHP, but at the particular output! That is why features like magic_quotes_gpc are broken (never forget to handle it, or better, make sure it is switched off!).
So, what methods would one use for quoting in these particular cases? (Feel free to correct me, there might be more modern methods, but these are working for me)
1. mysql_real_escape_string($str)
2. mysql_real_escape_string(addcslashes($str, "%_"))
3. htmlspecialchars($str)
4. json_encode() - only for utf8! I use my function for iso-8859-2
5. mysql_real_escape_string(addcslashes($str, '^.[]$()|*+?{}')) - you cannot use preg_quote in this case because backslash would be escaped two times!
6. preg_quote()
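To make the "one quoting per output context" idea concrete, here is a small sketch that needs no database. The helper names for_html(), for_like() and for_regex() are made up for illustration; they only wrap the standard PHP functions named in the list. The LIKE helper handles the wildcards only, so the result would still need SQL string escaping (or a placeholder) before going into a query.

```php
<?php
// Each helper escapes for exactly one output context (names are illustrative).
function for_html(string $s): string
{
    return htmlspecialchars($s, ENT_QUOTES, 'UTF-8');
}

function for_like(string $s): string
{
    // Neutralises the LIKE wildcards only; SQL string escaping is still needed afterwards.
    return addcslashes($s, '%_');
}

function for_regex(string $s): string
{
    // The second argument also escapes the delimiter used in the pattern.
    return preg_quote($s, '/');
}

$input = '50%_off <b>now</b>';
echo for_html($input), "\n";      // safe to print inside an HTML page
echo for_like($input), "\n";      // wildcards are now literal characters
echo json_encode($input), "\n";   // safe to embed as a JSON string
echo preg_match('/^' . for_regex('a.b') . '$/', 'axb'), "\n"; // the dot no longer matches "x"
```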
Try using bind variables, which remove the need to escape your data completely.
http://php.net/manual/en/function.mssql-bind.php
The only downside is that you're restricted to using them with stored procedures in SQL Server; with other databases you can use them for everything.
I'm trying to update an old program that uses regular MySql queries, runs all the inputs through addslashes() prior to inserting them, and runs the retrieved data through stripslashes() before returning it. I'm trying to update the program to run with Mysqli and use prepared statements, but I'm not sure how to make the transition with the existing data in the database.
My first thought was to write a function to test if slashes have been added, but the general consensus in the answers to similar questions I found was that this isn't doable.
I'm trying to avoid doing a mass update to the database, because it's an open source program and this seems likely to potentially cause problems for existing users when they try to upgrade.
If I continue to use addslashes() prior to inserting using MySqli and prepared statements would this work? What would be the reasons for not doing it this way and instead going the full database upgrade route?
Edit:
Based on a comment that appears to be deleted now, I went and looked at the database directly. All I could find was an ' that had been converted to an HTML entity, and no slashes. When I pulled out the data it came out fine without running stripslashes(). So do the added slashes just tell mysql to escape the data when it's inserted? Will all the data come out fine if I remove the stripslashes()? If so what's the point of stripslashes()?
You have several questions:
If I continue to use addslashes() prior to inserting using mysqli_ and prepared statements would this work?
No, it would not: the added backslashes would then be stored in your database, which is not what you want. In prepared statements the provided arguments are taken literally as you pass them, as in that mechanism there is no more need to escape quotes.
What would be the reasons for not doing it this way and instead going the full database upgrade route?
See above.
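The point that a prepared statement takes its arguments literally can be checked with a short sketch. It assumes PDO with an in-memory SQLite database instead of mysqli (so it runs without a server), and store_with_placeholder() is a made-up helper; the behaviour shown, that the placeholder stores exactly the bytes it is given, is the same idea the answer describes.

```php
<?php
// Assumption: in-memory SQLite via PDO; store_with_placeholder() is a made-up helper.
function store_with_placeholder(string $text): string
{
    $pdo = new PDO('sqlite::memory:');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $pdo->exec('CREATE TABLE mytable (txt TEXT)');

    // The value travels separately from the SQL text, so no escaping is involved.
    $stmt = $pdo->prepare('INSERT INTO mytable (txt) VALUES (?)');
    $stmt->execute([$text]);

    return $pdo->query('SELECT txt FROM mytable')->fetchColumn();
}

$raw = "My friend's wedding";
// Without addslashes(): the stored value equals the input.
var_dump(store_with_placeholder($raw) === $raw);
// With addslashes(): the extra backslash is stored as data.
var_dump(store_with_placeholder(addslashes($raw)));
```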
So do the added slashes just tell MySql to escape the data when it's inserted?
The added slashes are interpreted when you embed a string literal in your SQL, as that necessarily is wrapped in quotes. For example:
$text = "My friend's wedding";
$sql = "INSERT INTO mytable VALUES ('$text')";
$result = mysqli_query($con, $sql);
The mysqli_query will fail, as the SQL statement that is passed to it looks like this:
INSERT INTO mytable VALUES ('My friend's wedding')
This is not valid SQL, as the middle quote ends the string literal, and so the s that follows is not considered part of the string any more. As the SQL engine does not understand the s (nor wedding and the dangling quote after it), it produces a syntax error. You can even see something is wrong in the way the syntax is highlighted in the above line.
So that is where addslashes() is (somewhat) useful:
$text = "My friend's wedding";
$text = addslashes($text);
$sql = "INSERT INTO mytable VALUES ('$text')";
$result = mysqli_query($con, $sql);
This will work because now the SQL statement sent to MySql looks like this:
INSERT INTO mytable VALUES ('My friend\'s wedding')
The backslash tells MySql that the quote that follows it has to be taken as a literal one, and not as the end of the string. MySql will not store the backslash itself, because it is only there to escape the quote (see list of escape sequences in the documentation).
So in your table you will get this content inserted:
My friend's wedding
Then:
Will all the data come out fine if I remove the stripslashes()?
Yes. In fact, it should never have been in your code in the first place, as it will negatively affect what you get when there truly are (intended!) backslashes in your data. You are maybe lucky that a backslash is a rare character in normal text. Apparently you have not found any backslash in your database data, so it probably did not badly affect you until now. It is an error to think that the addslashes() call during the saving of data had to be countered with a stripslashes() call while reading the data, because, again, MySql had already removed them while writing the data into the database.
If so what's the point of stripslashes()?
There is no point in the context of what you are doing. It only is useful if you have received somehow a string in which there are characters that have a backslash before them, and have the meaning to escape the next character when the string is used in some context. But since you only briefly have such strings (after calling addslashes()) and MySql interprets these escapes by removing the backslashes in the written data, you never again see that string with the additional backslashes.
Situations where you would need stripslashes() are very rare and very specific. For your scenario you don't need it.
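A quick sketch of the damage an unneeded stripslashes() does on read, using a made-up value that legitimately contains backslashes:

```php
<?php
// A value that legitimately contains backslashes, as it would come back from the database.
$fromDatabase = 'C:\\reports\\2024';   // the string C:\reports\2024

// An unneeded stripslashes() on read silently removes the intended backslashes.
$damaged = stripslashes($fromDatabase);

echo $fromDatabase, "\n";
echo $damaged, "\n";
```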
I know I've already asked a question about sanitizing and escaping, but I have a question which didn't get answered.
Okay, here it goes. If I have a PHP-script and I GET the users input and SELECT it from a mySQL database, would it matter/be any security risk, if I didn't escape < and > through the use of either htmlspecialchars, htmlentities or strip_tags and therefore allowed for HTML tags to be selected/searched from the database? Because the input is already being sanitized through the use of trim(), mysql_real_escape_string and addcslashes (\%_).
The problem with using htmlspecialchars is that it escapes the ampersand (&), which the user input is supposed to allow (I guess the same goes for htmlentities?). With the use of strip_tags, something like "<b>John</b>" results in the PHP-script selecting and displaying results for John, which it isn't supposed to do.
Here is my PHP-code for sanitizing the input, before selecting from the database:
if(isset($_GET['query'])) {
    if(strlen(trim($_GET['query'])) >= 3) {
        $search = mysql_real_escape_string(addcslashes(trim($_GET['search']), '\%_'));
        $sql = "SELECT name, age, address WHERE name LIKE '%".$search."%'";
        [...]
    }
}
And here is my output for displaying "x matched y results.":
echo htmlspecialchars(strip_tags($_GET['search']), ENT_QUOTES, 'UTF-8')." matched y results.";
A good way to go about this is to use MySQLi. It supports prepared statements, which send the data separately from the SQL text, so nothing needs to be escaped by hand, and they offer strong protection against SQL injection. Not escaping GET data is just as dangerous as not escaping any other input.
There are two different concerns here that you've identified.
User Data in SQL Statements
Whenever you're constructing a query, you need to be absolutely certain that no arbitrary user data will end up in it. These mistakes are called SQL injection bugs and are the result of failing to correctly escape your data. As a general rule, you should never, ever use string concatenation to compose a query. Whenever possible, use placeholders to ensure that your data is correctly escaped.
User Data in HTML Document
When you're rendering a page that contains user-submitted content, you need to escape it so that the user cannot introduce arbitrary HTML tags or scripting elements. This avoids XSS issues and means that characters like & and < do not get interpreted incorrectly. User data of "x < y" wouldn't end up breaking your page.
You'll always need to escape for whatever context you're rendering user data into. There are others, like inside a script tag or in a URL, but these are the two most common ones.
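Both concerns can be sketched together for the search case from the question. The assumptions here are PDO with an in-memory SQLite database (rather than the mysql_* functions), a made-up table called people, and a made-up helper find_people(). SQLite has no default LIKE escape character, hence the explicit ESCAPE clause; MySQL uses the backslash by default.

```php
<?php
// Assumptions: in-memory SQLite via PDO; table "people" and find_people() are made up.
function find_people(PDO $pdo, string $search): array
{
    // LIKE context: escape the wildcards so the input is matched literally...
    $pattern = '%' . addcslashes($search, '%_\\') . '%';
    // ...SQL context: pass the pattern through a placeholder instead of concatenating it.
    $stmt = $pdo->prepare("SELECT name FROM people WHERE name LIKE ? ESCAPE '\\'");
    $stmt->execute([$pattern]);
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE people (name TEXT)');
$pdo->exec("INSERT INTO people (name) VALUES ('Tom & Jerry'), ('<b>John</b>'), ('100% Anna')");

$search = '<b>';
$rows = find_people($pdo, $search);

// HTML context: encode both the echoed search term and every result.
echo htmlspecialchars($search, ENT_QUOTES, 'UTF-8'), ' matched ', count($rows), " results.\n";
foreach ($rows as $name) {
    echo htmlspecialchars($name, ENT_QUOTES, 'UTF-8'), "\n";
}
```

Searching for HTML tags or ampersands is harmless in this setup: the placeholder keeps them out of the SQL syntax, and htmlspecialchars() keeps them out of the page markup.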
I have some pages that are stored in databases. For security purposes, all the pages are escaped before being saved into the DB, but then when I print a page, the HTML tags are still escaped. Like this
Link
Obviously, that doesn't work very well, so how do I unescape the pages?
I've tried with html_entity_decode without any success.
While data should be escaped before inserting it into the database, it shouldn't still be escaped when you take it out. The root cause of your problem is that it is being escaped twice between collection and examining it after it comes out of the database.
You should track down why it is being escaped twice and fix that.
That may leave the existing data broken though (it depends on if the data is being escaped twice on the way in or if it is being escaped on the way out of the database with magic_quotes_runtime). If so, you will need to clean it up. That form of escaping has nothing to do with HTML and can be reversed with stripslashes.
The clean up will look something like:
SELECT * from database_table
Create a prepared UPDATE statement to update a row
foreach row stripslashes on the data that was double escaped, pass the data to the prepared statement
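The clean-up steps above can be sketched as follows. Everything specific here is an assumption for illustration: PDO with an in-memory SQLite database, a hypothetical table pages with columns id and body, and the helper name clean_double_escaped(). It also assumes every backslash in the column comes from the double escaping; on real data you would want a backup and a one-time run.

```php
<?php
// Assumptions: in-memory SQLite via PDO; table "pages" with columns id/body is hypothetical.
function clean_double_escaped(PDO $pdo): int
{
    // 1. Read every row.
    $rows = $pdo->query('SELECT id, body FROM pages')->fetchAll(PDO::FETCH_ASSOC);
    // 2. Prepare the UPDATE once.
    $update = $pdo->prepare('UPDATE pages SET body = ? WHERE id = ?');
    // 3. Strip one level of slashes per row and write back only what changed.
    $changed = 0;
    foreach ($rows as $row) {
        $fixed = stripslashes($row['body']);
        if ($fixed !== $row['body']) {
            $update->execute([$fixed, $row['id']]);
            $changed++;
        }
    }
    return $changed;
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE pages (id INTEGER PRIMARY KEY, body TEXT)');
$insert = $pdo->prepare('INSERT INTO pages (body) VALUES (?)');
$insert->execute(['<a href=\\"/home\\">Link</a>']); // stored with literal backslashes
$insert->execute(['plain text']);

echo clean_double_escaped($pdo), " row(s) repaired\n";
```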
Use stripslashes(): http://uk3.php.net/manual/en/function.stripslashes.php
Use stripslashes($str) when retrieving the content to remove slashes that were added while inserting the content into the database.
thanks
mysql database input strings should always be escaped using mysql_real_escape_string(). when they come out, they only need stripslashes() if they were escaped twice on the way in (for example by magic quotes); correctly escaped data comes back without any added slashes.
for numbers like id's, those should be converted to integers using intval() and then range checked: for instance, AUTO_INCREMENT columns like id's by default start with 1. so for a validation check on anything you get from $_GET[] or $_POST[], check that your intval()'ed number is >= 1.
filter all your integers through intval().
filter all your real numbers through doubleval(), unless you are working with monetary values and you have your own decimal number class - floating point can mangle money.
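A minimal sketch of the id check described above. It swaps in filter_var() with FILTER_VALIDATE_INT instead of a plain integer cast, because the cast would quietly turn "50; DROP TABLE posts" into 50, while the filter rejects it. The helper name parse_id() is made up.

```php
<?php
// parse_id() is a made-up helper: returns a positive integer, or null if the input is not one.
function parse_id($value): ?int
{
    // min_range => 1 performs the "AUTO_INCREMENT ids start at 1" range check.
    $id = filter_var($value, FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]);
    return $id === false ? null : $id;
}

var_dump(parse_id('50'));
var_dump(parse_id('0'));
var_dump(parse_id('50; DROP TABLE posts'));
```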
Well, on my php pages I am escaping in this manner:
$title = "Jack's Long 'Shoes'";
$title = mysql_real_escape_string($title);
$go = mysql_query("INSERT INTO titles (title) VALUES ('$title')");
Then when I view this data via phpmyadmin, the data appears as it were before it was escaped, ie Jack's Long 'Shoes'
I was under the impression that it would look like:
Jack\'s Long \'Shoes\'
Are the slashes supposed to actually be printed inside the mysql database field?
No. The escapes vanish once they pass into the data tables. That's the whole point of escaping data - it's like stuffing a letter into an envelope. The letter stays in the envelope (escaped) during its journey through the postal system. Once it gets to its destination (the database storage medium), it's removed from the envelope and stored in its original form.
If the escaping (envelope) was stored along with the letter, you'd have to UNESCAPE (open the envelope) it each time you pulled the letter out of the database.
For databases, the escaping serves to "hide" SQL metacharacters from the query parser. Once the data has passed through the parser and has been written into the DB, the escapes are no longer necessary. The db's own internal handlers know what is data and what is sql commands, so the artificial divisions created by the escapes are not needed at that point.
Certain characters must exist so that the statement can be understood.
For example, in PHP, when you have, $var = "The other day, someone said, \"Hello!\""; you don't expect the \ to exist in the output string. Escaping in SQL is the same concept. The escape characters are there to mark special characters as literal, not to actually show up.
I usually escape user input by doing the following:
htmlspecialchars($str,ENT_QUOTES,"UTF-8");
as well as mysql_real_escape_string($str) whenever a mysql connection is available.
How can this be improved? I have not had any problems with this so far, but I am unsure about it.
Thank you.
Data should be escaped (sanitized) for storage and encoded for display. Data should never be encoded for storage. You want to store only the raw data. Note that escaping does not alter raw data at all as escape characters are not stored; they are only used to properly signal the difference between raw data and command syntax.
In short, you want to do the following:
$data = $_POST['raw data'];
//Shorthand used; you all know what a query looks like.
mysql_query("INSERT " . mysql_real_escape_string($data));
$show = mysql_query("SELECT ...");
echo htmlentities($show);
// Note that htmlentities() is usually overzealous.
// htmlspecialchars() is enough the majority of the time.
// You also don't have to use ENT_QUOTES unless you are using single
// quotes to delimit input (or someone please correct me on this).
You may also need to strip slashes from user input if magic quotes is enabled. stripslashes() is enough.
As for why you should not encode for storage, take the following example:
Say that you have a DB field that is char(5). The html input is also maxlength="5". If a user enters "&&&&&", which may be perfectly valid, and you encode it before storing, it becomes "&amp;&amp;&amp;&amp;&amp;" and is stored truncated as "&amp;". When it's retrieved and displayed back to the user, if you do not encode, they will see "&", which is incorrect. If you do encode, they see "&amp;", which is also incorrect. You are not storing the data that the user intended to store. You need to store the raw data.
This also becomes an issue in a case where a user wants to store special characters. How do you handle the storage of these? You don't. Store it raw.
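The char(5) example can be worked through without a database. In this sketch substr() merely simulates a 5-character column silently truncating its input, which is an assumption about how the column behaves; the variable names are made up.

```php
<?php
$input = '&&&&&';                        // what the user typed (5 characters)

// Wrong order: encode first, then store in a 5-character column.
$encoded   = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
$truncated = substr($encoded, 0, 5);     // simulates the char(5) truncation

// Right order: store the raw value, encode only when printing.
$storedRaw = substr($input, 0, 5);       // fits, nothing is lost
$display   = htmlspecialchars($storedRaw, ENT_QUOTES, 'UTF-8');

echo strlen($encoded), "\n";   // length of the encoded form, far more than 5
echo $truncated, "\n";         // all that survives of the encoded form
echo $display, "\n";           // what gets sent to the browser in the raw-storage case
```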
To defend against sql injection, at the very least escape input with mysql_real_escape_string, but it is recommended to use prepared statements with a DB wrapper like PDO. Figure out which one works best, or write your own (and test it thoroughly).
To defend against XSS (cross-site-scripting), encode user input before it is displayed back to them.
If you only use mysql_real_escape_string($str) to avoid sql injection, make sure you always add single quotes around it in your query.
htmlspecialchars is fine when printing unsafe output to the screen.
For the database switch to PDO.
It's much easier and does the escaping for you.
http://php.net/pdo