PHP - Non-destructive Input Sanitization

I have a textarea on my website whose input I write into my database.
I want to filter this textarea input, but without removing any HTML tags or other content.
In short, I want to sanitize and secure the input before I write it into my database, but I want the entry to be intact and unmodified when I read it back from the database and write it on the website.
How can I achieve this?

If you want to preserve the data character for character when it's written back to the website try:
$stringToSave = mysql_real_escape_string($inputString);
Then when retrieving it from the database:
$stringToPutOnPage = htmlentities($databaseString);
If you want the HTML to actually be rendered as HTML (be careful about XSS), you can just use:
$stringToSave = mysql_real_escape_string($inputString);
Edit: It would seem that best practice is to sanitize the string for html after retrieving it from the database and not before. Thanks for the comments, I will have to change my method.
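A minimal sketch of the approach described in the edit above: the text is stored untouched and HTML-escaped only at output time. The helper name render_comment() is hypothetical, and htmlspecialchars() is used here in place of htmlentities(); both escape the characters that matter in an HTML context.

```php
<?php
// Hypothetical helper: escape a stored value for an HTML context at output time.
function render_comment(string $storedText): string
{
    // ENT_QUOTES also escapes single quotes; ENT_SUBSTITUTE replaces invalid
    // byte sequences instead of returning an empty string.
    return htmlspecialchars($storedText, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// The database keeps the original text; only the page output is escaped.
$fromDatabase = '<script>alert("xss")</script>';
echo render_comment($fromDatabase), PHP_EOL;
// &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;
```

Because the stored value is never altered, the same row can later be escaped differently for another output format.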

If you mean you simply want to make it safe to store in your database, all you need to do is use the database-specific escaping method, for example mysql_real_escape_string. Of course, that doesn't protect you from XSS attacks, but if you want to retrieve and display it unmodified you don't have a choice.

It's really simple:
To avoid SQL injection, mysql_real_escape_string your values before concatenating them into an SQL query, or use parameterized queries that don't suffer from malformed strings in the first place.
To avoid XSS problems and/or messed up HTML, HTML escape your values before plugging them into an HTML context.
JSON escape them in a JSON context, CSV escape them in a CSV context, and so on.
All are the same problem, really. As a very simple example, to produce the string "test" (I want the quotes to be part of the string), I can't write the string literal $foo = ""test"". I have to escape the quotes within the quotes to make clear which quotes are supposed to end the string and which are part of the string: $foo = "\"test\"".
SQL injection, XSS problems and messed up HTML are all just a variation on this.
To plug a value that contains quotes into a query, you have the same problem as above:
$comment = "\"foo\""; // comment is "foo", including quotes
$query = 'INSERT INTO `db` (`comment`) VALUES ("' . $comment . '")';
// INSERT INTO `db` (`comment`) VALUES (""foo"")
That produces invalid syntax at best, SQL injection attacks at worst. Using mysql_real_escape_string avoids this:
$query = 'INSERT INTO `db` (`comment`) VALUES ("' . mysql_real_escape_string($comment) . '")';
// INSERT INTO `db` (`comment`) VALUES ("\"foo\"")
HTML escaping is exactly the same, just with different syntax issues.
You only need to escape your values in the right context using the right method. To escape values for HTML, use htmlentities. Do that at the time it's necessary. Don't prematurely or over-escape your values, only apply the appropriate escape function in the right context at the right time.
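To illustrate "the right method for the right context", here is a small sketch that takes one value and escapes it three ways. The sample value is made up, and the fputcsv() call assumes PHP 7.4 or later (an empty escape character is passed explicitly).

```php
<?php
$value = '"test" <b>';

// HTML context: quotes and angle brackets become entities.
$html = htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

// JSON context: quotes are backslash-escaped and the value is wrapped in quotes.
$json = json_encode($value);

// CSV context: the field is enclosed in quotes and inner quotes are doubled.
$stream = fopen('php://memory', 'w+');
fputcsv($stream, [$value], ',', '"', '');
rewind($stream);
$csv = rtrim(stream_get_contents($stream), "\n");
fclose($stream);

echo $html, PHP_EOL; // &quot;test&quot; &lt;b&gt;
echo $json, PHP_EOL; // "\"test\" <b>"
echo $csv, PHP_EOL;  // """test"" <b>"
```

The original value is the same in all three cases; only the representation changes to fit the surrounding syntax.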

Related

Would this function work for generally sanitizing db query variables?

I know most people say to just use prepared statements, but I have a site with many existing queries and I need to sanitize the variables with mysqli_real_escape_string().
Also, the PHP manual for mysqli_query() says mysqli_real_escape_string() is an acceptable alternative, so here I am ...
I want to write queries like this:
$query = sprintf("SELECT * FROM users WHERE user_name = %s",
query_var($user_name, "text"));
$Records = mysqli_query($db, $query) or die(mysqli_error($db));
I want to know whether the function below would work. I am unsure whether:
I should still call stripslashes() at the start? An old function I used from Adobe Dreamweaver did this.
Is it OK to add the quotes like $the_value = "'".$the_value."'"; after the mysqli_real_escape_string()?
Does it have any obvious or big flaws?
I noticed that stripslashes() removes multiple \\\\\\ and replaces them with one, so that might not work well for general use, e.g. when a user submits a text comment or an item description that might contain \\\\. Is it generally OK not to use stripslashes() here?
I am mostly worried about SQL injection. It is OK if submitted data includes HTML tags and so on; I deal with that when outputting / printing data.
if (!function_exists('query_var')) {
    function query_var($the_value, $the_type = "text") {
        global $db;
        // do I still need this ?
        // $the_value = stripslashes($the_value);
        $the_value = mysqli_real_escape_string($db, $the_value);
        // do not allow dummy type of variables
        if (!in_array($the_type, array('text', 'int', 'float', 'double'))) {
            $the_type = 'text';
        }
        if ($the_type == 'text') {
            $the_value = "'" . $the_value . "'";
        }
        if ($the_type == 'int') {
            $the_value = intval($the_value);
        }
        if ($the_type == 'float' or $the_type == 'double') {
            $the_value = floatval($the_value);
        }
        return $the_value;
    }
}
A text string constant in MySQL / MariaDB starts and ends with a single quote ' character. If the text itself contains a quote character, we escape it by doubling it. So the name "O'Leary" looks like this in a SQL statement's text.
SET surname = 'O''Leary'
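That's the core rule for quotes. Note that unless the NO_BACKSLASH_ESCAPES SQL mode is enabled, MySQL also treats a backslash inside a string literal as an escape character, so backslashes in user data need escaping as well. mysqli_real_escape_string() handles both cases, and prepared statements avoid the issue entirely.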
Don't overthink this. But use carefully debugged escaping functions. Avoid writing your own, because any tiny bug will allow SQL injection.
Looking at the PHP function documentation, I found some references that made me decide that stripslashes() is not needed in that function.
https://www.php.net/manual/en/security.database.sql-injection.php
Generic functions like addslashes() are useful only in a very specific
environment (e.g. MySQL in a single-byte character set with disabled
NO_BACKSLASH_ESCAPES) so it is better to avoid them.
https://www.php.net/manual/en/function.addslashes.php
The addslashes() is sometimes incorrectly used to try to prevent SQL
Injection. Instead, database-specific escaping functions and/or
prepared statements should be used.
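The data-loss concern about stripslashes() raised in the question can be checked directly. The Dreamweaver-era call was presumably there to undo magic quotes, a feature removed in PHP 5.4. The sample value below is made up for illustration.

```php
<?php
// A user-submitted value containing literal backslashes (a UNC-style path).
$submitted = '\\\\server\\share';   // the string \\server\share

// stripslashes() treats backslashes as escape characters and removes them,
// so the value is silently altered before it ever reaches the database.
$stripped = stripslashes($submitted);

echo $submitted, PHP_EOL; // \\server\share
echo $stripped, PHP_EOL;  // \servershare
```

With magic quotes gone, the input arrives unescaped, so calling stripslashes() only damages legitimate backslashes.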

Not able to insert string which contains '

I wrote a script to insert records into my DB. The only issue is that when I try to store data containing a ' character, the script does not work and nothing is stored in the DB. Examples are John's Birthday, Amy's Home, etc. Is there a solution that allows special characters like ' to be stored in the DB and retrieved without any harm to security?
mysqli_query($con,"INSERT INTO Story (desc)
VALUES ('$mytext')");
PHP's mysqli_real_escape_string is made specifically for this purpose. Your problem is that the quotes are being interpreted by MySQL as part of the query instead of as part of the value. You need to escape characters like this so they won't affect your query; leaving them unescaped is also what makes SQL injection possible.
$mytext = mysqli_real_escape_string($con, $mytext);
// continue with your query
Manual: http://php.net/manual/en/mysqli.real-escape-string.php
Filter the variable part of the query through mysqli_real_escape_string.
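As an alternative to escaping, a bound parameter keeps the apostrophe out of the SQL text altogether. The sketch below is self-contained under two loudly stated assumptions: an in-memory SQLite database (via the pdo_sqlite extension) stands in for the MySQL server, and the column is renamed to description, since DESC is a reserved word in MySQL and would need backticks. The same prepare() and execute() calls apply with a mysql: DSN.

```php
<?php
// Assumption: in-memory SQLite replaces MySQL so the sketch runs on its own.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE Story (description TEXT)');

$mytext = "John's Birthday";

// The placeholder keeps the value out of the SQL text, so the apostrophe
// cannot end the string literal early.
$stmt = $pdo->prepare('INSERT INTO Story (description) VALUES (:description)');
$stmt->execute([':description' => $mytext]);

// Read the row back: the value is stored exactly as it was submitted.
$stored = $pdo->query('SELECT description FROM Story')->fetchColumn();
echo $stored, PHP_EOL; // John's Birthday
```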

Do I need to use addslashes() when backing up database?

I am using the code shown here, it uses addslashes() on the data fetched from the database before saving to file.
$row[$j] = addslashes($row[$j]);
My question is why and do I need to use this? I thought you would do this when saving to the database not the other way round. When I compare the results from the above script with the export from phpMyAdmin, the fields that contain serialized data are different. I would like to know if it would cause any problems when importing back into the database?
Script:
'a:2:{i:0;s:5:\"Hello\";i:1;s:5:\"World\";}'
phpMyAdmin Export:
'a:2:{i:0;s:5:"Hello";i:1;s:5:"World";}'
UPDATE
All data is escaped when inserting into the database.
Change from mysql to mysqli.
SQL file outputs like:
INSERT INTO test (foo, bar) VALUES (1, '\'single quotes\'\r\n\"double quotes\"\r\n\\back slashes\\\r\n/forward slashes/\r\n');
SOLUTION
Used $mysqli->real_escape_string() and not addslashes()
inserting to db
When inserting data to a MySQL database you should be either using prepared statements or the proper escape function like mysql_real_escape_string. addslashes has nothing to do with databases and should not be used. Escaping is used as a general term but actually covers a large number of operations. Here it seems two uses of escaping are being talked about:
Escaping dangerous values that could be inserted in to a database
Escaping string quotes to avoid broken strings
Most database escaping functions do a lot more than just escape quotes. They escape illegal characters as well as invisible characters like \0. This is because, depending on the database you are using, there are lots of ways of breaking an insert, not just by adding a closing quote.
Because someone seems to have missed my comment about PDO, I will mention it again here. It is far better to use PDO or some other database abstraction layer along with prepared statements, because you then no longer have to worry about escaping your values.
outputting / dumping db values
In the backup script mentioned, the original coder is using addslashes as a quick shorthand to make sure the strings written to the MySQL dump are correctly formatted and won't break on re-insert. It has nothing to do with security.
selecting values from a db
Even if you escape your values on insert to the database, you will need to escape the quotes again when writing that data back in to any kind of export file that utilises strings. This is only because you wish to protect your strings so that they are properly formatted.
When inserting escaped data into a database, the escape sequences used are converted back to their original values. For example:
INSERT INTO table SET field = "my \"escaped\" value"
Once in the database the value will actually be:
my "escaped" value
So when you pull it back out of the database you will receive:
my "escaped" value
So when you need to place this in a formatted string/dump, a dump that will be read back in by a parser, you will need to do some kind of escaping to format it correctly:
$value_from_database = 'my "escaped" value';
echo '"' . $value_from_database . '"';
Will produce:
"my "escaped" value"
Which will break any normal string parser, so you need to do something like:
$value_from_database = 'my "escaped" value';
echo '"' . addslashes($value_from_database) . '"';
To produce:
"my \"escaped\" value"
However, if it were me I'd just target the double quote and escape:
$value_from_database = 'my "escaped" value';
echo '"' . str_replace('"', '\\"', $value_from_database) . '"';
I think you are mixing two problems. The first problem is SQL injection, and to prevent it you would have to escape the data going into the database. However, there is now a far better way to do this: prepared statements and bound parameters. Example with PDO:
// setup a connection with the database
$dbConnection = new PDO('mysql:dbname=dbtest;host=127.0.0.1;charset=utf8', 'user', 'pass');
$dbConnection->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
$dbConnection->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// run query
$stmt = $dbConnection->prepare('SELECT * FROM employees WHERE name = :name');
$stmt->execute(array(':name' => $name));
// get data
foreach ($stmt as $row) {
// do something with $row
}
The other thing you have to worry about is XSS attacks, which allow an attacker to inject code into your website. To prevent this you should always use htmlspecialchars() when displaying data that may come from a source you cannot trust:
echo htmlspecialchars($dataFromUnsafeSource, ENT_QUOTES, 'UTF-8');
All data is escaped when inserting into the database.
When using prepared statements and bound parameters this isn't needed anymore.
Should I use addslashes() then use str_replace() to change \" to "?
addslashes() sounds like a crappy way to prevent anything. So not needed AFAICT.
Another note about accessing the database and in the case you are still using the old mysql_* function:
They are no longer maintained: the mysql_* extension was deprecated in PHP 5.5 and removed in PHP 7.0. Instead you should learn about prepared statements and use either PDO or MySQLi. If you can't decide, this article will help to choose. If you care to learn, here is a good PDO tutorial.
You should store data without modifying them.
You should perform the needed escaping when outputting the data or putting them "inside" other data, like inside a database query.
Just use mysql_real_escape_string() instead of addslashes() and ereg_replace() as written in David Walsh's blog.
Just try it, it'll be better. :)

Querying NON-escaped strings in MySQL

The table has company names which are not escaped.
My query looks like
$sql = "SELECT id FROM contact_supplier WHERE name = '$addy' LIMIT 1";
The problem comes in where the company name values in the table are sometimes things like "Acme Int'l S/L".
(FYI: values of the $addy match the DB)
Clearly, the values were not escaped when stored.
How do I find my matches?
[EDIT]
Ahah!
I think we're on to something.
The source of the $addy value is a file
$addresses = file('files/addresses.csv');
I then do a
foreach ($addresses as $addy) {}
Well, when I escape the $addy string, it's escaping the new line chars and including "\r\n" to the end of the comparison string.
Unless someone suggests a more graceful way, I guess I'll prob strip those with a str_replace().
:)
[\EDIT]
Why do you think the data already stored in the table should be escaped?
You should escape data only right before it is written directly into a text-based language, e.g. as a part of an SQL query, or into an HTML page, or in a JavaScript code block.
When the query is executed, nothing is stored in escaped form. MySQL interprets the escape sequences and inserts the original value. We escape because otherwise the insert would fail with a syntax error, and for security against things like SQL injection.
So your query with escaped values will be working fine with the data in your database.
If the values were not escaped when stored then they would have caused SQL errors when you tried to enter them.
The problem is that the data is not being escaped when you make the query.
Quick hack: Use mysql_real_escape_string
Proper solution: Don't build SQL by mashing together strings. Use prepared statements and parameterized queries
Another option would be to change your query to this...
$sql = "SELECT id FROM contact_supplier WHERE name = \"$addy\" LIMIT 1";
Use mysql_real_escape_string:
$addy = mysql_real_escape_string($addy);
Or try using parameterized queries (PDO).
Regarding this statement:
Clearly, the values were not escaped when stored.
This is incorrect logic. If the values weren't escaped in the original INSERT statement, the statement would have failed. Without escaping you'd get an error along the lines of syntax error near "l S/L' LIMIT 1". The fact that the data is correctly stored in the database proves that whoever inserted it managed to do it correctly (either by escaping or by using parameterized queries).
If you are doing things correctly then the data should not be stored in the database in escaped form.
The issue turned out to be new-line characters
The source of the $addy value starts out like this
$addresses = file('files/addresses.csv');
It then goes through
foreach ($addresses as $addy) {}
When I escape the $addy string, it escapes the newline characters and puts "\r\n" on the end of the comparison string.
As soon as I dropped those characters with str_replace() after escaping, everything went swimmingly.
Thanks-a-BUNCH for the help
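A more direct way to handle the trailing newline is to trim each line as it is read, before any escaping takes place. This is a sketch under assumptions: the temporary file and its contents are made up to stand in for files/addresses.csv.

```php
<?php
// Hypothetical sample file standing in for files/addresses.csv.
$path = tempnam(sys_get_temp_dir(), 'addr');
file_put_contents($path, "Acme Int'l S/L\r\nExample Co\r\n");

$names = [];
foreach (file($path) as $line) {
    // file() keeps the line ending on each element; strip it before comparing.
    $name = rtrim($line, "\r\n");
    if ($name !== '') {
        $names[] = $name;
    }
}
unlink($path);

var_dump($names);
```

Each trimmed $name can then be passed to the escaping function or bound as a query parameter, so there is no line ending left to clean up afterwards.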

Can I do SQL injection to this code?

I'm still learning about SQL injection, and I have always learned best from examples, so here is part of my code:
$sql = "INSERT INTO `comments` (`id`, `idpost`, `comment`, `datetime`, `author`, `active`)
VALUES (NULL, '" . addslashes($_POST['idcomment']) . "', '" .
addslashes($_POST['comment']) . "', NOW(), '" .
addslashes($_POST['name']) . "', '1');";
mysql_query($sql);
Knowing that all the POST vars are entered by the user, can you show me how I could perform an injection against this script, so I can understand more about this vulnerability? Thanks!
my database server is MySQL.
Don't use addslashes(), always use mysql_real_escape_string(). There are known edge cases where addslashes() is not enough.
If starting something new from scratch, best use a database wrapper that supports prepared statements like PDO or mysqli.
Most of the other answers seem to have missed the point of this question entirely.
That said, based on your example above (and despite your code not following the best practice use of mysql_real_escape_string()) it is beyond my ability to inject anything truly detrimental when you make use of addslashes().
However, if you were to omit it, a user could enter a string into the name field that looks something like:
some name'; DROP TABLE comments; --
The goal is to end the current statement, and then execute your own. -- is a comment and is used to make sure nothing that would normally come after the injected string is processed.
However (again), mysql_query() sends only one statement per call and does not support multiple stacked queries. So even if I did get so far as to try to drop a table, that second statement would fail.
But this isn't the only type of SQL injection, I would suggest reading up some more on the topic. My research turned up this document from dev.mysql.com which is pretty good: http://dev.mysql.com/tech-resources/articles/guide-to-php-security-ch3.pdf
Edit, another thought:
Depending on what happens to the data once it goes to the database, I may not want to inject any SQL at all. I may want to inject some HTML/JavaScript that gets run when you post the data back out to a webpage in a Cross-Site Scripting (XSS) attack. Which is also something to be aware of.
As said before, for strings, use mysql_real_escape_string() instead of addslashes() but for integers, use intval().
/* little code cleanup */
$idcomment = intval($_POST['idcomment']);
$comment = mysql_real_escape_string($_POST['comment']);
$name = mysql_real_escape_string($_POST['name']);
$sql = "INSERT INTO comments (idpost, comment, datetime, author, active)
VALUES ($idcomment, '$comment', NOW(), '$name', 1)";
mysql_query($sql);
addslashes() only escapes single quotes, double quotes, backslashes and NUL bytes.
But there are some more important cases here:
Be careful on whether you use double or single quotes when creating the string to be escaped:
$test = 'This is one line\r\nand this is another\r\nand this line has\ta tab';
echo $test;
echo "\r\n\r\n";
echo addslashes($test);
$test = "This is one line\r\nand this is another\r\nand this line has\ta tab";
echo $test;
echo "\r\n\r\n";
echo addslashes($test);
Another one:
In particular, MySQL wants \n, \r and \x1a escaped which addslashes does NOT do. Therefore relying on addslashes is not a good idea at all and may make your code vulnerable to security risks.
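The claim about untouched control characters is easy to confirm. A short check, using a made-up input string:

```php
<?php
$input = "line one\r\nline two\x1a";

// addslashes() escapes only ', ", \ and the NUL byte.
var_dump(addslashes($input) === $input); // bool(true): \r, \n and \x1a are untouched
var_dump(addslashes("O'Leary"));         // string(8) "O\'Leary"
```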
And one more:
Be very careful when using addslashes and stripslashes in combination with regular expression that will be stored in a MySQL database. Especially when the regular expression contain escape characters!
To store a regular expression with escape characters in a MySQL database you use addslashes. For example:
$l_reg_exp = addslashes('[\x00-\x1F]');
After this the variable $l_reg_exp will contain: [\\x00-\\x1F].
When you store this regular expression in a MySQL database, the regular expression in the database becomes [\x00-\x1F].
When you retrieve the regular expression from the MySQL database and apply the PHP function stripslashes(), the single backslashes will be gone!
The regular expression will become [x00-x1F] and your regular expression might not work!
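The round trip described above can be reproduced without a database. In this sketch the first stripslashes() call is only a stand-in (an assumption) for MySQL parsing the escaped string literal; the second one is the extra call made after retrieval.

```php
<?php
$pattern = '[\x00-\x1F]';          // single quotes keep the backslashes literal

$escaped = addslashes($pattern);   // the form placed into the SQL text
$stored  = stripslashes($escaped); // stand-in for what the database ends up holding
$broken  = stripslashes($stored);  // an extra stripslashes() after retrieval

echo $escaped, PHP_EOL; // [\\x00-\\x1F]
echo $stored, PHP_EOL;  // [\x00-\x1F]
echo $broken, PHP_EOL;  // [x00-x1F]
```

The stored value already equals the original pattern, so the extra stripslashes() call is what destroys it.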
Remember, that the magic may happen in:
addslashes which may miss something
before adding to database
after retrieving from database
Your example is just an excerpt. The real problem might not be visible here yet.
(Based on comments from php.net, which are very often more valuable than the manual itself.)
