Escaping Characters Such as $ and % | MySQL and PHP - php

So basically I’ve been digging deep into the realm of MySQL and PHP…specifically the security measures I should take when dealing with a database and form inputs. So far I’ve found that the following are very strongly recommended:
Prepared Statements
Using mysqli_real_escape_string()
NOT using Magic Quotes as it confuses databases and ends up giving you stuff like “You\’re name isn\’t….”
All of this is great and I’ve been following it. However, I was wondering if one should also escape characters such as the dollar sign [$], the percent sign [%], and possibly others. Couldn’t the query interpret the dollar sign as a PHP variable, perhaps? What about the LIKE syntax, which I’ve heard uses the % symbol and even the _ sign as wildcards? Prepared statements should technically take care of all of this, but I just wanted to be safe and make sure I had everything escaped properly. In case I forget to use prepared statements or just neglect them, I was hoping this second line of defense, per se, could save me a loooong headache.
Here is what I use for escaping currently:
function escape($connection, $data){
    $new_data = trim($data);
    $new_data = mysqli_real_escape_string($connection, $new_data);
    $new_data = addcslashes($new_data, '%_$');
    $new_data = htmlspecialchars($new_data, ENT_NOQUOTES);
    return $new_data;
}
So is this proper? Am I doing something horrendously wrong? Notice that I would have to remove the backslashes before the $, %, and _ characters when retrieving the data from the database.

You don't need to escape the dollar sign. MySQL doesn't treat that character specially, and PHP only recognizes it in source code, not in string values (unless you call eval on the string, but that's a whole other can of worms).
You would only need to escape % and _ if you used user input as the argument to LIKE and you didn't want the user to be able to use wildcards. This could come up if you're processing a search form. You don't need to escape them when storing into the database.
You don't need to use htmlspecialchars when accessing the database. That should only be used when you're displaying data to the user in an HTML page, to prevent XSS.
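As a rough illustration of the LIKE case, here is a minimal sketch. It assumes PDO with an in-memory SQLite database purely so that it runs without a server; the escape_like() helper, the products table, and the sample rows are hypothetical. SQLite needs the explicit ESCAPE clause, whereas MySQL already treats the backslash as the default LIKE escape character.

```php
<?php
// Hypothetical helper: make %, _ and the escape character itself literal.
function escape_like(string $value): string
{
    // The backslash is replaced first so the backslashes added for % and _
    // are not doubled afterwards.
    return str_replace(['\\', '%', '_'], ['\\\\', '\\%', '\\_'], $value);
}

// Assumption: in-memory SQLite through PDO, used only to keep this runnable.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE products (name TEXT)');

$insert = $pdo->prepare('INSERT INTO products (name) VALUES (?)');
foreach (['50% off', '50 percent off', '500 off'] as $name) {
    $insert->execute([$name]); // stored as-is, no wildcard escaping needed
}

// The user searches for the literal text "50%".
$userInput = '50%';
$stmt = $pdo->prepare("SELECT name FROM products WHERE name LIKE ? ESCAPE '\\'");
$stmt->execute(['%' . escape_like($userInput) . '%']);
$matches = $stmt->fetchAll(PDO::FETCH_COLUMN);

print_r($matches);
```

Without the helper, the pattern would be %50%% and all three rows would match, because the user's % would act as a wildcard.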

Am I doing something horrendously wrong?
Yes.
First, on your research.
Prepared statements are the only great thing you have found.
Using mysqli_real_escape_string on top of them (assuming you are using prepared statements) would be useless and harmful, producing exactly the outcome you noted yourself: “You\’re name isn\’t….”
And Magic Quotes were removed from the language long ago (in PHP 5.4), so there is nothing to be concerned about there.
So most of your initial premises are plainly wrong.
Now to your question.
Couldn’t the query interpret the dollar sign as a PHP variable perhaps?
No.
What about LIKE syntax I’ve heard that uses the % symbol or even the wildcard sign?
Yes, you've heard it right. That is the exact purpose of the LIKE operator: to perform a wildcard search. Disabling these symbols for every LIKE would not make the slightest sense.
It means that every time you use the LIKE operator, you have to decide which particular symbols to allow and which to disallow. No one-size-fits-all solution can be used. Not to mention that in all other MySQL interactions the % sign has no special meaning at all.
Prepared statements should technically take care of all of this
Prepared statements have nothing to do with either the $ or the % sign. Prepared statements deal with SQL injection, but neither symbol can cause it (you wouldn't call a proper, intended use of the LIKE operator an "injection", would you?).
Finally, to the most horrendous part.
In the case you forget to use prepared statements or just neglect to do them,
nothing can save you.
And the least help of all would come from the function you developed.
To sum it all up:
Get rid of this function.
Use placeholders* to represent every single variable in the query.
Escape the % and _ symbols in the input data only if it is going to be used in a LIKE operator and you don't want them to be interpreted as wildcards.
Use htmlspecialchars() for output, not for MySQL input.
*Read up on prepared statements if the term is unfamiliar to you.
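The summary can be sketched roughly as follows. This is an illustration under stated assumptions, not the answerer's code: it uses PDO with an in-memory SQLite database so it runs anywhere, and the comments table and sample text are made up. The value goes in through a placeholder untouched, and htmlspecialchars() is applied only when building HTML output.

```php
<?php
// Assumption: in-memory SQLite through PDO, used only to keep this runnable.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE comments (body TEXT)');

// Raw user input containing quotes, $, %, & and markup.
$raw = 'You\'re name isn\'t <b>$5</b> & 100%';

// Input side: a placeholder, no manual escaping of any kind.
$stmt = $pdo->prepare('INSERT INTO comments (body) VALUES (?)');
$stmt->execute([$raw]);

// What comes back is byte-for-byte what went in, so nothing has to be undone.
$stored = $pdo->query('SELECT body FROM comments')->fetchColumn();

// Output side: escape for HTML at the moment of display.
$html = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');
echo $html, PHP_EOL;
```

Because the stored value is unmodified, there are no backslashes to strip when the data is read again.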

It depends on what kind of data it is and what it is used for.
If you find PHP's default prepared statements too long and complex to remember, I suggest having a look at some of the classes available on GitHub to get an idea of simplified queries.
A good example:
https://github.com/joshcam/PHP-MySQLi-Database-Class
An example of an insert query with this class:
$data = Array (
    'login' => 'admin',
    'active' => true,
    'firstName' => 'John',
    'lastName' => 'Doe',
    'password' => $db->func('SHA1(?)',Array ("secretpassword+salt")),
    // password = SHA1('secretpassword+salt')
    'createdAt' => $db->now(),
    // createdAt = NOW()
    'expires' => $db->now('+1Y')
    // expires = NOW() + interval 1 year
    // Supported intervals [s]econd, [m]inute, [h]hour, [d]day, [M]onth, [Y]ear
);
$id = $db->insert ('users', $data);
if ($id)
    echo 'user was created. Id=' . $id;
else
    echo 'insert failed: ' . $db->getLastError();

Related

mysqli_real_escape_string using environment

I'm using the mysqli extension in PHP to connect to a database, and I have a simple question: is it better to use mysqli instead of mysql, and why is it necessary to use mysqli_real_escape_string? What exactly does this function do? Thanks ...
I'll put a little example not using SQL. Imagine you have this PHP code:
<?php
echo 'Hello, world!';
Now you want to replace world with O'Hara:
<?php
echo 'Hello, O'Hara!'; // Parse error: syntax error, unexpected T_STRING, expecting ',' or ';'
Yeah, of course, that is not valid PHP. You need to escape the single quote so that it's interpreted as a literal quote rather than as the string delimiter:
<?php
echo 'Hello, O\'Hara!';
You have exactly the same problem when composing SQL queries. If you inject random input into your code, sooner or later it'll break. You need to encode input so it's handled as literal input rather than broken code.
How can you do that? Well, MySQL accepts \' just like PHP (though it's only a coincidence: other database engines use other escape methods). So the dumbest solution is to add backslashes here and there:
SELECT id FROM user WHERE name='O\'Hara';
Of course, it's a lot of work to hard-code all the possible characters that need escaping (and you'll probably forget some of them) so you can use a function that does the job for you: either mysql_real_escape_string() or mysqli_real_escape_string().
The question is: is this good enough? Well, it kind of works, but it leads to annoying code that's difficult to maintain:
$sql = "UPDATE user SET name='" . mysql_real_escape_string($name) . "' WHERE id='" . mysql_real_escape_string($id) . "'";
... and you still need to take care of surrounding the complete value with single quotes... which are not always mandatory (think of numbers)... What a mess. Can't someone invent something better? Good news is: they did! It's called prepared statements:
// Just an example, I invented the syntax
$sql = 'UPDATE user SET name=:name WHERE id=:id';
$params = array(
'name' => "O'Brian",
'id' => 31416,
);
$MyDbConnection->execute($sql, $params);
In real life:
MySQLi has the prepare() method to accomplish this. See the PHP manual for some examples.
The legacy MySQL extension... has nothing: it does not support prepared statements at all! If you use this extension, you are stuck with the annoying add-quotes-yourself and string concatenation methods.
I hope this explains the whole question.
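For comparison with the invented syntax above, a real-world version might look like the sketch below. It assumes PDO (which supports named placeholders) and an in-memory SQLite database so that it runs without a server; the table layout and the initial row are made up for the illustration.

```php
<?php
// Assumption: in-memory SQLite through PDO, used only to keep this runnable.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec("INSERT INTO user (id, name) VALUES (31416, 'placeholder')");

// The SQL text contains only named placeholders, never the values themselves.
$stmt = $pdo->prepare('UPDATE user SET name = :name WHERE id = :id');

// No quoting and no escaping: the apostrophe is just part of the data.
$stmt->execute(['name' => "O'Brian", 'id' => 31416]);
$updated = $stmt->rowCount();

$name = $pdo->query('SELECT name FROM user WHERE id = 31416')->fetchColumn();
echo $name, PHP_EOL;
```

The same idea works with mysqli, except that mysqli only offers positional ? placeholders.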
The old mysql extension is slightly faster than mysqli, but that has no effect in 99% of web development. The real advantage is that mysqli is more focused on classes and methods.
mysqli_real_escape_string is a precautionary function that escapes any illegal or malicious characters in a string you are going to use in a MySQL query. There is also a standard mysql_real_escape_string function as well. If in doubt, it is better to use it than not, but beware that too many calls may cause speed issues in your scripts and queries.
To cut it short: if you're writing procedural PHP use the standard mysql extension, but if you're writing object-oriented code then use mysqli and maximise its potential. You must always make your queries safe; mysql_real_escape_string is just one way.
Hope this helps!

SQL Injection, Quotes and PHP

I'm quite confused now and would like to know if you could clear things up for me.
After the latest Anon/LulzSec attacks, I was questioning my PHP/MySQL security.
So I thought: how could I protect both PHP and MySQL?
Question: Could anyone explain to me what the best practice is for handling quotes in PHP and MySQL?
Especially in forms, I would need some kind of htmlspecialchars in order to protect the HTML, correct?
Can PHP be exploited at all through a form? Is any kind of protection needed?
Should I use real_escape_string just before a query? Would it be wrong/bad to use it earlier, within PHP (see the sanitize_post function)?
Currently I'm using the following function. The function "sanitizes" all $_POST and $_GET variables. Is this "safe"?
function sanitize_post($array) {
    global $db;
    if(is_array($array)) {
        foreach($array as $key=>$value) {
            if(is_array($array[$key])) {
                $array[$key] = sanitize_post($array[$key]);
            } elseif(is_string($array[$key])) {
                $array[$key] = $db->real_escape_string(strtr(stripslashes(trim($array[$key])), array("'" => '', '"' => '')));
            }
        }
    } elseif(is_string($array)) {
        $array = $db->real_escape_string(strtr(stripslashes(trim($array)), array("'" => '', '"' => '')));
    }
    return $array;
}
I'm using PHP 5.3.5 with Mysql 5.1.54.
Thanks.
mysql_real_escape_string deserves your attention.
However direct queries are a quagmire and no longer considered safe practice. You should read up on PDO prepared statements and binding parameters which has a side benefit of quoting, escaping, etc. built-in.
BEST practice is always to use prepared statements. This makes SQL injection impossible. This is done with either PDO or mysqli. Forget about all the mysql_* functions. They are old and obsolete.
Question: Could anyone explain me, what's best practice to handle PHP and Mysql when it comes to quotes?
That's easy: Use prepared statements, e.g. with PDO::prepare or mysqli_prepare.
There is no such thing as "universal sanitization". Let's just call it quoting, because that's what it's all about.
When quoting, you always quote text for some particular output, like:
1) a string value for a MySQL query
2) a LIKE expression for a MySQL query
3) HTML code
4) JSON
5) a MySQL regular expression
6) a PHP regular expression
For each case, you need different quoting, because each usage sits within a different syntax context. This also implies that the quoting shouldn't be done at the input into PHP, but at the particular output! That is the reason why features like magic_quotes_gpc are broken (always ensure it is switched off!!!).
So, what methods would one use for quoting in these particular cases? (Feel free to correct me, there might be more modern methods, but these are working for me.)
1) mysql_real_escape_string($str)
2) mysql_real_escape_string(addcslashes($str, "%_"))
3) htmlspecialchars($str)
4) json_encode() - only for utf8! I use my function for iso-8859-2
5) mysql_real_escape_string(addcslashes($str, '^.[]$()|*+?{}')) - you cannot use preg_quote in this case because backslash would be escaped two times!
6) preg_quote()
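To make the point about per-context quoting concrete, here is a small sketch that passes one and the same value through three of the functions named above. The sample string and the variable names are illustrative assumptions; the database cases are left out because they need a live connection.

```php
<?php
// One raw value, three different target contexts.
$value = 'Tom & "Jerry" <3 (50%)';

// HTML context: &, quotes and angle brackets become character references.
$forHtml = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');

// JSON context: the value is wrapped in quotes and inner quotes get a backslash.
$forJson = json_encode($value);

// PHP regular expression context: regex metacharacters get a backslash.
$forRegex = preg_quote($value, '/');

echo $forHtml, PHP_EOL, $forJson, PHP_EOL, $forRegex, PHP_EOL;
```

Each result differs from the others, which is why quoting once at input time cannot be right for every later use.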
Don't waste the effort using mysql_real_escape_string() or anything like that. Just use prepared statements with PDO and SQL injection is impossible.
I usually use the PHP functions stripslashes and strip_tags on the variables as they come in via $_POST (or $_GET, depending on what you use) and mysql_real_escape_string during the query. (I'm not sure if this is "right" but it's worked for me so far.) You can also use PHP's built-in validate filters to check things like email addresses, URLs, data types, etc. PDO is supposedly decent at preventing SQL injection but I haven't had any experience with it yet.
The basic workflow should be
$data = $_POST['somefield which will go into the database'];
... do data validation ...
if (everything ok) {
$escaped_data = escape_function($data);
$sql = " ... query here with $escaped_data ... ";
do_query($sql);
}
Basically, data that's been escaped for database insertion should ONLY be used for database insertion. There's no point in pre-processing everything and overwriting all data with db-escaped values, when only 2 or 3 of 50(say) values actually go anywhere near the db.
Ditto for htmlspecialchars. Don't send data through htmlspecialchars unless it's headed for an HTML-type display.
Don't store data in the DB formatted for one particular purpose, because if you ever need the data in a different form for some other purpose, you have to undo the escaping. Always store raw/unformatted data in the db. And note: the escaping done with mysql_real_escape_string() and company does not actually get stored in the db. It's there only to make sure the data gets into the database SAFELY. What's actually stored in the db is the raw unescaped/unquoted data. Once it's in the database, it's "safe".
Think of the escaping functions as handcuffs on a prisoner being transferred between two jails. While the prisoner is inside either jail, the cuffs are not needed.

URL and mod_rewrite: use many special chars and keep data safe from attacks

I'm working on a site where content pages are handled with mod_rewrite, and I'm trying to protect the URLs managed by mod_rewrite from SQL injection with some character restrictions, because users can create content pages like this:
http://site.com/content-type/Page-created-by-user
My doubts come when they insert something like:
http://site.com/architect/Giovanni+Dall'Agata
I need to allow the ' character because I can have names like this one, for example those of famous architects, but I don't know whether I can keep the data safe and how to prevent SQL injection with this character.
Should I do something particular to prevent attacks?
I'm using PDO class in PHP like this:
$architect = strip_tags (trim ($_REQUEST["architect"]));
// pdo class etc..
$pdo_stmt->bindParam (":arch", $architect, PDO::PARAM_STR);
// and the other code here...
Users can't create pages with these chars: < > / \ * ? = Should I ban ' and " too?
Or should I permit only one of the ' and " chars, or can I allow both and still keep the server safe?
$stmt->bindParam (and bindValue, and prepared statements in general) are safe against SQL injection. All serious DB frameworks support a way of adding parameters to a query, and values added that way cannot alter the query. You should always do that and never manually insert variable data coming from users into an SQL query string.
That still leaves the question of XSS injections, which are easier to miss (though also less dangerous); to avoid them, make sure you always use htmlspecialchars($var,ENT_QUOTES) (or urlencode, depending on the context).
PDO automatically escapes characters like ' so you should be OK; just make sure you have register_globals and magic_quotes turned off and always use bindParam for your queries.
Also, if you're talking about creating dynamic URLs, you shouldn't have the ' character in them anyway. I always use:
$str = preg_replace("([^0-9a-zA-Z\-])", "", $str);
This removes anything that's not 0-9, a-z, A-Z or a dash from the string.
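As a rough sketch of the two options discussed here, the snippet below builds a link to the architect page once by percent-encoding the name and once by stripping it down with the whitelist pattern. The /architect/ path and the variable names are assumptions made for the illustration; the point is only that rawurlencode() keeps the apostrophe recoverable, while the whitelist throws it away.

```php
<?php
$name = "Giovanni Dall'Agata";

// Option 1: percent-encode the name for use as a URL path segment.
$href = '/architect/' . rawurlencode($name);

// When the link is printed into HTML, both the URL and the label are escaped.
$link = '<a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '">'
      . htmlspecialchars($name, ENT_QUOTES, 'UTF-8') . '</a>';

// Option 2: the whitelist approach, which removes the space and the apostrophe.
$slug = preg_replace('/[^0-9a-zA-Z\-]/', '', $name);

echo $href, PHP_EOL, $link, PHP_EOL, $slug, PHP_EOL;
```

With option 1 the original name can be restored with rawurldecode() and then passed to the query through bindParam, so the apostrophe never has to be banned.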

Are these two functions overkill for sanitization?

function sanitizeString($var)
{
$var = stripslashes($var);
$var = htmlentities($var);
$var = strip_tags($var);
return $var;
}
function sanitizeMySQL($var)
{
$var = mysql_real_escape_string($var);
$var = sanitizeString($var);
return $var;
}
I got these two functions from a book, and the author says that by using them I can be extra safe against XSS (the first function) and SQL injection (the second function).
Are all those necessary?
Also for sanitizing, I use prepared statements to prevent sql injections.
I would use it like this:
$variable = sanitizeString($_POST['user_input']);
$variable = sanitizeMySQL($_POST['user_input']);
EDIT: Get rid of strip_tags for the 1st function because it doesn't do anything.
Would using these two functions be enough to prevent the majority of attacks and be okay for a public site?
To be honest, I think the author of these functions has either no idea what XSS and SQL injection are, or no idea what exactly the functions used do.
Just to name two oddities:
Using stripslashes after mysql_real_escape_string removes the slashes that were added by mysql_real_escape_string.
htmlentities replaces the characters < and > that strip_tags relies on to identify tags.
Furthermore: in general, functions that protect against XSS are not suitable for protecting against SQL injection, and vice versa, because each language and context has its own special characters that need to be taken care of.
My advice is to learn why and how code injection is possible and how to protect against it. Learn the languages you are working with, especially the special characters and how to escape these.
Edit: Here’s a (probably weird) example. Imagine you allow your users to input some value that should be used as a path segment in a URI that you use in some JavaScript code in an onclick attribute value. So the language contexts look like this:
HTML attribute value
JavaScript string
URI path segment
And to make it more fun: You are storing this input value in a database.
Now, to store this input value correctly in your database, you just need to use the proper encoding for the context you are inserting the value into, i.e. the database language (SQL); the rest does not matter (yet). Since you want to insert it into a SQL string literal, the contextual special characters are the ones that allow you to leave that context. For string literals these are (especially) the ", ', and \ characters, which need to be escaped. But as already stated, prepared statements do all that work for you, so use them.
Now that you have the value in your database, we want to output them properly. Here we proceed from the innermost to the outermost context and apply the proper encoding in each context:
For the URI path segment context we need to escape (at least) all those characters that let us change that context; in this case / (leave current path segment), ?, and # (both leave URI path context). We can use rawurlencode for this.
For the JavaScript string context we need to take care of ", ', and \. We can use json_encode for this (if available).
For the HTML attribute value we need to take care of &, ", ', and <. We can use htmlspecialchars for this.
Now everything together:
'… onclick="'.htmlspecialchars('window.open("http://example.com/'.json_encode(rawurlencode($row['user-input'])).'")').'" …'
Now if $row['user-input'] is "bar/baz" the output is:
… onclick="window.open(&quot;http://example.com/&quot;%22bar%2Fbaz%22&quot;&quot;)" …
But using all these functions in these contexts is not overkill, because although the contexts may have similar special characters, they have different escape sequences. URIs have so-called percent-encoding, JavaScript has escape sequences like \", and HTML has character references like &quot;. Leaving out just one of these functions would allow the input to break out of its context.
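One possible variant of this layering is sketched below, under the assumption that the complete URL should end up as a single JavaScript string literal. It differs from the one-liner above in that json_encode() is applied to the whole URL rather than to the path segment alone; the variable names and the surrounding link markup are made up for the illustration.

```php
<?php
$userInput = '"bar/baz"';

// Innermost context: URI path segment (percent-encoding).
$url = 'http://example.com/' . rawurlencode($userInput);

// Middle context: JavaScript string literal. json_encode() adds the quotes
// and, by default, also escapes forward slashes as \/.
$js = 'window.open(' . json_encode($url) . ')';

// Outermost context: HTML attribute value (character references).
$attr = htmlspecialchars($js, ENT_QUOTES, 'UTF-8');

echo '<a href="#" onclick="' . $attr . '">open</a>', PHP_EOL;
```

Reading the result from the outside in, the browser undoes each layer in the reverse order in which it was applied.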
It's true, but this level of escaping may not be appropriate in all cases. What if you want to store HTML in a database?
Best practice dictates that, rather than escaping on receiving values, you should escape them when you display them. This allows you to account for displaying both HTML from the database and non-HTML from the database, and it's really where this sort of code logically belongs, anyway.
Another advantage of sanitizing outgoing HTML is that a new attack vector may be discovered, in which case sanitizing incoming HTML won't do anything for values that are already in the database, while outgoing sanitization will apply retroactively without having to do anything special
Also, note that strip_tags in your first function will likely have no effect if all of the < and > have become &lt; and &gt;.
You are calling htmlentities (which turns every > into &gt;) and then calling strip_tags, which at that point will not accomplish anything more, since there are no tags left.
If you're using prepared statements and SQL placeholders and never interpolating user input directly into your SQL strings, you can skip the SQL sanitization entirely.
When you use placeholders, the structure of the SQL statement (SELECT foo, bar, baz FROM my_table WHERE id = ?) is sent to the database engine separately from the data values which are (eventually) bound to the placeholders. This means that, barring major bugs in the database engine, there is absolutely no way for the data values to be misinterpreted as SQL instructions, so this provides complete protection from SQL injection attacks without requiring you to mangle your data for storage.
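A small sketch of what this means in practice, assuming PDO with an in-memory SQLite database so it runs without a server (the accounts table and its rows are hypothetical): a classic injection string bound to a placeholder is simply compared as a value and matches nothing.

```php
<?php
// Assumption: in-memory SQLite through PDO, used only to keep this runnable.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE accounts (host TEXT, login TEXT)');
$pdo->exec("INSERT INTO accounts VALUES ('localhost', 'root'), ('10.0.0.5', 'app')");

$stmt = $pdo->prepare('SELECT login FROM accounts WHERE host = ?');

// An injection attempt is only ever treated as a host name to compare against.
$stmt->execute(["1' or 1=1 /*"]);
$attackRows = $stmt->fetchAll(PDO::FETCH_COLUMN);

// A legitimate value goes through the very same statement.
$stmt->execute(['localhost']);
$normalRows = $stmt->fetchAll(PDO::FETCH_COLUMN);

var_dump($attackRows, $normalRows);
```

No row has the payload as its host, so the first result is empty, while the second lookup returns the matching login.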
No, this isn't overkill, this is a vulnerability.
This code is completely vulnerable to SQL injection. You call mysql_real_escape_string() and then stripslashes(), so a " becomes \" after mysql_real_escape_string() and then goes back to " after stripslashes(). mysql_real_escape_string() alone is best for stopping SQL injection. Parameterized query libraries like PDO and ADODB use it, and parameterized queries make it very easy to stop SQL injection completely.
Go ahead test your code:
$variable = sanitizeString($_POST['user_input']);
$variable = sanitizeMySQL($_POST['user_input']);
mysql_query("select * from mysql.user where Host='".$variable."'");
What if:
$_POST['user_input'] = "1' or 1=1 /*";
Patched:
mysql_query("select * from mysql.user where Host='".mysql_real_escape_string($variable)."'");
This code is also vulnerable to some types of XSS:
$variable = sanitizeString($_POST['user_input']);
$variable = sanitizeMySQL($_POST['user_input']);
print("<body background='http://localhost/image.php?".$variable."' >");
What if:
$_POST['user_input']="' onload=alert(/xss/)";
Patched:
$variable=htmlspecialchars($variable,ENT_QUOTES);
print("<body background='http://localhost/image.php?".$variable."' >");
htmlspecialchars with ENT_QUOTES encodes single and double quotes. Make sure the variable you are printing is also enclosed in quotes; this makes it impossible to "break out" and execute code.
Well, if you don't want to reinvent the wheel you can use HTMLPurifier. It allows you to decide exactly what you want and what you don't want, and it prevents XSS attacks and the like.
I wonder about the concept of sanitization. You're telling Mysql to do exactly what you want it to do: run a query statement authored in part by the website user. You're already constructing the sentence dynamically using user input - concatenating strings with data supplied by the user. You get what you ask for.
Anyway, here's some more sanitization methods...
1) For numeric values, always manually cast at least somewhere before or while you build the query string: "SELECT field1 FROM tblTest WHERE(id = ".(int) $val.")";
2) For dates, convert the variable to a Unix timestamp first. Then use the MySQL FROM_UNIXTIME() function to convert it back to a date: "SELECT field1 FROM tblTest WHERE(date_field >= FROM_UNIXTIME(".strtotime($val)."))";. This is actually needed sometimes anyway to deal with how MySQL interprets and stores dates differently from the script or OS layers.
3) For short and predictable strings that must follow a certain standard (username, email, phone number, etc), you can a) do prepared statements; or b) regex or other data validation.
4) For strings which wouldn't follow any real standard and which may or may not have pre- or double-escaped and executable code all over the place (text, memos, wiki markup, links, etc), you can a) do prepared statements; or b) store to and convert from binary/blob form - converting each character to binary, hex, or decimal representation before even passing the value to the query string, and converting back when extracting. This way you can focus more on just html validation when you spit the stored value back out.
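Points 1) and 2) can be sketched as follows. The input values are made up, and no query is actually executed; the snippet only shows what the cast and the timestamp conversion do to hostile or malformed input before the query string is built. filter_var() is included as an optional stricter check that the answer itself does not mention.

```php
<?php
// 1) Casting: everything after the leading digits is discarded.
$val = '42; DROP TABLE tblTest';
$id  = (int) $val;
$sql = 'SELECT field1 FROM tblTest WHERE (id = ' . $id . ')';

// Stricter alternative: reject the input instead of silently truncating it.
$validated = filter_var($val, FILTER_VALIDATE_INT); // false for this input

// 2) Dates: strtotime() returns an integer timestamp, or false on failure,
// so only digits can ever reach the query string.
$ts = strtotime('2011-06-30 12:00:00 UTC');
if ($ts === false) {
    throw new InvalidArgumentException('Unparseable date');
}
$dateSql = 'SELECT field1 FROM tblTest WHERE (date_field >= FROM_UNIXTIME(' . $ts . '))';

echo $sql, PHP_EOL, $dateSql, PHP_EOL;
```

Casting makes the query safe but hides bad input, so validating first and casting second is a reasonable combination.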
