Casting variables to integers in SQL queries in PHP

First of all, I am fully aware of SQL injection vulnerabilities and I am using PDO for newer applications that I am developing in PHP.
Long story short, the organization that I'm working for cannot afford to delegate any human resources at the moment to switch everything over to PDO for the rather large application that I'm currently working on, so I'm stuck with using mysql_* functions in the meantime.
Anyways, I am wondering if it is safe to use data validation functions to "sanitize" numeric parameters used in the interpolated queries. We do use mysql_real_escape_string() for strings (and yes I am aware of the limitations there too). Here is an example:
    public function foo($id) {
        $sql = "SELECT * FROM items WHERE item_id = $id";
        $this->query($sql); // calls mysql_query and does things with the result
    }
$id is a user-supplied value via HTTP GET, so obviously this code is vulnerable. Would it be OK if I did this?
    public function foo($id) {
        if (!ctype_digit($id)) {
            throw new \InvalidArgumentException("ID must be numeric");
        }
        $sql = "SELECT * FROM items WHERE item_id = $id";
        $this->query($sql); // calls mysql_query and does things with the result
    }
As far as I'm aware, ctype_digit is the same as matching the whole string against the regular expression ^\d+$.
(There's also filter_var($id, FILTER_VALIDATE_INT), but that can potentially return int(0) which evaluates to FALSE under loosely-typed comparisons, so I'd have to do === FALSE there.)
Are there any problems with this temporary solution?
Update:
Variables do not only include primary keys, but any field with type boolean, tinyint, int, bigint, etc., which means that zero is a perfectly acceptable value to be searching for.
We are using PHP 5.3.2

Yes, if you indeed religiously use the correct function to validate the data and correctly prevent the query from running if the data is not as expected, there's no vulnerability I can see. ctype_digit has a very limited and clear purpose:
Returns TRUE if every character in the string text is a decimal digit, FALSE otherwise.
There's basically nothing that can go wrong with this function, so it's safe to use. It will even return false on an empty string (since PHP 5.1). Note that is_numeric would not be so trustworthy. I would possibly still add a range check to make sure the number is within an expected range, I'm not sure what could happen with overflowing integers. If you additionally cast to (int) after this check, there's no chance of injection.
Caveat emptor: as with all non-native parameterised queries, there's still a chance of injection if you're getting into any shenanigans with connection charsets. The range of bytes that may slip through are severely limited by ctype_digit, but you never know what one could come up with.
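A minimal sketch of the check described above (digits only, then a range check, then a cast), written so that the range check never depends on integer overflow behaviour. The helper name requireUnsignedInt and the default upper bound are assumptions for illustration, not part of the original code.

```php
<?php
// Hypothetical helper (name and default limit are illustrative assumptions).
function requireUnsignedInt($value, $max = 2147483647)
{
    // ctype_digit() is only a reliable digit test for strings, so reject
    // everything else up front (HTTP parameters arrive as strings anyway).
    if (!is_string($value) || !ctype_digit($value)) {
        throw new InvalidArgumentException('Value must contain decimal digits only');
    }

    // Range check on the digit string itself, so that huge inputs are
    // rejected before any integer conversion takes place.
    $digits = ltrim($value, '0');
    $limit  = (string) $max;
    if (strlen($digits) > strlen($limit)
        || (strlen($digits) === strlen($limit) && strcmp($digits, $limit) > 0)
    ) {
        throw new OutOfRangeException('Value is larger than ' . $limit);
    }

    // The string is now known to be a short enough run of digits.
    return (int) $value;
}

$id  = requireUnsignedInt('42');
$sql = "SELECT * FROM items WHERE item_id = $id";
```

Because the function returns an int, nothing but digits can end up in the interpolated query, and zero remains a perfectly acceptable value.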

Yes, it will work. Your code will raise an exception if the value isn't a numeric string, you'll just have to catch that and display an error message to the user.
Beware that ctype_digit($foo):
Will return true if $foo is empty before PHP 5.1 (see the doc);
Will interpret an int $foo between -128 and 255 as the ASCII code of a single character, so within that range it returns true only for values in the [48, 57] interval (the ASCII digits).
So you'll also need to check that $foo is a non-empty string if you plan on using ctype_digit($foo).

Long story short, the organization that I'm working for cannot afford to delegate any human resources at the moment to switch everything over to PDO
I don't see where the problem is. According to the code you posted, you are already using some sort of DB wrapper and are already planning to alter the calling code for every numeric parameter. Why not alter that DB wrapper to support prepared statements, and alter the calling code to use it?
The old mysql ext is not a problem: one can emulate prepared statements with it just fine.
I am fully aware of SQL injection vulnerabilities.
Your "full awareness" is a bit exaggerated. Unfortunately, most people do not understand the real source of injection, or the real purpose of the prepared statement.
Separating data from the query is a nice trick, but a totally unnecessary one. The real value of a prepared statement is its inevitability, as opposed to the essential arbitrariness of manual formatting.
Another fault of yours is the split treatment of strings: they are formatted partly in the query (adding quotes) and partly outside it (escaping special characters), which again is a call for disaster.
As you have decided to stick to manual formatting, enjoy your injections, sooner or later. Your ideas are good for an artificial, fully controlled sandbox example. However, things turn out differently in a real-life application with many people working on it. Instead of asking a program to format your data, you are asking people to do it, with all the obvious consequences.
It makes me wonder why PHP users are unable to learn from their mistakes and are still eagerly devising practices that were proved unreliable a long time ago.
Just spotted another fallacy in your reasoning:
a user-supplied value via HTTP GET so obviously this code is vulnerable.
You have to understand that any unformatted value makes this code vulnerable, no matter whether it comes from HTTP GET, FTP PUT or a file read. It is not only the notorious "user input" that has to be properly formatted, but any input. This is why it is essential to make the DB driver the only place where formatting occurs. It should be a program that formats the data, not a developer. Your idea contradicts this cornerstone principle.
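To illustrate what "emulating prepared statements in the wrapper" could look like, here is a rough sketch with typed placeholders (?i for integers, ?s for strings). The function name, the placeholder syntax and the $escape parameter are all assumptions made up for this example; it also naively treats any ?i or ?s inside the SQL text as a placeholder.

```php
<?php
// Hypothetical sketch: all value formatting happens in this one function.
// $escape is whatever escaping function belongs to the driver in use.
function formatQuery($sql, array $args, $escape)
{
    $parts = preg_split('/(\?[is])/', $sql, -1, PREG_SPLIT_DELIM_CAPTURE);
    $out = '';
    foreach ($parts as $part) {
        if ($part !== '?i' && $part !== '?s') {
            $out .= $part;           // literal SQL text, copied as is
            continue;
        }
        if (count($args) === 0) {
            throw new InvalidArgumentException('Not enough arguments');
        }
        $value = array_shift($args);
        if ($part === '?i') {
            // Accept ints and strings that look exactly like an integer.
            if (!is_int($value)
                && !(is_string($value) && preg_match('/^-?[0-9]+\z/', $value))
            ) {
                throw new InvalidArgumentException('Integer expected');
            }
            $out .= $value;
        } else {
            // Quoting and escaping always happen together, in one place.
            $out .= "'" . call_user_func($escape, (string) $value) . "'";
        }
    }
    if (count($args) !== 0) {
        throw new InvalidArgumentException('Too many arguments');
    }
    return $out;
}

// addslashes() is only a stand-in so the sketch runs without a database;
// real code would pass the driver's own escaping function instead.
$demo = formatQuery(
    'SELECT * FROM items WHERE item_id = ?i AND name = ?s',
    array('42', "O'Reilly"),
    'addslashes'
);
echo $demo, PHP_EOL;
```

The calling code then never touches quotes or escaping itself, which is the point this answer is making.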

Use mysql_real_escape_string and wrap your $id in single quotes. The quotes, together with the escaping, keep the value inside the string literal, which prevents SQL injection.
For example, SELECT * FROM table WHERE id = 'escaped string' can't be hacked into something like SELECT * FROM table WHERE id = 1; DROP table; because '1; DROP table;' will be treated as the input argument for WHERE.

ctype_digit() will return false for many integer values of $id. If you want to use the function, cast the value to a string first:
    public function foo($id) {
        $id = (string)$id;
        if (!ctype_digit($id)) {
            throw new \InvalidArgumentException("ID must be numeric");
        }
        $sql = "SELECT * FROM items WHERE item_id = $id";
        $this->query($sql); // calls mysql_query and does things with the result
    }
This is because an integer between -128 and 255 is interpreted as an ASCII value.

I use intval() for simple cases although (int) apparently eats less resources. Example:
$sql =
"SELECT * FROM categories WHERE category_id = " .
intval($_POST['id']) .
" LIMIT 1";

Related

Am I safe from SQL injection if I know for sure that a certain value I am using in a dynamic statement is an integer?

Before I make my query, I check whether the variable that is to be used in that query is an integer using this code: filter_var($_POST["user_id"], FILTER_VALIDATE_INT) !== false.
My question is: should I use PDO or do any escaping if the above check passes only when my value is an integer (meaning that the value I am about to use to build my query is safe)? Is there any need to use prepared statements if my value has already passed the above test?
I have not done any testing with the above, nor am I really experienced in server-side technologies, so it is up to you PHP/security experts to guide me.

It's still a good idea to use prepared statements. Bind functions at this point are tried and true.
1) What if you or someone else screws up the filter?
2) Are you going to remember to use the right filter at every point in your code? This is a very easy thing to mismanage, and sometimes you may not be able to plan for every eventuality. Integers are relatively easy, but strings are far more complex.
3) In regards to your professional reputation, will other people see this code? If you had open source code (like GitHub or something), and I was a hiring manager looking into your history, I would not hire you for breaking such a standard security practice like this.
Admittedly, point 3 is a little off topic, but I feel that it's worth mentioning.

This answer is an explanation of the shorthand type casting, from comments, as it's easier to read it as an answer than as a set of comments.
Your code:
filter_var($_POST["user_id"], FILTER_VALIDATE_INT) !== false.
This is a long-winded way of ensuring that POSTed data is an integer. It has issues, because POST data always arrives as a string.
    $_POST["user_id"] = (int)$_POST["user_id"];
This is much easier to read and shorter to type, and it forces the data to be of the integer type. It will completely solve the security risk of putting non-integer data into an integer placement in your SQL.
This utilises PHP Type Juggling, which is well worth reading up on.
While the above code will solve your security aspect, it will raise other issues, because any string can be cast to an integer, but the cast will return 0 if the string doesn't start with an integer value.
Example:
$string = "hello";
print (int)$string; // outputs 0;
$string = "27hello";
print (int)$string; // outputs 27;
$string = true;
print (int)$string; // outputs 1;
$string = "";
print (int)$string; // outputs 0;
So, overall I would suggest the following line to ensure your given POST value is a correct integer:
if (strcmp((int)$_POST['value'], $_POST['value']) == 0){
/// it's ok!
}
Please see this answer for further details as well as this PHP manual page.
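A small sketch of the same round-trip idea, using a strict string comparison instead of strcmp so that no implicit conversions are involved. The helper name isCanonicalIntString is an assumption for illustration only.

```php
<?php
// Hypothetical helper: accepts only strings that survive a
// string -> int -> string round trip unchanged.
function isCanonicalIntString($value)
{
    return is_string($value) && (string) (int) $value === $value;
}

var_dump(isCanonicalIntString('27'));       // bool(true)
var_dump(isCanonicalIntString('27hello'));  // bool(false), the cast gives 27
var_dump(isCanonicalIntString('hello'));    // bool(false), the cast gives 0
var_dump(isCanonicalIntString(''));         // bool(false), the cast gives 0
var_dump(isCanonicalIntString('007'));      // bool(false), leading zeros are lost
```

Note that this is stricter than a plain cast: values such as "007" or " 5" are rejected rather than silently normalised, which may or may not be what you want.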

You may want to check the PHP bugs page before using FILTER_VALIDATE_INT.

Is it necessary to validate $_GET id

I'm sure someone asked this before but I just can't find a post similar.
How necessary is it to validate an ID field from a $_GET variable?
I'm using is_numeric() to make sure I'm getting a number at least, but am I just adding unnecessary code?
ex.
www.test.com/user.php?user_id=5
if (isset($_GET['user_id']) && is_numeric($_GET['user_id'])) {
*PDO query for user information*
}
Is the is_numeric() necessary?
Is there a possibility of an attack by changing user_id in the address?

The best way to sanitize a numeric id is by using an (int) cast.
$id = (int) $_GET['ID'];
with strings you just never know.
Is the is_int() necessary?
You are probably looking to retrieve data by id. Therefore, converting the string to an int is the simplest way to go. On a side note, is_int will always return false if applied to a string.
Is there a possibility of an attack by changing user_id in the address?
Well, strings are always dirty. You never know what strange characters a user might input and how they will affect the query. For example, I don't know if it applies in this case, but you should take a look at NULL byte attacks.

If you want to properly validate an integer before performing the query, you should use filter_input(); the outcome is either a valid integer, false if it's not a valid integer or null if the parameter wasn't passed at all.
if (is_int($userId = filter_input(INPUT_GET, 'user_id', FILTER_VALIDATE_INT))) {
*PDO query for user information*
}
If you're using prepared statements this won't really matter so much, but if you wish to return a failure response based on whether the input conforms to what's expected, you can use the above.
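Since filter_input() only reads real request data, here is a sketch of the same three outcomes (int, false, null) using filter_var() on a plain array, so it can be tried from the command line. The helper name readIntParam and the min_range of 0 are assumptions for illustration.

```php
<?php
// Hypothetical helper mirroring the three outcomes described above, but
// reading from an array so it can be tried without a web request.
function readIntParam(array $params, $name, $min = 0)
{
    if (!array_key_exists($name, $params)) {
        return null;                       // parameter not passed at all
    }
    // Returns the int on success and false on failure.
    return filter_var(
        $params[$name],
        FILTER_VALIDATE_INT,
        array('options' => array('min_range' => $min))
    );
}

$userId = readIntParam(array('user_id' => '0'), 'user_id');
if ($userId === null || $userId === false) {
    echo 'Bad request', PHP_EOL;
} else {
    echo 'Looking up user ', $userId, PHP_EOL;   // 0 is a valid id here
}
```

The strict comparisons matter: a valid 0 must not be confused with false or null.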

If you don't want to use prepared statements, PDO::quote should be the correct function:
Returns a quoted string that is theoretically safe to pass into an SQL statement.

is_int will not work, because GET variables are always passed as strings.
Personally, I like to test for a valid integer with:
if(strval(intval($_GET['user_id'])) === $_GET['user_id'])
However, this can be overkill. After all, if you're using prepared statements then there's no need to handle any escaping, and searching for a row that doesn't exist will just return no results. I'd throw in intval($_GET['user_id']), but only to really make it clear to future coders that the ID is a number.

is_int checks the type of the variable, but $_GET['id'] will always be a string. Better to use filter_var.
But you must use a prepared statement anyway.
P.S. With prepared statements you can skip validation: the DB will simply report that nothing was found. But if you want to warn the user about a bad request, you must validate the value before querying.

Is multiplying by 1 a safe way to clean numeric values against SQL injections?

I am wondering, If I have a value I know should be numeric, is multiplying it by 1 a safe method to clean it?
    function x($p1){
        $p1 *= 1;
        $sql = "select * from t where id = {$p1}";
        //run query..
    }
Although my example uses an ID, this is being used for many types of numeric values I have in my app (can be money, can be pai etc).

I don't see why it wouldn't be. But what's wrong with using prepared statements? That's always going to be safer than using PHP variables directly in SQL statements.

You can use is_numeric()

I'm sure there is a more "appropriate" way, but for the scope of your question, I would say yes. If some sort of string is passed PHP will interpret it as a zero when doing the mathematical operation.

You can also use is_int()

While that'll probably work, intval seems like a better solution. http://php.net/manual/en/function.intval.php. Your intent will likely be more obvious to someone else reading your code.
If you want to check if a value is numeric before converting it to an int, use is_numeric ( http://php.net/manual/en/function.is-numeric.php ). It'll check for strings that are numeric as well as integers. For example, if a number was coming back from a text input form via AJAX, it might be a string. In that case, is_int would return false, but is_numeric would return true.
EDIT
Now that I know you use DECIMAL for the MySQL column type, you can do something like this:
    function getItem($pValue)
    {
        if (!is_numeric($pValue))
        {
            return false;
        }
        $Query = sprintf
        (
            'SELECT * FROM %s WHERE %s = %.2f',
            'TableName',
            'Price',
            $pValue
        );
        // Do something with $Query
    }

It works most of the time, as it will cast strings to integers or doubles, but you have to be careful. It's going to work correctly for scalar values. However, if you do this:
    x(new stdClass);
You'll get an E_NOTICE. This is not so bad, right? But do this:
    x(array());
And you'll get an E_ERROR, "Unsupported operand types", and the script terminates.
Maybe you'd think that it isn't so bad, but a fatal error at an inopportune moment can leave your system in an unstable state, for example by losing referential integrity or leaving a series of queries unfinished.
Only you know whether a case like the above can happen. But if this data comes from a user in any way, I'd go with Murphy's Law on this one and not trust it.
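One way to avoid the fatal error described above is to reject non-numeric input before doing any arithmetic. This is only a sketch; the helper name toNumber is an assumption, and is_numeric() still accepts forms such as "1e4", so it may be more permissive than your column type.

```php
<?php
// Hypothetical helper: reject anything that is not a plain numeric scalar
// before doing arithmetic on it, so arrays and objects never reach "*= 1".
function toNumber($value)
{
    if ((!is_int($value) && !is_float($value) && !is_string($value))
        || !is_numeric($value)
    ) {
        throw new InvalidArgumentException('Numeric value expected');
    }
    return $value + 0;   // int for integer-like input, float otherwise
}

var_dump(toNumber('7'));      // int(7)
var_dump(toNumber('12.50'));  // float(12.5)
```

The caller can then catch the exception and report a bad request instead of letting the script die halfway through a series of queries.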

Are these two functions overkill for sanitization?

    function sanitizeString($var)
    {
        $var = stripslashes($var);
        $var = htmlentities($var);
        $var = strip_tags($var);
        return $var;
    }
    function sanitizeMySQL($var)
    {
        $var = mysql_real_escape_string($var);
        $var = sanitizeString($var);
        return $var;
    }
I got these two functions from a book and the author says that by using these two, I can be extra safe against XSS(the first function) and sql injections(2nd func).
Are all those necessary?
Also for sanitizing, I use prepared statements to prevent sql injections.
I would use it like this:
$variable = sanitizeString($_POST['user_input']);
$variable = sanitizeMySQL($_POST['user_input']);
EDIT: Get rid of strip_tags for the 1st function because it doesn't do anything.
Would using these two functions be enough to prevent the majority of attacks and be okay for a public site?

To be honest, I think the author of these functions has no idea either what XSS and SQL injections are or what exactly the functions used do.
Just to name two oddities:
Using stripslashes after mysql_real_escape_string removes the slashes that were added by mysql_real_escape_string.
htmlentities replaces the characters < and > that strip_tags relies on in order to identify tags.
Furthermore: in general, functions that protect against XSS are not suitable for protecting against SQL injections, and vice versa, because each language and context has its own special characters that need to be taken care of.
My advice is to learn why and how code injection is possible and how to protect against it. Learn the languages you are working with, especially the special characters and how to escape these.
Edit: Here's some (probably weird) example: Imagine you allow your users to input some value that should be used as a path segment in a URI that you use in some JavaScript code in an onclick attribute value. So the language context looks like this:
HTML attribute value
JavaScript string
URI path segment
And to make it more fun: You are storing this input value in a database.
Now to store this input value correctly into your database, you just need to use a proper encoding for the context you are about to insert that value into your database language (i.e. SQL); the rest does not matter (yet). Since you want to insert it into a SQL string declaration, the contextual special characters are the characters that allow you to change that context. As for string declarations these characters are (especially) the ", ', and \ characters that need to be escaped. But as already stated, prepared statements do all that work for you, so use them.
Now that you have the value in your database, we want to output them properly. Here we proceed from the innermost to the outermost context and apply the proper encoding in each context:
For the URI path segment context we need to escape (at least) all those characters that let us change that context; in this case / (leave current path segment), ?, and # (both leave URI path context). We can use rawurlencode for this.
For the JavaScript string context we need to take care of ", ', and \. We can use json_encode for this (if available).
For the HTML attribute value we need to take care of &, ", ', and <. We can use htmlspecialchars for this.
Now everything together:
'… onclick="'.htmlspecialchars('window.open("http://example.com/" + '.json_encode(rawurlencode($row['user-input'])).')').'" …'
Now if $row['user-input'] is "bar/baz" the output is:
… onclick="window.open(&quot;http://example.com/&quot; + &quot;%22bar%2Fbaz%22&quot;)" …
But using all these functions in these contexts is not overkill, because although the contexts may have similar special characters, they have different escape sequences. URI has the so-called percent-encoding, JavaScript has escape sequences like \" and HTML has character references like &quot;. Omitting even one of these functions makes it possible to break out of the context.
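The same innermost-to-outermost layering can be wrapped in a small function. This is a sketch under stated assumptions: the name buildOnclick is made up, the whole URL is passed through json_encode as a single JavaScript string, and ENT_QUOTES is given explicitly so single quotes are encoded as well.

```php
<?php
// Hypothetical helper showing one encoding step per context.
function buildOnclick($segment)
{
    // 1. URI path segment context: percent-encode the user value.
    $url = 'http://example.com/' . rawurlencode($segment);
    // 2. JavaScript string context: json_encode() adds quotes and escapes.
    $js = 'window.open(' . json_encode($url) . ')';
    // 3. HTML attribute context: encode &, <, > and both kinds of quote.
    return 'onclick="' . htmlspecialchars($js, ENT_QUOTES, 'UTF-8') . '"';
}

echo buildOnclick('bar/baz'), PHP_EOL;
```

Each step only knows about its own context, which is why none of them can be dropped or swapped for another.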

It's true, but this level of escaping may not be appropriate in all cases. What if you want to store HTML in a database?
Best practice dictates that, rather than escaping on receiving values, you should escape them when you display them. This allows you to account for displaying both HTML from the database and non-HTML from the database, and it's really where this sort of code logically belongs, anyway.
Another advantage of sanitizing outgoing HTML is that a new attack vector may be discovered, in which case sanitizing incoming HTML won't do anything for values that are already in the database, while outgoing sanitization will apply retroactively without having to do anything special
Also, note that strip_tags in your first function will likely have no effect, if all of the < and > have become &lt; and &gt;.

You are doing htmlentities (which turns all > into &gt;) and then calling strip_tags, which at this point will not accomplish anything more, since there are no tags.

If you're using prepared statements and SQL placeholders and never interpolating user input directly into your SQL strings, you can skip the SQL sanitization entirely.
When you use placeholders, the structure of the SQL statement (SELECT foo, bar, baz FROM my_table WHERE id = ?) is sent to the database engine separately from the data values which are (eventually) bound to the placeholders. This means that, barring major bugs in the database engine, there is absolutely no way for the data values to be misinterpreted as SQL instructions, so this provides complete protection from SQL injection attacks without requiring you to mangle your data for storage.

No, this isn't overkill; this is a vulnerability.
This code is completely vulnerable to SQL injection. You are calling mysql_real_escape_string() and then stripslashes(). So a " would become \" after mysql_real_escape_string() and then go back to " after stripslashes(). mysql_real_escape_string() alone is best for stopping SQL injection. Parameterized query libraries like PDO and ADODB use it, and parameterized queries make it very easy to completely stop SQL injection.
Go ahead test your code:
$variable = sanitizeString($_POST['user_input']);
$variable = sanitizeMySQL($_POST['user_input']);
mysql_query("select * from mysql.user where Host='".$variable."'");
What if:
$_POST['user_input'] = "1' or 1=1 /*";
Patched:
mysql_query("select * from mysql.user where Host='".mysql_real_escape_string($variable)."'");
This code is also vulnerable to some types of XSS:
$variable = sanitizeString($_POST['user_input']);
$variable = sanitizeMySQL($_POST['user_input']);
print("<body background='http://localhost/image.php?".$variable."' >");
What if:
$_POST['user_input']="' onload=alert(/xss/)";
patched:
$variable=htmlspecialchars($variable,ENT_QUOTES);
print("<body background='http://localhost/image.php?".$variable."' >");
htmlspecialchars with ENT_QUOTES encodes both single and double quotes. Make sure the variable you are printing is also enclosed in quotes; this makes it impossible to "break out" and execute code.
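The "escape, then strip the slashes again" problem can be reproduced without a database. In this sketch addslashes() stands in for mysql_real_escape_string() (an assumption made so the code runs anywhere; both put a backslash in front of a single quote), and the htmlentities() flags are written out as ENT_COMPAT, which leaves single quotes alone.

```php
<?php
// The book's first function, with the htmlentities() flags written out.
function sanitizeStringFromBook($var)
{
    $var = stripslashes($var);
    $var = htmlentities($var, ENT_COMPAT, 'UTF-8');
    $var = strip_tags($var);
    return $var;
}

$input   = "1' or 1=1 /*";
// Stand-in for the escaping step of sanitizeMySQL().
$escaped = addslashes($input);
$result  = sanitizeStringFromBook($escaped);

var_dump($escaped !== $input);  // bool(true): the escaping changed the string
var_dump($result === $input);   // bool(true): the raw quote is back again
```

So whatever reaches the query is exactly what the attacker typed, unescaped quote included.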

Well, if you don't want to reinvent the wheel you can use HTMLPurifier. It allows you to decide exactly what you want and what you don't want and prevents XSS attacks and such

I wonder about the concept of sanitization. You're telling Mysql to do exactly what you want it to do: run a query statement authored in part by the website user. You're already constructing the sentence dynamically using user input - concatenating strings with data supplied by the user. You get what you ask for.
Anyway, here's some more sanitization methods...
1) For numeric values, always manually cast at least somewhere before or while you build the query string: "SELECT field1 FROM tblTest WHERE(id = ".(int) $val.")";
2) For dates, convert the variable to a Unix timestamp first. Then, use the MySQL FROM_UNIXTIME() function to convert it back to a date. "SELECT field1 FROM tblTest WHERE(date_field >= FROM_UNIXTIME(".strtotime($val)."))";. This is actually needed sometimes anyway to deal with how MySQL interprets and stores dates differently from the script or OS layers.
3) For short and predictable strings that must follow a certain standard (username, email, phone number, etc), you can a) do prepared statements; or b) regex or other data validation.
4) For strings which wouldn't follow any real standard and which may or may not have pre- or double-escaped and executable code all over the place (text, memos, wiki markup, links, etc), you can a) do prepared statements; or b) store to and convert from binary/blob form - converting each character to binary, hex, or decimal representation before even passing the value to the query string, and converting back when extracting. This way you can focus more on just html validation when you spit the stored value back out.
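For point 2, note that strtotime() returns false when it cannot parse its input, and concatenating false yields an empty string, which would leave FROM_UNIXTIME() without an argument. A sketch that checks for this first (the helper name dateCondition and the UTC timezone are assumptions for illustration):

```php
<?php
date_default_timezone_set('UTC');   // assumption: timestamps handled in UTC

// Hypothetical helper for the date case described above.
function dateCondition($val)
{
    $ts = strtotime($val);
    if ($ts === false) {
        throw new InvalidArgumentException('Unparseable date');
    }
    // Only an integer ever reaches the SQL text.
    return 'date_field >= FROM_UNIXTIME(' . (int) $ts . ')';
}

echo 'SELECT field1 FROM tblTest WHERE (' . dateCondition('2012-05-01') . ')', PHP_EOL;
```

Whatever the user typed, the query only ever contains the integer produced by strtotime().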

Is this PHP code secure?

Just a quick question: is the following PHP code secure? Also is there anything you think I could or should add?
    $post = $_GET['post'];
    if(is_numeric($post))
    {
        $post = mysql_real_escape_string($post);
    }
    else
    {
        die("NAUGHTY NAUGHTY");
    }
    mysql_select_db("****", $*****);
    $content = mysql_query("SELECT * FROM tbl_***** WHERE Id='" . $post . "'");

In this particular case, I guess the is_numeric saves you from SQL injections (although you would still be able to break the SQL statement, cf Alex' answer). However, I really think you should consider using parameterized queries (aka. prepared statements) because:
They will protect you even when using parameters of non-numeric types
You do not risk forgetting input sanitation as you add more parameters
Your code will be a lot easier to write and read
Here is an example (where $db is a PDO connection):
$stmt = $db->prepare('SELECT * FROM tbl_Persons WHERE Id = :id');
$stmt->execute(array(':id' => $_GET['post']));
$rows = $stmt->fetchAll();
For more information about parameterized SQL statements in PHP see:
Best way to stop SQL Injection in PHP

It's a little rough, but I don't immediately see anything that will cause you any serious problems. You should note that hexadecimal notation is accepted by is_numeric() according to the documentation (this applies to PHP versions before 7.0). You may want to cast it (note that is_int() returns false for any string, including $_GET values). And for clarity, I would suggest building the query with sprintf:
    $sql = sprintf("SELECT col1
                    FROM tbl
                    WHERE col2 = '%s'", mysql_real_escape_string($post));
In this case, the escaped $post would be passed in as the value of %s.

is_numeric will return true for hexadecimal numbers such as '0xFF' (in PHP versions before 7.0).
EDIT: To get around this you can do something like:
sprintf('%d', mysql_real_escape_string($post, $conn));
//If $post is not an int, it will become 0 by sprintf
Look at the snippet here on php.net for more info.

You have the right idea but is_numeric() may not behave as you intended.
Try this test:
<?php
$tests = Array(
    "42",
    1337,
    "1e4",
    "not numeric",
    Array(),
    9.1
);
foreach($tests as $element)
{
    if(is_numeric($element))
    {
        echo "'{$element}' is numeric", PHP_EOL;
    }
    else
    {
        echo "'{$element}' is NOT numeric", PHP_EOL;
    }
}
?>
The result is:
'42' is numeric
'1337' is numeric
'1e4' is numeric
'not numeric' is NOT numeric
'Array' is NOT numeric
'9.1' is numeric
1e4 may not be something your sql server understands if you're looking for what is commonly referred to as a numeric value. From an SQL injection standpoint you're fine.

You're not passing the connection resource to mysql_real_escape_string() (but you seemingly do so with mysql_select_db()). The connection resource amongst other things stores the connection charset setting which might affect the behavior of real_escape_string().
Either don't pass the resource anywhere or (preferably) pass it always but don't make it even worse than not passing the resource by mixing both.
"Security" in my book also encompasses whether the code is readable, "comprehensible" and does "straight-forward" things. In the example you would at least have to explain to me why you have the !numeric -> die branch at all when you treat the id as a string in a SELECT query. My counter-argument (as the example stands; could be wrong in your context) would be "Why bother? The SELECT just will not return any record for a non-numeric id" which reduces the code to
    if ( isset($_GET['post']) ) {
        $query = sprintf(
            "SELECT x,y,z FROM foo WHERE id='%s'",
            mysql_real_escape_string($_GET['post'], $mysqlconn)
        );
        ...
    }
That automagically eliminates the trouble you might run into because is_numeric() doesn't behave as you expect (as explained in other answers).
edit: And there's something to be said about using die() too often/too early in production code. It's fine for test/example code, but in a live system you almost always want to give control back to the surrounding code instead of just exiting (so your application can handle the error gracefully). During the development phase you might want to bail out early or put more tests in. In that case take a look at http://docs.php.net/assert.
Your example might qualify for an assertion. It won't break if the assertion is deactivated but it might give a developer more information about why it's not working as intended (by this other developer) when a non-numeric argument is passed. But you have to be careful about separating necessary tests from assertions; they are not silver bullets.
If you feel is_numeric() is an essential test, your function(?) might return false, throw an exception or do something else to signal the condition. But to me an early die() is the easy way out, a bit like a clueless opossum: "I have no idea what to do now. If I play dead maybe no one will notice" ;-)
Obligatory hint to prepared statements: http://docs.php.net/pdo.prepared-statements

I think it looks ok.
When accessing databases I always use query binding, this avoids problems if I forget to escape my strings.
