Is using regex to check columns in database wrapper overkill?


Here's an example wrapper for an SQL query:
public function where($col, $val)
{
    if (!preg_match('~^[a-z0-9_]+$~i', $col))
        throw new Exception('Invalid parameter $col');
    $this->where .= "WHERE $col = :$col";
}
Is this overkill, given that the regex probably costs some resources on every call?
Note that I am actually using this to wrap PDO (notice the colon in :$col).

If $col can be specified through user input then this is not overkill but rather your only defense against SQL injection.
If $col is known safe (for example your code produces its value with a switch statement) then it's probably not worth it to include the runtime check. But you should take into account the possibility of the "known safe" status changing as the program is maintained in the future.
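As a minimal sketch of the "known safe" case, the column name can be checked against a fixed allow-list instead of a regex. The column names and the buildWhere() helper below are hypothetical and only illustrate the idea; the value itself still goes through a bound parameter.

```php
<?php
// Hypothetical allow-list: the only column names this wrapper accepts.
const ALLOWED_COLUMNS = ['id', 'name', 'email'];

// Hypothetical helper that builds the WHERE fragment for a known column.
function buildWhere(string $col): string
{
    // Strict comparison so that only exact, known names pass.
    if (!in_array($col, ALLOWED_COLUMNS, true)) {
        throw new InvalidArgumentException("Invalid column: $col");
    }
    // The column name is now known safe; the value stays a bound parameter.
    return "WHERE $col = :$col";
}

echo buildWhere('email'), PHP_EOL; // WHERE email = :email
```

An allow-list also survives maintenance better than an assumption that callers only ever pass safe names, since an unknown column fails loudly.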

Related

Casting variables to integers in SQL queries in PHP

First of all, I am fully aware of SQL injection vulnerabilities and I am using PDO for newer applications that I am developing in PHP.
Long story short, the organization that I'm working for cannot afford to delegate any human resources at the moment to switch everything over to PDO for the rather large application that I'm currently working on, so I'm stuck with using mysql_* functions in the meantime.
Anyways, I am wondering if it is safe to use data validation functions to "sanitize" numeric parameters used in the interpolated queries. We do use mysql_real_escape_string() for strings (and yes I am aware of the limitations there too). Here is an example:
public function foo($id) {
    $sql = "SELECT * FROM items WHERE item_id = $id";
    $this->query($sql); // calls mysql_query and does things with the result
}
$id is a user-supplied value via HTTP GET, so obviously this code is vulnerable. Would it be OK if I did this?
public function foo($id) {
    if (!ctype_digit($id)) {
        throw new \InvalidArgumentException("ID must be numeric");
    }
    $sql = "SELECT * FROM items WHERE item_id = $id";
    $this->query($sql); // calls mysql_query and does things with the result
}
As far as I'm aware, ctype_digit is the same as checking against the regular expression ^\d+$.
(There's also filter_var($id, FILTER_VALIDATE_INT), but that can potentially return int(0) which evaluates to FALSE under loosely-typed comparisons, so I'd have to do === FALSE there.)
Are there any problems with this temporary solution?
Update:
Variables do not only include primary keys, but any field with type boolean, tinyint, int, bigint, etc., which means that zero is a perfectly acceptable value to be searching for.
We are using PHP 5.3.2

Yes, if you indeed religiously use the correct function to validate the data and correctly prevent the query from running if the data is not as expected, there's no vulnerability I can see. ctype_digit has a very limited and clear purpose:
Returns TRUE if every character in the string text is a decimal digit, FALSE otherwise.
There's basically nothing that can go wrong with this function, so it's safe to use. It will even return false on an empty string (since PHP 5.1). Note that is_numeric would not be so trustworthy. I would possibly still add a range check to make sure the number is within an expected range, since I'm not sure what could happen with overflowing integers. If you additionally cast to (int) after this check, there's no chance of injection.
Caveat emptor: as with all non-native parameterised queries, there's still a chance of injection if you're getting into any shenanigans with connection charsets. The range of bytes that may slip through is severely limited by ctype_digit, but you never know what one could come up with.
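The combination described above (digit check, range check, then cast) could look roughly like this. The toSafeId() helper and its default upper bound are assumptions made for illustration; the bound should match the actual column type.

```php
<?php
// Hypothetical helper: validates a decimal ID string and returns it as an int.
// The default $max is an assumed upper bound, not taken from the question.
function toSafeId($value, int $max = 2147483647): int
{
    // ctype_digit() is only reliable on strings, so reject everything else.
    if (!is_string($value) || !ctype_digit($value)) {
        throw new InvalidArgumentException('ID must be a string of decimal digits');
    }
    // Length check first, so very long digit strings never reach the cast.
    if (strlen($value) > strlen((string) $max) || (int) $value > $max) {
        throw new InvalidArgumentException('ID is out of range');
    }
    return (int) $value;
}

$sql = 'SELECT * FROM items WHERE item_id = ' . toSafeId('42');
echo $sql, PHP_EOL; // SELECT * FROM items WHERE item_id = 42
```

Zero passes this check, which matters when the value is a boolean or tinyint field rather than a primary key.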

Yes, it will work. Your code will raise an exception if the value isn't a numeric string; you'll just have to catch that and display an error message to the user.
Beware that ctype_digit($foo):
Will return true if $foo is an empty string before PHP 5.1 (see the doc);
Will return false for int values of $foo between -128 and 255 that lie outside the [48, 57] interval, because such ints are interpreted as ASCII codes.
So you'll also need to check that $foo is a non-empty string if you plan on using ctype_digit($foo).

Long story short, the organization that I'm working for cannot afford to delegate any human resources at the moment to switch everything over to PDO
I don't see where the problem is. According to the code you posted, you are already using some sort of DB wrapper, and you are already planning to alter the calling code for every numeric parameter. Why not alter that DB wrapper to make it support prepared statements, and alter the calling code to use it?
The old mysql extension is not a problem: one can emulate prepared statements with it just fine.
I am fully aware of SQL injection vulnerabilities.
Your "full awareness" is a bit exaggerated. Unfortunately, most people do not understand the real source of injection, or the real purpose of a prepared statement.
Separating data from the query is a nice trick, but a totally unnecessary one. The real value of a prepared statement is its inevitability, as opposed to the essential arbitrariness of manual formatting.
Another fault of yours is the split treatment of strings: a string is formatted partly in the query (adding quotes) and partly outside it (escaping special characters), which again is a call for disaster.
Since you decided to stick to manual formatting, enjoy your injections, sooner or later. Your ideas are good for an artificial, fully controlled sandbox example. However, things turn out differently in a real-life application with many people working on it. Instead of asking a program to format your data, you are asking people to do that, with all the obvious consequences.
It makes me wonder why PHP users are unable to learn from their mistakes and are still eagerly devising practices that were proved unreliable a long time ago.
Just spotted another fallacy in your reasoning
a user-supplied value via HTTP GET so obviously this code is vulnerable.
You have to understand that any unformatted value makes this code vulnerable, no matter whether it comes from HTTP GET, FTP PUT or a file read. It is not only the notorious "user input" that has to be properly formatted, but any input. This is why it is essential to make the DB driver the only place where formatting occurs. It should be a program that formats the data, not a developer. Your idea contradicts this cornerstone principle.
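To illustrate what "emulating prepared statements" in one central place might look like, here is a rough sketch. The formatQuery() function is hypothetical, it assumes the SQL template contains no literal "?" inside string literals, and the $escape callable stands in for the driver's own escaping function (mysql_real_escape_string in the setup discussed here).

```php
<?php
// Hypothetical helper: fills "?" placeholders so that all formatting happens here.
// $escape stands in for the driver's own escaping function.
function formatQuery(string $sql, array $params, callable $escape): string
{
    $params = array_values($params);
    if (substr_count($sql, '?') !== count($params)) {
        throw new InvalidArgumentException('Placeholder count does not match parameter count');
    }
    $i = 0;
    // Substitute in a single pass so inserted values are never rescanned for "?".
    return preg_replace_callback('/\?/', function () use (&$i, $params, $escape) {
        $value = $params[$i++];
        if (is_int($value)) {
            return (string) $value;             // integers need no quoting
        }
        if ($value === null) {
            return 'NULL';
        }
        if (is_string($value)) {
            return "'" . $escape($value) . "'"; // quoting and escaping done together
        }
        throw new InvalidArgumentException('Unsupported parameter type');
    }, $sql);
}

// addslashes() is only a stand-in so the sketch runs without a database;
// it is not a safe substitute for the driver's escaping function.
echo formatQuery(
    'SELECT * FROM items WHERE item_id = ? AND name = ?',
    [42, "O'Brien"],
    'addslashes'
), PHP_EOL;
```

The point of the design is that calling code never quotes or escapes anything itself; it only passes values, and the wrapper decides how each type is formatted.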

Use mysql_real_escape_string and wrap your $id in single quotes. The escaping together with the single quotes keeps the value inside the string literal, which prevents SQL injection.
For example, SELECT * FROM table WHERE id = 'escaped string' can't be hacked into something like: SELECT * FROM table WHERE id = 1; DROP table; because '1; DROP table;' will be treated as the input argument for WHERE.

ctype_digit() will return false for many integer values of $id (as opposed to numeric strings). If you want to use the function, cast the value to a string first:
public function foo($id) {
    $id = (string)$id;
    if (!ctype_digit($id)) {
        throw new \InvalidArgumentException("ID must be numeric");
    }
    $sql = "SELECT * FROM items WHERE item_id = $id";
    $this->query($sql); // calls mysql_query and does things with the result
}
This is because an integer between -128 and 255 is interpreted as the ASCII value of a single character.

I use intval() for simple cases, although (int) apparently uses fewer resources. Example:
$sql =
    "SELECT * FROM categories WHERE category_id = " .
    intval($_POST['id']) .
    " LIMIT 1";

Function escaping before inserting in mysql

I've been working on some code that escapes your POST values, if they are strings, before you insert them into the DB. Is it a good idea? Here is the code (updated to handle numeric values):
static function securePosts(){
    $posts = array();
    foreach($_POST as $key => $val){
        if(!is_numeric($val)){
            if(is_string($val)){
                if(get_magic_quotes_gpc())
                    $val = stripslashes($val);
                $posts[$key] = mysql_real_escape_string($val);
            }
        }else
            $posts[$key] = $val;
    }
    return $posts;
}
Then in another file:
if(isset($_POST)){
    $post = ChangeHandler::securePosts();
    if(isset($post['user'])){
        AddUserToDbOrWhatEver($post['user']);
    }
}
Is this good, or will it have bad effects to escape the values before even passing them to the function (AddUserToDbOrWhatEver)?

When working with user-input, one should distinguish between validation and escaping.
Validation
There you test the content of the user-input. If you expect a number, you check if this is really a numerical input. Validation can be done as early as possible. If the validation fails, you can reject it immediately and return with an error message.
Escaping
Here you bring the user-input into a form that cannot damage a given target system. Escaping should be done as late as possible and only for the given system. If you want to store the user-input into a database, you would use a function like mysqli_real_escape_string() or a parameterized PDO query. Later, if you want to output it on an HTML page, you would use htmlspecialchars().
It's not a good idea to escape the user-input preventively, or to escape it for several target systems at once. Each escaping can corrupt the original value for other target systems, and you can lose information this way.
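A small sketch of how early escaping can corrupt a value for another target. The input string is made up, and addslashes() is used here only as a stand-in for an SQL escaping function so that the example runs without a database connection.

```php
<?php
// Hypothetical input, as it might arrive from a form field.
$raw = "O'Reilly <b>";

// Escaping late, once per target, keeps the original value intact.
$forHtml = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

// Escaping early for one target (addslashes() stands in for an SQL escaper)
// and then reusing that value for HTML leaks a stray backslash into the page.
$preEscaped = addslashes($raw);
$corrupted  = htmlspecialchars($preEscaped, ENT_QUOTES, 'UTF-8');

echo $forHtml, PHP_EOL;   // HTML-safe version of the original value
echo $corrupted, PHP_EOL; // same, but with a backslash before the quote entity
```

Keeping $raw untouched until the moment it is handed to a specific target avoids this kind of double treatment.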
P.S.
As YourCommonSense correctly pointed out, it is not always enough to use escape functions to be safe, but that does not mean that you should not use them. Often the character encoding is a pitfall for security efforts, and it is a good habit to declare the character encoding explicitly. In the case of mysqli this can be done with $db->set_charset('utf8'); and for HTML pages it helps to declare the charset with a meta tag.

It is ALWAYS a good idea to escape user input BEFORE inserting anything into the database. However, you should also try to convert values that you expect to be numbers to integers (signed or unsigned). Or better, you should use prepared SQL statements. There is a lot of info on the latter here and in the PHP docs.

Is it safe to use mysql_* functions if PDO and mysqli are not available?

I have a website hosted on shared hosting. They have PHP 5.2.13 installed.
I know about the vulnerabilities of SQL injection and I want to prevent it, so I want to use PDO or mysqli.
The problem is that when I used phpinfo(); to view the hosting environment's PHP setup, I found that there was no MySQL driver for PDO and no support for mysqli.
So I wanted to know whether it is safe to use the old mysql_* functions (along with functions like mysql_real_escape_string).
I looked at this question on SO, but it wasn't much help to me:
Prepared statements possible when mysqli and PDO are not available?
UPDATE:
I forgot to mention that most of the queries will be simple. There are no forms used so no user input will be used to make a query. All the queries will be hard coded with necessary parameters and they will not be changed once set.

No. The lack of more secure solutions is never a valid excuse to fall back to a less secure or more vulnerable solution.
You're much better off finding a different hosting provider that doesn't disable arbitrary PHP features even in their shared hosting packages. Oh, and try to get one that uses PHP 5.3, or better yet if you can, PHP 5.4.

If you're really rigorous about always using mysql_real_escape_string() with all user-supplied input, then I think you should be safe from any SQL injection that prepared statements protect you from.
How perfect are you at this? I'll bet most of the buffer overflow vulnerabilities were created by programmers who thought they were good at checking inputs....

A good way to do that is to implement a wrapper class around the mysql_* functions, with a few methods to emulate prepared statements.
The idea is that you must pass strongly-typed parameters to your queries.
For instance, here is a piece of code with the general idea. Of course it needs more work, but it can help prevent SQL injection attacks if it is properly implemented.
You can also search for third-party libraries that already implement this, because it is a common need.
<?php
class MyDb
{
    protected $query;
    protected $result = null; // result of the last executed query

    public function setQuery($query)
    {
        $this->query = $query;
        $this->result = null;
    }

    public function setNumericParameter($name, $value)
    {
        if (is_numeric($value)) // SQL injection check: is the value really numeric?
        {
            // str_replace() needs the subject string as its third argument
            $this->query = str_replace(':' . $name, $value, $this->query);
        }
        // else, probably an SQL injection attempt
    }

    // Implement here the methods for all the types you need, including dates, strings, etc.

    public function fetchArray()
    {
        // Run the query only once, then return one row per call (false when done)
        if ($this->result === null)
        {
            $this->result = mysql_query($this->query);
        }
        return mysql_fetch_array($this->result);
    }
}

$db = new MyDb();
$db->setQuery('SELECT * FROM articles WHERE id = :id');
$db->setNumericParameter('id', 15);
while ($row = $db->fetchArray())
{
    // do your homework
}

Why are the big frameworks ignoring precondition checks?

From what I know, checking preconditions is a good practice. If a method needs an int value, then it's a good solution to use something like this:
public function sum($input1, $input2) {
    if (!is_int($input1)) throw new Exception('Input must be an integer');
However, after looking at the source code of Zend/CodeIgniter, I don't see checks like this very often. Is there a reason for this?

Because it is difficult / inefficient to test each and every variable before you use it. Instead they check just input variables - check visitors at the door, not once inside the house.
It is of course a good defensive programming technique to test at least more important vars before using them, especially if the input comes from many places.
This is a bit off-topic, but the solution I would recommend is to test input variables like this:
$username=get('username', 'string');
$a=get('a', 'int');
...
$_REQUEST and similar should never be used (or even be accessible) directly.
Also, when doing HTML output, you should always use this:
echo html($username); // replaces '<' with '&lt;' - uses htmlentities
To avoid SQL injection attacks one can use MeekroDB, but it is unfortunately very limiting (MySQL only, single DB only,...). It has a good API though which promotes safety, so I would recommend checking it out.
For myself I have built a small DB library that is based on PDO and uses prepared statements. YMMV.
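The answer does not show how get() and html() are implemented, so the following is only a guess at what such helpers might look like. The optional $source parameter and the choice to return null for missing or invalid input are assumptions of this sketch.

```php
<?php
// Hypothetical implementation of the get() accessor used above.
// $source defaults to $_GET; passing an array makes the function testable.
function get(string $name, string $type, ?array $source = null)
{
    $source = $source ?? $_GET;
    if (!isset($source[$name]) || !is_string($source[$name])) {
        return null;                             // missing or not a plain string
    }
    $value = $source[$name];
    switch ($type) {
        case 'int':
            $int = filter_var($value, FILTER_VALIDATE_INT);
            return $int === false ? null : $int; // strict check: 0 is valid
        case 'string':
            return $value;
        default:
            throw new InvalidArgumentException("Unknown type: $type");
    }
}

// Hypothetical html() helper: escapes a value for HTML output.
function html(string $text): string
{
    return htmlentities($text, ENT_QUOTES, 'UTF-8');
}

$input = ['username' => '<admin>', 'a' => '0'];
echo html(get('username', 'string', $input)), PHP_EOL; // &lt;admin&gt;
var_dump(get('a', 'int', $input));                      // int(0)
```

With accessors like these, the "check visitors at the door" idea becomes a single choke point: code further inside the application only ever sees values that have already been typed.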

Specifying such strict preconditions everywhere is not necessary and does not feel useful in a dynamically typed language.
$sum = sum("1", "2");
Why should one forbid it? Additionally, if you throw an Exception, callers will try to avoid it. This means they will test and cast the values themselves:
function sum ($a, $b) {
    if (!is_int($a)) throw new Exception('Input must be an integer');
    if (!is_int($b)) throw new Exception('Input must be an integer');
    return $a + $b;
}
if (!is_int($value1)) { $value1 = (int) $value1; }
if (!is_int($value2)) { $value2 = (int) $value2; }
$sum = sum($value1, $value2);
Every is_int() check then occurs multiple times, just to avoid unnecessary Exceptions.
It's sufficient to validate values when you receive them, not all over the whole application.

Speaking about ZF, I'd say that they try to minimize it in favour of interfaces and classes. You can see something like this in many definitions across ZF:
public function preDispatch(Zend_Request_Http $request)
which is fine enough. Also at critical places where ints/strings are needed there are some sanity checks. But mostly not in the form of is_string() but rather as isValidLocale() that calls some other class to check validity.

Is this PHP code secure?

Just a quick question: is the following PHP code secure? Also is there anything you think I could or should add?
$post = $_GET['post'];
if(is_numeric($post))
{
    $post = mysql_real_escape_string($post);
}
else
{
    die("NAUGHTY NAUGHTY");
}
mysql_select_db("****", $*****);
$content = mysql_query("SELECT * FROM tbl_***** WHERE Id='" . $post . "'");

In this particular case, I guess the is_numeric saves you from SQL injections (although you would still be able to break the SQL statement, cf Alex' answer). However, I really think you should consider using parameterized queries (aka. prepared statements) because:
They will protect you even when using parameters of non-numeric types
You do not risk forgetting input sanitation as you add more parameters
Your code will be a lot easier to write and read
Here is an example (where $db is a PDO connection):
$stmt = $db->prepare('SELECT * FROM tbl_Persons WHERE Id = :id');
$stmt->execute(array(':id' => $_GET['post']));
$rows = $stmt->fetchAll();
For more information about parameterized SQL statements in PHP see:
Best way to stop SQL Injection in PHP

It's a little rough, but I don't immediately see anything that will cause you any serious problems. You should note that hexadecimal notation is accepted by is_numeric() according to the documentation. You may want to use is_int() or cast it. And for clarity, I would suggest building the query with a format string, in the style of a parameterized query:
$sql = sprintf("SELECT col1
                FROM tbl
                WHERE col2 = '%s'", mysql_real_escape_string($post));
In this case, $post would be passed in as the value of %s.

is_numeric will return true for hexadecimal strings such as '0xFF' (in PHP versions before 7.0).
EDIT: To get around this you can do something like:
sprintf('%d', mysql_real_escape_string($post, $conn));
//If $post is not an int, it will become 0 by sprintf
Look at the snippet here on php.net for more info.

You have the right idea but is_numeric() may not behave as you intended.
Try this test:
<?php
$tests = Array(
    "42",
    1337,
    "1e4",
    "not numeric",
    Array(),
    9.1
);
foreach($tests as $element)
{
    if(is_numeric($element))
    {
        echo "'{$element}' is numeric", PHP_EOL;
    }
    else
    {
        echo "'{$element}' is NOT numeric", PHP_EOL;
    }
}
?>
The result is:
'42' is numeric
'1337' is numeric
'1e4' is numeric
'not numeric' is NOT numeric
'Array' is NOT numeric
'9.1' is numeric
1e4 may not be something your sql server understands if you're looking for what is commonly referred to as a numeric value. From an SQL injection standpoint you're fine.
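If only plain decimal integers should pass, a stricter check than is_numeric() is possible. The parseIntId() helper below is a hypothetical sketch built on filter_var() with FILTER_VALIDATE_INT; it returns null for anything that is not an integer, and compares against false strictly so that 0 stays valid.

```php
<?php
// Hypothetical helper: accepts plain decimal integers only.
function parseIntId($value): ?int
{
    if (!is_string($value) && !is_int($value)) {
        return null;                         // arrays, floats, etc.
    }
    $int = filter_var($value, FILTER_VALIDATE_INT);
    return $int === false ? null : $int;     // strict check, since 0 is valid
}

// Same inputs as the test above, plus '0'.
foreach (['42', 1337, '1e4', 'not numeric', [], 9.1, '0'] as $element) {
    var_dump(parseIntId($element));
}
```

Of these inputs, only '42', 1337 and '0' come back as integers; '1e4', 9.1, the array and the non-numeric string all yield null.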

You're not passing the connection resource to mysql_real_escape_string() (but you seemingly do so with mysql_select_db()). The connection resource amongst other things stores the connection charset setting which might affect the behavior of real_escape_string().
Either don't pass the resource anywhere or (preferably) always pass it, but don't make things worse by mixing both styles.
"Security" in my book also encompasses whether the code is readable, "comprehensible" and does "straight-forward" things. In the example you would at least have to explain to me why you have the !numeric -> die branch at all when you treat the id as a string in a SELECT query. My counter-argument (as the example stands; could be wrong in your context) would be "Why bother? The SELECT just will not return any record for a non-numeric id" which reduces the code to
if ( isset($_GET['post']) ) {
    $query = sprintf(
        "SELECT x,y,z FROM foo WHERE id='%s'",
        mysql_real_escape_string($_GET['post'], $mysqlconn)
    );
    ...
}
That automagically eliminates the trouble you might run into because is_numeric() doesn't behave as you expect (as explained in other answers).
edit: And there's something to be said about using die() too often/too early in production code. It's fine for test/example code, but in a live system you almost always want to give control back to the surrounding code instead of just exiting (so your application can handle the error gracefully). During the development phase you might want to bail out early or put more tests in. In that case take a look at http://docs.php.net/assert.
Your example might qualify for an assertion. It won't break if the assertion is deactivated but it might give a developer more information about why it's not working as intended (by this other developer) when a non-numeric argument is passed. But you have to be careful about separating necessary tests from assertions; they are not silver bullets.
If you feel is_numeric() to be an essential test, your function(?) might return false, throw an exception or do something else to signal the condition. But to me an early die() is the easy way out, a bit like a clueless opossum, "I have no idea what to do now. If I play dead maybe no one will notice" ;-)
Obligatory hint to prepared statements: http://docs.php.net/pdo.prepared-statements

I think it looks ok.
When accessing databases I always use query binding, this avoids problems if I forget to escape my strings.
