An alternative to SQL concatenation for search options? - php

I'm a novice programmer, and I've inherited an application designed and built by a person who has now left the company. It's done in PHP and SQL Server 2008R2. In this application, there's a page with a table displaying a list of items, populated from the database, with some options for filters in a sidebar - search by ID, keyword, date etc. This table is populated by a mammoth query, and the filters are applied by concatenating them into said query. For example, if someone wanted item #131:
$filterString = "Item.itemID = 131";
$filter = " AND " . $filterString;
SELECT ...
FROM ...
WHERE...
$filter
The filter is included on the end of the URL of the search page. This isn't great, and I'm fairly sure there are some SQL injection vulnerabilities as a result, but it is extremely flexible: the filter string is created before it's concatenated, and can have lots of different conditions. E.g. $filterString could be "condition AND condition AND condition OR condition".
I've been looking into Stored Procedures, as a better way to counter the issue of SQL Injection, but I haven't had any luck working out how to replicate this same level of flexibility. I don't know ahead of time which of the filters (if any) will be selected.
Is there something I'm missing?

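Use either mysqli or PDO, both of which support prepared (parameterized) queries, to guard against SQL injection. Note that mysqli only talks to MySQL, so with SQL Server you would use PDO with a SQL Server driver such as pdo_sqlsrv. In PDO this could look something like this: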
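$conditions = '';
$params = array();
// Each selected filter adds one placeholder and one bound value.
if (isset($form->age)) {
    $conditions .= ' AND user.age > ?';
    $params[] = $form->age;
}
if (isset($form->brand)) {
    $conditions .= ' AND car.brand = ?';
    $params[] = $form->brand;
}
// "1 = 1" keeps the WHERE clause valid when no filter is selected,
// since every condition starts with AND.
$sql = "
SELECT ...
FROM ...
LEFT ...
WHERE 1 = 1 $conditions
";
$sth = $dbh->prepare($sql);
$sth->execute($params);
$result = $sth->fetchAll();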
From the manual:
Calling PDO::prepare() and PDOStatement::execute() for statements that will be issued multiple times with different parameter values optimizes the performance of your application by allowing the driver to negotiate client and/or server side caching of the query plan and meta information, and helps to prevent SQL injection attacks by eliminating the need to manually quote the parameters.
http://no1.php.net/manual/en/pdo.prepare.php
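To keep the flexibility the question asks for without ever passing SQL text in the URL, the same idea can be driven by a whitelist of filter names. This is only a sketch: the FILTER_SQL map, the buildFilters() helper, and every column name other than Item.itemID are hypothetical, invented for illustration.

```php
<?php
// Hypothetical whitelist: filter name => SQL fragment with one placeholder.
// Only Item.itemID comes from the question; the other columns are assumptions.
const FILTER_SQL = [
    'id'      => 'Item.itemID = ?',
    'keyword' => 'Item.description LIKE ?',
    'from'    => 'Item.createdDate >= ?',
];

/**
 * Turn user-selected filters into a WHERE fragment plus bound values.
 * Unknown filter names and empty values are skipped, so nothing from
 * the request ever becomes part of the SQL text.
 */
function buildFilters(array $selected): array
{
    $conditions = [];
    $params = [];
    foreach ($selected as $name => $value) {
        if (!array_key_exists($name, FILTER_SQL) || $value === '' || $value === null) {
            continue;
        }
        $conditions[] = FILTER_SQL[$name];
        // Keyword searches are wrapped in wildcards; other values pass through.
        $params[] = ($name === 'keyword') ? '%' . $value . '%' : $value;
    }
    $where = $conditions ? ' AND ' . implode(' AND ', $conditions) : '';
    return [$where, $params];
}

[$where, $params] = buildFilters([
    'id'      => 131,
    'keyword' => 'pump',
    'bogus'   => '1; DROP TABLE Item',   // not in the whitelist, so ignored
]);
echo $where, "\n";
```

The URL then only needs to carry filter names and values (for example ?id=131&keyword=pump), and $where can be appended after the query's existing WHERE conditions before calling prepare() and execute($params).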

Related

How to check for the sanity of sql query in PHP

I am writing a DbAdapter in PHP. Trying to avoid sql injection attacks, for conditional selects, I need a way to check for the sanity of the SQL query that I am going to run. Given that prepared statements make the implementation very complicated, is there a quick way to check for the sanity of the sql query (WHERE clauses in particular as is the case here) before executing in the heart of the class? For example, a helper method to return false for malicious or suspicious queries will be fine.
My class code:
require_once './config.php';
class DbAdapter
{
private $link;
/**
* DbAdapter constructor.
*/
public function __construct()
{
$this->link = new mysqli(DBHOST, DBUSER, DBPASS, DBNAME);
if ($this->link->connect_errno) {
die($this->link->connect_error);
}
}
/**
* @param $table
* @param array $columns
* @param string $condition
* @return bool|mysqli_result
*/
public function select($table, $columns = [], $condition = "")
{
$colsString = $this->extractCols($columns);
$whereString = $this->extractConditions($condition);
$sql = "SELECT $colsString FROM `$table` " . $whereString;
return $this->link->query($sql);
}
public function __destruct()
{
$this->link->close();
}
private function extractCols(array $columns)
{
if(!$columns) { return '*';}
else {
$str = "";
foreach($columns as $col) {
$str .= "$col,";
}
return trim($str, ',');
}
}
private function extractConditions(string $conditions)
{
if(!$conditions) {
return "";
}
else {
$where = "WHERE ";
foreach ($conditions as $key => $value){
$where .= "$key=" . $conditions[$key] . "&";
}
return trim($where, "&");
}
}
}
Short Answer
You can use EXPLAIN, as in EXPLAIN SELECT foo FROM table_bar. How to interpret the results programmatically for "sanity," however, is a much more difficult question. You'll need a programmatic definition of "sanity," like "examines more than n rows" or "involves more than t tables."
SQL Injection
You mentioned that your motivation includes wanting to "avoid sql injection attacks." If that's what's worrying you, the most important thing here is to avoid concatenating any user data into a query. SQL injection is possible if you concatenate any user data, and it's very, very hard to detect. Much better simply to prevent it entirely.
This code, frankly, makes my hair stand on end:
$where = "WHERE ";
foreach ($conditions as $key => $value){
$where .= "$key=" . $conditions[$key] . "&";
}
There's no way to make that safe enough or to sanity-check it enough. You might think, "Yeah, but all of the conditions should contain only digits," or something similarly easy to validate, but you cannot safely rely on that. What happens when you modify your code next year, or next week, or tomorrow, and add a string parameter? Instant vulnerability.
You need to use prepared statements, rather than concatenating variables into your query. Simply escaping your variables is not enough. See How can I prevent SQL injection in PHP?.
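As a rough sketch of what that could look like for the class in the question, the helper below builds the WHERE clause from placeholders instead of values. The function name buildWhere() and the $allowedColumns whitelist are assumptions made for illustration, not part of the original class.

```php
<?php
/**
 * Sketch of a parameterized replacement for extractConditions().
 * $allowedColumns is a hypothetical whitelist supplied by the caller.
 * Returns [whereClause, types, values] for mysqli's prepare()/bind_param().
 */
function buildWhere(array $conditions, array $allowedColumns): array
{
    $parts = [];
    $values = [];
    foreach ($conditions as $column => $value) {
        // Column names cannot be bound, so they must come from a known list.
        if (!in_array($column, $allowedColumns, true)) {
            throw new InvalidArgumentException("Unknown column: $column");
        }
        $parts[] = "`$column` = ?";
        $values[] = $value;
    }
    if (!$parts) {
        return ['', '', []];
    }
    // Every value is bound as a string here to keep the sketch short.
    $types = str_repeat('s', count($values));
    return ['WHERE ' . implode(' AND ', $parts), $types, $values];
}

[$where, $types, $values] = buildWhere(
    ['status' => 'active', 'owner_id' => '1 OR 1=1'],
    ['status', 'owner_id']
);
// The hostile-looking value stays in $values and never enters the SQL text.
echo $where, "\n";
```

Inside select(), the result would be used roughly as $stmt = $this->link->prepare($sql); followed by $stmt->bind_param($types, ...$values); and $stmt->execute();.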
Some Notes on Application Design
Note that this is typically something you do before deploying queries to production, not on the fly. If you're building a tool that allows users to build their own queries, some on-the-fly evaluation of the queries may be unavoidable.
But if all you're dealing with is multiple conditions in the WHERE clause, then queries will be fast (and you won't need to use EXPLAIN) as long as two things are true:
you don't use subqueries, like ... WHERE id IN (SELECT id from OtherTable WHERE ...) ..., and
you have appropriate indexes. (Again, though, this is something you can anticipate at development time in >99% of cases.)
Relevant "War Story" to Hopefully Ease Some of Your Fears
I once wrote a tool that allowed all kinds of complex queries to be built and run against MySQL on a database with several million rows in each of the major tables. The queries were mostly straightforward WHERE conditions, like WHERE lastOrder > '2018-01-01', along with a few (mostly hard-coded) JOIN and subquery possibilities. I just indexed aggressively and never needed to EXPLAIN anything; it never really hit any bottlenecks of performance.
Allowing arbitrary input to become part of your SQL code is a fundamentally flawed design. There's no way to make that "sane."
Some technologies like Database Firewall attempt to do what you're asking, to detect when queries are compromised by an SQL injection attack. The trouble is, it's very difficult to distinguish between an SQL query that was compromised, versus one that's merely including legitimate dynamic content.
The result is that injection detection methods are not reliable. They fail to detect all cases of injection, and they also misidentify as injection cases that are legitimate.
Another approach is to use whitelisting of SQL queries. That is, enumerate all the legitimate SQL query forms used by a given application, and allow only those queries to run. This requires that you run the app in a kind of "teaching mode" before you deploy, to identify all the legitimate SQL queries. Then turn on the database firewall to block anything that wasn't a known SQL query at the time you did the test run.
This has disadvantages too. It doesn't account for SQL queries that need to be fully dynamic, like pivot table queries or constructive conditions (e.g. your query gains a variable number of terms in the WHERE clause based on app logic).
The best method of preventing SQL injection is still to use code review. Make sure any dynamic values are passed as query parameters using a prepared statement. You claim that this makes the code "very complex" but that's not true.
$sql = "SELECT ...";
$stmt = $pdo->prepare($sql);
$stmt->execute($paramValuesArray);
At the very least, the code you showed, which appends terms to an SQL string, is no less complex than this.

query multiple columns php/mysql

I'm new to PHP and am enrolled on a course, so I can ask my tutor tomorrow if this is more complicated than I think it might be!
I have an SQL query, and it works fine. But I am trying to add an 'and' in the select statement.
This is what I have at the minute
$query = "SELECT * from table1 where table1.age <= " . $_POST['min_age'] ;
I have a 'region' input on my linked html page and want results to be returned only if the min_age and region values match those inputted by the user.
I have tried adding an 'and where' but it doesn't work and I am not sure if it is because of the multiple "'s or if what I am trying to do needs a different method?
Thanks
If you need multiple conditions, just separate them with AND:
... WHERE table1.age <= ? AND table1.region = ?
No need to use WHERE again. Just like you wouldn't need to use if() more than once if you were writing a complex condition in PHP.
PS: This isn't directly related to your question, but you should get into the habit of not putting $_POST or $_GET variables directly into your SQL queries. It's a good way to get hacked! Ask your tutor about "SQL injection," or read my presentation SQL Injection Myths and Fallacies.
I know you're just starting out, but if you were training to be an electrician, you would place a high priority on learning how to avoid being electrocuted or how to avoid causing a fire.
Here's how I would write your query using mysqli. One advantage of using query parameters is you never need to worry about where you start and end your quotes.
$query = "SELECT * from table1 where table1.age <= ? AND table1.region = ?";
$stmt = $mysqli->prepare($query) or trigger_error($mysqli->error, E_USER_ERROR);
$stmt->bind_param("is", $_POST["min_age"], $_POST["region"]);
$stmt->execute() or trigger_error($stmt->error, E_USER_ERROR);
The other good habit I'm showing here is to always report if prepare() or execute() return an error.
If you must interpolate variables into your SQL, first make sure you protect the variables either by coercing the value to an integer, or else by using a proper escaping function like mysqli_real_escape_string(). Don't put $_POST variables directly into the string. Also you don't have to stop and restart the quotes if you use PHP's syntax for embedding variables directly in double-quoted strings:
$age = (int) $_POST["min_age"];
$region = $mysqli->real_escape_string($_POST["region"]);
$query = "SELECT * from table1 where table1.age <= {$age}
AND table1.region = '{$region}'";
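One thing to watch with the (int) cast is that it silently turns input like "abc" into 0. A small validation step in front of the query, whether you bind or interpolate, can reject bad input instead. The sketch below is an illustration only: the readSearchInput() name, the 0 to 150 age range, and the 50-character limit are arbitrary assumptions.

```php
<?php
/**
 * Validate the two form fields before they go anywhere near a query.
 * Returns [age, region] or throws. The range and length limits are
 * arbitrary values chosen for illustration.
 */
function readSearchInput(array $post): array
{
    // FILTER_VALIDATE_INT returns false for anything that is not a whole number.
    $age = filter_var(
        $post['min_age'] ?? null,
        FILTER_VALIDATE_INT,
        ['options' => ['min_range' => 0, 'max_range' => 150]]
    );
    if ($age === false) {
        throw new InvalidArgumentException('min_age must be a whole number');
    }
    $region = trim((string) ($post['region'] ?? ''));
    if ($region === '' || strlen($region) > 50) {
        throw new InvalidArgumentException('region is missing or too long');
    }
    return [$age, $region];
}

[$age, $region] = readSearchInput(['min_age' => '42', 'region' => ' North ']);
var_dump($age, $region);
```

In the real page you would pass $_POST to the function and then bind $age and $region as parameters.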

PDO quote method

Where and when do you use the quote method in PDO? I'm asking this in the light of the fact that in PDO, all quoting is done by the PDO object therefore no user input should be escaped/quoted etc. This makes one wonder why worry about a quote method if it's not gonna get used in a prepared statement anyway?
When using prepared statements with PDO::prepare() and PDOStatement::execute(), you don't have any quoting to do: this will be done automatically.
But sometimes you will not (or cannot) use prepared statements, and will have to write full SQL queries and execute them with PDO::exec(); in those cases, you will have to make sure strings are quoted properly. This is when the PDO::quote() method is useful.
While this may not be the only use case, it's the only one I've needed quote for. You can only pass values using PDOStatement::execute, so for example this query wouldn't work:
SELECT * FROM tbl WHERE :field = :value
quote comes in so that you can do this:
// Example: filter by a specific column
$columns = array("name", "location");
$column = isset($columns[$_GET["col"]]) ? $columns[$_GET["col"]] : $defaultCol;
$stmt = $pdo->prepare("SELECT * FROM tbl WHERE " . $pdo->quote($column) . " = :value");
$stmt->execute(array(":value" => $value));
$stmt = $pdo->prepare("SELECT * FROM tbl ORDER BY " . $pdo->quote($column) . " ASC");
and still expect $column to be filtered safely in the query.
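One caveat worth checking against your driver: PDO::quote() is documented as quoting string values, so on MySQL it returns the name wrapped in single quotes, which the server reads as a string literal rather than as a column. A minimal sketch that relies on the whitelist alone is shown below. The pickColumn() helper and the keyed $columns array are hypothetical, and backtick quoting is MySQL-specific.

```php
<?php
// Hypothetical whitelist of filterable columns, keyed by the value
// expected in the request (e.g. ?col=name).
$columns = ['name' => 'name', 'location' => 'location'];
$defaultCol = 'name';

/** Pick a column name from the whitelist, falling back to a default. */
function pickColumn(array $columns, $requested, string $default): string
{
    return (is_string($requested) && isset($columns[$requested]))
        ? $columns[$requested]
        : $default;
}

$column = pickColumn($columns, 'location', $defaultCol);
// The identifier comes only from our own array, so it can go into the SQL
// text directly; the value still goes through a placeholder.
$sql = "SELECT * FROM tbl WHERE `$column` = :value";
echo $sql, "\n";
```

Because the request value is only ever used as an array key, anything not in the whitelist falls back to the default column instead of reaching the query.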
The PDO system does not have (as far as I can find) any mechanism to bind an array variable in PHP into a set in SQL. That's a limitation of SQL prepared statements as well... thus you are left with the task of stitching together your own function for this purpose. For example, you have this:
$a = array(123, 'xyz', 789);
You want to end up with this:
$sql = "SELECT * FROM mytable WHERE item IN (123, 'xyz', 789)";
Using PDO::prepare() does not work because there's no method to bind the array variable $a into the set. You end up needing a loop where you individually quote each item in the array, then glue them together. In which case PDO::quote() is probably better than nothing, at least you get the character set details right.
Would be excellent if PDO supported a cleaner way to handle this. Don't forget, the empty set in SQL is a disgusting special case... which means any function you build for this purpose becomes more complex than you want it to be. Something like PDO::PARAM_SET as an option on the binding, with the individual driver deciding how to handle the empty set. Of course, that's no longer compatible with SQL prepared statements.
Happy if someone knows a way to avoid this difficulty.
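One common workaround is to generate one positional placeholder per array element and pass the array itself to execute(). The sketch below assumes a hypothetical inClause() helper and treats an empty array as "match nothing", which is one possible answer to the empty-set special case rather than the only one.

```php
<?php
/**
 * Build an IN (...) fragment with one positional placeholder per value.
 * An empty list yields a condition that is always false, since "IN ()"
 * is a syntax error in most databases.
 */
function inClause(string $column, array $values): array
{
    if (!$values) {
        return ['1 = 0', []];
    }
    $placeholders = implode(', ', array_fill(0, count($values), '?'));
    // array_values() re-indexes so the params line up with the placeholders.
    return ["$column IN ($placeholders)", array_values($values)];
}

$a = array(123, 'xyz', 789);
[$cond, $params] = inClause('item', $a);
$sql = "SELECT * FROM mytable WHERE $cond";
echo $sql, "\n";
// Then: $stmt = $pdo->prepare($sql); $stmt->execute($params);
```

The $column argument is placed in the SQL text as-is, so it should be a hard-coded name, never user input.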
A bit of a late answer, but one situation where it's useful is if you get a load of data out of your table which you're going to put back in later.
For example, I have a function which gets a load of text out of a table and writes it to a file. That text might later be inserted into another table. The quote() method makes all the quotes safe.
It's real easy:
$safeTextToFile = $DBH->quote($textFromDataBase);

parameterized query without preparing statement in PHP

Is there an API in MySQLi, PDO or elsewhere in PHP that uses a parameterized query without preparing it for reuse later? I found this in ADO.NET, when we don't call the .Prepare() method of SQLParameter, but I didn't find it in PHP.
Prepared statements are sent to the database server with the values separated from the query. If you wanted to get the user with ID 1337, using PDO you would do this;
$sql = 'SELECT u.id, u.username FROM users u WHERE u.id = :theUserIdToGet LIMIT 1';
$stmt = $db->prepare($sql);
$stmt->bindValue(':theUserIdToGet', 1337);
$stmt->execute(); // the statement must be executed before fetching
$result = $stmt->fetch();
In a traditional query this would just be;
$sql = 'SELECT u.id, u.username FROM users u WHERE u.id = ' . 1337 . ' LIMIT 1';
$stmt = $db->query($sql);
$result = $stmt->fetch();
The first example clearly requires more code, but it has the following benefits:
Security - The values (just one in this case: 1337) are sent to the database server separately from the SQL text, so the server treats them strictly as data and never parses them as SQL. Note that some query APIs emulate this on the client side, but it still offers a great deal of safety.
Readability - When adding more than a couple of parameters to a query, it gets very messy if concatenating strings ("WHERE id = " . $var1 . " AND " . $var2 . " = 1", etc).
Performance (occasionally) - It is true that prepared statements are much quicker when the same statement is executed many times, but in practice that situation is infrequent. The performance overhead of preparing a query compared with query() is negligible.
Prepared statements should always be used when inserting variables into a query because of their legibility and security.
PDO is generally considered to be very good; I personally use it all the time. Its learning curve is gentle and many tutorials are available on the internet. Its documentation is in the PHP manual.

PHP: Multiple SQL queries vs one monster

I have some really funky code. As you can see from the code below, I have a series of filters that I add to the query. Would it be easier to just have multiple queries, each with its own set of filters, and then store the results in an array, or to keep this mess?
Does anyone have a better solution? I need to be able to filter by keyword and item number, and it needs to be able to filter using multiple values without knowing which is which.
//Prepare filters and values
$values = array();
$filters = array();
foreach($item_list as $item){
$filters[] = "ItemNmbr = ?";
$filters[] = "ItemDesc LIKE ?";
$filters[] = "NoteText LIKE ?";
$values[] = $item;
$values[] = '%' . $item . '%';
$values[] = '%' . $item . '%';
}
//Prepare the query
$sql = sprintf(
"SELECT ItemNmbr, ItemDesc, NoteText, Iden, BaseUOM FROM ItemMaster WHERE %s LIMIT 21",
implode(" OR ", $filters)
);
//Set up the types
$types = str_repeat("s", count($filters));
array_unshift($values, $types);
//Execute it
$state = $mysqli->stmt_init();
$state->prepare($sql) or die ("Could not prepare statement:" . $mysqli->error);
call_user_func_array(array($state, "bind_param"), $values);
$state->bind_result($ItemNmbr, $ItemDesc, $NoteText, $Iden, $BaseUOM);
$state->execute() or die ("Could not execute statement");
$state->store_result();
I don't see anything particularly monstrous about your query.
The only thing I would do different is separate the search terms.
Ie
$item_list could be split in numeric items and text items.
then you could make the search something like:
...WHERE ItemNmbr IN ( number1, number2, number3) OR LIKE .... $text_items go here....
IN is a lot more efficient and if your $item_list doesn't contain any text part... then you are just searching a bunch of numbers which is really fast.
Next, if you are using a lot of LIKEs in your query, you should consider MySQL full-text searching.
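The split into numeric and text terms could be sketched as follows. The buildItemSearch() helper is hypothetical, and treating "numeric" as "digits only" is an assumption about what item numbers look like. Note one behavioural difference from the original code: numeric terms are no longer matched against ItemDesc or NoteText.

```php
<?php
/**
 * Split search terms into item numbers and free-text keywords, then build
 * one WHERE fragment: IN (...) for the numbers, LIKE for the text.
 * Column names are taken from the question.
 */
function buildItemSearch(array $items): array
{
    $numbers = [];
    $texts = [];
    foreach ($items as $item) {
        $item = trim((string) $item);
        if ($item === '') {
            continue;
        }
        // Assumption: an item number consists of digits only.
        if (preg_match('/^\d+$/', $item) === 1) {
            $numbers[] = $item;
        } else {
            $texts[] = $item;
        }
    }
    $filters = [];
    $values = [];
    if ($numbers) {
        $marks = implode(', ', array_fill(0, count($numbers), '?'));
        $filters[] = "ItemNmbr IN ($marks)";
        $values = $numbers;
    }
    foreach ($texts as $text) {
        $filters[] = '(ItemDesc LIKE ? OR NoteText LIKE ?)';
        $values[] = "%$text%";
        $values[] = "%$text%";
    }
    return [implode(' OR ', $filters), $values];
}

[$where, $values] = buildItemSearch(['1001', 'gasket', '1002']);
echo $where, "\n";
```

An empty item list produces an empty fragment, so the caller needs to decide whether that means "no results" or "no filter" before building the final query.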
Your answer depends on what exactly you need. The advantage of using only one query is resource use: one query takes only one connection and one round trip to the SQL server. And depending on what exactly you are attempting to do, it might take less SQL power to do it in one statement than in several.
However, it might be more practical from a programmer's point of view to use a few less complex SQL statements that take less effort to write than one large one. Remember, ultimately you are programming this and you need to make it work. It might not really make a difference whether the work is done in the script or in SQL. Only you can make the ultimate call on which is more important. I would generally recommend SQL processing over script processing when dealing with large databases.
A single table query can be cached by the SQL engine. MySQL and its ilk do not cache joined tables. The general rule for performance is to use joins only when necessary. This encourages the DB engine to cache table indexes aggressively and also makes your code easier to adapt to (faster) object databases--like Amazon/Google cloud services force you to.
