Can anybody explain to me in plain English what parametrized queries are and how to implement them in PHP for a MySQL database to avoid SQL injection?
The prepared statements and stored procedures section of the PHP manual, whilst it relates specifically to PDO, covers this well when it says:
They can be thought of as a kind of
compiled template for the SQL that an
application wants to run, that can be
customized using variable parameters.
Prepared statements offer two major
benefits:
The query only needs to be parsed (or
prepared) once, but can be executed
multiple times with the same or
different parameters. When the query
is prepared, the database will
analyze, compile and optimize its
plan for executing the query. For
complex queries this process can take
up enough time that it will noticeably
slow down an application if there is a
need to repeat the same query many
times with different parameters. By
using a prepared statement the
application avoids repeating the
analyze/compile/optimize cycle. This
means that prepared statements use
fewer resources and thus run faster.
The parameters to prepared statements
don't need to be quoted; the driver
automatically handles this. If an
application exclusively uses prepared
statements, the developer can be sure
that no SQL injection will occur
(however, if other portions of the
query are being built up with
unescaped input, SQL injection is
still possible).
If you're after a specific example of how to use them, the page linked above also includes code samples.
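To make the "compiled template" idea concrete, here is a minimal sketch of a parametrized query with PDO. It is not taken from the manual: the users table and its rows are made up for illustration, and it uses an in-memory SQLite database only so that it runs without a server. For MySQL you would swap in a DSN along the lines of the one in the comment (hypothetical host and database name).

```php
<?php
// ASSUMPTION: SQLite in memory stands in for MySQL so the sketch is runnable.
// A MySQL DSN would look like 'mysql:host=localhost;dbname=example;charset=utf8mb4'.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table and rows, for illustration only.
$db->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$db->exec("INSERT INTO users (name) VALUES ('alice'), ('bob')");

// The SQL text is a fixed template; :name marks where a value will go.
$stmt = $db->prepare('SELECT id, name FROM users WHERE name = :name');

// The value is supplied separately and is never spliced into the SQL string.
$stmt->execute([':name' => 'alice']);
$alice = $stmt->fetch(PDO::FETCH_ASSOC);
$stmt->closeCursor();

// The same statement can be executed again with a different value.
// A classic injection payload is treated as a plain string and matches no row.
$stmt->execute([':name' => "' OR '1'='1"]);
$attack = $stmt->fetchAll(PDO::FETCH_ASSOC);
```

The point to notice is that the SQL string never changes between the two execute() calls; only the bound value does.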
I've used the following sort of code a few times in my current project to clear out some tables. In case it's not obvious, I'm using PDO.
$clearResult = $db->query('TRUNCATE TABLE table_name');
I'm currently going through and updating a few of my earlier scripts to make sure they all make use of prepared statements and are written in a way that reduces (hopefully stops) SQL injection.
No, there's no user input in the actual query so there's no risk of injection.
You do have to make sure that a user isn't able to trigger the truncate though, unless they're authorized.
It's not the SQL operation that determines whether or not a prepared statement should be used. To prevent SQL injection, a prepared statement should be used whenever a variable is involved in the query at a position where bound parameters are permitted. That is not limited to user input either: any variable at all should be a bound parameter, regardless of where it came from.
In your example there are no variables required for the query, and so there is no security benefit of using a prepared statement.
Even if your table_name was coming from user input or a variable, a prepared statement would not be a solution because it is not possible to bind the table name.
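Since a table name cannot be bound, one common way to handle a table name that does come from a variable is to check it against a fixed allow-list before interpolating it. The sketch below is only an illustration of that idea: the table names and the buildTruncateSql() helper are hypothetical, not part of PDO.

```php
<?php
// Hypothetical allow-list: the only tables this script may ever truncate.
const TRUNCATABLE_TABLES = ['session_cache', 'import_staging'];

/**
 * Build a TRUNCATE statement for a table name held in a variable.
 * Identifiers cannot be bound as parameters, so the name is compared
 * against the allow-list instead of being escaped.
 */
function buildTruncateSql(string $table): string
{
    if (!in_array($table, TRUNCATABLE_TABLES, true)) {
        throw new InvalidArgumentException("Table not allowed: {$table}");
    }
    // Safe to interpolate: $table is now known to be one of our own constants.
    return "TRUNCATE TABLE `{$table}`";
}

// With a PDO connection $db (not created here):
// $db->exec(buildTruncateSql('session_cache'));
```

Anything not on the list is rejected outright, so there is nothing left to escape.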
Prepared statements would have no effect on your truncate query.
PDO prepared statements are useful when running queries with user input, as they let you bind parameters so that the input is kept separate from the SQL text.
They are also useful for optimising queries that will run multiple times.
You might want to read up a little on prepared statements in the PHP documentation:
Many of the more mature databases support the concept of prepared
statements. What are they? They can be thought of as a kind of
compiled template for the SQL that an application wants to run, that
can be customized using variable parameters. Prepared statements offer
two major benefits:
The query only needs to be parsed (or prepared) once, but can be
executed multiple times with the same or different parameters. When
the query is prepared, the database will analyze, compile and optimize
its plan for executing the query. For complex queries this process can
take up enough time that it will noticeably slow down an application
if there is a need to repeat the same query many times with different
parameters. By using a prepared statement the application avoids
repeating the analyze/compile/optimize cycle. This means that prepared
statements use fewer resources and thus run faster.
The parameters to prepared statements don't need to be quoted; the driver automatically
handles this. If an application exclusively uses prepared statements,
the developer can be sure that no SQL injection will occur (however,
if other portions of the query are being built up with unescaped
input, SQL injection is still possible). Prepared statements are so
useful that they are the only feature that PDO will emulate for
drivers that don't support them. This ensures that an application will
be able to use the same data access paradigm regardless of the
capabilities of the database.
I'm re-engineering a PHP-driven web site which uses a minimal database. The original version used "pseudo-prepared-statements" (PHP functions which did quoting and parameter replacement) to prevent injection attacks and to separate database logic from page logic.
It seemed natural to replace these ad-hoc functions with an object which uses PDO and real prepared statements, but after doing my reading on them, I'm not so sure. PDO still seems like a great idea, but one of the primary selling points of prepared statements is being able to reuse them… which I never will. Here's my setup:
The statements are all trivially simple. Most are in the form SELECT foo,bar FROM baz WHERE quux = ? ORDER BY bar LIMIT 1. The most complex statement in the lot is simply three such selects joined together with UNION ALLs.
Each page hit executes at most one statement and executes it only once.
I'm in a hosted environment and therefore leery of slamming their servers by doing any "stress tests" personally.
Given that using prepared statements will, at minimum, double the number of database round-trips I'm making, am I better off avoiding them? Can I use PDO::MYSQL_ATTR_DIRECT_QUERY to avoid the overhead of multiple database trips while retaining the benefit of parametrization and injection defense? Or do the binary calls used by the prepared statement API perform well enough compared to executing non-prepared queries that I shouldn't worry about it?
EDIT:
Thanks for all the good advice, folks. This is one where I wish I could mark more than one answer as "accepted" — lots of different perspectives. Ultimately, though, I have to give rick his due… without his answer I would have blissfully gone off and done the completely Wrong Thing even after following everyone's advice. :-)
Emulated prepared statements it is!
Today's rule of software engineering: if it isn't going to do anything for you, don't use it.
I think you want PDO::ATTR_EMULATE_PREPARES. Enabling it turns off native database prepared statements, but still allows query bindings to prevent SQL injection and keep your SQL tidy. From what I understand, PDO::MYSQL_ATTR_DIRECT_QUERY turns off query bindings completely.
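For what it's worth, a rough sketch of how that attribute could be set when connecting. The connectEmulated() helper, the DSN and the credentials are hypothetical; PDO::ATTR_EMULATE_PREPARES and PDO::ATTR_ERRMODE are real PDO constants. As far as I know, pdo_mysql already emulates prepares by default, so setting it explicitly mostly documents the intent.

```php
<?php
// With emulation on, PDO substitutes the bound values itself (quoting them)
// and sends one ordinary query, so there is no separate prepare round-trip.
$options = [
    PDO::ATTR_EMULATE_PREPARES => true,
    PDO::ATTR_ERRMODE          => PDO::ERRMODE_EXCEPTION,
];

/**
 * Hypothetical helper: open a connection using the options above.
 * The DSN and credentials are supplied by the caller.
 */
function connectEmulated(string $dsn, string $user, string $password, array $options): PDO
{
    return new PDO($dsn, $user, $password, $options);
}

// Usage (not run here, it needs a MySQL server; all values are placeholders):
// $db = connectEmulated('mysql:host=localhost;dbname=example;charset=utf8mb4', 'user', 'secret', $options);
// $stmt = $db->prepare('SELECT foo, bar FROM baz WHERE quux = ? ORDER BY bar LIMIT 1');
// $stmt->execute([$quux]);
```

Because the quoting then happens on the client, it is probably worth stating the connection charset in the DSN so PDO and the server agree on the encoding.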
When not to use prepared statements? When you're only going to be running the statement once before the db connection goes away.
When not to use bound query parameters (which is really what most people use prepared statements to get)? I'm inclined to say "never" and I'd really like to say "never", but the reality is that most databases and some db abstraction layers have certain circumstances under which they won't allow you to bind parameters, so you're forced to not use them in those cases. Any other time, though, it will make your life simpler and your code more secure to use them.
I'm not familiar with PDO, but I'd bet it provides a mechanism for running parametrized queries with the values given in the same function call, if you don't want to prepare and then run as separate steps (e.g., something like run_query("SELECT * FROM users WHERE id = ?", 1) or similar).
Also, if you look under the hood, most db abstraction layers will prepare the query, then run it, even if you just tell it to execute a static SQL statement. So you're probably not saving a trip to the db by avoiding explicit prepares anyhow.
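As far as I can tell PDO has no single built-in call of that shape, but a thin wrapper gets you there. The runQuery() function below is a hypothetical helper, not a PDO method, and the demo again uses an in-memory SQLite database with a made-up users table purely so the sketch can run on its own.

```php
<?php
/**
 * Hypothetical helper: prepare and execute in one call, return all rows.
 * $params is a list for ? placeholders or a map for :named placeholders.
 */
function runQuery(PDO $db, string $sql, array $params = []): array
{
    $stmt = $db->prepare($sql);
    $stmt->execute($params);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// ASSUMPTION: SQLite in memory stands in for the real MySQL connection.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$db->exec("INSERT INTO users (id, name) VALUES (1, 'alice')");

$rows = runQuery($db, 'SELECT * FROM users WHERE id = ?', [1]);
```

The caller still gets bound parameters, it just never sees the statement object.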
Prepared statements are being used by thousands of people and are therefore well-tested (and thus one can infer they are reasonably secure). Your custom solution is only used by you.
The chance that your custom solution is insecure is pretty high. Use prepared statements. You have to maintain less code that way.
The benefits of prepared statements are as follows:
each query is only compiled once
mysql will use a more efficient transport format to send data to the server
However, prepared statements only persist per connection. Unless you're using connection pooling, there would be no benefit if you're only doing one statement per page. Trivially simple queries would not benefit from the more efficient transport format, either.
Personally I wouldn't bother. The pseudo-prepared statements are likely to be useful for the safe variable quoting they presumably provide.
Honestly, I don't think you should worry about it. However, I remember that a number of PHP data access frameworks supported both prepared-statement and non-prepared-statement modes. If I remember correctly, PEAR::DB did back in the day.
I have run into the same issue as you and I had my own reservations, so instead of using PDO I ended up writing my own lightweight database layer that supported prepares and standard statements and performed correct escaping (SQL injection prevention) in both cases. One of my other gripes with prepares is that sometimes it is more efficient to append some non-escapable input to a statement like ... WHERE id IN (1, 2, 3...).
I don't know enough about PDO to tell you what other options you have using it. However, I do know that PHP has escaping functions available for all database vendors it supports and you could roll your own little layer on top of any data access layer you are stuck with.
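For the IN (...) case mentioned above, one workaround that keeps bound parameters is to generate one placeholder per value. This is only a sketch: inPlaceholders() is a hypothetical helper, and the baz table and its id column are borrowed from the examples in this thread.

```php
<?php
/**
 * Build "?, ?, ?" for a non-empty list of values.
 */
function inPlaceholders(array $values): string
{
    if ($values === []) {
        // "IN ()" is not valid SQL, so refuse an empty list up front.
        throw new InvalidArgumentException('IN () needs at least one value');
    }
    return implode(', ', array_fill(0, count($values), '?'));
}

$ids = [1, 2, 3];
$sql = 'SELECT foo, bar FROM baz WHERE id IN (' . inPlaceholders($ids) . ')';
// $sql is now: SELECT foo, bar FROM baz WHERE id IN (?, ?, ?)

// With a PDO connection $db (not created here):
// $stmt = $db->prepare($sql);
// $stmt->execute($ids);
```

Only question marks are ever concatenated into the SQL; the values themselves still travel as parameters.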
I have:
INSERT INTO post(title,message) VALUES (:title,:message)
where :message value has some random text with symbols (including comma).
As the comma symbol is the separator between components of the statement:
How can I escape the :message value so that its commas stay part of the string and are not interpreted as separators?
You don't need to. The whole point of using prepared statements, as you are with PDO, is that the structure of the query is sent separately from the data. This means that characters in the data are never confused for parts of the structure of the query.
You've done all you need to do to make the query safe in this regard.
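If you want to convince yourself, a quick round-trip like the one below shows the message coming back unchanged. It is a sketch under assumptions: the post table definition is guessed from the INSERT in the question, and an in-memory SQLite database stands in for MySQL so it can run without a server.

```php
<?php
// ASSUMPTION: SQLite in memory stands in for MySQL; the column types are guessed.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE post (title TEXT, message TEXT)');

// Commas, quotes and even a closing parenthesis are just data here.
$message = "Commas, quotes ' and \" , even ); are just data here";

// The statement from the question, unchanged.
$stmt = $db->prepare('INSERT INTO post(title,message) VALUES (:title,:message)');
$stmt->execute([':title' => 'Hello', ':message' => $message]);

// Read the value back to compare it with what was sent.
$stored = $db->query('SELECT message FROM post')->fetchColumn();
```

$stored should be identical to $message, commas included.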
Prepared statements, my friend.
This is what php.net has to say:
The query only needs to be parsed (or prepared) once, but can be executed multiple times with the same or different parameters. When
the query is prepared, the database will analyze, compile and optimize
its plan for executing the query. For complex queries this process can
take up enough time that it will noticeably slow down an application
if there is a need to repeat the same query many times with different
parameters. By using a prepared statement the application avoids
repeating the analyze/compile/optimize cycle. This means that prepared
statements use fewer resources and thus run faster.
The parameters to prepared statements don't need to be quoted; the driver automatically handles this. If an application exclusively uses
prepared statements, the developer can be sure that no SQL injection
will occur (however, if other portions of the query are being built up
with unescaped input, SQL injection is still possible).
I am currently writing a CRUD class in PHP using PDO.
I like the security that prepared statements provide, but I have heard that they also prevent databases like MySQL from using the query cache.
Is it better to use a prepared SELECT statement when you are only doing one select at a time? Or would $pdo->quote() alone suffice from a security standpoint (or have any other advantages, like caching)?
All my update, delete and inserts are done using prepared statements already. I am just curious about the selects.
MySQLPerformanceBlog.com did some benchmarks in an article about "Prepared Statements." Peter Zaitsev wrote:
I’ve done a simple benchmark (using
SysBench) to see performance of simple
query (single row point select) using
standard statement, prepared statement
and have it served from query cache.
Prepared statements give 2290
queries/sec which is significantly
better than 2000 with standard
statements but it is still well below
4470 queries/sec when results are
served from query cache.
This seems to say that the "overhead" of using prepared statements is that they are 14.5% faster than using a straight query execution, at least in this simple test. The relative difference probably diminishes with a more complex query or a larger result set.
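The percentage is just arithmetic on the three figures quoted above; a throwaway check, using only those numbers:

```php
<?php
// Queries per second, as quoted from the benchmark above.
$standard = 2000;
$prepared = 2290;
$cached   = 4470;

// Relative gain of prepared over standard statements, in percent.
$preparedGain = round(($prepared - $standard) / $standard * 100, 1);

// How many times faster the query cache was than prepared statements.
$cacheVsPrepared = round($cached / $prepared, 2);

printf("prepared vs standard: +%.1f%%\n", $preparedGain);
printf("query cache vs prepared: %.2fx\n", $cacheVsPrepared);
```

So the cached case is a little under twice as fast as the prepared one in this particular test.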
It seems counter-intuitive that prepared queries would be faster, given the double round-trip to the server and other factors. Peter's benchmark lacks details. Anyway, you should run your own tests, because the type of query you run, and your environment and hardware, are definitely important factors.
As for Query Cache, it was true in the past that prepared statements were incompatible with caching query results, but this was changed. See "How the Query Cache Operates" in the MySQL documentation:
Before MySQL 5.1.17, prepared
statements do not use the query cache.
Beginning with 5.1.17, prepared
statements use the query cache under
certain conditions, which differ
depending on the preparation method: ...
The documentation goes on to describe these conditions. Go read it.
I do recommend using prepared statements for SELECT queries. Quoting variables as you interpolate them into SQL statements can be effective if you do it consistently. But even quoting may have some subtle security vulnerabilities, e.g. with multi-byte character sets (see MySQL bug #8378). It's easier to use prepared queries in a secure way in these cases.
Yes, use prepared statements. I seriously doubt you will run into performance problems with prepared statements running much slower than a regular literal query. However, on MySQL, you appear to be correct about the query cache. I would opt for prepared statements nevertheless.
Here is one reference:
http://www.mysqlperformanceblog.com/2006/08/02/mysql-prepared-statements/
Although, if you are worried about caching, you might want to look at things like memcached.
This is my understanding, as confirmed by a discussion elsewhere:
A normal query is taken as a single
string, parsed, executed, and
returned. End of story. A prepared
statement is taken as a template
string, parsed, and cached. It then
has variables passed into it, almost
like a function call.
Caching the query once tends to cost a
little bit more than just executing it
straight. The savings come in later
calls, when you skip the compilation
step. You save per repeated query the
amount of the compilation.
So, in short, on MySQL, if you're executing a query once, preparing it just adds a needless extra amount of processing.
Prepared statements are generally considered to be better practice.
I would suggest reading the MySQL article on prepared statements and their practicalities and advantages over conventional plain-vanilla interpolated string queries.
Are you only doing a select "once" in the application lifetime, or "once" per call to the function?
Because if the latter, you should still benefit from the caching in the prepared statement anyway.
Just a reminder that MySQL 5.1.17 and later does use the query cache for prepared statements, under certain conditions.
From the code POV, I believe prepared statements are, for the most part, the way to go in terms of readability, maintainability, etc.
The one reason not to use them would be expensive queries that get called with some frequency (queries that take a lot of time to run and really benefit from being in the query cache).