I was wondering whether there is any difference between the MySQL PREPARE/EXECUTE statements and the PHP mysqli prepare/execute methods. Is either better or worse at preventing injection?
I am curious because I am writing a few database stored procedures and in one, the table and attributes are not known on compilation. I could write the data in as static, but the query is a bit complex and it would just bloat the procedure with a lot of control logic. It got me thinking about this though and I just wondered whether it is better if, when I just needed a simple statement, I write a short procedure with dynamic SQL or just prepare and bind with PHP.
Also I apologize if this is a repeat and would of course welcome a link to an already answered SO question. However, I looked generally on google and could not really find much in the way of a specific answer to this.
PHP mysqli is a layer, in PHP, around MySQL's prepare / execute functionality.
Both will keep you safe from SQL injection as long as everything that comes in from users is stuffed into a bound variable.
Prepared statements, handled either way, make for more efficient high-volume operation too. In MySQL the efficiency gain is modest compared to the high-priced DBMSs like Oracle, but it is still worth every bit of the trouble.
If you need to have table names as "variables" in your app, that's OK. You can't treat table names as bound variables, though, so you need to be totally paranoid about any user input that goes into constructing these table names.
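One way to be "totally paranoid" about a table name is to accept it only if it appears on a fixed list. The following is a minimal sketch, not a definitive implementation; the helper name allowedTable() and the table names are made up for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Return a backtick-quoted table name, but only if it is on the allow-list.
 * Hypothetical helper; the table names used below are illustrative.
 */
function allowedTable(string $requested, array $allowed): string
{
    if (!in_array($requested, $allowed, true)) {
        throw new InvalidArgumentException('Unknown table name');
    }
    // The name matches an entry in our own list, so it is safe in SQL text.
    return '`' . $requested . '`';
}

$tables = ['orders', 'customers'];

// The identifier is interpolated; the value still goes through a placeholder.
$sql = 'SELECT * FROM ' . allowedTable('orders', $tables) . ' WHERE id = ?';
echo $sql, PHP_EOL;
```

The resulting string can then be passed to prepare() as usual, with the id supplied as a bound parameter.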
Answering this in 2020: from what I have learned, you should prepare and bind (or pass the values directly to execute) with PDO rather than with mysqli. That protects you from injection and is the safer option.
Okay, I still don't really get it. I keep reading that in order to properly escape your MySQL queries, you need to use mysqli_prepare() and mysqli_bind_param().
I tried using this setup and, quite frankly, it's a little clunkier. I'm stuck passing variables by reference when I don't need to ever reference them again, and it's just more lines of code to accomplish the same task.
I guess I just don't get what the difference is between:
<?php
$sql = new \MySQLi(...);
$result = $sql->query('
UPDATE `table`
SET
`field` = "'.$sql->real_escape_string($_REQUEST[$field]).'";
');
?>
and
<?php
$sql = new \MySQLi(...);
$stmt = $sql->prepare('
UPDATE `table`
SET
`field` = ?;
');
$value = $_REQUEST[$field];
$stmt->bind_param('s', $value);
$stmt->execute();
$result = $stmt->get_result();
unset($value);
?>
other than more code.
I mean, did they implement this so that people wouldn't forget to escape values before sending them in a query? Or is it somehow faster?
Or should I use this method when I intend to use the same query repeatedly (since a mysqli_stmt can be reused) and use the traditional method in other cases?
What you are reading, that you need to use the mysqli_prepare() and mysqli_bind_param() functions to "properly escape your MySQL queries", is wrong.
It is true that if you use mysqli_prepare() and mysqli_bind_param(), you needn't (and shouldn't) "escape" the values supplied as bind parameters. So, in that sense, there's some truth in what you are reading.
It's only when unsafe variables are included in the SQL text (the actual text of the query) that you need to "properly escape" the variables, usually by wrapping the variables in mysqli_real_escape_string() function calls.
(We note that it's possible to make use of prepared statements and still include unescaped variables in the SQL text, rather than passing the variable values as bind parameters. That sort of defeats the purpose of using prepared statements, but the point is that, either way, you can write code that is vulnerable.)
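To make that last point concrete, here is a small sketch contrasting the two. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database in place of MySQL so that it runs without a server; the table and its data are made up.

```php
<?php
declare(strict_types=1);

// Assumption: pdo_sqlite is available; in-memory SQLite stands in for MySQL.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (name TEXT, is_admin INTEGER)');
$pdo->exec("INSERT INTO users VALUES ('alice', 0), ('root', 1)");

$input = "x' OR '1'='1"; // hostile input

// Still vulnerable: prepare() is called, but the value is part of the SQL text.
$unsafe = $pdo->prepare("SELECT name FROM users WHERE name = '$input'");
$unsafe->execute();
$unsafeRows = $unsafe->fetchAll(PDO::FETCH_COLUMN);

// Safe: static SQL text, value supplied as a bound parameter.
$safe = $pdo->prepare('SELECT name FROM users WHERE name = ?');
$safe->execute([$input]);
$safeRows = $safe->fetchAll(PDO::FETCH_COLUMN);

echo count($unsafeRows), ' vs ', count($safeRows), PHP_EOL;
```

The first query matches every row because the input rewrites the WHERE clause; the second matches none because the input is only ever compared as a string value.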
MySQL now supports "server side" prepared statements (if the option is enabled in the connection), and that's a performance optimization (in some cases) of repeated executions of identical SQL text. (This has been long supported in other databases, such as Oracle, where making use of prepared statements has been a familiar pattern for, like, since forever.)
Q: Did they implement [prepared statements] so that people wouldn't forget to escape values before sending them in a query?
A: Based on the number of examples of code vulnerable to SQL Injection when not using prepared statements, despite the documentation regarding mysql_real_escape_string() function, you'd think that certainly would be sufficient reason.
I think one big benefit is that when we're reading code, we can see a SQL statement as a single string literal, rather than a concatenation of a bunch of variables, with quotes and dots and calls to mysql_real_escape_string, which isn't too bad with a simple query, but with a more complex query, it is just overly cumbersome. The use of the ? placeholder makes for a more understandable SQL statement,... true, I need to look at other lines of code to figure out what value is getting stuffed there. (I think the Oracle style named parameters :fee, :fi, :fo, :fum is preferable to the positional ?, ?, ?, ? notation.) But having STATIC SQL text is what is really the benefit.
Q: Or is it somehow faster?
As I mentioned before, the use of server side prepared statements can be an advantage in terms of performance. It's not always the case that it's faster, but for repeated execution of the same statement, where the only difference is literal values (as in repeated inserts), it can provide a performance boost.
Q: Or should I use this method when I intend to use the same query repeatedly (since a mysqli_stmt can be reused) and use the traditional method in other cases?
That's up to you. My preference is for using STATIC SQL text. But this really comes from a long history of using Oracle, and using the same pattern with MySQL fits naturally (albeit from Perl using the DBI interface, and from Java using JDBC and MyBatis, or other ORMs such as Hibernate, Glassfish JPA, et al.).
Following the same pattern just feels natural in PHP; the introduction of mysqli_ and PDO are a welcome relief from the arcane (and abused) mysql_ interface.
Good code can be written following either pattern. But I challenge you to think ahead, about more complex SQL statements, and whether the choice to use mysqli_real_escape_string() and concatenating together a dynamic string to be executed, rather than using static SQL text and bind parameters, might make reading, and deciphering, the actual SQL being executed more complicated for the soul that finds themselves maintaining code they didn't write.
I think studies have shown that code is read ten times more than it is written, which is why we strive to produce readable, understandable code, even if that means more lines of code. (When each statement is doing a single identifiable thing, that's usually easier for me to understand than reading a jumble of concatenated function calls in one complicated statement.)
I think it's less a question of the latter method being more secure per se than encouraging separation of logic. With prepared statements the SQL query is independent of the values we use. This means, for example, when we go back and change our query we don't have to concatenate a bunch of different values to a string, and maybe risk forgetting to escape our input. Makes for more maintainable code!
There are a couple main benefits I found that were well written:
The overhead of compiling and optimizing the statement is incurred
only once, although the statement is executed multiple times. Not
all optimization can be performed at the time the prepared statement
is compiled, for two reasons: the best plan may depend on the
specific values of the parameters, and the best plan may change as
tables and indexes change over time.
Prepared statements are resilient against SQL injection, because
parameter values, which are transmitted later using a different
protocol, need not be correctly escaped. If the original statement
template is not derived from external input, SQL injection cannot
occur.
On the other hand, if a query is executed only once, server-side prepared statements can be slower because of the additional round-trip to the server. Implementation limitations may also lead to performance penalties: some versions of MySQL did not cache results of prepared queries, and some DBMSs such as PostgreSQL do not perform additional query optimization during execution.
Source
I would like to add that mysqli_bind_param() has been removed as of PHP 5.4.0. You should use mysqli_stmt_bind_param() instead.
I have this very question to clear things up. I read some documentation and comments around, but some things are still just not clear enough.
I understand PDO offers more drivers, which would certainly be a plus if you ever change your database type.
As said in another post, PDO doesn't offer true prepared statements but mysqli does, so it would be safer to use mysqli.
Benchmarks look similar (I did not test it myself, but checked around on the web for a few benchmarks).
Being object oriented is not an issue for me since mysqli is catching up. But it would be nice to benchmark procedural mysqli vs PDO, since procedural is supposed to be slightly faster.
But here is my question: with prepared statements, do we have to use parameter binding with the data we use in our statement? Is it good practice or a must? I understand prepared statements are good performance-wise if you run the same query multiple times, but is that enough to secure the query itself, or is binding parameters a must? What exactly does parameter binding do, and how does it work to protect the data from SQL injection? It would also be appreciated if you pointed out any misunderstanding in the statements I made above.
In short,
Binding is a must, being a cornerstone of protection, no matter if it is supported by a native driver or not. It's the idea of substitution that matters.
The difference is negligible in both safety and performance.
Performance is the last thing to consider. There is NO API that is considerably slower than another. It is not a class or a function that causes a performance problem, but data manipulation or a bad algorithm. Optimize your queries, not the mere functions that call them.
If you are going to use a raw, bare API, then PDO is the only choice. When wrapped in a higher-level class, mysqli seems preferable for MySQL.
Both mysqli and PDO lack bindings for the identifiers and keywords. In this case a whitelist-based protection must be implemented. Here is my article with the ready made example, Adding a field name to the SQL query dynamically
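As a rough illustration of whitelist-based protection for both an identifier and a keyword, the sketch below builds an ORDER BY clause. The helper name orderByClause() and the column names are hypothetical, and falling back to a default instead of raising an error is just one possible design choice.

```php
<?php
declare(strict_types=1);

/**
 * Build an ORDER BY clause from user input using allow-lists.
 * Hypothetical helper; column names are made up for illustration.
 */
function orderByClause(string $column, string $direction, array $allowedColumns): string
{
    // Fall back to the first allowed column when the requested one is unknown.
    $key = array_search($column, $allowedColumns, true);
    $column = $allowedColumns[$key === false ? 0 : $key];

    // Keywords get the same treatment: only two values are acceptable.
    $direction = strtoupper($direction) === 'DESC' ? 'DESC' : 'ASC';

    return "ORDER BY `$column` $direction";
}

echo orderByClause('price', 'desc', ['name', 'price']), PHP_EOL;
```

Whatever the user sends, only strings that already exist in the code can end up in the SQL text.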
I'm re-engineering a PHP-driven web site which uses a minimal database. The original version used "pseudo-prepared-statements" (PHP functions which did quoting and parameter replacement) to prevent injection attacks and to separate database logic from page logic.
It seemed natural to replace these ad-hoc functions with an object which uses PDO and real prepared statements, but after doing my reading on them, I'm not so sure. PDO still seems like a great idea, but one of the primary selling points of prepared statements is being able to reuse them… which I never will. Here's my setup:
The statements are all trivially simple. Most are in the form SELECT foo,bar FROM baz WHERE quux = ? ORDER BY bar LIMIT 1. The most complex statement in the lot is simply three such selects joined together with UNION ALLs.
Each page hit executes at most one statement and executes it only once.
I'm in a hosted environment and therefore leery of slamming their servers by doing any "stress tests" personally.
Given that using prepared statements will, at minimum, double the number of database round-trips I'm making, am I better off avoiding them? Can I use PDO::MYSQL_ATTR_DIRECT_QUERY to avoid the overhead of multiple database trips while retaining the benefit of parametrization and injection defense? Or do the binary calls used by the prepared statement API perform well enough compared to executing non-prepared queries that I shouldn't worry about it?
EDIT:
Thanks for all the good advice, folks. This is one where I wish I could mark more than one answer as "accepted" — lots of different perspectives. Ultimately, though, I have to give rick his due… without his answer I would have blissfully gone off and done the completely Wrong Thing even after following everyone's advice. :-)
Emulated prepared statements it is!
Today's rule of software engineering: if it isn't going to do anything for you, don't use it.
I think you want PDO::ATTR_EMULATE_PREPARES. That turns off native database prepared statements, but still allows query bindings to prevent sql injection and keep your sql tidy. From what I understand, PDO::MYSQL_ATTR_DIRECT_QUERY turns off query bindings completely.
When not to use prepared statements? When you're only going to be running the statement once before the db connection goes away.
When not to use bound query parameters (which is really what most people use prepared statements to get)? I'm inclined to say "never" and I'd really like to say "never", but the reality is that most databases and some db abstraction layers have certain circumstances under which they won't allow you to bind parameters, so you're forced to not use them in those cases. Any other time, though, it will make your life simpler and your code more secure to use them.
I'm not familiar with PDO, but I'd bet it provides a mechanism for running parametrized queries with the values given in the same function call if you don't want to prepare, then run as a separate step. (e.g., Something like run_query("SELECT * FROM users WHERE id = ?", 1) or similar.)
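With PDO, the closest built-in equivalent is calling prepare() and then passing the values as an array to execute(). A one-call wrapper in the spirit of the run_query() idea might look like the sketch below. It is a hypothetical helper, not part of PDO, and it assumes pdo_sqlite is available so that an in-memory SQLite database can stand in for MySQL.

```php
<?php
declare(strict_types=1);

/**
 * Prepare and execute in one call (hypothetical helper, not a PDO method).
 */
function run_query(PDO $pdo, string $sql, array $params = []): PDOStatement
{
    $stmt = $pdo->prepare($sql);
    $stmt->execute($params);
    return $stmt;
}

// Assumption: pdo_sqlite is available; in-memory SQLite stands in for MySQL.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
run_query($pdo, 'INSERT INTO users (id, name) VALUES (?, ?)', [1, 'alice']);

$row = run_query($pdo, 'SELECT * FROM users WHERE id = ?', [1])->fetch(PDO::FETCH_ASSOC);
echo $row['name'], PHP_EOL;
```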
Also, if you look under the hood, most db abstraction layers will prepare the query, then run it, even if you just tell it to execute a static SQL statement. So you're probably not saving a trip to the db by avoiding explicit prepares anyhow.
Prepared statements are being used by thousands of people and are therefore well-tested (and thus one can infer they are reasonably secure). Your custom solution is only used by you.
The chance that your custom solution is insecure is pretty high. Use prepared statements. You have to maintain less code that way.
The benefits of prepared statements are as follows:
each query is only compiled once
mysql will use a more efficient transport format to send data to the server
However, prepared statements only persist per connection. Unless you're using connection pooling, there would be no benefit if you're only doing one statement per page. Trivially simple queries would not benefit from the more efficient transport format, either.
Personally I wouldn't bother. The pseudo-prepared statements are likely to be useful for the safe variable quoting they presumably provide.
Honestly, I don't think you should worry about it. However, I remember that a number of PHP data access frameworks supported prepared-statement modes and non-prepared-statement modes. If I remember correctly, PEAR::DB did back in the day.
I ran into the same issue as you and had my own reservations, so instead of using PDO I ended up writing my own lightweight database layer that supported prepares and standard statements and performed correct escaping (SQL injection prevention) in both cases. One of my other gripes with prepares is that sometimes it is more efficient to append some non-escapable input to a statement like ... WHERE id IN (1, 2, 3...).
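For what it's worth, an IN (...) list can still be bound by generating one placeholder per value. This is a minimal sketch with a hypothetical helper name and table name; it only builds the SQL text, and the commented-out line shows where a real connection would come in.

```php
<?php
declare(strict_types=1);

/**
 * Return "?, ?, ?" with one placeholder per value (hypothetical helper).
 */
function inPlaceholders(array $values): string
{
    if ($values === []) {
        throw new InvalidArgumentException('IN () needs at least one value');
    }
    return implode(', ', array_fill(0, count($values), '?'));
}

$ids = [1, 2, 3];
$sql = 'SELECT * FROM items WHERE id IN (' . inPlaceholders($ids) . ')';
echo $sql, PHP_EOL;
// With a real connection: $stmt = $pdo->prepare($sql); $stmt->execute($ids);
```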
I don't know enough about PDO to tell you what other options you have using it. However, I do know that PHP has escaping functions available for all database vendors it supports and you could roll your own little layer on top of any data access layer you are stuck with.
I am thinking of rewriting some open-source application for my purposes to PDO and transactions using InnoDB (mysql_query and MyISAM now).
My question is: Which cases are reasonable for using prepared statements?
Everywhere I read (even in many posts here), it is stated that I should use prepared statements every time and everywhere because of 1. security and 2. performance. Even the PHP manual recommends using prepared statements and does not mention escaping.
You can't deny the security mechanism. But thinking it over and over it comes to mind that having to prepare the statement every time and then use it once.. It doesn't make sense. While having to insert 1000 times some variables in single statement, that makes sense but it is obvious. But this is not what common eshop or board is built upon.
So how to overcome this? May I prepare my statements application-wide and name them specifically? Can I prepare several different statements and use them by name? This is the only reasonable solution I can think of (except the 1000x thing).
I found there is this mysql_real_escape called $pdo->quote as well for the purpose of single query. Why not to use this? Why to bother with preparing?
And what do you think of this excellent article?
http://blog.ulf-wendel.de/2008/pdo_mysqlnd-prepared-statements-again/
Do you agree with the overhead caused by preparing the statements?
Thanks
I think this falls in the "premature optimization" category.
How significant is the overhead? Have you measured it? Does it affect your server performance at all?
Odds are it doesn't.
On the plus side, you have an undeniable gain in terms of security (which should be a major concern for any internet-based shop).
On the downside, you have the risk that it might affect performance. In the link you provided, it shows that poorly implemented PDO preparation results in slightly lower performance than non prepared statement in some circumstances. Performance difference on 5000 runs is 0.298 seconds.
Insignificant. Even more so when you realize that the "non prepared" queries are run without the input sanitizing routines that would be required to make them safe in a live environment. If you don't use the prepared queries, you need some form of input sanitizing to prevent SQL attacks, and depending on how it is done, you may need to massage back the result sets.
Bottom line, there is no significant performance issue, but there is a significant security benefit. Thus the official recommendation of using prepared statements.
In your question, you speak of "the common eshop". The "common eshop" will never have enough traffic to worry about the performance issue, if there is one. The security issue, on the other hand...
My question is: Which cases are reasonable for using prepared statements?
All of them. The community is openly-opposed to the usage of mysql_* functions.
Note: Suggested alternatives
Use of this extension is discouraged. Instead, the MySQLi or PDO_MySQL extension should be used. See also MySQL: choosing an API for more information.
Alternatives to this function include:
mysqli_connect()
PDO::__construct()
source
But thinking it over and over it comes to mind that having to prepare the statement every time and then use it once.. It doesn't make sense
You're trading in a Geo for a Jaguar and you're complaining that you don't like the Jaguar because you don't always use the seat-heaters. You don't have to be consistently using every function of a library to mean it's good.
I found there is this mysql_real_escape called $pdo->quote as well for the purpose of single query. Why not to use this? Why to bother with preparing?
If you are using this function to build SQL statements, you are strongly recommended to use PDO::prepare() to prepare SQL statements with bound parameters instead of using PDO::quote() to interpolate user input into an SQL statement. Prepared statements with bound parameters are not only more portable, more convenient, immune to SQL injection, but are often much faster to execute than interpolated queries, as both the server and client side can cache a compiled form of the query. source
My question is: Which cases are reasonable for using prepared statements?
Well actually, that's hard to say. Especially as you didn't even tell which open source application you speak about here.
To give you an example: for an ultra-lame guestbook app, PDO with prepared statements will be the perfect choice, as it will for 99% of all other open source apps out there. But for some this actually can make a difference. The important part here is: you have not told us anything about the application.
As the database is not unimportant to an application, it's the other way round as well: the application is not unimportant to the database.
So you either need to share more about that "mysterious" open-source application you ask about or you need to tell us, what exactly you would like to know. Because generally, it's simple: Take PDO. But in specific, there are differences, so you need to tell us what the application in specific is, otherwise your question is already answered.
And by the way, if the application is mysql_* style, it's much easier to just replace it with the mysqli_* interface. If you had actually done some rewriting, even just for fun, you would have seen that.
So better add more meat here or live with some not-so-precise answers.
While this question is rather old, some topics were not really discussed that should be outlined here for others researching the same as the OP.
To summarize everything below:
Yes, always use prepared statements.
Yes, use PDO over mysqli over mysql. This way, if you switch database systems, all you need to update is the queries, rather than the queries, function calls, and arguments (given that the new system supports prepared statements).
Always sanitize user-supplied data, despite using prepared statements with parameters.
Look into a DBAL (Database Abstraction Layer) to ease working with all of these factors and manipulating queries to suit your needs.
There is the topic of PDO::ATTR_EMULATE_PREPARES. Emulation is ENABLED by default, meaning PHP emulates the prepare and only sends the final query to the actual database at execute. Turning emulation OFF will increase the performance of calling cached queries in MySQL >= 5.1.21. The time difference between emulated and non-emulated is normally negligible unless working with an external database (not localhost), such as one in the cloud, that may have abnormally high latency.
The caching depends on your MySQL settings in my.cnf as well, but MySQL optimization is outside the scope of this post.
<?php
$pdo = new \PDO($connection_string);
$pdo->setAttribute( \PDO::ATTR_EMULATE_PREPARES, false );
?>
So keep this in mind since mysqli_ does not provide an API for client side emulation and is always going to use MySQL for preparing statements.
http://www.php.net/manual/en/mysqli.quickstart.prepared-statements.php
Despite having similar features there are differences and you may need features that one API provides while the other does not. See PHP's reference on choosing one API over the other: http://www.php.net/manual/en/mysqlinfo.api.choosing.php
So this pretty much goes along with what you asked with defining your statements application-wide, as cacheable queries would be cached on the MySQL server, and wouldn't need to be prepared application-wide.
The other benefit is that exceptions in your Query would be thrown at prepare() instead of execute() which aids in development to ensure your Queries are correct.
Regardless, there is no real-world performance difference between using prepare and not using it.
Another benefit, one that comes with PDO and InnoDB rather than with prepared statements as such, is working with transactions. You can start a transaction, insert a record, get the last insert id, update another table, delete from another, and if anything fails along the way you can rollBack() to before the transaction took place. Otherwise, commit the changes if you choose to. For example, suppose you create a new order, set the user's last order column to the new order id, and delete a pending order, but the supplied payment type does not meet the criteria for placing orders from the order_flags table: you can rollBack() and show the user a friendly error message.
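The order scenario might be sketched as follows. This is an illustration under stated assumptions, not production code: pdo_sqlite is assumed to be available and an in-memory SQLite database stands in for InnoDB, the table and column names are made up, and the payment check is reduced to a hard-coded flag.

```php
<?php
declare(strict_types=1);

// Assumption: pdo_sqlite is available; in-memory SQLite stands in for InnoDB.
// Table and column names are made up for illustration.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER)');
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, last_order_id INTEGER)');
$pdo->exec('INSERT INTO users (id, last_order_id) VALUES (7, NULL)');

$paymentAccepted = false; // pretend the payment type failed the check
$message = 'Order placed.';

$pdo->beginTransaction();
try {
    $pdo->prepare('INSERT INTO orders (user_id) VALUES (?)')->execute([7]);
    $orderId = (int) $pdo->lastInsertId();
    $pdo->prepare('UPDATE users SET last_order_id = ? WHERE id = ?')
        ->execute([$orderId, 7]);

    if (!$paymentAccepted) {
        throw new RuntimeException('Payment type not accepted');
    }
    $pdo->commit();
} catch (Throwable $e) {
    $pdo->rollBack(); // undo the insert and the update together
    $message = 'Sorry, we could not place your order.';
}

$orderCount = (int) $pdo->query('SELECT COUNT(*) FROM orders')->fetchColumn();
echo $orderCount, ' ', $message, PHP_EOL;
```

Because the exception is thrown before commit(), neither the new order nor the updated user row survives the rollBack().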
As for security, I am rather baffled no one touched on this. When sending any user supplied data to ANY system including PHP and MySQL, sanitize and standardize it.
Yes prepared statements do provide some security when it comes to escaping the data but it is NOT 100% bullet proof.
So always using prepared statements is far more beneficial than not with no real performance loss, and some benefits with caching, but you should still sanitize your user supplied data.
One step is to typecast the variables to the desired data type you are working with. Using objects would further ease this since you work within a single Model for the data types as opposed to having to remember it each time you work with the same data.
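As a small sketch of that typecasting step, the helper below accepts a raw request value only if it is a well-formed integer. The name intOrNull() is hypothetical; filter_var() with FILTER_VALIDATE_INT is standard PHP.

```php
<?php
declare(strict_types=1);

/**
 * Validate that a raw request value is an integer before it is used
 * anywhere else (hypothetical helper). Returns the int, or null if invalid.
 */
function intOrNull(string $raw): ?int
{
    $value = filter_var($raw, FILTER_VALIDATE_INT);
    return $value === false ? null : $value;
}

var_dump(intOrNull('42'));
var_dump(intOrNull('42; DROP TABLE users'));
```

The validated value would still be passed to the query as a bound parameter; the check is an additional layer, not a replacement.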
To add on to the above you should look into a database abstraction layer that uses PDO.
For example Doctrine DBAL: http://docs.doctrine-project.org/projects/doctrine-dbal/en/latest/reference/query-builder.html
The added benefits of working with a DBAL+PDO are that
You can standardize and shorten the amount of work you have to do.
Aid in sanitization of user supplied data
Easily manipulate complex queries
Use nested transactions
Easily switch between databases
Your code becomes more portable and usable in other projects
For example I extended PDO and overrode the query(), fetchAll(), and fetch() methods so that they would always use prepared statements and so that I could write SQL statements inside fetch() or fetchAll() instead of having to write everything out again.
EG:
<?php
$pdo = new PDOEnhanced( $connection );
$pdo->fetchAll( "SELECT * FROM foo WHERE bar = 'hi'", PDO::FETCH_OBJ );
//would automatically provide
$stmt = $pdo->prepare( "SELECT * FROM foo WHERE bar=?" );
$stmt->execute( array( 'hi' ) );
$resultSet = $stmt->fetchAll( PDO::FETCH_OBJ );
?>
As for people suggesting that it is much easier to just replace mysql_* style code with the mysqli_* API: that is not the case. A large portion of the mysql_* functions were left out of mysqli_* or had their arguments changed.
See: http://php.net/manual/en/mysqli.summary.php
You can however get a converter released by Oracle to ease the process: https://wikis.oracle.com/display/mysql/Converting+to+MySQLi
Keep in mind that it is a file source text parser and is not 100% accurate so validate the changes before merging them. It will also add a significant amount of overhead for the globals it creates.