I have a PHP class that processes data and stores it in a MySQL database. I use prepared statements via PDO for security reasons when data is saved, but because the class is large these prepared statements are created inside different functions that are called thousands of times during the lifetime of the object (anywhere from one minute to thirty).
What I’m wondering is if there’s any reason I couldn't prepare the statements in the class constructor and save the handles in member variables to avoid the statements being prepared more than once.
Is there any reason this wouldn't work? I don’t see why not, but I've never seen it done before, which makes me wonder if doing this is a bad practice for some reason.
I.E. something like this:
class MyClass {
    private $stmt1;

    function __construct($dbh) {
        // Prepare once; the handle is reused for the lifetime of the object.
        $this->stmt1 = $dbh->prepare('SELECT foo FROM bar WHERE foobar = :foobar');
    }

    private function doFoo() {
        $this->stmt1->execute(...)
        ...
    }
}
I use prepared statements via PDO for security reasons when data is saved, but because the class is large these prepared statements are created inside different functions that are called thousands of times during the lifetime of the object (anywhere from one minute to thirty).
Whenever I look at bounty questions I always ask myself, "Are they even solving the correct problem?" Is executing the same query with different parameters thousands of times during the lifetime of this object really the best way to go?
If you are doing multiple SELECTs then maybe a better query that fetches more information at once would be better.
If you are doing multiple INSERTs then maybe batch inserts would serve you better.
If after evaluating the above options you decide that you still need to call these statements thousands of times during the life of the object then yes, you can cache the result of a prepared statement:
Measure current performance.
Turn off emulated prepares.
Measure the performance impact.
Use a technique called memoization or lazy loading to cache the prepare but only prepare a query when it is actually used.
Measure the performance impact again.
This allows you to see the impact of each change. I suspect that if you are really calling these queries thousands of times then some or all of these changes will help, but you must measure before and after to know.
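For the "measure" steps, a minimal sketch of a timing helper could look like the following. The function name timeIterations is made up for illustration, and the md5 call is only a stand-in workload; you would replace the closure with the code that prepares and executes your real queries.

```php
<?php
// Hypothetical helper: runs $work $iterations times and returns the elapsed seconds.
function timeIterations(callable $work, int $iterations): float
{
    $start = hrtime(true);                 // monotonic clock, in nanoseconds
    for ($i = 0; $i < $iterations; $i++) {
        $work($i);
    }
    return (hrtime(true) - $start) / 1e9;  // nanoseconds to seconds
}

// Stand-in workload; swap in the real prepare/execute code to compare variants.
$seconds = timeIterations(function (int $i) {
    return md5((string) $i);
}, 1000);
printf("1000 iterations took %.6f s\n", $seconds);
```

Run it once per variant (current code, emulated prepares off, memoized prepares) against the same data and compare the numbers.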
Storing the statements as variables works on paper. Be wary about performance though.
In particular, there's a world of difference between real prepares (which are off by default for MySQL) and emulated prepares (the default for MySQL, controlled by PDO::ATTR_EMULATE_PREPARES).
An emulated prepared statement is parsed locally by PDO. When it is executed, PDO replaces the parameters with their values and ships the final SQL string to the database. Upon receiving it, the database parses the query, comes up with a query plan, executes it, and returns rows.
A real prepared statement will ship the query to be prepared straight to the database. The latter will parse it, prepare a generic query plan based on the query and the unknown variables, and return a prepared statement for use by PHP. When PDO executes the statement, it ships the prepared statement back along with the parameters. The database then executes the prepared query plan and returns rows.
As you may have noted, a real prepared statement involves a lot of back and forth between PHP and the DB. This is offset by the fact that the query is planned once and for all. Sometimes this is desirable (a similar query is used many times); sometimes not (the query is used a single time).
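As a rough sketch of how to switch from emulated to real prepares, the options below can be passed to the PDO constructor. The helper name nativePrepareOptions and the DSN and credentials in the usage comment are placeholders, not anything from your setup.

```php
<?php
// Hypothetical helper: PDO constructor options that request real (server-side) prepares.
function nativePrepareOptions(): array
{
    return [
        PDO::ATTR_EMULATE_PREPARES => false,                   // let the database prepare the statement
        PDO::ATTR_ERRMODE          => PDO::ERRMODE_EXCEPTION,  // surface prepare/execute errors as exceptions
    ];
}

// Usage (placeholder DSN and credentials):
// $pdo = new PDO('mysql:host=localhost;dbname=test;charset=utf8mb4', 'user', 'secret', nativePrepareOptions());
```

With emulation turned off, prepare() itself costs a round trip to the server, which is exactly the trade-off described above.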
A further caveat is that a real prepared statement's query plan may or may not be the best possible one owing to the variables involved. Suppose a b-tree index on foo (bar):
select bar from foo order by bar limit ?
If the variable is small, an index scan is desirable; if it's larger, a bitmap index scan makes sense if available; if it's huge, a seq scan becomes desirable. In the latter two cases, the planner will also need to pick a sorting method. But since the query planner is tasked with coming up with a single plan up front, Murphy's law states that it'll occasionally pick the worst possible plan for your particular use case. And the next thing you know, you'll end up scanning and sorting the entire table to retrieve a couple of rows, or following the index on bar to retrieve the entire table.
Lastly, and as an aside, you might want to look into ORMs if you're not familiar with them already.
Technically it is possible, as you already know by simply trying or just reading:
The query […] can be executed multiple times.
I would consider preparing all statements in the constructor a bad idea. I guess it will become unmaintainable if you have a bunch of SQL statements in the constructor without any context. Furthermore, you might prepare more than you actually need.
One idea to overcome this is using a statement map:
private $statements = array();

public function getStatement($sql)
{
    // Prepare each distinct SQL string only once, on first use.
    if (! isset($this->statements[$sql])) {
        $this->statements[$sql] = $this->pdo->prepare($sql);
    }
    return $this->statements[$sql];
}
This will prepare each statement only once, and you keep your SQL context in the right place.
But I would call this a premature optimization because your DBMS's query cache is most likely doing this for you.
How to take advantage of prepared statements for performance? I understand that something like this might benefit if I put it in a loop:
SELECT `Name` FROM `Hobbits` WHERE `ID` = :ID;
I've read that looping with prepared statements is faster than looping without, but otherwise prepared statements would slightly decrease performance. So - how big may that loop be?
If I run a complex SQL query at the beginning of my code and repeat it with one different parameter at the end - will the second query run faster? (We are using a single connection for each page load). Is there a limit on cached queries, so I better repeat my queries right away?
What about executing the entire script twice with the exact same parameters (reload the page or 2 users)?
A prepared query is given to the SQL server, which parses it and possibly already prepares an execution plan. You're then basically given an id for these allocated resources and can execute this prepared statement by just filling in the blanks in the statement. You can run this statement as often as you like and the database will not have to repeat the parsing and execution planning, which may bring a speed improvement.
As long as you do not throw away the statement, there's no hard timeout for how long the statement will "stay prepared". It's not a cache, it's an allocated resource on the SQL server. At least as long as your database driver uses native prepared statements in the SQL API. PDO for example does not do so by default, unless you set PDO::ATTR_EMULATE_PREPARES to false.
At the end of the script execution though, all those resources will always be deallocated, they do not persist across different page loads. Beyond that, the SQL server may or may not cache the query and its results for some time regardless of the client script.
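To illustrate the "prepare once, fill in the blanks many times" pattern described above, here is a minimal sketch using the query from the question. The function name fetchHobbitNames is made up, and $pdo is assumed to be an already connected PDO object.

```php
<?php
// Hypothetical helper; $pdo is assumed to be a connected PDO instance.
function fetchHobbitNames($pdo, array $ids): array
{
    // Prepared once, outside the loop.
    $stmt = $pdo->prepare('SELECT `Name` FROM `Hobbits` WHERE `ID` = :ID');

    $names = [];
    foreach ($ids as $id) {
        // Only the parameter changes between executions.
        $stmt->execute([':ID' => $id]);
        $names[$id] = $stmt->fetchColumn();  // false when no row matches
    }
    return $names;
}
```

Whether the server actually skips re-parsing here depends on native prepares being enabled, as noted above.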
How long are prepared mysql queries cached?
This is not actually "a cache". A prepared statement lasts no longer than the script execution.
If I run a complex SQL query at the beginning of my code and repeat it with one different parameter at the end - will the second query run faster?
The more complex a query, the less effect you will see. Frankly, a prepared statement saves you only the parsing; if execution involves a temporary table, a filesort, or a table scan, a prepared statement speeds up none of them.
On the other hand, for simple primary-key lookups, which involve neither complex query parsing nor sophisticated query plans, the benefit would be negligible to none.
So - how big may that loop be?
The more iterations, the greater the benefit. However, in a sane web application one should avoid running queries in a loop at all.
What about executing the entire script twice with the exact same parameters (reload the page or 2 users)?
As I said above, there will be no benefit from a prepared statement at all. A classical query cache, however, most likely would fire.
How to take advantage of prepared statements for performance?
No way. Not in web-serving PHP, at least. In some long-running CLI script, maybe.
However, prepared statements ought to be used anyway, for the purpose of producing syntactically correct queries.
FYI, from
http://dev.mysql.com/doc/refman/5.7/en/statement-caching.html
The max_prepared_stmt_count system variable controls the total number of statements the server caches. (The sum of the number of prepared statements across all sessions.)
I've recently upgraded my mind from mysql_* to PDO, and I have one simple question:
Is PDO really that much more efficient that the use of a prepared statement and an execute in a for-each loop is quicker than a single call in mysql with multiple values in it?
For example, if I have an array of 5 names and put these in an execute command in a for loop operating on an 'insert' prepared statement, is calling this 5 times going to be quicker in computational speed than one call using the old mysql with all 5 values in a single query? Or is it preferred due to security rather than speed alone?
The meaning and significance of native prepared statements (which you call "PDO") are widely overlooked and misjudged.
The speed benefit that everyone talks about so much can in reality be achieved only extremely rarely, and is often not noticeable at all, especially in web development with PHP, which is where PDO belongs.
Also note that whatever speed benefit exists applies to query parsing only; things like index rebuilding or the time required to find a record to update are never affected by prepared statements.
So, speaking of numbers like five, don't bother yourself with this "prepare once, execute many times" thing. It is not what PDO is about. PDO does two essential things, which make it preferable to the two other possible extensions:
It supports prepared statements in general, allowing data to go into the query not directly but via a placeholder. This is the only reason why you should use PDO or a similar library (although you can easily make even the old mysql extension support prepared statements, PDO offers it out of the box).
It makes such support less painful than mysqli does.
Turning back to your question:
You can use whichever way you like. Just remember that multiple inserts are better wrapped in a transaction, due to the default settings of modern DB engines.
No matter which way you choose, any dynamic value should be added to the query via placeholders only. If you are still not convinced, you are welcome to read an article I wrote on the matter (which is still incomplete, but has a thorough explanation of the real meaning of prepared statements).
PS. There is also one minor benefit of native prepared statements that is often forgotten (because it is seldom needed): if native prepared statements are used (and backed by the mysqlnd driver), the data returned is already formatted according to its type.
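The advice about wrapping multiple inserts in a transaction can be sketched like this. The table people, its column name, and the function insertNames are assumptions for illustration only; $pdo is assumed to be a connected PDO object.

```php
<?php
// Hypothetical helper; table "people" and column "name" are made up for this sketch.
function insertNames($pdo, array $names): int
{
    $stmt = $pdo->prepare('INSERT INTO people (name) VALUES (:name)');

    $pdo->beginTransaction();
    try {
        foreach ($names as $name) {
            $stmt->execute([':name' => $name]);  // value goes in via a placeholder
        }
        $pdo->commit();                          // one commit for all rows
    } catch (Exception $e) {
        $pdo->rollBack();                        // leave the table untouched on failure
        throw $e;
    }
    return count($names);
}
```

Without the explicit transaction, an engine running in autocommit mode commits after every single insert, which is usually where the time goes.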
One query that fetches 5 rows will probably be quicker than 5 separate calls, so you are comparing apples and oranges.
When executing the same query, the performance will be similar too. The (small) performance advantage that PDO has is that queries with parameters are supposed to be more cacheable. When querying customer 3 and customer 5 without parameters, the query will be cached as two different queries, even though only the id is different. By using parameters, the database might cache the query in a smarter way, so a second call with a different input doesn't need to go through the query optimizer and such.
That said, apart from the performance advantage, PDO is also safer (when actually using parameters), and in the end easier. It may look more complex at first, but it is easier to do right, because without using parameters, you will need to do all the escaping yourself, risking dangerous bugs.
By the way, you can also build a query with a variable number of parameters, and bind a value to each of them in a loop, so with PDO you could still perform the single insert query for 5 rows, although it will need a bit of puzzling out and a bit of extra code.
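A minimal sketch of that "variable number of parameters" idea follows. The helper buildMultiInsertSql, the table people, and the column name are invented for the example; the commented-out execute line assumes a connected $pdo.

```php
<?php
// Hypothetical helper: builds a single INSERT with one "(?)" group per row.
function buildMultiInsertSql(string $table, string $column, int $rowCount): string
{
    if ($rowCount < 1) {
        throw new InvalidArgumentException('Need at least one row.');
    }
    $placeholders = implode(', ', array_fill(0, $rowCount, '(?)'));
    // Identifiers cannot be bound as parameters, so $table and $column
    // must come from trusted code, never from user input.
    return "INSERT INTO `$table` (`$column`) VALUES $placeholders";
}

$names = ['Frodo', 'Sam', 'Merry', 'Pippin', 'Bilbo'];
$sql = buildMultiInsertSql('people', 'name', count($names));
echo $sql, "\n";
// With a connected $pdo (assumed), all five rows go in with one round trip:
// $pdo->prepare($sql)->execute($names);
```

The values still travel as bound parameters, so you keep the safety of prepared statements while sending one query instead of five.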
If I want to execute the same query on two different requests, and I use prepared statements with Doctrine2... Will the prepared statement be sent only the first time and be stored by the database for some time? Or will it be removed after each script finishes?
On PostgreSQL a prepared statement is valid only till the end of the session and is not saved in the memory and shared between many sessions, see doc:
http://www.postgresql.org/docs/9.2/static/sql-prepare.html
Prepared statements only last for the duration of the current database
session. When the session ends, the prepared statement is forgotten,
so it must be recreated before being used again. This also means that
a single prepared statement cannot be used by multiple simultaneous
database clients; however, each client can create their own prepared
statement to use.
However, they also say that PostgreSQL may (but need not) save a plan for this query in memory for future reuse:
If a prepared statement is executed enough times, the server may
eventually decide to save and re-use a generic plan rather than
re-planning each time. This will occur immediately if the prepared
statement has no parameters; otherwise it occurs only if the generic
plan appears to be not much more expensive than a plan that depends on
specific parameter values. Typically, a generic plan will be selected
only if the query's performance is estimated to be fairly insensitive
to the specific parameter values supplied.
To examine the query plan PostgreSQL is using for a prepared statement, use EXPLAIN. If a generic plan is in use, it will contain
parameter symbols $n, while a custom plan will have the current actual
parameter values substituted into it.
I'm working on a new project and using parameterized queries for the first time (PHP with a MySQL DB). I read that parameterized queries are cached, but I'm wondering how long they are cached for. For example, let's say I have a function 'getAllUsers()' that gets a list of all active user IDs from the user table, and for each ID a User object is created and a call to function 'getUser($user)' is made to set the other properties of the object. The 'getUser()' function has its own prepared query with a stmt->close() at the end of the function.
If I do it this way, does my parameterized query in 'getUser()' take advantage of caching at all or is the query destroyed from cache after each stmt->close()?
Note: I also use the getUser() function if a page only requires data for a single user object so I wanted to do it this way to ensure that if the user table changes I only ever need to update one query.
Is this the right way of doing something like this or is there a better way?
Update: Interesting, just saw this on php.net's manual for prepared statements (http://php.net/manual/en/mysqli.quickstart.prepared-statements.php)
Using a prepared statement is not always the most efficient way of executing a statement. A prepared statement executed only once causes more client-server round-trips than a non-prepared statement.
So I guess the main benefit of parameterized queries is to protect against SQL injection, and not necessarily to speed things up unless it's a query that will be repeated several times in one run.
Calling mysqli_stmt::close will:
Closes a prepared statement. mysqli_stmt_close() also deallocates the
statement handle.
As a result, the prepared statement can no longer be reused for further executions.
I wouldn't worry about freeing resources or closing statements, since PHP will do it for you at the end of the script anyway.
Also if you are working with loops (as you described) take a look at mysqli_stmt::reset which will reset the prepared statement to its original state (after the prepare call).
That's a good question, from some point of view.
First, about "caching".
There is one special thing about prepared queries: you can send a query to the server once and then execute it multiple times. Reusing an already parsed and prepared query can give a small theoretical benefit.
It seems you're not using this mechanism, since you prepare each query every time it runs. So there is no caching at all.
Next, about premature optimization.
You've heard of some caching, and it has captured your imagination,
while there is no real need or cause for you to be concerned about caching or any other performance issue.
So, there is a rule: do not occupy yourself with performance issues until they are real.
Otherwise you'll waste your time.
I'm writing some DB routines and I'm using prepared statements. My environment is PDO with PHP5.
I understand prepared statements primarily provide a performance benefit, as well as some auxiliary bonuses such as not having to manually SQL-escape input data.
My question is about the performance part.
I have two implementations of a getPrice function below that takes a product id and returns its price.
getPrice_A reuses the same PDOStatement object across subsequent calls within the same script execution. Is this necessary or recommended? If so, is there any way to avoid duplicating this extra code across every single get*() in every single model?
getPrice_B creates a new PDOStatement object on every call. Will the DBMS recognize this statement has already been prepared and still be able to skip some work? In other words, does this implementation properly take advantage of the performance benefits of prepared statements?
Having written all this out and read it over, I imagine getPrice_B is fine and getPrice_A is providing a negligible benefit on top of that, which may or may not be worth the extra complication.
I'd still like to hear for sure from someone more knowledgeable though.
Assume that $pdo is a valid, connected PDO object in the examples below.
<?php
class Product {
    // Reuses one PDOStatement across calls within the same script execution.
    static function getPrice_A($id) {
        global $pdo; // assumed: a valid, connected PDO object
        static $stmt;
        if (!$stmt) {
            $stmt = $pdo->prepare('SELECT price FROM products WHERE id = ?');
        }
        $stmt->execute(array($id));
        return $stmt->fetchColumn(0);
    }

    // Prepares a new PDOStatement on every call.
    static function getPrice_B($id) {
        global $pdo; // assumed: a valid, connected PDO object
        $stmt = $pdo->prepare('SELECT price FROM products WHERE id = ?');
        $stmt->execute(array($id));
        return $stmt->fetchColumn(0);
    }
}
// example usage:
$price = Product::getPrice_A(4982);
echo "Product 4982 costs $price\n";
From what I understand, the database can reuse the generated plan when it sees the same statement again, so it does not have to redo the work of figuring out how to run the query, although whether that happens for a freshly prepared statement depends on the DBMS. I would say the extra work of saving the prepared statement in Product::getPrice_A is not typically very helpful, more because it can obscure the code than because of any performance issue. When dealing with performance, I feel it's always best to focus on code clarity first, and on performance once you have real statistics that indicate a problem.
I would say "yes, the extra work is unnecessary" (regardless of whether it really boosts performance). Also, I am not a very big DB expert, but the performance gain of prepared statements is something I heard about from others, and it is at the database level, not the code level (so if the code is actually invoking a parameterized statement on the actual DB, then the DB can do this execution plan caching... though depending on the database, you may get the benefit even without the parameterized statement).
Anyways, if you are really worried about (and seeing) database performance issues, you should look into a caching solution... of which I would highly recommend memcached. With such a solution, you can cache your query results and not even hit the database for things you access frequently.
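A rough sketch of that result caching, under some assumptions: $cache is a connected Memcached object (only its get and set methods are used), $pdo is a connected PDO object, and the function name getCachedPrice, the key format, and the 300-second lifetime are all made up for the example.

```php
<?php
// Hypothetical helper: look in the cache first, fall back to the database on a miss.
function getCachedPrice($cache, $pdo, int $id)
{
    $key = "product_price_$id";

    $price = $cache->get($key);
    if ($price !== false) {          // Memcached::get() returns false on a miss
        return $price;
    }

    $stmt = $pdo->prepare('SELECT price FROM products WHERE id = ?');
    $stmt->execute([$id]);
    $price = $stmt->fetchColumn();   // false when the product does not exist

    if ($price !== false) {
        $cache->set($key, $price, 300);  // keep for 5 minutes (arbitrary choice)
    }
    return $price;
}
```

The price you pay is staleness: a changed price is not visible until the entry expires or you delete the key when updating the product.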