This question already has answers here:
Getting raw SQL query string from PDO prepared statements
(16 answers)
Closed 6 years ago.
In PHP, when accessing a MySQL database through PDO with a parameterized query, how can you check the final query (after all tokens have been replaced)?
Is there a way to check what gets really executed by the database?
So I think I'll finally answer my own question in order to have a full solution for the record. But I have to thank Ben James and Kailash Badu, who provided the clues for this.
Short Answer
As mentioned by Ben James: NO.
The full SQL query does not exist on the PHP side, because the query-with-tokens and the parameters are sent separately to the database.
The full query exists only on the database side.
Even trying to create a function to replace tokens on the PHP side would not guarantee the replacement process is the same as the SQL one (tricky stuff like token-type, bindValue vs bindParam, ...)
Workaround
This is where I elaborate on Kailash Badu's answer.
By logging all SQL queries, we can see what is really run on the server.
With MySQL, this can be done by editing my.cnf (or my.ini, in my case with a Wamp server) and adding a line like:
log=[REPLACE_BY_PATH]/[REPLACE_BY_FILE_NAME]
Just do not run this in production!!!
You might be able to use PDOStatement->debugDumpParams. See the PHP documentation.
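As a rough illustration of what debugDumpParams gives you, here is a minimal sketch. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database in place of MySQL so that it runs without a server; the table and column names are only placeholders. Note that the dump shows the statement template and the parameter metadata, which is not necessarily the final query the server runs.

```php
<?php
// Assumption: pdo_sqlite is installed; SQLite stands in for MySQL here.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE example (userName TEXT)');

$stmt = $pdo->prepare('SELECT * FROM example WHERE userName = :value');
$stmt->bindValue(':value', 'mick');
$stmt->execute();

// debugDumpParams() writes straight to output, so buffer it to get a string.
ob_start();
$stmt->debugDumpParams();
$dump = ob_get_clean();

echo $dump;
```

Capturing the output in a buffer makes it possible to send the dump to a log file rather than to the page.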
Using prepared statements with parametrised values is not simply another way to dynamically create a string of SQL. You create a prepared statement at the database, and then send the parameter values alone.
So what is probably sent to the database will be a PREPARE ..., then SET ... and finally EXECUTE ....
You won't be able to get some SQL string like SELECT * FROM ..., even if it would produce equivalent results, because no such query was ever actually sent to the database.
I check the query log to see the exact query that was executed as a prepared statement.
I initially avoided turning on logging to monitor PDO because I thought that it would be a hassle but it is not hard at all. You don't need to reboot MySQL (after 5.1.9):
Execute this SQL in phpMyAdmin or any other environment where you may have high db privileges:
SET GLOBAL general_log = 'ON';
In a terminal, tail your log file. Mine was here:
>sudo tail -f /usr/local/mysql/data/myMacComputerName.log
You can search for your mysql files with this terminal command:
>ps auxww|grep [m]ysqld
I found that PDO escapes everything it binds as a value, so you can't use a placeholder for a column name:
$dynamicField = 'userName';
$sql = "SELECT * FROM `example` WHERE `:field` = :value";
$this->statement = $this->db->prepare($sql);
$this->statement->bindValue(':field', $dynamicField);
$this->statement->bindValue(':value', 'mick');
$this->statement->execute();
Because it creates:
SELECT * FROM `example` WHERE `'userName'` = 'mick' ;
Which did not create an error, just an empty result. Instead I needed to use
$sql = "SELECT * FROM `example` WHERE `$dynamicField` = :value";
to get
SELECT * FROM `example` WHERE `userName` = 'mick' ;
When you are done, execute:
SET GLOBAL general_log = 'OFF';
or else your logs will get huge.
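Because the dynamic column name in the answer above is interpolated straight into the SQL string, it is worth checking it against a fixed list first. A minimal sketch, assuming a hypothetical helper buildLookupSql() and an allow-list of column names that you define yourself (neither comes from the original answer):

```php
<?php
// Hypothetical helper: only column names from the allow-list reach the SQL text.
function buildLookupSql(string $field, array $allowedFields): string
{
    if (!in_array($field, $allowedFields, true)) {
        throw new InvalidArgumentException("Unknown column: $field");
    }
    // The identifier is interpolated; the value stays a bound placeholder.
    return "SELECT * FROM `example` WHERE `$field` = :value";
}

$sql = buildLookupSql('userName', ['userName', 'email']);
echo $sql, PHP_EOL;
```

The value itself is still passed through bindValue() or execute(), so only the identifier needs this extra check.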
What I did to print that actual query is a bit complicated but it works :)
In the method that assigns variables to my statement, I have another variable that looks a bit like this:
$this->fullStmt = str_replace($column, '\'' . str_replace('\'', '\\\'', $param) . '\'', $this->fullStmt);
Where:
$column is my token
$param is the actual value being assigned to token
$this->fullStmt is my print only statement with replaced tokens
It simply replaces the tokens with their values at the point where the real PDO assignment happens.
I hope I did not confuse you and at least pointed you in the right direction.
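The same idea can be sketched as a stand-alone function. This is only an approximation for printing or logging: the function name interpolateForDebug() is made up for this example, it handles named placeholders only, and its quoting is not guaranteed to match what the database does, so the result should never be executed.

```php
<?php
// Hypothetical debug helper: builds a display-only copy of the query.
function interpolateForDebug(string $sql, array $params): string
{
    if ($params === []) {
        return $sql;
    }
    $pairs = [];
    foreach ($params as $name => $value) {
        // Accept keys with or without the leading colon.
        $token = ':' . ltrim((string) $name, ':');
        if ($value === null) {
            $pairs[$token] = 'NULL';
        } elseif (is_int($value) || is_float($value)) {
            $pairs[$token] = (string) $value;
        } else {
            // Double single quotes, for display purposes only.
            $pairs[$token] = "'" . str_replace("'", "''", (string) $value) . "'";
        }
    }
    // strtr() with an array tries the longest keys first, so :id does not
    // clobber :id2, and already replaced text is not scanned again.
    return strtr($sql, $pairs);
}

echo interpolateForDebug(
    'SELECT * FROM example WHERE userName = :value AND id = :id',
    ['value' => "o'brien", ':id' => 7]
), PHP_EOL;
```

Using strtr() rather than repeated str_replace() calls avoids one token being replaced inside another token or inside an already substituted value.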
The easiest way is to read the MySQL query log, which you can do at runtime.
There is a nice explanation here:
How to show the last queries executed on MySQL?
I don't believe you can, though I hope that someone will prove me wrong.
I know you can print the query and its toString method will show you the sql without the replacements. That can be handy if you're building complex query strings, but it doesn't give you the full query with values.
I think the easiest way to see the final query text when you use PDO is to cause a deliberate error and look at the error message. I don't know how to do that in general, but when I make an SQL error in the Yii framework, which uses PDO, I can see the query text.
I always check/limit/cleanup the user variables I use in database queries
Like so:
$pageid = preg_replace('/[^a-z0-9_]+/i', '', $urlpagequery); // urlpagequery comes from a GET var
$sql = 'SELECT something FROM sometable WHERE pageid = "'.$pageid.'" LIMIT 1';
$stmt = $conn->query($sql);
if ($stmt && $stmt->num_rows > 0) {
$row = $stmt->fetch_assoc();
// do something with the database content
}
I don't see how using prepared statements or further escaping improves anything in that scenario? Injection seems impossible here, no?
I have tried messing with prepared statements, and I kind of see the point, even though it takes much more time and thinking (sssiissisis etc.) to code even fairly simple queries.
But as I always clean up the user input before any DB interaction, it seems unnecessary.
Can you enlighten me?
You will be better off using prepared statements consistently.
Regular expressions are only a partial solution, and they are neither as convenient nor as versatile. If your variables don't fit a pattern that can be filtered with a regular expression, then you can't use them.
All the "ssisiisisis" stuff is an artifact of Mysqli, which IMHO is needlessly confusing.
I use PDO instead:
$sql = 'SELECT something FROM sometable WHERE pageid = ? LIMIT 1';
$stmt = $conn->prepare($sql);
$stmt->execute(array($pageid));
See? No need for regexp filtering. No need for quoting or breaking up the string with . between the concatenated parts.
It's easy in PDO to pass an array of variables, then you don't have to do tedious variable-binding code.
PDO also supports named parameters, which can be handy if you have an associative array of values:
$params = array("pageid"=>123, "user"=>"Bill");
$sql = 'SELECT something FROM sometable WHERE pageid = :pageid AND user = :user LIMIT 1';
$stmt = $conn->prepare($sql);
$stmt->execute($params);
If you enable PDO exceptions, you don't need to test whether the query succeeds. You'll know if it fails because the exception is thrown (FWIW, you can enable exceptions in Mysqli too).
You don't need to test for num_rows(), just put the fetching in a while loop. If there are no rows to fetch, then the loop stops immediately. If there's just one row, then it loops one iteration.
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
// do something with the database content
}
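Putting the pieces from this answer together (exceptions enabled, a named parameter passed to execute(), and a plain fetch loop), a minimal sketch could look like the following. It assumes the pdo_sqlite extension and an in-memory SQLite database in place of MySQL so that it runs without a server; the function name fetchSomething() and the sample row are invented for the example.

```php
<?php
// Assumption: pdo_sqlite is installed; SQLite stands in for MySQL here.
$conn = new PDO('sqlite::memory:', null, null, [
    PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION, // failures throw PDOException
]);
$conn->exec('CREATE TABLE sometable (pageid TEXT, something TEXT)');
$conn->exec("INSERT INTO sometable VALUES ('home', 'Welcome')");

// Hypothetical helper: returns the matching values, or an empty array.
function fetchSomething(PDO $conn, string $pageid): array
{
    $stmt = $conn->prepare('SELECT something FROM sometable WHERE pageid = :pageid LIMIT 1');
    $stmt->execute(['pageid' => $pageid]);

    $found = [];
    // No row-count check needed: the loop body simply never runs if nothing matches.
    while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
        $found[] = $row['something'];
    }
    return $found;
}

print_r(fetchSomething($conn, 'home'));
print_r(fetchSomething($conn, "x' OR '1'='1"));
```

The second call passes a typical injection attempt; because the value is bound rather than concatenated, it is compared as a literal string and matches nothing.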
Prepared statements are easier and more flexible than filtering and string-concatenation, and in some cases they are faster than plain query() calls.
The question would be how you define "improve" in this context. In this situation I would say that it makes no difference to the functionality of the code.
So what is the difference to you? You say that this is easier and faster for you to write. That might be the case but is only a matter of training. Once you're used to prepared statements, you will write them just as fast.
The difference to other programmers? The moment you share this code, it will be harder for the other person to fully understand, because prepared statements are more or less the standard (or would be in a perfect world). Using something else makes the code harder for others to follow.
Talking more about this little piece of code makes no sense, because it really doesn't matter; it's only one very simple statement. But imagine you write a larger script: which will be easier to read and modify in the future?
$id = //validate int
$name = //validate string
$sometext = //validate string with special rules
$sql = 'SELECT .. FROM foo WHERE foo.id = '.$id.' AND name="'.$name.'" AND sometext LIKE "%'.$sometext.'%"';
You will always need to ask yourself: Did I properly validate all the variables I am using? Did I make a mistake?
Whereas when you use code like this
$stmt = $db->prepare('SELECT .. FROM foo WHERE foo.id = :id AND name = :name AND sometext LIKE :sometext');
// Placeholders must not be quoted, and the LIKE wildcards belong in the bound value.
$stmt->execute(array(
    ':id' => $id,
    ':name' => $name,
    ':sometext' => '%' . $sometext . '%',
));
There is no need to worry about whether you have done everything right, because PDO takes care of the quoting for you.
Of course this isn't a complex query either, but having multiple variables should demonstrate my point.
So my final answer is: if you are the perfect programmer who never forgets anything, never makes mistakes, and works alone, do as you like. But if you're not, I would suggest using standards, as they exist for a reason. It is not that you cannot properly validate all variables, but that you should not need to.
Prepared statements can sometimes be faster. But from the way you ask the question I would assume that you are in no need of them.
So how much extra performance can you get by using prepared statements ? Results can vary. In certain cases I’ve seen 5x+ performance improvements when really large amounts of data needed to be retrieved from localhost – data conversion can really take most of the time in this case. It could also reduce performance in certain cases because if you execute query only once extra round trip to the server will be required, or because query cache does not work.
Brought to you faster by http://www.mysqlperformanceblog.com/
I don't see how using prepared statements or further escaping improves anything in that scenario?
You're right it doesn't.
P.S. I down voted your question because there seems little research made before you asked.
Nowadays, "Prepared statements" seem to be the only way anyone recommends sending queries to a database. I even see recommendations to use prepared statements for stored procs. However, due to the extra query that prepared statements require, and the short time they last, I'm persuaded that they are only useful for a series of INSERT/UPDATE queries.
I'm hoping someone can correct me on this, but it just seems like a repeat of the whole "Tables are evil" CSS thing. Tables are only evil if used for layouts, not for tabular data. Using DIVs for tabular data is a W3C style violation.
Likewise, plain SQL (or that generated from ARs) seems to be much more useful for 80% of the queries used, which on most sites are a single SELECT that is not repeated again during that page load (I'm speaking about scripting languages like PHP here). Why would I make my over-taxed DB prepare a statement that will run only once before being removed?
MySQL:
A prepared statement is specific to the session in which it was created. If you terminate a session without deallocating a previously prepared statement, the server deallocates it automatically.
So at the end of your script PHP will auto-close the connection and you will lose the prepared statement, only to have your script re-create it on the next load.
Am I missing something or is this just a way to decrease performance?
:UPDATE:
It dawned on me that I am assuming new connections for each script. I would assume that if a persistent connection is used then these problems would disappear. Is this correct?
:UPDATE2:
It seems that even if persistent connections are the solution - they are not a very good option for most of the web - especially if you use transactions. So I'm back to square one having nothing more than the benchmarks below to go on...
:UPDATE3:
Most people simply repeat the phrase "prepared statements protect against SQL injection", which doesn't fully explain the problem. The "escape" method provided by each DB library also protects against SQL injection. But there is more to it than that:
When sending a query the normal way, the client (script) converts the data into strings that are then passed to the DB server. The DB server then uses CPU power to convert them back into the proper binary datatype. The database engine then parses the statement and looks for syntax errors.
When using prepared statements... the data are sent in a native binary form, which saves the conversion-CPU-usage, and makes the data transfer more efficient. Obviously, this will also reduce bandwidth usage if the client is not co-located with the DB server.
...The variable types are predefined, and hence MySQL take into account these characters, and they do not need to be escaped.
http://www.webdesignforums.net/showthread.php?t=18762
Thanks to OIS for finally setting me straight on this issue.
Unlike the CSS tables debate, there are clear security implications with prepared statements.
If you use prepared statements as the ONLY way to put user-supplied data into a query, then they are absolutely bullet-proof when it comes to SQL injection.
When you execute a sql statement on the database, the sql parser needs to analyse it beforehand, which is the exact same process as the preparation.
So, comparing executing sql statements directly to preparing and executing has no disadvantages, but some advantages:
First of all, as longneck already stated, passing user input into a prepared statement escapes the input automatically. It is as if the database has prepared filters for the values and lets in only those values that fit.
Secondly, if you use prepared statements throughout and you end up in a situation where you need to execute one multiple times, you don't need to rewrite the code to prepare and execute; you just execute it again.
Thirdly: The code becomes more readable, if done properly:
$sql = 'SELECT u.id, u.user, u.email, sum(r.points)
FROM users u
LEFT JOIN reputation r on (u.id=r.user_id)
LEFT JOIN badge b on (u.id=b.user_id and badge=:badge)
WHERE group=:group';
$params = array(
':group' => $group,
':badge' => $_GET['badge']
);
$stmt = $pdo->prepare($sql);
$result = $stmt->execute($params);
Instead of
$sql = 'SELECT u.id, u.user, u.email, sum(r.points)
FROM users u
LEFT JOIN reputation r on (u.id=r.user_id)
LEFT JOIN badge b on (u.id=b.user_id and badge="'.mysql_real_escape_string($_GET['badge']).'")
WHERE group="'.mysql_real_escape_string($group).'"';
$result = mysql_query($sql);
Imagine you had to change the sql statement, which code would be your favourite? ;-)
Prepared Statements come in handy in several situations:
Great separation of query data from untrusted user data.
Performance increase when the same query is executed multiple times
Performance increase when binary data is being transmitted as the prepared statement can use the binary protocol, whereas a traditional query will end up doing encoding and such.
There is a performance hit under normal circumstances (not repeated, no binary data) as you now have to do two back and forths. The first to "prepare" the query, and the second to transmit the token along with the data to be inserted. Most people are willing to make this sacrifice for the security benefit.
With regards to persistent connections:
MySQL has one of the fastest connection build up times on the market. It's essentially free for most set ups, so you're not going to see too much of a change using persistent connections or not.
The answer has to do with security and abstraction. Everyone else has already mentioned security, but the real upside is that your input is completely abstracted from the query itself. This allows for a true database agnosticism when using an abstraction layer, whereas inlining the input is usually a database-dependent process. If you care anything for portability, prepared statements are the way to go.
In the real world, I rarely ever write DML queries. All of my INSERTS / UPDATES are automatically built by the abstraction layer and are executed by simply passing an input array. For all intents and purposes, there really is no "performance hit" for preparing queries and then executing them (save for connection latency in the initial PREPARE). But when using a UDS (Unix Domain Socket) connection, you're not going to notice (or even be able to benchmark) a difference. It's usually on the order of a few microseconds.
Given the security and abstraction upsides, I'd hardly call it wasteful.
The performance benefit doesn't come from less parsing - it comes from only having to calculate access paths once rather than repeatedly. This helps a lot when you're issuing thousands of queries.
Given mysql's very simple optimizer/planner this may be less of an issue than with a more mature database with much more sophisticated optimizers.
However, this performance benefit can actually turn into a detriment if you've got a sophisticated optimizer that is aware of data skews. In that case you can often be better off with getting a different access path for the same query using different literal values rather than reusing a preexisting path.
When using SQL queries like SELECT x,y,z FROM foo WHERE c='mary had a little lamb', the server has to parse the SQL statement including the data, and you have to sanitize the "mary had..." part (a call to mysql_real_escape_string() or similar for each parameter).
Using prepared statements, the server has to parse the statement too, but without the data, and it sends back only an identifier for the statement (a tiny, tiny data packet). Then you send the actual data without first sanitizing it. I don't see the overhead here, though I freely admit I've never tested it. Have you? ;-)
edit: And using prepared statements can eliminate the need to convert each and every parameter (in/out) to strings. Probably even more so if your version of php uses mysqlnd (instead of the "old" libmysql client library). Haven't tested the performance aspect of that either.
I don't seem to be finding any good benefits to using persistent connections, or prepared statements for that matter. Look at these numbers: for 6000 select statements (which will never happen in a page request!) you can barely tell the difference. Most of my pages use fewer than 10 queries.
UPDATED I just revised my test to include 4k SELECT and 4k INSERT statements! Run it yourself and let me know if there are any design errors.
Perhaps the difference would be greater if my MySQL server wasn't running on the same machine as Apache.
Persistent: TRUE
Prepare: TRUE
2.3399310112 seconds
Persistent: FALSE
Prepare: TRUE
2.3265211582184 seconds
Persistent: TRUE
Prepare: FALSE
2.3666892051697 seconds
Persistent: FALSE
Prepare: FALSE
2.3496441841125 seconds
Here is my test code:
$hostname = 'localhost';
$username = 'root';
$password = '';
$dbname = 'db_name';
$persistent = FALSE;
$prepare = FALSE;
try
{
    // Force PDO to use exceptions for all errors
    $attrs = array(PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION);
    if($persistent)
    {
        // Make the connection persistent
        $attrs[PDO::ATTR_PERSISTENT] = TRUE;
    }
    $db = new PDO("mysql:host=$hostname;dbname=$dbname", $username, $password, $attrs);
    // What type of connection?
    print 'Persistent: '.($db->getAttribute(PDO::ATTR_PERSISTENT) ? 'TRUE' : 'FALSE').'<br />';
    print 'Prepare: '.($prepare ? 'TRUE' : 'FALSE').'<br />';
    //Clean table from last run
    $db->exec('TRUNCATE TABLE `pdo_insert`');
}
catch(PDOException $e)
{
    echo $e->getMessage();
}
$start = microtime(TRUE);
$name = 'Jack';
$body = 'This is the text "body"';
if( $prepare ) {
    // Select
    $select = $db->prepare('SELECT * FROM pdo_insert WHERE id = :id');
    $select->bindParam(':id', $x);
    // Insert
    $insert = $db->prepare('INSERT INTO pdo_insert (`name`, `body`, `author_id`)
        VALUES (:name, :body, :author_id)');
    $insert->bindParam(':name', $name);
    $insert->bindParam(':body', $body);
    $insert->bindParam(':author_id', $x);
    $run = 0;
    for($x=0;$x<4000;++$x)
    {
        if( $insert->execute() && $select->execute() )
        {
            $run++;
        }
    }
}
else
{
    $run = 0;
    for($x=0;$x<4000;++$x) {
        // Insert
        if( $db->query('INSERT INTO pdo_insert (`name`, `body`, `author_id`)
            VALUES ('.$db->quote($name).', '. $db->quote($body).', '. $db->quote($x).')')
            AND
            // Select
            $db->query('SELECT * FROM pdo_insert WHERE id = '. $db->quote($x)) )
        {
            $run++;
        }
    }
}
print (microtime(true) - $start).' seconds and '.($run * 2).' queries';
Cassy is right. Even if you don't prepare/compile the statement yourself, the DBMS has to do so anyway before it is able to run it.
Another advantage is that you can check the result of the prepare step, and if it fails your code can branch off to handle the error without wasting DB resources on running the failing query.