The question is a fairly open one. I've been using Stored Procs with MS SQLServer for some time with classic ASP and ASP.net and love them, lots.
I have a small hobby project I'm working on and for various reasons have gone the LAMP route. Any hints/tricks/traps or good starting points to get into using stored procedures with MySQL and PHP5? My version of MySQL supports Stored Procedures.
@michal kralik - unfortunately there's a bug in the MySQL C API that PDO uses, which means that running your code as above with some versions of MySQL results in the error:
"Syntax error or access violation: 1414 OUT or INOUT argument $parameter_number for routine $procedure_name is not a variable or NEW pseudo-variable".
You can see the bug report on bugs.mysql.com. It's been fixed for version 5.5.3+ & 6.0.8+.
To work around the issue, you would need to separate the in and out parameters, and use a user variable to store the result, like this:
$stmt = $dbh->prepare("CALL sp_takes_string_returns_string(:in_string, @out_string)");
// bindValue() accepts a literal; bindParam() requires a variable passed by reference
$stmt->bindValue(':in_string', 'hello');
// call the stored procedure
$stmt->execute();
// fetch the output
$outputArray = $dbh->query("select @out_string")->fetch(PDO::FETCH_ASSOC);
print "procedure returned " . $outputArray['@out_string'] . "\n";
Forget about mysqli; it's much harder to use than PDO and should have been removed already. It is true that it introduced huge improvements over mysql, but achieving the same effect in mysqli sometimes requires enormous effort compared with PDO, e.g. an associative fetchAll.
Instead, take a look at PDO, specifically prepared statements and stored procedures.
$stmt = $dbh->prepare("CALL sp_takes_string_returns_string(?)");
$value = 'hello';
$stmt->bindParam(1, $value, PDO::PARAM_STR|PDO::PARAM_INPUT_OUTPUT, 4000);
// call the stored procedure
$stmt->execute();
print "procedure returned $value\n";
It isn't actually mandatory to use mysqli or PDO to call stored procedures in MySQL 5. You can call them just fine with the old mysql_ functions. The only thing you can't do is return multiple result sets.
I've found that returning multiple result sets is somewhat error prone anyway; it does work in some cases but only if the application remembers to consume them all, otherwise the connection is left in a broken state.
You'll need to use MySQLI (MySQL Improved Extension) to call stored procedures. Here's how you would call an SP:
// arguments are host, user, password, database (placeholder variables)
$mysqli = new mysqli($host, $user, $pass, $db);
$result = $mysqli->query("CALL sp_mysp()");
When using SPs you'll need to close the first result set before running another query, or you'll receive an error. Here's some more information:
http://blog.rvdavid.net/using-stored-procedures-mysqli-in-php-5/
(broken link)
Alternatively, you can use Prepared Statements, which I find very straight-forward:
$stmt = $mysqli->prepare("SELECT Phone FROM MyTable WHERE Name=?");
$stmt->bind_param("s", $myName);
$stmt->execute();
MySQLI Documentation: http://no.php.net/manual/en/book.mysqli.php
I have been using ADODB, which is great for abstracting the actual commands so that code is portable between different SQL servers (e.g. MySQL to MSSQL). However, stored procedures do not appear to be directly supported. This means I have to run a SQL query as if it were a normal one, but use it to "call" the SP.
An example query:
$query = "Call HeatMatchInsert('$mMatch', '$mOpponent', '$mDate', $mPlayers, $mRound, '$mMap', '$mServer', '$mPassword', '$mGame', $mSeason, $mMatchType)";
This doesn't account for returned data, which is important. I'm guessing that this would be done by setting a @Var that you can then select yourself as the return @Variable.
In summary, although building my first PHP web app based on stored procedures was very difficult (MSSQL is very well documented, this is not), it's great once it's done: changes are very easy to make due to the separation.
I'm new to PHP, but not programming. Have come from an ASP [classic] background. In brief, I'm using PHP 5.4, with FastCGI on IIS7 and SQL Server 2005 Express. I've learnt the fundamentals, and have spent quite some time looking into security.
I'm sanitising both GET and POST input data. My db connection strings are in a separate file placed outside the web root. I'm using PDO prepared statements [though I've heard query+quote perform faster] with named placeholders along with db stored procedures.
I'm trying to understand why I would need to use additional arguments within the bindParam function, particularly data type options such as "PDO::PARAM_STR, 12" [the second argument in that example represents the data length, right?].
What are the benefits of specifying the data type and length within the bindParam? Is it needed if I'm using stored procedures in which the data type and length is already specified? Also, I believe I need to use something like "PDO::PARAM_INPUT_OUTPUT" to return a value from a stored proc?
Thanks!
** EDIT **
For some reason, if I use the PDO::PARAM_STR argument, my stored procs don't seem to write data into the db. So I omitted that argument. Here's my code:
$sql1 = $conn->prepare("EXEC insert_platts :userAgent, :userIp, 1, :source");
$sql1->bindParam(':userAgent', $userAgent);
$sql1->bindParam(':userIp', $userIp);
$sql1->bindParam(':source', $source);
$sql1->execute();
Also, rather than returning the identity value from the stored proc, I'm using lastInsertId() instead:
$lastRow = $conn->lastInsertId();
print $lastRow;
No, data type and data length are not needed. I'm using MySQL stored procs and the parameters are never typed values, although I validate them of course. I guess the reason for those arguments is extra security and INOUT params. Quote:
To return an INOUT parameter from a stored procedure, use the bitwise
OR operator to set the PDO::PARAM_INPUT_OUTPUT
Have you tried this?
$params = array(
':userAgent'=>$userAgent,
':userIp' => $userIp,
':source' => $source
);
$sql1 = $conn->prepare("EXEC insert_platts :userAgent, :userIp, 1, :source");
$sql1->execute($params);
About special characters: are you using the correct encodings? I mean the same encoding in the PHP app and in the DB. It is sometimes hard to work with one encoding in the scripts and another in the database, and problems like that arise very often.
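On the data-type question above, here is a minimal sketch of what the third argument changes. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database purely as a stand-in (the thread itself uses SQL Server); the helper name boundTypeOf is hypothetical. SQLite's typeof() reports the storage class of the value the driver actually sent, which makes the effect of the type hint visible.

```php
<?php
// Stand-in connection: in-memory SQLite (assumption; the thread uses SQL Server).
$conn = new PDO('sqlite::memory:');
$conn->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical helper: bind $value with the given PDO type and ask the
// database which storage class it received.
function boundTypeOf(PDO $conn, $value, int $pdoType): string
{
    $stmt = $conn->prepare('SELECT typeof(:v)');
    $stmt->bindValue(':v', $value, $pdoType);
    $stmt->execute();
    return (string) $stmt->fetchColumn();
}

$asString = boundTypeOf($conn, '42', PDO::PARAM_STR); // PARAM_STR is the default
$asInt    = boundTypeOf($conn, '42', PDO::PARAM_INT);

print "PARAM_STR was received as: $asString\n";
print "PARAM_INT was received as: $asInt\n";
```

How much the hint matters depends on the driver and on how strictly the database converts values. As far as I can tell from the PDO manual, the length argument is only needed when a parameter is an OUT or INOUT parameter of a stored procedure.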
I'm using PHP 5.3.6 with PDO to access Postgres 9.0.4. I've been asked to reduce the memory footprint of a report. The current implementation is simple: execute the query, do a fetchAll() and then iterate with foreach() through the resulting array. This obviously doesn't scale with huge result sets: it can temporarily consume 100MB or more.
I have a new implementation which takes the PDO statement handle and then iterates directly on it using foreach(), i.e. no intermediate array via fetchAll(). (From what I've read, iterating a statement handle with foreach calls fetch() under the covers.) This is just as fast and consumes way less memory: about 28kB. Still, I'm not confident I'm doing it right because, although I've done a ton of Googling, it's tough to find answers to basic questions about this:
I've seen articles that suggest solving my original problem using cursors. Does the Postgres PDO driver already use cursors internally? If writing my own SQL to create a cursor is required, I'm willing to, but I'd prefer to write the simplest code possible (but no simpler!).
If foreach calls fetch() each iteration, isn't that too network chatty? Or is it smart and fetches many rows at once, e.g. 500, to save bandwidth? (This may imply that it uses cursors internally.)
I've seen an article that wraps the statement handle in a class that implements Iterator interface. Isn't this redundant given that a PDO statement handle already does this? Or am I missing something?
My call to prepare the SQL statement looks like this:
$sth = $dbh->prepare($sql);
I found that it made no memory or speed difference if I did this:
$sth = $dbh->prepare($sql, array( PDO::ATTR_CURSOR => PDO::CURSOR_FWDONLY ) );
Is this because this is the default anyway for the Postgres PDO driver? This would make sense if it is already using cursors internally.
General comments about the approach and other ways to solve this problem are welcome.
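The two approaches compared in this question can be sketched side by side. This is only an illustration under stated assumptions: it uses an in-memory SQLite database via pdo_sqlite as a stand-in for Postgres, and the table name report and its amount column are made up for the example.

```php
<?php
// Stand-in data source: in-memory SQLite (assumption; the question uses Postgres).
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE report (id INTEGER PRIMARY KEY, amount INTEGER)');

$insert = $dbh->prepare('INSERT INTO report (amount) VALUES (:amount)');
for ($i = 1; $i <= 1000; $i++) {
    $insert->execute([':amount' => $i]);
}

// Approach 1: materialise every row in a PHP array, then iterate.
$sth = $dbh->prepare('SELECT amount FROM report');
$sth->execute();
$totalFetchAll = 0;
foreach ($sth->fetchAll(PDO::FETCH_ASSOC) as $row) {
    $totalFetchAll += (int) $row['amount'];
}

// Approach 2: iterate the statement handle itself, one row at a time.
$sth = $dbh->prepare('SELECT amount FROM report');
$sth->setFetchMode(PDO::FETCH_ASSOC);
$sth->execute();
$totalStreamed = 0;
foreach ($sth as $row) {
    $totalStreamed += (int) $row['amount'];
}

print "fetchAll total: $totalFetchAll, streamed total: $totalStreamed\n";
```

This only shows the PHP-side shape of the two loops. Whether rows are actually pulled from the server incrementally, or buffered by the client library first, depends on the driver and is the open part of the question.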
PDO for Postgres does use cursors internally.
Apparently PDO::CURSOR_FWDONLY does not use cursors. Black box tests:
(0) Preparations:
$con = new \PDO('dsn');
// you'll get "NO ACTIVE TRANSACTION" otherwise
$con->beginTransaction();
$sql = 'select * from largetable';
(1) Default - takes forever:
$stmt = $con->prepare($sql);
$stmt->execute();
print_r($stmt->fetch());
(2) FWDONLY - takes forever:
$stmt = $con->prepare($sql, array(\PDO::ATTR_CURSOR => \PDO::CURSOR_FWDONLY));
$stmt->execute();
print_r($stmt->fetch());
(3) SCROLLABLE - runs in a flash:
$stmt = $con->prepare($sql, array(\PDO::ATTR_CURSOR => \PDO::CURSOR_SCROLL));
$stmt->execute();
print_r($stmt->fetch());
I turned on PG logging just to be sure and it is indeed so - only SCROLL uses cursors.
So, the only way to make use of cursors is to use SCROLL, at least in PHP 5.4.23.
I simply want to execute a MySQL stored procedure. But I want to use the parameter parsing technique for all the usual reasons. So I've taken the example from the php manual here and now have this:
$stmt = $dbh->prepare("CALL update_bug_status(?,?)");
$stmt->bindParam(1, $bug_id);
$stmt->bindParam(2, $bug_status);
$stmt->execute();
The missing piece of the puzzle is the $dbh variable, which the manual seems to forget to mention!
I thought for $dbh I could use an ODBC connection variable like this:
$connection_string = "DRIVER={MySQL ODBC 5.1 Driver};Server=10.32.27.6;Database=bugs";
$dbh=odbc_connect($connection_string,'root','xxxxxx');
But this doesn't work because 'odbc_connect' simply returns an id number.
I've seen other examples that seem to make use of mysql specific functions. But I don't have these functions available so I want an answer that uses standard ODBC functions if possible.
You are using a PDO method on an ODBC connection (see the menu on the left to see which portion of the manual you are in). You should use odbc_prepare and odbc_execute instead, or, rather than calling odbc_connect, connect through a PDO driver.
Nowadays, "Prepared statements" seem to be the only way anyone recommends sending queries to a database. I even see recommendations to use prepared statements for stored procs. However, due to the extra query that prepared statements require, and the short time they last, I'm persuaded that they are only useful for a series of INSERT/UPDATE queries.
I'm hoping someone can correct me on this, but it just seems like a repeat of the whole "Tables are evil" CSS thing. Tables are only evil if used for layouts, not for tabular data. Using DIVs for tabular data is a style violation according to the W3C.
Likewise, plain SQL (or SQL generated from ARs) seems to be much more useful for 80% of the queries used, which on most sites are a single SELECT that is not repeated during that page load (I'm speaking about scripting languages like PHP here). Why would I make my over-taxed DB prepare a statement that is only going to run once before being removed?
MySQL:
A prepared statement is specific to
the session in which it was created.
If you terminate a session without
deallocating a previously prepared
statement, the server deallocates it
automatically.
So at the end of your script PHP will auto-close the connection and you will lose the prepared statement, only to have your script re-create it on the next load.
Am I missing something or is this just a way to decrease performance?
:UPDATE:
It dawned on me that I am assuming new connections for each script. I would assume that if a persistent connection is used then these problems would disappear. Is this correct?
:UPDATE2:
It seems that even if persistent connections are the solution - they are not a very good option for most of the web - especially if you use transactions. So I'm back to square one having nothing more than the benchmarks below to go on...
:UPDATE3:
Most people simply repeat the phrase "prepared statements protect against SQL injection", which doesn't fully explain the problem. The provided "escape" method for each DB library also protects against SQL injection. But it is more than that:
When sending a query the normal way,
the client (script) converts the data
into strings that are then passed to
the DB server. The DB server then uses
CPU power to convert them back into
the proper binary datatype. The
database engine then parses the
statement and looks for syntax errors.
When using prepared statements... the
data are sent in a native binary form,
which saves the conversion-CPU-usage,
and makes the data transfer more
efficient. Obviously, this will also
reduce bandwidth usage if the client
is not co-located with the DB server.
...The variable types are predefined,
and hence MySQL take into account
these characters, and they do not need
to be escaped.
http://www.webdesignforums.net/showthread.php?t=18762
Thanks to OIS for finally setting me straight on this issue.
Unlike the CSS tables debate, there are clear security implications with prepared statements.
If you use prepared statements as the ONLY way to put user-supplied data into a query, then they are absolutely bullet-proof when it comes to SQL injection.
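A small sketch of that claim, assuming pdo_sqlite is available and using an in-memory database as a stand-in; the users table and the hostile input string are made up for the example. The same input is first pasted into the SQL text and then passed as a bound value.

```php
<?php
// Stand-in database: in-memory SQLite (assumption, for a self-contained demo).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec("INSERT INTO users (name) VALUES ('alice'), ('bob')");

// Hostile input that matches every row if pasted into the SQL text.
$input = "nobody' OR '1'='1";

// Unsafe: the input becomes part of the statement itself.
$unsafe = $pdo->query("SELECT COUNT(*) FROM users WHERE name = '$input'");
$unsafeCount = (int) $unsafe->fetchColumn();

// Safe: the input travels as a bound value and is only ever compared as data.
$stmt = $pdo->prepare('SELECT COUNT(*) FROM users WHERE name = :name');
$stmt->execute([':name' => $input]);
$safeCount = (int) $stmt->fetchColumn();

print "concatenated query matched $unsafeCount rows, prepared query matched $safeCount\n";
```

The protection only holds for values. Identifiers such as table or column names cannot be bound as parameters, so they still need to come from a whitelist.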
When you execute a sql statement on the database, the sql parser needs to analyse it beforehand, which is the exact same process as the preparation.
So, comparing executing sql statements directly to preparing and executing has no disadvantages, but some advantages:
First of all, as longneck already stated, passing user input into a prepared statement escapes the input automatically. It is as if the database has prepared filters for the values and lets in only those values that fit.
Secondly, if you use prepared statements throughout and you end up in a situation where you need to execute one multiple times, you don't need to rewrite the code to prepare and execute; you just execute it again.
Thirdly: The code becomes more readable, if done properly:
$sql = 'SELECT u.id, u.user, u.email, sum(r.points)
FROM users u
LEFT JOIN reputation r on (u.id=r.user_id)
LEFT JOIN badge b on (u.id=b.user_id and badge=:badge)
WHERE group=:group';
$params = array(
':group' => $group,
':badge' => $_GET['badge']
);
$stmt = $pdo->prepare($sql);
$result = $stmt->execute($params);
Instead of
$sql = 'SELECT u.id, u.user, u.email, sum(r.points)
FROM users u
LEFT JOIN reputation r on (u.id=r.user_id)
LEFT JOIN badge b on (u.id=b.user_id and badge="'.mysql_real_escape_string($_GET['badge']).'")
WHERE group="'.mysql_real_escape_string($group).'"';
$result = mysql_query($sql);
Imagine you had to change the sql statement, which code would be your favourite? ;-)
Prepared Statements come in handy in several situations:
Great separation of query data from untrusted user data.
Performance increase when the same query is executed multiple times
Performance increase when binary data is being transmitted as the prepared statement can use the binary protocol, whereas a traditional query will end up doing encoding and such.
There is a performance hit under normal circumstances (not repeated, no binary data) as you now have to do two back and forths. The first to "prepare" the query, and the second to transmit the token along with the data to be inserted. Most people are willing to make this sacrifice for the security benefit.
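The "prepare once, execute many times" case can be sketched as follows. This is an illustration only: it assumes pdo_sqlite and an in-memory database as a stand-in for MySQL, borrows the pdo_insert table name from the benchmark in this thread, and says nothing about actual timings.

```php
<?php
// Stand-in database: in-memory SQLite (assumption; the thread is about MySQL).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE pdo_insert (id INTEGER PRIMARY KEY, name TEXT, author_id INTEGER)');

// Prepare once ...
$insert = $db->prepare('INSERT INTO pdo_insert (name, author_id) VALUES (:name, :author_id)');
$name = 'Jack';
$authorId = 0;
// bindParam() binds by reference: the current value is read at each execute().
$insert->bindParam(':name', $name);
$insert->bindParam(':author_id', $authorId, PDO::PARAM_INT);

// ... execute many times, changing only the bound variable.
for ($authorId = 1; $authorId <= 5; $authorId++) {
    $insert->execute();
}

$rowCount  = (int) $db->query('SELECT COUNT(*) FROM pdo_insert')->fetchColumn();
$authorSum = (int) $db->query('SELECT SUM(author_id) FROM pdo_insert')->fetchColumn();
print "inserted $rowCount rows, author_id sum $authorSum\n";
```

Only the parameter values change between executions; the statement text is sent and parsed once per prepare() call, which is where the saving for repeated queries comes from.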
With regards to persistent connections:
MySQL has one of the fastest connection build up times on the market. It's essentially free for most set ups, so you're not going to see too much of a change using persistent connections or not.
The answer has to do with security and abstraction. Everyone else has already mentioned security, but the real upside is that your input is completely abstracted from the query itself. This allows for a true database agnosticism when using an abstraction layer, whereas inlining the input is usually a database-dependent process. If you care anything for portability, prepared statements are the way to go.
In the real world, I rarely ever write DML queries. All of my INSERTS / UPDATES are automatically built by the abstraction layer and are executed by simply passing an input array. For all intents and purposes, there really is no "performance hit" for preparing queries and then executing them (save for connection latency in the initial PREPARE). But when using a UDS (Unix Domain Socket) connection, you're not going to notice (or even be able to benchmark) a difference. It's usually on the order of a few microseconds.
Given the security and abstraction upsides, I'd hardly call it wasteful.
The performance benefit doesn't come from less parsing - it comes from only having to calculate access paths once rather than repeatedly. This helps a lot when you're issuing thousands of queries.
Given mysql's very simple optimizer/planner this may be less of an issue than with a more mature database with much more sophisticated optimizers.
However, this performance benefit can actually turn into a detriment if you've got a sophisticated optimizer that is aware of data skews. In that case you can often be better off with getting a different access path for the same query using different literal values rather than reusing a preexisting path.
When using SQL queries like SELECT x,y,z FROM foo WHERE c='mary had a little lamb', the server has to parse the SQL statement including the data, and you have to sanitize the "mary had..." part (a call to mysql_real_escape_string() or similar for each parameter).
Using prepared statements, the server has to parse the statement too, but without the data, and sends back only an identifier for the statement (a tiny, tiny data packet). Then you send the actual data without first sanitizing it. I don't see the overhead here, though I freely admit I've never tested it. Have you? ;-)
edit: And using prepared statements can eliminate the need to convert each and every parameter (in/out) to strings. Probably even more so if your version of php uses mysqlnd (instead of the "old" libmysql client library). Haven't tested the performance aspect of that either.
I don't seem to be finding any good benefits to using persistent connections, or prepared statements for that matter. Look at these numbers: for 6000 select statements (which will never happen in a page request!) you can barely tell the difference. Most of my pages use fewer than 10 queries.
UPDATED I just revised my test to
include 4k SELECT and 4k INSERT
statements! Run it yourself and let me
know if there are any design errors.
Perhaps the difference would be greater if my MySQL server wasn't running on the same machine as Apache.
Persistent: TRUE
Prepare: TRUE
2.3399310112 seconds
Persistent: FALSE
Prepare: TRUE
2.3265211582184 seconds
Persistent: TRUE
Prepare: FALSE
2.3666892051697 seconds
Persistent: FALSE
Prepare: FALSE
2.3496441841125 seconds
Here is my test code:
$hostname = 'localhost';
$username = 'root';
$password = '';
$dbname = 'db_name';
$persistent = FALSE;
$prepare = FALSE;
try
{
// Force PDO to use exceptions for all errors
$attrs = array(PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION);
if($persistent)
{
// Make the connection persistent
$attrs[PDO::ATTR_PERSISTENT] = TRUE;
}
$db = new PDO("mysql:host=$hostname;dbname=$dbname", $username, $password, $attrs);
// What type of connection?
print 'Persistent: '.($db->getAttribute(PDO::ATTR_PERSISTENT) ? 'TRUE' : 'FALSE').'<br />';
print 'Prepare: '.($prepare ? 'TRUE' : 'FALSE').'<br />';
//Clean table from last run
$db->exec('TRUNCATE TABLE `pdo_insert`');
}
catch(PDOException $e)
{
echo $e->getMessage();
}
$start = microtime(TRUE);
$name = 'Jack';
$body = 'This is the text "body"';
if( $prepare ) {
// Select
$select = $db->prepare('SELECT * FROM pdo_insert WHERE id = :id');
$select->bindParam(':id', $x);
// Insert
$insert = $db->prepare('INSERT INTO pdo_insert (`name`, `body`, `author_id`)
VALUES (:name, :body, :author_id)');
$insert->bindParam(':name', $name);
$insert->bindParam(':body', $body);
$insert->bindParam(':author_id', $x);
$run = 0;
for($x=0;$x<4000;++$x)
{
if( $insert->execute() && $select->execute() )
{
$run++;
}
}
}
else
{
$run = 0;
for($x=0;$x<4000;++$x) {
// Insert
if( $db->query('INSERT INTO pdo_insert (`name`, `body`, `author_id`)
VALUES ('.$db->quote($name).', '. $db->quote($body).', '. $db->quote($x).')')
AND
// Select
$db->query('SELECT * FROM pdo_insert WHERE id = '. $db->quote($x)) )
{
$run++;
}
}
}
print (microtime(true) - $start).' seconds and '.($run * 2).' queries';
Cassy is right. If you don't prepare/compile it, the DBMS would have to do so anyway before being able to run it.
Also, an advantage is that you can check the prepare result, and if the prepare fails your code can branch off to handle the error without wasting DB resources on running the failing query.
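A minimal sketch of branching on the prepare result, assuming pdo_sqlite and an in-memory database as a stand-in; the helper name tryPrepare and the table names are hypothetical.

```php
<?php
// Stand-in database: in-memory SQLite (assumption, for a self-contained demo).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE foo (x INTEGER)');

// Hypothetical helper: returns a prepared statement, or null if the
// database rejected the SQL at prepare time.
function tryPrepare(PDO $db, string $sql): ?PDOStatement
{
    try {
        $stmt = $db->prepare($sql);
        return $stmt === false ? null : $stmt;
    } catch (PDOException $e) {
        return null;
    }
}

$good = tryPrepare($db, 'SELECT x FROM foo WHERE x = :x');
$bad  = tryPrepare($db, 'SELECT x FROM no_such_table WHERE x = :x');

print 'valid statement prepared: ' . ($good !== null ? 'yes' : 'no') . "\n";
print 'invalid statement prepared: ' . ($bad !== null ? 'yes' : 'no') . "\n";
```

One caveat worth checking for your own setup: as far as I know, the PDO MySQL driver emulates prepared statements by default, in which case a bad statement is only reported at execute() unless PDO::ATTR_EMULATE_PREPARES is set to false.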