Autorollback in postgres using PDO - php

I found out that Postgres + PDO automatically rolls back previous changes in a transaction when an exception is thrown (EVEN WHEN THE EXCEPTION IS CAUGHT AND SWALLOWED!). Example (in pseudo-code):
$transaction->begin();
try {
    $manager->insert("INSERT ...");
    try {
        $manager->exec("A QUERY BREAKING SOME DB CONSTRAINT LIKE A UNIQUE INDEX ...");
    } catch (\Exception $ex) {
        // IT IS CAUGHT AND SWALLOWED!
    }
    $transaction->commit();
} catch (Exception $ex) {
    $transaction->rollback(); // THIS CLEARLY DOES NOT RUN!
}
In Postgres the first insert gets reverted; in MySQL it does not.
Can anyone shed some light on the matter? Is it possible to change this ridiculous behaviour? I would like to perform my rollbacks myself and not have Postgres do it when it thinks it is appropriate.

That's not PDO's fault, it's inherent to PostgreSQL's transaction management. See:
How can I tell PostgreSQL not to abort the whole transaction when a single constraint has failed?
Can I ask Postgresql to ignore errors within a transaction
Rollback after error in transaction
PostgreSQL doesn't roll the transaction back, but it sets it to an aborted state where it can only roll back, and where all statements except ROLLBACK report an error:
ERROR: current transaction is aborted, commands ignored until end of transaction block
(I'm surprised not to find this referred to in the official documentation; think I'll need to write a patch to improve that.)
So. When you try/catch and swallow the exception in PDO, you're trapping a PHP-side exception, but you're not changing the fact that the PostgreSQL transaction is in an aborted state.
If you want to be able to swallow exceptions and keep on using the transaction, you must create a SAVEPOINT before each statement that might fail. If the statement fails, you must ROLLBACK TO SAVEPOINT ...; if it succeeds, you may RELEASE SAVEPOINT ...;. This imposes extra overhead on the database for transaction management, adds round-trips, and burns through transaction IDs faster (which means PostgreSQL has to do more background cleanup work).
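For example, a rough PDO sketch of that savepoint pattern (assuming ERRMODE_EXCEPTION is enabled; the table and column names here are made up for illustration):

$pdo->beginTransaction();
try {
    $pdo->exec("INSERT INTO audit_log (msg) VALUES ('attempting risky insert')");

    $pdo->exec("SAVEPOINT before_risky_insert");
    try {
        // This statement may violate a unique constraint.
        $pdo->exec("INSERT INTO users (email) VALUES ('duplicate@example.com')");
        $pdo->exec("RELEASE SAVEPOINT before_risky_insert");
    } catch (PDOException $ex) {
        // Undo only the failed statement; the rest of the transaction stays usable.
        $pdo->exec("ROLLBACK TO SAVEPOINT before_risky_insert");
    }

    $pdo->commit();
} catch (PDOException $ex) {
    $pdo->rollBack();
    throw $ex;
}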
It is generally preferable to instead design your SQL so it won't fail under normal circumstances. For example, you can validate most constraints client-side, treating the server side constraints as a second level of assurance while trapping most errors client-side.
Where that's impractical, make your application fault tolerant so it can retry a failed transaction. Sometimes this is necessary anyway; for example, you generally can't use savepoints to recover from deadlock-induced transaction aborts or serialization failures. It can also be useful to keep failure-prone transactions as short as possible, doing just the minimum work required, so you have less to keep track of and repeat.
So: Where possible, instead of swallowing an exception, run failure-prone database code in a retry loop. Make sure your code keeps a record of the information it needs to retry the whole transaction on error, not just the most recent statement.
Remember, any transaction can fail: The DBA might restart the database to apply a patch, the system might run out of RAM due to a runaway cron job, etc. So failure tolerant apps are a good design anyway.
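For instance, a rough sketch of such a retry loop with PDO (doWork() is a hypothetical function holding every statement of the transaction; 40001 and 40P01 are PostgreSQL's SQLSTATE codes for serialization failure and deadlock):

$maxAttempts = 3;
for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
    $pdo->beginTransaction();
    try {
        doWork($pdo);   // every read and write the transaction needs
        $pdo->commit();
        break;          // success, stop retrying
    } catch (PDOException $e) {
        $pdo->rollBack();
        $sqlState = $e->errorInfo[0] ?? null;
        // Retry only transient failures: serialization failure or deadlock.
        if (!in_array($sqlState, ['40001', '40P01'], true) || $attempt === $maxAttempts) {
            throw $e;   // not retryable, or out of attempts
        }
    }
}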
Props to you for at least using PDO exceptions and handling exceptions - you're way ahead of most devs already.

Related

Is it necessary to rollback if commit fails?

This seems like a simple enough question, yet I couldn't find any definitive answers specific to MySQL. Look at this:
$mysqli->autocommit(false); // Start the transaction
$success = true;

/* do a bunch of inserts here, which will be rolled back and
   set $success to false if they fail */

if ($success) {
    if ($mysqli->commit()) {
        /* display success message, possibly redirect to another page */
    } else {
        /* display error message */
        $mysqli->rollback(); // <----------- Do I need this?
    }
}

$mysqli->autocommit(true); // Turns autocommit back on (will be turned off again if needed)
// Keep running regardless, possibly executing more inserts
The thing is, most examples I have seen just end the script if committing failed, either by letting it finish or by explicitly calling exit(), which apparently rolls back the transaction automatically(?). But what if I need to keep running and possibly execute more database-altering operations later? If this commit failed and I didn't have that rollback there, would turning autocommit back on (which, according to this comment on the PHP manual's entry on autocommit, does commit pending operations), or even explicitly calling another $mysqli->commit() later on, attempt to commit the previous inserts again, since the commit failed before and they weren't rolled back?
I hope I've been clear enough and that I can get a definitive answer for this, which has been bugging me quite a lot.
Edit: OK, maybe I phrased my question wrong in that line comment. It's not really a matter of whether or not I need the rollback, which, as was pointed out, would depend on my use case, but rather what the effect of having or not having a rollback in that line is. Perhaps a simpler question would be: does a failed commit call discard pending operations, or does it just leave them in their pending state, waiting for a rollback or another commit?
If you are NOT re-using the connection and it is closed immediately after the transaction fails, closing the connection would cause an implicit rollback anyway.
If you are re-using the connection, you should definitely do a rollback to avoid inconsistency with any follow-up statements.
And if you are not really re-using it but it is still in a blocking state (e.g. being left open for a couple of seconds or even minutes, depending on whether you're on a website or in a cronjob), keep in mind that there can be many concurrent connections going on. So if you have a very large transaction that the server needs to hold in a temporary state, which might consume lots of memory (e.g. if you're doing a major database migration that affects lots of columns or tables), you should definitely do an explicit rollback, or close the connection for an implicit rollback, after it fails.
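A rough mysqli sketch of that case, where the connection is re-used after a failed commit (the queries are just placeholders):

$mysqli->autocommit(false);
$mysqli->query("INSERT INTO orders (item) VALUES ('foo')");

if (!$mysqli->commit()) {
    // The commit failed but we keep using this connection, so clear the
    // pending transaction explicitly before doing anything else.
    $mysqli->rollback();
}

$mysqli->autocommit(true);
// Now it is safe to run further, unrelated statements on the same connection.
$mysqli->query("INSERT INTO audit_log (msg) VALUES ('continuing')");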
Another factor: if you have lots of concurrent connections in different processes, they may or may not already see parts of the transaction, even if it's not committed yet, depending on the transaction isolation level that you are using. See also:
https://dev.mysql.com/doc/refman/8.0/en/innodb-transaction-isolation-levels.html

With PHP and MySQL, should you check for rollback failures?

I'm using PHP's mysqli library. Database inserts and updates are always in a try-catch block. Success of each query is checked immediately (if $result === false), and any failure throws an exception. The catch calls mysqli_rollback() and exits with a message for the user.
My question is, should I bother checking the return value of mysqli_rollback()? If so, and rollback fails, what actions should the code take?
I have a hard time understanding how a rollback could fail (barring some atrocious bug in MySQL). And since PHP is going to exit anyway, calling rollback almost seems superfluous. I certainly think it should be in the code for clarity, but when PHP exits it will close the connection to MySQL and uncommitted transactions are rolled back automatically.
If the rollback fails (a connection failure, for example), the changes will be rolled back when the connection closes anyway, so you don't need to handle the error. When you are in a transaction, unless there is an explicit commit (or you are running in autocommit mode, which means there is a commit after each statement), the transaction is rolled back:
If a session that has autocommit disabled ends without explicitly committing the final transaction, MySQL rolls back that transaction.
The only case where you would want to handle a rollback error is if you are not exiting from your script but are starting a new transaction later, since starting a transaction will implicitly commit the current one. Check out Statements That Cause an Implicit Commit.
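A small sketch of that case (only worth checking because the script keeps running afterwards):

if (!$mysqli->rollback()) {
    // The old transaction may still be pending. Starting a new transaction
    // would implicitly commit whatever is still pending, so bail out instead.
    error_log('rollback failed: ' . $mysqli->error);
    exit;
}
// Otherwise it is safe to start the next transaction.
$mysqli->begin_transaction();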
mysqli_rollback can fail if you're not (or never were) connected to the database. It depends on your error handling beforehand.

Concurrency in Doctrine

I have an application running on a PHP + MySQL platform, using the Doctrine2 framework. I need to execute 3 DB queries during one HTTP request: first an INSERT, second a SELECT, third an UPDATE. The UPDATE depends on the result of the SELECT query. There is a high probability of concurrent HTTP requests. If such a situation occurs and the DB queries get mixed up (e.g. INS1, INS2, SEL1, SEL2, UPD1, UPD2), it will result in data inconsistency. How do I ensure the atomicity of the INS-SEL-UPD operation? Do I need to use some kind of locks, or are transactions sufficient?
The answer from @YaK is actually a good answer. You should know how to deal with locks in general.
Addressing Doctrine2 specifically, your code should look like:
$em->getConnection()->beginTransaction();
try {
    $toUpdate = $em->find('Entity\WhichWillBeUpdated', $id, \Doctrine\DBAL\LockMode::PESSIMISTIC_WRITE);
    // this will append FOR UPDATE
    // http://docs.doctrine-project.org/en/2.0.x/reference/transactions-and-concurrency.html
    $em->persist($anInsertedOne);
    // you can flush here as well, to obtain the ID after insert if needed
    $toUpdate->changeValue('new value');
    $em->persist($toUpdate);
    $em->flush();
    $em->getConnection()->commit();
} catch (\Exception $e) {
    $em->getConnection()->rollback();
    throw $e;
}
Every subsequent request that fetches the same row for update will wait until the transaction of the process that acquired the lock finishes. MySQL will release the lock automatically after the transaction is finished, whether it succeeded or failed. By default the InnoDB lock wait timeout is 50 seconds, so if your process does not finish the transaction within 50 seconds it will roll back and release the lock automatically. You do not need any additional fields on your entity.
A table-wide LOCK is guaranteed to work in all situations, but it is quite bad because it prevents concurrency rather than dealing with it.
However, if your script holds the lock for a very short time frame, it might be an acceptable solution.
If your table uses the InnoDB engine (MyISAM does not support transactions), a transaction is the most efficient solution, but also the most complex.
For your very specific need (in the same table: first INSERT, second SELECT, third UPDATE depending on the result of the SELECT query):
Start a transaction
INSERT your records. Other transactions will not see these new rows until your own transaction is committed (unless you use a non-standard isolation level)
SELECT your record(s) with SELECT ... LOCK IN SHARE MODE. You now have a READ lock on these rows; no one else may change them. (*)
Compute whatever you need to compute to determine whether or not you need to UPDATE something.
UPDATE the rows if required.
Commit
Expect errors at any time. If a deadlock is detected, MySQL may decide to ROLLBACK your transaction to escape the deadlock. If another transaction is updating the rows you are trying to read, your transaction may be locked for some time, or even time out.
The atomicity of your transaction is guaranteed if you proceed this way.
(*) In general, rows not returned by this SELECT may still be inserted by a concurrent transaction; that is, their non-existence is not guaranteed throughout the course of the transaction unless proper precautions are taken.
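A rough PDO sketch of those steps (plain PDO rather than Doctrine, with invented table and column names for illustration):

$pdo->beginTransaction();
try {
    $pdo->prepare("INSERT INTO counters (name, value) VALUES (?, 0)")
        ->execute([$name]);

    $stmt = $pdo->prepare("SELECT value FROM counters WHERE name = ? LOCK IN SHARE MODE");
    $stmt->execute([$name]);
    $value = (int) $stmt->fetchColumn();   // the selected rows are now read-locked

    if ($value < 10) {   // whatever computation decides if an UPDATE is needed
        $pdo->prepare("UPDATE counters SET value = value + 1 WHERE name = ?")
            ->execute([$name]);
    }

    $pdo->commit();
} catch (PDOException $e) {
    $pdo->rollBack();   // a deadlock or lock wait timeout can happen at any step
    throw $e;
}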
Transactions won't prevent thread B from reading values that thread A has not locked.
So you must use locks to prevent concurrent access.
@Gediminas explained how you can use locks with Doctrine.
But using locks can result in deadlocks or lock timeouts.
Doctrine renders these SQL errors as RetryableExceptions.
These exceptions are often normal if you are in a high concurrency environment.
They can happen very often and your application should handle them properly.
Each time a RetryableException is thrown by Doctrine, the proper way to handle this is to retry the whole transaction.
As easy as it seems, there is a trap. The Doctrine 2 EntityManager becomes unusable after a RetryableException and you must create a fresh one to replay your whole transaction.
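A rough sketch of that retry pattern; createEntityManager() and doTransactionalWork() are hypothetical helpers standing in for however your application builds an EntityManager and performs the transactional work:

use Doctrine\DBAL\Exception\RetryableException;

$maxAttempts = 3;
for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
    $em = createEntityManager();      // a fresh EntityManager for every attempt
    $em->getConnection()->beginTransaction();
    try {
        doTransactionalWork($em);     // all reads and writes of the transaction
        $em->flush();
        $em->getConnection()->commit();
        break;                        // success
    } catch (RetryableException $e) {
        $em->getConnection()->rollBack();
        $em->close();                 // this EntityManager is no longer usable
        if ($attempt === $maxAttempts) {
            throw $e;
        }
    }
}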
I wrote this article illustrated with a full example.

How to handle MySQL deadlock situations on an application level?

When a deadlock situation occurs in MySQL/InnoDB, it returns this familiar error:
'Deadlock found when trying to get lock; try restarting transaction'
So what I did was record all queries that go into a transaction so that they can simply be reissued if a statement in the transaction fails. Simple.
Problem: When you have queries that depend on the results of previous queries, this doesn't work so well.
For Example:
START TRANSACTION;
INSERT INTO some_table ...;
-- Application here gets ID of thing inserted: $id = $database->LastInsertedID()
INSERT INTO some_other_table (id,data) VALUES ($id,'foo');
COMMIT;
In this situation, I can't simply reissue the transaction as it was originally created. The ID acquired by the first SQL statement is no longer valid after the transaction fails but is used by the second statement. Meanwhile, many objects have been populated with data from the transaction which then become obsolete when the transaction gets rolled back. The application code itself does not "roll back" with the database of course.
Question is: How can I handle these situations in the application code? (PHP)
I'm assuming two things. Please tell me if you think I'm on the right track:
1) Since the database can't just reissue a transaction verbatim in all situations, my original solution doesn't work and should not be used.
2) The only good way to do this is to wrap any and all transaction-issuing code in its own try/catch block and attempt to reissue the code itself, not just the SQL.
Thanks for your input. You rock.
A transaction can fail. Deadlock is one case of failure, and you can get more failures at serializable isolation levels as well. Transaction isolation problems are a nightmare. Trying to avoid failures altogether is the wrong approach, I think.
I think any well-written transaction code should be prepared for failing transactions.
As you have seen, recording queries and replaying them is not a solution, because when you restart your transaction the database has moved on. If it were a valid solution, the SQL engine would certainly do it for you. For me the rules are:
redo all your reads inside the transaction (any data you read outside of it may have been altered)
throw away everything from the previous attempt; if you have written things outside of the transaction (logs, LDAP, anything outside the DBMS) it should be cancelled because of the rollback
redo everything, in fact :-)
This means a retry loop.
So you have your try/catch block with the transaction inside. You need to add a while loop with maybe 3 attempts, and you leave the while loop if the commit part of the code succeeds. If after 3 retries the transaction is still failing, throw an Exception to the user, so that you do not end up in an infinite retry loop (you may have a really big problem, in fact). Note that you should handle plain SQL errors and deadlock/serialization exceptions in different ways. 3 is an arbitrary number; you may try a bigger number of attempts.
This may give something like that:
$retry = 0;
$notdone = TRUE;
while ($notdone && $retry < 3) {
    try {
        $transaction->begin();
        do_all_the_transaction_stuff();
        $transaction->commit();
        $notdone = FALSE;
    } catch (Exception $e) {
        // here we could differentiate basic SQL errors and deadlock/serializable errors
        $transaction->rollback();
        undo_all_non_database_stuff();
        $retry++;
    }
}
if (3 == $retry) {
    throw new Exception("Try later, sorry, too many guys out there, or it's not your day.");
}
And that means all the stuff (reads, writes, functional things) must be enclosed in do_all_the_transaction_stuff(). This implies the transaction-managing code sits in the controllers, the high-level application code, not split across several low-level database-access model objects.

If a PHP PDO transaction fails, must I rollback() explicitly?

I've seen a code example where someone does a
$dbh->rollback();
when a PDOException occurs. I thought the database would roll back automatically in such a case?
If you don't commit nor roll back an opened transaction, and it's not committed anywhere later in your script, it won't be committed (as seen by the database engine), and will automatically be rolled back at the end of your script.
Still, I (well, almost) always explicitly commit or roll back the transactions I open, so:
There is no risk of an error (like committing "by mistake" later in the script)
The code is easier to read / understand: when someone sees $db->rollback(), they know I want the transaction rolled back for sure, and they don't have to think "did he really want to roll back, or did he forget something? And what about later in the script?"
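In practice that just means the usual explicit pattern, something like this minimal sketch (the UPDATE statements are placeholders):

$dbh->beginTransaction();
try {
    $dbh->exec("UPDATE accounts SET balance = balance - 10 WHERE id = 1");
    $dbh->exec("UPDATE accounts SET balance = balance + 10 WHERE id = 2");
    $dbh->commit();
} catch (PDOException $e) {
    $dbh->rollBack();   // explicit, even though the end of the script would undo it anyway
    throw $e;           // or handle / log the error
}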
The DB engine doesn't "see" the PDOException: it is thrown by PHP under various conditions, but the database doesn't roll back anything by itself:
either a transaction is committed
or it's rolled back
or it's not explicitly committed nor rolled back, which means it's not committed, which means what's been modified is not "really" modified
