Deadlocks and Timeouts break ACID transactions - php

I've got a transactional application that works like so:
try {
    $db->begin();
    increaseNumber();
    $db->commit();
} catch (Exception $e) {
    $db->rollback();
}
And then inside the increaseNumber() I'll have a query like so, which is the only function that works with this table:
// I use FOR UPDATE so that nobody else can read this row until it's been updated
$result = $db->select("SELECT item1
    FROM units
    WHERE id = '{$id}'
    FOR UPDATE");
$result = $db->update("UPDATE units SET item1 = item1 + 1
    WHERE id = '{$id}'");
Everything is wrapped in a transaction, but lately I've been dealing with some pretty slow queries, and there's a lot of concurrency in my application, so I can't really guarantee that queries run in a specific order.
Can deadlocks cause ACID transactions to break? I have one function that adds something and another that removes it, but when deadlocks occur I find the data completely out of sync, as if the transactions were ignored.
Is this bound to happen or is something else wrong?
Thanks, Dominic

Well, if a transaction runs into a lock (from another transaction) that isn't released, it will fail after a timeout; I believe the default is 50 seconds. You should check whether anyone is using third-party applications on the database. I know for a fact that, for example, SQL Manager 2007 does not release locks on InnoDB unless you disconnect from the database (sometimes it only takes a Commit Transaction on ... well, everything), which causes a lot of queries to fail after the timeout. Of course, if your transactions are ACID-compliant, each one executes all-or-nothing. Consistency breaks only if you split related changes across separate transactions.
You can try extending the timeout, but a lock held for that long might imply deeper problems. It depends, of course, on which storage engine you're using (from the MySQL tag and the use of transactions I assumed InnoDB).
You can also try turning on query profiling to see if any queries run for a ridiculous amount of time. Just note that profiling does, of course, decrease performance, so it may not be a production solution.
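For concreteness, here is a hedged sketch of both suggestions, shown with PDO since the original $db wrapper's API isn't documented; the 120-second value is an arbitrary example:

// Raise the InnoDB lock wait timeout for this session (the default is 50
// seconds) and profile statements to find slow ones.
$pdo->exec("SET SESSION innodb_lock_wait_timeout = 120");
$pdo->exec("SET profiling = 1");

// ... run the suspect queries here ...

// SHOW PROFILES lists recent statements with their durations.
foreach ($pdo->query("SHOW PROFILES") as $row) {
    printf("#%d %.6fs %s\n", $row['Query_ID'], $row['Duration'], $row['Query']);
}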

The A in ACID stands for Atomic, so no, deadlocks cannot make an ACID transaction break; rather, the transaction simply will not happen, in all-or-nothing fashion.
More likely, if you see inconsistent data, your application is doing multiple "transactions" in what is logically a single transaction, e.g.: the user creates an account (transaction begin ... commit), the user sets a password (transaction begin ... deadlock ... rollback), your application ignores the error and continues, and now your database is left with a created user and no password.
Look at what else your application is doing besides the rollback, and whether, logically, the consistent data is built up in multiple parts.
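To make that concrete, a minimal sketch (PDO and made-up table/column names, not the asker's schema): keep the logical unit in one transaction and do not swallow the failure.

$pdo->beginTransaction();
try {
    // both steps are one logical unit, so they share one transaction
    $pdo->prepare("INSERT INTO users (name) VALUES (?)")->execute([$name]);
    $pdo->prepare("UPDATE users SET password = ? WHERE name = ?")
        ->execute([$hash, $name]);
    $pdo->commit();
} catch (Exception $e) {
    $pdo->rollBack();
    throw $e; // surface the failure instead of continuing as if it worked
}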

Related

Two transactions on two databases

I need to put multiple values into 2 databases. The thing is, if one of those INSERTs fails, I need all the others to roll back.
The question
Is it possible to run two transactions simultaneously, inserting some values into the databases, and then commit or roll back both of them?
The Code
$res = new ResultSet();   // class connecting to and letting me query the first database
$res2 = new ResultSet2(); // the other database
$res->query("BEGIN");
$res2->query("BEGIN");
try
{
    $res->query("INSERT xxx~~") or wyjatek('rollback');  // wyjatek() throws an exception if the query fails
    $res2->query("INSERT yyy~~") or wyjatek('rollback');
    ......
    // if everything goes well
    $res->query("COMMIT");
    $res2->query("COMMIT");
    // SHOW some GREEN text saying it's done
}
catch (Exception $e)
{
    // if wyjatek throws an exception
    $res->query("ROLLBACK");
    $res2->query("ROLLBACK");
    // SHOW some RED text saying it failed
}
Summary
So is this the proper way, and will it even work?
All tips appreciated.
Theoretically
If you remove
or wyjatek('rollback')
your script will work.
But looking at the documentation:
Transactions are isolated within a single database. If you want transactions that span multiple databases in MySQL, look at XA transactions.
Support for XA transactions is available for the InnoDB storage engine. XA supports distributed transactions, that is, the ability to permit multiple separate transactional resources to participate in a global transaction. Transactional resources often are RDBMSs but may be other kinds of resources.
An application performs actions that involve different database servers, such as a MySQL server and an Oracle server (or multiple MySQL servers), where actions that involve multiple servers must happen as part of a global transaction, rather than as separate transactions local to each server.
The XA Specification. This document is published by The Open Group and available at http://www.opengroup.org/public/pubs/catalog/c193.htm
What about letting PostgreSQL do the dirty work?
http://www.postgresql.org/docs/9.1/static/warm-standby.html#SYNCHRONOUS-REPLICATION
What you propose will almost always work. But for some uses, 'almost always' is not good enough.
If you have deferred constraints, the commit on $res2 could fail on a constraint violation, and then it is too late to roll back $res.
Or one of your servers or the network could fail between the first commit and the second. If PHP, database1, and database2 are all on the same hardware, the window for this failure mode is pretty small, but not negligible.
If 'almost always' is not good enough, and you cannot migrate one set of data to live inside the other database, then you might need to resort to "prepared transactions".
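For illustration, a hedged sketch of the XA flow across two MySQL connections; $pdo1/$pdo2 and the INSERT statements are assumptions, and a real deployment also has to handle recovery of prepared-but-uncommitted branches (XA RECOVER):

$xid = "'tx" . uniqid() . "'"; // quoted transaction identifier
try {
    $pdo1->exec("XA START $xid");
    $pdo2->exec("XA START $xid");

    $pdo1->exec("INSERT INTO table1 (col) VALUES ('x')"); // placeholder statements
    $pdo2->exec("INSERT INTO table2 (col) VALUES ('y')");

    $pdo1->exec("XA END $xid");
    $pdo2->exec("XA END $xid");

    // Phase 1: both servers promise they will be able to commit.
    $pdo1->exec("XA PREPARE $xid");
    $pdo2->exec("XA PREPARE $xid");

    // Phase 2: commit both branches.
    $pdo1->exec("XA COMMIT $xid");
    $pdo2->exec("XA COMMIT $xid");
} catch (Exception $e) {
    foreach ([$pdo1, $pdo2] as $pdo) {
        // XA END is needed before rollback if the branch is still active;
        // either call may legitimately fail depending on how far it got.
        try { $pdo->exec("XA END $xid"); } catch (Exception $ignore) {}
        try { $pdo->exec("XA ROLLBACK $xid"); } catch (Exception $ignore) {}
    }
}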

PDO Transactions Locks?

I have done extensive research on MySQL transactions via the PHP PDO interface. I am still a little fuzzy about the actual background workings of the transaction methods. Specifically, I need to know whether there is any reason I should avoid having all my queries (SELECTs included) inside a transaction spanning from the beginning of the script to the end, handling any error in the transaction and rolling back if need be.
I want to know if there is any locking going on during a transaction and, if so, whether it is row-level locking since the table is InnoDB.
Don't do that.
The reason is that transactions take advantage of MVCC, a mechanism by which every updated piece of data is in fact not updated in place but written somewhere else as a new version.
MVCC implies allocating memory and/or storage space to accumulate all of the changes you send without committing them to disk until you issue a COMMIT.
That means that while your entire script runs, all changes are kept pending until the script ends, and all of the records you change during the transaction are marked as "work in progress" so that other processes/threads know this data will soon be invalidated.
Having certain pieces of data marked as "work in progress" for the entire length of the script means that any other concurrent update will see the flag and say "I have to wait until this finishes so I get the most recent data".
This includes SELECTs, depending on isolation level. Selecting data that is marked as "work in progress" may not be what you want, because some tables you join may already contain updated data while other tables are not updated yet, resulting in a dirty read.
Transactionality and atomicity of operations are desirable but costly. Use them where they're needed. Yes, that means more work for you to figure out where race conditions can happen, and even when race conditions occur you have to decide whether they are really critical or whether "some" data loss/mixing is acceptable.
Would you like your logs, visit counters, and other statistics to drag down the speed of your entire site? Or is the quality of that information sacrificable for speed (as long as it's not an analytics suite, you can afford the occasional collision)?
Would you like a seat reservation application to misfire and allow more users to grab a seat even after the seat count was 0? Of course not: here you want to leverage transactions and isolation levels to ensure that never happens.
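As a sketch of that last case (table and column names are invented, not from the question), a transaction plus a row lock is what prevents overselling:

$pdo->beginTransaction();
try {
    // Lock the row so concurrent reservations serialize on it.
    $stmt = $pdo->prepare("SELECT remaining FROM seats WHERE event_id = ? FOR UPDATE");
    $stmt->execute([$eventId]);
    $remaining = (int) $stmt->fetchColumn();

    if ($remaining <= 0) {
        throw new RuntimeException("Sold out");
    }

    $pdo->prepare("UPDATE seats SET remaining = remaining - 1 WHERE event_id = ?")
        ->execute([$eventId]);
    $pdo->commit();
} catch (Exception $e) {
    $pdo->rollBack();
    throw $e;
}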

Concurrency in Doctrine

I have an application running on a PHP + MySQL platform, using the Doctrine2 framework. I need to execute 3 DB queries during one HTTP request: first an INSERT, second a SELECT, third an UPDATE. The UPDATE depends on the result of the SELECT query. There is a high probability of concurrent HTTP requests. If such a situation occurs and the DB queries get interleaved (e.g. INS1, INS2, SEL1, SEL2, UPD1, UPD2), it will result in data inconsistency. How do I ensure atomicity of the INS-SEL-UPD operation? Do I need to use some kind of locks, or are transactions sufficient?
The answer from @YaK is actually a good answer. You should know how to deal with locks in general.
Addressing Doctrine2 specifically, your code should look like:
$em->getConnection()->beginTransaction();
try {
    $toUpdate = $em->find('Entity\WhichWillBeUpdated', $id, \Doctrine\DBAL\LockMode::PESSIMISTIC_WRITE);
    // this will append FOR UPDATE http://docs.doctrine-project.org/en/2.0.x/reference/transactions-and-concurrency.html
    $em->persist($anInsertedOne);
    // you can flush here as well, to obtain the ID after insert if needed
    $toUpdate->changeValue('new value');
    $em->persist($toUpdate);
    $em->flush();
    $em->getConnection()->commit();
} catch (\Exception $e) {
    $em->getConnection()->rollback();
    throw $e;
}
Every subsequent request that fetches the same row for update will wait until the transaction of the process holding the lock finishes. MySQL releases the lock automatically when the transaction either commits or fails. By default, the InnoDB lock wait timeout is 50 seconds, so if your process does not finish its transaction within 50 seconds, it rolls back and releases the lock automatically. You do not need any additional fields on your entity.
A table-wide LOCK is guaranteed to work in all situations, but it is quite a blunt instrument because it prevents concurrency rather than dealing with it.
However, if your script holds the lock for a very short time frame, it might be an acceptable solution.
If your table uses the InnoDB engine (MyISAM has no support for transactions), a transaction is the most efficient solution, but also the most complex.
For your very specific need (in the same table, first INSERT, second SELECT, third UPDATE depending on the result of the SELECT query), the steps are as follows (a code sketch follows after the footnote below):
Start a transaction
INSERT your records. Other transactions will not see these new rows until your own transaction is committed (unless you use a non-standard isolation level)
SELECT your record(s) with SELECT...LOCK IN SHARE MODE. You now have a READ lock on these rows, no one else may change these rows. (*)
Compute whatever you need to compute to determine whether or not you need to UPDATE something.
UPDATE the rows if required.
Commit
Expect errors at any time. If a deadlock is detected, MySQL may decide to ROLLBACK your transaction to escape the deadlock. If another transaction is updating the rows you are trying to read, your transaction may be blocked for some time, or even time out.
The atomicity of your transaction is guaranteed if you proceed this way.
(*) In general, rows not returned by this SELECT may still be inserted by a concurrent transaction; that is, non-existence is not guaranteed throughout the course of the transaction unless proper precautions are taken.
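A minimal sketch of those steps (PDO, with an invented items table standing in for the real schema):

$pdo->beginTransaction();
try {
    // 1. INSERT; invisible to other transactions until we commit
    $pdo->prepare("INSERT INTO items (name) VALUES (?)")->execute([$name]);

    // 2. SELECT ... LOCK IN SHARE MODE: a read lock on the matching rows
    $stmt = $pdo->prepare("SELECT COUNT(*) FROM items WHERE name = ? LOCK IN SHARE MODE");
    $stmt->execute([$name]);

    // 3-4. compute, then UPDATE only if required
    if ((int) $stmt->fetchColumn() > 1) {
        $pdo->prepare("UPDATE items SET duplicate = 1 WHERE name = ?")->execute([$name]);
    }

    // 5. commit
    $pdo->commit();
} catch (Exception $e) {
    $pdo->rollBack(); // e.g. a detected deadlock; retry if appropriate
    throw $e;
}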
Transactions alone won't prevent thread B from reading values that thread A has not locked,
so you must use locks to prevent concurrent access.
@Gediminas explained how you can use locks with Doctrine.
But using locks can result in deadlocks or lock timeouts.
Doctrine renders these SQL errors as RetryableExceptions.
These exceptions are often normal if you are in a high concurrency environment.
They can happen very often and your application should handle them properly.
Each time a RetryableException is thrown by Doctrine, the proper way to handle this is to retry the whole transaction.
As easy as it seems, there is a trap: the Doctrine 2 EntityManager becomes unusable after a RetryableException, and you must recreate a new one to replay your whole transaction.
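A sketch of such a retry loop; createEntityManager() is a hypothetical factory that returns a fresh EntityManager, since a closed one cannot be reused:

use Doctrine\DBAL\Exception\RetryableException;

$attempts = 0;
do {
    $em = createEntityManager(); // hypothetical factory: fresh EntityManager each try
    $em->getConnection()->beginTransaction();
    try {
        // ... the whole transactional unit of work goes here ...
        $em->flush();
        $em->getConnection()->commit();
        break; // success
    } catch (RetryableException $e) {
        $em->getConnection()->rollBack();
        $em->close(); // the EntityManager is unusable after this failure
        if (++$attempts >= 3) {
            throw $e; // give up after a few tries
        }
    }
} while (true);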
I wrote this article illustrated with a full example.

Is the benefit significant enough to use transaction if I'm only performing one query?

I'm wondering: transactions are useful when you're performing a set of queries, but if I'm only performing one query, should I bother using one?
I currently have
$res = $pdo->query("SELECT id FROM products LIMIT 5");
$row = $res->fetchAll();
And all it does is that. Is the following code more efficient? (I'm thinking that it's not, since it's only doing one thing, but I'm not sure.)
$pdo->beginTransaction();
$res = $pdo->query("SELECT id FROM products LIMIT 5");
$pdo->commit();
If it's not more efficient, is there an alternative method that may be more efficient?
Thank you.
If there are actions performed after the query that are important to what the script is doing (say, an e-commerce purchase), and leaving the query's effects in place would be problematic (e.g. something related to payment failed), then you will want to use a transaction even though only one query is performed. Otherwise you will have purchases without payment.
If that query is the only thing occurring in the script, then a transaction is unnecessary and can/should be avoided.
Here is a simplistic example:
try
{
    $pdo->beginTransaction();
    $res = $pdo->exec("INSERT INTO orders ....");

    // Try to make payment
    $payment->process();

    $pdo->commit();
}
catch (Exception $e)
{
    // Payment failed unexpectedly! Roll back the transaction.
    $pdo->rollback();
}
Naturally you will want to do a better job of handling failures like this but this should give you a decent demonstration of when transactions are appropriate.
Transactions are not about efficiency
Transactions allow you to group several statements into one atomic operation, which either completes entirely or not at all; it must never complete partially. E.g. if you remove money from one bank account and then add it to another, this action should either happen completely (account a has $x less and account b has $x more) or not at all. What MUST NOT happen is that you remove money from account a, then the system crashes, and the money is lost in the digital void.
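As a sketch of that transfer (PDO and an invented accounts table): both UPDATEs commit together or not at all.

$pdo->beginTransaction();
try {
    $pdo->prepare("UPDATE accounts SET balance = balance - ? WHERE id = ?")
        ->execute([$x, $from]);
    $pdo->prepare("UPDATE accounts SET balance = balance + ? WHERE id = ?")
        ->execute([$x, $to]);
    $pdo->commit();
} catch (Exception $e) {
    $pdo->rollBack(); // the withdrawal is undone too; no money vanishes
    throw $e;
}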
You can't have no transactions
When you are using PDO and you don't explicitly start a transaction, you are in auto-commit mode. That means that every time you run a query, a transaction is started, the query is executed, and the transaction is committed (i.e., marked as complete) immediately.
Still thinking about performance?
There are some rare cases where thinking about performance and transactions is justified. E.g. if you are manipulating large amounts of data, it can be faster (whether it really is depends on your DB) to do all your statements in one large transaction. On the other hand, in this case it is probably logical to do these manipulations in a transaction anyway, because if your program crashes, you want to rerun it from the beginning, not from somewhere it left off the last time.
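For illustration, a hedged sketch of that bulk case (an invented samples table): one transaction around many inserts lets InnoDB do one durable flush at COMMIT instead of one per statement in auto-commit mode.

$pdo->beginTransaction();
$stmt = $pdo->prepare("INSERT INTO samples (value) VALUES (?)");
foreach ($values as $v) {
    $stmt->execute([$v]);
}
$pdo->commit(); // one flush here instead of one per row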

Dependent insertion of data into MySql table

I have 2 tables:
user_tb.username
user_tb.point
review_tb.username
review_tb.review
I am coding in PHP (CodeIgniter). I am trying to insert data into review_tb with the review the user has submitted, and if that succeeds, I will award the user some points.
This looks like a very simple process. We first insert the review into review_tb with the username, use PHP to check whether the query executed successfully, and if it did, we proceed with updating the points in user_tb.
But here comes the problem: what if inserting into review_tb succeeds but the second query, updating user_tb, does NOT succeed? Can we somehow "undo" or "revert" the change we made to review_tb?
It's kind of like "all or nothing".
The purpose of this is to keep all data in sync across the database; in real life we will be managing a database with more tables and inserting more data into tables that depend on each other.
Please give some enlightenment on how we can do this in PHP, CodeIgniter, or plain MySQL queries.
If you want "all or nothing" behavior for your SQL operations, you are looking for transactions; here is the relevant page from the MySQL manual: 12.4.1. START TRANSACTION, COMMIT, and ROLLBACK Syntax.
Wikipedia describes them this way:
A database transaction comprises a unit of work performed within a database management system (or similar system) against a database, and treated in a coherent and reliable way independent of other transactions. Transactions in a database environment have two main purposes:
To provide reliable units of work that allow correct recovery from failures and keep a database consistent even in cases of system failure, when execution stops (completely or partially) and many operations upon a database remain uncompleted, with unclear status.
To provide isolation between programs accessing a database concurrently. Without isolation the programs' outcomes are typically erroneous.
Basically:
you start a transaction
you do what you have to; i.e., your first insert and your update
if everything is OK, you commit the transaction
else, if there is any problem with any of your queries, you roll back the transaction, and it will cancel everything you did in that transaction.
There is a manual page about transactions and CodeIgniter here.
Note that, with MySQL, not every engine supports transactions; of the two most used engines, MyISAM doesn't support transactions, while InnoDB does.
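For this specific case, a sketch using CodeIgniter's transaction helpers (the 10-point reward is an invented value; column names follow the question):

$this->db->trans_start();

$this->db->insert('review_tb', array(
    'username' => $username,
    'review'   => $review,
));

$this->db->set('point', 'point + 10', FALSE); // FALSE: don't escape, keep the expression
$this->db->where('username', $username);
$this->db->update('user_tb');

$this->db->trans_complete(); // commits, or rolls back automatically if any query failed

if ($this->db->trans_status() === FALSE) {
    // the whole unit was rolled back; show an error
}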
Can't you use transactions? If you did both inserts inside the same transaction, then either both succeed or neither does.
Try something like
BEGIN;
INSERT INTO review_tb(username, review) VALUES(x, y);
INSERT INTO user_tb(username, point) VALUES(x, y);
COMMIT;
Note that you need to use a database engine that supports transactions (such as InnoDB).
If you have InnoDB support, use it. But when that's not possible, you can use code similar to the following:
$result = mysql_query("INSERT INTO ...");
if (!$result) return false;

$result = mysql_query("INSERT INTO somewhereelse");
if (!$result) {
    // the second insert failed: manually compensate by deleting the first row
    mysql_query("DELETE FROM ...");
    return false;
}
return true;
This cleanup might still fail, but it can work whenever the insert fails because of duplicates or constraints. For unexpected terminations, the only way is to use transactions.
