I've got a PHP app with a custom model layer built on top of PDO. Each of my models is a separate class (with a common base class) that manages interactions with a given table (all SELECTs, INSERTs, and UPDATEs for the given table are instance methods of the given class). However, all the classes share a single PDO object, with a single DB connection, which is passed into the constructor of the model class, in a very simplistic form of dependency injection.
This is a bit of a simplification, but it works something like this:
<?php
$pdo = new \PDO(/* connection details */);
$modelA = new ModelA($pdo);
$modelB = new ModelB($pdo);
$thingA = $modelA->find(1234); // where 1234 is the primary key
$thingB = $modelB->create([
    'modela_id' => $thingA->id,
    'notes' => 'This call generates and runs an INSERT statement based on the array of fields and values passed in, applying specialized business logic, and other fun stuff.'
]);
print_r($thingB); // create() returns an object containing the data from the newly created row, so this dumps it to the screen.
At any rate, currently, everything just runs through PDO's prepared statements and gets committed immediately, because we're not using transactions. For the most part that's fine, but we're running into some issues.
First, if the database interaction is more complicated than what you see above, a single request can generate a large number of SQL queries, including several INSERTs and UPDATEs that are all interrelated. If you hit an error or exception midway through execution, the first half of the database work is already committed, but everything after that never runs. So you get orphaned records and other bad data. I know that's essentially the exact problem transactions were invented to solve, so it seems like a compelling case for using them.
The other thing we're running into is intermittent MySQL deadlocks when running at high traffic volumes. I'm still trying to get to the bottom of these, but I suspect transactions would help here, too, from what I'm reading.
Here's the thing though. My database logic is abstracted (as it should be), which ends up scattering it a bit. What I'm thinking I'd like to do is create a transaction right when I create my PDO object, and then all the queries run with the PDO object would be part of that transaction. Then as part of my app's tear down sequence for an HTTP request, commit the transaction, after everything has been done. And in my error handling logic, I'd probably want to call rollback.
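A minimal sketch of that request-lifecycle idea, assuming the bootstrap and teardown hooks are where your app already creates and releases the shared PDO object (the model class names are just the ones from the example above):

```php
<?php
// Bootstrap: open the connection and the request-wide transaction up front.
$pdo = new \PDO('mysql:host=localhost;dbname=app', $user, $pass, [
    \PDO::ATTR_ERRMODE => \PDO::ERRMODE_EXCEPTION,
]);
$pdo->beginTransaction();

try {
    $modelA = new ModelA($pdo);
    $modelB = new ModelB($pdo);

    // ... all of the request's queries run inside the one transaction ...

    // Teardown: everything succeeded, make it permanent.
    $pdo->commit();
} catch (\Throwable $e) {
    // Error handling: undo the whole request's database work.
    if ($pdo->inTransaction()) {
        $pdo->rollBack();
    }
    throw $e;
}
```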
I'm going to do some experimentation before I try to implement this fully, but I was hoping to get some feedback from someone who might have gone down this road, about whether I'm on the right track, whether this is actually a terrible idea, or any tips for a more successful implementation. I also have some specific questions and concerns:
Will wrapping everything in transactions cause a performance hit? It seems like multiple simultaneous transactions might have to be serialized, so the first transaction completes before the second can run; or else the database would have to track enough state that the statements in each transaction see a consistent view. Maybe not. Maybe it's actually faster. I have no idea.
We use MySQL replication. One environment uses MySQL's Master-Slave support, with the bin log format set to STATEMENT. The other environment uses Percona Cluster, with Master-Master replication. Are there any replication implications of this model?
Is there a timeout for the transaction? And when it times out, does it commit or rollback? I expect my normal application logic will either commit or rollback, depending on whether there was an error. But if the whole thing crashes, what happens to the transaction I've been building?
I mentioned we're running into MySQL deadlocks. I've added some logic to my models, where when PDO throws a PDOException, if it's a 1213, I catch it, sleep for up to 3 seconds, then try again. What happens to my transaction if there's a deadlock, or any PDOException for that matter? Does it automatically rollback? Or does it leave it alone, so I can attempt to recover?
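For what it's worth, the retry logic described above can be factored into a small helper. This is a sketch, not your actual model code; the `withDeadlockRetry` name and the callable shape are assumptions. One important detail from the MySQL docs: on error 1213, InnoDB has already rolled back the deadlocked transaction, so a retry has to re-run the whole transaction, not just the statement that failed.

```php
<?php
// Hypothetical helper: run $work, retrying when MySQL reports a deadlock.
function withDeadlockRetry(\PDO $pdo, callable $work, int $maxAttempts = 3)
{
    for ($attempt = 1; ; $attempt++) {
        try {
            return $work($pdo);
        } catch (\PDOException $e) {
            // errorInfo[1] holds the driver-specific code;
            // 1213 is MySQL's ER_LOCK_DEADLOCK.
            $isDeadlock = isset($e->errorInfo[1]) && $e->errorInfo[1] === 1213;
            if (!$isDeadlock || $attempt >= $maxAttempts) {
                throw $e;
            }
            // Back off for a random interval before retrying, so two
            // colliding requests don't immediately collide again.
            usleep(random_int(100000, 1000000));
        }
    }
}
```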
What other risks am I completely overlooking that I need to account for? I can try out my idea and do some tests, but I'm worried about the case where everything I know to look for works, but what about the things I don't know to look for that come up when I push it out into the wild?
Right now, I don't use connection pooling; I just create a connection for each request, shared between all the model classes. But if I want to do connection pooling later, how does that interact with transactions?
I have a pretty clear picture of how and where to start the transaction: when I create my shared PDO object. But is there a best practice for the commit part? I have a script that does other teardown stuff for each request, so I could put it there. I could maybe subclass PDO and add it to the destructor, but I've never done much with destructors in PHP.
Oh yeah, I need to get clear about how nested transactions work. In my preliminary tests, it looks like you can call BEGIN several times, and the first time you call COMMIT, it commits everything back to the first BEGIN. And any subsequent COMMITs don't do much. Is that right, or did I misinterpret my tests?
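One caveat worth checking in your tests: MySQL has no true nested transactions (a second `START TRANSACTION` actually causes an implicit commit of the first, and `PDO::beginTransaction()` throws if one is already active). The usual workaround, used by frameworks like Doctrine and Laravel, is to emulate nesting with savepoints. A sketch of that pattern, assuming a PDO connection in exception mode:

```php
<?php
// Emulate nested transactions: a real BEGIN/COMMIT at depth 0,
// SAVEPOINTs for every nested level.
class NestableTransaction
{
    private int $depth = 0;

    public function __construct(private \PDO $pdo) {}

    public function begin(): void
    {
        if ($this->depth === 0) {
            $this->pdo->beginTransaction();
        } else {
            $this->pdo->exec('SAVEPOINT level' . $this->depth);
        }
        $this->depth++;
    }

    public function commit(): void
    {
        $this->depth--;
        if ($this->depth === 0) {
            $this->pdo->commit();
        } else {
            $this->pdo->exec('RELEASE SAVEPOINT level' . $this->depth);
        }
    }

    public function rollBack(): void
    {
        $this->depth--;
        if ($this->depth === 0) {
            $this->pdo->rollBack();
        } else {
            $this->pdo->exec('ROLLBACK TO SAVEPOINT level' . $this->depth);
        }
    }
}
```

With this, an inner rollback undoes only the work since the matching inner begin, instead of silently committing half the request.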
Related
For our web application, written in Laravel, we use transactions to update the database. We have separated our data across different databases (for ease, let's say "app" and "user"). During an application update, an event is fired to update some user statistics in the user database. However, this application update may be called as part of a transaction on the application database, so the code structure looks something as follows.
DB::connection('app')->beginTransaction();
// ...
DB::connection('user')->doStuff();
// ...
DB::connection('app')->commit();
It appears that any attempt to start a transaction on the user connection (since a single query already creates an implicit transaction) while inside the app transaction does not work, causing a deadlock (see the InnoDB status output). I also used innotop to get some more information, but while it showed the lock wait, it did not show which query it was blocked by. There was a lock present on the user table, but I could not find its origin.
The easy solution would be to pull out the user operation from the transaction, but since the actual code is slightly more complicated (the doStuff actually happens somewhere in a nested method called during the transaction and is called from different places), this is far from trivial. I would very much like the doStuff to be part of the transaction, but I do not see how the transaction can span multiple databases.
What is the reason that this situation causes a deadlock and is it possible to run the doStuff on the user database within this transaction, or do we have to find an entirely different solution like queueing the events for execution afterwards?
I have found a way to solve this issue using a workaround. My hypothesis was that it was trying to use the app connection to update the user database, but after forcing it to use the user connection, it still locked. Then I switched it around and used the app connection to update the user database. Since we have joins between the databases, we have database users with proper access, so that was no issue. This actually solved the problem.
That is, the final code turned out to be something like
DB::connection('app')->beginTransaction();
// ...
DB::connection('app')->table('user.users')->doStuff(); // Pay attention to this line!
// ...
DB::connection('app')->commit();
I've been practicing and reading about test driven development.
I've been doing well so far since much of it is straightforward but I have questions as to what to test
about certain classes such as the one below.
public function testPersonEnrollmentDateIsSet()
{
    // For the sake of simplicity I've abstracted the PDO/Connection mocking
    $PDO = $this->getPDOMock();
    $PDOStatement = $this->getPDOStatementMock();

    $PDO->method('prepare')->willReturn($PDOStatement);
    $PDOStatement->method('fetch')->willReturn('2000-01-01');

    $AccountMapper = $this->MapperFactory->build(
        'AccountMapper',
        array($PDO)
    );

    $Person = $this->EntityFactory->build('Person');
    $Account = $this->EntityFactory->build('Account');

    $Person->setAccount($Account);
    $AccountMapper->getAccountEnrollmentDate($Person);

    $this->assertEquals(
        '2000-01-01',
        $Person->getAccountEnrollmentDate()
    );
}
I'm a little unsure if I should even be testing this at all for two reasons.
Unit Testing Logic
In the example above I'm testing if the value is mapped correctly. Just the logic alone, mocking the connection so no database. This is awesome because I can run a test of exclusively business logic without any other developers having to install or configure a database dependency.
DBUnit Testing Result
However, a separate configuration can be run on demand to test the SQL queries themselves, which is another type of test altogether; it also consolidates the SQL tests so they run separately from the unit tests.
The test would be exactly the same as above except the PDO connection would not be mocked out and be a real database connection.
Cause For Concern
I'm torn because although it is testing for different things, it's essentially duplicate code.
If I get rid of the unit test, I introduce a required database dependency at all times. As the codebase grows, the tests will become slower over time; there's no out-of-the-box testing, and extra effort is required from other developers to set up a configuration.
If I get rid of the database test I can't assure the SQL will return the expected information.
My questions are:
Is this a legitimate reason to keep both tests?
Is it worth it or is it possible this may become a maintenance nightmare?
According to your comments, it seems like you want to test what PDO is doing (are prepare, execute, etc. being called properly and doing the things I expect of them?).
That's not what you want to do here with unit testing. You do not want to test PDO or any other library; you are not supposed to test the logic or behavior of PDO. Trust that those things have been done by the people in charge of PDO.
What you want to test is the correct execution of YOUR code, and so the logic of YOUR code. You want to test getAccountExpirationDate, and therefore the results you expect from it, not the results you expect from PDO calls.
Therefore, if I understand your code right, you only have to test that the expiration date of the $Person you pass as a parameter has been set with the expected value from $result. If you didn't use PDO the right way, the test will fail anyway!
A correct test could be :
Create a UserMapper object and open a DB transaction
Insert data in DB that will be retrieved in $result
Create a Person object with correct taxId and clientID to get the wanted $result from your query
Run UserMapper->getAccountExpirationDate($Person)
Assert that $Person->expirationDate equals whatever you settled as expected from $result
Rollback DB transaction
You may also check the format of the date, etc ... Anything that is related to the logic you implemented.
If you are going to test such things in many tests, use setUp and tearDown to init and rollback your DB after each test.
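The steps above can be sketched as a PHPUnit test case using the transaction-per-test pattern. The class, table, and column names here are assumptions standing in for your real schema, and the DSN is obviously a placeholder:

```php
<?php
use PHPUnit\Framework\TestCase;

class UserMapperDbTest extends TestCase
{
    private \PDO $pdo;

    protected function setUp(): void
    {
        $this->pdo = new \PDO('mysql:host=localhost;dbname=test', 'user', 'pass');
        $this->pdo->setAttribute(\PDO::ATTR_ERRMODE, \PDO::ERRMODE_EXCEPTION);
        // Open a transaction so each test's inserts are isolated.
        $this->pdo->beginTransaction();
    }

    protected function tearDown(): void
    {
        // Discard whatever the test inserted.
        $this->pdo->rollBack();
    }

    public function testExpirationDateIsMapped(): void
    {
        // Insert the data the query is expected to retrieve.
        $this->pdo->exec(
            "INSERT INTO accounts (person_id, expiration_date)
             VALUES (42, '2020-06-30')"
        );

        $mapper = new UserMapper($this->pdo);
        $person = new Person(42);

        $mapper->getAccountExpirationDate($person);

        $this->assertSame('2020-06-30', $person->expirationDate);
    }
}
```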
Is this a legitimate reason to keep both tests?
I think no one could ever be able to tell from your simple example, which test you need to keep. If you like, you can keep both. I say that because the answer depends a lot on unknown factors of your projects, including future requirements & modifications.
Personally, I don't think you need both, at least for the same function. If your function is more likely to break logically than to be data-sensitive, feel free to drop the database-backed tests. You will learn to choose the right type of test over time.
I'm torn because although it is testing for different things, it's essentially duplicate code.
Yes, it is.
If I get rid of the database test I can't assure the SQL will return the expected information.
This is a valid concern if your database gets complicated enough. You are using DBUnit, so I would assume you have a sample database with sample information. Over time, a production database may get multiple structural modifications (adding fields, removing fields, even reworking data relationships...). These will add to the effort of maintaining your test database. However, it can also help you notice when a change to the database isn't right.
If I get rid of the unit test, I introduce a required database dependency at all times. As the codebase grows, the tests will become more slow over time, no out-of-box testing; extra effort from other developers to set up a configuration.
What's wrong with requiring a database dependency?
Firstly, I don't see how the extra effort from other developers to set up a configuration comes into play here. You only set it up once, don't you? In my opinion, it's a minor factor.
Second, yes, the tests will get slower over time. You can count on an in-memory database like H2 or HSQLDB to mitigate that risk, but it will certainly take longer. However, do you actually need to worry? Most of these data-backed tests are integration tests. They should be the responsibility of the Continuous Integration server, which runs exhaustive tests every now and then to prevent regression bugs. During development, developers should only run a small set of lightweight tests.
In the end, I would like to remind you of the number one rule of unit testing: test what can break. Don't write tests for their own sake. You will never have 100% coverage. However, since 20% of the code is used 80% of the time, focusing your testing effort on your main business flow might be a good idea.
So I wanted to start a browser game project. Just for practise.
After years of PHP programming, today I heard about transactions and InnoDB for the first time ever.
So I googled it and still have some questions.
My first encounter with it was a website saying that InnoDB is necessary when programming a browser game, because it might be used by many people at the same time: if two people access a database table at the same time (one nanosecond apart, for example), things can get confused and data might be lost, or your SELECT might not reflect the update made by the access one nanosecond earlier (because that script was still running and couldn't change the data yet)... and so on.
And apparently, transactions solve this problem by first handling the first access (until it is completed) and then handling the second one. Is this correct?
And another function is, that if you have for example 2 queries in your transaction and the second one fails, it "rolls back" and "deletes"(or never applies) the changes of the first (successful) query. Right? So either everything goes as it should or nothing changes at all. That would be great I think.
Another question: When should I use transactions? Everytime I access the database? Or is it better to use it just for some particular accesses to the database? And should I always use try {} catch() {}?
And one last question:
How does this transaction proceeds?
My understanding is the following:
You start a transaction
You do your queries and change the database or SELECT something
If everything went well, you commit the changes so they get applied to the database
If something went wrong with queries it cancels and jumps to the catch() {} where you rollback the transaction and the changes don't get applied
Is this correct? Of course, besides the question how to start, commit and rollback a transaction in your code.
Yes, this is correct. You can also create savepoints to mark your current point before running a query. I strongly recommend looking at the MySQL reference documentation; it is explained clearly there.
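To answer the "how to start, commit and rollback in code" part, here is a minimal version of the flow listed in the question. The table and column names are made up for illustration; the tables must use InnoDB (MyISAM silently ignores transactions):

```php
<?php
$pdo = new \PDO('mysql:host=localhost;dbname=game', $user, $pass, [
    \PDO::ATTR_ERRMODE => \PDO::ERRMODE_EXCEPTION,
]);

try {
    // 1. Start the transaction.
    $pdo->beginTransaction();

    // 2. Run queries that must succeed or fail together.
    $pdo->prepare('UPDATE players SET gold = gold - ? WHERE id = ?')
        ->execute([100, $buyerId]);
    $pdo->prepare('UPDATE players SET gold = gold + ? WHERE id = ?')
        ->execute([100, $sellerId]);

    // 3. Everything went well: apply both changes atomically.
    $pdo->commit();
} catch (\PDOException $e) {
    // 4. Something failed: neither change is applied.
    $pdo->rollBack();
    throw $e;
}
```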
I have the following problem with one of our users: they have two stores in two different locations, and each location has its own database; however, they need to share the client base and the list of materials registered for sale. At the moment, when they register a new client, a copy is made in the other location's database. Problems quickly arise because their internet connection is unstable: if the internet is down at the time of registration, the copy fails, and we're left with inconsistent databases.
I considered making the updates via PDO transactions that would manage the two databases, but it seems that you need a new instance of PDO ($dbh1 = new PDO('mysql:host=xxxx;dbname=test', $user, $pass);) for each database, and I don't see a way to commit both updates together. Looking at the related question "what is the best way to do distributed transactions across multiple databases", it seems that I need some form of transaction management. Can this be achieved with PDO?
No, PDO cannot do anything remotely resembling distributed transactions (which in any case are a very thorny issue where no silver bullets exist).
In general, in the presence of network partitions (i.e. actors falling off the network) it can be proved that you cannot achieve consistency and availability (guaranteed response to your queries) at the same time -- see CAP theorem.
It seems that you need to re-evaluate your requirements and design a solution based on the results of this analysis; in order to scale data processing or storage horizontally you have to take the scaling into account from day one and plan accordingly.
You can do this with a single PDO object by switching databases with a query, then performing the same queries in the second DB.
Best bet is to do a transaction, then commit that transaction (if successful). Then do something like
$dbh->exec('USE otherdb');
Then do a second transaction, and commit or roll back based on whether or not it worked.
I'm not sure if this really answers what you are asking though.
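A sketch of that sequential approach, with placeholder database and table names. Note that this is not atomic across the two databases: if the second transaction fails, the first is already committed, so you would still need compensating logic (undo the first insert, or queue the copy for retry):

```php
<?php
$dbh = new \PDO('mysql:host=localhost;dbname=store1', $user, $pass, [
    \PDO::ATTR_ERRMODE => \PDO::ERRMODE_EXCEPTION,
]);

// First transaction, on the first store's database.
$dbh->beginTransaction();
$dbh->prepare('INSERT INTO clients (name) VALUES (?)')->execute([$name]);
$dbh->commit();

// Switch the same connection to the second database.
$dbh->exec('USE store2');

// Second, independent transaction.
try {
    $dbh->beginTransaction();
    $dbh->prepare('INSERT INTO clients (name) VALUES (?)')->execute([$name]);
    $dbh->commit();
} catch (\PDOException $e) {
    $dbh->rollBack();
    // The first insert is already committed; you would have to
    // undo it manually or queue this copy for a later retry.
}
```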
I'm using PostgreSQL 9.1 with serializable transactions (similar to predicate locking) and I'm hitting error could not serialize access due to read/write dependencies among transactions. I believe that this is caused by access to a table that contains data that wouldn't need to be protected by serializable transaction. However, the rest of the code logic depends on serializable transaction support so I need to use serializable transaction for the rest of the tables.
Is it possible to configure PostgreSQL, the database schema, or the transaction so that I can start a serializable transaction and still declare that a serialization failure in table XYZ is not worth aborting (rolling back) the transaction? If so, how do I do it?
The only solution to the problem I'm aware of is to put all the not-important tables in another database and use a parallel connection and transaction to that database. I'm hoping for a simpler solution. My application is written in PHP, in case that makes a difference.
Update: I'm looking for extra performance; the code automatically retries failed transactions, and after enough retries it eventually goes through, so in theory I'm fine. In practice, the performance sucks a lot.
PostgreSQL doesn't allow you to specify this sort of thing. If this is the case, you should not use the serializable transaction level at all, but use more explicit locking instead (SELECT ... FOR UPDATE, etc.). This may cause deadlocks, but it avoids the problems you are having now. You could simulate it by going to READ COMMITTED and locking the relevant rows early in the transaction.
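Since the application is in PHP, the explicit-locking alternative might look like the sketch below. The table and column names are hypothetical, and $pdo is assumed to be an existing PDO connection to PostgreSQL in exception mode:

```php
<?php
$pdo->beginTransaction();
// Downgrade this transaction from serializable; must be the first
// statement after BEGIN in PostgreSQL.
$pdo->exec('SET TRANSACTION ISOLATION LEVEL READ COMMITTED');

// Lock the rows we intend to change, early in the transaction.
$stmt = $pdo->prepare(
    'SELECT id, balance FROM accounts WHERE id = ? FOR UPDATE'
);
$stmt->execute([$accountId]);
$row = $stmt->fetch(\PDO::FETCH_ASSOC);

// ... business logic based on $row ...

$pdo->prepare('UPDATE accounts SET balance = ? WHERE id = ?')
    ->execute([$newBalance, $accountId]);

$pdo->commit();
```

The trade-off is that you lose serializability guarantees on everything you did not explicitly lock, which is exactly why it sidesteps the read/write-dependency failures.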
I am not quite sure to what extent you can use exception handling in user-defined functions to get around this issue. I think it should be possible, since functions are, at least in theory, able to handle all their own exceptions. However, I would worry about the implications of messing with transactional guarantees.