I recently encountered an issue on a live application.
I realized I was getting more and more concurrency exceptions and locks on the database.
Basically, I start a transaction that requires a SELECT and an INSERT on the same table before it can commit.
Because the load is really heavy, each transaction locks the table. In most cases it is so fast it doesn't cause any problems, but there is a point where the locks start waiting longer and longer.
I was able to somewhat fix this problem by tweaking the queries.
Now, though, I'd like to write some tests with PHPUnit to validate my fix and avoid any regressions.
I was not able to find any materials on how to do this.
Since PHP isn't multithreaded, I have no idea how I could run concurrent queries in a single test to validate the fix.
Basically, I would like to be able to run multiple calls in a single test to ensure everything is ok.
I know I could do some high-level tests by querying the HTTP server directly and loading the whole application, but since my problem comes from a standalone library I'd rather test the library on its own.
Any ideas?
The short answer is there's no good way to test concurrent reads/writes on an actual database with PHPUnit. It's simply not the right tool for that job.
But there are a few facets to a good solution for testing this. First the code can (and should) be written to handle every possible issue. A database system like PostgreSQL will fail immediately on locks and transaction issues. To handle it elegantly I use code that looks something like this (pseudo-code, also used to answer another question):
begin transaction
while not successful and count < 5
    try
        execute sql
        commit
    except
        if error code is '40P01' or '55P03'
            # Deadlock or lock not available
            sleep a random time (200 ms to 1 sec) * number of retries
        else if error code is '40001' or '25P02'
            # "In failed sql transaction" or serialized transaction failure
            rollback
            sleep a random time (200 ms to 1 sec) * number of retries
            begin transaction
        else if error message is 'There is no active transaction'
            sleep a random time (200 ms to 1 sec) * number of retries
            begin transaction
        increment count
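In PHP that pseudo-code could look roughly like the sketch below. It is only a sketch: runWithRetry() is a made-up helper name, it assumes a PDO connection to PostgreSQL with PDO::ERRMODE_EXCEPTION enabled, and it simply mirrors the branches above.

<?php
// Sketch only: runWithRetry() is a hypothetical helper; the SQLSTATE codes
// are the PostgreSQL ones from the pseudo-code above.
function runWithRetry(PDO $pdo, callable $work, $maxAttempts = 5)
{
    $pdo->beginTransaction();

    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        try {
            $work($pdo);            // execute the SQL
            $pdo->commit();
            return;                 // success
        } catch (PDOException $e) {
            $code      = $e->getCode();
            $backoffMs = mt_rand(200, 1000) * $attempt;

            if ($code === '40P01' || $code === '55P03') {
                // Deadlock or lock not available: wait and retry.
                usleep($backoffMs * 1000);
            } elseif ($code === '40001' || $code === '25P02') {
                // Serialization failure or "in failed SQL transaction":
                // roll back, wait, and start a fresh transaction.
                $pdo->rollBack();
                usleep($backoffMs * 1000);
                $pdo->beginTransaction();
            } elseif (stripos($e->getMessage(), 'There is no active transaction') !== false) {
                usleep($backoffMs * 1000);
                $pdo->beginTransaction();
            } else {
                // Unknown error: give up and let the caller handle it.
                if ($pdo->inTransaction()) {
                    $pdo->rollBack();
                }
                throw $e;
            }
        }
    }

    throw new RuntimeException('Gave up after ' . $maxAttempts . ' attempts');
}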
Then create two sets of tests: one set should confirm the code is handling the situations correctly (i.e. unit tests). The other set of tests is for the environment (i.e. integration / functional tests).
Unit Tests
I find this to be a hard situation to reproduce in a PHPUnit test that connects to a database, and using a real database isn't appropriate for a true unit test. Instead, create PDO stubs and unit tests that throw every kind of database exception. This confirms the code is working as expected, but does not test concurrency on any real database. Remember, unit tests are only for confirming your code is written correctly, not for testing 3rd party software.
$iterationCount = 0;

$db->runInTransaction(function () use (&$iterationCount) {
    $iterationCount++;
    if ($iterationCount === 1) {
        $exception = new PDOExceptionStub('Deadlock');
        $exception->setCode('40P01');
        throw $exception;
    }
});

// First time fails, second time succeeds
$this->assertEquals(2, $iterationCount, 'Expected 2 iterations of runInTransaction');
Write a complete suite of tests that don't connect to the DB, but confirm logic.
Integration Tests
As you have found, PHPUnit is simply not the right tool to perform a load test. It's not appropriate for anything more complex than sequential unit and integration tests. You could run multiple instances of PHPUnit concurrently to put more load on the database, but I find that goes beyond what it was meant for, and it doesn't help you monitor the database for issues. Therefore I see no way around the higher-level tests you're looking to avoid.
But your library can be tested without running your complete application. I would create the simplest possible application just for testing it. It can have one or more CLI scripts that connect to a database. Those scripts can be spawned multiple times to put load on the database. Or make a simple web page with the library and use any of the many load testing applications out there to test it.
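For example, a throwaway driver script along these lines can spawn the concurrent load. This is only a sketch: worker.php stands for whatever small CLI script exercises your library, and the process and iteration counts are arbitrary.

<?php
// load_test.php - spawns several copies of worker.php, each of which would
// use your library to run the SELECT + INSERT transaction in a loop.
$workers    = 20;    // number of concurrent processes (pick what stresses your DB)
$iterations = 100;   // transactions per worker

$handles = [];
for ($i = 0; $i < $workers; $i++) {
    // popen() returns immediately, so all workers run at the same time;
    // each process opens its own database connection.
    $cmd = 'php ' . escapeshellarg(__DIR__ . '/worker.php') . ' ' . $iterations . ' 2>&1';
    $handles[] = popen($cmd, 'r');
}

$failures = 0;
foreach ($handles as $handle) {
    stream_get_contents($handle);      // drain the worker's output
    if (pclose($handle) !== 0) {       // pclose() returns the exit status
        $failures++;
    }
}

echo $failures === 0
    ? "all workers finished cleanly\n"
    : "$failures worker(s) reported errors\n";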
Related
I've got a PHP app with a custom model layer built on top of PDO. Each of my models is a separate class (with a common base class) that manages interactions with a given table (all SELECTs, INSERTs, and UPDATEs for the given table are instance methods of the given class). However, all the classes share a single PDO object, with a single DB connection, which is passed into the constructor of the model class, in a very simplistic form of dependency injection.
This is a bit of a simplification, but it works something like this:
<?php

$pdo = new \PDO(/*connection details*/);

$modelA = new ModelA($pdo);
$modelB = new ModelB($pdo);

$thingA = $modelA->find(1234); // where 1234 is the primary key

$thingB = $modelB->create([
    'modelb_id' => $thingA->id,
    'notes' => 'This call generates and runs an INSERT statement based on the array of fields and values passed in, applying specialized business logic, and other fun stuff.'
]);

print_r($thingB); // create() returns an object containing the data from the newly created row, so this would dump it to the screen.
At any rate, currently, everything just runs through PDO's prepared statements and gets committed immediately, because we're not using transactions. For the most part that's fine, but we're running into some issues.
First, if the database interaction is more complicated than what you see above, you can have a request that generates a large number of SQL queries, including several INSERTs and UPDATEs, that are all interrelated. If you hit an error or exception midway through the execution, the first half of the database work is already committed, but everything after that never runs. So you get orphaned records and other bad data. I know that is essentially the exact problem transactions were invented to solve, so it seems like a compelling case to use them.
The other thing we're running into is intermittent MySQL deadlocks when running at high traffic volumes. I'm still trying to get to the bottom of these, but I suspect transactions would help here, too, from what I'm reading.
Here's the thing though. My database logic is abstracted (as it should be), which ends up scattering it a bit. What I'm thinking I'd like to do is create a transaction right when I create my PDO object, and then all the queries run with the PDO object would be part of that transaction. Then as part of my app's tear down sequence for an HTTP request, commit the transaction, after everything has been done. And in my error handling logic, I'd probably want to call rollback.
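Roughly, the request lifecycle I'm imagining looks like this (a sketch only: the bootstrap and teardown are simplified, and the error handling is reduced to a single try/catch):

<?php
// Request bootstrap: create the shared PDO object and open the transaction.
$pdo = new \PDO(/*connection details*/);
$pdo->setAttribute(\PDO::ATTR_ERRMODE, \PDO::ERRMODE_EXCEPTION);
$pdo->beginTransaction();

try {
    // All models share $pdo, so every query they run is part of this
    // one transaction.
    $modelA = new ModelA($pdo);
    $modelB = new ModelB($pdo);
    // ... handle the request ...

    // Request teardown: everything worked, make it permanent.
    $pdo->commit();
} catch (\Exception $e) {
    // Error handling: throw everything away.
    if ($pdo->inTransaction()) {
        $pdo->rollBack();
    }
    throw $e;
}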
I'm going to do some experimentation before I try to implement this fully, but I was hoping to get some feedback from someone who might have gone down this road, about whether I'm on the right track or this is actually a terrible idea, or some tips for a more successful implementation. I also have some specific questions and concerns:
Will wrapping everything in transactions cause a performance hit? It seems like if you've got multiple simultaneous transactions, they have to be queued, so the first transaction can complete before the second can run. Or else you'd have to cache the state of your database, so that all the events in one transaction can follow their logical progression? Maybe not. Maybe it's actually faster. I have no idea.
We use MySQL replication. One environment uses MySQL's Master-Slave support, with the bin log format set to STATEMENT. The other environment uses Percona Cluster, with Master-Master replication. Are there any replication implications of this model?
Is there a timeout for the transaction? And when it times out, does it commit or rollback? I expect my normal application logic will either commit or rollback, depending on whether there was an error. But if the whole thing crashes, what happens to the transaction I've been building?
I mentioned we're running into MySQL deadlocks. I've added some logic to my models, where when PDO throws a PDOException, if it's a 1213, I catch it, sleep for up to 3 seconds, then try again. What happens to my transaction if there's a deadlock, or any PDOException for that matter? Does it automatically rollback? Or does it leave it alone, so I can attempt to recover?
What other risks am I completely overlooking that I need to account for? I can try out my idea and do some tests, but I'm worried about the case where everything I know to look for works, yet something I didn't know to look for comes up once I push it out into the wild.
Right now I don't use connection pooling; I just create a connection for each request, shared between all the model classes. But if I want to do connection pooling later, how does that interact with transactions?
I have a pretty clear picture of how and where to start the transaction: when I create my shared PDO object. But is there a best practice for the commit part? I have a script that does other teardown stuff for each request; I could put it there. I could maybe subclass PDO and add it to the destructor, but I've never done much with destructors in PHP.
Oh yeah, I need to get clear about how nested transactions work. In my preliminary tests, it looks like you can call BEGIN several times, and the first time you call COMMIT, it commits everything back to the first BEGIN. And any subsequent COMMITs don't do much. Is that right, or did I misinterpret my tests?
When running our suite of acceptance tests we would like to execute every single test against a defined database state. If one of the tests writes to the database (creating users, for example), it must of course not affect later tests.
We have thought about several options to achieve that, but copying the whole database before every single test does not seem like the best solution (thinking of possible performance issues).
One more idea was to use MySQL transactions, but some of our tests cause many HTTP requests, so different PHP processes are spawned and they would lose the transaction too early for a clean rollback after the full test is done.
Are there better ways to guarantee a defined database state for every one of our acceptance tests? We would like to keep it simpler than solutions like aufs or btrfs that tackle it at the system level.
You could approach this problem using PHPUnit.
It is used for automated testing in PHP. It is not the only library, but it is one of the most widely used.
You can use it for database testing as well (https://phpunit.de/manual/current/en/database.html). Basically, it lets you accomplish exactly what you are looking for: import the whole database initially, and then in each test suite load what you need and restore the previous state afterwards. For example, you could temporarily save the current state of table A and, after you are done with all the tests of the suite, simply restore it instead of reloading the whole database.
By the way, having a minimal database with only the data required for testing helps a lot as well. That way you don't run into big performance issues, and you can simply restore it after each test suite.
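A minimal sketch of such a test case, assuming the database extension from that manual page (the fixture file, credentials and table name are only examples):

<?php
// Sketch only: table, fixture file and credentials are examples.
class UserTableTest extends PHPUnit_Extensions_Database_TestCase
{
    private $conn = null;
    private static $pdo = null;

    // Tell the extension which connection to use.
    protected function getConnection()
    {
        if ($this->conn === null) {
            if (self::$pdo === null) {
                self::$pdo = new PDO('mysql:host=localhost;dbname=app_test', 'tester', 'secret');
            }
            $this->conn = $this->createDefaultDBConnection(self::$pdo, 'app_test');
        }
        return $this->conn;
    }

    // The tables in this data set are truncated and re-seeded before every
    // single test, so each test starts from the same state.
    protected function getDataSet()
    {
        return $this->createFlatXMLDataSet(__DIR__ . '/fixtures/users-seed.xml');
    }

    public function testCreatingAUserAddsExactlyOneRow()
    {
        // ... call the application code that inserts a user here ...

        // If the seed file contains two rows, we now expect three.
        $this->assertEquals(3, $this->getConnection()->getRowCount('users'));
    }
}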
I've been practicing and reading about test-driven development.
I've been doing well so far, since much of it is straightforward, but I have questions about what to test in certain classes, such as the one below.
public function testPersonEnrollmentDateIsSet()
{
    // For the sake of simplicity I've abstracted the PDO/Connection mocking
    $PDO = $this->getPDOMock();
    $PDOStatement = $this->getPDOStatementMock();

    $PDO->method('prepare')->willReturn($PDOStatement);
    $PDOStatement->method('fetch')->willReturn('2000-01-01');

    $AccountMapper = $this->MapperFactory->build(
        'AccountMapper',
        array($PDO)
    );

    $Person = $this->EntityFactory->build('Person');
    $Account = $this->EntityFactory->build('Account');

    $Person->setAccount($Account);
    $AccountMapper->getAccountEnrollmentDate($Person);

    $this->assertEquals(
        '2000-01-01',
        $Person->getAccountEnrollmentDate()
    );
}
I'm a little unsure if I should even be testing this at all for two reasons.
Unit Testing Logic
In the example above I'm testing if the value is mapped correctly. Just the logic alone, mocking the connection so no database. This is awesome because I can run a test of exclusively business logic without any other developers having to install or configure a database dependency.
DBUnit Testing Result
However, a separate configuration can be run on demand to test the SQL queries themselves, which is another type of test altogether, while also keeping the SQL tests separate from the unit tests.
The test would be exactly the same as above except the PDO connection would not be mocked out and be a real database connection.
Cause For Concern
I'm torn because although it is testing for different things, it's essentially duplicate code.
If I get rid of the unit test, I introduce a required database dependency at all times. As the codebase grows the tests will become slower over time, there is no out-of-the-box testing, and other developers have to put in extra effort to set up a configuration.
If I get rid of the database test I can't assure the SQL will return the expected information.
My questions are:
Is this a legitimate reason to keep both tests?
Is it worth it or is it possible this may become a maintenance nightmare?
According to your comments, it seems like you want to test what PDO is doing (are prepare, execute, etc. being called properly and doing what I expect from them?).
That is not what you want to do here with unit testing. You do not want to test PDO or any other library, and you are not supposed to test the logic or the behavior of PDO. Trust that those things have already been taken care of by the people in charge of PDO.
What you want to test is the correct execution of YOUR code, and so the logic of YOUR code. You want to test getAccountEnrollmentDate, and therefore the results you expect from it, not the results you expect from PDO calls.
Therefore, if I understand your code correctly, you only have to test that the enrollment date of the $Person you pass as a parameter has been set to the value you expect from the query result. If you didn't use PDO the right way, the test will fail anyway!
A correct test could be:
Create an AccountMapper object and open a DB transaction
Insert the data in the DB that the query is supposed to retrieve
Create a Person object with the correct account so the query finds the row you inserted
Run AccountMapper->getAccountEnrollmentDate($Person)
Assert that $Person->getAccountEnrollmentDate() equals whatever you set as the expected value
Roll back the DB transaction
You may also check the format of the date, etc.; anything that is related to the logic you implemented.
If you are going to test such things in many tests, use setUp and tearDown to initialize and roll back your DB around each test, as sketched below.
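For example, something along these lines. It is only a sketch: the table, columns and entity constructors are assumptions, since I don't know your schema.

<?php
// Sketch of the steps above as a PHPUnit test. The table, columns and
// entity constructors are assumptions; adapt them to your real schema.
class AccountMapperIntegrationTest extends PHPUnit_Framework_TestCase
{
    private $pdo;

    protected function setUp()
    {
        $this->pdo = new PDO('mysql:host=localhost;dbname=app_test', 'tester', 'secret');
        $this->pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
        $this->pdo->beginTransaction();        // open a DB transaction
    }

    protected function tearDown()
    {
        $this->pdo->rollBack();                // throw the test data away
    }

    public function testEnrollmentDateIsReadFromTheDatabase()
    {
        // Insert the row the query is supposed to find.
        $this->pdo->exec(
            "INSERT INTO accounts (id, enrollment_date) VALUES (42, '2000-01-01')"
        );

        // Build the objects and run the real query.
        $mapper = new AccountMapper($this->pdo);
        $person = new Person();
        $person->setAccount(new Account(42));

        $mapper->getAccountEnrollmentDate($person);

        // The value from the database ended up on the entity.
        $this->assertEquals('2000-01-01', $person->getAccountEnrollmentDate());
    }
}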
Is this a legitimate reason to keep both tests?
I think no one could tell from your simple example which test you need to keep. If you like, you can keep both. I say that because the answer depends a lot on unknown factors of your project, including future requirements and modifications.
Personally, I don't think you need both, at least for the same function. If your function is more likely to break in its logic than to be sensitive to the data, feel free to drop the database-backed tests. You will learn to pick the right type of test over time.
I'm torn because although it is testing for different things, it's essentially duplicate code.
Yes, it is.
If I get rid of the database test I can't assure the SQL will return the expected information.
This is a valid concern if your database gets complicated enough. You are using DBUnit, so I would assume you have a sample database with sample information. Over time, a production database may get multiple structural modifications (adding fields, removing fields, even reworking data relationships...). These things will add to the effort of maintaining your test database. On the other hand, such tests can also help you realize when some change to the database isn't right.
If I get rid of the unit test, I introduce a required database dependency at all times. As the codebase grows, the tests will become more slow over time, no out-of-box testing; extra effort from other developers to set up a configuration.
What's wrong with requiring a database dependency?
Firstly, I don't see how the extra effort from other developers to set up a configuration comes into play here. You only set it up once, don't you? In my opinion, it's a minor factor.
Second, the tests will get slower over time. Of course. You can count on an in-memory database like H2 or HSQLDB to mitigate that risk, but they will certainly take longer. However, do you actually need to worry? Most of these data-backed tests are integration tests. They should be the responsibility of a Continuous Integration server, which runs the exhaustive tests every now and then to prevent regression bugs. As developers, I think we should only run a small set of lightweight tests during development.
In the end, I would like to remind you of the number one rule of unit testing: test what can break. Don't write tests for their own sake. You will never have 100% coverage. However, since 20% of the code is used 80% of the time, focusing your testing effort on your main business flow might be a good idea.
I am working on a complex database application written in PHP and MySQLi. For large database operations I'm using daemons (also PHP) working in the background. During these operations, which may take several minutes, I want to prevent users from accessing the affected data and show them a message instead.
I thought about creating a MySQL table and inserting a row each time a specific daemon operation takes place. Then I would always be able to check whether a certain operation is running before letting anyone access the data.
However, it is of course important that the records do not stay in the database if the daemon process gets terminated for any reason (kill from the console, a lost database connection, pulling the plug, etc.). I do not think that MySQL transactions/rollbacks can do this, because a commit is necessary to make the changes visible to others, and the committed records would remain in the database if the process were then terminated.
Is there a way to ensure that the records get deleted if the process gets terminated?
This is an interesting problem, I actually implemented it for a University course a few years ago.
The trick I used is to play with the transaction isolation. If your daemons create the record indicating they are in progress, but do not commit it, then you are correct in saying that the other clients will not see that record. But, if you set the isolation level for that client to READ UNCOMMITTED, you will see the record saying it's in progress - since READ UNCOMMITTED will show you the changes from other clients which are not yet committed. (The daemon would be left with the default isolation level).
You should set the client's isolation level to READ UNCOMMITTED for that daemon check only, not for its other work, as that could be very dangerous.
If the daemon crashes, the transaction is aborted and the record goes away. If it succeeds, it can either mark the job as done in the DB or delete the record, and then commit.
This is really nice, since if the daemon fails for any reason all the changes it made are reversed and it can be retried.
If you need more explanation or code I may be able to find my assignment somewhere :)
Transaction isolation levels reference
Note that this all requires InnoDB or any good transactional DB.
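If it helps, here is a rough sketch of both sides (mysqli assumed; the jobs_in_progress table and all names are made up for the example):

<?php
// Daemon side: mark the operation as running, but do not commit the marker.
$daemon = new mysqli('localhost', 'app', 'secret', 'app');
$daemon->begin_transaction();
$daemon->query("INSERT INTO jobs_in_progress (job_name) VALUES ('rebuild_totals')");

// ... the long-running work on the affected tables goes here ...

// On success, remove the marker and commit. If the process is killed or
// loses its connection at any point, the server rolls the transaction back
// and the marker row disappears with it.
$daemon->query("DELETE FROM jobs_in_progress WHERE job_name = 'rebuild_totals'");
$daemon->commit();

// Client side: peek at uncommitted rows for this one check only.
$client = new mysqli('localhost', 'app', 'secret', 'app');
$client->query("SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED");
$result = $client->query("SELECT 1 FROM jobs_in_progress WHERE job_name = 'rebuild_totals'");
$busy   = $result->num_rows > 0;
$client->query("SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ"); // back to the default

if ($busy) {
    echo "This data is being rebuilt at the moment, please try again later.\n";
}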
I'm testing that a function properly adds data to a db, but I want the test data removed after the test is finished. If the test fails, it quits at the failure and never gets the chance to delete the test rows.
It's the only test that hits the db, so I don't really want to do anything in the tearDown() method.
I'm testing an $obj->save() type method that saves data parsed from a flat file.
If your database supports transactions, you could issue a start_transaction at the beginning of the test. If the test fails (causing the program to quit), an implicit rollback will be executed and undo your changes. If the test succeeds, issue an explicit rollback.
Another option is to wrap the assertions in a try-catch block: this prevents the test from halting (as well as other automatic behavior like capturing screenshots), and you can do whatever cleanup you need from that point.
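For instance, something like this (a sketch only: Importer, $this->pdo and the imports table are placeholders for however your test reaches the flat-file parser and the database):

public function testSaveWritesTheParsedRows()
{
    // Placeholder setup: whatever builds your $obj from the flat file.
    $obj = new Importer($this->pdo, __DIR__ . '/fixtures/sample.txt');
    $obj->save();

    try {
        $count = $this->pdo
            ->query("SELECT COUNT(*) FROM imports WHERE batch = 'test-fixture'")
            ->fetchColumn();
        $this->assertEquals(10, $count);
    } catch (Exception $e) {
        $failure = $e;                 // remember the failure, clean up first
    }

    // The cleanup runs whether the assertion passed or not.
    $this->pdo->exec("DELETE FROM imports WHERE batch = 'test-fixture'");

    if (isset($failure)) {
        throw $failure;                // now let PHPUnit report the failure
    }
}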
You should use separate databases for development/production and for testing, identical in structure, and every time you run the tests you drop the testing DB and restore it from data fixtures. The point is that this way you can be absolutely sure that your DB contains the same set of data every time you run your tests, so deleting test data is no big deal.
Are you using the suggested approach for database testing via the Database Testcase Extension?
Basically, if the test fails (read: if there is no error that makes PHPUnit exit), there should be no issue, because the database is seeded at the start of the test case:
the default implementation in PHPUnit will automatically truncate all tables specified and then insert the data from your data set in the order specified by the data set.
so there should be no need to do that manually. Even if there is an error, PHPUnit will clear the table on the next run.