I'm testing that a function properly adds data to a db, but I want the test data removed after the test is finished. If the test fails, it quits at the failure and never gets the chance to delete the test rows.
It's the only test that hits the db, so I don't really want to do anything in the tearDown() method.
I'm testing an $obj->save() type method that saves data parsed from a flat file.
If your database supports transactions, you could issue a start_transaction at the beginning of the test. If the test fails (causing the program to quit), an implicit rollback will be executed and undo your changes. If the test succeeds, issue an explicit rollback.
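A minimal sketch of that approach with PDO might look like this (the helper, class, and table names are placeholders, and it assumes the table uses a transactional engine such as InnoDB):

public function testSaveInsertsRows()
{
    $pdo = $this->getSharedPdo();      // hypothetical helper returning the connection the code under test uses
    $pdo->beginTransaction();          // start the transaction before touching the db

    $importer = new FlatFileImporter($pdo); // hypothetical class under test
    $importer->save();

    $count = $pdo->query("SELECT COUNT(*) FROM imported_rows")->fetchColumn();
    $this->assertEquals(3, $count);    // if this fails, the transaction is never committed,
                                       // so the rows vanish when the connection closes

    $pdo->rollBack();                  // test passed: explicitly undo the inserts
}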
Another option is to wrap the assertions in a try-catch statement - this prevents the test from halting (as well as other automatic features like capturing screenshots), and you can do whatever you need from that point.
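For example (a sketch only; the table name and $insertedCount are made up for illustration):

try {
    $this->assertEquals(3, $insertedCount);
    $this->assertEquals('expected name', $row['name']);
} catch (Exception $e) {
    $pdo->exec("DELETE FROM imported_rows WHERE source = 'test-fixture'"); // clean up on failure
    throw $e; // rethrow so PHPUnit still reports the failure
}
$pdo->exec("DELETE FROM imported_rows WHERE source = 'test-fixture'");     // clean up on success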
You should use a separate database for testing, identical in structure to your development/production database; every time you run the tests, drop the testing db and restore it from data fixtures. The point is that this way you can be absolutely sure your db contains the same set of data every time you run your tests, so deleting test data is no big deal.
Are you using the suggested approach for database testing via the Database Testcase Extension?
Basically, if the test fails (read: as long as there isn't an error that makes PHPUnit itself exit), there should be no issues, because the database is seeded on startup of the testcase:
the default implementation in PHPUnit will automatically truncate all tables specified and then insert the data from your data set in the order specified by the data set.
so there should be no need to do that manually. Even if there is an error, PHPUnit will clear the table on next run.
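For reference, a skeleton of that extension looks roughly like this (the DSN, schema name, and fixture path are placeholders):

class SaveTest extends PHPUnit_Extensions_Database_TestCase
{
    public function getConnection()
    {
        $pdo = new PDO('mysql:host=localhost;dbname=test_db', 'user', 'pass');
        return $this->createDefaultDBConnection($pdo, 'test_db');
    }

    public function getDataSet()
    {
        // Every table listed in the fixture is truncated and re-seeded before each test.
        return $this->createFlatXMLDataSet(dirname(__FILE__) . '/fixtures/seed.xml');
    }
}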
I recently encountered an issue on a live application.
I realized I had more and more concurrency exceptions and locks with a database.
Basically I start a transaction which requires a SELECT and an INSERT on the same table to commit.
But because the load is really heavy, each transaction locks the table; in most cases it's so fast it doesn't cause any problems, but at some point the locks start waiting on each other more and more.
I was able to somewhat fix this problem by tweaking the queries.
Though, now, I'd like to write some tests with PHPUnit to validate my fix and avoid any regressions.
I was not able to find any materials on how to do this.
Since PHP isn't multi-threaded, I have no idea how I could run concurrent queries in a single test to validate the fix.
Basically, I would like to be able to run multiple calls in a single test to ensure everything is ok.
I know I could try to do some high-level tests by querying the HTTP server directly and loading the whole application, but since my problem comes from a standalone library I'd rather test it on its own.
Any ideas?
The short answer is there's no good way to test concurrent reads/writes on an actual database with PHPUnit. It's simply not the right tool for that job.
But there are a few facets to a good solution for testing this. First the code can (and should) be written to handle every possible issue. A database system like PostgreSQL will fail immediately on locks and transaction issues. To handle it elegantly I use code that looks something like this (pseudo-code, also used to answer another question):
begin transaction
while not successful and count < 5
    try
        execute sql
        commit
    except
        if error code is '40P01' or '55P03'
            # Deadlock or lock not available
            sleep a random time (200 ms to 1 sec) * number of retries
        else if error code is '40001' or '25P02'
            # "In failed sql transaction" or serialized transaction failure
            rollback
            sleep a random time (200 ms to 1 sec) * number of retries
            begin transaction
        else if error message is 'There is no active transaction'
            sleep a random time (200 ms to 1 sec) * number of retries
            begin transaction
    increment count
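In PHP, a wrapper along these lines could implement that loop. runInTransaction() is an assumed name living on an assumed DB wrapper class (it matches the stub used in the unit test below), and it simplifies the branches above by always rolling back before retrying:

public function runInTransaction($work)
{
    $attempt = 0;
    while (true) {
        $attempt++;
        $this->pdo->beginTransaction();
        try {
            $work($this->pdo);          // $work is a callable containing the SQL
            $this->pdo->commit();
            return;
        } catch (PDOException $e) {
            // The transaction is aborted; roll it back before deciding what to do.
            if ($this->pdo->inTransaction()) {
                $this->pdo->rollBack();
            }
            $retryable = in_array($e->getCode(), array('40P01', '55P03', '40001', '25P02'), true);
            if (!$retryable || $attempt >= 5) {
                throw $e;
            }
            // Back off for 200 ms to 1 s, scaled by the attempt number.
            usleep(rand(200000, 1000000) * $attempt);
        }
    }
}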
Then create two sets of tests: one set should confirm the code is handling the situations correctly (i.e. unit tests). The other set of tests is for the environment (i.e. integration / functional tests).
Unit Tests
I find this to be a hard situation to reproduce in a PHPUnit test that connects to a database, and using a real database isn't appropriate for a true unit test. Instead, create PDO stubs and unit tests that throw every kind of database exception. This confirms the code is working as expected, but does not test concurrency on any real database. Remember, unit tests are only for confirming your code is written correctly, not for testing 3rd party software.
$iterationCount = 0;
$db->runInTransaction(function() use (&$iterationCount) {
    $iterationCount++;
    if ($iterationCount === 1) {
        $exception = new PDOExceptionStub('Deadlock');
        $exception->setCode('40P01');
        throw $exception;
    }
});
// First time fails, second time succeeds
$this->assertEquals(2, $iterationCount, 'Expected 2 iterations of runInTransaction');
Write a complete suite of tests that don't connect to the DB, but confirm logic.
Integration Tests
As you have found, PHPUnit is simply not the right tool to perform a load test. It's not appropriate for anything more complex than sequential unit and integration tests. You could run multiple instances of PHPUnit concurrently to put more load on the database. However I find this goes beyond what it was meant for, plus it doesn't help you monitor the database for issues. Therefore I see no way around the higher level tests you're looking to avoid.
But your library can be tested without running your complete application. I would create the simplest possible application just for testing it. It can have one or more CLI scripts that connect to a database. Those scripts can be spawned multiple times to put load on the database. Or make a simple web page with the library and use any of the many load testing applications out there to test it.
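For instance, a small driver script could spawn several copies of such a CLI worker and check their exit codes. This is only a sketch: worker.php and the worker count are placeholders, and it assumes a Unix-like system (/dev/null):

// Hypothetical load driver: start N workers that exercise the library concurrently.
$workers = 10;
$procs   = array();
$devnull = array(1 => array('file', '/dev/null', 'w'),   // discard stdout
                 2 => array('file', '/dev/null', 'w'));  // discard stderr

for ($i = 0; $i < $workers; $i++) {
    $procs[$i] = proc_open('php worker.php ' . $i, $devnull, $pipes);
}

$failures = 0;
foreach ($procs as $proc) {
    if (proc_close($proc) !== 0) {  // non-zero exit = worker hit an unrecoverable error
        $failures++;
    }
}

echo $failures === 0 ? "All workers finished cleanly\n" : "$failures worker(s) failed\n";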
I've got a PHP app with a custom model layer built on top of PDO. Each of my models is a separate class (with a common base class) that manages interactions with a given table (all SELECTs, INSERTs, and UPDATEs for the given table are instance methods of the given class). However, all the classes share a single PDO object, with a single DB connection, which is passed into the constructor of the model class, in a very simplistic form of dependency injection.
This is a bit of a simplification, but it works something like this:
<?php
$pdo = new \PDO(/*connection details*/);
$modelA = new ModelA($pdo);
$modelB = new ModelB($pdo);

$thingA = $modelA->find(1234); // where 1234 is the primary key

$thingB = $modelB->create([
    'modelb_id' => $thingA->id,
    'notes' => 'This call generates and runs an insert statement based on the array of fields and values passed in, applying specialized business logic, and other fun stuff.'
]);

print_r($thingB); // create() returns an object containing the data from the newly created row, so this would dump it to the screen.
At any rate, currently, everything just runs through PDO's prepared statements and gets committed immediately, because we're not using transactions. For the most part that's fine, but we're running into some issues.
First, if the database interaction is more complicated than what you see above, you can have a request that generates a large number of SQL queries, including several INSERTs and UPDATEs, that are all interrelated. If you hit an error or exception midway through the execution, the first half of the database work is already committed, but everything after that never runs. So you get orphaned records and other bad data. I know that is essentially the exact problem transactions were invented to solve, so it seems like a compelling case to use them.
The other thing we're running into is intermittent MySQL deadlocks when running at high traffic volumes. I'm still trying to get to the bottom of these, but I suspect transactions would help here, too, from what I'm reading.
Here's the thing though. My database logic is abstracted (as it should be), which ends up scattering it a bit. What I'm thinking I'd like to do is create a transaction right when I create my PDO object, and then all the queries run with the PDO object would be part of that transaction. Then as part of my app's tear down sequence for an HTTP request, commit the transaction, after everything has been done. And in my error handling logic, I'd probably want to call rollback.
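A rough sketch of what that bootstrap/teardown wiring could look like (the dispatch and error-handler functions are made up):

$pdo = new \PDO($dsn, $user, $pass, array(\PDO::ATTR_ERRMODE => \PDO::ERRMODE_EXCEPTION));
$pdo->beginTransaction();          // opened as soon as the shared connection exists

try {
    dispatchRequest($pdo);         // all models share this connection and this transaction
    $pdo->commit();                // part of the normal teardown sequence
} catch (Exception $e) {
    if ($pdo->inTransaction()) {
        $pdo->rollBack();          // error path: undo everything from this request
    }
    renderErrorPage($e);           // hypothetical error handler
}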
I'm going to do some experimentation before I try to implement this fully, but I was hoping to get some feedback from someone who might have gone down this road, about whether I'm on the right track, whether this is actually a terrible idea, or any tips for a more successful implementation. I also have some specific questions and concerns:
Will wrapping everything in transactions cause a performance hit? It seems like if you've got multiple simultaneous transactions, they have to be queued, so the first transaction can complete before the second can run. Or else you'd have to cache the state for your database, so that all the events in one transaction can follow their logical progression? Maybe not. Maybe it's actually faster. I have no idea.
We use MySQL replication. One environment uses MySQL's Master-Slave support, with the bin log format set to STATEMENT. The other environment uses Percona Cluster, with Master-Master replication. Are there any replication implications of this model?
Is there a timeout for the transaction? And when it times out, does it commit or rollback? I expect my normal application logic will either commit or rollback, depending on whether there was an error. But if the whole thing crashes, what happens to the transaction I've been building?
I mentioned we're running into MySQL deadlocks. I've added some logic to my models, where when PDO throws a PDOException, if it's a 1213, I catch it, sleep for up to 3 seconds, then try again. What happens to my transaction if there's a deadlock, or any PDOException for that matter? Does it automatically rollback? Or does it leave it alone, so I can attempt to recover?
What other risks am I completely overlooking that I need to account for? I can try out my idea and do some tests, but I'm worried about the case where everything I know to look for works, but what about the things I don't know to look for that come up when I push it out into the wild?
Right now, I don't use connection pooling; I just create a connection for each request, shared between all the model classes. But if I want to do connection pooling later, how does that interact with transactions?
I have a pretty clear picture of how and where to start the transaction: when I create my shared PDO object. But is there a best practice for the commit part? I have a script that does other teardown stuff for each request. I could put it there. I could maybe subclass PDO and add it to the destructor, but I've never done much with destructors in PHP.
Oh yeah, I need to get clear about how nested transactions work. In my preliminary tests, it looks like you can call BEGIN several times, and the first time you call COMMIT, it commits everything back to the first BEGIN. And any subsequent COMMITs don't do much. Is that right, or did I misinterpret my tests?
I am developing a Codeigniter (2.0.2) Application, which will utilise a Master database for all write operations (INSERT/UPDATE/DELETE) and a read replica for all read operations (SELECT).
Now I know I can access two different database objects within the code to route the individual requests to the specific database server, but I'm thinking there has to be a better, automated way. I'll be using MySQL and Active Record, and I also want to build in Memcache checking - although it won't be used immediately, I'd like the option there for the future, built in at this stage.
I'm wondering if it's possible to add a hook/library of some kind to intercept $this->db->query so that the following happens:
1) SQL Query received
2) Check if SELECT query
2a) If SELECT, see if Memcache is active, if so encode SQL and check Memcache for response.
2b) If no memcache response, or Memcache is not active, execute query as normal through READ MySQL server.
3) Query was NOT select, so execute query as normal through the WRITE MySQL server.
4) Return response.
I'm sure that, looking at this, it should be quite simple to do, but no matter how I look at it I'm just not seeing a potential answer - but there's got to be one! Can anyone help/assist?
In addition, I also want the ability to log all write SQL commands for troubleshooting; presumably the best way is to introduce 3a) Write SQL command to plain text file ... into the above scheme. I don't believe MySQL actually logs the non-SELECT queries in any way ... does it?
That type of behavior is a little bit beyond the normal scope of CI. Unfortunately, your best bet is to manually extend the database drivers, specifically override the function simple_query or _execute (simple_query is a wrapper around _execute which simply ensures initialization). That is really the only place where you can guarantee that you can catch all of the queries and branch the logic accordingly. (You may also want to override close as that is the cleanup script)
(Personally, I would have the SELECT DB load a secondary DB object into itself and just call $write_db->simple_query conditionally; that seems like it would be the least trouble.)
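A rough sketch of that idea follows. CodeIgniter 2 has no official extension point for DB drivers, so how this subclass actually gets loaded, and the $write_db property, are assumptions:

class MY_DB_mysql_driver extends CI_DB_mysql_driver
{
    public $write_db; // a second, manually loaded DB object pointing at the master

    function simple_query($sql)
    {
        if ( ! preg_match('/^\s*SELECT/i', $sql)) {
            // 3a) log every write statement for troubleshooting
            log_message('info', 'WRITE SQL: ' . $sql);
            // 3) route non-SELECTs to the master connection
            return $this->write_db->simple_query($sql);
        }

        // 2) SELECTs run against this (read replica) connection. A Memcache check
        //    would sit here, but it should cache fetched result arrays rather than
        //    the raw MySQL result resource that simple_query() returns.
        return parent::simple_query($sql);
    }
}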
A long time ago I wrote a php class that handles postgresql db connections.
I've added transactions to my insert/update functions and it works just fine for me. But recently I found out about the "pg_prepare" function.
I'm a bit confused about what that function does and if it'll be better to switch to it.
Currently whenever I do an insert/update my sql looks like this:
$transactionSql = "PREPARE TRANSACTION '".md5(time())."';"
                  .$theUpdateOrDeleteSQL.";"
                  ."COMMIT;";
This will return something like:
PREPARE TRANSACTION '4601a2e4b4aa2632167d3cc62b516e6d';
INSERT INTO users (username,g_id,email,password)
VALUES('test', '1', 'test','1234');
COMMIT;
I've structured my database with relations and I'm using (when it's possible):
ON DELETE CASCADE
ON UPDATE CASCADE
But I want to be 100% sure things are clean in the database and there are no leftovers if/when there is a failure upon updating/deleting or inserting.
It would be nice if you can share your opinion/experience about pg_prepare, whether I really need "PREPARE TRANSACTION", and any other additional things that might help me. :)
No, you don't need two-phase commit!
For safe PHP database handling, do not use pg_query directly; rather, wrap it in a special function which does the following:
opens the database connection on your first query
if using persistent connections, ensures the connection is in a known state
registers a shutdown function (via register_shutdown_function) that issues a ROLLBACK
makes sure autocommit is off, or simply issues a BEGIN before the first query
logs database errors and slow queries
only uses pg_query_params(), which takes care of SQL injection nicely
That way, if your script crashes or whatever, a rollback is issued automatically. You can only commit by explicitly committing.
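Something along these lines (a sketch only; the DSN, slow-query threshold, and logging destination are placeholders):

$GLOBALS['pg_conn'] = null;

function db_query($sql, array $params = array())
{
    if ($GLOBALS['pg_conn'] === null) {
        // open the connection lazily, on the first query
        $GLOBALS['pg_conn'] = pg_connect('host=localhost dbname=app user=app');
        pg_query($GLOBALS['pg_conn'], 'BEGIN');       // so nothing autocommits
        register_shutdown_function(function () {
            // if the script crashes or simply never commits, undo everything
            @pg_query($GLOBALS['pg_conn'], 'ROLLBACK');
        });
    }

    $start  = microtime(true);
    $result = pg_query_params($GLOBALS['pg_conn'], $sql, $params); // parameters, not string-building

    if ($result === false) {
        error_log('DB error: ' . pg_last_error($GLOBALS['pg_conn']) . ' in: ' . $sql);
    } elseif (microtime(true) - $start > 1.0) {
        error_log('Slow query (' . round(microtime(true) - $start, 2) . 's): ' . $sql);
    }
    return $result;
}

function db_commit()
{
    pg_query($GLOBALS['pg_conn'], 'COMMIT');
    pg_query($GLOBALS['pg_conn'], 'BEGIN'); // start the next transaction
}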
If you use persistent connections beware : php's handling of pg_pconnect is a little ... buggy.
No, you don't need prepare transaction (that is intended for distributed transactions across different servers, as Milen has already pointed out).
I'm not sure how the PHP interface handles that, but as long as you can make sure you are not running in auto commit mode, things should be fine.
If you can't control the auto commit mode, simply put your statements into a BEGIN ... COMMIT block
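For example (table and column names are made up):

pg_query($conn, 'BEGIN');
pg_query_params($conn, 'INSERT INTO users (username, email) VALUES ($1, $2)',
                array('test', 'test@example.com'));
pg_query_params($conn, 'UPDATE user_stats SET user_count = user_count + 1', array());
pg_query($conn, 'COMMIT');   // or issue ROLLBACK on failure and nothing is kept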
Can I insert something into a MySQL database using PHP and then immediately make a call to access that, or is the insert asynchronous (in which case the possibility exists that the database has not finished inserting the value before I query it)?
What I think the OP is asking is this:
<?
$id = $db->insert(..);
// in this case, $row will always have the data you just inserted!
$row = $db->select(...where id=$id...)
?>
In this case, if you do an insert, you will always be able to access the last inserted row with a select. That doesn't change even if a transaction is used here.
If the value is inserted in a transaction, it won't be accessible to any other transaction until your original transaction is committed. Other than that it ought to be accessible at least "very soon" after the time you commit it.
There are normally two ways of using MySQL (and most other SQL databases, for that matter):
Transactional. You start a transaction (either implicitly or by issuing something like 'BEGIN'), issue commands, and then either explicitly commit the transaction, or roll it back (failing to take any action before cutting off the database connection will result in automatic rollback).
Auto-commit. Each statement is automatically committed to the database as it's issued.
The default mode may vary, but even if you're in auto-commit mode, you can "switch" to transactional just by issuing a BEGIN.
If you're operating transactionally, any changes you make to the database will be local to your db connection/instance until you issue a commit. Issuing a commit should block until the transaction is fully committed, so once it returns without error, you can assume the data is there.
If you're operating in auto-commit (and your database library isn't doing something really strange), you can rely on data you've just entered to be available as soon as the call that inserts the data returns.
Note that best practice is to always operate transactionally. Even if you're only issuing a single atomic statement, it's good to be in the habit of properly BEGINing and COMMITing a transaction. It also saves you from trouble when a new version of your database library switches to transactional mode by default and suddenly all your one-line SQL statements never get committed. :)
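With PDO, that habit is only a few lines (the table and column names here are made up):

$pdo->beginTransaction();
try {
    $stmt = $pdo->prepare('INSERT INTO items (name) VALUES (?)');
    $stmt->execute(array('widget'));
    $pdo->commit();      // the data is durable once this returns without error
} catch (Exception $e) {
    $pdo->rollBack();    // nothing from this block is kept
    throw $e;
}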
Mostly the answer is yes. You would have to do some special work to force a database call to be asynchronous in the way you describe, and as long as you're doing it all in the same thread, you should be fine.
What is the context in which you're asking the question?