PHPUnit - test the validity of an SQL Query

PHPUnit - test the validity of an SQL Query - php

I'm in the process of testing a factory class.
One of the methods must load the data in an array for an object that another method will instantiate.
That method contains the SQL query that holds a critical condition that must be tested. ( in this case only ask for the records that are "published". Ex.: WHERE published=1 ). That distinction in the SQL Query is the only detail that makes that method differ from another one, and I want to test the query execution behaviour.
Now, I can't really mock my PDO object and ask it to return a fixed result as I would not test the execution of the query by mySQL. That would make a useless test.
That leads me to think that I'll need to set up a static database with fixed test data inside it. Am I right on this or have I missed something?
Should I separate the test requiring the "test database" from the tests that are autonomous ?

I strongly agree on the part of not mocking out the PDO. At some point i want to make sure my queries work against a real database. While this might not be a unit test anymore, technically. For me it offers so much more to know that my code that handles data storage really works against the db.
What i tend to do is creating a sort of Data Access class for each Class that needs to talk to the DB and so separating out most of the business logic from the db access code.
That way i can mock out the data access when testing the classes and after that set up a "Test Database" for each "data access class" and see if those work.
#zerkms Answer (+1) already linked http://phpunit.de/manual/current/en/database.html and the only other resource i have found to be of value when it comes to DB-Testing is the Book Real-World Solutions for Developing High-Quality PHP Frameworks and Applications that has a big chapter covering this subject.
Should I separate test requiring the "test database" from the tests that are autonomous ?
Only if your testsuite gets really really big and you have runtime issues that force you to say "hitting even a testing database just takes to long for all my tests so i run those only on a continuous integration server and not while developing.

Yes, it is the common practice.
It is also the special methods for loading fixtures: http://phpunit.de/manual/current/en/database.html
What about second question: no, you should not separate them. Your tests should test the behaviour, not the details of implementation. Join test cases into test classes by logic.

Related

What's the difference between mocking an object and just setting a value

I know that me asking this question means that i probably didn't understand mocking fully. I know why using mocks (isolation) , i know how to use mocks (each FW and it's way) - but i don't understand - if the purpose of mocking is to return expected value to isolate my unit test, why should i mock an object and not just create the value by myself?
Why this:
$searchEngine = Mockery::mock('elasticSearch');
$searchEngine->shouldReceive('setupDataConnection')->once()->andReturn("data connected");
$insertData = new DataInserter;
$this->assertEquals("data connected",$insertData->insertData($searchEngine));
And not this:
$knownResult = "data connected";
$insertData = new DataInserter;
$this->assertEquals($insertData->insertData($searchEngine) ,$knownResult);
Edit
Sorry for the mistake, but the accidently didn't include the insertData in the second example.

by using a mock you have access to additional information, from a object that behaves directly like a C style struct ( only performs data storage ) then there is little difference (with the exception about making assertions about calls to your mocks that are usually rather useful since you can then make sure that your value is for example set 3 times, and not 7 (maybe with some intermediate value that could cause potential problems. )
Mocks are useful for testing the interactions between objects.
Now if your class does something more complex ( such as access a resource like a database, or read/write data from a file. ) then mocks become lifesaving because they pretty much allow you to abstract away the internal behavior of classes that are not under test (for example lets imagine that the first version of your class simply stores values in memory, the next version stores them in a specific place in a database, this way you can first of all save yourself the resource hit of a database access and secondly be able to effectively prove that your class under test works correctly as opposed to also proving that your data access class works. if you have a failed test then you can hone in on the issue immediately as opposed to trying to have to figure out if the issue is in your data provider or data accessor. )
Because tests can be ran in parallel certain resources can cause false failures.
A few examples of interactions which are extremely difficult to test without mocks would be:
Code which talks to a Database Access layer ( ensuring that queries are correct, close() is called on the connection, and appropriate SQL is sent, plus you don't modify a real database. )
File/Socket IO (again ensuring that data is consistent)
System calls ( e.g. calls to php exec)
anything that relies on threads / timing ( not as much of an issue in PHP)
GUI Code ( again, almost unheard of in PHP) but if I shift the language to lets say java its significantly easier to call :
when(myDialogBoxMock).showOptionInputDialog().thenReturn("mockInput")
then to try to fake it with subclasses or temporary implementations/subclasses.
Calls to specific methods which must be called with a specific value only.
verify(myMock.setValue(7), times(2));
verifyNoOtherInteractions(myMock);
A large part of this is also speed, if you are reading a file off the disk lets say 300/400 times in a massive codebase then you will definitely notice a speed increase by using mocks.
If you have a tool like EMMA/jacoco for java in PHP you will be able to have a effective code coverage report to show you where your tests are not covering code. And on any non-trivial application you will find yourself trying to figure out how the hell to get your object under test into a specific state to test specific behavior, Mocks with DI are really your tool to perform these tasks.

In the first example you are providing a fake collaborator ($searchEngine) for your DataInserter. In the assert statement you are calling insertData, which will execute the logic of your DataInserter that will in term interact with the fake $searchEngine. By providing the fake, you can verify that the logic of DataInserter works correctly in isolation.
In your second example, you are not testing the DataInserter logic. You are just checking whether the constant string $knownResult is equal to "data connected", i.e. you are testing string equality of php. As far as I can see from the snippet you are constructing a new DataInserter object but you aren't even exercising its code in the test.

Unit testing with Symfony 2 and database

I'm trying to understand the unit tests when a database is involved. My classes heavily depends on a database to do their work. I already had a look at many many answers here on stackoverflow and on the internet in general, but I'm not really happy with what I found.
One of the common problems that I see when a database is involved (MySQL in my case) is that the tests are really really long to execute and one possible solution is to use SQLite.
I can't use it everywhere in the tests because some classes uses RAW queries instead of the Doctrine querybuilder to bypass the limitation of unknown datetime functions. SQLite has a different set of functions for date values and this means that to use it in the tests I should rewrite my queries twice, one for MySQL and one for SQLite.
I decided then to stick with MySQL, but creating and dropping the schema every time a test runs takes so much time. I know that the SchemaTool class can handle the creation of a subset of the schema: would be a good solution to create only the tables I really need in the test suite or should I always use the full schema?
Example of the part of code (pseudo-code) I'm trying to test
NotificationManagerClass
constructor(EntityManager)
loadNotifications()
deleteNotification()
updateNotification()
as you can see, I inject the entity manager in the constructor of the class. The class is a service registered in the Symfony container. In the controller I then get this service and use its methods. Inside the methods I use the querybuilder because I must have access to some other services and a repository isn't container-aware. How can I decouple more than this my class? I can't think a way to do it

You mix a lot of words together, which don't all have the meaning you gave them. A quick summary of the words:
Unit tests - Tests only one class. It doesn't test nor instantiate any other classes (it uses mocks or stubs for them). It shouldn't be using a database, it should mock doctrine instead.
Web tests - These tests test how multiple classes work together using database. These are often quite slow, so you don't want to have very many very specific tests here.
MySQL and SQLite are both just database driver. SQLite is as slow as MySQL and it really doesn't matter which one you use (well, it does matter, you have to use the same driver as you use in production)
You can't mock everything, because you decided to use raw mysql functions (bad choice)... Always avoid to use mysql queries inside classes. Use either Doctrine (there are a lot of simple ways to bypass the unknown datetime functions) or put the raw queries in the Repository and mock the repository.
In short:
Don't use anything other than the tested object in unit tests (mock/stub everything else)
Web Tests should be using production tools
Don't use mysql directly in a class

Test involving a database is not a unit test. Your repositories should be tested with integration or functional tests (or acceptance if you write such).
If you have many tests involving database, that probably means you took a wrong turn somewhere, either with:
code design - your classes are coupled to the database
deciding how to test - writing too many integration tests instead of unit tests
Might be worth looking into the test pyramid: http://martinfowler.com/bliki/TestPyramid.html
By looking into how to speed up your tests you'll only continue walking the wrong path. The solution is to write more unit tests and less integration tests. Unit tests are fast, independent and isolated, therefore harder to break. Integration tests are more fragile and slow. Look into FIRST properties of unit tests.
Btw: SQLite is a dead end. It doesn't behave like mysql as it doesn't support constraints.
Your example class could be modelled as:
class NotificationManager
{
public function __construct(MyNotificationRepository $repository, MyFabulousService $service)
{}
// ... other methods
}
This way you can inject fakes of both repository and your service and test the NotificationManager in isolation.
There's no much value in unit testing query builders as you'd be testing doctrine instead of your code. It is also hard, since the query class is declared final. You can test if queries return correct results functionally. There's gonna be lot less functional tests needed if you unit test your classes properly.
If you made the MyNotificationRepository an interface, you could even decouple from doctrine.

Where would a senior PHP developer locate the method getActiveEntries()?

Separation of Concerns or Single Responsibility Principle
The majority of the questions in the dropdown list of questions that "may already have your answer" only explain "theory" and are not concrete examples that answer my simple question.
What I'm trying to accomplish
I have a class named GuestbookEntry that maps to the properties that are in the database table named "guestbook". Very simple!
Originally, I had a static method named getActiveEntries() that retrieved an array of all GuestbookEntry objects that had entries in the database. Then while learning how to properly design php classes, I learned two things:
Static methods are not desirable.
Separation of Concerns
My question:
Dealing with Separation of Concerns, if the GuestbookEntry class should only be responsible for managing single guestbook entries then where should this getActiveEntries() method go? I want to learn the absolute proper way to do this.

I guess there are actually two items in the question:
As #duskwuff points out, there is nothing wrong with static methods per-se; if you know their caveats (e.g. "Late Static Binding") and limitations, they are just another tool to work with. However, the way you model the interaction with the DB does have an impact on separation of concerns and, for example, unit testing.
For different reasons there is no "absolute proper way" of doing persistence. One of the reasons is that each way of tackling it has different tradeoffs; which one is better for you project is hard to tell. The other important reason is that languages evolve, so a new language feature can improve the way frameworks handle things. So, instead of looking for the perfect way of doing it you may want to consider different ways of approaching OO persistence assuming that you want so use a relational database:
Use the Active Record pattern. What you have done so far looks like is in the Active Record style, so you may find it natural. The active record has the advantage of being simple to grasp, but tends to be tightly coupled with the DB (of course this depends on the implementation). This is bad from the separation of concerns view and may complicate testing.
Use an ORM (like Doctrine or Propel). In this case most of the hard work is done by the framework (BD mapping, foreign keys, cascade deletes, some standard queries, etc.), but you must adapt to the framework rules (e.g. I recall having a lot of problems with the way Doctrine 1 handled hierarchies in a project. AFAIK this things are solved in Doctrine 2).
Roll your own framework to suite your project needs and your programming style. This is clearly a time consuming task, but you learn a lot.
As a general rule of thumb I try to keep my domain models as independent as possible from things like DB, mainly because of unit tests. Unit tests should be running all the time while you are programming and thus they should run fast (I always keep a terminal open while I program and I'm constantly switching to it to run the whole suite after applying changes). If you have to interact with the DB then your tests will become slow (any mid-sized system will have 100 or 200 test methods, so the methods should run in the order of milliseconds to be useful).
Finally, there are different techniques to cope with objects that communicate with DBs (e.g. mock objects), but a good advise is to always have a layer between your domain model and the DB. This will make your model more flexible to changes and easier to test.
EDIT: I forgot to mention the DAO approach, also stated in #MikeSW answer.
HTH

Stop second-guessing yourself. A static method getActiveEntries() is a perfectly reasonable way to solve this problem.
The "enterprisey" solution that you're looking for would probably involve something along the lines of creating a GuestbookEntryFactory object, configuring it to fetch active entries, and executing it. This is silly. Static methods are a tool, and this is an entirely appropriate job for them.

GetActiveEntries should be a method of a repository/DAO which will return an array of GuestBookEntry. That repository of course can implement an interface so here you have easy testing.
I disagree in using a static method for this, as it's clearly a persistence access issue. The GuestBookEntry should care only about the 'business' functionality and never about db. That's why it's useful to use a repository, that object will bridge the business layer to the db layer.
Edit My php's rusty but you get the idea.
public interface IRetrieveEntries
{
function GetActiveEntries();
}
public class EntriesRepository implements IRetrieveEntries
{
private $_db;
function __constructor($db)
{
$this->_db=$db;
}
function GetActiveEntries()
{
/* use $db to retreive the actual entries
You can use an ORM, PDO whatever you want
*/
//return entries;
}
}
You'll pass the interface around everywhere you need to access the functionality i.e you don't couple the code to the actual repository. You will use it for unit testing as well. The point is that you encapsulate the real data access in the GetActiveEntries method, the rest of the app won't know about the database.
About repository pattern you can read some tutorials I've wrote (ignore the C# used, the concepts are valid in any language)

Unit Testing: Specific testing & Flow of Control

I am quite new to unit testing and testing in general. I am developing with phpUnit, but as my question is more general / a design question, the actual environment shouldn't be of too much importance.
I assume, that it is a good practice, to write your testcases as specific as possible. For example (the later the better):
assertNotEmpty($myObject); // myObject is not Null
assertInternalType('array', $myObject); // myObject is an array
assertGreaterThan(0, count($myObject)); // myObject actually has entries
If that is correct, here is my question:
Is it a accepted practice to write some flow control inside a testcase, if the state of the object one is testing against depends on external sources (i.e. DB) - or even in general?
Like:
if (myObject !== null) {
if (count(myObject) > 0) {
// assert some Business Logic
}
else {
// assert some different Business Logic
}
}
Is this kind of flow control inside a testcase admissible or is a "code smell" and should be circumvented? If it is ok, are there any tips or practices, which should be kept in mind here?

Paul's answer addresses test method scope and assertions, but one thing your question's code implied is that you would test A if the returned object had value X but test B if it had value Y. In other words, your test would expect multiple values and test different things based on what it actually got.
In general, you will be a happier tester if each of your tests has a single, known, correct value. You achieve this by using fixed, known test data, often by setting it up inside the test itself. Here are three common paths:
Fill a database with a set of fixed data used by all tests. This will evolve as you add more tests and functionality, and it can become cumbersome to keep it up-to-date as things move. If you have tests that modify the data, you either need to reset the data after each test or rollback the changes.
Create a streamlined data set for each test. During setUp() the test drops and recreates the database using its specific data set. It can make writing tests easier initially, but you still must update the datasets as the objects evolve. This can also make running the tests take longer.
Use mocks for your data access objects when not testing those DAOs directly. This allows you to specify in the test exactly what data should be returned and when. Since you're not testing the DAO code, it's okay to mock it out. This makes the tests run quickly and means you don't need to manage data sets. You still have to manage the mock data, however, and write the mocking code. There are many mocking frameworks depending on your preference, including PHPUnit's own built-in one.

It's okay to have SOME control flow within your testcases, but in general, understand that your unit tests will work out best if they are disjoint, that is, they each test for different things. The reason this works out well is that when your test cases fail, you can see precisely what the failure is from the testcase that failed, as opposed to needing to go deeper inside a larger test case to see what went wrong. The usual metric is a single assertion per unit test case. That said, there are exceptions to every rule, and that's certainly one of them; there's nothing necessarily wrong with having a couple of assertions in your test case, particularly when the setup / teardown of the test case scenario is particularly difficult. However, the real code smell you want to avoid is the situation where you have one "test" which tests all the code paths.

What are standard/best practices for creating unit tests for functionality using databases?

I get the idea behind unit testing however am trying to think of a clean simple way do it when it requires database functionality. For instance I have a function that return result based off a database select query. Would the database alway have to remain the same for me to properly see that only the correct results are being returned. What is the best way to perform unit testing (in PHP) when it requires database inactivity (whether it be read, write, update, or delete)?

There is a whole chapter on that in the PHPUnit manual:
http://www.phpunit.de/manual/current/en/database.html and also worth reading
http://matthewturland.com/2010/01/04/database-testing-with-phpunit-and-mysql/
It's like with everything else when Unit-Testing. Create a known state and test your code against it to see if it returns the expected results.

Personally, I create a dummy testing database and populate it with a known data set for each testing run (I do that right in the setUp functions). Then the tests run against that data set, and then it's removed on tearDown...
Now, this is more of a Integration test than an Unit test (And personally I treat it differently from a unit test, run on its own schedule along with other integration tests), but it's still quite useful.

It is not a unit test if it needs the database.

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.