I'm trying to understand how unit testing works. So far I understand you're testing the output of functions based on what you put in. OK. Does that mean that if your function can only return one datatype, you only have to write one test for it? Say I write a function that can only return TRUE or FALSE; does that mean I would just be checking whether the response is boolean?
And then say I have a function that pulls a blog's posts from a database. Say the function is set to: if the number of rows = 0, return FALSE, else return the results. So now I have a function that could return either a boolean value or an array. How the heck do you test that? The function no longer relies solely on the input; it relies on what's in the database.
A useful function does not return true or false randomly. It returns true under certain circumstances and false under other circumstances.
A unit test would therefore do three things (in that order):
Create an environment from which you know whether the function should return true or false (this environment is called the test fixture).
Run the function that is to be tested.
Compare the result of the function with your expectations.
Obviously this cannot be done with all possible inputs (if it could be done, this would be a proof, not a test), so the art of unit testing is choosing appropriate test fixtures. There are two ways to go about this:
Black box testing looks at the specification of the function to choose the fixtures: what are the valid input values and how should they influence the output?
White box testing looks into the function and chooses fixtures such that every statement of the function is executed by at least one fixture.
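For instance, here is a minimal sketch in PHPUnit (the isAdult() function and the age boundary are invented for illustration) of black box fixtures chosen around a boundary in the specification:

<?php
// Hypothetical function under test: supposed to return TRUE for ages 18 and up.
function isAdult($age)
{
    return $age >= 18;
}

class IsAdultTest extends PHPUnit_Framework_TestCase
{
    public function testJustBelowTheBoundaryReturnsFalse()
    {
        $this->assertFalse(isAdult(17)); // fixture chosen just below the boundary
    }

    public function testTheBoundaryItselfReturnsTrue()
    {
        $this->assertTrue(isAdult(18)); // fixture on the boundary itself
    }
}

Note that both tests assert a concrete expected value, not merely that the result is boolean, which answers the first question above.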
... if number of rows = 0 return FALSE, else return the results.
If such a function directly sends a query to a database server by calling an API of the DBMS (e.g. mysql_query), the creation of the test fixture would include
Setting up a database.
Inserting rows into the database (or not, depending on whether the test should check that the results are returned properly or FALSE is returned properly).
If, on the other hand, that function calls a custom function (e.g. one provided by an object-relational mapper), then the test fixture should provide a mock implementation of that custom function. Dependency injection helps to ensure loose coupling of caller and callee, so the real implementation can be replaced with the mock implementation at runtime.
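As a rough sketch of that last point (the Blog and PostMapper names are invented for illustration, and the behaviour follows the "return FALSE on zero rows" spec from the question):

<?php
interface PostMapper
{
    public function findPublished();
}

// Hand-written mock standing in for the real ORM-backed implementation.
class EmptyPostMapper implements PostMapper
{
    public function findPublished()
    {
        return array(); // simulate "no rows in the database"
    }
}

class BlogTest extends PHPUnit_Framework_TestCase
{
    public function testGetPostsReturnsFalseWhenThereAreNoRows()
    {
        // Hypothetical Blog class that takes its mapper via constructor injection,
        // so the real implementation can be swapped for the mock at runtime.
        $blog = new Blog(new EmptyPostMapper());
        $this->assertFalse($blog->getPosts());
    }
}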
You have the gist of it about right. If the unit has external dependencies, there are two approaches: creating a mock class or using some kind of fixture.
Using your DB example: if you're testing a class that has to call on a data access class to pull data from the database, then you would typically mock that data access class, making the methods you plan on calling return the response you expect.
If on the other hand you were actually testing the data access class itself then you would use a fixture - a test db filled with data you know. And then test that the data access class returned the right data.
I think you need to do a bit of reading about unit testing. To answer your first question about the cases you should test, I suggest you read up on boundary-value testing and equivalence partitioning. In your case, yes, you assert that the output is true or false, but are there multiple preconditions that can lead to the same result? If so, these need testing.
Regarding your second question where the return value depends on the database. This is strictly not a unit test (i.e. it involves multiple units). I suggest you spend some time reading about mock objects.
Unit testing just means that you're getting the outputs you expect. Write a test for every conceivable, unique state your program could be in and ensure the output of your function is what you expect.
For example, you could test the cases where your database is empty, has a few rows, or is full.
Then for each state, give different inputs (a couple invalid, a couple valid).
Unit testing just breaks up your programs into units and makes sure each unit works as expected.
EDIT:
Wikipedia has a great article on unit testing.
For unit testing you are trying to isolate and test certain pieces of code. You do this by providing the inputs such that you know what the answer should be if the code is working properly. If you are testing whether a function returns a boolean correctly then you need to provide inputs that you know will evaluate to say "false", then check to ensure that the function did indeed return "false". You will also want to write a test that should evaluate to "true". In fact, write as many tests with as many different inputs as you like until you are convinced that the function is working as expected.
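In PHPUnit, "as many tests with as many different inputs as you like" can be written compactly with a data provider; a sketch, using PHP's built-in filter_var() as a stand-in for your boolean-returning function:

<?php
function isValidEmail($input)
{
    return filter_var($input, FILTER_VALIDATE_EMAIL) !== false;
}

class EmailValidatorTest extends PHPUnit_Framework_TestCase
{
    /**
     * @dataProvider emailProvider
     */
    public function testIsValidEmail($input, $expected)
    {
        // Each row below is one input with its known expected result.
        $this->assertSame($expected, isValidEmail($input));
    }

    public function emailProvider()
    {
        return array(
            array('user@example.com', true),  // should evaluate to true
            array('not-an-email', false),     // should evaluate to false
            array('', false),                 // should evaluate to false
        );
    }
}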
There seem to be mixed opinions when it comes to unit testing with a database. Some argue that if you are using a database, you are performing integration testing. Regardless, for a unit test you should know what the input is and what the expected answer should be. Your explanation of what you want to test doesn't align with what unit testing is really for. One suggestion would be to create a test database that contains the same data each time you run the unit test. That way you always know what the input is and what the output should be. You could accomplish this with an in-memory database such as Hypersonic.
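Hypersonic is a Java database; a common PHP analogue would be an in-memory SQLite database through PDO. A sketch of the idea (table and data are invented for illustration):

<?php
class PostQueryTest extends PHPUnit_Framework_TestCase
{
    private $pdo;

    protected function setUp()
    {
        // A fresh in-memory database for every test, so the input is always known.
        $this->pdo = new PDO('sqlite::memory:');
        $this->pdo->exec('CREATE TABLE posts (id INTEGER, title TEXT)');
        $this->pdo->exec("INSERT INTO posts VALUES (1, 'First post')");
    }

    public function testSelectReturnsTheKnownRow()
    {
        $rows = $this->pdo->query('SELECT * FROM posts')->fetchAll(PDO::FETCH_ASSOC);
        $this->assertCount(1, $rows);
        $this->assertEquals('First post', $rows[0]['title']);
    }
}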
Hope this helps!
If you allow me a suggestion, take this book: THE ART OF UNIT TESTING, by Roy Osherove. http://amzn.com/1933988274. It's not one of those huge, intimidating tomes.
I had never read anything about unit testing prior to this book, but now it's very clear to me how unit testing works.
Hope it helps!
I should point out that unit testing is not just about what results your functions give out; it's also about how they mutate / impact other objects, as well. In a sense, they test what messages are passed between objects.
For example, I want to design a Job entity. A Job has a status, and I need to mark a Job as done.
I wrote a markDone() method and I want to test it. So to do an assertion I need one more method - isDone(). But at the moment I don't use the isDone() method in my code.
Is it OK to write such useless methods just to please TDD?
Is it OK to write such useless methods just to please TDD?
Yes, but maybe not.
The yes part: you are allowed to shape your design in such a way that testing is easier, including creating methods specifically for observability. Such things often come in handy later when you are trying to build views to understand processes running in production.
The maybe not part: What's Job::status for? What observable behavior(s) in the system change when that status is set to done? What bugs would be filed if Job::markDone were a no-op? That's the kind of thing you really want to be testing.
For instance, it might be that you need to be able to describe the job as a JSON document, and changing the job status changes the value that appears in the JSON. Great! Test that.
job.markDone()
json = job.asJson()
assert "DONE".equals(json.getText("status))
In the domain-layer / business-logic, objects are interesting for how they use their hidden data structures to respond to queries. Commands are interesting, not because they mutate the data structure, but because the mutated data structure produces different query results.
If the method has no use outside of testing, I'd avoid writing it since there's probably another way that you can check whether or not the job is done. Is the job's status public, or do you have a method that returns the job's status? If so, I'd use that to test whether markDone() is working properly. If not, you can use reflection to check the value of a private or protected object property.
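A sketch of the reflection approach (it assumes Job stores a private $status property that markDone() sets to 'done', which is an invented detail):

<?php
class JobTest extends PHPUnit_Framework_TestCase
{
    public function testMarkDoneSetsTheStatus()
    {
        $job = new Job();
        $job->markDone();

        // Read the private property without adding a public isDone() method.
        $property = new ReflectionProperty('Job', 'status');
        $property->setAccessible(true);

        $this->assertEquals('done', $property->getValue($job));
    }
}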
Can `isDone()` just reside in the test project and not the production code? That way, you wouldn't have useless production code just for the sake of testing.
I know that me asking this question means I probably didn't understand mocking fully. I know why mocks are used (isolation), and I know how to use mocks (each framework in its own way) - but I don't understand: if the purpose of mocking is to return an expected value to isolate my unit test, why should I mock an object and not just create the value by myself?
Why this:
$searchEngine = Mockery::mock('elasticSearch');
$searchEngine->shouldReceive('setupDataConnection')->once()->andReturn("data connected");
$insertData = new DataInserter;
$this->assertEquals("data connected", $insertData->insertData($searchEngine));
And not this:
$knownResult = "data connected";
$insertData = new DataInserter;
$this->assertEquals($insertData->insertData($searchEngine), $knownResult);
Edit
Sorry for the mistake; I accidentally didn't include the insertData call in the second example at first.
By using a mock you have access to additional information. For an object that behaves like a C-style struct (it only performs data storage) there is little difference, except that you can make assertions about the calls to your mocks, which is usually rather useful: you can make sure that your value is, for example, set 3 times and not 7 (perhaps with some intermediate value that could cause problems).
Mocks are useful for testing the interactions between objects.
Now if your class does something more complex (such as accessing a resource like a database, or reading/writing data from a file), then mocks become lifesaving, because they pretty much allow you to abstract away the internal behavior of classes that are not under test. For example, imagine that the first version of your class simply stores values in memory, and the next version stores them in a specific place in a database. With a mock you can, first of all, save yourself the resource hit of a database access, and secondly, effectively prove that your class under test works correctly, as opposed to also proving that your data access class works. If you have a failed test, you can home in on the issue immediately instead of having to figure out whether the issue is in your data provider or your data accessor.
Because tests can be run in parallel, certain shared resources can cause false failures.
A few examples of interactions which are extremely difficult to test without mocks would be:
Code which talks to a database access layer (ensuring that queries are correct, close() is called on the connection, and appropriate SQL is sent - plus you don't modify a real database).
File/socket IO (again, ensuring that data is consistent).
System calls (e.g. calls to PHP's exec).
Anything that relies on threads/timing (not as much of an issue in PHP).
GUI code (again, almost unheard of in PHP) - but if I shift the language to, let's say, Java, it's significantly easier to call:
when(myDialogBoxMock.showOptionInputDialog()).thenReturn("mockInput");
than to try to fake it with subclasses or temporary implementations.
Calls to specific methods which must be called with a specific value only.
verify(myMock, times(2)).setValue(7);
verifyNoMoreInteractions(myMock);
A large part of this is also speed: if you are reading a file off the disk, let's say, 300-400 times in a massive codebase, then you will definitely notice a speed increase by using mocks.
If you have a code coverage tool in PHP (like EMMA/JaCoCo for Java), you will be able to get an effective code coverage report showing where your tests are not covering code. And in any non-trivial application you will find yourself trying to figure out how the hell to get your object under test into a specific state to test specific behavior; mocks with DI are really your tool for performing these tasks.
In the first example you are providing a fake collaborator ($searchEngine) for your DataInserter. In the assert statement you are calling insertData, which will execute the logic of your DataInserter, which will in turn interact with the fake $searchEngine. By providing the fake, you can verify that the logic of DataInserter works correctly in isolation.
In your second example, you are not testing the DataInserter logic. You are just checking whether the constant string $knownResult is equal to "data connected", i.e. you are testing PHP's string equality. As far as I can see from the snippet, you are constructing a new DataInserter object, but you aren't even exercising its code in the test.
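To spell out what the mock buys you over a plain string, here is the first example again with its expectations made explicit (class and method names are taken from your snippet; everything else is assumed):

<?php
// The mock defines exactly what the collaborator will answer, in isolation
// from any real elasticsearch server.
$searchEngine = Mockery::mock('elasticSearch');
$searchEngine->shouldReceive('setupDataConnection')
             ->once()
             ->andReturn('data connected');

$insertData = new DataInserter;

// This exercises DataInserter's own logic: if insertData() never calls
// setupDataConnection(), or calls it twice, the ->once() expectation fails.
$this->assertEquals('data connected', $insertData->insertData($searchEngine));

Mockery::close(); // verifies the expectations set on the mock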
I am quite new to unit testing and testing in general. I am developing with phpUnit, but as my question is more general / a design question, the actual environment shouldn't be of too much importance.
I assume that it is good practice to write your test cases as specifically as possible. For example (the later in the list, the more specific):
assertNotEmpty($myObject); // myObject is not Null
assertInternalType('array', $myObject); // myObject is an array
assertGreaterThan(0, count($myObject)); // myObject actually has entries
If that is correct, here is my question:
Is it an accepted practice to write some flow control inside a test case if the state of the object one is testing against depends on external sources (e.g. a DB) - or even in general?
Like:
if ($myObject !== null) {
    if (count($myObject) > 0) {
        // assert some business logic
    }
    else {
        // assert some different business logic
    }
}
Is this kind of flow control inside a test case admissible, or is it a "code smell" that should be circumvented? If it is OK, are there any tips or practices which should be kept in mind here?
Paul's answer addresses test method scope and assertions, but one thing your question's code implied is that you would test A if the returned object had value X but test B if it had value Y. In other words, your test would expect multiple values and test different things based on what it actually got.
In general, you will be a happier tester if each of your tests has a single, known, correct value. You achieve this by using fixed, known test data, often by setting it up inside the test itself. Here are three common paths:
Fill a database with a set of fixed data used by all tests. This will evolve as you add more tests and functionality, and it can become cumbersome to keep it up-to-date as things move. If you have tests that modify the data, you either need to reset the data after each test or roll back the changes.
Create a streamlined data set for each test. During setUp() the test drops and recreates the database using its specific data set. It can make writing tests easier initially, but you still must update the datasets as the objects evolve. This can also make running the tests take longer.
Use mocks for your data access objects when not testing those DAOs directly. This allows you to specify in the test exactly what data should be returned and when. Since you're not testing the DAO code, it's okay to mock it out. This makes the tests run quickly and means you don't need to manage data sets. You still have to manage the mock data, however, and write the mocking code. There are many mocking frameworks depending on your preference, including PHPUnit's own built-in one.
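A sketch of that third path using PHPUnit's built-in mocking (PostDao, its findAll() method, and PostService are invented for illustration):

<?php
class PostServiceTest extends PHPUnit_Framework_TestCase
{
    public function testGetPostsReturnsWhatTheDaoDelivers()
    {
        // Mock the DAO so the test specifies exactly what data comes back and when.
        $dao = $this->getMock('PostDao', array('findAll'));
        $dao->expects($this->once())
            ->method('findAll')
            ->will($this->returnValue(array('first post', 'second post')));

        $service = new PostService($dao);
        $this->assertEquals(array('first post', 'second post'), $service->getPosts());
    }
}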
It's okay to have SOME control flow within your test cases, but in general, understand that your unit tests will work out best if they are disjoint, that is, each tests for a different thing. The reason this works out well is that when a test case fails, you can see precisely what the failure is, as opposed to needing to dig deeper inside a larger test case to see what went wrong. The usual metric is a single assertion per unit test case. That said, there are exceptions to every rule, and this is certainly one of them; there's nothing necessarily wrong with having a couple of assertions in your test case, particularly when the setup/teardown of the test scenario is particularly difficult. However, the real code smell you want to avoid is the situation where you have one "test" which tests all the code paths.
I'm in the process of testing a factory class.
One of the methods must load the data into an array for an object that another method will instantiate.
That method contains the SQL query that holds a critical condition that must be tested (in this case, only asking for the records that are "published", e.g. WHERE published=1). That distinction in the SQL query is the only detail that makes this method differ from another one, and I want to test the query execution behaviour.
Now, I can't really mock my PDO object and ask it to return a fixed result, as I would not be testing the execution of the query by MySQL. That would make it a useless test.
That leads me to think that I'll need to set up a static database with fixed test data inside it. Am I right on this, or have I missed something?
Should I separate the tests requiring the "test database" from the tests that are autonomous?
I strongly agree on the part about not mocking out the PDO. At some point I want to make sure my queries work against a real database, even if that might technically not be a unit test anymore. For me it offers so much more to know that my code that handles data storage really works against the DB.
What I tend to do is create a sort of data access class for each class that needs to talk to the DB, thereby separating most of the business logic from the DB access code.
That way I can mock out the data access when testing the classes, and after that set up a "test database" for each "data access class" and see if those work.
#zerkms's answer (+1) already linked http://phpunit.de/manual/current/en/database.html, and the only other resource I have found to be of value when it comes to DB testing is the book Real-World Solutions for Developing High-Quality PHP Frameworks and Applications, which has a big chapter covering this subject.
Should I separate the tests requiring the "test database" from the tests that are autonomous?
Only if your test suite gets really, really big and you have runtime issues that force you to say "hitting even a testing database just takes too long for all my tests, so I run those only on a continuous integration server and not while developing".
Yes, it is common practice.
There are also special methods for loading fixtures: http://phpunit.de/manual/current/en/database.html
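A sketch of the fixture-loading style from that manual chapter (the file path and the use of SQLite are placeholders):

<?php
class PostRepositoryTest extends PHPUnit_Extensions_Database_TestCase
{
    // The connection the database test case operates on.
    public function getConnection()
    {
        $pdo = new PDO('sqlite::memory:');
        return $this->createDefaultDBConnection($pdo, ':memory:');
    }

    // The fixture: a known data set loaded before each test.
    public function getDataSet()
    {
        return $this->createFlatXMLDataSet(dirname(__FILE__) . '/fixtures/posts.xml');
    }
}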
As for the second question: no, you should not separate them. Your tests should test the behaviour, not the details of implementation. Group test cases into test classes by logic.
I get the idea behind unit testing but am trying to think of a clean, simple way to do it when it requires database functionality. For instance, I have a function that returns results based on a database select query. Would the database always have to remain the same for me to properly see that only the correct results are being returned? What is the best way to perform unit testing (in PHP) when it requires database interactivity (whether it be read, write, update, or delete)?
There is a whole chapter on that in the PHPUnit manual:
http://www.phpunit.de/manual/current/en/database.html and also worth reading
http://matthewturland.com/2010/01/04/database-testing-with-phpunit-and-mysql/
It's like with everything else when Unit-Testing. Create a known state and test your code against it to see if it returns the expected results.
Personally, I create a dummy testing database and populate it with a known data set for each testing run (I do that right in the setUp functions). Then the tests run against that data set, and then it's removed on tearDown...
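A sketch of that setUp()/tearDown() pattern (connection details and the posts table are placeholders):

<?php
class PostQueryIntegrationTest extends PHPUnit_Framework_TestCase
{
    private $db;

    protected function setUp()
    {
        // Populate the dummy testing database with a known data set.
        $this->db = new PDO('mysql:host=localhost;dbname=test_db', 'tester', 'secret');
        $this->db->exec("INSERT INTO posts (id, title) VALUES (1, 'known post')");
    }

    protected function tearDown()
    {
        // Remove the data set so the next test starts from a clean state.
        $this->db->exec('DELETE FROM posts');
        $this->db = null;
    }

    public function testSelectReturnsOnlyTheKnownRows()
    {
        $rows = $this->db->query('SELECT title FROM posts')->fetchAll(PDO::FETCH_COLUMN);
        $this->assertEquals(array('known post'), $rows);
    }
}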
Now, this is more of an integration test than a unit test (and personally I treat it differently from a unit test, running it on its own schedule along with other integration tests), but it's still quite useful.
It is not a unit test if it needs the database.