Dependencies between PHPUnit tests

I'm writing a PHPUnit test case for an API (so not exactly a unit test) and I'm thinking about having a test that all other tests will depend on.
The tests in the test case make API requests. Most of these requests require a user. The test in question will create the user that the other tests will use.
Would that be a horrible idea?
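For reference, PHPUnit supports declared dependencies between tests via the @depends annotation: the producing test returns a value, and the depending test receives it as an argument and is skipped if the producer fails. A minimal sketch of what the asker describes (the $this->api client and the endpoints are hypothetical):

public function testCreateUser()
{
    $user = $this->api->post('/users', array('name' => 'test'));
    $this->assertEquals(201, $user->statusCode);
    return $user; // passed to any test that declares @depends on this one
}

/**
 * @depends testCreateUser
 */
public function testUpdateUser($user)
{
    // Runs only if testCreateUser passed, and receives its return value.
    $response = $this->api->put('/users/' . $user->id, array('name' => 'renamed'));
    $this->assertEquals(200, $response->statusCode);
}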

I think that the best approach for unit tests is to eliminate the dependencies first.
You can abstract the endpoint with your own local version that returns predictable results. This way you can test that your requests are correct.
You can abstract the data providers (database, filesystem, etc.) with your own stubs that also return predictable data (username, etc.).
After that, you just test your requests and verify they are correct.
The second part is to actually test the data providers, with different tests, so you know that the correct username will be returned.
And then you can test the API connectivity, etc.
EDIT: If you have dependencies in your code, and it's difficult to abstract the providers or the endpoint web service, you may need to adjust your code so that it accepts references to those objects as parameters. Then, in your tests, you replace the objects passed in with your own stub objects. In production you pass the correct references, so you will not need to change your code for testing.
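As a sketch of that last point, here is roughly what injecting a stubbed provider could look like; the interface and class names here are made up for illustration:

interface UserProviderInterface
{
    public function getUsername($id);
}

// Test stub: returns predictable data instead of hitting a database or API.
class StubUserProvider implements UserProviderInterface
{
    public function getUsername($id)
    {
        return 'test-user';
    }
}

class ApiRequestBuilder
{
    private $users;

    // In production you pass the real provider; in tests, the stub.
    public function __construct(UserProviderInterface $users)
    {
        $this->users = $users;
    }

    public function buildUserRequest($id)
    {
        return '/api/users/' . $this->users->getUsername($id);
    }
}

// In a test:
// $builder = new ApiRequestBuilder(new StubUserProvider());
// assert that $builder->buildUserRequest(1) === '/api/users/test-user'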
I hope I have been clear. If not, ask me and I can explain better; maybe I did not understand your question well.

Related

Test Driven Development and KISS

For example, I want to design a Job entity. A Job has a status, and I need to mark a Job as done.
I wrote a markDone() method and I want to test it. So to make an assertion I need one more method - isDone(). But at the moment I don't use the isDone() method in my code.
Is it OK to write such useless methods just to please TDD?
Is it OK to write such useless methods just to please TDD?
Yes, but maybe not.
The yes part: you are allowed to shape your design in such a way that testing is easier, including creating methods specifically for observability. Such things often come in handy later when you are trying to build views to understand processes running in production.
The maybe not part: What's Job::status for? What observable behavior(s) in the system change when that status is set to done? What bugs would be filed if Job::markDone were a no-op? That's the kind of thing you really want to be testing.
For instance, it might be that you need to be able to describe the job as a JSON document, and changing the job status changes the value that appears in the JSON. Great! Test that.
job.markDone()
json = job.asJson()
assert "DONE".equals(json.getText("status))
In the domain-layer / business-logic, objects are interesting for how they use their hidden data structures to respond to queries. Commands are interesting, not because they mutate the data structure, but because the mutated data structure produces different query results.
If the method has no use outside of testing, I'd avoid writing it since there's probably another way that you can check whether or not the job is done. Is the job's status public, or do you have a method that returns the job's status? If so, I'd use that to test whether markDone() is working properly. If not, you can use reflection to check the value of a private or protected object property.
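For the reflection route, a minimal PHPUnit sketch, assuming a Job class with a private $status property and a 'done' status value:

public function testMarkDoneSetsStatus()
{
    $job = new Job();
    $job->markDone();

    $property = new ReflectionProperty('Job', 'status');
    $property->setAccessible(true); // allow reading the private property

    $this->assertEquals('done', $property->getValue($job));
}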
Can isDone() just reside in the test project and not the production code? That way, you wouldn't have useless production code just for the sake of testing.

Best Way to Test an API in PHPUNIT Without Revealing Private Credentials

I created a PHP class that consumes AskGeo's API (http://askgeo.com).
I want to test it with PHPUNIT and include those tests in the PSR package I am releasing, but I don't want to give away my Account ID and API key that I will need to test the API.
What's the best way to do this?
I was thinking I could include a config file that could be filled in with those credentials before running tests.
Is that the best way?
Good unit tests must be fast and repeatable. If you make your tests work against a remote API, you will break these rules:
An HTTP request takes too long to execute - and you never know how long; good tests run in milliseconds.
If a developer has an unstable internet connection, test results may differ - and the tests will not work without internet at all.
What you can do is separate your code into your logic and a transport layer. You should inject the transport into the logic code, so you can use a real transport (like cURL) in production and a mock transport in the testing environment. The mock transport should return answers from fixtures you provide. This will make the tests extremely fast and always produce the same result.
The transport layer can look like this:
TransportInterface with a method request($url, $params, $method, ...)
StubTransport implements TransportInterface, with __construct($fixtures) - if $url + $params + $method is found in the fixtures, return the result.
CurlTransport implements TransportInterface - the cURL implementation.
VoodooMagicTransport implements TransportInterface - gets results using some black magic.
Your logic code: a FantasticLogic class with __construct(TransportInterface $transport). Do not type-hint the implementation's class name in the parameter, only the interface.
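A sketch of that transport layer in code (the fixture-keying scheme here is an assumption; any lookup that maps a request to a canned response will do):

interface TransportInterface
{
    public function request($url, array $params = array(), $method = 'GET');
}

class StubTransport implements TransportInterface
{
    private $fixtures;

    // $fixtures maps "METHOD url?query" strings to canned responses.
    public function __construct(array $fixtures)
    {
        $this->fixtures = $fixtures;
    }

    public function request($url, array $params = array(), $method = 'GET')
    {
        $key = $method . ' ' . $url . '?' . http_build_query($params);
        if (!array_key_exists($key, $this->fixtures)) {
            throw new RuntimeException('No fixture for: ' . $key);
        }
        return $this->fixtures[$key];
    }
}

class CurlTransport implements TransportInterface
{
    public function request($url, array $params = array(), $method = 'GET')
    {
        // Real cURL implementation for production goes here.
    }
}

class FantasticLogic
{
    private $transport;

    public function __construct(TransportInterface $transport)
    {
        $this->transport = $transport; // interface type hint, not a concrete class
    }
}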
In my opinion, that is the best way if you really want to test against the live production API on each test run: require the developer using your class to enter their own credentials. However, if there isn't a proper set of credentials available, you should fall back to a stub (a static canned answer to the call).
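One hedged way to combine the two, assuming the credentials live in environment variables (the variable names here are invented): skip the live tests when no credentials are configured, so the released package never contains them.

public function setUp()
{
    $this->accountId = getenv('ASKGEO_ACCOUNT_ID');
    $this->apiKey    = getenv('ASKGEO_API_KEY');

    if ($this->accountId === false || $this->apiKey === false) {
        // No credentials available: skip live-API tests rather than fail.
        $this->markTestSkipped('AskGeo credentials not configured.');
    }
}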

Unit Testing (PHP): When to fake/mock dependencies and when not to

Is it better to fake dependencies (for example Doctrine) for unit tests, or to use the real ones?
In a unit test, you use only ONE real instance of a class, and that is the class that you want to test.
ALL dependencies of that class should be mocked, unless there is a reason not to.
Reasons not to mock would be if data objects are being used that have no dependencies itself - you can use the real object and test if it received correct data afterwards.
Another reason not to mock would be if the configuration of the mock is too complicated - in that case, you have a reason to refactor the code instead, because if mocking a class is too complicated, the API of that class might be too complicated, too.
But the general answer: You want to always mock every dependency, every time.
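In PHPUnit, the everyday form of this rule looks roughly like the following sketch (Mailer and TransportInterface are hypothetical names):

public function testNotifyDelegatesToTransport()
{
    // The only real instance in this test is the class under test.
    $transport = $this->createMock('TransportInterface');
    $transport->expects($this->once())
              ->method('send')
              ->with('hello@example.com');

    $mailer = new Mailer($transport);
    $mailer->notify('hello@example.com');
}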
I'll give you an example for the "too-complicated-so-refactor" case.
I was using a "Zend_Session_Namespace" object for internal data storage of a model. That instance got injected into the model, so mocking was not an issue.
But the internal implementation of the real "Namespace" class made me mock all the calls to __set and __get in the correct order of how they were used in the model. And that sucked. Because every time I decided to reorder the reading and writing of a value in my code, I had to change the mocking in the tests, although nothing was broken. Refactoring in the code should not lead to broken tests or force you to change them.
The refactoring added a new object that separates the "Zend_Session_Namespace" from the model. I created an object that extends "ArrayObject" and contains the "Namespace". On creation, all the values get read from the Namespace and added to the ArrayObject, and on every write, the value also gets passed on to the Namespace object.
I now had the situation that I could use a real extended ArrayObject for all my tests, which in itself only needed an unconfigured mocked instance of "Zend_Session_Namespace", because I did not need to test whether the values were correctly stored in the session when I tested the model. I only needed a data store that gets used inside the model.
To test that the session gets correctly read and written, I have tests for that ArrayObject itself.
So in the end I am using a real instance of the model, and a real instance of the data store together with a mocked instance of "Zend_Session_Namespace" which does nothing. I deliberately chose to separate "model stuff" and "session save stuff" which had been mixed into the model class before -> "single responsibility principle".
The testing really got easier that way. And I'd say that this is also a code smell: If creating and configuring the mock classes is complicated, or needs a lot of changes when you change the tested class, it is time to think about refactoring. There is something wrong there.
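A rough sketch of the shape of that refactoring (Zend_Session_Namespace is the real ZF1 class; the wrapper's details are my assumptions, not the original code):

class SessionArrayStore extends ArrayObject
{
    private $session;

    public function __construct(Zend_Session_Namespace $session)
    {
        $this->session = $session;
        // Read everything out of the session once, up front.
        $iterator = $session->getIterator();
        parent::__construct($iterator instanceof Traversable
            ? iterator_to_array($iterator)
            : array());
    }

    public function offsetSet($key, $value)
    {
        parent::offsetSet($key, $value);   // reads stay local to the ArrayObject
        $this->session->{$key} = $value;   // writes are mirrored into the session
    }
}

// Model tests can now use a real SessionArrayStore built around an
// unconfigured Zend_Session_Namespace mock, since reads never hit the session.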
Mocking should be done for a reason. Good reasons are:
You cannot easily make the depended-on component (DOC) behave as intended for your tests.
Calling the DOC causes non-deterministic behaviour (date/time, randomness, network connections).
The test setup is overly complex and/or maintenance-intensive (like a need for external files).
The original DOC brings portability problems for your test code.
Using the original DOC causes unacceptably long build/execution times.
The DOC has stability (maturity) issues that make the tests unreliable, or, worse, the DOC is not even available yet.
For example, you (typically) don't mock standard library math functions like sin or cos, because they don't have any of the problems mentioned above.
Why is it recommendable to avoid mocking where unnecessary?
For one thing, mocking increases test complexity.
Secondly, mocking makes your tests dependent on the inner workings of your code, namely on how the code interacts with the DOCs. That would be acceptable for white-box tests, where the implemented algorithm is tested, but it is not desirable for black-box tests.

How do I architect my classes for easier unit testing?

I'll admit, I haven't unit tested much... but I'd like to. With that being said, I have a very complex registration process that I'd like to optimize for easier unit testing. I'm looking for a way to structure my classes so that I can test them more easily in the future. All of this logic is contained within an MVC framework, so you can assume the controller is the root where everything gets instantiated from.
To simplify, what I'm essentially asking is how to set up a system where you can manage any number of third-party modules with CRUD updates. These third-party modules are all RESTful API driven, and response data is stored in local copies. Something like the deletion of a user account would need to trigger the deletion of all associated modules (which I refer to as providers). These providers may have a dependency on another provider, so the order of deletions/creations is important. I'm interested in which design patterns I should specifically be using to support my application.
Registration spans several classes and stores data in several db tables. Here's the order of the different providers and methods (they aren't statics, just written that way for brevity):
Provider::create('external::create-user') initiates registration at a particular step of a particular provider. The double-colon syntax in the first param indicates the class should trigger creation on providerClass::providerMethod. I had made a general assumption that Provider would be an interface with the methods create(), update(), and delete() that all other providers would implement. How this gets instantiated is likely something you need to help me with.
$user = Provider_External::createUser() creates a user on an external API, returns success, and user gets stored in my database.
$customer = Provider_Gapps_Customer::create($user) creates a customer on a third party API, returns success, and stores locally.
$subscription = Provider_Gapps_Subscription::create($customer) creates a subscription associated to the previously created customer on the third party API, returns success, and stores locally.
Provider_Gapps_Verification::get($customer, $subscription) retrieves a row from an external API. This information gets stored locally. Another call is made which I'm skipping to keep things concise.
Provider_Gapps_Verification::verify($customer, $subscription) performs an external API verification process. The result of which gets stored locally.
This is a really dumbed down sample as the actual code relies upon at least 6 external API calls and over 10 local database rows created during registration. It doesn't make sense to use dependency injection at the constructor level because I might need to instantiate 6 classes in the controller without knowing if I even need them all. What I'm looking to accomplish would be something like Provider::create('external') where I simply specify the starting step to kick off registration.
The Crux of the Problem
So as you can see, this is just one sample of a registration process. I'm building a system where I could have several hundred service providers (external API modules) that I need to sign up for, update, delete, etc. Each of these providers gets related back to a user account.
I would like to build this system in a manner where I can specify an order of operations (steps) when triggering the creation of a new provider. Put another way, allow me to specify which provider/method combination gets triggered next in the chain of events, since creation can span so many steps. Currently, I have this chain of events occurring via the subject/observer pattern. I'm looking to potentially move this code to a database table, provider_steps, where I list each step as well as its following success_step and failure_step (for rollbacks and deletes). The table would look as follows:
CREATE TABLE provider_steps (
    # the id of the parent provider row
    provider_id int(11) unsigned primary key,
    # the short, slug name of the step for use in the codebase
    step_name varchar(60),
    # the name of the method corresponding to the step
    method_name varchar(120),
    # the steps that get triggered on success of this step;
    # can be comma-delimited - multiple steps could be triggered in parallel
    triggers_success varchar(255),
    # the steps that get triggered on failure of this step;
    # can be comma-delimited - multiple steps could be triggered in parallel
    triggers_failure varchar(255),
    created_at datetime,
    updated_at datetime,
    index (provider_id, step_name)
);
There's so many decisions to make here... I know I should favor composition over inheritance and create some interfaces. I also know I'm likely going to need factories. Lastly, I have a lot of domain model shit going on here... so I likely need business domain classes. I'm just not sure how to mesh them all together without creating an utter mess in my pursuit of the holy grail.
Also, where would be the best place for the db queries to take place?
I have a model for each database table already, but I'm interested in knowing where and how to instantiate the particular model methods.
Things I've Been Reading...
Design Patterns
The Strategy Pattern
Composition over Inheritance
The Factory method pattern
The Abstract factory pattern
The Builder pattern
The Chain-of-responsibility pattern
You're already working with the pub/sub pattern, which seems appropriate. Given nothing but your comments above, I'd be considering an ordered list as a priority mechanism.
But it still doesn't smell right that each subscriber is concerned with the order of operations of its dependents for triggering success/failure. Dependencies usually seem like they belong in a tree, not a list. If you stored them in a tree (using the composite pattern) then the built-in recursion would be able to clean up each dependency by cleaning up its dependents first. That way you're no longer worried about prioritizing in which order the cleanup happens - the tree handles that automatically.
And you can use a tree for storing pub/sub subscribers almost as easily as you can use a list.
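A hedged sketch of that composite idea (the names are illustrative, not taken from the question's codebase): each node deletes its dependents before itself, so the cleanup order falls out of the recursion.

class ProviderNode
{
    private $name;
    private $dependents = array(); // child ProviderNode instances

    public function __construct($name)
    {
        $this->name = $name;
    }

    public function addDependent(ProviderNode $node)
    {
        $this->dependents[] = $node;
    }

    public function delete()
    {
        // Recurse first: dependents are cleaned up before the provider
        // they depend on, so callers never manage ordering by hand.
        foreach ($this->dependents as $dependent) {
            $dependent->delete();
        }
        $this->deleteSelf();
    }

    protected function deleteSelf()
    {
        // Call the external API / remove the local rows for $this->name.
    }
}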
Using a test-driven development approach could get you what you need, and would ensure your entire application is not only fully testable, but completely covered by tests that prove it does what you want. I'd start by describing exactly what you need to do to meet one single requirement.
One thing you know you want to do is add a provider, so a TestAddProvider() test seems appropriate. Note that it should be pretty simple at this point, and have nothing to do with a composite pattern. Once that's working, you know that a provider has a dependent. Create a TestAddProviderWithDependent() test, and see how that goes. Again, it shouldn't be complex. Next, you'd likely want to TestAddProviderWithTwoDependents(), and that's where the list would get implemented. Once that's working, you know you want the Provider to also be a Dependent, so a new test would prove the inheritance model worked. From there, you'd add enough tests to convince yourself that various combinations of adding providers and dependents worked, and tests for exception conditions, etc. Just from the tests and requirements, you'd quickly arrive at a composite pattern that meets your needs. At this point I'd actually crack open my copy of GoF to ensure I understood the consequences of choosing the composite pattern, and to make sure I didn't add an inappropriate wart.
Another known requirement is to delete providers, so create a TestDeleteProvider() test, and implement the DeleteProvider() method. You won't be far away from having the provider delete its dependents, too, so the next step might be creating a TestDeleteProviderWithADependent() test. The recursion of the composite pattern should be evident at this point, and you should only need a few more tests to convince yourself that deeply nested providers, empty leafs, wide nodes, etc., all will properly clean themselves up.
I would assume that there's a requirement for your providers to actually provide their services. Time to test calling the providers (using mock providers for testing), and adding tests that ensure they can find their dependencies. Again, the recursion of the composite pattern should help build the list of dependencies or whatever you need to call the correct providers correctly.
You might find that providers have to be called in a specific order. At this point you might need to add prioritization to the lists at each node within the composite tree. Or maybe you have to build an entirely different structure (such as a linked list) to call them in the right order. Use the tests and approach it slowly. You might still have people concerned that you delete dependents in a particular externally prescribed order. At this point you can use your tests to prove to the doubters that you will always delete them safely, even if not in the order they were thinking.
If you've been doing it right, all your previous tests should continue to pass.
Then come the tricky questions. What if you have two providers that share a common dependency? If you delete one provider, should it delete all of its dependencies even though a different provider needs one of them? Add a test, and implement your rule. I figure I'd handle it through reference counting, but maybe you want a copy of the provider for the second instance, so you never have to worry about sharing children, and you keep things simpler that way. Or maybe it's never a problem in your domain. Another tricky question is if your providers can have circular dependencies. How do you ensure you don't end up in a self-referential loop? Write tests and figure it out.
After you've got this whole structure figured out, only then would you start thinking about the data you would use to describe this hierarchy.
That's the approach I'd consider. It may not be right for you, but that's for you to decide.
Unit Testing
With unit testing, we only want to test the code that makes up the individual unit of source code, typically a class method or function in PHP (Unit Testing Overview). This indicates that we don't want to actually test the external API in unit testing; we only want to test the code we are writing locally. If you do want to test entire workflows, you likely want to perform integration testing (Integration Testing Overview), which is a different beast.
As you specifically asked about designing for unit testing, let's assume you actually mean unit testing as opposed to integration testing, and submit that there are two reasonable ways to go about designing your Provider classes.
Stub Out
The practice of replacing an object with a test double that (optionally) returns configured return values is referred to as stubbing. You can use a stub to "replace a real component on which the SUT depends so that the test has a control point for the indirect inputs of the SUT. This allows the test to force the SUT down paths it might not otherwise execute". Reference & Examples
Mock Objects
The practice of replacing an object with a test double that verifies expectations, for instance asserting that a method has been called, is referred to as mocking.
You can use a mock object "as an observation point that is used to verify the indirect outputs of the SUT as it is exercised. Typically, the mock object also includes the functionality of a test stub in that it must return values to the SUT if it hasn't already failed the tests, but the emphasis is on the verification of the indirect outputs. Therefore, a mock object is a lot more than just a test stub plus assertions; it is used in a fundamentally different way".
Reference & Examples
Our Advice
Design your class to allow both stubbing and mocking. The PHPUnit manual has an excellent example of stubbing and mocking a web service. While this doesn't help you out of the box, it demonstrates how you would go about implementing the same for the RESTful API you are consuming.
Where is the best place for the db queries to take place?
We suggest you use an ORM and not solve this yourself. You can easily Google PHP ORMs and make your own decision based on your own needs; our advice is to use Doctrine, because we use Doctrine, it suits our needs well, and over the past few years we have come to appreciate how well the Doctrine developers know the domain. Simply put, they do it better than we could ourselves, so we are happy to let them do it for us.
If you don't really grasp why you should use an ORM, see Why should you use an ORM? and then Google the same question. If you still feel like you can roll your own ORM or otherwise handle the database access yourself better than the people dedicated to it, we would expect you to already know the answer to the question. If you feel you have a pressing need to handle it yourself, we suggest you look at the source code for a number of ORMs (see Doctrine on GitHub) and find the solution that best fits your scenario.
Thanks for asking a fun question, I appreciate it.
Every single dependency relationship within your class hierarchy must be accessible from the outside world (classes shouldn't be highly coupled). For instance, if you are instantiating class A within class B, class B must implement setter/getter methods for the class A instance it holds.
http://en.wikipedia.org/wiki/Dependency_injection
The foremost problem I can see with your code - and the one that actually hinders you from testing it - is the use of static class method calls:
Provider::create('external::create-user')
$user = Provider_External::createUser()
$customer = Provider_Gapps_Customer::create($user)
$subscription = Provider_Gapps_Subscription::create($customer)
...
It's epidemic in your code - even if you "only" outlined them as static for "brevity". Such an attitude is not brevity; it's counter-productive for testable code. Avoid these at all costs, including when asking a question about unit testing: this is known bad practice, and it is known that such code is hard to test.
After you've converted all static calls into object method invocations and used dependency injection instead of static global state to pass the objects along, you can do unit testing with PHPUnit, including making use of stub and mock objects collaborating in your (simple) tests.
So here is a TODO:
Refactor static method calls into object method invocations.
Use Dependency Injection to pass objects along.
And you will have very much improved your code. If you argue that you cannot do that, do not waste your time with unit testing; spend it maintaining your application, ship it fast, let it make some money, and burn it if it's no longer profitable. But don't waste your programming life unit-testing static global state - it's just stupid to do.
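Concretely, the first two TODO items might look like this sketch, reusing the question's class names (the surrounding wiring is an assumption; type-hinting an interface instead of the concrete class would be better still):

class RegistrationService
{
    private $external;

    // The collaborator is injected instead of being reached statically.
    public function __construct(Provider_External $external)
    {
        $this->external = $external;
    }

    public function register()
    {
        // Was: $user = Provider_External::createUser();
        return $this->external->createUser();
    }
}

// In a test, the collaborator can now be swapped for a stub or mock:
// $service = new RegistrationService($externalStub);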
Think about layering your application with defined roles and responsibilities for each layer. You may like to take inspiration from Apache Axis's message-flow subsystem. The core idea is to create a chain of handlers through which the request flows until it is processed. Such a design facilitates pluggable components which may be bundled together to create higher-order functions.
Further, you may like to read about functors/function objects, particularly Closure, Predicate, Transformer and Supplier, to create your participating components. Hope that helps.
Have you looked at the state design pattern? http://en.wikipedia.org/wiki/State_pattern
You could model all your steps as different states in a state machine, and it would look like a graph. You could store this graph in your database table/XML; also, every provider can have its own graph representing the order in which execution should happen.
So when you get into a certain state you may trigger one or more events (save user, get user). I don't know your application's specifics, but events can be re-used by other providers.
If it fails on one of the steps, then a different graph path is executed.
If you abstract it correctly, you could have a loosely coupled system that follows the order given by the graph and executes events based on state.
Then later, if you need to add some other provider, you only need to create a graph and/or some new events.
Here is an example: https://github.com/Metabor/Statemachine

Unit Testing: Specific testing & Flow of Control

I am quite new to unit testing and testing in general. I am developing with PHPUnit, but as my question is more general / a design question, the actual environment shouldn't be of too much importance.
I assume that it is good practice to write your test cases as specifically as possible. For example (the later, more specific assertions being the better ones):
assertNotEmpty($myObject); // $myObject is not empty (and therefore not null)
assertInternalType('array', $myObject); // $myObject is an array
assertGreaterThan(0, count($myObject)); // $myObject actually has entries
If that is correct, here is my question:
Is it an accepted practice to write some flow control inside a test case if the state of the object one is testing against depends on external sources (i.e. a DB) - or even in general?
Like:
if ($myObject !== null) {
    if (count($myObject) > 0) {
        // assert some business logic
    }
    else {
        // assert some different business logic
    }
}
Is this kind of flow control inside a test case admissible, or is it a "code smell" that should be circumvented? If it is OK, are there any tips or practices which should be kept in mind here?
Paul's answer addresses test method scope and assertions, but one thing your question's code implied is that you would test A if the returned object had value X but test B if it had value Y. In other words, your test would expect multiple values and test different things based on what it actually got.
In general, you will be a happier tester if each of your tests has a single, known, correct value. You achieve this by using fixed, known test data, often by setting it up inside the test itself. Here are three common paths:
Fill a database with a set of fixed data used by all tests. This will evolve as you add more tests and functionality, and it can become cumbersome to keep it up to date as things move. If you have tests that modify the data, you either need to reset the data after each test or roll back the changes.
Create a streamlined data set for each test. During setUp(), the test drops and recreates the database using its specific data set. This can make writing tests easier initially, but you still must update the data sets as the objects evolve, and it can also make running the tests take longer.
Use mocks for your data access objects when not testing those DAOs directly. This allows you to specify in the test exactly what data should be returned and when. Since you're not testing the DAO code, it's okay to mock it out. This makes the tests run quickly and means you don't need to manage data sets. You still have to manage the mock data, however, and write the mocking code. There are many mocking frameworks to choose from, including PHPUnit's own built-in one.
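A short sketch of that third path using PHPUnit's built-in mocking (UserDao, GreetingService and their methods are hypothetical names):

public function testGreetingUsesTheUsersName()
{
    $dao = $this->createMock('UserDao');
    $dao->method('findById')
        ->with(42)
        ->willReturn(array('id' => 42, 'name' => 'Ada'));

    // The service under test receives the mock instead of a real DAO,
    // so the test controls exactly what data comes back.
    $service = new GreetingService($dao);

    $this->assertEquals('Hello, Ada!', $service->greet(42));
}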
It's okay to have SOME control flow within your test cases, but in general, understand that your unit tests will work out best if they are disjoint, that is, they each test for different things. The reason this works out well is that when your test cases fail, you can see precisely what the failure is from the test case that failed, as opposed to needing to dig deeper inside a larger test case to see what went wrong. The usual metric is a single assertion per unit test case.
That said, there are exceptions to every rule, and that's certainly one of them; there's nothing necessarily wrong with having a couple of assertions in your test case, particularly when the setup/teardown of the test case scenario is particularly difficult. However, the real code smell you want to avoid is the situation where you have one "test" which tests all the code paths.
