I'm stuck with a general OOP problem, and can't find the right way to phrase my question.
I want to create a class that gives me an object which I can write to once and persist to storage, after which its properties can no longer be changed (for example, invoice information: once written to storage, it should be immutable). Not all information is available immediately; it is added during the lifecycle of the object.
What I'd like to avoid is having exceptions flying out of setters when trying to write, because it feels like you're offering a contract you don't intend to keep.
Here are some ideas I've considered so far:
1. Pass in any write-information in the constructor. The constructor throws an exception if the data is already present.
2. Create multiple classes in an inheritance tree, with each class representing the entity at some stage of its lifecycle, with appropriate setters where needed. Add a collective interface for all the read operations.
3. Silently discard any inappropriate writes.
My thoughts on these:
1. Makes the constructor highly unstable, generally a bad idea.
2. Explosion of complexity, and doesn't solve the problem completely (you can call the setter twice in a row, within the same request)
3. Easy, but same problem as with the exceptions; it's all a big deception towards your clients.
(Just FYI: I'm working in PHP5 at the moment - although I suspect this to be a generic problem)
Interesting problem. I think your best choice was #1, but I'm not sure I'd do it in the constructor. That way the client code can choose what it wants to do with the exception (suppress them, handle them, pass them up to the caller, etc...). And if you don't like exceptions, you could move the writing to a write() method that returns true if the write was successful and false otherwise.
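The write() idea can be sketched roughly as follows. This is only an illustration, not a definitive design: the class name WriteOnceRecord and the methods write(), read() and seal() are made up for the example, and it assumes that "persisted" can be modelled as a simple sealed flag.

```php
<?php
// Hypothetical write-once record: each field accepts exactly one write,
// and no writes at all once the record has been sealed.
class WriteOnceRecord
{
    private $values = array();
    private $sealed = false;

    // Returns true if the value was stored, false if the write was refused.
    public function write($field, $value)
    {
        if ($this->sealed || array_key_exists($field, $this->values)) {
            return false;
        }
        $this->values[$field] = $value;
        return true;
    }

    // Returns the stored value, or null if the field was never written.
    public function read($field)
    {
        return array_key_exists($field, $this->values) ? $this->values[$field] : null;
    }

    // Meant to be called once the record has been written to storage.
    public function seal()
    {
        $this->sealed = true;
    }
}

$invoice = new WriteOnceRecord();
$first = $invoice->write('customer', 'ACME');   // accepted
$second = $invoice->write('customer', 'Other'); // refused, value unchanged
```

The client decides what a refused write means; nothing is thrown and nothing is silently pretended.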
Is it a good idea to have logic inside __constructor?
class someClass
{
    public function __construct()
    {
        // some logic here
    }
}
So far I thought that it is fine; however, this reddit comment suggests the opposite.
As @Barry wrote, one of the reasons is related to unit testing, but that is just a side effect.
Let's take the worst case scenario: you have a "class" which only has a constructor (you have probably seen such examples). So why was it even written as a class? You cannot alter its state, you cannot ask it to perform any task, and you have no way to check that it did what you wanted. You could just as well have used a linear file and included it. This is just bad.
Now for a more reasonable example: let's assume you have a class which has some validation checks in the constructor and makes a new DB connection in it. It then also has some public methods for performing various tasks.
The most obvious problem is the "makes a new DB connection" - there is no way to affect or prevent this operation from outside the class. And that new connection is going off to do who-knows-what (probably loading some configuration and trying to throw exceptions). It also constitutes a hidden dependency, for which you have no indication, without inspecting the class's code.
And there is a similar problem with code that does validations and/or transformations of passed parameters. It constitutes hidden logic (and thus violates PoLA, the principle of least astonishment). It also makes your class harder to extend, because you will probably want to retain some of that validation functionality while replacing other parts. And you don't have that option, because all of that code gets run whenever you create a new instance.
Bottom line is this - logic in constructor is considered to be a "code smell". It's not a mortal sin (like using eval() on a global variable), but it's a sign of bad design.
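As a rough sketch of the alternative, the connection can be handed to the class instead of being created inside its constructor. The names Connection, FakeConnection and UserRepository are invented for this illustration and do not come from any particular library.

```php
<?php
// Hypothetical interface standing in for whatever connection type is used.
interface Connection
{
    public function query($sql);
}

// Trivial in-memory implementation that records queries instead of running them.
class FakeConnection implements Connection
{
    public $queries = array();

    public function query($sql)
    {
        $this->queries[] = $sql;
        return array();
    }
}

class UserRepository
{
    private $connection;

    // The constructor only stores what it is given; it opens nothing itself,
    // so the dependency is visible in the signature.
    public function __construct(Connection $connection)
    {
        $this->connection = $connection;
    }

    public function findAll()
    {
        return $this->connection->query('SELECT * FROM users');
    }
}

$connection = new FakeConnection();
$repository = new UserRepository($connection);
$rows = $repository->findAll();
```

With this shape the caller controls which connection is used and when it is created.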
No, it isn't a good idea for automated testing. When testing you want to be able to "mock" objects, which lets you control the logic, especially in terms of interfaces. If you place logic in the constructor, the class is very hard to test because you must use the real object.
Here is a fantastic talk with much more detail on why not to put logic in the constructor (Google tech talk by Misko Hevery):
https://www.youtube.com/watch?v=RlfLCWKxHJ0
I think this question is a little unclear, because I don't think __construct is a bad place for logic; the question is what kind of logic you have there. Some kinds of logic can be placed in the constructor, while others must not be. For example, the Symfony Response constructor contains logic, but that logic is necessary for the object, and the constructor doesn't perform any implicit actions. It doesn't print content to output or anything like that, so (to me) it is a good example.
It is also important to understand what your object must do; if it is to be an immutable object, the constructor can look a little different.
It is also important to follow SOLID and the appropriate design patterns.
Is it better to fake dependencies (for example Doctrine) for unit tests or to use the real ones?
In a unit test, you use only ONE real instance of a class, and that is the class that you want to test.
ALL dependencies of that class should be mocked, unless there is a reason not to.
One reason not to mock would be that the class uses data objects that have no dependencies themselves: you can use the real object and check afterwards that it received the correct data.
Another reason not to mock would be if the configuration of the mock is too complicated - in that case, you have a reason to refactor the code instead, because if mocking a class is too complicated, the API of that class might be too complicated, too.
But the general answer: You want to always mock every dependency, every time.
I'll give you an example for the "too-complicated-so-refactor" case.
I was using a "Zend_Session_Namespace" object for internal data storage of a model. That instance got injected into the model, so mocking was not an issue.
But the internal implementation of the real "Namespace" class made me mock all the calls to __set and __get in the correct order of how they were used in the model. And that sucked. Because every time I decided to reorder the reading and writing of a value in my code, I had to change the mocking in the tests, although nothing was broken. Refactoring in the code should not lead to broken tests or force you to change them.
The refactoring added a new object that separates the "Zend_Session_Namespace" from the model. I created an object that extends "ArrayObject" and contains the "Namespace". On creation, all the values got read from the Namespace and added to the ArrayObject, and on every write, the value also gets passed to the Namespace object as well.
I now had the situation that I could use a real extended ArrayObject for all my tests, which in itself only needed an unconfigured mocked instance of "Zend_Session_Namespace", because I did not need to test whether the values were correctly stored in the session when I tested the model. I only needed a data store that gets used inside the model.
To test that the session gets correctly read and written, I have tests for that ArrayObject itself.
So in the end I am using a real instance of the model, and a real instance of the data store together with a mocked instance of "Zend_Session_Namespace" which does nothing. I deliberately chose to separate "model stuff" and "session save stuff" which had been mixed into the model class before -> "single responsibility principle".
The testing really got easier that way. And I'd say that this is also a code smell: If creating and configuring the mock classes is complicated, or needs a lot of changes when you change the tested class, it is time to think about refactoring. There is something wrong there.
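The refactoring described above might look roughly like this. It is a sketch under assumptions: FakeNamespace is a made-up stand-in so the example runs without Zend Framework, its all() method is hypothetical, and SessionBackedStore is an invented name for the ArrayObject subclass.

```php
<?php
// Hypothetical stand-in for the session namespace, so the sketch runs on its own.
class FakeNamespace
{
    private $data;

    public function __construct(array $data = array())
    {
        $this->data = $data;
    }

    public function __set($name, $value)
    {
        $this->data[$name] = $value;
    }

    public function __get($name)
    {
        return isset($this->data[$name]) ? $this->data[$name] : null;
    }

    // Hypothetical helper for reading every stored value at once.
    public function all()
    {
        return $this->data;
    }
}

class SessionBackedStore extends ArrayObject
{
    private $namespace;

    public function __construct($namespace)
    {
        // On creation, copy the existing values into the ArrayObject.
        parent::__construct($namespace->all());
        $this->namespace = $namespace;
    }

    // Every write is stored locally and passed on to the namespace as well.
    // The attribute line below is a plain comment before PHP 8 and silences
    // the return type deprecation notice on PHP 8.1 and later.
    #[\ReturnTypeWillChange]
    public function offsetSet($key, $value)
    {
        parent::offsetSet($key, $value);
        if ($key !== null) {
            $this->namespace->$key = $value;
        }
    }
}

$namespace = new FakeNamespace(array('user' => 'alice'));
$store = new SessionBackedStore($namespace);
$store['cart'] = 3;
```

The model only ever sees the array-like store, so its tests no longer depend on the order of reads and writes against the namespace.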
Mocking should be done for a reason. Good reasons are:
You cannot easily make the depended-on component (DOC) behave as intended for your tests.
Calling the DOC causes non-deterministic behaviour (date/time, randomness, network connections).
The test setup is overly complex and/or maintenance intensive (for example, it needs external files).
The original DOC brings portability problems for your test code.
Using the original DOC causes unacceptably long build or execution times.
The DOC has stability (maturity) issues that make the tests unreliable or, worse, the DOC is not even available yet.
For example, you (typically) don't mock standard library math functions like sin or cos, because they don't have any of the abovementioned problems.
Why is it recommendable to avoid mocking where unnecessary?
For one thing, mocking increases test complexity.
Secondly, mocking makes your tests dependent on the inner workings of your code, namely on how the code interacts with the DOCs. That would be acceptable for white box tests, where the implemented algorithm is tested, but it is not desirable for black box tests.
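To illustrate the non-determinism case, here is a small sketch in which the current time is the DOC. The names Clock, SystemClock, FixedClock and TrialPeriod are hypothetical and exist only for this example.

```php
<?php
// Hypothetical abstraction over "what time is it now?".
interface Clock
{
    public function now();
}

// Production implementation: non-deterministic by nature.
class SystemClock implements Clock
{
    public function now()
    {
        return time();
    }
}

// Test double: always returns the timestamp it was given.
class FixedClock implements Clock
{
    private $timestamp;

    public function __construct($timestamp)
    {
        $this->timestamp = $timestamp;
    }

    public function now()
    {
        return $this->timestamp;
    }
}

class TrialPeriod
{
    private $clock;
    private $endsAt;

    public function __construct(Clock $clock, $endsAt)
    {
        $this->clock = $clock;
        $this->endsAt = $endsAt;
    }

    public function hasExpired()
    {
        return $this->clock->now() >= $this->endsAt;
    }
}

$running = new TrialPeriod(new FixedClock(1000), 2000);
$expired = new TrialPeriod(new FixedClock(3000), 2000);
```

Replacing the clock is justified here because the real one cannot be made to behave as the test needs; a pure calculation would not need a double at all.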
I'll admit, I haven't unit tested much... but I'd like to. With that being said, I have a very complex registration process that I'd like to optimize for easier unit testing. I'm looking for a way to structure my classes so that I can test them more easily in the future. All of this logic is contained within an MVC framework, so you can assume the controller is the root where everything gets instantiated from.
To simplify, what I'm essentially asking is how to set up a system where you can manage any number of third party modules with CRUD updates. These third party modules are all RESTful API driven and response data is stored in local copies. Something like the deletion of a user account would need to trigger the deletion of all associated modules (which I refer to as providers). These providers may have a dependency on another provider, so the order of deletions/creations is important. I'm interested in which design patterns I should specifically be using to support my application.
Registration spans several classes and stores data in several db tables. Here's the order of the different providers and methods (they aren't statics, just written that way for brevity):
Provider::create('external::create-user') initiates registration at a particular step of a particular provider. The double colon syntax in the first param indicates the class should trigger creation on providerClass::providerMethod. I had made a general assumption that Provider would be an interface with the methods create(), update() and delete() that all other providers would implement. How this gets instantiated is likely something you need to help me with.
$user = Provider_External::createUser() creates a user on an external API, returns success, and user gets stored in my database.
$customer = Provider_Gapps_Customer::create($user) creates a customer on a third party API, returns success, and stores locally.
$subscription = Provider_Gapps_Subscription::create($customer) creates a subscription associated to the previously created customer on the third party API, returns success, and stores locally.
Provider_Gapps_Verification::get($customer, $subscription) retrieves a row from an external API. This information gets stored locally. Another call is made which I'm skipping to keep things concise.
Provider_Gapps_Verification::verify($customer, $subscription) performs an external API verification process. The result of which gets stored locally.
This is a really dumbed down sample as the actual code relies upon at least 6 external API calls and over 10 local database rows created during registration. It doesn't make sense to use dependency injection at the constructor level because I might need to instantiate 6 classes in the controller without knowing if I even need them all. What I'm looking to accomplish would be something like Provider::create('external') where I simply specify the starting step to kick off registration.
The Crux of the Problem
So as you can see, this is just one sample of a registration process. I'm building a system where I could have several hundred service providers (external API modules) that I need to sign up for, update, delete, etc. Each of these providers gets related back to a user account.
I would like to build this system in a manner where I can specify an order of operations (steps) when triggering the creation of a new provider. Put another way, allow me to specify which provider/method combination gets triggered next in the chain of events, since creation can span so many steps. Currently, I have this chain of events occurring via the subject/observer pattern. I'm looking to potentially move this code to a database table, provider_steps, where I list each step as well as its following success_step and failure_step (for rollbacks and deletes). The table would look as follows:
CREATE TABLE provider_steps (
    # the id of the parent provider row
    provider_id int(11) unsigned primary key,
    # the short, slug name of the step for using in codebase
    step_name varchar(60),
    # the name of the method correlating to the step
    method_name varchar(120),
    # the steps that get triggered on success of this step
    # can be comma delimited; multiple steps could be triggered in parallel
    triggers_success varchar(255),
    # the steps that get triggered on failure of this step
    # can be comma delimited; multiple steps could be triggered in parallel
    triggers_failure varchar(255),
    created_at datetime,
    updated_at datetime,
    index (provider_id, step_name)
);
There are so many decisions to make here... I know I should favor composition over inheritance and create some interfaces. I also know I'm likely going to need factories. Lastly, I have a lot of domain model shit going on here... so I likely need business domain classes. I'm just not sure how to mesh them all together without creating an utter mess in my pursuit of the holy grail.
Also, where would be the best place for the db queries to take place?
I have a model for each database table already, but I'm interested in knowing where and how to instantiate the particular model methods.
Things I've Been Reading...
Design Patterns
The Strategy Pattern
Composition over Inheritance
The Factory method pattern
The Abstract factory pattern
The Builder pattern
The Chain-of-responsibility pattern
You're already working with the pub/sub pattern, which seems appropriate. Given nothing but your comments above, I'd be considering an ordered list as a priority mechanism.
But it still doesn't smell right that each subscriber is concerned with the order of operations of its dependents for triggering success/failure. Dependencies usually seem like they belong in a tree, not a list. If you stored them in a tree (using the composite pattern) then the built-in recursion would be able to clean up each dependency by cleaning up its dependents first. That way you're no longer worried about prioritizing in which order the cleanup happens - the tree handles that automatically.
And you can use a tree for storing pub/sub subscribers almost as easily as you can use a list.
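A minimal sketch of that idea, assuming each provider simply knows its own dependents. ProviderNode and its methods are invented names, and the $log array only exists to make the deletion order visible.

```php
<?php
// Hypothetical composite node: a provider that may have dependent providers.
class ProviderNode
{
    private $name;
    private $dependents = array();

    public function __construct($name)
    {
        $this->name = $name;
    }

    public function addDependent(ProviderNode $dependent)
    {
        $this->dependents[] = $dependent;
        return $this;
    }

    // Deletes dependents first (depth-first), then this node.
    // $log collects names in the order they were deleted.
    public function delete(array &$log)
    {
        foreach ($this->dependents as $dependent) {
            $dependent->delete($log);
        }
        $this->dependents = array();
        $log[] = $this->name;
    }
}

$user = new ProviderNode('user');
$customer = new ProviderNode('customer');
$customer->addDependent(new ProviderNode('subscription'));
$user->addDependent($customer);

$deleted = array();
$user->delete($deleted);
```

No node needs to know the global order; the recursion guarantees that a provider is removed only after everything that depends on it.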
Using a test-driven development approach could get you what you need, and would ensure your entire application is not only fully testable, but completely covered by tests that prove it does what you want. I'd start by describing exactly what you need to do to meet one single requirement.
One thing you know you want to do is add a provider, so a TestAddProvider() test seems appropriate. Note that it should be pretty simple at this point, and have nothing to do with a composite pattern. Once that's working, you know that a provider has a dependent. Create a TestAddProviderWithDependent() test, and see how that goes. Again, it shouldn't be complex. Next, you'd likely want to TestAddProviderWithTwoDependents(), and that's where the list would get implemented. Once that's working, you know you want the Provider to also be a Dependent, so a new test would prove the inheritance model worked. From there, you'd add enough tests to convince yourself that various combinations of adding providers and dependents worked, and tests for exception conditions, etc. Just from the tests and requirements, you'd quickly arrive at a composite pattern that meets your needs. At this point I'd actually crack open my copy of GoF to ensure I understood the consequences of choosing the composite pattern, and to make sure I didn't add an inappropriate wart.
Another known requirement is to delete providers, so create a TestDeleteProvider() test, and implement the DeleteProvider() method. You won't be far away from having the provider delete its dependents, too, so the next step might be creating a TestDeleteProviderWithADependent() test. The recursion of the composite pattern should be evident at this point, and you should only need a few more tests to convince yourself that deeply nested providers, empty leaves, wide nodes, etc., all will properly clean themselves up.
I would assume that there's a requirement for your providers to actually provide their services. Time to test calling the providers (using mock providers for testing), and adding tests that ensure they can find their dependencies. Again, the recursion of the composite pattern should help build the list of dependencies or whatever you need to call the correct providers correctly.
You might find that providers have to be called in a specific order. At this point you might need to add prioritization to the lists at each node within the composite tree. Or maybe you have to build an entirely different structure (such as a linked list) to call them in the right order. Use the tests and approach it slowly. You might still have people concerned that you delete dependents in a particular externally prescribed order. At this point you can use your tests to prove to the doubters that you will always delete them safely, even if not in the order they were thinking.
If you've been doing it right, all your previous tests should continue to pass.
Then come the tricky questions. What if you have two providers that share a common dependency? If you delete one provider, should it delete all of its dependencies even though a different provider needs one of them? Add a test, and implement your rule. I figure I'd handle it through reference counting, but maybe you want a copy of the provider for the second instance, so you never have to worry about sharing children, and you keep things simpler that way. Or maybe it's never a problem in your domain. Another tricky question is if your providers can have circular dependencies. How do you ensure you don't end up in a self-referential loop? Write tests and figure it out.
After you've got this whole structure figured out, only then would you start thinking about the data you would use to describe this hierarchy.
That's the approach I'd consider. It may not be right for you, but that's for you to decide.
Unit Testing
With unit testing, we only want to test the code that makes up the individual unit of source code, typically a class method or function in PHP (Unit Testing Overview). This means that we don't want to test the external API itself in unit testing; we only want to test the code we are writing locally. If you do want to test entire workflows, you are likely looking to perform integration testing (Integration Testing Overview), which is a different beast.
As you specifically asked about designing for unit testing, let's assume you actually mean unit testing as opposed to integration testing. We submit that there are two reasonable ways to go about designing your Provider classes.
Stub Out
The practice of replacing an object with a test double that (optionally) returns configured return values is referred to as stubbing. You can use a stub to "replace a real component on which the SUT depends so that the test has a control point for the indirect inputs of the SUT. This allows the test to force the SUT down paths it might not otherwise execute". Reference & Examples
Mock Objects
The practice of replacing an object with a test double that verifies expectations, for instance asserting that a method has been called, is referred to as mocking.
You can use a mock object "as an observation point that is used to verify the indirect outputs of the SUT as it is exercised. Typically, the mock object also includes the functionality of a test stub in that it must return values to the SUT if it hasn't already failed the tests but the emphasis is on the verification of the indirect outputs. Therefore, a mock object is lot more than just a test stub plus assertions; it is used a fundamentally different way".
Reference & Examples
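The difference between the two can be sketched with hand-written doubles, without any testing framework. Everything here is hypothetical: ApiClient, StubApiClient, RecordingApiClient and CustomerProvider are illustration names, and the recording double is strictly speaking a spy whose recorded calls the test inspects afterwards, which serves the same purpose as a mock's expectations.

```php
<?php
// Hypothetical interface for the external REST API.
interface ApiClient
{
    public function post($path, array $payload);
}

// Stub: returns a canned response, giving the test control over indirect input.
class StubApiClient implements ApiClient
{
    private $response;

    public function __construct(array $response)
    {
        $this->response = $response;
    }

    public function post($path, array $payload)
    {
        return $this->response;
    }
}

// Recording double: remembers each call so the test can verify indirect output.
class RecordingApiClient implements ApiClient
{
    public $calls = array();

    public function post($path, array $payload)
    {
        $this->calls[] = array($path, $payload);
        return array('id' => 1);
    }
}

// The system under test (SUT).
class CustomerProvider
{
    private $client;

    public function __construct(ApiClient $client)
    {
        $this->client = $client;
    }

    public function create($email)
    {
        $response = $this->client->post('/customers', array('email' => $email));
        return $response['id'];
    }
}

$stubbed = new CustomerProvider(new StubApiClient(array('id' => 42)));
$id = $stubbed->create('a@example.com');

$recorder = new RecordingApiClient();
$provider = new CustomerProvider($recorder);
$provider->create('b@example.com');
```

The stub test asserts on what the SUT returns; the recording test asserts on what the SUT sent to its dependency.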
Our Advice
Design your class to allow both stubbing and mocking. The PHPUnit Manual has an excellent example of Stubbing and Mocking Web Services. While this doesn't help you out of the box, it demonstrates how you would go about implementing the same for the RESTful API you are consuming.
Where is the best place for the db queries to take place?
We suggest you use an ORM rather than solving this yourself. You can easily Google PHP ORMs and make your own decision based on your own needs. Our advice is to use Doctrine, because we use it and it suits our needs well. Over the past few years we have come to appreciate how well the Doctrine developers know the domain; simply put, they do it better than we could ourselves, so we are happy to let them do it for us.
If you don't really grasp why you should use an ORM, see Why should you use an ORM? and then Google the same question. If you still feel you can roll your own ORM or otherwise handle the database access yourself better than the people dedicated to it, we would expect you to already know the answer to the question. If you feel you have a pressing need to handle it yourself, we suggest you look at the source code of a number of ORMs (see Doctrine on GitHub) and find the solution that best fits your scenario.
Thanks for asking a fun question, I appreciate it.
Every dependency relationship within your class hierarchy must be accessible from the outside world (the classes shouldn't be tightly coupled). For instance, if you are instantiating class A within class B, class B must implement setter/getter methods for the property that holds the class A instance.
http://en.wikipedia.org/wiki/Dependency_injection
The foremost problem I can see with your code, and the one that actually hinders you from testing it, is the use of static class method calls:
Provider::create('external::create-user')
$user = Provider_External::createUser()
$customer = Provider_Gapps_Customer::create($user)
$subscription = Provider_Gapps_Subscription::create($customer)
...
It's epidemic in your code, even if you "only" outlined them as static for "brevity". Such an attitude is not brevity; it's counter-productive for testable code. Avoid static calls at all costs, including when asking a question about unit testing: this is a known bad practice, and such code is known to be hard to test.
After you've converted all static calls into object method invocations and used Dependency Injection instead of static global state to pass the objects along, you can just do unit-testing with PHPUnit incl. making use of stub and mock objects collaborating in your (simple) tests.
So here is a TODO:
Refactor static method calls into object method invocations.
Use Dependency Injection to pass objects along.
And you will have very much improved your code. If you argue that you cannot do that, do not waste your time with unit testing; waste it with maintaining your application, ship it fast, let it make some money, and burn it if it's not profitable any longer. But don't waste your programming life unit testing static global state; it's just stupid to do.
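The two TODO items could be sketched like this. It is only an outline under assumptions: the Provider interface with a single create() method, the two provider classes and the Registration class are all invented for the example, and the providers return plain arrays instead of calling a real API.

```php
<?php
// Hypothetical interface; the question mentions create(), update() and delete(),
// only create() is sketched here.
interface Provider
{
    public function create($input);
}

class ExternalUserProvider implements Provider
{
    public function create($input)
    {
        // A real implementation would call the external API here.
        return array('user' => $input);
    }
}

class GappsCustomerProvider implements Provider
{
    public function create($input)
    {
        return array('customer_for' => $input['user']);
    }
}

class Registration
{
    private $users;
    private $customers;

    // Collaborators are injected as objects; nothing is reached through static calls.
    public function __construct(Provider $users, Provider $customers)
    {
        $this->users = $users;
        $this->customers = $customers;
    }

    public function register($name)
    {
        $user = $this->users->create($name);
        return $this->customers->create($user);
    }
}

$registration = new Registration(new ExternalUserProvider(), new GappsCustomerProvider());
$result = $registration->register('alice');
```

In a test, either provider can be swapped for a stub or mock that implements the same interface.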
Think about layering your application with defined roles and responsibilities for each layer. You may like to take inspiration from Apache-Axis' message flow subsystem. The core idea is to create a chain of handlers through which the request flows until it is processed. Such a design facilitates pluggable components which may be bundled together to create higher order functions.
Further you may like to read about Functors/Function Objects, particularly Closure, Predicate, Transformer and Supplier to create your participating components. Hope that helps.
Have you looked at the state design pattern? http://en.wikipedia.org/wiki/State_pattern
You could model all your steps as different states in a state machine, which would look like a graph. You could store this graph in your database table or in XML, and every provider can have its own graph representing the order in which execution should happen.
So when you get into a certain state you may trigger one or more events (save user, get user). I don't know the specifics of your application, but events can be reused by other providers.
If it fails on one of the steps, a different graph path is executed.
If you abstract it correctly, you could have a loosely coupled system which follows the order given by the graph and executes events based on state.
Then later, if you need to add another provider, you only need to create a graph and/or some new events.
Here is an example: https://github.com/Metabor/Statemachine
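Independently of that library, the graph idea can be sketched with plain arrays. The step names, the $graph layout and the runSteps() function are assumptions made for this illustration; the handlers are dummies that simply succeed or fail.

```php
<?php
// Hypothetical step graph: each step names the next step on success and on failure.
// null means "stop here".
$graph = array(
    'create-user'         => array('success' => 'create-customer',     'failure' => null),
    'create-customer'     => array('success' => 'create-subscription', 'failure' => 'delete-user'),
    'create-subscription' => array('success' => null,                  'failure' => 'delete-customer'),
    'delete-customer'     => array('success' => 'delete-user',         'failure' => null),
    'delete-user'         => array('success' => null,                  'failure' => null),
);

// Walks the graph from $start. $handlers maps step names to callables
// that return true on success and false on failure.
function runSteps(array $graph, array $handlers, $start)
{
    $visited = array();
    $step = $start;
    while ($step !== null) {
        $visited[] = $step;
        $ok = call_user_func($handlers[$step]);
        $step = $graph[$step][$ok ? 'success' : 'failure'];
    }
    return $visited;
}

$succeed = function () { return true; };
$fail = function () { return false; };

// Simulate a failure while creating the subscription.
$handlers = array(
    'create-user'         => $succeed,
    'create-customer'     => $succeed,
    'create-subscription' => $fail,
    'delete-customer'     => $succeed,
    'delete-user'         => $succeed,
);

$path = runSteps($graph, $handlers, 'create-user');
```

The failure path doubles as the rollback: the failed subscription step routes to the delete steps in the order the graph prescribes.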
In PHP 5.2.x, mySQL 5.x
I'm having a bit of an issue wrapping my head around what should and should not be an instance of a class in PHP, because objects do not persist once the page is rendered.
Say I have a list of comments. To me, it would make sense that every comment be its own object because I can call actions on them, and they hold properties. If I was doing this in another language (one that has persistent state and can be interacted with), I would do it that way.
But it seems wasteful because to do that I have a loop that is calling new() and it would probably mean that I need to access the database for each instance (also bad).
But maybe I'm missing something.
PHP just seems different in how I think about classes and objects. When should something be a class instance, and when not?
This is a subjective issue, so I'll try to gather my thoughts coherently:
Persistence in PHP has sort of a different meaning. Your thinking that each comment should be an object because comments have actions which can be performed on them seems correct. The fact that the objects won't persist across a page load isn't really relevant. It isn't uncommon in PHP to use an object in one script, which gets destroyed when the script completes, and then re-instantiate it on a subsequent page load.
Object-oriented programming provides (among other things) code organization and code reuse. Even if an object is only really used once during the execution of a script, if defining its class aids in program organization, you are right to do so.
You usually needn't worry about resource wastefulness until it starts to become a problem; if your server is constantly taxed to where it degrades your user experience or limits your expansion, then it is time to optimize.
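On the worry about one database access per instance: objects can be built from rows that a single query has already returned. A small sketch, in which the Comment class, the hydrateComments() function and the hard-coded rows are all assumptions for the example:

```php
<?php
class Comment
{
    private $author;
    private $body;

    public function __construct($author, $body)
    {
        $this->author = $author;
        $this->body = $body;
    }

    public function getAuthor()
    {
        return $this->author;
    }

    // An example of an action that belongs to the comment itself.
    public function excerpt($length)
    {
        return substr($this->body, 0, $length);
    }
}

// Builds one Comment per row; the rows come from a single query, not one per object.
function hydrateComments(array $rows)
{
    $comments = array();
    foreach ($rows as $row) {
        $comments[] = new Comment($row['author'], $row['body']);
    }
    return $comments;
}

// Rows as a database driver might return them from one SELECT; hard-coded here.
$rows = array(
    array('author' => 'ann', 'body' => 'First comment on the post'),
    array('author' => 'bob', 'body' => 'Second'),
);

$comments = hydrateComments($rows);
```

Calling new in a loop is cheap compared with the query itself, so the object-per-comment design costs little.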
Addendum:
Another reason for defining a class for your comments is that doing so could pay dividends later when you need to extend the class. Suppose you decide to implement something like a comment reply. The reply is itself just a comment, but holds some extra attributes about the comment to which it refers. You can extend the Comment class to add those attributes and additional functionality.
I started off by drafting a question: "What is the best way to perform unit testing on a constructor (e.g., __construct() in PHP5)", but when reading through the related questions, I saw several comments that seemed to suggest that setting member variables or performing any complicated operations in the constructor are no-nos.
The constructor for the class in question here takes a param, performs some operations on it (making sure it passes a sniff test, and transforming it if necessary), and then stashes it away in a member variable.
I thought the benefits of doing it this way were:
1) that client code would always be certain to have a value for this member variable whenever an object of this class is instantiated, and
2) it saves a step in client code (one of which could conceivably be missed), e.g.,
$Thing = new Thing;
$Thing->initialize($var);
when we could just do this
$Thing = new Thing($var);
and be done with it.
Is this a no-no? If so why?
My rule of thumb is that an object should be ready for use after the constructor has finished. But there are often a number of options that can be tweaked afterwards.
My list of dos and don'ts:
Constructors should set up basic options for the object.
They should maybe create instances of helper objects.
They should not acquire resources (files, sockets, ...), unless the object clearly is a wrapper around some resource.
Of course, no rules without exceptions. The important thing is that you think about your design and your choices. Make object usage natural, and that includes error reporting.
This comes up quite a lot in C++ discussions, and the general conclusion I've come to there has been this:
If an object does not acquire any external resources, members must be initialized in the constructor. This involves doing all work in the constructor.
(x, y) coordinate (or really any other structure that's just a glorified tuple)
US state abbreviation lookup table
If an object acquires resources that it can control, they may be allocated in the constructor:
open file descriptor
allocated memory
handle/pointer into an external library
If the object acquires resources that it can't entirely control, they must be allocated outside of the constructor:
TCP connection
DB connection
weak reference
There are always exceptions, but this covers most cases.
Constructors are for initializing the object, so
$Thing = new Thing($var);
is perfectly acceptable.
The job of a constructor is to establish an instance's invariants.
Anything that doesn't contribute to that is best kept out of the constructor.
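As a sketch of what "establishing invariants" could mean for the Thing class from the question, assuming (purely for illustration) that the value must be a non-empty string and is stored trimmed and lower-cased:

```php
<?php
class Thing
{
    private $value;

    public function __construct($var)
    {
        // Sniff test: reject input that would leave the object in an invalid state.
        if (!is_string($var) || trim($var) === '') {
            throw new InvalidArgumentException('Thing needs a non-empty string');
        }
        // Transformation: normalise once so every method can rely on the result.
        $this->value = strtolower(trim($var));
    }

    public function getValue()
    {
        return $this->value;
    }
}

$thing = new Thing('  Hello ');
```

Everything in this constructor contributes to the invariant "value is a normalised, non-empty string"; opening files or connections would not.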
To improve the testability of a class it is generally a good thing to keep its constructor as simple as possible and to have it ask only for things it absolutely needs. There's an excellent presentation available on YouTube as part of Google's "Clean Code Talks" series explaining this in detail.
You should definitely avoid making the client have to call
$thing->initialize($var)
That sort of stuff absolutely belongs in the constructor. It's just unfriendly to the client programmer to make them call this. There is a (slightly controversial) school of thought that says you should write classes so that objects are never in an invalid state -- and 'uninitialized' is an invalid state.
However for testability and performance reasons, sometimes it's good to defer certain initializations until later in the object's life. In cases like these, lazy evaluation is the solution.
Apologies for putting Java syntax in a PHP answer, but:
// Constructor: only stores the argument, no real work yet
public MyObject(MyType initVar) {
    this.initVar = initVar;
}

private void lazyInitialize() {
    if (initialized) {
        return;
    }
    // initialization code goes here, uses initVar
    initialized = true; // make sure the work above runs only once
}

public SomeType doSomething(SomeOtherType x) {
    lazyInitialize();
    // doing something code goes here
}
You can segment your lazy initialization so that only the parts that need it get initialized. It's common, for example, to do this in getters, just for what affects the value that's being got.
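The same idea in PHP, with the initialization done in a private getter, might look like this. The Report class, its $loads counter and the callable passed as the data source are assumptions made for the sketch.

```php
<?php
class Report
{
    private $source;
    private $rows = null; // null means "not loaded yet"
    public $loads = 0;    // counts how often the source was invoked, for illustration

    // The constructor only stores the callable; no loading happens here.
    public function __construct($source)
    {
        $this->source = $source;
    }

    // Lazy getter: loads the rows on first use and reuses them afterwards.
    private function rows()
    {
        if ($this->rows === null) {
            $this->loads++;
            $this->rows = call_user_func($this->source);
        }
        return $this->rows;
    }

    public function size()
    {
        return count($this->rows());
    }

    public function first()
    {
        $rows = $this->rows();
        return $rows[0];
    }
}

$report = new Report(function () {
    return array('a', 'b', 'c');
});
$loadsBeforeUse = $report->loads;
$size = $report->size();
$first = $report->first();
```

The object is usable straight after construction, yet the expensive part runs only if and when a method actually needs it.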
Depends on what type of system you're trying to architect, but in general I believe constructors are best used for only initializing the "state" of the object, but not perform any state transitions themselves. Best to just have it set the defaults.
I then write a "handle" method into my objects for handling things like user input, database calls, exceptions, collation, whatever. The idea is that this will handle whatever state the object finds itself in based on external forces (user input, time, etc.) Basically, all the things that may change the state of the object and require additional action are discovered and represented in the object.
Finally, I put a render method into the class to show the user something meaningful. This only represents the state of the object to the user (whatever that may be.)
__construct($arguments)
handle()
render(Exception $ex = null)
The __construct magic method is fine to use. The reason you see initialize in a lot of frameworks and applications is because that object is being programmed to an interface or it is trying to enact a singleton/getInstance pattern.
These objects are generally pulled into context or a controller and then have the common interface functionality called on them by other higher level objects.
If $var is absolutely necessary for $Thing to work, then it is a DO
You should not put things in a constructor that are only supposed to run once, since the constructor runs for every object created.
To explain:
Say I had a database class whose constructor makes the connection to the database.
So
$db = new dbclass;
And now I am connected to the database.
Then we have a class that uses some methods of the database class.
class users extends dbclass
{
// some methods
}
$users = new users;
// by doing this, we have called the dbclass's constructor again
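One way around this, sketched under assumptions, is to pass the database object in rather than inherit from it. The Database and Users classes and the static counter are invented for the illustration, and no real connection is opened.

```php
<?php
class Database
{
    // Counts constructor runs; exists only to make the effect visible.
    public static $connections = 0;

    public function __construct()
    {
        // A real class would connect here.
        self::$connections++;
    }

    public function query($sql)
    {
        return array();
    }
}

// Users holds a Database instead of extending it.
class Users
{
    private $db;

    public function __construct(Database $db)
    {
        $this->db = $db;
    }

    public function all()
    {
        return $this->db->query('SELECT * FROM users');
    }
}

$db = new Database();        // the constructor runs once
$users = new Users($db);     // reuses the existing object
$moreUsers = new Users($db); // still no second Database constructor call
$rows = $users->all();
```

With composition, the number of connections is decided by whoever creates the Database object, not by how many classes use it.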