I was reading the following article and got very confused by the "Dynamic Class Instantiation" part. Specifically this code snippet:
$obj = new $className();
if (!$obj instanceof SomeBaseType) {
    throw new \InvalidTypeException();
}
I don't understand what this is actually doing for you. If it's not an instance of a base class, it errors out. How is that helpful? It doesn't get an instance of the correct type, or perform a different action based on the current type; it just throws an exception. I'm having trouble conceptualizing the purpose of this code, and the article didn't really clear it up for me.
---------Complete section from the article----------
"Dynamic Class Instantiation
Generally speaking, the following code, while legal, should be used very seldom, and only when other possible instantiation patterns have been exhausted:
$obj = new $className();
if (!$obj instanceof SomeBaseType) {
    throw new \InvalidTypeException();
}
Why is this a bad pattern? First, it makes the assumption up front that the constructor signature is free from any required parameters. While this is fine for object types that are already known to this factory, it might not always be true of a consumer's subtype of the base object in question. This pattern should never be used on objects that have dependencies, or in situations where it is conceivable that a subtype might have dependencies, because it takes away the possibility for a subtype to practice constructor injection.
Another problem is that instead of managing an object, or a list of objects, you are now managing a class name, or list of class names in addition to an object or list of objects. Instead, one could simply manage the objects.
If, on the other hand, you know this particular object type is no more than a value object (or similar), with no chance of it needing dependencies in subtypes, you can then cautiously use this instantiation pattern."
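For what it's worth, here is a hedged sketch (not from the article) of the context such a guard usually lives in: a factory that is handed class names, for example from configuration, and must make sure that whatever it returns can safely be used as a SomeBaseType. The guard doesn't convert anything; it just fails fast instead of letting an incompatible object leak into code that expects the base type. WidgetFactory and create() are illustrative names only.
class WidgetFactory
{
    public function create($className)
    {
        $obj = new $className(); // assumes a parameterless constructor
        if (!$obj instanceof SomeBaseType) {
            // Fail fast: the configured class cannot be treated as the
            // expected base type, so refuse to hand it to callers.
            throw new \InvalidTypeException();
        }
        return $obj;
    }
}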
Related
I have some pattern that works great for me, but that I have some difficulty explaining to fellow programmers. I am looking for some justification or literature reference.
I personally work with PHP, but this would also be applicable to Java, Javascript, C++, and similar languages. Examples will be in PHP or Pseudocode, I hope you can live with this.
The idea is to use a lazy evaluation container for intermediate results, to avoid computing the same intermediate value more than once.
"Dynamic programming":
http://en.wikipedia.org/wiki/Dynamic_programming
The dynamic programming approach seeks to solve each subproblem only once, thus reducing the number of computations: once the solution to a given subproblem has been computed, it is stored or "memo-ized": the next time the same solution is needed, it is simply looked up
Lazy evaluation container:
class LazyEvaluationContainer {
    protected $values = array();

    function get($key) {
        if (isset($this->values[$key])) {
            return $this->values[$key];
        }
        if (method_exists($this, $key)) {
            return $this->values[$key] = $this->$key();
        }
        throw new Exception("Key $key not supported.");
    }

    protected function foo() {
        // Make sure that bar() runs only once.
        return $this->get('bar') + $this->get('bar');
    }

    protected function bar() {
        // ... expensive computation.
    }
}
Similar containers are used e.g. as dependency injection containers (DIC).
Details
I usually use some variation of this.
Is it possible to have the actual data methods in a different object than the data computation methods?
Is it possible to have computation methods with parameters, using a cache with a nested array?
In PHP it is possible to use magic methods (__get() or __call()) for the main retrieval method. In combination with "@property" in the class docblock, this allows type hints for each "virtual" property (see the sketch after this list).
I often use method names like "get_someValue()", where "someValue" is the actual key, to distinguish them from regular methods.
Is it possible to distribute the data computation to more than one object, to get some kind of separation of concerns?
Is it possible to pre-initialize some values?
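To make the last few points more concrete, here is a rough sketch of the magic-getter variant with an "@property" hint and a parameterized, nested-array cache. All names (LazyContainer, foo, fib, get_foo) are illustrative and not part of the actual code above.
/**
 * @property-read int $foo
 */
class LazyContainer
{
    protected $values = array();

    public function __get($key)
    {
        if (!array_key_exists($key, $this->values)) {
            $method = 'get_' . $key;
            if (!method_exists($this, $method)) {
                throw new Exception("Key $key not supported.");
            }
            $this->values[$key] = $this->$method();
        }
        return $this->values[$key];
    }

    // Parameterized computation: cache one result per argument in a nested array.
    public function fib($n)
    {
        if (!isset($this->values['fib'][$n])) {
            $this->values['fib'][$n] =
                $n < 2 ? $n : $this->fib($n - 1) + $this->fib($n - 2);
        }
        return $this->values['fib'][$n];
    }

    protected function get_foo()
    {
        return $this->fib(30); // expensive computation, runs only once
    }
}
$container->foo would then be computed on first access and served from the cache afterwards.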
EDIT: Questions
There is already a nice answer talking about a cute mechanic in Spring @Configuration classes.
To make this more useful and interesting, I extend/clarify the question a bit:
Is storing intermediate values from dynamic programming a legitimate use case for this?
What are the best practices to implement this in PHP? Is some of the stuff in "Details" bad and ugly?
If I understand you correctly, this is quite a standard procedure, although, as you rightly admit, associated with DI (or bootstrapping applications).
A concrete, canonical example would be any Spring @Configuration class with lazy bean definitions; I think it displays exactly the same behavior as you describe, although the actual code that accomplishes it is hidden from view (and generated behind the scenes). Actual Java code could be like this:
@Configuration
public class Whatever {

    @Bean @Lazy
    public OneThing createOneThing() {
        return new OneThing();
    }

    @Bean @Lazy
    public SomeOtherThing createSomeOtherThing() {
        return new SomeOtherThing();
    }

    // here the magic begins:
    @Bean @Lazy
    public SomeThirdThing getSomeThirdThing() {
        return new SomeThirdThing(this.createOneThing(), this.createOneThing(), this.createOneThing(), createSomeOtherThing());
    }
}
Each method marked with @Bean @Lazy represents one "resource" that will be created once it is needed (and the method is called), and - no matter how many times the method seems to be called - the object will only be created once (due to some magic that changes the actual code during loading). So even though createOneThing() looks like it is called several times inside getSomeThirdThing(), only one call will occur (and that only after someone calls getSomeThirdThing() or calls getBean(SomeThirdThing.class) on the ApplicationContext).
I think you cannot have a universal lazy evaluation container for everything.
Let's first discuss what you really have there. I don't think it's lazy evaluation. Lazy evaluation is defined as delaying an evaluation to the point where the value is really needed, and sharing an already evaluated value with further requests for that value.
The typical example that comes to my mind is a database connection. You'd prepare everything needed to use that connection, but only when a query actually has to be run is the connection created, and it is then shared with subsequent queries.
The typical implementation would be to pass the connection string to the constructor and store it internally. When the query method is called, it first calls a method that returns the connection handle, creating and saving that handle from the connection string if it does not exist yet. Later calls to that object reuse the existing connection.
Such a database object would qualify for lazy evaluating the database connection: It is only created when really needed, and it is then shared for every other query.
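A minimal sketch of that lazily created connection, using PDO purely for illustration:
class Database
{
    private $dsn;
    private $connection;

    public function __construct($dsn)
    {
        $this->dsn = $dsn; // only the connection string is stored up front
    }

    private function connection()
    {
        if ($this->connection === null) {
            $this->connection = new PDO($this->dsn); // created on first use
        }
        return $this->connection;
    }

    public function query($sql)
    {
        return $this->connection()->query($sql); // reuses the shared handle
    }
}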
When I look at your implementation, it does not qualify as "evaluate only if really needed"; it only stores a value once it has been created. So it really is just some sort of cache.
It also does not really solve the problem of universally only evaluating the expensive computation once globally. If you have two instances, you will run the expensive function twice. But on the other hand, NOT evaluating it twice will introduce global state - which should be considered a bad thing unless explicitly declared. Usually it would make code very hard to test properly. Personally I'd avoid that.
Is it possible to have the actual data methods in a different object than the data computation methods?
If you have a look at how the Zend Framework offers the cache pattern (\Zend\Cache\Pattern\{Callback,Class,Object}Cache), you'd see that the real working class gets a decorator wrapped around it. All the internal stuff of storing the values and reading them back is handled internally; from the outside you call your methods just like before.
The downside is that you do not have an object of the type of the original class. So if you use type hinting, you cannot pass a decorated caching object instead of the original object. The solution is to implement an interface. The original class implements it with the real functions, and then you create another class that extends the cache decorator and implements the interface as well. This object will pass the type hinting checks, but you are forced to manually implement all interface methods, which do nothing more than pass the call to the internal magic function that would otherwise intercept them.
interface Foo
{
    public function foo();
}

class FooExpensive implements Foo
{
    public function foo()
    {
        sleep(100);
        return "bar";
    }
}

class FooCached extends \Zend\Cache\Pattern\ObjectCache implements Foo
{
    public function foo()
    {
        // internally uses an instance of FooExpensive to calculate once
        $args = func_get_args();
        return $this->call(__FUNCTION__, $args);
    }
}
I have found it impossible in PHP to implement a cache without at least these two classes and one interface (but on the other hand, implementing against an interface is a good thing, it shouldn't bother you). You cannot simply use the native cache object directly.
Is it possible to have computation methods with parameters, using a cache with a nested array?
Parameters are working in the above implementation, and they are used in the internal generation of a cache key. You should probably have a look at the \Zend\Cache\Pattern\CallbackCache::generateCallbackKey method.
In PHP it is possible to use magic methods (__get() or __call()) for the main retrieval method. In combination with "@property" in the class docblock, this allows type hints for each "virtual" property.
Magic methods are evil. A documentation block should be considered outdated, as it is not real working code. While I find it acceptable to use magic getters and setters in a really easy-to-understand value object, which allows storing any value in any property just like stdClass, I do recommend being very careful with __call().
I often use method names like "get_someValue()", where "someValue" is the actual key, to distinguish from regular methods.
I would consider this a violation of PSR-1: "4.3. Methods: Method names MUST be declared in camelCase()." And is there a reason to mark these methods as something special? Are they special at all? They do return the value, don't they?
Is it possible to distribute the data computation to more than one object, to get some kind of separation of concerns?
If you cache a complex construction of objects, this is completely possible.
Is it possible to pre-initialize some values?
This should not be the concern of a cache, but of the implementation itself. What is the point in NOT executing an expensive computation, but to return a preset value? If that is a real use case (like instantly return NULL if a parameter is outside of the defined range), it must be part of the implementation itself. You should not rely on an additional layer around the object to return a value in such cases.
Is storing intermediate values from dynamic programming a legitimate use case for this?
Do you have a dynamic programming problem? There is this sentence on the Wikipedia page you linked:
There are two key attributes that a problem must have in order for dynamic programming to be applicable: optimal substructure and overlapping subproblems. If a problem can be solved by combining optimal solutions to non-overlapping subproblems, the strategy is called "divide and conquer" instead.
I think that there are already existing patterns that seem to solve the lazy evaluation part of your example: Singleton, ServiceLocator, Factory. (I'm not promoting singletons here!)
There also is the concept of "promises": Objects are returned that promise to return the real value later if asked, but as long as the value isn't needed right now, would act as the values replacement that could be passed along instead. You might want to read this blog posting: http://blog.ircmaxell.com/2013/01/promise-for-clean-code.html
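A bare-bones sketch of that idea; loadBlogPosts() is a made-up stand-in for the expensive work:
class Promise
{
    private $factory;
    private $resolved = false;
    private $value;

    public function __construct(callable $factory)
    {
        $this->factory = $factory;
    }

    public function value()
    {
        if (!$this->resolved) {
            $this->value = call_user_func($this->factory); // evaluated on first request
            $this->resolved = true;
        }
        return $this->value;
    }
}

// Pass $posts around freely; loadBlogPosts() runs at most once, on the first value() call.
$posts = new Promise(function () { return loadBlogPosts(); });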
What are the best practices to implement this in PHP? Is some of the stuff in "Details" bad and ugly?
You used an example that probably comes close to the Fibonacci example. The aspect I don't like about that example is that you use a single instance to collect all values. In a way, you are aggregating global state here - which probably is what this whole concept is about. But global state is evil, and I don't like that extra layer. And you haven't really solved the problem of parameters enough.
I also wonder why there really are two calls to bar() inside foo(). The more obvious approach would be to fetch the result once in foo() and then "add" it to itself.
All in all, I'm not too impressed until now. I cannot anticipate a real use case for such a general purpose solution on this simple level. I do like IDE auto suggest support, and I do not like duck-typing (passing an object that only simulates being compatible, but without being able to ensure the instance).
So lets say I have a class that is composed of other classes.
class HttpRequest
{
    public $session;

    public function __construct()
    {
        $this->session = new Session();
    }
    // .. the rest of the HttpRequest code
}
Now, I want to have access to the Session class through the HttpRequest class, so I'm using composition.
But does this break the OOP principles of encapsulation and data hiding, which state that all properties should be protected and accessed through setter and getter methods?
Is this wrong:
$request = new HttpRequest();
$request->session->set('id', 5);
or should I use this:
$request = new HttpRequest();
$session = $request->getSession();
$session->set('id', 5);
Encapsulation states that properties should be protected.
How to provide access to inner classes then? Is the first example wrong as far as proper OOP goes?
There are valid reasons to not allow direct access to the object:
Allows for manipulation of the object outside of the object itself. If you make the property public, any part of your code could overwrite $session on the HttpRequest class, and you'd have a tough time tracking it down. Encapsulation from a data protection standpoint is there to ensure that only the object's methods can directly alter the object.
Allows you to gracefully handle the case in which that variable is not set. If, for some reason, $session does not get set on your class, you'll immediately get a fatal error when you try to call a method on it. If you wrap it in a getter, you can check for that condition and create a new instance of the class on the fly (see the sketch after this list).
Follows true "OO" paradigms
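A minimal sketch of that second point, reusing the Session class from the question and assuming lazy creation inside the getter:
class HttpRequest
{
    private $session;

    public function getSession()
    {
        if ($this->session === null) {
            $this->session = new Session(); // created on first access
        }
        return $this->session;
    }
}

$request = new HttpRequest();
$request->getSession()->set('id', 5);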
However, in some cases I would say it is okay to do this. Particularly if you know that the property will always be set (and the only way in which it would not be set is not a supported way to use the object).
It also makes sense depending on how the property is going to be accessed. Symfony2 uses this in their Request classes. It feels natural in that case, as the "query", "post", and "request" vars are all ParameterBags (glorified arrays). However, they do expose a getter for the Session object - likely because of its use case.
In short: it really depends on how you'll be using the variable. In this particular case, I'd say it doesn't much matter.
I like your first option (the one using composition), and note that it still keeps some encapsulation. I don't know what the set() function does exactly, but I suppose it modifies some attribute through a method of the "component" object, session; that pattern is also known as "delegation".
On the other hand, if you follow encapsulation you cannot make the property public, because that would allow it to be modified by anybody. That is why you use setters and getters - or, in your code, set().
I know this is old, but I would use neither of these. Does your HttpRequest object really need to hold onto the Session object or can a Session object be passed into some functions of the HttpRequest object that need it? Is there a strong case for having HttpRequest store this object?
PHPUnit's getMock($classname, $mockmethods) creates a new object based on the given class name and lets me change/test the behavior of the methods I specified.
I'm after something different: changing the behavior of methods of an existing object - without constructing a new object.
Is that possible? If yes, how?
When thinking about the problem, I came to the conclusion that it would be possible by serializing the object and changing the serialized string so that the object becomes an instance of a new class that extends the old class and adds the mocked methods.
I'd like some code for that - or maybe there is such code already somewhere.
While it would certainly be possible to create the to-be-mocked object again, it's just too complicated to do it in my test. Thus I don't want to do that if I don't really really really have to. It's a TYPO3 TSFE instance, and setting that up once in the bootstrapping process is hard enough already.
I know this answer is rather late, but I feel for future viewers of this question there now exists a simpler solution (which has some drawbacks, but depending on your needs can be much easier to implement). Mockery has support for mocking pre-existing objects with what they call a "proxied partial mock." They say that this is for classes with final methods, but it can be used in this case (although the docs do caution that it should be a "last resort").
$existingObjectMock = \Mockery::mock($existingObject);
$existingObjectMock->shouldReceive('someAction')->andReturn('foobar');
It acts by creating a proxy object which hands off all method calls and attribute gets/sets to the existing object unless they are mocked.
It should be noted, though, that the proxy suffers from the obvious issue of failing any type checks or type hints. But this can usually be avoided, because the $existingObject can still be passed around; you only use $existingObjectMock where you need the mock capabilities.
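To illustrate, assuming a hypothetical consumer function with a type hint on the original class (typeHintedConsumer() and ExistingClass are made up for this example):
function typeHintedConsumer(ExistingClass $object) { /* ... */ }

typeHintedConsumer($existingObject);        // the real instance satisfies the type hint
$value = $existingObjectMock->someAction(); // mocked behavior: returns 'foobar'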
Not all pre-existing code can be tested. Code really needs to be designed to be testable. So, while not exactly what you're asking for, you can refactor the code so that the instantiation of the object is in a separate method, then mock that method to return what you want.
class foo {
    function new_bar($arg) { return new bar($arg); }

    function blah() {
        // ...
        $obj = $this->new_bar($arg);
        return $obj->twiddle();
    }
}
and then you can test it with
class foo_Test extends PHPUnit_Framework_TestCase {
    function test_blah() {
        $sut = $this->getMock('foo', array('new_bar', 'twiddle'));
        $sut->expects($this->once())->method('new_bar')->will($this->returnSelf());
        $sut->expects($this->once())->method('twiddle')->will($this->returnValue('yes'));
        $this->assertEquals('yes', $sut->blah());
    }
}
Let me start of by saying: Welcome to the dark side of unit testing.
Meaning: You usually don't want to do this but as you explained you have what seems to be a valid use case.
runkit
What you can do quite easily - well, it's not trivial, but it's easier than changing your application architecture - is change class behavior on the fly by using runkit:
runkit_method_rename(
    get_class($object), 'methodToMock', 'methodToMock_old'
);
runkit_method_add(
    get_class($object), 'methodToMock', '$param1, $param2', 'return 7;'
);
(See runkit_method_add() in the runkit documentation.) After the test, do a method_remove() and the rename back again, as sketched below. I don't know of any framework / component that helps you with that, but it's not that much to implement on your own in an UglyTestsBaseTest extends PHPUnit_Framework_TestCase.
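A possible tearDown() for such a base class might look like this; $this->object is an assumption for wherever the patched instance is stored:
protected function tearDown()
{
    // Remove the stubbed method and restore the original one.
    runkit_method_remove(get_class($this->object), 'methodToMock');
    runkit_method_rename(
        get_class($this->object), 'methodToMock_old', 'methodToMock'
    );
    parent::tearDown();
}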
Well...
If all you have access to is a reference to that object (as in: the $x in $x = new Foo();), I don't know of any way to say: "$x, you are now of type SomethingElse, and all other variables pointing to that object should change too."
I'm going to assume you already know about things like testing your privates, but it doesn't help you here because you don't have control over the object's life cycle.
The php test helpers extension
Note: the Test-Helper extension is superseded by https://github.com/krakjoe/uopz
What also might help you out here is Stubbing Hard-Coded Dependencies using the php-test-helpers extension, which allows you to do things like intercepting object creation.
That means while your application calls $x = new Something(); you can hack PHP to make it so that $x then contains an instance of YourSpecialCraftedSomething.
You might create that class using the PHPUnit mocking API or write it yourself.
As far as I know, those are your options. Whether it's worth going there (or just writing integration / Selenium tests for that project) is something you have to figure out on your own, as it heavily depends on your circumstances.
I understand the importance of Dependency Injection and its role in Unit testing, which is why the following issue is giving me pause:
One area where I struggle not to use the Singleton is the Identity Map/Unit of Work pattern (Which keeps tabs on Domain Object state).
//Not actual code, but it should demonstrate the point
class Monitor { //singleton construction omitted for brevity
    static $members = array(); //keeps record of all objects
    static $dirty = array();   //keeps record of all modified objects
    static $clean = array();   //keeps record of all clean objects
}

class Mapper { //queries database, maps values to object fields
    public function find($id) {
        if (isset(Monitor::$members[$id])) {
            return Monitor::$members[$id];
        }
        $values = $this->selectStmt($id);
        //field mapping process omitted for brevity
        $Object = new Object($values);
        Monitor::$members[$id] = $Object;
        return $Object;
    }
}
$User = $UserMapper->find(1);//domain object is registered in Id Map
$User->changePropertyX();//object is marked "dirty" in UoW
// at this point, I can save by passing the Domain Object back to the Mapper
$UserMapper->save($User);//object is marked clean in UoW
//but a nicer API would be something like this
$User->save();
//but if I want to do this - it has to make a call to the mapper/db somehow
$User->getBlogPosts();
//or else have to generate specific collection/object graphing methods in the mapper
$UserPosts = $UserMapper->getBlogPosts();
$User->setPosts($UserPosts);
Any advice on how you might handle this situation?
I would be loath to pass/generate instances of the mapper/database access into the Domain Object itself to satisfy DI - at the same time, avoiding that results in lots of calls within the Domain Object to external static methods.
Although I guess if I want "save" to be part of its behaviour then a facility to do so is required in its construction. Perhaps it's a problem with responsibility, the Domain Object shouldn't be burdened with saving. It's just quite a neat feature from the Active Record pattern - it would be nice to implement it in some way.
What I do, albeit maybe not the best course of action, is to have a clear naming convention for my classes, for instance: user_User is the domain object and user_mapper_User is its mapper.
In my parent domainObject class I code the logic to find its mapper.
Then you have a couple of options to delegate to it; an obvious one would be to use the __call() method in domainObject, as sketched below.
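A rough sketch of that convention and delegation, assuming nothing beyond what is described above (the mapper lookup and method names are illustrative):
abstract class domainObject
{
    protected $mapper;

    protected function mapper()
    {
        if ($this->mapper === null) {
            // user_User -> user_mapper_User
            $parts = explode('_', get_class($this));
            $class = array_shift($parts) . '_mapper_' . implode('_', $parts);
            $this->mapper = new $class();
        }
        return $this->mapper;
    }

    public function __call($method, $args)
    {
        // Delegate unknown calls (save(), getBlogPosts(), ...) to the mapper,
        // passing the domain object itself as the first argument.
        return call_user_func_array(
            array($this->mapper(), $method),
            array_merge(array($this), $args)
        );
    }
}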
I currently have a method within my class that has to call other methods, some in the same object and others from other objects.
class MyClass
{
    public function myMethod()
    {
        $var1 = $this->otherMethod1();
        $var2 = $this->otherMethod2();
        $var3 = $this->otherMethod3();

        $otherObject = new OtherClass();
        $var4 = $otherObject->someMethod();

        # some processing goes on with these 4 variables
        # then the method returns something else
        return $var5;
    }
}
I'm new to the whole TDD game, but as far as I understand, some of the key premises of more testable code are composition, loose coupling, and some strategy for Dependency Injection/Inversion of Control.
How do I go about refactoring a method into something more testable in this particular situation?
Do I pass the $this object reference to the method as a parameter, so that I can easily mock/stub the collaborating methods? Is this recommended or is it going overboard?
class MyClass
{
    public function myMethod($self, $other)
    {
        # $self == $this
        $var1 = $self->otherMethod1();
        $var2 = $self->otherMethod2();
        $var3 = $self->otherMethod3();

        $var4 = $other->someMethod();

        # ...
        return $var5;
    }
}
Also, it is obvious to me that dependencies are a pretty big deal in TDD, as one has to think about how to inject a stub/mock into the said method for tests. Do most TDDers use DI/IoC as a primary strategy for public dependencies? At which point does it become exaggerated? Can you give some pointers on how to do this efficiently?
These are some good questions... let me first say that I do not really know JS at all, but I am a unit tester and have dealt with these issues. I first want to point out that JsUnit exists if you are not using it.
I wouldn't worry too much about your method calling other methods within the same class... this is bound to happen. What worries me more is the creation of the other object, depending on how complicated it is.
For example, if you are instantiating a class that does all kinds of operations over the network, that is too heavy for a simple unit test. What you would prefer to do is mock out the dependency on that class so that you can have the object produce the result you would expect to receive from its operations on the network, without incurring the overhead of going on the network: network failures, time, etc...
Passing in the other object is a bit messy. What people typically do is have a factory method to instantiate the other object. The factory method can decide, based on whether or not you are testing (typically via a flag), whether to instantiate the real object or the mock. In fact, you may want to make the other object a member of your class and, within the constructor, call the factory, or make the decision right there whether to instantiate the mock or the real thing. Within the setup function or within your test cases you can set special conditions on the mock object so that it returns the proper value.
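A rough sketch of that factory-method idea; OtherClassMock and the $testing flag are assumptions made for illustration, not part of the original code:
class MyClass
{
    private $other;

    public function __construct($testing = false)
    {
        // The constructor asks the factory method for the collaborator.
        $this->other = $this->createOther($testing);
    }

    protected function createOther($testing)
    {
        // Decide here whether to build the real collaborator or a mock.
        return $testing ? new OtherClassMock() : new OtherClass();
    }

    public function myMethod()
    {
        // ... uses the collaborator exactly as in the question
        return $this->other->someMethod();
    }
}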
Also, just make sure you have tests for your other functions in the same class... I hope this helps!
Looks like the whole idea of this class is not quite correct. In TDD you are testing classes, not methods. If a method has its own responsibility and provides its own (separately testable) functionality, it should be moved to a separate class. Otherwise it just breaks the whole OOP encapsulation thing. In particular, it breaks the Single Responsibility Principle.
In your case, I would extract the tested method into another class and inject $var1, $var2, $var3 and $other as dependencies. $other should be mocked, as should any object the tested class depends on.
class TestMyClass extends MyTestFrameworkUnitTestBase
{
    function testMyClass()
    {
        $myClass = new MyClass();
        $myClass->setVar1('asdf');
        $myClass->setVar2(23);
        $myClass->setVar3(78);
        $otherMock = getMockForClassOther();
        $myClass->setOther($otherMock);
        $this->assertEquals('result', $myClass->myMethod());
    }
}
The basic rule I use is: if I want to test something, I should make it a class. This is not always true in PHP, but it works in 90% of cases (based on my experience).
I might be wrong, but I am under the impression that objects/classes should be black boxes to their clients, and so to their testing clients (encapsulation, I think, is the term I am looking for).
There are a few things you can do:
The best thing to do is mock, here's one such library: http://code.google.com/p/php-mock-function
It should let you mock out only the specific functions you want.
If that doesn't work, the next best thing is to provide the implementation of method2 as a method of an object within the MyClass class. I find this one of the easier methods if you can't mock methods directly:
class MyClass {
    private $method2Impl;

    function __construct($method2Impl) {
        $this->method2Impl = $method2Impl;
    }

    function method2() {
        return $this->method2Impl->call();
    }
}
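In a test you could then pass in a stub; Method2Stub is a made-up name for this example:
class Method2Stub {
    function call() { return 'stubbed result'; }
}

$myClass = new MyClass(new Method2Stub());
echo $myClass->method2(); // "stubbed result"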
Another option is to add an "under test" flag, so that the method behaves differently. I don't recommend this either - eventually you'll have differing code paths, each with their own bugs.
Another option would be to subclass and override the behaviors you need. I -really- don't suggest this since you'll end up customizing your overridden mock to the point that it'll have bugs itself :).
Finally, if you need to mock out a method because it's too complicated, that can be a good sign to move it into its own object and use composition (essentially using the method2Impl technique I mentioned above).
Possibly this is more a matter of the single responsibility principle being violated, which is feeding into the TDD issues.
That's a GOOD thing, that means TDD is exposing design flaws. Or so the story goes.
If those methods are not public, and are just you breaking your code apart into more digestible chunks, honestly, I wouldn't care.
If those methods are public, then you've got an issue. Follow the rule: "any public method of a class instance must be callable at any point." That is to say, if you're requiring some sort of ordering of method calls, then it's time to break that class up.