Parameter type covariance in specializations - php

tl;dr
What strategies exist to overcome parameter type invariance for specializations, in a language (PHP) without support for generics?
Note: I wish I could say my understanding of type theory/safety/variance/etc., was more complete; I'm no CS major.
Situation
You've got an abstract class, Consumer, that you'd like to extend. Consumer declares an abstract method consume(Argument $argument) which needs a definition. Shouldn't be a problem.
Problem
Your specialized Consumer, called SpecializedConsumer has no logical business working with every type of Argument. Instead, it should accept a SpecializedArgument (and subclasses thereof). Our method signature changes to consume(SpecializedArgument $argument).
abstract class Argument { }

class SpecializedArgument extends Argument { }

abstract class Consumer {
    abstract public function consume(Argument $argument);
}

class SpecializedConsumer extends Consumer {
    public function consume(SpecializedArgument $argument) {
        // i dun goofed.
    }
}
We're breaking the Liskov substitution principle, and causing type safety problems. Poop.
Question
Ok, so this isn't going to work. However, given this situation, what patterns or strategies exist to overcome the type safety problem, and the violation of LSP, yet still maintain the type relationship of SpecializedConsumer to Consumer?
I suppose it's perfectly acceptable that an answer can be distilled down to "ya dun goofed, back to the drawing board".
Considerations, Details, & Errata
Alright, an immediate solution presents itself as "don't define the consume() method in Consumer". Ok, that makes sense, because a method declaration is only as good as its signature. Semantically, though, the absence of consume(), even with an unknown parameter list, hurts my brain a bit. Perhaps there is a better way.
From what I'm reading, few languages support parameter type covariance, and PHP, the implementation language here, is not one of them. Further complicating things, I've seen creative "solutions" involving generics, another feature not supported in PHP.
From Wiki's Variance (computer science) - Need for covariant argument types?:
This creates problems in some situations, where argument types should be covariant to model real-life requirements. Suppose you have a class representing a person. A person can see the doctor, so this class might have a method virtual void Person::see(Doctor d). Now suppose you want to make a subclass of the Person class, Child. That is, a Child is a Person. One might then like to make a subclass of Doctor, Pediatrician. If children only visit pediatricians, we would like to enforce that in the type system. However, a naive implementation fails: because a Child is a Person, Child::see(d) must take any Doctor, not just a Pediatrician.
The article goes on to say:
In this case, the visitor pattern could be used to enforce this relationship. Another way to solve the problems, in C++, is using generic programming.
Again, generics can be used creatively to solve the problem. I'm exploring the visitor pattern, as I have a half-baked implementation of it anyway, however most implementations as described in articles leverage method overloading, yet another unsupported feature in PHP.
<too-much-information>
Implementation
Due to recent discussion, I'll expand on the specific implementation details I've neglected to include (as in, I'll probably include way too much).
For brevity, I've excluded method bodies for those which are (should be) abundantly clear in their purpose. I've tried to keep this brief, but I tend to get wordy. I didn't want to dump a wall of code, so explanations follow/precede code blocks. If you have edit privileges, and want to clean this up, please do. Also, code blocks aren't copy-pasta from a project. If something doesn't make sense, it might not; yell at me for clarification.
With respect to the original question, hereafter the Rule class is the Consumer and the Adapter class is the Argument.
The tree-related classes are comprised as follows:
abstract class Rule {
    abstract public function evaluate(Adapter $adapter);
    abstract public function getAdapter(Wrapper $wrapper);
}

abstract class Node {
    protected $rules = [];
    protected $command;

    public function __construct(array $rules, $command) {
        $this->addEachRule($rules);
        $this->setCommand($command);
    }

    public function addRule(Rule $rule) { }
    public function addEachRule(array $rules) { }
    public function setCommand(Command $command) { }

    public function evaluateEachRule(Wrapper $wrapper) {
        // see below
    }

    abstract public function evaluate(Wrapper $wrapper);
}
class InnerNode extends Node {
    protected $nodes = [];

    public function __construct(array $rules, $command, array $nodes) {
        parent::__construct($rules, $command);
        $this->addEachNode($nodes);
    }

    public function addNode(Node $node) { }
    public function addEachNode(array $nodes) { }

    public function evaluateEachNode(Wrapper $wrapper) {
        // see below
    }

    public function evaluate(Wrapper $wrapper) {
        // see below
    }
}

class OuterNode extends Node {
    public function evaluate(Wrapper $wrapper) {
        // see below
    }
}
So each InnerNode contains Rule and Node objects, and each OuterNode only Rule objects. Node::evaluate() evaluates each Rule (Node::evaluateEachRule()) to a boolean true. If each Rule passes, the Node has passed: its Command is added to the Wrapper, and the Node will either descend to its children for evaluation (InnerNode::evaluateEachNode()) or simply return true, for InnerNode and OuterNode objects respectively.
As for Wrapper; the Wrapper object proxies a Request object, and has a collection of Adapter objects.
The Request object is a representation of the HTTP request.
The Adapter object is a specialized interface (and maintains specific state) for specific use with specific Rule objects. (this is where the LSP problems come in)
The Command object is an action (a neatly packaged callback, really) which is added to the Wrapper object, once all is said and done, the array of Command objects will be fired in sequence, passing the Request (among other things) in.
class Request {
    // all teh codez for HTTP stuffs
}

class Wrapper {
    protected $request;
    protected $commands = [];
    protected $adapters = [];

    public function __construct(Request $request) {
        $this->request = $request;
    }

    public function addCommand(Command $command) { }
    public function getEachCommand() { }

    public function adapt(Rule $rule) {
        $type = get_class($rule);

        return isset($this->adapters[$type])
            ? $this->adapters[$type]
            : $this->adapters[$type] = $rule->getAdapter($this);
    }

    public function commit() {
        foreach ($this->adapters as $adapter) {
            $adapter->commit($this->request);
        }
    }
}

abstract class Adapter {
    protected $wrapper;

    public function __construct(Wrapper $wrapper) {
        $this->wrapper = $wrapper;
    }

    abstract public function commit(Request $request);
}
So a given user-land Rule accepts the expected user-land Adapter. If the Adapter needs information about the request, it's routed through Wrapper, in order to preserve the integrity of the original Request.
As the Wrapper aggregates Adapter objects, it will pass existing instances to subsequent Rule objects, so that the state of an Adapter is preserved from one Rule to the next. Once an entire tree has passed, Wrapper::commit() is called, and each of the aggregated Adapter objects will apply its state as necessary against the original Request.
We are then left with an array of Command objects, and a modified Request.
What the hell is the point?
Well, I didn't want to recreate the prototypical "routing table" common in many PHP frameworks/applications, so instead I went with a "routing tree". By allowing arbitrary rules, you can quickly create and append an AuthRule (for example) to a Node, and no longer is that whole branch accessible without passing the AuthRule. In theory (in my head) it's like a magical unicorn, preventing code duplication, and enforcing zone/module organization. In practice, I'm confused and scared.
Why did I leave this wall of nonsense?
Well, this is the implementation for which I need to fix the LSP problem. Each Rule corresponds to an Adapter, and that ain't good. I want to preserve the relationship between each Rule, so as to ensure type safety when constructing the tree, etc.; however, I can't declare the key method (evaluate()) in the abstract Rule, as the signature changes for subtypes.
On another note, I'm working on sorting out the Adapter creation/management scheme; whether it is the responsibility of the Rule to create it, etc.
</too-much-information>

To properly answer this question, we must really take a step back and look at the problem you're trying to solve in a more general manner (and your question was already pretty general).
The Real Problem
The real problem is that you're trying to use inheritance to solve a problem of business logic. That's never going to work because of LSP violations and, more importantly, the tight coupling of your business logic to the application's structure.
So inheritance is out as a method to solve this problem (for the above, and the reasons you stated in the question). Fortunately, there are a number of compositional patterns that we can use.
Now, considering how generic your question is, it's going to be very hard to identify a solid solution to your problem. So let's go over a few patterns and see how they can solve this problem.
Strategy
The Strategy Pattern is the first that came to my mind when I first read the question. Basically, it separates the implementation details from the execution details. It allows for a number of different "strategies" to exist, and the caller would determine which to load for the particular problem.
The downside here is that the caller must know about the strategies in order to pick the correct one. But it also allows for a cleaner distinction between the different strategies, so it's a decent choice...
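A rough PHP sketch of that idea, reusing the Argument classes from the question; the strategy interface and class names are invented for illustration:

interface ConsumeStrategy {
    public function consume(Argument $argument);
}

class SpecializedConsumeStrategy implements ConsumeStrategy {
    public function consume(Argument $argument) {
        if (!($argument instanceof SpecializedArgument)) {
            throw new InvalidArgumentException('This strategy only handles SpecializedArgument.');
        }
        // specialized handling here
    }
}

// The caller picks the strategy that fits the problem at hand.
$strategy = new SpecializedConsumeStrategy();
$strategy->consume(new SpecializedArgument());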
Command
The Command Pattern would also decouple the implementation just like Strategy would. The main difference is that in Strategy, the caller is the one that chooses the consumer. In Command, it's someone else (a factory or dispatcher perhaps)...
Each "Specialized Consumer" would implement only the logic for a specific type of problem. Then someone else would make the appropriate choice.
Chain Of Responsibility
The next pattern that may be applicable is the Chain of Responsibility Pattern. This is similar to the strategy pattern discussed above, except that instead of the consumer deciding which is called, each one of the strategies is called in sequence until one handles the request. So, in your example, you would take the more generic argument, but check if it's the specific one. If it is, handle the request. Otherwise, let the next one give it a try...
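One possible shape for that in PHP, again with made-up names; each handler accepts the generic Argument and passes it along if it isn't the specific type it knows how to handle:

abstract class HandlerLink {
    protected $next;

    public function setNext(HandlerLink $next) {
        $this->next = $next;
        return $next;
    }

    public function handle(Argument $argument) {
        if ($this->next !== null) {
            return $this->next->handle($argument);
        }
        throw new RuntimeException('Nothing in the chain handled ' . get_class($argument));
    }
}

class SpecializedHandler extends HandlerLink {
    public function handle(Argument $argument) {
        if ($argument instanceof SpecializedArgument) {
            // do the specialized work and stop the chain here
            return true;
        }
        return parent::handle($argument); // let the next link give it a try
    }
}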
Bridge
A Bridge Pattern may be appropriate here as well. This is in some sense similar to the Strategy pattern, but it's different in that a bridge implementation would pick the strategy at construction time, instead of at run time. So then you would build a different "consumer" for each implementation, with the details composed inside as dependencies.
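For comparison, a bridge-flavoured sketch: the consumer is handed its implementation when it is built, rather than choosing per call. All names besides Argument/SpecializedArgument are invented:

interface ConsumeImplementation {
    public function handle(SpecializedArgument $argument);
}

class SpecializedConsumerBridge {
    protected $implementation;

    public function __construct(ConsumeImplementation $implementation) {
        $this->implementation = $implementation; // fixed at construction time
    }

    public function consume(SpecializedArgument $argument) {
        return $this->implementation->handle($argument);
    }
}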
Visitor Pattern
You mentioned the Visitor Pattern in your question, so I'd figure I'd mention it here. I'm not really sure it's appropriate in this context, because a visitor is really similar to a strategy pattern that's designed to traverse a structure. If you don't have a data structure to traverse, then the visitor pattern will be distilled to look fairly similar to a strategy pattern. I say fairly, because the direction of control is different, but the end relationship is pretty much the same.
Other Patterns
In the end, it really depends on the concrete problem that you're trying to solve. If you're trying to handle HTTP requests, where each "Consumer" handles a different request type (XML vs HTML vs JSON etc), the best choice will likely be very different than if you're trying to handle finding the geometric area of a polygon. Sure, you could use the same pattern for both, but they are not really the same problem.
With that said, the problem could also be solved with a Mediator Pattern (in the case where multiple "Consumers" need a chance to process data), a State Pattern (in the case where the "Consumer" will depend on past consumed data) or even an Adapter Pattern (in the case where you're abstracting a different sub-system in the specialized consumer)...
In short, it's a difficult problem to answer, because there are so many solutions that it's hard to say which is correct...

The only one known to me is a DIY strategy: accept a plain Argument in the function signature and immediately check whether it is specialized enough:
class SpecializedConsumer extends Consumer {
    public function consume(Argument $argument) {
        if (!($argument instanceof SpecializedArgument)) {
            throw new InvalidArgumentException('Argument was not specialized.');
        }

        // move on
    }
}

Related

Am I setting myself up for failure using a static method in a Laravel Controller?

I am quite new to OOP, so this is really a basic OOP question, in the context of a Laravel Controller.
I'm attempting to create a notification system that creates Notification objects when certain other objects are created, edited, deleted, etc. So, for example, if a User is edited, then I want to generate a Notification regarding this edit. Following this example, I've created a UserObserver that calls NotificationController::store() when a User is saved.
class UserObserver extends BaseObserver
{
    public function saved($user)
    {
        $data = [
            // omitted from example
        ];

        NotificationController::store($data);
    }
}
In order to make this work, I had to make NotificationController::store() static.
class NotificationController extends \BaseController {

    public static function store($data)
    {
        // validation omitted from example
        $notification = Notification::create($data);
    }
}
I'm only vaguely familiar with what static means, so there's more than likely something inherently wrong with what I'm doing here, but this seems to get the job done, more or less. However, everything that I've read indicates that static functions are generally bad practice. Is what I'm doing here "wrong," per se? How could I do this better?
I will have several other Observer classes that will need to call this same NotificationController::store(), and I want NotificationController::store() to handle any validation of $data.
I am just starting to learn about unit testing. Would what I've done here make anything difficult with regard to testing?
I've written about statics extensively here: How Not To Kill Your Testability Using Statics. The gist of it as applies to your case is as follows:
Static function calls couple code. It is not possible to substitute static function calls with anything else, or to skip those calls, for whatever reason. NotificationController::store() is essentially in the same class of things as substr(). Now, you probably wouldn't want to substitute a call to substr() with anything else; but there are a ton of reasons why you may want to substitute NotificationController, now or later.
Unit testing is just one very obvious use case where substitution is very desirable. If you want to test the UserObserver::saved() function just by itself, because it contains a complex algorithm which you need to test with all possible inputs to ensure it's working correctly, you cannot decouple that algorithm from the call to NotificationController::store(). And that function in turn probably calls some Model::save() method, which in turn wants to talk to a database. You'd need to set up this whole environment which all this other unrelated code requires (and which may or may not contain bugs of its own), to the point that it is essentially impossible to simply test this one function by itself.
If your code looked more like this:
class UserObserver extends BaseObserver
{
    public function saved($user)
    {
        $data = [
            // omitted from example
        ];

        $this->NotificationController->store($data);
    }
}
Well, $this->NotificationController is obviously a variable which can be substituted at some point. Most typically this object would be injected at the time you instantiate the class:
new UserObserver($notificationController)
You could simply inject a mock object which allows any methods to be called, but which simply does nothing. Then you could test UserObserver::saved() in isolation and ensure it's actually bug free.
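For illustration, an isolated test might then look something like this with PHPUnit's mock objects, assuming store() has become a regular instance method and UserObserver receives the controller through its constructor as shown above:

class UserObserverTest extends PHPUnit_Framework_TestCase
{
    public function testSavedStoresNotification()
    {
        // A do-nothing stand-in for the real controller.
        $controller = $this->getMockBuilder('NotificationController')
                           ->disableOriginalConstructor()
                           ->getMock();
        $controller->expects($this->once())
                   ->method('store');

        $observer = new UserObserver($controller);
        $observer->saved(new User()); // whatever input the algorithm under test needs
    }
}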
In general, using dependency injected code makes your application more flexible and allows you to take it apart. This is necessary for unit testing, but will also come in handy later in scenarios you can't even imagine right now, but will be stumped by half a year from now as you need to restructure and refactor your application for some new feature you want to implement.
Caveat: I have never written a single line of Laravel code, but as I understand it, it does support some form of dependency injection. If that's actually really the case, you should definitely use that capability. Otherwise be very aware of what parts of your code you're coupling to what other parts and how this will impact your ability to take it apart and refactor later.

Does this break open closed principle

I am attempting to create a factory. I want the client to send a code to the create method, which will be used to instantiate a class that is used to process that type of 'thing'.
The list of codes is a member of the class, since it should never change. But, to make it more testable, I have added a setter for the codeMap array.
Does this break the open/closed principle, and if so, how can I make this testable correctly?
<?php

class My_ThingFactory
{
    /**
     * @var array
     */
    private $codeMap = array(
        'A111' => 'My_Thing_ConcreteA'
    );

    public function create($code)
    {
        if (isset($this->codeMap[$code])) {
            return new $this->codeMap[$code];
        }
    }

    public function setCodeMap(array $codeMap)
    {
        $this->codeMap = $codeMap;
    }
}
The Open/Closed principle has to do with extending some code to add functionality without modifying the core behavior (i.e. by not editing the source code) of your class. Your class keeps its internals to itself and provides clear public interfaces to interact with them. From this perspective, no you have not broken open/closed principle. At least not at face value.
However, that said, I also got the impression from your question that you are wondering if having a setter for your private $codeMap array breaks the principle. It doesn't directly, but the implementation also makes it attractive to modify if another developer wants more fine tuned access to the $codeMap array. Basically, the only way to update this array on the fly is to wipe it out and reset it with setCodeMap(). You are not providing a mechanism to add a single code to the map. As soon as you find yourself needing more granular access to this map, you will also find yourself violating the open/closed principle.
Consider this: let's say another developer is using your code and the $codeMap array is 20 or 30 elements strong; they must hack your core code to provide better access to that array. Since there is no way to add a single code, they must create a new array to pass to setCodeMap() that consists of the current $codeMap array plus any additional elements they wish to add. There isn't another way (besides hardcoding the original array) to do this without opening up My_ThingFactory and adding something like:
public function getCodeMap()
{
    return $this->codeMap;
}
Then in their extended class they could do something like:
class AnotherThingFactory extends My_ThingFactory
{
    public function addCodes(array $newCodes)
    {
        $this->setCodeMap(array_merge($this->getCodeMap(), $newCodes));
    }
}
But again, this is only possible by going into your class and adding the needed functionality first, which does break the open/closed principle. You could also rectify this by simply making the $codeMap property protected; an extending class can then do what it needs to without hacking your code. The extending class also then has the onus of ensuring that it is manipulating the map correctly.
So to answer the open/closed question: if you are intending to keep $codeMap locked down by design and don't intend for it to be used in any alternate way, then you are fine. But as I said above, as soon as you need better control of the $codeMap array, you will need to violate the principle to do so. My suggestion would be to brainstorm how much management of that factory you want built in to your class, and make it part of the class's core functionality, as in the sketch below.
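One possible shape for that built-in management, if you decide it belongs in the core class; the addCode() method is invented for illustration:

class My_ThingFactory
{
    /**
     * @var array
     */
    private $codeMap = array(
        'A111' => 'My_Thing_ConcreteA'
    );

    public function create($code)
    {
        if (isset($this->codeMap[$code])) {
            return new $this->codeMap[$code];
        }
    }

    // granular management as part of the core API
    public function addCode($code, $className)
    {
        $this->codeMap[$code] = $className;
    }
}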
As for testing, I don't see why you couldn't test this code as it is. You can set your code map and then test for the corresponding instance that was returned with the create() method.
class FactoryTest extends PHPUnit_Framework_TestCase
{
    private $factory;

    public function setUp()
    {
        $this->factory = new My_ThingFactory();
    }

    public function tearDown()
    {
        $this->factory = null;
    }

    public function testMadeConcreteA()
    {
        $this->assertInstanceOf('My_Thing_ConcreteA', $this->factory->create('A111'));
    }

    public function testMadeStealthBomber()
    {
        $this->factory->setCodeMap(array('B-52' => 'StealthBomber')); // Assume the class exists.
        $this->assertInstanceOf('StealthBomber', $this->factory->create('B-52'));
    }

    public function testDidntMakeSquat()
    {
        $this->assertNotInstanceOf('My_Thing_ConcreteA', $this->factory->create('Nada'));
    }
}
The open-closed principle is not universal. You need to make an assumption about what is likely to change (the open part) and what is not (the closed part).
Since you're using a factory, the closed part is the create service (factories make this part closed). The open part is the things the factory is going to create. A factory allows extending those things later.
A small but important point is that your pattern is not a GoF factory, but rather a Simple Factory. So, it's perhaps not the strongest form of Factory for exploiting the open-closed principle. That is, if you add new stuff to create, you have to modify the class (the $codeMap array).
What stands out in your question is that you seem to contradict the principle of openness when you say:
The list of codes are a member of the class, since they should never change.
In my mind, if you're using a factory, the list of codes is expected to change.
As for your set function, it's a public method, and so by definition is closed (else you shouldn't reveal it). On the other hand, you're exposing details of the implementation (as mentioned in Crackertastic's answer). You might be concerned more about violating encapsulation with this method.
I think an easier solution (although I'm not sure about it in PHP) is to initialize your factory with a $codeMap array that's been created by another class. I think this is what Kamal Wickamanayake refers to in his comment on your question. Another solution is a service (closed) to add/delete elements (which boils down to adding new entries into your $codeMap array, but in a hidden way).
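A minimal sketch of that constructor-injection idea; the factory body repeats the question's class for context, and the wiring at the bottom is just one possibility:

class My_ThingFactory
{
    /**
     * @var array
     */
    private $codeMap;

    public function __construct(array $codeMap)
    {
        $this->codeMap = $codeMap;
    }

    public function create($code)
    {
        if (isset($this->codeMap[$code])) {
            return new $this->codeMap[$code];
        }
    }
}

// The map is assembled by another class (or configuration) and handed in once:
$factory = new My_ThingFactory(array('A111' => 'My_Thing_ConcreteA'));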

Lazy evaluation container for dynamic programming?

I have some pattern that works great for me, but that I have some difficulty explaining to fellow programmers. I am looking for some justification or literature reference.
I personally work with PHP, but this would also be applicable to Java, Javascript, C++, and similar languages. Examples will be in PHP or Pseudocode, I hope you can live with this.
The idea is to use a lazy evaluation container for intermediate results, to avoid multiple computation of the same intermediate value.
"Dynamic programming":
http://en.wikipedia.org/wiki/Dynamic_programming
The dynamic programming approach seeks to solve each subproblem only once, thus reducing the number of computations: once the solution to a given subproblem has been computed, it is stored or "memo-ized": the next time the same solution is needed, it is simply looked up
Lazy evaluation container:
class LazyEvaluationContainer {
    protected $values = array();

    function get($key) {
        if (isset($this->values[$key])) {
            return $this->values[$key];
        }
        if (method_exists($this, $key)) {
            return $this->values[$key] = $this->$key();
        }
        throw new Exception("Key $key not supported.");
    }

    protected function foo() {
        // Make sure that bar() runs only once.
        return $this->get('bar') + $this->get('bar');
    }

    protected function bar() {
        // ... expensive computation.
    }
}
Similar containers are used e.g. as dependency injection containers (DIC).
Details
I usually use some variation of this.
Is it possible to have the actual data methods in a different object than the data computation methods?
Is it possible to have computation methods with parameters, using a cache with a nested array?
In PHP it is possible to use magic methods (__get() or __call()) for the main retrieval method. In combination with "@property" in the class docblock, this allows type hints for each "virtual" property.
I often use method names like "get_someValue()", where "someValue" is the actual key, to distinguish from regular methods.
Is it possible to distribute the data computation to more than one object, to get some kind of separation of concerns?
Is it possible to pre-initialize some values?
EDIT: Questions
There is already a nice answer talking about a cute mechanism in Spring @Configuration classes.
To make this more useful and interesting, I extend/clarify the question a bit:
Is storing intermediate values from dynamic programming a legitimate use case for this?
What are the best practices to implement this in PHP? Is some of the stuff in "Details" bad and ugly?
If I understand you correctly, this is quite a standard procedure, although, as you rightly admit, associated with DI (or bootstrapping applications).
A concrete, canonical example would be any Spring @Configuration class with lazy bean definitions; I think it displays exactly the same behavior as you describe, although the actual code that accomplishes it is hidden from view (and generated behind the scenes). Actual Java code could look like this:
@Configuration
public class Whatever {

    @Bean @Lazy
    public OneThing createOneThing() {
        return new OneThing();
    }

    @Bean @Lazy
    public SomeOtherThing createSomeOtherThing() {
        return new SomeOtherThing();
    }

    // here the magic begins:
    @Bean @Lazy
    public SomeThirdThing getSomeThirdThing() {
        return new SomeThirdThing(this.createOneThing(), this.createOneThing(), this.createOneThing(), createSomeOtherThing());
    }
}
Each method marked with @Bean @Lazy represents one "resource" that will be created once it is needed (and the method is called) and, no matter how many times the method seems to be called, the object will only be created once (due to some magic that changes the actual code during loading). So even though createOneThing() appears to be called three times in getSomeThirdThing(), only one call will occur (and that's only after someone tries to call getSomeThirdThing() or calls getBean(SomeThirdThing.class) on the ApplicationContext).
I think you cannot have a universal lazy evaluation container for everything.
Let's first discuss what you really have there. I don't think it's lazy evaluation. Lazy evaluation is defined as delaying an evaluation to the point where the value is really needed, and sharing an already evaluated value with further requests for that value.
The typical example that comes to my mind is a database connection. You'd prepare everything to be able to use that connection when it is needed, but only when there really is a database query needed, the connection is created, and then shared with subsequent queries.
The typical implementation would be to pass the connection string to the constructor, store it internally, and when there is a call to the query method, first the method to return the connection handle is called, which will create and save that handle with the connection string if it does not exist. Later calls to that object will reuse the existing connection.
Such a database object would qualify for lazy evaluating the database connection: It is only created when really needed, and it is then shared for every other query.
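In PHP, that typical implementation might look roughly like this; the class name is invented and PDO merely stands in for whatever driver is actually used:

class LazyConnection
{
    protected $dsn;
    protected $handle; // created on first use, then shared

    public function __construct($dsn)
    {
        $this->dsn = $dsn; // nothing connects yet
    }

    protected function getHandle()
    {
        if ($this->handle === null) {
            $this->handle = new PDO($this->dsn); // evaluated only when really needed
        }
        return $this->handle;
    }

    public function query($sql)
    {
        return $this->getHandle()->query($sql);
    }
}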
When I look at your implementation, it would not qualify for "evaluate only if really needed", it will only store the value that was once created. So it really is only some sort of cache.
It also does not really solve the problem of universally only evaluating the expensive computation once globally. If you have two instances, you will run the expensive function twice. But on the other hand, NOT evaluating it twice will introduce global state - which should be considered a bad thing unless explicitly declared. Usually it would make code very hard to test properly. Personally I'd avoid that.
Is it possible to have the actual data methods in a different object than the data computation methods?
If you have a look at how the Zend Framework offers the cache pattern (\Zend\Cache\Pattern\{Callback,Class,Object}Cache), you'd see that the real working class gets a decorator wrapped around it. All the internal stuff of getting the values stored and reading them back is handled internally; from the outside you'd call your methods just like before.
The downside is that you do not have an object of the type of the original class. So if you use type hinting, you cannot pass a decorated caching object instead of the original object. The solution is to implement an interface. The original class implements it with the real functions, and then you create another class that extends the cache decorator and implements the interface as well. This object will pass the type hinting checks, but you are forced to manually implement all interface methods, which do nothing more than pass the call to the internal magic function that would otherwise intercept them.
interface Foo
{
    public function foo();
}

class FooExpensive implements Foo
{
    public function foo()
    {
        sleep(100);
        return "bar";
    }
}

class FooCached extends \Zend\Cache\Pattern\ObjectPattern implements Foo
{
    public function foo()
    {
        // internally uses an instance of FooExpensive to calculate once
        $args = func_get_args();
        return $this->call(__FUNCTION__, $args);
    }
}
I have found it impossible in PHP to implement a cache without at least these two classes and one interface (but on the other hand, implementing against an interface is a good thing, it shouldn't bother you). You cannot simply use the native cache object directly.
Is it possible to have computation methods with parameters, using a cache with a nested array?
Parameters are working in the above implementation, and they are used in the internal generation of a cache key. You should probably have a look at the \Zend\Cache\Pattern\CallbackCache::generateCallbackKey method.
In PHP it is possible to use magic methods (__get() or __call()) for the main retrieval method. In combination with "@property" in the class docblock, this allows type hints for each "virtual" property.
Magic methods are evil. A documentation block should be considered outdated, as it is not real working code. While I find it acceptable to use magic getters and setters in a really easy-to-understand value object, which would allow storing any value in any property just like stdClass, I do recommend being very careful with __call().
I often use method names like "get_someValue()", where "someValue" is the actual key, to distinguish from regular methods.
I would consider this a violation of PSR-1: "4.3. Methods: Method names MUST be declared in camelCase()." And is there a reason to mark these methods as something special? Are they special at all? They do return the value, don't they?
Is it possible to distribute the data computation to more than one object, to get some kind of separation of concerns?
If you cache a complex construction of objects, this is completely possible.
Is it possible to pre-initialize some values?
This should not be the concern of a cache, but of the implementation itself. What is the point in NOT executing an expensive computation, but to return a preset value? If that is a real use case (like instantly return NULL if a parameter is outside of the defined range), it must be part of the implementation itself. You should not rely on an additional layer around the object to return a value in such cases.
Is storing intermediate values from dynamic programming a legitimate use case for this?
Do you have a dynamic programming problem? There is this sentence on the Wikipedia page you linked:
There are two key attributes that a problem must have in order for dynamic programming to be applicable: optimal substructure and overlapping subproblems. If a problem can be solved by combining optimal solutions to non-overlapping subproblems, the strategy is called "divide and conquer" instead.
I think that there are already existing patterns that seem to solve the lazy evaluation part of your example: Singleton, ServiceLocator, Factory. (I'm not promoting singletons here!)
There also is the concept of "promises": Objects are returned that promise to return the real value later if asked, but as long as the value isn't needed right now, would act as the values replacement that could be passed along instead. You might want to read this blog posting: http://blog.ircmaxell.com/2013/01/promise-for-clean-code.html
What are the best practices to implement this in PHP? Is some of the stuff in "Details" bad and ugly?
You used an example that probably comes close to the Fibonacci example. The aspect I don't like about that example is that you use a single instance to collect all values. In a way, you are aggregating global state here - which probably is what this whole concept is about. But global state is evil, and I don't like that extra layer. And you haven't really solved the problem of parameters enough.
I wonder why there are really two calls to bar() inside foo()? The more obvious method would be to duplicate the result directly in foo(), and then "add" it.
All in all, I'm not too impressed until now. I cannot anticipate a real use case for such a general purpose solution on this simple level. I do like IDE auto suggest support, and I do not like duck-typing (passing an object that only simulates being compatible, but without being able to ensure the instance).

How to represent an object and map between different sources/locations

I will be building a system where a particular object will originate from a web service (SOAP based). It will then be displayed on a web page (via PHP). Under certain circumstances we'll store a copy with some additional information in a local MySQL database. And from there it will be batch processed into Salesforce CRM (again via PHP). We may also subsequently pull the object out of Salesforce for display online. So a lot going on. For the most part the object is the same, with each subsequent node in the system likely adding a couple of fields specific to it, unique IDs mainly.
I'd initially toyed with the idea of encapsulating all the necessary functionality into the one class in PHP which would deal with reading and writing from each of the appropriate sources. This felt like it was over complicating the class, and not a good approach.
I then looked at having just a container class, with no real functionality attached beyond getters and setters, and creating separate functionality outside of it to deal with the reading and writing between the different sources: simple enough code, although tedious to map between all the different field names across the different sources. There is probably a design pattern or two that apply here, but I'm not familiar with them. Any and all suggestions on how to approach this are appreciated.
What you are looking for is the Adapter pattern. You can keep your existing code until you completely change all the classes.
I'd suggest using a composite memento serializable into XML.
I think there may be several ways to handle that. @EGL 2-101's adapter idea is one way to do it.
Basically, you have several sources, which in O.O. jargon are different objects. But you want them treated as if they were a single object.
You may want to make a single class for each source and test the "connection", as if each case were the only one you were going to work with. When you have several of these classes, try to make them all share some interface, methods, or properties:
class AnyConnection
{
    public function __construct() {
        // ...
    }

    public function read() {
        // ...
    }
} // class

class SOAPObject extends AnyConnection
{
    public function __construct() {
        // ...
    }

    public function read() {
        // ...
    }
} // class

class MYSQLObject extends AnyConnection
{
    public function __construct() {
        // ...
    }

    public function read() {
        // ...
    }
} // class

class SalesObject extends AnyConnection
{
    public function __construct() {
        // ...
    }

    public function read() {
        // ...
    }
} // class
Later, use a single class to wrap all of these source classes.
class AnyObject extends AnyConnection
{
    protected $mySOAPObject;
    protected $myMYSQLObject;
    protected $mySalesObject;

    public function __construct() {
        // ...
    }

    public function read() {
        // ...
    }
} // class
Later, add the code to select which "connection" you want.
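A guess at what that selection step could look like, reusing the classes above; the $source flag and the constructor wiring are invented for illustration:

class AnyObject extends AnyConnection
{
    protected $mySOAPObject;
    protected $myMYSQLObject;
    protected $mySalesObject;

    public function __construct(SOAPObject $soap, MYSQLObject $mysql, SalesObject $sales) {
        $this->mySOAPObject  = $soap;
        $this->myMYSQLObject = $mysql;
        $this->mySalesObject = $sales;
    }

    public function read($source = 'soap') {
        // pick the "connection" to delegate to
        switch ($source) {
            case 'mysql':
                return $this->myMYSQLObject->read();
            case 'sales':
                return $this->mySalesObject->read();
            default:
                return $this->mySOAPObject->read();
        }
    }
} // class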
Why not separate data and operations?
Contain the core information in a class C. When the web service sends this class, it is wrapped in an object of some class W. The web service pulls C and sends it to the persistence layer, which creates and stores a P that internally contains C, etc.
Akin to how data flows over a TCP/IP stack...
The way I see this after thinking about it a bit would be pretty much a class to play with your object and then serialize it.
I'd probably use something like this:
<?php

class MyObject
{
    protected $_data = array();

    public function __construct($serializedObject = null) {
        if (!is_null($serializedObject)) {
            // decode to an associative array so the array access in __get() works
            $this->_data = json_decode($serializedObject, true);
        }
    }

    public function __get($key) {
        return $this->_data[$key];
    }

    /* setter and other things you need */

    public function encode() {
        return json_encode($this->_data);
    }

    public function __toString() {
        return $this->encode();
    }
}
Then just use it to pass it serialized to your different web services.
I think JSON would do a pretty good job on this one, because you can quickly and easily unserialize it in so many programming languages, and it's so much lighter than XML.
The DataMapper pattern is what you're looking for.
You can have one mapper for each storage mechanism that you use, and use them all with one object that represents the data to the business logic.
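A bare-bones sketch of that arrangement; the mapped object and mapper names are invented, and the method bodies are left as stubs:

class Customer
{
    public $name;
    public $email;
}

interface CustomerMapper
{
    public function save(Customer $customer);
    public function find($id);
}

class MysqlCustomerMapper implements CustomerMapper
{
    public function save(Customer $customer) { /* INSERT/UPDATE against MySQL */ }
    public function find($id) { /* SELECT, hydrate and return a Customer */ }
}

class SalesforceCustomerMapper implements CustomerMapper
{
    public function save(Customer $customer) { /* push to the Salesforce API */ }
    public function find($id) { /* pull from Salesforce, return a Customer */ }
}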
It seems your problem is more of an architectural/design decision than a pure implementation detail. (I haven't done PHP for a long while and do not know Salesforce, but I know other CRM systems.)
I believe the technique/pattern that will work for you is the use of a staging area. This helps especially if you have changing integration needs, and also when your source data looks different from your system model or when you have different sources to integrate from. Thus, you import into the staging area and then from staging into your system. At each place you naturally have to map (you can use metadata) and maybe transform/translate data. There will be an initial effort to build this, but once it's done the step from staging to your system stays quite static/stable.
Using meta data mapping can address flexibility concerns but adds a bit of complexity on implementation. It all depends on the skills and time you have at hand for your project.
I would not have any association between the objects at all. They are used for different purposes but look similar. Period.
In .NET we use a library called AutoMapper to copy information between different classes (like a business object and a DTO). You can build something similar in PHP, either by using get_object_vars() or the reflection API.
myCopyApi.copy($myDTO, $myBO);
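A rough PHP equivalent of such a copy helper using get_object_vars(); this is a hypothetical function, and it assumes the properties to copy are public and share names on both objects:

function copyObject($source, $target)
{
    // get_object_vars() only sees public properties from outside the object
    foreach (get_object_vars($source) as $property => $value) {
        if (property_exists($target, $property)) {
            $target->$property = $value;
        }
    }
    return $target;
}

// e.g. copyObject($myBO, $myDTO);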
Say you retrieve a Car from the webservice. You can store it in a WebserviceCar, which has a property car.
Now, if you want to store that Car in the database, put it in a DatabaseCar, which also has a property car. If you want to put it in Salesforce, put it in a SalesforceCar object, which has a property car.
This way, you have one object which has the common fields and several objects which have storage-specific information.
Assuming that you are thinking about storing the actual object (serialized, encoded, or whatever) in a field in the database: from my point of view the object is never the same in two applications, as business-wise it serves different purposes. Doing this is a kind of "cutting short" in a case where there is no room for "cutting short".
Remember that a class mainly represents a "category of objects" which all share the same properties and behaviours. So let each application use its own class as its purpose requires. What can be created, though, as others suggested and as you thought, is an Adapter or Factory which can be used in all the implied applications, as it serves the same business purpose: "translation" of objects.
Adapter pattern
Factory pattern

Organizing Classes in PHP

Suppose I've the following classes in my project:
class Is // validation class
class Math // number manipulation class
Now, if I want to validate a given number for primality where would be the logical place to insert my Prime() method? I can think of the following options:
Is_Math::Prime()
Math_Is::Prime()
I hate these ambiguities; they slow down my thinking process and often lead me into errors. Some more examples:
Is::Image() or Image::Is() ?
Is_Image::PNG() or Image_Is::PNG() ?
Is_i18n_US::ZipCode() or i18n_Is_US::ZipCode() or i18n_US_Is::ZipCode() ?
In the Image example the first choice makes more sense to me while in the i18n example I prefer the last one. Not having a standard makes me feel like the whole code base is messy.
Is there a holy grail solution for organizing classes? Maybe a different paradigm?
For the Math example, I'd put the actual functionality of checking if a number is prime in the Math class. In your Is class you would put a method that would be called when a validation needs to occur. You would then use Math::Prime() from there.
With Image, that's a type check. You probably don't need to make a method for it unless you are making sure valid image data has been uploaded.
With the PNG method, it's the same as with Math. Put the actual PNG data checker algorithm in Image and make your validator method in Is call it.
The zip code example should be in your Is class only, since it operates on a string primitive and will probably just use a regexp (read: it won't be a complex method, unlike your PNG checker, which probably will be).
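For the Math example above, that split might look roughly like this (method casing and the primality check itself are chosen arbitrarily):

class Math
{
    // The actual algorithm lives with the other math functionality.
    public static function prime($number)
    {
        if ($number < 2) {
            return false;
        }
        for ($i = 2; $i * $i <= $number; $i++) {
            if ($number % $i === 0) {
                return false;
            }
        }
        return true;
    }
}

class Is
{
    // The validator only delegates; it is called when validation needs to occur.
    public static function prime($value)
    {
        return is_int($value) && Math::prime($value);
    }
}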
If you want to respect the SRP (http://en.wikipedia.org/wiki/Single_responsibility_principle), do this little exercise:
Select your class and try to describe what it does/can do. If you have an "AND" in your description, you must move the method to another class.
See page 36: http://misko.hevery.com/attachments/Guide-Writing%20Testable%20Code.pdf
Another law (there are many more) that will help you organize your classes: the Law of Demeter (http://en.wikipedia.org/wiki/Law_of_Demeter).
To learn a lot and to help you make the right choice, I recommend Misko's blog (he's a Google evangelist): http://misko.hevery.com
Hope this helps.
I don't think it's ambiguous at all. "Is" should be first in every one of those examples, and I'll tell you why: "Is" is the superset of validation operations in which Is::Math is a member.
In the case of Is::Math, what are you doing? Are you doing math operations? Or are you validating mathematical entities? The latter, obviously, otherwise it'd just be "Math".
Which of those two operations has the greater scope? Is? Or Math? Is, obviously, because Is is conceptually applicable to many non-Math entities, whereas Math is Math specific. (Likewise in the case of Math::Factor, it wouldn't be Factor::Math, because Math is the superset in which Factor belongs.)
The whole purpose of this type of OOPing is to group things in a manner that makes sense. Validation functions, even when they apply to wildly different types of entities (Prime numbers vs. PNG images) have more similarities to each other than they do to the things they are comparing. They will return the same types of data, they are called in the same kind of situations.
Everything about handling validation in itself would fit in your Is-classes:
Did it pass?
Which parts did not pass?
Should the validation errors be logged somewhere?
Zend_Validate in Zend Framework provides such an approach; maybe you can get some inspiration from it. Since this approach would have you implementing the same interface in all validation classes, you could easily
use the same syntax for validation, independently of which data is validated
easily recognize which validation rules you have available by checking for all classes named Is_Prime, Is_Image instead of checking for Math_Is, Image_Is all over the place.
Edit:
Why not use a syntax like this:
class Math {
    public function isPrime() {
        $validation_rule = new Is_Prime();
        return (bool) $validation_rule->validates($this->getValue());
    }
}
And thereby also allow
class Problem {
    public function solveProblem(Math $math) {
        $validation_rule = new Is_Prime();

        if ($validation_rule->validates($math->getValue())) {
            return $this->handlePrime($math);
        } else {
            return $this->handleNonPrime($math);
        }
    }
}
I think there is no "The Right Answer" to the problem you stated. Some people will put Prime in Is, and some in Math. There is ambiguity. Otherwise you wouldn't be asking this question.
Now, you have to resolve the ambiguity somehow. You can think about some rules and conventions, that would say which class/method goes where. But this may be fragile, as the rules are not always obvious, and they may become very complicated, and at that point they're no longer helpful.
I'd suggest that you design the classes so that it's obvious by looking at the names where some method should go.
Don't name your validation package Is. It's such a general name that almost everything goes there. IsFile, IsImage, IsLocked, IsAvailable, IsFull - doesn't sound good, ok? There is no cohesion in that design.
It's probably better to make the validation component filter data at subsystems boundary (where you have to enforce security and business rules), nothing else.
After making that decision, your example becomes obvious. Prime belongs in Math. Is::Image is probably too general. I'd prefer Image::IsValid, because you'll probably also have other methods operating on an image (more cohesion). Otherwise "Is" becomes a bag for everything, as I said at the beginning.
I don't think "is" belongs in class names at all. I think that's for methods.
abstract class Validator {}

class Math_Validator extends Validator
{
    public static function isPrime( $number )
    {
        // whatever
    }
}

class I18N_US_Validator extends Validator
{
    public static function isZipCode( $input )
    {
        // whatever
    }
}

class Image_Validator extends Validator
{
    public static function isPng( $path )
    {
        // whatever
    }
}
Math_Validator::isPrime( 1 );
I18N_US_Validator::isZipCode( '90210' );
Image_Validator::isPng( '/path/to/image.png' );
Is there a holy grail solution for organizing classes? Maybe a different paradigm?
No, that is a basic flaw of class-based OOP. It's subjective.
Functional programming (not to be confused with procedural programming) has fewer problems with this matter, mostly because the primary building blocks are much smaller. Classless OOP also deals with it better, being a hybrid of OOP and functional programming of sorts.
Classes can be considered to be fancy types that do things, like validating themselves.
abstract class ValidatingType
{
    protected $val;

    public function __construct($val)
    {
        // static:: (late static binding) so the subclass's isValid() is the one called
        if (!static::isValid($val)) {
            // complain, perhaps by throwing an exception
            throw new Exception("No, you can't do that!");
        }
        $this->val = $val;
    }

    abstract static protected function isValid($val);
}
We extend ValidatingType to create a validating type. That obliges us to create an isValid method.
class ValidatingNumber extends ValidatingType
{
    ...

    static protected function isValid($val)
    {
        return is_numeric($val);
    }
}
class ValidatingPrimeNumber extends ValidatingNumber
{
    /*
     * If your PHP doesn't have late-binding statics, then don't make the abstract
     * or overridden methods isValid() static.
     */
    static protected function isValid($val)
    {
        return parent::isValid($val)
            or self::isPrime($val); // defined separately
    }
}
class ValidatingImage extends ValidatingType
{
    ...

    static protected function isValid($val)
    {
        // figure it out, return boolean
    }
}
One advantage of this approach is that you can continue to create new validating types, and you don't get a ballooning Is class.
There are more elegant variations on this approach. This is a simple variation. The syntax may require cleaning up.
