Testing class with high cohesion but reasonably complex logic - php

For example, I have a class that takes raw data, analyses it and returns numbers for a report. Let's call it SomeReportDataProvider:
class SomeReportDataProvider
{
    public function report(array $data)
    {
        $data = $this->prepare($data);
        $report = [];
        foreach ($data as $row) {
            if ($row->somefield == 'somevalue') {
                $report = $this->doThis($report, $row);
            } else {
                $report = $this->doThat($report, $row);
            }
        }
        // ... do something else
        $report = $this->postProcess($report);
        return $report;
    }

    protected function doThis(array $report, $row)
    {
        // ... do some math, probably call some other methods
        return $report;
    }

    protected function doThat(array $report, $row)
    {
        // ... do other math
        return $report;
    }
    // ... and so on
}
So the class really does only one thing: step-by-step processing of raw data for some report. It's not big, most likely 4-6 methods of 5-10 lines each. All of its methods are tightly related and serve one purpose, so I would say that it has high cohesion. But what's the best way to test the class?
If I approach it with the mentality "test behavior, not implementation", I should test only its single public method. The advantage of this approach is that it's easy to refactor the class later without even touching the tests, and I would be sure that it still exhibits exactly the same behavior. But it can also be really hard to cover all cases through a single method, because there are too many possible code paths.
I can make most (probably all) methods public and test them in isolation, but then I'm breaking the encapsulation of the class and its algorithm, and testing its implementation details. Any refactoring would be harder, and unwanted changes in behavior more likely.
As a last option, I can break it into small classes, most likely having 1 or 2 methods each. But is it really a good idea to break a highly cohesive class into smaller, tightly coupled classes that each do a very specific thing and aren't reused anywhere else? It would also still be more difficult to refactor, and probably harder for other developers to quickly get a full picture of how it works.

I always try to go for splitting up the classes as much as possible, like you say in your last option, because in my experience that’s the best way to simplify tests, which is by itself a very valid reason...
Also, I don't agree with you - I think this approach makes it easier to refactor in the future, and easier to understand, than a big class with everything together...
Looking at your code, I can see a bunch of separate “roles”: ReporterInterface, Reporter, ReportDataPreparator, ThisDoer, ThatDoer, ReportPostProcessor (obviously you can find better names :)
You might want to reuse some of them in the future, but even if that's not the case and all them are very specific to reporting, you can just put them in a separate namespace and folder (like a “reporting module”).
This reporting module has one unique API, which is your ReporterInterface, and all the rest of your system only needs to care about that interface, regardless of whether that reporter uses private methods, other classes, or a whole system in the background - they just have to call $reporter->report($data)...
So, from the perspective of the rest of the system, nothing changes, your reporting services are still all together, and your unit tests are much easier to write and maintain…
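A minimal sketch of that split, assuming rows are plain arrays with a somefield and an amount key (both the row shape and the "math" are made up for illustration; only the role names come from the answer above):

```php
<?php
// Hypothetical sketch: the row shape and the arithmetic are invented.

interface ReporterInterface
{
    public function report(array $data): array;
}

// One small collaborator per step; each can be unit-tested on its own.
class ThisDoer
{
    public function apply(array $report, array $row): array
    {
        $report['matched'] = ($report['matched'] ?? 0) + $row['amount'];
        return $report;
    }
}

class ThatDoer
{
    public function apply(array $report, array $row): array
    {
        $report['other'] = ($report['other'] ?? 0) + $row['amount'];
        return $report;
    }
}

// The only class the rest of the system needs to know about.
class Reporter implements ReporterInterface
{
    private $thisDoer;
    private $thatDoer;

    public function __construct(ThisDoer $thisDoer, ThatDoer $thatDoer)
    {
        $this->thisDoer = $thisDoer;
        $this->thatDoer = $thatDoer;
    }

    public function report(array $data): array
    {
        $report = [];
        foreach ($data as $row) {
            $report = $row['somefield'] === 'somevalue'
                ? $this->thisDoer->apply($report, $row)
                : $this->thatDoer->apply($report, $row);
        }
        return $report;
    }
}

$reporter = new Reporter(new ThisDoer(), new ThatDoer());
$result = $reporter->report([
    ['somefield' => 'somevalue', 'amount' => 2],
    ['somefield' => 'else', 'amount' => 5],
    ['somefield' => 'somevalue', 'amount' => 3],
]);
```

Each doer can now be tested with a handful of inputs, while a single test of Reporter::report() only has to confirm that rows are routed to the right collaborator.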

Related

A Big Function with Case instead of many small functions

The benefits of breaking code into very small components which do one and only one simple function are obvious. But there is nothing which mandates that we should make each and every function a separate function in itself.
Consider the following case:
There is one big function, however EACH of the cases included in the function is isolated, and runs only by itself. All of the case blocks can be just copy/pasted into a different code body, wrapped in their function name. So it's modular. There won't be any multiple cases (ifs) combined, ever.
All the small functions which reside in the case blocks use the $vars array as their main variable. So any number of variables in any format can be passed to the parent iterator as part of an array. There are no limitations.
The parent iterator can be run anywhere, from any place, even within itself by requesting a particular action, i.e. $this->run_actions('close_ticket');
It has a massive advantage regarding the common procedures which need to be run, and may need to be run over all actions requested. Output buffering is one, as are action hooks, filters or any other all-encompassing system that can be imagined.
This format allows any future procedures which need to be run before and after any action, and/or on the inputs and outputs of any action, to be easily integrated. (For the particular case I have on my hands, the appearance of such cases in the future is certain!) If all these actions were divided into small functions instead, you would need to either go and change hooks and filters on each of the functions, or still have some sort of other function to dispatch these onto those small functions. With this format, you just place them before or after the switch/case block.
Code reading is simple: When you see a particular case, you know what it does -> 'view_tickets' is the ticket view action, and it takes $vars as an input.
Obviously, a truly hypermassive function will have various disadvantages.
So, the question is:
Assume that the function is kept at a reasonable size and that the principles of modularity and one simple action per case are preserved. Also assume that anyone who works with this code in the future will not need to look into it and must not modify it, and only needs to know the hooks and filters (including any variables they use), which will be documented outside the code itself. Do you think this could be an efficient approach to combining tasks which need common procedures run on them?
public function run_actions($action,$vars=false)
{
// Global, common filters and procedures which all actions need or may need
ob_start();
switch($action)
{
case 'view_tickets':
{
// Do things
// Echo things if necessary
break;
}
case 'close_ticket':
{
// Do things
// Echo things if necessary
break;
}
case 'do_even_more_stuff':
{
// Do things
// Echo things if necessary
break;
}
// Many more cases as needed
}
// Even more common post-processing actions, filters and other stuff any action may need
$output=ob_get_clean();
return $output;
}
You can replace conditional with polymorphism. Create an abstract action class with a method like "execute" and then subclass for all various actions implementing that method.
e.g.
function run_actions(IAction $action) {
    //...
    $action->execute();
    //...
}
That way, if you will need to introduce additional behavior, you won't need to modify and test long run_actions with numerous responsibilities.
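A slightly fuller sketch of that idea, keeping the output buffering from the question in one place. The class names and the action bodies here are hypothetical placeholders, and an interface is used where the answer mentions an abstract class:

```php
<?php
// Hypothetical names throughout; the action bodies are placeholders.

interface IAction
{
    public function execute(array $vars): void;
}

class ViewTickets implements IAction
{
    public function execute(array $vars): void
    {
        echo 'Viewing tickets for ' . $vars['user'];
    }
}

class CloseTicket implements IAction
{
    public function execute(array $vars): void
    {
        echo 'Closed ticket #' . $vars['id'];
    }
}

class ActionRunner
{
    // Common pre- and post-processing lives here exactly once.
    public function run(IAction $action, array $vars = []): string
    {
        ob_start();                    // shared pre-processing
        $action->execute($vars);       // action-specific work
        return (string) ob_get_clean(); // shared post-processing
    }
}

$runner = new ActionRunner();
$output = $runner->run(new CloseTicket(), ['id' => 42]);
```

Adding a new action then means adding a new class, and run() itself does not need to be touched or re-tested.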
Various disadvantages:
The switch cases all use $vars, so they don't have a specific signature.
This hides the signature from the developer, who is thus forced to read the code.
You can't do type-hinting on $vars (force parameters to be arrays, instances of some class, etc.)
No IDE autocompletion.
Easier to make a mistake
Forget a break and you're done. No recognizable error.
Difficult to refactor
What would you do if you needed to extract the code to a function? You would need to duplicate the preprocessing (ob_start, etc.) or change everything.
What would you do if you needed to run an action with no preprocessing?
I agree it is very simple, but it has long-run disadvantages. Up to you to strike the right balance :)
When I look at this kind of architecture, I see it as beginning to build a new programming language on top of the existing one. This isn't always a bad thing, if the features you're building are a better abstraction than the language you're building them with, but it's worth challenging what those features are.
In this case, the part of the language you're reinventing is function dispatch: you have a named action, which takes arbitrary parameters, and runs arbitrary code. PHP already does this, and quite efficiently; it also has features your system lacks, such as built-in checks of the number and (to some extent) type of parameters. Furthermore, by inventing a non-standard "syntax", existing tools will not work as well - they won't recognise the actions as self-documenting structures, for instance.
The main part you gain is the ability to add pre- and post-processing around the actions. If there were no other way to achieve this, the tradeoff might be worthwhile, but luckily you have better options, e.g. putting each action into a function, and passing it as a callback to the wrapper function; or making each action an object, and using inheritance or composition to attach the pre- and post-processing without repeating it.
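The callback option could be sketched roughly like this. The function names are hypothetical; the point is only that each action keeps a real signature while the wrapper owns the shared buffering:

```php
<?php
// Hypothetical helper: wraps any callable in the shared output buffering.
function run_with_buffering(callable $action, ...$args): string
{
    ob_start();                     // common pre-processing
    $action(...$args);              // the action keeps its own real signature
    return (string) ob_get_clean(); // common post-processing
}

// An ordinary function: PHP checks the argument count and type for us.
function close_ticket(int $ticketId): void
{
    echo "Closed ticket #{$ticketId}";
}

$output = run_with_buffering('close_ticket', 7);
```

An action that needs no pre-processing can simply be called directly, which the switch-based version cannot offer without extra flags.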
By wrapping the arguments in an array, you can also emulate named parameters, which PHP lacked before version 8.0. This can be a good idea if a function takes many parameters, some of them perhaps optional, but it does come with the drawbacks of reinventing processing that the language would normally do for you, such as applying the correct defaults, complaining on missing mandatory items, etc.
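To make that drawback concrete, here is a small sketch of the bookkeeping an options array forces on you. The function and option names are invented for illustration:

```php
<?php
// Hypothetical function; the option names are invented for illustration.
function create_ticket(array $options): array
{
    $defaults = ['priority' => 'normal', 'assignee' => null];

    // The language would normally report a missing argument; here we must.
    if (!array_key_exists('title', $options)) {
        throw new InvalidArgumentException('Missing required option "title"');
    }

    // Later arrays win, so caller-supplied values override the defaults.
    return array_merge($defaults, $options);
}

$ticket = create_ticket(['title' => 'Printer on fire', 'priority' => 'high']);
```

Note that a misspelled key such as 'priorty' would pass silently here, which is exactly the kind of check a real signature gives you for free.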
There is a simple principle that says don't use more than 2 tab indentation.
eg :
public function applyRules($rules){
if($rules){
foreach($rules as $rule){
// apply rule
}
}
}
Becomes better when you refactor it :
public function applyRules($rules){
if($rules){
$this->executeRules($rules);
}
}
private function executeRules($rules){
foreach($rules as $rule){
$rule->execute();
}
}
This way your code is split into smaller pieces, and you can write more focused unit tests than you could before.
Another rule says not to use else; return early instead, e.g.:
public function usernameExists($username){
if($username){
return true;
}else{
return false;
}
}
Instead of doing this, you should do this :
public function usernameExists($username){
if($username){
return true;
}
return false;
}
This way the code reads as one straight flow from top to bottom.

Am I setting myself up for failure using a static method in a Laravel Controller?

I am quite new to OOP, so this is really a basic OOP question, in the context of a Laravel Controller.
I'm attempting to create a notification system that creates Notification objects when certain other objects are created, edited, deleted, etc. So, for example, if a User is edited, then I want to generate a Notification regarding this edit. Following this example, I've created UserObserver that calls NotificationController::store() when a User is saved.
class UserObserver extends BaseObserver
{
public function saved($user)
{
$data = [
// omitted from example
];
NotificationController::store($data);
}
}
In order to make this work, I had to make NotificationController::store() static.
class NotificationController extends \BaseController {
    public static function store($data)
    {
        // validation omitted from example
        $notification = Notification::create($data);
    }
}
I'm only vaguely familiar with what static means, so there's more than likely something inherently wrong with what I'm doing here, but this seems to get the job done, more or less. However, everything that I've read indicates that static functions are generally bad practice. Is what I'm doing here "wrong," per se? How could I do this better?
I will have several other Observer classes that will need to call this same NotificationController::store(), and I want NotificationController::store() to handle any validation of $data.
I am just starting to learn about unit testing. Would what I've done here make anything difficult with regard to testing?
I've written about statics extensively here: How Not To Kill Your Testability Using Statics. The gist of it as applies to your case is as follows:
Static function calls couple code. It is not possible to substitute static function calls with anything else or to skip those calls, for whatever reason. NotificationController::store() is essentially in the same class of things as substr(). Now, you probably wouldn't want to substitute a call to substr by anything else; but there are a ton of reasons why you may want to substitute NotificationController, now or later.
Unit testing is just one very obvious use case where substitution is very desirable. If you want to test the UserObserver::saved function just by itself, because it contains a complex algorithm which you need to test with all possible inputs to ensure it's working correctly, you cannot decouple that algorithm from the call to NotificationController::store(). And that function in turn probably calls some Model::save() method, which in turn wants to talk to a database. You'd need to set up the whole environment that all this other unrelated code requires (and which may or may not contain bugs of its own), so it is essentially impossible to simply test this one function by itself.
If your code looked more like this:
class UserObserver extends BaseObserver
{
public function saved($user)
{
$data = [
// omitted from example
];
$this->NotificationController->store($data);
}
}
Well, $this->NotificationController is obviously a variable which can be substituted at some point. Most typically this object would be injected at the time you instantiate the class:
new UserObserver($notificationController)
You could simply inject a mock object which allows any methods to be called, but which simply does nothing. Then you could test UserObserver::saved() in isolation and ensure it's actually bug free.
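A framework-free sketch of that substitution. NotificationStore, the recording double and the $data shape are assumptions made for illustration, not Laravel APIs; the double here records calls rather than doing nothing at all, so the test can check what was sent:

```php
<?php
// Hypothetical sketch: NotificationStore and the $data shape are assumptions,
// not Laravel APIs.

interface NotificationStore
{
    public function store(array $data): void;
}

class UserObserver
{
    private $notifications;

    // The dependency is handed in, so a test can pass a stand-in.
    public function __construct(NotificationStore $notifications)
    {
        $this->notifications = $notifications;
    }

    public function saved(array $user): void
    {
        $this->notifications->store([
            'type' => 'user.saved',
            'user_id' => $user['id'],
        ]);
    }
}

// Test double: records calls instead of touching a database.
class RecordingNotificationStore implements NotificationStore
{
    public $stored = [];

    public function store(array $data): void
    {
        $this->stored[] = $data;
    }
}

$store = new RecordingNotificationStore();
$observer = new UserObserver($store);
$observer->saved(['id' => 42]);
```

In production code the real implementation of NotificationStore would wrap the Notification model; the observer does not need to know which one it was given.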
In general, using dependency injected code makes your application more flexible and allows you to take it apart. This is necessary for unit testing, but will also come in handy later in scenarios you can't even imagine right now, but will be stumped by half a year from now as you need to restructure and refactor your application for some new feature you want to implement.
Caveat: I have never written a single line of Laravel code, but as I understand it, it does support some form of dependency injection. If that's actually really the case, you should definitely use that capability. Otherwise be very aware of what parts of your code you're coupling to what other parts and how this will impact your ability to take it apart and refactor later.

Abstract class - children type

I'm trying to design some class hierarchy and I got "stuck" at this part.
Let's say that I have the following classes:
abstract class Video
{
const TYPE_MOVIE = 1;
const TYPE_SHOW = 2;
abstract public function getTitle();
abstract public function getType();
}
class Movie extends Video
{
// ...
public function getType()
{
return self::TYPE_MOVIE;
}
}
class Show extends Video
{
// ...
public function getType()
{
return self::TYPE_SHOW;
}
}
In a different part of the system I have a (Parser) class that encapsulates the creation of movie and show objects and returns the object to the client.
Question: What is the best way to get the type of an object returned from the parser/factory class, so that the client can do something like
$video = $parser->getVideo('Dumb and Dumber');
echo $video->getTitle();
// Way 1
if($video->getType() == 'show') {
echo $video->getNbOfSeasons();
}
// Way 2
if($video instanceof Show) {
echo $video->getNbOfSeasons();
}
// Current way
if($video->getType() == Video::TYPE_SHOW) {
echo $video->getNbOfSeasons();
}
Is there a better way than my solution (read as: does my solution suck?)?
Your solution does not suck, per se. However, whenever someone is trying to determine the subtype to perform some actions, I tend to wonder; why? This answer might be a little theoretical and perhaps even a little pedantic, but here goes.
You shouldn't care. The relationship between a parent and a child class is that the child class overrides the behaviour of the parent. A parent class should always be substitutable by its children, regardless of which one. If you find yourself asking how to determine the subtype, you're usually doing one of two things "wrong":
You're attempting to perform an action based upon subtype. Normally, one would opt for moving that action to the class itself, instead of "outside" of the class. This makes for more manageable code as well.
You're attempting to fix a problem you've introduced yourself by using inheritance where inheritance isn't warranted. If there is a parent, and there are children, each of which is to be used differently and each of which has different methods, just stop using inheritance. They're not the same type. A film is not the same as a TV series, not even close. Sure, you can see both on your television, but the resemblance stops there.
If you're running into issue number 2, you're probably using inheritance not because it makes sense, but simply to reduce code duplication. That, in and of itself, is a good thing, but the way you're attempting to do so might not be optimal. If you can, you could use composition instead, although I have my doubts about where the duplicated behaviour would be, apart from some arbitrary getters and setters.
That said, if your code works, and you're happy with it: go for it. This answer is correct in how to approach OO, but I don't know anything about the rest of your application, so the answer is generic.
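As a rough illustration of the first point (moving the action into the class), the client code from the question could avoid any type check if each subtype describes itself. The describe() method and the constructor arguments are hypothetical additions, not part of the original classes:

```php
<?php
// Hypothetical sketch: describe() and the constructors are invented here.

abstract class Video
{
    protected $title;

    public function __construct(string $title)
    {
        $this->title = $title;
    }

    public function getTitle(): string
    {
        return $this->title;
    }

    // Each subtype decides what extra details it has to show.
    abstract public function describe(): string;
}

class Movie extends Video
{
    public function describe(): string
    {
        return $this->getTitle();
    }
}

class Show extends Video
{
    private $nbOfSeasons;

    public function __construct(string $title, int $nbOfSeasons)
    {
        parent::__construct($title);
        $this->nbOfSeasons = $nbOfSeasons;
    }

    public function describe(): string
    {
        return $this->getTitle() . ' (' . $this->nbOfSeasons . ' seasons)';
    }
}

// The client never asks which subtype it received.
$videos = [new Movie('Dumb and Dumber'), new Show('Some Show', 3)];
$lines = array_map(function (Video $video) {
    return $video->describe();
}, $videos);
```

Adding a new subtype then requires no new constant and no new branch in the client.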
I'd go with way 2. It spares you the need to add another constant to Video in case you want to add class SoapOpera extends Show (for instance).
With way #2, you are less dependent on constants. Whatever information you can gain without hardcoding it means fewer problems in the future if you want to extend. Read about tight and loose coupling.
I think the second option is better, using instanceof. This is in general common to all OOP design and not just PHP.
With your first option, you have specifics about derived classes in the base class, and thus must modify the base class for each new derived class you add, which should always be avoided.
Leaving the base class as-is when adding new derived classes promotes code reuse.
If there is a "right" way, and everything is subjective in coding of course (as long as it doesn't adversely impact performance/maintainability ;) ), then it's the second way as "Truth" and "Brady" have pointed out.
The upside of doing things the way you're doing them now (class constants in the abstract) is that when you're working with other developers it can provide hints as to how you expect the abstract class to be interacted with.
For instance:
$oKillerSharkFilm = Video::factory(Video::MOVIE, 'Jaws', 'Dundundundundundun');
$oKillerSharkDocumentary = Video::factory(Video::DOCUMENTARY, 'Jaws', 'A Discovery Shark Week Special');
Of course, the downside is that you have to maintain the "allowable extensions" in the abstract class.
You could still use the instanceof method as demonstrated in your question and maintain the list of allowable extension in the abstract predominantly for control/type fixing.

OOP concept understanding

I recently started learning the basics of OOP in PHP.
I am new to a whole lot of concepts.
In the traditional procedural way of doing things, if I had a repetitive task, I wrote a function and called it each time.
Since this seems to be a regular occurrence, I created a small library of 5-10 functions, which I included in my procedural projects and used.
In OOP, what is the valid way of using your functions and having them accessible from all objects?
To make things closer to the real world, I created a thumbnail class, that takes an image filename as an argument and can perform some operations on it.
In procedural programming, when I had a function for creating thumbnails, I also had a function to create a random md5 string, check a given folder to see if said string existed, and repeat if it did, so I could generate a unique name for my thumbnails before saving them.
But if I wanted to generate another unique name for another purpose, say saving a text file, I could call that function again.
So, long story short, what is the valid OOP way to have the method randomise_and_check($filename) (and all other methods in my library) accessible from all the objects in my application?
Great question. The first thing you want to do is identify the primary objects you will be working with. An easy way to do this is to identify all the nouns related to your project. In your example it sounds like you will be working with images and strings, so we can create two classes which will contain the related members (methods, member variables, etc.). And as you wisely mentioned, we need to ensure that the algorithms you are converting into OOP can be called from any context, so we try to keep them as abstract as possible (within reason).
So for your specific situation I would suggest something like:
// Good object reference, abstract enough to cover any type of image
// But specific enough to provide semantic API calls
class Image
{
    // Using your example, but to ensure you follow the DRY principle
    // (Don't repeat yourself) this method should be broken up into two
    // separate methods
    public static function randomise_and_check($fileUri)
    {
        // Your code here
        // ...
        // Example of call to another class from within this class
        $hash = StringUtil::generateHash();
    }
}

// Very abstract, but allows this class to grow over time, by adding more
// string related methods
// (named StringUtil because "String" is a reserved class name since PHP 7)
class StringUtil
{
    public static function generateHash()
    {
        return md5(rand());
    }
}

// Calling code example
$imageStats = Image::randomise_and_check($fileUri);
There are several other approaches and ideas that can be employed, such as whether or not to instantiate objects, or whether we should create a parent class from which we can extend, but these concepts will become evident over time and with practice. I think the code snippet provided should give you a good idea what you can do to make the jump from procedural to OOP. And, as always, don't forget to read the docs for more info.
-- Update --
Adding an OOP example:
class Image
{
    protected $sourceUri;

    public function setSourceUri($sourceUri)
    {
        $this->sourceUri = $sourceUri;
    }

    public function getSourceUri()
    {
        return $this->sourceUri;
    }

    public function generateThumbnail()
    {
        // YourGenerator stands in for whatever resizing code you use
        return YourGenerator::resize($this->getSourceUri());
    }
}

$image = new Image();
$image->setSourceUri($imageUri);
$thumbnail = $image->generateThumbnail();
The way I see it, you have two options:
Don't worry about cramming yourself into OOP and just make them standard, global functions in some utilities.php file you include wherever you want to use it. This is my preferred method.
If you take the more OOP approach, you could make them static functions ("methods") in some utilities class. From the PHP documentation:
<?php
class Foo {
public static function aStaticMethod() {
// ...
}
}
Foo::aStaticMethod();
$classname = 'Foo';
$classname::aStaticMethod(); // As of PHP 5.3.0
?>
Create an (abstract) Util-class with static functions:
example from my Util class:
abstract class Util {
    public static function dump($object) {
        echo '<pre class="dump">' . print_r($object, true) . '</pre>';
    }
}
How to use:
<?php
$object = new Whatever();
//what's in the object?
Util::dump($object);
?>
For a beginner, OOP development is not all that different from procedural (once you master the basic concepts it gets quite a bit different, but that's not important to learning the basics).
You deal in OO concepts all the time, you just don't realize it. When you click on a file in your file manager, and manipulate that file.. you're using Object Oriented concepts. The file has attributes (size, type, read-only, etc..) and things you can do with it (open, copy, delete).
You just apply those concepts to development by creating objects that have properties and things you can do with it (methods).
In the OOP world, you don't typically make things available to everything else. OOP is all about "encapsulation", which is limiting access to only that which is needed. Why would you make a "haircut" method available to an orange juice object? You wouldn't. You only make the "haircut" method available to objects that need haircuts.
Writing reusable OO software is very difficult. Even professionals can't get it right a lot of the time. It requires a mixture of experience, training, practice, and frankly luck in some cases.
You should read about Dependency Injection as it seems to apply to your specific problem. Basically, you have an object that depends on some abstraction, maybe the "Image Library" functionality. In your controller, you would create an instance of the "Image Library" object and inject that dependency into whatever other objects required it.
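Applied to the unique-name helper from the question, that could look roughly like the sketch below. The class names are hypothetical, and uniqid() is swapped in for the question's random md5 seed purely for illustration:

```php
<?php
// Hypothetical classes; only the idea of injecting the dependency comes
// from the text above.

class UniqueNameGenerator
{
    private $directory;

    public function __construct(string $directory)
    {
        $this->directory = $directory;
    }

    // Keeps generating random names until one is not taken in the directory.
    public function generate(string $extension): string
    {
        do {
            $name = md5(uniqid('', true)) . '.' . $extension;
        } while (file_exists($this->directory . '/' . $name));

        return $name;
    }
}

class Thumbnail
{
    private $names;

    // Thumbnail only knows it was given something that produces names.
    public function __construct(UniqueNameGenerator $names)
    {
        $this->names = $names;
    }

    public function targetName(): string
    {
        return $this->names->generate('jpg');
    }
}

$names = new UniqueNameGenerator(sys_get_temp_dir());
$thumbnail = new Thumbnail($names);
$fileName = $thumbnail->targetName();
```

The same generator instance can be handed to whatever saves text files, so the helper is shared without being global.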
That is, you need to stop thinking on the global scope altogether. Instead, you have to compartmentalize functionalities in a sane way and tie them together. Basically, objects should know only as little as they need to know (also look up Law of Demeter and SOLID). I reiterate, this is tough to do correctly, and most of the time you can still have an application that works beautifully even if it's done incorrectly.
If you want to be very strict about this you should apply this line of thinking to everything, but if you have a function that wraps something very simple like return isset($_POST[$key]) ? $_POST[$key] : $default; I see no real harm in creating a global function for that. You could create an HttpPost wrapper class, but that is overkill in most circumstances IMO.
The short answer: use ordinary functions. OOP encourages you to think about data and its associated routines; using static functions instead of ordinary ones does not make your program more object-oriented. Sticking to a single programming paradigm is not practical, so combine paradigms when you see that this will make your program cleaner.

best way to refactor my form (procedural to oop?)

(Note: this is related to this question, but I think it could have been written more clearly, so I'm trying again -- my update only helped to a limited extent.)
I've inherited some code that creates a complex form with numerous sections, and lots of possible views, depending on a number of parameters. I've been working with it for a while, and finally have a chance to think about doing some re-factoring. It's currently written procedurally, with a bunch of functions that look like this:
function get_section_A($type = 'foo', $mode = 'bar', $read_only = false, $values = array()) {
    if ($type == 'foo') {
        if ($mode == 'bar') { }
        else { }
    } else { }
}
Passing around those parameters is nasty, so I've started writing a class like this:
class MyForm {
public $type; // or maybe they'd be private or
public $mode; // I'd use getters and setters
public $read_only; // let's not get distracted by that :)
public $values;
// etc.
function __construct ($type='foo', $mode='bar', $read_only=false, $values_array=array()) {
$this->type = $type;
// etc.
}
function get_sections () {
    $result = $this->get_section_A();
    $result .= $this->get_section_B();
    $result .= $this->get_section_C();
    return $result;
}
function get_section_A() {
if ($this->type == 'foo') { }
else { }
}
function get_section_B() {}
function get_section_C() {}
// etc.
}
The problem is that the procedural functions are split into a few files (for groups of sections), and if I combine them all into a single class file, I'm looking at 2500 lines, which feels unwieldy. I've thought of a few solutions:
keep living with the nasty parameters and do something else with my time :)
live with having a 2500 line file
create a separate class for each group of sections that somehow "knows" the values of those parameters
If I do #3, I've thought of two basic approaches:
pass the MyForm object in as a single parameter
create a FormSectionGroup class with static properties that get set in MyForm, then in the group files, each class would extend FormSectionGroup and automatically have access to the current values for those parameters.
1) is probably easier to set-up, and once I'm inside get_section_A() whether I say $this->type or $myForm->type isn't all that different, but it's not exactly OOP. (In fact, I could do that without really changing to an OOP approach.)
Are there other approaches? Thoughts about which is better?
I would love nothing more than to write a lengthy explanation of how to do this, but I'm feeling a bit lazy. I do however have enough energy to point you instead to Zend_Form from the zend framework. There may be some dependencies to make it work properly (Zend_View, Elements, Decorators), but once you have them, it handles that type of situations quite gracefully.
I thought of this when I posted in your previous question - this problem reeks of decorator pattern.
It's going to be no small task, though. But I think you'll have an amazing sense of satisfaction/accomplishment once you get it done.
Having done a lot of Cocoa programming recently, I tend to view things in MVC-pattern (Model-View-Controller) terms. Hence, I'd look at the form as a controller of its various sections.
Each section object should be responsible for keeping track of its status, values and whether or not it should be displayed. Or to be precise, the section_model would take care of the values (default values, validation, etc.), the section_view would take care of displaying (or not) parts of the section, and the section_controller would keep track of the status of the section and report results to the form object.
The form object should instantiate the section controllers, tell them to display or hide or whatever, and get status reports. The form object, which really acts as a controller, can then decide when the form is completely filled out. You may have a form_model object to save the collected data, or maybe you'd rather have the section_model objects take care of that. It will take a while to get a feeling for how the different objects interact, but I know from experience that if you are disciplined in designing your objects (the key is: what is an object's responsibility and what is not), you will gain a better overview and your code will be easier to upgrade. Once you find that improvements start to arise naturally, you are on the right track.
If you have the time to work on doing #3, you will probably be the happiest over the long run. Personally, I don't have the time in my job to refactor to that degree very often, and so I'd probably be forced to pick #1. #2 sounds like the worst of both worlds to me. Lots of work to get code you don't like.
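For what it's worth, option #3 with the "pass one object in" approach might be sketched like this. FormContext, AddressSections and the rendered markup are all hypothetical; the question only names MyForm and FormSectionGroup:

```php
<?php
// Hypothetical sketch of option #3: a small value object carries the shared
// parameters, and each section group receives it once.

class FormContext
{
    public $type;
    public $mode;
    public $readOnly;
    public $values;

    public function __construct($type = 'foo', $mode = 'bar', $readOnly = false, array $values = [])
    {
        $this->type = $type;
        $this->mode = $mode;
        $this->readOnly = $readOnly;
        $this->values = $values;
    }
}

abstract class FormSectionGroup
{
    protected $context;

    public function __construct(FormContext $context)
    {
        $this->context = $context;
    }

    abstract public function render(): string;
}

// One class per group of sections, each in its own file.
class AddressSections extends FormSectionGroup
{
    public function render(): string
    {
        $attr = $this->context->readOnly ? ' readonly' : '';
        $city = htmlspecialchars($this->context->values['city'] ?? '');
        return '<input name="city" value="' . $city . '"' . $attr . '>';
    }
}

class MyForm
{
    private $groups;

    public function __construct(array $groups)
    {
        $this->groups = $groups;
    }

    public function get_sections(): string
    {
        $result = '';
        foreach ($this->groups as $group) {
            $result .= $group->render();
        }
        return $result;
    }
}

$context = new FormContext('foo', 'bar', true, ['city' => 'Oslo']);
$form = new MyForm([new AddressSections($context)]);
$html = $form->get_sections();
```

This keeps each file small without static properties, and each group can be rendered in a test with a hand-built context.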
