I started off by drafting a question: "What is the best way to perform unit testing on a constructor (e.g., __construct() in PHP5)", but when reading through the related questions, I saw several comments that seemed to suggest that setting member variables or performing any complicated operations in the constructor are no-nos.
The constructor for the class in question here takes a param, performs some operations on it (making sure it passes a sniff test, and transforming it if necessary), and then stashes it away in a member variable.
I thought the benefits of doing it this way were:
1) that client code would always be
certain to have a value for this
member variable whenever an object
of this class is instantiated, and
2) it saves a step in client code
(one of which could conceivably be
missed), e.g.,
$Thing = new Thing;
$Thing->initialize($var);
when we could just do this
$Thing = new Thing($var);
and be done with it.
Is this a no-no? If so why?
My rule of thumb is that an object should be ready for use after the constructor has finished. But there are often a number of options that can be tweaked afterwards.
My list of do's and donts:
Constructors should set up basic options for the object.
They should maybe create instances of helper objects.
They should not aqquire resources(files, sockets, ...), unless the object clearly is a wrapper around some resource.
Of course, no rules without exceptions. The important thing is that you think about your design and your choises. Make object usage natural - and that includes error reporting.
This comes up quite a lot in C++ discussions, and the general conclusion I've come to there has been this:
If an object does not acquire any external resources, members must be initialized in the constructor. This involves doing all work in the constructor.
(x, y) coordinate (or really any other structure that's just a glorified tuple)
US state abbreviation lookup table
If an object acquires resources that it can control, they may be allocated in the constructor:
open file descriptor
allocated memory
handle/pointer into an external library
If the object acquires resources that it can't entirely control, they must be allocated outside of the constructor:
TCP connection
DB connection
weak reference
There are always exceptions, but this covers most cases.
Constructors are for initializing the object, so
$Thing = new Thing($var);
is perfectly acceptable.
The job of a constructor is to establish an instance's invariants.
Anything that doesn't contribute to that is best kept out of the constructor.
To improve the testability of a class it is generally a good thing to keep it's constructor as simple as possible and to have it ask only for things it absolutely needs. There's an excellent presentation available on YouTube as part of Google's "Clean Code Talks" series explaining this in detail.
You should definitely avoid making the client have to call
$thing->initialize($var)
That sort of stuff absolutely belongs in the constructor. It's just unfriendly to the client programmer to make them call this. There is a (slightly controversial) school of thought that says you should write classes so that objects are never in an invalid state -- and 'uninitialized' is an invalid state.
However for testability and performance reasons, sometimes it's good to defer certain initializations until later in the object's life. In cases like these, lazy evaluation is the solution.
Apologies for putting Java syntax in a Python answer but:
// Constructor
public MyObject(MyType initVar) {
this.initVar = initVar;
}
private void lazyInitialize() {
if(initialized) {
return
}
// initialization code goes here, uses initVar
}
public SomeType doSomething(SomeOtherType x) {
lazyInitialize();
// doing something code goes here
}
You can segment your lazy initialization so that only the parts that need it get initialized. It's common, for example, to do this in getters, just for what affects the value that's being got.
Depends on what type of system you're trying to architect, but in general I believe constructors are best used for only initializing the "state" of the object, but not perform any state transitions themselves. Best to just have it set the defaults.
I then write a "handle" method into my objects for handling things like user input, database calls, exceptions, collation, whatever. The idea is that this will handle whatever state the object finds itself in based on external forces (user input, time, etc.) Basically, all the things that may change the state of the object and require additional action are discovered and represented in the object.
Finally, I put a render method into the class to show the user something meaningful. This only represents the state of the object to the user (whatever that may be.)
__construct($arguments)
handle()
render(Exception $ex = null)
The __construct magic method is fine to use. The reason you see initialize in a lot of frameworks and applications is because that object is being programmed to an interface or it is trying to enact a singleton/getInstance pattern.
These objects are generally pulled into context or a controller and then have the common interface functionality called on them by other higher level objects.
If $var is absolutely necessary for $Thing to work, then it is a DO
You should not put things in a constructor that is only supposed to run once when the class is created.
To explain.
If i had a database class. Where the constructor is the connection to the database
So
$db = new dbclass;
And now i am connected to the database.
Then we have a class that uses some methods within the database class.
class users extends dbclass
{
// some methods
}
$users = new users
// by doing this, we have called the dbclass's constructor again
Related
Is it a good idea to have logic inside __constructor?
public class someClass
{
public function __construct()
{
//some logic here
}
So far I thought that it is fine; however, this reddit comment suggests the opposite.
As #Barry wrote, one of the reasons is related to unit-testing, but it's just a side-effect.
Let's take the worst case scenario: you have a "class", which only has a constructor (you probably have seen such examples). So ... why was it even written as a class? You cannot alter it's state, you cannot request it to perform any task and you have no way to check, that it did what you wanted. You could as well used a linear file and just included it. This is just bad.
Now for a more reasonable example: let's assume you have a class, which has some validation checks in the constructor and makes a new DB connection in it. And and then it also has some public methods for performing various tasks
The most obvious problem is the "makes a new DB connection" - there is no way to affect or prevent this operation from outside the class. And that new connection is going off to do who-knows-what (probably loading some configuration and trying to throw exceptions). It also constitutes a hidden dependency, for which you have no indication, without inspecting the class's code.
And there is a similar problem with code, that does validations and/or transformations of passed parameters. It constitutes hidden logic (and thus violating PoLA. It also makes your class harder to extend, because you probably will want to retain some of that validation functionality, while replacing other part. And you don't have that option. Because all of that code gets run whenever you crate a new instance.
Bottom line is this - logic in constructor is considered to be a "code smell". It's not a mortal sin (like using eval() on a global variable), but it's a sign of bad design.
No it isn't a good idea for automated testing. When testing you want to be able to "mock" objects that allow you to control the logic especially in terms of interfaces. So if you place logic in the constructor then it is very hard to test as you must use the real object.
here is a fantastic talk with much more detail on why not to put logic in constructor (google tech talk by Misko Hevery)
https://www.youtube.com/watch?v=RlfLCWKxHJ0
I think this question is little bit unclear because i don't think that __construct is bad place to logic, the question is what kind of logic you have here? Some kind of logic can be placed in constructor but another must not be present in constructor. For example Symfony Response - constructor contains logic, but this logic is necessary for this object, and this constructor doesn't make some implicit actions. This constructor doesn't print content to output or something else - so this is good example (as for me)...
Also it is important to understand what you object must to do, if it will be immutable object - constructor can have little bit another view...
Also it's important to follow SOLID and appropriate design pattern...
This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Who needs singletons?
i was wondering, what are the drawbacks using Singletons in php scripts. I use them alot and i sometimes can't understand criticism of developers. Some examples:
I have a Request class:
Sanitizing POST, GET, COOKIE inputdata and using it instead of the global arrays - strictly and globally. Like
$request = Request::getInstance();
$firstname = $request->post('firstname', $additionalFilters);
There is always only ONE request per request. Why is using singleton in this case a bad idea?
Same for $_SESSION:
I have a Session class (Singleton) which does represent the $_SESSION array because there is only one session and i use it globally.
Database
$mysql = DB::getInstance('mysql', 'dbname'); //pseudo
$sqlite = DB::getInstance('sqlite', 'dbname'); //pseudo
For each type of database, i want only ONE object and never MORE. In my opinion there is otherwise a risk of chaos.
Unique rows
Also i often use classes to represent/use a unique row of a db table.
$article = Article::getInstance($id);
$navigation = Navigation::getInstance($id);
I see only benefits doing it this way. I never want a second object representing a unique row. Why is singleton such a bad idea here?
In fact, most (nearly all) my classes don't have a public constructor but always a static method like getInstance($id) or create() so the class itself handles the possible instances (which doesn't mean they are all singletons by definition)
So my question is: Are there any drawbacks i didn't realize yet. And which concrete scenario's the singleton-doubters thinking of when advise against Singletons.
Edit:
Now, you got a singleton that wraps around the $_POST, but what if you
don't have $_POST, but want to use a file for input instead? In that
case, it would be more convenient if you have an abstract input class,
and instantiate a POSTInput to manage input through posted data.
Ok, valid advantages. I didn't realized that. Especially the point regarding the Request class.
Still i have doubts whether that approach. Assume i have a "Functionality" class which executes a concrete request (like a guestbook component).
Within that class i want to get a sent parameter. So i get my singleton instance of Request
$req = Request::getInstance();
$message = $req->post('message');
This way, only my functionality object cares about a Request class.
When i use the non-singleton approach, i need somehow an additional class/function to manage that every request gets a valid request object. That way my Functionality class doesn't need to know about that managing class but in my opinion there still arises a dependence/problem: Everytime i create an instace of an Functionality object there is a chance that i forget to set a request object.
Surely i can define a non-optional parameter when creating a functionality. But that leads to a parameter overkill altogether at some time. Or not?
Singletons (and static classes, to which the same story largely applies) are not bad per se, but they introduce dependencies that you may not want.
Now, you got a singleton that wraps around the $_POST, but what if you don't have $_POST, but want to use a file for input instead? In that case, it would be more convenient if you have an abstract input class, and instantiate a POSTInput to manage input through posted data.
If you want to get input from a file, or even (for whatever reason) want to mimic (or replay) multiple requests based on input from a database table, you can still do this, without altering any code, except the part that instantiates the class.
Same applies to other classes too. You don't want your whole application to talk to this MySQL singleton.. What if you need to connect to two MySQL databases? What if you need to switch to WhatEverSQL?.. Make abstracts for these kinds of classes, and override them to implement specific technologies.
I do not think singletons should have as bad press they do in a request-based architecture such as PHP or ASP.NET (or whatever you want). Essentially, in a regular program the life-time of that singleton can be as many months or years as the program is running:
int main()
{
while(dont_exit)
{
// do stuff
Singleton& mySingleton = Singleton::getInstance();
// use it
}
return 0;
}
Apart from being little more than a global variable, it is very hard to replace that singleton with, perhaps, a singleton that might be useful in unit-testing. The amount of code that could depend on it, in potentially hundreds of source files tightly couples the use of that singleton to the entire program.
That said, in the case of request-based scenarios such as your PHP page, or an ASP.NET page, all your callable code is effectively wrapped in a function call anyway. Again, they are obfuscating a global variable (but within the context of the request) with safe-guards against being created more than once.
But still, I advocate against their use. Why? Because even in the context of your single request, everything is reliant and tightly coupled to that instance. What happens when you want to test a different scenario with a different request object? Assuming you coded using includes, you now have to go and modify every single instance of that call. If you had passed a reference to a pre-constructed Request class, you can now do cool stuff, such as provide mock unit-testing version of your class, by simply changing what gets passed down to the other functions. You've also de-coupled everything from using this universal Request object.
Ok guys I am struggling to understand why there is a need of singleton.
Let's make a real example: I have a framework for a my CMS
I need to have a class that logs some information (let's stick on PHP).
Example:
class Logger{
private $logs = array();
public function add($log) {
$this->logs[]=$log;
}
}
Now of course this helper object must be unique for the entry life of a page request of my CMS.
And to solve this we would make it a singleton (declaring private the constructor etc.)
But Why in the hell a class like that isn't entirerly static? This would solve the need of the singleton pattern (that's considered bad pratice) Example:
class Logger {
private static $logs = array();
public static function add($log) {
self::$logs[]=$log;
}
}
By making this helper entirely static, when we need to add a log somewhere in our application we just need to call it statically like: Logger::add('log 1'); vs a singleton call like: Logger::getInstance()->add('log 1');
Hope someone makes it easy to understand for me why use singleton over static class in PHP.
Edit
This is a pretty nice lecture on the singleton vs static class for who is interested, thanks to #James. (Note it doesn't address my question)
Many reasons.
Static methods are basically global functions that can be called from any scope, which lends itself to hard to track bugs. You might as well not use a class at all.
Since you cannot have a __construct method, you may have to put an init static method somewhere. Now people in their code are unsure if the init method has been called previously. Do they call it again? Do they have to search the codebase for this call? What if init was somewhere, but then gets removed, or breaks? Many places in your code now rely on the place that calls the init method.
Static methods are notoriously hard to unit test with many unit testing frameworks.
There are many more reasons, but it's hard to list them all.
Singletons aren't really needed either if you are using DI.
A side note. DI allows for your classes not to rely on each other, but rather on interfaces. Since their relationships are not cemented, it is easier to change your application at a later time, and one class breaking will not break both classes.
There are some instances where single state classes are viable, for instance if none of your methods rely on other methods (basically none of the methods change the state of the class).
I use singletons, so I can tell you exactly why I do it instead of a static function.
The defining characteristic of a singleton is that it is a class that has just one instance. It is easy to see the "just one instance" clause and forget to see the "it is a class" clause. It is, after all, a normal class object with all the advantages that that brings. Principly, it has its own state and it can have private functions (methods). Static functions have to do both of these in more limited or awkward ways.
That said, the two complement each other: a static function can be leveraged to return a singleton on the same class. That's what I do in the singleton I use the most often: a database handler.
Now, many programmers are taught that "singletons are bad, mm'kay?" but overlook the rider that things like are usually only bad when overused. Just like a master carver, an experienced programmer has a lot of tools at his disposal and many will not get a lot of use. My database handler is ideal as a singleton, but it's the only one I routinely use. For a logging class, I usually use static methods.
Singletons allow you to override behavior. Logger::add('1') for example can log to different devices only if the Logger class knows how. Logger::getLogger()->add('1') can do different things depending on what subtype of Logger getLogger() returns.
Sure you can do everything within the logger class, but often you then end up implementing the singleton inside the static class.
If you have a static method that opens a file, writes out and closes it, you may end up with two calls trying to open the same file at the same time, as a static method doesn't guarantee there is one instance.
But, if you use a singleton, then all calls use the same file handler, so you are always only having one write at a time to this file.
You may end up wanting to queue up the write requests, in case there are several, if you don't want them to fail, or you have to synchronize in other ways, but all calls will use the same instance.
UPDATE:
This may be helpful, a comparison on static versus singleton, in PHP.
http://moisadoru.wordpress.com/2010/03/02/static-call-versus-singleton-call-in-php/
As leblonk mentioned, you can't override static classes, which makes unit testing very difficult. With a singleton, you can instantiate a "mock" object instead of the actual class. No code changes needed.
Static classes can have namespace conflicts. You can't load 2 static classes of the same name, but you can load 2 different versions of a singleton and instantiate them under the same name. I've done this when I needed to test new versions of classes. I instantiate a different version of the class, but don't need to change the code that references that class.
I often mix singletons and static. For example, I use a database class that ensures there is only 1 connection to each master (static) and slave (singleton). Each instance of the db class can connect to a different slave, if a connection to the same slave is requested, the singleton object is returned. The master connection is a static object instantiated inside each slave singleton, so only 1 master connection exists across all db instantiated objects.
Firstly, I want to restrict this question to web development only. So this is language agnostic as long as the language is being used for web development. Personally, I am coming at this from a background in PHP.
Often we need to use an object from multiple scopes. For example, we might need to use a database class in the normal scope but then also from a controller class. If we create the database object in normal scope then we cannot access it from inside the controller class. We wish to avoid creating two database objects in different scopes and so need a way of reusing the database class regardless of scope. In order to do so, we have two options:
Make the database object global so that it can be accessed from anywhere.
Pass the database class to the controller class in the form of, for example, a parameter to the controller's constructor. This is known as dependency injection (DI).
The problem becomes more complex when there are many classes involved all demanding objects in many different scopes. In both solutions, this becomes problematic because if we make each one of our objects global, we are putting too much noise into the global scope and if we pass too many parameters into a class, the class becomes much more difficult to manage.
Therefore, in both cases, you often see the use of a registry. In the global case, we have a registry object which is made global and then add all of our objects and variables to that making them available in any object but only putting a single variable, the registry, into the global scope. In the DI case, we pass the registry object into each class reducing the number of parameters to 1.
Personally, I use the latter approach because of the many articles that advocate it over using globals but I have encountered two problems. Firstly, the registry class will contain huge amounts of recursion. For example, the registry class will contain database login variables needed by the database class. Therefore, we need to inject the registry class into the database. However, the database will be needed by many other classes and so the database will need to be added to the registry, created a loop. Can modern languages handle this okay or is this causing huge performance issues? Notice that the global registry does not suffer from this as it is not passed into anything.
Secondly, I will start passing large amounts of data to objects that don't need it. My database doesn't care about my router but the router will get passed to the database along with the database connection details. This is made worse through the recursion problem because if the router has the registry, the registry has the database and the registry and the registry is passed to the database, then the database is getting passed to itself via the router (i.e. I could do $this->registry->router->registry->database from inside the database class`).
Furthermore, I don't see what the DI is giving me other than more complexity. I have to pass an extra variable into each object and I have to use registry objects with $this->registry->object->method() instead of $registry->object->method(). Now this obviously isn't a massive problem but it does seem needless if it is not giving me anything over the global approach.
Obviously, these problems don't exist when I use DI without a registry but then I have to pass every object 'manually', resulting in class constructors with a ridiculous number of parameters.
Given these issues with both versions of DI, isn't a global registry superior? What am I losing by using a global registry over DI?
One thing that is often mentioned when discussing DI vs Globals is that globals inhibit your ability to test your program properly. How exactly do globals prevent me from testing a program where DI would not? I have read in many places that this is due to the fact that a global can be altered from anywhere and thus is difficult to mock. However, it seems to me that since, at least in PHP, objects are passed by reference, changing an injected object in some class will also change it in any other class into which it has been injected.
Let's tackle this one by one.
Firstly, the registry class will contain huge amounts of recursion
You do not have to inject the Registry class into the database class. You can just as well have dedicated methods on the Registry to create the required classes for you. Or if you inject the Registry, you can simply not store it but only grab from it what is needed for the class to be instantiated properly. No Recursion.
Notice that the global registry does not suffer from this as it is not passed into anything.
There might be no recursion for the Registry itself, but objects in the Registry may very well have circular references. This could potentially lead to memory leaks when unsetting objects from the Registry with PHP Versions before 5.3 when the Garbage Collector would not collect those properly.
Secondly, I will start passing large amounts of data to objects that don't need it. My database doesn't care about my router but the router will get passed to the database along with the database connection details.
True. But that's what the Registry is for. It's not much different from passing $_GLOBALS into your objects. If you dont want that, dont use a Registry, but only pass in the arguments required for the class instances to be in a valid state. Or simply dont store it.
I could do $this->registry->router->registry->database
It is s unlikely that router exposes a public method to get the Registry. You wont be able to get to database from $this through router, but you will be able to get to database directly. Certainly. It's a Registry. That's what you wrote it for. If you want to store the Registry in your objects, you can wrap them into a Segregated Interface that only allows access to a subset of the data contained within.
Obviously, these problems don't exist when I use DI without a registry but then I have to pass every object 'manually', resulting in class constructors with a ridiculous number of parameters.
Not necessarily. When using constructor injection, you can limit the number of arguments to those absolutely necessary to put the object into a valid state. The remaining optional dependencies can very much be set through setter injection as well. Also, no one hinders you to add the arguments in an Array or Config object. Or use Builders.
Given these issues with both versions of DI, isn't a global registry superior? What am I losing by using a global registry over DI?
When you use a global Registry you are tight coupling this dependency to the class. This means the using classes cannot be used without this concrete Registry class anymore. You assume there will be only this Registry and not a different implementation. When injecting the dependecies, you are free to inject whatever fulfills the responsibility of the dependency.
One thing that is often mentioned when discussing DI vs Globals is that globals inhibit your ability to test your program properly. How exactly do globals prevent me from testing a program where DI would not?
They do not prevent you from testing the code. They just make it harder. When Unit-Testing you want to have the system in a known and reproducable state. If your code has dependencies on the global state, you have to create this state on each test run.
I have read in many places that this is due to the fact that a global can be altered from anywhere and thus is difficult to mock
Correct, if one test changes the global state, it might affect the next tests if you do not change it back. This means you have to take effort to recreate the environment in addition to setting your Subject-Under-Test into a known state. This might be easy if there is just one dependency, but what if there is many and those depend on the global state too. You'll end up in Dependency Hell.
I'll post this as an answer since I'd like to include the code.
I've benchmarked passing an object versus using global. I basically created a relatively simple object, but one with a self reference and a nested object.
The results:
Passed Completed in 0.19198203086853 Seconds
Globaled Completed in 0.20970106124878 Seconds
And the results are identical if I remove the nested object and the self reference...
So yes, it appears that there's no real performance difference between these two different methods of passing data. So make the better architectural choice (IMHO that's the Dependency Injection)...
The script:
$its = 10000;
$bar = new stdclass();
$bar->foo = 'bar';
$bar->bar = $bar;
$bar->baz = new StdClass();
$bar->baz->ar = 'bart';
$s = microtime(true);
for ($i=0;$i<$its;$i++) passed($bar);
$e = microtime(true);
echo "Passed Completed in ".($e - $s) ." Seconds\n";
$s = microtime(true);
for ($i=0;$i<$its;$i++) globaled();
$e = microtime(true);
echo "Globaled Completed in ".($e - $s) ." Seconds\n";
function passed($bar) {
is_object($bar);
}
function globaled() {
global $bar;
is_object($bar);
}
Tested on 5.3.2
I'm currently working on an oophp application. I have a site class which will contain all of the configuration settings for the app. Originally, I was going to use the singleton pattern to allow each object to reference a single instance of the site object, however mainly due to testing issues involved in this pattern, I've decided to try a different approach.
I would like to make the site class the main parent class in my app and call it's constructor from within the constructors of the child classes, in order to make all the properties available whenever needed.
When run for the first time, the class will only contain the db details for the app. To get the remaining values a query must be performed using the db details. However, any subsequent instances will be clones of the original (with all values). I may also set a Boolean flag to perform the query again a completely new instance is required.
Would this be a viable alternative to the singleton and would it solve the testing issues it causes? This is all theory atm, I haven't started to code anything yet,
Any thoughts or advice greatly appreciated.
Thanks.
I think a better way is to have an 'configuration' object that will get passed to the constructors of all your other classes. So, almost something like a singleton, except it's explicitly created and passed only to classes that need it. This approach is usually called dependency injection.
After trying many different techniques, what I have found functional and reliable is this method:
Use a bootstrap, or initialization file. It is located in the ROOT of the site with the proper permission and safe-guards against direct access.
All pages in the site first include this file. Within it, I create all my global objects (settings, user), and reference them from there.
For example:
// OBJECT CREATION
$Config = new Configuration();
$User = new User();
Then within classes that require these objects:
public function __construct($id = NULL) {
global $Config; // DEPENDENCY INJECTION SOUNDS LIKE AN ADDICTION!
if($Config->allow_something) {
$this->can_do_something = true;
}
if(NULL !== $id) {
$this->load_record($id);
}
}
Notice that I just access these global objects from within the class, and how I don't have to include the object variables as the first constructor parameter each and every time. That gets old.
Also, having a static Database class has been very helpful. There are no objects I have to worry about passing, I can just call $row = DB::select_row($sql_statement);; check out the PhpConsole class.
UPDATE
Thanks for the upvote, whoever did that. It has called attention to the fact that my answer is not something I am proud of. While it might help the OP accomplish what they wanted, it is NOT a good practice.
Passing objects to new object constructors is a good practice (dependency injection), and while "inconvenient," as with other things in life, the extra effort is worth it.
The only redeeming part of my answer is use of the facade pattern (eg. DB::select_row()). This is not necessarily a singleton (something the OP wanted to avoid), and gives you an opportunity to present a slimmed down interface.
Laravel is a modern PHP framework that uses dependency injection and facades, among other proven design patterns. I suggest that any novice developer review these and other such design practices thoroughly.