I'm used to making pretty much all of my class variables private and creating "wrapper" functions that get / set them:
class Something{
private $var;
function getVar(){
return $this->var;
}
}
$smth = new Something();
echo $smth->getVar();
I see that a lot of people do this, so I ended up doing the same :)
Is there any advantage using them this way versus:
class Something{
public $var;
}
$smth = new Something();
echo $smth->var;
?
I know that private means that you can't access them directly outside the class, but for me it doesn't seem very important if the variable is accessible from anywhere...
So is there any other hidden advantage that I'm missing with private variables?
It's called encapsulation and it's a very good thing. Encapsulation insulates your class from other classes mucking about with its internals; they only get the access that you allow through your methods. It also protects them from changes that you may make to the class internals. Without encapsulation, if you change a variable's name or usage, that change propagates to all other classes that use the variable. If you force them to go through a method, there's at least a chance that you'll be able to handle the change in the method and protect the other classes from it.
It differs from case to case if you want to use private variables with public getters and setters or if you just want to declare a variable as public directly.
The reason it might be good to use "getters" and "setters" is if you want to have control over when someone accesses the data.
As an example, let's say you have this:
public function setBirthday($date)
Then you can make sure that the date passed in to that setter is a valid birthdate.
But you can't if you just declare the variable as public like this
public $birthday;
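As a minimal sketch of that idea (the Person class, the Y-m-d format and the "not in the future" rule are assumptions for illustration, not part of the answer above), a validating setter might look like this:

```php
<?php
// Hypothetical class: the name, date format and validation rules are assumptions.
class Person
{
    private $birthday;

    public function setBirthday(string $date): void
    {
        // Strict parse: reject anything that is not a real Y-m-d calendar date.
        $parsed = DateTimeImmutable::createFromFormat('!Y-m-d', $date);
        if ($parsed === false || $parsed->format('Y-m-d') !== $date) {
            throw new InvalidArgumentException("Invalid birthday: $date");
        }
        if ($parsed > new DateTimeImmutable('today')) {
            throw new InvalidArgumentException('Birthday cannot be in the future');
        }
        $this->birthday = $date;
    }

    public function getBirthday(): ?string
    {
        return $this->birthday;
    }
}

$person = new Person();
$person->setBirthday('1990-02-28');

$rejected = false;
try {
    $person->setBirthday('1990-02-30'); // February 30th does not exist
} catch (InvalidArgumentException $e) {
    $rejected = true;
}
```

With a public $birthday property, the second assignment would simply have been stored.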
Based on comments.
Also, if you decide to change the internal storage mechanism from a string containing the date to the number of seconds since 1/1/1970, you can still present the date externally in the same way if you use encapsulation, but not if you expose the variables directly. Every piece of code that touched the internal variable directly would now be broken.
This means that if the internal storage mechanism changed to the number of seconds since 1/1/1970, you would not have to change the 'external API', because you have full control over it:
public function getBirthday() {
// you can still return a string formatted date, even though your
// private variable contains number of seconds from 1/1/1970
}
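A possible way to fill in that getter, assuming a hypothetical Customer class that stores the birthday as a UTC timestamp internally while callers keep working with Y-m-d strings:

```php
<?php
// Hypothetical class; only the storage-versus-presentation split is the point.
class Customer
{
    // Internal representation: seconds since 1970-01-01 00:00:00 UTC.
    private $birthdayTimestamp;

    public function __construct(string $birthday)
    {
        $this->setBirthday($birthday);
    }

    public function setBirthday(string $date): void
    {
        // '!' resets the time part, so the timestamp is midnight UTC of that day.
        $parsed = DateTimeImmutable::createFromFormat('!Y-m-d', $date, new DateTimeZone('UTC'));
        if ($parsed === false) {
            throw new InvalidArgumentException("Invalid birthday: $date");
        }
        $this->birthdayTimestamp = $parsed->getTimestamp();
    }

    public function getBirthday(): string
    {
        // Callers still receive a formatted string, whatever is stored inside.
        return gmdate('Y-m-d', $this->birthdayTimestamp);
    }
}

$customer = new Customer('1985-07-14');
$birthday = $customer->getBirthday();
```

Code that calls getBirthday() does not need to change when the private field changes type.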
Access modifiers don't make a whole lot of sense in scripting languages. Many actual object-oriented languages like Python or Javascript don't have them.
And the prevalence of naive getter/setter methods is simply due to PHP not providing an explicit construct for that. http://berryllium.nl/2011/02/getters-and-setters-evil-or-necessary-evil/
It is to demarcate variables that are internal to the implementation of the class from variables that are intended for external change. There are also protected variables, which are for internal use and for use by extensions to the class.
We make the variables private so that only the code within the class can modify the variables, protecting interference from outside, guaranteeing control and expected behaviour of the variable.
The purpose of encapsulation is to hide the internals of an object from other objects. The idea is that the external footprint of the object constitutes its defined type; think of it like a contract with other objects. Internally, it may have to jump through some hoops to provide the outward-facing functionality, but that's of no concern to other objects. They shouldn't be able to mess with it.
For example, let's say you have a class which provides calculations for sales tax. Some kind of utility service object, basically. It has a handful of methods which provide the necessary functionality.
Internally, that class is hitting a database to get some values (tax for a given jurisdiction, for example) in order to perform the calculations. It may be maintaining a database connection and other database-related things internally, but other classes don't need to know about that. Other classes are concerned only with the outward facing contract of functionality.
Suppose sometime later the database needs to be replaced with an external web service. (The company is going with a service for calculating sales tax rather than maintaining it internally.) Because the class is encapsulated, you can change its internal implementation to use the service instead of the database very easily. The class just needs to continue to provide the same outward-facing functionality.
If other classes were mucking around with the internals of the class, then re-implementing it would risk breaking other parts of the system.
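The sales tax example could be sketched roughly as follows. The class name, method names and rates are made up for illustration; a hard-coded array stands in for the database or web service:

```php
<?php
// Hypothetical utility class; the rates below are placeholders, not real tax rates.
class SalesTaxCalculator
{
    private $rates;

    public function __construct()
    {
        $this->rates = $this->loadRates();
    }

    // The outward-facing contract: this signature is all other classes rely on.
    public function taxFor(float $amount, string $jurisdiction): float
    {
        return round($amount * $this->rateFor($jurisdiction), 2);
    }

    private function rateFor(string $jurisdiction): float
    {
        if (!isset($this->rates[$jurisdiction])) {
            throw new InvalidArgumentException("Unknown jurisdiction: $jurisdiction");
        }
        return $this->rates[$jurisdiction];
    }

    // Internal detail: today an array (standing in for a database query),
    // tomorrow possibly a web service call. Callers never see the difference.
    private function loadRates(): array
    {
        return ['NY' => 0.08, 'CA' => 0.0725];
    }
}

$calculator = new SalesTaxCalculator();
$tax = $calculator->taxFor(100.0, 'NY');
```

Replacing loadRates() with a different data source leaves every caller of taxFor() untouched.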
I understand the reasons for not using statics in Java.
However, I'm currently developing OO code in PHP. I use DAOs with the goal of keeping my queries in one place so I can easily find them. I also instantiate some DAOs so I can incorporate pagination in some (relevant) queries. In many cases, it's not necessary and so I tend to just create static methods (even though technically I don't think I can call that a DAO) in the form:
$info = schemeDAO::someFunction($variable);
I may need only that single method during a page refresh (i.e. a specific value in a header file).
I may need to instantiate the same DAO a hundred times as objects are created and destroyed.
$dao = new myDao();
$info = $dao->someFunction($variable);
Either way, it seems to me, in PHP at least, wouldn't it be more performance efficient to simply load a static and keep it in memory?
Static access is acceptable to an extent, but with an instance you can pass the object on to a third object as a dependency. That dependency does not need to have data pushed into it; it decides what it needs and pulls the data or calls the method itself, several times within a single method if necessary. With static access, the chain of calls has to be initiated from the original class, and a static call can only return a value, whereas an instance can be passed along with its wrapper logic and its data kept together. Inline instance code also tends to be shorter, and when you remove an instance, every call on it complains immediately, whereas static calls linger unnoticed in the code because they have no instantiation prerequisite.
Static classes preserve their state across objects and method contexts, so they are not automatically "reset" the way a fresh object is with the 'new' construct. Instances encourage a more transparent, pure-function style of passing parameters. When you pass an object, the service logic stays together with its data structure; when you pass only an array, the execution logic is lost in transit or moved somewhere else, and eventually has to be called statically, which is less transparent.
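A small sketch of the state difference, using two hypothetical counter classes that exist only for this comparison:

```php
<?php
// Hypothetical counters used only to contrast static and instance state.
class StaticCounter
{
    private static $count = 0;

    public static function increment(): int
    {
        return ++self::$count;
    }
}

class InstanceCounter
{
    private $count = 0;

    public function increment(): int
    {
        return ++$this->count;
    }
}

StaticCounter::increment();
$afterSecondStaticCall = StaticCounter::increment(); // state carried over from the first call

$first = new InstanceCounter();
$first->increment();
$second = new InstanceCounter();                     // 'new' starts from fresh state
$afterFreshInstance = $second->increment();

echo "static: $afterSecondStaticCall, fresh instance: $afterFreshInstance\n";
// prints "static: 2, fresh instance: 1"
```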
I would compare it to Einstein's versus Newton's equations: in some cases they look exactly the same. To be on the safe side, I would use the more versatile instances or service-locator singletons rather than static classes. On the other hand, the less versatile static classes may be easier to implement at first, especially if you don't plan to take them as far as you could go with instances. As with private attributes, static classes signal that they are not passed anywhere, though less often they can also signal something bad: that they are called from anywhere.
Variable encapsulation and set/get methods are best practices, but why do we have the option to declare a variable public if it's not meant to be used anyway? Would it have been better if variables were always private by default, with no way of making them public, since all of the tutorials I read say they should be encapsulated with set/get methods? Is there any valid use case for public variables, at least in PHP OOP?
In fact it's just the other way round: theoretically, getters/setters are wrong. Properties define the state of an object, whereas methods define its behaviour. Getters/setters only intercept read and write access to properties, but they break the semantic meaning completely: now reading the state of an object is a behaviour of the object.
To make properties look like properties again, there is an RFC in the works :)
https://wiki.php.net/rfc/propertygetsetsyntax
Set/Get methods are best practices but why do we have a chance to declare a variable public if it's not meant to be used anyway?
"Best practice" and "not meant to be used" are not the same thing. A language needs to offer different tools for different use cases and should be consistent.
PHP objects have always supported public members, and once differentiated visibility was introduced, public members remained very useful for backwards compatibility.
Would it have been better if variables were always private by default with no chance of making them public since all of the tutorials I read says they should be encapsulated with set/get methods?
That question cannot be answered specifically; it's too subjective, and there are too many different use cases that would result in different answers.
Is there any valid use case for public variables at least in PHP OOP?
Start with backwards compatibility. If you could not refactor your code but had to rewrite it completely every time, that would be very expensive.
Let's see...
This is a real-world email API class, CakePHP's EmailComponent. To use this class you only need to "set" some properties and then just call send():
$this->Email->to = 'ss#b.co';
$this->Email->from = 'me#b.co';
$this->Email->title = 'xxx';
$this->Email->msg = 'blabla..';
$this->Email->send();
In fact there are a lot of private properties and functions inside this class, but they stay private.
A class has a (single) responsibility to do something.
Encapsulation means publishing only what people need in order to do that thing and keeping the technical/infrastructure parts inside as private.
Firstly, I want to restrict this question to web development only. So this is language agnostic as long as the language is being used for web development. Personally, I am coming at this from a background in PHP.
Often we need to use an object from multiple scopes. For example, we might need to use a database class in the normal scope but then also from a controller class. If we create the database object in normal scope then we cannot access it from inside the controller class. We wish to avoid creating two database objects in different scopes and so need a way of reusing the database class regardless of scope. In order to do so, we have two options:
Make the database object global so that it can be accessed from anywhere.
Pass the database class to the controller class in the form of, for example, a parameter to the controller's constructor. This is known as dependency injection (DI).
The problem becomes more complex when there are many classes involved all demanding objects in many different scopes. In both solutions, this becomes problematic because if we make each one of our objects global, we are putting too much noise into the global scope and if we pass too many parameters into a class, the class becomes much more difficult to manage.
Therefore, in both cases, you often see the use of a registry. In the global case, we have a registry object which is made global and then add all of our objects and variables to that making them available in any object but only putting a single variable, the registry, into the global scope. In the DI case, we pass the registry object into each class reducing the number of parameters to 1.
Personally, I use the latter approach because of the many articles that advocate it over using globals but I have encountered two problems. Firstly, the registry class will contain huge amounts of recursion. For example, the registry class will contain database login variables needed by the database class. Therefore, we need to inject the registry class into the database. However, the database will be needed by many other classes and so the database will need to be added to the registry, creating a loop. Can modern languages handle this okay or is this causing huge performance issues? Notice that the global registry does not suffer from this as it is not passed into anything.
Secondly, I will start passing large amounts of data to objects that don't need it. My database doesn't care about my router but the router will get passed to the database along with the database connection details. This is made worse through the recursion problem because if the router has the registry, the registry has the database and the registry and the registry is passed to the database, then the database is getting passed to itself via the router (i.e. I could do $this->registry->router->registry->database from inside the database class).
Furthermore, I don't see what the DI is giving me other than more complexity. I have to pass an extra variable into each object and I have to use registry objects with $this->registry->object->method() instead of $registry->object->method(). Now this obviously isn't a massive problem but it does seem needless if it is not giving me anything over the global approach.
Obviously, these problems don't exist when I use DI without a registry but then I have to pass every object 'manually', resulting in class constructors with a ridiculous number of parameters.
Given these issues with both versions of DI, isn't a global registry superior? What am I losing by using a global registry over DI?
One thing that is often mentioned when discussing DI vs Globals is that globals inhibit your ability to test your program properly. How exactly do globals prevent me from testing a program where DI would not? I have read in many places that this is due to the fact that a global can be altered from anywhere and thus is difficult to mock. However, it seems to me that since, at least in PHP, objects are passed by reference, changing an injected object in some class will also change it in any other class into which it has been injected.
Let's tackle this one by one.
Firstly, the registry class will contain huge amounts of recursion
You do not have to inject the Registry class into the database class. You can just as well have dedicated methods on the Registry to create the required classes for you. Or if you inject the Registry, you can simply not store it but only grab from it what is needed for the class to be instantiated properly. No Recursion.
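One way to read the "dedicated methods on the Registry" suggestion is sketched below. The class names, the config key and the DSN string are assumptions for illustration; the point is only that the Database receives plain values and never holds a reference to the Registry:

```php
<?php
// Hypothetical classes: Database only receives the values it needs.
class Database
{
    private $dsn;

    public function __construct(string $dsn)
    {
        $this->dsn = $dsn;
    }

    public function dsn(): string
    {
        return $this->dsn;
    }
}

class Registry
{
    private $config;
    private $database = null;

    public function __construct(array $config)
    {
        $this->config = $config;
    }

    // Dedicated factory method: builds the Database once from config values.
    // The Database never sees the Registry, so there is no loop.
    public function database(): Database
    {
        if ($this->database === null) {
            $this->database = new Database($this->config['dsn']);
        }
        return $this->database;
    }
}

$registry = new Registry(['dsn' => 'sqlite::memory:']);
$db = $registry->database();
```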
Notice that the global registry does not suffer from this as it is not passed into anything.
There might be no recursion for the Registry itself, but objects in the Registry may very well have circular references. This could potentially lead to memory leaks when unsetting objects from the Registry with PHP Versions before 5.3 when the Garbage Collector would not collect those properly.
Secondly, I will start passing large amounts of data to objects that don't need it. My database doesn't care about my router but the router will get passed to the database along with the database connection details.
True. But that's what the Registry is for. It's not much different from passing $GLOBALS into your objects. If you don't want that, don't use a Registry, but only pass in the arguments required for the class instances to be in a valid state. Or simply don't store it.
I could do $this->registry->router->registry->database
It is unlikely that the router exposes a public method to get the Registry. You won't be able to get to the database from $this through the router, but you will be able to get to the database directly. Certainly. It's a Registry. That's what you wrote it for. If you want to store the Registry in your objects, you can wrap it in a Segregated Interface that only allows access to a subset of the data contained within.
Obviously, these problems don't exist when I use DI without a registry but then I have to pass every object 'manually', resulting in class constructors with a ridiculous number of parameters.
Not necessarily. When using constructor injection, you can limit the number of arguments to those absolutely necessary to put the object into a valid state. The remaining optional dependencies can be set through setter injection. Also, nothing stops you from passing the arguments in an array or a Config object. Or use Builders.
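A rough sketch of mixing the two styles, with hypothetical ReportGenerator and Logger types: the data is required and goes through the constructor, the logger is optional and goes through a setter.

```php
<?php
// Hypothetical types for illustration only.
interface Logger
{
    public function log(string $message): void;
}

class ArrayLogger implements Logger
{
    public $messages = [];

    public function log(string $message): void
    {
        $this->messages[] = $message;
    }
}

class ReportGenerator
{
    private $rows;
    private $logger = null;

    // Constructor injection: without rows the object is not in a valid state.
    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    // Setter injection: the logger is optional.
    public function setLogger(Logger $logger): void
    {
        $this->logger = $logger;
    }

    public function total(): int
    {
        $total = array_sum($this->rows);
        if ($this->logger !== null) {
            $this->logger->log("total computed: $total");
        }
        return $total;
    }
}

$report = new ReportGenerator([1, 2, 3]);
$withoutLogger = $report->total();   // works with only the required dependency

$logger = new ArrayLogger();
$report->setLogger($logger);
$withLogger = $report->total();      // same result, now also logged
```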
Given these issues with both versions of DI, isn't a global registry superior? What am I losing by using a global registry over DI?
When you use a global Registry you are tightly coupling this dependency to the class. This means the classes using it can no longer be used without this concrete Registry class. You assume there will only ever be this Registry and not a different implementation. When injecting the dependencies, you are free to inject whatever fulfills the responsibility of the dependency.
One thing that is often mentioned when discussing DI vs Globals is that globals inhibit your ability to test your program properly. How exactly do globals prevent me from testing a program where DI would not?
They do not prevent you from testing the code. They just make it harder. When unit testing you want to have the system in a known and reproducible state. If your code has dependencies on the global state, you have to create this state on each test run.
I have read in many places that this is due to the fact that a global can be altered from anywhere and thus is difficult to mock
Correct, if one test changes the global state, it might affect the next tests if you do not change it back. This means you have to make an effort to recreate the environment in addition to setting your Subject-Under-Test into a known state. This might be easy if there is just one dependency, but what if there are many and those depend on the global state too? You'll end up in Dependency Hell.
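To illustrate the testing point, here is a sketch with a hypothetical UserDatabase interface and a stub implementation. Because the dependency is injected, each test builds exactly the state it needs and there is no global to reset afterwards:

```php
<?php
// Hypothetical interface and classes, for illustration only.
interface UserDatabase
{
    public function findUserName(int $id): ?string;
}

// Test double: a known, reproducible state built per test.
class StubUserDatabase implements UserDatabase
{
    private $users;

    public function __construct(array $users)
    {
        $this->users = $users;
    }

    public function findUserName(int $id): ?string
    {
        return $this->users[$id] ?? null;
    }
}

class UserController
{
    private $db;

    public function __construct(UserDatabase $db)
    {
        $this->db = $db;
    }

    public function greeting(int $id): string
    {
        $name = $this->db->findUserName($id);
        return $name === null ? 'Hello, guest' : "Hello, $name";
    }
}

// The controller under test only sees what was injected.
$controller = new UserController(new StubUserDatabase([1 => 'Ada']));
$known = $controller->greeting(1);
$unknown = $controller->greeting(2);
```

A version of UserController that reached for a global registry would need that global to be set up before, and restored after, every test.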
I'll post this as an answer since I'd like to include the code.
I've benchmarked passing an object versus using global. I basically created a relatively simple object, but one with a self reference and a nested object.
The results:
Passed Completed in 0.19198203086853 Seconds
Globaled Completed in 0.20970106124878 Seconds
And the results are identical if I remove the nested object and the self reference...
So yes, it appears that there's no real performance difference between these two different methods of passing data. So make the better architectural choice (IMHO that's the Dependency Injection)...
The script:
$its = 10000;
$bar = new stdclass();
$bar->foo = 'bar';
$bar->bar = $bar;
$bar->baz = new StdClass();
$bar->baz->ar = 'bart';
$s = microtime(true);
for ($i=0;$i<$its;$i++) passed($bar);
$e = microtime(true);
echo "Passed Completed in ".($e - $s) ." Seconds\n";
$s = microtime(true);
for ($i=0;$i<$its;$i++) globaled();
$e = microtime(true);
echo "Globaled Completed in ".($e - $s) ." Seconds\n";
function passed($bar) {
is_object($bar);
}
function globaled() {
global $bar;
is_object($bar);
}
Tested on 5.3.2
I'm obviously brand new to these concepts. I just don't understand why you would limit access to properties or methods. It seems that you would just write the code according to intended results. Why would you create a private method instead of simply not calling that method? Is it for iterative object creation (if I'm stating that correctly), a multiple developer situation (don't mess up other people's work), or just so you don't mess up your own work accidentally?
Your last two points are quite accurate - you don't need multiple developers to have your stuff messed with. If you work on a project long enough, you'll realize you've forgotten much of what you did at the beginning.
One of the most important reasons for hiding something is so that you can safely change it later. If a field is public, and several months later you want to change it so that every time the field changes, something else happens, you're in trouble. Because it was public, there's no way to know or remember how many other places accessed that field directly. If it's private, you have a guarantee that it isn't being touched outside of this class. You likely have a public method wrapped around it, and you can easily change the behavior of that method.
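As a sketch of that scenario (the Account class and its email history are invented for illustration): because the field is private, the "something else happens" logic can be added to the setter later without touching any caller.

```php
<?php
// Hypothetical class; the history tracking is the behaviour added "months later".
class Account
{
    private $email;
    private $emailHistory = [];

    public function __construct(string $email)
    {
        $this->email = $email;
    }

    public function getEmail(): string
    {
        return $this->email;
    }

    public function setEmail(string $email): void
    {
        // Added later: remember the previous value whenever it changes.
        if ($email !== $this->email) {
            $this->emailHistory[] = $this->email;
        }
        $this->email = $email;
    }

    public function getEmailHistory(): array
    {
        return $this->emailHistory;
    }
}

$account = new Account('old@example.com');
$account->setEmail('new@example.com');
$history = $account->getEmailHistory();
```

Had $email been public, any code assigning to it directly would silently bypass the history.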
In general, the more things you make public, the more you have to worry about compatibility with other code.
We create private methods so that consumers of our classes don't have to care about implementation details - they can focus on the few nifty things our classes provide for them.
Moreover, we're obligated to consider every possible use of public methods. By making methods private, we reduce the number of features a class has to support, and we have more freedom to change them.
Say you have a Queue class - every time a caller adds an item to the queue, it may be necessary to increase the queue's capacity. Because of the underlying implementation, setting the capacity isn't trivial, so you break it out into a separate function to improve the readability of your Enqueue function. Since callers don't care about a queue's capacity (you're handling it for them), you can make the method private: callers don't get distracted by superfluous methods, you don't have to worry that callers will do ridiculous things to the capacity, and you can change the implementation any time you like without breaking code that uses your class (as long as it still sets the capacity within the limited use cases defined by your class).
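A minimal sketch of such a queue, assuming a hypothetical GrowingQueue backed by SplFixedArray and a doubling growth policy (both are illustrative choices, not something the answer prescribes):

```php
<?php
// Hypothetical queue; the fixed-size backing array makes capacity handling non-trivial.
class GrowingQueue
{
    private $items;
    private $head = 0; // index of the next item to dequeue
    private $tail = 0; // index where the next item will be stored

    public function __construct(int $initialCapacity = 2)
    {
        $this->items = new SplFixedArray($initialCapacity);
    }

    public function enqueue($item): void
    {
        $this->ensureCapacity($this->tail + 1);
        $this->items[$this->tail++] = $item;
    }

    public function dequeue()
    {
        if ($this->head === $this->tail) {
            throw new UnderflowException('Queue is empty');
        }
        $item = $this->items[$this->head];
        $this->items[$this->head++] = null;
        return $item;
    }

    public function size(): int
    {
        return $this->tail - $this->head;
    }

    // Private: callers never manage capacity, so the policy can change freely.
    private function ensureCapacity(int $needed): void
    {
        $size = $this->items->getSize();
        if ($needed > $size) {
            $this->items->setSize(max($needed, $size * 2));
        }
    }
}

$queue = new GrowingQueue(2);
foreach (['a', 'b', 'c'] as $item) {
    $queue->enqueue($item);          // the third enqueue grows the array internally
}
$firstOut = $queue->dequeue();
$remaining = $queue->size();
```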
It all comes down to encapsulation. This means hiding the insides of the class and just caring about what it does. If you want to have a credit card processing class, you don't really care 'how' it processes the credit card. You just want to be able to go: $creditCardProcessor->charge(10.99, $creditCardNumber); and expect it to work.
By making some methods public and others private or protected, we leave an entry way for others so they know where it is safe to call code from. The public methods and variables are called an 'interface'.
For any class, you have an implementation. This is how the class carries out its duty. If it is a smoothie making class, how the class adds the ingredients, what ingredients it adds, etc are all part of the implementation. The outside code shouldn't know and/or care about the implementation.
The other side of the class is its interface. The interface is the set of public methods that the developer of the class intended to be called by outside code. This means that you should be able to call any public method and it will work properly.
There are several reasons for using encapsulation, one of the strongest is: Imagine using a large, complicated library written by someone else. If every object was unprotected you could unknowingly be accessing or changing values that the developer never intended to be manipulated in that way.
Hiding data makes the program easier to conceptualize and easier to implement.
It's all about encapsulation. Private methods do the inner grunt work, while the class exposes graceful functions that make things easy. E.g. you might have a $product->insert() function that uses 4 inner functions to validate a singleton db object, make the query safe, etc. - those are inner functions that don't need to be exposed and that, if called, might mess up other structures or flows you, the developer, have put in place.
a multiple developer situation (don't mess up other people's work), or just so you don't mess up your own work accidentally?
Mainly these two things. Making a method public says "this is how the class is supposed to be used by its clients", making it private says "this is an implementation detail that may change without warning and which clients should not care about" AND forces clients to follow that advice.
A class with a few, well documented public methods is much easier to use by someone who's not familiar with it (which may well be its original author, looking at it for the first time in 6 months) than one where everything is public, including all the little implementation details that you don't care about.
It makes collaboration easier, you tell the users of your classes what parts should not change so often and you can guarantee that your object will be in a meaningful state if they use only public methods.
It does not need to be so strict as distinguishing between private/public/whatever (I mean enforced by the language). For example, in Python, this is accomplished by a naming convention. You know you shouldn't mess with anything marked as not public.
For example, a private/protected method may hold a piece of logic that is called from another (public) method. If that piece is used by several public methods, factoring it out makes sense, and yet you don't want it to be called from anywhere else.
It's quite the same with class properties. Yes, you can write all-public classes, but what's the fun in that?
I started off by drafting a question: "What is the best way to perform unit testing on a constructor (e.g., __construct() in PHP5)", but when reading through the related questions, I saw several comments that seemed to suggest that setting member variables or performing any complicated operations in the constructor are no-nos.
The constructor for the class in question here takes a param, performs some operations on it (making sure it passes a sniff test, and transforming it if necessary), and then stashes it away in a member variable.
I thought the benefits of doing it this way were:
1) that client code would always be certain to have a value for this member variable whenever an object of this class is instantiated, and
2) it saves a step in client code (one of which could conceivably be missed), e.g.,
$Thing = new Thing;
$Thing->initialize($var);
when we could just do this
$Thing = new Thing($var);
and be done with it.
Is this a no-no? If so why?
My rule of thumb is that an object should be ready for use after the constructor has finished. But there are often a number of options that can be tweaked afterwards.
My list of dos and don'ts:
Constructors should set up basic options for the object.
They should maybe create instances of helper objects.
They should not acquire resources (files, sockets, ...), unless the object clearly is a wrapper around some resource.
Of course, no rules without exceptions. The important thing is that you think about your design and your choices. Make object usage natural - and that includes error reporting.
This comes up quite a lot in C++ discussions, and the general conclusion I've come to there has been this:
If an object does not acquire any external resources, members must be initialized in the constructor. This involves doing all work in the constructor.
(x, y) coordinate (or really any other structure that's just a glorified tuple)
US state abbreviation lookup table
If an object acquires resources that it can control, they may be allocated in the constructor:
open file descriptor
allocated memory
handle/pointer into an external library
If the object acquires resources that it can't entirely control, they must be allocated outside of the constructor:
TCP connection
DB connection
weak reference
There are always exceptions, but this covers most cases.
Constructors are for initializing the object, so
$Thing = new Thing($var);
is perfectly acceptable.
The job of a constructor is to establish an instance's invariants.
Anything that doesn't contribute to that is best kept out of the constructor.
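A minimal sketch of a constructor that only establishes an invariant, reusing the Thing name from the question. The specific rule (a non-empty string, trimmed and lower-cased) is an assumption standing in for the question's "sniff test" and transformation:

```php
<?php
// The validation and normalisation rules here are illustrative assumptions.
class Thing
{
    private $value;

    public function __construct($var)
    {
        // Sniff test: refuse to create an object that would be invalid.
        if (!is_string($var) || trim($var) === '') {
            throw new InvalidArgumentException('Thing needs a non-empty string');
        }
        // Transformation: store a normalised form.
        $this->value = strtolower(trim($var));
    }

    public function value(): string
    {
        return $this->value;
    }
}

$Thing = new Thing('  Hello ');
$stored = $Thing->value();

$rejected = false;
try {
    new Thing('   ');
} catch (InvalidArgumentException $e) {
    $rejected = true; // no half-initialised Thing ever exists
}
```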
To improve the testability of a class it is generally a good thing to keep its constructor as simple as possible and to have it ask only for things it absolutely needs. There's an excellent presentation available on YouTube as part of Google's "Clean Code Talks" series explaining this in detail.
You should definitely avoid making the client have to call
$thing->initialize($var)
That sort of stuff absolutely belongs in the constructor. It's just unfriendly to the client programmer to make them call this. There is a (slightly controversial) school of thought that says you should write classes so that objects are never in an invalid state -- and 'uninitialized' is an invalid state.
However for testability and performance reasons, sometimes it's good to defer certain initializations until later in the object's life. In cases like these, lazy evaluation is the solution.
Apologies for putting Java syntax in a Python answer but:
// Constructor
public MyObject(MyType initVar) {
this.initVar = initVar;
}
private void lazyInitialize() {
if(initialized) {
return;
}
// initialization code goes here, uses initVar
initialized = true;
}
public SomeType doSomething(SomeOtherType x) {
lazyInitialize();
// doing something code goes here
}
You can segment your lazy initialization so that only the parts that need it get initialized. It's common, for example, to do this in getters, just for what affects the value that's being got.
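The same idea in PHP, as a sketch with a hypothetical Document class: the constructor only stores its argument, and the getter does the (stand-in) expensive work the first time it is needed. The public $loadCount exists only so the example can show that the work happens once.

```php
<?php
// Hypothetical class; explode() stands in for genuinely expensive initialisation.
class Document
{
    private $source;
    private $lines = null;   // not computed until first needed
    public $loadCount = 0;   // public only so this demo can observe the laziness

    public function __construct(string $source)
    {
        $this->source = $source; // cheap: just remember the input
    }

    public function getLines(): array
    {
        if ($this->lines === null) {
            $this->lines = explode("\n", $this->source);
            $this->loadCount++;
        }
        return $this->lines;
    }
}

$document = new Document("first\nsecond");
$countBefore = $document->loadCount; // nothing initialised yet
$lines = $document->getLines();
$document->getLines();               // second call reuses the cached result
$countAfter = $document->loadCount;
```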
Depends on what type of system you're trying to architect, but in general I believe constructors are best used for only initializing the "state" of the object, but not perform any state transitions themselves. Best to just have it set the defaults.
I then write a "handle" method into my objects for handling things like user input, database calls, exceptions, collation, whatever. The idea is that this will handle whatever state the object finds itself in based on external forces (user input, time, etc.) Basically, all the things that may change the state of the object and require additional action are discovered and represented in the object.
Finally, I put a render method into the class to show the user something meaningful. This only represents the state of the object to the user (whatever that may be.)
__construct($arguments)
handle()
render(Exception $ex = null)
The __construct magic method is fine to use. The reason you see initialize in a lot of frameworks and applications is because that object is being programmed to an interface or it is trying to enact a singleton/getInstance pattern.
These objects are generally pulled into context or a controller and then have the common interface functionality called on them by other higher level objects.
If $var is absolutely necessary for $Thing to work, then it is a DO
You should not put things in a constructor that are only supposed to run once.
To explain:
Say I had a database class whose constructor opens the connection to the database.
So
$db = new dbclass;
And now i am connected to the database.
Then we have a class that uses some methods within the database class.
class users extends dbclass
{
// some methods
}
$users = new users;
// by doing this, we have called the dbclass's constructor again
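The effect can be made visible with a small sketch. The class names are hypothetical and a static counter stands in for opening real connections. Passing the existing object in, shown last, is one common alternative and an assumption here, not something the answer itself proposes:

```php
<?php
// Hypothetical classes; the counter stands in for opening a database connection.
class DbClass
{
    public static $connections = 0;

    public function __construct()
    {
        self::$connections++;
    }
}

// Inherits DbClass's constructor, so every new object "connects" again.
class UsersByInheritance extends DbClass
{
}

// Assumed alternative: reuse an existing DbClass instead of extending it.
class UsersByComposition
{
    private $db;

    public function __construct(DbClass $db)
    {
        $this->db = $db;
    }
}

$db = new DbClass();
$users = new UsersByInheritance();
$afterInheritance = DbClass::$connections;  // constructor ran a second time

$otherUsers = new UsersByComposition($db);
$afterComposition = DbClass::$connections;  // unchanged: no new connection

echo "after inheritance: $afterInheritance, after composition: $afterComposition\n";
// prints "after inheritance: 2, after composition: 2"
```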