PHP Oddities: Bypass Class Visibility? - php

Today I was working on an old diagnostic library I had written back in the halcyon 5.1 days. That library provided highly detailed dumps of any variable you could give it, using color coding to indicate types, and using Reflection to produce a lot of insight into objects (including, depending on what flags you pass, outputting relevant PHPDoc and even source code for objects - especially useful in backtraces).
At the time I had been able to bypass member visibility to output the values of protected and private members of a class. That proved quite useful from a debugging perspective, particularly for the detailed error logs we produce.
To bypass visibility in 5.1, I used the Reflection API, which let you see the value of protected and private members with ReflectionProperty->getValue($object). This is of course a bit of a security bypass, but it's not too bad, since if you're going to view and modify values this way you're pretty clearly breaking the object's intended API.
PHP 5.2 stopped Reflection from being able to access protected/private members and methods. This was of course intentional: that ability was considered a security concern. I simply added a try/catch around this piece of my library, so it outputs the values if the language allows it and skips them if it doesn't. Java Reflection AFAICR has always allowed you to bypass visibility (I believe they are of the opinion that if you want it badly enough you'll get it one way or another; visibility is just an advertised API for an object, and you violate it at your own risk).
As a thought exercise, and perhaps to bring my dump library up to date, I'm curious if anyone can think of clever ways to bypass visibility in modern versions of PHP (5.2+, but of particular interest to me is PHP 5.3).
There are three avenues which seem particularly hopeful. First: mangling serialize/unserialize:
Class Foo {
protected $bar;
private $baz;
}
class VisibleFoo {
public $bar;
public $baz;
}
$f = new Foo();
$data = serialize($f);
$visibleData = str_replace('O:3:"Foo":', 'O:10:"VisibleFoo":', $data); // argument order is (search, replace, subject)
$muahaha = unserialize($visibleData);
Of course it's more involved than this, because protected members are flagged as such: null*nullProperty, and private members are bound against their original class name: nullOriginalClassnullProperty (see PHP Serialization), but theoretically you could clean those all up and trick serialize / unserialize into exposing these values for you.
This has a few drawbacks: first, it is fragile with respect to language versions. PHP doesn't (AFAIK) make any guarantee that the data produced by serialize() will remain consistent from version to version (indeed, the way protected and private members are represented has changed since I've used PHP). Second, and more importantly, some objects declare a __sleep() method, which might have unintended side effects of 1) not giving you access to all the private members, and 2) maybe this will tear down database connections, close file streams, or other side effects of the object thinking it's going to sleep when it is in fact not.
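For what it's worth, the clean-up step described above can be sketched roughly as follows. This is only an illustration under assumptions: exposeViaSerialize() is a made-up helper name, the default values on Foo are added so there is something to read back, and the regex would also rewrite any string value that happens to look like a mangled property name. It inherits every drawback listed above (format fragility, __sleep() side effects).

```php
<?php
// Same shape as the Foo / VisibleFoo pair above, with sample values added.
class Foo {
    protected $bar = 'protected value';
    private $baz = 'private value';
}
class VisibleFoo {
    public $bar;
    public $baz;
}

// Hypothetical helper: not a PHP built-in.
function exposeViaSerialize($object, $visibleClass) {
    $data = serialize($object);

    // Swap the class header, e.g. O:3:"Foo": becomes O:10:"VisibleFoo":
    $data = preg_replace(
        '/^O:\d+:"[^"]+":/',
        'O:' . strlen($visibleClass) . ':"' . $visibleClass . '":',
        $data
    );

    // Drop the "\0*\0" (protected) and "\0Class\0" (private) prefixes from
    // property names and recompute the declared string length.
    $data = preg_replace_callback(
        '/s:\d+:"\x00[^\x00"]+\x00([^"]+)";/',
        function ($m) {
            return 's:' . strlen($m[1]) . ':"' . $m[1] . '";';
        },
        $data
    );

    return unserialize($data);
}

$exposed = exposeViaSerialize(new Foo(), 'VisibleFoo');
echo $exposed->bar, "\n"; // protected value
echo $exposed->baz, "\n"; // private value
```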
A second option is to try to parse out print_r() or other built-in debugging statements to scrape values. This has the consequence of being incredibly difficult to do well beyond simple values (my old library would let you drill down into members which are themselves objects, and so forth). Interestingly it's a variant of this approach that I used to detect infinite recursion ($a->b = &$a) using var_dump().
A third option is to subclass the target and increase its visibility that way. This will get you access to protected members, but not to private members.
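A rough sketch of the subclass route, assuming a hypothetical Target class and a made-up TargetInspector helper. PHP checks visibility per class rather than per instance, so code running inside a subclass can read a protected member of any instance of the parent. Private members stay out of reach, as noted.

```php
<?php
// Hypothetical class under inspection.
class Target {
    protected $bar = 'protected value';
    private $baz = 'private value';
}

// Hypothetical helper subclass; it only exists to gain protected access.
class TargetInspector extends Target {
    public static function readProtected(Target $object, $name) {
        // Allowed: the calling scope is a subclass of the declaring class.
        return $object->$name;
    }
}

echo TargetInspector::readProtected(new Target(), 'bar'), "\n"; // protected value
// Asking for 'baz' the same way fails, because private members are bound
// to the declaring class only.
```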
I seem to recall a few years ago reading a post by someone who had figured out a way to bypass with lambda functions or something to that effect. I can't find it any longer, and having tried a variety of variations on this idea, I've come up empty.
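I can't say whether this is the post being remembered, but one closure-based trick that does exist is Closure::bind(), which arrived in PHP 5.4 (so it is of no help on 5.3). A minimal sketch, with Vault and peekProperty() as made-up names:

```php
<?php
// Hypothetical class under inspection.
class Vault {
    private $secret = 'private value';
    protected $hint = 'protected value';
}

// Hypothetical helper: binds a closure to the object AND to its class scope,
// so the closure body runs as if it were a method of that class.
function peekProperty($object, $name) {
    $reader = function () use ($name) {
        return $this->$name;
    };
    $bound = Closure::bind($reader, $object, get_class($object));
    return $bound();
}

echo peekProperty(new Vault(), 'secret'), "\n"; // private value
echo peekProperty(new Vault(), 'hint'), "\n";   // protected value
```

A private member declared in a parent class would need the parent's class name as the scope argument instead of get_class($object).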
TLDR version: can anyone think of some magic hoops to jump through to dredge out protected and private members of a PHP object instance?

Fourth option:
$reflectionProperty->setAccessible(true);
can be used to make any property accessible using the getValue() method, even if it's protected or private. Test the visibility, then use setAccessible(true), getValue() and setAccessible(false) to reset.
I think this is a lot cleaner than a serialize()/unserialize() round trip into a new class that has all properties public, and it doesn't require you to keep duplicate versions of all your classes.
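As a small sketch of that approach (Account and readHidden() are made-up names for illustration): setAccessible() exists from PHP 5.3 onward, and from PHP 8.1 Reflection can read non-public properties without it, so the call is guarded by a version check here.

```php
<?php
// Hypothetical class under inspection.
class Account {
    private $balance = 100;
    protected $owner = 'alice';
}

// Hypothetical helper built on ReflectionProperty.
function readHidden($object, $name) {
    $property = new ReflectionProperty(get_class($object), $name);
    if (PHP_VERSION_ID < 80100) {
        // Needed on 5.3 through 8.0; has no effect from 8.1 onward.
        $property->setAccessible(true);
    }
    return $property->getValue($object);
}

echo readHidden(new Account(), 'balance'), "\n"; // 100
echo readHidden(new Account(), 'owner'), "\n";   // alice
```

Because the ReflectionProperty instance is local to the helper, there is nothing to reset afterwards in this version.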

Just as a note to anyone coming across this topic looking for a way for a debug script or whatever to blow open protected properties to have a poke about inside, probably the easiest and most far-reaching approach is to bypass the proper include/require system for the debug script and load the code you're debugging with eval(), as in this quick example:
$code = file_get_contents('ClassFile.php');
$code = preg_replace('/^\s*<\?php|\?>\s*$/', '', $code); // strip only the opening and closing tags
$code = str_replace('protected', 'public', $code);
eval($code);
Of course you'll probably want to also replace private keywords with public too, and it would likely be a good idea to be more careful with your replacing code to avoid replacing the string 'protected' anywhere that isn't actually the protected keyword, etc. etc...
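One way to be "more careful" about the replacement, sketched here as an assumption rather than the answerer's own method, is to run the source through the tokenizer and swap only the real visibility keywords, so the word 'protected' inside a string or comment is left alone. publicizeSource() and the Hidden class are hypothetical names; token_get_all() comes from the tokenizer extension, which is normally enabled.

```php
<?php
// Hypothetical helper: rewrites private/protected keywords to public.
function publicizeSource($source) {
    $out = '';
    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            // Array tokens have the form [token id, source text, line number].
            if ($token[0] === T_PRIVATE || $token[0] === T_PROTECTED) {
                $out .= 'public';
            } else {
                $out .= $token[1];
            }
        } else {
            // Single-character tokens such as "{" or ";" come back as strings.
            $out .= $token;
        }
    }
    return $out;
}

// Stand-in for file_get_contents('ClassFile.php').
$source = '<?php class Hidden { private $secret = 42; protected function peek() { return $this->secret; } }';

// Prefixing "?>" lets eval() accept source that still has its opening tag.
eval('?>' . publicizeSource($source));

$hidden = new Hidden();
echo $hidden->secret, "\n"; // 42
echo $hidden->peek(), "\n"; // 42
```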
I'm sure I don't need to say that doing this on any kind of regular basis or for any reason other than development is far, far from best practice. However, this method of bypassing protected/private members is unlikely to be broken by updates to PHP any time soon, barring removal of the eval() function.

Related

Namespace vs global in PHP Functions? [duplicate]

What is the utility of the global keyword?
Are there any reasons to prefer one method to another?
Security?
Performance?
Anything else?
Method 1:
function exempleConcat($str1, $str2)
{
return $str1.$str2;
}
Method 2:
function exempleConcat()
{
global $str1, $str2;
return $str1.$str2;
}
When does it make sense to use global?
For me, it appears to be dangerous... but it may just be a lack of knowledge. I am interested in documented (e.g. with example of code, link to documentation...) technical reasons.
Bounty
This is a nice general question about the topic, I (@Gordon) am offering a bounty to get additional answers. Whether your answer is in agreement with mine or gives a different point of view doesn't matter. Since the global topic comes up every now and then, we could use a good "canonical" answer to link to.
Globals are evil
This is true for the global keyword as well as everything else that reaches from a local scope to the global scope (statics, singletons, registries, constants). You do not want to use them. A function call should not have to rely on anything outside, e.g.
function fn()
{
global $foo; // never ever use that
$a = SOME_CONSTANT; // do not use that
$b = Foo::SOME_CONSTANT; // do not use that unless self::
$c = $GLOBALS['foo']; // incl. any other superglobal ($_GET, …)
$d = Foo::bar(); // any static call, incl. Singletons and Registries
}
All of these will make your code depend on the outside. Which means, you have to know the full global state your application is in before you can reliably call any of these. The function cannot exist without that environment.
Using the superglobals might not be an obvious flaw, but if you call your code from a Command Line, you don't have $_GET or $_POST. If your code relies on input from these, you are limiting yourself to a web environment. Just abstract the request into an object and use that instead.
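A minimal sketch of what "abstract the request into an object" could look like. Request and greet() are hypothetical names, not taken from any framework:

```php
<?php
// Hypothetical value object wrapping request parameters.
class Request {
    private $params;

    public function __construct(array $params) {
        $this->params = $params;
    }

    public function get($key, $default = null) {
        return array_key_exists($key, $this->params) ? $this->params[$key] : $default;
    }
}

// The function depends on the Request it is handed, not on $_GET.
function greet(Request $request) {
    return 'Hello, ' . $request->get('name', 'stranger');
}

// In a web context you might call: greet(new Request($_GET));
// From the command line or a test, the same function works with plain data:
echo greet(new Request(array('name' => 'Ada'))), "\n"; // Hello, Ada
echo greet(new Request(array())), "\n";                // Hello, stranger
```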
In the case of coupling to hardcoded class names (statics, constants), your function also cannot exist without that class being available. That's less of an issue when the classes are from the same namespace, but when you start mixing classes from different namespaces, you are creating a tangled mess.
Reuse is severely hampered by all of the above. So is unit testing.
Also, your function signatures are lying when you couple to the global scope
function fn()
is a liar, because it claims I can call that function without passing anything to it. It is only when I look at the function body that I learn I have to set the environment into a certain state.
If your function requires arguments to run, make them explicit and pass them in:
function fn($arg1, $arg2)
{
// do something with $arg1 and $arg2
}
clearly conveys from the signature what it requires to be called. It is not dependent on the environment being in a specific state. You don't have to do
$arg1 = 'foo';
$arg2 = 'bar';
fn();
It's a matter of pulling in (global keyword) vs pushing in (arguments). When you push in/inject dependencies, the function does not rely on the outside anymore. When you do fn(1) you don't have to have a variable holding 1 somewhere outside. But when you pull in global $one inside the function, you couple to the global scope and expect it to have a variable of that name defined somewhere. The function is no longer independent then.
Even worse, when you are changing globals inside your function, your code will quickly become completely incomprehensible, because your functions are having side effects all over the place.
For lack of a better example, consider
function fn()
{
global $foo;
echo $foo; // side effect: echo'ing
$foo = 'bar'; // side effect: changing
}
And then you do
$foo = 'foo';
fn(); // prints foo
fn(); // prints bar <-- WTF!!
There is no way to see that $foo got changed from these three lines. Why would calling the same function with the same arguments all of a sudden change its output or change a value in the global state? A function should do X for a defined input Y. Always.
This gets even more severe when using OOP, because OOP is about encapsulation and by reaching out to the global scope, you are breaking encapsulation. All these Singletons and Registries you see in frameworks are code smells that should be removed in favor of Dependency Injection. Decouple your code.
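To make the Dependency Injection alternative concrete, here is a small sketch. Mailer, RecordingMailer and Signup are hypothetical names chosen for illustration; the point is only that the collaborator is pushed in through the constructor instead of being pulled from a Singleton or Registry.

```php
<?php
// Hypothetical collaborator interface.
interface Mailer {
    public function send($to, $body);
}

// A stand-in implementation that just records what it was asked to send,
// which is handy in unit tests.
class RecordingMailer implements Mailer {
    public $sent = array();

    public function send($to, $body) {
        $this->sent[] = array($to, $body);
    }
}

// The dependency is visible in the constructor signature; nothing is
// fetched from global state inside the class.
class Signup {
    private $mailer;

    public function __construct(Mailer $mailer) {
        $this->mailer = $mailer;
    }

    public function register($email) {
        $this->mailer->send($email, 'Welcome!');
        return true;
    }
}

$mailer = new RecordingMailer();
$signup = new Signup($mailer);
$signup->register('user@example.com');
echo count($mailer->sent), "\n"; // 1
```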
More Resources:
http://c2.com/cgi/wiki?GlobalVariablesAreBad
How is testing the registry pattern or singleton hard in PHP?
Flaw: Brittle Global State & Singletons
static considered harmful
Why Singletons have no use in PHP
SOLID (object-oriented design)
Globals are unavoidable.
It is an old discussion, but I would still like to add some thoughts because I miss them in the answers mentioned above. Those answers oversimplify what a global is and present solutions that are not solutions to the problem at all. The problem is: what is the proper way to deal with a global variable and the use of the keyword global? For that, we first have to examine and describe what a global is.
Take a look at this code of Zend - and please understand that I do not suggest that Zend is badly written:
class DecoratorPluginManager extends AbstractPluginManager
{
/**
* Default set of decorators
*
* @var array
*/
protected $invokableClasses = array(
'htmlcloud' => 'Zend\Tag\Cloud\Decorator\HtmlCloud',
'htmltag' => 'Zend\Tag\Cloud\Decorator\HtmlTag',
'tag' => 'Zend\Tag\Cloud\Decorator\HtmlTag',
);
There are a lot of invisible dependencies here. Those constants are actually classes.
You can also see require_once in some pages of this framework. Require_once is a global dependency, hence creating external dependencies. That is inevitable for a framework. How can you create a class like DecoratorPluginManager without a lot of external code on which it depends? It can not function without a lot of extras. Using the Zend framework, have you ever changed the implementation of an interface? An interface is in fact a global.
Another globally used application is Drupal. They are very concerned about proper design, but just like any big framework, they have a lot of external dependencies. Take a look at the globals in this page:
/**
* @file
* Initiates a browser-based installation of Drupal.
*/
/**
* Root directory of Drupal installation.
*/
define('DRUPAL_ROOT', getcwd());
/**
* Global flag to indicate that site is in installation mode.
*/
define('MAINTENANCE_MODE', 'install');
// Exit early if running an incompatible PHP version to avoid fatal errors.
if (version_compare(PHP_VERSION, '5.2.4') < 0) {
print 'Your PHP installation is too old. Drupal requires at least PHP 5.2.4. See the system requirements page for more information.';
exit;
}
// Start the installer.
require_once DRUPAL_ROOT . '/includes/install.core.inc';
install_drupal();
Ever written a redirect to the login page? That is changing a global value. (And then you are not saying 'WTF', which I consider a good reaction to bad documentation of your application.) The problem with globals is not that they are globals; you need them in order to have a meaningful application. The problem is the complexity of the overall application, which can make it a nightmare to handle.
Sessions are globals, $_POST is a global, DRUPAL_ROOT is a global, and 'includes/install.core.inc' is an unmodifiable global. There is a big world outside any function that is required in order to let that function do its job.
The answer of Gordon is incorrect, because he overrates the independence of a function, and calling a function a liar is oversimplifying the situation. Functions do not lie, and when you take a look at his example the function is designed improperly: his example is a bug. (By the way, I agree with his conclusion that one should decouple code.)
The answer of deceze is not really a proper definition of the situation. Functions always function within a wider scope and his example is way too simplistic. We will all agree with him that that function is completely useless, because it returns a constant. That function is bad design anyhow. If you want to show that the practice is bad, please come up with a relevant example. Renaming variables throughout an application is no big deal with a good IDE (or a tool). The question is about the scope of the variable, not the difference in scope with the function. There is a proper time for a function to perform its role in the process (that is why it is created in the first place), and at that proper time it may influence the functioning of the application as a whole, hence also working on global variables.
The answer of xzyfer is a statement without argumentation. Globals are just as present in an application if you have procedural functions or OOP design. The next two ways of changing the value of a global are essentially the same:
function xzy($var){
global $z;
$z = $var;
}
class Holder {
public $z;
public function setZ($var){ // $this is only valid inside a class
$this->z = $var;
}
}
In both instances the value of $z is changed within a specific function. In both ways of programming you can make those changes in a bunch of other places in the code. You could say that by using global you could call $z anywhere and change it there. Yes, you can. But will you? And when done in inappropriate places, should it then not be called a bug?
Bob Fanger comments on xzyfer.
Should anyone then just use anything, and especially the keyword 'global'? No, but just like with any type of design, try to analyze what it depends on and what depends on it. Try to find out when it changes and how it changes. Changing global values should only happen with those variables that can change with every request/response. That is, only with those variables that belong to the functional flow of a process, not to its technical implementation. The redirect of a URL to the login page belongs to the functional flow of a process; the implementation class used for an interface belongs to the technical implementation. You can change the latter across different versions of the application, but should not change it with every request/response.
To further understand when working with globals and the keyword global is a problem and when it is not, I will introduce the next sentence, which comes from Wim de Bie when writing about blogs:
'Personal yes, private no'. When a function is changing the value of a global variable for the sake of its own functioning, then I will call that private use of a global variable, and a bug. But when the change of the global variable is made for the proper processing of the application as a whole, like the redirect of the user to the login page, then that is in my opinion possibly good design, not bad by definition and certainly not an anti-pattern.
Looking back at the answers of Gordon, deceze and xzyfer: they all have 'private yes' (and bugs) as examples. That is why they are opposed to the use of globals. I would be too. They do not, however, come up with 'personal yes, private no' examples like I have done in this answer several times.
The one big reason against global is that it means the function is dependent on another scope. This will get messy very quickly.
$str1 = 'foo';
$str2 = 'bar';
$str3 = exampleConcat();
vs.
$str = exampleConcat('foo', 'bar');
Requiring $str1 and $str2 to be set up in the calling scope for the function to work means you introduce unnecessary dependencies. You can't rename these variables in this scope anymore without renaming them in the function as well, and thereby also in all other scopes you're using this function. This soon devolves into chaos as you're trying to keep track of your variable names.
global is a bad pattern even for including global things such as $db resources. There will come the day when you want to rename $db but can't, because your whole application depends on the name.
Limiting and separating the scope of variables is essential for writing any halfway complex application.
Simply put, there is rarely a reason to use global and never a good one in modern PHP code, IMHO. Especially if you're using PHP 5. And most especially if you're developing object-oriented code.
Globals negatively affect maintainability, readability and testability of code. Many uses of global can and should be replaced with Dependency Injection or simply passing the global object as a parameter.
function getCustomer($db, $id) {
$row = $db->fetchRow('SELECT * FROM customer WHERE id = '.$db->quote($id));
return $row;
}
Don't hesitate to use the global keyword inside functions in PHP. And especially don't listen to people who are outlandishly preaching/yelling about how globals are 'evil' and whatnot.
Firstly, because what you use depends totally on the situation and the problem, and there is NO one solution/way to do anything in coding. That is leaving aside entirely the fallacy of bringing undefinable, subjective, religious adjectives like 'evil' into the equation.
Case in point :
Wordpress and its ecosystem use the global keyword in their functions, whether the code is OOP or not.
And as of now Wordpress is basically 18.9% of internet, and its running the massive megasites/apps of innumerable giants ranging from Reuters to Sony, to NYT, to CNN.
And it does it well.
Usage of global keyword inside functions frees Wordpress from MASSIVE bloat which would happen given its huge ecosystem. Imagine every function was asking/passing any variable that is needed from another plugin, core, and returning. Added with plugin interdependencies, that would end up in a nightmare of variables, or a nightmare of arrays passed as variables. A HELL to track, a hell to debug, a hell to develop. Inanely massive memory footprint due to code bloat and variable bloat too. Harder to write too.
There may be people who come up and criticize Wordpress, its ecosystem, their practices and what goes on around in those parts.
Pointless, since this ecosystem is pretty much 20% of roughly the entire internet. Apparently it DOES work; it does its job and more. Which means the same goes for the global keyword.
Another good example is the "iframes are evil" fundamentalism. A decade ago it was heresy to use iframes. And there were thousands of people preaching against them around internet. Then comes facebook, then comes social, now iframes are everywhere from 'like' boxes to authentication, and voila - everyone shut up. There are those who still did not shut up - rightfully or wrongfully. But you know what, life goes on despite such opinions, and even the ones who were preaching against iframes a decade ago are now having to use them to integrate various social apps to their organization's own applications without saying a word.
......
Coder Fundamentalism is something very, very bad. A small percentage among us may be graced with the comfortable job in a solid monolithic company which has enough clout to endure the constant change in information technology and the pressures it brings in regard to competition, time, budget and other considerations, and therefore can practice fundamentalism and strict adherence to perceived 'evils' or 'goods'. Comfortable positions reminiscent of old ages these are, even if the occupiers are young.
For the majority, however, the IT world is an ever-changing world in which they need to be open-minded and practical. There is no place for fundamentalism, let alone outrageous keywords like 'evil', in the front-line trenches of information technology.
Just use whatever makes the best sense for the problem AT HAND, with appropriate consideration for the near, medium and long-term future. Do not shy away from using any feature or approach because there is rampant ideological animosity against it among any given subset of coders.
They won't do your job. You will. Act according to your circumstances.
It makes no sense to make a concat function using the global keyword.
It's used to access global variables such as a database object.
Example:
function getCustomer($id) {
global $db;
$row = $db->fetchRow('SELECT * FROM customer WHERE id = '.$db->quote($id));
return $row;
}
It can be used as a variation on the Singleton pattern
I think everyone has pretty much expounded on the negative aspects of globals. So I will add the positives as well as instructions for proper use of globals:
1. The main purpose of globals was to share information between functions. Back when there was nothing like a class, PHP code consisted of a bunch of functions, and sometimes you would need to share information between them. Typically a global was used to do this, with the risk of having the data corrupted by making it global.
Now, before some happy-go-lucky simpleton starts a comment about dependency injection, I would like to ask how the user of a function like, for example, get_post(1) would know all the dependencies of the function. Also consider that dependencies may differ from version to version and server to server. The main problem with dependency injection is that dependencies have to be known beforehand. In a situation where this is not possible or not wanted, global variables were the only way to achieve this goal.
Since the creation of the class, common functions can easily be grouped in a class and share data. Through implementations like Mediators, even unrelated objects can share information. This use of globals is no longer necessary.
2. Another use for globals is configuration, mostly at the beginning of a script before any autoloaders have been loaded, database connections made, etc. During the loading of resources, globals can be used to configure data (e.g. which database to use, where library files are located, the URL of the server). The best way to do this is with the define() function, since these values won't change often and can easily be placed in a configuration file.
3. The final use for globals is to hold common data (e.g. CRLF, IMAGE_DIR, IMAGE_DIR_URL) and human-readable status flags (e.g. ITERATOR_IS_RECURSIVE). Here globals store information that is meant to be used application-wide, allowing it to be changed and have those changes appear application-wide.
The singleton pattern became popular in PHP during PHP 4, when each instance of an object took up memory. The singleton helped to save RAM by only allowing one instance of an object to be created. Before references, even dependency injection would have been a bad idea.
The new PHP implementation of objects from PHP 5.4+ takes care of most of these problems; you can safely pass objects around with little to no penalty. This use of the singleton is no longer necessary.
Another use for singletons is the special case where only one instance of an object must exist at a time, that instance might exist before/after script execution, and the object is shared between different scripts/servers/languages etc. Here the singleton pattern solves the problem quite well.
So in conclusion if you are in position 1, 2 or 3 then using a global would be reasonable. However in other situations Method 1 should be used.
Feel free to update any other instances where globals should be used.

Why is the amount of visibility on methods and attributes important?

Why shouldn't one leave all methods and attributes accessible from anywhere (i.e. public)?
Can you give me an example of a problem I can run into if I declared an attribute as public?
Think of McDonald's as an object. There's a well known public method to order a BigMac.
Internally there's going to be a few zillion other calls to actually GET the materials for making that Bigmac. They don't want you to know how their supply chain works, so all you get is the public Gimme_a_BigMac() call, and would never ever allow you to get access to the Slaughter_a_cow() or Buy_potatoes_for_fries() methods.
For your own code, that no one will ever see, go ahead and leave everything public. But if you're doing a library for others to reuse, then you go and protect the internal details. That leaves McDonald's free to switch to having Scotty beam over a patty rather than having to call up a Trucking company to deliver the meat by land. The end-user never knows the difference - they just get their BigMac. But internally everything could fundamentally change.
Why shouldn't one leave all methods and attributes accessible from anywhere (i.e. public)?
Because that is far too expensive.
Every public method that I make has to be carefully designed and then approved by a team of architects, it has to be implemented to be robust in the face of arbitrarily hostile or buggy callers, it has to be fully tested, all problems found during testing have to have regression suites added, the method has to be documented, the documentation has to be translated into at least twelve different languages.
The biggest cost of all though is: the method has to be maintained, unchanged, forever and ever, amen. If I decide in the next version that I didn't like what that method did, I can't change it because customers now rely on it. Breaking backwards compatibility of a public method imposes costs on users and I am loath to do that. Living with a bad design or implementation of a public method imposes high costs on the designers, testers and implementers of the next version.
A public method can easily cost thousands or even tens of thousands of dollars. Make a hundred of them in a class and that's a million dollar class right there.
Private methods have none of those costs. Spend shareholder money wisely; make everything private that you possibly can.
Think of visibility scopes as inner circles of trust.
Take yourself as an example, and think about which activities are public and which are private or protected. There are a number of things that you do not delegate to anybody to do on your behalf. There are some that are fine for others to trigger, and there are some with limited access.
Similarly, in programming, scopes give you tools for creating different circles of trust. Additionally, making things private/protected gives you more control over what's happening. For example, you can allow 3rd-party plugins to extend some of your code, while limiting how far they can go.
So, to generalize, scopes give you an extra level of security and keep things more organized than they would be otherwise.
Because that violates the concept of encapsulation, a key tenet of OOP.
A risk you run, you say?
<?php
class Foo
{
/**
* @var SomeObject
*/
public $bar;
}
Your code states that $bar should contain an object instanceof SomeObject. However, anyone using your code could do
$myFoo->bar = new SomeOtherObject();
... and any code relying on Foo::$bar being a SomeObject would break. With getters and setters and protected properties, you can enforce this expectation:
<?php
class Foo
{
/**
* @var SomeObject
*/
protected $bar;
public function setBar(SomeObject $bar)
{
$this->bar = $bar;
}
}
Now you can be certain that any time Foo::$bar is set, it will be with an object instanceof SomeObject.
By hiding implementation details, it is also preventing an object from getting into an inconsistent state.
Here is a contrived example of a stack (pseudo code).
public class Stack {
public List stack = new List();
public int currentStackPosition = 0;
public String pop() {
if (currentStackPosition-1 >= 0) {
currentStackPosition--;
return stack.remove(currentStackPosition); // after the decrement this is the index of the top element
} else {
return null;
}
}
public void push(String value) {
currentStackPosition++;
stack.add(value);
}
}
If you make both variables private the implementation works fine. But if public you can easily break it by just setting an incorrect value for currentStackPosition or directly modifying the List.
If you only expose the functions, you provide a reliable contract that others can use and trust. Exposing the implementation just makes it something that might work if nobody messes with it.
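For comparison, here is a rough PHP take on the same idea with the state kept private. StringStack is a hypothetical name, and this is a sketch rather than a translation of the pseudo code above; the only way to touch the items is through push() and pop().

```php
<?php
// Hypothetical stack whose internal array cannot be reached from outside.
class StringStack {
    private $items = array();

    public function push($value) {
        $this->items[] = $value;
    }

    // Returns the most recently pushed value, or null when the stack is empty.
    public function pop() {
        return count($this->items) > 0 ? array_pop($this->items) : null;
    }
}

$stack = new StringStack();
$stack->push('a');
$stack->push('b');
echo $stack->pop(), "\n"; // b
echo $stack->pop(), "\n"; // a
var_dump($stack->pop());  // NULL
```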
Encapsulation is not needed in any language, but it's useful.
Encapsulation is used to minimise the number of potential dependencies with the highest probability of change propagation. It also helps prevent inconsistencies:
Simple example: Assume we made a Rectangle class that contained four variables - length, width, area, perimeter. Please note that area and perimeter are derived from length and width (normally I wouldn't make variables for them), so that changing length would change both area and perimeter.
If you did not use proper information hiding (encapsulation), then another program utilizing that Rectangle class could alter the length without altering the area, and you would have an inconsistent Rectangle. Without encapsulation, it would be possible to create a Rectangle with a length of 1 and a width of 3, and have an area of 32345.
Using encapsulation, we can create a function such that, if a program wants to change the length of the rectangle, the object appropriately updates its area and perimeter and never becomes inconsistent.
Encapsulation eliminates the possibilities for inconsistency, and shifts the responsibility of staying consistent onto the object itself rather than a program utilizing it.
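The Rectangle described above might be sketched in PHP like this. The method names are assumptions made for illustration; the derived values are stored, as in the example, and recomputed whenever a side changes.

```php
<?php
class Rectangle {
    private $length;
    private $width;
    private $area;
    private $perimeter;

    public function __construct($length, $width) {
        $this->length = $length;
        $this->width = $width;
        $this->recalculate();
    }

    // The only way to change the length; derived values follow automatically.
    public function setLength($length) {
        $this->length = $length;
        $this->recalculate();
    }

    public function getArea() {
        return $this->area;
    }

    public function getPerimeter() {
        return $this->perimeter;
    }

    private function recalculate() {
        $this->area = $this->length * $this->width;
        $this->perimeter = 2 * ($this->length + $this->width);
    }
}

$rectangle = new Rectangle(1, 3);
$rectangle->setLength(2);
echo $rectangle->getArea(), "\n";      // 6
echo $rectangle->getPerimeter(), "\n"; // 10
```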
However at the same time encapsulation is sometimes a bad idea, and motion planning and collision (in game programming) are areas where this is particularly likely to be the case.
The problem is that encapsulation is fantastic in places where it is needed, but it is terrible when applied in places where it isn’t needed, like when there are global properties that need to be maintained by a group of encapsulated objects. Since OOP enforces encapsulation no matter what, you are stuck. For example, there are many properties of objects that are non-local, such as any kind of global consistency. What tends to happen in OOP is that every object has to encode its view of the global consistency condition, and do its part to help maintain the right global properties. This can be fun if you really need the encapsulation, to allow alternative implementations. But if you don’t need it, you end up writing lots of very tricky code in multiple places that basically does the same thing. Everything seems encapsulated, but is in fact completely interdependent.
Well, in fact you can have everything public and it doesn't break encapsulation when you state clearly, what is the contract, the correct way to use objects. Maybe not attributes, but methods are often more hidden than they have to be.
Remember that it is not you, the API designer, who is breaking the encapsulation by making things public. It is the users of the class who can do so, by calling internal methods in their application. You can either slap their hands for trying to do so (i.e. declaring methods private), or pass the responsibility to them (e.g. by prefixing non-API methods with "_"). Do you really care whether someone breaks his code by using your library in a way other than the one you advise? I don't.
Making almost everything private or final -- or leaving them without API documentation, on the other hand -- is a way of discouraging extendability and feedback in open source. Your code can be used in ways you didn't even think of, which might not be the case when everything is locked (e.g. sealed-by-default methods in C#).
The only problem you can run into is that people will see you as "uncool" if you don't use Private or Protected or Abstract Static Final Interface or whatever. This stuff is like designer clothes or Apple gadgets - people buy them not because they need to, but just to keep up with others.
Yes, encapsulation is an important theoretical concept, but in practice "private" and friends rarely make sense. They might make some sense in Java or C#, but in a scripting language like PHP using "private" or "protected" is sheer stupidity, because encapsulation was invented to be checked by a compiler, which doesn't exist in PHP.
See also this excellent response and the comments by @troelskn and @mario over here
Visibility is just something that you can use for your own good, to help you not break your own code. And if you use it right, you will help others (who are using your code) not to break their own code (by using your code wrongly).
The simplest, widely known example, in my opinion, is the Singleton pattern. It's a pattern because it's a common problem. Definition of a pattern from Wikipedia:
is a formal way of documenting a solution to a design problem
Definition of the Singleton pattern in Wikipedia:
In software engineering, the singleton pattern is a design pattern used to implement the mathematical concept of a singleton, by restricting the instantiation of a class to one object. This is useful when exactly one object is needed to coordinate actions across the system.
http://en.wikipedia.org/wiki/Singleton_pattern
The implementation of the pattern uses a private constructor. If you don't make the constructor private, anyone could mistakenly create a new instance, and break the whole point of having only one instance.
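A minimal sketch of that private constructor, assuming a made-up Config class (the name and the getInstance() accessor are illustrative, not taken from the answer):

```php
<?php
// Hypothetical Config class used only to illustrate the private constructor.
class Config
{
    private static $instance = null;

    // Private: "new Config()" from outside the class fails.
    private function __construct()
    {
    }

    // Private: prevents copying the single instance with "clone".
    private function __clone()
    {
    }

    public static function getInstance()
    {
        if (self::$instance === null) {
            self::$instance = new self();
        }
        return self::$instance;
    }
}

$a = Config::getInstance();
$b = Config::getInstance();
// $a and $b refer to the same object.
```

If the constructor were public, nothing would stop someone from writing new Config() somewhere and ending up with a second instance.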
You may think that the previous answers are "theoretical", but here is a practical one: if you use public properties in Doctrine2 entities, you break lazy loading.
To save you from yourself!
There have been some excellent answers above, but I wanted to add a bit. This is called the principle of least privilege. With less privilege, fewer entities have the authority to break things. Breaking things is bad.
If you follow the principle of least privilege, the principle of least knowledge (or Law of Demeter) and the single responsibility principle aren't far behind. Since the class you wrote to download the latest football scores has followed this principle, and you have to poll its data instead of it being dumped straight to your interface, you copy and paste the whole class into your next project, saving development time. Saving development time is good.
If you're lucky, you'll be coming back to this code in 6 months to fix a small bug, after you've made gigaquads of money from it. Future self will take prior self's name in vain for not following the above principles, and he will fall victim to a violation of the principle of least astonishment. That is, your bug is a parse error in the football score model, but since you didn't follow LOD and SRP, you're astonished at the fact that you're doing XML parsing inline with your output generation. There are much better things in life to be astonished by than the horrificness of your own code. Trust me, I know.
Since you followed all the principles and documented your code, you work two hours every Thursday afternoon on maintenance programming, and the rest of the time surfing.

PHP global in functions

What is the utility of the global keyword?
Are there any reasons to prefer one method to another?
Security?
Performance?
Anything else?
Method 1:
function exempleConcat($str1, $str2)
{
return $str1.$str2;
}
Method 2:
function exempleConcat()
{
global $str1, $str2;
return $str1.$str2;
}
When does it make sense to use global?
For me, it appears to be dangerous... but it may just be a lack of knowledge. I am interested in documented (e.g. with example of code, link to documentation...) technical reasons.
Bounty
This is a nice general question about the topic, so I (@Gordon) am offering a bounty to get additional answers. Whether your answer is in agreement with mine or gives a different point of view doesn't matter. Since the global topic comes up every now and then, we could use a good "canonical" answer to link to.
Globals are evil
This is true for the global keyword as well as everything else that reaches from a local scope to the global scope (statics, singletons, registries, constants). You do not want to use them. A function call should not have to rely on anything outside, e.g.
function fn()
{
global $foo; // never ever use that
$a = SOME_CONSTANT; // do not use that
$b = Foo::SOME_CONSTANT; // do not use that unless self::
$c = $GLOBALS['foo']; // incl. any other superglobal ($_GET, …)
$d = Foo::bar(); // any static call, incl. Singletons and Registries
}
All of these will make your code depend on the outside. Which means, you have to know the full global state your application is in before you can reliably call any of these. The function cannot exist without that environment.
Using the superglobals might not be an obvious flaw, but if you call your code from a Command Line, you don't have $_GET or $_POST. If your code relies on input from these, you are limiting yourself to a web environment. Just abstract the request into an object and use that instead.
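One way to read "abstract the request into an object" is sketched below. The Request class and the greet() function are hypothetical names made up for this illustration, not part of any particular framework:

```php
<?php
// Hypothetical Request class; not taken from any particular framework.
class Request
{
    private $params;

    public function __construct(array $params)
    {
        $this->params = $params;
    }

    // Returns the parameter, or $default when it was not sent.
    public function get($key, $default = null)
    {
        return array_key_exists($key, $this->params) ? $this->params[$key] : $default;
    }
}

// The function depends on the Request it is given, not on $_GET.
function greet(Request $request)
{
    return 'Hello, ' . $request->get('name', 'stranger');
}

// Web entry point: build the request from the superglobal once, at the edge.
// $request = new Request($_GET);

// Command line script or unit test: build it from any array.
$request = new Request(array('name' => 'Gordon'));
$greeting = greet($request);
```

The superglobal is still read somewhere, but only in one place, and everything below that point can be called from the command line or from a test without a web environment.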
When you couple to hardcoded classnames (static calls, constants), your function also cannot exist without that class being available. That's less of an issue when the classes are from the same namespace, but when you start mixing classes from different namespaces, you are creating a tangled mess.
Reuse is severely hampered by all of the above. So is unit-testing.
Also, your function signatures are lying when you couple to the global scope:
function fn()
is a liar, because it claims I can call that function without passing anything to it. It is only when I look at the function body that I learn I have to set the environment into a certain state.
If your function requires arguments to run, make them explicit and pass them in:
function fn($arg1, $arg2)
{
// do something with $arg1 and $arg2
}
clearly conveys from the signature what it requires to be called. It does not depend on the environment being in a specific state. You don't have to do
$arg1 = 'foo';
$arg2 = 'bar';
fn();
It's a matter of pulling in (global keyword) vs pushing in (arguments). When you push in/inject dependencies, the function does not rely on the outside anymore. When you do fn(1) you don't have to have a variable holding 1 somewhere outside. But when you pull in global $one inside the function, you couple to the global scope and expect it to have a variable of that name defined somewhere. The function is no longer independent then.
Even worse, when you are changing globals inside your function, your code will quickly become completely incomprehensible, because your functions have side effects all over the place.
For lack of a better example, consider
function fn()
{
global $foo;
echo $foo; // side effect: echo'ing
$foo = 'bar'; // side effect: changing
}
And then you do
$foo = 'foo';
fn(); // prints foo
fn(); // prints bar <-- WTF!!
There is no way to see that $foo got changed from these three lines. Why would calling the same function with the same arguments all of a sudden change its output or change a value in the global state? A function should do X for a defined input Y. Always.
This gets even more severe when using OOP, because OOP is about encapsulation and by reaching out to the global scope, you are breaking encapsulation. All these Singletons and Registries you see in frameworks are code smells that should be removed in favor of Dependency Injection. Decouple your code.
More Resources:
http://c2.com/cgi/wiki?GlobalVariablesAreBad
How is testing the registry pattern or singleton hard in PHP?
Flaw: Brittle Global State & Singletons
static considered harmful
Why Singletons have no use in PHP
SOLID (object-oriented design)
Globals are unavoidable.
It is an old discussion, but I still would like to add some thoughts because I miss them in the above-mentioned answers. Those answers oversimplify what a global is and present solutions that are not at all solutions to the problem. The problem is: what is the proper way to deal with a global variable and the use of the keyword global? For that we first have to examine and describe what a global is.
Take a look at this code of Zend - and please understand that I do not suggest that Zend is badly written:
class DecoratorPluginManager extends AbstractPluginManager
{
/**
* Default set of decorators
*
* @var array
*/
protected $invokableClasses = array(
'htmlcloud' => 'Zend\Tag\Cloud\Decorator\HtmlCloud',
'htmltag' => 'Zend\Tag\Cloud\Decorator\HtmlTag',
'tag' => 'Zend\Tag\Cloud\Decorator\HtmlTag',
);
There are a lot of invisible dependencies here. Those constants are actually classes.
You can also see require_once in some pages of this framework. Require_once is a global dependency, hence creating external dependencies. That is inevitable for a framework. How can you create a class like DecoratorPluginManager without a lot of external code on which it depends? It can not function without a lot of extras. Using the Zend framework, have you ever changed the implementation of an interface? An interface is in fact a global.
Another globally used application is Drupal. They are very concerned about proper design, but just like any big framework, they have a lot of external dependencies. Take a look at the globals in this page:
/**
* @file
* Initiates a browser-based installation of Drupal.
*/
/**
* Root directory of Drupal installation.
*/
define('DRUPAL_ROOT', getcwd());
/**
* Global flag to indicate that site is in installation mode.
*/
define('MAINTENANCE_MODE', 'install');
// Exit early if running an incompatible PHP version to avoid fatal errors.
if (version_compare(PHP_VERSION, '5.2.4') < 0) {
print 'Your PHP installation is too old. Drupal requires at least PHP 5.2.4. See the system requirements page for more information.';
exit;
}
// Start the installer.
require_once DRUPAL_ROOT . '/includes/install.core.inc';
install_drupal();
Ever written a redirect to the login page? That is changing a global value. (And then you are not saying 'WTF', which I consider a good reaction to bad documentation of your application.) The problem with globals is not that they are globals; you need them in order to have a meaningful application. The problem is the complexity of the overall application, which can make it a nightmare to handle.
Sessions are globals, $_POST is a global, DRUPAL_ROOT is a global, and 'includes/install.core.inc' is an unmodifiable global. There is a big world outside any function that is required in order to let that function do its job.
The answer of Gordon is incorrect, because he overrates the independence of a function, and calling a function a liar oversimplifies the situation. Functions do not lie, and when you take a look at his example, the function is designed improperly: his example is a bug. (By the way, I agree with his conclusion that one should decouple code.)
The answer of deceze is not really a proper definition of the situation. Functions always function within a wider scope and his example is way too simplistic. We will all agree with him that that function is completely useless, because it returns a constant. That function is bad design anyhow. If you want to show that the practice is bad, please come up with a relevant example. Renaming variables throughout an application is no big deal with a good IDE (or a tool). The question is about the scope of the variable, not the difference in scope with the function. There is a proper time for a function to perform its role in the process (that is why it is created in the first place), and at that proper time it may influence the functioning of the application as a whole, hence also work on global variables.
The answer of xzyfer is a statement without argumentation. Globals are just as present in an application if you have procedural functions or OOP design. The next two ways of changing the value of a global are essentially the same:
function xzy($var){
global $z;
$z = $var;
}
class Xzy {
public $z;
public function setZ($var){ // $this only exists inside a class method
$this->z = $var;
}
}
In both instances the value of $z is changed within a specific function. In both ways of programming you can make those changes in a bunch of other places in the code. You could say that using global you could reach $z anywhere and change it there. Yes, you can. But will you? And when it is done in inappropriate places, should it not then be called a bug?
Bob Fanger comments on xzyfer.
Should anyone then just use anything, and especially the keyword 'global'? No, but just as with any type of design, try to analyze what it depends on and what depends on it. Try to find out when it changes and how it changes. Changing global values should only happen with those variables that can change with every request/response. That is, only with those variables that belong to the functional flow of a process, not to its technical implementation. The redirect of a URL to the login page belongs to the functional flow of a process; the implementation class used for an interface belongs to the technical implementation. You can change the latter between versions of the application, but should not change it with every request/response.
To further clarify when working with globals and the keyword global is a problem and when it is not, I will introduce the following sentence, which comes from Wim de Bie when writing about blogs:
'Personal yes, private no'. When a function changes the value of a global variable for the sake of its own functioning, I call that private use of a global variable, and a bug. But when the change of the global variable is made for the proper processing of the application as a whole, like the redirect of the user to the login page, then that is in my opinion possibly good design, not by definition bad and certainly not an anti-pattern.
Looking back at the answers of Gordon, deceze and xzyfer: they all have 'private yes' (and bugs) as examples. That is why they are opposed to the use of globals. I would be too. They do not, however, come up with 'personal yes, private no' examples like I have done several times in this answer.
The one big reason against global is that it means the function is dependent on another scope. This will get messy very quickly.
$str1 = 'foo';
$str2 = 'bar';
$str3 = exampleConcat();
vs.
$str = exampleConcat('foo', 'bar');
Requiring $str1 and $str2 to be set up in the calling scope for the function to work means you introduce unnecessary dependencies. You can't rename these variables in this scope anymore without renaming them in the function as well, and thereby also in all other scopes you're using this function. This soon devolves into chaos as you're trying to keep track of your variable names.
global is a bad pattern even for including global things such as $db resources. There will come the day when you want to rename $db but can't, because your whole application depends on the name.
Limiting and separating the scope of variables is essential for writing any halfway complex application.
Simply put, there is rarely a reason to use global and never a good one in modern PHP code, IMHO. Especially if you're using PHP 5. And extra specially if you're developing object-oriented code.
Globals negatively affect maintainability, readability and testability of code. Many uses of global can and should be replaced with Dependency Injection or simply passing the global object as a parameter.
function getCustomer($db, $id) {
$row = $db->fetchRow('SELECT * FROM customer WHERE id = '.$db->quote($id));
return $row;
}
Don't hesitate to use the global keyword inside functions in PHP. Especially, don't take seriously people who are outlandishly preaching/yelling about how globals are 'evil' and whatnot.
Firstly, because what you use totally depends on the situation and problem, and there is NO one solution/way to do anything in coding. And that is leaving aside the fallacy of bringing undefinable, subjective, religious adjectives like 'evil' into the equation.
Case in point :
WordPress and its ecosystem use the global keyword in their functions, be the code OOP or not OOP.
And as of now WordPress is basically 18.9% of the internet, and it's running the massive megasites/apps of innumerable giants ranging from Reuters to Sony, to NYT, to CNN.
And it does it well.
Usage of the global keyword inside functions frees WordPress from the MASSIVE bloat which would otherwise happen given its huge ecosystem. Imagine every function was asking for/passing any variable that is needed from another plugin or core, and returning it. Add plugin interdependencies, and that would end up in a nightmare of variables, or a nightmare of arrays passed as variables. A HELL to track, a hell to debug, a hell to develop. Inanely massive memory footprint due to code bloat and variable bloat too. Harder to write too.
There may be people who come up and criticize WordPress, its ecosystem, their practices and what goes on around in those parts.
Pointless, since this ecosystem is pretty much 20% of roughly the entire internet. Apparently, it DOES work, it does its job and more. Which means it's the same for the global keyword.
Another good example is the "iframes are evil" fundamentalism. A decade ago it was heresy to use iframes. And there were thousands of people preaching against them around the internet. Then came Facebook, then came social, now iframes are everywhere from 'like' boxes to authentication, and voila - everyone shut up. There are those who still did not shut up - rightfully or wrongfully. But you know what, life goes on despite such opinions, and even the ones who were preaching against iframes a decade ago are now having to use them to integrate various social apps into their organization's own applications without saying a word.
......
Coder Fundamentalism is something very, very bad. A small percentage among us may be graced with the comfortable job in a solid monolithic company which has enough clout to endure the constant change in information technology and the pressures it brings in regard to competition, time, budget and other considerations, and therefore can practice fundamentalism and strict adherence to perceived 'evils' or 'goods'. Comfortable positions reminiscent of old ages these are, even if the occupiers are young.
For the majority, however, the IT world is an ever-changing world in which they need to be open-minded and practical. There is no place for fundamentalism, let alone outrageous keywords like 'evil', in the front-line trenches of information technology.
Just use whatever makes the best sense for the problem AT HAND, with appropriate considerations for near, medium and long term future. Do not shy away from using any feature or approach because it has a rampant ideological animosity against it, among any given coder subset.
They won't do your job. You will. Act according to your circumstances.
It makes no sense to make a concat function using the global keyword.
It's used to access global variables such as a database object.
Example:
function getCustomer($id) {
global $db;
$row = $db->fetchRow('SELECT * FROM customer WHERE id = '.$db->quote($id));
return $row;
}
It can be used as a variation on the Singleton pattern.
I think everyone has pretty much expounded on the negative aspects of globals. So I will add the positives as well as instructions for proper use of globals:
1. The main purpose of globals was to share information between functions. Back when there was nothing like a class, PHP code consisted of a bunch of functions. Sometimes you would need to share information between functions. Typically a global was used to do this, at the risk of having the data corrupted because it was global.
Now before some happy-go-lucky simpleton starts a comment about dependency injection, I would like to ask you how the user of a function like, for example, get_post(1) would know all the dependencies of the function. Also consider that dependencies may differ from version to version and server to server. The main problem with dependency injection is that dependencies have to be known beforehand. In a situation where this is not possible or unwanted, global variables were the only way to achieve this goal.
Due to the creation of the class, common functions can now easily be grouped in a class and share data. Through implementations like Mediators even unrelated objects can share information. This is no longer necessary.
2. Another use for globals is for configuration purposes, mostly at the beginning of a script before any autoloaders have been loaded, database connections made, etc. During the loading of resources, globals can be used to configure data (i.e. which database to use, where library files are located, the URL of the server etc.). The best way to do this is by use of the define() function, since these values won't change often and can easily be placed in a configuration file.
3. The final use for globals is to hold common data (i.e. CRLF, IMAGE_DIR, IMAGE_DIR_URL) and human-readable status flags (i.e. ITERATOR_IS_RECURSIVE). Here globals are used to store information that is meant to be used application wide, allowing them to be changed and have those changes appear application wide.
The singleton pattern became popular in PHP during PHP 4, when each instance of an object took up memory. The singleton helped to save RAM by only allowing one instance of an object to be created. Before references, even dependency injection would have been a bad idea.
The new PHP implementation of objects from PHP 5.4+ takes care of most of these problems; you can safely pass objects around with little to no penalty any more. This is no longer necessary.
Another use for singletons is the special case where only one instance of an object must exist at a time, that instance might exist before / after script execution, and that object is shared between different scripts / servers / languages etc. Here a singleton pattern solves the problem quite well.
So in conclusion if you are in position 1, 2 or 3 then using a global would be reasonable. However in other situations Method 1 should be used.
Feel free to update any other instances where globals should be used.

PHP protected classes and properties, protected from whom?

I'm just getting started with OOP PHP with PHP Object-Oriented Solutions by David Powers, and am a little curious about the notion of protection in OOP.
The author clearly explains how protection works, but the bit about not wanting others to be able to change properties falls a bit flat. I'm having a hard time imagining a situation where it is ever possible to prevent others from altering your classes, since they could just open up your class.php and manually tweak whatever they pleased seeing as how PHP is always in plain text.
Caution: all of the above written by a beginner with a beginner's understanding of programming.
From yourself!
You use various levels of protection to indicate how you want a class to be used. If a class member is protected or private, it can only be accessed by the class itself (and, for protected members, by its subclasses). There's no chance you can screw up the value of that member accidentally from "external" code (code outside the class).
Say you have a class member that is only supposed to contain numbers. You make it protected and add a setter which checks that its value can only be numeric:
class Foo {
protected $num = 0;
public function setNum($num) {
if (!is_int($num)) {
throw new Exception('Not a number!!!');
}
$this->num = $num;
}
}
Now you can be sure that Foo::$num will always contain a number when you want to work with it. You can skip a lot of extra error checking code whenever you want to use it. Any time you try to assign anything but a number to it, you'll get a very loud error message, which makes it very easy to find bugs.
It's a restriction you put on yourself to ease your own work. Because programmers make mistakes. Especially dynamically typed languages like PHP let you silently make a lot of mistakes without you noticing, which turn into very hard to debug, very serious errors later on.
By its very nature, software is very soft and easily degrades into an unmaintainable Rube Goldberg logic machine. OOP, encapsulation, visibility modifiers, type hinting etc are tools PHP gives you to make your code "harder", to express your intent of what you want certain pieces of your code to be and enable PHP to enforce this intent for you.
Protected does not really protect the source code from being changed by anyone; it is just a visibility level for class members in PHP OOP.
Class members declared public can be accessed everywhere. Members declared protected can be accessed only within the class itself and by inheriting and parent classes. Members declared as private may only be accessed by the class that defines the member.
They mean they are protected in different ways...
Private variables are not visible anywhere except from within the class.
Protected variables are not visible from outside the instantiated object, but are visible to classes which inherit from that class, as well as to the class itself.
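To make the three levels concrete, here is a small sketch. The Account and SavingsAccount classes are invented for illustration; get_object_vars() is used because it only returns the properties that are accessible from the scope it is called in:

```php
<?php
// Hypothetical classes illustrating public, protected and private.
class Account
{
    public $owner = 'alice';
    protected $balance = 100;
    private $pin = '1234';
}

class SavingsAccount extends Account
{
    public function getBalance()
    {
        // Protected members are visible to subclasses.
        return $this->balance;
    }

    public function visibleProperties()
    {
        // Called from subclass scope: the parent's private $pin is not included.
        return array_keys(get_object_vars($this));
    }
}

$account = new SavingsAccount();
$fromSubclass = $account->visibleProperties();        // owner and balance, not pin
$fromOutside = array_keys(get_object_vars($account)); // owner only

// Reading $account->balance or $account->pin here, outside the classes,
// is rejected by PHP at runtime.
```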
Nothing stops another programmer from opening a class file and changing the access modifiers.
Hiding data is a good thing because the less you expose, the more you can control and the fewer bugs you can potentially introduce.

I'm new to OOP/PHP. What's the practicality of visibility and extensibility in classes?

I'm obviously brand new to these concepts. I just don't understand why you would limit access to properties or methods. It seems that you would just write the code according to intended results. Why would you create a private method instead of simply not calling that method? Is it for iterative object creation (if I'm stating that correctly), a multiple developer situation (don't mess up other people's work), or just so you don't mess up your own work accidentally?
Your last two points are quite accurate - you don't need multiple developers to have your stuff messed with. If you work on a project long enough, you'll realize you've forgotten much of what you did at the beginning.
One of the most important reasons for hiding something is so that you can safely change it later. If a field is public, and several months later you want to change it so that every time the field changes, something else happens, you're in trouble. Because it was public, there's no way to know or remember how many other places accessed that field directly. If it's private, you have a guarantee that it isn't being touched outside of this class. You likely have a public method wrapped around it, and you can easily change the behavior of that method.
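A small sketch of that "change it later" scenario, assuming a made-up Product class: the price history is the behaviour added months afterwards, and no calling code has to change because nobody could write to $price directly.

```php
<?php
// Hypothetical Product class; the "later change" is the history kept in setPrice().
class Product
{
    private $price = 0;
    private $priceHistory = array();

    public function setPrice($price)
    {
        // Added later: remember the previous price on every change.
        $this->priceHistory[] = $this->price;
        $this->price = $price;
    }

    public function getPrice()
    {
        return $this->price;
    }

    public function getPriceHistory()
    {
        return $this->priceHistory;
    }
}

$p = new Product();
$p->setPrice(10);
$p->setPrice(12);
```

Had $price been public, every direct assignment scattered around the codebase would have silently skipped the new bookkeeping.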
In general, the more things you make public, the more you have to worry about compatibility with other code.
We create private methods so that consumers of our classes don't have to care about implementation details - they can focus on the few nifty things our classes provide for them.
Moreover, we're obligated to consider every possible use of public methods. By making methods private, we reduce the number of features a class has to support, and we have more freedom to change them.
Say you have a Queue class - every time a caller adds an item to the queue, it may be necessary to increase the queue's capacity. Because of the underlying implementation, setting the capacity isn't trivial, so you break it out into a separate function to improve the readability of your Enqueue function. Since callers don't care about a queue's capacity (you're handling it for them), you can make the method private: callers don't get distracted by superfluous methods, you don't have to worry that callers will do ridiculous things to the capacity, and you can change the implementation any time you like without breaking code that uses your class (as long as it still sets the capacity within the limited use cases defined by your class).
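Something along these lines, as a sketch only: the doubling strategy, the method names and the use of SplFixedArray as backing storage are my assumptions, and the answer does not say how its queue is implemented.

```php
<?php
// Hypothetical queue with a private capacity-management method.
class Queue
{
    private $items;
    private $count = 0;
    private $capacity;

    public function __construct($capacity = 2)
    {
        $this->capacity = max(1, (int) $capacity);
        $this->items = new SplFixedArray($this->capacity);
    }

    public function enqueue($item)
    {
        if ($this->count === $this->capacity) {
            $this->grow();
        }
        $this->items[$this->count] = $item;
        $this->count++;
    }

    public function count()
    {
        return $this->count;
    }

    // Implementation detail: callers never see or call this.
    private function grow()
    {
        $this->capacity *= 2;
        $this->items->setSize($this->capacity);
    }
}

$queue = new Queue();
foreach (array('a', 'b', 'c', 'd', 'e') as $item) {
    $queue->enqueue($item); // grows behind the scenes when full
}
```

Since grow() is private, the doubling could later be swapped for a different strategy without touching any code that uses the class.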
It all comes down to encapsulation. This means hiding the insides of the class and just caring about what it does. If you want to have a credit card processing class, you don't really care 'how' it processes the credit card. You just want to be able to go: $creditCardProcessor->charge(10.99, $creditCardNumber); and expect it to work.
By making some methods public and others private or protected, we leave an entry way for others so they know where it is safe to call code from. The public methods and variables are called an 'interface'.
For any class, you have an implementation. This is how the class carries out its duty. If it is a smoothie making class, how the class adds the ingredients, what ingredients it adds, etc are all part of the implementation. The outside code shouldn't know and/or care about the implementation.
The other side of the class is its interface. The interface is the set of public methods that the developer of the class intended to be called by outside code. This means that you should be able to call any public method and it will work properly.
There are several reasons for using encapsulation, one of the strongest is: Imagine using a large, complicated library written by someone else. If every object was unprotected you could unknowingly be accessing or changing values that the developer never intended to be manipulated in that way.
Hiding data makes the program easier to conceptualize and easier to implement.
It's all about encapsulation. The methods that do the inner grunt work are private, while the class exposes graceful functions that make things easy. E.g. you might have a $product->insert() function that utilizes 4 inner functions to validate a singleton db object, make the query safe, etc - those are inner functions that don't need to be exposed and which, if called, might mess up other structures or flows you, the developer, have put in place.
a multiple developer situation (don't mess up other people's work), or just so you don't mess up your own work accidentally?
Mainly these two things. Making a method public says "this is how the class is supposed to be used by its clients", making it private says "this is an implementation detail that may change without warning and which clients should not care about" AND forces clients to follow that advice.
A class with a few, well documented public methods is much easier to use by someone who's not familiar with it (which may well be its original author, looking at it for the first time in 6 months) than one where everything is public, including all the little implementation details that you don't care about.
It makes collaboration easier, you tell the users of your classes what parts should not change so often and you can guarantee that your object will be in a meaningful state if they use only public methods.
It does not need to be so strict as distinguishing between private/public/whatever (I mean enforced by the language). For example, in Python, this is accomplished by a naming convention. You know you shouldn't mess with anything marked as not public.
For example, a private/protected method may hold a piece of logic that is called from another (public) method of the class. If that piece is called from several public methods, extracting it makes sense, and yet you don't want it to be called from anywhere else.
It's quite the same with class properties. Yes, you can write all-public classes, but what's the fun in that?
