I'm working on a new project with a sizeable PHP codebase. The application uses quite a few PHP constants ( define('FOO', 'bar') ), particularly for things like database connection parameters. These constants are all defined in a single configuration file that is require_once()'d directly by basically every class in the application.
A few years ago this would have made perfect sense, but since then I've gotten the Unit Testing bug and this tight coupling between classes is really bothering me. These constants smell like global variables, and they're referenced directly throughout the application code.
Is this still a good idea? Would it be reasonable to copy these values into an object and use this object (i.e. a Bean - there, I said it) to convey them via dependency injection to the classes that interact with the database? Am I defeating any of the benefits of PHP constants (say speed or something) by doing this?
Another approach I'm considering would be to create a separate configuration PHP script for testing. I'd still need to figure out a way to get the classes under test to use the sandbox configuration script instead of the global configuration script. This still feels brittle, but it might require less outright modification to the entire application.
In my opinion, constants should be used only in two circumstances:
Actual constant values (i.e. things that will never change, SECONDS_PER_HOUR).
OS-dependent values, as long as the constant can be used transparently by the application, in every situation possible.
Even then, I'd reconsider whether class constants would be more appropriate so as not to pollute the constants space.
In your situation, I'd say constants are not a good solution because you will want to provide alternative values depending on where they're used.
These constants smell like global variables, and they're referenced directly […]. Would it be reasonable to copy these values into an object and […] convey them via dependency injection?
Absolutely! I would go even further and say even class constants should be avoided. Because they are public, they expose internals and become part of the API, so you cannot change them easily without risking breaking existing applications due to the tight coupling. A configuration object makes much more sense (just don't make it a Singleton).
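To make the configuration-object idea concrete, here is a minimal sketch. The class names DatabaseConfig and UserRepository, the constant names in the comment, and the MySQL-style DSN are all made up for illustration; they are not from the question.

```php
<?php
// Hypothetical value object holding the settings that used to be constants.
final class DatabaseConfig
{
    private $host;
    private $name;

    public function __construct(string $host, string $name)
    {
        $this->host = $host;
        $this->name = $name;
    }

    // Build a PDO-style DSN string (MySQL format assumed for the example).
    public function dsn(): string
    {
        return sprintf('mysql:host=%s;dbname=%s', $this->host, $this->name);
    }
}

// Hypothetical consumer: it receives the config instead of reading constants.
final class UserRepository
{
    private $config;

    public function __construct(DatabaseConfig $config)
    {
        $this->config = $config;
    }

    public function describeConnection(): string
    {
        return $this->config->dsn();
    }
}

// Production wiring could still read the legacy constants in ONE place, e.g.
// $config = new DatabaseConfig(DB_HOST, DB_NAME);   // assumed constant names

// A test builds its own object and never loads the global config file.
$testConfig = new DatabaseConfig('localhost', 'app_test');
$repository = new UserRepository($testConfig);
echo $repository->describeConnection(), PHP_EOL;
```

The point of the design is that only the bootstrap code knows where the values come from; everything else just receives an object.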
Also see:
Brittle Global State & Singletons from
http://misko.hevery.com/code-reviewers-guide/
To answer this question it is important to discuss the style of code being written.
PHP 5 includes a number of useful OOP features, one of which is class constants. If you're using an object-oriented approach, you should use class constants rather than polluting the global namespace or worrying about overriding common constants.
FOO_BAR could be FOO::BAR; in the end, it comes down to the scope in which you want the constant defined.
If you're writing a more procedural style program, or mixing procedural with some classes, global constants aren't an issue. If the code you're working on is becoming unmanageable due to the constants you're using, try changing things around. Otherwise, don't worry about it.
Additionally, class constants won't allow you to use function return values; global constants created with define() will. This is great when you have a value that won't ever change for the lifetime of the program, but needs to be generated.
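A small sketch of that difference. The constant names and the "myapp" directory are placeholders chosen for the example.

```php
<?php
// define() runs at run time, so its value may come from any expression,
// including a function call.
define('REQUEST_STARTED_AT', time());
define('APP_TMP_DIR', sys_get_temp_dir() . '/myapp'); // "myapp" is a placeholder

class Cache
{
    // A class constant must be a compile-time constant expression.
    const TTL_SECONDS = 60 * 5;       // plain arithmetic is accepted since PHP 5.6
    // const STARTED_AT = time();     // fatal error: function calls are not allowed
}

var_dump(is_int(REQUEST_STARTED_AT)); // bool(true)
var_dump(Cache::TTL_SECONDS);         // int(300)
```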
Using constants for database connection information is perfectly fine. This prevents hard-coding it within the object itself, and since it's read-only you can't overwrite the values.
I'm not fond of hard-coding my settings in an object, as things can change, but if you wanted to do that, that would work just as well.
If you have PHP 5.3 or newer, you can use namespaces.
http://www.php.net/manual/en/language.namespaces.php
It works with const variable = 'something';
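Unfortunately, it doesn't work with a plain define('variable','something');, which defines the constant in the global namespace unless the name you pass includes the namespace.
Constants declared in a namespace are encapsulated in it. In some situations that is better than having an object.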
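One detail worth checking for yourself: define() takes the constant name as a string, so a namespace-qualified name can be passed to it. A minimal sketch, with App\Config as a made-up namespace:

```php
<?php
// The name passed to define() is a string, so it may contain a namespace.
// "App\Config" is a hypothetical namespace used only for this example.
define('App\Config\DB_HOST', 'localhost');

// The constant is reachable by its fully qualified name...
echo \App\Config\DB_HOST, PHP_EOL;
echo constant('App\Config\DB_HOST'), PHP_EOL;

// ...and it does not exist in the global constant space.
var_dump(defined('DB_HOST')); // bool(false)
```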
I don't agree with constants, nor with hard-coding :-)
Performance aside, I prefer Zend_Config_Ini from Zend Framework.
You can overload sections, keep the values read-only in memory, and more:
http://framework.zend.com/manual/en/zend.config.adapters.ini.html
I'm trying to find out why the use of global is considered to be bad practice in python (and in programming in general). Can somebody explain? Links with more info would also be appreciated.
This has nothing to do with Python; global variables are bad in any programming language.
However, global constants are not conceptually the same as global variables; global constants are perfectly harmless. In Python the distinction between the two is purely by convention: CONSTANTS_ARE_CAPITALIZED and globals_are_not.
The reason global variables are bad is that they enable functions to have hidden (non-obvious, surprising, hard to detect, hard to diagnose) side effects, leading to an increase in complexity, potentially leading to Spaghetti code.
However, sane use of global state is acceptable (as is local state and mutability) even in functional programming, either for algorithm optimization, reduced complexity, caching and memoization, or the practicality of porting structures originating in a predominantly imperative codebase.
All in all, your question can be answered in many ways, so your best bet is to just google "why are global variables bad". Some examples:
Global Variables Are Bad - Wiki Wiki Web
Why is Global State so Evil? - Software Engineering Stack Exchange
Are global variables bad?
If you want to go deeper and find out what side effects are all about, and many other enlightening things, you should learn Functional Programming:
Side effect (computer science) - Wikipedia
Why are side-effects considered evil in functional programming? - Software Engineering Stack Exchange
Functional programming - Wikipedia
Yes, in theory, globals (and "state" in general) are evil. In practice, if you look into your python's packages directory you'll find that most modules there start with a bunch of global declarations. Obviously, people have no problem with them.
Specifically in Python, a global's visibility is limited to its module, so there are no "true" globals that affect the whole program, which makes them far less harmful. Another point: there is no const, so when you need a constant you have to use a global.
In my practice, if I happen to modify a global in a function, I always declare it with global, even if there is technically no need for that, as in:
cache = {}

def foo(args):
    global cache
    cache[args] = ...
This makes globals' manipulations easier to track down.
A personal opinion on the topic is that having global variables being used in a function logic means that some other code can alter the logic and the expected output of that function which will make debugging very hard (especially in big projects) and will make testing harder as well.
Furthermore, if you consider other people reading your code (open-source community, colleagues etc) they will have a hard time trying to understand where the global variable is being set, where has been changed and what to expect from this global variable as opposed to an isolated function that its functionality can be determined by reading the function definition itself.
(Probably) Violating Pure Function definition
I believe that a clean and (nearly) bug-free code should have functions that are as pure as possible (see pure functions). A pure function is the one that has the following conditions:
The function always evaluates the same result value given the same argument value(s). The function result value cannot depend on any hidden information or state that may change while program execution proceeds or between different executions of the program, nor can it depend on any external input from I/O devices (usually—see below).
Evaluation of the result does not cause any semantically observable side effect or output, such as mutation of mutable objects or output to I/O devices.
Having global variables is violating at least one of the above if not both as an external code can probably cause unexpected results.
Another clear definition of pure functions: "Pure function is a function that takes all of its inputs as explicit arguments and produces all of its outputs as explicit results." [1]. Having global variables violates the idea of pure functions since an input and maybe one of the outputs (the global variable) is not explicitly being given or returned.
(Probably) Violating Unit testing F.I.R.S.T principle
Further on that, if you consider unit testing and the F.I.R.S.T principle (Fast, Independent, Repeatable, Self-Validating and Timely tests), global variables will probably violate the Independent principle (which means that tests don't depend on each other).
A global variable is (not always, but in most of the cases I have seen so far) used to prepare and pass results to other functions. This violates the principle as well: if the global variable is used that way (i.e. the global variable used in function X has to be set in function Y first), then to unit test function X you have to run function Y first.
Globals as constants
On the other hand, and as other people have already mentioned, using a global variable as a "constant" can be slightly better, since the language does not support constants. However, I always prefer working with classes and having the "constants" as class members, and not using a global variable at all. If you have code in which two different classes need to share a global variable, then you probably need to refactor your solution and make your classes independent.
I don't believe that globals shouldn't be used. But if they are used the authors should consider some principles (the ones mentioned above perhaps and other software engineering principles and good practices) for a cleaner and nearly bug-free code.
They are essential, the screen being a good example. However, in a multithreaded environment or with many developers involved, the question often arises in practice: who (erroneously) set or cleared it? Depending on the architecture, that analysis can be costly and may be needed often. While reading a global variable can be fine, writing to it must be controlled, for example by a single thread or a thread-safe class. Hence, global variables raise the fear of high development costs caused by their consequences, which is why they are considered evil. Therefore, in general, it's good practice to keep the number of global variables low.
I designed a PHP 5.5+ framework comprised of more than 750 different classes to make both web applications and websites.
I would like to, ideally, be able to reduce its size by producing a version of itself containing just the bare essential files and resources needed for a given project (whether it's a website or a web application).
What I want to do is to be able to:
reduce the amount of traits, classes, constants and functions to the bare essential per project
compress the code files to achieve a lesser deployment size and faster execution (if possible)
So far, I've got the second part completed. But the most important part is the first, and that's where I'm having problems. I have a function making use of get_declared_classes() and get_declared_traits(), get_defined_constants() and get_defined_functions() to get the full list of user-defined classes, traits, functions and constants. But it gives me pretty much EVERYTHING and that's not what I want.
Is there a way to get all defined classes, functions and constants (no need for traits as I could run class_uses() on every class and get the list of traits in use by that class) for a single given script?
I know there's the token_get_all() function, but I tried it with no luck (or maybe I'm using it the wrong way).
Any hint? Any help would be greatly appreciated :)
You can use PHP Parser for this. It constructs abstract syntax trees based on the files you supply to it. Then you can analyze its output for each file, and produce a report usable to you.
Other than that, you can use the token_get_all() approach you've mentioned already, and write a small parser yourself. Depending on your project, this might be easier or more difficult. For example, do you use a lot of new X() constructs, or do you tend to pass dependencies via constructors?
Unfortunately, these are about the only viable choices you have, since PHP is a dynamically typed language.
If you use dependency injection, however, you might want to take a look at your DI framework's internal cache files, which often contain such dependency maps. If you don't use such a framework, I recommend starting to, especially since your project is big and that's where dependency injection excels. PHP-DI, one such framework, proved successful in some of my mid-size projects (25k SLOC).
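As a rough idea of what the token_get_all() route could look like, here is a small sketch. listDeclarations() is a hypothetical helper, not part of PHP. It only looks at the name that follows a class, trait, interface or function keyword, so it reports methods together with plain functions and does not resolve namespaces or "use function" imports.

```php
<?php
/**
 * Collect the names that directly follow the `class`, `trait`, `interface`
 * and `function` keywords in a piece of PHP source code.
 * Hypothetical helper for illustration; methods count as functions here.
 */
function listDeclarations(string $source): array
{
    $wanted = [
        T_CLASS     => 'classes',
        T_TRAIT     => 'traits',
        T_INTERFACE => 'interfaces',
        T_FUNCTION  => 'functions',
    ];
    $found  = ['classes' => [], 'traits' => [], 'interfaces' => [], 'functions' => []];
    $tokens = token_get_all($source);
    $count  = count($tokens);

    for ($i = 0; $i < $count; $i++) {
        $token = $tokens[$i];
        if (!is_array($token) || !isset($wanted[$token[0]])) {
            continue;
        }
        // Skip the whitespace between the keyword and what follows it.
        $j = $i + 1;
        while ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_WHITESPACE) {
            $j++;
        }
        // Closures, anonymous classes and Foo::class are not followed by a name.
        if ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_STRING) {
            $found[$wanted[$token[0]]][] = $tokens[$j][1];
        }
    }

    return $found;
}

// Sample input; in practice you would pass file_get_contents($path).
$source = <<<'PHP'
<?php
trait Loggable {}
class Mailer { use Loggable; public function send() {} }
function helper() { return function () {}; }
PHP;

print_r(listDeclarations($source));
```

For the sample input this finds the class Mailer, the trait Loggable and the functions send and helper, and it skips the closure. For anything beyond a quick inventory, a real parser is the more robust choice.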
Who knows? Maybe refactoring your project to use DI will let you accomplish the task you want without even getting to know all the dependencies. One thing I'm sure of is that it will help you maintain it.
In my application architecture I want to replace my globals with something that ain't gonna burn most of the developer's eyes, because I am using globals like this,
define('DEVELOPMENT_ENVIRONMENT', true);
// Shorten DIRECTORY_SEPARATOR global,
define('DS', DIRECTORY_SEPARATOR);
// Set full path to the document root
define('ROOT', realpath(dirname(__FILE__)) . DS);
how could I prevent this? I tried creating a class that reads an xml file, but this will give me a longer code like this
$c = new Config();
if($c->devmode === TRUE) {}
or maybe something like this
$c = new Config();
echo $c->baseurl;
Any better ways to do this?
I think questions like yours can not be generally answered but they probably deserve an answer anyway. It's just that there is not the one golden rule or solution to deal with this.
At the most basic level, I think the problem you describe is the context an application runs in. Seen from a human perspective this has many facets; just take this one constant:
define('DEVELOPMENT_ENVIRONMENT', true);
Although it is quite simple and easily introduced, it comes at a high price. If it is already part of your application, first try to understand what the implications are.
You have one application codebase, and somewhere in it (concretely, everywhere the constant is used) there are branches of your code that are executed depending on whether this constant is TRUE or FALSE.
This on its own is problematic because such code tends to become complex and hard to debug. So regardless of how you do it (constant, variable, function, class), you should first of all reduce and prevent the usage of such constructs.
And honestly, using a (global) constant does not look that wrong to me. Especially compared with the alternatives, it is the most preferable one in my eyes because it lies less and is not complicated but rather straightforward. You could, however, turn this into a less dynamic constant in current PHP versions by using the const keyword to declare it:
const DEVELOPMENT_ENVIRONMENT = TRUE;
This is one facet of this little line of code. Another one is the low level of abstraction it comes with. If you want to define environments for the application, saying that a development environment is true or false is ambiguous. Instead you normally have an environment which can be of different types:
const ENVIRONMENT_UNSPECIFIED = 0;
const ENVIRONMENT_DEVELOPMENT = 1;
const ENVIRONMENT_STAGING = 2;
const ENVIRONMENT_LIVE = 3;
const ENVIRONMENT = ENVIRONMENT_DEVELOPMENT;
However, this little example is just meant to visualize what I mean and to make it a little less ambiguous. It does not solve the general problem outlined above, nor the following one:
You introduce context to your application at the global level. That means any line of code inside a component (function, class) that relates to anything global (here: DEVELOPMENT_ENVIRONMENT) cannot be decoupled from the global state any longer. That means you've written code that only works inside that application's global context. This stands in your way if you want to write re-usable software components. Re-usability does not only mean a second application; it already matters in testing and debugging, or just in the next revision of your software. As you can imagine, that can stand in your own way pretty fast, or let's say faster than you want.
So the problem here is less the constant on its own and more the reliance on the single context the code will run in, or better worded, on global static state. The goal to aim for when you want to change things here for the better is to reduce this global static state. This is important when you're looking for alternatives because it will help you make better decisions.
For example, instead of introducing a set of constants as I did in the last code example, find the places where you make use of DEVELOPMENT_ENVIRONMENT and think about why you put it there and whether it is possible to remove it from there. So first think about whether it is needed at all (these environment flags are often a smell: once needed for some quick debugging, or because it was thought "oh how practical", and then rotting in the code over weeks of no use). After you've considered whether it is needed and concluded that it is, you need to find out why it is needed at that place. Does it really belong there? Can't it be turned into a parameter, as you should do with anything that provides context?
Normally objects by definition ship with their own context. If you've got a logger that behaves differently in development than in live, this should be a configuration and not a decision inside the application code somewhere. If your application always has a logger, inject it. The application code just logs.
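A minimal sketch of what "inject it, the application code just logs" could look like. The Logger interface, both logger classes, OrderService and the APP_ENV variable name are all hypothetical and only serve the illustration.

```php
<?php
// Hypothetical logging contract for this sketch.
interface Logger
{
    public function log(string $message): void;
}

// Verbose logger, e.g. for development.
final class EchoLogger implements Logger
{
    public function log(string $message): void
    {
        echo '[debug] ', $message, PHP_EOL;
    }
}

// Silent logger, e.g. for live systems.
final class NullLogger implements Logger
{
    public function log(string $message): void
    {
        // intentionally does nothing
    }
}

// Application code: no environment check, it just logs.
final class OrderService
{
    private $logger;

    public function __construct(Logger $logger)
    {
        $this->logger = $logger;
    }

    public function place(int $orderId): string
    {
        $this->logger->log("placing order $orderId");
        return "order-$orderId";
    }
}

// Only the bootstrap code knows about the environment.
$environment = getenv('APP_ENV') ?: 'live'; // APP_ENV is an assumed variable name
$logger = $environment === 'development' ? new EchoLogger() : new NullLogger();

$service = new OrderService($logger);
echo $service->place(42), PHP_EOL;
```

The decision about which logger to use is made once, as configuration, and OrderService can be tested with either implementation.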
So as you can imagine, it totally depends on many different things how and when you can prevent this. I can only suggest you to find out now, to reduce the overall usage.
There are some practical tips on the way for common scenarios we face in applications. For the "root-path problem" you can use relative paths in conjunction with magic constants like __DIR__. For example if the front-endpoint in the webroot (e.g. index.php) needs to point to the private application directory hosting the code:
<?php
/**
* Turbo CMS - Build to race your website's needs to the win.
*
* Webroot Endpoint
*/
require(__DIR__ . '/../private/myapp/bootstrap.php');
The application then normally knows how it works and where to find files relative to itself. And if you return some application context object (and this must not be global(!)), you can inject the webroot folder as well:
<?php
/**
* Turbo CMS - Build to race your website's needs to the win.
*
* Webroot Endpoint
*/
/* @var $turboAppContext Turbo\App\WebappContext */
$turboAppContext = require(__DIR__ . '/../private/myapp/bootstrap.php');
$turboAppContext->setWebroot(__DIR__);
Now the context of your webserver configures the application defaults. This is actually a crucial part, because it touches a field of context inside your application (but not in every component) that is immanent. You cannot prevent this context. It's like leaky abstractions. There is an environment (known as "the system") your application runs in. Even so, you want to make it as independent as possible.
Like with the DEVELOPMENT_ENVIRONMENT constant above, these points are crucial to reduce and to find the right place for them. Also to only allow a very specific layer to set the input values (to change context) and only some high-level layers of your software to access these values. The largest part of your code-base should work without any of these parameters. And you can only control the access by passing around parameters and by not using global. Then code on a level that is allowed to access a certain setting (in the best meaning of the word), can access it - everything else does not have that parameter. To get this safety, you need to kill globals as best as possible.
E.g. the functionality to redirect to another location needs the base URL of the current request. It should not fetch it from server variables directly but from a request object that abstracts access to the server variables, so that you can replace things here (e.g. when you're moving the application behind a front proxy; not always the best example, but this can happen). If you have hard-coded your software against $_SERVER, you would then need to modify $_SERVER at some stages of your software. You don't want that; instead you move away from this (again) global static state (here via a superglobal variable; spot those next to your global constants) by using objects that represent a certain functionality your application needs.
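To illustrate the redirect case, here is a minimal sketch of such a request object. The Request class and the redirectLocation() function are hypothetical and deliberately tiny; the way the base URL is derived (HTTPS and HTTP_HOST keys) is a simplification.

```php
<?php
// Hypothetical, minimal request abstraction for this sketch.
final class Request
{
    private $server;

    // The server data is passed in, so nothing here reads $_SERVER directly.
    public function __construct(array $server)
    {
        $this->server = $server;
    }

    // The one place that touches the superglobal.
    public static function fromGlobals(): self
    {
        return new self($_SERVER);
    }

    public function baseUrl(): string
    {
        $https = !empty($this->server['HTTPS']) && $this->server['HTTPS'] !== 'off';
        $host  = $this->server['HTTP_HOST'] ?? 'localhost';

        return ($https ? 'https' : 'http') . '://' . $host;
    }
}

// The redirect logic depends on the request object, not on global state.
function redirectLocation(Request $request, string $path): string
{
    return $request->baseUrl() . '/' . ltrim($path, '/');
}

// In a test (or behind a proxy) the input is just an array you control.
$request = new Request(['HTTP_HOST' => 'example.org', 'HTTPS' => 'on']);
echo redirectLocation($request, '/login'), PHP_EOL; // https://example.org/login
```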
As long as we're talking about web-applications, take a look at Symfony's request and response abstraction (which is also used by many other projects which makes your application even more open and fluent). But this is just a side-note.
So whatever you want to base your decision on, do not get misguided by how many letters to type. The benefit of this is very short-sighted when you start to consider the overall letters you need to type when developing your software.
Instead, understand where you introduce context, where you can prevent that and where you can't. For the places where you can't, consider making context a parameter instead of a "property" of the code. More fluent code gives you more re-usable code, better tests and fewer hassles when you move to another platform.
This is especially important if you have a large installation base. Code on these bases with global static state is a mess to maintain: Late releases, crawling releases, disappointed developers, burdensome development. There are lessons to learn, and the lessons are to understand which implications certain features of the language have and when to use them.
The best rule I can give (and I'm not an academic developer at all) is to consider global as expensive. It can be a superb shortcut to establish something, but you should know the price it comes with. And the field is wide, because this does not only apply to object-oriented programming but to procedural code as well. For object-oriented programming there is a lot of educational material that offers different ways to prevent global static state, so I would even say the situation there is quite well documented. But PHP is not purely OOP, so it's not always as easy as having an object at hand; you might first need to introduce some (but then, see the request and response abstractions that are already available).
So the best suggestion I can give to improve your code in the context of this question is: stick to the constant(s) (maybe with the const keyword to make them less dynamic and more constant-like) and then just try to remove them. As written in the comments already, PHP does a very fine job with cross-platform file access; just use / as the directory separator, this is well understood and works very well. Try not to introduce a root-path constant anyway. It should not be a constant for the code you write but a parameter on some level, because it can change, for example in sub-requests or sub-apps, which can save you a lot of time before re-inventing the wheel again.
The hard task is to keep things simple. But it's worth it.
Just put some server variable to the vhost config and prepare different config files for each option. Using apache it would be (you'll need mod_env module):
SetEnv ENVIRONMENT dev
And then in index just use something like:
$configFileName = getenv ('ENVIRONMENT').'.ini';
Now just load this file and determine all the application behaviour from the values given. Of course you can take this further if you use some framework, but this would be a good start.
You can encapsulate your constants in a class and then retrieve them through static methods:
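A slightly more defensive sketch of the same idea. The ENVIRONMENT variable comes from the answer above; the whitelist, the config directory and the fallback to "live" are assumptions added for the example, so that an unexpected environment value cannot select an arbitrary file.

```php
<?php
// Map an environment name to a config file path. The allowed names and the
// "live" fallback are assumptions made for this sketch.
function configFileFor(string $environment, string $directory): string
{
    $allowed = ['dev', 'staging', 'live'];
    if (!in_array($environment, $allowed, true)) {
        $environment = 'live'; // unknown values fall back to the live settings
    }

    return $directory . '/' . $environment . '.ini';
}

// ENVIRONMENT is set by the web server (SetEnv) or the shell.
$environment = getenv('ENVIRONMENT') ?: 'live';
$file = configFileFor($environment, __DIR__ . '/config'); // assumed directory

// parse_ini_file() returns an array, or false if the file cannot be parsed.
$config = is_readable($file) ? parse_ini_file($file, true) : false;
if ($config === false) {
    $config = []; // no file available: continue with an empty configuration
}

echo 'Using ', basename($file), PHP_EOL;
```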
if(Config::devMode()) {}
echo Config::baseUrl();
This way you save a line and some memory because you don't need to instantiate an object.
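For completeness, a minimal sketch of such a static Config class. The values are placeholders. Note that this is still global state in the sense discussed in the other answers, just with a shorter call site.

```php
<?php
// Hypothetical static-only configuration holder; the values are placeholders.
final class Config
{
    const DEV_MODE = true;
    const BASE_URL = 'http://localhost/';

    // Private constructor: the class is never instantiated.
    private function __construct()
    {
    }

    public static function devMode(): bool
    {
        return self::DEV_MODE;
    }

    public static function baseUrl(): string
    {
        return self::BASE_URL;
    }
}

if (Config::devMode()) {
    echo 'development mode', PHP_EOL;
}
echo Config::baseUrl(), PHP_EOL;
```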
What is the best way to deal with "utility" functions in an OOP PHP framework? Right now, we just have a file with several functions that are needed throughout the system. (For example, a distribute() function which accepts a value and an array, and returns an array with the value distributed in the same proportions and same keys as the input array.)
I have always felt "dirty" using that because it's not object-oriented at all. Is it better practice to move these into various classes as static methods, or is that just a semantic workaround? Or is there just going to be a level in a framework where some stuff is going to fall outside of the OOP structure?
I tend to make a Util class that contains only static methods, has no attributes, and is not inherited from. Essentially, it acts as a "namespace" for a bunch of utility functions. I will allow this class to grow in size, but will occasionally split off methods into their own classes if it is clear that those methods are designed to work only with certain kinds of data, or if it is clear that a group of related methods should be grouped into a class along with, perhaps, some attributes.
I think it's perfectly OK to deviate from purely OOP practices so long as the code that deviates is well-organized and is not creating architectural flaws in your system that make it harder to understand and maintain.
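As an illustration, here is what the distribute() function from the question might look like as a static method on such a class. This is a sketch based only on the one-sentence description in the question; the error handling for a zero total is an assumption.

```php
<?php
// Static-only utility class acting as a "namespace" for helper functions.
final class Util
{
    // Private constructor: there is nothing to instantiate.
    private function __construct()
    {
    }

    /**
     * Distribute $value over the keys of $weights, in the same proportions
     * as the weights. Throws if the weights sum to zero (assumed behaviour).
     */
    public static function distribute(float $value, array $weights): array
    {
        $total = array_sum($weights);
        if ($total == 0) {
            throw new InvalidArgumentException('Weights must not sum to zero.');
        }

        $result = [];
        foreach ($weights as $key => $weight) {
            $result[$key] = $value * $weight / $total;
        }

        return $result;
    }
}

// 100 split in the proportions 1:3 gives 25 for "a" and 75 for "b".
print_r(Util::distribute(100, ['a' => 1, 'b' => 3]));
```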
I've always been more pragmatic about questions like these.
If you want to go full-OOP, you should obviously stick these into classes. However, these classes are only going to be container classes, because they don't really represent objects of any kind.
Also: using classes would require you to either have an instance of that class, using the singleton pattern or declaring every function static. The first one is slower (okay, might not be that much, but in a large framework things like that get large, too - especially in interpreted languages like PHP), while the second and third ones are just plain useless and simply an OOP wrapper for a set of functions (especially the third approach).
EDIT: Feel free to prove me wrong. I might be. I'm not too experienced and always saw it that way, but I might be wrong.
I always think of utility functions as extensions of the standard php functions. They are not object oriented because you don't really get any benefit from making them OO.
function foo () {
    global $var;
    // rest of code
}
In my small PHP projects I usually go the procedural way. I generally have a variable that contains the system configuration, and when I need to access this variable in a function, I do global $var;.
Is this bad practice?
When people talk about global variables in other languages it means something different to what it does in PHP. That's because variables aren't really global in PHP. The scope of a typical PHP program is one HTTP request. Session variables actually have a wider scope than PHP "global" variables because they typically encompass many HTTP requests.
Often (always?) you can call member functions in methods like preg_replace_callback() like this:
preg_replace_callback('!pattern!', array($obj, 'method'), $str);
See callbacks for more.
The point is that objects have been bolted onto PHP and in some ways lead to some awkwardness.
Don't concern yourself overly with applying standards or constructs from different languages to PHP. Another common pitfall is trying to turn PHP into a pure OOP language by sticking object models on top of everything.
Like anything else, use "global" variables, procedural code, a particular framework and OOP because it makes sense, solves a problem, reduces the amount of code you need to write or makes it more maintainable and easier to understand, not because you think you should.
Global variables if not used carefully can make problems harder to find. Let's say you request a php script and you get a warning saying you're trying to access an index of an array that does not exist in some function.
If the array you're trying to access is local to the function, you check the function to see if you have made a mistake there. It might be a problem with an input to the function so you check the places where the function is called.
But if that array is global, you need to check all the places where you use that global variable, and not only that, you have to figure out in what order those references to the global variable are accessed.
If you have a global variable in a piece of code it makes it difficult to isolate the functionality of that code. Why would you want to isolate functionality? So you can test it and reuse it elsewhere. If you have some code you don't need to test and won't need to reuse then using global variables is fine.
I agree with the accepted answer. I would add two things:
Use a prefix so you can immediately identify it as global (e.g. $g_)
Declare them in one spot, don't go sprinkling them all around the code.
Who can argue against experience, college degrees, and software engineering? Not me. I would only say that in developing object-oriented single page PHP applications, I have more fun when I know I can build the entire thing from scratch without worrying about namespace collisions. Building from scratch is something many people do not do anymore. They have a job, a deadline, a bonus, or a reputation to care about. These types tend to use so much pre-built code with high stakes, that they cannot risk using global variables at all.
It may be bad to use global variables, even if they are only used in the global area of a program, but let's not forget about those who just want to have fun and make something work.
If that means using a few variables (< 10) in the global namespace, that only get used in the global area of a program, so be it. Yes, yes, MVC, dependency injection, external code, blah, blah, blah, blah. But, if you have contained 99.99% of your code into namespaces and classes, and external code is sandboxed, the world will not end (I repeat, the world will not end) if you use a global variable.
Generally, I would not say using global variables is bad practice. I would say that using global variables (flags and such) outside of the global area of a program is asking for trouble and (in the long run) ill-advised because you can lose track of their states rather easily. Also, I would say that the more you learn, the less reliant you will be on global variables because you will have experienced the "joy" of tracking down bugs associated with their use. This alone will incentivize you to find another way to solve the same problem. Coincidentally, this tends to push PHP people in the direction of learning how to use namespaces and classes (static members, etc ...).
The field of computer science is vast. If we scare everyone away from doing something because we label it bad, then they lose out on the fun of truly understanding the reasoning behind the label.
Use global variables if you must, but then see if you can solve the problem without them. Collisions, testing, and debugging mean more when you understand intimately the true nature of the problem, not just a description of the problem.
Reposted from the ended SO Documentation Beta
We can illustrate this problem with the following pseudo-code
function foo() {
    global $bob;
    $bob->doSomething();
}
Your first question here is an obvious one
Where did $bob come from?
Are you confused? Good. You've just learned why globals are confusing and considered a bad practice. If this were a real program, your next bit of fun is to go track down all instances of $bob and hope you find the right one (this gets worse if $bob is used everywhere). Worse, if someone else goes and defines $bob (or you forgot and reused that variable) your code can break (in the above code example, having the wrong object, or no object at all, would cause a fatal error). Since virtually all PHP programs make use of code like include('file.php'); your job maintaining code like this becomes exponentially harder the more files you add.
How do we avoid Globals?
The best way to avoid globals is a philosophy called Dependency Injection. This is where we pass the tools we need into the function or class.
function foo(\Bar $bob) {
    $bob->doSomething();
}
This is much easier to understand and maintain. There's no guessing where $bob was set up because the caller is responsible for knowing that (it's passing us what we need to know). Better still, we can use type declarations to restrict what's being passed. So we know that $bob is either an instance of the Bar class, or an instance of a child of Bar, meaning we know we can use the methods of that class. Combined with a standard autoloader (available since PHP 5.3), we can now go track down where Bar is defined. PHP 7.0 or later includes expanded type declarations, where you can also use scalar types (like int or string).
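A short sketch of both kinds of type declaration in action. Bar comes from the example above; SpecialBar and the $times parameter are made up for the illustration.

```php
<?php
class Bar
{
    public function doSomething(): string
    {
        return 'done';
    }
}

// Hypothetical child class: accepted wherever a Bar is required.
class SpecialBar extends Bar
{
}

// $bob must be a Bar (or a child of Bar); $times must be an int (PHP 7.0+).
function foo(Bar $bob, int $times): string
{
    return str_repeat($bob->doSomething() . ' ', $times);
}

echo foo(new SpecialBar(), 2), PHP_EOL;

// Passing anything else fails immediately, at the call site.
try {
    foo(new stdClass(), 2);
} catch (TypeError $e) {
    echo 'rejected: wrong type for $bob', PHP_EOL;
}
```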
As:
global $my_global;
$my_global = 'Transport me between functions';
Equals $GLOBALS['my_global']
is bad practice (like WordPress's $pagenow)... hmmm
Consider this:
$my-global = 'Transport me between functions';
is a PHP error. But:
$GLOBALS['my-global'] = 'Transport me between functions';
is NOT an error. Hyphens will not clash with "common" user-declared variables, like $pagenow. And using UPPERCASE indicates a superglobal in use, which is easy to spot in code or to track with find-in-files.
I use hyphens if I'm too lazy to build classes of everything for a single solution, like:
$GLOBALS['PREFIX-MY-GLOBAL'] = 'Transport me ... ';
But in cases of wider use, I use ONE global as an array:
$GLOBALS['PREFIX-MY-GLOBAL']['context-something'] = 'Transport me ... ';
$GLOBALS['PREFIX-MY-GLOBAL']['context-something-else']['numbers'][] = 'Transport me ... ';
The latter is, for me, good practice for "cola light" objectives or uses, instead of cluttering the code with singleton classes each time just to "cache" some data. Please leave a comment if I'm wrong or missing something stupid here...