What is the best way to store configuration variables in PHP? - php

I need to store a bunch of configuration information in PHP.
I have considered the following....
// Doesn't seem right.
$mysqlPass = 'password';
// Seems slightly better.
$config = array(
'mysql_pass' => 'password'
);
// Seems dangerous having this data accessible by anything. but it can't be
// changed via this method.
define('MYSQL_PASSWORD', 'password');
// Don't know if this is such a good idea.
class Config
{
const static MYSQL_PASSWORD = 'password';
}
This is all I have thought of so far. I intend to import this configuration information into my application with require /config.inc.php.
What works for you with regard to storing configuration data, and what are best practices concerning this?

I've always gone with option #2 and just ensure that no one but the owner has ANY sort of access to it. It's the most popular method among PHP applications like Joomla, vBulletin, Gallery, and numerous others.
First method is too messy to me (readability) and the third is WAY too dangerous to do. I've never thought about the Class method, so someone else can provide their input on that one. But I guess it's fine so long as the right access is used on the class' usage.
Example..
define('EXAMPLE1', "test1"); // scenario 1
$example2 = "test2"; // scenario 2
function DealWithUserInput($input)
{
return eval($input);
}
Now this example of code is really dumb, but just an example. Consider what could be returned by the function depending on which scenario the user could try to use in their input.
Scenario 2 would only cause an issue if you made it a global within the function. Otherwise it's out of scope and unreachable.

I'd say it also depends of userbase a bit. If configurations has to be very user friendly or user has to have ability to change config via web etc.
I use Zend Config Ini for this and other settings are stored in SQL DB.

I generally use the second method... When handling database connections I generally open a connection at the beginning of the request, then close it at the end. I have a function that establishes the connection, then removes the username/password from the global array (with the unset() function), This prevents other parts of the system from accessing the "sensitive" mysql connection data.

I'm also with option 2 for most config values. If you were going to implement the Class then I would tie the specific values to the Class that it affects instead of a general config Class.
In your example, your Class would be for database connections and an instance would save the password, db_name, etc. This would encapsulate the data properly and also provide an easy means to create multiple connections if that was ever needed.

Related

A class property gets cleared at the end of a method execution

I have a structure like this:
class My_Class extends Another_Class {
private $my_data = array();
public function append_to_my_data() {
$my_data[] = $this->get_result();
}
}
The append_to_my_data method is called from an AJAX recursion, it is supposed to handle an array by small pieces and combine results by appending them to the $my_data property. But it seems that on each call of this method, the $my_data property is empty again and does not contain the results from the previous call.
Why does it happen? Should I look for another method of storing this data? I would not like to store the data in my JavaScript recursion because it might be quite deep and the data might get quite large.
By AJAX, I assume you mean Javascript.
This tells me one thing, you try to persist data between http requests from your client app (the Javascript code) and your server app code (PHP).
Be aware the JS/client runs on the user's machine in the browser, and the PHP code runs on your own server machine. Two totaly different machines with separate memory/hard disk and CPU.
To persist data on the server between different http requests there are many ways/approaches you can use.
The two MOST common ways are to use php sessions and/or MySQL database, there are many more ways to do it, the two above are the most common ones.
And, to access an object property, you will need to use the $this
i.e.:
$this->my_data[] = $this->get_result();
and not
$my_data[] = $this->get_result();

Is using superglobals directly good or bad in PHP?

So, I don't come from a huge PHP background—and I was wondering if in well formed code, one should use the 'superglobals' directly, e.g. in the middle of some function say $_SESSION['x'] = 'y'; or if, like I'd normally do with variables, it's better to send them as arguments that can be used from there, e.g:
class Doer {
private $sess;
public function __construct(&$sess) {
$this->sess =& $sess;
}
}
$doer = new Doer($_SESSION);
and then use the Doer->sess version from within Doer and such. (The advantage of this method is that it makes clear that Doer uses $_SESSION.)
What's the accepted PHP design approach for this problem?
I do like to wrap $_SESSION, $_POST, $_GET, and $_COOKIE into OOP structures.
I use this method to centralize code that handles sanitation and validation, all of the necessary isset () checks, nonces, setcookie parameters, etc. It also allows client code to be more readable (and gives me the illusion that it's more maintainable).
It may be difficult to enforce use of this kind of structure, especially if there are multiple coders. With $_GET, $_POST, and $_COOKIE (I believe), your initialization code can copy the data, then destroy the superglobal. Maybe a clever destructor could make this possible with $_SESSION (wipe $_SESSION on load, write it back in the destructor), though I haven't tried.
I don't usually use any of these enforcement techniques, though. After getting used to it, seeing $_SESSION in code outside the session class just looks strange, and I mostly work solo.
EDIT
Here's some sample client code, in case it helps somebody. I'm sure looking at any of the major frameworks would give you better ideas...
$post = Post::load ();
$post->numeric ('member_age');
$post->email ('member_email');
$post->match ('/regex/','member_field');
$post->required ('member_first_name','member_email');
$post->inSet ('member_status',array('unemployed','retired','part-time','full-time'));
$post->money ('member_salary');
$post->register ('member_last_name'); // no specific requirements, but we want access
if ($post->isValid())
{
// do good stuff
$firstName = $post->member_first_name;
}
else
{
// do error stuff
}
Post and its friends all derive from a base class that implements the core validation code, adding their own specific functionality like form tokens, session cookie configuration, whatever.
Internally, the class holds a collection of valid data that's extracted from $_POST as the validation methods are called, then returns them as properties using a magic __get method. Failed fields can't be accessed this way. My validation methods (except required) don't fail on empty fields, and many of them use func_get_args to allow them to operate on multiple fields at once. Some of the methods (like money) automatically translate the data into custom value types.
In the error case, I have a way to transform the data into a format that can be saved in the session and used to pre-populate the form and highlight errors after redirecting to the original form.
One way to improve on this would be to store the validation info in a Form class that's used to render the form and power client-side validation, as well as cleaning the data after submission.
Modifying the contents of the superglobals is considered poor practice. While there's nothing really wrong with it, especially if the code is 100% under your control, it can lead to unexpected side effects, especially when you consider mixed-source code. For instance, if you do something like this:
$_POST['someval'] = mysql_real_escape_string($_POST['someval']);
you might expect that everywhere PHP makes that 'someval' available would also get changed, but this is not the case. The copy in $_REQUEST['someval'] will be unchanged and still the original "unsafe" version. This could lead to an unintentional injection vulnerability if you do all your escaping on $_POST, but a later library uses $_REQUEST and assumes it's been escaped already.
As such, even if you can modify them, it's best to treat the superglobals as read-only. If you have to mess with the values, maintain your own parallel copies and do whatever wrappers/access methods required to maintain that copy.
I know this question is old but I'd like to add an answer.
mario's classes to handle the inputs is awesome.
I much prefer wrapping the superglobals in some way. It can make your code MUCH easier to read and lead to better maintainability.
For example, there is some code at my current job the I hate! The session variables are used so heavily that you can't realistically change the implementation without drastically affecting the whole site.
For example,
Let's say you created a Session class specific to your application.
class Session
{
//some nice code
}
You could write something like the following
$session = new Session();
if( $session->isLoggedIn() )
{
//do some stuff
}
As opposed to this
if( $_SESSION['logged'] == true )
{
//do some stuff
}
This seems a little trivial but it's a big deal to me. Say that sometime in the future I decide that I want to change the name of the index from 'logged' to 'loggedIn'.
I now have to go to every place in the app that the session variable is used to change this. Or, I can leave it and find someway to maintain both variables.
Or what if I want to check that that user is an admin user and is logged in? I might end up checking two different variables in the session for this. But, instead I could encapsulate it into one method and shorten my code.
This helps other programmers looking at your code because it becomes easier to read and they don't have to 'think' about it as much when they look at the code. They can go to the method and see that there is only ONE way to have a logged in user. It helps you too because if you wanted to make the 'logged' in check more complex you only have to go to one place to change it instead of trying to do global finds with your IDE and trying to change it that way.
Again, this is a trivial example but depending on how you use the session this route of using methods and classes to protect access could make your life much easier to live.
I would not recommend at all passing superglobal by reference. In your class, it's unclear that what you are modifying is a session variable. Also, keep in mind that $_SESSION is available everywhere outside your class. It's so wrong from a object oriented point of view to be able to modify a variable inside a class from outside that class by modifying a variable that is not related to the class. Having public attribute is consider to be a bad practice, this is even worst.
I found my way here while researching for my new PHP framework.
Validating input is REALLY important. But still, I do often find myself falling back to code like this:
function get( $key, $default=FALSE ){
return (isset($_GET[$key]) ? $_GET[$key]:$default);
}
function post( $key, $default=FALSE ){
return (isset($_POST[$key]) ? $_POST[$key]:$default);
}
function session( $key, $default=FALSE ){
return (isset($_SESSION[$key]) ? $_SESSION[$key]:$default);
}
Which I then use like this:
$page = get('p', 'start');
$first_name = post('first_name');
$last_name = post('last_name');
$age = post('age', -1);
I have found that since I have wildly different requirements for validation for different projects, any class to handle all cases would have to be incredibly large and complex. So instead I write validation code with vanilla PHP.
This is not good use of PHP.
get $_SESSION variables directly:
$id = $_SESSION['id'];
$hash = $_SESSION['hash'];
etc.

Efficient way to handle multiple HTMLPurifier configs

I'm using HTMLPurifier in a current project and I'm not sure about the most efficient way to go about handling multiple configs. For the most part the only major thing changing is the allowed tags.
Currently I have a private method, in each class using HTMLPurifier, that gets called when a config is needed and it creates one from the default. I'm not really happy with this as if 2 classes are using the same config, that's duplicating code, and what if a class needs 2 configs? It just feels messy.
The one upside to it, is it's lazy in the sense that it's not creating anything until it's needed. The various classes could be used and never have to purify anything. So I don't want to be creating a bunch of config objects that might not even be used.
I just recently found out that you can create a new instance of HTMLPurifier and directly set config options like so:
$purifier = new HTMLPurifier();
$purifier->config->set('HTML.Allowed', '');
I'm not sure if that's bad form at all or not, and if it isn't I'm not really sure of a good way to put it to use.
My latest idea was to create a config handler class that would just return an HTMLPurifier config for use in subsequent purify calls. At some point it could probably be expanded to allow the setting of configs, but just from the start I figured they'd just be hardcoded and run a method parameter through a switch to grab the requested config. Perhaps I could just send the stored purifier instance as an argument and have the method directly set the config on it like shown above?
This to me seems the best of the few ways I thought of, though I'm not sure if such a task warrants me creating a class to handle it, and if so, if I'm handling it in the best way.
I'm not too familiar with the inner workings of HTMLPurifier so I'm not sure if there are any better mechanisms in place for handling multiple configs.
Thanks for any insight anyone can offer.
Configuration objects have one important invariant: after they've been used to perform a purification, they cannot be edited. You might be interested in some convenience methods that the class has, in particular, inherit, and you can load an array of values using loadArray.
I just wrote up a quick, simple config handler class.
You can view / use it # http://gist.github.com/358187/
It takes an array of configs at initialization.
$configs = array(
'HTML.Doctype' => 'HTML 4.01 Strict',
'posts' => array(
'HTML.Allowed' => 'p,a[href],img[src]'
),
'comments' => array(
'HTML.Allowed' => 'p'
),
'default' => array(
'HTML.Allowed' => ''
)
);
Directives put directly into the array are global and will be applied to all configs. If the same directive is set inside a named config, the value in the named config will be used.
Then you can grab a config like so:
$purifierConfig = new purifierConfig($configs);
$commentsConfig = $purifierConfig->getConfig('comments');
or, shorthand:
$commentsConfig = $purifierConfig['comments'];
If the requested config does not exist or no config is specified in the call, the default config will be returned.
I'll probably flesh it out at some point, but for now this works.
I was looking for how people handled such things in their projects or if there were any built in mechanisms that made something like this more steamlined, but I suppose my question was a bit too narrow in scope.

What is the best way to represent configuration options internally?

So, I am looking at a number of ways to store my configuration data. I believe I've narrowed it down to 3 ways:
Just a simple variable
$config = array(
"database" => array(
"host" => "localhost",
"user" => "root",
"pass" => "",
"database" => "test"
)
);
echo $config['database']['host'];
I think that this is just too mutable, where as the configuration options shouldn't be able to be changed.
A Modified Standard Class
class stdDataClass {
// Holds the Data in a Private Array, so it cannot be changed afterwards.
private $data = array();
public function __construct($data)
{
// ......
$this->data = $data;
// .....
}
// Returns the Requested Key
public function __get($key)
{
return $this->data[$key];
}
// Throws an Error as you cannot change the data.
public function __set($key, $value)
{
throw new Exception("Tried to Set Static Variable");
}
}
$config = new stdStaticClass($config_options);
echo $config->database['host'];
Basically, all it does is encapsulates the above array into an object, and makes sure that the object can not be changed.
Or a Static Class
class AppConfig{
public static function getDatabaseInfo()
{
return array(
"host" => "localhost",
"user" => "root",
"pass" => "",
"database" => "test"
);
}
// .. etc ...
}
$config = AppConfig::getDatabaseInfo();
echo $config['host'];
This provides the ultimate immutability, but it also means that I would have to go in and manually edit the class whenever I wanted to change the data.
Which of the above do you think would be best to store configuration options in? Or is there a better way?
Of those 3 options, a static method is probably the best.
Really, though, "the best" is ultimately about what's easiest and most consistent for you to use. If the rest of your app isn't using any OO code then you might as well go with option #1. If you are ultimately wanting to write a whole db abstraction layer, option #2.
Without knowing something more about what your goals are and what the rest of your app looks like, it's kind of like asking someone what the best motor vehicle is -- it's a different answer depending on whether you're looking for a sports car, a cargo truck, or a motorcycle.
I'd go with whats behind door #3.
It looks easier to read and understand than #2, and seems to meet your needs better than #1.
Take a look at this question for ideas on storing the config data in a separate file:
Fastest way to store easily editable config data in PHP?
I'd use method #2 pulling the config data as an array from an external file.
The best way is that which fits your application best.
For a small app, it might be totally sufficient to use an array, even it is mutable. If no one is there to modify it except you, it doesn't have to be immutable.
The second approach is very flexible. It encapsulates data, but does not know anything about it. You can pass it around freely and consuming classes can take from it what they need. It is generic enough to be reused and it does not couple the config class to the concrete application. You could also use an interface with this or similar classes to allow for type hints in your method signatures to indicate a Config is required. Just don't name it stdDataClass, but name it by it's role: Config.
Your third solution is very concrete. It hardcodes a lot of assumptions about what your application requires into the class and it also makes it the responsibility of the class to know and provide this data through getters and setters. Depending on the amount of components requiring configuration, you might end up with a lot of specific getters. Chances are pretty good you will have to rewrite the entire thing for your next app, just because your next app has different components.
I'd go with the second approach. Also, have a look at Zend_Config, as it meets all your requirements already and let's you init the Config object from XML, Ini and plain arrays.

Passing database connection by reference in PHP

The question is if a database connection should be passed in by reference or by value?
For me I'm specifically questioning a PHP to MySQL connection, but I think it applies to all databases.
I have heard that in PHP when you pass a variable to a function or object, that it is copied in memory and therefore uses twice as much memory immediately. I have also heard that it's only copied once changes have been made to the value (such as a key being added/removed from an array).
In a database connection, I would think it's being changed within the function as the query could change things like the last insert id or num rows. (I guess this is another question: are things like num rows and insert id stored within the connection or an actual call is made back to the database?)
So, does it matter memory or speed wise if the connection is passed by reference or value? Does it make a difference PHP 4 vs 5?
// $connection is resource
function DoSomething1(&$connection) { ... }
function DoSomething2($connection) { ... }
A PHP resource is a special type that already is a reference in itself. Passing it by value or explicitly by reference won't make a difference (ie, it's still a reference). You can check this for yourself under PHP4:
function get_connection() {
$test = mysql_connect('localhost', 'user', 'password');
mysql_select_db('db');
return $test;
}
$conn1 = get_connection();
$conn2 = get_connection(); // "copied" resource under PHP4
$query = "INSERT INTO test_table (id, field) VALUES ('', 'test')";
mysql_query($query, $conn1);
print mysql_insert_id($conn1)."<br />"; // prints 1
mysql_query($query, $conn2);
print mysql_insert_id($conn2)."<br />"; // prints 2
print mysql_insert_id($conn1); // prints 2, would print 1 if this was not a reference
Call-time pass-by-reference is being depreciated,so I wouldn't use the method first described. Also, generally speaking, resources are passed by reference in PHP 5 by default. So having any references should not be required, and you should never open up more than one database connection unless you really need it.
Personally, I use a singleton-factory class for my database connections, and whenever I need a database reference I just call Factory::database(), that way I don't have to worry about multiple connections or passing/receiving references.
<?php
Class Factory
{
private static $local_db;
/**
* Open new local database connection
*
* #return MySql
*/
public static function localDatabase() {
if (!is_a(self::$local_db, "MySql")) {
self::$local_db = new MySql(false);
self::$local_db->connect(DB_HOST, DB_USER, DB_PASS, DB_DATABASE);
self::$local_db->debugging = DEBUG;
}
return self::$local_db;
}
}
?>
It isn't the speed you should be concerned with, but the memory.
In PHP 4, things like database connections and resultsets should be explicitly passed by reference. In PHP 5, this is done automatically, so you don't have to make it explicit.
BTW, singleton methods for creating database handles are a good idea: you can do $db = & Database::Connection(); and always get the correct handle. This saves you from using a global and the static method can do extra magic (like opening it automatically) for you. Just be careful of when your application scales enough that it needs multiple databases: then your magic function will have to know how to hand you back the correct one. IME this is not hugely difficult; the basic way to solve that is for the code layer that needs the DB handle to know how to ask for the correct one.
A database connection does not actually hold the underlying values, so you don't have to worry about losing assignments made inside a function. Metaphorically, you can think of a DB connection as, say, a runway number -- "OK, DB Connection 12 is cleared to be used for a query" -- The query and result set use the connection, and may need exclusive access for awhile, but the connection does not know anything about the underlying data.
A few people have said that you don't need to worry about this for PHP 5. This is incorrect, if you have a database OBJECT that you're using for all access. In that case, you do need to pass by reference, otherwise it instantiates a new DB object, which (often) creates a new connection to the database.
I discovered this using XDebug & WinCacheGrind, which kindly shows all the destructors that get called - in my case, a halfdozen or more database objects.
To clarify: The reason I point this out is that this is a common way of using Database connections, instead of the raw connection resource.
i don't really have a specific answer for php, but in general it would seem to me that you would want to pass this by reference if you are not explicitly sure that you encounter performance issues when passing by value.
Generally speaking, references are not faster in PHP. It's a common misconception, because they are semantically similar to C pointers, so people familiar with C often assume they work the same way. Not so. In fact, references are a tiny bit slower than copies, unless you actually assign to the variable (Which in any case is bad style, unless the variable is an object).
PHP has a mechanism called copy-on-write, which means that a variable isn't actually copied before it needs to. You can pass a huge data structure to a function; As long as it just reads from it, it makes no difference. A reference however, needs an additional entry in the internal registers, so it would actually take some extra processing (Though barely noticeable).

Categories