When working with PHP, I notice it's popular nowadays for class libraries to use lots of "prep" method calls that build up to an action. E.g.:
$db = new database();
$result = $db->select("*")
->from("table1")
->where(["field3" => "apple"])
->orderBy("field4")
->run();
I can see the benefits this gives you when using an IDE with autocomplete. But I also see advantages to putting this information into one array that gets passed at once. E.g.:
$queryParams = [
"fields" => "*",
"table" => "table1",
"constraint" => ["field3" => "apple"],
"sort" => "field4",
];
$db = new database();
$result = $db->select($queryParams);
The advantage I see is that the information is treated as one unit and is easy to pass around. It could be stored as JSON in a file or table, decoded, and processed with one statement. Also, it's less intermingled with PHP code: if I want to convert my project from one language/framework to another, it will be easier when a good part of the information that makes up my project is already in compatible JSON.
I understand that I could easily write up my own code that iterates over $queryParams and passes the values to the proper method calls, but I am curious why passing all the information a class needs to run an action at once is not good practice.
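The adapter described above could look roughly like this. It is only a sketch: QueryBuilder is a hypothetical stand-in for the fluent database class (it builds a SQL string and makes no claim about any real library), and fromArray() and toSql() are names made up for illustration. The array keys are the ones from $queryParams.

```php
<?php
// Hypothetical fluent builder; stands in for the "database" class above.
final class QueryBuilder
{
    private string $fields = '*';
    private string $table = '';
    private array $constraint = [];
    private ?string $sort = null;

    public function select(string $fields): self { $this->fields = $fields; return $this; }
    public function from(string $table): self { $this->table = $table; return $this; }
    public function where(array $constraint): self { $this->constraint = $constraint; return $this; }
    public function orderBy(string $field): self { $this->sort = $field; return $this; }

    // Assumed adapter: maps each array key to the matching fluent call.
    public static function fromArray(array $params): self
    {
        $map = ['fields' => 'select', 'table' => 'from', 'constraint' => 'where', 'sort' => 'orderBy'];
        $builder = new self();
        foreach ($params as $key => $value) {
            if (!isset($map[$key])) {
                // A misspelled key is only caught here, at runtime.
                throw new InvalidArgumentException("Unknown query parameter: $key");
            }
            $builder->{$map[$key]}($value);
        }
        return $builder;
    }

    // Returns SQL with placeholders plus the values to bind.
    // Sketch only: identifiers (fields, table, sort) are not validated.
    public function toSql(): array
    {
        $sql = "SELECT {$this->fields} FROM {$this->table}";
        if ($this->constraint) {
            $clauses = array_map(fn($field) => "$field = ?", array_keys($this->constraint));
            $sql .= ' WHERE ' . implode(' AND ', $clauses);
        }
        if ($this->sort !== null) {
            $sql .= " ORDER BY {$this->sort}";
        }
        return [$sql, array_values($this->constraint)];
    }
}

// The parameters arrive as JSON, e.g. from a file or a table.
$queryParams = json_decode(
    '{"fields":"*","table":"table1","constraint":{"field3":"apple"},"sort":"field4"}',
    true
);
[$sql, $bindings] = QueryBuilder::fromArray($queryParams)->toSql();
```

The sketch also hints at one trade-off: a misspelled method name in the fluent style is flagged by the IDE or fails immediately, while a misspelled array key only surfaces when fromArray() runs.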
(There is a TL;DR: at the bottom)
I have a PDF produced via the MVC pattern. I am working with existing code, which was a bit of a mess, but now I am seeing a pattern emerge.
Currently, I have a Controller class, and inside it I have many separate functions, roughly one function per page. Each function does something like this:
function showPage()
{
//get some data from repository
$data1 = $this->repository->getData1();
$data2 = $this->repository->getData2();
//pass that data to the PDF API class, aka "the view"
//and the class takes care of creating PDF pages
//with the appropriate data
$this->pdfApi->showView($data1, $data2);
}
The above achieves a clean separation between the Repository (which only returns data), the PDF API service (which receives the data and doesn't need to care about or maintain data retrieval constructs), and the Controller, which pretty much just asks for data and passes it to the PDF API. All was well until I came across this problem:
Problem
Almost every page has a "footer" with a message, and a "Proposal Number" that needs to be displayed on the page. Sometimes it also has other pieces of data. Since the PDF API class has no data of its own, someone has to pass that data to the PDF API. I have been passing the above two pieces of information every time as function parameters, but it became inconvenient: there are too many parameters to pass and they are cluttering up the code.
Try at Solution
To reduce the clutter in parameter passing, in my Controller I pulled data (via the Repository) into variables such as $footerText and $proposalNumber and then used them to populate the PDF API's own class properties. The side effect is that my PDF API now has the relevant bits of data embedded directly in it (which I consider undesirable, since the data layer now intrudes into the API class).
So far I have resisted the temptation to just pass the entire Repository object to PDF API because that will do very much the same - mix data layer and API Layer, plus, API layer will have unrestricted access to Data, which can also be undesirable.
Actual Problem
When I want clean layer separation, my code is cluttered with multiple function parameter passing.
When I pass the entire Repository to my API class, I mix data and API layers, and API layer gets too much freedom to use Repository class.
Can I somehow achieve layer separation without the clutter or "mixing layers" issues identified above?
If you like to see code, here is some code below of my various unsuccessful tries :)
TL;DR: My various tries to keep layers separate or to reduce clutter have all been unsuccessful
//in Controller - Exhibit 1
//Separation achieved with only data parameter passing tying layers together
//but, too much clutter -- too many parameters
//maximum layer separation but lots of annoying data passing
$data1 = $this->repository->getData1();
....
$data24 = $this->repository->getData24();
$this->pdfApi->showView($data1, $data2, $data3, ... );
//in Controller - Exhibit 2
//Layers are mixed - my data is now tied into API
//in constructor
$data1 = $this->repository->getData1();
....
$data24 = $this->repository->getData24();
$this->pdfApi->setData1($data1);
$this->pdfApi->setData24($data24);
//inside function (API already has data as part of its own vars):
$this->pdfApi->showView();
//in Controller - Exhibit 3
//layers are mixed -- entire Repository got into my API
//in constructor
$repo = new Repository();
$this->pdfApi->setRepository($repo);
//inside function (API has full Repository access gets its own data and more):
$this->pdfApi->showView();
I think Exhibit 1 is the most correct, and you can cut the clutter by bundling the data into a single array:
//inside Controller
$data = array(
'data1' => $this->repository->getData1(),
//...
'data24' => $this->repository->getData24()
);
$this->pdfApi->showView($data);
I say this because a popular framework, ZF2, which I use, also follows the same pattern.
//inside Controller
$data = array(
'message' => 'Hello world',
);
$view = new ViewModel($data);
In my case the View is the PDF API
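A variation on the same idea, for the values that recur on every page (footer text, proposal number): group them into a small read-only object that is built once and handed to the view together with the page data. This is a sketch under assumptions; PageContext, PdfView and render() are hypothetical names, not part of the original code or of ZF2.

```php
<?php
// Hypothetical value object for data that recurs on every page.
final class PageContext
{
    private string $footerText;
    private string $proposalNumber;

    public function __construct(string $footerText, string $proposalNumber)
    {
        $this->footerText = $footerText;
        $this->proposalNumber = $proposalNumber;
    }

    public function footerText(): string { return $this->footerText; }
    public function proposalNumber(): string { return $this->proposalNumber; }
}

// Stand-in for the PDF API: it receives plain data and never sees the Repository.
final class PdfView
{
    public function render(PageContext $context, array $pageData): string
    {
        $lines = [];
        foreach ($pageData as $label => $value) {
            $lines[] = "$label: $value";
        }
        $lines[] = "Proposal {$context->proposalNumber()} | {$context->footerText()}";
        return implode("\n", $lines);
    }
}

// In the controller: build the context once, reuse it for every page.
$context = new PageContext('Confidential', 'P-1024');
$view = new PdfView();
$page = $view->render($context, ['data1' => 'first value', 'data2' => 'second value']);
```

The view still depends only on data passed to it, so the layers stay separate, but each call carries two arguments instead of one per value.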
As a feature in the software I'm writing, I'm allowing myself to create calculators written in JS to compute the fees to be applied to a specific set of data, using said data as a reference. Since I'm using Mongo, I can run this safely server-side, and the browser can just call a php page and get the response. The function will be written from an administration control panel and saved to the database. I of course won't be doing any db interactions from inside that function, but executing mongocode is done within the database, so mongocode by nature can do db.foo
Just to protect myself and anyone else who might end up writing calculators, I've set db = null; in $context being passed to new MongoCode()
It looks a bit like this:
$sample = [
'estimatedValue' => 200, // key names must match the identifiers used in the JS function (case-sensitive)
'estimatedCost' => 400,
'contractor' => false,
'db' => null,
];
$fees = [
'_id' => new MongoId(),
'name' => 'Friendly name!',
'code' => new MongoCode('function(){return (contractor ? estimatedCost : estimatedValue)*0.01; /* If a contractor is doing the work, base fee on cost. */}', $sample),
];
$a = $this->siteDB->execute($fees['code']);
if(isset($a['errno'])){
echo $a['errmsg'];
}else{
var_dump($a['retval']);
}
Fortunately, that works, and if I were to inject that into every context, no db. commands would be runnable. I don't want to create a point where NoSQL injection can happen!
An example of something that this prevents is:
'code' => new MongoCode('function(){db.foo.remove();}', $sample),
// Since db is now null, the above won't work
My concern: Are there any other variables that exist in this MongoCode Execute environment that could be potentially harmful to leave in a user-editable function? I couldn't find any documentation on what else is accessible through mongocode functions. If db is it, then I'm all set!
This is not safe, and I don't think you can have a user-editable JS function that is. For example, this requires no variables and shuts down your server:
> db.eval("(new Mongo('localhost:27017')).getDB('admin').shutdownServer()")
They can insert data, drop databases, connect to other servers in your system, and generally wreak havoc.
If you are trying to allow a user-editable compute function in JavaScript, use a separate JS engine, pull the values from MongoDB, and pass the values + user-defined function to the totally separate JS engine.
I'm not quite grokking a couple of things in OOP and I'm going to use a fictional understanding of SO to see if I can get help understanding them.
So, on this page we have a question. You can comment on the question. There are also answers. You can comment on the answers.
Question
- comment
- comment
- comment
Answer
- comment
Answer
- comment
- comment
- comment
Answer
- comment
- comment
So, I'm imagining a very high level understanding of this type of system (in PHP, not .Net as I am not yet familiar with .Net) would be like:
$question = new Question;
$question->load($this_question_id); // from the URL probably
echo $question->getTitle();
To load the answers, I imagine it's something like this ("A"):
$answers = new Answers;
$answers->loadFromQuestion($question->getID()); // or $answers->loadFromQuestion($this_question_id);
while($answer = $answers->getAnswer())
{
echo $answer->showFormatted();
}
Or, would you do ("B"):
$answers->setQuestion($question); // inject the whole obj, so we have access to all the data and public methods in $question
$answers->loadFromQuestion(); // the ID would be found via $this->question->getID() instead of from the argument passed in
while($answer = $answers->getAnswer())
{
echo $answer->showFormatted();
}
I guess my problem is, I don't know when or if I should be passing in an entire object, and when I should just be passing in a value. Passing in the entire object gives me a lot of flexibility, but it's more memory and subject to change, I'd guess (like a property or method rename). If "A" style is better, why not just use a function? OOP seems pointless here.
Thanks,
Hans
While I like Jason's answer, it is not, strictly speaking, OO.
$question = new Question($id);
$comments = $question->getComments();
$answers = $question->getAnswers();
echo $question->getTitle();
echo $question->getText();
foreach ($comments as $comment)
echo $comment->getText();
The problems are:
There is no information hiding, a fundamental principle of OO.
If the format of the answers needs to change, it must be changed in a place that is not associated with the object that houses the data.
The solution is not extensible. (There is no behaviour to inherit.)
You must keep behaviour (tightly coupled) with the data. Otherwise you are not writing OO.
$question = new Question($id);
$questionView = new QuestionView( $question );
$questionView->displayComments();
$questionView->displayAnswers();
How the information is displayed is now an implementation detail, and reusable.
Notice how this opens up the following possibility:
$question = new Question( $id );
$questionView = new QuestionView( $question );
$questionView->setPrinterFriendly();
$questionView->displayComments();
$questionView->displayAnswers();
The idea is that now you can change how the questions are formatted from a single location in the code base. You can support multiple formats for the comments and answers without the calling code (a) ever knowing; and (b) ever needing to change (to a significant degree).
If you are coding text formatting details in more than one location because you are misusing accessor methods, the life of any future maintainers will be miserable. If the maintainer is a psychopath who knows where you live, you will be in trouble.
Objects, Data, and Views
Here's the problem, as I understand it:
Database -> Object -> Display Content
You want to keep the behaviour of the object centred around logic that is intrinsic to the object. In other words, you don't want the Object to have to do things that have nothing to do with its core responsibilities. Most commonly this will include load, save, and print functionality. You want to keep these separate from the object itself because if you ever have to change database, or output format, you want to make as few changes in the system as possible, and restrain the ripple effect.
To simplify this, let's take a look at loading only Comments; everything is applicable to Questions and Answers as well.
Comment Class
The Comment class might offer the following behaviours:
Reply
Delete
Update (requires permission)
Restore (from a delete)
etc.
CommentDB Class
We can create a CommentDB object that knows how to manipulate the Comments in the database. A CommentDB object has the following behaviours:
Create
Load
Save
Update
Delete
Restore
Notice that these behaviours will likely be common across all objects and can therefore be subject to refactoring. This will also let you change databases quite easily as the connection information will be isolated to a single class (the grandfather of all database objects).
Example usage:
$commentDb = new CommentDB();
$comment = $commentDb->create();
Later:
$comment->update( "new text" );
Notice that there are a number of possible ways to implement this, but you can always do so without violating encapsulation and information hiding.
CommentView Class
Lastly, the CommentView class will be tightly coupled to a Comment class. That it can obtain the attributes of Comment class via accessors is expected. The information is still hidden from the rest of the system. The Comment and its CommentView are tightly coupled. The idea is that the formatting is kept in a single place, not scattered throughout classes that need to use the data willy nilly.
Any classes that need to display comments, but in a slightly different format, can inherit from CommentView.
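The Comment/CommentView pairing can be sketched as below. The class and method names beyond Comment and CommentView (render(), PrinterFriendlyCommentView, the HTML markup) are assumptions made for illustration.

```php
<?php
// Hypothetical domain object; only the view reads it through accessors.
class Comment
{
    private string $author;
    private string $text;

    public function __construct(string $author, string $text)
    {
        $this->author = $author;
        $this->text = $text;
    }

    public function getAuthor(): string { return $this->author; }
    public function getText(): string { return $this->text; }
}

// All comment formatting lives in this one class.
class CommentView
{
    protected Comment $comment;

    public function __construct(Comment $comment)
    {
        $this->comment = $comment;
    }

    public function render(): string
    {
        $text = htmlspecialchars($this->comment->getText());
        $author = htmlspecialchars($this->comment->getAuthor());
        return "<p class=\"comment\">$text <em>$author</em></p>";
    }
}

// A slightly different format: inherit and override the rendering only.
class PrinterFriendlyCommentView extends CommentView
{
    public function render(): string
    {
        return $this->comment->getText() . ' (' . $this->comment->getAuthor() . ')';
    }
}

$comment = new Comment('Hans', 'Is A < B?');
$html = (new CommentView($comment))->render();
$plain = (new PrinterFriendlyCommentView($comment))->render();
```

Calling code only asks a view to render; the escaping rules and markup can change in one place without touching the callers.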
See also: Allen Holub wrote "You should never use get/set functions", is he correct?
Why pass either? What about:
<?php
$question = new Question($id);
$comments = $question->getComments();
$answers = $question->getAnswers();
echo $question->getTitle();
echo $question->getText();
foreach ($comments as $comment)
echo $comment->getText();
foreach ($answers as $answer)
{
$answer_comments = $answer->getComments();
echo $answer->getText();
foreach ($answer_comments as $comment)
echo $comment->getText();
}
Where getComments() and getAnswers() use $this->id to retrieve and return an array of comment or answer objects?
You could build utility methods in the comment and answer objects that allow you to load by parent id. In which case, just taking an id as a parameter would be nice.
$question = new Question($id);
$answers = Answer::forQuestion($question->id);
$comments = Comment::forQuestion($question->id);
$ans_comments = Comment::forAnswer($answer->id); // or some way to distinguish what the parent object is.
Edit: Likely the child model (Comment or Answer in this case) doesn't need anything from the parent except an id to do db queries with. Passing in the entire parent object would be overkill. (Also, PHP has had a terrible time garbage collecting objects with circular references; PHP 5.3 added a cycle collector to address this.)
Both styles are acceptable. Sometimes you only need the value, sometimes you'll need the object. In this example I would personally do something along the lines of your first example, but trivial programs like this don't tend to exist in the wild very often so maybe you want the second piece.
My rule of thumb is to do the thing in the least number of lines that still clearly demonstrates what you're attempting to do to anyone who comes after you. The overhead of most object creation vs value passing is something you'll likely never ever have to deal with on modern arch.
Adding to what @jasonbar already mentioned:
I don't know when or if I should be passing in an entire object, and when I should just be passing in a value.
It depends on the Coupling you need and the Cohesion you desire.
Passing in the entire object gives me a lot of flexibility, but it's more memory and subject to change.
PHP does not copy the object when you use it as an argument to a function; the function receives a handle to the same object. Neither do most other languages (either by default, like C# and Java, or upon explicit request, like C and C++).
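A small demonstration of that point, using a throwaway Question class and a hypothetical retitle() function: the change made inside the function is visible to the caller, and a separate copy only exists after an explicit clone.

```php
<?php
class Question
{
    public string $title = 'original';
}

// The parameter holds a handle to the caller's object, not a copy of it.
function retitle(Question $question): void
{
    $question->title = 'changed';
}

$question = new Question();
retitle($question);
$titleAfterCall = $question->title; // the caller sees the change

// Copying only happens on request.
$copy = clone $question;
$copy->title = 'only the clone';
```

So the memory cost of passing the whole object instead of its id is negligible; the real trade-off is coupling.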
To add to Dave Jarvis's and jasonbar's answers, I usually have DataMappers to convert between relational data and objects, instead of using an ActiveRecord approach. So, following your example, we would have these classes:
Question
Answer
Comment
and their data mappers:
QuestionMapper
AnswerMapper
CommentMapper
Each mapper implementing a similar interface:
save(object) // creates or updates a record in the database (or text file, for that matter)
delete(id)
get(id)
Then, we would do as:
$q = QuestionMapper::get( $questionid );
// here we could either (a) just return a list of Answers
// previously eagerly-loaded by the
// QuestionMapper, or (b) lazy load the answers by
// calling AnswerMapper::getByQuestionID( $this->id ) or similar.
$aAnswers = $q->getAnswers();
foreach($aAnswers as $oAnswer){
echo $oAnswer->getText();
$aComments = $oAnswer->getComments();
foreach($aComments as $oComment){
echo $oComment->getText();
}
}
Regarding the use of things like QuestionView->render( $question ), I prefer to have Views which display the data using getters from the domain objects. If you pass a Question to a HTMLView, it will render it as HTML; if you pass it to a JSONView, then you'll get JSON-formatted content. This means that the domain objects need to have getters.
PS: We could also consider having the QuestionMapper load everything related to Questions, Answers, and Comments. Since Comments always belong to Answers or Questions, and Answers always belong to Questions, it could make sense for the QuestionMapper to load everything. Of course we would have to consider different strategies for lazy loading a Question's set of Answers and Comments, to avoid hogging the server.
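A minimal sketch of one such mapper, assuming an in-memory array in place of the database so it can run on its own. It uses instance methods rather than the static calls shown above, and getByQuestionId() and hydrate() are illustrative names; a real mapper would run queries where this one reads and writes the array.

```php
<?php
// Domain object: holds data and behaviour, knows nothing about storage.
final class Answer
{
    public ?int $id = null;
    public int $questionId;
    public string $text;

    public function __construct(int $questionId, string $text)
    {
        $this->questionId = $questionId;
        $this->text = $text;
    }
}

// Mapper with the save/delete/get interface; an array stands in for the database.
final class AnswerMapper
{
    private array $rows = [];
    private int $nextId = 1;

    // Creates the record when the object has no id yet, updates it otherwise.
    public function save(Answer $answer): void
    {
        if ($answer->id === null) {
            $answer->id = $this->nextId++;
        }
        $this->rows[$answer->id] = ['question_id' => $answer->questionId, 'text' => $answer->text];
    }

    public function delete(int $id): void
    {
        unset($this->rows[$id]);
    }

    public function get(int $id): ?Answer
    {
        return isset($this->rows[$id]) ? $this->hydrate($id, $this->rows[$id]) : null;
    }

    // What a lazy-loading Question::getAnswers() could delegate to.
    public function getByQuestionId(int $questionId): array
    {
        $answers = [];
        foreach ($this->rows as $id => $row) {
            if ($row['question_id'] === $questionId) {
                $answers[] = $this->hydrate($id, $row);
            }
        }
        return $answers;
    }

    // Converts a stored row back into a domain object.
    private function hydrate(int $id, array $row): Answer
    {
        $answer = new Answer($row['question_id'], $row['text']);
        $answer->id = $id;
        return $answer;
    }
}

$mapper = new AnswerMapper();
$mapper->save(new Answer(42, 'Pass the id.'));
$mapper->save(new Answer(42, 'Pass the object.'));
$mapper->save(new Answer(7, 'Unrelated.'));
$mapper->delete(3);
$answers = $mapper->getByQuestionId(42);
```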
I have recently begun working on a PHP/JS Form Class that will also include a SQL Form builder (eg. building simple forms from sql and auto inserts/updates).
I have tried several classes (zend_form, clonefish, PHP Form Builder Class, phorms etc.) but haven't yet come across a complete solution that is simple, customizable and complete (both server-side and client-side validation, covers all simple html elements and lots of dhtml elements: sorting, wysiwyg, multi-file upload, date picker, ajax validation etc.)
My question is why do some "classes" implement elements via an array and others via proper OO class calls.
eg.
Clonefish (popular commercial php class):
$config = Array(
'username' => Array(
'type' => 'inputText',
'displayname' => 'Username',
'validation' => Array(
Array(
'type' => 'string',
'minimum' => 5,
'maximum' => 15,
),
),
));
$clonefish = new clonefish( 'loginform', 'test.php', 'POST' );
$clonefish->addElements( $config, $_POST );
Then others eg. Zend_Form
$form = new Zend_Form;
$username = new Zend_Form_Element_Text('username');
$username->addValidator(new Zend_Validate_Alnum());
$form->addElement($username);
I realise Zend_Form can pass elements in via an array similar to clonefish but why do this?
Is there any benefit? It seems to make things more complicated especially when using a proper IDE like Komodo.
Any thoughts would be appreciated as I don't want to get too far down the track and realize there was great benefit in using arrays to add elements (although this wouldn't be much of a task to add on).
Cheers
My question is why do some "classes" implement elements via an array and others via proper OO class calls.
For convenience. It's less verbose and it feels less like coding and more like configuration and you need less intimate knowledge of the API.
Btw, the reason you have not yet come across a complete solution that is simple, customizable and complete is because it is not simple. Forms, their validation and rendering is complex, especially if you want to have it customizable for any purpose. ZF's form components are a good example of how to properly decouple and separate all concerns to get the ultimate extensible form builder (including client side code through Zend_Dojo or ZendX_Jquery). But they are also a great example of the complexity required for this. Even with the convenient array configuration, it is damn difficult to make them bend to your will, especially if you need to depart from the default configuration and rendering.
Why use objects? Because they are much more complex types. Consider the following example (I have never used Zend_Form so I don't even know its architecture):
class MySuperAlnumValidator extends Zend_Validate_Alnum {
protected $forbiddenWords = array();
public function addForbiddenWord($word) {
$this->forbiddenWords[] = $word;
}
// Override Zend_Validate_Alnum::validate() - I don't know whether such a method even exists,
// but you get the point
public function validate() {
parent::validate();
if (in_array($this->value, $this->forbiddenWords)) {
throw new Exception('Invalid value.');
}
return $this->value;
}
}
// -----------------------
$validator = new MySuperAlnumValidator();
$validator->addForbiddenWord('admin');
$validator->addForbiddenWord('administrator');
$username->addValidator($validator);
This is only a simple example but when you start writing more complex validators/form fields/etc. then objects are, in principle, the only meaningful tool.
So, I am looking at a number of ways to store my configuration data. I believe I've narrowed it down to 3 ways:
Just a simple variable
$config = array(
"database" => array(
"host" => "localhost",
"user" => "root",
"pass" => "",
"database" => "test"
)
);
echo $config['database']['host'];
I think that this is just too mutable, whereas the configuration options shouldn't be able to be changed.
A Modified Standard Class
class stdDataClass {
// Holds the Data in a Private Array, so it cannot be changed afterwards.
private $data = array();
public function __construct($data)
{
// ......
$this->data = $data;
// .....
}
// Returns the Requested Key
public function __get($key)
{
return $this->data[$key];
}
// Throws an Error as you cannot change the data.
public function __set($key, $value)
{
throw new Exception("Tried to Set Static Variable");
}
}
$config = new stdDataClass($config_options);
echo $config->database['host'];
Basically, all it does is encapsulate the above array in an object and make sure that the object cannot be changed.
Or a Static Class
class AppConfig{
public static function getDatabaseInfo()
{
return array(
"host" => "localhost",
"user" => "root",
"pass" => "",
"database" => "test"
);
}
// .. etc ...
}
$config = AppConfig::getDatabaseInfo();
echo $config['host'];
This provides the ultimate immutability, but it also means that I would have to go in and manually edit the class whenever I wanted to change the data.
Which of the above do you think would be best to store configuration options in? Or is there a better way?
Of those 3 options, a static method is probably the best.
Really, though, "the best" is ultimately about what's easiest and most consistent for you to use. If the rest of your app isn't using any OO code then you might as well go with option #1. If you are ultimately wanting to write a whole db abstraction layer, option #2.
Without knowing something more about what your goals are and what the rest of your app looks like, it's kind of like asking someone what the best motor vehicle is -- it's a different answer depending on whether you're looking for a sports car, a cargo truck, or a motorcycle.
I'd go with what's behind door #3.
It looks easier to read and understand than #2, and seems to meet your needs better than #1.
Take a look at this question for ideas on storing the config data in a separate file:
Fastest way to store easily editable config data in PHP?
I'd use method #2 pulling the config data as an array from an external file.
The best way is that which fits your application best.
For a small app, it might be totally sufficient to use an array, even if it is mutable. If no one is there to modify it except you, it doesn't have to be immutable.
The second approach is very flexible. It encapsulates data, but does not know anything about it. You can pass it around freely and consuming classes can take from it what they need. It is generic enough to be reused and it does not couple the config class to the concrete application. You could also use an interface with this or similar classes to allow for type hints in your method signatures to indicate a Config is required. Just don't name it stdDataClass, but name it by its role: Config.
Your third solution is very concrete. It hardcodes a lot of assumptions about what your application requires into the class and it also makes it the responsibility of the class to know and provide this data through getters and setters. Depending on the amount of components requiring configuration, you might end up with a lot of specific getters. Chances are pretty good you will have to rewrite the entire thing for your next app, just because your next app has different components.
I'd go with the second approach. Also, have a look at Zend_Config, as it meets all your requirements already and lets you init the Config object from XML, Ini and plain arrays.
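For illustration, the second approach with an interface and a type hint might look like the sketch below. ConfigInterface, the dot-separated key lookup and buildDsn() are assumptions made up for this example; this is not how Zend_Config is implemented.

```php
<?php
// Hypothetical interface so consumers can type-hint "a Config is required".
interface ConfigInterface
{
    public function get(string $key, $default = null);
}

final class Config implements ConfigInterface
{
    private array $data;

    public function __construct(array $data)
    {
        $this->data = $data;
    }

    // Assumed convention: dot-separated keys walk nested arrays, e.g. "database.host".
    public function get(string $key, $default = null)
    {
        $value = $this->data;
        foreach (explode('.', $key) as $segment) {
            if (!is_array($value) || !array_key_exists($segment, $value)) {
                return $default;
            }
            $value = $value[$segment];
        }
        return $value;
    }

    // Any write from outside the class ends up here and is rejected.
    public function __set($name, $value)
    {
        throw new LogicException('Config is read-only.');
    }
}

// A consumer declares its dependency through the type hint.
function buildDsn(ConfigInterface $config): string
{
    return sprintf(
        'mysql:host=%s;dbname=%s',
        $config->get('database.host', 'localhost'),
        $config->get('database.database')
    );
}

$config = new Config([
    'database' => ['host' => 'localhost', 'user' => 'root', 'pass' => '', 'database' => 'test'],
]);
$dsn = buildDsn($config);
```

The array passed to the constructor could just as well come from an external file, which keeps the class itself free of application-specific values.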