Understanding OOP Principles in passing around objects/values - php

I'm not quite grokking a couple of things in OOP and I'm going to use a fictional understanding of SO to see if I can get help understand.
So, on this page we have a question. You can comment on the question. There are also answers. You can comment on the answers.
Question
- comment
- comment
- comment
Answer
-comment
Answer
-comment
-comment
-comment
Answer
-comment
-comment
So, I'm imagining a very high level understanding of this type of system (in PHP, not .Net as I am not yet familiar with .Net) would be like:
$question = new Question;
$question->load($this_question_id); // from the URL probably
echo $question->getTitle();
To load the answers, I imagine it's something like this ("A"):
$answers = new Answers;
$answers->loadFromQuestion($question->getID()); // or $answers->loadFromQuestion($this_question_id);
while($answer = $answers->getAnswer())
{
echo $answer->showFormatted();
}
Or, would you do ("B"):
$answers->setQuestion($question); // inject the whole obj, so we have access to all the data and public methods in $question
$answers->loadFromQuestion(); // the ID would be found via $this->question->getID() instead of from the argument passed in
while($answer = $answers->getAnswer())
{
echo $answer->showFormatted();
}
I guess my problem is, I don't know when or if I should be passing in an entire object, and when I should just be passing in a value. Passing in the entire object gives me a lot of flexibility, but it's more memory and subject to change, I'd guess (like a property or method rename). If "A" style is better, why not just use a function? OOP seems pointless here.
Thanks,
Hans

While I like Jason's answer, it is not, strictly speaking OO.
$question = new Question($id);
$comments = $question->getComments();
$answers = $question->getAnswers();
echo $question->getTitle();
echo $question->getText();
foreach ($comments as $comment)
echo $comments->getText();
The problems are:
There is no information hiding, a fundamental principle of OO.
If the format of the answers needs to change, it must be changed in a place that is not associated with the object that houses the data.
The solution is not extensible. (There is no behaviour to inherit.)
You must keep behaviour (tightly coupled) with the data. Otherwise you are not writing OO.
$question = new Question($id);
$questionView = new QuestionView( $question );
$questionView->displayComments();
$questionView->displayAnswers();
How the information is displayed is now an implementation detail, and reusable.
Notice how this opens up the following possibility:
$question = new Question( $id );
$questionView = new QuestionView( $question );
$questionView->setPrinterFriendly();
$questionView->displayComments();
$questionView->displayAnswers();
The idea is that now you can change how the questions are formatted from a single location in the code base. You can support multiple formats for the comments and answers without the calling code (a) ever knowing; and (b) ever needing to change (to a significant degree).
If you are coding text formatting details in more than one location because you are misusing accessor methods, the life of any future maintainers will be miserable. If the maintainer is a psychopath who knows where you live, you will be in trouble.
Objects, Data, and Views
Here's the problem, as I understand it:
Database -> Object -> Display Content
You want to keep the behaviour of the object centred around logic that is intrinsic to the object. In other words, you don't want the Object to have to do things that have nothing to do with its core responsibilities. Most commonly this will include load, save, and print functionality. You want to keep these separate from the object itself because if you ever have to change database, or output format, you want to make as few changes in the system as possible, and restrain the ripple effect.
To simplify this, let's take a look at loading only Comments; everything is applicable to Questions and Answers as well.
Comment Class
The Comment class might offer the following behaviours:
Reply
Delete
Update (requires permission)
Restore (from a delete)
etc.
CommentDB Class
We can create a CommentDB object that knows how to manipulate the Comments in the database. A CommentDB object has the following behaviours:
Create
Load
Save
Update
Delete
Restore
Notice that these behaviours will likely be common across all objects and can therefore be subject to refactoring. This will also let you change databases quite easily as the connection information will be isolated to a single class (the grandfather of all database objects).
Example usage:
$commentDb = new CommentDB();
$comment = $commentDb->create();
Later:
$comment->update( "new text" );
Notice that there are a number of possible ways to implement this, but you can always do so without violating encapsulation and information hiding.
CommentView Class
Lastly, the CommentView class will be tightly coupled to a Comment class. That it can obtain the attributes of Comment class via accessors is expected. The information is still hidden from the rest of the system. The Comment and its CommentView are tightly coupled. The idea is that the formatting is kept in a single place, not scattered throughout classes that need to use the data willy nilly.
Any classes that need to display comments, but in a slightly different format, can inherit from CommentView.
See also: Allen Holub wrote "You should never use get/set functions", is he correct?

Why pass either? What about:
<?php
$question = new Question($id);
$comments = $question->getComments();
$answers = $question->getAnswers();
echo $question->getTitle();
echo $question->getText();
foreach ($comments as $comment)
echo $comments->getText();
foreach ($answers as $answer)
{
$answer_comments = $answer->getComments();
echo $answer->getText();
foreach ($answer_comments as $comment)
echo $comment->getText();
}
Where getComments() and getAnswers() use $this->id to retrieve and return an array of comment or answer objects?
You could build utility methods in the comment and answer objects that allow you to load by parent id. In which case, just taking an id as a parameter would be nice.
$question = new Question($id);
$answers = Answer::forQuestion($question->id);
$comments = Comment::forQuestion($question->id);
$ans_comments = Comment::forAnswer($answer->id); // or some way to distinguish what the parent object is.
Edit: Likely the child model (Comment or Answer in this case) doesn't need anything from the parent except and id to do db queries with. Passing in the entire parent object would be overkill. (Also, PHP has a terrible time garbage collecting objects with circular references, which might be fixed in the 5.3 series.)

Both styles are acceptable. Sometimes you only need the value, sometimes you'll need the object. In this example I would personally do something along the lines of your first example, but trivial programs like this don't tend to exist in the wild very often so maybe you want the second piece.
My rule of thumb is to do the thing in the least number of lines that still clearly demonstrates what you're attempting to do to anyone who comes after you. The overhead of most object creation vs value passing is something you'll likely never ever have to deal with on modern arch.

Adding to what #jasonbar already mentioned:
I don't know when or if I should be passing in an entire object, and when I should just be passing in a value.
It depends on the Coupling you need and the Cohesion you desire.
Passing in the entire object gives me a lot of flexibility, but it's more memory and subject to change.
PHP does not copy the object when you use it as an argument to a function. Neither do most other languages (either by default, like C# and Java, or upon explicit request, like C and C++)

To add to Dave Jarvis and jasonbar answers, I usually have DataMappers to convert between relational data and objects, instead of using an ActiveRecord approach. So, following your example, we would have these classes:
Question
Answer
Comment
and their data mappers:
QuestionMapper
AnswerMapper
CommentMapper
Each mapper implementing a similar interface:
save(object) // creates or updates a record in the database (or text file, for that matter)
delete(id)
get(id)
Then, we would do as:
$q = QuestionMapper::get( $questionid );
// here we could either (a) just return a list of Answers
// previously eagerly-loaded by the
// QuestionMapper, or (b) lazy load the answers by
// calling AnswerMapper::getByQuestionID( $this->id ) or similar.
$aAnswers = $q->getAnswers();
foreach($aAnswers as $oAnswer){
echo $oAnswer->getText();
$aComments = $oAnswer->getComments();
foreach($aComments as $oComment){
echo $oComment->getText();
}
}
Regarding the use of things like QuestionView->render( $question ), I prefer to have Views which display the data using getters from the domain objects. If you pass a Question to a HTMLView, it will render it as HTML; if you pass it to a JSONView, then you'll get JSON-formatted content. This means that the domain objects need to have getters.
PS: We could also consider the QuestionMapper to load everything related to Questions, Answers, and Comments. Since Comments always belongs to Answers or Questions, and Answers always belong to Questions, it could make sense that the QuestionMapper loaded everything. Of course we would have to consider different strategies for lazy loading a Question's set of Answers and Comments, to avoid hogging the server.

Related

Where's the best place to implement frequent object properties filling

We are developing an application in Symfony (2.7) that deals with third party entities and usually needs to transform (or fill) objects from one subsystem to another.
Take this example where we need to fill $destinationObject with data stored in $sourceObject:
$sourceObject;
$destinationObject = new DestinationObject();
$destinationObject->setProperty01($sourceObject->getProperty01());
$destinationObject->setProperty02($sourceObject->getProperty02());
$destinationObject->setProperty03($sourceObject->getProperty03());
$destinationObject->setProperty04($sourceObject->getPropertyWithDifferentName());
$intermediateValue = explode('/',$sourceObject->getProperty04());
$destinationObject->setProperty05($intermediateValue[0]);
$destinationObject->setProperty06($intermediateValue[1]);
Manual method (or no method) leads to duplicated code, copy&paste practices, etc.
So my question is:
Where is the propper place to implement this kind of "transformations"?
My ideas so far:
Doing it as a model method like $destinationObject->loadFromSourceObject($sourceObject) is a bad idea (entity coupling)
I don't like the idea of building an utility class full of static methods
...
EDIT: Also, there's a common situation where in $destinationObject we need to load data from both $sourceObjectFromClassA and $sourceObjectFromClassB, so SourceObjectToDestinationObjectTransformer understood as a method returning a new created DestinationObject is not a valid option. I mean, something like this may be needed (bad code coming warning, just for exemplification purpose):
$destinationObject = new DestinationObject();
$destinationObject->loadFromSourceClassA($sourceObjectFromClassA);
$destinationObject->loadFromSourceClassB($sourceObjectFromClassB);

Is it a good practice to define expectation in dataProvider

quick example of dataProvider:
return [
['180d-1pc', '6m-1pc'],
]
and test:
public function test_convert($title, $expected)
{
$uut = new Converter();
$this->assertEquals($expected, $uut->convertDayTitle($title));
}
(simply put: test if we convert 180d(days) to 6m(months))
as you can see - in data provider there are input data defined and also an expected output.
This works fine in many cases, but I keep having this feeling that maybe it's not the best idea. So I'm wondering if could be considered bad practice. If so - when I will see it was a bad idea to do it?
One example is, when you would like to use the same dataProvider in two tests - should you define two expected values and use one of them?
Quick example (I just made that up, so don't pay attention that I make product object just from title ;)):
public function test_gets_discount_when_licence_period_longer_than_1year($title, $expectedTitle, $expectedDiscount) {
$prod = new Product($title);
$this->assertEquals($expectedDiscount, $product->hasDiscount();
}
How to make this more elegant?
What you're doing is totally fine. Data providers can, and should, include the expected test result.
About the issue of reusing the same dataProvider for 2 tests, and only using some fields: I'd say that if the fields are related (i.e. they are properties of the same object), it can be acceptable.
If you feel the dataProvider is getting too big and complex because 2 tests must use data from it, just create 2 separate dataProviders, and use another private common method for building the common data.

Autoloading Required Classes from User Input in PHP

Let me start off by saying I'm an intermediate level PHP coder who's learning OOP. I've got a site running, but I would like to break my code up to implement a more flexible design pattern...and because OOP is just plain awesome.
In my original code, I used switch statements to call functions that correspond with the user request.
$request = (string) $_GET['fruit'];
switch ($request) {
case 'apply':
Get_Apple();
default:
Error_Not_A_Fruit();
exit;
}
This makes it highly inflexible and requires me to change the code at multiple locations for adding new options the user may request.
I'm thinking about changing it to a Polymorphic class call. I'm using composer, so I've got my Objects setup to autoload with PSR-4 standards. So the answer seems simple, if the user request is "apple," I could create
$request = (string) $_GET['fruit'];
$product = new Product\$request;
But, if a user manually enters something that doesn't exists...say "orange," what method would I use to white list the user's input? Like I said, this is my first venture into OOP and would love to pick up design standards that you guys use. I'm thinking encapsulating the block inside a try{} & catch(){} block, but is that the way it should be done?
Any advise would be greatly appreciated :)
Cheers,
Niro
Update: I'd like to make it clearer because it may not have been before. I'm looking for the approach to doing this in such a way that one could add new Objects of Subclass Product (implementing a Product Interface). That way I can add different Product types without changing code everywhere.
Well, you can always use class_exist('Product\\Apple').
class_exists takes 2 arguments:
Class Name (full, with namespace)
Boolean value, whether to try and autoload the class or not. Default is true.
The function returns a boolean.
So you write
$fullClassName = "Product\\$request";
if(class_exists($fullClassName )) {
$product = new $fullClassName();
}
else {
//error here
}

Why is it considered bad practice to use "global" reference inside functions? [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Are global variables in PHP considered bad practice? If so, why?
global in functions
Edit: question answered in link above.
No, "global" in php is not the same thing as global in other languages, and while it does not introduce any security problems it can make the code less comprehensible to others.
OP:
Project summary - I am writing a web CMS to get my feet wet with PHP / MySQL. In order to break up the code I have a concept of these basic tiers / modules:
Data
- MySQL Tables
- PHP Variables
Function
- SQL - Get / Set / etc
- Frontend - Showing pages
- Backend - Manager
Presentation
- HTML Templates
- Page Content
- CSS Stylesheets
The goal is common enough. Use MySQL to hold site settings and page content, use php to get / manipulate content data for a page being served, then insert into an html template and echo to browser. Coming from OO languages like C# the first problem I ran into was variable scope issues when using includes and functions.
From the beginning I had been writing function-only php files and including them where needed in other files that had existing variable array definitions. For example, ignoring the data tier for a moment, a simple page might generically look like this:
File 1 (page)
$DATA_PAGE = Array
(
'temp' = null,
'title' = null,
[...]
);
include 'functions.php';
get_data ( 'page.php' );
[...]
run_html ();
File 2 (functions)
function get_data ( $page_name )
{
global $DATA_PAGE;
$DATA_PAGE [ 'temp' ] = 'template';
$DATA_PAGE [ 'title' ] = 'test page';
[...]
}
function run_html ()
{
global $DATA_PAGE;
echo '<html>';
echo '<head>';
echo '<title>' . $DATA_PAGE [ 'title' ] . '</title>';
[...]
}
I chose this method for a few reasons:
The data in these arrays after sql fetch might be used anywhere, including page content
I didnt want to have a dozen function arguments, or pass entire arrays
The code runs great. But in every article I find on the subject, the "global" calls in my functions are called bad practice, even though the articles never state why? I thought all that meant was "use parent scope". Am in introducing a security hole into my app? Is there a better method? Thanks in advance.
I think a top reason for avoiding this is that it hides dependencies.
Your functions get_data and run_html do not advertise in any way that they share data, and yet they do, in a big way. And there is no way (short of reading the code) to know that run_html will be useless if get_data has not been called.
As the complexity of your codebase grows, this kind of lurking dependency will make your code fragile and hard to reason about.
Because global variables can be modified by the process without other parts of your code knowing it producing unexpected results.
You should always try to keep variables scoped as locally as possible -- not only will this help you with debugging, but it will keep your code easier and cleaner to read and go back and modify.
If you are looking to share data across multiple functions, you might look into making a class for your data and then defining methods to operate on the data encapsulated in your object. (Object-Oriented programming)
For many reasons, for example:
Hard to support code with global variables. You don't know where global variables can affect your logic and don't control access to them
Security - if you have complex system (especially with plugins), someone can compromise all system with global variables.
Settings in a global variable are fine, but putting data that can be modified into a global variable can, as tkone said, have unexpected results.
That said, I don't agree with the notion that global variables should be avoided at all costs - just try wrapping them into, say, a singleton settings class.
It is bad practice to use variables from a different scope to your own, because it makes your code less portable. In other words, if I want to use the get_data() function somewhere else, in another project, I have to define the $DATA_PAGE variable before I can use it.
AFAIK this is the only reason this should be avoided - but it's a pretty good reason.
I would pass by reference instead.
function get_data (&$data_page, $page_name )
{
$data_page['temp'] = 'template';
$data_page['title'] = 'test page';
[...]
}
get_data($DATA_PAGE);

Is using superglobals directly good or bad in PHP?

So, I don't come from a huge PHP background—and I was wondering if in well formed code, one should use the 'superglobals' directly, e.g. in the middle of some function say $_SESSION['x'] = 'y'; or if, like I'd normally do with variables, it's better to send them as arguments that can be used from there, e.g:
class Doer {
private $sess;
public function __construct(&$sess) {
$this->sess =& $sess;
}
}
$doer = new Doer($_SESSION);
and then use the Doer->sess version from within Doer and such. (The advantage of this method is that it makes clear that Doer uses $_SESSION.)
What's the accepted PHP design approach for this problem?
I do like to wrap $_SESSION, $_POST, $_GET, and $_COOKIE into OOP structures.
I use this method to centralize code that handles sanitation and validation, all of the necessary isset () checks, nonces, setcookie parameters, etc. It also allows client code to be more readable (and gives me the illusion that it's more maintainable).
It may be difficult to enforce use of this kind of structure, especially if there are multiple coders. With $_GET, $_POST, and $_COOKIE (I believe), your initialization code can copy the data, then destroy the superglobal. Maybe a clever destructor could make this possible with $_SESSION (wipe $_SESSION on load, write it back in the destructor), though I haven't tried.
I don't usually use any of these enforcement techniques, though. After getting used to it, seeing $_SESSION in code outside the session class just looks strange, and I mostly work solo.
EDIT
Here's some sample client code, in case it helps somebody. I'm sure looking at any of the major frameworks would give you better ideas...
$post = Post::load ();
$post->numeric ('member_age');
$post->email ('member_email');
$post->match ('/regex/','member_field');
$post->required ('member_first_name','member_email');
$post->inSet ('member_status',array('unemployed','retired','part-time','full-time'));
$post->money ('member_salary');
$post->register ('member_last_name'); // no specific requirements, but we want access
if ($post->isValid())
{
// do good stuff
$firstName = $post->member_first_name;
}
else
{
// do error stuff
}
Post and its friends all derive from a base class that implements the core validation code, adding their own specific functionality like form tokens, session cookie configuration, whatever.
Internally, the class holds a collection of valid data that's extracted from $_POST as the validation methods are called, then returns them as properties using a magic __get method. Failed fields can't be accessed this way. My validation methods (except required) don't fail on empty fields, and many of them use func_get_args to allow them to operate on multiple fields at once. Some of the methods (like money) automatically translate the data into custom value types.
In the error case, I have a way to transform the data into a format that can be saved in the session and used to pre-populate the form and highlight errors after redirecting to the original form.
One way to improve on this would be to store the validation info in a Form class that's used to render the form and power client-side validation, as well as cleaning the data after submission.
Modifying the contents of the superglobals is considered poor practice. While there's nothing really wrong with it, especially if the code is 100% under your control, it can lead to unexpected side effects, especially when you consider mixed-source code. For instance, if you do something like this:
$_POST['someval'] = mysql_real_escape_string($_POST['someval']);
you might expect that everywhere PHP makes that 'someval' available would also get changed, but this is not the case. The copy in $_REQUEST['someval'] will be unchanged and still the original "unsafe" version. This could lead to an unintentional injection vulnerability if you do all your escaping on $_POST, but a later library uses $_REQUEST and assumes it's been escaped already.
As such, even if you can modify them, it's best to treat the superglobals as read-only. If you have to mess with the values, maintain your own parallel copies and do whatever wrappers/access methods required to maintain that copy.
I know this question is old but I'd like to add an answer.
mario's classes to handle the inputs is awesome.
I much prefer wrapping the superglobals in some way. It can make your code MUCH easier to read and lead to better maintainability.
For example, there is some code at my current job the I hate! The session variables are used so heavily that you can't realistically change the implementation without drastically affecting the whole site.
For example,
Let's say you created a Session class specific to your application.
class Session
{
//some nice code
}
You could write something like the following
$session = new Session();
if( $session->isLoggedIn() )
{
//do some stuff
}
As opposed to this
if( $_SESSION['logged'] == true )
{
//do some stuff
}
This seems a little trivial but it's a big deal to me. Say that sometime in the future I decide that I want to change the name of the index from 'logged' to 'loggedIn'.
I now have to go to every place in the app that the session variable is used to change this. Or, I can leave it and find someway to maintain both variables.
Or what if I want to check that that user is an admin user and is logged in? I might end up checking two different variables in the session for this. But, instead I could encapsulate it into one method and shorten my code.
This helps other programmers looking at your code because it becomes easier to read and they don't have to 'think' about it as much when they look at the code. They can go to the method and see that there is only ONE way to have a logged in user. It helps you too because if you wanted to make the 'logged' in check more complex you only have to go to one place to change it instead of trying to do global finds with your IDE and trying to change it that way.
Again, this is a trivial example but depending on how you use the session this route of using methods and classes to protect access could make your life much easier to live.
I would not recommend at all passing superglobal by reference. In your class, it's unclear that what you are modifying is a session variable. Also, keep in mind that $_SESSION is available everywhere outside your class. It's so wrong from a object oriented point of view to be able to modify a variable inside a class from outside that class by modifying a variable that is not related to the class. Having public attribute is consider to be a bad practice, this is even worst.
I found my way here while researching for my new PHP framework.
Validating input is REALLY important. But still, I do often find myself falling back to code like this:
function get( $key, $default=FALSE ){
return (isset($_GET[$key]) ? $_GET[$key]:$default);
}
function post( $key, $default=FALSE ){
return (isset($_POST[$key]) ? $_POST[$key]:$default);
}
function session( $key, $default=FALSE ){
return (isset($_SESSION[$key]) ? $_SESSION[$key]:$default);
}
Which I then use like this:
$page = get('p', 'start');
$first_name = post('first_name');
$last_name = post('last_name');
$age = post('age', -1);
I have found that since I have wildly different requirements for validation for different projects, any class to handle all cases would have to be incredibly large and complex. So instead I write validation code with vanilla PHP.
This is not good use of PHP.
get $_SESSION variables directly:
$id = $_SESSION['id'];
$hash = $_SESSION['hash'];
etc.

Categories