OOP principles were difficult for me to grasp because for some reason I could never apply them to web development. As I developed more and more projects I started understanding how some parts of my code could use certain design patterns to make them easier to read, reuse, and maintain so I started to use it more and more.
The one thing I still can't quite comprehend is why I should abstract my data layer. Basically if I need to print a list of items stored in my DB to the browser I do something along the lines of:
$sql = 'SELECT * FROM table WHERE type = "type1"';'
$result = mysql_query($sql);
while($row = mysql_fetch_assoc($result))
{
echo '<li>'.$row['name'].'</li>';
}
I'm reading all these How-Tos or articles preaching about the greatness of PDO but I don't understand why. I don't seem to be saving any LoCs and I don't see how it would be more reusable because all the functions that I call above just seem to be encapsulated in a class but do the exact same thing. The only advantage I'm seeing to PDO are prepared statements.
I'm not saying data abstraction is a bad thing, I'm asking these questions because I'm trying to design my current classes correctly and they need to connect to a DB so I figured I'd do this the right way. Maybe I'm just reading bad articles on the subject :)
I would really appreciate any advice, links, or concrete real-life examples on the subject!
Think of a abstracting the data layer as a way to save time in the future.
Using your example. Let's say you changed the names of the tables. You would have to go to each file where you have a SQL using that table and edit it. In the best case, it was a matter of search and replace of N files. You could have saved a lot of time and minimized the error if you only had to edit one file, the file that had all your sql methods.
The same applies to column names.
And this is only considering the case where you rename stuff. It is also quite possible to change database systems completely. Your SQL might not be compatible between Sqlite and MySQL, for example. You would have to go and edit, once again, a lot of files.
Abstraction allows you to decouple one part from the other. In this case, you can make changes to the database part without affecting the view part.
For very small projects this might be more trouble than it is worth. And even then, you should still do it, at least to get used to it.
I'm NOT a php person but this is a more general question so here goes.
You're probably building something small, sometimes though even something small/medium should have an abstracted data layer so it can grow better.
The point is to cope with CHANGE
Think about this, you have a small social networking website. Think about the data you'll store, profile details, pictures, friends, messages. For each of these you'll have pages like pictures.php?&uid=xxx.
You'll then have a little piece of SQL slapped in there with the mysql code. Now think of how easy/difficult it would be to change this? You would change 5-10 pages? When you'll do this, you'll probably get it wrong a few times before you test it thoroughly.
Now, think of Facebook. Think of the amount of pages there will be, do you think it'll be easier to change a line of SQL in each page!?
When you abstract the data access correctly:
Its in one place, its easier to change.
Therefore its easier to test.
Its easier to replace. (Think about what you'd have to do if you had to switch to another Database)
Hope this Helps
One of the other advantage of abstracting the data layer is to be less dependent on the underlying database.
With your method, the day you want to use something else than mysql or your column naming change or the php API concerning mysql change, you will have to rewrite a lot of code.
If all the database access part was neatly abstracted, the needed changes will be minimal and restricted to a few files instead of the whole project.
It is also a lot easier to reuse code concerning sql injection or others utility function if the code is centralized in one place.
Finally, it's easier to do unit testing if everything goes trough some classes than on every pages from your project.
For example, in a recent project of mine (sorry, no code sharing is possible), mysql related functions are only called in one class. Everything from query generation to object instantiation is done here. So it's very for me to change to another database or reuse this class somewhere else.
In my opinion, the data access is one of the most important aspects to separate / abstract out from the rest of your code.
Separating out various 'layers' has several advantages.
1) It neatly organises your code base. If you have to make a change, you'll know immediately where the change needs to be made and where to find the code. This might not be so much of a big deal if you're working on a project on your own but with a larger team the benefits can quickly become obvious. This point is actually pretty trivial but I added it anyway. The real reason is number 2..
2) You should try to separate things that might need to change independently of each other. In your specific example, it is conceivable that you would want to change the DB / data access logic without impacting the user interface. Or, you might want to change the user interface without impacting on the data access. Im sure you can see how this is made impossible if the code is mixed in with each other.
When your data access layer, has a tightly defined interface, you can change its inner workings however you want, and as long as it still adheres to the interface you can be pretty certain it wont have broken anything further up. Obviously this would still need verifying with testing.
3) Reuse. Writing data access code can get pretty repetitive. It's even more repetitive when you have to rewrite the data access code for each page you write. Whenever you notice something repetitive in code, alarm bells should be ringing. Repetitiveness, is prone to errors and causes a maintenance problem.
I'm sure you see the same queries popping up in various different pages? This can be resolved by putting those queries lower down in your data layer. Doing so helps to ease maintenance; whenever a table or column name changes, you only need to correct the one place in your data layer that references it instead of trawling through your entire user interface and potentially missing something.
4) Testing. If you want to use automated tool to carry out unit testing you will need everything nicely separated. How will you test your code to select all Customer records when this code is scattered all throughout your interface? It is much easier when you have a specific SelectAllCustomers function on a data access object. You can test this once here and be sure that it will work for every page that uses it.
There are more reasons that I'll let other people add. The main thing to take away is that separating out layers allows one layer to change without letting the change ripple through to other layers. As the database and user interface are areas of an application / website that change the most frequently it is a very good idea to keep them separate and nicely isolated from everything else and each other.
In my point of view to print just a list of items in a database table, your snippet is the more appropriate: fast, simple and clear.
I think a bit more abstraction could be helpful in other cases to avoid code repetitions with all the related advantages.
Consider a simple CMS with authors, articles, tags and a cross reference table for articles and tags.
In your homepage your simple query will become a more complex one. You will join articles and users, then you will fetch related tag for each article joining the tags table with the cross reference one and filtering by article_id.
You will repeat this query with some small changes in the author profile and in the tag search results.
Using a abstraction tool like this, you can define your relations once and use a more concise syntax like:
// Home page
$articles = $db->getTable('Article')->join('Author a')
->addSelect('a.name AS author_name');
$first_article_tags = $articles[0]->getRelated('Tag');
// Author profile
$articles = $db->getTable('Article')->join('Author a')
->addSelect('a.name AS author_name')->where('a.id = ?', $_GET['id']);
// Tag search results
$articles = $db->getTable('Article')->join('Author a')
->addSelect('a.name AS author_name')
->join('Tag')->where('Tag.slug = ?', $_GET['slug']);
You can reduce the remaining code repetition encapsulating it in Models and refactoring the code above:
// Home page
$articles = Author::getArticles();
$first_article_tags = $articles[0]->getRelated('Tag');
// Author profile
$articles = Author::getArticles()->where('a.id = ?', $_GET['id']);
// Tag search results
$articles = Author::getArticles()
->join('Tag')->where('Tag.slug = ?', $_GET['slug']);
There are other good reasons to abstract more or less, with its pros and cons. But in my opinion for a big part the web projects the main is this one :P
Related
I'm coding an interface class at the moment to manage showing of user comments on two different kinds of pages, one page is the profiles and the other is the media pages.
Both sets of comments are stored in different tables but I'm wondering whether I should use one function or split both tables into a separate function.
Is the overall goal of OOP to have code that works well for your site or to be able to use it over in different sections without the need to modify lots?
I could have:
showComments($pageId, $type, $userType)
{
if($type == 'media')
$sql = "SELECT comment FROM mediatable WHERE id=:pageId";
elseif($type == 'profile')
$sql = "SELECT comment FROM profileTable WHERE id=:pageId";
if($userType == 'moderator')
//show Moderation Tools
//Rest of code goes here
}
Or I could seperate it into different functions like so:
showMediaComments($id);
moderateMediaComments($id);
showProfileComments($id);
moderateProfileComments($id);
I'm thinking the second method would be better as I could then use the code again easier but it would required more lines of code ...
Neither one is proper OOP. A proper way would be to have a abstract Comment class and subclasses MediaComments, ProfileComments that handle the differences. Also, read about the MVC architecture
It really depends on what your goal is for your OOP site.
Many people use OOP to different extents and in many different ways, however, if you're using a true MVC, it would have a model class for each database table, so mediatable and profiletable queries wouldn't be in the same function.
That said, in my opinion, separating them would be better. It tends to keep code cleaner and more localized.
Is the overall goal of OOP to have code that works well for your site
or to be able to use it over in different sections without the need to
modify lots?
The overall goal of OOP is to design systems that are more flexible to future changes by providing interfaces and classes which can be optionally extended by future developers to meet needs not originally accounted for.
The goal of modular design (in procedural or OOP code) is to create small chunks of code that represent each logic bit which is it's own independent block (i.e. function/method). In other words, each logical task should be broken down into the individual components that make it up in the same way you would normalize a database table in a RDBMS.
Is the overall goal of OOP to have code that works well for your site
Think about it, OOP or not, you'll always want code that works well for your site :)
or to be able to use it over in different sections without the need to modify lots?
I agree pretty much with the answer from #Xeoncross for this one, but I like to view OOP from a more conceptual point of view: objects that collaborates throught message sends with the goal of modelling a subset of the reality (that is, your domain problem). I found it easier to think in terms of domain concepts instead of classes and interfaces (that's the way you use to express those domain concepts).
About the sample you posted, it's good to separate each branch of the conditional in its own method.
To learn why the previous is important, let´s take the following example: suppose there´s a bug in the media comments part of then code (and rest is working fine). For fixing it, you'll have to modify that part, potentially making changes that breaks already working code just because all belongs to the same method (remember, "If it ain't broke, don't fix it").
Follow #Darhazer advice and put each pice of logic in the class where it belongs. That way a class will only know about one and only one thing (MediaComments contains only the code that deals with media comments, ProfileComments does the same for profile comments, and so on). That will make it easy to locate, maintain and modify code related to a particular concept.
You can learn more about the reasons behind this decision reading about The Single Responsibility Principle here and here.
Also, I think you'll find interesting the "Replace Conditional with Polymorphism" section of Martin Fowler's book: Refactoring: Improving the Design of Existing Code that explains the drawbacks of having a conditional statement like yours and gives you an object oriented technique to get rid of it.
It depends, but generally speaking you should try to separe chunk of code and do a simple task with each method. What will happen if you add another type of media in the future - you will end up adding "if elses" to this method.
I quite often see in PHP, WordPress plugins specifically, that people write SQL directly in their plugins... The way I learnt things, everything should be handled in layers... So that if one day the requirements of a given layer change, I only have to worry about changing the layer that everything interfaces.
Right now I'm writing a layer to interface the database so that, if something ever changes in the way I interact with databases, all I have to do is change one layer, not X number of plugins I've created.
I feel as though this is something that other people may have come across in the past, and that my approach my be inefficient.
I'm writing classes such as
Table
Column
Row
That allow me to create database tables, columns, and rows using given objects with specific methods to handle all their functions:
$column = new \Namespace\Data\Column ( /* name, type, null/non-null, etc... */ );
$myTable = new \Namespace\Data\Table( /* name, column objects, and indexes */ );
\Namespace\TableModel.create($myTable);
My questions are...
Has someone else already written something to provide some separation between different layers?
If not, is my approach going to help at all in the long run or am I wasting my time; should I break down and hard-code the sql like everyone else?
If it is going to help writing this myself, is there any approach I could take to handle it more efficiently?
You seem to be looking for an ORM.
Here is one : http://www.doctrine-project.org/docs/orm/2.0/en/tutorials/getting-started-xml-edition.html
To be honest, I'd just hard-code the SQL, because:
Everyone else does so too. Big parts of WordPress would need to be rewritten, if they would ever wish to change from MySQL to something else. It would just be a waste of time to write your perfect layer for your plugin, if the rest of the whole system still only works with hard-coded SQL.
We don't live in a perfect world. Too much abstraction will - soon or late - end up in performance and other issues, which I don't even think of yet. Keep it simple. Also, using SQL you can benefit from some performance "hacks", which maybe won't work for other systems.
SQL is a widely accepted standard and can already be seen as abstraction layer. for example there's even the possibility to access Facebook's Graph via SQL-like syntax (see FQL). If you want to change to another data-source, you'll probably find some layer wich supports SQL-syntax anyways! In that sense, you could even say SQL already is some kind of abstraction layer.
But: if you decide to use SQL, be sure to use WordPress' $wpdb. Using that, you're on the safe side, as WordPress takes care of connecting to the database, forming the queries, etc. If, one day, WordPress will decide to change from databases to something else, they'll need to create a $wpdb-layer to that new source - for backwards compatibility. Also, many general requests already are in $wpdb as functions (such as $wpdb->insert()), so there's no direct need to hard-code SQL.
If however, you decide to use such an abstraction layer: Wikipedia has more information.
Update: I just found out that the CMS Drupal uses a database abstraction layer - but they still use SQL to form their queries, for all the different databases! I think that shows pretty clearly, how SQL can already be used as an abstraction layer.
I have a design discussion with a collegue.
We are implementing a PHP MySQL database application. In the first instance we have written the Insert Get Update Delete, SelectAll and Search functions for a particular table, writting the table and fieldnames in the code, with several php files, one for the object class, one to draw the HTML table for that table, one for editing a row of that table, one containing the above functions, etc.
The disagreement comes as I have now written generic functions that read/write from the database and draw the HTML taking the table name as a parameter, letting these functions discovers the fieldnames from the database or class. So now that code can be used for any table, with any fields without having to manually go in change each function that needs alteration. I understand there will be cases where more table specific functionality is needed, and this I think that should be done as requirements arise, integrating common parts where possible.
My collegue on the other hand is adamant we should keep a set of files separate for each table, i.e. around 5 php files for each table. The read/write functions written differently in each case, with the need for any changes required for all tables to be affecting 5 x number of tables amount of times.
I can certainly say there will be more than main 15 tables in the database that will at least need basic funcionality.
What approach do you think is most appropriate?
One of the important principles in programming is DRY : Don't Repeat Yourself. So, everything common to several usecases should be written once, in a single location.
Now, I've never had to develop an application where each database table had the same, generic, crud pages. If it were the case, it wouldn't be a functional application, but a database management application. Are you sure you aren't redeveloping phpMyAdmin?
If you need the same code to handle several base operations on several tables, I would say you shouldn't write that code more than once : no need to repeat yourself :
it would take more time writing the application
it would take more time (and possible cause more bugs) when maintaining it
A possible solution could be to put as much common code as possible into a single class, common to many tables ; and, then, have a specific class for each table, that would extend that common class.
This way :
The common code would be written only once
Each table could have its specific code
The common code could be overridden for a specific table, if necessary.
About that, see the Object Inheritance section of the manual.
About that common class idea, you could even go farther, and write pretty much no code at all, if your CRUD is generic enough : there are ORM frameworks that might do that for you.
For instance, you might want to take a look at Doctrine.
We're on the first stages of development, and we don't have the complete functional specifications for the web application we're developing. Yes, we know, but it's not our fault.
So, we're building some parts keeping them pretty simple and straight-forward so we can build on top of that when we have more details on what to build.
We have a section for clients, for ads, for users, ... and I wanted to keep things separate because we don't know what's coming in the future. Yes, at the moment we have only a few fields and some basic listings and editing pages, but all that will grow.
It's not that I don't want to implement some generic code that we can reuse. It's that we don't know yet what will be the limitations in the near future, and I don't want to write generic code that we'll have to parametrize intensely.
For example, Alex built a generic Update method to which you pass an object and it will create an UPDATE SQL statement and execute it. OK, that's cool, but that doesn't work for the Users section of the web app because we store the password encoded. First, it won't encode the password. Second, if you edit a user and don't enter anything on the password and password-confirmation fields, the old password will remain. So, we have a problem with the generic Update method, and as I see it there are two possible solutions:
a) Parametrize the Update method so if it is modifying a user, keep the password if the password on the object is blank. And encode the password, of course.
b) Override the Update method for the child class.
Alex's implementation didn't use inheritance and he used the generic methods in a static class he'd call this way DataAccess::Update($object);. The method takes the table name from the class name as he modified the database to make them match (I prefer "Clients" for the table and "Client" for the class). So, option b is not possible with Alex's implementation.
The way I was trying to build it was keeping separate Update methods for each table. Yes, I was repeating myself but, as I said before, we don't have a full specification, so we don't know how it's going to grow. We have an idea, but we don't have the exact details.
So, the point here is that I don't want to write generic code until we have a much more detailed specification so we can evaluate what can and what cannot be shared between the parts.
Not all sections of the web app work the same and, as JB Nizet said: "If it were the case, it wouldn't be a functional application, but a database management application."
And I can tell you for sure this is not a database management application, even though Alex would say "we're just building a database app." Well, maybe, but a database application is not only showing/modifying the tables. And now, views won't solve all problems.
Again, as JB Nizet said: "If it were the case, it wouldn't be a functional application, but a database management application."
And now I'm repeating myself again, but this time there's no reason for that.
Thanks for your time.
I have a large system that I have coded and I wish to make the code (dare I say) more simple and easy to read. Unfortunately before this, I haven't used functions much.
I have many different MySQL Queries that are run in my code, I feel that if I make the various displays into functions and store them in a separate file it would make the code much easier to maintain (actually, I know it would).
The only thing I am wondering is, if this is a common practice and if you think that it is going to hurt me in the long run in terms of performance and other factors. Here is an example of what I currently am using:
$result = mysql_query("SELECT * FROM table");
while($row = mysql_fetch_array($result)){
/* Display code will go here */
}
As you can imagine this can get lengthy. I am thinking of making a function that will take the result variable, and accomplish this, and then return the results, as so:
$result = mysql_query("SELECT * FROM table");
dsiplayinfo($result);
Do you think this is the right way to go?
[Edit]
The functions would be very different because each of them needs to display the data in different ways. There are different fields of the database that need to be shown in each scenario. Do you feel that this approach is still a good one even with that factor? AKA Modular design is not being fully accomplished, but easy maintenance is.
One thing you want to keep in mind is the principle of DRY: Don't Repeat Yourself.
If you are finding there is a block of code that you are using multiple times, or is very similar that can be made the same, then it is a ideal candidate for being moved into a function.
Using more functions can be helpful, and it can be hurtful. In theory it moves you more towards modular design, meaning you can use a function over and over in multiple apps without having to re-write it.
I would honestly encourage you to more toward larger conventions, like the MVC Frameworks out there. Kohana is a great one. It uses things like Helpers for additional outside functionality, Models to query the database, and Controllers to perform all of your logic - and the end-result is passed on to the View to be formatted with HTML/CSS and spiced up with Javascript.
Don't use "SELECT *" -- enumerate the fields you want for performance reasons, as well as maintainence reasons.
In your example, the display code will likely be pretty closely coupled with the SQL query, so you may as well encapsulate them both together
You might consider some sort of MVC framework with an ORM (like CakePHP) which will facilitate Model reuse much better than writing a bunch of functions
You are on the right track! Write your code, then refactor it to make it better--very smart.
Yes, also consider looking into an ORM or some database agnostic interface. This might help reduce duplication as well (and certainly make porting to a new DB easier, should that ever come up).
Basically any time you see similar looking code (be it in structure or in functionality) you have an opportunity to factor it out into functions that can be shared across the application. A good rule of thumb is Don't Repeat Yourself (DRY)
I'm currently reading up on OO before I embark upon my next major project. To give you some quick background, I'm a PHP developer, working on web applications.
One area that particularly interests me is the User Interface; specifically how to build this and connect it to my OO "model".
I've been doing some reading on this area. One of my favourites is this:
Building user interfaces for object-oriented systems
"All objects must provide their own UI"
Thinking about my problem, I can see this working well. I build my "user" object to represent someone who has logged into my website, for example. One of my methods is then "display_yourself" or similar. I can use this throughout my code. Perhaps to start with this will just be their name. Later, if I need to adjust to show their name+small avatar, I can just update this one method and hey-presto, my app is updated. Or if I need to make their name a link to their profile, hey-presto I can update again easily from one place.
In terms of an OO system; I think this approach works well. Looking on other StackOverflow threads, I found this under "Separation of Concerns":
Soc
"In computer science, separation of
concerns (SoC) is the process of
breaking a computer program into
distinct features that overlap in
functionality as little as possible. A
concern is any piece of interest or
focus in a program. Typically,
concerns are synonymous with features
or behaviors. Progress towards SoC is
traditionally achieved through
modularity and encapsulation, with the
help of information hiding."
To my mind I have achieved this. My user object hides all it's information. I don't have any places in my code where I say $user->get_user_name() before I display it.
However, this seems to go against what other people seem to think of as "best practice".
To quote the "selected" (green one) answer from the same question:
"The separation of concerns is keeping
the code for each of these concerns
separate. Changing the interface
should not require changing the
business logic code, and vice versa.
Model-View-Controller (MVC) design
pattern is an excellent example of
separating these concerns for better
software maintainability."
Why does this make for better software maintainability? Surely with MVC, my View has to know an awful lot about the Model? Read the JavaWorld article for a detailed discussion on this point:
Building user interfaces for object-oriented systems
Anyway... getting to the actual point, finally!
1. Can anyone recommend any books that discuss this in detail? I don't want an MVC book; I'm not sold on MVC. I want a book that discusses OO / UI, the potential issues, potential solutuions etc.. (maybe including MVC)
Arthur Riel's Object-Oriented Design Heuristics
touches on it (and is an excellent book as well!), but I want something that goes into more detail.
2. Can anyone put forward an argument that is as well-explained as Allen Holub's JavaWorld article that explains why MVC is a good idea?
Many thanks for anyone who can help me reach a conclusion on this.
This is a failure in how OOP is often taught, using examples like rectangle.draw() and dinosaur.show() that make absolutely no sense.
You're almost answering your own question when you talk about having a user class that displays itself.
"Later, if I need to adjust to show their name+small avatar, I can just update this one method and hey-presto, my app is updated."
Think about just that little piece for moment. Now take a look at Stack Overflow and notice all of the places that your username appears. Does it look the same in each case? No, at the top you've just got an envelope next to your username followed by your reputation and badges. In a question thread you've got your avatar followed by your username with your reputation and badges below it. Do you think that there is a user object with methods like getUserNameWithAvatarInFrontOfItAndReputationAndBadgesUnderneath() ? Nah.
An object is concerned with the data it represents and methods that act on that data. Your user object will probably have firstName and lastName members, and the necessary getters to retrieve those pieces. It might also have a convenience method like toString() (in Java terms) that would return the user's name in a common format, like the first name followed by a space and then the last name. Aside from that, the user object shouldn't do much else. It is up to the client to decide what it wants to do with the object.
Take the example that you've given us with the user object, and then think about how you would do the following if you built a "UI" into it:
Create a CSV export showing all users, ordered by last name. E.g. Lastname, Firstname.
Provide both a heavyweight GUI and a Web-based interface to work with the user object.
Show an avatar next to the username in one place, but only show the username in another.
Provide an RSS list of users.
Show the username bold in one place, italicized in another, and as a hyperlink in yet another place.
Show the user's middle initial where appropriate.
If you think about these requirements, they all boil down to providing a user object that is only concerned with the data that it should be concerned with. It shouldn't try to be all things to everyone, it should just provide a means to get at user data. It is up to each of the many views you will create to decide how it wants to display the user data.
Your idea about updating code in one place to update your views in many places is a good one. This is still possible without mucking with things at a too low of a level. You could certainly create widget-like classes that would encapsulate your various common views of "stuff", and use them throughout your view code.
Here's the approach I take when creating websites in PHP using an MVC/separation of concerns pattern:
The framework I use has three basic pieces:
Models - PHP Classes. I add methods to them to fetch and save data. Each
model represents a distinct type of entity in the system: users, pages,
blog posts
Views - Smarty templates. This is where the html lives.
Controllers - PHP classes. These are the brains of the application. Typically
urls in the site invoke methods of the class. example.com/user/show/1 would
invoke the $user_controller->show(1) method. The controller fetches data out
of the model and gives it to the view.
Each of these pieces has a specific job or "concern". The model's job is to provide a clean interface to the data. Typically the site's data is stored in a SQL database. I add methods to the model for fetching data out and saving data in.
The view's job is to display data. All HTML markup goes in the view. Logic to handle zebra-striping for a table of data goes in the view. Code to handle the format that a date should be displayed in goes in the view. I like using Smarty templates for views because it offers some nice features to handle things like that.
The controller's job is to act as an intermediary between the user, the model, and the view.
Let's look at an example of how these come together and where the benefits lie:
Imagine a simple blog site. The main piece of data is a post. Also, imagine that the site keeps track of the number of times a post is viewed. We'll create a SQL table for that:
posts
id date_created title body hits
Now suppose you would like to show the 5 most popular posts. Here's what you might see in a non MVC application:
$sql = "SELECT * FROM posts ORDER BY hits DESC LIMIT 5";
$result = mysql_query($sql);
while ($row = mysql_fetch_assoc($result)) {
echo "$row['title']<br />";
}
This snippet is pretty straightforward and works well if:
It is the only place you want to show the most popular posts
You never want to change how it looks
You never decide to change what a "popular post" is
Imagine that you want to show the 10 most popular posts on the home page and the 5 most popular in a sidebar on subpages. You now need to either duplicate the code above, or put it in an include file with logic to check where it is being displayed.
What if you want to update the markup for the home page to add a "new-post" class to posts that were created today?
Suppose you decide that a post is popular because it has a lot of comments, not hits. The database will change to reflect this. Now, every place in your application that shows popular posts must be updated to reflect the new logic.
You are starting to see a snowball of complexity form. It's easy to see how things can become increasingly difficult to maintain over the course of a project. Also, consider the complexity when multiple developers are working on a project. Should the designer have to consult with the database developer when adding a class to the output?
Taking an MVC approach and enforcing a separation of concerns within your application can mitigate these issues. Ideally we want to separate it out into three areas:
data logic
application logic
and display logic
Let's see how to do this:
We'll start with the model. We'll have a $post_model class and give it a method called get_popular(). This method will return an array of posts. Additionally we'll give it a parameter to specify the number of posts to return:
post_model.php
class post_model {
public function get_popular($number) {
$sql = "SELECT * FROM posts ORDER BY hits DESC LIMIT $number";
$result = mysql_query($sql);
while($row = mysql_fetch_assoc($result)) {
$array[] = $row;
}
return $array;
}
}
Now for the homepage we have a controller, we'll call it "home". Let's imagine that we have a url routing scheme that invokes our controller when the home page is requested. It's job is to get the popular posts and give them to the correct view:
home_controller.php
class home_controller {
$post_model = new post_model();
$popular_posts = $post_model->get_popular(10);
// This is the smarty syntax for assigning data and displaying
// a template. The important concept is that we are handing over the
// array of popular posts to a template file which will use them
// to generate an html page
$smarty->assign('posts', $popular_posts);
$smarty->view('homepage.tpl');
}
Now let's see what the view would look like:
homepage.tpl
{include file="header.tpl"}
// This loops through the posts we assigned in the controller
{foreach from='posts' item='post'}
{$post.title}
{/foreach}
{include file="footer.tpl"}
Now we have the basic pieces of our application and can see the separation of concerns.
The model is concerned with getting the data. It knows about the database, it knows about SQL queries and LIMIT statements. It knows that it should hand back a nice array.
The controller knows about the user's request, that they are looking at the homepage. It knows that the homepage should show 10 popular posts. It gets the data from the model and gives it to the view.
The view knows that an array of posts should be displayed as a series of achor tags with break tags after them. It knows that a post has a title and an id. It knows that a post's title should be used for the anchor text and that the posts id should be used in the href. The view also knows that there should be a header and footer shown on the page.
It's also important to mention what each piece doesn't know.
The model doesn't know that the popular posts are shown on the homepage.
The controller and the view don't know that posts are stored in a SQL database.
The controller and the model don't know that every link to a post on the homepage should have a break tag after it.
So, in this state we have established a clear separation of concerns between data logic (the model), application logic (the controller), and display logic (the view). So now what? We took a short simple PHP snippet and broke it into three confusing files. What does this give us?
Let's look at how having a separation of concerns can help us with the issues mentioned above. To reiterate, we want to:
Show popular posts in a sidebar on subpages
Highlight new posts with an additional css class
Change the underlying definition of a "popular post"
To show the popular posts in a sidebar we'll add two files our subpage:
A subpage controller...
subpage_controller.php
class subpage_controller {
$post_model = new post_model();
$popular_posts = $post_model->get_popular(5);
$smarty->assign('posts', $popular_posts);
$smarty->view('subpage.tpl');
}
...and a subpage template:
subpage.tpl
{include file="header.tpl"}
<div id="sidebar">
{foreach from='posts' item='post'}
{$post.title}
{/foreach}
</div>
{include file="footer.tpl"}
The new subpage controller knows that the subpage should only show 5 popular posts. The subpage view knows that subpages should put the list of posts inside a sidebar div.
Now, on the homepage we want to highlight new posts. We can achieve this by modifying the homepage.tpl.
{include file="header.tpl"}
{foreach from='posts' item='post'}
{if $post.date_created == $smarty.now}
<a class="new-post" href="post.php?id={$post.id}">{$post.title}</a>
{else}
{$post.title}
{/if}
{/foreach}
{include file="footer.tpl"}
Here the view handles all of the new logic for displaying popular posts. The controller and the model didn't need to know anything about that change. It's purely display logic. The subpage list continues to show up as it did before.
Finally, we'd like to change what a popular post is. Instead of being based on the number of hits a page got, we'd like it to be based on the number of comments a post got. We can apply that change to the model:
post_model.php
class post_model {
public function get_popular($number) {
$sql = "SELECT * , COUNT(comments.id) as comment_count
FROM posts
INNER JOIN comments ON comments.post_id = posts.id
ORDER BY comment_count DESC
LIMIT $number";
$result = mysql_query($sql);
while($row = mysql_fetch_assoc($result)) {
$array[] = $row;
}
return $array;
}
}
We have increased the complexity of the "popular post" logic. However, once we've made this change in the model, in one place, the new logic is applied everywhere. The homepage and the subpage, with no other modifications, will now display popular posts based on comments. Our designer didn't need to be involved in this. The markup is not affected.
Hopefully, this provides a compelling example of how separating the concerns of data logic, application logic, and display logic, can make developing your application easier. Changes in one area tend to have less of an impact on other areas.
Following this convention isn't a magic bullet that will automatically make your code perfect. And you will undoubtedly come across issues where it is far less clear where the separation should be. In the end, it's all about managing complexity within the application.
You should give plenty of thought to how you construct your models. What sort of interfaces will they provide (see Gregory's answer regarding contracts)? What data format does the controller and view expect to work with? Thinking about these things ahead of time will make things easier down the road.
Also, there can be some overhead when starting a project to get all of these pieces working together nicely. There are many frameworks that provide the building blocks for models, controllers, templating engines, url routing, and more. See many other posts on SO for suggestions on PHP MVC frameworks. These frameworks will get you up and running but you as the developer are in charge of managing complexity and enforcing a separation of concerns.
I will also note that the code snippets above are just simplified examples. They may (most likely) have bugs. However, they are very similar in structure to the code I use in my own projects.
I am not sure I can lead you to the water you wat to drink, but I think I can answer some of your concerns.
First, in MVC, the model and the view do have some interplay, but the view is really coupled to contract and not to implementation. You can shift out to other models that adhere to the same contract and still be able to use the view. And, if you think about it, it makes sense. A user has a first name and last name. He probably also has a logon name and a password, although you might or might not tie this to the "contract" of what a user is. The point is, once you determine what a user is, it is unlikely to change much. You might add something to it, but it is unlikely you are going to take away that often.
In the view, you have pointers to the model that adheres to that contract, but I can use a simple object:
public class User
{
public string FirstName;
public string LastName;
}
Yes, I realize public fields are bad. :-) I can also use a DataTable as a model, as long as it exposes FirstName and LastName. That may not be the best example, but the point is the model is not tied to the view. The view is tied to a contract and the particular model adheres to that contract.
I have not heard of every object must have its own UI. There are essentially two types of objects: state and behavior. I have seen examples that have both state and behavior, but they generally are in systems that are not very testable, which I am not fond of. Ultimately, every state object should be exposed to some form of UI to avoid forcing IT people to handle all the updates directly in a data store, perhaps, but have their own UI? I would have to see that written in an explanation to try to understand what the user is doing.
As for SoC, the reasaon to package things distinctly is the ability to switch out layers/tiers without rewriting the entire system. In general, the app is really located in the business tier, so that part cannot as easily be switched out. The data and UI should be fairly easy to switch out in a well designed system.
As far as books on understanding OOP, I tend to like books on patterns, as they are more practical ways of understanding the concepts. You can find the primer material on the web. If you want a language agnostic pattern book, and think a bit geeky, the Gang of Four book is a good place to start. For more creative types, I would say Heads Up Design Patterns.
The problem with the idea that all your objects know how to display themselves is that each object can only be displayed in one way. What happens if you want to provide a detail view of a user, and a summary view. What happens if you want to display a view that merges a number of objects (users and their associated addresses for example). If you seperate your business objects (users) from the things that know how to display them then you have no more code to write, you just seperate it into different places.
This makes software more maintainable because if a user object is behaving incorrectly, you know it is the user, if it is not displaying properly, you know it is the view. In the situation where you need to provide a new interface to your application (say you decide to provide a new look and feel for mobile browsers), then you dont need to change your user object at all, you add a new object that knows how to render the user object for a mobile browser.
SOLID principles provide some good reasoning for this, here is a relatively concise look at these. I am afraid that I dont have a book to hand that sums it up well, but experience has taught me that it is easier to write new code than it is to update old code, and so designs that favour small modular classes that plug together to achieve what is needed, while harder to design up front, are far easier to maintain in the long run. It is great to be able to write a new renderer for a user object, without ever having to delve into the internals of that object.
Can anyone put forward an argument [...] that explains why MVC is a good idea?
It keeps you sane by helping you remember what your code does because they are isolated from each other.
Consider the amount of code that
would go into that single class, if
you want to expose the same info not
only as Html on the UI, but as part
of an RSS, a JSON, a rest service
with XML, [insert something else].
It is a leaky abstraction, meaning it tries to give you the sense that it will be the only piece that will ever know that data, but that can't be entirely truth. Lets say you want to provide a service that will integrate with several external third parties. You will have a really hard time forcing them to use your specific language to integrate with your service (as it is The class the only piece that can ever the data it is using), or if in the other hand you expose some of its data you are not hiding the data from those third parties systems.
Update 1: I gave an overall look at the whole article, and being an old article (99), it isn't really about MVC as we know it today vs. object oriented, nor has arguments that are against the SRP.
You could perfectly be in line with what he said, and handle the above scenario I mentioned with specific classes responsible to translate the object's public contract to the different formats: the main concern was that we didn't have a clear place to handle the changes and also that we didn't want the information to be repeated all over. So, on the html case, you could perfectly have a control that renders the info, or a class that transform it to html or [insert reuse mecanism here].
Btw, I had a flash back with the RMI bit. Anyway, in that example you can see he is tied to a communication mecanism. That said, each method call is remotely handled. I think he was also really concerned on developers having code that instead of getting a single object and operating on the info returned, had lots of small Get calls to get tons of different pieces of information.
Ps. I suggest you read info about DDD and Solid, which as I said for the SRP I wouldn't say it is the type of things the author was complaning about
I don't know any good books on the MVC subject, but from my own experience. In web development for example, many times you work with designers and sometimes dbas. Separating the logic from the presentation allows you to work with people with different skill sets better because the designer doesn't need to much about coding and vice versa. Also, for the concept of DRY, you can make your code less repetitive and easier to maintain. Your code will be more reusable and make your job a lot easier. It will also make you a better developer because you will become more organized and think of programming in a different way. So even if you have to work on something that is not MVC, you might have a different approach to architecting the project because you understand the MVC concepts.
I guess the tradeoff with a lot of MVC frameworks for large sites is that it may not be fast enough to handle the load.
My 2c.. another thing you could do besides what was said is to use Decorators of your User objects. This way, you could decorate the user differently depending on the context. So you'd end up with WebUser.class, CVSUser.class, RSSUser.class, etc.
I don't really do it this way, and it could get messy, but it helps in avoiding the client code from having to pull a lot of info out of your User. It might be something interesting to look into ;-)
Why getter and setter methods are evil (JavaWorld)
Decorator pattern