I cant seem to find an acceptable answer to this.
There are two big things I keep seeing:
1) Don't execute queries in the controller. That is the responsibility of business or data.
2) Only select the columns that you need in a query.
My problem is that these two things kind of butt heads since what is displayed in the UI is really what determines what columns need to be queried. This in turn leads to the obvious solution of running the query in the controller, which you aren't supposed to do. Any documentation I have found googling, etc. seems to conveniently ignore this topic and pretend it isn't an issue.
Doing it in the business layer
Now if I take it the other way and query everything in the business layer then I implicitly am making all data access closely reflect the ui layer. This is more a problem with naming of query functions and classes than anything I think.
Take for example an application that has several views for displaying different info about a customer. The natural thing to do would be to name these data transfer classes the same as the view that needs them. But, the business or service layer has no knowledge of the ui layer and therefore any one of these data transfer classes could really be reused for ANY view without breaking any architecture rules. So then, what do I name all of these variations of, say "Customer", where one selects first name and last name, another might select last name and email, or first name and city, and so on. You can only name so many classes "CustomerSummary".
Entity Framework and IQueryable is great. But, what about everything else?
I understand that in entity framework I can have a data layer pass back an IQuerable whose execution is deferred and then just tell that IQueryable what fields I want. That is great. It seems to solve the problem. For .NET. The problem is, I also do PHP development. And pretty much all of the ORMs for php are designed in a way that totally defeat the purpose of using an ORM at all. And even those dont have the same ability as EF / IQueryable. So I am back to the same problem without a solution again in PHP.
Wrapping it up
So, my overall question is how do I get only the fields I need without totally stomping on all the rules of an ntier architecture? And without creating a data layer that inevitably has to be designed to reflect the layout of the UI layer?
And pretty much all of the ORMs for php are designed in a way that totally defeat the purpose of using an ORM at all.
The Doctrine PHP ORM offers lazy loading down to the property / field level. You can have everything done through proxies that will only query the database as needed. In my experience letting the ORM load the whole object once is preferable 90%+ of the time. Otherwise if you're not careful you will end up with multiple queries to the database for the same records. The extra DB chatter isn't worthwhile unless your data model is messy and your rows are very long.
Keep in mind a good ORM will also offer a built-in caching layer. Populating a whole object once and caching it is easier and more extensible then having your code keep track of which fields you need to query in various places.
So my answer is don't go nuts trying to only query the fields you need when using an ORM. If you are writing your queries by hand just in the places you need them, then only query the fields you need. But since you are talking good architectural patterns I assume you're not doing this.
Of course there are exceptions, like querying large data sets for reporting or migrations. These will require unique optimizations.
Questions
1) Don't execute queries in the controller. That is the responsibility of business or data.
How you design your application is up to you. That being said, it's always best to consider best patterns and practices. The way I design my controllers is that I pass in the data layer(IRepository) through constructor and inject that at run time.
public MyController(IRepository repo)
To query my code I simply call
repository.Where(x=> x.Prop == "whatever")
Using IQueryable creates the leaky abstraction problem. Although, it may not be a big deal but you have to be careful and mindful of how you are using your objects especially if they contain relational data. Once you query your data layer you would construct your view model in your controller action with the appropriate data required for your view.
public ActionResult MyAction(){
var data = _repository.Single(x => x.Id == 1);
var vm = new MyActionViewModel {
Name = data.Name,
Age = data.Age
};
return View();
}
If I had any queries that where complex I would create a business layer to include that logic. This would include enforcing business rules etc. In my business layer I would pass in the repository and use that.
2) Only select the columns that you need in a query.
With ORMs you usually pass back the whole object. After that you can construct your view model to include only the data you need.
My suggestion to your php problem is maybe to set up a web api for your data. It would return json data that you can then parse in whatever language you need.
Hope this helps.
The way I do it is as follows:
Have a domain object (entity, business object .. things with the same name) for Entities\Customer, that has all fields and associated logic for all of the data, that a complete instance would have. But for persistence create two separate data mappers:
Mappers\Customer for handling all of the data
Mappers\CustomerSummary for only important parts
If you only need to get customers name and phone number, you use the "summary mapper", but, when you need to examine user's profile, you have the "all data mapper". And the same separation can be really useful, when updating data too. Especially, if your "full customer" get populated from multiple tables.
// code from a method of some service layer class
$customer = new \Model\Entities\Customer;
$customer->setId($someID);
$mapper = new \Model\Mappers\CustomerSummary($this->db);
if ($needEverything) {
$mapper = new \Model\Mappers\Customer($this->db);
}
$mapper->fetch($customer);
As for, what goes where, you probably might want to read this old post.
Related
I need help with something I can’t get my head wrapped around regarding the Repository and Service/Use-case pattern (part of DDD design) I want to implement in my next (Laravel PHP) project.
All seems clear. Just one part of DDD that is confusing is the data structures from repositories. People seem to choose data structures the Repository should return (arrays or entities) but it all has disadvantages. One of which is performance looking at my experiences in the past. And one is which you don’t have interfaces for simple data structures (array or simple object attributes).
I’ll start with explaining the experience I have with a previous project. This project had flaws but some good strengths I learned from and like to see in my new project but with solving some design mistakes.
Previous experience
In the past I’ve build a website that was API Centric using the Kohana framework and Doctrine 2 ORM (data mapper pattern). The flow looked like this:
Website controller → API client (HMVC calls) → API controller → Custom Repository → Doctrine 2 ORM native Repository/Entity-manager
My custom Repository returned plain arrays using Doctrine2 DQL. Doctrine2 recommends array result data for read only operations. And yes, it made my site nice and light. The API controller just converted the array data to JSON. Simple as that.
In the past my company created projects relying fully on loaded Doctrine2 entities and it’s something we regretted due to performance.
My REST API supported queries like
/api/users?include_latest_adverts=2&include_location=true
on the users resource. The API controller passed include_location to the repository which directly included the location relation. The controller read latest_adverts=2 and called the adverts repository to get the latest 2 adverts of each user. Arrays were returned.
For example first user array:
[
name
avatar
adverts [
advert 1 [
name
price
]
advert 2 [
….
]
]
]
This proved to be very successful. My whole website was using the API. It would be very easy to to add a new client because the API was perfectly in production already using oauth. The whole website runs on it.
But this design had flaws too. My controller still contained A LOT of logic for validation, mailing, params or filters like has_adverts=true to get users with adverts only. It would mean that if I created a new port, like a total new CLI interface, I would have to duplicate alot of these controllers due to all the validation etc. But no duplication if I would create a new client. So at least one problem was solved :-)
My admin panels were completely coupled to the doctrine2 repository/entity-manager to speed up development (sort of). Why? Because my API had fat controllers with special functionality for the website only (special validation, mailing for registering etc). I would have to redo work or refactor a lot. So decided to use the Entities directly to still have some sort clear way of writing code instead of rewriting all my API controllers and move them to Services (for site & admin) for instance. Time was an issue in fixing my design mistakes.
For my next project I want all code to go through my own custom repositories and services. One flow for good separation.
New project (using DDD ideas) and dilemma with data structures
While I like the idea of being API centric, I don’t want my next project to be API centric in core because I think the same functionality should be available without the HTTP protocol in between. I want to design the core using DDD ideas.
But I liked the idea using a layer that just talked as a API and returns simple arrays. The perfect base for any new port, including my own frontend. My idea is to consider my Service classes as the API interface (return the array data), do the validation etc. I could have Services specially for the website (registering) and plain services used by the Admin or background processes. In some admin cases a Service would not be required anyway for simple CRUD editing, I could just use Repositories directly. Controllers would be very thin. With this creating a real REST API would just be a matter to create new controllers using the same Services my frontend controller classes do.
For internal logic like business rules it would be useful to have Entities (clear interfaces) instead of arrays from repositories. This way I could benefit from defining some methods that did some logic based on attributes. BUT If I would be using Doctrine2 and my repositories would always return Entities my application would suffer a big performance hit!!
One data structure ensures performance but no clear interfaces, the other ensures clear interfaces but bad performance when using a Data Pattern pattern like Doctrine 2 (now or in the future). Also I could end up with two data types which would be confusing.
I was thinking something similar to this flow:
Controller (thin) → UserService (incl. validation) → UserRepository (just storage) → Eloquent ORM
Why Eloquent instead of Doctrine2? Because I want to stick a bit to what’s common within the Laravel framework and community. So I could benefit from third party modules, for example to generate admin interfaces or similar based on models (bypassing my DDD rules). Other than using third party modules, I would design my core stuff so switching should always be easy and not affect data structure choices or performance.
Eloquent is an activerecord pattern. So I would be tempted to convert this data to POPO’s like Doctrine2 entities are. But nope... as said above, with doctrine2 real models would make the system very fat. So I fall back to simple arrays again. Knowing this would work for both and any other implementation in the future.
But it feels bad always rely on arrays. Especially when creating internal business rules. A developer would have to guess values on arrays, have no autocompletion in his IDE, could not have special methods like in Entity classes. But making two ways of dealing with data feels bad too. Or I am just too perfectionist ;) I want ONE clear data structure for all!
Building interfaces and POPO’s would mean a lot of duplicate work. I would need to convert an Eloquent model (just a table mapper, not entity) to an entity object implementing this interface. All is extra work. And eventually my last layer would be just like a API, thus converting it to arrays again. Which is extra work too. Arrays seem the deal again.
It seemed so easy reading up into DDD and Hexagonal. It seems so logic! But in reality I struggle with this one simple issue trying to stick to OOP principles. I want to use arrays because it’s the only way to be 100% sure I am not depended on any model choice and querying choice from my ORM regarding performance etc and don't have duplicate work in converting to arrays for views or an API. But there's no clear contract on how a user array could look. I want to speed up my project using these patterns, not slow them down :-) So not an option to have many converters.
Now I read a lot of topics. One makes POPO’s & interfaces that conform proper entities like Doctrine2 could return, but with all the extra work for Eloquent. Switching to Doctrine2 should be fairly easy, but would impact performance so bad or one would need to convert Doctrine2 array data to these own entity interfaces. Others choose to return simple arrays.
One convinces people to use Doctrine2 instead of Eloquent, but they leave out the fact that Doctrine2 is heavy and you really need to use array results for read only operations.
We design repositories to be changeable right? Not because it’s “nice” by design only. So how could we rely on full Entities if it has such big impact on performance or duplicate work? Even when using Doctrine2 only (coupled) this same issue would arise due to its performance!
All ORM implementations would be able to return arrays, thus no duplicate work there. Good performance. But we miss clear contracts. And we don’t have interfaces for arrays or class attributes (as a workaround)... Ugh ;)
Do I just miss a missing block in our programming languages? Interfaces on simple data structures??
Is it wise to make all arrays and have advanced business logic talk to these arrays? Thus no classes with clear interfaces. Any precalculated data (normally would be returned by an Entity method) would be within an array key defined the Service class. if not wise, what’s the alternative considering all of the above?
I would really appreciate if someone with great experience in this “domain” considering performance, different ORM implementations, etc could tell me how he/she has dealt with this?
Thanks in advance!
I think what you are dealing with is something similiar I'm struggling with. The solution I'm thinking works best is:
Entities/Repositories
Use and pass around Entities always when performing Write operations (Creating things, Updating things, Deleting things, and complex combinations thereof).
Sometimes you may use Entities when doing Read operations (when you anticipate the Read might need to be used for a Write soon after...ie. ->findById is soon followed by ->save).
Anytime you are working with an Entity (whether it be Write or Read), the Repositories need to be the place to go. You should be able to tell new developers that they can only persist to the database through Entities and the Repository.
The Entities will have properties that represent some Domain Object (many times they represent a database table with the fields of a table, but not always). They will also contain the domain logic/rules with them (ie. validation, calculations) so they are not anemic. You may additionally have some domain services if your Entities need help interacting with other Entities (need to trigger other events), or you just need an additional place to handle some extra domain logic (perform Repository calls to check for some unique conditions).
Your Repositories will solely be for working with Entities. The Repositories could accept Entities and do some persistence work with them. Or they could accept just some parameters, and do some reading/fetching into full Entities.
Some Repositories will know how to save some Domain Objects that are more complex than others. Perhaps an Entity that has a property which contains a list of other Entities that need to be saved along side the main entity (you can dive deeper into learning about Aggregate roots if you want).
The interfaces to Repositories rest in your Domain layer, but not the actual implementations of those Repositories. That way you can have an Eloquent version or whatever.
Other Queries (Table Data Gateway)
These queries won't work with Entities. They'll just be accepting parameters and returning things like Arrays or POPO's (Plain Old PHP Objects).
Many times you will need to perform Reads that do not return nicely into a single Entity. These Reads are typically more for reporting (not for CRUD-like operations, like Reading a user into an edit form that is eventually submitted and saved). For example, you might have a report that is 200 rows of JOINed data. If you used the Repositiory and tried to return large deep objects (with all the relationships populated, or even lazy-loaded) then you are going to have performance issues. Instead, use the Table Data Gatway pattern. You are just displaying data and not really needing OOP power here. The outputted data could however contain ID's, which through the UI could be used to initiate calls to Repository persistence methods.
As you are developing your app, when you come across the need for a new Read/Report query, create a new method in some class somewhere in your Table Data Gatway folder. You may find you have already created a similar query, so see how you can consolidate the other query. Use some parameters if necessary to make the gateway method's queries more flexible in particular ways (ie. columns to select, sort order, pagination, etc.). Don't make your queries too flexible though, this is where query builders/ORMs go wrong! You need to constrain your queries to a certain extent to where if you need to replace them (perhaps a different database engine) then you can easily perceive what the allowed variations are and aren't. It's up to you to find the right balance between flexibility (so you have more DRY code) and constraints (so you can optimize/replace queries later).
You can create services in your Domain to handle receiving parameters, then passing them to the Table Data Gateway, and then receiving back arrays to do some more mutating on. This will keep your Domain logic in the domain (and out of the infrastructure/persistence layer of the Repository & Table Data Gateway).
Again, just like the Repository, use interfaces in your domain services so that the implementation details stay out of your Domain layer, and resides in the actual Table Data Gateway folder.
I started some time working with the Yii Framework and I saw some things "do not let me sleep." Here I talk about my doubts about how Yii users use the Active Record.
I saw many people add business rules of the application directly in Active Record, the same generated by Gii. I deeply believe that this is a misinterpretation of what is Active Record and a violation of SRP.
Early on, SRP is easier to apply. ActiveRecord classes handle persistence, associations and not much else. But bit-by-bit, they grow. Objects that are inherently responsible for persistence become the de facto owner of all business logic as well. And a year or two later you have a User class with over 500 lines of code, and hundreds of methods in it’s public interface. Callback hell ensues.
When I talked about it with some people and my view was criticized. But when asked:
And when you need to regenerate your Active Record full of business rules through Gii what do you do? Rewrite? Copy and Paste? That's great, congratulations!
Got an answer, only the silence.
So, I:
What I am currently doing in order to reach a little better architecture is to generate the Active Records in a folder /ar. And inside the /models folder add the Domain Model.
By the way, is the Domain Model who owns the business rules, and is the Domain Model that uses the Active Records to persist and retrieve data, and this is the Data Model.
What do you think of this approach?
If I'm wrong somewhere, please tell me why before criticizing harshly.
Some of the comments on this article are quite helpful:
http://blog.codeclimate.com/blog/2012/10/17/7-ways-to-decompose-fat-activerecord-models/
In particular, the idea that your models should grow out of a strictly 'fat model' setup as you need more seems quite wise.
Are you having issues now or mainly trying to plan ahead? This may be hard to plan ahead for and may just need refactoring as you go ...
Edit:
Regarding moveUserToGroup (in your comment below), I could see how having that might bother you. Found this as I was thinking about your question: https://gist.github.com/justinko/2838490 An equivalent setup that you might use for your moveUserToGroup would be a CFormModel subclass. It'll give you the ability to do validations, etc, but could then be more specific to what you're trying to handle (and use multiple AR objects to achieve your objectives instead of just one).
I often use CFormModel to handle forms that have multiple AR objects or forms where I want to do other things.
Sounds like that may be what you're after. More details available here:
http://www.yiiframework.com/doc/guide/1.1/en/form.overview
The definition of Active Record, according to Martin Fowler:
An object carries both data and behavior. Much of this data is persistent and needs to be stored in a database. Active Record uses the most obvious approach, putting data access logic in the domain object. This way all people know how to read and write their data to and from the database.
When you segregate data and behavior you no longer have an Active Record. Two common related patterns are Data Mapper and Table/Row Gateway (this one more related to RDBMS's).
Again, Fowler says:
The Data Mapper is a layer of software that separates the in-memory objects from the database. Its responsibility is to transfer data between the two and also to isolate them from each other. With Data Mapper the in-memory objects needn't know even that there's a database present; they need no SQL interface code, and certainly no knowledge of the database schema.
And again:
A Table Data Gateway holds all the SQL for accessing a single table or view: selects, inserts, updates, and deletes. Other code calls its methods for all interaction with the database.
A Row Data Gateway gives you objects that look exactly like the record in your record structure but can be accessed with the regular mechanisms of your programming language. All details of data source access are hidden behind this interface.
A Data Mapper is usualy storage independent, the mapper recovers data from the storage and creates mapped objects (Plain-old objects). The mapped object knows absolutely nothing about being stored somewhere else.
As I said, TDG/RDG are more inwardly related to a relational table. TDG object represents the structure of the table and implements all common operations. RGD object contains data related to one single row of the table. Unlike mapped object of Data Mapper, the RDG object has conscience that it is part of a whole, because it references its container TDG.
Now that I have read an awfull lot of posts, articles, questions and answers on OOP, MVC and design patterns, I still have questions on what is the best way to build what i want to build.
My little framework is build in an MVC fashion. It uses smarty as the viewer and I have a class set up as the controller that is called from the url.
Now where I think I get lost is in the model part. I might be mixing models and classes/objects to much (or to little).
Anyway an example. When the aim is to get a list of users that reside in my database:
the application is called by e.g. "users/list" The controller then runs the function list, that opens an instance of a class "user" and requests that class to retrieve a list from the table. once returned to the controller, the controller pushes it to the viewer by assigning the result set (an array) to the template and setting the template.
The user would then click on a line in the table that would tell the controler to start "user/edit" for example - which would in return create a form and fill that with the user data for me to edit.
so far so good.
right now i have all of that combined in one user class - so that class would have a function create, getMeAListOfUsers, update etc and properties like hairType and noseSize.
But proper oop design would want me to seperate "user" (with properties like, login name, big nose, curly hair) from "getme a list of users" what would feel more like a "user manager class".
If I would implement a user manager class, how should that look like then? should it be an object (can't really compare it to a real world thing) or should it be an class with just public functions so that it more or less looks like a set of functions.
Should it return an array of found records (like: array([0]=>array("firstname"=>"dirk", "lastname"=>"diggler")) or should it return an array of objects.
All of that is still a bit confusing to me, and I wonder if anyone can give me a little insight on how to do approach this the best way.
The level of abstraction you need for your processing and data (Business Logic) depends on your needs. For example for an application with Transaction Scripts (which probably is the case with your design), the class you describe that fetches and updates the data from the database sounds valid to me.
You can generalize things a bit more by using a Table Data Gateway, Row Data Gateway or Active Record even.
If you get the feeling that you then duplicate a lot of code in your transaction scripts, you might want to create your own Domain Model with a Data Mapper. However, I would not just blindly do this from the beginning because this needs much more code to get started. Also it's not wise to write a Data Mapper on your own but to use an existing component for that. Doctrine is such a component in PHP.
Another existing ORM (Object Relational Mapper) component is Propel which provides Active Records.
If you're just looking for a quick way to query your database, you might find NotORM inspiring.
You can find the Patterns listed in italics in
http://martinfowler.com/eaaCatalog/index.html
which lists all patterns in the book Patterns of Enterprise Application Architecture.
I'm not an expert at this but have recently done pretty much exactly the same thing. The way I set it up is that I have one class for several rows (Users) and one class for one row (User). The "several rows class" is basically just a collection of (static) functions and they are used to retrieve row(s) from a table, like so:
$fiveLatestUsers = Users::getByDate(5);
And that returns an array of User objects. Each User object then has methods for retrieving the fields in the table (like $user->getUsername() or $user->getEmail() etc). I used to just return an associative array but then you run into occasions where you want to modify the data before it is returned and that's where having a class with methods for each field makes a lot of sense.
Edit: The User object also have methods for updating and deleting the current row;
$user->setUsername('Gandalf');
$user->save();
$user->delete();
Another alternative to Doctrine and Propel is PHP Activerecords.
Doctrine and Propel are really mighty beasts. If you are doing a smaller project, I think you are better off with something lighter.
Also, when talking about third-party solutions there are a lot of MVC frameworks for PHP like: Kohana, Codeigniter, CakePHP, Zend (of course)...
All of them have their own ORM implementations, usually lighter alternatives.
For Kohana framework there is also Auto modeler which is supposedly very lightweight.
Personally I'm using Doctrine, but its a huge project. If I was doing something smaller I'd sooner go with a lighter alternative.
I'm currently rebuilding an admin application and looking for your recommendations for best-practice! Excuse me if I don't have the right terminology, but how should I go about the following?
Take the example of "users" - typically we can create a class with properties like 'name', 'username', 'password', etc. and make some methods like getUser($user_ID), getAllUsers(), etc. In the end, we end up with an array/arrays of name-value pairs like; array('name' => 'Joe Bloggs', 'username' => 'joe_90', 'password' => '123456', etc).
The problem is that I want this object to know more about each of its properties.
Consider "username" - in addition to knowing its value, I want the object to know things like; which text label should display beside the control on the form, which regex I should use when validating, what error message is appropriate? These things seem to belong in the model.
The more I work on the problem, the more I see other things too; which HTML element should be used to display this property, what are minimum/maximum values for properties like 'registration_date'?
I envisaged the class looking something like this (simplified):
class User {
...etc...
private static $model = array();
...etc...
function __construct(){
...etc...
$this->model['username']['value'] = NULL; // A default value used for new objects.
$this->model['username']['label'] = dictionary::lookup('username'); // Displayed on the HTML form. Actual string comes from a translation database.
$this->model['username']['regex'] = '/^[0-9a-z_]{4,64}$/i'; // Used for both client-side validation and backend validation/sanitising;
$this->model['username']['HTML'] = 'text'; // Which type of HTML control should be used to interact with this property.
...etc...
$this->model['registration_date']['value'] = 'now'; // Default value
$this->model['registration_date']['label'] = dictionary::lookup('registration_date');
$this->model['registration_date']['minimum'] = '2007-06-05'; // These values could be set by a permissions/override object.
$this->model['registration_date']['maximum'] = '+1 week';
$this->model['registration_date']['HTML'] = 'datepicker';
...etc...
}
...etc...
function getUser($user_ID){
...etc...
// getUser pulls the real data from the database and overwrites the default value for that property.
return $this->model;
}
}
Basically, I want this info to be in one location so that I don't have to duplicate code for HTML markup, validation routines, etc. The idea is that I can feed a user array into an HTML form helper and have it automatically create the form, controls and JavaScript validation.
I could then use the same object in the backend with a generic set($data = array(), $model = array()) method to avoid having individual methods like setUsername($username), setRegistrationDate($registration_date), etc...
Does this seem like a sensible approach?
What would you call value, label, regex, etc? Properties of properties? Attributes?
Using $this->model in getUser() means that the object model is overwritten, whereas it would be nicer to keep the model as a prototype and have getUser() inherit the properties.
Am I missing some industry-standard way of doing this? (I have been through all the frameworks - example models are always lacking!!!)
How does it scale when, for example, I want to display user types with a SELECT with values from another model?
Thanks!
Update
I've since learned that Java has class annotations - http://en.wikipedia.org/wiki/Java_annotations - which seem to be more or less what I was asking. I found this post - http://interfacelab.com/metadataattributes-in-php - does anyone have any insight into programming like this?
You're on the right track there. When it comes to models I think there are many approaches, and the "correct" one usually depends on your type of application.
Your model can be directly an Active Record, maybe a table row data gateway or a "POPO", plain old PHP object (in other words, a class that doesn't implement any specific pattern).
Whichever you decide works best for you, things like validation etc. can be put into the model class. You should be able to work with your users as User objects, not as associative arrays - that is the main thing.
Does this seem like a sensible approach
Yes, besides the form label thing. It's probably best to have a separate source for data such as form labels, because you may eventually want to be able to localize them. Also, the label isn't actually related to the user object - it's related to displaying a form.
How I would approach this (suggestion)
I would have a User object which represents a single user. It should be possible to create an empty user or create it from an array (so that it's easy to create one from a database result for example). The user object should also be able to validate itself, for example, you could give it a method "isValid", which when called will check all values for validity.
I would additionally have a user repository class (or perhaps just some static methods on the User class) which could be used to fetch users from the database and store them back. This repository would directly return user objects when fetching, and accept user objects as parameters for saving.
As to what comes to forms, you could probably have a form class which takes a user object. It could then automatically get values from the user and use it to validate itself as well.
I have written on this topic a bit here: http://codeutopia.net/blog/2009/02/28/creating-a-simple-abstract-model-to-reduce-boilerplate-code/ and also some other posts linked in the end of that one.
Hope this helps. I'd just like to remind that my approach is not perfect either =)
An abstract response for you which quite possibly won't help at all, but I'm happy to get the down votes as it's worth saying :)
You're dealing with two different models here, in some world we call these Class and Instance, in other's we talk of Classes and Individuals, and in other worlds we make distinctions between A-Box and T-Box statements.
You are dealing with two sets of data here, I'll write them out in plain text:
User a Class .
username a Property;
domain User;
range String .
registration_date a Property;
domain User;
range Date .
this is your Class data, T-Box statements, Blueprints, how you describe the universe that is your application - this is not the description of the 'things' in your universe, rather you use this to describe the things in your universe, your instance data.. so you then have:
user1 a User ;
username "bob";
registration_date "2010-07-02" .
which is your Instance, Individual, A-Box data, the things in your universe.
You may notice here, that all the other things you are wondering how to do, validation, adding labels to properties and so forth, all come under the first grouping, things that describe your universe, not the things in it. So that's where you'd want to add it.. again in plain text..
username a Property;
domain User;
range String;
title "Username";
validation [ type Regex; value '/^[0-9a-z_]{4,64}$/i' ] .
The point in all this, is to help you analyse the other answers you get - you'll notice that in your suggestion you munged these two distinct sets of data together, and in a way it's a good thing - from this hopefully you can see that typically the classes in PHP take on the role of Classes (unsurprisingly) and each object (or instance of a class) holds the individual instance data - however you've started to merge these two parts of your universe together to try and make one big reusable set of classes outside of the PHP language constructs that are provided.
From here you have two paths, you can either fold in to line and follow the language structure to make your code semi reusable and follow suggested patterns like MVC (which if you haven't done, would do you good) - or you can head in to a cutting edge world where these worlds are described and we build frameworks to understand the data about our universes and the things in it, but it's an abstract place where at the minute it's hard to be productive, though in the long term is the path to the future.
Regardless, I hope that in some way that helps you to get a grip of the other responses.
All the best!
Having looked at your question, the answers and your responses; I might be able to help a bit more here (although it's difficult to cover everything in a single answer).
I can see what you are looking to do here, and in all honesty this is how most frameworks start out; making a set of classes to handle everything, then as they are made more reusable they often hit on tried and tested patterns until finally ending up with what I'd say is 'just another framework', they all do pretty much the same thing, in pretty much the same ways, and aim to be as reusable as they can - generally about the only difference between them is coding styles and quality - what they do is pretty much the same for all.
I believe you're hitting on a bit of anti-pattern in your design here, to explain.. You are focussed on making a big chunk of code reusable, the validation, the presentation and so forth - but what you're actually doing (and of course no offence) is making the working code of the application very domain specific, not only that but the design you illustrate will make it almost impossible to extend, to change layers (like make a mobile version), to swap techs (like swap db vendors) and further still, because you've got presentation and application (and data) tiers mixed together, any designer who hit's the app will have to be working in, and changing, your application code - hit on a time when you have two versions of the app and you've got a big messy problem tbh.
As with most programming problems, you can solve this by doing three things:
designing a domain model.
specifying and designing interfaces rather that worrying about the implementation.
separating cross cutting concerns
Designing a domain model is a very important part of Class based OO programming, if you've never done it before then now is the ideal time, it doesn't matter whether you do this in a modelling language like UML or just in plain text, the idea is to define all the Entities in your Domain, it's easy to slip in to writing a book when discussing this, but let's keep it simple. Your domain model comprises all the Entities in your application's domain, each Entity is a thing, think User, Address, Article, Product and so forth, each Entity is typically defined as a Class (which is the blueprint of that entity) and each Class has Properties (like username, register_date etc).
Class User {
public $username;
public $register_date;
}
Often we may keep these as POPOs, however they are often better thought of as Transfer Objects (often called Data Transfer Objects, Value Objects) - a simple Class blueprint for an entity in your domain - normally we try to keep these portable as well, so that they can be implemented in any language, passed between apps, serialized and sent to other apps and similar - this isn't a must, indeed nothing is a must - but it does touch on separation of concerns in that it would normally be naked, implying no functionality, just a blueprint ot hold values. Contrast sharply with Business Objects and Utility Classes that actually 'do' things, are implementations of functionality, not just simple value holders.
Don't be fooled though, both Inheritance and Composition also play their part in domain model, a User may have several Addresses, each Address may be the address of several different Users. A BillingAddress may extend a normal Address and add in additional properties and so forth. (aside: what is a User? do you have a User? do you have a Person with 1-* UserAccounts?).
After you've got your domain model, the next step is usually mapping that up to some form of persistence layer (normally a database) two common ways of doing this (in well defined way) are by using an ORM (such as doctrine, which is in symphony if i remember correctly), and the other way is to use DAO pattern - I'll leave that part there, but typically this is a distinct part of the system, DAO layers have the advantage in that you specify all the methods available to work with the persistence layer for each Entity, whilst keeping the implementation abstracted, thus you can swap database vendors without changing the application code (or business rules as many say).
I'm going to head in to a grey area with the next example, as mentioned earlier Transfer Objects (our entities) are typically naked objects, but they are also often a good place to strap on other functionality, you'll see what I mean.
To illustrate Interfaces, you could simply define an Interface for all your Entities which is something like this:
Interface Validatable {
function isValid();
}
then each of your entities can implement this with their own custom validation routine:
Class User implements Validatable {
public function isValid()
{
// custom validation here
return $boolean;
}
}
Now you don't need to worry about creating some convoluted way of validating objects, you can simply call isValid() on any entity and find out if it's valid or not.
The most important thing to note is that by defining the interface, we've separated some of the concerns, in that no other part of the application needs to do anything to validate an object, all they need to know is that it's Validatable and to call the isValid() method.
However, we have crossed some concerns in that each object (instance of a Class) now carries it's own validation rules and model. It may make sense to abstract this out, one easy way of doing this is to make the validation method static, so you could define:
Class User {
public static function validate(User $user)
{
// custom validation here
return $boolean;
}
}
Or you could move to using getters and setters, this is another very common pattern where you can hide the validation inside the setter, thus ensuring that each property always holds valid data (or null, or default value).
Or perhaps you move the validation in to it's own library? Class Validate with it's own methods, or maybe you just pop it in the DAO layer because you only care about checking something when you save it, or maybe you need to validate when you receive data and when you persist it - how you end up doing it is your call and there is no 'best way'.
The third consideration, which I've already touched on, is separation of concerns - should a persistence layer care how the things it's persisting are presented? should the business logic care about how things are presented? should an Entity care where and how it's displayed? or should the presentation layer care how things are presented? Similarly, we can ask is there only ever going to be one presentation layer? in one language? What about how a label appears in a sentence, sure singular User and Address makes sense, but you can't simply +s to show the lists because Users is right but Addresss is wrong ;) - also we have working considerations like do I want a new designer having to change application code just to change the presentation of 'user account' to 'User Account', even do I want to change my app code in the classes when a that change is asked for?
Finally, and just to throw everything I've said - you have to ask yourself, what's the job I'm trying to do here? am I building a big reusable application with potentially many developers and a long life cycle here - or would a simple php script for each view and action suffice (one that reads $_GET/$_POST, validates, saves to db then displays what it should or redirects where it should) - in many, if not all cases this is all that's needed.
Remember, PHP is made to be invoked when a request is made to a web server, then send back a response [end] that's it, what happens between then is your domain, your job, the client and user typically doesn't care, and you can sum up what you're trying to do this simply: build a script to respond to that request as quickly as possible, with the expected results. That's and it needn't be any more complicated than that.
To be blunt, doing everything I mentioned and more is a great thing to do, you'll learn loads, understand your job better etc, but if you just want to get the job out the door and have easy to maintain simple code in the end, just build one script per view, and one per action, with the odd reusable bit (like a http handler, a db class, an email class etc).
You're running into the Model-View-Controller (MVC) architecture.
The M only stores data. No display information, just typed key-value pairs.
The C handles the logic of manipulating this information. It changes the M in response to user input.
The V is the part which handles displaying things. It should be something like Smarty templates rather than a huge amount of raw PHP for generating HTML.
Having it all "in one place" is the wrong approach. You won't have duplicated code with MVC - each part is a distinct step. This improves code reuse, readability, and maintainability.
I've been reading several articles on MVC and had a few questions I was hoping someone could possibly assist me in answering.
Firstly if MODEL is a representation of the data and a means in which to manipulate that data, then a Data Access Object (DAO) with a certain level of abstraction using a common interface should be sufficient for most task should it not?
To further elaborate on this point, say most of my development is done with MySQL as the underlying storage mechanism for my data, if I avoided vendor specific functions -- (i.e. UNIX_TIMESTAMP) -- in the construction of my SQL statements and used a abstract DB object that has a common interface moving between MySQL and maybe PostgreSQL, or MySQL and SQLite should be a simple process.
Here's what I'm getting at some task, are handled by a single CONTROLLER -- (i.e. UserRegistration) and rather that creating a MODEL for that task, I can get an instance of the db object -- (i.e. DB::getInstance()) -- then make the necessary db calls to INSERT a new user. Why with such a simple task would I create a new MODEL?
In some of the examples I've seen a MODEL is created, and within that MODEL there's a SELECT statement that fetches x number of orders from the order table and returns an array. Why do this, if in your CONTROLLER your creating another loop to iterate over that array and assign it to the VIEW; ex. 1?
ex. 1: foreach ($list as $order) { $this->view->set('order', $order); }
I guess one could modify the return so something like this is possibly; ex. 2.
ex. 2: while ($order = $this->model->getOrders(10)) { $this->view->set('order', $order); }
I guess my argument is that why create a model when you can simply make the necessary db calls from within your CONTROLLER, assuming your using a DB object with common interface to access your data, as I suspect most of websites are using. Yes I don't expect this is practical for all task, but again when most of what's being done is simple enough to not necessarily warrant a separate MODEL.
As it stands right now a user makes a request 'www.mysite.com/Controller/action/args1/args2', the front controller (I call it router) passes off to Controller (class) and within that controller a certain action (method) is called and from there the appropriate VIEW is created and then output.
So I guess you're wondering whether the added complexity of a model layer -on top- of a Database Access Object is the way you want to go. In my experience, simplicity trumps any other concern, so I would suggest that if you see a clear situation where it's simpler to completely go without a Model and have the data access occur in the equivalent of a controller, then you should go with that.
However, there are still other potential benefits to having an MVC separation:
No SQL at all in the controller: Maybe you decide to gather your data from a source other than a database (an array in the session? A mock object for testing? a file? just something else), or your database schema changes and you have to look for all the places that your code has to change, you could look through just the models.
Seperation of skillsets: Maybe someone on your team is great at complex SQL queries, but not great at dealing with the php side. Then the more separated the code is, the more people can play to their strengths (even more so when it comes to the html/css/javascript side of things).
Conceptual object that represents a block of data: As Steven said, there's a difference in the benefits you get from being database agnostic (so you can switch between mysql and postgresql if need be) and being schema agnostic (so you have an object full of data that fits together well, even if it came from different relational tables). When you have a model that represents a good block of data, you should be able to reuse that model in more than one place (e.g. a person model could be used in logins and when displaying a personnel list).
I certainly think that the ideals of separation of the tasks of MVC are very useful. But over time I've come to think that alternate styles, like keeping that MVC-like separation with a functional programming style, may be easier to deal with in php than a full blown OOP MVC system.
I found this great article that addressed most of my questions. In case anyone else had similar questions or is interested in reading this article. You can find it here http://blog.astrumfutura.com/archives/373-The-M-in-MVC-Why-Models-are-Misunderstood-and-Unappreciated.html.
The idea behind MVC is to have a clean separation between your logic. So your view is just your output, and your controller is a way of interacting with your models and using your models to get the necessary data to give to the necessary views. But all the work of actually getting data will go on your model.
If you think of your User model as an actual person and not a piece of data. If you want to know that persons name is it easier to call up a central office on the phone (the database) and request the name or to just ask the person, "what is your name?" That's one of the ideas behind the model. In a most simplistic way you can view your models as real living things and the methods you attach to them allow your controllers to ask those living things a series of questions (IE - can you view this page? are you logged in? what type of image are you? are you published? when were you last modified?). Your controller should be dumb and your model should be smart.
The other idea is to keep your SQL work in one central location, in this case your models. So that you don't have errant SQL floating around your controllers and (worst case scenario) your views.