Related
Intoduction problem:
What is the best practice to build my class T object, when I receive it from a MongoCursor::getNext()? As far as it goes, getNext() function of a MongoCursor returns with an array. I wish to use the result from that point as an object of type T.
Should I write my own constructor for type T, that accepts an array? Is there any generic solution to this, for example when type T extends G, and G does the job as a regular way, recursively (for nested documents).
I'm new to MongoDB, and I'd like to build my own generic mapper with a nice interface.
Bounty:
Which are the possible approaches, patterns and which would fit the concept of MongoDB the most from the view of PHP.
This answer has been rewritten.
Most data mappers work by representing one object per class or "model" is normally the coined term. If you wish to allow multiple accession through a single object (i.e. $model->find()) it is normally demmed so that the method will not actually return an instance of itself but instead that of an array or a MongoCursor eager loading classes into the space.
Such a paradigm is normally connected with "Active Record". This is the method that ORMs, ODMs and frameworks all use to communicate to databases in one way or another, not only for MongoDB but also for SQL and any other databases to happen to crop up (Cassandra, CouchDB etc etc).
It should be noted immediately that even though active record gives a lot of power it should not be blanketed across the entire application. There are times where using the driver directly would be more benefical. Most ORMs, ODMs and frameworks provide the ability to quickly and effortlessly access the driver directly for this reason.
There is, as many would say, no light weight data mapper. If you are going to map your returned data to classes then it will consume resources, end of. The benefit of doing this is the power you receive when manipulating your objects.
Active record is really good at being able to provide events and triggers from within PHP. A good example is that of an ORM I made for Yii: https://github.com/Sammaye/MongoYii it can provide hooks for:
afterConstruct
beforeFind
afterFind
beforeValidate
afterValidate
beforeSave
afterSave
It should be noted that when it comes to events like beforeSave and afterSave MongoDB does not possess triggers ( https://jira.mongodb.org/browse/SERVER-124 ) so it makes sense that the application should handle this. On top of the obvious reason for the application to handle this it also makes much better handling of the save functions by being able to call your native PHP functions to manipulate every document saved prior to touching the database.
Most data mappers work by using PHP own class CRUD to represent theirs too. For example to create a new record:
$d=new User();
$d->username='sammaye';
$d->save();
This is quite a good approach since you create a "new" ( https://github.com/Sammaye/MongoYii/blob/master/EMongoDocument.php#L46 shows how I prepare for a new record in MongoYii ) class to make a "new" record. It kind of fits quite nicely semantically.
Update functions are normally accessed through read functions, you cannot update a model you don't know the existane of. This brings us onto the next step of populating models.
To handle populating a model different ORMs, ODMs and frameworks commit to different methods. For example, my MongoYii extension uses a factory method called model in each class to bring back a new instance of itself so I can call th dynamic find and findOne and other such methods.
Some ORMs, ODMs and frameworks provide the read functions as direct static functions making them into factory methods themselves whereas some use the singleton pattern, however, I chose not to ( https://stackoverflow.com/a/4596323/383478 ).
Most, if not all, implement some form of the cursor. This is used to return multiples of the models and directly wraps (normally) the MongoCursor to replace the current() method with returning a pre-populate model.
For example calling:
User::model()->find();
Would return a EMongoCursor (in MongoYii) which would then sotre the fact that the class User was used to instantiate the cursor and when called like:
foreach(User::model() as $k=>$v){
var_dump($v);
}
Would call the current() method here: https://github.com/Sammaye/MongoYii/blob/master/EMongoCursor.php#L102 returning a new single instance of the model.
There are some ORMs, ODMs and frameworks which implement eager array loading. This means they will just load the whole result straight into your RAM as an array of models. I personally do not like this approach, it is wasteful and also does not bode well when you need to use active record for larger updates due to adding some new functionality in places that needs adding to old records.
One last topic before I move on is the schemaless nature of MongoDB. The problem with using PHP classes with MongoDB is that you want all the functionality of PHP but with the variable nature of MongoDB. This is easy to over come in SQL since it has a pre-defined schema, you just query for it and jobs done; however, MongoDB has no such thing.
This does make schema handling in MongoDB quite hazardous. Most ORMs, ODMs and frameworks demand that you pre-define the schema in the spot (i.e. Doctrine 2) using private variables with get and set methods. In MongoYii, to make my life easy and elegant, I decided to retain MongoDBs schemaless nature by using magics that would detect ( https://github.com/Sammaye/MongoYii/blob/master/EMongoModel.php#L26 is my __get and https://github.com/Sammaye/MongoYii/blob/master/EMongoModel.php#L47 is my __set ), if the property wa inaccessible in the class, if the field was in a internal _attributes array and if not then just return null. Likewise, for setting an attribute I would just set in the intrernal _attributes variable.
As for dealing with how to assign this schema I left internal assignment upto the user however, to deal with setting properties from forms etc I used the validation rules ( https://github.com/Sammaye/MongoYii/blob/master/EMongoModel.php#L236 ) calling a function called getSafeAttributeNames() which would return a list of attributes which had validation rules against them. If they did not have validation rules then those attributes which existed in the incoming $_POST or $_GET array would not be set. So this provided the ability for a schema, yet secure, model structure.
So we have covered how to actually use the root document you also ask how to data mappers handle subdocuments. Doctrine 2 and many others provide full class based subdocuments ( http://docs.doctrine-project.org/projects/doctrine-mongodb-odm/en/latest/reference/embedded-mapping.html ) but this can be extremely resourceful. Instead I decided that I would provide helper functions which would allow for flexible usage of subdocument without eager loading them into models and so consuming RAM. Basically what I did was to leave them as they are a provide a validator ( https://github.com/Sammaye/MongoYii/blob/master/validators/ESubdocumentValidator.php ) for validating inside of them. Of course the validator is self spawning so if you had a rule in the validator that used the validator again to issue a validation of a nested subdocument then it would work.
So I think that completes a very basic discussion of ORMs, ODMs and frameworks use data mappers. Of course I could probably write an entire essay on this but this is a good enough discussion for the minute I believe.
Background
This is a long and complicated question. I'm using php/MySQL as an example (since it's an actual example) but this could theoretically apply to other languages. Bear with me.
MVC Framework with no ORM
I'm creating an application that uses its own framework. For a variety of reasons, the following answers to this question will not be accepted:
Why don't you just use X framework?
Why don't you just use X ORM?
This application does its own queries (written by me) and that's how I'd like it to stay for now.
Business Logic?
"Business Logic" seems to be a meaningless buzzword, but I take it to essentially mean
The queries and the logic that builds the result set based on those queries
I've also read that the Model in an MVC should do all the business logic.
User.php is 884 lines
Now that I've gotten my app working fairly well, I'd like to refactor it so as not to have such abominations. User.php is essentially the model for Users (obviously). There are some responsibilities I see in it that I could easily pluck out, but a major hurdle I'm running into is:
How can I reconcile SOLID with MVC?
The reason that User.php has grown so large is because I do any query that requires a User member in that file. User is used for a ton of operations (it needs to do so much more than just CRUD), so any query that needs userid, username, etc. is run by a function in this file. Apparently the queries should be in the model (and not the controller), but I feel that this should definitely be split up somehow. I need to accomplish the following:
Create an API that covers all of these necessary queries in a compartmentalized way
Avoid giving access to the DB connection class when it's not necessary
Add User data to the view (User.php is doing that right now -- the view object is injected by a setter, which I think is also bad).
...so what I could do is create other objects like UserBranchManager, UserSiteManager, UserTagManager, etc. and each of those could have the relevant queries and the DB object injected to run those queries, but then how do they get the coveted User::$userid that they need to run these queries? Not only that, but how could I pass Branch::$branchid? Shouldn't those members be private? Adding a getter for them also makes that pointless.
I'm also not sure where to draw the line of how much an object should do. A lot of the operations similar but still different. A class for each one would be huge overkill.
Possible Answer
If I can't get any help, what I'll do (or at least try to do) is have a dependency injection container of some kind build dependencies for the objects above (e.g. UserBranchManager) and inject them into the relevant controller. These would have a DB and Query object. The Query object could be passed to low level models (like User) to bind parameters as needed, and the higher level models or whatever they are called would give the results back to the controller which would add the data to the template as needed as well. Some possible hurdles I see are creating proper contracts (e.g. the UserController should preferably depend on some abstraction of the user models) but some specifics are inevitably required, especially when it comes to the view.
Can anyone offer some wisdom in response to my rambling question?
Response to #tereško
He has provided a great answer not only here, but also at How should a model be structured in MVC?
Code
As requested, here is some extremely pared down code (basically services one request). Some important notes:
Right now, controllers are not classes, just files
The controller also handles a lot of the routing
There are no "view" objects, just the templates
This will probably look really bad
These are also things to improve, but I'm mostly worried about the model (User in particular since it's getting out of control):
#usr.php -- controller
$route = route();
$user = '';
$branch = '<TRUNK>';
if (count($route) > 0) {
if (count($route) > 1) {
list($user, $branch) = $route;
}
else {
list($user) = $route;
}
}
$dec = new Decorator('user');
$dec->css('user');
if (isset($_SESSION['user']) && $_SESSION['user']->is($user)) {
$usr = $_SESSION['user'];
}
else {
$usr = new User(new DB, $user);
}
$usr->setUpTemplate($dec, $branch);
return $dec->execute();
# User.php -- model
class User {
private $userid;
private $username;
private $db;
public function __construct(DB $db, $username = null) {
$this->username = $username;
$this->db = $DB;
}
public function is($user) {
return strtolower($this->username) === strtolower($user);
}
public function setUpTemplate(Decorator $dec, $branch) {
$dec->_title = "$this->username $branch";
// This function runs a moderately complicated query
// joining the branch name and this user id/name
$dec->branch = $this->getBranchDisplay($branch);
}
}
Questions about answers
Answer here:
You talk about leaving off caching/authentication/authorization. Are these important? Why aren't they covered? How do they relate to the model/controller/router?
The Data Mapper example has the Person class with methods like getExemption, isFlaggedForAudit .. what are those? It seems like those calculations would require DB data, so how does it get them? Person Mapper leaves off select. Isn't that important?
What is "domain logic?"
Answer 5863870 (specifically the code example):
Shouldn't these factory objects be abstractions (and not rely on creation via new) or are these special?
How would your API include the necessary definition files?
I've read a lot about how it's best for dependencies to be injected in the constructor (if they're mandatory). I assume you set the factories in this way, but why not the objects/mappers/services themselves? What about abstractions?
Are you worried about code duplication (e.g. most models requiring an _object_factory member in their class definition)? If so, how could you avoid this?
You're using protected. Why?
If you can provide any specific code examples that would be best since it's easier for me to pick stuff up that way.
I understand the theory of what your answers are saying and it's helped a lot. What I'm still interested in (and not totally sure of) is making sure that dependencies of objects in this API are handled in the best way (the worst would be new everywhere).
Dont confuse SOLID (You can get a good explanation of what it is on my blog at: http://crazycoders.net/2012/03/confoo-2012-make-your-project-solid/
SOLID is great when considering the framework that goes around the application you are trying to build. The management of the data itself is another thing. You can't really apply the Single Responsibility of the S of SOLID to a business model that RELIES on other business models such as User, Groups and Permissions.
Thus, you have to let some of the SOLID go when you build an application. What SOLID is good for, is to make sure your framework behind your application is strong.
For example, if you build your own framework and business model, you will probably have a base class MODEL another for DATABASEACCESS, just remember that your MODEL shouldn't be aware of how to get the data, just know that it must get data.
For example:
Good:
- MyApp_Models_User extends MyApp_Framework_Model
- MyApp_Models_Group extends MyApp_Framework_Model
- MyApp_Models_Permission extends MyApp_Framework_Model
- MyApp_Framework_Model
- MyApp_Framework_Provider
- MyApp_Framework_MysqliProvider extends MyApp_Framework_Provider
In this good part, you create a model like this:
$user = new MyApp_Models_User(new MyApp_Framework_MysqliProvider(...));
$user->load(1234);
This way, you will prevent a fail in the single responsibility, your provider is used to load the data from one of the many providers that exist and your model represents the data that you extracted, it doesn't know how to read or write the data, thats the providers job...
Bad way:
- MyApp_Model_User
- MyApp_Model_Group
- MyApp_Model_Permission
define('MyDB', 'localhost');
define('MyUser', 'user');
define('MyPass', 'pass');
$user = new MyApp_Models_User(1234);
Using this bad method you first of all break the single responsibility, your model represents something and also manages the input/ouput of the data. Also, you create a dependency by stating that you need to define constants for the model to connect to the database and you completly abstract the database methods, if you need to change them and have 37 models, you'll have TONS of work to do...
Now, you can, if you want work the bad way... i still do it, i'm aware of it, but sometimes, when you have crappy structure and want to refactor, you can and should work against a principle just to refactor correctly and slowly, THEN, refactor a little more and end up with something SOLID and MVC looking.
Just remember that SOLID doesn't apply to everything, it's just a guideline, but it's very very good guideline.
Well .. it depends on what is actually inside your ./user.php file. If i had to guess, you would be a situation, where your user "model" has too many responsibilities. Basically, you are violating single responsibility principle and not sure how to go about fixing that.
You did no provide any code .. so , lets continue with guessing ...
It is possible that your user "model" is implementing active record pattern. This would be the main source of SRP problems. You could watch this part of lecture slides. It will explain some of it. The point would be, instead of using active record, to go with something similar to a data mapper pattern.
Also, you might notice that some of the domain logic, which works with User class instances, seems to happen outside your "model". It might be beneficial to separate that part in a different structure. Otherwise you will be forcing the domain logic inside the controller. Maybe this comment could shine some light on the whole subject.
Another thing you might have crammed inside your user "model" could be parts of the authorization (no to confuse with authentication) mechanism. It could be pragmatic to separate this responsibility.
Update
You talk about leaving off caching/authentication/authorization. Are these important? Why aren't they covered? How do they relate to the model/controller/router?
Caching is something that you would add later in the application. The domain objects do not care where the data comes from. For that reason you can wither add the caching with in the service-level objects or inside the existing data mappers. I would advise to choose former option, because changing existing mappers might have unforeseen side effects. And because it would just over-complicate the existing mappers.
namespace Application\Service;
class Community{
public function getUserDetails( $uid )
{
$user = $this->domainObjectFactory->build('User');
$cache = $this->cacheFactory->build('User');
$user->setId( $uid );
try
{
$cache->pull( $user );
}
cache( NotFoundException $e)
{
$mapper = $this->mapperFactory->build('User');
$mapper->pull( $user );
$cache->push( $user );
}
return $user->getDetails();
}
}
This would illustrate a very simplified acquisition of user information based on user's ID. The code creates domain object and provides it with ID, then this $user ovject is used as condition to search for cached details or, if it fails, fetching that pulling that information from DB via the data mapper. Also, if that is successful, the details are pushed into the cache, for next time.
You might notice, that this example did not handle situation, when mapper cannot find such user with such ID in storage (usually - SQL database). As I said , it's a simplified example.
Also, you might notice, that this sort of caching can be easily added on case-by-case basis and would not drastically change how your logic behaves.
Authorization is another part, which should not directly influence your business logic. I already linked my preferred way for providing authentication. The idea is that, instead of checking for credentials inside controller (like here, here, here or here), the access rights are checked before you execute a method on the controller. This way you have additional options for handling the denial of access, without being stuck within a specific controller.
The Data Mapper example has the Person class with methods like getExemption(), isFlaggedForAudit() .. what are those? It seems like those calculations would require DB data, so how does it get them? Person Mapper leaves off select. Isn't that important?
The Person class is a domain object. It would contain the part of domain logic, that is associated directly with that entity.
For those methods to be executed, the mapper should at first load the data. In PHP it would look something like this:
$person = new Person;
$mapper = new PersonMapper( $databaseConnection );
$person->setId( $id );
$mapper->fetch( $person );
if ( $person->isFlaggedForAudit() )
{
return $person->getTaxableEearnings();
}
The names of methods in the PersonMapper are there as an example, so that you would understand, how the class is supposed to be used. I usually name methods fetch(), store() and remove() (or push/pull/remove ... depends on how much GIT have I been using). IMHO, there is no point to have a separate update() and insert() methods. If object's data was initially retrieved by mapper, then it is an UPDATE. If not - it is an INSERT. You can also determine it whether the value, which corresponds to PRIMARY KEY has been set or not (in some cases, at least).
What is "domain logic?"
It is part of the code which knows how to create an invoice and apply discount price for specific products. It's also the code which makes sure, that you do not submit registration form, you do not state that you have been born in 1592.
The MVC is made from two layers: presentation (can contain: views, templates, controllers) and model layer (can contain: services, domain objects, mappers). The presentation layer deals with user interaction and responses. The model layer deals with business and validation rules. You could say that domain business logic is everything in model, that does not deal with storage.
I guess there is no easy way to explain.
Shouldn't these factory objects be abstractions (and not rely on creation via new) or are these special?
Which "factory objects", where? Are you talking about this fragment ?
$serviceFactory = new ServiceFactory(
new DataMapperFactory( $dbhProvider ),
new DomainObjectFactory
);
$serviceFactory->setDefaultNamespace('Application\\Service');
That whole code fragment is supposed to be in bootstrap.php file (or you might be using index.php or init.php for that). It's the entry point for the application. It is not part of any class, therefore you cannot have *tight coupling" there. There is nothing to "couple with".
How would your API include the necessary definition files?
That fragment is not entire bootstrap.php file. Above that code are includes and initialization for autoloader. I am currently using dynamic loader, which lets classes from same namespace to be located in different folders.
Also, such autoloader is a development-stage artifact. In production code you have to use loader, which works with predefined hashtable and does not need to actually walk the tree of namespaces and locations.
I've read a lot about how it's best for dependencies to be injected in the constructor (if they're mandatory). I assume you set the factories in this way, but why not the objects/mappers/services themselves? What about abstractions?
What are you talking about ?!?
Are you worried about code duplication (e.g. most models requiring an _object_factory member in their class definition)? If so, how could you avoid this?
Did you actually LOOK at how old that code fragment in comments was ?!?!?
You're using protected. Why?
Because, if values/methods are defined with protected visibility, you can access them when you extend the original class. Also, that code example was made for old version of answer. Check the dates.
And no. I will not provide any specific code examples. Because each situation is different. If you want to do copy-paste development, then use CakePHP or CodeIgniter.
I am trying to use Dependency Injection as much as possible, but I am having trouble when it comes to things like short-lived dependencies.
For example, let's say I have a blog manager object that would like to generate a list of blogs that it found in the database. The options to do this (as far as I can tell) are:
new Blog();
$this->loader->blog();
the loader object creates various other types of objects like database objects, text filters, etc.
$this->blogEntryFactory->create();
However, #1 is bad because it creates a strong coupling. #2 still seems bad because it means that the object factory has to be previously injected - exposing all the other objects that it can create.
Number 3 seems okay, but if I use #3, do I put the "new" keywords in the blogEntryFactory itself, OR, do I inject the loader into the blogEntryFactory and use the loader?
If I have many different factories like blogEntryFactory (for example I could have userFactory and commentFactory) it would seem like putting the "new" keyword across all these different factories would be creating dependency problems.
I hope this makes sense...
NOTE
I have had some answers about how this is unnecessary for this specific blog example, but there are, in fact, cases where you should use the Abstract Factory Pattern, and that is the point I am getting at. Do you use "new" in that case, or do something else?
I'm no expert, but I'm going to take a crack at this. This assumes that Blog is just a data model object that acts as a container for some data and gets filled by the controller (new Blog is not very meaningful). In this case, Blog is a leaf of the object graph, and using new is okay. If you are going to test methods that need to create a Blog, you have to simultaneously test the creation of the Blog anyway, and using a mock object doesn't make sense .. the Blog does not persist past this method.
As an example, say that PHP did not have an array construct but had a collections object. Would you call $this->collectionsFactory->create() or would you be satisfied to say new Array;?
In answer to the title: yes, abstract factories typically use new. For example, see the MazeFactory code on page 92 of the GoF book. It includes, return new Maze; return new Wall; return new Room; return new Door;
In answer to the note: a design that uses abstract factories to create data models is highly suspect. The purpose is to vary the behavior of the factory's products while making their concrete implementations invisible to clients. Data models with no behavior do not benefit from an abstract factory.
I'm trying to replace a site written procedurally with a nice set of classes as a learning exercise.
So far, I've created a record class that basically holds one line in the database's main table.
I also created a loader class which can:
loadAllFromUser($username)
loadAllFromDate($date)
loadAllFromGame($game)
These methods grab all the valid rows from the database, pack each row into a record, and stick all the records into an array.
But what if I want to just work with one record? I took a stab at that and ended up with code that was nearly identical to my procedural original.
I also wasn't sure where that one record would go. Does my loader class have a protected record property?
I'm somewhat confused.
EDIT - also, where would I put something like the HTML template for outputting a record to the site? does that go in the record class, in the loader, or in a 3rd class?
I recommend looking into using something like Doctrine for abstracting your db-to-object stuff, other than for learning purposes.
That said, there are many ways to model this type of thing, but in general it seems like the libraries (home-grown or not) that handle it tend to move towards having, at a high level:
A class that represents an object that is mapped to the db
A class that represents the way in which that object is mapped to the db
A class that represents methods for retrieving objects from the db
Think about the different tasks that need done, and try to encapsulate them cleanly. The Law of Demeter is useful to keep in mind, but don't get too bogged down with trying to grok everything in object-oriented design theory right this moment -- it can be much more useful to think, design, code, and see where weaknesses in your designs lie yourself.
For your "work with one record, but without duplicating a bunch of code" problem, perhaps something like having your loadAllFromUser methods actually be methods that call a private method that takes (for instance) a parameter that is the number of records to be retrieved, where if that parameter is null it retrieves all the records.
You can take that a step further, and implement __call on your loader class. Assuming it can know or find out about the fields that you want to load by, you can construct the parameters to a function that does the loading programatically -- look at the common parts of your functions, see what differs, and see if you can find a way to make those different parts into function parameters, or something else that allows you to avoid repetition.
MVC is worth reading up on wrt your second question. At the least, I would probably want to have that in a separate class that expects to be passed a record to render. The record probably shouldn't care about how it's represented in html, the thing that makes markup for a record shouldn't care about how the record is gotten. In general, you probably want to try to make things as standalone as possible.
It's not an easy thing to get used to, and most of "getting good" at this sort of design is a matter of practice. For actual functionality, tests can help a lot -- say you're writing your loader class, and you know that if you call loadAllFromUser($me) that you should get an array of three specific records with your dataset (even if it's a dataset used for testing only), if you have something you can run which would call that on your loader and check for the right results, it can help you know that your code is at least right from the standpoint of behavior, if not from design -- and when you change the design you can ensure that it still behaves correctly. PHPUnit seems to be the most popular tool for this in php-land.
Hopefully this points you in a useful group of directions instead of just being confusing :) Good luck, and godspeed.
You can encapsulate the unique parts of loadAllFrom... and loadOneFrom... within utility methods:
private function loadAll($tableName) {
// fetch all records from tableName
}
private function loadOne($tableName) {
// fetch one record from tableName
}
and then you won't see so much duplication:
public function loadAllFromUser() {
return $this->loadAll("user");
}
public function loadOneFromUser() {
return $this->loadOne("user");
}
If you like, you can break it down further like so:
private function load($tableName, $all = true) {
// return all or one record from tableName
// default is all
}
you can then replace all of those methods with calls such as:
$allUsers = $loader->load("users");
$date = $loader->load("date", false);
You could check the arguments coming into your method and decide from there.
$args = func_get_args();
if(count($args) > 1)
{
//do something
}
else // do something else
Something simple liek this could work. Or you could make two seperate methods inside your class for handling each type of request much like #karim's example. Whichever works best for what you would like to do.
Hopefully I understand what you are asking though.
To answer your edit:
Typically you will want to create a view class. This will be responsible for handling the HTML output of the data. It is good practice to keep these separate. The best way to do this is by injecting your 'data class' object directly into the view class like such:
class HTMLview
{
private $data;
public function __construct(Loader $_data)
{
$this->data = $_data;
}
}
And then continue with the output now that this class holds your processed database information.
It's entirely possible and plausible that your record class can have a utility method attached to itself that knows how to load a single record, given that you provide it a piece of identifying information (such as its ID, for example).
The pattern I have been using is that an object can know how to load itself, and also provides static methods to perform "loadAll" actions, returning an array of those objects to the calling code.
So, I'm going through a lot of this myself with a small open source web app I develop as well, I wrote most of it in a crunch procedurally because it's how I knew to make a working (heh, yeah) application in the shortest amount of time - and now I'm going back through and implementing heavy OOP and MVC architecture.
I'm currently rebuilding an admin application and looking for your recommendations for best-practice! Excuse me if I don't have the right terminology, but how should I go about the following?
Take the example of "users" - typically we can create a class with properties like 'name', 'username', 'password', etc. and make some methods like getUser($user_ID), getAllUsers(), etc. In the end, we end up with an array/arrays of name-value pairs like; array('name' => 'Joe Bloggs', 'username' => 'joe_90', 'password' => '123456', etc).
The problem is that I want this object to know more about each of its properties.
Consider "username" - in addition to knowing its value, I want the object to know things like; which text label should display beside the control on the form, which regex I should use when validating, what error message is appropriate? These things seem to belong in the model.
The more I work on the problem, the more I see other things too; which HTML element should be used to display this property, what are minimum/maximum values for properties like 'registration_date'?
I envisaged the class looking something like this (simplified):
class User {
...etc...
private static $model = array();
...etc...
function __construct(){
...etc...
$this->model['username']['value'] = NULL; // A default value used for new objects.
$this->model['username']['label'] = dictionary::lookup('username'); // Displayed on the HTML form. Actual string comes from a translation database.
$this->model['username']['regex'] = '/^[0-9a-z_]{4,64}$/i'; // Used for both client-side validation and backend validation/sanitising;
$this->model['username']['HTML'] = 'text'; // Which type of HTML control should be used to interact with this property.
...etc...
$this->model['registration_date']['value'] = 'now'; // Default value
$this->model['registration_date']['label'] = dictionary::lookup('registration_date');
$this->model['registration_date']['minimum'] = '2007-06-05'; // These values could be set by a permissions/override object.
$this->model['registration_date']['maximum'] = '+1 week';
$this->model['registration_date']['HTML'] = 'datepicker';
...etc...
}
...etc...
function getUser($user_ID){
...etc...
// getUser pulls the real data from the database and overwrites the default value for that property.
return $this->model;
}
}
Basically, I want this info to be in one location so that I don't have to duplicate code for HTML markup, validation routines, etc. The idea is that I can feed a user array into an HTML form helper and have it automatically create the form, controls and JavaScript validation.
I could then use the same object in the backend with a generic set($data = array(), $model = array()) method to avoid having individual methods like setUsername($username), setRegistrationDate($registration_date), etc...
Does this seem like a sensible approach?
What would you call value, label, regex, etc? Properties of properties? Attributes?
Using $this->model in getUser() means that the object model is overwritten, whereas it would be nicer to keep the model as a prototype and have getUser() inherit the properties.
Am I missing some industry-standard way of doing this? (I have been through all the frameworks - example models are always lacking!!!)
How does it scale when, for example, I want to display user types with a SELECT with values from another model?
Thanks!
Update
I've since learned that Java has class annotations - http://en.wikipedia.org/wiki/Java_annotations - which seem to be more or less what I was asking. I found this post - http://interfacelab.com/metadataattributes-in-php - does anyone have any insight into programming like this?
You're on the right track there. When it comes to models I think there are many approaches, and the "correct" one usually depends on your type of application.
Your model can be directly an Active Record, maybe a table row data gateway or a "POPO", plain old PHP object (in other words, a class that doesn't implement any specific pattern).
Whichever you decide works best for you, things like validation etc. can be put into the model class. You should be able to work with your users as User objects, not as associative arrays - that is the main thing.
Does this seem like a sensible approach
Yes, besides the form label thing. It's probably best to have a separate source for data such as form labels, because you may eventually want to be able to localize them. Also, the label isn't actually related to the user object - it's related to displaying a form.
How I would approach this (suggestion)
I would have a User object which represents a single user. It should be possible to create an empty user or create it from an array (so that it's easy to create one from a database result for example). The user object should also be able to validate itself, for example, you could give it a method "isValid", which when called will check all values for validity.
I would additionally have a user repository class (or perhaps just some static methods on the User class) which could be used to fetch users from the database and store them back. This repository would directly return user objects when fetching, and accept user objects as parameters for saving.
As to what comes to forms, you could probably have a form class which takes a user object. It could then automatically get values from the user and use it to validate itself as well.
I have written on this topic a bit here: http://codeutopia.net/blog/2009/02/28/creating-a-simple-abstract-model-to-reduce-boilerplate-code/ and also some other posts linked in the end of that one.
Hope this helps. I'd just like to remind that my approach is not perfect either =)
An abstract response for you which quite possibly won't help at all, but I'm happy to get the down votes as it's worth saying :)
You're dealing with two different models here, in some world we call these Class and Instance, in other's we talk of Classes and Individuals, and in other worlds we make distinctions between A-Box and T-Box statements.
You are dealing with two sets of data here, I'll write them out in plain text:
User a Class .
username a Property;
domain User;
range String .
registration_date a Property;
domain User;
range Date .
this is your Class data, T-Box statements, Blueprints, how you describe the universe that is your application - this is not the description of the 'things' in your universe, rather you use this to describe the things in your universe, your instance data.. so you then have:
user1 a User ;
username "bob";
registration_date "2010-07-02" .
which is your Instance, Individual, A-Box data, the things in your universe.
You may notice here, that all the other things you are wondering how to do, validation, adding labels to properties and so forth, all come under the first grouping, things that describe your universe, not the things in it. So that's where you'd want to add it.. again in plain text..
username a Property;
domain User;
range String;
title "Username";
validation [ type Regex; value '/^[0-9a-z_]{4,64}$/i' ] .
The point in all this, is to help you analyse the other answers you get - you'll notice that in your suggestion you munged these two distinct sets of data together, and in a way it's a good thing - from this hopefully you can see that typically the classes in PHP take on the role of Classes (unsurprisingly) and each object (or instance of a class) holds the individual instance data - however you've started to merge these two parts of your universe together to try and make one big reusable set of classes outside of the PHP language constructs that are provided.
From here you have two paths, you can either fold in to line and follow the language structure to make your code semi reusable and follow suggested patterns like MVC (which if you haven't done, would do you good) - or you can head in to a cutting edge world where these worlds are described and we build frameworks to understand the data about our universes and the things in it, but it's an abstract place where at the minute it's hard to be productive, though in the long term is the path to the future.
Regardless, I hope that in some way that helps you to get a grip of the other responses.
All the best!
Having looked at your question, the answers and your responses; I might be able to help a bit more here (although it's difficult to cover everything in a single answer).
I can see what you are looking to do here, and in all honesty this is how most frameworks start out; making a set of classes to handle everything, then as they are made more reusable they often hit on tried and tested patterns until finally ending up with what I'd say is 'just another framework', they all do pretty much the same thing, in pretty much the same ways, and aim to be as reusable as they can - generally about the only difference between them is coding styles and quality - what they do is pretty much the same for all.
I believe you're hitting on a bit of anti-pattern in your design here, to explain.. You are focussed on making a big chunk of code reusable, the validation, the presentation and so forth - but what you're actually doing (and of course no offence) is making the working code of the application very domain specific, not only that but the design you illustrate will make it almost impossible to extend, to change layers (like make a mobile version), to swap techs (like swap db vendors) and further still, because you've got presentation and application (and data) tiers mixed together, any designer who hit's the app will have to be working in, and changing, your application code - hit on a time when you have two versions of the app and you've got a big messy problem tbh.
As with most programming problems, you can solve this by doing three things:
designing a domain model.
specifying and designing interfaces rather that worrying about the implementation.
separating cross cutting concerns
Designing a domain model is a very important part of Class based OO programming, if you've never done it before then now is the ideal time, it doesn't matter whether you do this in a modelling language like UML or just in plain text, the idea is to define all the Entities in your Domain, it's easy to slip in to writing a book when discussing this, but let's keep it simple. Your domain model comprises all the Entities in your application's domain, each Entity is a thing, think User, Address, Article, Product and so forth, each Entity is typically defined as a Class (which is the blueprint of that entity) and each Class has Properties (like username, register_date etc).
Class User {
public $username;
public $register_date;
}
Often we may keep these as POPOs, however they are often better thought of as Transfer Objects (often called Data Transfer Objects, Value Objects) - a simple Class blueprint for an entity in your domain - normally we try to keep these portable as well, so that they can be implemented in any language, passed between apps, serialized and sent to other apps and similar - this isn't a must, indeed nothing is a must - but it does touch on separation of concerns in that it would normally be naked, implying no functionality, just a blueprint ot hold values. Contrast sharply with Business Objects and Utility Classes that actually 'do' things, are implementations of functionality, not just simple value holders.
Don't be fooled though, both Inheritance and Composition also play their part in domain model, a User may have several Addresses, each Address may be the address of several different Users. A BillingAddress may extend a normal Address and add in additional properties and so forth. (aside: what is a User? do you have a User? do you have a Person with 1-* UserAccounts?).
After you've got your domain model, the next step is usually mapping that up to some form of persistence layer (normally a database) two common ways of doing this (in well defined way) are by using an ORM (such as doctrine, which is in symphony if i remember correctly), and the other way is to use DAO pattern - I'll leave that part there, but typically this is a distinct part of the system, DAO layers have the advantage in that you specify all the methods available to work with the persistence layer for each Entity, whilst keeping the implementation abstracted, thus you can swap database vendors without changing the application code (or business rules as many say).
I'm going to head in to a grey area with the next example, as mentioned earlier Transfer Objects (our entities) are typically naked objects, but they are also often a good place to strap on other functionality, you'll see what I mean.
To illustrate Interfaces, you could simply define an Interface for all your Entities which is something like this:
Interface Validatable {
function isValid();
}
then each of your entities can implement this with their own custom validation routine:
Class User implements Validatable {
public function isValid()
{
// custom validation here
return $boolean;
}
}
Now you don't need to worry about creating some convoluted way of validating objects, you can simply call isValid() on any entity and find out if it's valid or not.
The most important thing to note is that by defining the interface, we've separated some of the concerns, in that no other part of the application needs to do anything to validate an object, all they need to know is that it's Validatable and to call the isValid() method.
However, we have crossed some concerns in that each object (instance of a Class) now carries it's own validation rules and model. It may make sense to abstract this out, one easy way of doing this is to make the validation method static, so you could define:
Class User {
public static function validate(User $user)
{
// custom validation here
return $boolean;
}
}
Or you could move to using getters and setters, this is another very common pattern where you can hide the validation inside the setter, thus ensuring that each property always holds valid data (or null, or default value).
Or perhaps you move the validation in to it's own library? Class Validate with it's own methods, or maybe you just pop it in the DAO layer because you only care about checking something when you save it, or maybe you need to validate when you receive data and when you persist it - how you end up doing it is your call and there is no 'best way'.
The third consideration, which I've already touched on, is separation of concerns - should a persistence layer care how the things it's persisting are presented? should the business logic care about how things are presented? should an Entity care where and how it's displayed? or should the presentation layer care how things are presented? Similarly, we can ask is there only ever going to be one presentation layer? in one language? What about how a label appears in a sentence, sure singular User and Address makes sense, but you can't simply +s to show the lists because Users is right but Addresss is wrong ;) - also we have working considerations like do I want a new designer having to change application code just to change the presentation of 'user account' to 'User Account', even do I want to change my app code in the classes when a that change is asked for?
Finally, and just to throw everything I've said - you have to ask yourself, what's the job I'm trying to do here? am I building a big reusable application with potentially many developers and a long life cycle here - or would a simple php script for each view and action suffice (one that reads $_GET/$_POST, validates, saves to db then displays what it should or redirects where it should) - in many, if not all cases this is all that's needed.
Remember, PHP is made to be invoked when a request is made to a web server, then send back a response [end] that's it, what happens between then is your domain, your job, the client and user typically doesn't care, and you can sum up what you're trying to do this simply: build a script to respond to that request as quickly as possible, with the expected results. That's and it needn't be any more complicated than that.
To be blunt, doing everything I mentioned and more is a great thing to do, you'll learn loads, understand your job better etc, but if you just want to get the job out the door and have easy to maintain simple code in the end, just build one script per view, and one per action, with the odd reusable bit (like a http handler, a db class, an email class etc).
You're running into the Model-View-Controller (MVC) architecture.
The M only stores data. No display information, just typed key-value pairs.
The C handles the logic of manipulating this information. It changes the M in response to user input.
The V is the part which handles displaying things. It should be something like Smarty templates rather than a huge amount of raw PHP for generating HTML.
Having it all "in one place" is the wrong approach. You won't have duplicated code with MVC - each part is a distinct step. This improves code reuse, readability, and maintainability.