What does a Data Mapper typically look like?

What does a Data Mapper typically look like? - php

I have a table called Cat, and an PHP class called Cat. Now I want to make a CatDataMapper class, so that Cat extends CatDataMapper.
I want that Data Mapper class to provide basic functionality for doing ORM, and for creating, editing and deleting Cat.
For that purpose, maybe someone who knows this pattern very well could give me some helpful advice? I feel it would be a little bit too simple to just provide some functions like update(), delete(), save().
I realize a Data Mapper has this problem: First you create the instance of Cat, then initialize all the variables like name, furColor, eyeColor, purrSound, meowSound, attendants, etc.. and after everything is set up, you call the save() function which is inherited from CatDataMapper. This was simple ;)
But now, the real problem: You query the database for cats and get back a plain boring result set with lots of cats data.
PDO features some ORM capability to create Cat instances. Lets say I use that, or lets even say I have a mapDataset() function that takes an associative array. However, as soon as I got my Cat object from a data set, I have redundant data. At the same time, twenty users could pick up the same cat data from the database and edit the cat object, i.e. rename the cat, and save() it, while another user still things about setting another furColor. When all of them save their edits, everything is messed up.
Err... ok, to keep this question really short: What's good practice here?

From DataMapper in PoEA
The Data Mapper is a layer of software
that separates the in-memory objects
from the database. Its responsibility
is to transfer data between the two
and also to isolate them from each
other. With Data Mapper the in-memory
objects needn't know even that there's
a database present; they need no SQL
interface code, and certainly no
knowledge of the database schema. (The
database schema is always ignorant of
the objects that use it.) Since it's a
form of Mapper (473), Data Mapper
itself is even unknown to the domain
layer.
Thus, a Cat should not extend CatDataMapper because that would create an is-a relationship and tie the Cat to the Persistence layer. If you want to be able to handle persistence from your Cats in this way, look into ActiveRecord or any of the other Data Source Architectural Patterns.
You usually use a DataMapper when using a Domain Model. A simple DataMapper would just map a database table to an equivalent in-memory class on a field-to-field basis. However, when the need for a DataMapper arises, you usually won't have such simple relationships. Tables will not map 1:1 to your objects. Instead multiple tables could form into one Object Aggregate and viceversa. Consequently, implementing just CRUD methods, can easily become quite a challenge.
Apart from that, it is one of the more complicated patterns (covers 15 pages in PoEA), often used in combination with the Repository pattern among others. Look into the related questions column on the right side of this page for similar questions.
As for your question about multiple users editing the same Cat, that's a common problem called Concurrency. One solution to that would be locking the row, while someone edits it. But like everything, this can lead to other issues.

If you rely on ORM's like Doctrine or Propel, the basic principle is to create a static class that would get the actual data from the database, (for instance Propel would create CatPeer), and the results retrieved by the Peer class would then be "hydrated" into Cat objects.
The hydration process is the process of converting a "plain boring" MySQL result set into nice objects having getters and setters.
So for a retrieve you'd use something like CatPeer::doSelect(). Then for a new object you'd first instantiate it (or retrieve and instance from the DB):
$cat = new Cat();
The insertion would be as simple as doing: $cat->save(); That'd be equivalent to an insert (or an update if the object already exists in the db... The ORM should know how to do the difference between new and existing objects by using, for instance, the presence ort absence of a primary key).

Implementing a Data Mapper is very hard in PHP < 5.3, since you cannot read/write protected/private fields. You have a few choices when loading and saving the objects:
Use some kind of workaround, like serializing the object, modifying it's string representation, and bringing it back with unserialize
Make all the fields public
Keep them private/protected, and write mutators/accessors for each of them
The first method has the possibility of breaking with a new release, and is very crude hack, the second one is considered a (very) bad practice.
The third option is also considered bad practice, since you should not provide getters/setters for all of your fields, only the ones that need it. Your model gets "damaged" from a pure DDD (domain driven design) perspective, since it contains methods that are only needed because of the persistence mechanism.
It also means that now you have to describe another mapping for the fields -> setter methods, next to the fields -> table columns.
PHP 5.3 introduces the ability to access/change all types of fields, by using reflection:
http://hu2.php.net/manual/en/reflectionproperty.setaccessible.php
With this, you can achieve a true data mapper, because the need to provide mutators for all of the fields has ceased.

PDO features some ORM capability to
create Cat instances. Lets say I use
that, or lets even say I have a
mapDataset() function that takes an
associative array. However, as soon as
I got my Cat object from a data set, I
have redundant data. At the same time,
twenty users could pick up the same
cat data from the database and edit
the cat object, i.e. rename the cat,
and save() it, while another user
still things about setting another
furColor. When all of them save their
edits, everything is messed up.
In order to keep track of the state of data typically and IdentityMap and/or a UnitOfWork would be used keep track of all teh different operations on mapped entities... and the end of the request cycle al the operations would then be performed.

keep the answer short:
You have an instance of Cat. (Maybe it extends CatDbMapper, or Cat3rdpartycatstoreMapper)
You call:
$cats = $cat_model->getBlueEyedCats();
//then you get an array of Cat objects, in the $cats array
Don't know what do you use, you might take a look at some php framework to the better understanding.

Related

Active Record must have domain logic?

I started some time working with the Yii Framework and I saw some things "do not let me sleep." Here I talk about my doubts about how Yii users use the Active Record.
I saw many people add business rules of the application directly in Active Record, the same generated by Gii. I deeply believe that this is a misinterpretation of what is Active Record and a violation of SRP.
Early on, SRP is easier to apply. ActiveRecord classes handle persistence, associations and not much else. But bit-by-bit, they grow. Objects that are inherently responsible for persistence become the de facto owner of all business logic as well. And a year or two later you have a User class with over 500 lines of code, and hundreds of methods in it’s public interface. Callback hell ensues.
When I talked about it with some people and my view was criticized. But when asked:
And when you need to regenerate your Active Record full of business rules through Gii what do you do? Rewrite? Copy and Paste? That's great, congratulations!
Got an answer, only the silence.
So, I:
What I am currently doing in order to reach a little better architecture is to generate the Active Records in a folder /ar. And inside the /models folder add the Domain Model.
By the way, is the Domain Model who owns the business rules, and is the Domain Model that uses the Active Records to persist and retrieve data, and this is the Data Model.
What do you think of this approach?
If I'm wrong somewhere, please tell me why before criticizing harshly.

Some of the comments on this article are quite helpful:
http://blog.codeclimate.com/blog/2012/10/17/7-ways-to-decompose-fat-activerecord-models/
In particular, the idea that your models should grow out of a strictly 'fat model' setup as you need more seems quite wise.
Are you having issues now or mainly trying to plan ahead? This may be hard to plan ahead for and may just need refactoring as you go ...
Edit:
Regarding moveUserToGroup (in your comment below), I could see how having that might bother you. Found this as I was thinking about your question: https://gist.github.com/justinko/2838490 An equivalent setup that you might use for your moveUserToGroup would be a CFormModel subclass. It'll give you the ability to do validations, etc, but could then be more specific to what you're trying to handle (and use multiple AR objects to achieve your objectives instead of just one).
I often use CFormModel to handle forms that have multiple AR objects or forms where I want to do other things.
Sounds like that may be what you're after. More details available here:
http://www.yiiframework.com/doc/guide/1.1/en/form.overview

The definition of Active Record, according to Martin Fowler:
An object carries both data and behavior. Much of this data is persistent and needs to be stored in a database. Active Record uses the most obvious approach, putting data access logic in the domain object. This way all people know how to read and write their data to and from the database.
When you segregate data and behavior you no longer have an Active Record. Two common related patterns are Data Mapper and Table/Row Gateway (this one more related to RDBMS's).
Again, Fowler says:
The Data Mapper is a layer of software that separates the in-memory objects from the database. Its responsibility is to transfer data between the two and also to isolate them from each other. With Data Mapper the in-memory objects needn't know even that there's a database present; they need no SQL interface code, and certainly no knowledge of the database schema.
And again:
A Table Data Gateway holds all the SQL for accessing a single table or view: selects, inserts, updates, and deletes. Other code calls its methods for all interaction with the database.
A Row Data Gateway gives you objects that look exactly like the record in your record structure but can be accessed with the regular mechanisms of your programming language. All details of data source access are hidden behind this interface.
A Data Mapper is usualy storage independent, the mapper recovers data from the storage and creates mapped objects (Plain-old objects). The mapped object knows absolutely nothing about being stored somewhere else.
As I said, TDG/RDG are more inwardly related to a relational table. TDG object represents the structure of the table and implements all common operations. RGD object contains data related to one single row of the table. Unlike mapped object of Data Mapper, the RDG object has conscience that it is part of a whole, because it references its container TDG.

MongoDB object mapping (PHP)

Intoduction problem:
What is the best practice to build my class T object, when I receive it from a MongoCursor::getNext()? As far as it goes, getNext() function of a MongoCursor returns with an array. I wish to use the result from that point as an object of type T.
Should I write my own constructor for type T, that accepts an array? Is there any generic solution to this, for example when type T extends G, and G does the job as a regular way, recursively (for nested documents).
I'm new to MongoDB, and I'd like to build my own generic mapper with a nice interface.
Bounty:
Which are the possible approaches, patterns and which would fit the concept of MongoDB the most from the view of PHP.

This answer has been rewritten.
Most data mappers work by representing one object per class or "model" is normally the coined term. If you wish to allow multiple accession through a single object (i.e. $model->find()) it is normally demmed so that the method will not actually return an instance of itself but instead that of an array or a MongoCursor eager loading classes into the space.
Such a paradigm is normally connected with "Active Record". This is the method that ORMs, ODMs and frameworks all use to communicate to databases in one way or another, not only for MongoDB but also for SQL and any other databases to happen to crop up (Cassandra, CouchDB etc etc).
It should be noted immediately that even though active record gives a lot of power it should not be blanketed across the entire application. There are times where using the driver directly would be more benefical. Most ORMs, ODMs and frameworks provide the ability to quickly and effortlessly access the driver directly for this reason.
There is, as many would say, no light weight data mapper. If you are going to map your returned data to classes then it will consume resources, end of. The benefit of doing this is the power you receive when manipulating your objects.
Active record is really good at being able to provide events and triggers from within PHP. A good example is that of an ORM I made for Yii: https://github.com/Sammaye/MongoYii it can provide hooks for:
afterConstruct
beforeFind
afterFind
beforeValidate
afterValidate
beforeSave
afterSave
It should be noted that when it comes to events like beforeSave and afterSave MongoDB does not possess triggers ( https://jira.mongodb.org/browse/SERVER-124 ) so it makes sense that the application should handle this. On top of the obvious reason for the application to handle this it also makes much better handling of the save functions by being able to call your native PHP functions to manipulate every document saved prior to touching the database.
Most data mappers work by using PHP own class CRUD to represent theirs too. For example to create a new record:
$d=new User();
$d->username='sammaye';
$d->save();
This is quite a good approach since you create a "new" ( https://github.com/Sammaye/MongoYii/blob/master/EMongoDocument.php#L46 shows how I prepare for a new record in MongoYii ) class to make a "new" record. It kind of fits quite nicely semantically.
Update functions are normally accessed through read functions, you cannot update a model you don't know the existane of. This brings us onto the next step of populating models.
To handle populating a model different ORMs, ODMs and frameworks commit to different methods. For example, my MongoYii extension uses a factory method called model in each class to bring back a new instance of itself so I can call th dynamic find and findOne and other such methods.
Some ORMs, ODMs and frameworks provide the read functions as direct static functions making them into factory methods themselves whereas some use the singleton pattern, however, I chose not to ( https://stackoverflow.com/a/4596323/383478 ).
Most, if not all, implement some form of the cursor. This is used to return multiples of the models and directly wraps (normally) the MongoCursor to replace the current() method with returning a pre-populate model.
For example calling:
User::model()->find();
Would return a EMongoCursor (in MongoYii) which would then sotre the fact that the class User was used to instantiate the cursor and when called like:
foreach(User::model() as $k=>$v){
var_dump($v);
}
Would call the current() method here: https://github.com/Sammaye/MongoYii/blob/master/EMongoCursor.php#L102 returning a new single instance of the model.
There are some ORMs, ODMs and frameworks which implement eager array loading. This means they will just load the whole result straight into your RAM as an array of models. I personally do not like this approach, it is wasteful and also does not bode well when you need to use active record for larger updates due to adding some new functionality in places that needs adding to old records.
One last topic before I move on is the schemaless nature of MongoDB. The problem with using PHP classes with MongoDB is that you want all the functionality of PHP but with the variable nature of MongoDB. This is easy to over come in SQL since it has a pre-defined schema, you just query for it and jobs done; however, MongoDB has no such thing.
This does make schema handling in MongoDB quite hazardous. Most ORMs, ODMs and frameworks demand that you pre-define the schema in the spot (i.e. Doctrine 2) using private variables with get and set methods. In MongoYii, to make my life easy and elegant, I decided to retain MongoDBs schemaless nature by using magics that would detect ( https://github.com/Sammaye/MongoYii/blob/master/EMongoModel.php#L26 is my __get and https://github.com/Sammaye/MongoYii/blob/master/EMongoModel.php#L47 is my __set ), if the property wa inaccessible in the class, if the field was in a internal _attributes array and if not then just return null. Likewise, for setting an attribute I would just set in the intrernal _attributes variable.
As for dealing with how to assign this schema I left internal assignment upto the user however, to deal with setting properties from forms etc I used the validation rules ( https://github.com/Sammaye/MongoYii/blob/master/EMongoModel.php#L236 ) calling a function called getSafeAttributeNames() which would return a list of attributes which had validation rules against them. If they did not have validation rules then those attributes which existed in the incoming $_POST or $_GET array would not be set. So this provided the ability for a schema, yet secure, model structure.
So we have covered how to actually use the root document you also ask how to data mappers handle subdocuments. Doctrine 2 and many others provide full class based subdocuments ( http://docs.doctrine-project.org/projects/doctrine-mongodb-odm/en/latest/reference/embedded-mapping.html ) but this can be extremely resourceful. Instead I decided that I would provide helper functions which would allow for flexible usage of subdocument without eager loading them into models and so consuming RAM. Basically what I did was to leave them as they are a provide a validator ( https://github.com/Sammaye/MongoYii/blob/master/validators/ESubdocumentValidator.php ) for validating inside of them. Of course the validator is self spawning so if you had a rule in the validator that used the validator again to issue a validation of a nested subdocument then it would work.
So I think that completes a very basic discussion of ORMs, ODMs and frameworks use data mappers. Of course I could probably write an entire essay on this but this is a good enough discussion for the minute I believe.

in Zend, Why do We use DB Model class and Mapper class as two separate?

I am working on the zend project, I am referring on other zend project to create the new Zend Project.But I don't like to blindly follow that project without understanding. In the Zend Directory structure, In Model class there are mainly two type of classes I see, like as in
- models
- DbTables
- Blog.php //Extends Zend_Db_Table_Abstract
- Blog.php // Contains methods like validate() and save()
- BlogMapper.php // Also Contains methods like validate(Blog b) & save(Blog b)
Why this specific structure is followed?
Is this is to separate Object class and Database model class?
Please explain.

DataMapper is a design pattern from Patterns of Enterprise Application Architecture.
The Data Mapper is a layer of software that separates the in-memory objects from the database. Its responsibility is to transfer data between the two and also to isolate them from each other. With Data Mapper the in-memory objects needn't know even that there's a database present; they need no SQL interface code, and certainly no knowledge of the database schema.
How you store data in a relational database is usually different from how you would structure objects in memory. For instance, an object will have an array with other objects, while in a database, your table will have a foreign key to another table instead. Because of the object-relational impedance mismatch, you use a mediating layer between the domain object and the database. This way, you can evolve both without affecting the other.
Separating the Mapping responsibility in its own layer is also more closely following the Single Responsibility Principle. Your objects dont need to know about the DB logic and vice versa. This gives you greater flexibility when writing your code.
When you dont want to use a Domain Model, you usually dont need DataMapper. If your database tables are simple, you might be better off with a TableModule and TableDataGateway or even just ActiveRecord.
For various other patterns see my answer to
ORM/DAO/DataMapper/ActiveRecord/TableGateway differences? and
http://martinfowler.com/eaaCatalog/index.html

The idea of a Model is to wrap up the logical collection of data inside of your code.
The idea of a DataMapper is to relate this application-level collection of data with how you are storing it.
For a lot of ActiveRecord implementations, the framework does not provide this separation of intent and this can lead to problems. For example, a BlogPost model can wrap up the basic information of a blog post like
title
author
body
date_posted
But maybe you also want to have it contain something like:
number_of_reads
number_of_likes
Now you could store all of this data in a single MySQL table to begin with, but as your blog grows and you become super famous, you find out that your statistics data is taking an awful lot of hits and you want to move it off to a separate database server.
How would you go about migrating those fields of the BlogPost objects off to a different data store without changing your application code?
With the DataMapper, you can modify the way the object is saved to the database(s) and the way it is loaded from the database(s). This lets you tweak the storage mechanism without having to change the actual collection of information that your application relies on.

php oop MVC design - proper architecture for an application to edit data

Now that I have read an awfull lot of posts, articles, questions and answers on OOP, MVC and design patterns, I still have questions on what is the best way to build what i want to build.
My little framework is build in an MVC fashion. It uses smarty as the viewer and I have a class set up as the controller that is called from the url.
Now where I think I get lost is in the model part. I might be mixing models and classes/objects to much (or to little).
Anyway an example. When the aim is to get a list of users that reside in my database:
the application is called by e.g. "users/list" The controller then runs the function list, that opens an instance of a class "user" and requests that class to retrieve a list from the table. once returned to the controller, the controller pushes it to the viewer by assigning the result set (an array) to the template and setting the template.
The user would then click on a line in the table that would tell the controler to start "user/edit" for example - which would in return create a form and fill that with the user data for me to edit.
so far so good.
right now i have all of that combined in one user class - so that class would have a function create, getMeAListOfUsers, update etc and properties like hairType and noseSize.
But proper oop design would want me to seperate "user" (with properties like, login name, big nose, curly hair) from "getme a list of users" what would feel more like a "user manager class".
If I would implement a user manager class, how should that look like then? should it be an object (can't really compare it to a real world thing) or should it be an class with just public functions so that it more or less looks like a set of functions.
Should it return an array of found records (like: array([0]=>array("firstname"=>"dirk", "lastname"=>"diggler")) or should it return an array of objects.
All of that is still a bit confusing to me, and I wonder if anyone can give me a little insight on how to do approach this the best way.

The level of abstraction you need for your processing and data (Business Logic) depends on your needs. For example for an application with Transaction Scripts (which probably is the case with your design), the class you describe that fetches and updates the data from the database sounds valid to me.
You can generalize things a bit more by using a Table Data Gateway, Row Data Gateway or Active Record even.
If you get the feeling that you then duplicate a lot of code in your transaction scripts, you might want to create your own Domain Model with a Data Mapper. However, I would not just blindly do this from the beginning because this needs much more code to get started. Also it's not wise to write a Data Mapper on your own but to use an existing component for that. Doctrine is such a component in PHP.
Another existing ORM (Object Relational Mapper) component is Propel which provides Active Records.
If you're just looking for a quick way to query your database, you might find NotORM inspiring.
You can find the Patterns listed in italics in
http://martinfowler.com/eaaCatalog/index.html
which lists all patterns in the book Patterns of Enterprise Application Architecture.

I'm not an expert at this but have recently done pretty much exactly the same thing. The way I set it up is that I have one class for several rows (Users) and one class for one row (User). The "several rows class" is basically just a collection of (static) functions and they are used to retrieve row(s) from a table, like so:
$fiveLatestUsers = Users::getByDate(5);
And that returns an array of User objects. Each User object then has methods for retrieving the fields in the table (like $user->getUsername() or $user->getEmail() etc). I used to just return an associative array but then you run into occasions where you want to modify the data before it is returned and that's where having a class with methods for each field makes a lot of sense.
Edit: The User object also have methods for updating and deleting the current row;
$user->setUsername('Gandalf');
$user->save();
$user->delete();

Another alternative to Doctrine and Propel is PHP Activerecords.
Doctrine and Propel are really mighty beasts. If you are doing a smaller project, I think you are better off with something lighter.
Also, when talking about third-party solutions there are a lot of MVC frameworks for PHP like: Kohana, Codeigniter, CakePHP, Zend (of course)...
All of them have their own ORM implementations, usually lighter alternatives.
For Kohana framework there is also Auto modeler which is supposedly very lightweight.
Personally I'm using Doctrine, but its a huge project. If I was doing something smaller I'd sooner go with a lighter alternative.

PHP MVC & SQL minus Model

I've been reading several articles on MVC and had a few questions I was hoping someone could possibly assist me in answering.
Firstly if MODEL is a representation of the data and a means in which to manipulate that data, then a Data Access Object (DAO) with a certain level of abstraction using a common interface should be sufficient for most task should it not?
To further elaborate on this point, say most of my development is done with MySQL as the underlying storage mechanism for my data, if I avoided vendor specific functions -- (i.e. UNIX_TIMESTAMP) -- in the construction of my SQL statements and used a abstract DB object that has a common interface moving between MySQL and maybe PostgreSQL, or MySQL and SQLite should be a simple process.
Here's what I'm getting at some task, are handled by a single CONTROLLER -- (i.e. UserRegistration) and rather that creating a MODEL for that task, I can get an instance of the db object -- (i.e. DB::getInstance()) -- then make the necessary db calls to INSERT a new user. Why with such a simple task would I create a new MODEL?
In some of the examples I've seen a MODEL is created, and within that MODEL there's a SELECT statement that fetches x number of orders from the order table and returns an array. Why do this, if in your CONTROLLER your creating another loop to iterate over that array and assign it to the VIEW; ex. 1?
ex. 1: foreach ($list as $order) { $this->view->set('order', $order); }
I guess one could modify the return so something like this is possibly; ex. 2.
ex. 2: while ($order = $this->model->getOrders(10)) { $this->view->set('order', $order); }
I guess my argument is that why create a model when you can simply make the necessary db calls from within your CONTROLLER, assuming your using a DB object with common interface to access your data, as I suspect most of websites are using. Yes I don't expect this is practical for all task, but again when most of what's being done is simple enough to not necessarily warrant a separate MODEL.
As it stands right now a user makes a request 'www.mysite.com/Controller/action/args1/args2', the front controller (I call it router) passes off to Controller (class) and within that controller a certain action (method) is called and from there the appropriate VIEW is created and then output.

So I guess you're wondering whether the added complexity of a model layer -on top- of a Database Access Object is the way you want to go. In my experience, simplicity trumps any other concern, so I would suggest that if you see a clear situation where it's simpler to completely go without a Model and have the data access occur in the equivalent of a controller, then you should go with that.
However, there are still other potential benefits to having an MVC separation:
No SQL at all in the controller: Maybe you decide to gather your data from a source other than a database (an array in the session? A mock object for testing? a file? just something else), or your database schema changes and you have to look for all the places that your code has to change, you could look through just the models.
Seperation of skillsets: Maybe someone on your team is great at complex SQL queries, but not great at dealing with the php side. Then the more separated the code is, the more people can play to their strengths (even more so when it comes to the html/css/javascript side of things).
Conceptual object that represents a block of data: As Steven said, there's a difference in the benefits you get from being database agnostic (so you can switch between mysql and postgresql if need be) and being schema agnostic (so you have an object full of data that fits together well, even if it came from different relational tables). When you have a model that represents a good block of data, you should be able to reuse that model in more than one place (e.g. a person model could be used in logins and when displaying a personnel list).
I certainly think that the ideals of separation of the tasks of MVC are very useful. But over time I've come to think that alternate styles, like keeping that MVC-like separation with a functional programming style, may be easier to deal with in php than a full blown OOP MVC system.

I found this great article that addressed most of my questions. In case anyone else had similar questions or is interested in reading this article. You can find it here http://blog.astrumfutura.com/archives/373-The-M-in-MVC-Why-Models-are-Misunderstood-and-Unappreciated.html.

The idea behind MVC is to have a clean separation between your logic. So your view is just your output, and your controller is a way of interacting with your models and using your models to get the necessary data to give to the necessary views. But all the work of actually getting data will go on your model.
If you think of your User model as an actual person and not a piece of data. If you want to know that persons name is it easier to call up a central office on the phone (the database) and request the name or to just ask the person, "what is your name?" That's one of the ideas behind the model. In a most simplistic way you can view your models as real living things and the methods you attach to them allow your controllers to ask those living things a series of questions (IE - can you view this page? are you logged in? what type of image are you? are you published? when were you last modified?). Your controller should be dumb and your model should be smart.
The other idea is to keep your SQL work in one central location, in this case your models. So that you don't have errant SQL floating around your controllers and (worst case scenario) your views.

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.