I develop an application that allows to access two different ERP systems. I want to abstract away some differences, such that I can search the same entity in both systems and represent a list that contains entities of both (e.g. a list of invoices). I'm thinking about how to architect this.
One approach is using the ActiveRecords pattern, have a Document class that extends from ObjectModel and then create two subclasses that each contain the database field to object property mapping. I feel it is cumbersome, because these object then represent a single instance of a Document, and still need to handle queries that result in a collection of documents.
I never touched doctrine2, but as I'm reading about Active Record vs Data Mapper pattern, it seems that it could bring a solution to my problem. Therefore my question:
Is it possible to have one object (here Document), and link two data mapper configurations to it, one for each data source ?
Related
Intoduction problem:
What is the best practice to build my class T object, when I receive it from a MongoCursor::getNext()? As far as it goes, getNext() function of a MongoCursor returns with an array. I wish to use the result from that point as an object of type T.
Should I write my own constructor for type T, that accepts an array? Is there any generic solution to this, for example when type T extends G, and G does the job as a regular way, recursively (for nested documents).
I'm new to MongoDB, and I'd like to build my own generic mapper with a nice interface.
Bounty:
Which are the possible approaches, patterns and which would fit the concept of MongoDB the most from the view of PHP.
This answer has been rewritten.
Most data mappers work by representing one object per class or "model" is normally the coined term. If you wish to allow multiple accession through a single object (i.e. $model->find()) it is normally demmed so that the method will not actually return an instance of itself but instead that of an array or a MongoCursor eager loading classes into the space.
Such a paradigm is normally connected with "Active Record". This is the method that ORMs, ODMs and frameworks all use to communicate to databases in one way or another, not only for MongoDB but also for SQL and any other databases to happen to crop up (Cassandra, CouchDB etc etc).
It should be noted immediately that even though active record gives a lot of power it should not be blanketed across the entire application. There are times where using the driver directly would be more benefical. Most ORMs, ODMs and frameworks provide the ability to quickly and effortlessly access the driver directly for this reason.
There is, as many would say, no light weight data mapper. If you are going to map your returned data to classes then it will consume resources, end of. The benefit of doing this is the power you receive when manipulating your objects.
Active record is really good at being able to provide events and triggers from within PHP. A good example is that of an ORM I made for Yii: https://github.com/Sammaye/MongoYii it can provide hooks for:
afterConstruct
beforeFind
afterFind
beforeValidate
afterValidate
beforeSave
afterSave
It should be noted that when it comes to events like beforeSave and afterSave MongoDB does not possess triggers ( https://jira.mongodb.org/browse/SERVER-124 ) so it makes sense that the application should handle this. On top of the obvious reason for the application to handle this it also makes much better handling of the save functions by being able to call your native PHP functions to manipulate every document saved prior to touching the database.
Most data mappers work by using PHP own class CRUD to represent theirs too. For example to create a new record:
$d=new User();
$d->username='sammaye';
$d->save();
This is quite a good approach since you create a "new" ( https://github.com/Sammaye/MongoYii/blob/master/EMongoDocument.php#L46 shows how I prepare for a new record in MongoYii ) class to make a "new" record. It kind of fits quite nicely semantically.
Update functions are normally accessed through read functions, you cannot update a model you don't know the existane of. This brings us onto the next step of populating models.
To handle populating a model different ORMs, ODMs and frameworks commit to different methods. For example, my MongoYii extension uses a factory method called model in each class to bring back a new instance of itself so I can call th dynamic find and findOne and other such methods.
Some ORMs, ODMs and frameworks provide the read functions as direct static functions making them into factory methods themselves whereas some use the singleton pattern, however, I chose not to ( https://stackoverflow.com/a/4596323/383478 ).
Most, if not all, implement some form of the cursor. This is used to return multiples of the models and directly wraps (normally) the MongoCursor to replace the current() method with returning a pre-populate model.
For example calling:
User::model()->find();
Would return a EMongoCursor (in MongoYii) which would then sotre the fact that the class User was used to instantiate the cursor and when called like:
foreach(User::model() as $k=>$v){
var_dump($v);
}
Would call the current() method here: https://github.com/Sammaye/MongoYii/blob/master/EMongoCursor.php#L102 returning a new single instance of the model.
There are some ORMs, ODMs and frameworks which implement eager array loading. This means they will just load the whole result straight into your RAM as an array of models. I personally do not like this approach, it is wasteful and also does not bode well when you need to use active record for larger updates due to adding some new functionality in places that needs adding to old records.
One last topic before I move on is the schemaless nature of MongoDB. The problem with using PHP classes with MongoDB is that you want all the functionality of PHP but with the variable nature of MongoDB. This is easy to over come in SQL since it has a pre-defined schema, you just query for it and jobs done; however, MongoDB has no such thing.
This does make schema handling in MongoDB quite hazardous. Most ORMs, ODMs and frameworks demand that you pre-define the schema in the spot (i.e. Doctrine 2) using private variables with get and set methods. In MongoYii, to make my life easy and elegant, I decided to retain MongoDBs schemaless nature by using magics that would detect ( https://github.com/Sammaye/MongoYii/blob/master/EMongoModel.php#L26 is my __get and https://github.com/Sammaye/MongoYii/blob/master/EMongoModel.php#L47 is my __set ), if the property wa inaccessible in the class, if the field was in a internal _attributes array and if not then just return null. Likewise, for setting an attribute I would just set in the intrernal _attributes variable.
As for dealing with how to assign this schema I left internal assignment upto the user however, to deal with setting properties from forms etc I used the validation rules ( https://github.com/Sammaye/MongoYii/blob/master/EMongoModel.php#L236 ) calling a function called getSafeAttributeNames() which would return a list of attributes which had validation rules against them. If they did not have validation rules then those attributes which existed in the incoming $_POST or $_GET array would not be set. So this provided the ability for a schema, yet secure, model structure.
So we have covered how to actually use the root document you also ask how to data mappers handle subdocuments. Doctrine 2 and many others provide full class based subdocuments ( http://docs.doctrine-project.org/projects/doctrine-mongodb-odm/en/latest/reference/embedded-mapping.html ) but this can be extremely resourceful. Instead I decided that I would provide helper functions which would allow for flexible usage of subdocument without eager loading them into models and so consuming RAM. Basically what I did was to leave them as they are a provide a validator ( https://github.com/Sammaye/MongoYii/blob/master/validators/ESubdocumentValidator.php ) for validating inside of them. Of course the validator is self spawning so if you had a rule in the validator that used the validator again to issue a validation of a nested subdocument then it would work.
So I think that completes a very basic discussion of ORMs, ODMs and frameworks use data mappers. Of course I could probably write an entire essay on this but this is a good enough discussion for the minute I believe.
Some background:
This is for an MVC application built without using any frameworks, but follows the MVC paradigm.
I am using an ORM library called RedBean to talk to the database.
While RedBean represents data using "beans", it supports wrapping them into a model, so custom functions may be provided by the model:
$account = R::dispense('account');
$account->name = "john";
R::store($account);
$account->getFormattedLastUpdatedTime('America/New_York');
Now the question:
Currently, each instance of the model would represent 1 row in the database. If I have a collection of accounts, then I would have an array of "account" models.
In the application, I have a feature for custom profile fields (don't worry, I am not using EAV though :)). One of the tables stores metadata for those fields (name, description etc) for generating the form fields for those custom profile fields. Once again, each row of the metadata represents 1 form field and each row is represented by 1 model.
I now wish to write a method to convert all those rows into a form object which can then be used for rendering and processing the form. But, where should this method live? My initial thought was to place it in the model representing the custom profile field metadata.
Clarification: This function would not be in the account model, but instead in the profile_fields_meta model.
Problem
As each model should represent 1 row, it seems a bit "dirty" to have the model return an object that would be generated from MULTIPLE rows in the database. Am I correct to say this is not the best way to do it? What do you recommend I do instead?
It would be right to have extended ArrayObject (http://php.net/manual/en/class.arrayobject.php) or other container class to run methods for collection.
Try to modify query methods to return data in your custom collection class instead of array if specified.
There's nothing inherent to MVC that says "each model [instance] should represent one row." Often in MVC frameworks the model (as seen by the controller, at least) is entirely ignorant of the data store and doesn't have any concept of or direct mapping to a "row." This isn't necessarily the case with ORMs but a model needn't adhere to an ORM's constraints.
However, though it's hard to tell without knowing more about your schema and implementation, the functionality you're describing doesn't sound appropriate for your Account model. In fact, it sounds to me like you should consider having a "FormField" model such that, in Rails parlance, Account "has many" FormFields.
And for the record, EAV isn't always bad, it's just often misused.
I am working on the zend project, I am referring on other zend project to create the new Zend Project.But I don't like to blindly follow that project without understanding. In the Zend Directory structure, In Model class there are mainly two type of classes I see, like as in
- models
- DbTables
- Blog.php //Extends Zend_Db_Table_Abstract
- Blog.php // Contains methods like validate() and save()
- BlogMapper.php // Also Contains methods like validate(Blog b) & save(Blog b)
Why this specific structure is followed?
Is this is to separate Object class and Database model class?
Please explain.
DataMapper is a design pattern from Patterns of Enterprise Application Architecture.
The Data Mapper is a layer of software that separates the in-memory objects from the database. Its responsibility is to transfer data between the two and also to isolate them from each other. With Data Mapper the in-memory objects needn't know even that there's a database present; they need no SQL interface code, and certainly no knowledge of the database schema.
How you store data in a relational database is usually different from how you would structure objects in memory. For instance, an object will have an array with other objects, while in a database, your table will have a foreign key to another table instead. Because of the object-relational impedance mismatch, you use a mediating layer between the domain object and the database. This way, you can evolve both without affecting the other.
Separating the Mapping responsibility in its own layer is also more closely following the Single Responsibility Principle. Your objects dont need to know about the DB logic and vice versa. This gives you greater flexibility when writing your code.
When you dont want to use a Domain Model, you usually dont need DataMapper. If your database tables are simple, you might be better off with a TableModule and TableDataGateway or even just ActiveRecord.
For various other patterns see my answer to
ORM/DAO/DataMapper/ActiveRecord/TableGateway differences? and
http://martinfowler.com/eaaCatalog/index.html
The idea of a Model is to wrap up the logical collection of data inside of your code.
The idea of a DataMapper is to relate this application-level collection of data with how you are storing it.
For a lot of ActiveRecord implementations, the framework does not provide this separation of intent and this can lead to problems. For example, a BlogPost model can wrap up the basic information of a blog post like
title
author
body
date_posted
But maybe you also want to have it contain something like:
number_of_reads
number_of_likes
Now you could store all of this data in a single MySQL table to begin with, but as your blog grows and you become super famous, you find out that your statistics data is taking an awful lot of hits and you want to move it off to a separate database server.
How would you go about migrating those fields of the BlogPost objects off to a different data store without changing your application code?
With the DataMapper, you can modify the way the object is saved to the database(s) and the way it is loaded from the database(s). This lets you tweak the storage mechanism without having to change the actual collection of information that your application relies on.
I'm new to using an MVC structure, and am developing my own MVC framework for a university project. I've got a database class that I can use to send a query to the database and return me an array of objects (PHP standard object class). I then want to display a list of the objects on an index page.
My question is, should this list of standard objects really be a list of models? Or are they fine as they are?
You don't have to create separate model classes just for the sake of creating model classes because they are in the MVC pattern name. If your solution also works when you simply pass the array of PHP standard object classes to your view, you should just use that. In that case creating model classes would just be extra work for no benefit. However, if you need additional functionality besides simply outputting a list of database results, you should consider creating actual models.
You should store data in your database and manipulate this data (create,update,read and destroy) with your models. I think you should better know what is MVC and what is for. You can read a bit here
depends... One could argue that most applications are very heavy on the Model side (Fat Models is an actual pattern), therefore it will not be enough to create a few stdObjects from an array, but actually map tables to object classes, so you can add useful methods to them.
I would recommend to take a look at Doctrine and try implementing a subset of features.
I have a table called Cat, and an PHP class called Cat. Now I want to make a CatDataMapper class, so that Cat extends CatDataMapper.
I want that Data Mapper class to provide basic functionality for doing ORM, and for creating, editing and deleting Cat.
For that purpose, maybe someone who knows this pattern very well could give me some helpful advice? I feel it would be a little bit too simple to just provide some functions like update(), delete(), save().
I realize a Data Mapper has this problem: First you create the instance of Cat, then initialize all the variables like name, furColor, eyeColor, purrSound, meowSound, attendants, etc.. and after everything is set up, you call the save() function which is inherited from CatDataMapper. This was simple ;)
But now, the real problem: You query the database for cats and get back a plain boring result set with lots of cats data.
PDO features some ORM capability to create Cat instances. Lets say I use that, or lets even say I have a mapDataset() function that takes an associative array. However, as soon as I got my Cat object from a data set, I have redundant data. At the same time, twenty users could pick up the same cat data from the database and edit the cat object, i.e. rename the cat, and save() it, while another user still things about setting another furColor. When all of them save their edits, everything is messed up.
Err... ok, to keep this question really short: What's good practice here?
From DataMapper in PoEA
The Data Mapper is a layer of software
that separates the in-memory objects
from the database. Its responsibility
is to transfer data between the two
and also to isolate them from each
other. With Data Mapper the in-memory
objects needn't know even that there's
a database present; they need no SQL
interface code, and certainly no
knowledge of the database schema. (The
database schema is always ignorant of
the objects that use it.) Since it's a
form of Mapper (473), Data Mapper
itself is even unknown to the domain
layer.
Thus, a Cat should not extend CatDataMapper because that would create an is-a relationship and tie the Cat to the Persistence layer. If you want to be able to handle persistence from your Cats in this way, look into ActiveRecord or any of the other Data Source Architectural Patterns.
You usually use a DataMapper when using a Domain Model. A simple DataMapper would just map a database table to an equivalent in-memory class on a field-to-field basis. However, when the need for a DataMapper arises, you usually won't have such simple relationships. Tables will not map 1:1 to your objects. Instead multiple tables could form into one Object Aggregate and viceversa. Consequently, implementing just CRUD methods, can easily become quite a challenge.
Apart from that, it is one of the more complicated patterns (covers 15 pages in PoEA), often used in combination with the Repository pattern among others. Look into the related questions column on the right side of this page for similar questions.
As for your question about multiple users editing the same Cat, that's a common problem called Concurrency. One solution to that would be locking the row, while someone edits it. But like everything, this can lead to other issues.
If you rely on ORM's like Doctrine or Propel, the basic principle is to create a static class that would get the actual data from the database, (for instance Propel would create CatPeer), and the results retrieved by the Peer class would then be "hydrated" into Cat objects.
The hydration process is the process of converting a "plain boring" MySQL result set into nice objects having getters and setters.
So for a retrieve you'd use something like CatPeer::doSelect(). Then for a new object you'd first instantiate it (or retrieve and instance from the DB):
$cat = new Cat();
The insertion would be as simple as doing: $cat->save(); That'd be equivalent to an insert (or an update if the object already exists in the db... The ORM should know how to do the difference between new and existing objects by using, for instance, the presence ort absence of a primary key).
Implementing a Data Mapper is very hard in PHP < 5.3, since you cannot read/write protected/private fields. You have a few choices when loading and saving the objects:
Use some kind of workaround, like serializing the object, modifying it's string representation, and bringing it back with unserialize
Make all the fields public
Keep them private/protected, and write mutators/accessors for each of them
The first method has the possibility of breaking with a new release, and is very crude hack, the second one is considered a (very) bad practice.
The third option is also considered bad practice, since you should not provide getters/setters for all of your fields, only the ones that need it. Your model gets "damaged" from a pure DDD (domain driven design) perspective, since it contains methods that are only needed because of the persistence mechanism.
It also means that now you have to describe another mapping for the fields -> setter methods, next to the fields -> table columns.
PHP 5.3 introduces the ability to access/change all types of fields, by using reflection:
http://hu2.php.net/manual/en/reflectionproperty.setaccessible.php
With this, you can achieve a true data mapper, because the need to provide mutators for all of the fields has ceased.
PDO features some ORM capability to
create Cat instances. Lets say I use
that, or lets even say I have a
mapDataset() function that takes an
associative array. However, as soon as
I got my Cat object from a data set, I
have redundant data. At the same time,
twenty users could pick up the same
cat data from the database and edit
the cat object, i.e. rename the cat,
and save() it, while another user
still things about setting another
furColor. When all of them save their
edits, everything is messed up.
In order to keep track of the state of data typically and IdentityMap and/or a UnitOfWork would be used keep track of all teh different operations on mapped entities... and the end of the request cycle al the operations would then be performed.
keep the answer short:
You have an instance of Cat. (Maybe it extends CatDbMapper, or Cat3rdpartycatstoreMapper)
You call:
$cats = $cat_model->getBlueEyedCats();
//then you get an array of Cat objects, in the $cats array
Don't know what do you use, you might take a look at some php framework to the better understanding.