I'm learning about the repository pattern, and I have seen a lot of examples where it is used for creation and update.
Here is one example of a repository interface.
interface RepositoryInterface
{
public function all();
public function create(array $data);
public function update(array $data, $id);
public function delete($id);
public function show($id);
}
This repository interface is responsible for creating/retrieving and updating models.
But after searching a little more, I found that people say you should avoid persisting data in a repository, and that repositories should act as collections and be used only for retrieving data. Here is the link.
Here is what they say there.
Probably the most important distinction about repositories is that they represent collections of entities. They do not represent database storage or caching or any number of technical concerns. Repositories represent collections. How you hold those collections is simply an implementation detail.
Here is one example of a repository that only retrieves data.
interface BlogRepositoryInterface
{
public function all();
public function getByUser(User $user);
}
I am wondering what the best practice for the repository pattern is.
If we should use a repository only for retrieving models, how do we then handle creating, updating and deleting models?
Object persistence is fully allowed by the Repository pattern.
From Martin Fowler's book Patterns of Enterprise Application Architecture (p.322):
A Repository mediates between the domain and data mapping layers, acting like an in-memory domain object collection. Client objects construct query specifications declaratively and submit them to Repository for satisfaction. Objects can be added to and removed from the Repository, as they can from a simple collection of objects, and the mapping code encapsulated by the Repository will carry out the appropriate operations behind the scenes.
The excerpt is clear: As Repository is a collection, you should be able to add and delete objects from it at your will.
The only concern I have is about your interface. You should break it in two or more as you will probably have objects that:
are not meant to be deleted
are not meant to be updated
are not meant to be inserted
Creating different interfaces would make your code meet the Interface Segregation Principle that states that no client should be forced to depend on methods it does not use.
Some examples:
Let's say you have a class that represents a state of your country. It's rare to see a country adding new states, deleting or changing their names frequently. Thus, the class State could implement an interface that has only the methods all() and show().
Suppose that you are building an e-commerce site. Deleting a Customer from the database is not an option, because all of that customer's data (purchase history, searches, etc.) would be lost. So you would do a soft delete, setting a flag $customer->deleted = true;. In this case, the class Customer could implement an interface that has only the methods all() and show(), plus another interface (or two interfaces) for the methods insert() and update().
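To make the State example concrete, here is a minimal sketch of how the split could look. The interface names (ReadableRepository, WritableRepository, RemovableRepository) and the in-memory class are my own assumptions for illustration, not part of the question's code, and I have put the interfaces on the repository rather than on the entity. It assumes PHP 8.1 or later.

```php
<?php
declare(strict_types=1);

// Hypothetical interface names, one per capability.
interface ReadableRepository
{
    public function all(): array;
    public function show(int $id): ?object;
}

interface WritableRepository
{
    public function insert(object $entity): void;
    public function update(object $entity): void;
}

interface RemovableRepository
{
    public function delete(int $id): void;
}

final class State
{
    public function __construct(
        public readonly int $id,
        public readonly string $name
    ) {}
}

// States are read-only reference data, so this repository
// implements only the read interface.
final class InMemoryStateRepository implements ReadableRepository
{
    /** @var array<int, State> */
    private array $states = [];

    public function __construct(State ...$states)
    {
        foreach ($states as $state) {
            $this->states[$state->id] = $state;
        }
    }

    public function all(): array
    {
        return array_values($this->states);
    }

    public function show(int $id): ?object
    {
        return $this->states[$id] ?? null;
    }
}

$states = new InMemoryStateRepository(new State(1, 'Ohio'), new State(2, 'Texas'));
```

A customer repository with soft deletes would then implement ReadableRepository and WritableRepository but not RemovableRepository, so no client can even ask it to delete.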
I think you misunderstood the sentences you quoted:
Probably the most important distinction about repositories is that
they represent collections of entities. They do not represent database
storage or caching or any number of technical concerns. Repositories
represent collections. How you hold those collections is simply an
implementation detail.
There is no statement saying that you should only use repositories for reading. The most important characteristic of repositories is that when you create or update an item through a repository, the changes may not be applied to the persistence layer immediately. When the changes are applied depends on the implementation of the repository.
A small note from me: we shouldn't have a method called create in the repository. As with a collection, we add an item to it, we don't create an item. I usually have an add method instead of a create method in my repository interfaces. Creation should be the responsibility of a factory.
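A rough sketch of that split between add and create, assuming PHP 8.1 or later. Post, PostFactory and InMemoryPostRepository are hypothetical names I made up for the example, and the id generation is deliberately naive.

```php
<?php
declare(strict_types=1);

final class Post
{
    public function __construct(
        public readonly int $id,
        public string $title
    ) {}
}

// Hypothetical factory: building a valid Post is its job, not the repository's.
final class PostFactory
{
    private int $nextId = 1;

    public function create(string $title): Post
    {
        return new Post($this->nextId++, $title);
    }
}

// Hypothetical repository that reads like a collection: add, remove, find.
final class InMemoryPostRepository
{
    /** @var array<int, Post> */
    private array $posts = [];

    public function add(Post $post): void
    {
        $this->posts[$post->id] = $post;
    }

    public function remove(Post $post): void
    {
        unset($this->posts[$post->id]);
    }

    public function find(int $id): ?Post
    {
        return $this->posts[$id] ?? null;
    }

    public function all(): array
    {
        return array_values($this->posts);
    }
}

$factory = new PostFactory();
$posts = new InMemoryPostRepository();

$post = $factory->create('Hello');
$posts->add($post);
```

A database-backed implementation would keep the same add/remove/find methods and decide internally when to write to storage.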
I've got an entity with a lot of linked properties. When I'm handling a CSV import, I don't want to write $em->getReference() calls for all the linked fields (mainly because I want to keep it as abstract as possible and don't want to hard-code all the possible references).
I would rather do this in the entity's setter methods for the given properties. However, that would require me to access Doctrine from within the model, which in turn is bad practice.
Should I access the entity's metadata and go from there, or is there a better approach that I haven't mentioned yet?
Doing it in the setter, really messes up the whole SOA thing. If you care about the code being decoupled and abstract you can use Dependency Inversion.
Let's say you have entity A, which has associations to entities B and C. To get references to the correct B and C instances from the raw CSV data, you would define two interfaces, e.g. BRepositoryInterface and CRepositoryInterface. They might both contain a single method find($id), but they still have to be distinct. Now make your Doctrine repositories for the respective entities implement these interfaces, and inject them into the service where you create entity A.
If you really want to write good code, you should create separate classes implementing each of these interfaces and inject your Doctrine repositories into them. These classes then act as wrappers for those repositories. This way you have a distinct layer between your DataMapper layer and your business logic layer, which gives you the abstraction you want.
This is what I've learned in my recent studies on good code, DDD and design patterns. It is nowhere near perfect (not that there is such a thing). Any ideas or comments would be appreciated.
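A minimal sketch of such a wrapper, assuming PHP 8.1 or later. FakeOrmRepository is only a stand-in I wrote so the example runs without Doctrine installed; in a real project you would inject the Doctrine repository for B instead, which also offers a find($id) method. OrmBackedBRepository is a hypothetical name.

```php
<?php
declare(strict_types=1);

final class B
{
    public function __construct(public readonly int $id) {}
}

// The interface your business logic depends on.
interface BRepositoryInterface
{
    public function find(int $id): ?B;
}

// Stand-in for an ORM repository (assumption: only find() is needed).
final class FakeOrmRepository
{
    public function __construct(private array $rows) {}

    public function find($id)
    {
        return $this->rows[$id] ?? null;
    }
}

// Wrapper: the only class that knows about the ORM-side repository.
final class OrmBackedBRepository implements BRepositoryInterface
{
    public function __construct(private FakeOrmRepository $inner) {}

    public function find(int $id): ?B
    {
        $found = $this->inner->find($id);

        // Guard against the ORM handing back something unexpected.
        return $found instanceof B ? $found : null;
    }
}

$repo = new OrmBackedBRepository(new FakeOrmRepository([7 => new B(7)]));
```

The service that builds A only ever type-hints BRepositoryInterface, so swapping the ORM means rewriting the wrapper and nothing else.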
Update: In regards to your comment:
One of the main things that good design strives for is "capturing the language of domain experts" (see this source item no.4 for a description of these legendary beings), i.e. what does your code say in plain English?
What your code says is basically: find the objects with these given ids from the repositories of the entities that have an association to A. This looks pretty good, since you have no explicit dependencies on what A has associations to. But looking at it more closely, you'll see that you do have dependencies on actual B and C objects and their repositories: when you provide an id for some object, you're not just providing an id, you're also implicitly stating what that object is, otherwise the id would have no meaning other than its scalar value. That approach definitely has its use cases, both in the semantics of the design and in RAD. There is still the issue of the Law of Demeter, but it can be solved, see below:
Either way I think you should definitely have a factory for A objects that looks something like this.
class AFactory
{
    protected $br;
    protected $cr;

    public function __construct(BRepositoryInterface $br, CRepositoryInterface $cr)
    {
        $this->br = $br;
        $this->cr = $cr;
    }

    public function create($atr1, $atr2, $bId, $cId)
    {
        // Resolve the ids from the raw data into the actual B and C instances.
        $b = $this->br->find($bId);
        $c = $this->cr->find($cId);

        // Pass the resolved objects, not the raw ids, to A.
        return new A($atr1, $atr2, $b, $c);
    }
}
Now you can actually create this factory using the design you stated by having another factory for this factory; this will also solve the issue with the Law of Demeter. That factory will have the Entity Manager as its dependency: it will read A's metadata, fetch the repositories of the related objects based on that metadata, and create a new AFactory instance from those repositories. If you implement those interfaces (BRepositoryInterface and CRepositoryInterface) in your actual Doctrine repositories, the AFactory instance will be created successfully.
I am fairly new to Domain Driven Design (DDD), but what I understand of it is that you speak to the application service, which is the entry to your "model". The service can talk to a repository which uses sources (files, databases, etc) to get data. The repository returns an entity.
That is the global idea what I get of it. The service knows the repository but not the entity etc.
Now I have the following issue.
I have an entity user, which is something like the following (is just an example)
<?php
class User
{
protected $name;
protected $city_id;
public function getCity()
{
// return $city_entity;
}
}
The getCity() function returns the city entity. I wish for this function to use lazy loading so injecting the CityEntity when you use the user repository is not really lazy loading.
I came up with two solutions to the problem, but I feel that both go against the DDD principles.
First solution I came up with is to inject the city repository in the user entity, which has disadvantages: if you need more repositories you have to load them all in the entity. It looks like answer but it just looks like a wrapper for the repository to me. So why not just inject the repository then?
Second solution, you give the entity a service locator. The disadvantage of this is you don't know any more which repositories are needed unless you read the code.
So now, the question is: what is the best way to get the flexibility of lazy loading while keeping the DDD principles intact?
One of the main points of DDD is that your domain model should only express the ubiquitous language of the bounded context in order to handle the business rules.
Thus, in DDD entities, lazy loading is an anti-pattern. There are some reasons for that:
if an aggregate holds only the data that it requires to ensure business invariants, it needs all of them and they are few, thus eager loading works better.
if you lazy load data, your clients have to handle many more exceptional paths than those relevant in business terms
you can use shared identifiers to cope with references between aggregates
it's cheap to use dedicated queries for projective purposes (often called read-model)
IMHO, you should never use DDD entities as a data access technique: use DTOs for that.
For further info you could take a look at Effective Aggregate Design by Vaughn Vernon.
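As a rough illustration of the shared-identifier point, applied to the User/City example from the question: the user keeps only the city's id, and an application service resolves it when a use case actually needs the city. CityRepository, InMemoryCityRepository and UserProfileService are hypothetical names, and the sketch assumes PHP 8.1 or later.

```php
<?php
declare(strict_types=1);

final class City
{
    public function __construct(
        public readonly int $id,
        public readonly string $name
    ) {}
}

// The user aggregate references the city by identifier only,
// so there is nothing to lazy load inside the entity.
final class User
{
    public function __construct(
        public readonly string $name,
        public readonly int $cityId
    ) {}
}

interface CityRepository
{
    public function find(int $id): ?City;
}

final class InMemoryCityRepository implements CityRepository
{
    /** @var array<int, City> */
    private array $cities = [];

    public function __construct(City ...$cities)
    {
        foreach ($cities as $city) {
            $this->cities[$city->id] = $city;
        }
    }

    public function find(int $id): ?City
    {
        return $this->cities[$id] ?? null;
    }
}

// The application service, not the entity, talks to the repository.
final class UserProfileService
{
    public function __construct(private CityRepository $cities) {}

    public function cityNameFor(User $user): ?string
    {
        return $this->cities->find($user->cityId)?->name;
    }
}

$service = new UserProfileService(new InMemoryCityRepository(new City(10, 'Utrecht')));
$user = new User('Anna', 10);
```

The entity stays free of repositories and service locators, and the dependency on the city repository is visible in the service's constructor.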
What good is a repository pattern when you have an ORM?
Example. Suppose i have the following (fictional) tables:
Table: users
pk_user_id
fk_userrole_id
username
Table: userroles
fk_userrole_id
role
Now with an orm i could simply put this in a model file:
$user = ORM::load('users', $id);
Now $user is already my object, which could easily be lazy loaded:
(would be even nicer if things are automatically singular/pluralized)
foreach ( $user->userroles()->role as $role )
{
echo $role;
}
Now with the Repository pattern I'd have to create a repository for the Users and one for the Roles. The repository also needs all kinds of functions to retrieve data for me and to store it. Plus it needs to work with entity models, so I have to create all of those too.
To me that looks like a lot of stuff to do... when I could simply get the data as I described above with an ORM. And I could store it just as easily:
ORM::store($user);
In this case it would not only store the user object to the database, but also any changes I made to the 'Roles' object as well. So there is no need for any extra work like you need with the repository pattern...
So my question basically is: why would I want to use a repository pattern with an ORM? I've seen tutorials that use that pattern (with Doctrine, for example), but it really just doesn't make any sense to me... Does anyone have an explanation for its use in combination with an ORM?
The ORM is an implementation detail of the Repository. The ORM just makes it easy to access the db tables in an OOP friendly way. That's it.
The repository abstracts persistence access, whatever the storage is. That is its purpose. The fact that you're using a db or XML files or an ORM doesn't matter. The Repository allows the rest of the application to ignore persistence details. This way, you can easily test the app via mocking or stubbing, and you can change storage if needed. Today you might use MySQL, tomorrow you'll want to use NoSQL or cloud storage. Do that with an ORM!
Repositories deal with domain/business objects (from the app's point of view), while an ORM handles db objects. A business object IS NOT a db object: the first has behaviour, the second is a glorified DTO that only holds data.
Edit
You might say that both the repository and the ORM abstract access to data; however, the devil is in the details. The repository abstracts access to all storage concerns, while the ORM abstracts access to a specific RDBMS.
In a nutshell, repositories and ORMs have DIFFERENT purposes and, as I've said above, the ORM is always an implementation detail of the repo.
You can also check this post about more details about the repository pattern.
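To show what "the rest of the application ignores persistence details" can look like, here is a small sketch, assuming PHP 8.1 or later. UserRepository, InMemoryUserRepository and RegisterUser are names I invented for the example; an ORM-backed class implementing the same interface would be a drop-in replacement for the in-memory one.

```php
<?php
declare(strict_types=1);

final class User
{
    public function __construct(public readonly string $username) {}
}

// The application depends on this interface only.
interface UserRepository
{
    public function findByUsername(string $username): ?User;
    public function add(User $user): void;
}

// Storage detail #1: a plain array, handy for tests.
final class InMemoryUserRepository implements UserRepository
{
    /** @var array<string, User> */
    private array $users = [];

    public function findByUsername(string $username): ?User
    {
        return $this->users[$username] ?? null;
    }

    public function add(User $user): void
    {
        $this->users[$user->username] = $user;
    }
}

// Business rule lives here and never mentions a database or an ORM.
final class RegisterUser
{
    public function __construct(private UserRepository $users) {}

    public function __invoke(string $username): User
    {
        if ($this->users->findByUsername($username) !== null) {
            throw new DomainException("Username already taken: $username");
        }

        $user = new User($username);
        $this->users->add($user);

        return $user;
    }
}

$register = new RegisterUser(new InMemoryUserRepository());
$alice = $register('alice');
```

The uniqueness rule can be tested without any database, which is the testing benefit mentioned above.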
ORM and repository pattern...depends on setup.
If you use your ORM entities as the domain layer, then please use no repositories.
If you have a separate domain model and you need to map from that model to ORM entities and so perform a save, then repositories are what you need.
You can find more details here (but you must be logged in to LinkedIn). Also, to understand the difference, check out the definition of the repository pattern.
Most people use classes that they call repositories, but aren't repositories at all, just query classes - this is how/where you should place your queries if you decided to go with the #1 option (see answer above). In this case make sure not to expose DbContext or ISession from that query class, nor to expose CUD-methods from there - remember, Query class!
The #2 option is a tough one. If you do a real repository, all the inputs and outputs on the repository interface will contain clear domain classes (and no database-related objects). It's forbidden to expose ORM-mapped classes or ORM-architecture-related objects from there. There will also be a Save method. These repositories might also contain queries, but unlike query classes, these repos will do more: they will take your domain aggregate (a collection and tree of entities) and save it to the DB by mapping those classes to ORM classes and performing a save on the ORM. This style (#2) does not need an ORM; the repository pattern was primarily made for ADO.NET (any kind of data access).
Anyhow these 2 options are the 2 extremes we can do. A lot of people use repositories with ORM, but they just add an extra layer of code without real function, the only real function there is the query class like behaviour.
Also, I'd be careful when someone talks about UnitOfWork, especially with an ORM. Almost every sample on the internet is a failure in terms of architecture. If you need a UoW, why not use TransactionScope (just make sure you have a wrapper that uses something other than Serializable transactions by default)? In 99.9% of cases you won't need to manage 2 sets of independent changes in data (so 2 sets of UoW), so TransactionScope will be a fine choice in .NET. For PHP I'd look for some open-session-view implementations...
I'm working on a large project at the moment and am wondering which is best practice: to model entities and sets of entities separately, or in one class?
Currently I am implementing two classes for each entity (for example an 'author' and 'authors' class) where the plural class contains methods like 'fetch authors' (using Zend_Db_Table_Abstract for plural and Zend_Db_Table_Row_Abstract for singular).
However I realised that I've often seen methods like 'fetch/list' functions in a single entity's object, which seems quite neat in terms of the fact that I won't have to have as many files.
I know there are no hard-and-fast rules for data modelling but before I continue too far I'd be interested in learning what the general consensus on best-practice for this is (along with supporting arguments of course!).
Answers [opinions] gratefully received!
Rob Ganly
Personally, I prefer a model called Person to actually represent a single person and a model like PersonCollection to represent a collection of persons. In neither case, would I have methods for fetch/get on these objects. Rather, I would put those methods on a PersonRepository or a PersonMapper class.
That's really my biggest area of discomfort with ActiveRecord as a pattern for modeling. By having methods like find() and save(), it opens the door to methods like getPersonByName(), getPersonsWithMinimumAge(), etc. These methods are great, nothing wrong with them, but I think that semantically, they work better on a mapper or a repository class. Let the Model actually model, leave persistence and retrieval to mappers and repositories.
So, to more directly address your question, I see potentially three classes per "entity type":
Person - actually models a person
PersonCollection - extends some Abstract Collection class, each item of class Person
PersonMapper - persistence and retrieval of Person objects and PersonCollections
Controllers would use the mapper to persist and retrieve models and collections.
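A bare-bones sketch of those three classes, assuming PHP 8.1 or later. The mapper here filters a plain array of rows that stands in for a database table; that array, and the method name findWithMinimumAge, are assumptions for the example only.

```php
<?php
declare(strict_types=1);

// Models a single person, nothing about persistence.
final class Person
{
    public function __construct(
        public readonly string $name,
        public readonly int $age
    ) {}
}

// A typed collection: every item is guaranteed to be a Person.
final class PersonCollection implements IteratorAggregate, Countable
{
    /** @var Person[] */
    private array $people;

    public function __construct(Person ...$people)
    {
        $this->people = $people;
    }

    public function getIterator(): ArrayIterator
    {
        return new ArrayIterator($this->people);
    }

    public function count(): int
    {
        return count($this->people);
    }
}

// Retrieval lives on the mapper, not on Person or PersonCollection.
final class PersonMapper
{
    // $rows stands in for a database table in this sketch.
    public function __construct(private array $rows) {}

    public function findWithMinimumAge(int $age): PersonCollection
    {
        $matches = array_filter($this->rows, fn (array $row) => $row['age'] >= $age);

        return new PersonCollection(
            ...array_map(
                fn (array $row) => new Person($row['name'], $row['age']),
                array_values($matches)
            )
        );
    }
}

$mapper = new PersonMapper([
    ['name' => 'Ann', 'age' => 30],
    ['name' => 'Bob', 'age' => 17],
]);
$adults = $mapper->findWithMinimumAge(18);
```

A controller would call the mapper and then simply foreach over the returned collection.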
It's probably no surprise that I'm drawn to Doctrine2. The EntityManager there functions as a single point of contact for persistence and retrieval. I can then create repositories and services that use the EntityManager for custom functionality. And I can then layer on action helpers or factories or dependency injection containers to make it easy to get/create those repositories and services.
But I know that the standard ActiveRecord approach is quite common, well-understood, and very mainstream. You can get good results using it and can find many developers who immediately understand it and can work well with it.
As in most things, YMMV.
I'm seriously confused about the concept of the 'Model' in MVC. Most frameworks that exist today put the Model between the Controller and the database, and the Model almost acts like a database abstraction layer. The concept of 'Fat Model Skinny Controller' is lost as the Controller starts doing more and more logic.
In DDD, there is also the concept of a Domain Entity, which has a unique identity. As I understand it, a user is a good example of an Entity (unique userid, for instance). The Entity has a life-cycle -- its values can change throughout the course of the action -- and then it's saved or discarded.
The Entity I describe above is what I thought Model was supposed to be in MVC? How off-base am I?
To clutter things more, you throw in other patterns, such as the Repository pattern (maybe putting a Service in there). It's pretty clear how the Repository would interact with an Entity -- how does it with a Model?
Controllers can have multiple Models, which makes it seem like a Model is less a "database table" than it is a unique Entity.
UPDATE: In this post the Model is described as something with knowledge, and it can be singular or a collection of objects. So it sounds like an Entity and a Model are more or less the same. The Model is an all-encompassing term, whereas an Entity is more specific. A Value Object would be a Model as well. At least in terms of MVC. Maybe???
So, in very rough terms, which is better?
No "Model" really ...
class MyController {
public function index() {
$repo = new PostRepository();
$posts = $repo->findAllByDateRange('within 30 days');
foreach($posts as $post) {
echo $post->Author;
}
}
}
Or this, which has a Model as the DAO?
class MyController {
public function index() {
$model = new PostModel();
// maybe this returns a PostRepository?
$posts = $model->findAllByDateRange('within 30 days');
while($posts->getNext()) {
echo $posts->Post->Author;
}
}
}
Both those examples didn't even do what I was describing above. I'm clearly lost. Any input?
Entity
Entity means an object that is a single item that the business logic works with, more specifically those which have an identity of some sort.
Thus, many people refer to ORM-mapped objects as entities.
Some use "entity" for a class whose instances each represent a single row in a database.
Other people prefer to call only those classes "entities" that also contain business rules, validation, and general behaviour, and they call the others "data transfer objects".
Model
A Model is something that is not directly related to the UI (=View) or the control flow (=Controller) of an application, but rather to how data access and the main data abstraction of the application work.
Basically, anything can be a model that fits the above.
MVC
You can use entities as your models in MVC. They mean two different things, but the same classes can be called both.
Examples
A Customer class is very much an entity (usually), and you also use it as part of data access in your app. It is both an entity and a model in this case.
A Repository class may be part of the Model, but it is clearly not an entity.
If there is a class that you use in the middle of your business logic layer but don't expose to the rest of the application, it may be an entity, but it is clearly not a Model from the perspective of the MVC app.
Your example
As for your code examples, I would prefer the first one.
A Model is a class that is used as a means of data abstraction in an application, not a class whose name is suffixed with "Model". Many people consider the latter bloatware.
You can pretty much consider your Repository class as part of your model, even if its name isn't suffixed with "Model".
I would add to that the fact that it is also easier to work with the first one, and for other people who later may have to understand your code, it is easier to understand.
All answers are a heavy mashup of different things and simply wrong.
A model in DDD is much like a model in the real world:
A simplification and abstraction of something.
No less and no more.
It has nothing to do with data nor objects or anything else.
It's simply the concept of a domain part. And in almost every complex domain
there is always more than one model, e.g. Trading, Invoicing, Logistics.
An entity is not a "model with identity" but simply an object with identity.
A repository is not just a 1st level cache but a part of the domain too.
It is giving an illusion of in-memory objects and responsible for fetching
Aggregates (not entities!) from anywhere and saving them
i.e. maintaining the life cycle of objects.
The "model" in your application is the bit which holds your data. The "entity" in domain-driven design is, if I remember correctly, a model with an identity. That is to say, an entity is a model which usually corresponds directly to a "physical" element in a database or file. I believe DDD defines two types of models, one being the entity, the other being the value, which is just a model without an identity.
The Repository pattern is just a type of indexed collection of models/entities. So for instance if your code wants order #13, it will first ask the repository for it, and if it can't get it from there, it will go and fetch it from wherever. It's basically a level 1 cache if you will. There is no difference in how it acts with a model, and how it acts with an entity, but since the idea of a repository is to be able to fetch models using their IDs, in terms of DDD, only entities would be allowed into the repository.
A simple solution using service and collection:
<?php
class MyController {
public function index() {
$postService = ServiceContainer::get('Post');
$postCollection = $postService->findAllByDateRange('within 30 days');
while($postCollection->getNext()) {
echo $postCollection->current()->getAuthor();
}
}
}
EDIT:
The model(class) is the simple representation of the entity scheme. The model(object) is a single entity. The service operates on models and provides concrete data to the controllers. No controller has any model. The models stand alone.
On the other "side", mappers map the models into persistence layers (e.g. databases, 3rd party backends, etc).
While this is specifically about Ruby on Rails, the same principles and information still apply, since the discussion is about MVC and DDD.
http://blog.scottbellware.com/2010/06/no-domain-driven-design-in-rails.html