PHP design patterns Factories, Repositories and...?

PHP design patterns Factories, Repositories and...? - php

We have factories for creating complex objects. (For makin' stuff)
We have repositories for find them. (For findin' stuff)
What do we have for updating them? (For changin' stuff up)
Seems like a missing piece in the puzzle? I don't think it belongs in repositories, as that breaks single responsibility...

Updating an Entity (and so, the database) belongs to the Repository. The Repository itself is an Layer between the Database itself and the program.
Every Database Operation therefore belongs to the Repository. Also, a Repository mustn't communicate with a database, it also could have an XML, CSV or API as data source. But that doesn't matter, because you're communicating with the Repository. The Repository deals with all that's coming afterwards.
You could just change the Repository with another one and your program would work without any problems, because the repositories all implement the same interface. You don't like that MySQL Database anymore, that old fashioned CSV is much better? Just replace the used Repository and you're done.
Finding an entry with an repository isn't more than a SELECT statement, so why won't you UPDATE or DELETE with it?
Further reading on the MSDN
Found a great explanation and example on web.archive.org

I think it depends on approach.
With DDD for example what You are saying is true. Repository should be responsible for adding, finding and removing, because it works on a collection, but there is a question why it should be able to update a single object.
What can be done? I think I will only copy what other person said, so I wil just post link to answer: approach to removing save/update from repository

Related

A data structure from Repository to serve all purposes?

I need help with something I can’t get my head wrapped around regarding the Repository and Service/Use-case pattern (part of DDD design) I want to implement in my next (Laravel PHP) project.
All seems clear. Just one part of DDD that is confusing is the data structures from repositories. People seem to choose data structures the Repository should return (arrays or entities) but it all has disadvantages. One of which is performance looking at my experiences in the past. And one is which you don’t have interfaces for simple data structures (array or simple object attributes).
I’ll start with explaining the experience I have with a previous project. This project had flaws but some good strengths I learned from and like to see in my new project but with solving some design mistakes.
Previous experience
In the past I’ve build a website that was API Centric using the Kohana framework and Doctrine 2 ORM (data mapper pattern). The flow looked like this:
Website controller → API client (HMVC calls) → API controller → Custom Repository → Doctrine 2 ORM native Repository/Entity-manager
My custom Repository returned plain arrays using Doctrine2 DQL. Doctrine2 recommends array result data for read only operations. And yes, it made my site nice and light. The API controller just converted the array data to JSON. Simple as that.
In the past my company created projects relying fully on loaded Doctrine2 entities and it’s something we regretted due to performance.
My REST API supported queries like
/api/users?include_latest_adverts=2&include_location=true
on the users resource. The API controller passed include_location to the repository which directly included the location relation. The controller read latest_adverts=2 and called the adverts repository to get the latest 2 adverts of each user. Arrays were returned.
For example first user array:
[
name
avatar
adverts [
advert 1 [
name
price
]
advert 2 [
….
]
]
]
This proved to be very successful. My whole website was using the API. It would be very easy to to add a new client because the API was perfectly in production already using oauth. The whole website runs on it.
But this design had flaws too. My controller still contained A LOT of logic for validation, mailing, params or filters like has_adverts=true to get users with adverts only. It would mean that if I created a new port, like a total new CLI interface, I would have to duplicate alot of these controllers due to all the validation etc. But no duplication if I would create a new client. So at least one problem was solved :-)
My admin panels were completely coupled to the doctrine2 repository/entity-manager to speed up development (sort of). Why? Because my API had fat controllers with special functionality for the website only (special validation, mailing for registering etc). I would have to redo work or refactor a lot. So decided to use the Entities directly to still have some sort clear way of writing code instead of rewriting all my API controllers and move them to Services (for site & admin) for instance. Time was an issue in fixing my design mistakes.
For my next project I want all code to go through my own custom repositories and services. One flow for good separation.
New project (using DDD ideas) and dilemma with data structures
While I like the idea of being API centric, I don’t want my next project to be API centric in core because I think the same functionality should be available without the HTTP protocol in between. I want to design the core using DDD ideas.
But I liked the idea using a layer that just talked as a API and returns simple arrays. The perfect base for any new port, including my own frontend. My idea is to consider my Service classes as the API interface (return the array data), do the validation etc. I could have Services specially for the website (registering) and plain services used by the Admin or background processes. In some admin cases a Service would not be required anyway for simple CRUD editing, I could just use Repositories directly. Controllers would be very thin. With this creating a real REST API would just be a matter to create new controllers using the same Services my frontend controller classes do.
For internal logic like business rules it would be useful to have Entities (clear interfaces) instead of arrays from repositories. This way I could benefit from defining some methods that did some logic based on attributes. BUT If I would be using Doctrine2 and my repositories would always return Entities my application would suffer a big performance hit!!
One data structure ensures performance but no clear interfaces, the other ensures clear interfaces but bad performance when using a Data Pattern pattern like Doctrine 2 (now or in the future). Also I could end up with two data types which would be confusing.
I was thinking something similar to this flow:
Controller (thin) → UserService (incl. validation) → UserRepository (just storage) → Eloquent ORM
Why Eloquent instead of Doctrine2? Because I want to stick a bit to what’s common within the Laravel framework and community. So I could benefit from third party modules, for example to generate admin interfaces or similar based on models (bypassing my DDD rules). Other than using third party modules, I would design my core stuff so switching should always be easy and not affect data structure choices or performance.
Eloquent is an activerecord pattern. So I would be tempted to convert this data to POPO’s like Doctrine2 entities are. But nope... as said above, with doctrine2 real models would make the system very fat. So I fall back to simple arrays again. Knowing this would work for both and any other implementation in the future.
But it feels bad always rely on arrays. Especially when creating internal business rules. A developer would have to guess values on arrays, have no autocompletion in his IDE, could not have special methods like in Entity classes. But making two ways of dealing with data feels bad too. Or I am just too perfectionist ;) I want ONE clear data structure for all!
Building interfaces and POPO’s would mean a lot of duplicate work. I would need to convert an Eloquent model (just a table mapper, not entity) to an entity object implementing this interface. All is extra work. And eventually my last layer would be just like a API, thus converting it to arrays again. Which is extra work too. Arrays seem the deal again.
It seemed so easy reading up into DDD and Hexagonal. It seems so logic! But in reality I struggle with this one simple issue trying to stick to OOP principles. I want to use arrays because it’s the only way to be 100% sure I am not depended on any model choice and querying choice from my ORM regarding performance etc and don't have duplicate work in converting to arrays for views or an API. But there's no clear contract on how a user array could look. I want to speed up my project using these patterns, not slow them down :-) So not an option to have many converters.
Now I read a lot of topics. One makes POPO’s & interfaces that conform proper entities like Doctrine2 could return, but with all the extra work for Eloquent. Switching to Doctrine2 should be fairly easy, but would impact performance so bad or one would need to convert Doctrine2 array data to these own entity interfaces. Others choose to return simple arrays.
One convinces people to use Doctrine2 instead of Eloquent, but they leave out the fact that Doctrine2 is heavy and you really need to use array results for read only operations.
We design repositories to be changeable right? Not because it’s “nice” by design only. So how could we rely on full Entities if it has such big impact on performance or duplicate work? Even when using Doctrine2 only (coupled) this same issue would arise due to its performance!
All ORM implementations would be able to return arrays, thus no duplicate work there. Good performance. But we miss clear contracts. And we don’t have interfaces for arrays or class attributes (as a workaround)... Ugh ;)
Do I just miss a missing block in our programming languages? Interfaces on simple data structures??
Is it wise to make all arrays and have advanced business logic talk to these arrays? Thus no classes with clear interfaces. Any precalculated data (normally would be returned by an Entity method) would be within an array key defined the Service class. if not wise, what’s the alternative considering all of the above?
I would really appreciate if someone with great experience in this “domain” considering performance, different ORM implementations, etc could tell me how he/she has dealt with this?
Thanks in advance!

I think what you are dealing with is something similiar I'm struggling with. The solution I'm thinking works best is:
Entities/Repositories
Use and pass around Entities always when performing Write operations (Creating things, Updating things, Deleting things, and complex combinations thereof).
Sometimes you may use Entities when doing Read operations (when you anticipate the Read might need to be used for a Write soon after...ie. ->findById is soon followed by ->save).
Anytime you are working with an Entity (whether it be Write or Read), the Repositories need to be the place to go. You should be able to tell new developers that they can only persist to the database through Entities and the Repository.
The Entities will have properties that represent some Domain Object (many times they represent a database table with the fields of a table, but not always). They will also contain the domain logic/rules with them (ie. validation, calculations) so they are not anemic. You may additionally have some domain services if your Entities need help interacting with other Entities (need to trigger other events), or you just need an additional place to handle some extra domain logic (perform Repository calls to check for some unique conditions).
Your Repositories will solely be for working with Entities. The Repositories could accept Entities and do some persistence work with them. Or they could accept just some parameters, and do some reading/fetching into full Entities.
Some Repositories will know how to save some Domain Objects that are more complex than others. Perhaps an Entity that has a property which contains a list of other Entities that need to be saved along side the main entity (you can dive deeper into learning about Aggregate roots if you want).
The interfaces to Repositories rest in your Domain layer, but not the actual implementations of those Repositories. That way you can have an Eloquent version or whatever.
Other Queries (Table Data Gateway)
These queries won't work with Entities. They'll just be accepting parameters and returning things like Arrays or POPO's (Plain Old PHP Objects).
Many times you will need to perform Reads that do not return nicely into a single Entity. These Reads are typically more for reporting (not for CRUD-like operations, like Reading a user into an edit form that is eventually submitted and saved). For example, you might have a report that is 200 rows of JOINed data. If you used the Repositiory and tried to return large deep objects (with all the relationships populated, or even lazy-loaded) then you are going to have performance issues. Instead, use the Table Data Gatway pattern. You are just displaying data and not really needing OOP power here. The outputted data could however contain ID's, which through the UI could be used to initiate calls to Repository persistence methods.
As you are developing your app, when you come across the need for a new Read/Report query, create a new method in some class somewhere in your Table Data Gatway folder. You may find you have already created a similar query, so see how you can consolidate the other query. Use some parameters if necessary to make the gateway method's queries more flexible in particular ways (ie. columns to select, sort order, pagination, etc.). Don't make your queries too flexible though, this is where query builders/ORMs go wrong! You need to constrain your queries to a certain extent to where if you need to replace them (perhaps a different database engine) then you can easily perceive what the allowed variations are and aren't. It's up to you to find the right balance between flexibility (so you have more DRY code) and constraints (so you can optimize/replace queries later).
You can create services in your Domain to handle receiving parameters, then passing them to the Table Data Gateway, and then receiving back arrays to do some more mutating on. This will keep your Domain logic in the domain (and out of the infrastructure/persistence layer of the Repository & Table Data Gateway).
Again, just like the Repository, use interfaces in your domain services so that the implementation details stay out of your Domain layer, and resides in the actual Table Data Gateway folder.

Laravel Repositories

What are the advantages of Repositories in Laravel? It seems to be abstracting the Model layer from the business logic of the application. Although it really just seems to make the whole request life cycle just that much more complicated for little gain.
Can someone shed light on the advantage of Laravel repositories?
Edit
After now using repositories for some time I would add the following:
Repositories enforce single responsibility
Repositories should only return one collection of entities
Although separate from dependancy injection the concepts are brothers
Storage abstraction for the actual storage implementation (e.g. MySQL)
Easier testing

Repositories, like in the provided tutorial, aren't necessary a Laravel concept. Rather, they're a form of IoC injection that is possible with Laravel. Any object that might similarly be injected doesn't mean it's a repository. See the video for a good example from Taylor Otwell, which happens to use a "repository" as well: http://vimeo.com/53029232.
In this example, the repository abstracts where the data came from that is passed to the controller. As long as the data passed implements the specified interface, the controller can "blissfully" make use of the interface's defined methods without worry about where the data initially came from. This allows switching the initial source of the data without breaking your controller. You could pull the data from a file, a database, an outside API, a mock object, or just some arbitrary array. Basically, the controller doesn't need to gather the data represented by the repository. It can just receive and use.

In addition to the other answers here, it's worth pointing out that repositories used when used in Laravel can add an extra level of expressiveness. Take for example the following:
$users = User::whereHas("role", function($q) {
$q->where('name', 'moderator');
}, '<', 1)->get();
The code is difficult to read and awkward to look at. It can be encapsulated in a repository method, and demonstrate much clearer code intent:
$users = $userRepository->getUsersWhoAreNotModerators();
This is also achievable using eloquent 'query scopes' but I think using a repository is superior, as is adheres much better to the single responsibility principal, and is doable regardless of whether you are using Eloquent.

Repositories help to keep your controller clean and make reusable code. Functions in repositories can be accessed in one or more controllers, repositories, etc.
Also, all your backend related logic (like fetching data from database or external calls) can be added in repositories to make the logical separation.
One of the main uses of repositories is to create different binding (using interfaces to define your functions and with the help of the app, bind different implementations of the function as per need). For example, two separate repositories (implementing the parent repository/interface) handling database and files for backend data.

The main reason you use repository pattern is to be able to easily change your data source. For example, in my experience the most common change is adding a caching layer, because the repository implements an interface all you need to do is build the new object implementing the same interface with the new methods handling cache and change the binding.

In my experience the repository in Laravel benefits are these:
The most important thing by using repositories is that you can change your ORM any time and for any reason that you prefer. for example, you want to migrate from MySQL eloquent to some other SQLite ORM.
Helps you to keep your controllers clean and readable.
Helps you to reuse your methods in a repository for any other controllers.
You can add BaseRepository to your repositories list, which involves all base methods like all(), get(), findOrFail(), firstOrFail(), paginate(), create(), ... and use them in other repositories.
Using Interfaces for repositories and binding them that also can be used as services in Laravel.

repository pattern vs ORM

What good is a repository pattern when you have an ORM?
Example. Suppose i have the following (fictional) tables:
Table: users
pk_user_id
fk_userrole_id
username
Table: userroles
fk_userrole_id
role
Now with an orm i could simply put this in a model file:
$user = ORM::load('users', $id);
Now $user is already my object, which could easily be lazy loaded:
(would be even nicer if things are automatically singular/pluralized)
foreach ( $user->userroles()->role as $role )
{
echo $role;
}
Now with the Repository pattern i'd had to create a repository for the Users and one for the Roles. The repository also needs all kinds of functions to retrieve data for me and to store it. Plus it needs to work with Entity models. So i have to create all of those too.
To me that looks like alot of stuff do... When i could simply get the data like i described above with an ORM. And i could store it just as easy:
ORM::store($user);
In this case it would not only store the user object to the database, but also any changes i made to the 'Roles' object aswell. So no need for any extra work like you need with the repository pattern...
So my question basically is, why would i want to use a repository pattern with an ORM? I've seen tutorials where to use that pattern (like with Doctrine). But it really just doesn't make any sense to me... Anyone any explanation for its use in combination with an ORM..??

The ORM is an implementation detail of the Repository. The ORM just makes it easy to access the db tables in an OOP friendly way. That's it.
The repository abstract persistence access, whatever storage it is. That is its purpose. The fact that you're using a db or xml files or an ORM doesn't matter. The Repository allows the rest of the application to ignore persistence details. This way, you can easily test the app via mocking or stubbing and you can change storages if it's needed. Today you might use MySql, tomorrow you'll want to use NoSql or Cloud Storage. Do that with an ORM!
Repositories deal with Domain/Business objects (from the app point of view), an ORM handles db objects. A business objects IS NOT a db object, first has behaviour, the second is a glorified DTO, it only holds data.
Edit
You might say that both repository and ORM abstract access to data, however the devil is in the details. The repository abstract the access to all storage concerns, while the ORM abstract access to a specific RDBMS
In a nutshell, Repository and ORM's have DIFFERENT purposes and as I've said above, the ORM is always an implementation detail of the repo.
You can also check this post about more details about the repository pattern.

ORM and repository pattern...depends on setup.
If you use your ORM entities as the domain layer, then please use no repositories.
If you have a separate domain model and you need to map from that model to ORM entities and so perform a save, then repositories are what you need.
More details you find here (but must be logged to linked-in). Also to understand the difference, check out the definition of the repository pattern.
Most people use classes that they call repositories, but aren't repositories at all, just query classes - this is how/where you should place your queries if you decided to go with the #1 option (see answer above). In this case make sure not to expose DbContext or ISession from that query class, nor to expose CUD-methods from there - remember, Query class!
The #2 option is a tough one. If you do a real repository, all the inputs and outputs on the repository interface will contain clear domain classes (and no database related object). It's forbidden to expose ORM mapped classes or ORM architecture related objects from there. There will be a Save method also. These repositories might also contain queries, but unlike query classes, these repos will do more - they will take your domain aggregate (collection and tree of entities) and save them to DB by mapping those classes to ORM classes and perform a save on ORM. This style (#2) does not needs to use ORM, the repository pattern was primarly made for ADO.NET (any kind of data access).
Anyhow these 2 options are the 2 extremes we can do. A lot of people use repositories with ORM, but they just add an extra layer of code without real function, the only real function there is the query class like behaviour.
Also I'd be careful when someone talks about UnitOfWork, especially with ORM. Almost every sample on the internet is a fail in terms of architecture. If you need UoW, why not use TransactionScope (just make sure you got a wrapper which uses other than Serializable transaction by default). In 99,9% you won't need to manage 2 sets of independent changes in data (so 2 sets of OuW), so TransactionScope will be a fine choce in .NET - for PHP i'd look for some open-session-view implementations...

Altering table using the Symfony2 Doctrine2 console/generate feature?

What I'm trying to figure out is how to add new fields to a table, using Symfony2 with Doctrine2.
I used this to initially create the Entity:
php app/console doctrine:generate:entity --entity="MyMainBundle:ImagesTable" --fields="title:string(100) file:string(100)"
And I used this to create/update the tables on the database:
php app/console doctrine:schema:update --force
Now if I wanted to add new fields to the ImagesTable entity, is there an easy way to do it using the console, or do I have to manually edit the entity. I am just using 1 entity as an example right now, but in reality, there are many entities I'd be changing; so, there has to be an easier way to do it.
I've been manually editing them to create relationships, so if there is an easier way to do that as well, that'd be great.
I remember this being a lot easier with Symfony1.4 - all I had to do was create the database/tables using phpMyAdmin, and Symfony was able to generate the models with no issues.
I really hope I'm missing something here, because this won't work if I have to manually edit every entity for every change.

Doctrine generator commands are intended to help the developer to quickly prototype an idea. They generally don't produce production ready code, and the code needs to be checked to see if it contains what you want.
You can still create your model in phpmyadmin and use Doctrine reverse engineering tools, but it also doesn't produce production ready code, only intended to use in prototyping.
Creating database/tables beforehand doesn't really work well with Doctrine2, as the underlying relation between tables may not be the same as the relation between objects of your model. The whole point of ORM is to think in classes and letting Doctrine do the rest of the work for you.
Doctrine is not intended to write your entities for you, it gives you tools to build your data model, which you use to code your model in Php.
If you don't like to code your entities by hand (which is what all developers using doctrine does), you may want to have a look at RedbeanPHP, a zero-config ORM framework for PHP. It creates the database tables, columns, indexes on the fly depending on the data model you use.

Entity Framwework-like ORM NOT for .NET

What I really like about Entity framework is its drag and drop way of making up the whole model layer of your application. You select the tables, it joins them and you're done. If you update the database scheda, right click -> update and you're done again.
This seems to me miles ahead the competiting ORMs, like the mess of XML (n)Hibernate requires or the hard-to-update Django Models.
Without concentrating on the fact that maybe sometimes more control over the mapping process may be good, are there similar one-click (or one-command) solutions for other (mainly open source like python or php) programming languages or frameworks?
Thanks

SQLAlchemy database reflection gets you half way there. You'll still have to declare your classes and relations between them. Actually you could easily autogenerate the classes too, but you'll still need to name the relations somehow so you might as well declare the classes manually.
The code to setup your database would look something like this:
from sqlalchemy import create_engine, MetaData
from sqlalchemy.ext.declarative import declarative_base
metadata = MetaData(create_engine(database_url), reflect=True)
Base = declarative_base(metadata)
class Order(Base):
__table__ = metadata.tables['orders']
class OrderLine(Base):
__table__ = metadata.tables['orderlines']
order = relation(Order, backref='lines')
In production code, you'd probably want to cache the reflected database metadata somehow. Like for instance pickle it to a file:
from cPickle import dump, load
import os
if os.path.exists('metadata.cache'):
metadata = load(open('metadata.cache'))
metadata.bind = create_engine(database_url)
else:
metadata = MetaData(create_engine(database_url), reflect=True)
dump(metadata, open('metadata.cache', 'w'))

I do not like “drag and drop” create of data access code.
At first sight it seems easy, but then you make a change to the database and have to update the data access code. This is where it becomes hard, as you often have to redo what you have done before, or hand edit the code the drag/drop designer created. Often when you make a change to one field mapping with a drag/drop designer, the output file has unrelated lines changes, so you can not use your source code control system to confirm you have make the intended change (and not change anything else).
However having to create/edit xml configuring files is not nice every time you refractor your code or change your database schema you have to update the mapping file. It is also very hard to get started with mapping files and tracking down what looks like simple problem can take ages.
There are two other options:
Use a code generator like CodeSmith that comes with templates for many ORM systems. When (not if) you need to customize the output you can edit the template, but the simple case are taken care of for you. That ways you just rerun the code generator every time you change the database schema and get a repeatable result.
And/or use fluent interface (e.g Fluent NHibernate) to configure your ORM system, this avoids the need to the Xml config file and in most cases you can use naming conventions to link fields to columns etc. This will be harder to start with then a drag/drop designer but will pay of in the long term if you do match refactoring of the code or database.
Another option is to use a model that you generate both your database and code from. The “model” is your source code and is kept under version control. This is called “Model Driven Development” and can be great if you have lots of classes that have simpler patterns, as you only need to create the template for each pattern once.

I have heard iBattis is good. A few companies fall back to iBattis when their programmer teams are not capable of understanding Hibernate (time issue).
Personally, I still like Linq2Sql. Yes, the first time someone needs to delete and redrag over a table seems like too much work, but it really is not. And the time that it doesn't update your class code when you save is really a pain, but you simply control-a your tables and drag them over again. Total remakes are very quick and painless. The classes it creates are extremely simple. You can even create multiple table entities if you like with SPs for CRUD.
Linking SPs to CRUD is similar to EF: You simply setup your SP with the same parameters as your table, then drag it over your table, and poof, it matches the data types.
A lot of people go out of their way to take IQueryable away from the repository, but you can limit what you link in linq2Sql, so IQueryable is not too bad.
Come to think of it, I wonder if there is a way to restrict the relations (and foreign keys).

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.