I'm in the process of designing a system that stores entities and their relations over time.
Each entity has properties, and each property should be versioned: when a property of an entity changes, a new history state gets added. The complexity comes from the fact that I also need to version the relations between the separate entities. For example, when entity A moves from parent X to parent Y, the relation between those entities also gets a new history state.
I'm looking for advice on how to design this on a lower level - are there any design patterns available for this sort of thing, or any other best practices/proven methods?
I'm building this in PHP with a PostgreSQL database, optionally using Doctrine as the ORM/DBAL.
I would recommend looking into the table_log extension (https://github.com/psoo/table_log) for this. It does require compiling, but it has the advantage of allowing you to restore tables to a previous state, audit changes, etc.
One important detail here is that the history changes are stored in other tables, not your main tables, so you can think of it as current data plus external audit trails. If you need to merge them for a query, you could create a view.
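To make the "current data plus external audit trail" idea concrete, here is a minimal sketch. It does not use table_log itself: it assumes a plain trigger, runs on an in-memory SQLite database as a stand-in for PostgreSQL, and uses Python only to drive the SQL. All table, trigger and view names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for PostgreSQL (assumption)
conn.executescript("""
CREATE TABLE entity_parent (          -- main table: current relations only
    entity_id INTEGER PRIMARY KEY,
    parent_id INTEGER NOT NULL
);
CREATE TABLE entity_parent_log (      -- separate table: the audit trail
    log_id    INTEGER PRIMARY KEY AUTOINCREMENT,
    entity_id INTEGER NOT NULL,
    parent_id INTEGER NOT NULL
);
-- Copy the old row into the log whenever a relation changes.
CREATE TRIGGER entity_parent_audit BEFORE UPDATE ON entity_parent
BEGIN
    INSERT INTO entity_parent_log (entity_id, parent_id)
    VALUES (OLD.entity_id, OLD.parent_id);
END;
-- View that merges current data with its history for querying.
CREATE VIEW entity_parent_all AS
    SELECT entity_id, parent_id, 0 AS is_current, log_id AS seq
    FROM entity_parent_log
    UNION ALL
    SELECT entity_id, parent_id, 1 AS is_current, NULL AS seq
    FROM entity_parent;
""")

conn.execute("INSERT INTO entity_parent VALUES (1, 10)")  # A under parent X
conn.execute("UPDATE entity_parent SET parent_id = 20 WHERE entity_id = 1")  # A moves to Y

# Past states first, current state last.
history = conn.execute(
    "SELECT parent_id, is_current FROM entity_parent_all "
    "WHERE entity_id = 1 ORDER BY is_current, seq"
).fetchall()
```

The main table stays small and simple to query, and the history only shows up when you ask the view for it.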
I have a User class, which is also an entity. I need to store information about which user can send a message to which other user. My conceptual solution is:
- to create a database table with two columns besides the unique id (id_sender, id_recipient)
- then to map my User class to that table (maybe using the @JoinTable annotation)
What is good practice for things like that? I know I should use Doctrine (as a higher abstraction layer). Thanks for your time!
Diving into the world of DDD is something of a whirlwind for me. While I've done a lot of research I'm struggling to change my thought process.
So, I have a package and a products entity. A package cannot function without products, and vice versa. The problem comes in when I need to get the products belonging to a package (note: the package is customisable, which means the products belonging to the package could be different the next time round). It seems that this belongs on neither entity; furthermore, putting it on the package or product entity would make them tightly coupled.
I must stress that I'm using the word association because I'm trying to figure this out on a domain level rather than infrastructure.
A little thinking has led me to the following thoughts:
- Make Package an aggregate root and the products siblings of the Package entity. However, I believe that both Package and Product are entities, since each is uniquely identified.
- Create a domain service that would take the package ID and map to the infrastructure layer to find the associated products. However, this seems rather long-winded for a single lookup.
It would be great if someone could turn me sane again! Thanks in advance!
Ubiquitous Language
DDD is all about language - the key here is to listen to your domain experts talk about products and packages - how do they think about them? And about the processes involved in working with them?
When they create a package, do they really think to themselves "I must define products with this package", or do they think "I'll set up a package, and then link a few products to it"? If the latter, then although it may at first blush feel like a package can't exist without products, notice the subtle implication in the timing: the package can exist without products, because they expect to link products to it as a second step.
Given that, you might model the link as a completely independent aggregate with a single entity as the root, e.g. PackageProduct (or better yet, some term your domain experts use to define the association) and simply create new instances of this aggregate when a product is linked to a package. This entity would have a PackageId and a ProductId on it.
However, if there are business rules, e.g. a Package can only have one product of a given type, or at most 5 products, then make the PackageProduct entity an entity within the Package aggregate, which has Package as the aggregate root. The PackageProduct would have a reference to the Package and a property of the ProductId. See below for some terminology clarification.
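As a rough, language-agnostic illustration of the second option (written in Python here, but the shape carries over to PHP), the sketch below puts PackageProduct inside the Package aggregate so the root can enforce the two example rules. The method name `link_product` and the `product_type` field are assumptions made for the example, not something from your domain.

```python
from dataclasses import dataclass, field

MAX_PRODUCTS = 5  # the "at most 5 products" example rule


@dataclass(frozen=True)
class PackageProduct:
    """Link entity inside the aggregate: refers to the product by ID only."""
    product_id: int
    product_type: str


@dataclass
class Package:
    """Aggregate root: every change to the links goes through it."""
    package_id: int
    links: list = field(default_factory=list)

    def link_product(self, product_id: int, product_type: str) -> None:
        # Invariants are checked before the state is allowed to change.
        if len(self.links) >= MAX_PRODUCTS:
            raise ValueError("a package can hold at most 5 products")
        if any(link.product_type == product_type for link in self.links):
            raise ValueError("only one product of a given type is allowed")
        self.links.append(PackageProduct(product_id, product_type))


package = Package(package_id=1)
package.link_product(100, "router")

try:
    package.link_product(101, "router")  # second product of the same type
    rejected = False
except ValueError:
    rejected = True
```

Because nothing outside the aggregate appends to `links` directly, the rules cannot be bypassed by accident. If there were no such rules, the independent PackageProduct aggregate from the previous paragraph would be the simpler choice.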
Entities vs Aggregates
Based on your question, it seems there might be some confusion about terminology. In DDD, we have:
Entities:
- Have an identity that outlasts any given property
- Have mutable state
- Generally, modelling business processes is all about working out how to mutate the state of entities
Aggregate:
- A group of entities over which invariants must be enforced
- Invariants are business rules that MUST hold at all times
- A single entity is always nominated as the 'aggregate root'
- Other entities can only refer to the aggregate by the root
- When modelling a business process in order to mutate state, the aggregate is the boundary within which the states of all entities must be consistent with the invariants
- Outside the aggregate boundary, other entities may get updated asynchronously, aiming for eventual consistency
Domain Service:
- A service that contains business logic that doesn't belong in a single entity
- It is generally not just a wrapper around a piece of infrastructure
See https://lostechies.com/jimmybogard/2008/05/21/entities-value-objects-aggregates-and-roots/ for more info.
Read vs Write Operations
The problem comes in when I'm needing to get the products belonging to a package
This sounds a lot like it might be in order to support a UI or a report? In which case, don't stress about the entity model. The entity model is there to ensure the business rules hold while users are trying to modify the state of the system. When doing a read operation, there is no need to modify the state, so you can bypass the model. Define a query that projects your data store onto a DTO and tailor the projection to suit the needs of the UI or report.
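A minimal sketch of such a read-side query, bypassing the entity model entirely. It assumes a hypothetical schema (`product`, `package_product`) and uses Python with an in-memory SQLite database purely for illustration; in your stack the same query would go through DBAL or plain SQL.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageProductView:
    """DTO shaped for the screen or report, not for enforcing rules."""
    product_id: int
    product_name: str


def products_for_package(conn, package_id):
    """Project the data store straight onto DTOs; no entities involved."""
    rows = conn.execute(
        "SELECT p.id, p.name FROM product p "
        "JOIN package_product pp ON pp.product_id = p.id "
        "WHERE pp.package_id = ? ORDER BY p.name",
        (package_id,),
    )
    return [PackageProductView(*row) for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE package_product (package_id INTEGER, product_id INTEGER);
INSERT INTO product VALUES (1, 'Router'), (2, 'Cable');
INSERT INTO package_product VALUES (7, 1), (7, 2), (8, 1);
""")

views = products_for_package(conn, 7)
```

The DTO can grow or shrink with the UI without touching the Package aggregate at all.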
The problem comes in when I'm needing to get the products belonging to a package (Note: the package is customisable, this means the products belonging to the package could be different the next time round).
This sounds like a query. It often helps to separate the modelling of commands (things that can alter your domain model) and queries.
I have been using CakePHP 2.x and have just started migrating to CakePHP 3.x. When I tried using the new ORM, I was baffled by basic concepts like repositories and table objects. What is the difference between repositories and table objects?
A repository can be anything while a table, as the name states, is just a table.
http://api.cakephp.org/3.0/class-Cake.ORM.Table.html
Represents a single database table.
Exposes methods for retrieving data out of it, and manages the associations this table has to other tables. Multiple instances of this class can be created for the same database table with different aliases, this allows you to address your database structure in a richer and more expressive way.
http://api.cakephp.org/3.0/class-Cake.Datasource.RepositoryInterface.html
Describes the methods that any class representing a data storage should comply with.
A data storage can be any kind of storage system, even one that doesn't know tables like a graph DB or document based system.
It is always simple to just check the API documentation and code for this kind of question. The code is pretty well documented, and it makes the way this works obvious:
class Table implements RepositoryInterface, EventListenerInterface
Table implements the interface defined by RepositoryInterface.
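The same relationship, sketched outside CakePHP as a language-agnostic illustration in Python. The `get`/`save` contract below is a simplified assumption for the example and is not the actual method list of CakePHP's RepositoryInterface; the point is only that a table is one kind of repository among others.

```python
import json
from abc import ABC, abstractmethod


class Repository(ABC):
    """What any data storage must offer (hypothetical, simplified contract)."""

    @abstractmethod
    def get(self, key): ...

    @abstractmethod
    def save(self, key, record): ...


class Table(Repository):
    """One kind of repository: rows of a single table (in memory here)."""

    def __init__(self, name):
        self.name = name
        self._rows = {}

    def get(self, key):
        return dict(self._rows[key])

    def save(self, key, record):
        self._rows[key] = dict(record)


class DocumentStore(Repository):
    """Another kind: a storage that has no notion of tables at all."""

    def __init__(self):
        self._docs = {}

    def get(self, key):
        return json.loads(self._docs[key])

    def save(self, key, record):
        self._docs[key] = json.dumps(record)


def rename(repo: Repository, key, new_name):
    """Calling code depends only on the interface, not on the storage."""
    record = repo.get(key)
    record["name"] = new_name
    repo.save(key, record)


results = []
for repo in (Table("users"), DocumentStore()):
    repo.save(1, {"name": "old"})
    rename(repo, 1, "new")
    results.append(repo.get(1)["name"])
```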
I am currently working on a huge refactoring project. We have taken over a classic PHP/MySQL project, where most code is procedural, duplicated, and there is very little hint of an architecture.
I am planning on using Doctrine to handle our Data Access, and have all of my tables mapped to entities. However, our MySQL tables are largely messed up.
The table I am currently working with has over 40 columns, and is not normalized by any means. A quick example of what we have:
Brand
id
name
poNumber
orderConfirmationEmail <---- these should go into a BrandConfirmations entity
shippingConfirmationEmail <-----
bill_address <---- these should go into a BrandAddress entity
bill_address2 <-----
city <------
.
.
.
Ideally, what I would like to have is for Doctrine to pull out the fields that reference different Entities, and actually put them into those Entities. So for instance id, name, and poNumber would get pulled out into a Brand entity. orderConfirmationEmail and shippingConfirmationEmail would get pulled out into a BrandNotification entity. Next, bill_address, and the rest of the address fields would get pulled out into a BrandBillAddress entity. Is there a way to configure Doctrine to split the table into these models for me, or do I have to custom write code myself that would do that?
If I do have to write the code to split this table myself, do you have any resources or advice that tackle a similar issue? I haven't been able to find many yet.
The latest version of Doctrine 2 supports what they call embeddables: http://doctrine-orm.readthedocs.org/en/latest/tutorials/embeddables.html. It may solve some of your problems. However, it requires D2.5+. Currently, S2 uses Doctrine 2.4. You could experiment with using the very latest doctrine.
What you can do is make your domain models (entities) act as though you had value objects. So $brand->getOrderConfirmation() would actually return an order confirmation object. You have to do some messing around to keep everything mapped to one table, and you might be limited on some of your queries, but it's not that hard. The advantage is that the rest of your new application deals with properly normalized objects. It's only the internal persistence code that needs to get messy.
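A minimal sketch of that idea, independent of Doctrine and written in Python for illustration only. The entity keeps the flat row internally and hands out a small value object built from the two confirmation columns named in the question; the class name `ConfirmationEmails` and the method names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfirmationEmails:
    """Value object assembled from two columns of the wide Brand table."""
    order: str
    shipping: str


class Brand:
    def __init__(self, row):
        self._row = dict(row)  # one flat row, exactly as stored

    def get_confirmation_emails(self):
        return ConfirmationEmails(
            order=self._row["orderConfirmationEmail"],
            shipping=self._row["shippingConfirmationEmail"],
        )

    def set_confirmation_emails(self, emails):
        self._row["orderConfirmationEmail"] = emails.order
        self._row["shippingConfirmationEmail"] = emails.shipping

    def to_row(self):
        """The messy part stays in here: flatten back to table columns."""
        return dict(self._row)


brand = Brand({
    "id": 1,
    "name": "Acme",
    "orderConfirmationEmail": "orders@example.com",
    "shippingConfirmationEmail": "shipping@example.com",
})
brand.set_confirmation_emails(
    ConfirmationEmails(order="new-orders@example.com", shipping="shipping@example.com")
)
```

Callers only ever see `ConfirmationEmails`; the 40-column layout is visible in `to_row()` and nowhere else.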
There are quite a few links on this approach. Here is one: http://russellscottwalker.blogspot.com/2013/11/entities-vs-value-objects-and-doctrine-2.html
Your best bet of course is to refactor your database schema. I like to do kind of a raw dump of the original database into a yaml file with the desired object nesting. I then load the yaml file into the new schema. If you are really lucky then you might even be able to create new views for your existing application which will allow it to keep working in parallel with your new application.
I started working with the Yii Framework some time ago, and I have seen some things that keep me up at night. Here are my doubts about how Yii users use Active Record.
I have seen many people add the application's business rules directly to the Active Record classes, the very ones generated by Gii. I deeply believe that this is a misinterpretation of what Active Record is, and a violation of SRP.
Early on, SRP is easier to apply. ActiveRecord classes handle persistence, associations and not much else. But bit-by-bit, they grow. Objects that are inherently responsible for persistence become the de facto owner of all business logic as well. And a year or two later you have a User class with over 500 lines of code, and hundreds of methods in its public interface. Callback hell ensues.
When I talked about this with some people, my view was criticized. But when I asked:
And when you need to regenerate your Active Record full of business rules through Gii what do you do? Rewrite? Copy and Paste? That's great, congratulations!
I got no answer, only silence.
So, what I am currently doing to reach a slightly better architecture is generating the Active Records in an /ar folder and adding the Domain Model inside the /models folder.
The Domain Model is what owns the business rules, and the Domain Model uses the Active Records, which form the Data Model, to persist and retrieve data.
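Roughly, the split I mean looks like the sketch below. It is written in Python only to keep it short and framework-free; `UserRecord` stands in for a Gii-generated Active Record, and the "already in that group" rule is a made-up business rule for illustration.

```python
class UserRecord:
    """Stand-in for a generated Active Record class (hypothetical)."""
    storage = {}  # pretend database table, keyed by user id

    def __init__(self, user_id, group_id):
        self.user_id = user_id
        self.group_id = group_id

    def save(self):
        UserRecord.storage[self.user_id] = self.group_id


class User:
    """Domain Model: owns the rules, delegates persistence to the record."""

    def __init__(self, record):
        self._record = record

    def move_to_group(self, group_id):
        # Hypothetical business rule, for illustration only.
        if group_id == self._record.group_id:
            raise ValueError("user is already in that group")
        self._record.group_id = group_id
        self._record.save()


user = User(UserRecord(user_id=1, group_id=10))
user.move_to_group(20)
```

Regenerating `UserRecord` would not touch `User`, which is the whole point of keeping them in separate folders.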
What do you think of this approach?
If I'm wrong somewhere, please tell me why before criticizing harshly.
Some of the comments on this article are quite helpful:
http://blog.codeclimate.com/blog/2012/10/17/7-ways-to-decompose-fat-activerecord-models/
In particular, the idea that your models should grow out of a strictly 'fat model' setup as you need more seems quite wise.
Are you having issues now or mainly trying to plan ahead? This may be hard to plan ahead for and may just need refactoring as you go ...
Edit:
Regarding moveUserToGroup (in your comment below), I could see how having that might bother you. Found this as I was thinking about your question: https://gist.github.com/justinko/2838490 An equivalent setup that you might use for your moveUserToGroup would be a CFormModel subclass. It'll give you the ability to do validations, etc, but could then be more specific to what you're trying to handle (and use multiple AR objects to achieve your objectives instead of just one).
I often use CFormModel to handle forms that have multiple AR objects or forms where I want to do other things.
Sounds like that may be what you're after. More details available here:
http://www.yiiframework.com/doc/guide/1.1/en/form.overview
The definition of Active Record, according to Martin Fowler:
An object carries both data and behavior. Much of this data is persistent and needs to be stored in a database. Active Record uses the most obvious approach, putting data access logic in the domain object. This way all people know how to read and write their data to and from the database.
When you segregate data and behavior you no longer have an Active Record. Two common related patterns are Data Mapper and Table/Row Gateway (this one more related to RDBMS's).
Again, Fowler says:
The Data Mapper is a layer of software that separates the in-memory objects from the database. Its responsibility is to transfer data between the two and also to isolate them from each other. With Data Mapper the in-memory objects needn't know even that there's a database present; they need no SQL interface code, and certainly no knowledge of the database schema.
And again:
A Table Data Gateway holds all the SQL for accessing a single table or view: selects, inserts, updates, and deletes. Other code calls its methods for all interaction with the database.
A Row Data Gateway gives you objects that look exactly like the record in your record structure but can be accessed with the regular mechanisms of your programming language. All details of data source access are hidden behind this interface.
A Data Mapper is usually storage independent: the mapper retrieves data from the storage and creates mapped objects (plain old objects). The mapped object knows absolutely nothing about being stored somewhere else.
As I said, TDG/RDG are more closely tied to a relational table. A TDG object represents the structure of the table and implements all common operations. An RDG object contains the data of one single row of the table. Unlike the mapped object of a Data Mapper, the RDG object is aware that it is part of a whole, because it references its container TDG.
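The difference is easier to see side by side. Below is a small sketch under stated assumptions: Python with an in-memory SQLite table named `person`, hypothetical class names, and the RDG holding a reference to its TDG as described above (one common arrangement, not the only one).

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Person:
    """Mapped object: a plain object that knows nothing about storage."""
    id: int
    name: str


class PersonMapper:
    """Data Mapper: moves data between Person objects and the table."""

    def __init__(self, conn):
        self.conn = conn

    def find(self, person_id):
        row = self.conn.execute(
            "SELECT id, name FROM person WHERE id = ?", (person_id,)
        ).fetchone()
        return Person(*row) if row else None

    def update(self, person):
        self.conn.execute(
            "UPDATE person SET name = ? WHERE id = ?", (person.name, person.id)
        )


class PersonTable:
    """Table Data Gateway: holds all the SQL for the person table."""

    def __init__(self, conn):
        self.conn = conn

    def find(self, person_id):
        row = self.conn.execute(
            "SELECT id, name FROM person WHERE id = ?", (person_id,)
        ).fetchone()
        return PersonRow(self, *row) if row else None

    def update(self, person_id, name):
        self.conn.execute(
            "UPDATE person SET name = ? WHERE id = ?", (name, person_id)
        )


class PersonRow:
    """Row Data Gateway: one row, aware of the gateway it came from."""

    def __init__(self, table, id, name):
        self.table, self.id, self.name = table, id, name

    def save(self):
        self.table.update(self.id, self.name)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO person VALUES (1, 'Ann')")

person = PersonMapper(conn).find(1)  # plain object, no link to storage
person.name = "Anna"
PersonMapper(conn).update(person)    # the mapper does the writing

row = PersonTable(conn).find(1)      # row object tied to its table gateway
row.name = "Annie"
row.save()                           # the row saves itself via the gateway

final_name = conn.execute("SELECT name FROM person WHERE id = 1").fetchone()[0]
```

Note that `Person` could be unit-tested without any database at all, while `PersonRow` cannot even be constructed without a gateway.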