How to handle Domain Entity validation before it's persisted?

How to handle Domain Entity validation before it's persisted? - php

An Entity (let's say a UserEntity) has rigid rules for it's properties, and it can exist in 2 states - persisted (which means it has an id) and pre-persisted (which means it does not have an id yet).
According to the answer to this question about how to handle required properties, a "real" UserEntity should only ever be created with an id passed in to its constructor.
However, when I need to create a new UserEntity from information sent by the browser, I need to be able to validate the information before persisting into the db.
In the past, I would simply create a blank UserEntity (without an id), set the new properties, and the validate it - but, in this new, more secure way of thinking about Entities, I shouldn't ever create a new UserEntity without its id.
I don't want to create TWO places that know how to validate my UserEntity's properties, because if they ever change (and they will) it would be double the code to update and double the chances for bugs.
How do I efficiently centralize the validation knowledge of my entity's properties?
Note
One idea I had is reflected in this question, in which I consider storing the non-state properties like email, password and name in a standardized value object that would know about the rules for its properties that different services, like the Controller, Validator, and Repo, or Mapper could use.

that's what factories are for. to the factory method you pass only the data that is required to enforce the real invariants of UserEntity (take some time to figure out what are your real invariants of UserEntity and you'd better do it with your domain experts).
in the factory method you create a new Id and pass it to the UserEntity constructor.
In this stage i don't think it is that bad to discard the instance if the validation inside the constructor fails. in the worst case - you've lost an id... it's not a case that suppose to happen quite often - most of the time the data should be validated in the client.
Of course another option is that in the factory method you first validate the parameters and only then create a new Id and pass it to the UserEntity constructor.
itzik saban

I think you have a couple of options to consider:
(1) Consider your first comment:
An Entity (let's say a UserEntity) has rigid rules for it's
properties, and it can exist in 2 states - persisted (which means it
has an id) and pre-persisted (which means it does not have an id yet).
Here, you are already mention that validation actually depends on whether the entity has been persisted. In other words, if the entity hasn't been persisted, then it should be valid without the ID. If you continue with this domain specification, I feel the validation should act accordingly (e.g. return isValid even without an ID if the object hasn't been persisted)
(2) If you assume "valid" means the object has an ID, then you would need to generate the ID upon creation. Depending on how your IDs are generated, this could get tricky (e.g. save to database and return created ID, or generate unique identifiers somehow, or ...)
Using either approach, its probably worth implementing common base class(es) for your Entity (e.g. with ID) to help minimize duplicating validation across the different states. Hopefully, this shields the derived entities from the common validation as well.

In my opinion , save() , and load() methods should be doing both validation and setting ID attribute . And by the way an entity without Identity attribute is not a entity at all .
In my view Identity attribute should be validated and ensured when entity is in transit e.g
loading from db , loading from file or (after) saving to db such that
if loading from db fails discard the entity saving to db/file fails discard the entity .
Since validation is business log /behavior etc and a better pattern for that would be
Strategy Pattern (http://en.wikipedia.org/wiki/Strategy_pattern)

The topic of how to do validation correctly is somewhat of a grey area.
Validation is typically cast as Invariant and Contextual validation. Invariant validation pertains to those things that, according to your problem domain, have to be present in order for your model to function properly in its intended role. Contextual validation pertains to state that's valid within a given usage context (e.g. A Contact used for emailing needs an email address, but doesn't need phone number; a Contact used for catalog mailings needs a mailing address, but doesn't need an email, etc.).
If you want to be architecturally pure, then technically the concerns of input validation (what your customers are typing into a user interface) and the state of a given entity are two different concerns. Ideally, your domain should have no knowledge of the particular type of application it's written for and therefore shouldn't be burdened with providing error messages suitable for use, either directly or indirectly, in displaying error messages back to the user. This presents a bit of an issue, since it can lead to duplicate or triplicate error checking (client side, service side, domain-level), so many opt for a more pragmatic approach of dealing with most validation external to the entity (e.g. validating an input model prior to entity creation).

I don't see the problem with persisting invalid data. What is valid or not is a business concern and can sometimes depends on the situation. The database doesn't care about these business rules.
If I have to fill out a big form online and the very last step requires me to enter my credit card information and I don't have my card ready, I'll have to discard all that information and the next time enter it all over again (which won't happen because I rather go somewhere else). I would like that application to store the information I already gave and later on I can make it functionally valid. As long as it isn't valid, I can't order things online.

Related

Request validation imposes validation code duplication in Symfony

Say I have a REST API method creating a user. I also have a user entity with configured validation constraints. The question is how to validate the data from the request. My problems are:
I can't populate a user instance without validating the data in the request beforehand - some data might be missing in it, other might be invalid. For example null passed to a user entity's setter with string type-hinting.
I'm not so keen to validate the request data separately, before populating a user instance, because it would be a duplication of the validation constraints configured for the user entity. It would be a problem to manage the same or similar validation constrains in two places - the controller and the entity validation config.
So basically I want to avoid the duplication of the validation constraints in the code and the config, but at the same time I'm forced to duplicate it before populating the entity. How can I get over this?

It is quite physiologic.
What I would suggest is to use a DTO where no restrictions are checked (basically where you can accept "all kind of data" in your setters or even by having public properties that is less cumbersome) and to have a validation on it.
When the DTO is valid, create the underlying object in a valid state (Value Object?)
Of course you need to "duplicate" some constraints but I would not consider this as a real duplication because, actually, DTO and underlying object are not the same object even if them seems to be related. If you disagree - and it could be the case - just stop and think about the boost you'll have by decoupling entity (that should be always in a valid state) from the model where user input data are taken.

Where to check for mandatory properties in a domain object?

I have a PHP MVC application, and my 'M' has a service layer, mapper layer and domain layer. Where and how should I check to ensure that an object has all it's required properties?
It seems to me that these responsibilities don't belong in the mapper or service layer, so that leaves the domain layer itself. I put a method, checkRequired(), in my base domain class. It checks the object's properties against an array of $_required properties, and throws an error if any are missing.
When retrieving objects from the database, I have been calling checkRequired() as the last command in the object's constructor. If the object is a new entity (i.e. not retrieved from the database), I supply some default values (in the constructor) and then call checkRequired().
While this has been working OK, I now come to put some behavioural methods on my (somewhat anaemic) domain objects, and I run in to trouble. For example, a User can own many Pets, so on my User model I put an addPet() method. I know that I need to pass the Pet object in, since it's best to inject dependencies, and my real method signature is therefore User::addPet(ConcretePet).
But that's the problem! My Pets can't exist without a User (as their Owner), so I can't instantiate ConcretePet! I could make the User optional on the Pet, but this would be a backward step. I could move the contents of checkRequired() somewhere else, but where?
What's the typical way to solve such a common problem?

That checkRequired is not a DDD way. An enntity should be valid all the time. In other words - you should not have a way to put it in an invalid state (like missing properties). How?
When you just have public properties that you set, that's anemic. Properties that you persist in DB should be private. The only way to set them is through methods with a business meaning. These methods should check all the invariants (not just required fields - all kind of constraints that would make an entity valid or not) and prevent the update if some of them are not met.
About User->Pet topic: If the Pet can't exist without the User then probably the User is an Aggregate Root that is responsible for protecting invariants related to User and Pets. That means there should be a method addPet... well... maybe something more meaningful? adoptPet and breedPet (they might have slightly different rules and input)? And this adoptPet should ensure the invariant of the pet having a User... By the way - why User and not an Owner?
But Pet also can be an Aggregate Root. That means it's constructor should require a User parameter.
Note that it depends from the use case what is the aggregate root. In some cases Pet is treated as aggregate root but in case of pet adoption it's a part of User aggregate.

Make a undirectional relationship as possible as you can. If a pet could be tracked alone (I mean without a user), consider it as an aggregate root.
Place a User attribute in Pet but no pets attribute in User. Therefore you don't need to have a addPet() method in User.
If you want to find all pets belonging to a user, use a query instead:
public class PetRepository {
public List<Pet> findByOwner(String uid) {
//omitted codes
}
}
Hope this helps.

When retrieving objects from the database, I have been calling
checkRequired() as the last command in the object's constructor.
Off topic, but this can be problematic. Suppose that the set of required attributes changes at some point such that persisted entities are no longer valid. You still want to be able to reconstitute them, so you shouldn't run that check upon reconstitution. Instead, only run validation upon creation or during behaviors.
With regards to the addPet method, instead of passing an instance of a concrete pet class, pass the data required to create an instance of a pet as method arguments or as an instance of a PetInfo class. This would allow the User class to create a fully valid instance of Pet.

Validate form input in Domain Objects setters?

Since I got into learning about MVC I have always validated my form data in my controllers which is a habit I picked up while skimming through CodeIgniters code but I have learned that its way of doing certain operations is not the best, it just gets the job done.
Should all form data be validated by the domain objects? And if so should it be done in the setters like so
public function setFirstName($firstName) {
// Check if the field was required
if(!$firstName) {
throw new InvalidArgumentException('The "First name" field is required');
}
// Check the length of the data
// Check the format
// Etc etc
}
Also, for example, if I am handling a basic user registration my User class does not have a $confirmPassword property so I would not be doing
$user->setConfirmPassword($confirmPassword);.
One way of checking if the two passwords entered are equal would be to set the $password and do something like
$user->setPassword($password);
if(!$user->matchPassword($confirmPassword)) {
throw new PasswordsNotEqualException('Some message');
}
and this would be done in the Service layer I would think?
Any advice to help me in the correct direction would be great. Thanks.

Should all form data be validated by the domain objects? And if so
should it be done in the setters like so
IMO you should only let creation of valid objects, and the best way to archieve this is to make those checks in the method that creates an object.
Assuming that the first name of an user cannot be changed, you would validate that upon the user creation. This way, you forget about the setter because you won't need it anymore.
There may be cases when you want a property to be changed, and you would need to validate them too (because that change could lead form a valid object to an invalid object, if that's the case).
One way of checking if the two passwords entered are equal would be to
set the $password and do something like...
You can handle this one the same way: have a Password object that checks both the password and confirmation upon its creation. One you have a valid Password instance, you can use it knowing that it passed all the validations you specified.
References
Those design principles (complete and valid objects from the start, etc) are from Hernan Wilkinson's "Design Principles Behind Patagonia".Be sure to check the ESUG 2010 Video and the presentation slides.
I've recently answered another question about validating properties I think you may come in handy: https://stackoverflow.com/a/14867390/146124
Cheers!

TL;DR
No, setters should not be validating the data. And nick2083 is completely wrong.
Longer version ...
According to Tim Howard's provided definition [source], the domain objects can verify the state of domain information that they contain. This basically states, that for you to actually have a domain object, said object need to be able to validate itself.
When to validate
You basically have to options:
validate in each setter
have one method to validate whole object
If the validation is a part of setter, there is one major drawback: the order of setters matters.
Example: lets say you are making an application which deals with life insurance. It is quite probable, that you will have a domain object which contains the person that is insured and the person that gets awarded the premium, when policy is triggered (the insured on dies). You would have to make sure that the recipient and the insured are not the same person. But there is no rule to govern in which order you execute the setters.
When you have two or more parameters in domain object, which have to be validated against each-other, the implementation becomes a bit fuzzy. The most feasible solution is to check when all parameters are assigned, but at that point you have already lost the benefit of in-setter validation: the execution of code has moved past the origin of invalid data.
And how would you deal with situations, where the valid state of domain object is not having a parameter A set if parameter B is large then 21 and C is already set?
Conclusion: validation in setters is only the viable solution, when you have very simple domain objects, with no tangled validation rules.

Does input filter / validation code belong in the controller or the domain model?

I have been using php for awhile but am new to OO php. As an exercise for myself I am building a small MVC framework.
I realize there is probably not a definitive answer for this, but I am wondering: Where does the input filter / validation code belong?
Should it be a part of the controller, where the request is parsed?
Or is it more appropriate to have filter / validation code in the domain model, so that each domain object is responsible for validating its own info.
Any advice would be much appreciated.

Controller is not responsible for validation in any way, shape or form. Controller is the part in presentation layer which is responsible for reacting on users input. Not questioning it.
The validation is mostly the responsibility of domain objects, which are where most of domain business logic ends up within the model layer. Some validation is what one would call "data integrity checks" (like making sure that username is unique). Those constraints are enforced by DB structure (like with UNIQUE constraint in given example or NOT NULL in some others). When you are saving the domain object, using data mapper (or some other storage pattern), it might raise some exceptions. Those exceptions too then might be used to set an error state on a particular domain object.
If you have a form, it will be tied to one or more domain objects, which, when the form is posted, validates it. The current view then requests information from the model layer and, if there has been an error state set, displays the appropriate warnings.

Controllers would typically handle request data (GET / POST) and detect invalid form submissions, CSRF, missing fields, etc. for which the model should not be concerned about. This is the most likely place where you would write the bulk of mostly your filtering code; validation should only go as far as sanity checks for early failure (e.g. don't bother to send an email address to the model if it's not a valid email address).
Your domain objects may also provide validation hooks (even filtering), which would reduce the controller's responsibility, but in most cases I personally find it easier to work with a contract based model (the model assumes you're passing legit values) because it's easier to directly translate validation issues to specific form fields.
The model itself may do validation as well, albeit different from aforementioned input filtering (and content type validation); for instance, it might check whether an email exists in the database rather than making sure it's a valid email address.

OOP Design - Where/When do you Validate properties?

I have read a few books on OOP DDD/PoEAA/Gang of Four and none of them seem to cover the topic of validation - it seems to be always assumed that data is valid.
I gather from the answers to this post (OOP Design Question - Validating properties) that a client should only attempt to set a valid property value on a domain object.
This person has asked a similar question that remains unanswered: http://bytes.com/topic/php/answers/789086-php-oop-setters-getters-data-validation#post3136182
So how do you ensure it is valid?
Do you have a 'validator method' alongside every getter and setter?
isValidName()
setName()
getName()
I seem to be missing some key basic knowledge about OOP data validation - can you point me to a book that covers this topic in detail? - ie. covering different types of validation / invariants/ handling feedback / to use Exceptions or not etc

In my experience, the validation occurs where there is human/user input. And this usually happens where you allow through your method to change something. In your example, I would go for validation for the method:
setName()
So It happens where you allow input of values/setting values which turns out to be setter methods.

It's important to distinguish between valid in the sense of a domain object's invariants (which must always be satisfied) and what some people call "contextual validation." For example, is a customer with a negative bank account "invalid?" No, but they may not be authorized to perform certain kinds of transactions. That's contextual validation, in contrast to "every customer entity must have a non-null ID," which is a different type of validation altogether.
One effective technique to enforce invariants is to distinguish classes that represent user input from domain objects and don't expose unrestricted mutators (simple set accessors, for example) on your domain objects.
For example, if you have a Student domain object, don't manipulate it directly in the user interface. Instead of creating Student instances, your views create StudentBuilder instances that model what you need to construct a valid Student domain object.
Next, you have classes that validate builder instances conform to the domain object's invariants, and a factories that takes accept builders and can transform them into valid domain objects. (You can also introduce contextual validation strategies at this step as appropriate.)

Each object should make sure that its internal state is consistent, so validation is best done before the internal state is modified - in the object's setter methods.

An important part of the OOP is too always keep your object in a valid state. Therefore, validation should be done after an input that could modify the object.
It's always good to validate data comming from properties/set, parameters to functions and constructor.

If you control the code that uses your class, then you should be validating before even attempting to manipulate the object's variables (through the public properties). If you are anticipating a scenario where you don't know how your class is to be used, then yes, you should validate within the property, that's more or less what they are for. Obviously this assumes the definition of "is a valid name" is a static business rule inherent to the object.
Validating on both levels is of course, the safest route to go.

This depends on your style of programming, as Wikipedia has more detailed explanations I will just scratch the surface and link to Wikipedia. (Yes, I'm THAT lazy. :-))
NOTE: All of this does NOT apply to user input. You have to validate it in either way. I'm just talking about the business logic classes which should not be coupled with the user input in any way. :-)
Defensive
As mentioned by others, you will enforce every property to its bounds. I often threw runtime exceptions (Java) to indicate those failures.
Wikipedia on Defensive Programming
By Contract
You document the requirements of your code and assume, for example, the values passed to your setters are valid regarding to the defined contract. This saves you a lot of boilerplate code. But bug hunting will be a little more difficult when an illegal value was given.
Wikipedia on Design by Contract

In short, always validate. In long, perform your validations all together at once, not 'along the way'. This will help your code remain streamlined and help debugging confusion.

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.