Validate form input in Domain Objects setters?

Validate form input in Domain Objects setters? - php

Since I got into learning about MVC I have always validated my form data in my controllers which is a habit I picked up while skimming through CodeIgniters code but I have learned that its way of doing certain operations is not the best, it just gets the job done.
Should all form data be validated by the domain objects? And if so should it be done in the setters like so
public function setFirstName($firstName) {
// Check if the field was required
if(!$firstName) {
throw new InvalidArgumentException('The "First name" field is required');
}
// Check the length of the data
// Check the format
// Etc etc
}
Also, for example, if I am handling a basic user registration my User class does not have a $confirmPassword property so I would not be doing
$user->setConfirmPassword($confirmPassword);.
One way of checking if the two passwords entered are equal would be to set the $password and do something like
$user->setPassword($password);
if(!$user->matchPassword($confirmPassword)) {
throw new PasswordsNotEqualException('Some message');
}
and this would be done in the Service layer I would think?
Any advice to help me in the correct direction would be great. Thanks.

Should all form data be validated by the domain objects? And if so
should it be done in the setters like so
IMO you should only let creation of valid objects, and the best way to archieve this is to make those checks in the method that creates an object.
Assuming that the first name of an user cannot be changed, you would validate that upon the user creation. This way, you forget about the setter because you won't need it anymore.
There may be cases when you want a property to be changed, and you would need to validate them too (because that change could lead form a valid object to an invalid object, if that's the case).
One way of checking if the two passwords entered are equal would be to
set the $password and do something like...
You can handle this one the same way: have a Password object that checks both the password and confirmation upon its creation. One you have a valid Password instance, you can use it knowing that it passed all the validations you specified.
References
Those design principles (complete and valid objects from the start, etc) are from Hernan Wilkinson's "Design Principles Behind Patagonia".Be sure to check the ESUG 2010 Video and the presentation slides.
I've recently answered another question about validating properties I think you may come in handy: https://stackoverflow.com/a/14867390/146124
Cheers!

TL;DR
No, setters should not be validating the data. And nick2083 is completely wrong.
Longer version ...
According to Tim Howard's provided definition [source], the domain objects can verify the state of domain information that they contain. This basically states, that for you to actually have a domain object, said object need to be able to validate itself.
When to validate
You basically have to options:
validate in each setter
have one method to validate whole object
If the validation is a part of setter, there is one major drawback: the order of setters matters.
Example: lets say you are making an application which deals with life insurance. It is quite probable, that you will have a domain object which contains the person that is insured and the person that gets awarded the premium, when policy is triggered (the insured on dies). You would have to make sure that the recipient and the insured are not the same person. But there is no rule to govern in which order you execute the setters.
When you have two or more parameters in domain object, which have to be validated against each-other, the implementation becomes a bit fuzzy. The most feasible solution is to check when all parameters are assigned, but at that point you have already lost the benefit of in-setter validation: the execution of code has moved past the origin of invalid data.
And how would you deal with situations, where the valid state of domain object is not having a parameter A set if parameter B is large then 21 and C is already set?
Conclusion: validation in setters is only the viable solution, when you have very simple domain objects, with no tangled validation rules.

Related

Request validation imposes validation code duplication in Symfony

Say I have a REST API method creating a user. I also have a user entity with configured validation constraints. The question is how to validate the data from the request. My problems are:
I can't populate a user instance without validating the data in the request beforehand - some data might be missing in it, other might be invalid. For example null passed to a user entity's setter with string type-hinting.
I'm not so keen to validate the request data separately, before populating a user instance, because it would be a duplication of the validation constraints configured for the user entity. It would be a problem to manage the same or similar validation constrains in two places - the controller and the entity validation config.
So basically I want to avoid the duplication of the validation constraints in the code and the config, but at the same time I'm forced to duplicate it before populating the entity. How can I get over this?

It is quite physiologic.
What I would suggest is to use a DTO where no restrictions are checked (basically where you can accept "all kind of data" in your setters or even by having public properties that is less cumbersome) and to have a validation on it.
When the DTO is valid, create the underlying object in a valid state (Value Object?)
Of course you need to "duplicate" some constraints but I would not consider this as a real duplication because, actually, DTO and underlying object are not the same object even if them seems to be related. If you disagree - and it could be the case - just stop and think about the boost you'll have by decoupling entity (that should be always in a valid state) from the model where user input data are taken.

Does input filter / validation code belong in the controller or the domain model?

I have been using php for awhile but am new to OO php. As an exercise for myself I am building a small MVC framework.
I realize there is probably not a definitive answer for this, but I am wondering: Where does the input filter / validation code belong?
Should it be a part of the controller, where the request is parsed?
Or is it more appropriate to have filter / validation code in the domain model, so that each domain object is responsible for validating its own info.
Any advice would be much appreciated.

Controller is not responsible for validation in any way, shape or form. Controller is the part in presentation layer which is responsible for reacting on users input. Not questioning it.
The validation is mostly the responsibility of domain objects, which are where most of domain business logic ends up within the model layer. Some validation is what one would call "data integrity checks" (like making sure that username is unique). Those constraints are enforced by DB structure (like with UNIQUE constraint in given example or NOT NULL in some others). When you are saving the domain object, using data mapper (or some other storage pattern), it might raise some exceptions. Those exceptions too then might be used to set an error state on a particular domain object.
If you have a form, it will be tied to one or more domain objects, which, when the form is posted, validates it. The current view then requests information from the model layer and, if there has been an error state set, displays the appropriate warnings.

Controllers would typically handle request data (GET / POST) and detect invalid form submissions, CSRF, missing fields, etc. for which the model should not be concerned about. This is the most likely place where you would write the bulk of mostly your filtering code; validation should only go as far as sanity checks for early failure (e.g. don't bother to send an email address to the model if it's not a valid email address).
Your domain objects may also provide validation hooks (even filtering), which would reduce the controller's responsibility, but in most cases I personally find it easier to work with a contract based model (the model assumes you're passing legit values) because it's easier to directly translate validation issues to specific form fields.
The model itself may do validation as well, albeit different from aforementioned input filtering (and content type validation); for instance, it might check whether an email exists in the database rather than making sure it's a valid email address.

How to handle Domain Entity validation before it's persisted?

An Entity (let's say a UserEntity) has rigid rules for it's properties, and it can exist in 2 states - persisted (which means it has an id) and pre-persisted (which means it does not have an id yet).
According to the answer to this question about how to handle required properties, a "real" UserEntity should only ever be created with an id passed in to its constructor.
However, when I need to create a new UserEntity from information sent by the browser, I need to be able to validate the information before persisting into the db.
In the past, I would simply create a blank UserEntity (without an id), set the new properties, and the validate it - but, in this new, more secure way of thinking about Entities, I shouldn't ever create a new UserEntity without its id.
I don't want to create TWO places that know how to validate my UserEntity's properties, because if they ever change (and they will) it would be double the code to update and double the chances for bugs.
How do I efficiently centralize the validation knowledge of my entity's properties?
Note
One idea I had is reflected in this question, in which I consider storing the non-state properties like email, password and name in a standardized value object that would know about the rules for its properties that different services, like the Controller, Validator, and Repo, or Mapper could use.

that's what factories are for. to the factory method you pass only the data that is required to enforce the real invariants of UserEntity (take some time to figure out what are your real invariants of UserEntity and you'd better do it with your domain experts).
in the factory method you create a new Id and pass it to the UserEntity constructor.
In this stage i don't think it is that bad to discard the instance if the validation inside the constructor fails. in the worst case - you've lost an id... it's not a case that suppose to happen quite often - most of the time the data should be validated in the client.
Of course another option is that in the factory method you first validate the parameters and only then create a new Id and pass it to the UserEntity constructor.
itzik saban

I think you have a couple of options to consider:
(1) Consider your first comment:
An Entity (let's say a UserEntity) has rigid rules for it's
properties, and it can exist in 2 states - persisted (which means it
has an id) and pre-persisted (which means it does not have an id yet).
Here, you are already mention that validation actually depends on whether the entity has been persisted. In other words, if the entity hasn't been persisted, then it should be valid without the ID. If you continue with this domain specification, I feel the validation should act accordingly (e.g. return isValid even without an ID if the object hasn't been persisted)
(2) If you assume "valid" means the object has an ID, then you would need to generate the ID upon creation. Depending on how your IDs are generated, this could get tricky (e.g. save to database and return created ID, or generate unique identifiers somehow, or ...)
Using either approach, its probably worth implementing common base class(es) for your Entity (e.g. with ID) to help minimize duplicating validation across the different states. Hopefully, this shields the derived entities from the common validation as well.

In my opinion , save() , and load() methods should be doing both validation and setting ID attribute . And by the way an entity without Identity attribute is not a entity at all .
In my view Identity attribute should be validated and ensured when entity is in transit e.g
loading from db , loading from file or (after) saving to db such that
if loading from db fails discard the entity saving to db/file fails discard the entity .
Since validation is business log /behavior etc and a better pattern for that would be
Strategy Pattern (http://en.wikipedia.org/wiki/Strategy_pattern)

The topic of how to do validation correctly is somewhat of a grey area.
Validation is typically cast as Invariant and Contextual validation. Invariant validation pertains to those things that, according to your problem domain, have to be present in order for your model to function properly in its intended role. Contextual validation pertains to state that's valid within a given usage context (e.g. A Contact used for emailing needs an email address, but doesn't need phone number; a Contact used for catalog mailings needs a mailing address, but doesn't need an email, etc.).
If you want to be architecturally pure, then technically the concerns of input validation (what your customers are typing into a user interface) and the state of a given entity are two different concerns. Ideally, your domain should have no knowledge of the particular type of application it's written for and therefore shouldn't be burdened with providing error messages suitable for use, either directly or indirectly, in displaying error messages back to the user. This presents a bit of an issue, since it can lead to duplicate or triplicate error checking (client side, service side, domain-level), so many opt for a more pragmatic approach of dealing with most validation external to the entity (e.g. validating an input model prior to entity creation).

I don't see the problem with persisting invalid data. What is valid or not is a business concern and can sometimes depends on the situation. The database doesn't care about these business rules.
If I have to fill out a big form online and the very last step requires me to enter my credit card information and I don't have my card ready, I'll have to discard all that information and the next time enter it all over again (which won't happen because I rather go somewhere else). I would like that application to store the information I already gave and later on I can make it functionally valid. As long as it isn't valid, I can't order things online.

OOP Design - Where/When do you Validate properties?

I have read a few books on OOP DDD/PoEAA/Gang of Four and none of them seem to cover the topic of validation - it seems to be always assumed that data is valid.
I gather from the answers to this post (OOP Design Question - Validating properties) that a client should only attempt to set a valid property value on a domain object.
This person has asked a similar question that remains unanswered: http://bytes.com/topic/php/answers/789086-php-oop-setters-getters-data-validation#post3136182
So how do you ensure it is valid?
Do you have a 'validator method' alongside every getter and setter?
isValidName()
setName()
getName()
I seem to be missing some key basic knowledge about OOP data validation - can you point me to a book that covers this topic in detail? - ie. covering different types of validation / invariants/ handling feedback / to use Exceptions or not etc

In my experience, the validation occurs where there is human/user input. And this usually happens where you allow through your method to change something. In your example, I would go for validation for the method:
setName()
So It happens where you allow input of values/setting values which turns out to be setter methods.

It's important to distinguish between valid in the sense of a domain object's invariants (which must always be satisfied) and what some people call "contextual validation." For example, is a customer with a negative bank account "invalid?" No, but they may not be authorized to perform certain kinds of transactions. That's contextual validation, in contrast to "every customer entity must have a non-null ID," which is a different type of validation altogether.
One effective technique to enforce invariants is to distinguish classes that represent user input from domain objects and don't expose unrestricted mutators (simple set accessors, for example) on your domain objects.
For example, if you have a Student domain object, don't manipulate it directly in the user interface. Instead of creating Student instances, your views create StudentBuilder instances that model what you need to construct a valid Student domain object.
Next, you have classes that validate builder instances conform to the domain object's invariants, and a factories that takes accept builders and can transform them into valid domain objects. (You can also introduce contextual validation strategies at this step as appropriate.)

Each object should make sure that its internal state is consistent, so validation is best done before the internal state is modified - in the object's setter methods.

An important part of the OOP is too always keep your object in a valid state. Therefore, validation should be done after an input that could modify the object.
It's always good to validate data comming from properties/set, parameters to functions and constructor.

If you control the code that uses your class, then you should be validating before even attempting to manipulate the object's variables (through the public properties). If you are anticipating a scenario where you don't know how your class is to be used, then yes, you should validate within the property, that's more or less what they are for. Obviously this assumes the definition of "is a valid name" is a static business rule inherent to the object.
Validating on both levels is of course, the safest route to go.

This depends on your style of programming, as Wikipedia has more detailed explanations I will just scratch the surface and link to Wikipedia. (Yes, I'm THAT lazy. :-))
NOTE: All of this does NOT apply to user input. You have to validate it in either way. I'm just talking about the business logic classes which should not be coupled with the user input in any way. :-)
Defensive
As mentioned by others, you will enforce every property to its bounds. I often threw runtime exceptions (Java) to indicate those failures.
Wikipedia on Defensive Programming
By Contract
You document the requirements of your code and assume, for example, the values passed to your setters are valid regarding to the defined contract. This saves you a lot of boilerplate code. But bug hunting will be a little more difficult when an illegal value was given.
Wikipedia on Design by Contract

In short, always validate. In long, perform your validations all together at once, not 'along the way'. This will help your code remain streamlined and help debugging confusion.

Should validation be done in Form objects, or the model?

This question is mainly geared towards Zend in PHP, although it certainly applies to other languages and frameworks, so I welcome everyone's opinion.
I've only recently been using the Zend framework, and while it's not perfect, I have had a pretty good time with it. One thing that drives me crazy, however, is that most of the examples I see of people using Zend do the validation in special form objects, rather than in the model. I think this is bad practice because data can enter into the system in other ways beyond form input, which means that either validators have to be bent and twisted to validate other input, or validation must be done in a second place, and logic duplicated.
I've found some other posts and blogs out there with people who feel the same way I do, but the developers of Zend made this choice for a reason, and other people seem to use it without issue, so I wanted to get some feedback from the community here.
As I said, this mainly applies to Zend, although I think it's important to look at the issue as a whole, rather than working within the confines of the Zend framework, since Zend was designed so that you could use as much, or as little, as you wished.

This is a non-zend specfic answer, however I believe that the model should be responsible for the validity of its own data. If this is the case then the validation belongs in the model, however this may not always be achievable and it may be necessary to perform validation in the view, however I think this should be in addition to the validation performed in the model not a replacement for it.
The problem with only having validation in the view is that at some point you will probably want another view on your data. Your site may become popular and customers are asking for XML based APIs to generate their own views. Do you then rely on the customer to validate the data?
Even if you do not have to provide APIs some customers may want customized views that are sufficiently different to warrant a completely different version of the page, again you now have validation in the views duplicated.
I think the ideal scenario is to have your model do the validation but to make the results of the validation available for the view to read and render the page again with the validation results displayed.
I think it is perfectly reasonable to have the view doing validation if you want to instantly display validation data back to the user etc but the final decision on data validity should rest with the model.

It's important to remember that data validation which is relevant to an application isn't always the same thing as data validation that's relevant to a database schema.
Consider a simple registration form where a user creates an account with a username and password. You perform validation on the password because you want it to be X number of characters in length and contain a good mix of character types (or whatever).
But none of this is relevant to validate the data for database insertion, because you aren't going to store plain-text passwords - you're going to store a hash of them in some way (md5, md5 + salt, whatever). Instead you might make sure that you have a 32 character hexadecimal string so that it is very likely to be a properly created MD5 hash.
This password example isn't the only scenario, just a good one for explanation here in this topic.
So what's the answer? I don't think there's any one-solution-fits-all. Sometimes you will want (need?) to validate the data twice. Sometimes you'll do it once an only in the Model. Just match it as best as possible to your application's needs.

Perhaps you should have a look at Using Zend_Form in Your Models by Matthew Weier O'Phinney - one of the lead-developers of the Zend Framework - for his view on exactly this question.

Well, the validation can be done at many different levels and usually none of them is "the best". Of course, the model can be populated with invalid data that do not come from the form, but we can also create forms whose data do not go to any model.
Moreover, the direct validation in models is unsually not integrated with our form rendering system, which causes problems if we want to show the error messages and re-populate the form with the user-entered data then.
Both of the solutions have their own pros and cons. It would be perfect to have a system that ensures us that the validation finally must be done at some level. If the form does not validate some data, then the model does and vice versa. Unfortunately, I haven't heard of such library, but I must note that the validators in the frameworks unsually are source-independent. You can pass the POST data to them, but the same can be done with the information retreived from a properly parsed CSV, MYSQL databases, etc.

I am not aware of Zend. But.
Your model have to receive valid data. Model and it's methods shouldn't check data again and again. Of course there are should be functions that do actual validation and they should be called from the gui validation or from the other data input place.
The best you can do on your model side is call "Assertions" on all the data to be sure on the development time that validation have been taken its place.
The lower level of the code (UI, model, utils) the less validation and check code should be there. As then there is a big chance that the same validation will be called more then one.

How about putting esthetical validation in the form, and business rules validation in the model.
Take a registration form for example.
The form would assure that the email field is trimmed and contains a valid email, that the password/confirm password field are identical and that the user checked the I Agree to terms checkbox.
The registration model would make sure that the email hasn't been taken yet in the table, would salt and hash the password.
It's how I split the two usually.

User input should be validated when it is being inputted because it is specific to the form of entry (ie, do some form validation - make sure text boxes that should have numbers are numbers).
Business logic should probably be validated on the model because it is model specific (ie. make sure they have't already reserved that same room or something like that).
The problem with validating it at the model level is that the model might be used in different ways. Correct input for one scenario may not be correct input for another.
The other issue is that you usually want some context sensitive validation, such as displaying a red box around the form control that has the bad input.
The model or database might do some extra validation to make sure the user code isn't doing something completely wrong (constraints, etc).

Peter Bailey's password example is excellent. A user model can only validate, if a password was set (because it's not stored as plain text but as a hash) while input validation can ensure, that the original plain text password corresponds to the security requirements (number of characters,...). Therefore you need both: Model validation and form/input validation, ideally as separate, reusable component and not directly in bloated controller actions.
Think of input validation as whitelist validation (“accept known good”) and model validation as blacklist validation (“reject known bad”). Whitelist validation is more secure while blacklist validation prevents your model layer from being overly constrained to very specific use cases.
Invalid model data should always cause an exception to be thrown (otherwise the application can continue running without noticing the mistake) while invalid input values coming from external sources are not unexpected, but rather common (unless you got users that never make mistakes).
See also: https://lastzero.net/2015/11/form-validation-vs-model-validation/

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.