In input of my application I have the following data: airplane_id, airport_id and passenger(s) details.
I need to make sure that the selected airplane_id can reach airport_id. This can only be checked with the help of an SQL query, but the check is still a validation process, isn't it?
Validation should happen before I save the passenger(s) details.
In my application the model is an ActiveRecord pattern object which represents a table. I would rather make the Validator a separate layer than build it into the Model layer. But in this case I have an extra issue: usually Validators are general (their rules might be applied to any set of data). For instance: is this data an email? An IP? A date? They work regardless of what the data represents.
In my case, the mentioned rule won't be common at all; it will definitely be a specific rule, which can't be applied to any other input data. So my question is: is this check still part of the validation process?
And if yes, will the Validator violate the S principle of SOLID?
It is validation and you should use a separate validation layer (single responsibility for input validation). Input validation isn't just data type checking, it can be much more complex. Model validation might still be needed though.
Think of input validation as whitelist validation (“accept known good”) and model validation as blacklist validation (“reject known bad”). Whitelist validation is more secure while blacklist validation prevents your model layer from being overly constrained to very specific use cases.
Invalid model data should always cause an exception to be thrown (otherwise the application can continue running without noticing the mistake), while invalid input values coming from external sources are not unexpected, but rather common (unless you have users that never make mistakes).
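As a rough sketch of what such a separate, rule-specific validator might look like: ReachabilityValidator, the injected $routeExists callable and the sample ids are all made up for illustration, and in a real application the callable would presumably wrap the SQL query.

```php
<?php
// Hypothetical names throughout: ReachabilityValidator, $routeExists and the
// sample ids are illustrative and not taken from the question.

final class ReachabilityValidator
{
    private $routeExists; // callable(int $airplaneId, int $airportId): bool
    private $errors = [];

    // The lookup is injected so the rule stays independent of the model layer.
    public function __construct(callable $routeExists)
    {
        $this->routeExists = $routeExists;
    }

    public function validate(array $input): bool
    {
        $this->errors = [];

        // Generic checks first: both ids must be present and be integers.
        foreach (['airplane_id', 'airport_id'] as $field) {
            if (!isset($input[$field]) || !is_int($input[$field])) {
                $this->errors[] = "$field must be an integer";
            }
        }

        // Domain-specific rule: only asked once the basic checks have passed.
        if ($this->errors === []) {
            $routeExists = $this->routeExists;
            if (!$routeExists($input['airplane_id'], $input['airport_id'])) {
                $this->errors[] = 'The selected airplane cannot reach this airport';
            }
        }

        return $this->errors === [];
    }

    public function errors(): array
    {
        return $this->errors;
    }
}

// Stand-in for the database lookup: airplane 7 can reach airports 1 and 2.
$routes = [7 => [1, 2]];
$validator = new ReachabilityValidator(
    function (int $airplaneId, int $airportId) use ($routes): bool {
        return in_array($airportId, $routes[$airplaneId] ?? [], true);
    }
);

$ok  = $validator->validate(['airplane_id' => 7, 'airport_id' => 2]);
$bad = $validator->validate(['airplane_id' => 7, 'airport_id' => 9]);
```

The validator has one job (deciding whether this input is acceptable), so a very specific rule does not by itself conflict with single responsibility.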
See also: https://lastzero.net/2015/11/form-validation-vs-model-validation/
Yes, these checks are validation.
Speaking from experience with an MVC pattern framework (Yii/2), I would say that you could make an abstract validator class, extend it into your concrete validators, and call those validators from the model class. This still needs a Model->validate() call, but having separate classes that actually check the data will not violate the S in SOLID: Model->validate() just loops through the validators' validate methods and stores the error messages in an array.
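A minimal sketch of that arrangement, assuming nothing about Yii itself: the class names and the rules array below are hypothetical and are not Yii's actual validator API.

```php
<?php
// Hypothetical sketch; this is NOT Yii's validator API.

abstract class Validator
{
    // Returns an error message, or null when the value is acceptable.
    abstract public function validate($value): ?string;
}

final class RequiredValidator extends Validator
{
    public function validate($value): ?string
    {
        return ($value === null || $value === '') ? 'Value is required' : null;
    }
}

final class EmailValidator extends Validator
{
    public function validate($value): ?string
    {
        return filter_var($value, FILTER_VALIDATE_EMAIL) === false
            ? 'Value is not a valid email address'
            : null;
    }
}

class Model
{
    private $attributes;
    private $rules;       // attribute name => list of Validator instances
    private $errors = [];

    public function __construct(array $attributes, array $rules)
    {
        $this->attributes = $attributes;
        $this->rules = $rules;
    }

    // Loops through the validators and collects their messages per attribute.
    public function validate(): bool
    {
        $this->errors = [];
        foreach ($this->rules as $name => $validators) {
            foreach ($validators as $validator) {
                $message = $validator->validate($this->attributes[$name] ?? null);
                if ($message !== null) {
                    $this->errors[$name][] = $message;
                }
            }
        }
        return $this->errors === [];
    }

    public function getErrors(): array
    {
        return $this->errors;
    }
}

$model = new Model(
    ['email' => 'not-an-email'],
    ['email' => [new RequiredValidator(), new EmailValidator()]]
);
$valid = $model->validate();
```

Each concrete validator knows one rule; the model only orchestrates them.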
I have a bunch of Domain Objects and I am using overloading to get and set properties.
My form filters are comprehensive. If properties of the wrong type or value sneak through, I am confident that I can pick them up in the mapper. Worst case scenario is that the database throws an exception which I can catch.
In this instance, should I worry about getters and setters in the domain object?
As a best practice, you should always "catch what you can" before you get to the database. Though it may seem as if a round-trip isn't a big deal, they are expensive. Objects have to be created on the server, application pool resources managed, and so much more. Do all the validation you can, though it's tedious, before you get to the database.
The reason you rely on the database to throw exceptions is to ensure its integrity via other forms of access (e.g. import scripts), not to leverage it for your application (which is capable of catching and handling them gracefully).
The final benefit of building the get and set operations is that you can fully encapsulate these bounds checks, so that you only have to write the code once. You're going in the right direction!
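To make that concrete, here is a small sketch of a setter that owns its bounds check. The Product class and the "non-negative integer price" rule are invented for the example.

```php
<?php
// Hypothetical domain object; the Product name and the price rule are illustrative.

final class Product
{
    private $priceInCents = 0;

    public function setPriceInCents($price): void
    {
        // The bounds check lives in one place; every caller goes through it.
        if (!is_int($price) || $price < 0) {
            throw new InvalidArgumentException('Price must be a non-negative integer');
        }
        $this->priceInCents = $price;
    }

    public function getPriceInCents(): int
    {
        return $this->priceInCents;
    }
}

$product = new Product();
$product->setPriceInCents(1999);

$rejected = false;
try {
    $product->setPriceInCents(-5);
} catch (InvalidArgumentException $e) {
    $rejected = true; // the invalid value never reaches the mapper or the database
}
```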
Well, in my opinion there is no single good answer to this.
If you do it, you can be 100% sure the value returned is of type x. Otherwise you depend on the data layer to deliver the right thing.
I check the values, but that is mostly because I like the defensive programming approach (everything outside the scope is evil and should not be trusted). The domain object is outside the scope of the mapper, so you cannot be sure what you get.
Also, if you build an API of some kind and use the domain objects there, they are again out of scope, so check the values.
In conclusion, it depends on the code style you like. You could implement the getters and setters or just access the values directly.
That said, I recommend using at least getters and setters in case you need to change how values are handled (an array that needs to become an object, for example).
Defining explicit 'getters and setters' will allow you to implement the correct encapsulation for your domain model.
From a setting point of view, this would be type checking for complex values. Then you can perform any validation that is applicable in the context of the domain model.
When the domain model is persisted, this "validation" is really there to ensure the values are stored correctly. The only concern the mapper would have is to cast the simple (scalar) values to the correct format (e.g. dates). These operations tend to be database specific and would probably be better suited to the mapper.
Since I started learning about MVC I have always validated my form data in my controllers, a habit I picked up while skimming through CodeIgniter's code. I have since learned that its way of doing certain operations is not the best; it just gets the job done.
Should all form data be validated by the domain objects? And if so, should it be done in the setters like so:
public function setFirstName($firstName) {
    // Check if the field was required
    if(!$firstName) {
        throw new InvalidArgumentException('The "First name" field is required');
    }
    // Check the length of the data
    // Check the format
    // Etc etc
}
Also, for example, if I am handling a basic user registration my User class does not have a $confirmPassword property so I would not be doing
$user->setConfirmPassword($confirmPassword);.
One way of checking if the two passwords entered are equal would be to set the $password and do something like
$user->setPassword($password);
if(!$user->matchPassword($confirmPassword)) {
    throw new PasswordsNotEqualException('Some message');
}
and this would be done in the Service layer I would think?
Any advice to help me in the correct direction would be great. Thanks.
Should all form data be validated by the domain objects? And if so
should it be done in the setters like so
IMO you should only allow the creation of valid objects, and the best way to achieve this is to make those checks in the method that creates the object.
Assuming that the first name of a user cannot be changed, you would validate it upon user creation. This way, you can forget about the setter because you won't need it anymore.
There may be cases when you want a property to be changeable, and you would need to validate those changes too (because a change could lead from a valid object to an invalid one).
One way of checking if the two passwords entered are equal would be to
set the $password and do something like...
You can handle this one the same way: have a Password object that checks both the password and the confirmation upon its creation. Once you have a valid Password instance, you can use it knowing that it passed all the validations you specified.
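Something along these lines, as a sketch only: the minimum length of 8 and the choice to keep just a hash are my own assumptions, not part of the design principles referenced below.

```php
<?php
// Hypothetical Password value object; the minimum length of 8 is an assumed rule.

final class Password
{
    private $hash;

    public function __construct(string $password, string $confirmation)
    {
        if ($password !== $confirmation) {
            throw new InvalidArgumentException('Passwords do not match');
        }
        if (strlen($password) < 8) {
            throw new InvalidArgumentException('Password must be at least 8 characters');
        }
        // Only the hash is kept; password_hash() ships with PHP core.
        $this->hash = password_hash($password, PASSWORD_DEFAULT);
    }

    public function matches(string $candidate): bool
    {
        return password_verify($candidate, $this->hash);
    }
}

// A valid instance can only exist if both checks passed.
$password = new Password('correct horse', 'correct horse');

$mismatchRejected = false;
try {
    new Password('correct horse', 'battery staple');
} catch (InvalidArgumentException $e) {
    $mismatchRejected = true;
}
```

With this in place the User never needs a confirmPassword property; it just receives a Password that is already known to be valid.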
References
Those design principles (complete and valid objects from the start, etc.) are from Hernan Wilkinson's "Design Principles Behind Patagonia". Be sure to check the ESUG 2010 video and the presentation slides.
I've recently answered another question about validating properties that I think may come in handy: https://stackoverflow.com/a/14867390/146124
Cheers!
TL;DR
No, setters should not be validating the data. And nick2083 is completely wrong.
Longer version ...
According to Tim Howard's provided definition [source], domain objects can verify the state of the domain information that they contain. This basically states that for you to actually have a domain object, said object needs to be able to validate itself.
When to validate
You basically have two options:
validate in each setter
have one method to validate whole object
If the validation is a part of setter, there is one major drawback: the order of setters matters.
Example: let's say you are making an application which deals with life insurance. It is quite probable that you will have a domain object which contains the person that is insured and the person that gets awarded the premium when the policy is triggered (the insured one dies). You would have to make sure that the recipient and the insured are not the same person. But there is no rule to govern in which order you execute the setters.
When you have two or more parameters in a domain object which have to be validated against each other, the implementation becomes a bit fuzzy. The most feasible solution is to check when all parameters are assigned, but at that point you have already lost the benefit of in-setter validation: the execution of code has moved past the origin of the invalid data.
And how would you deal with situations where the valid state of a domain object is not having parameter A set if parameter B is larger than 21 and C is already set?
Conclusion: validation in setters is a viable solution only when you have very simple domain objects with no tangled validation rules.
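For the insurance example, the "one method to validate the whole object" option could look roughly like this. InsurancePolicy and the use of plain person ids instead of Person objects are simplifications of mine.

```php
<?php
// Hypothetical InsurancePolicy; person ids stand in for full Person objects.

final class InsurancePolicy
{
    private $insuredId;
    private $recipientId;

    // Setters only assign, so they can be called in any order.
    public function setInsured(int $personId): void
    {
        $this->insuredId = $personId;
    }

    public function setRecipient(int $personId): void
    {
        $this->recipientId = $personId;
    }

    // Cross-field rules are checked once, after all values are in place.
    public function validate(): array
    {
        $errors = [];
        if ($this->insuredId === null) {
            $errors[] = 'Insured person is required';
        }
        if ($this->recipientId === null) {
            $errors[] = 'Recipient is required';
        }
        if ($this->insuredId !== null && $this->insuredId === $this->recipientId) {
            $errors[] = 'Recipient and insured must be different people';
        }
        return $errors;
    }
}

$policy = new InsurancePolicy();
$policy->setRecipient(42); // recipient first, insured second: order does not matter
$policy->setInsured(42);
$errors = $policy->validate();
```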
An Entity (let's say a UserEntity) has rigid rules for its properties, and it can exist in 2 states - persisted (which means it has an id) and pre-persisted (which means it does not have an id yet).
According to the answer to this question about how to handle required properties, a "real" UserEntity should only ever be created with an id passed in to its constructor.
However, when I need to create a new UserEntity from information sent by the browser, I need to be able to validate the information before persisting into the db.
In the past, I would simply create a blank UserEntity (without an id), set the new properties, and then validate it - but, in this new, more secure way of thinking about Entities, I shouldn't ever create a new UserEntity without its id.
I don't want to create TWO places that know how to validate my UserEntity's properties, because if they ever change (and they will) it would be double the code to update and double the chances for bugs.
How do I efficiently centralize the validation knowledge of my entity's properties?
Note
One idea I had is reflected in this question, in which I consider storing the non-state properties like email, password and name in a standardized value object that would know about the rules for its properties that different services, like the Controller, Validator, and Repo, or Mapper could use.
That's what factories are for. To the factory method you pass only the data that is required to enforce the real invariants of UserEntity (take some time to figure out what the real invariants of UserEntity are, preferably together with your domain experts).
In the factory method you create a new Id and pass it to the UserEntity constructor.
At this stage I don't think it is that bad to discard the instance if the validation inside the constructor fails. In the worst case you've lost an id. It's not something that is supposed to happen often, since most of the time the data should already have been validated in the client.
Of course, another option is to first validate the parameters in the factory method and only then create a new Id and pass it to the UserEntity constructor.
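A rough sketch of the first variant (generate the id, let the constructor enforce the invariants). UserFactory, the email invariant and the random-hex id scheme are all assumptions made for the example.

```php
<?php
// Hypothetical sketch: UserFactory, the email invariant and the random-hex id
// scheme are assumptions for illustration.

final class UserEntity
{
    private $id;
    private $email;

    public function __construct(string $id, string $email)
    {
        // The constructor is the single place that knows the invariants.
        if ($id === '') {
            throw new InvalidArgumentException('A UserEntity requires an id');
        }
        if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
            throw new InvalidArgumentException('Invalid email address');
        }
        $this->id = $id;
        $this->email = $email;
    }

    public function id(): string
    {
        return $this->id;
    }

    public function email(): string
    {
        return $this->email;
    }
}

final class UserFactory
{
    public static function create(string $email): UserEntity
    {
        // Generate the identity up front; if the constructor throws,
        // the id is simply discarded.
        $id = bin2hex(random_bytes(16));
        return new UserEntity($id, $email);
    }
}

$user = UserFactory::create('bob@example.com');

$rejected = false;
try {
    UserFactory::create('not-an-email');
} catch (InvalidArgumentException $e) {
    $rejected = true;
}
```

Because the rules live only in the constructor, loading a persisted user and creating a new one go through the same checks.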
itzik saban
I think you have a couple of options to consider:
(1) Consider your first comment:
An Entity (let's say a UserEntity) has rigid rules for it's
properties, and it can exist in 2 states - persisted (which means it
has an id) and pre-persisted (which means it does not have an id yet).
Here, you already mention that validation actually depends on whether the entity has been persisted. In other words, if the entity hasn't been persisted, then it should be valid without the ID. If you continue with this domain specification, I feel the validation should act accordingly (e.g. return isValid even without an ID if the object hasn't been persisted).
(2) If you assume "valid" means the object has an ID, then you would need to generate the ID upon creation. Depending on how your IDs are generated, this could get tricky (e.g. save to database and return created ID, or generate unique identifiers somehow, or ...)
Using either approach, it's probably worth implementing common base class(es) for your Entity (e.g. with ID) to help minimize duplicating validation across the different states. Hopefully, this shields the derived entities from the common validation as well.
In my opinion, the save() and load() methods should do both the validation and the setting of the ID attribute. And by the way, an entity without an identity attribute is not an entity at all.
In my view the identity attribute should be validated and ensured when the entity is in transit, e.g.
when loading from the db, loading from a file, or (after) saving to the db, such that:
if loading from the db fails, discard the entity; if saving to the db/file fails, discard the entity.
Since validation is business logic/behavior, a better pattern for it would be the
Strategy Pattern (http://en.wikipedia.org/wiki/Strategy_pattern)
The topic of how to do validation correctly is somewhat of a grey area.
Validation is typically cast as Invariant and Contextual validation. Invariant validation pertains to those things that, according to your problem domain, have to be present in order for your model to function properly in its intended role. Contextual validation pertains to state that's valid within a given usage context (e.g. A Contact used for emailing needs an email address, but doesn't need phone number; a Contact used for catalog mailings needs a mailing address, but doesn't need an email, etc.).
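To illustrate the distinction with the Contact example, here is a small sketch. Treating "a contact must have a name" as the invariant is an assumption on my part, as are the method names.

```php
<?php
// Hypothetical Contact; the "name is always required" invariant is assumed.

final class Contact
{
    private $name;
    private $email;
    private $mailingAddress;

    public function __construct(string $name, ?string $email = null, ?string $mailingAddress = null)
    {
        // Invariant validation: without a name the model cannot do its job at all.
        if (trim($name) === '') {
            throw new InvalidArgumentException('A contact needs a name');
        }
        $this->name = $name;
        $this->email = $email;
        $this->mailingAddress = $mailingAddress;
    }

    // Contextual validation: only matters when the contact is used for emailing.
    public function isValidForEmailing(): bool
    {
        return $this->email !== null
            && filter_var($this->email, FILTER_VALIDATE_EMAIL) !== false;
    }

    // Contextual validation: only matters for catalog mailings.
    public function isValidForCatalogMailing(): bool
    {
        return $this->mailingAddress !== null && trim($this->mailingAddress) !== '';
    }
}

// Valid as a Contact, usable for emailing, not usable for catalog mailings.
$contact = new Contact('Joe Bloggs', 'joe@example.com');
```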
If you want to be architecturally pure, then technically the concerns of input validation (what your customers are typing into a user interface) and the state of a given entity are two different concerns. Ideally, your domain should have no knowledge of the particular type of application it's written for and therefore shouldn't be burdened with providing error messages suitable for use, either directly or indirectly, in displaying error messages back to the user. This presents a bit of an issue, since it can lead to duplicate or triplicate error checking (client side, service side, domain-level), so many opt for a more pragmatic approach of dealing with most validation external to the entity (e.g. validating an input model prior to entity creation).
I don't see the problem with persisting invalid data. What is valid or not is a business concern and can sometimes depend on the situation. The database doesn't care about these business rules.
If I have to fill out a big form online and the very last step requires me to enter my credit card information and I don't have my card ready, I'll have to discard all that information and enter it all over again next time (which won't happen, because I'd rather go somewhere else). I would like that application to store the information I already gave, so that I can make it functionally valid later on. As long as it isn't valid, I can't order things online.
I'm currently rebuilding an admin application and looking for your recommendations for best-practice! Excuse me if I don't have the right terminology, but how should I go about the following?
Take the example of "users" - typically we can create a class with properties like 'name', 'username', 'password', etc. and make some methods like getUser($user_ID), getAllUsers(), etc. In the end, we end up with an array/arrays of name-value pairs like; array('name' => 'Joe Bloggs', 'username' => 'joe_90', 'password' => '123456', etc).
The problem is that I want this object to know more about each of its properties.
Consider "username" - in addition to knowing its value, I want the object to know things like; which text label should display beside the control on the form, which regex I should use when validating, what error message is appropriate? These things seem to belong in the model.
The more I work on the problem, the more I see other things too; which HTML element should be used to display this property, what are minimum/maximum values for properties like 'registration_date'?
I envisaged the class looking something like this (simplified):
class User {
    ...etc...
    private static $model = array();
    ...etc...
    function __construct(){
        ...etc...
        $this->model['username']['value'] = NULL; // A default value used for new objects.
        $this->model['username']['label'] = dictionary::lookup('username'); // Displayed on the HTML form. Actual string comes from a translation database.
        $this->model['username']['regex'] = '/^[0-9a-z_]{4,64}$/i'; // Used for both client-side validation and backend validation/sanitising;
        $this->model['username']['HTML'] = 'text'; // Which type of HTML control should be used to interact with this property.
        ...etc...
        $this->model['registration_date']['value'] = 'now'; // Default value
        $this->model['registration_date']['label'] = dictionary::lookup('registration_date');
        $this->model['registration_date']['minimum'] = '2007-06-05'; // These values could be set by a permissions/override object.
        $this->model['registration_date']['maximum'] = '+1 week';
        $this->model['registration_date']['HTML'] = 'datepicker';
        ...etc...
    }
    ...etc...
    function getUser($user_ID){
        ...etc...
        // getUser pulls the real data from the database and overwrites the default value for that property.
        return $this->model;
    }
}
Basically, I want this info to be in one location so that I don't have to duplicate code for HTML markup, validation routines, etc. The idea is that I can feed a user array into an HTML form helper and have it automatically create the form, controls and JavaScript validation.
I could then use the same object in the backend with a generic set($data = array(), $model = array()) method to avoid having individual methods like setUsername($username), setRegistrationDate($registration_date), etc...
Does this seem like a sensible approach?
What would you call value, label, regex, etc? Properties of properties? Attributes?
Using $this->model in getUser() means that the object model is overwritten, whereas it would be nicer to keep the model as a prototype and have getUser() inherit the properties.
Am I missing some industry-standard way of doing this? (I have been through all the frameworks - example models are always lacking!!!)
How does it scale when, for example, I want to display user types with a SELECT with values from another model?
Thanks!
Update
I've since learned that Java has class annotations - http://en.wikipedia.org/wiki/Java_annotations - which seem to be more or less what I was asking. I found this post - http://interfacelab.com/metadataattributes-in-php - does anyone have any insight into programming like this?
You're on the right track there. When it comes to models I think there are many approaches, and the "correct" one usually depends on your type of application.
Your model can be directly an Active Record, maybe a table row data gateway or a "POPO", plain old PHP object (in other words, a class that doesn't implement any specific pattern).
Whichever you decide works best for you, things like validation etc. can be put into the model class. You should be able to work with your users as User objects, not as associative arrays - that is the main thing.
Does this seem like a sensible approach
Yes, besides the form label thing. It's probably best to have a separate source for data such as form labels, because you may eventually want to be able to localize them. Also, the label isn't actually related to the user object - it's related to displaying a form.
How I would approach this (suggestion)
I would have a User object which represents a single user. It should be possible to create an empty user or create it from an array (so that it's easy to create one from a database result for example). The user object should also be able to validate itself, for example, you could give it a method "isValid", which when called will check all values for validity.
I would additionally have a user repository class (or perhaps just some static methods on the User class) which could be used to fetch users from the database and store them back. This repository would directly return user objects when fetching, and accept user objects as parameters for saving.
As for forms, you could probably have a form class which takes a user object. It could then automatically get values from the user and use the user to validate itself as well.
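Here is a bare-bones sketch of the User and repository part of that suggestion. The field names and the in-memory array standing in for the database are assumptions; the username pattern is the one from the question.

```php
<?php
// Hypothetical sketch: the field names and the in-memory array standing in for
// a database are assumptions; the username pattern comes from the question.

final class User
{
    public $username;
    public $name;

    // Makes it easy to build a user from a database row.
    public static function fromArray(array $row): User
    {
        $user = new User();
        $user->username = $row['username'] ?? null;
        $user->name = $row['name'] ?? null;
        return $user;
    }

    // The user validates itself.
    public function isValid(): bool
    {
        return is_string($this->username)
            && preg_match('/^[0-9a-z_]{4,64}$/i', $this->username) === 1
            && is_string($this->name)
            && $this->name !== '';
    }
}

final class UserRepository
{
    private $rows = []; // stand-in for a database table

    public function save(User $user): void
    {
        if (!$user->isValid()) {
            throw new InvalidArgumentException('Refusing to save an invalid user');
        }
        $this->rows[$user->username] = ['username' => $user->username, 'name' => $user->name];
    }

    // Returns User objects rather than associative arrays.
    public function findByUsername(string $username): ?User
    {
        return isset($this->rows[$username]) ? User::fromArray($this->rows[$username]) : null;
    }
}

$repository = new UserRepository();
$repository->save(User::fromArray(['username' => 'joe_90', 'name' => 'Joe Bloggs']));
$found = $repository->findByUsername('joe_90');
$empty = new User(); // an empty user can be created, but it is not valid yet
```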
I have written a bit on this topic here: http://codeutopia.net/blog/2009/02/28/creating-a-simple-abstract-model-to-reduce-boilerplate-code/ and also in some other posts linked at the end of that one.
Hope this helps. I'd just like to remind you that my approach is not perfect either =)
An abstract response for you which quite possibly won't help at all, but I'm happy to get the down votes as it's worth saying :)
You're dealing with two different models here. In some worlds we call these Class and Instance, in others we talk of Classes and Individuals, and in yet other worlds we make distinctions between T-Box and A-Box statements.
You are dealing with two sets of data here, I'll write them out in plain text:
User a Class .
username a Property;
domain User;
range String .
registration_date a Property;
domain User;
range Date .
this is your Class data, T-Box statements, Blueprints, how you describe the universe that is your application - this is not the description of the 'things' in your universe, rather you use this to describe the things in your universe, your instance data.. so you then have:
user1 a User ;
username "bob";
registration_date "2010-07-02" .
which is your Instance, Individual, A-Box data, the things in your universe.
You may notice here, that all the other things you are wondering how to do, validation, adding labels to properties and so forth, all come under the first grouping, things that describe your universe, not the things in it. So that's where you'd want to add it.. again in plain text..
username a Property;
domain User;
range String;
title "Username";
validation [ type Regex; value '/^[0-9a-z_]{4,64}$/i' ] .
The point in all this, is to help you analyse the other answers you get - you'll notice that in your suggestion you munged these two distinct sets of data together, and in a way it's a good thing - from this hopefully you can see that typically the classes in PHP take on the role of Classes (unsurprisingly) and each object (or instance of a class) holds the individual instance data - however you've started to merge these two parts of your universe together to try and make one big reusable set of classes outside of the PHP language constructs that are provided.
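If you wanted to keep that split visible inside plain PHP, one possible sketch is to hold the class-level description in a static array and the instance data in a normal property. The describe/set/get method names, the 'Registration date' title and the date pattern are my own inventions for the example.

```php
<?php
// Hypothetical sketch: the method names, the 'Registration date' title and the
// date pattern are assumptions; the username entry mirrors the plain-text example.

final class User
{
    // Class-level description (T-Box): shared by every instance.
    private static $properties = [
        'username' => [
            'range' => 'string',
            'title' => 'Username',
            'regex' => '/^[0-9a-z_]{4,64}$/i',
        ],
        'registration_date' => [
            'range' => 'date',
            'title' => 'Registration date',
            'regex' => '/^\d{4}-\d{2}-\d{2}$/',
        ],
    ];

    // Instance data (A-Box): the values of one particular user.
    private $values = [];

    public static function describe(string $property): array
    {
        return self::$properties[$property];
    }

    public function set(string $property, string $value): void
    {
        if (!isset(self::$properties[$property])) {
            throw new InvalidArgumentException("Unknown property: $property");
        }
        if (preg_match(self::$properties[$property]['regex'], $value) !== 1) {
            throw new InvalidArgumentException("Invalid value for $property");
        }
        $this->values[$property] = $value;
    }

    public function get(string $property): ?string
    {
        return $this->values[$property] ?? null;
    }
}

$user1 = new User();
$user1->set('username', 'bob_1');
$user1->set('registration_date', '2010-07-02');
```

Loading a user then only fills the instance values, and the description stays untouched as a kind of prototype.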
From here you have two paths: you can either fall into line and follow the language structure to make your code semi-reusable and follow suggested patterns like MVC (which, if you haven't done so, would do you good), or you can head into a cutting-edge world where these worlds are described and we build frameworks to understand the data about our universes and the things in them. That is an abstract place where at the minute it's hard to be productive, though in the long term it is the path to the future.
Regardless, I hope that in some way that helps you to get a grip of the other responses.
All the best!
Having looked at your question, the answers and your responses; I might be able to help a bit more here (although it's difficult to cover everything in a single answer).
I can see what you are looking to do here, and in all honesty this is how most frameworks start out; making a set of classes to handle everything, then as they are made more reusable they often hit on tried and tested patterns until finally ending up with what I'd say is 'just another framework', they all do pretty much the same thing, in pretty much the same ways, and aim to be as reusable as they can - generally about the only difference between them is coding styles and quality - what they do is pretty much the same for all.
I believe you're hitting on a bit of an anti-pattern in your design here. To explain: you are focussed on making a big chunk of code reusable (the validation, the presentation and so forth), but what you're actually doing (and of course no offence) is making the working code of the application very domain specific. Not only that, but the design you illustrate will make it almost impossible to extend, to change layers (like making a mobile version), or to swap techs (like swapping db vendors). Further still, because you've got presentation and application (and data) tiers mixed together, any designer who hits the app will have to be working in, and changing, your application code. Hit on a time when you have two versions of the app and you've got a big messy problem, tbh.
As with most programming problems, you can solve this by doing three things:
designing a domain model.
specifying and designing interfaces rather than worrying about the implementation.
separating cross cutting concerns
Designing a domain model is a very important part of Class based OO programming, if you've never done it before then now is the ideal time, it doesn't matter whether you do this in a modelling language like UML or just in plain text, the idea is to define all the Entities in your Domain, it's easy to slip in to writing a book when discussing this, but let's keep it simple. Your domain model comprises all the Entities in your application's domain, each Entity is a thing, think User, Address, Article, Product and so forth, each Entity is typically defined as a Class (which is the blueprint of that entity) and each Class has Properties (like username, register_date etc).
class User {
    public $username;
    public $register_date;
}
Often we may keep these as POPOs, however they are often better thought of as Transfer Objects (often called Data Transfer Objects, Value Objects) - a simple Class blueprint for an entity in your domain - normally we try to keep these portable as well, so that they can be implemented in any language, passed between apps, serialized and sent to other apps and similar - this isn't a must, indeed nothing is a must - but it does touch on separation of concerns in that it would normally be naked, implying no functionality, just a blueprint to hold values. Contrast this sharply with Business Objects and Utility Classes that actually 'do' things and are implementations of functionality, not just simple value holders.
Don't be fooled though, both Inheritance and Composition also play their part in domain model, a User may have several Addresses, each Address may be the address of several different Users. A BillingAddress may extend a normal Address and add in additional properties and so forth. (aside: what is a User? do you have a User? do you have a Person with 1-* UserAccounts?).
After you've got your domain model, the next step is usually mapping it to some form of persistence layer (normally a database). Two common, well-defined ways of doing this are to use an ORM (such as Doctrine, which is in Symfony if I remember correctly) or to use the DAO pattern. I'll leave that part there, but typically this is a distinct part of the system. DAO layers have the advantage that you specify all the methods available to work with the persistence layer for each Entity, whilst keeping the implementation abstracted, so you can swap database vendors without changing the application code (or business rules, as many say).
I'm going to head in to a grey area with the next example, as mentioned earlier Transfer Objects (our entities) are typically naked objects, but they are also often a good place to strap on other functionality, you'll see what I mean.
To illustrate Interfaces, you could simply define an Interface for all your Entities which is something like this:
interface Validatable {
    function isValid();
}
then each of your entities can implement this with their own custom validation routine:
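class User implements Validatable {
    public $username;

    public function isValid()
    {
        // custom validation here; this placeholder rule only requires a username
        return is_string($this->username) && $this->username !== '';
    }
}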
Now you don't need to worry about creating some convoluted way of validating objects, you can simply call isValid() on any entity and find out if it's valid or not.
The most important thing to note is that by defining the interface, we've separated some of the concerns, in that no other part of the application needs to do anything to validate an object, all they need to know is that it's Validatable and to call the isValid() method.
However, we have crossed some concerns in that each object (instance of a Class) now carries its own validation rules and model. It may make sense to abstract this out. One easy way of doing this is to make the validation method static, so you could define:
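class User {
    public $username;

    public static function validate(User $user)
    {
        // custom validation here; this placeholder rule only requires a username
        return is_string($user->username) && $user->username !== '';
    }
}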
Or you could move to using getters and setters, this is another very common pattern where you can hide the validation inside the setter, thus ensuring that each property always holds valid data (or null, or default value).
Or perhaps you move the validation into its own library? A Validate class with its own methods, or maybe you just pop it in the DAO layer because you only care about checking something when you save it, or maybe you need to validate when you receive data and when you persist it - how you end up doing it is your call and there is no 'best way'.
The third consideration, which I've already touched on, is separation of concerns - should a persistence layer care how the things it's persisting are presented? Should the business logic care about how things are presented? Should an Entity care where and how it's displayed? Or should the presentation layer care how things are presented? Similarly, we can ask: is there only ever going to be one presentation layer? In one language? What about how a label appears in a sentence? Sure, singular User and Address make sense, but you can't simply add an s to show the lists, because Users is right but Addresss is wrong ;) - we also have working considerations like: do I want a new designer having to change application code just to change the presentation of 'user account' to 'User Account'? Do I even want to change my app code in the classes when that change is asked for?
Finally, and just to throw out everything I've said: you have to ask yourself, what's the job I'm trying to do here? Am I building a big reusable application with potentially many developers and a long life cycle, or would a simple PHP script for each view and action suffice (one that reads $_GET/$_POST, validates, saves to db, then displays what it should or redirects where it should)? In many, if not all, cases this is all that's needed.
Remember, PHP is made to be invoked when a request is made to a web server, then send back a response [end]. That's it. What happens in between is your domain, your job; the client and user typically don't care, and you can sum up what you're trying to do simply: build a script to respond to that request as quickly as possible, with the expected results. That's it, and it needn't be any more complicated than that.
To be blunt, doing everything I mentioned and more is a great thing to do, you'll learn loads, understand your job better etc, but if you just want to get the job out the door and have easy to maintain simple code in the end, just build one script per view, and one per action, with the odd reusable bit (like a http handler, a db class, an email class etc).
You're running into the Model-View-Controller (MVC) architecture.
The M only stores data. No display information, just typed key-value pairs.
The C handles the logic of manipulating this information. It changes the M in response to user input.
The V is the part which handles displaying things. It should be something like Smarty templates rather than a huge amount of raw PHP for generating HTML.
Having it all "in one place" is the wrong approach. You won't have duplicated code with MVC - each part is a distinct step. This improves code reuse, readability, and maintainability.