I am trying to decide between 2 patterns regarding data validation:
1. I try to follow the nominal workflow and catch exceptions thrown by my models and services: unique/foreign key constraint violations, empty fields, invalid arguments, etc. (Note: I only catch exceptions that I know I should.)
pros: Very little code to write in my Controllers and Services: I just have to handle exceptions and translate them into user-understandable messages. The code is very simple and readable.
cons: I need to write specific exceptions, and sometimes that means a lot of different exceptions. I also need to catch and parse generic PDO/Doctrine exceptions for database errors (constraint violations, etc.) and translate them into meaningful exceptions (e.g. DuplicateEntryException). I also can't bypass some validation: let's say an object of my model is marked as locked, so trying to delete it will raise an exception. However, I may want to force its deletion (with a confirmation popup, for example). I won't be able to bypass the exception here.
2. I test and pre-validate everything explicitly with code and DB queries. For example, I'll test that something is not null and is an integer before setting it as an attribute in my model. Or I'll make a DB query to check that I am not going to create a duplicate entry.
pros: No need to write specific exceptions, because I pre-validate everything, so I shouldn't be doing a lot of try/catch anyway. Also, I can bypass some validation if I want to.
cons: Lots of tests and validation to write in the controllers, services and models. I will be performing more queries (the validation part). The DB already does the validation for foreign keys, unique constraints, not-null columns, and so on; I shouldn't ignore that and recode it myself. Also, this leads to very boring code!
I would rather use one pattern or the other, not a mix, in order to keep things as simple as possible.
The first solution seems to me like the best, but I'm afraid it might be some kind of anti-pattern? or maybe behind its theoretical simplicity it is hiding situations very hard to handle?
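For illustration, the translation step of the first pattern could be sketched as follows. This is a minimal sketch, not a definitive implementation: `DuplicateEntryException` is the name used above, while `ConstraintViolationException` and `translateDatabaseException()` are hypothetical names. It assumes the driver reports integrity violations with an SQLSTATE in class 23, and that duplicates show up as SQLSTATE 23505 (PostgreSQL) or driver code 1062 (MySQL); other drivers may need different checks.

```php
<?php
// Hypothetical domain exceptions; only DuplicateEntryException is named in the question.
class ConstraintViolationException extends RuntimeException {}
class DuplicateEntryException extends ConstraintViolationException {}

/**
 * Translate a low-level PDOException into a domain exception.
 * Anything that is not a known integrity violation is returned unchanged,
 * so unknown errors are never swallowed.
 */
function translateDatabaseException(PDOException $e): Throwable
{
    $sqlState   = (string) ($e->errorInfo[0] ?? '');
    $driverCode = $e->errorInfo[1] ?? null;

    // SQLSTATE class 23 means "integrity constraint violation".
    if (strncmp($sqlState, '23', 2) !== 0) {
        return $e;
    }
    // Assumption: 23505 (PostgreSQL) or driver code 1062 (MySQL) signals a duplicate.
    if ($sqlState === '23505' || $driverCode === 1062) {
        return new DuplicateEntryException('This entry already exists.', 0, $e);
    }
    return new ConstraintViolationException('The data violates a database constraint.', 0, $e);
}

// Simulated driver error, built by hand so the sketch runs without a database.
$raw = new PDOException('SQLSTATE[23000]: Duplicate entry');
$raw->errorInfo = ['23000', 1062, "Duplicate entry 'bob' for key 'username'"];

$translated = translateDatabaseException($raw);
echo get_class($translated), "\n";
```

A service would call this inside its `catch (PDOException $e)` block and rethrow the result, so controllers only ever see the domain exceptions.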
I would suggest that data validation should happen at the perimeter of an application. That is to say, any data coming in should be checked to make sure it meets your expectations. Once allowed into the application, it's no longer validated, but it is always escaped according to context (DB, email, etc.). This allows you to keep all of the validation together and avoids the potential duplication of validation work (it's easy to come up with examples where data could be validated twice by two models that both use it). Joe Armstrong promotes this approach in his book on Erlang, and the software he's written for telecom stations runs for years without restarting, so it does seem to work well :)
Additionally, model expectations don't always perfectly line up with the expectations established by a particular interface (maybe the form is only showing a subset of the potential options, or maybe the interface had a dropdown of US states and the model stores states from many different countries, etc.) Sometimes complex interfaces can integrate several different model objects in a manner that enhances the user experience. While nice for the user, the interaction of these models using the exception approach can be very difficult to handle because some of the inputs may be hybrid inputs that neither model alone can validate. You always want to ensure validation matches the expectations of the UI first and foremost, and the second approach allows you to do this in even the most complex interfaces.
Also, exception handling is relatively expensive in terms of cycles. Validation issues can be quite frequent, and I'd try to avoid such an expensive operation for handling issues that have the potential to occur that often.
Last, some validation isn't really necessary for the model, but it's there to prevent attacks. While you can add this to the model, the added functionality can quickly muddy the model code.
So, of these two approaches, I would suggest the second approach because:
You can craft a clear perimeter to your app.
All of the validation is in one place and can be shared.
There's no duplication of validation if two or more models make use of the same input.
The models can focus on what they're good at: mapping knowledge of abstract entities to application state.
Even the most complex UIs can be appropriately validated.
Preemption likely will be more efficient.
Security-focused validation tasks that don't really belong in any model can be cleanly added to the app.
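As a rough sketch of what validation at the perimeter could look like: the function below checks raw request data once, before anything reaches the models, and returns all problems at once. The function name, field names, and the list of allowed states are assumptions made up for this example.

```php
<?php
/**
 * Validate raw request data at the application boundary.
 * Returns error messages keyed by field; an empty array means the input is valid.
 * Field names and rules are illustrative assumptions.
 */
function validateSignupInput(array $input): array
{
    $errors = [];

    $email = $input['email'] ?? '';
    if (!is_string($email) || filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
        $errors['email'] = 'Please enter a valid e-mail address.';
    }

    // The form only offers a subset of what the model could store,
    // so the check matches the UI rather than the model.
    $state = $input['state'] ?? '';
    if (!in_array($state, ['CA', 'NY', 'TX'], true)) {
        $errors['state'] = 'Please pick a state from the list.';
    }

    $age = $input['age'] ?? null;
    $range = ['options' => ['min_range' => 0, 'max_range' => 150]];
    if (filter_var($age, FILTER_VALIDATE_INT, $range) === false) {
        $errors['age'] = 'Age must be a whole number between 0 and 150.';
    }

    return $errors;
}

$errors = validateSignupInput(['email' => 'not-an-email', 'state' => 'CA', 'age' => '42']);
print_r($errors);
```

Because the function returns a list instead of throwing, frequent user mistakes never go through exception handling, and the UI can show every problem in one round trip.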
In my application I write to a read model table (think CQRS) at certain times. While doing so, I also have to remove older read models. So at any given point I need to:
Remove entities a,b,c
Persist entities x,y,z
In order to maintain a valid read model throughout the lifecycle, I would like to encapsulate this process in a single transaction. Doctrine does provide the necessary means.
However, I also must guarantee that no other entities are being flushed in the process. Sadly, calling Doctrine's $em->getConnection()->commit(); seems to flush the whole unit of work. But according to the docs I have to call that to finalise my transaction.
I cannot introduce a second entity manager to take care of only my read model entities, as they are in the same namespace as the other entities and apparently that is not the way the doctrine-orm-bundle is supposed to be used.
The only other approach I see is to work on a lower level and circumvent the EntityManager and UnitOfWork completely, but I would like to guarantee transactional integrity and do not see a way to do so without the EM/UoW.
TLDR: The way your application works might warrant that update concerns are completely separable into two independent sets, but that's unusual and fragile (usually it is not even true at the time of making the assertion). The proper way to model that is to use a separate EntityManager for each of the sets and to guard their interconnections manually, so that the semantics of "how they are independent" are coded into the system.
Details:
If I understand the question correctly, you are facing a design flaw here. You have two sets of updates (one for the read models and one for the other entities) and you mix them (because you want both in the same entity manager) while you also want to separate them (by the means of a separate transaction). In general, this won't work.
For example, suppose a non-read-model entity instance A has just been created in memory (so it has no ID yet) and, based on this, you decide to reference it from read-model entity instance R. The R->A relationship is valid in memory, but now you expect to be able to flush only read model entities and not others. That is, when you try to persist+flush R, it will reference a non-existing foreign key and your RDBMS will hopefully fail the transaction. On a high level, this is because a connected in-memory graph should be consistent data in its entirety, and when you try to split its valid changes into subsets, you're implicitly rearranging the order of those changes, which may introduce temporary inconsistency, and then your transaction commit might just be at such a point.
Of course, it may happen that you know some rule why such a thing will never happen and why consistent state of each set is warranted in a fashion that is completely independent from the other set. But then you need to write your code reflecting that separation; the way you can do that is to use two entity managers. In that case your code will clearly cope with the two distinct transactions, their separation and how exactly that is consistent from both sides. But even in this case, to avoid clashes of updates, you probably need to have rules outlining a one-way visibility between the two sets, which will also imply an order to committing transactions. This is because transactions in a connected graph can be nested, but not "only overlap", because at each transaction commit you are asking the ORM to sync the consistent in-memory data of the transaction scope to the RDBMS.
I know you mentioned that you do not want to use two EMs, because read-model entities "are in the same namespace as the other entities and apparently that is not the way the doctrine-orm-bundle is supposed to be used".
The namespace does not really matter. You can use them separately in the two managers. You can even interconnect them, if you a) properly merge() them and b) cater for the above mentioned consistency you need to provide for both EM's transactions (because now they are working on one connected graph).
You should elaborate what exactly you refer to by saying "that is not the way the doctrine-orm-bundle is supposed to be used" -- probably there's an error in the original suggestion or something wrong with the way that suggestion is applied to this problem.
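If the read-model writes really are independent of the rest of the graph, the lower-level route mentioned in the question can still be transactional. The following is only a sketch under stated assumptions: it uses plain PDO with an in-memory SQLite database instead of Doctrine, and the `read_model` table and the `replaceReadModels()` function are hypothetical. It shows the "remove a, b, c and persist x, y, z" step as one atomic unit that never touches an ORM unit of work.

```php
<?php
/**
 * Replace read-model rows atomically: either all deletes and inserts
 * are applied, or none are. Table and column names are assumptions.
 */
function replaceReadModels(PDO $pdo, array $idsToRemove, array $rowsToPersist): void
{
    $pdo->beginTransaction();
    try {
        $delete = $pdo->prepare('DELETE FROM read_model WHERE id = ?');
        foreach ($idsToRemove as $id) {
            $delete->execute([$id]);
        }

        $insert = $pdo->prepare('INSERT INTO read_model (id, payload) VALUES (?, ?)');
        foreach ($rowsToPersist as $id => $payload) {
            $insert->execute([$id, $payload]);
        }

        $pdo->commit();
    } catch (Throwable $e) {
        // Undo every statement of this batch, then let the caller see the error.
        $pdo->rollBack();
        throw $e;
    }
}

// Demo setup: an in-memory SQLite database stands in for the real one.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE read_model (id TEXT PRIMARY KEY, payload TEXT NOT NULL)');
$pdo->exec("INSERT INTO read_model VALUES ('a', 'old'), ('b', 'old'), ('c', 'old')");

replaceReadModels($pdo, ['a', 'b', 'c'], ['x' => 'new', 'y' => 'new', 'z' => 'new']);
```

The consistency caveat from above still applies: this is only safe if nothing in the read-model rows depends on entities that are still unflushed in an EntityManager.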
I started playing with DDD recently. Today I'm having a problem with placing validation logic in my application. I'm not sure which layer I should pick. I searched the internet and couldn't find a unified solution that solves my problem.
Let's consider the following example. User entity is represented by ValueObjects such as id (UUID), age and e-mail address.
final class User
{
    /**
     * @var \UserId
     */
    private $userId;

    /**
     * @var \DateTimeImmutable
     */
    private $dateOfBirth;

    /**
     * @var \EmailAddress
     */
    private $emailAddress;

    /**
     * User constructor.
     * @param UserId $userId
     * @param DateTimeImmutable $dateOfBirth
     * @param EmailAddress $emailAddress
     */
    public function __construct(UserId $userId, DateTimeImmutable $dateOfBirth, EmailAddress $emailAddress)
    {
        $this->userId = $userId;
        $this->dateOfBirth = $dateOfBirth;
        $this->emailAddress = $emailAddress;
    }
}
Non business logic related validation is performed by ValueObjects. And it's fine.
I'm having a trouble placing business logic rules validation.
What if, let's say, we would need to let Users have their own e-mail address only if they are 18+?
We would have to check the age for today, and throw an Exception if it's not ok.
Where should I put it?
Entity - check it while creating User entity, in the constructor?
Command - check it while performing Insert/Update/whatever command? I'm using tactician in my project, so should it be a job for
Command
Command Handler
Where to place validators responsible for checking data with the repository?
Like email uniqueness. I read about the Specification pattern. Is it ok, if I use it directly in Command Handler?
And last, but not least.
How to integrate it with UI validation?
All of the stuff I described above, is about validation at domain-level. But let's consider performing commands from REST server handler. My REST API client expects me to return a full information about what went wrong in case of input data errors. I would like to return a list of fields with error description.
I can actually wrap all the command preparation in a try block and listen for Validation-type exceptions, but the main problem is that it would only give me information about a single error, the one raised by the first exception.
Does it mean that I have to duplicate my validation logic at controller level (i.e. with zend-inputfilter - I'm using ZF2/3)? It sounds inconsistent...
Thank you in advance.
I will try to answer your questions one by one and additionally give my two cents here and there and how I would solve the problems.
Non business logic related validation is performed by ValueObjects
Actually ValueObjects represent concepts from your business domain, so these validations are actually business logic validations too.
Entity - check it while creating User entity, in the constructor?
Yes, in my opinion you should try to add this kind of behavior as deep down in the Aggregates as you can. If you put it into Commands or Command Handlers, you lose cohesiveness and business logic leaks out into the Application layer. And I would even go further: ask yourself whether there are hidden concepts within your model that are not made explicit. In your case that is an AdultUser and an UnderagedUser (they could both implement a UserInterface) that actually have different behavior. In these cases I always strive to model this explicitly.
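A minimal sketch of enforcing the 18+ rule inside the entity itself might look like this. It is simplified on purpose: `UserId` is left out, `EmailAddress` is reduced to a bare format check, and `UnderageUserCannotHaveEmail`, `hasEmailAddress()` and the injected `$today` parameter are hypothetical additions. Passing the current date in explicitly is an assumption made so the rule can be tested with a fixed date.

```php
<?php
// Simplified value object: only checks the format of the address.
final class EmailAddress
{
    private $value;

    public function __construct(string $value)
    {
        if (filter_var($value, FILTER_VALIDATE_EMAIL) === false) {
            throw new InvalidArgumentException("Invalid e-mail address: $value");
        }
        $this->value = $value;
    }
}

// Hypothetical domain exception for the business rule.
final class UnderageUserCannotHaveEmail extends DomainException {}

final class User
{
    private const MINIMUM_AGE_FOR_EMAIL = 18;

    private $dateOfBirth;
    private $emailAddress;

    // $today is injected so the rule does not depend on the system clock.
    public function __construct(
        DateTimeImmutable $dateOfBirth,
        ?EmailAddress $emailAddress,
        DateTimeImmutable $today
    ) {
        // Whole years between the date of birth and "today".
        $age = $dateOfBirth->diff($today)->y;
        if ($emailAddress !== null && $age < self::MINIMUM_AGE_FOR_EMAIL) {
            throw new UnderageUserCannotHaveEmail('Users under 18 cannot have an e-mail address.');
        }
        $this->dateOfBirth = $dateOfBirth;
        $this->emailAddress = $emailAddress;
    }

    public function hasEmailAddress(): bool
    {
        return $this->emailAddress !== null;
    }
}

$today = new DateTimeImmutable('2024-06-01');
$adult = new User(new DateTimeImmutable('2000-01-01'), new EmailAddress('adult@example.com'), $today);
```

Splitting this into explicit AdultUser and UnderagedUser classes, as suggested above, would move the age check into a factory and remove the nullable e-mail from the adult type.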
Like email uniqueness. I read about the Specification pattern. Is it ok, if I use it directly in Command Handler?
The Specification pattern is nice if you want to be able to combine complex queries with logical operators (especially for the Read Model). In your case I think it is overkill. Adding a simple containsUserForEmail($emailValueObject) method to the UserRepositoryInterface and calling it from the Use Case is fine.
<?php
$userRepository
->containsUserForEmail($emailValueObject)
->hasOrThrow(new EmailIsAlreadyRegistered($emailValueObject));
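Spelled out without the fluent helper, the same idea could be sketched like this. The in-memory repository, the `add()` method, the `registerUser()` use case, and the use of plain strings instead of the e-mail value object are assumptions made to keep the example self-contained.

```php
<?php
interface UserRepositoryInterface
{
    public function containsUserForEmail(string $email): bool;

    public function add(string $email): void;
}

// In-memory stand-in for a database-backed repository (assumption for the demo).
final class InMemoryUserRepository implements UserRepositoryInterface
{
    private $emails = [];

    public function containsUserForEmail(string $email): bool
    {
        return isset($this->emails[strtolower($email)]);
    }

    public function add(string $email): void
    {
        $this->emails[strtolower($email)] = true;
    }
}

final class EmailIsAlreadyRegistered extends DomainException {}

// Hypothetical use case: check uniqueness first, then store the user.
function registerUser(UserRepositoryInterface $users, string $email): void
{
    if ($users->containsUserForEmail($email)) {
        throw new EmailIsAlreadyRegistered("E-mail already registered: $email");
    }
    $users->add($email);
}

$users = new InMemoryUserRepository();
registerUser($users, 'bob@example.com');
```

Note that a check followed by an insert is not atomic: two concurrent requests can both pass the check, so a unique constraint in the database is still needed as the final safeguard.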
How to integrate it with UI validation?
So first of all there already should be client side validation for the fields in question. Make it easy to use your system in the right way and hard to use it in the wrong way.
Of course there still needs to be server side validation. We currently use the schema validation approach where we have a central schema registry from which we fetch a schema for a given payload and then can validate JSON payloads against that JSON Schema. If it fails we return a serialized ValidationErrors object. We also tell the client via the Content-Type: application/json; profile=https://some.schema.url/v1/user# header how it can build a valid payload.
You can find some nice articles on how to build a RESTful API on top of a CQRS architecture here and here.
Just to expand on what tPl0ch said, as I have found helpful... While I have not been in the PHP stack in many years, this largely is theoretical discussion, anyhow.
One of the larger problems faced in practical applications of DDD is that of validation. Traditional logic would dictate that validation has to live somewhere, when really it should live everywhere. What has probably tripped people up more than anything when applying this to DDD is the requirement that a domain is never "in an invalid state". CQRS has gone a long way to address this, and you are using commands.
Personally, the way that I do this, is that commands are the only way to alter state. Even if I require the creation of a domain service for a complex operation, it is the commands which will do the work. A traditional command handler will dispatch a command against an aggregate and put the aggregate into a transitional state. All of this is fairly standard, but I additionally delegate the responsibility of validation of the transition to the commands themselves, as they already encompass business logic, as well. If I am creating a new Account, for example, and I require a first name, last name, and email address, I should be validating that as being present in the command, before it ever is attempted to be applied to the aggregate through the command handler. As such, each of my command handlers have not just awareness of the command, but also a command validator.
This validator ensures that the state of the command will not compromise the domain, which allows me to validate the command itself, and at a point where I do not incur additional cost related to having to validate somewhere in the infrastructure or implementation. Since the only way that I have to mutate state is solely in the commands, I do not put any of that logic directly into the domain objects themselves. That is not to say that the domain model is anemic, far from it, actually. There is an assumption that if you are not validating in the domain objects themselves, the domain immediately becomes anemic. But the aggregate needs to expose the means to set these values - generally through a method - and the command is translated to provide these values to that method. One of the semi-common approaches that you see is that logic is put into the property setters, but since you are only setting a single property at a time, you could more easily leave the aggregate in an invalid state. If you look at the command as being validated for the purpose of mutating that state as a single operation, you see that the command is a logical extension of the aggregate (and from a code organizational standpoint, lives very near, if not under, the aggregate).
Since I am only dealing with command validation at that point, I generally will have persistence validation as well. Essentially, right before the aggregate is persisted, the entire state of the aggregate will be validated at once. The ultimate goal is to get a command to persist, so that means that I will have a single persistence validator per aggregate, but as many command validators as I have commands. That single persistence validator will provide the final check that the command has not mutated the aggregate in a way that violates the overarching domain concerns. It will also have awareness that a single aggregate can have multiple valid transitional states, which is something not easily caught in a command. By multiple states, I mean that the aggregate may be valid for persistence as an "insert", but perhaps is not valid for an "update" operation. The easiest example of that would be that I could not update or delete an aggregate which has not been persisted.
All of these can be surfaced to the UI, in my own implementation. The UI will hand the data to an application service, the application service will create the command, and it will invoke a "Validate" method on my handler which will return any validation failures within the command without executing it. If validation errors are present, the application service can yield to the controller, returning any validation errors that it finds, and allow them to surface up. Additionally, pre-submit, the data can be sent in, follow the same path for validation, and return those validation errors without physically submitting the data. It is the best of both worlds. Command violations can happen often, if the user is providing invalid input. Persistence violations, on the other hand, should happen rarely, if ever at all, outside of testing. It would imply that a command is mutating state in a way that is not supported by the domain.
Finally, post-validation of a command, the application service can execute it. The way that I have built my own infrastructure is that the command handler is aware of whether the command was validated immediately before execution. If it was not, the command handler will execute the same validation that is exposed by the "Validate" method. The difference, however, is that it will be surfaced as an exception. The goal at this point is to halt execution, as an invalid command cannot enter the domain.
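Translated to PHP, the validate-then-execute flow described above could be sketched roughly as follows. All class and method names are hypothetical, and there is one deliberate simplification: this handler always re-validates in `handle()` instead of tracking whether validation already ran.

```php
<?php
// Hypothetical command carrying the data for a new account.
final class CreateAccount
{
    public $firstName;
    public $lastName;
    public $email;

    public function __construct(string $firstName, string $lastName, string $email)
    {
        $this->firstName = $firstName;
        $this->lastName = $lastName;
        $this->email = $email;
    }
}

// Thrown when an invalid command is executed; carries every failure.
final class InvalidCommand extends DomainException
{
    private $errors;

    public function __construct(array $errors)
    {
        parent::__construct('Command failed validation.');
        $this->errors = $errors;
    }

    public function errors(): array
    {
        return $this->errors;
    }
}

final class CreateAccountHandler
{
    private $created = [];

    /** Returns all validation failures keyed by field, without executing anything. */
    public function validate(CreateAccount $command): array
    {
        $errors = [];
        if (trim($command->firstName) === '') {
            $errors['firstName'] = 'First name is required.';
        }
        if (trim($command->lastName) === '') {
            $errors['lastName'] = 'Last name is required.';
        }
        if (filter_var($command->email, FILTER_VALIDATE_EMAIL) === false) {
            $errors['email'] = 'A valid e-mail address is required.';
        }
        return $errors;
    }

    /** Executes the command; an invalid command is halted with an exception. */
    public function handle(CreateAccount $command): void
    {
        $errors = $this->validate($command);
        if ($errors !== []) {
            throw new InvalidCommand($errors);
        }
        // Stand-in for applying the command to the aggregate.
        $this->created[] = $command->email;
    }

    public function createdCount(): int
    {
        return count($this->created);
    }
}

$handler = new CreateAccountHandler();
$failures = $handler->validate(new CreateAccount('', 'Doe', 'nope'));
print_r($failures);
```

An application service would call `validate()` first and return the list of failures to the REST client, which also answers the earlier concern about only seeing the first error.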
Although the samples are in Java (again, not my platform of choice), I highly recommend Vaughn Vernon's "Implementing Domain-Driven Design". It really pulls a lot of the concepts in the Evans' material together with the advances in the DDD paradigm, such as CQRS+ES. At least for me, the material in Vernon's book, which is also a part of the "DDD Series" of books, changed the way I fundamentally approach DDD as much as the Blue Book introduced me to it.
In input of my application I have the following data: airplane_id, airport_id and passenger(s) details.
I need to make sure that the selected airplane_id can reach airport_id. This can only be done with the help of a SQL query, but this check is still a validation process, isn't it?
Validation should happen before I will save passenger(s) details.
In my application model, it is the ActiveRecord pattern object which represent a table. I would rather make Validator as a separated layer than to build it into the Model layer. But in this case I have an extra issue: usually Validators are general (their rules might be applied to any set of data). For instance is this data email? or IP? or date? etc.... but never mind what the data is.
In my case, the mentioned rule won't be common at all; it will definitely be a specific rule, which can't be used by any other input data. So my question is: Is this checking still part of the validation process?
And if yes, will Validator violate the S principle from the set of SOLID?
It is validation and you should use a separate validation layer (single responsibility for input validation). Input validation isn't just data type checking, it can be much more complex. Model validation might still be needed though.
Think of input validation as whitelist validation (“accept known good”) and model validation as blacklist validation (“reject known bad”). Whitelist validation is more secure while blacklist validation prevents your model layer from being overly constrained to very specific use cases.
Invalid model data should always cause an exception to be thrown (otherwise the application can continue running without noticing the mistake) while invalid input values coming from external sources are not unexpected, but rather common (unless you got users that never make mistakes).
See also: https://lastzero.net/2015/11/form-validation-vs-model-validation/
Yes, these checks are validation.
Speaking from experience with an MVC framework (Yii/2), I would say that you could make an abstract validator class, extend it into your concrete validators, and call those validators from the model class. This will need a Model->validate() call, but having separate classes that actually check the data will not violate the S in SOLID, while Model->validate() will just loop through the validators' validate methods and store the error messages in an array.
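A framework-free sketch of that structure, using the airplane/airport rule from the question, could look like this. It does not use Yii's actual validator API; every class name here is hypothetical, and the reachability lookup is injected as a callable so that a SQL query can be plugged in later.

```php
<?php
abstract class Validator
{
    /** Returns an error message, or null when the data passes. */
    abstract public function validate(array $data): ?string;
}

// Application-specific rule; the lookup is injected and could run a SQL query.
final class AirplaneCanReachAirport extends Validator
{
    private $canReach;

    public function __construct(callable $canReach)
    {
        $this->canReach = $canReach;
    }

    public function validate(array $data): ?string
    {
        $airplaneId = $data['airplane_id'] ?? null;
        $airportId = $data['airport_id'] ?? null;
        if ($airplaneId === null || $airportId === null) {
            return 'Airplane and airport are required.';
        }
        return ($this->canReach)($airplaneId, $airportId)
            ? null
            : 'The selected airplane cannot reach this airport.';
    }
}

// Generic rule that knows nothing about airplanes.
final class PassengersPresent extends Validator
{
    public function validate(array $data): ?string
    {
        return empty($data['passengers']) ? 'At least one passenger is required.' : null;
    }
}

final class Booking
{
    private $data;
    private $validators;
    private $errors = [];

    public function __construct(array $data, array $validators)
    {
        $this->data = $data;
        $this->validators = $validators;
    }

    /** Runs every validator and collects the messages. */
    public function validate(): bool
    {
        $this->errors = [];
        foreach ($this->validators as $validator) {
            $message = $validator->validate($this->data);
            if ($message !== null) {
                $this->errors[] = $message;
            }
        }
        return $this->errors === [];
    }

    public function errors(): array
    {
        return $this->errors;
    }
}

// Hard-coded route table standing in for the database lookup.
$routes = ['A320' => ['LHR', 'CDG']];
$canReach = function ($airplaneId, $airportId) use ($routes) {
    return in_array($airportId, $routes[$airplaneId] ?? [], true);
};

$booking = new Booking(
    ['airplane_id' => 'A320', 'airport_id' => 'JFK', 'passengers' => ['Ann']],
    [new AirplaneCanReachAirport($canReach), new PassengersPresent()]
);
$isValid = $booking->validate();
```

Each validator class has exactly one reason to change, which is the sense in which this layout keeps the S in SOLID intact.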
I am working on a PHP project which makes extensive use of the MVC design pattern. I am looking to add validation to a form and am curious as to what the right place for validation is.
Due to the way that forms are generated, validation on postback data is a lot simpler and less repetitive in view components. Is it acceptable to have the view validating response data, or should this be implemented within the controller, or even the model?
What are the benefits?
The right place for validation is the Model.
This makes most sense because you are doing validation on the data, which is what the model represents. In terms of the CRUD updates, the model should always be used somehow.
If you are changing data from the view, you should have validations being checked.
If you have controllers changing data, you should have validations being checked.
And finally, if you have the model itself changing data, you should still have validations.
The only way to achieve this state is to have the validation go into the model.
Due to performance and faster response, after implementing the validations in the model, you should try to add some sort of client side(JS) to immediately notify the end user.
Validation is always about the data. Why are you validating data? So you can keep the integrity of the information you're storing. Having the validations at the model level allows data to theoretically be always correct. This is always a necessity. From there you can add extra validations in your business logic and on the client side to make your application more user friendly.
If you're validating the data on the client side (i.e. JavaScript validation), which is absolutely not enough and not secure at all, you should implement it in the View.
If you're validating data on the server side, and your validation does not require application business logic (i.e. you're not checking to see if the user has enough credit in his account), you should validate in the controller.
If the validation requires business logic, implement it inside the model and call it via the controller.
Postback validation is not good since it adds a lot of load and delay, and the only advantage is to the programmer (which should not count).
You can use regex for most validation, and it has (almost) the same syntax in PHP and JS.
Validation in the model seems to be the most common approach (you end up with something like $obj->isValid()) and this is suitable in many situations.
However, depending on your use case there may be good reasons to perform validation outside the model, either using separate validation code or in the controller, etc.:
If much of the overall validation problem involves information not accessible to the model (for example, if an admin user can perform transformations that a regular user cannot, or certain properties cannot be changed after a certain date), then you might want to check all these constraints in the same place.
It may also be convenient or necessary to apply very lax validation rules when constructing objects for tests. (A "shopping basket" object might ordinarily require an associated user, who in turn requires a valid email address, etc. A 100% valid shopping basket object might be inconvenient to construct in shopping basket unit tests.)
For historical reasons, validation rules might change (e.g. enforcing a "gender" where previously none was necessary) and so you may end up with different versions of data that need to be treated differently. (Different validation rules may also apply to bulk data import.)
If validation is very complex, you might want to provide different error messages (or none at all) depending upon what's most useful to the caller. In other situations, true or false might be all that is necessary.
It may be possible to handle these different use cases via arguments to the model's isValid() method, but this becomes increasingly unwieldy as the number of validation styles increases. (And I do think it's almost guaranteed that a single "one size fits all" isValid() method will eventually prove insufficient for most non-trivial projects.)
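One way to avoid an ever-growing isValid() signature is to move each validation style into its own small class. The sketch below uses the shopping basket example from above; the interface, both validators, and their rules are assumptions chosen for illustration.

```php
<?php
// Deliberately dumb data holder: it can be built in any state, e.g. for tests.
final class Basket
{
    public $userEmail;
    public $items;

    public function __construct(?string $userEmail, array $items)
    {
        $this->userEmail = $userEmail;
        $this->items = $items;
    }
}

interface BasketValidator
{
    /** Returns a list of error messages; empty means valid in this context. */
    public function errors(Basket $basket): array;
}

// Strict rules for the normal checkout flow.
final class CheckoutValidator implements BasketValidator
{
    public function errors(Basket $basket): array
    {
        $errors = [];
        if ($basket->items === []) {
            $errors[] = 'The basket is empty.';
        }
        if ($basket->userEmail === null
            || filter_var($basket->userEmail, FILTER_VALIDATE_EMAIL) === false) {
            $errors[] = 'A user with a valid e-mail address is required.';
        }
        return $errors;
    }
}

// Lax rules for bulk import of historical data that predates the e-mail requirement.
final class ImportValidator implements BasketValidator
{
    public function errors(Basket $basket): array
    {
        return $basket->items === [] ? ['The basket is empty.'] : [];
    }
}

$legacyBasket = new Basket(null, ['sku-1']);
$checkoutErrors = (new CheckoutValidator())->errors($legacyBasket);
$importErrors = (new ImportValidator())->errors($legacyBasket);
```

The caller picks the validator that matches its situation, so adding a new validation style means adding a class rather than another argument to the model.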
Don't confuse sanitizing or cleaning the posted values with validation. Fetch the posted values in the Controller and scrub them by removing any malicious elements. Then send the data to the Model to be validated against the expected values or format. Splitting these actions into two steps reduces the risk of malicious code getting through. This method works well if you follow a "trust no input" policy, knowing that some programmers can become sloppy or lazy. Another benefit is that it keeps your Model from becoming bloated and overworked; if that happens anyway, use a model helper to do the dirty work. This approach will also help balance your application load and improve performance.
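The two steps could be sketched as two separate functions, one called from the controller and one owned by the model. Both function names are hypothetical, and the username pattern is only an example of an "expected format".

```php
<?php
/**
 * Controller-side step: scrub the raw value.
 * Non-string input is treated as empty; control characters and
 * surrounding whitespace are removed.
 */
function sanitizeUsername($raw): string
{
    $value = is_string($raw) ? $raw : '';
    $value = (string) preg_replace('/[\x00-\x1F\x7F]/', '', $value);
    return trim($value);
}

/** Model-side step: check the cleaned value against the expected format. */
function isValidUsername(string $username): bool
{
    return preg_match('/^[0-9a-z_]{4,64}$/i', $username) === 1;
}

$clean = sanitizeUsername("  joe_90\x00\n");
$accepted = isValidUsername($clean);
```

Sanitizing never replaces validating: a scrubbed value such as `<script>` is still rejected by the second step because it does not match the expected format.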
I'm currently rebuilding an admin application and looking for your recommendations for best-practice! Excuse me if I don't have the right terminology, but how should I go about the following?
Take the example of "users" - typically we can create a class with properties like 'name', 'username', 'password', etc. and make some methods like getUser($user_ID), getAllUsers(), etc. In the end, we end up with an array/arrays of name-value pairs like; array('name' => 'Joe Bloggs', 'username' => 'joe_90', 'password' => '123456', etc).
The problem is that I want this object to know more about each of its properties.
Consider "username" - in addition to knowing its value, I want the object to know things like; which text label should display beside the control on the form, which regex I should use when validating, what error message is appropriate? These things seem to belong in the model.
The more I work on the problem, the more I see other things too; which HTML element should be used to display this property, what are minimum/maximum values for properties like 'registration_date'?
I envisaged the class looking something like this (simplified):
class User {
    ...etc...
    private $model = array();
    ...etc...
    function __construct(){
        ...etc...
        $this->model['username']['value'] = NULL; // A default value used for new objects.
        $this->model['username']['label'] = dictionary::lookup('username'); // Displayed on the HTML form. Actual string comes from a translation database.
        $this->model['username']['regex'] = '/^[0-9a-z_]{4,64}$/i'; // Used for both client-side validation and backend validation/sanitising.
        $this->model['username']['HTML'] = 'text'; // Which type of HTML control should be used to interact with this property.
        ...etc...
        $this->model['registration_date']['value'] = 'now'; // Default value
        $this->model['registration_date']['label'] = dictionary::lookup('registration_date');
        $this->model['registration_date']['minimum'] = '2007-06-05'; // These values could be set by a permissions/override object.
        $this->model['registration_date']['maximum'] = '+1 week';
        $this->model['registration_date']['HTML'] = 'datepicker';
        ...etc...
    }
    ...etc...
    function getUser($user_ID){
        ...etc...
        // getUser pulls the real data from the database and overwrites the default value for that property.
        return $this->model;
    }
}
Basically, I want this info to be in one location so that I don't have to duplicate code for HTML markup, validation routines, etc. The idea is that I can feed a user array into an HTML form helper and have it automatically create the form, controls and JavaScript validation.
I could then use the same object in the backend with a generic set($data = array(), $model = array()) method to avoid having individual methods like setUsername($username), setRegistrationDate($registration_date), etc...
Does this seem like a sensible approach?
What would you call value, label, regex, etc? Properties of properties? Attributes?
Using $this->model in getUser() means that the object model is overwritten, whereas it would be nicer to keep the model as a prototype and have getUser() inherit the properties.
Am I missing some industry-standard way of doing this? (I have been through all the frameworks - example models are always lacking!!!)
How does it scale when, for example, I want to display user types with a SELECT with values from another model?
Thanks!
Update
I've since learned that Java has class annotations - http://en.wikipedia.org/wiki/Java_annotations - which seem to be more or less what I was asking. I found this post - http://interfacelab.com/metadataattributes-in-php - does anyone have any insight into programming like this?
You're on the right track there. When it comes to models I think there are many approaches, and the "correct" one usually depends on your type of application.
Your model can be directly an Active Record, maybe a table row data gateway or a "POPO", plain old PHP object (in other words, a class that doesn't implement any specific pattern).
Whichever you decide works best for you, things like validation etc. can be put into the model class. You should be able to work with your users as User objects, not as associative arrays - that is the main thing.
Does this seem like a sensible approach
Yes, besides the form label thing. It's probably best to have a separate source for data such as form labels, because you may eventually want to be able to localize them. Also, the label isn't actually related to the user object - it's related to displaying a form.
How I would approach this (suggestion)
I would have a User object which represents a single user. It should be possible to create an empty user or create it from an array (so that it's easy to create one from a database result for example). The user object should also be able to validate itself, for example, you could give it a method "isValid", which when called will check all values for validity.
I would additionally have a user repository class (or perhaps just some static methods on the User class) which could be used to fetch users from the database and store them back. This repository would directly return user objects when fetching, and accept user objects as parameters for saving.
When it comes to forms, you could have a form class which takes a user object. It could then automatically get values from the user and use the user to validate itself as well.
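To make the suggestion more concrete, here is a small sketch of a User object that can be built from an array and can validate itself, plus a repository that only deals in User objects. The array-backed repository, the method names other than isValid, and the validation rules are assumptions for the example.

```php
<?php
final class User
{
    private $name;
    private $username;

    public function __construct(string $name = '', string $username = '')
    {
        $this->name = $name;
        $this->username = $username;
    }

    /** Build a user from a database row or any other associative array. */
    public static function fromArray(array $row): self
    {
        return new self((string) ($row['name'] ?? ''), (string) ($row['username'] ?? ''));
    }

    public function toArray(): array
    {
        return ['name' => $this->name, 'username' => $this->username];
    }

    /** Checks all values; the rules here are illustrative. */
    public function isValid(): bool
    {
        return $this->name !== ''
            && preg_match('/^[0-9a-z_]{4,64}$/i', $this->username) === 1;
    }
}

// Array-backed stand-in for a repository that would normally talk to the database.
final class UserRepository
{
    private $rows;

    public function __construct(array $rows = [])
    {
        $this->rows = $rows;
    }

    public function find(int $id): ?User
    {
        return isset($this->rows[$id]) ? User::fromArray($this->rows[$id]) : null;
    }

    public function save(int $id, User $user): void
    {
        if (!$user->isValid()) {
            throw new InvalidArgumentException('Refusing to save an invalid user.');
        }
        $this->rows[$id] = $user->toArray();
    }
}

$repository = new UserRepository([1 => ['name' => 'Joe Bloggs', 'username' => 'joe_90']]);
$joe = $repository->find(1);
```

Callers work with `User` objects throughout, and the associative arrays only appear at the boundary with storage.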
I have written on this topic a bit here: http://codeutopia.net/blog/2009/02/28/creating-a-simple-abstract-model-to-reduce-boilerplate-code/ and also in some other posts linked at the end of that one.
Hope this helps. I'd just like to remind you that my approach is not perfect either =)
An abstract response for you which quite possibly won't help at all, but I'm happy to get the down votes as it's worth saying :)
You're dealing with two different models here. In some worlds we call these Class and Instance, in others we talk of Classes and Individuals, and in others still we make distinctions between T-Box and A-Box statements.
You are dealing with two sets of data here, I'll write them out in plain text:
User a Class .
username a Property;
    domain User;
    range String .
registration_date a Property;
    domain User;
    range Date .
This is your Class data: T-Box statements, blueprints, how you describe the universe that is your application. It is not the description of the 'things' in your universe; rather, you use it to describe the things in your universe, your instance data. So you then have:
user1 a User ;
    username "bob";
    registration_date "2010-07-02" .
which is your Instance, Individual, A-Box data, the things in your universe.
You may notice here that all the other things you are wondering how to do (validation, adding labels to properties and so forth) come under the first grouping: things that describe your universe, not the things in it. So that's where you'd want to add them. Again in plain text:
username a Property;
    domain User;
    range String;
    title "Username";
    validation [ type Regex; value '/^[0-9a-z_]{4,64}$/i' ] .
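If you wanted to express the same split in PHP, one possible sketch is to keep the property descriptions as class-level data and check instance data against them. The UserMeta name, the array layout and the registration_date title and pattern are assumptions made for illustration:

```php
<?php
// Hypothetical sketch: the metadata layout and helper names are assumptions.
class UserMeta
{
    // Class-level (T-Box) data: describes the properties, not any one user.
    public static $properties = array(
        'username' => array(
            'title'      => 'Username',
            'validation' => '/^[0-9a-z_]{4,64}$/i',
        ),
        'registration_date' => array(
            'title'      => 'Registration date',      // assumed label
            'validation' => '/^\d{4}-\d{2}-\d{2}$/',  // assumed pattern
        ),
    );

    // Checks instance (A-Box) data against the class-level description
    // and returns the names of the properties that failed.
    public static function validate(array $instance)
    {
        $invalid = array();
        foreach (self::$properties as $name => $meta) {
            $value = isset($instance[$name]) ? (string) $instance[$name] : '';
            if (preg_match($meta['validation'], $value) !== 1) {
                $invalid[] = $name;
            }
        }
        return $invalid;
    }
}

$user1 = array('username' => 'bob', 'registration_date' => '2010-07-02');
$user2 = array('username' => 'robert', 'registration_date' => '2010-07-02');

$invalid1 = UserMeta::validate($user1);
$invalid2 = UserMeta::validate($user2);
```

Note that "bob" is reported as invalid here, because the pattern asks for at least four characters, while "robert" passes.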
The point of all this is to help you analyse the other answers you get. You'll notice that in your suggestion you munged these two distinct sets of data together, and in a way that's a good thing: from it you can hopefully see that classes in PHP typically take on the role of Classes (unsurprisingly), while each object (an instance of a class) holds the individual instance data. However, you've started to merge these two parts of your universe together to try to make one big reusable set of classes outside of the PHP language constructs that are provided.
From here you have two paths. You can either fall into line and follow the language structure to make your code semi-reusable, following suggested patterns like MVC (which, if you haven't tried it, would do you good), or you can head into a cutting-edge world where these worlds are described and we build frameworks to understand the data about our universes and the things in them. That is an abstract place where, at the minute, it's hard to be productive, though in the long term it is the path to the future.
Regardless, I hope that in some way that helps you to get a grip of the other responses.
All the best!
Having looked at your question, the answers and your responses; I might be able to help a bit more here (although it's difficult to cover everything in a single answer).
I can see what you are looking to do here, and in all honesty this is how most frameworks start out; making a set of classes to handle everything, then as they are made more reusable they often hit on tried and tested patterns until finally ending up with what I'd say is 'just another framework', they all do pretty much the same thing, in pretty much the same ways, and aim to be as reusable as they can - generally about the only difference between them is coding styles and quality - what they do is pretty much the same for all.
I believe you're hitting on a bit of an anti-pattern in your design here. To explain: you are focussed on making a big chunk of code reusable (the validation, the presentation and so forth), but what you're actually doing (and of course no offence) is making the working code of the application very domain specific. Not only that, but the design you illustrate will make it almost impossible to extend, to change layers (like making a mobile version) or to swap techs (like swapping db vendors). Further still, because you've got the presentation, application and data tiers mixed together, any designer who hits the app will have to work in, and change, your application code. Hit a time when you have two versions of the app and you've got a big messy problem, tbh.
As with most programming problems, you can solve this by doing three things:
designing a domain model.
specifying and designing interfaces rather than worrying about the implementation.
separating cross cutting concerns
Designing a domain model is a very important part of Class based OO programming, if you've never done it before then now is the ideal time, it doesn't matter whether you do this in a modelling language like UML or just in plain text, the idea is to define all the Entities in your Domain, it's easy to slip in to writing a book when discussing this, but let's keep it simple. Your domain model comprises all the Entities in your application's domain, each Entity is a thing, think User, Address, Article, Product and so forth, each Entity is typically defined as a Class (which is the blueprint of that entity) and each Class has Properties (like username, register_date etc).
class User {
    public $username;
    public $register_date;
}
Often we may keep these as POPOs, however they are often better thought of as Transfer Objects (often called Data Transfer Objects or Value Objects): a simple Class blueprint for an entity in your domain. Normally we try to keep these portable as well, so that they can be implemented in any language, passed between apps, serialized and sent to other apps and similar. This isn't a must, indeed nothing is a must, but it does touch on separation of concerns, in that such an object would normally be naked, implying no functionality, just a blueprint to hold values. Contrast this sharply with Business Objects and Utility Classes that actually 'do' things and are implementations of functionality, not just simple value holders.
Don't be fooled though, both Inheritance and Composition also play their part in domain model, a User may have several Addresses, each Address may be the address of several different Users. A BillingAddress may extend a normal Address and add in additional properties and so forth. (aside: what is a User? do you have a User? do you have a Person with 1-* UserAccounts?).
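To make the composition and inheritance point concrete, here is a small sketch. The property names (street, city, vatNumber) and the addAddress() method are assumptions for the example:

```php
<?php
// Hypothetical sketch: property and method names are assumptions.
class Address
{
    public $street;
    public $city;

    public function __construct($street, $city)
    {
        $this->street = $street;
        $this->city = $city;
    }
}

// Inheritance: a BillingAddress is an Address with additional properties.
class BillingAddress extends Address
{
    public $vatNumber;

    public function __construct($street, $city, $vatNumber)
    {
        parent::__construct($street, $city);
        $this->vatNumber = $vatNumber;
    }
}

// Composition: a User has zero or more Addresses.
class User
{
    public $username;
    public $addresses = array();

    public function __construct($username)
    {
        $this->username = $username;
    }

    public function addAddress(Address $address)
    {
        $this->addresses[] = $address;
    }
}

// The same Address can belong to several Users.
$home  = new Address('1 Main Street', 'Springfield');
$alice = new User('alice');
$bob   = new User('bob');
$alice->addAddress($home);
$bob->addAddress($home);
$alice->addAddress(new BillingAddress('2 Market Square', 'Springfield', 'GB123456789'));
```

Because BillingAddress extends Address, addAddress() accepts it without any extra code.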
After you've got your domain model, the next step is usually mapping it to some form of persistence layer (normally a database). Two common, well-defined ways of doing this are to use an ORM (such as Doctrine, which is in Symfony if I remember correctly) or to use the DAO pattern. I'll leave that part there, but typically this is a distinct part of the system. DAO layers have the advantage that you specify all the methods available to work with the persistence layer for each Entity whilst keeping the implementation abstracted, so you can swap database vendors without changing the application code (or business rules, as many say).
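A minimal sketch of the DAO idea follows. The UserDao interface, its method names and the array-backed implementation are assumptions; a PDO- or ORM-backed class implementing the same interface could replace it without touching the calling code:

```php
<?php
// Hypothetical sketch: interface, class and method names are assumptions.
class User
{
    public $id;
    public $username;
}

// The contract the rest of the application codes against.
interface UserDao
{
    public function find($id);
    public function save(User $user);
    public function delete($id);
}

// One possible implementation, backed by a plain array instead of a database.
class InMemoryUserDao implements UserDao
{
    private $rows = array();
    private $nextId = 1;

    public function find($id)
    {
        return isset($this->rows[$id]) ? clone $this->rows[$id] : null;
    }

    public function save(User $user)
    {
        if ($user->id === null) {
            $user->id = $this->nextId++;
        }
        $this->rows[$user->id] = clone $user;
    }

    public function delete($id)
    {
        unset($this->rows[$id]);
    }
}

// Application code depends only on the interface, not on the storage.
function renameUser(UserDao $dao, $id, $newName)
{
    $user = $dao->find($id);
    if ($user === null) {
        return false;
    }
    $user->username = $newName;
    $dao->save($user);
    return true;
}

$dao = new InMemoryUserDao();
$user = new User();
$user->username = 'bob';
$dao->save($user);
$renamed = renameUser($dao, $user->id, 'robert');
```

renameUser() never needs to know which database vendor, if any, sits behind the DAO.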
I'm going to head in to a grey area with the next example, as mentioned earlier Transfer Objects (our entities) are typically naked objects, but they are also often a good place to strap on other functionality, you'll see what I mean.
To illustrate Interfaces, you could simply define an Interface for all your Entities which is something like this:
interface Validatable {
    public function isValid();
}
then each of your entities can implement this with their own custom validation routine:
class User implements Validatable {
    public function isValid()
    {
        // custom validation here; $boolean stands in for its result
        $boolean = true;
        return $boolean;
    }
}
Now you don't need to worry about creating some convoluted way of validating objects, you can simply call isValid() on any entity and find out if it's valid or not.
The most important thing to note is that by defining the interface, we've separated some of the concerns, in that no other part of the application needs to do anything to validate an object, all they need to know is that it's Validatable and to call the isValid() method.
However, we have crossed some concerns, in that each object (instance of a Class) now carries its own validation rules and model. It may make sense to abstract this out; one easy way of doing this is to make the validation method static, so you could define:
class User {
    public static function validate(User $user)
    {
        // custom validation of $user here; $boolean stands in for its result
        $boolean = true;
        return $boolean;
    }
}
Or you could move to using getters and setters, this is another very common pattern where you can hide the validation inside the setter, thus ensuring that each property always holds valid data (or null, or default value).
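A sketch of the setter approach might look like this. The username rule and the choice to throw an InvalidArgumentException are assumptions; returning false or falling back to a default would be equally reasonable:

```php
<?php
// Hypothetical sketch: the validation rule and exception choice are assumptions.
class User
{
    private $username;

    // The setter refuses invalid data, so the property is either null or valid.
    public function setUsername($username)
    {
        if (!is_string($username) || !preg_match('/^[0-9a-z_]{4,64}$/i', $username)) {
            throw new InvalidArgumentException('Invalid username.');
        }
        $this->username = $username;
    }

    public function getUsername()
    {
        return $this->username;
    }
}

$user = new User();
$user->setUsername('bob_smith');

$rejected = false;
try {
    $user->setUsername('no spaces allowed');
} catch (InvalidArgumentException $e) {
    $rejected = true; // the previous, valid value is kept
}
```

After the failed call, getUsername() still returns the last valid value.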
Or perhaps you move the validation into its own library, a Validate class with its own methods? Or maybe you just pop it in the DAO layer because you only care about checking something when you save it, or maybe you need to validate both when you receive data and when you persist it. How you end up doing it is your call, and there is no 'best way'.
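If you went the library route, a Validate class could be as small as the sketch below. The method names are assumptions; the checks lean on PHP's built-in filter_var():

```php
<?php
// Hypothetical sketch: the class layout and method names are assumptions.
class Validate
{
    public static function notEmpty($value)
    {
        return is_string($value) && trim($value) !== '';
    }

    public static function integer($value)
    {
        return filter_var($value, FILTER_VALIDATE_INT) !== false;
    }

    public static function email($value)
    {
        return filter_var($value, FILTER_VALIDATE_EMAIL) !== false;
    }
}

// Can be called from a controller, an entity or a DAO alike.
$input = array('username' => 'bob', 'age' => '42', 'email' => 'not-an-email');

$errors = array();
if (!Validate::notEmpty($input['username'])) {
    $errors[] = 'username';
}
if (!Validate::integer($input['age'])) {
    $errors[] = 'age';
}
if (!Validate::email($input['email'])) {
    $errors[] = 'email';
}
```

With this input only the email check fails, so $errors ends up containing just 'email'.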
The third consideration, which I've already touched on, is separation of concerns. Should a persistence layer care how the things it's persisting are presented? Should the business logic care about how things are presented? Should an Entity care where and how it's displayed? Or should the presentation layer care how things are presented? Similarly, we can ask: is there only ever going to be one presentation layer? In one language? What about how a label appears in a sentence? Singular User and Address make sense, but you can't simply add an 's' to show the lists, because Users is right but Addresss is wrong ;) We also have working considerations: do I want a new designer having to change application code just to change the presentation of 'user account' to 'User Account'? Do I even want to change my app code in the classes when that change is asked for?
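One way to keep labels out of the entities is a separate catalogue that the presentation layer looks things up in. This is only a sketch; the array layout, the locales and the label() function are assumptions:

```php
<?php
// Hypothetical sketch: the catalogue layout, locales and function name are assumptions.
$labels = array(
    'en' => array(
        'user'    => array('one' => 'User', 'many' => 'Users'),
        'address' => array('one' => 'Address', 'many' => 'Addresses'),
    ),
    'fr' => array(
        'user'    => array('one' => 'Utilisateur', 'many' => 'Utilisateurs'),
        'address' => array('one' => 'Adresse', 'many' => 'Adresses'),
    ),
);

// Looks up the singular or plural label for an entity in the given locale.
function label(array $labels, $locale, $key, $count = 1)
{
    $form = ($count === 1) ? 'one' : 'many';
    return $labels[$locale][$key][$form];
}

$heading = label($labels, 'en', 'address', 2);
$field   = label($labels, 'fr', 'user');
```

A designer or translator can now change 'Addresses' or add a locale without opening the User or Address classes. Real plural rules vary by language, so a two-form catalogue like this is a simplification.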
Finally, and just to throw out everything I've said: you have to ask yourself, what's the job I'm trying to do here? Am I building a big reusable application with potentially many developers and a long life cycle, or would a simple PHP script for each view and action suffice (one that reads $_GET/$_POST, validates, saves to the db, then displays what it should or redirects where it should)? In many, if not all, cases this is all that's needed.
Remember, PHP is made to be invoked when a request is made to a web server and then to send back a response. That's it. What happens in between is your domain, your job; the client and user typically don't care. You can sum up what you're trying to do simply: build a script that responds to that request as quickly as possible, with the expected results. That's it, and it needn't be any more complicated than that.
To be blunt, doing everything I mentioned and more is a great thing to do: you'll learn loads, understand your job better and so on. But if you just want to get the job out the door and have simple, easy-to-maintain code in the end, just build one script per view and one per action, with the odd reusable bit (like an HTTP handler, a db class, an email class etc).
You're running into the Model-View-Controller (MVC) architecture.
The M only stores data. No display information, just typed key-value pairs.
The C handles the logic of manipulating this information. It changes the M in response to user input.
The V is the part which handles displaying things. It should be something like Smarty templates rather than a huge amount of raw PHP for generating HTML.
Having it all "in one place" is the wrong approach. You won't have duplicated code with MVC - each part is a distinct step. This improves code reuse, readability, and maintainability.
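A stripped-down sketch of that split, using plain PHP instead of a template engine to stay self-contained. All class and function names here are assumptions:

```php
<?php
// Hypothetical sketch: class and function names are assumptions.

// M: only stores data.
class UserModel
{
    public $username = '';
}

// C: changes the model in response to user input.
class UserController
{
    public function update(UserModel $model, array $input)
    {
        if (isset($input['username'])) {
            $model->username = trim($input['username']);
        }
        return $model;
    }
}

// V: only knows how to display a model; a Smarty template could take its place.
function renderUserView(UserModel $model)
{
    return '<p>Username: ' . htmlspecialchars($model->username, ENT_QUOTES, 'UTF-8') . '</p>';
}

// In a real request, the input array would come from $_POST.
$controller = new UserController();
$model = $controller->update(new UserModel(), array('username' => ' <bob> '));
$html = renderUserView($model);
```

Swapping the view for a mobile one, or the input source for a test array, touches one part only.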