Mixing together DDD with Event Sourcing - php

I can't get my head around the concept of mixing DDD with ES. I consider events to be part of the domain side. Given that, there is no problem with publishing them from the repository to the outside world while keeping the models pure and simple. But aside from that, there must be a possibility of replaying them back on a particular aggregate. This is where my problem occurs. I would like to keep my domain models pure and simple objects that remain lib/framework agnostic.
To apply past events on an aggregate, the aggregate must be aware of being part of an ES structure (and therefore it would not remain a pure domain object). As the main job of an aggregate is to enforce some business invariants that may evolve over time, it is impossible to apply old events using the aggregate API. For instance, there is an aggregate Post with child entities Comments. Today Post allows 10 Comments to be added, and the method addComment() guards that rule. But it did not use to be that way all the time. One year ago a user was allowed to add up to 20 Comments. So applying past events may not meet current rules.
Broadway (a popular PHP CQRS library) works around the problem by applying events without any prevalidation. The method addComment() just checks the input against our invariants and then applies the resulting events. Applying an event itself does not do any further checking. That is great, but I perceive that as a high level of integration in my domain models. Does my domain model really need to know anything about infrastructure (which is what the ES style of saving data is)?
EDIT:
To state the problem in the simplest words possible: is there any opportunity to get rid of all those applyXXX() methods from the aggregate?
EDIT2:
I have written a (bit hacky) PoC of this idea in PHP - github

Disclaimer: I'm a CQRS framework guy.
Broadway (a popular PHP CQRS library) works around the problem by applying events without any prevalidation.
That's the way every CQRS aggregate works: events are not checked because they express facts that already happened in the past. This means that applying an event doesn't throw exceptions, ever.
To apply past events on an aggregate, the aggregate must be aware of being part of an ES structure (and therefore it would not remain a pure domain object)
No, it doesn't. It must be aware of its past events. That is good.
Today Post allows 10 Comments to be added, and the method addComment() guards that rule. But it did not use to be that way all the time. One year ago a user was allowed to add up to 20 Comments. So applying past events may not meet current rules.
What keeps your aggregate from ignoring that event, or from interpreting it differently than it did 1 year ago?
This particular case should make you think about the power of CQRS: writes have a different logic than reads. You apply the events on the aggregate in order to validate the future commands that arrive at it (the write/command side). Displaying those 20 comments is handled by other logic (the read/query side).
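To make that concrete, here is a minimal PHP sketch of the split (not Broadway's actual API; the class and method names are invented for illustration): the command method enforces today's invariant, while the apply method mutates state blindly and never validates.
final class CommentWasAdded
{
    public $author;
    public $text;
    public function __construct($author, $text)
    {
        $this->author = $author;
        $this->text = $text;
    }
}
class Post
{
    private $commentCount = 0;
    private $recordedEvents = [];
    // Command side: enforces the CURRENT business rule before any new fact is recorded.
    public function addComment($author, $text)
    {
        if ($this->commentCount >= 10) {
            throw new DomainException('A post may not have more than 10 comments.');
        }
        $this->recordAndApply(new CommentWasAdded($author, $text));
    }
    // Replay side: past facts are applied blindly, even if they would break today's rule.
    protected function applyCommentWasAdded(CommentWasAdded $event)
    {
        $this->commentCount++;
    }
    private function recordAndApply(CommentWasAdded $event)
    {
        $this->recordedEvents[] = $event;
        $this->applyCommentWasAdded($event);
    }
}
Replaying a year-old stream with 20 CommentWasAdded events goes straight through applyCommentWasAdded() and therefore never trips the new limit.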
This is where my problem occurs. I would like to keep my domain models pure and simple objects that remain lib/framework agnostic.
CQRS makes it possible to keep your aggregates pure (no side effects), simple, and free of dependencies on any library. I do this using the style presented by cqrs.nu, by yielding events. This means that the aggregate's command handler methods are in fact generators.
The read models can also be very simple, plain PHP immutable objects. Only the read model updater has a dependency on persistence, but that can be inverted using an interface.
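For illustration, a command handler method that yields events (the cqrs.nu style mentioned above) can look roughly like this; the plumbing that iterates the generator and persists the events is assumed to live outside the aggregate, and CommentWasAdded is the same illustrative event class as in the earlier sketch.
class Post
{
    private $commentCount = 0;
    // The aggregate stays pure: it only yields events and never touches persistence.
    public function addComment($author, $text)
    {
        if ($this->commentCount >= 10) {
            throw new DomainException('Comment limit reached.');
        }
        yield new CommentWasAdded($author, $text);
    }
}
// Hypothetical plumbing outside the domain model: iterate the generator and persist each event.
$post = new Post();
foreach ($post->addComment('alice', 'Nice article!') as $event) {
    // $eventStore->append($event);   // the event store is infrastructure, not the aggregate's concern
}
Note that because addComment() is a generator, its guard clause only runs once the generator is first advanced, which is exactly what the surrounding command-handling plumbing does.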

I can't get my head around the concept of mixing together DDD with CQRS.
From the sound of things, you can't quite get your head around the mix of DDD and event sourcing. CQRS and Event Sourcing are separate ideas (that happen to go together well).
Today Post allows 10 Comments to be added, and the method addComment() guards that rule. But it did not use to be that way all the time. One year ago a user was allowed to add up to 20 Comments. So applying past events may not meet current rules.
That's absolutely true. Notice, however, that it is also true that if you had a non event sourced post with 15 comments, and you try to make a "rule" now that only 10 comments are allowed, you still have a problem.
My answer to this puzzle (in both styles) is that you need a slightly different understanding of the responsibilities involved.
The responsibility of the domain model is behavior; it describes which states are reachable from the current state. The domain model shouldn't restrict you from being in a bad state; it should prevent good states from becoming bad states.
In version one, we might say that the state of a Post includes a TwentyList of Comments, where a TwentyList is (surprise) a container that can hold up to 20 comment identifiers.
In version two, where we want to maintain a limit of 10 comments, we don't change the TwentyList to a TenList, because that gives us backward compatibility headaches. Instead, we change the domain rule to say "no comments may be added to a post with 10 or more comments". The data schema is unchanged, and the undesirable states are still representable, but the allowed state transitions are greatly restricted.
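A hedged sketch of that idea in PHP (names are illustrative): the state keeps whatever it already holds, even 20 comments recorded under the old rule, and only the transition guard changes.
class Post
{
    /** @var string[] the schema still allows up to 20 entries, so old data stays representable */
    private $comments = [];
    // Version two of the behavior: the rule only restricts NEW transitions.
    public function addComment($text)
    {
        if (count($this->comments) >= 10) {
            throw new DomainException('No comments may be added to a post with 10 or more comments.');
        }
        $this->comments[] = $text;
    }
}
A post rehydrated with 15 historical comments is still a valid state; it simply cannot accept further comments under the new rule.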
Ironically enough, a good book to read to get more insights is Greg Young's Versioning in an Event Sourced System. The lesson, at a high level, is that event versioning is just message versioning, and state is just a message that a previous model left behind for the current model.
Value types aren't about rule constraints, they are about semantic constraints.
Keep in mind that the timelines are very different; behaviors are about the now and next, but states are about the past. States are supposed to endure much longer than behaviors (with the corresponding investment in design capital that implies).
Does my domain model really need to know anything about infrastructure (which is what the ES style of saving data is)?
No, the domain model should not need to know about infrastructure.
But events aren't infrastructure -- they are state. A journal of AddComment and RemoveComment events is state just like a list of Comment entries is state.
The most general form of "behavior" is a function that takes the current state as its input and emits events as its output:
List<Event> act(State currentState);
since we can always, at an outer layer, take the events (which are a non-destructive representation of the state) and build the state from them:
State act(State currentState) {
    List<Event> changes = act(currentState);
    State nextState = currentState.apply(changes);
    return nextState;
}
List<Event> act(List<Event> history) {
    State initialState = new State();
    State currentState = initialState.apply(history);
    return act(currentState);
}
State act(List<Event> history) {
    // Writing this out long hand to drive home the point;
    // we could of course call act: List<Event> -> State
    // to avoid duplication.
    List<Event> changes = act(history);
    State initialState = new State();
    State currentState = initialState.apply(history);
    State nextState = currentState.apply(changes);
    return nextState;
}
The point being that you can implement the behavior in the most general case, add a few adapters, and then let the plumbing choose which implementation is most appropriate.
Again, separation of responsibilities is your guiding star: state that manages what is, behavior that manages what changes are allowed, and plumbing/infrastructure are all distinct concerns.
In the simplest terms: I'm looking for an opportunity to get rid of the many applyXXX() methods (or similar, in languages with method overloading) from my aggregate
applyXXX is just a function that accepts a State and an Event as arguments and returns a new State. You can use any spelling and scoping you want for it.
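In PHP terms that function does not even have to live on the aggregate; here is a hedged sketch of the same idea as a standalone reducer (PostState and its methods are invented for illustration, and CommentWasAdded is the illustrative event class from earlier):
final class PostState
{
    private $commentCount;
    private function __construct(int $commentCount)
    {
        $this->commentCount = $commentCount;
    }
    public static function initial(): self
    {
        return new self(0);
    }
    public function commentCount(): int
    {
        return $this->commentCount;
    }
    public function withCommentCount(int $count): self
    {
        return new self($count);
    }
}
// State x Event -> State, with no reference to any event store or framework.
function applyEvent(PostState $state, $event): PostState
{
    if ($event instanceof CommentWasAdded) {
        return $state->withCommentCount($state->commentCount() + 1);
    }
    return $state;   // unknown events leave the state untouched
}
// Rebuilding the state is then just a fold over the history.
function reconstitute(array $history): PostState
{
    return array_reduce($history, 'applyEvent', PostState::initial());
}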

My answer is very short. Indeed, it is event-sourcing that you struggle with, not CQRS.
If handling of some event changes over time, you have two scenarios really:
You are fixing a bug and your handler should really behave differently. In this case you just proceed with the change.
You have some new intent; you actually have new handling. This means that in fact this is a different event. In this case you have a new event and a new handler.
These scenarios have no relation to programming languages and frameworks. Event sourcing in general is much more about the business than about any tech.
I would second Greg's book recommendation.

I think that your problem is that you want to validate events when they are applied, but applying and validation are two different stages of an aggregate's action. When you add a comment via the method addComment(), your validation logic runs there and the method emits an event; when you replay the event, that logic is not checked again. A past event cannot be changed, and if your aggregate throws an exception while replaying an event, something is wrong with your aggregate. That's how I understand your problem.

DDD and blameable

How do you handle the situation with blameable in the DDD way?
Of course we can ignore some things, but I think that when an entity needs some tracking (creator, updater, time updated/created) it should be in the class that actually performs some actions on the entity.
For example, we have a post and a user; what would be the correct way?
$post = new Post();
$post->create(); // here we can set some created_id and other attributes
                 // by using mixins or traits like some frameworks do
Or is it better like this:
$user->createPost($post);
$user->update($post);
To me the second is better, even when we need to track changes that do not apply to the post directly, for example:
$post->doSomethingWithPost();
$user->updatePost($post);
It seems like blameable just throws out one important entity - the user who manages some things on entities.
Of course we should not overcomplicate things, but usually when blameable is implemented, the entity from which you get the id is the logged-in user, which is incorrect for the bounded context.
Here it is some Blogging Context, where a user of this context updates the post, not some authenticated user.
What are your thoughts on this one? Are there some similar questions that I may have missed?
All your examples seem like they are not designed with the DDD principles in mind. The first indicator to me is the usage of a $user variable. In 99% of the cases this is too generic to really capture the intent of a given Model. I think there are hidden concepts that would first have to be made explicit. I think along the lines of RegisteredAuthor and Administrator. At least that's what I understand from:
Here it is some Blogging Context, where a user of this context updates the post, not some authenticated user.
Another question is how can a "user of this context" not be authenticated? How do you know who he is?
In general in an application that explicitly requires User management we normally have something like an IdentityContext as a supporting Sub Domain. In the different contexts we then have other Models like Author or BlogAdministrator holding a reference to the User's identity (UserId) from the IdentityContext. The Red Book has some nice examples on how to implement this.
To answer the question on how to track who changed something and when:
This concept is also referred to as Auditability, which in most revenue-relevant parts of a system is actually a must once your organization reaches a certain size. In this scenario I actually always recommend an Event Sourcing approach since it comes with auditability batteries included.
In your case it would actually be enough to either capture the executing UserId as metadata on commands like WritePostCommand or ChangePostContentsCommand, or to use the UserId in a RequestContext object that knows about the execution context (who sent this command, when was it sent, is this user allowed to execute this use case).
You can then, as Alexander Langer pointed out in the comments, just use this metadata inside your Repositories or Handlers to pass the information to the Aggregates that need it, or even just send it to an audit log so as not to pollute your Domain Model with these responsibilities.
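A minimal sketch of that second option (the command and metadata classes here are invented, not taken from any particular library):
final class CommandMetadata
{
    public $userId;     // who issued the command
    public $issuedAt;   // when it was issued
    public function __construct(string $userId, DateTimeImmutable $issuedAt)
    {
        $this->userId = $userId;
        $this->issuedAt = $issuedAt;
    }
}
final class ChangePostContentsCommand
{
    public $postId;
    public $newContents;
    public $metadata;   // audit data travels with the command, not inside the domain model
    public function __construct(string $postId, string $newContents, CommandMetadata $metadata)
    {
        $this->postId = $postId;
        $this->newContents = $newContents;
        $this->metadata = $metadata;
    }
}
The handler (or an audit decorator wrapped around it) can then log $command->metadata without the Post aggregate ever knowing who the authenticated user was.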
NOTE: Generally I would not use the DoctrineExtensions like Blameable in your Domain Model. They depend heavily on Doctrine's Event system, and you do not want to tie your Model into an Infrastructure concern.
Kind regards

How do I architect my classes for easier unit testing?

I'll admit, I haven't unit tested much... but I'd like to. With that being said, I have a very complex registration process that I'd like to optimize for easier unit testing. I'm looking for a way to structure my classes so that I can test them more easily in the future. All of this logic is contained within an MVC framework, so you can assume the controller is the root where everything gets instantiated from.
To simplify, what I'm essentially asking is how to setup a system where you can manage any number of third party modules with CRUD updates. These third party modules are all RESTful API driven and response data is stored in local copies. Something like the deletion of a user account would need to trigger the deletion of all associated modules (which I refer to as providers). These providers may have a dependency on another provider, so the order of deletions/creations is important. I'm interested in which design patterns I should specifically be using to support my application.
Registration spans several classes and stores data in several db tables. Here's the order of the different providers and methods (they aren't statics, just written that way for brevity):
Provider::create('external::create-user') initiates registration at a particular step of a particular provider. The double colon syntax in the first param indicates the class should trigger creation on providerClass::providerMethod. I had made a general assumption that Provider would be an interface with the methods create(), update(), and delete() that all other providers would implement. How this gets instantiated is likely something you need to help me with.
$user = Provider_External::createUser() creates a user on an external API, returns success, and user gets stored in my database.
$customer = Provider_Gapps_Customer::create($user) creates a customer on a third party API, returns success, and stores locally.
$subscription = Provider_Gapps_Subscription::create($customer) creates a subscription associated to the previously created customer on the third party API, returns success, and stores locally.
Provider_Gapps_Verification::get($customer, $subscription) retrieves a row from an external API. This information gets stored locally. Another call is made which I'm skipping to keep things concise.
Provider_Gapps_Verification::verify($customer, $subscription) performs an external API verification process. The result of which gets stored locally.
This is a really dumbed down sample as the actual code relies upon at least 6 external API calls and over 10 local database rows created during registration. It doesn't make sense to use dependency injection at the constructor level because I might need to instantiate 6 classes in the controller without knowing if I even need them all. What I'm looking to accomplish would be something like Provider::create('external') where I simply specify the starting step to kick off registration.
The Crux of the Problem
So as you can see, this is just one sample of a registration process. I'm building a system where I could have several hundred service providers (external API modules) that I need to sign up for, update, delete, etc. Each of these providers gets related back to a user account.
I would like to build this system in a manner where I can specify an order of operations (steps) when triggering the creation of a new provider. Put another way, allow me to specify which provider/method combination gets triggered next in the chain of events since creation can span so many steps. Currently, I have this chain of events occurring via the subject/observer pattern. I'm looking to potentially move this code to a database table, provider_steps, where I list each step as well as its following success_step and failure_step (for rollbacks and deletes). The table would look as follows:
# the id of the parent provider row
provider_id int(11) unsigned primary key,
# the short, slug name of the step for using in codebase
step_name varchar(60),
# the name of the method correlating to the step
method_name varchar(120),
# the steps that get triggered on success of this step
# can be comma delimited; multiple steps could be triggered in parallel
triggers_success varchar(255),
# the steps that get triggered on failure of this step
# can be comma delimited; multiple steps could be triggered in parallel
triggers_failure varchar(255),
created_at datetime,
updated_at datetime,
index ('provider_id', 'step_name')
There are so many decisions to make here... I know I should favor composition over inheritance and create some interfaces. I also know I'm likely going to need factories. Lastly, I have a lot of domain model stuff going on here... so I likely need business domain classes. I'm just not sure how to mesh them all together without creating an utter mess in my pursuit of the holy grail.
Also, where would be the best place for the db queries to take place?
I have a model for each database table already, but I'm interested in knowing where and how to instantiate the particular model methods.
Things I've Been Reading...
Design Patterns
The Strategy Pattern
Composition over Inheritance
The Factory method pattern
The Abstract factory pattern
The Builder pattern
The Chain-of-responsibility pattern
You're already working with the pub/sub pattern, which seems appropriate. Given nothing but your comments above, I'd be considering an ordered list as a priority mechanism.
But it still doesn't smell right that each subscriber is concerned with the order of operations of its dependents for triggering success/failure. Dependencies usually seem like they belong in a tree, not a list. If you stored them in a tree (using the composite pattern) then the built-in recursion would be able to clean up each dependency by cleaning up its dependents first. That way you're no longer worried about prioritizing in which order the cleanup happens - the tree handles that automatically.
And you can use a tree for storing pub/sub subscribers almost as easily as you can use a list.
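As a rough illustration of the composite idea (class and method names are invented, not taken from your codebase), each provider could hold its dependents and clean them up recursively:
interface ProviderNode
{
    public function delete(): void;
}
class Provider implements ProviderNode
{
    private $name;
    /** @var ProviderNode[] */
    private $dependents = [];
    public function __construct(string $name)
    {
        $this->name = $name;
    }
    public function addDependent(ProviderNode $dependent): void
    {
        $this->dependents[] = $dependent;
    }
    // Deleting a provider deletes its dependents first; the tree handles the ordering for you.
    public function delete(): void
    {
        foreach ($this->dependents as $dependent) {
            $dependent->delete();
        }
        echo "deleting {$this->name}\n";   // real cleanup (API calls, DB rows) would go here
    }
}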
Using a test-driven development approach could get you what you need, and would ensure your entire application is not only fully testable, but completely covered by tests that prove it does what you want. I'd start by describing exactly what you need to do to meet one single requirement.
One thing you know you want to do is add a provider, so a TestAddProvider() test seems appropriate. Note that it should be pretty simple at this point, and have nothing to do with a composite pattern. Once that's working, you know that a provider has a dependent. Create a TestAddProviderWithDependent() test, and see how that goes. Again, it shouldn't be complex. Next, you'd likely want to TestAddProviderWithTwoDependents(), and that's where the list would get implemented. Once that's working, you know you want the Provider to also be a Dependent, so a new test would prove the inheritance model worked. From there, you'd add enough tests to convince yourself that various combinations of adding providers and dependents worked, and tests for exception conditions, etc. Just from the tests and requirements, you'd quickly arrive at a composite pattern that meets your needs. At this point I'd actually crack open my copy of GoF to ensure I understood the consequences of choosing the composite pattern, and to make sure I didn't add an inappropriate wart.
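For example, the very first tests might be no more than this sketch (PHPUnit, with the Provider API assumed rather than prescribed; ProviderRegistry and dependents() are hypothetical names):
use PHPUnit\Framework\TestCase;
class ProviderTest extends TestCase
{
    public function testAddProvider()
    {
        $registry = new ProviderRegistry();   // hypothetical collection of providers
        $registry->add(new Provider('external'));
        $this->assertCount(1, $registry->all());
    }
    public function testAddProviderWithDependent()
    {
        $provider = new Provider('external');
        $provider->addDependent(new Provider('gapps-customer'));
        $this->assertCount(1, $provider->dependents());
    }
}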
Another known requirement is to delete providers, so create a TestDeleteProvider() test, and implement the DeleteProvider() method. You won't be far away from having the provider delete its dependents, too, so the next step might be creating a TestDeleteProviderWithADependent() test. The recursion of the composite pattern should be evident at this point, and you should only need a few more tests to convince yourself that deeply nested providers, empty leafs, wide nodes, etc., all will properly clean themselves up.
I would assume that there's a requirement for your providers to actually provide their services. Time to test calling the providers (using mock providers for testing), and adding tests that ensure they can find their dependencies. Again, the recursion of the composite pattern should help build the list of dependencies or whatever you need to call the correct providers correctly.
You might find that providers have to be called in a specific order. At this point you might need to add prioritization to the lists at each node within the composite tree. Or maybe you have to build an entirely different structure (such as a linked list) to call them in the right order. Use the tests and approach it slowly. You might still have people concerned that you delete dependents in a particular externally prescribed order. At this point you can use your tests to prove to the doubters that you will always delete them safely, even if not in the order they were thinking.
If you've been doing it right, all your previous tests should continue to pass.
Then come the tricky questions. What if you have two providers that share a common dependency? If you delete one provider, should it delete all of its dependencies even though a different provider needs one of them? Add a test, and implement your rule. I figure I'd handle it through reference counting, but maybe you want a copy of the provider for the second instance, so you never have to worry about sharing children, and you keep things simpler that way. Or maybe it's never a problem in your domain. Another tricky question is if your providers can have circular dependencies. How do you ensure you don't end up in a self-referential loop? Write tests and figure it out.
After you've got this whole structure figured out, only then would you start thinking about the data you would use to describe this hierarchy.
That's the approach I'd consider. It may not be right for you, but that's for you to decide.
Unit Testing
With unit testing, we only want to test the code that makes up the individual unit of source code, typically a class method or function in PHP (Unit Testing Overview). Which indicates that we don't want to actually test the external API in Unit Testing, we only want to test the code we are writing locally. If you do want to test entire workflows, you are likely wanting to perform integration testing (Integration Testing Overview), which is a different beast.
As you specifically asked about designing for Unit Testing, let's assume you actually mean Unit Testing as opposed to Integration Testing and submit that there are two reasonable ways to go about designing your Provider classes.
Stub Out
The practice of replacing an object with a test double that (optionally) returns configured return values is referred to as stubbing. You can use a stub to "replace a real component on which the SUT depends so that the test has a control point for the indirect inputs of the SUT. This allows the test to force the SUT down paths it might not otherwise execute". Reference & Examples
Mock Objects
The practice of replacing an object with a test double that verifies expectations, for instance asserting that a method has been called, is referred to as mocking.
You can use a mock object "as an observation point that is used to verify the indirect outputs of the SUT as it is exercised. Typically, the mock object also includes the functionality of a test stub in that it must return values to the SUT if it hasn't already failed the tests but the emphasis is on the verification of the indirect outputs. Therefore, a mock object is a lot more than just a test stub plus assertions; it is used in a fundamentally different way".
Reference & Examples
Our Advice
Design your class to allow both Stubbing and Mocking. The PHPUnit Manual has an excellent example of Stubbing and Mocking Web Services. While this doesn't help you out of the box, it demonstrates how you would go about implementing the same for the RESTful API you are consuming.
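As a small, hedged illustration (ExternalUserApi is an invented interface, and Provider_External is assumed to have been refactored to accept it as a constructor dependency), stubbing and mocking with PHPUnit look roughly like this:
use PHPUnit\Framework\TestCase;
interface ExternalUserApi
{
    public function createUser(string $name): array;
}
class Provider_External
{
    private $api;
    public function __construct(ExternalUserApi $api)
    {
        $this->api = $api;
    }
    public function createUser(string $name): array
    {
        return $this->api->createUser($name);   // storing the local copy is omitted here
    }
}
class ProviderExternalTest extends TestCase
{
    public function testCreateUserReturnsUserFromApi()
    {
        // Stub: control the indirect input (what the remote API hands back).
        $api = $this->createMock(ExternalUserApi::class);
        $api->method('createUser')->willReturn(['id' => 42, 'name' => 'joe']);
        $provider = new Provider_External($api);
        $this->assertSame(42, $provider->createUser('joe')['id']);
    }
    public function testCreateUserCallsApiExactlyOnce()
    {
        // Mock: verify the indirect output (that the collaboration actually happened).
        $api = $this->createMock(ExternalUserApi::class);
        $api->expects($this->once())->method('createUser')->willReturn([]);
        $provider = new Provider_External($api);
        $provider->createUser('joe');
    }
}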
Where is the best place for the db queries to take place?
We suggest you use an ORM and not solve this yourself. You can easily Google PHP ORMs and make your own decision based on your own needs; our advice is to use Doctrine because we use Doctrine, it suits our needs well, and over the past few years we have come to appreciate how well the Doctrine developers know the domain. Simply put, they do it better than we could do it ourselves, so we are happy to let them do it for us.
If you don't really grasp why you should use an ORM, see Why should you use an ORM? and then Google the same question. If you still feel like you can roll your own ORM or otherwise handle the database access yourself better than the guys dedicated to it, we would expect you to already know the answer to the question. If you feel you have a pressing need to handle it yourself, we suggest you look at the source code of a number of ORMs (see Doctrine on GitHub) and find the solution that best fits your scenario.
Thanks for asking a fun question, I appreciate it.
Every single dependency relationship within your class hierarchy must be accessible from the outside world (it shouldn't be highly coupled). For instance, if you are instantiating class A within class B, class B must have setter/getter methods implemented for the class A instance it holds.
http://en.wikipedia.org/wiki/Dependency_injection
The foremost problem I can see with your code - and this is actually what hinders you from testing it - is the use of static class method calls:
Provider::create('external::create-user')
$user = Provider_External::createUser()
$customer = Provider_Gapps_Customer::create($user)
$subscription = Provider_Gapps_Subscription::create($customer)
...
It's epidemic in your code - even if you "only" outlined them as static for "brevity". Such an attitude is not brevity, it's counter-productive for testable code. Avoid these at all cost, including when asking a question about unit testing; this is a known bad practice and it is known that such code is hard to test.
After you've converted all static calls into object method invocations and used Dependency Injection instead of static global state to pass the objects along, you can just do unit testing with PHPUnit, including making use of stub and mock objects collaborating in your (simple) tests.
So here is a TODO:
Refactor static method calls into object method invocations.
Use Dependency Injection to pass objects along.
And you will have very much improved your code. If you argue that you cannot do that, do not waste your time with unit testing; spend it maintaining your application, ship it fast, let it make some money, and burn it if it's not profitable any longer. But don't waste your programming life unit-testing static global state - it's just stupid to do.
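A before/after sketch of that TODO (ExternalUserApi and HttpExternalUserApi are invented names; the exact constructor signature is up to you):
// Before: a hidden static dependency that a test cannot replace.
$user = Provider_External::createUser();
// After: the collaborator is injected, so a test can pass in a stub or a mock instead.
class Provider_External
{
    private $api;
    public function __construct(ExternalUserApi $api)
    {
        $this->api = $api;
    }
    public function createUser(string $name): array
    {
        return $this->api->createUser($name);
    }
}
$provider = new Provider_External(new HttpExternalUserApi());
$user = $provider->createUser('joe');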
Think about layering your application with defined roles and responsibilities for each layer. You may like to take inspiration from Apache Axis' message flow subsystem. The core idea is to create a chain of handlers through which the request flows until it is processed. Such a design facilitates pluggable components which may be bundled together to create higher order functions.
Further you may like to read about Functors/Function Objects, particularly Closure, Predicate, Transformer and Supplier to create your participating components. Hope that helps.
Have you looked at the state design pattern? http://en.wikipedia.org/wiki/State_pattern
You could model all your steps as different states in a state machine, and it would look like a graph. You could store this graph in your database table/XML; also, every provider can have its own graph which represents the order in which execution should happen.
So when you get into a certain state you may trigger an event or events (save user, get user). I don't know your application's specifics, but events can be re-used by other providers.
If it fails on one of the steps, then a different graph path is executed.
If you abstract it correctly you could have a loosely coupled system which follows the order given by the graph and executes events based on state.
Then later, if you need to add some other provider, you only need to create a graph and/or some new events.
Here is an example: https://github.com/Metabor/Statemachine
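As a rough, library-agnostic sketch (not Metabor's actual API), the graph can be as simple as an array of states with success/failure transitions, walked by a tiny runner:
$registrationGraph = [
    'create-user'         => ['onSuccess' => 'create-customer',     'onFailure' => null],
    'create-customer'     => ['onSuccess' => 'create-subscription', 'onFailure' => 'rollback-user'],
    'create-subscription' => ['onSuccess' => 'verify',              'onFailure' => 'rollback-customer'],
    'verify'              => ['onSuccess' => null,                  'onFailure' => 'rollback-customer'],
    'rollback-user'       => ['onSuccess' => null,                  'onFailure' => null],
    'rollback-customer'   => ['onSuccess' => null,                  'onFailure' => null],
];
function runStep(array $graph, string $step, callable $executor): void
{
    $succeeded = $executor($step);   // runs the provider method mapped to this step
    $next = $graph[$step][$succeeded ? 'onSuccess' : 'onFailure'];
    if ($next !== null) {
        runStep($graph, $next, $executor);
    }
}
Calling runStep($registrationGraph, 'create-user', $executor) would kick off the chain, where $executor maps a step name to the matching provider call and returns true or false.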

Preemptive Validation or Exception handling?

I am trying to decide between 2 patterns regarding data validation:
I try to follow the nominal workflow and catch exceptions thrown by my models and services: unique/foreign constraint violation, empty fields, invalid arguments, etc... (!! I catch only exceptions that I know I should)
pros: Very little code to write in my Controllers and Services: I just have to handle exceptions and translate them into a user-understandable message. The code is very simple and readable.
cons: I need to write specific exceptions, which can be a lot of different exceptions sometimes. I also need to catch and parse generic PDO/Doctrine exceptions for database errors (constraint violations, etc...) to translate them into exceptions that are meaningful (e.g. DuplicateEntryException). I also can't bypass some validation: let's say an object of my model is marked as locked: trying to delete it will raise an exception. However, I may want to force its deletion (with a confirmation popup for example). I won't be able to bypass the exception here.
I test and pre-validate everything explicitly with code and DB queries. For example, I'll test that something is not null and is an integer before setting it as an attribute in my model. Or I'll make a DB query to check that I am not going to create a duplicate entry.
pros: no need to write specific exceptions, because I prevalidate everything so I shouldn't be doing a lot of try/catch anyway. Also I can bypass some validation if I want to.
cons: Lots of tests and validation to write in the controllers, services and models. I will be performing more queries (the validation part). The DB already does the validation for foreign keys, unique constraints, not null columns... I shouldn't ignore that and recode it myself. Also this leads to very boring code!
I would rather use one pattern or the other, not a mix, in order to keep things as simple as possible.
The first solution seems to me like the best, but I'm afraid it might be some kind of anti-pattern, or that behind its theoretical simplicity it is hiding situations that are very hard to handle?
I would suggest that data validation should happen at the perimeter of an application. That is to say, that any data coming in should be checked to make sure it meets your expectations. Once allowed into the application, it's no longer validated, but it is always escaped according to context (DB, email, etc.) This allows you to keep all of the validation together and avoids the potential duplication of validation work (it's easy to come up with examples where data could validated twice by two models that both use it.) Joe Armstrong promotes this approach in his book on Erlang, and the software he's written for telcom stations runs for years without restarting, so it does seem to work well :)
Additionally, model expectations don't always perfectly line up with the expectations established by a particular interface (maybe the form is only showing a subset of the potential options, or maybe the interface had a dropdown of US states and the model stores states from many different countries, etc.) Sometimes complex interfaces can integrate several different model objects in a manner that enhances the user experience. While nice for the user, the interaction of these models using the exception approach can be very difficult to handle because some of the inputs may be hybrid inputs that neither model alone can validate. You always want to ensure validation matches the expectations of the UI first and foremost, and the second approach allows you to do this in even the most complex interfaces.
Also, exception handling is relatively expensive in terms of cycles. Validation issues can be quite frequent, and I'd try to avoid such an expensive operation for handling issues that have the potential of being quite frequent.
Last, some validation isn't really necessary for the model, but it's there to prevent attacks. While you can add this to the model, the added functionality can quickly muddy the model code.
So, of these two approaches, I would suggest the second approach because:
You can craft a clear perimeter to your app.
All of the validation is in one place and can be shared.
There's no duplication of validation if two or more models make use of the same input.
The models can focus on what they're good at: mapping knowledge of abstract entities to application state.
Even the most complex UI's can be appropriately validated.
Preemption likely will be more efficient.
Security-focused validation tasks that don't really belong in any model can be cleanly added to the app.
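A minimal sketch of that perimeter idea (function and field names invented; the username regex is borrowed from elsewhere on this page): validate the raw request once at the boundary, and let the rest of the application trust what got through, escaping only per destination.
// At the application's perimeter, e.g. in the controller receiving the HTTP request.
function validateRegistrationInput(array $input): array
{
    $errors = [];
    if (!isset($input['email']) || !filter_var($input['email'], FILTER_VALIDATE_EMAIL)) {
        $errors['email'] = 'A valid email address is required.';
    }
    if (!isset($input['username']) || !preg_match('/^[0-9a-z_]{4,64}$/i', $input['username'])) {
        $errors['username'] = 'Username must be 4 to 64 characters: letters, digits or underscore.';
    }
    return $errors;   // an empty array means the data may enter the application
}
Inside the application the data is not re-validated, only escaped for its destination (prepared statements for the DB, htmlspecialchars() for HTML output, and so on).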

How to implement a full Observer pattern in PHP

The Observer design pattern is a solution for loosely coupling objects so they can work together. In PHP you can easily implement this using just two classes.
Basically, you have a subject which is able to notify and update a list of observers of its state changes.
The problem I'm trying to solve is knowing how to handle alerting the observers about different states of the object they are watching.
For example, let's say we have a file upload class to which we attach a logging class, a websockets class, and an image resize class. Each of these watching classes wants to know about different events in the upload process.
This file upload class might have three places where it needs to notify the classes listening that something has happened.
Error With Upload (alert logging class)
Upload success (alert websockets class)
Upload success and is image file (alert image resize class)
This is a very basic example, but how do you handle multiple events that different observers may need to know about? Calling notifyObservers() alone wouldn't be enough since each observer needs to know what it is being notified about.
One thought is that I could state with the call what type of event is being observed:
$this->notifyObservers('upload.error', $this);
However, that would mean I would have to add custom switching to the observers themselves to know how to handle different events.
function observe($type, $object)
{
    if ($type === 'upload.error') $this->dosomething();
    elseif ($type === 'something.else') $this->otherthing();
    ...etc...
}
I find that very ugly as it starts to couple the observers back to the class they are observing.
Then again, if I just notify observers without passing any information about what event just happened, they have to guess for themselves what is going on, which means more if() checks.
The observers aren't actually coupled to the class they are observing. The connection between the observer's handler and the observed object is made using literal string values (e.g. 'upload.error'), which means that:
If you want to observe a specific object, you have to know beforehand the names of the events it will be publishing; this is the "coupling" that you don't like.
On the other hand, if you are interested in a specific event only, you can observe any type of object for that event without having any knowledge about that object.
Item 2 above is a benefit that you care about, but what to do about item 1?
If you think about it, there needs to be some way to differentiate between callbacks to the same observer if they represent different events taking place. These "identifiers", no matter what form they take, need to be packaged either into the observed object or be a part of the observer library code.
In the first instance (inside observed object) you would probably need a way for observers to query "do you ever publish event X?" before starting to observe a target for that event. The target can answer this question just fine. This leaves a bitter taste of coupling, but if you want any object to observe any other, and you have no idea what you will be observing beforehand, I don't think you can do any better.
In the second approach, you would have a number of well-known events defined (as const inside a class?) in your library. Presumably such a list of events can be made because the library tackles a concrete application domain, and that domain offers obvious choices for the events. Then, classes both internal to your library (which would end up being observed) and external to it (the observers which plug into the framework) would use these identifiers to differentiate between events. Many callback-based APIs (such as Win32) use an approach practically identical to this.
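To make the second approach concrete, here is a small hedged sketch (no particular library, and SplSubject/SplObserver are deliberately skipped so the event name can travel with the notification): observers register only for the named events they understand.
class UploadEvents
{
    const ERROR   = 'upload.error';
    const SUCCESS = 'upload.success';
}
class FileUpload
{
    /** @var callable[][] observers keyed by the event name they care about */
    private $observers = [];
    public function attach(string $event, callable $observer): void
    {
        $this->observers[$event][] = $observer;
    }
    public function notify(string $event, $payload = null): void
    {
        foreach ($this->observers[$event] ?? [] as $observer) {
            $observer($payload);
        }
    }
}
$upload = new FileUpload();
$upload->attach(UploadEvents::ERROR,   function ($upload) { /* log the failure */ });
$upload->attach(UploadEvents::SUCCESS, function ($upload) { /* notify via websocket, resize image */ });
$upload->notify(UploadEvents::ERROR, $upload);
Each observer only hears about the events it registered for, so no switch statement inside the observer is needed.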

What is the best practice way to build my model?

I'm currently rebuilding an admin application and looking for your recommendations for best-practice! Excuse me if I don't have the right terminology, but how should I go about the following?
Take the example of "users" - typically we can create a class with properties like 'name', 'username', 'password', etc. and make some methods like getUser($user_ID), getAllUsers(), etc. In the end, we end up with an array/arrays of name-value pairs like: array('name' => 'Joe Bloggs', 'username' => 'joe_90', 'password' => '123456', etc).
The problem is that I want this object to know more about each of its properties.
Consider "username" - in addition to knowing its value, I want the object to know things like; which text label should display beside the control on the form, which regex I should use when validating, what error message is appropriate? These things seem to belong in the model.
The more I work on the problem, the more I see other things too; which HTML element should be used to display this property, what are minimum/maximum values for properties like 'registration_date'?
I envisaged the class looking something like this (simplified):
class User {
...etc...
private $model = array();
...etc...
function __construct(){
...etc...
$this->model['username']['value'] = NULL; // A default value used for new objects.
$this->model['username']['label'] = dictionary::lookup('username'); // Displayed on the HTML form. Actual string comes from a translation database.
$this->model['username']['regex'] = '/^[0-9a-z_]{4,64}$/i'; // Used for both client-side validation and backend validation/sanitising;
$this->model['username']['HTML'] = 'text'; // Which type of HTML control should be used to interact with this property.
...etc...
$this->model['registration_date']['value'] = 'now'; // Default value
$this->model['registration_date']['label'] = dictionary::lookup('registration_date');
$this->model['registration_date']['minimum'] = '2007-06-05'; // These values could be set by a permissions/override object.
$this->model['registration_date']['maximum'] = '+1 week';
$this->model['registration_date']['HTML'] = 'datepicker';
...etc...
}
...etc...
function getUser($user_ID){
...etc...
// getUser pulls the real data from the database and overwrites the default value for that property.
return $this->model;
}
}
Basically, I want this info to be in one location so that I don't have to duplicate code for HTML markup, validation routines, etc. The idea is that I can feed a user array into an HTML form helper and have it automatically create the form, controls and JavaScript validation.
I could then use the same object in the backend with a generic set($data = array(), $model = array()) method to avoid having individual methods like setUsername($username), setRegistrationDate($registration_date), etc...
Does this seem like a sensible approach?
What would you call value, label, regex, etc? Properties of properties? Attributes?
Using $this->model in getUser() means that the object model is overwritten, whereas it would be nicer to keep the model as a prototype and have getUser() inherit the properties.
Am I missing some industry-standard way of doing this? (I have been through all the frameworks - example models are always lacking!!!)
How does it scale when, for example, I want to display user types with a SELECT with values from another model?
Thanks!
Update
I've since learned that Java has class annotations - http://en.wikipedia.org/wiki/Java_annotations - which seem to be more or less what I was asking. I found this post - http://interfacelab.com/metadataattributes-in-php - does anyone have any insight into programming like this?
You're on the right track there. When it comes to models I think there are many approaches, and the "correct" one usually depends on your type of application.
Your model can be directly an Active Record, maybe a table row data gateway or a "POPO", plain old PHP object (in other words, a class that doesn't implement any specific pattern).
Whichever you decide works best for you, things like validation etc. can be put into the model class. You should be able to work with your users as User objects, not as associative arrays - that is the main thing.
Does this seem like a sensible approach
Yes, besides the form label thing. It's probably best to have a separate source for data such as form labels, because you may eventually want to be able to localize them. Also, the label isn't actually related to the user object - it's related to displaying a form.
How I would approach this (suggestion)
I would have a User object which represents a single user. It should be possible to create an empty user or create it from an array (so that it's easy to create one from a database result for example). The user object should also be able to validate itself, for example, you could give it a method "isValid", which when called will check all values for validity.
I would additionally have a user repository class (or perhaps just some static methods on the User class) which could be used to fetch users from the database and store them back. This repository would directly return user objects when fetching, and accept user objects as parameters for saving.
As for forms, you could probably have a form class which takes a user object. It could then automatically get values from the user and use them to validate itself as well.
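Roughly, and only as a sketch of the shape described above (none of these class names come from a specific framework; the regex is the one from your question):
class User
{
    private $data;
    public function __construct(array $data = [])
    {
        $this->data = $data;
    }
    public function isValid(): bool
    {
        return isset($this->data['username'])
            && preg_match('/^[0-9a-z_]{4,64}$/i', $this->data['username']);
    }
    public function toArray(): array
    {
        return $this->data;
    }
}
interface UserRepository
{
    public function find(int $userId): ?User;
    public function save(User $user): void;
}
A Doctrine- or PDO-backed class would implement UserRepository, while form classes and templates deal only with User objects, keeping labels and HTML concerns out of the model.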
I have written on this topic a bit here: http://codeutopia.net/blog/2009/02/28/creating-a-simple-abstract-model-to-reduce-boilerplate-code/ and also some other posts linked in the end of that one.
Hope this helps. I'd just like to remind that my approach is not perfect either =)
An abstract response for you which quite possibly won't help at all, but I'm happy to get the down votes as it's worth saying :)
You're dealing with two different models here; in some worlds we call these Class and Instance, in others we talk of Classes and Individuals, and in other worlds we make distinctions between A-Box and T-Box statements.
You are dealing with two sets of data here, I'll write them out in plain text:
User a Class .
username a Property;
domain User;
range String .
registration_date a Property;
domain User;
range Date .
this is your Class data, T-Box statements, Blueprints, how you describe the universe that is your application - this is not the description of the 'things' in your universe, rather you use this to describe the things in your universe, your instance data.. so you then have:
user1 a User ;
username "bob";
registration_date "2010-07-02" .
which is your Instance, Individual, A-Box data, the things in your universe.
You may notice here, that all the other things you are wondering how to do, validation, adding labels to properties and so forth, all come under the first grouping, things that describe your universe, not the things in it. So that's where you'd want to add it.. again in plain text..
username a Property;
domain User;
range String;
title "Username";
validation [ type Regex; value '/^[0-9a-z_]{4,64}$/i' ] .
The point in all this, is to help you analyse the other answers you get - you'll notice that in your suggestion you munged these two distinct sets of data together, and in a way it's a good thing - from this hopefully you can see that typically the classes in PHP take on the role of Classes (unsurprisingly) and each object (or instance of a class) holds the individual instance data - however you've started to merge these two parts of your universe together to try and make one big reusable set of classes outside of the PHP language constructs that are provided.
From here you have two paths: you can either fall into line and follow the language structure to make your code semi-reusable and follow suggested patterns like MVC (which, if you haven't done it, would do you good) - or you can head into a cutting-edge world where these worlds are described and we build frameworks to understand the data about our universes and the things in them, but it's an abstract place where at the minute it's hard to be productive, though in the long term it is the path to the future.
Regardless, I hope that in some way that helps you to get a grip of the other responses.
All the best!
Having looked at your question, the answers and your responses; I might be able to help a bit more here (although it's difficult to cover everything in a single answer).
I can see what you are looking to do here, and in all honesty this is how most frameworks start out; making a set of classes to handle everything, then as they are made more reusable they often hit on tried and tested patterns until finally ending up with what I'd say is 'just another framework', they all do pretty much the same thing, in pretty much the same ways, and aim to be as reusable as they can - generally about the only difference between them is coding styles and quality - what they do is pretty much the same for all.
I believe you're hitting on a bit of an anti-pattern in your design here. To explain: you are focused on making a big chunk of code reusable - the validation, the presentation and so forth - but what you're actually doing (and of course no offence) is making the working code of the application very domain specific. Not only that, but the design you illustrate will make it almost impossible to extend, to change layers (like making a mobile version), to swap techs (like swapping db vendors), and further still, because you've got presentation and application (and data) tiers mixed together, any designer who hits the app will have to be working in, and changing, your application code - hit a time when you have two versions of the app and you've got a big messy problem tbh.
As with most programming problems, you can solve this by doing three things:
designing a domain model.
specifying and designing interfaces rather than worrying about the implementation.
separating cross-cutting concerns
Designing a domain model is a very important part of Class based OO programming, if you've never done it before then now is the ideal time, it doesn't matter whether you do this in a modelling language like UML or just in plain text, the idea is to define all the Entities in your Domain, it's easy to slip in to writing a book when discussing this, but let's keep it simple. Your domain model comprises all the Entities in your application's domain, each Entity is a thing, think User, Address, Article, Product and so forth, each Entity is typically defined as a Class (which is the blueprint of that entity) and each Class has Properties (like username, register_date etc).
class User {
    public $username;
    public $register_date;
}
Often we may keep these as POPOs, however they are often better thought of as Transfer Objects (often called Data Transfer Objects, Value Objects) - a simple Class blueprint for an entity in your domain - normally we try to keep these portable as well, so that they can be implemented in any language, passed between apps, serialized and sent to other apps and similar - this isn't a must, indeed nothing is a must - but it does touch on separation of concerns in that it would normally be naked, implying no functionality, just a blueprint to hold values. Contrast sharply with Business Objects and Utility Classes that actually 'do' things, are implementations of functionality, not just simple value holders.
Don't be fooled though, both Inheritance and Composition also play their part in domain model, a User may have several Addresses, each Address may be the address of several different Users. A BillingAddress may extend a normal Address and add in additional properties and so forth. (aside: what is a User? do you have a User? do you have a Person with 1-* UserAccounts?).
After you've got your domain model, the next step is usually mapping it up to some form of persistence layer (normally a database). Two common ways of doing this (in a well-defined way) are to use an ORM (such as Doctrine, which is in Symfony if I remember correctly) or to use the DAO pattern - I'll leave that part there, but typically this is a distinct part of the system. DAO layers have the advantage that you specify all the methods available to work with the persistence layer for each Entity while keeping the implementation abstracted, thus you can swap database vendors without changing the application code (or business rules as many say).
I'm going to head in to a grey area with the next example, as mentioned earlier Transfer Objects (our entities) are typically naked objects, but they are also often a good place to strap on other functionality, you'll see what I mean.
To illustrate Interfaces, you could simply define an Interface for all your Entities which is something like this:
interface Validatable {
    function isValid();
}
then each of your entities can implement this with their own custom validation routine:
class User implements Validatable {
    public function isValid()
    {
        // custom validation here
        return $boolean;
    }
}
Now you don't need to worry about creating some convoluted way of validating objects, you can simply call isValid() on any entity and find out if it's valid or not.
The most important thing to note is that by defining the interface, we've separated some of the concerns, in that no other part of the application needs to do anything to validate an object, all they need to know is that it's Validatable and to call the isValid() method.
However, we have crossed some concerns in that each object (instance of a Class) now carries its own validation rules and model. It may make sense to abstract this out; one easy way of doing this is to make the validation method static, so you could define:
class User {
    public static function validate(User $user)
    {
        // custom validation here
        return $boolean;
    }
}
Or you could move to using getters and setters, this is another very common pattern where you can hide the validation inside the setter, thus ensuring that each property always holds valid data (or null, or default value).
Or perhaps you move the validation into its own library? A Validate class with its own methods; or maybe you just pop it in the DAO layer because you only care about checking something when you save it; or maybe you need to validate both when you receive data and when you persist it - how you end up doing it is your call and there is no 'best way'.
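For instance, the setter-based variant might look like this (just a sketch; the regex is again the one from the question):
class User
{
    private $username;
    public function setUsername(string $username): void
    {
        if (!preg_match('/^[0-9a-z_]{4,64}$/i', $username)) {
            throw new InvalidArgumentException('Invalid username.');
        }
        $this->username = $username;   // the property can only ever hold valid data
    }
    public function getUsername(): ?string
    {
        return $this->username;
    }
}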
The third consideration, which I've already touched on, is separation of concerns - should a persistence layer care how the things it's persisting are presented? Should the business logic care about how things are presented? Should an Entity care where and how it's displayed? Or should the presentation layer care how things are presented? Similarly, we can ask: is there only ever going to be one presentation layer? In one language? What about how a label appears in a sentence - sure, singular User and Address make sense, but you can't simply add an 's' to show the lists because Users is right but Addresss is wrong ;) - and we also have working considerations like: do I want a new designer having to change application code just to change the presentation of 'user account' to 'User Account', and do I even want to change my app code in the classes when that change is asked for?
Finally, and just to throw out everything I've said - you have to ask yourself, what's the job I'm trying to do here? Am I building a big reusable application with potentially many developers and a long life cycle here - or would a simple PHP script for each view and action suffice (one that reads $_GET/$_POST, validates, saves to the db, then displays what it should or redirects where it should)? In many, if not all, cases this is all that's needed.
Remember, PHP is made to be invoked when a request is made to a web server and then send back a response - that's it. What happens in between is your domain, your job; the client and user typically don't care. You can sum up what you're trying to do this simply: build a script to respond to that request as quickly as possible, with the expected results. That's it, and it needn't be any more complicated than that.
To be blunt, doing everything I mentioned and more is a great thing to do, you'll learn loads, understand your job better etc, but if you just want to get the job out the door and have easy to maintain simple code in the end, just build one script per view, and one per action, with the odd reusable bit (like a http handler, a db class, an email class etc).
You're running into the Model-View-Controller (MVC) architecture.
The M only stores data. No display information, just typed key-value pairs.
The C handles the logic of manipulating this information. It changes the M in response to user input.
The V is the part which handles displaying things. It should be something like Smarty templates rather than a huge amount of raw PHP for generating HTML.
Having it all "in one place" is the wrong approach. You won't have duplicated code with MVC - each part is a distinct step. This improves code reuse, readability, and maintainability.
