Unless you count QBASIC back when I was a kid, I learned to program in the object oriented paradigm a half-decade ago in university, so I understand how it works. I have difficulty, however, implementing it in my large, database-driven PHP applications, because everything is re-initialized on each page load. The only true persistence is the database, but pulling all the data from the database and reconstructing all the objects on every page load sounds like a lot of overhead for not much benefit, as I'll just have to do it all over again on next page load.
Additionally, I run into these sorts of problems when I try to re-factor an application (third attempt now) to OO:
A class called "Person" exists, and it contains a variety of properties. One of the extensions of that class, "Player", contains a "parent" property, of the type "Player". Initially, none of the players start with any parents, so they would all be initialized with that field as NULL. Over time, however, parents would be added to the Players properties, and would reference other Player objects that already exist.
In PHP, however, it's inevitable that I would have to rebuild the class structure with Players already having parents. In such a case, if I instantiate a Player who has as a parent a Player that hasn't been instantiated yet - I have a problem.
Is it necessary to reduce the scope of OO programming when dealing with PHP?
You're confusing OOP concepts, with concepts about HTTP and persistence.
Your question isn't really about OOP at all- it's about persisting OOP data across multiple HTTP requests.
PHP doesn't automatically persist objects across HTTP requests- it's left up to you to take care of that (in your case, doing just as you said: "pulling all the data from the database and reconstructing all the objects on every page load").
Is it necessary to reduce the scope of OO programming when dealing with PHP?
No. You just need to use lazy loading to instantiate parent player only if it becomes necessary.
Class Player {
protected $id;
protected $name;
protected $parentId;
protected $parent;
public static function getById($id) {
// fetch row from db where primary key = $id
// instantiate Player populated with data returned from db
}
public function getParentPlayer() {
if ($this->parentId) {
$this->parent or $this->parent = Player::getById($this->parentId);
return $this->parent;
}
return new Player(); // or null if you will
}
}
This way if you intantiate player id 40:
$john = Player::getById(40);
It does not have the $parent property populated. Only upon call to getParentPlayer() it is being loaded and stored under that property.
$joe = $john->getParentPlayer();
That is of course only if $parentId is pointing to non-NULL.
Update: to solve problem of duplicate instances stored in protected property when two or more players share single parent, static cache can be used:
protected static $cache = array();
public static function getById($id) {
if (array_key_exists($id, self::$cache)) {
return self::$cache[$id];
}
// fetch row from db where primary key = $id
// instantiate Player populated with data returned from db
self::$cache[$id] = ... // newly created instance
// return instance
}
If you expect to access significant number of parents on every request, it may be feasible to select all records from database at once and store them in another static pool.
Items not being created by means of parents is indeed something that is tricky. You could do something like, ID referencing and build a PlayerManager that does the linking for you. By means of RUID's (Runtime Unique ID's). But of course, you would like to have the parents be created first.
So when creating something like an OO structure of your database, which I would encourage if you want to do some extra manipulations on it besides printing. It is doable to create a player from your DB and then, if it has a parent, create the parent as you would do normally. When arriving at the parent in your while-loop. You can just say, no it is already created. Use a Player-manager class for this, which holds your Players. So it can check if Players (and thus parents) are already there.
So you get a bit of a different way of walking through your DB, instead of just doing it linear.
As for: is it necessary to do so. I actually don't know that much about it myself. It seems, as far as I've seen on the web, that PHP classes are mostly used to created one object that can do smart things (like a DOMDocument).
But do you really have to convert your DB table to objects? It depends on your type of application. If printing is what you want, it doesn't seem logical. If some big manipulation with a lot of calls to the objects (or normally to the table), I maybe would like to see a nice OO programmed structure.
OOP programming in PHP is for me more a way of having a good maintainability and a good architecture (composition, factories, component based programing, etc ).
For sure you'll need to rethink some Design Patterns used in persitent envirronments. Pools of objects, etc. And of course some Design Patterns as still completly good in PHP used in short-term module in apache: Singleton, Factory, Adapter, Interfaces, Lazy loading etc.
About the persitence problem there're several solutions. Of course database storage. But you have as well the session storage and application caches (like memcached or apc). You can store serialized objects in such caches. You'll just need a good autoloader to get the classes loaded (and a good opcode to avoid re-interpreting the sources at every requests).
Some objects are really heavy to load and build, we can think for example of an ACL object, loading roles, ressources, default policy, exception rules, maybe even compling some of theses rules. Once you've got it it would be a desaster to rebuild this object at every request. You should really study the way to store/load this finished object somewhere to limit his load time (avoiding SQL requests and build time). Once built it's certainly an object that could have a long life, like 1hour.
The only objects that you cannot store as serialized strings are in fact the one which depends shortly of data updates by concurrent requests. Some websites does not even need to really care about it, accuracy of data is not the same in a public website publishing news and in a financial accouting app. If your application can handle pre-build serialized objects you can see the serialize-storage-load-unserialize as a simple overhead in the framework :-) But effectivly, share nothing, no object listening to all concurrent requests, no write operation on an object shared with a parallel request until this object (or related data) is stored somewhere (and read). Take it as a challenge!
Related
As a means to try and learn object oriented PHP scripting, I'm currently attempting to rewrite a database web application that I previously wrote in procedural PHP. The application is used to store car parts and information about car parts.
In my application there are car parts. They are identified by various reference numbers, which are assigned by different organisations (the part's manufacturer, re-manufacturers, vehicle manufacturers, etc.), and any particular car part could have zero, one or many reference numbers as assigned by these organisations (and, confusingly, each reference number may refer to more than one unique car part as defined in the database I'm working on).
As far as I understand things, I am dealing with three different classes of entities. There is that of the car part, the reference number, and that of the reference-assigner (in my internal nomenclature I call these 'referrers'). As I am just getting started with learning OOP, I have begun by creating some very basic classes for each:
class Part {
public $part_id;
public $part_type;
public $weight;
public $notes;
public $references;
private $db;
function __construct(Database $db) {
$this->db = $db;
}
}
class Reference {
public $reference_id;
public $reference;
public $referrer;
private $db;
function __construct(Database $db) {
$this->db = $db;
}
}
class Referrer {
public $referrer_id;
public $referrer_name;
private $db;
function __construct(Database $db) {
$this->db = $db;
}
}
?>
What I've been struggling with is how to populate these and subsequently glue them together. The most basic function of my web application is to view a car part, including its metrics and its assigned reference numbers.
A part_id is included in a page request. Initially, I wrote a constructor method in my Part class which would look up that part in the database, and then another method which would look up reference numbers (which were JOINed with the referrer table) assigned to that Part ID. It would iterate through the results and create a new object for each reference, and hold the complete set in an array indexed by the reference_id datum.
After further reading, I began to understand that I should use a factory class to carry this out, however, as this kind of conjunction between my Part class and my Reference class is not the responsibility of any one of those discrete classes. This made conceptual sense, and so I have since devised a class that I've called PartReferenceFactory, which I understand should be responsible for assembling any kind of collation of part reference numbers:
class PartReferenceFactory {
public static function getReferences(Database $db, $part_id) {
$db_result = $db->query(
'SELECT *
FROM `' . REFERRER_TABLE . '`
LEFT JOIN `' . REFERENCE_TABLE . '` USING (`referrer_id`)
INNER JOIN `' . REFERENCE_REL . '` USING (`reference_id`)
WHERE `part_id` = :part_id
ORDER BY `order` ASC, `referrer_name` ASC, `reference` ASC',
array(':part_id' => $part_id);
);
if(empty($db_result)) {
return FALSE;
} else {
$references = array();
foreach($db_result as $reference_id => $reference_properties) {
$references[$reference_id] = new Reference($db);
$references[$reference_id]->reference = $reference_properties['reference'];
$references[$reference_id]->referrer = new Referrer($db);
$references[$reference_id]->referrer->referrer_id = $reference_properties['referrer_id'];
$references[$reference_id]->referrer->referrer_name = $reference_properties['referrer_id'];
}
return $references;
}
}
}
?>
The getReferences method of my factory class is then called inside my Part class, which I revised thusly:
class Part {
public $part_id;
public $part_type;
public $weight;
public $notes;
public $references;
private $db;
function __construct(Database $db) {
$this->db = $db;
}
function getReferences() {
$this->references = PartReferenceFactory::getReferences($this->db, $this->part_id);
}
}
Really I'm looking for advice on whether my general approach is a good one or if, as I suspect, I've misunderstood something, am overlooking other things, and am tying myself in knots. I will try to distill this into some underlying, directly-answerable questions:
Does my understanding of the purpose of classes and of factory classes seem erroneous?
Is it a bad idea for me to store arrays of objects? I ask from a viewpoint more of design than performance, insofar as they are not intertwined.
Is this even an appropriate way to structure relationships/inter-dependencies between classes in PHP?
Is it correct to call my PartReferenceFactory inside the getReferences method of the (instantiated) Part class? And is storing the returned reference objects within the part objects appropriate?
As a part of the application's GUI, I'll need to provide lists of referrers, requiring me to create another array of ALL referrers independently of any part object. Yet some of these referrers will exist within the $references array inside the part object. It has occurred to me that in order to avoid duplicate referrer objects, I could SELECT a list of all referrers at the beginning of each page request, formulate these into a global array of referrers, and simply reference these from within my Part and Reference classes as needed. However, I have read that it is not good to rely on the global scope within classes. Is there an elegant/best-practice solution to this?
Thank you very much for the time of whomever happens to read this, and I apologise for the extremely long question. I often worry that I mis-explain matters, and so wanted to be as precise as possible. Please let me know if it would be beneficial for any parts of my post to be deleted.
Hmmm, I think Jon is right, this question has taken real commitment to get my head round and I'm still not sure I know everything you're asking, but I'll give it a go.
1) Is your understanding of classes/factory classes right?
Well yes and no. Yes you do understand it, but you're taking a very classroom type approach to the problem, rather than the sort of pragmatism that comes with experience. OOP is a way of modelling real things, but it's still fundamentally a programming tool, so don't over complicate your structure, just to keep it real. The main think OOP gives you over structured programming is inheritance, which means you can code things which are kind of like other things, so reuse code better by only coding the stuff that's actually different. If 2 things would share code, make them a shared parent and put the code in there. Envisaging how the hierarchy might work efficiently is 80% of the task of designing an OOP application, and you're bound to get it wrong first time and have to restructure.
For example. I've coded a number of classes which represent entities in a database: ie Users, Realthings, Collectibles. Each has fundamentally different set of attributes and relations, but also has a core set of things that are similar: the way they're interrogated, the way the data is presented, the fact that they have attributes, fields and relations. So I coded a parent class which everything inherits, and it contains most of the code, then the specific children just define the stuff which is different for each class, and the hard work was deciding how much could go in Indexed to avoid repeating code in the child classes. Initially I had a lot of code in the children, but as it evolved I moved more and more code into the parent class, and thinned out the children. Like I say, don't assume you'll get it right first time.
2) Is it a bad idea to store objects in arrays? Well no, but why would you want to? In your example you have an array of references in the database, where the relationship is a many to many relation (ie you have 3 tables with a joining table in the middle). Then when you get the data you create each object and store it in an array. Sure you can do that, but remember each object has an overhead. Loading everything into memory is fine, but don't do it unless you need to. Also be sure you need a many<=>many relationship. I may have mis-read your explanation, but I thought you'd only need a one<=>many relationship, which could be done by storing a reference to the 'one' in the record of each of the 'many' (thus loosing the middle table and simplifying the join).
5) I'm going to jump to the last one at this point because it feeds into #2. You need to think of your application from different perspectives (a) when you have data and are trying to present it (read) (b) when you have some data and are trying to add new data, with links to existing data (adding a relation) and (c) when you have no data, and are adding a fresh record. How might the user get to each of those perspectives and what would happen. It's often easy to get (a) and (c) to work, but making (b) intuitive is often the hard part, and if you over complicate it, then users simply won't get it. When building up the inner data structures in #2 only do what you need, for the perspective a user is in. Unless you're writing an app, you don't need to load everything, only the stuff for the task at hand.
3) Not sure what you mean by 'is this appropriate'. OOP allows you to tie yourself up in knots, or create wonderful works of art. In theory you should just try to keep it simple. There are great books which give examples of why you might want to make things more complicated, but most of the reasons aren't obvious, so you'll need experience to decide if what you're doing is needlessly over complicating things or if it actually avoids a pitfall. From personal experience if you think it's overcomplicating, it probably is.
4) Not sure if this is what you meant, but I've taken this question to be asking if you've used Factory classes correctly. My understanding of the theory is that what you should end up with, is next to no references to doing things statically, beyond the initial creation of a FactoryClass object, stored statically in the child class being used, then you have to code an abstract method in parent class, implemented in each child class, which gets you that object, so you can call on the object of that FactoryClass, using a method call. I normally only call FactoryClass stuff directly when a non-factory class is initialised (ie you need to populate the static in a new object, incase the class hasn't been init'd yet). I don't see any problem calling it directly as you have, but I'd avoid that as the Factory is IMHO an implementation detail of the class, and so shouldn't be exposed outside that.
In my experience you're always learning new things and discovering pitfalls in OOP, so even thought I've been coding professionally for nearly 20 years, I'd not claim to be an expert on this, so I'm sure there will be totally opposing views to what I've said. I think you learn most by doing what your gut says is right, then not being too stubborn to start over again, if you gut changes it's mind :-)
Background
This is a long and complicated question. I'm using php/MySQL as an example (since it's an actual example) but this could theoretically apply to other languages. Bear with me.
MVC Framework with no ORM
I'm creating an application that uses its own framework. For a variety of reasons, the following answers to this question will not be accepted:
Why don't you just use X framework?
Why don't you just use X ORM?
This application does its own queries (written by me) and that's how I'd like it to stay for now.
Business Logic?
"Business Logic" seems to be a meaningless buzzword, but I take it to essentially mean
The queries and the logic that builds the result set based on those queries
I've also read that the Model in an MVC should do all the business logic.
User.php is 884 lines
Now that I've gotten my app working fairly well, I'd like to refactor it so as not to have such abominations. User.php is essentially the model for Users (obviously). There are some responsibilities I see in it that I could easily pluck out, but a major hurdle I'm running into is:
How can I reconcile SOLID with MVC?
The reason that User.php has grown so large is because I do any query that requires a User member in that file. User is used for a ton of operations (it needs to do so much more than just CRUD), so any query that needs userid, username, etc. is run by a function in this file. Apparently the queries should be in the model (and not the controller), but I feel that this should definitely be split up somehow. I need to accomplish the following:
Create an API that covers all of these necessary queries in a compartmentalized way
Avoid giving access to the DB connection class when it's not necessary
Add User data to the view (User.php is doing that right now -- the view object is injected by a setter, which I think is also bad).
...so what I could do is create other objects like UserBranchManager, UserSiteManager, UserTagManager, etc. and each of those could have the relevant queries and the DB object injected to run those queries, but then how do they get the coveted User::$userid that they need to run these queries? Not only that, but how could I pass Branch::$branchid? Shouldn't those members be private? Adding a getter for them also makes that pointless.
I'm also not sure where to draw the line of how much an object should do. A lot of the operations similar but still different. A class for each one would be huge overkill.
Possible Answer
If I can't get any help, what I'll do (or at least try to do) is have a dependency injection container of some kind build dependencies for the objects above (e.g. UserBranchManager) and inject them into the relevant controller. These would have a DB and Query object. The Query object could be passed to low level models (like User) to bind parameters as needed, and the higher level models or whatever they are called would give the results back to the controller which would add the data to the template as needed as well. Some possible hurdles I see are creating proper contracts (e.g. the UserController should preferably depend on some abstraction of the user models) but some specifics are inevitably required, especially when it comes to the view.
Can anyone offer some wisdom in response to my rambling question?
Response to #tereško
He has provided a great answer not only here, but also at How should a model be structured in MVC?
Code
As requested, here is some extremely pared down code (basically services one request). Some important notes:
Right now, controllers are not classes, just files
The controller also handles a lot of the routing
There are no "view" objects, just the templates
This will probably look really bad
These are also things to improve, but I'm mostly worried about the model (User in particular since it's getting out of control):
#usr.php -- controller
$route = route();
$user = '';
$branch = '<TRUNK>';
if (count($route) > 0) {
if (count($route) > 1) {
list($user, $branch) = $route;
}
else {
list($user) = $route;
}
}
$dec = new Decorator('user');
$dec->css('user');
if (isset($_SESSION['user']) && $_SESSION['user']->is($user)) {
$usr = $_SESSION['user'];
}
else {
$usr = new User(new DB, $user);
}
$usr->setUpTemplate($dec, $branch);
return $dec->execute();
# User.php -- model
class User {
private $userid;
private $username;
private $db;
public function __construct(DB $db, $username = null) {
$this->username = $username;
$this->db = $DB;
}
public function is($user) {
return strtolower($this->username) === strtolower($user);
}
public function setUpTemplate(Decorator $dec, $branch) {
$dec->_title = "$this->username $branch";
// This function runs a moderately complicated query
// joining the branch name and this user id/name
$dec->branch = $this->getBranchDisplay($branch);
}
}
Questions about answers
Answer here:
You talk about leaving off caching/authentication/authorization. Are these important? Why aren't they covered? How do they relate to the model/controller/router?
The Data Mapper example has the Person class with methods like getExemption, isFlaggedForAudit .. what are those? It seems like those calculations would require DB data, so how does it get them? Person Mapper leaves off select. Isn't that important?
What is "domain logic?"
Answer 5863870 (specifically the code example):
Shouldn't these factory objects be abstractions (and not rely on creation via new) or are these special?
How would your API include the necessary definition files?
I've read a lot about how it's best for dependencies to be injected in the constructor (if they're mandatory). I assume you set the factories in this way, but why not the objects/mappers/services themselves? What about abstractions?
Are you worried about code duplication (e.g. most models requiring an _object_factory member in their class definition)? If so, how could you avoid this?
You're using protected. Why?
If you can provide any specific code examples that would be best since it's easier for me to pick stuff up that way.
I understand the theory of what your answers are saying and it's helped a lot. What I'm still interested in (and not totally sure of) is making sure that dependencies of objects in this API are handled in the best way (the worst would be new everywhere).
Dont confuse SOLID (You can get a good explanation of what it is on my blog at: http://crazycoders.net/2012/03/confoo-2012-make-your-project-solid/
SOLID is great when considering the framework that goes around the application you are trying to build. The management of the data itself is another thing. You can't really apply the Single Responsibility of the S of SOLID to a business model that RELIES on other business models such as User, Groups and Permissions.
Thus, you have to let some of the SOLID go when you build an application. What SOLID is good for, is to make sure your framework behind your application is strong.
For example, if you build your own framework and business model, you will probably have a base class MODEL another for DATABASEACCESS, just remember that your MODEL shouldn't be aware of how to get the data, just know that it must get data.
For example:
Good:
- MyApp_Models_User extends MyApp_Framework_Model
- MyApp_Models_Group extends MyApp_Framework_Model
- MyApp_Models_Permission extends MyApp_Framework_Model
- MyApp_Framework_Model
- MyApp_Framework_Provider
- MyApp_Framework_MysqliProvider extends MyApp_Framework_Provider
In this good part, you create a model like this:
$user = new MyApp_Models_User(new MyApp_Framework_MysqliProvider(...));
$user->load(1234);
This way, you will prevent a fail in the single responsibility, your provider is used to load the data from one of the many providers that exist and your model represents the data that you extracted, it doesn't know how to read or write the data, thats the providers job...
Bad way:
- MyApp_Model_User
- MyApp_Model_Group
- MyApp_Model_Permission
define('MyDB', 'localhost');
define('MyUser', 'user');
define('MyPass', 'pass');
$user = new MyApp_Models_User(1234);
Using this bad method you first of all break the single responsibility, your model represents something and also manages the input/ouput of the data. Also, you create a dependency by stating that you need to define constants for the model to connect to the database and you completly abstract the database methods, if you need to change them and have 37 models, you'll have TONS of work to do...
Now, you can, if you want work the bad way... i still do it, i'm aware of it, but sometimes, when you have crappy structure and want to refactor, you can and should work against a principle just to refactor correctly and slowly, THEN, refactor a little more and end up with something SOLID and MVC looking.
Just remember that SOLID doesn't apply to everything, it's just a guideline, but it's very very good guideline.
Well .. it depends on what is actually inside your ./user.php file. If i had to guess, you would be a situation, where your user "model" has too many responsibilities. Basically, you are violating single responsibility principle and not sure how to go about fixing that.
You did no provide any code .. so , lets continue with guessing ...
It is possible that your user "model" is implementing active record pattern. This would be the main source of SRP problems. You could watch this part of lecture slides. It will explain some of it. The point would be, instead of using active record, to go with something similar to a data mapper pattern.
Also, you might notice that some of the domain logic, which works with User class instances, seems to happen outside your "model". It might be beneficial to separate that part in a different structure. Otherwise you will be forcing the domain logic inside the controller. Maybe this comment could shine some light on the whole subject.
Another thing you might have crammed inside your user "model" could be parts of the authorization (no to confuse with authentication) mechanism. It could be pragmatic to separate this responsibility.
Update
You talk about leaving off caching/authentication/authorization. Are these important? Why aren't they covered? How do they relate to the model/controller/router?
Caching is something that you would add later in the application. The domain objects do not care where the data comes from. For that reason you can wither add the caching with in the service-level objects or inside the existing data mappers. I would advise to choose former option, because changing existing mappers might have unforeseen side effects. And because it would just over-complicate the existing mappers.
namespace Application\Service;
class Community{
public function getUserDetails( $uid )
{
$user = $this->domainObjectFactory->build('User');
$cache = $this->cacheFactory->build('User');
$user->setId( $uid );
try
{
$cache->pull( $user );
}
cache( NotFoundException $e)
{
$mapper = $this->mapperFactory->build('User');
$mapper->pull( $user );
$cache->push( $user );
}
return $user->getDetails();
}
}
This would illustrate a very simplified acquisition of user information based on user's ID. The code creates domain object and provides it with ID, then this $user ovject is used as condition to search for cached details or, if it fails, fetching that pulling that information from DB via the data mapper. Also, if that is successful, the details are pushed into the cache, for next time.
You might notice, that this example did not handle situation, when mapper cannot find such user with such ID in storage (usually - SQL database). As I said , it's a simplified example.
Also, you might notice, that this sort of caching can be easily added on case-by-case basis and would not drastically change how your logic behaves.
Authorization is another part, which should not directly influence your business logic. I already linked my preferred way for providing authentication. The idea is that, instead of checking for credentials inside controller (like here, here, here or here), the access rights are checked before you execute a method on the controller. This way you have additional options for handling the denial of access, without being stuck within a specific controller.
The Data Mapper example has the Person class with methods like getExemption(), isFlaggedForAudit() .. what are those? It seems like those calculations would require DB data, so how does it get them? Person Mapper leaves off select. Isn't that important?
The Person class is a domain object. It would contain the part of domain logic, that is associated directly with that entity.
For those methods to be executed, the mapper should at first load the data. In PHP it would look something like this:
$person = new Person;
$mapper = new PersonMapper( $databaseConnection );
$person->setId( $id );
$mapper->fetch( $person );
if ( $person->isFlaggedForAudit() )
{
return $person->getTaxableEearnings();
}
The names of methods in the PersonMapper are there as an example, so that you would understand, how the class is supposed to be used. I usually name methods fetch(), store() and remove() (or push/pull/remove ... depends on how much GIT have I been using). IMHO, there is no point to have a separate update() and insert() methods. If object's data was initially retrieved by mapper, then it is an UPDATE. If not - it is an INSERT. You can also determine it whether the value, which corresponds to PRIMARY KEY has been set or not (in some cases, at least).
What is "domain logic?"
It is part of the code which knows how to create an invoice and apply discount price for specific products. It's also the code which makes sure, that you do not submit registration form, you do not state that you have been born in 1592.
The MVC is made from two layers: presentation (can contain: views, templates, controllers) and model layer (can contain: services, domain objects, mappers). The presentation layer deals with user interaction and responses. The model layer deals with business and validation rules. You could say that domain business logic is everything in model, that does not deal with storage.
I guess there is no easy way to explain.
Shouldn't these factory objects be abstractions (and not rely on creation via new) or are these special?
Which "factory objects", where? Are you talking about this fragment ?
$serviceFactory = new ServiceFactory(
new DataMapperFactory( $dbhProvider ),
new DomainObjectFactory
);
$serviceFactory->setDefaultNamespace('Application\\Service');
That whole code fragment is supposed to be in bootstrap.php file (or you might be using index.php or init.php for that). It's the entry point for the application. It is not part of any class, therefore you cannot have *tight coupling" there. There is nothing to "couple with".
How would your API include the necessary definition files?
That fragment is not entire bootstrap.php file. Above that code are includes and initialization for autoloader. I am currently using dynamic loader, which lets classes from same namespace to be located in different folders.
Also, such autoloader is a development-stage artifact. In production code you have to use loader, which works with predefined hashtable and does not need to actually walk the tree of namespaces and locations.
I've read a lot about how it's best for dependencies to be injected in the constructor (if they're mandatory). I assume you set the factories in this way, but why not the objects/mappers/services themselves? What about abstractions?
What are you talking about ?!?
Are you worried about code duplication (e.g. most models requiring an _object_factory member in their class definition)? If so, how could you avoid this?
Did you actually LOOK at how old that code fragment in comments was ?!?!?
You're using protected. Why?
Because, if values/methods are defined with protected visibility, you can access them when you extend the original class. Also, that code example was made for old version of answer. Check the dates.
And no. I will not provide any specific code examples. Because each situation is different. If you want to do copy-paste development, then use CakePHP or CodeIgniter.
By business model, or business objects, I mean plain old objects like a "User" with all their properties name, adress, ...; in addition to all the user properties let's say each user would have an "AppointmentBook" object, each book has a set of "TimeSlot" objects, etc.
The business model has objects with references between them, at least that's how I code a business model in Java.
Here comes the question:
To intialize my business objects, in Java, I would
fetch all of the data from DB only once during application
initialization,
map data from my DB to my business objects
store in memory (maps) and they would be shared across all the requests.
PHP's Share-Nothing-Architecture is confusing me for proper OO programming:
If I use the same logic, I would have to fetch all the objects from DB, for every request (I know I could still cache, but you don't cache all of your DB, it's not a question about caching but rather about the way of programming in PHP and its architecture).
So let's say that for one HTTP request, I just need the User properties and I don't need to access his appointment book. It would be a pitty to fetch all the data from the DB for all the objects the User makes reference to, as I just need his properties. This means that I will initialize PHP objects from my model with a lot of NULL values (NULL because of the objects contained in User that I won't load) which can later on lead to errors.
I was wondering how professional PHP developers usually use their business objects?
(I'm coming from Java)
UPDATE: It was kind of stupid to say that I would load the whole database into memory during application init in Java. What I rather meant is that, if I need to fetch a specific user, I could just load all of its data and that would be accessible through all the requests.
In PHP you do not keep all the data of your domain business model in the memory. Instead you only request from DB ( though cache, if needed ), the data you want.
Model layer in php should be built from multiple domain object and data mappers ( i assume, that part is not so different from Java ). If you need User details, then you fetch only that information from database/cache. You most likely will have a separate mapper just for dealing with user(s).
You display the information about that user, and forget about the query. Next request (when and if it comes) will require different information. Maybe you will want ContactList for that User ... then you really do not need user itself, only his user_id. Again, you let you mapper to fetch data into the domain object responsible for handling contact list, and if contact list contains User instances, then just create them, but leave in "unfetched" state (object knows only own user_id). Fetch them only if you really need to, and only the parts which you will use ins that "view".
P.S. you might have notices, I told that model later should be segmented, but quite often php developers just create single class of each DB table (which implements ActiveRecord) and call it "model". This is a result caused by Ruby on Rails influence on php framework developers, which, IMHO, is one of the worst things that has happened to PHP in past 5 years.
Your Java example implies your storing your entire databases content in memory. If your doing that, what's the point of the database? Why not just create all those object and memdump them for persistence.
If I use the same logic, I would have to fetch all the objects from DB, for every request
That's simply madness, you don't need to fetch anything, you create new instances when you need them and destroy them when you no longer need them.
So let's say that for one HTTP request, I just need the User properties and I don't need to access his appointment book.
That's easy, redesign your user. Your user needs it's properties and a property called appointmentBook which is simply an array of appointment book ids.
If you actually need those appointments you can fetch them from the database later.
This means that I will initialize PHP objects from my model with a lot of NULL values (NULL because of the objects contained in User that I won't load) which can later on lead to errors.
Not really, if this is the case your User object is too big. Make it smaller, you should load the entire user. Except of course the user has to be small enough for you to sensible load it.
If you don't want that then you can always create a UserProperties class and let every User have one. When you load the User you load the properties, but you also have an option to create the properties seperately.
Even in Java you would not load all data from the database into memory. You can however - as you write - often load more compared to short Transaction Scripts you normally have in PHP.
You models should be "clever" then to only load the data from the persistence storage that is needed to perform the requested action. This requires the object to be "clever" enough to lazy-load data probably.
This can be achieved with a Domain Model that knows enough about itself and a Data Mapper that knows enough about the storage for example.
There are other patterns as well which might suit your needs depending on the type of application, however a Domain Model together with Data Mapper is quite flexible.
An exemplary data mapper in the PHP world is Doctrine.
I have a table called Cat, and an PHP class called Cat. Now I want to make a CatDataMapper class, so that Cat extends CatDataMapper.
I want that Data Mapper class to provide basic functionality for doing ORM, and for creating, editing and deleting Cat.
For that purpose, maybe someone who knows this pattern very well could give me some helpful advice? I feel it would be a little bit too simple to just provide some functions like update(), delete(), save().
I realize a Data Mapper has this problem: First you create the instance of Cat, then initialize all the variables like name, furColor, eyeColor, purrSound, meowSound, attendants, etc.. and after everything is set up, you call the save() function which is inherited from CatDataMapper. This was simple ;)
But now, the real problem: You query the database for cats and get back a plain boring result set with lots of cats data.
PDO features some ORM capability to create Cat instances. Lets say I use that, or lets even say I have a mapDataset() function that takes an associative array. However, as soon as I got my Cat object from a data set, I have redundant data. At the same time, twenty users could pick up the same cat data from the database and edit the cat object, i.e. rename the cat, and save() it, while another user still things about setting another furColor. When all of them save their edits, everything is messed up.
Err... ok, to keep this question really short: What's good practice here?
From DataMapper in PoEA
The Data Mapper is a layer of software
that separates the in-memory objects
from the database. Its responsibility
is to transfer data between the two
and also to isolate them from each
other. With Data Mapper the in-memory
objects needn't know even that there's
a database present; they need no SQL
interface code, and certainly no
knowledge of the database schema. (The
database schema is always ignorant of
the objects that use it.) Since it's a
form of Mapper (473), Data Mapper
itself is even unknown to the domain
layer.
Thus, a Cat should not extend CatDataMapper because that would create an is-a relationship and tie the Cat to the Persistence layer. If you want to be able to handle persistence from your Cats in this way, look into ActiveRecord or any of the other Data Source Architectural Patterns.
You usually use a DataMapper when using a Domain Model. A simple DataMapper would just map a database table to an equivalent in-memory class on a field-to-field basis. However, when the need for a DataMapper arises, you usually won't have such simple relationships. Tables will not map 1:1 to your objects. Instead multiple tables could form into one Object Aggregate and viceversa. Consequently, implementing just CRUD methods, can easily become quite a challenge.
Apart from that, it is one of the more complicated patterns (covers 15 pages in PoEA), often used in combination with the Repository pattern among others. Look into the related questions column on the right side of this page for similar questions.
As for your question about multiple users editing the same Cat, that's a common problem called Concurrency. One solution to that would be locking the row, while someone edits it. But like everything, this can lead to other issues.
If you rely on ORM's like Doctrine or Propel, the basic principle is to create a static class that would get the actual data from the database, (for instance Propel would create CatPeer), and the results retrieved by the Peer class would then be "hydrated" into Cat objects.
The hydration process is the process of converting a "plain boring" MySQL result set into nice objects having getters and setters.
So for a retrieve you'd use something like CatPeer::doSelect(). Then for a new object you'd first instantiate it (or retrieve and instance from the DB):
$cat = new Cat();
The insertion would be as simple as doing: $cat->save(); That'd be equivalent to an insert (or an update if the object already exists in the db... The ORM should know how to do the difference between new and existing objects by using, for instance, the presence ort absence of a primary key).
Implementing a Data Mapper is very hard in PHP < 5.3, since you cannot read/write protected/private fields. You have a few choices when loading and saving the objects:
Use some kind of workaround, like serializing the object, modifying it's string representation, and bringing it back with unserialize
Make all the fields public
Keep them private/protected, and write mutators/accessors for each of them
The first method has the possibility of breaking with a new release, and is very crude hack, the second one is considered a (very) bad practice.
The third option is also considered bad practice, since you should not provide getters/setters for all of your fields, only the ones that need it. Your model gets "damaged" from a pure DDD (domain driven design) perspective, since it contains methods that are only needed because of the persistence mechanism.
It also means that now you have to describe another mapping for the fields -> setter methods, next to the fields -> table columns.
PHP 5.3 introduces the ability to access/change all types of fields, by using reflection:
http://hu2.php.net/manual/en/reflectionproperty.setaccessible.php
With this, you can achieve a true data mapper, because the need to provide mutators for all of the fields has ceased.
PDO features some ORM capability to
create Cat instances. Lets say I use
that, or lets even say I have a
mapDataset() function that takes an
associative array. However, as soon as
I got my Cat object from a data set, I
have redundant data. At the same time,
twenty users could pick up the same
cat data from the database and edit
the cat object, i.e. rename the cat,
and save() it, while another user
still things about setting another
furColor. When all of them save their
edits, everything is messed up.
In order to keep track of the state of data typically and IdentityMap and/or a UnitOfWork would be used keep track of all teh different operations on mapped entities... and the end of the request cycle al the operations would then be performed.
keep the answer short:
You have an instance of Cat. (Maybe it extends CatDbMapper, or Cat3rdpartycatstoreMapper)
You call:
$cats = $cat_model->getBlueEyedCats();
//then you get an array of Cat objects, in the $cats array
Don't know what do you use, you might take a look at some php framework to the better understanding.
I'm using the data mapper pattern in a PHP app I'm developing and have a question. At present, you request a Site object with a specific ID and the mapper will lookup the row, create an object and return it. However, if you do this again for the same Site you end up with two different objects with identical data. eg.:
$mapper = new Site_Mapper();
$a = $mapper->get(1);
$b = $mapper->get(1);
$a == $b // true
$a === $b // false
So, my question is, should I:
Store instantiated Site objects in
the mapper so I can then check if
they already exist before creating a
new one (could be a problem if
there's multiple mappers of the same
type)
Do the same as #1 but ensure
there is only ever one instances of each
mapper
Do the same as #1 but use a
static property so multiple
instances isn't a problem
Don't
worry about it because it's
probably not a problem
What you're looking for is the Identity Map pattern. Be careful with so called "reading inconsistencies", though. While you use an "old instance", the DB might have been changed already. And while you edit your object, another user might get an instance of it, change it faster and save it faster. Then the other object overrides all these changes again. On the web though maybe not such a big problem since a "page" quickly runs through and no object survives for longer than a few fractional seconds.
I know the question was asked quite a while ago still wanted to answer just in case if someone else runs in a similar dilemma. Actually the above suggestions #1,2,3 that author made are all related and one should consider them all in order to solve the problem.
1) Store each retrieved from DB object in a mapper so that you don't have to do it again when object with the same ID is requested. In all subsequent calls the mapper should return the stored object. This is called IdentityMap pattern. To achieve this make a private property in your mapper to hold an instance of an IdentityMap for a given object type. The Site_Mapper->get() method should always check the IdentityMap for a given ID if the object is not retrieved yet the mapper will go to the database but if it's already stored in the map it returns the cached instance which saves the trip to the database. So then $a === $b should be true because they would be references to the same object instance.
2) Yes ideally there should always be one instance of the given data mapper (Site_Mapper) in order to maintain a single instance of the IdentityMap at a given time. This can be done using Singleton pattern. This is possible with some getter method like Site_Mapper::getInstance() which will always return the same instance of a given mapper. You'd also have to declare the __construct() as a private method to prevent unwanted instantiation using new and make sure getInstance() is the only way to instantiate a mapper.
3) What the author mentioned above about static properties is true as well. To implement a Singleton in PHP one has to use a static property to hold and instance of a Mapper.
I would highly recommend Martin Fowler's book "Patterns of Enterprise Application Architecture" which talks about the above mentioned patterns and many more. It's a good read especially if you're working on your own custom ORM solution. Hope that helps.
I'd go with caching somehow - static mapper classes would be my first choice, and is what I've seen most of. Otherwise, your option 2 (which is the singleton pattern) is probably the best option.
Remember you need to clear this cache when an update is made to avoid returning stale data.
Having said that, unless you are making something to get a lot of use or that does a lot of queries, it may not matter. (your 4)
Also worth looking at for guidance (I'm sure there are many examples, I just know this one best), Propel (http://propel.phpdb.org/) has the caching feature - might be worth looking at how it does it? Or just use it maybe?