I have a class representing access to one table in my database, in a class:table relationship, so I can hide the table details in one class. But, like any useful application, I have to query several tables. How can I accommodate this using the class:table design?
There are a couple of different ways you can achieve this; which one you choose really depends on your circumstances.
1) Break the connection between your objects and your database
Write your objects to have no connection with your database tables. First normalise your database tables, then look at how the user of your application will interact with your data. Model the data as objects, but don't tie each object to a table (i.e. with a Zend_Db_Table_Abstract class).
Once you have established your objects, write mapper classes which map your objects back to the relevant tables in your database. These are the classes which extend Zend_Db_Table (if appropriate).
You can handle joins in two ways: either map the joins through the Zend_Db_Table relationship functionality or (IMHO a better choice) just use Zend_Db_Select to build the relevant methods within your mapper class.
So you've then got two classes (probably per table, but not always):
Person
PersonMapper
In your code, when you want to work with some objects, either create a new object:
$person = new Person();
$person->setName('andrew taylor');
Then pass it to the mapper to save it:
$personMapper = new PersonMapper();
$personMapper->save($person);
Or, do it the other way:
$personMapper = new PersonMapper();
$person = $personMapper->load(29);
$person->setName('joe bloggs');
$personMapper->save($person);
The next step on from here would be a collection class based on the SPL:
$personList = $personMapper->loadAllMen();
foreach ($personList as $person) {
    echo $person->getName();
}
Where $personMapper->loadAllMen() is a method like:
public function loadAllMen()
{
    $select = $this->select();
    // Bind the value instead of embedding a quoted literal in the SQL
    $select->where('gender = ?', 'Male');
    $zendDbRows = $this->fetchAll($select);
    return new PersonList($zendDbRows);
}
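The object/mapper split described above can be sketched without Zend at all. The following is a minimal illustration, not the answer's actual implementation: it assumes PHP 8+ with the pdo_sqlite extension, uses plain PDO instead of Zend_Db_Table, returns an array rather than a PersonList collection, and the `person` table, its columns, and the `gender` property are hypothetical.

```php
<?php
// Sketch only: assumes PHP 8+ with the pdo_sqlite extension.
// The `person` table and its columns are hypothetical.

class Person
{
    private ?int $id = null;
    private string $name = '';
    private string $gender = '';

    public function getId(): ?int { return $this->id; }
    public function setId(int $id): void { $this->id = $id; }
    public function getName(): string { return $this->name; }
    public function setName(string $name): void { $this->name = $name; }
    public function getGender(): string { return $this->gender; }
    public function setGender(string $gender): void { $this->gender = $gender; }
}

class PersonMapper
{
    public function __construct(private PDO $pdo) {}

    // Inserts a new row, or updates the existing one when the object has an id.
    public function save(Person $person): void
    {
        if ($person->getId() === null) {
            $stmt = $this->pdo->prepare('INSERT INTO person (name, gender) VALUES (?, ?)');
            $stmt->execute([$person->getName(), $person->getGender()]);
            $person->setId((int) $this->pdo->lastInsertId());
            return;
        }
        $stmt = $this->pdo->prepare('UPDATE person SET name = ?, gender = ? WHERE id = ?');
        $stmt->execute([$person->getName(), $person->getGender(), $person->getId()]);
    }

    public function load(int $id): ?Person
    {
        $stmt = $this->pdo->prepare('SELECT id, name, gender FROM person WHERE id = ?');
        $stmt->execute([$id]);
        $row = $stmt->fetch(PDO::FETCH_ASSOC);
        return $row === false ? null : $this->hydrate($row);
    }

    /** @return Person[] */
    public function loadAllMen(): array
    {
        $stmt = $this->pdo->prepare('SELECT id, name, gender FROM person WHERE gender = ?');
        $stmt->execute(['Male']);
        return array_map(fn (array $row): Person => $this->hydrate($row), $stmt->fetchAll(PDO::FETCH_ASSOC));
    }

    // Turns one result row into a domain object; the only place that knows column names.
    private function hydrate(array $row): Person
    {
        $person = new Person();
        $person->setId((int) $row['id']);
        $person->setName($row['name']);
        $person->setGender($row['gender']);
        return $person;
    }
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, gender TEXT)');

$mapper = new PersonMapper($pdo);
$person = new Person();
$person->setName('andrew taylor');
$person->setGender('Male');
$mapper->save($person);           // INSERT

$person->setName('joe bloggs');
$mapper->save($person);           // UPDATE, because the id is now set

$men = $mapper->loadAllMen();
echo $men[0]->getName(), PHP_EOL;
```

Note that Person never mentions SQL; if a join is needed later, it becomes one more method on the mapper.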
2) MySQL Views
If you have a lot of joined tables where there is one row per join (for example, you're joining customer information based on an id in your orders table), and you're doing it read-only (so you don't want to update any information through the Zend_Db_Table adapter), you can create your normalised tables and then a single view across the top. The view handles the joins behind the scenes, so through Zend it feels like you're connecting to a single table.
There are some caveats with this: MySQL views do have some performance problems (which is why it's best on single-row FK joins), and you should treat them as strictly read-only.
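As a rough illustration of the view approach, here is a sketch in which a view hides a single-row FK join so that the calling code queries it like one table. It assumes PHP with the pdo_sqlite extension; SQLite stands in for MySQL so the example runs without a server, and the `customers`, `orders`, and `order_summary` names are hypothetical.

```php
<?php
// Sketch only: SQLite stands in for MySQL; all table and view names are made up.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Normalised tables.
$pdo->exec('CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec('CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)');
$pdo->exec("INSERT INTO customers (id, name) VALUES (1, 'joe bloggs')");
$pdo->exec('INSERT INTO orders (id, customer_id, total) VALUES (10, 1, 25.5)');

// One view across the top: the join lives in the database, not in the class.
$pdo->exec(
    'CREATE VIEW order_summary AS
     SELECT o.id AS order_id, o.total, c.name AS customer_name
     FROM orders o JOIN customers c ON c.id = o.customer_id'
);

// To the application the view looks like a single (read-only) table.
$rows = $pdo->query('SELECT order_id, total, customer_name FROM order_summary')
            ->fetchAll(PDO::FETCH_ASSOC);
echo $rows[0]['customer_name'], PHP_EOL;
```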
Related
Consider the following READ and WRITE queries:
Read
// Retrieves a person (and their active game score if they have one)
$sql = "SELECT CONCAT(people.first_name,' ',people.last_name) as 'name',
people.uniform as 'people.uniform',
games.score as 'games.score'
FROM my_people as people
LEFT JOIN my_games as games ON(games.person_id = people.id AND games.active = 1)
WHERE people.id = :id";
$results = DB::select(DB::raw($sql),array("id"=>$id));
Write
// Saves a person
$person = new People;
$person->data = array('first_name'=>$input['first_name'],
'last_name'=>$input['last_name'],
'uniform'=>$input['uniform']);
$personID = $person->save();
// Save the game score
$game = new Games;
$game->data = array('person_id'=>$personID,
'active'=>$input['active'],
'score'=>$input['score']);
$game->save();
I put every write (INSERT/UPDATE) operation into my own centralized repository classes and call them using class->methods as shown above.
I may decide to put some of the read queries into repository classes if I find myself using a query over and over (DRY). I have to be careful of this, because I tend to go back and adjust the read query slightly to get more or less data out in specific areas of my application.
Many of my read queries are dynamic queries (coming from datagrids that are filterable and sortable).
The read queries will commonly have complex things like SUMs, COUNTs, ORDERing, GROUPing, COMPOSITE keys, etc.
Since my reads are so diverse, why would I need to further abstract them into Laravel's Eloquent ORM (or any ORM), especially since I'm using PDO (which has 12 different database drivers)? The only advantage I can see at the moment is if I wanted to rename a database field then that would be easier to do using the ORM. I'm not willing to pay the price in greater abstraction/obscurity for that though.
I'm really struggling with a recurring OOP / database concept.
Please allow me to explain the issue with pseudo-PHP-code.
Say you have a "user" class, which loads its data from the users table in its constructor:
class User {
public $name;
public $height;
public function __construct($user_id) {
$result = Query the database where the `users` table has `user_id` of $user_id
$this->name= $result['name'];
$this->height = $result['height'];
}
}
Simple, awesome.
Now, we have a "group" class, which loads its data from the groups table joined with the groups_users table and creates user objects from the returned user_ids:
class Group {
public $type;
public $schedule;
public $users;
public function __construct($group_id) {
$result = Query the `groups` table, joining the `groups_users` table,
where `group_id` = $group_id
$this->type = $result['type'];
$this->schedule = $result['schedule'];
foreach ($result['user_ids'] as $user_id) {
// Make the user objects
$this->users[] = new User($user_id);
}
}
}
A group can have any number of users.
Beautiful, elegant, amazing... on paper. In reality, however, making a new group object...
$group = new Group(21); // Get the 21st group, which happens to have 4 users
...performs 5 queries instead of 1. (1 for the group and 1 for each user.) And worse, if I make a community class, which has many groups in it that each have many users within them, an ungodly number of queries is run!
The Solution, Which Doesn't Sit Right To Me
For years, the way I've got around this is to not code in the above fashion. Instead, when making a group, for instance, I join the groups table to the groups_users table and to the users table as well, and create an array of user-object-like arrays within the group object (never using/touching the user class):
class Group {
public $type;
public $schedule;
public $users;
public function __construct($group_id) {
$result = Query the `groups` table, joining the `groups_users` table,
**and also joining the `users` table,**
where `group_id` = $group_id
$this->type = $result['type'];
$this->schedule = $result['schedule'];
foreach ($result['users'] as $user) {
// Make user arrays
$this->users[] = array_of_user_data_crafted_from_the_query_result;
}
}
}
...but then, of course, if I make a "community" class, in its constructor I'll need to join the communities table with the communities_groups table with the groups table with the groups_users table with the users table.
...and if I make a "city" class, in its constructor I'll need to join the cities table with the cities_communities table with the communities table with the communities_groups table with the groups table with the groups_users table with the users table.
What an unmitigated disaster!
Do I have to choose between beautiful OOP code with a million queries VS. 1 query and writing these joins by hand for every single superset? Is there no system that automates this?
I'm using CodeIgniter, and I have looked into countless other MVC frameworks and projects built with them, and I cannot find a single good example of anyone using models without resorting to one of the two flawed methods I've outlined.
It appears this has never been done before.
One of my coworkers is writing a framework that does exactly this - you create a class that includes a model of your data. Other, higher models can include that single model, and it crafts and automates the table joins to create the higher model that includes object instantiations of the lower model, all in a single query. He claims he's never seen a framework or system for doing this before, either.
Please Note:
I do indeed always use separate classes for logic and persistence. (VOs and DAOs - this is the entire point of MVCs). I have merely combined the two in this thought-experiment, outside of an MVC-like architecture, for simplicity's sake. Rest assured that this issue persists regardless of the separation of logic and persistence. I believe this article, introduced to me by James in the comments below this question, seems to indicate that my proposed solution (which I've been following for years) is, in fact, what developers currently do to solve this issue. This question is, however, attempting to find ways of automating that exact solution, so it doesn't always need to be coded by hand for every superset. From what I can see, this has never been done in PHP before, and my coworker's framework will be the first to do so, unless someone can point me towards one that does.
And, also, of course I never load data in constructors, and I only call the load() methods that I create when I actually need the data. However, that is unrelated to this issue, as in this thought experiment (and in the real-life situations where I need to automate this), I always need to eager-load the data of all subsets of children as far down the line as it goes, and not lazy-load them at some future point in time as needed. The thought experiment is concise -- that it doesn't follow best practices is a moot point, and answers that attempt to address its layout are likewise missing the point.
EDIT : Here is a database schema, for clarity.
CREATE TABLE `groups` (
  `group_id` int(11) NOT NULL, -- Auto increment
  `make` varchar(20) NOT NULL,
  `model` varchar(20) NOT NULL
);
CREATE TABLE `groups_users` ( -- Relational table (many users to one group)
  `group_id` int(11) NOT NULL,
  `user_id` int(11) NOT NULL
);
CREATE TABLE `users` (
  `user_id` int(11) NOT NULL, -- Auto increment
  `name` varchar(20) NOT NULL,
  `height` int(11) NOT NULL
);
(Also note that I originally used the concepts of wheels and cars, but that was foolish, and this example is much clearer.)
SOLUTION:
I ended up finding a PHP ORM that does exactly this. It is Laravel's Eloquent. You can specify the relationships between your models, and it intelligently builds optimized queries for eager loading using syntax like this:
Group::with('users')->get();
It is an absolute life saver. I haven't had to write a single query. It also doesn't work using joins across the whole hierarchy; instead, it intelligently compiles separate selects based on foreign keys.
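To show roughly what this kind of eager loading amounts to, here is a hand-written sketch that loads any number of groups plus all of their users in two queries instead of one query per user. It is not Eloquent's implementation. It assumes PHP 8+ with pdo_sqlite (SQLite standing in for MySQL), and it uses the `type` and `schedule` columns from the class example rather than the columns in the schema above; the function name `loadGroupsWithUsers` is hypothetical.

```php
<?php
// Sketch only: assumes PHP 8+ with pdo_sqlite; SQLite stands in for MySQL.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE "groups" (group_id INTEGER PRIMARY KEY, type TEXT, schedule TEXT)');
$pdo->exec('CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT, height INTEGER)');
$pdo->exec('CREATE TABLE groups_users (group_id INTEGER, user_id INTEGER)');
$pdo->exec("INSERT INTO \"groups\" VALUES (21, 'study', 'weekly')");
$pdo->exec("INSERT INTO users VALUES (1, 'Ann', 170), (2, 'Bob', 180), (3, 'Cy', 165), (4, 'Di', 175)");
$pdo->exec('INSERT INTO groups_users VALUES (21, 1), (21, 2), (21, 3), (21, 4)');

// Returns the requested groups keyed by id, each with a 'users' list attached.
function loadGroupsWithUsers(PDO $pdo, array $groupIds): array
{
    $groupIds = array_values($groupIds);
    if ($groupIds === []) {
        return [];
    }
    // One "?" placeholder per id, e.g. "?,?,?".
    $in = implode(',', array_fill(0, count($groupIds), '?'));

    // Query 1: the groups themselves.
    $stmt = $pdo->prepare('SELECT group_id, type, schedule FROM "groups" WHERE group_id IN (' . $in . ')');
    $stmt->execute($groupIds);
    $groups = [];
    foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
        $groups[(int) $row['group_id']] = $row + ['users' => []];
    }

    // Query 2: every user of every requested group, fetched in one go.
    $stmt = $pdo->prepare(
        'SELECT gu.group_id, u.user_id, u.name, u.height
         FROM groups_users gu JOIN users u ON u.user_id = gu.user_id
         WHERE gu.group_id IN (' . $in . ') ORDER BY u.user_id'
    );
    $stmt->execute($groupIds);
    foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
        $groupId = (int) $row['group_id'];
        if (isset($groups[$groupId])) {
            unset($row['group_id']);
            $groups[$groupId]['users'][] = $row;   // stitch users onto their group in PHP
        }
    }
    return $groups;
}

$groups = loadGroupsWithUsers($pdo, [21]);
echo count($groups[21]['users']), " users loaded with 2 queries\n";
```

The query count depends on the number of relationship levels, not on the number of rows, which is why adding a community or city level adds one query each rather than multiplying them.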
Say you have a "wheel" class, which loads its data from the wheels table in its constructor
Constructors should not be doing any work. Instead they should contain only assignments. Otherwise you make it very hard to test the behavior of the instance.
Now, we have a "car" class, which loads its data from the cars table joined with the cars_wheels table and creates wheel objects from the returned wheel_ids:
No. There are two problems with this.
Your Car class should not contain both code for implementing "car logic" and "persistence logic". Otherwise you are breaking SRP. And wheels are a dependency of the class, which means that the wheels should be injected as a parameter of the constructor (most likely as a collection of wheels, or maybe an array).
Instead you should have a mapper class, which can retrieve data from the database and store it in the WheelCollection instance, and a mapper for the car, which will store data in the Car instance.
$car = new Car;
$car->setId( 42 );
$mapper = new CarMapper( $pdo );
if ( $mapper->fetch($car) ) //if there was a car in DB
{
$wheels = new WheelCollection;
$otherMapper = new WheelMapper( $pdo );
$car->addWheels( $wheels );
$wheels->setType($car->getWheelType());
// I am not a mechanic. There is probably some name for describing
// wheels that a car can use
$otherMapper->fetch( $wheels );
}
Something like this. The mappers in this case are responsible for performing the queries. And you can have several sources for them, for example: have one mapper that checks the cache and, only if that fails, pulls data from SQL.
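The cache-then-SQL idea can be sketched as one mapper wrapping another. This is only an illustration under assumptions: it requires PHP 8+, the interface and class names (`CarSource`, `SqlCarMapper`, `CachedCarMapper`) are hypothetical, and an array plays the role of the database table so the example runs on its own.

```php
<?php
// Sketch only: assumes PHP 8+. All class names here are hypothetical.

class Car
{
    public ?int $id = null;
    public ?string $model = null;
}

interface CarSource
{
    // Fills $car in place; returns false when no data exists for its id.
    public function fetch(Car $car): bool;
}

// Stand-in for a SQL-backed mapper: an array plays the role of the table.
class SqlCarMapper implements CarSource
{
    public int $queries = 0;

    public function __construct(private array $rows) {}

    public function fetch(Car $car): bool
    {
        $this->queries++;   // counts how often the "database" is hit
        if (!isset($this->rows[$car->id])) {
            return false;
        }
        $car->model = $this->rows[$car->id]['model'];
        return true;
    }
}

// Checks an in-memory cache first and only then falls back to the slower source.
class CachedCarMapper implements CarSource
{
    private array $cache = [];

    public function __construct(private CarSource $fallback) {}

    public function fetch(Car $car): bool
    {
        if (isset($this->cache[$car->id])) {
            $car->model = $this->cache[$car->id];
            return true;
        }
        if (!$this->fallback->fetch($car)) {
            return false;
        }
        $this->cache[$car->id] = $car->model;
        return true;
    }
}

$sql = new SqlCarMapper([42 => ['model' => 'Sebring']]);
$mapper = new CachedCarMapper($sql);

$first = new Car();
$first->id = 42;
$mapper->fetch($first);   // miss: goes to the "SQL" mapper

$second = new Car();
$second->id = 42;
$mapper->fetch($second);  // hit: served from the cache
```

Because both mappers share one interface, the calling code does not need to know which source answered.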
Do I really have to choose between beautiful OOP code with a million queries VS. 1 query and disgusting, un-OOP code?
No, the ugliness comes from the fact that the active record pattern is only meant for the simplest of use cases (where there is almost no logic associated: glorified value objects with persistence). For any non-trivial situation it is preferable to apply the data mapper pattern.
..and if I make a "city" class, in its constructor I'll need to join the cities table with the cities_dealerships table with the dealerships table with the dealerships_cars table with the cars table with the cars_wheels table with the wheels table.
Just because you need data about "available cars per dealership in Moscow" does not mean that you need to create Car instances, and you definitely will not care about wheels there. Different parts of the site will operate at different scales.
The other thing is that you should stop thinking of classes as table abstractions. There is no rule that says "you must have 1:1 relation between classes and tables".
Take the Car example again. If you look at it, having a separate Wheel (or even WheelSet) class is just stupid. Instead you should just have a Car class which already contains all its parts.
$car = new Car;
$car->setId( 616 );
$mapper = new CarMapper( $cache );
$mapper->fetch( $car );
The mapper can easily fetch data not only from "Cars" table but also from "Wheel" and "Engines" and other tables and populate the $car object.
Bottom line: stop using active record.
P.S.: also, if you care about code quality, you should start reading PoEAA book. Or at least start watching lectures listed here.
my 2 cents
ActiveRecord in Rails implements the concept of lazy loading, that is deferring database queries until you actually need the data. So if you instantiate a my_car = Car.find(12) object, it only queries the cars table for that one row. If later you want my_car.wheels then it queries the wheels table.
My suggestion for your pseudo code above is to not load every associated object in the constructor. The car constructor should query for the car only, and should have a method to query for all of its wheels, and another to query its dealership, which only queries for the dealership and defers collecting all of the dealership's other cars until you specifically say something like my_car.dealership.cars
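In PHP, the same lazy-loading idea might look roughly like the sketch below. It assumes PHP 8+; the loader closure is a hypothetical stand-in for a real query against a wheels table, and a counter shows that nothing is "queried" until the wheels are first requested.

```php
<?php
// Sketch only: assumes PHP 8+. The loader closure stands in for a real database query.

class Car
{
    private ?array $wheels = null;   // null means "not loaded yet"

    public function __construct(private int $id, private Closure $wheelLoader) {}

    // Runs the loader on first access only, then reuses the stored result.
    public function getWheels(): array
    {
        if ($this->wheels === null) {
            $this->wheels = ($this->wheelLoader)($this->id);
        }
        return $this->wheels;
    }
}

$queries = 0;
$loader = function (int $carId) use (&$queries): array {
    $queries++;   // pretend this is a SELECT against the wheels table
    return ["wheel 1 of car $carId", "wheel 2 of car $carId"];
};

$car = new Car(12, $loader);
$queriesBeforeAccess = $queries;   // constructing the car ran no wheel query

$car->getWheels();
$car->getWheels();                 // second call reuses the loaded wheels
```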
Postscript
ORMs are database abstraction layers, and thus they must be tuned for ease of querying and not fine tuning. They allow you to rapidly build queries. If later you decide that you need to fine tune your queries, then you can switch to issuing raw sql commands or trying to otherwise optimize how many objects you're fetching. This is standard practice in Rails when you start doing performance tuning - look for queries that would be more efficient when issued with raw sql, and also look for ways to avoid eager loading (the opposite of lazy loading) of objects before you need them.
In general, I'd recommend having a constructor that effectively takes a query row, or a part of a larger query. How to do this will depend on your ORM. That way, you can get efficient queries but still construct the other model objects after the fact.
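One way to read this suggestion is sketched below: each constructor picks its own columns out of a row, so a single joined result set can build both groups and users. This assumes PHP 8+; the rows are hard-coded to stand in for the result of a groups/groups_users/users join, and the `buildGroups` helper is hypothetical.

```php
<?php
// Sketch only: assumes PHP 8+. $rows stands in for the result of one joined query.

final class User
{
    public string $name;
    public int $height;

    // Takes a (possibly wider) query row and keeps only the user columns.
    public function __construct(array $row)
    {
        $this->name = $row['name'];
        $this->height = (int) $row['height'];
    }
}

final class Group
{
    public string $type;
    /** @var User[] */
    public array $users = [];

    public function __construct(array $row)
    {
        $this->type = $row['type'];
    }
}

// Builds Group objects (keyed by group_id) with their User objects from joined rows.
function buildGroups(array $rows): array
{
    $groups = [];
    foreach ($rows as $row) {
        $groupId = (int) $row['group_id'];
        $groups[$groupId] ??= new Group($row);   // create each group once
        if ($row['user_id'] !== null) {          // a LEFT JOIN yields NULLs for empty groups
            $groups[$groupId]->users[] = new User($row);
        }
    }
    return $groups;
}

$rows = [
    ['group_id' => 21, 'type' => 'study', 'user_id' => 1, 'name' => 'Ann', 'height' => 170],
    ['group_id' => 21, 'type' => 'study', 'user_id' => 2, 'name' => 'Bob', 'height' => 180],
    ['group_id' => 22, 'type' => 'chess', 'user_id' => null, 'name' => null, 'height' => null],
];
$groups = buildGroups($rows);
```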
Some ORMs (django's models, and I believe some of the ruby ORMs) try to be clever about how they construct queries and may be able to automate this for you. The trick is to figure out when the automation is going to be required. I do not have personal familiarity with PHP ORMs.
I use the Yii framework, which implements the Active Record pattern as its ORM base. It has a CActiveRecord class, which is a table wrapper class with attributes reflecting table columns. So each object of this class represents a database row.
Wiki says about Active Record pattern:
Active record is an approach to accessing data in a database
and
A database table or view is wrapped into a class. Thus, an object instance is tied to a single row in the table.
So far so good.
But where should I put complex raw sql query that retrieves statistics data for example?
And, more generally, where should I put methods that retrieve some data that can not be an active record object (like data retrieved with aggregation queries) or if I knowingly do not want to retrieve an object but an array instead for example?
And for complex queries you can always use DAO if you want:
http://www.yiiframework.com/doc/guide/1.1/en/database.dao
But in most cases, CDbCriteria will fit your needs, you can read more about it here:
http://www.larryullman.com/2013/07/24/using-cdbcriteria-in-the-yii-framework/
There are many possibilities depending on what you want. Yii has relations to access related objects and one of the relation type is statistical relation, check this link:
http://www.yiiframework.com/doc/guide/1.1/en/database.arr#statistical-query
You may also use named scopes to filter some of your results and then call, for example, the count function to retrieve the number of filtered results (this will be done by sending select count(*) ... to the db server rather than fetching all entries, so it's very convenient). Check this for named scopes:
http://www.yiiframework.com/doc/guide/1.1/en/database.ar#named-scopes
If statistical data is related to your model, for example total spending by some client (although this could be done easily using statistical relation), you can add public function directly to your model class, such as
public function getTotalSpending() {
return 0; // or whatever you need to calculate here
}
Finally it is not considered a good practice to map your business logic directly to tables. Instead create your models by subclassing CModel or CFormModel classes and add public methods that retrieve / modify data (possibly using other models that do inherit CActiveRecord class).
Use CArrayDataProvider
Elements in the raw data array may be either objects (e.g. model objects) or associative arrays (e.g. query results of DAO). Make sure to set the keyField property to the name of the field that uniquely identifies a data record or false if you do not have such a field.
source: http://www.yiiframework.com/doc/api/1.1/CArrayDataProvider
Do not ever use Active Record pattern.
Suppose we have two related tables, for example one representing a person:
PERSON
name
age
...
current_status_id
and one representing a status update at a specific time for this person:
STATUS_HISTORY
recorded_on
status_id
blood_pressure
length
...
I have built an application in PHP using Zend Framework, and tried to retain 'object orientedness' by using a class for representing a person and a class for representing the status of a person. I also tried to use ORM principles where possible, such as using the data mapper for separating the domain model from the data layer.
What would be a nice (and object-oriented) way of returning a list of persons from a data mapper, where in the list I sometimes want to know the last measured blood_pressure of the person, and sometimes not (depending on the requirements of the report/view in which the list is used)? The same holds for other fields, e.g. values computed at the data layer (sums, counts, etc.).
My first thought was using a rowset (e.g. Zend_Db_Rowset) but this introduces high coupling between my view and data layer. Another way might be to return a list of persons, and then querying for each person the latest status using a data mapper for requesting the status of a specific person. However, this will result in (at least) one additional query for each person record, and does not allow me to use JOINS at the data layer.
Any suggestions?
We have this same issue because of our ORM where I work. If you are worried enough about the performance hit of having to first get a list of your persons, then query for their statuses individually, you really have no other choice but to couple your data a little bit.
In my opinion, this is okay. You can either create a class that will hold the single "person" data and an array containing "status_history" records or suffer the performance hit of making another query per "person". You COULD reduce your query overhead by doing data caching locally (your controller would have to decide that if a request for a set of data is made before a certain time threshold, it just returns its own data instead of querying the db server)
Having a pure OO view is nice, but sometimes impractical.
Try using stdClass, which is PHP's built-in generic class. You can have each result row returned as a stdClass object, created automatically by PHP, whose member variables are named after the columns. So you can take the object and read the values by column name. For example:
Query is
SELECT a.dept_id,a.dept_name,a.e_id,b.emp_name,b.emp_id from DEPT a,EMP b where b.emp_id=a.e_id;
The result will be an array of stdClass objects. Each row is represented by one stdClass object.
Object
STDCLASS
{
dept_id;
dept_name;
e_id;
emp_id;
emp_name;
}
You can access like
foreach($resultset as $row)
{
$d_id = $row->dept_id;
$d_nam= $row->dept_name;
$e_id = $row->e_id;
$em_id= $row->emp_id;
$e_nam= $row->emp_name;
}
But I am not sure about performance.
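With PDO, one way to get rows back as stdClass objects is the PDO::FETCH_OBJ fetch mode. The sketch below assumes PHP with the pdo_sqlite extension; the DEPT and EMP tables are created in memory with made-up sample data, and the implicit join from the query above is written as an explicit JOIN.

```php
<?php
// Sketch only: assumes pdo_sqlite; the sample rows are made up.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE DEPT (dept_id INTEGER, dept_name TEXT, e_id INTEGER)');
$pdo->exec('CREATE TABLE EMP (emp_id INTEGER, emp_name TEXT)');
$pdo->exec("INSERT INTO EMP VALUES (7, 'Ann')");
$pdo->exec("INSERT INTO DEPT VALUES (1, 'Sales', 7)");

$sql = 'SELECT a.dept_id, a.dept_name, a.e_id, b.emp_name, b.emp_id
        FROM DEPT a JOIN EMP b ON b.emp_id = a.e_id';

// PDO::FETCH_OBJ returns each row as a stdClass with one property per column.
$resultset = $pdo->query($sql)->fetchAll(PDO::FETCH_OBJ);

foreach ($resultset as $row) {
    echo $row->dept_name, ': ', $row->emp_name, PHP_EOL;
}
```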
I'm experimenting with the Doctrine ORM (v1.2) for PHP. I have defined a class "liquor", with two child classes "gin" and "whiskey". I am using concrete inheritance (class table inheritance in most literature) to map the classes to three separate database tables.
I am attempting to execute the following:
$liquor_table = Doctrine_Core::getTable('liquor');
$liquors = $liquor_table->findAll();
Initially, I expected $liquors to be a Doctrine_Collection containing all liquors, whether they be whiskey or gin. But when I execute the code, I get an empty collection, despite having several rows in the whiskey and gin database tables. Based on the generated SQL, I understand why: the ORM is querying the "liquor" table, and not the whiskey/gin tables where the actual data is stored.
Note that the code works perfectly when I switch the inheritance type to column aggregation (simple table inheritance).
What's the best way to obtain a Doctrine_Collection containing all liquors?
Update
After some more research, it looks like I'm expecting Doctrine to be performing a SQL UNION operation behind the scenes to combine the result sets from the "whiskey" and "gin" tables.
This is known as a polymorphic query.
According to this ticket, this functionality is not available in Doctrine 1.x. It is destined for the 2.0 release. (also see Doctrine 2.0 docs for CTI).
So in light of this information, what would be the cleanest, most efficient way to work around this deficiency? Switch to single table inheritance? Perform two DQL queries and manually merge the resulting Doctrine_Collections?
The only stable and useful inheritance mode of Doctrine for the moment is column_aggregation. I have tried the others in different projects. With column_aggregation you can imitate polymorphic queries.
Inheritance in general is something that is a bit buggy in Doctrine (1.x). With 2.x this will change, so we may have better options in the future.
A while back I wrote the (not production-ready) beginnings of an ORM that would do exactly what you're looking for, just so that I could have a proof of concept. What all my studies showed is that you end up mixing code and data in some way (subclass information in the liquor table).
So what you might do is write a method on your liquor class/table class that queries its own table. The best way to get away with not having to hard-code all the subclasses in your liquor class is to have a column which contains the class name of the subclass.
How you spread the details around is entirely up to you. I think the most normalized (and anyone can correct me if I'm wrong here) way to do it is to store all fields that appear in your liquor class in the liquor table. Then, for each subclass, have a table that stores the specific data that pertains to the subclass type.
Which is the point at which you are mixing code and data because your code is reading the liquor table to get the name of the subclass to perform a join.
I'll use cars & bikes, and some minimal, trivial differences between them, for my example:
Ride
----
id
name
type
(1, 'Sebring', 'Car')
(2, 'My Bike', 'Bicycle')
Bicycle
-------
id
bike_chain_length
(2, '2 feet')
Car
---
id
engine_size
(1, '6 cylinders')
There are all kinds of variations from here on, like storing all liquor class data in the subclass table and only storing references and subclass names in the liquor table. I like this the least, though, because aggregating the common data in the liquor table saves you from having to query every subclass table for the common fields.
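The Ride/Car/Bicycle layout above could be read back with a lookup like the following sketch. It assumes PHP 8+ with pdo_sqlite; the lowercase table names, the `loadRide` function, and the whitelist mapping type names to tables are all hypothetical, and it uses a second query per ride rather than a literal join.

```php
<?php
// Sketch only: assumes PHP 8+ with pdo_sqlite. Table and function names are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE ride (id INTEGER PRIMARY KEY, name TEXT, type TEXT)');
$pdo->exec('CREATE TABLE car (id INTEGER PRIMARY KEY, engine_size TEXT)');
$pdo->exec('CREATE TABLE bicycle (id INTEGER PRIMARY KEY, bike_chain_length TEXT)');
$pdo->exec("INSERT INTO ride VALUES (1, 'Sebring', 'Car'), (2, 'My Bike', 'Bicycle')");
$pdo->exec("INSERT INTO car VALUES (1, '6 cylinders')");
$pdo->exec("INSERT INTO bicycle VALUES (2, '2 feet')");

// Loads the common fields, then the subclass-specific fields named by the type column.
function loadRide(PDO $pdo, int $id): ?array
{
    // Whitelist mapping the stored type name to its table, so the value read
    // from the database is never interpolated into SQL unchecked.
    $subclassTables = ['Car' => 'car', 'Bicycle' => 'bicycle'];

    $stmt = $pdo->prepare('SELECT id, name, type FROM ride WHERE id = ?');
    $stmt->execute([$id]);
    $ride = $stmt->fetch(PDO::FETCH_ASSOC);
    if ($ride === false) {
        return null;
    }

    $table = $subclassTables[$ride['type']] ?? null;
    if ($table === null) {
        return $ride;   // unknown subclass: return the common fields only
    }

    $stmt = $pdo->prepare("SELECT * FROM $table WHERE id = ?");
    $stmt->execute([$id]);
    $extra = $stmt->fetch(PDO::FETCH_ASSOC);

    // Array union keeps the common fields and adds the subclass-specific ones.
    return $extra === false ? $ride : $ride + $extra;
}

$car = loadRide($pdo, 1);
$bike = loadRide($pdo, 2);
echo $car['name'], ' has ', $car['engine_size'], PHP_EOL;
```

This is exactly the point where code and data mix: the `type` value stored in the row decides which table the code reads next.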
Hope this helps!