I've been programming procedural code for quite some time, and recently I've been learning MVC. I have tried a couple of frameworks and decided I wanted to write my own from scratch to learn it inside and out. So far I have a working MVC framework (and it works great!) but I'm trying to determine the best way to query related tables.
As of now, in my member model, I use the following to get a list of the members and their rank_name:
SELECT m.*, r.`name` AS rank_name
FROM `{$this->table}` AS m
LEFT JOIN `ranks` AS r
ON r.`id` = m.`rank_id`
This works fine, but it does not seem like the way it's supposed to be done in MVC. After working with frameworks like CakePHP, I know that member hasOne rank (or is it rank hasMany member?)
The only alternative I thought of was to query the members table by itself in the member model, and then call a method in the rank model to get the rank name for each row, like this:
// member model
SELECT *
FROM `{$this->table}`
while ($row = $result->fetch_assoc()) {
    // look up the rank name for this member (one extra query per row)
    $row['rank_name'] = $this->Rank->get_name($row['rank_id']);
    $data[] = $row;
}
But this can't be very efficient, having to run a separate query for each member. The only other concept I thought of was using MySQL's IN(x, y, z) function to get the rank names and then merging the arrays somehow.
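The IN(x, y, z) idea mentioned above can be sketched roughly as follows. This is only a minimal illustration: the helper names (`inPlaceholders`, `collectRankIds`, `attachRankNames`) are made up for this example, and the two arrays stand in for the result sets of the members query and the single ranks query.

```php
<?php
// Hypothetical helper: builds "?, ?, ?" for a prepared IN (...) clause.
function inPlaceholders(array $values): string
{
    return implode(', ', array_fill(0, count($values), '?'));
}

// Collects the distinct rank ids referenced by the member rows.
function collectRankIds(array $members): array
{
    return array_values(array_unique(array_column($members, 'rank_id')));
}

// Merges rank names into the member rows.
// $ranks is the result set of the single IN (...) query.
function attachRankNames(array $members, array $ranks): array
{
    $nameById = array_column($ranks, 'name', 'id'); // id => name lookup
    foreach ($members as &$member) {
        $member['rank_name'] = $nameById[$member['rank_id']] ?? null;
    }
    unset($member);
    return $members;
}

// Stand-in for the rows returned by the members query.
$members = [
    ['id' => 1, 'name' => 'Ann', 'rank_id' => 2],
    ['id' => 2, 'name' => 'Bob', 'rank_id' => 1],
    ['id' => 3, 'name' => 'Cy',  'rank_id' => 2],
];

$rankIds = collectRankIds($members);
$sql = 'SELECT `id`, `name` FROM `ranks` WHERE `id` IN (' . inPlaceholders($rankIds) . ')';

// Stand-in for the rows the query above would return.
$ranks = [
    ['id' => 1, 'name' => 'Rookie'],
    ['id' => 2, 'name' => 'Veteran'],
];

$result = attachRankNames($members, $ranks);
```

This keeps the total at two queries regardless of how many members are listed, at the cost of doing the merge in PHP instead of in SQL.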
What is the best practice for this in MVC?
Maybe an ORM (object-relational mapper) is what you're looking for. An ORM assists you in storing and retrieving your model objects from the database.
For PHP I guess your best choice is Doctrine.
Related
I'm really struggling with a recurring OOP / database concept.
Please allow me to explain the issue with pseudo-PHP-code.
Say you have a "user" class, which loads its data from the users table in its constructor:
class User {
public $name;
public $height;
public function __construct($user_id) {
$result = Query the database where the `users` table has `user_id` of $user_id
$this->name= $result['name'];
$this->height = $result['height'];
}
}
Simple, awesome.
Now, we have a "group" class, which loads its data from the groups table joined with the groups_users table and creates user objects from the returned user_ids:
class Group {
public $type;
public $schedule;
public $users;
public function __construct($group_id) {
$result = Query the `groups` table, joining the `groups_users` table,
where `group_id` = $group_id
$this->type = $result['type'];
$this->schedule = $result['schedule'];
foreach ($result['user_ids'] as $user_id) {
// Make the user objects
$this->users[] = new User($user_id);
}
}
}
A group can have any number of users.
Beautiful, elegant, amazing... on paper. In reality, however, making a new group object...
$group = new Group(21); // Get the 21st group, which happens to have 4 users
...performs 5 queries instead of 1. (1 for the group and 1 for each user.) And worse, if I make a community class, which has many groups in it that each have many users within them, an ungodly number of queries are run!
The Solution, Which Doesn't Sit Right To Me
For years, the way I've got around this, is to not code in the above fashion, but instead, when making a group for instance, I would join the groups table to the groups_users table to the users table as well and create an array of user-object-like arrays within the group object (never using/touching the user class):
class Group {
public $type;
public $schedule;
public $users;
public function __construct($group_id) {
$result = Query the `groups` table, joining the `groups_users` table,
**and also joining the `users` table,**
where `group_id` = $group_id
$this->type = $result['type'];
$this->schedule = $result['schedule'];
foreach ($result['users'] as $user) {
// Make user arrays
$this->users[] = array_of_user_data_crafted_from_the_query_result;
}
}
}
...but then, of course, if I make a "community" class, in its constructor I'll need to join the communities table with the communities_groups table with the groups table with the groups_users table with the users table.
...and if I make a "city" class, in its constructor I'll need to join the cities table with the cities_communities table with the communities table with the communities_groups table with the groups table with the groups_users table with the users table.
What an unmitigated disaster!
Do I have to choose between beautiful OOP code with a million queries VS. 1 query and writing these joins by hand for every single superset? Is there no system that automates this?
I'm using CodeIgniter, and have looked into countless other MVC frameworks and projects that were built in them, and cannot find a single good example of anyone using models without resorting to one of the two flawed methods I've outlined.
It appears this has never been done before.
One of my coworkers is writing a framework that does exactly this - you create a class that includes a model of your data. Other, higher models can include that single model, and it crafts and automates the table joins to create the higher model that includes object instantiations of the lower model, all in a single query. He claims he's never seen a framework or system for doing this before, either.
Please Note:
I do indeed always use separate classes for logic and persistence. (VOs and DAOs - this is the entire point of MVCs). I have merely combined the two in this thought-experiment, outside of an MVC-like architecture, for simplicity's sake. Rest assured that this issue persists regardless of the separation of logic and persistence. I believe this article, introduced to me by James in the comments below this question, seems to indicate that my proposed solution (which I've been following for years) is, in fact, what developers currently do to solve this issue. This question is, however, attempting to find ways of automating that exact solution, so it doesn't always need to be coded by hand for every superset. From what I can see, this has never been done in PHP before, and my coworker's framework will be the first to do so, unless someone can point me towards one that does.
And, also, of course I never load data in constructors, and I only call the load() methods that I create when I actually need the data. However, that is unrelated to this issue, as in this thought experiment (and in the real-life situations where I need to automate this), I always need to eager-load the data of all subsets of children as far down the line as it goes, and not lazy-load them at some future point in time as needed. The thought experiment is concise -- that it doesn't follow best practices is a moot point, and answers that attempt to address its layout are likewise missing the point.
EDIT : Here is a database schema, for clarity.
CREATE TABLE `groups` (
  `group_id` int(11) NOT NULL, -- Auto increment
  `make` varchar(20) NOT NULL,
  `model` varchar(20) NOT NULL
);
CREATE TABLE `groups_users` ( -- Relational table (many users to one group)
  `group_id` int(11) NOT NULL,
  `user_id` int(11) NOT NULL
);
CREATE TABLE `users` (
  `user_id` int(11) NOT NULL, -- Auto increment
  `name` varchar(20) NOT NULL,
  `height` int(11) NOT NULL
);
(Also note that I originally used the concepts of wheels and cars, but that was foolish, and this example is much clearer.)
SOLUTION:
I ended up finding a PHP ORM that does exactly this. It is Laravel's Eloquent. You can specify the relationships between your models, and it intelligently builds optimized queries for eager loading using syntax like this:
Group::with('users')->get();
It is an absolute life saver. I haven't had to write a single query. It also doesn't rely on joins: it runs one query for the parent models and one more per eager-loaded relationship, selecting the related rows by foreign key (a WHERE ... IN clause).
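The idea behind this kind of eager loading can be sketched in plain PHP. This is not Eloquent's actual implementation, only an illustration of the technique: fetch the groups once, fetch all of their users once, then stitch the two result sets together by key. The function name `attachUsers` and the sample rows are assumptions made for the example.

```php
<?php
// Attaches users to groups from two flat result sets, with no per-group query.
// $groups:   rows from `groups`
// $userRows: rows from `users` joined to `groups_users`, fetched once with
//            WHERE group_id IN (...all group ids...)
function attachUsers(array $groups, array $userRows): array
{
    // Bucket the user rows by the group they belong to.
    $usersByGroup = [];
    foreach ($userRows as $row) {
        $usersByGroup[$row['group_id']][] = [
            'user_id' => $row['user_id'],
            'name'    => $row['name'],
        ];
    }

    // Give every group its bucket (or an empty list).
    foreach ($groups as &$group) {
        $group['users'] = $usersByGroup[$group['group_id']] ?? [];
    }
    unset($group);

    return $groups;
}

// Stand-in for: SELECT * FROM groups
$groups = [
    ['group_id' => 21, 'type' => 'chess'],
    ['group_id' => 22, 'type' => 'hiking'],
];

// Stand-in for the single users query covering groups 21 and 22.
$userRows = [
    ['group_id' => 21, 'user_id' => 1, 'name' => 'Ann'],
    ['group_id' => 21, 'user_id' => 2, 'name' => 'Bob'],
];

$loaded = attachUsers($groups, $userRows);
```

Two queries cover any number of groups, and each additional nesting level (communities, cities) adds one query rather than multiplying them.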
Say you have a "wheel" class, which loads its data from the wheels table in its constructor
Constructors should not be doing any work. Instead they should contain only assignments. Otherwise you make it very hard to test the behavior of the instance.
Now, we have a "car" class, which loads its data from the cars table joined with the cars_wheels table and creates wheel objects from the returned wheel_ids:
No. There are two problems with this.
Your Car class should not contain both code for implementing "car logic" and "persistence logic". Otherwise you are breaking SRP. And wheels are a dependency for the class, which means that the wheels should be injected as parameter for the constructor (most likely - as a collection of wheels, or maybe an array).
Instead you should have a mapper class, which can retrieve data from database and store it in the WheelCollection instance. And a mapper for car, which will store data in Car instance.
$car = new Car;
$car->setId( 42 );
$mapper = new CarMapper( $pdo );
if ( $mapper->fetch($car) ) //if there was a car in DB
{
$wheels = new WheelCollection;
$otherMapper = new WheelMapper( $pdo );
$car->addWheels( $wheels );
$wheels->setType($car->getWheelType());
// I am not a mechanic. There is probably some name for describing
// wheels that a car can use
$otherMapper->fetch( $wheels );
}
Something like this. The mappers in this case are responsible for performing the queries. And you can have several sources for them, for example: have one mapper that checks the cache and, only if that fails, pulls the data from SQL.
Do I really have to choose between beautiful OOP code with a million queries VS. 1 query and disgusting, un-OOP code?
No, the ugliness comes from the fact that the active record pattern is only meant for the simplest of use cases (where there is almost no logic associated: glorified value objects with persistence). For any non-trivial situation it is preferable to apply the data mapper pattern.
..and if I make a "city" class, in its constructor I'll need to join the cities table with the cities_dealerships table with the dealerships table with the dealerships_cars table with the cars table with the cars_wheels table with the wheels table.
Just because you need data about "available cars per dealership in Moscow" does not mean that you need to create Car instances, and you definitely will not care about wheels there. Different parts of the site will operate at different scales.
The other thing is that you should stop thinking of classes as table abstractions. There is no rule that says "you must have 1:1 relation between classes and tables".
Take the Car example again. If you look at it, having a separate Wheel (or even WheelSet) class is just stupid. Instead you should just have a Car class which already contains all its parts.
$car = new Car;
$car->setId( 616 );
$mapper = new CarMapper( $cache );
$mapper->fetch( $car );
The mapper can easily fetch data not only from "Cars" table but also from "Wheel" and "Engines" and other tables and populate the $car object.
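A minimal sketch of such a mapper, under stated assumptions: the `$source` callable, the `make` and `size` columns, and the sample data are all hypothetical stand-ins for a real PDO or cache layer, chosen only so the example runs on its own.

```php
<?php
// Plain domain object: holds state, knows nothing about storage.
class Car
{
    private $id;
    private $make;
    private $wheels = [];

    public function setId($id) { $this->id = $id; }
    public function getId() { return $this->id; }
    public function setMake($make) { $this->make = $make; }
    public function getMake() { return $this->make; }
    public function setWheels(array $wheels) { $this->wheels = $wheels; }
    public function getWheels() { return $this->wheels; }
}

// The mapper owns all persistence knowledge for Car.
class CarMapper
{
    private $source;

    // $source: callable(string $table, int $carId): array of rows (hypothetical)
    public function __construct(callable $source)
    {
        $this->source = $source;
    }

    // Populates $car from several tables; returns false if the car is unknown.
    public function fetch(Car $car): bool
    {
        $rows = ($this->source)('cars', $car->getId());
        if (!$rows) {
            return false;
        }
        $car->setMake($rows[0]['make']);
        $wheelRows = ($this->source)('wheels', $car->getId());
        $car->setWheels(array_column($wheelRows, 'size'));
        return true;
    }
}

// Array-backed stand-in for the database.
$tables = [
    'cars'   => [['car_id' => 616, 'make' => 'Lada']],
    'wheels' => [
        ['car_id' => 616, 'size' => 15],
        ['car_id' => 616, 'size' => 15],
        ['car_id' => 616, 'size' => 15],
        ['car_id' => 616, 'size' => 15],
    ],
];
$source = function (string $table, int $carId) use ($tables): array {
    return array_values(array_filter($tables[$table], function ($row) use ($carId) {
        return $row['car_id'] === $carId;
    }));
};

$car = new Car;
$car->setId(616);
$mapper = new CarMapper($source);
$found = $mapper->fetch($car);
```

Because the Car class never touches the source, the mapper can be swapped for one backed by a cache or a different schema without changing the domain object.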
Bottom line: stop using active record.
P.S.: also, if you care about code quality, you should start reading PoEAA book. Or at least start watching lectures listed here.
my 2 cents
ActiveRecord in Rails implements the concept of lazy loading, that is deferring database queries until you actually need the data. So if you instantiate a my_car = Car.find(12) object, it only queries the cars table for that one row. If later you want my_car.wheels then it queries the wheels table.
My suggestion for your pseudo code above is to not load every associated object in the constructor. The car constructor should query for the car only, and should have a method to query for all of its wheels, and another to query its dealership, which only queries for the dealership and defers collecting all of the dealership's other cars until you specifically say something like my_car.dealership.cars
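In PHP, this kind of lazy loading might look roughly like the following. It is a sketch only: `LazyCar`, the `$wheelLoader` callable, and the query counter are invented for the illustration, and the loader stands in for a real database query.

```php
<?php
class LazyCar
{
    private $id;
    private $wheelLoader;
    private $wheels = null; // null means "not loaded yet"

    // $wheelLoader: callable(int $carId): array (hypothetical query function)
    public function __construct(int $id, callable $wheelLoader)
    {
        $this->id = $id;
        $this->wheelLoader = $wheelLoader;
    }

    // Runs the wheel query on first access only, then reuses the result.
    public function wheels(): array
    {
        if ($this->wheels === null) {
            $this->wheels = ($this->wheelLoader)($this->id);
        }
        return $this->wheels;
    }
}

// Stand-in loader that counts how often it is called.
$queries = 0;
$loader = function (int $carId) use (&$queries): array {
    $queries++;
    return ['front-left', 'front-right', 'rear-left', 'rear-right'];
};

$car = new LazyCar(12, $loader);
$queriesBeforeAccess = $queries; // constructing the car ran no wheel query

$first  = $car->wheels(); // triggers the loader
$second = $car->wheels(); // served from the cached result
```

The trade-off is the one described in the question: lazy loading avoids unneeded queries, but looping over many cars and touching each one's wheels brings back one query per car.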
Postscript
ORMs are database abstraction layers, and thus they must be tuned for ease of querying and not fine tuning. They allow you to rapidly build queries. If later you decide that you need to fine tune your queries, then you can switch to issuing raw sql commands or trying to otherwise optimize how many objects you're fetching. This is standard practice in Rails when you start doing performance tuning - look for queries that would be more efficient when issued with raw sql, and also look for ways to avoid eager loading (the opposite of lazy loading) of objects before you need them.
In general, I'd recommend having a constructor that takes effectively a query row, or a part of a larger query. How to do this will depend on your ORM. That way, you can get efficient queries but you can construct the other model objects after the fact.
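As a rough sketch of that suggestion in PHP: the constructors below take rows that have already been fetched, so one joined query can feed both classes. The column names follow the question's `type`, `schedule`, `name`, and `height` fields; the sample rows are invented and stand in for the result of a single groups / groups_users / users join.

```php
<?php
class User
{
    public $name;
    public $height;

    // Built from a row (or the user part of a wider joined row); no query here.
    public function __construct(array $row)
    {
        $this->name = $row['name'];
        $this->height = (int) $row['height'];
    }
}

class Group
{
    public $type;
    public $schedule;
    public $users = [];

    // $rows: every row of the joined result that belongs to this one group.
    public function __construct(array $rows)
    {
        $this->type = $rows[0]['type'];
        $this->schedule = $rows[0]['schedule'];
        foreach ($rows as $row) {
            // A LEFT JOIN yields NULL user columns for a group without users.
            if ($row['user_id'] !== null) {
                $this->users[] = new User($row);
            }
        }
    }
}

// Stand-in for the rows of one joined query for group 21.
$rows = [
    ['type' => 'chess', 'schedule' => 'weekly', 'user_id' => 1, 'name' => 'Ann', 'height' => '170'],
    ['type' => 'chess', 'schedule' => 'weekly', 'user_id' => 2, 'name' => 'Bob', 'height' => '182'],
];

$group = new Group($rows);
```

The query stays a single hand-written (or generated) join, while the object graph is still assembled from real User instances.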
Some ORMs (django's models, and I believe some of the ruby ORMs) try to be clever about how they construct queries and may be able to automate this for you. The trick is to figure out when the automation is going to be required. I do not have personal familiarity with PHP ORMs.
specs: PHP 5 with mySQL built on top of Codeigniter Framework.
I have a database table called game and then sport specific tables like soccerGame and footballGame. These sport specific tables have a gameId field linking back to the game table. I have corresponding classes game and soccerGame/footballGame, which both extend game.
When I look up game information to display to the user, I'm having trouble figuring out how to dynamically link the two tables. I'm curious if it's possible to get all the information with one query. The problem is, I need to query the game table first to figure out the sport name.
If that's not possible, my next thought is to do it with two queries: have my game_model query the game table, then based off the sport name, call the appropriate sport specific model (i.e. soccer_game_model) and get the sport specific info.
I would also pass the game object into the soccer_model, and the soccer_model would use that object to build me a soccerGame object. This seems a little silly to me because I'm building the parent object and then giving it to the extending class to make a whole new object?
thoughts?
thanks for the help.
EDIT:
game table
gameId
sport (soccer, basketball, football, etc)
date
other data
soccerGame table
soccerGameId
gameId
soccer specific information
footballGame table
footballGameId
gameId
football specific information
and so on for other sports
So I need to know what the sport is before I can decide which sport specific table I need to pull info from.
UPDATE:
Thanks all for the input. It seems like dynamic SQL is only possible through stored procedures, something I'm not well versed on right now. And even with them it's still a little messy. Right now I will go the two query route, one to get the sport name, and then a switch to get the right model.
From the PHP side of things now, it seems a little silly to get a game object, pass it to, say, my soccer_game_model, and then have that return me a soccer_game object, which is a child of the original game. Is that how it has to be done? or am I missing something from an OO perspective here?
To extend on Devin Young's answer, you would achieve this using Codeigniter's active record class like so:
public function get_game_by_id($game_id, $table)
{
    // $table is the sport specific table, e.g. soccerGame or footballGame
    return $this->db->join($table, 'game.gameId = ' . $table . '.gameId', 'left')
        ->where('game.gameId', $game_id)
        ->get('game')
        ->result();
}
So you're joining the table by the gameId which is shared, then using a where clause to find the correct one. Finally you use result() to return an array of objects.
EDIT: I've added a second parameter, $table, so you can pass in the name of the table to join (soccerGame, footballGame, etc.).
If you don't know which sport to choose at this point in the program then you may want to take a step back and look at how you can add that so you do know. I would be reluctant to add multiple joins to all sport tables as you'll run into issues down the line.
UPDATE
Consider passing the "sport" parameter when you look up game data. As a hidden field, most likely. You can then use a switch statement in your model:
switch($gameValue) {
case 'football': $gameTable = "footballGame"; break;
case 'soccer': $gameTable = "soccerGame"; break;
}
Then base your query off this:
"SELECT *
FROM ". $gameTable . "
...etc
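The switch above can also be written as a whitelist lookup, which has the side benefit of rejecting unexpected values before they are concatenated into SQL. This is only a sketch: the function names `sportTable` and `gameQuery` are made up, and the join condition assumes the `gameId` columns from the table layout in the question.

```php
<?php
// Maps a sport name to its sport specific table; rejects anything else.
function sportTable(string $sport): string
{
    $tables = [
        'football' => 'footballGame',
        'soccer'   => 'soccerGame',
    ];
    if (!isset($tables[$sport])) {
        throw new InvalidArgumentException("Unknown sport: $sport");
    }
    return $tables[$sport];
}

// Builds the query text; the game id itself stays a bound parameter (?).
function gameQuery(string $sport): string
{
    $table = sportTable($sport);
    return "SELECT * FROM game JOIN $table ON $table.gameId = game.gameId WHERE game.gameId = ?";
}

$sql = gameQuery('soccer');
```

Since the sport arrives from a hidden form field, only names found in the map ever reach the query string.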
You can combine the tables with joins. http://www.w3schools.com/sql/sql_join.asp
For example, if you need to get all the data from game and footballGame based on a footballGameId of 15:
SELECT *
FROM footballGame a
LEFT OUTER JOIN game b ON a.gameId = b.gameId
WHERE a.footballGameId = 15
Check this Stack Overflow answer for options on how to do it via a standard query. Then you can turn it into active record if you want (though that may be complicated and not worth your time if you don't need DB-agnostic calls in your app).
For what it's worth, there's nothing wrong with doing multiple queries; it just might be slower than an alternative. Try a few options, see what works best for you and your app.
I am trying to replicate what would be a left join in MySQL in Mongo. I have a collection named Clients and another collection named Orders.
In the clients collection I have:
client_PK, FirstName, LastName, Company
In the orders collection I have:
order_PK, client_fk, OrderDate, OrderAmount,
I know that I can use embedded documents, but for the sake of this question I am looking to use the reference model.
My question is, using these two collections, how would I construct a table or object similar to a left join in MySQL? I know this is a document DB, not a relational DB, but I'm using SQL language just to give you an idea of what I'm trying to accomplish. In MySQL it would look like this:
SELECT * FROM orders LEFT JOIN clients ON clients.client_PK = orders.client_fk
With this I could now construct a table that looked like:
FirstName | LastName | Company | OrderDate | OrderAmount
Then I could repeat the rows using a while loop to display all orders and display the client's name with each order. Again, I know Mongo isn't a relational DB, but I am assuming there is a way to simulate a table using two collections.
Thank you.
You almost certainly want to be storing these data all in the same MongoCollection (even in just a denormalization collection).
If you absolutely can't do that, though, and if your set is small, you can do something similar to this (since you asked about PHP):
<?php
// gather orders
$orders = iterator_to_array($db->orders->find());
$joinedOrders = array();
// gather clients
foreach ($db->clients->find() as $client) {
// iterate orders (like a left join)
foreach ($orders as $order) {
// make a "joinedOrders" record for each join match
if ($order['client_fk'] == $client['client_PK']) {
$joinedOrders[] = array_merge($order, $client);
}
}
}
// result is now in $joinedOrders
This is, however, almost always a bad idea. (-: You really should be denormalizing your data, or using a relational database to store/query relational data.
I am assuming there is a way to simulate a table using two collections
MongoDB does not have any tool for doing this. You are basically going to have to "roll your own" joins. At the basic level, this means that you will have to write nested for loops and build a result set in your code.
Doing this type of "extra logic" is pretty common in MongoDB because of the lack of joins. If you're seeing this pattern a lot, you may want to consider using SQL for part of your data.
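If the nested loops become a bottleneck, one common variation is to index one side by key first and then make a single pass over the other. The sketch below works on plain PHP arrays standing in for the two collections' documents. The function name `leftJoinOrders` is made up, and filling missing client fields with null is an assumption meant to mimic what a SQL LEFT JOIN returns.

```php
<?php
// Left-joins orders to clients: every order is kept, with client fields
// set to null when no matching client exists.
function leftJoinOrders(array $orders, array $clients): array
{
    // Index clients by primary key so each lookup is a single array access.
    $clientsByPk = array_column($clients, null, 'client_PK');
    $noClient = ['FirstName' => null, 'LastName' => null, 'Company' => null];

    $joined = [];
    foreach ($orders as $order) {
        $client = $clientsByPk[$order['client_fk']] ?? $noClient;
        $joined[] = [
            'FirstName'   => $client['FirstName'],
            'LastName'    => $client['LastName'],
            'Company'     => $client['Company'],
            'OrderDate'   => $order['OrderDate'],
            'OrderAmount' => $order['OrderAmount'],
        ];
    }
    return $joined;
}

// Stand-ins for the documents returned by find() on each collection.
$clients = [
    ['client_PK' => 1, 'FirstName' => 'Ann', 'LastName' => 'Lee', 'Company' => 'Acme'],
];
$orders = [
    ['order_PK' => 10, 'client_fk' => 1, 'OrderDate' => '2012-01-05', 'OrderAmount' => 40],
    ['order_PK' => 11, 'client_fk' => 9, 'OrderDate' => '2012-01-06', 'OrderAmount' => 15],
];

$rows = leftJoinOrders($orders, $clients);
```

This does the same work as the nested loops in roughly one pass per collection, but it still pulls both collections into memory, so it only suits small sets.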
A question came up today: should my repositories always return full objects? Can't they return partial data (in an array, for example)?
For example, I have the method getUserFriends(User $user) inside my repository Friends, in this method I execute the following DQL:
$dql = 'SELECT userFriend FROM Entities\User\Friend f JOIN f.friend userFriend WHERE f.user = ?0';
But this way I'm returning the user entities with all their properties; the generated SQL is a SELECT of all fields from the User table. Let's say I just need the id and the name of the user's friends: wouldn't it be better (and quicker) to fetch just these values?
$dql = 'SELECT userFriend.id, userFriend.name FROM Entities\User\Friend f JOIN f.friend userFriend WHERE f.user = ?0';
These methods are executed in my service class.
From a database perspective, performance will not be affected that much by the number of fields, unless the number of rows to return is really huge (millions of rows, probably): the hardest part for the DB is to perform the joins and build the result set from the tables.
From a php perspective, that depends on multiple factors, like the complexity and the number of objects created.
I would take the problem differently : I would profile and stress-test my code in order to see if performance is an issue or not, and decide to refactor only if needed (switching from doctrine to a hand-made model is time consuming, will the performance gain be worth it ?)
EDIT : and to answer your initial question : fetching complete objects will lead to easier caching if needed, and better data encapsulation. I would keep these until they represent a big performance issue.
You can use partial keyword in your DQL : http://www.doctrine-project.org/docs/orm/2.0/en/reference/partial-objects.html?highlight=partial
But only do that if your app has performance issues.
I think one of the more difficult concepts to understand in the Zend Framework is how the Table Data Gateway pattern is supposed to handle multi-table joins. Most of the suggestions I've seen claim that you simply handle the joins using a $db->select()...
Zend DB Select with multiple table joins
Joining Tables With Zend Framework PHP
Joining tables wthin a model in Zend Php
Zend Framework Db Select Join table help
My question is: Which object is best suited to handle this kind of multi-table select statement? I feel like putting it in the model would break the 1-1 Table Data Gateway pattern between the class and the db table. Yet putting it in the controller seems wrong because why would a controller handle a SQL statement? Anyway, I feel like ZF makes handling datasets from multiple tables more difficult than it needs to be. Any help you can provide is great...
Thanks!
By definition, a Table Data Gateway handles one table only.
ZF enforces this definition with an integrity check on Zend_Db_Table_Selects. However, the integrity check can be disabled and then you can do joins. Just create a method inside your table class to do the Join via the select object like this:
public function findByIdAndJoinFoo($id)
{
    $select = $this->select();
    $select->setIntegrityCheck(false) // allows joins
           ->from($this)
           ->join('foo', 'foo.id = bar.foo_id')
           ->where('bar.id = ?', $id); // assumes this table is bar, keyed by id
    return $this->fetchAll($select);
}
If you want to stick to the definition, you can use some sort of Service Layer or DataMapper that knows how to handle multiple tables. These sit between the Db classes and the Controllers.
Another alternative is not to use Joins but table relationships and then lazy load dependent rowsets as needed. Of course, that's not Joins then, but multiple queries.
And finally, you can still just use Zend_Db_Statement and craft your SQL by hand:
$stmt = $db->query(
'SELECT * FROM bugs WHERE reported_by = ? AND bug_status = ?',
array('goofy', 'FIXED'));