Right way to handle database access in PHP OOP

I need some help with database access in PHP. I'm trying to do things the OOP way but I'm not sure if I'm heading the right way.
Let's say I have a class Person, for example:
class Person {
    private $id;
    private $firstname;
    private $lastname;
    // maybe some more member variables

    function __construct($id = NULL) {
        if (isset($id)) {
            $this->id = $id;
            $this->retrieve();
        }
    }

    // getters, setters and other member functions

    private function retrieve() {
        global $db;
        // Retrieve user from database
        $stmt = $db->prepare("SELECT firstname, lastname FROM users WHERE id = :id");
        $stmt->bindParam(":id", $this->id, PDO::PARAM_INT);
        $stmt->execute();
        $result = $stmt->fetch();
        $this->firstname = $result['firstname'];
        $this->lastname = $result['lastname'];
    }

    function insert() {
        global $db;
        // Insert object into database, or update if exists
        $stmt = $db->prepare("REPLACE INTO users (id, firstname, lastname) VALUES (:id, :firstname, :lastname)");
        $stmt->bindParam(":id", $this->id, PDO::PARAM_INT);
        $stmt->bindParam(":firstname", $this->firstname, PDO::PARAM_STR);
        $stmt->bindParam(":lastname", $this->lastname, PDO::PARAM_STR);
        $stmt->execute();
    }
}
Note that this is just an example I just wrote to describe my question, not actual code I use in an application.
Now my first question is: is this the correct way to handle database interaction? I thought this would be a good way because you can instantiate an object, manipulate it, then insert/update it again.
In other words: is it better to handle database interaction inside the class (like in my example) or outside it, in the code that instantiates/uses the class?
My second question is about updating a whole bunch of rows that may or may not have been modified. Let's say the class Person has a member variable $pets[], which is an array containing all the pets that person owns. The pets are stored in a separate table in the database, like this:
+---------+-------------+---------+
| Field | Type | Key |
+---------+-------------+---------+
| pet_id | int(11) | PRI |
| user_id | int(11) | MUL |
| name | varchar(25) | |
+---------+-------------+---------+
Let's say I modified some pets in the Person object. Maybe I added or deleted some pets, maybe I only updated some pets' names.
What is the best way to update the whole Person, including their pets, in that case? Let's say one Person has 50 pets; do I just update them all even if only one of them has changed?
I hope this is clear enough ;)
EDIT:
Even more importantly, how do I handle deletions/insertions at the same time? My current approach is that on an "edit page", I retrieve a certain Person (including their pets) and display/print them in a form for the user to edit. The user then can edit pets, add new pets or delete some pets. When the user clicks the "apply" button, the form gets POSTed back to the PHP script.
The only way I can think of to update these changes into the database is to just remove all pets currently listed in the database, and then insert the new set of pets. There are some problems with this though: first of all, all rows in the pets table are deleted and reinserted on every edit, and second, the auto increment id will take huge leaps every time because of this.
I'm feeling I'm doing something wrong. Is it just not possible to let users remove/add pets and modify existing pets at the same time (should I handle those actions separately)?

What you're trying to accomplish is a task called object-relational mapping (mapping objects to tables in a relational database and vice-versa). Entire books have been written on that, but I'll try to give a short overview.
Now my first question is: is this the correct way to handle database interaction? I thought this would be a good way because you can instantiate an object, manipulate it, then insert/update it again.
It's a valid approach. However,
in general, you should try to adhere to Separation of Concerns. In my opinion, modelling a domain entity (like a person, in your case) and storing this object in/loading it from a database are two (arguably three) different concerns that should be implemented in separate classes (again, personal opinion!). Combining them makes unit-testing your classes very difficult and adds a lot of complexity.
Regarding ORM, there are several design patterns that have emerged over time. Most prominently:
Active Record is basically the approach that you've already suggested in your question; it tightly couples data and data access logic together in one object. In my opinion not the best approach, because it violates Separation of Concerns, but probably easiest to implement.
Gateways or Mappers: Try to create a separate class for accessing your Persons table (something like a PersonGateway). That way, your Person class contains only the data and associated behaviour, and your PersonGateway all kinds of insert/update/delete methods. A mapper, on the other hand, might be a class that converts generic database result objects (for instance a row returned by a PDO query) into Person objects (in this case, the Person class does not need to be aware of the existence of such a mapper class).
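A minimal sketch of such a table gateway, assuming a PDO connection and the users table from the question (the PersonGateway class and its method names are hypothetical, and an in-memory SQLite database stands in for the real one so the sketch is runnable):

```php
<?php
// Table-gateway sketch: Person holds only data, PersonGateway owns the SQL.
// Names (PersonGateway, findById, save) are illustrative, not a standard API.
class Person {
    public $id;
    public $firstname;
    public $lastname;
}

class PersonGateway {
    private $db;

    public function __construct(PDO $db) {
        $this->db = $db;
    }

    // Returns a Person or null; Person itself never touches the database.
    public function findById($id) {
        $stmt = $this->db->prepare("SELECT id, firstname, lastname FROM users WHERE id = ?");
        $stmt->execute([$id]);
        $row = $stmt->fetch(PDO::FETCH_ASSOC);
        if (!$row) {
            return null;
        }
        $person = new Person();
        $person->id = (int) $row['id'];
        $person->firstname = $row['firstname'];
        $person->lastname = $row['lastname'];
        return $person;
    }

    public function save(Person $p) {
        $stmt = $this->db->prepare("REPLACE INTO users (id, firstname, lastname) VALUES (?, ?, ?)");
        $stmt->execute([$p->id, $p->firstname, $p->lastname]);
    }
}

// Demo against an in-memory SQLite database.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("CREATE TABLE users (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT)");

$gateway = new PersonGateway($db);
$p = new Person();
$p->id = 1;
$p->firstname = 'Ada';
$p->lastname = 'Lovelace';
$gateway->save($p);

$loaded = $gateway->findById(1);
```

The Person class stays trivially unit-testable because it has no database dependency at all.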
What is the best way to update the whole Person, including their pets in that case? Lets say one Person has 50 pets, do I just update them all even if only one of them has changed?
Again, several possibilities. You should map your Pet as a separate class in any case.
Keep the pets in their separate table and implement your own data access logic for this class (for example using one of the patterns mentioned above). You can then individually update, insert or delete them at will. You can keep track of added/deleted pets in your parent object (the person) and then cascade your update operations when you persist the parent object.
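The "keep track of added/deleted pets in the parent object" idea can be sketched like this (Pet/Person shapes and method names are invented for illustration): the person records what changed since loading, so a mapper can issue only the needed INSERT/UPDATE/DELETE statements instead of rewriting all 50 rows:

```php
<?php
// Change tracking on the aggregate (hypothetical classes/names).
// Instead of deleting and re-inserting all pets, the Person remembers what
// changed so a mapper can cascade only the necessary statements.
class Pet {
    public $id;            // null until persisted
    public $name;
    public $dirty = false;

    public function __construct($id, $name) {
        $this->id = $id;
        $this->name = $name;
    }

    public function rename($name) {
        $this->name = $name;
        $this->dirty = true;   // mark for UPDATE
    }
}

class Person {
    public $pets = [];
    public $addedPets = [];    // -> INSERT statements
    public $removedPets = [];  // -> DELETE statements

    public function addPet(Pet $pet) {
        $this->pets[] = $pet;
        $this->addedPets[] = $pet;
    }

    public function removePet(Pet $pet) {
        $this->pets = array_filter($this->pets, function ($p) use ($pet) {
            return $p !== $pet;
        });
        if ($pet->id !== null) {       // only persisted pets need a DELETE
            $this->removedPets[] = $pet;
        }
    }

    // A mapper would consume these sets: UPDATE every dirty persisted pet,
    // INSERT every added pet, DELETE every removed pet, then clear the sets.
    public function dirtyPets() {
        return array_values(array_filter($this->pets, function ($p) {
            return $p->dirty && $p->id !== null;
        }));
    }
}

// Only one UPDATE, one INSERT and one DELETE would be issued here,
// no matter how many unchanged pets the person owns.
$person = new Person();
$rex = new Pet(1, 'Rex');
$felix = new Pet(2, 'Felix');
$person->pets = [$rex, $felix];

$rex->rename('Rexx');                      // UPDATE
$person->addPet(new Pet(null, 'Tweety'));  // INSERT
$person->removePet($felix);                // DELETE
```

This also answers the EDIT in the question: diffing against the loaded state avoids both the delete-everything-and-reinsert churn and the auto-increment gaps it causes.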
Embed the pets collection inside your Persons table. Depending on the average size of this collection, and how you might want to query them, this might not be a good idea, though.
If your project gains complexity, you might also want to have a look at ORM frameworks (like for example the Doctrine ORM) that take care of these problems for you.

In answer to question 1:
I would suggest having the SQL in the class, and feeding in the condition via the argument as per your example, since the queries would (should) all relate to a table of data representing Person objects.
On your second:
If pets are in a second table, I would suggest you having this as a separate class, with different sql queries. The variable $pets[] in your example can hold the pet_ids and you can have separate sql in the Pet class to do any changing, adding or removing as necessary.

Related

Doctrine2 immutable entities and append only data structures

I like the technique described by Marco Pivetta at PHP UK Conference 2016 (https://youtu.be/rzGeNYC3oz0?t=2011): he recommends favouring immutable entities and, instead of changing data structures, appending to them. A history of changes is a nice thing to have for many different reasons, so I would like to apply this approach in my projects. Let's have a look at the following use case:
class Task {
    protected $id;

    /**
     * @var Status[]
     */
    protected $statusChanges;

    public function __construct()
    {
        $this->id = Uuid::uuid4();
        $this->statusChanges = new ArrayCollection();
    }

    public function changeStatus($status, $user)
    {
        $this->statusChanges->add(new Status($status, $user, $this));
    }

    public function getStatus()
    {
        return $this->statusChanges->last();
    }
}
class Status {
    protected $id;
    protected $value;
    protected $changedBy;
    protected $created;
    protected $task;

    const DONE = 'Done';

    public function __construct($value, User $changedBy, Task $task)
    {
        $this->id = Uuid::uuid4();
        $this->value = $value;
        $this->changedBy = $changedBy;
        $this->task = $task;
        $this->created = new \DateTime();
    }
}
$user = $this->getUser();
$task = new Task();
$task->changeStatus(Status::DONE, $user);
$taskRepository->add($task, $persistChanges = true);
All status changes I'm planning to persist in the MySQL database. So the association will be One(Task)-To-Many(Status).
1) What is the recommended way of getting tasks by their current status? I.e. all currently opened, finished, pending tasks.
$taskRepository->getByStatus(Status::DONE);
2) What is your opinion on this technique, are there some disadvantages which may appear in the future, as the project will grow?
3) Where is it more practical to save status changes (as a serialized array in a Task field, or in a separate table)?
Thanks for opinions!
I imagine this is going to get closed due to some of it being based on opinion, just so you're aware.
That being said, I've been quite interested in the idea of this but I've not really looked into it a huge amount, but here's my thinking...
1. Find By Status
I think you would need to do some sort of sub query in the join to get the latest state for each task and match that. (I would like to point out that this is just guesswork from looking at SO rather than actual knowledge so it could be well off).
SELECT t, s
FROM Task t
LEFT JOIN t.status s WITH s.id = (
    SELECT s2.id
    FROM Status s2
    WHERE s2.created = (
        SELECT MAX(s3.created)
        FROM Status s3
        WHERE s3.task = t
    )
)
WHERE s.value = :status
Or maybe just (provided the combined id & created fields are unique)...
SELECT t, s
FROM Task t
LEFT JOIN t.status s WITH s.created = (
    SELECT MAX(s2.created)
    FROM Status s2
    WHERE s2.task = t
)
WHERE s.value = :status
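The same "join on the newest status row" idea can be sanity-checked in plain SQL. Below is a runnable sketch against an in-memory SQLite database; table and column names (`task`, `status`, `task_id`, `created`) are guessed stand-ins for whatever Doctrine would generate, and the DQL above would compile to something similar:

```php
<?php
// Plain-SQL version of "find tasks whose *latest* status matches", using a
// correlated subquery on the created timestamp.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("CREATE TABLE task (id TEXT PRIMARY KEY)");
$db->exec("CREATE TABLE status (id TEXT PRIMARY KEY, task_id TEXT, value TEXT, created TEXT)");

// Task A went Open -> Done; task B is still Open.
$db->exec("INSERT INTO task VALUES ('A'), ('B')");
$db->exec("INSERT INTO status VALUES ('s1', 'A', 'Open', '2018-01-01T10:00:00')");
$db->exec("INSERT INTO status VALUES ('s2', 'A', 'Done', '2018-01-02T10:00:00')");
$db->exec("INSERT INTO status VALUES ('s3', 'B', 'Open', '2018-01-01T11:00:00')");

$stmt = $db->prepare(
    "SELECT t.id
       FROM task t
       JOIN status s ON s.task_id = t.id
      WHERE s.created = (SELECT MAX(s2.created) FROM status s2 WHERE s2.task_id = t.id)
        AND s.value = :status"
);

$stmt->execute([':status' => 'Done']);
$doneTasks = $stmt->fetchAll(PDO::FETCH_COLUMN);   // only A's latest status is Done

$stmt->execute([':status' => 'Open']);
$openTasks = $stmt->fetchAll(PDO::FETCH_COLUMN);   // A's Open row is not its latest
```
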
2. Disadvantages
I would imagine that having to use the above type of queries for each repository call would require more work and would, therefore, be easier to get wrong. As you are only ever appending to the database it will only get bigger so storage/cache space may be an issue depending on how much data you have.
3. Where To Save Status
The main benefit of immutable entities is that they can be cached forever as they will never change. If you saved any state changes in a serialized field then the entity would need to be mutable which would defeat the purpose.
Here's what I do:
All the types of tables involved in my business
I organize my database in 4 types of tables:
Log_xxx
Data_xxx
Document_xxx
Cache_xxx
Any data I store falls in one of those 4 types of tables.
Document_xxx and Data_xxx are just for storing binary files (like PDFs of tariffs the providers send to me) and static or super-slow-changing data (like the airports, countries or currencies of the world). They are not involved in the main part of this explanation, but are worth mentioning.
Log tables
All my "domain events" and also the "application events" go to a Log_xxx table.
Log tables are write-once, never deleteable, and I must do backups of them. This is where the "history of the business" is stored.
For example, for a "task" domain object as you mention in your question, say that the task can be "created" and then altered later, I'd use:
Log_Task_CreatedEvents
Log_Task_ChangedEvents
Also I save all the "application events": Each HTTP request with some contextual data. Each command-run... They go to:
Log_Application_Events
Never the domain can change unless there is an application that changes it (a command line, a cron, a controller attending an HTTP request, etc.). All the "domain events" have a reference to the application event that created them.
All the events, either domain events (like TaskChangedEvent) or the application events are absolutely immutable and carry several standard things like the timestamp at creation.
The "Doctrine Entities" have no setters so they can only be created and read. Never changed.
In the database I only have one relevant field: it is of type TEXT and represents the event as JSON. I have another field, WriteIndex, which is auto-incrementing, is the primary key, and is NEVER used by my software as a key. It is only used for backups and database control. When you have GBs of data, sometimes you need to dump only "events starting at index XX".
Then, for convenience, I have an extra field which I call "cachedEventId", containing the very same "id" as the event, redundant with the JSON. This is why the field is prefixed "cached...": it does not contain original data and could be rebuilt from the event field. It is only there for simplicity.
Although doctrine calls them "entities" those are not domain entities, they are domain value objects.
So, Log tables look like this:
INT writeIndex;         // Never used by my program.
TEXT event;             // Stores the event as JSON
CHAR(40) cachedEventId; // Unique key; it acts as the primary key from the point of view of my program. Rebuildable from the event field.
Sometimes I opt-in for having more cached fields, like the creation time-stamp. All those are not needed and only set there for convenience. All that should be extractable from the event.
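The log-table layout described above could be declared and exercised roughly like this (a runnable SQLite stand-in; in the MySQL described in the answer, writeIndex would be an INT AUTO_INCREMENT and cachedEventId a CHAR(40) unique key):

```php
<?php
// Append-only log table sketch with a write/read round-trip.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec(
    "CREATE TABLE Log_Task_ChangedEvents (
        writeIndex    INTEGER PRIMARY KEY AUTOINCREMENT, -- never used as a key by the app
        event         TEXT NOT NULL,                     -- the event as JSON
        cachedEventId TEXT NOT NULL UNIQUE               -- redundant copy of the JSON id
    )"
);

// An invented change event, shaped like the example shown later in the answer.
$event = [
    'id'        => sha1('demo-event-1'),
    'type'      => 'task.change',
    'timeStamp' => '2018-04-04T02:03:11Z',
    'set'       => ['state' => 'quotationSent'],
];
$stmt = $db->prepare("INSERT INTO Log_Task_ChangedEvents (event, cachedEventId) VALUES (?, ?)");
$stmt->execute([json_encode($event), $event['id']]);

// Events are only ever appended and read back, never updated or deleted.
$row = $db->query("SELECT event FROM Log_Task_ChangedEvents")->fetch(PDO::FETCH_ASSOC);
$decoded = json_decode($row['event'], true);
```
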
The Cache tables
Then in the Cache_xxx tables I have the "accumulated data" that can be "rebuilt" from the logs.
For example if I have a "task" domain object that has a "title" field, and a "creator", and a "due date", and the creator cannot be overwritten, by definition, and the title and due date can be re-set... then I'd have a table that looks like:
Cache_Tasks
* CHAR(40) taskId
* VARCHAR(255) title
* VARCHAR(255) creatorName
* DATE dueDate
Write model
Then, when I create a task, it writes to 2 tables:
* Log_Task_CreatedEvents // Store the creation event here as JSON
* Cache_Tasks // Store the creation event as one field per column
Then, when I modify a task, it also writes to 2 tables:
* Log_Task_ChangedEvents // Store the event of change here as JSON
* Cache_Tasks // Read the entity, change its property, flush.
Read model
To read the tasks, use the Cache_Tasks always.
They always represent the "latest state" of the object.
Deleteability
All the Cache_xxx tables are deleteable and do not need to be backed up. Just replay the events in date order and you'll get the cached entities again.
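That replay step can be sketched as a pure-PHP fold over the event log (event shapes are invented for illustration; a real rebuild would read the Log_* tables in writeIndex/date order and flush the resulting rows into Cache_Tasks):

```php
<?php
// Rebuilding the "latest state" cache by replaying an append-only event log.
function rebuildCache(array $events) {
    $cache = [];  // taskId => cached row (one column per de-mapped field)
    foreach ($events as $e) {
        if ($e['type'] === 'task.created') {
            $cache[$e['taskId']] = ['title' => $e['title'], 'dueDate' => $e['dueDate']];
        } elseif ($e['type'] === 'task.changed') {
            foreach ($e['set'] as $field => $value) {
                $cache[$e['taskId']][$field] = $value;   // overwrite the cached column
            }
        }
    }
    return $cache;
}

// Replaying in date order yields the same rows the app maintained incrementally.
$log = [
    ['type' => 'task.created', 'taskId' => 't1', 'title' => 'Write docs', 'dueDate' => '2018-05-01'],
    ['type' => 'task.changed', 'taskId' => 't1', 'set' => ['title' => 'Write better docs']],
    ['type' => 'task.changed', 'taskId' => 't1', 'set' => ['dueDate' => '2018-06-01']],
];
$cache = rebuildCache($log);
```
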
Sample code
I wrote this answer using "Task" as that was the question, but today, for instance, I've been working on assigning a "state" to the client's form submissions. The clients just ask something via web and now I want to be able to "mark" each request as "new", "processed", "answered", "mailValidated", etc.
I just created a new change() method to my FormSubmissionManager. It looks like this:
public function change( Id $formSubmissionId, array $arrayOfPropertiesToSet ) : ChangedEvent
{
    $eventId = $this->idGenerator->generateNewId();
    $applicationExecutionId = $this->application->getExecutionId();
    $timeStamp = $this->systemClock->getNow();

    $changedEvent = new ChangedEvent( $eventId, $applicationExecutionId, $timeStamp, $formSubmissionId, $arrayOfPropertiesToSet );
    $this->entityManager->persist( $changedEvent );
    $this->entityManager->flush();

    $this->cacheManager->applyEventToCachedEntity( $changedEvent );
    $this->entityManager->flush();

    return $changedEvent;
}
Note that I do 2 flushes. This is on purpose: in case the "write-to-the-cache" fails, I don't want to lose the changedEvent.
So I "store" the event, then I cache it to the entity.
The Log_FormSubmission_ChangeEvent.event field looks like this:
{
    "id": "5093ecd53d5cca81d477c845973add91e31a1dd9",
    "type": "hellotrip.formSubmission.change",
    "applicationExecutionId": "ff7ad4bd5ec6cebacc048650c866812ac0127ac2",
    "timeStamp": "2018-04-04T02:03:11.637266Z",
    "formSubmissionId": "758d3b3cf864d711d330c4e0d5c679cbf9370d9e",
    "set":
    {
        "state": "quotationSent"
    }
}
In the "row" of the cache I'll have the "quotationSent" in the column state so it can be queried normally from Doctrine even without the need of any Join.
I sell trips, so a cached row contains a lot of de-normalized data coming from several sources: for example the number of adults, kids and infants travelling (coming from the creation of the form submission itself), the name of the trip the client requests (coming from a repository of trips), and others, including the latest-added "state" field. There may be some 20 de-mapped fields in a cached row.
Answers to your questions
Q1) What is the recommended way of getting tasks by current status? I.e. all currently opened, finished, pending tasks.
Query the cached table.
Q2) Are there some disadvantages which may appear in the future, as the project will grow?
When the project grows, instead of rebuilding the cache at write time, which may be slow, you set up a queue system (for example RabbitMQ or AWS SNS) and just send the queue a signal of "hey, this entity needs to be re-cached". You can then return very quickly, as saving a JSON and sending a signal to the queue is effortless.
A listener on the queue will then process each and every change you make, and if re-caching is slow, it doesn't matter.
Q3) Where is it more practical to save status changes (as a serialized array in a Task field, or in a separate table)?
Separate tables: A table for "status changes" (=log =events =value_objects, not entities), and another table for "tasks" (=cache =domain_entities).
When you make backups, place in super-secure place the backups of the logs.
Upon a critical failure, restore the logs=events and replay them to re-build the cache.
In Symfony I usually create a hellotrip:cache:rebuild command that accepts as a parameter the cache I need to reconstruct. It truncates the table (deletes all cached data for that table) and rebuilds it from scratch.
This is costly, so you only need to rebuild "all" when necessary. In normal conditions, your app should take care of having the caches up to date when there is a new event.
Documents and Data
At the very beginning I mentioned the Documents and Data tables.
Now is the time for them: you can use that information when rebuilding the caches. For example, you can "de-map" the airport name into the cached entity while the events may only contain the airport code.
You can rather safely change the cache format as your business develops more complex queries that need pre-calculated data. Just change the schema, drop it, and re-build the cache.
The change-events, instead, will remain exactly the same, so the code that gets the data and saves the event does not change, reducing the risk of regression bugs.
Hope this helps!

Struggling With OOP Concept

I'm really struggling with a recurring OOP / database concept.
Please allow me to explain the issue with pseudo-PHP-code.
Say you have a "user" class, which loads its data from the users table in its constructor:
class User {
    public $name;
    public $height;

    public function __construct($user_id) {
        $result = Query the database where the `users` table has `user_id` of $user_id
        $this->name = $result['name'];
        $this->height = $result['height'];
    }
}
Simple, awesome.
Now, we have a "group" class, which loads its data from the groups table joined with the groups_users table and creates user objects from the returned user_ids:
class Group {
    public $type;
    public $schedule;
    public $users;

    public function __construct($group_id) {
        $result = Query the `groups` table, joining the `groups_users` table,
                  where `group_id` = $group_id
        $this->type = $result['type'];
        $this->schedule = $result['schedule'];
        foreach ($result['user_ids'] as $user_id) {
            // Make the user objects
            $this->users[] = new User($user_id);
        }
    }
}
A group can have any number of users.
Beautiful, elegant, amazing... on paper. In reality, however, making a new group object...
$group = new Group(21); // Get the 21st group, which happens to have 4 users
...performs 5 queries instead of 1. (1 for the group and 1 for each user.) And worse, if I make a community class, which has many groups in it that each have many users within them, an ungodly number of queries are run!
The Solution, Which Doesn't Sit Right To Me
For years, the way I've got around this, is to not code in the above fashion, but instead, when making a group for instance, I would join the groups table to the groups_users table to the users table as well and create an array of user-object-like arrays within the group object (never using/touching the user class):
class Group {
    public $type;
    public $schedule;
    public $users;

    public function __construct($group_id) {
        $result = Query the `groups` table, joining the `groups_users` table,
                  **and also joining the `users` table,**
                  where `group_id` = $group_id
        $this->type = $result['type'];
        $this->schedule = $result['schedule'];
        foreach ($result['users'] as $user) {
            // Make user arrays
            $this->users[] = array_of_user_data_crafted_from_the_query_result;
        }
    }
}
...but then, of course, if I make a "community" class, in its constructor I'll need to join the communities table with the communities_groups table with the groups table with the groups_users table with the users table.
...and if I make a "city" class, in its constructor I'll need to join the cities table with the cities_communities table with the communities table with the communities_groups table with the groups table with the groups_users table with the users table.
What an unmitigated disaster!
Do I have to choose between beautiful OOP code with a million queries VS. 1 query and writing these joins by hand for every single superset? Is there no system that automates this?
I'm using CodeIgniter, and looking into countless other MVC's, and projects that were built in them, and cannot find a single good example of anyone using models without resorting to one of the two flawed methods I've outlined.
It appears this has never been done before.
One of my coworkers is writing a framework that does exactly this - you create a class that includes a model of your data. Other, higher models can include that single model, and it crafts and automates the table joins to create the higher model that includes object instantiations of the lower model, all in a single query. He claims he's never seen a framework or system for doing this before, either.
Please Note:
I do indeed always use separate classes for logic and persistence. (VOs and DAOs - this is the entire point of MVCs). I have merely combined the two in this thought-experiment, outside of an MVC-like architecture, for simplicity's sake. Rest assured that this issue persists regardless of the separation of logic and persistence. I believe this article, introduced to me by James in the comments below this question, seems to indicate that my proposed solution (which I've been following for years) is, in fact, what developers currently do to solve this issue. This question is, however, attempting to find ways of automating that exact solution, so it doesn't always need to be coded by hand for every superset. From what I can see, this has never been done in PHP before, and my coworker's framework will be the first to do so, unless someone can point me towards one that does.
And, also, of course I never load data in constructors, and I only call the load() methods that I create when I actually need the data. However, that is unrelated to this issue, as in this thought experiment (and in the real-life situations where I need to automate this), I always need to eager-load the data of all subsets of children as far down the line as it goes, and not lazy-load them at some future point in time as needed. The thought experiment is concise -- that it doesn't follow best practices is a moot point, and answers that attempt to address its layout are likewise missing the point.
EDIT : Here is a database schema, for clarity.
CREATE TABLE `groups` (
    `group_id` int(11) NOT NULL,  <-- Auto increment
    `make` varchar(20) NOT NULL,
    `model` varchar(20) NOT NULL
)

CREATE TABLE `groups_users` (  <-- Relational table (many users to one group)
    `group_id` int(11) NOT NULL,
    `user_id` int(11) NOT NULL
)

CREATE TABLE `users` (
    `user_id` int(11) NOT NULL,  <-- Auto increment
    `name` varchar(20) NOT NULL,
    `height` int(11) NOT NULL
)
(Also note that I originally used the concepts of wheels and cars, but that was foolish, and this example is much clearer.)
SOLUTION:
I ended up finding a PHP ORM that does exactly this. It is Laravel's Eloquent. You can specify the relationships between your models, and it intelligently builds optimized queries for eager loading using syntax like this:
Group::with('users')->get();
It is an absolute life saver. I haven't had to write a single query. It also doesn't work using joins, it intelligently compiles and selects based on foreign keys.
Say you have a "wheel" class, which loads its data from the wheels table in its constructor
Constructors should not be doing any work. Instead they should contain only assignments. Otherwise you make it very hard to test the behavior of the instance.
Now, we have a "car" class, which loads its data from the cars table joined with the cars_wheels table and creates wheel objects from the returned wheel_ids:
No. There are two problems with this.
Your Car class should not contain both code for implementing "car logic" and "persistence logic". Otherwise you are breaking SRP. And wheels are a dependency for the class, which means that the wheels should be injected as parameter for the constructor (most likely - as a collection of wheels, or maybe an array).
Instead you should have a mapper class, which can retrieve data from database and store it in the WheelCollection instance. And a mapper for car, which will store data in Car instance.
$car = new Car;
$car->setId( 42 );

$mapper = new CarMapper( $pdo );
if ( $mapper->fetch($car) ) // if there was a car in DB
{
    $wheels = new WheelCollection;
    $otherMapper = new WheelMapper( $pdo );

    $car->addWheels( $wheels );
    $wheels->setType( $car->getWheelType() );
    // I am not a mechanic. There is probably some name for describing
    // wheels that a car can use
    $otherMapper->fetch( $wheels );
}
Something like this. The mappers in this case are responsible for performing the queries. And you can have several sources for them, for example: one mapper that checks the cache and only, if that fails, pulls data from SQL.
Do I really have to choose between beautiful OOP code with a million queries VS. 1 query and disgusting, un-OOP code?
No, the ugliness comes from fact that active record pattern is only meant for the simplest of usecases (where there is almost no logic associated, glorified value-objects with persistence). For any non-trivial situation it is preferable to apply data mapper pattern.
..and if I make a "city" class, in its constructor I'll need to join the cities table with the cities_dealerships table with the dealerships table with the dealerships_cars table with the cars table with the cars_wheels table with the wheels table.
Just because you need data about "available cars per dealership in Moscow" does not mean that you need to create Car instances, and you definitely will not care about wheels there. Different parts of the site will operate at different scales.
The other thing is that you should stop thinking of classes as table abstractions. There is no rule that says "you must have 1:1 relation between classes and tables".
Take the Car example again. If you look at it, having a separate Wheel (or even WheelSet) class is just stupid. Instead you should just have a Car class which already contains all its parts.
$car = new Car;
$car->setId( 616 );
$mapper = new CarMapper( $cache );
$mapper->fetch( $car );
The mapper can easily fetch data not only from "Cars" table but also from "Wheel" and "Engines" and other tables and populate the $car object.
Bottom line: stop using active record.
P.S.: also, if you care about code quality, you should start reading PoEAA book. Or at least start watching lectures listed here.
my 2 cents
ActiveRecord in Rails implements the concept of lazy loading, that is deferring database queries until you actually need the data. So if you instantiate a my_car = Car.find(12) object, it only queries the cars table for that one row. If later you want my_car.wheels then it queries the wheels table.
My suggestion for your pseudo code above is to not load every associated object in the constructor. The car constructor should query for the car only, and should have a method to query for all of its wheels, and another to query its dealership, which only queries for the dealership and defers collecting all of the other dealership's cars until you specifically say something like my_car.dealership.cars
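That deferred loading can be sketched in PHP with a loader callback that runs only on first access (class and method names are invented; the closure stands in for the real wheels query):

```php
<?php
// Lazy loading: the wheels query is deferred until wheels() is first called,
// and later calls are served from memory.
class Car {
    private $wheelLoader;
    private $wheels = null;
    public $queries = 0;   // instrumentation for this demo only

    public function __construct(callable $wheelLoader) {
        $this->wheelLoader = $wheelLoader;
    }

    public function wheels() {
        if ($this->wheels === null) {          // first access: run the query
            $this->queries++;
            $this->wheels = call_user_func($this->wheelLoader);
        }
        return $this->wheels;                  // subsequent accesses: cached
    }
}

$car = new Car(function () {
    // pretend this is "SELECT * FROM wheels WHERE car_id = ?"
    return ['front-left', 'front-right', 'rear-left', 'rear-right'];
});

$queriesBefore = $car->queries;  // nothing loaded yet
$wheels = $car->wheels();        // triggers exactly one load
$car->wheels();                  // no second query
```
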
Postscript
ORMs are database abstraction layers, and thus they must be tuned for ease of querying and not fine tuning. They allow you to rapidly build queries. If later you decide that you need to fine tune your queries, then you can switch to issuing raw sql commands or trying to otherwise optimize how many objects you're fetching. This is standard practice in Rails when you start doing performance tuning - look for queries that would be more efficient when issued with raw sql, and also look for ways to avoid eager loading (the opposite of lazy loading) of objects before you need them.
In general, I'd recommend having a constructor that takes effectively a query row, or a part of a larger query. How to do this will depend on your ORM. That way, you can get efficient queries but you can construct the other model objects after the fact.
Some ORMs (django's models, and I believe some of the ruby ORMs) try to be clever about how they construct queries and may be able to automate this for you. The trick is to figure out when the automation is going to be required. I do not have personal familiarity with PHP ORMs.

how do i pattern a php database call?

specs: PHP 5 with mySQL built on top of Codeigniter Framework.
I have a database table called game and then sport specific tables like soccerGame and footballGame. these sport specific tables have a gameId field linking back to the game table. I have corresponding classes game and soccerGame/footballGame, which both extend game.
When I look up game information to display to the user, I'm having trouble figuring out how to dynamically link the two tables. i'm curious if it's possible to get all the information with with one query. The problem is, I need to query the game table first to figure out the sport name.
if that's not possible, my next thought is to do it with two queries. have my game_model query the game table, then based off the sport name, call the appropriate sport specific model (i.e. soccer_game_model) and get the sport specific info.
I would also pass the game object into the soccer_model, and the soccer_model would use that object to build me a soccerGame object. This seems a little silly to me because I'm building the parent object and then giving it to the extending class to make a whole new object?
thoughts?
thanks for the help.
EDIT:
game table
gameId
sport (soccer, basketball, football, etc)
date
other data
soccerGame table
soccerGameId
gameId
soccer specific information
footballGame table
footballGameId
gameId
football specific information
and so on for other sports
So I need to know what the sport is before I can decide which sport specific table I need to pull info from.
UPDATE:
Thanks all for the input. It seems like dynamic SQL is only possible through stored procedures, something I'm not well versed on right now. And even with them it's still a little messy. Right now I will go the two query route, one to get the sport name, and then a switch to get the right model.
From the PHP side of things now, it seems a little silly to get a game object, pass it to, say, my soccer_game_model, and then have that return me a soccer_game object, which is a child of the original game. Is that how it has to be done? or am I missing something from an OO perspective here?
To extend on Devin Young's answer, you would achieve this using Codeigniter's active record class like so:
public function get_game_by_id($game_id, $table)
{
    return $this->db->join('game', 'game.gameId = ' . $table . '.gameId', 'left')
                    ->where($table . '.gameId', $game_id)
                    ->get($table)
                    ->result();
}
So you're joining the table by the gameId which is shared, then using a where clause to find the correct one. Finally you use result() to return an array of objects.
EDIT: I've added a second $table parameter so you can pass in the name of the table you want to join (the soccerGame or footballGame table, etc.).
If you don't know which sport to choose at this point in the program, then you may want to take a step back and look at how you can add that information so you do know. I would be reluctant to add multiple joins to all the sport tables, as you'll run into issues down the line.
UPDATE
Consider passing the "sport" parameter when you look up game data. As a hidden field, most likely. You can then use a switch statement in your model:
switch ($gameValue) {
    case 'football': $gameTable = "footballGame"; break;
    case 'soccer':   $gameTable = "soccerGame";   break;
}
Then base your query off this:
"SELECT *
FROM ". $gameTable . "
...etc
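Because $gameTable is concatenated straight into the SQL, it's worth keeping that switch as a strict whitelist so no user input can ever reach the query string. A minimal sketch (the function name is mine, not from the answer):

```php
<?php
// Whitelist sketch: only known sport names map to a table; anything
// else is rejected before it can reach the SQL string.
function game_table_for(string $sport): string
{
    $tables = [
        'football' => 'footballGame',
        'soccer'   => 'soccerGame',
    ];
    if (!isset($tables[$sport])) {
        throw new InvalidArgumentException("Unsupported sport: $sport");
    }
    return $tables[$sport];
}

$sql = 'SELECT * FROM ' . game_table_for('soccer') . ' WHERE gameId = :gameId';
```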
You can combine the tables with joins. http://www.w3schools.com/sql/sql_join.asp
For example, if you need to get all the data from game and footballGame based on a footballGameId of 15:
SELECT *
FROM footballGame a
LEFT OUTER JOIN game b ON a.gameId = b.gameId
WHERE a.footballGameId = 15
Check this Stack Overflow answer for options on how to do it via a standard query. Then you can turn it into active record if you want (though that may be complicated and not worth your time if you don't need DB-agnostic calls in your app).
For what it's worth, there's nothing wrong with doing multiple queries; it just might be slower than an alternative. Try a few options, and see what works best for you and your app.

How to handle Tree structures returned from SQL query using PHP?

This is a "theoretical" question.
I'm having trouble defining the question so please bear with me.
When you have several related tables in a database, for example a table that holds "users" and a table that holds "phones"
both "phones" and "users" have a column called "user_id"
select user_id,name,phone from users left outer join phones on phones.user_id = users.user_id;
The query will provide me with rows for all the users, whether they have a phone or not.
If a user has several phones, his name will be returned in 2 rows as expected.
columns=>|user_id|name|phone|
row0 = > | 0 |fred|NULL|
row1 = > | 1 |paul|tlf1|
row2 = > | 1 |paul|tlf2|
the name "paul" in the case above is a necessary duplicate which in the RDMS's eye's is not a duplicate at all!
It will then be handled by some server side scripting language, for example php.
How are these "necessary duplicates" actually handled in real websites or applications?
As in, how are the rows "mapped" into some usable object model?
P.S. If you decide to post examples, post them for PHP, MySQL, or SQLite if possible.
edit:
Thank you for providing answers; each answer interpreted the question differently, and as such each is different and correct in its own way.
I have come to the conclusion that if round trips are expensive, the single-query approach together with Jakob Nilsson-Ehle's solution will be the best way, and it also fits the theoretical question.
If round trips are cheap, I will do separate selects for phones and users as 9000 suggests; if I need to show a single phone for every user, I will give a priority column to the phones and join it with the user select, as Ollie Jones correctly suggests.
Even though for real-life applications I'm using 9000's answer, I think that for this theoretical question Jakob Nilsson-Ehle's solution is most appropriate.
The thing I would probably do in this case in PHP would be to use the user_id as a key in a PHP array and then use that to continuously update the users.
A very simple example would be
// Note: the mysql_* API is deprecated; use mysqli or PDO in new code.
$result = mysql_query('select user_id, name, phone from users left outer join phones on phones.user_id = users.user_id;');
$users = array();
while ($row = mysql_fetch_assoc($result)) {
    $uid = $row['user_id'];
    if (!array_key_exists($uid, $users)) {
        $users[$uid] = array('name' => $row['name'], 'phones' => array());
    }
    if ($row['phone'] !== null) { // LEFT JOIN yields NULL for users with no phone
        $users[$uid]['phones'][] = $row['phone'];
    }
}
Of course, depending on your programming style and the complexity of the user data, you might define a User class or something and populate the data, but that is fundamentally how I would do it.
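For illustration, the User-class variant mentioned above could look like this (a sketch assuming PHP 8; the class and property names are mine, not from the question):

```php
<?php
// Sketch of grouping joined rows into User objects instead of nested
// arrays. Assumes PHP 8 (constructor property promotion).
class User
{
    public function __construct(
        public int $id,
        public string $name,
        public array $phones = []
    ) {}
}

// $rows: associative rows of (user_id, name, phone) from the join.
function map_users(array $rows): array
{
    $users = [];
    foreach ($rows as $row) {
        $uid = $row['user_id'];
        if (!isset($users[$uid])) {
            $users[$uid] = new User($uid, $row['name']);
        }
        if ($row['phone'] !== null) { // LEFT JOIN yields NULL for phoneless users
            $users[$uid]->phones[] = $row['phone'];
        }
    }
    return $users;
}
```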
Your data model inherently allows a user to have 0, 1, or more phones.
You could get your database to return either 0 or 1 phone items for each user by employing a nasty hack, like choosing the numerically smallest phone number. (MIN(phone) ... GROUP BY user). But numerically comparing phone numbers makes very little sense.
Your problem of ambiguity (which of several phone numbers) points to a problem in your data model design. Take a look, if you will, at some common telephone-directory apps. A speed-dial app on a mobile phone is a good example. Mostly they offer ways to put in multiple phone numbers, but they always have the concept of a primary phone number.
If you add a column to your phone table indicating number priority, and make it part of your primary (unique) key, and declare that priority=1 means the user's primary number, your app will not have this ambiguous issue any more.
You can't easily get a tree structure from an RDBMS, only a table structure. And you want a tree: [(user1, (phone1, phone2)), (user2, (phone2, phone3))...]. You can optimize towards different goals, though.
Round-trips are more expensive than sending extra info: go with your current solution. It fetches username multiple times, but you only have one round-trip per entire phone book. May make sense if your overburdened MySQL host is 1000 miles away.
Sending extra info is more expensive than round-trips, or you want more clarity: as #martinho-fernandes suggests, only fetch user IDs with phones, then fetch user details in another query. I'd stick with this approach unless your entire user details is a short username. With SQLite I'd stick with it at all times just for the sake of clarity.
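A minimal sketch of the two-query approach's merge step, assuming both result sets have already been fetched as associative arrays (the merge itself, not the SQL):

```php
<?php
// Merge two result sets in PHP: one SELECT over users, one over
// phones, joined on user_id application-side.
function merge_users_phones(array $userRows, array $phoneRows): array
{
    $users = [];
    foreach ($userRows as $u) {
        $users[$u['user_id']] = ['name' => $u['name'], 'phones' => []];
    }
    foreach ($phoneRows as $p) {
        if (isset($users[$p['user_id']])) {
            $users[$p['user_id']]['phones'][] = $p['phone'];
        }
    }
    return $users;
}
```

Two cheap round-trips, and no duplicated user columns travel over the wire.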
Sounds like you're confusing the object data model with the relational data model. Understanding how they differ, both in general and in the specifics of your application, is essential to writing OO code on top of a relational database.
Trivial ORM is not the solution.
There are ORM mapping technologies such as hibernate - however these do not scale well. IME, the best solution is using a factory pattern to manage the mapping properly.

Automatically joining tables without breaking default behaviour in Zend Framework

The situation is as follows: I've got 2 models: 'Action' and 'User'. These models refer to the tables 'actions' and 'users', respectively.
My action table contains a column user_id. At this moment, I need an overview of all actions and the users they are assigned to. When I use $action->fetchAll(), I only have the user ID, so I want to be able to join in the data from the user model, preferably without making a call to findDependentRowset().
I thought about creating custom fetchAll(), fetchRow() and find() methods in my model, but this would break default behaviour.
What is the best way to solve this issue? Any help would be greatly appreciated.
I designed and implemented the table-relationships feature in Zend Framework.
My first comment is that you wouldn't use findDependentRowset() anyway -- you'd use findParentRow() if the Action has a foreign key reference to User.
$actionTable = new Action();
$actionRowset = $actionTable->fetchAll();
foreach ($actionRowset as $actionRow) {
    $userRow = $actionRow->findParentRow('User');
}
Edit: In the loop, you now have an $actionRow and a $userRow object. You can write changes back to the database through either object by changing object fields and calling save() on the object.
You can also use the Zend_Db_Table_Select class (which was implemented after I left the project) to retrieve a Rowset based on a join between Action and User.
$actionTable = new Action();
$actionQuery = $actionTable->select()
    ->setIntegrityCheck(false) // allows joins
    ->from($actionTable)
    ->join('user', 'user.id = action.user_id');
$joinedRowset = $actionTable->fetchAll($actionQuery);
foreach ($joinedRowset as $joinedRow) {
    print_r($joinedRow->toArray());
}
Note that such a Rowset based on a join query is read-only. You cannot set field values in the Row objects and call save() to post changes back to the database.
Edit: There is no way to make an arbitrary joined result set writable. Consider a simple example based on the joined result set above:
action_id  action_type  user_id  user_name
1          Buy          1        Bill
2          Sell         1        Bill
3          Buy          2        Aron
4          Sell         2        Aron
Next for the row with action_id=1, I change one of the fields that came from the User object:
$joinedRow->user_name = 'William';
$joinedRow->save();
Questions: when I view the next row with action_id=2, should I see 'Bill' or 'William'? If 'William', does this mean that saving row 1 has to automatically update 'Bill' to 'William' in all other rows in this result set? Or does it mean that save() automatically re-runs the SQL query to get a refreshed result set from the database? What if the query is time-consuming?
Also consider the object-oriented design. Each Row is a separate object. Is it appropriate that calling save() on one object has the side effect of changing values in a separate object (even if they are part of the same collection of objects)? That seems like a form of Content Coupling to me.
The example above is a relatively simple query, but much more complex queries are also permitted. Zend_Db cannot analyze queries with the intention to tell writable results from read-only results. That's also why MySQL views are not updateable.
You could always make a view in your database that does the join for you.
CREATE OR REPLACE VIEW VwAction AS
SELECT [columns]
FROM action
LEFT JOIN user
ON user.id = action.user_id
Then just use
$vwAction->fetchAll();
Just remember that views in MySQL are read-only (assuming this is MySQL)
Isn't creating an SQL view a good solution for doing the join?
Then a simple table class can access it afterwards.
I would think it's better to keep that logic in SQL than in PHP.
