construct foreign classes from database in PHP - php

Constructing classes from a database (e.g. MySQL) is not the problem in itself; this is a question about performance.
Say I have a class A which depends on class B:
class A
{
protected $depend;
public function __construct($id = null)
{
// construct from mysql/postgresql/...
}
}
In the database, the table for class A (say "tbl_A") has a foreign key to the table for class B (say "tbl_B"). Of course these classes each depend on more than one table, but I will simplify things here...
At the moment I construct class A from its table:
select * from tbl_A where ID = $id
If it succeeds, the statement for class A returns something like this:
ID | Name | B_ID
1 | "test" | 3
After that I have to construct class B in its own constructor, which costs a second query. Is there any way to run a single statement with a join in the constructor of class A and construct class B from that result? I think this would improve the performance of my application. Unfortunately I haven't found anything like friend classes (as in C++), and I want the properties of class B to stay protected or private.

What is the largest performance overhead: MySQL/PostgreSQL or the PHP side?
If I understood correctly, for one instance of A you have to do 2 queries (one for A, one for the B it depends on). So for 1000 single instances of A, that makes 2000 queries in total.
You can optimize your app to use "batch loading" of some kind, i.e. loading 1000 instances of A at once will be 1 query for A (1000 rows) + 1 query for B (up to 1000 rows). So you go from 2000 queries to just 2.
Your suggestion about joining tables is of course possible (join table A and table B on the correct keys and select all needed fields), but if you still load one instance at a time this only reduces 2000 queries to 1000 queries, and your object-relational mapping code gets more complicated.
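The batch-loading idea can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses an in-memory SQLite database through PDO (so the pdo_sqlite extension is assumed), the `Label` column and the `loadBatch()` helper are made up for the example, and rows are returned as plain arrays rather than real A/B objects.

```php
<?php
// Hypothetical schema mirroring tbl_A / tbl_B from the question.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE tbl_B (ID INTEGER PRIMARY KEY, Label TEXT)');
$pdo->exec('CREATE TABLE tbl_A (ID INTEGER PRIMARY KEY, Name TEXT, B_ID INTEGER)');
$pdo->exec("INSERT INTO tbl_B VALUES (3, 'b-three'), (4, 'b-four')");
$pdo->exec("INSERT INTO tbl_A VALUES (1, 'test', 3), (2, 'other', 4), (5, 'third', 3)");

// Loads many A rows plus the B rows they reference using two queries in total.
function loadBatch(PDO $pdo, array $aIds): array
{
    if ($aIds === []) {
        return [];
    }

    // Query 1: all requested A rows at once.
    $in = implode(',', array_fill(0, count($aIds), '?'));
    $stmt = $pdo->prepare("SELECT ID, Name, B_ID FROM tbl_A WHERE ID IN ($in) ORDER BY ID");
    $stmt->execute(array_values($aIds));
    $aRows = $stmt->fetchAll(PDO::FETCH_ASSOC);
    if ($aRows === []) {
        return [];
    }

    // Query 2: every distinct B row referenced by those A rows.
    $bIds = array_values(array_unique(array_column($aRows, 'B_ID')));
    $in = implode(',', array_fill(0, count($bIds), '?'));
    $stmt = $pdo->prepare("SELECT ID, Label FROM tbl_B WHERE ID IN ($in)");
    $stmt->execute($bIds);
    $bById = [];
    foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $b) {
        $bById[$b['ID']] = $b;
    }

    // Stitch the two result sets together in PHP.
    foreach ($aRows as &$a) {
        $a['B'] = $bById[$a['B_ID']] ?? null;
    }
    unset($a);

    return $aRows;
}

$rows = loadBatch($pdo, [1, 2, 5]);
```

The number of queries stays at two no matter how many IDs are passed in; only the length of the IN lists grows.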

It's really going to depend on how deep down the rabbit hole you want to go, but if you want to catch all 2 steppers, you can do this:
select tA1.* from tbl_A tA1 where tA1.ID = $id
UNION
select tA3.* from tbl_A tA3
LEFT JOIN tbl_A tA2 ON tA3.ID = tA2.B_ID
WHERE tA2.B_ID = $id
but then there's an issue if you need to go deep... this method becomes increasingly complex and less efficient than doing multiple queries (especially since a union is just making MySQL do it in that way).
Also once you've got them, you should loop through the list of ids you got and keep that in an array so you don't need to reload them.
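Keeping already-loaded rows in an array, as suggested above, might look like this. The `RowCache` class and the `$fakeTable` stand-in for the real query are hypothetical; a real loader would run the SELECT instead.

```php
<?php
// Hypothetical in-memory cache: each ID is loaded at most once per request.
final class RowCache
{
    /** @var array rows already loaded, keyed by ID (null means "not found") */
    private $rows = [];
    /** @var callable loader invoked on a cache miss */
    private $loader;
    /** @var int how many times the loader was actually invoked */
    public $loads = 0;

    public function __construct(callable $loader)
    {
        $this->loader = $loader;
    }

    public function get(int $id): ?array
    {
        // array_key_exists() also remembers misses that were stored as null.
        if (!array_key_exists($id, $this->rows)) {
            $this->loads++;
            $this->rows[$id] = ($this->loader)($id);
        }
        return $this->rows[$id];
    }
}

// Stand-in for "select * from tbl_A where ID = ?".
$fakeTable = [1 => ['ID' => 1, 'Name' => 'test', 'B_ID' => 3]];
$cache = new RowCache(function (int $id) use ($fakeTable) {
    return $fakeTable[$id] ?? null;
});

$first  = $cache->get(1); // triggers the loader
$second = $cache->get(1); // served from the array
```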

Related

Why is Yii2's ActiveRecord using lots of single SELECTs instead of JOINs?

I'm using Yii2's ActiveRecord implementation in (hopefully) exactly the way it should be used, according to the docs.
Problem
In a quite simple setup with simple relations between the tables, fetching 10 results is fast, 100 is slow, and 1000 is impossible. The database is extremely small and indexed perfectly. The problem is definitely Yii2's way of requesting data, not the db itself.
I'm using a standard ActiveDataProvider like:
$provider = new ActiveDataProvider([
'query' => Post::find(),
'pagination' => false // to get all records
]);
What I suspect
Debugging with the Yii2 toolbar showed thousands of single SELECTs for a simple request that should just get 50 rows from table A with some simple "JOINs" to table B to table C. In plain SQL everybody would solve this with one SQL statement and two joins. Yii2 however fires a SELECT for every relation in every row (which makes sense to keep the ORM clean). Resulting in (more or less) 1 * 50 * 30 = 1500 queries for just getting two relations of each row.
Question
Why is Yii2 using so many single SELECTs, or is this a mistake on my side?
Additionally, does anybody know how to "fix" this?
As this is a very important issue for me I'll provide 500 bounty on May 14th.
By default, Yii2 uses lazy loading for better performance. The effect of this is that any relation is only fetched when you access it, hence the thousands of sql queries. You need to use eager loading. You can do this with \yii\db\ActiveQuery::with() which:
Specifies the relations with which this query should be performed
Say your relation is comments, the solution is as follows:
'query' => Post::find()->with('comments'),
From the guide for Relations, with will perform an extra query to get the relations i.e:
SELECT * FROM `post`;
SELECT * FROM `comment` WHERE `post_id` IN (....);
To use proper joining, use joinWith with the eagerLoading parameter set to true instead:
This method allows you to reuse existing relation definitions to perform JOIN queries. Based on the definition of the specified relation(s), the method will append one or multiple JOIN statements to the current query.
So
'query' => Post::find()->joinWith('comments', true),
will result in the following queries:
SELECT `post`.* FROM `post` LEFT JOIN `comment` comments ON post.`id` = comments.`post_id`;
SELECT * FROM `comment` WHERE `post_id` IN (....);
From @laslov's comment and https://github.com/yiisoft/yii2/issues/2379
it's important to realise that using joinWith() will not use the JOIN query to eagerly load the related data. For various reasons, even with the JOIN, the WHERE post_id IN (...) query will still be executed to handle the eager loading. Thus, you should only use joinWith() when you specifically need a JOIN, e.g. to filter or order on one of the related table's columns
TLDR:
joinWith = with plus an actual JOIN (and therefore the ability to filter/order/group etc by one of the related columns)
In order to use relational AR, it is recommended that primary-foreign key constraints are declared for tables that need to be joined. The constraints will help to keep the consistency and integrity of the relational data.
Support for foreign key constraints varies in different DBMS. SQLite 3.6.19 or prior does not support foreign key constraints, but you can still declare the constraints when creating tables. MySQL’s MyISAM engine does not support foreign keys at all.
In AR, there are four types of relationships:
BELONGS_TO: if the relationship between table A and B is one-to-many, then B belongs to A (e.g. Post belongs to User);
HAS_MANY: if the relationship between table A and B is one-to-many, then A has many B (e.g. User has many Post);
HAS_ONE: this is special case of HAS_MANY where A has at most one B (e.g. User has at most one Profile);
MANY_MANY: this corresponds to the many-to-many relationship in database. An associative table is needed to break a many-to-many relationship into one-to-many relationships, as most DBMS do not support many-to-many relationship directly. In our example database schema, the tbl_post_category serves for this purpose. In AR terminology, we can explain MANY_MANY as the combination of BELONGS_TO and HAS_MANY. For example, Post belongs to many Category and Category has many Post.
The following code shows how we declare the relationships for the User and Post classes.
class Post extends CActiveRecord
{
......
public function relations()
{
return array(
'author'=>array(self::BELONGS_TO, 'User', 'author_id'),
'categories'=>array(self::MANY_MANY, 'Category',
'tbl_post_category(post_id, category_id)'),
);
}
}
class User extends CActiveRecord
{
......
public function relations()
{
return array(
'posts'=>array(self::HAS_MANY, 'Post', 'author_id'),
'profile'=>array(self::HAS_ONE, 'Profile', 'owner_id'),
);
}
}
The query result will be saved to the property as instance(s) of the related AR class. This is known as the lazy loading approach, i.e., the relational query is performed only when the related objects are initially accessed. The example below shows how to use this approach:
// retrieve the post whose ID is 10
$post=Post::model()->findByPk(10);
// retrieve the post's author: a relational query will be performed here
$author=$post->author;
You are doing it the wrong way; please go through the documentation here: http://www.yiiframework.com/doc/guide/1.1/en/database.arr

Yii relation generates GROUP BY clause in the query

I have User, Play and UserPlay models. Here is the relation defined in the User model to calculate the total time the user has played.
'playedhours'=>array(self::STAT, 'Play', 'UserPlay(user_id,play_id)',
'select'=>'SUM(duration)'),
Now I am trying to find the duration sum for a given user id.
$playedHours = User::model()->findByPk($model->user_id)->playedhours / 3600;
This relation takes a long time to execute on a large amount of data, so I looked into the query generated by the relation.
SELECT SUM(duration) AS `s`, `UserPlay`.`user_id` AS `c0` FROM `Play` `t` INNER JOIN
`UserPlay` ON (`t`.`id`=`UserPlay`.`play_id`) GROUP BY `UserPlay`.`user_id` HAVING
(`UserPlay`.`user_id`=9);
The GROUP BY on UserPlay.user_id is what takes so long, and I don't need a GROUP BY clause here.
My question is: how can I avoid the GROUP BY clause in the above relation?
STAT relations are by definition aggregation queries; see Statistical Query.
You cannot remove GROUP BY here and still make a meaningful query for aggregate data. SUM(), AVG(), etc. are all aggregate functions; see GROUP BY Functions for a list of all aggregate functions supported by MySQL.
Your problem is that the generated query filters with a HAVING clause. That is not required here: HAVING checks conditions after the aggregation takes place, so it is meant for conditions on the aggregated values, for example SUM(duration) > 500.
Basically, you are grouping all the users separately first and only then filtering for the user id you want. If you instead use a WHERE clause, which filters before the aggregation rather than after it, only the rows of the user you want are grouped and your query will be much faster.
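To make the difference concrete, here is a small sketch that runs both forms against an in-memory SQLite database via PDO (pdo_sqlite assumed). The sample rows are invented; the table and column names follow the generated query from the question. Both statements return the same sum, but the HAVING form aggregates every user before discarding all but one, while the WHERE form only ever touches the rows of user 9.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE Play (id INTEGER PRIMARY KEY, duration INTEGER)');
$pdo->exec('CREATE TABLE UserPlay (user_id INTEGER, play_id INTEGER)');
$pdo->exec('INSERT INTO Play VALUES (1, 3600), (2, 1800), (3, 7200)');
$pdo->exec('INSERT INTO UserPlay VALUES (9, 1), (9, 2), (4, 3)');

// Shape of the query the STAT relation generates: group everyone, then filter.
$having = $pdo->prepare(
    'SELECT SUM(t.duration)
       FROM Play t
      INNER JOIN UserPlay ON t.id = UserPlay.play_id
      GROUP BY UserPlay.user_id
     HAVING UserPlay.user_id = ?'
);

// Filter first, then aggregate only the matching rows.
$where = $pdo->prepare(
    'SELECT SUM(t.duration)
       FROM Play t
      INNER JOIN UserPlay ON t.id = UserPlay.play_id
      WHERE UserPlay.user_id = ?'
);

$having->execute([9]);
$where->execute([9]);
$viaHaving = (int) $having->fetchColumn();
$viaWhere  = (int) $where->fetchColumn();
```

On a real table, running EXPLAIN on both statements should show whether the WHERE form can use an index on `UserPlay.user_id`.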
Although Active Record is good at modelling data in an OOP fashion, it
actually degrades performance due to the fact that it needs to create
one or several objects to represent each row of query result. For data
intensive applications, using DAO or database APIs at lower level
could be a better choice
Therefore it is best if you change the relation to a model function querying the Db directly using the CommandBuilder or DAO API. Something like this
Class User extends CActiveRecord {
....
public function getPlayedhours(){
if(!isset($this->id)) // to prevent query running on a newly created object without a row loaded to it
return 0;
$played = Yii::app()->db->createCommand()
->select('SUM(duration)')
->from('play')
->join("user_play up","up.play_id = play.id")
->where("up.user_id = :id", array(":id" => $this->id)) // bind the id instead of concatenating it
->group("up.user_id")
->queryScalar();
if($played == null)
return 0;
else
return $played/3600 ;
}
....
}
If your query is still slow, try optimizing the indexes, implement a caching mechanism, and use the EXPLAIN command to figure out what is actually taking the time and, more importantly, why. If nothing is good enough, upgrade your hardware.

Struggling With OOP Concept

I'm really struggling with a recurring OOP / database concept.
Please allow me to explain the issue with pseudo-PHP-code.
Say you have a "user" class, which loads its data from the users table in its constructor:
class User {
public $name;
public $height;
public function __construct($user_id) {
$result = Query the database where the `users` table has `user_id` of $user_id
$this->name= $result['name'];
$this->height = $result['height'];
}
}
Simple, awesome.
Now, we have a "group" class, which loads its data from the groups table joined with the groups_users table and creates user objects from the returned user_ids:
class Group {
public $type;
public $schedule;
public $users;
public function __construct($group_id) {
$result = Query the `groups` table, joining the `groups_users` table,
where `group_id` = $group_id
$this->type = $result['type'];
$this->schedule = $result['schedule'];
foreach ($result['user_ids'] as $user_id) {
// Make the user objects
$this->users[] = new User($user_id);
}
}
}
A group can have any number of users.
Beautiful, elegant, amazing... on paper. In reality, however, making a new group object...
$group = new Group(21); // Get the 21st group, which happens to have 4 users
...performs 5 queries instead of 1. (1 for the group and 1 for each user.) And worse, if I make a community class, which has many groups in it that each have many users within them, an ungodly number of queries are run!
The Solution, Which Doesn't Sit Right To Me
For years, the way I've got around this, is to not code in the above fashion, but instead, when making a group for instance, I would join the groups table to the groups_users table to the users table as well and create an array of user-object-like arrays within the group object (never using/touching the user class):
class Group {
public $type;
public $schedule;
public $users;
public function __construct($group_id) {
$result = Query the `groups` table, joining the `groups_users` table,
**and also joining the `users` table,**
where `group_id` = $group_id
$this->type = $result['type'];
$this->schedule = $result['schedule'];
foreach ($result['users'] as $user) {
// Make user arrays
$this->users[] = array_of_user_data_crafted_from_the_query_result;
}
}
}
...but then, of course, if I make a "community" class, in its constructor I'll need to join the communities table with the communities_groups table with the groups table with the groups_users table with the users table.
...and if I make a "city" class, in its constructor I'll need to join the cities table with the cities_communities table with the communities table with the communities_groups table with the groups table with the groups_users table with the users table.
What an unmitigated disaster!
Do I have to choose between beautiful OOP code with a million queries VS. 1 query and writing these joins by hand for every single superset? Is there no system that automates this?
I'm using CodeIgniter, and have looked into countless other MVC frameworks and projects that were built in them, and cannot find a single good example of anyone using models without resorting to one of the two flawed methods I've outlined.
It appears this has never been done before.
One of my coworkers is writing a framework that does exactly this - you create a class that includes a model of your data. Other, higher models can include that single model, and it crafts and automates the table joins to create the higher model that includes object instantiations of the lower model, all in a single query. He claims he's never seen a framework or system for doing this before, either.
Please Note:
I do indeed always use separate classes for logic and persistence. (VOs and DAOs - this is the entire point of MVCs). I have merely combined the two in this thought-experiment, outside of an MVC-like architecture, for simplicity's sake. Rest assured that this issue persists regardless of the separation of logic and persistence. I believe this article, introduced to me by James in the comments below this question, seems to indicate that my proposed solution (which I've been following for years) is, in fact, what developers currently do to solve this issue. This question is, however, attempting to find ways of automating that exact solution, so it doesn't always need to be coded by hand for every superset. From what I can see, this has never been done in PHP before, and my coworker's framework will be the first to do so, unless someone can point me towards one that does.
And, also, of course I never load data in constructors, and I only call the load() methods that I create when I actually need the data. However, that is unrelated to this issue, as in this thought experiment (and in the real-life situations where I need to automate this), I always need to eager-load the data of all subsets of children as far down the line as it goes, and not lazy-load them at some future point in time as needed. The thought experiment is concise -- that it doesn't follow best practices is a moot point, and answers that attempt to address its layout are likewise missing the point.
EDIT : Here is a database schema, for clarity.
CREATE TABLE `groups` (
`group_id` int(11) NOT NULL, -- Auto increment
`make` varchar(20) NOT NULL,
`model` varchar(20) NOT NULL
);
CREATE TABLE `groups_users` ( -- Relational table (many users to one group)
`group_id` int(11) NOT NULL,
`user_id` int(11) NOT NULL
);
CREATE TABLE `users` (
`user_id` int(11) NOT NULL, -- Auto increment
`name` varchar(20) NOT NULL,
`height` int(11) NOT NULL
);
(Also note that I originally used the concepts of wheels and cars, but that was foolish, and this example is much clearer.)
SOLUTION:
I ended up finding a PHP ORM that does exactly this. It is Laravel's Eloquent. You can specify the relationships between your models, and it intelligently builds optimized queries for eager loading using syntax like this:
Group::with('users')->get();
It is an absolute life saver. I haven't had to write a single query. It also doesn't work using joins; instead it compiles separate selects based on the foreign keys.
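For readers without an ORM, the "select by foreign keys instead of joining everything" idea can be approximated by hand. The following is a rough sketch, not Eloquent's actual implementation: it assumes pdo_sqlite with an in-memory database, a simplified version of the schema above (with a `type` column as in the Group class), invented sample data, and a hypothetical `groupsWithUsers()` helper that returns arrays rather than objects.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE "groups" (group_id INTEGER PRIMARY KEY, type TEXT)');
$pdo->exec('CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT, height INTEGER)');
$pdo->exec('CREATE TABLE groups_users (group_id INTEGER, user_id INTEGER)');
$pdo->exec("INSERT INTO \"groups\" VALUES (21, 'study'), (22, 'sport')");
$pdo->exec("INSERT INTO users VALUES (1,'Ann',170),(2,'Bob',180),(3,'Cy',175),(4,'Di',165),(5,'Ed',190)");
$pdo->exec('INSERT INTO groups_users VALUES (21,1),(21,2),(21,3),(21,4),(22,5)');

// Two queries in total, however many groups or users are involved.
function groupsWithUsers(PDO $pdo, array $groupIds): array
{
    if ($groupIds === []) {
        return [];
    }
    $ids = array_values($groupIds);
    $in  = implode(',', array_fill(0, count($ids), '?'));

    // Query 1: the groups themselves.
    $stmt = $pdo->prepare("SELECT group_id, type FROM \"groups\" WHERE group_id IN ($in)");
    $stmt->execute($ids);
    $groups = [];
    foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $g) {
        $groups[$g['group_id']] = $g + ['users' => []];
    }

    // Query 2: every user of every requested group, via the pivot table.
    $stmt = $pdo->prepare(
        "SELECT gu.group_id, u.user_id, u.name, u.height
           FROM groups_users gu
           JOIN users u ON u.user_id = gu.user_id
          WHERE gu.group_id IN ($in)
          ORDER BY u.user_id"
    );
    $stmt->execute($ids);
    foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
        // Skip pivot rows that point at a group missing from query 1.
        if (isset($groups[$row['group_id']])) {
            $groups[$row['group_id']]['users'][] = [
                'user_id' => $row['user_id'],
                'name'    => $row['name'],
                'height'  => $row['height'],
            ];
        }
    }
    return $groups;
}

$result = groupsWithUsers($pdo, [21, 22]);
```

Adding a "community" level on top would add one more query per level rather than one more query per row.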
Say you have a "wheel" class, which loads its data from the wheels table in its constructor
Constructors should not be doing any work. Instead they should contain only assignments. Otherwise you make it very hard to test the behavior of the instance.
Now, we have a "car" class, which loads its data from the cars table joined with the cars_wheels table and creates wheel objects from the returned wheel_ids:
No. There are two problems with this.
Your Car class should not contain both code for implementing "car logic" and "persistence logic". Otherwise you are breaking SRP. And wheels are a dependency for the class, which means that the wheels should be injected as parameter for the constructor (most likely - as a collection of wheels, or maybe an array).
Instead you should have a mapper class, which can retrieve data from database and store it in the WheelCollection instance. And a mapper for car, which will store data in Car instance.
$car = new Car;
$car->setId( 42 );
$mapper = new CarMapper( $pdo );
if ( $mapper->fetch($car) ) //if there was a car in DB
{
$wheels = new WheelCollection;
$otherMapper = new WheelMapper( $pdo );
$car->addWheels( $wheels );
$wheels->setType($car->getWheelType());
// I am not a mechanic. There is probably some name for describing
// wheels that a car can use
$otherMapper->fetch( $wheels );
}
Something like this. The mappers in this case are responsible for performing the queries. And you can have several sources for them, for example: have one mapper that checks the cache and, only if that fails, pulls data from SQL.
Do I really have to choose between beautiful OOP code with a million queries VS. 1 query and disgusting, un-OOP code?
No, the ugliness comes from the fact that the active record pattern is only meant for the simplest of use cases (where there is almost no logic associated: glorified value objects with persistence). For any non-trivial situation it is preferable to apply the data mapper pattern.
..and if I make a "city" class, in its constructor I'll need to join the cities table with the cities_dealerships table with the dealerships table with the dealerships_cars table with the cars table with the cars_wheels table with the wheels table.
Just because you need data about "available cars per dealership in Moscow" does not mean that you need to create Car instances, and you definitely will not care about wheels there. Different parts of the site will operate at different scales.
The other thing is that you should stop thinking of classes as table abstractions. There is no rule that says "you must have 1:1 relation between classes and tables".
Take the Car example again. If you look at it, having a separate Wheel (or even WheelSet) class is just stupid. Instead you should just have a Car class which already contains all its parts.
$car = new Car;
$car->setId( 616 );
$mapper = new CarMapper( $cache );
$mapper->fetch( $car );
The mapper can easily fetch data not only from "Cars" table but also from "Wheel" and "Engines" and other tables and populate the $car object.
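A minimal sketch of such a mapper is shown below, purely as an illustration. The `cars` and `wheels` tables, their columns, and the `Car` accessors are assumptions made for the example (the answer does not show them), and an in-memory SQLite database via PDO stands in for the real storage.

```php
<?php
// Domain object: holds state only and knows nothing about the database.
final class Car
{
    private $id;
    private $model;
    private $wheels = [];

    public function setId(int $id): void { $this->id = $id; }
    public function getId(): ?int { return $this->id; }
    public function setModel(string $model): void { $this->model = $model; }
    public function getModel(): ?string { return $this->model; }
    public function setWheels(array $wheels): void { $this->wheels = $wheels; }
    public function getWheels(): array { return $this->wheels; }
}

// Mapper: the only place that knows which tables a Car is assembled from.
final class CarMapper
{
    private $pdo;

    public function __construct(PDO $pdo)
    {
        $this->pdo = $pdo;
    }

    // Populates $car in place; returns false if no such car is stored.
    public function fetch(Car $car): bool
    {
        $stmt = $this->pdo->prepare('SELECT model FROM cars WHERE id = ?');
        $stmt->execute([$car->getId()]);
        $model = $stmt->fetchColumn();
        if ($model === false) {
            return false;
        }
        $car->setModel($model);

        $stmt = $this->pdo->prepare('SELECT position FROM wheels WHERE car_id = ? ORDER BY position');
        $stmt->execute([$car->getId()]);
        $car->setWheels($stmt->fetchAll(PDO::FETCH_COLUMN));
        return true;
    }
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE cars (id INTEGER PRIMARY KEY, model TEXT)');
$pdo->exec('CREATE TABLE wheels (id INTEGER PRIMARY KEY, car_id INTEGER, position TEXT)');
$pdo->exec("INSERT INTO cars VALUES (616, 'hatchback')");
$pdo->exec("INSERT INTO wheels (car_id, position) VALUES (616,'FL'),(616,'FR'),(616,'RL'),(616,'RR')");

$car = new Car;
$car->setId(616);
$found = (new CarMapper($pdo))->fetch($car);
```

Because `Car` never touches PDO, it can be unit-tested without a database, and the mapper can later be swapped for one that reads from a cache.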
Bottom line: stop using active record.
P.S.: also, if you care about code quality, you should start reading PoEAA book. Or at least start watching lectures listed here.
my 2 cents
ActiveRecord in Rails implements the concept of lazy loading, that is deferring database queries until you actually need the data. So if you instantiate a my_car = Car.find(12) object, it only queries the cars table for that one row. If later you want my_car.wheels then it queries the wheels table.
My suggestion for your pseudo code above is to not load every associated object in the constructor. The car constructor should query for the car only, and should have a method to query for all of its wheels, and another to query its dealership, which only queries for the dealership and defers collecting all of the dealership's other cars until you specifically say something like my_car.dealership.cars
Postscript
ORMs are database abstraction layers, and thus they must be tuned for ease of querying and not fine tuning. They allow you to rapidly build queries. If later you decide that you need to fine tune your queries, then you can switch to issuing raw sql commands or trying to otherwise optimize how many objects you're fetching. This is standard practice in Rails when you start doing performance tuning - look for queries that would be more efficient when issued with raw sql, and also look for ways to avoid eager loading (the opposite of lazy loading) of objects before you need them.
In general, I'd recommend having a constructor that effectively takes a query row, or a part of a larger query. How to do this will depend on your ORM. That way, you can get efficient queries but you can construct the other model objects after the fact.
Some ORMs (django's models, and I believe some of the ruby ORMs) try to be clever about how they construct queries and may be able to automate this for you. The trick is to figure out when the automation is going to be required. I do not have personal familiarity with PHP ORMs.
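As a rough illustration of the "constructor takes a query row" idea, the sketch below hydrates an object and its dependency from one joined row while keeping all properties private. The class names `A` and `B`, the `Label` column, and the `B_` column prefix are assumptions borrowed from the first question on this page; the row is written out by hand where a real application would fetch it.

```php
<?php
final class B
{
    private $id;
    private $label;

    private function __construct(int $id, string $label)
    {
        $this->id = $id;
        $this->label = $label;
    }

    // Builds a B from (part of) a result row; $prefix selects its columns.
    public static function fromRow(array $row, string $prefix = ''): self
    {
        return new self((int) $row[$prefix . 'ID'], (string) $row[$prefix . 'Label']);
    }

    public function label(): string { return $this->label; }
}

final class A
{
    private $id;
    private $name;
    private $depend;

    private function __construct(int $id, string $name, B $depend)
    {
        $this->id = $id;
        $this->name = $name;
        $this->depend = $depend;
    }

    // One joined row is enough to build A and the B it depends on.
    public static function fromRow(array $row): self
    {
        return new self((int) $row['ID'], (string) $row['Name'], B::fromRow($row, 'B_'));
    }

    public function name(): string { return $this->name; }
    public function depend(): B { return $this->depend; }
}

// A row shaped like the result of:
//   SELECT a.ID, a.Name, b.ID AS B_ID, b.Label AS B_Label
//     FROM tbl_A a JOIN tbl_B b ON b.ID = a.B_ID
$row = ['ID' => 1, 'Name' => 'test', 'B_ID' => 3, 'B_Label' => 'b-three'];
$a = A::fromRow($row);
```

No friend classes are needed: each class only reads its own columns from the row, so B's properties stay private.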

how do i pattern a php database call?

Specs: PHP 5 with MySQL, built on top of the CodeIgniter framework.
I have a database table called game and then sport-specific tables like soccerGame and footballGame. These sport-specific tables have a gameId field linking back to the game table. I have corresponding classes game and soccerGame/footballGame, where the latter extend game.
When I look up game information to display to the user, I'm having trouble figuring out how to dynamically link the two tables. I'm curious if it's possible to get all the information with one query. The problem is, I need to query the game table first to figure out the sport name.
If that's not possible, my next thought is to do it with two queries: have my game_model query the game table, then, based on the sport name, call the appropriate sport-specific model (e.g. soccer_game_model) and get the sport-specific info.
I would also pass the game object into the soccer_model, and the soccer_model would use that object to build me a soccerGame object. This seems a little silly to me, because I'm building the parent object and then giving it to the extending class to make a whole new object.
thoughts?
thanks for the help.
EDIT:
game table
gameId
sport (soccer, basketball, football, etc)
date
other data
soccerGame table
soccerGameId
gameId
soccer specific information
footballGame table
footballGameId
gameId
football specific information
and so on for other sports
So I need to know what the sport is before I can decide which sport specific table I need to pull info from.
UPDATE:
Thanks all for the input. It seems like dynamic SQL is only possible through stored procedures, something I'm not well versed on right now. And even with them it's still a little messy. Right now I will go the two query route, one to get the sport name, and then a switch to get the right model.
From the PHP side of things now, it seems a little silly to get a game object, pass it to, say, my soccer_game_model, and then have that return me a soccer_game object, which is a child of the original game. Is that how it has to be done? or am I missing something from an OO perspective here?
To extend on Devin Young's answer, you would achieve this using Codeigniter's active record class like so:
public function get_game_by_id($game_id, $table)
{
// $table must come from a fixed list of known sport tables, never from user input
return $this->db->join($table, 'game.gameId = ' . $table . '.gameId', 'left')
->where('game.gameId', $game_id)
->get('game')
->result();
}
So you're joining the table by the gameId which is shared, then using a where clause to find the correct one. Finally you use result() to return an array of objects.
EDIT: I've added a second parameter, $table, so you can pass in the name of the table you want to join (soccerGame, footballGame, etc.).
If you don't know which sport to choose at this point in the program then you may want to take a step back and look at how you can add that so you do know. I would be reluctant to add multiple joins to all sport tables as you'll run into issues down the line.
UPDATE
Consider passing the "sport" parameter when you look up game data. As a hidden field, most likely. You can then use a switch statement in your model:
switch($gameValue) {
case 'football': $gameTable = "footballGame"; break;
case 'soccer': $gameTable = "soccerGame"; break;
}
Then base your query off this:
"SELECT *
FROM ". $gameTable . "
...etc
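The two-query route with a fixed sport-to-table map could be sketched like this. It is an illustration only: plain PDO with in-memory SQLite replaces CodeIgniter's database layer, the `penalties` and `touchdowns` columns are invented placeholders for the sport-specific information, and `loadGame()` is a hypothetical helper.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE game (gameId INTEGER PRIMARY KEY, sport TEXT, date TEXT)');
$pdo->exec('CREATE TABLE soccerGame (soccerGameId INTEGER PRIMARY KEY, gameId INTEGER, penalties INTEGER)');
$pdo->exec('CREATE TABLE footballGame (footballGameId INTEGER PRIMARY KEY, gameId INTEGER, touchdowns INTEGER)');
$pdo->exec("INSERT INTO game VALUES (1,'soccer','2014-05-01'),(2,'football','2014-05-02'),(3,'curling','2014-05-03')");
$pdo->exec('INSERT INTO soccerGame VALUES (10, 1, 2)');
$pdo->exec('INSERT INTO footballGame VALUES (20, 2, 5)');

function loadGame(PDO $pdo, int $gameId): ?array
{
    // Whitelist: the only table names that may ever be interpolated into SQL.
    $sportTables = ['soccer' => 'soccerGame', 'football' => 'footballGame'];

    // Query 1: the generic game row tells us which sport it is.
    $stmt = $pdo->prepare('SELECT gameId, sport, date FROM game WHERE gameId = ?');
    $stmt->execute([$gameId]);
    $game = $stmt->fetch(PDO::FETCH_ASSOC);
    if ($game === false || !array_key_exists($game['sport'], $sportTables)) {
        return null; // unknown game, or a sport without a detail table
    }

    // Query 2: the sport-specific row from the whitelisted table.
    $table = $sportTables[$game['sport']];
    $stmt = $pdo->prepare("SELECT * FROM $table WHERE gameId = ?");
    $stmt->execute([$gameId]);
    $detail = $stmt->fetch(PDO::FETCH_ASSOC);

    // Merge both rows; keys already present in the game row win.
    return $game + ($detail ?: []);
}

$soccer = loadGame($pdo, 1);
```

The merged array could then be handed to a single factory that builds the right subclass directly, which avoids constructing a parent game object first and converting it afterwards.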
You can combine the tables with joins. http://www.w3schools.com/sql/sql_join.asp
For example, if you need to get all the data from game and footballGame based on a footballGameId of 15:
SELECT *
FROM footballGame a
LEFT OUTER JOIN game b ON a.gameId = b.gameId
WHERE a.footballGameId = 15
Check this Stack Overflow answer for options on how to do it via a standard query. Then you can turn it into active record if you want (though that may be complicated and not worth your time if you don't need DB-agnostic calls in your app).
For what it's worth, there's nothing wrong with doing multiple queries; it just might be slower than an alternative. Try a few options and see what works best for you and your app.

Achieve the results of two individual queries using a JOIN

I have a USER table structure as shown below:
id   parent_id   userName
10   01          Project manager
11   10          manager
12   11          teamlead
13   12          team member
I need to find the project manager ID if I give the team member ID in where clause. I can get the results in each individual query.
But I'm trying to implement it with a JOIN query. I'm new to JOIN queries. How do I do it?
It looks as if this involves a bit more than a simple join. Be ready to enter a world of pain :). I recently had a similar problem, but with type hierarchies being stored in a table with a similar structure. What I ended up with was writing a recursive query. In SQL Server, you would use a Common Table Expression. In MySQL, you would use loops.
Basically, the idea is that you join a table against itself, walking a hierarchy until you reach the top-level element. Behind the scenes, the server is creating virtual tables and joining them against each other until some "stopping condition" is reached. This point is very important: be sure that you have your stopping condition correct, or you could cause some serious problems.
This post is a great run-down. Also, a general search for the terms hierarchical query mysql in google will result in a wealth of information.
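As a sketch of the recursive approach, the query below walks from a given user up to the top of the chain using a recursive Common Table Expression. It is run here against in-memory SQLite through PDO (pdo_sqlite assumed); MySQL 8.0 and later also accept `WITH RECURSIVE`, while older MySQL versions do not. The `findRootId()` helper is hypothetical, and the data is taken from the table in the question.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE user (id INTEGER PRIMARY KEY, parent_id INTEGER, userName TEXT)');
$pdo->exec("INSERT INTO user VALUES
    (10, 1,  'Project manager'),
    (11, 10, 'manager'),
    (12, 11, 'teamlead'),
    (13, 12, 'team member')");

// Returns the id at the top of the chain above $userId, or null if unknown.
function findRootId(PDO $pdo, int $userId): ?int
{
    // UNION (not UNION ALL) discards rows already seen, so an accidental
    // cycle in parent_id cannot make the recursion run forever.
    $sql = 'WITH RECURSIVE chain(id, parent_id) AS (
                SELECT id, parent_id FROM user WHERE id = ?
                UNION
                SELECT u.id, u.parent_id
                  FROM user u
                  JOIN chain c ON u.id = c.parent_id
            )
            SELECT c.id
              FROM chain c
             WHERE NOT EXISTS (SELECT 1 FROM user p WHERE p.id = c.parent_id)';
    $stmt = $pdo->prepare($sql);
    $stmt->execute([$userId]);
    $id = $stmt->fetchColumn();
    return $id === false ? null : (int) $id;
}

$projectManagerId = findRootId($pdo, 13);
```

The stopping condition here is implicit: the recursion ends when a row's `parent_id` matches no existing `id`, and the final SELECT keeps exactly that top row.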
I believe this should work with your existing schema
SELECT ParentUser.UserName AS ManagerName, BaseUser.UserName AS TeamMemberName
FROM User AS BaseUser
INNER JOIN User AS ParentUser
ON BaseUser.parent_id = ParentUser.id
WHERE BaseUser.Id = @PassedInTeamMemberId
Basically you want to do this:
SELECT * FROM TableA
INNER JOIN TableB
ON TableA.name = TableB.name
WHERE TableA.whatever = 'whatever'
I find this visual explanation of joins from the lovely and talented Jeff Atwood to be quite helpful.
