A question came up for me today: should my repositories always return full objects? Can they never return partial data (in an array, for example)?
For example, my Friends repository has a method getUserFriends(User $user) that executes the following DQL:
$dql = 'SELECT userFriend FROM Entities\User\Friend f JOIN f.friend userFriend WHERE f.user = ?0';
But this way I'm returning full User entities with all their properties; the generated SQL is a SELECT of every field in the User table. If I only need the id and the name of the user's friends, wouldn't it be better (and quicker) to fetch just those values?
$dql = 'SELECT userFriend.id, userFriend.name FROM Entities\User\Friend f JOIN f.friend userFriend WHERE f.user = ?0';
These methods are executed in my service class.
From a database perspective, the number of fields will not affect performance much unless the number of rows returned is really huge (probably millions of rows): the hardest part for the database is performing the joins and building the result set from the tables.
From a PHP perspective, it depends on several factors, such as the complexity and the number of objects created.
I would approach the problem differently: I would profile and stress-test my code to see whether performance is actually an issue, and refactor only if needed (switching from Doctrine to a hand-made model is time-consuming; will the performance gain be worth it?).
EDIT: to answer your initial question, fetching complete objects makes caching easier if you need it, and gives better data encapsulation. I would keep them until they become a real performance problem.
You can use the partial keyword in your DQL: http://www.doctrine-project.org/docs/orm/2.0/en/reference/partial-objects.html?highlight=partial
But only do that if your app has performance issues.
I'm currently building an eCommerce site with Symfony 3 that supports multiple languages, and I have realised that the way I've designed the Product entity will require joining multiple other entities, using DQL or the query builder, to load things like translations, product reviews and discounts/special offers. This means I will have a block of joins that is the same in multiple repositories, which seems wrong: we would have to hunt down all these blocks if we ever need to add or change a join to load extra product data.
For example in my CartRepository's loadCart() function I have a DQL query like this:
SELECT c,i,p,pd,pt,ps FROM
AppBundle:Cart c
join c.items i
join i.product p
left join p.productDiscount pd
join p.productTranslation pt
left join p.productSpecial ps
where c.id = :id
I will end up with something similar in the SectionRepository when I show the list of products on that page. What is the correct way to deal with this? Is there some place where I can centrally define the list of entities that need to be loaded for the joined entity (Product in this case) to be complete? I realise I could just use lazy loading, but that would lead to a large number of queries on pages like the section page (a section showing 40 products would run 121 queries with the above example, instead of 1 with a properly joined query).
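The 121 figure follows from one query for the product list plus one lazy load per product for each of the three product relations in the example (discount, translation, special). A minimal sketch of that arithmetic; the function name is made up for illustration:

```php
<?php
// Hypothetical helper: counts the queries issued when every relation
// is lazy-loaded separately for each row of the initial result.
function lazyLoadQueryCount(int $rows, int $relationsPerRow): int
{
    // 1 query for the list itself, then one query per row per relation.
    return 1 + $rows * $relationsPerRow;
}

echo lazyLoadQueryCount(40, 3), "\n"; // prints 121
```

With a fully joined query the count stays at 1 regardless of how many products the section shows.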
One approach (this is just off the top of my head, someone may have a better one): you could reasonably easily have a centralised querybuilder function/service that does this. The querybuilder is very nice for programmatically building queries. The key difference would be the root entity and the filtering entity.
E.g. something like this. Note of course these would not all be in the same place (they might be across a few services, repositories etc), it's just an example of an approach to consider.
public function getCartBaseQuery($cartId, $joinAlias = 'o') {
    $qb = $this->getEntityManager()->createQueryBuilder();
    // Select the cart root together with its items, as in the original DQL
    $qb->select('c', $joinAlias)
        ->from('AppBundle:Cart', 'c')
        ->join('c.items', $joinAlias)
        ->where($qb->expr()->eq('c.id', ':cartId'))
        ->setParameter('cartId', $cartId);
    return $qb;
}
public function addProductQueryToItem($qb, $alias) {
    /** @var QueryBuilder $qb */
    $qb
        ->addSelect('p, pd, pt, ps')
        // $alias is the alias of the entity that owns the product (the cart item here)
        ->join($alias . '.product', 'p')
        ->leftJoin('p.productDiscount', 'pd')
        ->join('p.productTranslation', 'pt')
        ->leftJoin('p.productSpecial', 'ps')
    ;
    return $qb;
}
public function loadCart($cartId) {
    // $someServiceOrRepository is a placeholder for wherever the two helpers live
    $qbcart = $someServiceOrRepository->getCartBaseQuery($cartId);
    // Pass the same alias that getCartBaseQuery() used for the items ('o' by default)
    $qbcart = $someServiceOrRepository->addProductQueryToItem($qbcart, 'o');
    return $qbcart->getQuery()->getResult();
}
Like I said, just one possible approach, but hopefully it gives you some ideas and a start at solving the issue.
Note: If you religiously use the same join alias for the entity you attach your product data to you would not even have to specify it in the calls (but I would make it configurable myself).
There is no single correct answer to your question.
But if I have to make a suggestion, I'd say to take a look at CQRS (http://martinfowler.com/bliki/CQRS.html) which basically means you have a separated read model.
To make this as simple as possible, let's say that you build a separate "extended_product" table where all data is already joined and de-normalized. This table may be populated at regular intervals by a background task, or by a command that gets triggered each time you update a product or a related entity.
When you need to read product data, you query this table instead of the original one. Of course, nothing prevents you from having many different extended tables with your data arranged in different ways.
In some way it's a concept very similar to database "views", except that:
it is faster, because you query an actual table
since you create that table via code, you are not limited to a single SQL query to process data (think filters, aggregations, and so on)
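As a rough sketch of what "creating that table via code" could look like, the snippet below flattens three normalized sources into one read-model row per product. Everything here is an assumption for illustration: the array shapes, the column names and the buildExtendedProducts() function are hypothetical, and plain arrays stand in for the real tables.

```php
<?php
// Hypothetical builder for the rows of an "extended_product" read table.
function buildExtendedProducts(array $products, array $translations, array $discounts, string $locale): array
{
    // Index the related data by product id so each product needs one lookup.
    $nameByProduct = [];
    foreach ($translations as $t) {
        if ($t['locale'] === $locale) {
            $nameByProduct[$t['product_id']] = $t['name'];
        }
    }
    $percentByProduct = [];
    foreach ($discounts as $d) {
        $percentByProduct[$d['product_id']] = $d['percent'];
    }

    $rows = [];
    foreach ($products as $p) {
        $percent = $percentByProduct[$p['id']] ?? 0;
        $rows[] = [
            'product_id'       => $p['id'],
            'name'             => $nameByProduct[$p['id']] ?? null,
            'price'            => $p['price'],
            'discount_percent' => $percent,
            // Derived value computed once at write time instead of on every read.
            'final_price'      => round($p['price'] * (100 - $percent) / 100, 2),
        ];
    }
    return $rows;
}

$extended = buildExtendedProducts(
    [['id' => 1, 'price' => 100.0], ['id' => 2, 'price' => 50.0]],
    [['product_id' => 1, 'locale' => 'en', 'name' => 'Lamp'],
     ['product_id' => 2, 'locale' => 'en', 'name' => 'Desk']],
    [['product_id' => 1, 'percent' => 10]],
    'en'
);
```

In a real application the returned rows would be written to the extended table by the background task or command described above, and read queries would hit only that table.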
I am aware this is not exactly an "answer", but hopefully it may give you some good ideas on how to fix your problem.
Consider the following READ and WRITE queries:
Read
// Retrieves a person (and their active game score if they have one)
$sql = "SELECT CONCAT(people.first_name,' ',people.last_name) as 'name',
people.uniform as 'people.uniform',
games.score as 'games.score'
FROM my_people as people
LEFT JOIN my_games as games ON(games.person_id = people.id AND games.active = 1)
WHERE people.id = :id";
$results = DB::select(DB::raw($sql),array("id"=>$id));
Write
// Saves a person
$person = new People;
$person->data = array('first_name'=>$input['first_name'],
'last_name'=>$input['last_name'],
'uniform'=>$input['uniform']);
$personID = $person->save();
// Save the game score
$game = new Games;
$game->data = array('person_id'=>$personID,
'active'=>$input['active'],
'score'=>$input['score']);
$game->save();
I put every write (INSERT/UPDATE) operation into my own centralized repository classes and call them using class->methods as shown above.
I may decide to put some of the read queries into repository classes if I find myself using a query over and over (DRY). I have to be careful of this, because I tend to go back and adjust the read query slightly to get more or less data out in specific areas of my application.
Many of my read queries are dynamic queries (coming from datagrids that are filterable and sortable).
The read queries will commonly have complex things like SUMs, COUNTs, ORDERing, GROUPing, COMPOSITE keys, etc.
Since my reads are so diverse, why would I need to further abstract them into Laravel's Eloquent ORM (or any ORM), especially since I'm using PDO (which has 12 different database drivers)? The only advantage I can see at the moment is if I wanted to rename a database field then that would be easier to do using the ORM. I'm not willing to pay the price in greater abstraction/obscurity for that though.
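For the dynamic datagrid reads mentioned above, one way to keep raw SQL manageable is a small builder that only accepts whitelisted columns and returns the SQL together with its bound parameters. This is only a sketch under assumptions: buildGridQuery() and its signature are hypothetical, equality is the only supported filter, and the table name is expected to come from trusted code rather than user input.

```php
<?php
// Hypothetical helper: turns datagrid filters and a sort request into
// a SQL string plus named parameters, ignoring unknown columns.
function buildGridQuery(string $table, array $allowedColumns, array $filters, ?string $sortBy = null, string $direction = 'ASC'): array
{
    $where = [];
    $params = [];
    foreach ($filters as $column => $value) {
        if (!in_array($column, $allowedColumns, true)) {
            continue; // Not whitelisted: never reaches the SQL string
        }
        $where[] = "$column = :$column";
        $params[$column] = $value;
    }

    $sql = 'SELECT ' . implode(', ', $allowedColumns) . " FROM $table";
    if ($where) {
        $sql .= ' WHERE ' . implode(' AND ', $where);
    }
    if ($sortBy !== null && in_array($sortBy, $allowedColumns, true)) {
        // Only the two literal keywords can end up in the query.
        $dir = strtoupper($direction) === 'DESC' ? 'DESC' : 'ASC';
        $sql .= " ORDER BY $sortBy $dir";
    }
    return [$sql, $params];
}

[$sql, $params] = buildGridQuery(
    'my_people',
    ['id', 'first_name', 'uniform'],
    ['uniform' => 7, 'not_a_column' => 1],
    'first_name',
    'desc'
);
// $sql is "SELECT id, first_name, uniform FROM my_people WHERE uniform = :uniform ORDER BY first_name DESC"
```

The pair could then be passed to the same DB::select(DB::raw($sql), $params) call used in the read example.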
I'm developing an app with FuelPHP and MySQL, and I'm using the provided ORM functionality. The problem is with the following tables:
Table: pdm_data
Massive table (350+ columns, many rows)
Table data is rather static (updates only once a day)
Primary key: obj_id
Table: change_request
Only few columns
Data changes often (10-20 times / min)
References primary key (obj_id from table pdm_data)
Users can customize the datasheet that is visible to them, e.g. they can save filters on columns (such as change_request.obj_id=34 AND pdm_data.state = 6), which are then translated into a query in real time with the ORM.
However, querying with the ORM is really slow, as the pdm_data table is large and even ~100 rows result in many MB of data. The biggest problem seems to be in the FuelPHP ORM: even if the query itself is relatively fast, model hydration etc. takes many seconds. The ideal solution would be to cache results from the pdm_data table, as it is rather static. However, as far as I know, FuelPHP doesn't let you cache tables through relations (you can cache the complete result of a query, so both tables or none).
Furthermore, using a plain SQL query with a join instead of the ORM is not an ideal solution, as I need to handle other tasks where hydrated models are awesome.
I currently have the following code:
//Initialize the query and use eager-loading
$query = Model_Changerequest::query()->related('pdmdata');
foreach($filters as $filter)
{
//First parameter can point to either table
$query->where($filter[0], $filter[1], $filter[2]);
}
$result = $query->get();
...
Does someone have a good solution for this?
Thanks for reading!
The slowness of the version 1 ORM is a known problem which is being addressed with v2. My current benchmarks are showing that v1 orm takes 2.5 seconds (on my machine, ymmv) to hydrate 40k rows while the current v2 alpha takes around 800ms.
For now, I am afraid the easiest solution is to do away with the ORM for large selects and construct the queries using the DB class. I know you said you want to keep the ORM's abstraction to ease development; one option is to use as_object('MyModel') to return populated model objects.
On the other hand if performance is your main concern then the ORM is simply not suitable.
I have User, Play and UserPlay models. Here is the relation defined in the User model to calculate the total time the user has played:
'playedhours'=>array(self::STAT, 'Play', 'UserPlay(user_id,play_id)',
'select'=>'SUM(duration)'),
Now I am trying to find the duration sum by user id.
$playedHours = User::model()->findByPk($model->user_id)->playedhours / 3600;
This relation takes a long time to execute on a large amount of data, so I looked into the query generated by the relation.
SELECT SUM(duration) AS `s`, `UserPlay`.`user_id` AS `c0` FROM `Play` `t` INNER JOIN
`UserPlay` ON (`t`.`id`=`UserPlay`.`play_id`) GROUP BY `UserPlay`.`user_id` HAVING
(`UserPlay`.`user_id`=9);
The GROUP BY on UserPlay.user_id is taking a long time, and I don't need a GROUP BY clause here.
My question is: how can I avoid the GROUP BY clause in the above relation?
STAT relations are by definition aggregation queries; see Statistical Query.
You cannot remove GROUP BY here and still make a meaningful query for aggregate data. SUM(), AVG(), etc. are all aggregate functions; see GROUP BY Functions for a list of all aggregate functions supported by MySQL.
Your problem is that the filter on the user id is applied in a HAVING clause. That is not required here: HAVING checks conditions after the aggregation takes place, so it is meant for conditions on the aggregate itself, for example SUM(duration) > 500.
Basically, you are grouping all users separately first and then filtering for the user id you want. If you instead use a WHERE clause, which filters before the aggregation rather than after it, only the rows of the user you want are aggregated and your query will be much faster.
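The difference can be mimicked in plain PHP as an analogy (this is not what MySQL literally executes, and the row data and function names are made up for illustration). The HAVING style sums every user's rows and then picks one group; the WHERE style skips other users' rows before summing.

```php
<?php
// Hypothetical rows standing in for Play joined to UserPlay.
$rows = [
    ['user_id' => 9, 'duration' => 3600],
    ['user_id' => 9, 'duration' => 1800],
    ['user_id' => 4, 'duration' => 7200],
];

// HAVING style: aggregate every user first, then keep a single group.
function sumWithHaving(array $rows, int $userId): int
{
    $totals = [];
    foreach ($rows as $row) {
        $totals[$row['user_id']] = ($totals[$row['user_id']] ?? 0) + $row['duration'];
    }
    return $totals[$userId] ?? 0;
}

// WHERE style: discard other users first, then aggregate what is left.
function sumWithWhere(array $rows, int $userId): int
{
    $total = 0;
    foreach ($rows as $row) {
        if ($row['user_id'] === $userId) {
            $total += $row['duration'];
        }
    }
    return $total;
}

echo sumWithHaving($rows, 9), ' ', sumWithWhere($rows, 9), "\n"; // prints "5400 5400"
```

Both return the same total, but the first builds a group for every user in the table, which is the work the HAVING version of the query pays for.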
Although Active Record is good at modelling data in an OOP fashion, it
actually degrades performance due to the fact that it needs to create
one or several objects to represent each row of query result. For data
intensive applications, using DAO or database APIs at lower level
could be a better choice
Therefore it is best to change the relation into a model function that queries the DB directly using the CommandBuilder or DAO API. Something like this:
class User extends CActiveRecord {
    ....
    public function getPlayedhours() {
        // Prevent the query from running on a newly created object with no row loaded
        if (!isset($this->id))
            return 0;
        $played = Yii::app()->db->createCommand()
            ->select('SUM(duration)')
            ->from('play')
            ->join('user_play up', 'up.play_id = play.id')
            // Bind the id as a parameter instead of concatenating it into the SQL
            ->where('up.user_id = :userId', array(':userId' => $this->id))
            ->group('up.user_id')
            ->queryScalar();
        // The loose comparison also catches the false that queryScalar() returns when there is no value
        if ($played == null)
            return 0;
        return $played / 3600;
    }
    ....
}
If your query is still slow, try optimizing the indexes, implement a caching mechanism, and use the EXPLAIN command to figure out what is actually taking the time and, more importantly, why. If nothing is good enough, upgrade your hardware.
I need to JOIN 2 tables (let's say a User and an Order table) for the reporting module in my web app.
The problems are:
The User table is located on a different server and a different DBMS from the Order table. Technically it is a different system: the User table is in a SQL Server DB, while the Order table is in a MySQL DB.
I can't use SQL Server's Linked Server because company policy doesn't allow it, so I can't JOIN them directly in SQL. They want me to use a Web Service instead of a linked server.
The result of the JOIN has a large number of rows (maybe more than 10,000, because the data is meant for reporting), so I think mapping them through a Web Service would be a horrible idea.
So I came up with this:
I collect the 2 query results from different models and join them in my app code (I'm using PHP with CodeIgniter):
// First result
$userData = $this->userModel->getAllUser();
// Second result
$orderData = $this->orderModel->getAllOrder();
The $userData contains all user entities with the following columns:
[UserId, Username, Address, PhoneNumber, etc..]
And the $orderData contains all order entities with the following columns:
[OrderId, UserId, Date, etc..]
But is it possible to join those two query results in PHP / CodeIgniter?
How will it perform with such a large amount of data?
Should I just use a Web Service as suggested, or is there another solution?
Thanks in advance :)
A few things to think about:
Do you actually need to return all user and order records in one single go?
Do you actually want to return all rows for these two types of record?
Would you be better off with a Report module for these report queries?
Would plain SQL syntax be a smarter move than trying to shim this into existence with the CodeIgniter "Active Record" (renamed Query Builder in 3.0)?
Is JOIN really so bad? It is not a UNION, you want the data to be related.
I would recommend that you limit the data you return, SELECT only the fields you actually require, make a new Report model to avoid messing up your generic models, and do this with raw SQL.
Complicated things get all the more complicated when you try too hard to stick to rules like "1 table = 1 model" and "User::getAllFoos + controller processing > Report::getMonthlyOrderStats()".
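If the two result sets really do have to be combined in application code because the tables live in different DBMSs, a hash join keeps the cost roughly linear: index the users once, then look each order up by key. The sketch below assumes both models return rows as associative arrays keyed by the column names from the question; joinOrdersToUsers() is a hypothetical name.

```php
<?php
// Hypothetical helper: joins orders to users in memory,
// behaving like an INNER JOIN on UserId.
function joinOrdersToUsers(array $users, array $orders): array
{
    // Index users by UserId once so each order lookup is a single array access.
    $usersById = [];
    foreach ($users as $user) {
        $usersById[$user['UserId']] = $user;
    }

    $joined = [];
    foreach ($orders as $order) {
        if (!isset($usersById[$order['UserId']])) {
            continue; // No matching user: dropped, as an INNER JOIN would do
        }
        // Array union keeps the order's keys and adds the user's columns.
        $joined[] = $order + $usersById[$order['UserId']];
    }
    return $joined;
}

$report = joinOrdersToUsers(
    [['UserId' => 1, 'Username' => 'ann'], ['UserId' => 2, 'Username' => 'bob']],
    [['OrderId' => 10, 'UserId' => 2, 'Date' => '2015-01-01'],
     ['OrderId' => 11, 'UserId' => 3, 'Date' => '2015-01-02']]
);
// $report holds one row: order 10 combined with user "bob"
```

Memory is the main limit with this approach, so selecting only the needed columns and restricting the date range on both sides still matters.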