I have User, Play and UserPlay model. Here is the relation defined in User model to calculate total time, the user has played game.
'playedhours'=>array(self::STAT, 'Play', 'UserPlay(user_id,play_id)',
'select'=>'SUM(duration)'),
Now i am trying to find duration sum with user id.
$playedHours = User::model()->findByPk($model->user_id)->playedhours)/3600;
This relation is taking much time to execute on large amount of data. Then is looked into the query generated by the relation.
SELECT SUM(duration) AS `s`, `UserPlay`.`user_id` AS `c0` FROM `Play` `t` INNER JOIN
`UserPlay` ON (`t`.`id`=`UserPlay`.`play_id`) GROUP BY `UserPlay`.`user_id` HAVING
(`UserPlay`.`user_id`=9);
GROUP BY on UserPlay.user_id is taking much time. As i don't need Group by clause here.
My question is, how to avoid GROUP BY clause from the above relation.
STAT relations are by definition aggregation queries, See Statistical Query.
You cannot remove GROUP BY here and make a meaningful query for aggregate data. SUM(), AVG(), etc are all aggregate functions see GROUP BY Functions, for a list of all aggregate functions supported by MYSQL.
Your problem is for the calculation you are doing a HAVING clause. This is not required as HAVING checks conditions after the aggregation takes place, which you can use to put conditions like for example SUM(duration) > 500 .
Basically what is happening is that you are grouping all the users separately first, then filtering for the user id you want. If you instead use a WHERE clause which will filter before not after then aggregation is for only the user you want then group it your query will be much faster.
Although Active Record is good at modelling data in an OOP fashion, it
actually degrades performance due to the fact that it needs to create
one or several objects to represent each row of query result. For data
intensive applications, using DAO or database APIs at lower level
could be a better choice
Therefore it is best if you change the relation to a model function querying the Db directly using the CommandBuilder or DAO API. Something like this
Class User extends CActiveRecord {
....
public function getPlayedhours(){
if(!isset($this->id)) // to prevent query running on a newly created object without a row loaded to it
return 0;
$played = Yii::app()->db->createCommand()
->select('SUM(duration)')
->from('play')
->join("user_play up","up.play_id = play.id")
->where("up.user_id =".$this->id)
->group("up.user_id")
->queryScalar();
if($played == null)
return 0;
else
return $played/3600 ;
}
....
}
If you query still is slow, try optimizing the indexes, implement cache mechanism, and use the explain command to figure out what is actually taking more time and more importantly why. If nothing is good enough, upgrade your hardware.
Related
I have three models, Companies, events and assistances, where the assistances table stores the event_id and the company_id. I'd like to get a query in which the total assistances of the company to certain kind of events are stored. Nevertheless, as all these counts are linked to the same table, I don't really know how to build this query effectively. I have the ids of the assistances to each kind of event stored in some arrays, and then I do the following:
$query = $this->Companies->find('all')->where($conditions)->order(['name' => 'ASC']);
$query
->select(['total_assistances' => $query->func()->count('DISTINCT(Assistances.id)')])
->leftJoinWith('Assistances')
->group(['Companies.id'])
->autoFields(true);
Nevertheless, I don't know how to get the rest of the Assistance count, as I would need to count not all the distinct assistance Ids but only those taht fit to certain conditions, something like ->select(['assistances_conferences' => $query->func()->count('DISTINCT(Assistances.id)')])->where($conferencesConditions) (but obviously the previous line does not work. Is there any way of counting different kind of assistances in the query itself? (I need to do it this way because I then plan to use pagination and sort the table taking those fields into consideration).
The *JoinWith() methods accept a second argument, a callback that receives a query builder used for affecting the select list, as well as the conditions for the join.
->leftJoinWith('Assistances', function (\Cake\ORM\Query $query) {
return $query->where([
'Assistances.event_id IN' => [1, 2]
]);
})
This would generate a join statement like this, which would only include (and therefore count) the Assistances with an event_id of 1 or 2:
LEFT JOIN
assistances Assistances ON
Assistances.company_id = Companies.id AND
Assistances.event_id IN (1, 2)
The query builder passed to the callback really only supports selecting fields and adding conditions, more complex statements would need to be defined on the main query, or you'd possibly have to switch to using subqueries.
See also
Cookbook > Database Access & ORM > Query Builder > Filtering by Associated Data
I have the following query in a doctrine repository:
/**
* #return AbstractSurveyPage[]
*/
public function getResultCacheTestPages(): array
{
$select = $this->createQueryBuilder('page')
->select(['page', 'pageGroup', 'groupLabelText'])
->leftJoin('page.group', 'pageGroup')
->leftJoin('pageGroup.labelText', 'groupLabelText');
// Both useResultCache(true) and setCacheable(true) is needed
return $select->getQuery()->useResultCache(true)
->setCacheable(true)->getResult();
}
The query fetches page objects, their group objects and the group label eagerly.
However, I get the following error:
File /home/vagrant/web-projects/blueprint/vendor/doctrine/orm/lib/Doctrine/ORM/Cache/DefaultQueryCache.php:263
Message Undefined index: labelText
It seems that doctrine cannot find the labelText association of the group object. When I remove 'groupLabelText' from the select clause, the query and the second level result caching work fine (but the group label objects are not fetched eagerly anymore). When I disable caching with setCacheable(false), the query works fine.
It seems that doctrine is unable to cache results of fetch join queries with joins of more than one level.
Is there a way to get this working? Or should I not use deep fetch join queries, and use lazy fetching for the joined objects? Entity and association second level caching works fine for me, so maybe the performance hit is not that bad. Also, the code would become easier, since the fetch join queries are quite complicated (I use up to 25 joins and elements in the select clause in real code).
I'm currently building an eCommerce site using Symfony 3 that supports multiple languages, and have realised they way I've designed the Product entity will require joining multiple other entities on using DQL/the query builder to load up things like the translations, product reviews and discounts/special offers. but this means I am going to have a block of joins that are going to be the same in multiple repositories which seems wrong as that leads to having to hunt out all these blocks if we ever need to add or change a join to load in extra product data.
For example in my CartRepository's loadCart() function I have a DQL query like this:
SELECT c,i,p,pd,pt,ps FROM
AppBundle:Cart c
join c.items i
join i.product p
left join p.productDiscount pd
join p.productTranslation pt
left join p.productSpecial ps
where c.id = :id
I will end up with something similar in the SectionRepository when I'm showing the list of products on that page, what is the correct way to deal with this? Is there some place I can centrally define the list of entities needed to be loaded for the joined entity (Product in this case) to be complete. I realise I could just use lazy loading, but that would lead to a large amount of queries being run on pages like the section page (a section showing 40 products would need to run 121 queries with the above example instead of 1 if I use a properly joined query).
One approach (this is just off the top of my head, someone may have a better approach). You could reasonably easily have a centralised querybuilder function/service that would do that. The querybuilder is very nice for programattically building queries. The key difference would be the root entity and the filtering entity.
E.g. something like this. Note of course these would not all be in the same place (they might be across a few services, repositories etc), it's just an example of an approach to consider.
public function getCartBaseQuery($cartId, $joinAlias = 'o') {
$qb = $this->getEntityManager()->createQueryBuilder();
$qb->select($joinAlias)
->from('AppBundle:Cart', 'c')
->join('c.items', $joinAlias)
->where($qb->expr()->eq('c.id', ':cartId'))
->setParameter('cartId', $cartId);
return $qb;
}
public function addProductQueryToItem($qb, $alias) {
/** #var QueryBuilder $query */
$qb
->addSelect('p, pd, pt, ps')
->join($alias.'product', 'p')
->leftJoin('p.productDiscount', 'pd')
->join('p.productTranslation', 'pt')
->join('p.productSpecial', 'ps')
;
return $qb;
}
public function loadCart($cartId) {
$qbcart = $someServiceOrRepository->getCartBaseQuery($cartId);
$qbcart = $someServiceOrRepository->addProductQueryToItem($qbcart);
return $qbcart->getQuery()->getResult();
}
Like I said, just one possible approach, but hopefully it gives you some ideas and a start at solving the issue.
Note: If you religiously use the same join alias for the entity you attach your product data to you would not even have to specify it in the calls (but I would make it configurable myself).
There is no single correct answer to your question.
But if I have to make a suggestion, I'd say to take a look at CQRS (http://martinfowler.com/bliki/CQRS.html) which basically means you have a separated read model.
To make this as simple as possibile, let's say that you build a separate "extended_product" table where all data are already joined and de-normalized. This table may be populated at regular intervals with a background task, or by a command that gets triggered each time you update a product or related entity.
When you need to read products data, you query this table instead of the original one. Of course, nothing prevents you from having many different extended table with your data arranged in a separate way.
In some way it's a concept very similar to database "views", except that:
it is faster, because you query an actual table
since you create that table via code, you are not limited to a single SQL query to process data (think filters, aggregations, and so on)
I am aware this is not exactly an "answer", but hopefully it may give you some good ideas on how to fix your problem.
I'm building a query to show items with user and then show highest Bid on the item.
Example:
Xbox 360 by james. - the highest bid was $55.
art table by mario. - the highest bid was $25.
Query
SELECT i, u
FROM AppBundle:Item i
LEFT JOIN i.user u
I have another table bids (one to many relationship). I'm not sure how can I include single highest bid of the item in the same query with join.
I know I can just run another query after this query, with function (relationship), but I'm avoiding to do that for optimisation reasons.
Solution
SQL
https://stackoverflow.com/a/16538294/75799 - But how is this possible in doctrine DQL?
You can use IN with a sub query in such cases.
I am not sure if I understood your model correctly, but I attempted to make your query with a QueryBuilder and I am sure you will manage to make it work with this example:
$qb = $this->_em->createQueryBuilder();
$sub = $qb;
$sub->select('mbi') // max bid item
->where('i.id = mbi.id')
->leftJoin('mbi.bids', 'b'))
->andWhere($qb->expr()->max('b.value'))
->getQuery();
$qb = $qb->select('i', 'u')
->where($qb->expr()->in('i', $sub->getDQL()))
->leftJoin('i.user', 'u');
$query = $qb->getQuery();
return $query->getResult();
Your SQL query may look something like
select i,u
from i
inner join bids u on i.id = u.item_id
WHERE
i.value = (select max(value) from bids where item_id = i.id)
group by i
DQL, I don't think supports subqueries, so you could try using a Having clause or see if Doctrine\ORM\Query\Expr offers anything.
To solve this for my own case, I added a method to the origin entity (item) to find the max entity in a list of entities (bids), using Doctrine's Collections' Criteria I've written about it here.
Your Item entity would contain
public function getMaxBid()
{
$criteria = Criteria::create();
$criteria->orderBy(['bid.value' => Criteria::ASC]);
$criteria->setLimit(1);
return $this->bids->matching($criteria);
}
Unfortunately, there's no way that i know to find the maximum bid and the bidder with one grouping query, but there's several techniques to making the logic work with several queries. You could do a sub select and that might work fine depending on the size of the table. If you're planning on growing to the point where that's not going to work, you're probably already looking at sharing your relational databases, moving some data to a less transactional, higher performance db technology, or denormalizing, but if you want to keep this implemented in pure MySQL, you could use a procedure to express in multiple commands how to check for a bid and optionally add to the list, also updating the current high bidder in a denormalized high bids table. This keeps the complex logic of how to verify the bid in one, the most rigorously managed place - the database. Just make sure you use transactions properly to stop 2 bids from being recorded concurrently ( eg, SELECT FOR UPDATE).
I used to ask prospective programmers to write this query to see how experienced with MySQL they were, many thought just a max grouping was sufficient, and a few left the interview still convinced that it would work fine and i was wrong. So good question!
This question got me today, my repositories should always return full objects? They can not return partial data (in an array for example)?
For example, I have the method getUserFriends(User $user) inside my repository Friends, in this method I execute the following DQL:
$dql = 'SELECT userFriend FROM Entities\User\Friend f JOIN f.friend userFriend WHERE f.user = ?0';
But this way I'm returning the users entities, containing all the properties, the generated SQL is a SELECT of all fields from the User table. But let's say I just need the id and the name of the user friends, there would be more interesting (and quick) get just these values?
$dql = 'SELECT userFriend.id, userFriend.name FROM Entities\User\Friend f JOIN f.friend userFriend WHERE f.user = ?0';
These methods are executed in my service class.
From a database perspective, performance will not be that much affected by the number of fields, unless the number of rows to return is really huge (millions of rows, probably) : the hardest part for the db is to make the joints, and build the resultset from the tables.
From a php perspective, that depends on multiple factors, like the complexity and the number of objects created.
I would take the problem differently : I would profile and stress-test my code in order to see if performance is an issue or not, and decide to refactor only if needed (switching from doctrine to a hand-made model is time consuming, will the performance gain be worth it ?)
EDIT : and to answer your initial question : fetching complete objects will lead to easier caching if needed, and better data encapsulation. I would keep these until they represent a big performance issue.
You can use partial keyword in your DQL : http://www.doctrine-project.org/docs/orm/2.0/en/reference/partial-objects.html?highlight=partial
But only do that if your app has performance issues.