I have three models, Companies, Events and Assistances, where the assistances table stores the event_id and the company_id. I'd like to build a query that returns, for each company, the total number of its assistances to certain kinds of events. However, as all these counts come from the same table, I don't really know how to build this query effectively. I have the ids of the assistances to each kind of event stored in arrays, and then I do the following:
$query = $this->Companies->find('all')->where($conditions)->order(['name' => 'ASC']);
$query
->select(['total_assistances' => $query->func()->count('DISTINCT(Assistances.id)')])
->leftJoinWith('Assistances')
->group(['Companies.id'])
->autoFields(true);
However, I don't know how to get the rest of the assistance counts, as I would need to count not all the distinct assistance ids but only those that meet certain conditions, something like ->select(['assistances_conferences' => $query->func()->count('DISTINCT(Assistances.id)')])->where($conferencesConditions) (but obviously that line does not work). Is there any way of counting different kinds of assistances in the query itself? (I need to do it this way because I then plan to use pagination and sort the table by those fields.)
The *JoinWith() methods accept a second argument, a callback that receives a query builder used for affecting the select list, as well as the conditions for the join.
->leftJoinWith('Assistances', function (\Cake\ORM\Query $query) {
return $query->where([
'Assistances.event_id IN' => [1, 2]
]);
})
This would generate a join statement like this, which would only include (and therefore count) the Assistances with an event_id of 1 or 2:
LEFT JOIN
assistances Assistances ON
Assistances.company_id = Companies.id AND
Assistances.event_id IN (1, 2)
The query builder passed to the callback really only supports selecting fields and adding conditions. More complex statements would need to be defined on the main query, or you'd possibly have to switch to using subqueries.
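To see what the extra join condition does to the counts, here is a minimal sketch that runs the equivalent SQL against an in-memory SQLite database through PDO. The table contents and company names are made up for illustration, and it assumes the pdo_sqlite extension is available; it is not the ORM code itself, only the kind of statement the callback is described as producing.

```php
<?php
// Hypothetical in-memory schema and data, used only to illustrate the join condition.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec('CREATE TABLE assistances (id INTEGER PRIMARY KEY, company_id INTEGER, event_id INTEGER)');
$pdo->exec("INSERT INTO companies (id, name) VALUES (1, 'Acme'), (2, 'Globex')");
// Acme attended events 1, 2 and 3; Globex attended only event 3.
$pdo->exec('INSERT INTO assistances (company_id, event_id) VALUES (1, 1), (1, 2), (1, 3), (2, 3)');

// The event filter lives in the ON clause, so companies without a matching
// assistance are still returned (with a count of 0) instead of being dropped.
$sql = 'SELECT Companies.name, COUNT(DISTINCT Assistances.id) AS total
        FROM companies Companies
        LEFT JOIN assistances Assistances
            ON Assistances.company_id = Companies.id
            AND Assistances.event_id IN (1, 2)
        GROUP BY Companies.id
        ORDER BY Companies.name';

$counts = [];
foreach ($pdo->query($sql) as $row) {
    $counts[$row['name']] = (int)$row['total'];
}
print_r($counts);
```

With this sample data Acme is counted twice (events 1 and 2) and Globex still appears with 0. Moving the same condition into a WHERE clause would remove Globex from the result entirely, which matters if every company has to show up in a paginated table.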
See also
Cookbook > Database Access & ORM > Query Builder > Filtering by Associated Data
I am working on an application in CakePHP 3.7.
We have 3 database tables with the following hierarchy. These tables are associated correctly according to the Table classes:
regulations
groups
filters
(The associations are shown in a previous question: CakePHP 3 - association is not defined - even though it appears to be)
I can get all of the data from all three tables as follows:
$regulations = TableRegistry::getTableLocator()->get('Regulations');
$data = $regulations->find('all', ['contain' => ['Groups' => ['Filters']]]);
$data = $data->toArray();
The filters table contains >1300 records. I'm therefore trying to build a feature which loads the data progressively via AJAX calls as the user scrolls down the page similar to what's described here: https://makitweb.com/load-content-on-page-scroll-with-jquery-and-ajax/
The problem is that I need to be able to count the total number of rows returned. However, the data for this exists in 3 tables.
If I do debug($data->count()); it will output 8 because there are 8 rows in the regulations table. Ideally I need a count of the rows it's returning in the filters table (~1300 rows) which is where most of the data exists in terms of the initial load.
The problem is further complicated because this feature allows a user to perform a search. It might be the case that a given search term exists in all 3 tables, 1 - 2 of the tables, or not at all. I don't know whether the correct way to do this is to try and count the rows returned in each table, or the rows overall?
I've read How to paginate associated records? and Filtering By Associated Data in the Cake docs.
Additional information
The issue seems to come down to how to write a query using the ORM syntax Cake provides. The following (plain MySQL) query will actually do what I want. Assuming the user has searched for "Asbestos":
SELECT
regulations.label AS regulations_label,
groups.label AS groups_label,
filters.label AS filters_label
FROM
groups
JOIN regulations
ON groups.regulation_id = regulations.id
JOIN filters
ON filters.group_id = groups.id
WHERE regulations.label LIKE '%Asbestos%'
OR groups.label LIKE '%Asbestos%'
OR filters.label LIKE '%Asbestos%'
ORDER BY
regulations.id ASC,
groups_label ASC,
filters_label ASC
LIMIT 0,50
Let's say there are 203 rows returned. The LIMIT condition means I am getting the first 50. Some JS on the frontend (which isn't really relevant in terms of how it works) will make an AJAX call as the user scrolls to the bottom of the page to re-run this query with a different limit (e.g. LIMIT 50, 50 for the next set of results, since MySQL's syntax is LIMIT offset, row_count).
The problem seems to be twofold:
1. If I write the query using Cake's ORM syntax the output is a nested array. Conversely if I write it in plain SQL it's returning a table which has just 3 columns with the labels that I need to display. This structure is much easier to work with in terms of outputting the required data on the page.
2. The second - and perhaps more important issue - is that I can't see how you could write the LIMIT condition in the ORM syntax to make this work due to the nested structure described in 1. If I added $data->limit(50) for example, it only imposes this limit on the regulations table. So essentially the search works for any associated data on the first 50 rows in regulations. As opposed to the LIMIT condition I've written in MySQL which would take into consideration the entire result set which includes the columns from all 3 tables.
To further elaborate point 2, assume the tables contain the following numbers of rows:
regulations: 150
groups: 1000
filters: 5000
If I use $data->limit(50) it would only apply to 50 rows in the regulations table. I need to apply the LIMIT to the result set after searching all rows in all 3 tables for a given term.
Creating that query using the query builder is pretty simple. If you want joins for non-1:1 associations, use the *JoinWith() methods instead of contain(), in this case innerJoinWith(), something like:
$query = $GroupsTable
->find()
->select([
'regulations_label' => 'Regulations.label',
'groups_label' => 'Groups.label',
'filters_label' => 'Filters.label',
])
->innerJoinWith('Regulations')
->innerJoinWith('Filters')
->where([
'OR' => [
'Regulations.label LIKE' => '%Asbestos%',
'Groups.label LIKE' => '%Asbestos%',
'Filters.label LIKE' => '%Asbestos%',
],
])
->order([
'Regulations.id' => 'ASC',
'Groups.label' => 'ASC',
'Filters.label' => 'ASC',
]);
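Because the joined query returns flat rows, the progressive loading from the question comes down to asking for a different window of that one result set on each AJAX call. The helper below is a small sketch of the arithmetic only; pageWindow() and its parameter names are hypothetical and not part of CakePHP. Its values could then be handed to the query builder's limit() and offset() methods.

```php
<?php
/**
 * Hypothetical helper: translate a 1-based page number into the
 * limit/offset pair for that window of the flat result set.
 */
function pageWindow(int $page, int $perPage = 50): array
{
    if ($page < 1 || $perPage < 1) {
        throw new InvalidArgumentException('Page and page size must be positive.');
    }

    return [
        'limit' => $perPage,
        // Page 1 starts at row 0, page 2 at row 50, and so on.
        'offset' => ($page - 1) * $perPage,
    ];
}

$first = pageWindow(1);
$second = pageWindow(2);
$fifth = pageWindow(5);

// Assumed usage with the query built above:
// $query->limit($second['limit'])->offset($second['offset']);
print_r([$first, $second, $fifth]);
```

For the 203-row example, page 5 starts at offset 200 and simply returns the last 3 rows, so the frontend can stop requesting more once a page comes back with fewer rows than the page size.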
See also
Cookbook > Database Access & ORM > Query Builder > Using innerJoinWith
Cookbook > Database Access & ORM > Query Builder > Adding Joins
What I want to do is to get all rows related with user_id but in a different way.
First condition is to get all Books that are related with the User via Resources table where user_id is stored (in other words - Books owned by the User). Second condition is to get all Books that are related with the User through the Cities model again which is stored in the Resources table as well (Books that belong to Cities owned by the User).
I have tried a lot of things and I simply cannot make these two conditions work because I use JOIN (I tried different combinations of innerJoinWith and leftJoinWith) on the same "end" model (User).
What I've done so far:
$userBooks = $this->Books->find()
->leftJoinWith("Resources.Users")
->leftJoinWith("Cities.Resources.Users")
->where(["Resources.Users" => 1])
->orWhere(["Cities.Resources.Users" => 1])
->all();
This of course does not work, but I hope you get the point about what I'm trying to achieve. The best I was able to get with the different approaches is the result of only one JOIN statement, which is logical.
Basically, this can be separated into 2 parts which give the expected result (but I'd rather not, because I want it done with one query of course):
$userBooks = $this->Books->find()
->innerJoinWith("Resources.Users", function($q) {
return $q->where(["Users.id" => 1]);
})
->all();
$userBooks2 = $this->Books->find()
->innerJoinWith("Cities.Resources.Users", function($q) {
return $q->where(["Users.id" => 1]);
})
->all();
Also, before this I created an SQL script which works well and result is like expected:
SELECT books.id FROM books, cities, users_resources WHERE
(users_resources.resource_id = books.resource_id AND users_resources.user_id = 1)
OR
(users_resources.resource_id = cities.resource_id AND books.city_id = cities.id AND users_resources.user_id = 1)
This query works and I want to transfer it into ORM styled query in CakePHP to get both Books that are owned by the user and the ones that are connected with the User via Cities. I want somehow to separate these joins to individually filter data like I did in the SQL query.
EDIT
I've tried @ndm's solution, but the result is the same as when there is only one association (User): I was still only able to get data based on one join statement (the second one was ignored). Since I had to move on, I ended up with
$userBooks = $this->Books->find()
->innerJoinWith("Cities.Resources.Users")
->where(["Users.id" => $userId])
->union($this->Books->find()
->innerJoinWith("Resources.Users")
->where(["Users.id" => $userId])
)
->all();
which outputs the correct result, but not in a very efficient way (by a union of 2 queries). I would really like to know the best way to approach this, as this is a very common case (filtering by a related model (User) that is associated with other models).
The ORM (specifically the eager loader) doesn't allow joining the same alias multiple times.
This can be worked around in various ways, the simplest one probably being to create a separate association with a unique alias. For example in your ResourcesTable, create another association to Users with a different alias, say Users2, like:
$this->belongsToMany('Users2', [
'className' => 'Users'
]);
Then you can use that association in the second leftJoinWith(), and apply the conditions accordingly:
$this->Books
->find()
->leftJoinWith('Resources.Users')
->leftJoinWith('Cities.Resources.Users2')
// An OR array is used here because orWhere() is deprecated as of CakePHP 3.5.
->where([
'OR' => [
'Users.id' => 1,
'Users2.id' => 1,
],
])
->group('Books.id')
->all();
And don't forget to group your books to avoid duplicate results.
You could also create the joins manually using leftJoin() or join() instead, where you can define the aliases on your own (or not use any at all) so that there are no conflicts. For more complex queries that can be a tedious task though.
You could also use your two separate queries as subqueries for conditions on Books, or even create a union query from them, which however might perform worse...
See also
Cookbook > Database Access & ORM > Query Builder > Adding Joins
CakePHP Issues > Improve association data fetching
I have User, Play and UserPlay model. Here is the relation defined in User model to calculate total time, the user has played game.
'playedhours'=>array(self::STAT, 'Play', 'UserPlay(user_id,play_id)',
'select'=>'SUM(duration)'),
Now I am trying to find the duration sum for a given user id.
$playedHours = User::model()->findByPk($model->user_id)->playedhours / 3600;
This relation takes a long time to execute on a large amount of data. So I looked into the query generated by the relation.
SELECT SUM(duration) AS `s`, `UserPlay`.`user_id` AS `c0` FROM `Play` `t` INNER JOIN
`UserPlay` ON (`t`.`id`=`UserPlay`.`play_id`) GROUP BY `UserPlay`.`user_id` HAVING
(`UserPlay`.`user_id`=9);
The GROUP BY on UserPlay.user_id is what takes so long, and I don't need a GROUP BY clause here.
My question is, how can I avoid the GROUP BY clause in the above relation?
STAT relations are by definition aggregation queries; see Statistical Query.
You cannot remove GROUP BY here and make a meaningful query for aggregate data. SUM(), AVG(), etc. are all aggregate functions; see GROUP BY Functions for a list of all aggregate functions supported by MySQL.
Your problem is that the filtering is done in a HAVING clause. That is not required here: HAVING checks conditions after the aggregation takes place, so it is meant for conditions on the aggregates themselves, for example SUM(duration) > 500.
Basically, what is happening is that you are grouping all the users separately first and then filtering for the user id you want. If you instead use a WHERE clause, which filters before the aggregation rather than after it, only the rows of the user you want are aggregated and your query will be much faster.
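The difference can be checked on a toy data set. This is a minimal sketch using an in-memory SQLite database through PDO (it assumes the pdo_sqlite extension); the rows and the scalar() helper are made up for illustration. Both statements return the same sum for user 9, but the WHERE version never has to build groups for the other users.

```php
<?php
// Hypothetical data: user 9 played plays 1 and 2, user 4 played play 3.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE play (id INTEGER PRIMARY KEY, duration INTEGER)');
$pdo->exec('CREATE TABLE user_play (user_id INTEGER, play_id INTEGER)');
$pdo->exec('INSERT INTO play (id, duration) VALUES (1, 3600), (2, 1800), (3, 7200)');
$pdo->exec('INSERT INTO user_play (user_id, play_id) VALUES (9, 1), (9, 2), (4, 3)');

// Shape of the query generated by the STAT relation: aggregate every user, then filter.
$having = 'SELECT SUM(t.duration) FROM play t
           INNER JOIN user_play up ON t.id = up.play_id
           GROUP BY up.user_id
           HAVING up.user_id = :id';

// Filter first, then aggregate only the remaining rows.
$where = 'SELECT SUM(t.duration) FROM play t
          INNER JOIN user_play up ON t.id = up.play_id
          WHERE up.user_id = :id';

// Hypothetical helper: run a statement with a bound user id and return the single value.
function scalar(PDO $pdo, string $sql, int $userId): int
{
    $stmt = $pdo->prepare($sql);
    $stmt->execute([':id' => $userId]);

    return (int)$stmt->fetchColumn();
}

$viaHaving = scalar($pdo, $having, 9);
$viaWhere = scalar($pdo, $where, 9);
printf("HAVING: %d s, WHERE: %d s, hours: %.1f\n", $viaHaving, $viaWhere, $viaWhere / 3600);
```

On three rows the two are indistinguishable in speed; the gain only shows on large tables, where an index on user_play.user_id lets the WHERE version touch just that user's rows.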
Although Active Record is good at modelling data in an OOP fashion, it
actually degrades performance due to the fact that it needs to create
one or several objects to represent each row of query result. For data
intensive applications, using DAO or database APIs at lower level
could be a better choice
Therefore it is best if you change the relation to a model function querying the Db directly using the CommandBuilder or DAO API. Something like this
Class User extends CActiveRecord {
....
public function getPlayedhours(){
if(!isset($this->id)) // to prevent query running on a newly created object without a row loaded to it
return 0;
$played = Yii::app()->db->createCommand()
->select('SUM(duration)')
->from('play')
->join("user_play up","up.play_id = play.id")
// bind the id as a parameter instead of concatenating it into the SQL
->where("up.user_id = :id", array(":id" => $this->id))
->group("up.user_id")
->queryScalar();
if($played == null)
return 0;
else
return $played/3600 ;
}
....
}
If your query is still slow, try optimizing the indexes, implement a caching mechanism, and use the EXPLAIN command to figure out what is actually taking the most time and, more importantly, why. If nothing is good enough, upgrade your hardware.
So, I've extended CGridView to include an Advanced Search feature tailored to the needs of my organization.
Filter - lets you show/hide columns in the table, and you can also reorder columns by dragging the little drag icon to the left of each item.
Sort - Allows for the selection of multiple columns, specify Ascending or Descending.
Search - Select your column and insert search parameters. Operators tailored to data type of selected column.
Version 1 works, albeit slowly. Basically, I had my hands in the inner workings of CGridView, where I snatch the results from the DataProvider and do the searching and sorting in PHP before rendering the table contents.
Now writing Version 2, where I aim to focus on clever CDbCriteria creation, allowing MySQL to do the heavy lifting so it will run quicker. The implementation is trivial when dealing with a single database table. The difficulty arises when I'm dealing with 2 or more tables... For example, if the user intends to search on a field that is a STAT relation, I need that relation to be present in my query so that I may include comparisons.
Here's the question. How do I ensure that Yii includes all "with" relations in my query so that I can include comparisons on them? I've included all my relations with my criteria in the model's search function and I've tried setting CDbCriteria's together to true ...
public function search() {
$criteria=new CDbCriteria;
$criteria->compare('id', $this->id);
$criteria->compare( ...
...
$criteria->with = array('relation0','relation1','relation3');
$criteria->together = true;
return new CActiveDataProvider(
get_class($this), array(
'criteria'=>$criteria,
'pagination' => array('pageSize' => 50)
));}
Then I'll snatch the criteria from the DataProvider and add a few conditions, for example, looking for dates > 1234567890. But I still get errors like this...
CDbCommand failed to execute the SQL statement:
SQLSTATE[42S22]: Column not found: 1054 Unknown column 't.relation3' in 'where clause'.
The SQL statement executed was:
SELECT COUNT(DISTINCT `t`.`id`) FROM `table` `t`
LEFT OUTER JOIN `relation_table` `relation0` ON (`t`.`id`=`relation0`.`id`)
LEFT OUTER JOIN `relation_table` `relation1` ON (`t`.`id`=`relation1`.`id`)
WHERE (`t`.`relation3` > 1234567890)
Where relation0 and relation1 are BELONGS_TO relations, but any STAT relations, here depicted as relation3, are missing. Furthermore, why is the query a SELECT COUNT(DISTINCT 't'.'id') ?
Edit #DCoder Here's the specific relation I'm working with now. The main table is Call, which has a HAS_MANY relation to CallSegments, which keeps the times. So the startTime of the Call is the minimum start_time of all the related CallSegments. And startTime is the hypothetical relation3 in my anonymized query error.
'startTime' => array(self::STAT, 'CallSegments', 'call_id',
'select' => 'min(`start_time`)'),
Edit Other people have sent me to CDbCriteria's together property, but as you can see above, I am currently trying that to no avail.
Edit Looks like the issue may have been reported: Yii and github tickets.
It is not a good idea to snatch the sql from a criteria and use it by yourself.
If you are using the "with" property then you could easily use comparisons like:
$criteria->compare("`relation1`.`id`", $yourVarHere);
Also Yii doesn't behave well with grouping.
My approach with STAT relations is to use a subquery in the select of the criteria, followed by a having:
$criteria->select = array("`t`.*", "(SELECT COUNT(*) FROM `relation3` WHERE `id` = `t`.id_relation3) AS `rel3`");
$criteria->having = "`rel3` > " . $yourValue;
The above method creates a bug in the gridview pagination because the count is done on a different query. A workaround will be to drop the "with" property and write the joins by yourself in the "join" property like:
$criteria->join = "LEFT OUTER JOIN `relation_table` `relation0` ON (`t`.`id`=`relation0`.`id`)
LEFT OUTER JOIN `relation_table` `relation1` ON (`t`.`id`=`relation1`.`id`)
LEFT OUTER JOIN `relation_table` `relation3` ON (`t`.`id`=`relation3`.`id`)";
If the bug is a little difficult to work around, could you use the STAT relation as a simple HAS_ONE with:
'select'=>'count(relation3.id)',
'joinType'=>'left join',
'group'=>'relation3.id',
'on'=>'t.id = relation3.id',
'together'=>true
to get the count value out along side everything else?
Not sure how well this would work for your case but it's been helpful for me from time to time.
Ok to make it more clear:
I am Using doctrine
I have a table Brands and Products
Brand
id
name
Product
id
name
brand_id
I have a lot of brands and Products of those brands in the database.
I would like to retrieve a list of brands (plus a count of each brand's products), grouped by the first letter of Brand.name.
ex:
array(
n => array(
0 => array('Nike', 4 ),
1 => array('North Pole', 18)
.....
)
.....
)
So my question is: can this be done with one query in an efficient way?
I really don't want to run separate queries for each first letter of Brand.name.
Doctrine's "Hierarchical Data" crossed my mind, but I believe it is for a different thing?
thanks
If you are going to use this form of result more than once, it might be worthwhile to make the formatting into a Hydrator, as described here.
In your case, you can create a query that selects 3 columns:
first letter of brand.name
brand.name
count(product.id)
Then hydrate the result
$results = $q->execute(array(), 'group_by_first_column');
You cannot get it from the database in that shape, but you can fetch the data as objects or arrays and then transform it into the described form using foreach loops.
When using Doctrine you can also use raw SQL queries and hydrate arrays instead of objects. So my solution would be to use a native SQL query:
SELECT
brand.name,
count(product.id)
FROM
brand
JOIN
product ON
brand.id=product.brand_id
GROUP BY
brand.id ORDER BY brand.name;
And then iterate over the result in PHP to build the desired array. Because the result is ordered by brand name, this is quite easy. If you want to keep database abstraction, I think it should also be possible to express this query in DQL; just hydrate an array instead of objects.
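The iteration step could look roughly like this. It is only a sketch: the $rows array stands in for the hydrated query result, and the column names name and product_count are assumptions (the query above would need matching aliases).

```php
<?php
// Hypothetical rows, shaped like the array-hydrated result of the query
// and already ordered by brand name.
$rows = [
    ['name' => 'Adidas', 'product_count' => 7],
    ['name' => 'Nike', 'product_count' => 4],
    ['name' => 'North Pole', 'product_count' => 18],
];

$grouped = [];
foreach ($rows as $row) {
    // Use the lower-cased first character of the brand name as the group key.
    $letter = strtolower(substr($row['name'], 0, 1));
    // Appending keeps the name ordering of the query inside each group.
    $grouped[$letter][] = [$row['name'], (int)$row['product_count']];
}

print_r($grouped);
```

Note that substr() works on bytes, so brand names starting with a multibyte character would need mb_substr() and mb_strtolower() instead.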