I've got two classes, Post and Comment, as components I'm using to build a news feed. Post has a one-to-many association with Comment.
When building my news feed, I query the posts table and typically return a chunk of 20 results. For these 20 results, I'd also like to load the associated comments. Logic dictates that I can simply use fetch="EAGER" on the @OneToMany annotation, but in this use case it doesn't scale well: repeatedly searching a comments table with thousands of entries, once for each post, is not performant.
Ideally, I'd like to separately preload them in a single query - which is what I've done.
class Post {
// ...
/**
* @var Comment[]
*
* @ORM\OneToMany(targetEntity="App\Entity\PostComment", mappedBy="post", cascade={"persist", "remove"}, orphanRemoval=true)
*/
protected $comments;
public function getComments() {
return $this->comments;
}
public function setComments(array $comments) {
$this->comments = $comments;
}
}
class NewsFeedService {
private function preloadComments(array $posts) {
$postIds = array_column($posts, 'id'); // Get the post IDs
$comments = $this->manager->getRepository(Comment::class)->findCommentsByPostIds($postIds);
// Group the comments by post
$groupedComments = [];
foreach($comments as $comment) {
$groupedComments[$comment->getPost()->getId()][] = $comment;
}
// Populate each $post object with its comments
foreach($groupedComments as $postId => $postComments) {
$post = $posts[$postId];
$post->setComments($postComments);
}
}
}
class CommentRepository extends EntityRepository {
public function findCommentsByPostIds(array $postIds) {
return $this->createQueryBuilder('c')
->where('c.post IN(:ids)')
->setParameter('ids', $postIds)
->getQuery()
->getResult();
}
}
Essentially, when calling preloadComments() I run a single query on the DB and then force the results into each Post. Compared to EAGER fetching, this saves me 35-40% on average with my current dataset (60,000 comments). It works, but...
My question is whether there is a better, perhaps even Doctrine-native, way of doing this. There are also some things I haven't tested, such as whether calling $post->setComments() with externally fetched data will cause issues if I later update the Post object and flush the changes. I feel the way I'm doing this isn't optimal and may trade some small headaches for the performance I'm gaining.
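The grouping step can be sketched in plain PHP, independent of Doctrine. This is only an illustration under stated assumptions: PHP 8, hypothetical stand-in classes with public properties instead of the real entities, and a $posts array that is an ordinary 0-based list (in which case a lookup like $posts[$postId] needs the posts indexed by ID first).

```php
<?php
// Hypothetical stand-ins for the Doctrine entities; only what the sketch needs.
final class Post
{
    /** @var Comment[] */
    public array $comments = [];

    public function __construct(public int $id) {}
}

final class Comment
{
    public function __construct(public int $postId, public string $body) {}
}

/**
 * Attach preloaded comments to their posts.
 *
 * @param Post[]    $posts    a plain list, not keyed by ID
 * @param Comment[] $comments result of one "WHERE post IN (:ids)" query
 */
function attachComments(array $posts, array $comments): void
{
    // Index the posts by ID so each comment finds its post in O(1).
    $postsById = [];
    foreach ($posts as $post) {
        $postsById[$post->id] = $post;
    }

    foreach ($comments as $comment) {
        // Skip comments whose post is not part of this chunk.
        if (isset($postsById[$comment->postId])) {
            $postsById[$comment->postId]->comments[] = $comment;
        }
    }
}

$posts = [new Post(7), new Post(9)];
attachComments($posts, [new Comment(9, 'a'), new Comment(7, 'b'), new Comment(9, 'c')]);
```

Posts without comments simply keep an empty array, so the feed template can loop over $post->comments without a null check.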
Related
I am having trouble sorting a collection resulting from a one-to-many relationship that has been filtered. I have a quiz that has questions:
class Quiz
{
/**
* One quiz has many questions. This is the inverse side.
* @ORM\OneToMany(targetEntity="Question", mappedBy="assessment")
* @ORM\OrderBy({"num" = "ASC"})
*/
private $questions;
public function __construct() {
$this->questions = new ArrayCollection();
}
This works as expected. However, when I modify the getter to exclude inactive (soft-deleted) questions, the sort order is lost.
public function getQuestions()
{
// filter to never return soft deleted questions
$criteria = Criteria::create()->where(Criteria::expr()->eq("active", true));
return $this->questions->matching($criteria);
}
In fact, with this getter in place, if I modify the order by clause to a nonexistent column, I do not get an unrecognized field exception as I would expect:
@ORM\OrderBy({"nonexistantcolumn" = "ASC"})
This leads me to believe that somehow the criteria filtering is overriding the annotation. Any ideas on how to resolve this would be much appreciated.
Besides filtering, Criteria can also sort a collection:
public function getQuestions()
{
// filter to never return soft deleted questions
$criteria = Criteria::create()
->where(Criteria::expr()->eq("active", true))
->orderBy(["num" => Criteria::ASC]);
return $this->questions->matching($criteria);
}
However, since this getter will prevent you from actually deleting inactive elements, consider adding another, unfiltered getter or moving this logic to a repository method.
I have a method that needs to pull in information from three related models. I have a solution that works but I'm afraid that I'm still running into the N+1 query problem (also looking for solutions on how I can check if I'm eager loading correctly).
The three models are Challenge, Entrant, User.
Challenge Model contains:
/**
* Retrieves the Entrants object associated to the Challenge
* @return \Illuminate\Database\Eloquent\Relations\HasMany
*/
public function entrants()
{
return $this->hasMany('App\Entrant');
}
Entrant Model contains:
/**
* Retrieves the Challenge object associated to the Entrant
* @return \Illuminate\Database\Eloquent\Relations\BelongsTo
*/
public function challenge()
{
return $this->belongsTo('App\Challenge', 'challenge_id');
}
/**
* Retrieves the User object associated to the Entrant
* @return \Illuminate\Database\Eloquent\Relations\BelongsTo
*/
public function user()
{
return $this->belongsTo('App\User', 'user_id');
}
and User model contains:
/**
* Retrieves the Entrants object associated to the User
* @return \Illuminate\Database\Eloquent\Relations\HasMany
*/
public function entrants()
{
return $this->hasMany('App\Entrant');
}
The method I am trying to use eager loading looks like this:
/**
* Returns an array of currently running challenges
* with associated entrants and associated users
* @return array
*/
public function liveChallenges()
{
$currentDate = Carbon::now();
$challenges = Challenge::where('end_date', '>', $currentDate)
->with('entrants.user')
->where('start_date', '<', $currentDate)
->where('active', '1')
->get();
$challengesObject = [];
foreach ($challenges as $challenge) {
$entrants = $challenge->entrants->load('user')->sortByDesc('current_total_amount')->all();
$entrantsObject = [];
foreach ($entrants as $entrant) {
$user = $entrant->user;
$entrantsObject[] = [
'entrant' => $entrant,
'user' => $user
];
}
$challengesObject[] = [
'challenge' => $challenge,
'entrants' => $entrantsObject
];
}
return $challengesObject;
}
I feel like I followed what the documentation recommended: https://laravel.com/docs/5.5/eloquent-relationships#eager-loading
but I'm not too sure how to check that I'm not making N+1 queries as opposed to just 2. Any tips or suggestions on the code are welcome, along with methods to check that eager loading is working correctly.
Use Laravel Debugbar to check queries your Laravel application is creating for each request.
Your Eloquent query should generate just 3 raw SQL queries and you need to make sure this line doesn't generate N additional queries:
$entrants = $challenge->entrants->load('user')->sortByDesc('current_total_amount')->all()
When you call ->with('entrants.user'), both the entrants and their users are loaded once you reach ->get(). Calling ->load('user') afterwards runs another query to fetch the users, which you don't need because ->with('entrants.user') already pulled them.
If you use ->loadMissing('user') instead of ->load('user'), it should prevent the redundant call.
But if you leverage Collection methods, you can get away with just the one Eloquent query at the beginning, where you declared $challenges:
foreach ($challenges as $challenge) {
// at this point, $challenge->entrants is a Collection because you already eager-loaded it
$entrants = $challenge->entrants->sortByDesc('current_total_amount');
// etc...
You don't need to use ->load('user') because $challenge->entrants is already populated with the entrants and their related users, so you can just use the Collection method ->sortByDesc() to sort the list in PHP.
Also, you don't need to call ->all(), because that converts the result into an array of models (you can keep it as a collection of models and still foreach over it).
This is the first time I've run into this problem. I want to create a Doctrine object and pass it along without having to flush it.
Right after its creation, I can display some values in the object, but I can't access nested objects:
$em->persist($filter);
print_r($filter->getDescription() . "\n");
print_r(count($filter->getAssetClasses()));
die;
I get:
filter description -- 0
(I should have 19 assetClass)
If I flush $filter, I still have the same issue (why oh why!)
The solution is to refresh it:
$em->persist($filter);
$em->flush();
$em->refresh($filter);
print_r($filter->getDescription() . " -- ");
print_r(count($filter->getAssetClasses()));
die;
I get:
filter description -- 19
Unfortunately, you can't refresh without flushing.
On my entities, I've got the following:
in class Filter:
public function __construct()
{
$this->filterAssetClasses = new ArrayCollection();
$this->assetClasses = new ArrayCollection();
}
/**
* @var Collection
*
* @ORM\OneToMany(targetEntity="FilterAssetClass", mappedBy="filterAssetClasses", cascade={"persist"})
*/
private $filterAssetClasses;
public function addFilterAssetClass(\App\CoreBundle\Entity\FilterAssetClass $filterAssetClass)
{
$this->filterAssetClasses[] = $filterAssetClass;
$filterAssetClass->setFilter($this);
return $this;
}
in class FilterAssetClass:
/**
* @var Filter
*
* @ORM\ManyToOne(targetEntity="App\CoreBundle\Entity\Filter", inversedBy="filterAssetClasses")
*/
private $filter;
/**
* @var AssetClass
*
* @ORM\ManyToOne(targetEntity="AssetClass")
*/
private $assetClass;
public function setFilter(\App\CoreBundle\Entity\Filter $filter)
{
$this->filter = $filter;
return $this;
}
Someone else wrote the code for the entities, and I'm a bit lost. I'm not a Doctrine expert, so if someone could point me in the right direction, that would be awesome.
Julien
but I can't access nested object
Did you set those assetClasses in the first place?
When you work with objects in memory (before persist), you can add and set all nested objects, and use those while still in memory.
My guess is that you believe that you need to store objects to database in order for them to get their IDs assigned.
IMHO, that is a bad practice and often causes problems. You can use ramsey/uuid library instead, and set IDs in Entity constructor:
public function __construct() {
$this->id = Uuid::uuid4();
}
A database should be used only as a means for storing data. No business logic should be there.
I would recommend this video on Doctrine good practices, and about the above mentioned stuff.
Your problem is not related to doctrine nor the persist/flush/refresh sequence; the problem you describe is only a symptom of bad code. As others have suggested, you should not be relying on the database to get at your data model. You should be able to get what you are after entirely without using the database; the database only stores the data when you are done with it.
Your Filter class should include some code that manages this:
// Filter
public function __construct()
{
$this->filterAssetClasses = new ArrayCollection();
}
/**
* @ORM\OneToMany(targetEntity="FilterAssetClass", mappedBy="filterAssetClasses", cascade={"persist"})
*/
private $filterAssetClasses;
public function addFilterAssetClass(FilterAssetClass $class)
{
// assuming you don't want duplicates...
if ($this->filterAssetClasses->contains($class)) {
return;
}
$this->filterAssetClasses[] = $class;
// you also need to set the owning side of this relationship
// for later persistence in the db
// Of course you'll need to create the referenced function in your
// FilterAssetClass entity
$class->addFilter($this);
}
You may have all of this already, but you didn't show enough of your code to know. Note that you should probably not have a function setFilterAssetClass() in your Filter entity.
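To illustrate the point that the object graph is usable before anything touches the database, here is a minimal Doctrine-free sketch. It assumes PHP 8, uses plain arrays instead of ArrayCollection, and reduces AssetClass to a string for brevity; the class shapes are hypothetical simplifications of the entities above, not the real mapping.

```php
<?php
// Hypothetical, Doctrine-free versions of the two entities.
final class FilterAssetClass
{
    private ?Filter $filter = null;

    // The asset class is reduced to a plain string for this sketch.
    public function __construct(private string $assetClass) {}

    public function setFilter(Filter $filter): void
    {
        $this->filter = $filter;
    }

    public function getFilter(): ?Filter
    {
        return $this->filter;
    }

    public function getAssetClass(): string
    {
        return $this->assetClass;
    }
}

final class Filter
{
    /** @var FilterAssetClass[] */
    private array $filterAssetClasses = [];

    public function addFilterAssetClass(FilterAssetClass $link): void
    {
        // Strict in_array compares object identity, so duplicates are skipped.
        if (in_array($link, $this->filterAssetClasses, true)) {
            return;
        }
        $this->filterAssetClasses[] = $link;
        $link->setFilter($this); // keep the other side of the relation in sync
    }

    /** @return string[] asset classes reachable through the link objects */
    public function getAssetClasses(): array
    {
        return array_map(
            fn (FilterAssetClass $link): string => $link->getAssetClass(),
            $this->filterAssetClasses
        );
    }
}

$filter = new Filter();
$equity = new FilterAssetClass('equity');
$filter->addFilterAssetClass($equity);
$filter->addFilterAssetClass($equity); // ignored, already present
$filter->addFilterAssetClass(new FilterAssetClass('bond'));
```

No persist, flush or refresh is involved: the count is correct as soon as the objects are wired together in memory.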
I am fairly new to OOP PHP and as I am building my first project, I am coming across some difficult dilemmas.
I'm trying to build a forum. Yes, I know there are many free ones out there, I just want to have one that I can build according to my own needs :). Plus, it's fun to code.
I have built a base model, controller and template for the forum, thread and post.
The base model has standard database functions, like find all, find by id etc. The individual models extend the model class and have only specific functions to the class.
I have set it up like this:
class Post_Model extends Model {
static protected $_table_name = 'post';
static protected $_db_fields = array('id', 'thread_id', 'parent_id', 'username', 'user_id',
'title', 'date_line', 'pagetext', 'show_signature', 'ip_address', 'icon_id', 'visible');
/**
* @var db fields
*/
public $id;
public $thread_id;
public $parent_id;
public $username;
public $user_id;
public $title;
public $date_line;
public $pagetext;
public $show_signature;
public $ip_address;
public $icon_id;
public $visible;
/*
* joined fields with user object. needed to display posts in all details
*/
static public function get_posts($id, $limit, $offset){
// cast every interpolated value to int so it cannot inject SQL
$result_set = static::find_by_sql("
SELECT *
FROM post
LEFT JOIN user ON (post.user_id = user.id) where post.thread_id=".(int)$id."
ORDER BY date_line DESC
LIMIT ".(int)$offset." , ".(int)$limit);
return !empty($result_set) ? $result_set : false;
}
}
Besides the tough decisions I already had to make about whether certain functions and variables should be static, I have this dilemma:
The basic CRUD works from the base model. That is fine, I can create, update, delete and read from my database through calling static methods from the (base) model, which instantiates objects for these forum, thread and post models. So far so good.
Part of the Model Class:
<?php
/**
* Find rows from database based on sql statement
* #param string $sql
* @return array $result_set
*/
static public function find_by_sql($sql = '') {
$db = Database::getInstance();
$mysqli = $db->getConnection();
$result = $mysqli->query($sql);
$object_array = array();
while ($record = $result->fetch_array()) {
$object_array[] = static::instantiate($record);
}
return $object_array;
}
private static function instantiate($record) {
$object = new static;
foreach ($record as $attribute => $value) {
if ($object->has_attribute($attribute)) {
$object->$attribute = $value;
}
}
return $object;
}
}
?>
I transfer these objects for displaying in my templates. When reading from one table, that works just fine.
<?php
$postcount = $pagination->offset + 1;
if (!empty($posts)) {
foreach ($posts as $post) {
?>
<div class="post">
<div class="posttime"><?php echo strftime("%A %e %B %G %H:%M", $post->date_line); ?></div>
<div class="postnumber">#<?php echo $postcount; ?></div>
<div class="postuser">
<div class="username"><?php echo $post->username; ?></div>
<div class="avatar">avatar</div>
</div>
<div class="postcontents">
<div class="posttext"><?php echo $post->pagetext; ?></div>
</div>
</div>
<?php
$postcount++;
} // end foreach
} // end if
?>
<div class="pagination"><?php echo $page_split; ?></div>
However, to display all posts in a thread, I need info from the user table, e.g. the number of their posts, rank, avatar etc.
Okay, so here's the thing:
SELECT * FROM post
LEFT JOIN user ON (post.user_id = user.id)
where thread_id={$id}
ORDER BY date_line DESC
LIMIT {$offset} , {$limit}
I now have more fields than the variables I have set up in my Post model, so they are not transferred for displaying.
Should I use an array for displaying these extra fields instead of an object? I would rather still use the object approach for displaying, as I am now set up for that.
Or should I maybe add these user fields as variables to my Post model? I am thinking that is not the tidiest way to do it, but it would work.
Or should I go through a foreach loop to get a separate instance of a (yet to be created) user class? I am thinking it would drain much more memory than the LEFT JOIN that simply gets me the data that I need.
I would like some hints on how you would solve this.
One thing that might help resolve the dilemma: in PHP an object can also act as an array (by implementing ArrayAccess) or even as a function (via __invoke), but a plain array cannot have properties or any other OOP feature.
You won't be able to solve this with your schema, because you are mixing properties from different entities in your Post entity [i.e. username].
The idea is that you just have to make your code aware of the relationship between entities, like:
class Post extends Entity
{
protected $id;
protected $body;
protected $date;
// this will be an instance of a Thread entity, let's say
protected $thread;
// this will be an instance of the User entity
protected $user;
...
}
This way you will be able to do:
$somePost -> getUser() -> getAvatar();
But how do you inject the correct user and thread into each post? That's what Object Relational Mappers are for: in your case, all the SQL stuff would be handled by the ORM, and you'd just be dealing with your object graph, with no worries about building queries on your own; you just have to define such relationships.
A popular PHP ORM is Doctrine: here is an example of how it handles relationship declarations.
Of course this approach has downsides, but it's up to you to investigate and find the best tradeoff between flexibility and dependencies.
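Without an ORM, the same object graph can be built by hand from the joined rows. The following is only a sketch under assumptions: PHP 8, hypothetical User and Post classes, and a query that aliases the user columns (user_id, user_username and user_post_count are made-up names) so they do not collide with the post columns, which a plain SELECT * over both tables would not guarantee.

```php
<?php
// Hypothetical plain classes; property and column names are illustrative.
final class User
{
    public function __construct(
        public int $id,
        public string $username,
        public int $postCount
    ) {}
}

final class Post
{
    public function __construct(
        public int $id,
        public string $pagetext,
        public User $user
    ) {}
}

/**
 * Build Post objects, each holding its User, from joined rows.
 *
 * Expects associative rows from a query that aliases the user columns, e.g.
 *   SELECT post.id, post.pagetext, user.id AS user_id,
 *          user.username AS user_username, user.post_count AS user_post_count
 *
 * @return Post[]
 */
function hydratePosts(array $rows): array
{
    $users = []; // one shared User object per author, keyed by user ID
    $posts = [];
    foreach ($rows as $row) {
        $userId = (int) $row['user_id'];
        $users[$userId] ??= new User(
            $userId,
            $row['user_username'],
            (int) $row['user_post_count']
        );
        $posts[] = new Post((int) $row['id'], $row['pagetext'], $users[$userId]);
    }
    return $posts;
}

$posts = hydratePosts([
    ['id' => 1, 'pagetext' => 'Hello', 'user_id' => 5, 'user_username' => 'ann', 'user_post_count' => 2],
    ['id' => 2, 'pagetext' => 'Again', 'user_id' => 5, 'user_username' => 'ann', 'user_post_count' => 2],
]);
```

The template can then read $post->user->username, and because authors are reused across posts, a thread with many posts by few users creates only a few User objects from the single joined query.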
I have a very simple entity (WpmMenu) that holds menu items connected to one another in a self-referencing relationship (an adjacency list, I believe it's called).
So in my entity I have:
protected $id;
protected $parent_id;
protected $level;
protected $name;
with all the getters/setters. The relationships are:
/**
* @ORM\OneToMany(targetEntity="WpmMenu", mappedBy="parent")
*/
protected $children;
/**
* @ORM\ManyToOne(targetEntity="WpmMenu", inversedBy="children", fetch="LAZY")
* @ORM\JoinColumn(name="parent_id", referencedColumnName="id", onUpdate="CASCADE", onDelete="CASCADE")
*/
protected $parent;
public function __construct() {
$this->children = new ArrayCollection();
}
And everything works fine. When I render the menu tree, I get the root element from the repository, get its children, and then loop through each child, get its children and do this recursively until I have rendered each item.
What happens (and what I am seeking a solution for) is this:
At the moment I have 5 level=1 items, and each of these items has 3 level=2 items attached (and in the future I will be using level=3 items as well). To get all elements of my menu tree, Doctrine executes:
1 query for the root element +
1 query to get the 5 children(level=1) of the root element +
5 queries to get the 3 children(level=2) of each of the level 1 items +
15 queries (5x3) to get the children(level=3) of each level 2 items
TOTAL: 22 queries
So, I need to find a solution for this and ideally I would like to have 1 query only.
So this is what I am trying to do:
In my entities repository(WpmMenuRepository) I use queryBuilder and get a flat array of all menu items ordered by level. Get the root element(WpmMenu) and add "manually" its children from the loaded array of elements. Then do this recursively on children. Doing this way I could have the same tree but with a single query.
So this is what I have:
WpmMenuRepository:
public function setupTree() {
$qb = $this->createQueryBuilder("res");
/** @var Array */
$res = $qb->select("res")->orderBy('res.level', 'DESC')->addOrderBy('res.name','DESC')->getQuery()->getResult();
/** @var WpmMenu */
$treeRoot = array_pop($res);
$treeRoot->setupTreeFromFlatCollection($res);
return($treeRoot);
}
and in my WpmMenu entity I have:
function setupTreeFromFlatCollection(Array $flattenedDoctrineCollection){
//ADDING IMMEDIATE CHILDREN
for ($i=count($flattenedDoctrineCollection)-1 ; $i>=0; $i--) {
/** @var WpmMenu */
$docRec = $flattenedDoctrineCollection[$i];
if (($docRec->getLevel()-1) == $this->getLevel()) {
if ($docRec->getParentId() == $this->getId()) {
$docRec->setParent($this);
$this->addChild($docRec);
array_splice($flattenedDoctrineCollection, $i, 1);
}
}
}
//CALLING CHILDREN RECURSIVELY TO ADD REST
foreach ($this->children as &$child) {
if ($child->getLevel() > 0) {
if (count($flattenedDoctrineCollection) > 0) {
$flattenedDoctrineCollection = $child->setupTreeFromFlatCollection($flattenedDoctrineCollection);
} else {
break;
}
}
}
return($flattenedDoctrineCollection);
}
And this is what happens:
Everything works out fine, BUT I end up with each menu item present twice. ;) Instead of 22 queries I now have 23, so I actually made things worse.
What really happens, I think, is that even though I add the children "manually", the WpmMenu entity is NOT considered in sync with the database, and as soon as I run the foreach loop over its children, the ORM's lazy loading is triggered, loading and adding the same children that were already added "manually".
Q: Is there a way to block/disable this behaviour and tell these entities that they ARE in sync with the DB, so no additional querying is needed?
With immense relief (and a lot of learning about Doctrine hydration and the UnitOfWork) I found the answer to this question. And as with lots of things, once you find the answer you realize that you can achieve this with a few lines of code. I am still testing this for unknown side effects, but it seems to be working correctly.
I had quite a lot of difficulties to identify what the problem was - once I did it was much easier to search for an answer.
So the problem is this: Since this is a self-referencing entity where the entire tree is loaded as a flat array of elements and then they are "fed manually" to the $children array of each element by the setupTreeFromFlatCollection method - when the getChildren() method is called on any of the entities in the tree (including the root element), Doctrine (NOT knowing about this 'manual' approach) sees the element as "NOT INITIALIZED" and so executes an SQL to fetch all its related children from the database.
So I dissected the ObjectHydrator class (\Doctrine\ORM\Internal\Hydration\ObjectHydrator), followed (sort of) the hydration process, and got to a $reflFieldValue->setInitialized(true); #line:369, which is a method on the \Doctrine\ORM\PersistentCollection class that sets the collection's $initialized property to true/false. So I tried it and IT WORKS!!!
Calling ->setInitialized(true) on the children collection of each entity returned by the getResult() method of the queryBuilder (using HYDRATE_OBJECT === ObjectHydrator) and then calling ->getChildren() on the entities does NOT trigger any further SQL!!!
Integrating it in the code of WpmMenuRepository, it becomes:
public function setupTree() {
$qb = $this->createQueryBuilder("res");
/** @var $res Array */
$res = $qb->select("res")->orderBy('res.level', 'DESC')->addOrderBy('res.name','DESC')->getQuery()->getResult();
/** @var $prop ReflectionProperty */
$prop = $this->getClassMetadata()->reflFields["children"];
foreach($res as &$entity) {
$prop->getValue($entity)->setInitialized(true);//getValue will return a \Doctrine\ORM\PersistentCollection
}
/** @var $treeRoot WpmMenu */
$treeRoot = array_pop($res);
$treeRoot->setupTreeFromFlatCollection($res);
return($treeRoot);
}
And that's all!
Add the annotation to your association to enable eager loading. This should allow you to load the entire tree with only 1 query, and avoid having to reconstruct it from a flat array.
Example:
/**
* @ManyToMany(targetEntity="User", mappedBy="groups", fetch="EAGER")
*/
The annotation is the one described at the following link, but with the fetch value changed:
https://doctrine-orm.readthedocs.org/en/latest/tutorials/extra-lazy-associations.html?highlight=fetch
You can't solve this problem if you are using an adjacency list. Been there, done that. The only way is to use a nested set; then you will be able to fetch everything you need in one single query.
I did that when I was using Doctrine1. In a nested set you have root, level, left and right columns, which you can use to limit/expand the fetched objects. It does require somewhat complex subqueries, but it is doable.
The D1 documentation for nested sets is pretty good; I suggest checking it to understand the idea better.
This is more of a completion and a cleaner solution, but it is based on the accepted answer...
The only thing needed is a custom repository that queries the flat tree structure and then, iterating over this array, first marks the children collection as initialized and then hydrates it with the addChild setter present on the parent entity.
<?php
namespace Domain\Repositories;
use Doctrine\ORM\EntityRepository;
class PageRepository extends EntityRepository
{
public function getPageHierachyBySiteId($siteId)
{
$roots = [];
$flatStructure = $this->_em->createQuery('SELECT p FROM Domain\Page p WHERE p.site = :id ORDER BY p.order')->setParameter('id', $siteId)->getResult();
$prop = $this->getClassMetadata()->reflFields['children'];
foreach($flatStructure as &$entity) {
$prop->getValue($entity)->setInitialized(true); //getValue will return a \Doctrine\ORM\PersistentCollection
if ($entity->getParent() != null) {
$entity->getParent()->addChild($entity);
} else {
$roots[] = $entity;
}
}
return $roots;
}
}
edit: the getParent() method will not trigger additional queries as long as the relationship is made to the primary key. In my case, the $parent attribute is a direct relationship to the PK, so the UnitOfWork will return the cached entity and not query the database. If your property doesn't relate by the PK, it WILL generate additional queries.
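The linking step itself does not depend on Doctrine. As a rough sketch, assuming PHP 8 and a hypothetical MenuItem class with public properties, a flat adjacency list can be turned into a tree with two linear passes: one to index the items by ID and one to attach each item to its parent.

```php
<?php
// Hypothetical plain-PHP node; the real entity would be WpmMenu or Page.
final class MenuItem
{
    /** @var MenuItem[] */
    public array $children = [];

    public function __construct(
        public int $id,
        public ?int $parentId,
        public string $name
    ) {}
}

/**
 * Link a flat list of items into trees.
 *
 * @param MenuItem[] $items in any order
 * @return MenuItem[] root items (no parent, or parent not in the list)
 */
function buildTree(array $items): array
{
    // Pass 1: index every item by its ID.
    $byId = [];
    foreach ($items as $item) {
        $byId[$item->id] = $item;
    }

    // Pass 2: attach each item to its parent, or collect it as a root.
    $roots = [];
    foreach ($items as $item) {
        if ($item->parentId !== null && isset($byId[$item->parentId])) {
            $byId[$item->parentId]->children[] = $item;
        } else {
            $roots[] = $item;
        }
    }
    return $roots;
}

$roots = buildTree([
    new MenuItem(3, 2, 'Team'),
    new MenuItem(1, null, 'Home'),
    new MenuItem(2, 1, 'About'),
]);
```

Because the parent lookup goes through the ID index, the rows do not need to be ordered by level, and no recursion or array_splice is required.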