Doctrine Performance on OneToMany Relationship

I am wondering whether a OneToMany relationship will hurt my application's performance. Say the City and User entities are in a OneToMany relationship, meaning a city can contain many users. After setting up the relationship in the entity classes, whenever I retrieve a city, I can get its users via:
$users = $city->getUsers();
Now I am wondering about the internal architecture. Whenever I retrieve a city, will it query for all of its users as well? If so, let's say a city can have 10000 users. Won't it be a performance issue that I am retrieving only a city, but it comes along with all 10000 users? Or does it use some other mechanism, so that I am fine to implement it this way?
I would like an explanation from the experts, and suggestions about best practices for situations like this. Thanks in advance.

Well, I am not an expert, but I can share some good practices that you could consider during application development:
fetch="EXTRA_LAZY"
By default, Doctrine 2 does not query the users when you retrieve a city: the collection is lazy-loaded. The first time you access it, however, the entire collection is loaded and kept in memory. In a scenario like yours, the users collection could become a performance problem because of the size of the table. So why not mark the association as EXTRA_LAZY? With this fetch mode, operations such as counting or slicing the collection run as their own queries, without triggering a full load of the collection.
/**
 * @OneToMany(targetEntity="User", mappedBy="city", fetch="EXTRA_LAZY")
 */
With this fetch mode you can call methods such as slice(), count() and contains() on the collection without loading it in full. For example:
$users = $city->getUsers();
echo $users->count();
This triggers an SQL statement like SELECT COUNT(*) FROM users WHERE city_id = ? instead of loading and looping through the whole $users collection.
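To make the idea behind an extra-lazy collection more concrete, here is a minimal sketch in plain PHP. It is not Doctrine's actual implementation: the class ExtraLazyCollectionSketch, its queryLog property and the in-memory "table" are all hypothetical stand-ins, assumed only for illustration. It shows count() and slice() being answered without the full set of rows ever being loaded.

```php
<?php
// Illustrative only: NOT Doctrine's PersistentCollection, just a hypothetical
// class showing the idea behind an extra-lazy collection.
final class ExtraLazyCollectionSketch implements Countable
{
    /** @var array|null Loaded items, or null while the collection is unloaded. */
    private ?array $items = null;

    /** @var array Stand-in for the rows of the users table for one city. */
    private array $table;

    /** @var string[] Log of the "queries" this collection has issued. */
    public array $queryLog = [];

    public function __construct(array $table)
    {
        $this->table = $table;
    }

    // Answers with a COUNT-style query unless the items are already in memory.
    public function count(): int
    {
        if ($this->items !== null) {
            return count($this->items);
        }
        $this->queryLog[] = 'SELECT COUNT(*)';
        return count($this->table);
    }

    // Fetches only one page of rows, like a LIMIT/OFFSET query.
    public function slice(int $offset, int $length): array
    {
        $this->queryLog[] = "SELECT ... LIMIT $length OFFSET $offset";
        return array_slice($this->table, $offset, $length);
    }

    // Full load: this is what iterating the whole collection would cost.
    public function toArray(): array
    {
        if ($this->items === null) {
            $this->queryLog[] = 'SELECT * (full load)';
            $this->items = $this->table;
        }
        return $this->items;
    }

    public function isInitialized(): bool
    {
        return $this->items !== null;
    }
}

$users = new ExtraLazyCollectionSketch(
    array_map(fn (int $i) => "user$i", range(1, 10000))
);
$total = $users->count();           // one COUNT query, no full load
$firstPage = $users->slice(0, 20);  // one query for 20 rows only
```

Only a call like toArray() (or a foreach over the whole collection) would pay for all 10000 rows.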
Hydrating objects
It is not always necessary to load a collection of full entities. If you are building a blog system and only need to show a list of post titles, you can improve performance by hydrating the data for read-only purposes (for example as arrays rather than full entities).
Useful links
Some links to the official Doctrine documentation:
Improving performance
Doctrine 2, best practices

Related

Fetch associated entities collection from Doctrine2 repository as an array

I'm starting a little project with DDD approach. I've created my domain model with Entities and ValueObjects in plain PHP. Entities have references to their associations - in my case, there is an Employee entity with collection of Teams he belongs to, and I keep them in Employee::teams property. Everything is going great, I've created mappings for those entities with associations in YAML, interfaces for repositories to be implemented in the Symfony2 and Doctrine2 layer, etc.
When I fetch Employees from the repository (with Doctrine's findAll()), instead of an array of Teams I receive a PersistentCollection with those teams. The project is built on PHP 7, and Employee::getTeams() has a return type of array, so I'm getting a fatal type error.
Is there any way to convert it into an array with some external listeners (in the Symfony layer only, so as not to touch the domain files) or any other core mechanism?
Thanks.

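You get a collection object. Doctrine returns a PersistentCollection, which implements the same Collection interface as ArrayCollection:
http://www.doctrine-project.org/api/common/2.5/class-Doctrine.Common.Collections.ArrayCollection.html
It has a toArray() method, so you can write: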
/**
 * @return Team[]
 */
public function getTeams(): array
{
return $this->teams->toArray();
}
But I prefer
/**
 * @return Team[]|ArrayCollection
 */
public function getTeams()
{
return $this->teams;
}
If you use a good IDE like PhpStorm, it will understand this correctly: it autocompletes the return value as an ArrayCollection and each item as a Team when you write foreach ($employee->getTeams() as $team) {...
Collections are more powerful than plain arrays. For example, you can filter and order them, and Doctrine will generate optimized SQL so you don't have to load all items:
http://docs.doctrine-project.org/en/latest/reference/working-with-associations.html#filtering-collections

Kuba,
That doesn't seem to be a DDD question but a technical question about PHP implementation details.
However, let me look at your design and try to find a better solution for your domain. You have not modeled the relation between an employee and a team; let's call it "employee in team". As designed, you force one aggregate root (AR) to reference another AR, which implies that Employee manages the Team life cycle. That might not be the case, and in some scenario another actor might change the status of a team, which could break the invariants Employee holds over the Team AR. For this reason, when one AR manages the life cycle of another, the dependent one should not be reachable through a repository, so that the root of the aggregate is in a position to enforce the invariants around it.
There are still some concerns to mention, but to keep it simple, go ahead and model the concept you are missing: EmployeeInTeam, or whatever you want to call it. It references the conceptual identity of the employee and of the team, so you can remove it safely in the future, and you can query both the employees in a team and the teams an employee is part of.
If you need frameworks or the DB to keep consistency, then you are not doing DDD. Use just objects, not the technology.
Regards,
Sebastian.

Select only specific portion of OneToMany joined collection

I am new to Symfony, so sorry if this is something really simple to answer.
For the sake of example I have rewritten code snippets as if I was writing a blog.
I have a BlogPost entity with collection of BlogComments annotated like this:
/**
 * @OneToMany(targetEntity="BlogComment", mappedBy="post")
 */
private $comments;
From my amateur point of view, Doctrine likes to work with complete objects, so this collection is either not initialized, or lazily loaded whenever I use the reference to it.
I guess you can imagine the overhead and memory requirements when every one of my BlogPosts has at least 500 BlogComments and they all get initialized whenever I touch the $comments variable.
What I am trying to achieve is to list, e.g., 50 blog posts, each with its 20 latest comments (without the memory explosion). Additionally, I would like to be able to display only the top 20 comments with the most "likes" (or, more generally, just select a subset based on some criteria).
Is there any generally recommended and clean way to achieve this kind of functionality? And when I achieve this, isn't use of such "incomplete" or "modified" entities going to break my logic (when updating/deleting items from the subset and persisting it)? I assume that solution to this will likely be a method in a custom repository, but I still can't see the thought behind it.
In advance, thank you for answers, I am really curious what kinds of solutions you will be able to come up with.

Doctrine is not good at dealing with big lists of objects. It will be far slower than a classic SQL query followed by while ($row = $stmt->fetch()). Sometimes it's better to perform native queries. 50 blog posts * 20 comments = 1000 objects populated, and probably even more if you fetch the User name for each result.
You will need to use a native query anyway, because I don't think you can get your 20 comments per blog post with a pure DQL query. You will need a joined subquery to limit the result to 20 comments; see this post for more information: MySQL JOIN with LIMIT 1 on joined table
Once you have done your native query, and if you really need your objects, you can bind your results to Doctrine entities with the ResultSetMappingBuilder: http://doctrine-orm.readthedocs.org/projects/doctrine-orm/en/latest/reference/native-sql.html#resultsetmappingbuilder

Where to put traversing graph query in DDD while avoiding doing hundreds of smaller ones?

So I've been trying to make a bit of DDD at the project I work on, but I'm facing the problem I mention in the title.
We have the Entity.php generated by the Symfony console, with the Doctrine annotations in there (I know it is not how it should be made), and the corresponding EntityRepository.php.
The applicable object graph is:
A Post entity contains a Messages collection, and each Message in turn has a ReadMessages collection, because we need to know who has read it. To know whether a Post has been read, we want to left join Messages with ReadMessages, filtering by the user in question; if any ReadMessages come back blank, we know the Post has not been read.
If we use a method in the Post entity to iterate over all Messages and all ReadMessages for each of them, Doctrine will issue lots of queries unless we configure the associations as eager, which we don't want because it would then retrieve the associations every time we ask for a Post. The ideal way would be a DQL query that loads the joined entities, but there is no way to access the repository from the entity (apart from injecting one into the other, which I don't even know is possible). So I think the only option left is a Symfony2 service that gets Doctrine injected. The thing is that I don't really like having to add another piece just as a helper.
Is there any other way to do this?
Thanks in advance.

What if you filtered your collection using criteria (Doctrine\Common\Collections\Criteria)? I think this might solve your problem. You can read how to do this in detail in section 8.8 of the Doctrine2 documentation.
It is as simple as defining your Criteria as "message is read" and then getting the filtered result as follows:
/**
* Get all read messages
*
* @return Collection
*/
public function getReadMessages(){
$isReadCriteria = //... define criteria
$messages = $this->getMessages();
$readMessages = $messages->matching($isReadCriteria);
return $readMessages;
}

Handling relationships with the Data Mapper pattern

I am using the Data Mapper pattern and I am wondering what the best way is to handle relationships with it. I spent a lot of time searching for solutions on Google and Stack Overflow. I found some, but I am still not completely happy with them, especially in one special case that I will try to explain.
I am working with PHP so the examples of code that I will put are in PHP.
Let's say I have a table "team" (id, name) and a table "player" (id, name, team_id). This is a 1-N relationship.
By implementing the Data Mapper pattern, we will have the following classes: Team, TeamMapper, Player and PlayerMapper.
So far, everything is simple. What if we want to get all players from a team?
The first solution I found is to create a method getAllPlayers() in the Team class which will handle that with lazy loading and proxies. Then, we can retrieve the players of a team like that:
$players = $team->getAllPlayers();
The second solution I found is to directly use the PlayerMapper and pass the team ID as parameter. Something like:
$playerMapper->findAll(array('team_id' => $team->getId()));
But now, let's say that I want to display an HTML table with all the teams and a 'Players' column listing all of the players of each team. If we use the first solution I described, we will have to run one SQL query to get the list of teams and one query per team to get its players, which means N+1 SQL queries, where N is the number of teams.
If we use the second solution I described, we can first retrieve all team IDs, put them in an array, and then pass it to the findAll method of the player mapper, something like this:
$playerMapper->findAll(array('team_id' => $teamIds));
In that case, we need to run only 2 queries. Much better. But I am still not very happy with that solution, because the relationships are not described in the models and it is the developer who must know about them.
So my question is: are there other alternatives with the Data Mapper pattern? With the example I gave, is there a good way to select all teams with all players in just 2 queries, with the relationships described in the model?
Thank you in advance!

If you look at Martin Fowler's text that describes how the DataMapper works, you'll see that you can use one query to get all the data that you need and then pass that data to each mapper, allowing the mapper to pick out only the data that it needs.
For you, this would be a query that joins from Team to Player, returning a resultset with duplicated Team data for each unique Player.
You then have to cater for the duplication in your mapping code by only creating new objects when the data changes.
I've done something similar, where the equivalent would be the Team mapper iterating over the result set and, for each unique team, passing the result set to the Player mapper so that it can create a player and add it to the team's collection.
While this will work, there are problems with this approach further downstream...
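As a rough illustration of the single-query approach described in this answer, the sketch below walks over a joined result set and creates a new team only when the team id changes. The rows, column aliases and the function name mapTeamsFromJoinedRows() are hypothetical, and plain arrays stand in for the Team and Player objects to keep it short.

```php
<?php
// Hypothetical rows, shaped like the result of a query such as:
//   SELECT t.id AS team_id, t.name AS team_name,
//          p.id AS player_id, p.name AS player_name
//   FROM team t LEFT JOIN player p ON p.team_id = t.id
//   ORDER BY t.id
$rows = [
    ['team_id' => 1, 'team_name' => 'Reds',   'player_id' => 10,   'player_name' => 'Ann'],
    ['team_id' => 1, 'team_name' => 'Reds',   'player_id' => 11,   'player_name' => 'Bob'],
    ['team_id' => 2, 'team_name' => 'Blues',  'player_id' => 12,   'player_name' => 'Cy'],
    ['team_id' => 3, 'team_name' => 'Greens', 'player_id' => null, 'player_name' => null],
];

// Builds one entry per unique team and attaches each player to its team.
function mapTeamsFromJoinedRows(array $rows): array
{
    $teams = [];
    foreach ($rows as $row) {
        $teamId = $row['team_id'];

        // The team columns repeat on every player row: create the team once.
        if (!isset($teams[$teamId])) {
            $teams[$teamId] = [
                'id'      => $teamId,
                'name'    => $row['team_name'],
                'players' => [],
            ];
        }

        // A LEFT JOIN yields NULL player columns for a team without players.
        if ($row['player_id'] !== null) {
            $teams[$teamId]['players'][] = [
                'id'   => $row['player_id'],
                'name' => $row['player_name'],
            ];
        }
    }

    return array_values($teams);
}

$teams = mapTeamsFromJoinedRows($rows);
```

In a real mapper pair, the two array literals inside the loop are where the Team mapper and the Player mapper would each build their object from the columns they care about.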

I have a possible solution to this problem that I have implemented successfully in one of my projects. It is not so complex and would use only 2 queries in the example described above.
The solution is to add another layer of code responsible for handling relationships.
For instance, we can put that in a service class (which can be used for other stuff as well, not only handling relationships).
So let's say that we have a class TeamService on top of Team and TeamMapper. TeamService would have a method getTeamsWithRelationships() which would return an array of Team objects. getTeamsWithRelationships() would use TeamMapper to get the list of teams. Then, with the PlayerMapper, it would get in only one query the list of players for these teams and set the players to the teams by using a setPlayers() method from the Team class.
This solution is quite simple and easy to implement, and it works well for all types of database relationships. I guess that some people may have something against it. If so, I would be interested to know what are the issues?
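A minimal sketch of such a service layer might look like the following. Only the names TeamService, TeamMapper, PlayerMapper, getTeamsWithRelationships() and setPlayers() come from the description above; everything else (the in-memory data, the queries counters, players represented by their names) is assumed for illustration, and real mappers would run one SQL query where these read from arrays.

```php
<?php
// Hypothetical stand-ins: each mapper method represents exactly one SQL query.
final class Team
{
    public int $id;
    public string $name;
    /** @var string[] Player names, kept as strings to keep the sketch short. */
    private array $players = [];

    public function __construct(int $id, string $name)
    {
        $this->id = $id;
        $this->name = $name;
    }

    public function setPlayers(array $players): void
    {
        $this->players = $players;
    }

    public function getPlayers(): array
    {
        return $this->players;
    }
}

final class TeamMapper
{
    public int $queries = 0;

    /** @return Team[] */
    public function findAll(): array
    {
        $this->queries++; // e.g. SELECT id, name FROM team
        return [new Team(1, 'Reds'), new Team(2, 'Blues')];
    }
}

final class PlayerMapper
{
    public int $queries = 0;

    public function findAll(array $criteria): array
    {
        $this->queries++; // e.g. SELECT ... FROM player WHERE team_id IN (...)
        $rows = [
            ['name' => 'Ann', 'team_id' => 1],
            ['name' => 'Bob', 'team_id' => 1],
            ['name' => 'Cy',  'team_id' => 2],
        ];
        return array_values(array_filter(
            $rows,
            fn (array $row) => in_array($row['team_id'], $criteria['team_id'], true)
        ));
    }
}

final class TeamService
{
    private TeamMapper $teamMapper;
    private PlayerMapper $playerMapper;

    public function __construct(TeamMapper $teamMapper, PlayerMapper $playerMapper)
    {
        $this->teamMapper = $teamMapper;
        $this->playerMapper = $playerMapper;
    }

    /** @return Team[] */
    public function getTeamsWithRelationships(): array
    {
        $teams = $this->teamMapper->findAll();                            // query 1
        $teamIds = array_map(fn (Team $team) => $team->id, $teams);
        $players = $this->playerMapper->findAll(['team_id' => $teamIds]); // query 2

        // Group the players by team id in PHP, then attach each group.
        $byTeam = [];
        foreach ($players as $player) {
            $byTeam[$player['team_id']][] = $player['name'];
        }
        foreach ($teams as $team) {
            $team->setPlayers($byTeam[$team->id] ?? []);
        }

        return $teams;
    }
}

$teamMapper = new TeamMapper();
$playerMapper = new PlayerMapper();
$service = new TeamService($teamMapper, $playerMapper);
$teams = $service->getTeamsWithRelationships();
```

The knowledge of how teams and players relate lives in one place, the service, rather than in every caller.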

Where should filtering with an Acl be performed?

Let's say I have three tables: users, books, and users_books.
In one of my views, I want to display a list of all the books the current user has access to. A user has access to a book if a row matching a user and a book exists in users_books.
There are (at least) two ways I can accomplish this:
In my fetchAll() method in the books model, execute a join of some sort on the users_books table.
In an Acl plugin, first create a resource out of every book. Then, create a role out of every user. Next, allow or deny users access to each resource based on the users_books table. Finally, in the fetchAll() method of the books model, call isAllowed() on each book we find, using the current user as the role.
I see the last option as the best, because then I could use the Acl in other places in my application. That would remove the need to perform duplicate access checks.
What would you suggest?

I'd push it all down into the database:
Doing it in the database through JOINs will be a lot faster than filtering things in your PHP.
Doing it in the database will let you paginate things properly without having to jump through hoops like fetching more data than you need (and then fetching even more if you end up throwing too much out).
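As a sketch of what "pushing it down into the database" could look like for the books example, the hypothetical helper below builds a single paginated query that joins books to users_books. The function name and the column names (books.id, users_books.user_id, users_books.book_id) are assumptions based on the three tables named in the question.

```php
<?php
// Hypothetical helper: returns the SQL and parameters for one page of the
// books a given user may see. Column names are assumed, not from the source.
function buildAccessibleBooksQuery(int $userId, int $page, int $perPage): array
{
    // The JOIN does the access check; LIMIT/OFFSET paginate only allowed rows.
    $sql = 'SELECT b.* FROM books b'
         . ' JOIN users_books ub ON ub.book_id = b.id'
         . ' WHERE ub.user_id = :user_id'
         . ' ORDER BY b.id'
         . ' LIMIT :limit OFFSET :offset';

    $params = [
        ':user_id' => $userId,
        ':limit'   => $perPage,
        ':offset'  => max(0, $page - 1) * $perPage, // pages are 1-based here
    ];

    return [$sql, $params];
}

[$sql, $params] = buildAccessibleBooksQuery(42, 3, 20);
```

Because the filtering happens before the LIMIT is applied, every page comes back full (until the last one), which is the pagination benefit mentioned above. When running this through PDO, the limit and offset values would typically need to be bound as integers.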
I can think of two broad strategies you could employ for managing the ACLs.
You could set up explicit ACLs in the database with a single table sort of like this:
id: The id of the thing (book, picture, ...) in question.
id_type: The type or table that id comes from.
user: The user that can look at the thing.
The (id, id_type) pair gives you a pseudo-FK that you can use for sanity checking your database, and the id_type can be used to select a class that provides the necessary glue to interact with the type-specific parts of the ACLs and to add SQL snippets to queries so they properly join the ACL table.
Alternatively, you could use a naming convention to attach an ACL sidecar table to each table that needs an ACL. For table t, you could have a table t_acl with columns like:
id: The id of the thing in t (with a real foreign key for integrity).
user: The user that can look at the thing.
Then, you could have a single ACL class that could adjust your SQL given the base table name.
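One possible shape for such a class, sketched under the naming convention above (table t has a sidecar t_acl with columns id and user). The class name SidecarAcl and the method joinFor() are hypothetical, and depending on the database the user column may need quoting.

```php
<?php
// Hypothetical ACL helper for the sidecar convention: t -> t_acl(id, user).
final class SidecarAcl
{
    // Returns a JOIN clause restricting $table to rows visible to one user.
    public function joinFor(string $table): string
    {
        // Identifiers cannot be bound as parameters, so accept only simple names.
        if (!preg_match('/^[a-z_][a-z0-9_]*$/i', $table)) {
            throw new InvalidArgumentException("Unsafe table name: $table");
        }

        return "JOIN {$table}_acl acl ON acl.id = {$table}.id AND acl.user = :user";
    }
}

$acl = new SidecarAcl();
$sql = 'SELECT books.* FROM books ' . $acl->joinFor('books');
```

Any model could then append the same clause to its own base query and bind :user to the current user.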
The main advantage of the first approach is that you have a single ACL store for everything so it is easy to answer questions like "what can user X look at?". The main advantage of the second approach is that you can have real referential integrity and less code (through naming conventions) for gluing it all together.
Hopefully the above will help your thinking.

I would separate your database access code from your models by creating a repository class with a finder method like getBooksByUser(User $user) that returns a collection of book objects.
I'm not entirely sure you need ACLs from what you describe. I may be wrong.