Doctrine: how to enable eager loading programmatically?

Imagine the following situation. You have books:
Book(bookId, authorId, title)
and authors:
Author(authorId, name)
and each book has (for the sake of simplicity) a single author.
By default all associations are configured in lazy mode. So if I first load all books, then iterate over the collection and fetch the author of each book, I'll run lots of queries against the database.
$books = $this->getDoctrine()
    ->getRepository('AppBundle:Book')
    ->findAll();
foreach ($books as $b) {
    echo $b->getAuthor()->getName();
}
Can I programmatically ask Doctrine to load authors eagerly for this specific query (not globally via configuration)?
Related: In Doctrine 2 can the Fetch Mode (Eager/Lazy etc.) be changed at runtime?
Related: http://doctrine-orm.readthedocs.io/projects/doctrine-orm/en/latest/tutorials/getting-started.html#list-of-bugs

You can simply mark the association between books and authors as EAGER (versus the implicit default of LAZY) and Doctrine will always load that particular association up front.
This can be accomplished by adding:
    fetch="EAGER"
to the association mapping.
One potential method to do this at runtime would be to create a Mapped Superclass. The superclass would define your relationships and other parts of your associations (not the relationship you're trying to adjust).
Then, to actually use the class at run time, you could create two other concrete implementations: LazyBook and EagerBook. Depending on your scenario at runtime, you would use one or the other of these concrete implementation entities to construct your associations.
Of course, LazyBook would define your Book -> Author association as a LAZY one (either explicitly or implicitly) and EagerBook would define it as EAGER.
This isn't truly dynamic as you've defined, but it allows you to programmatically determine which association to use at any given time while also self-documenting that it could be either.

One very important thing to understand here is that Doctrine uses the Data Mapper pattern and not the Active Record pattern (which you can find in the Yii framework, for example):
Doctrine 2 is an object-relational mapper (ORM) for PHP 5.4+ that
provides transparent persistence for PHP objects. It uses the Data
Mapper pattern at the heart, aiming for a complete separation of your
domain/business logic from the persistence in a relational database
management system.
The benefit of Doctrine for the programmer is the ability to focus on
the object-oriented business logic and worry about persistence only as
a secondary problem. This doesn’t mean persistence is downplayed by
Doctrine 2, however it is our belief that there are considerable
benefits for object-oriented programming if persistence and entities
are kept separated.
http://doctrine-orm.readthedocs.io/projects/doctrine-orm/en/latest/tutorials/getting-started.html#what-is-doctrine
That essentially means that entity classes know nothing about how they are persisted to the database. Even though they can carry docblock annotations, those are just a form of metadata processed by the ORM.
In turn, that means you can do the very same thing you did with ActiveRecord, only in a different place. Let's take a look at the difference:
In ActiveRecord-based ORM (like Yii):
$books = Book::model()->with('author')->findAll();
In DataMapper-based ORM (like Symfony/Doctrine):
$books = $this->getDoctrine()->getManager()->createQueryBuilder()
    ->select('b')
    ->from('AppBundle:Book', 'b')
    ->join('b.author', 'a')
    ->addSelect('a')
    ->getQuery()
    ->getResult();
A small comment on the latter: the query you are building there is not an SQL query but a DQL query (the object query language employed by Doctrine).
So join/addSelect here works much like with in the former query: it just tells the ORM engine that you would like to load the author in the same query. The relation metadata (e.g. column names for both underlying tables) is still defined at the entity metadata level.
The syntax (select, from, join) resembles SQL on purpose, but don't be confused by it. When building the query you operate on ORM entities, not on database columns/tables.

Related

How does the Doctrine Repository mechanism of lazy loading entities work?

So, I want to understand how the Doctrine Repository mechanism works.
For my entities I use annotations, so the resulting object is built somewhere during the execution of the script.
I'd like to understand the possible ways of implementing lazy loading of entities from another entity.
Concretely, using Doctrine, I can fetch information about related objects (as in the Symfony book). This fetching is lazy: the related entity is loaded from the database only when I call the method that gets information about it.
Now, I'd like to better understand this mechanism: how can an entity implement repository methods?
How can I reproduce this mechanism in another context similar to retrieving data from a database?
As the resulting object is really big, can someone point me in the right direction?
Which classes should I read to understand the mechanism?
Are there any articles/posts that better explain how this mechanism is implemented?
Are there better (or simply simpler) ways of implementing it?

I think the best description of the lazy loading can be found in Doctrine developer articles.
http://www.giorgiosironi.com/2009/07/lazy-loading-of-objects-from-database.html
http://www.giorgiosironi.com/2009/08/doctrine-2-now-has-lazy-loading.html
The main idea is to insert into the Product's category list a set of objects that are subclasses of Category. These are called "proxy objects" and are created on the fly when the Product is retrieved from the database. The proxy objects have the same interface as a Category object, but add the ability to load the actual Category data from the database when needed.
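The proxy idea can be reproduced outside Doctrine in plain PHP. What follows is a minimal sketch and not Doctrine's actual implementation: `Category`, `CategoryProxy` and the `$loader` closure are hypothetical names, and the closure merely stands in for a database query.

```php
<?php
// Plain domain class; it knows nothing about persistence.
class Category
{
    protected $id;
    protected $name;

    public function __construct(int $id, ?string $name = null)
    {
        $this->id = $id;
        $this->name = $name;
    }

    public function getId(): int
    {
        return $this->id;
    }

    public function getName(): ?string
    {
        return $this->name;
    }
}

// Hypothetical proxy: same interface as Category, but it fetches its
// data through $loader the first time getName() is called.
class CategoryProxy extends Category
{
    private $loader;
    private $initialized = false;

    public function __construct(int $id, callable $loader)
    {
        parent::__construct($id);
        $this->loader = $loader;
    }

    public function getName(): ?string
    {
        $this->initialize();
        return parent::getName();
    }

    private function initialize(): void
    {
        if ($this->initialized) {
            return;
        }
        $row = ($this->loader)($this->id); // stand-in for a SELECT by id
        $this->name = $row['name'];
        $this->initialized = true;
    }
}

// Usage: count how often the "database" is hit.
$queries = 0;
$loader = function (int $id) use (&$queries): array {
    $queries++;
    return ['name' => 'Category #' . $id];
};

$category = new CategoryProxy(7, $loader);
$idOnly = $category->getId();   // no query: the id is already known
$queriesAfterId = $queries;
$name = $category->getName();   // first access triggers the load
$category->getName();           // already initialized, no second query
```

Because the proxy extends `Category`, any code type-hinted against `Category` accepts it unchanged, which is what makes the loading transparent to callers.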

Doctrine actually creates an extra object (think proxy) that keeps a record of what properties have actually been accessed.
See this part from the documentation :
32.4.2. Association proxies
The second most important situation where Doctrine uses proxy objects is when querying for objects. Whenever you query for an object that has a single-valued association to another object that is configured LAZY, without joining that association in the same query, Doctrine puts proxy objects in place where normally the associated object would be. Just like other proxies it will transparently initialize itself on first access.
doctrine documentation

You can create a custom repository for an entity: http://symfony.com/doc/current/book/doctrine.html#custom-repository-classes
Once you have your own custom repository class you can create queries that fetch all of the information you need in one query rather than relying on lazy loading.
So say you have a Product entity which has one or more Category entities through a ManyToMany relationship. You could create a function in your custom repository to fetch all products with their categories in one query:
public function fetchProductsWithCategories()
{
    return $this->getEntityManager()
        ->createQuery(
            'SELECT p, c FROM Product p JOIN p.categories c'
        )
        ->getResult();
}
Then in your controller you would have something like:
$repo = $this->getDoctrine()->getManager()->getRepository('Product');
$products = $repo->fetchProductsWithCategories();
Edit: missed c in select

Where to put Doctrine queries that use multiple entities in Symfony2?

I have written quite a large and complicated query that internally uses a UNION to select from multiple tables, then returns an array of mixed type entities.
I know that best practices in Symfony say to always put the queries within the repository classes, but how do I decide which one to put it in? There's no parent/child relationship between them; the two entities are completely equal.

I usually put them in the repository of the entity I consider the most dependent in the context.
For instance, say I had two entities: User and Group.
Many entities might have an owning relationship with Group, but you can't expect the Group repository to single-handedly provide the methods necessary for every specific dependent to function.
It is the responsibility of the dependent (the owning side) to make the connection and hence provide the functionality.
So a method like getUsersInGroup(Group $group) would belong in the UserRepository.
However, you said there are no direct relationships between your two entities.
In this case, my first comment applies. Use the repository whose entity is more dependent on the other within the context of the query. Whichever entity that one is, depends entirely on you.

Do not use a repository. A repository is bound to the context of a specific entity type, and it is supposed to be used as a Collection.
Also
A Repository mediates between the domain and data mapping layers, acting like an in-memory domain object collection.
(M.Fowler)
And in your case I assume there is no Object-Relational-Mapping, given that you are using a query with a UNION of tables. I assume that your query will not return an actual Doctrine entity, right?
In that case, the query doesn't belong to the Repository pattern, and I suggest a custom class that encapsulates the big/complex query PHP+SQL code, e.g.:
    namespace App\Query;

    class MyComplexQuery // optionally implementing QueryInterface
    {
        // ...
    }
and call it from your Controller or Service, without going through the Repository.
In case you are defining a custom Doctrine Entity to represent the results of your UNION of entities, then use the repository of such entity.

repository pattern vs ORM

What good is a repository pattern when you have an ORM?
Example. Suppose I have the following (fictional) tables:
Table: users
pk_user_id
fk_userrole_id
username
Table: userroles
pk_userrole_id
role
Now with an ORM I could simply put this in a model file:
$user = ORM::load('users', $id);
Now $user is already my object, and related data could easily be lazy loaded:
(it would be even nicer if things were automatically singularized/pluralized)
foreach ($user->userroles()->role as $role) {
    echo $role;
}
Now with the Repository pattern I'd have to create a repository for the Users and one for the Roles. The repository also needs all kinds of functions to retrieve data for me and to store it. Plus it needs to work with Entity models, so I have to create all of those too.
To me that looks like a lot of stuff to do, when I could simply get the data as described above with an ORM. And I could store it just as easily:
ORM::store($user);
In this case it would store not only the user object to the database, but also any changes I made to the 'Roles' object. So there is no need for any extra work like you need with the repository pattern.
So my question basically is: why would I want to use a repository pattern with an ORM? I've seen tutorials that use that pattern (with Doctrine, for example), but it just doesn't make any sense to me. Can anyone explain its use in combination with an ORM?

The ORM is an implementation detail of the Repository. The ORM just makes it easy to access the db tables in an OOP friendly way. That's it.
The repository abstracts persistence access, whatever the storage is. That is its purpose. The fact that you're using a db or xml files or an ORM doesn't matter. The Repository allows the rest of the application to ignore persistence details. This way, you can easily test the app via mocking or stubbing and you can change storage if needed. Today you might use MySql, tomorrow you'll want to use NoSql or Cloud Storage. Do that with an ORM!
Repositories deal with Domain/Business objects (from the app point of view), while an ORM handles db objects. A business object IS NOT a db object: the first has behaviour, the second is a glorified DTO that only holds data.
Edit
You might say that both repository and ORM abstract access to data; however, the devil is in the details. The repository abstracts access to all storage concerns, while the ORM abstracts access to a specific RDBMS.
In a nutshell, Repositories and ORMs have DIFFERENT purposes and, as I've said above, the ORM is always an implementation detail of the repo.
You can also check this post for more details about the repository pattern.
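A minimal sketch of that separation, assuming hypothetical names throughout (`User`, `UserRepository`, `InMemoryUserRepository`, `renameUser`): the application code depends only on the interface, so an ORM-backed implementation could replace the in-memory one without touching callers.

```php
<?php
// Domain object: has behaviour, not just data.
class User
{
    private $id;
    private $username;

    public function __construct(int $id, string $username)
    {
        $this->id = $id;
        $this->username = $username;
    }

    public function getId(): int
    {
        return $this->id;
    }

    public function getUsername(): string
    {
        return $this->username;
    }

    public function rename(string $username): void
    {
        if ($username === '') {
            throw new InvalidArgumentException('Username must not be empty.');
        }
        $this->username = $username;
    }
}

// The contract the rest of the application sees.
interface UserRepository
{
    public function find(int $id): ?User;

    public function save(User $user): void;
}

// One possible storage: a PHP array. An ORM-backed class implementing
// the same interface would be another.
class InMemoryUserRepository implements UserRepository
{
    private $users = [];

    public function find(int $id): ?User
    {
        return $this->users[$id] ?? null;
    }

    public function save(User $user): void
    {
        $this->users[$user->getId()] = $user;
    }
}

// Application code: no knowledge of how users are stored.
function renameUser(UserRepository $repo, int $id, string $newName): bool
{
    $user = $repo->find($id);
    if ($user === null) {
        return false;
    }
    $user->rename($newName);
    $repo->save($user);
    return true;
}

$repo = new InMemoryUserRepository();
$repo->save(new User(1, 'alice'));
$renamed = renameUser($repo, 1, 'alice2');
$missing = renameUser($repo, 99, 'nobody');
```

The in-memory class also doubles as a test stub, which is the "easily test the app via mocking or stubbing" point made above.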

ORM and repository pattern...depends on setup.
If you use your ORM entities as the domain layer, then please use no repositories.
If you have a separate domain model and you need to map from that model to ORM entities and so perform a save, then repositories are what you need.
You can find more details here (you must be logged in to LinkedIn). Also, to understand the difference, check out the definition of the repository pattern.
Most people use classes that they call repositories but that aren't repositories at all, just query classes. This is how and where you should place your queries if you decide to go with the #1 option (see answer above). In this case make sure not to expose DbContext or ISession from that query class, nor to expose CUD methods from there. Remember: query class!
The #2 option is a tough one. If you build a real repository, all the inputs and outputs on the repository interface will be clean domain classes (and no database-related objects). It's forbidden to expose ORM-mapped classes or ORM-architecture-related objects from there. There will also be a Save method. These repositories might also contain queries, but unlike query classes, these repos do more: they take your domain aggregate (a collection and tree of entities) and save it to the DB by mapping those classes to ORM classes and performing a save on the ORM. This style (#2) does not need an ORM; the repository pattern was primarily made for ADO.NET (any kind of data access).
Anyhow, these 2 options are the 2 extremes. A lot of people use repositories with an ORM, but they just add an extra layer of code without real function; the only real function there is the query-class-like behaviour.
Also, I'd be careful when someone talks about UnitOfWork, especially with an ORM. Almost every sample on the internet is architecturally flawed. If you need a UoW, why not use TransactionScope (just make sure you have a wrapper that uses something other than Serializable transactions by default)? In 99.9% of cases you won't need to manage 2 sets of independent changes in data (so 2 sets of UoW), so TransactionScope will be a fine choice in .NET. For PHP I'd look for some open-session-in-view implementations...

own ORM: database records in case of JOIN?

We are building our own framework with ORM capability. The database tables are classes now, but what about records? Let's imagine two tables:
Users
ID,USERNAME
Emails
USER_ID,ADDRESS
So a record object will have getID(), getUSERNAME() methods, etc. But if the two tables are JOINed, it can't have two types, right? There is no multiple inheritance. And what about field collisions?

DBIx::Class handles this by having a class for each table, and joins are represented by a method that gets an object matching the other table:
$myAddress = $myUser->emails->address;

I think every class should represent a record and a whole table should be an array (or some other collection) of objects. Take a look at http://www.doctrine-project.org/ to get some ideas.
And for JOIN, you should have some mechanism for defining aliases. That way, you can deal with field collision.
And for getters and setters, you can use __call, __get and __set. See http://php.net/manual/en/language.oop5.overloading.php for more info.
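As a rough sketch of how these pieces could fit together, assuming a hypothetical `Record` class and made-up column aliases (`user_id`, `email_address`): one generic record wraps a result row, the magic methods supply the accessors, and aliases chosen in the SELECT keep joined columns from colliding.

```php
<?php
// Hypothetical generic record: one instance wraps one result row,
// whether the row comes from a single table or from a JOIN.
class Record
{
    private $fields;

    public function __construct(array $row)
    {
        $this->fields = $row;
    }

    public function __get($name)
    {
        if (!array_key_exists($name, $this->fields)) {
            throw new OutOfBoundsException("Unknown field: $name");
        }
        return $this->fields[$name];
    }

    public function __set($name, $value)
    {
        $this->fields[$name] = $value;
    }

    public function __isset($name)
    {
        return isset($this->fields[$name]);
    }

    // Maps getFoo()/setFoo($v) onto a field; the suffix is lower-cased,
    // so getUsername() reads the field "username".
    public function __call($method, $args)
    {
        $prefix = substr($method, 0, 3);
        $field = strtolower(substr($method, 3));
        if ($prefix === 'get') {
            return $this->__get($field);
        }
        if ($prefix === 'set' && count($args) === 1) {
            $this->__set($field, $args[0]);
            return $this;
        }
        throw new BadMethodCallException("Unknown method: $method");
    }
}

// A joined row as it might come back from something like:
//   SELECT u.ID AS user_id, u.USERNAME AS username,
//          e.ADDRESS AS email_address
//   FROM Users u JOIN Emails e ON e.USER_ID = u.ID
$row = new Record([
    'user_id' => 1,
    'username' => 'alice',
    'email_address' => 'alice@example.com',
]);

$username = $row->getUsername();   // resolved by __call
$address = $row->email_address;    // resolved by __get
$row->setUsername('bob');          // resolved by __call, then __set
```

The trade-off is that IDEs cannot autocomplete these accessors, and every call goes through the slower catch-all path.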

I'm providing some insight based on the Model/ORM implementation of this PHP UI Framework. Here are some suggestions from me:
Don't blindly decide to map functions onto fields. Why not use get('field') and set('field')? There is no downside (apart from the lack of IDE hinting), and you avoid code generation or a catch-all, which is usually slower.
When joining you wouldn't necessarily want multiple objects. In my ORM a single Model can work with joined tables. This introduces transparency: when you call $model->set('address') it might be associated with the joined table. I'm still using a sub-instance of a dynamic query for sub-selects, but for joins there is no need.
I've seen a lot of power in inheritance and the ability to re-shape parent models in child models. Each table can have multiple models depending on your business uses.
Models and ORM should be separated but should play together very closely. I've also managed to make everything play well with generic views and generic controllers, which is a great time-saver.
Hopefully this helps you find your own way, or decide against implementing your own ORM. It's not an easy task.

Does every single table in Zend have to map to its own Class?

I am not suggesting that all models are tables.
What I am asking is whether every single table must also have its own class defined specifically for it when using Zend? Is there any way of getting away from this awkward boiler-plate coding. We're just starting to look into Zend (hoping to leave procedural PHP land!) and my colleague thinks this could end up being pretty time-consuming.
Is this the reason for people using ORM solutions? Is there any other way around this?
Thanks for your replies.

The Zend Table classes follow the Table Data Gateway pattern, which by definition
... holds all the SQL for accessing a single table or view: selects, inserts, updates, and deletes. Other code calls its methods for all interaction with the database.
In the book, Fowler is not that rigid about that, saying that
for very simple cases, you can have a single TDG that handles all methods for all tables. You can even have one for views or even for interesting queries that aren't kept in the database as views.
However, apart from being able to use Views, Zend_Db_Table does not accommodate this. You can create queries against multiple tables, but those would have to be made through the Zend_Db_Adapter directly or, when using joins, by switching off the integrity check. Otherwise, you have to use the API offered by Zend_Db_Table Relationships.
So yes, one instance should correspond to one table or view. You do not need to create classes for that, though, if you don't plan on extending them. Zend_Db_Table_Definitions allow you to configure Zend_Db_Table instances on the fly.
Note that TDG is a Data Source Architectural Pattern and not an Object-Relational pattern. Its purpose is not to help with the impedance mismatch, but to separate database access code from business logic.