Doctrine2 should automatically merge associated entities

I am building an order process, in which an Order object is built through many steps. After each step I put the partly finished Order object into the session, and in the final step I save it into the database.
During the steps I load other (already existing) associated objects into my Order object (e.g. DiscountCoupon). The problem is that when I save my Order to the session and then load it in the next step, all associated entities are detached. So when I want to save it to the database, the EntityManager throws an exception asking me to set cascade=persist on the relationship.
Of course, I don't need to persist those objects (they are already in the database). The obvious solution would be to replace those associated objects with merged ones (using the EntityManager#merge method). However, I have a rather complicated object structure with multiple levels of embedded entities, so doing this by hand would be rather inconvenient.
Can't Doctrine do this for me automatically? Instead of 'complaining' about my detached entities, it could merge them itself.

As far as I can tell, you'd be creating some serious magic if you tried to make it entirely automatic.
Your best bet is probably to implement a function with a signature like mergeOrderEntities($Order, $EntityManager), which knows how to walk the Order structure and invoke EntityManager::merge() on all detached associated entities.
As to why Doctrine doesn't do this automatically, I'm not sure. I suspect there are use cases where auto-merging of entities could be dangerous, or at least undesirable.
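A minimal sketch of such a helper, under a few assumptions: the Order and DiscountCoupon classes below are simplified stand-ins for the real entities, the items collection is invented for illustration, and the second parameter is a plain callable instead of the EntityManager itself so the snippet runs without Doctrine. With Doctrine ORM 2.x you would pass [$entityManager, 'merge'].

```php
<?php
// Hypothetical, simplified stand-ins for the mapped entities in the question.
class DiscountCoupon
{
    public $code;

    public function __construct($code)
    {
        $this->code = $code;
    }
}

class Order
{
    /** @var DiscountCoupon|null */
    public $coupon;

    /** @var object[] illustrative second association */
    public $items = [];
}

/**
 * Replaces every detached association on $order with its managed copy.
 *
 * $merge maps a detached entity to a managed one, for example
 * [$entityManager, 'merge'] with Doctrine ORM 2.x.
 */
function mergeOrderEntities(Order $order, callable $merge)
{
    if ($order->coupon !== null) {
        $order->coupon = $merge($order->coupon);
    }

    foreach ($order->items as $key => $item) {
        $order->items[$key] = $merge($item);
    }

    return $order;
}
```

For deeper structures the function would recurse into each merged child in the same way; that knowledge of the object graph is exactly what has to be written by hand.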

Related

Flush only certain entities in single transaction

In my application I write to a read model table (think CQRS) at certain times. While doing so, I also have to remove older read models. So at any given point I need to:
Remove entities a,b,c
Persist entities x,y,z
In order to maintain a valid read model throughout the lifecycle, I would like to encapsulate this process in a single transaction. Doctrine does provide the necessary means.
However, I must also guarantee that no other entities are flushed in the process. Sadly, calling Doctrine's $em->getConnection()->commit(); seems to flush the whole unit of work. But according to the docs I have to call that to finalise my transaction.
I cannot introduce a second entity manager to take care of only my read model entities, as they are in the same namespace as the other entities and apparently that is not the way the doctrine-orm-bundle is supposed to be used.
The only other approach I see is to work on a lower level and circumvent the EntityManager and UnitOfWork completely, but I would like to guarantee transactional integrity and do not see a way to do so without the EM/UoW.

TLDR: The way your application works might warrant that update concerns are completely separable into two independent sets, but that's unusual and fragile (and usually not even true at the time the assertion is made). The proper way to model it is to use a separate EntityManager for each set and to guard their interconnections manually, so that the semantics of "how they are independent" are coded into the system.
Details:
If I understand the question correctly, you are facing a design flaw here. You have two sets of updates (one for the read models and one for the other entities) and you mix them (because you want both in the same entity manager) while you also want to separate them (by the means of a separate transaction). In general, this won't work.
For example, suppose non-read-model entity instance A has just been created in memory (so it has no ID yet), and based on it you decide to reference it from read-model entity instance R. The R->A relationship is valid in memory, but now you expect to be able to flush only the read-model entities and not the others. That is, when you try to persist and flush R, it will reference a non-existent foreign key and your RDBMS will hopefully fail the transaction. At a high level, this is because a connected in-memory graph should be consistent data in its entirety, and when you try to split its valid changes into subsets, you implicitly rearrange the order of those changes. That may introduce temporary inconsistency, and your transaction commit might land at just such a point.
Of course, it may happen that you know some rule why such a thing will never happen and why consistent state of each set is warranted in a fashion that is completely independent from the other set. But then you need to write your code reflecting that separation; the way you can do that is to use two entity managers. In that case your code will clearly cope with the two distinct transactions, their separation and how exactly that is consistent from both sides. But even in this case, to avoid clashes of updates, you probably need to have rules outlining a one-way visibility between the two sets, which will also imply an order to committing transactions. This is because transactions in a connected graph can be nested, but not "only overlap", because at each transaction commit you are asking the ORM to sync the consistent in-memory data of the transaction scope to the RDBMS.
I know you mentioned that you do not want to use two EMs, because read-model entities "are in the same namespace as the other entities and apparently that is not the way the doctrine-orm-bundle is supposed to be used".
The namespace does not really matter. You can use them separately in the two managers. You can even interconnect them, if you a) properly merge() them and b) cater for the above-mentioned consistency, which you need to provide for both EMs' transactions (because now they are working on one connected graph).
You should elaborate what exactly you refer to by saying "that is not the way the doctrine-orm-bundle is supposed to be used" -- probably there's an error in the original suggestion or something wrong with the way that suggestion is applied to this problem.

need to switch between class scopes

I have a class (PersistenceClass) that takes an array of data (posts), parses that data and puts it into a DB (via Doctrine). The content field needs to be parsed by a second class (SyntaxClass) before it is set on the Doctrine entity.
Now the problem is that the SyntaxClass has to set references in the content to other posts (just a link with an ID). So it needs access to the DB, and it also needs to search the entities from the PersistenceClass that are persisted but not yet flushed.
I can inject a Doctrine EM into SyntaxClass and find my references in the DB, although I don't like it very much. But the bigger problem is: how can I access the entities from the PersistenceClass that are only persisted, but not flushed? I could make an array of those objects and pass it as a parameter to the parser method, like:
$syntaxClass->parseSyntax($content, $persistedObjects);
But that does not look very clean. Aside from that, I don't know whether it is somehow possible to search the data of the persisted objects.

Your question is full of sub-questions, so first I'll try to make some things clear.
First, the naming convention you used is a bit ambiguous, and this doesn't help me or other people who may work on your code in the future (maybe you'll grow and need to hire more developers! :P ). So, let's start with some nomenclature.
What you are calling PersistenceClass may be something like this:
class PersistenceClass
{
    public function parse(array $posts)
    {
        foreach ($posts as $post) {
            // 1. Parse $post
            // 2. Parse content with SyntaxClass
            // 3. Persist $post in the database
        }
    }
}
The same applies also for SyntaxClass: it receives the $content and parses it in some ways, then sets the references and then persists.
This is just to set some boundaries.
Now, on to your questions.
"I can inject a doctrine EM into SyntaxClass and find my references in DB, although I dont like it very much."
This is exactly what you have to do! OOP development works this way.
But, and here come the problems with naming conventions, the way you inject the entity manager depends on the structure of your classes.
A good design should use services.
So what are currently PersistenceClass and SyntaxClass should really be called PersistenceService and SyntaxService (although I prefer to call them PersistenceManager and SyntaxManager, because in my code I always distinguish between managers and handlers, but this is a convention of mine, so I'll not write more about it here).
Now, another wrong thing that I imagine you are doing (only from reading your question, I'M IMAGINING!): you are instantiating SyntaxService (which you currently call SyntaxClass) from inside PersistenceService (which you currently call PersistenceClass). This is wrong.
If you need a fresh instance of SyntaxService for each post, then you should use a factory class (say SyntaxFactory), so that calling SyntaxFactory::create() gives you a fresh instance of SyntaxService. It is the factory itself that injects the entity manager into the newly created SyntaxService.
If, instead, you don't need a fresh instance each time, you simply declare SyntaxService as a service and pass it to PersistenceService by injection. This last, simpler case looks like this:
# app/config/service.yml
services:
    app.service.persistence:
        class: ...\PersistenceService
        # Pass the SyntaxService directly, or a factory if you need one
        arguments: ["@doctrine.orm.default_entity_manager", "@app.service.syntax"]
    app.service.syntax:
        class: ...\SyntaxService
        arguments: ["@doctrine.orm.default_entity_manager"]
"But the bigger problem is, how I can access the only persisted, but not flushed entities from the PersistenceClass ?"
Now the second question: how do you search both the {persisted + flushed} and the {persisted + not flushed} entities?
The problem is that you cannot use the ID as the search parameter, because entities that are persisted but not yet flushed usually don't have one until they are flushed.
The solution may be to create another service: SearchReferencesService. You inject the entity manager into it too (as shown before).
This class then has a search() method that does the search.
To search for the entities that are persisted but not flushed, the UnitOfWork gives you some interesting methods: getScheduledEntityInsertions(), getScheduledEntityUpdates(), getScheduledEntityDeletions(), getScheduledCollectionDeletions() and getScheduledCollectionUpdates().
The array you are talking about is already there: you only need to loop over it and compare object by object, basing the search on fields other than the ID (as it doesn't exist yet).
Unfortunately, as you didn't provide more details about the nature of your search, I cannot be more precise about how to do it. I can only tell you that you have to search the unit of work and then query the database if the first search returns nothing. The order in which you do these searches (database first and then unit of work, or vice versa) is up to you.
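The two-step search described above can be sketched as a small function. This is only an illustration under stated assumptions: findPendingOrStored() is a made-up name, the Post class and its title field are invented, and the database lookup is passed in as a callable so the snippet runs without Doctrine.

```php
<?php
/**
 * Looks through entities that are scheduled but not yet flushed and
 * falls back to a database lookup when none of them matches.
 *
 * @param object[] $pending        e.g. $em->getUnitOfWork()->getScheduledEntityInsertions()
 * @param callable $matches        receives an entity, returns true on a match
 * @param callable $databaseLookup called only when nothing pending matches
 */
function findPendingOrStored(array $pending, callable $matches, callable $databaseLookup)
{
    foreach ($pending as $entity) {
        if ($matches($entity)) {
            return $entity;
        }
    }

    return $databaseLookup();
}

// Hypothetical entity, used only to demonstrate the function.
class Post
{
    public $title;

    public function __construct($title)
    {
        $this->title = $title;
    }
}

$pending = [new Post('Draft A'), new Post('Draft B')];

$found = findPendingOrStored(
    $pending,
    function ($entity) {
        return $entity instanceof Post && $entity->title === 'Draft B';
    },
    function () {
        // With Doctrine this could be something like:
        // return $em->getRepository(Post::class)->findOneBy(['title' => 'Draft B']);
        return null;
    }
);
```

Swapping the two steps (database first, then the unit of work) only means calling the lookup before the loop.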
Hope this will help.

ORM/Doctrine2 - When to persist?

This has been bugging me for a while.
In Doctrine2, we have the ObjectManager function:
void persist(object $object = null)
You only need to call it on new entities.
My question though, is "when" should it be called? Immediately after creating the entity, or immediately before flushing it?
I can't find any documentation indicating the convention. The reason this is important is that Doctrine dispatches the "persist event" when persist() is called.
Given that the object might still be empty, it seems to imply that any functionality tagged on to that event should disregard the importance of the data the object contains at that point in time.
Am I correct in that statement or is there a convention Doctrine promotes?

Create your new object, use it any way you want, and when you are done with it and want to send it to your database, persist it just before flushing.
Note that persisting earlier does not lose later changes: Doctrine reads the entity's state at flush() time, so anything you set between persist() and flush() is still written. What does change is when the prePersist event fires, so listeners on that event see the entity as it is at the moment persist() is called.

Doctrine refresh copy of entity

I have a CustomerAccount entity. After that entity has had changes made to it via a form, but before the entity has been persisted to the database, I need to fetch a new copy of the same CustomerAccount with the entity as it currently exists in the database. The reason I need to do this is I want to fire off a changed event with both the old and new data in my service.
One hack I used was $oldAccount = unserialize(serialize($account)); and passing the old copy into my service, but that's really hackish.
What I would really like to do is have Doctrine pull back a copy of the original entity (while keeping the changes to the new version).
Is this even possible?
Update
It appears what I really want to do is ultimately impossible at this time with the way Doctrine is architected.
Update 2
I added the solution I ultimately ended up using at the bottom. I'm not completely happy with it because it feels hackish, but it gets the job done and allows me to move on.

It depends.
Doctrine2 uses an IdentityMap that prevents you from "accidentally" querying the DB for the same object over and over again within the same request. The only way to force Doctrine to fetch the entity object again is to detach the entity from the entity manager and request the entity again.
This, however, could lead to some strange behaviour that could "slip" out of your control:
you can't simply persist a detached object again
if you try to persist an object that is related ("linked") to your detached entity, you will run into trouble (which is sometimes very difficult to debug)
So why not try PHP's built-in clone keyword? It may suit you better and could save you a lot of debugging.
Code example:
$em = $this->getDoctrine()->getManager();
// find() takes the entity class and the primary key
$fetched_entity = $em->find(CustomerAccount::class, 12);
$cloned_entity = clone $fetched_entity;
// and so on ...
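One thing to keep in mind with clone is that it makes a shallow copy, and that the copy has to be taken before the changes are applied. The sketch below is plain PHP without Doctrine; the Address class and the field names are assumptions made up for the example.

```php
<?php
// Hypothetical nested object, used only to show the shallow-copy behaviour.
class Address
{
    public $city;

    public function __construct($city)
    {
        $this->city = $city;
    }
}

class CustomerAccount
{
    public $email;
    public $address;

    public function __construct($email, Address $address)
    {
        $this->email = $email;
        $this->address = $address;
    }

    // Without this, the clone and the original would share one Address object.
    public function __clone()
    {
        $this->address = clone $this->address;
    }
}

$account = new CustomerAccount('old@example.com', new Address('Berlin'));

// Take the snapshot before anything (e.g. a form) modifies the account.
$oldAccount = clone $account;

$account->email = 'new@example.com';
$account->address->city = 'Hamburg';
```

After these lines $oldAccount still holds the old email and city, so both versions can be handed to a "changed" event.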

Here is the ultimate solution I ended up using. I created a duplicate entity manager in my config.yml and retrieved a second copy of the entity from the duplicate entity manager. Because I won't make any changes to the entity retrieved by the duplicate entity manager, this solution was the best for my use case.

What does a Data Mapper typically look like?

I have a table called Cat, and a PHP class called Cat. Now I want to make a CatDataMapper class, so that Cat extends CatDataMapper.
I want that Data Mapper class to provide basic functionality for doing ORM, and for creating, editing and deleting Cat.
For that purpose, maybe someone who knows this pattern very well could give me some helpful advice? I feel it would be a little bit too simple to just provide some functions like update(), delete(), save().
I realize a Data Mapper has this problem: first you create the instance of Cat, then initialize all the variables like name, furColor, eyeColor, purrSound, meowSound, attendants, etc., and after everything is set up, you call the save() function, which is inherited from CatDataMapper. This was simple ;)
But now, the real problem: you query the database for cats and get back a plain boring result set with lots of cat data.
PDO features some ORM capability to create Cat instances. Let's say I use that, or let's even say I have a mapDataset() function that takes an associative array. However, as soon as I get my Cat object from a data set, I have redundant data. At the same time, twenty users could pick up the same cat data from the database and edit the cat object, i.e. rename the cat, and save() it, while another user is still thinking about setting another furColor. When all of them save their edits, everything is messed up.
Err... ok, to keep this question really short: What's good practice here?

From DataMapper in PoEA
"The Data Mapper is a layer of software that separates the in-memory objects from the database. Its responsibility is to transfer data between the two and also to isolate them from each other. With Data Mapper the in-memory objects needn't know even that there's a database present; they need no SQL interface code, and certainly no knowledge of the database schema. (The database schema is always ignorant of the objects that use it.) Since it's a form of Mapper (473), Data Mapper itself is even unknown to the domain layer."
Thus, a Cat should not extend CatDataMapper because that would create an is-a relationship and tie the Cat to the Persistence layer. If you want to be able to handle persistence from your Cats in this way, look into ActiveRecord or any of the other Data Source Architectural Patterns.
You usually use a DataMapper when using a Domain Model. A simple DataMapper would just map a database table to an equivalent in-memory class on a field-to-field basis. However, when the need for a DataMapper arises, you usually won't have such simple relationships. Tables will not map 1:1 to your objects. Instead, multiple tables could form one Object Aggregate and vice versa. Consequently, implementing even just the CRUD methods can easily become quite a challenge.
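For the simple field-to-field case, a mapper could look roughly like the sketch below. It is only an illustration: an in-memory array stands in for the Cat table so the snippet runs without a database, and the column name fur_color is an assumption. The point is that Cat does not extend anything and never mentions storage.

```php
<?php
// Domain object: knows nothing about how or where it is stored.
class Cat
{
    private $id;
    private $name;
    private $furColor;

    public function __construct($id, $name, $furColor)
    {
        $this->id = $id;
        $this->name = $name;
        $this->furColor = $furColor;
    }

    public function getId() { return $this->id; }
    public function getName() { return $this->name; }
    public function getFurColor() { return $this->furColor; }

    public function rename($name)
    {
        $this->name = $name;
    }
}

// Mapper: the only place that knows the column names.
class CatMapper
{
    /** @var array rows keyed by id; stands in for the Cat table */
    private $table = [];

    public function save(Cat $cat)
    {
        $this->table[$cat->getId()] = [
            'id' => $cat->getId(),
            'name' => $cat->getName(),
            'fur_color' => $cat->getFurColor(),
        ];
    }

    public function find($id)
    {
        if (!isset($this->table[$id])) {
            return null;
        }

        $row = $this->table[$id];

        return new Cat($row['id'], $row['name'], $row['fur_color']);
    }
}

$mapper = new CatMapper();
$mapper->save(new Cat(1, 'Tom', 'grey'));
$loaded = $mapper->find(1);
```

In a real mapper the array access would be replaced by SQL statements, and an aggregate spanning several tables would need correspondingly more work in save() and find().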
Apart from that, it is one of the more complicated patterns (covers 15 pages in PoEA), often used in combination with the Repository pattern among others. Look into the related questions column on the right side of this page for similar questions.
As for your question about multiple users editing the same Cat, that's a common problem called Concurrency. One solution to that would be locking the row, while someone edits it. But like everything, this can lead to other issues.
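A commonly used alternative to locking the row is optimistic locking: every row carries a version number, and a write is rejected when the version the editor loaded is no longer the current one. The sketch below is an assumption-laden illustration with an in-memory store; CatStore and StaleCatException are invented names. Against a real table the check would typically live in the WHERE clause of the UPDATE statement.

```php
<?php
class StaleCatException extends RuntimeException
{
}

class CatStore
{
    /** @var array id => ['name' => string, 'version' => int] */
    private $rows = [];

    public function insert($id, $name)
    {
        $this->rows[$id] = ['name' => $name, 'version' => 1];
    }

    public function load($id)
    {
        return $this->rows[$id];
    }

    /**
     * Writes only if the caller still holds the current version.
     * SQL equivalent (sketch):
     *   UPDATE cat SET name = ?, version = version + 1
     *   WHERE id = ? AND version = ?
     */
    public function rename($id, $newName, $expectedVersion)
    {
        if ($this->rows[$id]['version'] !== $expectedVersion) {
            throw new StaleCatException("Cat $id was changed by someone else");
        }

        $this->rows[$id] = ['name' => $newName, 'version' => $expectedVersion + 1];
    }
}

$store = new CatStore();
$store->insert(1, 'Tom');

// Two users load the same cat at version 1.
$userA = $store->load(1);
$userB = $store->load(1);

$store->rename(1, 'Felix', $userA['version']); // succeeds, version becomes 2

$conflictDetected = false;
try {
    $store->rename(1, 'Garfield', $userB['version']); // still holds version 1
} catch (StaleCatException $e) {
    $conflictDetected = true;
}
```

The second user is told about the conflict instead of silently overwriting the first user's change, and nobody has to hold a lock while editing.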

If you rely on ORMs like Doctrine or Propel, the basic principle is to create a static class that gets the actual data from the database (for instance, Propel would create CatPeer); the results retrieved by the Peer class are then "hydrated" into Cat objects.
The hydration process is the process of converting a "plain boring" MySQL result set into nice objects having getters and setters.
So for a retrieve you'd use something like CatPeer::doSelect(). Then for a new object you'd first instantiate it (or retrieve an instance from the DB):
$cat = new Cat();
The insertion would be as simple as doing: $cat->save(); That would be equivalent to an insert (or an update if the object already exists in the DB). The ORM should know how to tell new objects from existing ones by using, for instance, the presence or absence of a primary key.

Implementing a Data Mapper is very hard in PHP < 5.3, since you cannot read/write protected/private fields. You have a few choices when loading and saving the objects:
Use some kind of workaround, like serializing the object, modifying its string representation, and bringing it back with unserialize
Make all the fields public
Keep them private/protected, and write mutators/accessors for each of them
The first method may break with a new release and is a very crude hack; the second one is considered (very) bad practice.
The third option is also considered bad practice, since you should not provide getters/setters for all of your fields, only the ones that need it. Your model gets "damaged" from a pure DDD (domain driven design) perspective, since it contains methods that are only needed because of the persistence mechanism.
It also means that now you have to describe another mapping for the fields -> setter methods, next to the fields -> table columns.
PHP 5.3 introduces the ability to access/change all types of fields, by using reflection:
http://hu2.php.net/manual/en/reflectionproperty.setaccessible.php
With this, you can achieve a true data mapper, because the need to provide mutators for all of the fields has ceased.
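A minimal sketch of reflection-based hydration, assuming column names equal property names; the hydrate() function and the Cat fields are made up for the example. It uses newInstanceWithoutConstructor(), which needs PHP 5.4 or later. From PHP 8.1 on, properties are accessible through reflection by default, so setAccessible() is only called on older versions.

```php
<?php
class Cat
{
    private $name;
    private $furColor;

    public function getName() { return $this->name; }
    public function getFurColor() { return $this->furColor; }
}

/**
 * Builds an object of $className from a row without calling its
 * constructor or any setters.
 */
function hydrate($className, array $row)
{
    $class = new ReflectionClass($className);
    $object = $class->newInstanceWithoutConstructor();

    foreach ($row as $column => $value) {
        $property = $class->getProperty($column);
        if (PHP_VERSION_ID < 80100) {
            // Needed before PHP 8.1 to write private/protected properties.
            $property->setAccessible(true);
        }
        $property->setValue($object, $value);
    }

    return $object;
}

$cat = hydrate('Cat', ['name' => 'Tom', 'furColor' => 'grey']);
```

Cat needs no setters here; the read accessors exist only because the domain wants them, not because the mapper does.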

"PDO features some ORM capability to create Cat instances. Let's say I use that, or let's even say I have a mapDataset() function that takes an associative array. However, as soon as I got my Cat object from a data set, I have redundant data. At the same time, twenty users could pick up the same cat data from the database and edit the cat object, i.e. rename the cat, and save() it, while another user still thinks about setting another furColor. When all of them save their edits, everything is messed up."
In order to keep track of the state of the data, an IdentityMap and/or a UnitOfWork would typically be used to keep track of all the different operations on mapped entities, and at the end of the request cycle all the operations would then be performed.
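The identity-map half of that idea can be sketched in a few lines of plain PHP. CatIdentityMap and its loader callable are invented for illustration and are not taken from any particular ORM; the loader stands in for the actual database query.

```php
<?php
class Cat
{
    public $id;
    public $name;

    public function __construct($id, $name)
    {
        $this->id = $id;
        $this->name = $name;
    }
}

class CatIdentityMap
{
    /** @var Cat[] already loaded cats, keyed by id */
    private $loaded = [];

    /** @var callable fetches a Cat by id from the database */
    private $loader;

    public function __construct(callable $loader)
    {
        $this->loader = $loader;
    }

    public function find($id)
    {
        // Each id is loaded at most once per request; later calls
        // return the very same object instead of a second copy.
        if (!isset($this->loaded[$id])) {
            $this->loaded[$id] = call_user_func($this->loader, $id);
        }

        return $this->loaded[$id];
    }
}

$queries = 0;
$map = new CatIdentityMap(function ($id) use (&$queries) {
    $queries++; // stands in for a real SELECT
    return new Cat($id, 'Tom');
});

$first = $map->find(1);
$second = $map->find(1);
```

Because both variables point to one object, a rename through $first is visible through $second, which removes the redundant copies within a single request. It does not by itself solve conflicts between different users' requests.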

To keep the answer short:
You have an instance of Cat. (Maybe it extends CatDbMapper, or Cat3rdpartycatstoreMapper.)
You call:
$cats = $cat_model->getBlueEyedCats();
// then you get an array of Cat objects in the $cats array
I don't know what you use; you might take a look at some PHP framework for a better understanding.
