This is a bit of a followup on my previous question:
Symfony2 / Doctrine queries in loops. I thought it would be better to post this as a separate question though, as the first part has been solved.
I've updated some old query code in our product, because it was causing timeouts. I'm not very used to the whole Symfony/Doctrine/Repository/Entity concept (it's pretty confusing and convoluted in my limited experience), but I'm trying to fix it as best I can.
At first, the getRepository was inside a nested foreach loop, so I made a getUsersFromArray function in the repository so it doesn't have to loop over that (causing literally a million queries). I accomplished that by using a where IN(:ids) statement. That helps a lot, at least it doesn't time out any more.
I still have a foreach loop left over, which removes groups from users, so with 1000 users that's still 1000 queries... that should be doable in a single query, right?
$usersWhoShouldNotHaveGroup = $em->getRepository('AdminBundle:User')
    ->getUsersFromArray($session['usersNotHaveGroup'], $user->getCompany());
foreach ($usersWhoShouldNotHaveGroup as $u) {
    $u->removeGroup($group);
    $em->persist($u);
}
$em->flush();
With the removeGroup function being defined in the default FOSUserBundle:
public function removeGroup(GroupInterface $group)
{
    if ($this->getGroups()->contains($group)) {
        $this->getGroups()->removeElement($group);
    }
    return $this;
}
And groups being defined like this:
/**
 * @ORM\ManyToMany(targetEntity="Something\AdminBundle\Entity\Group", inversedBy="users", cascade={"persist"})
 * @ORM\JoinTable(name="user_group",
 *     joinColumns={@ORM\JoinColumn(name="user_id", referencedColumnName="id")},
 *     inverseJoinColumns={@ORM\JoinColumn(name="group_id", referencedColumnName="id")}
 * )
 */
protected $groups;
(user_group table just has 2 columns, user ids coupled to group ids, user can only be in a single group at a time)
My current code is listed here (point 3) as an antipattern, although I wouldn't know how to implement his solution in my case.
What I do not understand: according to the Symfony docs, the actual queries should only be executed once $em->flush() is called.
Then why does the profiler still show 1000 queries during the foreach loop?
How could I remedy this?
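One possible direction, sketched under assumptions: since user_group is a plain join table with user_id and group_id columns, the rows could be removed with a single DELETE ... WHERE user_id IN (...) statement instead of one collection change per user. A likely cause of the per-user queries (an assumption, not verified against this code base) is that contains() initializes each user's lazily loaded groups collection, which costs one SELECT per user before flush() even runs. The helper name buildBulkGroupRemoval is hypothetical; it only builds the SQL string and its parameter list and does not touch Doctrine.

```php
<?php
/**
 * Hypothetical helper (not part of Doctrine or FOSUserBundle): builds one
 * DELETE statement that detaches a group from many users at once.
 *
 * @param int[] $userIds ids of the users that should lose the group
 * @param int   $groupId id of the group to remove
 *
 * @return array [string $sql, int[] $params] using positional placeholders
 */
function buildBulkGroupRemoval(array $userIds, int $groupId): array
{
    if ($userIds === []) {
        throw new InvalidArgumentException('At least one user id is required.');
    }

    // One "?" per user id, e.g. "?, ?, ?" for three users.
    $placeholders = implode(', ', array_fill(0, count($userIds), '?'));

    $sql = "DELETE FROM user_group WHERE group_id = ? AND user_id IN ($placeholders)";

    // Parameter order must match placeholder order: group id first, then user ids.
    $params = array_merge([$groupId], array_values(array_map('intval', $userIds)));

    return [$sql, $params];
}

[$sql, $params] = buildBulkGroupRemoval([3, 7, 9], 42);
echo $sql, PHP_EOL;
```

The resulting pair could then be handed to the DBAL connection, for example $em->getConnection()->executeStatement($sql, $params) on recent DBAL versions (executeUpdate on older ones). User entities already loaded in the EntityManager would not reflect the change until they are refreshed.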
I'm wondering if anything like the following already exists for Laravel? It's a trait I wrote called CarefulCount.
What it does: It returns a count of related models (using any defined relation), but only hits the DB if absolutely necessary. First, it tries two options to avoid hitting the DB, if the information is already available:
Was the count retrieved using withCount('relation') when the model was retrieved - i.e. does $model->relation_count exist? If so, just return that.
Has the relation been eager-loaded? If so, count the models in the Eloquent collection without hitting the DB, with $model->relation->count().
Only then resort to calling $model->relation()->count() to retrieve the count from the DB.
To enable it for any model class, you simply need to include the trait with use CarefulCount. You can then call $model->carefulCount('relation') for any defined relation.
For example, in my application there is a suburbs table with a has-many relation to both the users table and churches tables (i.e. there can be many users and many churches in a single suburb). Simply by adding use CarefulCount to the Suburb model, I can then call both $suburb->carefulCount('users') and $suburb->carefulCount('churches').
My use case: I've come across this a number of times - where I need a count of related models, but it's in a lower-level part of my application that may be called from several places. So I can't know how the model was retrieved and whether the count information is already there.
In those situations, the default would be to call $model->relation()->count(). But this can lead to the N+1 query problem.
In fact, the specific trigger came from adding Marcel Pociot's excellent Laravel N+1 Query Detector package to my project. It turned up a number of N+1 query problems that I hadn't picked up, and most were cases when I had already eager-loaded the related models. But in my Blade templates, I use Policies to enable or disable deleting of records; and the delete($user, $suburb) method of my SuburbPolicy class included this:
return $suburb->users()->count() == 0 && $suburb->churches()->count() == 0;
This introduced the N+1 problem - and obviously I can't assume, in my Policy class (or my Model class itself), that the users and churches are eager-loaded. But with the CarefulCount trait added, that became:
return $suburb->carefulCount('users') == 0 && $suburb->carefulCount('churches') == 0;
Voila! Tinkering with this and checking the query log, it works. For example, with the users count:
If $suburb was retrieved using Suburb::withCount('users'), no extra query is executed.
Similarly, if it was retrieved using Suburb::with('users'), no extra query is executed.
If neither of the above were done, then there is a select count(*) query executed to retrieve the count.
As I said, I'd love to know whether something like this already exists and I haven't found it (either in the core or in a package) - or whether I've plain missed something obvious.
Here's the code for my trait:
use Illuminate\Support\Str;

trait CarefulCount
{
    /**
     * Implements a careful and efficient count algorithm for the given
     * relation, only hitting the DB if necessary.
     *
     * @param string $relation
     *
     * @return integer
     */
    public function carefulCount(string $relation): int
    {
        /*
         * If the count has already been loaded using withCount('relation'),
         * use the 'relation_count' property.
         */
        $prop = Str::snake($relation) . "_count";
        if (isset($this->$prop)) {
            return $this->$prop;
        }

        /*
         * If the related models have already been eager-loaded using
         * with('relation'), count the loaded collection.
         */
        if ($this->relationLoaded($relation)) {
            return $this->$relation->count();
        }

        /*
         * Neither loaded, so hit the database.
         */
        return $this->$relation()->count();
    }
}
I am currently building a very specific solution for an existing application written in Laravel. The solution executes queries in C++, modifies data, does sorting and returns the results. This C++ program is loaded via a PHP extension and exposes a single method to handle this logic.
The method provided by the extension should be implemented in Laravel using Eloquent. I've been looking at the source code for ages to find the specific method(s) that execute the queries built with Eloquent's Builder.
Where can I find the methods that actually perform the queries?
Why C++, I hear you think? The queries should be executed on multiple schemas (and/or databases) over multiple threads for improved performance. At the moment 100+ schemas are in use, each containing thousands of records per table.
After a lot of troubleshooting and testing I have found a solution to my problem. In the class Illuminate\Database\Query\Builder you can find a method called runSelect(). This method runs a select statement against the given connection and returns the selected rows as an array.
/**
 * Run the query as a "select" statement against the connection.
 *
 * @return array
 */
protected function runSelect()
{
    return $this->connection->select(
        $this->toSql(), $this->getBindings(), ! $this->useWritePdo
    );
}
To test my C++ implementation for running the selects, I mapped the return values of $this->getBindings() to a new array (to apply some necessary modifications to some strings) and did a simple str_replace_array on the prepared statement to get the full query. Eventually the C++ program will execute the prepared statement itself rather than the fully substituted query.
The modified method for my case looks like this. It is quick and dirty for now, just to test whether this is possible, but you get the idea. It works like a charm, except for the count() method in Eloquent.
/**
 * Run the query as a "select" statement against the connection.
 *
 * @return array
 */
protected function runSelect()
{
    $bindings = [];
    foreach ($this->getBindings() as $key => $value) {
        $bindings[] = $value; // Some other logic to manipulate strings will be added.
    }
    $query = str_replace_array('?', $bindings, $this->toSql());
    $schemas = ['schema1', 'schema2', 'schema3', 'schema4']; // Will be fetched from master DB.
    return runSelectCpp($schemas, $query);
}
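Note that the str_replace_array step inserts binding values verbatim, so string bindings end up unquoted in the final query. A minimal sketch of a quoting variant follows, meant for debugging or testing only. inlineBindings is a hypothetical helper, not a Laravel function, and it assumes that no literal ? appears inside the SQL text itself.

```php
<?php
/**
 * Hypothetical debugging helper: replaces each "?" placeholder, in order,
 * with a quoted literal built from the matching binding.
 * Not a substitute for real prepared statements.
 */
function inlineBindings(string $sql, array $bindings): string
{
    $bindings = array_values($bindings);   // force 0-based positional keys
    $parts = explode('?', $sql);           // text between placeholders
    $result = array_shift($parts);         // everything before the first "?"

    foreach ($parts as $i => $part) {
        if (!array_key_exists($i, $bindings)) {
            $result .= '?' . $part;        // no binding left: keep the placeholder
            continue;
        }

        $value = $bindings[$i];
        if ($value === null) {
            $literal = 'NULL';
        } elseif (is_bool($value)) {
            $literal = $value ? '1' : '0';
        } elseif (is_int($value) || is_float($value)) {
            $literal = (string) $value;
        } else {
            // Standard SQL escaping: double any single quote inside the string.
            $literal = "'" . str_replace("'", "''", (string) $value) . "'";
        }

        $result .= $literal . $part;
    }

    return $result;
}

echo inlineBindings('select * from users where id = ? and name = ?', [5, "O'Brien"]), PHP_EOL;
// prints: select * from users where id = 5 and name = 'O''Brien'
```

Passing the prepared statement and the bindings separately to the extension, as planned above, avoids this quoting problem entirely.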
I'm stuck trying to reduce the number of database queries on a web api.
My database has 3 collections : playground, widget, token
One playground has many widgets, and one widget has one token. Each relationship uses ReferenceOne/ReferenceMany.
So here are my simplified models
/**
 * @MongoDB\Document()
 */
class Widget
{
    /**
     * @MongoDB\ReferenceOne(targetDocument="Token", inversedBy="widgets")
     */
    protected $token;

    /**
     * @MongoDB\ReferenceOne(targetDocument="Playground", inversedBy="widgets")
     */
    protected $playground;
}

/**
 * @MongoDB\Document()
 */
class Playground
{
    /**
     * @MongoDB\ReferenceMany(targetDocument="Widget", mappedBy="playground")
     */
    protected $widgets;
}

/**
 * @MongoDB\Document()
 */
class Token
{
    /**
     * @MongoDB\ReferenceMany(targetDocument="Widget", mappedBy="token")
     */
    protected $widgets;
}
I need to use the full playground with all its widgets and tokens, but by default Doctrine issues too many queries: one to get the playground (ok), one to get all widgets of the mapping (ok) and, for each widget, one query to get the token (not ok). Is there a way to query all tokens at once instead of getting them one by one?
I've looked at prime but it does not seem to solve my problem...
Is there a way to reduce the query count other than using the query builder and manually hydrating all objects?
Edit :
As I added in my comment, what I'm looking for is get the playground and all its dependencies as a big object, json encode it and return it into the response.
What I do for now is query the playground and encode it, but Doctrine populates the dependencies inefficiently: first there is the query to get the playground, then one more query to get the related widgets, and then one query for each widget to get its token.
As one playground can have hundreds of widgets, this leads to hundreds of database queries.
What I'm looking for is a way to tell Doctrine to get all this data using only 3 queries (one to get the playground, one to get the widgets and one to get the tokens).
update: since the ArrayCollection in the $playground should contain all the widgets at least as proxy objects (or should get loaded when accessed), the following should work to fetch all required tokens...
Since the document manager keeps all managed objects, it should prevent additional queries from occurring. (Notice the omitted assignment from the execute).
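// Token does not store its widgets (mappedBy), so the tokens are queried by id; getToken()/getId() are assumed getters.
$tokenIds = array_map(function ($widget) { return $widget->getToken()->getId(); }, $playground->getWidgets()->toArray());
$dm->createQueryBuilder('Token')->field('id')->in($tokenIds)->getQuery()->execute()->toArray();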
inspired by this page on how to avoid doctrine orm traps - point 5
old/original answer
I'm not quite familiar with mongodb, tbh, but according to doctrine's priming references, you should be able to somewhat comfortably hydrate your playground by:
$qb = $dm->createQueryBuilder('Widget')->field('playground')->references($playground);
$qb->field('token')->prime(true);
$widgets = $qb->getQuery()->execute();
however, I might be so wrong.
What about that:
class Foo
{
    /** @var \Doctrine\ORM\EntityManagerInterface */
    private $entityManager;

    public function __construct(\Doctrine\ORM\EntityManagerInterface $entityManager)
    {
        $this->entityManager = $entityManager;
    }

    public function getAll(): array
    {
        $qb = $this->entityManager->createQueryBuilder();

        // Fetch-join along the mapped associations so widgets and tokens load in the same query.
        return $qb
            ->select('p, w, t')
            ->from(Playground::class, 'p')
            ->leftJoin('p.widgets', 'w')
            ->leftJoin('w.token', 't')
            ->getQuery()
            ->getResult();
    }
}
At least with a MySQL backend, this solves the "N+1 problem". Not sure about MongoDB, though.
I'm working on a project that uses Doctrine 2 in Symfony 2, and I use Memcache to store Doctrine's results.
I have a problem with objects that are retrieved from Memcache.
I found this similar post, but its approach does not resolve my problem: Doctrine detaching, caching, and merging
This is the scenario
/**
 * This is in entity ContestRegistry
 * @var contest
 *
 * @ORM\ManyToOne(targetEntity="Contest", inversedBy="usersRegistered")
 * @ORM\JoinColumn(name="contest_id", referencedColumnName="id", onDelete="CASCADE")
 *
 */
protected $contest;
and in other entity
/**
 * @var usersRegistered
 *
 * @ORM\OneToMany(targetEntity="ContestRegistry", mappedBy="contest")
 *
 */
protected $usersRegistered;
Now imagine that Contest is in cache and I want to save a ContestRegistry entry.
So I retrieve the object contest in cache as follows:
$contest = $cacheDriver->fetch($key);
$contest = $this->getEntityManager()->merge($contest);
return $contest;
And as last operation I do:
$contestRegistry = new ContestRegistry();
$contestRegistry->setContest($contest);
$this->entityManager->persist($contestRegistry);
$this->entityManager->flush();
My problem is that Doctrine saves the new entity correctly, but it also issues an update on the Contest entity and updates its updated column. The real problem is that it runs an update query for every entry, when I just want to add a reference to the entity.
How can I make this possible?
Any help would be appreciated.
Why
When an entity is merged back into the EntityManager, it will be marked as dirty. This means that when a flush is performed, the entity will be updated in the database. This seems reasonable to me, because when you make an entity managed, you actually want the EntityManager to manage it ;)
In your case you only need the entity for an association with another entity, so you don't really need it to be managed. I therefore suggest a different approach.
Use a reference
So don't merge $contest back into the EntityManager, but grab a reference to it:
$contest = $cacheDriver->fetch($key);
$contestRef = $em->getReference('Contest', $contest->getId());
$contestRegistry = new ContestRegistry();
$contestRegistry->setContest($contestRef);
$em->persist($contestRegistry);
$em->flush();
That reference will be a Proxy (unless it's already managed), and won't be loaded from the db at all (not even when flushing the EntityManager).
Result Cache
Instead of using your own caching mechanism, you could use Doctrine's result cache. It caches the query results in order to prevent a trip to the database, but (if I'm not mistaken) still hydrates those results. This prevents a lot of the issues you can get when caching the entities themselves.
What you want to achieve is called partial update.
You should use something like this instead
// Requires: use Symfony\Component\PropertyAccess\PropertyAccess;

/**
 * Partially updates an entity
 *
 * @param Object $entity The entity to update
 * @param Request $request
 */
protected function partialUpdate($entity, $request)
{
    $parameters = $request->request->all();
    $accessor = PropertyAccess::createPropertyAccessor();
    foreach ($parameters as $key => $parameter) {
        // Only the properties present in the request are written to the entity.
        $accessor->setValue($entity, $key, $parameter);
    }
}
Merge requires the whole entity to be 100% populated with data.
I haven't checked the behavior with children (many to one, one to one, and so on) relations yet.
Partial update is usually used on PATCH (or PUT) on a Rest API.
Is it possible to apply the remember(60) function to something like Service::all()?
This is a data set that will rarely change. I've attempted several variations with no success:
Service::all()->remember(60);
Service::all()->remember(60)->get();
(Service::all())->remember(60);
Of course, I am aware of other caching methods available, but I prefer the cleanliness of this method if available.
Yes, you should be able to simply swap the two such as
Change
Service::get()->remember(60);
to
Service::remember(60)->get();
An odd quirk, I agree, but I ran into this a few weeks back and realized all I had to do was put remember($time_to_remember) in front of the rest of the query builder chain. After that it works like a charm.
For your perusing pleasure: See the Laravel 4 Query Builder Docs Here
/**
 * Indicate that the query results should be cached.
 *
 * @param int $minutes
 * @param string $key
 * @return \Illuminate\Database\Query\Builder
 */
public function remember($minutes, $key = null)
{
    list($this->cacheMinutes, $this->cacheKey) = array($minutes, $key);
    return $this;
}
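Why the order matters can be illustrated with a simplified stand-in. remember() only records settings on the builder and returns the builder itself, while get() ends the chain and returns the results, so nothing chained after get() can reach the builder any more. The FakeBuilder class below is hypothetical and is not Laravel's implementation; it only mimics the shape of the chain.

```php
<?php
class FakeBuilder
{
    /** @var int|null minutes to cache; null means no caching was requested */
    public $cacheMinutes = null;

    /** Records the cache duration and returns the builder so the chain can continue. */
    public function remember($minutes)
    {
        $this->cacheMinutes = $minutes;
        return $this;
    }

    /** Ends the chain: returns plain results, not a builder. */
    public function get()
    {
        $note = $this->cacheMinutes === null
            ? 'uncached'
            : "cached for {$this->cacheMinutes} min";

        return ["row 1 ($note)", "row 2 ($note)"];
    }
}

$builder = new FakeBuilder();
$rows = $builder->remember(60)->get();   // works: remember() returned the builder
print_r($rows);

// $builder->get()->remember(60) would fail here, because get() returned an
// array and arrays have no remember() method.
```

The same reasoning applies to Service::all(): it already returns the loaded results, so there is no builder left on which remember() could be called.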
L4 Docs - Queries