How can I reduce the number of queries Doctrine is performing? - php

I'm building a product management tool where the product can have an arbitrary number of attributes, documents, features, images, videos as well as a single type, brand, and category. There are a few other related tables, but this is enough to demonstrate the problem.
There's a Model class called ProductModel that contains a method like this (reduced for clarity):
public function loadValues() {
//Product entity data
$this->id = $this->entity->getId();
$this->slug = $this->entity->getSlug();
// One of each of these
$this->loadType();
$this->loadBrand();
$this->loadCategory();
// Arbitrary number of each of these
$this->loadAttributes();
$this->loadDocuments();
$this->loadFeatures();
$this->loadImages();
$this->loadVideos();
...
}
Each of the load methods does some boilerplate that eventually executes this method:
public function loadEntitiesByProductId($productId=0) {
// Get all the entities of this type that are associated with the product.
$entities = $this->entityManager
->getRepository($this->entityName)
->findByProduct($productId);
$instances = array();
// Create a Model for each entity and load the data.
foreach ($entities as $entity) {
$id = $entity->getId();
$instances[$id] = new $this->childClass();
$instances[$id]->entity = $entity;
$instances[$id]->loadValues();
}
return $instances;
}
This is OK for cases where the related entity is a single table, but usually it's a mapper. In those cases, I get all the mapper entities in the first query then I have to query for the related entity within the loadValues() method (via Doctrine's get<Entity>() method). The result of this process is a huge number of queries (often >100). I need to get rid of the extraneous queries, but I'd like to do so without losing the idioms I'm using across my data models.
Is there a way to get the entityManager to do a better job at using joins to group these queries?

There were a couple problems with my previous approach:
First, I was getting the entities from the repository instead of loading them from the existing entity:
$entities = $this->entityManager
->getRepository($this->entityName)
->findByProduct($productId);
Better is:
$method = $this->deriveGetMethod($this->entityName);
$entities = $productEntity->$method();
Second, I was retrieving the product entity using $this->entityManager->getRepository... which works fine for loading small data sets (a single table or one or two relations), but there's no way to get the repository's findBy methods to load relations in a single query. The solution is to use the QueryBuilder:
$qb = $this->entityManager->createQueryBuilder();
$query = $qb->select('product',/*related tables*/)->/*joins etc.*/
$productEntity = $query->getSingleResult();
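To make the difference in query counts concrete, here is a minimal sketch in plain PHP. It does not use Doctrine at all: FakeDb, its methods, and the sample data are hypothetical stand-ins that only count how many "queries" each loading pattern would issue.

```php
<?php
// Hypothetical stand-in for a database: it only counts the "queries" issued.
final class FakeDb
{
    public int $queryCount = 0;

    // Mapper rows linking a product to its attributes (sample data).
    private array $productAttributes = [
        ['id' => 1, 'product_id' => 10, 'attribute_id' => 100],
        ['id' => 2, 'product_id' => 10, 'attribute_id' => 101],
        ['id' => 3, 'product_id' => 10, 'attribute_id' => 102],
    ];

    // Related rows, keyed by attribute id (sample data).
    private array $attributes = [100 => 'color', 101 => 'weight', 102 => 'size'];

    // One query: all mapper rows for a product.
    public function findMapperRows(int $productId): array
    {
        $this->queryCount++;
        return array_values(array_filter(
            $this->productAttributes,
            fn (array $row): bool => $row['product_id'] === $productId
        ));
    }

    // One query per call: a single related row, as a lazy getter would trigger.
    public function findAttribute(int $attributeId): string
    {
        $this->queryCount++;
        return $this->attributes[$attributeId];
    }

    // One query: mapper rows already joined to their related rows.
    public function findMapperRowsJoined(int $productId): array
    {
        $this->queryCount++;
        $rows = [];
        foreach ($this->productAttributes as $row) {
            if ($row['product_id'] === $productId) {
                $rows[] = $row + ['attribute' => $this->attributes[$row['attribute_id']]];
            }
        }
        return $rows;
    }
}

// Lazy pattern: 1 query for the mapper rows, then 1 more per related row.
function loadLazily(FakeDb $db, int $productId): array
{
    $names = [];
    foreach ($db->findMapperRows($productId) as $row) {
        $names[] = $db->findAttribute($row['attribute_id']);
    }
    return $names;
}

// Joined pattern: everything arrives in a single query.
function loadJoined(FakeDb $db, int $productId): array
{
    return array_column($db->findMapperRowsJoined($productId), 'attribute');
}
```

With three mapper rows the lazy pattern issues 1 + 3 queries while the joined pattern issues 1, and the gap grows with every relation type and every related row.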

Related

Advanced Laravel merged data/models - can it be done at model level?

We have a COMMON database and then tenant databases for each organization that uses our application. We have base values in the COMMON database for some tables e.g.
COMMON.widgets. Then in the tenant databases, IF a table called modified_widgets exists and has values, they are merged with the COMMON.widgets table.
Right now we are doing this in controllers along the lines of:
public function index(Request $request)
{
$widgets = Widget::where('active', '1')->orderBy('name')->get();
if(Schema::connection('tenant')->hasTable('modified_widgets')) {
$modified = ModifiedWidget::where('active', '1')->get();
$merged = $widgets->merge($modified);
$merged = array_values(array_sort($merged, function ($value) {
return $value['name'];
}));
return $merged;
}
return $widgets;
}
As you can see, we have model for each table and this works OK. We get the expected results for GET requests like this from controllers, but we'd like to merge at the Laravel MODEL level if possible. That way id's are linked to the correct tables and such when populating forms with these values. The merge means the same id can exist in BOTH tables. We ALWAYS want to act on the merged data if any exists. So it seems like model level is the place for this, but we'll try any suggestions that help meet the need. Hope that all makes sense.
Can anyone help with this or does anyone have any ideas to try? We've played with overriding model constructors and such, but haven't quite been able to figure this out yet. Any thoughts are appreciated and TIA!
If you put this functionality in the Widget model, you will roughly double the number of queries. Think of a Widget as a single instance: the current approach runs 2 queries at minimum, plus 1 more if the tenant has a modified_widgets table. If you do this inside the model instead, each Widget instance will, in the best case, pull its equivalent from the other database. For a bunch of Widgets that means 1 (->all()) + n queries (n = number of ModifiedWidgets), because each Widget instance pulls its own mirror if it exists, and no eager loading is possible.
You can improve your code with the following:
$widgets = Widget::where('active', '1')->orderBy('name')->get();
if(Schema::connection('tenant')->hasTable('modified_widgets')) {
$modified = ModifiedWidget::where('active', '1')->whereIn('id', $widgets->pluck('id'))->get(); // remove whereIn if thats not the case
return $widgets->merge($modified)->unique()->sortBy('name');
}
return $widgets;
OK, here is what we came up with.
We now use a single model and the table names MUST be the same in both databases (setTable does not seem to work even though it exists in the Database/Eloquent/Model base source code; that may be why it's not documented). Anyway, just use a regular model and make sure the tables are identical (or at least the fields you are using are):
<?php
namespace App\Models;
use Illuminate\Database\Eloquent\Model;
class Widget extends Model
{
}
Then we have a generic 'merge controller' where the model and optional sort are passed in the request (we hard coded the 'where' and key here, but they could be made dynamic too). NOTE THIS WILL NOT WORK WITH STATIC METHODS THAT CREATE NEW INSTANCES such as $model::all() so you need to use $model->get() in that case:
<?php
namespace App\Http\Controllers;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\Config;
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Schema;
class MergeController extends Controller
{
public function index(Request $request)
{
//TODO: add some validations to ensure model is provided
$model = app("App\\Models\\{$request['model']}");
$sort = $request['sort'] ? $request['sort'] : 'id';
$src_collection = $model->where('active', '1')->orderBy('name')->get();
// we setup the tenants connection elsewhere, but use it here
if(Schema::connection('tenant')->hasTable($model->getTable())) {
$model->setConnection('tenant');
$tenant_collection = $model->get()->where('active', '1');
$src_collection = $src_collection->keyBy('id')->merge($tenant_collection->keyBy('id'))->sortBy('name');
}
return $src_collection;
}
}
If you dd($src_collection); before returning it, you will see the connection is correct for each row (depending on data in the tables). If you update a row:
$test = $src_collection->find(2); // this is a row from the tenant db in our data
$test->name = 'Test';
$test->save();
$test2 = $src_collection->find(1); // this is a row from the COMMON db in our data
$test2->name = 'Test2';
$test2->save();
dd($src_collection);
You will see the correct data is updated no matter which table the row(s) came from.
This results in each tenant being able to optionally override and/or add to base table data without affecting the base table data itself or other tenants, while minimizing data duplication and thus easing maintenance (obviously the table data and population is managed elsewhere just like any other table). If the tenant has no overrides then the base table data is returned. The merge and custom collection stuff have minimal documentation, so this took some time to figure out. Hope this helps someone else some day!
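The override-and-add behavior described above can be sketched with plain PHP arrays instead of Eloquent collections. This is only an illustration of the merge semantics: mergeTenantOverrides is a hypothetical helper, not part of Laravel, and it assumes every row carries an id and a name.

```php
<?php
/**
 * Merge base rows with tenant rows: a tenant row replaces the base row with
 * the same id, tenant-only rows are added, and the result is sorted by name.
 *
 * @param array<int, array{id:int, name:string}> $baseRows
 * @param array<int, array{id:int, name:string}> $tenantRows
 * @return array<int, array{id:int, name:string}> rows keyed by id
 */
function mergeTenantOverrides(array $baseRows, array $tenantRows): array
{
    // Key both lists by id so rows with the same id collide.
    $base = array_column($baseRows, null, 'id');
    $tenant = array_column($tenantRows, null, 'id');

    // array_replace() keeps integer keys, so tenant rows overwrite base rows.
    $merged = array_replace($base, $tenant);

    // uasort() preserves the id keys while ordering by name.
    uasort($merged, fn (array $a, array $b): int => strcmp($a['name'], $b['name']));

    return $merged;
}
```

If the tenant list is empty, the base rows come back unchanged apart from the sort, which matches the "no overrides" case.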

Query Builder inside entity - Doctrine

I'm looking for a better way to write this function.
It's inside a Doctrine Entity Model
public function getCompanySubscriptions()
{
foreach ($this->subscriptions as $key => $value) {
if ($value->getPlan()->getType() == 'E' && $value->getActive()) {
return $value;
}
}
return null;
}
$this->subscriptions is a many-to-one collection and can have different "types" (but only one of them with type "E").
Problem is: If the Company has too many $subscriptions this function will be too slow to return that only one of type "E" which I need to check when building the view with TWIG. The solution would be to use a QueryBuilder, but I haven't found a way to use it directly from the entity model.
You cannot use a QueryBuilder inside your entity, but instead you can use doctrine Criteria for filtering collections (on SQL level). Check the documentation chapter 8.8. Filtering Collections for more details on Criteria.
If the collection has not been loaded from the database yet, the filtering API can work on the SQL level to make optimized access to large collections.
For example to only get active subscriptions:
$subscriptions = $this->subscriptions;
$criteria = Criteria::create()
->where(Criteria::expr()->eq("active", true));
$subscriptions = $subscriptions->matching($criteria);
Like that you can solve your performance issues, since the collection is loaded from the database using the conditions from the Criteria directly.
The problem in your case might be that you need to join on Plan, but joining is not possible in a Criteria. So if joining is really necessary then you should consider using a custom query where you do the join with conditions in your company EntityRepository (for example with a QueryBuilder).
Note.
The foreach in your question can be rewritten using the filter method from the ArrayCollection class. The filter takes a predicate and all elements satisfying the predicate will be returned.
Look also here in the doctrine 2 class documentation for more info.
Your predicate would look something like:
$predicate = function($subscription){
return $subscription->getPlan()->getType() == 'E' && $subscription->getActive();
};
and then:
return $this->subscriptions->filter($predicate);
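Since filter() returns a collection rather than a single element, the caller still has to pick the first match to get the behavior of the original method. A minimal sketch of that idea in plain PHP, using array_filter() as a stand-in for the collection's filter method; Plan, Subscription, and findActiveTypeE are hypothetical stand-ins, not the asker's real classes:

```php
<?php
// Hypothetical minimal stand-ins for the Plan and Subscription entities.
final class Plan
{
    private string $type;

    public function __construct(string $type)
    {
        $this->type = $type;
    }

    public function getType(): string
    {
        return $this->type;
    }
}

final class Subscription
{
    private Plan $plan;
    private bool $active;

    public function __construct(Plan $plan, bool $active)
    {
        $this->plan = $plan;
        $this->active = $active;
    }

    public function getPlan(): Plan
    {
        return $this->plan;
    }

    public function getActive(): bool
    {
        return $this->active;
    }
}

/**
 * Return the first active subscription whose plan has type 'E', or null.
 *
 * @param Subscription[] $subscriptions
 */
function findActiveTypeE(array $subscriptions): ?Subscription
{
    $matches = array_filter(
        $subscriptions,
        fn (Subscription $s): bool => $s->getActive() && $s->getPlan()->getType() === 'E'
    );

    // array_filter() keeps the original keys, so reset() is used to take the
    // first match; it returns false when nothing matched.
    $first = reset($matches);

    return $first === false ? null : $first;
}
```

Note that this still walks the whole list in PHP, so it tidies the code but does not address the performance concern for large collections.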

Advanced Filtering of Associated Entity Collection in Symfony

If I have an associated entity which is a collection, what options do I have when fetching it?
e.g. Let's say I have a $view entity with this definition inside it:
/**
* @ORM\OneToMany(targetEntity="\Gutensite\CmsBundle\Entity\View\ViewVersion", mappedBy="entity")
* @ORM\OrderBy({"timeMod" = "DESC"})
*/
protected $versions;
public function getVersions() {
return $this->versions;
}
And I want to get all the versions associated with the entity like this:
$view->getVersions();
This will return a collection. Great. But is it possible to take that collection and filter it by criteria, e.g. newer than a certain date? Or order it by some (other) criteria?
Or at this point are you just expected to do a query on the repository:
$versions = $em->getRepository("GutensiteCmsBundle:View\ViewVersion")->findBy(array(
array(
'viewId', $view->getId(),
'timeMod', time()-3600
)
// order by
array('timeMod', 'DESC')
));
There is a surprisingly little-known feature in recent versions of Doctrine which makes this sort of query much easier.
It doesn't seem to have a name, but you can read about it in the Doctrine docs at 9.8 Filtering Collections.
Collections have a filtering API that allows to slice parts of data from a collection. If the collection has not been loaded from the database yet, the filtering API can work on the SQL level to make optimized access to large collections.
In your case you could write a method like this on your View entity.
use Doctrine\Common\Collections\Criteria;
class View {
// ...
public function getVersionsNewerThan(\DateTime $time) {
$newerCriteria = Criteria::create()
->where(Criteria::expr()->gt("timeMod", $time));
return $this->getVersions()->matching($newerCriteria);
}
}
This will do one of two things:
If the collection is hydrated, it will use PHP to filter the existing collection.
If the collection is not hydrated, it will fetch a partial collection from the database using SQL constraints.
Which is really great, because hooking up repository methods to your views is usually messy and prone to break.
I also like @igor-pantovic's answer, although I've seen the method cause some funny bugs.
I would personally avoid using order by on annotation directly. Yes, you are supposed to do a query, just as you would if you were using raw SQL without Doctrine at all.
However, I wouldn't do it at that point but even before. In your specific case I would create a ViewRepository class:
class ViewRepository extends EntityRepository
{
public function findWithVersionsNewerThan($id, \DateTime $time)
{
return $this->createQueryBuilder('view')
->addSelect('version')
->join('view.versions', 'version')
->where('view.id = :id')
->andWhere('version.timeMod > :time')
->setParameter('time', $time)
->setParameter('id', $id)
->getQuery()
->getOneOrNullResult();
}
}
Now you can do:
$yourDateTime = // Calculate it here ... ;
$view = $em->getRepository("GutensiteCmsBundle:View\ViewVersion")->findWithVersionsNewerThan($id, $yourDateTime);
$versions = $view->getVersions(); // Will only contain versions newer than datetime provided
I'm writing code from the top of my head here directly so sorry if some syntax or method naming error sneaked in.

What is the common practice for doing OO with a DB?

Here is the situation.
I have a DBManager, which implements a DBInterface. The DBInterface has 4 methods:
-create(DBCmd);
-read(DBCmd);
-update(DBCmd);
-delete(DBCmd);
The DBCmd object is responsible for generating the SQL statement, and it requires an object to build that statement from:
class DBCmd{
public function __construct($aObj){
}
public function executeCreate(){
}
public function executeRead(){
}
public function executeUpdate(){
}
public function executeDelete(){
}
}
The flow will be like this:
aObject ---> put it into DBCmd ----> put the DBCmd in DBManager ---> execute
But problems appear when an object is related to other tables. For example, a customer has purchase records, and each purchase record has many items.
So what should I do in my read method? Should I read all the records related to the customer? Do I need to loop over all the items inside each purchase record too?
If yes, then reading a customer means querying 3 tables, even when some of that data is not needed, which wastes resources.
So I came up with another solution: a new set of DBCmd classes that let me get the related DB items, for example:
class getReleatedPurchaseRecordDBCmd{
public function __construct($aCustomerObject){
}
//.... ....
}
But this "solution" has a problem: I lose the relationship in the customer object. Yes, I can read back all the records, but the customer object basically doesn't know anything about its purchase records.
Some may suggest doing something like this:
class customer{
//skip other methods...
public function getPurchaseRecords(){
//query the db
}
}
It works, but I don't want the object structure to be strongly tied to the db. That's why I came up with the DBCmd stuff.
So everything seems to be very tightly coupled. How can I solve it? Thank you.
For stuff like this I tend to get the count of sub-objects with the initial query, usually involving SQL COUNT and JOIN, then have a separate getSubObjects command that can be called later if needed. So for example:
$datamodel->getCustomer($id);//or some such method
returns
class Customer{
public $id = 4;
public $recordCount = 5;
public $records = null;
}
I can then use the count for any display stuff as needed, and if I need the records populated, call:
$customer->records = $datamodel->getCustomerRecords($customer->id);
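A minimal sketch of this count-first, load-later approach in plain PHP. The DataModel here is a hypothetical class backed by an in-memory array rather than SQL, so the COUNT/JOIN query is only simulated; the method names follow the snippets above.

```php
<?php
final class Customer
{
    public int $id;
    public int $recordCount;

    /** @var array|null stays null until the records are explicitly loaded */
    public ?array $records = null;

    public function __construct(int $id, int $recordCount)
    {
        $this->id = $id;
        $this->recordCount = $recordCount;
    }
}

// Hypothetical data model; a real one would run SQL instead of reading an array.
final class DataModel
{
    /** @var array<int, array> purchase records grouped by customer id */
    private array $recordsByCustomer;

    public function __construct(array $recordsByCustomer)
    {
        $this->recordsByCustomer = $recordsByCustomer;
    }

    // Stands in for a SELECT using COUNT and JOIN: the customer plus a count only.
    public function getCustomer(int $id): Customer
    {
        return new Customer($id, count($this->recordsByCustomer[$id] ?? []));
    }

    // Stands in for the second, optional query that fetches the actual records.
    public function getCustomerRecords(int $id): array
    {
        return $this->recordsByCustomer[$id] ?? [];
    }
}
```

The customer object stays free of database code; the caller decides whether the second lookup is worth making.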

Zend: Two Objects, one Row

I've recently started using Zend Framework (1.8.4), to provide admin tools for viewing the orders of a shopping cart site.
What I'd like to do is to efficiently create multiple model (Zend_Db_Table_Row_Abstract) objects from a single database result row.
The relationship is simple:
an Order has one Customer (foreign key order_custid=customer.cust_id);
a Customer has many Orders.
Loading the orders is easy enough. Using the method documented here:
Modeling objects with multiple table relationships in Zend Framework
...I could then grab the customer for each.
foreach ($orderList as $o)
{
$cust = $o->findParentRow('Customers');
print_r ($cust); // works as expected.
}
But when you're loading a long list of orders - say, 40 or more, a pageful - this is painfully slow.
Next I tried a JOIN:
$custTable = new Customers();
$orderTable = new Orders();
$orderQuery = $orderTable->select()
->setIntegrityCheck(false) // allows joins
->from($orderTable)
->join('customers', 'cust_id=order_custid')
->where("order_status=?", 1); //incoming orders only.
$orders = $orderTable->fetchAll($orderQuery);
This gives me an array of order objects. print_r($orders) shows that each of them contains the column list I expect, in a protected member, with raw field names order_* and cust_*.
But how to create a Customer object from the cust_* fields that I find in each of those Order objects?
foreach ($orders as $o) {
$cols = $o->toArray();
print_r ($cols); // looks good, has cust_* fields...
$cust = new Customer(array( 'table' => 'Customer', 'data' => $cols ) );
// fails - $cust->id, $cust->firstname, etc are empty
$cust->setFromArray($cols);
// complains about unknown 'order_' fields.
}
Is there any good way to create an Order and a Customer object simultaneously from the joined rows? Or must I run the query without the table gateway, get a raw result set, and copy each of the fields one-by-one into newly created objects?
Zend_Db doesn't provide convenience methods to do this.
Hypothetically, it'd be nifty to use a Facade pattern for rows that derive from multiple tables. The facade class would keep track of which columns belong to each respective table. When you set an individual field or a whole bunch of fields with the setFromArray() method, the facade would know how to map fields to the Row objects for each table, and apply UPDATE statements to the table(s) affected.
Alternatively, you could work around the problem of unknown fields by subclassing Zend_Db_Table_Row_Abstract, changing the __set() behavior to silently ignore unknown columns instead of throwing an exception.
You can't have an OO interface to do everything SQL can do. There must be some line in the sand where you decide a reasonable set of common cases have been covered, and anything more complex should be done with SQL.
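If you do end up copying fields by hand, one option is to split each joined row by column prefix before building the objects. The following is a sketch only: splitRowByPrefix is a hypothetical helper, not part of Zend_Db, and it assumes the order_ and cust_ column naming from the question.

```php
<?php
/**
 * Split one joined result row into per-table arrays based on column prefixes.
 *
 * @param array<string, mixed> $row      e.g. ['order_id' => 7, 'cust_id' => 3]
 * @param string[]             $prefixes e.g. ['order_', 'cust_']
 * @return array<string, array<string, mixed>> columns grouped by prefix
 */
function splitRowByPrefix(array $row, array $prefixes): array
{
    // Start with an empty group per prefix so every prefix is always present.
    $groups = array_fill_keys($prefixes, []);

    foreach ($row as $column => $value) {
        foreach ($prefixes as $prefix) {
            // The first matching prefix wins; columns matching none are dropped.
            if (strncmp($column, $prefix, strlen($prefix)) === 0) {
                $groups[$prefix][$column] = $value;
                break;
            }
        }
    }

    return $groups;
}
```

Each group could then be handed to the matching row object, for example through setFromArray(), without the unknown-field complaints, since it only contains that table's columns.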
I use this method to assign database row fields to objects. I use setter methods, but this could probably be also done with only properties on object.
public function setOptions(array $options){
$methods = get_class_methods($this);
foreach ($options as $key => $value) {
$method = 'set' . ucfirst($key);
if (in_array($method, $methods)) {
$this->$method($value);
}
}
return $this;
}
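A short usage sketch of the method above. CustomerModel and its setters are hypothetical; the point is that keys without a matching setter are skipped silently instead of raising an error.

```php
<?php
// Hypothetical model class with one setter per known field.
final class CustomerModel
{
    private ?int $id = null;
    private ?string $firstname = null;

    public function setId(int $id): void
    {
        $this->id = $id;
    }

    public function setFirstname(string $firstname): void
    {
        $this->firstname = $firstname;
    }

    public function getId(): ?int
    {
        return $this->id;
    }

    public function getFirstname(): ?string
    {
        return $this->firstname;
    }

    // Same logic as the method above: call set<Key>() only when it exists.
    public function setOptions(array $options): self
    {
        $methods = get_class_methods($this);
        foreach ($options as $key => $value) {
            $method = 'set' . ucfirst($key);
            if (in_array($method, $methods)) {
                $this->$method($value);
            }
        }
        return $this;
    }
}

// 'order_status' has no setter, so it is ignored rather than causing an error.
$customer = (new CustomerModel())->setOptions([
    'id' => 3,
    'firstname' => 'Ann',
    'order_status' => 1,
]);
```

Because the setter name is built with ucfirst(), a key such as first_name would look for setFirst_name() and be skipped, so the array keys need to match the setter names.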
