The structure I've got is as follows:
User table
id PK
name
email
username
password
UserHierarchy table
user_parent_id FK of User
user_child_id FK of User
(Composite primary key)
I've written these two relationships in order to retrieve the parents of a user and the children of a user:
public function parent()
{
    return $this->hasManyThrough(\App\Models\User::class, \App\Models\UserHierarchy::class, 'user_child_id', 'id', 'id', 'user_parent_id');
}

public function children()
{
    return $this->hasManyThrough(\App\Models\User::class, \App\Models\UserHierarchy::class, 'user_parent_id', 'id', 'id', 'user_child_id');
}
In order to get all the children, grandchildren and so on, I've developed this additional relationship, which makes use of eager loading:
public function childrenRecursive()
{
    return $this->children()->with('childrenRecursive.children');
}
So far so good: when I find a User by id, I can get the whole downward tree by using childrenRecursive. What I'm trying to achieve now is to re-use these relationships to filter a certain set of results, and by that I mean: for a certain User (for example id 1), I want a collection of Users that belong to his downward tree (children, recursively) as well as his direct parents.
$model->where(function ($advancedWhere) use ($id) {
    $advancedWhere->whereHas('parent', function ($advancedWhereHas) use ($id) {
        $advancedWhereHas->orWhere('user_child_id', $id);
        // I want all users that are recorded as his parents
    })->whereHas('childrenRecursive', function ($advancedWhereHas) use ($id) {
        // Missing code, I want all users that are recorded as his children and downwards
    });
})->get();
This is the complete tree I'm testing, and the result produced above (if I add a similar orWhere on the childrenRecursive) is that it returns every User that has a parent-child relationship. E.g. User 2 should return every number except 11 and 12, but it's returning every number except 11 (because 11 is not a child of anyone).
I'm going to answer your question first, but in the second half of the answer I have proposed an alternative which I strongly suggest adopting.
MySQL (unlike, incidentally, Microsoft SQL) doesn't have an option to write recursive queries. Accordingly, there is no good Laravel relationship to model this.
As such, there is no way for Laravel to do it other than naively, which, if you have a complex tree, is going to lead to many queries.
Essentially when you load your parent, you will only have access to its children (as a relationship collection). Then you would foreach through its children (and then their children, etc, recursively) to generate the whole tree. Each time you do this, it performs new queries for the child and its children. This is essentially what you're currently doing, and you will find that as your data set grows it is going to start becoming very slow. In the end, this provides you with a data structure on which you can apply your filters and conditions in code. You will not be able to achieve this in a single query.
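To make that concrete, here is a minimal sketch of such a naive walk using the children() relationship from the question; the helper function and traversal order are my own illustration, not a canonical implementation:

// Rough sketch of the naive approach: walk the tree in PHP, one level at a time.
// Each node whose children haven't been eager loaded triggers at least one new
// query, which is why this degrades quickly on large trees.
function collectDescendants(\App\Models\User $user, array &$found = [])
{
    foreach ($user->children as $child) {
        $found[$child->id] = $child;
        collectDescendants($child, $found); // recurse: more queries per level
    }

    return $found;
}

$descendants = collectDescendants(\App\Models\User::find(1));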
If you are writing to the db a lot, i.e. adding lots of new children but rarely reading the results, then this may be your best solution.
(Edit: abr's comment below linked me to the release notes for MySQL 8 which does have this functionality. My initial response was based on MySQL 5.7. However, I'm not aware of Laravel/Eloquent having a canonical relationship solution employing this yet. Furthermore I have previously used this functionality in MSSQL and nested sets are a better solution IMO.
Furthermore, Laravel isn't necessarily coupled to MySQL - it just often is the db of choice. It will therefore probably never use such a specific solution to avoid such tight coupling.)
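For completeness, on MySQL 8 a recursive CTE could be issued from Laravel as a raw query. This is only a sketch, assuming a user_hierarchies table that matches the schema in the question; it is not an Eloquent relationship:

// Sketch only: recursive CTE on MySQL 8+, collecting the ids of every
// descendant of user 1 via the user_hierarchies pivot table (assumed name).
$descendantIds = \DB::select("
    WITH RECURSIVE descendants AS (
        SELECT user_child_id
        FROM user_hierarchies
        WHERE user_parent_id = ?
        UNION ALL
        SELECT uh.user_child_id
        FROM user_hierarchies uh
        JOIN descendants d ON uh.user_parent_id = d.user_child_id
    )
    SELECT user_child_id FROM descendants
", [1]);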
However, most hierarchical structures are read from far more than they are written to, in which case this approach is going to start stressing your server out considerably.
If this is the case, I would advise looking into:
https://en.wikipedia.org/wiki/Nested_set_model
We use https://github.com/lazychaser/laravel-nestedset which is an implementation of the above, and it works very well for us.
It is worth mentioning that it can be slow and memory intensive when we redefine the whole tree (we have around 20,000 parent-child relationships), but this only has to happen when we've made an error in the hierarchy that can't be unpicked manually and this is rare (we haven't done it in 6 months). Again, if you think you may have to do that regularly, this may not be the best option for you.
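For a flavour of what that looks like with the package, here is a sketch assuming its documented NodeTrait API (check the package README for specifics; column and method names below come from the package, not from the question's schema):

use Kalnoy\Nestedset\NodeTrait;

class User extends \Illuminate\Database\Eloquent\Model
{
    use NodeTrait; // adds the nested-set columns and tree helpers provided by the package
}

// With the tree stored as a nested set, whole subtrees come back in one query:
$user = User::find(1);
$descendants = $user->descendants()->get(); // all children, grandchildren, ...
$ancestors   = $user->ancestors()->get();   // all parents up to the root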
I started to develop a little Content Management System for two languages (de, en) that is starting to grow bigger.
In that context, I have Posts and Pages (a bit like WordPress or actually just like WordPress). But I am also planning to add further content types like Reviews, Courses, Tutorials (Recipes) and maybe even E-Books that are purchasable.
All these objects would have in common that they are contentable, so they will be shown on the front end with their dedicated urls, like /posts/{slug} for posts, /{slug} for pages, /reviews/{slug} for reviews and so on.
On the backend, this means an auto save and revision system is offered for these content types.
So, this would leave us with the following options:
Single Table Inheritance (we would need to live with many null values) - not officially supported by Laravel, but there is a package.
Multi Table Inheritance (which is not supported in Laravel either)
Polymorphism, which is supported
CMS solution (like Craft CMS), which basically breaks up the logic into elements, entries, fields and so on; an alternative approach would be Drupal's node approach - out of scope I believe (I don't have 6-9 months to write a CMS from scratch)
Have one table per model and try to share as much logic as possible between the models by using Traits (current status; I don't like it that much...)
After some googling, searching here on stackoverflow and looking at other projects, I am thinking of the following structure:
contents table:
id
site_id
title
... (some more columns that are shared among all models)
contentable_type
contentable_id
posts
id
pages
id
home
courses
id
name
featured
difficulty
free
So, these tables would be linked to the contents table through a belongsTo relationship, and the content model would define the morphable relationship.
class Content extends Eloquent {
    public function contentable() {
        return $this->morphTo();
    }
}

class Post extends Eloquent { // or extends Content
    public function content() {
        return $this->morphOne('Content', 'contentable');
    }
}
Working with models would mean you would always have to load the content relationships.
Sorting & Ordering must be performed by joins.
And when creating, of course, we have to first create the content type model and then save it and attach it to a content model.
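For illustration, creating a post could look roughly like this; it is only a sketch, with column names taken from the contents table above and mass assignment assumed to be configured:

// Sketch: create the content-type model first, then its shared content row
// through the morphOne relationship (assumes site_id/title are fillable).
$post = Post::create();

$post->content()->create([
    'site_id' => $siteId, // assumed to be available in this context
    'title'   => 'My first post',
]);

// Reading it back always means loading the content relationship:
$post->load('content');
echo $post->content->title;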
I never implemented a system with that kind of (sub) logic before, and it feels a bit odd to me to have a posts table with just an id (same would be true for other content types e.g. "abouts" in case they don't have extra columns), but I think it would be the "Laravel way" to solve this issue, right?
I believe STI wouldn't work for this case, and it is also a bit against Laravel's Eloquent pattern.
Has somebody already experiences with this approach? Am I on the right track here?
Note: I got inspired by the discussion here: How can I implement single table inheritance using Laravel's Eloquent?
In case anyone finds this question: I finally decided against this approach, basically because I believe it is not worth the effort (and also most of the packages won't work out of the box).
It is a much better approach to use Traits et cetera to reuse as much logic as possible and follow the Eloquent ORM approach.
The first sentence of the Eager Loading section from the Laravel docs is:
When accessing Eloquent relationships as properties, the relationship
data is "lazy loaded". This means the relationship data is not
actually loaded until you first access the property.
In the last paragraph of this section it is stated:
To load a relationship only when it has not already been loaded, use
the loadMissing method:
public function format(Book $book)
{
    $book->loadMissing('author');
    return [
        'name' => $book->name,
        'author' => $book->author->name
    ];
}
But I don't see the purpose of $book->loadMissing('author'). Is it doing anything here?
What would be the difference if I just remove this line? According to the first sentence, the author in $book->author->name would be lazy-loaded anyway, right?
Very good question; there are subtle differences which aren't immediately apparent from reading the documentation.
You are comparing "Lazy Eager Loading" using loadMissing() to "Lazy Loading" using magic properties on the model.
The only difference, as the name suggests, is that:
"Lazy loading" only happens upon the relation usage.
"Eager lazy loading" can happen before the usage.
So, practically, there's no difference unless you want to explicitly load the relation before its usage.
It's also worth noting that both the load and loadMissing methods give you the opportunity to customize the relation loading logic by passing a closure, which is not an option when using magic properties.
$book->loadMissing(['author' => function (Builder $query) {
    $query->where('approved', true);
}]);
This translates to "load the approved author if not already loaded", which is not achievable using $book->author unless you define an approvedAuthor relation on the model (which is the better practice, though).
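Such a relation could look something like this; a minimal sketch on the Book model, assuming Book belongs to Author via an author_id column:

// Hypothetical relation: only returns the author if it has been approved.
public function approvedAuthor()
{
    return $this->belongsTo(Author::class, 'author_id')->where('approved', true);
}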
To answer your question directly; yeah, there won't be any difference if you remove:
$book->loadMissing('author');
in that particular example, as the relation is accessed right after being loaded. However, there might be a few use cases where one wants to load the relation before it is used.
So, to overview how relation loading methods work:
Eager loading
Through the usage of with() you can "eager load" relationships at the time you query the parent model:
$book = Book::with('author')->find($id);
Lazy eager loading
To eager load a relationship after the parent model has already been retrieved:
$book->load('author');
Which also might be used in a way to only eager load missing ones:
$book->loadMissing('author');
Unlike the load() method, the loadMissing() method filters through the given relations and lazily "eager" loads them only if they have not already been loaded.
By accepting closures, both methods support custom relation loading logic.
Lazy loading
Lazy loading, which happens through the use of magic properties, is there for the developer's convenience. It loads the relation when it is first used, so you don't need to load it beforehand.
#rzb has mentioned a very good point in his answer as well. Have a look.
I believe the accepted answer is missing one important fact that may mislead some: you cannot run loadMissing($relation) on a collection.
This is important because most use cases of lazy eager loading relationships are when you already have a collection and you don't want to commit the n+1 sin - i.e. unnecessarily hit the DB multiple times in a loop.
So while you can use load($relation) on a collection, if you only want to do it if the relationships haven't already been loaded before, you're out of luck.
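For what it's worth, load() on an already-retrieved collection looks like this; a minimal sketch, with the 'published' column used purely as an example:

// load() on a collection eager loads the relation in one additional
// IN() query, avoiding the N+1 problem in the loop below.
$books = Book::where('published', true)->get();

$books->load('author');

foreach ($books as $book) {
    echo $book->author->name; // no extra queries here
}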
It means: don't repeat the query.
To be clear about it:
If you call load() twice, the query will run again even if the relationship already exists.
loadMissing(), on the other hand, checks whether the relationship has already been loaded. If it has (by load() or with(), i.e. eager loading), it will not repeat the query.
DB::enableQueryLog();

$user = User::find(1);

$user->load('posts');        // runs the posts query
$user->load('posts');        // runs the posts query again
$user->loadMissing('posts'); // no query - posts are already loaded (put it on top to see the difference)

dd(DB::getQueryLog());       // inspect the logged queries
That's what I think its purpose is.
Very useful for APIs
The use of with, loadMissing or load matters even more in an API environment, where the results are converted to JSON. In that case, lazy loading has no effect: relations that were never loaded simply won't appear in the serialized output.
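For example, a sketch of a typical API controller method (class and method names are illustrative):

// Relations that were never loaded won't appear in the JSON output,
// so they have to be loaded explicitly before serializing.
public function show($id)
{
    $book = Book::findOrFail($id);

    $book->loadMissing('author'); // without this, 'author' is absent from the response

    return response()->json($book);
}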
Let's say you have multiple relationships:
A book belongs to an author, and a book belongs to a publisher.
So first you might load one relationship:
$book->load('author');
and later, under a certain condition, you want to load another relationship into it:
$book->loadMissing('publisher');
But I don't see the purpose of $book->loadMissing('author');. Is it
doing anything here? What would be the difference if I just remove
this line? According to the first sentence, the author in
$book->author->name would be lazy-loaded anyway, right?
Suppose:
public function format(Book $book)
{
    // book will not have the author relationship loaded yet
    return [
        'name' => $book->name, // book still does not have the author relationship loaded
        'author' => $book->author->name // book will now have the author relationship (lazy loaded here)
    ];
}
The difference between the code above and the code below is when the relationship gets loaded and how much control you have over it.
public function format(Book $book)
{
    $book->loadMissing('author'); // book will now have the author relationship

    return [
        'name' => $book->name, // book has the author relationship loaded
        'author' => $book->author->name // book has the author relationship loaded
    ];
}
Both answers here have covered pretty well what the technical difference is, so I'd refer you to them first. But the "why" isn't very evident.
Something I find myself preaching a lot lately is that Eloquent is really good at giving you enough rope to hang yourself with. By abstracting the developer so far away from the actual SQL queries being produced, especially with dynamic properties, it's easy to forget when your database hits are hurting your performance more than they need to.
Here's the thing. One query using an IN() statement on 1000 values takes about the same execution time as one query running on one value. SQL is really good at what it does - the performance hit usually comes with opening and closing the DB connection. It's a bit like doing your grocery shopping by making one trip to the market for each item, as opposed to getting it all done at once. Eager loads use IN() statements.
Lazy loading is good for instances where you're handling too much data for your server's RAM to cope with, and in my opinion, not good for much else. It handles only one entry at any given moment, but it's reconnecting each time. I can't tell you the number of times I've seen Transformer classes, which should be responsible only for reformatting data as opposed to retrieving it, leveraging those dynamic properties and not realizing that the data wasn't already there. I've seen improvements as dramatic as reducing execution time from 30 minutes to 30 seconds just by adding a single line of eager loading prior to the Transformer being called.
(By the way, batching might be considered the happy medium, and Eloquent's chunk() method offers that too.)
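A rough sketch of that middle ground, combining chunking with eager loading:

// Process books in batches of 500; each chunk eager loads its authors
// with a single IN() query instead of one query per book.
Book::with('author')->chunk(500, function ($books) {
    foreach ($books as $book) {
        echo $book->author->name;
    }
});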
To answer your question a little more directly; if you're dealing with an instance where it's a one-to-one relationship, and it's going to be used in only one place, then functionally there is no difference between load, loadMissing, or a lazy-loading dynamic property. But if you have a many-to-many, it may be worthwhile to gather up all that data all at once. One book can have many co-authors. One author can write many books. And if you're about to loop through large sets of either, go ahead and make the most of your trip to the market before you start cooking.
Perhaps this is a question with a trivial answer, but nevertheless it has been driving me nuts for a couple of days, so I would like to hear an answer. I've recently been looking up a lot of information related to building a custom data mapper for my own project (not using an ORM) and have read several threads on Stack Overflow and other websites.
It seems very convincing to me to have AuthorCollection objects, which are basically just containers of Author instances, or BookCollection objects, which hold multiple Book instances. But why would one need a mapper for the single Author object? All fetch criteria I can think of (except the one asking for the object with a specific BookID or AuthorID) will return multiple Book or Author instances, hence BookCollection or AuthorCollection instances. So why bother with a mapper for the single objects, if the one for the appropriate collection is more general and you don't have to be sure that your criteria will only return one result?
Thanks in advance for your help.
Short answer
You don't need to bother creating two mappers for Author and AuthorCollection. If your program doesn't need an AuthorMapper and an AuthorCollectionMapper in order to work smoothly and have a clean source, by all means, do what you're most comfortable with.
Note: Choosing this route means you should be extra careful looking out for SRP violations.
Long(er) answer
It all depends on what you're trying to do. For the sake of this post, let's call AuthorMapper an item data mapper and AuthorCollectionMapper a collection data mapper.
Typically, item mappers won't be as sophisticated as their collection mappers. Item mappers will normally only fetch by a primary key and therefore limit the results, making the mapper clean and uncluttered by additional collection-specific things.
One main part of these "collection-specific things" I bring up is conditions[1] and how they're implemented in queries. Within collection mappers you'll often have more advanced, longer, and more tedious queries than what would normally be inside an item data mapper. Though it is entirely possible to combine your average item data mapper query (SELECT ... WHERE id = :id) with a complicated collection mapper query without using a smelly condition[2], it gets more complicated and still bothers the database to execute a lengthy query when all it needed was a simple, generic one.
Additionally, though you pointed out that with an item mapper we really only fetch by a primary key, it usually turns out to be radically simpler to use an item mapper for other things as well. An item mapper's save() and remove() methods can handle the job (with the right implementation) better than attempting to use a collection mapper to save/remove items. And, along with this point, it also becomes apparent that at times a collection mapper's own save() and remove() methods may want to utilize the item mapper's methods.
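To make the distinction concrete, here is a deliberately simplified, hypothetical pair of mappers; the class and method names are made up for illustration and are not from any framework:

// Hypothetical item mapper: simple, primary-key-oriented operations.
class AuthorMapper
{
    protected $pdo;

    public function __construct(PDO $pdo)
    {
        $this->pdo = $pdo;
    }

    public function fetchById($id)
    {
        $stmt = $this->pdo->prepare('SELECT * FROM authors WHERE id = :id');
        $stmt->execute(array('id' => $id));
        $row = $stmt->fetch(PDO::FETCH_ASSOC);

        // Author hydrating itself from an array is an assumption of this sketch.
        return $row ? new Author($row) : null;
    }

    public function save(Author $author)
    {
        // INSERT or UPDATE a single row; kept deliberately simple.
    }
}

// Hypothetical collection mapper: turns conditions set on the collection
// into one (potentially long) query and hydrates many Author objects.
class AuthorCollectionMapper
{
    protected $pdo;

    public function __construct(PDO $pdo)
    {
        $this->pdo = $pdo;
    }

    public function fetch(AuthorCollection $collection)
    {
        // Build the WHERE clause from the conditions set on $collection,
        // run one query, push a hydrated Author into $collection per row...
        return $collection;
    }
}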
In response to your question below, there may be numerous times when you want to set conditions for deleting a collection of rows from the database. For example, you may have a spam flag that, when set, hides the post but self-destructs in thirty days. In that case you'd most likely have a condition for the spam flag and one for the time range. Another might be deleting all the comments under an answer thirty days after the answer was deleted. I mention thirty days because it's wise to keep this data for at least a little while, in case someone wants their comment back or it turns out the flagged row isn't actually spam.
[1] Condition here means a property set on the collection instance which the collection mapper's query knows how to handle. If you haven't already, check out #tereško's answer here.
[2] This condition is different and refers to the "evil if" people speak of. If you don't understand their nefariousness, I'd suggest watching some Clean Code Talks. This one specifically, but all are great.
I'm currently at an impasse with regard to the structural design of my website. At the moment I'm using objects to simplify the structure of my site (I have a person object, a party object, a position object, etc...) and in theory each of these is a row from its respective table in the database.
Now from what I've learnt, OO Design is good for keeping things simple and easy to use/implement, which I agree with - it makes my code look so much cleaner and easier to maintain, but what I'm confused about is how I go about linking my objects to the database.
Let's say there is a person page. I create a person object, which equals one mysql query (which is reasonable), but then that person might have multiple positions which I need to fetch and display on a single page.
What I am currently doing is using a method called getPositions on the person object, which gets the data from MySQL and creates a separate position object for each row, passing in the data as an array. That keeps the queries down to a minimum (2 per page), but it seems like a horrible implementation and, to me, breaks the rules of object-oriented design (should I want to change a MySQL row, I'd need to change it in multiple places), but the alternative is worse.
In this case the alternative is just getting the IDs that I need and then creating separate positions, passing in the ID, which then goes on to fetch the row from the database in the constructor. If you have 20 positions per page, it can quickly add up, and I've read about how much WordPress is criticised for its high number of queries per page and its CPU usage. The other thing I'll need to consider in this case is sorting, and doing it this way means I'll need to sort the data using PHP, which surely can't be as efficient as doing it natively in MySQL.
Of course, pages will be (and can be) cached, but to me, this seems almost like cheating for poorly built applications. In this case, what is the correct solution?
The way you're doing it now is at least on the right track. Having an array in the parent object with references to the children is basically how the data is represented in the database.
I'm not completely sure from your question if you're storing the children as references in the parent's array, but you should be and that's how PHP should store them by default. If you also use a singleton pattern for your objects that are pulled from the database, you should never need to modify multiple objects to change one row as you suggest in your question.
You should probably also create multiple constructors for your objects (using static methods that return new instances) so you can either create them from their ID and have them pull the data, or create them from data you already have. The latter case would be used when you're creating children: you can have the parent pull all of the data for its children and create all of them using only one query. Getting a child from its ID will probably be used somewhere else, so it's good to have just in case it's needed.
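A small, hypothetical sketch of what such named constructors might look like (class, table and column names are illustrative only):

class Position
{
    protected $data;

    protected function __construct(array $data)
    {
        $this->data = $data;
    }

    // Create from an ID: hits the database once for this single row.
    public static function fromId(PDO $pdo, $id)
    {
        $stmt = $pdo->prepare('SELECT * FROM positions WHERE id = :id');
        $stmt->execute(array('id' => $id));

        return new self($stmt->fetch(PDO::FETCH_ASSOC));
    }

    // Create from data you already have, e.g. rows the parent fetched in one query.
    public static function fromRow(array $row)
    {
        return new self($row);
    }
}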
For sorting, you could create additional private (or public if you want) arrays that have the children sorted in a particular way with references to the same objects the main array references.
I have 3 database tables:
article
article_has_tag (2 FK's to the other tables)
tag
I currently show a list of articles with the article's tags shown underneath but the number of queries grows as the list gets longer.
I want to loop over all the articles and get the tag objects from each one in turn.
Can it be done in one Propel query?
I believe you are using symfony 1.0 and thus Propel 1.2... Whilst the comments already describe alternative approaches, there is a direct way to at least solve your problem: add this function to your ArticlePeer class:
public static function getTaggedArticles()
{
    $c = new Criteria();
    // some filters here, e.g. LIMIT or a Criteria::IN array

    $ahts = ArticleHasTagPeer::doSelectJoinAll($c);

    $articles = array();
    foreach ($ahts as $aht)
    {
        if (!isset($articles[$aht->getArticleId()]))
        {
            $articles[$aht->getArticleId()] = $aht->getArticle();
        }
        $articles[$aht->getArticleId()]->addTag($aht->getTag());
    }

    return $articles;
}
where $ahts is short for $article_has_tags. Create a simple array of tags in your Article class (protected array $collTags) along with an addTag() method, if they don't already exist, to facilitate this.
This then only executes one SQL query, but consider seriously that without the filter I mention you are potentially hydrating hundreds of objects unnecessarily, and that is a significant performance hit. You may want to research how to hydrate based only on a doSelectRS() call - inspect your BlahPeer classes for how their JOIN methods work, and then this link for how to write custom JOIN methods.
Either way, the method builds a unique array of articles with the ArticleId as the key - if you need a different sort order, you can either sort this array again or use a different array key to organise the collection as you build it.
Unless I'm misunderstanding your question, don't loop over anything as you'll generate bloat of a different kind.
Do a single query where "article" is joined to "article_has_tag" is joined to "tag". The single query should return the specified articles and tag names for the tags they have.
I use Doctrine myself so can't help you with the exact query but Googling brings up stuff like this: http://www.tech-recipes.com/rx/2924/symfony_propel_how_to_left_join/.
Also, the symfony definitive guide (which was written for Propel) should be able to help you.
I assume you are using Propel 1.3 or 1.4, and not yet Propel 1.5 (which is still in beta), as the latter has a very natural support for these multiple joins (inspired, in part, by the Doctrine syntax).
If you defined your foreign keys in the database schema, you should have a static doSelectJoinAll method in the ArticleHasTagPeer class. If you use this method, the related Article and Tag objects will be hydrated with the same query. You can still pass in a Criteria object that modifies the Article and Tag selection criteria. I know this is a bit strange, since you probably want to start from the Article objects, and this was one of the driving factors for the change in Propel 1.5. In symfony you can also use the DbFinderPlugin, which already gives you this capability in Propel 1.3 (it needs a small patch for Propel 1.4). In fact, Propel 1.5 is mostly written by François Zaninotto, the author of the DbFinderPlugin.
Short answer is no.
But with some effort you can still do it. Here's a list of options:
Use the DbFinderPlugin
Write your own peer method (say, doSelectPostWithUsersAndComments).
Migrate to Propel 1.5
Migrate to Doctrine