I am using the adjacency list model to find sub categories within my website. I have working PHP code to find all the categories and sub categories, but now I cannot figure out how to use that to create a navigation system. Here is how the site will work, very basic:
URL string
There will be a main category, followed by levels
index.php?category=category-name&level1=sub-category&level2=another-sub-category&level3=content-item
Later I will make SEO friendly links.
URL with no sub categories
Where Level 1 is the content item
www.website.com/category/content-item/
URL with sub categories
Where Level 1, 2, 3, etc are the sub categories and the final level is the content item
www.website.com/category/sub-category/sub-category-2/content-item/
Here is the code I am using to find categories and sub categories. Currently it just outputs a list of all categories and sub categories and numbers the level of each child. Not sure if this helps; it just creates a list.
function display_children($ParentCategoryID, $Level) {
    // retrieve all children of parent
    if ($ParentCategoryID == '') {
        $Result = mysql_query('SELECT * FROM categories WHERE parent_category_id IS null');
    }
    else {
        $Result = mysql_query('SELECT * FROM categories WHERE parent_category_id="'.$ParentCategoryID.'";');
    }
    // display each child
    while ($Row = mysql_fetch_array($Result)) {
        echo str_repeat('-',$Level)."[".$Level."]".$Row['category_name']."<br />";
        display_children($Row['category_id'], $Level + 1);
    }
}
See this question first for options on how to represent hierarchical data in a database.
Adjacency list is great for its simplicity and makes changes easy, but in practice it leads to recursive code such as your function above, which is a performance killer under load. Absent a change to your data model, the best approach is to use MySQL session variables to retrieve the entire hierarchy in one query, so that all the data you need comes back in a single database call. Even this performs poorly under load, though less so than the recursive function; I write from experience :).
If it were me, I'd use either Nested Sets, Adjacency List in combination with some denormalizations (such as the Bridge Table and Flat Table), or just a Lineage Table. It really depends on how often the data changes and whether you need those changes to be easy to make. All of these options should be much faster to work with than relying on just the parent-child ID columns.
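If you stay with the plain adjacency list for now, one way to avoid a query per node is to fetch every row once and recurse over an in-memory index instead. This is only a minimal sketch under assumptions: the column names are taken from the question, the function names index_children() and list_children() are hypothetical, and the sample rows are made up in place of a real "SELECT * FROM categories" result.

```php
<?php
// Group rows by parent so each node's children can be looked up without a query.
// Rows whose parent_category_id is null are stored under the key '' (top level).
function index_children(array $rows): array
{
    $childrenByParent = [];
    foreach ($rows as $row) {
        $parentKey = $row['parent_category_id'] === null ? '' : (string) $row['parent_category_id'];
        $childrenByParent[$parentKey][] = $row;
    }
    return $childrenByParent;
}

// Produces the same line format as display_children(), but returns the lines
// instead of echoing them, and never touches the database.
function list_children(array $childrenByParent, string $parentKey = '', int $level = 0): array
{
    $lines = [];
    foreach ($childrenByParent[$parentKey] ?? [] as $row) {
        $lines[] = str_repeat('-', $level) . '[' . $level . ']' . $row['category_name'];
        foreach (list_children($childrenByParent, (string) $row['category_id'], $level + 1) as $line) {
            $lines[] = $line;
        }
    }
    return $lines;
}

// Made-up sample rows standing in for the result of one query.
$rows = [
    ['category_id' => 1, 'parent_category_id' => null, 'category_name' => 'Books'],
    ['category_id' => 2, 'parent_category_id' => 1, 'category_name' => 'Fiction'],
    ['category_id' => 3, 'parent_category_id' => 2, 'category_name' => 'Crime'],
    ['category_id' => 4, 'parent_category_id' => null, 'category_name' => 'Music'],
];
$lines = list_children(index_children($rows));
echo implode("<br />\n", $lines), "\n";
```

The recursion is still there, but it walks an array rather than issuing queries, so the database is hit once per page instead of once per category.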
I have two entities, post and category which is a 1:n relationship.
I have a reference table with two columns, post_id,category_id
The categories table has an id column, a status column and a parent_id column
If a category is a child of another category (n-depth) then its parent_id is not null.
If a category is online its status is 1, otherwise it is 0.
What I need to do is find out if a post is visible.
This requires:
For each category joined to the post, trace up its tree to the root node (until a category has parent_id == null). If any of those categories has status 0, that path is considered offline.
If any path is online then the post is considered visible, otherwise it is hidden.
The only way I can think of doing this (as semi-pseudo code) is:
function visible(category_ids){
    categories = //select * from categories where id in(category_ids)
    online = false
    foreach(categories as category){
        if(category.status == 0)
            continue;
        children = //select id from categories where parent_id = category.id
        if(children)
            online = visible(children)
    }
    return online
}
categories = //select c.id from categories c join posts_categories pc on pc.category_id = c.id where pc.post_id = post.id
post.online = visible(categories)
But that could end up being a lot of SQL queries. Is there a better way?
If nested sets are not an option, I know about the following:
If the data is ordered so that the children of a parent always follow their parent, you can solve this with one database query over all the data by skipping hidden nodes in the output.
This works equally well with a sorted nested set. The principle has been outlined in this answer; however, the algorithms there for getting the depth do not work, and I would suggest a recursive iterator that is able to remove hidden items.
Also, if the data is not ordered, you can create a tree structure from the (unsorted) query of all rows, as outlined in the answer to "Nested array. Third level is disappearing". No recursion is needed, and you get a structure you can easily output. I should have covered <ul>/<li> HTML-style output in another answer, too.
Answer to How can I convert a series of parent-child relationships into a hierarchical tree?
Answer to How to obtain a nested HTML list from object's array recordset?
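Building a tree from unsorted rows without recursion can be sketched with PHP references, roughly as follows. This is an illustration rather than the code from the linked answers: the function name build_tree() and the 'children' key are hypothetical, and the rows are assumed to carry id and parent_id, with null marking a root.

```php
<?php
// Build a nested tree from unsorted (id, parent_id) rows without recursion.
// References let a child be attached before or after its own children arrive.
function build_tree(array $rows): array
{
    $nodes = [];
    foreach ($rows as $row) {
        $row['children'] = [];
        $nodes[$row['id']] = $row;
    }

    $tree = [];
    foreach (array_keys($nodes) as $id) {
        $parentId = $nodes[$id]['parent_id'];
        if ($parentId !== null && isset($nodes[$parentId])) {
            // Attach by reference so later additions to this node stay visible.
            $nodes[$parentId]['children'][] = &$nodes[$id];
        } else {
            // No parent, or the parent is not in the result set: treat as a root.
            $tree[] = &$nodes[$id];
        }
    }
    return $tree;
}

// Made-up rows, deliberately out of order (the grandchild comes first).
$tree = build_tree([
    ['id' => 3, 'parent_id' => 2, 'name' => 'Crime'],
    ['id' => 1, 'parent_id' => null, 'name' => 'Books'],
    ['id' => 2, 'parent_id' => 1, 'name' => 'Fiction'],
    ['id' => 4, 'parent_id' => null, 'name' => 'Music'],
]);
```

Hidden rows can be filtered out before calling the function, in which case their descendants end up as roots; whether that is the desired behaviour depends on how you want hidden branches treated.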
A classic database vs memory tradeoff. What you are doing is building a tree with leaves in it. To build the tree you need to loop over the leaves recursively. Coming from a database there are 2 scenarios:
Build the tree recursively with a query for each leaf. You hold 1 tree in memory. That is what you are doing.
Get a flat structure from the database, and build the tree recursively in memory. You hold a flat tree and the real tree in memory. That is your alternative way.
What is better depends on a lot of things: your hardware (disk access vs memory) and the size of the tree, to name two.
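For the visibility check, the second scenario could look roughly like this: load the categories once, then walk the parent pointers upwards in memory. It is a sketch under assumptions: the array shape (id => parent_id and status) mirrors the columns in the question, the function names are hypothetical, and a missing row or a cycle is treated as offline, which is a choice of this example rather than anything the question specifies.

```php
<?php
// $categories maps id => ['parent_id' => int|null, 'status' => 0|1],
// assumed to be filled from a single "SELECT id, parent_id, status" query.

// True if the category and every ancestor up to the root have status 1.
function path_is_online(array $categories, int $categoryId): bool
{
    $seen = [];
    $id = $categoryId;
    while ($id !== null) {
        if (!isset($categories[$id]) || isset($seen[$id])) {
            return false; // missing row or cycle: treated as offline here
        }
        $seen[$id] = true;
        if ((int) $categories[$id]['status'] === 0) {
            return false;
        }
        $id = $categories[$id]['parent_id'];
    }
    return true;
}

// A post is visible if at least one of its categories has an online path.
function post_is_visible(array $categories, array $postCategoryIds): bool
{
    foreach ($postCategoryIds as $categoryId) {
        if (path_is_online($categories, (int) $categoryId)) {
            return true;
        }
    }
    return false;
}

// Made-up data: category 2 is offline, so everything below it is offline too.
$categories = [
    1 => ['parent_id' => null, 'status' => 1],
    2 => ['parent_id' => 1, 'status' => 0],
    3 => ['parent_id' => 2, 'status' => 1],
    4 => ['parent_id' => 1, 'status' => 1],
];
```

With this data, a post in category 3 only is hidden, while a post in categories 3 and 4 is visible because the path through 4 is online.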
I am retrieving all category IDs that a product belongs to like so in a helper:
$productCategories = $this->getProduct()->getCategoryIds();
This gives me an array of category IDs. In my case, each product will only be in one sub-category. From the IDs I need to figure out the logical order based on which categories are children of which.
The aim is to create a method which I can call and specify the level of category I want for a product and it will return the path to that category.
I know I can load a category like so:
Mage::getModel('catalog/category')->load($categoryId)
and I know what I'm doing in PHP, but just a little stuck on the logic (and methods) to use here to achieve what I want.
The easiest way is to test if it has a Parent Id associated to it. A parent_id that is > 0 means that it is a lower-level category. If the parent_id equals one it means that it is one level lower than the root category.
$category = Mage::getModel('catalog/category')->load($categoryId);
echo $category->getParentId();
I advise against using raw SQL: it is bad practice, and the database schema may change at some point.
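Another way to find the category at a given level, without loading each parent in turn, is to read it from the category's path. The sketch below assumes the path is a slash-separated string of ancestor IDs ending with the category's own ID (for example "1/2/10/25"), which is how I understand Magento 1 stores it; the helper name category_id_at_level() and the sample path are hypothetical, and levels are counted from 0 at the start of the path.

```php
<?php
// Return the category ID found at the given position in a slash-separated
// path, or null if the path is empty or not that deep.
function category_id_at_level(string $path, int $level): ?int
{
    if ($path === '') {
        return null;
    }
    $ids = array_map('intval', explode('/', $path));
    return $ids[$level] ?? null;
}

// Hypothetical path for a product's deepest category.
$path = '1/2/10/25';
$levelTwoId = category_id_at_level($path, 2);
```

In Magento the string would presumably come from the loaded category (for instance via $category->getPath()), and the returned ID could then be loaded with Mage::getModel('catalog/category')->load() as shown in the question.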
If I had 2 tables, say blog_category and blog, each "blog" can belong in a particular category only so a 1-1 relationship based on a key called "blog_category_id".
Now in my code I would do something like:
//Loop through categories such as
foreach($categories as $cat):
    //then for each category create an array of all its posts
    $posts = $cat->getPosts(); // This would be another DB call to get all posts for the cat
    //do stuff with posts
endforeach;
Now to me this seems like it could end up quite expensive in terms of DB calls depending on the size of $categories. Would this still be the best solution to do this? Or would I be able to do something in the code and first retrieve all the categories, then retrieve all the blogs and map them to their corresponding category via the id somehow? This would in theory be only 2 calls to the DB, now size wise the result set for call 2 (the blogs) would definitely be larger, but would the actual DB call be as expensive?
I would normally go for the first option, but I'm just wondering if there would be a better way of approaching this or is it more likely that the extra processing in PHP would be more costly in terms of performance? Also specifically from an MVC perspective, if the model returns the categories, but it should also return the corresponding blogs for that category, I'm not sure how best to structure this, from my understanding, shouldn't the model return all the data required for the view?
Or would I be better off selecting all categories and blogs using inner joins in the first query and creating the output I need from this? Perhaps by using a multi-dimensional array?
Thanks
You can use a simple SQL query to get all categories and posts like the following:
SELECT *
FROM posts p
JOIN categories c ON c.id = p.blog_category_id
ORDER BY c.category_name ASC,
p.posted_date DESC
Then, when you loop over the returned records, assign the current category id to a variable, which you can use to compare against the next record's category. If the category is different, print the category title before printing the record. It is important to note that for this to work you need to get the posts ordered by category first and then by post, so that all posts in the same category are together.
So for example:
$category_id = null;
foreach($posts as $post) {
    if($post['blog_category_id'] != $category_id) {
        $category_id = $post['blog_category_id'];
        echo '<h2>' . $post['category_name'] . '</h2>';
    }
    echo '<h3>' . $post['post_title'] . '</h3>';
    echo $post['blog_content'];
}
Note: as you have not posted up the schema of these two tables I have had to make up column names that are similar to what I would expect to see in code like this. So the code above will not work with your code without some adjustments to account for this.
The best solution depends on what you are going to do with data.
Lazy loading
Load data when you need it. It's a good solution when you have, for instance, 20 categories and you load posts for only 2 of them. However, if you need to load posts for all of them, it won't be efficient at all: you run one query for the categories plus one more per category. This is known as the N+1 queries problem (and it's really bad).
Eager loading
On the other hand, if you have to access almost all of your posts, you should use eager loading.
-- Load all your data in a query
SELECT *
FROM categories c
INNER JOIN posts p ON c.id = p.category_id;
// Basic example in JSON of how to format your result
{
    "cat1": ["post1", "post2"],
    "cat2": ["post5", "post4", "post5"],
    ...
}
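Turning the joined rows into a category-to-posts structure takes one loop in PHP. This is a minimal sketch: the column names category_name and post_title are assumptions (the actual schema was not posted), the function name is hypothetical, and the rows are made up in place of a real query result.

```php
<?php
// Group flat joined rows (one row per post) into category => list of posts.
function group_posts_by_category(array $rows): array
{
    $grouped = [];
    foreach ($rows as $row) {
        $grouped[$row['category_name']][] = $row['post_title'];
    }
    return $grouped;
}

// Made-up rows standing in for the result of the JOIN query.
$rows = [
    ['category_name' => 'cat1', 'post_title' => 'post1'],
    ['category_name' => 'cat2', 'post_title' => 'post3'],
    ['category_name' => 'cat1', 'post_title' => 'post2'],
];
$grouped = group_posts_by_category($rows);
echo json_encode($grouped), "\n";
```

Because the grouping is keyed by category, the rows do not have to arrive sorted by category; an ORDER BY is only needed if the order of posts within each category matters.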
What to do?
In your case I would go with eager loading, because you load everything in a loop. But if you don't access most of your data, you should redesign your model to use lazy loading, so that the SQL query to load the posts for a specific category is only performed when a view tries to access them.
What do you think?
I have a question that might seem simple, but yet I was unable to find the answer. Unlike articles, which are stored in table jos_content, categories in table jos_categories lack any column named ordering or any other that would have the desired information stored. I also tried to find anything similar in the jos_assets table, but it did not help either.
I am hacking the content component a little and I need to get my child categories ordered by the ordering when calling $parent->getChildren() or just find the ordering column so I can create my own query even though it's not clean, I just need to get it working ASAP.
So where can I find category ordering or how do I force getChildren method to return ordered results?
Thanks in advance, Elwhis
In Joomla, the order of categories is stored in the table "jos_categories" as a hierarchical tree structure (a set of linked nodes). The columns used to set the order are: "parent_id", "lft", "rgt" and "level".
Assets and menu items are stored in the same way.
You can read more about "Tree traversal" on Wikipedia.
Edit:
From Joomla 1.6 onwards, to load a specific category and all its children in a JCategoryNode object, use:
jimport( 'joomla.application.categories' );
$extension = 'Content'; // com_content
$options['countItems'] = true;
$categoryId = 0;
$categories = JCategories::getInstance($extension, $options);
$categories->get($categoryId);
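Since siblings in a nested set are ordered by their "lft" value, sorting the children of a parent by that column gives the stored ordering; in a hand-written query that would be an ORDER BY on lft. The sketch below does the same in plain PHP on already-fetched rows. The function name children_in_order() is hypothetical and the sample lft values are made up; only the column names come from the answer above.

```php
<?php
// Return the direct children of $parentId, sorted by their nested-set "lft".
function children_in_order(array $rows, int $parentId): array
{
    $children = array_filter($rows, function (array $row) use ($parentId): bool {
        return (int) $row['parent_id'] === $parentId;
    });
    usort($children, function (array $a, array $b): int {
        return $a['lft'] <=> $b['lft'];
    });
    return $children;
}

// Made-up rows, deliberately not in tree order.
$rows = [
    ['id' => 3, 'parent_id' => 1, 'lft' => 3],
    ['id' => 4, 'parent_id' => 3, 'lft' => 4],
    ['id' => 2, 'parent_id' => 1, 'lft' => 1],
    ['id' => 1, 'parent_id' => 0, 'lft' => 0],
];
$ordered = children_in_order($rows, 1);
```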
I'd like to be able to build the breadcrumbs for a content page. However, the categories a piece of content is in can have unlimited depth, so I'm not sure how to go about it without getting each category one by one and then getting its parent, etc. It seems like there should be a simpler way, but I can't figure it out.
I have an articles table
article_id
article_name
article_cat_id
I also have a categories table
cat_id
cat_name
cat_parent
Cat parent is the id of another category of which a category is a child.
Imagine an article which is 5 categories deep. As far as I can tell, I'd have to build the breadcrumbs something like this (example code; obviously inputs should be escaped, etc.):
<?php
$breadcrumbs = array(
    'Category 5',
    'Content Item'
);
$cat_parent = 4;
while($cat_parent != 0) {
    $query = mysql_query('SELECT * FROM categories WHERE cat_id = '.$cat_parent);
    $result = mysql_fetch_array($query, MYSQL_ASSOC);
    array_unshift($breadcrumbs, $result['cat_name']);
    $cat_parent = $result['cat_parent'];
}
?>
This would then give me
array(
    'Category 1',
    'Category 2',
    'Category 3',
    'Category 4',
    'Category 5',
    'Content Item'
)
I can use this for my breadcrumbs; however, it's taken me 5 queries to do it, which isn't really preferable.
Can anyone suggest any better solutions?
Here are some easy options in order of simplicity:
1. Stick with the design you have, use the recursive/iterative approach and enjoy the benefits of having simple code. Really, this will take you pretty far. As a bonus, it is easier to move from here to something more performant than from a more complicated setup.
2. If the number of categories isn't very large, you can select all of them and build the hierarchy in PHP. Due to page size, the amount of work required to fetch 1 row versus a whole bunch of them (say a few hundred) is pretty much the same. This minimizes the number of queries/network trips, but increases the amount of data transported over the cable. Measure!
3. Cache the hierarchy and reload it entirely every X units of time, or whenever categories are added/modified/deleted. In its simplest form, the cache could be a PHP file with a nested variable structure containing the entire category hierarchy, along with a simple index for the nodes.
4. Create an additional table in which you have flattened the hierarchy in some way, using nested sets, path enumeration, a closure table, etc. The table will be maintained using triggers on the category table.
I would go for (1) unless you are fairly certain that you will have a sustained load of several users per second in the near future. (1 user per second makes roughly 2.5 million visits a month.)
There is nothing wrong with simple code. Complicating code for a speedup that isn't noticeable is wrong.
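For comparison, option (2) could look something like this: select all categories once, index them by cat_id, and walk up the parents in PHP. It is a sketch, not a drop-in replacement. The column names come from the question, the function name build_breadcrumbs() is hypothetical, a cat_parent of 0 is assumed to mean "no parent" (as the question's loop implies), and the sample categories are generated in place of a real query result.

```php
<?php
// $categories maps cat_id => ['cat_name' => string, 'cat_parent' => int],
// assumed to be filled from a single "SELECT * FROM categories" query.
function build_breadcrumbs(array $categories, int $catId, string $articleName): array
{
    $breadcrumbs = [$articleName];
    $seen = [];
    // Stop at the root (0), at an unknown id, or if the data contains a cycle.
    while ($catId !== 0 && isset($categories[$catId]) && !isset($seen[$catId])) {
        $seen[$catId] = true;
        array_unshift($breadcrumbs, $categories[$catId]['cat_name']);
        $catId = (int) $categories[$catId]['cat_parent'];
    }
    return $breadcrumbs;
}

// Made-up chain: Category 1 is the root, Category 5 is the deepest.
$categories = [];
for ($i = 1; $i <= 5; $i++) {
    $categories[$i] = ['cat_name' => "Category $i", 'cat_parent' => $i - 1];
}
$breadcrumbs = build_breadcrumbs($categories, 5, 'Content Item');
```

The same $categories array is also what you would store in the cache described in option (3).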
There are two commonly used methods of handling hierarchical data in relational databases: the adjacency list model and the nested set model. Your schema here is currently following the adjacency list model. Check out this page for some example queries. See also this question here on SO with a lot of good information.