Building Breadcrumbs in MySQL - php

I have a table that defines the possible categories in my website - fields look something like this:
- id
- name
- parentID
The information is stored something like this:
+----+------+----------+
| id | name | parentID |
+----+------+----------+
| 1  | pets | 0        |
| 2  | cats | 1        |
| 3  | dogs | 1        |
+----+------+----------+
A parentID of 0 indicates that the category/page is on the home level. I'm looking for a way to quickly and easily generate the parent categories.
The first method that came to mind was a series of SQL queries, but I quickly realised that this would be insidiously resource intensive the more complicated the site got.
Reading through the mysql manual, I've seen that mysql can use loops and conditional statements, however I'm unsure how I'd put those into practice here.
Ideally, I'd like to have a single query that pulls up all directly related parent elements.
If I were looking at the Pets category, I would only see home because it's on the top level. As soon as I drill down (either into cats, dogs or a page under pets) then I should see pets on the bar - the same goes for subsequent child categories and pages.
What's the most efficient way to generate a list of categories using information stored in this fashion? If this question requires more clarification, please ask, and I will do my best to provide more information.
Clarification: This is part of a CMS - and as such, users are going to need the ability to make changes to categories on the fly. I've looked at several data storage schemes (such as nested sets) and they do not appear to lend themselves well to a simple form for making changes to navigation.
As such, any method needs to be a) easily understood by a user, and b) easy to present to a user.
The categories are best described as folders on a PC, rather than tags. When you view any given category, you can see the immediate children of that category, as well as immediate child pages.
When you view a category or a page, the parent categories (but not the item itself) are visible.
Example: I have German Shepherd, which resides under dogs, which is under pets.
When viewing *pets*: Home
When viewing *dogs*: Home -> Pets
When viewing *German Shepherd*: Home -> Pets -> Dogs

Consider using "nested sets" model instead: Managing Hierarchical Data in MySQL.
Update (based on clarification to the question): The nested sets model does not have to be exposed to end users (in fact, I have a pretty hard time imagining why it would be). All directory-style operations (adding a new folder or subfolder, moving a folder to a different path, etc.) can be supported in the nested sets model, though some are a bit harder to implement than others. The article I've linked to provides examples for both adding and deleting a (sub)folder.
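To give a rough idea of why nested sets suit breadcrumbs: the ancestors of a node are simply the rows whose left/right range encloses it, so one query returns the whole trail. The sketch below is only an illustration. The table name `category`, the `lft`/`rgt` column names, the sample rows and the `breadcrumb` helper are all assumptions, and Python's sqlite3 stands in for MySQL so the snippet runs on its own (the SELECT itself is plain SQL).

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; the SELECT is plain SQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT, lft INTEGER, rgt INTEGER)"
)
# Hypothetical sample data: pets contains cats and dogs; dogs contains one page.
conn.executemany(
    "INSERT INTO category (id, name, lft, rgt) VALUES (?, ?, ?, ?)",
    [
        (1, "pets", 1, 8),
        (2, "cats", 2, 3),
        (3, "dogs", 4, 7),
        (4, "German Shepherd", 5, 6),
    ],
)


def breadcrumb(conn, category_id):
    """Return 'Home' plus the names of all ancestors, outermost first."""
    rows = conn.execute(
        """
        SELECT parent.name
        FROM category AS node
        JOIN category AS parent
          ON node.lft > parent.lft AND node.rgt < parent.rgt
        WHERE node.id = ?
        ORDER BY parent.lft
        """,
        (category_id,),
    ).fetchall()
    return ["Home"] + [name for (name,) in rows]


print(" -> ".join(breadcrumb(conn, 4)))  # Home -> pets -> dogs
```

The strict inequalities keep the node itself out of the trail, which matches the "parents but not itself" requirement in the question.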

Could you have a stack or ordered set (ordered by how the user applied filters to their browsing) containing your breadcrumb, stored on the session?
I could see it getting grim when you started cross-querying, but sometimes data isn't hierarchical, but more of a soup of tags, and the above starts being your tag-soup clarification breadcrumb.
Most websites don't actually feature good (or any) tag soup drilling down. E.g., how many times have you been looking at the sale CDs on a website and wanted to drill down to just see the Metal CDs (for example), but clicking on the "Rock and Metal" link on the left took you out to the top level metal category, instead of acting as a filter on your current browsing state?
So - is your problem actually a tag soup that you're applying a false hierarchy onto? Should you in fact be looking at automatic tag generation libraries that you can pass your items into, and tag lookup mechanisms? Okay, I'm sure your personal website won't be complex enough to ever require tag search, but in general terms, I think it is worth thinking about.

Related

Best MySQL Database Structure for a Yellow Pages Site

I'm building a yellow pages site. I have tried multiple database structures and I'm not sure which one is best. Here are the few I considered:
1. Saving all business data (name, phone, email, etc.) in one table, the list of tags in another, and mapping data id to tag id in a third table for the tag-data relationship. I found this cumbersome since I'll be doing most things directly in the database (at least initially, before launch), so spreading everything across tables could be a problem in my case. I must admit this one is a clean solution, though.
2. Saving biz entries in one table with a separate column for tags (containing comma-separated or JSON tags for every entry), then retrieving results using a LIKE query or full-text search for a tag. This will be slow and will get slower as the database grows. It's also not easy to maintain, for example if I have to rename a tag.
3. (My Preferred Choice) Distributing biz data in different tables based on type: all banks in one, with hotels, restaurants, etc. in separate tables, plus a separate table for all tags containing a rule for searching data from the table. Here is a detailed explanation.
Biz Tables:
college_tbl, bank_tbl, hotel_tbl, restaurant_tbl...so on
Tags Table
ID | Biz Table | Tag Name | Tag Key | Match Rule (col:like_query_part)
1 | bank_tbl | Citi Bank Branches | ['citi','bank'] | 'name:%$1%$2%'
2 | restaurant_tbl | Pizza Hut Restaurants | ['pizza','hut'] | 'name:%$1%$2%'
3 | hotel_tbl | The Leela Hotels | ['the leela'] | 'name:%$1%'
I'll then use 'Match rule' in like query to fetch results from 'Biz Table' for 'Tag Name'.
I'm going forward with the third approach. I feel it's simple, removes the need for a third data-tag relationship table, makes renaming easy, and performance won't degrade if each table has limited entries, say 1 million max per table.
I've been scratching my head for the last 15 days to find the best structure and feel this one is pretty good in my case.
Please suggest a better approach, or tell me if this approach could cause issues later on.
Use Number 1. Period, full stop.
The mistake is "doing things directly in the database" rather than developing the API first.
Number 2 has one advantage -- FULLTEXT search. That can be tacked onto #1 after you have a working API and some data to play with.
Number 3 (multiple similar tables) is a fiasco. Numerous Q&A ask about such; the reply is always "NO".
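For what it's worth, here is a minimal sketch of what Number 1 might look like. The table and column names (`business`, `tag`, `business_tag`) and the sample rows are assumptions made up for illustration, and sqlite3 stands in for MySQL so the snippet is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE business (id INTEGER PRIMARY KEY, name TEXT NOT NULL, phone TEXT);
CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
-- The mapping table: one row per (business, tag) pair.
CREATE TABLE business_tag (
    business_id INTEGER NOT NULL REFERENCES business(id),
    tag_id INTEGER NOT NULL REFERENCES tag(id),
    PRIMARY KEY (business_id, tag_id)
);
INSERT INTO business (id, name) VALUES
    (1, 'Citi Bank Main St'), (2, 'Pizza Hut Downtown'), (3, 'Citi Bank Airport');
INSERT INTO tag (id, name) VALUES (1, 'Citi Bank Branches'), (2, 'Pizza Hut Restaurants');
INSERT INTO business_tag VALUES (1, 1), (2, 2), (3, 1);
""")


def businesses_for_tag(conn, tag_name):
    """Return the names of all businesses carrying the given tag."""
    rows = conn.execute(
        """
        SELECT b.name
        FROM business AS b
        JOIN business_tag AS bt ON bt.business_id = b.id
        JOIN tag AS t ON t.id = bt.tag_id
        WHERE t.name = ?
        ORDER BY b.name
        """,
        (tag_name,),
    ).fetchall()
    return [name for (name,) in rows]


# Renaming a tag touches exactly one row; the mappings stay untouched.
conn.execute("UPDATE tag SET name = 'Citibank Branches' WHERE id = 1")
print(businesses_for_tag(conn, "Citibank Branches"))
```

The rename concern raised against option 2 goes away here, because the tag name is stored in one place only.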

Fastest way to sort through 'tagging' DB

I had a fairly commonplace tagging system set up:
table|'keyword'| : tag_id | tag
table|'tag_thread'|: tag_thread_id | tag_id | thread_id
table|'thread'| : thread_id | thread_info
However, I have since changed the way my tagging will be displayed.
My new idea is to have a related column in the keyword_tbl. I decided to try this route because I wanted to do a breadcrumb system, and would like to 'order' the tags, for instance, Sports -> baseball -> pitchers. Also, if they type in "baseball" I'd like to include Sports as part of the tags without them worrying about it.
keyword_tbl : keyword_id | keyword | related_id
For example:
keyword_tbl:
keyword_id // 1 // 2 // 3
keyword // sports // baseball // pitchers
related_id // 0 // 1 // 2
0 marks the fact that it is a 'general' tag, being the broadest term. This means that for each thread they post, I would only need to store a single value (the most detailed, or "pitchers" in the above example). Starting with "pitchers" I could derive the related fields, and create the breadcrumbs in a backwards manner.
My question is this: Which route would be better for what I'm trying to do with the breadcrumbs? Is there something particularly wrong with the way I'm planning on doing it that someone can see?
Thanks
Here are some potential problems (but this doesn't mean you are on the wrong track).
Generally, tags are a looser concept than categories. It sounds like you are mixing them together, which may be a problem. What is going to happen when you have a tag (let's say, "left handers") that applies to people in Baseball and Football? Tags were invented to avoid this kind of classification problem, where everything needs one parent in a tree.
The query to figure out the set of related tags is likely to be inefficient/messy, depending on how many levels of breadcrumbs you may have. Who is in charge of classifying the tags into a tree? If it's an admin function (therefore doesn't happen too often) you may want to create a "materialized view" that will hold all the related tags of each tag.
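MySQL has no built-in materialized views, so in practice this would be an ordinary table that gets rebuilt whenever an admin changes the tree. Below is a rough sketch of that idea using the `keyword_tbl` layout from the question. The `keyword_related` table and the `rebuild_keyword_related` helper are hypothetical names, and sqlite3 stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE keyword_tbl (keyword_id INTEGER PRIMARY KEY, keyword TEXT, related_id INTEGER);
INSERT INTO keyword_tbl VALUES (1, 'sports', 0), (2, 'baseball', 1), (3, 'pitchers', 2);
-- Hypothetical lookup table holding every (keyword, ancestor) pair.
CREATE TABLE keyword_related (
    keyword_id INTEGER,
    ancestor_id INTEGER,
    PRIMARY KEY (keyword_id, ancestor_id)
);
""")


def rebuild_keyword_related(conn):
    """Recompute all (keyword, ancestor) pairs; run after the tag tree is edited."""
    parent_of = dict(conn.execute("SELECT keyword_id, related_id FROM keyword_tbl"))
    pairs = []
    for keyword_id in parent_of:
        seen = {keyword_id}  # guards against accidental cycles in the data
        current = parent_of[keyword_id]
        # 0 marks a top-level tag, so the walk stops there.
        while current != 0 and current in parent_of and current not in seen:
            pairs.append((keyword_id, current))
            seen.add(current)
            current = parent_of[current]
    conn.execute("DELETE FROM keyword_related")
    conn.executemany("INSERT INTO keyword_related VALUES (?, ?)", pairs)


rebuild_keyword_related(conn)

# Breadcrumb for 'pitchers' (id 3) now needs a single join.
trail = conn.execute(
    """
    SELECT k.keyword
    FROM keyword_related AS r
    JOIN keyword_tbl AS k ON k.keyword_id = r.ancestor_id
    WHERE r.keyword_id = ?
    ORDER BY k.keyword_id
    """,
    (3,),
).fetchall()
print([name for (name,) in trail])
```

Ordering by `keyword_id` only happens to give the right breadcrumb order for this sample data; a real table would probably want a depth column to sort on.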

How do I create a directory for a product catalogue I'm making?

I am creating a product catalogue which will be browsed through database queries, with the results displayed in a div without a page refresh, via Ajax.
Category examples would be:
Home
Health
Entertainment
There are also subcategories for each category, i.e:
HOME:
Garden
Furniture
Plumbing
Etc.
I want to make a little directory thing that shows exactly where they are, something like:
Home >>> Garden >>> Lawn care
With each of those as a clickable link to take the person back to that specific query level.
My code is 1 .php document, involving a query and code to output the query. If output is clicked on, it triggers an ajax script which points back to and reruns the same query/output, but with different results.
That being said, I don't know how I would store and display the directory path that I mentioned above.
I was thinking of some way that takes the category that was clicked on and passes it through the URL so that the PHP has the value when it reloads. Then I would work that variable into a clickable directory link. But the problem is I'm not sure how to do that for multiple layers.
i.e.
If someone just clicked on "garden" I could pass the garden variable through and use that in the nav, but then if someone clicked on "lawn care" I wouldn't be sure how to keep the "garden" variable, because the variable I brought over via the URL would now read "lawn care."
I feel like it has something to do with dynamically adding and storing the cumulative values in an array, but I'm really out of ideas...
From what you've described, it sounds like you want to implement bread crumbs/categories rather than directories.
If this is the case, you'd basically need to create a Categories table in your database like so:
Categories
id | parent_id | name
1 | 0 | Home
2 | 1 | Garden
3 | 1 | Furniture
4 | 1 | Plumbing
5 | 2 | Lawn Care
This would equate to a hierarchy like the following:
Home
  Garden
    Lawn Care
  Furniture
  Plumbing
So if I want to have a product show, for instance, in Home > Garden > Lawn Care, I'll need to link the product to Category #5 (Lawn Care). Then I need to develop a function to do a little while loop that figures out the parent structure from there. It will need to loop until it doesn't find a parent (or until parent_id = 0). In other words, it would go:
I'm in Lawn Care. Does Lawn Care have a parent?
Yes -> Garden. Does Garden have a parent?
Yes -> Home. Does Home have a parent?
No -> End.
There are a number of ways to implement this, which is why I left specifics out.
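One such way, sketched in Python rather than PHP purely so it can be run standalone. The `CATEGORIES` dict is a hypothetical in-memory copy of the table above and `category_path` is a made-up helper name; in a real page each lookup would be a query (or a lookup in a preloaded array).

```python
# Hypothetical in-memory copy of the Categories table: id -> (parent_id, name).
CATEGORIES = {
    1: (0, "Home"),
    2: (1, "Garden"),
    3: (1, "Furniture"),
    4: (1, "Plumbing"),
    5: (2, "Lawn Care"),
}


def category_path(categories, category_id):
    """Walk parent_id links upward and return the names from the root down."""
    path = []
    visited = set()
    # parent_id = 0 means "no parent", so the loop ends at the top level.
    while category_id != 0 and category_id in categories:
        if category_id in visited:  # malformed data: a category is its own ancestor
            raise ValueError("cycle detected in category tree")
        visited.add(category_id)
        parent_id, name = categories[category_id]
        path.append(name)
        category_id = parent_id
    path.reverse()
    return path


print(" > ".join(category_path(CATEGORIES, 5)))  # Home > Garden > Lawn Care
```

The cycle guard is not strictly needed for clean data, but it stops the loop from spinning forever if a parent_id is ever set wrongly.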
Alternatively, you could just do this calculation on save of the product or category so it can map out the hierarchy the one time instead of every time (saving on calculations), but this would be a very simple solution that could work for a large number of products.
The benefit to doing it this way is that you can also implement product lists based on the category you're in, and from another perspective, you can create product counts per category.
You can use URLs like /Home/Garden/LawnCare and rewrite them using .htaccess and mod_rewrite. When the user clicks Garden or Home, you can easily go back to the requested page. This is a simple solution to the problem.
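With a path-style URL the "how do I keep garden once lawn care is clicked" problem mostly disappears, because every earlier level is already in the path. A small sketch of turning such a path into cumulative links (Python for runnability; `breadcrumb_links` is a hypothetical helper, and the mod_rewrite rules themselves are not shown):

```python
from html import escape


def breadcrumb_links(path):
    """Turn '/Home/Garden/LawnCare' into (label, cumulative_url) pairs."""
    links = []
    url = ""
    for segment in path.strip("/").split("/"):
        if not segment:  # skip empty pieces from '' or doubled slashes
            continue
        url += "/" + segment
        links.append((segment, url))
    return links


for label, url in breadcrumb_links("/Home/Garden/LawnCare"):
    # Escape both values before putting them into HTML.
    print(f'<a href="{escape(url)}">{escape(label)}</a>')
```

Each link points at the path up to and including its own segment, so clicking "Garden" returns to /Home/Garden.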

PHP / MySQL / ??? - Organizing articles and arbitrary-depth child articles where any article can be retrieved, along with all parents and children

I'm going to have a database of articles, or pages, sort of like wiki pages. All article names will need to be unique, much like a wiki page. They're going to be organized in a flexible-depth hierarchy, where each article can be a child or parent of multiple other articles--all categories are represented by articles.
A user should be able to jump straight to any page--perhaps by typing in its unique name (either into a box on the home page, or domain.com/articles/name), or searching and choosing a result, or clicking a link from another article. However, they should also be able to drill down through the categories, with breadcrumbs visible for all categories the current article belongs to (there shouldn't be more than 2 or 3, generally), and the ability to view all (or as many as are reasonable) child articles.
For example, there may be:
People
  Male
    Bob
    Jim
  Female
    Alice
    Sally
  Employees
    Bob
    Sally
Edibles
  Food
    Snacks
      Chips
        Salt & Vinegar
        Sour Cream & Onion
      Popcorn
    Pizza
    Sushi
  Drink
    Soda
      Cola
      Root Beer
    Juice
      Apple Juice
      Orange Juice
      Grape Juice
For example, if someone goes to Bob's page, they'll see all the information about Bob, as well as:
Parents:
Articles > People > Male > (Bob)
Articles > People > Employees > (Bob)
Children:
(none)
If someone goes to the page for Snacks, they'd see the article about snacks, as well as:
Parents:
Articles > Edibles > Food > (Snacks)
Children:
(Snacks) > Chips
(Snacks) > Chips > Salt & Vinegar
(Snacks) > Chips > Sour Cream & Onion
(Snacks) > Popcorn
I don't even have an idea of where to begin, here. What kind of nightmare am I going to have on my hands, in regards to setting up SQL tables to do this, and what are these queries going to look like? I plan on implementing some kind of caching, but I'd rather not have to rely on it for performance. To keep things working fine on average shared hosting, and to avoid mixing up my databases, I'd rather keep all of this in MySQL or PostgreSQL, which are necessary for the rest of the site. PHP will be used for the rest of the site, and I'd like to offload whatever work I can into PHP when it makes sense to do so. If another technology, like a non-relational database, would make this massively easier, I might be able to work with it.
I can already see the issue of Article 1 > Article 2 > Article 1 causing horrible issues, but I don't see any harm in simply not allowing an article to list one of its ancestors as a child.
Aside from help with trying to figure out how to implement all this in the first place, are there any other major pitfalls I'm missing?
This is pretty trivial, really, it's a run-of-the-mill tree. :)
The table schema for this will look like:
id unique id, perhaps the unique name
parent_id id of parent element
lft MPTT left field
rght MPTT right field
... any other fields you like
INDEX UNIQUE(id)
And that's it. Each article has one and only one parent article. The top elements may either have NULL as their parent id, or you'll have only one top node with NULL which all other articles are a child of. That depends on your preference. That's actually all you need, but to ease fetching of records, you'll use the lft and rght fields in an MPTT logic. Read this introductory article to learn what they're for. With these fields you can also avoid setting an entry as a child of its own child with one simple check.
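The "one simple check" could look roughly like this: a proposed new parent is invalid if its lft/rght range lies inside the range of the node being moved. This is only a sketch under that reading. The dict rows, their values and the helper names are hypothetical, and Python is used just to keep it runnable on its own.

```python
def is_descendant_or_self(node, candidate):
    """True if candidate sits inside node's subtree, judged by the lft/rght bounds."""
    return node["lft"] <= candidate["lft"] and candidate["rght"] <= node["rght"]


def can_move_under(node, new_parent):
    """Moving node under new_parent is safe only if new_parent is outside node's subtree."""
    return not is_descendant_or_self(node, new_parent)


# Hypothetical rows: Snacks (2..9) contains Chips (3..6); Drink (10..11) is elsewhere.
snacks = {"id": "Snacks", "lft": 2, "rght": 9}
chips = {"id": "Chips", "lft": 3, "rght": 6}
drink = {"id": "Drink", "lft": 10, "rght": 11}

print(can_move_under(snacks, chips))  # False: Chips is inside Snacks
print(can_move_under(chips, snacks))  # True
print(can_move_under(snacks, drink))  # True
```

No tree walk is needed, which is the appeal of keeping the lft and rght fields around.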
From the database side, your relationship is very simple. Something like this will get you started:
CREATE TABLE articles (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    parent_id INT NOT NULL DEFAULT 0,
    name VARCHAR(100) NOT NULL,
    content TEXT
);
Now you just have to decide where to enforce the "no ancestors as children" rule. If you want to enforce that in the database you can write an INSERT/UPDATE trigger that checks if the incoming parent_id is already a descendant and if so disallow the action. In typical usage, this will never happen because on the PHP side when you create content you'll usually be starting at a parent level and adding a child, never adding a parent to an existing piece of content.
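If the rule is enforced in application code instead of a trigger, the check amounts to walking up from the proposed parent and seeing whether the article itself turns up. A hedged sketch follows, using the `articles` table from above with made-up rows. `would_create_cycle` is a hypothetical helper, and sqlite3 stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE articles (
    id INTEGER PRIMARY KEY,
    parent_id INTEGER NOT NULL DEFAULT 0,
    name TEXT NOT NULL,
    content TEXT
);
INSERT INTO articles (id, parent_id, name) VALUES
    (1, 0, 'Edibles'), (2, 1, 'Food'), (3, 2, 'Snacks');
""")


def would_create_cycle(conn, article_id, new_parent_id):
    """True if new_parent_id is article_id itself or one of its descendants."""
    current = new_parent_id
    seen = set()
    # Walk upward from the proposed parent; parent_id = 0 marks the top level.
    while current != 0 and current not in seen:
        if current == article_id:
            return True
        seen.add(current)
        row = conn.execute(
            "SELECT parent_id FROM articles WHERE id = ?", (current,)
        ).fetchone()
        if row is None:  # dangling parent reference; nothing more to follow
            break
        current = row[0]
    return False


print(would_create_cycle(conn, 1, 3))  # True: Snacks is below Edibles
print(would_create_cycle(conn, 3, 1))  # False
```

This costs one query per level, which should be fine for the shallow trees described in the question.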
As for finding the ancestors and children for your record, that is probably best done in PHP with two loops based on these queries:
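// The ancestors (repeat with each fetched row's parent_id until it is 0):
$sql = "SELECT * FROM articles WHERE id = {$currentRecord->parent_id}";
// The children (repeat for each fetched row's id to go deeper):
$sql = "SELECT * FROM articles WHERE parent_id = {$currentRecord->id}";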
Hope that gets you started.

Database Structure Advice Needed

I'm currently working on a site which will contain a products catalog. I am a little new to database design so I'm looking for advice on how best to do this. I am familiar with relational database design so I understand "many to many" or "one to many" etc. (took a good db class in college). Here is an example of what an item might be categorized as:
Propeller -> aircraft -> wood -> brand -> product.
Instead of trying to write what I have so far, just take a quick look at this image I created from the phpmyadmin designer feature.
[Image: database design diagram, http://www.usfultimate.com/temp/db_design.jpg]
Now, this all seemed fine and dandy, until I realized that the category "wood" would also be used under propeller -> airboat -> (wood). This would mean, that "wood" would have to be recreated every time I want to use it under a different parent. This isn't the end of the world, but I wanted to know if there is a more optimal way to go about this.
Also, I am trying to keep this thing as dynamic as possible so the client can organize his catalog as his needs change.
*Edit. Was thinking about just creating a "tags" table. So I could assign the tag "wood" or "metal" or "50inch" to one or many items. I would still keep a parenting type thing for the main categories, but this way the categories wouldn't have to go so deep and there wouldn't be the repetition.
First, the user interface: as a user I hate searching for a product in a catalog organized in a strictly hierarchical way. I never remember which sub-sub-sub-sub...-category an "exotic" product is in, and this forces me to waste time exploring "promising" categories just to discover it is categorized in a (for me, at least) strange way.
What Kevin Peno suggests is good advice and is known as faceted browsing. As Marcia Bates wrote in After the Dot-Bomb: Getting Web Information Retrieval Right This Time, " .. faceted classification is to hierarchical classification as relational databases are to hierarchical databases. .. ".
In essence, faceted search allows users to search your catalog starting from whatever "facet" they prefer and lets them filter information by choosing other facets along the way. Note that, contrary to how tag systems are usually conceived, nothing prevents you from organizing some of these facets hierarchically.
To quickly understand what faceted search is all about, there are some demos to explore at The Flamenco Search Interface Project - Search Interfaces that Flow.
Second, the application logic: what Manitra proposes is also good advice (as I understand it), i.e. separating the nodes and links of a tree/graph into different relations. What he calls the "ancestor table" (which is a much more intuitive name, however) is known as the transitive closure of a directed acyclic graph (DAG) (the reachability relation). Beyond performance, it simplifies queries greatly, as Manitra said.
But I suggest a view for such an "ancestor table" (transitive closure), so that updates are real-time and incremental, not periodic via a batch job. There is SQL code (but I think it needs to be adapted a little to specific DBMSes) in papers I mentioned in my answer to query language for graph sets: data modeling question. In particular, look at Maintaining Transitive Closure of Graphs in SQL (.ps - postscript).
Products-Categories relationship
Manitra's first point is also worth emphasizing.
What he is saying is that between products and categories there is a many-to-many relationship. I.e.: each product can be in one or more categories and in each category there can be zero or more products.
Given relation variables (relvars) Products and Categories such relationship can be represented, for example, as a relvar PC with at least attributes P# and C#, i.e. product and category numbers (identifiers) in a foreign-key relationships with corresponding Products and Categories numbers.
This is complementary to management of categories' hierarchies. Of course, this is only a design sketch.
On faceted browsing in SQL
A useful concept to implement "faceted browsing" is relational division, or, even, relational comparisons (see bottom of linked page). I.e. dividing PC (Products-Categories) by a (growing) list of categories chosen from a user (facet navigation) one obtains only products in such categories (of course, categories are presumed not all mutually exclusive, otherwise choosing two categories one will obtain zero products).
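Since SQL has no division operator, a common workaround is the GROUP BY / HAVING COUNT idiom: keep the products whose number of matching categories equals the number of categories selected. The sketch below is one way to write it, not the method from the papers. The `pc` table mirrors the PC relvar above, while the sample data and the `products_in_all` helper are made up, and sqlite3 stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pc (
    product_id INTEGER,
    category_id INTEGER,
    PRIMARY KEY (product_id, category_id)
);
-- Hypothetical data: categories 1 = aircraft, 2 = airboat, 3 = wood.
INSERT INTO pc VALUES (100, 1), (100, 3), (101, 2), (101, 3), (102, 1);
""")


def products_in_all(conn, category_ids):
    """Relational division: products linked to every one of category_ids."""
    wanted = sorted(set(category_ids))
    if not wanted:
        # Dividing by an empty set keeps every product.
        rows = conn.execute("SELECT DISTINCT product_id FROM pc ORDER BY product_id")
        return [pid for (pid,) in rows]
    placeholders = ", ".join("?" for _ in wanted)
    rows = conn.execute(
        f"""
        SELECT product_id
        FROM pc
        WHERE category_id IN ({placeholders})
        GROUP BY product_id
        HAVING COUNT(DISTINCT category_id) = ?
        ORDER BY product_id
        """,
        (*wanted, len(wanted)),
    ).fetchall()
    return [pid for (pid,) in rows]


print(products_in_all(conn, [3]))     # wood only
print(products_in_all(conn, [1, 3]))  # aircraft AND wood
```

Each facet the user adds is one more id in the list, so the result set can only shrink as they drill down.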
SQL-based DBMSes usually lack these operators (division and comparisons), so I give below some interesting papers that implement or discuss them:
ON MAKING RELATIONAL DIVISION COMPREHENSIBLE (.pdf from FIE 2003 Session Index);
A simpler (and better) SQL approach to relational division (.pdf from Journal of Information Systems Education - Contents Volume 13, Number 2 (2002));
Processing frequent itemset discovery queries by division and set containment join operators;
Laws for Rewriting Queries Containing Division Operators;
Algorithms and Applications for Universal Quantification in Relational Databases;
Optimizing Queries with Universal Quantification in Object-Oriented and Object-Relational Databases;
(ACM access required) On the complexity of division and set joins in the relational algebra;
(ACM access required) Fast algorithms for universal quantification in large databases;
and so on...
I will not go into details here but interaction between categories hierarchies and facet browsing needs special care.
A digression on "flatness"
I briefly looked at the article linked by Pras, Managing Hierarchical Data in MySQL, but I stopped reading after these few lines in the introduction:
Introduction
Most users at one time or another have dealt with hierarchical data in a SQL database and no doubt learned that the management of hierarchical data is not what a relational database is intended for. The tables of a relational database are not hierarchical (like XML), but are simply a flat list. Hierarchical data has a parent-child relationship that is not naturally represented in a relational database table. ...
To understand why this insistence on flatness of relations is just nonsense, imagine a cube in a three dimensional Cartesian coordinate system: it will be identified by 8 coordinates (triplets), say P1(x1,y1,z1), P2(x2,y2,z2), ..., P8(x8, y8, z8) [here we are not concerned with constraints on these coordinates so that they represent really a cube].
Now, we will put these set of coordinates (points) into a relation variable and we will name this variable Points. We will represent the relation value of Points as a table below:
Points| x | y | z |
=======+====+====+====+
| x1 | y1 | z1 |
+----+----+----+
| x2 | y2 | z2 |
+----+----+----+
| .. | .. | .. |
| .. | .. | .. |
+----+----+----+
| x8 | y8 | z8 |
+----+----+----+
Is this cube being "flattened" by the mere act of representing it in a tabular way? Is a relation (value) the same thing as its tabular representation?
A relation variable assumes as values sets of points in a n-dimensional discrete space, where n is the number of relation attributes ("columns"). What does it mean, for a n-dimensional discrete space, to be "flat"? Just nonsense, as I wrote above.
Don't get me wrong, it is certainly true that SQL is a badly designed language and that SQL-based DBMSes are full of idiosyncrasies and shortcomings (NULLs, redundancy, ...), especially the bad ones, the DBMS-as-dumb-store type (no referential constraints, no integrity constraints, ...). But that has nothing to do with the fantasized limitations of the relational data model; on the contrary, the more they turn away from it, the worse the outcome.
In particular, the relational data model, once you understand it, poses no problem in representing any structure, even hierarchies and graphs, as I detailed with references to the published papers mentioned above. Even SQL can do it, if you gloss over its deficiencies, for lack of something better.
On the "The Nested Set Model"
I skimmed the rest of that article and I'm not particularly impressed by such logical design: it suggests to muddle two different entities, nodes and links, into one relation and this will probably cause awkwardness. But I'm not inclined to analyze that design more thoroughly, sorry.
EDIT: Stephan Eggermont objected, in comments below, that " The flat list model is a problem. It is an abstraction of the implementation that makes performance difficult to achieve. ... ".
Now, my point is, precisely, that:
this "flat list model" is a fantasy: just because one lays out (represents) relations as tables ("flat lists") does not mean that relations are "flat lists" (an "object" and its representations are not the same thing);
a logical representation (relation) and physical storage details (horizontal or vertical decompositions, compression, indexes (hashes, b+tree, r-tree, ...), clustering, partitioning, etc.) are distinct; one of the points of relational data model (RDM) is to decouple logical from "physical" model (with advantages to both users and implementors of DBMSes);
performance is a direct consequence of physical storage details (implementation) and not of logical representation (Eggermont's comment is a classic example of logical-physical confusion).
The RDM does not constrain implementations in any way; one is free to implement tuples and relations as one sees fit. Relations are not necessarily files and tuples are not necessarily records of a file. Such correspondence is a dumb direct-image implementation.
Unfortunately SQL-based DBMS implementations are, too often, dumb direct-image implementations and they suffer poor performance in a variety of scenarios - OLAP/ETL products exist to cover these shortcomings.
This is slowly changing. There are commercial and free software/open source implementations that finally avoid this fundamental pitfall:
Vertica, which is a commercial successor of..
C-Store: A Column-Oriented DBMS;
MonetDB;
LucidDB;
Kdb in a way;
and so on...
Of course, the point is not that there must exist an "optimal" physical storage design, but that whatever physical storage design is used can be abstracted away by a nice declarative language based on relational algebra/calculi (and SQL is a bad example) or more directly on a logic programming language (like Prolog, for example - see my answer to "prolog to SQL converter" question). A good DBMS should be able to change its physical storage design on the fly, based on data access statistics (and/or user hints).
Finally, in Eggermont's comment the statement " The relational model is getting squeeezed between the cloud and prevayler. " is another nonsense but I cannot give a rebuttal here, this comment is already too long.
Before you create a hierarchical category model in your database, take a look at this article which explains the problems and the solution (using nested sets).
To summarize, using a simple parent_category_id doesn't scale very well and you'll have a hard time writing performant SQL queries. The answer is to use nested sets which make you visualize your many-to-many category model as sets which are nested inside other sets.
If you want categories to have multiple parent categories, then it's just a "many to many" relationship instead of a "one to many" relationship. You'll need to put a bridging table between category and itself.
However, I doubt this is what you want. If I'm looking in the category Aircraft > Wood then I wouldn't want to see items from Boating > Wood. There are two Wood categories because they contain different items.
My suggestions
put a many-to-many relation between Item and Category so that a product can be displayed in many hierarchy nodes (used in ebay, sourceforge ...)
keep the category hierarchy
Performance on the category hierarchy
If your category hierarchy is deep, then you could generate an "Ancestors" table. This table will be generated by a batch job and will contain:
ChildId (the id of a category)
AncestorId (the id of its parent, grand parent ... all ancestors category)
It means that if you have 3 categories : 1-Propeller > 2-aircraft > 3-wood
Then the Ancestor table will contain :
ChildId AncestorId
2 1
3 1
3 2
This means that to get all the children of category 1, you need just one query and no nested queries. This works no matter how deep your category hierarchy is.
Thanks to this table, you need only one join to query against a category (with its children).
If you need help on how to create the Ancestor table, just let me know.
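As a rough idea of how the batch job could build that table: a recursive common table expression can expand the parent links into every (child, ancestor) pair. Recursive CTEs need MySQL 8.0 or later, so treat this as a sketch under that assumption. The `category` and `ancestors` table names and columns are hypothetical, and sqlite3 stands in for MySQL here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
INSERT INTO category VALUES (1, NULL, 'Propeller'), (2, 1, 'aircraft'), (3, 2, 'wood');
CREATE TABLE ancestors (
    child_id INTEGER,
    ancestor_id INTEGER,
    PRIMARY KEY (child_id, ancestor_id)
);
""")


def rebuild_ancestors(conn):
    """Batch job: recompute every (child, ancestor) pair from the parent links."""
    pairs = conn.execute(
        """
        WITH RECURSIVE up(child_id, ancestor_id) AS (
            SELECT id, parent_id FROM category WHERE parent_id IS NOT NULL
            UNION
            SELECT up.child_id, c.parent_id
            FROM up
            JOIN category AS c ON c.id = up.ancestor_id
            WHERE c.parent_id IS NOT NULL
        )
        SELECT child_id, ancestor_id FROM up
        """
    ).fetchall()
    conn.execute("DELETE FROM ancestors")
    conn.executemany("INSERT INTO ancestors VALUES (?, ?)", pairs)


rebuild_ancestors(conn)

# Everything below category 1, with a single join and no nesting.
below = conn.execute(
    """
    SELECT c.name
    FROM ancestors AS a
    JOIN category AS c ON c.id = a.child_id
    WHERE a.ancestor_id = ?
    ORDER BY c.id
    """,
    (1,),
).fetchall()
print([name for (name,) in below])
```

Using UNION rather than UNION ALL drops duplicate pairs, which also keeps the recursion from running forever if the data ever contains a loop.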
Before you create a hierarchical category model in your database, take a look at this article which explains the problems and the solution (using nested sets).
To summarize, using a simple parent_category_id doesn't scale very well and you'll have a hard time writing performant SQL queries. The answer is to use nested sets which make you visualize your many-to-many category model as sets which are nested inside other sets.
It should be worth pointing out that the "multiple categories" idea is basically how "tagging" works. With the exception that, in "tagging", we allow any product to have many categories. By allowing any product to be in many categories, you allow the customer the full ability to filter their search by starting where they believe they need to start. It could be clicking on "airplanes", then "wood", then "turbojet engine" (or whatever). Or they could start their search with Wood, and get the same result.
This will give you the greatest flexibility, and the customer will enjoy a better UX, yet still allow you to maintain the hierarchy structure. So, while the quoted answer suggests letting categories be M:N to categories, my suggestion is to allow products to have M:N categories instead.
All in all the result is mostly the same, the categories will have a natural hierarchy, but this will lend to even greater flexibility.
I should also note that this doesn't prevent strict hierarchy either. You could just as easily enforce hierarchy in the code where necessary (e.g. only showing the categories "cars", "airplanes", and "boats" on your initial page). It just moves the "strictness" to your business logic, which might make it better in the long run.
EDIT: I just realized that you vaguely mentioned this in your answer. I actually didn't notice it, but I think this is along the lines of what you would want to do instead. Otherwise you are mixing two hierarchy systems into your program without much benefit.
I've done this before. I recommend starting with tagging (many-to-many relationship table to products). You can build a hierarchy relationship on top of your tags (tree, or nested sets, or whatever) a lot easier than on your products. Because tagging is relatively freeform, this also gives you the ability to allow people to categorize naturally and then later codify certain expected behaviors.
For instance, we had special tags like 2009-Nov-Special. Any product like this was eligible to show as a special on the front page for that month. So we didn't have to build a special system to handle rotating specials onto the front page we just used the existing tag system. Later this could be enhanced to hide those tags from consumers, etc.
Similarly, you can use tagging prefixes like: style:wood mfg:Nike to allow you to do relatively complex categorization and drilldowns without the difficulties of complex database reshuffling or the nightmares of EAV, all in a tagging system which gives you more flexibility to accommodate user expectations. Remember that users might expect to navigate the products in ways different than you as a database and business owner might expect. Using the tagging system can help you enable the shopping interface without compromising your inventory or sales tracking or anything else.
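To illustrate the prefix idea, here is a small sketch that splits prefixed tags into facets for a drilldown menu. The `group_by_facet` helper and the `"tag"` bucket for unprefixed tags are assumptions made for the example, not part of any particular tagging library.

```python
from collections import defaultdict


def group_by_facet(tags, default_facet="tag"):
    """Split 'prefix:value' tags into {prefix: [values]}; others go under default_facet."""
    facets = defaultdict(list)
    for tag in tags:
        prefix, sep, value = tag.partition(":")
        if sep and prefix and value:
            facets[prefix].append(value)
        else:
            # No usable prefix, so keep the whole tag as a plain tag.
            facets[default_facet].append(tag)
    return dict(facets)


print(group_by_facet(["style:wood", "mfg:Nike", "2009-Nov-Special"]))
```

The tags stay plain strings in the database; the facet structure only exists in the code that reads them, which is what keeps the schema simple.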
Now, this all seemed fine and dandy, until I realized that the category "wood" would also be used under propeller -> airboat -> (wood). This would mean, that "wood" would have to be recreated every time I want to use it under a different parent. This isn't the end of the world, but I wanted to know if there is a more optimal way to go about this.
What if you have an aircraft that is wood construction, but the propeller could be carbon fiber, fiberglass, metal, graphite?
I'd define a table of materials, and use a foreign key reference in the items table. If you want to support more than one material (e.g. say there's metal reinforcement, or screws...), then you'd need a corollary/lookup/xref table.
MATERIALS_TYPE_CODE table
MATERIALS_TYPE_CODE pk
MATERIALS_TYPE_CODE_DESC
PRODUCTS table
PRODUCT_ID, pk
MATERIALS_TYPE_CODE fk IF only one material is ever associated
PRODUCT_MATERIALS_XREF table
PRODUCT_ID, pk
MATERIALS_TYPE_CODE pk
I would also relate products to one another using a corollary/lookup/xref table. A product could be related to more than one kitted product:
KITTED_PRODUCTS table
PARENT_PRODUCT_ID, fk
CHILD_PRODUCT_ID, fk
...and it supports a hierarchical relationship because the child could be the parent of something else.
You can easily test your DB designs at http://cakeapp.com
