I am trying to use inheritance for sake of re-usability and I have a base class named 'Partner' and two sub classes 'Lawyer' and 'Accountant' as below:
abstract class Partner{
}
class Lawyer extends Partner{
}
class Accountant extends Partner{
}
It's all good with the code, but I was wondering more when it comes to database, how do I store these information in db, should I have one table partners for both entities, or I should have separate tables for each entity, based on fact that these two differs from each other based on their attributes.
If I keep it in separated tables, and I have another table 'cases' which is related to partner:
cases: case_id, subject, description,partner_id.... My question is, how do I inner join entities based on user logged in, let say if he is laywer join lawyer table or vice versa, what if I keep adding more entities in future?
How would I do it in such a way that eventhou in future I add more Entities my db queries will not have to be changed again and again.
I can't say I favor the idea of combining lawyers and acct's into one table partners. My suggestion is to keep your db entities separate. So you will have three tables (lawyer, accountant, and cases) and cases will contain foreign keys from lawyer and accountant. Your inner joins will be simple in this case. You'll have two 1:1 relationships whereas using a partner table it will have to be 1:M relationship to get both the lawyer and the accountant - each being its own record within the Partners table. Plus don't forget all the useless NULLS you'll have since lawyers and accountants have their own specific attributes..eew!
Since you are going to use Lawyers and Accountants in the same context, I think it makes sense to use the same table.
You can simply create a partners table with the common properties, and create separate accountants and lawyers tables with a 1:1 relationship. This basically means these two tables don't get their own auto-incrementing primary key, but just a partner_id that is primary and refers to the id in the partner table.
I have a general question about how to implement the best practice of model structure for an application I'm building in Laravel 5.
So, at the moment I have things set up like this:
'user' model and table: id, email, password, admin level - this is really just the info for authenticating login.
'user-details' model and table: id, userID (foreign key for user table id field), name, address etc - all the other details
'lesson-type' model and table: id, teacherID (foreign key for user-details table id field), lesson-label etc - info about different types of lessons
At the moment I have a Teacher Controller in which I'm passing through to the view:
- The info from the User table
- The info from the User-details table
- A list of different lesson types for the teacher from the Lesson-type table
But I kind of feel that all this should be tied together with one separate Teacher model which would extend the User-details model (and probably which in turn should extend the User model), but wouldn't have it's own table associated with it, but all the info pertaining to either updates for the User-details or the Lesson-types table would be stored in those relevant tables. Would this be correct?
(I should also say that users may alternatively be parents rather than teachers, and so would I would have a separate Parents model for all the properties and so on associated with parents)
I would then pass only the Teacher model object into the view and thus gain access to all the teacher info such as personal details and array of lesson types.
As I'm typing, this is sounding more and more to me like the right way to go, but it would be great to get some advice.
1 - technical implementation: I guess in the Teacher model, I'd populate all the relevant teacher into class variables (Name, array of lessons etc) in the constructor?
2 - am I over complicating this structure by having both Users AND Users details tables?
3 - Does what I'm proposing make the most structural sense in Laravel?
4 - just another thought I've just had, should the teacherID in the lesson-type table actually refer to the User table rather than the User-detail table... so user-detail and lesson-type would both be direct children of the user table??
Very much obliged for any help :)
You shouldn't extend models like that unless there is a clear inheritance. From a logical standpoint, it just doesn't make any sense since you'll have to overwrite most of what is on the User model anyway. And what you don't overwrite will be incorrectly mapped to the database because they are 2 completely different tables. What you actually want to do is utilize Eloquent relationships.
For clarity, I am assuming this basic structure:
users - id
teachers - id, user_id
user_details - id, user_id
lesson_types - id, teacher_id
Those should be 4 completely different models all interconnected using the Model::belongsTo() method. So the Teacher model would be
class Teacher extends Model {
public $table = 'teachers';
public function user() {
return $this->belongsTo('App\User');
}
}
When you query for a teacher, you can do Teacher::with('user')->get(). That will return all records from the teachers table and on each instance of the Teacher model, you'll be able to call $teacher->user and get the User instance associated with that teacher. That is a full model, not just extra data, so you have access to everything on the User Model, which is generally the main reason for extending
For your list of questions:
I may be misunderstanding you, but this isn't how an ORM works. I'd suggest going back and reading through the Eloquent docs (if you're running 5.0, I suggest reading 5.1's docs since they are much, much better)
It will depend on who you ask, but I tend to think so. If the data is clearly related and there is no reason for it to be shared across record types (for example, I generally have an addresses table that all records reference instead of having 5 address fields repeated on multiple tables), I believe it should all be on one table. It just makes it more difficult to manage later on if you have it in separate tables.
There will be those who disagree and think that smaller scopes for each table are better and it will likely allow for quicker queries on extremely large datasets, but I don't think it's worth the extra trouble in the end
No, as I have explained above
The teacher_id column should reference the teachers table, assuming that lessons belong to teachers and cannot belong to just any user in the system. Using the ORM, you'll be able to do $lesson->teacher->user->userDetails to get that data
I really think you need to go back and read through the Eloquent docs. Your understanding of how Eloquent works and how it is meant to be used seems very basic and you are missing much of the finer details.
Basics
Relationships
Laracasts - Laravel Fundamentals - You would benefit from watching Lesses 7-9, 11, 14, and 21
I'm currently working on an app backend (business directory). Main "actor" is an "Entry", which will have:
- main category
- subcategory
- tags (instead of unlimited sub-levels of division)
I'm pretty new to OOP but I still want to use it here. The database is MySql and I'll be using PDO.
In an attempt to figure out what database table structure should I use in order to support the above classification of entries, I was thinking about a solution that Wordpress uses - establish relationship between an entry and cats/subcats/tags through several tables (terms, taxonomies, relationships). What keeps me from this solution at the moment is the fact that each relationship of any kind is represented by a row in the relationships table. Given 50,000 entries I would have, attaching to a particular entry: main cat, subcat and up to 15 tags might slow down the app (or I am wrong)?
I then learned a bit about Table Data Gateway which seemed an excellent solution because I liked the idea of having one table per a class but then I read there is virtually no way of successful combating the impedence missmatch between the OOP and relational-mapping.
Are there any other approaches that you may see fit for this situation? I think I will be going with:
tblentry
tblcategory
tblsubcategory
tbltag
structure. Relationships would be based on the parent IDs but I+'m wondering is that enough? Can I be using foreign key and cascade delete options here (that is something I am not too familiar with and it seems to me as a more intuitive way of having relationships between the elements in tables)?
having a table where you store the relationship between your table is a good idea, and through indexes and careful thinking you can achieve very fast results.
since each entry must represent a different kind of link between two entities (subcategory to main entry, tag to subcategory) you need at least (and at the very most) three fields:
id1 (or the unique id of the first entity)
linkid (linking to a fourth table where each link is described)
id2 (or the unique id of the second entity)
those three fields can and should be indexed.
now the fourth table to achieve this kind of many-to-many relationship will describe the nature of the link. since many different type of relationship will exist in the table, you can't keep what the type is (child of, tag of, parent of) in the same table.
that fourth table (reference) could look like this:
id nature table1 table2
1 parent of entry tags
2 tag of tags entry
the table 1 field tells you which table the first id refers to, likewise with table2
the id is the number between the two fields in your relationship table. only the id field should be indexed. the nature field is more for the human reader then for joining tables or organizing data
We have a large number of data in many categories with many properties, e.g.
category 1: Book
properties: BookID, BookName, BookType, BookAuthor, BookPrice
category 2: Fruit
properties: FruitID, FruitName, FruitShape, FruitColor, FruitPrice
We have many categories like book and fruit. Obviously we can create many tables for them (MySQL e.g.), and each category a table. But this will have to create too many tables and we have to write many "adapters" to unify manipulating data.
The difficulties are:
1) Every category has different properties and this results in a different data structure.
2) The properties of every categoriy may have to be changed at anytime.
3) Hard to manipulate data if each category a table (too many tables)
How do you store such kind of data?
You can separate the database into two parts: Definition Tables and Data Tables. Basically the Definition Tables is used to interpret the Data Tables where the actual data is stored (some would say that the definition tables is more elegant if represented in XML).
The following is the basic idea.
Definition Tables:
TABLE class
class_id (int)
class_name (varchar)
TABLE class_property
property_id (int)
class_id (int)
property_name (varchar)
property_type (varchar)
Data Tables:
TABLE object
object_id (int)
class_id (varchar)
TABLE object_property
property_id (int)
property_value (varchar)
It would be best if you could also create additional Layer to interpret the structure so as to make it easier for the Data Layer to operate on the data. And you must of course take into consideration performance, ease of query, etc.
Just my two cents, I hope it could be of any help.
Regards.
If your data collection isn't too big, the Entity-Attribute-Value (EAV) model may fit nicely the bill.
In a nutshell, this structure allows the definition of Categories, the list of [required or optional] Attributes (aka properties) the entities in such category include etc, in a set of tables known as the meta-data, the logical schema of the data, if you will. The entity instances are stored in two tables a header and a values tables, whereby each attribute is stored in a single [SQL] record of the later table (aka "vertical" storage: what used to be a record in traditional DBMS model is made of several records of the value table).
This format is very practical in particular for its flexibility: it allows both late and on-going changes in the logical schema (addition of new categories, additions/changes in the attributes of a given category etc.), as well the implicit data-driven handling of the underlying catalog's logical schema, at the level of the application. The main drawbacks of this format are the [somewhat] more sophisticated, abstract, implementation and, mainly, some limitations with regards to scaling etc. when the catalog size grows, say in the million+ entities range.
See the EAV model described in more details in this SO answer of mine.
Triggered by this question and other similar ones, I wrote a blog post on how to handle such cases using a graph database. In short, graph databases don't have the problem "how to force a tree/hierarchy into tables" as there's simply no need for it: you store your tree structure as it is. They're not good at everything (like for example creating reports) but this is a case where graph databases shine.
Im currently working on a site which will contain a products catalog. I am a little new to database design so I'm looking for advice on how best to do this. I am familiar with relational database design so I understand "many to many" or "one to many" etc (took a good db class in college). Here is an example of what an item might be categorized as:
Propeller -> aircraft -> wood -> brand -> product.
Instead of trying to write what I have so far, just take a quick look at this image I created from the phpmyadmin designer feature.
alt text http://www.usfultimate.com/temp/db_design.jpg
Now, this all seemed fine and dandy, until I realized that the category "wood" would also be used under propeller -> airboat -> (wood). This would mean, that "wood" would have to be recreated every time I want to use it under a different parent. This isn't the end of the world, but I wanted to know if there is a more optimal way to go about this.
Also, I am trying to keep this thing as dynamic as possible so the client can organize his catalog as his needs change.
*Edit. Was thinking about just creating a "tags" table. So I could assign the tag "wood" or "metal" or "50inch" to 1 to many items. I would still keep a parenting type thing for the main categories, but this way the categories wouldnt have to go so deep and there wouldnt be the repetition.
First, the user interface: as user I hate to search a product in a catalog organized in a strictly hierarchical way. I never remember in what sub-sub-sub-sub...-category an "exotic" product is in and this force me to waste time exploring "promising" categories just to discover it is categorized in a (for me, at least) strange way.
What Kevin Peno suggests is a good advice and is known as faceted browsing. As Marcia Bates wrote in After the Dot-Bomb: Getting Web Information Retrieval Right This Time, " .. faceted classification is to hierarchical classification as relational databases are to hierarchical databases. .. ".
In essence, faceted search allows users to search your catalog starting from whatever "facet" they prefer and let them filter information choosing other facets along the search. Note that, contrary to how tag systems are usually conceived, nothing prevents you to organize some of these facets hierarchically.
To quickly understand what faceted search is all about, there are some demos to explore at The Flamenco Search Interface Project - Search Interfaces that Flow.
Second, the application logic: what Manitra proposes is also a good advice (as I understand it), i.e. separating nodes and links of a tree/graph in different relations. What he calls "ancestor table" (which is a much better intuitive name, however) is known as transitive closure of a directed acyclic graph (DAG) (reachability relation). Beyond performance, it simplify queries greatly, as Manitra said.
But I suggest a view for such "ancestor table" (transitive closure), so that updates are in real-time and incremental, not periodical by a batch job. There is SQL code (but I think it needs to be adapted a little to specific DBMSes) in papers I mentioned in my answer to query language for graph sets: data modeling question. In particular, look at Maintaining Transitive Closure of Graphs in SQL (.ps - postscript).
Products-Categories relationship
The first point of Manitra is worth of emphasis, also.
What he is saying is that between products and categories there is a many-to-many relationship. I.e.: each product can be in one or more categories and in each category there can be zero or more products.
Given relation variables (relvars) Products and Categories such relationship can be represented, for example, as a relvar PC with at least attributes P# and C#, i.e. product and category numbers (identifiers) in a foreign-key relationships with corresponding Products and Categories numbers.
This is complementary to management of categories' hierarchies. Of course, this is only a design sketch.
On faceted browsing in SQL
A useful concept to implement "faceted browsing" is relational division, or, even, relational comparisons (see bottom of linked page). I.e. dividing PC (Products-Categories) by a (growing) list of categories chosen from a user (facet navigation) one obtains only products in such categories (of course, categories are presumed not all mutually exclusive, otherwise choosing two categories one will obtain zero products).
SQL-based DBMS usually lack this operators (division and comparisons), so I give below some interesting papers that implement/discuss them:
ON MAKING RELATIONAL DIVISION COMPREHENSIBLE (.pdf from FIE 2003 Session Index);
A simpler (and better) SQL approach to relational division (.pdf from Journal of Information Systems Education - Contents Volume 13, Number 2 (2002));
Processing frequent itemset discovery queries by division and set containment join operators;
Laws for Rewriting Queries Containing Division Operators;
Algorithms and Applications for Universal Quantification in Relational Databases;
Optimizing Queries with Universal Quantification in Object-Oriented and Object-Relational Databases;
(ACM access required) On the complexity of division and set joins in the relational algebra;
(ACM access required) Fast algorithms for universal quantification in large databases;
and so on...
I will not go into details here but interaction between categories hierarchies and facet browsing needs special care.
A digression on "flatness"
I briefly looked at the article linked by Pras, Managing Hierarchical Data in MySQL, but I stopped reading after these few lines in the introduction:
Introduction
Most users at one time or another have
dealt with hierarchical data in a SQL
database and no doubt learned that the
management of hierarchical data is not
what a relational database is intended
for. The tables of a relational
database are not hierarchical (like
XML), but are simply a flat list.
Hierarchical data has a parent-child
relationship that is not naturally
represented in a relational database
table. ...
To understand why this insistence on flatness of relations is just nonsense, imagine a cube in a three dimensional Cartesian coordinate system: it will be identified by 8 coordinates (triplets), say P1(x1,y1,z1), P2(x2,y2,z2), ..., P8(x8, y8, z8) [here we are not concerned with constraints on these coordinates so that they represent really a cube].
Now, we will put these set of coordinates (points) into a relation variable and we will name this variable Points. We will represent the relation value of Points as a table below:
Points| x | y | z |
=======+====+====+====+
| x1 | y1 | z1 |
+----+----+----+
| x2 | y2 | z2 |
+----+----+----+
| .. | .. | .. |
| .. | .. | .. |
+----+----+----+
| x8 | y8 | z8 |
+----+----+----+
Does this cube is being "flattened" by the mere act of representing it in a tabular way? Is a relation (value) the same thing as its tabular representation?
A relation variable assumes as values sets of points in a n-dimensional discrete space, where n is the number of relation attributes ("columns"). What does it mean, for a n-dimensional discrete space, to be "flat"? Just nonsense, as I wrote above.
Don't get me wrong, It is certainly true that SQL is a badly designed language and that SQL-based DBMSes are full of idiosyncrasies and shortcomings (NULLs, redundancy, ...), especially the bad ones, the DBMS-as-dumb-store type (no referential constraints, no integrity constrains, ...). But that has nothing to do with relational data model fantasized limitations, on the contrary: more they turn away from it and worse is the outcome.
In particular, the relational data model, once you understand it, poses no problem in representing whatever structure, even hierarchies and graphs, as I detailed with references to published papers mentioned above. Even SQL can, if you gloss over its deficiencies, missing something better.
On the "The Nested Set Model"
I skimmed the rest of that article and I'm not particularly impressed by such logical design: it suggests to muddle two different entities, nodes and links, into one relation and this will probably cause awkwardness. But I'm not inclined to analyze that design more thoroughly, sorry.
EDIT: Stephan Eggermont objected, in comments below, that " The flat list model is a problem. It is an abstraction of the implementation that makes performance difficult to achieve. ... ".
Now, my point is, precisely, that:
this "flat list model" is a fantasy: just because one lay out (represents) relations as tables ("flat lists") does not mean that relations are "flat lists" (an "object" and its representations are not the same thing);
a logical representation (relation) and physical storage details (horizontal or vertical decompositions, compression, indexes (hashes, b+tree, r-tree, ...), clustering, partitioning, etc.) are distinct; one of the points of relational data model (RDM) is to decouple logical from "physical" model (with advantages to both users and implementors of DBMSes);
performance is a direct consequence of physical storage details (implementation) and not of logical representation (Eggermont's comment is a classic example of logical-physical confusion).
RDM model does not constraint implementations in any way; one is free to implement tuples and relations as one see fit. Relations are not necessarily files and tuples are not necessarily records of a file. Such correspondence is a dumb direct-image implementation.
Unfortunately SQL-based DBMS implementations are, too often, dumb direct-image implementations and they suffer poor performance in a variety of scenarios - OLAP/ETL products exist to cover these shortcomings.
This is slowly changing. There are commercial and free software/open source implementations that finally avoid this fundamental pitfall:
Vertica, which is a commercial successor of..
C-Store: A Column-Oriented DBMS;
MonetDB;
LucidDB;
Kdb in a way;
an so on...
Of course, the point is not that there must exist an "optimal" physical storage design, but that whatever physical storage design can be abstracted away by a nice declarative language based on relational algebra/calculi (and SQL is a bad example) or more directly on a logic programming language (like Prolog, for example - see my answer to "prolog to SQL converter" question). A good DBMS should be change physical storage design on-the-fly, based on data access statistics (and/or user hints).
Finally, in Eggermont's comment the statement " The relational model is getting squeeezed between the cloud and prevayler. " is another nonsense but I cannot give a rebuttal here, this comment is already too long.
Before you create a hierarchical category model in your database, take a look at this article which explains the problems and the solution (using nested sets).
To summarize, using a simple parent_category_id doesn't scale very well and you'll have a hard time writing performant SQL queries. The answer is to use nested sets which make you visualize your many-to-many category model as sets which are nested inside other sets.
If you want categories to have multiple parent categories, then it's just a "many to many" relationship instead of a "one to many" relationship. You'll need to put a bridging table between category and itself.
However, I doubt this is what you want. If I'm looking in the category Aircraft > Wood then I wouldn't want to see items from Boating > Wood. There are two Wood categories because they contain different items.
My suggestions
put a many-to-many relation between Item and Category so that a product can be displayed in many hierarchy node (used in ebay, sourceforge ...)
keep the category hierarchy
Performance on the category hierarchy
If your category hierarchy is depth, then you could generate an "Ancestors" table. This table will be generated by a batch work and will contains :
ChildId (the id of a category)
AncestorId (the id of its parent, grand parent ... all ancestors category)
It means that if you have 3 categories : 1-Propeller > 2-aircraft > 3-wood
Then the Ancestor table will contain :
ChildId AncestorId
1 2
1 3
2 3
This means that to have all the children of category1, you just need 1 query and you don't have do nested query. By the way this would work not matter what is the depth of you category hierarchy.
Thanks to this table, you will need only 1 join to query against a category (with its childrens).
If you need help on how to create the Ancestor table, just let me know.
Before you create a hierarchical
category model in your database, take
a look at this article which explains
the problems and the solution (using
nested sets).
To summarize, using a simple
parent_category_id doesn't scale very
well and you'll have a hard time
writing performant SQL queries. The
answer is to use nested sets which
make you visualize your many-to-many
category model as sets which are
nested inside other sets.
It should be worth pointing out that the "multiple categories" idea is basically how "tagging" works. With the exception that, in "tagging", we allow any product to have many categories. By allowing any product to be in many categories, you allow the customer the full ability to filter their search by starting where they believe they need to start. It could be clicking on "airplanes", then "wood", then "turbojet engine" (or whatever). Or they could start their search with Wood, and get the same result.
This will give you the greatest flexibility, and the customer will enjoy a better UX, yet still allow you to maintain the hierarchy structure. So, while the quoted answer suggests letting categories be M:N to categories, my suggestion is to allow products to have M:N categories instead.
All in all the result is mostly the same, the categories will have a natural hierarchy, but this will lend to even greater flexibility.
I should also note that this doesn't prevent strict hierarchy either. You could much easily enforce hierarchy in the code where necessary (ex. only showing the categories "cars", "airplanes", and "boats" on your initial page). It just moves the "strctness" to your business logic, which might make it better in the long run.
EDIT: I just realized that you vaguly mentioned this in your answer. I actually didn't notice it, but I think this is along the lines you would want to do instead. Otherwise you are mixing two hierarchy systems into your program without much benefit.
I've done this before. I recommend starting with tagging (many-to-many relationship table to products). You can build a hierarchy relationship on top of your tags (tree, or nested sets, or whatever) a lot easier than on your products. Because tagging is relatively freeform, this also gives you the ability to allow people to categorize naturally and then later codify certain expected behaviors.
For instance, we had special tags like 2009-Nov-Special. Any product like this was eligible to show as a special on the front page for that month. So we didn't have to build a special system to handle rotating specials onto the front page we just used the existing tag system. Later this could be enhanced to hide those tags from consumers, etc.
Similarly, you can use tagging prefixes like: style:wood mfg:Nike to allow you to do relatively complex categorization and drilldowns without the difficulties of complex database reshuffling or the nightmares of EAV, all in a tagging system which gives you more flexibility to accommodate user expectations. Remember that users might expect to navigate the products in ways different than you as a database and business owner might expect. Using the tagging system can help you enable the shopping interface without compromising your inventory or sales tracking or anything else.
Now, this all seemed fine and dandy, until I realized that the category "wood" would also be used under propeller -> airboat -> (wood). This would mean, that "wood" would have to be recreated every time I want to use it under a different parent. This isn't the end of the world, but I wanted to know if there is a more optimal way to go about this.
What if you have an aircraft that is wood construction, but the propeller could be carbon fiber, fiberglas, metal, graphite?
I'd define a table of materials, and use a foreign key reference in the items table. If you want to support more than one material (IE: say there's metal re-inforcement, or screws...), then you'd need a corrollary/lookup/xref table.
MATERIALS_TYPE_CODE table
MATERIALS_TYPE_CODE pk
MATERIALS_TYPE_CODE_DESC
PRODUCTS table
PRODUCT_ID, pk
MATERIALS_TYPE_CODE fk IF only one material is ever associated
PRODUCT_MATERIALS_XREF table
PRODUCT_ID, pk
MATERIALS_TYPE_CODE pk
I would also relate products to one another using a corrollary/lookup/xref table. A product could be related to more than one kitted product:
KITTED_PRODUCTS table
PARENT_PRODUCT_ID, fk
CHILD_PRODUCT_ID, fk
...and it supports a hierarchical relationship because the child could be the parent of soemthing else.
You can easily test your DB designs at http://cakeapp.com