I have two tables. One is large and detailed: table_news. The other is very simple, with just two fields, id and name: table_categories. Naturally, I want to assign some categories to every news item. I know I can create a new table, table_news_categories, containing something like id, news_id, category_id. But that seems like overkill to me. Can't I have a column such as categories in table_news that holds an "array" of category IDs? Wouldn't that be much simpler and easier to deal with?
If you want flexibility in your database, you should use a link table:
table_news
table_categories
table_news_categories (link table that links the two tables above)
If you don't need the flexibility, you can always add a FORIDCategory (or similarly named) field to table_news that creates the relationship between the two tables:
table_news
table_categories
From experience, I would say the preferable way is to add the extra (link) table from the start. It might seem unnecessary in the beginning, but if the database grows you will most likely need it. It is harder to change the database structure later, when there is a lot of data in it.
Example:
You have 100 news items now, separated into 10 categories. In about half a year you may have 10,000 news items. Now you want some news items to have two categories. You would then have to create the link table, copy every relation from the FORIDCategory values into the new link table, and then remove the FORIDCategory field, all while maintaining the integrity of the database.
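To make that migration concrete, here is a rough sketch. It uses SQLite through Python's standard sqlite3 module as a stand-in for whatever database you run, so the exact DDL may differ on your server. The sample rows, the composite primary key on the link table, and the helper name migrate_to_link_table are my own assumptions, not part of the question.

```python
import sqlite3

# SQLite stands in for the real database; sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE table_news (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    FORIDCategory INTEGER REFERENCES table_categories(id)
);
INSERT INTO table_categories (id, name) VALUES (1, 'sports'), (2, 'politics');
INSERT INTO table_news (id, title, FORIDCategory) VALUES
    (1, 'Cup final', 1), (2, 'Election day', 2);
""")


def migrate_to_link_table(conn):
    """Create the link table and copy every existing news/category pair into it."""
    conn.executescript("""
    CREATE TABLE table_news_categories (
        news_id INTEGER NOT NULL REFERENCES table_news(id),
        category_id INTEGER NOT NULL REFERENCES table_categories(id),
        PRIMARY KEY (news_id, category_id)
    );
    INSERT INTO table_news_categories (news_id, category_id)
        SELECT id, FORIDCategory FROM table_news
        WHERE FORIDCategory IS NOT NULL;
    """)
    # Dropping the old FORIDCategory column is left out on purpose:
    # how to do that safely depends on the database engine.


migrate_to_link_table(conn)

# After the migration a news item can belong to a second category.
conn.execute(
    "INSERT INTO table_news_categories (news_id, category_id) VALUES (1, 2)"
)
links = conn.execute(
    "SELECT news_id, category_id FROM table_news_categories "
    "ORDER BY news_id, category_id"
).fetchall()
```

With 100 rows this takes a moment; with 10,000 rows and live traffic you would also have to update every query that still reads FORIDCategory, which is the real cost of postponing the link table.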
I am working on a little project where a user submits an article to MySQL, and then PHP sends the post to the screen. So far so good.
Now I want to extend it to a "two-level" post system, where people can comment on the articles.
The problem is, I don't know how to do that.
The table I use for storing articles:
TABLE: posts
id user date avatar signature post
Should I add a new column named comments to the posts table, or should I place the comments in a separate table: one for articles, one for comments?
All help is much appreciated.
It depends on how you use it on your website. You have to ask: "are my articles and comments essentially the same concept?" If yes, use one table; if no, use two. On most websites articles work differently: they can be categorized, edited, etc., and usually need different fields that would clutter the comments table, so in that case two tables are more reasonable. But if you conclude that on your website articles and comments are generally the same (and do think ahead: won't you need to add some article functionality in 2 months?), then you can treat articles as comments too and keep one table for both.
If you decide to merge everything into one table, and then realize that you need another column to distinguish the type of post, and that some columns remain unused for some types, that is a clear warning sign that you should have two tables!
This is a little subjective, but I would just set up a parent/child relationship on your posts table: add a parent_id column that is a foreign key to the id column in the same table. For extra credit, you can make it a nested set relationship, which will allow you to pull the parent and all its children in one query.
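A minimal sketch of the parent_id idea, using SQLite through Python's sqlite3 module in place of MySQL and PHP. It shows the plain self-referencing version (one level of comments), not a nested set. The sample rows and the helper name article_with_comments are assumptions of mine, and only some of the columns from the question are included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (
    id INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES posts(id),  -- NULL means the row is an article
    user TEXT NOT NULL,
    post TEXT NOT NULL
);
INSERT INTO posts (id, parent_id, user, post) VALUES
    (1, NULL, 'alice', 'My first article'),
    (2, 1, 'bob', 'Nice article'),
    (3, 1, 'carol', 'I disagree'),
    (4, NULL, 'bob', 'Another article');
""")


def article_with_comments(conn, article_id):
    """Return the article row first, followed by its direct comments."""
    return conn.execute(
        "SELECT id, parent_id, user, post FROM posts "
        "WHERE id = ? OR parent_id = ? "
        "ORDER BY parent_id IS NOT NULL, id",
        (article_id, article_id),
    ).fetchall()


thread = article_with_comments(conn, 1)
```

Here thread holds the article with id 1 and then its two comments, fetched in a single query. Replies to comments would need either repeated lookups or something like the nested set mentioned above.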
Let's say I have 10 books, and each book is assigned some categories (e.g. php, programming, cooking, cookies).
After storing this data in a DB, I want to search for books that match some categories, and also output the matched categories for each pair of books.
What would be the best approach for a fast and easy-to-code search:
1) Make one column holding all categories for each book, so the book rows are unique (categories separated by commas in each row) -> denormalization from 1NF
2) Make a column holding only one category per row, with multiple rows per book
I think it is easier for other queries if I store the categories 1 by 1 (method 2), but harder for that specific type of search. Is this correct?
I am using PHP and MySQL.
PS: I know multi-relational design; I just prefer not to join the tables every time. I'm using a different connection for some tables, but that's not the problem. I'm asking what the best DB design is for this type of search: a user types cooking, cookies, potatoes and I want to output pairs of books that have 1, 2 or more, or all of the categories matched. I'm looking for a fast query or a PHP matching technique for this. Tell me your point of view. I hope I've made myself clear.
Use method 2 -- multiple rows per book, storing one category per row. It's the only way to make searching for a given category easy.
This design avoids repeating groups within a column, so it's good for First Normal Form.
But it's not just an academic exercise; it's a practical design that is good for all sorts of things. See my answer to "Is storing a comma separated list in a database column really that bad?"
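To show why method 2 makes this kind of search easy, here is a small sketch using SQLite through Python's sqlite3 module as a stand-in for MySQL and PHP. The table layout (one row per book per category), the sample books, and the function name search_books are all assumptions for illustration. It lists each matching book with the categories it matched, best matches first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE book_categories (
    book TEXT NOT NULL,
    category TEXT NOT NULL,
    PRIMARY KEY (book, category)
);
INSERT INTO book_categories (book, category) VALUES
    ('PHP Basics', 'php'), ('PHP Basics', 'programming'),
    ('Cookie Bible', 'cooking'), ('Cookie Bible', 'cookies'),
    ('Potato Recipes', 'cooking'), ('Potato Recipes', 'potatoes');
""")


def search_books(conn, wanted):
    """Return (book, matched_count, matched_categories), best matches first."""
    # One "?" placeholder per search term keeps the query parameterized.
    placeholders = ", ".join("?" for _ in wanted)
    rows = conn.execute(
        "SELECT book, category FROM book_categories "
        f"WHERE category IN ({placeholders}) ORDER BY book, category",
        list(wanted),
    ).fetchall()
    matches = {}
    for book, category in rows:
        matches.setdefault(book, []).append(category)
    return sorted(
        ((book, len(cats), cats) for book, cats in matches.items()),
        key=lambda item: (-item[1], item[0]),
    )


results = search_books(conn, ["cooking", "cookies", "potatoes"])
```

The WHERE clause is a plain equality match on one column, so an index on category can serve it. With a comma-separated column the same search would need string matching on every row.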
What you want to do is have one table for books, one table for categories, and one table for connecting books and categories. Something like this:
books
book_id | title | etc
categories
category_id | title | etc
book_categories
book_id | category_id
This is called a many-to-many relationship. You should probably google it to learn more.
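The three tables above could look roughly like this. It is a sketch in SQLite via Python's sqlite3 module, standing in for MySQL; the column types, the composite primary key, the sample rows, and the helper titles_in_category are assumptions I added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE books (book_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE categories (category_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE book_categories (
    book_id INTEGER NOT NULL REFERENCES books(book_id),
    category_id INTEGER NOT NULL REFERENCES categories(category_id),
    PRIMARY KEY (book_id, category_id)
);
INSERT INTO books (book_id, title) VALUES (1, 'PHP Basics'), (2, 'Cookie Bible');
INSERT INTO categories (category_id, title) VALUES
    (1, 'php'), (2, 'programming'), (3, 'cooking');
INSERT INTO book_categories (book_id, category_id) VALUES (1, 1), (1, 2), (2, 3);
""")


def titles_in_category(conn, category_title):
    """Return the titles of all books linked to the given category."""
    rows = conn.execute(
        "SELECT b.title FROM books AS b "
        "JOIN book_categories AS bc ON bc.book_id = b.book_id "
        "JOIN categories AS c ON c.category_id = bc.category_id "
        "WHERE c.title = ? ORDER BY b.title",
        (category_title,),
    ).fetchall()
    return [title for (title,) in rows]


php_books = titles_in_category(conn, "php")
```

The composite primary key on book_categories also stops the same book from being linked to the same category twice.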
This relationship is a Many-To-Many (a book can have multiple categories and a category can be used in several books).
Then we have the following:
Got it?
=]
I would recommend approach number 2. This is because approach 1 requires a full text search of the category column.
You may have some success by splitting it up into two tables: one table has one row per book and a unique id (call the table books), and the other has one row per book per category and references the book id from the first table (call the table bookcategories). Then, if you only need book data, you query books alone; if you need categories, you join both tables.
In my database I have, among other tables, products. The products table contains the id, name, description and some other data about the product. I also want to add a category.
Should I create a new table named category with an id and the name of each category, and add to products a category_id that refers to the id in category, or should I store the name of the category on each row of products?
In the first case I will have to use a JOIN. Will this have a serious impact on the performance?
By defining the categories in their own table you can:
Rename categories
Dynamically generate lists of categories for picking from
Add descriptions of categories
Add new categories
… and so on, without having to update every bit of category related code each time you modify them.
So yes, add a table.
Will this have a serious impact on the performance?
Probably not.
Yes, you should keep your data normalized. I think creating a new table named category is a good idea.
Will categories tend to change over time? In most instances that will be the case, so you're usually better off with a separate Categories table and a foreign key (FK) between the two tables. You can then add or change categories simply with data changes. Otherwise, you'll want to put a check constraint on the category name column in your table to make sure that you don't get junk data in there and that becomes much harder to maintain.
With proper indexing, the join should only have a minimal cost. Also, keep in mind that you won't always necessarily need to join the tables. You'll only need to join them when you want the actual category name as part of your result set. For example, if you have a look-up box on your front end with the categories and their IDs then your select of the Products only needs to return the category ID values and you don't need to even bother with a join.
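As a rough illustration of those points, here is a sketch in SQLite via Python's sqlite3 module, standing in for your actual database. The index name, the sample rows, and the variable names are made up. It shows a product query that needs no join, a join used only when the name is wanted, and a rename done as a single-row update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    category_id INTEGER NOT NULL REFERENCES category(id)
);
CREATE INDEX idx_products_category_id ON products(category_id);
INSERT INTO category (id, name) VALUES (1, 'Books'), (2, 'Tools');
INSERT INTO products (id, name, category_id) VALUES
    (1, 'Novel', 1), (2, 'Hammer', 2), (3, 'Saw', 2);
""")

# No join needed when the front end already knows the category names and IDs.
ids_only = conn.execute(
    "SELECT id, name, category_id FROM products "
    "WHERE category_id = ? ORDER BY id",
    (2,),
).fetchall()

# Join only when the category name has to be part of the result set.
with_names = conn.execute(
    "SELECT p.name, c.name FROM products AS p "
    "JOIN category AS c ON c.id = p.category_id ORDER BY p.id"
).fetchall()

# Renaming a category touches one row, not every product.
conn.execute("UPDATE category SET name = 'Hand tools' WHERE id = 2")
renamed = conn.execute(
    "SELECT p.name, c.name FROM products AS p "
    "JOIN category AS c ON c.id = p.category_id "
    "WHERE c.id = 2 ORDER BY p.id"
).fetchall()
```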
The stock answer is "it depends". It depends on how the data is used, will be used, might be used. But once you get through all the discussions and arguments, for something this simple and prominent in your data, nine times out of eight you will be better off properly normalizing the data (i.e. Category table, CategoryId columns, and foreign key).
Only add a table if you require the extra functionality this level of normalization provides.
If you need to rename categories or list them, then yes, this is a good candidate. If not, why bother? It will make your database slightly harder to read.
The join will most probably not hurt performance, but if you do not need it now, try not to fall victim to over-design.
Let's take the example of Yelp: http://www.yelp.com/boston
You can see that it's a website with several different categories, each category containing a listing of places. Should I include all the different places/listing in a single table, or let each category have its own tables?
EDIT: this means having tables 'places_restaurants' and 'places_nightlife', instead of a single table 'places' in which every entry of every category is stored in one huge table. Will this affect performance?
One table per category will require that you CREATE a table every time there's a new category. I'd prefer CATEGORY and PLACE tables, with a one-to-many or many-to-many relationship between them.
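A quick sketch of the one-to-many version, written against SQLite via Python's sqlite3 module rather than your real database. The lower-case table names, the sample places, and the helpers add_category and places_in are my own assumptions. The point is that a new category is an INSERT, not a CREATE TABLE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE place (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    category_id INTEGER NOT NULL REFERENCES category(id)
);
CREATE INDEX idx_place_category_id ON place(category_id);
INSERT INTO category (id, name) VALUES (1, 'restaurants'), (2, 'nightlife');
INSERT INTO place (name, category_id) VALUES ('Pizza Corner', 1), ('Blue Bar', 2);
""")


def add_category(conn, name):
    """Adding a category is a data change; the schema stays the same."""
    cursor = conn.execute("INSERT INTO category (name) VALUES (?)", (name,))
    return cursor.lastrowid


def places_in(conn, category_name):
    """Return the names of all places in the given category."""
    rows = conn.execute(
        "SELECT p.name FROM place AS p "
        "JOIN category AS c ON c.id = p.category_id "
        "WHERE c.name = ? ORDER BY p.name",
        (category_name,),
    ).fetchall()
    return [name for (name,) in rows]


shopping_id = add_category(conn, "shopping")
conn.execute(
    "INSERT INTO place (name, category_id) VALUES (?, ?)",
    ("Book Nook", shopping_id),
)
shopping = places_in(conn, "shopping")
```

If a place can sit in several categories, swap the category_id column for a link table between place and category.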
You should keep all of the listings in the same table and have a CategoryID column that maps each listing to its category. Your application should be built in a way that is inherently extensible, and creating a new table for every new category definitely is not.
It depends. You could normalize the database so that all categories are in their own table, and only referred to from other tables by a foreign key. But there are some arguments that performance outweighs normalization, and so it may be beneficial to keep category names both in their own table of record, and also to include a category name column in other, frequently-joined tables.
If you took the second approach, you would need to ensure data integrity by implementing UPDATE and DELETE triggers such that whenever a category changes in the table of record (presumably, not often), that other tables containing copies of category names also get updated.
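For what it's worth, the UPDATE side of that could be sketched like this. It uses SQLite via Python's sqlite3 module; MySQL's trigger syntax is different, so treat it as an illustration of the idea only. The table names, the category_name copy column, and the trigger name are assumptions. What a DELETE trigger should do with the copies (block the delete, blank them, remove the rows) is a design decision I have left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE place (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    category_id INTEGER NOT NULL REFERENCES category(id),
    category_name TEXT NOT NULL  -- denormalized copy of category.name
);
CREATE TRIGGER category_rename AFTER UPDATE OF name ON category
BEGIN
    UPDATE place SET category_name = NEW.name WHERE category_id = NEW.id;
END;
INSERT INTO category (id, name) VALUES (1, 'restaurants'), (2, 'nightlife');
INSERT INTO place (id, name, category_id, category_name) VALUES
    (1, 'Pizza Corner', 1, 'restaurants'), (2, 'Blue Bar', 2, 'nightlife');
""")

# Rename in the table of record; the trigger pushes the new name to the copies.
conn.execute("UPDATE category SET name = 'bars' WHERE id = 2")
copies = conn.execute(
    "SELECT name, category_name FROM place ORDER BY id"
).fetchall()
```

After the rename, reads on place need no join to show the category name, at the price of an extra write on every rename.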
It still depends on the application. Also, categories form a many-to-many relationship with the main table, assuming of course that you have some unique columns in each table.
I'm trying to create a web index. Every advertiser in my database will be able to appear in a few categories, so I've added a categorys column, and in that column I'll store the categories separated by ",", so it will look like:
1,3,5
The problem is that I have no idea how I'm supposed to select all of the advertisers in a certain category, like: mysql_query("SELECT * FROM advertisers WHERE category = ??");
If categories is another database table, you shouldn't use a plain-text field like that. Create a "pivot table" for the purpose, something like advertisers_categories, that links the two tables together. With that setup, you could do a query like:
SELECT A.* FROM advertisers AS A
JOIN advertisers_categories AS AC ON AC.advertiser_id = A.id
WHERE AC.category_id = 12;
The schema of advertisers_categories would look something like this:
# advertisers_categories
# --> id INT
# --> advertiser_id INT
# --> category_id INT
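Put together, the pivot table and the query above might be tried out like this. The sketch uses SQLite via Python's sqlite3 module in place of MySQL; the name columns, the UNIQUE constraint, the sample advertisers, and the helper advertisers_in_category are assumptions of mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE advertisers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE advertisers_categories (
    id INTEGER PRIMARY KEY,
    advertiser_id INTEGER NOT NULL REFERENCES advertisers(id),
    category_id INTEGER NOT NULL REFERENCES categories(id),
    UNIQUE (advertiser_id, category_id)
);
INSERT INTO advertisers (id, name) VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO categories (id, name) VALUES
    (1, 'cars'), (3, 'food'), (5, 'travel'), (12, 'sports');
INSERT INTO advertisers_categories (advertiser_id, category_id) VALUES
    (1, 1), (1, 3), (1, 5), (2, 3), (2, 12);
""")


def advertisers_in_category(conn, category_id):
    """Same shape as the query above, with the category id as a parameter."""
    rows = conn.execute(
        "SELECT A.name FROM advertisers AS A "
        "JOIN advertisers_categories AS AC ON AC.advertiser_id = A.id "
        "WHERE AC.category_id = ? ORDER BY A.name",
        (category_id,),
    ).fetchall()
    return [name for (name,) in rows]


in_sports = advertisers_in_category(conn, 12)
in_food = advertisers_in_category(conn, 3)
```

Acme's three rows in the pivot table replace the "1,3,5" string from the question.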
You should design your database in another way. Take a look at Atomicity.
In short: you should not store your values in the form 1,3,5.
I won't give you an answer to the query itself, because if you start using it this way now, you are going to run into much more severe problems later. No offense :)
With comma-separated values this is awkward to do purely in an SQL query (MySQL's FIND_IN_SET() can match one item in such a list, but it cannot use an index on the column). You could instead return every row and have a PHP script go through each row, using explode(',', $row['categorys']) and then in_array('CATEGORY', $exploded_row) to check for the existence of the category.
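The same split-and-check idea, sketched in Python rather than PHP to keep it short and runnable on its own. The rows are hard-coded here instead of being fetched from the database, and the function name and sample data are made up for illustration.

```python
def advertisers_in_category(rows, category_id):
    """Filter (name, comma_separated_ids) rows in application code.

    Splitting on commas compares whole IDs, so category 1 does not
    accidentally match an advertiser that is only in category 12.
    """
    wanted = str(category_id)
    return [name for name, categorys in rows if wanted in categorys.split(",")]


# Stand-in for the result of "SELECT name, categorys FROM advertisers".
rows = [("Acme", "1,3,5"), ("Globex", "12,3"), ("Initech", "5")]
in_category_1 = advertisers_in_category(rows, 1)
in_category_3 = advertisers_in_category(rows, 3)
```

This works, but every row has to travel from the database to the script on every search, which is why restructuring the tables is usually the better fix.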
The more common solution is to restructure your database. You're thinking too two-dimensionally. You're looking for the many-to-many data model:
advertisers
-----------
id
name
etc.
categories
----------
id
name
etc.
ad_cat
------
advertiser_id
category_id
So ad_cat will have at least one (usually more) entry per advertiser and at least one (usually more) entry per category, and every entry in ad_cat will link one advertiser to one category.
The SQL query then involves grabbing every line from ad_cat with the desired category_id(s) and searching for an advertiser whose id is in the resulting query's output.
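That two-step description maps onto a subquery. Here is a sketch in SQLite via Python's sqlite3 module as a stand-in for MySQL; the sample data and the helper advertisers_in_any are assumptions of mine. It accepts several category IDs at once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE advertisers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE ad_cat (
    advertiser_id INTEGER NOT NULL REFERENCES advertisers(id),
    category_id INTEGER NOT NULL REFERENCES categories(id),
    PRIMARY KEY (advertiser_id, category_id)
);
INSERT INTO advertisers (id, name) VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
INSERT INTO categories (id, name) VALUES (1, 'cars'), (3, 'food'), (5, 'travel');
INSERT INTO ad_cat (advertiser_id, category_id) VALUES
    (1, 1), (1, 3), (2, 3), (3, 5);
""")


def advertisers_in_any(conn, category_ids):
    """Return advertisers linked to at least one of the given categories."""
    placeholders = ", ".join("?" for _ in category_ids)
    rows = conn.execute(
        "SELECT name FROM advertisers WHERE id IN ("
        f"SELECT advertiser_id FROM ad_cat WHERE category_id IN ({placeholders})"
        ") ORDER BY name",
        list(category_ids),
    ).fetchall()
    return [name for (name,) in rows]


cars_or_food = advertisers_in_any(conn, [1, 3])
```

Because the outer query filters on id, an advertiser that matches several of the requested categories still shows up only once.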
Your implementation as-is will make it difficult, and taxing on your server's resources, to do what you want.
I'd recommend creating a table that relates advertisers to categories and then querying on that table given a category id value to obtain the advertisers that are in that category.
That is the wrong way to define categories, because a column holding an array of values cannot be normalized.
Instead, define another table called CATEGORIES, and use a join table to match CATEGORIES with ADVERTISERS.
Only then will you be able to select from it properly.
Hope this helps!