How to avoid repetition on data insertion? - php

I wish to allow all application users to add their own categories for their products. The products may vary A LOT, so it is not something that I can predict and insert myself beforehand.
However, if we allow all users to add their own categories, we may have issues like:
User A inserts a category called: Fruits
User B inserts a category called: Food from trees
(this is a dummy example, but perhaps you get the problem).
Generally speaking, what ways do we have to avoid repetition in our system?
I'm totally unaware of the options we may have, so any resources, links, anything, are more than welcome.
Thanks a lot.

Not the most friendly solution, but you could add all new entries to a queue that is moderated by a select number of users. Only after approval, the new entries will appear.

If I understand the problem correctly:
First, suggest existing category names: if the user starts typing "fru", display the existing categories that match, such as "fruits".
I also use aliases, for example:
Table Categories:
id (serial)
name (varchar)
aliasof (bigint)
In the backend I list the newly added categories and, if an equivalent category already exists, I link the new one to it:
1 fruits 0
2 fruits of tree 1
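A minimal sketch of the two ideas above (prefix suggestions and the aliasof column), using Python's built-in sqlite3 module in place of MySQL so it can be run as-is. The function names suggest and canonical_id are made up for illustration, and treating aliasof = 0 as "not an alias" is taken from the sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE categories ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL UNIQUE,"
    " aliasof INTEGER NOT NULL DEFAULT 0)"  # 0 means "not an alias"
)
conn.executemany(
    "INSERT INTO categories (id, name, aliasof) VALUES (?, ?, ?)",
    [(1, "fruits", 0), (2, "fruits of tree", 1)],
)


def suggest(prefix):
    """Return existing category names that start with what the user typed."""
    rows = conn.execute(
        "SELECT name FROM categories WHERE name LIKE ? ORDER BY name",
        (prefix + "%",),
    )
    return [name for (name,) in rows]


def canonical_id(name):
    """Resolve a category name to the id it is an alias of (or its own id)."""
    row = conn.execute(
        "SELECT id, aliasof FROM categories WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        return None  # unknown category
    cat_id, aliasof = row
    return aliasof if aliasof != 0 else cat_id
```

Products would then be stored against canonical_id(...), so "fruits of tree" and "fruits" end up in the same bucket. Note that the sketch does not escape % or _ in the typed prefix.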

Hierarchical categories, so that when this situation is encountered it can be handled cleanly. Then when someone comes along and removes the child category the elements can be dumped into the parent category.
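As a rough illustration of the "dump into the parent" step, here is a sketch using Python's sqlite3 module. The table layout (parent_id on categories, category_id on products) and the function name remove_category are assumptions, not something from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER);
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, category_id INTEGER);
INSERT INTO categories VALUES
    (1, 'Food', NULL), (2, 'Fruits', 1), (3, 'Food from trees', 2);
INSERT INTO products VALUES (1, 'Apple', 3), (2, 'Banana', 2);
""")


def remove_category(category_id):
    """Delete a category, moving its products and subcategories to its parent."""
    with conn:  # one transaction: either everything moves or nothing does
        row = conn.execute(
            "SELECT parent_id FROM categories WHERE id = ?", (category_id,)
        ).fetchone()
        if row is None:
            raise ValueError("no such category: %r" % (category_id,))
        (parent_id,) = row
        conn.execute(
            "UPDATE products SET category_id = ? WHERE category_id = ?",
            (parent_id, category_id),
        )
        conn.execute(
            "UPDATE categories SET parent_id = ? WHERE parent_id = ?",
            (parent_id, category_id),
        )
        conn.execute("DELETE FROM categories WHERE id = ?", (category_id,))
```

Calling remove_category(3) would move "Apple" from "Food from trees" up into "Fruits".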


What table structure is better for categories and custom user categories

I have a categories and a users table. A user can have many categories and a category can have many users (many to many). However, I also need a feature where users can create their own categories, which are only accessible to that user (the category creator), in addition to the default categories.
I created a pivot table to handle the many to many relationship, however, I was having difficulty deciding if I need to create another table to handle the custom user categories or just add a user_id on the categories table.
What would be the correct structure to handle this?
Given the information you have described, there are two solutions which would be valid: one would be to have a separate table for custom categories, and my preferred solution, would be to have a boolean value on the categories table which indicates whether a category is custom or not. This gives you the following advantages:
Logic applied to the two similar kinds of category remains the same
Other fields which are shared can be kept, in kind
If you wish to convert a custom category to a real category, this then becomes trivial (change the boolean)
You could include a creator id field to identify the person to whom the category applies; alternatively, you might simply designate in code that custom categories may only have one member.
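A small sketch of the boolean-flag approach, written with Python's sqlite3 module so it is self-contained. The column names is_custom and creator_id and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    is_custom INTEGER NOT NULL DEFAULT 0,  -- 0 = default, 1 = user-created
    creator_id INTEGER                     -- NULL for default categories
);
INSERT INTO categories (id, name, is_custom, creator_id) VALUES
    (1, 'Books', 0, NULL),
    (2, 'Music', 0, NULL),
    (3, 'My vinyl', 1, 7),
    (4, 'Board games', 1, 8);
""")


def visible_categories(user_id):
    """Default categories plus the custom ones this user created."""
    rows = conn.execute(
        "SELECT name FROM categories"
        " WHERE is_custom = 0 OR creator_id = ?"
        " ORDER BY id",
        (user_id,),
    )
    return [name for (name,) in rows]


def promote_to_default(category_id):
    """Turn a custom category into a default one by flipping the flag."""
    with conn:
        conn.execute(
            "UPDATE categories SET is_custom = 0, creator_id = NULL WHERE id = ?",
            (category_id,),
        )
```

The same query serves both kinds of category, which is the main advantage listed above.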

CakePHP database design: associations for job board

Hi everyone.
I'm building a job board with CakePHP, and a little help designing the database would be appreciated!
I have a table jobs with several foreign keys:
id, recruiter_id, title, sector_id, division_id, experience_id etc.
The associated tables (sectors, divisions and experiences) all have the same layout: id, name and job_count, and sometimes one or two other fields (like company_count for sectors).
So I would like to know if there is a better way to design these tables. I thought of putting the three of them in one table named lists with the keys id, value and list_name. With this configuration I need just one request to get all the lists instead of three.
My question is: what is the "good way" to do this? Maybe there is another one?
Seems kind of repetitive to have them in separate tables, when really they're all the same thing - properties of a job, and would have VERY similar table structures.
I would think you could create a single table for "job_properties" or something.
Each property could have a unique slug (if you wanted) or just use its id.
// job_properties table example
slug // (optional or could be called "key" if you prefer)
type // (optional - "sector", "division", "min_exp")
name // (for use on the names of things like "marketing" or "technology")
value // (int - for use on things like minimum experience)
Then each Job would hasMany JobProperty. It would also allow any job to have more than one sector if that is ever needed.
This would allow you to pull based on if a job has a particular property or set of properties and seems overall cleaner and more consolidated while not making it too obfuscated.
I think I found a solution by using a taxonomy system. I created a table terms which contains the list of all terms that can be associated (sector, division, type of contract, etc.).
Table terms: id, name, type
And I created a second table term_relationships which contains all the associations, including the name of the model that is associated.
Table term_relationships: id, ref, ref_id, term_id
"ref" refers to the associated model (for example Job or Applicant in my case), "ref_id" refers to the associated record (which job or which applicant) and term_id refers to the term that is associated. I think this is the most extensible and cleanest solution.
Thanks all for your help (especially Grafikart from where I get the idea) and hope that this topic can help someone else !
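For readers who want to see the terms / term_relationships layout in action, here is a hedged sketch using Python's sqlite3 module instead of CakePHP models. The sample terms and the helper name terms_for are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE terms (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    type TEXT NOT NULL              -- 'sector', 'division', 'contract', ...
);
CREATE TABLE term_relationships (
    id INTEGER PRIMARY KEY,
    ref TEXT NOT NULL,              -- associated model, e.g. 'Job' or 'Applicant'
    ref_id INTEGER NOT NULL,        -- id of the job or applicant
    term_id INTEGER NOT NULL REFERENCES terms(id)
);
INSERT INTO terms VALUES
    (1, 'Marketing', 'sector'),
    (2, 'Technology', 'sector'),
    (3, 'Permanent', 'contract');
INSERT INTO term_relationships (ref, ref_id, term_id) VALUES
    ('Job', 10, 2), ('Job', 10, 3), ('Applicant', 10, 1);
""")


def terms_for(ref, ref_id):
    """All (type, name) pairs attached to one record of the given model."""
    rows = conn.execute(
        "SELECT t.type, t.name"
        " FROM term_relationships tr"
        " JOIN terms t ON t.id = tr.term_id"
        " WHERE tr.ref = ? AND tr.ref_id = ?"
        " ORDER BY t.type, t.name",
        (ref, ref_id),
    )
    return rows.fetchall()
```

Because ref is part of the lookup, Job 10 and Applicant 10 do not collide even though they share the same ref_id.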

Optimal database structure for entries in flexible category/subcategory system?

I want to store reviews in a flexible system of categories and subcategories, and am currently designing the database structure for that. I have an idea how to do it, but I'm not entirely sure whether it could be done more elegantly and/or efficiently. These are my thoughts; if anybody can comment on if/how this can be improved I'd be really grateful.
(To keep this post concise, I only list the important field for the tables)
1.) The reviews are stored in the table "reviews". It has the following fields:
id: unique ID, auto-incrementing.
title: the title that will show up in <head><title>, etc.
stub: a version of the title without spaces, special chars, etc. so it can be part of the URL/URI
text: the actual content
2.) All categories are in the same table "categories"
id: unique ID, auto-incrementing.
title: the full title/name of the category as it will be output on the website
stub: version of the title that will be shown in the URL/URI.
parent_id: if this is a subcategory, this is the id of the parent category. Otherwise it is 0.
order_number: simple number to order the categories by (for display in the navigation menu)
3.) Now I need an indicator of which reviews are in which categories. A review can be in multiple categories. My first idea was to add a "review_list" field to the categories and have it contain the ids of all reviews that should be in this category. However, I think that adding and removing reviews from categories would be a hassle and inelegant. So my current idea is to have a table "review_in_category" with an entry for every review-category relation. The structure is:
id: Unique ID, auto-increment.
review_id: the id of the review
category_id: the id of the category
So if a review is in 3 different categories it would result in 3 entries in the "review_in_category" table.
The idea is that when a user opens a category URL, the wrapper script will break up the URL into its parts. If it finds more than one category with category.stub = "sci-fi", it will check which of those has a parent category with the stub "animation". Once the correct category is identified (most of the time the stubs are unique anyway, so this check can be skipped) I want to SELECT all review_ids from "review_in_category" where the category_id matches the one determined by the wrapper script. All the review_ids are put into an array. A loop will iterate through this array and compose the SELECT statement for listing all review titles (and create links to them using the stub values), starting with "SELECT title, stub FROM reviews WHERE id=review_list[$counter]" and then adding "OR id=review_list[$counter]" until the array has been completely traversed.
SO my questions are:
- Is the method of creating a single SELECT statement with a potentially large number of "OR id=" parts an elegant and/or efficient way to handle this situation, or are there better variants?
- Does using a "taxonomy"-style table (review_in_category) make sense or would it be better to store the "membership"/"relation" directly in the reviews or category tables?
- Any other thoughts... I just started to learn this stuff and appreciate any feedback.
Thank you
Your design looks sound.
To retrieve all reviews in a category, you should use a join:
SELECT reviews.title, reviews.stub FROM reviews, review_in_category WHERE reviews.id = review_in_category.review_id AND review_in_category.category_id = $category
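To make the join concrete, here is a runnable sketch with Python's sqlite3 module (the SQL itself carries over to MySQL). It also shows one way to resolve a stub that is shared by two categories through its parent's stub. The helper names find_category and reviews_in, the sample rows, and the rule that a lookup without a parent stub means a top-level category are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reviews (id INTEGER PRIMARY KEY, title TEXT, stub TEXT);
CREATE TABLE categories (
    id INTEGER PRIMARY KEY, title TEXT, stub TEXT, parent_id INTEGER DEFAULT 0
);
CREATE TABLE review_in_category (
    id INTEGER PRIMARY KEY, review_id INTEGER, category_id INTEGER
);
INSERT INTO categories VALUES
    (1, 'Animation', 'animation', 0),
    (2, 'Sci-Fi', 'sci-fi', 1),
    (3, 'Sci-Fi', 'sci-fi', 0);
INSERT INTO reviews VALUES
    (1, 'Review One', 'review-one'), (2, 'Review Two', 'review-two');
INSERT INTO review_in_category (review_id, category_id) VALUES
    (1, 2), (2, 2), (2, 3);
""")


def find_category(stub, parent_stub=None):
    """Return the category id for a stub, disambiguated by the parent's stub."""
    if parent_stub is None:
        sql = "SELECT id FROM categories WHERE stub = ? AND parent_id = 0"
        args = (stub,)
    else:
        sql = (
            "SELECT c.id FROM categories c"
            " JOIN categories p ON p.id = c.parent_id"
            " WHERE c.stub = ? AND p.stub = ?"
        )
        args = (stub, parent_stub)
    row = conn.execute(sql, args).fetchone()
    return row[0] if row else None


def reviews_in(category_id):
    """One join replaces the loop that builds a long chain of OR id=... parts."""
    rows = conn.execute(
        "SELECT r.title, r.stub"
        " FROM reviews r"
        " JOIN review_in_category rc ON rc.review_id = r.id"
        " WHERE rc.category_id = ?"
        " ORDER BY r.id",
        (category_id,),
    )
    return rows.fetchall()
```

Passing the category id as a bound parameter (the ? placeholder) also avoids interpolating request data directly into the SQL string.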

php mysql - should i add the field "category-name" to a table or not?

What do you think would be, performance-wise, the better way to get the category names of a news system:
1. Add an extra field for the cat-names inside a table which already contains a field for the cat-ids.
2. No extra field for the cat-names, only the cat-ids; read the cat-names (comma-separated string: "cat1,cat2,cat3,cat4") into the PHP file from an existing config file and then build the cat-names with the help of the db field "cat-ids", an array and a for-loop.
Thanks in advance,
edit: can't seem to add a "hi" or "hallo" on top of the post, the editor just deletes it...
If you are measuring milliseconds and the disk IO of your system is not extremely slow, then option 2 would yield better performance. But we are talking about a negligible gain in execution time. Since you will already be querying the DB to get the news item, it is cheap to get the category name at the same time. I would add a mapping table of category-name-id to category-names, and then join on that when getting news items.
From a flexibility standpoint, and to eliminate as many sources of error as possible, I would also go with the idea above, since it adds flexibility to your system and keeps all your data in one spot. Changing the name of a category would require editing one column in the database instead of editing a PHP config file or, if option 1 was used, updating each and every news record.
So my best advice: add a table with category-name-id to category-names mappings and then have the news items contain the id of the category they belong to.
For performance you could then cache the data you retrieve about existing categories and other data so you don't have to poll the DB for that information all the time.
For instance, instead of joining at all, you could get all the categories from the category table I described above, cache them in the application and only fetch them again once the cache is invalidated, i.e. when a timeout occurs or the data in the db is changed.
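The caching idea could look roughly like this. It is only a sketch in Python with sqlite3; the class name CategoryCache, the 300-second default timeout and the db_reads counter are invented for the example.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
INSERT INTO categories VALUES (1, 'Weather'), (2, 'Local');
""")


class CategoryCache:
    """Keeps the id -> name mapping in memory until it expires or is invalidated."""

    def __init__(self, connection, ttl_seconds=300.0):
        self._conn = connection
        self._ttl = ttl_seconds
        self._names = None
        self._loaded_at = 0.0
        self.db_reads = 0  # only here to make the caching visible

    def names(self):
        expired = time.monotonic() - self._loaded_at > self._ttl
        if self._names is None or expired:
            # Each row is an (id, name) pair, so dict() builds the mapping.
            self._names = dict(
                self._conn.execute("SELECT id, name FROM categories")
            )
            self._loaded_at = time.monotonic()
            self.db_reads += 1
        return self._names

    def invalidate(self):
        """Call this whenever the categories table is modified."""
        self._names = None


cache = CategoryCache(conn)
```

News items then only need to carry the category id; the name is looked up with cache.names()[category_id] without touching the database on every request.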
I think of two possible ways.
Have a category table, a articles table and a relationship table, and have a many-to-many relationship between categories and articles (as described in the relationship table).
If you feel smart today, give each category a power-of-two value (1, 2, 4, 8, 16, etc.) and store their sum in a field on the articles table. If an article has a category value of 11, it has categories 1+2+8.
I like the first solution better, quite frankly.
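For completeness, this is how the second (bit-flag) option works out in practice. A short Python sketch; the category names attached to each flag are made up.

```python
# Each category gets its own bit; the names are hypothetical.
CATEGORY_FLAGS = {1: "Weather", 2: "Local", 4: "Sports", 8: "Politics"}


def encode(flags):
    """Combine individual category flags into one stored integer."""
    value = 0
    for flag in flags:
        value |= flag
    return value


def decode(value):
    """Split a stored integer back into the flags that are set."""
    return [flag for flag in sorted(CATEGORY_FLAGS) if value & flag]


def category_names(value):
    """Names of the categories encoded in a stored integer."""
    return [CATEGORY_FLAGS[flag] for flag in decode(value)]
```

So decode(11) gives the flags 1, 2 and 8. It works, but finding "all articles in category 8" needs a bitwise test on every row, which is one reason the relationship table is easier to live with.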
I would create a categories table like this:
category_id name
1 Weather
2 Local
3 Sports
Then create a junction table, so each article can have 0 or more categories:
article_id category_id
1 2
1 3
2 1
To get the articles with their categories (comma-delimited) from MySQL, you can use GROUP_CONCAT():
SELECT a.article_id, GROUP_CONCAT(c.name) AS categories
FROM Articles a
LEFT JOIN Article_Categories ac
ON ac.article_id = a.article_id
LEFT JOIN Categories c
ON c.category_id = ac.category_id
GROUP BY a.article_id
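The query can be tried out without a MySQL server: SQLite also provides group_concat(), so the sketch below runs it through Python's sqlite3 module. The title column, the third article with no categories and the function name articles_with_categories are assumptions added for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Articles (article_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE Categories (category_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Article_Categories (
    article_id INTEGER, category_id INTEGER,
    PRIMARY KEY (article_id, category_id)
);
INSERT INTO Categories VALUES (1, 'Weather'), (2, 'Local'), (3, 'Sports');
INSERT INTO Articles VALUES (1, 'First'), (2, 'Second'), (3, 'Uncategorized');
INSERT INTO Article_Categories VALUES (1, 2), (1, 3), (2, 1);
""")


def articles_with_categories():
    """Map each article id to the list of its category names."""
    rows = conn.execute(
        "SELECT a.article_id, GROUP_CONCAT(c.name) AS categories"
        " FROM Articles a"
        " LEFT JOIN Article_Categories ac ON ac.article_id = a.article_id"
        " LEFT JOIN Categories c ON c.category_id = ac.category_id"
        " GROUP BY a.article_id"
        " ORDER BY a.article_id"
    )
    # GROUP_CONCAT yields NULL for articles without categories; the order of
    # names inside the string is not guaranteed, so sort after splitting.
    return {
        article_id: sorted(cats.split(",")) if cats else []
        for article_id, cats in rows
    }
```

The LEFT JOINs are what keep the article without categories in the result.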
Add an additional table; that will save you lots of issues in the future. It is simply the recommended way.
By the way, don't go for the idea of multiple ids in one field. It leads to lots of code and issues which are totally unnecessary. If you really run into performance issues you can always take a step further and de-normalize or cache some of the data. There are lots of caching options available.
I think your first option is the suitable one, because it makes sense given the relationships in your data. And when you want to display the category name with your news, you can simply get everything with a single select query with a join.
So I recommend the option 1 you mentioned.
Performance can also be measured in two ways: execution performance and development performance. I feel both are in a good position with your option 1. You don't need to do much, just one query. If you go for option 2, you have to load the string from the config file, explode it on the commas, and then search through the array elements, which is time-consuming.
I may be wrong, but since you already query the database, it's probably faster if you add a name field there..
Please also take into account that having the name in the same table as the ID provides consistency - if you have a config file you'll have to add a new category there plus in the table.
Also think of possible errors that may put wrong data into your config file; if that happens, your category names might get messed up.

Selecting rows from MySQL

I'm trying to create a web index. Every advertiser in my database will be able to appear on a few categories, so I've added a categorys column, and in that column I'll store the categories separated by "," so it will look like:
The problem is that I have no idea how I'm supposed to select all of the advertisers in a certain category, like: mysql_query("SELECT * FROM advertisers WHERE category = ??");
If categories is another database table, you shouldn't use a plain-text field like that. Create a "pivot table" for the purpose, something like advertisers_categories that links the two tables together. With that setup, you could do a query like:
SELECT A.* FROM advertisers AS A
JOIN advertisers_categories AS AC ON AC.advertiser_id = A.id
WHERE AC.category_id = 12;
The schema of advertisers_categories would look something like this:
# advertisers_categories
# --> id INT
# --> advertiser_id INT
# --> category_id INT
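Here is a hedged, runnable version of that setup, using Python's sqlite3 module as a stand-in for MySQL. The name columns, the sample rows and the UNIQUE constraint on the pair are additions for the sake of the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE advertisers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE advertisers_categories (
    id INTEGER PRIMARY KEY,
    advertiser_id INTEGER NOT NULL REFERENCES advertisers(id),
    category_id INTEGER NOT NULL REFERENCES categories(id),
    UNIQUE (advertiser_id, category_id)  -- no duplicate links
);
INSERT INTO advertisers VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
INSERT INTO categories VALUES (12, 'Hardware'), (13, 'Software');
INSERT INTO advertisers_categories (advertiser_id, category_id) VALUES
    (1, 12), (2, 12), (2, 13), (3, 13);
""")


def advertisers_in(category_id):
    """Names of all advertisers linked to the given category."""
    rows = conn.execute(
        "SELECT A.name"
        " FROM advertisers AS A"
        " JOIN advertisers_categories AS AC ON AC.advertiser_id = A.id"
        " WHERE AC.category_id = ?"
        " ORDER BY A.id",
        (category_id,),
    )
    return [name for (name,) in rows]
```

An advertiser in several categories simply has several rows in advertisers_categories, so no comma-separated column is needed.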
You should design your database in another way. Take a look at Atomicity.
Short: You should not store your value in the form of 1,3,5.
I won't give you an answer, because if you start using it this way now, you are going to run into much more severe problems later. No offense :)
It's not practical to do this strictly in an SQL query when the values are comma-separated (MySQL's FIND_IN_SET() can match them, but it cannot use an index). You could return every row and have a PHP script go through each row, using explode(',', $row) and then if(in_array('CATEGORY', $exploded_row)) to check for the existence of the category.
The more common solution is to restructure your database. You're thinking too two-dimensionally. You're looking for the Many to Many Data Model
So ad_cat will have at least one (usually more) entry per advertiser and at least one (usually more) entry per category, and every entry in ad_cat will link one advertiser to one category.
The SQL query then involves grabbing every line from ad_cat with the desired category_id(s) and searching for an advertiser whose id is in the resulting query's output.
Your implementation as-is will make it difficult and taxing on your server's resources to do what you want.
I'd recommend creating a table that relates advertisers to categories and then querying on that table given a category id value to obtain the advertisers that are in that category.
That is a very wrong way to define categories, because your array of values cannot be normalized.
Instead, define another table called CATEGORIES, and use a JOIN-table to match CATEGORIES with ADVERTIZERS.
Only then will you be able to select it properly.
Hope this helps!
