I'm trying to create a web index. Every advertiser in my database can appear in several categories, so I've added a categorys column, and in that column I store the category IDs separated by ",", so it looks like:
1,3,5
The problem is that I have no idea how I'm supposed to select all of the advertisers in a certain category, like: mysql_query("SELECT * FROM advertisers WHERE category = ??");
If categories is another database table, you shouldn't use a plain-text field like that. Create a "pivot table" for the purpose, something like advertisers_categories, that links the two tables together. With that setup, you could run a query like:
SELECT A.* FROM advertisers AS A
JOIN advertisers_categories AS AC ON AC.advertiser_id = A.id
WHERE AC.category_id = 12;
The schema of advertisers_categories would look something like this:
# advertisers_categories
# --> id INT
# --> advertiser_id INT
# --> category_id INT
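A minimal sketch of that pivot table in action. It runs against an in-memory SQLite database through Python's sqlite3 module purely so the example is self-contained; the SQL for MySQL would be nearly identical. The name column, the sample advertisers and the advertisers_in helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE advertisers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE advertisers_categories (
        id INTEGER PRIMARY KEY,
        advertiser_id INTEGER NOT NULL,
        category_id INTEGER NOT NULL
    );
    -- Hypothetical sample data.
    INSERT INTO advertisers (id, name) VALUES (1, 'Acme'), (2, 'Globex');
    -- Acme is in categories 1, 3 and 12; Globex only in 3.
    INSERT INTO advertisers_categories (advertiser_id, category_id)
    VALUES (1, 1), (1, 3), (1, 12), (2, 3);
""")

def advertisers_in(category_id):
    """Return the names of all advertisers linked to one category."""
    rows = conn.execute(
        """SELECT A.name FROM advertisers AS A
           JOIN advertisers_categories AS AC ON AC.advertiser_id = A.id
           WHERE AC.category_id = ?
           ORDER BY A.name""",
        (category_id,),
    ).fetchall()
    return [name for (name,) in rows]
```

Calling advertisers_in(12) returns only the advertisers that have a row for category 12 in the pivot table, with no string parsing involved.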
You should design your database in another way. Take a look at Atomicity.
Short: You should not store your values in the form of 1,3,5.
I won't give you an answer, because if you start using it this way now, you are going to run into much more severe problems later. No offense :)
Doing this strictly in an SQL query is awkward with comma-separated values. MySQL's FIND_IN_SET() can match a value inside such a list, but it has to scan every row because it cannot use an index. Alternatively, you could return every row and have a PHP script go through each one, using explode(',', $row) and then if(in_array('CATEGORY', $exploded_row)) to check for the existence of the category.
The more common solution is to restructure your database. You're thinking too two-dimensionally. You're looking for the Many to Many Data Model.
advertisers
-----------
id
name
etc.
categories
----------
id
name
etc.
ad_cat
------
advertiser_id
category_id
So ad_cat will have at least one (usually more) entry per advertiser and at least one (usually more) entry per category, and every entry in ad_cat will link one advertiser to one category.
The SQL query then involves grabbing every line from ad_cat with the desired category_id(s) and searching for an advertiser whose id is in the resulting query's output.
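The query described above can be sketched as follows. This is an illustration under assumptions, not a definitive implementation: it uses Python's sqlite3 with an in-memory database in place of MySQL, and the sample rows and the advertisers_in_any helper are made up for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE advertisers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE categories  (id INTEGER PRIMARY KEY, name TEXT);
    -- One row per (advertiser, category) pair; the composite key
    -- prevents linking the same pair twice.
    CREATE TABLE ad_cat (
        advertiser_id INTEGER NOT NULL REFERENCES advertisers(id),
        category_id   INTEGER NOT NULL REFERENCES categories(id),
        PRIMARY KEY (advertiser_id, category_id)
    );
    -- Hypothetical sample data.
    INSERT INTO advertisers VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
    INSERT INTO categories  VALUES (1, 'Food'), (3, 'Travel'), (5, 'Tech');
    INSERT INTO ad_cat VALUES (1, 1), (1, 3), (1, 5), (2, 3), (3, 5);
""")

def advertisers_in_any(category_ids):
    """Names of advertisers linked to at least one of the given categories."""
    placeholders = ",".join("?" for _ in category_ids)
    sql = f"""SELECT name FROM advertisers
              WHERE id IN (SELECT advertiser_id FROM ad_cat
                           WHERE category_id IN ({placeholders}))
              ORDER BY name"""
    return [name for (name,) in db.execute(sql, list(category_ids))]
```

The inner SELECT grabs the ad_cat rows for the desired category ids, and the outer query keeps the advertisers whose id appears in that result.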
Your implementation as-is will make it difficult and taxing on your server's resources to do what you want.
I'd recommend creating a table that relates advertisers to categories and then querying on that table given a category id value to obtain the advertisers that are in that category.
That is the wrong way to define categories, because a list of values packed into one column cannot be normalized.
Instead, define another table called CATEGORIES, and use a JOIN-table to match CATEGORIES with ADVERTIZERS.
Only then will you be able to select properly.
Hope this helps!
I was wondering if MySQL has a way to look at a column and return each distinct value only once. For example,
if the table looks like this:
id name category
1 test Health
2 carl Health
3 bob Oscar
4 joe Technology
As you can see, there are two rows that have the same category. Is there a way to retrieve a result where each category is returned only once?
What I am trying to do is get all the categories in the database so I can loop through them later in the code and use them. For example, if I wanted to create a menu, I would want the menu to list all the categories.
I know I can run
SELECT categories FROM dbname
but this returns duplicate rows, where I only need each category returned once. Is there a way to do this on the MySQL side?
I assume I can just use PHP's array_unique();
but I feel like this adds more overhead. Is this not something MySQL can do on the backend?
group by worked perfectly #Fred-ii- please submit this as answer so I can get that approved for you. – DEVPROCB
As requested by the OP:
You can use GROUP BY col_of_choice to avoid duplicates being shown in the query results.
Reference:
https://dev.mysql.com/doc/refman/5.5/en/group-by-handling.html
By using database normalization, you would create another table with a unique id and the category name, and link the two tables together, like
select mytable2.* from mytable1
join mytable2 on mytable1.cat = mytable2.id
group by mytable2.id
You can of course also use group by without multiple tables, but for the structure, I recommend doing it.
You can use select distinct:
SELECT DISTINCT categories
FROM dbname ;
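A small sketch comparing SELECT DISTINCT with GROUP BY on the question's sample rows. It uses Python's sqlite3 with an in-memory database as a stand-in for MySQL (an assumption made so the example runs on its own); the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dbname (id INTEGER PRIMARY KEY, name TEXT, categories TEXT);
    INSERT INTO dbname VALUES
        (1, 'test', 'Health'), (2, 'carl', 'Health'),
        (3, 'bob', 'Oscar'), (4, 'joe', 'Technology');
""")

# Each distinct category once, via DISTINCT.
distinct_rows = [c for (c,) in conn.execute(
    "SELECT DISTINCT categories FROM dbname ORDER BY categories")]

# The same result via GROUP BY on the same column.
grouped_rows = [c for (c,) in conn.execute(
    "SELECT categories FROM dbname GROUP BY categories ORDER BY categories")]
```

Both lists contain Health only once, so no array_unique() pass is needed on the application side.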
For various reasons, it is a good idea to have a separate reference table with one row per category. This helps in many ways:
Ensures that the category names are consistent ("Technology" versus "tech" for instance).
Gives a nice list of categories that are available.
Ensures that a category sticks around, even if no names currently reference it.
Allows for additional information about categories, such as the first time it appears, or a longer description.
This is recommended. However, if you still want to leave the category in place as it is, I would recommend an index on dbname(categories). The query should take advantage of the index.
SELECT categories FROM dbname GROUP BY categories
Hope this will help.
You can even use distinct category.
I have two tables. One is big and detailed: table_news. The other is very simple, with just two fields, id and name, and it's called table_categories. Obviously, I want to have some categories for every news item. I know I can create a new table, table_news_categories, which will contain something like: id, news_id, category_id. But to me that seems like overkill. Can't I have a column like categories in table_news, which would be an "array" of category IDs? Wouldn't it be much simpler and easier to deal with?
If you want flexibility in your database, you should have a link table:
table_news
table_categories
table_news_categories (linktable that links the above table together)
If you don't need the flexibility, you can always add a FORIDCategory or similar field to table_news that creates the relationship between the tables.
table_news
table_categories
From experience, I would say the preferable way is to add the (link) table from the start. It might seem unnecessary in the beginning, but if the database grows you will certainly need it. It's harder to change the database structure later on, when there is a lot of data in it.
Example:
You have 100 news items now, separated into 10 categories. In about half a year you may have 10,000 news items. Now you want some news items to have two categories. Then you would have to create the link table, copy all relations based on the FORIDCategory value into the new link table, and then remove the FORIDCategory field, all while maintaining the integrity of the database.
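The migration in that example can be sketched roughly like this. It is a hedged illustration using Python's sqlite3 and an in-memory database rather than MySQL; the title column and the sample rows are assumptions, and the final step of dropping FORIDCategory is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Starting point: one category per news item via FORIDCategory.
    CREATE TABLE table_categories (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table_news (
        id INTEGER PRIMARY KEY, title TEXT, FORIDCategory INTEGER
    );
    INSERT INTO table_categories VALUES (1, 'Sports'), (2, 'Politics');
    INSERT INTO table_news VALUES (1, 'Cup final', 1), (2, 'Election', 2);

    -- Step 1: create the link table.
    CREATE TABLE table_news_categories (
        news_id     INTEGER NOT NULL REFERENCES table_news(id),
        category_id INTEGER NOT NULL REFERENCES table_categories(id),
        PRIMARY KEY (news_id, category_id)
    );

    -- Step 2: copy every existing relation into the link table.
    INSERT INTO table_news_categories (news_id, category_id)
    SELECT id, FORIDCategory FROM table_news
    WHERE FORIDCategory IS NOT NULL;

    -- A news item can now get a second category.
    INSERT INTO table_news_categories VALUES (1, 2);
""")

links = conn.execute(
    "SELECT news_id, category_id FROM table_news_categories ORDER BY 1, 2"
).fetchall()
```

With only two rows this is trivial, but on a large, live table the copy step and the column removal have to be coordinated with every query that still reads FORIDCategory, which is the cost the answer warns about.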
I have a MySQL database with a growing number of users. Each user has a specific ID, a list of items they want and a list of items they have.
The current database was created some time ago. Each user has one row in a WANT or HAVE table with 50 columns per row, with the user id as the primary key, and each WANT or HAVE item has a specific id number.
This currently limits each user to 50 items and greatly complicates searches and other functions on the database.
When redoing the database, would it be viable to instead simply create 2-column WANT and HAVE tables, with each row holding the user ID and the item ID? That way there is no 'theoretical' limit to items per user.
Each time a member loads the profile page, a list of their want and have items will then be compiled using a simple SELECT WHERE ID = ##### statement from the have or want table.
Furthermore, I would need to make comparisons of user-to-user item lists, most common items, users with the most items, complete user searches for items that one user wants and another user has... - blah blah
The amount of users will range from 5000 - 20000
and each user averages about 15 - 20 items
Will this be a viable MySQL structure, or do I have to rethink my strategy?
Thanks a lot for your help!
This will certainly be a viable structure in mysql. It can handle very large amounts of data. When you build it though, make sure that you put proper indexes on the user/item IDs so that the queries will return nice and quick.
This is called a one to many relationship in database terms.
Table1 holds:
userName | ID
Table2 holds:
userID | ItemID
You simply put as many rows into the second table as you want.
In your case, I would probably structure the tables as this:
users
id | userName | otherFieldsAsNeeded
items
userID | itemID | needWantID
This way, you can either have a simple lookup for needWantID - for example 1 for Need, 2 for Want. But later down the track, you can add 3 for wishlist for example.
Edit: just make sure that you aren't storing your item information in the items table; store only the user's relationship to the item there. Keep all the item information in a separate table (itemDetails, for example) which holds your descriptions, prices and whatever else you want.
I would recommend 2 tables, a Wants table and a Have table. Each table would have a user_id and product_id. I think this is the most normalized and gives you "unlimited" items per user.
Or, you could have one table with a user_id, product_id, and type ('WANT' or 'HAVE'). I would probably go with option 1.
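A minimal sketch of the single-table variant, including the "one user wants what another user has" search from the question. It assumes Python's sqlite3 with an in-memory database instead of MySQL; the user_items table name, the index name, the sample rows and both helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_items (
        user_id    INTEGER NOT NULL,
        product_id INTEGER NOT NULL,
        type       TEXT NOT NULL CHECK (type IN ('WANT', 'HAVE')),
        PRIMARY KEY (user_id, product_id, type)
    );
    -- Secondary index for lookups that start from the product.
    CREATE INDEX idx_user_items_product ON user_items (product_id, type);
    -- Hypothetical sample data.
    INSERT INTO user_items VALUES
        (1, 100, 'WANT'), (1, 200, 'HAVE'),
        (2, 100, 'HAVE'), (2, 200, 'WANT'),
        (3, 100, 'HAVE');
""")

def profile(user_id, kind):
    """Product ids a user wants or has, for the profile page."""
    return [p for (p,) in conn.execute(
        "SELECT product_id FROM user_items "
        "WHERE user_id = ? AND type = ? ORDER BY product_id",
        (user_id, kind))]

def who_has_what_i_want(user_id):
    """(other_user_id, product_id) pairs where someone else has my wants."""
    return conn.execute(
        """SELECT h.user_id, h.product_id
           FROM user_items AS w
           JOIN user_items AS h
             ON h.product_id = w.product_id AND h.type = 'HAVE'
           WHERE w.user_id = ? AND w.type = 'WANT'
             AND h.user_id <> w.user_id
           ORDER BY h.user_id, h.product_id""",
        (user_id,),
    ).fetchall()
```

With two separate tables the self-join simply becomes a join between the Wants and Have tables; the row-per-item idea is the same either way.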
As you mentioned in your question, yes, it would make much more sense to have separate tables for WANTs and HAVEs. These tables could have an Id column which relates the row to the user, and a column that identifies what the WANT or HAVE item is. This method would allow for much more room to expand.
It should be noted that if you have a lot of these rows, you may need to increase the capacity of your server in order to maintain quick queries. If you have millions of rows, they can put a great deal of strain on the server (depending on your setup).
What you're theorizing is a very legitimate database structure. For a many to many relationship (which is what you want), the only way I've seen this done is to, like you say, have a relationships table with user_id and item_id as the columns. You could expand on it, but that's the basic idea.
This design is much more flexible and allows for the infinite items per user that you want.
In order to handle wants and haves, you could create two tables, or you could use just one with a third column holding a single byte that indicates whether the user/item match is a want or a have. Depending on the specifics of your project, either would be a viable option.
So, what you would end up with is at least the following tables:
Table: users
Cols:
user_id
any other user info
Table: relationships
Cols:
user_id
item_id
type (1 byte/boolean)
Table: items
Cols:
item_id
any other item info
Hope that helps!
What do you think would be the better way, performance-wise, to get the category names in a news system:
add an extra field for the cat-names inside a table which already contains a field for the cat-ids
no extra field for the cat-names, only cat-ids; read the cat-names (comma-separated string: "cat1,cat2,cat3,cat4") into the PHP file from an existing config file, and then build the cat-names from the db-field "cat-ids" with an array and a for-loop?
Thanks in advance,
Jayden
edit: cant seem to add a "hi" or "hallo" on top of the post, the editor just deletes it...
If you are measuring milliseconds and the disk IO of your system is not extremely slow, then option 2 would yield better performance. But we are talking about a negligible gain in execution time. Since you will already be querying the DB to get the news item, it is efficient to fetch the category name at the same time. I would add a mapping table of category-name-id to category-names, and then join on that when getting news items.
From a flexibility standpoint, and from the standpoint of eliminating as many possible sources of error as you can, I would also go with my idea above, since it adds flexibility to your system and keeps all your data in one spot. Changing the name of a category would require editing one column in the database instead of editing a PHP config file or, if option 1 was used, updating each and every news record.
So my best advice: add a table with category-name-id to category-names mappings, and then have the news items contain the id of the category they belong to.
For performance you could then cache the data you retrieve about existing categories and other data so you don't have to poll the DB for that information all the time.
For instance. You could, instead of joining at all, get all the categories from the category table I described above. Cache it in the application and only get it once the cache is invalidated. i.e. a timeout occurs or the data in the db is manipulated.
I think of two possible ways.
Have a category table, a articles table and a relationship table, and have a many-to-many relationship between categories and articles (as described in the relationship table).
If you feel smart today, give each category a power-of-two value (1, 2, 4, 8, 16 etc.) and store their sum in a field on the articles table. If an article has a category value of 11, it has categories 1+2+8.
I like the first solution better, quite frankly.
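For reference, the second idea works roughly as sketched below. The category names and the flag assignments are assumptions chosen for illustration; only the value 11 = 1 + 2 + 8 comes from the answer above.

```python
# Each category gets its own bit; values must be distinct powers of two.
CATEGORY_FLAGS = {"Weather": 1, "Local": 2, "Sports": 4, "Business": 8}

def encode(names):
    """Combine category names into one integer for the articles table."""
    mask = 0
    for name in names:
        mask |= CATEGORY_FLAGS[name]
    return mask

def decode(mask):
    """List the categories whose bit is set in the stored integer."""
    return [name for name, flag in CATEGORY_FLAGS.items() if mask & flag]

article_mask = 11                    # 1 + 2 + 8
article_categories = decode(article_mask)
```

Every check is a bitwise test against the stored number, and the number of categories is capped by the width of the integer column, which is part of why the relationship table is usually the easier design to live with.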
I would create a categories table like this:
Categories
-----------
category_id name
-------------------------
1 Weather
2 Local
3 Sports
Then create a junction table, so each article can have 0 or more categories:
Article_Categories
-------------------
article_id category_id
-----------------------------
1 2
1 3
2 1
To get the articles with their categories (comma delimited) from the MySQL server, you can use GROUP_CONCAT():
SELECT a.*, GROUP_CONCAT(c.name) AS cats
FROM Articles a
LEFT JOIN Article_Categories ac
ON ac.article_id = a.article_id
LEFT JOIN Categories c
ON c.category_id = ac.category_id
GROUP BY a.article_id
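A runnable sketch of that query against the sample rows above. It assumes SQLite (via Python's sqlite3) in place of MySQL; SQLite also provides GROUP_CONCAT with a comma as the default separator. The title column, the article titles and the third, uncategorized article are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Articles (article_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE Categories (category_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Article_Categories (article_id INTEGER, category_id INTEGER);
    INSERT INTO Articles VALUES
        (1, 'Derby day'), (2, 'Storm warning'), (3, 'Untagged');
    INSERT INTO Categories VALUES (1, 'Weather'), (2, 'Local'), (3, 'Sports');
    INSERT INTO Article_Categories VALUES (1, 2), (1, 3), (2, 1);
""")

rows = conn.execute("""
    SELECT a.article_id, GROUP_CONCAT(c.name) AS cats
    FROM Articles a
    LEFT JOIN Article_Categories ac ON ac.article_id = a.article_id
    LEFT JOIN Categories c ON c.category_id = ac.category_id
    GROUP BY a.article_id
    ORDER BY a.article_id
""").fetchall()

# The order of names inside cats is not guaranteed here, so sort them.
# An article without categories yields NULL (None) thanks to the LEFT JOINs.
cats_by_article = {
    article_id: sorted(cats.split(",")) if cats else []
    for article_id, cats in rows
}
```

The LEFT JOINs are what keep article 3 in the result even though it has no rows in the junction table.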
Add an additional table; that will save you lots of issues in the future. It is simply the recommended way.
By the way, don't try that idea of multiple ids in one field. It will lead to lots of code and issues which are totally unnecessary. If you really run into performance issues, you can always decide to take a step further and de-normalize or cache some of the data. There are lots of caching options available.
I think your first option is the suitable one, because it makes sense given the relationships in your data. And when you want to display the category name with your news, you can simply get everything with a single select query with a join.
So I recommend the option 1 you mentioned.
Performance can also be measured in two ways: execution performance and development performance. I feel both are in good shape with your option 1. You don't need to do much, just one query. If you go for option 2, you have to load the config file, explode the string on commas, and then search through the array elements, which is time-consuming.
I may be wrong, but since you already query the database, it's probably faster if you add a name field there..
Please also take into account that having the name in the same table as the ID provides consistency - if you have a config file you'll have to add a new category there plus in the table.
Also think of possible errors that may put wrong data into your config file - if this'd be the case your category names might get messed up..
On my database I have among other tables, products. Products table contains the id, name, description and some other data about the product. I want to add also a category.
Should I create a new table named category with an id and the name of each category, and have a category_id in products that refers to the id of the category, or should I have the name of the category on each row of products?
On the first case I will have to use JOIN. Will this have a serious impact on the performance?
By defining the categories in their own table you can:
Rename categories
Dynamically generate lists of categories for picking from
Add descriptions of categories
Add new categories
… and so on, without having to update every bit of category related code each time you modify them.
So yes, add a table.
Will this have a serious impact on the performance?
Probably not.
Yes, you should keep your data normalized. I think creating a new table named category is a good idea.
Will categories tend to change over time? In most instances that will be the case, so you're usually better off with a separate Categories table and a foreign key (FK) between the two tables. You can then add or change categories simply with data changes. Otherwise, you'll want to put a check constraint on the category name column in your table to make sure that you don't get junk data in there and that becomes much harder to maintain.
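A small sketch of what the foreign key buys you. It assumes SQLite through Python's sqlite3 rather than MySQL (the PRAGMA line is SQLite-specific, since SQLite leaves foreign key enforcement off by default); the sample product and category names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.executescript("""
    CREATE TABLE category (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        category_id INTEGER NOT NULL REFERENCES category(id)
    );
    INSERT INTO category VALUES (1, 'Books');
    INSERT INTO products VALUES (1, 'SQL primer', 1);
""")

# A product pointing at a category that does not exist is rejected.
try:
    conn.execute("INSERT INTO products VALUES (2, 'Mystery item', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Renaming a category is a single-row data change; products are untouched.
conn.execute("UPDATE category SET name = 'Books & Media' WHERE id = 1")
product_category = conn.execute(
    """SELECT p.name, c.name FROM products p
       JOIN category c ON c.id = p.category_id"""
).fetchall()
```

The junk-data protection comes from the constraint itself, so no check constraint listing the allowed category names has to be maintained.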
With proper indexing, the join should only have a minimal cost. Also, keep in mind that you won't always necessarily need to join the tables. You'll only need to join them when you want the actual category name as part of your result set. For example, if you have a look-up box on your front end with the categories and their IDs then your select of the Products only needs to return the category ID values and you don't need to even bother with a join.
The stock answer is "it depends". It depends on how the data is used, will be used, might be used. But once you get through all the discussions and arguments, for something this simple and prominent in your data, nine times out of eight you will be better off properly normalizing the data (i.e. Category table, CategoryId columns, and foreign key).
Only add a table if you require the extra functionality this level of normalization provides.
If you need to rename categories or list them, then yes, this is a good candidate. If not, then why bother with it? It will make your database slightly harder to read.
The join will most probably not have a performance impact, but if you do not need it now, try not to be a victim of over-design.