MySQL/PHP Relating Database Content by Keywords - php

I read here a lot, but this is the first time I've asked a question.
I'm developing a website, with LAMP, that is essentially all about user-submitted articles. I would like to relate some articles together by tags (or keywords) so that I can show a viewer some related content.
I had the idea of creating a field called "tags" in the MySQL table where the articles reside, holding a comma-delimited list of keywords. They would relate to the article and describe its content in some fashion. Yet I discovered this wasn't a bright idea, as I wouldn't be able to index it very well.
So, how would I go about this? Any ideas?
PS: I've just seen the little box to the right on this site called "Similar Questions"; what I'm trying to achieve is a lot like that.

It's a many-to-many relationship, and can be modelled using a separate table with a foreign key to the article ID, and a column for a tag. Multiple tags are added to an article by adding multiple rows to the table.
For example, if you have two articles where:
Article 1 has tags "foo" and "bar" and
Article 2 has tags "bar" and "baz"
then the table might look like this:
article_tags
article_id | tag
1 | foo
1 | bar
2 | bar
2 | baz
You can even store the tag names in a separate table:
tags
id | name
1 | foo
2 | bar
3 | baz
article_tags
article_id | tag_id
1 | 1
1 | 2
2 | 2
2 | 3
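To get something like the "Similar Questions" box from this layout, one option is to count how many tags other articles share with the current one. The following is only a minimal sketch: the helper name relatedArticles() is made up for illustration, and an in-memory SQLite database stands in for MySQL so that the example is self-contained (the SELECT itself uses nothing SQLite-specific).

```php
<?php
// Sketch only: in-memory SQLite keeps the example self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE article_tags (
    article_id INTEGER NOT NULL,
    tag_id     INTEGER NOT NULL,
    PRIMARY KEY (article_id, tag_id)
)');

// Same sample rows as the article_tags table above.
$pdo->exec('INSERT INTO article_tags (article_id, tag_id) VALUES (1, 1), (1, 2), (2, 2), (2, 3)');

// Hypothetical helper: other articles sharing at least one tag with
// $articleId, ordered by the number of shared tags (highest first).
function relatedArticles(PDO $pdo, int $articleId, int $limit = 5): array
{
    $stmt = $pdo->prepare(
        'SELECT other.article_id, COUNT(*) AS shared_tags
           FROM article_tags AS mine
           JOIN article_tags AS other
             ON other.tag_id = mine.tag_id
            AND other.article_id <> mine.article_id
          WHERE mine.article_id = :id
          GROUP BY other.article_id
          ORDER BY shared_tags DESC, other.article_id ASC
          LIMIT :lim'
    );
    $stmt->bindValue(':id', $articleId, PDO::PARAM_INT);
    $stmt->bindValue(':lim', $limit, PDO::PARAM_INT);
    $stmt->execute();
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$related = relatedArticles($pdo, 1); // article 2 shares the tag "bar"
```

The self-join means no tag names are needed at all; the composite primary key doubles as the index for the lookup on article_id.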

Related

How to deal with multiple categories entry for an article website

I am creating a news web application, and each news item has a category field, but an item can have one category or several. That means a person may enter the tags politics, world, us for just one news article. Now, the problem I have is how to insert this into a database. If I just store the tags directly in the database as plain text, then when I have to echo them I could use explode() to separate them, like
$row['tags'] = 'politics, world, us';
// explode() needs the delimiter as its first argument
foreach (explode(',', $row['tags']) as $tag) {
    $tag = trim($tag); // drop the space left after each comma
    echo "<a href='{$tag}'>{$tag}</a> ";
}
This echoes the tags and creates a hyperlink for each one. The problem is that if a user wants to see only the news with a specific tag, there is no good way to do it: I would have to query all the rows, explode the tags of each one, and filter the news feed in PHP. It is doable, but very cumbersome. So I would like to ask how to do this properly. I am fairly sure it involves another table, maybe called tags, but that is as far as I can get.
The proper way to do this is to create a many-to-many relationship in the database. This requires an additional table containing the IDs of the categories and articles that are related to each other.
Look at this example of multiple users with multiple roles:
In the same way, you can create a relationship of articles to categories
You can either use a SET to store the data or, which I suggest, add another table for tags and one which stores relations between articles and tags as mentioned by Maciej.
articles table:
id | title | content
tags table:
id | label
articles_tags table:
article_id | tag_id
Consider adding foreign key constraints to the last table; this will make your life easier, because the database itself then refuses links to articles or tags that do not exist.
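As a rough illustration of what foreign keys give you, the sketch below creates the three tables with ON DELETE CASCADE and shows two effects: a link to a non-existent tag is rejected, and deleting an article cleans up its link rows. It assumes an in-memory SQLite database in place of MySQL (hence the PRAGMA line, which MySQL does not need); the column types and sample rows are made up for the example.

```php
<?php
// Sketch only: SQLite in memory; MySQL's InnoDB accepts similar FOREIGN KEY clauses.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('PRAGMA foreign_keys = ON'); // SQLite-specific: enforcement is off by default

$pdo->exec('CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, content TEXT)');
$pdo->exec('CREATE TABLE tags (id INTEGER PRIMARY KEY, label VARCHAR(50) NOT NULL UNIQUE)');
$pdo->exec('CREATE TABLE articles_tags (
    article_id INTEGER NOT NULL,
    tag_id     INTEGER NOT NULL,
    PRIMARY KEY (article_id, tag_id),
    FOREIGN KEY (article_id) REFERENCES articles (id) ON DELETE CASCADE,
    FOREIGN KEY (tag_id)     REFERENCES tags (id)     ON DELETE CASCADE
)');

// Made-up sample data.
$pdo->exec("INSERT INTO articles (id, title, content) VALUES (1, 'Election results', 'text')");
$pdo->exec("INSERT INTO tags (id, label) VALUES (1, 'politics'), (2, 'world')");
$pdo->exec('INSERT INTO articles_tags (article_id, tag_id) VALUES (1, 1), (1, 2)');

// Linking to a tag that does not exist is rejected by the constraint.
$rejected = false;
try {
    $pdo->exec('INSERT INTO articles_tags (article_id, tag_id) VALUES (1, 99)');
} catch (PDOException $e) {
    $rejected = true;
}

// Deleting the article removes its link rows automatically.
$pdo->exec('DELETE FROM articles WHERE id = 1');
$remainingLinks = (int) $pdo->query('SELECT COUNT(*) FROM articles_tags')->fetchColumn();
```

Whether to cascade or to restrict deletes is a design choice; cascading is used here only because link rows have no meaning without their article.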

How to make a n:1 multiple tags system for a blog & which database schema is better?

I'd like to create a tagging system for a blog and its articles, like Stack Overflow's tagging system for its questions.
database structure
Table - articles
Fields - article_id, article_title, article_content
Table - article_tags
Fields - tag_id, article_id, tag_name
If a user enters tags in the input, let's say the following:
php
mysql
database structure
I'd like to understand how to convert these tags into an array and then insert each tag as a new row, as follows:
article_tags
tag_id | article_id | tag_name
1 | 1 | php
2 | 1 | mysql
3 | 1 | database structure
I'd like to have PHP break each tag down by commas, and insert it separately into a new row value.
How do I accomplish this? Do I need a foreach and explode() to break down each tag?
Does the following table structure seem better than the one above?
Table - article
Fields - article_id, article_title, article_content
Table - article_tags
Fields - article_id, tag_id
Table - tags
Fields - tag_id, tag_name
If so, how do I go about entering each tag into the database? Do I need explode() here as well, as with the first table structure?
Your second structure is better. You store way less data that way.
The easiest approach is to build a CRUD interface for the tags table, display those tags on your create-post form, and on submit insert everything into your article and article_tags tables accordingly.
But you could also let the user create tags directly from the post form with a bit of JavaScript, then insert those tags into the db before inserting into article and article_tags.
Don't forget you'll need to check for duplicates either way. There is no sense in having 10 different rows for the php tag from your example.
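A minimal sketch of the explode-and-insert step with the duplicate check, using the second table structure. The helper names parseTags() and saveTags() are made up for illustration, lowercasing tags is an assumption about how you want duplicates detected, and an in-memory SQLite database stands in for MySQL so the example runs on its own.

```php
<?php
// Sketch only: in-memory SQLite stands in for MySQL.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE tags (tag_id INTEGER PRIMARY KEY, tag_name VARCHAR(50) NOT NULL UNIQUE)');
$pdo->exec('CREATE TABLE article_tags (
    article_id INTEGER NOT NULL,
    tag_id     INTEGER NOT NULL,
    PRIMARY KEY (article_id, tag_id)
)');

// Hypothetical helper: turn "php, MySQL, php" into ['php', 'mysql'].
function parseTags(string $input): array
{
    $names = array_map(fn ($raw) => strtolower(trim($raw)), explode(',', $input));
    $names = array_filter($names, fn ($name) => $name !== '');
    return array_values(array_unique($names));
}

// Hypothetical helper: reuse existing tag rows, create missing ones, then link them.
function saveTags(PDO $pdo, int $articleId, array $tagNames): void
{
    $find   = $pdo->prepare('SELECT tag_id FROM tags WHERE tag_name = ?');
    $create = $pdo->prepare('INSERT INTO tags (tag_name) VALUES (?)');
    $link   = $pdo->prepare('INSERT INTO article_tags (article_id, tag_id) VALUES (?, ?)');

    $pdo->beginTransaction();
    foreach ($tagNames as $name) {
        $find->execute([$name]);
        $tagId = $find->fetchColumn(); // false when the tag does not exist yet
        $find->closeCursor();
        if ($tagId === false) {
            $create->execute([$name]);
            $tagId = $pdo->lastInsertId();
        }
        $link->execute([$articleId, $tagId]);
    }
    $pdo->commit();
}

saveTags($pdo, 1, parseTags('php, mysql, database structure'));
saveTags($pdo, 2, parseTags('PHP, blog, php')); // reuses the existing "php" row

$tagCount  = (int) $pdo->query('SELECT COUNT(*) FROM tags')->fetchColumn();
$linkCount = (int) $pdo->query('SELECT COUNT(*) FROM article_tags')->fetchColumn();
```

So yes, explode() plus a loop is enough. The sketch leaves out error handling (a rollback on failure) to stay short, and the UNIQUE constraint on tag_name is what actually guarantees there are no duplicate tags if two requests race each other.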
This is called a many to many relation, and it's better to create an intermediary table.
You may ask why? Well, let's try out an example and see which one is better. Let's say you have 100 articles, with 3 tags each, and 30 unique tags.
First version:
article_tags will contain a total of 300 entries, which means that each tag name is duplicated on average 10 times.
Second version:
article_tags will contain the same 300 entries, but instead of having the tag name appear 300 times, it will appear only 30 times in the tags table.
The second version stores less duplicate content, so you should go with that.

PHP/MySQL blog system

I'm making a blog system and I want to add 'tags' to my blogposts. These are similar to the tags you see here, they can be used to group posts with similar subjects.
I want to store the tags in the database as a comma-separated string of words (non-whitespaced strings). But I'm not quite sure how I would search for all posts containing tag A and tag B.
I don't like a simple solution that works with a small database where I retrieve all data and scan it with a PHP loop, because this won't work with a large database (hundreds if not thousands of posts). I do not intend to make this many blogposts, but I want the system to be solid and save worktime on the PHP scripts by getting right results straight from the database.
Let's say my table looks like this (it's a bit more complex actually)
blogposts:
id | title | content_html | tags
0 | "hello world" | "<em>hello world!</em>" | "hello,world,tag0"
1 | "bye world" | "<strong>bye world!</strong>" | "bye,world,tag1,tag2"
2 | "hello you" | "hello you! :>" | "hello,tag3,you"
How would I be able to select all posts that contain "hello" as well as "world" in the tags? I know about the LIKE statement, where you can search for substrings, but can you use it with multiple substrings?
You can't index a field of CSV values in a meaningful way, and SQL has no efficient way to find a single value inside a field of CSV values. Instead, you'll want to set up two more tables and drop the tags column from your existing table:
blogposts:
id | title | content_html
blogposts_tags:
id | tag_name
blogposts_taxonomy:
id | blogpost_id | tag_id
When you add a tag to a blog post, you will insert a new record into the taxonomy table. When you query for data, you'll join across all three tables to get the information similar to this:
SELECT `blogposts_tags`.`tag_name` FROM `blogposts`
INNER JOIN `blogposts_taxonomy` ON `blogposts`.`id` = `blogposts_taxonomy`.`blogpost_id`
INNER JOIN `blogposts_tags` ON `blogposts_taxonomy`.`tag_id` = `blogposts_tags`.`id`
WHERE `blogposts`.`id` = someID;
Update:
Setting up the N:M relationship gives you a lot of options during the build out of your application. For example, say you wanted to be able to search for blogposts that were all tagged "php." You could do that as follows:
SELECT `blogposts`.`id`, `blogposts`.`content_html` FROM `blogposts`
INNER JOIN `blogposts_taxonomy` ON `blogposts`.`id` = `blogposts_taxonomy`.`blogpost_id`
INNER JOIN `blogposts_tags` ON `blogposts_taxonomy`.`tag_id` = `blogposts_tags`.`id`
WHERE `blogposts_tags`.`tag_name` = 'php';
That will return all blogposts that have been tagged with the "php" tag.
Cheers
If you really want to store the data like this, MySQL's FIND_IN_SET function would be your friend.
Use the function twice in the WHERE clause, once per tag.
But it will perform horribly, because it cannot use an index. A linked table, as already suggested, is a MUCH better idea: one-to-many if each post simply stores its own tag strings, or many-to-many via a 'post2tag' table if the same tags are shared by lots of posts.
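For the original question (posts tagged with both "hello" and "world"), a common pattern on the linked-table design is to filter with IN and then keep only posts that matched every requested tag. The sketch below is an assumption-laden illustration: postsWithAllTags() is a made-up helper, the table names tags and taxonomy are chosen for brevity, and in-memory SQLite replaces MySQL (INSERT OR IGNORE is SQLite syntax used only to load the sample data).

```php
<?php
// Sketch only: in-memory SQLite stands in for MySQL.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE blogposts (id INTEGER PRIMARY KEY, title TEXT)');
$pdo->exec('CREATE TABLE tags (id INTEGER PRIMARY KEY, tag_name VARCHAR(50) UNIQUE)');
$pdo->exec('CREATE TABLE taxonomy (blogpost_id INTEGER, tag_id INTEGER, PRIMARY KEY (blogpost_id, tag_id))');

// Sample posts from the question, keyed by post id.
$posts = [
    0 => ['hello world', ['hello', 'world', 'tag0']],
    1 => ['bye world',   ['bye', 'world', 'tag1', 'tag2']],
    2 => ['hello you',   ['hello', 'tag3', 'you']],
];
foreach ($posts as $id => [$title, $names]) {
    $pdo->prepare('INSERT INTO blogposts (id, title) VALUES (?, ?)')->execute([$id, $title]);
    foreach ($names as $name) {
        $pdo->prepare('INSERT OR IGNORE INTO tags (tag_name) VALUES (?)')->execute([$name]);
        $pdo->prepare('INSERT INTO taxonomy (blogpost_id, tag_id)
                       SELECT ?, id FROM tags WHERE tag_name = ?')->execute([$id, $name]);
    }
}

// Hypothetical helper: ids of posts that carry every tag in $tagNames.
function postsWithAllTags(PDO $pdo, array $tagNames): array
{
    $tagNames = array_values(array_unique($tagNames));
    if ($tagNames === []) {
        return [];
    }
    $placeholders = implode(', ', array_fill(0, count($tagNames), '?'));
    // count() is an integer computed here, so inlining it is safe.
    $sql = "SELECT x.blogpost_id
              FROM taxonomy AS x
              JOIN tags AS t ON t.id = x.tag_id
             WHERE t.tag_name IN ($placeholders)
             GROUP BY x.blogpost_id
            HAVING COUNT(DISTINCT t.id) = " . count($tagNames) . "
             ORDER BY x.blogpost_id";
    $stmt = $pdo->prepare($sql);
    $stmt->execute($tagNames);
    return array_map('intval', $stmt->fetchAll(PDO::FETCH_COLUMN));
}

$both = postsWithAllTags($pdo, ['hello', 'world']);
```

With the sample data only post 0 has both tags. Changing the HAVING comparison to >= 1 (or dropping it) turns the same query into an "any of these tags" search.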

Database modeling: best approach for multiple categories for multiple elements

Let's say I have 10 books, each book has assigned some categories (ex. :php, programming, cooking, cookies etc).
After storing this data in a DB I want to search the books that match some categories, and also output the matched categories for each pair of books.
What would be the best approach for a fast and easy to code search:
1) Make a column with all categories for each book, the book rows would be unique (categs separated by comma in each row ) -> denormalisation from 1NF
2) Make a column with only 1 category in each row and multiple rows per book
I think it is easier for other queries if I store the categories 1 by 1 (method 2), but harder for that specific type of search. Is this correct?
I am using PHP and MySQL.
PPS: I know multi-relational design, but I prefer not to join the tables every time. I'm using a different connection for some tables, but that's not the problem. I'm asking what the best DB design is for this type of search: a user types cooking, cookies, potatoes and I want to output pairs of books that have 1, 2 or more, or all of the matched categories. I'm looking for a fast query or a PHP matching technique for this. Tell me your point of view. I hope I've made myself understood.
Use method 2 -- multiple rows per book, storing one category per row. It's the only way to make searching for a given category easy.
This design avoids repeating groups within a column, so it's good for First Normal Form.
But it's not just an academic exercise; it's a practical design that is good for all sorts of things. See my answer to "Is storing a comma separated list in a database column really that bad?"
What you want to do is have one table for books, one table for categories, and one table for connecting books and categories. Something like this:
books
book_id | title | etc
categories
category_id | title | etc
book_categories
book_id | category_id
This is called a many-to-many relationship. You should probably google it to learn more.
This relationship is a Many-To-Many (a book can have multiple categories and a category can be used in several books).
Then we have the following:
Got it?
=]
I would recommend approach number 2. This is because approach 1 requires a full text search of the category column.
You may have some success by splitting it up into two tables: One table has one line per book and a unique id (call the table books), and the other has one line per book per category and references the book id from the first table (call the table bookcategories). Then if you only need book data you use table books, where if you need categories you join both tables.
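With one category per row, the "match 1, 2 or all categories" search from the question can be expressed as a grouped count. The sketch below is one possible reading of that requirement (ranking books by how many of the typed categories they match); rankBooksByCategories() and the sample books are made up, and in-memory SQLite stands in for MySQL. GROUP_CONCAT is available in both.

```php
<?php
// Sketch only: in-memory SQLite stands in for MySQL.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE books (book_id INTEGER PRIMARY KEY, title TEXT)');
$pdo->exec('CREATE TABLE categories (category_id INTEGER PRIMARY KEY, title TEXT UNIQUE)');
$pdo->exec('CREATE TABLE book_categories (
    book_id     INTEGER NOT NULL,
    category_id INTEGER NOT NULL,
    PRIMARY KEY (book_id, category_id)
)');

// Made-up sample data.
$pdo->exec("INSERT INTO books VALUES (1, 'Baking Basics'), (2, 'PHP Cookbook'), (3, 'Potato Feast')");
$pdo->exec("INSERT INTO categories VALUES (1, 'cooking'), (2, 'cookies'), (3, 'potatoes'), (4, 'php')");
$pdo->exec('INSERT INTO book_categories VALUES (1, 1), (1, 2), (2, 1), (2, 4), (3, 1), (3, 2), (3, 3)');

// Hypothetical helper: books matching at least one wanted category,
// best match first, with the list of categories that matched.
function rankBooksByCategories(PDO $pdo, array $wanted): array
{
    $wanted = array_values(array_unique($wanted));
    if ($wanted === []) {
        return [];
    }
    $placeholders = implode(', ', array_fill(0, count($wanted), '?'));
    $stmt = $pdo->prepare(
        "SELECT b.title,
                COUNT(*) AS matched,
                GROUP_CONCAT(c.title) AS matched_categories
           FROM books AS b
           JOIN book_categories AS bc ON bc.book_id = b.book_id
           JOIN categories AS c ON c.category_id = bc.category_id
          WHERE c.title IN ($placeholders)
          GROUP BY b.book_id, b.title
          ORDER BY matched DESC, b.title ASC"
    );
    $stmt->execute($wanted);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$ranked = rankBooksByCategories($pdo, ['cooking', 'cookies', 'potatoes']);
```

Here "Potato Feast" matches all three categories, "Baking Basics" two, and "PHP Cookbook" one. The joins run on primary keys, so this kind of query is usually not the bottleneck the question worries about.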

Which Search / Tag system is better?

I am working on a website that has users and user-generated articles, galleries and videos. I am trying to make a tagging system and a search for them all.
At first I was thinking in tbl_articles, tbl_galleries and tbl_videos I would have a title, description and a tags field. Then run a query like the following for each:
select * from tbl_articles where match(title, description, tags)
against ('$search' in boolean mode) ORDER BY match(title, description, tags)
against ('$search' in boolean mode) DESC, views desc LIMIT 0, 3
The same query for tbl_galleries and tbl_videos. For the users just compare the username. Then display three of each on the results page with a 'more' button (facebook style).
When viewing an article, gallery or video there will also be links to related content, so I was thinking of using the same query only with the LIMIT set to '1,3' to avoid showing itself.
Q1 - How is this system?
I was happy with the system, until I found this
In which they have:
a 'tags' table which contains two columns: a primary id and a uniquely indexed tag_name.
a 'type' table for which they have another primary id and a uniquely indexed 'type' (category) (I thought I could use it for user/video/article/gallery)
a 'search' table that contains the url of the article with a foreign id from 'tags' and 'type'. (I thought instead of a full url I could just store the related foreign id so that I can generate the url e.g. article.php?id=....)
Q2 - This system seems far more efficient... although how would I search the title or description?
Q3 - The other bad thing is for every page view I would have to join the tags.. so it might not be that much more efficient.
Q4 - My system only searches boolean too, would I be better with a 'like' query?
Q5 - I limit my users to 4 tags, but I encourage single-words (stackoverflow style)... I realise though that in my system a search for 'train station' will not match a tag like 'train-station' how do I get around this?
So many questions... Sorry it is so long. Thank you.
Q1 - you're better off with three separate tables for articles, tags, and a link table relating articles and tags. You could also do it with two tables for articles and an articles_tags table. The articles_tags table would contain an articleID field and the tag itself as a compound key. Either two or three tables makes it easy to find which articles have a given tag and which tags are assigned to a given article.
Q2 - title and description searches could be done using "like" with percents or with regex or full text search.
Q3 - don't worry about joining the tags tables with the others. To paraphrase Knuth, build it first, then find the bottlenecks. MySQL is very good at what it does. Joining those tables together over and over and over again won't hurt.
Q4 - It depends on what you want to get out of the results. Usually you want the actual data, and then you can just test the number of rows returned to tell you whether it's true or false.
Q5 - again, you'd have to use "like" syntax and maybe some creative regex on the PHP side before the query is handed off to the database.
good luck!
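For Q5, one way to do the PHP-side cleanup is to push both stored tags and search terms through the same normalizing function, so that 'train station' and 'train-station' end up identical before the query is built. The function below is a minimal sketch under that assumption; normalizeTag() is a made-up name, and it only handles ASCII letters and digits.

```php
<?php
// Hypothetical helper: reduce free text to a lowercase, hyphen-separated tag.
// Assumption: tags are ASCII; anything outside a-z and 0-9 becomes a separator.
function normalizeTag(string $text): string
{
    $text = strtolower(trim($text));
    // Collapse every run of other characters into a single hyphen.
    $text = preg_replace('/[^a-z0-9]+/', '-', $text);
    // Remove hyphens left over at either end.
    return trim($text, '-');
}

$fromSearch = normalizeTag('train station');
$fromTag    = normalizeTag('Train-Station');
```

Both calls produce the same string, so an exact, indexable comparison on the tag name can be used instead of a LIKE pattern.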
