Which Search / Tag system is better? - php

I am working on a website that has users and user-generated articles, galleries and videos. I am trying to make a tagging system and a search for them all.
At first I was thinking in tbl_articles, tbl_galleries and tbl_videos I would have a title, description and a tags field. Then run a query like the following for each:
select * from tbl_articles where match(title, description, tags)
against ('$search' in boolean mode) ORDER BY match(title, description, tags)
against ('$search' in boolean mode) DESC, views desc LIMIT 0, 3
The same query for tbl_galleries and tbl_videos. For the users just compare the username. Then display three of each on the results page with a 'more' button (facebook style).
When viewing an article, gallery or video there will also be links to related content, so I was thinking of using the same query with the LIMIT set to '1,3' to avoid showing the item itself.
Q1 - How is this system?
I was happy with the system, until I found this
In which they have
a 'tags' table which contains two columns: a primary id and a uniquely indexed tag_name.
a 'type' table for which they have another primary id and a uniquely indexed 'type' (category) (I thought I could use it for user/video/article/gallery).
a 'search' table that contains the url of the article with a foreign id from 'tags' and 'type'. (I thought instead of a full url I could just store the related foreign id so that I can generate the url, e.g. article.php?id=....)
Q2 - This system seems far more efficient... although how would I search the title or description?
Q3 - The other bad thing is for every page view I would have to join the tags.. so it might not be that much more efficient.
Q4 - My system only searches boolean too, would I be better with a 'like' query?
Q5 - I limit my users to 4 tags, and I encourage single words (stackoverflow style). I realise, though, that in my system a search for 'train station' will not match a tag like 'train-station'. How do I get around this?
So many questions... Sorry it is so long. Thank you.

Q1 - You're better off with three separate tables: articles, tags, and a link table relating articles to tags. You could also do it with two tables: articles and articles_tags. The articles_tags table would contain an articleID field and the tag itself, together forming a compound key. Either layout makes it easy to find which articles have a given tag and which tags are assigned to a given article.
Q2 - Title and description searches could be done using "like" with percent wildcards, with regex, or with full text search.
Q3 - Don't worry about joining the tags tables with the others. To paraphrase Knuth, build it first, then find the bottlenecks. MySQL is very good at what it does. Joining those tables together over and over again won't hurt.
Q4 - It depends what you want to get out of the results. Usually you want the actual data, and then you can just test the number of rows returned to tell you whether it's true or false.
Q5 - Again, you'd have to use "like" syntax and maybe some creative regex on the PHP side before the query is handed off to the database.
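The three-table layout from Q1 and the tag clean-up hinted at in Q5 can be sketched roughly as follows. This is only an illustration, not the answerer's code: it uses Python with an in-memory SQLite database as a stand-in for PHP and MySQL so that it runs anywhere, and the table names, column names and the normalize_tag helper are all assumptions.

```python
import re
import sqlite3

def normalize_tag(text):
    """Lower-case the tag and collapse spaces, underscores and hyphens into one hyphen."""
    return re.sub(r"[\s_-]+", "-", text.strip().lower())

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    -- Link table (hypothetical name): one row per (article, tag) pair.
    CREATE TABLE articles_tags (
        article_id INTEGER NOT NULL REFERENCES articles(id),
        tag_id INTEGER NOT NULL REFERENCES tags(id),
        PRIMARY KEY (article_id, tag_id)
    );
""")

def tag_article(article_id, raw_tag):
    """Store the normalized tag once, then link it to the article."""
    name = normalize_tag(raw_tag)
    conn.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (name,))
    (tag_id,) = conn.execute("SELECT id FROM tags WHERE name = ?", (name,)).fetchone()
    conn.execute(
        "INSERT OR IGNORE INTO articles_tags (article_id, tag_id) VALUES (?, ?)",
        (article_id, tag_id),
    )

def articles_with_tag(raw_tag):
    """Return titles of articles carrying the tag, normalizing the search term the same way."""
    rows = conn.execute(
        """SELECT a.title
           FROM articles a
           JOIN articles_tags lnk ON lnk.article_id = a.id
           JOIN tags t ON t.id = lnk.tag_id
           WHERE t.name = ?
           ORDER BY a.id""",
        (normalize_tag(raw_tag),),
    )
    return [title for (title,) in rows]

conn.execute("INSERT INTO articles (id, title) VALUES (1, 'Rail guide')")
tag_article(1, "train-station")
print(articles_with_tag("Train Station"))  # ['Rail guide']
```

Because both the stored tag and the search term go through the same helper, 'train station' and 'train-station' end up as the same string. The placeholders (?) also keep user input out of the SQL text, unlike interpolating '$search' directly into the query.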
Good luck!

Related

How to make a n:1 multiple tags system for a blog & which database schema is better?

I'd like to create a tagging system for a blog and its articles, like stackoverflow's tagging system for its questions.
database structure
Table - articles
Fields - article_id, article_title, article_content
Table - article_tags
Fields - tag_id, article_id, tag_name
If a user enters tags in the input, let's say they enter the following tags:
php
mysql
database structure
I'd like to understand how to convert these tags into an array and then insert each tag as a new row, as follows.
article_Tags
tag_id | article_id | tag_name
1      | 1          | php
2      | 1          | mysql
3      | 1          | database structure
I'd like to have PHP break each tag down by commas, and insert it separately into a new row value.
How do I accomplish this? Do I need a foreach and explode() to break down each tag?
Does the following table structure seem better than the one above?
Table - article
Fields - article_id, article_title, article_content
Table - article_tags
Fields - article_id, tag_id
Table - tags
Fields - tag_id, tag_name
If so, how do I go about entering each tag into the database? Do I need explode() as with the first table structure?
Your second structure is better. You store way less data that way.
The easiest approach is to create a CRUD for the tags table, display those tags on your create-post form, and on submit insert everything into your article and article_tags tables accordingly.
But you could also let the user create tags directly from the post form with a bit of JavaScript, then insert those tags into the db before inserting into article and article_tags.
Don't forget you'll need to check for duplicates either way. There is no sense in having 10 different php tags, per your example.
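A rough sketch of the split-then-insert flow with the duplicate check, under some assumptions: it is written in Python against an in-memory SQLite database rather than PHP and MySQL (str.split plays the role the question gives to explode()), and the parse_tags and save_tags helpers are hypothetical names. The UNIQUE constraint on tag_name plus INSERT OR IGNORE (MySQL spells it INSERT IGNORE) is one way to avoid duplicate tags; it is not the only one.

```python
import sqlite3

def parse_tags(raw):
    """Split a comma-separated string into trimmed, lower-cased, unique tags."""
    seen = []
    for part in raw.split(","):
        tag = part.strip().lower()
        if tag and tag not in seen:
            seen.append(tag)
    return seen

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article (
        article_id INTEGER PRIMARY KEY,
        article_title TEXT,
        article_content TEXT
    );
    CREATE TABLE tags (tag_id INTEGER PRIMARY KEY, tag_name TEXT NOT NULL UNIQUE);
    CREATE TABLE article_tags (
        article_id INTEGER NOT NULL,
        tag_id INTEGER NOT NULL,
        PRIMARY KEY (article_id, tag_id)
    );
""")

def save_tags(article_id, raw):
    """Insert each tag once into tags, then link it to the article."""
    for name in parse_tags(raw):
        # The UNIQUE constraint makes this a no-op when the tag already exists.
        conn.execute("INSERT OR IGNORE INTO tags (tag_name) VALUES (?)", (name,))
        (tag_id,) = conn.execute(
            "SELECT tag_id FROM tags WHERE tag_name = ?", (name,)
        ).fetchone()
        conn.execute(
            "INSERT OR IGNORE INTO article_tags (article_id, tag_id) VALUES (?, ?)",
            (article_id, tag_id),
        )

conn.execute("INSERT INTO article VALUES (1, 'First post', '...')")
conn.execute("INSERT INTO article VALUES (2, 'Second post', '...')")
save_tags(1, "php, mysql, database structure")
save_tags(2, "PHP, mysql")
```

After these two calls the tags table holds three rows (php, mysql, database structure) and article_tags holds five links, so 'php' is stored once even though two articles use it.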
This is called a many to many relation, and it's better to create an intermediary table.
You may ask why? Well, let's try out an example and see which one is better. Let's say you have 100 articles, with 3 tags each, and 30 unique tags.
First version:
article_tags will contain a total of 300 entries, which means that each tag name is duplicated on average 10 times.
Second version:
article_tags will contain the same 300 entries, but instead of having the tag name appear 300 times, it will appear only 30 times in the tags table.
So the second version has less duplicate content, so you should go with that.

Codeigniter 2.x : adding tags to my entries

I am building an application where a user may add notes. I would like the user to be able to add tags to the notes and filter by them afterwards. So far I let the user store the tags as a string and filter with LIKE %%. But that doesn't fully meet my needs: when there are more tags, I might need to search for only some of them, use operators, etc. An example: stackoverflow tags.
I was thinking about the following SQL structure
have a table called 'notes' with column called 'tag_id'
have a table called 'tags' which provides the 'tag_id' with 'text'
a function that translates each array (forward and backward) and replaces it with the opposite equivalent [numbers <=> tag translations for front end, and tag translations <=> number for the database and following filtering]
The question is: Is it a good idea to do it like this? How should I structure my database? Do you know about any particular articles that could help me?
If you want to have one note with many tags, I suggest you reverse the way you are tracking the relationship. In other words, have:
table note
- field noteid
- field notename
table tag
- field tagid
- field noteid
- field tagname
If you want a many-to-many relationship, store the relationship in a third table with 3 columns: 1) linkid 2) tagid 3) noteid.
You can then retrieve all tags for a particular note either by querying the tag table where noteid=noteid or by joining the tables; the actual join will depend on what you are trying to do.
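The many-to-many variant with a third table makes "has all of these tags" and "has any of these tags" filters straightforward. The sketch below is one possible way to write them, using Python with an in-memory SQLite database instead of CodeIgniter and MySQL; the note_tag table name, the sample rows and both helper functions are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE note (noteid INTEGER PRIMARY KEY, notename TEXT);
    CREATE TABLE tag (tagid INTEGER PRIMARY KEY, tagname TEXT UNIQUE);
    CREATE TABLE note_tag (
        linkid INTEGER PRIMARY KEY,
        tagid INTEGER NOT NULL,
        noteid INTEGER NOT NULL,
        UNIQUE (tagid, noteid)
    );
    INSERT INTO note VALUES (1, 'shopping list'), (2, 'meeting minutes'), (3, 'trip plan');
    INSERT INTO tag VALUES (1, 'work'), (2, 'urgent'), (3, 'home');
    INSERT INTO note_tag (tagid, noteid) VALUES (3, 1), (1, 2), (2, 2), (3, 3), (2, 3);
""")

def notes_with_any(tag_names):
    """Notes carrying at least one of the given tags (OR)."""
    names = sorted(set(tag_names))
    marks = ", ".join("?" for _ in names)
    sql = f"""
        SELECT DISTINCT n.noteid, n.notename
        FROM note n
        JOIN note_tag nt ON nt.noteid = n.noteid
        JOIN tag t ON t.tagid = nt.tagid
        WHERE t.tagname IN ({marks})
        ORDER BY n.noteid
    """
    return [name for (_, name) in conn.execute(sql, names)]

def notes_with_all(tag_names):
    """Notes carrying every one of the given tags (AND)."""
    names = sorted(set(tag_names))
    marks = ", ".join("?" for _ in names)
    sql = f"""
        SELECT n.notename
        FROM note n
        JOIN note_tag nt ON nt.noteid = n.noteid
        JOIN tag t ON t.tagid = nt.tagid
        WHERE t.tagname IN ({marks})
        GROUP BY n.noteid, n.notename
        HAVING COUNT(DISTINCT t.tagid) = ?
        ORDER BY n.noteid
    """
    return [name for (name,) in conn.execute(sql, (*names, len(names)))]

print(notes_with_all(["work", "urgent"]))  # ['meeting minutes']
print(notes_with_any(["work", "home"]))    # ['shopping list', 'meeting minutes', 'trip plan']
```

Only the number of placeholders is built into the SQL string; the tag names themselves are always bound as parameters.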

Per-user tag cloud in PHP & MySQL

I am looking to implement a per-user tag/interest cloud feature to a website I am making.
Each user has a profile page, and on said page a tag cloud of their preselected interests will be displayed. Each user can type their interests as a comma-delimited list, with suggestions if a tag has been used before, or the tag is created if it doesn't exist. Interests will be things such as music genres, hobbies etc.
I'd like to also add basic features such as comparing users tag clouds (shared tags) for finding users that are 'compatible' according to their cloud.
I could use help with the logistics of the database to achieve this. I understand simple database design, but I can't wrap my head around design for the above.
At the moment the database is one single table, with ID/Username/Password/Verification (the last a key for email verification).
The only idea I have come up with for the tag cloud db is two tables - one called tags with a tagid and tagname field, and another users_tags with a tagid and userid field, and an entry for every single tag a user has. However I am unsure if this is best practice.
Hope someone can give me some direction on all this - thanks in advance.
Having a table with only userid and tagid sounds like the best route for this.
To find "compatible" users as you mention, you can just run a query similar to
SELECT
ut.userid, COUNT(*) ct
FROM
user_tags ut
WHERE
ut.tagid IN (SELECT uta.tagid FROM user_tags uta WHERE uta.userid=24 )
GROUP BY ut.userid ORDER BY ct DESC;
Note that the above query will also return the original user, but it's much more efficient than removing them from the query.
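To see the shape of the result, here is the same query run on a handful of made-up rows. It is a sketch only: Python with an in-memory SQLite database stands in for PHP and MySQL, the sample data and the compatible_users helper are hypothetical, and as a variation it adds a userid <> ? condition so the original user does not appear in the output (whether that costs anything measurable on a real dataset is something to test rather than assume).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_tags (
        userid INTEGER NOT NULL,
        tagid INTEGER NOT NULL,
        PRIMARY KEY (userid, tagid)
    );
    -- Hypothetical sample data: user 24 has tags 1, 2 and 3.
    INSERT INTO user_tags VALUES
        (24, 1), (24, 2), (24, 3),
        (7, 1), (7, 2),
        (9, 3),
        (11, 4);
""")

def compatible_users(userid):
    """Other users ranked by how many tags they share with the given user."""
    sql = """
        SELECT ut.userid, COUNT(*) AS ct
        FROM user_tags ut
        WHERE ut.tagid IN (SELECT uta.tagid FROM user_tags uta WHERE uta.userid = ?)
          AND ut.userid <> ?
        GROUP BY ut.userid
        ORDER BY ct DESC, ut.userid
    """
    return conn.execute(sql, (userid, userid)).fetchall()

print(compatible_users(24))  # [(7, 2), (9, 1)]
```

User 7 shares two tags with user 24, user 9 shares one, and user 11 shares none and so is absent from the result.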

Database design for posts and comments

If one post has many comments, and the comments are essentially the same as posts (e.g. they have a title, pictures and audio etc.) should I create two tables or just one?
For example, if I only use one table I can have a parent_id column, so if a row is not a reply to anything it would be null; otherwise, it would hold the id of the parent post. On the other hand, I can create a post table and a comments table. Comments can also reply to other comments, so this could get confusing quickly.
*Post*
id
title
content
image
audio
parent_id
or,
*Post*       *Comments*
id           id
title        title
content      content
image        author_id
audio        post_id
author_id    image
             audio
What the second option would allow is creating indexes. In fact, I wouldn't even have to add author_id or post_id if I use indexes from the start, would I?
What are your thoughts on this, SO? Which would be more efficient? I am thinking of using redbeanphp for this.
The second option would be better. When displaying a message board you don't care about comments, and looking them up by an indexed parent post id column is fast. Posts and comments will likely have different fields, so keeping them separate is correct. The parent id index for the first option would work fine, but conceptually it's messy, and you're basically creating an index to use on half the rows, or however many comments there are relative to posts.
As a rule in database design, tables represent entities, so each entity in your application should get its own table. Although your posts and comments hold the same kind of data, they are separate entities, so they belong in two tables. This is not a personal opinion; it is a basic rule that leads to smoother application development.
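A minimal sketch of the two-table option, assuming Python with an in-memory SQLite database in place of redbeanphp and MySQL. The parent_comment_id column (for replies to other comments), the index name and the sample rows are assumptions added for illustration. Note that an index does not replace the post_id or author_id columns: it is built on top of a column that must already exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (
        id INTEGER PRIMARY KEY,
        title TEXT,
        content TEXT,
        author_id INTEGER
    );
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        post_id INTEGER NOT NULL REFERENCES post(id),
        -- Assumed column: NULL for a top-level comment, else the comment replied to.
        parent_comment_id INTEGER REFERENCES comments(id),
        author_id INTEGER,
        content TEXT
    );
    -- The index speeds up "all comments for this post" lookups.
    CREATE INDEX idx_comments_post_id ON comments (post_id);

    INSERT INTO post VALUES (1, 'Hello', 'First post', 10), (2, 'Other', 'Second post', 11);
    INSERT INTO comments VALUES
        (1, 1, NULL, 11, 'Nice post'),
        (2, 1, 1, 10, 'Thanks'),
        (3, 2, NULL, 10, 'Interesting');
""")

def board():
    """Listing the board touches only the post table."""
    return conn.execute("SELECT id, title FROM post ORDER BY id").fetchall()

def comments_for(post_id):
    """Comments of one post, found through the indexed post_id column."""
    return conn.execute(
        "SELECT id, parent_comment_id, content FROM comments WHERE post_id = ? ORDER BY id",
        (post_id,),
    ).fetchall()

print(board())          # [(1, 'Hello'), (2, 'Other')]
print(comments_for(1))  # [(1, None, 'Nice post'), (2, 1, 'Thanks')]
```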

Content Tagging

I'm trying to create a small web app to categorize certain types of YouTube videos. When users submit a video they will choose which categories the video falls under and tag it with ready-made tags, for example:
Video one - Category: Ad - Tags: cute, funny, has animal in it.
I'm trying to sketch my Database for that (I'm using MySQL), so far I have two ideas.
Idea 1:
Table Videos with ID and Category columns, another table Tags with ID and Tag columns while Videos.ID and Tags.ID are linked together. So when the user tries to filter search results by tags, the query will have more conditions (AND Tag = 'something' AND Tag = 'other thing').
Idea 2:
One table Videos with Category and Tags columns, where tags are stored as a string separated by commas. When the user tries to filter search results by tags, the query will have more conditions (AND Tags LIKE '%something%' AND Tags LIKE '% other thing%').
So the question is: Is there any better method? I already think that the 1st one is wasteful (Each video might have up to 40 ready-made tags) and the 2nd one is clumsy. If not, which one do you think is better?
Creating an additional table linking video id and tag id together is the correct solution. Filtering is done by creating additional INNER JOIN conditions. A comma-separated list just won't do: it drastically limits your selection and query possibilities.
Idea 1 looks good. Creating a separate table for storing tags helps in selection.
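The "additional INNER JOIN conditions" approach can be sketched as one join pair per required tag. This matters because a single link row holds only one tag, so a condition like Tag = 'something' AND Tag = 'other thing' on the same row can never be true. The example below is an illustration under assumptions: Python with an in-memory SQLite database rather than MySQL, and the video_tags table, the sample rows and the videos_matching helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE videos (id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, tag TEXT UNIQUE);
    CREATE TABLE video_tags (
        video_id INTEGER NOT NULL,
        tag_id INTEGER NOT NULL,
        PRIMARY KEY (video_id, tag_id)
    );
    INSERT INTO videos VALUES (1, 'Ad'), (2, 'Ad'), (3, 'Music');
    INSERT INTO tags VALUES (1, 'cute'), (2, 'funny'), (3, 'has animal in it');
    INSERT INTO video_tags VALUES (1, 1), (1, 2), (1, 3), (2, 2), (3, 1), (3, 2);
""")

def videos_matching(category, tag_names):
    """Ids of videos in the category that carry every tag in tag_names."""
    joins, params = [], []
    for i, name in enumerate(tag_names):
        # One pair of joins per required tag; only the integer i goes into the SQL text.
        joins.append(
            f"INNER JOIN video_tags vt{i} ON vt{i}.video_id = v.id "
            f"INNER JOIN tags t{i} ON t{i}.id = vt{i}.tag_id AND t{i}.tag = ?"
        )
        params.append(name)
    sql = f"SELECT v.id FROM videos v {' '.join(joins)} WHERE v.category = ? ORDER BY v.id"
    return [vid for (vid,) in conn.execute(sql, (*params, category))]

print(videos_matching("Ad", ["cute", "funny"]))  # [1]
print(videos_matching("Ad", ["funny"]))          # [1, 2]
```

With up to 40 tags per video the link table stays narrow (two integer columns per row), which is usually cheaper to filter than scanning a comma-separated string with LIKE.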
