I'm making a blog system and I want to add 'tags' to my blogposts. These are similar to the tags you see here, they can be used to group posts with similar subjects.
I want to store the tags in the database as a comma-separated string of words (non-whitespaced strings). But I'm not quite sure how I would search for all posts containing tag A and tag B.
I don't like a simple solution that works with a small database where I retrieve all data and scan it with a PHP loop, because this won't work with a large database (hundreds if not thousands of posts). I do not intend to make this many blogposts, but I want the system to be solid and save worktime on the PHP scripts by getting right results straight from the database.
Let's say my table looks like this (it's a bit more complex actually)
blogposts:
id | title | content_html | tags
0 | "hello world" | "<em>hello world!</em>" | "hello,world,tag0"
1 | "bye world" | "<strong>bye world!</strong>" | "bye,world,tag1,tag2"
2 | "hello you" | "hello you! :>" | "hello,tag3,you"
How would I be able to select all posts that contain "hello" as well as "world" in the tags? I know about the LIKE statement, where you can search for substrings, but can you use it with multiple substrings?
You can't index a field of CSV values in a meaningful way, and standard SQL has no efficient way to find a single value inside a field of CSV values. Instead, you'll want to set up two more tables and change your existing table as follows.
blogposts:
id | title | content_html
blogposts_tags:
id | tag_name
blogposts_taxonomy:
id | blogpost_id | tag_id
When you add a tag to a blog post, you will insert a new record into the taxonomy table. When you query for data, you'll join across all three tables to get the information similar to this:
SELECT `tag_name` FROM `blogposts` INNER JOIN `blogposts_taxonomy` ON
`blogposts`.`id`=`blogposts_taxonomy`.`blogpost_id` INNER JOIN `blogposts_tags` ON
`blogposts_taxonomy`.`tag_id`=`blogposts_tags`.`id` WHERE `blogposts`.`id` = someID;
//UPDATE
Setting up the N:M relationship gives you a lot of options during the build out of your application. For example, say you wanted to be able to search for blogposts that were all tagged "php." You could do that as follows:
SELECT `blogposts`.`id`,`blogposts`.`content_html` FROM `blogposts` INNER JOIN `blogposts_taxonomy` ON
`blogposts`.`id`=`blogposts_taxonomy`.`blogpost_id` INNER JOIN `blogposts_tags` ON
`blogposts_taxonomy`.`tag_id`=`blogposts_tags`.`id` WHERE `blogposts_tags`.`tag_name`="php";
That will return all blogposts that have been tagged with the "php" tag.
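To answer the original question (posts tagged with both "hello" and "world"), one common pattern is to filter on the wanted tag names, group by post, and keep only the posts where the number of distinct matching tags equals the number of tags asked for. Here is a minimal sketch in Python, using an in-memory SQLite database as a stand-in for MySQL so it runs on its own. The helper name posts_with_all_tags and the UNIQUE constraints are my own assumptions, not part of the schema above; the SELECT itself should carry over to MySQL with little change.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; table names follow the queries above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blogposts (id INTEGER PRIMARY KEY, title TEXT, content_html TEXT);
    CREATE TABLE blogposts_tags (id INTEGER PRIMARY KEY, tag_name TEXT UNIQUE);
    CREATE TABLE blogposts_taxonomy (
        id INTEGER PRIMARY KEY,
        blogpost_id INTEGER REFERENCES blogposts(id),
        tag_id INTEGER REFERENCES blogposts_tags(id),
        UNIQUE (blogpost_id, tag_id)
    );
""")

# Sample rows taken from the question.
posts = {
    0: ("hello world", ["hello", "world", "tag0"]),
    1: ("bye world", ["bye", "world", "tag1", "tag2"]),
    2: ("hello you", ["hello", "tag3", "you"]),
}
for post_id, (title, tags) in posts.items():
    conn.execute("INSERT INTO blogposts (id, title) VALUES (?, ?)", (post_id, title))
    for tag in tags:
        # Create the tag once, then link it to the post.
        conn.execute("INSERT OR IGNORE INTO blogposts_tags (tag_name) VALUES (?)", (tag,))
        conn.execute(
            "INSERT INTO blogposts_taxonomy (blogpost_id, tag_id) "
            "SELECT ?, id FROM blogposts_tags WHERE tag_name = ?",
            (post_id, tag),
        )


def posts_with_all_tags(conn, wanted):
    """Return (id, title) of posts that carry every tag in `wanted` (hypothetical helper)."""
    wanted = sorted(set(wanted))  # duplicates would break the count comparison
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT p.id, p.title
        FROM blogposts AS p
        JOIN blogposts_taxonomy AS x ON x.blogpost_id = p.id
        JOIN blogposts_tags AS t ON t.id = x.tag_id
        WHERE t.tag_name IN ({placeholders})
        GROUP BY p.id, p.title
        HAVING COUNT(DISTINCT t.tag_name) = ?
        ORDER BY p.id
    """
    return conn.execute(sql, (*wanted, len(wanted))).fetchall()


matches = posts_with_all_tags(conn, ["hello", "world"])
print(matches)
```

Only the tag names are passed as parameters; the placeholder list is generated from the number of tags, so no user input ends up in the SQL text.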
Cheers
If you really wanted to store the data like this, the MySQL function FIND_IN_SET would be your friend.
Call the function twice in the WHERE clause, once per tag.
But it will perform horribly. A linked table, one-to-many style as already suggested, is a MUCH better idea. If many posts share the same tags, a many-to-many relationship could be used, via a 'post2tag' table.
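For reference, this is roughly what the two-call WHERE clause looks like. The sketch below is Python with SQLite, which has no FIND_IN_SET, so I register a rough stand-in under that name (my own approximation, not MySQL's implementation). In MySQL you would use the built-in function and the query shape stays the same.

```python
import sqlite3


def find_in_set(needle, csv):
    """Rough stand-in for MySQL's FIND_IN_SET: 1-based position, or 0 if absent."""
    items = csv.split(",") if csv else []
    return items.index(needle) + 1 if needle in items else 0


conn = sqlite3.connect(":memory:")
# SQLite has no FIND_IN_SET, so register the stand-in under that name.
conn.create_function("FIND_IN_SET", 2, find_in_set)

conn.execute("CREATE TABLE blogposts (id INTEGER PRIMARY KEY, title TEXT, tags TEXT)")
conn.executemany(
    "INSERT INTO blogposts (id, title, tags) VALUES (?, ?, ?)",
    [
        (0, "hello world", "hello,world,tag0"),
        (1, "bye world", "bye,world,tag1,tag2"),
        (2, "hello you", "hello,tag3,you"),
    ],
)

# One FIND_IN_SET call per required tag; every row has to be scanned.
rows = conn.execute(
    "SELECT id, title FROM blogposts "
    "WHERE FIND_IN_SET(?, tags) AND FIND_IN_SET(?, tags) ORDER BY id",
    ("hello", "world"),
).fetchall()
print(rows)
```

Unlike LIKE '%hello%', this matches whole tags only, so "hello" does not match a tag such as "helloworld".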
I have a database in MySQL that currently lists approximately 1500 concerts and events. Now, the plan is to add setlists (lists of the songs performed at the concerts) for all the concerts in the database. Basically this will mean a lot of repeated values (songs performed at many concerts), and I would really appreciate some input on what the best approach would be.
I initially started out with a database similar to this:
| eventID | edate | venue | city | setlist |
The setlist field was basically text data, where I could paste the list of songs and parse it with PHP to put each song on a new line. This works, and editing the text and running order was like editing a text document. Obviously this was pretty simple, but it has drawbacks and limitations. Simple things like getting stats on songs performed are probably very difficult, right?
So, what is the best way to store the setlist value?
Create a new table that adds a new row for each song performed, and that has a foreign key linking to eventID? How would I best retain (and edit, if needed) the running order of the songs in that table? Any other suggestions?
Thanks for any input or advice on this, as I would love to get some help before I start adding all the data.
I would create a table that holds each song performed at a specific event:
| songId | eventID | song |
Where eventID can be duplicated in multiple rows to show each song performed at that event.
This way you can query all the times a specific song was performed, and also get all songs (the setlist) for a specific event by querying on the eventID.
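The question also asks about keeping the running order. One way to handle that, sketched below in Python with SQLite standing in for MySQL, is an extra integer column per row. The position column, the setlist_songs table name, the helper functions and the "Song A/B/C" rows are all my own placeholders, not something from the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (eventID INTEGER PRIMARY KEY, edate TEXT, venue TEXT, city TEXT);
    CREATE TABLE setlist_songs (
        songId   INTEGER PRIMARY KEY,
        eventID  INTEGER NOT NULL REFERENCES events(eventID),
        position INTEGER NOT NULL,  -- hypothetical column: running order within one event
        song     TEXT NOT NULL,
        UNIQUE (eventID, position)
    );
""")

# Placeholder data: two events sharing some songs.
conn.executemany("INSERT INTO events (eventID) VALUES (?)", [(1,), (2,)])
conn.executemany(
    "INSERT INTO setlist_songs (eventID, position, song) VALUES (?, ?, ?)",
    [
        (1, 1, "Song A"), (1, 2, "Song B"), (1, 3, "Song C"),
        (2, 1, "Song B"), (2, 2, "Song A"),
    ],
)


def setlist_for(conn, event_id):
    """Songs of one event in running order."""
    rows = conn.execute(
        "SELECT song FROM setlist_songs WHERE eventID = ? ORDER BY position",
        (event_id,),
    )
    return [song for (song,) in rows]


def play_counts(conn):
    """How often each song was performed, most played first."""
    return conn.execute(
        "SELECT song, COUNT(*) AS plays FROM setlist_songs "
        "GROUP BY song ORDER BY plays DESC, song"
    ).fetchall()


print(setlist_for(conn, 2))
print(play_counts(conn))
```

Changing the running order then means updating position values for that event rather than editing a block of text.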
So far I had my downloads table denormalized. I had two fields, author and country. Each held several values separated by spaces, e.g. "Jack James" for author and "us uk" for country.
I decided it's time to normalize it so I made a new table called downloads_authors with fields (da_id, downloads_id, da_author, da_country) and now I have:
+-------+--------------+-----------+------------+
| da_id | downloads_id | da_author | da_country |
+-------+--------------+-----------+------------+
| 1     | 1            | Jack      | us         |
| 2     | 1            | James     | uk         |
+-------+--------------+-----------+------------+
So far so good. But the way I used to have them, I used explode and, with a very bad function, got the desired result: <flag img> Jack, <flag img> James
Now, when I have them in another table I cannot think of a way to do this:
SELECT * FROM downloads and list the respective author(s) without having an inner loop (because if I do a JOIN then I will have the information from downloads again and again).
Desired output is:
item
- author
item
- author
- author
Am I wrong about the JOIN and is it the way to go?
Your options are: use a join, take the download information the first time you see a new download id, and ignore it until the download id changes; or query the data out and loop to build a new array, as described below; or use GROUP_CONCAT to join the authors back into a single string.
I would just query out all downloads then query out all authors. Loop over the downloads and assign them to an array where the download id is the key. Then loop over the authors and assign the different authors to a sub-array of the downloads using the download id.
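A minimal sketch of that two-query approach, in Python with SQLite standing in for MySQL. The title column on downloads and the extra sample rows are made up for illustration; only the downloads_authors columns come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE downloads (downloads_id INTEGER PRIMARY KEY, title TEXT);  -- title is hypothetical
    CREATE TABLE downloads_authors (
        da_id INTEGER PRIMARY KEY, downloads_id INTEGER, da_author TEXT, da_country TEXT
    );
    INSERT INTO downloads VALUES (1, 'item one'), (2, 'item two');
    INSERT INTO downloads_authors VALUES
        (1, 1, 'Jack', 'us'), (2, 1, 'James', 'uk'), (3, 2, 'Jill', 'us');
""")

# Query 1: all downloads, keyed by download id.
downloads = {}
for downloads_id, title in conn.execute(
    "SELECT downloads_id, title FROM downloads ORDER BY downloads_id"
):
    downloads[downloads_id] = {"title": title, "authors": []}

# Query 2: all authors, each appended to the sub-list of its download.
for downloads_id, author, country in conn.execute(
    "SELECT downloads_id, da_author, da_country FROM downloads_authors ORDER BY da_id"
):
    if downloads_id in downloads:
        downloads[downloads_id]["authors"].append((author, country))

# Render the nested structure: one item line, then one line per author.
for item in downloads.values():
    print(item["title"])
    for author, country in item["authors"]:
        print(f"- {author} ({country})")
```

This is two queries in total, no matter how many downloads there are, so there is no query inside a loop.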
I'd like to create a tagging system for a blog and its articles, like Stack Overflow's tagging system for its questions.
database structure
Table - articles
Fields - article_id, article_title, article_content
Table - article_tags
Fields - tag_id, article_id, tag_name
If a user enters tags in the input; let's say they enter the following tags:
php
mysql
database structure
I'd like to understand how to convert these tags into an array and then insert each tag as a new row, as follows.
article_tags
tag_id | article_id | tag_name
1 | 1 | php
2 | 1 | mysql
3 | 1 | database structure
I'd like to have PHP break each tag down by commas, and insert it separately into a new row value.
How do I accomplish this? Do I need a foreach and explode() to break down each tag?
Does the following table structure seem better than the one above?
Table - article
Fields - article_id, article_title, article_content
Table - article_tags
Fields - article_id, tag_id
Table - tags
Fields - tag_id, tag_name
If so, how do I go about entering each tag into the database? Do I need explode() as with the first table structure?
Your second structure is better. You store way less data that way.
The easiest way is to create a CRUD for the tags table, display those tags on your create-post form, and on submit insert everything into your article and article_tags tables accordingly.
But you could also let the user create tags directly from the post form with a bit of JavaScript, then insert those tags into the db before inserting into article and article_tags.
Don't forget you'll need to check for duplicates either way. No sense in having 10 different php tags, per your example.
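As a rough illustration of the split, de-duplicate and insert steps with the second structure, here is a sketch in Python with SQLite standing in for MySQL (in PHP the split would be explode() plus a foreach). The helpers parse_tags and tag_article, the lowercasing, and the sample articles are my own assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article (article_id INTEGER PRIMARY KEY, article_title TEXT, article_content TEXT);
    CREATE TABLE tags (tag_id INTEGER PRIMARY KEY, tag_name TEXT NOT NULL UNIQUE);
    CREATE TABLE article_tags (
        article_id INTEGER NOT NULL REFERENCES article(article_id),
        tag_id     INTEGER NOT NULL REFERENCES tags(tag_id),
        PRIMARY KEY (article_id, tag_id)
    );
""")


def parse_tags(raw):
    """Split on commas, trim, lowercase, drop empties and repeats (order preserved)."""
    seen = []
    for part in raw.split(","):
        tag = part.strip().lower()  # lowercasing is an assumption to avoid 'PHP' vs 'php'
        if tag and tag not in seen:
            seen.append(tag)
    return seen


def tag_article(conn, article_id, raw):
    """Create missing tags, then link each tag to the article."""
    for tag in parse_tags(raw):
        # The UNIQUE constraint plus OR IGNORE is the duplicate check.
        conn.execute("INSERT OR IGNORE INTO tags (tag_name) VALUES (?)", (tag,))
        (tag_id,) = conn.execute(
            "SELECT tag_id FROM tags WHERE tag_name = ?", (tag,)
        ).fetchone()
        conn.execute(
            "INSERT OR IGNORE INTO article_tags (article_id, tag_id) VALUES (?, ?)",
            (article_id, tag_id),
        )


conn.execute("INSERT INTO article (article_id, article_title) VALUES (1, 'First'), (2, 'Second')")
tag_article(conn, 1, "php, mysql, database structure")
tag_article(conn, 2, "PHP, php, mysql")

tag_count = conn.execute("SELECT COUNT(*) FROM tags").fetchone()[0]
link_count = conn.execute("SELECT COUNT(*) FROM article_tags").fetchone()[0]
print(tag_count, link_count)
```

In MySQL the equivalent of INSERT OR IGNORE would be INSERT IGNORE, again relying on a unique index on tag_name.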
This is called a many-to-many relationship, and it's better to create an intermediary table.
You may ask why? Well, let's try out an example and see which one is better. Let's say you have 100 articles, with 3 tags each, and 30 unique tags.
First version:
article_tags will contain a total of 300 entries, which means that each tag name is duplicated on average 10 times.
Second version:
article_tags will contain the same 300 entries, but instead of having the tag name appear 300 times, it will appear only 30 times in the tags table.
The second version stores less duplicate content, so you should go with that.
I am developing a web application where users can create the following resources/contents:
Events | Music | Posts | Classifieds
They have a lot of fields in common, such as:
created_date | title | desc | user_id
Now I am wondering if I should create separate tables for each content type, or save them all in one table with a type_id foreign key that points to a content_type table. Of course, there will be some distinct fields used only by specific content types; for content types that don't use those fields, I can just leave them blank.
Data looks more organized with separate tables for each content type, but searching for a keyword across all tables is becoming a nightmare (with joins, unions etc.). If it was just a single table, searching would be very easy.
I need the user to be able to search across all content with a keyword. They should also be able to search specific content types; for that I will add a WHERE clause on the type_id field.
I am not aware of all the pros/cons of each method, but I would appreciate it if people could advise me so that I don't make the wrong decision and have to redo everything from the start.
Maybe think of using the "has a" relationship. For instance, an event "has a" "web item handle" attached to it, and a "web item handle" is a thing with description, created date, title, 'owner' etc...
Unless they truly have identical data, I would use separate tables. Having one table with some fields only used by specific content types is really not very good database design.
If you really want one table with the basic data, you could create one as you suggested with a content_type and the common fields, and then have 4 separate tables for each of the types with the other distinct fields, then do an inner join when you select the fields for that type. But personally I think you are better off just creating 4 tables.
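Here is a small sketch of the "one table with the common fields plus one table per type" layout, in Python with SQLite standing in for whatever database you use. All table and column names beyond the common fields listed in the question (content, content_type as a text column, venue, price, the sample rows) are placeholders I made up, and I used description instead of desc because DESC is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Common fields live in one table (names beyond the question's are hypothetical).
    CREATE TABLE content (
        content_id   INTEGER PRIMARY KEY,
        content_type TEXT NOT NULL,
        created_date TEXT,
        title        TEXT,
        description  TEXT,
        user_id      INTEGER
    );
    -- Type-specific fields live in their own tables, sharing the primary key.
    CREATE TABLE events      (content_id INTEGER PRIMARY KEY REFERENCES content(content_id), venue TEXT);
    CREATE TABLE classifieds (content_id INTEGER PRIMARY KEY REFERENCES content(content_id), price REAL);

    INSERT INTO content VALUES
        (1, 'event',      '2024-01-01', 'Jazz night',      'Live jazz downtown', 1),
        (2, 'classified', '2024-01-02', 'Guitar for sale', 'Used jazz guitar',   2),
        (3, 'post',       '2024-01-03', 'Hello',           'First post',         1);
    INSERT INTO events VALUES (1, 'Blue Room');
    INSERT INTO classifieds VALUES (2, 150.0);
""")


def search(conn, keyword, content_type=None):
    """Keyword search over the shared table, optionally limited to one type."""
    sql = (
        "SELECT content_id, content_type, title FROM content "
        "WHERE (title LIKE ? OR description LIKE ?)"
    )
    params = [f"%{keyword}%", f"%{keyword}%"]
    if content_type is not None:
        sql += " AND content_type = ?"
        params.append(content_type)
    sql += " ORDER BY content_id"
    return conn.execute(sql, params).fetchall()


# Type-specific details come from an inner join on the shared key.
event_details = conn.execute(
    "SELECT c.title, e.venue FROM content AS c "
    "JOIN events AS e ON e.content_id = c.content_id"
).fetchall()

print(search(conn, "jazz"))
print(search(conn, "jazz", "event"))
print(event_details)
```

The cross-type keyword search only touches the shared table, which is the part the question found painful with fully separate tables.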
So I want to index the lyrics from a lyrics website and then perform operations on the lyrics (search for certain artists, terms, patterns etc.).
I figure the best scenario is if there is already some structured file format for me to use--> anyone know if anything like this exists?
The next best thing would be a site that is "amenable" to what I am trying to do--> any such site?
Any comments in general about how I can do this speedily? (This is supposed to be a fun project and not a heavy duty application)
Thanks!
Downloading the whole lyrics database from a site is a bad idea; you can query the site for each lyric you want instead.
Even if you download all the lyrics, don't store them in a flat file (maybe XML?); use a database like SQLite instead. Otherwise operations like searching or listing would be painful.
I have no idea about amenable sites, though.
Edit: I found the ChartLyrics API; you can use their API easily.
Generally,
1) Download that lyric and store it in separate table in your database
table: lyrics (example)
+---------+-------------+-----------------+-------------------------------+
| lyr_id | lyr_artist | lyr_title | lyr_content |
+---------+-------------+-----------------+-------------------------------+
| 1 | Metallica | The Unforgiven | New blood joins this earth... |
+---------+-------------+-----------------+-------------------------------+
...
+---------+-------------+-----------------+-------------------------------+
2) Search artist in column lyr_artist, song title in column lyr_title, text (keywords) in lyr_content, etc.
Query examples
SELECT * FROM lyrics WHERE lyr_artist='artist';
SELECT * FROM lyrics WHERE lyr_title='song_title';
SELECT * FROM lyrics WHERE lyr_content LIKE '%word1%' AND lyr_content LIKE '%word2%';
Well, generally, something like that, or mix the WHERE conditions. You can use WHERE...LIKE on columns like song title and artist too, for example to find the song "The Unforgiven" if the user searches for the keyword "Unforgiven", etc.
3) Use query result to display search results
Note: Storing data in files on the server is generally slower to search than storing it in a database.
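Putting steps 1 to 3 together, a minimal sketch in Python with the built-in sqlite3 module might look like this. The search_lyrics helper is my own name; the single row is the example from the table above. Keywords are passed as bound parameters rather than pasted into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lyrics ("
    "lyr_id INTEGER PRIMARY KEY, lyr_artist TEXT, lyr_title TEXT, lyr_content TEXT)"
)
# Step 1: store a downloaded lyric (the example row from the table above).
conn.execute(
    "INSERT INTO lyrics (lyr_id, lyr_artist, lyr_title, lyr_content) VALUES (?, ?, ?, ?)",
    (1, "Metallica", "The Unforgiven", "New blood joins this earth..."),
)


def search_lyrics(conn, words):
    """Step 2: find songs whose text contains every keyword (hypothetical helper)."""
    if not words:
        return []
    # One LIKE condition per keyword, all combined with AND.
    clauses = " AND ".join("lyr_content LIKE ?" for _ in words)
    params = [f"%{word}%" for word in words]
    sql = "SELECT lyr_id, lyr_artist, lyr_title FROM lyrics WHERE " + clauses
    return conn.execute(sql, params).fetchall()


# Step 3: use the query result to display search results.
results = search_lyrics(conn, ["blood", "earth"])
for lyr_id, artist, title in results:
    print(f"{artist}: {title}")
```

Note that % and _ inside a keyword still act as LIKE wildcards here; escaping them is left out to keep the sketch short.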