MySQL database design for image options relationships

I have two tables, images and image_data and here is an example of my image_data table.
image_id | slide_id | language_id | type
101      | 1        | 1           | CQ
101      | 2        | NULL        | NULL
56       | 5        | 1           | TN
56       | NULL     | 2           | NULL
So basically, each image will have different options, and I am wondering what the best way to implement this is, because I have a feeling I am doing it the wrong way.
With this layout, I can run a query that uses GROUP_CONCAT() to turn the values in multiple rows into a single concatenated string.
image_id | slide_id | language_id | type
101      | 1,2      | 1           | CQ
56       | 5        | 1,2         | TN
Which is fine, but the problem with the way I am doing it right now is that it seems like it will be really difficult to update the rows from my backend system.
With my query, I can determine which boxes to check, since the concatenation gives me everything for an image in one row. But when I click "Save" and update the rows, which one do I update? There can be more than one row with the same image id, so how would I update the right one?
If I checked off another slide for image #101, I would need to create a new row for it. If after that I wanted to add another language_id to it, I would need to make sure not to add a new row, since one already exists with a NULL value, and just replace the NULL with the new language id.
It seems really complicated, and there are so many factors that this method is really hard to program.
What would be the best way to do this? Any suggestions are really appreciated.
Thanks!

What you need to do is implement N:M (many-to-many) relationships between your images and slides / languages / types tables so that your design is more normalized (one fact in one place).
Think of it this way: one image can have multiple slides, and one slide may be an option of multiple images. This is an N:M relationship. The same goes for languages and types.
What you need to do is get rid of your image_data table which houses the options between ALL entities and have three separate cross-reference tables instead. Here's how you would model it:
Base tables:
images(image_id [PK], ...)
slides(slide_id [PK], slide_name, ...)
languages(language_id [PK], language_name, ...)
types(type_name [PK], ...)
Cross-Reference tables:
images_has_slides(image_id [PK], slide_id [PK])
images_has_languages(image_id [PK], language_id [PK])
images_has_types(image_id [PK], type_name [PK])
How it would look in ER: [ER diagram of the tables above; image not available]
With this type of design, you wouldn't have to deal with NULL values or figure out which row to update, because you now have just one fact in one place. To get all options, you would still use GROUP_CONCAT(), with DISTINCT because joining several cross-reference tables at once repeats each value once for every combination of the others:
SELECT
    a.*,
    GROUP_CONCAT(DISTINCT c.slide_name) AS slides,
    GROUP_CONCAT(DISTINCT e.language_name) AS languages,
    GROUP_CONCAT(DISTINCT f.type_name) AS types
FROM
    images a
LEFT JOIN
    images_has_slides b ON a.image_id = b.image_id
LEFT JOIN
    slides c ON b.slide_id = c.slide_id
LEFT JOIN
    images_has_languages d ON a.image_id = d.image_id
LEFT JOIN
    languages e ON d.language_id = e.language_id
LEFT JOIN
    images_has_types f ON a.image_id = f.image_id
GROUP BY
    a.image_id;
Then to update image options, you would use INSERT and DELETE on the cross-reference tables:
Let's say you wanted to add two languages to an image, you would do
INSERT INTO images_has_languages (image_id, language_id)
VALUES (101, 4), (101, 5);
The above query adds the languages with ids 4 and 5 to the image with id 101.
To remove options (unchecking on the form), let's say you wanted to remove two slides from an image:
DELETE FROM images_has_slides WHERE image_id = 101 AND slide_id IN (3, 6);
This removes the slides with ids 3 and 6 from the image with id 101.
So in your application, you can work out which INSERT and DELETE queries to run by comparing the values the user checked in the form with the rows currently stored for the image.
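The checked/unchecked comparison described above can be sketched as a set difference. This is a minimal sketch, not the answerer's code: it uses Python's built-in sqlite3 as a stand-in for MySQL, and the helper name sync_options is hypothetical.

```python
import sqlite3

def sync_options(conn, table, column, image_id, checked_ids):
    """Make the cross-reference rows for one image match the checked boxes.

    table and column must be trusted, hard-coded names (never user input),
    because they are interpolated into the SQL text.
    """
    rows = conn.execute(
        f"SELECT {column} FROM {table} WHERE image_id = ?", (image_id,)
    ).fetchall()
    stored = {r[0] for r in rows}
    checked = set(checked_ids)
    to_add = checked - stored      # newly checked boxes
    to_remove = stored - checked   # boxes the user unchecked
    conn.executemany(
        f"INSERT INTO {table} (image_id, {column}) VALUES (?, ?)",
        [(image_id, v) for v in sorted(to_add)],
    )
    conn.executemany(
        f"DELETE FROM {table} WHERE image_id = ? AND {column} = ?",
        [(image_id, v) for v in sorted(to_remove)],
    )
    conn.commit()
    return to_add, to_remove

# Demo with assumed sample data: image 101 currently has slides 1, 3 and 6.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images_has_slides ("
    "image_id INTEGER, slide_id INTEGER, PRIMARY KEY (image_id, slide_id))"
)
conn.executemany(
    "INSERT INTO images_has_slides VALUES (?, ?)", [(101, 1), (101, 3), (101, 6)]
)
# The form comes back with slides 1 and 2 checked.
added, removed = sync_options(conn, "images_has_slides", "slide_id", 101, [1, 2])
remaining = [
    r[0]
    for r in conn.execute(
        "SELECT slide_id FROM images_has_slides WHERE image_id = 101 ORDER BY slide_id"
    )
]
```

Rows that are both stored and checked are never touched, so there is no "which row do I update?" question left.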

Have you tried splitting the tables? If you made separate tables for the slides and languages and kept the type in the same table as the image ID, you could then use those to build your lists. You could then add foreign keys and indexes on the join columns so you don't take as big a performance hit.
Here's what I mean:
Image data table: two columns, image_id and image_type (TYPE is a MySQL keyword, so a more specific name is safer). image_id is the primary key, so there are no duplicates (assuming you only want one type for each image).
Image-language table: two columns, image_id and image_language. Together they form the primary key, so you can't duplicate a language on the same image ID, but an image ID can have multiple languages. image_id is a foreign key that references the primary key of the image data table.
Image-slide table: two columns, image_id and slide_number. Same as above (composite primary key, foreign key, etc.).
This way you could get all the data like so:
SELECT d.image_id, d.image_type, l.image_language, s.slide_number FROM image_data d LEFT JOIN image_language l ON d.image_id = l.image_id LEFT JOIN image_slide s ON d.image_id = s.image_id
The left joins make sure every image ID always shows up, even if it has no languages or slides. The result is a "matrix" of sorts, with a row for each combination of image, language, and slide. For example, if an image had Spanish and English as its languages and 4 slides, you would get 8 rows: one for each slide in each language.
I don't know if that will necessarily solve the problem, but it would make it a little easier to control exactly what is in the database while still having the database do a bit of the work for you.
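To see the "matrix" effect concretely, here is a small sketch using Python's built-in sqlite3 as a stand-in for MySQL. The sample rows (image 101 with two languages and four slides, image 56 with none) are assumptions made up for the demonstration, and the join condition for image_slide compares against d.image_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE image_data (image_id INTEGER PRIMARY KEY, image_type TEXT);
CREATE TABLE image_language (
    image_id INTEGER, image_language TEXT,
    PRIMARY KEY (image_id, image_language)
);
CREATE TABLE image_slide (
    image_id INTEGER, slide_number INTEGER,
    PRIMARY KEY (image_id, slide_number)
);
INSERT INTO image_data VALUES (101, 'CQ'), (56, 'TN');
INSERT INTO image_language VALUES (101, 'english'), (101, 'spanish');
INSERT INTO image_slide VALUES (101, 1), (101, 2), (101, 3), (101, 4);
""")

rows = conn.execute("""
SELECT d.image_id, d.image_type, l.image_language, s.slide_number
FROM image_data d
LEFT JOIN image_language l ON d.image_id = l.image_id
LEFT JOIN image_slide s ON d.image_id = s.image_id
""").fetchall()

# Image 101: 2 languages x 4 slides, one row per combination.
rows_101 = [r for r in rows if r[0] == 101]
# Image 56 has no languages or slides; the LEFT JOINs keep it with NULLs.
rows_56 = [r for r in rows if r[0] == 56]
```

The row count grows multiplicatively with every extra option table joined this way, which is why aggregating with GROUP_CONCAT(DISTINCT ...) or querying each option table separately is usually easier to consume in the application.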

You need to normalize your schema.
You have images table:
CREATE TABLE images (
image_id integer,
image_name varchar(100),
PRIMARY KEY(image_id)
);
Each image can have several slides:
CREATE TABLE slides (
slide_id integer,
image_id integer,
slide_name varchar(100),
PRIMARY KEY(slide_id)
);
The same goes for the image_types and image_languages. I hope you understand the logic. And make sure to add proper FOREIGN KEY constraints. Also, it is a good idea to CREATE INDEX on the image_id columns of the subordinate tables.
Now, you have 1 row per each parameter in the related tables. Managing the contents should be easy: INSERT new records when some features are selected and DELETE them when those are deselected. The query (based on the outlined 2 tables) should be:
SELECT i.image_id, i.image_name,
group_concat(s.slide_id) AS slides
FROM images i
LEFT JOIN slides s USING (image_id)
GROUP BY i.image_id;
Some notes:
It is safe to GROUP BY image_id alone, as it is the PRIMARY KEY of images and thus guarantees single-row grouping;
If you'd like to have slide_id (also language_id, type_id and others) starting from 1 for each of the images, you might go for a 2-field primary keys in the subordinate table, like PRIMARY KEY (image_id, slide_id).
EDIT:
A note on the many-to-many relations. If you happen to have 2 sets of related data, like images can have many slides and slide_id can be shared by many image_id, then you need an extra table:
CREATE TABLE images (
image_id integer,
image_name varchar(100),
PRIMARY KEY(image_id)
);
CREATE TABLE slides (
slide_id integer,
slide_name varchar(100),
PRIMARY KEY(slide_id)
);
CREATE TABLE image_slides (
image_id integer,
slide_id integer,
create_dt timestamp,
PRIMARY KEY (image_id, slide_id)
);
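As a rough illustration of what the composite primary key and the FOREIGN KEY constraints buy you, the sketch below recreates the three tables in SQLite (via Python's sqlite3, standing in for MySQL) and tries a duplicate link and a link to a missing image. The helper try_link and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is on;
# MySQL's InnoDB enforces them by default.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE images (image_id INTEGER PRIMARY KEY, image_name TEXT);
CREATE TABLE slides (slide_id INTEGER PRIMARY KEY, slide_name TEXT);
CREATE TABLE image_slides (
    image_id INTEGER NOT NULL REFERENCES images (image_id),
    slide_id INTEGER NOT NULL REFERENCES slides (slide_id),
    PRIMARY KEY (image_id, slide_id)
);
INSERT INTO images VALUES (101, 'first image');
INSERT INTO slides VALUES (1, 'first slide');
""")

def try_link(image_id, slide_id):
    """Return True if the link row was inserted, False if a constraint refused it."""
    try:
        conn.execute(
            "INSERT INTO image_slides (image_id, slide_id) VALUES (?, ?)",
            (image_id, slide_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False

first = try_link(101, 1)      # valid link
duplicate = try_link(101, 1)  # same pair again: rejected by the primary key
orphan = try_link(999, 1)     # image 999 does not exist: rejected by the foreign key
```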

Related

InnoDB: Copying a number of records along with their tag memberships

Foreign keys might be appropriate to this problem/solution. However, I have inherited this code and db, which do not use foreign keys, and so it would be difficult to add them. If absolutely necessary I can do it, but I'd prefer not to.
Let's pretend that I have a very simple set of tables in an InnoDB database that are used to store a bunch of jokes, each of which belongs to particular group and may have one or more tags associated with it. I am using PHP/MySQLi to do the work. Let's say the tables look like so:
GROUPS
id (int, primary, auto_inc) | group_name (varchar[64])
============================================================
1 knock, knock jokes
2 one-liners
JOKES
id (int, primary, auto_inc) | group_id (int) | joke_text (varchar[255])
=============================================================================
1 1 interrupting cow. inte-MOO!
TAGS
id (int, primary, auto_inc) | tag_text (varchar[255])
=============================================================================
1 explicit
2 hilarious
JOKE_TAGS
id (int, primary, auto_inc) | joke_id (int) | tag_id (int)
=============================================================================
1 1 1
Even though it makes no sense in the context of these jokes, let's just say that the user has the option to copy the jokes from one group to another. Thanks to users' help on this site, I have figured out that the easiest way to do that would be something like the following:
INSERT INTO jokes (group_id,joke_text)
SELECT '$dstGroupID', j2.joke_text FROM jokes AS j2
WHERE j2.group_id = '$srcGroupID';
That seems to work just fine. However, I am completely lost as to how I can efficiently copy over the tag memberships. For instance, if I was to copy the jokes from group.id=1 to group.id=2 (using the sample data shown above), I would want the JOKE_TAGS table to look like so:
JOKE_TAGS
id (int, primary, auto_inc) | joke_id (int) | tag_id (int)
=============================================================================
1 1 1
2 2 1
For the life of me, I simply cannot figure out a way to do this without throwing away the SQL above and simply iterating through every single joke that is being copied, with the logic looking something like this:
Pull out a joke's information
Pull out that joke's tag memberships
Insert a new record into JOKES with the pulled out info from above
Grab the id of the newly inserted joke
Insert a new record into JOKE_TAGS, using the id grabbed above
This is obviously WILDLY inefficient when compared to the 'copying' SQL listed above. If anyone can suggest a more efficient solution, I'd be most appreciative.

You're already using foreign keys, even if they're not being enforced. What you're describing is a fundamental change in your data structure. Going from 1:1 to 1:n: one joke existing in one group, to one joke existing in MULTIPLE groups.
As such, the normal fix would be to move that group_id field out of the jokes table and into a link table:
jokes <-> joke_groups <-> groups
in which case, a simple "copy" would involve inserting a new record in the link table:
(joke #1, group #7) // existing joke/group link
(joke #1, group #3) // "copying" the joke into group #3
If you CAN'T change the schema to accommodate the change in structure, then you will have to manually copy the joke around:
a) get contents of joke record
b) insert copied data back into joke record to create a NEW joke
c) get ID of new record
d) copy all tags from old id, insert with new linkages to new joke's ID
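Normally this would be as simple as a couple of INSERT INTO ... SELECT FROM-type queries. MySQL does let INSERT ... SELECT read from the table it is inserting into (it buffers the selected rows in a temporary table), but a single bulk statement gives you no mapping from each old joke id to its new id, so a round-trip through the client is required to copy the tags.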
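Steps a) to d) can be sketched as follows. This is a minimal sketch under stated assumptions: Python's sqlite3 stands in for PHP/MySQLi, copy_group is a hypothetical helper, and cursor.lastrowid plays the role that mysqli's insert_id would play in PHP.

```python
import sqlite3

def copy_group(conn, src_group_id, dst_group_id):
    """Copy every joke in one group to another, along with its tag memberships."""
    id_map = {}
    # a) get the contents of the joke records (fetched up front, before inserting)
    jokes = conn.execute(
        "SELECT id, joke_text FROM jokes WHERE group_id = ?", (src_group_id,)
    ).fetchall()
    for old_id, joke_text in jokes:
        # b) insert the copied data as a NEW joke
        cur = conn.execute(
            "INSERT INTO jokes (group_id, joke_text) VALUES (?, ?)",
            (dst_group_id, joke_text),
        )
        # c) get the ID of the new record
        new_id = cur.lastrowid
        id_map[old_id] = new_id
        # d) copy all tags of the old joke in one statement
        conn.execute(
            "INSERT INTO joke_tags (joke_id, tag_id) "
            "SELECT ?, tag_id FROM joke_tags WHERE joke_id = ?",
            (new_id, old_id),
        )
    conn.commit()
    return id_map

# Demo using the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE jokes (
    id INTEGER PRIMARY KEY AUTOINCREMENT, group_id INTEGER, joke_text TEXT
);
CREATE TABLE joke_tags (
    id INTEGER PRIMARY KEY AUTOINCREMENT, joke_id INTEGER, tag_id INTEGER
);
INSERT INTO jokes (group_id, joke_text) VALUES (1, 'interrupting cow. inte-MOO!');
INSERT INTO joke_tags (joke_id, tag_id) VALUES (1, 1);
""")
id_map = copy_group(conn, 1, 2)
joke_tags = conn.execute("SELECT id, joke_id, tag_id FROM joke_tags ORDER BY id").fetchall()
```

This still issues one INSERT per joke, but the tags of each joke are copied with a single INSERT ... SELECT rather than row by row, and wrapping the whole copy in one transaction keeps it atomic.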

PHP MySQL is taking 0.8s to 3s to load on search query, how to speed up

my MySQL table is in this structure:
|id|title|duration|thumb|videoid|tags|category|views
|1||Video Name|300|thumb1.jpg|134|tag1|tag2|tag3|category|15
|2||Video Name2|300|thumb2.jpg|1135|tag2|tag3|tag4|category|10
Table contains about 317k rows.
Query is:
SELECT id,title,thumb FROM videos WHERE tags LIKE '%$keyword%' or title LIKE '%$keyword%' order by id desc limit 20
And this is taking 0.8s to 3s to load results.
I'm new to PHP/MySQL. How can I speed up these queries? Any suggestions are welcome, thank you.

The only other suggestion I can throw in is to have a multi-part index of
( tags, title, id )
This way, it can utilize the index to qualify the WHERE clause criteria for both tags and title, and have the ID for the order by clause without having to go back to the raw data pages. Then, when records ARE found, only for those entries does it need to actually retrieve the raw data pages for the other columns associated with the row.

You are using this search construct:
column LIKE '%$keyword%'
The leading % wildcard character definitely defeats the use of indexes to do these searches. How to cure this terrible performance problem? You could use FULLTEXT search, about which you can read. Or, you could try to organize your tables so
column LIKE 'keyword%'
will find what you need, and then index the columns being searched. To do this, you would create a tag table, with a name and id for each distinct tag. This table will have a primary key on the id, and a unique key on the tag. E.g.
tag_id | tag
1 | drama
2 | comedy
3 | horror
4 | historical
Then you would create another table, known in the trade as a join table, with two ids in it. The primary key of this table is a composite of the two columns. You also need a non-unique index on the tag_id field.
video_id | tag_id
1 | 1
1 | 4
This sample data gives video with id = 1 the tags "drama" and "historical."
Then to match tags you need
SELECT v.id, v.title, v.thumb
FROM video AS v
JOIN tag_video AS tv ON v.id = tv.video_id
JOIN tag AS t ON tv.tag_id = t.tag_id
WHERE t.tag IN ('drama', 'comedy')
This will look up your tags very fast, and let you look up multiple ones in a single query if you wish.
It won't help with your requirement for full text search on your titles, however.
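Here is a small, self-contained sketch of the tag lookup, with Python's sqlite3 standing in for MySQL. The sample videos and the extra DISTINCT, ORDER BY and LIMIT are assumptions added for the demonstration: DISTINCT keeps a video that carries several of the requested tags from appearing more than once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE video (id INTEGER PRIMARY KEY, title TEXT, thumb TEXT);
CREATE TABLE tag (tag_id INTEGER PRIMARY KEY, tag TEXT UNIQUE);
CREATE TABLE tag_video (
    video_id INTEGER, tag_id INTEGER,
    PRIMARY KEY (video_id, tag_id)
);
-- Lets the lookup start from a tag and find its videos.
CREATE INDEX idx_tag_video_tag ON tag_video (tag_id);

INSERT INTO video VALUES
    (1, 'Video Name', 'thumb1.jpg'),
    (2, 'Video Name2', 'thumb2.jpg'),
    (3, 'Video Name3', 'thumb3.jpg');
INSERT INTO tag VALUES (1, 'drama'), (2, 'comedy'), (3, 'horror'), (4, 'historical');
INSERT INTO tag_video VALUES (1, 1), (1, 4), (2, 1), (2, 2), (3, 3);
""")

wanted = ["drama", "comedy"]
placeholders = ", ".join("?" for _ in wanted)  # one bound parameter per tag
matches = conn.execute(
    f"""
    SELECT DISTINCT v.id, v.title, v.thumb
    FROM video AS v
    JOIN tag_video AS tv ON v.id = tv.video_id
    JOIN tag AS t ON tv.tag_id = t.tag_id
    WHERE t.tag IN ({placeholders})
    ORDER BY v.id DESC
    LIMIT 20
    """,
    wanted,
).fetchall()
matched_ids = [row[0] for row in matches]
```

Because the tags are compared for equality rather than with a leading-wildcard LIKE, the unique index on tag.tag can be used for the lookup.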

EDITED:
Define indexes on the title and tags fields (the columns the query searches).
Try this:
ALTER TABLE `videos` ADD INDEX (`title`);
ALTER TABLE `videos` ADD INDEX (`tags`);

How to take votes and rank images using PHP and MySQL

I'm trying to build a php ranking system where users can rank an image on a scale of 1-5.
Depending on how an image is ranked decides what its place on the leaderboard (rank number) would be. The rank should change depending on the different ratings it receives from users.
An example of this is the ranking system here. http://www.newgrounds.com/portal/view/601966 (Right hand side, lower down the page.)
I'm just looking for any information which would help me achieve this.
Thanks.

Create a table called votes and tie it to your images table:
VOTES:
vote_id INT(11) PK
user_id INT(11)
image_id INT(11)
score TINYINT(1)

Here are some things you're going to need to know:
You need a database. In your database you need to store each of the images you're ranking, do this in a table called "images". In this table you will give each image an "auto-incrementing" primary key. (this means that for each new row you add to the database the primary key will AUTOMATICALLY be +1 from the row before). This means that each image has a UNIQUE row number next to it - identifying that specific row. Call this column id. (we will reference it in other tables in the column image_id).
Next you need a table called "votes". In this table you can store all sorts of information you might need, but simply all you'll need to store is the unique image number from the "images" table and the value of the vote that someone has cast. You'll end up with something like this:
image_id | vote_value
1 | 3
2 | 5
1 | 3
4 | 1
4 | 3
Now you can query this information to get your leaderboard. The query might look something like this:
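SELECT image_id, SUM(vote_value) AS total_score FROM votes GROUP BY image_id ORDER BY total_score DESC;
That will give you a list of "image_id"s ordered by their rank (i.e. the total of all the votes), highest first. The alias is total_score rather than rank because RANK is a reserved word in MySQL 8.0.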
Then you can go back to your images table and get the information for that image out of that table.
SELECT name, url FROM images WHERE id=#image_id we got above#;
Hope this helps you. :) If you get stuck come back and ask again.
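The two queries can also be combined into one with a join. The sketch below runs that against the sample votes from this answer, using Python's sqlite3 as a stand-in for MySQL; the image names and the extra AVG column are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE images (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE votes (image_id INTEGER, vote_value INTEGER);
INSERT INTO images VALUES (1, 'a.jpg'), (2, 'b.jpg'), (4, 'c.jpg');
INSERT INTO votes VALUES (1, 3), (2, 5), (1, 3), (4, 1), (4, 3);
""")

# One row per image: total and average of its votes, best total first.
leaderboard = conn.execute("""
SELECT i.id,
       i.name,
       SUM(v.vote_value) AS total_score,
       AVG(v.vote_value) AS avg_score
FROM images i
JOIN votes v ON v.image_id = i.id
GROUP BY i.id, i.name
ORDER BY total_score DESC
""").fetchall()

ranked_ids = [row[0] for row in leaderboard]
totals = [row[2] for row in leaderboard]
```

Ordering by the total favours images with many votes, while ordering by avg_score favours images with few but high votes; which one fits depends on how the leaderboard is meant to behave.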

PHP: Is there a database table for classifying articles by keyword?

I want to store some articles in a database. I use PHP and MySQL.
Is there such a thing as a keyword classification lookup table?
like:
directory | keyword1 | keyword2 | keyword3 | keyword4|
sport | football | f1 | nba | tennis |
so that:
$query = mysql_query("SELECT * FROM keywordtable WHERE keyword1='$word' OR keyword2='$word' OR keyword3='$word' OR keyword4='$word' ");
If some words in the article match one of the keywords, the article will be inserted into that directory (sport in this example).
Thanks.
I need a table like my example, but it should contain many words that I can check against: if one of those words appears in my article, I can put the article into the directory it belongs in. I know there are many more words that could be classified as sport.

If you are using fields like something1 something2 then you should probably use a different table for them, basically that's what relational databases are for. (Of course there are some legitimate reasons to use something like this, for example caching purposes.)

If I understand what you're asking, you'd like to have a keyword table against which you can check content. If the content matches any of the keywords, then the content is tagged with the associated directory. If that's correct then you need 4 tables:
keywords
    kword
    dir_id, int not null FK directories
directories
    dir_id, int not null primary key
    dir_name, varchar
article_directories
    art_id, int not null FK articles
    dir_id, int not null FK directories
articles, MyISAM
    art_id, int not null primary key
    art_title, varchar
    content, text FULLTEXT index
INSERT INTO article_directories(art_id,dir_id)
SELECT DISTINCT a.art_id, k.dir_id
FROM articles a, keywords k
WHERE MATCH (a.content) AGAINST (k.kword)
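The above query is meant to identify all possible directories that an article could belong to. This means that an article could belong to both 'sports' and 'entertainment', for example. Note that MySQL requires the argument of AGAINST() to be a constant string rather than a column, so in practice you would run the statement once per keyword from PHP, passing the keyword as a literal or bound parameter.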
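A per-keyword version of that classification step might look like the sketch below. It is only an approximation of the idea: Python's sqlite3 stands in for MySQL, a case-insensitive substring test (instr) is swapped in for FULLTEXT MATCH ... AGAINST, INSERT OR IGNORE is SQLite's spelling of MySQL's INSERT IGNORE, and the sample articles and keywords are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE directories (dir_id INTEGER PRIMARY KEY, dir_name TEXT);
CREATE TABLE keywords (kword TEXT, dir_id INTEGER);
CREATE TABLE articles (art_id INTEGER PRIMARY KEY, art_title TEXT, content TEXT);
CREATE TABLE article_directories (
    art_id INTEGER, dir_id INTEGER,
    PRIMARY KEY (art_id, dir_id)
);
INSERT INTO directories VALUES (1, 'sport'), (2, 'entertainment');
INSERT INTO keywords VALUES
    ('football', 1), ('f1', 1), ('nba', 1), ('tennis', 1), ('film', 2);
INSERT INTO articles VALUES
    (1, 'Final', 'Federer wins the tennis final'),
    (2, 'Box office', 'New film breaks records'),
    (3, 'Crossover', 'NBA star appears in a new film');
""")

# One statement per keyword, with the keyword bound as a constant parameter.
for kword, dir_id in conn.execute("SELECT kword, dir_id FROM keywords").fetchall():
    conn.execute(
        "INSERT OR IGNORE INTO article_directories (art_id, dir_id) "
        "SELECT art_id, ? FROM articles "
        "WHERE instr(lower(content), lower(?)) > 0",
        (dir_id, kword),
    )
conn.commit()

assignments = conn.execute(
    "SELECT art_id, dir_id FROM article_directories ORDER BY art_id, dir_id"
).fetchall()
```

The composite primary key on article_directories means an article matched by several keywords of the same directory is still filed there only once.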

Rating System in PHP and MySQL

If we look at the Stack Overflow website, we have votes. The question is: what is the best way to store who has voted and who has not? Let's simplify this even further and say that we can only vote Up, and we can only remove the Up vote.
I was thinking of having the table in this form:
question - Id(INT) | userId(INT) | title(TEXT) | vote(INT) | ratedBy(TEXT)
The rest is self-explanatory, but ratedBy is a comma-separated list of the ids of the users who voted.
I was thinking of reading ratedBy and comparing it with the userId of the currently logged-in user. If he doesn't exist in ratedBy, he can vote Up; otherwise he can remove his vote, which in turn removes his id from ratedBy.

I think making a separate "vote" table is better. The relationship between users and questions is n to n, therefore a new table should be created. It should be something like this:
question id (int) | user id (int) | permanent (bool) | timestamp (datetime)
Permanent field can be used to make votes stay after a given time, as SO does.
Other fields may be added according to desired features.
As each row will take at least 16 bytes, you can have up to 250M rows before the table reaches 4 GB (the FAT32 file size limit, which matters if each table is stored in its own file, as with MyISAM, or InnoDB with innodb_file_per_table enabled).
Also, as Matthew Scharley points out in a comment, don't load all votes at once into memory (as fetching all the table in a resultset). You can always use LIMIT clause to narrow your query results.

A new table:
Article ID | User ID | Rating
Where Article ID and User ID make up the composite key, and rating would be 1, indicating upvote, -1 for a downvote and 0 for a removed vote (or just remove the row).
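With the composite key in place, the "vote up / remove vote" toggle from the question reduces to a delete-or-insert. The following is a minimal sketch with Python's sqlite3 standing in for MySQL; the table name votes and the helpers toggle_upvote and score are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE votes ("
    "article_id INTEGER NOT NULL, "
    "user_id INTEGER NOT NULL, "
    "rating INTEGER NOT NULL, "
    "PRIMARY KEY (article_id, user_id))"  # at most one vote per user per article
)

def toggle_upvote(conn, article_id, user_id):
    """Remove the user's vote if it exists, otherwise add an upvote.

    Returns True if the user has an upvote after the call.
    """
    cur = conn.execute(
        "DELETE FROM votes WHERE article_id = ? AND user_id = ?",
        (article_id, user_id),
    )
    if cur.rowcount == 0:
        # Nothing was deleted, so the user had not voted yet.
        conn.execute(
            "INSERT INTO votes (article_id, user_id, rating) VALUES (?, ?, 1)",
            (article_id, user_id),
        )
        voted = True
    else:
        voted = False
    conn.commit()
    return voted

def score(conn, article_id):
    """Sum of all ratings for an article (0 when nobody has voted)."""
    return conn.execute(
        "SELECT COALESCE(SUM(rating), 0) FROM votes WHERE article_id = ?",
        (article_id,),
    ).fetchone()[0]

first = toggle_upvote(conn, 7, 42)    # user 42 votes up
after_first = score(conn, 7)
second = toggle_upvote(conn, 7, 42)   # same user again: vote removed
after_second = score(conn, 7)
toggle_upvote(conn, 7, 42)
toggle_upvote(conn, 7, 43)
after_two_users = score(conn, 7)
```

No comma-separated ratedBy string has to be parsed: whether a user has voted is a single indexed lookup on the primary key.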

I believe your design won't be able to scale for large numbers of voters.
The typical thing to do is to create two tables:
Table 1: question - Id(INT) | userId(INT) | title(TEXT)
Table 2: question - ID(INT) | vote(INT) | ratedBy(TEXT)
Then you can count the votes with a query like this:
SELECT t1.question_id, t1.userId, t1.title, SUM(t2.vote) AS votes
FROM table1 t1
LEFT JOIN table2 t2 ON t1.question_id = t2.question_id
GROUP BY t1.question_id, t1.userId, t1.title
