MySQL relationship database design - php

I have seen a question on this forum that I can relate with, but I can't apply the answers to my question.
Here it goes:
I have a memberlist table (id, name, number) I'll just make the columns short.
Next, I have an events table (id, eventName, description)
Now,
1. Each member in the memberlist can join as many events as he wants.
2. Each event in the events table can have an unlimited number of members (okay, say 1k members, or whatever).
What I have now is an events table with a column named "joiners", which will contain the id of a certain joiner/member. But I believe I'm wrong, because how can a single event hold many joiners' ids?

I would rename memberlist to members to make your table naming more consistent. Or events to eventlist. Whichever you like more.
Then you want to define a many to many relation between members and events. This is done through an intermediate table which will reference both:
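create table eventmembers (
id int unsigned not null primary key auto_increment,
member_id int unsigned not null,
event_id int unsigned not null,
-- MySQL parses but ignores inline "references" on a column, so declare the keys explicitly
foreign key (member_id) references members(id),
foreign key (event_id) references events(id)
)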
I'm assuming that your members and events tables already have id fields which are set to be primary keys.
If you want to get all events attended by a specific user you can then do
select events.*
from events
left join eventmembers
on events.id = eventmembers.event_id
where
member_id = ?
and get all the members in an event:
select members.*
from members
left join eventmembers
on members.id = eventmembers.member_id
where
event_id = ?
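The two lookups above can be tried end to end with a small sketch. It assumes Python's built-in sqlite3 module in place of MySQL purely so that it runs without a server; the sample rows and the helper names events_for_member and members_for_event are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE events  (id INTEGER PRIMARY KEY, eventName TEXT NOT NULL);
    CREATE TABLE eventmembers (
        id        INTEGER PRIMARY KEY,
        member_id INTEGER NOT NULL REFERENCES members(id),
        event_id  INTEGER NOT NULL REFERENCES events(id)
    );
    -- Illustrative sample data
    INSERT INTO members (id, name) VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO events (id, eventName) VALUES (10, 'Picnic'), (20, 'Hackathon');
    INSERT INTO eventmembers (member_id, event_id) VALUES (1, 10), (1, 20), (2, 10);
""")


def events_for_member(member_id):
    """Names of all events that the given member has joined."""
    rows = conn.execute(
        "SELECT events.eventName FROM events "
        "JOIN eventmembers ON events.id = eventmembers.event_id "
        "WHERE eventmembers.member_id = ? ORDER BY events.id",
        (member_id,),
    )
    return [name for (name,) in rows]


def members_for_event(event_id):
    """Names of all members who joined the given event."""
    rows = conn.execute(
        "SELECT members.name FROM members "
        "JOIN eventmembers ON members.id = eventmembers.member_id "
        "WHERE eventmembers.event_id = ? ORDER BY members.id",
        (event_id,),
    )
    return [name for (name,) in rows]
```

Because the WHERE clause filters on a column of eventmembers, a plain join returns the same rows as the left join used above.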

You'd want a third table called events_memberlist :
events_memberlist
- memberlistId
- eventId
This would allow you to maintain a many-to-many relationship between the two tables.

You need a third table; for fun we will call it EventMemberTable. Columns:
event id | member id
It links the appropriate member to the appropriate event, keeping your other tables clear of redundant data.

You can achieve this by having a middle table (between members, and events). Middle tables are necessary in situations where a 'many-to-many' relationship between two tables is required.
In the middle table, you will include the primary keys of both tables as foreign keys, on the same row, when a member has joined an event. The foreign keys effectively create the relationship between one member, and one event. The table however, can have thousands of these entries.
I hope that helps.
P.S. Maybe post some syntax next time.
Cheers

When you use a middle table as already mentioned like:
CREATE TABLE event_members (
member_id INT UNSIGNED NOT NULL,
event_id INT UNSIGNED NOT NULL,
FOREIGN KEY (member_id) REFERENCES members(id),
FOREIGN KEY (event_id) REFERENCES events(id)
)
you should also set up a unique index to prevent multiple entries for the same member/event combination like
ALTER TABLE event_members ADD UNIQUE INDEX uniq_event_members_idx (member_id, event_id);
Otherwise you might end up with loads of duplicates.
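To see the unique index at work, here is a minimal sketch. It assumes Python's sqlite3 module instead of MySQL so that it is self-contained, and join_event is a hypothetical helper name; in MySQL the duplicate insert would fail with a duplicate-key error in the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
    CREATE TABLE event_members (
        member_id INTEGER NOT NULL,
        event_id  INTEGER NOT NULL
    );
    CREATE UNIQUE INDEX uniq_event_members_idx
        ON event_members (member_id, event_id);
""")


def join_event(member_id, event_id):
    """Return True if the pair was stored, False if it already existed."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO event_members (member_id, event_id) VALUES (?, ?)",
                (member_id, event_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The unique index rejected a second row for the same member/event.
        return False


first = join_event(1, 10)
second = join_event(1, 10)  # same member, same event
row_count = conn.execute("SELECT COUNT(*) FROM event_members").fetchone()[0]
```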

Related

Database schema for Books, Authors, Publishers and Users with bookshelves

I am unable to figure out an efficient way to establish relationships between tables. I want to have a database of books, authors, publishers and the users that sign-up and have their bookshelves (Read, Currently Reading, Want to Read (or Plan to Read)). I want the users to be able to select which books they've read, want to read or are currently reading.
P.s. I am aware of PK and FK in database table relations.
Edit: maybe this is a better way of doing it:
Then I shall use "Status" = (Read, Plan to Read and Currently Reading) - please tell me if this is good and efficient!
You'll need a N:M link between books and authors, since a book might have multiple authors and each author might have written more than one book. In a RDBMS that means you'll need a written_by table.
The link between books and publishers however is different. Any given book can only have one publisher (unless in your system different editions of a book are considered the same book). So all you need here is a publisher_id foreign key in books
Lastly, and most importantly you're looking at the readers / users. And their relation to books. Naturally, this is also a N:M relation. I sure hope that people read more than one book (we all know what happens if you only ever read one...) and surely a book is read by more than one person. That calls for a book_users connection table. The real question here is, how to design it. There are three basic designs.
1. Separate tables by type of relation (as outlined by @just_somebody). Advantages: You only have INSERTS and DELETES, never UPDATES. While this looks kind of neat, and somewhat helps with query optimization, most of the time it serves no actual purpose other than showing off a big database chart.
2. One table with a status indicator (as outlined by @Hardcoded). Advantages: You only have one table. Disadvantages: You'll have INSERTS, UPDATES and DELETES - something an RDBMS can easily handle, but which has its flaws for various reasons (more on that later). Also, a single status field implies that one reader can have only one connection to the book at any time, meaning he could only be in the plan_to_read, is_reading or has_read status at any point in time, and it assumes that these happen in a fixed order. If that person ever plans to read it again, or pauses and then rereads from the beginning, such a simple series of status indicators can easily fail, because all of a sudden that person is_reading now, but also has_read the thing. For most applications this still is a reasonable approach, and there are usually ways to design status fields so they are mutually exclusive.
3. A log. You INSERT every status as a new row in a table - the same combination of book and reader will appear more than once. You INSERT the first row with plan_to_read, and a timestamp. Another one with is_reading. Then another one with has_read. Advantages: You will only ever have to INSERT rows, and you get a neat chronology of things that happened. Disadvantages: Cross-table joins now have to deal with a lot more data (and be more complex) than in the simpler approaches above.
You may ask yourself, why is there the emphasis on whether you INSERT, UPDATE or DELETE in what scenario? In short, whenever you run an UPDATE or DELETE statement you are very likely to in fact lose data. At that point you need to stop in your design process and think "What is it I am losing here?" In this case, you lose the chronologic order of events. If what users are doing with their books is the center of your application, you might very well want to gather as much data as you can. Even if it doesn't matter right now, that is the type of data which might allow you to do "magic" later on. You could find out how fast somebody is reading, how many attempts they need to finish a book, etc. All that without asking the user for any extra input.
So, my final answer is actually a question:
Would it be helpful to tell someone how many books they read last year?
Edit
Since it might not be clear what a log would look like, and how it would function, here's an example of such a table:
CREATE TABLE users_reading_log (
user_id INT,
book_id INT,
status ENUM('plans_to_read', 'is_reading', 'has_read'),
ts TIMESTAMP DEFAULT NOW()
)
Now, instead of updating the "user_read" table in your designed schema whenever the status of a book changes you now INSERT that same data in the log which now fills with a chronology of information:
INSERT INTO users_reading_log SET
user_id=1,
book_id=1,
status='plans_to_read';
When that person actually starts reading, you do another insert:
INSERT INTO users_reading_log SET
user_id=1,
book_id=1,
status='is_reading';
and so on. Now you have a database of "events" and since the timestamp column automatically fills itself, you can now tell what happened when. Please note that this system does not ensure that only one 'is_reading' for a specific user-book pair exists. Somebody might stop reading and later continue. Your joins will have to account for that.
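As a rough illustration of what "your joins will have to account for that" can mean, the sketch below reads the current status and a per-year count back out of such a log. It assumes Python's sqlite3 module rather than MySQL (so ENUM becomes a CHECK constraint), uses explicit timestamps to keep the sample data predictable, and the helper names current_status and books_finished_in are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute("""
    CREATE TABLE users_reading_log (
        user_id INTEGER NOT NULL,
        book_id INTEGER NOT NULL,
        status  TEXT NOT NULL
                CHECK (status IN ('plans_to_read', 'is_reading', 'has_read')),
        ts      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    )
""")
# Illustrative history: book 1 was finished, book 2 is still being read.
conn.executemany(
    "INSERT INTO users_reading_log (user_id, book_id, status, ts) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, 1, "plans_to_read", "2023-01-05 09:00:00"),
        (1, 1, "is_reading", "2023-02-01 20:00:00"),
        (1, 1, "has_read", "2023-03-10 22:30:00"),
        (1, 2, "plans_to_read", "2023-04-01 08:00:00"),
        (1, 2, "is_reading", "2024-01-15 19:00:00"),
    ],
)


def current_status(user_id, book_id):
    """Most recent status logged for this user/book pair, or None."""
    row = conn.execute(
        "SELECT status FROM users_reading_log "
        "WHERE user_id = ? AND book_id = ? "
        "ORDER BY ts DESC, rowid DESC LIMIT 1",
        (user_id, book_id),
    ).fetchone()
    return row[0] if row else None


def books_finished_in(user_id, year):
    """Number of distinct books with a 'has_read' entry in the given year."""
    row = conn.execute(
        "SELECT COUNT(DISTINCT book_id) FROM users_reading_log "
        "WHERE user_id = ? AND status = 'has_read' "
        "AND strftime('%Y', ts) = ?",
        (user_id, str(year)),
    ).fetchone()
    return row[0]
```

The second helper is one way to answer the "how many books did they read last year" question without ever updating or deleting a row.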
a database table is a mathematical relation, in other words a predicate and a set of tuples ("rows") for which that predicate is true. that means each "row" in a "table" is a (true) proposition.
this may all look scary but the basic principles are really simple and worth knowing and applying rigorously: you'll better know what you're doing.
relations are simple if you start small, with the binary relation. for example, there's a binary relation > (greater than) on the set of all integers which "contains" all ordered pairs of integers x, y for which the predicate x > y holds true. note: you would not want to materialize this specific relation as a database table. :)
you want Books, Authors, Publishers and Users with their bookshelves (Read, Currently Reading, Want to Read). what are the predicates in that? "user U has read book B", "user U is reading book B", "user U wants to read book B" would be some of them; "book B has ISBN# I, title T, author A" would be another, but some books have multiple authors. in that case, you'll do well to split it out into a separate predicate: "book B was written by author A".
CREATE TABLE book (
id INT NOT NULL PRIMARY KEY
);
CREATE TABLE author (
id INT NOT NULL PRIMARY KEY
, name TEXT NOT NULL
);
CREATE TABLE written_by (
book INT NOT NULL REFERENCES book (id)
, author INT NOT NULL REFERENCES author (id)
);
CREATE TABLE reader (
id INT NOT NULL PRIMARY KEY
);
CREATE TABLE has_read (
reader INT NOT NULL REFERENCES reader (id)
, book INT NOT NULL REFERENCES book (id)
);
CREATE TABLE is_reading (
reader INT NOT NULL REFERENCES reader (id)
, book INT NOT NULL REFERENCES book (id)
);
CREATE TABLE plans_reading (
reader INT NOT NULL REFERENCES reader (id)
, book INT NOT NULL REFERENCES book (id)
);
etc etc.
edit: C. J. Date's Introduction to Database Systems
If I were you, I'd use a schema much like the following:
TABLE user
-- Stores user's basic info.
( user_id INTEGER PRIMARY KEY
, username VARCHAR(50) NOT NULL
, password VARCHAR(50) NOT NULL
, ...
, ...
, ...
);
TABLE author
-- Stores author's basic info
( author_id INTEGER PRIMARY KEY
, author_name VARCHAR(50)
, date_of_birth DATE
, ...
, ...
, ...
);
TABLE publisher
-- Stores publisher's basic info
( publisher_id INTEGER PRIMARY KEY
, publisher_name VARCHAR(50)
, ...
, ...
, ...
);
TABLE book
-- Stores book info
( book_id INTEGER PRIMARY KEY
, title VARCHAR(50) NOT NULL
, author_id INTEGER NOT NULL
, publisher_id INTEGER NOT NULL
, published_dt DATE
, ...
, ...
, ...
, FOREIGN KEY (author_id) REFERENCES author(author_id)
, FOREIGN KEY (publisher_id) REFERENCES publisher(publisher_id)
);
TABLE common_lookup
-- This table stores common values that are used in various select lists.
-- The first three values are going to be
-- a - Read
-- b - Currently reading
-- c - Want to read
( element_id INTEGER PRIMARY KEY
, element_value VARCHAR(2000) NOT NULL
);
TABLE user_books
-- This table contains which user has read / is reading / want to read which book
-- There is a many-to-many relationship between users and books.
-- One user may read many books and one single book can be read by many users.
-- Hence we use this table to maintain that information.
( user_id INTEGER NOT NULL
, book_id INTEGER NOT NULL
, status_id INTEGER NOT NULL
, ...
, ...
, ...
, FOREIGN KEY (user_id) REFERENCES user(user_id)
, FOREIGN KEY (book_id) REFERENCES book(book_id)
, FOREIGN KEY (status_id) REFERENCES common_lookup(element_id)
);
TABLE audit_entry_log
-- This is an audit entry log table where you can track changes and log them here.
( audit_entry_log_id INTEGER PRIMARY KEY
, audit_entry_type VARCHAR(10) NOT NULL
-- Stores the entry type or DML event - INSERT, UPDATE or DELETE.
, table_name VARCHAR(30)
-- Stores the name of the table which got changed
, column_name VARCHAR(30)
-- Stores the name of the column which was changed
, primary_key INTEGER
-- Stores the PK column value of the row which was changed.
-- This is to uniquely identify the row which has been changed.
, ts TIMESTAMP
-- Timestamp when the change was made.
, old_number NUMBER(36, 2)
-- If the changed field was a number, the old value should be stored here.
-- If it's an INSERT event, this would be null.
, new_number NUMBER(36,2)
-- If the changed field was a number, the new value in it should be stored here.
-- If it's a DELETE statement, this would be null.
, old_text VARCHAR(2000)
-- Similar to old_number but for a text/varchar field.
, new_text VARCHAR(2000)
-- Similar to new_number but for a text/varchar field.
, old_date DATE
-- Similar to old_number but for a date field.
, new_date DATE
-- Similar to new_number but for a date field.
, ...
, ... -- Any other data types you wish to include.
, ...
);
I would then create triggers on a few tables that would track changes and enter data in the audit_entry_log table.
First of all, create 4 tables for books, authors, publishers and users. Then:
create a table books_authors which relates table books and table authors.
create a table books_publishers which relates table books and table publishers.
create a table books_user which relates table books and table users. In this table, also use a flag to show whether the user has Read, is Currently Reading, or Wants to Read (or Plans to Read) the book.
This is just markup try it
I would have a Books table, containing: title, author, publisher, isbn. A Book_Statuses table, containing an id (PK) and a status (Read, Reading, etc..). A third table for user_books, in which there would be a fk_book_id related with the Books table, and a fk_status_id which would be linked to the Book_Statuses table.
All this together gives you an easily accessible data structure.
This is assuming I understand your question. If you want to have tables for authors, publishers and books. I'd need clarification on your needs.
Your answer is the best way to do this. For example, suppose that you have books and categories tables, and a book can fit more than one category. The best way to keep this data is to create a third table that holds the book-category relations. Otherwise you have to create a column for every category:
ID name comedy adventure etc
5 BookName yes no no
Like this. This is the worst thing to do, believe me. Your solution is the best way to do it.
And do be aware of PK & FK in database table relations. If you use them well, it will be faster and safer than doing their work manually.

Inserting "large" amounts of data into MySQL and the benefits of using a foreign key

I'm not sure how to store or insert this data. I am using PHP and MySQL.
Let's say we're trying to keep track of people who enter marathons (like jogging or whatever). So far, I have a Person table that has all my person information. Each person happens to be associated with a unique varchar(40) key. There is a table for the marathon information (Marathon). I receive the person data in a CSV that has about 130,000 rows and import that into the database.
So - now the question is... how do I deal with that association between Person and Marathon? For each Marathon, I get a huge list of participants (by that unique varchar key) that I need to import. So... If I go the foreign key route, it seems like the insert would be very heavy and cumbersome to look up the appropriate foreign key for the person. I'm not even sure how I would write that insert... I guess it would look like this:
insert into person_marathon
select p.person_id, m.marathon_id
from ( select 'person_a' as p_name, 'marathon_a' as m_name union
select 'person_b' as p_name, 'marathon_a' as m_name )
as imported_marathon_person_list
join person p
on p.person_name = imported_marathon_person_list.p_name
join marathon m
on m.marathon_name = imported_marathon_person_list.m_name
There are not a lot of marathons to deal with at one time. There are a lot of people, though.
--> Should I even give the person an ID and require all the foreign keys? or just use the unique varchar(40) as the true table key? But then I would have to join tables on a varchar and that's bad. A marathon can have anywhere from 1k to 30k participants.
--> Or, I could select the person info and the marathon info from the database and join it with the marathon_person data in PHP before I send it over to MySQL.
--> Or, I guess, maybe make a temporary table, then join in the db, then insert (through PHP)? It's been already strongly suggested that I do not use temporary tables ever (this is a work thing and this isn't my database).
Edit: I am not sure on what schema to use because I'm not sure if I should be using foreign keys or not (purpose of this whole post is to answer that question) but the basic design would be something like...
create table person (
person_id int unsigned auto_increment,
person_key varchar(40) not null,
primary key (person_id),
constraint uc_person_key unique (person_key)
);
create table marathon (
marathon_id int unsigned auto_increment,
marathon_name varchar(60) not null,
primary key (marathon_id)
);
create table person_marathon (
person_marathon_id int unsigned auto_increment,
person_id int unsigned,
marathon_id int unsigned,
primary key (person_marathon_id),
constraint uc_person_marathon unique (person_id, marathon_id),
foreign key (person_id) references person (person_id),
foreign key (marathon_id) references marathon (marathon_id)
);
I'm going to repeat the actual question really quick.... If I choose to use a foreign key for person, how do I import all the person_marathon data with the person_id in an efficient way? The insert statement I included above is my best guess....
The person data comes in a CSV of about 130,000 rows so that is a straight import into the person table. The person data comes with a unique varchar(40) for each person.
The person_marathon data comes in a CSV for each marathon, as a list of 1,000 to 30,000 unique varchar(40)'s that represent each person who participated in that marathon.
Summary: I am using PHP. So what is the best way to write the insert/import of the person_marathon data if I am using foreign keys? Would I have to do it like the insert statement above or is there a better way?
This is a many-to-many relationship, one person can enter many marathons, one marathon can be entered by many persons. You need additional table in your data model to track this relation, for example:
CREATE TABLE persons_marathons(
personID int,
marathonID int,
FOREIGN KEY (personID) REFERENCES Persons(P_Id),
FOREIGN KEY (marathonID) REFERENCES Marathons(M_Id)
)
This table uses foreign key constraints. A foreign key constraint prevents bad data from being inserted (for example, you cannot insert a row with personID = 123 when there is no such id in the Persons table). It also prevents deletes that would break a link between tables (for example, you cannot delete person X while a record with that personID exists in the persons_marathons table).
If this table contains the following rows:
personID | MarathonID
----------+-----------
2 | 3
3 | 3
2 | 8
3 | 8
it means that persons 2 and 3 both entered marathons 3 and 8
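One possible way to fill such a link table from a per-marathon list of person keys is to load the keys into a staging table and let the database resolve the ids with a join, much like the INSERT ... SELECT in the question. The sketch below is only that, a sketch: it uses Python's sqlite3 module instead of PHP and MySQL so that it runs on its own, marathon_import is a hypothetical ordinary (non-temporary) staging table, and import_participants is an invented helper. In MySQL the closest counterparts would be INSERT IGNORE and a bulk load such as LOAD DATA INFILE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
    CREATE TABLE person (
        person_id  INTEGER PRIMARY KEY,
        person_key TEXT NOT NULL UNIQUE
    );
    CREATE TABLE marathon (
        marathon_id   INTEGER PRIMARY KEY,
        marathon_name TEXT NOT NULL
    );
    CREATE TABLE person_marathon (
        person_id   INTEGER NOT NULL REFERENCES person(person_id),
        marathon_id INTEGER NOT NULL REFERENCES marathon(marathon_id),
        UNIQUE (person_id, marathon_id)
    );
    -- Hypothetical staging table holding one marathon's CSV at a time
    CREATE TABLE marathon_import (person_key TEXT NOT NULL);

    INSERT INTO person (person_key) VALUES ('key_a'), ('key_b'), ('key_c');
    INSERT INTO marathon (marathon_id, marathon_name) VALUES (1, 'marathon_a');
""")


def import_participants(marathon_id, person_keys):
    """Link every known person key in the list to the given marathon."""
    with conn:  # one transaction for the whole import
        conn.execute("DELETE FROM marathon_import")
        conn.executemany(
            "INSERT INTO marathon_import (person_key) VALUES (?)",
            [(key,) for key in person_keys],
        )
        # The join turns keys into ids; unknown keys simply produce no row,
        # and OR IGNORE skips pairs that already exist.
        conn.execute(
            "INSERT OR IGNORE INTO person_marathon (person_id, marathon_id) "
            "SELECT p.person_id, ? FROM marathon_import i "
            "JOIN person p ON p.person_key = i.person_key",
            (marathon_id,),
        )
    return conn.execute(
        "SELECT person_id, marathon_id FROM person_marathon "
        "WHERE marathon_id = ? ORDER BY person_id",
        (marathon_id,),
    ).fetchall()


links = import_participants(1, ["key_a", "key_c", "key_unknown", "key_a"])
```

The join on the varchar key happens once per import and is backed by the unique index on person_key; afterwards every other query can join on the integer ids.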

New table or field with array in field (php/mysql)

I need to store multiple id's in either a field in the table or add another table to store the id's in.
Each member will basically have favourite articles. Each article has an id which is stored when the user clicks on a Add to favourites button.
My question is:
Do I create a field and in this field add the multiple id's or do I create a table to add those id's?
What is the best way to do this?
This is a many-to-many relationship, you need an additional table storing pairs of user_id and article_id (primary keys of user and article tables, respectively).
You should create a new table instead of having comma-separated values in a single column.
Keep your database normalized.
You create a separate table, this is how things work in a relational database. The other solution (comma separated list of ids in one column) will lead to an unmaintainable database. For example, what if you want to know how many times an article was favorited? You cannot write queries on a column like this.
Your table will need to store the user's id and the article's id - these refer to the primary keys of the corresponding tables. For querying, you can either use JOINs or nested SELECT queries.
As lafor already pointed out this is a many-to-many relationship and you'll end up with three tables: user, article, and favorite:
CREATE TABLE user(
id INT NOT NULL,
...
PRIMARY KEY (id)
) ENGINE=INNODB;
CREATE TABLE article (
id INT NOT NULL,
...
PRIMARY KEY (id)
) ENGINE=INNODB;
CREATE TABLE favorite (
userID INT NOT NULL,
articleID INT NOT NULL,
FOREIGN KEY (userID) REFERENCES user(id) ON DELETE CASCADE,
FOREIGN KEY (articleID) REFERENCES article(id) ON DELETE CASCADE,
PRIMARY KEY (userID, articleID)
) ENGINE=INNODB;
If you then want to select all user's favorite articles you use a JOIN:
SELECT * FROM favorite f JOIN article a ON f.articleID = a.id WHERE f.userID = ?
If you want to know why you should use this schema, I recommend reading about database normalization. With multiple IDs in a single field you would even violate the first normal form and thus land in a world of pain...
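The "how many times was an article favorited" question mentioned earlier becomes a single GROUP BY once the link table exists. Here is a minimal sketch, assuming Python's sqlite3 module in place of MySQL/InnoDB, a made-up title column on article, and an invented helper favorite_counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute("PRAGMA foreign_keys = ON")  # needed for ON DELETE CASCADE
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY);
    CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE favorite (
        userID    INTEGER NOT NULL REFERENCES user(id) ON DELETE CASCADE,
        articleID INTEGER NOT NULL REFERENCES article(id) ON DELETE CASCADE,
        PRIMARY KEY (userID, articleID)
    );
    -- Illustrative sample data
    INSERT INTO user (id) VALUES (1), (2);
    INSERT INTO article (id, title) VALUES (100, 'Joins'), (200, 'Indexes');
    INSERT INTO favorite (userID, articleID) VALUES (1, 100), (2, 100), (2, 200);
""")


def favorite_counts():
    """(title, number of users who favorited it) for every article."""
    return conn.execute(
        "SELECT a.title, COUNT(f.userID) FROM article a "
        "LEFT JOIN favorite f ON f.articleID = a.id "
        "GROUP BY a.id ORDER BY a.id"
    ).fetchall()


counts_before = favorite_counts()
with conn:
    # Deleting a user also removes that user's favorites via the cascade.
    conn.execute("DELETE FROM user WHERE id = 2")
counts_after = favorite_counts()
```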

MySQL auto-increment between tables

In MySQL, is it possible to have a column in two different tables that auto-increment? Example: table1 has a column of 'secondaryid' and table2 also has a column of 'secondaryid'. Is it possible to have table1.secondaryid and table2.secondaryid hold the same information? Like table1.secondaryid could hold values 1, 2, 4, 6, 7, 8, etc and table2.secondaryid could hold values 3, 5, 9, 10? The reason for this is twofold: 1) the two tables will be referenced in a separate table of 'likes' (similar to users liking a page on facebook) and 2) the data in table2 is a subset of table1 using a primary key. So the information housed in table2 is dependent on table1 as they are the topics of different categories. (categories being table1 and topics being table2). Is it possible to do something described above or is there some other structural workaround that I'm not aware of?
It seems you want to differentiate categories and topics in two separate tables, but have the ids of both of them be referenced in another table likes to facilitate users liking either a category or a topic.
What you can do is create a super-entity table with subtypes categories and topics. The auto-incremented key would be generated in the super-entity table and inserted into only one of the two subtype tables (based on whether it's a category or a topic).
The subtype tables reference this super-entity via the auto-incremented field in a 1:1 relationship.
This way, you can simply link the super-entity table to the likes table just based on one column (which can represent either a category or a topic), and no id in the subtype tables will be present in both.
Here is a simplified example of how you can model this out:
This model would allow you to maintain the relationship between categories and topics, but having both entities generalized in the superentity table.
Another advantage to this model is you can abstract out common fields in the subtype tables into the superentity table. Say for example that categories and topics both contained the fields title and url: you could put these fields in the superentity table because they are common attributes of its subtypes. Only put fields which are specific to the subtype tables IN the subtype tables.
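A rough sketch of the super-entity idea follows, assuming Python's sqlite3 module in place of MySQL. The table name superentity and the shared title field come from the description above; the category_id column, the user_id column in likes, and the helpers add_category and add_topic are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- The only table that generates ids; shared fields live here.
    CREATE TABLE superentity (
        id    INTEGER PRIMARY KEY AUTOINCREMENT,
        title TEXT NOT NULL
    );
    -- Subtypes reuse the super-entity id as their own primary key (1:1).
    CREATE TABLE categories (
        id INTEGER PRIMARY KEY REFERENCES superentity(id)
    );
    CREATE TABLE topics (
        id          INTEGER PRIMARY KEY REFERENCES superentity(id),
        category_id INTEGER NOT NULL REFERENCES categories(id)
    );
    -- Likes point at the super-entity, so one column covers both subtypes.
    CREATE TABLE likes (
        user_id   INTEGER NOT NULL,
        entity_id INTEGER NOT NULL REFERENCES superentity(id),
        PRIMARY KEY (user_id, entity_id)
    );
""")


def add_category(title):
    with conn:
        new_id = conn.execute(
            "INSERT INTO superentity (title) VALUES (?)", (title,)
        ).lastrowid
        conn.execute("INSERT INTO categories (id) VALUES (?)", (new_id,))
    return new_id


def add_topic(title, category_id):
    with conn:
        new_id = conn.execute(
            "INSERT INTO superentity (title) VALUES (?)", (title,)
        ).lastrowid
        conn.execute(
            "INSERT INTO topics (id, category_id) VALUES (?, ?)",
            (new_id, category_id),
        )
    return new_id


databases = add_category("Databases")
mysql = add_topic("MySQL", databases)
languages = add_category("Languages")

with conn:
    conn.executemany(
        "INSERT INTO likes (user_id, entity_id) VALUES (?, ?)",
        [(7, databases), (7, mysql)],
    )
liked_titles = [
    title
    for (title,) in conn.execute(
        "SELECT s.title FROM likes l JOIN superentity s ON s.id = l.entity_id "
        "WHERE l.user_id = ? ORDER BY s.id",
        (7,),
    )
]
```

Since every id is handed out by superentity, a category and a topic can never end up with the same id.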
If you just want the ID's in the two tables to be different you can initially set table2's AUTO_INCREMENT to some big number.
ALTER TABLE `table2` AUTO_INCREMENT=1000000000;
You can't have an auto_increment value shared between tables, but you can make it appear that it is:
set @@auto_increment_increment=2; -- change auto-increment to increase by 2
create table evens (
id int auto_increment primary key
);
alter table evens auto_increment = 0;
create table odds (
id int auto_increment primary key
);
alter table odds auto_increment = 1;
The downside to this is that you're changing a global setting, so ALL auto_inc fields will now be growing by 2 instead of 1.
It sounds like you want a MySQL equivalent of sequences, which can be found in DBMSs like PostgreSQL. There are a few known recipes for this, most of which involve creating table(s) that track the name of the sequence and an integer field that keeps the current value. This approach allows you to query the table that contains the sequence and use that on one or more tables, if necessary.
There's a post here that has an interesting approach on this problem. I have also seen this approach used in the DB PEAR module that's now obsolete.
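The sequence-table recipe described above might look roughly like this. The sketch assumes Python's sqlite3 module in place of MySQL, and the names sequences, current_value, next_value and insert_row are all hypothetical; it is not the approach from the linked post or from PEAR DB.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
    -- One row per named sequence, holding the last value handed out.
    CREATE TABLE sequences (
        name          TEXT PRIMARY KEY,
        current_value INTEGER NOT NULL
    );
    INSERT INTO sequences (name, current_value) VALUES ('secondaryid', 0);

    CREATE TABLE table1 (id INTEGER PRIMARY KEY, secondaryid INTEGER NOT NULL UNIQUE);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, secondaryid INTEGER NOT NULL UNIQUE);
""")


def next_value(name):
    """Advance the named sequence and return its new value.

    Must be called inside the caller's transaction so that the increment
    and the insert that uses it succeed or fail together.
    """
    conn.execute(
        "UPDATE sequences SET current_value = current_value + 1 WHERE name = ?",
        (name,),
    )
    return conn.execute(
        "SELECT current_value FROM sequences WHERE name = ?", (name,)
    ).fetchone()[0]


def insert_row(table):
    """Insert a row into table1 or table2 using the shared sequence."""
    if table not in ("table1", "table2"):
        raise ValueError("unknown table: " + table)
    with conn:
        value = next_value("secondaryid")
        conn.execute(f"INSERT INTO {table} (secondaryid) VALUES (?)", (value,))
    return value


ids = [insert_row(t) for t in ("table1", "table1", "table2", "table1", "table2")]
```

The two tables end up with interleaved, non-overlapping secondaryid values, which is the pattern the question asks for.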
You need to set the other table's increment value manually either by the client or inside mysql via an sql function:
ALTER TABLE users AUTO_INCREMENT = 3
So after inserting into table1 you get back the last auto increment then modify the other table's auto increment field by that.
I'm confused by your question. If table 2 is a subset of table 3, why would you have it share the primary key values. Do you mean that the categories are split between table 2 and table 3?
If so, I would question the design choice of putting them into separate tables. It sounds like you have one of two different situations. The first is that you have a "category" entity that comes in two flavors. In this case, you should have a single category table, perhaps with a type column that specifies the type of category.
The second is that your users can "like" things that are different. In this case, the "user likes" table should have a separate foreign key for each object. You could pull off a trick using a composite foreign key, where you have the type of object and a regular numeric id afterwards. So, the like table would have "type" and "id". The person table would have a column filled with "PERSON" and another with the numeric id. And the join would say "on a.type = b.type and a.id = b.id". (Or the part on the "type" could be implicit, in the choice of the table).
You could do it with triggers:
-- see http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_last-insert-id
CREATE TABLE sequence (id INT NOT NULL);
INSERT INTO sequence VALUES (0);
CREATE TABLE table1 (
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
secondardid INT UNSIGNED NOT NULL DEFAULT 0,
PRIMARY KEY (id)
);
CREATE TABLE table2 (
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
secondardid INT UNSIGNED NOT NULL DEFAULT 0,
PRIMARY KEY (id)
);
DROP TRIGGER IF EXISTS table1_before_insert;
DROP TRIGGER IF EXISTS table2_before_insert;
DELIMITER //
CREATE
TRIGGER table1_before_insert
BEFORE INSERT ON
table1
FOR EACH ROW
BEGIN
UPDATE sequence SET id=LAST_INSERT_ID(id+1);
SET NEW.secondardid = LAST_INSERT_ID();
END;
//
CREATE
TRIGGER table2_before_insert
BEFORE INSERT ON
table2
FOR EACH ROW
BEGIN
UPDATE sequence SET id=LAST_INSERT_ID(id+1);
SET NEW.secondardid = LAST_INSERT_ID();
END;
//
DELIMITER ;

creating a join table problem!

I have 3 tables
customer, menu, and order.
The order table is suppose to join the customer and menu tables, and contains the primary keys of both. Here's how I tried to create the order table on phpmyadmin.
create table order(
customerID int not null,
itemID int not null,
primary key (customerID, itemID),
foreign key(customerID) reference customer(ID),
foreign key(itemID) reference menu(itemID)
)
This doesn't work. What am I doing wrong?!!
order is a reserved word; try another name, or quote it. Note also that the keyword is references, not reference:
create table `order`(
customerID int not null,
itemID int not null,
primary key (customerID, itemID),
foreign key(customerID) references customer(ID),
foreign key(itemID) references menu(itemID) )
It is complaining because order is a reserved keyword. Wrapping it in backticks, as @TokenMacGuy suggests, solves your problem. Here is a list of them
Furthermore, as a general rule, you can transform your entities like so to avoid problems, especially with reserved keywords:-
a) The Entity is always modeled (on paper) as singular as it represents a concept/asset/person in the real world or problem domain. eg. ORDER, CHECK, STUDENT, CAR
b) the corresponding DB Table it is transformed into can always be named using plural. The logic is that the table will contain lots of instances of that Entity. Therefore ORDERS, CHECKS, STUDENTS, CARS
