I'm fairly new to MySQL and I need help with a relatively basic question.
Say I have an auto-increment table that lists individual people by row. I include all of the basic information about each person, such as name, age, race, etc., in the columns. But say I want to include lists of the people's friends as well. Since these lists would be dynamic, and to my knowledge you cannot have two auto-increment variables in a single table, it would not be possible to include the friends lists in that specific table, as there are no such things as sub-tables or anything of the sort in MySQL (again, to the best of my knowledge). If you wanted dynamic friends lists, you would have to make a new table solely dedicated to that purpose.
Am I right in this thinking? Or am I missing something?
Here is my current general idea (which I rather dislike):
table people_list {
person_id (auto-increment)
name
age
race
...
}
table friends_lists {
friendship_id (auto-increment)
person_id1
person_id2
}
Note that I just made up the syntax in essence of MySQL for demonstration.
Is there any better way?
Your approach is correct. There is no way to do this other than with an auxiliary table (friends_lists in your scenario). That is how one achieves a "many-to-many" relationship between two tables.
In your case, the two tables are the same (people_list), but conceptually they can be thought of as "friends" and "people".
But may I give you a few hints about this approach?
1 - Every table is, in a way, a "list", so why the suffix "_list"? Drop it, for the same reason we don't use plurals for table names: it's product, not product*s*. Of course there will be many ;)
2 - Instead of using an auto-increment id in friend, make person_id1 and person_id2 together the primary key. You get rid of a useless column, AND you enforce that each pair Person X - Friend Y is unique.
3 - Give person_id1 and person_id2 meaningful names based on context, like "person_id" and "friend_id".
To sum it up:
table person {
person_id (auto-increment, primary key)
name
age
race
...
}
table friend {
person_id (foreign key from person, primary key)
friend_id (foreign key from person, primary key)
}
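For what it's worth, here is a minimal sketch of the person/friend layout above as real DDL. It is only an illustration: it uses Python's built-in sqlite3 module with an in-memory SQLite database standing in for MySQL (my assumption, so the snippet runs anywhere), the column types are guesses, and the sample names are made up.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY AUTOINCREMENT,
        name      TEXT NOT NULL,
        age       INTEGER
    );
    CREATE TABLE friend (
        person_id INTEGER NOT NULL REFERENCES person(person_id),
        friend_id INTEGER NOT NULL REFERENCES person(person_id),
        PRIMARY KEY (person_id, friend_id)  -- composite key, no extra id column
    );
""")

# Hypothetical sample rows; they receive person_id 1, 2 and 3 in this order.
conn.executemany("INSERT INTO person (name, age) VALUES (?, ?)",
                 [("Ann", 30), ("Bob", 25), ("Cid", 41)])
conn.executemany("INSERT INTO friend (person_id, friend_id) VALUES (?, ?)",
                 [(1, 2), (1, 3)])

# The composite primary key rejects a repeated (person, friend) pair.
try:
    conn.execute("INSERT INTO friend (person_id, friend_id) VALUES (?, ?)", (1, 2))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# List the friends of person 1 by joining back to person.
friends_of_ann = [row[0] for row in conn.execute(
    "SELECT p.name FROM friend AS f "
    "JOIN person AS p ON p.person_id = f.friend_id "
    "WHERE f.person_id = ? ORDER BY p.name", (1,))]
```

Note that the key treats (1, 2) and (2, 1) as different pairs, so in this sketch a friendship is stored as a directed link; if you want it symmetric you have to decide on a convention yourself.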
Why not
table people_list {
person_id (auto-increment)
name
age
race
...
}
table person_friend {
person_id(of person)
person_id(of friend)
}
Take a look at this to understand better about one to many relationships.
I have multiple tables in a Laravel app with 1-to-1 relationship such as users , users_settings , user_financial
And some 1-to-many relationships such as users_histories
My questions are:
1. Should I always include incremental id at the first?
for example is the id necessary in the Table #2 below?
Table 1:
id (primary,increments) , name, email, password
Table 2:
id (primary,increments), user_id, something_extra
^ why does every guide include this? // e.g. https://appdividend.com/2017/10/12/laravel-one-to-one-eloquent-relationships/
Can't I just use user_id as primary key and skip the incremental key? because I want to auto insert it on table 2 as soon as data is inserted in table 1.
2. How should I name 1-to-1 and 1-to-many tables in Laravel?
I searched but didn't find any naming convention for different type of relationships...
Currently I do:
users table with primary key id is the base.
1-to-1: users_settings with foreign key user_id
1-to-many: users_histories foreign_key user_id
many-to-many: users_groups foreign_key user_id
Should the first two tables be named settings/setting and histories/history instead? Sorry, I'm a little confused here.
I actually asked a similar question around 2 days ago. It's up to you, but I'd say yes. In my case, if I don't auto_increment all the ids in the related tables, data won't be associated with the correct user. There is an argument that auto_increment columns should not be used in this case, though they are useful for other things. According to some, the relationships might not be as meaningful, so it comes down to the specifics of your tables. Either way is fine, but each has different advantages and disadvantages, so research both and compare them against your specific case before deciding.
This is a well-debated topic about primary keys. IMHO, no, you shouldn't. Every column in a database should have a purpose. For your example, I agree that the auto_increment id is redundant, simply because it doesn't have a purpose. Each row in the second table is still uniquely identified by the user, so the primary key should be user_id.
Besides the above, there is another principle I use to decide whether I need an auto_increment id: whether I can see the table as an entity. For example, user is clearly an entity, but a relationship (in most cases) is not, i.e., a composite key can serve the purpose. But when a relationship table is extended with more attributes, it starts to make sense for it to have an auto_increment id.
I don't have much experience with Laravel, but the naming of a database table should not be dictated by a framework. Comparing history and user_history, what would a new DBA or developer expect from the two names without looking at the data? user_history describes the table more precisely.
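To make the "user_id as primary key" option concrete, here is a rough sketch of the 1-to-1 layout from the question. It is not Laravel code: it uses Python's sqlite3 module with an in-memory SQLite database in place of MySQL, and the create_user helper, the "default" value and the sample user are all hypothetical names I made up for the illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption so the sketch is runnable).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY AUTOINCREMENT,
        name  TEXT NOT NULL,
        email TEXT NOT NULL
    );
    CREATE TABLE users_settings (
        -- No separate incremental id: the owning user's id is the key.
        user_id         INTEGER PRIMARY KEY REFERENCES users(id),
        something_extra TEXT
    );
""")

def create_user(name, email):
    """Hypothetical helper: insert a user and its settings row in one transaction."""
    with conn:  # commits on success, rolls back if either insert fails
        cur = conn.execute("INSERT INTO users (name, email) VALUES (?, ?)",
                           (name, email))
        user_id = cur.lastrowid
        conn.execute(
            "INSERT INTO users_settings (user_id, something_extra) VALUES (?, ?)",
            (user_id, "default"))
    return user_id

uid = create_user("Tom", "tom@example.com")

# A second settings row for the same user violates the primary key.
try:
    with conn:
        conn.execute(
            "INSERT INTO users_settings (user_id, something_extra) VALUES (?, ?)",
            (uid, "again"))
    second_row_allowed = True
except sqlite3.IntegrityError:
    second_row_allowed = False
```

Because user_id is the primary key, the database itself guarantees at most one settings row per user, which is exactly the 1-to-1 property the extra id column would not give you.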
I've just started exploring SQL databases, but I've run into an issue with how I store 'compound' structures in an existing table (if that's even the right way to go about it). For example, let's say that I have a database table with rows of users, where each user has a Unique ID, a hashed password, an email address, a phone number, etc.
Simple enough. But then I want to allow each user to create and store an array of posts. Each post would have a post id, content, date, and various other metadata. If this were C++, I would probably have an array/vector of Posts as a member of the User class, and then I'd store an array/vector of User objects somewhere. Is it possible to store a table within a table in SQL, so that each user has access to their own individual table of posts?
Or, would it be better to create two separate tables (a users table, and a posts table), using some common element (like user ID or user name) to retrieve user-specific data from the posts table, and vice-versa?
I'm trying to understand how to implement a complex database that might be able to manage a large number of users, with user-specific sets of data like posts, messages, etc. So what might be a good approach to take going forward?
As you already mentioned, in the relational data model you can define two tables like below:
table 1 : Users
user_id    user_name
-------    ---------
1          'Tom'
2          'John'
table 2 : Posts
post_id    user_id    content               post_date
-------    -------    ------------------    ----------------
1          1          'Hello, I am Tom.'    2014-04-02 14:14
2          1          'good bye'            2014-04-02 20:10
3          2          'I am John'           2014-04-02 22:22
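As a rough sketch, the two tables above could be created and queried like this. I am using Python's sqlite3 module with an in-memory SQLite database instead of MySQL (an assumption, so the snippet is self-contained); the column types and the posts_by helper are my own illustrative choices.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        user_id   INTEGER PRIMARY KEY,
        user_name TEXT NOT NULL
    );
    CREATE TABLE posts (
        post_id   INTEGER PRIMARY KEY,
        user_id   INTEGER NOT NULL REFERENCES users(user_id),
        content   TEXT NOT NULL,
        post_date TEXT NOT NULL
    );
    INSERT INTO users VALUES (1, 'Tom'), (2, 'John');
    INSERT INTO posts VALUES
        (1, 1, 'Hello, I am Tom.', '2014-04-02 14:14'),
        (2, 1, 'good bye',         '2014-04-02 20:10'),
        (3, 2, 'I am John',        '2014-04-02 22:22');
""")

def posts_by(user_name):
    """Hypothetical helper: return one user's post contents, oldest first."""
    rows = conn.execute(
        "SELECT p.content FROM posts AS p "
        "JOIN users AS u ON u.user_id = p.user_id "
        "WHERE u.user_name = ? ORDER BY p.post_date", (user_name,))
    return [row[0] for row in rows]

tom_posts = posts_by("Tom")
```

The user_id column in posts plays the role of the vector-of-Posts member in the C++ picture: each user's "own table of posts" is just the rows of posts that carry that user's id.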
You can read an introductory article here:
Relational_model:
http://en.wikipedia.org/wiki/Relational_model
Hope this helps.
You don't store table within table. You can store data in multiple tables and assign primary key for one table and foreign key for another table.
Read about Primary key, Foreign key and Relational Model.
Once these concepts are clear, read about Database Normalization.
You don't store tables within tables. As your third paragraph suggests, the strategy is to use some common key to "relate" table rows to each other.
The "unique ID" you describe is usually called a "primary key". You might have a table of users with a primary key that auto-increments each time you add a record. A function would be available to you so that after inserting, you could determine what the primary key is of the record you just added, so that you can add records to other tables that refer to the primary key of the users table.
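In MySQL the function mentioned above is LAST_INSERT_ID(); most client libraries also expose the value directly. As a small sketch of the idea, here it is with Python's sqlite3 module, where the equivalent is the cursor's lastrowid attribute. SQLite stands in for MySQL (my assumption), and the table layout and sample values are made up for the illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY AUTOINCREMENT,
        email   TEXT NOT NULL
    );
    CREATE TABLE posts (
        post_id INTEGER PRIMARY KEY AUTOINCREMENT,
        user_id INTEGER NOT NULL REFERENCES users(user_id),
        content TEXT NOT NULL
    );
""")

# Insert the user, then ask which primary key the database assigned.
cur = conn.execute("INSERT INTO users (email) VALUES (?)", ("tom@example.com",))
new_user_id = cur.lastrowid

# Use that key when adding rows to tables that refer to the user.
conn.execute("INSERT INTO posts (user_id, content) VALUES (?, ?)",
             (new_user_id, "first post"))
conn.commit()

owner = conn.execute(
    "SELECT u.email FROM posts AS p "
    "JOIN users AS u ON u.user_id = p.user_id "
    "WHERE p.content = ?", ("first post",)).fetchone()[0]
```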
You should probably read about database normalization and the relational model, specifically about the differences between normal forms.
With regard to selection of a field to relate posts to users, I suggest you don't use the username, and instead use some internal reference that isn't visible to the users. While your application might not allow it now, if you wanted to offer users the opportunity to change their username, tying internal database structure to something based on user input would only cause problems in the future.
I am currently working on a PHP/MySQL project for an assignment. In studying the efficient design of databases while working on the assignment I notice that in many cases it is good practice to create a third table when working with only two sets of data.
For example, if we have a table for "Students" and a table for "Addresses" it appears to be a good idea to create a third table i.e. "Student_Addresses" since a student can hypothetically have more than one address (separated parents etc.) and a single address can represent more than one student (siblings).
My question is: How do we go about populating that third table? Is there a way that it is done automatically using primary and/or foreign keys?
I've tried Google and my textbook to understand this but I've gotten nowhere. Links to tutorials or articles would be greatly appreciated.
Thanks for your help. I hope the question and example are clear.
n:m or 1:m normalization rule
Option 1:
user table
id
f_name
s_name
......
user address table
id
user_id // foreign key to the user table; index it, but not uniquely, since a unique index would restrict this to 1:1
address line 1
address line 2
address line 3
address_type (home, office ....)
Option 2:
user table
id
f_name
s_name
......
address table
id
address line 1
address line 2
address line 3
address_type (home, office ....)
user_address table
userId
addressId
According to your description, option 2 would be the right solution. After adding the data to the user table and the address table, you need to add the data to the user_address table yourself. Some object-relational mappers (ORMs) can add the data to the third table automatically, but you need to define the relations. Check http://docs.doctrine-project.org/projects/doctrine-orm/en/latest/reference/association-mapping.html.
http://docstore.mik.ua/orelly/linux/sql/ch02_02.htm
http://www.keithjbrown.co.uk/vworks/mysql/mysql_p7.php
You can save the data in the third table using triggers when the data is inserted/updated/deleted in your base tables. You can learn more about triggers at
MySQL Triggers
However in your case it would be better if you could write the logic at the application/code level to make an entry in the third table. You can set up foreign key relationships to this table from your base tables so that the data remains consistent.
There is no native method in MySQL to populate Student_Addresses in your situation. You have to enter the data (the connections) yourself, but you can use, for example, transactions. See the answers in this topic: SQL Server: Is it possible to insert into two tables at the same time?
To keep the connections consistent, give Student_Addresses not-null fields referencing the ID from Student and the ID from Address, make the two fields a unique key together, and use ON UPDATE CASCADE and ON DELETE CASCADE. This takes care of removing records from the junction table when you remove records from either of the other two tables, and it also won't allow you to add the same address to the same student twice.
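Here is a minimal sketch of that setup: a junction table with a unique pair of not-null foreign keys, cascading deletes, and one transaction that fills all three tables. It uses Python's sqlite3 module with an in-memory SQLite database standing in for MySQL (an assumption), and the helper names and sample data are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to honour the cascades
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY AUTOINCREMENT, name  TEXT NOT NULL);
    CREATE TABLE address (id INTEGER PRIMARY KEY AUTOINCREMENT, line1 TEXT NOT NULL);
    CREATE TABLE student_address (
        student_id INTEGER NOT NULL
            REFERENCES student(id) ON UPDATE CASCADE ON DELETE CASCADE,
        address_id INTEGER NOT NULL
            REFERENCES address(id) ON UPDATE CASCADE ON DELETE CASCADE,
        UNIQUE (student_id, address_id)  -- same address cannot be linked twice
    );
""")

def add_student_at(name, line1):
    """Hypothetical helper: insert student, address and link in one transaction."""
    with conn:  # all three inserts commit together or not at all
        student_id = conn.execute(
            "INSERT INTO student (name) VALUES (?)", (name,)).lastrowid
        address_id = conn.execute(
            "INSERT INTO address (line1) VALUES (?)", (line1,)).lastrowid
        conn.execute(
            "INSERT INTO student_address (student_id, address_id) VALUES (?, ?)",
            (student_id, address_id))
    return student_id, address_id

def link_count():
    return conn.execute("SELECT COUNT(*) FROM student_address").fetchone()[0]

student_id, address_id = add_student_at("Ann", "1 Main St")
links_before = link_count()

# Deleting the student removes its junction rows through ON DELETE CASCADE.
with conn:
    conn.execute("DELETE FROM student WHERE id = ?", (student_id,))
links_after = link_count()
```

The address row itself stays in place after the delete; only the link to the removed student disappears.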
I don't think the data will be populated automatically; rather, it is the user's responsibility to insert it.
I am not sure about PHP, but with Hibernate and Java this can be done seamlessly. Since the student and address data could be coming through some web application, Hibernate can map Java objects to records in the tables and also populate the relationship table.
I want to begin with Thank you, you guys have been good to me.
I will go straight to the question.
Having a table with over 400 columns, is that bad?
I have web forms that consists mainly of questions that require check box answers.
The total number of check boxes can run up to 400 if not more.
I actually modeled one of the forms, and put each check box in a column (took me hours to do).
Because of my unfamiliarity with database design, I did not feel like that was the right way to go.
So I read somewhere that some people use the serialize function, to store a group of check boxes as text in a column.
I just want to know if that would be the best way to store these check boxes.
Oh, and some more info: I will be using the CakePHP ORM with these tables.
Thanks again in advance.
My database looks something like this
Table : Patients, Table : admitForm, Table : SomeOtherFOrm
each form table will have a PatientId
As I stated above, I first attempted creating a table for each form and then putting each check box in a column. That took me forever to do.
So I read somewhere that serializing check boxes per question would be a good idea.
So I'm asking what would be a good approach.
For questions with multiple options, just add another table.
The question that nobody has asked you yet is do you need to do data mining or put the answers to these checkbox questions into a where clause in a query. If you don't need to do any queries on the data that look at the data contained in these answers then you can simply serialize them up into a few fields. You could even pack them into numbers. (all who come after you will hate you if you pack the data though)
Here's my idea of a schema.
== Edit #3 ==
Updated ERD with the ability to store free-form answers; also linked patient_reponse_option to the question_option_link table so a patient's response will be saved with the correct option context (we know which question the response is to). I will post a few queries soon.
== Edit #2 ==
Updated ERD with form data
== Edit #1 ==
The short answer to your question is no, 400 columns is not the right approach. As an alternative, check out the following schema:
== Original ==
According to your recent edit, you will want to incorporate a pivot table. A pivot table breaks up a M:M relationship between 'patients' and 'options', for example, many patients can have many options. For this to work, you don't need a table with 400 columns, you just need to incorporate the aforementioned pivot table.
Example schema:
// patient table
tableName: patient
id: int(11), autoincrement, unsigned, not null, primary key
name_first: varchar(100), not null
name_last: varchar(100), not null
// Options table
tableName: option
id: int(11), autoincrement, unsigned, not null, primary key
name: varchar(100), not null, unique key
// pivot table
tableName: patient_option_link
id: int(11), autoincrement, unsigned, not null, primary key
patient_id: Foreign key to patient (`id`) table
option_id: Foreign key to option (`id`) table
With this schema you can have any number of 'options' without having to add a new column to the patients table. That matters because, if you have a large number of rows, running an alter table add column command can crush your database.
I added an id to the pivot table, so if you ever need to handle individual rows, they will be easier to work with, vs having to know the patient_id and option_id.
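A quick sketch of how the pivot schema above might be used: one row per checked box, and a join to read back what a patient checked. It uses Python's sqlite3 module with an in-memory SQLite database in place of MySQL (an assumption), and the patient and option names are invented. I quote the option table name because, as far as I know, OPTION is a reserved word in MySQL, where it would need backticks or a different name.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (
        id         INTEGER PRIMARY KEY AUTOINCREMENT,
        name_first TEXT NOT NULL,
        name_last  TEXT NOT NULL
    );
    CREATE TABLE "option" (
        id   INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE patient_option_link (
        id         INTEGER PRIMARY KEY AUTOINCREMENT,
        patient_id INTEGER NOT NULL REFERENCES patient(id),
        option_id  INTEGER NOT NULL REFERENCES "option"(id)
    );
""")

# Hypothetical sample data: one patient, three possible check boxes.
conn.execute("INSERT INTO patient (name_first, name_last) VALUES (?, ?)",
             ("Jane", "Doe"))
conn.executemany('INSERT INTO "option" (name) VALUES (?)',
                 [("smoker",), ("diabetic",), ("allergies",)])

# One link row per box the patient actually checked (options 1 and 3).
conn.executemany(
    "INSERT INTO patient_option_link (patient_id, option_id) VALUES (?, ?)",
    [(1, 1), (1, 3)])

checked = [row[0] for row in conn.execute(
    'SELECT o.name FROM patient_option_link AS l '
    'JOIN "option" AS o ON o.id = l.option_id '
    'WHERE l.patient_id = ? ORDER BY o.name', (1,))]
```

Adding a 401st check box is then an INSERT into the option table rather than an ALTER TABLE.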
I think I would split this out into 3 tables. One table representing whatever entity is answering the questions. A second table containing the questions themselves. Finally, a third junction table that will be populated with the primary key of the first table and the id of the question from the second table whenever the entity from the first table selects the check box for that question.
Usually 400 columns means your data could be normalized better and broken into multiple tables. 400 columns might actually be appropriate, though, depending on the use case. An example where it might be appropriate is if you need these fields on every single query AND you need to filter records using these columns (ie: use them in your WHERE clause)... in that case the SQL JOINs will likely be more expensive than having a sparsely populated "wide" table.
If you never need to use SQL to filter out records based on these "checkboxes" (I'm guessing they are yes/no boolean/tinyint type values) then serializing is a valid approach. I would go this route if I needed to use the checkbox values most of time I query the table, but don't need to use them in a WHERE clause.
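For illustration, here is roughly what the serializing route looks like. This sketch swaps PHP's serialize for JSON and uses Python's sqlite3 module with an in-memory SQLite database rather than MySQL (both are my assumptions); the admit_form layout, the helper functions and the check box names are hypothetical.

```python
import json
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE admit_form (
        id         INTEGER PRIMARY KEY AUTOINCREMENT,
        patient_id INTEGER NOT NULL,
        answers    TEXT NOT NULL  -- JSON list of the checked box names
    )
""")

def save_form(patient_id, checked):
    """Store all checked boxes of one form as a single serialized column."""
    payload = json.dumps(sorted(checked))
    cur = conn.execute(
        "INSERT INTO admit_form (patient_id, answers) VALUES (?, ?)",
        (patient_id, payload))
    conn.commit()
    return cur.lastrowid

def load_form(form_id):
    """Read the column back and rebuild the set of checked boxes."""
    row = conn.execute(
        "SELECT answers FROM admit_form WHERE id = ?", (form_id,)).fetchone()
    return set(json.loads(row[0]))

form_id = save_form(7, {"q1_fever", "q3_cough"})
restored = load_form(form_id)
```

The trade-off is visible in the schema: the database only sees one opaque text column, so finding every patient who ticked a given box means loading and decoding rows in application code instead of writing a WHERE clause.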
If you don't need these checkbox values, or only need a small subset of them, on a majority of requests to your table, then it's likely you should work on breaking your table into multiple tables. One approach is to have a table with the checkbox values (id, record_id, checkbox_name, checkbox_value) where record_id is the id of your primary table record. This implies a one-to-many relationship between your primary records and your checkbox values.
I have these tables:
category table:
cat_id (PK)
cat_name
category_options table:
option_id (PK)
cat_id (FK)
option_name
option_values table:
value_id (PK)
option_id (FK)
value
classifieds table:
ad_id (PK) (VARCHAR) something like "Bmw330ci_28238239832"
poster_id (FK)
cat_id (FK)
headline
description
price
etc....
posters table:
poster_id (PK)
name
email
tel
password
etc....
Three main questions:
1- Is the above good enough? It covers all my needs at least...
2- Sometimes when I try out different queries, I get strange results... Could you write a PHP query string which will fetch one complete ad from an ad_id only? (imagine the only variable you have is ad_id)
3- In the query string, must I specify all different tables which are connected in order to display an ad? Can't I just use something like "SELECT * FROM classifieds WHERE ad_id=$ad_id" and it would handle the links automatically, ie fetch all related information also?
Thanks and if you need more input let me know!
You have serious design problems. Never ever ever use name as a PK; it is not unique and it is subject to change! Women change their names when they get married, for instance. In fact, don't use any varchars as PKs at all. Use surrogate keys instead. Surrogate keys don't change; text key values often do, and they are slower too.
And never store name as just one field; this is poor practice. At a minimum you need first name, last name, middle name, and suffix. You will also need an autoincrementing id field so that John Smith at one address in Chicago can exist in the table alongside a different John Smith who lives elsewhere in Chicago.
No you can't get all the data from related tables without adding them to the query through the use of a join. This is database 101 and if you don't know that, then you don't understand relational databases enough to design one. Do some research into joins and querying. You can get all the information for an ad from just having the ad id though as your current relations appear to work.
Do not use implied joins when you add the other tables to your queries. They are outdated by 18 years. Learn correctly by using explicit joins.
1) If it meets your needs, then wouldn't that make it "good enough"? But seriously, I would agree with davek that you should make the ad_id field an int/bigint, and I'd also suggest the same for the posters table. Make the name a regular value field and create an autonum int/bigint PK field for it. If for any reason that user wants to change their name (for privacy concerns, perhaps), then you would have to update any foreign keys in the database as well. With an autonum key, you wouldn't have this problem.
2) Yes, from what I see you should be able to gather all the data on an ad by knowing only the ad_id.
3) No, you need to do more than that, either equi-join in a SELECT query, or use the JOIN keyword to pull your data in. MySQL doesn't have a "meta" relationship model (like MS Access), so it won't automatically understand your primary/foreign key relationships.
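To show what "use the JOIN keyword" means for this schema, here is a sketch that fetches one ad from its ad_id with explicit joins. It is not PHP: it uses Python's sqlite3 module with an in-memory SQLite database standing in for MySQL (an assumption). Only three of the tables are included to keep it short, the column lists are trimmed, and the sample rows and the fetch_ad helper are made up.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (cat_id INTEGER PRIMARY KEY, cat_name TEXT NOT NULL);
    CREATE TABLE posters (
        poster_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        email     TEXT NOT NULL
    );
    CREATE TABLE classifieds (
        ad_id     TEXT PRIMARY KEY,
        poster_id INTEGER NOT NULL REFERENCES posters(poster_id),
        cat_id    INTEGER NOT NULL REFERENCES category(cat_id),
        headline  TEXT NOT NULL,
        price     INTEGER
    );
    INSERT INTO category VALUES (1, 'Cars');
    INSERT INTO posters  VALUES (1, 'Sam', 'sam@example.com');
    INSERT INTO classifieds
        VALUES ('Bmw330ci_28238239832', 1, 1, 'BMW 330ci for sale', 9000);
""")

def fetch_ad(ad_id):
    """Return one ad with its category and poster, or None if it does not exist."""
    return conn.execute(
        "SELECT c.headline, c.price, cat.cat_name, p.name, p.email "
        "FROM classifieds AS c "
        "INNER JOIN category AS cat ON cat.cat_id = c.cat_id "
        "INNER JOIN posters  AS p   ON p.poster_id = c.poster_id "
        "WHERE c.ad_id = ?", (ad_id,)).fetchone()

ad = fetch_ad("Bmw330ci_28238239832")
```

The ad_id is passed as a bound parameter (the ? placeholder) rather than pasted into the query string the way "WHERE ad_id=$ad_id" would be, which avoids both quoting bugs and SQL injection.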