Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
Wish you a peaceful and healthy new year.
I am working on a survey database design for MySQL/PHP/WordPress for an estimated 10,000,000 users. Each user will eventually answer about 5,000 questions over the course of several years. Most questions are multiple choice on a scale of AGREE, NEUTRAL, DISAGREE, DONT KNOW. There is no right or wrong answer. A user will be able to attempt the questions again in the future, and at each attempt his/her answer record gets updated with the new data. Would the following database design be reasonable from a database performance and data normalization perspective? Thank you in advance.
TABLE_USER:
user_id
username
user_email
[other user specific fields]
TABLE_QUESTION:
question_id
question_text
question_image
question_category1 [A question may exist in more than 1 category]
question_category2
question_category3
TABLE_ANSWER:
answer_id
user_id
question_id
answer_agree
answer_neutral
answer_disagree
answer_dontknow
answered_datetime
answer_number_of_attempts
Sincerely,
Harrison.
Part of proper db design means stepping back and making sure that if you added one more thing, you wouldn't have to redesign the tables, and also that discrete types of information are separated out. If several columns are doing the same thing (but recording different answers) you should split them out into another table and have a single link table.
Also, do you really need to say TABLE in the table name? Of course it's a table; what else would it be?
TABLE_USER is fine.
For TABLE_QUESTION you should drop the category columns and instead make a new table,
TABLE_CATEGORIES
with information on the different categories
and have another table
CATEGORIES_PER_QUESTION
question_id
category_id
That allows a question to have any number of categories; you can look up which ones each question has by querying CATEGORIES_PER_QUESTION.
TABLE_ANSWER should be split into two tables,
RESPONSE
response_id
user_id
question_id
answer_id
datetime_responded
and ANSWERS
answer_id
answer_name
where answer_name is AGREE, NEUTRAL, DISAGREE, DONT KNOW or any other answer you might offer.
If you wanted to be fancy, you could even have another join table between ANSWERS and TABLE_QUESTION, indicating which answers are available for each question.
The number of attempts, and the other information I dropped, can be computed by querying the DB (for example, by counting a user's RESPONSE rows for a question), so it doesn't need a column of its own.
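The design described above can be sketched as follows. This is a minimal illustration, not a production schema: it uses Python's built-in `sqlite3` as a stand-in for MySQL so that it runs anywhere, and the column types, the index name `idx_response_user_question`, the helper functions, and the sample rows are all assumptions. Each attempt is stored as a new row in `response`, so the attempt count and the latest answer are derived by queries.

```python
import sqlite3

# Assumed types and names; SQLite stands in for MySQL in this sketch.
SCHEMA = """
CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT, user_email TEXT);
CREATE TABLE question (question_id INTEGER PRIMARY KEY, question_text TEXT);
CREATE TABLE categories (category_id INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE categories_per_question (
    question_id INTEGER NOT NULL REFERENCES question(question_id),
    category_id INTEGER NOT NULL REFERENCES categories(category_id),
    PRIMARY KEY (question_id, category_id)
);
CREATE TABLE answers (answer_id INTEGER PRIMARY KEY, answer_name TEXT UNIQUE);
CREATE TABLE response (
    response_id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(user_id),
    question_id INTEGER NOT NULL REFERENCES question(question_id),
    answer_id INTEGER NOT NULL REFERENCES answers(answer_id),
    datetime_responded TEXT NOT NULL
);
-- Lets per-user, per-question lookups avoid a full table scan.
CREATE INDEX idx_response_user_question ON response (user_id, question_id);
"""


def build_db():
    """Create an in-memory database with a little hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO answers (answer_id, answer_name) VALUES (?, ?)",
        [(1, "AGREE"), (2, "NEUTRAL"), (3, "DISAGREE"), (4, "DONT KNOW")],
    )
    conn.execute("INSERT INTO users VALUES (1, 'harrison', 'h@example.com')")
    conn.execute("INSERT INTO question VALUES (1, 'Sample question?')")
    # The same user answers the same question twice, a year apart.
    conn.executemany(
        "INSERT INTO response (user_id, question_id, answer_id, datetime_responded)"
        " VALUES (?, ?, ?, ?)",
        [(1, 1, 1, "2015-01-01 10:00:00"), (1, 1, 3, "2016-01-01 10:00:00")],
    )
    return conn


def attempts(conn, user_id, question_id):
    """Number of attempts is derived by counting rows, not stored in a column."""
    row = conn.execute(
        "SELECT COUNT(*) FROM response WHERE user_id = ? AND question_id = ?",
        (user_id, question_id),
    ).fetchone()
    return row[0]


def latest_answer(conn, user_id, question_id):
    """Name of the most recent answer, or None if the user never answered."""
    row = conn.execute(
        "SELECT a.answer_name FROM response r"
        " JOIN answers a ON a.answer_id = r.answer_id"
        " WHERE r.user_id = ? AND r.question_id = ?"
        " ORDER BY r.datetime_responded DESC, r.response_id DESC LIMIT 1",
        (user_id, question_id),
    ).fetchone()
    return row[0] if row else None
```

Keeping one row per attempt preserves the answer history; if only the latest answer matters, the same table could instead be updated in place.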
I realize you want help with DB design, but even if the design is perfect, this will fail to scale reasonably if your system is not planned properly for scaling (BIG).
A properly designed API can scale much further.
With those numbers it will be cheaper over time to host this externally and build an API for it so you can scale properly. Building something directly into WordPress will require you to scale too quickly in all directions just to run PHP, HTTP and MySQL.
If you build an API between WordPress and your survey database, you can then scale MySQL and add any number of systems in between: Memcache, search engines, etc.
This gives you better separation between your systems, allowing for more efficient scaling,
with each part scaled only when it needs it.
So I would plan your system/infrastructure at this point as well.
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
For example, Company Abc will have its own database: Database_Abc.
Let's say each Database will consist of the following tables:
staffs
products
inventory
logs
Or
would one centralized database with a unique merchant id be better?
staffs - merchant_id
products - merchant_id
inventory - merchant_id
logs - merchant_id
Which method is better for scalability?
Well, this is not just a question about database management; it also involves the server-side application.
Starting from scratch, one centralized DB with a uniform server-side application will do the trick nicely, and it will scale almost perfectly as the data grows horizontally.
If the companies have varied and very different requirements, you'll need to think about a separate implementation for each company with its own design. That approach consumes more resources but fits individual customers better.
As for database management, you'll need to ask yourself the following questions:
What type of data will be stored? Keep in mind, for example, that financial data is best stored in a DECIMAL type.
Think about the relations between tables, and therefore about correct FOREIGN KEYs.
Think about possible shortcomings of MySQL itself; for example, it is not well suited to full-text search.
How will your data grow in size? If it will grow very rapidly, you will need to think about PARTITIONING, for example.
Think about replication and backup issues.
Where will you host your database? You could consider cloud services so that you don't have to handle administration yourself.
etc.
I recommend centralizing the database with a unique merchant id, because it is more efficient to manage, both for you and for the software (MySQL, PHP, etc.).
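As a rough illustration of the centralized option, the sketch below keeps every merchant's rows in shared tables and scopes each query by `merchant_id`. It uses Python's `sqlite3` in place of MySQL; the `merchants` table, the index name, the `products_for` helper, and the sample data are assumptions made for the example.

```python
import sqlite3


def build_db():
    """Shared tables for all merchants; every row carries a merchant_id."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE merchants (merchant_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE products (
        product_id INTEGER PRIMARY KEY,
        merchant_id INTEGER NOT NULL REFERENCES merchants(merchant_id),
        name TEXT NOT NULL,
        price TEXT NOT NULL  -- would be a DECIMAL column in MySQL
    );
    -- Every tenant-scoped query filters on merchant_id, so index it.
    CREATE INDEX idx_products_merchant ON products (merchant_id);
    """)
    conn.executemany("INSERT INTO merchants VALUES (?, ?)", [(1, "Abc"), (2, "Xyz")])
    conn.executemany(
        "INSERT INTO products (merchant_id, name, price) VALUES (?, ?, ?)",
        [(1, "Widget", "9.99"), (1, "Gadget", "19.50"), (2, "Widget", "8.75")],
    )
    return conn


def products_for(conn, merchant_id):
    """Return product names for one merchant only."""
    rows = conn.execute(
        "SELECT name FROM products WHERE merchant_id = ? ORDER BY name",
        (merchant_id,),
    )
    return [name for (name,) in rows]
```

The important habit is that application code never queries `products` without a `merchant_id` filter; the other tables (staffs, inventory, logs) would follow the same pattern.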
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I am creating a website where users can easily calculate the calories they eat and see the breakdown in terms of fat, carbs, etc.
I want the users to be able to retrieve data from previous days.
So I need to store the data sent by my users every day (basically, they input how much of each food they have eaten each day, and I do the calculation and then store the results).
The question is the following: what would be the best way to store the data? I have to store the data for each user for each day. I can't think of a simple solution (I think creating a new table for each new day would not be great, would it?).
I'm using PHP and MySQL for now.
Thanks for the help!
It seems that you are a step ahead of yourself with the daily breakdown question.
First, you need to decide what you need to store (i.e. which fields) and normalise the way they are stored.
For example, you would have the following tables:
Users:
Id
..
EatItems:
UserId
ProductId
Calories
Fat
DateTime
Once you have these tables up and running, you can build a reporting layer on top of them to break down consumption by user, by date, or by anything else you might be interested in.
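A single table like this is enough to answer "what did this user eat on each day". The sketch below is one possible reading of the tables above, using Python's `sqlite3` instead of MySQL; the snake_case column names (including `eaten_at` for the DateTime field), the index, and the sample rows are assumptions.

```python
import sqlite3


def build_db():
    """One eat_items table for all users and all days."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE eat_items (
        user_id INTEGER NOT NULL REFERENCES users(id),
        product_id INTEGER NOT NULL,
        calories INTEGER NOT NULL,
        fat REAL NOT NULL,
        eaten_at TEXT NOT NULL  -- a DATETIME column in MySQL
    );
    CREATE INDEX idx_eat_items_user_time ON eat_items (user_id, eaten_at);
    """)
    conn.execute("INSERT INTO users VALUES (1, 'alice')")
    conn.executemany(
        "INSERT INTO eat_items VALUES (?, ?, ?, ?, ?)",
        [
            (1, 10, 400, 10.0, "2024-01-01 08:00:00"),
            (1, 11, 300, 5.5, "2024-01-01 19:30:00"),
            (1, 10, 300, 2.0, "2024-01-02 12:00:00"),
        ],
    )
    return conn


def daily_totals(conn, user_id):
    """Reporting layer: per-day calorie and fat totals for one user."""
    return conn.execute(
        "SELECT date(eaten_at) AS day, SUM(calories), SUM(fat)"
        " FROM eat_items WHERE user_id = ?"
        " GROUP BY day ORDER BY day",
        (user_id,),
    ).fetchall()
```

Because the day is derived from the timestamp at query time, no per-day table is needed; the same table can also be grouped by week, by product, or by anything else.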
You could have a single table that holds the input data, the calculated data and the date, with each row linked to a user/account.
When the user views a previous day, select the rows for that user and that date.
I wouldn't create a table for each day. One table would suffice.
However, I would suggest attempting something first and then posting the code for any specific issues you run into.
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
I've got a question about storing information in a MySQL database.
If I have user information, should I have separate tables to store each type of information? Or should I use one single table?
Let's say I have user email, username, password, first name, last name, address, gender etc...
Should I have 1 table to store email, username, and password, but another table for firstname, lastname, address and gender? Perhaps a separation from user info and account info? What do you think?
I'm not sure from a query/performance standpoint if there would be any difference by splitting up. Also, a JOIN query using the assoc. index should be able to link both tables up by User ID or some other auto increment value. Not sure what to do here!
Thanks!
Unless you have a very good reason, use proper database normalization techniques and have one table per entity type.
You're talking about a user here, so unless user and person are fundamentally different you should store that in one table.
If one user can have multiple logins, which is hopefully not the case as it tends to confuse people, then you may want to separate that out one-to-many. Otherwise, use the simplest thing that could possibly work.
As always, follow the advice in a guide like PHP The Right Way, paying particular attention to the part about password security.
Using a development framework like Laravel is an even better plan, as it has a built-in security model that you can use.
It depends on the functionality of your application.
But you should normalise your data model.
Have a look at Codd's work on normalization: http://en.wikipedia.org/wiki/Database_normalization
How many records are we talking about? With only a few thousand, it's not something to worry about. Once you get past 100K, joins (and indexes) become important, and then you want to make sure your data model is normalised.
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
I'm starting a new project and I'm about to make perhaps the most important design decision.
I'll be working with PHP to create dynamic queries on the fly. (I want to provide the users with something similar to the dynamic tables in MS Excel.)
I have about 25 tables with 4-6 columns each, which I'll be joining depending on user selection (most operations will calculate AVG, SUM, COUNT, etc. depending on filters over values that are probably 5 to 6 table joins away).
I'm thinking of building all these JOINs and SELECTs on the fly, as well as the GROUP BYs, but it looks like it's going to be quite complicated, so I started thinking about using views to reduce the complexity of the code that dynamically builds the query.
I also thought about avoiding GROUP BY in the query and calculating the aggregate functions with loops and an array in the code.
I would like to know which of these approaches you would recommend; any suggestion or tip will be greatly appreciated.
The only valid answer is to create your own framework for this. I've done that quite a few times. What you want looks more or less like a complex report generator that builds reports on the fly, except that you want a complex query generator with visual aids for the client.
The first thing I'd do is use a model that represents each table and offers mechanisms to describe the table's fields, so you can show the fields to the user. Then create a linking mechanism in your models that says: if I link this table and that table, this is the JOIN I should use.
Let your user select the models and the columns to use, and then use your models to create the query for you. It actually works well, but it takes quite some time to build.
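The model-plus-links idea could look roughly like the sketch below. It is written in Python rather than PHP to keep it short and runnable, and everything in it is hypothetical: the `MODELS` and `LINKS` contents, the table and column names, and the `build_query` function are invented for illustration. User choices are checked against the model descriptions, so only known tables, columns and aggregates ever reach the SQL string.

```python
# Hypothetical model descriptions: table name -> columns the user may pick.
MODELS = {
    "orders": ["id", "customer_id", "total"],
    "customers": ["id", "region_id", "name"],
    "regions": ["id", "name"],
}

# Hypothetical linking rules: which JOIN condition connects two tables.
LINKS = {
    ("orders", "customers"): "orders.customer_id = customers.id",
    ("customers", "regions"): "customers.region_id = regions.id",
}

ALLOWED_AGGREGATES = {"AVG", "SUM", "COUNT", "MIN", "MAX"}


def _check_column(qualified, tables):
    """Accept only 'table.column' names that the selected models describe."""
    table, _, column = qualified.partition(".")
    if table not in tables or column not in MODELS.get(table, []):
        raise ValueError(f"unknown column: {qualified}")
    return qualified


def build_query(tables, group_by, aggregate, target):
    """Build a grouped aggregate query along an ordered join path of tables."""
    if aggregate not in ALLOWED_AGGREGATES:
        raise ValueError(f"unsupported aggregate: {aggregate}")
    for table in tables:
        if table not in MODELS:
            raise ValueError(f"unknown table: {table}")
    group_by = _check_column(group_by, tables)
    target = _check_column(target, tables)

    sql = f"SELECT {group_by}, {aggregate}({target}) FROM {tables[0]}"
    # Each table must be linked to the one before it in the path.
    for left, right in zip(tables, tables[1:]):
        condition = LINKS.get((left, right)) or LINKS.get((right, left))
        if condition is None:
            raise ValueError(f"no link between {left} and {right}")
        sql += f" JOIN {right} ON {condition}"
    return sql + f" GROUP BY {group_by}"
```

For example, `build_query(["orders", "customers", "regions"], "regions.name", "SUM", "orders.total")` produces a two-JOIN query grouped by region name. Filter values chosen by the user would still need to go through bound parameters rather than string interpolation.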
Good luck
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
Ok, so I've been thinking a lot about various sites like Reddit.com. There are thousands of posts, and for each post thousands of comments and on top of all that there are votes which are tracked by user for all comments and posts.
So, considering articles, comments, and article votes (don't really care about comment votes) the way I know how to do it would be 3 tables:
Articles:
id, value, username, totalvotes, other relevant data
Comments:
id, articleid, value, username, other relevant data
Votes:
id, articleid, username, votevalue (+1,-1), other relevant data
So basically a one-to-many relation between Articles and Comments/Votes. Here are the questions I have in regard to this:
1. Is this the right way to do this?
2. Wouldn't it be extremely slow to tally all the votes by iterating through the whole votes table looking for the right article?
3. Would you keep a running total going, or just query the whole votes table every time (question 2)?
4. My other idea was to make tables on the fly for each article, but that might be overkill. Thoughts?
1. yes
2. no - indexing is your friend
3. no - that is denormalized and would be hard to maintain
4. oooh.. no
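To make the indexing point concrete, here is a small sketch of the votes table with an index on `articleid`, so that tallying an article's votes only touches that article's rows. It uses Python's `sqlite3` in place of MySQL; the index name, the `tally` helper, and the sample votes are assumptions.

```python
import sqlite3


def build_db():
    """Votes table from the question, plus an index on the lookup column."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE votes (
        id INTEGER PRIMARY KEY,
        articleid INTEGER NOT NULL,
        username TEXT NOT NULL,
        votevalue INTEGER NOT NULL CHECK (votevalue IN (1, -1))
    );
    -- Without this index, each tally would scan the whole table.
    CREATE INDEX idx_votes_article ON votes (articleid);
    """)
    conn.executemany(
        "INSERT INTO votes (articleid, username, votevalue) VALUES (?, ?, ?)",
        [(1, "ann", 1), (1, "bob", 1), (1, "cy", -1), (2, "ann", -1)],
    )
    return conn


def tally(conn, articleid):
    """Sum the votes for one article; 0 if nobody has voted yet."""
    row = conn.execute(
        "SELECT COALESCE(SUM(votevalue), 0) FROM votes WHERE articleid = ?",
        (articleid,),
    ).fetchone()
    return row[0]
```

With the tally computed on demand, the `totalvotes` column on Articles is no longer needed as the source of truth.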
One answer: cache!
As Randy said, having a lot of data is not a big deal if your tables are well indexed,
but it can still be too slow for a good user experience. So consider caching everything you can (like the number of responses) and updating it only when a new comment is made.
Also, I strongly advise you to fetch any related data that isn't cached via AJAX rather than directly at page load.
TLDR :
1. Yes
2. No
3. Cache
4. Hell no
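The caching advice in this answer might be sketched like this: the comment count is read from the database once, kept in a cache, and adjusted only when a new comment is written. A plain dictionary stands in for a real cache such as Memcached, and `sqlite3` stands in for MySQL; the `CommentCounter` class, `make_conn`, and the `db_reads` counter are hypothetical names used only for this illustration.

```python
import sqlite3


def make_conn():
    """In-memory comments table matching the question's layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        articleid INTEGER NOT NULL,
        username TEXT NOT NULL,
        value TEXT NOT NULL
    );
    CREATE INDEX idx_comments_article ON comments (articleid);
    """)
    return conn


class CommentCounter:
    """Caches per-article comment counts and updates them on write."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}    # articleid -> count; a dict stands in for a real cache
        self.db_reads = 0  # how many times the count had to be read from the DB

    def count(self, articleid):
        """Return the cached count, hitting the database only on a miss."""
        if articleid not in self.cache:
            self.db_reads += 1
            row = self.conn.execute(
                "SELECT COUNT(*) FROM comments WHERE articleid = ?", (articleid,)
            ).fetchone()
            self.cache[articleid] = row[0]
        return self.cache[articleid]

    def add_comment(self, articleid, username, value):
        """Store the comment and adjust the cached count if there is one."""
        self.conn.execute(
            "INSERT INTO comments (articleid, username, value) VALUES (?, ?, ?)",
            (articleid, username, value),
        )
        self.conn.commit()
        if articleid in self.cache:
            self.cache[articleid] += 1
```

The database remains the source of truth; the cache only saves repeated COUNT queries on pages that are read far more often than they are commented on.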