So, I'm dealing with an enormous online form right now. It's separated into different sections visually, but those tend to change. With around 300 fields, it almost seems ridiculous to put them into a single table, though if I separate them and someone decides to move a field to a different section on the front end on several different occasions, it will become a mess in the database and fields won't match their front end sections.
I'm essentially asking: What is the best way to organize something like this in a normalized fashion?
You could move the field names to another table and reference them in the value table.
Example
field_id | field_name
------------------------
1 | first_name
2 | last_name
Then reference from the values:
value_id | field_id | value
--------------------------------
1 | 1 | John
2 | 2 | Doe
3 | 1 | Max
4 | 2 | Jefferson
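A minimal sketch of the two tables above, using Python's built-in sqlite3 module purely for illustration. The table names fields and field_values and the submission_id column are assumptions that are not in the example above; some such column is needed, though, because otherwise there is no way to tell which first_name belongs to which last_name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fields (
        field_id   INTEGER PRIMARY KEY,
        field_name TEXT NOT NULL UNIQUE
    );
    -- submission_id is an assumed extra column that groups the values
    -- belonging to one filled-in form.
    CREATE TABLE field_values (
        value_id      INTEGER PRIMARY KEY,
        submission_id INTEGER NOT NULL,
        field_id      INTEGER NOT NULL REFERENCES fields(field_id),
        value         TEXT
    );
    INSERT INTO fields VALUES (1, 'first_name'), (2, 'last_name');
    INSERT INTO field_values VALUES
        (1, 1, 1, 'John'), (2, 1, 2, 'Doe'),
        (3, 2, 1, 'Max'),  (4, 2, 2, 'Jefferson');
""")

def load_submission(conn, submission_id):
    """Return one submission as a {field_name: value} dict."""
    rows = conn.execute(
        """SELECT f.field_name, v.value
           FROM field_values v
           JOIN fields f ON f.field_id = v.field_id
           WHERE v.submission_id = ?""",
        (submission_id,),
    )
    return dict(rows.fetchall())
```

Because the front-end section is not part of either table here, moving a field to another section would not touch the stored values at all.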
If you're going to use a SQL database, then the Entity-Attribute-Value model (EAV) described above is probably a good answer. You might also want to mix in a couple of denormalized tables with common or specialized data.
Another option might be a document store, though; this sounds like just the kind of problem that inspired data stores like MongoDB. In MongoDB you just store everything as one large JSON document. If some data isn't needed for some records and is left out, it isn't considered "bad" in the way sparsely populated wide SQL tables are.
You can group your fields. Separate them into components and you will probably notice that you can make multiple tables out of that one. By separating the table you can also build the form with, for example:
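To illustrate the document idea without depending on a MongoDB driver, here is a rough sketch using plain Python dicts and the json module. The field names, and the employer sub-document in particular, are made up for the example; the point is only that one record can omit fields another record has.

```python
import json

# Two form submissions as documents. The second carries an extra nested
# "employer" section (hypothetical) that the first one simply leaves out.
submissions = [
    {"_id": 1, "first_name": "John", "last_name": "Doe"},
    {
        "_id": 2,
        "first_name": "Max",
        "last_name": "Jefferson",
        "employer": {"name": "Acme", "years": 3},
    },
]

def field(doc, name, default=None):
    """Read a field that may be absent from a document."""
    return doc.get(name, default)

# Round-trip through JSON text, roughly what a document store persists.
stored = [json.dumps(doc) for doc in submissions]
loaded = [json.loads(text) for text in stored]
```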
fieldset tags
separate steps (I think this is the best solution)
multiple ajax requests for each form after previous is filled
form separated by open/close javascript windows
Database design, object design, and form design are three very different things. If there are one-to-many relationships in the data, you should have different tables to normalize it. If, however, everything is a one-to-one relationship, then having all 300 fields in the same table is perfectly acceptable. I find it difficult to believe that there is a logical or even physical construct that has 300 elements unto itself, but it's possible. Once you get into attribute data it changes; let's say we're talking about a vehicle. We could be talking about a car, a truck, a semi, a motorcycle, a bicycle, etc. Each of those types of vehicle has different properties, which would be managed in separate tables to normalize the data. Moving elements of them to different pages wouldn't make a whole lot of sense, but moving common attributes might. For example, I wouldn't ask about color in section 1 and again in section 4. But I might section things out to describe make, model, and then custom attributes.
I'm building a yellow pages site. I tried multiple database structures and I'm not sure which one is best. Here are a few I considered:
Saving all business data - name, phone, email etc in one table, list of tags in another, and mapping data id and tag id for tag-data relationship in a third table. I found this cumbersome since I'll be doing most things directly in the database (at least initially, before launch) and hence distributing everything can be problematic in my case. This one is a clean solution I must admit though.
Saving biz entries in one table with a separate column for tags (containing comma-separated or JSON tags for every entry), then retrieving results using a LIKE query or full-text search for a tag. This one will be slower and will get slower still as the database grows. It's also not easy to maintain, for example if I have to rename a tag.
(My Preferred Choice) Distributing biz data in different tables based on type - all banks in one, hotels, restaurants etc in separate tables. A separate table for all tags containing a rule for searching data from the table. Here is a detailed explanation.
Biz Tables:
college_tbl, bank_tbl, hotel_tbl, restaurant_tbl...so on
Tags Table
ID | Biz Table | Tag Name | Tag Key | Match Rule (col:like_query_part)
1 | bank_tbl | Citi Bank Branches | ['citi','bank'] | 'name:%$1%$2%'
2 | restaurant_tbl | Pizza Hut Restaurants | ['pizza','hut'] | 'name:%$1%$2%'
3 | hotel_tbl | The Leela Hotels | ['the leela'] | 'name:%$1%'
I'll then use 'Match rule' in like query to fetch results from 'Biz Table' for 'Tag Name'.
I'm going forward with the third approach. I feel it's simple, removes the need for a third data-tag relationship table, makes renaming easy, and performance won't degrade if each table has limited entries, say 1 million max per table.
I've been scratching my head for the last 15 days to find the best structure and feel this one is pretty good in my case.
Please suggest a better approach or if this approach could have some issues later on.
Use Number 1. Period, full stop.
The mistake is "doing things directly in the database" rather than developing the API first.
Number 2 has one advantage -- FULLTEXT search. That can be tacked onto #1 after you have a working API and some data to play with.
Number 3 (multiple similar tables) is a fiasco. Numerous Q&A ask about such; the reply is always "NO".
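For reference, a small sketch of what option 1 might look like, again using Python's sqlite3 for illustration rather than MySQL. The table names (business, tag, business_tag), the columns, and the sample rows are all assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE business (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        phone TEXT
    );
    CREATE TABLE tag (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    -- Mapping table: one row per (business, tag) pair.
    CREATE TABLE business_tag (
        business_id INTEGER NOT NULL REFERENCES business(id),
        tag_id      INTEGER NOT NULL REFERENCES tag(id),
        PRIMARY KEY (business_id, tag_id)
    );
    INSERT INTO business VALUES (1, 'Citi Bank, Main St', NULL),
                                (2, 'Pizza Hut, Mall Rd', NULL);
    INSERT INTO tag VALUES (1, 'bank'), (2, 'restaurant'), (3, 'pizza');
    INSERT INTO business_tag VALUES (1, 1), (2, 2), (2, 3);
""")

def businesses_for_tag(conn, tag_name):
    """Names of all businesses carrying the given tag."""
    rows = conn.execute(
        """SELECT b.name
           FROM business b
           JOIN business_tag bt ON bt.business_id = b.id
           JOIN tag t ON t.id = bt.tag_id
           WHERE t.name = ?
           ORDER BY b.name""",
        (tag_name,),
    )
    return [name for (name,) in rows]

def rename_tag(conn, old_name, new_name):
    """Renaming touches a single row; the mappings stay valid."""
    conn.execute("UPDATE tag SET name = ? WHERE name = ?", (new_name, old_name))
```

Wrapping the joins in functions like these is roughly what "developing the API first" could mean in practice: the three tables stop being cumbersome once nobody edits them by hand.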
For some time now I've been thinking about a nice way to dynamically load code in PHP based on database entries. I've tried to look up something related, but couldn't really find anything that answered my question(s) thoroughly. I'm using Laravel - not sure if it offers something to solve this particular problem.
See the following "example data", where I tried to give a quick overview of the database structure. So for a game, let's say we have characters that can be at a location. Any location basically looks the same. You have a form to write a message at this location - nothing fancy. But then there might be some exceptions. For example, you might have a location that implements some more logic, such as listing all online characters (and showing their current location's name). Or a location might show some other additional content.
This is what I came up with so far, but neither seems optimal:
Creating different tables for different types of locations (this seems very bad to me actually).
Create another table, e.g. modules, and have a many-to-many relationship with the location table. The modules table would then have entries for a location that tell the application what to "execute". For example, a location might have the entries thread, for allowing users to post messages; list, for showing an overview of online characters and their locations; trader, to allow some gameplay mechanics; etc.
To me the second option seems to be the right way of achieving what I want. I would have classes, functions etc. that represent these modules. But it too seems very hardcode-y: in the code-behind I would have to distinguish between values which I might want to change in the future, which would then require a lot of refactoring. And because I have to distinguish between strings, this design is prone to typos and whatnot...
character:
id | location_id (nullable) | name
---------------------------------
1 | 1 | Test
location:
id | name
----------------
1 | Location #1
2 | Online Characters
For the above presented version I would add:
modules:
id | name
---------
1 | thread
2 | list
3 | trader
location_modules:
location_id | modules_id
------------------------
1 | 1
1 | 3
2 | 2
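One way the second option could be kept from becoming stringly-typed is a single registry that maps each module name to its handler. The sketch below is in Python rather than PHP, purely to show the shape; the handler functions, the Module enum, and the LOCATION_MODULES dict (standing in for rows read from location_modules) are all hypothetical.

```python
from enum import Enum

class Module(Enum):
    """The only place where the module name strings are spelled out."""
    THREAD = "thread"
    LIST = "list"
    TRADER = "trader"

def render_thread(location_name):
    return f"message form at {location_name}"

def render_list(location_name):
    return f"online characters at {location_name}"

def render_trader(location_name):
    return f"trader at {location_name}"

# Registry: one handler per module.
HANDLERS = {
    Module.THREAD: render_thread,
    Module.LIST: render_list,
    Module.TRADER: render_trader,
}

# Stand-in for the location_modules join, keyed by location_id.
LOCATION_MODULES = {1: ["thread", "trader"], 2: ["list"]}

def render_location(location_id, location_name):
    """Run every module attached to a location, in stored order."""
    names = LOCATION_MODULES.get(location_id, [])
    # Module(name) raises ValueError for an unknown or misspelled name.
    return [HANDLERS[Module(name)](location_name) for name in names]
```

With this shape a typo in the database fails loudly at lookup time instead of silently skipping a module, and renaming a module means changing one enum value plus the matching row.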
I am developing a (potentially) large-scale tracking software that tracks customer data, along with tickets that are created for tasks associated with said customers. This system is written entirely in PHP, and the database is MySQL.
The system currently supports multiple "locations" (stores for example), and each has its own table for customer data (in the same database, each database can be host to a whole different business' installation). For example:
store1_customers
customer_id | customer_firstname | customer_lastname
----------------------------------------------------
1 | John | Doe
2 | Bill | Bob
store2_customers
customer_id | customer_firstname | customer_lastname
----------------------------------------------------
1 | Jill | Smith
2 | Jimmy | Person
This works great for keeping locations separate for different business needs. However, we are running into the need to have "global" customers for other instances that can be accessed from any location, while keeping other customers separate.
The two options I can think of are to either make a new "global_customers" table that can then be pulled from separately, or to merge all of the data into one large table.
I have concerns with both methods. The first would require a new column in every table that references the customer to determine which customer table to pull from. For example, store1_tickets would have to know whether to pull the customer ID of 1 from store1_customers or from global_customers. This seems to be a bit dirty, and I think would present problems with trying to do my multiple JOIN queries.
The second method of making one giant table concerns me in two ways: the first being the size of the table (each table so far can have potentially 20k+ records, and there are 7 locations for just one particular installation of the "software"). I know this point may be moot due to how MySQL works and can handle it. The second concern is merging the existing data. I see it being a nightmare since each table has a 1-20k customer ID, and I would have to have some way of changing thousands upon thousands of existing records in other tables to match the new numbering of this table.
Is there a better way, or more proper way of accomplishing this? I'm sorry if this question does seem subjective, but it does come down to a database problem and how to handle the data in a reasonable way.
Merge all the data into one large table. That is how databases are designed to be used.
For data migration, you will end up with new keys; there is no way around that. You could, however, add a new column to store the 'legacy' ID. This is just some of the pain associated with normalizing a database. Take the pain now rather than persisting with a sub-optimal database design.
Customer type would be another column within the customer table; probably (but depending on your requirements) this would be a FK to a CustomerType table.
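A rough sketch of such a migration, using Python's sqlite3 in place of MySQL. The merged customers table, its store_id and legacy_id columns, and the helper new_customer_id are assumptions made for the example; the per-store tables and rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE store1_customers (
        customer_id INTEGER PRIMARY KEY,
        customer_firstname TEXT, customer_lastname TEXT);
    CREATE TABLE store2_customers (
        customer_id INTEGER PRIMARY KEY,
        customer_firstname TEXT, customer_lastname TEXT);
    INSERT INTO store1_customers VALUES (1, 'John', 'Doe'), (2, 'Bill', 'Bob');
    INSERT INTO store2_customers VALUES (1, 'Jill', 'Smith'), (2, 'Jimmy', 'Person');

    -- Merged table: customer_id is the new key, legacy_id keeps the old one.
    CREATE TABLE customers (
        customer_id        INTEGER PRIMARY KEY,
        store_id           INTEGER,
        legacy_id          INTEGER,
        customer_firstname TEXT,
        customer_lastname  TEXT
    );
""")

# Copy every per-store table into the merged one. The table name is built
# from a fixed list of store ids, never from user input.
for store_id in (1, 2):
    conn.execute(
        f"""INSERT INTO customers
                (store_id, legacy_id, customer_firstname, customer_lastname)
            SELECT ?, customer_id, customer_firstname, customer_lastname
            FROM store{store_id}_customers
            ORDER BY customer_id""",
        (store_id,),
    )

def new_customer_id(conn, store_id, legacy_id):
    """Translate an old per-store id into the merged table's key."""
    row = conn.execute(
        "SELECT customer_id FROM customers WHERE store_id = ? AND legacy_id = ?",
        (store_id, legacy_id),
    ).fetchone()
    return row[0] if row else None
```

The same (store_id, legacy_id) lookup is what would drive the updates to tickets and other referencing tables, one store at a time.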
I am developing a web application where users can create the following resources/contents:
Events | Music | Posts | Classifieds
They have a lot of fields in common, such as:
created_date | title | desc | user_id
Now I am wondering if I should create separate tables for each content type, or save them all in one table with a type_id foreign key that points to a content_type table. Of course, there will be some distinct fields that are only used by specific content types; for the types not using those fields, I can just leave them blank.
Data looks more organized with separate tables for each content type, but searching for a keyword across all tables is becoming a nightmare(with joins, unions etc). If it was just a single table, searching will be very easy.
I need that the user be able to search across all content with a keyword. He would also be able to search specific contents, for that I will do a WHERE clause on the type_id field.
I am not aware of all the pros/cons of each method, but I would appreciate it if people could advise me so that I don't make the wrong decision and have to redo everything from the start.
Maybe think of using the "has a" relationship. For instance, an event "has a" "web item handle" attached to it, and a "web item handle" is a thing with description, created date, title, 'owner' etc...
Unless they truly have identical data, I would use separate tables. Having one table with some fields only used by specific content types is really not very good database design.
If you really want one table with the basic data, you could create one as you suggested with a content_type and the common fields, and then have 4 separate tables, one per type, with the other distinct fields, then do an inner join when you select the fields for that type. But personally I think you are better off just creating 4 tables.
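A sketch of that middle option, with Python's sqlite3 standing in for the real database. Only two of the four types are shown; the type-specific columns venue and artist, the sample rows, and the column name description (used instead of desc, which is a reserved word in SQL) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Common fields shared by every content type.
    CREATE TABLE content (
        id           INTEGER PRIMARY KEY,
        content_type TEXT NOT NULL,
        title        TEXT NOT NULL,
        description  TEXT,
        user_id      INTEGER,
        created_date TEXT
    );
    -- One table per type for the distinct fields.
    CREATE TABLE event (
        content_id INTEGER PRIMARY KEY REFERENCES content(id),
        venue      TEXT
    );
    CREATE TABLE music (
        content_id INTEGER PRIMARY KEY REFERENCES content(id),
        artist     TEXT
    );
    INSERT INTO content VALUES
        (1, 'event', 'Jazz night', 'Live jazz downtown', 7, '2024-01-05'),
        (2, 'music', 'Blue in Green', 'A jazz ballad', 7, '2024-01-06'),
        (3, 'event', 'Book fair', 'Used books', 8, '2024-01-07');
    INSERT INTO event VALUES (1, 'Club 21'), (3, 'Town hall');
    INSERT INTO music VALUES (2, 'Some Trio');
""")

def search(conn, keyword, content_type=None):
    """Keyword search over the common fields, optionally limited to one type."""
    sql = "SELECT id FROM content WHERE (title LIKE ? OR description LIKE ?)"
    params = [f"%{keyword}%", f"%{keyword}%"]
    if content_type is not None:
        sql += " AND content_type = ?"
        params.append(content_type)
    return [row[0] for row in conn.execute(sql + " ORDER BY id", params)]

def event_details(conn, content_id):
    """Inner join that pulls the common and event-specific fields together."""
    return conn.execute(
        """SELECT c.title, e.venue
           FROM content c JOIN event e ON e.content_id = c.id
           WHERE c.id = ?""",
        (content_id,),
    ).fetchone()
```

The cross-type keyword search only ever touches the content table, so no unions are needed for it.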
I have a table that defines the possible categories in my website - fields look something like this:
- id
- name
- parentID
The information is stored something like this:
+-----+------+----------+
| id | name | parentID |
+-----+------+----------+
| 1 | pets | 0 |
+-----+------+----------+
| 2 | cats | 1 |
+-----+------+----------+
| 3 | dogs | 1 |
+-----+------+----------+
A parentID of 0 indicates that the category/page is on the home level. I'm looking for a way to quickly and easily generate the parent categories.
The first method that came to mind was a series of SQL queries, but I quickly realised that this would become increasingly resource-intensive the more complicated the site got.
Reading through the MySQL manual, I've seen that MySQL can use loops and conditional statements; however, I'm unsure how I'd put those into practice here.
Ideally, I'd like to have a single query that pulls up all directly related parent elements.
If I were looking at the Pets category, I would only see home because it's on the top level. As soon as I drill down (either into cats, dogs or a page under pets) then I should see pets on the bar - the same goes for subsequent child categories and pages.
What's the most efficient way to generate a list of categories using information stored in this fashion? If this question requires more clarification, please ask, and I will do my best to provide more information.
Clarification: This is part of a CMS - and as such, users are going to need the ability to make changes to categories on the fly. I've looked at several data storage schemes (such as nested sets) and they do not appear to lend themselves well to a simple form for making changes to navigation.
As such, any method needs to be a) easily understood by a user, and b) easy for a user to work with.
The categories are best described as folders on a PC, rather than tags. When you view any given category, you can see the immediate children of that category, as well as immediate child pages.
When you view a category or a page, the parent categories (but not the item itself) are visible.
Example: I have German Shepard which resides under dogs which is under pets
When viewing *pets*: Home
When viewing *dogs*: Home -> Pets
When viewing *German Shepard*: Home -> Pets -> Dogs
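For what it's worth, the kind of single query being asked for might look like the recursive common table expression sketched below. It is run here through Python's sqlite3; MySQL supports WITH RECURSIVE from version 8.0. The table name category and the row with id 4 for German Shepard are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT, parentID INTEGER);
    INSERT INTO category VALUES
        (1, 'pets', 0), (2, 'cats', 1), (3, 'dogs', 1),
        (4, 'German Shepard', 3);  -- assumed row, taken from the example
""")

def breadcrumb(conn, category_id):
    """Parents of a category, top level first, excluding the category itself."""
    rows = conn.execute(
        """WITH RECURSIVE ancestors(id, name, parentID, depth) AS (
               -- Start at the direct parent of the category being viewed.
               SELECT p.id, p.name, p.parentID, 1
               FROM category c JOIN category p ON p.id = c.parentID
               WHERE c.id = ?
               UNION ALL
               -- Then keep climbing until parentID is 0 (home level).
               SELECT p.id, p.name, p.parentID, a.depth + 1
               FROM ancestors a JOIN category p ON p.id = a.parentID
           )
           SELECT name FROM ancestors ORDER BY depth DESC""",
        (category_id,),
    )
    return ["Home"] + [name for (name,) in rows]
```

This keeps the id / name / parentID layout unchanged, so the editing form for categories would not need to know about it.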
Consider using "nested sets" model instead: Managing Hierarchical Data in MySQL.
Update (based on clarification to the question): The nested sets model does not have to be exposed to end users (in fact, I have a pretty hard time imagining why it would be). All directory-style operations (adding a new folder / subfolder, moving a folder to a different path, etc.) can be supported in the nested sets model, though some are a bit harder to implement than others. The article I've linked to provides examples for both adding and deleting a (sub)folder.
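As an illustration of why nested sets suit the breadcrumb case, here is a small sketch using Python's sqlite3. The table name nested_category, the lft / rgt column names, and the concrete bound values are assumptions for this example; they follow the usual convention that a parent's lft and rgt enclose those of all its descendants.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nested_category (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        lft  INTEGER NOT NULL,
        rgt  INTEGER NOT NULL
    );
    -- pets(1,8) encloses cats(2,3) and dogs(4,7); dogs encloses German Shepard(5,6).
    INSERT INTO nested_category VALUES
        (1, 'pets', 1, 8),
        (2, 'cats', 2, 3),
        (3, 'dogs', 4, 7),
        (4, 'German Shepard', 5, 6);
""")

def breadcrumb(conn, category_id):
    """All ancestors in one non-recursive query, outermost first."""
    rows = conn.execute(
        """SELECT parent.name
           FROM nested_category AS node
           JOIN nested_category AS parent
             ON parent.lft < node.lft AND parent.rgt > node.rgt
           WHERE node.id = ?
           ORDER BY parent.lft""",
        (category_id,),
    )
    return ["Home"] + [name for (name,) in rows]
```

The lft and rgt values would be maintained by the code behind the add / move / delete operations, so users would still only ever pick a parent folder.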
Could you have a stack or ordered set (ordered by how the user applied filters to their browsing) containing your breadcrumb, stored on the session?
I could see it getting grim when you started cross-querying, but sometimes data isn't hierarchical, but more of a soup of tags, and the above starts being your tag-soup clarification breadcrumb.
Most websites don't actually feature good (or any) tag soup drilling down. E.g., how many times have you been looking at the sale CDs on a website, and wanted to drill down to just see the Metal CDs (for example), but clicking on the "Rock and Metal" link on the left took you out to the top-level metal category, instead of acting as a filter on your current browsing state?
So - is your problem actually a tag soup that you're applying a false hierarchy onto? Should you in fact be looking at automatic tag generation libraries that you can pass your items into, and tag lookup mechanisms? Okay, I'm sure your personal website won't be complex enough to ever require tag search, but in general terms, I think it is worth thinking about.