For the past couple of years I've been working on my own lightweight PHP CMS that I use for my personal projects. The one thing it's missing is an easy database solution.
I am looking to create a simple content-type database framework in which I can specify a new type (user, book, event, etc.) and then be able to load everything related to it automatically.
For some content types there will be fields that can hold only one value and others that can have zero to many values, so I will use a separate table for the multi-valued ones. Take this example:
table: event
columns: id, name, description, date
table: event_people
columns: id_event, id_user
table: event_pictures
columns: id_event, picture
Events will have a number of fields that hold a single value, such as the description, but there can also be any number of pictures and people going to them.
I want to be able to create a generic PHP class that will load all the information for a content type. My current thought is to make an entity loader function that takes an id and a type:
Entity::load($id, "event");
From this I was going to get all of the tables with the prefix "event", load all of the data with the passed-in ID, and then store it in a multidimensional array. I feel like there is probably a more efficient way to do this, however. I'd like to stay away from having a config file somewhere that specifies all of the content types and their child tables, because I want to be able to add a new child table and have it picked up automatically.
Is there any way to store this relationship directly within MySQL itself? I don't do a lot of database work and I've only recently started to use foreign keys (what a lifesaver). Would it be more efficient to see which tables have a foreign key referencing the id column in the event table, and if so, how would this be done? I'm also open to different ways of storing this information.
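The closest I've found so far is that MySQL exposes foreign key metadata through information_schema, so maybe the child tables could be discovered with a query along these lines (just a sketch, assuming the child tables declare InnoDB foreign keys; the schema name my_cms is a placeholder):

SELECT TABLE_NAME, COLUMN_NAME
FROM information_schema.KEY_COLUMN_USAGE
WHERE REFERENCED_TABLE_SCHEMA = 'my_cms'   -- placeholder schema name
  AND REFERENCED_TABLE_NAME = 'event'
  AND REFERENCED_COLUMN_NAME = 'id';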
Note: I'm doing this just for fun, so please don't refer me to any premade frameworks. I'd like to create this myself.
I think your approach of searching for all tables with the prefix "event" is sensible. The only way I can think of to be more efficient is to have an "entity_relationship" table that you could query. It would give you flexibility in your naming convention, avoid naming conflicts, and the lookup should be more efficient than a pattern-match search.
Then, whenever a new object type with its own table is added, you would make an entry in the relationship table:
INSERT INTO entity_relationship VALUES
('event','event_people'),
('event','event_pictures'),
('event','event_documents'),
('event','event_performers');
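The relationship table itself can be as small as two columns, and finding the child tables for a type is then a single indexed lookup (a sketch; the column names are placeholders):

CREATE TABLE entity_relationship (
    entity      VARCHAR(64) NOT NULL,    -- parent content type, e.g. 'event'
    child_table VARCHAR(64) NOT NULL,    -- table holding the related rows
    PRIMARY KEY (entity, child_table)
);

-- find every child table for a content type
SELECT child_table FROM entity_relationship WHERE entity = 'event';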
I'm creating a MySQL database for a small app. The problem is that there are many fields that are identical across different tables, like:
Table 1: Municipal Issues:
ID, UserID, Title, Location, Description, ImageURL
Table 2: Harassment Issues:
ID, UserID, Title, Location, Description, ImageURL
Table 3 is the same as above. All of these tables have almost the same columns.
I want to ask whether it's better to use relations and create one table for handling the IDs and link it with the other details, or whether it's better to create a single table with an extra column for the issue type.
On the one hand there will be too many tables with identical columns; on the other hand there will be a few tables with very many rows in them.
Which will be better for performance: more rows or more tables?
I'm using MySQL.
Firstly, unless you expect millions of records, don't care that much about performance; care more about the structure of your data and how easy it will be to access it. Literally write down a list of the data that you plan to extract in your app, e.g. "find all issues today", "find all unresolved issues older than 6 months", and then try to build real SQL queries against your expected structure. If they turn out to be hard to write, try to change the structure.
To answer your question: it depends. The current structure has the following benefits:
It's easy to query a certain type of issue
It's easy to build a PHP application - just make one template form (or model) and then copy-paste it with slight changes for the other tables
In case of performance problems it may be easier to create a cluster by simply putting each table on a different db server.
and the following downsides:
It's inflexible. Adding a new field that you forgot to include at the beginning will be painful, since you'll have to change 3 (or more) tables and the same number of places in your app.
Adding new types of issues will be painful and will require creating a new table.
Building SQL for data like "all unresolved issues (regardless of type)" will require complicated UNIONs. Moreover, these UNIONs will require creating a virtual field with the issue type, otherwise you can't tell from which table a certain id came.
The classical db approach recommends using one table for the common fields and creating derived tables for the fields that are different. So:
issues table should have all common fields and is identified by PK issue_id
municipal_issues uses the foreign key to issues.issue_id and has only the specific fields
harassment_issues uses the foreign key to issues.issue_id and has only the specific fields
Also, the issues table has an issue_type field that takes values like "harassment" and "municipal" and helps you find the table where the additional data is stored.
This pattern is called "Class Table Inheritance" and you may check out the SQL Antipatterns presentation for more info and other approaches. This solves the flexibility issue and still allows re-creating each of the original tables with only one simple JOIN, which runs pretty fast.
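A minimal sketch of that layout, using the column names from the question (the per-type columns are placeholders):

CREATE TABLE issues (
    issue_id    INT AUTO_INCREMENT PRIMARY KEY,
    issue_type  VARCHAR(20) NOT NULL,     -- 'municipal', 'harassment', ...
    UserID      INT NOT NULL,
    Title       VARCHAR(255),
    Location    VARCHAR(255),
    Description TEXT,
    ImageURL    VARCHAR(255)
);

CREATE TABLE municipal_issues (
    issue_id INT PRIMARY KEY,
    -- columns specific to municipal issues would go here
    FOREIGN KEY (issue_id) REFERENCES issues(issue_id)
);

-- re-create the original municipal table with one simple JOIN
SELECT i.*, m.*
FROM issues i
JOIN municipal_issues m ON m.issue_id = i.issue_id;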
Also as a side note you may look into the db schema of bug-trackers like Mantis since this looks like the same domain.
I'm not a database expert, so I'm not sure how to ask this question briefly and succinctly. I am trying to copy data with the following characteristics: many of the tables with data being copied contain references to other tables with data being copied; i.e., a patient might attend a class where their weight is recorded, so I need to copy both the class attendance row as well as the weight value stored in another table, which is referenced by the class attendance row. There are other, even more complex, examples in this database, but it seems that I need to perform some kind of recursive copy of these inter-referenced items so I can maintain the cross-references in the copied data.
So, is there any kind of standard approach to this problem? If there isn't a direct answer, could someone share the terminology of what I'm trying to do so that I can look it up on my own? I'm certain this problem has been tackled many times before, but I don't know how to find the solution. I understand the basic concepts of JOINs and FKs, but this solution seems to require a way to copy the rows from various tables while also going back and updating the cross-references (in some cases, these are FKs, and in other cases, they are not; I'm stuck with the schema as it is).
PS: If it's such an obvious solution, why won't anyone just provide it or characterize it below so we can move on? Most of humanity is capable of asking the occasional dumb question, and this may very well be one of mine, but I'm seriously stuck on this one and would appreciate some assistance.
Here's a sketch of a small part of the schema to try to illustrate the issue:
When we copy a patient's data record, we need to 1) create a new row in patient; 2) create a corresponding new row in edclass_session_labs; 3) create a new row in patient_lab_weight; and (here's what I see as the tricky part) 4) also update the reference in edclass_session_labs to the new row in patient_lab_weight. What I'm looking for is a way to do this programmatically and algorithmically. I'm sure problems like this have been tackled before, so that's why I'm asking for advice here.
I didn't fully understand what you mean by "copy patient data", so here are two options:
1) If you want to "copy" the data to a report, you need to link many tables with related information, so you have to study the concept of JOINs and FOREIGN KEYs. This is what we do when we need to convert relational data into a flat table that can be easily read by non-IT people.
2) If you need to copy specific data from database tables to other database tables, you also have to study FOREIGN KEYs and table relationship. You need to understand how table rows relate to rows on other tables (one to many, many to one, many to many), so you can create INSERT statements based on SELECTs that will filter the exact data you need.
This is very general, but I think it's sufficient to point you in the right direction.
EDIT:
Since the issue is related to creating a merged structure of patient data, let's say we have patient 1 and patient 2. They are duplicates of the same person, and need to be merged. I would do this, in this order:
a) Create a patient 3; this one will be the target of our merge. Simply copy each field from patient 1 or 2 to this new record.
b) Create as many new records as needed in the table "patient_lab_weight". For example: if patient 1 has 2 records there and patient 2 has 4 records, you will have to create 6 records, which are copies of the records related to patients 1 and 2, but with patient_id = 3. However, after creating each record here, obtain the auto_increment value generated for the field "patient_lab_weight_id" and insert a new record in "edclass_session_labs" with patient_id = 3 and "patient_lab_weight_id" = the obtained ID. Do that for each insert on "patient_lab_weight".
c) After all that, disable patients 1 and 2 in your application.
If you use this approach, you will slowly build up your new structure, linked in a consistent way.
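A rough sketch of step b) in SQL, assuming patient_lab_weight_id is an auto_increment column; the weight_value column and the id 101 are placeholders standing in for whatever the real schema uses:

-- copy one of the old weight rows over to the merged patient 3
INSERT INTO patient_lab_weight (patient_id, weight_value)
SELECT 3, weight_value
FROM patient_lab_weight
WHERE patient_lab_weight_id = 101;    -- the original row being copied

-- point a new class-attendance row at the copy that was just created
INSERT INTO edclass_session_labs (patient_id, patient_lab_weight_id)
VALUES (3, LAST_INSERT_ID());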
I'm hitting a dead end with the best practice for storing a large number of options and values in my MySQL database and then assigning them to properties. The way I usually do this (the example is for real estate) is to create a table called "pool", then have an auto-increment value as the ID and a varchar to store the value, in this case "Above Ground", plus another row for "In-ground". Then in my property table I would have a column for "has_pool" with the proper ID value from the "pool" table assigned. Obviously the problem is that with hundreds of options (fireplace, water view, etc.) for each property, my number of database tables will get very large very fast, and my left joins would become out of control on the front end.
Can someone point me in the right direction on what the best practice would be to easily populate new values for the property attributes and keep the query count down to a minimum? I feel like there is a simple solution but my research so far has not made it apparent to me. Thank you!
One way you could do this is to create an 'options' table with three columns: id, menuId, value.
Create another table called menus, with two fields: id and name.
Add the menu names (pool, fireplace etc.) to the menus table, and then add the possible values to the options table, including the id of the menu it is related to.
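A sketch of that layout; the property_options link table at the end is an extra assumption for attaching chosen options to a property, and all names are illustrative:

CREATE TABLE menus (
    id   INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(64) NOT NULL               -- 'pool', 'fireplace', ...
);

CREATE TABLE options (
    id     INT AUTO_INCREMENT PRIMARY KEY,
    menuId INT NOT NULL,
    value  VARCHAR(64) NOT NULL,            -- 'Above Ground', 'In-ground', ...
    FOREIGN KEY (menuId) REFERENCES menus(id)
);

CREATE TABLE property_options (
    property_id INT NOT NULL,
    option_id   INT NOT NULL,
    PRIMARY KEY (property_id, option_id),
    FOREIGN KEY (option_id) REFERENCES options(id)
);

-- all attribute values for one property in a single query
SELECT m.name, o.value
FROM property_options po
JOIN options o ON o.id = po.option_id
JOIN menus m ON m.id = o.menuId
WHERE po.property_id = 123;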
I'd store all the values serialized (e.g. JSON or XML or YAML) into a blob, and then define inverted index tables for attributes I want to be searchable.
I describe this technique and alternatives in my presentation Extensible Data Modeling with MySQL.
Also see http://bret.appspot.com/entry/how-friendfeed-uses-mysql
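Roughly, that approach looks like the following (a sketch; the table and column names are made up for illustration):

CREATE TABLE properties (
    property_id INT AUTO_INCREMENT PRIMARY KEY,
    attributes  BLOB          -- serialized JSON/XML/YAML with every option value
);

-- inverted index table for one attribute that needs to be searchable
CREATE TABLE property_pool_index (
    pool_value  VARCHAR(64) NOT NULL,
    property_id INT NOT NULL,
    PRIMARY KEY (pool_value, property_id),
    FOREIGN KEY (property_id) REFERENCES properties(property_id)
);

-- find every property with an in-ground pool without touching the blob
SELECT property_id FROM property_pool_index WHERE pool_value = 'In-ground';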
I'm in the early stages of creating a database using MySQL and PHP and would like some advice, please. I have started to collate the data and would like to start typing it into .csv files ready to import into my tables. Before I do, I'm unsure how to properly lay out and structure my columns and tables.
Ok, I'll try my best to make clear what I'm trying to create. I'd have my home page structure where you have a choice of selecting a list of players by season or by an A-Z list of all-time players. Once you click on a specific player from the list of players it would show something like this for their player profile: http://stats.touch-line.com/playerdet.asp?playerid=41472&cust=2&lang=0&FromSTR=TRUE&compid=&teamid=1&H2H=
How many tables would I need to create?
A player table with playerID,playerName,playerDOB,playerBirthplace,playerPosition etc.
A team table with teamID,teamName,teamNickname,teamGround,teamFounded etc.
A season table with seasonID,playerID,teamID,playerApps,playerGoals?
Or is there a quicker, more efficient way without the need to use so many tables to link the data? Any advice would be much appreciated. Thanks in advance. ;)
How many tables do I need to create?
The short answer is: one table for each "entity" type. An entity can be defined as a person, place, thing, concept or event that can be uniquely identified, is of interest to the business, and about which we can store information.
One key to database design is data analysis (Richard Perkinson "Data Analysis: The Key to Database Design", QED c.1993)
You've identified some of the important entities in your model: player, team, season. There may be some other key entities that are missing, which may be discovered later.
The attributes of each entity need to be identified, and should be dependent on the key of the entity, and not some other key. (Every attribute should be dependent on the key, the whole key, and nothing but the key, so help me Codd.)
You also need to identify the relationships that exist between the entities. Can a player be a member of more than one team? Can a player have more than one position? If a player is traded (moves from one team to another), how will that be represented in the model?
Where we encounter "many-to-many" relationships, those are represented in separate relationship tables. Repeating attributes also get broken out into separate child tables.
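As a rough sketch under those rules, using the column names from the question, the "season" stats become a relationship table keyed by season, player and team (the names here are illustrative, not prescriptive):

CREATE TABLE player (
    playerID         INT AUTO_INCREMENT PRIMARY KEY,
    playerName       VARCHAR(100) NOT NULL,
    playerDOB        DATE,
    playerBirthplace VARCHAR(100),
    playerPosition   VARCHAR(30)
);

CREATE TABLE team (
    teamID       INT AUTO_INCREMENT PRIMARY KEY,
    teamName     VARCHAR(100) NOT NULL,
    teamNickname VARCHAR(100),
    teamGround   VARCHAR(100),
    teamFounded  YEAR
);

-- one row per player, per team, per season (handles players who move between teams)
CREATE TABLE player_team_season (
    seasonID    VARCHAR(9) NOT NULL,      -- e.g. '2023/24'
    playerID    INT NOT NULL,
    teamID      INT NOT NULL,
    playerApps  INT DEFAULT 0,
    playerGoals INT DEFAULT 0,
    PRIMARY KEY (seasonID, playerID, teamID),
    FOREIGN KEY (playerID) REFERENCES player(playerID),
    FOREIGN KEY (teamID) REFERENCES team(teamID)
);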
It's important that you get the model right before you start combining multiple entities into the same table. That kind of optimization usually results in a broken model; it rarely fixes a model that doesn't work.
Databases are designed to handle large numbers of rows efficiently when the queries are in line with the model. Databases with dozens of tables can run very efficiently, often more efficiently than databases with fewer tables.
I'd be more concerned with getting a database design that works, than I would be concerned with optimizing a design that doesn't work.
I'm developing software for conducting online surveys. When a lot of users are filling in a survey simultaneously, I'm experiencing trouble handling the high database write load. My current table (MySQL, InnoDB) for storing survey data has the following columns: dataID, userID, item_1 .. item_n. The item_* columns have different data types corresponding to the type of data acquired by the specific items. Most item columns are TINYINT(1), but there are also some TEXT item columns. Large surveys can have more than a hundred items, leading to a table with more than a hundred columns. A user answers around 20 items in one HTTP POST and the corresponding row has to be updated accordingly. The user may skip a lot of items, leading to a lot of NULL values in the row.
I'm considering the following solution to my write load problem. Instead of having a single table with many columns, I would set up several tables corresponding to the data types used, e.g.: data_tinyint_1, data_smallint_6, data_text. Each of these tables would have only the following columns: userID, itemID, value (the value column has the data type corresponding to its table). For one HTTP POST with e.g. 20 items, I might then have to create 19 rows in data_tinyint_1 and one row in data_text (instead of updating one large row with many columns). However, for every item I need to determine its data type (via two table joins) so I know in which table to create the new row. My Zend Framework based application code will get more complicated with this approach.
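For reference, the per-type tables I have in mind would look something like this (just a sketch; the value type matches the table's suffix):

CREATE TABLE data_tinyint_1 (
    userID INT NOT NULL,
    itemID INT NOT NULL,
    value  TINYINT(1),
    PRIMARY KEY (userID, itemID)
);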
My questions:
Will my solution be better for heavy write load?
Do you have a better solution?
Since you're getting to the point of abstracting this schema to mimic actual data types, it might stand to reason that you should simply create new table sets per survey instead. The benefit is that locking will lessen, and you could isolate heavy loads onto separate machines if the load becomes unbearable.
The single-survey database structure then can more accurately reflect your real world conditions and data input handlers. It ought to make your abstraction headaches go away.
There's nothing wrong with creating tables on the fly. In some configurations, soft sharding is preferable.
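A minimal sketch of what one per-survey table could look like, assuming a survey with id 123 and a handful of items (the names and types are placeholders):

CREATE TABLE survey_123_responses (
    dataID INT AUTO_INCREMENT PRIMARY KEY,
    userID INT NOT NULL,
    item_1 TINYINT(1) NULL,
    item_2 TINYINT(1) NULL,
    item_3 TEXT NULL
    -- one column per item, typed to match that survey's items
);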
The obvious solution here would be to use a document database for fast writes and then bulk-insert the answers into MySQL asynchronously using cron or something like that. You can create a view in the document database for quick statistics, but allow filtering and other complicated queries only in MySQL if you're not a fan of document DBMSs.