Here is the use case my team is stuck on:
There are different types of objects in the system (e.g. user, photo, link, video, tag, phototag, etc.), and each object type has its own table. When looking up objects for any purpose (say live feed, activity tracking, tagging different object types, etc.), the system ideally needs to know three things:
1) The object ID
2) The object type
3) Object details from the object's own table, depending on the use case
For 1 and 2, I am handling it by adding two columns, object_id and object_type, which always tell me the ID and what kind of object the ID refers to. For 3, the problem is that to get object details I need to know which table each object type relates to. So how do I do that? I am using MySQL and CodeIgniter (PHP).
One way I can think of is to have a table that maps each object type to its schema table. The downside is that I then always have to join against it to get the table name before looking up that table, and I am hoping to skip any join. Is there anything that can be done within CodeIgniter, or any application logic I can add, that detects which table to reference dynamically based on the object type? And maybe I don't even need the object_type column, and the system can find the table on its own just from the object_id and the page_id or something else?
You are talking about polymorphic associations, which Rails currently supports. I don't think CodeIgniter supports this by default (you would need another layer, such as an ORM). The only thing I can think of is having a library handle this, but it will all come down to conditional statements (ifs) on the object type.
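As a rough illustration of that "library" idea, the conditionals can be collapsed into a lookup table in application code, so no extra join is needed. This is only a sketch in Python with SQLite standing in for PHP and MySQL; the OBJECT_TABLES map, the fetch_object helper, and the users/photos tables are all hypothetical names, not anything CodeIgniter provides.

```python
import sqlite3

# Hypothetical whitelist: object_type -> (table name, primary-key column).
# Only names from this map are ever interpolated into SQL.
OBJECT_TABLES = {
    "user": ("users", "id"),
    "photo": ("photos", "id"),
}


def fetch_object(conn, object_type, object_id):
    """Return the row for (object_type, object_id) as a dict, or None."""
    try:
        table, pk = OBJECT_TABLES[object_type]
    except KeyError:
        raise ValueError(f"unknown object type: {object_type!r}")
    # Table and column come from the whitelist; the id is a bound parameter.
    cur = conn.execute(f"SELECT * FROM {table} WHERE {pk} = ?", (object_id,))
    row = cur.fetchone()
    return dict(row) if row is not None else None


# Tiny in-memory demo (SQLite is an assumption; the question uses MySQL).
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE photos (id INTEGER PRIMARY KEY, url TEXT);
    INSERT INTO users  VALUES (1, 'alice');
    INSERT INTO photos VALUES (1, '/img/cat.jpg');
""")

user = fetch_object(conn, "user", 1)
photo = fetch_object(conn, "photo", 1)
```

The object_type column is still needed here: the same numeric id exists in both tables, so the id alone cannot identify the table.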
We have a php/mysql system with about 5 core entities. We now need to add the ability for customers to create custom fields for some of these entities on a per project basis.
They would contain a label, key, type, default value, and possible allowed values.
This is so they could add a custom date field, or a custom dropdown to the UI and save this value against the specific entity.
What is the best approach for storing this kind of data in a MySQL database? I need to store both the config for the field and the current value for a specific entity.
I've had a look at various options here.. https://ayende.com/blog/3498/multi-tenancy-extensible-data-model
But this is not really at a tenancy level, more a project level.
I was thinking...
A CustomFields table to hold the configuration of a field against an entity type and project id.
A CustomFieldValues table to hold the value saved against the field - a row per field ( entity_id | field_id | field_value)
Then we create relationships between the entities and these custom values when retrieving the entities.
The issue with this is that there will be as many rows in the Values table as there are custom fields, so saving an entity will result in X extra rows. On top of that, these are versioned, so once a new version is created, another X rows will be created for that new version.
Also, you can't index the fields by name, and joins would become pretty complex, I think, as you have to join to both the configuration and the values to build the key-value pairs to return with the entity. And how would you select based on a custom field name when the field name is actually a value?
I don't want to add dynamic columns to the table, as this would affect ALL the entities in the whole system, not just the ones in the current client / project.
The other option is to store the values in a JSON column.
This could be on the entity row itself, in a customFields column or similar. This would prevent the extra rows per field, but it also has issues with lack of indexing etc., and you would still need to join to the config table. However, you could perform queries by the property name if the key=value was stored in the JSON... WHERE entity.customFields->"$.myCustomFieldName" > 1.
Storing the field name in the JSON does mean you cannot change it once created without a lot of pain.
If anyone has any advice on approaches for this, or articles to point me at, that would be much appreciated - I'm sure this has been solved many times before....
JSON records: No! A thousand times no! If you do that, just wait until somebody actually uses your system for a few tens of millions of records, then asks you to search on one of your extra fields. Your support people will curse your name.
Key-value store. Probably yes. There's a very widely deployed existence proof of this design: WordPress. It has a table called wp_postmeta, containing metadata fields applying to wp_posts (blog pages and posts). It's proven successful.
You will need to do some multiple joining to use this stuff. For example, to search on height and eye-color, you'd need
SELECT p.person_id, p.first, p.last, h.value AS height, e.value AS eye_color
FROM person p
LEFT JOIN attrib h ON p.person_id = h.person_id AND h.`key` = 'height'
LEFT JOIN attrib e ON p.person_id = e.person_id AND e.`key` = 'eye_color'
WHERE e.value = 'green' AND CAST(h.value AS SIGNED) < 160
As the CAST in that WHERE clause shows, you'll have some struggles with data type as well.
You'll need LEFT JOIN operations in this sort of attribute lookup; ordinary inner JOIN operations will suppress rows with missing attributes, and that might not work for you.
But, if you do a good job with indexes, you'll be able to get decent performance from this approach.
The table structure envisioned in my example doesn't have your table describing each additional field, but you know how to add that. It also doesn't have explicit support for multi-project / multitenant data separation. But you can add that as well.
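For what it's worth, here is one way the missing pieces might be added: a field-definition table scoped by project, and attribute rows that point at a field id rather than a free-text key. This is a sketch only, written in Python against SQLite rather than MySQL; the custom_field table and its columns (project_id, field_key, field_type) are assumed names, not part of the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, first TEXT, last TEXT);

    -- Hypothetical per-project field definitions.
    CREATE TABLE custom_field (
        field_id   INTEGER PRIMARY KEY,
        project_id INTEGER NOT NULL,
        field_key  TEXT NOT NULL,
        field_type TEXT NOT NULL,
        UNIQUE (project_id, field_key)
    );

    -- One row per (entity, field); values are stored as text.
    CREATE TABLE attrib (
        person_id INTEGER NOT NULL,
        field_id  INTEGER NOT NULL,
        value     TEXT,
        PRIMARY KEY (person_id, field_id)
    );
    -- Supports searching by field and value.
    CREATE INDEX attrib_field_value ON attrib (field_id, value);

    INSERT INTO person VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
    INSERT INTO custom_field VALUES (1, 1, 'height', 'int'),
                                    (2, 1, 'eye_color', 'text');
    INSERT INTO attrib VALUES (1, 1, '155'), (1, 2, 'green'), (2, 1, '180');
""")

QUERY = """
    SELECT p.person_id, h.value AS height, e.value AS eye_color
    FROM person p
    LEFT JOIN attrib h
           ON h.person_id = p.person_id
          AND h.field_id = (SELECT field_id FROM custom_field
                            WHERE project_id = :project AND field_key = 'height')
    LEFT JOIN attrib e
           ON e.person_id = p.person_id
          AND e.field_id = (SELECT field_id FROM custom_field
                            WHERE project_id = :project AND field_key = 'eye_color')
    ORDER BY p.person_id
"""

# Bob has no eye_color row, so the LEFT JOIN yields NULL (None) for him.
rows = conn.execute(QUERY, {"project": 1}).fetchall()
```

Because rows reference field_id, renaming a field only touches custom_field, which sidesteps the rename pain mentioned for the JSON approach.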
I'm building an application in PHP (oop) and mysql db, that has a 'News Feed' as its main page.
I have in my MySQL database tables for announcements, events, notifications etc. which I want to combine in the following way:
I want to join all these tables (which only have a datetime field in common) to create a flowing, central news feed that is mixed and ordered by the datetime field, while still knowing the type of each record.
The reason is that I want to create a different layout and display different details for each record type (just like in Facebook, where you have statuses, photos, events, articles and what not in the news feed).
As sure as I am that I'm not the only one asking this, I have no clue how to search for something like this on the internet (since it took me all these lines to explain the problem..).
My desired outcome is this (just for example purposes):
<!-- Announcement = red block !-->
-------------------------------------------------
[title] [post date=12.9.15 16:40]
[author]
-------------------------------------------------
[content]
-------------------------------------------------
<!-- Event = blue block !-->
-------------------------------------------------
[title] [creation date=12.9.15 12:55]
[place] [start-time] - [end-time]
-------------------------------------------------
[description]
-------------------------------------------------
10 people are going..
-------------------------------------------------
<!-- Announcement = red block !-->
-------------------------------------------------
[title] [post date=12.9.15 11:23]
[author]
-------------------------------------------------
[content]
-------------------------------------------------
EDIT: by the way, I'm using smarty template engine, if it does make any difference.
It's not what I would call a good solution, but it's possible to do this with subqueries and unions:
SELECT type, title, whoWhere, whenPosted, description
FROM (
    SELECT 'announcement' AS type,
           title,
           author AS whoWhere,
           posted AS whenPosted,
           content AS description
    FROM announcements
    WHERE 1
    UNION ALL
    SELECT 'event' AS type,
           title,
           place AS whoWhere,
           creationdate AS whenPosted,
           description
    FROM events
    WHERE 1
) AS feed
ORDER BY whenPosted
A better solution would be to make, say, a newsfeed table (or view) which has normalised fields.
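A view along those lines could look roughly like the following. It is a sketch under assumptions: the announcements and events columns are taken from the query above, while the newsfeed view name and the source_id column are made up for illustration, and SQLite is used in place of MySQL so the snippet runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE announcements (id INTEGER PRIMARY KEY, title TEXT, author TEXT,
                                posted TEXT, content TEXT);
    CREATE TABLE events (id INTEGER PRIMARY KEY, title TEXT, place TEXT,
                         creationdate TEXT, description TEXT);

    -- Hypothetical view exposing one normalised shape for every feed item.
    CREATE VIEW newsfeed AS
        SELECT 'announcement' AS type, id AS source_id, title,
               author AS whoWhere, posted AS whenPosted, content AS description
        FROM announcements
        UNION ALL
        SELECT 'event', id, title, place, creationdate, description
        FROM events;

    INSERT INTO announcements VALUES
        (1, 'Launch', 'dana', '2015-09-12 16:40', 'We are live'),
        (2, 'Hello',  'dana', '2015-09-12 11:23', 'First post');
    INSERT INTO events VALUES
        (1, 'Picnic', 'Park', '2015-09-12 12:55', 'Bring food');
""")

# The application queries the view as if it were a single table.
feed = conn.execute(
    "SELECT type, title FROM newsfeed ORDER BY whenPosted DESC"
).fetchall()
```

The type column tells the template which layout to use, and source_id lets it fetch the full record from the original table when needed.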
HOW DID I SOLVE THIS PROBLEM? This is my 'thank you' to Stack Overflow :)
[ INSERT COFFEE POT PIC HERE ]
I built an ORM abstract class that connects to the DB and has an interface such that the derived object is constructed by just entering its table in the db, primary-key column, etc..
In this ORM abstract class I have a function that fetches all the records in the specific table. It's a static function called all(). For each record in the table, it creates an object of the matching class and pushes it into an array. At the end, I get an array filled with objects that represent the entire table's data.
To create the feed - I took the two modules I wanted - announcements and events, created each class respectively - Announcement and Event, each one fetches data from its own table in the db.
I used array_merge to merge the results of Announcement::all() and Event::all(). Now I have an array which contains both all announcements and events in the db.
Now, I need to sort these posts by their date. In the ORM abstract class I made a magic method to do this, __get, which returns the specific field I need from the record details. Now I can use usort to sort the array by its date.
Final fix - I wanted to know the type of each post so I can have a different layout for each one. I solved it with instanceof that is supported by smarty.
FINAL RESULT:
private function get_feed_content()
{
    // Merge both record types into one list.
    $feed = array_merge(Announcement::all(), Event::all());

    // Sort newest first by creation date.
    usort($feed, function ($a, $b) {
        return strtotime($b->create_date) - strtotime($a->create_date);
    });

    return $feed;
}
Simple & elegant, just the way I wanted it to be. :)
Hope I inspired someone. Thanks for all those who read it and tried to help me.
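For readers outside PHP, the same merge, sort, and type-check idea might look like this in Python. It is only a sketch: the Announcement and Event classes below are stand-ins with assumed fields (title, place, create_date), not the ORM classes described above, and the lists are hard-coded instead of loaded from the database.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Announcement:
    title: str
    create_date: datetime


@dataclass
class Event:
    title: str
    place: str
    create_date: datetime


def get_feed_content(announcements, events):
    """Merge both record types and sort newest first."""
    return sorted(announcements + events,
                  key=lambda post: post.create_date, reverse=True)


feed = get_feed_content(
    [Announcement("Launch", datetime(2015, 9, 12, 16, 40)),
     Announcement("Hello", datetime(2015, 9, 12, 11, 23))],
    [Event("Picnic", "Park", datetime(2015, 9, 12, 12, 55))],
)

# The class of each item plays the role of instanceof in the template.
kinds = [type(post).__name__ for post in feed]
titles = [post.title for post in feed]
```

Note that this loads every record from both tables into memory before sorting, which is the cost the SQL-based answers try to avoid.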
We did this by creating a MySQL view that combines the tables, then we worked with the view as if it were a normalized table (created its own model for it and everything!). Later, if the view gets too heavy, we can always normalize the content with minimal refactoring. We may additionally look at solutions like getstream.io
The accepted solution looks rather heavy on PHP when you might be able to do something with MySQL directly and leverage its speed on UNION and ORDER BY.
So just if this helps anyone further, we are going to take the samlev answer (a massive UNION), but we need to have certain feeds only available to certain users. So we are going to...
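Put the UNION into a VIEW so it's ever present
Expose a key column in the VIEW that refers to a Conversation Model, which points to the Users allowed to access that conversation (a MySQL view cannot carry an actual foreign key constraint, so the relation lives in the model layer)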
HTH
Hi everyone.
I'm building a job board with CakePHP, and a little help designing the database would be appreciated!
I have a table jobs with different foreign keys:
id, recruiter_id, title, sector_id, division_id, experience_id etc.
The associated tables (sectors, divisions and experiences) have the same configuration, id, name and job_count, and sometimes one or two other fields (like company_count for sectors).
So I would like to know if there is a better way to design these tables. I thought of putting the three of them in one table named lists with the keys: id, value and list_name. With this configuration I would need just one request to get all the lists, not 3.
My question is: what is the "good way" to solve this? Maybe there's another one?
Seems kind of repetitive to have them in separate tables, when really they're all the same thing - properties of a job, and would have VERY similar table structures.
I would think you could create a single table for "job_properties" or something.
Each property could have a unique slug (if you wanted) or just use its id.
// job_properties table example
id
slug // (optional or could be called "key" if you prefer)
type // (optional - "sector", "division", "min_exp")
name // (for use on the names of things like "marketing" or "technology")
value // (int - for use on things like minimum experience)
Then each Job would hasMany JobProperty. It would also allow any job to have more than one sector if that is ever needed.
This would allow you to pull based on if a job has a particular property or set of properties and seems overall cleaner and more consolidated while not making it too obfuscated.
I think I found a solution by using a taxonomy system. I created a table terms which contains the list of all terms that can be associated (sector, division, type of contract, etc.).
Table terms id, name, type
And I created a second table term_relationships which contains all the associations, including the name of the model that is associated.
Table term_relationships id, ref, ref_id, term_id
"ref" refers to the associated model (example: Job or Applicant in my case), "ref_id" refers to the associated record (which job or which applicant) and term_id refers to which term is associated. I think this is the most extensible and cleanest solution.
Thanks all for your help (especially Grafikart from where I get the idea) and hope that this topic can help someone else !
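The two tables described above could be sketched like this. The column names follow the post; the index, the sample rows, and the terms_for helper are assumptions added for illustration, and SQLite is used instead of MySQL so the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE terms (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        type TEXT NOT NULL          -- 'sector', 'division', 'contract', ...
    );
    CREATE TABLE term_relationships (
        id      INTEGER PRIMARY KEY,
        ref     TEXT NOT NULL,      -- associated model, e.g. 'Job' or 'Applicant'
        ref_id  INTEGER NOT NULL,   -- id of the job or applicant
        term_id INTEGER NOT NULL REFERENCES terms(id)
    );
    -- Assumed index for "all terms of this record" lookups.
    CREATE INDEX term_rel_lookup ON term_relationships (ref, ref_id);

    INSERT INTO terms (id, name, type) VALUES
        (1, 'Marketing', 'sector'),
        (2, 'Sales', 'division'),
        (3, 'Permanent', 'contract');
    INSERT INTO term_relationships (ref, ref_id, term_id) VALUES
        ('Job', 10, 1), ('Job', 10, 3), ('Applicant', 10, 2);
""")


def terms_for(conn, ref, ref_id):
    """Return (type, name) pairs attached to one record of one model."""
    return conn.execute(
        """SELECT t.type, t.name
           FROM term_relationships r
           JOIN terms t ON t.id = r.term_id
           WHERE r.ref = ? AND r.ref_id = ?
           ORDER BY t.type""",
        (ref, ref_id),
    ).fetchall()


job_terms = terms_for(conn, "Job", 10)
applicant_terms = terms_for(conn, "Applicant", 10)
```

Job 10 and Applicant 10 share an id but not their terms, which is why the ref column has to be part of every lookup.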
I'm currently working on an app backend (business directory). Main "actor" is an "Entry", which will have:
- main category
- subcategory
- tags (instead of unlimited sub-levels of division)
I'm pretty new to OOP but I still want to use it here. The database is MySQL and I'll be using PDO.
In an attempt to figure out what database table structure I should use to support the above classification of entries, I was thinking about the solution that WordPress uses: establish the relationship between an entry and cats/subcats/tags through several tables (terms, taxonomies, relationships). What keeps me from this solution at the moment is the fact that each relationship of any kind is represented by a row in the relationships table. Given the 50,000 entries I would have, attaching a main cat, a subcat and up to 15 tags to each entry might slow down the app (or am I wrong)?
I then learned a bit about Table Data Gateway, which seemed an excellent solution because I liked the idea of having one table per class, but then I read there is virtually no way of successfully combating the impedance mismatch between OOP and the relational model.
Are there any other approaches that you may see fit for this situation? I think I will be going with:
tblentry
tblcategory
tblsubcategory
tbltag
structure. Relationships would be based on the parent IDs, but I'm wondering whether that is enough. Can I use foreign keys and cascade delete options here (that is something I am not too familiar with, and it seems to me a more intuitive way of having relationships between the elements in tables)?
having a table where you store the relationship between your table is a good idea, and through indexes and careful thinking you can achieve very fast results.
since each entry must represent a different kind of link between two entities (subcategory to main entry, tag to subcategory) you need at least (and at the very most) three fields:
id1 (or the unique id of the first entity)
linkid (linking to a fourth table where each link is described)
id2 (or the unique id of the second entity)
those three fields can and should be indexed.
now the fourth table to achieve this kind of many-to-many relationship will describe the nature of the link. since many different type of relationship will exist in the table, you can't keep what the type is (child of, tag of, parent of) in the same table.
that fourth table (reference) could look like this:
id  nature     table1  table2
1   parent of  entry   tags
2   tag of     tags    entry
the table1 field tells you which table the first id refers to, likewise table2 for the second id
the id is the value stored in linkid, the middle field of your relationship table. only the id field should be indexed. the nature field is more for the human reader than for joining tables or organizing data
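Put together, the layout described above might be sketched as follows. The id1/linkid/id2 and reference columns come from the answer; the table name relationship, the entry and tags sample tables, and the two composite indexes are assumptions, and SQLite stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The "fourth table": describes what each link type means.
    CREATE TABLE reference (
        id     INTEGER PRIMARY KEY,
        nature TEXT,
        table1 TEXT,
        table2 TEXT
    );
    -- The three-field relationship table.
    CREATE TABLE relationship (
        id1    INTEGER NOT NULL,
        linkid INTEGER NOT NULL REFERENCES reference(id),
        id2    INTEGER NOT NULL
    );
    -- Assumed indexes so lookups work from either side of a link.
    CREATE INDEX rel_forward ON relationship (linkid, id1);
    CREATE INDEX rel_reverse ON relationship (linkid, id2);

    CREATE TABLE entry (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tags  (id INTEGER PRIMARY KEY, label TEXT);

    INSERT INTO reference VALUES (1, 'parent of', 'entry', 'tags'),
                                 (2, 'tag of', 'tags', 'entry');
    INSERT INTO entry VALUES (1, 'Cafe Uno');
    INSERT INTO tags  VALUES (1, 'coffee'), (2, 'wifi');
    -- Tags 1 and 2 are each a 'tag of' (linkid 2) entry 1.
    INSERT INTO relationship VALUES (1, 2, 1), (2, 2, 1);
""")

# All tags attached to entry 1 via link type 2 ('tag of').
tag_labels = [row[0] for row in conn.execute(
    """SELECT t.label
       FROM relationship r
       JOIN tags t ON t.id = r.id1
       WHERE r.linkid = 2 AND r.id2 = ?
       ORDER BY t.label""",
    (1,),
)]
```

With 50,000 entries and up to 17 links each, this table stays under a million narrow rows, which indexed lookups like the one above generally handle comfortably.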
So, not having come from a database design background, I've been tasked with designing a web app where the end user will be entering products, and specs for their products. Normally I think I would just create rows for each of the types of spec that they would be entering. Instead, they have a variety of products that don't share the same spec types, so my question is, what's the most efficient and future-proof way to organize this data? I was leaning towards pushing a serialized object into a generic "data" row, but then are you able to do full-text searches on this data? Any other avenues to explore?
split products and specifications into two tables like this:
products
id name
specifications
id name value product_id
get all the specifications of a product when you know the product id:
SELECT name,
value
FROM specifications
WHERE product_id = ?;
add a specification to a product when you know the product id, the specification's name and the value of said specification:
INSERT INTO specifications(
name,
value,
product_id
) VALUES(
?,
?,
?
);
so before you can add specifications to a product, this product must exist. also, you can't reuse specifications for several products. that would require a somewhat more complex solution :) namely...
three tables this time:
products
id name
specifications
id name value
products_specifications
product_id specification_id
get all the specifications of a product when you know the product id:
SELECT specifications.name,
specifications.value
FROM specifications
JOIN products_specifications
ON products_specifications.specification_id = specifications.id
WHERE products_specifications.product_id = ?;
now, adding a specification becomes a little bit more tricky, cause you have to check if that specification already exists. so this will be a little heavier than the first way of doing this, since there are more queries on the db, and there's more logic in the application.
first, find the id of the specification:
SELECT id
FROM specifications
WHERE name = ?
AND value = ?;
if no id is returned, this means that said specification doesn't exist, so it must be created:
INSERT INTO specifications(
name,
value
) VALUES(
?,
?
);
next, either use the id from the select query, or get the last insert id to find the id of the newly created specification. use that id together with the id of the product that's getting the new specification, and link the two together:
INSERT INTO products_specifications(
product_id,
specification_id
) VALUES(
?,
?
);
however, this means that you have to create one row for every specific specification. e.g. if you have size for shoes, there would be one row for every known shoe size
specifications
id name value
1 size 7
2 size 7½
3 size 8
and so on. i think this should be enough though.
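the find-or-create steps for the three-table version can be wrapped in one helper. this is a sketch only: add_specification is a made-up name, the UNIQUE and PRIMARY KEY constraints are my assumption, and it runs against SQLite (INSERT OR IGNORE is SQLite syntax; MySQL spells it INSERT IGNORE).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE specifications (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        value TEXT NOT NULL,
        UNIQUE (name, value)            -- assumed: one row per distinct spec
    );
    CREATE TABLE products_specifications (
        product_id       INTEGER NOT NULL REFERENCES products(id),
        specification_id INTEGER NOT NULL REFERENCES specifications(id),
        PRIMARY KEY (product_id, specification_id)
    );
    INSERT INTO products VALUES (1, 'boot'), (2, 'sneaker');
""")


def add_specification(conn, product_id, name, value):
    """Link a spec to a product, creating the spec row only if missing."""
    row = conn.execute(
        "SELECT id FROM specifications WHERE name = ? AND value = ?",
        (name, value),
    ).fetchone()
    if row is None:
        cur = conn.execute(
            "INSERT INTO specifications (name, value) VALUES (?, ?)",
            (name, value),
        )
        spec_id = cur.lastrowid
    else:
        spec_id = row[0]
    # Ignore the insert if this product already has this spec.
    conn.execute(
        "INSERT OR IGNORE INTO products_specifications "
        "(product_id, specification_id) VALUES (?, ?)",
        (product_id, spec_id),
    )
    return spec_id


first = add_specification(conn, 1, "size", "8")
second = add_specification(conn, 2, "size", "8")   # reuses the same spec row
spec_count = conn.execute("SELECT COUNT(*) FROM specifications").fetchone()[0]
link_count = conn.execute(
    "SELECT COUNT(*) FROM products_specifications").fetchone()[0]
```

in a real application the select and inserts would want to run inside one transaction, so two concurrent requests don't both try to create the same specification.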
You could take a look at using an EAV model.
I've never built a products database, but I can point you to a data model for that. It's one of over 200 models available for the taking, at Database Answers. Here is the model
If you don't like this one, you can find 15 different data models for Product oriented databases. Click on "Data Models" to get a list and scroll down to "Products".
You should pick up some good design ideas there.
This is a pretty common problem - and there are different solutions for different scenarios.
If the different types of product and their attributes are fixed and known at development time, you could look at the description in Craig Larman's book (http://www.amazon.com/Applying-UML-Patterns-Introduction-Object-Oriented/dp/0131489062/ref=sr_1_1/002-2801511-2159202?ie=UTF8&s=books&qid=1194351090&sr=1-1) - there's a section on object-relational mapping and how to handle inheritance.
This boils down to "put all the possible columns into one table", "create one table for each sub class" or "put all base class items into a common table, and put sub class data into their own tables".
This is by far the most natural way of working with a relational database - it allows you to create reports, use off-the-shelf tools for object relational mapping if that takes your fancy, and you can use standard concepts such as "not null", indexing etc.
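To make the third option concrete (base class items in a common table, sub class data in their own tables), here is a small sketch. The product, shoe and laptop tables and their columns are invented for illustration, and SQLite is used in place of whatever database the application runs on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Columns shared by every product live in the base table.
    CREATE TABLE product (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        price REAL NOT NULL,
        kind  TEXT NOT NULL             -- which subclass table holds the rest
    );
    -- Each subclass table shares its primary key with the base row.
    CREATE TABLE shoe (
        product_id INTEGER PRIMARY KEY REFERENCES product(id),
        size       TEXT NOT NULL
    );
    CREATE TABLE laptop (
        product_id INTEGER PRIMARY KEY REFERENCES product(id),
        ram_gb     INTEGER NOT NULL
    );

    INSERT INTO product VALUES (1, 'Runner', 80.0, 'shoe'),
                               (2, 'Ultrabook', 1200.0, 'laptop'),
                               (3, 'Boot', 150.0, 'shoe');
    INSERT INTO shoe   VALUES (1, '8'), (3, '9');
    INSERT INTO laptop VALUES (2, 16);
""")

# Typed columns and NOT NULL constraints work as usual, and a query across
# base and subclass data is a single ordinary join.
cheap_shoes = conn.execute(
    """SELECT p.name, s.size
       FROM product p
       JOIN shoe s ON s.product_id = p.id
       WHERE p.price < ?
       ORDER BY p.name""",
    (100,),
).fetchall()
```

The trade-off is that every new product type means a new table, which only works when the types are known at development time.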
Of course, if you don't know the data attributes at development time, you have to create a flexible database schema.
I've seen 3 general approaches.
The first is the one described by davogotland. I built a solution on similar lines for an ecommerce store; it worked great, and allowed us to be very flexible about the product database. It performed very well, even with half a million products.
Major drawbacks were creating retrieval queries - e.g. "find all products with a price under x, in category y, whose manufacturer is z". It was also tricky bringing in new developers - they had a fairly steep learning curve.
It also forced us to push a lot of relational concepts into the application layer. For instance, it was hard to create foreign keys to other tables (e.g. "manufacturer") and enforce them using standard SQL functionality.
The second approach I've seen is the one you mention - storing the variable data in some kind of serialized format. This is a pain when querying, and it suffers from the same drawbacks with regard to the relational model (no foreign keys, constraints or indexes on the individual attributes). Overall, I'd only want to use serialization for data you don't have to be able to query or reason about.
The final solution I've seen is to accept that the addition of new product types will always require some level of development effort - you have to build the UI, if nothing else. I've seen applications which use a scaffolding style approach to automatically generate the underlying database structures when a new product type is created.
This is a fairly major undertaking - only really suitable for major projects, though the use of ORM tools often helps.