I'm building a private social network with Yii that will have "comments" all over the site - in Profiles, Events pages, Group Threads, etc. When a user makes a post, they will be able to select the visibility of that content as:
Anyone
Registered Users Only
Friends Only
Custom (specific list of friends)
I'm trying to figure out how to model this for speed. I've considered using MySQL for writing the setting into a binary "is_secure" field in the Comments table - if it is true, then go to a table with three columns: comment_id, user_id, and group_id. Groups (group_id) would be for groups of users - Registered Users, Friends. Custom would make one row for each user that is selected (user_id).
This table will get huge (perhaps several dozen rows for each comment), so I'm wondering if using NoSQL is worth considering here for retrieval only, or if there's a better way to model this.
Thanks so much!
Similar question to database "flags". Search for related SO questions.
Instead of an if-true/false check on the is_secure field, just add 1-bit fields for read_all (anyone), registered, friends, and custom. Add another table to hold the custom list, with comment_id (from the previous table) and friend_id (one row per friend). That way, in a single query with a LEFT JOIN on custom_friends_list_for_comments, you can determine whether or not to show the page to a user. Optionally, custom could be a comma-separated list (char field), but size limits might be an issue: assuming 3-character friend ids separated by commas, each 255-char field can hold 64 friends.
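As a rough sketch of the flag-plus-custom-list idea, here is a small Python script that uses the standard-library sqlite3 module as a stand-in for MySQL. The friendships table, the author_id column and the sample rows are assumptions made up for illustration; only the four flag names and custom_friends_list_for_comments come from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL in this sketch
conn.executescript("""
CREATE TABLE friendships (            -- hypothetical: user_id counts friend_id as a friend
    user_id   INTEGER NOT NULL,
    friend_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, friend_id)
);
CREATE TABLE comments (
    comment_id INTEGER PRIMARY KEY,
    author_id  INTEGER NOT NULL,      -- hypothetical column
    body       TEXT NOT NULL,
    read_all   INTEGER NOT NULL DEFAULT 0,
    registered INTEGER NOT NULL DEFAULT 0,
    friends    INTEGER NOT NULL DEFAULT 0,
    custom     INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE custom_friends_list_for_comments (
    comment_id INTEGER NOT NULL,
    friend_id  INTEGER NOT NULL,
    PRIMARY KEY (comment_id, friend_id)
);
INSERT INTO friendships VALUES (1, 2), (1, 3);
INSERT INTO comments (comment_id, author_id, body, read_all, registered, friends, custom) VALUES
    (1, 1, 'public',       1, 0, 0, 0),
    (2, 1, 'members only', 0, 1, 0, 0),
    (3, 1, 'friends only', 0, 0, 1, 0),
    (4, 1, 'custom list',  0, 0, 0, 1);
INSERT INTO custom_friends_list_for_comments VALUES (4, 3);
""")

# One query decides visibility; both LEFT JOINs match at most one row per comment.
VISIBLE_SQL = """
SELECT c.comment_id
FROM comments c
LEFT JOIN custom_friends_list_for_comments cl
       ON cl.comment_id = c.comment_id AND cl.friend_id = :viewer
LEFT JOIN friendships f
       ON f.user_id = c.author_id AND f.friend_id = :viewer
WHERE c.read_all = 1
   OR c.author_id = :viewer
   OR (c.registered = 1 AND :viewer IS NOT NULL)
   OR (c.friends = 1 AND f.friend_id IS NOT NULL)
   OR (c.custom = 1 AND cl.friend_id IS NOT NULL)
ORDER BY c.comment_id
"""

def visible_comment_ids(viewer_id):
    """Return ids of comments the viewer may see; None means an anonymous visitor."""
    rows = conn.execute(VISIBLE_SQL, {"viewer": viewer_id}).fetchall()
    return [row[0] for row in rows]

print(visible_comment_ids(None))  # anonymous visitor
print(visible_comment_ids(3))     # a friend who is also on the custom list
```

The composite primary keys double as the indexes the joins need, so the custom list only costs rows for comments that actually use it.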
We have a php/mysql system with about 5 core entities. We now need to add the ability for customers to create custom fields for some of these entities on a per project basis.
They would contain a label, key, type, default value, and possible allowed values.
This is so they could add a custom date field, or a custom dropdown to the UI and save this value against the specific entity.
What is the best approach for storing this kind of data in a mySQL database? I need to store both the config for the field, and then the current value for a specific entity.
I've had a look at various options here: https://ayende.com/blog/3498/multi-tenancy-extensible-data-model
But this is not really at a tenancy level, more a project level.
I was thinking...
A CustomFields table to hold the configuration of a field against an entity type and project id.
A CustomFieldValues table to hold the value saved against the field - a row per field ( entity_id | field_id | field_value)
Then we create relationships between the entities and these custom values when retrieving the entities.
The issue with this is that there will be as many rows in the Values table as there are custom fields, so saving an entity will result in X extra rows. On top of that, these are versioned, so once a new version is created, another X rows will be created for that new version.
Also, you can't index the fields by name, and joins would become pretty complex, I think, since you have to join to both the configuration and the values to build the key-value pair to return against the entity. And how would you select based on a custom field name when the field name is actually a value?
I don't want to add dynamic columns to the table, as this will affect ALL the entities in the whole system, not just the ones in the current client / project.
The other option is to store the values in a JSON column.
This could be on the entity row itself (customFields or similar). This would prevent the extra rows per field, but it also has issues with lack of indexing etc., and you still need to join to the config table. However, you could query by property name if the key=value pair was stored in the JSON... WHERE entity.customFields->"$.myCustomFieldName" > 1.
Storing the field name in the JSON does mean you cannot change it once created without a lot of pain.
If anyone has any advice on approaches for this, or articles to point me at, that would be much appreciated. I'm sure this has been solved many times before.
JSON records: No! A thousand times no! If you do that, just wait until somebody actually uses your system for a few tens of millions of records, then asks you to search on one of your extra fields. Your support people will curse your name.
Key-value store. Probably yes. There's a very widely deployed existence proof of this design: WordPress. It has a table called wp_postmeta, containing metadata fields applying to wp_posts (blog pages and posts). It's proven successful.
You will need to do some multiple joining to use this stuff. For example, to search on height and eye-color, you'd need
SELECT p.person_id, p.first, p.last, h.value height, e.value eye_color
FROM person p
LEFT JOIN attrib h ON p.person_id = h.person_id AND h.key='height'
LEFT JOIN attrib e ON p.person_id = e.person_id AND e.key='eye_color'
WHERE e.value='green' AND CAST(h.value AS SIGNED) < 160
As the CAST in that WHERE clause shows, you'll have some struggles with data type as well.
You'll need LEFT JOIN operations in this sort of attribute lookup; ordinary inner JOIN operations will suppress rows with missing attributes, and that might not work for you.
But, if you do a good job with indexes, you'll be able to get decent performance from this approach.
The table structure envisioned in my example doesn't have your table describing each additional field, but you know how to add that. It also doesn't have explicit support for multi-project / multitenant data separation. But you can add that as well.
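To make the attribute-table lookup concrete, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for MySQL. The column types, the index name and the sample people are assumptions for illustration; in MySQL the cast would be CAST(h.value AS SIGNED) rather than the INTEGER cast SQLite uses here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL in this sketch
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, first TEXT, last TEXT);
CREATE TABLE attrib (
    person_id INTEGER NOT NULL REFERENCES person(person_id),
    "key"     TEXT NOT NULL,          -- quoted because KEY is also an SQL keyword
    value     TEXT,                   -- every value is stored as text
    PRIMARY KEY (person_id, "key")    -- serves the per-person joins
);
CREATE INDEX attrib_key_value ON attrib ("key", value);  -- serves searches by attribute

INSERT INTO person VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Roy'), (3, 'Cy', 'Poe'), (4, 'Di', 'Fox');
INSERT INTO attrib VALUES
    (1, 'height', '155'), (1, 'eye_color', 'green'),
    (2, 'height', '170'), (2, 'eye_color', 'green'),
    (3, 'eye_color', 'green'),                        -- no height recorded
    (4, 'height', '150'), (4, 'eye_color', 'blue');
""")

# One join per attribute: h carries the height, e carries the eye colour.
short_green_eyed = conn.execute("""
SELECT p.person_id, p.first, p.last, h.value AS height, e.value AS eye_color
FROM person p
LEFT JOIN attrib h ON p.person_id = h.person_id AND h."key" = 'height'
LEFT JOIN attrib e ON p.person_id = e.person_id AND e."key" = 'eye_color'
WHERE e.value = 'green' AND CAST(h.value AS INTEGER) < 160
""").fetchall()

# Without a WHERE filter, the LEFT JOIN keeps people whose attribute is missing.
all_heights = conn.execute("""
SELECT p.person_id, h.value AS height
FROM person p
LEFT JOIN attrib h ON p.person_id = h.person_id AND h."key" = 'height'
ORDER BY p.person_id
""").fetchall()

print(short_green_eyed)
print(all_heights)
```

Note that filtering on an attribute in the WHERE clause drops people who lack that attribute (Cy here), so the LEFT JOIN mostly pays off when you list attributes rather than filter on them.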
So my question is very much just a database design question. I'm relatively new to PHP, taking my first database course, and I'm trying to figure out how best to execute my idea.
So I'm building a membership database. Within this database there are "members" and there are "meetings," represented as two separate tables. I'm wondering what might be the best way to add a list of members to a meeting instance, or create a relationship table between the two. For example, would you advise that each member ID (primary key) be added individually (say, via a bunch of text input form fields) when creating a new meeting instance? Or is there perhaps a way to have the user upload a CSV or Excel file of member IDs (primary keys) and, from those IDs, easily create a relationship table?
Hope this is clear- just hoping to get some advice/insight, perhaps I'm not aware of the easiest way... Thanks!
I don't know what you are trying to do in your particular case, but it sounds to me like you should have three tables:
members - you have that one already
meetings - you also have that one already
members_meetings - this is the table that will join the two tables. The required fields in that table are:
member_id - the id of a particular member; points to the id field in your members table
meeting_id - the id of the meeting this member is attending; points to the id field in the meetings table
Then, if you want to get all members that are attending meeting X, you can just run the following query:
SELECT members.* FROM members_meetings LEFT JOIN members ON members_meetings.member_id = members.id WHERE members_meetings.meeting_id=X
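Since the question also asks about uploading a CSV of member ids, here is a hedged sketch of how the join table could be filled from one. It uses Python's standard csv and sqlite3 modules (SQLite as a stand-in for MySQL); the name and title columns, the member_id CSV header and the helper function names are all made up for illustration.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL in this sketch
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE members  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE meetings (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE members_meetings (
    member_id  INTEGER NOT NULL REFERENCES members(id),
    meeting_id INTEGER NOT NULL REFERENCES meetings(id),
    PRIMARY KEY (member_id, meeting_id)   -- one row per attendance, no duplicates
);
INSERT INTO members  VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
INSERT INTO meetings VALUES (10, 'Kickoff');
""")

def add_members_from_csv(meeting_id, csv_file):
    """Link every member id listed in the CSV (header: member_id) to the meeting."""
    reader = csv.DictReader(csv_file)
    rows = [(int(record["member_id"]), meeting_id) for record in reader]
    # OR IGNORE skips ids that are already linked (MySQL spells this INSERT IGNORE).
    conn.executemany(
        "INSERT OR IGNORE INTO members_meetings (member_id, meeting_id) VALUES (?, ?)",
        rows,
    )
    conn.commit()

def attendees(meeting_id):
    """Names of all members attending the given meeting."""
    rows = conn.execute(
        """SELECT members.name
           FROM members_meetings
           JOIN members ON members_meetings.member_id = members.id
           WHERE members_meetings.meeting_id = ?
           ORDER BY members.id""",
        (meeting_id,),
    ).fetchall()
    return [row[0] for row in rows]

# Simulated upload; the repeated id 3 is ignored thanks to the primary key.
uploaded = io.StringIO("member_id\n1\n3\n3\n")
add_members_from_csv(10, uploaded)
print(attendees(10))
```

A form with checkboxes or a multi-select would end up calling the same insert; the CSV route just saves typing for long lists.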
I’m developing with Symfony and Doctrine 2 and would like your advice on how to structure my database (MySQL) for a kind of social network platform for sharing information (articles). Keeping in mind the large and increasing number of articles, the constraints are:
- The author can share the article with as many specific users from his member list as he likes
- Any receiver can also decide to relay the article to specific users from his own member list
- The member selection for an author or receiver can be different for each article (the targets depend on their likely interest in the article)
A/ Article and User tables linked with a many to many relationship
I first considered this architecture, but the number of rows could be huge. Considering a user could have 1000 connections (members), the number of rows for a single article could reach a million if some of his members decide to relay the article to others.
B/ Article table with a longtext column as relationship
I then considered adding a text column to the article table and populating it with the user IDs of the receivers, but again this column could hold a million IDs. When a user connects, I would then have to run something like a SELECT * WHERE the user ID is IN the longtext column.
Would solution B be suitable? How would you manage such a case?
If I understand you correctly you have an articles table and user table. You want to link articles to users.
The simple thing is to link them with a link table:
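CREATE TABLE `Link` (
  `Article` INT UNSIGNED NOT NULL,  -- column types are assumed
  `User` INT UNSIGNED NOT NULL,
  PRIMARY KEY (`Article`, `User`)
)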
Which is what you proposed. There is no more efficient method.
I am currently working on a system that would allow users to add additional custom fields for the contacts that they add.
I wondered what is the best and most efficient approach to add such ability?
Right now I'm thinking of having one table per user (with foreign keys to a "main" contacts table) and adding a column for each custom field that the user adds (I don't expect more than 100-200 users per database shard; sharding is easy since users never see each other's content in this system), although I am not 100% sure that this is the right solution for this kind of problem.
Maybe you could try a separate table that stores a reference to the user plus the field name and value; this way you will be able to have lots of custom fields.
If you go with Boyce-Codd normal form, you separate the information and store it in its own table.
That means one table for all users, with a foreign key.
One table per user would lead to hundreds or more tables with possible repeated information.
You need one table named USERS that stores the id of a user and any fixed info you might want. Then you could have a CONTACT table that stores the type of contact a user might create, and a matching table USER_CONTACT that pairs the user's unique id with the id of the contact that was created.
With this, you could do advanced data mining on all the information stored, like knowing how many contacts each user created, who created the most, etc.
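A small sketch of what that layout might look like, again in Python with the standard sqlite3 module standing in for the real database. The table is called contact_custom_field here and carries a contact_id next to the user reference; both are assumptions for illustration, as are the column types and sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for the real database
conn.executescript("""
CREATE TABLE users   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE contact (id INTEGER PRIMARY KEY, display_name TEXT NOT NULL);
CREATE TABLE user_contact (
    user_id    INTEGER NOT NULL REFERENCES users(id),
    contact_id INTEGER NOT NULL REFERENCES contact(id),
    PRIMARY KEY (user_id, contact_id)
);
CREATE TABLE contact_custom_field (   -- hypothetical name: one row per custom value
    user_id     INTEGER NOT NULL REFERENCES users(id),
    contact_id  INTEGER NOT NULL REFERENCES contact(id),
    field_name  TEXT NOT NULL,
    field_value TEXT,
    PRIMARY KEY (user_id, contact_id, field_name)
);
INSERT INTO users   VALUES (1, 'ann'), (2, 'bob');
INSERT INTO contact VALUES (1, 'Zoe'), (2, 'Yan'), (3, 'Xia');
INSERT INTO user_contact VALUES (1, 1), (1, 2), (2, 3);
INSERT INTO contact_custom_field VALUES
    (1, 1, 'birthday', '1990-04-01'),
    (1, 1, 'nickname', 'Zed');
""")

def custom_fields(user_id, contact_id):
    """All custom fields one user stored for one contact, as a dict."""
    rows = conn.execute(
        "SELECT field_name, field_value FROM contact_custom_field "
        "WHERE user_id = ? AND contact_id = ?",
        (user_id, contact_id),
    ).fetchall()
    return dict(rows)

# How many contacts each user created; the LEFT JOIN keeps users with none.
contacts_per_user = conn.execute("""
SELECT u.name, COUNT(uc.contact_id)
FROM users u
LEFT JOIN user_contact uc ON uc.user_id = u.id
GROUP BY u.id
ORDER BY u.id
""").fetchall()

print(custom_fields(1, 1))
print(contacts_per_user)
```

Adding a new custom field is then just an INSERT, with no schema change and no per-user tables.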
I'm building a website that constructs both site-wide and user-specific activity feeds. I hope that you can look at the structure below and share your insight as to whether my solution is doing the job. This is complicated by the fact that I have multiple types of users that right now are not stored in one master table. This is because the types of users are quite different and constructing multiple different tables for user meta-data would, I think, be too much trouble. In addition, there are multiple types of content that can be acted upon, and multiple types of activity (following, submitting, commenting, etc.).
Constructing a site-wide activity feed is simple because everything is logged to the main feed table and I just build out a list. I have a master feed table in MySQL that simply logs:
type of activity;
type of target entity;
id of target entity;
type of source entity (i.e., user or organization);
id of source entity.
(This is just a big reference table that points the script generating the feed to the appropriate table(s) for each feed entry).
In generating the user-specific feed, I'm trying to figure out some way to join the relationship table with the feed table and use that to filter the results. I have a relationships table, made up of 'following' relationships, that is similar to the feed table. It is simpler, though, because only one type of user is allowed to follow other content types/users.
user/source id;
type of target entity;
id of target entity.
Columns 2 & 3 in the feed and follow tables are the same, and I have been trying various JOIN approaches to match them up, and then limit the results to the relationships the user has in the follow table. This has not been very successful.
The basic query I am using is:
SELECT *
FROM (`feed` as fe) LEFT OUTER JOIN `follow` as fo
ON `fe`.`feed_target_type` = `fo`.`follow_e_type`
AND fo.follow_e_id = fe.feed_target_id
WHERE `fo`.`follow_u_id` = 1
OR (fe.feed_e_id = 1 AND fe.feed_e_type = 'user')
ORDER BY `fe`.`feed_timestamp` desc LIMIT 10
This query also attempts to grab any content that the user has created (which data is logged in the feed table) that the user is, in effect, following by default.
This query seems to work, but it took me some time to get to it and I am pretty sure I'm missing a more elegant solution. Any ideas?
The first site I made with an activity feed had a notifications table where activities were logged, and then friends' actions were pulled from that. However, a few months down the line this hit millions of records.
The solution I am programming now pulls latest "friends" activities from separate tables and then orders by date. The query is at home, can post the example later if interested?
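This is not the query mentioned above, just one way the "pull from separate tables, then order by date" idea could look: a UNION ALL over per-activity tables, restricted to the viewer's friends. It is sketched in Python with the standard sqlite3 module standing in for MySQL, and every table name, column name and sample row is an assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL in this sketch
conn.executescript("""
CREATE TABLE friends (
    user_id   INTEGER NOT NULL,
    friend_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, friend_id)
);
CREATE TABLE comments    (id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, created_at TEXT NOT NULL);
CREATE TABLE submissions (id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, created_at TEXT NOT NULL);
CREATE INDEX comments_user_time    ON comments (user_id, created_at);
CREATE INDEX submissions_user_time ON submissions (user_id, created_at);

INSERT INTO friends VALUES (1, 2), (1, 3);
INSERT INTO comments VALUES
    (1, 2, '2024-01-03 10:00'),
    (2, 4, '2024-01-05 09:00');   -- user 4 is not a friend of user 1
INSERT INTO submissions VALUES
    (1, 3, '2024-01-04 08:00'),
    (2, 2, '2024-01-01 12:00');
""")

# Each branch reads one activity table; the outer query merges and sorts by date.
FEED_SQL = """
SELECT activity, user_id, item_id, created_at
FROM (
    SELECT 'comment' AS activity, c.user_id AS user_id, c.id AS item_id, c.created_at AS created_at
    FROM comments c
    JOIN friends f ON f.friend_id = c.user_id
    WHERE f.user_id = :me
    UNION ALL
    SELECT 'submission', s.user_id, s.id, s.created_at
    FROM submissions s
    JOIN friends f ON f.friend_id = s.user_id
    WHERE f.user_id = :me
) AS recent
ORDER BY created_at DESC
LIMIT :n
"""

def friend_feed(user_id, limit=10):
    """Latest activities by the user's friends, newest first."""
    return conn.execute(FEED_SQL, {"me": user_id, "n": limit}).fetchall()

for entry in friend_feed(1):
    print(entry)
```

The trade-off is that there is no ever-growing log table to maintain, but every new activity type needs another branch in the UNION.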