Optimal MySQL design for user-specific activity feeds - php

I'm building a website that constructs both site-wide and user-specific activity feeds. I hope you can look at the structure below and share your insight as to whether my solution does the job. This is complicated by the fact that I have multiple types of users that are currently not stored in one master table. This is because the types of users are quite different, and constructing multiple different tables for user meta-data would, I think, be too much trouble. In addition, there are multiple types of content that can be acted upon, and multiple types of activity (following, submitting, commenting, etc.).
Constructing a site-wide activity feed is simple because everything is logged to the main feed table and I just build out a list. I have a master feed table in MySQL that simply logs:
type of activity;
type of target entity;
id of target entity;
type of source entity (i.e., user or organization);
id of source entity.
(This is just a big reference table that points the script generating the feed to the appropriate table(s) for each feed entry).
In generating the user-specific feed, I'm trying to figure out some way to join the relationships table with the feed table and use that to filter the results. I have a relationships table, made up of 'following' relationships, that is similar to the feed table. It is simpler, though, because only one type of user is allowed to follow other content types/users. It stores:
user/source id;
type of target entity;
id of target entity.
Columns 2 and 3 of the feed and follow tables are the same, and I have been trying various JOINs to match them up and then limit the results to the relationships the user has in the follow table. This has not been very successful.
The basic query I am using is:
SELECT *
FROM `feed` AS fe
LEFT OUTER JOIN `follow` AS fo
  ON fe.`feed_target_type` = fo.`follow_e_type`
  AND fe.`feed_target_id` = fo.`follow_e_id`
WHERE fo.`follow_u_id` = 1
  OR (fe.`feed_e_id` = 1 AND fe.`feed_e_type` = 'user')
ORDER BY fe.`feed_timestamp` DESC
LIMIT 10
This query also attempts to grab any content that the user has created (this is also logged in the feed table), which the user is, in effect, following by default.
This query seems to work, but it took me some time to get to it and I am pretty sure I'm missing a more elegant solution. Any ideas?
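A minimal sketch of the feed/follow join on a toy dataset, using Python's built-in sqlite3 as a stand-in for MySQL. The column names come from the query above; `feed_id`, `feed_type`, the sample rows, and the function names are assumptions made for illustration. Note that this is a variation on the original query, not the same query: the `follow_u_id` filter is moved into the ON clause, so that follow rows belonging to other users cannot duplicate a feed entry.

```python
import sqlite3

def build_feed_db():
    """Create an in-memory database with invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE feed (
            feed_id          INTEGER PRIMARY KEY,  -- assumed surrogate key
            feed_type        TEXT,                 -- type of activity (assumed name)
            feed_target_type TEXT,
            feed_target_id   INTEGER,
            feed_e_type      TEXT,                 -- type of source entity
            feed_e_id        INTEGER,              -- id of source entity
            feed_timestamp   INTEGER
        );
        CREATE TABLE follow (
            follow_u_id   INTEGER,
            follow_e_type TEXT,
            follow_e_id   INTEGER
        );
        INSERT INTO feed VALUES
            (1, 'comment', 'post', 10, 'user', 2, 100),  -- on a post user 1 follows
            (2, 'submit',  'post', 11, 'user', 1, 200),  -- created by user 1
            (3, 'comment', 'post', 12, 'user', 3, 300);  -- unrelated to user 1
        INSERT INTO follow VALUES
            (1, 'post', 10),
            (5, 'post', 11),   -- other users following user 1's post
            (6, 'post', 11);
    """)
    return conn

def user_feed(conn, user_id, limit=10):
    """Feed ids for entries the user follows or created, newest first."""
    sql = """
        SELECT fe.feed_id
        FROM feed AS fe
        LEFT OUTER JOIN follow AS fo
               ON fe.feed_target_type = fo.follow_e_type
              AND fe.feed_target_id   = fo.follow_e_id
              AND fo.follow_u_id      = ?      -- only this user's follow rows
        WHERE fo.follow_u_id IS NOT NULL       -- the user follows the target
           OR (fe.feed_e_type = 'user' AND fe.feed_e_id = ?)  -- or created it
        ORDER BY fe.feed_timestamp DESC
        LIMIT ?
    """
    return [row[0] for row in conn.execute(sql, (user_id, user_id, limit))]
```

With the filter left in the WHERE clause, feed entry 2 would come back once per follower of post 11; with the filter in the ON clause it comes back once.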

The first site I made with an activity feed had a notifications table where activities were logged, and friends' actions were then pulled from that. However, a few months down the line it hit millions of records.
The solution I am programming now pulls the latest "friends" activities from separate tables and then orders them by date. The query is at home; I can post the example later if you are interested.

Related

What is the best db architecture to capture user data related to a product table?

This is a fictitious example to try to illustrate some design choices I have...any thoughts or links deeply appreciated.
Imagine we have a MySQL database with a table (call it libTBL) that contains a row for each book in a library.
This table will be updated, by admins, as new books are added.
Users will be able to create a library of such books - that is, a list representing THEIR selected books.
Users can add a personal, private comment to each book and other meta data (when they started reading it, a review etc).
Users can also add their own books, but these books should not appear in the libTBL table.
What are best practices for capturing this user data?
When a user is created, create a row in a new table, with each book in the libTBL represented, so IF the user adds notes or other data we already have a home for it?
Create a new row in a user library table only when they make a note on a specific book?
-- One use case, though, is a user ordering their subset library...which would require a new row for each book they order (or all of them, depending on how ordering was implemented).
Use bookID and userID to query a user table for custom values for a particular book?
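As a rough illustration of the last two options (a row is created only when a user actually attaches data to a book, and is looked up by bookID and userID), here is a small sketch using Python's built-in sqlite3. The `user_book` table, its columns, and the sample rows are hypothetical; only `libTBL` is named in the question, and this is one possible layout rather than an established best practice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE libTBL (book_id INTEGER PRIMARY KEY, title TEXT);

    -- Hypothetical per-user table: one row only for books the user touched.
    CREATE TABLE user_book (
        user_id INTEGER,
        book_id INTEGER REFERENCES libTBL(book_id),
        note    TEXT,
        PRIMARY KEY (user_id, book_id)
    );

    INSERT INTO libTBL VALUES (1, 'Dune'), (2, 'Emma');
    INSERT INTO user_book VALUES (7, 2, 'Started in May');
""")

def books_for_user(conn, user_id):
    """Every library book, with the user's note where one exists."""
    sql = """
        SELECT b.title, ub.note
        FROM libTBL AS b
        LEFT JOIN user_book AS ub
               ON ub.book_id = b.book_id
              AND ub.user_id = ?
        ORDER BY b.book_id
    """
    return conn.execute(sql, (user_id,)).fetchall()
```

Books without a `user_book` row come back with a NULL note, so no rows need to be pre-created when a user signs up.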

How to keep data separate for businesses or groups of customers?

I've done quite a bit of programming with php/mysql on small-scale personal projects. However, I'm working on my first commercial app, which is going to allow customers or businesses to log in and perform CRUD operations. I feel like a total noob asking this question, but I have never had to do this before and cannot find any relevant information on the net.
Basically, I've created this app and have a role based system set up on my data base. The problem that I'm running into is how to separate and fetch data for the relevant businesses or groups.
I can't, for example, set my queries up like this: get all records from example table where user id = user id, because that will only return data for that user and not for all of the other users that are related to that business. I need a way to get all records that were created by users of a particular business.
I'm thinking that maybe the business should have an id and I should form my queries like this: get all records from example where business id = business id. But I'm not even sure if that's a good approach.
Is there a best practice or a convention for this sort of data storing/fetching and grouping?
Note: Security is a huge issue here because I'm storing legal data.
Also, I'm using the latest version of laravel 4, if that's of any relevance.
I would like to hear the thoughts of people who have encountered this sort of problem before, and how they designed their database and queries to only get and store data related to a particular business.
Edit: I like to read and learn but cannot find any useful information on this topic - maybe I'm not using the correct search terms. So if you know of any good links pertaining to this topic, please post them too.
If I understand correctly, a business is defined within your system as a "group of users", and your whole system references data belonging to users as opposed to data belonging to a business. You are looking to reference data that belongs to all users who belong to a particular business. In this case, the best and most extensible way to do this would be to create two more tables to contain businesses and business-user relations.
For example, consider you have the following tables:
business => Defines a business entity
    id (primary)
    name
    Entry: id=4, name=CompanyCorp
user => Defines each user in the system
    id (primary)
    name
    Entry: id=1, name=Geoff
    Entry: id=2, name=Jane
business_user => Links a user to a particular business
    user_id (primary)
    business_id (primary)
    Entry: user_id=1, business_id=4
    Entry: user_id=2, business_id=4
Basically, the business_user table defines relationships. For example, Geoff is related to CompanyCorp, so a row exists in the table that matches their ids together. This is called a relational database model, and is an important concept to understand in the world of database development. Because the primary key spans both columns, you can even allow a user to belong to multiple different companies.
To find all the names of users and their company's name, where their company's id = 4...
SELECT `user`.`name` AS `username`, `business`.`name` AS `businessname`
FROM `business_user`
LEFT JOIN `user` ON (`user`.`id` = `business_user`.`user_id`)
LEFT JOIN `business` ON (`business`.`id` = `business_user`.`business_id`)
WHERE `business_user`.`business_id` = 4;
Results would be:
username businessname
-> Geoff CompanyCorp
-> Jane CompanyCorp
I hope this helps!
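The tables and query above can be tried out with a short sketch like the following, which uses Python's built-in sqlite3 in place of MySQL. The `example` table (named after the asker's "example table"), the extra business and user rows, and the function names are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE business (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user     (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE business_user (
        user_id     INTEGER REFERENCES user(id),
        business_id INTEGER REFERENCES business(id),
        PRIMARY KEY (user_id, business_id)
    );
    -- Hypothetical table of records, each created by one user.
    CREATE TABLE example (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES user(id)
    );

    INSERT INTO business VALUES (4, 'CompanyCorp'), (5, 'OtherCo');
    INSERT INTO user VALUES (1, 'Geoff'), (2, 'Jane'), (3, 'Raj');
    INSERT INTO business_user VALUES (1, 4), (2, 4), (3, 5);
    INSERT INTO example VALUES (100, 1), (101, 2), (102, 3);
""")

def users_of_business(conn, business_id):
    """(username, businessname) pairs, as in the query above."""
    sql = """
        SELECT user.name, business.name
        FROM business_user
        LEFT JOIN user     ON user.id = business_user.user_id
        LEFT JOIN business ON business.id = business_user.business_id
        WHERE business_user.business_id = ?
        ORDER BY user.id
    """
    return conn.execute(sql, (business_id,)).fetchall()

def records_of_business(conn, business_id):
    """Ids of records created by any user linked to the business."""
    sql = """
        SELECT e.id
        FROM example AS e
        JOIN business_user AS bu ON bu.user_id = e.user_id
        WHERE bu.business_id = ?
        ORDER BY e.id
    """
    return [row[0] for row in conn.execute(sql, (business_id,))]
```

The second function addresses the original question: it returns every record created by a user of the given business, not only the records of the user who is logged in.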
===============================================================
Addendum regarding "cases" per your response in the comments.
You could create a new table for cases and then reference both business and user ids on separate columns in there, as the case would belong to both a user and a business, if that's all the functionality that you need.
Suppose, though, exploring the idea of relational databases further, that you wanted multiple users to be assigned to a case, with one user elected as the "group leader". You could approach the problem as follows:
Create a table "case" to store the cases
Create a table "user_case" to store case-user relationships, just like in the business_user table.
Define the user_case table as follows:
user_case => Defines a user -> case relationship
    user_id (primary)
    case_id (primary)
    role
    Entry: user_id=1, case_id=1, role="leader"
    Entry: user_id=2, case_id=1, role="subordinate"
You could even go further and define a table with definitions on what roles users can assume. Then, you might even change the user_case table to use a role_id instead which joins data from yet another role table.
It may sound like an ever-deepening schema of very small tables, but note that we've added an extra column to the user_case relational table. The bigger your application grows, the more your tables will grow laterally with more columns. Trust me, you do eventually stop adding new tables just for the sake of defining relations.
To give a brief example of how flexible this can be, with a role table, you could figure out all the roles that a given user (where user_id = 6) has by using a relatively short query like:
SELECT `role`.`name` FROM `role` RIGHT JOIN `user_case` ON (`user_case`.`role_id` = `role`.`id`) WHERE `user_case`.`user_id` = 6;
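A small runnable sketch of the role lookup, again with Python's sqlite3 standing in for MySQL. The sample rows and the function name are invented. Because older SQLite versions lack RIGHT JOIN, the sketch starts from `user_case` and uses a LEFT JOIN, which should return the same rows as the RIGHT JOIN above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE role (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_case (
        user_id INTEGER,
        case_id INTEGER,
        role_id INTEGER REFERENCES role(id),
        PRIMARY KEY (user_id, case_id)
    );

    INSERT INTO role VALUES (1, 'leader'), (2, 'subordinate');
    -- Invented sample data: user 6 leads case 1 and assists on case 2.
    INSERT INTO user_case VALUES (6, 1, 1), (6, 2, 2), (2, 1, 2);
""")

def roles_for_user(conn, user_id):
    """Role names held by the user, one per case, ordered by case id."""
    sql = """
        SELECT role.name
        FROM user_case
        LEFT JOIN role ON role.id = user_case.role_id
        WHERE user_case.user_id = ?
        ORDER BY user_case.case_id
    """
    return [row[0] for row in conn.execute(sql, (user_id,))]
```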
If you need more examples, please feel free to keep commenting.

Modeling Privacy Settings in PHP/MySQL or PHP/NoSQL

I'm building a private social network with Yii that will have "comments" all over the site - in Profiles, Events pages, Group Threads, etc. When a user makes a post, they will be able to select the visibility of that content as:
Anyone
Registered Users Only
Friends Only
Custom (specific list of friends)
I'm trying to figure out how to model this for speed. I've considered using MySQL for writing the setting into a binary "is_secure" field in the Comments table - if it is true, then go to a table with three columns: comment_id, user_id, and group_id. Groups (group_id) would be for groups of users - Registered Users, Friends. Custom would make one row for each user that is selected (user_id).
This table will get huge (perhaps several dozen rows for each comment), so I'm wondering if using NoSQL is worth considering here for retrieval only, or if there's a better way to model this.
Thanks so much!
Similar question to database "flags". Search for related SO questions.
Instead of a single true/false is_secure field, just add 1-bit fields for read_all (anyone), registered, friends, and custom. Add another table to hold the custom list, with comment_id (from the previous table) and friend_id (one row per listed friend). That way, in a single query with a LEFT JOIN on custom_friends_list_for_comments, you can determine whether or not to show the comment to a user. Optionally, custom could be a comma-separated list (char field), but size limits might be an issue. Assuming 3-letter friend ids separated by commas, each 255-char field can hold 64 friends.
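The flag-plus-custom-list idea can be sketched as follows with Python's sqlite3 as a stand-in for MySQL. The flag columns and `custom_friends_list_for_comments` come from the answer; the `comment` and `friendship` tables, `author_id`, the sample rows, and the function name are assumptions (the answer does not say how friendships are stored).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comment (
        id         INTEGER PRIMARY KEY,
        author_id  INTEGER,
        read_all   INTEGER DEFAULT 0,   -- anyone
        registered INTEGER DEFAULT 0,   -- registered users only
        friends    INTEGER DEFAULT 0,   -- friends of the author
        custom     INTEGER DEFAULT 0    -- specific list of friends
    );
    CREATE TABLE custom_friends_list_for_comments (
        comment_id INTEGER REFERENCES comment(id),
        friend_id  INTEGER,
        PRIMARY KEY (comment_id, friend_id)
    );
    -- Assumed friendship table: friend_id is a friend of user_id.
    CREATE TABLE friendship (
        user_id   INTEGER,
        friend_id INTEGER,
        PRIMARY KEY (user_id, friend_id)
    );

    INSERT INTO comment (id, author_id, read_all, registered, friends, custom) VALUES
        (1, 1, 1, 0, 0, 0),
        (2, 1, 0, 1, 0, 0),
        (3, 1, 0, 0, 1, 0),
        (4, 1, 0, 0, 0, 1);
    INSERT INTO friendship VALUES (1, 2);
    INSERT INTO custom_friends_list_for_comments VALUES (4, 3);
""")

def visible_comments(conn, viewer_id):
    """Comment ids the viewer may see; viewer_id=None means not logged in."""
    sql = """
        SELECT c.id
        FROM comment AS c
        LEFT JOIN custom_friends_list_for_comments AS cf
               ON cf.comment_id = c.id AND cf.friend_id = :viewer
        LEFT JOIN friendship AS f
               ON f.user_id = c.author_id AND f.friend_id = :viewer
        WHERE c.read_all = 1
           OR (c.registered = 1 AND :viewer IS NOT NULL)
           OR (c.friends = 1 AND f.friend_id IS NOT NULL)
           OR (c.custom = 1 AND cf.friend_id IS NOT NULL)
        ORDER BY c.id
    """
    return [row[0] for row in conn.execute(sql, {"viewer": viewer_id})]
```

Only comments marked custom need rows in the list table, so the extra table grows with the number of custom audiences rather than with every comment.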

Which of these methods provides for the fastest page loading?

I am building a database in MySQL that will be accessed by PHP scripts. I have a table that is the activity stream. This includes everything that goes on on the website (following of many different things, liking, upvoting etc.). From this activity stream I am going to run an algorithm for each user depending on their activity and display relevant activity. Should I create another table that stores the activity for each user once the algorithm has been run on the activity or should I run the algorithm on the activity table every time the user accesses the site?
UPDATE: (this is the same as the above, rephrased in a hopefully easier to understand way)
I have a database table called activity. This table creates a new row every time an action is performed by a user on the website.
Every time a user logs in, I am going to run an algorithm on the new rows (since the user's last login) in the activity table that apply to them. For example, if the user is following a user who upvoted a post in the activity stream, that post will be displayed when the user logs in. I want the user to be able to access previous content that applies to them. Would it be easiest to create another table that saves the rows that the algorithm has already processed, attached to the individual users they apply to? (A row can apply to multiple different users.)
I would start with a single table and appropriate indexes. Using a union statement, you can perform several queries (using different indexes) and then mash all the results together.
As an example, let's assume that you are friends with users 37, 42, and 56, and you are interested in basketball and knitting. And let's assume you have an index on user_id and an index on subject. This query should be quite performant.
SELECT * FROM activity WHERE user_id IN (37, 42, 56)
UNION DISTINCT
SELECT * FROM activity WHERE subject IN ('basketball', 'knitting')
ORDER BY created
LIMIT 50
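A runnable sketch of the UNION approach, using Python's sqlite3 rather than MySQL. The `activity` column list, the sample rows, the index names, and the function name are assumptions. SQLite spells the duplicate-removing union as plain `UNION`, which has the same meaning as MySQL's `UNION DISTINCT`. Both lists are assumed to be non-empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activity (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER,
        subject TEXT,
        created INTEGER
    );
    -- One index per branch of the UNION.
    CREATE INDEX idx_activity_user    ON activity (user_id);
    CREATE INDEX idx_activity_subject ON activity (subject);

    INSERT INTO activity VALUES
        (1, 37, 'basketball', 10),  -- matches both branches, returned once
        (2, 42, 'cooking',    20),  -- matches on user_id only
        (3, 99, 'knitting',    5),  -- matches on subject only
        (4, 99, 'cooking',    30);  -- matches neither
""")

def relevant_activity(conn, friend_ids, subjects, limit=50):
    """Rows by the given users or on the given subjects, oldest first."""
    id_marks = ",".join("?" * len(friend_ids))      # e.g. "?,?,?"
    subject_marks = ",".join("?" * len(subjects))
    sql = f"""
        SELECT id, user_id, subject, created FROM activity
        WHERE user_id IN ({id_marks})
        UNION
        SELECT id, user_id, subject, created FROM activity
        WHERE subject IN ({subject_marks})
        ORDER BY created
        LIMIT ?
    """
    return conn.execute(sql, [*friend_ids, *subjects, limit]).fetchall()
```

The ORDER BY and LIMIT apply to the combined result, so the feed is sorted and trimmed after the two branches are merged.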
I would recommend tracking your user-specific activities in a separate table; upon login you could then show all activities that relate to the user more easily. For example, if a user is big into baseball and hockey, you could retrieve that from their recent activity, then go to your table of all activities and grab the relevant items from it.

Implementing a rollback system

I have a site which allows users to make changes to content. How can I implement a rollback system? I'm using php and mysql, I was thinking of creating tables such as the following:
posts table --- posts_rollback table --- rollback table
The posts_rollback table would act as a lookup table. The posts table has a one to many relationship with the posts_rollback table. I would then use inner_join to
Is there a better way of doing this or any class/feature which automatically does this itself?
I think what you mean is content versioning (like here on SO) rather than rollbacks - the term "rollback" is mostly used in context of database transactions.
The simplest thing that comes to mind is to have two tables: posts that stores non-editable data (author, date created) and content with versioned data (text, date-updated, editor etc). Have a field called "version" in the posts table. When a post is updated, increase "version" and insert the data into content, along with post ID and "version". When retrieving posts, join content with posts on posts.id and posts.version.
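A minimal sketch of this two-table versioning scheme, with Python's sqlite3 in place of MySQL. The column names `body` and `editor`, the function names, and the `restore_version` helper are assumptions added for illustration; restoring is implemented here by saving the old text as a new version, which is one possible reading of "rollback" and not something the answer specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (
        id      INTEGER PRIMARY KEY,
        author  TEXT,
        version INTEGER NOT NULL DEFAULT 0   -- current version number
    );
    CREATE TABLE content (
        post_id INTEGER REFERENCES posts(id),
        version INTEGER,
        body    TEXT,
        editor  TEXT,
        PRIMARY KEY (post_id, version)
    );
""")

def save_version(conn, post_id, body, editor):
    """Bump posts.version and store the new text under that version."""
    with conn:  # both statements commit together or not at all
        conn.execute("UPDATE posts SET version = version + 1 WHERE id = ?",
                     (post_id,))
        conn.execute(
            "INSERT INTO content (post_id, version, body, editor) "
            "SELECT id, version, ?, ? FROM posts WHERE id = ?",
            (body, editor, post_id),
        )

def current_body(conn, post_id):
    """Join content with posts on posts.id and posts.version."""
    row = conn.execute(
        "SELECT c.body FROM posts AS p "
        "JOIN content AS c ON c.post_id = p.id AND c.version = p.version "
        "WHERE p.id = ?",
        (post_id,),
    ).fetchone()
    return row[0] if row else None

def restore_version(conn, post_id, version, editor):
    """Hypothetical helper: re-save an older version as the newest one."""
    row = conn.execute(
        "SELECT body FROM content WHERE post_id = ? AND version = ?",
        (post_id, version),
    ).fetchone()
    if row is None:
        raise ValueError("no such version")
    save_version(conn, post_id, row[0], editor)
```

Because older rows in `content` are never overwritten, the full edit history stays available for display or comparison.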
