user-specific content suggestions across sessions [MySQL/Web] - php

In a fairly standard blog/articles/comments type setup, what is the most elegant way of offering content tailored to users' interests (as determined by previous activity)?
A tag system for articles exists, so really all that is required is to track the tags of articles that each user reads or comments on. This would be very straightforward by simply writing this information to users' sessions, but what's the best way of modeling it in the site database so that recommendations are session-independent? If a user reads 4 articles tagged as one thing and 2 tagged as another, the system would ideally suggest more articles matching the first tag than the second the next time the user looks at the main page.
What kind of data model can best accomplish this? Having a whole ton of extra fields in the user table seems extremely cumbersome.

CREATE TABLE interests (user_id int, tag varchar(25), weight int);
Something like that. One row per tag for each user. Add to weight when a user reads/comments on something with that tag.
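As a rough illustration of the one-row-per-tag idea, here is a minimal sketch. It uses Python's built-in sqlite3 module only so that it runs anywhere; the composite primary key and the helper names record_view and top_tags are assumptions, not part of the answer above. In MySQL the update-or-insert step could be written as a single INSERT ... ON DUPLICATE KEY UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE interests (
        user_id INTEGER NOT NULL,
        tag     TEXT    NOT NULL,
        weight  INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (user_id, tag)  -- assumption: one row per user and tag
    )
""")

def record_view(conn, user_id, tags, amount=1):
    """Add `amount` to the user's weight for each tag of the article just read."""
    with conn:  # commits on success, rolls back on error
        for tag in tags:
            cur = conn.execute(
                "UPDATE interests SET weight = weight + ? "
                "WHERE user_id = ? AND tag = ?",
                (amount, user_id, tag),
            )
            if cur.rowcount == 0:  # no row yet for this user and tag
                conn.execute(
                    "INSERT INTO interests (user_id, tag, weight) VALUES (?, ?, ?)",
                    (user_id, tag, amount),
                )

def top_tags(conn, user_id, limit=5):
    """Return the user's tags, heaviest first, to drive the suggestions."""
    return conn.execute(
        "SELECT tag, weight FROM interests WHERE user_id = ? "
        "ORDER BY weight DESC, tag LIMIT ?",
        (user_id, limit),
    ).fetchall()
```

The main page could then pick articles in proportion to the weights returned by top_tags, so a user with four reads of one tag and two of another sees more of the first.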

Related

How to design an efficient Like system?

I'm trying to create a Like/Unlike system akin to Facebook's for an existing comments section of a website, and I need help in designing the system.
Currently, every product on the website has a comments section, and members can post and like comments. I need to know how many comments each member has posted and how many likes each of those comments has received. I also need to know who liked which comments, partly for analytical purposes and partly so that I can prevent a user from liking a comment more than once.
The naive way of implementing a Like system to the current comments module is to create a new table in the database that has foreign keys to the CommentID and UserID. Then for every "like" given to a comment by a user, I would insert a row to this new table with the targeting comment ID and user ID.
While this might work, the massive amount of comments and users is going to cause this table to grow quickly and retrieving records from and doing counts on this huge table will become slow and inefficient. I can index either one of the columns, but I don't know how effective it would be. The website has over a million comments.
I'm using PHP and MySQL. For a system like this with a huge database, how should I design a Like system so that it is more optimised and stable?
For scalability, do not include the count column in the same table with other things. This is a rare case where "vertical partitioning" is beneficial. Why? The LIKEs/UNLIKEs will come fast and furious. If the code to do the increment/decrement hits a table used for other things (such as the text of the Comment), there will be an unacceptable amount of contention between the two.
This tip is the first of many steps toward being able to scale to Facebook levels. The other tips will come, not from a free forum, but from the team of smart engineers you will have to hire to get to that level. (Hints: Sharding, Buffering, Showing Estimates, etc.)
Your main concern will be a lot of counts, so the easy thing to do is to keep a separate count in your comments table.
Then you can create a TRIGGER that increments/decrements the count based on a like/unlike.
That way you only use the big table to figure out if a user already voted.
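The two answers above can be combined in a small sketch: a likes table whose primary key blocks duplicate likes, plus triggers that maintain a counter. The counter lives in its own table here, following the vertical-partitioning advice; the table, trigger, and function names are all hypothetical. The sketch runs on SQLite through Python's sqlite3 module, and MySQL trigger syntax differs slightly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comment_likes (
        comment_id INTEGER NOT NULL,
        user_id    INTEGER NOT NULL,
        PRIMARY KEY (comment_id, user_id)  -- one like per user per comment
    );
    CREATE TABLE comment_like_counts (
        comment_id INTEGER PRIMARY KEY,
        likes      INTEGER NOT NULL DEFAULT 0
    );
    CREATE TRIGGER like_added AFTER INSERT ON comment_likes
    BEGIN
        INSERT OR IGNORE INTO comment_like_counts (comment_id, likes)
            VALUES (NEW.comment_id, 0);
        UPDATE comment_like_counts SET likes = likes + 1
            WHERE comment_id = NEW.comment_id;
    END;
    CREATE TRIGGER like_removed AFTER DELETE ON comment_likes
    BEGIN
        UPDATE comment_like_counts SET likes = likes - 1
            WHERE comment_id = OLD.comment_id;
    END;
""")

def like(conn, comment_id, user_id):
    """Record a like; return False if this user already liked the comment."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO comment_likes (comment_id, user_id) VALUES (?, ?)",
                (comment_id, user_id),
            )
        return True
    except sqlite3.IntegrityError:  # primary key rejected the duplicate
        return False

def unlike(conn, comment_id, user_id):
    """Remove a like if it exists; the trigger decrements the counter."""
    with conn:
        conn.execute(
            "DELETE FROM comment_likes WHERE comment_id = ? AND user_id = ?",
            (comment_id, user_id),
        )

def like_count(conn, comment_id):
    """Read the precomputed count without scanning the big likes table."""
    row = conn.execute(
        "SELECT likes FROM comment_like_counts WHERE comment_id = ?",
        (comment_id,),
    ).fetchone()
    return row[0] if row else 0
```

Displaying a count then touches only the small counter table, while the large likes table is consulted just to check whether a given user has already voted.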

Saving search POST variables in MySQL

I am developing a site where users choose their city (from a long list), an activity (again a long list) and then a level of expertise. After that, the site shows them local events based upon their choices. I have over 100 POST variables (and more being added). How can this be done in MySQL?
You only have a few variables (city, activity, expertise). Use a session.
See the PHP documentation on sessions.
Yes, you have the right idea. Create a table and store each variable you would like to save.
Post the variables from whatever scripting language you choose; the database statement would look something like this:
INSERT INTO `your_table_name` (`field_1`, `field_2`) VALUES (?, ?);
Bind the posted values to the ? placeholders through a prepared statement rather than pasting them into the SQL string, since raw POST data inside a query invites SQL injection. You can add more fields as your information grows. That would be the best way to start getting into MySQL. Of course, this answer is the simplest solution. I recommend you read up on all the fun and much more complex things you can do with MySQL.
More efficiently I would do it like this:
Categorize your POST variables and create multiple tables, grouping similar things or the data from a specific page. Create an auto-increment primary id for the rows of each group's master table, and then join your tables as needed. That's a little more advanced, so I suggest you add the few fields you are currently using to a table now, and when you need to start adding more, look into these topics:
Normalization:
http://en.wikipedia.org/wiki/Database_normalization
Primary Keys, Joins, Stored Procedures, and the list goes on.
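To make the grouping idea concrete, here is a small sketch with lookup tables for cities and activities and one table of saved choices that references them. All table, column, and function names are hypothetical, and it uses Python's sqlite3 module purely so the example is runnable; the same layout carries over to MySQL with AUTO_INCREMENT keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cities     (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE activities (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE user_choices (
        id          INTEGER PRIMARY KEY AUTOINCREMENT,
        city_id     INTEGER NOT NULL REFERENCES cities(id),
        activity_id INTEGER NOT NULL REFERENCES activities(id),
        expertise   TEXT    NOT NULL
    );
""")

LOOKUP_TABLES = ("cities", "activities")

def lookup_id(conn, table, name):
    """Return the id for `name` in a lookup table, creating the row if needed."""
    if table not in LOOKUP_TABLES:  # identifiers cannot be bound as parameters
        raise ValueError(f"unknown lookup table: {table}")
    conn.execute(f"INSERT OR IGNORE INTO {table} (name) VALUES (?)", (name,))
    row = conn.execute(f"SELECT id FROM {table} WHERE name = ?", (name,)).fetchone()
    return row[0]

def save_choice(conn, form):
    """Store one submitted form (a dict standing in for the POST data)."""
    with conn:
        cur = conn.execute(
            "INSERT INTO user_choices (city_id, activity_id, expertise) "
            "VALUES (?, ?, ?)",
            (
                lookup_id(conn, "cities", form["city"]),
                lookup_id(conn, "activities", form["activity"]),
                form["expertise"],
            ),
        )
    return cur.lastrowid

def load_choice(conn, choice_id):
    """Join the tables back together to recover the readable values."""
    return conn.execute(
        "SELECT c.name, a.name, u.expertise FROM user_choices u "
        "JOIN cities c ON c.id = u.city_id "
        "JOIN activities a ON a.id = u.activity_id WHERE u.id = ?",
        (choice_id,),
    ).fetchone()
```

Every user-supplied value travels as a bound parameter, so odd input such as a quote character is stored as plain data.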

PHP mySQL best practice for storage in creating a post/comment structure

I am working on a small project where a client would like to have a custom comment system to be shared within the internal network of their company. The logic is something like Google+ or Facebook: a user makes a post and can choose the people to share it with, and the default (none chosen) goes to everyone in that person's list.
My question is: what is the best way to build a table to store posts that can have either everyone or a selected set of people as permitted viewers? I guess my biggest issue is wrapping my head around the logic of it at the moment. Do I have multiple rows per post, each with the id of a user able to see the post, or should I have a column on a single row for the post where I store an array or object of the people able to view it? I am open to suggestions. I haven't started working on it yet, so I am ultimately looking for advice on a good way to build the table that would support sound query logic and won't cost me overhead in either multiple queries or rows I don't need. I don't want to begin without figuring this out, as I don't want to box myself into something that will be harder to back out of in the long run.
What you are describing is a many-to-many relationship: each post can be visible to many people, and each person can be allowed to view many posts. There is a ton of information about database relationships on the internet. You would have a posts table, a users table, and a users_post table. The users_post table would contain a post_id and a user_id. You would then check whether a user can view a post through this relationship.
You could also put the users in groups, which would simplify this.
You should never store multiple values in an array in one column of the db.
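A minimal sketch of the join-table approach follows, using Python's sqlite3 module so it can be run as-is. It assumes, as a simplification of the "default goes to everyone" rule in the question, that a post with no rows in users_post is visible to all users, and that authors always see their own posts; the column names and the visible_posts helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (
        id        INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL,
        body      TEXT    NOT NULL
    );
    CREATE TABLE users_post (
        post_id INTEGER NOT NULL REFERENCES posts(id),
        user_id INTEGER NOT NULL,
        PRIMARY KEY (post_id, user_id)  -- one row per permitted viewer
    );
""")

def visible_posts(conn, user_id):
    """Ids of posts the user may see: unrestricted, shared with them, or their own."""
    rows = conn.execute(
        """
        SELECT p.id FROM posts p
        WHERE p.author_id = ?
           OR NOT EXISTS (SELECT 1 FROM users_post up WHERE up.post_id = p.id)
           OR EXISTS (SELECT 1 FROM users_post up
                      WHERE up.post_id = p.id AND up.user_id = ?)
        ORDER BY p.id
        """,
        (user_id, user_id),
    )
    return [r[0] for r in rows]
```

One query answers the visibility question, and the primary key on users_post doubles as the index that the EXISTS lookups use.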

Database structure for "flag as spam" functionality

I have created a webapp with php/mysql.
In my application I have different sections where users submit content such as photos, news, stories, videos, etc.
All of these are separate sections with their own detail pages. I want to add "Flag as Spam" functionality to every section, but I am unsure about the database design. Should I create a separate table for each section (for example video_spam or photo_spam), or should I go with one table, spam_contents, containing the following columns?
SpamId - unique id for the table
ByUserId - Who marked it as spam
SectionName - will be 'news', 'video', 'stories' etc.
Reason - Reason for which user marked it as spam
ContentId - This will contain photoid or videoid or newsid
Date - The day user marked content as spam.
If I need to fetch all content of video section, which is marked as spam by users then I can get it on the basis of SectionName and ContentId.
Is this a good approach, or does anyone have a better solution for this scenario?
Please help, Thanks!
Unless there's something unique to "video spam", or something unique to "photo spam", etc., you're almost certainly better off with a single table.
Your situation is similar to this supertype/subtype issue. See my reply to that question, too.
I believe this looks like the best way. Having a centralized collector with a single purpose is a design plus, imho. You could instead add more fields to each table (e.g. video_table also gets 'spam_flag', 'flag_by', 'flag_date' and so on), but I think that, apart from multiplying the work of creating them, this may have significant drawbacks whenever you need to make adjustments or changes to the system.
And, by the way, I've seen this structure implemented in a couple of well-known open source bulletin boards for reported messages and similar, so I believe it's a valid and optimized design.
Alternatively, if you are in a good mood, you could also do both: something 'detailed' pertaining to each table, and a centralized structure as a sort of 'admin panel report'.
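The single-table layout from the question could look roughly like the sketch below, run here on SQLite via Python's sqlite3 module. The column names come from the question; the UNIQUE constraint (one flag per user per item) and the helper functions are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE spam_contents (
        SpamId      INTEGER PRIMARY KEY,
        ByUserId    INTEGER NOT NULL,
        SectionName TEXT    NOT NULL,   -- 'news', 'video', 'stories', ...
        Reason      TEXT,
        ContentId   INTEGER NOT NULL,   -- photoid, videoid or newsid
        "Date"      TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
        UNIQUE (ByUserId, SectionName, ContentId)  -- assumption: one flag per user
    )
""")

def flag_as_spam(conn, user_id, section, content_id, reason=None):
    """Record a flag; return False if this user already flagged the item."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO spam_contents (ByUserId, SectionName, ContentId, Reason) "
                "VALUES (?, ?, ?, ?)",
                (user_id, section, content_id, reason),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def flagged_in_section(conn, section):
    """Return (ContentId, number of flags) for one section, most flagged first."""
    return conn.execute(
        "SELECT ContentId, COUNT(*) FROM spam_contents WHERE SectionName = ? "
        "GROUP BY ContentId ORDER BY COUNT(*) DESC, ContentId",
        (section,),
    ).fetchall()
```

Because ContentId is only meaningful together with SectionName, every lookup filters on both, which is why the two columns appear together in the constraint.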

Database and Table Management

I have been creating a web app and am looking to expand. In my web app I have a table for users which includes privileges in order to track whether a user is an administrator, a very small table for a dynamic content section of a page, and a table for tracking "events" on the website.
Not being very experienced with web application creation, I'm not really sure how professionals would design the databases and tables for a web application. In my web app, I plan to add further user settings for each member of the website and even a messaging system. I currently use PHP with a MySQL database that I query for all of my commands, but I would be willing to change any of this if necessary. What would be the best way to track content such as interpersonal messages and the specific settings of each user? Would I want to have multiple databases at any point? Would I want to have multiple tables for each user, perhaps? Any information on how this is done or should be done would be quite helpful.
I'm sorry about the broadness of the question, but I've been wanting to reform this web app since I feel that my ideas for table usage are not on par with those that experienced programmers have.
Here's my seemingly long, hopefully not too convoluted answer to your question. I think I've covered most, if not all of your queries.
For your web app, you could have a table of users called "Users", settings table called "UserSettings" or something equally as descriptive, and messages in "PrivateMessages" table. Then there could be child tables that store extra data that is required.
User security can be a tricky thing to design and implement. Do you want to do it by groups (if you plan on having many users, making it easier to manage their permissions), or just assign individually due to a small user base? For security alone, you'd end up with 4 tables:
Users
UserSettings
UserGroups
UserAssignedGroups
That way you can have user info, settings, groups they can be assigned to and what they ARE assigned to separated properly. This gives you a decent amount of flexibility and conforms to normalization standards (as mentioned above by DrSAR).
With your messages, don't store them with the username, but rather the User ID. For instance, in your PrivateMessages table, you would have a MessageID, SenderUserID, RecipientUserID, Subject, Body and DateSent to store the most basic info. That way, when a user wants to check their received messages, you can query the table saying:
SELECT * FROM PrivateMessages WHERE RecipientUserID = 123556
A list of tables for your messages could be as such:
PrivateMessages
MessageReplies
The PrivateMessages table can store the parent message, and then the MessageReplies table can store the subsequent replies. You could store it all in one table, but depending on traffic and possibly writing recursive functions to retrieve all messages and replies from one table, a two table approach would be simplest I feel.
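The two-table layout might be sketched as follows. The PrivateMessages columns are the ones named above; the MessageReplies columns, the indexes, and the inbox and thread helpers are assumptions. The code uses Python's sqlite3 module so that it runs without a MySQL server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PrivateMessages (
        MessageID       INTEGER PRIMARY KEY,
        SenderUserID    INTEGER NOT NULL,
        RecipientUserID INTEGER NOT NULL,
        Subject         TEXT    NOT NULL,
        Body            TEXT    NOT NULL,
        DateSent        TEXT    NOT NULL
    );
    CREATE TABLE MessageReplies (
        ReplyID      INTEGER PRIMARY KEY,
        MessageID    INTEGER NOT NULL REFERENCES PrivateMessages(MessageID),
        SenderUserID INTEGER NOT NULL,
        Body         TEXT    NOT NULL,
        DateSent     TEXT    NOT NULL
    );
    CREATE INDEX idx_messages_recipient ON PrivateMessages (RecipientUserID);
    CREATE INDEX idx_replies_message    ON MessageReplies (MessageID);
""")

def inbox(conn, user_id):
    """Parent messages received by a user, newest first."""
    return conn.execute(
        "SELECT MessageID, Subject FROM PrivateMessages "
        "WHERE RecipientUserID = ? ORDER BY DateSent DESC, MessageID DESC",
        (user_id,),
    ).fetchall()

def thread(conn, message_id):
    """Body of the parent message followed by its replies in sending order."""
    parent = conn.execute(
        "SELECT Body FROM PrivateMessages WHERE MessageID = ?", (message_id,)
    ).fetchone()
    if parent is None:
        return []
    replies = conn.execute(
        "SELECT Body FROM MessageReplies WHERE MessageID = ? "
        "ORDER BY DateSent, ReplyID",
        (message_id,),
    ).fetchall()
    return [parent[0]] + [r[0] for r in replies]
```

Since every reply points straight at its parent message, a whole conversation is fetched with two flat queries and no recursion.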
If I were you, I'd sit down with a pencil and paper, and write down/draw what I want to track in my database. That way you can then draw links between what you want to store, and see how it will come together. It helps me when I'm trying to visualise things.
For the scope of your web app you don't need multiple databases. You do need, however, multiple tables to store your data efficiently.
For user settings, always use a separate table. You want your "main" users table to be as lean as possible, since it will be accessed (= searched) every time a user tries to log in. Store IDs, username, password (hashed, of course) and any other field that you need to access when authenticating. Put all the extra information in a separate table. That way your login will only query a smaller table, and once the user is authenticated you can use their ID to get all other information from the secondary table(s).
Messages can be trickier because they're a bigger order of magnitude - you might have tens or hundreds for each user. You need to design your table structure based on your application's logic. A table for each user is clearly not a feasible solution, so go for a general messages table but implement procedures to keep it to a manageable size. An example would be "archiving" messages older than X days, which would move them to another table (which works well if your users aren't likely to access their old messages too often). But like I said, it depends on your application.
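The archiving idea can be sketched as a copy-then-delete inside one transaction, so a failure midway leaves both tables untouched. The table layout, the ISO date strings, and the archive_older_than helper are all assumptions made for this example, which runs on SQLite through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE messages (
        id           INTEGER PRIMARY KEY,
        recipient_id INTEGER NOT NULL,
        body         TEXT    NOT NULL,
        date_sent    TEXT    NOT NULL   -- ISO date, so string order is date order
    );
    CREATE TABLE messages_archive (
        id           INTEGER PRIMARY KEY,
        recipient_id INTEGER NOT NULL,
        body         TEXT    NOT NULL,
        date_sent    TEXT    NOT NULL
    );
""")

def archive_older_than(conn, cutoff):
    """Move messages sent before `cutoff` to the archive; return how many moved."""
    with conn:  # both statements commit together or not at all
        conn.execute(
            "INSERT INTO messages_archive (id, recipient_id, body, date_sent) "
            "SELECT id, recipient_id, body, date_sent FROM messages "
            "WHERE date_sent < ?",
            (cutoff,),
        )
        cur = conn.execute("DELETE FROM messages WHERE date_sent < ?", (cutoff,))
    return cur.rowcount
```

A scheduled job could call this with a cutoff of today minus X days, keeping the live table small while old messages stay retrievable from the archive.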
Good luck!
Along the lines of Cristian Radu's comments: you need to split your data into different tables. The lean user table will (in fact, should) have one unique ID per user. This (unique) key should be repeated in the secondary tables, where it is called a foreign key. Obviously, you want a key that's unique. If your username can be guaranteed to be unique (e.g. you require users to be identified by their email address), then you can use that. If usernames are real names (e.g. Firstname Surname), then you don't have that guarantee and you need to keep a userid, which becomes your key. Similarly, the table containing your posts could (but doesn't have to) have a field holding the userid of whoever wrote each post, etc.
You might want to read a bit about database design and the concept of normalization: (http://dev.mysql.com/tech-resources/articles/intro-to-normalization.html) No need to get bogged down with the n-th form of normalization but it will help you at this stage where you need to figure out the database design.
Good luck and report back ;-)
