Database structure for "flag as spam" functionality

I have created a web app with PHP/MySQL.
In my application I have different sections where users submit content such as photos, news, stories, and videos.
These are all separate sections with their own detail pages. I want to add a "Flag as Spam" feature to every section, but I am unsure about the database design. Should I create a separate table for every section (for example video_spam or photo_spam), or should I use a single table, spam_contents, with the following columns?
SpamId - unique id for the table
ByUserId - Who marked it as spam
SectionName - will be 'news', 'video', 'stories' etc.
Reason - Reason for which user marked it as spam
ContentId - This will contain photoid or videoid or newsid
Date - The day user marked content as spam.
If I need to fetch all content in the video section that users have marked as spam, I can filter on SectionName and use ContentId to look up the flagged items.
Is this a good approach, or does anyone have a better solution for this scenario?
Please help, Thanks!

Unless there's something unique to "video spam", or something unique to "photo spam", etc., you're almost certainly better off with a single table.
Your situation is similar to this supertype/subtype issue. See my reply to that question, too.
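
To make the single-table option concrete, here is a minimal sketch using Python's built-in sqlite3 module so that it runs anywhere; the same SQL carries over to MySQL with minor type changes. The column names come from the question. The UNIQUE constraint (one flag per user per item) and the helper names are my own assumptions, not part of the original design.

```python
import sqlite3


def create_spam_table(conn):
    # Columns follow the question; the UNIQUE constraint is an assumption
    # that stops the same user flagging the same item twice.
    conn.execute("""
        CREATE TABLE spam_contents (
            SpamId      INTEGER PRIMARY KEY AUTOINCREMENT,
            ByUserId    INTEGER NOT NULL,
            SectionName TEXT    NOT NULL,
            ContentId   INTEGER NOT NULL,
            Reason      TEXT,
            Date        TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
            UNIQUE (ByUserId, SectionName, ContentId)
        )
    """)


def flag_as_spam(conn, user_id, section, content_id, reason):
    # INSERT OR IGNORE is SQLite syntax; MySQL spells it INSERT IGNORE.
    conn.execute(
        "INSERT OR IGNORE INTO spam_contents "
        "(ByUserId, SectionName, ContentId, Reason) VALUES (?, ?, ?, ?)",
        (user_id, section, content_id, reason),
    )


def flagged_in_section(conn, section):
    # Returns (ContentId, number_of_flags) pairs, most-flagged first.
    return conn.execute(
        "SELECT ContentId, COUNT(*) FROM spam_contents "
        "WHERE SectionName = ? GROUP BY ContentId "
        "ORDER BY COUNT(*) DESC, ContentId",
        (section,),
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_spam_table(conn)
    flag_as_spam(conn, 1, "video", 10, "advertising")
    flag_as_spam(conn, 2, "video", 10, "advertising")
    print(flagged_in_section(conn, "video"))
```

One query shape then serves every section, which is the main argument for the single table.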

I believe this looks like the best way. Having a centralized collector with a single purpose is a design plus, imho. You could certainly add more fields to each table instead (e.g. video_table also gets 'spam_flag', 'flag_by', 'flag_date' and so on), but I think that, apart from multiplying your work at creation time, it may have significant drawbacks whenever you need to adjust or change the system.
And, by the way, I've seen this structure implemented in a couple of well-known open source bulletin boards for reported messages and similar, so I believe it's a valid and optimized design.
Alternatively, if you are in a good mood, you could also do both: something 'detailed' pertaining to each table, and a centralized structure as a sort of 'admin panel report'.

Related

Logging user activities in applications

The problem I'm here to talk about (and ask about, of course) is not new. I searched the web and Stack Overflow and found ideas on many parts of this problem (pros and cons), but some parts are still missing for me. So I thought it would be a good idea to collect them in one place (it will be more complete with others' ideas, of course) and ask about the rest.
The problem is clear: "We want to log every single action of a user". Once we solve the big problem, smaller ones (like logging only one action) should be a piece of cake.
First, what I have read on the web and Stack Overflow:
Use a DB instead of a file: That's good advice, although it always depends on the situation. Because of the many benefits of a DB, it's generally the better solution in the long term.
DB layer or application layer: It depends. For example, if you really want to monitor everything (I mean every single row that changes in the database), it seems there is only one choice: database triggers. However, there are many discussions around MySQL saying that triggers slow down the DB and advising against them. So, depending on the level of detail you need, you can put your logging system in the DB layer or in the application layer (for example, a common function call such as $logClass->logThis()).
Use observers: Clean code is always better. If you are familiar with observers, you can use them to do the logging whenever an action happens, so you don't have to add $logClass->logThis() every time a CRUD operation happens in your application.
What To Log: Simple and short answer is: Based on your needs, but there are some common fields you will need:
user_id (if a unique user ID is available)
timestamp (unix maybe)
ip (not everyone knows how to fake it, so use it; even a faked one gives you some insight into user behavior)
action_id (should come from a set of predefined actions, to keep queries and reports consistent)
object_id (the unique row ID of the record that was changed)
action (this is the part my question is about)
etc.
I would appreciate it if anyone corrects any mistakes I made or adds other useful information to this post, so it can become a good reference for other users.
And now my question: how to store actions? For better understanding, consider the following scenario.
I have a table named "product" and a table named "companies". The business logic requires assigning products to companies, so we ended up with a table "company_product". Now when a user inserts a new product and simultaneously assigns its companies, two tables are affected (the same goes for delete and update): "product" and "company_product", and we want to know:
what's inserted?
what's deleted?
what's updated to what?
For performance reasons, and because I don't know enough about triggers, I want to do the logging in the application layer, so I came up with the idea of saving the action field in the database as an array or JSON structure. But as I developed my solution I ran into a problem: how do I make this log understandable for non-technical users? For example, I want to save something like this in the action field when deleting (or inserting) the product with id 20:
action : [{"id": 20, "product_id": 2, "company_id": 1}, {"id": 21, "product_id": 2, "company_id": 2}]
And this is not easy for everyone to read and understand. I could make this JSON more readable, something like this:
action : ["Product A Deleted From Company X", "Product A Deleted From Company Y"]
and save the previous structure in a technical_action field for later diagnosis. But that needs additional work and more queries for something (the log) that is not always consulted.
I would appreciate any additional information on this article (I'm definitely sure that there exist other criteria that can be discussed), and answer to my question.

You are essentially gathering details for analytics.
It would be better to use flat tables rather than relational tables.
If you want to do more analysis later, relational tables will not be a good choice, because they tend to perform worse for that kind of work.
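
As a rough illustration of a flat log table that keeps both the readable text and the raw rows from the question, here is a small sketch in Python with sqlite3. The table name activity_log, the log_action helper, and the "product.delete" action identifier are all hypothetical; the column names follow the field list in the question.

```python
import json
import sqlite3
import time


def create_log_table(conn):
    # One flat row per user action: no joins are needed to read the log.
    conn.execute("""
        CREATE TABLE activity_log (
            id               INTEGER PRIMARY KEY AUTOINCREMENT,
            user_id          INTEGER,
            timestamp        INTEGER NOT NULL,  -- unix time
            ip               TEXT,
            action_id        TEXT    NOT NULL,  -- predefined action name
            object_id        INTEGER,
            action           TEXT    NOT NULL,  -- JSON list of readable lines
            technical_action TEXT    NOT NULL   -- JSON list of affected rows
        )
    """)


def log_action(conn, user_id, ip, action_id, object_id, summary, rows, now=None):
    # Both JSON payloads are written in one INSERT, so the readable version
    # costs no extra query at logging time.
    conn.execute(
        "INSERT INTO activity_log (user_id, timestamp, ip, action_id, "
        "object_id, action, technical_action) VALUES (?, ?, ?, ?, ?, ?, ?)",
        (user_id, int(now if now is not None else time.time()), ip,
         action_id, object_id, json.dumps(summary), json.dumps(rows)),
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_log_table(conn)
    log_action(
        conn, 7, "203.0.113.5", "product.delete", 2,
        ["Product A Deleted From Company X", "Product A Deleted From Company Y"],
        [{"id": 20, "product_id": 2, "company_id": 1},
         {"id": 21, "product_id": 2, "company_id": 2}],
    )
    print(conn.execute("SELECT action FROM activity_log").fetchone()[0])
```

The readable strings still have to be built by the application at the moment of the change, when the product and company names are normally already loaded.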

Technique for multiple users on same datasets

This is more a learning question than coding, but I'm certain it's a common issue for anyone developing administration systems or applications in php/mysql/js etc.
I've developed quite a complex application that lets users upload images, and define hotspots in them with associated actions. The images are stored in a table, and the actions in another, with json data for every action in a text field. It's a magazine style format that is used by a custom reading application. However, like I say, the problem is generic.
Basically, my fear is that if two people edit the same image and set of actions at the same time and both submit changes, or if the data was edited by someone else in the meantime, there is a whole series of structures that could fail on submission.
I don't want to implement a locking system, as the system is very wide ranging (links to other images, etc), and I think it's a bit ugly. I saw this link (MSDN Multi-tenant architecture article) in another question, but it seems a little overwhelming and specialised for sql server.
So - what are the terms for data and system architecture here that I can investigate, or are there some good articles to do with this topic that people can recommend? Specifically for php/web world would be great!
--
I'm still looking for good responses on this question. Found out meanwhile that the general term is 'Concurrency', but technique is the important thing :)

First
ALTER TABLE tablename ADD COLUMN changecount BIGINT NOT NULL DEFAULT 0;
for all relevant tables. Then whenever you want to submit a change, use not only
UPDATE tablename SET whatever WHERE id=whatever
but
UPDATE tablename SET whatever, changecount=changecount+1 WHERE id=whatever AND changecount=the_changecount_you_remembered_from_loading_the_object
Now if a user submits a change, it increments changecount. Another user who submits a change to the same object, but loaded it from an older state, will match zero rows in the UPDATE and can be told "another user has just changed blah blah".
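
This technique is usually called optimistic locking. Below is a minimal runnable sketch of it in Python with sqlite3. The images table, its title column, and the helper names are made up for illustration; the changecount column and the WHERE clause are the ones described above. The conflict is detected by checking how many rows the UPDATE affected.

```python
import sqlite3


def create_images_table(conn):
    conn.execute("""
        CREATE TABLE images (
            id          INTEGER PRIMARY KEY,
            title       TEXT   NOT NULL,
            changecount BIGINT NOT NULL DEFAULT 0
        )
    """)


def load_image(conn, image_id):
    # The caller must remember the returned changecount until it saves.
    return conn.execute(
        "SELECT title, changecount FROM images WHERE id = ?", (image_id,)
    ).fetchone()


def save_title(conn, image_id, new_title, loaded_changecount):
    # The UPDATE only matches if nobody has bumped changecount since we loaded.
    cur = conn.execute(
        "UPDATE images SET title = ?, changecount = changecount + 1 "
        "WHERE id = ? AND changecount = ?",
        (new_title, image_id, loaded_changecount),
    )
    conn.commit()
    return cur.rowcount == 1  # False means someone else changed it first


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_images_table(conn)
    conn.execute("INSERT INTO images (id, title) VALUES (1, 'Cover')")
    _, count_a = load_image(conn, 1)  # user A opens the editor
    _, count_b = load_image(conn, 1)  # user B opens the same image
    print(save_title(conn, 1, "Cover v2", count_a))  # A saves first
    print(save_title(conn, 1, "Front page", count_b))  # B is rejected
```

In PHP the equivalent check would be the affected-row count reported by the database driver after the UPDATE.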

user-specific content suggestions across sessions [MySQL/Web]

In a fairly standard blog/articles/comments type setup, what is the most elegant way of offering content tailored to users' interests (as determined by previous activity)?
A tag system for articles exists, so really all that is required is to track the tags of articles that each user reads or comments on. This would be very straightforward by simply writing this information to users' sessions, but what's the best way of modeling it in the site database so that recommendations are session-independent? If a user reads 4 articles tagged as one thing and 2 tagged as another, the system would ideally suggest more articles matching the first tag than the second the next time the user looks at the main page.
What kind of data model can best accomplish this? Having a whole ton of extra fields in the user table seems extremely cumbersome.

CREATE TABLE interests (user_id int, tag varchar(25), weight int);
Something like that. One row per tag for each user. Add to weight when a user reads/comments on something with that tag.
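
Here is a small sketch of how that table could be maintained, written in Python with sqlite3 so it can be run directly. The primary key on (user_id, tag) and the helper names record_interest and top_tags are my assumptions. The update-then-insert pattern is used because it works on any SQL database; in MySQL a single INSERT ... ON DUPLICATE KEY UPDATE would do the same job.

```python
import sqlite3


def create_interests_table(conn):
    # Same columns as above, plus an assumed key of one row per user and tag.
    conn.execute("""
        CREATE TABLE interests (
            user_id INTEGER     NOT NULL,
            tag     VARCHAR(25) NOT NULL,
            weight  INTEGER     NOT NULL DEFAULT 0,
            PRIMARY KEY (user_id, tag)
        )
    """)


def record_interest(conn, user_id, tags, amount=1):
    # Call this when a user reads or comments on an article with these tags.
    for tag in tags:
        cur = conn.execute(
            "UPDATE interests SET weight = weight + ? "
            "WHERE user_id = ? AND tag = ?",
            (amount, user_id, tag),
        )
        if cur.rowcount == 0:  # first time this user touches this tag
            conn.execute(
                "INSERT INTO interests (user_id, tag, weight) VALUES (?, ?, ?)",
                (user_id, tag, amount),
            )


def top_tags(conn, user_id, limit=3):
    # Heaviest tags first; use these to pick articles for the main page.
    return conn.execute(
        "SELECT tag, weight FROM interests WHERE user_id = ? "
        "ORDER BY weight DESC, tag LIMIT ?",
        (user_id, limit),
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_interests_table(conn)
    for _ in range(4):
        record_interest(conn, 1, ["php"])
    for _ in range(2):
        record_interest(conn, 1, ["mysql"])
    print(top_tags(conn, 1))
```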

Exploring data modelling (how to hobble a sensible database together)

I am working on a project in which people can create playlists, which are stored in localStorage as objects. Everything is client-side for the moment.
I would now like to take a leap forward and build a user login system (I can do it using PHP/MySQL and FB Connect or an OAuth system; any other suggestions?). The problem is deciding whether I make a SQL database for each user to store their playlists (with media info), or whether there is another way around it. Will handling a large number of databases be a problem for me (in terms of speed)?
How about I create only one DB as follows:
user database ---> one table containing {user (primary key), pass, someOtherInfo}, then tables per USER (containing playlists), and a 3rd table per playlist (containing userID and media info; what could be my primary key?)
example:
I have 10 registered users, and each user has 2 playlists
1. table 1: 10 entries
2. table(s): username - playlists (10 tables) || or I make one table with one field for the user and another field for the playlist name
3. tables: each playlist - media info, owner (20 tables)
Or is there a simpler way?
I hope my question is clear.
PS: I am new to PHP and databases (so this might be very silly)

Surprised most answers seem to have missed the question, but I'll give this a try;
This is called data modeling (how you hobble a bunch of tables in a database together in order to express what you want in the best possible way), and don't feel silly for asking; there are people out there who spend all their waking hours tweaking and designing data models. They are hugely important to the well-being of any system, and they are, in truth, far more important than most people give them credit for.
It sounds like you're on the right path. It's always a good tip to define your entities and create a table for each, so in this case you've got users and playlists and songs (for example). Define your tables thusly: USER, SONG, PLAYLIST.
The next thing is defining the names of fields and tables (and perhaps the simplistic names suggested above are, well, simplistic). Some introduce faux namespaces (ie. MYAPP_USER instead of just USER), especially if they know the data model will extend and expand in the same database in the future (or, some because they know this is inevitable), while others will just ram through whatever they need.
The big question will always be about normalization and various problems around that, balancing performance against applicability, and there's tons and tons of books written on this subject, so no way for me to give you any meaningful answer, but the gist of it for me is;
At what point will a data field in a table be worthy of its own table? An example is that you could well create your application with only one table, or two, or 6 depending on how you wish to split your data. This is where I think your question really comes in.
I'd say you're pretty much correct in your assumptions, the thing to keep in mind is consistent naming conventions (and there's tons of opinions of how to name identifiers). For your application (with the tables mentioned above), I'd do ;
USER { id, username, password, name, coffee_preference }
SONG { id, artist, album, title, genre }
PLAYLIST { id, userid }
PLAYLIST_ITEM { id, songid, playlistid, songorder }
Now you can use SQL to get all playlists for a user ;
SELECT * FROM PLAYLIST WHERE userid=$userid
Or get all songs in a playlist ;
SELECT * FROM SONG,PLAYLIST_ITEM WHERE playlist_item.playlistid=$playlistid AND song.id=playlist_item.songid ORDER BY playlist_item.songorder
And so on. Again, tomes have been written about this subject. It's all about thinking clearly and semantically while jotting down a technical solution to it. And some people have only this as a career (like DBAs). There will be lots of opinions, especially on what I've written here. Good luck.
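
For anyone who wants to try the four-table layout above, here is a runnable sketch in Python with sqlite3. The table and column names are the ones listed above (with coffee_preference left out); the helper function names are mine. It uses an explicit JOIN and bound parameters instead of pasting $userid into the SQL string, which is the safer habit to carry over to PHP.

```python
import sqlite3


def create_playlist_schema(conn):
    # One shared set of tables for every user, never one table per user.
    conn.executescript("""
        CREATE TABLE user (id INTEGER PRIMARY KEY, username TEXT UNIQUE,
                           password TEXT, name TEXT);
        CREATE TABLE song (id INTEGER PRIMARY KEY, artist TEXT, album TEXT,
                           title TEXT, genre TEXT);
        CREATE TABLE playlist (id INTEGER PRIMARY KEY,
                               userid INTEGER REFERENCES user(id));
        CREATE TABLE playlist_item (
            id         INTEGER PRIMARY KEY,
            songid     INTEGER REFERENCES song(id),
            playlistid INTEGER REFERENCES playlist(id),
            songorder  INTEGER
        );
    """)


def playlists_for_user(conn, user_id):
    rows = conn.execute(
        "SELECT id FROM playlist WHERE userid = ? ORDER BY id", (user_id,)
    ).fetchall()
    return [row[0] for row in rows]


def songs_in_playlist(conn, playlist_id):
    # Song titles in the order the user arranged them.
    rows = conn.execute(
        "SELECT song.title FROM playlist_item "
        "JOIN song ON song.id = playlist_item.songid "
        "WHERE playlist_item.playlistid = ? "
        "ORDER BY playlist_item.songorder",
        (playlist_id,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_playlist_schema(conn)
    conn.execute("INSERT INTO user (id, username) VALUES (1, 'alice')")
    conn.execute("INSERT INTO song (id, title) VALUES (1, 'First'), (2, 'Second')")
    conn.execute("INSERT INTO playlist (id, userid) VALUES (1, 1)")
    conn.execute("INSERT INTO playlist_item (songid, playlistid, songorder) "
                 "VALUES (2, 1, 1), (1, 1, 2)")
    print(playlists_for_user(conn, 1), songs_in_playlist(conn, 1))
```

With 10 users and 2 playlists each, this is still just four tables: 10 rows in user, 20 rows in playlist, and one playlist_item row per song entry.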

You can use either an SQL database like MySQL or PostgreSQL, or a NoSQL database like MongoDB. Each has its pros and cons, but since you seem like a beginner I am going to suggest MySQL because it's what most beginners work with. Take a look at these articles
http://dev.mysql.com/tech-resources/articles/mysql_intro.html
http://www.redhat.com/magazine/007may05/features/mysql/
Of course, feel free to do your own searching on The Big G, as there are tons of resources out there.

Database and Table Management

I have been creating a web app and am looking to expand. In my web app I have a table for users which includes privileges in order to track whether a user is an administrator, a very small table for a dynamic content section of a page, and a table for tracking "events" on the website.
Being not very experienced with web application creation, I'm not really sure how professionals would design the databases and tables for a web application. In my web app, I plan to add further user settings for each member of the website and even a messaging system. I currently use PHP with a MySQL database that I query for all of my commands, but I would be willing to change any of this if necessary. What would be the best way to track content such as messages between users and also the specific settings for each user? Would I want to have multiple databases at any point? Would I want to have multiple tables for each user, perhaps? Any information on how this is done or should be done would be quite helpful.
I'm sorry about the broadness of the question, but I've been wanting to reform this web app since I feel that my ideas for table usage are not on par with those that experienced programmers have.

Here's my seemingly long, hopefully not too convoluted answer to your question. I think I've covered most, if not all of your queries.
For your web app, you could have a table of users called "Users", settings table called "UserSettings" or something equally as descriptive, and messages in "PrivateMessages" table. Then there could be child tables that store extra data that is required.
User security can be a tricky thing to design and implement. Do you want to do it by groups (if you plan on having many users, making it easier to manage their permissions), or just assign individually due to a small user base? For security alone, you'd end up with 4 tables:
Users
UserSettings
UserGroups
UserAssignedGroups
That way you can have user info, settings, groups they can be assigned to and what they ARE assigned to separated properly. This gives you a decent amount of flexibility and conforms to normalization standards (as mentioned above by DrSAR).
With your messages, don't store them with the username, but rather the User ID. For instance, in your PrivateMessages table, you would have a MessageID, SenderUserID, RecipientUserID, Subject, Body and DateSent to store the most basic info. That way, when a user wants to check their received messages, you can query the table saying:
SELECT * FROM PrivateMessages WHERE RecipientUserID = 123556
A list of tables for your messages could be as such:
PrivateMessages
MessageReplies
The PrivateMessages table can store the parent message, and then the MessageReplies table can store the subsequent replies. You could store it all in one table, but depending on traffic and possibly writing recursive functions to retrieve all messages and replies from one table, a two table approach would be simplest I feel.
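
A minimal sketch of the two-table approach, in Python with sqlite3 so it can be run as-is. The PrivateMessages columns are the ones named above; the MessageReplies columns (ReplyID and so on) and the helper names are my own guesses at a reasonable layout. Because replies all point at the parent MessageID, a whole thread comes back with two plain queries and no recursion.

```python
import sqlite3


def create_message_tables(conn):
    conn.executescript("""
        CREATE TABLE PrivateMessages (
            MessageID       INTEGER PRIMARY KEY,
            SenderUserID    INTEGER NOT NULL,
            RecipientUserID INTEGER NOT NULL,
            Subject         TEXT    NOT NULL,
            Body            TEXT    NOT NULL,
            DateSent        TEXT    NOT NULL
        );
        CREATE TABLE MessageReplies (
            ReplyID      INTEGER PRIMARY KEY,
            MessageID    INTEGER NOT NULL REFERENCES PrivateMessages(MessageID),
            SenderUserID INTEGER NOT NULL,
            Body         TEXT    NOT NULL,
            DateSent     TEXT    NOT NULL
        );
    """)


def inbox(conn, user_id):
    # Parent messages addressed to this user, newest first.
    return conn.execute(
        "SELECT MessageID, Subject FROM PrivateMessages "
        "WHERE RecipientUserID = ? ORDER BY DateSent DESC",
        (user_id,),
    ).fetchall()


def thread(conn, message_id):
    # The parent message followed by its replies, oldest first.
    # This sketch assumes the parent message exists.
    parent = conn.execute(
        "SELECT SenderUserID, Body FROM PrivateMessages WHERE MessageID = ?",
        (message_id,),
    ).fetchone()
    replies = conn.execute(
        "SELECT SenderUserID, Body FROM MessageReplies "
        "WHERE MessageID = ? ORDER BY DateSent, ReplyID",
        (message_id,),
    ).fetchall()
    return [parent] + replies


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_message_tables(conn)
    conn.execute("INSERT INTO PrivateMessages VALUES "
                 "(1, 1, 2, 'Hello', 'hi', '2024-01-01 10:00:00')")
    conn.execute("INSERT INTO MessageReplies VALUES "
                 "(1, 1, 2, 'hello back', '2024-01-01 10:05:00')")
    print(inbox(conn, 2), thread(conn, 1))
```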
If I were you, I'd sit down with a pencil and paper, and write down/draw what I want to track in my database. That way you can then draw links between what you want to store, and see how it will come together. It helps me when I'm trying to visualise things.

For the scope of your web app you don't need multiple databases. You do need, however, multiple tables to store your data efficiently.
For user settings, always use a separate table. You want your "main" users table as lean as possible, since it will be accessed (= searched) every time a user tries to log in. Store IDs, username, password (hashed, of course) and any other field that you need to access when authenticating. Put all the extra information in a separate table. That way your login will only query a smaller table, and once the user is authenticated you can use their ID to get all other information from the secondary table(s).
Messages can be trickier because they are an order of magnitude bigger: you might have tens or hundreds for each user. You need to design your table structure based on your application's logic. A table for each user is clearly not a feasible solution, so go for a general messages table but implement procedures to keep it to a manageable size. An example would be "archiving" messages older than X days, which would move them to another table (which works well if your users aren't likely to access their old messages too often). But like I said, it depends on your application.
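
The archiving idea could look roughly like the sketch below, written in Python with sqlite3. The messages and messages_archive tables, their columns, and the archive_old_messages helper are all hypothetical names for illustration. The copy and the delete run in one transaction, so a message is never lost or duplicated if something fails halfway.

```python
import sqlite3
import time

SECONDS_PER_DAY = 86400


def create_tables(conn):
    # The archive table has exactly the same columns as the live table.
    for name in ("messages", "messages_archive"):
        conn.execute(f"""
            CREATE TABLE {name} (
                id           INTEGER PRIMARY KEY,
                sender_id    INTEGER NOT NULL,
                recipient_id INTEGER NOT NULL,
                body         TEXT    NOT NULL,
                sent_at      INTEGER NOT NULL  -- unix time
            )
        """)


def archive_old_messages(conn, days, now=None):
    # Moves every message older than `days` days; returns how many moved.
    if now is None:
        now = int(time.time())
    cutoff = now - days * SECONDS_PER_DAY
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO messages_archive "
            "SELECT * FROM messages WHERE sent_at < ?", (cutoff,)
        )
        cur = conn.execute("DELETE FROM messages WHERE sent_at < ?", (cutoff,))
    return cur.rowcount


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_tables(conn)
    now = int(time.time())
    conn.execute("INSERT INTO messages VALUES (1, 1, 2, 'old', ?)",
                 (now - 90 * SECONDS_PER_DAY,))
    conn.execute("INSERT INTO messages VALUES (2, 1, 2, 'new', ?)", (now,))
    print(archive_old_messages(conn, days=30, now=now))
```

A job like this would typically run on a schedule (cron, for example) rather than on every page load.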
Good luck!

Along the lines of Cristian Radu's comments: you need to split your data into different tables. The lean user table will (in fact, should) have one unique ID per user. This (unique) key should be repeated in the secondary tables. It will then be called a foreign key. Obviously, you want a key that's unique. If your username can be guaranteed to be unique (e.g. you require users to be identified by their email address), then you can use that. If user names are real names (e.g. Firstname Surname), then you don't have that guarantee and you need to keep a userid which becomes your key. Similarly, the table containing your posts could (but doesn't have to) have a field with the userid indicating who wrote each post.
You might want to read a bit about database design and the concept of normalization: (http://dev.mysql.com/tech-resources/articles/intro-to-normalization.html) No need to get bogged down with the n-th form of normalization but it will help you at this stage where you need to figure out the database design.
Good luck and report back ;-)
