I have user discussion forums that I coded in PHP/MySQL. I want to know how the big-name forums show you which topics have new posts in them, usually by changing an icon next to the thread, while using hardly any resources.
The simplest way is to track the last time someone was logged in. When they come back to visit, everything which has been updated since then is obviously "new".
This has some problems though, since logging out effectively marks all items as read.
The only other way I could think to do it would be to maintain a table containing all the threads and the latest post in that thread which each user has seen.
user_id thread_id post_id
1 5 15
1 6 19
With that information, if there is a post in thread #5 which has an ID larger than 15, then you know there are unread posts there. Update this table only with the post_id of the latest post on the page the user actually viewed. This means that if there are 3 pages of new posts and the user only views the first, it'll still know there are unread posts.
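A minimal sketch of that table, using Python's built-in sqlite3 as a stand-in for PHP/MySQL so it runs anywhere. The names posts, thread_seen, mark_seen and has_unread are made up for illustration, and treating a never-visited thread as unread is an assumption, not something the answer above specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (post_id INTEGER PRIMARY KEY, thread_id INTEGER NOT NULL);
CREATE TABLE thread_seen (
    user_id   INTEGER NOT NULL,
    thread_id INTEGER NOT NULL,
    post_id   INTEGER NOT NULL,          -- latest post this user has seen
    PRIMARY KEY (user_id, thread_id)
);
""")

def mark_seen(user_id, thread_id, last_post_on_page):
    """Record the newest post on the page just viewed; never move the marker back."""
    conn.execute(
        """INSERT INTO thread_seen (user_id, thread_id, post_id)
           VALUES (?, ?, ?)
           ON CONFLICT(user_id, thread_id)
           DO UPDATE SET post_id = MAX(post_id, excluded.post_id)""",
        (user_id, thread_id, last_post_on_page),
    )

def has_unread(user_id, thread_id):
    """True if the thread holds a post newer than the user's marker."""
    row = conn.execute(
        """SELECT EXISTS (
               SELECT 1
               FROM posts p
               LEFT JOIN thread_seen s
                      ON s.thread_id = p.thread_id AND s.user_id = ?
               WHERE p.thread_id = ?
                 AND p.post_id > COALESCE(s.post_id, 0)
           )""",
        (user_id, thread_id),
    ).fetchone()
    return bool(row[0])
```

The ON CONFLICT ... DO UPDATE form needs SQLite 3.24 or newer; in MySQL the counterpart is INSERT ... ON DUPLICATE KEY UPDATE.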
As nickf said above, except that the threads the user has actually visited are tracked, so anything the user hasn't visited is considered new for that visitor. For finer-grained control, any threads created before the user registered are ignored, and possibly any threads not visited within a period of time are ignored too. This prevents every unvisited thread from becoming a new thread for them.
Of course, there are many ways to skin a cat, and depending on what the forum creators wanted, the above can be changed to suit.
DC
You could log the last time they selected that topic and then see if a post has a later timestamp than their last "click" on the thread.
You could make a special table in your database with columns like USER_ID and THREAD_ID and with appropriate constraints to your USER and THREAD tables and a primary key containing USER and THREAD IDs.
Now when somebody opens a thread, you just insert that USER-THREAD-PAIR into that special table.
In your thread listings you can now simply outer-join that table onto whatever query you already use there. If the joined table yields NULL for a particular thread, that thread is unread. This will enable lists like:
All Threads with "unread" marker
All unread threads
Threads read by user XY
If you add a date column to this table, you can do even more interesting stuff.
Just keep an eye on your keys and indexes to prevent too heavy negative performance impacts. Try to read from the USER-THREAD-table only by joining it into your existing queries. That will work much faster than executing individual queries all the time.
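The outer join described above might look roughly like this. It is only a sketch: Python's sqlite3 stands in for PHP/MySQL, and the table and function names (threads, thread_reads, mark_read, list_threads) are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE threads (thread_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE thread_reads (
    user_id   INTEGER NOT NULL,
    thread_id INTEGER NOT NULL REFERENCES threads(thread_id),
    PRIMARY KEY (user_id, thread_id)
);
""")

def mark_read(user_id, thread_id):
    """Insert the user-thread pair once; repeat visits are ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO thread_reads (user_id, thread_id) VALUES (?, ?)",
        (user_id, thread_id),
    )

def list_threads(user_id):
    """Return (thread_id, title, is_unread) for every thread in a single query."""
    rows = conn.execute(
        """SELECT t.thread_id, t.title, r.user_id IS NULL AS is_unread
           FROM threads t
           LEFT JOIN thread_reads r
                  ON r.thread_id = t.thread_id AND r.user_id = ?
           ORDER BY t.thread_id""",
        (user_id,),
    ).fetchall()
    return [(tid, title, bool(unread)) for tid, title, unread in rows]
```

Filtering the same query on r.user_id IS NULL gives the "all unread threads" list, and the composite primary key doubles as the index the join needs.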
You could have a table that gets an insert whenever a thread gets read, if the user reading it hasn't already. Then when someone adds to the thread you can delete all entries in the table for that thread, thus making it unread for all users.
The table structure would be something like
forum_id thread_id user_id
Optionally, add a surrogate has_read_id column as the primary key, with the other fields forming a composite unique key.
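A rough sketch of this insert-on-read, delete-on-post idea, with Python's sqlite3 standing in for PHP/MySQL. The table name has_read and the three helper functions are hypothetical, and the sketch uses the composite key without the optional surrogate id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE has_read (
    forum_id  INTEGER NOT NULL,
    thread_id INTEGER NOT NULL,
    user_id   INTEGER NOT NULL,
    PRIMARY KEY (forum_id, thread_id, user_id)
)""")

def mark_read(forum_id, thread_id, user_id):
    """Insert a row the first time a user reads the thread."""
    conn.execute(
        "INSERT OR IGNORE INTO has_read (forum_id, thread_id, user_id) VALUES (?, ?, ?)",
        (forum_id, thread_id, user_id),
    )

def on_new_post(forum_id, thread_id):
    """A new post wipes every read marker, making the thread unread for all users."""
    conn.execute(
        "DELETE FROM has_read WHERE forum_id = ? AND thread_id = ?",
        (forum_id, thread_id),
    )

def is_read(forum_id, thread_id, user_id):
    row = conn.execute(
        "SELECT 1 FROM has_read WHERE forum_id = ? AND thread_id = ? AND user_id = ?",
        (forum_id, thread_id, user_id),
    ).fetchone()
    return row is not None
```

The table only ever holds one row per user per thread since the last post, so it stays small on busy threads, at the cost of a delete on every new post.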
Related
I'm making a website that has a system of posts and replies.
What I'd like to do is, when someone replies, send a notification to everyone who has ever replied to (or been involved in) the post.
My thought is to create a table named Notification that contains a message field and a seen (seen/unread) field. Once someone replies, INSERT a record into the Notification table.
It seems easy and intuitive, but what if lots of people are involved? For example, when the 31st user replies, the 30 people who have already replied will each receive a notification. That makes 30 rows. The 32nd user's reply makes another 31 rows, so the total number of rows becomes 30+31=61.
My questions are:
1. Is that a good way to handle a notification system?
2. If so, how do I deal with duplicate notifications (one that hasn't been seen yet when a new reply arrives)?
3. As above, will this put a huge load on the server?
Thank you so much.
I was creating similar system. Here is my experience:
My notification table looks like: id (int) | user_id (int) | post_id (int) | last_visited (datetime).
user_id + post_id is a unique composite index.
So when a user opens the page, I look for an entry (user_id + post_id) in the database. If I find it, I update the last_visited field; if I don't, I create a new row.
When I need to list messages for notification, I just query all messages that were created after the last_visited time.
I also have a cron script that cleans up notifications for closed posts or banned users.
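Roughly, the find-or-create step and the "newer than last_visited" query could be sketched like this. Python's sqlite3 stands in for PHP/MySQL; the replies table and the visit and new_replies functions are assumptions made for the example, and timestamps are passed in as sortable text so the sketch stays deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE replies (
    id         INTEGER PRIMARY KEY,
    post_id    INTEGER NOT NULL,
    created_at TEXT NOT NULL            -- 'YYYY-MM-DD HH:MM:SS', sorts as text
);
CREATE TABLE notification (
    id           INTEGER PRIMARY KEY,
    user_id      INTEGER NOT NULL,
    post_id      INTEGER NOT NULL,
    last_visited TEXT NOT NULL,
    UNIQUE (user_id, post_id)
);
""")

def visit(user_id, post_id, now):
    """Create the row on the first visit, otherwise just bump last_visited."""
    conn.execute(
        """INSERT INTO notification (user_id, post_id, last_visited)
           VALUES (?, ?, ?)
           ON CONFLICT(user_id, post_id)
           DO UPDATE SET last_visited = excluded.last_visited""",
        (user_id, post_id, now),
    )

def new_replies(user_id):
    """IDs of replies created after the user's last visit to each tracked post."""
    rows = conn.execute(
        """SELECT r.id
           FROM notification n
           JOIN replies r ON r.post_id = n.post_id
           WHERE n.user_id = ? AND r.created_at > n.last_visited
           ORDER BY r.id""",
        (user_id,),
    ).fetchall()
    return [r[0] for r in rows]
```

This keeps one row per user per post no matter how many replies arrive, which is where the storage saving over one-row-per-notification comes from.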
As for your questions:
1 and 2: You have to find a balance between the amount of data that will be stored and site performance. If you don't need to store all this data you can follow my way. If this data is needed your way is better.
3: It depends on the number of visitors and other functionality, but here is some advice. You must use indexes on the MySQL table for better performance. You should also think about a cron script that removes useless notifications. If you have a huge number of visitors, more than 700k per day, you should think about MongoDB or another high-performance NoSQL database.
I have a table that stores posts. Each post has an id, title, content and a score. Currently, you can like a post and its score will increment, or dislike it and its score will decrement.
Now the thing I don't understand: how do I avoid that a user will vote more than once? Surely they can just refresh and vote again. I've read some articles that store cookies, etc. but can't you just disable cookies or clear them and vote again?
I was thinking you would have to store who has voted, or rather, the ID of who has voted too. However, I can't seem to visualize how I would go about this? Would I store the ID of the voter in the post they're voting, or something else?
You will need an extra table posts_vote or something. Put the fields user_id and post_id in it. If a user votes a post up, you insert both IDs in this table. If a user votes down, find the record and delete it.
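A minimal sketch of that extra table, using Python's sqlite3 in place of PHP/MySQL. The composite primary key is what actually stops a second vote; the function names upvote, remove_vote and score are made up here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE posts_vote (
    user_id INTEGER NOT NULL,
    post_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, post_id)       -- one vote per user per post
)""")

def upvote(user_id, post_id):
    """Return True if the vote was recorded, False if this user already voted."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO posts_vote (user_id, post_id) VALUES (?, ?)",
                (user_id, post_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def remove_vote(user_id, post_id):
    """Voting down means finding the record and deleting it."""
    with conn:
        conn.execute(
            "DELETE FROM posts_vote WHERE user_id = ? AND post_id = ?",
            (user_id, post_id),
        )

def score(post_id):
    row = conn.execute(
        "SELECT COUNT(*) FROM posts_vote WHERE post_id = ?", (post_id,)
    ).fetchone()
    return row[0]
```

Because the check lives in the database, refreshing the page or clearing cookies changes nothing: the second insert simply fails.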
You can store the voting records in a separate table.
Vote history (Alternative 1)
voter_id
post_id
action (upvote/downvote)
created_at
Vote history (Alternative 2)
voter_id
post_id
points (this field can get negative numbers for downvote)
created_at
So when you have that particular record you can check if the user has voted for the post before and decide to increment/decrement the points in the actual post table.
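Alternative 2 could be sketched as follows, with Python's sqlite3 standing in for PHP/MySQL. The cast_vote function, the plus or minus 1 restriction on points, and the idea of applying only the difference to the post's score are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (id INTEGER PRIMARY KEY, score INTEGER NOT NULL DEFAULT 0);
CREATE TABLE vote_history (
    voter_id   INTEGER NOT NULL,
    post_id    INTEGER NOT NULL REFERENCES posts(id),
    points     INTEGER NOT NULL CHECK (points IN (-1, 1)),
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (voter_id, post_id)
);
""")

def cast_vote(voter_id, post_id, points):
    """Record an up (+1) or down (-1) vote; return False if it repeats the last one."""
    with conn:  # history row and score change commit together
        row = conn.execute(
            "SELECT points FROM vote_history WHERE voter_id = ? AND post_id = ?",
            (voter_id, post_id),
        ).fetchone()
        previous = row[0] if row else 0
        if previous == points:
            return False
        conn.execute(
            "INSERT OR REPLACE INTO vote_history (voter_id, post_id, points) VALUES (?, ?, ?)",
            (voter_id, post_id, points),
        )
        # Apply only the difference, so switching from +1 to -1 moves the score by 2.
        conn.execute(
            "UPDATE posts SET score = score + ? WHERE id = ?",
            (points - previous, post_id),
        )
    return True
```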
In the future the vote history table will grow and may cause performance issues; you can sync the history records with redis/memcached/etc. and do your checks faster with those storage technologies.
Also, using cookies can help you disable voting without hitting the server at all, which reduces the number of requests to your web server.
You can store the voted post IDs in cookies and check them with JavaScript; don't make the request if the user has already voted on the post.
So you can have two layers for checking if the user has voted.
Cookies and javascript
Server-side persisted data
If number one fails (users can bypass it by changing browsers or clearing cookies), it falls back to number two, and the server refuses the vote.
Just as an example:
Let's say you have tables posts and votes. Then you could have a posts_votes as a look up table.
I am storing user ID values in a table field separated by a | (user_id1|user_id2|user_id3|user_id17).
A user ID will be added and removed from this field at certain points.
How can I check if the current users ID exists in the field or not using a query?
And it of course needs to be an exact match. Can't look for user_id1 and find user_id17.
I know I could use a SELECT query, explode the field, then use in_array but if there's a way to do it using a query it'd be better.
I guess I'll explain what I am doing: I made a forum for a small private website (7 users), but coding it for larger scale.
My table structure is pretty good: forum_categories, forum_topics, forum_posts. Using foreign keys between the tables for delete and update queries.
What I am seeking help on is how to mark topics as unread for each user. I could create a new table with topic_id & user_id, each pair being a new row, but that wouldn't be good with a lot of users & topics.
If somebody has a better solution I am all for it. Or can prove to me that 1 row per user_id is the best way then I'll be more than willing to do that.
I think you want to track read messages, not the other way around. If you tracked unread messages, every time you add a user you'll have to add that user to every topic's "unread list".
I looked into SMF like my comment suggested. They are using a separate table to track read messages.
A simple table that holds user_id and topic_id is all you need. When a user reads a topic, make sure there is a row in the table for that user.
Another reason to use a separate table: it's going to be faster to query against 2 int values in the database than to use LIKE with % wildcards.
Problem
In a web application dealing with products and orders, I want to maintain information and relationships between former employees (users) and the orders they handled. I want to maintain information and relationships between obsolete products and orders which include these products.
However I want employees to be able to de-clutter the administration interfaces, such as removing former employees, obsolete products, obsolete product groups etc.
I'm thinking of implementing soft-deletion. So, how does one usually do this?
My immediate thoughts
My first thought is to stick a "flag_softdeleted TINYINT NOT NULL DEFAULT 0" column in every table of objects that should be soft deletable. Or maybe use a timestamp instead?
Then, I provide a "Show deleted" or "Undelete" button in each relevant GUI. Clicking this button you will include soft-deleted records in the result. Each deleted record has a "Restore" button. Does this make sense?
Your thoughts?
Also, I'd appreciate any links to relevant resources.
That's how I do it. I have an is_deleted field which defaults to 0. Then queries just check WHERE is_deleted = 0.
I try to stay away from any hard-deletes as much as possible. They are necessary sometimes, but I make that an admin-only feature. That way we can hard-delete, but users can't...
Edit: In fact, you could use this to have multiple "layers" of soft-deletion in your app. So each could be a code:
0 -> Not Deleted
1 -> Soft Deleted, shows up in lists of deleted items for management users
2 -> Soft Deleted, does not show up for any user except admin users
3 -> Only shows up for developers.
Having the other 2 levels will still allow managers and admins to clean up the deleted lists if they get too long. And since the front-end code just checks for is_deleted = 0, it's transparent to the frontend...
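One way to sketch the layered codes, with Python's sqlite3 standing in for the real database. The items table, the role names and the role-to-level mapping are assumptions for the example; only the 0 to 3 codes come from the list above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE items (
    id         INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    is_deleted INTEGER NOT NULL DEFAULT 0   -- 0..3, as in the list above
)""")

# Highest deletion level each (hypothetical) role is allowed to see.
MAX_VISIBLE_LEVEL = {"user": 0, "manager": 1, "admin": 2, "developer": 3}

def frontend_items():
    """What the normal front end shows: live rows only."""
    rows = conn.execute("SELECT name FROM items WHERE is_deleted = 0 ORDER BY id")
    return [r[0] for r in rows]

def deleted_items_for(role):
    """Soft-deleted rows that appear in this role's 'deleted items' list."""
    rows = conn.execute(
        "SELECT name FROM items WHERE is_deleted BETWEEN 1 AND ? ORDER BY id",
        (MAX_VISIBLE_LEVEL[role],),
    )
    return [r[0] for r in rows]

def set_level(item_id, level):
    """Raising the level hides the row from more roles without removing it."""
    conn.execute("UPDATE items SET is_deleted = ? WHERE id = ?", (level, item_id))
```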
Using soft-deletes is a common thing to implement, and they are dead useful for lots of things, like:
Saving a user's data when they deleted something
Saving your own data when you delete something
Keep a track record of what really happened (a kind of audit)
etcetera
There is one thing I want to point out that almost everyone misses, and it always comes back to bite you in the rear. The users of your application do not have the same understanding of a delete as you have.
There are different degrees of deletions. The typical user deletes stuff when (s)he
Made a mistake and wants to remove the bad data
Doesn't want to see something on the screen anymore
The problem is that if you don't record the intention of the delete, your application cannot distinguish between erroneous data (that should never have been created) and historically correct data.
Have a look at the following data:
PRICES
+------+-------+---------+
| item | price | deleted |
+------+-------+---------+
| A    | 101   | 1       |
| B    | 110   | 1       |
| C    | 120   | 0       |
+------+-------+---------+
Some user doesn't want to show the price of item B, since they don't sell that item anymore, so he deletes it. Another user created a price for item A by mistake, so he deleted it and created the price for item C, as intended. Now, can you show me a list of the prices for all products? No, because either you have to display potentially erroneous data (A), or you have to exclude all but current prices (C).
Of course the above can be dealt with in any number of ways. My point is that YOU need to be very clear about what YOU mean by a delete, and make sure that there is no way for the users to misunderstand it. One way would be to force the user to make a choice (hide/delete).
If I had existing code that hits that table, I would add the column and change the name of the table. Then I would create a view with the same name as the current table which selects only the active records. That way none of the existing code would break and you could have the soft-delete column. If you want to see the deleted records, you select from the base table; otherwise you use the view.
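The rename-plus-view trick might be sketched like this, again with Python's sqlite3 as a stand-in. The products table, the products_all name and the helper functions are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Existing table that old code already queries.
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
INSERT INTO products (id, name) VALUES (1, 'widget'), (2, 'gadget');

-- Migration: rename the base table, add the flag, put a view in its place.
ALTER TABLE products RENAME TO products_all;
ALTER TABLE products_all ADD COLUMN is_deleted INTEGER NOT NULL DEFAULT 0;
CREATE VIEW products AS
    SELECT id, name FROM products_all WHERE is_deleted = 0;
""")

def soft_delete(product_id):
    """New code flags the row on the base table instead of deleting it."""
    conn.execute("UPDATE products_all SET is_deleted = 1 WHERE id = ?", (product_id,))

def legacy_product_names():
    """Unchanged legacy query: it still says 'products' but now reads the view."""
    return [r[0] for r in conn.execute("SELECT name FROM products ORDER BY id")]

def all_product_names():
    """Deleted rows stay reachable through the base table."""
    return [r[0] for r in conn.execute("SELECT name FROM products_all ORDER BY id")]
```

One caveat with this stand-in: SQLite views are read-only, so legacy INSERT or UPDATE statements against the view would need INSTEAD OF triggers there, whereas MySQL allows updates through simple single-table views like this one.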
I've always just used a deleted column as you mentioned. There's really not much more to it than that. Instead of deleting the record, just set the deleted field to true.
Some components I build allow the user to view all deleted records and restore them, others just display all records where deleted = 0
Your idea does make sense and is used frequently in production but, to implement it you will need to update quite a bit of code to account for the new field. Another option could be to archive (move) the "soft-deleted" records to a separate table or database. This is done frequently as well and makes the issue one of maintenance rather than (re)programming. (You could have a table trigger react to the delete to archive the deleted record.)
I would do the archiving to avoid a major update to production code. But if you want to use a deleted-flag field, make it a timestamp to give you additional useful info beyond a boolean. (Null = not deleted.) You might also want to add a DeletedBy field to track the user responsible for deleting the record. Using two fields gives you a lot of info: it tells you who deleted what and when. (The two-extra-fields solution is also something that can be done in an archive table/database.)
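A small sketch of the timestamp-plus-DeletedBy variant, using Python's sqlite3 as a stand-in. The orders table, the snake_case column names and the functions are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE orders (
    id         INTEGER PRIMARY KEY,
    deleted_at TEXT,       -- NULL = not deleted
    deleted_by INTEGER     -- user who performed the delete
)""")

def soft_delete(order_id, user_id, now):
    """Stamp who deleted the row and when; rows already deleted are left alone."""
    conn.execute(
        """UPDATE orders SET deleted_at = ?, deleted_by = ?
           WHERE id = ? AND deleted_at IS NULL""",
        (now, user_id, order_id),
    )

def restore(order_id):
    conn.execute(
        "UPDATE orders SET deleted_at = NULL, deleted_by = NULL WHERE id = ?",
        (order_id,),
    )

def live_order_ids():
    rows = conn.execute("SELECT id FROM orders WHERE deleted_at IS NULL ORDER BY id")
    return [r[0] for r in rows]
```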
The most common scenario I've come across is what you describe, a tinyint or even bit representing a status of IsActive or IsDeleted. Depending on whether this is considered "business" or "persistence" data it may be baked into the application/domain logic as transparently as possible, such as directly in stored procedures and not known to the application code. But it sounds like this is legitimate business information for your needs so would need to be known throughout the code. (So users can view deleted records, as you suggest.)
Another approach I've seen is to use a combination of two timestamps to show a "window" of activity for a given record. It's a little more code to maintain it, but the benefit is that something can be scheduled to soft-delete itself at a pre-determined time. Limited-time products can be set that way when they're created, for example. (To make a record active indefinitely one could use a max value (or just some absurdly distant future date) or just have the end date be null if you're ok with that.)
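The two-timestamp "window" could look something like this sketch (Python's sqlite3 as a stand-in; the column names active_from and active_until are made up, and a NULL end date is taken to mean "active indefinitely", which is one of the options mentioned above).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE products (
    id           INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    active_from  TEXT NOT NULL,   -- 'YYYY-MM-DD', sorts correctly as text
    active_until TEXT             -- NULL = no scheduled end
)""")

def active_products(now):
    """Names of products whose activity window contains the given date."""
    rows = conn.execute(
        """SELECT name FROM products
           WHERE active_from <= ?
             AND (active_until IS NULL OR active_until > ?)
           ORDER BY id""",
        (now, now),
    )
    return [r[0] for r in rows]

def soft_delete(product_id, now):
    """Deleting is just closing the window at the current moment."""
    conn.execute("UPDATE products SET active_until = ? WHERE id = ?", (now, product_id))
```

A limited-time product is created with its end date already filled in, so it drops out of active_products on its own with no scheduled job.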
Then of course there's further consideration of things being deleted/undeleted from time to time and tracking some kind of audit for that. The flag approach knows only the current status, the timestamp approach knows only the most recent window. But anything as complex as an audit trail should definitely be stored separately than the records in question.
Instead, I would use a bin table into which to move all the records deleted from the other tables. The main problem with the delete flag is that with linked tables you will definitely run into a duplicate key error when trying to insert a new record.
The bin table could have a structure like this:
id, table_name, data, date_time, user
Where
id is the primary key with auto increment
table_name is the name of the table from which the record was deleted
data contains the record in JSON format with name and value of all fields
date_time is the date and time of the deletion
user is the identifier of the user (if the system provides for it) who performed the operation
This method will not only save you from checking the delete flag in each query (imagine the ones with many joins), but will also let you keep only the really necessary data in the tables, facilitating any searches and corrections using SQL client programs.
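A minimal sketch of such a bin table, with Python's sqlite3 and json modules standing in for the real stack. The products table, the BINNABLE_TABLES allow-list and the move_to_bin function are assumptions; the bin columns follow the structure listed above.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets a row be converted to a dict of column names
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE bin (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    table_name TEXT NOT NULL,
    data       TEXT NOT NULL,                       -- the deleted record as JSON
    date_time  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    user       INTEGER
);
""")

# Table names are interpolated into SQL below, so only allow known ones.
BINNABLE_TABLES = {"products"}

def move_to_bin(table_name, record_id, user):
    """Copy the record into the bin as JSON, then delete it, in one transaction."""
    if table_name not in BINNABLE_TABLES:
        raise ValueError(f"table not allowed: {table_name}")
    with conn:
        row = conn.execute(
            f"SELECT * FROM {table_name} WHERE id = ?", (record_id,)
        ).fetchone()
        if row is None:
            return False
        conn.execute(
            "INSERT INTO bin (table_name, data, user) VALUES (?, ?, ?)",
            (table_name, json.dumps(dict(row)), user),
        )
        conn.execute(f"DELETE FROM {table_name} WHERE id = ?", (record_id,))
    return True
```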
I am thinking of storing online users in memcached.
First I thought about having array of key => value pairs where key will be user id and value timestamp of last access.
My problem is that it will be quite large array when there are many users currently online(and there will be).
As memcached is not built to store large data, how would you solve it? What is the best practice?
Thanks for your input!
The problem with this approach is that memcache is only queryable if you know the key in advance. This means you would have to keep the entire online user list under a single known key. Each time a user came online or went offline, it would become necessary to read the list, adjust it, and rewrite it. There is serious potential for a race condition there, so you would have to use the check-and-set locking mechanism.
I don't think you should do this. Consider keeping a database table of recent user hits:
user_id: int
last_seen: timestamp
Index on last_seen and user_id. Query friends online using:
SELECT user_id FROM online WHERE user_id IN (...) AND last_seen > NOW() - INTERVAL 10 MINUTE;
Periodically go through the table and batch remove old timestamp rows.
When your site becomes big you can shard this table on the user_id.
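A rough sketch of the recent-hits table, with Python's sqlite3 standing in for MySQL. Storing last_seen as integer Unix seconds, the 600-second window and the function names touch, friends_online and purge are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE online (
    user_id   INTEGER PRIMARY KEY,
    last_seen INTEGER NOT NULL           -- Unix time of the user's last hit
);
CREATE INDEX online_last_seen ON online (last_seen, user_id);
""")

def touch(user_id, now):
    """Called on every page hit: insert the row or overwrite its timestamp."""
    conn.execute(
        "INSERT OR REPLACE INTO online (user_id, last_seen) VALUES (?, ?)",
        (user_id, now),
    )

def friends_online(friend_ids, now, window=600):
    """Which of the given users were seen within the last `window` seconds."""
    if not friend_ids:
        return []
    placeholders = ",".join("?" * len(friend_ids))
    rows = conn.execute(
        f"""SELECT user_id FROM online
            WHERE user_id IN ({placeholders}) AND last_seen > ?
            ORDER BY user_id""",
        (*friend_ids, now - window),
    )
    return [r[0] for r in rows]

def purge(now, window=600):
    """Periodic cleanup of rows older than the window."""
    conn.execute("DELETE FROM online WHERE last_seen <= ?", (now - window,))
```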
EDIT:
Actually, you can do this if you don't need to query all the users who are currently online, but just need to know if certain users are online.
When a user hits a page,
memcache.set("online."+user_id, true, 600 /* 10 mins */);
To see if a user is online,
online = memcache.get("online."+user_id);
There should also be a way to multikey query memcache, look that up. Some strange things could happen if you add a memcache server, but if you use this information for putting an "online" marker next to user names that shouldn't be a big deal.
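To show the per-user key idea without a memcached server, here is a sketch that uses a tiny in-process stand-in with the same set-with-expiry and get behaviour. TTLCache is NOT a real memcached client, just a hypothetical substitute for the example, as are mark_online, is_online and online_among.

```python
import time

class TTLCache:
    """In-memory stand-in for a cache whose keys expire after a TTL (seconds)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}

    def set(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self._clock() >= expires_at:
            del self._data[key]   # expired: behave as if the key never existed
            return None
        return value

def mark_online(cache, user_id, ttl=600):
    """Call on every page hit; the key vanishes 10 minutes after the last hit."""
    cache.set(f"online.{user_id}", True, ttl)

def is_online(cache, user_id):
    return cache.get(f"online.{user_id}") is not None

def online_among(cache, user_ids):
    """Check several known users, one key lookup each."""
    return [uid for uid in user_ids if is_online(cache, uid)]
```

Each user only ever touches their own key, so there is no shared list to read, modify and rewrite, and therefore no race to guard against.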