I'm working on a PHP app that has several objects that can be commented on. Each comment can be voted on, with users being able to give it +1 or -1 (like Digg or Reddit). Right now I'm planning on having a 'votes' table that carries the user_id and their vote info, which seems to work fine.
The thing is, each object has hundreds of comments that are stored in a separate comments table. After I load the comments, I'm having to tally the votes and then individually check each vote against the user to make sure they can only vote once. This works but just seems really database intensive - a lot of queries for just the comments.
Is there a simpler method of doing this that is less DB intensive? Is my current database structure the best way to go?
To be clearer about current database structure:
Comments table:
user_id
object_id
total_votes
Votes table:
comment_id
user_id
vote
End Goal:
Allow user to vote only once on each comment with least # of MySQL queries (each object has multiple comments)
To make sure that each voter votes only once, design your Votes table with these fields: CommentID, UserID, VoteValue. Make (CommentID, UserID) the composite primary key, which guarantees that one user gets only one vote per comment. Then, to query the votes for a comment, do something like this:
SELECT SUM(VoteValue)
FROM Votes
WHERE CommentID = ?
Does that help?
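To make the composite-key idea concrete, here is a minimal sketch. It assumes Python's built-in sqlite3 module standing in for MySQL (the DDL and the SUM query carry over almost unchanged), and the helper names cast_vote and score are made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL; table and column names follow the answer above.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Votes (
        CommentID INTEGER NOT NULL,
        UserID    INTEGER NOT NULL,
        VoteValue INTEGER NOT NULL,
        PRIMARY KEY (CommentID, UserID)
    )
    """
)


def cast_vote(comment_id, user_id, value):
    """Insert a vote; return False if this user already voted on the comment."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Votes (CommentID, UserID, VoteValue) VALUES (?, ?, ?)",
                (comment_id, user_id, value),
            )
        return True
    except sqlite3.IntegrityError:
        # The composite primary key rejected a second vote by the same user.
        return False


def score(comment_id):
    """Net score of a comment; COALESCE turns the NULL of an empty SUM into 0."""
    row = conn.execute(
        "SELECT COALESCE(SUM(VoteValue), 0) FROM Votes WHERE CommentID = ?",
        (comment_id,),
    ).fetchone()
    return row[0]
```

The point of the design is that the database, not the application, refuses the duplicate, so no separate "has this user voted?" query is needed before inserting.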
Why don't you save the totaled votes for every comment? Increment/decrement this when a new vote has happened.
Then you have to check if the user has voted specifically for this comment to allow only one vote per comment per user.
You can use a SQL join that returns all the votes the current user has cast on comments for this object; a comment with no matching row is one the user hasn't voted on. That is only slightly different from checking each comment one by one in the program, but it takes a single query.
As far as the database structure is concerned, keeping these things separate seems perfectly logical: vote { user_id, object_id, object_type, vote_info... }
You may already be doing this; sorry, but I couldn't tell from your post if that was the case.
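A rough sketch of that single-query approach, using the stored total_votes from the question plus a LEFT JOIN for the viewer's own vote. Assumptions: Python's sqlite3 stands in for MySQL, the comments table has an id column called comment_id (the question doesn't list one), and comments_with_my_vote is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE comments (
        comment_id  INTEGER PRIMARY KEY,   -- assumed id column
        user_id     INTEGER NOT NULL,
        object_id   INTEGER NOT NULL,
        total_votes INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE votes (
        comment_id INTEGER NOT NULL,
        user_id    INTEGER NOT NULL,
        vote       INTEGER NOT NULL,
        PRIMARY KEY (comment_id, user_id)
    );
    -- Sample data, made up for the example.
    INSERT INTO comments VALUES (1, 7, 100, 2), (2, 8, 100, -1), (3, 9, 200, 0);
    INSERT INTO votes VALUES (1, 42, 1), (1, 43, 1), (2, 42, -1);
    """
)


def comments_with_my_vote(object_id, viewer_id):
    """Return (comment_id, total_votes, viewer's vote or None) in one query."""
    return conn.execute(
        """
        SELECT c.comment_id, c.total_votes, v.vote
        FROM comments AS c
        LEFT JOIN votes AS v
               ON v.comment_id = c.comment_id AND v.user_id = ?
        WHERE c.object_id = ?
        ORDER BY c.comment_id
        """,
        (viewer_id, object_id),
    ).fetchall()
```

A None in the third column means the viewer has not voted on that comment yet, so one query per page replaces one query per comment.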
Related
I have an application (a quiz app, more or less) where I have saved all 1000 of my quiz questions in a MySQL database. I want to retrieve a random question from this table when a user requests one, which I can easily do using the RAND() function in MySQL. My problem is that I don't want to give the same question to a user two or more times. How can I keep a record of retrieved questions? Do I have to create a table for each and every user? Won't that increase the load time? Please help; any help would be a big favor.
-regards
If you want it for a short time, use the user's $_SESSION for that.
If you need it long term (say, to not ask the same questions tomorrow), you'll have to create an additional table, questions2users, where you'll store the user id and the ids of the questions the user has already been asked.
Retrieving a question in both cases requires a simple NOT IN condition:
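// Session variant: $_SESSION["asked"] holds the ids already asked (must be non-empty)
$ids = implode(",", array_map("intval", $_SESSION["asked"]));
$sql = "SELECT * FROM questions WHERE id NOT IN ($ids)";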
SELECT * FROM questions
WHERE id not IN (
SELECT question_id FROM questions2users WHERE userid = 123
)
my problem is , I don't want to give the same question two or more times to a user,
how can i keep a record of retrieved questions? Do I have to create tables for each
and every users? won't that increase the load time?
Yes, but possibly not so much.
You keep a single extra table with userId, questionId and insert there the questions already asked to the various users.
When you ask question 123 to user 456, you run a single INSERT
INSERT INTO askedQuestions (userId, questionId) VALUES (456, 123);
Then you extract questions from questions with a LEFT JOIN
SELECT questions.* FROM questions
LEFT JOIN askedQuestions ON (questions.id = askedQuestions.questionId AND askedQuestions.userId = {$_SESSION['userId']} )
WHERE askedQuestions.userId IS NULL
ORDER BY RAND() LIMIT 1;
If you keep askedQuestions indexed on (userId, questionId), the join will be very efficient.
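Put together, the INSERT and the LEFT JOIN above might look like the sketch below. It assumes Python's sqlite3 standing in for MySQL (so RANDOM() replaces RAND()), a made-up body column on questions, and a hypothetical helper name next_question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE questions (id INTEGER PRIMARY KEY, body TEXT NOT NULL);
    CREATE TABLE askedQuestions (
        userId     INTEGER NOT NULL,
        questionId INTEGER NOT NULL,
        PRIMARY KEY (userId, questionId)   -- doubles as the (userId, questionId) index
    );
    INSERT INTO questions VALUES (1, 'Q1'), (2, 'Q2'), (3, 'Q3');
    """
)


def next_question(user_id):
    """Pick a random unasked question, record it as asked, and return its row."""
    with conn:
        row = conn.execute(
            """
            SELECT q.id, q.body
            FROM questions AS q
            LEFT JOIN askedQuestions AS a
                   ON a.questionId = q.id AND a.userId = ?
            WHERE a.userId IS NULL
            ORDER BY RANDOM()          -- SQLite spelling of MySQL's RAND()
            LIMIT 1
            """,
            (user_id,),
        ).fetchone()
        if row is not None:
            conn.execute(
                "INSERT INTO askedQuestions (userId, questionId) VALUES (?, ?)",
                (user_id, row[0]),
            )
    return row
```

Once a user has been asked every question the function returns None, which is the natural place to hook in a "you have finished the quiz" message.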
Notes on RAND()
Selecting from a table like this should not be done with ORDER BY RAND(), which will retrieve all the rows in the table before outputting one of them. Normally you would choose a questionId at random and select the question with that questionId, and that would be waaaay faster. But here you have no guarantee that the question has not already been asked to that user, and the faster query might fail.
When most questions are still free to ask, you can use
WHERE questions.id IN ( RAND(N), RAND(N), RAND(N), ... )
AND askedQuestions.userId IS NULL LIMIT 1
where N is the number of questions and RAND(N) is shorthand for a random integer between 1 and N. (It is not MySQL's RAND(N), which returns a seeded float between 0 and 1; generate the ids in PHP and interpolate them.) Chances are that at least one of the random numbers you extract will still be free. The IN will decrease performance, and you will have to strike a balance with the number of RANDs. When questions are almost all asked, chances of a match decrease, and your query might return nothing even with many RANDs (also because RANDs will start yielding duplicate IDs, in what is known as the Birthday Paradox).
One way to achieve the best of both worlds could be to fix a maximum number of attempts, say, three (or better still, based on the number of questions left over).
For X times you generate (in PHP) a set of Y random ids between 1 and 1000, and try to retrieve (userId, questionId) from askedQuestions. The table is thin and indexed, so this is really fast. If the lookup fails for one of the ids, then that questionId is random and free, and you can run
SELECT * FROM questions WHERE id = {$tuple['questionId']};
which is also very fast. If you succeed X times, i.e., for X times, all Y random questionIds are registered as being already asked, then you run the full query. Most users will be served almost instantly (two very quick queries), and only a few really dedicated users will require more processing. You might want to set some kind of alerting to warn you of users running out of questions.
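The "cheap probes first, full query as a fallback" idea could be sketched roughly as follows. Assumptions: Python's sqlite3 stands in for PHP plus MySQL, question ids are contiguous from 1 to total, and the function name and the attempts/batch parameters (the X and Y above) are made up for the example.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE questions (id INTEGER PRIMARY KEY, body TEXT NOT NULL);
    CREATE TABLE askedQuestions (
        userId     INTEGER NOT NULL,
        questionId INTEGER NOT NULL,
        PRIMARY KEY (userId, questionId)
    );
    """
)
# 20 sample questions with contiguous ids 1..20 (an assumption of this sketch).
conn.executemany(
    "INSERT INTO questions VALUES (?, ?)", [(i, f"Q{i}") for i in range(1, 21)]
)


def pick_free_question(user_id, total, attempts=3, batch=5, rng=random):
    """Try a few cheap random probes, then fall back to the full anti-join."""
    for _ in range(attempts):
        candidates = {rng.randint(1, total) for _ in range(batch)}
        marks = ",".join("?" * len(candidates))
        asked = {
            r[0]
            for r in conn.execute(
                f"SELECT questionId FROM askedQuestions "
                f"WHERE userId = ? AND questionId IN ({marks})",
                (user_id, *candidates),
            )
        }
        free = candidates - asked
        if free:
            return rng.choice(sorted(free))  # a random id not yet asked
    # Slow path: every probe hit only asked questions, so scan for what is left.
    row = conn.execute(
        """
        SELECT q.id
        FROM questions AS q
        LEFT JOIN askedQuestions AS a
               ON a.questionId = q.id AND a.userId = ?
        WHERE a.userId IS NULL
        ORDER BY RANDOM()
        LIMIT 1
        """,
        (user_id,),
    ).fetchone()
    return row[0] if row else None
```

The function only returns an id; fetching the question row and inserting into askedQuestions are left to the caller, as in the queries above.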
One solution is to add an ID column in the question table and when you serve it to a user you check that ID with the list of questions that you served the user.
You can use in memory data structure like List to keep track of the questions that are served to a particular user. This way, you only need array of Lists instead of tables to get the job done.
I have a website where users can rate comments that are left on pages. Each comment has a unique ID (e.g. 402934). If I want users to be able to thumb-up/thumb-down said comments, I can see how I would make a simple counter to keep track of the number of thumb-ups vs thumb-downs, but how can I make sure that each user only ranks said comment once? I was going to make a database with each comment number as a row, and in that row have an array of all the users that have ranked it thumbs up and all the users that have ranked it thumbs down, but I had a feeling that wasn't the best way. My next thought was to have a table for each user, with an array showing all the comments said user has ranked. It would probably run faster this way (e.g. checking a user's 150 rankings versus a comment's 6050 rankings), but I still feel like there is a better way... any ideas?
Create a new table with user_id, comment_id and vote TINYINT(1).
A value of 1 in vote is a thumbs up, A value of 0 in vote is a thumbs down.
Have a UNIQUE KEY constraint on (comment_id, user_id).
If you follow the above, it will be easy to check whether a user has cast a vote on a specific comment. If you'd also like to quickly see all the votes a given user has cast, add an INDEX on user_id as well.
When a user votes you could use REPLACE INTO to user_comment_thumbs, such as the below:
REPLACE INTO `user_comment_thumbs` (user_id, comment_id, vote)
VALUES (:user_id, :comment_id, :vote);
If the user has already voted, the existing row is replaced (MySQL deletes it and inserts the new one); otherwise a new row is simply inserted.
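Here is a small sketch of the unique key plus REPLACE INTO combination. It assumes Python's sqlite3 in place of MySQL (SQLite also accepts REPLACE INTO), and the helper names set_vote and thumbs, as well as the index name, are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user_comment_thumbs (
        user_id    INTEGER NOT NULL,
        comment_id INTEGER NOT NULL,
        vote       INTEGER NOT NULL,        -- 1 = thumbs up, 0 = thumbs down
        UNIQUE (comment_id, user_id)
    );
    CREATE INDEX idx_thumbs_user ON user_comment_thumbs (user_id);
    """
)


def set_vote(user_id, comment_id, vote):
    """Insert the vote, or replace the user's earlier vote on the same comment."""
    with conn:
        conn.execute(
            "REPLACE INTO user_comment_thumbs (user_id, comment_id, vote) "
            "VALUES (:user_id, :comment_id, :vote)",
            {"user_id": user_id, "comment_id": comment_id, "vote": vote},
        )


def thumbs(comment_id):
    """Return (thumbs_up, thumbs_down) for one comment."""
    ups, downs = conn.execute(
        "SELECT COALESCE(SUM(vote = 1), 0), COALESCE(SUM(vote = 0), 0) "
        "FROM user_comment_thumbs WHERE comment_id = ?",
        (comment_id,),
    ).fetchone()
    return ups, downs
```

Because the unique key covers (comment_id, user_id), changing one's mind overwrites the earlier vote instead of adding a second row.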
I have a list of events that I want to show on a page, along with the users who created them. The events are all in one table, which stores the unique id of the user who created each one. If I want to show each creator's username and avatar, I would have to run 100 queries in order to show 100 events! I'm sure there is an easier way, but I don't know it.
I have a table (user_table) with fields user_id INT(8) and user_photo VARCHAR(255),
and I have another table (user_event_table) with event_id INT(8), event_user_id INT(8), event_details TEXT.
So I want to show a list of all these events, with the user_photo next to each one.
Learn to join with SQL. It's fundamental to relational databases.
SELECT * FROM user_event_table uet
LEFT JOIN user_table ut
ON ut.user_id = uet.event_user_id
Now each record will have a username and photo string.
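As a runnable illustration of the join, here is a sketch that assumes Python's sqlite3 in place of MySQL, a user_name column on user_table (the question mentions usernames but does not list the column), and made-up sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user_table (
        user_id    INTEGER PRIMARY KEY,
        user_name  TEXT,                    -- assumed column
        user_photo TEXT
    );
    CREATE TABLE user_event_table (
        event_id      INTEGER PRIMARY KEY,
        event_user_id INTEGER NOT NULL,
        event_details TEXT
    );
    INSERT INTO user_table VALUES (1, 'ann', 'ann.png'), (2, 'bob', 'bob.png');
    INSERT INTO user_event_table VALUES
        (10, 1, 'Picnic'), (11, 2, 'Concert'), (12, 1, 'Hike'), (13, 99, 'Orphan');
    """
)


def events_with_authors():
    """One query returns every event together with its creator's name and photo."""
    return conn.execute(
        """
        SELECT e.event_id, e.event_details, u.user_name, u.user_photo
        FROM user_event_table AS e
        LEFT JOIN user_table AS u ON u.user_id = e.event_user_id
        ORDER BY e.event_id
        """
    ).fetchall()
```

With LEFT JOIN, an event whose creator row is missing still shows up, with None for the name and photo; an inner JOIN would drop it instead.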
Show us your tables, along with the query you're currently using, in a different question and people will help you with the SQL.
Yes. Get the accurate current time at the start of your PHP script, and get the time at the end, and log the page name and the difference in times.
If you're worried about this, you need to conduct a security audit of your scripts. There's no easy way to tell what someone who has access to your page's contents will leak.
Again, you need a real security audit. Someone will have to read and understand all the code in order to be sure. There's no easy way.
I don't even know what questions I should ask. Well, I want to create a thumbs up for my comments, but not sure how or what's the best way. Do I just create a new field for thumbs up?
If you need to keep track of who's voted on what, you should perhaps make a Votes table:
vote_id: Primary key.
user_id: The id of the user who made this vote. [Foreign key to Users table.]
comment_id: The id of the comment that was voted on. [Foreign key to Comments table.]
vote: The vote that was cast (perhaps +1 or -1 if you only have a trivial thumbs up/down system).
date: When the vote was cast.
A comment's score is now just the sum of all the vote columns which have that comment_id.
Note that unlike simply adding an integer score column to your Comments table, this has the advantage of telling you the level of controversy a comment is experiencing. Without knowing how many votes were cast, two comments with a net score of zero could either be experiencing a lot of controversy (people are equally split about the merit of the comment, so the total score hovers around 0), or none at all (nobody cares enough to cast a vote).
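The controversy point can be seen by selecting the net score and the number of votes side by side. This is only a sketch: it assumes Python's sqlite3 in place of MySQL, adds a unique key on (user_id, comment_id) that the list above does not mention, and uses a made-up helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE votes (
        vote_id    INTEGER PRIMARY KEY,
        user_id    INTEGER NOT NULL,
        comment_id INTEGER NOT NULL,
        vote       INTEGER NOT NULL,        -- +1 or -1
        date       TEXT NOT NULL,
        UNIQUE (user_id, comment_id)        -- assumed: one vote per user per comment
    );
    -- Comment 1 is split down the middle; comment 2 has no votes at all.
    INSERT INTO votes (user_id, comment_id, vote, date) VALUES
        (1, 1, 1, '2009-06-01'), (2, 1, -1, '2009-06-01'),
        (3, 1, 1, '2009-06-02'), (4, 1, -1, '2009-06-02');
    """
)


def score_and_volume(comment_id):
    """Return (net score, number of votes cast) for one comment."""
    return conn.execute(
        "SELECT COALESCE(SUM(vote), 0), COUNT(*) FROM votes WHERE comment_id = ?",
        (comment_id,),
    ).fetchone()
```

Both comments have a score of 0, but only the vote count tells the contested one apart from the ignored one.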
Just storing an int for the number of times a comment has been voted up would be subject to abuse. You probably also want to associate each vote with the user who cast it, that way you can prevent people from repeatedly voting for the same comment.
For this to work, I think you'll need a separate table for votes. Each record in that table should have the comment id and the user id of the person who cast the vote.
It depends on what you want to do with this. Why not just put an int column on your comments table, storing the total number of thumbs up / down for the comment?
Creating a separate table, as Bill and John have suggested, would probably be the best approach. But you might still want to add a votes column to the comments table for performance reasons. This way, you won't need to access the votes table when you only want to display the vote count for a comment. I believe this is how votes work on SO.
Create two fields in your comment table, vote_up and vote_down, and increment the matching counter on each vote. This way you can display the comment score as the difference of these values (vote_up - vote_down) or as a percentage. In the latter case you could add a third field, vote_score, which stores the percentage score, if you ever want to be able to sort by score.
Then create a votes table to prevent users from voting twice on the same comment, either ever or within a given time span. In the latter case, set up a cron job to run once a day and delete entries older than time() - (86400 * DAYS_TO_KEEP_VOTE). The table needs these fields:
comment_id
user_id
vote_time
Good luck.
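A minimal sketch of the counters-plus-votes-table design, with the daily cleanup as a plain function. It assumes Python's sqlite3 instead of PHP and MySQL, a comment_id key on the comments table, and hypothetical function names; the now parameter exists only to make the example deterministic.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE comments (
        comment_id INTEGER PRIMARY KEY,
        vote_up    INTEGER NOT NULL DEFAULT 0,
        vote_down  INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE votes (
        comment_id INTEGER NOT NULL,
        user_id    INTEGER NOT NULL,
        vote_time  INTEGER NOT NULL,        -- Unix timestamp
        PRIMARY KEY (comment_id, user_id)
    );
    INSERT INTO comments (comment_id) VALUES (1);
    """
)


def vote(comment_id, user_id, up, now=None):
    """Record the vote and bump the counter in one transaction; False if already voted."""
    now = int(time.time()) if now is None else now
    column = "vote_up" if up else "vote_down"  # fixed whitelist, never user input
    try:
        with conn:
            conn.execute(
                "INSERT INTO votes (comment_id, user_id, vote_time) VALUES (?, ?, ?)",
                (comment_id, user_id, now),
            )
            conn.execute(
                f"UPDATE comments SET {column} = {column} + 1 WHERE comment_id = ?",
                (comment_id,),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def purge_old_votes(days_to_keep, now=None):
    """What the daily cron job would run: forget votes older than the window."""
    now = int(time.time()) if now is None else now
    with conn:
        cur = conn.execute(
            "DELETE FROM votes WHERE vote_time < ?", (now - 86400 * days_to_keep,)
        )
    return cur.rowcount
```

Note that purging a vote row only lifts the "already voted" block; the counters keep the earlier vote, so a user who votes again after the window is counted twice.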
On a social network I am working on in PHP/MySQL, I have a friends page, it will show all friends a user has, like most networks do. I have a friend table in MySQL, it only has a few fields. auto_ID, from_user_ID, to_friend_ID, date
I would like to make the friends page have a few different options for sorting the results,
By auto_ID which is basically in the order a friend was added. It is just an auto increment id
new friends by date, will use the date field
By friends name, will have a list in alphabetical order.
The alphabetical one is where I need some advice. I will have a list of the alphabet A-Z; when a user clicks on K it will show all the users whose names start with K, and so on. The trick is that it needs to be fast, so doing a JOIN on the users table is not an option. Even though most will argue it is fast, it is not the performance I want for this action. One idea I had is to add an extra field to my friendship table and store the first letter of the user's name in it. Users can change their name at any time, so I would have to make sure this is updated on possibly thousands of records any time a user changes their name.
Is there a better way to do this?
Well if you don't want to do a join, then storing the user's name or initials on the friendships table is really your only other viable option. You mention the problem of having to update thousands of records every time a name changes, but is this really a problem? Unless you're talking about a major social networking site like Facebook, or maybe MySpace, does the average user really have enough friends to make this problematic? And then you have to multiply that by the probability that a user will change their name, which I would imagine isn't something that happens very often for each user.
If those updates are in fact non-trivial, you could always background or delay that to happen during non-peak times. Sure you would sacrifice up-to-the-second accuracy, but really, would most users even notice? Probably not.
Edit: Note, my answer above really only applies if you already have those levels of users. If you are still basically developing your site, just worry about getting it working, and worry about scaling problems when they become real problems.
You could also look at a caching solution like memcached. You can have a background process that is always updating a memcached hash and then when you want this data it is already in memory.
I'd just join on the table that contains the name and then sort on the name. Assuming a pretty normal table layout:
Table Person:
ID,
FirstName,
LastName
Table Friend:
auto_ID,
from_user_ID,
to_friend_ID,
date
You could do things like:
Select person.id, person.firstname, person.lastname, friend.auto_id
from Friend
left join person on person.id = friend.to_friend_ID
where friend.from_user_ID = 1
order by person.lastname, person.firstname
or
Select person.id, person.firstname, person.lastname, friend.auto_id
from Friend
left join person on person.id = friend.to_friend_ID
where friend.from_user_ID = 1
order by friend.date desc
I'd really recommend against adding a column in the friend table to keep the first letter around. There's no need to duplicate data like that (and have to worry about keeping it in sync); that's what joins are for.
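For the A-Z filter from the question, the same join can take a LIKE condition on the name. The sketch below assumes Python's sqlite3 in place of MySQL, filters on LastName (the question doesn't say which name the letter applies to), and uses a made-up helper name and sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Person (ID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE Friend (
        auto_ID      INTEGER PRIMARY KEY,
        from_user_ID INTEGER NOT NULL,
        to_friend_ID INTEGER NOT NULL,
        date         TEXT NOT NULL
    );
    CREATE INDEX idx_friend_from ON Friend (from_user_ID);
    INSERT INTO Person VALUES
        (1, 'Me', 'Self'), (2, 'Karl', 'King'), (3, 'Ann', 'Kane'), (4, 'Bob', 'Lee');
    INSERT INTO Friend VALUES
        (1, 1, 2, '2009-01-01'), (2, 1, 3, '2009-02-01'), (3, 1, 4, '2009-03-01');
    """
)


def friends_by_initial(user_id, letter):
    """Friends of user_id whose last name starts with the given letter, A to Z order."""
    if len(letter) != 1 or not letter.isalpha():
        raise ValueError("letter must be a single alphabetic character")
    return conn.execute(
        """
        SELECT p.ID, p.FirstName, p.LastName
        FROM Friend AS f
        JOIN Person AS p ON p.ID = f.to_friend_ID
        WHERE f.from_user_ID = ? AND p.LastName LIKE ?
        ORDER BY p.LastName, p.FirstName
        """,
        (user_id, letter + "%"),
    ).fetchall()
```

The index on from_user_ID narrows the search to one user's friends first, and each friend is then looked up by Person's primary key, so the name filter only ever runs over that user's own friend list.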