Voting system questions

Voting system questions - php

I'm having some trouble approaching a +1/-1 voting system in PHP, it should vaguely resemble the SO voting system. On average, it will get about ~100 to ~1,000 votes per item, and will be viewed by many.
I don't know whether I should use:
A database table dedicated for voting, which has the userid and their vote... store their vote as a boolean, then calculate the "sum" of the votes in MySQL.
A text field in the "item" table, containing the uids that already voted (in a serialized array), and also a numeric field that contains the total sum of the votes.
A numeric field in the "item" table, that contains the total sum of the votes, then store whether or not the user voted in a text field (with a serialized array of the poll id).
Something completely different (please post some more ideas!)

I'd probably go with option 3 that you've got listed above. By putting the total number of votes as another column in the item table you can get the total number of votes for an item without doing any more sql queries.
If you need to store which user voted on which item I'd probably create another table with the fields of item, user and vote. item would be the itemID, user would be the userID, vote would contain + or - depending on whether it's an up or down vote.
I'm guessing you'll only need to access this table when a user is logged in to show them which items they've voted on.

I recommend storing the individual votes in one table.
In another table store the summary information like question/poll ID, tally
Do one insert in to the individual votes table.
For the summary table you can do this:
$votedUpOrDown = ($voted = 1) ? 1 : -1;
$query = 'UPDATE summary SET tally = tally + '.$votedUpOrDown.' WHERE pollid = '.$pollId;

I'd go with a slight variant of the first option:
A database table dedicated for voting, which has the userid and their vote... store their vote as a boolean, then calculate the "sum" of the votes in MySQL.
Replace the boolean with an integer: +1 for an up-vote and -1 for a down-vote.
Then, instead of computing the sum over and over again, keep a running total somewhere; every time there is an up-vote, add one to the total and subtract one every time there is a down-vote. You could do this with an insert-trigger in the database or you could send an extra UPDATE thing SET vote_total = vote_total + this_vote to the database when adding new votes.
You'd probably want a unique constraint on the thing/userid pair in the vote tracking table too.
Keeping track of individual votes makes it easy to keep people from voting twice. Keeping a running total makes displaying the total quick and easy (and presumably this will be the most common operation).
Adding a simple sanity checker that you can run to ensure that the totals match the votes would be a nice addition as well.
serialized array: Please don't do that, such things make it very difficult to root around the database by hand to check and fix things, serialized data structures also make it very difficult (impossible in some cases) to properly constrain your data with foreign keys, check constraints, unique constraints, and what have you. Storing serialized data structures in the database is usually a bad idea unless the database doesn't need to know anything about the data other than how to give it back to you. Packing an array into a text column is a recipe for broken and inconsistent data in your database: broken code is easy to fix, broken data is often forever.

Related

storing sum() results in database vs calculating during runtime

I'm new to sql & php and unsure about how to proceed in this situation:
I created a mysql database with two tables.
One is just a list of users with their data, each having a unique id.
The second one awards certain amounts of points to users, with relevant columns being the user id and the amount of awarded points. This table is supposed to get new entries regularly and there's no limit to how many times a single user can appear in it.
On my php page I now want to display a list of users sorted by their point total.
My first approach was creating a "points_total" column in the user table, intending to run some kind of query that would calculate and update the correct total for each user every time new entries are added to the other table. To retrieve the data I could then use a very simple query and even use sql's sort features.
However, while it's easy to update the total for a specific user with the sum where function, I don't see a way to do that for the whole user table. After all, plain sql doesn't offer the ability to iterate over each row of a table, or am I missing a different way?
I could probably do the update by going over the table in php, but then again, I'm not sure if that is even a good approach in the first place, because in a way storing the point data twice (the total in one table and then the point breakdown with some additional information in a different table) seems redundant.
A different option would be forgoing the extra column, and instead calculating the sums everytime the php page is accessed, then doing the sorting stuff with php. However, I suppose this would be much slower than having the data ready in the database, which could be a problem if the tables have a lot of entries?
I'm a bit lost here so any advice would be appreciated.

To get the total points awarded, you could use a query similar to this:
SELECT
`user_name`,
`user_id`,
SUM(`points`.`points_award`) as `points`,
COUNT(`points`.`points_award`) as `numberOfAwards`
FROM `users`
JOIN `points`
ON `users`.`user_id` = `points`.`user_id`
GROUP BY `users`.`user_id`
ORDER BY `users`.`user_name` // or whatever users column you want.

MYSQL select 'random' rows, but always give the same random rows if called again

My situations is this... I have a table of opportunities that is sorted. We have a paid service that will allow people to view the opportunities on the website any time. However we want an unpaid view that will show a random %/# of opportunities, which will always be the same. The opportunities are sorted out by dates; e.g. they will expire and be removed from the list, and a new one should be on the free search. However the only problem is that they will always have to show the same opportunity. (For example, I can't just pick random rows because it will cycle through them if they keep refreshing, and likewise can't just take the ones about to expire or furthest form expiry because people still end up seeing the entire list.
My only solution thus far is to add an extra column to the table to mark that it is open display. Then to count them on display, and if we are missing rows then to randomly select a few more. Below is a mock up...
SELECT count(id) as total FROM opportunities WHERE display_status="open" LIMIT 1000;
...
while(total < requiredNumber) {
UPDATE opportunities SET display_status="open" WHERE display_status="private" ORDER BY random() LIMIT (required-total);
}
Can anyone think of a better way to solve this problem, preferably one that does not leave me adding another column to the table, and possible conflicts if many people load the page at a single time. One final note as well, it can't be a random set number of them (e.g. pick one, skip a few, take the next).
Any thought/comments would be very helpful,
Thanks.

One way to make sure that a user only sees the same set of random rows is to feed the random number generator a seed that is linked to that user (such as their user_id). That means every user gets a random ordering of rows but it's always the same random ordering for each user.
Your code would be something:
SELECT ...
FROM ...
WHERE ...
ORDER BY random(<user id>)
LIMIT <however many>
Note: as Twelfth pointed out, as new rows are created, they will get new order values and may end up in your random selection.

I'm the type that doesn't like to lose information...including what random rows someone got to see. However I do not like the modification of your existing table idea...
Create a second table as randon_rows or something to that extent to save the ID's of the user and the ID's of the random records they got to see. Inner join to the table whenever you need to find those same rows again. You can also put expirey dates and the sort in the table as well, so the user isn't perma stuck with the same 10 rows.

Save additional information to MYSQL Database and use a simple query, or use complex query?

I have a drupal site, and am trying to use php to grab some data from my database. What I need to do is to display, in a user's profile, how many times they were the first person to review a venue (exactly like Yelp's "First" tally). I'm looking at two options, and trying to decide which is the better way to approach it.
First Option: The first time a venue is reviewed, save the value of the reviewer's user ID into a table in the database. This table will be dedicated to storing the UID of the first user to review each venue. Then, use a simple query to display a count in the user's profile of the number of times their UID appears in this table.
Second Option: Use a set of several more complex queries to display the count in the user's profile, without storing any extra data in the database. This will rely on several queries which will have to do something along the lines of:
Find the ID for each review the user has created
Check the ID of the venue contained in each review
First review for each venue based on the venue ID stored in the review
Get the User ID of the author for the first review
Check which, if any, of these Author UIDs match the current user's UID
I'm assuming that this would involve creating an array of the IDs in step one, and then somehow executing each step for each item in the array. There would also be 3 or 4 different tables involved in the query.
I'm relatively new to writing SQL queries, so I'm wondering if it would be better to perform the set of potentially longer queries, or to take the small database hit and use a much much smaller count query instead. Is there any way to compare the advantages of either, or is it like comparing apples and oranges?

The volume of extra data stored will be negligible; the simplification to the processing will be significant. The data won't change (the first person to review a venue won't change), so there is a negligible update burden. Go with the extra data and simpler query.

Optimizing an MYSQL COUNT ORDER BY query

I have recently written a survey application that has done it's job and all the data is gathered. Now i have to analyze the data and i'm having some time issues.
I have to find out how many people selected what option and display it all.
I'm using this query, which does do it's job:
SELECT COUNT(*)
FROM survey
WHERE users = ? AND table = ? AND col = ? AND row = ? AND selected = ?
GROUP BY users,table,col,row,selected
As evident by the "?" i'm using MySQLi (in php) to fetch the data when needed, but i fear this is causing it to be so slow.
The table consists of all the elements above (+ an unique ID) and all of them are integers.
To explain some of the fields:
Each survey was divided into 3 or 4 tables (sized from 2x3 to 5x5) with a 1 to 10 happiness grade to select form. (questions are on the right and top of the table, then you answer where the questions intersect)
users - age groups
table, row, col - explained above
selected - dooooh explained above
Now with the surveys complete and around 1 million entries in the table the query is getting very slow. Sometimes it takes like 3 minutes, sometimes (i guess) the time limit expires and you get no data at all. I also don't have access to the full database, just my empty "testing" one since the costumer is kinda paranoid :S (and his server seems to be a bit slow)
Now (after the initial essay) my questions are: I left indexing out intentionally because with a lot of data being written during the survey, it would be a bad idea. But since no new data is coming in at this point, would it make sense to index all the fields of a table? How much sense does it make to index integers that never go above 10? (as you can guess i haven't got a clue about indexes). Do i need the primary unique ID in this table? I
I read somewhere that indexing may help groups but only if you group by the first columns in a table (and since my ID is first and from my point of view useless can i remove it and gain anything by it?)
Is there another way to write my query that would basically do the same thing but in a shorter period of time?
Thanks for all your suggestions in advance!

Add an index on entries that you "GROUP BY" or do "WHERE". So that's ONE index incorporating users,table,col,row and selected in your case.
Some quick rules:
combine fields to have the WHERE first, and the GROUP BY elements last.
If you have other queries that only use part of it (e.g. users,table,col and selected) then leave the missing value (row, in this example) last.
Don't use too many indexes/indeces, as each will slow the table to updates marginally - so on really large system you need to balance queries with indexes.
Edit: do you need the GROUP BY user,col,row as these are used in the WHERE. If the WHERE has already filtered them out, you only need group by "selected".

Allow users to rate a comment once PHP MySQL

I have a website where users can rate comments that are left on pages. Each comment has a unique ID (E.g. 402934) If I want users to be able to thumb-up/thumb-down said comments I can see how I would make a simple counter code to keep track of the number of thumb-ups vs thumb-downs but how can I make sure that each user only ranks said comment once. I was going to make a database with each comment number as a row and in that row having an array of all the users that have ranked it thumbs up and all the users that have ranked it thumbs down but I had a feeling that wasn't the best way. My next thought was to have a table for each user and then having an array showing all the comments said user has ranked. It would probably run faster this way (e.g. checking from a user's 150 rankings verse a comment's 6050 rankings but I still feel like there is a better way... any ideas?

Create a new table with user_id, comment_id and vote TINYINT(1).
A value of 1 in vote is a thumbs up, A value of 0 in vote is a thumbs down.
Have a UNIQUE KEY constraint on (comment_id, user_id).
If you follow the above it will be easy to check whether a user has cast a vote on a specific comment, if you'd like to be able to quickly (as in fast execution) see all the comments a user has made you should also add an INDEX to user_id.
When a user votes you could use REPLACE INTO to user_comment_thumbs, such as the below:
REPLACE INTO `user_comment_thumbs` (user_id,comment_id,vote)
VALUES (#user_id, #comment_id, #vote);
If the user has already made a vote the entry in the table will be updated, otherwise a new row will be inserted.

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.