Storing user activity? PHP, MySQL and Database Design - php

OK, so a user comes to my web application and gets points and the like for activity, somewhat similar to (but not as complex as) this site. They can vote, comment, submit, favorite, vote on comments, write descriptions and so on.
At the moment I store a user action in a table against a date like so
Table user_actions
action_id - PK AI int
user_id - PK int
action_type - varchar(20)
date_of_action - datetime
So for example if a user comes along and leaves a comment or votes on a comment, then the rows would look something like this
action_id = 4
user_id = 25
action_type = 'new_comment'
date_of_action = '2011-11-21 14:12:12';
action_id = 5
user_id = 25
action_type = 'user_comment_vote'
date_of_action = '2011-12-01 14:12:12';
All good, I hear you say, but not quite: these rows reside in the user_actions table, which is a different table from the ones in which the comments and user comment votes are stored.
So how do I know what comment links to what row in the user_actions?
Well, I could just store the unique comment_id from the comments table in a new column called target_primary_key in the user_actions table?
Nope. Can't do that, because the action could equally have been a user_comment_vote, which has a composite key (double key).
So the thought I am left with is: do I just put the primary keys in one column, comma-delimited, and let PHP parse them out?
So taking the example above, the lines below show how I would store the target primary keys
new_comment
target_primary_keys - 12 // the unique comment_id from the comments table
user_comment_vote
target_primary_keys - 22,12 // the composite key of the user comment vote
So basically a user makes an action, the user_actions is updated and so is the specific table, but how do I link the two while still allowing for multiple keys?
Has anyone had experience with storing user activity before?
Any thoughts are welcome, no wrong answers here.

You do not need a user actions table.
To calculate the "score" you can run one query over multiple tables and multiply the count of matching comments, ratings etc. with a multiplier (25 points for a comment, 10 for a rating, ...).
To speed up your page you can store the total score in an extra table or the user table and refresh the total score with triggers if the score changes.
If you want to display the number of ratings or comments you can do the same.
Get the details from the existing tables and store the total number of comments and ratings in an extra table.
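As a rough sketch of the "one query over multiple tables" idea: the example below uses Python with an in-memory SQLite database standing in for PHP and MySQL. The table names comments and ratings, their columns, and the constant names are assumptions made for illustration; only the 25 and 10 point values come from the answer above.

```python
import sqlite3

# Point values taken from the answer's example; the constant names are made up.
POINTS_PER_COMMENT = 25
POINTS_PER_RATING = 10

def user_score(conn, user_id):
    """Compute a user's score by counting rows in each activity table."""
    row = conn.execute(
        """
        SELECT
            (SELECT COUNT(*) FROM comments WHERE user_id = :uid) * :c
          + (SELECT COUNT(*) FROM ratings  WHERE user_id = :uid) * :r
        """,
        {"uid": user_id, "c": POINTS_PER_COMMENT, "r": POINTS_PER_RATING},
    ).fetchone()
    return row[0]

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical activity tables; the real ones will have more columns.
    CREATE TABLE comments (comment_id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL);
    CREATE TABLE ratings  (rating_id  INTEGER PRIMARY KEY, user_id INTEGER NOT NULL);
    INSERT INTO comments (user_id) VALUES (25), (25), (7);
    INSERT INTO ratings  (user_id) VALUES (25), (7), (7);
    """
)

score_25 = user_score(conn, 25)  # 2 comments and 1 rating
score_7 = user_score(conn, 7)    # 1 comment and 2 ratings
```

If this gets slow, the same number can be cached in a column on the user table and refreshed whenever a comment or rating is added, as suggested above.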

The simplest answer is to just use another table, which can contain multiple matches for any key and allow great indexing options:
create table users_to_actions (
user_id int(20) not null,
action_id int(20) not null,
action_type varchar(25) not null,
category_or_other_criteria ...
);
create index uta_u_a on users_to_actions (user_id, action_id);
To expand on this a bit, you would then select items by joining them with this table:
select
*
from
users_to_actions as uta join comments as c using(action_id)
where
uta.action_type = 'comment' and user_id = 25
order by
c.post_date
Or maybe a nested query depending on your needs:
select * from users where user_id in(
select
user_id
from
users_to_actions
where
action_type = 'comment'
);

Related

Capturing a row's ID to use as a table's name

Just to save anyone reading this time and trouble, DO NOT use this method to store surveys. As pointed out in the answer, this is incredibly poor programming (not to mention dangerous to kitties)
Forgive me if this question is somewhat convoluted. I'm working on building a program that allows users to create surveys and post them for users to take.
Long story short, I have a table that looks like this:
**survey_info**
id bigint(20) Auto_increment Primary Key
title varchar(255)
category bigint(20)
active tinyint(1)
length int(11)
redirect text
Now, when a survey is created, a new table is also created that is custom built to hold the input for that survey. The naming scheme I'm using for these new tables is survey_{survey_id}
What I'm hoping to do is in the list of surveys, put the number of responses to a survey to the right of it.
Alright, now my actual question is this: is there a way to retrieve the number of rows in the collection table (survey_{survey_id}) within the same query I'm using to gather the list of available surveys? I realize that I can do this easily by just using a second query for each survey and grabbing its row count, but my fear is that the larger the number of surveys the user has, the more time-consuming this process will become. So is there any way to do something like:
SELECT s.id AS id, s.title AS title, c.title AS ctitle, s.active AS active, s.length AS length, s.redirect AS redirect, n.num FROM survey_info s, survey_category c, (SELECT COUNT(*) AS num FROM survey_s.id) n WHERE s.category = c.id;
I just don't know for sure how to use the s.id as part of the other table's name (or if it can even be done)
Any help, or even a point in the right direction would be appreciated!
You need to use one table for all the surveys.
Add newly created id not as a table name but as a survey id in that table.
You create a relational model that will store all surveys options in one table. This is a sample design:
survey
------
id PK
title
surveyOption
--------------
id PK
survey_id FK
option
surveyResponse
--------------
id PK
surveyOptionId FK
response
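With everything in shared tables, the response count per survey comes out of the same query that lists the surveys. Here is a minimal sketch in Python with an in-memory SQLite database standing in for MySQL. The column option_text (used instead of option), the sample rows, and the variable names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE survey (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE surveyOption (
        id INTEGER PRIMARY KEY,
        survey_id INTEGER NOT NULL REFERENCES survey(id),
        option_text TEXT NOT NULL
    );
    CREATE TABLE surveyResponse (
        id INTEGER PRIMARY KEY,
        surveyOptionId INTEGER NOT NULL REFERENCES surveyOption(id),
        response TEXT
    );
    INSERT INTO survey (id, title) VALUES (1, 'Pets'), (2, 'Food');
    INSERT INTO surveyOption (id, survey_id, option_text)
        VALUES (1, 1, 'Cats'), (2, 1, 'Dogs'), (3, 2, 'Pizza');
    INSERT INTO surveyResponse (surveyOptionId, response)
        VALUES (1, 'yes'), (1, 'yes'), (2, 'no');
    """
)

# LEFT JOINs keep surveys that have no responses yet in the list (count 0).
counts = conn.execute(
    """
    SELECT s.id, s.title, COUNT(r.id) AS num
    FROM survey AS s
    LEFT JOIN surveyOption AS o ON o.survey_id = s.id
    LEFT JOIN surveyResponse AS r ON r.surveyOptionId = o.id
    GROUP BY s.id, s.title
    ORDER BY s.id
    """
).fetchall()
```

Note that this counts response rows, one per answered option. Counting distinct respondents would need a respondent id on surveyResponse, which the sample design does not include.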

MySQL table to hold pages that a user likes

How would I set up a table for topics that a user likes? I have a topics table and a user table (more actually, but simplified for a post on here). There is an ever-increasing number of topics, as they are user generated, so how could I allow users to like pages? Would I put the topic's id in the user table, or the user's id in the topics table, or create a new likes table? The issue I see is that the number of topics could (potentially) be very large. What could I use to create a system that allows a relationship between a user's id and a topic's id?
What you could possibly do is a "many to many" table structure:
A unique auto incremented id - UINT (10) AUTO_INCREMENT
A field containing the user id - UINT (10) (or whatever matches your main user_id field)
A field containing the "liked" topic id - UINT (10) (or whatever matches your main topic_id field)
The user_id and topic_id fields would need to be unique together. That means that there can only be one row for a specific like per user. This makes sure (on the database side) that a user will not be allowed to like a topic more than once.
Getting a user's liked topics would look like this -
SELECT * FROM `user_likes` WHERE `user_id` = USER_ID
Getting the users who like a topic would look like this -
SELECT * FROM `user_likes` WHERE `topic_id` = TOPIC_ID
As others have said in their answers and also @trevor in the comments below -
Don't forget to add an index on the user id to support retrieval of a user's liked topics and a separate index on the topic id to support the users-per-topic query; without these, the queries will get slower as more data is added over time.
One way to do it is to create a new table, UserLikedTopics or something similar, in which you have two columns, one to keep the UserId and one to keep the TopicId. For each new topic a user "Likes", you add a new row to the table with the UserId and the TopicId. That way it is easy to keep track of which users like which topics.
To get which topics a certain user likes, you simply join UserLikedTopics with your topics table, and you have a list of all topics that user likes. You could also do it the other way around and join it on the User table, to get a list of the users that like a certain topic.
You will need a 'likes' table. Something like:
CREATE TABLE `users_likes` (
`user_id` INT(10) UNSIGNED NOT NULL,
`topic_id` INT(10) UNSIGNED NOT NULL,
PRIMARY KEY (`user_id`, `topic_id`),
INDEX `topic_id` (`topic_id`)
)
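To see how the composite primary key stops a double like, here is a small sketch using Python and an in-memory SQLite database in place of MySQL. The helper like() and the index name idx_users_likes_topic are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users_likes (
        user_id  INTEGER NOT NULL,
        topic_id INTEGER NOT NULL,
        PRIMARY KEY (user_id, topic_id)
    );
    -- Secondary index so "who likes this topic" does not scan the whole table.
    CREATE INDEX idx_users_likes_topic ON users_likes (topic_id);
    """
)

def like(conn, user_id, topic_id):
    """Record a like; return False if this user already liked the topic."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO users_likes (user_id, topic_id) VALUES (?, ?)",
                (user_id, topic_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key rejected a duplicate (user_id, topic_id) pair.
        return False

first = like(conn, 1, 10)
duplicate = like(conn, 1, 10)  # same pair again
other = like(conn, 2, 10)

topics_of_user_1 = [r[0] for r in conn.execute(
    "SELECT topic_id FROM users_likes WHERE user_id = ? ORDER BY topic_id", (1,))]
users_of_topic_10 = [r[0] for r in conn.execute(
    "SELECT user_id FROM users_likes WHERE topic_id = ? ORDER BY user_id", (10,))]
```

The primary key already serves lookups by user_id, since that is its leading column, which is why only topic_id gets an extra index.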
Create a separate likes table, since it's a separate entity,
and link it to the users and topics tables with userid and topicsid as foreign keys in the likes table.
It would be good to have a structure like this; later, if you no longer want the likes feature, you can just remove it without affecting other tables.

Need support with figuring out a query for a cron job

I have a voting system for articles. Articles are stored in 'stories' table and all votes are stored in 'votes' table. id in 'stories' table is equal to item_name in 'votes' table (therefore each vote is related to article with item_name).
I want to make it so when sum of votes gets to 10 it updates 'showing' field in 'stories' table to value of "1".
I was thinking about setting up a cron job that runs every hour to check all posts that have showing = 0. If showing = 0, then it will sum up the votes related to that article and set showing = 1 if the sum of votes >= 10. I'm not sure if it is efficient, as it might take up a lot of server resources.
So could anyone suggest a cron job that could do the task?
Here is my database structure:
Stories table
Votes table
Edit:
For example this row from 'stories' table:
id| 12
st_auth | author name
st_date | story date
st_title| story title
st_category| story category
st_body| story body
showing| 0 for unapproved and 1 for approved
This row is related to this one from 'votes' table
id| 83
item_name| 12 (id of article)
vote_value| 1 for upvote -1 for downvote
...
Couple of things:
Why did you name the column item_name in the votes table, when it is actually the id of the article? I would recommend making it match the id column of the stories table, i.e. an int(11) rather than a varchar(255). Also, you should add a foreign key constraint to the votes table, so if an article is ever deleted, you don't orphan a row in the votes table.
Why is the vote_value column an int(11)? If it can only be two states (1, or -1) you can do a tinyint(1) signed (for the -1).
The ip column in the votes table is a bit concerning. If you are regulating 'unique' votes by ip, did you account for proxy ips? Something like this should be handled at the account level, so several users from the same proxy IP can issue individual votes.
I wouldn't do a cronjob for determining whether the showing column should be flagged 0 or 1. Rather, I would issue a count every time a vote was cast against the article. So if someone up-voted or down-voted, calculate the new value of the story, and store it in cache for future reads.
Using this query, you get a list of every article that has at least one vote, plus a column containing the sum of its votes.
SELECT s.*, SUM(v.vote_value) AS votes_total
FROM stories AS s INNER JOIN votes AS v
ON v.item_name = s.id
GROUP BY s.id
This way, you can create a view from which you can filter on votes_total >= 10, without need of the cron job.
Or you can use it as a normal query, something like this:
SELECT * FROM (
SELECT s.*, SUM(v.vote_value) AS votes_total
FROM stories AS s INNER JOIN votes AS v
ON v.item_name = s.id
GROUP BY s.id
) AS totals WHERE votes_total >= 10;
I would use a trigger (an insert trigger) and handle your logic there, in the database itself.
This would remove the polling code (the cron job) altogether.
I would also keep the type of your foreign key (in VOTES) the same as that of the primary key (in STORIES).
Using a trigger instead of polling will be much cleaner in the long run.
You don't specify your database, but in TSQL (for SQL Server) it could be close to this
CREATE TRIGGER myTrigger
ON VOTES
FOR INSERT
AS
DECLARE @I INT --HOLDS COUNT OF VOTES
DECLARE @IN VARCHAR(255) --HOLDS FK ID FOR LOOKUP INTO STORIES IF UPDATE REQUIRED
SELECT @IN = ITEM_NAME FROM INSERTED
SELECT @I = COUNT(*) FROM VOTES WHERE ITEM_NAME = @IN
IF (@I >= 10)
BEGIN
UPDATE STORIES SET SHOWING = 1 WHERE ID = @IN --This is why your PK/FK should be refactored
END
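For anyone who wants to try the trigger idea without SQL Server, here is a rough sketch using Python with an in-memory SQLite database (SQLite's trigger syntax differs from both T-SQL and MySQL, so treat it as an illustration only). It uses SUM(vote_value) to follow the question's "sum of votes" wording, where the T-SQL above counts rows. The trigger name approve_story and the helpers cast_vote() and is_showing() are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE stories (id INTEGER PRIMARY KEY, showing INTEGER NOT NULL DEFAULT 0);
    CREATE TABLE votes (
        id INTEGER PRIMARY KEY,
        item_name INTEGER NOT NULL REFERENCES stories(id),
        vote_value INTEGER NOT NULL CHECK (vote_value IN (1, -1))
    );

    -- After each new vote, approve the story once its vote sum reaches 10.
    CREATE TRIGGER approve_story AFTER INSERT ON votes
    BEGIN
        UPDATE stories
        SET showing = 1
        WHERE id = NEW.item_name
          AND showing = 0
          AND (SELECT SUM(vote_value) FROM votes
               WHERE item_name = NEW.item_name) >= 10;
    END;

    INSERT INTO stories (id) VALUES (12);
    """
)

def cast_vote(conn, story_id, value):
    """Insert one vote; the trigger takes care of the showing flag."""
    with conn:
        conn.execute(
            "INSERT INTO votes (item_name, vote_value) VALUES (?, ?)",
            (story_id, value),
        )

def is_showing(conn, story_id):
    return conn.execute(
        "SELECT showing FROM stories WHERE id = ?", (story_id,)
    ).fetchone()[0]

for _ in range(9):
    cast_vote(conn, 12, 1)
showing_after_9_upvotes = is_showing(conn, 12)

cast_vote(conn, 12, -1)  # sum drops to 8
cast_vote(conn, 12, 1)   # sum back to 9
showing_at_sum_9 = is_showing(conn, 12)

cast_vote(conn, 12, 1)   # sum reaches 10
showing_at_sum_10 = is_showing(conn, 12)
```

As written, a story stays approved once it has been shown, even if later downvotes pull the sum back under 10. Whether that is the desired behaviour is not stated in the question.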

Mysql non-sequential insert problem

I have a table with 100,000 records described as:
ID primary unique int (5)
Ticket unique int (5)
user varchar (20)
The only fields populated in that table are the first two, ID and Ticket. I now need to assign a user to a ticket when requested. How can I do this? How can I find where the next null user is in the table?
Edit: Explaining Scenario as requested
It's a lottery system of sorts. The ticket numbers have already been generated and populated into the table. Now when a user signs up for a ticket, their username has to be inserted next to the next available ticket, in the user field. I'm sure there's a much simpler way to do this by inserting the ticket with all the information into a new table, but this is the exact requirement, as dumb as it sounds.
So how can I find out where the next null user is on the table?
What is the sorting scheme of the table?
If the ID numbers are sequential, this should work:
SELECT ID FROM TABLE WHERE user is null ORDER by ID LIMIT 1
If the ID numbers are NOT sequential and you are OK with whatever order the table returns rows in (usually, but not guaranteed to be, the order in which they were entered):
SELECT ID FROM TABLE WHERE user is null LIMIT 1
Find the next NULL row by doing:
SELECT ID
FROM Ticket
WHERE user IS NULL
LIMIT 1;
When you update though you'll have to be careful you don't have a race condition with another process also getting the same ID. You could prevent this duplicate allocation problem by having a separate table holding the TicketAllocation, and giving it a unique foreign key constraint pointing back to the Ticket table.
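One way to guard against that race without a second table is to repeat the "user IS NULL" condition in the UPDATE and check how many rows it changed. The sketch below uses Python with an in-memory SQLite database in place of MySQL. The helper claim_next_ticket() and the sample ticket numbers are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Ticket (
        ID INTEGER PRIMARY KEY,
        Ticket INTEGER NOT NULL UNIQUE,
        user TEXT
    );
    INSERT INTO Ticket (ID, Ticket) VALUES (1, 501), (2, 502);
    """
)

def claim_next_ticket(conn, username):
    """Assign the lowest free ticket to username; return its ID, or None if none are left."""
    while True:
        row = conn.execute(
            "SELECT ID FROM Ticket WHERE user IS NULL ORDER BY ID LIMIT 1"
        ).fetchone()
        if row is None:
            return None  # every ticket already has a user
        with conn:
            # The extra "user IS NULL" guard turns the update into a no-op
            # if another process took this row after our SELECT.
            cur = conn.execute(
                "UPDATE Ticket SET user = ? WHERE ID = ? AND user IS NULL",
                (username, row[0]),
            )
        if cur.rowcount == 1:
            return row[0]
        # Someone else got there first; loop and try the next free row.

first = claim_next_ticket(conn, "alice")
second = claim_next_ticket(conn, "bob")
third = claim_next_ticket(conn, "carol")  # no tickets left
```

In PHP with MySQL the same check would be made on the affected-row count returned after the UPDATE.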
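You can also do it in a single query. MySQL does not allow an UPDATE to select from the same table in a subquery (error 1093), so use ORDER BY with LIMIT instead:
UPDATE users SET user = [username] WHERE user IS NULL ORDER BY id LIMIT 1;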
This assumes ID is auto-incremented.
Start by finding the first record where the user field is null:
Select * from users where user is null order by id asc limit 1;
Then fill it in:
Update users set user = [username] where id = [id from select];

I need some advice on storing data in mysql, where one needs to store more than one, let say userids for a single post?

In cases when someone needs to store more than one value in a cell, what approach is more desirable and advisable: storing it with delimiters or glue and exploding it into an array later for processing in the server-side language of choice, for example
$returnedFromDB = "159|160|161|162|163|164|165";
$myIdArray = explode("|",$returnedFromDB);
or as a JSON or PHP serialized array, like this.
:6:{i:0;i:1;i:1;i:2;i:2;i:3;i:3;i:4;i:4;i:5;i:5;i:6;}
then later unserialize it into an array and work with it,
OR
have a new row for every new entry like this
postid 12 | showto 2
postid 12 | showto 3
postid 12 | showto 5
postid 12 | showto 6
postid 12 | showto 8
instead of postid 12 | showto "2|3|4|6|8|5|".
OR postid 12 | showto ":6:{i:0;i:2;i:1;i:3;i:2;i:3;i:3;i:4;i:4;i:5;i:5;i:6;}".
Thanks, looking forward to your opinions :D
In cases when someone needs to store more than one value in a cell, what approach is more desirable and advisable: storing it with delimiters or glue and exploding it into an array later for processing in the server-side language of choice, for example.
Neither. Oh goodness, neither! Edgar F. Codd is rolling in his grave right now.
Storing delimited data in a text field is no better than storing it in a flat file. The data becomes unqueryable. Storing PHP serialized data in a text field is even worse because then only PHP can parse the data.
You want a nice, happy, normalized database.
The thing you're trying to describe is a many-to-many relationship. Each user can maintain one or more posts. Likewise, each post can be maintained by one or more users. Right? Then something like this will work.
CREATE TABLE users (
user_id INTEGER PRIMARY KEY,
...
);
CREATE TABLE posts (
post_id INTEGER PRIMARY KEY,
...
);
CREATE TABLE user_posts (
user_id INTEGER REFERENCES users(user_id),
post_id INTEGER REFERENCES posts(post_id),
UNIQUE KEY(user_id, post_id)
);
-- All posts made by user 22.
SELECT posts.*
FROM posts, user_posts
WHERE user_posts.user_id = 22
AND posts.post_id = user_posts.post_id
-- All users that worked on post 47
SELECT users.*
FROM users, user_posts
WHERE user_posts.post_id = 47
AND users.user_id = user_posts.user_id
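The same mapping-table queries can be tried out quickly with Python and an in-memory SQLite database standing in for MySQL. This is only a sketch: the name and title columns and the sample rows are assumptions filling in for the elided columns, and the joins are written with explicit JOIN ... ON instead of the comma syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE user_posts (
        user_id INTEGER NOT NULL REFERENCES users(user_id),
        post_id INTEGER NOT NULL REFERENCES posts(post_id),
        UNIQUE (user_id, post_id)
    );
    INSERT INTO users VALUES (22, 'ann'), (23, 'bob');
    INSERT INTO posts VALUES (46, 'first'), (47, 'second');
    -- One row per (user, post) pair instead of a delimited string.
    INSERT INTO user_posts VALUES (22, 46), (22, 47), (23, 47);
    """
)

# All posts made by user 22.
posts_of_22 = [r[0] for r in conn.execute(
    """
    SELECT p.post_id
    FROM posts AS p
    JOIN user_posts AS up ON up.post_id = p.post_id
    WHERE up.user_id = ?
    ORDER BY p.post_id
    """,
    (22,),
)]

# All users that worked on post 47.
users_of_47 = [r[0] for r in conn.execute(
    """
    SELECT u.name
    FROM users AS u
    JOIN user_posts AS up ON up.user_id = u.user_id
    WHERE up.post_id = ?
    ORDER BY u.user_id
    """,
    (47,),
)]
```

Neither lookup would be possible in plain SQL if the ids were stored as "22|23" in a single text column; the application would have to fetch and split every row.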
Most of the time the recommendation is that many-to-many relationships (such as posts to users) should have a mapping table with 1 row for each post-user combination (in other words, your "new row for every new entry" version).
It's more optimal for things like join queries, and lets you retrieve only the data you need.
You should only serialize data in the DB if the data never needs to be processed by the DB. For example, you could serialize the user IDs in the user_id field if you never need to run a query against the user_id field, e.g. never selecting anything based on user.
If these are posts (blog/news/etc. posts?) then I'm pretty confident you'll need to be able to query them by user. Normalizing the user into another table would serve you:
CREATE TABLE posts (post_id, ....);
CREATE TABLE post_users (post_id, user_id, ...);
You can then get the users in a different query, or use group_concat: SELECT post_id, GROUP_CONCAT(user_id) FROM posts JOIN post_users USING (post_id) GROUP BY post_id. When you need to show user name, just join to the users table to get their name in the group concat.
From an RDBMS point of view I would 'have a new row for every new entry'.
That's called an m:n relationship table.
You can then query the data however you like.
If you need postid 12 | showto ":6:{i:0;i:2;i:1;i:3;i:2;i:3;i:3;i:4;i:4;i:5;i:5;i:6;}". you can do
SELECT postid, CONCAT(':',count(showto),':{i:',GROUP_CONCAT(showto SEPARATOR ';i:'),';}') AS showto
FROM tablename
GROUP BY postid
However, if you only need the data in one form and will not run any other kind of query on it, then you may as well store the string.
