On a social network I am working on in PHP/MySQL, I have a friends page that shows all the friends a user has, like most networks do. I have a friend table in MySQL with only a few fields: auto_ID, from_user_ID, to_friend_ID, date.
I would like the friends page to offer a few different options for sorting the results:
By auto_ID, which is basically the order in which friends were added. It is just an auto-increment ID.
Newest friends by date, using the date field.
By friend's name, as a list in alphabetical order.
The alphabetical option is where I need some advice. I will have a list of the alphabet A-Z; when a user clicks on K, it will show all the friends whose names start with K, and so on. The trick is that it needs to be fast, so doing a JOIN on the users table is not an option. Even though most will argue it is fast, it is not the performance I want for this action. One idea I had is to add an extra field to my friendship table and store the first letter of the user's name in it. Users can change their name at any time, so I would have to make sure this is updated on possibly thousands of records whenever a user changes their name.
Is there a better way to do this?
Well if you don't want to do a join, then storing the user's name or initials on the friendships table is really your only other viable option. You mention the problem of having to update thousands of records every time a name changes, but is this really a problem? Unless you're talking about a major social networking site like Facebook, or maybe MySpace, does the average user really have enough friends to make this problematic? And then you have to multiply that by the probability that a user will change their name, which I would imagine isn't something that happens very often for each user.
If those updates are in fact non-trivial, you could always background or delay that to happen during non-peak times. Sure you would sacrifice up-to-the-second accuracy, but really, would most users even notice? Probably not.
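As a rough sketch of the denormalised approach described above: the example below uses Python's built-in sqlite3 module in place of PHP/MySQL so that it runs on its own. The `users` table, the `friend_initial` column and the helper function names are assumptions made for illustration, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE friend (
        auto_ID INTEGER PRIMARY KEY AUTOINCREMENT,
        from_user_ID INTEGER NOT NULL,
        to_friend_ID INTEGER NOT NULL,
        date TEXT NOT NULL,
        friend_initial TEXT NOT NULL  -- hypothetical denormalised column
    );
    CREATE INDEX idx_friend_initial ON friend (from_user_ID, friend_initial);
""")

def add_friend(from_id, to_id, date):
    # Copy the friend's current initial onto the friendship row at insert time.
    (name,) = conn.execute(
        "SELECT name FROM users WHERE id = ?", (to_id,)
    ).fetchone()
    conn.execute(
        "INSERT INTO friend (from_user_ID, to_friend_ID, date, friend_initial) "
        "VALUES (?, ?, ?, ?)",
        (from_id, to_id, date, name[0].upper()),
    )

def rename_user(user_id, new_name):
    # A single UPDATE refreshes every friendship row that points at this user.
    conn.execute("UPDATE users SET name = ? WHERE id = ?", (new_name, user_id))
    conn.execute(
        "UPDATE friend SET friend_initial = ? WHERE to_friend_ID = ?",
        (new_name[0].upper(), user_id),
    )

def friend_ids_by_letter(user_id, letter):
    # No JOIN needed: the initial lives on the friendship row itself.
    rows = conn.execute(
        "SELECT to_friend_ID FROM friend "
        "WHERE from_user_ID = ? AND friend_initial = ? ORDER BY to_friend_ID",
        (user_id, letter.upper()),
    )
    return [r[0] for r in rows]

conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(1, "Alice"), (2, "Kevin"), (3, "Karen"), (4, "Bob")],
)
for friend_id in (2, 3, 4):
    add_friend(1, friend_id, "2009-01-01")
```

The rename is one statement no matter how many friendship rows it touches, so it is also easy to defer to a background job if it ever becomes expensive.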
Edit: Note, my answer above really only applies if you already have those levels of users. If you are still basically developing your site, just worry about getting it working, and worry about scaling problems when they become real problems.
You could also look at a caching solution like memcached. You can have a background process that is always updating a memcached hash and then when you want this data it is already in memory.
I'd just join on the table that contains the name and then sort on the name. Assuming a pretty normal table layout:
Table Person:
ID,
FirstName,
LastName
Table Friend:
auto_ID,
from_user_ID,
to_friend_ID,
date
You could do things like:
Select person.id, person.firstname, person.lastname, friend.auto_id
from Friend
left join person on person.id = friend.to_friend_ID
where friend.from_user_ID = 1
order by person.lastname, person.firstname
or
Select person.id, person.firstname, person.lastname, friend.auto_id
from Friend
left join person on person.id = friend.to_friend_ID
where friend.from_user_ID = 1
order by friend.date desc
I'd really recommend against adding a column in the friend table to keep the first letter around. There is no need to duplicate data like that (and have to worry about keeping it in sync); that's what joins are for.
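For the A-Z filter from the question, the same join can take a prefix condition on the name. This is only a sketch: it uses Python's built-in sqlite3 in place of PHP/MySQL, filters on LastName (an assumption, since the question does not say which name field the letter applies to), and the index names and sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (ID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE Friend (
        auto_ID INTEGER PRIMARY KEY,
        from_user_ID INTEGER,
        to_friend_ID INTEGER,
        date TEXT
    );
    -- Hypothetical indexes: one for finding a user's friends, one for the name filter.
    CREATE INDEX idx_friend_from ON Friend (from_user_ID);
    CREATE INDEX idx_person_last ON Person (LastName);
""")
conn.executemany(
    "INSERT INTO Person (ID, FirstName, LastName) VALUES (?, ?, ?)",
    [(1, "Ann", "Owner"), (2, "Kim", "King"), (3, "Al", "Kane"), (4, "Bo", "Smith")],
)
conn.executemany(
    "INSERT INTO Friend (from_user_ID, to_friend_ID, date) VALUES (?, ?, ?)",
    [(1, 2, "2009-01-01"), (1, 3, "2009-01-02"), (1, 4, "2009-01-03")],
)

def friends_by_letter(user_id, letter):
    # Join the friendship rows to Person and keep only names with the given prefix.
    sql = """
        SELECT p.ID, p.FirstName, p.LastName
        FROM Friend f
        JOIN Person p ON p.ID = f.to_friend_ID
        WHERE f.from_user_ID = ? AND p.LastName LIKE ?
        ORDER BY p.LastName, p.FirstName
    """
    return conn.execute(sql, (user_id, letter + "%")).fetchall()
```

Calling `friends_by_letter(1, "K")` returns only the friends of user 1 whose last name starts with K, already sorted.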
For a forum, I want to enable the users to send messages to each other.
In order to do this, I made a table called Contacts. Within this table I have 5 columns: the user_id, a column for storing Friends, one for storing Family, one for storing Business and one for other contacts. These last four should each contain an array that holds the user_ids of that type of contact. I chose this design because I don't want to type an awful lot or limit the number of friends a user can have, as columns like friend1, friend2 etc. would.
My question is: is this the correct way to do it? If not, what should be improved? And what type of MySQL field should Friends, Family, Business and Other be?
What you should do instead of that is have a map table between your Contacts table and any related tables (User, Friends, Family, Business). The purpose would purely be to create a link between your Contact and your User(s) etc, without having to do what you're talking about and use arrays compacted into a varchar etc field.
Structured data approach gives you a much more flexible application.
E.g. UserContacts table purely contains its own primary key (id), a foreign key for Users and a foreign key for Contacts. You do this for each type, allowing you to easily insert, or modify maps between any number of users and contacts whenever you like without potentially damaging other data - and without complicated logic to break up something like this: 1,2,3,4,5 or 1|2|3|4|5:
id, user_id, contact_id
So then when you come to use this structure, you'll do something like this:
SELECT
Contacts.*
-- , Users.* -- if you want the user information
FROM UserContacts
LEFT JOIN Contacts ON (UserContacts.contact_id = Contacts.id)
LEFT JOIN Users ON (Users.id = UserContacts.user_id)
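A minimal sketch of such a map table, run with Python's built-in sqlite3 rather than PHP/MySQL. It assumes that contacts are themselves users (so contact_id points back at Users) and adds a hypothetical `contact_type` column to cover the Friends/Family/Business/Other split from the question in one table instead of one map table per type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE UserContacts (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES Users(id),
        contact_id INTEGER NOT NULL REFERENCES Users(id),
        contact_type TEXT NOT NULL,  -- hypothetical: 'friend', 'family', 'business', 'other'
        UNIQUE (user_id, contact_id)  -- one row per pair, so no duplicates to clean up
    );
""")
conn.executemany(
    "INSERT INTO Users (id, name) VALUES (?, ?)",
    [(1, "John"), (2, "Ali"), (3, "Mia")],
)
conn.executemany(
    "INSERT INTO UserContacts (user_id, contact_id, contact_type) VALUES (?, ?, ?)",
    [(1, 2, "friend"), (1, 3, "family")],
)

def contacts_of(user_id, contact_type):
    # One row per link means any number of contacts, with no string splitting.
    sql = """
        SELECT u.name
        FROM UserContacts uc
        JOIN Users u ON u.id = uc.contact_id
        WHERE uc.user_id = ? AND uc.contact_type = ?
        ORDER BY u.name
    """
    return [row[0] for row in conn.execute(sql, (user_id, contact_type))]
```

Adding or removing a contact is then a single-row INSERT or DELETE that cannot disturb the user's other contacts.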
Use the serialize() and unserialize() functions.
See this question on how to store an array in MySQL:
Save PHP array to MySQL?
However, it's not recommended that you do this. I would make a separate table that stores all the 'connections' between two users. For example, if say John adds Ali, there would be a record dedicated to Ali and John. To find the friends of a user, simply query the records that have Ali or John in them. But that's my personal way of doing things.
I recommend that you query the user's friends with PHP/MySQL each time you need them. This could save a considerable amount of space and should not cost much in speed.
serialize the array before storing and unserialize after retrieving.
$friends_for_db = serialize($friends_array);
// store $friends_for_db into db
And for retrieving:
// read $friends_for_db from db
$friends_array = unserialize($friends_for_db);
However, it would be wiser to follow the other answers and set up a proper many-to-many design.
Nevertheless, I have needed this kind of design for a minor situation in which a complete solution was not necessary (e.g. easily storing and retrieving a multi-select list value which I'll never query or use other than to display it to the user).
I am currently using MySQL and MyISAM.
I have a function which returns an array of user IDs, either of friends or of users in general in my application, and when displaying them a foreach seemed best.
Now my issue is that I only have the IDs, so I would need to nest a database call in the loop to get each user's other info (i.e. name, avatar, other fields) based on the user ID.
I do not expect hundreds of thousands of users (this is mainly a hobby project for learning), but how should I approach this? I like the flexibility of placing display code in a foreach, but relying on ID arrays seems to rule out using a single query.
Are there any general structures or tips for displaying the list appropriately?
Is my number of queries (one per user in the list) inappropriate? (Although with pages 0..n of users, 10 at a time, it does not seem as bad, I just realized.)
You could use MySQL's IN() operator, i.e.
SELECT username,email,etc FROM user_table WHERE userid IN (1,15,36,105)
That will return all rows where the userid matches one of those IDs. It gets less efficient the more IDs you add, but the 10 or so you mention should be just fine.
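One way to build that IN() list safely from an ID array is to generate one placeholder per ID and bind the values. The sketch below uses Python's built-in sqlite3 instead of PHP/MySQL; the `fetch_users` helper and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_table (userid INTEGER PRIMARY KEY, username TEXT, email TEXT)"
)
conn.executemany(
    "INSERT INTO user_table (userid, username, email) VALUES (?, ?, ?)",
    [
        (1, "alice", "alice@example.com"),
        (15, "bob", "bob@example.com"),
        (36, "carol", "carol@example.com"),
        (200, "dave", "dave@example.com"),
    ],
)

def fetch_users(ids):
    # An empty IN () is a syntax error, so handle that case up front.
    if not ids:
        return {}
    # One "?" per ID; the values themselves are bound, never concatenated.
    placeholders = ",".join("?" * len(ids))
    sql = (
        "SELECT userid, username, email FROM user_table "
        f"WHERE userid IN ({placeholders})"
    )
    # Key the rows by userid so a display loop can look each user up directly.
    return {row[0]: row for row in conn.execute(sql, list(ids))}
```

The display loop can then iterate over the original ID array and read each user's row from the returned dictionary, so the whole page costs one query.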
Why couldn't you just use a left join to get all the data in 1 shot? It sounds like you are getting a list, but then you only need to get all of a single user's info. Is that right?
Remember databases are about result SETS and while generally you can return just a single row if you need it, you almost never have to get a single row then go back for more info.
For instance a list of friends might be held in a text column on a user's entry.
Whether you expect to have a small or a large database, I would consider using the InnoDB engine rather than MyISAM. It does have a little more processing overhead than MyISAM, but you get added benefits as your hobby grows, such as transactions, foreign keys and row-level locking. With either engine, a JOIN will allow you to pull in specific data from multiple tables:
SELECT u.`id`, p.`name`, p.`avatar`
FROM `Users` AS u
LEFT JOIN `Profiles` AS p USING (`id`)
Would return id from Users and name and avatar from Profiles (where id of both tables match)
There are numerous resources online talking about database normalization, you might enjoy: http://www.devshed.com/c/a/MySQL/An-Introduction-to-Database-Normalization/
This situation is pretty difficult to explain, but I'll do my best.
For school, we have to create a web application (written in PHP) which allows teachers to manage their students' projects and allows the students to do peer evaluations. As there are many students, every project has multiple project groups (and of course you should only peer-evaluate your own group members).
My database structure looks like this at the moment:
Table users: contains all user info (user_id is primary)
Table: projects: Contains a project_id, a name, a description and a start date.
So far this is pretty easy. But now it gets more difficult.
Table groups: Contains a group_id, a groupname and as a group is specific for a project, it also holds a project_id.
Table groupmembers: A group contains multiple users, but users can be in multiple groups (as they can be active in multiple projects). So this table contains a user_id and a group_id to link these.
At last, admins can decide when users need to do their peer-evaluation and how much time they have for it. So there is a last table evaluations containing an evaluation_id, a start and end date and a project_id (the actual evaluations are stored in a sixth table, which is not relevant for now).
I think this is a good design, but it gets harder when I actually have to use this data. I would like to show a list of evaluations you still have to fill in. The only thing you know is your user_id as this is stored in the session.
So this would have to be done:
1) Run a query on groupmembers to see in which groups the user is.
2) With this result, run a query on groups to see to which projects these groups are related.
3) Now that we know what projects the user is in, the evaluations table should be queried to see if there are ongoing evaluations for these projects.
4) We now know which evaluations are available, but now we also need to check the sixth table to see if the user has already completed this evaluation.
All these steps depend on each other's results, so each needs its own error handling. Once the user has chosen the evaluation they wish to fill in (an evaluationID will be sent via GET), a lot of new queries will have to be run to check which users this member has in his group and will have to evaluate, plus another check to see which other group members have already been evaluated.
As you see, this is quite complex. With all the error handling included, my script will be a real mess. Someone told me a "view" might help in this situation, but I don't really understand why this would help me here.
Is there a good way to do this?
Thank you very much!
You are thinking too procedurally.
All your conditions should fit easily into a single WHERE clause of one SQL statement.
You will end up with a single list of the items to be evaluated: only one list, only one set of error handling.
Not sure if this is exactly right, but try this basic approach. I didn't run this against an actual database so the syntax may need to be tweaked.
select p.project_name
from projects p inner join evaluations e on p.project_id = e.project_id
where p.project_id in (
    select g.project_id
    from groups g inner join groupmembers gm on gm.group_id = g.group_id
    where gm.user_id = $_SESSION['user_id'])
Also, you'll need to make sure that you properly escape your user_id when making it a part of the query, but that is a whole other topic.
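To illustrate how all four steps from the question can collapse into one statement, here is a sketch using Python's built-in sqlite3 in place of PHP/MySQL. Several things are assumptions: the column names `start_date` and `end_date`, the name and shape of the sixth table (called `evaluation_results` here), and the rule that any row by the user in that table means the evaluation is complete. The user ID is bound as a parameter rather than pasted into the SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE projects (project_id INTEGER PRIMARY KEY, name TEXT);
    -- Backticks because GROUPS is a reserved word in newer MySQL versions.
    CREATE TABLE `groups` (group_id INTEGER PRIMARY KEY, groupname TEXT, project_id INTEGER);
    CREATE TABLE groupmembers (user_id INTEGER, group_id INTEGER);
    CREATE TABLE evaluations (
        evaluation_id INTEGER PRIMARY KEY,
        start_date TEXT, end_date TEXT, project_id INTEGER
    );
    -- Hypothetical stand-in for the "sixth table" of submitted evaluations.
    CREATE TABLE evaluation_results (evaluation_id INTEGER, evaluator_id INTEGER);

    INSERT INTO projects VALUES (100, 'Alpha'), (200, 'Beta');
    INSERT INTO `groups` VALUES (10, 'A1', 100), (20, 'B1', 200);
    INSERT INTO groupmembers VALUES (1, 10), (1, 20), (2, 10);
    INSERT INTO evaluations VALUES
        (1000, '2024-01-01', '2024-01-31', 100),
        (2000, '2024-01-01', '2024-01-31', 200),
        (3000, '2023-01-01', '2023-01-31', 100);
    INSERT INTO evaluation_results VALUES (2000, 1);
""")

def pending_evaluations(user_id, today):
    # Steps 1-3 become joins; step 4 becomes NOT EXISTS.
    sql = """
        SELECT DISTINCT e.evaluation_id, p.name
        FROM groupmembers gm
        JOIN `groups` g ON g.group_id = gm.group_id
        JOIN projects p ON p.project_id = g.project_id
        JOIN evaluations e ON e.project_id = p.project_id
        WHERE gm.user_id = :user_id
          AND :today BETWEEN e.start_date AND e.end_date
          AND NOT EXISTS (
              SELECT 1 FROM evaluation_results r
              WHERE r.evaluation_id = e.evaluation_id
                AND r.evaluator_id = :user_id
          )
        ORDER BY e.evaluation_id
    """
    return conn.execute(sql, {"user_id": user_id, "today": today}).fetchall()
```

With the sample data, user 1 is in both projects, has already completed evaluation 2000, and evaluation 3000 has expired, so only evaluation 1000 is returned. There is one query and therefore one place to handle errors.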
This may be a hairy question but. Say I have
Followers:
-user_id
-follower_id
Activities:
-id
-user_id
-activity_type
-node_id
Pulling a user's activity is fairly easy. But what is the best way to get the activity of a user's followers? A subselect? It seems to get incredibly slow as users get more and more followers. Any ideas to speed this up?
Also, on a more conceptual level: how does the grouping work? Is it all done with a single query? Or is all the activity data pulled in and then sorted and grouped on the PHP side?
Users X, Y and Z did Activity A
User J did 3 of Activity B
Subselects are often slower than JOINs, but it really depends on what exactly you're doing with them. To answer your main question, I would get follower data with a JOIN:
SELECT * FROM followers f
LEFT JOIN activities a ON f.follower_id=a.user_id
WHERE f.user_id=$followedPerson
That's assuming that the followers table represents a user with user_id, and someone who is following them with a follower_id that happens to be a user_id in the users table as well.
It won't ever be incredibly slow as long as you have an index on followers.user_id. However, the amount of data such a query could return could become larger than you really want to deal with. You need to determine what kinds of activity your application is going to want to show, and try to filter it accordingly so that you aren't making huge queries all the time but only using a tiny fraction of the returned results.
Pulling data out and grouping it on the PHP side is fine, but if you can avoid selecting it in the first place, you're better off. In this case, I would probably add an ORDER BY f.follower_id, activity_date DESC, assuming a date column exists, and try to come up with some more filtering criteria for the activity table. Then I'd iterate through the rows in PHP, outputting data grouped by follower.
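A sketch of that flow, with Python's built-in sqlite3 and itertools.groupby standing in for MySQL and the PHP loop. The `activity_date` column and the LIMIT are assumptions, and an inner JOIN is used instead of the LEFT JOIN so that followers with no activity produce no rows.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Followers (user_id INTEGER, follower_id INTEGER);
    CREATE TABLE Activities (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        activity_type TEXT,
        node_id INTEGER,
        activity_date TEXT  -- hypothetical date column
    );
    CREATE INDEX idx_followers_user ON Followers (user_id);

    INSERT INTO Followers VALUES (1, 2), (1, 3), (9, 4);
    INSERT INTO Activities VALUES
        (1, 2, 'post', 10, '2024-01-02'),
        (2, 2, 'like', 11, '2024-01-05'),
        (3, 3, 'post', 12, '2024-01-03'),
        (4, 4, 'post', 13, '2024-01-04');
""")

def follower_activity(followed_id, limit=50):
    sql = """
        SELECT f.follower_id, a.activity_type, a.activity_date
        FROM Followers f
        JOIN Activities a ON a.user_id = f.follower_id
        WHERE f.user_id = ?
        ORDER BY f.follower_id, a.activity_date DESC
        LIMIT ?
    """
    rows = conn.execute(sql, (followed_id, limit)).fetchall()
    # Rows arrive sorted by follower, so groupby can split them in one pass.
    return {
        follower: [(kind, date) for _, kind, date in group]
        for follower, group in groupby(rows, key=lambda row: row[0])
    }
```

Because the database already sorted the rows by follower, the application side only walks the result once and never needs to sort or re-query.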
An activity log has the potential for a very large number of records since it usually has a mix of the current user's activity and all their friends. If you are joining various tables and a user has 100s of friends that's potentially a lot of data being pulled out.
One approach is to denormalise the data and treat it as one big log, where every entry that should appear on a user's activity log page is stored in the activity log table against that user. For example, if User A has two friends, User B and User C, then when User A does something three activity log records are created:
record 1: "I did this" log for user A
record 2: "My friend did this" log for user B
record 3: "My friend did this" log for user C
You'll get duplicates, but it doesn't really matter. It's fast to select since it's from one table and indexed on just the user ID. And it's likely you'll housekeep an activity log table (i.e. delete entries over 1 month old).
The activity log table could be something like:
-id
-user_id (user whose activity log this is)
-action_user_id (user who took the action, or null if same as user_id)
-activity_type
-date
To select all recent activity logs for a single user is then easy:
SELECT * from activity_log WHERE user_id = ? ORDER by date DESC LIMIT 0,50
To make this approach really efficient you need to have enough information in the single activity log table to not need any further selects. For example you may store the raw log message, rather than build it on the fly.
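A small sketch of this write-time fan-out, using Python's built-in sqlite3 rather than PHP/MySQL. The `friends` table and the function names are assumptions for illustration; the `activity_log` columns follow the layout above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical friendship table: one row per (user, friend) pair.
    CREATE TABLE friends (user_id INTEGER, friend_id INTEGER);
    CREATE TABLE activity_log (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,   -- whose log page this row appears on
        action_user_id INTEGER,     -- who acted, or NULL if same as user_id
        activity_type TEXT,
        date TEXT
    );
    CREATE INDEX idx_log_user_date ON activity_log (user_id, date);

    INSERT INTO friends VALUES (1, 2), (1, 3);
""")

def record_activity(actor_id, activity_type, date):
    # Record 1: the actor's own "I did this" entry.
    conn.execute(
        "INSERT INTO activity_log (user_id, action_user_id, activity_type, date) "
        "VALUES (?, NULL, ?, ?)",
        (actor_id, activity_type, date),
    )
    # Remaining records: one "my friend did this" entry per friend.
    conn.execute(
        "INSERT INTO activity_log (user_id, action_user_id, activity_type, date) "
        "SELECT friend_id, ?, ?, ? FROM friends WHERE user_id = ?",
        (actor_id, activity_type, date, actor_id),
    )

def recent_log(user_id, limit=50):
    # Reading a log page touches a single table and a single index.
    return conn.execute(
        "SELECT action_user_id, activity_type FROM activity_log "
        "WHERE user_id = ? ORDER BY date DESC LIMIT ?",
        (user_id, limit),
    ).fetchall()
```

The cost moves from read time to write time: each action writes one row per friend, which is the trade-off this approach accepts in exchange for cheap reads.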
I don't know if I understood correctly what you need, but
I would try this select. If I'm right, you should get all activity for all followers of #USERID#:
SELECT a.* FROM Activities AS a
INNER JOIN Followers AS f1
ON a.user_id = f1.follower_id
WHERE f1.user_id = #USERID#
I'm working on a PHP app that has several objects that can be commented on. Each comment can be voted on, with users being able to give it +1 or -1 (like Digg or Reddit). Right now I'm planning on having a 'votes' table that carries the user_id and their vote info, which seems to work fine.
The thing is, each object has hundreds of comments that are stored in a separate comments table. After I load the comments, I'm having to tally the votes and then individually check each vote against the user to make sure they can only vote once. This works but just seems really database intensive - a lot of queries for just the comments.
Is there a simpler method of doing this that is less DB intensive? Is my current database structure the best way to go?
To be clearer about current database structure:
Comments table:
user_id
object_id
total_votes
Votes table:
comment_id
user_id
vote
End Goal:
Allow user to vote only once on each comment with least # of MySQL queries (each object has multiple comments)
To make sure that each voter votes only once, design your Votes table with these fields—CommentID, UserID, VoteValue. Make CommentID and UserID the primary key, which will make sure that one user gets only one vote. Then, to query the votes for a comment, do something like this:
SELECT SUM(VoteValue)
FROM Votes
WHERE CommentID = ?
Does that help?
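A minimal sketch of that composite key, with Python's built-in sqlite3 standing in for PHP/MySQL. The CHECK constraint limiting votes to +1 or -1 and the helper function names are additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Votes (
        CommentID INTEGER NOT NULL,
        UserID INTEGER NOT NULL,
        VoteValue INTEGER NOT NULL CHECK (VoteValue IN (-1, 1)),
        PRIMARY KEY (CommentID, UserID)  -- one vote per user per comment
    )
""")

def cast_vote(comment_id, user_id, value):
    # The database itself rejects a second vote, so no pre-check query is needed.
    try:
        conn.execute(
            "INSERT INTO Votes (CommentID, UserID, VoteValue) VALUES (?, ?, ?)",
            (comment_id, user_id, value),
        )
        return True
    except sqlite3.IntegrityError:
        return False

def score(comment_id):
    # COALESCE turns the NULL that SUM gives for "no votes yet" into 0.
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(VoteValue), 0) FROM Votes WHERE CommentID = ?",
        (comment_id,),
    ).fetchone()
    return total
```

A repeated call such as `cast_vote(1, 10, -1)` after `cast_vote(1, 10, 1)` returns False and leaves the first vote in place.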
Why don't you save the totaled votes for every comment? Increment/decrement this when a new vote has happened.
Then you have to check if the user has voted specifically for this comment to allow only one vote per comment per user.
You can put a sql join condition which returns all the votes on comments made by the current user for this object, if you get no rows, the user hasn't voted. That is just slightly different from you checking each comment one by one in the program.
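As a sketch of that join: one query can return every comment on an object together with its stored total and the current user's own vote, if any. Python's built-in sqlite3 is used in place of PHP/MySQL, and the `id` column on comments is an assumption, since the question's table listing does not show one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,  -- hypothetical key, not listed in the question
        user_id INTEGER,
        object_id INTEGER,
        total_votes INTEGER
    );
    CREATE TABLE votes (
        comment_id INTEGER,
        user_id INTEGER,
        vote INTEGER,
        PRIMARY KEY (comment_id, user_id)
    );

    INSERT INTO comments VALUES (1, 5, 7, 3), (2, 6, 7, -1), (3, 5, 8, 0);
    INSERT INTO votes VALUES (1, 42, 1);
""")

def comments_with_my_vote(object_id, current_user_id):
    # The user filter sits in the ON clause, so comments the user has not
    # voted on still come back, with my_vote as NULL (None in Python).
    sql = """
        SELECT c.id, c.total_votes, v.vote AS my_vote
        FROM comments c
        LEFT JOIN votes v
               ON v.comment_id = c.id AND v.user_id = ?
        WHERE c.object_id = ?
        ORDER BY c.id
    """
    return conn.execute(sql, (current_user_id, object_id)).fetchall()
```

A None in the third column means the user may still vote on that comment, so the page needs a single query regardless of how many comments the object has.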
As far as the database structure is concerned, keeping these things separate seems perfectly logical: vote { user_id, object_id, object_type, vote_info... }
You may already be doing this; sorry, but I couldn't tell from your post if that was the case.