I am thinking of storing online users in memcached.
My first idea was an array of key => value pairs, where the key is the user id and the value is the timestamp of the last access.
My problem is that the array will be quite large when many users are online at once (and there will be).
Since memcached is not built to store large values, how would you solve this? What is the best practice?
Thanks for your input!
The problem with this approach is that memcached can only be queried by a key you know in advance. This means you would have to keep the entire online user list under a single known key. Each time a user came online or went offline, you would have to read the list, adjust it, and write it back. There is serious potential for a race condition there, so you would have to use memcached's check-and-set (CAS) operation to avoid losing updates.
I don't think you should do this. Consider keeping a database table of recent user hits:
user_id: int
last_seen: timestamp
Index on last_seen and user_id. Query friends online using:
SELECT user_id FROM online WHERE user_id IN (...) AND last_seen > (10 minutes ago);
Periodically go through the table and batch remove old timestamp rows.
When your site becomes big you can shard this table on the user_id.
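A minimal sketch of that table, using Python's built-in sqlite3 in place of a real database server so it runs on its own. The function names (touch, friends_online, purge) and the integer Unix timestamps are assumptions made for illustration, not part of the answer above.

```python
import sqlite3
import time

ONLINE_WINDOW = 600  # seconds; the "10 minutes" from the answer


def connect():
    """Create an in-memory database with the online table and its index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE online (user_id INTEGER PRIMARY KEY, last_seen INTEGER NOT NULL)"
    )
    conn.execute("CREATE INDEX idx_online_last_seen ON online (last_seen)")
    return conn


def touch(conn, user_id, now=None):
    """Record a page hit: insert the user's row or overwrite its timestamp."""
    now = int(time.time()) if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO online (user_id, last_seen) VALUES (?, ?)",
        (user_id, now),
    )


def friends_online(conn, friend_ids, now=None):
    """Return the ids from friend_ids that were seen within the window."""
    now = int(time.time()) if now is None else now
    if not friend_ids:
        return []
    placeholders = ",".join("?" * len(friend_ids))
    rows = conn.execute(
        f"SELECT user_id FROM online WHERE user_id IN ({placeholders}) "
        "AND last_seen > ? ORDER BY user_id",
        (*friend_ids, now - ONLINE_WINDOW),
    ).fetchall()
    return [row[0] for row in rows]


def purge(conn, now=None):
    """Batch-remove rows older than the window; returns how many were deleted."""
    now = int(time.time()) if now is None else now
    cur = conn.execute("DELETE FROM online WHERE last_seen <= ?", (now - ONLINE_WINDOW,))
    return cur.rowcount
```

The purge function is the piece you would run from a periodic job.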
EDIT:
Actually, you can do this if you don't need to query all the users who are currently online, but just need to know if certain users are online.
When a user hits a page,
memcache.set("online."+user_id, true, 600 /* 10 mins */);
To see if a user is online,
online = memcache.get("online."+user_id);
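Memcached clients also generally support fetching several keys in one request (a multi-get); look that up for your client library. Some strange things could happen if you add a memcached server, since keys can then map to a different server and some users may briefly appear offline, but if you only use this information to put an "online" marker next to user names, that shouldn't be a big deal.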
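To show the expiring-key idea without needing a running memcached server, here is a small sketch built on an in-memory stand-in. TTLCache and its set/get/get_multi methods are hypothetical and only imitate the shape of a typical memcached client; a real client's method names and signatures may differ.

```python
import time


class TTLCache:
    """In-memory stand-in for a memcached client (hypothetical, illustration only)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._data = {}

    def set(self, key, value, ttl):
        # Store the value together with its absolute expiry time.
        self._data[key] = (value, self._clock() + ttl)

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires = item
        if self._clock() >= expires:
            del self._data[key]  # expired entries behave as if they never existed
            return None
        return value

    def get_multi(self, keys):
        # Return only the keys that are present and not expired.
        found = {}
        for key in keys:
            value = self.get(key)
            if value is not None:
                found[key] = value
        return found


def mark_online(cache, user_id, ttl=600):
    """Call on every page hit; the key disappears by itself after ttl seconds."""
    cache.set(f"online.{user_id}", True, ttl)


def online_users(cache, user_ids):
    """Check many users with a single multi-key lookup."""
    key_to_user = {f"online.{uid}": uid for uid in user_ids}
    hits = cache.get_multi(list(key_to_user))
    return sorted(key_to_user[key] for key in hits)
```

Because each user has their own key, no shared list has to be read and rewritten, so the race condition described earlier does not arise.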
I'm making a website that has a posts-and-replies system.
When someone replies, I'd like to send a notification to everyone who has previously replied to (or is otherwise involved in) the post.
My idea is to create a table named Notification containing a message field and a seen field (seen/unread). Each time someone replies, I INSERT records into the Notification table.
It seems easy and intuitive, but consider a post with many participants. When the 31st user replies, the 30 people who replied before will each receive a notification, which makes 30 rows. The 32nd reply makes another 31 rows, so the total becomes 30 + 31 = 61.
My questions are:
1. Is this a good way to handle a notification system?
2. If so, how do I deal with duplicate notifications (an earlier one not yet seen when a new reply arrives)?
3. As above, will this create a huge server load?
Thank you so much.
I built a similar system. Here is my experience:
My notification table looks like: id (int) | user_id (int) | post_id (int) | last_visited (datetime).
user_id + post_id is a unique composite index.
When a user opens the page, I look for an entry (user_id + post_id) in the database. If I find it, I update the last_visited field; if I don't, I create a new row.
When I need to list messages for a notification, I just query all messages that were created after the last_visited time.
I also have a cron script that cleans up notifications for closed posts or banned users.
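The last_visited approach can be sketched roughly as follows, again with sqlite3 so it is self-contained. The reply table, the integer timestamps, and the function names are assumptions added for illustration; only the notification table layout comes from the description above.

```python
import sqlite3


def connect():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE reply (
            id         INTEGER PRIMARY KEY,
            post_id    INTEGER NOT NULL,
            author_id  INTEGER NOT NULL,
            created_at INTEGER NOT NULL
        );
        CREATE TABLE notification (
            id           INTEGER PRIMARY KEY,
            user_id      INTEGER NOT NULL,
            post_id      INTEGER NOT NULL,
            last_visited INTEGER NOT NULL,
            UNIQUE (user_id, post_id)
        );
    """)
    return conn


def add_reply(conn, post_id, author_id, created_at):
    conn.execute(
        "INSERT INTO reply (post_id, author_id, created_at) VALUES (?, ?, ?)",
        (post_id, author_id, created_at),
    )


def record_visit(conn, user_id, post_id, now):
    """Update last_visited if the row exists, otherwise create it."""
    cur = conn.execute(
        "UPDATE notification SET last_visited = ? WHERE user_id = ? AND post_id = ?",
        (now, user_id, post_id),
    )
    if cur.rowcount == 0:
        conn.execute(
            "INSERT INTO notification (user_id, post_id, last_visited) VALUES (?, ?, ?)",
            (user_id, post_id, now),
        )


def unread_counts(conn, user_id):
    """Map post_id -> number of replies by other users since the last visit."""
    rows = conn.execute(
        """
        SELECT n.post_id, COUNT(r.id)
        FROM notification AS n
        JOIN reply AS r ON r.post_id = n.post_id
        WHERE n.user_id = ?
          AND r.created_at > n.last_visited
          AND r.author_id <> n.user_id
        GROUP BY n.post_id
        ORDER BY n.post_id
        """,
        (user_id,),
    ).fetchall()
    return dict(rows)
```

With this layout there is one row per user and post no matter how many replies arrive, which is what keeps the table from growing with every reply.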
As for your questions:
1 and 2: You have to find a balance between the amount of data stored and site performance. If you don't need to store all of this data, you can follow my approach. If you do need it, your approach is better.
3: It depends on the number of visitors and on the site's other functionality, but here is some advice. You must use indexes on the MySQL table for better performance. You should also consider a cron script that removes useless notifications. If you have a huge number of visitors (more than 700k per day), you should think about MongoDB or another high-performance NoSQL database.
I read the topic https://meta.stackexchange.com/questions/36728/how-are-the-number-of-views-in-a-question-calculated . I understand the algorithm, but I don't understand how to implement it in MySQL and PHP.
Every time a new hit is registered, it is also added to a memory buffer in addition to the expiring cache entry. The buffer itself also expires after a few minutes or after it is filled up to a certain size, whichever happens first. When it expires, everything it has accumulated is written into the database in bulk. They call it a "buffered write scheme".
Should we use the MEMORY storage engine in MySQL, or is there a better solution with MySQL and PHP?
Can anyone help me implement a "buffered write scheme" for a view counter with PHP and MySQL?
Thanks very much.
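One possible reading of the buffered write scheme quoted above, sketched in Python with sqlite3. This is not the linked site's actual implementation: the class name, the question_views table, and the default limits are assumptions, and the per-visitor expiring cache entry that deduplicates hits is left out. The buffer here lives in process memory, so in a PHP deployment it would have to be kept in some shared store instead.

```python
import sqlite3
import time
from collections import Counter


class BufferedViewCounter:
    """Accumulate hits in memory and write them to the database in bulk."""

    def __init__(self, conn, max_size=100, max_age=300.0, clock=time.monotonic):
        self.conn = conn
        self.max_size = max_size  # flush once this many hits are buffered
        self.max_age = max_age    # or once the buffer is this many seconds old
        self.clock = clock
        self.buffer = Counter()
        self.pending = 0
        self.started = None
        conn.execute(
            "CREATE TABLE IF NOT EXISTS question_views "
            "(question_id INTEGER PRIMARY KEY, view_count INTEGER NOT NULL)"
        )

    def hit(self, question_id):
        now = self.clock()
        if self.started is None:
            self.started = now  # the buffer's age starts at its first hit
        self.buffer[question_id] += 1
        self.pending += 1
        if self.pending >= self.max_size or now - self.started >= self.max_age:
            self.flush()

    def flush(self):
        # One UPDATE (or INSERT) per distinct question, not one per hit.
        for question_id, hits in self.buffer.items():
            cur = self.conn.execute(
                "UPDATE question_views SET view_count = view_count + ? "
                "WHERE question_id = ?",
                (hits, question_id),
            )
            if cur.rowcount == 0:
                self.conn.execute(
                    "INSERT INTO question_views (question_id, view_count) VALUES (?, ?)",
                    (question_id, hits),
                )
        self.conn.commit()
        self.buffer.clear()
        self.pending = 0
        self.started = None

    def stored_total(self, question_id):
        """View count currently in the database (buffered hits not included)."""
        row = self.conn.execute(
            "SELECT view_count FROM question_views WHERE question_id = ?",
            (question_id,),
        ).fetchone()
        return row[0] if row else 0
```

The trade-off is that hits still sitting in the buffer are lost if the process dies before the next flush.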
Well, it won't get faster than MySQL.
A stored procedure for your query can speed up the process, but database design is the other half.
Make sure you have one table for counting:
user_thread_visit:
----------------------------
user_id | thread_id | count
----------------------------
Make sure there is an index, or better, a two-column unique index on the columns "user_id" and "thread_id".
When a user logs in, read all of their thread_id and count values and save them in the $_SESSION array.
This way you can check the $_SESSION variable to see whether the user has already visited the page, and skip the database query if they have. This will reduce queries drastically.
Then don't forget to UPDATE your database in case the user has never been on this thread, and also update your $_SESSION array directly.
With a query like this:
INSERT INTO table (a,b,c) VALUES (1,2,3),(4,5,6)
ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);
you can combine the INSERT and the UPDATE (whichever is needed) into one query, which also improves performance.
This way a query is only fired the first time the user enters the thread. You have no choice but to record that somewhere, and a database is one of the fastest ways to do it.
As long as your thread_id and user_id fields are indexed, the SELECT query should be pretty fast, even with a million rows in the table.
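The session-cache idea can be sketched like this. A plain dict stands in for PHP's $_SESSION, sqlite3 stands in for MySQL, and SQLite's ON CONFLICT ... DO UPDATE clause (available from SQLite 3.24) plays the role of ON DUPLICATE KEY UPDATE. The count column is named visits here, and the function names are assumptions for illustration.

```python
import sqlite3


def connect():
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE user_thread_visit (
            user_id   INTEGER NOT NULL,
            thread_id INTEGER NOT NULL,
            visits    INTEGER NOT NULL,
            UNIQUE (user_id, thread_id)
        )
    """)
    return conn


def load_session(conn, user_id):
    """At login: read every thread the user has visited into the session dict."""
    rows = conn.execute(
        "SELECT thread_id, visits FROM user_thread_visit WHERE user_id = ?",
        (user_id,),
    )
    return {thread_id: visits for thread_id, visits in rows}


def register_visit(conn, session, user_id, thread_id):
    """Write to the database only if the session has not seen this thread.

    Returns True when a query was fired, False when the session cache hit.
    """
    if thread_id in session:
        return False
    # Insert the row, or bump the counter if the row already exists.
    conn.execute(
        "INSERT INTO user_thread_visit (user_id, thread_id, visits) VALUES (?, ?, 1) "
        "ON CONFLICT (user_id, thread_id) DO UPDATE SET visits = visits + 1",
        (user_id, thread_id),
    )
    session[thread_id] = 1  # matches the row for a session loaded at login
    return True
```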
I am creating a database for keeping track of water usage per person for a city in South Florida.
There are around 40000 users, each one uploading daily readouts.
I was thinking of ways to set up the database, and it would seem easier to give each user a separate table. This should ease the download of data because the server will not have to sort through a table with tens of millions of entries.
Is my logic flawed?
Is there any way to index table names?
Are there any other ways of setting up the DB to both raise the speed and keep the layout simple enough?
-Thank you,
Jared
p.s.
The essential data for the readouts are:
-locationID (table name in my idea)
-Reading
-ReadDate
-ReadTime
p.p.s. During this conversation, I uploaded 5k tables and the server froze. ~.O
thanks for your help, ya'll
Setting up thousands of tables is not a good idea. You should maintain one table and put all entries in that table. MySQL can handle a surprisingly large amount of data. The biggest issue you will encounter is the number of queries you can handle at a time, not the size of the database. For columns that hold numbers, use INT with the UNSIGNED attribute; for columns that hold text, use VARCHAR of an appropriate size (or TEXT if the text is large).
Handling users
If you need to associate records with users, set up another table that might look something like this:
user_id INT(10) UNSIGNED AUTO_INCREMENT PRIMARY KEY
name VARCHAR(100) NOT NULL
When you need to link a record to a user, just reference the user's user_id. For the readings themselves, I would set up the table something like this:
id INT(10) UNSIGNED AUTO_INCREMENT PRIMARY KEY
u_id INT(10) UNSIGNED
reading INT if it's a number, VARCHAR if it's text (I'm not sure what your reading looks like)
read_time TIMESTAMP
You can also consolidate the date and time of the reading into a single TIMESTAMP.
Do NOT create a separate table for each user.
Keep indexes on the columns that identify a user and on any other common constraints, such as date.
Think about how you want to query the data at the end. How on earth would you sum up the data from ALL users for a single day?
If you are worried about primary key, I would suggest keeping a LocationID, Date composite key.
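A minimal sketch of the single-table layout with the (LocationID, Date) composite key, using sqlite3 so it runs standalone. The column types, the ISO date strings, and the function names are assumptions for illustration.

```python
import sqlite3


def connect():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE reading (
            location_id INTEGER NOT NULL,
            read_date   TEXT    NOT NULL,  -- ISO date such as '2024-01-31'
            reading     INTEGER NOT NULL,
            PRIMARY KEY (location_id, read_date)
        );
        -- Lets the per-day, all-users query avoid a full table scan.
        CREATE INDEX idx_reading_date ON reading (read_date);
    """)
    return conn


def add_reading(conn, location_id, read_date, reading):
    """Insert one daily readout; a second one for the same day is rejected."""
    conn.execute(
        "INSERT INTO reading (location_id, read_date, reading) VALUES (?, ?, ?)",
        (location_id, read_date, reading),
    )


def city_total(conn, read_date):
    """Sum the readings of ALL locations for a single day."""
    row = conn.execute(
        "SELECT COALESCE(SUM(reading), 0) FROM reading WHERE read_date = ?",
        (read_date,),
    ).fetchone()
    return row[0]


def history(conn, location_id):
    """All readings for one location, oldest first."""
    return conn.execute(
        "SELECT read_date, reading FROM reading WHERE location_id = ? ORDER BY read_date",
        (location_id,),
    ).fetchall()
```

The city-wide total is a single query here, whereas with one table per user it would need a query against every one of the 40000 tables.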
Edit: Lastly (and I do mean this in a nice way), if you are asking these sorts of questions about database design, are you sure that you are qualified for this project? It seems like you might be in over your head. Sometimes it is better to know your limitations and let a project pass by than to implement it in a way that creates too much work for you and leaves people unsatisfied with the results. Again, I am not saying don't do it; I am just asking whether you have asked yourself if you can do this to the level they are expecting. It sounds like a large number of users will be using it constantly. I guess I am saying that learning certain things while at the same time delivering a project to thousands of users may be an exceptionally high-pressure environment.
Generally speaking, tables should represent sets of things. In your example, it's easy to identify the sets you have: users and readouts. The theoretically best structure is therefore to have those two tables, where each readout entry holds a reference to the id of the user.
MySQL can handle very large amounts of data, so your best bet is to just try the user-readouts structure and see how it performs. Alternatively, you may want to look into a document-based NoSQL database such as MongoDB or CouchDB, since storing readout reports as individual documents could be a good choice as well.
If you create a summary table that contains the monthly total per user, surely that would be the primary usage of the system, right?
Every month, you crunch the numbers and store the totals in a second table. You can prune the log table on a rolling 12-month period. That is, the old data can be stuffed in a corner to keep the indexes smaller, since you'll only need to access it when the city is accused of fraud.
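The monthly roll-up and pruning could look roughly like this. The table and column names, the 'YYYY-MM' month format, and the function names are assumptions for illustration, and sqlite3 is used so the sketch is self-contained. Note that prune simply deletes old rows; archiving them somewhere first is left out.

```python
import sqlite3


def connect():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE reading (
            location_id INTEGER NOT NULL,
            read_date   TEXT    NOT NULL,  -- ISO date such as '2024-01-31'
            reading     INTEGER NOT NULL,
            PRIMARY KEY (location_id, read_date)
        );
        CREATE TABLE monthly_total (
            location_id INTEGER NOT NULL,
            month       TEXT    NOT NULL,  -- 'YYYY-MM'
            total       INTEGER NOT NULL,
            PRIMARY KEY (location_id, month)
        );
    """)
    return conn


def roll_up(conn, month):
    """Recompute one month's per-location totals from the daily log."""
    with conn:  # both statements commit together or not at all
        conn.execute("DELETE FROM monthly_total WHERE month = ?", (month,))
        conn.execute(
            """
            INSERT INTO monthly_total (location_id, month, total)
            SELECT location_id, ?, SUM(reading)
            FROM reading
            WHERE substr(read_date, 1, 7) = ?
            GROUP BY location_id
            """,
            (month, month),
        )


def prune(conn, oldest_kept_date):
    """Delete daily rows older than the cutoff; returns how many were removed."""
    with conn:
        cur = conn.execute("DELETE FROM reading WHERE read_date < ?", (oldest_kept_date,))
    return cur.rowcount
```

Since roll_up deletes before it inserts, running it twice for the same month gives the same result.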
So exactly how you store the daily readouts isn't that big of a concern that you need to be freaking out about it. Giving each user his own table is not the proper solution. If you have tons and tons of data, then you might want to consider sharding via something like MongoDB.
I am building a Facebook application that accesses the user's friends through the Open Graph, and I need to store them in a database.
The database contains the user id, name, and some other info needed by my app. To prevent duplicate entries, I marked the user id column as unique using phpMyAdmin. This works fine for many values, but at the same time it fails big time.
Let's say the values that are unique according to MySQL are:
51547XXXX
52160XXXX
52222XXXX
52297XXXX
52448XXXX
But if the ids become
5154716XX
5154716XX
or
5216069673X
521606903XX
Then it treats them as duplicates and discards one of them.
As a result, when I enter my friend list into the table, it should have 830 records, and without the unique constraint that is the count I get.
But as soon as the unique constraint is active, I get just 375, which means 455 records are discarded as duplicates of earlier rows.
The only solution I can think of is comparing the data in PHP first and then writing it to the database, but with that many queries it will take a very long time. Google cannot answer this, dunno why.. :(
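Facebook user ids are too big to fit into MySQL's INT type, which is 32 bits wide (maximum 2147483647 signed, 4294967295 unsigned). In non-strict SQL mode, MySQL clamps out-of-range values to that maximum, so many different ids end up stored as the same number and collide on the unique index. You need the BIGINT type, which is 64 bits wide and can handle the range of ids Facebook uses.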
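The effect can be simulated in a few lines of Python. This is only a model of the behaviour, assuming MySQL clamps out-of-range values to the column maximum as it does in non-strict SQL mode; the sample ids are made up for illustration, and store_clamped is a hypothetical helper, not a MySQL function.

```python
INT_MAX = 2**31 - 1      # 2147483647, largest signed 32-bit INT
BIGINT_MAX = 2**63 - 1   # largest signed 64-bit BIGINT


def store_clamped(value, column_max):
    """Model of a non-strict insert: values above the maximum are clamped to it."""
    return min(value, column_max)


def distinct_after_store(ids, column_max):
    """How many distinct values survive once every id has been stored."""
    return len({store_clamped(i, column_max) for i in ids})


# Hypothetical ids in the same range as the ones in the question.
sample_ids = [5154716001, 5154716002, 5216069673]

stored_as_int = distinct_after_store(sample_ids, INT_MAX)        # all clamp to INT_MAX
stored_as_bigint = distinct_after_store(sample_ids, BIGINT_MAX)  # all stay distinct
```

Every id above the INT maximum is stored as the same number, so a unique index accepts only the first one. Ids that happen to fit within the range are unaffected, which is consistent with some rows going in and others being rejected.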
I'm working on an app in JavaScript, jQuery, PHP & MySQL that consists of ~100 lessons. I am trying to think of an efficient way to store each user's progress through the lessons without having to query the MySQL database too much.
Right now, I think the easiest implementation is to create a table for each user and store each lesson's status in that table. The only problem is that if I add new lessons, I would have to update every user's table.
The second implementation I considered would be to store each lesson as a table and record the user ID of each user who completed that lesson there, but then generating a status report (what lessons a user completed, how well they did, etc.) would mean pulling data from 100 tables.
Is there an obvious solution I am missing? How would you store your users' progress through 100 lessons so that it's quick and simple to generate a status report showing their progress?
Cheers!
The table structure I would recommend would be to keep a single table with non-unique fields userid and lessonid, as well as the relevant progress fields. When you want the progress of user x on lesson y, you would do this:
SELECT * FROM lessonProgress WHERE userid=x AND lessonid=y LIMIT 1;
You don't need to worry about performance unless you see that it's actually an issue. Having a table for each user or a table for each lesson is a bad solution, because a database is not meant to have a dynamic number of tables.
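A minimal sketch of the single lessonProgress table, with sqlite3 standing in for MySQL. The status and score columns, the composite primary key on (userid, lessonid), and the function names are assumptions for illustration.

```python
import sqlite3


def connect():
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE lessonProgress (
            userid   INTEGER NOT NULL,
            lessonid INTEGER NOT NULL,
            status   TEXT    NOT NULL,
            score    INTEGER,
            PRIMARY KEY (userid, lessonid)  -- one row per user and lesson
        )
    """)
    return conn


def save_progress(conn, userid, lessonid, status, score=None):
    """Insert the row for this user and lesson, or overwrite the existing one."""
    conn.execute(
        "INSERT OR REPLACE INTO lessonProgress (userid, lessonid, status, score) "
        "VALUES (?, ?, ?, ?)",
        (userid, lessonid, status, score),
    )


def status_report(conn, userid):
    """Everything one user has touched, fetched with a single query."""
    return conn.execute(
        "SELECT lessonid, status, score FROM lessonProgress "
        "WHERE userid = ? ORDER BY lessonid",
        (userid,),
    ).fetchall()
```

Adding lesson 101 later needs no schema change; it is just another lessonid value in new rows.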
If reporting is restricted to one user at a time (that is, a report is generated for a specific user and not a large clump of users), why not consider JSON (JavaScript Object Notation) stored in a file? If extensibility is key, this makes it a simple matter.
Obviously, if you're going to run reports against an arbitrarily large number of users at once, separate data files would become inefficient.
Setting the efficiency argument aside, JSON would also give you a very human-readable and interchangeable format.
Lastly, if the security of the report output isn't a big sticking point, you'd also gain the ability to easily offload view rendering onto the client.
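As a rough sketch of the file-per-user idea, assuming one JSON file per user named user_<id>.json; the file naming, the record layout, and the function names are all made up for illustration.

```python
import json
from pathlib import Path


def progress_path(directory, user_id):
    """Location of one user's progress file."""
    return Path(directory) / f"user_{user_id}.json"


def load_progress(directory, user_id):
    """Return the user's progress dict, or an empty dict if no file exists yet."""
    path = progress_path(directory, user_id)
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def save_lesson(directory, user_id, lesson_id, status, score=None):
    """Read the file, update one lesson's entry, and write the file back."""
    progress = load_progress(directory, user_id)
    # JSON object keys are always strings, so the lesson id is stored as one.
    progress[str(lesson_id)] = {"status": status, "score": score}
    progress_path(directory, user_id).write_text(
        json.dumps(progress, indent=2, sort_keys=True), encoding="utf-8"
    )
    return progress
```

A report for one user is then a single file read, and the same JSON could be handed to the client for rendering. Concurrent writes to the same file are not handled in this sketch.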
Use a relation between two tables: one for users, with user-specific columns like ID, username, email, and whatever else you want to store about them.
Then a status table that has a UID foreign key, with columns such as ID, UID, Status, etc.
It's good to keep datecreated and dateupdated on tables as well.
Then just join the tables ON status.UID = users.ID
A good option would be to create one table with user_ID as the primary key and a status (int) column, so that each row of the table represents one user. Accessing a user's progress would be fast and simple, since you have an index on the user IDs.
This way, adding new lessons would not require changing the DB.