I have a small file hosting website, and I am trying to store the files that users upload in a database. I cannot decide which method would be better:
1. Store all users in one table.
2. Create a new table for each user.
I understand that the second method will slow performance but by how much? I am planning on having 1000+ users eventually. The issue with the first method is listing the files back to the user. What method should I go with and which one would be the most efficient?
Short answer:
No. Use the simplest thing that works: A single table.
Long answer:
You'll know what kind of scaling problems you have once a production system is running under production loads. Then you can analyze where your bottlenecks are and develop a sharding strategy based on real-world use cases and not hypotheticals.
Right now you're just guessing, and you'll probably guess wrong. Then you're stuck with an awful database structure you'll find impossible to undo.
Try not to store actual files in the MySQL database. This almost always leads to horrible disaster. Instead, store them on the filesystem and keep references to them in the database. If you're going to be managing a lot of files, heaps and tons of them, you may want to look at a distributed key-value store like Riak to help with that.
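To make the "files on disk, references in the database" idea concrete, here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for MySQL so it runs anywhere; the table layout, the UPLOAD_DIR location and the function names are all made up for illustration, not something from the question.

```python
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical upload directory; a real site would use a fixed path outside the web root.
UPLOAD_DIR = Path(tempfile.mkdtemp())

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute(
    "CREATE TABLE files ("
    " id INTEGER PRIMARY KEY,"
    " user_id INTEGER NOT NULL,"
    " original_name TEXT NOT NULL,"
    " stored_path TEXT NOT NULL)"
)


def save_upload(user_id: int, original_name: str, data: bytes) -> int:
    """Write the bytes to disk and keep only a reference in the database."""
    cur = conn.execute(
        "INSERT INTO files (user_id, original_name, stored_path) VALUES (?, ?, '')",
        (user_id, original_name),
    )
    file_id = cur.lastrowid
    # Name the file after its row id so user-supplied names never reach the filesystem.
    path = UPLOAD_DIR / f"{file_id}.bin"
    path.write_bytes(data)
    conn.execute("UPDATE files SET stored_path = ? WHERE id = ?", (str(path), file_id))
    conn.commit()
    return file_id


def read_upload(file_id: int) -> bytes:
    """Look up the stored path in the database, then read the file from disk."""
    row = conn.execute(
        "SELECT stored_path FROM files WHERE id = ?", (file_id,)
    ).fetchone()
    return Path(row[0]).read_bytes()
```

The database row stays small no matter how big the upload is, which is the point of keeping the bytes on the filesystem.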
I suggest creating a table for each entity and having a correct relationship between them.
For example:
Users table will have user_id, user_name, etc.
Files table will have id, url, user_id
In this case, the relationship is created by having the same user_id. So when you upload a file, you attach the user_id to the image.
That means: no, don't create a separate table for each user. Go with method 1 and make sure each important entity has its own table.
Down the road, you will probably have more entities and more tables, such as Permission, Products, etc. By using SQL queries you will be able to get all the data you want.
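A rough sketch of the Users/Files layout described above, again with sqlite3 standing in for MySQL. The column names come from the answer; the index name, the sample rows and the files_for_user helper are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
CREATE TABLE users (
    user_id   INTEGER PRIMARY KEY,
    user_name TEXT NOT NULL
);
CREATE TABLE files (
    id      INTEGER PRIMARY KEY,
    url     TEXT NOT NULL,
    user_id INTEGER NOT NULL REFERENCES users(user_id)
);
-- An index on user_id keeps the per-user listing fast as the table grows.
CREATE INDEX idx_files_user_id ON files(user_id);
""")

# Made-up sample data.
conn.executemany(
    "INSERT INTO users (user_id, user_name) VALUES (?, ?)",
    [(1, "alice"), (2, "bob")],
)
conn.executemany(
    "INSERT INTO files (url, user_id) VALUES (?, ?)",
    [("/files/a.txt", 1), ("/files/b.txt", 1), ("/files/c.txt", 2)],
)
conn.commit()


def files_for_user(user_id: int) -> list:
    """List one user's files out of the single shared files table."""
    rows = conn.execute(
        "SELECT url FROM files WHERE user_id = ? ORDER BY id", (user_id,)
    )
    return [url for (url,) in rows]
```

Listing the files back to a user is then one indexed query, so the worry about method 1 goes away.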
Hope this helps!
Having 1000-ish users is not a problem for MySQL. But tadman is right: save the files on the filesystem instead of in the database.
If you know that you will end up with millions of users, I suggest that you read up on how Facebook or other big user-centric sites handle these scaling problems.
Related
I am currently developing a PHP website, and since the website will be used by many people, I just want to know if there will be a problem when those different users access the database at the same time, and if so, how to deal with it. Thanks in advance.
SIMPLE ANSWER: As long as your code is well designed, No.
Elaborating: In a MySQL server, databases are made to work very efficiently and to handle a large set of tasks. These tasks include the constant querying of tables inside separate databases, with statements that SELECT data, UPDATE data, INSERT rows, DELETE rows, etc.
There are some corner cases that can happen however. Imagine if two people are registering on your website for the first time, and both of them want to register the username Awesomesauce. Programmers often code algorithms that first check if the current username exists, and if it doesn't, INSERT a new row in the users table with the new username and all the other relevant info (password, address, etc). If both users were to click the Register button at the same time, and if your code was badly designed, what could happen is two rows could be created with the same username, in which case you would have a problem.
Luckily, MySQL has features to prevent such corner cases. A UNIQUE INDEX could be implemented on the username column, forcing the database to reject one of the two users who tried to register the name at the exact same time.
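Here is a small sketch of the unique-index approach, using sqlite3 as a stand-in for MySQL. The table, index and function names are assumptions. It does not simulate two truly simultaneous requests; it only shows that the database itself refuses the second row, so no "check first, then insert" step is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL)")
conn.execute("CREATE UNIQUE INDEX idx_users_username ON users(username)")


def register(username: str) -> bool:
    """Try to insert directly and let the unique index decide who wins."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO users (username) VALUES (?)", (username,))
        return True
    except sqlite3.IntegrityError:
        # The index rejected a duplicate username.
        return False


first = register("Awesomesauce")
second = register("Awesomesauce")
print(first, second)
```

The application then only has to turn the failed insert into a "username already taken" message.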
All in all, if your code is well designed, you shouldn't have a problem.
It all depends on how much traffic, how large your site's database is and a host of other factors.
But for starters, I'd say there's really nothing to worry about.
I think you should go with MySQL since you are just starting out with PHP, but you can pretty much use whatever you want with PHP's PDO http://php.net/manual/en/book.pdo.php. There is a lot of online support for MySQL with PHP, so I would start there.
I would suggest making multiple tables in the same db rather than multiple dbs, though there won't be any problem even if multiple dbs are accessed at the same time.
Refer to the following link to see how it's done:
How do you connect to multiple MySQL databases on a single webpage?
While your question is way too broad, if you want horizontal scaling (adding more servers) look at a PHP/NoSQL solution. Otherwise, something like PHP/MySQL will be fine.
A bit of reading for you here: Difference between scaling horizontally and vertically for databases
Here I come again ;)
I am doing an application where each user will have their own DB.
Is it ok if I store session for each user in their individual DB? Or is it for some reason convenient to have active sessions in a common DB for all users?
Sorry about my question, I am kind of new to this level. :) I am working with PHP and MySQL, if that makes any difference, although I think the question is language independent.
In a typical application, there will only be one database with several tables, where each table can have several records.
Sessions
You can just save sessions the same way you would add any other record to the database.
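As a rough illustration of sessions living in one shared table, here is a sketch with sqlite3 standing in for MySQL. The sessions table, its columns and both function names are assumptions made for the example.

```python
import secrets
import sqlite3
import time

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute(
    "CREATE TABLE sessions ("
    " token TEXT PRIMARY KEY,"
    " user_id INTEGER NOT NULL,"
    " expires_at REAL NOT NULL)"
)


def create_session(user_id: int, ttl_seconds: int = 3600) -> str:
    """Insert one session row for the user and return its random token."""
    token = secrets.token_hex(16)
    conn.execute(
        "INSERT INTO sessions (token, user_id, expires_at) VALUES (?, ?, ?)",
        (token, user_id, time.time() + ttl_seconds),
    )
    conn.commit()
    return token


def user_for_session(token: str):
    """Return the user id for a live session, or None if unknown or expired."""
    row = conn.execute(
        "SELECT user_id FROM sessions WHERE token = ? AND expires_at > ?",
        (token, time.time()),
    ).fetchone()
    return row[0] if row else None
```

Because every user's sessions sit in the same table, looking one up never requires knowing which database to open first.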
Profile Details / Friendship
This is where relationships take place.
Consider the image below. Credits to the owner on w3stack(dot)org.
Focus on and try to study the three tables above: Users, Friendships, Friends (virtual table). Ignore the virtual table concept for now, so you will not be too confused.
Creating individual databases for each user is really a BAD, and I mean BAD, approach. What if you thought of adding a "following" and "follower" feature to your application? You would need to add another table, and re-add all those friends from another db. If UserA has 100 friends, each with their own database, you wouldn't want to query all those 100 databases.
To end, just use a single DB, and identify relationships according to your application features. It is important to plan your structure before you actually implement it. Happy coding!
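For what it's worth, a single-database version of the friends relationship could look like the sketch below (sqlite3 standing in for MySQL). The column names, sample users and the friends_of helper are assumptions, since the original diagram is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
CREATE TABLE users (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE friendships (
    user_id   INTEGER NOT NULL REFERENCES users(id),
    friend_id INTEGER NOT NULL REFERENCES users(id),
    PRIMARY KEY (user_id, friend_id)
);
""")

# Made-up sample data: UserA is friends with UserB and UserC.
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(1, "UserA"), (2, "UserB"), (3, "UserC")],
)
conn.executemany(
    "INSERT INTO friendships (user_id, friend_id) VALUES (?, ?)",
    [(1, 2), (1, 3)],
)
conn.commit()


def friends_of(user_id: int) -> list:
    """One join in one database, however many friends the user has."""
    rows = conn.execute(
        "SELECT u.name FROM friendships f"
        " JOIN users u ON u.id = f.friend_id"
        " WHERE f.user_id = ? ORDER BY u.name",
        (user_id,),
    )
    return [name for (name,) in rows]
```

A "following" feature would just be one more table of the same shape in the same database.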
I am working on a project in which people can create a playlist, and it's stored in localStorage as objects. Everything is client side for the moment.
So I would now like to take a leap forward and make a user login system (I can do it using PHP, MySQL and FB Connect or an OAuth system; any other suggestions?). The problem is deciding whether I make a SQL database for each user and store their playlists (with media info), or whether there is another way to go about it. Will handling a large number of databases be trouble for me (in terms of speed)?
How about I create only one db as follows:
user database ---> one table containing { user (primary key), pass, someotherInfo }, then tables per USER { contains playlists }, then a 3rd table per playlist (containing userID and media info; what could be my primary key?)
example:
I have 10 registered users, each user has 2 playlists
1. table 1: 10 entries
2. table(s): username - playlists (10 tables) || I make one table with one field for the user and another field for the playlist name
3. tables: each playlist - media info, owner (20 tables)
Or is there a simpler way?
I hope my question is clear.
PS: I am new to PHP and databases (so this might be very silly)
Surprised most answers seem to have missed the question, but I'll give this a try;
This is called data modeling (how you cobble a bunch of tables in a database together in order to express what you want in the best possible way), and don't feel silly for asking; there are people out there who spend all their waking hours tweaking and designing data models. They are hugely important to the well-being of any system, and they are, in truth, far more important than most people give them credit for.
It sounds like you're on the right path. It's always a good tip to define your entities, and create a table per each, so in this case you've got users and playlists and songs (for example). Define your tables thusly; USER, SONG, PLAYLIST.
The next thing is defining the names of fields and tables (and perhaps the simplistic names suggested above are, well, simplistic). Some introduce faux namespaces (ie. MYAPP_USER instead of just USER), especially if they know the data model will extend and expand in the same database in the future (or, some because they know this is inevitable), while others will just ram through whatever they need.
The big question will always be about normalization and various problems around that, balancing performance against applicability, and there's tons and tons of books written on this subject, so no way for me to give you any meaningful answer, but the gist of it for me is;
At what point will a data field in a table be worthy of its own table? An example is that you could well create your application with only one table, or two, or 6 depending on how you wish to split your data. This is where I think your question really comes in.
I'd say you're pretty much correct in your assumptions, the thing to keep in mind is consistent naming conventions (and there's tons of opinions of how to name identifiers). For your application (with the tables mentioned above), I'd do ;
USER { id, username, password, name, coffee_preference }
SONG { id, artist, album, title, genre }
PLAYLIST { id, userid }
PLAYLIST_ITEM { id, songid, playlistid, songorder }
Now you can use SQL to get all playlists for a user ;
SELECT * FROM PLAYLIST WHERE userid=$userid
Or get all songs in a playlist ;
SELECT * FROM SONG,PLAYLIST_ITEM WHERE PLAYLIST_ITEM.playlistid=$playlistid AND SONG.id=PLAYLIST_ITEM.songid ORDER BY PLAYLIST_ITEM.songorder
And so on. Again, tomes have been written about this subject. It's all about thinking clearly and semantically while jotting down a technical solution to it. And some people have only this as a career (like DBAs). There will be lots of opinions, especially on what I've written here. Good luck.
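If it helps to see the four tables working together, here is a runnable sketch with sqlite3 standing in for MySQL. The table and column names follow the answer (a few columns are left out for brevity); the sample rows and the two helper functions are made up, and bound parameters replace the $userid style of interpolation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
CREATE TABLE "user" (id INTEGER PRIMARY KEY, username TEXT NOT NULL);
CREATE TABLE song (id INTEGER PRIMARY KEY, artist TEXT, title TEXT);
CREATE TABLE playlist (
    id     INTEGER PRIMARY KEY,
    userid INTEGER NOT NULL REFERENCES "user"(id)
);
CREATE TABLE playlist_item (
    id         INTEGER PRIMARY KEY,
    songid     INTEGER NOT NULL REFERENCES song(id),
    playlistid INTEGER NOT NULL REFERENCES playlist(id),
    songorder  INTEGER NOT NULL
);
""")

# Made-up sample data: one user, one playlist holding two of three songs.
conn.execute('INSERT INTO "user" (id, username) VALUES (1, ?)', ("demo",))
conn.executemany(
    "INSERT INTO song (id, artist, title) VALUES (?, ?, ?)",
    [(1, "Artist A", "First"), (2, "Artist B", "Second"), (3, "Artist C", "Third")],
)
conn.execute("INSERT INTO playlist (id, userid) VALUES (1, 1)")
conn.executemany(
    "INSERT INTO playlist_item (songid, playlistid, songorder) VALUES (?, ?, ?)",
    [(3, 1, 1), (1, 1, 2)],
)
conn.commit()


def playlists_for_user(userid: int) -> list:
    """All playlist ids owned by one user."""
    rows = conn.execute(
        "SELECT id FROM playlist WHERE userid = ? ORDER BY id", (userid,)
    )
    return [pid for (pid,) in rows]


def songs_in_playlist(playlistid: int) -> list:
    """Song titles of one playlist, in playlist order."""
    rows = conn.execute(
        "SELECT song.title FROM song"
        " JOIN playlist_item ON song.id = playlist_item.songid"
        " WHERE playlist_item.playlistid = ?"
        " ORDER BY playlist_item.songorder",
        (playlistid,),
    )
    return [title for (title,) in rows]
```

With 10 users and 20 playlists this is still four tables, not the 31 from the layout in the question.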
You can use either an SQL database like MySQL or PostgreSQL or a NoSQL database like MongoDB. Each has its pros and cons, but since you seem like a beginner I am going to suggest MySQL because it's what most beginners work with. Take a look at these articles:
http://dev.mysql.com/tech-resources/articles/mysql_intro.html
http://www.redhat.com/magazine/007may05/features/mysql/
Of course, feel free to do your own searching on The Big G, as there are tons of resources out there.
As the title of the question suggests, my question is simple: which one is better in terms of performance, knowing that I'm on Linux shared hosting (SiteGround)? I'm capable of coding both. I actually coded one that updates the DB, but from reading around, some people suggested inserting rather than updating. Any feedback is much appreciated.
thank you.
Use a database! Since you will have multiple people accessing your site, writing to one file will either mean blocking or having the count overwritten.
By using a database and inserting, you don't have to wait for other clients and you are safely allowing concurrent access. You then get the count by doing a select count(*) from countTbl
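A tiny sketch of the insert-then-count approach, with sqlite3 standing in for MySQL. The countTbl name is from the answer; the columns and the two function names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute(
    "CREATE TABLE countTbl ("
    " id INTEGER PRIMARY KEY,"
    " visited_at TEXT DEFAULT CURRENT_TIMESTAMP)"
)


def record_visit() -> None:
    """Each page view adds its own row, so no request rewrites a shared number."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO countTbl DEFAULT VALUES")


def visit_count() -> int:
    """The counter is simply the number of rows."""
    return conn.execute("SELECT COUNT(*) FROM countTbl").fetchone()[0]


for _ in range(3):
    record_visit()
print(visit_count())
```

The trade-off is that the table grows by one row per visit, which is worth keeping in mind on shared hosting.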
What are you storing in the database? If it's just that one number (the page counter), I would not use a database, but if you are storing data for each visitor, a database is the way to go.
I have a pretty large social network type site that I have been working on for about 2 years (high traffic and 100's of files). I have been experimenting for the last couple of years with tweaking things for max performance for the traffic, and I have learned a lot. Now I have a huge task: I am planning to completely re-code my social network, so I am re-designing the MySQL DBs and everything.
Below is a photo I made up of a couple of MySQL tables that I have a question about. I currently have the login table, which is used in the login process; once a user is logged into the site they very rarely need to hit the table again unless editing an email or password. I then have a user table, which is basically the user's settings and profile data for the site. This is where I have questions: would it be better for performance to split the user table into smaller tables? For example, if you view the user table you will see several fields that I have marked as "setting_"; should I just create a separate setting table? I also have fields marked with "count", which could be the total count of comments, photos, friends, mail messages, etc. So should I create another table to store just the total count of things?
The reason I have them all on 1 table now is because I was thinking maybe it would be better if I could cut down on mysql queries, instead of hitting 3 tables to get information on every page load I could hit 1.
Sorry if this is confusing, and thanks for any tips.
[Image: the login and user tables, http://img2.pict.com/b0/57/63/2281110/0/800/dbtable.jpg]
As long as you don't SELECT * FROM your tables, having 2 or 100 fields won't affect performance.
Just SELECT only the fields you're going to use and you'll be fine with your current structure.
should I just create a separate setting table?
So should I create another table to store just the total count of things?
There is no single correct answer for this; it depends on how your application is doing.
What you can do is measure and extrapolate the results in a dev environment.
On the one hand, using a separate table will save you some space and the code will be easier to modify.
On the other hand, you may lose some performance (as you already suspect) by having to join information from different tables.
About the count, I think it's fine to have it there. Although it is always said that it is better to calculate this kind of stuff, I don't think it will hurt you at all in this situation.
But again, the only way to know what's better for you and your specific app is to measure, profile, and find out what the benefit of doing so is. You would probably gain only about 2%.
You'll need to compare performance testing results between the following:
Leaving it alone
Breaking it up into two tables
Using different queries to retrieve the login data and profile data (if you're not doing this already) with all the data in the same table
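One way to set up such a comparison is sketched below, with sqlite3 standing in for MySQL and timeit doing the measuring. The table names, the single setting_theme column and the row counts are all made up; timings from an in-memory SQLite database say nothing about your MySQL server, so treat this only as the shape of the experiment.

```python
import sqlite3
import timeit

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
CREATE TABLE user_wide (id INTEGER PRIMARY KEY, name TEXT, setting_theme TEXT);
CREATE TABLE user_core (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE user_setting (user_id INTEGER PRIMARY KEY, setting_theme TEXT);
""")

# The same made-up data loaded into both layouts.
rows = [(i, f"user{i}", "dark" if i % 2 else "light") for i in range(1, 1001)]
conn.executemany("INSERT INTO user_wide VALUES (?, ?, ?)", rows)
conn.executemany("INSERT INTO user_core VALUES (?, ?)", [(i, n) for i, n, _ in rows])
conn.executemany("INSERT INTO user_setting VALUES (?, ?)", [(i, s) for i, _, s in rows])
conn.commit()


def load_wide(user_id: int):
    """Option 1: everything in one table."""
    return conn.execute(
        "SELECT name, setting_theme FROM user_wide WHERE id = ?", (user_id,)
    ).fetchone()


def load_split(user_id: int):
    """Option 2: two tables joined on the user id."""
    return conn.execute(
        "SELECT c.name, s.setting_theme FROM user_core c"
        " JOIN user_setting s ON s.user_id = c.id WHERE c.id = ?",
        (user_id,),
    ).fetchone()


# Time the same lookup against both layouts.
wide_seconds = timeit.timeit(lambda: load_wide(500), number=2000)
split_seconds = timeit.timeit(lambda: load_split(500), number=2000)
print(f"one table: {wide_seconds:.4f}s, two tables: {split_seconds:.4f}s")
```

Run the real version against a copy of production data with your actual queries before deciding anything.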
Also, you could implement some kind of caching strategy on the profile data if the usage data suggests this would be advantageous.
You should consider putting the counter columns and frequently updated timestamps in their own table: every time you bump them, the entire row is written.
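A sketch of what moving the counters out could look like, with sqlite3 standing in for MySQL. The user_profile and user_counter tables, their columns and the helper functions are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
-- Wide, rarely changing profile data.
CREATE TABLE user_profile (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    about TEXT
);
-- Narrow, frequently updated counters, one row per user.
CREATE TABLE user_counter (
    user_id       INTEGER PRIMARY KEY REFERENCES user_profile(id),
    comment_count INTEGER NOT NULL DEFAULT 0,
    friend_count  INTEGER NOT NULL DEFAULT 0
);
""")


def add_user(user_id: int, name: str) -> None:
    """Create the profile row and its matching counter row together."""
    with conn:
        conn.execute(
            "INSERT INTO user_profile (id, name) VALUES (?, ?)", (user_id, name)
        )
        conn.execute("INSERT INTO user_counter (user_id) VALUES (?)", (user_id,))


def bump_comment_count(user_id: int) -> None:
    """Only the small counter row is rewritten, not the wide profile row."""
    with conn:
        conn.execute(
            "UPDATE user_counter SET comment_count = comment_count + 1"
            " WHERE user_id = ?",
            (user_id,),
        )


def comment_count(user_id: int) -> int:
    return conn.execute(
        "SELECT comment_count FROM user_counter WHERE user_id = ?", (user_id,)
    ).fetchone()[0]
```

Whether this actually pays off depends on how often the counters change compared with how often the profile is read.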
I wouldn't consider your user table terribly large in number of columns, just my opinion. I also wouldn't break that table into multiple tables unless you can find a case for removal of redundancy. Perhaps you have a lot of users who have the same settings; that would be a case for breaking the table out.
You should take into account the average size of a single row, in order to find out whether retrieval is expensive. Also, you should try to use indexes when looking for data...
The most important thing is to design properly, not just to split because "it looks large". Maybe the IP or IPs could go somewhere else... depends on the data saved there.
Also, as the social network site using this data also handles the auth and authorization processes (I guess so), the separation between the login and user tables should offer good performance, because the data on login is "short enough", while the profile could be accessed only once, immediately after the successful login. Just do the right tricks to improve DB performance and it's done.
(Remember to visualize tables as entities, name them as an entity, not as a collection of them)
Two things you will want to consider when deciding whether or not to break up a single table into multiple tables are:
MySQL likes small, consistent datasets. If you can structure your tables so that they have fixed row lengths that will help performance at the potential cost of disk space. One thing that from what I can tell is common is taking fixed length data and putting it in its own table while the variable length data will go somewhere else.
Joins are in most cases less performant than not joining. If the data currently in your table will normally be accessed all at the same time then it may not be worth splitting it up as you will be slowing down both inserts and quite potentially reads. However, if there is some data in that table that does not get accessed as often then that would be a good candidate for moving out of the table for performance reasons.
I can't find a resource online to substantiate this next statement but I do recall in a MySQL Performance talk given by Jay Pipes that he said the MySQL optimizer has issues once you get more than 8 joins in a single query (MySQL 5.0.*). I am not sure how accurate that magic number is but regardless joins will usually take longer than queries out of a single table.