Many fields vs. many tables in php mysql

Many fields vs. many tables in php mysql - php

I am building a game site with a lot of queries. For optimisation what is best. handeling the data with a lot of tables and relations or fewer tables but with many fields?
I would think, especially regarding to inserts and updates that fewer fields with many fields would be better than many tables. That would give more queries or???
I'm trying to figure out what is best course I am experiencing high load on my server at the evenings when I have a lot of users...

Start off with the database normalized. Ensure that you have the appropriate indexes for the queries/updates/inserts that you are doing. Use optimize table periodically.
If you are still encountering problems do some profiling to find out where the performance is insufficient. Then consider either denormalizing or perhaps rewriting the queries.
In addition make sure that the system cannot have deadlocks. That really messes up performance.

i don't think the number of columns effects anything, really. it's all about how well you've indexed the columns. if you do more updates then selects on a particular field, you might want to drop the index if you have one.
not really an answer, just something i've noticed.

Related

offloading data to another db-table and cache

I have an online store, our products come from a 4 table join.
I want to move away from these joins for the following reasons:
too expensive on the database.
when I need to query data, I want to use simpler queries.
I am thinking of offloading the data into a simpler form into another DB and table.
Then, in addition, cache that data coming from the new table.
This gives me:
Good performance
simpler querying when I need to perform on the fly lookups using a DB client.
Can anyone weigh in on whether or not this is a good approach?
Am I overdoing it?

This is not a good approach, what you are doing is denormalizing and this should only be done as a last resort if you really need to increase performance in your system. I've worked on websites with over 10 million views per month and even on those sites it was only necessary for some specific use cases.
MySQL joins are very fast, and joining on 4 tables is nothing, I've written queries joining to 15 tables that ran in less than 0.001s, if your indexes are done right the difference won't be noticeable.
What you're doing is both Premature Optimization and query writing laziness, unless your online store gets hundreds of thousands (or even millions) of visits every day you are not focusing on the right things, data integrity and consistency is way more important.

SQL query is much faster if I create indexes

Is it ok if I create like 8 indexes inside a table which has 13 columns?
If I select data from it and sort the results by a key, the query is really fast, but if the sort field is not a key it's much slower. Like 40 times slower.
What I'm basically asking is if there are any side effects of having many keys in the database...

Creating indexes on a table slows down all write operations on it a little, but speeds up read operations on the relevant columns a lot. If your application is not going to be doing lots and lots of writes to that table (which is true of most applications) then you are going to be fine.

Don't create indexes that are redundant or unused. But do create indexes you need to optimize the queries you run.
You choose indexes in any table based on your queries. Each query may use a different index, so it pays to analyze your queries carefully. See my presentation MENTOR Your Indexes. I also cover similar information in the chapter on indexing in my book SQL Antipatterns Volume 1: Avoiding the Pitfalls of Database Programming.
There is no specific rule about how many indexes is too many. In Oracle SQL Tuning Pocket Reference, author Mark Gurry says:
My recommendation is to avoid rules stating a site will not have any more than a certain number of indexes. The bottom line is that all SQL statements must run acceptably. There is ALWAYS a way to achieve this. If it requires 10 indexes on a table, then you should put 10 indexes on the table.
There are a couple of good tools to help you find redundant or unused indexes for MySQL in Percona Toolkit: http://www.percona.com/doc/percona-toolkit/pt-duplicate-key-checker.html and pt-index-usage.

This is a good question and everyone who works with mysql should know the answer. It is also commonly asked. Here is a link to one of them with a good answer:
Indexing every column in a table

In a nutshell, each new index requires space (especially if you use InnoDB - see the "Disadvantages of clustering" section in this article) and slows down INSERTs, UPDATEs and DELETEs.
Only you are in a position to decide whether speedup you'll get in SELECT and the frequency with which it will be used is worth it. But whatever you eventually decide, make sure you base your decision on measurement, not guessing!
P.S. INSERTs, UPDATEs and DELETEs with WHERE can also be sped-up by index(es), but that's another topic...

The cost of an index in disk space is generally trivial. The cost of additional writes to update the index when the table changes is often moderate. The cost in additional locking can be severe.
It depends on the read vs write ratio on the table, and on how often the index is actually used to speed up a query.
Indexes use up disc space to store, and take time to create and maintain. Unused ones don't give any benefit. If there are lots of candidate indexes for a query, the query may be slowed down by having the server choose the "wrong" one for the query.
Use those factors to decide whether you need an index.
It is usually possible to create indexes which will NEVER be used - for example, and index on a (not null) field with only two possible values, is almost certainly going to be useless.
You need to explain your own application's queries to make sure that the frequently-performed ones are using sensible indexes if possible, and create no more indexes than required to do that.
You can get more by following this links:
For mysql:
http://www.mysqlfaqs.net/mysql-faqs/Indexes/What-are-advantages-and-disadvantages-of-indexes-in-MySQL
For DB2:
http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/admin/c0005052.htm

Indexes improve read performance, but increase size, and degrade insert/update. 8 indexes seem to be a bit too many for me; however, it depends on how often you typically update the table

Assuming MySQL from tag, even though OP makes no mention of it.
You should edit your question and add the fact that you are conducting order by operations as well (from a comment you posted to a solution). order by operations will also slow down queries (as will various other mysql ops) because MySQL has to create a temp table to accomplish the ordered result set (more info here). A lot of times, if the dataset allows it, I will pull the data I need, then order it at the application layer to avoid this penalty.
Your best bet is to EXPLAIN your most used queries, and check your slow query log.

MySQL many tables or few tables

I'm building a very large website currently it uses around 13 tables and by the time it's done it should be about 20.
I came up with an idea to change the preferences table to use ID, Key, Value instead of many columns however I have recently thought I could also store other data inside the table.
Would it be efficient / smart to store almost everything in one table?
Edit: Here is some more information. I am building a social network that may end up with thousands of users. MySQL cluster will be used when the site is launched for now I am testing using a development VPS however everything will be moved to a dedicated server before launch. I know barely anything about NDB so this should be fun :)

This model is called EAV (entity-attribute-value)
It is usable for some scenarios, however, it's less efficient due to larger records, larger number or joins and impossibility to create composite indexes on multiple attributes.
Basically, it's used when entities have lots of attributes which are extremely sparse (rarely filled) and/or cannot be predicted at design time, like user tags, custom fields etc.

Granted I don't know too much about large database designs, but from what i've seen, even extremely large applications store their things is a very small amount of tables (20GB per table).
For me, i would rather have more info in 1 table as it means that data is not littered everywhere, and that I don't have to perform operations on multiple tables. Though 1 table also means messy (usually for me, each object would have it's on table, and an object is something you have in your application logic, like a User class, or a BlogPost class)
I guess what i'm trying to say is that do whatever makes sense. Don't put information on the same thing in 2 different table, and don't put information of 2 things in 1 table. Stick with 1 table only describes a certain object (this is very difficult to explain, but if you do object oriented, you should understand.)

nope. preferences should be stored as-they-are (in users table)
for example private messages can't be stored in users table ...
you don't have to think about joining different tables ...

I would first say that 20 tables is not a lot.
In general (it's hard to say from the limited info you give) the key-value model is not as efficient speed wise, though it can be more efficient space wise.

I would definitely not do this. Basically, the reason being if you have a large set of data stored in a single table you will see performance issues pretty fast when constantly querying the same table. Then think about the joins and complexity of queries you're going to need (depending on your site)... not a task I would personally like to undertake.
With using multiple tables it splits the data into smaller sets and the resources required for the query are lower and as an extra bonus it's easier to program!
There are some applications for doing this but they are rare, more or less if you have a large table with a ton of columns and most aren't going to have a value.
I hope this helps :-)

I think 20 tables in a project is not a lot. I do see your point and interest in using EAV but I don't think it's necessary. I would stick to tables in 3NF with proper FK relationships etc and you should be OK :)

the simple answer is that 20 tables won't make it a big DB and MySQL won't need any optimization for that. So focus on clean DB structures and normalization instead.

Which is better database design?

Given a site like StackOverflow, would it be better to create num_comments column to store how many comments a submission has and then update it when a comment is made or just query the number of rows with the COUNT function? It seems like the latter would be more readable and elegant but the former would be more efficient. What does SO think?

Definitely to use COUNT. Storing the number of comments is a classic de-normalization that produces headaches. It's slightly more efficient for retrieval but makes inserts much more expensive: each new comment requires not only an insert into the comments table, but a write lock on the row containing the comment count.

The former is not normalized but will produce better performance (assuming many more reads than writes).
The latter is more normalized, but will require more resources and hence be less performant.
Which is better boils down to application requirements.

I would suggest counting comment records. Although the other method would be faster it lends to a cleaner database. Adding a count column would be a sort of data duplication not to mention require on additional code step and insert.
If you were to expect millions of comments, then you may want to pick the count column approach.

I agree with #Oded. It depends on the app requirements and also how active is the site, however here is also my two cents
I would try to avoid the writes which will have to be done by triggers, UPDATES to post table when new comments are added.
If you are concerned about reporting the data then don't do that on a transactional system. Create a reporting DB and update that periodically.

The "correct" way to design is to use another table, join it and COUNT. This is consistent with what database normalization teaches.
The problem with normalization is that it cannot scale. There are only so many ways to skin a cat, so if you have millions of queries per day and a lot of them involve table X, the database performance is going below ground as the server also has to deal with concurrent writes, transactions, etc.
To deal with this problem, a common practice is sharding. Sharding has the side effect that the rows of a table are not stored in the same physical location, and a primary consequence of this is that you cannot JOIN anymore; how can you JOIN against half a table and receive meaningful results? And obviously, trying to JOIN against all partitions of a table and merge the results is going to be worse than the disease.
So you see that not only the alternative you examine is used in practice to achieve high performance, but also that there are even more radical steps that engineers can and do take.
Of course, unless you do have performance issues, sharding or even de-normalizing is just making your life harder for no tangible benefit.

Will more MySql tables slow down searches on MySql database?

I have a classifieds website, and I am thinking about redesigning the database a bit.
Currently I have 7 tables in the db. One table for each "MAIN CATEGORY".
For example, I have a "VEHICLES" table which holds all information about the following categories of classifieds:
cars
mc
mopeds/scooters
trucks
boats
etc etc
However, users on the website usually search in specific categories. For example, the user chooses the "cars" category to search in, and enters a keyword.
My code today, will search the entire VEHICLES table for all records with the field "category" equal to "cars", and then get their details:
"SELECT * IN vehicles WHERE category='cars' AND alot of other conditions" // just for example, not tested
I am thinking about making a table now, for each of these "sub-categories".
Ie, one for cars, one for mc, one for trucks etc, so that search isn't done through information which isn't needed.
Will this increase search speed? Because I have calculated that I will need atleast 30 or so tables for this.
Thanks

With a properly indexed table and a "reasonable" number of rows, you will not gain much speed from this approach. Anything you gain in speed of execution you will lose in time-to-market because your programming will become more complicated.
Do not perform this optimization unless and until you encounter a performance problem in testing with a representative set of data.

It will increase the speed of a search within the same category. It will potentially slow down queries where you need aggregate information from the different categories. You need to decide which is the best option for your site.
How many records do you have in total in the vehicles table. Its quite likely that adding proper indexes will greatly increase the speed of your searches.
Check out the 'EXPLAIN' query option in MySQL. Understanding this will help you optimize your database a lot with indices.

Performance optimization is as much art as science, and to really understand what's the best option requires that you do some benchmarking; anyone offering a definitive answer given the available information is just wrong. That said, a few thoughts on your situation:
You don't say what type your category column is now, but if it's a string type, it's probably using more space than other options, thus making the table larger. Proper indexing can help tremendously with speed, but a larger table with larger indexes will always work to do just the opposite.
As already mentioned by someone else, your queries within a category will be faster in the simple case of a category search. How much faster depends on how much data you have in your current table, and the increases may be negated if you have to join in other tables to satisfy the need for all the other conditions to which you alluded. OTOH, it may actually speed things up in certain join cases (e.g., if you were doing self-joins with your all-encompassing table).
If you're working with a lot of data, splitting into multiple tables can greatly ease backups.
Splitting into multiple tables may also make it easier to shard your data across multiple servers for performance reasons. Similarly, it may make replication setups easier to keep running.
If you're tracking data that's category-specific, separate tables enables you to better normalize your database and likely reap some nice performance as a result of using much smaller tables.
Splitting obviously means modifying your code. If your code is of the old, creaky type, you may very well achieve a performance gain from the clean-up. Of course, there's also the risk that you'll break something....
Check your indexes. Bad indexes are a very common cause of poor performance but are relatively easy to fix with a bit of quality time spent on self-education. MySQL's EXPLAIN can tell you whether your queries are using the indexes, and the index stats (look in the docs) can tell you how efficiently your indexes are working.
Finally, speaking of code, check yours. Try experimenting with a few approaches, regardless of how the database is set up. For example, it may be quicker to do a couple of separate queries and join the results in code than to do the join in the database. Likewise, it's often quicker to do things like sorts in code, particularly in cases where a join or something means the database would have to create a temporary file/table. Again, check the EXPLAIN output, and if you can't eliminate a problem area in your queries, see if it helps to simplify the queries and do more work in the code. This can be particularly beneficial in the common case where the web server has more resources to spare than the database server.
There are many more factors to consider. Ultimately, though, the best way to make these decisions is not to spend time pondering theories but to put both methods to the test. Create some test databases and benchmark the sort of queries you'd run most often, with and without simulated load. You'll get your answer.

if you are using php try something like
$query = mysql_query($sql);
while($row = mysql_fetch_assoc($query)){
$tempvalue[]=$row;
}
and then to loop the info use for like sentence
foreach($tempvalue as $key => $value){
write the table .....
}
maybe mysql isnt slow and the problem is in the code
test dont kill anyone =)

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.