I am currently building a web application that will be using two databases at the same time. The user first creates a general account and afterwards uses it to create multiple characters on multiple 'servers'. (think of it as creating an account on Facebook and then using it to sign up on other websites).
My current database setup is the following:
main_english
main_russian
server1
server2
server3
The main_english and main_russian databases contain the session and general account data for users who've registered in the corresponding language.
The server1 and server2 databases handle the created character data of the English users, while server3 handles the Russian ones.
Users will be updating tables every 5-10 seconds or so. My question is: how can I efficiently handle working with several databases?
So far I've come up with several options:
1. Opening two DB connections, one for main_ and another for server, though I've read that opening them is fairly expensive.
2. Modifying all my queries to explicitly state which DB to update/select data from (the project is still young, so this wouldn't be that painful).
3. Modifying my $db class to call select_db() with each query, which, in my opinion, is both messy and inefficient.
What could you suggest? Perhaps I am overreacting about opening a second connection for the server queries?
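To make option 1 concrete, here is a rough sketch (host, credentials, class and table names are all invented for illustration) of a tiny wrapper that simply holds one PDO handle for the main_ database and one for the server database, so each query naturally goes to the right place without any select_db() calls:

    <?php
    // Hypothetical wrapper around two PDO handles: one for account/session data,
    // one for the character data of the game server the player is on.
    class GameDb
    {
        public $main;
        public $server;

        public function __construct($mainDb, $serverDb)
        {
            $opts = [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION];
            $this->main   = new PDO("mysql:host=localhost;dbname=$mainDb;charset=utf8", 'user', 'pass', $opts);
            $this->server = new PDO("mysql:host=localhost;dbname=$serverDb;charset=utf8", 'user', 'pass', $opts);
        }
    }

    $accountId = 42; // placeholder
    $db = new GameDb('main_english', 'server1');
    $db->main->prepare('UPDATE accounts SET last_seen = NOW() WHERE id = ?')->execute([$accountId]);
    $db->server->prepare('UPDATE characters SET gold = gold + 10 WHERE account_id = ?')->execute([$accountId]);

Whether the cost of that second connection is worth worrying about is addressed in the answer below.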
MySQL is known to have fast connection handling, so that's unlikely to be a bottleneck. That being said, you could look into persistent connections in the PDO or mysqli extensions.
In terms of the server access and data modeling, it's difficult to say without knowing more about your application. How do DB slaves factor in? Is replication lag OK?
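If you do try persistent connections, it's only a flag at connect time. A minimal sketch with PDO (host and credentials are placeholders):

    <?php
    // PDO::ATTR_PERSISTENT makes PHP reuse an already-open connection with the
    // same DSN/credentials instead of opening a fresh one on every request.
    $main = new PDO('mysql:host=localhost;dbname=main_english;charset=utf8', 'user', 'pass', [
        PDO::ATTR_PERSISTENT => true,
        PDO::ATTR_ERRMODE    => PDO::ERRMODE_EXCEPTION,
    ]);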
Related
I want to hear what others think about this. Currently, I make a MySQL database connection inside a header-type file that is then included at the top of every page of my site. I can then run as many queries as I want on that one open connection. If the page is built from 6 included files and there are 15 different MySQL queries, they all run on this one connection.
Now I sometimes see classes that make multiple connections, like one for each query.
Is there any benefit to using one method over the other? I think one connection is better than multiple, but I could be wrong.
Creating connections can be expensive (I didn't have a reference for this statement at first; edit: here it is), so the consensus seems to be to use fewer connections. Using a single connection for all queries on a single page seems to be a better choice than multiple connections.
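As a sketch of the single-connection-per-page approach described above (file and table names are hypothetical), the connection is created once in the include and every other file reuses the same handle:

    <?php
    // --- db.php: included at the top of every page ---
    // require_once ensures this runs only once, even if several files pull it in.
    $pdo = new PDO('mysql:host=localhost;dbname=mysite;charset=utf8', 'user', 'pass', [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);

    // --- any included page fragment ---
    // require_once 'db.php';
    $posts = $pdo->query('SELECT id, title FROM posts LIMIT 10')->fetchAll(PDO::FETCH_ASSOC);
    $count = $pdo->query('SELECT COUNT(*) FROM comments')->fetchColumn();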
In PHP+MySQL there is usually not much sense in using multiple connections per page (it's just slower and consumes a little more RAM).
The only case where it might be useful is when you alter connection parameters in a way that could interfere with other pages (like the collation), but well-written PHP programs usually never do that kind of thing.
Also, it is a good idea to enable persistent connections, so that one MySQL connection is reused across multiple page executions.
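With mysqli, a persistent connection is requested by prefixing the host with "p:" (credentials below are placeholders):

    <?php
    // The "p:" prefix tells mysqli to reuse a pooled connection when one is available.
    $db = new mysqli('p:localhost', 'user', 'pass', 'mysite');
    if ($db->connect_error) {
        die('Connect failed: ' . $db->connect_error);
    }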
It really depends on the level of activity you expect the site to generate: if it's a high-traffic web site, you'll soon run out of connections (unless you raise MySQL's max_connections to a stupidly high level, but that will eventually grind the server to a halt).
I'd generally recommend that the front end of a web site use a shared database object (the singleton is your friend), as it doesn't take a great deal of discipline to write with this in mind and you won't waste time making connections. If you need additional concurrent queries on the back end, it shouldn't be that big a deal, as that isn't likely to be a highly trafficked area.
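A minimal sketch of such a shared database object (the class name and credentials are invented): the first caller creates the connection, everyone after that gets the same handle back.

    <?php
    // Lazily creates one PDO instance and returns the same one on every call.
    class Database
    {
        private static $pdo = null;

        public static function get()
        {
            if (self::$pdo === null) {
                self::$pdo = new PDO(
                    'mysql:host=localhost;dbname=mysite;charset=utf8',
                    'user',
                    'pass',
                    [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]
                );
            }
            return self::$pdo;
        }
    }

    // Anywhere in the front end:
    $stmt = Database::get()->prepare('SELECT name FROM users WHERE id = ?');
    $stmt->execute([42]);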
It's not recommended to execute multiple small queries where the work can be done with just one query: you can use a single query to get data from multiple tables, and even from multiple databases. See the link below:
http://www.x-developer.com/php-scripts/sql-connecting-multiple-databases-in-a-single-query
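The idea behind the link, in sketch form: when both databases live on the same MySQL server and the connecting user has rights on both (assumptions here, as are all the names), a single query can join across them simply by prefixing table names with the database name.

    <?php
    // One connection, one query, two databases ("cms" and "app" are made-up names).
    $pdo = new PDO('mysql:host=localhost;charset=utf8', 'user', 'pass', [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
    $stmt = $pdo->prepare(
        'SELECT u.username, o.total
           FROM cms.users AS u
           JOIN app.orders AS o ON o.user_id = u.id
          WHERE o.created_at > ?'
    );
    $stmt->execute(['2015-01-01']);
    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);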
I don't see any benefit in using multiple connections; I'd rather think it is a sign of bad structure. These are the reasons I can think of against using multiple connections:
You have to initialize the database multiple times. Setting connection properties upon connection establishment (like SET NAMES utf8) would have to be done for each connection.
It is definitely slower than a single connection.
A non-technical reason: someone working with your code will most probably not expect it and might spend hours debugging connection properties that were actually set on another connection.
Having a global connection object (or a class providing one) is a much better approach in PHP.
Are you sure the classes that make multiple connections aren't just returning a reference to the already open connection when one is open? I've seen a lot of stuff structured that way. It really is better performance-wise to use only one connection per page.
I'm writing a program that runs (24/7) on a Linux server and adds entries to a MySQL database.
The contents of the database are presented on a web interface with PHP and the user should be able to delete entries using the web interface.
Is it possible to access the database from multiple processes at the same time?
Yes, databases are designed for this purpose quite well. You'll want to keep a few things in mind in your designs:
Concurrency and race conditions on database writes.
Performance.
Separate database permissions for separate applications.
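On the last point, a sketch of what that can look like (account names and grants are hypothetical): each application gets its own MySQL account, limited to what it actually needs.

    <?php
    // Hypothetical privileges, created once by an admin in MySQL:
    //   GRANT SELECT, INSERT ON mydb.* TO 'collector'@'localhost';  -- the 24/7 process
    //   GRANT SELECT, DELETE ON mydb.* TO 'webui'@'localhost';      -- the PHP front end

    // The web interface connects with its own limited account, so it can list
    // and delete entries but can never, say, drop tables.
    $web = new PDO('mysql:host=localhost;dbname=mydb;charset=utf8', 'webui', 'secret', [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
    $web->prepare('DELETE FROM entries WHERE id = ?')->execute([7]);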
Unless you're doing something like accessing the DB through a singleton, the maximum number of simultaneous MySQL connections PHP will use is limited in your php.ini. I believe it defaults to 100.
Yes, multiple users can access the database at the same time.
You should, however, take care that the data stays consistent.
If you create or edit an entry with many small SQL statements and someone uses the web interface in the meantime, this may lead to errors.
If you have a simple DB this should not be a problem; otherwise you should consider using transactions.
http://dev.mysql.com/doc/refman/5.0/en/ansi-diff-transactions.html
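For example, with PDO (and assuming the tables use InnoDB, since MyISAM ignores transactions; table names are made up), a multi-statement edit can be wrapped so the web interface never sees it half-done:

    <?php
    $pdo = new PDO('mysql:host=localhost;dbname=mydb;charset=utf8', 'user', 'pass', [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);

    // Either both statements take effect, or neither does.
    $pdo->beginTransaction();
    try {
        $pdo->prepare('INSERT INTO entries (title, body) VALUES (?, ?)')
            ->execute(['hello', 'world']);
        $pdo->prepare('UPDATE stats SET entry_count = entry_count + 1')->execute();
        $pdo->commit();
    } catch (Exception $e) {
        $pdo->rollBack(); // undo the partial work so other readers see consistent data
        throw $e;
    }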
Yes, and there will not be any problems deleting records while that automated 24/7 program is running, provided you are using the InnoDB engine. InnoDB isolates concurrent transactions from one another (conflicting changes are serialized with row locks), so the database stays consistent at all times.
This answer How to implement the ACID model for a database has many relevant points.
Read about the ACID properties of a database. A MySQL database with the InnoDB engine will take care of all of this for you, so you need not worry about it.
Many database libraries come set up for multiple database connections, but I've never actually known of a scripting application that needed to connect to two databases during its run. (Compiled, daemon-running languages are a different matter.)
I understand having database slaves so that you can spread the load out, but usually on startup only one of them is chosen to handle that script's needs.
So why would a PHP or Ruby application need to connect to more than one database? Or rather, why would you split your data up among several databases?
The only thing I can think of is bad design from a slowly evolving system that started off in multiple separate parts.
Are you talking about different physical database servers or different databases in the "schema" sense?
Regarding physical servers: if you're using MySQL replication, you might write to a master and always read from a slave. This helps split the load across the databases.
The simple answer is "scalability".
The ready availability of replication and clustering in a number of database products makes multiple database use a definite 'this must be possible'. Any decent ORM should know how to connect to multiple databases as required.
But even when the main application doesn't connect to more than one, there will often be other needs that do. Report generation, either scripted or ad hoc, often involves queries that run for a long time. These are best run on database replicants dedicated to (and configured for) these queries so they don't disrupt the main application.
Another good use is a type of scripted processing. Many apps will have a regular process that needs to rummage through a large part of the database. Whilst updates obviously have to go to the master, the big read queries can be run off a replicant.
Of course, the obvious need is simple performance. I oversaw a web app and database that grew from surviving comfortably on one MySQL database on a 32-bit dual-core machine with 3GB of RAM to needing two 8-core 64-bit servers with 8GB. Once it reached this stage, it relied on the database handler directing traffic to both servers. We had a window of about 50 minutes in a day where it could survive on just one database.
I have a Ruby application that connects to multiple databases. One database contains user login credentials (which is shared between several other projects). Another database contains archived data that my application tracks and compares (that only my application accesses). Another database contains data regarding physical machine resources which my application uses to generate new data (these resources are used by several different applications). By splitting the data into multiple databases, different applications only access the data that they need to be accessing.
It is all too frequently the case that some of the data you need is stored in The Wrong Database. Sometimes it's personnel records in a PeopleSoft (Oracle) database. Maybe it's Enterprise CRM data on Informix. Or some departmental database stored in MS SQL Server. Whatever it is, it's in a different database, but you still need access (hopefully read-only).
Unless your primary database is magic-based, it isn't going to be able to provide you with remote table access for every other database out there. (Most will only provide remote access to other databases of the same type, eg: MySQL->MySQL.) When that all too frequent situation occurs, you'll have no other option but to have multiple database connections, and be glad that your framework supports it.
I have a site that connects to two databases. One powers the website content (the CMS DB); the other drives a web application that runs within the site (large amounts of non-CMS data). In fact, the latter uses replication.
I don't feel that's bad design. If one set of data has no relation to the other, then it makes sense even from a pure organization perspective to house it in a separate DB. Otherwise, people would just put all their tables in one DB.
For added security, I always create two accounts for every database: a read-only account (good for SELECT) and a read-write account (for SELECT, UPDATE, INSERT, DELETE and whatever else I might need). On some pages, I may need to use both accounts, thus I will consume two connections for only one database.
Well, reading from one and writing to another is a very common use case. It's easy and fun to write a data access layer that reads from one connection (reading from the slave), and writes to another (the master). A single script might make multiple reads before writing -- perhaps some lookups are necessary for validation, for instance.
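A bare-bones sketch of such a layer (host names, credentials and table names are placeholders): reads go to the slave connection, writes to the master.

    <?php
    // Minimal read/write split: SELECTs hit the replica, everything else hits the master.
    class ReadWriteDb
    {
        private $master;
        private $slave;

        public function __construct()
        {
            $opts = [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION];
            $this->master = new PDO('mysql:host=db-master;dbname=app;charset=utf8', 'user', 'pass', $opts);
            $this->slave  = new PDO('mysql:host=db-slave;dbname=app;charset=utf8', 'user', 'pass', $opts);
        }

        public function read($sql, array $params = [])
        {
            $stmt = $this->slave->prepare($sql);
            $stmt->execute($params);
            return $stmt->fetchAll(PDO::FETCH_ASSOC);
        }

        public function write($sql, array $params = [])
        {
            $stmt = $this->master->prepare($sql);
            $stmt->execute($params);
            return $stmt->rowCount();
        }
    }

    $db = new ReadWriteDb();
    $rows = $db->read('SELECT id, email FROM users WHERE active = 1');   // lookups for validation
    $db->write('UPDATE users SET last_seen = NOW() WHERE id = ?', [42]); // then the write

Keep replication lag in mind with this pattern: a row written to the master may take a moment to show up on the slave.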
Scripting languages are also frequently used for integration. You might have two off-the-shelf codebases, both of which want to maintain their own database. Your integration code might want to talk to both of them.
You can usually design your way out of using more than one connection, but in general I don't see anything fundamentally wrong with using connections to more than one database.
Other reasons to have multiple databases: we have one application that everyone can access. We also have client databases that are very different from client to client. It is easier to maintain the application that all clients use (and which is maintained by a different team) if the client-specific data is separated out into its own databases. It is also easier to move a client to a new server when they become a large enterprise client, rather than one of the smaller clients who run on a server with many other clients.
Further, there are types of data that are transactional and need to be in databases set to full recovery mode with full transaction logging. Other data is only populated from imports and does not need transaction logging, which might slow down the system as the log grows large enough to handle a 10,000,000-record import. These are often split out into a separate database so they can be in simple recovery mode; it is not necessary to recover that data from the transaction log if there is a problem, since it can easily be recovered by re-running the import.
Then data is split out into data warehouses, which are optimized for reporting rather than transactions. Again, these reporting databases are usually separate databases (often on separate servers).
Then you have the databases for multiple different COTS applications (we have accounting databases, credit card transaction processing databases, HR databases, and our project management database). A particular website might need to access more than one of these, or transfer information from one to the other. Believe me, vendors won't let you copy their database structure into one database to rule them all.
We have several hundred databases here on many different servers.
Currently I have two websites:
1. A website connected to a MySQL database in Host A.
2. A website connected to an MS Access database in Host B.
Is there any way that if I update the database in Host B, the database in Host A can be updated automatically?
Thank you. Really appreciate your help.
Two options: (1) At the database level, with what's commonly called ETL (Extract, Transform and Load). In the Microsoft world you'd use SSIS (which comes as part of MS SQL Server) to move data around. This would be a common approach within an enterprise, particularly if you have a lot of control over the environment.
(2) Some sort of "service"-based approach: maybe you provide some sort of interface (like a web service) so that one application can call the other. The issue with that is that you need to build it into the application, but you seem to be after a database-driven solution(?)
Have a think about what you're trying to do and who should be responsible for that - are you sure it's the database?
Regarding your specific technology: I'm not sure about MySQL, as I haven't used it much myself; I don't know of any 'easy' way to have MySQL and Access talk to each other, so you may have to write something.
The data you're exchanging: how much is there, and how often does it change? How timely does it need to be (can one poll the other every hour, or does it need to be 'real-time')?
You could consider using a (new) third system that brokers communication between the two databases, so that they can remain ignorant of each other and of the need to update.
Is it likely you'll have a third database to update later (or a 4th, etc.)?
Can you change the database platform to something that is common across both/all sites, and which has some sort of messaging/updating system built in?
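If a simple polling broker turns out to be acceptable, here is a rough sketch of what it could look like from PHP, run from cron every hour or so. This is heavily assumption-laden: it presumes the Access file is reachable through an ODBC driver from wherever the script runs, that the copied table has an auto-incrementing id, and every name below is invented.

    <?php
    // Polling broker: copy rows that were added to the Access database (Host B)
    // into the MySQL database (Host A).
    $access = new PDO('odbc:Driver={Microsoft Access Driver (*.mdb, *.accdb)};Dbq=C:\\data\\siteB.accdb');
    $mysql  = new PDO('mysql:host=hostA.example.com;dbname=siteA;charset=utf8', 'user', 'pass', [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);

    // Highest source id we have already copied.
    $lastId = (int) $mysql->query('SELECT COALESCE(MAX(source_id), 0) FROM mirrored_orders')->fetchColumn();

    // Pull anything newer from Access and insert it into MySQL.
    $src = $access->prepare('SELECT id, customer, total FROM orders WHERE id > ?');
    $src->execute([$lastId]);

    $ins = $mysql->prepare('INSERT INTO mirrored_orders (source_id, customer, total) VALUES (?, ?, ?)');
    foreach ($src->fetchAll(PDO::FETCH_ASSOC) as $row) {
        $ins->execute([$row['id'], $row['customer'], $row['total']]);
    }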