Most performant solution for multiple databases in MySQL and PHP

I have four databases with different tables, let's call the databases admindata, userdata, pagedata and mediadata.
At the moment I use one instance of mysqli for all databases:
$mysqli = new mysqli(...);
I keep this instance open until the end of the page load. I was not sure about the performance, so when I had multiple queries in one script I switched databases like this:
$mysqli->query('SELECT fields FROM database1.table1');
$mysqli->query('SELECT fields FROM database2.table2');
instead of:
$mysqli->select_db('database1');
$mysqli->query('SELECT fields FROM table1');
$mysqli->select_db('database2');
$mysqli->query('SELECT fields FROM table2');
Which of those two is better for performance? Or is it even better to hold one instance of mysqli for every database, like this:
$mysqli->query('SELECT fields FROM table1');
$mysqli2 = new mysqli(...);
$mysqli2->query('SELECT fields FROM table2');
EDIT:
On php.net someone wrote two years ago:
In some situations it's useful to use this function for changing databases in general. We've tested it in a production environment and it seems to be faster to switch databases than to create new connections.
http://ch2.php.net/mysqli_select_db
It would be nice to hear more about it.

I was taught that it's better to hold a separate connection per database / per user, not just as a matter of security but also for the flexibility of your code. In the future, if you would like to start using an ORM, it will be very difficult to rewrite code that uses a single connection for several databases. Separate connections may also help with debugging database problems.
On the other hand, there's the potential problem of having many connections per request. In the project I'm leading we do use separate connections, but we make a point of closing them as soon as they are no longer needed, along the lines of the sketch below.
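Roughly, that per-database pattern looks like this (a minimal sketch using the asker's database names; hosts and credentials are made up):
// One mysqli instance per database, closed as soon as it is done with
$admin = new mysqli('localhost', 'admin_user', 'secret', 'admindata');
$result = $admin->query('SELECT fields FROM table1');
while ($row = $result->fetch_assoc()) {
    // ... use the admindata rows ...
}
$admin->close(); // release the connection early

$media = new mysqli('localhost', 'media_user', 'secret', 'mediadata');
// ... run the mediadata queries the same way ...
$media->close();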
I think there's no universal solution; it all depends on your own needs and requirements.


use only one $dbh for several databases?

If you have two databases on the same host, one called blog and one called forum, it seems like you can access both using only one database handle (in PDO)?
$dbh=new PDO("mysql:host=$dbHost;dbname=blog", $dbUser, $dbPassword);
This handle is for the blog database, although you can also perform operations on forum using the same $dbh if you qualify the table name, e.g.
SELECT forum.tableName.fieldName
My questions are:
Is the only reason you have to specify dbname in $dbh that it lets you omit the blog.tableName.fieldName prefix?
Since my website has two databases, would there be any pros or cons of only using one database handle, rather than creating two handles (obviously one for blog and one for forum)? Possible performance difference?
Does creating a database handle consume any server resources?
It is usually good practice to keep a database-specific user for any app you make; I would go as far as calling it a necessity. That is why specifying the database name in the connection matters. (Hint for the reason behind this: what if someone somehow got your DBMS password for one table?)
I am not very good at this, but I do not think it's a good idea to keep two separate databases when one will do. In your case you are not using master-slave replication or anything like that, so unless you have some physical limit you are trying to work around, merge them into one database (use prefixes for table names to avoid name collisions).
The reason for the previous point follows from this one. Keeping one user per database (some people even keep two, for strange and to some extent justifiable reasons) is a safety measure you should follow. Multiple users mean multiple connections, which means on every page load you will be connecting to the DBMS twice: simple math, 2x load (yes, it eats resources; every single line of code does). To simplify, think of a man who walks to the grocery store for everything you ask for and gets only one thing at a time: if you send him to two different grocery stores, he will need twice the time and energy to do the same work.
Yes, you can omit it, or switch by running USE databasename;.
Use one handle; it seems a waste to make double the connections.
Yes, hence (2).

In PHP/MySQL should I open multiple database connections or share 1?

I want to hear what others think about this. Currently, I make a MySQL database connection inside a header-type file that is then included at the top of every page of my site. I can then run as many queries as I want on that one open connection. If the page is built from 6 included files and there are 15 different MySQL queries, they all run on this one connection.
Now sometimes I see classes that make multiple connections, like 1 for each query.
Is there any benefit to using one method over the other? I think one connection is better than multiple, but I could be wrong.
Creating connections can be expensive (I don't have a reference for this statement as yet; Edit: Aha! Here it is), so it seems the consensus is to use fewer connections. Using a single connection for all queries on a single page seems to be a better choice than multiple connections.
In PHP+MySQL there is usually not much sense in using multiple connections per page (it is just slower and consumes a little more RAM).
The only case where it might be useful is when you alter connection parameters that might interfere with other pages (like collation). But good PHP programs usually never do that kind of thing.
Also, it is a good idea to enable persistent connections, so that one MySQL connection is reused across multiple page executions, as sketched below.
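A minimal sketch of persistent connections with both APIs (host and credentials are placeholders); PHP keeps the underlying connection alive in the worker process between requests:
// PDO: pass the ATTR_PERSISTENT flag at construction time
$pdo = new PDO('mysql:host=localhost;dbname=blog', 'user', 'pass',
               array(PDO::ATTR_PERSISTENT => true));

// mysqli: prefix the host name with "p:"
$mysqli = new mysqli('p:localhost', 'user', 'pass', 'blog');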
It really depends on the level of activity you expect the site to generate: if it's a high-traffic web site, you'll soon run out of connections (unless you adjust MySQL's max connections to a stupidly high level, but that'll eventually grind the server to a halt).
I'd generally recommend that the front end of a web site use a shared database object (the singleton is your friend; see the sketch below), as it doesn't require a great deal of discipline to write with this in mind and you won't waste time making connections. If you require additional concurrent queries on the back end, it shouldn't be that big a deal, as this isn't likely to be a highly trafficked area.
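For illustration, a minimal sketch of such a singleton (class name and credentials are hypothetical):
class Database
{
    private static $instance = null;

    // Every caller gets the same underlying mysqli connection
    public static function get()
    {
        if (self::$instance === null) {
            self::$instance = new mysqli('localhost', 'user', 'pass', 'mydb');
        }
        return self::$instance;
    }
}

// Anywhere in the page:
$db = Database::get();
$db->query('SELECT 1');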
It's not recommended to execute multiple small queries where the work can be done using just one query; you can use a single query to get data from multiple tables and even multiple databases. See the link below:
http://www.x-developer.com/php-scripts/sql-connecting-multiple-databases-in-a-single-query
I don't see any benefit in using multiple connections; I'd rather think it is a sign of bad structure. These are the reasons I can think of against using multiple connections:
You have to initialize the database multiple times. Connection properties set upon connection establishment (like SET NAMES utf8) would have to be set in multiple places.
It is definitely slower than a single connection.
A non-technical reason: someone working on your code will most probably not expect it and might spend hours debugging connection properties that were set on another connection.
Having a global connection object (or a class providing one) is a much better approach in PHP.
Are you sure the classes that make multiple connections aren't just returning a reference to the already open connection when one is open? I've seen a lot of stuff structured that way. It really is better performance-wise to use only one connection per page.

How quick is switching DBs with PHP + MySQL?

I'm wondering how slow it's going to be to switch between two databases on every call of every page of a site. The site has many different databases for different clients, along with a "global" database that is used for some general settings. I'm wondering if much time would be added to the execution of each script if it has to connect to the database, select a DB, do a query or two, switch to another DB, and then complete the page generation. I could also have the data repeated in each DB; I'd just need to maintain it (it will only change when upgrading).
So, in the end, how fast is mysql_select_db()?
Edit: Yes, I could connect to each DB separately, but as this is often the slowest part of any PHP script, I'd like to avoid this, especially since it's on every page. (It's slow because PHP has to do some kind of address resolution (be it an IP or host name) and then MySQL has to check the login parameters both times.)
Assuming that both databases are on the same server, you don't need to call mysql_select_db at all; you can just specify the database in the queries. For example:
SELECT * FROM db1.table1;
You could also open two connections, keep the link identifiers returned by the connect calls, select a database on each, and pass the right link into each call (a sketch follows). The link identifier is an optional parameter on all of the mysql_* calls; just check the docs.
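Something like this, with the old mysql_* API (credentials are placeholders; the fourth argument to mysql_connect forces a genuinely new link):
// Two separate links; new_link = true stops PHP from reusing the first
$global = mysql_connect('localhost', 'user', 'pass');
$client = mysql_connect('localhost', 'user', 'pass', true);

mysql_select_db('globaldb', $global);
mysql_select_db('clientdb', $client);

// Pass the link explicitly so each query hits the right database
$settings = mysql_query('SELECT * FROM settings', $global);
$orders   = mysql_query('SELECT * FROM orders', $client);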
You're asking two quite different questions.
Connecting to multiple database instances
Switching default database schemas.
MySQL is known to have quite fast connection setup time; making two mysql_connect() calls to different servers is barely more expensive than one.
The call mysql_select_db() is exactly the same as the USE statement and simply changes the default database schema for unqualified table references.
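In other words, these two calls have the same effect ($link being an already-open connection):
mysql_select_db('blog', $link);   // the API call ...
mysql_query('USE blog', $link);   // ... and the equivalent SQL statement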
Be careful with your use of the term 'database' around MySQL: it has two different meanings.

How do you manage SQL Queries

At the moment my code (PHP) has too many SQL queries in it. For example:
// not a real example, but you get the idea...
$results = $db->GetResults("SELECT * FROM sometable WHERE iUser=$userid");
if ($results) {
    // Do something
}
I am looking into using stored procedures to reduce this and make things a little more robust, but I have some concerns:
I have hundreds of different queries in use around the web site, and many of them are quite similar. How should I manage all these queries when they are removed from their context (the code that uses the results) and placed in a stored procedure on the database?
The best course of action for you will depend on how you are approaching your data access. There are three approaches you can take:
Use stored procedures
Keep the queries in the code (but put all your queries into functions and fix everything to use PDO for parameters, as mentioned earlier)
Use an ORM tool
If you want to keep passing your own raw SQL to the database engine, stored procedures are the way to go when all you want is to get the raw SQL out of your PHP code while keeping it relatively unchanged. The stored procedures vs raw SQL debate is a bit of a holy war, but K. Scott Allen makes an excellent point - albeit a throwaway one - in an article about versioning databases:
Secondly, stored procedures have fallen out of favor in my eyes. I came from the WinDNA school of indoctrination that said stored procedures should be used all the time. Today, I see stored procedures as an API layer for the database. This is good if you need an API layer at the database level, but I see lots of applications incurring the overhead of creating and maintaining an extra API layer they don't need. In those applications stored procedures are more of a burden than a benefit.
I tend to lean towards not using stored procedures. I've worked on projects where the DB has an API exposed through stored procedures, but stored procedures can impose some limitations of their own, and those projects have all, to varying degrees, used dynamically generated raw SQL in code to access the DB.
Having an API layer on the DB gives better delineation of responsibilities between the DB team and the Dev team, at the expense of some of the flexibility you'd have if the query were kept in the code; however, PHP projects are less likely to have teams sizable enough to benefit from this delineation.
Conceptually, you should probably have your database versioned. Practically speaking, however, you're far more likely to have just your code versioned than you are to have your database versioned. You are likely to be changing your queries when you are making changes to your code, but if you are changing the queries in stored procedures stored against the database then you probably won't be checking those in when you check the code in and you lose many of the benefits of versioning for a significant area of your application.
Regardless of whether or not you elect not to use stored procedures though, you should at the very least ensure that each database operation is stored in an independent function rather than being embedded into each of your page's scripts - essentially an API layer for your DB which is maintained and versioned with your code. If you're using stored procedures, this will effectively mean you have two API layers for your DB, one with the code and one with the DB, which you may feel unnecessarily complicates things if your project does not have separate teams. I certainly do.
If the issue is one of code neatness, there are ways to make code with SQL jammed in it more presentable, and the UserManager class shown below is a good way to start - the class only contains queries which relate to the 'user' table, each query has its own method in the class and the queries are indented into the prepare statements and formatted as you would format them in a stored procedure.
// UserManager.php:
class UserManager
{
    function getUsers()
    {
        $pdo = new PDO(...);
        $stmt = $pdo->prepare('
            SELECT u.userId AS id,
                   u.userName,
                   g.groupId,
                   g.groupName
            FROM `user` u
            INNER JOIN `group` g
                ON u.groupId = g.groupId
            ORDER BY u.userName, g.groupName
        ');
        // iterate over the result and prepare the return value
        // (note: `group` is a reserved word in MySQL, hence the backticks)
    }

    function getUser($id) {
        // db code here
    }
}
// index.php:
require_once("UserManager.php");
$um = new UserManager;
$users = $um->getUsers();
foreach ($users as $user) echo $user['userName'];
However, if your queries are quite similar but you have huge numbers of permutations in your query conditions like complicated paging, sorting, filtering, etc, an Object/Relational mapper tool is probably the way to go, although the process of overhauling your existing code to make use of the tool could be quite complicated.
If you decide to investigate ORM tools, you should look at Propel, the ActiveRecord component of Yii, or the king-daddy PHP ORM, Doctrine. Each of these gives you the ability to programmatically build queries to your database with all manner of complicated logic. Doctrine is the most fully featured, allowing you to template your database with things like the Nested Set tree pattern out of the box.
In terms of performance, stored procedures are the fastest, but generally not by much over raw sql. ORM tools can have a significant performance impact in a number of ways - inefficient or redundant querying, huge file IO while loading the ORM libraries on each request, dynamic SQL generation on each query... all of these things can have an impact, but the use of an ORM tool can drastically increase the power available to you with a much smaller amount of code than creating your own DB layer with manual queries.
Gary Richardson is absolutely right, though: if you're going to continue to use SQL in your code, you should always use PDO's prepared statements to handle the parameters, regardless of whether you're using a query or a stored procedure. The sanitisation of input is performed for you by PDO.
// optional
$attrs = array(PDO::ATTR_PERSISTENT => true);
// create the PDO object
$pdo = new PDO("mysql:host=localhost;dbname=test", "user", "pass", $attrs);
// also optional, but it makes PDO raise exceptions instead of
// PHP errors which are far more useful for debugging
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$stmt = $pdo->prepare('INSERT INTO venue(venueName, regionId) VALUES(:venueName, :regionId)');
$stmt->bindValue(":venueName", "test");
$stmt->bindValue(":regionId", 1);
$stmt->execute();
$lastInsertId = $pdo->lastInsertId();
var_dump($lastInsertId);
Caveat: assuming that the ID is 1, the above script will output string(1) "1". PDO->lastInsertId() returns the ID as a string regardless of whether the actual column is an integer or not. This will probably never be a problem for you as PHP performs casting of strings to integers automatically.
The following will output bool(true):
// regular equality test
var_dump($lastInsertId == 1);
but if you have code that is expecting the value to be an integer, like is_int or PHP's "is really, truly, 100% equal to" operator:
var_dump(is_int($lastInsertId));
var_dump($lastInsertId === 1);
you could run into some issues.
Edit: Some good discussion on stored procedures here
First up, you should use placeholders in your query instead of interpolating the variables directly. PDO/MySQLi allow you to write your queries like:
SELECT * FROM sometable WHERE iUser = ?
The API will safely substitute the values into the query.
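For example, a minimal PDO version of the original query (connection details are assumed):
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');
$stmt = $pdo->prepare('SELECT * FROM sometable WHERE iUser = ?');
$stmt->execute(array($userid)); // the value is bound safely, never interpolated
$results = $stmt->fetchAll(PDO::FETCH_ASSOC);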
I also prefer to have my queries in the code instead of the database. It's a lot easier to work with an RCS when the queries are with your code.
I have a rule of thumb when working with ORMs: if I'm working with one entity at a time, I'll use the interface. If I'm reporting on or working with records in aggregate, I typically write SQL queries to do it. This means there are very few queries in my code.
I had to clean up a project with many (duplicate/similar) queries riddled with injection vulnerabilities.
The first steps I took were using placeholders and labelling every query with the object/method and source line where the query was created (inserting the PHP constants __METHOD__ and __LINE__ into a SQL comment line).
It looked something like this:
-- #Line:151 UserClass::getuser():
SELECT * FROM USERS;
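A sketch of how such a tag can be built ($db is assumed to be a PDO or mysqli instance; the class and query are made up):
class UserClass
{
    function getuser($db, $id)
    {
        // __METHOD__ and __LINE__ are expanded by PHP itself, so the
        // comment line in the server's query log points back to the source
        $sql = '-- #Line:' . __LINE__ . ' ' . __METHOD__ . "():\n"
             . 'SELECT * FROM USERS WHERE id = ' . (int) $id;
        return $db->query($sql);
    }
}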
Logging all queries for a short time supplied me with some starting points on which queries to merge. (And where!)
I'd move all the SQL to a separate Perl module (.pm). Many queries could reuse the same functions, with slightly different parameters.
A common mistake for developers is to dive into ORM libraries, parametrized queries and stored procedures. We then work for months in a row to make the code "better", but it's only "better" in a development kind of way. You're not making any new features!
Use complexity in your code only to address customer needs.
Use an ORM package; any half-decent package will allow you to:
Get simple result sets
Keep your complex SQL close to the data model
If you have very complex SQL, views are also nice for making it more presentable to different layers of your application.
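For instance, a hairy join can be hidden behind a view once, and every layer above reads it like an ordinary table (a sketch; all table and column names are invented):
// Run once, e.g. from a deployment script
$pdo->exec('
    CREATE VIEW active_user_summary AS
    SELECT u.userId, u.userName, COUNT(o.orderId) AS orderCount
    FROM `user` u
    LEFT JOIN orders o ON o.userId = u.userId
    WHERE u.active = 1
    GROUP BY u.userId, u.userName
');

// The application layers then stay simple:
$rows = $pdo->query('SELECT * FROM active_user_summary')->fetchAll(PDO::FETCH_ASSOC);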
We were in a similar predicament at one time. We queried a specific table in a variety of ways, more than 50 in all.
What we ended up doing was creating a single Fetch stored procedure that takes a parameter value for the WhereClause. The WhereClause was constructed in a Provider object; we employed the Facade design pattern so we could scrub it for any SQL injection attacks.
So as far as maintenance goes, it is easy to modify. SQL Server is also quite the chum and caches the execution plans of dynamic queries, so the overall performance is pretty good.
You'll have to determine the performance drawbacks based on your own system and needs, but all in all, this works very well for us.
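The original was on SQL Server, but the Provider/Facade idea translates to PHP roughly like this (class, filter names and columns are invented): callers pick from whitelisted filters, so the WHERE clause is assembled only from known-safe fragments.
class UserProvider
{
    // Only these fragments can ever reach the WHERE clause
    private static $filters = array(
        'active'   => 'active = 1',
        'byRegion' => 'regionId = ?',
    );

    public function fetch(PDO $pdo, array $filterNames, array $params = array())
    {
        $clauses = array();
        foreach ($filterNames as $name) {
            if (!isset(self::$filters[$name])) {
                throw new InvalidArgumentException("Unknown filter: $name");
            }
            $clauses[] = self::$filters[$name];
        }
        $where = $clauses ? ' WHERE ' . implode(' AND ', $clauses) : '';
        $stmt = $pdo->prepare('SELECT * FROM user' . $where);
        $stmt->execute($params);
        return $stmt->fetchAll(PDO::FETCH_ASSOC);
    }
}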
There are some libraries, such as MDB2 in PEAR, that make querying a bit easier and safer.
Unfortunately, they can be a bit wordy to set up, and you sometimes have to pass them the same info twice. I've used MDB2 in a couple of projects, and I tended to write a thin veneer around it, especially for specifying the types of fields. I generally make an object that knows about a particular table and its columns, and then a helper function in that object fills in the field types for me when I call an MDB2 query function.
For instance:
function MakeTableTypes($TableName, $FieldNames)
{
    $Types = array();
    foreach ($FieldNames as $FieldName => $FieldValue)
    {
        $Types[] = $this->Tables[$TableName]['schema'][$FieldName]['type'];
    }
    return $Types;
}
Obviously this object has a map of table names -> schemas that it knows about; it just extracts the types of the fields you specify and returns a matching type array suitable for use with an MDB2 query.
MDB2 (and similar libraries) then handle the parameter substitution for you, so for update/insert queries, you just build a hash/map from column name to value, and use the 'autoExecute' functions to build and execute the relevant query.
For example:
function UpdateArticle($Article)
{
    $table_name = 'articles'; // hypothetical; the original snippet assumed this was set elsewhere
    $Types = $this->MakeTableTypes($table_name, $Article);
    $res = $this->MDB2->extended->autoExecute($table_name,
                                              $Article,
                                              MDB2_AUTOQUERY_UPDATE,
                                              'id = '.$this->MDB2->quote($Article['id'], 'integer'),
                                              $Types);
}
and MDB2 will build the query, escaping everything properly, etc.
I'd recommend measuring performance with MDB2 though, as it pulls in a fair bit of code that might cause you problems if you're not running a PHP accelerator.
As I say, the setup overhead seems daunting at first, but once it's done the queries can be simpler/more symbolic to write and (especially) modify. I think MDB2 should know a bit more about your schema, which would simplify some of the commonly used API calls, but you can reduce the annoyance of this by encapsulating the schema yourself, as I mentioned above, and providing simple accessor functions that generate the arrays MDB2 needs to perform these queries.
Of course you can just do flat SQL queries as a string using the query() function if you want, so you're not forced to switch over to the full 'MDB2 way' - you can try it out piecemeal, and see if you hate it or not.
This other question also has some useful links in it...
Use an ORM framework like QCodo; you can easily map your existing database.
I try to use fairly generic functions and just pass the differences into them. This way you only have one function to handle most of your database SELECTs. Obviously you can create another function to handle all your INSERTs.
e.g.
function getFromDB($table, $wherefield=null, $whereval=null, $orderby=null) {
    if ($wherefield != null) {
        $q = "SELECT * FROM $table WHERE $wherefield = '$whereval'";
    } else {
        $q = "SELECT * FROM $table";
    }
    if ($orderby != null) {
        $q .= " ORDER BY ".$orderby;
    }
    $records = array(); // ensure an array is returned even when there are no rows
    $result = mysql_query($q) or die("ERROR: ".mysql_error());
    while ($row = mysql_fetch_assoc($result)) {
        $records[] = $row;
    }
    return $records;
}
This is just off the top of my head, but you get the idea. To use it just pass the function the necessary parameters:
e.g.
$blogposts = getFromDB('myblog', 'author', 'Lewis', 'date DESC');
In this case $blogposts will be an array of arrays which represent each row of the table. Then you can just use a foreach or refer to the array directly:
echo $blogposts[0]['title'];
