I am searching for an efficient way to use a PHP MySQL InnoDB connection, but I am unable to find conclusive information on the web.
As far as I know, a persistent connection is much faster than a non-persistent one, and we can set it up in the following way:
$instance_mysqli = new mysqli('p:127.0.0.1', 'username', 'password', 'db');
However, the official documentation says the default behavior is to "reset" the connection on reuse, which is slower: http://php.net/manual/en/mysqli.persistconns.php
The mysqli extension does this cleanup by automatically calling the C-API function mysql_change_user(). The automatic cleanup feature has advantages and disadvantages though. The advantage is that the programmer no longer needs to worry about adding cleanup code, as it is called automatically. However, the disadvantage is that the code could potentially be a little slower, as the code to perform the cleanup needs to run each time a connection is returned from the connection pool.
So, is there no way to pass a parameter to the above constructor to avoid the "reset"? Is the only way to recompile the extension from source code, as the documentation suggests?
And my other question is: if mysqli is smart enough to automatically reset the connection by default, why do so many people still use non-persistent connections, which are even slower?
The cost of a connection is quite small, whether it is persistent or not, whether there is cleanup or not.
Normally, one should acquire one connection at the beginning of the program, and keep it until the end. (There are some exceptions.)
The only time connection cost is really noticeable is if you acquire a new connection before each and every SQL query.
Bottom line: Worry about your indexes, system design, etc, not about acquiring the connection.
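To make that concrete, here is a minimal sketch of the "one connection per script" approach with mysqli; the host, credentials, table names, and queries below are placeholders, not taken from the question.
// Acquire one connection at the start of the request (placeholder credentials).
$db = new mysqli('127.0.0.1', 'username', 'password', 'db');
if ($db->connect_errno) {
    die('Connect failed: ' . $db->connect_error);
}
// Reuse the same connection for every query in the script.
$users  = $db->query('SELECT id, name FROM users');
$orders = $db->query('SELECT id, total FROM orders');
// Close it at the end (or let PHP close it when the script terminates).
$db->close();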
Related
There are quite a few blogs/links that discourage the use of persistent connections, mainly because the cleanup needs to be done on the client side, and because of cases where transactions/locks have to be correctly rolled back. However, those links are old and not entirely in the context of the mysqli PHP interface.
I read the link: The mysqli Extension and Persistent Connections
It clearly suggests that it does most of the desired cleanup when a client terminates unexpectedly:
Rollback active transactions
Close and drop temporary tables
Unlock tables
Reset session variables
Close prepared statements (always happens with PHP)
Close handler
Release locks acquired with GET_LOCK()
Now that pretty much covers most of the cleanup, including READ/WRITE locks on tables if any were acquired. So I believe it should be safe. Could I be wrong?
Also, it says there is some performance penalty in the form of extra time needed to do the cleanup. I would like to know how much that may be in milliseconds. Can it ever be as large as, say, 100 ms?
The automatic cleanup feature has advantages and disadvantages though. The advantage is that the programmer no longer needs to worry about adding cleanup code, as it is called automatically. However, the disadvantage is that the code could potentially be a little slower, as the code to perform the cleanup needs to run each time a connection is returned from the connection pool.
I wonder if you really think you can trust an answer from an anonymous passer-by more than the official documentation page, which clearly answers your question.
But if you do, then yes, you can believe it should be safe.
As for the performance penalty, from the way the question is asked, I believe you don't really need persistent connections at all.
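If you do want a rough number for your own setup, here is a minimal timing sketch (the credentials are placeholders, and the figures will vary wildly with network latency and server load):
// Time 100 non-persistent connects, then 100 persistent ones (placeholder credentials).
$start = microtime(true);
for ($i = 0; $i < 100; $i++) {
    $c = new mysqli('127.0.0.1', 'username', 'password', 'db');   // non-persistent
    $c->close();
}
echo 'non-persistent: ', (microtime(true) - $start) * 10, " ms per connect\n";
$start = microtime(true);
for ($i = 0; $i < 100; $i++) {
    $c = new mysqli('p:127.0.0.1', 'username', 'password', 'db'); // persistent (reused + automatic reset)
    $c->close();
}
echo 'persistent: ', (microtime(true) - $start) * 10, " ms per connect\n";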
Note: I used Google Translator to write this
I've always done the following to work with MySQL:
-> Open a connection to the database.
-> See details
-> Insert data
-> Run another query
-> Close the connection
I usually use the same connection to do various things before closing.
A friend who studies this at the IPN in Mexico mentioned to me that the right way (for safety) is to make a new connection for each query, for example:
-> Open a connection to the database.
-> See details
-> Close the connection
-> Open a connection to the database.
-> Insert data
-> Close the connection
-> Open a connection to the database.
-> Run another query
-> Close the connection
My question is: what is the right thing to do? My approach has been to make as few queries to the database as possible, and to open one connection and keep it until I no longer need it.
Additionally, is it possible to do a double insert, into two tables at once? For example:
insert into table1(relacion) values([insert into tablaRelacionada(id) values("dato")]);
and that "relacion" is the inserted ID from the first query in "tablaRelacionada".
No, it's not possible to insert rows into two different tables with a single INSERT statement. (You can use a trigger to get it done, but that trigger will need to issue a separate INSERT statement... from the client side it will look like one statement, but on the server, there would be two INSERT statements executed.)
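For illustration only, a hedged sketch of what such a trigger could look like, reusing the table and column names from the question ($mysqli is assumed to be an open connection, and the trigger name is made up):
// Hypothetical trigger: after each INSERT into tablaRelacionada, the server
// runs a second INSERT that copies the new row's id into table1.relacion.
$mysqli->query("
    CREATE TRIGGER trg_tablaRelacionada_ai
    AFTER INSERT ON tablaRelacionada
    FOR EACH ROW
        INSERT INTO table1 (relacion) VALUES (NEW.id)
");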
If performance and scalability aren't concerns, then "churning" connections is workable. There's nothing necessarily "wrong" with creating a separate connection for each statement, but it's resource intensive. There is a lot of overhead in creating a new session. (It looks rather simple from the client side, but it requires a lot of work on the server side, in addition to the codepath on the client.)
Reusing existing connections is a common pattern. It's one of the biggest benefits of implementing a "connection pool": it makes it easy to reuse connections without "churning", i.e. repeatedly connecting and disconnecting from the database.
In terms of a separate connection for each SQL statement somehow increasing "safety", that's a bit of a stretch.
But I can see some benefit of having a freshly initialized session.
For example, if you reuse an existing session, you may not know what changes have been made in the session state. Any changes made previously are still "in effect". This would be things like session variable settings (e.g. timezone, characterset, autocommit, user defined variables) which could have an impact on the current statement. But within a single script, where you've gotten a fresh connection, you should know what changes have been made, so that shouldn't really be an issue. (This would be more of an issue with using connections from a pool, where the connections are shared by multiple processes. One process mucking with the timezone or characterset could cause a slew of problems for other processes that reuse the connection.)
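If leftover session state on a reused connection is the concern, it can also be reset explicitly; a minimal sketch, assuming $mysqli is the existing connection and the credentials are placeholders:
// change_user() re-authenticates and wipes session state: user-defined variables,
// temporary tables, uncommitted transactions, and so on.
$mysqli->change_user('username', 'password', 'db');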
Using a separate connection per query is at best a great way to bog down both your application and database servers with needless overhead. There are three aspects I see raised here:
Efficiency
Application Security
Network Security
1. Efficiency
Short answer: Bad idea.
Oftentimes the overhead required to initialize the connection is far more than what is required to run the actual query. Your application is probably going to run orders of magnitude slower if you take a connection-per-query approach.
2. Application Security
Short answer: Generally a bad idea, but in the context of PHP completely unnecessary.
The only 'safety' issue I can think of here would be worrying about users accessing leftover temp tables, or session settings "bleeding" over. This is unlikely to happen unless you're using persistent connections, which are not the default. As well, most temporary values in MySQL are stored per-connection, and unless you have some PHP code that is written poorly [in a particular, strange, and seldom-recommended way, i.e. sharing around DB singletons and accessing them strangely], then maybe, if the planets align just right, you might access some MySQL session-specific data in an unexpected way.
This is pretty much the same as premature optimization, and is not worth worrying about.
3. Network Security
Short answer: No. What? Just... no.
If you're worried about someone peeping in on your connections, the solution is not to make more of them, it's to make them secure. MySQL supports SSL, so use that if you're worried.
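A hedged sketch of what that can look like with mysqli (the host and certificate paths are placeholders for your own values):
// Encrypt the connection to the server instead of multiplying connections.
$mysqli = mysqli_init();
$mysqli->ssl_set('/path/to/client-key.pem', '/path/to/client-cert.pem', '/path/to/ca-cert.pem', NULL, NULL);
$mysqli->real_connect('db.example.com', 'username', 'password', 'db', 3306, NULL, MYSQLI_CLIENT_SSL);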
TL;DR No. Don't create separate connections per-query. Bad. Whoever told you this needs to go back to school.
Multi-Table Insert
What you've quoted is not possible, you would want to do something along the lines of the following:
$dbh->query("INSERT tablaRelacionada(id) values('dato')");
$lastid = $dbh->lastInsertId();
$dbh->query("INSERT INTO table1(relacion) values($lastid);");
Assuming that the table tablaRelacionada has an AUTO_INCREMENT column which is what you're trying to get from the first query.
See: lastInsertId()
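That snippet uses PDO; since much of this thread is about mysqli, here is a sketch of the same two-statement pattern there ('nombre' is a made-up column name, and $mysqli is assumed to be an open connection), using the insert_id property:
// Let AUTO_INCREMENT generate the id, then read it back via insert_id.
$mysqli->query("INSERT INTO tablaRelacionada (nombre) VALUES ('dato')");
$lastid = $mysqli->insert_id;
$mysqli->query("INSERT INTO table1 (relacion) VALUES ($lastid)");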
It's a PHP application using mysqli.
Someone else suggested closing the DB connection right after each query.
The current system has a singleton database connection, so creating too many new connections is not the issue here, only unused open connections. (Say, the script has not finished executing and the connection has not been closed on its own yet.)
So it seems there is something to balance: the cost of keeping the connection open until the script finishes versus multiple unnecessary closings of the DB connection per script. I tend to think the first is safer, but I am not very sure that it's sufficient. For example, if I do:
$userA->sendMessageTo($userB);
And inside this:
$userA->send($userB);
$userA->useSomePoints();
$userA->flushPointsBalance();
....
Imagine each method performs some database operation, but all of this is just one script call/request. If the DB open/close happens around each query, it will certainly happen more than once, compared to not closing it right after each query inside each method.
So which way is better?
Generally, having your DB wrapper class (or ORM) create a single connection for the entire request and only close it during cleanup (either via a destructor, or via PHP's cleanup) is okay. If this is a problem, it probably means that something long is happening between your opening and closing of connections, and that is what you should be addressing instead.
Causes could be:
slow queries that don't make use of indices
some other high-latency blocking IO (file reading, decoding, etc.)
You'll get better gains for your effort by addressing those issues, rather than by looking at how you open and close connections.
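As a rough sketch of that kind of wrapper (the class and method names here are invented, not taken from any of the questions), assuming one mysqli connection per request that is closed during cleanup:
// Minimal lazy-connecting wrapper: one mysqli connection per request,
// opened on first use and closed in the destructor / at script shutdown.
class Db
{
    private $conn = null;
    // Open the connection lazily, on first use (placeholder credentials).
    private function conn()
    {
        if ($this->conn === null) {
            $this->conn = new mysqli('127.0.0.1', 'username', 'password', 'db');
        }
        return $this->conn;
    }
    public function query($sql)
    {
        return $this->conn()->query($sql);
    }
    // Close during cleanup; PHP would also close it at the end of the request.
    public function __destruct()
    {
        if ($this->conn !== null) {
            $this->conn->close();
        }
    }
}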
Basically, I'd like to know if it's preferable to establish a database connection before each database query and then use mysqli_close() immediately after the relevant section, for every spot in the layout where database information has to be pulled, or if it's better to just open the database connection at the start of the file and then use mysqli_close() near the end of the file.
One connection per request is more efficient. Only if you do many concurrent updates on the same rows is it important to commit (and close the connection) as fast as possible.
It's better to just open the database connection at the start of the file, get all the data, then use mysqli_close(), and then call a template to start displaying the page.
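A minimal sketch of that flow (credentials, query, and template name are placeholders; fetch_all() needs the mysqlnd driver, otherwise loop with fetch_assoc()):
// Open once, fetch everything the page needs, close, then render.
$db = mysqli_connect('127.0.0.1', 'username', 'password', 'db');
$result = mysqli_query($db, 'SELECT id, title FROM posts');
$posts  = $result->fetch_all(MYSQLI_ASSOC);
mysqli_close($db);
include 'template.php';   // the template only reads $posts, it never touches the DB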
Use connection pooling, so it really doesn't matter how often you request a connection in the code. Applications wishing to be scalable should avoid rapidly creating new connections, as they could potentially have noticeable overhead from encryption setup or waiting for an authentication server.
I use lazy connection to connect to my DB within my DB object. This basically means that it doesn't call mysql_connect() until the first query is handed to it, and it subsequently skips reconnecting from then on.
Now I have a method in my DB class called disconnectFromDB(), which pretty much calls mysql_close() and sets $_connected = FALSE (so the query() method will know to connect to the DB again). Should this be called after every query (as a private function), or externally via the object? Because I was thinking of something like this (the code is an example only):
$students = $db->query('SELECT id FROM students');
$teachers = $db->query('SELECT id FROM teachers');
Now, if it were closing after every query, would that slow it down a lot compared to just adding this line at the end:
$db->disconnectFromDB();
Or should I just include that line above at the very end of the page?
What advantages/disadvantages do either have? What has worked best in your situation? Is there anything really wrong with forgetting to close the mySQL connection, besides a small loss of performance?
I appreciate you taking the time to answer.
Thank you!
As far as I know, unless you are using persistent connections, your MySQL connection will be closed at the end of the page execution.
Therefore, calling disconnect will add nothing, and because you use lazy connection, it may cause a second connection to be created if you or another developer makes a mistake and disconnects at the wrong time.
Given that, I would just allow my connection to close automatically for me. Your pages should be executing quickly, therefore holding the connection for that small amount of time shouldn't cause any problems.
I just read this comment on the PHP website regarding persistent connections, and it might be interesting to know:
Here's a recap of important reasons NOT to use persistent connections:
1. When you lock a table, normally it is unlocked when the connection closes, but since persistent connections do not close, any tables you accidentally leave locked will remain locked, and the only way to unlock them is to wait for the connection to timeout or kill the process. The same locking problem occurs with transactions. (See comments below on 23-Apr-2002 & 12-Jul-2003)
2. Normally temporary tables are dropped when the connection closes, but since persistent connections do not close, temporary tables aren't so temporary. If you do not explicitly drop temporary tables when you are done, that table will already exist for a new client reusing the same connection. The same problem occurs with setting session variables. (See comments below on 19-Nov-2004 & 07-Aug-2006)
3. If PHP and MySQL are on the same server or local network, the connection time may be negligible, in which case there is no advantage to persistent connections.
4. Apache does not work well with persistent connections. When it receives a request from a new client, instead of using one of the available children which already has a persistent connection open, it tends to spawn a new child, which must then open a new database connection. This causes excess processes which are just sleeping, wasting resources, and causing errors when you reach your maximum connections, plus it defeats any benefit of persistent connections. (See comments below on 03-Feb-2004, and the footnote at http://devzone.zend.com/node/view/id/686#fn1)
(I was not the one that wrote the text above)
Don't bother disconnecting. The cost of checking $_connected before each query combined with the cost of actually calling $db->disconnectFromDB(); to do the closing will end up being more expensive than just letting PHP close the connection when it is finished with each page.
Reasoning:
1: If you leave the connection open till the end of the script:
PHP engine loops through internal array of mysql connections
PHP engine calls mysql_close() internally for each connection
2: If you close the connection yourself:
You have to check the value of $_connected for every single query. This means PHP has to check that the variable $_connected A) exists B) is a boolean and C) is true/false.
You have to call your 'disconnect' function, and function calls are one of the more expensive operations in PHP. PHP has to check that your function A) exists, B) is not private/protected and C) that you provided enough arguments to your function. It also has to create a copy of the $connection variable in the new local scope.
Then your 'disconnect' function will call mysql_close() which means PHP A) checks that mysql_close() exists and B) that you have provided all needed arguments to mysql_close() and C) that they are the correct type (mysql resource).
I might not be 100% correct here but I believe the odds are in my favour.
You may want to look at using persistent connections. Here are two links to help you out:
http://us2.php.net/manual/en/features.persistent-connections.php
http://us2.php.net/manual/en/function.mysql-pconnect.php
The basic unit of execution is presumably an entire script. What you first of all want to apply resources (i.e. the database) to, efficiently and effectively, is the entirety of a single script.
However, PHP, Apache/IIS/whatever, have lives of their own, and they are capable of using the connections you open beyond the life of your script. That's the significance of persistent (or pooled) connections.
Back to your script. It turns out you have a great deal of opportunity to be creative about using that connection during its execution.
The typical naive script will tend to hit the connection again and again, picking up locally appropriate scraps of data associated with given objects/modules/selected options. This is where procedural methodology can inflict a penalty on that connection by opening, requesting, receiving, and closing. (Note that any single query will remain alive until it is explicitly closed, or the script ends.) Be careful to note that a connection and a query are not the same thing at all. Queries tie up tables; connections tie up... connections (in most cases mapped to sockets). So you should be conscious of proper economy in the use of both.
The most economical strategy with regard to queries is to have as few as possible. I'll often try to construct a more or less complex joined query that brings back a full set of data rather than parceling out the requests in small pieces.
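As a sketch of that idea (the tables echo the student/teacher example earlier in the thread, but the teacher_id column is invented, and $db is assumed to be an open mysqli connection):
// One joined query instead of one query per student; free the result when done.
$result = $db->query(
    'SELECT s.id, s.name, t.name AS teacher
       FROM students s
       JOIN teachers t ON t.id = s.teacher_id'
);
$rows = $result->fetch_all(MYSQLI_ASSOC);   // fetch_all() needs mysqlnd
$result->free();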
Using a lazy connection is probably a good idea, since you may not need the database connection at all for some script executions.
On the other hand, once it's open, leave it open, and either close it explicitly as the script ends or allow PHP to clean up the connection. Having an open connection isn't going to harm anything, and you don't want to incur the unnecessary overhead of checking and re-establishing a connection if you are querying the database a second time.