PHP/MySQL Connection Overhead - php

I have recently completed a data layer for a PHP application. In my data layer I have various methods for executing different sql tasks such as selects, inserts, deletes, etc. Coming from a .NET background, I practice opening connections, doing whatever with the connection, and closing them.
In a recent code review, I was questioned about this practice, and a colleague stated it was best to leave connections open for the life of the application. Their reasoning is that opening/closing connections is time consuming. My argument is that leaving them open is resource consuming. Following is a code sample from the data layer that executes a select query. I am fairly new to PHP so I don't really have a response for the critique. Can anyone provide any insight into this?
public static final function executeSelectQuery($qry){
    $connection = mysql_connect(ADS_DB_HOST, ADS_DB_USERNAME, ADS_DB_PASSWORD) or die(ADS_ERROR_MSG . mysql_error());
    $db = mysql_select_db(ADS_DB_NAME) or die(ADS_ERROR_MSG . mysql_error());
    $result = mysql_query($qry) or die(ADS_ERROR_MSG . mysql_error());
    mysql_close();
    $results = array();
    while($rows = mysql_fetch_assoc($result)){
        $results[] = $rows;
    }
    return sprintf('{"results":{"rows":%s}}', json_encode($results));
}

Classic case of the speed vs memory usage trade-off. There is no one-answer-fits-all to this question; it comes down to which of the two is the most important for the particular project you are undertaking.
Edit: After reading the new comments, I see this project is aimed at mobile devices. In that case, given the limited memory of mobile devices (as opposed to the boatload available to most desktops nowadays), prioritising memory usage would be the right call; everyone expects mobile devices to be somewhat slower than desktops anyway.

Your colleague is right and this code is wrong.
Ask yourself: what resources does an open connection actually consume?
To add to the code review: your error handling is also wrong. Never use die() to handle an error; throw an Exception instead.
Also, do not craft JSON manually:
return json_encode(array("results" => array ("rows" => $results)));
Also, consider amending this function to make it accept values for a parameterized query.
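For illustration only, here is one way the method could look with those points addressed, using PDO with prepared statements (the ADS_DB_* constants come from the original code; the PDO usage and the extra $params argument are my own sketch, not the author's implementation):
public static final function executeSelectQuery($qry, array $params = array()){
    // Exceptions instead of die(), bound parameters instead of string-built SQL
    $dsn = 'mysql:host=' . ADS_DB_HOST . ';dbname=' . ADS_DB_NAME;
    $pdo = new PDO($dsn, ADS_DB_USERNAME, ADS_DB_PASSWORD, array(
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ));
    $stmt = $pdo->prepare($qry);   // $qry uses placeholders, e.g. "... WHERE id = ?"
    $stmt->execute($params);       // values are bound here, never concatenated
    $results = $stmt->fetchAll(PDO::FETCH_ASSOC);
    return json_encode(array('results' => array('rows' => $results)));
}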

Related

Memory usage increasing inside loop: are Magento functions the cause?

My platform is PHP 5.2, Apache, Magento EE 1.9 and CentOS.
I have a pretty basic script which is fetching about 60,000 rows of data from an MS-SQL database using PHP's mssql_*() functions. The data is then processed a bit via data from Magento and finally written to a text file.
Really simple stuff...
$result = mssql_query($query);
while($row = mssql_fetch_assoc($result)) {
    $member = $row; // Copied so I can modify it
    // Do some stuff with each row... e.g.:
    $customer = Mage::getModel("customer/customer");
    $customer->loadByEmail($member["email"]);
    $customerId = $customer->getId();
    // Some more stuff like that...
    $ordersCollection = Mage::getResourceModel('sales/order_collection');
    // ...........
    // Some more stuff like that...
    $wishList = Mage::getModel('wishlist/wishlist')->loadByCustomer($customer);
    // ...........
    // Write straight to a file
    fwrite($fp, implode("\t", $member) . "\r\n");
    // Probably not even necessary
    unset($member);
}
The problem is, the memory usage of my script increases with each iteration of the loop (about 10MB for every 300 rows), with a theoretical peak of about 2GB (though it hasn't got there yet).
I've taken great pains to ensure that I'm not leaving any data in memory. No huge arrays are building up, no variables are being added to, everything is either unset() or directly overwritten with each iteration of the loop.
So my question is: could the Magento functions be causing memory leaks?
And if so, how do I stop them from doing so?
Ideally this script should be totally "passive": just grab the query results, modify them a bit (very temporary memory needed for this) then dump them straight to a file and destroy the memory. But this is not happening!
Thanks
Exclude all Mage:: calls from your code and just dump the data to the file without any processing, and see what happens to memory while doing this. Then start adding the Mage:: functions back one by one and see when it breaks.
This way you'll find the culprit. Then you need to start digging into its implementation and see what could go wrong. You could also consider doing the processing without relying on your Mage:: calls. Just write plain code to deal with the data in self-contained functions/classes and compare how things turn out if you exclude Mage:: entirely from the process.
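As a rough way of seeing which call is responsible, you could log memory_get_usage() every few hundred rows while re-enabling the Mage:: calls one at a time (a sketch; the interval and logging target are just placeholders):
$result = mssql_query($query);
$i = 0;
while ($row = mssql_fetch_assoc($result)) {
    $member = $row;

    // Re-enable one Mage:: call at a time between test runs, e.g.:
    // $customer = Mage::getModel('customer/customer')->loadByEmail($member['email']);

    fwrite($fp, implode("\t", $member) . "\r\n");

    // Report memory every 300 rows; a leak shows up as a steadily growing number
    if (++$i % 300 === 0) {
        error_log(sprintf('%d rows: %.1f MB', $i, memory_get_usage(true) / 1048576));
    }
}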
Yes: PHP has a long history of non-ideal behavior when it comes to memory management and code that pushes the edges of its object-oriented model.
You can try an alternate method of querying for your data that wastes less memory, or you can read up on how the Magento core team deals with this same issue.

Using PHP to dump large databases into JSON

I have a slight problem with an application I am working on. The application is used as a developer tool to dump tables from a database on a MySQL server to a JSON file, which the devs grab by using the Unix curl command. So far the databases we've been using have relatively small tables (2GB or less); however, we've recently moved into another stage of testing that uses fully populated tables (40GB+), and my simple PHP script breaks. Here's my script:
<?php
$database = $_GET['db'];
ini_set('display_errors', 'On');
error_reporting(E_ALL);
# Connect
mysql_connect('localhost', 'root', 'root') or die('Could not connect: ' . mysql_error());
# Choose a database
mysql_select_db('user_recording') or die('Could not select database');
# Perform database query
$query = "SELECT * from `".$database."`";
$result = mysql_query($query) or die('Query failed: ' . mysql_error());
while ($row = mysql_fetch_object($result)) {
    echo json_encode($row);
    echo ",";
}
?>
My question to you is what can I do to make this script better about handling larger database dumps.
This is what I think the problem is:
You are using mysql_query. mysql_query buffers the data in memory, and mysql_fetch_object then just fetches that data from memory. For very large tables, you simply don't have enough memory (most likely you are pulling all 40GB of rows into that single call).
Use mysql_unbuffered_query instead. There is more info on the MySQL Performance Blog, where you can also find some other possible causes for this behavior.
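As a sketch of how the loop from the question might look with that change (mysql_unbuffered_query streams rows from the server instead of buffering the whole result set in PHP; the trade-off is that you cannot issue another query on the same connection until all rows have been fetched or the result is freed):
$result = mysql_unbuffered_query($query) or die('Query failed: ' . mysql_error());

// Rows are pulled from the server one at a time, so memory use stays roughly
// constant no matter how large the table is.
while ($row = mysql_fetch_object($result)) {
    echo json_encode($row);
    echo ",";
}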
I'd say just let MySQL do it for you, not PHP:
SELECT
  CONCAT("[",
    GROUP_CONCAT(
      CONCAT("{field_a:'", field_a, "',field_b:'", field_b, "'}")
    ),
  "]") AS json
FROM table;
It should generate something like this:
[
{field_a:'aaa',field_b:'bbb'},
{field_a:'AAA',field_b:'BBB'}
]
You might have a problem with MySQL buffering. But, you might also have other problems. If your script is timing out, try disabling the timeout with set_time_limit(0). That's a simple fix, so if that doesn't work, you could also try:
1. Try dumping your database offline, then transfer it via script or just direct HTTP. You might try making a first PHP script call a shell script, which calls a PHP-CLI script that dumps your database to text. Then, just pull the dump via HTTP.
2. Try having your script dump part of the database at a time (rows 0 through N, N+1 through 2N, etc.); see the sketch after this list.
3. Are you using compression on your HTTP connections? If your lag is transfer time (not script processing time), then speeding up the transfer via compression might help.
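A minimal sketch of the chunked approach from option 2, assuming the table has an auto-increment id column and using an illustrative chunk size (both assumptions are mine, not from the question); it would run after the mysql_connect()/mysql_select_db() calls from the original script:
$chunkSize = 10000;
$lastId = 0;
do {
    // Keyset pagination: fetch the next batch of rows after the last id seen,
    // which avoids huge result sets and the slowdown of large OFFSETs.
    $query = sprintf("SELECT * FROM `%s` WHERE id > %d ORDER BY id LIMIT %d",
                     mysql_real_escape_string($database), $lastId, $chunkSize);
    $result = mysql_query($query) or die('Query failed: ' . mysql_error());
    $rows = 0;
    while ($row = mysql_fetch_object($result)) {
        echo json_encode($row), ",";
        $lastId = $row->id;
        $rows++;
    }
    mysql_free_result($result);
} while ($rows === $chunkSize);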
If it's the data transfer, JSON might not be the best way to transfer the data. Maybe it is. I don't know. This question might help you: Preferred method to store PHP arrays (json_encode vs serialize)
Also, for options 1 and 3, you might try looking at this question:
What is the best way to handle this: large download via PHP + slow connection from client = script timeout before file is completely downloaded

Redirect user to a static page when server's usage is over a limit

We would like to implement a method that checks the MySQL load or the total number of sessions on the server, and
if this number is bigger than some threshold, the next visitor of the website is redirected to a static webpage with a message like "Too many users, try again later".
One way I implemented it in my website is to handle the error message MySQL outputs when it denies a connection.
Sample PHP code:
function updateDefaultMessage($userid, $default_message, $dttimestamp, $db) {
    $queryClients = "UPDATE users SET user_default_message = '$default_message', user_dtmodified = '$dttimestamp' WHERE user_id = $userid";
    $resultClients = mysql_query($queryClients, $db);
    if (!$resultClients) {
        log_error("[MySQL] code:" . mysql_errno($db) . " | msg:" . mysql_error($db) . " | query:" . $queryClients, E_USER_WARNING);
        $result = false;
    } else {
        $result = true;
    }
    return $result;
}
In the JS:
function UpdateExistingMsg(JSONstring)
{
    var MessageParam = "msgjsonupd=" + JSON.encode(JSONstring);
    var myRequest = new Request({
        url: urlUpdateCodes,
        onSuccess: function(result) { if (!result) window.open(foo); },
        onFailure: function(result) { bar; }
    }).post(MessageParam);
}
I hope the above code makes sense. Good luck!
Here are some alternatives to user-lock-out that I have used in the past to decrease load:
APC Cache
PHP APC cache (speeds up access to your scripts via in memory caching of the scripts): http://www.google.com/search?gcx=c&ix=c2&sourceid=chrome&ie=UTF-8&q=php+apc+cache
I don't think that'll solve "too many mysql connections" for you, but it should really help your website's speed in general, and that'll help mysql threads open and close more quickly, freeing resources. It's a pretty easy install on a Debian system, and hopefully anything with package management (perhaps harder if you're using a shared server).
Cache the results of common MySQL queries, even if only within the same script execution. If you know that you're calling for certain data in multiple places (e.g. client_info() is one that I do a lot), cache it via a static variable and the identifying parameter, e.g.:
function client_info($incoming_client_id) {
    static $client_info;
    static $client_id;
    // Return the cached copy if this client was already fetched during this request
    if ($incoming_client_id == $client_id) {
        return $client_info;
    }
    // ... do stuff to get new client info, then store it in $client_info/$client_id ...
    return $client_info;
}
You also talk about having too many sessions. It's hard to tell whether you're referring to $_SESSION sessions or just concurrent browsing users. Too many $_SESSION sessions may be an indication that you need to move away from $_SESSION as a storage device; too many browsing users implies that you may want to selectively send caching headers for high-use pages. For example, almost all of my PHP scripts return the default no-cache headers, except my homepage, which sends headers allowing browsers to cache it for a short one-hour period, to reduce overall load.
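As a rough sketch of that last point (the one-hour value is just an example):
// Allow browsers and proxies to cache this page for one hour instead of re-requesting it
header('Cache-Control: public, max-age=3600');
header('Expires: ' . gmdate('D, d M Y H:i:s', time() + 3600) . ' GMT');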
Overall, I would definitely look into your caching procedures in general in addition to setting a hard limit on usage that should ideally never be hit.
This should not be done in PHP. You should do it naturally by means of existing hard limits.
For example, if you configure Apache to a known maximal number of clients (MaxClients), once it reaches the limit it would reply with error code 503, which, in turn, you can catch on your nginx frontend and show a static webpage:
proxy_intercept_errors on;
error_page 503 /503.html;
location = /503.html {
    root /var/www;
}
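For the Apache side of that setup, the cap itself might look roughly like this (a sketch for the prefork MPM; the number is purely illustrative and should be sized to your RAM):
# httpd.conf: cap the number of simultaneous request-handling processes
<IfModule mpm_prefork_module>
    ServerLimit  150
    MaxClients   150
</IfModule>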
This isn't as hard to do as it may sound.
PHP isn't the right tool for the job here because once you really hit the hard limit, you will be doomed.
The seemingly simplest answer would be to count the number of session files in ini_get("session.save_path"), but letting the web app access that directory directly is a security problem.
The second method is to have a database that atomically counts the number of open sessions. For small numbers of sessions, where performance really isn't an issue but you want an especially accurate count of open sessions, this is a good choice.
The third option, which I recommend, would be to set up a cron job that counts the number of files in the ini_get('session.save_path') directory, then prints that number to a file in some public area on the filesystem (only if it has changed), visible to the web app. This job can be configured to run as frequently as you'd like, say once per second if you want better resolution. Your bootstrap loader will open this file for reading, check the number, and serve the static page if it is above X.
Of course, this third method won't create a hard limit. But if you're just looking for a general threshold, this seems like a good option.
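A sketch of that third option; the file locations, threshold, and static page name are placeholders of my own, not anything prescribed above:
// count_sessions.php - run from cron as often as you need the count refreshed
$files = glob(ini_get('session.save_path') . '/sess_*');
$count = $files ? count($files) : 0;
$target = '/var/www/shared/session_count';               // readable by the web app
if (!file_exists($target) || (int)file_get_contents($target) !== $count) {
    file_put_contents($target, $count, LOCK_EX);          // write only when it changed
}

// bootstrap.php - before handling the request
$limit = 500;                                             // illustrative threshold
if ((int)@file_get_contents('/var/www/shared/session_count') > $limit) {
    header('HTTP/1.1 503 Service Unavailable');
    readfile('/var/www/static/too_many_users.html');      // the static "try later" page
    exit;
}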

MySQL connection limit

I have a problem with reaching my connection limit too quickly... Am I right in thinking the following will help resolve this?
On older files using mysql_query
<?php
mysql_close($link);
if (isset($link2)) {
    mysql_close($link2);
}
?>
On newer files using mysqli class
class DB extends MySQLi {
    function __destruct() {
        $this->close();
    }
}
You may also be keeping connections open via persistent connections (pconnect), causing your database server to pool and stack up the connections. I had trouble with this up until about PHP 5.2, if I recall correctly.
The connection is closed automatically when the script finishes its work, even if you forget to call mysql_close(). Consider raising the max_connections setting in my.cnf.
Also, if you are using only one database, you don't need two connections; use one instead.
<?php
mysql_close($link);
if (isset($link2)) {
    mysql_close($link2);
}
?>
This doesn't make any sense: if you know that both variables may contain MySQL connection resources, then close them both!
Better yet - if your code is a mess and you can't make sense of it, then...
<?php
@mysql_close();
@mysql_close();
@mysql_close();
?>
But the only place you can sensibly put this (without analysing the code behaviour in some details - in which case you would know what resources you have open) is at the end of the script - which is where the connections get closed anyway.
Similarly, the destruct method is only called when all references to an object are deleted - this is marginally better but depending on the code structure you may get no benefit at all.
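For illustration, with the DB class above, it is dropping the last reference that actually triggers the close (the connection details below are placeholders):
$db = new DB('localhost', 'db_user', 'db_pass', 'my_database');
$result = $db->query('SELECT 1');
$row = $result->fetch_assoc();

// __destruct() - and therefore close() - only runs once the last reference is gone:
unset($db);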
It makes far more sense to identify which URLs are taking a long time to process and to try to re-factor the code (both PHP and SQL) in those.

How often should I close database connections?

Currently, I'm opening a database connection in my app's initialization. It's a fairly small app, PHP if that's relevant.
Should I be connecting to the database, making calls, then closing and repeating this process for each database function I write?
For example, I have the following function which grabs the $db variable from my app's initialization.
function get_all_sections()
{
    global $db;
    $sql = 'select * from sections';
    if (!$db->executeSQL($sql, $result))
    {
        throw new Exception($db->getDatabaseError());
        exit();
    }
    $sections = array();
    for ($i = 0; $i < $db->numberOfRows($result); $i++)
    {
        $sections[] = new Section($db->fetchArray($result, MYSQLI_ASSOC));
    }
    return $sections;
}
Would it be better if I opened the connection then closed it after I fetched the rows? That seems like a lot of connections that are opened and closed.
If you have connection pooling on (http://en.wikipedia.org/wiki/Connection_pool), it's OK to grab a new connection when you need it. HOWEVER, I'd say get in the habit of treating any resource as "limited", and if you open the db handle, keep it around for as long as possible.
Connecting to the database takes a finite amount of time. It's negligible when connecting over a domain socket or named pipe, but it can be much larger if over a network connection, or worse yet, the open Internet. Leave it connected for the life of the request at least.
Use mysql_pconnect for connection pooling, and close at the end of each operation. The connection won't really be closed; the underlying thread will be re-used. This way it's both safe and efficient.
Since PHP MySQL connections are fairly light-weight, you are probably OK opening and closing the connection when needed. The same is not true of other database connections, such as connecting to SQL Server, which has rather heavy connections.
In all cases, however, do whatever makes the most sense in terms of maintainability/logic/dependencies. Then, if you find you are feeling a measurable slowdown, you can optimize those areas that need the speed-boost.
When in doubt, follow the golden rule: Don't optimize prematurely.
The simplest solution would be to use mysql_pconnect() - see here
This way, if a matching connection is already open, it will be re-used instead of creating a new one. If it's not, it will connect again.
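A minimal sketch of what that looks like (credentials and database name are placeholders):
// mysql_pconnect() re-uses an existing persistent connection for the same
// host/user/password if one is available; otherwise it opens a new one.
$link = mysql_pconnect('localhost', 'db_user', 'db_pass')
    or die('Could not connect: ' . mysql_error());
mysql_select_db('my_app', $link);

// Note: mysql_close() has no effect on persistent connections; they stay open
// in the PHP process, ready for the next request it serves.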
