We are developing a script that reads data from a SQL Server 2005 instance located on another server.
At this moment we are having some trouble with the connection time, because the data we are retrieving is rather large.
One solution that occurred to us was calling mssql_close() right after mssql_query() and before mssql_fetch_array(), because after mssql_query() the data is already on our PHP server, or at least that is what the documentation says. That would shorten the connection time quite a bit, since we have a lot of data manipulation to do on the returned records.
Is that possible? Do we need to have an open connection for executing mssql_fetch_array()?
If you have a large data set to be pulled you can either:
Pull the data in chunks (multiple requests to DB)
Increase the connection timeout (if the DB is under your control)
I also hope you have read this part from the manual:
The downside of the buffered mode is that larger result sets might require quite a lot of memory. The memory will be kept occupied till all references to the result set are unset or the result set is explicitly freed, which will automatically happen during request end at the latest.
For the question:
Do we need to have an open connection for executing mssql_fetch_array()?
No, it is not needed if you have already fetched the data.
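If you are not sure whether fetching after mssql_close() is reliable in your version, a safer pattern is to copy every row into a PHP array while the connection is open and only close it afterwards. A minimal sketch, assuming placeholder host, credentials, database and query:

<?php
// Minimal sketch: keep the connection open only while fetching.
// Host, credentials, database and query are placeholders.
$link = mssql_connect('sqlserver-host', 'user', 'password');
mssql_select_db('mydb', $link);

$result = mssql_query('SELECT id, payload FROM big_table', $link);

// Copy every row into a PHP array first...
$rows = array();
while ($row = mssql_fetch_assoc($result)) {
    $rows[] = $row;
}
mssql_free_result($result);

// ...then close the connection before the slow processing starts.
mssql_close($link);

foreach ($rows as $row) {
    // expensive data manipulation happens here, with no open connection
}
?>

This way the connection is held only for the duration of the query and the fetch, not for the whole processing run.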
I have an android app, which needs to fetch new data from a server running a MySQL database. I add data to the database via a Panel which I access online on my domain.com/mypanel/.
What would be the best way to fetch the data on the client to reduce the overhead but keep the programming effort as small as possible? It's not necessary for the client to get the latest database changes right after they have been made, i.e. it would be okay if the client is updated some hours after the update.
Currently I thought of the following:
Add a timestamp column to the database tables so that I know which changes have been made
Run some sort of background service on the client (in the app) which runs every X hours and then checks for the latest updates since the last successful server-client synchronization
Send the timestamp of the last successful synchronization to the server via HTTP POST, so the server knows for which time span the client is missing updates
On the server, run some sort of MySQL SELECT statement which takes the sent timestamp into account (if no timestamp is sent from the client, just SELECT everything, e.g. in case of the first synchronization (full sync)), JSON-encode the arrays and send the JSON response to the client (a rough sketch of this server-side step follows the list below)
On the client, take the data, loop row by row and insert into the local database file
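A minimal sketch of the server-side step described above; the table name, the updated_at column and the last_sync POST field are assumptions for illustration, not your actual schema:

<?php
// Minimal server-side sync sketch; table, column and field names are illustrative only.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'password');

if (!empty($_POST['last_sync'])) {
    // Incremental sync: only rows changed since the client's last sync.
    $stmt = $pdo->prepare('SELECT * FROM items WHERE updated_at > ?');
    $stmt->execute(array($_POST['last_sync']));
} else {
    // First synchronization: send everything (full sync).
    $stmt = $pdo->query('SELECT * FROM items');
}

header('Content-Type: application/json');
echo json_encode($stmt->fetchAll(PDO::FETCH_ASSOC));
?>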
My question would be:
Is there something you would rather do differently?
Or would you maybe send the database changes as a whole package/sql-file instead of the raw-data as array?
What would happen when the internet connection aborts during the synchronization? I thought of the following to avoid any conflicts in this sort of process: only after successful retrieval of the complete server response (i.e. the complete JSON array) would I insert the rows into the local database and update the local sync timestamp to the current time. If I've retrieved only some of the JSON rows and the internet connection gets interrupted in between (or the app is killed), I would NOT have inserted ANY of the retrieved rows into my local app database, which means that the next time the background service runs, there will hopefully be no conflicts.
Thank you very much
You've mentioned a database on the client, and my guess is that this database is SQLite.
SQLite fully supports transactions, which means that you could wrap your inserts in BEGIN TRANSACTION and END TRANSACTION statements. A successful transaction would mean that all your inserts/updates/deletes are fine.
Choosing JSON has a lot of ups and a few downs - it's easy for both the client and the server side. A downside I've struggled with in the past is big JSONs (a few MB). The client device has to download the whole string and parse it at once, so it may run out of memory while converting the string to a JSONObject. I've been there, so just keep that in mind as a possibility. It can be solved by splitting your update into pieces and marking each piece with its number and the total number of pieces. Then the client device knows that it has to make a few requests to get all the pieces.
Another option you have is the good old CSV. You won't need the JSON includes, which will save your app some space. An upside is that you may parse and process the data line by line, so the memory impact would be very low. The obvious downside here is that you'll have to parse the data, which might be a problem, depending on your data.
I should also mention XML as an option. My personal opinion is that I'd use it only if I really had to.
I have pages on my site that use the database and pages that don't.
Whenever I need the database, I connect using $conn = connect(). But this means I need to put that call everywhere it's needed. If I put it in an include file and include that file in every page, it would connect even when the database is not needed. Would this be a good idea? Would creating a connection cause problems or other issues when it is not needed, or should I connect only when needed?
Connecting to a database when you don't need to introduces a small amount of overhead that is avoidable. If you need your pages to run as fast as possible, then you could optimize by avoiding the unnecessary db connection.
How much overhead this represents as a proportion of your total PHP execution time varies, for instance if your PHP script is simple and quick, then proportionally the db connection is a larger percentage of the total time wasted. If your PHP script does a lot of other things, then the db connection is a smaller percentage of total time.
Also the speed of a db connection can vary, depending on the speed of your server, whether MySQL is configured with DNS dependency, etc.
When I worked on the Zend Framework, we implemented "lazy" connections. You can create an instance of a Zend_Db_Adapter object any time you want, but that class doesn't connect in its constructor. It connects to the database when you run your first query (or when you explicitly call the getConnection() method).
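The same idea is easy to reproduce in plain PHP. A minimal sketch of a lazy wrapper (the class and method names are made up for illustration), using mysqli:

<?php
// Minimal sketch of a lazy connection wrapper; names are illustrative only.
class LazyDb
{
    private $link = null;

    // Connect on first use, not in the constructor.
    private function getLink()
    {
        if ($this->link === null) {
            $this->link = mysqli_connect('localhost', 'user', 'password', 'mydb');
        }
        return $this->link;
    }

    public function query($sql)
    {
        return mysqli_query($this->getLink(), $sql);
    }
}

$db  = new LazyDb();            // no connection opened yet
$res = $db->query('SELECT 1');  // the connection is established here
?>

Pages that never call query() never open a connection at all.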
Another consideration is how soon do you disconnect from the database when you're done running queries.
Suppose you handle 1000 PHP requests per second (one per millisecond on average), and each of your PHP requests lasts 100ms. So at any given instant, you may have 100 PHP requests in progress on average. If the first thing your PHP code does is connect to the db, and the last thing it does is disconnect from the db and other resources (by automatic request cleanup), then you may also have 100 db connections active at any time.
But if you delay connecting to the db, and disconnect promptly when you are done querying the db, and avoid connecting altogether on some requests, then on average you will have a much lower number of concurrent db sessions.
This can help reduce resource use on the db server, allowing more throughput and a higher number of PHP requests to complete per second.
I have an app that posts data from Android to some MySQL tables through PHP at a 10-second interval. The same PHP file runs a lot of queries on some other tables in the same database, and the result is downloaded and processed in the app (with DownloadWebPageTask).
I usually have between 20 and 30 clients connected this way. Most of the data each client queries for is the same as for all the other clients. If 30 clients run the same query every 10 seconds, that is 180 queries per minute. In fact every client runs several queries, some of them in a loop (looping through the results of another query).
My question is: if I somehow produce a text file containing the same data, update this text file every x seconds, and let all the clients read that file instead of running the queries themselves - would that be a better approach? Would it reduce the server load?
In my opinion you should consider using memcache.
It will let you store your data in memory which is even faster than files on disk or mysql queries.
What it will also do is reduce load on your database so you will be able to serve more users with the same server/database setup.
Memcache is very easy to use and there are lots of tutorials on the internet.
Here is one to get you started:
http://net.tutsplus.com/tutorials/php/faster-php-mysql-websites-in-minutes/
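A minimal sketch of that pattern with the Memcached extension; the key name, TTL, credentials and query below are just placeholders:

<?php
// Minimal caching sketch using the Memcached extension.
// Key name, TTL, credentials and query are placeholders.
$cache = new Memcached();
$cache->addServer('127.0.0.1', 11211);

$key  = 'client_feed';
$data = $cache->get($key);

if ($data === false) {
    // Cache miss: run the expensive queries once and store the result.
    $pdo  = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'password');
    $data = $pdo->query('SELECT * FROM feed')->fetchAll(PDO::FETCH_ASSOC);
    $cache->set($key, $data, 10); // keep it for 10 seconds
}

echo json_encode($data);
?>

With this in place, 30 clients polling every 10 seconds hit MySQL roughly once per cache lifetime instead of once per client.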
What you need is caching. You can either cache the data coming from your DB or cache the page itself. Below you can find a few links on how to do the same in PHP:
http://www.theukwebdesigncompany.com/articles/php-caching.php
http://www.addedbytes.com/articles/for-beginners/output-caching-for-beginners/
And yes. This will reduce DB server load drastically.
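For the file-based route hinted at in the question, a very small sketch (the cache path, lifetime, credentials and query are placeholders):

<?php
// Very small file-cache sketch; path, lifetime, credentials and query are placeholders.
$cacheFile = '/tmp/feed_cache.json';
$lifetime  = 10; // seconds

if (file_exists($cacheFile) && time() - filemtime($cacheFile) < $lifetime) {
    // Serve the cached copy; no database work at all.
    echo file_get_contents($cacheFile);
} else {
    $pdo  = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'password');
    $data = $pdo->query('SELECT * FROM feed')->fetchAll(PDO::FETCH_ASSOC);
    $json = json_encode($data);
    file_put_contents($cacheFile, $json);
    echo $json;
}
?>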
I'm designing my own session handler for my web app; the default PHP sessions are too limited when it comes to controlling how long the session should last.
Anyway, my first tests were like this: a session_id stored in a MySQL row and also in a cookie, and on the same MySQL row the rest of my session vars.
On every request to the server I make a query, get these vars and put them in an array to use the necessary ones at runtime.
Last night I was wondering whether I could write the vars to a file on the server once, at the login stage, and later just include that file instead of making a MySQL query on every request.
So, my question is: which is less resource-consuming, doing this with MySQL or with a file?
I know, I know, I already read several threads on stackoverflow about this issue, but I have something different from all those cases (I hope I didn't miss something):
I need to keep track of the time that has passed since the last time the user used the app, so on every call to the server I not only request the entire database row, I also update a timestamp on that same row.
So, on both cases I need to write to the session on every request...
FYI: the entire app runs on one server, so the multiple-servers scenario that complicates using files does not apply.
It's easier to work with when it's done in a database, and I've been using database-backed sessions mostly for scalability.
You may use MySQL, since a well-configured MySQL server can keep sessions in its temporary memory; you can even use MEMORY tables to speed things up if you can store all the sessions within memory. If you get near your memory limit, it's easy to switch to a normal table.
I'd say MySQL wins over files for performance for medium to large sites and also for customization/options. For smaller websites I think that it doesn't make that much of a difference, but you will use more of the hard drive when using files.
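Since you also want to refresh a "last used" timestamp on every request, a MySQL-backed handler fits naturally: the write step can store the vars and the timestamp in one query. A minimal sketch (requires PHP 5.4+; the sessions table, its columns and the credentials are assumptions):

<?php
// Minimal sketch of a MySQL-backed session handler.
// Table "sessions(id, data, last_used)" and credentials are assumptions.
class MysqlSessionHandler implements SessionHandlerInterface
{
    private $pdo;

    public function open($path, $name) {
        $this->pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'password');
        return true;
    }

    public function read($id) {
        $stmt = $this->pdo->prepare('SELECT data FROM sessions WHERE id = ?');
        $stmt->execute(array($id));
        $row = $stmt->fetch(PDO::FETCH_ASSOC);
        return $row ? $row['data'] : '';
    }

    public function write($id, $data) {
        // One query stores the session vars and refreshes the "last used" timestamp.
        $stmt = $this->pdo->prepare(
            'REPLACE INTO sessions (id, data, last_used) VALUES (?, ?, NOW())'
        );
        return $stmt->execute(array($id, $data));
    }

    public function destroy($id) {
        return $this->pdo->prepare('DELETE FROM sessions WHERE id = ?')
                         ->execute(array($id));
    }

    public function gc($max_lifetime) {
        // Remove sessions that have not been used for $max_lifetime seconds.
        $cutoff = date('Y-m-d H:i:s', time() - $max_lifetime);
        return $this->pdo->prepare('DELETE FROM sessions WHERE last_used < ?')
                         ->execute(array($cutoff));
    }

    public function close() { return true; }
}

session_set_save_handler(new MysqlSessionHandler(), true);
session_start();
?>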
I use lazy connection to connect to my DB within my DB object. This basically means that it doesn't call mysql_connect() until the first query is handed to it, and it skips reconnecting from then on.
Now I have a method in my DB class called disconnectFromDB() which pretty much calls mysql_close() and sets $_connected = FALSE (so the query() method will know to connect to the DB again). Should this be called after every query (as a private function) or externally via the object... because I was thinking something like (code is an example only)
$students = $db->query('SELECT id FROM students');
$teachers = $db->query('SELECT id FROM teachers');
Now if it was closing after every query, would this slow it down a lot as opposed to me just adding this line to the end
$db->disconnectFromDB();
Or should I just include that line above at the very end of the page?
What advantages/disadvantages do either have? What has worked best in your situation? Is there anything really wrong with forgetting to close the mySQL connection, besides a small loss of performance?
Appreciate taking your time to answer.
Thank you!
As far as I know, unless you are using persistent connections, your MySQL connection will be closed at the end of the page execution.
Therefore, calling disconnect yourself adds nothing and, because you use lazy connections, it may cause a second connection to be created if you or another developer makes a mistake and disconnects at the wrong time.
Given that, I would just allow my connection to close automatically for me. Your pages should be executing quickly, therefore holding the connection for that small amount of time shouldn't cause any problems.
I just read this comment on the PHP website regarding persistent connections, and it might be interesting to know:
Here's a recap of important reasons NOT to use persistent connections:

When you lock a table, normally it is unlocked when the connection closes, but since persistent connections do not close, any tables you accidentally leave locked will remain locked, and the only way to unlock them is to wait for the connection to timeout or kill the process. The same locking problem occurs with transactions. (See comments below on 23-Apr-2002 & 12-Jul-2003)

Normally temporary tables are dropped when the connection closes, but since persistent connections do not close, temporary tables aren't so temporary. If you do not explicitly drop temporary tables when you are done, that table will already exist for a new client reusing the same connection. The same problem occurs with setting session variables. (See comments below on 19-Nov-2004 & 07-Aug-2006)

If PHP and MySQL are on the same server or local network, the connection time may be negligible, in which case there is no advantage to persistent connections.

Apache does not work well with persistent connections. When it receives a request from a new client, instead of using one of the available children which already has a persistent connection open, it tends to spawn a new child, which must then open a new database connection. This causes excess processes which are just sleeping, wasting resources, and causing errors when you reach your maximum connections, plus it defeats any benefit of persistent connections. (See comments below on 03-Feb-2004, and the footnote at http://devzone.zend.com/node/view/id/686#fn1)
(I was not the one that wrote the text above)
Don't bother disconnecting. The cost of checking $_connected before each query combined with the cost of actually calling $db->disconnectFromDB(); to do the closing will end up being more expensive than just letting PHP close the connection when it is finished with each page.
Reasoning:
1: If you leave the connection open till the end of the script:
PHP engine loops through internal array of mysql connections
PHP engine calls mysql_close() internally for each connection
2: If you close the connection yourself:
You have to check the value of $_connected for every single query. This means PHP has to check that the variable $_connected A) exists B) is a boolean and C) is true/false.
You have to call your 'disconnect' function, and function calls are one of the more expensive operations in PHP. PHP has to check that your function A) exists, B) is not private/protected and C) that you provided enough arguments to your function. It also has to create a copy of the $connection variable in the new local scope.
Then your 'disconnect' function will call mysql_close() which means PHP A) checks that mysql_close() exists and B) that you have provided all needed arguments to mysql_close() and C) that they are the correct type (mysql resource).
I might not be 100% correct here but I believe the odds are in my favour.
You may want to look at using persistent connections. Here are two links to help you out:
http://us2.php.net/manual/en/features.persistent-connections.php
http://us2.php.net/manual/en/function.mysql-pconnect.php
The basic unit of execution is presumably an entire script. What you want, first of all, is to apply resources (i.e. the database) efficiently and effectively to the entirety of a single script.
However, PHP, Apache/IIS/whatever have lives of their own; they are capable of keeping the connections you open alive beyond the life of your script. That's the significance of persistent (or pooled) connections.
Back to your script. It turns out you have a great deal of opportunity to be creative about using that connection during its execution.
The typical naive script will tend to hit the connection again and again, picking up locally appropriate scraps of data associated with given objects/modules/selected options. This is where a procedural methodology can inflict a penalty on that connection by opening, requesting, receiving, and closing. (Note that any single query result will remain alive until it is explicitly freed or the script ends.) Be careful to note that a connection and a query are not the same thing at all: queries tie up tables; connections tie up ... connections (in most cases mapped to sockets). So you should be conscious of proper economy in the use of both.
The most economical strategy with regard to queries is to have as few as possible. I'll often try to construct a more or less complex joined query that brings back a full set of data rather than parceling out the requests in small pieces.
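As a purely illustrative sketch (table and column names are invented), the difference between parceling out requests and one joined query looks like this:

<?php
// Illustrative only; table and column names are invented.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'password');

// Parceled out: one extra query per student, i.e. N+1 round trips.
$students = $pdo->query('SELECT id, name FROM students')->fetchAll(PDO::FETCH_ASSOC);
foreach ($students as $s) {
    $stmt = $pdo->prepare('SELECT title FROM courses WHERE student_id = ?');
    $stmt->execute(array($s['id']));
    // ... process each student's courses ...
}

// One joined query brings back the full set in a single round trip.
$rows = $pdo->query(
    'SELECT s.id, s.name, c.title
       FROM students s
       LEFT JOIN courses c ON c.student_id = s.id'
)->fetchAll(PDO::FETCH_ASSOC);
?>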
Using a lazy connection is probably a good idea, since you may not need the database connection at all for some script executions.
On the other hand, once it's open, leave it open, and either close it explicitly as the script ends, or allow PHP to clean up the connection - having an open connection isn't going to harm anything, and you don't want to incur the unnecessary overhead of checking and re-establishing a connection if you are querying the database a second time.