I am working on a web application which returns large tables of statistics from a database, using SQL Server (not my call). In my code I have several functions with while loops of this form:
while($row = sqlsrv_fetch_array($stmt, SQLSRV_FETCH_ASSOC))
However, since the tables are so large, I would like to make as few calls to the database as possible to minimize loading time. That means I would like to somehow reset the internal pointer of the database statement after these while loops. My experience with SQL Server is limited, to say the least, and googling suggests that in MySQL I should be able to use the mysql_data_seek function. Is there an equivalent in SQL Server? Is it even possible?
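sqlsrv_fetch($stmt, SQLSRV_SCROLL_FIRST);
as described in http://www.php.net/manual/en/function.sqlsrv-fetch.php. Note that this only works if the statement was created with a scrollable cursor (the "Scrollable" option of sqlsrv_query or sqlsrv_prepare); the default forward-only cursor cannot be rewound.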
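A minimal sketch of the rewind, assuming the sqlsrv extension is loaded and $conn is an already open connection. The helper name fetchAllTwice() is hypothetical and only there for illustration; the cursor option and scroll constants are the ones documented for the sqlsrv driver.

```php
<?php
// Sketch only: needs the sqlsrv extension and a live SQL Server connection.
// fetchAllTwice() is a hypothetical helper name, not part of the driver.
function fetchAllTwice($conn, string $sql): array
{
    // A static cursor is scrollable; the default forward-only cursor is not.
    $stmt = sqlsrv_query($conn, $sql, [], ['Scrollable' => SQLSRV_CURSOR_STATIC]);
    if ($stmt === false) {
        throw new RuntimeException(print_r(sqlsrv_errors(), true));
    }

    // First pass over the result set.
    $firstPass = [];
    while ($row = sqlsrv_fetch_array($stmt, SQLSRV_FETCH_ASSOC)) {
        $firstPass[] = $row;
    }

    // Second pass: jump back to the first row, then keep fetching the next rows.
    $secondPass = [];
    $row = sqlsrv_fetch_array($stmt, SQLSRV_FETCH_ASSOC, SQLSRV_SCROLL_FIRST);
    while ($row) {
        $secondPass[] = $row;
        $row = sqlsrv_fetch_array($stmt, SQLSRV_FETCH_ASSOC);
    }

    sqlsrv_free_stmt($stmt);
    return [$firstPass, $secondPass];
}
```

Both passes read from the same executed statement, so the query itself runs only once.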
I'm developing a project where I need to retrieve HUGE amounts of data from an MsSQL database and process that data. The data comes from 4 tables: 2 of them have 800-1000 rows, but the other two have 55000-65000 rows each.
The execution time wasn't tolerable, so I started to rewrite the code, but I'm quite inexperienced with PHP and MsSQL. At the moment I run my PHP on localhost:8000, starting the server with "php -S localhost:8000".
I think this is one of my problems: a poor server for a huge amount of data. I thought about XAMPP, but I need a server where I can install the MsSQL drivers without problems so that I can use the functions.
I cannot swap MsSQL for MySQL or make other changes like that; the company wants it that way...
Can you give me some advice on how to improve the performance? Is there any server I can use to improve the PHP execution? Thank you very much in advance.
The PHP execution should be the least of your concerns. If it is a concern, you are most likely going about things in the wrong way. All the PHP should be doing is running the SQL query against the database. If you are not using PDO, consider it: http://php.net/manual/en/book.pdo.php
First look to the way your SQL query is structured, and how it can be optimised. If in complete doubt, you could try posting the query here. Be aware that if you can't post a single SQL query that encapsulates your problem you're probably approaching the problem from the wrong angle.
I am assuming from your post that you cannot alter the database schema, but if you can, that would be the second course of action.
Try to do as much data processing in SQL Server as possible. Don't do joins or other types of data processing in PHP that can be done in the RDBMS.
I've seen PHP code that retrieved data from multiple tables and matched lines based on several conditions. This is just an example of a misuse.
Also try to handle data in sets in SQL (be it MS* or My*) and avoid, if possible, line-by-line processing. The optimizer will output a much more performant plan.
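As a rough illustration of letting the database do the joining and aggregating, here is a sketch that uses PDO with an in-memory SQLite database as a stand-in for SQL Server, so it runs without a server. The table and column names (customers, orders, amount) are made up for the example.

```php
<?php
// SQLite in memory stands in for SQL Server here; all names are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec('CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)');
$pdo->exec("INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob')");
$pdo->exec('INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5)');

// One set-based query: the database joins and aggregates,
// and PHP only receives the finished rows.
$sql = 'SELECT c.name, SUM(o.amount) AS total
        FROM customers c
        JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY c.name';

// FETCH_KEY_PAIR maps the first column to the second: name => total.
$totals = $pdo->query($sql)->fetchAll(PDO::FETCH_KEY_PAIR);

print_r($totals);
```

The alternative (fetching both tables and matching rows in nested PHP loops) moves every row over the connection and repeats work the optimizer would otherwise do.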
This is a small database. Really. My advice:
- Use paging for the tables and fetch the data in portions
- Use indexes on the tables
- Try to find a more powerful server. Hosting companies often use one database server for thousands of users' databases, and it is very slow. I suffered from this and finally bought a dedicated server.
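The paging advice could look roughly like this. The sketch again uses PDO with in-memory SQLite so that it is runnable anywhere; fetchPage() and the stats table are hypothetical names. On SQL Server 2012 and later the equivalent clause is OFFSET ... ROWS FETCH NEXT ... ROWS ONLY, as noted in the comment.

```php
<?php
// fetchPage() and the stats table are hypothetical, for illustration only.
function fetchPage(PDO $pdo, int $page, int $pageSize): array
{
    // SQLite syntax. On SQL Server 2012+ the equivalent would be:
    //   ORDER BY id OFFSET :offset ROWS FETCH NEXT :limit ROWS ONLY
    $stmt = $pdo->prepare(
        'SELECT id, value FROM stats ORDER BY id LIMIT :limit OFFSET :offset'
    );
    $stmt->bindValue(':limit', $pageSize, PDO::PARAM_INT);
    $stmt->bindValue(':offset', ($page - 1) * $pageSize, PDO::PARAM_INT);
    $stmt->execute();
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE stats (id INTEGER PRIMARY KEY, value INTEGER)');

// Fill the table with 25 sample rows.
$insert = $pdo->prepare('INSERT INTO stats (id, value) VALUES (?, ?)');
for ($i = 1; $i <= 25; $i++) {
    $insert->execute([$i, $i * $i]);
}

// Third page of ten: only the last five rows (ids 21 to 25) remain.
$page3 = fetchPage($pdo, 3, 10);
```

Ordering by an indexed column (here the primary key) keeps each page lookup cheap, which ties in with the advice on indexes.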
I have a theoretical question.
I can't see any difference between declaring a function within a PHP file and creating a stored procedure in a database that does the same thing.
Why would I want to create a stored procedure to, for example, return a list of all the Cities for a specific Country, when I can do that with a PHP function to query the database and it will have the same result?
What are the benefits of using stored procedures in this case? Or which is better? To use functions in PHP or stored procedures within the database? And what are the differences between the two?
Thank you.
Some benefits include:
Maintainability: you can change the logic in the procedure without needing to edit app1, app2 and app3 calls.
Security/Access Control: it's easier to worry about who can call a predefined procedure than it is to control who can access which tables or which table rows.
Performance: if your app is not situated on the same server as your DB, and what you're doing involves multiple queries, using a procedure reduces the network overhead by involving a single call to the database, rather than as many calls as there are queries.
Performance (2): a procedure's query plan is typically cached, allowing you to reuse it again and again without needing to re-prepare it.
(In the case of your particular example, the benefits are admittedly nil.)
The short answer: if you want your code to be portable, don't use stored procedures, because if at some point you want to change databases, for example from MySQL to PostgreSQL, you will have to update/port all the stored procedures you have written.
On the other hand, sometimes you can achieve better performance with stored procedures, because all of that code is run by the database engine. You can also make the situation worse if stored procedures are used improperly.
I don't think that selecting a country is a very expensive operation, so I guess you don't have to use stored procedures in this case.
Most of this has already been explained, but I will try to restate it in my own way.
Stored Procedures :
Logic resides in the database.
Let's say we have a query that we need to execute. We can do that in one of two ways:
Send the query from the client to the database server, where it is parsed, compiled and then executed.
Or store the query on the database server and create an alias for it. The client sends the alias in its request to the database server, and the stored query is executed when the request is received.
So we have:
Client ----------------------------------------------------------> Server
Conventional:
Query created at client ---------- sent to server ----------> query reaches server: parsed, compiled, executed.
Stored procedures:
Alias created, used by client ---------- sent to server ----------> alias reaches server: query parsed, compiled, cached (the first time).
The next time the same alias comes up, the cached executable runs directly.
Advantages:
Reduced network traffic: if the client sends a big query, and sends it very frequently, then every byte of the query goes over the network each time, which increases network usage unnecessarily. With a stored procedure only the call is sent.
Faster query execution: stored procedures are parsed and compiled once, and the executable is cached in the database. If the same query is repeated many times, the database runs the cached executable directly and saves the time spent on parsing and compiling. This is good if the query is used frequently. If the query is not used frequently, it might not be worth it: the cached executable takes space, so why put load on the database unnecessarily?
Modular: if multiple applications want to use the same query, the conventional approach duplicates the code in each application. Putting the code close to the database avoids that duplication.
Security: stored procedures are also designed with authorization in mind (who is privileged to run the query and who is not). As a DBA you can grant permission to specific users and revoke it from others, so you know that only the right people get access. This is not that popular now, though; you can design your application and database so that only authorized people can access them.
So if security/authorization is your only reason for using stored procedures instead of the conventional approach, a stored procedure might not be appropriate.
ok, this may be a little oversimplified (and possibly incomplete):
With a stored procedure:
you do not need to transmit the query to the database
the DBMS does not need to validate the query every time (validate in a sense of syntax, etc)
the DBMS does not need to optimize the query every time (remember, SQL is declarative, therefore, the DBMS has to generate an optimized query execution plan)
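The "validate and plan once, execute many times" idea can also be seen from the PHP side with a prepared statement. This is a different technique from a stored procedure, shown only because it is easy to run: the sketch uses PDO with in-memory SQLite, and the cities table is a made-up example modelled on the question.

```php
<?php
// Prepared statement, not a stored procedure; table and data are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE cities (name TEXT, country TEXT)');
$pdo->exec("INSERT INTO cities VALUES ('Lyon', 'FR'), ('Paris', 'FR'), ('Rome', 'IT')");

// Prepared once; only the parameter value changes between executions.
$stmt = $pdo->prepare('SELECT name FROM cities WHERE country = ? ORDER BY name');

$byCountry = [];
foreach (['FR', 'IT'] as $country) {
    $stmt->execute([$country]);
    // FETCH_COLUMN returns a flat list of the first column's values.
    $byCountry[$country] = $stmt->fetchAll(PDO::FETCH_COLUMN);
}
```

A stored procedure keeps the query text on the server as well, which is where the maintainability and access-control points from the other answers come in.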
I am attempting to build a progress bar loader for a long-running page (yes, indexes are the obvious solution, but currently the dataset/schema prohibits partitioning/proper indexing), so I plan to use a GUID/unique ID within the query comment to track progress with SHOW FULL PROCESSLIST via ajax. The key to this rests on whether sequential queries are executed in order by PHP. Does anyone know?
MySQL as a database server uses multiple threads to handle multiple running queries. It allocates a thread as and when it receives a query from a client, i.e. a PHP connection, an ODBC connection, etc.
However, since you mentioned mysql_query, I think you may have one of 2 things in mind:
If you call mysql_query() multiple times to pass various commands, then MySQL is passed each query only after the previous query has been completely executed and its result returned to PHP. So MySQL will of course seem to work sequentially, although it is actually PHP that waits for one query to finish before sending MySQL the next.
In MySQL 5 and PHP 5 there is a function called mysqli_multi_query(), with which you can pass multiple queries to MySQL in a single call, so PHP does not make a separate round trip for each one. The results are then retrieved one after another with mysqli_next_result(). Note that the server still executes the statements sent over one connection sequentially, in the order given; the saving is in round trips, not in parallel execution.
As Babbage once said, "I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question." Calls to mysql_query, like calls to any other function, execute in the order that they are reached in program flow.
Order in the file is mostly irrelevant. It's order of execution that matters.
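<?php
// Calls placed before the definitions work because PHP makes
// top-level functions available before the file starts executing.
// These run in the order Four, Two, Three, Two.
myCmdFour();
myCmdTwo();
myCmdThree();
myCmdTwo();

function myCmdTwo() {
    mysql_query(...);
}
function myCmdThree() {
    mysql_query(...);
}
function myCmdFour() {
    mysql_query(...);
}

// These run in the order Four, Three, Two, Two.
myCmdFour();
myCmdThree();
myCmdTwo();
myCmdTwo();
?>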
That said, anyone who has PHP files that look like that needs to seriously rethink things.
How do the different MySQL Cursors within PHP manage memory? What I mean is, when I make a MySQL query that retrieves a large result set, and get the MySQL resource back, how much of the data that the query retrieved is stored in local memory, and how are more results retrieved? Does the cursor automatically fetch all the results, and give them to me as I iterate through the resource with fetch_array or is it a buffered system?
Finally, are the cursors for the different MySQL drivers implemented differently? There are several MySQL drivers available for PHP: mysql, mysqli, pdo, etc. Do they all follow the same practices?
That depends on what you ask PHP to do. For instance, mysql_query() grabs the whole result set (if that's 500 megabytes, goodbye); if you don't want that, you can use:
http://php.net/manual/en/function.mysql-unbuffered-query.php
PDO, MySQLI seem to have other ways of doing the same thing.
Depending on your query, the result set may be materialized on the database side (if you need a sort, then the sort must be done entirely before you even get the first row).
For not too large result sets it's usually better to fetch it all at once, so the server can free used resources asap.
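For mysqli, one way to get the unbuffered behaviour is the MYSQLI_USE_RESULT result mode. The following is a sketch under the assumption that the mysqli extension is loaded and $db is a connected mysqli object; streamRows() is a hypothetical helper name.

```php
<?php
// Sketch only: needs the mysqli extension and a reachable MySQL server.
// streamRows() is a hypothetical helper, not part of mysqli.
function streamRows(mysqli $db, string $sql): Generator
{
    // MYSQLI_USE_RESULT requests an unbuffered result: rows are pulled from
    // the server as they are fetched instead of being copied into PHP first.
    $result = $db->query($sql, MYSQLI_USE_RESULT);
    if ($result === false) {
        throw new RuntimeException($db->error);
    }

    try {
        while ($row = $result->fetch_assoc()) {
            yield $row; // hand one row at a time to the caller
        }
    } finally {
        // The connection cannot run another query until the result is freed.
        $result->free();
    }
}
```

A caller would iterate with foreach over streamRows($db, $sql). The trade-off is the one described above: memory use in PHP stays flat, but the server holds its resources until the last row has been read.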
When you run a query like so:
$query = "SELECT * FROM table";
$result = odbc_exec($dbh, $query);
while ($row = odbc_fetch_array($result)) {
    print_r($row);
}
Does the resource stored in $result point to data that exists on the server running PHP, or is it pointing to data in the database? Put another way, as the while loop does its thing, is PHP talking to the DB on every iteration, or is it pulling each $row from some source on the application side?
This matters to me because I have a database I'm talking to over a VPN using ODBC with PHP. This last weekend something strange started happening: there are huge pauses during the while loop. Between iterations, the script will stop executing for seconds or even minutes, and where this happens seems to be completely random. I'm wondering whether I need to talk to the server over the VPN on each iteration and the connection is flaky, or whether something has gone wrong with my ODBC driver (FreeTDS).
mysql_query and odbc_exec both return a resource which (quote from php.net) "is a special variable, holding a reference to an external resource." This suggests the server is talking with the database server on every iteration, but I am not sure.
However, there are 2 connections we are talking about here. The first being your connection with the PHP server, and the second one being the connection between the PHP server and the database server. If both servers have a fast connection, the strange behaviour you are experiencing might not have anything to do with your VPN.
The resource identifies the internal data structure used by PHP for interacting with the external resource.
In the case of the resource returned by mysql_query(), this data structure will include the rows returned by the query (and the call won't return until all the data has been received or the connection fails). However, this behaviour is specific to MySQL: there is no requirement that the DBMS return the data before it is explicitly requested by the client.
If there is some strange problem causing lots of latency in your setup, then the only obvious solution would be to compile the results of the query on the database side and then deliver them to your PHP code, either batched or as a whole (think webservice).
C.