php odbc_exec not returning all results

We are currently running some complex queries which return large data sets in MS SQL, exporting to Excel, and providing these as reports to our users. What we are trying to do is automate these reports on Drupal (running on Windows Server) by executing the query and writing the results to a file for the users to download.
Here is what one query looks like; it returns 86,972 rows, contains 66 columns, and executes in 134 seconds (when run in the MS SQL 2008 client):
Select with 16 JOINS
UNION
Select with 7 JOINS
UNION
Select with 12 JOINS
UNION
Select with 11 JOINS
When I execute this query via PHP, only 3608 rows are returned, and the execution time has been anywhere from 70 to 300 seconds. I have also checked ini_get('max_execution_time'), which is 1200, so it is not timing out.
When I simplify this query by executing only the first Select group (which returns 47,158 rows rather than 86,972) and reducing the number of columns, I do get more rows back:
Columns - Rows returned
66 - 3608
58 - 4059
32 - 8105
16 - 32,051
11 - 32,407
5 - 47,158
I am at a loss for what to do; it seems as though executing the query via PHP does not return the same results as running it in the MS SQL client. Does anyone know whether there are any limits on the result set returned when using odbc_exec() or odbc_execute() (I tried that too)? Any help is much appreciated!
Here is the PHP code that we are using:
$dbhandle = odbc_connect($odbcmgt_dsn, $odbcmgt_user, $odbcmgt_pwd, SQL_CUR_USE_ODBC);
$queryResult = odbc_exec($dbhandle, $sql);
$filename = 'path\test.csv';
$delimiter = ',';
//print max execution time
$maxEx = ini_get('max_execution_time');
dpm("Max Execution Time: " . $maxEx);
//open file to write into
$fp = fopen($filename, 'w');
//retrieve number of columns, print column names to array
$columnHeaderArray = array();
$numColsRet = odbc_num_fields($queryResult);
for ($i = 1; $i <= $numColsRet; $i++) {
    $columnHeaderArray[$i] = odbc_field_name($queryResult, $i);
}
$numRowsRet = 0;
//print column names to the file
fputcsv($fp, $columnHeaderArray, $delimiter);
//print each row to the file
while ($row = odbc_fetch_array($queryResult)) {
    fputcsv($fp, $row, $delimiter);
    $numRowsRet++;
}
//print connection error if any
$error = odbc_errormsg($dbhandle);
dpm("error message: " . $error);
if ($error != '') {
    drupal_set_message('<pre>' . var_export($error, true) . '</pre>');
}
//print number of rows
dpm("Num records: " . $numRowsRet);
//close file that was written into
fclose($fp);
odbc_close($dbhandle);

Related

PHP PDO sqlsrv large result set inconsistency

I am using PDO to execute a query for which I am expecting ~500K results. This is my query:
SELECT Email FROM mytable WHERE flag = 1
When I run the query in Microsoft SQL Server Management Studio I consistently get 544838 results. I wanted to write a small script in PHP that would fetch these results for me. My original implementation used fetchAll(), but this was exhausting the memory available to PHP, so I decided to fetch the results one at a time like so:
$q = <<<QUERY
SELECT Email FROM mytable WHERE flag = 1
QUERY;
$stmt = $conn->prepare($q);
$stmt->execute();
$c = 0;
while ($email = $stmt->fetch()[0]) {
    echo $email . " $c\n";
    $c++;
}
but each time I run the query, I get a different number of results! Typical results are:
445664
445836
445979
The number of results consistently comes up roughly 100K short, give or take a couple hundred between runs. Any help would be greatly appreciated.
The fetch() method fetches one row at a time from the current result set, and $stmt->fetch()[0] is the first column of that row.
Your SQL query has no ordering and probably contains some null or empty Email values.
Because your while condition tests that column value rather than the fetch itself, the loop exits as soon as a row's first column is null or empty. Since there is no ORDER BY, the first such row can appear at a different position on every run, which explains the varying counts.
Therefore, test only the return value of fetch(), not fetch()[0], and read the column from the fetched row inside the loop. (sqlsrv_get_field() applies only when using the sqlsrv extension directly; with PDO, index the fetched row instead.)
$c = 0;
while (($row = $stmt->fetch(PDO::FETCH_NUM)) !== false) { // stop only when fetch() itself runs out of rows
    $email = $row[0]; // first column value; may be null or empty, but the loop keeps going
    echo $email . " $c\n";
    $c++;
}
See also: sqlsrv_fetch (PHP manual).

Querying from database and writing the result to text file

I have a query that selects everything from a database table and writes it to a text file. If the state is small (say a maximum of 200K rows), the code works and writes the data to the text file. The problem arises when I query a state that has 2M rows; there's also the fact that the table has 64 columns.
Here's a part of the code:
// create and open the file (write and append)
$file = "file2.txt";
$fOpen = fopen($file, "a");
$qry = "SELECT * FROM tbl_two WHERE STE='48'";
$res = mysqli_query($con, $qry);
if (!$res) {
    echo "No data record" . "<br/>";
    exit;
}
$num_res = mysqli_num_rows($res);
for ($i = 0; $i < $num_res; $i++) {
    $row = mysqli_fetch_assoc($res);
    $STATE = (trim($row['STATE']) === "" ? " " : $row['STATE']);
    $CTY = (trim($row['CTY']) === "" ? " " : $row['CTY']);
    $ST = (trim($row['ST']) === "" ? " " : $row['ST']);
    $BLK = (trim($row['BLK']) === "" ? " " : $row['BLK']);
    ....
    ....
    //64th column
    $data = "$STATE$CTY$ST$BLK(to the 64th variable)\r\n";
    fwrite($fOpen, $data);
}
fclose($fOpen);
I tried putting a limit to the query:
$qry = "SELECT * FROM tbl_two WHERE STE='48' LIMIT 200000";
The problem is, it just writes the first 200K lines and doesn't write the remaining 1.8M lines.
If I don't put a limit on the query, it encounters the error Out of memory .... TIA for any kind suggestions.
First you need to use an unbuffered query for fetching the data; read the PHP manual section on buffered and unbuffered queries.
Queries use buffered mode by default. This means that query results are immediately transferred from the MySQL server to PHP and are then kept in the memory of the PHP process.
Unbuffered MySQL queries execute the query and then return a resource while the data is still waiting on the MySQL server to be fetched. This uses less memory on the PHP side, but can increase the load on the server. Unless the full result set has been fetched from the server, no further queries can be sent over the same connection. Unbuffered queries can also be referred to as "use result".
NOTE: buffered queries should be used in cases where you expect only a limited result set or need to know the number of returned rows before reading all of them. Unbuffered mode should be used when you expect larger results.
Also, don't build everything up in an array; write each row out directly inside the while loop:
$pdo = new PDO("mysql:host=localhost;dbname=world", 'my_user', 'my_pass');
$pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);
$uresult = $pdo->query("SELECT * FROM tbl_two WHERE STE='48' LIMIT 200000");
if ($uresult) {
    $lineno = 0;
    while ($row = $uresult->fetch(PDO::FETCH_ASSOC)) {
        echo $row['Name'] . PHP_EOL;
        // write the value to the text file here
        $lineno++;
    }
}

Processing a large result set with php

My scenario is like this: I have a huge dataset fetched from a MySQL table
$data = $somearray; //say the number of records in this array is 200000
I am looping over this data, processing some functionality, and writing the data to an Excel (CSV) file
$my_file = 'somefile.csv';
$handle = fopen($my_file, 'w') or die('Cannot open file: ' . $my_file);
for ($i = 0; $i < count($data); $i++) {
    // do something with the data
    self::someOtherFunctionalities($data[$i]); // just some function
    fwrite($handle, $data[$i]['index']); // here I am writing this data to a file
}
fclose($handle);
My problem is that the loop runs out of memory; it shows "Fatal error: Allowed memory size of ...". Is there any way to process this loop without exhausting memory?
Due to a server limitation I am unable to increase the PHP memory limit like
ini_set("memory_limit","2048M");
I'm not concerned about the time it takes, even if it takes hours, so I did set_time_limit(0).
Your job is linear, so you don't need to load all of the data at once. Use an unbuffered query, and write to php://stdout (rather than a temp file) if you are sending this file to an HTTP client.
<?php
$mysqli = new mysqli("localhost", "my_user", "my_password", "world");
$uresult = $mysqli->query("SELECT Name FROM City", MYSQLI_USE_RESULT);

$my_file = 'somefile.csv'; // or php://stdout
$handle = fopen($my_file, 'w') or die('Cannot open file: ' . $my_file);

if ($uresult) {
    while ($row = $uresult->fetch_assoc()) {
        // $row is the equivalent of $data[$i]
        self::someOtherFunctionalities($row); // just some function
        fwrite($handle, $row['index']); // here I am writing this data to a file
    }
}
$uresult->close();
?>
Can you use "LIMIT" in your MySQL query?
The LIMIT clause can be used to constrain the number of rows returned by the SELECT statement. LIMIT takes one or two numeric arguments, which must both be nonnegative integer constants (except when using prepared statements).
With two arguments, the first argument specifies the offset of the first row to return, and the second specifies the maximum number of rows to return. The offset of the initial row is 0 (not 1):
SELECT * FROM tbl LIMIT 5,10; # Retrieve rows 6-15
http://dev.mysql.com/doc/refman/5.0/en/select.html
If you don't worry about time, take 1000 rows at a time and just append them to the end of the file, e.g. write to a temp file that you move and/or rename when the job is done.
First run SELECT COUNT(*) on the table to get the total, then page through it with LIMIT:
$total = $mysqli->query("SELECT COUNT(*) FROM tbl_two")->fetch_row()[0];
$fp = fopen('export.tmp', 'w');
for ($offset = 0; $offset < $total; $offset += 1000) {
    $res = $mysqli->query("SELECT * FROM tbl_two LIMIT $offset, 1000"); // next 1000 rows
    while ($row = $res->fetch_assoc()) {
        fwrite($fp, implode('', $row) . "\r\n"); // append the row to the file
    }
    $res->free();
}
fclose($fp);
rename('export.tmp', 'export.txt'); // move and rename the file when the job is done
This is still rough code (it assumes a mysqli connection in $mysqli and the table from the question), but the process should work.

Odd behavior when fetching 100K rows from MySQL via PHP

Looking for some ideas here... I have a MySQL table that has 100K rows of test data in it.
I am using a PHP script to fetch rows out of that table, and in this test case, the script is fetching all 100,000 rows (doing some profiling and optimization for large datasets).
I connect to the DB, and execute an unbuffered query:
$result = mysql_unbuffered_query("SELECT * FROM TestTable", $connection) or die('Errant query: ' . $query);
Then I iterate over the results with:
if ($result) {
    while ($tweet = mysql_fetch_assoc($result)) {
        $ctr++;
        if ($ctr > $kMAX_RECORDS) {
            $masterCount += $ctr;
            processResults($results);
            $results = array();
            $ctr = 1;
        }
        $results[] = array('tweet' => $tweet);
    }
    echo "<p/>FINISHED GATHERING RESULTS";
}

function processResults($resultSet) {
    echo "<br/>PROCESSED " . count($resultSet) . " RECORDS";
}
$kMAX_RECORDS = 40000 right now, so I would expect to see output like:
PROCESSED 40000 RECORDS
PROCESSED 40000 RECORDS
PROCESSED 20000 RECORDS
FINISHED GATHERING RESULTS
However, I am consistently seeing:
PROCESSED 39999 RECORDS
PROCESSED 40000 RECORDS
FINISHED GATHERING RESULTS
If I add the output of $ctr right after $ctr++, I get the full 100K records, so it seems to me to be some sort of timing issue or problem with fetching the data from the back-end with MYSQL_FETCH_ASSOC.
On a related note, the batching code inside the while loop is there because, before I broke the $results array up like this, the loop would simply fall over at around 45,000 records (the same place every time). Is this due to a setting somewhere that I have missed?
Thanks for any input... just need some thoughts on where to look for the answer to this.
Cheers!
You're building an array of results, and counting that new array's members. So yes, it is expected behavior that after fetching the first row, you'll get "1 result", then "2 results", etc...
If you want to get the total number of rows expected, you'll need to use mysql_num_rows()
When you start going through your results, $ctr has no value, and the first $ctr++ leaves it at 1. But when you reach $kMAX_RECORDS you reset it to 1 instead of 0, even though the row fetched in that iteration has already been counted. I don't know, however, why you see one row less the first time processResults() is called; I would have expected one more.
As for the missing last 20,000 rows: you only call processResults() when $ctr exceeds $kMAX_RECORDS, so the final partial batch is gathered but never processed.
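A minimal sketch of one way to restructure the loop along those lines, reusing the question's variable names (the mysql_* calls are the question's own and are long since deprecated):
if ($result) {
    $results = array();
    $ctr = 0;
    while ($tweet = mysql_fetch_assoc($result)) {
        $results[] = array('tweet' => $tweet);
        $ctr++;
        if ($ctr >= $kMAX_RECORDS) {   // a full batch has been gathered
            $masterCount += $ctr;
            processResults($results);
            $results = array();
            $ctr = 0;                  // reset the batch counter to 0, not 1
        }
    }
    if ($ctr > 0) {                    // flush the final partial batch
        $masterCount += $ctr;
        processResults($results);
    }
    echo "<p/>FINISHED GATHERING RESULTS";
}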

Creating a very large MySQL Database from PHP Script

Please bear with me on this question.
I'm looking to create a relatively large MySQL database that I want to use to do some performance testing. I'm using Ubuntu 11.04 by the way.
I want to create about 6 tables, each with about 50 million records. Each table will have about 10 columns. The data would just be random data.
However, I'm not sure how I can go about doing this. Do I use PHP and loop INSERT queries (bound to timeout)? Or if that is inefficient, is there a way I can do this via some command line utility or shell script?
I'd really appreciate some guidance.
Thanks in advance.
mysqlimport is what you want; check the MySQL documentation for full information. It's command line and very fast.
Command-line mode usually has the timeouts disabled, as that's a protection against taking down a webserver, which doesn't apply at the command line.
You can do it from PHP, though generating "random" data will be costly. How random does this information have to be? You can easily read from /dev/urandom and get "garbage", but it's not a source of "good" randomness (you'd want /dev/random for that, but it will block if there isn't enough entropy available to make good garbage).
Just make sure that you have keys disabled on the tables, as keeping those up-to-date will be a major drag on your insert operations. You can add/enable the keys AFTER you've got your data set populated.
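As an illustration of that last point, here is a hedged sketch of the disable/re-enable pattern around a bulk load (the table name and credentials are made up, and ALTER TABLE ... DISABLE KEYS affects non-unique indexes on MyISAM tables):
<?php
// Skip index maintenance while the bulk INSERTs run, then rebuild the
// indexes in one pass afterwards. 'big_table' and the credentials are placeholders.
$con = mysqli_connect('localhost', 'my_user', 'my_pass', 'ADatabase');
mysqli_query($con, 'ALTER TABLE big_table DISABLE KEYS');
// ... run the bulk INSERT loop here ...
mysqli_query($con, 'ALTER TABLE big_table ENABLE KEYS');
mysqli_close($con);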
If you do want to go the PHP way, you could do something like this:
<?php
//Edit Following
$millionsOfRows = 2;
$InsertBatchSize = 1000;
$table = 'ATable';
$RandStrLength = 10;
$timeOut = 0; //set 0 for no timeout
$columns = array('col1', 'col2', 'etc');

//Mysql Settings
$username = "root";
$password = "";
$database = "ADatabase";
$server = "localhost";

//Don't edit below
$letters = range('a', 'z');
$rows = $millionsOfRows * 1000000;
$colCount = count($columns);
$valueArray = array();

$con = @mysql_connect($server, $username, $password) or die('Error accessing database: ' . mysql_error());
@mysql_select_db($database) or die('Couldn\'t connect to database: ' . mysql_error());

set_time_limit($timeOut);

for ($i = 0; $i < $rows; $i++) {
    $values = array();
    for ($k = 0; $k < $colCount; $k++)
        $values[] = RandomString();
    $valueArray[] = "('" . implode("', '", $values) . "')";

    if ($i > 0 && ($i % $InsertBatchSize) == 0) {
        echo "--" . $i / $InsertBatchSize . "--";
        $sql = "INSERT INTO `$table` (`" . implode('`,`', $columns) . "`) VALUES " . implode(',', $valueArray);
        mysql_query($sql);
        echo $sql . "<BR/><BR/>";
        $valueArray = array();
    }
}

// flush any rows left over from the final, partial batch
if (!empty($valueArray)) {
    $sql = "INSERT INTO `$table` (`" . implode('`,`', $columns) . "`) VALUES " . implode(',', $valueArray);
    mysql_query($sql);
}

mysql_close($con);

function RandomString()
{
    global $RandStrLength, $letters;
    $str = "";
    for ($i = 0; $i < $RandStrLength; $i++)
        $str .= $letters[rand(0, 25)];
    return $str;
}
Of course you could just use a created dataset, like the NorthWind Database.
All you need to do is launch your script from the command line like this:
php -q generator.php
it can then be a simple php file like this:
<?php
$fid = fopen("query.sql", "w");
fputs($fid, "create table a (id int not null auto_increment primary key, b int, c int);\n");
for ($i = 0; $i < 50000000; $i++) {
    fputs($fid, "insert into a (b,c) values (" . rand(0,1000) . ", " . rand(0,1000) . ");\n");
}
fclose($fid);
exec("mysql -u$user -p$password $db < query.sql");
It is probably fastest to run multiple inserts in one query, as in:
INSERT INTO `test` VALUES
(1,2,3,4,5,6,7,8,9,0),
(1,2,3,4,5,6,7,8,9,0),
.....
(1,2,3,4,5,6,7,8,9,0)
I created a PHP script to do this. First I tried to construct a query that would hold 1 million inserts, but it failed. Then I tried with 100 thousand and it failed again; 50 thousand didn't work either. My next try was with 10,000 and it works fine. I guess I am hitting the transfer limit from PHP to MySQL (most likely max_allowed_packet). Here is the code:
<?php
set_time_limit(0);
ini_set('memory_limit', -1);

define('NUM_INSERTS_IN_QUERY', 10000);
define('NUM_QUERIES', 100);

// build query
$time = microtime(true);
$queries = array();
for ($i = 0; $i < NUM_QUERIES; $i++) {
    $queries[$i] = 'INSERT INTO `test` VALUES ';
    for ($j = 0; $j < NUM_INSERTS_IN_QUERY; $j++) {
        $queries[$i] .= '(1,2,3,4,5,6,7,8,9,0),';
    }
    $queries[$i] = rtrim($queries[$i], ',');
}
echo "Building query took " . (microtime(true) - $time) . " seconds\n";

mysql_connect('localhost', 'root', '') or die(mysql_error());
mysql_select_db('store') or die(mysql_error());
mysql_query('DELETE FROM `test`') or die(mysql_error());

// execute the query
$time = microtime(true);
for ($i = 0; $i < NUM_QUERIES; $i++) {
    mysql_query($queries[$i]) or die(mysql_error());
    // verify all rows inserted
    if (mysql_affected_rows() != NUM_INSERTS_IN_QUERY) {
        echo "ERROR: on run $i not all rows inserted (" . mysql_affected_rows() . ")\n";
        exit;
    }
}
echo "Executing query took " . (microtime(true) - $time) . " seconds\n";

$result = mysql_query('SELECT count(*) FROM `test`') or die(mysql_error());
$row = mysql_fetch_row($result);
echo "Total number of rows in table: {$row[0]}\n";
echo "Total memory used in bytes: " . memory_get_usage() . "\n";
?>
The result on my Win 7 dev machine are:
Building query took 0.30241012573242 seconds
Executing query took 5.6592788696289 seconds
Total number of rows in table: 1000000
Total memory used in bytes: 22396560
So 1 million inserts took five and a half seconds. Then I ran it with these settings:
define('NUM_INSERTS_IN_QUERY', 1);
define('NUM_QUERIES', 1000000);
which is basically doing one insert per query. The results are:
Building query took 1.6551470756531 seconds
Executing query took 77.895285844803 seconds
Total number of rows in table: 1000000
Total memory used in bytes: 140579784
Then I tried to create a file with one insert per query in it, as suggested by @jancha. My code is slightly modified:
$fid = fopen("query.sql", "w");
fputs($fid, "use store;");
for($i = 0; $i < 1000000; $i++){
fputs($fid, "insert into `test` values (1,2,3,4,5,6,7,8,9,0);\n");
}
fclose($fid);
$time = microtime(true);
exec("mysql -uroot < query.sql");
echo "Executing query took " . (microtime(true) - $time) . " seconds\n";
The result is:
Executing query took 79.207592964172 seconds
Same as executing the queries through PHP. So the fastest way is probably to do multiple inserts in one query, and it shouldn't be a problem to use PHP to do the work.
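If you want to confirm the transfer-limit guess from earlier, a quick hedged check of the server's packet cap (using the same legacy mysql_* API as the script above) might look like this:
// max_allowed_packet caps the size of a single statement sent from the client;
// an INSERT larger than this is rejected by the server.
$result = mysql_query("SHOW VARIABLES LIKE 'max_allowed_packet'") or die(mysql_error());
$row = mysql_fetch_assoc($result);
echo "max_allowed_packet: {$row['Value']} bytes\n";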
Do I use PHP and loop INSERT queries (bound to timeout)
Certainly running long-duration scripts via a webserver-mediated request is not a good idea. But PHP can be compiled to run from the command line; in fact, most distributions of PHP come bundled with this.
There are lots of things you can do to make this run more efficiently, and exactly which ones will vary depending on how you are populating the data set (e.g. once only versus lots of batch additions). However, for a single load, you might want to have a look at the output of mysqldump (note the disabling/enabling of indexes and the multi-row insert lines) and recreate that in PHP rather than connecting directly to the database from PHP.
I see no point in this question, and especially in raising a bounty for it.
As they say, "the best is the enemy of the good".
You asked this question ten days ago.
If you had just gone with whatever code you've got, you would have your tables already and even be done with your tests. You are losing so much time in vain; it's beyond my understanding.
As for the method you've been asking for (just to keep away all these self-appointed moderators), here are some statements as food for thought; a sketch follows these points:
MySQL's own methods are more effective in general.
MySQL can insert all the data from one table into another using the INSERT ... SELECT syntax, so you would need to run only about 30 queries to get your 50 million records.
And of course MySQL can copy whole tables as well.
Keep in mind that there should be no indexes at the time of table creation.
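A minimal sketch of that INSERT ... SELECT doubling idea, under made-up names (table `big`, columns a and b; the mysqli connection details are placeholders):
<?php
// Seed one row, then let MySQL double the table server-side on each pass.
// 26 doublings of a single row give 2^26 (about 67 million) rows.
$mysqli = new mysqli('localhost', 'my_user', 'my_pass', 'ADatabase');
$mysqli->query('CREATE TABLE big (a INT, b INT)');       // no indexes yet, as advised above
$mysqli->query('INSERT INTO big (a, b) VALUES (1, 2)');
for ($i = 0; $i < 26; $i++) {
    $mysqli->query('INSERT INTO big (a, b) SELECT a, b FROM big');
}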
I just want to point you to http://www.mysqldumper.net/ which is a tool that allows you to back up and restore big databases with PHP.
The script has some mechanisms to circumvent PHP's maximum execution time, so in my opinion it's worth a look.
This is not a solution for generating data, but a great one for importing/exporting.
