Odd behavior when fetching 100K rows from MySQL via PHP

Looking for some ideas here... I have a MySQL table that has 100K rows of test data in it.
I am using a PHP script to fetch rows out of that table, and in this test case, the script is fetching all 100,000 rows (doing some profiling and optimization for large datasets).
I connect to the DB, and execute an unbuffered query:
$result = mysql_unbuffered_query("SELECT * FROM TestTable", $connection) or die('Errant query: ' . $query);
Then I iterate over the results with:
if ($result) {
    while ($tweet = mysql_fetch_assoc($result)) {
        $ctr++;
        if ($ctr > $kMAX_RECORDS) {
            $masterCount += $ctr;
            processResults($results);
            $results = array();
            $ctr = 1;
        }
        $results[] = array('tweet' => $tweet);
    }
    echo "<p/>FINISHED GATHERING RESULTS";
}

function processResults($resultSet) {
    echo "<br/>PROCESSED " . count($resultSet) . " RECORDS";
}
$kMAX_RECORDS = 40000 right now, so I would expect to see output like:
PROCESSED 40000 RECORDS
PROCESSED 40000 RECORDS
PROCESSED 20000 RECORDS
FINISHED GATHERING RESULTS
However, I am consistently seeing:
PROCESSED 39999 RECORDS
PROCESSED 40000 RECORDS
FINISHED GATHERING RESULTS
If I print $ctr right after $ctr++, I see all 100K records counted, so it looks to me like some sort of timing issue, or a problem fetching the data from the back end with mysql_fetch_assoc().
On a related note, the code in the while loop is there because prior to breaking up the $results array like this the while loop would just fall over at around 45000 records (same place every time). Is this due to a setting somewhere that I have missed?
Thanks for any input... just need some thoughts on where to look for the answer to this.
Cheers!

You're building an array of results and counting that new array's members, so processResults() reports the size of the current batch, not the total number of rows fetched so far.
If you want the total number of rows in the result set, you'll need to use mysql_num_rows(). Note that with mysql_unbuffered_query() it only returns the correct value after all rows have been fetched.

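When you start iterating, $ctr has no value; the first $ctr++ treats the unset variable as null and sets it to 1, so it is safer to initialize it to 0 before the loop. After a batch you reset it to 1, which matches the one row you then add to the fresh $results array. By that logic each call to processResults() should see exactly $kMAX_RECORDS rows, so I don't know why you see one row fewer on the first call.
As for the missing last 20000 rows: you call processResults() only when $ctr exceeds $kMAX_RECORDS, so whatever is left in $results when the loop ends is never processed. Call processResults($results) once more after the while loop to handle the final partial batch.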
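For illustration, here is a minimal sketch of the batching pattern with a final flush. It does not touch a database: fetchRows() is a hypothetical generator standing in for the mysql_fetch_assoc() loop, and batchSizes() is a made-up helper that only records how many rows each batch would hand to processResults().

```php
<?php
// Hypothetical stand-in for the fetch loop: yields $total fake rows.
function fetchRows(int $total): Generator {
    for ($i = 1; $i <= $total; $i++) {
        yield ['id' => $i];
    }
}

// Collects rows into batches of $batchSize and returns the size of each batch.
function batchSizes(iterable $rows, int $batchSize): array {
    $sizes = [];
    $batch = [];
    foreach ($rows as $row) {
        $batch[] = $row;
        if (count($batch) === $batchSize) {
            $sizes[] = count($batch); // processResults($batch) would run here
            $batch = [];
        }
    }
    if ($batch !== []) {
        $sizes[] = count($batch);     // final flush for the partial last batch
    }
    return $sizes;
}

echo implode(', ', batchSizes(fetchRows(100000), 40000)), "\n"; // prints 40000, 40000, 20000
```

Counting the batch itself instead of keeping a separate $ctr removes the off-by-one question entirely, and the check after the loop is what picks up the last 20000 rows.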

Related

Explanation of mysqli_fetch_array and foreach

<?php
$connect = mysqli_connect("localhost", "root", "", "hempbag_db") or die("Connection failed");
$query= "Select * from tbl_sales";
$ress = mysqli_query($connect, $query);
$result = mysqli_fetch_array($ress);
foreach ($result as $a)
{
    echo $a['ID']; // This doesn't print, although I used fetch array
}
foreach ($ress as $a)
{
    echo $a['ID']; // This works. Why? This variable only holds the result of running the query
}
?>
Why does the upper foreach not work while the lower one does? Can anyone explain, please?

When you run a query, it returns a result:
$ress = mysqli_query($connect, $query);
var_dump($ress); // You will see it's a result.
At this point $ress just contains the result of what you just queried. Think of it like this:
You go to the warehouse and place an order for 1000 boxes of crackers. The clerk heads to the back, gets your boxes ready, and comes back and hands you a piece of paper with the order number. (This is $ress.) On its own, that piece of paper is just a reference to the order, not the crackers themselves.
You now take that piece of paper, and you hand it to your assistant, and you say you want to get all the crackers on your trucks (This is now mysqli_fetch_array()). Your assistant goes, fetches it, and returns you the crackers.
Simply put, mysqli_query just returns an object like Result#1. From Result#1, MySQL can tell you how many rows were returned, mysqli_num_rows(Result#1), or give you the actual data if it was a SELECT query: mysqli_fetch_array(Result#1).
Now onto the reasoning: Performance. Let's say you didn't want 1000 crackers, you just wanted to know if they had 1000 crackers. If she came back with all the boxes of crackers and you had to count them yourself, it would be much more difficult. Instead, with that piece of paper, she can determine how many boxes you were able to order. Less data being transferred, and much more efficient.
Just a small note: in later versions of PHP (5.4 and up), the result object is iterable, meaning that if you loop over it with foreach, PHP fetches each row for you and hands it to the loop body. That is why your second foreach works.
Additionally, mysqli_fetch_array returns one row from the result set, so a foreach over its return value walks that row's column values rather than the rows. Perhaps you were thinking of mysqli_fetch_all? This returns all rows and can be looped through (although it is a bit less performant than using a while loop with mysqli_fetch_array).

$ress = mysqli_query($connect, $query);
This line returns a result set, which is Traversable, so your second foreach works fine.
The following line (mysqli_fetch_array), on the other hand, fetches one row at a time and returns it as an array.
$result = mysqli_fetch_array($ress); // Suppose you have 3 rows, Now cursor is at row 1
echo $result["ID"]; // this will print FIRST row's ID
$result = mysqli_fetch_array($ress); // Now cursor is at row 2
echo $result["ID"]; // this will print SECOND row's ID.
$result = mysqli_fetch_array($ress); // Now cursor is at row 3
echo $result["ID"]; // this will print THIRD row's ID.
To echo all IDs
while($result = mysqli_fetch_array($ress)) {
echo $result["ID"];
}
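To make the difference concrete without a database, here is a small sketch. The arrays are hypothetical: $row is shaped like what mysqli_fetch_array() returns by default (MYSQLI_BOTH, every column under a numeric and a named key), $rows is shaped like what mysqli_fetch_all($ress, MYSQLI_ASSOC) would return, and the Item column is made up for the example.

```php
<?php
// One hypothetical row in MYSQLI_BOTH shape: numeric and named keys side by side.
$row = [0 => 7, 'ID' => 7, 1 => 'Hemp tote', 'Item' => 'Hemp tote'];

// foreach over ONE row visits its column values, so $value is a scalar here
// and $value['ID'] would not give you an ID.
$seen = [];
foreach ($row as $key => $value) {
    $seen[] = "$key=$value";
}
echo implode(', ', $seen), "\n"; // prints 0=7, ID=7, 1=Hemp tote, Item=Hemp tote

// A hypothetical list of rows: foreach visits one row per iteration.
$rows = [
    ['ID' => 7, 'Item' => 'Hemp tote'],
    ['ID' => 8, 'Item' => 'Hemp backpack'],
];
$ids = [];
foreach ($rows as $r) {
    $ids[] = $r['ID'];
}
echo implode(', ', $ids), "\n"; // prints 7, 8
```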

PHP Use for loop to iterate over MySQLi recordset

Whenever I am working with PHP MySQLi recordsets, I have always worked with the returned data using the standard while loop to iterate over the recordset. Recently, however, I started wondering if there is a way to use a for loop instead. This would be handy in situations where you want to limit the number of results returned.
Here is an example of using the while loop:
//Prepare a query that will produce a reverse-order recordset
$sql = "SELECT * FROM tblNames ORDER BY numberID DESC";
$recordset = $conn -> query($sql);
//Count the number of contacts added to the list
$contactCount = 0;
//Start with an empty list so the first .= does not hit an undefined variable
$contactList = "";
while($row = $recordset -> fetch_assoc())
{
    //If the list has reached its maximum number (5), end the display loop
    if($contactCount >= 5)
    {
        break;
    }
    $contactList .= $row["name"] . "<br>";
    //Increment the number of contacts added to the list
    $contactCount ++;
}
//Use '$contactList' somewhere....
echo($contactList);
While this definitely works, there must be a better way to end the loop after a specified number of iterations. Is it easier to use a for loop in a situation like this? If so, how?

You can use LIMIT in the query. For example:
SELECT * FROM tblNames ORDER BY numberID DESC LIMIT 15
This way you don't have to worry about what happens if your query returns fewer than 15 results.

As I was writing this question, I suddenly decided that I would try it one last time, but in a different way than I had been previously. I had been stuck finding an efficient/safe way to tell when the recordset was empty (had been running into issues when the custom max number was greater than the number of records, and when there were no records).
//Execute the SQL query (reverse order), and store the results in a recordset
$sql = "SELECT * FROM tblNames ORDER BY numberID DESC";
$recordset = $conn -> query($sql);
//Start with an empty list so the first .= does not hit an undefined variable
$contactList = "";
//Use a 'for' loop to iterate over the recordset
for($i = 0; $i < 15; $i++)
{
    //If there is another row in the recordset, add the column value to the list
    if($row = $recordset -> fetch_assoc())
    {
        $contactList .= $row["name"] . "<br>";
    }
    else
    {
        //Break from the loop when there are no more records (used if the
        // given maximum number was actually greater than the number of records)
        break;
    }
}
echo($contactList);
As far as I can tell, this is a much better way to loop through a set/custom number of records, and then stop. It also will safely catch the end of the recordset (assuming it is reached before the cutoff number), and end the loop.
Edit
As is pointed out in the answer by HenryTK above, if you have control over the query, the best way is to use the SQL LIMIT clause. However, if you merely have access to the recordset, I still think the for loop will save time. (Although I'm not sure when this would happen.)
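If you only have the recordset and not the query, another option is SPL's LimitIterator. The sketch below is an assumption-laden illustration rather than the asker's code: fakeRecordset() is a hypothetical generator standing in for the mysqli result, and firstNames() is a made-up helper. On PHP versions where mysqli_result is Traversable, the same wrapping should apply to the real $recordset.

```php
<?php
// Hypothetical stand-in for a mysqli result: yields rows in descending numberID order.
function fakeRecordset(int $count): Generator {
    for ($i = $count; $i >= 1; $i--) {
        yield ['numberID' => $i, 'name' => "Contact $i"];
    }
}

// Returns the "name" column of at most $max rows and stops cleanly if there are fewer.
function firstNames(Traversable $recordset, int $max): array {
    $names = [];
    // IteratorIterator adapts any Traversable to the Iterator that LimitIterator expects.
    $limited = new LimitIterator(new IteratorIterator($recordset), 0, $max);
    foreach ($limited as $row) {
        $names[] = $row['name'];
    }
    return $names;
}

echo implode('<br>', firstNames(fakeRecordset(20), 5)), "\n";
echo count(firstNames(fakeRecordset(3), 15)), "\n"; // prints 3
```

There is no counter and no break to maintain: the iterator ends either at the limit or at the end of the data, whichever comes first.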

MySQL performance hit accessing table simultaneously

Following code is a mock-up of my real code. I'm getting a big performance hit when myFunction is called. myTable is no more than a few hundred rows, but calling myFunction adds ~10 seconds execution time. Is there something inherently wrong with trying to access a row of a table inside a loop already accessing that table?
<select>
<?php
$stmt = SQLout ("SELECT ID,Title FROM myTable WHERE LEFT(Title,2) = ? ORDER BY Title DESC",
array ('s', $co), array (&$id, &$co_title));
while ($stmt->fetch()) {
if (myFunction($id)) // skip this function call and save 10 seconds
echo '<option value="' . $co_title . '">' . $co_title . '</option>';
}
$stmt->close();
function myFunction ($id) {
$stmt = SQLout ("SELECT Info FROM myTable WHERE ID = ?",
array ('i', $id), array (&$info));
if ($stmt->fetch()) {
$stmt->close();
if ($info == $something)
return true;
}
return false;
}
?>
SQLout is basically:
$stmt = $sqli_db->prepare($query);
$stmt->bind_param;
$stmt->execute();
$stmt->bind_result;
return $stmt;

What you're doing is sometimes called the "N+1 queries" problem. You run the first (outer) query once, and it returns N rows. Then you run N subordinate queries, one for each row returned by the first query. Thus N+1 queries. It causes a lot of overhead.
This would have far better performance if you could apply the "something" condition in SQL:
$stmt = SQLout ("SELECT ID,Title FROM myTable
WHERE LEFT(Title,2) = ? AND Info = ... ORDER BY Title DESC",
array ('s', $co), array (&$id, &$co_title));
In general, it's not a good idea to run queries in a loop that depends on how many rows match the outer query. What if the outer query matches 1000000 rows? That means a million queries inside the loop will hit your database for this single PHP request.
Even if today the outer query only matches 3 rows, the fact that you've architected the code in this way means that six months from now, at some unpredictable time, there will be some search that results in a vast overhead, even if your code does not change. The number of queries is driven by the data, not the code.
Sometimes it's necessary to do what you're doing, for instance if the "something" condition is complex and can't be represented by an SQL expression. But you should try in all other cases to avoid this pattern of N+1 queries.
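The query count is easy to see in a simulation. This sketch uses no database at all: $table is a hypothetical in-memory stand-in for myTable, runQuery() is a made-up function where each call represents one round trip, and 'keep' plays the role of the "something" value.

```php
<?php
// Hypothetical in-memory stand-in for myTable.
$table = [
    ['ID' => 1, 'Title' => 'AB one',   'Info' => 'keep'],
    ['ID' => 2, 'Title' => 'AB two',   'Info' => 'drop'],
    ['ID' => 3, 'Title' => 'AB three', 'Info' => 'keep'],
    ['ID' => 4, 'Title' => 'CD four',  'Info' => 'keep'],
];

// Each call counts as one query sent to the (simulated) database.
function runQuery(array $table, callable $where, int &$queryCount): array {
    $queryCount++;
    return array_values(array_filter($table, $where));
}

// N+1 pattern: one outer query, then one lookup per returned row.
$nPlusOneCount = 0;
$titlesA = [];
$outer = runQuery($table, fn($r) => substr($r['Title'], 0, 2) === 'AB', $nPlusOneCount);
foreach ($outer as $row) {
    $inner = runQuery($table, fn($r) => $r['ID'] === $row['ID'], $nPlusOneCount);
    if ($inner[0]['Info'] === 'keep') {
        $titlesA[] = $row['Title'];
    }
}

// Single query: the extra condition is part of the WHERE clause.
$singleCount = 0;
$matches = runQuery(
    $table,
    fn($r) => substr($r['Title'], 0, 2) === 'AB' && $r['Info'] === 'keep',
    $singleCount
);
$titlesB = array_column($matches, 'Title');

echo "N+1: $nPlusOneCount queries, single: $singleCount query\n"; // prints N+1: 4 queries, single: 1 query
```

Both versions produce the same titles; only the number of round trips differs, and in the first version that number grows with the data.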

So, if you have a "few hundred rows" in the table, you might be calling myFunction a few hundred times, depending on how many rows are returned in the first query.
Check the number of rows that first query is returning to make sure it meets your expectations.
After that, make sure you have an index on myTable.ID.
After that, I would start looking into system/server level issues. On slower systems, say a laptop hard drive, 10 queries per second might be all you can get.

Try something like this:
$stmt = SQLout ("SELECT ID,Title, Info FROM myTable WHERE LEFT(Title,2) = ? ORDER BY Title DESC",
    array ('s', $co), array (&$id, &$co_title, &$info));
while ($stmt->fetch()) {
    if (myFunction($info)) // no extra query per row any more
        echo '<option value="' . $co_title . '">' . $co_title . '</option>';
}
$stmt->close();

function myFunction ($info) {
    // $something is the comparison value from the question; it must be in scope here
    return $info == $something;
}

Running while($row = sqlsrv_fetch_array($result)) multiple times

I am retrieving a couple of tables from a MSSQL database, which I am then running through to obtain order information from.
My code looks like this:
while($row = sqlsrv_fetch_array($orderResult))
{
......code........
..................
while($statusRow = sqlsrv_fetch_array($statusResult))
{
....code....
}
....code....
}
Now my problem is that after the second loop runs through, it never runs again. And I need it to run every time the first loop runs.
Is there anything I can do to reset that second loop to run again?
Thank you in advance. Any help or a push in the right direction will be very helpful.

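Read about the other parameters of sqlsrv_fetch_array().
You can do something like this to reset (the statement has to be created with a scrollable cursor, via the "Scrollable" option of sqlsrv_query(), for the scroll constants to work):
// reset, and get first row
$row = sqlsrv_fetch_array($result, SQLSRV_FETCH_BOTH, SQLSRV_SCROLL_FIRST);
// get the second (and every following) row normally
$row = sqlsrv_fetch_array($result);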
Alternatively, I think you could benefit from doing a JOIN in your query to return a single result. Merging results manually like this seems a little hackish.

I had a similar problem when calling a stored proc from a database that returned multiple result sets. I found Macek's answer didn't work for me, but another answer did:
$resultSet = array();
$isNotLastResult = true;
$i = 0;
while (!is_null($isNotLastResult))
{
$resultSet[$i] = array();
while ($row = sqlsrv_fetch_array($result, SQLSRV_FETCH_ASSOC))
{
$resultSet[$i][] = $row;
}
$isNotLastResult = sqlsrv_next_result($result);
$i++;
}
print_r($resultSet);
PS: I gave you an up arrow to counteract your down arrow. You asked a question I spent quite a bit of time looking for the answer to. Good question!

Use the following to reset the pointer:
sqlsrv_fetch_array ($Res, SQLSRV_FETCH_ASSOC, SQLSRV_SCROLL_ABSOLUTE, -1);
Then use the same code as before to loop through the result set. sqlsrv_fetch_array appears to advance the pointer after retrieving the data. Requesting record -1 retrieves nothing but then sets the pointer to 0, which is the first record.

Are you sure your second loop has results for a second run? If $statusResult depends on $row and only has matching entries for the first order, try inserting some test entries into your DB.
The second loop only starts over on the next pass of the outer loop if $statusResult is queried again inside the outer loop; a result that was created once outside it stays exhausted after the first pass.
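One more option that avoids scrollable cursors is to read the inner result into an array once and loop over that array for every order. The sketch below only simulates this: ForwardCursor is a hypothetical class standing in for a forward-only sqlsrv statement, and the two functions simply count how many status rows each order gets to see.

```php
<?php
// Hypothetical forward-only cursor: fetch() returns the next row, or false when exhausted.
class ForwardCursor {
    private $rows;
    private $pos = 0;

    public function __construct(array $rows) {
        $this->rows = $rows;
    }

    public function fetch() {
        if ($this->pos >= count($this->rows)) {
            return false;
        }
        return $this->rows[$this->pos++];
    }
}

// Fetching inside the outer loop: the cursor is used up after the first order.
function countInnerRowsDirect(array $orders, ForwardCursor $status): array {
    $counts = [];
    foreach ($orders as $order) {
        $n = 0;
        while ($statusRow = $status->fetch()) {
            $n++;
        }
        $counts[] = $n;
    }
    return $counts;
}

// Buffering first: the array can be walked again for every order.
function countInnerRowsBuffered(array $orders, ForwardCursor $status): array {
    $statusRows = [];
    while ($statusRow = $status->fetch()) {
        $statusRows[] = $statusRow;
    }
    $counts = [];
    foreach ($orders as $order) {
        $n = 0;
        foreach ($statusRows as $statusRow) {
            $n++;
        }
        $counts[] = $n;
    }
    return $counts;
}

$orders = [['id' => 1], ['id' => 2], ['id' => 3]];
$statuses = [['code' => 'open'], ['code' => 'shipped']];

echo implode(',', countInnerRowsDirect($orders, new ForwardCursor($statuses))), "\n";   // prints 2,0,0
echo implode(',', countInnerRowsBuffered($orders, new ForwardCursor($statuses))), "\n"; // prints 2,2,2
```

Buffering costs memory proportional to the inner result, so it suits small lookup tables such as statuses better than large ones.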

Use a foreach loop instead of while with mysqli_fetch_array()

Okay, so I realize that when I do:
//connection code;
//query code;
//$result= mysqli_query();
$row= mysqli_fetch_array($result);
you create an associative array where the column names from your table are the keys for the data in the respective row.
Then you can use:
while ($row= mysqli_fetch_array($result))
{
//code to echo out table data.
}
My question is how does the while loop go to the next row after each iteration? I thought that was what foreach loops were for?

From http://www.php.net/manual/en/function.mysql-fetch-array.php
array mysql_fetch_array ( resource $result [, int $result_type = MYSQL_BOTH ] )
Returns an array that corresponds to the fetched row and moves the internal data pointer ahead.
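The "internal data pointer" idea can be tried out with a plain PHP array, which carries an internal pointer of its own. This is only an analogy: fetchRow() is a hypothetical helper built on current() and next(), not part of any database extension.

```php
<?php
// Returns the row under the array's internal pointer, then moves the pointer ahead.
// Once the pointer is past the last element, current() returns false.
function fetchRow(array &$rows) {
    $row = current($rows);
    next($rows);
    return $row;
}

$rows = [['ID' => 1], ['ID' => 2], ['ID' => 3]];

$ids = [];
// Each evaluation of the condition fetches one row; the loop ends on false.
while ($row = fetchRow($rows)) {
    $ids[] = $row['ID'];
}
echo implode(',', $ids), "\n"; // prints 1,2,3
```

The while loop itself knows nothing about rows; it just re-evaluates the condition, and the fetch function is what remembers where it left off.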

Many functions that return a result set do so by returning an array that you can do a foreach() on like you are used to. This is not always the case, however, especially with database functions. mysqli_fetch_array fetches just a single row, or returns null if there are no more rows remaining. This is how the loop works: the expression evaluates to true as long as there is a row to process.
The reason for this construction is mainly efficiency. Fetching database rows can be a performance critical operation, and there are cases where not all rows are needed. In these situations this approach will give more flexibility. Fetching rows one-by-one is also more memory-friendly since not all result data will have to be loaded into memory at once.
Mysqli actually has a function that does fetch the entire result set in an array: mysqli_fetch_all. You will be able to foreach() over that.

mysql_fetch_array simply fetches the next row of the result set from your mysql query and returns the row as an array or false if there are no more rows to fetch.
The while loop continually pulls the results, one at a time, from the result set and continues until mysql_fetch_array returns false.
A foreach loop loops through each value of an array. Because mysql_fetch_array returns only one row, a foreach over $row would iterate over that row's column values, not over the rows of the result set.

Each time the while loop runs, it executes the function mysql_fetch_array and gets the next result. It does that until there aren't more results to show.

mysql_fetch_array returns an array of strings that corresponds to the fetched row, or FALSE if there are no more rows. So the while condition holds as long as a row exists, and the loop body gets that row's data.
I hope this answers your question; it's hard to tell exactly what you mean.

This part fetches one row at a time
$row = mysqli_fetch_array($result);
Putting it into a while loop makes it fetch one row at a time, until it does not fetch a row because there are no more to be fetched.
The alternative would be to fetch all the rows, then loop through them with a foreach
$rows = mysql_fetch_all($result);
foreach($rows as $row){
// do something with row
}
For this to work, you have to make yourself a mysql_fetch_all function, which of course has the original while loop in it...
function mysql_fetch_all($result)
{
$all = array();
while($thing = mysql_fetch_assoc($result)) {
$all[] = $thing;
}
return $all;
}

This works due to the SQL connector storing the current state of the query (i.e. the next result row to return) inside the result.
If you want a similar example, it works like reading from a file, where you're able to use similar constructs:
while ($line = fgets($fp, 1000)) {
// ...
}
Behind the scenes (and depending on the language, interpreter, compiler, etc.) for and while essentially result in the same code. The difference is, depending on what your code should do, one approach could be more readable than the other.
Take the following two loops as an example. Both do exactly the same.
for ($i = 0; $i < 10; $i++) {
// ...
}
$i = 0;
while ($i < 10) {
// ...
$i++;
}
