PDO unbuffered query still waits until query result is complete - php

I have an SQL query which can return quite a lot of results (something like 10k rows), but I cannot use the SQL LIMIT clause, as I don't know the exact number of rows needed (there's a special grouping done in PHP). So the plan was to stop fetching rows once I have enough.
Since PDO normally operates in buffered mode, which fetches the whole result set and passes it to PHP, I switched PDO to unbuffered mode with
$pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);
Now I expected that executing the query should take about the same time no matter what LIMIT I pass. So basically
$result = $pdo->query($query);
$count = 0;
while ($row = $result->fetch()) {
    ++$count;
    if ($count > 10) break;
}
should execute in about the same time for
$query = 'SELECT * FROM myTable';
and
$query = 'SELECT * FROM myTable LIMIT 10';
However the first one takes 8 seconds whereas the second one executes instantly. So it seems like the unbuffered query also waits until the whole result set is fetched - which shouldn't be the case according to the documentation.
Is there any way to get the query result instantly in PHP with PDO and stop the query once I have enough results?
Database applications like "Sequel Pro SQL" can do this (I can hit cancel after 1 second and get the results that were already queried until that time) so it can't be a general problem with MySQL servers.
I can work around the problem by choosing a very high LIMIT which always has enough valid results after my grouping. But since performance is an issue, I'd like to query only as many entries as really needed. Please don't suggest anything that involves grouping in MySQL; the terrible performance of that is the reason we have to change the behaviour.

"Now I expected that executing the query should take about the same time no matter what LIMIT I pass."
This might not be completely true. While you avoid the overhead of receiving all your results, they are all still queried (without a limit)! You do get the advantage of keeping most of the results server-side until you need them, but as far as I know your server actually performs the whole query first. I'm not sure how complicated your query is, but this could be the issue.
Say, for instance, you have a very slow join (not indexed) but only want the first 10 rows by id: the query will pick 10 rows based on the index and then only do the join for those 10. This will be quick.
But if you don't actually limit and instead ask for the result in batches, the complete join has to be done (slow!) and only then is your result set released in parts.
A quicker method might be to repeat your limited query until you have your result. I know this will increase overhead, but it might be way quicker. The only way to know is to test.
In response to your comment, this is from the manual:
Unbuffered MySQL queries execute the query and then return a resource while the data is still waiting on the MySQL server for being fetched.
So it executes the query. The complete query. So as I tried to explain above, it will not be as quick as the same query with a LIMIT 10, as it doesn't perform a partial query! The fact that a different DB engine does this does not mean MySQL can...
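The repeated-LIMIT idea from this answer could be sketched roughly as follows. This is only an illustration, not the answerer's code: fetchUntilEnough, $fetchBatch and $isEnough are hypothetical names, and an in-memory array stands in for the table so the sketch runs without a database. In real use, $fetchBatch would presumably run a prepared SELECT ... ORDER BY ... LIMIT ... OFFSET ... query.

```php
<?php
/**
 * Pull rows in fixed-size batches until $isEnough reports that we can stop.
 *
 * $fetchBatch(int $offset, int $limit): array  hypothetical callback that returns one batch
 * $isEnough(array $rows): bool                 hypothetical callback applying the PHP-side grouping rule
 */
function fetchUntilEnough(callable $fetchBatch, callable $isEnough, int $batchSize = 100): array
{
    $rows = [];
    $offset = 0;
    while (true) {
        $batch = $fetchBatch($offset, $batchSize);
        foreach ($batch as $row) {
            $rows[] = $row;
            if ($isEnough($rows)) {
                return $rows; // stop as soon as the caller has what it needs
            }
        }
        if (count($batch) < $batchSize) {
            return $rows; // a short batch means the source is exhausted
        }
        $offset += $batchSize;
    }
}

// Stand-in for a database table (assumption: real code would query MySQL here).
$table = range(1, 1000);
$fetchBatch = fn(int $offset, int $limit): array => array_slice($table, $offset, $limit);
$isEnough = fn(array $rows): bool => count($rows) >= 25;

$result = fetchUntilEnough($fetchBatch, $isEnough, 10);
echo count($result), "\n"; // prints 25
```

Whether this beats one large unbuffered query depends on how expensive each limited query is, so it would need to be measured against the real data.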

Have you tried using prepare/execute instead of query, and putting a $stmt->closeCursor(); call after the break?
$stmt = $dbh->prepare($query);
$stmt->execute();
$count = 0;
while ($row = $stmt->fetch()) {
    ++$count;
    if ($count > 10) break;
}
$stmt->closeCursor();

Related

multiple query execution or one query and nextRowset() to SELECT?

Which is more efficient for executing multiple queries:
this, using nextRowset() to move through the result sets
$stmt = $db->query("SELECT 1; SELECT 2;");
$info1 = $stmt->fetchAll();
$stmt->nextRowset();
$info2 = $stmt->fetchAll();
or multiple separate executions, which are a lot easier to manage?
$info1 = $db->query("SELECT 1;")->fetchAll();
$info2 = $db->query("SELECT 2;")->fetchAll();
Performance of the code is likely to be similar.
The code at the bottom, to me, is more efficient for your software design because:
- it is more readable
- it can be changed with less chance of error, since each call addresses one query only
- each query and its handling can easily be moved to a separate function and tested individually
That's why I feel that overall efficiency (not just how fast data comes back from DB to PHP to the user, but also maintainability/refactoring of code) will be better with the code at the bottom.
"SQL injection" by a hacker is easier when you issue multiple statements at once. So, don't do it.
If you do need it regularly, write a Stored Procedure to perform all the steps via one CALL statement. That will return multiple "rowsets", so similar code will be needed.

MySQL Delete all selected rows in one request

Is it possible to SELECT specific rows and DELETE the selected result in ONE request?
$res = $Connection->query("SELECT * FROM tasks");
if($res->num_rows > 0){
while($row = $res->fetch_assoc()){ ...
The problem is that I have a limited number of queries to my SQL database and I want to minimize it as much as possible.
You can't do that using the regular mysql-api in PHP. Just execute two queries. The second one will be so fast that it won't matter. This is a typical example of micro optimization. Don't worry about it. (So timing doesn't matter much)
For the record, since you are worried about the number of queries, it can be done using mysqli and the mysqli_multi_query() function.
P.S. - I haven't tried this, but since mysqli_multi_query is there in the documentation, it might help... :)
SELECT and DELETE each need their own request. There is no single SQL statement in the official documentation that does both.

SQL LIMIT VS Loop Limit

I was browsing around Stack Overflow attempting to find how to limit an SQL query with a while loop and I came across this code.
$count = 0;
while ($count < 4 && $info = mysql_fetch_assoc($result)) {
    //stuff
    $count++;
}
Q 1: What is the difference between this code and using the SQL LIMIT clause?
Q 2: For what reason would somebody want to use this code, rather than using LIMIT?
With this code, the MySQL server will send all the results to the client, but the client ignores everything after the 4th row. So the server has to do more work, and more bandwidth will be used between the client and server.
They might want to use mysql_num_rows() to find out how many total rows were selected, even though they only want to display the first 4. However, MySQL provides a way to do that with LIMIT -- you can put the SQL_CALC_FOUND_ROWS option in the SELECT clause, and then use SELECT FOUND_ROWS() to get the total number of rows. So there's no good reason, except they don't know about this feature.
Everything @Barmar said is right on. Continuing with code like that will cause lots of problems as your result sets start to grow. Let the database do what it's good at doing: let it supply the limited set of results you want or need. Just think of what happens when you do a SELECT with no LIMIT clause in the command line client where there are thousands of rows... it just goes on and on.
One more thing: I wouldn't recommend using mysql_num_rows(), as it's a deprecated function. Might as well go along with mysqli or PDO.

Fetch all SQL query into an array at a time?

Well, the PHP code below successfully adds all the rows of the url field to an array.
$sql = "SELECT * FROM table WHERE url <> ''";
$result = mysql_query($sql,$con);
$sql_num = mysql_num_rows($result);
while ($sql_row = mysql_fetch_array($result))
{
    $urls[] = $sql_row["url"];
}
The problem is that if the URLs number in the millions, then it takes a lot of time (especially on localhost). So, I'd like to know another way of getting the SQL query result directly into an array without using a loop. Is it possible?
The problem is not the loop, it's that you are transferring millions of pieces of data (possibly large) from your database into memory. Whichever way you're doing that, it'll take time. And somebody needs to loop somewhere anyway.
In short: reduce the amount of data you get from the database.
You should consider using mysqli for that purpose. The fetch_all() method would allow you to do that.
http://php.net/manual/en/mysqli-result.fetch-all.php
UPDATE
As per comments, I tried both methods. I tried using mysql_fetch_array in a loop, and using mysqli::fetch_all() method, on a large table we have in production. mysqli::fetch_all() did use less memory and ran faster than the mysql_fetch_array loop.
The table has about 500000 rows. mysqli::fetch_all() finished loading the data in an array in 2.50 seconds, and didn't hit the 1G memory limit set in the script. mysql_fetch_array() failed from memory exhaustion after 3.45 seconds.
The mysql extension is deprecated, and the functionality you want is found in mysqli and PDO. It's the perfect excuse to switch to the newer MySQL extensions. For PDO the method is fetchAll(); for mysqli it is fetch_all() (note that mysqli's fetch_all() requires the mysqlnd driver to run).
There is no option for it in the mysql extension, though you can use PDO's fetchAll().
http://php.net/pdostatement.fetchall
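As a rough sketch of the fetchAll() suggestion combined with the advice to reduce the amount of data: selecting only the url column and passing PDO::FETCH_COLUMN returns a flat array without an explicit loop in user code. The table name pages and the sample rows are made up for illustration, and an in-memory SQLite database (assuming the pdo_sqlite driver is installed) stands in for the real MySQL connection.

```php
<?php
// Assumption: pdo_sqlite is available; the in-memory table replaces the real MySQL table.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec("CREATE TABLE pages (id INTEGER PRIMARY KEY, url TEXT)");
$pdo->exec("INSERT INTO pages (url) VALUES ('http://a.example'), (''), ('http://b.example')");

// Ask only for the column that is needed instead of SELECT *.
$stmt = $pdo->query("SELECT url FROM pages WHERE url <> '' ORDER BY id");

// FETCH_COLUMN makes fetchAll() return a flat list of the first column's values.
$urls = $stmt->fetchAll(PDO::FETCH_COLUMN);

print_r($urls);
```

PDO still iterates over the rows internally, so with millions of rows the transfer and memory cost remain; the gain is mostly from fetching one column instead of every column.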

Get Number of Rows from a Select Statement Efficiently

Until recently I've been using mysql_real_escape_string() to fix most of my variables before making SQL queries to my database. A friend said that I should be using PDO's prepared statements instead, so after reading a bit about them I'm now switching over to them.
I've only encountered one problem so far in switching over, and that's counting the rows returned by a SELECT statement. On occasion in my code, I'd run an SQL query and then count the number of rows returned from the SELECT statement. Depending on whether a result set was returned, I would take different actions. Sometimes I do need to use the result set from it. MySQL let me go straight to mysql_fetch_assoc() after mysql_num_rows() with no problem. However, PDO doesn't seem to have anything like mysql_num_rows().
I've been reading some responses on SO that gave me a solution, to either use COUNT() in the SQL statement or to use the PHP function count() on the result set. COUNT() would work fine in the SQL statement if I didn't need the result set in some places, however, several people have mentioned that using count() on the result set is fairly inefficient.
So my question is, how should I be doing this if I need to count the number of rows selected (if any), then run a script with the result set? Is using count() on the result set the only way in this case, or is there a more efficient way to do things?
Below is a short example of something similar to my previous SQL code:
$query = mysql_query("SELECT ID FROM Table WHERE Name='Paul' LIMIT 1");
if (mysql_num_rows($query) > 0)
{
    print_r(mysql_fetch_assoc($query));
}
else
{
    //Other code.
}
Thanks.
EDIT
I do know that you use fetchAll() on the statement before counting the result set (which gives me what I need), but I'm just trying to figure out the most efficient way to do things.
$stmt->rowCount();
http://php.net/manual/en/pdostatement.rowcount.php
The rows must be fetched (buffered into memory, or iterated) for it to work. It's not uncommon for your PDO driver to be configured to do this automatically.
You will have to use Count(). You can run two queries like
SELECT COUNT(ID) FROM Table WHERE Name='Paul'
Once you have the count, run the query with the SELECT clause:
SELECT ID FROM Table WHERE Name='Paul' LIMIT 1
The COUNT() function is not inefficient at all if you use it like COUNT(ID), because ID is most probably the primary key and has an index. MySQL may not even have to access the table.
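When the result set itself is needed afterwards, one alternative to running two queries is to fetch the rows once with fetchAll() and count the resulting array in PHP, as the question's edit hints. The following is a minimal sketch under stated assumptions: the table name people, the helper findByName and the sample rows are hypothetical, and an in-memory SQLite database (assuming pdo_sqlite is installed) replaces the real MySQL connection.

```php
<?php
// Assumption: pdo_sqlite is available; the table and rows are invented for the example.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec("CREATE TABLE people (ID INTEGER PRIMARY KEY, Name TEXT)");
$pdo->exec("INSERT INTO people (Name) VALUES ('Paul'), ('Anna')");

// Hypothetical helper: one prepared query, all matching rows returned as an array.
function findByName(PDO $pdo, string $name): array
{
    $stmt = $pdo->prepare('SELECT ID FROM people WHERE Name = :name');
    $stmt->execute([':name' => $name]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$rows = findByName($pdo, 'Paul');
if (count($rows) > 0) {
    print_r($rows[0]); // the rows are already in memory, no second query needed
} else {
    echo "no match\n";
}
```

For small result sets like this the count() call costs next to nothing; for very large ones, a separate COUNT() query as described above avoids loading every row.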
