Are there any performance issues with PHP's MongoDB query cursor handling?
My code:

$cursor = $collection->find($searchCriteria)->limit($limit_rows);
// Sort ascending based on S_DTTM, then SYMBOL
$cursor->sort(array('S_DTTM' => 1, 'SYMBOL' => 1));
// How many results were found?
$num_docs = $cursor->count();
if ($num_docs > 0)
{
    // loop over the results
    foreach ($cursor as $ticks)
    {
        // ... process each document ...
    }
}

I have also seen code like:

// request data
$result = $cursor->getNext();
My issue is that after the first query returns (in full, with a limit of 100 rows), the next query just keeps looping. There are millions of rows to return, which is why I wanted to cap the result set with limit().
I re-indexed just in case; it made no difference.
What am I doing wrong? Does getNext() work better?
I am using mongod version 2.5.4 and the latest PHP Mongo driver, downloaded a week ago.
The collection size is 100 GB, including 2 additional indexes.
The mongo log shows all the queries executing in under 200 ms.
It turned out to be a query issue, not a PHP Mongo driver issue:
the use of count() and sort() can decrease performance.
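A sketch of the leaner pattern (legacy PHP Mongo driver, as in the question; the loop body is a placeholder): chain sort() and limit() before iterating and drop the separate count() round trip. Note that the legacy MongoCursor::count() issues its own server-side command and ignores limit() unless called as count(true).

```php
// Sketch: no up-front count(); just iterate and tally.
$cursor = $collection->find($searchCriteria)
                     ->sort(array('S_DTTM' => 1, 'SYMBOL' => 1))
                     ->limit($limit_rows);

$num_docs = 0;
foreach ($cursor as $ticks) {
    $num_docs++;
    // ... process $ticks here ...
}

if ($num_docs === 0) {
    // nothing matched $searchCriteria
}
```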
Related
I have a query like this
$results = $collection->find([
'status' => "pending",
'short_code' => intval($shortCode),
'create_time' => ['$lte' => time()],
])
->limit(self::BATCH_NUMBER)
->sort(["priority" => -1, "create_time" => 1]);
Where BATCH_NUMBER is 70, and I use the result of the query like this:
foreach ($results as $mongoId => $result) {
}
or I try to convert it into an array:
iterator_to_array($results);
The Mongo fetch and iteration timings are:
FetchTime: 0.003173828125 ms
IteratorTime: 4065.1459960938 ms
As you can see, fetching the data from Mongo is very fast, but iterating over it (whether with iterator_to_array or with foreach) is slow.
This is a queue for sending messages to another server. The destination server accepts fewer than 70 documents per request, so I am forced to fetch 70 documents at a time. I want to fetch 70 documents out of 1,300,000, and that is where the problem is: the query fetches the first 70 documents that match the conditions, sends them, and finally deletes them from the collection.
Can anybody help? Why does it take so long? Is there any configuration that would speed up PHP or Mongo?
One more thing: when the total number of documents is around 100,000 (instead of 1,300,000), iteration is fast. Iteration time grows with the total number of documents.
That was because of the sorting.
The problem:
Fetching the data from Mongo was fast, but iterating over the cursor with foreach was slow.
The solution:
The query sorts by priority DESC and create_time ASC, but those fields were only indexed ascending, and separately. Creating a single compound index on priority DESC and create_time ASC together fixed the problem:
db.queue_1.createIndex( { "priority" : -1, "create_time" : 1 } )
The order of the fields in the index is important: priority must come first, then create_time, because that is the order in which the query sorts:
.sort({priority : -1, create_time : 1});
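For completeness, the same compound index can also be created from PHP; this is a sketch assuming a MongoCollection instance in $collection (legacy driver; createIndex() is available from driver 1.5, older versions use ensureIndex()):

```php
// Compound index matching the sort order { priority: -1, create_time: 1 }
$collection->createIndex(array('priority' => -1, 'create_time' => 1));
```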
I have a query that returns roughly 6,000 results. Although this query executes in under a second in MySQL, once it is run through Zend Framework 2, it experiences a significant slowdown.
For this reason, I tried to do it a more "raw" way with PDO:
class ThingTable implements ServiceLocatorAwareInterface
{
// ...
public function goFast()
{
$db_config = $this->getServiceLocator()->get('Config')['db'];
$pdo = new PDO($db_config['dsn'], $db_config['username'], $db_config['password']);
// Note: prepare() driver options must be key => value pairs;
// MYSQL_ATTR_COMPRESS belongs in the PDO constructor options instead.
$statement = $pdo->prepare('SELECT objectNumber, thingID, thingmaker, hidden, title FROM Things', array(PDO::ATTR_CURSOR => PDO::CURSOR_FWDONLY));
$statement->execute();
return $statement->fetchAll(PDO::FETCH_ASSOC);
}
}
This doesn't seem to have much of a speedup, though.
I think the problem might be that Zend is still trying to create a new Thing object for each record, even though it is only a partial list of columns. I'd really be okay not populating any objects. I really just need a few columns with those attributes to iterate over.
As suggested by user MonkeyZeus, the following was used for benchmarking:
$start = microtime(true);
$result = $statement->fetchAll(PDO::FETCH_ASSOC);
echo (microtime(true) - $start).' seconds';
And in response:
In a VM, that returns 0.0050520896911621. This is in line with what it
is when I just run the command straight in MySQL. I believe the
overhead is in Zend, but not sure how to quite go about that. Again if
I had to guess, I'd say it is because Zend is adding overhead while
trying to be nice with the results, but I'm not quite sure how to
proceed after that.
[I'm] not so worried about the query. It is a single select statement.
goFast() gets called by the Zend indexAction() --similar to other
actions used across the project--this one is just way slower at
returning the page. One problem I found was that Zend's $this->url()
was slowing things down a bit. So I removed those, but the performance
still isn't great.
How can I speed this up?
When you say that the query runs in under a second in MySQL, what do you mean? Did you run the query and print all 6,000 rows, or did you just run it and let the command line print the first/last few of them?
The problem might be that by fetching them all through the cursor, you are copying all the data (6,000 rows) from MySQL into PHP and then returning it. Are you sure you want to do that?
Maybe you could return a statement/cursor for the query and then iterate through the rows only when you actually need them.
Your problem is not the SQL itself, but fetching it all into a PHP array at once. You can test this by logging the time it takes to execute the SQL versus the time it takes to fetch the results into a PHP array.
Do not use fetchAll(); return the statement itself, and in the code that needs to loop over the results, use the statement to fetch each row one by one.
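A minimal sketch of that approach, using an in-memory SQLite database purely so the example is self-contained (in the real code the statement would come from the existing MySQL PDO connection; the table and column names follow the question):

```php
// Stand-in for the real MySQL connection, so the sketch runs anywhere.
$pdo = new PDO('sqlite::memory:');
$pdo->exec('CREATE TABLE Things (objectNumber INTEGER, title TEXT)');
$pdo->exec("INSERT INTO Things VALUES (1, 'first'), (2, 'second')");

// Return the statement instead of calling fetchAll() up front...
$statement = $pdo->prepare('SELECT objectNumber, title FROM Things');
$statement->execute();

// ...and pull one row at a time where the data is actually consumed,
// so only one row lives in PHP memory at any moment.
while ($row = $statement->fetch(PDO::FETCH_ASSOC)) {
    echo $row['objectNumber'] . ': ' . $row['title'] . "\n";
}
```

With only 6,000 rows this mainly avoids building one large PHP array; the asker's own suspicion, per-row object hydration in Zend, would be skipped as well since no Thing objects are created.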
When I run my MongoDB query in the shell, I get the result set in a few milliseconds, but when I execute the same query through CodeIgniter, I get the results in 12 seconds.
Shell script
db.order.find({customer_email:/^asd#asd.com/}).explain()
Codeigniter script
$orderData = $this->mongo_db->get_where('order', array('customer_email'=> new MongoRegex("/^asd#asd.com/i")));
Is there any way to optimize the speed of fetching the results?
There are 7,272,699 records in total, and I need to find asd#asd.com.
First, you should add an index on customer_email if there isn't one already. Second, try removing the i flag from the MongoRegex so the query can use the index (case-insensitive regex queries cannot take advantage of an index prefix):
$orderData = $this->mongo_db->get_where('order', array('customer_email'=> new MongoRegex("/^asd#asd.com/")));
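If you can get at the underlying MongoCollection outside the CodeIgniter wrapper, the index is a one-time setup; a sketch with the legacy driver follows (the database name 'mydb' is a placeholder, the collection name is taken from the shell example above). A case-sensitive, anchored prefix regex such as /^asd#asd.com/ can then walk the index instead of scanning all 7 million documents:

```php
// One-time index creation (legacy driver; 'mydb' is a placeholder).
$mongo = new MongoClient();
$collection = $mongo->selectDB('mydb')->selectCollection('order');
$collection->createIndex(array('customer_email' => 1));
```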
I have documents in a collection called Reports that need to be processed. I run a query like
$collectionReports->find(array('processed' => 0))
(anywhere between 50 and 2000 items). I process them as needed and insert the results into another collection, but then I need to update the original Report documents to set processed to the current system time. Right now it looks something like:
$reports = $collectionReports->find(array('processed' => 0));
$toUpdate = array();
foreach ($reports as $report) {
    //Perform the operations on them now
    $toUpdate[] = $report['_id'];
}
foreach ($toUpdate as $reportID) {
$criteria = array('_id' => new MongoId($reportID));
$data = array('$set' => array('processed' => round(microtime(true)*1000)));
$collectionReports->findAndModify($criteria, $data);
}
My problem with this is that it is horribly inefficient. Processing the reports and inserting them into the collection takes maybe 700ms for 2000 reports, but just updating the processed times takes at least 1500ms for those same 2000 reports. Any tips to speed this up? Thanks in advance.
EDIT: The processed time doesn't have to be exact (it can just be the time the script was run, +/- 10 seconds or so). If it were possible to take the object ($report) and update the time directly like that, it would be better than searching again after the first foreach.
Thanks Sammaye, changing from findAndModify() to update() seems to work much better and faster.
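For reference, the faster version can be sketched like this (legacy driver, with $reports and $collectionReports as in the original code): collect the IDs while processing, then mark everything processed with one multi-document update() instead of 2000 findAndModify() round trips.

```php
// Note [] to append each ID rather than overwrite the variable.
$toUpdate = array();
foreach ($reports as $report) {
    // ... perform the operations on $report ...
    $toUpdate[] = $report['_id'];
}

// One round trip marks every processed report at once.
$collectionReports->update(
    array('_id' => array('$in' => $toUpdate)),
    array('$set' => array('processed' => round(microtime(true) * 1000))),
    array('multiple' => true)
);
```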
I have a table with a lot of data, so I retrieve and display it one page at a time (my query is expensive, so there is no way I can run it on the entire table).
But I would like to paginate the results, so I need to know what is the total number of elements in my table.
If I perform a COUNT(*) in the same request, I get the number of selected elements (in my case, 10).
If I perform a COUNT(*) on my table in a second request, the result might be wrong because of the where, join and having clauses in my main query.
What is the cleanest way to:
Retrieve the data
Know the maximum number of elements in my table for this specific request
One solution seems to be using the MySQL function FOUND_ROWS().
Since mysql_query() performs one query at a time, I tried this:
$query = 'SELECT SQL_CALC_FOUND_ROWS * FROM Users';
$result = mysql_query($query);
// fetching the results ...
$query = 'SELECT FOUND_ROWS()';
$result = mysql_query($query);
// debug
while ($row = mysql_fetch_row($result)) {
print_r($row);
}
And I got an array with 0 results:
Array ( [0] => 0 )
Whereas my query does return results.
What is wrong with my approach? Do you have a better solution?
Set mysql.trace_mode to Off if it is On.
ini_set('mysql.trace_mode','Off'); may also work depending on your host configuration if you cannot edit my.cnf
If that doesn't make the code you posted work, then you will need to run the query again without LIMIT and count the rows that way.
The code above works fine; I wasn't opening the connection correctly.
The output is:
Array ( [0] => 10976 )
I am still interested in another way to do it, especially something that is not MySQL-dependent.
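One database-agnostic pattern is simply to run the same filtered query twice, once for the page and once for the total, sharing the WHERE/JOIN/HAVING fragment so the two cannot drift apart. Sketched here with the same mysql_* API used above ($whereClause is a placeholder for the real conditions):

```php
// Shared filter so the page query and the count query stay in sync.
$whereClause = 'WHERE active = 1';  // placeholder for the real clauses

// Page of results:
$result = mysql_query("SELECT * FROM Users $whereClause LIMIT 0, 10");

// Total number of matching rows, for the paginator:
$countRes  = mysql_query("SELECT COUNT(*) AS total FROM Users $whereClause");
$row       = mysql_fetch_assoc($countRes);
$totalRows = (int) $row['total'];
```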