Can't get the exact error from an aggregation on MongoDB (PHP)

When I run this aggregation query on a sample Mongo database of 25 entries, it works. But when the same method is applied to the entire database of 317000 entries, it doesn't respond, and it doesn't show any kind of error either.
What should I do in a case like this?
$mongo = new Mongo("localhost");
$collection = $mongo->selectCollection("sampledb","analytics");
$cursor = $collection->aggregate(
    array(
        array(
            '$group' => array(
                '_id' => array(
                    'os'=>'$feat_os',
                    'minprice'=>'$finder_input_1',
                    'max_price'=>'$finder_input_2',
                    'feat_quadcode'=>'$feat_quadcore'
                ),
                'count' => array('$sum' => 1)
            )
        ),
        array('$out'=>"result")
    )
);
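Not a definitive answer, but one way to surface the error: with the legacy Mongo driver, aggregate() returns a plain array, and on failure that array has ok set to 0 together with errmsg and code. The sketch below turns that array into a readable message. aggregationError is a hypothetical helper name, not part of the driver, and the sample arrays in the comments are illustrative only. A silent hang can also be a timeout, which the driver reports as an exception, so wrapping the call in try/catch for MongoException is worth trying as well.

```php
<?php
/**
 * Hypothetical helper (not part of the Mongo driver).
 * Takes the array returned by MongoCollection::aggregate() and
 * returns null on success or a readable message on failure.
 */
function aggregationError(array $result)
{
    // 'ok' is 1 on success and 0 on failure.
    if (!empty($result['ok'])) {
        return null;
    }
    $code = isset($result['code']) ? $result['code'] : 'unknown';
    $msg  = isset($result['errmsg']) ? $result['errmsg'] : 'no errmsg returned';
    return "aggregate failed (code {$code}): {$msg}";
}

// Intended use (assumes $collection and $pipeline from the question):
//   $result = $collection->aggregate($pipeline);
//   if (($err = aggregationError($result)) !== null) {
//       error_log($err);
//   }
```

Logging the message instead of echoing it keeps the error visible even when the page itself never finishes rendering.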

Related

Exact search query with Elastica QueryBuilder

I'm having a problem with an Elastica QueryBuilder exact search in a Symfony app. The Elastica version is 2.1, which depends on Elasticsearch 1.5.2. I'm searching an index where one of the fields is mapped as
majors:
    type: string
    index: not_analyzed
The field is a student's major that can have one or more words, such as "Ancient History".
The query is set up like this:
/** @var ElasticaFactory $ef */
protected $ef;
public function __construct(ElasticaFactory $ef)
{
$this->ef = $ef;
}
$query = $this->ef->createQuery();
$qb = $this->ef->createQueryBuilder();
$bool = $qb->filter()->bool();
$query->setQuery(
    $qb->query()->filtered(
        $qb->query()->term(),
        $bool->addShould(
            $qb->filter()->term(
                array('majors' => $params['majors'])
            )
        )
    )
);
When I run this query, I get a SearchPhaseExecutionException error. The full text of the error message is here.
Stack trace shows the full contents of the request:
Client ->request ('students/_search', 'GET', array('query' => array('filtered' => array('query' => array('term' => array()), 'filter' => array('bool' => array('should' => array(array('term' => array('majors' => 'Ancient Studies')))))))), array('from' => '0', 'size' => '10'))
in /vagrant/persona/vendor/ruflin/elastica/lib/Elastica/Search.php at line 455
When I set up the query as a "match" below, it works with no errors:
$query->setQuery(
$qb->query()->filtered(
$qb->query()->match('majors', $params['majors']),
$bool
)
);
However, a match query returns too many irrelevant results. I need specifically an exact search query.
Next, in order to eliminate any issues with Elastica, I converted the query into a raw ES query, like this:
$query->setRawQuery(
array(
'query' => array(
'filtered' => array(
'query' => array(
'term' => array()),
'filter' => array(
'bool' => array(
'should' => array(
array(
'term' => array(
'majors' => 'Art History'
)
)
)
)
)
)
)
)
);
This gave me the same error message as quoted above. Looks like there is a problem with the query itself, and not with Elastica.
When I rearranged the query to exactly follow the documentation, as below, it returned no results, even though matching records were present:
$query->setRawQuery(
array(
'query' => array(
'filtered' => array(
'filter' => array(
'term' => array(
'majors' => 'Art History'
)
)
)
)
)
);
Any help would be appreciated!

Elastic search scoring is always 0

Any idea why Elasticsearch would always return a _score of 0 for all the search queries I do?
Using Elastica, I am doing something like:
$elasticaClient= $this->getElasticaClient();
$elasticaIndex = $elasticaClient->getIndex($this->getIndexName());
$elasticaQuery = new Elastica\Query\BoolQuery();
$queryAnd = new \Elastica\Query\BoolQuery();
$queryOr = new \Elastica\Query\BoolQuery();
$queryOr->addShould(new \Elastica\Query\Wildcard('search_field1', $keyword));
$queryOr->addShould(new \Elastica\Query\Wildcard('search_field2', $keyword));
$queryAnd->addMust($queryOr);
$elasticaQuery->addFilter($queryAnd);
$mainQuery = new \Elastica\Query();
$mainQuery->setQuery($elasticaQuery);
$elasticaResultSet = $elasticaIndex->search($mainQuery);
I get a bunch of results back, but the _score for those results is always 0, even if I enter the full word that can be found in the stored field (for a full match).
The mapping for the field is pretty simple:
'search_field1' => array(
'type' => 'string',
'include_in_all' => true,
'analyzer' => 'stringLowercase',
),
The stringLowercase analyzer is just:
'stringLowercase' => array(
'type' => 'custom',
'tokenizer' => 'keyword',
'filter' => 'lowercase'
),
Moreover, even if I try to boost either of the fields, it does not seem to have any effect.
Can anybody shed some light over this?
Have you tried querying something nested like this?
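//i.e.
$queryBool = new \Elastica\Query\BoolQuery();
$queryField = new \Elastica\Query\QueryString($query);
$queryField->setDefaultOperator('OR');
$queryBool->addMustNot($queryField);
//then (addShould() is a BoolQuery method, so the wildcard is wrapped in one)
$queryOr = new \Elastica\Query\BoolQuery();
$queryOr->addShould(new \Elastica\Query\Wildcard('search_field1', $keyword));
$queryOr->addShould($queryBool);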

MongoDB -> DynamoDB Migration

All,
I am attempting to migrate roughly 6GB of Mongo data that is comprised of hundreds of collections to DynamoDB. I have written some scripts using the AWS PHP SDK and am able to port over very small collections but when I try ones that have more than 20k documents (still a very small collection all things considered) it either takes an outrageous amount of time or quietly fails.
Does anyone have some tips/tricks for taking data from Mongo (or any other NoSQL DB) and migrating it to Dynamo, or any other NoSQL DB. I feel like this should be relatively easy because the documents are extremely flat/simple.
Any thoughts/suggestions would be much appreciated!
Thanks!
header.php
<?php
require './aws-autoloader.php';
require './MongoGet.php';
set_time_limit(0);
use \Aws\DynamoDb\DynamoDbClient;
$client = \Aws\DynamoDb\DynamoDbClient::factory(array(
'key' => 'MY_KEY',
'secret' => 'MY_SECRET',
'region' => 'MY_REGION',
'base_url' => 'http://localhost:8000'
));
$collection = "AccumulatorGasPressure4093_raw";
function nEcho($str) {
echo "{$str}<br>\n";
}
echo "<pre>";
test-store.php
<?php
include('test-header.php');
nEcho("Creating table(s)...");
// create test table
$client->createTable(array(
'TableName' => $collection,
'AttributeDefinitions' => array(
array(
'AttributeName' => 'id',
'AttributeType' => 'N'
),
array(
'AttributeName' => 'count',
'AttributeType' => 'N'
)
),
'KeySchema' => array(
array(
'AttributeName' => 'id',
'KeyType' => 'HASH'
),
array(
'AttributeName' => 'count',
'KeyType' => 'RANGE'
)
),
'ProvisionedThroughput' => array(
'ReadCapacityUnits' => 10,
'WriteCapacityUnits' => 20
)
));
$result = $client->describeTable(array(
'TableName' => $collection
));
nEcho("Done creating table...");
nEcho("Getting data from Mongo...");
// instantiate class and get data
$mGet = new MongoGet();
$results = $mGet->getData($collection);
nEcho ("Done retrieving Mongo data...");
nEcho ("Inserting data...");
$i = 0;
foreach($results as $result) {
$insertResult = $client->putItem(array(
'TableName' => $collection,
'Item' => $client->formatAttributes(array(
'id' => $i,
'date' => $result['date'],
'value' => $result['value'],
'count' => $i
)),
'ReturnConsumedCapacity' => 'TOTAL'
));
$i++;
}
nEcho("Done Inserting, script ending...");
I suspect that you are being throttled by DynamoDB, especially if your tables' throughputs are low. The SDK retries the requests, up to 11 times per request, but eventually, the requests fail, which should throw an exception.
You should take a look at the WriteRequestBatch object. This object is basically a queue of items that get sent in batches, and any items that fail to transfer are re-queued automatically. It should provide a more robust solution for what you are doing.
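To make the re-queueing idea concrete, here is a minimal sketch in plain PHP. It makes no AWS SDK calls and is not how WriteRequestBatch is implemented. writeInBatches and the $send callable are hypothetical names: $send takes one batch and returns whichever items were not written. The default batch size of 25 matches the per-request item limit of DynamoDB's BatchWriteItem.

```php
<?php
/**
 * Sketch of a re-queueing batch writer (illustration only, no SDK calls).
 *
 * @param array    $items     Items to write.
 * @param callable $send      Hypothetical sender: takes a batch (array),
 *                            returns the items that were NOT written.
 * @param int      $batchSize Items per request.
 * @param int      $maxRounds Upper bound on passes over the queue.
 * @return array Items still unwritten after $maxRounds passes.
 */
function writeInBatches(array $items, callable $send, $batchSize = 25, $maxRounds = 10)
{
    $queue  = array_values($items);
    $rounds = 0;

    while ($queue && $rounds < $maxRounds) {
        $retry = array();
        foreach (array_chunk($queue, $batchSize) as $batch) {
            // Anything the backend rejected (for example because of
            // throttling) goes back on the queue for the next pass.
            foreach ($send($batch) as $unprocessed) {
                $retry[] = $unprocessed;
            }
        }
        $queue = $retry;
        $rounds++;
    }

    return $queue;
}
```

A real migration script would probably also sleep with an increasing delay between passes, so that a throttled table gets time to recover before the retry.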

CakePHP findList doesn't return aggregated values

The following query returns an array containing the proper ids, but null for all values.
If I remove the aggregation function (AVG()), it returns values (not the averaged ones of course), if I choose e.g. find('all') it returns the average, but not in the list format I want (I could work with that, but I want to try to do it with 'list' first).
$progress = $this->Trial->find('list', array(
'fields' => array(
'Trial.session_id',
'AVG(Trial.first_reaction_time_since_probe_shown) AS average_reaction_time'
),
'group' => 'Trial.session_id',
'conditions' => array(
'Trial.first_valid_response = Trial.probe_on_top',
'TrainingSession.user_id IS NOT NULL'
),
'contain' => array(
'TrainingSession' => array(
'conditions' => array(
'TrainingSession.user_id' => $this->Auth->user('id')
)
)
),
'recursive' => 1,
));
The generated SQL query returns exactly the result I want, when I send it to the DB via PhpMyAdmin.
SELECT
`Trial`.`session_id`,
AVG(`Trial`.`first_reaction_time_since_probe_shown`) AS average_reaction_time
FROM
`zwang`.`trials` AS `Trial`
LEFT JOIN
`zwang`.`training_sessions` AS `TrainingSession` ON (
`Trial`.`session_id` = `TrainingSession`.`id` AND
`TrainingSession`.`user_id` = 1
)
WHERE
`Trial`.`first_valid_response` = `Trial`.`probe_on_top`
GROUP BY
`Trial`.`session_id`
I've examined the source for find('list'). I think it's due to the "array path" for accessing the list getting screwed up when using functions in the query, but I couldn't fix it yet (or recognise my abuse of CakePHP logic).
Once I posted the question, Stack Overflow started showing me the relevant answers.
Apparently, it can't be done with 'list' without virtualFields.
I didn't expect that because it worked using the other find-types.
$this->Trial->virtualFields = array(
'average_reaction_time' => 'AVG(Trial.first_reaction_time_since_probe_shown)'
);
$progress = $this->Trial->find('list', array(
'fields' => array('Trial.session_id','average_reaction_time')
/* etc... */
));
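If you would rather keep find('all') and build the list yourself, a small loop is enough. This is a sketch under one assumption: that the rows look the way CakePHP 2 normally returns them, with model fields under the model alias and computed columns such as AVG(...) under the numeric key 0. listFromAggregateRows is a hypothetical helper name, not a CakePHP function.

```php
<?php
/**
 * Hypothetical helper: turns find('all') rows into a key => value list.
 * Assumed row shape:
 *   array(
 *       'Trial' => array('session_id' => 3),
 *       0       => array('average_reaction_time' => '512.5'),
 *   )
 */
function listFromAggregateRows(array $rows, $alias, $keyField, $valueField)
{
    $list = array();
    foreach ($rows as $row) {
        // Key comes from the model alias, value from the computed column at index 0.
        $list[$row[$alias][$keyField]] = $row[0][$valueField];
    }
    return $list;
}

// Intended use (assumes $rows came from $this->Trial->find('all', ...)):
//   $progress = listFromAggregateRows($rows, 'Trial', 'session_id', 'average_reaction_time');
```

The virtualFields approach above is still the tidier option when you want 'list' to work directly.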

CakePHP paginate and order by

It feels like I've tried everything so I now come to you.
I am trying to order my data but it isn't going so well, kinda new to Cake.
This is my code:
$this->set('threads', $this->paginate('Thread', array(
'Thread.hidden' => 0,
'Thread.forum_category_id' => $id,
'order' => array(
'Thread.created' => 'desc'
)
)));
It generates an SQL error and this is the last and interesting part:
AND `Thread`.`forum_category_id` = 12 AND order = ('desc') ORDER BY `Thread`.`created` ASC LIMIT 25
How can I fix this? The field created obviously exists in the database. :/
You need to pass in the conditions key when using multiple options (e.g. order, limit). If you only specify conditions, you can pass them directly as the second parameter.
This should do it:
$this->set('threads', $this->paginate('Thread', array(
'conditions' => array(
'Thread.hidden' => 0,
'Thread.forum_category_id' => $id
),
'order' => array(
'Thread.created' => 'desc'
)
)));
or perhaps a little clearer:
$this->paginate['order'] = array('Thread.created' => 'desc');
$this->paginate['conditions'] = array('Thread.hidden' => 0, ...);
$this->paginate['limit'] = 10;
$this->set('threads', $this->paginate());
If you get an error, add public $paginate; to the top of your controller.
Try
$this->set('threads', $this->paginate('Thread', array(
'Thread.hidden' => 0,
'Thread.forum_category_id' => $id
),
array(
'Thread.created' => 'desc'
)
));
I'm not a Cake master, just a guess.
EDIT: Yes, that's right. Cake manual excerpt:
Control which fields used for ordering
...
$this->paginate('Post', array(), array('title', 'slug'));
So order is the third argument.
Try
$all_threads = $this->Threads->find('all',
array(
'order' => 'Threads.created'
)
);
$saida = $this->paginate($all_threads,[
'conditions' => ['Threads.hidden' => 0]
]);
There are a few things to take note of when using paginate with order. For Cake 3.x, you need to:
1) Ensure you have included the fields in 'sortWhitelist'
$this->paginate = [
'sortWhitelist' => [
'hidden', 'forum_category_id',
],
];
2) For 'order', if you put it under $this->paginate, you will not be able to sort that field in the view. So it is better to put the 'order' in the query (sadly this wasn't stated in the docs)
$query = $this->Thread->find()
->where( ['Thread.hidden' => 0, 'Thread.forum_category_id' => $id, ] )
->order( ['Thread.created' => 'desc'] );
$this->set('threads', $this->paginate($query));
