Any idea why Elasticsearch would always return a _score of 0 for every search query I run?
Using Elastica, I am doing something like:
$elasticaClient = $this->getElasticaClient();
$elasticaIndex = $elasticaClient->getIndex($this->getIndexName());
$elasticaQuery = new \Elastica\Query\BoolQuery();
$queryAnd = new \Elastica\Query\BoolQuery();
$queryOr = new \Elastica\Query\BoolQuery();
$queryOr->addShould(new \Elastica\Query\Wildcard('search_field1', $keyword));
$queryOr->addShould(new \Elastica\Query\Wildcard('search_field2', $keyword));
$queryAnd->addMust($queryOr);
$elasticaQuery->addFilter($queryAnd);
$mainQuery = new \Elastica\Query();
$mainQuery->setQuery($elasticaQuery);
$elasticaResultSet = $elasticaIndex->search($mainQuery);
I get a bunch of results back, but the _score for those results is always 0, even if I enter the full word that can be found in the stored field (for a full match).
The mapping for the field is pretty simple:
'search_field1' => array(
'type' => 'string',
'include_in_all' => true,
'analyzer' => 'stringLowercase',
),
The stringLowercase analyzer is just:
'stringLowercase' => array(
'type' => 'custom',
'tokenizer' => 'keyword',
'filter' => 'lowercase'
),
Moreover, even if I try to boost either of the fields, it does not seem to have any effect.
Can anybody shed some light over this?
Have you tried a nested query like this?
// i.e.
$queryBool = new \Elastica\Query\BoolQuery();
$queryField = new \Elastica\Query\QueryString($query);
$queryField->setDefaultOperator('OR');
$queryBool->addMustNot($queryField);
// then
$queryOr = new \Elastica\Query\Wildcard('search_field1', $keyword);
$queryBool->addShould($queryOr);
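One likely explanation, offered as a sketch rather than a confirmed diagnosis: clauses attached with addFilter() run in filter context, where Elasticsearch only decides whether a document matches and does not compute relevance, so a bool query made up solely of filter clauses reports a _score of 0. The plain-array bodies below are hand-written for illustration (they are not output captured from Elastica) and contrast the two shapes. The helper name wildcardClause and the keyword value are hypothetical.

```php
<?php
// Hypothetical helper: builds one wildcard clause as a plain array.
function wildcardClause($field, $pattern)
{
    return array('wildcard' => array($field => $pattern));
}

$keyword = 'foo*'; // assumed example pattern

$shouldClauses = array(
    wildcardClause('search_field1', $keyword),
    wildcardClause('search_field2', $keyword),
);

// Shape of the request in the question: everything sits under "filter",
// so matching documents are returned but not scored.
$filterOnly = array('query' => array('bool' => array(
    'filter' => array(
        array('bool' => array('should' => $shouldClauses)),
    ),
)));

// Alternative shape: the same clauses under "must", i.e. query context.
$scored = array('query' => array('bool' => array(
    'must' => array(
        array('bool' => array('should' => $shouldClauses)),
    ),
)));

echo json_encode($scored), "\n";
```

With Elastica, the second shape would correspond to calling addMust($queryAnd) instead of addFilter($queryAnd) on the outer BoolQuery.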
I'm trying to use the Firebase PHP API to update/append a document field's array with a map
I have the following code in Python that works fine
ref = db.collection(u'jobs').document(jobId)
ref.update({
u'messages': firestore.ArrayUnion([{
u'category': u'0',
u'message': u'TEST',
u'sender': u'TEAM',
}])
})
However, when I try to replicate it in PHP, it doesn't work. I tried a lot of different ways to view the errors, but all I get is 500 INTERNAL SERVER ERROR.
require 'vendor/autoload.php';
use Google\Cloud\Firestore\FirestoreClient;
use Google\Cloud\Firestore\FieldValue;
$firestore = new FirestoreClient([
'projectId' => 'XXX-XX',
'credentials' => 'key.json'
]);
$jobId = "XXXX";
$docRef = $firestore->collection('jobs')->document($jobId);
$docRef->update([
'messages' => FieldValue::arrayUnion([{
'category' : '0',
'message' : 'TEST',
'sender' : 'TEAM',
}])
]);
I looked up samples of Array Union in PHP and of adding data with PHP. I've tried a lot of variations of : or => or arrayUnion([]) or arrayUnion({[]}), to no avail.
Any idea what is causing this?
Looks like there are a few things going wrong here.
First, PHP uses arrays for both maps and "normal" arrays. There is no object literal ({}) in PHP. Array values are specified using the => operator, not :.
Second, DocumentReference::update() accepts a list of values you wish to change, with the path and value. So an update call would look like this:
$docRef->update([
['path' => 'foo', 'value' => 'bar']
]);
You can use DocumentReference::set() for the behavior you desire. set() will create a document if it does not exist, where update() will raise an error if the document does not exist. set() will also replace all the existing fields in the document unless you specify merge behavior:
$docRef->set([
'foo' => 'bar'
], ['merge' => true]);
Therefore, your code can be rewritten in either of the following ways, first with set() and the merge option, then with update():
$jobId = "XXXX";
$docRef = $firestore->collection('jobs')->document($jobId);
$docRef->set([
'messages' => FieldValue::arrayUnion([[
'category' => '0',
'message' => 'TEST',
'sender' => 'TEAM',
]])
], ['merge' => true]);
$jobId = "XXXX";
$docRef = $firestore->collection('jobs')->document($jobId);
$docRef->update([
[
'path' => 'messages', 'value' => FieldValue::arrayUnion([[
'category' => '0',
'message' => 'TEST',
'sender' => 'TEAM',
]])
]
]);
One final thing to note: arrayUnion will not append duplicate values. So if the value you provide (including all keys and values in the nested map) already exists, it will not be appended to the document.
If you haven't already, turn up error reporting in your development environment to receive information about why your code is failing. PHP will inform you about the parse errors your snippet included, and the Firestore client will give you errors which can often be quite useful.
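To make the first point above concrete, here is a minimal sketch of how a list containing one map is written with nothing but PHP arrays. json_encode is used only to visualise the nesting; it is not part of the Firestore client.

```php
<?php
// A PHP array with string keys plays the role of a map; one with
// sequential integer keys plays the role of a list.
$message = array(
    'category' => '0',
    'message'  => 'TEST',
    'sender'   => 'TEAM',
);

// A list containing one map: the shape passed to arrayUnion() above.
$messages = array($message);

// Visualise the nesting as JSON.
echo json_encode($messages), "\n";
```

The outer array is the list of elements to union in, and the inner array is the map itself, which is why the corrected calls use doubled brackets.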
From Firebase Documentation:
$cityRef = $db->collection('cities')->document('DC');
// Atomically add a new region to the "regions" array field.
$cityRef->update([
['path' => 'regions', 'value' => FieldValue::arrayUnion(['greater_virginia'])]
]);
I would assume you would like something like this:
$docRef = $firestore->collection('jobs')->document($jobId);
// Atomically add new values to the "messages" array field.
$docRef->update([
['path' => 'messages', 'value' => FieldValue::arrayUnion([[
'category' => '0',
'message' => 'TEST',
'sender' => 'TEAM',
]])]
]);
When I use aggregate in PHP, I get this error:
MongoResultException: localhost:27017: The 'cursor' option is
required, except for aggregate with the explain argument
I use mongoDB 3.6 and PHP 5.6
My Code:
$dbconn = new MongoClient();
$c = $dbconn->selectDB("test")->selectCollection("users");
$ops = array(
array(
'$lookup' => array(
'from' => 'news',
'localField' => '_id',
'foreignField' => 'user_id',
'as' => 'user_docs'
)
)
);
$results = $c->aggregate($ops);
var_dump($results);
For other people who may run into the same problem, here is the solution.
The aggregate command was modified in version 3.6, as indicated in the documentation:
Changed in version 3.4: MongoDB 3.6 removes the use of aggregate command without the cursor option unless the command includes the explain option. Unless you include the explain option, you must specify the cursor option.
In Mongo, you could just add the cursor option without specifying any parameter, as specified in the documentation:
cursor: {}
In PHP, you would need to specify the option like this, with new stdClass() corresponding to an empty object '{}' in Mongo:
$results = $c->aggregate($ops, ['cursor' => new \stdClass()]);
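As a rough illustration of why stdClass is used rather than an empty array, the sketch below serialises both to JSON. JSON stands in for BSON here, on the assumption that the driver draws the same list-versus-object distinction; it does not talk to MongoDB at all.

```php
<?php
// An empty PHP array is ambiguous and serialises as a list...
$asArray = json_encode(array('cursor' => array()));

// ...whereas an empty stdClass is unambiguously an empty object.
$asObject = json_encode(array('cursor' => new \stdClass()));

echo $asArray, "\n";  // {"cursor":[]}
echo $asObject, "\n"; // {"cursor":{}}
```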
Here's how to do it for your example :
$dbconn = new MongoClient();
$c = $dbconn->selectDB("test")->selectCollection("users");
$ops = array(
array(
'$lookup' => array(
'from' => 'news',
'localField' => '_id',
'foreignField' => 'user_id',
'as' => 'user_docs'
)
)
);
$results = $c->aggregate($ops, ['cursor' => new \stdClass()]);
var_dump($results);
If you want to use the cursor option to pass parameters such as batchSize, you can do it like this:
$results = $c->aggregate($ops, ['cursor' => ['batchSize' => 200]]);
All the parameters are listed in the documentation page linked above.
In my example code I am using the php client library, but it should be understood by anyone familiar with elasticsearch.
I'm using elasticsearch to create an index where each document contains an array of nGram indexed authors. Initially, the document will have a single author, but as time progresses, more authors will be appended to the array. Ideally, a search could be executed by an author's name, and if any of the authors in the array get matched, the document will be found.
I have been trying to use the documentation here for appending to the array and here for using the array type - but I have not had success getting this working.
First, I want to create an index for documents, with a title, array of authors, and an array of comments.
$client = new Client();
$params = [
'index' => 'document',
'body' => [
'settings' => [
// Simple settings for now, single shard
'number_of_shards' => 1,
'number_of_replicas' => 0,
'analysis' => [
'filter' => [
'shingle' => [
'type' => 'shingle'
]
],
'analyzer' => [
'my_ngram_analyzer' => [
'tokenizer' => 'my_ngram_tokenizer',
'filter' => 'lowercase',
]
],
// Allow searching for partial names with nGram
'tokenizer' => [
'my_ngram_tokenizer' => [
'type' => 'nGram',
'min_gram' => 1,
'max_gram' => 15,
'token_chars' => ['letter', 'digit']
]
]
]
],
'mappings' => [
'_default_' => [
'properties' => [
'document_id' => [
'type' => 'string',
'index' => 'not_analyzed',
],
// The name, email, or other info related to the person
'title' => [
'type' => 'string',
'analyzer' => 'my_ngram_analyzer',
'term_vector' => 'yes',
'copy_to' => 'combined'
],
'authors' => [
'type' => 'list',
'analyzer' => 'my_ngram_analyzer',
'term_vector' => 'yes',
'copy_to' => 'combined'
],
'comments' => [
'type' => 'list',
'analyzer' => 'my_ngram_analyzer',
'term_vector' => 'yes',
'copy_to' => 'combined'
],
]
],
]
]
];
// Create index `document` with ngram indexing
$client->indices()->create($params);
Right from the start, I can't even create the index due to this error:
{"error":"MapperParsingException[mapping [_default_]]; nested: MapperParsingException[No handler for type [list] declared on field [authors]]; ","status":400}
Had this succeeded, though, my plan was to index a document starting with empty arrays for authors and comments, something like this:
$client = new Client();
$params = array();
$params['body'] = array('document_id' => 'id_here', 'title' => 'my_title', 'authors' => [], 'comments' => []);
$params['index'] = 'document';
$params['type'] = 'example_type';
$params['id'] = 'id_here';
$ret = $client->index($params);
return $ret;
This seems like it should work if I had the desired index to add this structure of information to, but what concerns me would be appending something to the array using update. For example,
$client = new Client();
$params = array();
//$params['body'] = array('person_id' => $person_id, 'emails' => [$email]);
$params['index'] = 'document';
$params['type'] = 'example_type';
$params['id'] = 'id_here';
$params['script'] = 'NO IDEA WHAT THIS SCRIPT SHOULD BE TO APPEND TO THE ARRAY';
$ret = $client->update($params);
return $ret;
I am not sure how I would go about actually appending a thing to the array and making sure it's indexed.
Finally, another thing that confuses me is how I could search based on any author in the array. Ideally I could do something like this:
But I'm not 100% sure whether it will work. Maybe there is something fundamental about Elasticsearch that I am not understanding. I am completely new to it, so any resources that will get me to a point where these little details don't hang me up would be appreciated.
Also, any direct advice on how to use elasticsearch to solve these problems would be appreciated.
Sorry for the big wall of text. To recap, I am looking for advice on how to:
Create an index that supports nGram analysis on all elements of an array
Update that index to append to the array
Search the now-updated index
Thanks for any help
EDIT: thanks to @astax, I am now able to create the index and append to the value as a string. HOWEVER, there are two problems with this:
the array is stored as a string value, so a script like
$params['script'] = 'ctx._source.authors += [\'hello\']';
actually appends a STRING with [] rather than an array containing a value.
the value inputted does not appear to be ngram analyzed, so a search like this:
$client = new Client();
$searchParams['index'] = 'document';
$searchParams['type'] = 'example_type';
$searchParams['body']['query']['match']['_all'] = 'hello';
$queryResponse = $client->search($searchParams);
print_r($queryResponse); // SUCCESS
will find the new value but a search like this:
$client = new Client();
$searchParams['index'] = 'document';
$searchParams['type'] = 'example_type';
$searchParams['body']['query']['match']['_all'] = 'hel';
$queryResponse = $client->search($searchParams);
print_r($queryResponse); // NO RESULTS
does not
There is no "list" type in Elasticsearch, but you can use the "string" field type and store an array of values.
....
'comments' => [
'type' => 'string',
'analyzer' => 'my_ngram_analyzer',
'term_vector' => 'yes',
'copy_to' => 'combined'
],
....
And index a document this way:
....
$params['body'] = array(
'document_id' => 'id_here',
'title' => 'my_title',
'authors' => [],
'comments' => ['comment1', 'comment2']);
....
As for the script for appending an element to the array, this answer may help you - Elasticsearch upserting and appending to array
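For the append script, a minimal sketch of the request parameters is shown below. It assumes the Elasticsearch 1.x update syntax (a script string plus a params map), that dynamic scripting is enabled on the cluster, and that authors already holds a real array. The parameter name new_author is hypothetical, and the call to the client is left as a comment so the snippet runs without a cluster.

```php
<?php
// Request parameters for an update that appends one author.
// Assumes Elasticsearch 1.x script syntax and that `authors` is already an array.
$params = array(
    'index' => 'document',
    'type'  => 'example_type',
    'id'    => 'id_here',
    'body'  => array(
        // `new_author` is a hypothetical parameter name, bound just below.
        'script' => 'ctx._source.authors += new_author',
        'params' => array('new_author' => 'hello'),
    ),
);

// Show the body that would be sent.
echo json_encode($params['body']), "\n";

// These parameters would then be handed to the client:
// $ret = $client->update($params);
```

Passing the value through params rather than splicing it into the script string keeps quoting out of the script itself.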
However, do you really need to update the document? It might be easier to just reindex it as this is exactly what Elasticsearch does internally. It reads the "_source" property, does the required modification and reindexes it. BTW, this means that "_source" must be enabled and all properties of the document should be included into it.
You may also consider storing comments and authors (as I understand it, these are the authors of comments, not the document authors) as child documents in ES and using the "has_child" filter.
I can't really give you a specific solution, but I strongly recommend installing the Marvel plugin for Elasticsearch and using its "Sense" tool to check how your overall process works step by step.
First, check that your tokenizer is properly configured by running tests as described at http://www.elastic.co/guide/en/elasticsearch/reference/1.4/indices-analyze.html.
Then check that your update script is doing what you expect by retrieving the document with GET /document/example_type/some_existing_id
The authors and comments should be arrays, not strings.
Finally perform the search:
GET /document/_search
{
  "query": {
    "match": { "_all": "hel" }
  }
}
If you're building the query yourself rather than getting it from the user, you may use query_string with wildcards:
GET /document/_search
{
  "query": {
    "query_string": {
      "fields": ["_all"],
      "query": "hel*"
    }
  }
}
Following this question I gather that upsert: false and multi: true fields need to be set for this to work.
However, when I try to code this in PHP, I have a problem:
$conn = new Mongo("mongodb://foo:bar@localhost:27017");
$db = $conn->selectDB("someDB");
$data = array('$rename' => array(
'nmae' => 'name'
));
$db->command(array(
'findAndModify' => 'foo',
'update' => $data,
'upsert' => 'false',
'multi' => 'true'
));
After running this script, only the first document with the nmae typo is changed to name; the rest still say nmae. The same as if I had run it without the upsert and multi options.
I also tried this:
$data = array('$rename' => array(
'nmae' => 'name'
),
'upsert' => 'false',
'multi' => 'true'
);
$db->command(array(
'findAndModify' => 'foo',
'update' => $data
));
But that does the same thing.
Any way to get this working?
The findAndModify query doesn't have a "multi" option:
http://www.php.net/manual/en/mongocollection.findandmodify.php
What you probably want to use is update instead:
http://www.php.net/manual/en/mongocollection.update.php
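A minimal sketch of that approach follows. It assumes the legacy driver's MongoCollection::update(criteria, new object, options) signature with its upsert and multiple options. The helper name renameField is hypothetical, and $collection is only expected to behave like a MongoCollection, so the function can be exercised without the extension installed.

```php
<?php
// Hypothetical helper: renames a field in every document that has it.
// $collection is expected to behave like the legacy driver's MongoCollection.
function renameField($collection, $oldName, $newName)
{
    // Only touch documents that actually contain the misspelled field.
    $criteria = array($oldName => array('$exists' => true));
    $update   = array('$rename' => array($oldName => $newName));
    // Real booleans, not the strings 'false' / 'true'.
    $options  = array('upsert' => false, 'multiple' => true);

    return $collection->update($criteria, $update, $options);
}

// Usage (requires the legacy mongo extension, so shown as a comment):
// renameField($db->selectCollection('foo'), 'nmae', 'name');
```

Note that the options are booleans: the string 'false' is truthy in PHP, so passing it would not switch an option off.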
I would like to implement
"SELECT * FROM TABLE_NAME
WHERE
name like '$query_string' or
title like '%$query_string%' or
tags like '%$query_string%'"
in MongoDB, and I tried
$condition = array('$or' =>
array('writer'=> array('name'=>"$query_string"),
'title'=> new MongoRegex("/$query_string/"),
'tags' => new MongoRegex("/$query_string/") ));
but this does not work.
What is the proper way to express that SQL in MongoDB?
Here's how I construct a case-insensitive "contains" term:
$containsTerm = new MongoRegex(sprintf('/%s/i', preg_quote($term, '/')));
So your condition might look like
$condition = array('$or' => array(
array('writer.name' => $term),
array('title' => $containsTerm),
array('tags' => $containsTerm)
));
Apologies if the condition array is wrong; I typically use the Doctrine ODM.
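To show what the preg_quote call contributes, here is a small sketch that builds the same pattern string and tries it with PHP's own preg_match. This only demonstrates the escaping; it assumes the server-side regex engine treats the escaped pattern the same way. buildContainsPattern is a hypothetical helper name.

```php
<?php
// Builds the pattern string that would be passed to MongoRegex.
// buildContainsPattern is a hypothetical helper name.
function buildContainsPattern($term)
{
    // Escape regex metacharacters and the "/" delimiter so the
    // term is matched literally.
    return sprintf('/%s/i', preg_quote($term, '/'));
}

$pattern = buildContainsPattern('a.b/c');
echo $pattern, "\n";

// preg_match is used here only to demonstrate the matching behaviour.
var_dump(preg_match($pattern, 'xx A.B/C xx')); // matches, ignoring case
var_dump(preg_match($pattern, 'aXb/c'));       // "." is literal, so no match
```

Without the escaping, a user-supplied term containing characters such as . or * would be interpreted as regex syntax rather than as plain text.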