While trying to replace PHP code with a pure Elasticsearch Painless script, I noticed that the update doesn't return "noop" even if the document is identical before and after the update.
I'm not sure whether there are any consequences of the version being bumped every time the code is executed. How does it scale?
I'm simply trying to update the views of a post during a visit if the identity was not found in views_log, and I was wondering whether there is a way to get the "noop" result back, or to somehow have it cancel the update.
The code I have right now looks like this:
$script = 'if (!ctx._source.views_log.contains(params.identity)) {
ctx._source.views_log.add(params.identity);
ctx._source.views += 1;
}';
$params = [
'index' => 'post',
'id' => 4861,
'body' => [
'script' => [
'source' => $script,
'lang' => "painless",
'params' => [
'identity' => $identifier
]
]
]
];
$response = $client->update($params);
According to Elasticsearch's documentation:
ctx['op']:
Use the default of index to update a document. Set to none to specify no operation or delete to delete the current document from the index.
I tried setting ctx.op to none if the condition is not met, but that didn't seem to work.
While writing this question I figured it out, so I might as well share it with others.
none is an accepted keyword for ctx.op, but ctx.op expects a string, so it has to be quoted: change none to "none".
So the full script should look like this:
$script = 'if (!ctx._source.views_log.contains(params.identity)) {
ctx._source.views_log.add(params.identity);
ctx._source.views += 1;
} else {
ctx.op = "none";
}';
$params = [
'index' => 'post',
'id' => 4861,
'body' => [
'script' => [
'source' => $script,
'lang' => "painless",
'params' => [
'identity' => $identifier
]
]
]
];
$response = $client->update($params);
This will give the desired "result": "noop"
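For completeness, the returned array can then be checked for that result, e.g.:
if (isset($response['result']) && $response['result'] === 'noop') {
    // nothing changed, so the document version was not bumped
}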
For currency conversion I am using the "florianv/laravel-swap": "^1.1" library (florianv/laravel-swap).
Since Fixer.io changed its implementation, it is now necessary to pass an access_key with the request, and because of that I am getting this error: "InvalidArgumentException: The "access_key" option must be provided to use fixer.io in /var/www/project/project-files/vendor/florianv/exchanger/src/Service/Fixer.php:51".
I registered and got the access_key.
I updated the library using Composer, and now I can see three constants in vendor/florianv/exchanger/src/Service/Fixer.php:
const ACCESS_KEY_OPTION = 'access_key';
const LATEST_URL = 'http://data.fixer.io/api/latest?base=%s&access_key=%s';
const HISTORICAL_URL = 'http://data.fixer.io/api/%s?base=%s&access_key=%s';
To pass the access key I tried the following.
I have a swap.php in the config folder which looks something like this:
return [
'options' => [
'cache_ttl' => 86400, // 24 hours.
'cache_key_prefix' => 'currency_rate'
],
'services' => [
'fixer' => true,
],
'currency_layer' => [
'access_key' => 'asdfas7832mw3nsdfa776as8dfa', // Your app id
'enterprise' => true, // True if your AppId is an enterprise one
],
'cache' => env('CACHE_DRIVER', 'file'),
'http_client' => null,
'request_factory' => null,
'cache_item_pool' => null,
];
This config had one more option that was commented out; I enabled it and passed the access_key there, but it doesn't work.
I also added it in the services block below 'fixer' => true:
'currency_layer' => [
'access_key' => 'asdfas7832mw3nsdfa776as8dfa'
]
I also tried it in the options block:
'options' => [
'cache_ttl' => 86400, // 24 hours.
'cache_key_prefix' => 'currency_rate',
'access_key'=>'7ca208e9136c5e140d6a14427bf9ed21'
],
I also tried adding the access_key in the config/services.php file, but that didn't work either:
'fixer' => [
'access_key' => 'asdfas7832mw3nsdfa776as8dfa'
],
I even tried adding it to the .env file and reading it from there, but with no success. How do I pass the access_key? Can anyone help me with this? What should the approach be?
vendor/florianv/exchanger/src/Service/Fixer.php -> don't touch the constants (that was my own error).
Pass the options array when creating the Builder:
use Swap\Builder; // the Builder class from florianv/swap

$options = ['access_key' => 'YourGeneratedAPIKeyAtCurrencyLayer'];
$this->exchangeSwap = (new Builder($options))
    ->add('fixer', $options)
    ->build();
I hope I could help ;-)
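If you prefer to keep everything in config/swap.php instead of building Swap manually, passing the key as the fixer service options may also work (untested here; adjust to your package version): replace 'fixer' => true with
'services' => [
    'fixer' => [
        'access_key' => 'asdfas7832mw3nsdfa776as8dfa'
    ],
],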
I am using Elasticsearch 6.2 with the elasticsearch-php 6.0 client. There is a situation where I got stuck: I need to set userid = 987 on every document where userid = 123. I went through the update API, but every call there needs the document ID (like POST test/_doc/1/_update), so I would first have to fetch the _id and then run the update with POST test/_doc/{_id}/_update.
It isn't practical to look up the _id every time, so that didn't help me.
I found another option, _update_by_query, which I got working with the following API call:
curl -XPOST 'localhost:9200/my_index/my_type/_update_by_query?pretty' -H 'Content-Type: application/json' -d '
{
"query":{
"term":{
"userid":123
}
},
"script":{
"lang":"painless",
"inline":"ctx._source.userid = params.value",
"params":{
"value":987
}
}
}'
I can't find any reference showing how to use _update_by_query with the elasticsearch-php client. Also, let me know if you have a better way to tackle this. Thanks!
I found the solution, so I would like to share it:
$client = \Elasticsearch\ClientBuilder::create()->setHosts(['127.0.0.1:9200'])->build();
$update = [
'index' => 'my_index',
'type' => 'my_type',
'conflicts' => 'proceed',
'body' => [
'query' => [
'term' => [
"userid" => 987
]
],
'script' => [
'lang' => 'painless',
'inline' => 'ctx._source.userid = params.userid',
'params' => [
'userid' => 987
]
]
]
];
$results = $client->updateByQuery($update);
It solved my problem. I think Elasticsearch should document this.
I am using the official PHP driver to connect to Elasticsearch (v2.3). Every time I index a new document, it takes from 5 to 60 seconds before it shows up in my filter results. How can I cut the delay down to zero?
Here is my index query
# Document Body
$data = [];
$data['time'] = $time;
$data['unique'] = 1;
$data['lastACtivity'] = $time;
$data['bucket'] = 20;
$data['permission'] = $this->_user->permission; # Extracts User Permission
$data['ipaddress'] = $this->_client->ipaddress(); # Extracts User IP Address
# Construct Index
$indexRequest = [
'index' => 'gorocket',
'type' => 'log',
'refresh' => true,
'body' => $data
];
# Indexing Document
$confirmation = $client->index( $indexRequest );
And here is my search filter query
# Query array
$query =[ 'query' => [
'filtered' => [
'filter' => [
'bool' => [
'must' =>[
[
'match' => [ 'unique' => 1 ]
],
[
'range' => [
'lastACtivity' => [
'gte' => $from,
'lte' => $to
],
'_cache' => false
]
]
],
'must_not' => [
[ 'match' => [ 'type' => 'share' ] ],
]
]
]
]
]
];
# Prepare filter parameters
$filterParams = [
'index' => 'gorocket',
'type' => 'log',
'size' => 20,
'query_cache' => false,
'body' => $query
];
$client->search($filterParams);
Thank you.
When you index a new document you can specify the refresh parameter in order to make the new document available immediately for your next search operation.
$params = [
'index' => 'my-index',
'type' => 'my-type',
'id' => 123,
'refresh' => true, // <--- add this
'body' => $data // your document body
];
$response = $client->index($params);
The refresh parameter is also available on the bulk operation if you're using it.
Be aware, though, that refreshing too often can have negative impacts on performance.
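As a sketch, a bulk request with the refresh flag could look like this (index and type names reused from the question; $data and $moreData stand for document bodies):
$bulkParams = [
    'refresh' => true, // make the documents searchable as soon as the bulk call returns
    'body' => [
        ['index' => ['_index' => 'gorocket', '_type' => 'log']],
        $data,
        ['index' => ['_index' => 'gorocket', '_type' => 'log']],
        $moreData,
    ]
];
$responses = $client->bulk($bulkParams);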
There is a refresh_interval index setting, which takes a time value (e.g. in seconds) that controls how often the index is refreshed. If you update something in the index, it gets written to the index but is not available for reading until the index has been refreshed.
The refresh parameter can also be set to true on a request to refresh the index as soon as the change happens. This needs to be thought through carefully, because it often degrades performance: refreshing after every small operation is overkill, and many frequent refreshes can keep the index busy.
Tip: use an Elasticsearch plugin such as kopf to see and configure more options like the refresh rate.
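If you would rather tune how often the index refreshes instead of forcing a refresh per request, the index.refresh_interval setting can also be changed through the PHP client; a minimal sketch, reusing the gorocket index from the question:
$client->indices()->putSettings([
    'index' => 'gorocket',
    'body' => [
        'index' => [
            'refresh_interval' => '1s' // the default; lower values cost more, '-1' disables automatic refresh
        ]
    ]
]);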
I have a field called url that is set to not_analyzed when I index it:
'url' => [
'type' => 'string',
'index' => 'not_analyzed'
]
Here is my method to determine if a URL already exists in the index:
public function urlExists($index, $type, $url) {
$params = [
'index' => $index,
'type' => $type,
'body' => [
'query' => [
'match' => [
'url' => $url
]
]
]
];
$results = $this->client->count($params);
return ($results['count'] > 0);
}
This seems to work fine; however, I can't be 100% sure this is the correct way to find an exact match, since reading the docs suggests another way to do the search, with params like:
$params = [
'index' => $index,
'type' => $type,
'body' => [
'query' => [
'filtered' => [
'filter' => [
'term' => [
'url' => $url
]
]
]
]
]
];
My question is would either params work the same way for a not_analyzed field?
The second query is the right approach. Term-level queries/filters should be used for exact matches. The biggest advantage is caching: Elasticsearch uses bitsets for filters, so you get quicker response times on subsequent calls.
From the Docs
Exclude as many documents as you can with a filter, then query just the documents that remain.
Also, if you look at your output, you will find that the _score of every document is 1, because scoring is not applied to filters (the same goes for highlighting), whereas with a match query you will see different _score values. Again, from the docs:
Keep in mind that once you wrap a query as a filter, it loses query
features like highlighting and scoring because these are not features
supported by filters.
Your first query uses match, which is meant for analyzed fields, e.g. when you want both Google and google to match all documents containing google (case-insensitive).
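Putting it together, a sketch of urlExists() using the term filter form from your second snippet (still counting matches, as in your first one):
public function urlExists($index, $type, $url) {
    $params = [
        'index' => $index,
        'type' => $type,
        'body' => [
            'query' => [
                'filtered' => [
                    'filter' => [
                        'term' => [
                            'url' => $url // exact match against the not_analyzed field
                        ]
                    ]
                ]
            ]
        ]
    ];
    $results = $this->client->count($params);
    return ($results['count'] > 0);
}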
Hope this helps!!
In my example code I am using the PHP client library, but it should be understandable to anyone familiar with Elasticsearch.
I'm using Elasticsearch to create an index where each document contains an array of nGram-indexed authors. Initially, a document will have a single author, but as time progresses more authors will be appended to the array. Ideally, a search could be executed by an author's name, and if any of the authors in the array match, the document would be found.
I have been trying to use the documentation here for appending to the array and here for using the array type - but I have not had success getting this working.
First, I want to create an index for documents, with a title, array of authors, and an array of comments.
$client = new Client();
$params = [
'index' => 'document',
'body' => [
'settings' => [
// Simple settings for now, single shard
'number_of_shards' => 1,
'number_of_replicas' => 0,
'analysis' => [
'filter' => [
'shingle' => [
'type' => 'shingle'
]
],
'analyzer' => [
'my_ngram_analyzer' => [
'tokenizer' => 'my_ngram_tokenizer',
'filter' => 'lowercase',
]
],
// Allow searching for partial names with nGram
'tokenizer' => [
'my_ngram_tokenizer' => [
'type' => 'nGram',
'min_gram' => 1,
'max_gram' => 15,
'token_chars' => ['letter', 'digit']
]
]
]
],
'mappings' => [
'_default_' => [
'properties' => [
'document_id' => [
'type' => 'string',
'index' => 'not_analyzed',
],
// The name, email, or other info related to the person
'title' => [
'type' => 'string',
'analyzer' => 'my_ngram_analyzer',
'term_vector' => 'yes',
'copy_to' => 'combined'
],
'authors' => [
'type' => 'list',
'analyzer' => 'my_ngram_analyzer',
'term_vector' => 'yes',
'copy_to' => 'combined'
],
'comments' => [
'type' => 'list',
'analyzer' => 'my_ngram_analyzer',
'term_vector' => 'yes',
'copy_to' => 'combined'
],
]
],
]
]
];
// Create index `document` with nGram indexing
$client->indices()->create($params);
Right off the bat, I can't even create the index, due to this error:
{"error":"MapperParsingException[mapping [_default_]]; nested: MapperParsingException[No handler for type [list] declared on field [authors]]; ","status":400}
Had this gone through successfully, though, I would plan to index a document, starting with empty arrays for authors and comments, something like this:
$client = new Client();
$params = array();
$params['body'] = array('document_id' => 'id_here', 'title' => 'my_title', 'authors' => [], 'comments' => []);
$params['index'] = 'document';
$params['type'] = 'example_type';
$params['id'] = 'id_here';
$ret = $client->index($params);
return $ret;
This seems like it should work if I had the desired index to add this structure of information to, but what concerns me is appending something to the array using update. For example:
$client = new Client();
$params = array();
//$params['body'] = array('person_id' => $person_id, 'emails' => [$email]);
$params['index'] = 'document';
$params['type'] = 'example_type';
$params['id'] = 'id_here';
$params['script'] = 'NO IDEA WHAT THIS SCRIPT SHOULD BE TO APPEND TO THE ARRAY';
$ret = $client->update($params);
return $ret;
}
I am not sure how I would go about actually appending a thing to the array and making sure it's indexed.
Finally, another thing that confuses me is how I could search based on any author in the array. Ideally I could just search by an author's name and match against any element of the array, but I'm not 100% sure whether that will work. Maybe there is something fundamental about Elasticsearch that I am not understanding; I am completely new to it, so any resources that will get me to a point where these little details don't hang me up would be appreciated.
Also, any direct advice on how to use elasticsearch to solve these problems would be appreciated.
Sorry for the big wall of text. To recap, I am looking for advice on how to:
Create an index that supports nGram analysis on all elements of an array
Update that index to append to the array
Search the now-updated index
Thanks for any help!
EDIT: thanks to @astax, I am now able to create the index and append the value as a string. HOWEVER, there are two problems with this:
the array is stored as a string value, so a script like
$params['script'] = 'ctx._source.authors += [\'hello\']';
actually appends a STRING with [] rather than an array containing a value.
the value inputted does not appear to be ngram analyzed, so a search like this:
$client = new Client();
$searchParams['index'] = 'document';
$searchParams['type'] = 'example_type';
$searchParams['body']['query']['match']['_all'] = 'hello';
$queryResponse = $client->search($searchParams);
print_r($queryResponse); // SUCCESS
will find the new value but a search like this:
$client = new Client();
$searchParams['index'] = 'document';
$searchParams['type'] = 'example_type';
$searchParams['body']['query']['match']['_all'] = 'hel';
$queryResponse = $client->search($searchParams);
print_r($queryResponse); // NO RESULTS
does not
There is no type "list" in elasticsearch. But you can use "string" field type and store array of values.
....
'comments' => [
'type' => 'string',
'analyzer' => 'my_ngram_analyzer',
'term_vector' => 'yes',
'copy_to' => 'combined'
],
....
And index a document this way:
....
$params['body'] = array(
'document_id' => 'id_here',
'title' => 'my_title',
'authors' => [],
'comments' => ['comment1', 'comment2']);
....
As for the script for appending an element to the array, this answer may help you: Elasticsearch upserting and appending to array.
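For reference, a rough sketch of such an update call (this assumes Groovy-era inline scripting, which has to be enabled on the cluster; the parameter name new_author is made up for illustration):
$params = [
    'index' => 'document',
    'type' => 'example_type',
    'id' => 'id_here',
    'body' => [
        'script' => 'ctx._source.authors += new_author', // append one element to the existing array
        'params' => [
            'new_author' => 'Jane Doe'
        ]
    ]
];
$ret = $client->update($params);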
However, do you really need to update the document? It might be easier to just reindex it, as this is exactly what Elasticsearch does internally: it reads the "_source" property, applies the required modification, and reindexes the document. By the way, this means that "_source" must be enabled and all properties of the document must be included in it.
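A sketch of that read-modify-reindex approach with the client (index, type and id taken from your example):
// fetch the current document, append in PHP, then index the whole document again
$doc = $client->get([
    'index' => 'document',
    'type' => 'example_type',
    'id' => 'id_here'
]);
$source = $doc['_source'];
$source['authors'][] = 'Jane Doe'; // illustrative value
$client->index([
    'index' => 'document',
    'type' => 'example_type',
    'id' => 'id_here',
    'body' => $source
]);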
You may also consider storing comments and authors (as I understand it, these are the authors of comments, not the document authors) as child documents in ES and using a "has_child" filter.
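A very rough sketch of what that could look like, assuming a hypothetical child type comment with a _parent mapping pointing at example_type and an author field on the child (none of this is in your current mapping):
$searchParams = [
    'index' => 'document',
    'type' => 'example_type',
    'body' => [
        'query' => [
            'has_child' => [
                'type' => 'comment', // hypothetical child type
                'query' => [
                    'match' => ['author' => 'hel'] // hypothetical field on the child
                ]
            ]
        ]
    ]
];
$queryResponse = $client->search($searchParams);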
I can't really give you a specific solution, but I strongly recommend installing the Marvel plugin for Elasticsearch and using its "sense" tool to check how your overall process works, step by step.
First, check that your tokenizer is properly configured by running tests as described at http://www.elastic.co/guide/en/elasticsearch/reference/1.4/indices-analyze.html.
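For example, a hedged sketch with the PHP client (1.x-era clients accept analyzer and text as request parameters; newer clients expect them inside body):
$tokens = $client->indices()->analyze([
    'index' => 'document',
    'analyzer' => 'my_ngram_analyzer',
    'text' => 'hello'
]);
print_r($tokens); // should list the nGram tokens produced for "hello"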
Then check whether your update script is doing what you expect by retrieving the document with GET /document/example_type/some_existing_id.
The authors and comments should be arrays, not strings.
Finally perform the search:
GET /document/_search
{
  "query": {
    "match": { "_all": "hel" }
  }
}
If you're building the query yourself rather than getting it from the user, you may use query_string with wildcards:
GET /document/_search
{
  "query": {
    "query_string": {
      "fields": ["_all"],
      "query": "hel*"
    }
  }
}