I am trying to build a query that combines exact phrase matching with synonyms, and I can't figure it out. Also, with the wildcard approach I don't know how to add fuzziness. Is that even possible with wildcards? Ideally I would get the same results for the terms "call of duty", "cod" and "call of dutz".
I have created this index:
PUT exact_search
{
"settings": {
"index": {
"number_of_shards": "1",
"number_of_replicas": "0",
"analysis": {
"analyzer": {
"analyzer_exact": {
"type": "custom",
"tokenizer": "keyword",
"filter": [
"lowercase",
"icu_folding",
"synonyms"
]
}
},
"filter": {
"synonyms": {
"type": "synonym",
"synonyms_path": "synonyms.txt"
}
}
}
}
},
"mappings": {
"properties": {
"name": {
"type": "keyword",
"fields": {
"analyzer_exact": {
"type": "text",
"analyzer": "analyzer_exact"
}
}
}
}
}
}
And I fill it with these items:
POST exact_search/_doc/1
{
"name": "Hoodie Call of Duty"
}
POST exact_search/_doc/2
{
"name": "Call of Duty 2"
}
POST exact_search/_doc/3
{
"name": "Call of Duty: Modern Warfare 2"
}
POST exact_search/_doc/4
{
"name": "COD: Modern Warfare 2"
}
POST exact_search/_doc/5
{
"name": "Call of duty"
}
POST exact_search/_doc/6
{
"name": "Call of the sea"
}
POST exact_search/_doc/7
{
"name": "Heavy Duty"
}
synonyms.txt looks like this:
cod,call of duty
What I am trying to achieve is to get all the results (except "Call of the sea" and "Heavy Duty") when I search for "call of duty" or "cod".
So far I have constructed this query, but it does not work as expected with the search term "cod" (the term "call of duty" works fine):
GET exact_search/_search
{
"explain": false,
"query":{
"bool":{
"must":[
{
"wildcard": {
"name.analyzer_exact": {
"value": "*cod*"
}
}
}
]
}
}
}
But the result is only two items:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "exact_search",
"_id" : "4",
"_score" : 1.0,
"_source" : {
"name" : "COD: Modern Warfare 2"
}
},
{
"_index" : "exact_search",
"_id" : "5",
"_score" : 1.0,
"_source" : {
"name" : "Call of duty"
}
}
]
}
}
It looks like the synonyms are working, because it returns the "Call of duty" game, but the wildcards seem to be ignored: it doesn't return "Call of Duty 2", for example.
I need an exact phrase match because I don't want to get "Heavy Duty" or "Call of the sea" (where only the word "call" or "duty" matches).
Thank you for pointing me in the right direction.
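I doubt that analyzer_exact can produce useful synonym tokens while it uses "tokenizer": "keyword". The keyword tokenizer emits the whole field value as a single token, so the synonym filter only fires when the entire value is "cod" or "call of duty". A wildcard query is also not analyzed, so "*cod*" is never expanded to "call of duty". That would explain why only documents 4 and 5 come back.
I would change a few things to make it work.
First, change the tokenizer from keyword to standard: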
"analyzer_exact": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"lowercase",
"synonyms"
]
}
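To make the difference between the two tokenizers concrete, here is a rough Python model of what each setup matches. It is only a sketch: the regex tokenizer, the SYNONYMS table and the helper names are stand-ins invented for illustration, not Elasticsearch code, and real analysis has more steps.

```python
import re

SYNONYMS = [{"cod", "call of duty"}]  # mirrors synonyms.txt: cod,call of duty


def standard_tokens(text):
    """Rough stand-in for the standard tokenizer + lowercase filter."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_index_terms(text):
    """Terms stored under the keyword-based analyzer (toy model).

    The whole value is one token, so a synonym only applies when that
    single token equals a complete synonym entry.
    """
    token = text.lower()
    terms = {token}
    for group in SYNONYMS:
        if token in group:
            terms |= group
    return terms


def contains_phrase(tokens, phrase_tokens):
    """True if phrase_tokens occur consecutively inside tokens."""
    n = len(phrase_tokens)
    return any(tokens[i:i + n] == phrase_tokens
               for i in range(len(tokens) - n + 1))


def phrase_match_with_synonyms(doc, query):
    """Word-level phrase match with the query expanded by its synonyms."""
    variants = {query.lower()}
    for group in SYNONYMS:
        if query.lower() in group:
            variants |= group
    doc_tokens = standard_tokens(doc)
    return any(contains_phrase(doc_tokens, standard_tokens(v)) for v in variants)


docs = ["Hoodie Call of Duty", "Call of Duty 2", "Call of Duty: Modern Warfare 2",
        "COD: Modern Warfare 2", "Call of duty", "Call of the sea", "Heavy Duty"]

# Keyword analyzer + wildcard "*cod*": a substring test on each stored term.
wildcard_hits = [d for d in docs if any("cod" in t for t in keyword_index_terms(d))]
# Word tokens + synonyms + phrase matching.
phrase_hits = [d for d in docs if phrase_match_with_synonyms(d, "cod")]

print(wildcard_hits)
print(phrase_hits)
```

In this model the wildcard variant finds only the two documents from the question, while the word-level phrase variant finds the five "Call of Duty" / "COD" names and skips "Call of the sea" and "Heavy Duty".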
Then I would use a match_phrase query to exclude names that contain neither "call of duty" nor "cod":
{
"match_phrase": {
"name.analyzer_exact": "cod"
}
}
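If the request is assembled in application code, the same clause can be built for either search term. This is a small sketch; the function name phrase_search_body is made up here, and the body is wrapped in a "query" key so it can be sent to the _search endpoint as is.

```python
import json


def phrase_search_body(term, field="name.analyzer_exact"):
    """Build a match_phrase request body for the given search term."""
    return {"query": {"match_phrase": {field: term}}}


# Both terms produce the same kind of request; synonym handling is left
# to the analyzer configured on the field.
for term in ("cod", "call of duty"):
    print(json.dumps(phrase_search_body(term), indent=2))
```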
Response after the changes:
{
"hits": {
"hits": [
{
"_source": {
"name": "Call of duty"
}
},
{
"_source": {
"name": "COD: Modern Warfare 2"
}
},
{
"_source": {
"name": "Call of Duty 2"
}
},
{
"_source": {
"name": "hoddies Call of Duty"
}
},
{
"_source": {
"name": "Call of Duty: Modern Warfare 2"
}
}
]
}
}
Hello, I want to do something like this with Elasticsearch: [image]
I already have some knowledge of Elasticsearch, but I can't understand how to do this kind of multiple-criteria search.
You can use a combination of bool, must and should clauses to combine multiple conditions:
{
"query": {
"bool": {
"should": [
{
"multi_match": {
"query": "tag"
}
},
{
"match": {
"answers": 0
}
},
{
"match": {
"user": 1234
}
},
{
"multi_match": {
"query": "words here",
"type": "phrase"
}
},
{
"match": {
"score": 3
}
},
{
"match": {
"isaccepted": "yes"
}
}
]
}
}
}
If you want to search on multiple fields, you can use the multi_match query. From the documentation:
"If no fields are provided, the multi_match query defaults to the index.query.default_field index settings, which in turn defaults to *. This extracts all fields in the mapping that are eligible to term queries and filters the metadata fields. All extracted fields are then combined to build a query."
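As a sketch of how such a request could be assembled in code, the helper below leaves out "fields" when none are given, so the default described above applies, and adds them only when the caller wants to restrict the search. The helper names and the field names "title" and "tags" are assumptions for illustration; they do not come from the question.

```python
def multi_match(text, fields=None, match_type=None):
    """Build a multi_match clause; omit "fields" to use the index default."""
    clause = {"query": text}
    if fields:
        clause["fields"] = list(fields)
    if match_type:
        clause["type"] = match_type
    return {"multi_match": clause}


def any_of(*clauses):
    """Wrap clauses in bool/should, so each matching clause adds to the score."""
    return {"query": {"bool": {"should": list(clauses)}}}


body = any_of(
    multi_match("tag", fields=["title", "tags"]),       # hypothetical field names
    multi_match("words here", match_type="phrase"),     # searches the default fields
    {"match": {"isaccepted": "yes"}},
)
print(body)
```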
Adding a working example with index data, search query, and search result
Index Data:
{
"answers": 0,
"isaccepted": "no"
}
{
"answers": 0,
"isaccepted": "yes"
}
Search Query:
{
"query": {
"multi_match" : {
"query" : "yes"
}
}
}
Search Result:
"hits": [
{
"_index": "67542669",
"_type": "_doc",
"_id": "1",
"_score": 0.2876821,
"_source": {
"answers": 0,
"isaccepted": "yes"
}
}
]
I am using Elasticsearch version 5.4.2.
I'd like to build an Elasticsearch query that satisfies three conditions:
1. Filter by championId.
2. Get the minimal time to buy each item per game.
3. Calculate the average of those minimal times for each item across all games.
I have done 1 and 2, but I could not find a way to do 3. Is it possible to do 1 to 3 in a single query? Just in case it matters, I will use the result in Laravel 5.4, a PHP framework.
My data format is the following:
"_index": "timelines",
"_type": "timeline",
"_source": {
"gameId": 152735348,
"participantId": 3,
"championId": 35,
"role": "NONE",
"lane": "JUNGLE",
"win": 1,
"itemId": 1036,
"timestamp": 571200
}
My current Elasticsearch query is this
GET timelines/_search?size=0&pretty
{
"query": {
"bool": {
"must": [
{ "match": { "championId": 22 }}
]
}
},
"aggs": {
"games": {
"terms": {
"field": "gameId"
},
"aggs": {
"items": {
"terms": {
"field": "itemId",
"order" : { "min_buying_time" : "asc" }
},
"aggs": {
"min_buying_time": {
"min": {
"field": "timestamp"
}
}
}
}
}
}
}
}
As @Sönke Liebau said, a pipeline aggregation is the key, but if you want the average minimal time across all games per item, you should aggregate by itemId first. The following query should help:
POST timelines/_search
{
"query": {
"bool": {
"must": [
{ "match": { "championId": 22 }}
]
}
},
"aggs": {
"items": {
"terms": {
"field": "itemId"
},
"aggs": {
"games": {
"terms": {
"field": "gameId"
},
"aggs": {
"min_buying_time": {
"min": {
"field": "timestamp"
}
}
}
},
"avg_min_time": {
"avg_bucket": {
"buckets_path": "games>min_buying_time"
}
}
}
}
}
}
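To check what the nested aggregation is meant to compute, the same calculation can be written in plain Python over a handful of documents. This is a sketch of the arithmetic only, not of how Elasticsearch executes it; the function name and the sample events are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def avg_min_buying_time(events, champion_id):
    """Per item: average over games of the earliest purchase timestamp."""
    earliest = defaultdict(dict)  # itemId -> {gameId: earliest timestamp}
    for e in events:
        if e["championId"] != champion_id:
            continue  # step 1: filter by championId
        per_game = earliest[e["itemId"]]
        game = e["gameId"]
        # step 2: minimal buying time per item and game
        per_game[game] = min(e["timestamp"], per_game.get(game, e["timestamp"]))
    # step 3: average the per-game minima for each item
    return {item: mean(per_game.values()) for item, per_game in earliest.items()}


events = [  # invented sample data in the shape of the question's documents
    {"gameId": 1, "championId": 22, "itemId": 1036, "timestamp": 60000},
    {"gameId": 1, "championId": 22, "itemId": 1036, "timestamp": 90000},
    {"gameId": 2, "championId": 22, "itemId": 1036, "timestamp": 80000},
    {"gameId": 2, "championId": 22, "itemId": 3006, "timestamp": 300000},
    {"gameId": 3, "championId": 35, "itemId": 1036, "timestamp": 10000},
]
result = avg_min_buying_time(events, 22)
print(result)
```

One thing to keep in mind when comparing against the real query: a terms aggregation returns only the top 10 buckets by default, so the average in Elasticsearch covers at most 10 games per item unless a larger "size" is set on the inner terms aggregation.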
If I understand your objective correctly, you should be able to solve this with pipeline aggregations. For your use case, the Avg Bucket aggregation should be helpful. Check out the example in the documentation; I think it is very close to what you need.
Something like:
"avg_min_buying_time": {
"avg_bucket": {
"buckets_path": "games>min_buying_time"
}
}
I am trying to use geo_point for distance queries, but the location field is always mapped as type double instead of geo_point. How can I map location as geo_point?
I need to find all records within 5 km, sorted. This is the current mapping:
"pin" : {
"properties" : {
"location" : {
"properties" : {
"lat" : {
"type" : "double"
},
"lon" : {
"type" : "double"
}
}
},
"type" : {
"type" : "string"
}
}
},
When I search with the query below to find results within 5 km of Delhi's latitude and longitude:
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"pin": {
"distance": "5",
"distance_unit": "km",
"coordinate": {
"lat": 28.5402707,
"lon": 77.2289643
}
}
}
}
}
}
It fails with a query_parsing_exception: No query registered for [pin]. I cannot figure out the problem; it always throws this exception:
{
"error": {
"root_cause": [
{
"type": "query_parsing_exception",
"reason": "No query registered for [pin]",
"index": "find_index",
"line": 1,
"col": 58
}
],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": [
{
"shard": 0,
"index": "find_index",
"node": "DtpkEdLCSZCr8r2bTd8p5w",
"reason": {
"type": "query_parsing_exception",
"reason": "No query registered for [pin]",
"index": "find_index",
"line": 1,
"col": 58
}
}
]
},
"status": 400
}
Please help me figure out this problem. How can I map the field as geo_point and get rid of this exception (status 400, all shards failed)?
You simply need to map your pin type like this, i.e. using the geo_point data type:
# 1. first delete your index
DELETE shouut_find_index
# 2. create a new one with the proper mapping
PUT shouut_find_index
{
"mappings": {
"search_shouut": {
"properties": {
"location": {
"type": "geo_point"
},
"type": {
"type": "string"
}
}
}
}
}
Then you can index a document like this
PUT shouut_find_index/search_shouut/1
{
"location": {
"lat": 28.5402707,
"lon": 77.2289643
},
"type": "dummy"
}
And finally your query can look like this
POST shouut_find_index/search_shouut/_search
{
"query": {
"filtered": {
"filter": {
"geo_distance": {
"distance": "5km",
"location": {
"lat": 28.5402707,
"lon": 77.2289643
}
}
}
}
}
}
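For intuition about what a 5 km distance filter selects, and for the "sorted" part of the question, here is a plain Python sketch that computes great-circle distances with the haversine formula and returns the nearby documents, nearest first. It is an approximation for illustration: the Earth radius constant, the helper names and the sample documents are assumptions, and Elasticsearch's own distance calculation may differ slightly.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def within_sorted(origin, docs, max_km=5.0):
    """Documents no farther than max_km from origin, nearest first."""
    scored = [
        (haversine_km(origin["lat"], origin["lon"],
                      d["location"]["lat"], d["location"]["lon"]), d)
        for d in docs
    ]
    return [d for dist, d in sorted(scored, key=lambda p: p[0]) if dist <= max_km]


origin = {"lat": 28.5402707, "lon": 77.2289643}  # coordinates from the question
docs = [  # invented sample documents
    {"type": "far", "location": {"lat": 28.6402707, "lon": 77.2289643}},
    {"type": "near", "location": {"lat": 28.5602707, "lon": 77.2289643}},
    {"type": "here", "location": {"lat": 28.5402707, "lon": 77.2289643}},
]
nearby = within_sorted(origin, docs)
print([d["type"] for d in nearby])
```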
I am having a problem with sorting in PHP. Here is my mapping:
{
"jdbc": {
"mappings": {
"jdbc": {
"properties": {
"admitted_date": {
"type": "date",
"format": "dateOptionalTime"
},
"et_tax": {
"type": "string"
},
"jt_tax": {
"type": "string"
},
"loc_cityname": {
"type": "string"
},
"location_countryname": {
"type": "string"
},
"location_primary": {
"type": "string"
},
"pd_firstName": {
"type": "string"
}
}
}
}
}
}
When I sort the results, they come back in an alphanumeric order in which entries with numbers come first. I need the results ordered alphabetically by their starting letter. Right now the order looks like this:
http://localhost:9200/jdbc/_search?pretty=true&sort=pd_lawFirmName:asc
BM&A
Gomez-Acebo & Pombo
Addleshaw Goddard
How can I order the results like this?
Addleshaw Goddard
BM&A
Gomez-Acebo & Pombo
Here is the configuration I am using for indexing:
{
"type" : "jdbc",
"jdbc" : {
"driver" : "com.mysql.jdbc.Driver",
"url" : "jdbc:mysql://localhost:3306/dbname",
"user" : "user",
"password" : "pass",
"sql" : "SQL QUERY",
"poll" : "24h",
"strategy" : "simple",
"scale" : 0,
"autocommit" : true,
"bulk_size" : 5000,
"max_bulk_requests" : 30,
"bulk_flush_interval" : "5s",
"fetchsize" : 100,
"max_rows" : 149669,
"max_retries" : 3,
"max_retries_wait" : "10s",
"locale" : "in",
"digesting" : true,
"mappings": {
"sorting": {
"properties": {
"pd_lawFirmName": {
"type": "string",
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
}
}
This happens because Elasticsearch tokenizes the text with the default analyzer, which is standard, and sorting on an analyzed field compares the individual tokens rather than the whole name. For example, McDermott Will Amery is indexed as:
"amery",
"mcdermott",
"will"
If you want to sort by the full name, I would suggest changing the mapping of your pd_lawFirmName field to something like this:
"pd_lawFirmName": {
"type": "string",
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
}
and sort by the raw subfield:
http://localhost:9200/jdbc/_search?pretty=true&sort=pd_lawFirmName.raw:asc
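The effect can be reproduced with a small Python model. It assumes that an ascending sort on the analyzed field compares the smallest token of each value, and it uses a regex as a rough stand-in for the standard analyzer, so treat it as an illustration rather than the exact Elasticsearch behaviour.

```python
import re


def standard_tokens(text):
    """Rough stand-in for the standard analyzer: lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


names = ["Gomez-Acebo & Pombo", "Addleshaw Goddard", "BM&A"]

# Analyzed field: each name is represented by its smallest token
# ("a" for BM&A, "acebo" for Gomez-Acebo & Pombo, "addleshaw" for Addleshaw Goddard).
by_analyzed = sorted(names, key=lambda n: min(standard_tokens(n)))

# not_analyzed "raw" subfield: the whole original string is compared.
by_raw = sorted(names)

print(by_analyzed)
print(by_raw)
```

In this model the token-based order matches the order from the question, and the whole-string order matches the desired one. Note that a comparison on the raw string is case-sensitive, so names starting with a lowercase letter would sort after all uppercase ones.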
Using PHP and Mongo, I would like to update a user's availability but cannot figure out how. How can I structure my collection so that I can reference availability as groups.steve.availability?
Below is the structure of my "groups" collection:
{
"_id": {
"$oid": "524327d536b82c7c5c842f6d"
},
"group_id": "testing",
"password": "test",
"users": [
{
"username": "steve",
"availability": "null"
},
{
"username": "joeb",
"availability": "null"
}
]
}
If you want to reference it the way you've suggested: groups.steve.availability, you'd need to structure your documents more like below. (I'm not sure where groups is coming from).
This example would give you users.steve.availability by moving the user's name to a sub-field of the users field (users.steve).
{
"_id": {
"$oid": "524327d536b82c7c5c842f6d"
},
"group_id": "testing",
"password": "test",
"users": {
"steve": {
"availability": "null"
},
"joeb" : {
"availability": "null"
}
}
}
Or, you could just create fields directly on the document:
{
"_id": {
"$oid": "524327d536b82c7c5c842f6d"
},
"group_id": "testing",
"password": "test",
"steve": {
"availability": "null"
},
"joeb" : {
"availability": "null"
}
}
That would allow you to just use steve.availability.
If you're trying to do a query though, you'd be better off leaving it more like you had it originally:
"users": [
{
"username": "steve",
"availability": "null"
}]
So, you could write queries like:
db.groups.find({"users.username" : "steve" })
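With the original array structure, the update itself can target the matching array element through MongoDB's positional $ operator. The sketch below is in Python rather than PHP and only builds the filter and update documents, plus an in-memory stand-in that shows the intended effect; the function names and the "available" value are made up for illustration. With a driver such as pymongo, the two documents would be passed to the collection's update_one method.

```python
def availability_update(username, availability):
    """Return (filter, update) documents targeting one user in the users array."""
    query = {"users.username": username}
    # "$" stands for the first array element matched by the filter.
    update = {"$set": {"users.$.availability": availability}}
    return query, update


def apply_locally(group, username, availability):
    """In-memory stand-in for what the update does to a single document."""
    for user in group["users"]:
        if user["username"] == username:
            user["availability"] = availability
            break  # only the first matching element is changed
    return group


group = {
    "group_id": "testing",
    "users": [
        {"username": "steve", "availability": "null"},
        {"username": "joeb", "availability": "null"},
    ],
}
query, update = availability_update("steve", "available")
print(query, update)
print(apply_locally(group, "steve", "available"))
```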