ElasticSearch: Relevancy score override by function_score - php

Below is my Elasticsearch query. I am using function_score to order my results as follows:
1. First, the query should match some fields (full-text search).
2. Order results by the user's current location (using a gauss function).
3. Give more weight to service providers that have recommendations (using a gauss function).
4. Give preference to service providers that have been reviewed recently (using script_score).
Points 2, 3 and 4 should only reorder the result set. The problem is that whenever I use the geo location function, service providers that match exactly are pushed down the listing, and the query shows results that are near the user's location even though they match the text less well than other documents.
The query is below. Please help me resolve this issue, and please also suggest the best way to optimize this scenario.
{
"from": 0,
"size": 15,
"sort": {
"_score": {
"order": "desc"
}
},
"query": {
"function_score": {
"query": {
"bool": {
"must": [
{
"term": {
"status": "1"
}
},
{
"query_string": {
"default_field": "_all",
"query": "Parag Gadhia Parenting classes",
"fields": [
"service_prrovider_name^3",
"location^2",
"category_name^5",
"keyword"
],
"use_dis_max": true
}
},
{
"term": {
"city_id": "1"
}
}
],
"should": [
{
"term": {
"category.category_name": "parenting classes"
}
}
]
}
},
"functions": [
{
"gauss": {
"geo_location": {
"origin": {
"lat": "19.451624199999998",
"lon": "72.7966481"
},
"offset": "20km",
"scale": "3km"
}
}
},
{
"gauss": {
"likes_count": {
"origin": 3,
"offset": "5",
"scale": "20"
}
},
"weight": 2
},
{
"script_score": {
"script": "(0.08 / ((3.16*pow(10,-11)) * abs(1426072330 - doc[\"reviews.created_time\"].value) + 0.05)) + 1.0"
}
}
]
}
}
}

Yes, this is "normal": script_score will override the previous score.
You can use the _score variable inside the script to take the previous score into account,
such as "script": "_score * (0.08 / ((3.16*pow(10,-11)) * abs(1426072330 - doc[\"reviews.created_time\"].value) + 0.05)) + 1.0"
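To see why nearby documents can outrank exact text matches, it may help to work through the numbers. The sketch below is only an illustration in Python, not part of the query: it assumes the documented gauss decay shape (default decay of 0.5) and the default score_mode and boost_mode of multiply. The two document scores and distances are hypothetical.

```python
import math

def gauss_decay(distance, offset, scale, decay=0.5):
    """Gauss decay: 1.0 within `offset`, `decay` at offset + scale, smaller beyond."""
    sigma_sq = -(scale ** 2) / (2.0 * math.log(decay))
    beyond = max(0.0, abs(distance) - offset)
    return math.exp(-(beyond ** 2) / (2.0 * sigma_sq))

def combined_score(query_score, function_values):
    """Assumes score_mode=multiply and boost_mode=multiply (the defaults)."""
    product = 1.0
    for value in function_values:
        product *= value
    return query_score * product

# Hypothetical documents: a strong text match 30 km away, a weak match 5 km away.
# offset=20 and scale=3 (km) are taken from the query in the question.
far_exact = combined_score(5.0, [gauss_decay(30, offset=20, scale=3)])
near_weak = combined_score(1.0, [gauss_decay(5, offset=20, scale=3)])
print(far_exact, near_weak)
```

With a scale of only 3km the multiplier collapses quickly once a document is past the 20km offset, so the text score barely matters there. A larger scale, or a boost_mode such as sum, might be worth trying; that is a suggestion to test, not a confirmed fix.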


Multiple search field elasticsearch - php

Hello, I want to build a search over multiple fields with Elasticsearch.
I already have some knowledge of Elasticsearch, but I can't work out how to do this kind of multiple-field search.
You can use a combination of bool/must/should clauses to combine multiple conditions:
{
"query": {
"bool": {
"should": [
{
"multi_match": {
"query": "tag"
}
},
{
"match": {
"answers": 0
}
},
{
"match": {
"user": 1234
}
},
{
"multi_match": {
"query": "words here",
"type": "phrase"
}
},
{
"match": {
"score": 3
}
},
{
"match": {
"isaccepted": "yes"
}
}
]
}
}
}
If you want to search on multiple fields, you can use the multi_match query. From the documentation:
If no fields are provided, the multi_match query defaults to the index.query.default_field index settings, which in turn defaults to *. This extracts all fields in the mapping that are eligible to term queries and filters the metadata fields. All extracted fields are then combined to build a query.
Adding a working example with index data, search query, and search result:
Index Data:
{
"answers": 0,
"isaccepted": "no"
}
{
"answers": 0,
"isaccepted": "yes"
}
Search Query:
{
"query": {
"multi_match" : {
"query" : "yes"
}
}
}
Search Result:
"hits": [
{
"_index": "67542669",
"_type": "_doc",
"_id": "1",
"_score": 0.2876821,
"_source": {
"answers": 0,
"isaccepted": "yes"
}
}
]
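If the conditions come from a form, the bool/should body can be assembled programmatically. The following is a minimal Python sketch under that assumption; the helper name build_bool_should and its parameters are hypothetical, and it only builds the request body without sending it anywhere.

```python
import json

def build_bool_should(field_values, free_text=None, minimum_should_match=1):
    """Build a bool query with one match clause per field/value pair."""
    should = [{"match": {field: value}} for field, value in field_values.items()]
    if free_text:
        # No "fields" given, so multi_match falls back to the index default fields.
        should.append({"multi_match": {"query": free_text}})
    return {
        "query": {
            "bool": {
                "should": should,
                "minimum_should_match": minimum_should_match,
            }
        }
    }

body = build_bool_should({"answers": 0, "isaccepted": "yes"}, free_text="tag")
print(json.dumps(body, indent=2))
```

Moving a clause from should to must would make it mandatory instead of optional, which may be closer to what a filter-style form needs.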

How to sort aggregations by top hits field (text field)? Or is there any possibility to sort aggregations by text field (without using _term)

As in the title, I have a problem sorting an Elasticsearch aggregation by a text field. Is there any way to do it, using top hits or something similar? Right now I'm using a terms aggregation and I can sort by the aggregated field using _term, but I need to sort the buckets by a different field. I know how to do it for fields with numeric values, for instance using max, min, sum, etc.
It would be great if I could do it like this (but I can't):
{
"aggs": {
"Variants": {
"terms": {
"field": "variant",
"order": {
"top_Song_hits": "asc"
}
},
"aggs": {
"top_Song_hits": {
"sum": {
"name": {
"order": "desc"
}
}
}
}
}
}
}
or like this:
{
"aggs": {
"Variants": {
"terms": {
"field": "variant",
"order": {
"name_agg": "asc"
}
},
"aggs": {
"name_agg": {
"terms": {
"field": "name"
}
}
}
}
}
}
Or
{
"aggs": {
"Variants": {
"terms": {
"field": "variant",
"order": {
"details": "asc"
}
},
"aggs": {
"details": {
"top_hits": {
"size": 1,
"_source": {
"include": ["name"]
}
}
}
}
}
}
}
In the last case I get this error:
"reason": "Invalid aggregation order path [details]. Buckets can only be sorted on a sub-aggregator path that is built out of zero or more single-bucket aggregations within the path and a final single-bucket or a metrics aggregation at the path end."
I found a solution to my problem (field collapsing) here:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-collapse.html
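For readers who do not follow the link, here is a rough idea of what a field-collapsing request could look like for this case, written as a Python dict. It is a sketch only: the helper name is hypothetical, and it assumes that both variant and name are mapped as keyword fields (or have keyword sub-fields) so that they can be collapsed and sorted on.

```python
import json

def collapse_sorted_by(collapse_field, sort_field, order="asc", size=10):
    """Request body returning one hit per collapse_field value, ordered by sort_field."""
    return {
        "size": size,
        "query": {"match_all": {}},
        "collapse": {"field": collapse_field},
        "sort": [{sort_field: {"order": order}}],
    }

# Field names taken from the question; their mappings are an assumption.
body = collapse_sorted_by("variant", "name")
print(json.dumps(body, indent=2))
```

Unlike a terms aggregation, this returns ordinary hits (one per variant), so the ordering comes from the regular sort clause and not from a bucket order path.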

How to calculate average minimal times in Elasticsearch?

I am using Elasticsearch 5.4.2.
I'd like to build an Elasticsearch query that satisfies three conditions:
1. Filter by championId.
2. Get the minimal time to buy each item per game.
3. Calculate the average minimal time to buy each item across all games.
I have done 1 and 2, but I could not find a way to do 3. Is it possible to do 1 to 3 in a single query? In case it matters, I will use the result in Laravel 5.4, a PHP framework.
My data format is the following:
"_index": "timelines",
"_type": "timeline",
"_source": {
"gameId": 152735348,
"participantId": 3,
"championId": 35,
"role": "NONE",
"lane": "JUNGLE",
"win": 1,
"itemId": 1036,
"timestamp": 571200
}
My current Elasticsearch query is this
GET timelines/_search?size=0&pretty
{
"query": {
"bool": {
"must": [
{ "match": { "championId": 22 }}
]
}
},
"aggs": {
"games": {
"terms": {
"field": "gameId"
},
"aggs": {
"items": {
"terms": {
"field": "itemId",
"order" : { "min_buying_time" : "asc" }
},
"aggs": {
"min_buying_time": {
"min": {
"field": "timestamp"
}
}
}
}
}
}
}
}
As @Sönke Liebau said, a pipeline aggregation is the key, but if you want the average minimal time per item across all games, you should aggregate by itemId first. The following query should help:
POST misko/_search
{
"query": {
"bool": {
"must": [
{ "match": { "championId": 22 }}
]
}
},
"aggs": {
"items": {
"terms": {
"field": "itemId"
},
"aggs": {
"games": {
"terms": {
"field": "gameId"
},
"aggs": {
"min_buying_time": {
"min": {
"field": "timestamp"
}
}
}
},
"avg_min_time": {
"avg_bucket": {
"buckets_path": "games>min_buying_time"
}
}
}
}
}
}
If I understand your objective correctly, you should be able to solve this with pipeline aggregations. For your use case the Avg Bucket aggregation should be helpful; the example in the documentation should be very close to what you need.
Something like:
"avg_min_buying_time": {
"avg_bucket": {
"buckets_path": "games>min_buying_time"
}
}
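To check what the nested aggregation plus avg_bucket is meant to compute, the same calculation can be done by hand on a few documents. The Python sketch below is only a reference computation, not a replacement for the query; the sample events are hypothetical apart from the field names, which follow the data format in the question.

```python
from collections import defaultdict

def avg_min_time_per_item(events):
    """For each itemId, average the earliest timestamp seen in each game."""
    min_per_item_game = defaultdict(dict)
    for event in events:
        games = min_per_item_game[event["itemId"]]
        game_id = event["gameId"]
        # Keep the smallest timestamp per (item, game) pair.
        games[game_id] = min(games.get(game_id, event["timestamp"]), event["timestamp"])
    return {
        item: sum(games.values()) / len(games)
        for item, games in min_per_item_game.items()
    }

# Hypothetical sample data.
events = [
    {"itemId": 1036, "gameId": 1, "timestamp": 571200},
    {"itemId": 1036, "gameId": 1, "timestamp": 900000},
    {"itemId": 1036, "gameId": 2, "timestamp": 428800},
    {"itemId": 1001, "gameId": 1, "timestamp": 300000},
]
result = avg_min_time_per_item(events)
print(result)
```

One thing to keep in mind when comparing against Elasticsearch: a terms aggregation returns only the top 10 buckets by default, so the inner games aggregation may need a larger size for the average to cover all games.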

How to do an Elasticsearch range query with gauss function applied too?

How would I construct an ElasticSearch query to satisfy the following:
Price must be between 100,000 & 200,000, but also show results outside of this range, but with decreasing relevance if above 200k or below 100k.
So far I have the following but it doesn't seem to be doing what I want (omitted the wrapping query for brevity):
"function_score": {
"query": {
"range": {
"price_amount": {
"gte": 100000,
"lte": 200000
}
}
},
"functions": [
{
"gauss": {
"price_amount": {
"origin": "50000",
"offset": "50000",
"scale": "10000"
}
}
}
]
}
Update:
Had another look and I think setting the function to the following, without the range query would do the trick, wouldn't it?
"function_score": {
"functions": [
{
"gauss": {
"price_amount": {
"origin": "150000",
"offset": "50000",
"scale": "10000"
}
}
}
]
}
Many thanks!
Lee
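The updated parameters can be sanity-checked by evaluating the decay by hand. The sketch below is a Python illustration that assumes the documented gauss decay shape with the default decay of 0.5; the list of sample prices is arbitrary.

```python
import math

def gauss_decay(value, origin, offset, scale, decay=0.5):
    """Gauss decay: 1.0 within origin +/- offset, `decay` one scale further out."""
    sigma_sq = -(scale ** 2) / (2.0 * math.log(decay))
    beyond = max(0.0, abs(value - origin) - offset)
    return math.exp(-(beyond ** 2) / (2.0 * sigma_sq))

# origin, offset and scale are the values from the updated query.
for price in (90_000, 100_000, 150_000, 200_000, 210_000, 230_000):
    factor = gauss_decay(price, origin=150_000, offset=50_000, scale=10_000)
    print(price, round(factor, 4))
```

Under these assumptions every price between 100,000 and 200,000 gets a factor of 1.0 and the factor halves at 10,000 outside that range, which matches the stated goal. A function_score with no query clause matches all documents, so out-of-range prices still appear, just with lower scores.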

Elastic Search different results on URL query and JSON POST

I'm completing a search function on a big online webstore.
I have a problem with additional fields. When I try searching for some fields in browser, it works, but when posting a JSON using bool filter, it gives me 0 results (doesn't raise an error).
Basically: when I visit localhost:9200/search/items/_search?pretty=true&q=field-7:Diesel
It works well, however, in JSON it doesn't.
I've been googling all day and couldn't find any help in the Elasticsearch documentation. What frustrates me even more is that some other fields in the bool query work fine, but this one doesn't.
I don't have any mapping and ES works for me out of the box. Querying the "name" field works well, as does any other field, including this one, but only from the browser.
I realise that querying ES from the browser uses the so-called "query string query".
Anyway, here is an example JSON that I'm posting to ElasticSearch.
(searching all items that have "golf mk5" in their name, which have diesel fuel type - by searching field-7).
{
"query": {
"filtered": {
"filter": {
"bool": {
"must_not": [
{
"term": {
"sold": "1"
}
},
{
"term": {
"user_id": "0"
}
}
],
"must": [
{
"term": {
"locked": "0"
}
},
{
"term": {
"removed": "0"
}
},
{
"terms": {
"field-7": [
"Diesel"
]
}
}
]
}
},
"query": {
"match": {
"name": {
"operator": "and",
"query": "+golf +Mk5"
}
}
}
}
},
"sort": [
{
"ordering": {
"price": "desc"
}
}
],
"from": 0,
"size": 24,
"facets": {
"category_count": {
"terms": {
"field": "category_id",
"size": 20,
"order": "count"
}
},
"price": {
"statistical": {
"field": "price"
}
}
}
}
Using a query_string-query, the text is analyzed. With the term-query (and -filter), it is not.
Since you're not specifying a mapping, you'll get the standard-analyzer for string fields. It tokenizes, lowercases and removes stopwords.
Thus, the term Diesel will be indexed as diesel. Your terms-filter is looking up the exact term Diesel, which is different.
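The mismatch can be reproduced outside Elasticsearch with a toy model. The Python sketch below is a rough stand-in, not the real standard analyzer: analyze only splits on non-alphanumeric characters and lowercases, and term_lookup mimics an exact, unanalyzed term match. Both function names are hypothetical.

```python
import re

def analyze(text):
    """Rough stand-in for an analyzer: split on non-alphanumerics and lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]

def term_lookup(indexed_terms, term):
    """Exact match against indexed terms, with no analysis of the input."""
    return term in indexed_terms

indexed = set(analyze("Diesel"))          # what ends up in the index
print(term_lookup(indexed, "Diesel"))     # terms filter with the raw value
print(term_lookup(indexed, "diesel"))     # terms filter with the lowercased value
print(term_lookup(indexed, analyze("Diesel")[0]))  # query_string-style: analyze first
```

In practice this suggests either lowercasing the value passed to the terms filter, or mapping field-7 as not_analyzed so that the original casing is indexed as-is.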
