How to calculate average minimal times in Elasticsearch?

I am using Elasticsearch 5.4.2.
I'd like to build an Elasticsearch query that satisfies three conditions:
1. filter by championId
2. get the minimal time at which each item was bought, per game
3. calculate the average of those minimal times for each item across all games
I managed 1 and 2, but I could not find a way to do 3. Is it possible to do all three steps in a single query? In case it matters, I will use the result in Laravel 5.4, a PHP framework.
My data format is the following:
{
  "_index": "timelines",
  "_type": "timeline",
  "_source": {
    "gameId": 152735348,
    "participantId": 3,
    "championId": 35,
    "role": "NONE",
    "lane": "JUNGLE",
    "win": 1,
    "itemId": 1036,
    "timestamp": 571200
  }
}
My current Elasticsearch query is this:
GET timelines/_search?size=0&pretty
{
  "query": {
    "bool": {
      "must": [
        { "match": { "championId": 22 }}
      ]
    }
  },
  "aggs": {
    "games": {
      "terms": {
        "field": "gameId"
      },
      "aggs": {
        "items": {
          "terms": {
            "field": "itemId",
            "order" : { "min_buying_time" : "asc" }
          },
          "aggs": {
            "min_buying_time": {
              "min": {
                "field": "timestamp"
              }
            }
          }
        }
      }
    }
  }
}

As @Sönke Liebau said, a pipeline aggregation is the key, but if you want the average minimal time across all games per item, you should aggregate by itemId first. The following query should help:
POST timelines/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "championId": 22 }}
      ]
    }
  },
  "aggs": {
    "items": {
      "terms": {
        "field": "itemId"
      },
      "aggs": {
        "games": {
          "terms": {
            "field": "gameId"
          },
          "aggs": {
            "min_buying_time": {
              "min": {
                "field": "timestamp"
              }
            }
          }
        },
        "avg_min_time": {
          "avg_bucket": {
            "buckets_path": "games>min_buying_time"
          }
        }
      }
    }
  }
}
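To make the nesting concrete, here is a small Python sketch of what that aggregation computes, run over a few made-up documents instead of a real index. The sample records and the function name are assumptions for illustration; only the field names come from the question.

```python
from collections import defaultdict


def avg_min_buying_time(docs, champion_id):
    """Filter by championId, take the minimum timestamp per (itemId, gameId),
    then average those per-game minimums for each itemId."""
    # itemId -> {gameId: minimum timestamp seen so far}
    min_times = defaultdict(dict)
    for doc in docs:
        if doc["championId"] != champion_id:
            continue
        games = min_times[doc["itemId"]]
        game_id = doc["gameId"]
        games[game_id] = min(games.get(game_id, doc["timestamp"]), doc["timestamp"])
    # Average of the per-game minimums: the role avg_bucket plays in the query.
    return {item: sum(g.values()) / len(g) for item, g in min_times.items()}


# Hypothetical sample data, not taken from the original index.
docs = [
    {"gameId": 1, "championId": 22, "itemId": 1036, "timestamp": 100},
    {"gameId": 1, "championId": 22, "itemId": 1036, "timestamp": 300},
    {"gameId": 2, "championId": 22, "itemId": 1036, "timestamp": 200},
    {"gameId": 2, "championId": 35, "itemId": 1036, "timestamp": 50},
]
result = avg_min_buying_time(docs, 22)
print(result)
```

Keep in mind that a terms aggregation returns only the top 10 buckets by default, so on real data the inner games aggregation probably needs a larger size for the average to cover every game.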

If I understand your objective correctly, you should be able to solve this with pipeline aggregations. For your use case the Avg Bucket aggregation looks like the right fit; the example in the documentation should be very close to what you need.
Something like:
"avg_min_buying_time": {
  "avg_bucket": {
    "buckets_path": "games>min_buying_time"
  }
}
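Once such a query has run, each item bucket should carry the pipeline result under the aggregation's name. Below is a minimal Python sketch of reading it, assuming the response has already been decoded into a dict and that the aggregations are named items and avg_min_time as in the full query above. The sample response is made up and omits the nested games buckets for brevity.

```python
def item_averages(response, items_agg="items", avg_agg="avg_min_time"):
    """Map each item key to the value of its avg_bucket pipeline aggregation."""
    buckets = response["aggregations"][items_agg]["buckets"]
    return {bucket["key"]: bucket[avg_agg]["value"] for bucket in buckets}


# Hypothetical, trimmed-down response for illustration only.
sample_response = {
    "aggregations": {
        "items": {
            "buckets": [
                {"key": 1036, "doc_count": 3, "avg_min_time": {"value": 150.0}},
                {"key": 2003, "doc_count": 1, "avg_min_time": {"value": 42.0}},
            ]
        }
    }
}
averages = item_averages(sample_response)
print(averages)
```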

Related

Elasticsearch : How to use multiple filter and calculation in aggregations?

I'm trying to build a function in Kibana.
I have an index of orders with some fields:
datetime1 and datetime2, with format yyyy-MM-dd HH:mm
First, I have to check whether datetime1 exists.
Second, I have to compute the difference between the two dates: datetime2 - datetime1.
Finally, I have to put the result into different aggregations depending on whether the difference is:
less than 24h
between 24h and 48h
between 48h and 72h
....
What I tried:
GET orders/_search
{
  "size": 0,
  "aggs": {
    "test1": {
      "filters": {
        "filters": {
          "exist_datetime1": {
            "exists": {
              "field": "datetime1"
            }
          },
          "24_hours": {
            "script": {
              "script": {
                "source": "doc['datetime2'].value - doc['datetime1'].value < 24",
                "lang": "painless"
              }
            }
          }
        }
      }
    }
  }
}
How can I apply multiple filters and subtract one date from the other?
Thanks for your help :)

That's a good start; however, I think you need something slightly different. Here is an attempt at providing the ranges you need, using a range aggregation driven by a script.
You need to make sure both date fields have values (the query part), and then you can define the buckets you need (< 24h, 24h-48h, etc.):
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "exists": {
            "field": "datetime1"
          }
        },
        {
          "exists": {
            "field": "datetime2"
          }
        }
      ]
    }
  },
  "aggs": {
    "ranges": {
      "range": {
        "script": {
          "lang": "painless",
          "source": "(doc['datetime2'].value.millis - doc['datetime1'].value.millis) / 3600000"
        },
        "ranges": [
          {
            "to": 24,
            "key": "< 24h"
          },
          {
            "from": 24,
            "to": 48,
            "key": "24h-48h"
          },
          {
            "from": 48,
            "key": "> 48h"
          }
        ]
      }
    }
  }
}
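The bucketing logic of that query can be checked outside Elasticsearch. The following Python sketch mirrors the script (millisecond difference divided by 3600000) and the three ranges, where "from" is inclusive and "to" is exclusive. The function names and sample dates are made up for illustration.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"  # Python equivalent of yyyy-MM-dd HH:mm

# (key, from, to): "from" is inclusive, "to" is exclusive, None means open-ended.
RANGES = [("< 24h", None, 24), ("24h-48h", 24, 48), ("> 48h", 48, None)]


def hours_between(datetime1, datetime2):
    """Whole hours from datetime1 to datetime2, truncated toward zero."""
    delta = datetime.strptime(datetime2, FMT) - datetime.strptime(datetime1, FMT)
    millis = int(delta.total_seconds()) * 1000
    # The script divides two longs, which drops the fractional part;
    # int() on the quotient truncates the same way.
    return int(millis / 3600000)


def bucket_for(datetime1, datetime2):
    """Return the key of the range that the hour difference falls into."""
    hours = hours_between(datetime1, datetime2)
    for key, lower, upper in RANGES:
        if (lower is None or hours >= lower) and (upper is None or hours < upper):
            return key
    return None


print(bucket_for("2020-01-01 10:00", "2020-01-02 09:59"))
print(bucket_for("2020-01-01 10:00", "2020-01-02 10:00"))
print(bucket_for("2020-01-01 10:00", "2020-01-03 10:00"))
```

Because "from" is inclusive, a difference of exactly 48 hours lands in the bucket labelled "> 48h". Further ranges (48h-72h and so on) can be added to the list in the same way.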

How to sort aggregations by top hits field (text field)? Or is there any possibility to sort aggregations by text field (without using _term)

As the title says, I have a problem sorting an Elasticsearch aggregation by a text field. Is there any way to do it, using top hits or something similar? Right now I'm using a terms aggregation, and I can sort by the aggregated field using _term, but I need to sort the buckets by a different field. I know how to do it for fields with numeric values, for instance using max, min, sum, etc.
It would be great if I could do it like this (but I can't):
{
  "aggs": {
    "Variants": {
      "terms": {
        "field": "variant",
        "order": {
          "top_Song_hits": "asc"
        }
      },
      "aggs": {
        "top_Song_hits": {
          "sum": {
            "name": {
              "order": "desc"
            }
          }
        }
      }
    }
  }
}
or like this:
{
  "aggs": {
    "Variants": {
      "terms": {
        "field": "variant",
        "order": {
          "name_agg": "asc"
        }
      },
      "aggs": {
        "name_agg": {
          "terms": {
            "field": "name"
          }
        }
      }
    }
  }
}
or:
{
  "aggs": {
    "Variants": {
      "terms": {
        "field": "variant",
        "order": {
          "details": "asc"
        }
      },
      "aggs": {
        "details": {
          "top_hits": {
            "size": 1,
            "_source": {
              "include": ["name"]
            }
          }
        }
      }
    }
  }
}
In the last case I get this error:
"reason": "Invalid aggregation order path [details]. Buckets can only be sorted on a sub-aggregator path that is built out of zero or more single-bucket aggregations within the path and a final single-bucket or a metrics aggregation at the path end."

I found a solution to my problem here (field collapsing):
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-collapse.html
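As a rough idea of how field collapsing could apply here, the following Python sketch builds a search body that keeps one hit per variant and orders those hits by name. It assumes variant and name are keyword (or otherwise sortable, doc-values-backed) fields, which the question does not confirm, and the helper name is made up.

```python
import json


def collapsed_search_body(group_field="variant", sort_field="name", order="asc"):
    """Build a search body that collapses hits on group_field and sorts by sort_field."""
    return {
        "query": {"match_all": {}},
        # One top hit per distinct value of group_field.
        "collapse": {"field": group_field},
        # The sort decides both which hit represents a group and the group order.
        "sort": [{sort_field: {"order": order}}],
    }


body = collapsed_search_body()
print(json.dumps(body, indent=2))
```

Unlike a terms aggregation, this returns documents and not buckets, so it does not produce per-variant counts.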

How to do an Elasticsearch range query with gauss function applied too?

How would I construct an Elasticsearch query that satisfies the following?
The price should be between 100,000 and 200,000, but results outside this range should also be shown, with relevance decreasing the further the price is above 200k or below 100k.
So far I have the following, but it doesn't seem to do what I want (wrapping query omitted for brevity):
"function_score": {
  "query": {
    "range": {
      "price_amount": {
        "gte": 100000,
        "lte": 200000
      }
    }
  },
  "functions": [
    {
      "gauss": {
        "price_amount": {
          "origin": "50000",
          "offset": "50000",
          "scale": "10000"
        }
      }
    }
  ]
}
Update:
I had another look, and I think setting the function to the following, without the range query, would do the trick, wouldn't it?
"function_score": {
  "functions": [
    {
      "gauss": {
        "price_amount": {
          "origin": "150000",
          "offset": "50000",
          "scale": "10000"
        }
      }
    }
  ]
}
Many thanks!
Lee
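One way to sanity-check the updated parameters is to evaluate the gauss decay formula by hand. The Python sketch below follows the formula given in the function_score documentation, with the default decay of 0.5; the function names are made up, and it only models the decay value, not how Elasticsearch combines it with the query score.

```python
import math


def gauss_decay(value, origin, offset, scale, decay=0.5):
    """Gauss decay: 1.0 within offset of origin, then falling off so that
    the score equals `decay` at a distance of offset + scale."""
    distance = max(0.0, abs(value - origin) - offset)
    sigma_sq = -(scale ** 2) / (2.0 * math.log(decay))
    return math.exp(-(distance ** 2) / (2.0 * sigma_sq))


def price_score(price):
    # Parameters from the update in the question.
    return gauss_decay(price, origin=150000, offset=50000, scale=10000)


for price in (90000, 100000, 150000, 200000, 210000, 220000):
    print(price, round(price_score(price), 4))
```

With origin 150000 and offset 50000, every price from 100,000 to 200,000 keeps the full score, a price 10,000 outside the range gets about half of it, and the score keeps dropping further out. Documents outside the range are still returned because no range query filters them away.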

ElasticSearch: Relevancy score override by function_score

Below is my Elasticsearch query. I am using function_score to order my results as follows:
1. The query must first match some fields (full-text search).
2. Order results by the user's current location (using a gauss function).
3. Give more weight to service providers that have recommendations (using a gauss function).
4. Give preference to service providers that have been reviewed recently (using script_score).
The ordering from points 2, 3 and 4 should be applied to the result set. The problem is that whenever I use the geolocation function, service providers that match exactly are pushed down the list, and the query returns results near the user's location even when they match the text less well than other documents.
My query is below. Please help me resolve this issue, and suggest the best way to handle this scenario.
{
  "from": 0,
  "size": 15,
  "sort": {
    "_score": {
      "order": "desc"
    }
  },
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "status": "1"
              }
            },
            {
              "query_string": {
                "default_field": "_all",
                "query": "Parag Gadhia Parenting classes",
                "fields": [
                  "service_prrovider_name^3",
                  "location^2",
                  "category_name^5",
                  "keyword"
                ],
                "use_dis_max": true
              }
            },
            {
              "term": {
                "city_id": "1"
              }
            }
          ],
          "should": [
            {
              "term": {
                "category.category_name": "parenting classes"
              }
            }
          ]
        }
      },
      "functions": [
        {
          "gauss": {
            "geo_location": {
              "origin": {
                "lat": "19.451624199999998",
                "lon": "72.7966481"
              },
              "offset": "20km",
              "scale": "3km"
            }
          }
        },
        {
          "gauss": {
            "likes_count": {
              "origin": 3,
              "offset": "5",
              "scale": "20"
            }
          },
          "weight": 2
        },
        {
          "script_score": {
            "script": "(0.08 / ((3.16*pow(10,-11)) * abs(1426072330 - doc[\"reviews.created_time\"].value) + 0.05)) + 1.0"
          }
        }
      ]
    }
  }
}

Yes, this is "normal": the value computed by script_score does not take the query's relevance score into account unless the script refers to it.
You can use the _score variable inside the script to include it
(such as "script": "_score * (0.08 / ((3.16*pow(10,-11)) * abs(1426072330 - doc[\"reviews.created_time\"].value) + 0.05)) + 1.0").
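To see why a nearby but weaker match can outrank an exact match, it can help to work through how function_score combines its parts. The Python sketch below assumes the documented defaults (score_mode and boost_mode both set to multiply); all numbers are made up and the function name is hypothetical.

```python
def combine(query_score, function_scores, score_mode="multiply", boost_mode="multiply"):
    """Combine function values (score_mode), then merge with the query score (boost_mode)."""
    if score_mode == "multiply":
        func = 1.0
        for value in function_scores:
            func *= value
    elif score_mode == "sum":
        func = sum(function_scores)
    else:
        raise ValueError("unsupported score_mode in this sketch")

    if boost_mode == "multiply":
        return query_score * func
    if boost_mode == "sum":
        return query_score + func
    if boost_mode == "replace":
        return func
    raise ValueError("unsupported boost_mode in this sketch")


# Function values in order: geo gauss, likes gauss times weight 2, recency script.
exact_but_far = combine(8.0, [0.01, 1.8, 1.5])   # strong text match, far away
weak_but_near = combine(2.0, [1.0, 1.8, 1.5])    # weak text match, close by
print(exact_but_far, weak_but_near)

# With score_mode "sum", one tiny geo value can no longer wipe out the rest.
exact_sum = combine(8.0, [0.01, 1.8, 1.5], score_mode="sum")
near_sum = combine(2.0, [1.0, 1.8, 1.5], score_mode="sum")
print(exact_sum, near_sum)
```

Under multiplication, a geo decay close to zero dominates everything else, which matches the behaviour described in the question. Trying score_mode "sum", or a larger scale on the geo function, may soften that effect.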

Elastic Search different results on URL query and JSON POST

I'm finishing a search function for a large online store.
I have a problem with additional fields. When I search on some fields from the browser, it works, but when I POST JSON using a bool filter, I get 0 results (and no error).
Basically, when I visit localhost:9200/search/items/_search?pretty=true&q=field-7:Diesel
it works well; the JSON request, however, does not.
I've been googling all day and couldn't find any help in the Elasticsearch documentation. What frustrates me even more is that some other fields in the bool query work fine, but this one doesn't.
I don't have any mapping, and ES works for me out of the box: querying on the "name" field works well, as does any other field, including this one, but only from the browser.
I realise that querying ES from the browser uses the so-called "query string query".
Anyway, here is an example of the JSON that I'm posting to Elasticsearch
(searching for all items that have "golf mk5" in their name and a diesel fuel type, by searching field-7):
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must_not": [
            {
              "term": {
                "sold": "1"
              }
            },
            {
              "term": {
                "user_id": "0"
              }
            }
          ],
          "must": [
            {
              "term": {
                "locked": "0"
              }
            },
            {
              "term": {
                "removed": "0"
              }
            },
            {
              "terms": {
                "field-7": [
                  "Diesel"
                ]
              }
            }
          ]
        }
      },
      "query": {
        "match": {
          "name": {
            "operator": "and",
            "query": "+golf +Mk5"
          }
        }
      }
    }
  },
  "sort": [
    {
      "ordering": {
        "price": "desc"
      }
    }
  ],
  "from": 0,
  "size": 24,
  "facets": {
    "category_count": {
      "terms": {
        "field": "category_id",
        "size": 20,
        "order": "count"
      }
    },
    "price": {
      "statistical": {
        "field": "price"
      }
    }
  }
}

With a query_string query, the text is analyzed. With the term query (and term filter), it is not.
Since you're not specifying a mapping, string fields get the standard analyzer. It tokenizes and lowercases the text (and, depending on the Elasticsearch version, removes stopwords).
Thus, the term Diesel is indexed as diesel. Your terms filter looks up the exact term Diesel, which is different. Searching for the lowercased term diesel, or mapping the field as not_analyzed, should make the filter match.
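The mismatch can be reproduced with a toy model. The Python sketch below uses a very rough stand-in for the standard analyzer (split on non-word characters, then lowercase); it is not the real analyzer, and all function names are made up.

```python
import re


def standard_like_analyze(text):
    """Rough stand-in for the standard analyzer: tokenize and lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]


# What ends up in the index for a document whose field-7 is "Diesel".
indexed_terms = set(standard_like_analyze("Diesel"))


def term_lookup(term):
    """Term query/filter: the input is compared to indexed terms as-is."""
    return term in indexed_terms


def analyzed_lookup(text):
    """Query-string style lookup: the input is analyzed before matching."""
    return all(token in indexed_terms for token in standard_like_analyze(text))


print(term_lookup("Diesel"))      # exact term, not analyzed
print(term_lookup("diesel"))      # lowercased by hand
print(analyzed_lookup("Diesel"))  # analyzed like the indexed text
```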
