Executing a bitmask with an Elasticsearch query - PHP

I would like to select data from an Elasticsearch index, where the documents returned depend on a bitmask evaluation of a number in the query.
Something like ($x & 32) == 32
The query is as follows:
{
"size":1000,
"sort": {
"timestamp": "desc"
},
"fields" : ["id","timestamp", "eval_id"],
"query": {
"bool": {
"must": [
{
"term": {
"id": "450"
}
},
{
"term": {"eval_id": "161"}
},
{
"range": {
"timestamp": {
"gte": 1427061600000,
"lte": 1427147999000
}
}
}
]
}
}
}
So "eval_id" must pass a bitmask evaluation for the document to be returned in the JSON result.
For example, eval_id can be 161, 681, 421 and so on.
In SQL it looks like this: SUM(If ((eval_id & 1 = 1), 1,0)) as 'EVAL_value'
Can anyone help?
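To make the intended evaluation concrete, here is a minimal Python sketch of the mask test and of the SQL-style count. It only illustrates the bit logic, not an Elasticsearch query. The helper names `has_bits` and `count_matching` and the non-matching value 64 are made up for illustration. Note that in PHP `==` binds tighter than `&`, so the mask test needs explicit parentheses, as in `($x & 32) == 32`.

```python
# Illustrative only: mirrors SUM(IF((eval_id & mask) = mask, 1, 0)) in plain Python.

def has_bits(value: int, mask: int) -> bool:
    """Return True when every bit set in mask is also set in value."""
    return (value & mask) == mask


def count_matching(values, mask: int) -> int:
    """Count how many values pass the mask test."""
    return sum(1 for v in values if has_bits(v, mask))


# 161, 681 and 421 come from the question; 64 is a hypothetical non-matching value.
eval_ids = [161, 681, 421, 64]

matching = [v for v in eval_ids if has_bits(v, 32)]
print(matching)                     # values with bit 32 set
print(count_matching(eval_ids, 1))  # how many values have bit 1 set
```

All three eval_id values from the question have both bit 32 and bit 1 set, which is why a plain term match on a single number cannot express the condition.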

Related

Multiple search fields in Elasticsearch (PHP)

Hello, I want to do something like this with Elasticsearch (the original image is not available).
I already have some knowledge of Elasticsearch, but I can't understand how to do this kind of multiple search.
You can use a combination of bool/must/should clauses to combine multiple conditions:
{
"query": {
"bool": {
"should": [
{
"multi_match": {
"query": "tag"
}
},
{
"match": {
"answers": 0
}
},
{
"match": {
"user": 1234
}
},
{
"multi_match": {
"query": "words here",
"type": "phrase"
}
},
{
"match": {
"score": 3
}
},
{
"match": {
"isaccepted": "yes"
}
}
]
}
}
}
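If the conditions come from a form where some inputs may be empty, the should list can be assembled programmatically. The following is a minimal sketch in Python that only builds the request body as a dict. The helper name `should_query` is hypothetical, and `minimum_should_match` is included as an optional assumption about how strict the match should be.

```python
import json


def should_query(clauses, minimum_should_match=None):
    """Wrap a list of query clauses in a bool/should request body."""
    bool_part = {"should": list(clauses)}
    if minimum_should_match is not None:
        # Optional: require at least this many should clauses to match.
        bool_part["minimum_should_match"] = minimum_should_match
    return {"query": {"bool": bool_part}}


# Clauses taken from the example above; add only those the user filled in.
clauses = [
    {"match": {"answers": 0}},
    {"match": {"user": 1234}},
    {"match": {"isaccepted": "yes"}},
]

body = should_query(clauses, minimum_should_match=1)
print(json.dumps(body, indent=2))
```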
If you want to search on multiple fields, you can use the multi_match query. From the documentation:
If no fields are provided, the multi_match query defaults to the
index.query.default_field index settings, which in turn defaults to *.
This extracts all fields in the mapping that are eligible to term queries and filters the metadata fields. All extracted fields are then
combined to build a query.
Here is a working example with index data, search query, and search result.
Index Data:
{
"answers": 0,
"isaccepted": "no"
}
{
"answers": 0,
"isaccepted": "yes"
}
Search Query:
{
"query": {
"multi_match" : {
"query" : "yes"
}
}
}
Search Result:
"hits": [
{
"_index": "67542669",
"_type": "_doc",
"_id": "1",
"_score": 0.2876821,
"_source": {
"answers": 0,
"isaccepted": "yes"
}
}
]

Elasticsearch : How to use multiple filter and calculation in aggregations?

I'm trying to build a function in Kibana.
I have an index of orders with some fields:
datetime1, datetime2 with format: yyyy-MM-dd HH:mm
First, I have to check whether datetime1 exists.
Second, I have to compute the difference between the two dates: datetime2 - datetime1
Finally, I have to put the result in different aggs depending on whether the difference is:
less than 24h
between 24 and 48h
48 - 72
....
What I tried:
GET orders/_search
{
"size": 0,
"aggs": {
"test1": {
"filters": {
"filters": {
"exist_datetime1": {
"exists": {
"field": "datetime1"
}
},
"24_hours": {
"script": {
"script": {
"source": "doc['datetime2'].value - doc['datetime1'].value < 24",
"lang": "painless"
}
}
}
}
}
}
}
}
How can I apply multiple filters and subtract one date from the other?
Thanks for your help :)
That's a good start; however, I think you need something slightly different. Here is an attempt at providing the ranges you need, using the range aggregation powered by your script.
You need to make sure both date fields have values (the query part), and then you can define the buckets you need (< 24h, 24h - 48h, etc.):
{
"size": 0,
"query": {
"bool": {
"filter": [
{
"exists": {
"field": "datetime1"
}
},
{
"exists": {
"field": "datetime2"
}
}
]
}
},
"aggs": {
"ranges": {
"range": {
"script": {
"lang": "painless",
"source": "(doc['datetime2'].value.millis - doc['datetime1'].value.millis) / 3600000"
},
"ranges": [
{
"to": 24,
"key": "< 24h"
},
{
"from": 24,
"to": 48,
"key": "24h-48h"
},
{
"from": 48,
"key": "> 48h"
}
]
}
}
}
}
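To check the bucket boundaries outside Elasticsearch, the same logic can be sketched in plain Python. This is only an illustration of the arithmetic, assuming the yyyy-MM-dd HH:mm format from the question. The sample orders and the helper names `hours_between` and `bucket_key` are made up. As in the range aggregation, "from" is inclusive and "to" is exclusive.

```python
from collections import Counter
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"  # Python equivalent of yyyy-MM-dd HH:mm


def hours_between(datetime1: str, datetime2: str) -> float:
    """Return datetime2 - datetime1 in hours."""
    delta = datetime.strptime(datetime2, FMT) - datetime.strptime(datetime1, FMT)
    return delta.total_seconds() / 3600


def bucket_key(hours: float) -> str:
    """Map a difference in hours to the bucket keys used in the aggregation."""
    if hours < 24:
        return "< 24h"
    if hours < 48:
        return "24h-48h"
    return "> 48h"


# Hypothetical (datetime1, datetime2) pairs; both fields are assumed to exist.
orders = [
    ("2021-01-01 10:00", "2021-01-01 22:30"),
    ("2021-01-01 10:00", "2021-01-02 10:00"),
    ("2021-01-01 10:00", "2021-01-04 09:00"),
]

counts = Counter(bucket_key(hours_between(d1, d2)) for d1, d2 in orders)
print(counts)
```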

How to calculate average minimal times in Elasticsearch?

The Elasticsearch version in use is 5.4.2.
I'd like to build an Elasticsearch query that satisfies three conditions:
1. filter by championId
2. get the minimal time to buy each item per game
3. calculate the average minimal time to buy each item across all games
I did 1 and 2, but I could not find a way to solve 3. Is it possible to do 1 to 3 in one query? In case it matters, I will use the result in Laravel 5.4, a PHP framework.
My data format is the following:
"_index": "timelines",
"_type": "timeline"
"_source": {
"gameId": 152735348,
"participantId": 3,
"championId": 35,
"role": "NONE",
"lane": "JUNGLE",
"win": 1,
"itemId": 1036,
"timestamp": 571200
}
My current Elasticsearch query is this
GET timelines/_search?size=0&pretty
{
"query": {
"bool": {
"must": [
{ "match": { "championId": 22 }}
]
}
},
"aggs": {
"games": {
"terms": {
"field": "gameId"
},
"aggs": {
"items": {
"terms": {
"field": "itemId",
"order" : { "min_buying_time" : "asc" }
},
"aggs": {
"min_buying_time": {
"min": {
"field": "timestamp"
}
}
}
}
}
}
}
}
As @Sönke Liebau said, a pipeline aggregation is the key, but if you want the average minimal time across all games per item, you should first aggregate by itemId. The following code should help:
POST misko/_search
{
"query": {
"bool": {
"must": [
{ "match": { "championId": 22 }}
]
}
},
"aggs": {
"items": {
"terms": {
"field": "itemId"
},
"aggs": {
"games": {
"terms": {
"field": "gameId"
},
"aggs": {
"min_buying_time": {
"min": {
"field": "timestamp"
}
}
}
},
"avg_min_time": {
"avg_bucket": {
"buckets_path": "games>min_buying_time"
}
}
}
}
}
}
If I understand your objective correctly, you should be able to solve this with pipeline aggregations. More specifically, for your use case the Avg Bucket aggregation should be helpful; check out the example in the documentation, which should be very close to what you need.
Something like:
"avg_min_buying_time": {
"avg_bucket": {
"buckets_path": "games>min_buying_time"
}
}
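To sanity-check what the min + avg_bucket combination is meant to compute, here is a small Python sketch of the same calculation on in-memory rows: the earliest timestamp per (itemId, gameId), then the average of those minima per item. The sample events and the function name `avg_min_time_per_item` are hypothetical; the field names follow the data format in the question.

```python
from collections import defaultdict


def avg_min_time_per_item(events):
    """Average, per item, of the earliest purchase timestamp in each game."""
    earliest = {}  # (itemId, gameId) -> minimal timestamp
    for event in events:
        key = (event["itemId"], event["gameId"])
        ts = event["timestamp"]
        if key not in earliest or ts < earliest[key]:
            earliest[key] = ts

    per_item = defaultdict(list)  # itemId -> list of per-game minima
    for (item_id, _game_id), ts in earliest.items():
        per_item[item_id].append(ts)

    return {item: sum(vals) / len(vals) for item, vals in per_item.items()}


# Hypothetical rows, already filtered by championId.
events = [
    {"gameId": 1, "itemId": 1036, "timestamp": 571200},
    {"gameId": 1, "itemId": 1036, "timestamp": 600000},
    {"gameId": 2, "itemId": 1036, "timestamp": 400000},
    {"gameId": 1, "itemId": 1001, "timestamp": 300000},
]

result = avg_min_time_per_item(events)
print(result)
```

For item 1036 the per-game minima are 571200 and 400000, so the average is 485600.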

Elasticsearch - Distinct Values, Not Counts

I am trying to do something similar to this SQL query:
SELECT * FROM table WHERE fileContent LIKE '%keyword%' AND company_id = '1' GROUP BY email
Having read posts similar to this I have this:
{
"query": {
"bool": {
"must": [{
"match": {
"fileContent": {
"query": "keyword"
}
}
}],
"filter": [{
"terms": {
"company_id": [1]
}
}]
}
},
"aggs": {
"group_by_email": {
"terms": {
"field": "email",
"size": 1000
}
}
},
"size": 0
}
Field mappings are:
{
"cvs" : {
"mappings" : {
"application" : {
"_meta" : {
"model" : "Acme\\AppBundle\\Entity\\Application"
},
"dynamic_date_formats" : [ ],
"properties" : {
"email" : {
"type" : "keyword"
},
"fileContent" : {
"type" : "text"
},
"company_id" : {
"type" : "text"
}
}
}
}
}
}
... which are generated from Symfony config.yml:
fos_elastica:
clients:
default:
host: "%elastica.host%"
port: "%elastica.port%"
indexes:
cvs:
client: default
types:
application:
properties:
fileContent: ~
email:
index: not_analyzed
company_id: ~
persistence:
driver: orm
model: Acme\AppBundle\Entity\Application
provider: ~
finder: ~
The filter works fine, but hits:hits returns no items (or all results matching the search if I remove size:0), and aggregations:group_by_email:buckets contains the groups with their counts but not the records themselves. The records that were grouped aren't returned, and it's these that I need.
I have also tried FOSElasticaBundle with the query builder, in case that is your preferred flavour (this works but doesn't have the grouping/aggregation):
$boolQuery = new \Elastica\Query\BoolQuery();
$filterKeywords = new \Elastica\Query\Match();
$filterKeywords->setFieldQuery('fileContent', 'keyword');
$boolQuery->addMust($filterKeywords);
$filterUser = new \Elastica\Query\Terms();
$filterUser->setTerms('company_id', array('1'));
$boolQuery->addFilter($filterUser);
$finder = $this->get('fos_elastica.finder.cvs.application');
Thanks.
For this you need top_hits aggregation inside the terms one you are already using:
"aggs": {
"group_by_email": {
"terms": {
"field": "email",
"size": 1000
},
"aggs": {
"sample_docs": {
"top_hits": {
"size": 100
}
}
}
}
}
Having played around with Andrei's answer, top_hits:{size:1} appears to be what I need. This returns one record for each bucket in the aggregation:
"aggs": {
"group_by_email": {
"terms": {
"field": "email",
"size": 1000
},
"aggs": {
"sample_docs": {
"top_hits": {
"size": 1
}
}
}
}
}
Ref: top_hits
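As a rough mental model of what terms + top_hits returns, the following Python sketch groups documents by a field and keeps the first `size` documents of each group together with the group count. It is only an illustration: the function name `top_hits_per_group` and the sample documents are made up, and the sketch keeps input order, whereas Elasticsearch orders top_hits by score unless a sort is given.

```python
def top_hits_per_group(docs, field, size=1):
    """Group docs by `field`; keep the doc count and the first `size` docs per group."""
    groups = {}
    for doc in docs:
        groups.setdefault(doc[field], []).append(doc)
    return {
        key: {"doc_count": len(hits), "top": hits[:size]}
        for key, hits in groups.items()
    }


# Hypothetical documents that already matched the query and filter.
docs = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com"},
    {"id": 3, "email": "a@example.com"},
]

buckets = top_hits_per_group(docs, "email", size=1)
print(buckets)
```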
top_hits helped me too. I had some trouble at first, but eventually figured out how to resolve it. Here is my solution, which uses a nested field:
{
"query": {
"nested": {
"path": "placedOrders",
"query": {
"bool": {
"must": [
{
"term": {
"placedOrders.ownerId": "0a9fdef0-4508-4f9c-aa8c-b3984e39ad1e"
}
}
]
}
}
}
},
"aggs": {
"custom_name1": {
"nested": {
"path": "placedOrders"
},
"aggs": {
"custom_name2": {
"terms": {
"field": "placedOrders.propertyId"
},
"aggs": {
"custom_name3": {
"top_hits": {
"size": 1,
"sort": [
{
"placedOrders.propertyId": {
"order": "desc"
}
}
]
}
}
}
}
}
}
}
}

Elasticsearch: different results for URL query and JSON POST

I'm finishing a search function for a large online webstore.
I have a problem with additional fields. When I search on some fields in the browser, it works, but when I POST JSON using a bool filter, I get 0 results (and no error).
Basically, when I visit localhost:9200/search/items/_search?pretty=true&q=field-7:Diesel
it works well; in JSON, however, it doesn't.
I've been googling all day and couldn't find any help in the Elasticsearch documentation. What frustrates me even more is that some other fields in the bool query work OK, but this one doesn't.
I don't have any mapping, and ES works for me out of the box: querying on the "name" field works well, as does any other field, including this one, but only from the browser.
I realise that querying ES from the browser uses the so-called "query string query".
Anyway, here is an example of the JSON that I'm posting to Elasticsearch
(searching for all items that have "golf mk5" in their name and a diesel fuel type, by searching field-7):
{
"query": {
"filtered": {
"filter": {
"bool": {
"must_not": [
{
"term": {
"sold": "1"
}
},
{
"term": {
"user_id": "0"
}
}
],
"must": [
{
"term": {
"locked": "0"
}
},
{
"term": {
"removed": "0"
}
},
{
"terms": {
"field-7": [
"Diesel"
]
}
}
]
}
},
"query": {
"match": {
"name": {
"operator": "and",
"query": "+golf +Mk5"
}
}
}
}
},
"sort": [
{
"ordering": {
"price": "desc"
}
}
],
"from": 0,
"size": 24,
"facets": {
"category_count": {
"terms": {
"field": "category_id",
"size": 20,
"order": "count"
}
},
"price": {
"statistical": {
"field": "price"
}
}
}
}
With a query_string query, the text is analyzed. With the term query (and term filter), it is not.
Since you're not specifying a mapping, you'll get the standard analyzer for string fields. It tokenizes, lowercases and removes stopwords.
Thus, the term Diesel will be indexed as diesel. Your terms filter is looking up the exact term Diesel, which is different.
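The mismatch can be illustrated with a few lines of Python. This is a deliberately simplified stand-in for an analyzer (split on non-alphanumeric characters and lowercase); the function name `simple_analyze` is made up, and the real standard analyzer does more than this.

```python
import re


def simple_analyze(text: str):
    """Very rough analyzer stand-in: tokenize on non-alphanumerics and lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


# What ends up in the index for the field value "Diesel".
indexed_terms = set(simple_analyze("Diesel"))

# A term/terms lookup uses the input as-is, so the capitalised form is not found.
term_lookup = "Diesel" in indexed_terms

# An analyzed query runs the input through the same analysis first.
analyzed_lookup = all(t in indexed_terms for t in simple_analyze("Diesel"))

print(indexed_terms, term_lookup, analyzed_lookup)
```

Under this model, looking up the lowercase term diesel in the terms filter, or using an analyzed query for that field, would find the document.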
