Elasticsearch multiple queries - php

I would like to get the distinct IPs for a given day (for example, today) where the campaign is "2".
In SQL:
select distinct ip
from test
where timestamp >= "2016-01-16" ... AND
fk_campaign_id = "2";
This works, but a JSON validator reports "Duplicate key, names should be unique."
{
"size":0,
"aggs":{
"distinct_ip":{
"cardinality":{
"field":"ip"
}
}
},
"query":{
"range":{
"timestamp":{
"gte":"2016-01-16T00:00:00",
"lt":"2016-01-17T00:00:00"
}
}
},
"query":{
"match":{
"fk_campaign_id":"2"
}
}
}
But if I try to build this query in PHP, var_dump($params) gives me back JSON with only one "query". Maybe that is because of the duplicate key?
{
"size":0,
"aggs":{
"distinct_ip":{
"cardinality":{
"field":"ip"
}
}
},
(the part with the range is missing here)
"query":{
"match":{
"fk_campaign_id":"2"
}
}
}
Thanks in advance.

Your JSON query contains a duplicate key: "query" appears twice at the top level. Whenever you have multiple conditions, you need to combine them in a bool query. Since your conditions are joined with AND, use the must clause. This is the right syntax:
{
"query": {
"bool": {
"must": [
{
"range": {
"timestamp": {
"gte": "2016-01-16T00:00:00",
"lt": "2016-01-17T00:00:00"
}
}
},
{
"match": {
"fk_campaign_id": "2"
}
}
]
}
},
"size": 0,
"aggs": {
"distinct_ip": {
"cardinality": {
"field": "ip"
}
}
}
}
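The missing range in the var_dump output is most likely the same duplicate-key problem: a PHP associative array cannot hold the key "query" twice, so the second assignment replaces the first. Python's json module does the same thing when parsing, which makes the effect easy to see. The sketch below is only an illustration under that assumption; the helper name build_distinct_ip_body is made up for this example and is not part of any client library.

```python
import json

# The original request body: "query" appears twice at the top level.
duplicated = """
{
  "size": 0,
  "query": {"range": {"timestamp": {"gte": "2016-01-16T00:00:00",
                                    "lt": "2016-01-17T00:00:00"}}},
  "query": {"match": {"fk_campaign_id": "2"}}
}
"""

# json.loads keeps only the last value of a repeated key, so the range is lost.
parsed = json.loads(duplicated)
surviving = list(parsed["query"])


def build_distinct_ip_body(start, end, campaign_id):
    """Hypothetical helper: put both conditions under one "query" key."""
    return {
        "size": 0,
        "query": {
            "bool": {
                "must": [
                    {"range": {"timestamp": {"gte": start, "lt": end}}},
                    {"match": {"fk_campaign_id": campaign_id}},
                ]
            }
        },
        "aggs": {"distinct_ip": {"cardinality": {"field": "ip"}}},
    }


body = build_distinct_ip_body("2016-01-16T00:00:00", "2016-01-17T00:00:00", "2")
print(surviving)
print(json.dumps(body, indent=2))
```

The same nesting carries over to a PHP array: one 'query' key whose value holds 'bool' => 'must' => a list of the two clauses.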
Hope this helps!

Related

How to sort aggregations by top hits field (text field)? Or is there any possibility to sort aggregations by text field (without using _term)

As the title says, I have a problem with sorting an Elasticsearch aggregation by a text field. Is there any way to do it, using top_hits or something similar? Right now I am using a terms aggregation and can sort by the aggregated field using _term, but I need to sort the buckets by a different field. I know how to do it for fields with numeric values, for instance using max, min, sum, etc.
It would be great if I could do it like this (but I can't):
{
"aggs": {
"Variants": {
"terms": {
"field": "variant",
"order": {
"top_Song_hits": "asc"
}
},
"aggs": {
"top_Song_hits": {
"sum": {
"name": {
"order": "desc"
}
}
}
}
}
}
}
or like this:
{
"aggs": {
"Variants": {
"terms": {
"field": "variant",
"order": {
"name_agg": "asc"
}
},
"aggs": {
"name_agg": {
"terms": {
"field": "name"
}
}
}
}
}
}
Or
{
"aggs": {
"Variants": {
"terms": {
"field": "variant",
"order": {
"details": "asc"
}
},
"aggs": {
"details": {
"top_hits": {
"size": 1,
"_source": {
"include": ["name"]
}
}
}
}
}
}
}
In the last case I get this error:
"reason": "Invalid aggregation order path [details]. Buckets can only be sorted on a sub-aggregator path that is built out of zero or more single-bucket aggregations within the path and a final single-bucket or a metrics aggregation at the path end."
I found a solution for my problem here (field collapsing):
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-collapse.html
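For readers who do not want to follow the link, here is a rough sketch of what a collapse-based request body could look like. It is not the asker's actual query. It assumes that variant is a keyword or numeric field (collapsing needs one) and that a sortable sub-field such as name.keyword exists; that sub-field name is an assumption, not something shown in the question.

```python
import json


def build_collapse_body(group_field, sort_field, order="asc"):
    """Hypothetical helper: one hit per group_field value, ordered by sort_field."""
    return {
        "query": {"match_all": {}},
        # Keep only the top-sorted document for each distinct group_field value.
        "collapse": {"field": group_field},
        # The collapsed hits come back in this order.
        "sort": [{sort_field: {"order": order}}],
    }


# "name.keyword" is an assumed sortable variant of the text field "name".
body = build_collapse_body("variant", "name.keyword")
print(json.dumps(body, indent=2))
```

Unlike a terms aggregation, this returns the groups as ordinary hits, so the sort applies to documents rather than to buckets.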

Elasticsearch - Distinct Values, Not Counts

I am trying to do something similar to this SQL query:
SELECT * FROM table WHERE fileContent LIKE '%keyword%' AND company_id = '1' GROUP BY email
Having read posts similar to this I have this:
{
"query": {
"bool": {
"must": [{
"match": {
"fileContent": {
"query": "keyword"
}
}
}],
"filter": [{
"terms": {
"company_id": [1]
}
}]
}
},
"aggs": {
"group_by_email": {
"terms": {
"field": "email",
"size": 1000
}
}
},
"size": 0
}
Field mappings are:
{
"cvs" : {
"mappings" : {
"application" : {
"_meta" : {
"model" : "Acme\\AppBundle\\Entity\\Application"
},
"dynamic_date_formats" : [ ],
"properties" : {
"email" : {
"type" : "keyword"
},
"fileContent" : {
"type" : "text"
},
"company_id" : {
"type" : "text"
}
}
}
}
}
}
... which are generated from Symfony config.yml:
fos_elastica:
clients:
default:
host: "%elastica.host%"
port: "%elastica.port%"
indexes:
cvs:
client: default
types:
application:
properties:
fileContent: ~
email:
index: not_analyzed
company_id: ~
persistence:
driver: orm
model: Acme\AppBundle\Entity\Application
provider: ~
finder: ~
The filter works fine, but I am finding that hits:hits returns no items (or all results matching the search if I remove size:0) and aggregations:group_by_email:buckets has a count of the groups but not the records themselves. The records that were grouped aren't returned and it's these that I need.
I have also tried FOSElasticaBundle with the query builder, if that is your preferred flavour (this works, but doesn't have the grouping/aggregation):
$boolQuery = new \Elastica\Query\BoolQuery();
$filterKeywords = new \Elastica\Query\Match();
$filterKeywords->setFieldQuery('fileContent', 'keyword');
$boolQuery->addMust($filterKeywords);
$filterUser = new \Elastica\Query\Terms();
$filterUser->setTerms('company_id', array('1'));
$boolQuery->addFilter($filterUser);
$finder = $this->get('fos_elastica.finder.cvs.application');
Thanks.
For this you need a top_hits aggregation nested inside the terms aggregation you are already using:
"aggs": {
"group_by_email": {
"terms": {
"field": "email",
"size": 1000
},
"aggs": {
"sample_docs": {
"top_hits": {
"size": 100
}
}
}
}
}
top_hits: {size: 1} appears to be what I need, having played around with Andrei's answer. It returns one record for each bucket in the aggregation:
"aggs": {
"group_by_email": {
"terms": {
"field": "email",
"size": 1000
},
"aggs": {
"sample_docs": {
"top_hits": {
"size": 1
}
}
}
}
}
Ref: top_hits
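To get the grouped records out of the response, walk the buckets and read the hits stored under the top_hits sub-aggregation. The sketch below assumes the response has already been decoded into a dict; the sample response is invented purely to show the shape and is not real data.

```python
def records_per_email(response, terms_name="group_by_email", hits_name="sample_docs"):
    """Map each bucket key (email) to the _source of its top_hits documents."""
    grouped = {}
    for bucket in response["aggregations"][terms_name]["buckets"]:
        hits = bucket[hits_name]["hits"]["hits"]
        grouped[bucket["key"]] = [hit["_source"] for hit in hits]
    return grouped


# Made-up response, trimmed to the fields the helper actually reads.
sample_response = {
    "hits": {"hits": []},
    "aggregations": {
        "group_by_email": {
            "buckets": [
                {
                    "key": "a@example.com",
                    "doc_count": 2,
                    "sample_docs": {
                        "hits": {"hits": [{"_source": {"email": "a@example.com"}}]}
                    },
                }
            ]
        }
    },
}

grouped = records_per_email(sample_response)
print(grouped)
```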
top_hits helped me too. I had some trouble at first, but eventually figured out how to resolve it. Here is my solution, which uses nested documents:
{
"query": {
"nested": {
"path": "placedOrders",
"query": {
"bool": {
"must": [
{
"term": {
"placedOrders.ownerId": "0a9fdef0-4508-4f9c-aa8c-b3984e39ad1e"
}
}
]
}
}
}
},
"aggs": {
"custom_name1": {
"nested": {
"path": "placedOrders"
},
"aggs": {
"custom_name2": {
"terms": {
"field": "placedOrders.propertyId"
},
"aggs": {
"custom_name3": {
"top_hits": {
"size": 1,
"sort": [
{
"placedOrders.propertyId": {
"order": "desc"
}
}
]
}
}
}
}
}
}
}
}

Multi indices search with nested fields

I have two indices:
The first, questions, has a nested field answers. The second, articles, does not have this field.
I am trying to search across both indices:
{
"index": "questions, articles",
"body":{
"query":{
"bool":{
"must":{
"nested":{
"path": "answer",
...
}
}
}
}
}
}
and get the error "query_parsing_exception: [nested] failed to find nested object under path [answer]"
How can I search without errors when one index has the nested field but the other does not?
I think you need to use the indices query, so that a different query runs for each index. Something like this:
GET /questions,articles/_search
{
"query": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"indices": {
"indices": [
"questions"
],
"query": {
"nested": {
"path": "answer",
"query": {
"term": {
"text": "bla"
}
}
}
}
}
},
{
"match_all": {}
}
]
}
},
{
"term": {
"some_common_field": {
"value": "whatever"
}
}
}
]
}
}
}
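As far as I know, the indices query was deprecated in Elasticsearch 5.0 and removed in 6.0, so on newer versions a different approach is needed. One possible sketch, under that assumption, uses the nested query's ignore_unmapped option so that indices without the nested path simply match nothing instead of raising an error, and selects the other index through the _index field. The field names answer.text and some_common_field are placeholders taken from or modelled on the example above, not real mappings.

```python
import json


def build_cross_index_body(nested_path, nested_query, other_indices, common_filter):
    """Hypothetical helper: nested clause for indices that have the path,
    an _index clause for the ones that do not."""
    return {
        "query": {
            "bool": {
                "must": [
                    {
                        "bool": {
                            "should": [
                                {
                                    "nested": {
                                        "path": nested_path,
                                        "query": nested_query,
                                        # Do not fail on indices lacking this path.
                                        "ignore_unmapped": True,
                                    }
                                },
                                {"terms": {"_index": other_indices}},
                            ]
                        }
                    },
                    common_filter,
                ]
            }
        }
    }


body = build_cross_index_body(
    "answer",
    {"term": {"answer.text": "bla"}},          # assumed field name
    ["articles"],
    {"term": {"some_common_field": {"value": "whatever"}}},
)
print(json.dumps(body, indent=2))
```

The body would be sent to the same /questions,articles/_search endpoint as before.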

How to do an Elasticsearch range query with gauss function applied too?

How would I construct an ElasticSearch query to satisfy the following:
Price must be between 100,000 & 200,000, but also show results outside of this range, but with decreasing relevance if above 200k or below 100k.
So far I have the following but it doesn't seem to be doing what I want (omitted the wrapping query for brevity):
"function_score": {
"query": {
"range": {
"price_amount": {
"gte": 100000,
"lte": 200000
}
}
},
"functions": [
{
"gauss": {
"price_amount": {
"origin": "50000",
"offset": "50000",
"scale": "10000"
}
}
}
]
}
Update:
I had another look, and I think setting the function to the following, without the range query, would do the trick, wouldn't it?
"function_score": {
"functions": [
{
"gauss": {
"price_amount": {
"origin": "150000",
"offset": "50000",
"scale": "10000"
}
}
}
]
}
Many thanks!
Lee
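One way to check the numbers in the update is to compute the decay by hand. The sketch below follows the gauss formula as I understand it from the decay function documentation: values within offset of origin score 1.0, and the score falls to decay (0.5 by default) at a further distance of scale. It is a plain re-implementation for reasoning about parameters, not Elasticsearch code, and the function name is made up.

```python
import math


def gauss_decay(value, origin, offset, scale, decay=0.5):
    """Gaussian decay multiplier for a numeric field value."""
    # Nothing decays inside [origin - offset, origin + offset].
    distance = max(0.0, abs(value - origin) - offset)
    # Variance chosen so that the score equals `decay` at distance == scale.
    sigma_squared = -(scale ** 2) / (2.0 * math.log(decay))
    return math.exp(-(distance ** 2) / (2.0 * sigma_squared))


# Parameters from the update: origin 150000, offset 50000, scale 10000.
for price in (90000, 100000, 150000, 200000, 210000, 250000):
    print(price, round(gauss_decay(price, 150000, 50000, 10000), 4))
```

With these parameters every price from 100,000 to 200,000 keeps the full multiplier and the score halves 10,000 outside that range. With the first attempt's origin of 50000, the full-score band would be 0 to 100,000 instead, which does not match the stated goal.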

_score while doing indexing in elasticsearch

{
"query": {
"custom_score": {
"query": {
"match": {
"xxx": {
"query": "foobar"
}
}
},
"filter": {
"and": [
{
"query": {
"match": {
"yyyy": {
"query": "barfoo"
}
}
}
}
]
}
},
"script": "_score * doc['_score']"
}
}
This gives the error:
[custom_score] query does not support [filter]
How should such a query be written, then?
I would suggest you look again at your requirements regarding boosting, since your current script doesn't make much sense.
Also, have a look at the documentation for the Elasticsearch query DSL. It provides both compound queries and simple ones, which you can combine. As the error says, you can't put a filter inside a custom score query. You can either use a filtered query inside the custom score query:
{
"query": {
"custom_score": {
"query": {
"filtered" : {
"query" : {
"match": {
"xxx": {
"query": "foobar"
}
}
},
"filter" : {
"and": [
{
"query": {
"match": {
"yyyy": {
"query": "barfoo"
}
}
}
}
]
}
}
},
"script": "_score * doc['_score']"
}
}
}
or use a top level filter like this:
{
"query": {
"custom_score": {
"query": {
"match": {
"xxx": {
"query": "foobar"
}
}
},
"script": "_score * doc['_score']"
}
},
"filter": {
"and": [
{
"query": {
"match": {
"yyyy": {
"query": "barfoo"
}
}
}
}
]
}
}
The difference between the two options is that the top-level filter is not applied to facets if your search request also computes them, while filters placed inside the query are.
One other thing to look at: you don't need an and filter if you have only a single clause. Also, it usually doesn't make sense to put a full-text search within a filter: filters are cacheable, and since full-text searches are free-form and pretty much unpredictable, caching them would be a waste.
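Putting the last two remarks together, the filtered variant could be reduced to a single filter clause without the and wrapper. The sketch below keeps the legacy custom_score and filtered syntax used in this thread and swaps the full-text filter for a term filter, which assumes that yyyy holds an exact, non-analyzed value; that is an assumption about the mapping, and the script is only a placeholder.

```python
import json


def build_filtered_custom_score(text_field, text, filter_field, filter_value, script):
    """Hypothetical helper: legacy custom_score wrapping a filtered query."""
    return {
        "query": {
            "custom_score": {
                "query": {
                    "filtered": {
                        # Full-text part stays in the query, where it is scored.
                        "query": {"match": {text_field: {"query": text}}},
                        # A single exact-match clause needs no "and" wrapper.
                        "filter": {"term": {filter_field: filter_value}},
                    }
                },
                "script": script,
            }
        }
    }


# "_score" alone is a placeholder script, not a recommendation.
body = build_filtered_custom_score("xxx", "foobar", "yyyy", "barfoo", "_score")
print(json.dumps(body, indent=2))
```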
