ES: Match bool and fuzzy query - php

I have an ES query and now I want to add a 'fuzzy' parameter to it.
This is what I am trying:
"body" : {
"query" : {
"bool" : {
"must" : {
$finalQuery,
},
}
},
"match" : {
"city" : {
"query" : 'Tokkiio',
"fuzziness" : "AUTO"
},
}
}
$finalQuery is a query generated in a loop; it contains terms, range and term clauses.
I am receiving:
"{"error":{"root_cause":[{"type":"parsing_exception","reason":"[bool] malformed query, expected [END_OBJECT] but found [FIELD_NAME]","line":1,"col":177}],"type":"parsing_exception","reason":"[bool] malformed query, expected [END_OBJECT] but found [FIELD_NAME]","line":1,"col":177},"status":400}"
Thanks for help.

Please restructure the query into Query Context and Filter Context as documented here - https://www.elastic.co/guide/en/elasticsearch/reference/current/query-filter-context.html.
Place your fuzzy query and similar conditions in query context. Move the range and any filter conditions to filter context. Here is a sample query.
{
  "query": {
    "bool": {
      "must": {
        "fuzzy": {
          "city": {
            "value": "Tokkiio",
            "fuzziness": "AUTO"
          }
        }
      },
      "filter": {
        "range": {
          "year": {
            "gte": 2016
          }
        }
      }
    }
  }
}
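The same shape can also be assembled programmatically. The following is a minimal sketch in Python (used here only for illustration; the thread itself is about PHP), assuming the loop that produces $finalQuery yields a list of separate term, terms and range clauses. The function name build_search_body and the sample clauses are hypothetical. It keeps the fuzzy match from the question in must and places the generated clauses in filter:

```python
import json


def build_search_body(city, generated_clauses):
    """Combine a fuzzy match on city with pre-built filter clauses."""
    return {
        "query": {
            "bool": {
                # Scored, typo-tolerant part of the search (query context).
                "must": [
                    {"match": {"city": {"query": city, "fuzziness": "AUTO"}}}
                ],
                # Yes/no conditions such as term, terms and range (filter context).
                "filter": list(generated_clauses),
            }
        }
    }


# Hypothetical clauses standing in for the loop-generated $finalQuery.
clauses = [
    {"range": {"year": {"gte": 2016}}},
    {"term": {"country": "jp"}},
]
body = build_search_body("Tokkiio", clauses)
print(json.dumps(body, indent=2))
```

Every clause lives inside the single bool object, so no second key appears next to bool. A stray sibling key of that kind is a likely cause of a parsing_exception like the one quoted in the question.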

Related

Convert an infix expression to elastic search query

How can I convert an infix expression to an Elasticsearch query?
My operators are ! + *
and the user may build any expression using those operators, something like:
(((A*B*(!C))*(D*E))+F)*G
and I wish to convert it to a bool query in Elasticsearch.
Edit
I should have said this earlier: I have already written code that converts the infix expression to postfix, and then I call a very dirty recursive method to create should (+), must (*) and must_not (!) clauses. What I'm looking for is a cleaner, more optimized way to do this.
My query at the end looks something like this, which I think is very nasty:
{
"from": 0,
"size": 10,
"_source": [
"*"
],
"index": "insta_userpost_new2",
"body": {
"query": {
"bool": {
"must": [
{
"match_phrase": {
"caption.text": "G"
}
},
{
"bool": {
"should": [
{
"match_phrase": {
"caption.text": "F"
}
},
{
"bool": {
"must": [
{
"bool": {
"must": [
{
"match_phrase": {
"caption.text": "E"
}
},
{
"match_phrase": {
"caption.text": "D"
}
}
]
}
},
{
"bool": {
"must": [
{
"bool": {
"must_not": [
{
"match_phrase": {
"caption.text": "C"
}
},
{
"bool": {
"must": [
{
"match_phrase": {
"caption.text": "B"
}
},
{
"match_phrase": {
"caption.text": "A"
}
}
]
}
}
]
}
}
]
}
}
]
}
}
]
}
}
]
}
}
}
}
I would maybe try to leverage simple_query_string. For that, you'd have to:
replace + by | (for the OR)
then replace * by + (for the AND)
finally replace ! by - (for the NOT)
So if a user inputs this:
(((A*B*(!C))*(D*E))+F)*G
You'd end up with this
(((A+B+(-C))+(D+E))|F)+G
Which is a boolean expression that you can directly use in a simple_query_string query.
GET /_search
{
"query": {
"simple_query_string" : {
"fields" : ["content"],
"query" : "(((A+B+(-C))+(D+E))|F)+G"
}
}
}
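The three replacements can be done in one pass. This is a minimal Python sketch, assuming the operands themselves never contain +, * or !; the function names and the "content" field are illustrative only:

```python
# Map the question's operators onto simple_query_string operators.
# str.translate substitutes every character in a single pass, so the
# "+" produced from "*" is never rewritten again into "|".
_OPERATOR_MAP = str.maketrans({"+": "|", "*": "+", "!": "-"})


def to_simple_query_string(expression):
    """Rewrite an infix expression using ! + * into simple_query_string syntax."""
    return expression.translate(_OPERATOR_MAP)


def build_query(expression, fields):
    """Wrap the rewritten expression in a simple_query_string search body."""
    return {
        "query": {
            "simple_query_string": {
                "fields": list(fields),
                "query": to_simple_query_string(expression),
            }
        }
    }


print(to_simple_query_string("(((A*B*(!C))*(D*E))+F)*G"))
# prints (((A+B+(-C))+(D+E))|F)+G
```

Doing the substitution in a single pass removes the need to apply the replacements in a particular order.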
You can use Elasticsearch scripts in queries, like this:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-script-query.html
There are a few scripting options; the simplest and most straightforward is Painless. From the Elastic documentation:
When you define a scripted field in Kibana, you have a choice of scripting languages. Starting with 5.0, the default options are Lucene expressions and Painless
Also you can return the result of the calculation using Scripted Fields:
https://www.elastic.co/guide/en/kibana/current/scripted-fields.html
You can run an infix expression evaluation[1] and replace the standard eval operations with DSL bool query composers.
We actually do something akin to this for https://opensource.appbase.io/mirage/ (you can try it live), where we map GUI blocks to a composable bool query. The source code is viewable at https://github.com/appbaseio/mirage.
[1] Ref: https://www.geeksforgeeks.org/expression-evaluation/
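As a rough illustration of that idea, here is a minimal Python sketch of a recursive-descent evaluator that emits bool clauses instead of computing values. It is not the Mirage implementation. It assumes operands are plain word tokens, that ! binds tighter than *, which binds tighter than +, and that each operand becomes a match_phrase on caption.text as in the question; any other character in the input is silently ignored by the tokenizer.

```python
import re

FIELD = "caption.text"  # field name taken from the query in the question


def infix_to_bool(expression):
    """Translate an infix expression using !, * and + into a bool query dict."""
    tokens = re.findall(r"\w+|[()!*+]", expression)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        pos += 1
        return tokens[pos - 1]

    def parse_or():  # lowest precedence: a + b + ...
        clauses = [parse_and()]
        while peek() == "+":
            take()
            clauses.append(parse_and())
        if len(clauses) == 1:
            return clauses[0]
        return {"bool": {"should": clauses, "minimum_should_match": 1}}

    def parse_and():  # a * b * ...
        clauses = [parse_not()]
        while peek() == "*":
            take()
            clauses.append(parse_not())
        return clauses[0] if len(clauses) == 1 else {"bool": {"must": clauses}}

    def parse_not():  # highest precedence: !a
        if peek() == "!":
            take()
            return {"bool": {"must_not": [parse_not()]}}
        return parse_atom()

    def parse_atom():
        token = take()
        if token == "(":
            node = parse_or()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return node
        if token in ")!*+":
            raise ValueError("unexpected token: " + token)
        return {"match_phrase": {FIELD: token}}

    query = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return query


query = infix_to_bool("(((A*B*(!C))*(D*E))+F)*G")
```

Chained operators at the same level collapse into one clause list, so A*B*(!C) becomes a single must with three entries rather than nested bool objects. Parenthesised groups are still nested as written.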

How to group a query to match one or more words in elasticsearch?

I have this query
{
"bool": {
"should": [
{
"multi_match": {
"query": "LAS VEGAS, HENDERSON",
"fields": ["city"]
}
}
]
}
}
this returns:
"city": "LAS VEGAS",
"city": "LAS CRUCES",
"city": "HENDERSON",
Note the LAS CRUCES result. I don't want it.
One way would be to have it written like this:
"bool": {
"should": [
{
"match": {
"city": {
"query": "LAS VEGAS",
"operator": "and"
}
}
},
{
"match": {
"city": {
"query": "HENDERSON",
"operator": "and"
}
}
}
]
}
But I prefer the first approach, if it can be done.
Any ideas?
You can use query_string query as shown below:
GET /_search
{
"query": {
"query_string" : {
"fields" : ["city"],
"query" : "\"LAS VEGAS\" OR \"HENDERSON\""
}
}
}
You need to enclose the values in quotes to search for exact phrase.
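If the list of cities comes from user input, the quoted string can be generated. The following Python sketch is one way to do it; the helper name build_phrase_query is hypothetical, and it assumes that backslash-escaping embedded quotes is enough for your data:

```python
def build_phrase_query(field, phrases):
    """Build a query_string body that ORs together exact phrases."""
    quoted = []
    for phrase in phrases:
        # Escape backslashes and double quotes so each value stays one phrase.
        escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
        quoted.append('"' + escaped + '"')
    return {
        "query": {
            "query_string": {
                "fields": [field],
                "query": " OR ".join(quoted),
            }
        }
    }


body = build_phrase_query("city", ["LAS VEGAS", "HENDERSON"])
print(body["query"]["query_string"]["query"])
# prints "LAS VEGAS" OR "HENDERSON"
```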
If you are using the city field only for exact matches, then you should consider changing its mapping type from text to keyword. Exact-value lookups on a keyword field skip analysis and generally perform better.
If your city field is of type keyword then you can achieve the same results using terms query as shown below:
GET /_search
{
"query": {
"constant_score" : {
"filter" : {
"terms" : { "city" : ["LAS VEGAS", "HENDERSON"]}
}
}
}
}
Hope this helps!
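multi_match accepts the operator parameter, which can be set to or. Note that or is already the default, so on its own this still matches any document containing LAS and does not exclude LAS CRUCES.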
{
"bool": {
"should": [
{
"multi_match": {
"query": "LAS VEGAS, HENDERSON",
"fields": ["city"],
"operator": "or"
}
}
]
}
}

Elasticsearch multiple queries

I would like to get the distinct IPs, for example for today, where campaign = "2".
In SQL:
select distinct ip
from test
where timestamp >= "2016-01-16" ... AND
fk_campaign_id = "2";
This works, but a JSON validator reports "Duplicate key, names should be unique."
{
"size":0,
"aggs":{
"distinct_ip":{
"cardinality":{
"field":"ip"
}
}
},
"query":{
"range":{
"timestamp":{
"gte":"2016-01-16T00:00:00",
"lt":"2016-01-17T00:00:00"
}
}
},
"query":{
"match":{
"fk_campaign_id":"2"
}
}
}
But if I try to build this query in PHP, var_dump($params) gives me back JSON with only one "query", maybe because of the duplicate key?
{
"size":0,
"aggs":{
"distinct_ip":{
"cardinality":{
"field":"ip"
}
}
},
part with range is not here?!?!?
"query":{
"match":{
"fk_campaign_id":"2"
}
}
}
Thanks in advance.
Your JSON contains a duplicate "query" key. You need to use a bool query whenever you have multiple conditions. Since your conditions are combined with AND, you need to use the must clause. This is the right syntax:
{
"query": {
"bool": {
"must": [
{
"range": {
"timestamp": {
"gte": "2016-01-16T00:00:00",
"lt": "2016-01-17T00:00:00"
}
}
},
{
"match": {
"fk_campaign_id": "2"
}
}
]
}
},
"size": 0,
"aggs": {
"distinct_ip": {
"cardinality": {
"field": "ip"
}
}
}
}
Hope this helps!
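The missing range part is the expected result of a repeated key in a map-like structure: only one value per key survives. The short Python sketch below illustrates the effect (PHP arrays likewise keep a single value per key) and shows a hypothetical helper, combine_with_and, that gathers the conditions under one bool query:

```python
import json

# Two "query" keys in one object, as in the question.
raw = '{"query": {"range": {}}, "query": {"match": {}}}'
parsed = json.loads(raw)
print(parsed)  # only the last "query" survives


def combine_with_and(*clauses):
    """Put every condition under a single bool/must so none is overwritten."""
    return {"query": {"bool": {"must": list(clauses)}}}


body = combine_with_and(
    {"range": {"timestamp": {"gte": "2016-01-16T00:00:00",
                             "lt": "2016-01-17T00:00:00"}}},
    {"match": {"fk_campaign_id": "2"}},
)
body["size"] = 0
body["aggs"] = {"distinct_ip": {"cardinality": {"field": "ip"}}}
print(json.dumps(body, indent=2))
```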

Elasticsearch Range filter with year only

I need to filter my data by year only using Elasticsearch. I am using PHP to fetch and show the results. Here is my data in JSON format:
{ "loc_cityname": "New York",
"location_countryname": "US",
"location_primary": "North America",
"admitted_date": "1994-12-10"
},
{ "loc_cityname": "New York",
"location_countryname": "US",
"location_primary": "North America",
"admitted_date": "1995-12-10"
},
I am using the code below to filter the values by year.
$options='{
"query": {
"range" : {
"admitted_date" : {
"gte" : 1994,
"lte" : 2000
}
}
},
"aggs" : {
"citycount" : {
"cardinality" : {
"field" : "loc_cityname",
"precision_threshold": 100
}
}
}
}';
How can I filter the results by year only? Please help me fix this.
Thanks in advance,
You simply need to add the format parameter to your range query like this:
$options='{
"query": {
"range" : {
"admitted_date" : {
"gte" : 1994,
"lte" : 2000,
"format": "yyyy" <--- add this line
}
}
},
"aggs" : {
"citycount" : {
"cardinality" : {
"field" : "loc_cityname",
"precision_threshold": 100
}
}
}
}';
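An alternative that does not depend on the format parameter is to turn the years into explicit date bounds before building the query. This is a minimal Python sketch, assuming admitted_date is mapped as a date that accepts yyyy-MM-dd values like the ones in the question; the helper name year_range_query is hypothetical:

```python
def year_range_query(field, first_year, last_year):
    """Match dates from 1 January of first_year up to, but excluding,
    1 January of the year after last_year."""
    return {
        "range": {
            field: {
                "gte": f"{first_year:04d}-01-01",
                "lt": f"{last_year + 1:04d}-01-01",
            }
        }
    }


query = year_range_query("admitted_date", 1994, 2000)
print(query)
```

Using an exclusive upper bound on the following year covers the whole of the last year without having to know how a bare year is rounded.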
UPDATE
Note that the above solution only works for ES 1.5 and above. With previous versions of ES, you could use a script filter instead:
$options='{
"query": {
"filtered": {
"filter": {
"script": {
"script": "(min..max).contains(doc.admitted_date.date.year)",
"params": {
"min": 1994,
"max": 2000
}
}
}
}
},
"aggs": {
"citycount": {
"cardinality": {
"field": "loc_cityname",
"precision_threshold": 100
}
}
}
}';
In order to be able to run this script filter, you need to make sure that you have enabled scripting in elasticsearch.yml:
script.disable_dynamic: false

_score while doing indexing in elasticsearch

{
"query": {
"custom_score": {
"query": {
"match": {
"xxx": {
"query": "foobar"
}
}
},
"filter": {
"and": [
{
"query": {
"match": {
"yyyy": {
"query": "barfoo"
}
}
}
}
]
}
},
"script": "_score * doc['_score']"
}
}
This gives the error:
[custom_score] query does not support [filter]
How should I write such a query instead?
I would suggest that you review your requirements regarding boosting, since your current script doesn't make much sense.
Also, have a look at the documentation for the Elasticsearch query DSL. It provides both compound queries and simple ones, which you can combine. As the error says, you can't put a filter inside a custom score query. You can either use a filtered query inside the custom score query:
{
"query": {
"custom_score": {
"query": {
"filtered" : {
"query" : {
"match": {
"xxx": {
"query": "foobar"
}
}
},
"filter" : {
"and": [
{
"query": {
"match": {
"yyyy": {
"query": "barfoo"
}
}
}
}
]
}
}
},
"script": "_score * doc['_score']"
}
}
}
or use a top level filter like this:
{
"query": {
"custom_score": {
"query": {
"match": {
"xxx": {
"query": "foobar"
}
}
},
"script": "_score * doc['_score']"
}
},
"filter": {
"and": [
{
"query": {
"match": {
"yyyy": {
"query": "barfoo"
}
}
}
}
]
}
}
The difference between the two options is that the top-level filter is not applied to facets if your search request also computes them, while filters placed within the query are applied to them.
One other thing to look at: you don't need an and filter if you have only a single clause. Also, it usually doesn't make sense to put a full-text search within a filter: filters are cacheable, and since full-text searches are free-form and pretty much unpredictable, caching them would be a waste.
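The point about the single-clause and filter can be sketched as follows. This Python example mirrors the legacy DSL used in this answer (filtered query, and filter); the helper names and the term clauses are hypothetical and only illustrate the wrapping logic:

```python
def build_filter(clauses):
    """Return None for no clauses, the bare clause for one,
    and an "and" filter only when there are several."""
    clauses = list(clauses)
    if not clauses:
        return None
    if len(clauses) == 1:
        return clauses[0]
    return {"and": clauses}


def filtered_query(query, filter_clauses):
    """Wrap query in a legacy "filtered" query when there is something to filter on."""
    flt = build_filter(filter_clauses)
    if flt is None:
        return query
    return {"filtered": {"query": query, "filter": flt}}


main_query = {"match": {"xxx": {"query": "foobar"}}}
single = filtered_query(main_query, [{"term": {"yyyy": "barfoo"}}])
several = filtered_query(
    main_query,
    [{"term": {"yyyy": "barfoo"}}, {"term": {"zzzz": "baz"}}],
)
print(single)
print(several)
```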
