Elasticsearch PHP - Exact Word Matches Before Partial Matches

I'm trying to sort my search results so that exact matches appear before partial matches. For example, say I have documents with these names:
Set 4/102
Set 44/102
Set 94/102
When I search for the term 4/102, all three documents are returned. That is fine, but I want Set 4/102 to show up first, and the results seem to come back in random order. Is there a way to use script sorting or something similar so that the exact term match shows up first?
These are my mappings and settings:
$settingsParams = [
'index' => 'products',
'body' => [
'settings' => [
'analysis' => [
'analyzer' => [
'substring_analyzer' => [
'tokenizer' => 'substring_tokenizer',
'filter' => [
'lowercase'
]
],
'fullword_analyzer' => [
'tokenizer' => 'whitespace',
'filter' => [
'lowercase'
]
],
],
'tokenizer' => [
'substring_tokenizer' => [
'type' => 'ngram',
'min_gram' => 3,
'max_gram' => 12,
'token_chars' => [
'letter',
'digit',
'symbol',
'custom'
],
'custom_token_chars' => '/'
]
]
],
'max_ngram_diff' => 20
]
]
];
$mappingParams = [
'index' => 'products',
'body' => [
'_source' => [
'enabled' => true
],
'properties' => [
'name' => [
'type' => 'text',
'fields' => [
'keyword' => [
'type' => 'keyword'
]
],
'analyzer' => 'substring_analyzer',
'search_analyzer' => 'fullword_analyzer'
],
'min_price' => [
'type' => 'double'
],
'saleprice' => [
'type' => 'double'
],
'list_price' => [
'type' => 'double'
],
'root_category_rank' => [
'type' => 'integer'
],
'interest_level' => [
'type' => 'integer'
],
'root_categoryid' => [
'type' => 'integer'
]
]
]
];

Here is a working example.
Index Mapping:
{
"settings": {
"analysis": {
"analyzer": {
"substring_analyzer": {
"tokenizer": "substring_tokenizer"
},
"fullword_analyzer": {
"tokenizer": "whitespace"
}
},
"tokenizer": {
"substring_tokenizer": {
"type": "ngram",
"min_gram": 3,
"max_gram": 12,
"token_chars": [
"letter",
"digit",
"custom",
"symbol"
],
"custom_token_chars": "/"
}
}
},
"max_ngram_diff": 50
},
"mappings": {
"properties": {
"name": {
"type": "text",
"analyzer": "substring_analyzer",
"search_analyzer": "fullword_analyzer"
}
}
}
}
Search Query:
{
"query": {
"match": {
"name": "4/102"
}
}
}
Search Result:
The document with "name": "4/102" gets a higher score than the other documents:
"hits": [
{
"_index": "66232066",
"_type": "_doc",
"_id": "1",
"_score": 0.15275992,
"_source": {
"name": "4/102" // note this
}
},
{
"_index": "66232066",
"_type": "_doc",
"_id": "2",
"_score": 0.12562492,
"_source": {
"name": "44/102"
}
},
{
"_index": "66232066",
"_type": "_doc",
"_id": "3",
"_score": 0.12562492,
"_source": {
"name": "94/102"
}
}
]
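If the score gap between the whole-word match and the partial matches is too small to rely on, one common approach is to boost whole-word matches explicitly with a bool query. The following is only a minimal sketch: it assumes a hypothetical sub-field name.fullword (not present in the mappings above) that is indexed with fullword_analyzer, and the helper name buildExactFirstSearch and the boost value of 10 are illustrative choices.

```php
<?php
// Hypothetical helper: every document matching the n-gram field is returned,
// and documents that contain the term as a whole word get an extra boost.
// ASSUMPTION: "name.fullword" is a sub-field indexed with fullword_analyzer;
// it does not exist in the mappings shown in this thread.
function buildExactFirstSearch(string $index, string $term): array
{
    return [
        'index' => $index,
        'body' => [
            'query' => [
                'bool' => [
                    // Required clause: partial (n-gram) match on name.
                    'must' => [
                        ['match' => ['name' => $term]],
                    ],
                    // Optional clause: only adds score for whole-word matches.
                    'should' => [
                        ['match' => ['name.fullword' => ['query' => $term, 'boost' => 10]]],
                    ],
                ],
            ],
        ],
    ];
}

$params = buildExactFirstSearch('products', '4/102');
echo json_encode($params['body'], JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
```

The resulting $params array could then be passed to the PHP client's search() method.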

Related

ElasticSearch match query multiple terms PHP

I am trying to construct a must query on multiple terms. The array looks like this:
$params = [
'body' => [
'query' => [
"bool" => [
"must" => [
"terms" => [
"categories" => [
"Seating",
],
],
"terms" => [
"attributes.Color" => [
"Black",
],
]
],
"filter" => [
"range" => [
"price" => [
"gte" => 39,
"lte" => 2999,
],
],
],
],
],
'from' => 0,
'size' => 3,
],
];
Which is represented in JSON like this:
{
"query": {
"bool": {
"must": {
"terms": {
"attributes.Color": ["Black"]
}
},
"filter": {
"range": {
"price": {
"gte": "39",
"lte": "2999"
}
}
}
}
},
"from": 0,
"size": 3
}
The problem is that JSON objects are represented as associative arrays in PHP, so when I use the same key twice in one array, the first value is overwritten. How can I create a query with multiple terms clauses in PHP?
Thanks in advance.
You need to wrap each terms query in its own array, so that must becomes a list of clauses:
$params = [
'body' => [
'query' => [
"bool" => [
"must" => [
[
"terms" => [
"categories" => [
"Seating",
],
]
],
[
"terms" => [
"attributes.Color" => [
"Black",
],
]
]
],
"filter" => [
"range" => [
"price" => [
"gte" => 39,
"lte" => 2999,
],
],
],
],
],
'from' => 0,
'size' => 3,
],
];
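The overwrite can be seen without Elasticsearch at all by encoding both shapes with json_encode. This is just an illustrative sketch; the variable names $broken and $fixed are made up for the example.

```php
<?php
// Duplicate string keys in one PHP array literal: the last value silently wins.
$broken = [
    'terms' => ['categories' => ['Seating']],
    'terms' => ['attributes.Color' => ['Black']],
];

// Each clause wrapped in its own array: PHP keeps both, JSON gets a list.
$fixed = [
    ['terms' => ['categories' => ['Seating']]],
    ['terms' => ['attributes.Color' => ['Black']]],
];

echo json_encode($broken), PHP_EOL;
// {"terms":{"attributes.Color":["Black"]}}
echo json_encode($fixed), PHP_EOL;
// [{"terms":{"categories":["Seating"]}},{"terms":{"attributes.Color":["Black"]}}]
```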

Elasticsearch: return only one document per id field

My current query returns this data:
{
"id": 1,
"chantierId": 60,
"location": {
"lat": 49.508804203333,
"lon": 2.4385195366667
}
},
{
"id": 2,
"chantierId": 60,
"location": {
"lat": 49.508780168333,
"lon": 2.43844484
}
},
{
"id": 3,
"chantierId": 33,
"location": {
"lat": 49.50875823,
"lon": 2.4383772216667
}
}
This is my Elasticsearch query, which searches around a geo_point:
[
"query" => [
"filtered" => [
"query" => [
"match_all" => []
],
"filter" => [
"geo_distance" => [
"distance" => "100m",
"location" => ['lat' => 49.508804203333, 'lon' => 2.4385195366667]
]
]
]
],
"sort" => [
"_geo_distance" => [
"location" => ['lat' => 49.508804203333, 'lon' => 2.4385195366667],
"order" => "asc"
]
]
]
How can I get only one document per chantierId (33 and 60), namely the one nearest to my location?
Thanks
You can add the size parameter before the query to set the number of documents you want to receive. The modified query will be:
[
"size" => 1,
"query" => [
"filtered" => [
"query" => [
"match_all" => []
],
"filter" => [
"geo_distance" => [
"distance" => "100m",
"location" => ['lat' => 49.508804203333, 'lon' => 2.4385195366667]
]
]
]
],
"sort" => [
"_geo_distance" => [
"location" => ['lat' => 49.508804203333, 'lon' => 2.4385195366667],
"order" => "asc"
]
]
]
I resolved my problem with the answer to this Stack Overflow question: Remove duplicate documents from a search in Elasticsearch
So:
[
"query" => [
"filtered" => [
"query" => [
"match_all" => []
],
"filter" => [
"geo_distance" => [
"distance" => "100m",
"location" => $location
]
]
]
],
"sort" => [
"_geo_distance" => [
"location" => $location,
"order" => "asc"
]
],
"aggs" => [
"Geoloc" => [
"terms" => [
"field" => "chantierId"
],
"aggs" => [
"Geoloc_docs" => [
"top_hits" => [
"size" => 1
]
]
]
]
]
]
Thanks to @Tanu, who tried to help me.
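One detail worth checking in the aggregation above: top_hits orders its hits by the score of the main query unless it is given its own sort, and the top-level sort does not apply inside the buckets. The sketch below repeats the distance sort inside top_hits so that each bucket keeps its nearest document. The helper name buildNearestPerChantier is hypothetical, and new \stdClass() is used so that match_all serializes as {} rather than [].

```php
<?php
// Hypothetical helper: request body for one nearest document per chantierId.
function buildNearestPerChantier(array $location, string $distance = '100m'): array
{
    // The same distance sort is reused for the main hits and inside top_hits.
    $geoSort = [
        '_geo_distance' => [
            'location' => $location,
            'order' => 'asc',
        ],
    ];

    return [
        'query' => [
            'filtered' => [
                // stdClass encodes as {}; an empty PHP array would encode as [].
                'query' => ['match_all' => new \stdClass()],
                'filter' => [
                    'geo_distance' => [
                        'distance' => $distance,
                        'location' => $location,
                    ],
                ],
            ],
        ],
        'sort' => $geoSort,
        'aggs' => [
            'Geoloc' => [
                'terms' => ['field' => 'chantierId'],
                'aggs' => [
                    'Geoloc_docs' => [
                        // Keep one hit per bucket, ordered by distance.
                        'top_hits' => [
                            'size' => 1,
                            'sort' => [$geoSort],
                        ],
                    ],
                ],
            ],
        ],
    ];
}

$body = buildNearestPerChantier(['lat' => 49.508804203333, 'lon' => 2.4385195366667]);
echo json_encode($body, JSON_PRETTY_PRINT), PHP_EOL;
```

In the response, the nearest document for each chantierId would then be found under aggregations.Geoloc.buckets[i].Geoloc_docs.hits.hits[0].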

Can't have a term and geo_distance_range in the same must filter

I have some data and I am trying to get all the results that have a certain month and are less than 1.6km from the target point. I am using the PHP client so my query looks like this.
$crimeSearch = [
'size' => 0,
'query' => [
'filtered' => [
'filter' => [
'bool' => [
'must' => [
'term' => [
'month' => $date,
],
'geo_distance_range' => [
'location' => [
'lat' => $lat,
'lon' => $lng,
],
'lt' => '1.6km',
],
],
],
],
],
],
'aggs' => [
'group_by_category' => [
'terms' => [
'field' => 'category',
],
],
],
];
I am currently seeing the following error:
query_parsing_exception: No query registered for [location]
My mapping looks like this:
"properties": {
"location": {
"type": "geo_point"
},
"category": {
"type": "string",
"index": "not_analyzed"
},
"month": {
"type": "string",
"index": "not_analyzed"
}
}
Now if I comment out either the term value or the geo_distance_range value from the must array then I get the correct results back. This error only occurs when they are both present.
Can anyone see what is wrong with my query?
I have tried moving the geo_distance_range into its own must block, but this seems to bring back all results that match either of the must filters rather than both.
If you need any more information please ask!
Thank you.
I do not know PHP, but judging from the equivalent ES JSON query, this might work. I think you need to put each must clause in its own array, like this:
[
'size' => 0,
'query' => [
'filtered' => [
'filter' => [
'bool' => [
'must' => [
[
'term' => [
'month' => $date,
]
],
[
'geo_distance_range' => [
'location' => [
'lat' => $lat,
'lon' => $lng,
],
'lt' => '1.6km',
],
],
],
],
],
],
],
'aggs' => [
'group_by_category' => [
'terms' => [
'field' => 'category',
],
],
],
];
This is equivalent to
{
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"term": {
"month": "June"
}
},
{
"geo_distance_range": {
"lt": "1.6km",
"location": {
"lat": 37.9174,
"lon": -122.305
}
}
}
]
}
}
}
}
}
Does this work?
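The difference between the two versions shows up in the encoded JSON: with both clauses as keys of one associative array, must becomes a single JSON object holding two query names, whereas the wrapped version becomes a list of two filters. A small sketch, using the sample values from the JSON above; the variable names are illustrative only.

```php
<?php
// Sample values taken from the JSON in the answer above.
$date = 'June';
$lat = 37.9174;
$lng = -122.305;

$termClause = ['term' => ['month' => $date]];
$geoClause = [
    'geo_distance_range' => [
        'location' => ['lat' => $lat, 'lon' => $lng],
        'lt' => '1.6km',
    ],
];

// Both clauses as keys of one associative array (the shape in the question):
// json_encode emits one JSON object with two keys.
$mustAsObject = array_merge($termClause, $geoClause);

// Each clause in its own array: json_encode emits a JSON list of two objects.
$mustAsList = [$termClause, $geoClause];

echo json_encode($mustAsObject), PHP_EOL; // begins with {"term":
echo json_encode($mustAsList), PHP_EOL;   // begins with [{"term":
```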

How to improve relevancy in Elasticsearch?

This is how my mapping looks
$arr = [
'index' => 'test1',
'body' => [
'settings' => [
'analysis' => [
'analyzer' => [
'name_analyzer' => [
'type' => 'custom',
'tokenizer' => 'standard',
'filter' => [
'lowercase',
'asciifolding',
'word_delimiter'
]
]
]
]
],
"mappings" => [
"info" => [
"properties" => [
"Name" => [ // this field is analyzed
"type" => "string",
"fields" => [
"raw" => [ // sub-field of Name is not analyzed, to avoid the known issue of buckets being split on spaces
"type" => "string",
"index" => "not_analyzed"
]
]
],
"Address" => [
"type" => "string",
"index" => "analyzed",
"analyzer" => "name_analyzer"
]
]
]
]
]
];
And this is my query
$query['index'] = 'test1';
$query['type'] = 'info';
//without bool & should also it will work
$query['body'] = [
'query'=> [
'bool' => [
'should' => [
'query_string' => [
'fields' => ['Name'],
'query' => 'sa*',
'analyze_wildcard' => 'true'
]
]
]
],
'size'=> '0',
'aggregations' => [
'actor' => [
'terms' => [
'field' => 'Name.raw',
'size' => 10
]
]
]
];
My output is
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 0,
"hits": []
},
"aggregations": {
"actor": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "Salma Hayak",
"doc_count": 1
},
{
"key": "Salman Khan",
"doc_count": 1
},
{
"key": "Salman Shaikh",
"doc_count": 1
}
]
}
}
}
What I want is this: since Salman Khan is searched for more often than Salma Hayak, a user who searches for "sa" should see Salman Khan first rather than Salma Hayak.
Can anyone please help me with this?
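One possible direction, sketched under a loud assumption: each document would need to carry a numeric popularity value, called search_count here, which does not exist in the mapping above. Given such a field, the terms aggregation can order its buckets by a metric sub-aggregation instead of by document count. The sub-aggregation name popularity is also made up for the example.

```php
<?php
// ASSUMPTION: "search_count" is a hypothetical numeric field that records how
// often each actor is searched for; it is not part of the original mapping.
$aggregations = [
    'actor' => [
        'terms' => [
            'field' => 'Name.raw',
            'size' => 10,
            // Order buckets by the sub-aggregation below, highest first.
            'order' => ['popularity' => 'desc'],
        ],
        'aggs' => [
            'popularity' => ['max' => ['field' => 'search_count']],
        ],
    ],
];

echo json_encode(['aggregations' => $aggregations], JSON_PRETTY_PRINT), PHP_EOL;
```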

Array structure in datatables

How can I obtain the following structure from PHP's json_encode?
Is it possible?
{
"data": [
{
"name": "Tiger Nixon",
"position": "System Architect",
"salary": "$320,800"
},
{
"name": "Garrett Winters",
"position": "Accountant",
"salary": "$170,750"
}
]
}
What should the arrays look like?
PHP array keys can be strings as well as integers, so you can simply use nested arrays:
array( 'data' =>
array(
array( 'name' => 'Tiger Nixon', 'position' => 'System Architect', 'salary' => '$320,800' ),
array( 'name' => 'Garrett Winters', 'position' => 'Accountant', 'salary' => '$170,750' )
)
);
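A runnable sketch of the encoding step follows. One thing to watch for: json_encode only emits a JSON array ([...]) when the PHP array has sequential keys starting at 0, so rows keyed by something else (the ids 5 and 9 below are made-up sample keys) are passed through array_values() first.

```php
<?php
// Rows keyed by an arbitrary id, as they might come out of a lookup table.
// The keys 5 and 9 are illustrative; the values are the ones from the question.
$rows = [
    5 => ['name' => 'Tiger Nixon', 'position' => 'System Architect', 'salary' => '$320,800'],
    9 => ['name' => 'Garrett Winters', 'position' => 'Accountant', 'salary' => '$170,750'],
];

// array_values() reindexes to 0..n-1 so "data" is encoded as a JSON array.
// Without it, the non-sequential keys would make "data" a JSON object.
$payload = ['data' => array_values($rows)];

echo json_encode($payload, JSON_PRETTY_PRINT), PHP_EOL;
```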
