Elasticsearch find input word and all synonyms - php

Using Elasticsearch, I am trying to find all items that match the word "skiing".
My mapping (PHP array):
"properties" => [
"title" => [
"type" => "string",
"boost" => 1.0,
"analyzer" => "autocomplete"
]
]
Settings:
"settings"=> [
"analysis" => [
"analyzer" => [
"autocomplete" => [
"type" => "custom",
"tokenizer" => "standard",
"filter" => ["lowercase", "trim", "synonym", "porter_stem"],
"char_filter" => ["html_strip"]
]
],
"filter" => [
"synonym" => [
"type" => "synonym",
"synonyms_path" => "analysis/synonyms.txt"
]
]
]
]
Search query:
[
"index" => "articles",
"body" => [
"query" => [
"filtered" => [
"query" => [
"bool" => [
"must" => [
"indices" => [
"indices" => ["articles"],
"query" => [
"bool" => [
"should" => [
"multi_match" => [
"query" => "skiing",
"fields" => ["title"]
]
]
]
]
]
]
]
]
]
],
"sort" => [
"_score" => [
"order" => "desc"
]
]
],
"size" => 10,
"from" => 0,
"search_type" => "dfs_query_then_fetch",
"explain" => true
];
The synonyms.txt file contains skiing => xanthic.
I want to get all items with "skiing" (because it is the input word), "ski" (via the porter_stem token filter) and "xanthic" (via the synonyms file). But I only get results containing the word "xanthic".
Can you tell me why? How do I need to configure the index?

In the synonyms file you need to have "skiing, xanthic". The way you have it now, skiing is replaced with xanthic, but you want to keep both. I also think you need to reindex the data to see the change.
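To make the difference between the two rule styles concrete, here is a toy model in PHP. It is only a sketch of how a "=>" rule replaces tokens while a comma-separated rule expands them; applySynonymRule is a made-up helper, not Elasticsearch code, and it ignores multi-word synonyms and filter options.

```php
<?php
// Toy model of a single synonym rule; NOT Elasticsearch's implementation.
// applySynonymRule is a hypothetical helper used for illustration only.
function applySynonymRule(string $rule, string $token): array
{
    if (strpos($rule, '=>') !== false) {
        // Explicit mapping: terms on the left are replaced by terms on the right.
        [$lhs, $rhs] = explode('=>', $rule, 2);
        $from = array_map('trim', explode(',', $lhs));
        $to   = array_map('trim', explode(',', $rhs));
        return in_array($token, $from, true) ? $to : [$token];
    }
    // Equivalent synonyms: any listed term expands to all listed terms.
    $terms = array_map('trim', explode(',', $rule));
    return in_array($token, $terms, true) ? $terms : [$token];
}

print_r(applySynonymRule('skiing => xanthic', 'skiing')); // only "xanthic" survives
print_r(applySynonymRule('skiing, xanthic', 'skiing'));   // both "skiing" and "xanthic"
```

With the "=>" rule the original token is gone before porter_stem ever sees it, which matches the behaviour described in the question.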

Thanks, but this is the solution I went with. I changed the mapping:
"properties" => [
"title" => [
"type" => "string",
"boost" => 1.5,
"analyzer" => "standard",
"fields" => [
"english" => [
"type" => "string",
"analyzer" => "standard",
"search_analyzer" => "english",
"boost" => 1.0
],
"synonym" => [
"type" => "string",
"analyzer" => "standard",
"search_analyzer" => "synonym",
"boost" => 0.5
]
]
]
]
Settings:
"settings"=> [
"analysis" => [
"analyzer" => [
"synonym" => [
"type" => "custom",
"tokenizer" => "standard",
"filter" => ["lowercase", "trim", "synonym"],
"char_filter" => ["html_strip"]
]
],
"filter" => [
"synonym" => [
"type" => "synonym",
"synonyms_path" => "analysis/synonyms.txt"
]
]
]
]

Related

ElasticSearch 6.2 - aggs return : unknown_named_object_exception

$params2 = [
'index' => 'index',
'type' => "items",
'body' => [
'aggs' => [
"types" => [
"filter" => [
"bool" => [
"should" => [
["term" => ["type_id" => 1]],
["term" => ["type_id" => 2]]
]
]
],
"aggs" => [
"types" =>[
["terms" => ["field" => "type_id","size" => 4]],
"aggs" =>[
"top" => [
["top_hits" => ["size" => 2]]
]
]
]
]
]
],
]
];
When I pass these params to $elastic->search($params2);
it returns this exception:
{"error":{"root_cause":[{"type":"unknown_named_object_exception","reason":"Unknown BaseAggregationBuilder [0]","line":1,"col":117}],"type":"unknown_named_object_exception","reason":"Unknown BaseAggregationBuilder [0]","line":1,"col":117},"status":400}
I am using the ErickTamayo/laravel-scout-elastic package.
You need to remove the square brackets around terms and top_hits:
$params2 = [
'index' => 'index',
'type' => "items",
'body' => [
'aggs' => [
"types" => [
"filter" => [
"bool" => [
"should" => [
["term" => ["type_id" => 1]],
["term" => ["type_id" => 2]]
]
]
],
"aggs" => [
"types" =>[
"terms" => ["field" => "type_id","size" => 4],
"aggs" =>[
"top" => [
"top_hits" => ["size" => 2]
]
]
]
]
]
],
]
];
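A quick way to see what the extra brackets do is to json_encode both shapes. This is just a sketch with a trimmed-down version of the aggregation; the variable names are made up. The extra brackets give the "terms" clause the integer key 0, which is presumably the [0] named in the error message.

```php
<?php
// With the extra brackets, "terms" sits under integer key 0 next to "aggs",
// so the aggregation body is encoded as an object with a "0" property.
$broken = ["types" => [
    ["terms" => ["field" => "type_id"]],
    "aggs" => ["top" => [["top_hits" => ["size" => 2]]]],
]];
echo json_encode($broken), PHP_EOL;
// {"types":{"0":{"terms":{"field":"type_id"}},"aggs":{"top":[{"top_hits":{"size":2}}]}}}

// Without them, "terms" and "top_hits" are plain keys of the aggregation body.
$fixed = ["types" => [
    "terms" => ["field" => "type_id"],
    "aggs" => ["top" => ["top_hits" => ["size" => 2]]],
]];
echo json_encode($fixed), PHP_EOL;
// {"types":{"terms":{"field":"type_id"},"aggs":{"top":{"top_hits":{"size":2}}}}}
```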

PHP array index undefined, although its there

Given an array as follows:
$z = [
"cat lovers" => [
"name" => "cat lovers",
"impressions" => 2038,
"ctr" => 0.032875368007851,
"actions" => [
[
"action_type" => "attention_event",
"value" => "232",
],
[
"action_type" => "landing_page_view",
"value" => "18",
],
],
"shorty kodi" => [
"name" => "shorty kodi",
"impressions" => 534,
"ctr" => 0.041198501872659,
"actions" => [
[
"action_type" => "attention_event",
"value" => "56",
],
[
"action_type" => "landing_page_view",
"value" => "7",
]
]
]
The following code runs with no error:
foreach($z as $i) {
print_r(array_column($i["actions"], "action_type"));
}
But if we remove the print_r call, like this:
foreach($z as $i) {
$b = array_column($i["actions"], "action_type");
}
It results in an error saying:
PHP error: Undefined index: actions on line 2
Any idea why?
Thanks
It's working fine, I just checked it.
Put print_r($b); inside or after the foreach. You will get the same result as you get from the first foreach.
The provided array has a syntax error.
Try this:
$z = [
"cat lovers" => [
"name" => "cat lovers",
"impressions" => 2038,
"ctr" => 0.032875368007851,
"actions" => [
[
"action_type" => "attention_event",
"value" => "232",
],
[
"action_type" => "landing_page_view",
"value" => "18",
],
]
],
"shorty kodi" => [
"name" => "shorty kodi",
"impressions" => 534,
"ctr" => 0.041198501872659,
"actions" => [
[
"action_type" => "attention_event",
"value" => "56",
],
[
"action_type" => "landing_page_view",
"value" => "7",
]
]
]
];
foreach($z as $i) {
$b = array_column($i["actions"], "action_type");
}
I don't know what went wrong, but I copied and pasted your code and it is working fine.
Here is what I did.
$z = [
"cat lovers" => [
"name" => "cat lovers",
"impressions" => 2038,
"ctr" => 0.032875368007851,
"actions" => [
[
"action_type" => "attention_event",
"value" => "232",
],
[
"action_type" => "landing_page_view",
"value" => "18",
],
],
"shorty kodi" => [
"name" => "shorty kodi",
"impressions" => 534,
"ctr" => 0.041198501872659,
"actions" => [
[
"action_type" => "attention_event",
"value" => "56",
],
[
"action_type" => "landing_page_view",
"value" => "7",
]
]
]
]];
$b = "";
foreach($z as $i) {
$b=array_column($i["actions"], "action_type");
}
print_r($b);
Your array has a syntax issue. Try this fixed array:
$z = [
"cat lovers" => [
"name" => "cat lovers",
"impressions" => 2038,
"ctr" => 0.032875368007851,
"actions" => [
[
"action_type" => "attention_event",
"value" => "232",
],
[
"action_type" => "landing_page_view",
"value" => "18",
],
]
],
"shorty kodi" => [
"name" => "shorty kodi",
"impressions" => 534,
"ctr" => 0.041198501872659,
"actions" => [
[
"action_type" => "attention_event",
"value" => "56",
],
[
"action_type" => "landing_page_view",
"value" => "7",
]
]
]
];
Just change your code like this:
<?php
$z = [
"cat lovers" => [
"name" => "cat lovers",
"impressions" => 2038,
"ctr" => 0.032875368007851,
"actions" => [
[
"action_type" => "attention_event",
"value" => "232",
],
[
"action_type" => "landing_page_view",
"value" => "18",
],
],
],
"shorty kodi" => [
"name" => "shorty kodi",
"impressions" => 534,
"ctr" => 0.041198501872659,
"actions" => [
[
"action_type" => "attention_event",
"value" => "56",
],
[
"action_type" => "landing_page_view",
"value" => "7",
]
]
]
];
echo '<pre>';
$b = [];
foreach($z as $i) {
$b[] = array_column($i["actions"], "action_type");
}
print_r($b);
?>
That's it!
Sorry, the cause turned out to be simple. The array I actually tested was quite long, and one of its elements had no "actions" index, while all the others did.
That's why the loop failed on that element.
Thank you for all the feedback.
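For anyone hitting the same thing, one way to make the loop tolerate elements without "actions" is the null coalescing operator. A minimal sketch, assuming PHP 7.0 or later; the sample data below is invented, with a second element that deliberately lacks the key.

```php
<?php
// Hypothetical sample data: the second element has no "actions" key.
$z = [
    "cat lovers" => [
        "name" => "cat lovers",
        "actions" => [
            ["action_type" => "attention_event", "value" => "232"],
            ["action_type" => "landing_page_view", "value" => "18"],
        ],
    ],
    "no actions" => ["name" => "no actions"],
];

$b = [];
foreach ($z as $name => $i) {
    // ?? falls back to an empty list when "actions" is missing or null,
    // so array_column never sees an undefined index.
    $b[$name] = array_column($i["actions"] ?? [], "action_type");
}
print_r($b);
```

Elements without "actions" simply end up with an empty list instead of triggering the notice.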

ElasticSearch match query multiple terms PHP

I am trying to construct a must query on multiple terms. The array looks like this:
$params = [
'body' => [
'query' => [
"bool" => [
"must" => [
"terms" => [
"categories" => [
"Seating",
],
],
"terms" => [
"attributes.Color" => [
"Black",
],
]
],
"filter" => [
"range" => [
"price" => [
"gte" => 39,
"lte" => 2999,
],
],
],
],
],
'from' => 0,
'size' => 3,
],
];
Which is represented in JSON like this:
{
"query": {
"bool": {
"must": {
"terms": {
"attributes.Color": ["Black"]
}
},
"filter": {
"range": {
"price": {
"gte": "39",
"lte": "2999"
}
}
}
}
},
"from": 0,
"size": 3
}
The problem is that JSON objects are represented as arrays in PHP, so if I set the same key twice in one array, the first value is overwritten. Do you have any idea how to create a query with multiple terms clauses in PHP?
Thanks in advance.
You need to add an additional array to enclose all your terms queries:
$params = [
'body' => [
'query' => [
"bool" => [
"must" => [
[
"terms" => [
"categories" => [
"Seating",
],
]
],
[
"terms" => [
"attributes.Color" => [
"Black",
],
]
]
],
"filter" => [
"range" => [
"price" => [
"gte" => 39,
"lte" => 2999,
],
],
],
],
],
'from' => 0,
'size' => 3,
],
];
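The overwrite is easy to reproduce outside Elasticsearch. A small sketch (variable names are made up) comparing the two shapes of the must clause:

```php
<?php
// Duplicate string keys in one array literal: the later value silently
// replaces the earlier one, so only one "terms" clause survives.
$must = [
    "terms" => ["categories" => ["Seating"]],
    "terms" => ["attributes.Color" => ["Black"]],
];
echo json_encode($must), PHP_EOL;
// {"terms":{"attributes.Color":["Black"]}}

// A list of single-clause arrays keeps both and encodes as a JSON array.
$mustList = [
    ["terms" => ["categories" => ["Seating"]]],
    ["terms" => ["attributes.Color" => ["Black"]]],
];
echo json_encode($mustList), PHP_EOL;
// [{"terms":{"categories":["Seating"]}},{"terms":{"attributes.Color":["Black"]}}]
```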

elasticsearch return only one document of id field

My current query returns this data:
{
"id": 1,
"chantierId": 60,
"location": {
"lat": 49.508804203333,
"lon": 2.4385195366667
}
},
{
"id": 2,
"chantierId": 60,
"location": {
"lat": 49.508780168333,
"lon": 2.43844484
}
},
{
"id": 3,
"chantierId": 33,
"location": {
"lat": 49.50875823,
"lon": 2.4383772216667
}
}
This is my Elasticsearch query, which searches for points by geo_point:
[
"query" => [
"filtered" => [
"query" => [
"match_all" => []
],
"filter" => [
"geo_distance" => [
"distance" => "100m",
"location" => ['lat' => 49.508804203333, 'lon' => 2.4385195366667]
]
]
]
],
"sort" => [
"_geo_distance" => [
"location" => ['lat' => 49.508804203333, 'lon' => 2.4385195366667],
"order" => "asc"
]
]
]
How can I get only one document per chantierId (here 33 and 60), namely the one nearest to my location?
Thanks
You can add a size parameter before the query to set the number of documents you want to receive. The modified query will be:
[ "size" => 1,
"query" => [
"filtered" => [
"query" => [
"match_all" => []
],
"filter" => [
"geo_distance" => [
"distance" => "100m",
"location" => ['lat' => 49.508804203333, 'lon' => 2.4385195366667]
]
]
]
],
"sort" => [
"_geo_distance" => [
"location" => ['lat' => 49.508804203333, 'lon' => 2.4385195366667],
"order" => "asc"
]
]
]
I resolved my problem with the answer to this Stack Overflow question: Remove duplicate documents from a search in Elasticsearch
So:
[
"query" => [
"filtered" => [
"query" => [
"match_all" => []
],
"filter" => [
"geo_distance" => [
"distance" => "100m",
"location" => $location
]
]
]
],
"sort" => [
"_geo_distance" => [
"location" => $location,
"order" => "asc"
]
],
"aggs" => [
"Geoloc" => [
"terms" => [
"field" => "chantierId"
],
"aggs" => [
"Geoloc_docs" => [
"top_hits" => [
"size" => 1
]
]
]
]
]
]
Thanks to @Tanu, who tried to help me.
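One caveat, as far as I can tell: top_hits orders its hits by the score of the main query unless it is given its own sort, and with match_all every document scores the same, so the single hit per bucket is not guaranteed to be the nearest one. Below is a sketch that repeats the geo sort inside top_hits. It assumes top_hits accepts the same _geo_distance sort clause as the top-level sort; nearestPerChantier is a made-up helper and the whole thing is untested against a live cluster.

```php
<?php
// Hypothetical helper that builds the request body; not part of any client library.
function nearestPerChantier(array $location, string $distance = "100m"): array
{
    // Same sort clause as the top-level sort in the query above.
    $geoSort = ["_geo_distance" => ["location" => $location, "order" => "asc"]];

    return [
        "query" => [
            "filtered" => [
                "query" => ["match_all" => []],
                "filter" => [
                    "geo_distance" => ["distance" => $distance, "location" => $location],
                ],
            ],
        ],
        "sort" => $geoSort,
        "aggs" => [
            "Geoloc" => [
                "terms" => ["field" => "chantierId"],
                "aggs" => [
                    "Geoloc_docs" => [
                        // Assumption: sorting inside top_hits picks the nearest doc per bucket.
                        "top_hits" => ["size" => 1, "sort" => [$geoSort]],
                    ],
                ],
            ],
        ],
    ];
}

$body = nearestPerChantier(['lat' => 49.508804203333, 'lon' => 2.4385195366667]);
```

The nearest document for each chantierId should then be found under aggregations.Geoloc.buckets[*].Geoloc_docs.hits in the response.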

How to improve relevancy in Elasticsearch?

This is how my mapping looks
$arr = [
'index' => 'test1',
'body' => [
'settings' => [
'analysis' => [
'analyzer' => [
'name_analyzer' => [
'type' => 'custom',
'tokenizer' => 'standard',
'filter' => [
'lowercase',
'asciifolding',
'word_delimiter'
]
]
]
]
],
"mappings" => [
"info" => [
"properties" => [
"Name" => [// this field is analyzed
"type" => "string",
"fields" => [
"raw" => [ // subfield of Name is not analyzed, to avoid the known issue of buckets being split on spaces
"type" => "string",
"index" => "not_analyzed"
]
]
],
"Address" => [
"type" => "string",
"index" => "analyzed",
"analyzer" => "name_analyzer"
]
]
]
]
]
];
And this is my query
$query['index'] = 'test1';
$query['type'] = 'info';
// it also works without bool and should
$query['body'] = [
'query'=> [
'bool' => [
'should' => [
'query_string' => [
'fields' => ['Name'],
'query' => 'sa*',
'analyze_wildcard' => 'true'
]
]
]
],
'size'=> '0',
'aggregations' => [
'actor' => [
'terms' => [
'field' => 'Name.raw',
'size' => 10
]
]
]
];
My output is
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 0,
"hits": []
},
"aggregations": {
"actor": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "Salma Hayak",
"doc_count": 1
},
{
"key": "Salman Khan",
"doc_count": 1
},
{
"key": "Salman Shaikh",
"doc_count": 1
}
]
}
}
}
What I want: Salman Khan is searched for more often than Salma Hayak, so when a user searches for "sa" they should see Salman Khan first, ahead of Salma Hayak.
Can anyone please help me on this?
