Elasticsearch Partial match or fuzzy match, boost partial results - php

I'm querying Elasticsearch with the PHP client and want to give priority to partial-word matches while still including fuzzy matches. If I remove the address.company match block, the query works as expected, but with it present the query is broken no matter how I frame it. How should I format the query so that fuzzy matches are also included, with a lower priority?
$search_data = [
    "from" => (int) $start, "size" => (int) $count,
    'query' => [
        'bool' => [
            'filter' => [
                ['term' => ['active' => 1]],
                ['term' => ['type' => 2]],
            ],
            'must' => [
                'wildcard' => [
                    'address.company' => '*' . $search_query . '*'
                ],
                'match' => [
                    'address.company' => [
                        'query' => $search_query,
                        'operator' => 'and',
                        'fuzziness' => 'AUTO',
                    ],
                ],
            ],
        ],
    ],
];

I am still new to ES, as is probably apparent, but this solution seems to return the data I'm after. I mention that because there may be a better way, in case someone reads this in the future. Switching from must to should and wrapping each clause in its own array did the trick.
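$search_data = [
    "from" => (int) $start, "size" => (int) $count,
    'query' => [
        'bool' => [
            // Filters narrow the result set without affecting the score.
            'filter' => [
                ['term' => ['active' => 1]],
                ['term' => ['type' => 2]],
            ],
            'should' => [
                // Non-fuzzy matches are boosted so they rank first.
                [
                    'match' => ['address.company' => ['query' => $search_query, 'boost' => 10]],
                ],
                // Fuzzy matches are still returned, with the default boost.
                [
                    'match' => [
                        'address.company' => [
                            'query' => $search_query,
                            'fuzziness' => 'AUTO',
                        ],
                    ],
                ],
            ],
            // At least one of the should clauses has to match.
            'minimum_should_match' => 1,
        ],
    ],
];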
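If partial-word matches should also rank between exact and fuzzy matches, one possible extension is a third should clause using match_phrase_prefix, which treats the last word of the input as a prefix. This is only a sketch: the function name buildCompanySearch and the boost values are made up for illustration, and it assumes address.company is an analyzed text field.

```php
<?php
// Hypothetical helper (not from the original post): builds a search body
// with three relevance tiers. The boost values are untuned starting points.
function buildCompanySearch(string $searchQuery, int $start, int $count): array
{
    return [
        'from' => $start,
        'size' => $count,
        'query' => [
            'bool' => [
                'filter' => [
                    ['term' => ['active' => 1]],
                    ['term' => ['type' => 2]],
                ],
                'should' => [
                    // Tier 1: plain match, highest boost.
                    ['match' => ['address.company' => ['query' => $searchQuery, 'boost' => 10]]],
                    // Tier 2: the last word of the input is treated as a prefix.
                    ['match_phrase_prefix' => ['address.company' => ['query' => $searchQuery, 'boost' => 5]]],
                    // Tier 3: fuzzy match, default boost.
                    ['match' => ['address.company' => ['query' => $searchQuery, 'fuzziness' => 'AUTO']]],
                ],
                'minimum_should_match' => 1,
            ],
        ],
    ];
}

$search_data = buildCompanySearch('acme', 0, 10);
```

The returned array has the same shape as $search_data above, so it can be used wherever that array was used.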

Related

PHP MySQL ElasticSearch migration query OR

I cannot figure out the following:
I have a database with address fields (country, county, city) and an "operate" field with the options locally, nationally and internationally.
I want to find matches based on location and the "operate" status. So I am basically searching for how to express a MySQL OR in Elasticsearch. To my understanding this is "should".
So, any entry that operates internationally should be in the results, as should any entry that operates nationally in the same country and any entry that operates locally in the same county, regardless of which city is selected.
Just as a test case, I tried several things among which the following:
$arrayinternationally[] = array('match' => array('operate' => 'internationally'));
$arrayNationally[] = array('match' => array('operate' => 'nationally'));
$arrayNationally[] = array('match' => array('country' => '60'));
$arrayLocally[] = array('match' => array('operate' => 'locally'));
$arrayLocally[] = array('match' => array('country' => 'UK'));
$arrayLocally[] = array('match' => array('county' => '60'));
$params = [
    'index' => 'bzlistings',
    'body' => [
        'from' => 0,
        'size' => 80,
        'query' => [
            'bool' => [
                'should' => [
                    'bool' => [
                        'should' => $arrayinternationally
                    ]
                ],
                'should' => [
                    'bool' => [
                        'should' => $arrayNationally
                    ]
                ],
                'should' => [
                    'bool' => [
                        'should' => $arrayLocally
                    ]
                ],
            ],
        ],
    ],
];
Entries in another country with "operate" set to "internationally" are not included, which is wrong.
How can this be done in ElasticSearch?
Thanks,
Peter
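I think there is something wrong with the bool queries. A PHP array cannot hold the same key twice, so of the three 'should' entries only the last one is actually sent to Elasticsearch. Each clause has to be its own element of a single should array. Could you test with the following query: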
$params = [
    'index' => 'bzlistings',
    'body' => [
        'from' => 0,
        'size' => 80,
        'query' => [
            'bool' => [
                'should' => [
                    [
                        'bool' => [
                            'must' => $arrayinternationally
                        ]
                    ],
                    [
                        'bool' => [
                            'must' => $arrayNationally
                        ]
                    ],
                    [
                        'bool' => [
                            'must' => $arrayLocally
                        ]
                    ],
                ],
            ],
        ],
    ],
];
In your case, I think the inner bools need to use must instead of should, as in the query above. Also, if you are looking for exact matching, I recommend using the term query instead of match.
Typically, the bool query structure looks something like this:
GET sample-index/_search
{
  "query": {
    "bool": {
      "should": [
        {},
        {}
      ]
    }
  }
}
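In PHP, the structure above becomes a list of arrays under a single should key, each with its own inner bool. The following is a sketch only: the helper name allOf is hypothetical, the values 'UK' and '60' are sample values, and the term queries assume that operate, country and county are indexed as exact (keyword-style) values.

```php
<?php
// Hypothetical helper: every condition in the group has to hold (SQL "AND").
function allOf(array $conditions): array
{
    $clauses = [];
    foreach ($conditions as $field => $value) {
        $clauses[] = ['term' => [$field => $value]];
    }
    return ['bool' => ['must' => $clauses]];
}

$query = [
    'bool' => [
        // One matching group is enough (SQL "OR").
        'should' => [
            allOf(['operate' => 'internationally']),
            allOf(['operate' => 'nationally', 'country' => 'UK']),
            allOf(['operate' => 'locally', 'country' => 'UK', 'county' => '60']),
        ],
        // Stated explicitly so the intent survives if filter or must
        // clauses are added to the outer bool later.
        'minimum_should_match' => 1,
    ],
];

echo json_encode($query, JSON_PRETTY_PRINT), PHP_EOL;
```

Printing the array as JSON is a quick way to compare what PHP will send against the generic structure shown above.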

Elasticsearch-php queries

Does anyone know a good resource with elasticsearch-php examples, ideally one that covers queries alongside their MySQL equivalents? I am struggling both with the code syntax and with what to use when.
For example, I want to do a search where $name must be part of the field 'business' and where 'country' matches $country:
$params = [
    'index' => 'xxxxx',
    'type' => 'zzzzz',
    'body' => [
        'from' => 0,
        'size' => $maxResults,
        'query' => [
            'bool' => [
                'must' => [
                    'match' => ['name' => $searchString],
                ],
                'must' => [
                    'match' => ['country' => $country],
                ],
            ],
        ],
    ],
];
The first 'must' seems to be completely ignored: removing it returns exactly the same results.
I searched around for hours. There are plenty of quick beginner tutorials with simple search examples, but I already get stuck one step further, as with the example above.
Thanks
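You can only have a single must in a bool query, so all must constraints have to be elements of that one must array. In a PHP array a repeated key overwrites the earlier entry, which is why your first must was ignored. Try it like this instead: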
$params = [
    'index' => 'xxxxx',
    'type' => 'zzzzz',
    'body' => [
        'from' => 0,
        'size' => $maxResults,
        'query' => [
            'bool' => [
                'must' => [
                    [
                        'match' => ['name' => $searchString],
                    ],
                    [
                        'match' => ['country' => $country],
                    ],
                ]
            ],
        ],
    ],
];
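Why the first must disappears can be checked in plain PHP, without Elasticsearch involved. The sample values 'acme' and 'UK' below are placeholders chosen for illustration:

```php
<?php
// Two entries with the same string key: PHP silently keeps only the last one.
$broken = [
    'must' => ['match' => ['name' => 'acme']],
    'must' => ['match' => ['country' => 'UK']],
];

// A list of clauses under a single 'must' key keeps both.
$fixed = [
    'must' => [
        ['match' => ['name' => 'acme']],
        ['match' => ['country' => 'UK']],
    ],
];

echo json_encode($broken), PHP_EOL;
// {"must":{"match":{"country":"UK"}}}
echo json_encode($fixed), PHP_EOL;
// {"must":[{"match":{"name":"acme"}},{"match":{"country":"UK"}}]}
```

Encoding the query body with json_encode before sending it is a cheap way to spot this kind of lost clause.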

Elasticsearch how to correctly provide a negative boost in PHP

I'm trying to give a negative boost to push results down in the ranking if they have 'b-stock' in the title.
Here is my code:
'body' => [
    'size' => 15,
    'query' => [
        'boosting' => [
            'positive' => [
                'bool' => [
                    'should' => [
                        ['query_string' => [
                            'default_field' => 'title_tag',
                            'query' => $term
                        ]],
                        ['query_string' => [
                            'default_field' => 'name',
                            'query' => $term
                        ]],
                        ['query_string' => [
                            'default_field' => 'description',
                            'query' => $term
                        ]],
                    ]
                ],
            ],
            'negative' => [
                'term' => [
                    'name' => 'B-Stock'
                ]
            ],
            'negative_boost' => 2
        ]
    ]
]
However, this seems to have no effect on the results: even if I remove the term array from the 'negative' array, the same result set is returned.
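Two things in the query above are worth checking. The negative_boost parameter of the boosting query is documented as a number between 0 and 1 by which the score of documents matching the negative query is multiplied, so a value of 2 cannot push anything down. Also, a term query does not analyze its input, so 'B-Stock' would typically not match a name field indexed with the standard analyzer, which stores lowercase tokens. A hedged sketch follows, assuming name is an analyzed text field; the helper name demoteBStock, the value 0.2 and the sample term are illustrative and not from the original post.

```php
<?php
// Hypothetical helper: wraps any query so that B-Stock items rank lower.
function demoteBStock(array $positiveQuery, float $negativeBoost = 0.2): array
{
    return [
        'boosting' => [
            'positive' => $positiveQuery,
            // match_phrase analyzes its input, so it is compared against
            // the same kind of tokens that were stored at index time.
            'negative' => ['match_phrase' => ['name' => 'b-stock']],
            // Scores of documents matching 'negative' are multiplied by this.
            'negative_boost' => $negativeBoost,
        ],
    ];
}

$term = 'guitar'; // sample search term
$query = demoteBStock([
    'query_string' => ['default_field' => 'name', 'query' => $term],
]);
```

The bool/should block from the question can be passed as $positiveQuery unchanged; only the negative part and the boost value differ.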

How to optimize an Elasticsearch query

I have been reading through the Elasticsearch docs over the last few months and have continued to optimize my query, but I can't seem to get a search query below 500-600ms. Locally, with less data, I can get responses in ~80-200ms.
To outline what I am trying to accomplish:
I have 12 different models in Laravel that are searchable from a single search bar. As someone types, the search runs and the matches are returned as a list of results.
Currently, I have this for my search query. Are there any references for how I can improve it? I looked into multi_match, but I was having issues with partial matches and with specifying all fields.
$results = $this->elastic->search([
    'index' => config('scout.elasticsearch.index'),
    'type' => $type ?? implode(',', array_keys($this->permissions, true, true)),
    'body' => [
        'query' => [
            'bool' => [
                'must' => [
                    [
                        'query_string' => [
                            'query' => "$searchQuery*",
                        ],
                    ],
                ],
                'filter' => [
                    [
                        'term' => [
                            'account_id' => $accountId,
                        ],
                    ],
                ],
                'should' => [
                    [
                        'term' => [
                            '_type' => [
                                'value' => 'customers',
                                'boost' => 1.3,
                            ],
                        ],
                    ],
                    [
                        'term' => [
                            '_type' => [
                                'value' => 'contacts',
                                'boost' => 1.3,
                            ],
                        ],
                    ],
                    [
                        'term' => [
                            '_type' => [
                                'value' => 'users',
                                'boost' => 1.3,
                            ],
                        ],
                    ],
                    [
                        'term' => [
                            '_type' => [
                                'value' => 'chart_accounts',
                                'boost' => 1.2,
                            ],
                        ],
                    ],
                ],
            ],
        ],
        'from' => $from,
        'size' => $size,
    ],
]);

elasticsearch: search for parts of words

I'm trying to learn how to use Elasticsearch (using elasticsearch-php for queries). I have inserted some data, which looks something like this:
['id' => 1, 'name' => 'butter', 'category' => 'food'],
['id' => 2, 'name' => 'buttercup', 'category' => 'food'],
['id' => 3, 'name' => 'something else', 'category' => 'butter']
Then I created a search query which looks like this:
$query = [
    'filtered' => [
        'query' => [
            'bool' => [
                'should' => [
                    ['match' => [
                        'name' => [
                            'query' => $val,
                            'boost' => 7
                        ]
                    ]],
                    ['match' => [
                        'category' => [
                            'query' => $val,
                            'boost' => 5
                        ]
                    ]],
                ],
            ]
        ]
    ]
];
where $val is the search term. This works nicely. The only problem I have: when I search for "butter", I find ids 1 and 3, but not 2, because the search term seems to match exact words only. Is there a way to search "within words", or, in MySQL terms, to do something like WHERE name LIKE '%val%'?
You can try the wildcard query:
$query = [
    'filtered' => [
        'query' => [
            'bool' => [
                'should' => [
                    // The wildcard query takes its pattern under 'value'.
                    ['wildcard' => [
                        'name' => [
                            'value' => '*'.$val.'*',
                            'boost' => 7
                        ]
                    ]],
                    ['wildcard' => [
                        'category' => [
                            'value' => '*'.$val.'*',
                            'boost' => 5
                        ]
                    ]],
                ],
            ]
        ]
    ]
];
or the query_string query:
$query = [
    'filtered' => [
        'query' => [
            'bool' => [
                'should' => [
                    ['query_string' => [
                        'default_field' => 'name',
                        'query' => '*'.$val.'*',
                        'boost' => 7
                    ]],
                    ['query_string' => [
                        'default_field' => 'category',
                        'query' => '*'.$val.'*',
                        'boost' => 7
                    ]],
                ],
            ]
        ]
    ]
];
Both will work, but neither performs well if you have lots of data, because a leading wildcard cannot make efficient use of the index.
The correct way of doing this is to use a custom analyzer with a standard tokenizer and an ngram token filter in order to slice and dice each of your tokens into small ones.
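The analyzer described above could be declared at index creation roughly as follows. This is a sketch under several assumptions: the typeless mapping layout and the index.max_ngram_diff setting follow recent Elasticsearch versions (7 or later), which no longer support the filtered query used earlier in this thread; the names ngramIndexParams, name_ngrams and ngram_analyzer, the index name 'products' and the gram sizes 3 to 20 are all invented for illustration.

```php
<?php
// Hypothetical helper: index-creation parameters with an ngram analyzer.
function ngramIndexParams(string $index): array
{
    $minGram = 3;
    $maxGram = 20;

    return [
        'index' => $index,
        'body' => [
            'settings' => [
                // The allowed gap between min_gram and max_gram is limited
                // by this setting, so it is raised to fit the chosen sizes.
                'index' => ['max_ngram_diff' => $maxGram - $minGram],
                'analysis' => [
                    'filter' => [
                        'name_ngrams' => [
                            'type' => 'ngram',
                            'min_gram' => $minGram,
                            'max_gram' => $maxGram,
                        ],
                    ],
                    'analyzer' => [
                        'ngram_analyzer' => [
                            'type' => 'custom',
                            'tokenizer' => 'standard',
                            'filter' => ['lowercase', 'name_ngrams'],
                        ],
                    ],
                ],
            ],
            'mappings' => [
                'properties' => [
                    'name' => [
                        'type' => 'text',
                        // Index time: every token is sliced into ngrams.
                        'analyzer' => 'ngram_analyzer',
                        // Search time: the input is kept as whole words.
                        'search_analyzer' => 'standard',
                    ],
                ],
            ],
        ],
    ];
}

$params = ngramIndexParams('products');
```

With elasticsearch-php, such an array would typically be passed to $client->indices()->create($params). A plain match query on name for "butter" should then also find "buttercup", since "butter" is one of its stored ngrams. Search terms shorter than 3 or longer than 20 characters would not match with these sizes.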
