How do you do field boosting in the Elasticsearch PHP client? - php

I am using Elasticsearch for the first time and have the indexing and basic searching down, but I am looking to do some more complex searching.
With the PHP client how do you do partial searches and field boosting / relevance? Ultimately, I want to search multiple fields for partial matches, exact matches, and boost some of the fields.
Here is what I have so far, but I can't get it working. The Elasticsearch documentation is no good.
$show_params = [
'index' => env('ES_INDEX'),
'type' => 'show',
'size' => 6,
'body' => [
'query' => [
'bool' => [
'should' => [
[
'match' => [
'title' => [
'query' => '*' . $q . '*',
'boost' => 2
]
]
],
[
'match' => [
'synopsis' => '*' . $q . '*'
]
]
]
]
]
]
];
$client = \Elasticsearch\ClientBuilder::create()->build();
$show_raw_results = $client->search($show_params);

The basic match query doesn't support wildcards, which is why your queries are not working (the boosting syntax is, however, correct).
You can try using the wildcard query, but it has some limitations (in particular, it is a not_analyzed query, which means your input text needs to be "pre-analyzed"):
$show_params = [
'index' => env('ES_INDEX'),
'type' => 'show',
'size' => 6,
'body' => [
'query' => [
'bool' => [
'should' => [
[
'wildcard' => [
'title' => [
'value' => '*' . $q . '*',
'boost' => 2
]
]
],
[
'wildcard' => [
'synopsis' => '*' . $q . '*'
]
]
]
]
]
]
];
$client = \Elasticsearch\ClientBuilder::create()->build();
$show_raw_results = $client->search($show_params);
However, this isn't necessarily the best approach. Wildcards are slower, and will pull back many documents which may frustrate your users (since there are many low-scoring, less-relevant hits).
And most importantly, wildcards are not_analyzed queries, meaning the query text won't go through analysis. If you search for "quick brown fox", the above query will search your index for "*quick brown fox*" exactly as if it were a single token, instead of breaking it into multiple tokens (["quick","brown","fox"]) and searching with those.
I would highly suggest reading through the section in the Definitive Guide on Partial Matching, or just start at the beginning of the Fulltext Search chapter and work your way through it. You'll need a good understanding of analysis and tokenization to get decent results with partial/fuzzy matching.
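As one possible starting point, here is a minimal sketch that combines field boosting with prefix-style partial matching through analyzed queries instead of wildcards. It assumes the same title and synopsis fields as above. The helper name buildShowSearchParams and the index name are hypothetical, and the type parameter is left out (add it back if your Elasticsearch version still uses mapping types). Note that phrase_prefix only treats the last word of the input as a prefix, so it is not a full replacement for the n-gram approaches described in the guide.

```php
<?php
// Hypothetical helper: builds search params that weight `title` above `synopsis`.
function buildShowSearchParams(string $q, string $index): array
{
    return [
        'index' => $index,
        'size'  => 6,
        'body'  => [
            'query' => [
                'bool' => [
                    'should' => [
                        // Analyzed full-text match; "title^2" doubles the weight of title hits.
                        ['multi_match' => [
                            'query'  => $q,
                            'fields' => ['title^2', 'synopsis'],
                        ]],
                        // Prefix match on the last term of the input, e.g. "quick bro".
                        ['multi_match' => [
                            'query'  => $q,
                            'type'   => 'phrase_prefix',
                            'fields' => ['title^2', 'synopsis'],
                        ]],
                    ],
                ],
            ],
        ],
    ];
}

$params = buildShowSearchParams('quick bro', 'shows');
// $client->search($params) would run it with a client built as shown above.
```

Because the user input goes through analysis here, a multi-word search is split into tokens rather than treated as one literal string.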

Related

If the number of letters is low, why doesn't elasticsearch work?

$params = [
'from' => $from,
'size' => config('app.pagination'),
'index' => $index,
//'type' => $this->type,
'body' => [
'query' => [
'bool' => [
'filter' => [
'term' => $where
],
'must' => [
'multi_match' => [
'query' => $match,
'fields'=>$fields,
'fuzziness' => "AUTO:1,5",
]
]
],
]
]
];
Hello, the query above works for misspelled input.
For example, when I am looking for "pizza" and type "pizaz" instead, it still returns the correct records.
But I have a problem with incomplete input:
it doesn't return anything when I type "piz".
How can I solve this? I want the query to match when only part of the word is typed.
Add the minimum_should_match parameter to your query.
minimum_should_match sets a threshold (an absolute number, a percentage, or a combination of these) for the number of clauses that must match in boolean queries.
From the docs:
If the bool query includes at least one should clause and no must or
filter clauses, the default value is 1. Otherwise, the default value
is 0.
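To illustrate where the parameter goes, here is a minimal sketch. It assumes the fuzzy clause is moved from must into should and paired with a phrase_prefix clause, which is an addition that is not part of the original query: fuzziness only tolerates a small number of edits, so "piz" is likely too far from "pizza" to match on its own. The function name and the sample field names are hypothetical.

```php
<?php
// Hypothetical helper: at least one of the two should clauses has to match.
function buildFuzzyPrefixQuery(string $match, array $fields, array $where): array
{
    return [
        'bool' => [
            'filter' => [
                ['term' => $where],
            ],
            'should' => [
                // Tolerates typos such as "pizaz".
                ['multi_match' => [
                    'query'     => $match,
                    'fields'    => $fields,
                    'fuzziness' => 'AUTO:1,5',
                ]],
                // Matches incomplete input such as "piz".
                ['multi_match' => [
                    'query'  => $match,
                    'fields' => $fields,
                    'type'   => 'phrase_prefix',
                ]],
            ],
            // A filter clause is present, so the default would be 0; set it explicitly.
            'minimum_should_match' => 1,
        ],
    ];
}

$query = buildFuzzyPrefixQuery('piz', ['name', 'description'], ['status' => 'active']);
```

The returned array would take the place of the 'query' value in the params above.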

Highlighting does not work in Elasticsearch and PHP

I've just downloaded and installed the latest version of Elasticsearch on my Windows machine. I ran my first search queries and everything seemed to work OK. However, when I try to highlight the search results, I fail. This is what my query looks like:
$params = [
'index' => 'test_index',
'type' => 'test_index_type',
'body' => [
'query' => [
'bool' => [
'should' => [ 'match' => [ 'field1' => '23' ] ]
]
],
'highlight' => [
'pre_tags' => "<em>",
'post_tags' => "</em>",
'fields' => (object)Array('field1' => new stdClass),
'require_field_match' => false
]
]
];
$res = $client->search($params);
On the whole, the query itself works nicely: the results are filtered. In the console I can see that all returned documents do contain "23" in their field1 field. However, the <em></em> tags are simply not added to the result. What I see is just the raw value of field1, like "some text 23" or "23 another text", rather than the "some text <em>23</em>" and "<em>23</em> another text" I expect. What is wrong, and how can I fix it?
From the manual:
The value of pre_tags and post_tags should be an array (however, if you don't want to change the em tags you can leave them out, since they are already set by default).
The fields value should be an array whose keys are the field names and whose values hold the options for each field.
Try this fix:
$params = [
'index' => 'test_index',
'type' => 'test_index_type',
'body' => [
'query' => [
'bool' => [
'should' => [ 'match' => [ 'field1' => '23' ] ]
]
],
'highlight' => [
// 'pre_tags' => ["<em>"], // not required
// 'post_tags' => ["</em>"], // not required
'fields' => [
'field1' => new \stdClass()
],
'require_field_match' => false
]
]
];
$res = $client->search($params);
var_dump($res['hits']['hits'][0]['highlight']);
Update
I double-checked: the value of each field in the fields array must be an object (this is a requirement, unlike the other options).
pre_tags and post_tags can also be strings rather than arrays.
Did you check the right part of the response?
The important thing to notice is that the highlighted results go into the highlight array of each hit: $res['hits']['hits'][0]['highlight'].
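As a small sketch of reading that array, the hypothetical helper below collects the highlighted fragments per hit. The sample response is made up for illustration and only mimics the shape of a search response; hits without a highlight entry for the field are skipped.

```php
<?php
// Hypothetical helper: returns [document id => highlighted fragments] for one field.
function collectHighlights(array $response, string $field): array
{
    $out = [];
    foreach ($response['hits']['hits'] ?? [] as $hit) {
        if (isset($hit['highlight'][$field])) {
            $out[$hit['_id']] = $hit['highlight'][$field];
        }
    }
    return $out;
}

// Made-up response fragment, shaped like the array returned by $client->search().
$res = [
    'hits' => [
        'hits' => [
            [
                '_id'       => 'doc1',
                '_source'   => ['field1' => 'some text 23'],
                'highlight' => ['field1' => ['some text <em>23</em>']],
            ],
            [
                '_id'     => 'doc2',
                '_source' => ['field1' => 'no highlight here'],
            ],
        ],
    ],
];

$highlights = collectHighlights($res, 'field1');
```

The _source values stay untouched; only the separate highlight entries carry the tags.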

Trying to only return results for multiple must matches - elasticsearch

Consider this PHP array for defining a search query on elasticsearch
$searchParams = [
'index' => $index_name,
'type' => 'nginx-access',
'size' => 1000,
'sort' => ['_score'],
'fields' => ["remote_addr", "@timestamp", "request", "method", "status"],
'body' => [
// 'min_score' => 4,
'query' => [
'bool' => [
'must' => [
['match' => ['status' => '200']],
['match' => ['method' => 'POST']],
['match' => ['request' => '/wp-login.php']],
],
]
]
]
];
$results = $this->client->search($searchParams);
var_dump($results);
Notice that 'min_score' is commented out. The array of results returns everything as expected, however I only want results that match all of the criteria. I thought this is what query->bool->must is supposed to do, however near the end of the dataset I am getting matches for other seemingly random things as well.
If I run it with min_score then I'm always going to get the results I want, but this does not seem the right way to do it.
Am I missing something obvious here?
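One possible explanation, offered as an assumption since the mapping is not shown: if request is an analyzed text field, match splits '/wp-login.php' into several tokens and, by default, accepts documents that contain any one of them. A minimal sketch that requires all tokens through the operator option of match (the helper name is hypothetical):

```php
<?php
// Hypothetical helper: every token of every value has to be present in its field.
function buildStrictMust(array $criteria): array
{
    $must = [];
    foreach ($criteria as $field => $value) {
        $must[] = ['match' => [
            $field => ['query' => $value, 'operator' => 'and'],
        ]];
    }
    return ['bool' => ['must' => $must]];
}

$query = buildStrictMust([
    'status'  => '200',
    'method'  => 'POST',
    'request' => '/wp-login.php',
]);
```

If the fields turn out to be stored unanalyzed, term clauses would be the more direct choice.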

Empty values in Elasticsearch query string

I am using ES for my Laravel app in order to search a table/type.
My users can search a total of 5 columns which means that there can be a total of 31 query combinations.
So my question is whether I can use the same query without providing ES with all the search params, or somehow write dynamic queries.
Eg:
'filtered' => [
'query' => [
'match' => ['title' => Input::get('query')]
],
'filter'=> [
'bool' => [
'must' => [
['term' => [ 'type' => 1] ],
['term' => [ 'state' => 22] ],
['term' => [ 'city' => ] ], (empty)
[
'range' => [
'price' => [
'gte' => , (empty)
'lte' => , (empty)
]
]
]
]
]
],
],
Otherwise I have to write 31 different combinations of this query, if ES doesn't have anything that can help me. And I can use Laravel's Eloquent ORM for this.
Thanks in advance
You can use Elasticquent
Elasticquent makes working with Elasticsearch and Eloquent models easier by mapping them to Elasticsearch types. You can use the default settings or define how Elasticsearch should index and search your Eloquent models right in the model.
Elasticquent uses the official Elasticsearch PHP API. To get started, you should have a basic knowledge of how Elasticsearch works (indexes, types, mappings, etc).
https://github.com/adamfairholm/Elasticquent
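Independently of any wrapper library, the clauses can also be assembled conditionally in plain PHP so that empty inputs are simply left out. A minimal sketch, assuming the field names from the question; the function name and the $input keys (including price_min and price_max) are hypothetical.

```php
<?php
// Hypothetical helper: only non-empty inputs produce filter clauses.
function buildFilters(array $input): array
{
    $must = [];
    foreach (['type', 'state', 'city'] as $field) {
        if (isset($input[$field]) && $input[$field] !== '') {
            $must[] = ['term' => [$field => $input[$field]]];
        }
    }

    // Collect whichever price bounds were supplied.
    $range = [];
    if (isset($input['price_min']) && $input['price_min'] !== '') {
        $range['gte'] = $input['price_min'];
    }
    if (isset($input['price_max']) && $input['price_max'] !== '') {
        $range['lte'] = $input['price_max'];
    }
    if ($range !== []) {
        $must[] = ['range' => ['price' => $range]];
    }

    return ['bool' => ['must' => $must]];
}

// City and the upper price bound are empty, so they produce no clauses.
$filter = buildFilters(['type' => 1, 'state' => 22, 'city' => '', 'price_min' => 100]);
```

The result would go under the 'filter' key of the query in the question, so one function covers every combination of filled-in fields.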

Elastic Search deleteByQuery multiple terms

I'm having trouble duplicating my MySQL delete query in Elasticsearch. I am using this documentation: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-delete-by-query.html with the PHP wrapper for Laravel.
I'm trying this:
$this->es->deleteByQuery([
'index' => 'users',
'type' => 'user',
'body' => [
'query' => [
'term' => ['field1' => $this->field1],
'term' => ['field2' => $this->field2],
'term' => ['temp' => 0]
]
]
]);
It's supposed to be the equivalent of DELETE FROM users WHERE field1 = $this->field1 AND field2 = $this->field2...
I'm having trouble translating the WHERE ... AND syntax to Elasticsearch.
Any help?
Your second comment was mostly correct:
I think I have gone overboard right now. I have body => query =>
filter => filtered => bool => must => term, term, term. Do I need the
filter => filtered arrays?
The bool filter is preferable over the bool query, since filtering is often much faster than querying. In your case, you are simply filtering documents that have the various terms, and don't want them contributing to the score, so filtering is the correct approach.
This should be done through the query clause, however, since the top-level filter clause is used for a different purpose (filtering facets/aggregations; it was in fact renamed to post_filter in 1.0 to signify that it is a "post filtering" operation).
Your query should look something like this:
$this->es->deleteByQuery([
'index' => 'users',
'type' => 'user',
'body' => [
'query' => [
'filtered' => [
'filter' => [
// must belongs inside a bool filter, which ANDs the term clauses together
'bool' => [
'must' => [
['term' => ['field1' => $this->field1]],
['term' => ['field2' => $this->field2]],
['term' => ['temp' => 0]]
]
]
]
]
]
]
]);
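Note that the filtered query was removed in Elasticsearch 5.0. On newer versions the same non-scoring AND can be written as a bool query with a filter clause. A minimal sketch; the helper name and sample values are hypothetical, and the type parameter is omitted on the assumption of a version without mapping types.

```php
<?php
// Hypothetical helper: one term clause per field => value pair; all of them must match.
function buildDeleteParams(string $index, array $conditions): array
{
    $filters = [];
    foreach ($conditions as $field => $value) {
        $filters[] = ['term' => [$field => $value]];
    }
    return [
        'index' => $index,
        'body'  => [
            'query' => [
                // Clauses under filter are ANDed and do not contribute to the score.
                'bool' => ['filter' => $filters],
            ],
        ],
    ];
}

$params = buildDeleteParams('users', ['field1' => 'a', 'field2' => 'b', 'temp' => 0]);
// $this->es->deleteByQuery($params) would then issue the delete.
```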
