Elasticsearch "Join" tables - php

I need to do a "join" between 2 indexes (tables) and perform a check on a specific field of documents that exist in both indexes.
I want to add a condition like "dateExpiry" below, but I get an error. Is it possible to join 2 or more indexes?
GET cache-*/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "query": {
            "terms": {
              "TagId": {
                "index": "domain_block-2016.06",
                "type": "cBlock",
                "id": "57692ef6ae8c50f67e8b45",
                "path": "TagId",
                "range": {
                  "dateExpiry": {
                    "gte": "20160705T12:00:00"
                  }
                }
              }
            }
          }
        }
      ]
    }
  }
}

Filters within a terms query lookup are not currently supported, which is why adding the range condition fails. The Elasticsearch documentation has a good section on joins and relationships.
Your best bet may be to run two queries against Elasticsearch: one to fetch the list of TagIds, then another that includes that list as an exclusion clause.
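The two-query approach could be sketched roughly like this. It is only a sketch, not a tested client integration: it builds the two request bodies as plain Python dicts and leaves the actual HTTP calls to whatever client you use. The helper names (`build_block_query`, `extract_tag_ids`, `build_exclusion_query`) are hypothetical, and it assumes the block document stores its tag IDs in `TagId` and that `dateExpiry` is mapped as a date, as in the question.

```python
def build_block_query(block_id, expiry_from):
    """First request: fetch the block document only if it has not expired."""
    return {
        "_source": ["TagId"],
        "query": {
            "bool": {
                "must": [
                    {"ids": {"values": [block_id]}},
                    {"range": {"dateExpiry": {"gte": expiry_from}}},
                ]
            }
        },
    }


def extract_tag_ids(search_response):
    """Collect TagId values from the hits of the first response."""
    tag_ids = []
    for hit in search_response.get("hits", {}).get("hits", []):
        value = hit.get("_source", {}).get("TagId", [])
        # TagId may be stored as a single value or as an array.
        tag_ids.extend(value if isinstance(value, list) else [value])
    return tag_ids


def build_exclusion_query(tag_ids):
    """Second request: exclude cache documents whose TagId is in the list."""
    if not tag_ids:
        # Nothing to exclude, so every cache document qualifies.
        return {"query": {"match_all": {}}}
    return {"query": {"bool": {"must_not": [{"terms": {"TagId": tag_ids}}]}}}
```

You would send the first body to domain_block-2016.06/_search, pass the response through `extract_tag_ids`, and send the result of `build_exclusion_query` to cache-*/_search.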


Convert Mysql Query into Elastic search query

I'm working with Elasticsearch but am not experienced at writing Elasticsearch queries. My MySQL query is below; if it is possible to convert it into an Elasticsearch query, I would appreciate the help. Thanks in advance.
SELECT
`currency`.`id` AS `cur_id`,
`currency`.`currency_name` AS `cur_name`,
`currency`.`currency_code` AS `cur_code`,
`currency`.`currency_slug` AS `cur_slug`,
`currency`.`logo` AS `cur_logo`,
`currency`.`added_date` AS `cur_added_date`,
`currency`.`mineable_or_not` AS `mineable_or_not`,
`currency`.`market_cap` AS `cur_market_cap`,
`currency`.`circulating_supply` AS `cur_circulating_supply`,
`currency`.`max_supply` AS `cur_max_supply`,
`currency`.`total_supply` AS `cur_total_supply`,
`currency`.`market_cap` AS `ng_cur_market_cap`,
`currency`.`added_date` AS `ng_cur_added_date`,
`currency`.`circulating_supply` AS `ng_cur_circulating_supply`,
`calculations`.`volume_1hour` AS `cal_volume_1hour`,
`calculations`.`volume_24hour` AS `cal_volume_24hour`,
`calculations`.`volume_168hour` AS `cal_volume_168hour`,
`calculations`.`volume_720hour` AS `cal_volume_720hour`,
`calculations`.`volume_24hour_btc` AS `cal_volume_24hour_btc`,
`calculations`.`current_price` AS `cal_current_price`,
`calculations`.`percentage_change` AS `cal_percentage_change_24h`,
`calculations`.`percentage_change_1h` AS `cal_percentage_change_1h`,
`calculations`.`percentage_change_168h` AS `cal_percentage_change_168h`,
`calculations`.`volume_24hour` AS `ng_cal_volume_24hour`,
`calculations`.`current_price` AS `ng_cal_current_price`
FROM `currency`
JOIN `calculations` ON `calculations`.`currency_id` = `currency`.`id`
WHERE `calculations`.`update_status` = 1 AND `currency`.`currency_type` != 3 AND `currency`.`status` = 1
ORDER BY `market_cap` DESC
LIMIT 100
As eliasah commented, there is no join operation in Elasticsearch.
Joining queries
In general you can't really perform join queries in ES. You can have a parent/child relationship between documents in the same index, but that is not something I would opt for here. My best advice is to denormalize your data and make each document as self-contained as possible. In this specific example, one possible solution is to store the calculations inside the currency document. You would end up with a query like:
{
  "_source": ["id", "logo", ..., "calculations.volume_1hour", "calculations.volume_24hour", ...],
  "query": {
    "bool": {
      "must": [
        { "match": { "calculations.update_status": 1 } },
        { "match": { "status": 1 } }
      ],
      "must_not": [
        { "match": { "currency_type": 3 } }
      ]
    }
  },
  "sort": [
    { "market_cap": { "order": "desc" } }
  ],
  "size": 100
}
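To make the denormalization step concrete, here is a minimal sketch of how the two SQL tables could be merged into self-contained documents before indexing. The function name `denormalize` is hypothetical, rows are assumed to arrive as plain dicts keyed by column name, and it assumes at most one `calculations` row per currency (which the original JOIN suggests but does not guarantee).

```python
def denormalize(currencies, calculations):
    """Embed each currency's calculations row into the currency document."""
    # Index calculations by their foreign key for constant-time lookup.
    calc_by_currency = {row["currency_id"]: row for row in calculations}
    documents = []
    for currency in currencies:
        calc = calc_by_currency.get(currency["id"])
        if calc is None:
            # Mirrors the inner JOIN: currencies without calculations are skipped.
            continue
        doc = dict(currency)
        # The foreign key is redundant once the row is embedded.
        doc["calculations"] = {k: v for k, v in calc.items() if k != "currency_id"}
        documents.append(doc)
    return documents
```

Each resulting document can then be indexed as-is, and fields such as calculations.update_status become directly queryable on the currency document.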

Custom sorting in Elasticsearch

Does anyone know if it's possible to do a custom sort in Elasticsearch?
I have a sort on the category field, which groups all of the records together by category. This works great.
However, could you then give the sort an explicit list, e.g. cars, books, food,
so that it shows the cars first, then books, and finally food?
You can use a function_score query, something like this:
{
  "query": {
    "function_score": {
      "query": { "match_all": {} },
      "boost": "5",
      "functions": [
        {
          "filter": { "match": { "category": "cars" } },
          "weight": 100
        },
        {
          "filter": { "match": { "category": "books" } },
          "weight": 50
        },
        {
          "filter": { "match": { "category": "food" } },
          "weight": 1
        }
      ],
      "score_mode": "max",
      "boost_mode": "replace"
    }
  }
}
Where you, of course, put whichever query you are using now instead of the match_all query, and leave off the sort (the default is by score, which is what you want here).
This replaces the score Elasticsearch normally generates with a custom score for each category. You could experiment with other boost_mode values in order to get a reasonable ranking within each category. If you need to understand what is happening with the scoring, you can add "explain": true to the request at the top level.
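If the category order comes from a list, the functions array can be generated instead of written by hand. A minimal sketch, assuming the field is called `category` as in the question; the helper name `category_order_query` is made up, and the weights are simply descending integers rather than the 100/50/1 used above:

```python
def category_order_query(base_query, categories, field="category"):
    """Build a function_score body that ranks documents by category order."""
    functions = []
    for position, category in enumerate(categories):
        functions.append({
            "filter": {"match": {field: category}},
            # Earlier categories get larger weights, so they sort first by score.
            "weight": len(categories) - position,
        })
    return {
        "query": {
            "function_score": {
                "query": base_query,
                "functions": functions,
                "score_mode": "max",
                "boost_mode": "replace",
            }
        }
    }


body = category_order_query({"match_all": {}}, ["cars", "books", "food"])
```

Only the relative order of the weights matters here, since boost_mode "replace" discards the original query score.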
You can also use a custom script to compute your own sort value.
More details are in the Script Based Sorting section: https://www.elastic.co/guide/en/elasticsearch/reference/5.5/search-request-sort.html
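A script-based sort for a fixed category order might look roughly like the following. Treat it as an untested sketch: the helper name `category_script_sort` is hypothetical, the Painless expression assumes `category` is a single-valued, non-analyzed field with doc values, and the script body key is "inline" in the 5.5 docs linked above while later versions use "source" (hence the `script_key` parameter).

```python
def category_script_sort(categories, field="category", script_key="inline"):
    """Build a `sort` clause that orders documents by a fixed category list."""
    # Map each category to its rank; unlisted categories sort after all listed ones.
    order = {category: rank for rank, category in enumerate(categories)}
    script_source = (
        "params.order.containsKey(doc['{f}'].value) "
        "? params.order.get(doc['{f}'].value) : params.fallback"
    ).format(f=field)
    return {
        "sort": [
            {
                "_script": {
                    "type": "number",
                    "order": "asc",
                    "script": {
                        "lang": "painless",
                        script_key: script_source,
                        "params": {"order": order, "fallback": len(categories)},
                    },
                }
            }
        ]
    }


sort_clause = category_script_sort(["cars", "books", "food"])
```

Unlike the function_score approach, this leaves the relevance score untouched, so you can add "_score" as a secondary sort key within each category.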

elasticsearch: applying an additional boost on a given field for a given value

I have a Symfony 2.7.6 application using the FOSElasticaBundle.
I have 2 types of search:
One without a keyword: in this case only filters are applied and all document scores are 1 (sometimes with a random order). The main query is:
$query = new Elastica\Query\MatchAll();
One with a keyword: the same filters are applied and the match is run against a list of fields (one of them with a different boost). The results are sorted by score. The main query is now:
$match = new Elastica\Query\MultiMatch();
$match->setQuery($keyword);
$match->setOperator('AND');
$match->setFields([
    'field1^30',
    'field2',
    'field3',
    'field4',
    '_all'
]);
Both searches work well.
Now, for both searches, I want a dynamic boost to be applied for a given field value. Let's say: if field5 == 'value', then add a boost of 15 (15 is just an example; we will run tests to choose the actual boost value). The value used here is not the keyword; it is another parameter.
I tried with FunctionScore and with Boosting queries, but without success. Any hint with a very simple Elasticsearch query would be appreciated.
How about this:
{
  "query": {
    "function_score": {
      "query": {
        "multi_match": {
          "query": "blabla",
          "operator": "AND",
          "fields": [
            "field1^30",
            "field2",
            "field3",
            "field4",
            "_all"
          ]
        }
      },
      "functions": [
        {
          "filter": {
            "term": {
              "field5": "some_value"
            }
          },
          "boost_factor": 15
        }
      ]
    }
  }
}
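Since the same boost has to apply to both the keyword and the no-keyword search, one option is to wrap whichever base query you have. Below is a minimal sketch that builds the request body as a plain dict; the helper name `with_conditional_boost` is hypothetical. Two assumptions to check against your version: as far as I know, newer Elasticsearch releases use `weight` in place of `boost_factor`, and the default `boost_mode` is `multiply`, so the sketch sets `sum` to add the function result to the query score instead.

```python
def with_conditional_boost(base_query, field, value, weight, boost_mode="sum"):
    """Wrap a query in function_score, boosting documents where field == value."""
    return {
        "query": {
            "function_score": {
                "query": base_query,
                "functions": [
                    {
                        # Only documents matching this filter receive the weight.
                        "filter": {"term": {field: value}},
                        "weight": weight,
                    }
                ],
                "boost_mode": boost_mode,
            }
        }
    }


# No-keyword search: every document matches.
no_keyword = with_conditional_boost({"match_all": {}}, "field5", "some_value", 15)

# Keyword search: same wrapper around the multi_match query.
multi_match = {
    "multi_match": {
        "query": "blabla",
        "operator": "AND",
        "fields": ["field1^30", "field2", "field3", "field4", "_all"],
    }
}
with_keyword = with_conditional_boost(multi_match, "field5", "some_value", 15)
```

The resulting array can be passed to Elastica as a raw query, or translated into the equivalent FunctionScore calls.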

Elasticsearch either or match query

I am trying to write a query to search for products on two fields called category1 and category2. I am using the Elasticsearch PHP client and tried a bool should query with match clauses, but this gives me wrong results because substrings also match.
I am looking for an exact match with an OR operation on the two fields. I am new to this, so please guide me.
$params['index'] = 'furnit';
$params['type'] = 'products';
$params['body']['query']['bool']['should'] = array(
    array('match' => array('category1' => $category->name)),
    array('match' => array('category2' => $category->name)),
);
$results = $this->elasticsearch->search($params);
If you are not doing a relevance-ranked search, a bool query with match clauses is not the right tool for this in Elasticsearch. Queries are meant for cases where the relevance of the search keyword and the score of the matching documents matter.
Here you can apply a bool filter to narrow down the results. Using a filter inside a filtered query is the right way to do it, as it excludes all non-matching documents; you can then search within the remaining documents using match queries.
Here's an example of a bool filter:
{
  "from": 0,
  "size": 50,
  "sort": [
    {
      "name": {
        "order": "asc"
      }
    }
  ],
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "bool": {
          "should": [
            {
              "term": {
                "category1": "category1"
              }
            },
            {
              "term": {
                "category2": "category2"
              }
            }
          ]
        }
      }
    }
  }
}
You can refer to the docs as well (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-filter.html).
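One caveat on versions: the filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0, where the same thing is expressed with the filter clause of a bool query. A minimal sketch of that form, built as a plain dict (the helper name `exact_either_query` is hypothetical, and it still assumes the fields are indexed as exact values, not analyzed text):

```python
def exact_either_query(value, fields=("category1", "category2")):
    """Match documents where any of the given fields equals `value` exactly."""
    return {
        "query": {
            "bool": {
                # Filter context: no scoring, just include or exclude.
                "filter": {
                    "bool": {
                        "should": [{"term": {name: value}} for name in fields],
                        # At least one of the term clauses must match (OR).
                        "minimum_should_match": 1,
                    }
                }
            }
        }
    }


body = exact_either_query("sofas")
```

In the PHP client from the question, the same structure would go under $params['body'] as nested arrays.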
Maybe your problem is that you have used the default analyzer (which is the standard analyzer).
Could you share your mapping?
I suggest you set the fields to not_analyzed when indexing and use a term filter/query.
You can use the Put Mapping API to configure this: Put Mapping
Edit: I have created a gist for you, check it here:
Mappings & Terms Filter
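For reference, the mapping change suggested above could look something like the sketch below. It only builds the `properties` fragment as a dict; the helper name `exact_match_mapping` is made up, and which variant applies depends on your version: `"index": "not_analyzed"` on a `string` field is the pre-5.x style, while 5.x and later use the `keyword` type.

```python
def exact_match_mapping(fields, use_keyword=False):
    """Build a `properties` mapping fragment for exact-match fields."""
    if use_keyword:
        # 5.x and later: keyword fields are indexed as a single exact term.
        field_mapping = {"type": "keyword"}
    else:
        # Pre-5.x: a string field with analysis switched off.
        field_mapping = {"type": "string", "index": "not_analyzed"}
    return {"properties": {name: dict(field_mapping) for name in fields}}


mapping = exact_match_mapping(["category1", "category2"])
```

Existing documents need to be reindexed after the mapping change before term queries will match the full category name.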

multiple group by in elasticsearch including missing values

I'm trying to do a group by in Elasticsearch on multiple fields. I know that nested aggregations exist, but I also want a dedicated bucket for the records in which the field I'm grouping by is empty.
Say that we have this kind of data structure:
SONG_ID | SONG_GENRE | SONG_ARTIST
and I want to group by genre, then artist.
I would like to have a group for each possible combination, i.e.
grouping by genre gives me 5 buckets (if there are 5 genres) plus a bucket containing the songs without a genre. Grouping then by artist gives me, for each genre, a bucket per artist plus one with the songs without an artist.
Basically, I'd like to get the same results as I would with a SQL GROUP BY. Is that even possible?
There are several ways to approach this.
The simplest way would be to index a fixed value, say "notmentioned", in the genre field when a song has no genre. You can do this while indexing or by defining "null_value" in your field mapping (note that null_value only replaces explicit null values, not fields that are absent from the document):
"SONG_GENRE": {"type": "string", "null_value": "notmentioned"},
"SONG_ARTIST": {"type": "string", "null_value": "notmentioned"},
So during the (nested) aggregation you will automatically get a bucket with the count for "notmentioned", i.e. the songs that have no genre.
Another approach would be to use the missing filter as an additional aggregation alongside the normal aggregation, something like the following:
{
  "aggs": {
    "SONG_GENRE": {
      "terms": {
        "field": "SONG_GENRE"
      },
      "aggs": {
        "SONG_ARTIST": {
          "terms": {
            "field": "SONG_ARTIST"
          }
        },
        "MISSING_SONG_ARTIST": {
          "filter": {
            "missing": {
              "field": "SONG_ARTIST"
            }
          }
        }
      }
    },
    "MISSING_SONG_GENRE": {
      "filter": {
        "missing": {
          "field": "SONG_GENRE"
        }
      },
      "aggs": {
        "MISSING_SONG_GENRE_SONG_ARTIST": {
          "terms": {
            "field": "SONG_ARTIST"
          }
        },
        "MISSING_SONG_GENRE_MISSING_SONG_ARTIST": {
          "filter": {
            "missing": {
              "field": "SONG_ARTIST"
            }
          }
        }
      }
    }
  }
}
I haven't verified the syntax. It is just to give you an idea.
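Because the structure above repeats for every extra field, it can be generated recursively. The sketch below is one way to do that, with two deviations from the hand-written version that you should treat as assumptions: it uses the dedicated `missing` aggregation instead of a filter aggregation wrapping the missing filter, and it names the buckets simply `<field>` and `MISSING_<field>` at each level. The helper name `group_by_with_missing` is hypothetical.

```python
def group_by_with_missing(fields):
    """Build nested aggs: a terms bucket set plus a missing bucket per field."""
    if not fields:
        return {}
    field, rest = fields[0], fields[1:]
    present = {"terms": {"field": field}}
    absent = {"missing": {"field": field}}
    if rest:
        # Build the sub-aggregations separately so the two branches share no state.
        present["aggs"] = group_by_with_missing(rest)
        absent["aggs"] = group_by_with_missing(rest)
    return {field: present, "MISSING_" + field: absent}


body = {"size": 0, "aggs": group_by_with_missing(["SONG_GENRE", "SONG_ARTIST"])}
```

Setting "size": 0 just suppresses the hits, since only the aggregation results are of interest here.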
Another hacky way could be to treat the missing count (total hits minus the sum of all bucket counts) as the count for songs with no genre.
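That subtraction can be sketched as below, working on the parsed response. Two assumptions: the field is single-valued (a song with two genres would be counted in two buckets and throw the arithmetic off), and the terms aggregation may have truncated its bucket list, in which case the remainder is reported in `sum_other_doc_count` and has to be subtracted as well. The helper name `missing_count` is hypothetical.

```python
def missing_count(total_hits, terms_agg):
    """Derive the number of documents that fell into no terms bucket."""
    counted = sum(bucket["doc_count"] for bucket in terms_agg.get("buckets", []))
    # Documents in buckets beyond the returned top-N are summarized here.
    counted += terms_agg.get("sum_other_doc_count", 0)
    return total_hits - counted


genre_agg = {
    "buckets": [
        {"key": "rock", "doc_count": 40},
        {"key": "jazz", "doc_count": 30},
    ],
    "sum_other_doc_count": 10,
}
songs_without_genre = missing_count(100, genre_agg)
```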
