elasticsearch aggregations on substring - php

I have a field indexed as String in elasticsearch 5
For example 20090219 , 20100416 etc
I can make a aggregation this data, But I want to aggregate on substring.
that is on
2009,2010
I don't want to convert to date. I want to get first 4 characters and get the count.
This is my current code.Very new to Elasticsearch
$params['body']["aggs"]["Year"]["terms"]["field"] = "PublicationDate.keyword";
$params['body']["aggs"]["Year"]["terms"]["size"] = 10;
$params['body']["aggs"]["Year"]["terms"]["order"]["_count"] = "desc";

You can use elasticsearch script feature to achieve this.
GET my-index/_search
{
"aggs" : {
"my-agg" : {
"terms" : {
"script": {
"inline": "doc['PublicationDate.keyword'].getValue().substring(0,4)"
},
"size": 10,
"order" : { "_count" : "desc" }
}
}
}
}
I don't know equivalent php script for above command, but believe you will able to make it work in php.

this did the task
$params['body']["aggs"]["PublicationYear"]["terms"]["script"] = "_value.substring(0,4)";

Related

Elasticsearch either or match query

I am trying to write a query to search for a products on two columns called category1 and category2. I am working using elastic search php client and tried with match should query but this giving me wrong results because of match of substring.
But i am looking for exact match with OR operation on two columns. I am new to this please guide me.
$params['index'] = 'furnit';
$params['type'] = 'products';
$params['body']['query']['bool']['should'] = array(
array('match' => array('category1' => $category->name)),
array('match' => array('category2' => $category->name)),
);
$results = $this->elasticsearch->search($params);
If you are not searching then using a bool query in this scenario is not the right way to do it in elasticsearch. Queries are used when you are searching something and relevancy of your search keyword and score of matching documents matters.
Here you can apply a bool filter of elasticsearch to filter out the desired results. Using filters with queries (filtered query) is right way to do it as it excludes all non-matching documents and then you can search for desired documents by using match queries.
here's an example of a bool filter
{
"from": 0,
"size": 50,
"sort": [
{
"name" : {
"order": "asc"
}
}
],
"query": {
"filtered": {
"query": {
"match_all" : {}
},
"filter": {
"bool": {
"should": [
{
"term": {
"category1" : "category1"
}
},
{
"term": {
"category2" : "category2"
}
}
]
}
}
}
}
}
you can refer to docs as well (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-filter.html)
Maybe your problem is you have used default analyzer (which is standard analyzer).
could you give me your mapping ?
I suggest you to change to use not_analyzer when indexing and use term filter/query.
You could use put mapping here to setting for your analyzer: Put Mapping
Edit: I have created a gist for you, check it here:
Mappings & Terms Filter

Searching in an array indexed in ElasticSearch

I have indexed a document in ElasticSearch that contains arrays like this:
{
"student": "John",
"sport": "Soccer",
"match":
{
"eventType": "League",
"date": "2013-12-31T11:00:00.000Z"
}
}
I need to perform a query that searches for, for example, all league matches (ie, where doc["match"]["eventType"] == "League")
I am using the ElasticSearch-PHP api 1.1.0 and tried querying as such as this without success:
$params['body']['query']['match']['match']['eventType'] = 'League';
I also tried:
$params['body']['query']['match']['match']->eventType = 'League';
What is the correct way to do such a search? The documentation has no such examples.
Can you convert this JSON to php object?
{
"query": {
"match": {
"match.eventType": "League"
}
}
}
I think this will do the work.
As a first step, try to use a different name for your soccer 'match' and call it 'game' to prevent causing a collision with the use of the 'match' operation.

matching whole string with dashes in elasticsearch

I have an elasticsearch query which I am trying to match properly, the field data itself contains -(dashes), the string data are GUIDS
It was not matching properly because it was splitting the term up into separate words split by the -
I have since changed the query to use a match_phrase query like this:
"query": {
"filtered": {
"query": {
"match_phrase":{
"guid":{"operator" : "or","query":"bd2acb42-cf01-11e2-ba92-12313916f4be"}
}
}
}
}
When I am trying to match just one GUIDS, this works just fine.
However I am trying to match multiple GUIDS
So it currently looks like
"query": {
"filtered": {
"query": {
"match_phrase":{
"guid":{"operator" : "or","query":"bd2acb42-cf01-11e2-ba92-12313916f4be d1091f08-ceff-11e2-ba92-12313916f4be"}
}
}
}
}
I assume its not working because its trying to match the whole string, and not each GUID separately.
I tried added "analyzer" : "whitespace", to the query, but this broke the query entirely.
So what is the best method to ensure the query is looking for the whole GUID string and allows matching of multiple GUIDS?
I have been setting the field mapping to not_analyzed for similar purposes.
"guid" : {
"type" : "string",
"index" : "not_analyzed"
}
Building the query manually then works.
{
"bool" : {
"should" : [
{
"term" : { "guid" : "bd2acb42-cf01-11e2-ba92-12313916f4be" }
},
{
"term" : { "guid" : "d1091f08-ceff-11e2-ba92-12313916f4be" }
}
],
"minimum_number_should_match" : 1
}
}

Filtering in Elastic Search?

I'm using date_histogram api to get the actual count using the interval (hour/day/week or month). Also I have a feature which I'm having trouble implementing, a user can filter the results by entering an startDate and endDate (textbox) which will be queried using a field timestamp. So how can I filter the results by querying only one field (which is TIMESTAMP) while using date_histogram api or any api so I can achieve my desire result.
In SQL I will just use a between operator to get the result but from what I've read so far their is no BETWEEN operator in Elastic Search (not sure).
I have this script so far:
curl 'http://anotherdomain.com:9200/myindex/_search?pretty=true' -d '{
"query" : {
"filtered" : {
"filter" : {
"exists" : {
"field" : "adid"
}
},
"query" : {
"query_string" : {
"fields" : [
"adid", "imp"
],
"query" : "525826 AND true"
}
}
}
},
"facets" : {
"histo1":{
"date_histogram":{
"field":"timestamp",
"interval":"day"
}
}
}
}'
In elasticsearch you can use range query of filter to achieve that.

Sort data in sub array in mongodb

Is that possible to sort data in sub array in mongo database?
{ "_id" : ObjectId("4e3f8c7de7c7914b87d2e0eb"),
"list" : [
{
"id" : ObjectId("4e3f8d0be62883f70c00031c"),
"datetime" : 1312787723,
"comments" :
{
"id" : ObjectId("4e3f8d0be62883f70c00031d")
"datetime": 1312787723,
},
{
"id" : ObjectId("4e3f8d0be62883f70c00031d")
"datetime": 1312787724,
},
{
"id" : ObjectId("4e3f8d0be62883f70c00031d")
"datetime": 1312787725,
},
}
],
"user_id" : "3" }
For example I want to sort comments by field "datetime". Thanks. Or only variant is to select all data and sort it in PHP code, but my query works with limit from mongo...
With MongoDB, you can sort the documents or select only some parts of the documents, but you can't modify the documents returned by a search query.
If the current order of your comments can be changed, then the best solution would be to sort them in the MongoDB documents (find(), then for each doc, sort its comments and update()). If you want to keep the current internal order of comments, then you'll have to sort each document after each query.
In both case, the sort will be done with PHP. Something like:
foreach ($doc['list'] as $list) {
// uses a lambda function, PHP 5.3 required
usort($list['comments'], function($a,$b){ return $a["datetime"] < $b["datetime"] ? -1 : 1; });
}
If you can't use PHP 5.3, replace the lambda function by a normal one. See usort() examples.

Categories