I have invoice data and I want to group it into time-period buckets, for example 0-10 days and 11-20 days.
I need to solve this with an aggregation pipeline in MongoDB.
{
'_id': '1',
'value': 10,
'due_date': '20221001'
},
{
'_id': '2',
'value': 10,
'due_date': '20221012'
},
{
'_id': '3',
'value': 10,
'due_date': '20221030'
},
I need to group by period (0-10 days, 11-20 days, more than 20 days) and sum the values. For the example above, the result would be:
[{
"_id": '0-10 days',
"total": 10,
},
{
"_id": '11-20 days',
"total": 10,
},
{
"_id": '>20 days',
"total": 10,
}]
I tried:
['$facet' => [
['due_date_one' => [
['$match' => [
'due_date' => [
'$gt' => new UTCDateTime((new Carbon())-> subDays(100) -> getTimestamp()),
'$lte' => date('Y-m-d', strtotime('now'))
]
]
]],
]]
You could use $bucket or add a few $addFields and $group stages:
Example mongo playground - https://mongoplayground.net/p/B0zl_GGH4sG
Example Documents:
[
{
"_id": "1a",
"value": 10,
"due_date": "20221001"
},
{
"_id": "1b",
"value": 5,
"due_date": "20221102"
},
{
"_id": "1c",
"value": 7,
"due_date": "20221102"
},
{
"_id": "2a",
"value": 10,
"due_date": "20221012"
},
{
"_id": "2b",
"value": 7,
"due_date": "20221113"
},
{
"_id": "2c",
"value": 8,
"due_date": "20221113"
},
{
"_id": "3a",
"value": 10,
"due_date": "20221030"
},
{
"_id": "3b",
"value": 9,
"due_date": "20221131"
},
{
"_id": "3c",
"value": 11,
"due_date": "20221131"
}
]
Aggregation query:
db.collection.aggregate([
{
$addFields: {
day: {
$toInt: { $substr: [ "$due_date", 6, 2 ] }
}
}
},
{
$addFields: {
bucketDate: {
$switch: {
branches: [
{ case: { $gt: [ "$day", 20 ] }, then: ">20 days" },
{ case: { $gt: [ "$day", 10 ] }, then: "11-20 days" }
],
"default": "0-10 days"
}
}
}
},
{
$addFields: {
bucketDateWithMonth: {
$concat: [
{ $substr: [ "$due_date", 0, 6 ] },
" ",
"$bucketDate"
]
}
}
},
{
$group: {
//_id: "$bucketDate", //No grouped month
_id: "$bucketDateWithMonth", //With grouped month
count: { $sum: 1 },
value: { $sum: "$value" }
}
}
])
Output: (grouped month)
[
{
"_id": "202210 0-10 days",
"count": 1,
"value": 10
},
{
"_id": "202211 0-10 days",
"count": 2,
"value": 12
},
{
"_id": "202210 11-20 days",
"count": 1,
"value": 10
},
{
"_id": "202211 11-20 days",
"count": 2,
"value": 15
},
{
"_id": "202210 >20 days",
"count": 1,
"value": 10
},
{
"_id": "202211 >20 days",
"count": 2,
"value": 20
}
]
Output: (No grouped month)
[
{
"_id": ">20 days",
"count": 3,
"value": 30
},
{
"_id": "0-10 days",
"count": 3,
"value": 22
},
{
"_id": "11-20 days",
"count": 3,
"value": 25
}
]
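The answer mentions $bucket without showing it. Here is a minimal sketch of that variant, assuming the same string-typed due_date field and the same day-of-month bucketing as above. The names bucketStage, bucketFor and runBucket are illustrative and not part of MongoDB. The plain JavaScript functions only mimic the stage's lower-inclusive, upper-exclusive boundary rule, so the totals can be checked without a server:

```javascript
// Hypothetical $bucket stage covering the same ranges as the $switch above.
// Boundaries are lower-inclusive and upper-exclusive: [0,11), [11,21), [21,32).
// Each output document's _id is the lower bound of its bucket (0, 11 or 21).
const bucketStage = {
  $bucket: {
    groupBy: { $toInt: { $substrBytes: ["$due_date", 6, 2] } },
    boundaries: [0, 11, 21, 32],
    output: { count: { $sum: 1 }, value: { $sum: "$value" } }
  }
};

// Local emulation of the boundary rule, for checking only.
function bucketFor(day, boundaries) {
  for (let i = 0; i < boundaries.length - 1; i++) {
    if (day >= boundaries[i] && day < boundaries[i + 1]) return boundaries[i];
  }
  return null; // outside every range; the real stage would need a `default`
}

function runBucket(docs, boundaries) {
  const totals = new Map();
  for (const doc of docs) {
    const day = parseInt(doc.due_date.substring(6, 8), 10);
    const key = bucketFor(day, boundaries);
    const entry = totals.get(key) || { _id: key, count: 0, value: 0 };
    entry.count += 1;
    entry.value += doc.value;
    totals.set(key, entry);
  }
  return [...totals.values()].sort((a, b) => a._id - b._id);
}

// The example documents from the answer.
const docs = [
  { _id: "1a", value: 10, due_date: "20221001" },
  { _id: "1b", value: 5, due_date: "20221102" },
  { _id: "1c", value: 7, due_date: "20221102" },
  { _id: "2a", value: 10, due_date: "20221012" },
  { _id: "2b", value: 7, due_date: "20221113" },
  { _id: "2c", value: 8, due_date: "20221113" },
  { _id: "3a", value: 10, due_date: "20221030" },
  { _id: "3b", value: 9, due_date: "20221131" },
  { _id: "3c", value: 11, due_date: "20221131" }
];

const result = runBucket(docs, bucketStage.$bucket.boundaries);
console.log(result);
```

The totals per range (22, 25 and 30) match the "No grouped month" output above. The difference is that $bucket labels each group with its numeric lower bound rather than a string such as "0-10 days".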
I'm a newbie in MongoDB and I'm really struggling to get the expected output. I have a set of records in a collection, let's say:
{"_id": "id", "category": "Category A", "price": 100, "created_at": "2022-02-01 01:05:00"},
{"_id": "id", "category": "Category B", "price": 200, "created_at": "2022-02-01 01:10:00"},
{"_id": "id", "category": "Category B", "price": 150, "created_at": "2022-02-01 01:20:00"}
I want to group these by hour (created_at) and by category, and get the sum of each category's price in a single line like this:
{"_id": "id", "Category A Total Price": 100, "Category B Total Price": 350, "created_at": "2022-02-01 01:00:00"}
Does this problem need a different query for each category, followed by a merge? Or is there another solution you can share?
db.collection.aggregate([
{
"$match": {}
},
{
"$group": {
"_id": {
category: "$category",
created_at: {
$dateTrunc: {
date: { "$toDate": "$created_at" },
unit: "hour"
}
}
},
"total": { "$sum": "$price" }
}
},
{
"$group": {
"_id": "$_id.created_at",
"obj": {
"$push": {
k: { "$concat": [ "$_id.category", " Total Price" ] },
v: "$total"
}
}
}
},
{
$replaceWith: {
$mergeObjects: [
{ "$arrayToObject": "$obj" },
{ created_at: "$_id" }
]
}
}
])
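The two $group stages followed by $arrayToObject amount to a pivot: sum per (hour, category), then fold the categories of each hour into one object. To make that easier to follow, here is a sketch of the same steps in plain JavaScript for the hourly grouping the question asks for. The function names are made up for illustration, and the hour is truncated on the string itself, which sidesteps the time zone handling that $toDate and $dateTrunc would apply:

```javascript
// Sample records from the question.
const records = [
  { category: "Category A", price: 100, created_at: "2022-02-01 01:05:00" },
  { category: "Category B", price: 200, created_at: "2022-02-01 01:10:00" },
  { category: "Category B", price: 150, created_at: "2022-02-01 01:20:00" }
];

// Truncate "YYYY-MM-DD HH:mm:ss" to the start of its hour.
function truncateToHour(timestamp) {
  return timestamp.substring(0, 13) + ":00:00";
}

// One output object per hour, with one "<category> Total Price" key per category.
function pivotByHour(rows) {
  const byHour = new Map();
  for (const row of rows) {
    const hour = truncateToHour(row.created_at);
    const key = row.category + " Total Price";
    const line = byHour.get(hour) || { created_at: hour };
    line[key] = (line[key] || 0) + row.price;
    byHour.set(hour, line);
  }
  return [...byHour.values()];
}

const pivoted = pivotByHour(records);
console.log(pivoted);
```

For the three sample records this yields a single object for 01:00 with totals of 100 for Category A and 350 for Category B, the shape requested in the question.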
I have an index containing details regarding mobiles, case covers etc.
Below is the query used.
{
"query": {
"function_score": {
"query": { "match": {"title" :"Apple iPhone 6s"} },
"boost": "5",
"functions": [
{
"filter": { "match": { "main_category": "mobiles" } },
"weight": 8
},
{
"filter": { "match": {"main_category": "cases-and-covers" } },
"weight": 6
}
],
"max_boost": 8,
"score_mode": "max",
"boost_mode": "multiply",
"min_score" : 5
}
},
"_source":["title","main_category","selling_price"],
"size" : 1000
}
Is it possible to boost the mobiles category as shown above and also sort within that category by selling price in ascending order?
The boost is working fine. How do I sort within a specific boost function?
If the user searches for apple iphone 6s, I want the mobiles category to be boosted with the lowest price first, followed by the cases-and-covers products, also in ascending order of selling price.
Below are the results needed:
{
"took": 6,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 104,
"max_score": 40.645245,
"hits": [
{
"_index": "test",
"_type": "products",
"_id": "shop_24",
"_score": 40.645245,
"_source": {
"selling_price": 72000,
"main_category": "mobiles",
"title": "Apple iPhone 6s"
}
},
{
"_index": "test",
"_type": "products",
"_id": "shop_20",
"_score": 40.168346,
"_source": {
"selling_price": 82000,
"main_category": "mobiles",
"title": "Apple iPhone 6s Plus"
}
},
{
"_index": "test",
"_type": "products",
"_id": "shop_15",
"_score": 39.365562,
"_source": {
"selling_price": 92000,
"main_category": "mobiles",
"title": "Apple iPhone 6s Plus"
}
},
{
"_index": "test",
"_type": "products",
"_id": "shop_17",
"_score": 39.365562,
"_source": {
"selling_price": 2000,
"main_category": "cases-and-covers",
"title": "Case cover for Apple iPhone 6s"
}
},
{
"_index": "test",
"_type": "products",
"_id": "shop_18",
"_score": 39.365562,
"_source": {
"selling_price": 2300,
"main_category": "cases-and-covers",
"title": "Case cover for Apple iPhone 6s Plus"
}
}
]
}
}
Please help.
Can you try the following query:
{
"query": {
"function_score": {
"query": {
"match_all": {} // Change this to query as per your need
},
"boost": "5",
"functions": [
{
"filter": {
"match": {
"main_category": "mobiles"
}
},
"weight": 50
},
{
"filter": {
"match": {
"main_category": "cases-and-covers"
}
},
"weight": 25
}
]
}
},
"sort": [
{
"_score": {
"order": "desc"
}
},
{
"selling_price": {
"order": "asc"
}
}
]
}
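We give a high weight (weight=50) to documents in the mobiles category and a lower weight (weight=25) to documents in the cases-and-covers category.
Finally, we sort first by score and then by selling price. Note that selling_price only decides the order among documents with an identical _score. With match_all, every document in a category gets the same score, so the price order applies within each category. With a scoring query such as match, scores inside a category will usually differ and the price sort will only break exact ties.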
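The effect of a two-key sort is easiest to see on a handful of hits. The sketch below applies the same ordering (score descending, then selling_price ascending) in plain JavaScript. The hits and their _score values are invented for illustration and are not what Elasticsearch would compute:

```javascript
// Hypothetical hits; the _score values are made up for this example.
const hits = [
  { id: "a", _score: 250, main_category: "mobiles", selling_price: 82000 },
  { id: "b", _score: 125, main_category: "cases-and-covers", selling_price: 2300 },
  { id: "c", _score: 250, main_category: "mobiles", selling_price: 72000 },
  { id: "d", _score: 125, main_category: "cases-and-covers", selling_price: 2000 },
  { id: "e", _score: 250.5, main_category: "mobiles", selling_price: 92000 }
];

// Same ordering as the "sort" clause: _score desc, then selling_price asc.
function compareHits(x, y) {
  if (x._score !== y._score) return y._score - x._score;
  return x.selling_price - y.selling_price;
}

const ordered = [...hits].sort(compareHits).map(hit => hit.id);
console.log(ordered); // [ 'e', 'c', 'a', 'd', 'b' ]
```

Hit "e" comes first even though it is the most expensive mobile, because its score is slightly higher. The price only orders "c" before "a" and "d" before "b", where the scores are equal.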
Suppose I have stored the data below and want to search for the term xy in the old_value and new_value fields of those documents whose field_name is curriculum_name_en or curriculum_name_pr:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 98,
"max_score": 1,
"hits": [
{
"_index": "my_index",
"_type": "audit_field",
"_id": "57526c197e83c",
"_score": 1,
"_source": {
"session_id": 119,
"trans_seq_no": 1,
"table_seq_no": 1,
"field_id": 2,
"field_name": "curriculum_id",
"new_value": 118,
"old_value": null
}
},
{
"_index": "my_index",
"_type": "audit_field",
"_id": "57526c197f2c3",
"_score": 1,
"_source": {
"session_id": 119,
"trans_seq_no": 1,
"table_seq_no": 1,
"field_id": 3,
"field_name": "curriculum_name_en",
"new_value": "Test Index creation",
"old_value": null
}
},
{
"_index": "my_index",
"_type": "audit_field",
"_id": "57526c198045c",
"_score": 1,
"_source": {
"session_id": 119,
"trans_seq_no": 1,
"table_seq_no": 1,
"field_id": 4,
"field_name": "curriculum_name_pr",
"new_value": null,
"old_value": null
}
},
{
"_index": "my_index",
"_type": "audit_field",
"_id": "57526c1981512",
"_score": 1,
"_source": {
"session_id": 119,
"trans_seq_no": 1,
"table_seq_no": 1,
"field_id": 5,
"field_name": "curriculum_name_pa",
"new_value": null,
"old_value": null
}
}
]
}
}
Many more fields may exist. The user may select one or more of those fields and define a search term to apply across the selected fields. The challenge is how to tell Elasticsearch to restrict field_name to the fields the user selected and then search in old_value and new_value.
For example, the user selects curriculum_name_en and curriculum_name_pr and then wants to search for xy inside the old_value and new_value fields of the documents whose field_name is one of those fields.
How can we do that?
The requirement boils down to this: the query should match new_value and/or old_value only if field_name also matches one of the selected values. Elasticsearch has no programmatic "if this then that" construct.
What I'm suggesting is something like this:
{
"query": {
"bool": {
"must": [
{
"terms": {
"field_name": [
"curriculum_name_en",
"curriculum_name_pr"
]
}
},
{
"multi_match": {
"query": "Test Index",
"fields": ["new_value","old_value"]
}
}
]
}
}
}
So your "if this then that" condition becomes the must clause of a bool query, with both the "if" part and the "then" part living inside the must.
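Since the selected field names and the search text come from the user, the request body would normally be built at run time. Here is a minimal sketch of such a builder, assuming field_name holds exact, unanalyzed values so that a terms clause can match them. buildAuditQuery is a hypothetical helper name, not part of any Elasticsearch client:

```javascript
// Build the bool/must body shown above from the user's selection.
// `selectedFieldNames`: values of field_name to restrict the search to.
// `searchText`: text to look for in new_value and old_value.
function buildAuditQuery(selectedFieldNames, searchText) {
  if (!Array.isArray(selectedFieldNames) || selectedFieldNames.length === 0) {
    throw new Error("at least one field name must be selected");
  }
  return {
    query: {
      bool: {
        must: [
          // The "if" part: the document describes one of the selected fields.
          { terms: { field_name: selectedFieldNames } },
          // The "then" part: the text appears in the new or the old value.
          { multi_match: { query: searchText, fields: ["new_value", "old_value"] } }
        ]
      }
    }
  };
}

const body = buildAuditQuery(["curriculum_name_en", "curriculum_name_pr"], "xy");
console.log(JSON.stringify(body, null, 2));
```

The resulting object can be sent as the search request body by whichever client is in use.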
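This may solve your problem on Elasticsearch 1.x (the filtered query and the and filter were deprecated in 2.0 and removed in 5.0):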
{
"query": {
"filtered": {
"filter": {
"and": [
{
"query" : {
"terms" : {
"field_name" : [
"curriculum_name_en",
"curriculum_name_pr"
],
"minimum_match" : 1
}
}
},
{
"query" : {
"terms" : {
"new_value" : [
"test", "index"
],
"minimum_match" : 1
}
}
}
]
}
}
}
}
I'm using FOSElasticaBundle with Symfony2 on my project. There are entry and user tables in a MySQL database, and each entry belongs to one user.
I want to get just one entry per user from all the entries in the database.
Entries Representation
[
{
"id": 1,
"name": "Hello world",
"user": {
"id": 17,
"username": "foo"
}
},
{
"id": 2,
"name": "Lorem ipsum",
"user": {
"id": 15,
"username": "bar"
}
},
{
"id": 3,
"name": "Dolar sit amet",
"user": {
"id": 17,
"username": "foo"
}
}
]
Expected result is:
[
{
"id": 1,
"name": "Hello world",
"user": {
"id": 17,
"username": "foo"
}
},
{
"id": 2,
"name": "Lorem ipsum",
"user": {
"id": 15,
"username": "bar"
}
}
]
But it returns all the entries in the table. I've tried adding an aggregation to my Elasticsearch query, and nothing changed.
$distinctAgg = new \Elastica\Aggregation\Terms("distinctAgg");
$distinctAgg->setField("user.id");
$distinctAgg->setSize(1);
$query->addAggregation($distinctAgg);
Is there any way to do this via term filter or anything else? Any help would be great. Thank you.
Aggregations are not easy to understand when you are used to MySQL's GROUP BY.
The first thing is that aggregation results are not returned in hits but in aggregations. So when you get the result of your search, you have to fetch the aggregations like this:
$results = $search->search();
$aggregationsResults = $results->getAggregations();
The second thing is that aggregations won't return the source. With the aggregation in your example, you will only learn that there is 1 entry for user ID 15 and 2 entries for user ID 17.
For example, with this query:
{
"query": {
"match_all": {}
},
"aggs": {
"byUser": {
"terms": {
"field": "user.id"
}
}
}
}
Result:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1,
"hits": [ ... ]
},
"aggregations": {
"byUser": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": 17,
"doc_count": 2
},
{
"key": 15,
"doc_count": 1
}
]
}
}
}
If you want to get results, the same way you would do with a GROUP BY in MySQL, you have to use a top_hits sub-aggregation:
{
"query": {
"match_all": {}
},
"aggs": {
"byUser": {
"terms": {
"field": "user.id"
},
"aggs": {
"results": {
"top_hits": {
"size": 1
}
}
}
}
}
}
Result:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1,
"hits": [ ... ]
},
"aggregations": {
"byUser": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": 17,
"doc_count": 2,
"results": {
"hits": {
"total": 2,
"max_score": 1,
"hits": [
{
"_index": "test_stackoverflow",
"_type": "test1",
"_id": "1",
"_score": 1,
"_source": {
"id": 1,
"name": "Hello world",
"user": {
"id": 17,
"username": "foo"
}
}
}
]
}
}
},
{
"key": 15,
"doc_count": 1,
"results": {
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "test_stackoverflow",
"_type": "test1",
"_id": "2",
"_score": 1,
"_source": {
"id": 2,
"name": "Lorem ipsum",
"user": {
"id": 15,
"username": "bar"
}
}
}
]
}
}
}
]
}
}
}
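Once the response has this shape, getting "one entry per user" means walking the buckets and taking the first top hit of each. Here is a minimal sketch in plain JavaScript, using a trimmed copy of the response above. onePerBucket is an illustrative helper name, and the aggregation names byUser and results are the ones used in the query:

```javascript
// Return the _source of the first top hit in every bucket.
function onePerBucket(response, aggName, topHitsName) {
  const buckets = response.aggregations[aggName].buckets;
  return buckets
    .map(bucket => bucket[topHitsName].hits.hits[0])
    .filter(hit => hit !== undefined) // skip buckets without hits, defensively
    .map(hit => hit._source);
}

// Trimmed copy of the response shown above.
const response = {
  aggregations: {
    byUser: {
      buckets: [
        {
          key: 17,
          doc_count: 2,
          results: { hits: { total: 2, hits: [
            { _id: "1", _source: { id: 1, name: "Hello world", user: { id: 17, username: "foo" } } }
          ] } }
        },
        {
          key: 15,
          doc_count: 1,
          results: { hits: { total: 1, hits: [
            { _id: "2", _source: { id: 2, name: "Lorem ipsum", user: { id: 15, username: "bar" } } }
          ] } }
        }
      ]
    }
  }
};

const entries = onePerBucket(response, "byUser", "results");
console.log(entries.map(entry => entry.name)); // [ 'Hello world', 'Lorem ipsum' ]
```

Keep in mind that a terms aggregation returns only its top 10 buckets by default, so the size setting on the aggregation would need to be raised to cover more users.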
More information on this page: https://www.elastic.co/blog/top-hits-aggregation