Aggregate nested array in Mongodb - php

I have a Mongo collection like this:
{
"_id":ObjectId("55f16650e3cf2242a79656d1"),
"user_id":11,
"push":[
ISODate("2015-09-08T11:14:18.285 Z"),
ISODate("2015-09-08T11:14:18.285 Z"),
ISODate("2015-09-09T11:14:18.285 Z"),
ISODate("2015-09-10T11:14:18.285 Z"),
ISODate("2015-09-10T11:14:18.285 Z")
]
}{
"_id":ObjectId("55f15c78e3cf2242a79656c3"),
"user_id":12,
"push":[
ISODate("2015-09-06T11:14:18.285 Z"),
ISODate("2015-09-05T11:14:18.285 Z"),
ISODate("2015-09-07T11:14:18.285 Z"),
ISODate("2015-09-09T11:14:18.285 Z"),
ISODate("2015-09-09T11:14:18.285 Z"),
ISODate("2015-09-10T11:14:18.285 Z"),
ISODate("2015-09-11T11:14:18.285 Z")
]
}
How can I find the user_ids where the count of timestamps is < 3, counting only timestamps with date(timestamp) > (currentDate - 5), in a single query? I will be using PHP and don't want to bring all the documents into memory.
Explanation:
user_id : date       : count
11      : 2015-09-08 : 2
        : 2015-09-09 : 1
        : 2015-09-10 : 2
12      : 2015-09-05 : 1
        : 2015-09-06 : 1
        : 2015-09-07 : 1
        : 2015-09-09 : 2
        : 2015-09-10 : 1
        : 2015-09-11 : 1
If the date is set to 2015-09-09 (user input), it gives a count of 3 for user_id 11 and a count of 4 for user_id 12. So suppose the count is set to 3 (user input): the query should return 11 (user_id). If the count is set to 2, no user_id is returned, and if the count is set to 5, it should return both 11 and 12.
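To make the expected result concrete, here is a minimal in-memory sketch in JavaScript of the counting rule from the worked example. The function name usersWithAtMost is hypothetical, and the inclusive comparison is an assumption inferred from the example (a count setting of 3 returns user 11, whose count is exactly 3). It only illustrates the desired output; it is not the single server-side query being asked for.

```javascript
// Sample documents from the question, with the dates turned into Date objects.
const docs = [
  { user_id: 11, push: ["2015-09-08", "2015-09-08", "2015-09-09", "2015-09-10", "2015-09-10"] },
  { user_id: 12, push: ["2015-09-06", "2015-09-05", "2015-09-07", "2015-09-09", "2015-09-09", "2015-09-10", "2015-09-11"] },
].map(d => ({ ...d, push: d.push.map(day => new Date(day + "T11:14:18.285Z")) }));

// Hypothetical helper: returns the user_ids whose number of timestamps on or
// after fromDate is at most maxCount (assumes one document per user).
function usersWithAtMost(docs, fromDate, maxCount) {
  return docs
    .filter(doc => doc.push.filter(t => t >= fromDate).length <= maxCount)
    .map(doc => doc.user_id);
}

const from = new Date("2015-09-09T00:00:00Z");
console.log(usersWithAtMost(docs, from, 3)); // [ 11 ]
console.log(usersWithAtMost(docs, from, 2)); // []
console.log(usersWithAtMost(docs, from, 5)); // [ 11, 12 ]
```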

To solve this you need an aggregation pipeline that first filters the documents to those with entries in the last 5 days, then sums the number of qualifying array items per user, and finally checks whether that total is less than three.
The $size aggregation operator really helps here, as do $map and $setDifference, which removes the false values that $map returns for non-matching entries. Counting within each document first, inside the required $group stage, is the most efficient way to process this:
$result = $collection->aggregate(array(
array( '$match' => array(
// 'push' is an array of dates, so compare its elements directly
'push' => array(
'$gte' => new MongoDate( strtotime('-5 days',time()) )
)
)),
array( '$group' => array(
'_id' => '$user_id',
'count' => array(
'$sum' => array(
'$size' => array(
'$setDifference' => array(
array( '$map' => array(
'input' => '$push',
'as' => 'time',
'in' => array(
'$cond' => array(
array( '$gte' => array(
'$$time',
new MongoDate( strtotime('-5 days',time()) )
)),
'$$time',
FALSE
)
)
)),
array(FALSE)
)
)
)
)
)),
array( '$match' => array(
'count' => array( '$lt' => 3 )
))
));
So, after first finding the "possible" documents that contain array entries meeting the criteria via $match, and then finding the "total" size of the matched array items under $group, the final $match keeps only the results whose total is less than three.
For the largely "JavaScript brains" out there (like myself, well trained into it), this is basically this construct:
db.collection.aggregate([
{ "$match": {
"push": {
"$gte": new Date( new Date().valueOf() - ( 5 * 1000 * 60 * 60 * 24 ))
}
}},
{ "$group": {
"_id": "$user_id",
"count": {
"$sum": {
"$size": {
"$setDifference": [
{ "$map": {
"input": "$push",
"as": "time",
"in": {
"$cond": [
{ "$gte": [
"$$time",
new Date(
new Date().valueOf() -
( 5 * 1000 * 60 * 60 * 24 )
)
]},
"$$time",
false
]
}
}},
[false]
]
}
}
}
}},
{ "$match": { "count": { "$lt": 3 } } }
])
Also, MongoDB 3.2 and later offer $filter, which simplifies the whole $map and $setDifference part:
db.collection.aggregate([
{ "$match": {
"push": {
"$gte": new Date( new Date().valueOf() - ( 5 * 1000 * 60 * 60 * 24 ))
}
}},
{ "$group": {
"_id": "$user_id",
"count": {
"$sum": {
"$size": {
"$filter": {
"input": "$push",
"as": "time",
"cond": {
"$gte": [
"$$time",
new Date(
new Date().valueOf() -
( 5 * 1000 * 60 * 60 * 24 )
)
]
}
}
}
}
}
}},
{ "$match": { "count": { "$lt": 3 } } }
])
Also note that the dates are probably best calculated before the pipeline definition, as a separate variable, so that both comparisons use exactly the same cutoff.
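As a rough sketch of that last point, the cutoff can be computed once and passed into a function that builds the pipeline. The helper name buildPushCountPipeline and its parameters are hypothetical; the stages mirror the $filter version above and assume a server that supports $filter.

```javascript
// Hypothetical helper: builds the pipeline from a cutoff Date computed once,
// so the $match and the $filter condition compare against the same instant.
function buildPushCountPipeline(cutoff, limit) {
  return [
    // Keep only documents with at least one entry on or after the cutoff.
    { $match: { push: { $gte: cutoff } } },
    // Per user, sum the number of array entries on or after the cutoff.
    { $group: {
      _id: "$user_id",
      count: {
        $sum: {
          $size: {
            $filter: {
              input: "$push",
              as: "time",
              cond: { $gte: ["$$time", cutoff] }
            }
          }
        }
      }
    } },
    // Keep only users whose total is below the limit.
    { $match: { count: { $lt: limit } } }
  ];
}

const FIVE_DAYS_MS = 5 * 24 * 60 * 60 * 1000;
const cutoff = new Date(Date.now() - FIVE_DAYS_MS);
const pipeline = buildPushCountPipeline(cutoff, 3);
```

The resulting array could then be passed to db.collection.aggregate(pipeline) in the shell.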

Related

Mongodb string to date conversion

Below is a sample document from my MongoDB collection:
{
"_id" : ObjectId("57ed32f4070577ec56a56b9f"),
"log_id" : "180308",
"issue_id" : "108850",
"author_key" : "priyadarshinim_contus",
"timespent" : NumberLong(18000),
"comment" : "Added charts in the dashboard page of the application.",
"created_on" : "2017-08-16T18:22:04.816+0530",
"updated_on" : "2017-08-16T18:22:04.816+0530",
"started_on" : "2017-08-16T18:21:39.000+0530",
"started_date" : "2017-08-02",
"updated_date" : "2017-08-02",
"role" : "PHP",
"updated_at" : ISODate("2017-09-29T15:27:48.069Z"),
"created_at" : ISODate("2017-09-29T15:27:48.069Z"),
"status" : 1.0
}
I need to get records by started_date. By default I supply two dates and check started_date against them with $gte and $lte.
$current_date = '2017-08-31';
$sixmonthfromcurrent ='2017-08-01';
$worklogs = Worklog::raw ( function ($collection) use ($issue_jira_id, $current_date, $sixmonthfromcurrent) {
return $collection->aggregate ( [
['$match' => ['issue_id' => ['$in' => $issue_jira_id],
'started_date' => ['$lte' => $current_date,'$gte' => $sixmonthfromcurrent]
]
],
['$group' => ['issue_id' => ['$push' => '$issue_id'],
'_id' => ['year' => ['$year' => '$started_date'],
'week' => ['$week' => '$started_date'],'resource_key' => '$author_key'],
'sum' => array ('$sum' => '$timespent')]
],
[ '$sort' => ['_id' => 1]
]
] );
} );
If I run this query I am getting this type of error:
Can't convert from BSON type string to Date
How to rectify this error?
The error comes from applying the date operators $year and $week to started_date, which is stored as a string. The only field in your $group that I see as really troubling is week.
The year you can extract with a $project before your $group stage:
$project: {
year: { $substr: [ "$started_date", 0, 4 ] },
issue_id: 1,
author_key: 1,
timespent: 1
}
This works if you know that the date string will always come in this format. Of course, you cannot use a substr operation to find the week.
It would be easy if started_date were an actual ISODate(); then you could use exactly what you wrote, as you probably already saw in the documentation.
If you really need the week field, which I imagine you do, then I'd suggest converting started_date to an ISODate().
You can do that with a bulkWrite:
db = db.getSiblingDB('yourDatabaseName');
var requests = [];
db.yourCollectionName.find().forEach(doc => {
var date = yourFunctionThatConvertsStringToDate(doc.started_date);
requests.push( {
'updateOne': {
'filter': { '_id': doc._id },
'update': { '$set': {
"started_date": date
} }
}
});
if (requests.length === 500) {
db.yourCollectionName.bulkWrite(requests);
requests = [];
}
});
if(requests.length > 0) {
db.yourCollectionName.bulkWrite(requests);
}
Load this script directly on your MongoDB server and execute it there.
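The placeholder yourFunctionThatConvertsStringToDate is left to the reader. One possible sketch, assuming started_date always has the "YYYY-MM-DD" form shown in the sample document and that UTC midnight is an acceptable interpretation, could look like this (the name parseYmdToUtcDate is hypothetical):

```javascript
// Hypothetical converter for "YYYY-MM-DD" strings. Returns a Date at UTC
// midnight and throws on anything that is not a real calendar date.
function parseYmdToUtcDate(s) {
  const m = /^(\d{4})-(\d{2})-(\d{2})$/.exec(s);
  if (!m) {
    throw new Error("Unexpected date format: " + s);
  }
  const year = Number(m[1]);
  const month = Number(m[2]);
  const day = Number(m[3]);
  const date = new Date(Date.UTC(year, month - 1, day));
  // Date.UTC silently rolls over values such as 2017-02-30, so verify.
  if (date.getUTCFullYear() !== year ||
      date.getUTCMonth() !== month - 1 ||
      date.getUTCDate() !== day) {
    throw new Error("Not a valid calendar date: " + s);
  }
  return date;
}

console.log(parseYmdToUtcDate("2017-08-02").toISOString()); // 2017-08-02T00:00:00.000Z
```

Throwing on unexpected input means a malformed value stops the migration instead of being written back as a wrong date.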
Hope this helps.

PHP & MongoDB show results grouped by date

I have a MongoDB collection with the following documents:
{
"_id" : ObjectId("547af6aea3f0eba7148b4567"),
"check_id" : "f5d654e7-257d-4a93-ae50-2d59dfeeb451",
"chunks" : NumberLong(200),
"num_hosts" : NumberLong(1000),
"num_rbls" : NumberLong(163),
"owner" : NumberLong(7901),
"created" : ISODate("2014-11-30T10:51:26.924Z"),
"started" : ISODate("2014-11-30T10:51:31.558Z"),
"finished" : ISODate("2014-11-30T10:57:08.512Z")
}
{
"_id" : ObjectId("54db19a858a5d395a18b4567"),
"check_id" : "9660e510-1349-43f3-9d5e-8bf4b06179be",
"chunks" : NumberLong(2),
"num_hosts" : NumberLong(10),
"num_rbls" : NumberLong(166),
"owner" : NumberLong(7901),
"created" : ISODate("2015-02-11T08:58:17.118Z"),
"started" : ISODate("2015-02-11T08:58:18.78Z"),
"finished" : ISODate("2015-02-11T08:58:47.486Z")
}
{
"_id" : ObjectId("54db267758a5d30eab8b4567"),
"check_id" : "9660e510-1349-43f3-9d5e-8bf4b06179be",
"chunks" : NumberLong(2),
"num_hosts" : NumberLong(10),
"num_rbls" : NumberLong(166),
"owner" : NumberLong(7901),
"created" : ISODate("2015-02-11T09:52:55.388Z"),
"started" : ISODate("2015-02-11T09:52:56.109Z"),
"finished" : ISODate("2015-02-11T09:53:22.095Z")
}
What I need is to get the result and produce an array similar to this:
Array
(
[2015-02-11] => array
(
//array with results from 2015-02-11
)
[2014-11-30] => array
(
//array with results from 2014-11-30
)
)
I know that it's possible to simply perform collection->find, loop through the results, and use PHP logic to achieve my goal, but is it possible to do it in Mongo? Maybe using the aggregation framework?
EDIT: I want to group results by "created" date
Any help will be highly appreciated.
The MongoDB aggregation $group stage can be used for this, so the query below may solve your problem:
db.collectionName.aggregate({
"$group": {
"_id": "$created",
"data": {
"$push": {
"check_id": "$check_id",
"chunks": "$chunks",
"num_hosts": "$num_hosts",
"num_rbls": "$num_rbls",
"owner": "$owner",
"started": "$started",
"finished": "$finished"
}
}
}
}).pretty()
Or
db.collectionName.aggregate({
"$group": {
"_id": "$created",
"data": {
"$push": "$$ROOT"
}
}
}).pretty()
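Note that the two queries above group on the full "created" timestamp, so documents only share a group when their timestamps are identical. To group by calendar day, MongoDB 2.8 (released as 3.0) provides $dateToString, which converts an ISO date to a formatted string, so the query below also works: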
db.collectionName.aggregate([
{
"$project": {
"yearMonthDay": {
"$dateToString": {
"format": "%Y-%m-%d",
"date": "$created"
}
},
"check_id": "$check_id",
"chunks": "$chunks",
"num_hosts": "$num_hosts",
"num_rbls": "$num_rbls",
"owner": "$owner",
"started": "$started",
"finished": "$finished"
}
},
{
"$group": {
"_id": "$yearMonthDay",
"data": {
"$push": "$$ROOT"
}
}
}
]).pretty()
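Whichever pipeline is used, the result is a list of { _id, data } groups rather than the date-keyed array shown in the question. A minimal sketch of that last reshaping step follows; groupsToDateMap and the sample groups value are hypothetical and only stand in for the output of the $dateToString pipeline above.

```javascript
// Hypothetical helper: turns [{ _id: "YYYY-MM-DD", data: [...] }, ...]
// into an object keyed by the date string.
function groupsToDateMap(groups) {
  const byDate = {};
  for (const group of groups) {
    byDate[group._id] = group.data;
  }
  return byDate;
}

// Stand-in for the aggregation output (fields trimmed for brevity).
const groups = [
  { _id: "2015-02-11", data: [{ chunks: 2 }, { chunks: 2 }] },
  { _id: "2014-11-30", data: [{ chunks: 200 }] },
];

const byDate = groupsToDateMap(groups);
console.log(Object.keys(byDate)); // [ '2015-02-11', '2014-11-30' ]
```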
I have managed to solve this using the aggregation framework. Here is the answer, in case anyone needs it.
$op = array(
array(
'$project' => array(
'data' => array(
'check_id' => '$check_id',
'chunks' => '$chunks',
'num_hosts' => '$num_hosts',
'num_rbls' => '$num_rbls',
'owner' => '$owner',
'started' => '$started',
'finished' => '$finished',
),
'year' => array('$year' => '$created' ),
'month' => array('$month' => '$created' ),
'day' => array('$dayOfMonth' => '$created'),
)
),
array(
'$group' => array(
'_id' => array('year' => '$year', 'month' => '$month', 'day' => '$day'),
'reports_data' => array('$push' => '$data'),
)
),
);
$c = $collection->aggregate($op);

mongodb date aggregation operators timezone adjustments with php

I'm trying to adjust the timezone with date aggregation operators.
I need to make -7 hours adjustment on the $signs.timestamp field.
This is my code:
function statsSignatures() {
$cursor = $this->db->collection->users->aggregate(
array('$unwind' => '$signs'),
array('$project'=>array(
'signs'=>'$signs',
'y'=>array('$year'=>'$signs.timestamp'),
'm'=>array('$month'=>'$signs.timestamp'),
'd'=>array('$dayOfMonth'=>'$signs.timestamp'),
'h'=>array('$hour'=>'$signs.timestamp')
)),
array('$group'=>array(
'_id'=>array('year'=>'$y','month'=>'$m','day'=>'$d','hour'=>'$h'),
'total'=>array('$sum'=>1)
)),
array('$sort'=>array(
'_id.year'=>1,
'_id.month'=>1,
'_id.day'=>1,
'_id.hour'=>1
))
);
return $cursor['result'];
}
I'm using MongoDB version 2.6.3.
Thank you a lot !
You can use $project with $subtract operator to make a -7 hour adjustment to a Date field:
{
$project : {
ts : { $subtract : [ "$signs.timestamp", 25200000 ] }
}
}
// 25200000 == 1000 ms x 60 s x 60 min x 7 h
The projected field ts is a Date that's offset by -7 hours.
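The arithmetic behind the constant, and its effect on a single timestamp, can be checked outside the database with a short JavaScript sketch. The helper name shiftBack and the sample timestamp are made up for illustration. Note that a fixed offset like this ignores daylight saving time.

```javascript
// 7 hours expressed in milliseconds, the unit that $subtract uses for dates.
const HOURS = 7;
const offsetMs = HOURS * 60 * 60 * 1000;

// Hypothetical helper mirroring { $subtract: [ date, offsetMs ] }.
function shiftBack(date, ms) {
  return new Date(date.getTime() - ms);
}

const shifted = shiftBack(new Date("2014-07-01T03:30:00Z"), offsetMs);
console.log(offsetMs);               // 25200000
console.log(shifted.toISOString());  // 2014-06-30T20:30:00.000Z
console.log(shifted.getUTCHours());  // 20
```

The shifted value falls on the previous day, which is why the offset has to be applied before extracting year, month, day and hour.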
Edit
This is the correct PHP syntax when using $subtract.
array(
'$project' => array(
'ts' => array('$subtract' => array('$signs.timestamp', 25200000))
)
)
Subtract accepts an array of values, not a key=>value pair.
I'm not sure why, but I get "exception: invalid operator '$signs.timestamp'" if I try to subtract in PHP with this code:
$cursor = $app['mdb']->changemi->users->aggregate(
array('$unwind' => '$signs'),
array('$project' => array(
'ts'=>array('$subtract'=>array(
'$signs.timestamp'=> 25200000
))
)),
array('$project'=>array(
'y'=>array('$year'=>'$ts'),
'm'=>array('$month'=>'$ts'),
'd'=>array('$dayOfMonth'=>'$ts'),
'h'=>array('$hour'=>'$ts')
)),
array('$group'=>array(
'_id'=>array('year'=>'$y','month'=>'$m','day'=>'$d','hour'=>'$h'),
'total'=>array('$sum'=>1)
)),
array('$sort'=>array(
'_id.year'=>1,
'_id.month'=>1,
'_id.day'=>1,
'_id.hour'=>1
))
);
So I came up with two workarounds:
1. Backend PHP (json_decode):
$cursor = $app['mdb']->changemi->users->aggregate(
array('$unwind' => '$signs'),
json_decode('{"$project" : {"ts" : { "$subtract" : [ "$signs.timestamp", 25200000 ] }}}',true),
array('$project'=>array(
'y'=>array('$year'=>'$ts'),
'm'=>array('$month'=>'$ts'),
'd'=>array('$dayOfMonth'=>'$ts'),
'h'=>array('$hour'=>'$ts')
)),
array('$group'=>array(
'_id'=>array('year'=>'$y','month'=>'$m','day'=>'$d','hour'=>'$h'),
'total'=>array('$sum'=>1)
)),
array('$sort'=>array(
'_id.year'=>1,
'_id.month'=>1,
'_id.day'=>1,
'_id.hour'=>1
))
);
2. Frontend JavaScript (minusHours):
Date.prototype.minusHours= function(h){
this.setHours(this.getHours()-h);
return this;
}
...
"date": new Date({{ i._id.year }}, {{ i._id.month -1 }}, {{ i._id.day }}, {{ i._id.hour }}, 0, 0, 0).minusHours(7),
Here is what worked for me. Instead of doing the timezone conversion in the 'project', I just convert the timestamp while grouping.
group._id = {
year: { $year : [{ $subtract: [ "$timestamp", 25200000 ]}] },
month: { $month : [{ $subtract: [ "$timestamp", 25200000 ]}] },
day: { $dayOfMonth : [{ $subtract: [ "$timestamp", 25200000 ]}] }
};
group.count = {
$sum : 1
};
There's no need to wrap the operands in an array; this worked for me:
group._id = {
year: { $year : { $subtract: [ "$timestamp", 25200000 ]}},
month: { $month : { $subtract: [ "$timestamp", 25200000 ]}},
day: { $dayOfMonth : { $subtract: [ "$timestamp", 25200000 ]}}
};
group.count = {
$sum : 1
};

sum query mongodb php

I have this collection
> db.test.find()
{ "_id" : ObjectId("5398ddf40371cdb3aebca3a2"), "name" : "ahmed", "qte" : 30 }
{ "_id" : ObjectId("5398de040371cdb3aebca3a3"), "name" : "demha", "qte" : 35 }
{ "_id" : ObjectId("5398de140371cdb3aebca3a4"), "name" : "ahmed", "qte" : 50 }
{ "_id" : ObjectId("5398de210371cdb3aebca3a5"), "name" : "ahmed", "qte" : 60 }
I would like to sum "qte" where "name" = "ahmed" and print the sum with PHP.
I know how to do it with SQL, but I have no idea how it is done in MongoDB.
Thanks :)
Use the aggregation framework.
Assuming you have the current collection as $collection:
$result = $collection->aggregate(array(
array(
'$match' => array(
'name' => 'ahmed'
)
),
array(
'$group' => array(
'_id' => NULL,
'total' => array(
'$sum' => '$qte'
)
)
)
));
The two parts are the $match, which selects the documents meeting the criteria, and the $group, which arrives at the "total" using $sum.
See other Aggregation Framework Operators and the Aggregation to SQL Mapping chart for more examples.
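For readers coming from SQL, the same two steps can be sketched in plain JavaScript over the sample documents: a filter that plays the role of $match and a reduce that plays the role of $group with $sum. The variable names are illustrative only, and this runs in memory rather than on the server.

```javascript
// The documents from the question, without their _id fields.
const rows = [
  { name: "ahmed", qte: 30 },
  { name: "demha", qte: 35 },
  { name: "ahmed", qte: 50 },
  { name: "ahmed", qte: 60 },
];

const total = rows
  .filter(row => row.name === "ahmed")      // like $match
  .reduce((sum, row) => sum + row.qte, 0);  // like $group with $sum

console.log(total); // 140
```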
This is done with an aggregate statement:
db.test.aggregate([
{
$match: {
name: "ahmed"
}
},
{
$group: {
_id:"$name",
total: {
$sum: "$qte"
}
}
}
])

Query to select documents based only on month or year

Is there a query to select documents based only on month or year in MongoDB, something like the equivalent of the following code in MySQL:
$q="SELECT * FROM projects WHERE YEAR(Date) = 2011 AND MONTH(Date) = 5";
I am looking for the MongoDB equivalent, can anyone help?
Use the aggregation framework to get the query, in particular the Date Aggregation Operators $year and $month. The aggregation pipeline that gives you the above query would look like this:
var pipeline = [
{
"$project": {
"year": { "$year": "$date" },
"month": { "$month": "$date" },
"other_fields": 1
}
},
{
"$match": {
"year": 2011,
"month": 5
}
}
]
db.project.aggregate(pipeline);
The equivalent PHP query would be:
$m = new MongoClient("localhost");
$c = $m->selectDB("examples")->selectCollection("project");
$pipeline = array(
array(
'$project' => array(
// Single quotes matter: "$year" in double quotes would be interpolated as a PHP variable.
"year" => array('$year' => '$date'),
"month" => array('$month' => '$date'),
"other_fields" => 1,
)
),
array(
'$match' => array(
"year" => 2011,
"month" => 5,
),
),
);
$results = $c->aggregate($pipeline);
var_dump($results);
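A different technique from the $project and $match pipeline above is a plain range filter on the date field, which avoids computing the year and month for every document. The sketch below builds such a filter in JavaScript; the helper name monthRange is hypothetical, and it assumes the field is called date (as in the pipeline above) and that month boundaries in UTC are what is wanted.

```javascript
// Hypothetical helper: half-open UTC range [first day of month, first day of next month).
// month is 1-based (5 = May). Date.UTC rolls month 12 over into the next year.
function monthRange(year, month) {
  return {
    $gte: new Date(Date.UTC(year, month - 1, 1)),
    $lt: new Date(Date.UTC(year, month, 1)),
  };
}

const filter = { date: monthRange(2011, 5) };
console.log(filter.date.$gte.toISOString()); // 2011-05-01T00:00:00.000Z
console.log(filter.date.$lt.toISOString());  // 2011-06-01T00:00:00.000Z
```

In the shell this filter could be passed to db.project.find(filter). Selecting a whole year works the same way with a range from January 1 of that year to January 1 of the next.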
