Get top 5 documents with newest nested objects - php

I have the following data structure in MongoDB:
{ "_id" : ObjectId( "xy" ),
"litter" : [
{ "puppy_name" : "Tom",
"birth_timestamp" : 1353963728 },
{ "puppy_name" : "Ann",
"birth_timestamp" : 1353963997 }
]
}
I have many of these "litter" documents with varying numbers of puppies. The higher the timestamp, the younger the puppy (i.e. it was born later).
What I would like to do is retrieve the five youngest puppies from the collection across all litter documents.
I tried something along the lines of
find().sort({'litter.birth_timestamp' : -1}).limit(5)
to get the five litters that contain the youngest puppies, and then extract the youngest puppy from each litter in the PHP script.
But I am not sure this will work properly. Any idea how to do this right (without changing the data structure)?

You can use the Aggregation Framework, introduced in MongoDB 2.2, to achieve this:
<?php
$m = new Mongo();
$collection = $m->selectDB("test")->selectCollection("puppies");
$pipeline = array(
// Create a document stream (one per puppy)
array('$unwind' => '$litter'),
// Sort by birthdate descending
array('$sort' => array (
'litter.birth_timestamp' => -1
)),
// Limit to 5 results
array('$limit' => 5)
);
$results = $collection->aggregate($pipeline);
var_dump($results);
?>
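To see why $unwind matters here: a sort on litter.birth_timestamp orders whole litter documents, and a single litter can hold several of the five youngest puppies, so picking one puppy per litter can miss some. As a rough sketch, the three pipeline stages can be mimicked on plain PHP arrays. The sample data and the helper name youngestPuppies are made up for illustration, and no MongoDB connection is involved.

```php
<?php
// Hypothetical sample data in the same shape as the question's documents.
$litters = array(
    array('_id' => 'a', 'litter' => array(
        array('puppy_name' => 'Tom', 'birth_timestamp' => 1353963728),
        array('puppy_name' => 'Ann', 'birth_timestamp' => 1353963997),
    )),
    array('_id' => 'b', 'litter' => array(
        array('puppy_name' => 'Rex', 'birth_timestamp' => 1353964100),
    )),
);

// Plain-PHP stand-in for $unwind + $sort + $limit (hypothetical helper).
function youngestPuppies(array $litters, $limit)
{
    // $unwind: emit one row per puppy, keeping the parent _id.
    $rows = array();
    foreach ($litters as $doc) {
        foreach ($doc['litter'] as $puppy) {
            $rows[] = array('_id' => $doc['_id'], 'litter' => $puppy);
        }
    }
    // $sort: newest birth_timestamp first.
    usort($rows, function ($a, $b) {
        return $b['litter']['birth_timestamp'] - $a['litter']['birth_timestamp'];
    });
    // $limit: keep only the first $limit rows.
    return array_slice($rows, 0, $limit);
}

$youngest = youngestPuppies($litters, 5);
foreach ($youngest as $row) {
    echo $row['litter']['puppy_name'], "\n";
}
```

Each returned row has the same shape as a document leaving the $unwind stage: the parent _id plus a single puppy under the litter key.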

Related

Mongo PHP join data from another collection and update doc - like SQL JOIN

I need help with MongoDB and the PHP driver.
I have 4 collections:
order_aproved :
{
"order_id" : mongoId ,
"user_id":num ,
"order_date":mongoDate ,
"requset" : string
}
orders_rejected :
{
"order_id" : mongoId,
"user_id" : num ,
"order_date" : mongoDate ,
"requset" : string
}
users :
{
"user_id" : mongoId,
"username" : num ,
"last_order" : mongoDate ,
"num_orders" : num
}
orders_log :
{
"order_id" : mongoId ,
"order_date" : mongoDate ,
"status" : boolean ,
"user_id" : num
}
For every approved or rejected order, I update num_orders on the document of the user who placed it, so that number is always changing, and I log the order in orders_log.
I need to fetch the approved/rejected orders from orders_log for a list of users [array], subject to a condition, and get num_orders and the last order date for the user of each order.
I am doing it like this:
$cursor = $orders->find()->sort(array("order_date" => -1))->limit(15);
$array = iterator_to_array($cursor,false);
$users_for_aproved = ["123","124","125"];
$users_for_rejcted = ["112","113","114"];
$js = "function() { if ( this.requset ) { return this.requset.length > 0 } }";
$query = array( '$and' => array(
array("user_id" => array('$in'=> $users_for_aproved)),
array('$where' => $js )
));
$query1 = array( '$and' => array(
array("user_id" => array('$in'=> $users_for_rejcted)),
array('$where' => $js )
));
$query_or = array('$or' => array($query, $query1));
$cursor = $orders_log->find($query_or)->sort(array("order_date" => -1))->limit(15);
$array = iterator_to_array($cursor,false);
for ( $x=0; $x < count($array) ; $x++ ) {
// One extra query per log entry: this is the part I want to get rid of
$query = array( "user_id" => $array[$x]["user_id"] );
$cursor = $orders->find($query)->limit(1);
$match = iterator_to_array($cursor,false);
$array[$x]["order_count"] = $match[0]["num_orders"];
}
return $array;
It's working, but it's not very efficient. I need a way to fetch data from another collection and add num_orders to the documents I have found without running another query for each one,
like an SQL JOIN, but with MongoDB and the PHP driver.
Thanks!
There are two ways to gain performance in this case:
First, create an index in MongoDB:
db.order_aproved.createIndex( { user_id: 1 } )
You can create the index as shown above, or in the background:
db.order_aproved.createIndex( { user_id: 1 }, { background: true } )
In the latter case the build will be slower, but it will not block operations already running on the database. If you can afford it, I think a foreground build is the better choice, especially if you are not running this script against the production database.
Second, redesign the collections: instead of separate collections joined by some ID, embed the related documents inside the main document, which eliminates the need for any JOIN-like operation as in an RDBMS.
Of the two, the first seems to me the simplest and most straightforward solution in your case. Choosing it also avoids the performance cost of updating embedded documents.
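A different technique from the two options above, and not part of the original answer, is to do the join in the application: fetch all needed users in one query and merge the rows in PHP, instead of querying once per log entry. The sketch below works on plain arrays only. The sample rows and the helper name attachOrderCounts are hypothetical; in real code the $users rows would presumably come from a single find() with an '$in' condition on the user ids found in the log rows.

```php
<?php
// Hypothetical rows as they might come back from orders_log and users.
$logs = array(
    array('order_id' => 'o1', 'user_id' => 123, 'status' => true),
    array('order_id' => 'o2', 'user_id' => 112, 'status' => false),
    array('order_id' => 'o3', 'user_id' => 123, 'status' => true),
);
$users = array(
    array('user_id' => 123, 'num_orders' => 7),
    array('user_id' => 112, 'num_orders' => 2),
);

// Hypothetical helper: attach num_orders to each log row via one lookup table.
function attachOrderCounts(array $logs, array $users)
{
    // Index the user rows by user_id so each lookup is a hash access.
    $countByUser = array();
    foreach ($users as $user) {
        $countByUser[$user['user_id']] = $user['num_orders'];
    }
    // Copy the count onto every log row; null marks a user that was not found.
    foreach ($logs as $i => $log) {
        $id = $log['user_id'];
        $logs[$i]['order_count'] = isset($countByUser[$id]) ? $countByUser[$id] : null;
    }
    return $logs;
}

$joined = attachOrderCounts($logs, $users);
```

With this shape the number of database round trips stays at two (one for the log rows, one for the users) regardless of how many log rows are returned.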

Displaying all aggregated results from Elasticsearch query in PHP

I have a field called "arrivalDate" and this field is a string. Each document has an arrivalDate in string format (ex: 20110128). I want my output to be something like this (date and the number of records that have that date):
Date : how many records have that date
20110105 : 5 records
20120501 : 2 records
20120602 : 15 records
I already have the query to get these results. What I am trying to do is display the aggregated results from Elasticsearch in PHP.
This is what I have so far:
$json = '{"aggs": { "group_by_date": { "terms": { "field": "arrivalDate" } } } }';
$params = [
'index' => 'pickups',
'type' => 'external',
'body' => $json
];
$results = $es->search($params);
However, I don't know how to display the results in PHP. For example, if I wanted to display the total number of documents I would do echo $results['hits']['total']. How could I display all the dates, with the number of records each one has, in PHP?
I'd suggest building the aggregation as a PHP array, the same way you construct the query; in my experience it seems to work quicker. Please see the code below:
'aggs' => [
'group_by_date' => [
'terms' => [
'field' => 'arrivalDate',
'size' => 500
]
]
]
Following that, instead of reading the usual $results['hits']['hits'], read $results['aggregations'], then access the returned data through the buckets in the response.
For the aggregation shown above, it would likely be something along the lines of:
foreach ($results['aggregations'] as $result){
foreach($result['buckets'] as $record){
echo $record['key'] . ' : ' . $record['doc_count'] . " records\n";
}
}
There is probably a better way of accessing the nested array, but the loop above works well for me. If you have any issues accessing the data, let me know.
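As a self-contained illustration of the bucket structure, here is a minimal sketch that walks a hand-written response array. The array is a hypothetical, trimmed-down stand-in for what $es->search() returns for a terms aggregation named group_by_date; each bucket carries the term under key and the number of matching documents under doc_count.

```php
<?php
// Hypothetical, trimmed-down response for the group_by_date terms aggregation.
$results = array(
    'hits' => array('total' => 22),
    'aggregations' => array(
        'group_by_date' => array(
            'buckets' => array(
                array('key' => '20110105', 'doc_count' => 5),
                array('key' => '20120501', 'doc_count' => 2),
                array('key' => '20120602', 'doc_count' => 15),
            ),
        ),
    ),
);

// Address the aggregation by name and format one line per bucket.
$lines = array();
foreach ($results['aggregations']['group_by_date']['buckets'] as $bucket) {
    $lines[] = $bucket['key'] . ' : ' . $bucket['doc_count'] . ' records';
}
echo implode("\n", $lines), "\n";
```

Addressing the aggregation by its name avoids the outer loop and keeps working if more aggregations are added to the request later.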

PhpMongo - how to apply AND condition for a single document present in an array?

My Mongo collection has two documents
{
"_id":ObjectId("567168393d5c6cd46a00002a"),
"type":"SURVEY",
"description":"YOU HAVE AN UNANSWERED SURVEY.",
"user_to_notification_seen_status":[
{
"user_id":1,
"status":"UNSEEN",
"time_updated":1450272825
},
{
"user_id":2,
"status":"SEEN",
"time_updated":1450273798
},
{
"user_id":3,
"status":"UNSEEN",
"time_updated":1450272825
}
],
"feed_id":1,
"time_created":1450272825,
"time_updated":1450273798
}
Here is the query I used to fetch the document only if the user_id is 2 and the status is "UNSEEN".
$query = array('$and' => array(array('user_to_notification_seen_status.user_id'=> 2,'user_to_notification_seen_status.status' => "UNSEEN")));
$cursor = $notification_collection->find($query);
Ideally the above query shouldn't retrieve any results, but it does return the document. If I give an invalid id or an invalid status, it does not return any record.
You're misunderstanding how the query works. It matches your document because user_to_notification_seen_status contains an element with user_id: 2 and an element with status: UNSEEN; the two conditions do not have to be satisfied by the same element.
What you can do to get the desired results is use the aggregation framework: unwind the array and then match both conditions. That way you'll only get the unwound documents whose array element satisfies both conditions.
Run this in the mongo shell (or convert it to the PHP equivalent), changing YourCollection to your actual collection name:
db.YourCollection.aggregate([ { $unwind: "$user_to_notification_seen_status" }, { $match: { "user_to_notification_seen_status.status": "UNSEEN", "user_to_notification_seen_status.user_id": 2 } } ] );
This will return no records, but if you change the id to 3 for example, it will return one.
Try:
$query = array(
array('$unwind' => '$user_to_notification_seen_status'),
array(
'$match' => array('user_to_notification_seen_status.status' => 'UNSEEN', 'user_to_notification_seen_status.user_id' => 2),
),
);
$cursor = $notification_collection->aggregate($query);
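The difference between the two matching behaviours can be shown on a plain PHP array, without a database. This is only a sketch: both helper names are hypothetical, and the data is a shortened copy of the document from the question. The first helper mimics the dotted-field query, where each condition may be met by a different array element; the second requires one element to meet all conditions, which is what the $unwind plus $match pipeline achieves (MongoDB's $elemMatch query operator applies the same element-wise rule inside find()).

```php
<?php
// Shortened copy of the array from the question's document.
$elements = array(
    array('user_id' => 1, 'status' => 'UNSEEN'),
    array('user_id' => 2, 'status' => 'SEEN'),
    array('user_id' => 3, 'status' => 'UNSEEN'),
);

// Dotted-field behaviour: every condition must match SOME element,
// but not necessarily the same one (hypothetical helper).
function matchesAcrossElements(array $elements, array $conditions)
{
    foreach ($conditions as $field => $value) {
        $found = false;
        foreach ($elements as $element) {
            if (isset($element[$field]) && $element[$field] === $value) {
                $found = true;
                break;
            }
        }
        if (!$found) {
            return false;
        }
    }
    return true;
}

// Element-wise behaviour: ONE element must match all conditions
// (hypothetical helper).
function matchesSingleElement(array $elements, array $conditions)
{
    foreach ($elements as $element) {
        $ok = true;
        foreach ($conditions as $field => $value) {
            if (!isset($element[$field]) || $element[$field] !== $value) {
                $ok = false;
                break;
            }
        }
        if ($ok) {
            return true;
        }
    }
    return false;
}

$conditions = array('user_id' => 2, 'status' => 'UNSEEN');
$loose  = matchesAcrossElements($elements, $conditions);
$strict = matchesSingleElement($elements, $conditions);
var_dump($loose, $strict);
```

For user_id 2 the first check succeeds and the second fails, which mirrors the unexpected result in the question and the empty result of the pipeline.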

collection not add pushing to subarray

I am using MongoDB, and in the past I have been able to use the following to insert into, or add to, a sub-array that was already in the DB.
Here is my issue: every day we take a look at the iTunes top 100 and insert the songs and artists into our collection. In fact, we use two collections to do this job,
but the one I am having an issue with is the one where we store every single song and artist that has ever appeared in the iTunes top 100.
See the code below:
$collection = static::db()->itunes_collection_australia;
$document = $collection->findOne(array('song' => (string)$entry->imname, "artist"=>(string)$entry->imartist));
if (null !== $document) {
$collection->update(
array(array('song' => (string)$entry->imname, "artist"=>(string)$entry->imartist)),
array('$push' => array("date" => array('$each'=> array((string)$date)),"chartno"=> array('$each'=> array($a))),
));
}
else
{
$collection->insert(array("song"=>(string)$entry->imname, "artist"=>(string)$entry->imartist,"preview"=>(string)$preview,"cd_cover"=>(string)$cover, "price"=>(string)$price,"date"=>array((string)$date),"chartno"=>array($a)));
}
What should happen is that if the artist and song are already in the collection, the document is updated. At the moment it is not doing anything,
and if it is updating, it is not doing it right.
The "date" field should show multiple dates, and likewise "chartno" should show the position the song held in the charts on each of those days.
Here is how it should look when first inserted:
{
"_id" : ObjectId("52ea794d6ed348572d000013"),
"song" : "Timber (feat. Ke$ha)",
"artist" : "Pitbull",
"preview" : "http://a1264.phobos.apple.com/us/r1000/030/Music6/v4/48/30/3c/48303ca0-c509-8c15-4d4a-7ebd65c74725/mzaf_5507852070192786345.plus.aac.p.m4a",
"cd_cover" : "http://a1082.phobos.apple.com/us/r30/Music6/v4/64/41/81/644181ba-d236-211d-809e-057f4352d3d8/886444273480.170x170-75.jpg",
"price" : "$2.19",
"date" : [
"2014-01-29T07:10:38-07:00"
],
"chartno" : [
20
]
}
When the script sees the song is back in the top 100, it should add to the date and chartno fields,
like so:
{
"_id" : ObjectId("52ea794d6ed348572d000013"),
"song" : "Timber (feat. Ke$ha)",
"artist" : "Pitbull",
"preview" : "http://a1264.phobos.apple.com/us/r1000/030/Music6/v4/48/30/3c/48303ca0-c509-8c15-4d4a-7ebd65c74725/mzaf_5507852070192786345.plus.aac.p.m4a",
"cd_cover" : "http://a1082.phobos.apple.com/us/r30/Music6/v4/64/41/81/644181ba-d236-211d-809e-057f4352d3d8/886444273480.170x170-75.jpg",
"price" : "$2.19",
"date" : [
"2014-01-30T07:10:38-07:00",
"2014-01-31T07:10:38-07:00"
],
"chartno" : [
20, 30
]
}
However, that is not happening; in fact, nothing seems to be getting added.
I am wondering if I have done something wrong? Well, clearly I have.
I have also tried '$addToSet', but with no success.
Your update statement is wrong: the first parameter is wrapped in one array too many. Try this:
$collection->update(
array('song' => (string)$entry->imname, "artist"=>(string)$entry->imartist),
array('$push' => array("date" => array('$each'=> array((string)$date)),"chartno"=> array('$each'=> array($a))),
));
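To make the intended effect of the update concrete, here is a minimal sketch of what a $push with $each does to one matched document, written against a plain PHP array. The helper applyPushEach is hypothetical and only imitates the array-append behaviour; it is not the driver's implementation.

```php
<?php
// Hypothetical helper imitating {'$push': {field: {'$each': [...]}}} on one document.
function applyPushEach(array $doc, array $push)
{
    foreach ($push as $field => $spec) {
        // A missing field starts out as an empty array.
        if (!isset($doc[$field])) {
            $doc[$field] = array();
        }
        // Append every value listed under $each, in order.
        foreach ($spec['$each'] as $value) {
            $doc[$field][] = $value;
        }
    }
    return $doc;
}

// Shortened version of the document as first inserted.
$doc = array(
    'song'    => 'Timber (feat. Ke$ha)',
    'artist'  => 'Pitbull',
    'date'    => array('2014-01-29T07:10:38-07:00'),
    'chartno' => array(20),
);

$updated = applyPushEach($doc, array(
    'date'    => array('$each' => array('2014-01-30T07:10:38-07:00')),
    'chartno' => array('$each' => array(30)),
));
print_r($updated);
```

After the call, date and chartno each hold two entries, one per day the song appeared in the chart.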

MongoDB, PHP getting unique visitors per day

I'm creating some analytics script using PHP and MongoDB and I am a bit stuck. I would like to get the unique number of visitors per day within a certain time frame.
{
"_id": ObjectId("523768039b7e7a1505000000"),
"ipAddress": "127.0.0.1",
"pageId": ObjectId("522f80f59b7e7a0f2b000000"),
"uniqueVisitorId": "0445905a-4015-4b70-a8ef-b339ab7836f1",
"recordedTime": ISODate("2013-09-16T20:20:19.0Z")
}
The fields to filter on are uniqueVisitorId and recordedTime.
I've created a database object in PHP that opens a database connection when it is constructed; the MongoDB PHP functions are simply mapped to public methods that use that connection.
Anyhow, so far I get the number of visitors per day with:
public function GetUniqueVisitorsDiagram() {
// MAP
$map = new MongoCode('function() {
day = new Date(Date.UTC(this.recordedTime.getFullYear(), this.recordedTime.getMonth(), this.recordedTime.getDate()));
emit({day: day, uniqueVisitorId:this.uniqueVisitorId},{count:1});
}');
// REDUCE
$reduce = new MongoCode("function(key, values) {
var count = 0;
values.forEach(function(v) {
count += v['count'];
});
return {count: count};
}");
// STATS
$stats = $this->database->Command(array(
'mapreduce' => 'statistics',
'map' => $map,
'reduce' => $reduce,
"query" => array(
"recordedTime" =>
array(
'$gte' => $this->startDate,
'$lte' => $this->endDate
)
),
"out" => array(
"inline" => 1
)
));
return $stats;
}
How would I filter this data correctly to get unique visitors? Or would it be better to use aggregation? If so, could you be so kind as to help me out with a code snippet?
The $group operator in the aggregation framework was designed for exactly this use case and will likely be ~10 to 100 times faster. Read up on the group operator here: http://docs.mongodb.org/manual/reference/aggregation/group/
And the php driver implementation here: http://php.net/manual/en/mongocollection.aggregate.php
You can combine the $group operator with other operators to further limit your aggregations. It's probably best you do some reading up on the framework yourself to better understand what's happening, so I'm not going to post a complete example for you.
$m=new MongoClient();
$db=$m->super_test;
$db->gjgjgjg->insert(array(
"ipAddress" => "127.0.0.1",
"pageId" => new MongoId("522f80f59b7e7a0f2b000000"),
"uniqueVisitorId" => "0445905a-4015-4b70-a8ef-b339ab7836f1",
"recordedTime" => new MongoDate(strtotime("2013-09-16T20:20:19.0Z"))
));
var_dump($db->gjgjgjg->find(array('recordedTime'=>array('$lte'=>new MongoDate(),'$gte'=>new MongoDate(strtotime('-1 week')))))->count()); // Prints 1
$res=$db->gjgjgjg->aggregate(array(
array('$match'=>array('recordedTime'=>array('$lte'=>new MongoDate(),'$gte'=>new MongoDate(strtotime('-1 week'))),'uniqueVisitorId'=>array('$ne'=>null))),
array('$project'=>array('day'=>array('$dayOfMonth'=>'$recordedTime'),'month'=>array('$month'=>'$recordedTime'),'year'=>array('$year'=>'$recordedTime'))),
array('$group'=>array('_id'=>array('day'=>'$day','month'=>'$month','year'=>'$year'), 'c'=>array('$sum'=>1)))
));
var_dump($res['result']);
To answer the question entirely:
$m=new MongoClient();
$db=$m->super_test;
$db->gjgjgjg->insert(array(
"ipAddress" => "127.0.0.1",
"pageId" => new MongoId("522f80f59b7e7a0f2b000000"),
"uniqueVisitorId" => "0445905a-4015-4b70-a8ef-b339ab7836f1",
"recordedTime" => new MongoDate(strtotime("2013-09-16T20:20:19.0Z"))
));
var_dump($db->gjgjgjg->find(array('recordedTime'=>array('$lte'=>new MongoDate(),'$gte'=>new MongoDate(strtotime('-1 week')))))->count()); // Prints 1
$res=$db->gjgjgjg->aggregate(array(
array('$match'=>array('recordedTime'=>array('$lte'=>new MongoDate(),'$gte'=>new MongoDate(strtotime('-1 week'))),'uniqueVisitorId'=>array('$ne'=>null))),
array('$project'=>array('day'=>array('$dayOfMonth'=>'$recordedTime'),'month'=>array('$month'=>'$recordedTime'),'year'=>array('$year'=>'$recordedTime'))),
array('$group'=>array('_id'=>array('day'=>'$day','month'=>'$month','year'=>'$year','v'=>'$uniqueVisitorId'), 'c'=>array('$sum'=>1))),
array('$group'=>array('_id'=>array('day'=>'$_id.day','month'=>'$_id.month','year'=>'$_id.year'),'c'=>array('$sum'=>1)))
));
var_dump($res['result']);
Something close to that is what you're looking for, I believe.
It will return a set of documents that have the date as the _id, together with the count of unique visitors for that day. The first $group collapses each visitor to one row per day, so the second $group counts each uniqueVisitorId once per day no matter how many visits it made.
Since you want it per day, I reckon you can actually exchange the date parts for a single $dayOfYear field.
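The two-stage grouping can be traced on plain PHP arrays as a rough sketch. The sample visits and the helper name uniqueVisitorsPerDay are invented for illustration, and a single 'day' string stands in for the year/month/day parts produced by $project.

```php
<?php
// Hypothetical visit records; 'day' stands in for the projected date parts.
$visits = array(
    array('day' => '2013-09-16', 'uniqueVisitorId' => 'a'),
    array('day' => '2013-09-16', 'uniqueVisitorId' => 'a'),
    array('day' => '2013-09-16', 'uniqueVisitorId' => 'b'),
    array('day' => '2013-09-17', 'uniqueVisitorId' => 'a'),
);

// Hypothetical helper mirroring the two $group stages.
function uniqueVisitorsPerDay(array $visits)
{
    // Stage 1: collapse to one entry per (day, visitor) pair.
    $pairs = array();
    foreach ($visits as $visit) {
        $pairs[$visit['day']][$visit['uniqueVisitorId']] = true;
    }
    // Stage 2: count the distinct visitors within each day.
    $counts = array();
    foreach ($pairs as $day => $visitors) {
        $counts[$day] = count($visitors);
    }
    return $counts;
}

$perDay = uniqueVisitorsPerDay($visits);
print_r($perDay);
```

Visitor 'a' appears twice on the first day but contributes only once to that day's count, which is the behaviour the second $group stage provides.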
