Sort data in sub array in mongodb - php

Is it possible to sort data in a sub-array in a MongoDB database?
{ "_id" : ObjectId("4e3f8c7de7c7914b87d2e0eb"),
"list" : [
{
"id" : ObjectId("4e3f8d0be62883f70c00031c"),
"datetime" : 1312787723,
"comments" :
{
"id" : ObjectId("4e3f8d0be62883f70c00031d")
"datetime": 1312787723,
},
{
"id" : ObjectId("4e3f8d0be62883f70c00031d")
"datetime": 1312787724,
},
{
"id" : ObjectId("4e3f8d0be62883f70c00031d")
"datetime": 1312787725,
},
}
],
"user_id" : "3" }
For example, I want to sort the comments by the "datetime" field. Or is the only option to select all the data and sort it in PHP code? My query uses MongoDB's limit, so that would be a problem... Thanks.

With MongoDB, you can sort the documents or select only some parts of the documents, but you can't modify the documents returned by a search query.
If the current order of your comments can be changed, then the best solution would be to sort them in the MongoDB documents (find(), then for each doc, sort its comments and update()). If you want to keep the current internal order of comments, then you'll have to sort each document after each query.
In both cases, the sort will be done with PHP. Something like:
// Note the &$list reference: without it, usort() would only sort a copy of each entry.
foreach ($doc['list'] as &$list) {
    // uses a lambda function, PHP 5.3 required
    usort($list['comments'], function ($a, $b) {
        return $a["datetime"] < $b["datetime"] ? -1 : 1;
    });
}
unset($list); // break the reference after the loop
If you can't use PHP 5.3, replace the lambda function by a normal one. See usort() examples.
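If you choose the first approach (persisting the sorted order), a minimal sketch with the legacy MongoClient driver could look like the following; the database and collection names here are hypothetical:

$m = new MongoClient();                 // legacy mongo extension assumed
$collection = $m->mydb->posts;          // hypothetical database/collection names

foreach ($collection->find(array('user_id' => '3')) as $doc) {
    foreach ($doc['list'] as &$list) {
        // sort each entry's comments by "datetime" ascending
        usort($list['comments'], function ($a, $b) {
            return $a['datetime'] < $b['datetime'] ? -1 : 1;
        });
    }
    unset($list);
    // write the re-ordered comments back so later limited queries return them sorted
    $collection->update(array('_id' => $doc['_id']), array('$set' => array('list' => $doc['list'])));
}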

Related

update value using nested element match in mongo [duplicate]

I have a document in MongoDB with a two-level-deep nested array of objects that I need to update, something like this:
{
    id: 1,
    items: [
        {
            id: 2,
            blocks: [
                {
                    id: 3,
                    txt: 'hello'
                }
            ]
        }
    ]
}
If there were only one level of nested arrays, I could use the positional operator to update objects in it, but for the second level the only option I've come up with is to use the positional operator together with the nested object's index, like this:
db.objects.update({'items.id': 2}, {'$set': {'items.$.blocks.0.txt': 'hi'}})
This approach works, but it seems dangerous to me: I'm building a web service, and the index number has to come from the client, which could send, say, 100000 as the index, forcing MongoDB to create an array with 100000 entries set to null.
Are there any other ways to update such nested objects where I can refer to an object's ID instead of its position, or perhaps a way to check that the supplied index is in bounds before using it in the query?
Here's the big question: do you need to leverage Mongo's $addToSet and $push operations? If you really only plan to modify individual items in the array, then you should probably model these arrays as objects.
Here's how I would structure this:
{
    id: 1,
    items: {
        "2" : { "blocks" : { "3" : { txt : 'hello' } } },
        "5" : { "blocks" : { "1" : { txt : 'foo' }, "2" : { txt : 'bar' } } }
    }
}
This basically transforms everything into JSON objects instead of arrays. You lose the ability to use $push and $addToSet, but I think this makes everything easier. For example, your query would look like this:
db.objects.update({'items.2': {$exists:true} }, {'$set': {'items.2.blocks.0.txt': 'hi'}})
You'll also notice that I've dumped the "IDs". When you're nesting things like this you can generally replace "ID" with simply using that number as an index. The "ID" concept is now implied.
This feature was added in MongoDB 3.6 with expressive array updates (arrayFilters).
db.objects.update(
    { id: 1 },
    { $set: { 'items.$[itm].blocks.$[blk].txt': 'hi' } },
    { multi: false, arrayFilters: [ { 'itm.id': 2 }, { 'blk.id': 3 } ] }
)
The ids you are using are linear numbers, so they have to come from somewhere, such as an additional field like 'max_idx' or similar. That means one extra lookup to obtain the id before the update. Using a UUID/ObjectId for the ids avoids this and also ensures that distributed CRUD works.
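For reference, a rough PHP equivalent of that arrayFilters update, assuming the current mongodb/mongodb library (MongoDB\Client) rather than the legacy MongoClient, and a hypothetical database name:

require 'vendor/autoload.php';          // mongodb/mongodb library assumed installed

$client  = new MongoDB\Client();        // default localhost connection
$objects = $client->mydb->objects;      // hypothetical database name

$objects->updateOne(
    ['id' => 1],
    ['$set' => ['items.$[itm].blocks.$[blk].txt' => 'hi']],
    ['arrayFilters' => [['itm.id' => 2], ['blk.id' => 3]]]
);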
Building on Gates' answer, I came up with this solution which works with nested object arrays:
db.objects.updateOne(
    { "items.id": 2 },
    { $set: { "items.$.blocks.$[block].txt": "hi" } },
    { arrayFilters: [ { "block.id": 3 } ] }
);
MongoDB 3.6 added the all-positional operator $[], so if you know the id of the block that needs updating, you can do something like:
db.objects.update({'items.blocks.id': id_here}, {'$set': {'items.$[].blocks.$.txt': 'hi'}})
db.col.update({"items.blocks.id": 3},
{ $set: {"items.$[].blocks.$[b].txt": "bonjour"}},
{ arrayFilters: [{"b.id": 3}] }
)
https://docs.mongodb.com/manual/reference/operator/update/positional-filtered/#update-nested-arrays-in-conjunction-with
This is the PyMongo signature for find_one_and_update; note the array_filters parameter, which corresponds to the arrayFilters option shown above. I searched a lot to find it, so hopefully this is useful:
find_one_and_update(filter, update, projection=None, sort=None, return_document=ReturnDocument.BEFORE, array_filters=None, hint=None, session=None, **kwargs)
(A reference to the PyMongo documentation was added in the comments.)

Limit returned fields with PHP MongoClient find() query

I am writing a PHP MongoClient model which accesses a MongoDB database that stores deploy logs with GitLab information, server hosts, and Zend restart instructions. I have a Mongo collection called deployAppConfigs. Its document structure looks like this:
{
    "_id" : ObjectId("54de193790ded22d1cd24c36"),
    "app_name" : "ai2_api",
    "name" : "AI2 Admin API",
    "app_directory" : "path_to_app",
    "app_owner" : "www-data:deployers",
    "directories" : [],
    "vcs" : {
        "type" : "git",
        "name" : "input/ai2-api"
    },
    "environments" : {
        "development" : { ... },
        "qa" : { ... },
        "staging" : { ... },
        "production" : { ... }
    },
    "actions" : {
        "post_checkout" : [
            "composer_install"
        ]
    }
}
Because there are many documents in this collection, I would like to query the entire collection for only the "vcs" sub document and the "app_name". I am able to execute this command in Robomongo's mongo shell with the following find() query:
db.deployAppConfigs.find({}, {"vcs": 1, "app_name": 1})
This returns exactly what I want for each document in the collection:
{
    "_id" : ObjectId("54de193790ded22d1cd24c36"),
    "app_name" : "ai2_api",
    "vcs" : {
        "type" : "git",
        "name" : "input/ai2-api"
    }
}
I am having a problem writing a PHP MongoClient equivalent to that mongo shell command. I basically want a PHP MongoClient version of the MongoDB docs example on Limit Fields to Return from a Query.
I have tried using an empty array to replace the "{}" in the mongo shell command like this, but it hasn't worked:
$query = array(
    array(),
    array("vcs" => 1, "app_name" => 1)
);
All of the documents share vcs.type = "git", so I tried writing a query that selects every document based on that shared value. It looks like this:
$query = array(
    "vcs.type" => "git"
);
But this returns the entire document, which is what I want to avoid.
The alternative could be to do a limit projection find() for the first document in the collection and then use the MongoCursor to iterate through the whole collection, but I'd rather not have to do the extra loop if possible.
Essentially, I am asking how to limit the return fields of a find() query to only one subdocument of each document in the entire collection.
Looks like I was able to find the solution... I will answer my own question and leave it up in case it ends up being useful to anyone else.
What I ended up having to do was alter my MongoClient custom class find() function, which calls the $collection->find() query, to include a $fields parameter.
Now, the MongoClient->find() query looks like this:
$collection->find(
    array("vcs.type" => "git"),
    array("vcs" => 1, "app_name" => 1)
);
Found the answer in the PHP documentation for MongoCollection::find().
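For completeness, iterating the resulting cursor then yields only the projected fields; a minimal sketch, assuming $collection is set up as above:

$cursor = $collection->find(
    array("vcs.type" => "git"),
    array("vcs" => 1, "app_name" => 1)   // projection: _id is always returned by default
);

foreach ($cursor as $doc) {
    // each $doc contains only _id, app_name and the vcs subdocument
    echo $doc['app_name'] . ' => ' . $doc['vcs']['name'] . PHP_EOL;
}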

PHP JSON object nested loop data syntax

I'm trying to loop through some JSON data and pull out specific values. Here is the JSON data and the partially working code.
$jsondata = '
[
    {
        "id" : "421356",
        "trip_update" : {
            "trip" : {
                "trip_id" : "421356",
                "start_time" : "12:05:00",
                "start_date" : "20130926",
                "route_id" : "15"
            },
            "stop_time_update" : {
                "stop_sequence" : 70,
                "departure" : {
                    "delay" : 240,
                    "time" : 1380215057
                },
                "stop_id" : "6090"
            },
            "stop_time_update" : {
                "stop_sequence" : 71,
                "departure" : {
                    "delay" : 240,
                    "time" : 1380215075
                },
                "stop_id" : "6095"
            }
        }
    }
]';
$result = json_decode($jsondata);
foreach ($result as $value) {
    echo "trip_id: " . $value->trip_update->trip->trip_id;
    if (gettype($value->trip_update) == "object") {
        foreach ($value->trip_update as $item) {
            echo " - stop_sequence: " . $item->stop_sequence;
        }
    }
}
I can get the first level of data under trip_update->trip, but there can be any number of stop_time_update entries within trip_update as well. Since this data relates to the trip_update data, I need to loop through it and correlate it.
The end goal is to save this data to a database (not shown in the code), so for clarity, this would be the simplified 2 rows of DB data I would like to save in this example:
trip_id,stop_sequence
421356,70
421356,71
There can be any number of stop_sequences in the source data.
Here is an interactive link to the code for you to edit or mess with:
http://sandbox.onlinephpfunctions.com/code/f21ca8928da7de3e9fb351edb075d0a446906937
You might get better results if you write your own parser or use a stream-parser with callbacks. Here's a PHP implementation of such a parser that works with callbacks. So instead of reading the whole JSON data into memory, the parser will read the data in chunks and notify your "listener-class" whenever a new object starts or a property was read in etc. By doing this, you should get separate callback events for each stop_time_update property instead of just one value in the parsed array.
Very similar to what SAX is for XML.
Hi, maybe you can rename the duplicate keys before decoding:
// Make each duplicate "stop_time_update" key unique by appending a counter,
// so json_decode() no longer overwrites the earlier entries.
$GLOBALS["json_stop_time_update"] = 0;
function next_update($coincidencias) {
    // $coincidencias[0] is the matched key name
    return $coincidencias[0] . $GLOBALS["json_stop_time_update"]++;
}
$result = preg_replace_callback("/stop_time_update/", "next_update", $jsondata);
$result = json_decode($result);
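After that rename the keys become unique (stop_time_update0, stop_time_update1, ...), so a loop along the lines of the original code can pick them up; a rough sketch:

foreach ($result as $value) {
    $trip_id = $value->trip_update->trip->trip_id;
    foreach ($value->trip_update as $key => $item) {
        // only the renamed stop_time_update* properties carry a stop_sequence
        if (strpos($key, 'stop_time_update') === 0) {
            echo "trip_id: " . $trip_id . " - stop_sequence: " . $item->stop_sequence . "\n";
        }
    }
}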
You should rework your JSON - you have multiple keys with the same name. Try print_r($result) to see what I am talking about: PHP will overwrite the "stop_time_update" key time after time, and you will only be able to access the last entry. Instead, you should organize your JSON like this:
[
    {
        "id" : "421356",
        "trip_update" : {
            "trip" : {
                "trip_id" : "421356",
                "start_time" : "12:05:00",
                "start_date" : "20130926",
                "route_id" : "15"
            },
            "stop_time_update" : [
                {
                    "stop_sequence" : 70,
                    "departure" : {
                        "delay" : 240,
                        "time" : 1380215057
                    },
                    "stop_id" : "6090"
                },
                {
                    "stop_sequence" : 71,
                    "departure" : {
                        "delay" : 240,
                        "time" : 1380215075
                    },
                    "stop_id" : "6095"
                }
            ]
        }
    }
]
then you will be able to iterate through your data like this:
foreach ($result[0]->trip_update->stop_time_update as $update) {
    $time = $update->departure->time;
    ...
}
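Building on that restructured JSON, collecting the (trip_id, stop_sequence) pairs mentioned in the question then becomes a simple nested loop; a sketch (the database insertion itself is omitted):

$rows = array();
foreach ($result as $entry) {
    $trip_id = $entry->trip_update->trip->trip_id;
    foreach ($entry->trip_update->stop_time_update as $update) {
        // one row per stop_time_update, e.g. array("421356", 70)
        $rows[] = array($trip_id, $update->stop_sequence);
    }
}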
If you cannot change the data structure, then what probably could help you is a PULL parser - one that does not return parsed data structure, but allows you to use a data stream instead - this way you could iterate over each node. The only one I managed to find is an extension to PHP:
https://github.com/shevron/ext-jsonreader
Check the usage section.
This JSON response is invalid because it contains duplicate keys, and JSON doesn't allow duplicate keys.
You should contact the service you're requesting this response from.
Once you have a valid JSON response, you can decode it using the json_decode function, which returns an object or an array (depending on the second parameter).
You cannot use a standard JSON parser for this, as it will always overwrite the earlier elements because of the duplicate keys. The only proper solution is to ask whoever produces that "JSON" to fix their code to use either an array or an object with unique keys.
Another option is to write your own decoder function to parse it.

MongoDB Query to find out all the array elements of a collection

I have a pretty big MongoDB collection that holds all kinds of data. I need to identify the fields that are of type array so I can remove them from the displayed fields in the grid that I will populate.
My current method consists of retrieving all the field names in the collection with the following map-reduce (taken from the answer posted here: MongoDB Get names of all keys in collection):
mr = db.runCommand({
    "mapreduce" : "Product",
    "map" : function() {
        for (var key in this) { emit(key, null); }
    },
    "reduce" : function(key, stuff) { return null; },
    "out" : "things" + "_keys"
})
db[mr.result].distinct("_id")
And then running, for each of the fields, a query like this one:
db.Product.find( { $where : "Array.isArray(this.Orders)" } ).count()
If anything is retrieved, the field is considered an array.
I don't like that I need to run n+2 queries (n being the number of distinct fields in my collection), and I don't want to hardcode the fields in the model - that would defeat the whole purpose of using MongoDB.
Is there a better method of doing this ?
I made a couple of slight modifications to the code you provided above:
mr = db.runCommand({
    "mapreduce" : "Product",
    "map" : function() {
        for (var key in this) {
            if (Array.isArray(this[key])) {
                emit(key, 1);
            } else {
                emit(key, 0);
            }
        }
    },
    "reduce" : function(key, stuff) { return Array.sum(stuff); },
    "out" : "Product" + "_keys"
})
Now, the mapper will emit a 1 for keys that contain arrays, and a 0 for any that do not. The reducer will sum these up, so that when you check your end result:
db[mr.result].find()
You will see your field names with the number of documents in which they contain Array values (and a 0 for any that are never arrays).
So this should give you which fields contain Array types with just the map-reduce job.
--
Just to see it with some data:
db.Product.insert({"a":[1,2,3], "c":[1,2]})
db.Product.insert({"a":1, "b":2})
db.Product.insert({"a":1, "c":[2,3]})
(now run the "mr =" code above)
db[mr.result].find()
{ "_id" : "_id", "value" : 0 }
{ "_id" : "a", "value" : 1 }
{ "_id" : "b", "value" : 0 }
{ "_id" : "c", "value" : 2 }

Filtering in Elastic Search?

I'm using the date_histogram API to get counts per interval (hour/day/week or month). I also have a feature I'm having trouble implementing: a user can filter the results by entering a startDate and endDate (textbox), which should be matched against a timestamp field. How can I filter the results on that single TIMESTAMP field while still using the date_histogram API (or any other API) to achieve the desired result?
In SQL I would just use a BETWEEN operator, but from what I've read so far there is no BETWEEN operator in Elasticsearch (not sure).
I have this script so far:
curl 'http://anotherdomain.com:9200/myindex/_search?pretty=true' -d '{
    "query" : {
        "filtered" : {
            "filter" : {
                "exists" : {
                    "field" : "adid"
                }
            },
            "query" : {
                "query_string" : {
                    "fields" : [
                        "adid", "imp"
                    ],
                    "query" : "525826 AND true"
                }
            }
        }
    },
    "facets" : {
        "histo1" : {
            "date_histogram" : {
                "field" : "timestamp",
                "interval" : "day"
            }
        }
    }
}'
In Elasticsearch you can use a range query or filter to achieve that: it accepts lower and upper bounds (from/to or gte/lte), which is the equivalent of SQL's BETWEEN.
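For example, here is a sketch that builds the same request body from PHP and adds a range filter on timestamp next to the existing exists filter; the and/range filter syntax matches the facet-era Elasticsearch API used in the question, and $startDate/$endDate are the user-supplied values (their format is assumed to match how timestamp is indexed):

$body = array(
    'query' => array(
        'filtered' => array(
            'filter' => array(
                'and' => array(
                    array('exists' => array('field' => 'adid')),
                    array('range' => array('timestamp' => array(
                        'from' => $startDate,   // lower bound of the user-selected window
                        'to'   => $endDate,     // upper bound
                    ))),
                ),
            ),
            'query' => array(
                'query_string' => array(
                    'fields' => array('adid', 'imp'),
                    'query'  => '525826 AND true',
                ),
            ),
        ),
    ),
    'facets' => array(
        'histo1' => array(
            'date_histogram' => array('field' => 'timestamp', 'interval' => 'day'),
        ),
    ),
);

$ch = curl_init('http://anotherdomain.com:9200/myindex/_search?pretty=true');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($body));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
curl_close($ch);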
