Can MongoDB and its drivers preserve the ordering of document elements - php

I am considering using MongoDB to store documents that include a list of key/value pairs. The safe but ugly and bloated way to store this is as
[ ['k1' : 'v1'] , ['k2' : 'v2'], ...]
But document elements are inherently ordered within the underlying BSON data structure, so in principle:
{k1 : 'v1',
k2 : 'v2', ...}
should be enough. However I expect most language bindings will interpret these as associative arrays, and thus potentially scramble the ordering. So what I need to know is:
Does MongoDB itself promise to preserve item ordering of the second form?
Do language bindings have some API which can extract it in ordered form, even if the usual "convenient" API returns an associative array?
I am mostly interested in Javascript and PHP here, but I would also like to know about other languages. Any help is appreciated, or just a link to some documentation where I can go RTM.

From version 2.6 on, MongoDB preserves the order of fields where possible. However, the _id field always comes first, and renaming fields can lead to re-ordering. Still, I'd generally try not to rely on details like this. As the original question mentions, there are also additional layers to consider, each of which must provide some sort of guarantee for the stability of the order...
Original Answer:
No, MongoDB does not make guarantees about the ordering of fields:
"There is no guarantee that the field order will be consistent, or the same, after an update."
In particular, in-place updates that change the document size will usually change the ordering of fields. For example, if you $set a field whose old value was of type number and the new value is NumberLong, fields usually get re-ordered.
However, arrays preserve ordering correctly:
[ {'key1' : 'value1'}, {'key2' : 'value2'}, ... ]
I don't see why this is "ugly" and "bloated" at all. Storing a list of complex objects couldn't be easier. However, abusing objects as lists is definitely ugly: Objects have associative array semantics (i.e. there can only be one field of a given name), while lists/arrays don't:
// not ok:
db.foo2.insert({"foo" : "bar", "foo" : "lala" });
db.foo2.find();
{ "_id" : ObjectId("4ef09cd9b37bc3cdb0e7fb26"), "foo" : "lala" }
// a list can do that
db.foo2.insert({ 'array' : [ {'foo' : 'bar'}, { 'foo' : 'lala' } ]});
db.foo2.find();
{ "_id" : ObjectId("4ef09e01b37bc3cdb0e7fb27"), "array" :
[ { "foo" : "bar" }, { "foo" : "lala" } ] }
Keep in mind that MongoDB is an object database, not a key/value store.
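On the second part of the question (getting fields back in order even when the convenient API returns a map), here is a minimal Python sketch of the general idea. It uses only the standard json module, not a MongoDB driver, and the sample document text is made up for illustration. A parser hook can hand back the raw key/value pairs in document order, duplicates included, while the default decoding collapses them into a dict.

```python
import json

# Hypothetical document text; note the repeated "foo" key.
raw = '{"k1": "v1", "k2": "v2", "foo": "bar", "foo": "lala"}'

# Default decoding builds a dict, so the second "foo" overwrites the first.
as_dict = json.loads(raw)

# object_pairs_hook receives the (key, value) pairs in textual order.
as_pairs = json.loads(raw, object_pairs_hook=list)

print(as_dict)
print(as_pairs)
```

Whether a given MongoDB driver exposes a comparable ordered view of a BSON document is something to check in that driver's documentation.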

As of Mongo 2.6.1, it DOES keep the order of your fields:
MongoDB preserves the order of the document fields following write operations except for the following cases:
The _id field is always the first field in the document.
Updates that include renaming of field names may result in the reordering of fields in the document.
http://docs.mongodb.org/manual/release-notes/2.6/#insert-and-update-improvements

One of the pain points of this is comparing documents to one another in the shell.
I've created a project that creates a custom mongorc.js which sorts the document keys by default for you when they are printed out so at least you can see what is going on clearly in the shell. It's called Mongo Hacker if you want to give it a whirl.

Though it's true that, as of Mongo 2.6.1, it does preserve order, one should still be careful with update operations.
mattwad makes the point that updates can reorder things, but there's at least one other concern I can think of.
For example $addToSet:
https://docs.mongodb.com/manual/reference/operator/update/addToSet/
$addToSet when used on embedded documents in an array is discussed / exemplified here:
https://stackoverflow.com/a/21578556/3643190
In the post, mnemosyn explains how $addToSet disregards the order when matching elements in its deep value by value comparison.
($addToSet only adds records when they're unique)
This is relevant if one decided to structure data like this:
[{key1: v1, key2: v2}, {key1: v3, key2: v4}]
With an update like this (notice the different order on the embedded doc):
db.collection.update({_id: "id"},{$addToSet: {field:
{key2: v2, key1: v1}
}});
Mongo will see this as a duplicate and NOT add this object to the array.

Related

Updating field in nested documents in mongodb php

I use MongoDB with the PHP driver, so for convenience I will write the queries with its syntax.
I would like to find a more elegant solution than the one I found today for the following problem.
I have this collection "Story" with nested document:
Collection Story:
{
"_id":"Story1",
"title":null,
"slug":null,
"sections":[
{
"id_section":"S1",
"index":0,
"type":"0",
"elements":[
{
"id_element":"001",
"text":"img",
"layout":1
}
]
},
{
"id_section":"S2",
"index":0,
"type":"0",
"elements":[
{
"id_element":"001",
"text":"hello world",
"layout":1
},
{
"id_element":"002",
"text":"default text",
"layout":1
},
{
"id_element":"003",
"text":"hello world 3",
"layout":"2"
}
]
}
]
}
Assuming you want to change the value of the element with id_element => 002 present in section with id_section => S2 of Story with _id => Story1
The solution I've found now is to find the "position" of element 002 and do the following
1]
$r=$m->db->story->findOne(array("_id" => 'Story1',
"sections.id_section"=>'S2'),
array('_id'=>false,'sections.$.elements'=>true));
2]
foreach($r['sections'][0]['elements'] as $key=>$value){
    if($value['id_element']=='002'){
        $position=$key;
        break;
    }
}
3]
$m->db->story->update(array('_id'=>'Story1','sections.id_section'=>'S2','sections.elements.id_element'=>'002'),
array('$set'=>array('sections.$.elements.'.$position.'.text'=>'NEW TEXT')),
array('w'=>1));
I repeat that I do not think this is an elegant solution, and I have noticed that it is a common problem.
Thank you for your help
S.
You can't use $ to match multiple levels of nested arrays. This is why it's not a good idea to nest arrays in MongoDB if you anticipate searching on properties anywhere deeper than the top level array. The alternatives for a fixed document structure are to know which positions in all but one of the arrays you want to update at (or to retrieve the document and find out the indexes, as you are doing) or to retrieve the document, update it in the client, and reinsert it.
The other option is to rethink how the data is modeled as documents in MongoDB so that nested arrays don't happen. In your case, a story is a collection of sections, which are collections of elements. Instead of making a story document, you could have a story be represented by multiple section documents. The section documents would share some common field value to indicate they belong to the same story. The above update would then be possible as an update on one section document, using the $ positional operator to match and update the correct element.
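The "retrieve the document and find out the indexes" route can be sketched in a few lines. This is a plain Python illustration of the logic only, with no driver calls. The helper names find_element_position and build_text_update are hypothetical, and the sample section is trimmed down from the Story document in the question.

```python
def find_element_position(section, element_id):
    """Return the index of the element with the given id_element, or None."""
    for position, element in enumerate(section["elements"]):
        if element["id_element"] == element_id:
            return position
    return None


def build_text_update(position, new_text):
    """Build a $set document; '$' stands for the section matched by the query."""
    return {"$set": {"sections.$.elements.%d.text" % position: new_text}}


# Section S2 roughly as it would come back from the projection in step 1.
section_s2 = {
    "id_section": "S2",
    "elements": [
        {"id_element": "001", "text": "hello world"},
        {"id_element": "002", "text": "default text"},
        {"id_element": "003", "text": "hello world 3"},
    ],
}

position = find_element_position(section_s2, "002")
update = build_text_update(position, "NEW TEXT")
print(update)
```

Note that the position is read in one operation and used in another, so a concurrent change to the elements array between the two could make the index stale.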

Select condition within a hash column using Doctrine mongoDB ODM query builder

I have the following structure within a mongoDB collection:
{
"_id" : ObjectId("5301d337fa46346a048b4567"),
"delivery_attempts" : {
"0" : {
"live_feed_id" : 107,
"remaining_attempts" : 2,
"delivered" : false,
"determined_status" : null,
"date" : 1392628536
}
}
}
// > db.lead.find({}, {delivery_attempts:1}).pretty();
I'm trying to select any data from that collection where remaining_attempts are greater than 0 and a live_feed_id is equal to 107. Note that the "delivery_attempts" field is of a type hash.
I've tried using an addAnd within an elemMatch (not sure if this is the correct way to achieve this).
$qb = $this->dm->createQueryBuilder($this->getDocumentName());
$qb->expr()->field('delivery_attempts')
->elemMatch(
$qb->expr()
->field('remaining_attempts')->gt(0)
->addAnd($qb->expr()->field('live_feed_id')->equals(107))
);
I do appear to be getting the record detailed above. However, changing the greater than test to 3
->field('remaining_attempts')->gt(3)
still returns the record (which is incorrect). Is there a way to achieve this?
EDIT: I've updated the delivery_attempts field type from a "Hash" to a "Collection". This shows the data being stored as an array rather than an object:
"delivery_attempts" : [
{
"live_feed_id" : 107,
"remaining_attempts" : 2,
"delivered" : false,
"determined_status" : null,
"date" : 1392648433
}
]
However, the original issue still applies.
You can use a dot notation to reference elements within a collection.
$qb->field('delivery_attempts.remaining_attempts')->gt(0)
->field('delivery_attempts.live_feed_id')->equals(107);
It works fine for me if I run the query on mongo.
db.testQ.find({"delivery_attempts.remaining_attempts" : {"$gt" : 0}, "delivery_attempts.live_feed_id" : 107}).pretty()
So it seems something is wrong with your PHP query. I suggest running the profiler to see which query is actually run against mongo:
db.setProfilingLevel(2)
This will log all operations from the moment you enable profiling. You can then query the log to see the actual queries:
db.system.profile.find().pretty()
This might help you to find the culprit.
It sounds like you solved your first problem, which was using the Hash type mapping (intended for storing BSON objects, or associative arrays in PHP) instead of the Collection mapping (intended for real arrays); however, the query criteria in the answer you submitted still seem incorrect.
$qb->field('delivery_attempts.remaining_attempts')->gt(0)
->field('delivery_attempts.live_feed_id')->equals(107);
You said in your original question:
I'm trying to select any data from that collection where remaining_attempts are greater than 0 and a live_feed_id is equal to 107.
I assume you'd like that criteria to be satisfied by a single element within the delivery_attempts array. If that's correct, the criteria you specified above may match more than you expect, since delivery_attempts.remaining_attempts can refer to any element in the array, as can the live_feed_id criteria. You'll want to use $elemMatch to restrict the field criteria to a single array element.
I see you were using elemMatch() in your original question, but the syntax looked a bit odd. There should be no need to use addAnd() (i.e. an $and operator) unless you were attempting to apply two query operators to the same field name. Simply add extra field() calls to the same query expression you're using for the elemMatch() method. One example of this from ODM's test suite is QueryTest::testElemMatch(). You can also use the debug() method on the query to see the raw MongoDB query object created by ODM's query builder.
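To make the difference concrete, here is a small Python emulation of the two matching rules described above. It is not MongoDB's or ODM's implementation, and the sample lead document is hypothetical. With dot notation, each condition may be satisfied by a different array element; with $elemMatch, one element must satisfy both.

```python
def matches_dot_notation(doc):
    """Each condition may be satisfied by a different array element."""
    attempts = doc["delivery_attempts"]
    return (any(a["remaining_attempts"] > 0 for a in attempts)
            and any(a["live_feed_id"] == 107 for a in attempts))


def matches_elem_match(doc):
    """Both conditions must hold for the same array element."""
    return any(a["remaining_attempts"] > 0 and a["live_feed_id"] == 107
               for a in doc["delivery_attempts"])


# Hypothetical lead: feed 107 is exhausted, another feed still has attempts.
lead = {
    "delivery_attempts": [
        {"live_feed_id": 107, "remaining_attempts": 0},
        {"live_feed_id": 205, "remaining_attempts": 2},
    ]
}

# The equivalent criteria as a plain query document (not executed here).
elem_match_query = {
    "delivery_attempts": {
        "$elemMatch": {"remaining_attempts": {"$gt": 0}, "live_feed_id": 107}
    }
}

print(matches_dot_notation(lead))  # matches, although no single attempt qualifies
print(matches_elem_match(lead))    # does not match
```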

Use ConfigParser to read an array from ini file [duplicate]

This question already has answers here:
How to ConfigParse a file keeping multiple values for identical keys?
(5 answers)
Closed 9 years ago.
I have read this post, and defined an array in subscriber.ini
[smtp]
subscriber[] = aaa#hotmail.com
subscriber[] = bbb#XX.webmail
subscriber[] = ccc#test.org
Then I try to use ConfigParser to read the array
#!/usr/bin/python
import ConfigParser
CONFIG_FILE = 'subscriber.ini'
config = ConfigParser.ConfigParser()
config.read( CONFIG_FILE )
subscriber = config.get('smtp' , 'subscriber[]' )
print subscriber
It will output the last element, ccc#test.org. But I expect a full subscriber list.
How do I get the array from ini file ?
Python's ConfigParser doesn't provide this feature. Use the following instead:
[smtp]
subscriber = aaa#hotmail.com bbb#XX.webmail ccc#test.org
Then in your script:
subscriber = config.get('smtp' , 'subscriber').split()
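For readers on Python 3, where the module is named configparser, the same whitespace-separated approach might look like the sketch below. The INI text is inlined with read_string only so that the example is self-contained; reading from subscriber.ini with read() works the same way.

```python
import configparser

INI_TEXT = """
[smtp]
subscriber = aaa#hotmail.com bbb#XX.webmail ccc#test.org
"""

config = configparser.ConfigParser()
config.read_string(INI_TEXT)

# One option holding all the addresses, split on whitespace into a list.
subscribers = config.get("smtp", "subscriber").split()
print(subscribers)
```

This only works as long as no individual value contains whitespace.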
This syntax, where subscriber[] automatically makes subscriber into a list of multiple values, is not a feature of .ini files in general, nor of ConfigParser; it's a feature of Zend_Config_Ini.
In Python, a ConfigParser ini file creates a dict mapping each key to its value. If you have more than one value, it will just override previous values. The magic [] suffix means nothing.
However, the ConfigParser constructor lets you specify a custom dictionary type or factory, in place of the default OrderedDict.
One simple solution would be to use a defaultdict(list) (or an OrderedDefaultDict, which there are recipes for in the docs) for the underlying storage, have __setitem__(self, key, value) do self.dd[key].append(value), and delegate everything else normally. (Or, if you prefer, inherit from defaultdict, override the constructor to pass list to the super, and then just don't override anything but __setitem__.) That will make all of your values into lists.
You could even do something hacky where a value that's only seen once is a single value, but if you see the same name again it becomes a list. I think that would be a terrible idea (do you really want to check the type of config.get('smtp', 'subscriber[]') to decide whether or not you want to iterate over it?), but if you want to, How to ConfigParse a file keeping multiple values for identical keys? shows how.
However, it's not at all hard to reproduce the exact magic you're looking for, where all keys ending in [] are lists (whether they appear once or multiple times), and everything else works like normal (keeps only the last value if it appears multiple times). Something like this:
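import collections

class MultiDict(collections.OrderedDict):
    def __setitem__(self, key, value):
        if key.endswith('[]'):
            # Keys with the magic suffix accumulate every value in a list.
            super(MultiDict, self).setdefault(key, []).append(value)
        else:
            # Everything else keeps the normal "last value wins" behavior.
            super(MultiDict, self).__setitem__(key, value)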
This obviously won't provide all of the extended features that Zend_Config_Ini adds on top of normal .ini files. For example, [group : subgroup : subsub] won't have any special meaning as a group name, nor will key.subkey.subsub as a key name. PHP values TRUE, FALSE, yes, no, and NULL won't get converted to Python values True, False, True, False, and None. Numbers won't magically become numbers. (Actually, this isn't a feature of Zend_Config_Ini, but a misfeature of PHP's leaky typing.) You have to use # comments, rather than freely mixing #, ;, and //. And so on. Any of those features that you want to add, you'll have to add manually, just as you did this one.
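The class above targets the Python 2 ConfigParser. As a hedged sketch for Python 3's configparser, a slightly different technique is commonly used: the parser stores each value as a one-item list while reading, so a dict type that extends an existing list on a repeated key collects all the values. The parser then joins them with newlines, and the caller splits them again. The class name MultiValueDict and the host option are assumptions made for this example, and strict=False is needed so that duplicate options are accepted. This relies on how the parser stores values internally, so treat it as a recipe rather than a documented API.

```python
import configparser
from collections import OrderedDict


class MultiValueDict(OrderedDict):
    """Merge list values for repeated keys instead of overwriting them."""

    def __setitem__(self, key, value):
        if isinstance(value, list) and key in self:
            # Repeated option while reading: append to the stored list.
            self[key].extend(value)
        else:
            super().__setitem__(key, value)


INI_TEXT = """
[smtp]
host = mail.example.org
subscriber[] = aaa#hotmail.com
subscriber[] = bbb#XX.webmail
subscriber[] = ccc#test.org
"""

parser = configparser.ConfigParser(dict_type=MultiValueDict, strict=False)
parser.read_string(INI_TEXT)

# The parser joins the collected values with newlines; split them back out.
subscribers = parser.get("smtp", "subscriber[]").splitlines()
host = parser.get("smtp", "host")
print(subscribers)
print(host)
```

Unlike the [] suffix check in the answer, this variant turns every repeated key into multiple values, not only keys ending in [].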
As I suggested in a comment, if you really want to have more than two levels of hierarchy, you may be better off with a naturally infinitely-hierarchical format, where any value can be a list or dict of other values.
JSON is ubiquitous nowadays. It may not be quite as human-editable as INI, but I think more people are familiar with it than INI in 2014. And it has the huge advantage that it's a standardized format, and that both Python (2.6+) and PHP (5.2+) come with parsers and pretty-printers for it in their standard libraries.
YAML is a more flexible and human-editable format. But you will need third-party modules in both languages (see the list at the YAML site). And it can also bring in some security concerns if you're not careful. (See safe_load and friends in the PyYAML docs; most other libraries have similar features.)
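As a rough illustration of the JSON option, the subscriber list from the question could be stored as a real list and read with the standard json module. The layout of the config (an smtp object with host and subscriber keys) is only an assumption for this sketch.

```python
import json

CONFIG_TEXT = """
{
  "smtp": {
    "host": "mail.example.org",
    "subscriber": ["aaa#hotmail.com", "bbb#XX.webmail", "ccc#test.org"]
  }
}
"""

config = json.loads(CONFIG_TEXT)

# The value is already a list; no splitting or magic suffix is needed.
subscribers = config["smtp"]["subscriber"]
print(subscribers)

# Pretty-print the config back out, e.g. for writing it to disk.
print(json.dumps(config, indent=2, sort_keys=True))
```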
I wonder if you have considered using Michael Foord's configobj module? It seems to be capable of doing what you want, which might be better than trying to pervert ConfigParser to do what you apparently need.

Assign array value to a field

I am working with migration and I am migrating taxonomy terms that the document has been tagged with. The terms in the document are separated by commas. So far I have managed to separate each term and place it into an array like so:
public function prepareRow($row) {
$terms = explode(",", $row->np_tax_terms);
foreach ($terms as $key => $value) {
$terms[$key] = trim($value);
}
var_dump($terms);
exit;
}
This gives me the following result when I dump it in the terminal:
array(2) {
[0]=>
string(7) "Smoking"
[1]=>
string(23) "Not Smoking"
}
Now I have two fields field_one and field_two and I want to place the value 0 of the array into field_one and value 1 into field_two
e.g
field_one=[0]$terms;
I know this isn't correct and I'm not sure how to do this part. Any suggestions on how to do this please?
If you are only looking to store the string value of the taxonomy term into a different field of a node, then the following code should do the trick:
$node->field_one['und'][0]['value'] = $terms[0];
$node->field_two['und'][0]['value'] = $terms[1];
node_save($node);
Note that you will need to load the node first. If you need help with that, comment here and I will update my answer.
You are asking specifically about ArrayList and HashMap, but I think to fully understand what is going on you have to understand the Collections framework. So an ArrayList implements the List interface and a HashMap implements the Map interface.
List:
An ordered collection (also known as a sequence). The user of this interface has precise control over where in the list each element is inserted. The user can access elements by their integer index (position in the list), and search for elements in the list.
Map:
An object that maps keys to values. A map cannot contain duplicate keys; each key can map to at most one value.
So as other answers have discussed, the list interface (ArrayList) is an ordered collection of objects that you access using an index, much like an array (well in the case of ArrayList, as the name suggests, it is just an array in the background, but a lot of the details of dealing with the array are handled for you). You would use an ArrayList when you want to keep things in sorted order (the order they are added, or indeed the position within the list that you specify when you add the object).
A Map, on the other hand, takes one object and uses that as a key (index) to another object (the value). So let's say you have objects which have unique IDs, and you know you are going to want to access these objects by ID at some point; the Map will make this very easy for you (and quicker/more efficient). The HashMap implementation uses the hash value of the key object to locate where it is stored, so there is no guarantee of the order of the values anymore.
You might like to try:
list($field_one, $field_two) = prepareRow($row);
The list function maps entries in an array (in order) to the variables passed by reference.
This is a little fragile, but should work so long as you know you'll have at least two items in your prepareRow result.
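The fragility mentioned above (fewer than two terms) can be avoided by padding the result before unpacking. Here is a minimal sketch of that idea, written in Python rather than PHP purely for illustration; split_terms is a hypothetical helper, not part of Drupal or the Migrate module.

```python
def split_terms(raw_terms, expected=2):
    """Split a comma-separated string, trim each term, pad with None."""
    terms = [term.strip() for term in raw_terms.split(",")]
    # Pad so that unpacking into `expected` variables never fails.
    terms += [None] * (expected - len(terms))
    return terms[:expected]


field_one, field_two = split_terms("Smoking, Not Smoking")
print(field_one, field_two)

# With a single term, the second field is simply None.
only_one, missing = split_terms("Smoking")
print(only_one, missing)
```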

Will the Order of my Associative Array be maintained from PHP to Javascript?

In PHP I'm running a mysql_query that has an ORDER BY clause. I'm then iterating through the results to build an associative array, with the row_id as the key.
Then, I'm calling json_encode on that array and outputting the result.
This page is loaded with AJAX, and defined in a Javascript variable. When I iterate through that Javascript variable, will I still have the order that was returned from the mysql_query?
PHP arrays are somewhat unique in their property of maintaining insertion order. Javascript doesn't have associative arrays per se. It has objects, which are often used as associative arrays. These do not guarantee any particular key order.
Why not output them as an array? That will have a particular order. If you want some sort of key lookup why does the order matter?
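The "output them as an array" suggestion can be sketched as follows. The example uses Python's json module only to show the shape of the data; in PHP the same shape comes from calling json_encode() on a plain list of rows. The row contents are made up.

```python
import json

# Hypothetical rows, already in the order produced by ORDER BY.
rows = [
    {"row_id": 30, "name": "first"},
    {"row_id": 10, "name": "second"},
    {"row_id": 20, "name": "third"},
]

# A JSON array keeps its element order through encoding and decoding.
payload = json.dumps(rows)
decoded = json.loads(payload)
order = [row["row_id"] for row in decoded]
print(order)

# If key lookup is also needed, build an index on the receiving side.
by_id = {row["row_id"]: row for row in decoded}
print(by_id[10]["name"])
```

The ordered list carries the sort order, and the separate index provides lookup by row_id, so neither depends on object key order.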
What cletus says is correct, but in my experience, most browsers will maintain the order. That being said, you should consider using an Array. If you need to sort it once you receive it on the client-side, just use the .sort() function in JavaScript:
rows.sort(function(a, b) {
    return a.row_id - b.row_id;
});
Though it seems like it works, the order of properties in an object can't be counted on. See the many comments below for more info (smarter eyes than mine). However, this was the code I used to test the behavior in my own limited testing:
var test = {
one: 'blah',
two: 'foo',
another: 'bar'
};
for (var prop in test) {
document.write(prop + "<br />");
}
Prints (in Firefox 3.6.3 and Chrome 5.0.375.9):
one
two
another
Also, you may want to be sure you're getting the type of JSON encoding you're needing back from json_encode(), such as an object (uses {} curly braces) and not an array ([] braces). You may need to pass JSON_FORCE_OBJECT to json_encode() to force it.
(Edited to clarify that the Array approach is preferred.)
Edited again (sorry), as I had overlooked pcorcoran's comment, which has a link to an issue in Chromium's issue tracker regarding this. Suffice it to say, the order of an object's properties is not reliable.
