PHP MongoDB: count nested attributes

I have the following document structure:
"messages": {
    "_id" : ObjectId("515a4de9c1a3c09c19000001"),
    "author" : "50fd0d38c1a3c04c27000000",
    "recipient" : "5159a292c1a3c01d5b000005",
    "conversation" : [
        {
            "author" : "50fd0d38c1a3c04c27000000",
            "date" : ISODate("2013-04-02T03:18:01.204Z"),
            "message" : "hello test",
            "read" : false
        },
        {
            "message" : "reply test",
            "date" : ISODate("2013-04-02T03:36:57.444Z"),
            "author" : "5159a292c1a3c01d5b000005",
            "read" : true
        }....
Is it possible to get the total number of conversation messages where conversation.read is false and the author (conversation.author) of the unread message is not the person who started the conversation?
I'm currently finding documents that have conversation.read set to false and then looping through them in PHP to check the conversation.author field. This works, but I'm afraid it will slow down once I have a lot of data.
I'm using the PHP MongoDB driver.
I have tried this:
$unread_conversations = $db->messages->find(
    array(
        'conversation.read' => false,
        'conversation.author' => array('$ne' => $_SESSION['account']['_id']->__toString())
    )
);
but that's not working, because I want only the messages that are both unread and not written by that author.

This is not possible with regular MongoDB queries. You have to use map-reduce or MongoDB's built-in aggregation framework.
To accomplish your goal with the current schema, you have to unwind the nested list first.
Another solution is to change your schema and store each conversation entry as a separate document (with a parent field that contains the id of the parent conversation).
And just as a side note: don't store id fields as strings; always use ObjectId to store MongoDB ids.
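A rough sketch of the unwind approach, in case it helps. The pipeline uses the standard $unwind, $match and $group stages and is what you would hand to db.messages.aggregate(...); the names unreadPipeline, countUnread and currentUserId are made up for illustration. countUnread is only a plain JavaScript stand-in for those three stages so the example runs without a server.

```javascript
// Pipeline as it could be passed to db.messages.aggregate(...).
// currentUserId is a hypothetical stand-in for the logged-in user's id string.
function unreadPipeline(currentUserId) {
  return [
    { $unwind: "$conversation" },
    { $match: { "conversation.read": false,
                "conversation.author": { $ne: currentUserId } } },
    { $group: { _id: null, total: { $sum: 1 } } }
  ];
}

// In-memory stand-in for the three stages above. This is an assumption made
// for illustration, not a general aggregation engine.
function countUnread(docs, currentUserId) {
  let total = 0;
  for (const doc of docs) {
    for (const entry of doc.conversation || []) {        // $unwind
      if (entry.read === false &&
          entry.author !== currentUserId) {              // $match
        total += 1;                                      // $group with $sum
      }
    }
  }
  return total;
}

// Sample data shaped like the document in the question.
const sample = [{
  author: "50fd0d38c1a3c04c27000000",
  recipient: "5159a292c1a3c01d5b000005",
  conversation: [
    { author: "50fd0d38c1a3c04c27000000", message: "hello test", read: false },
    { author: "5159a292c1a3c01d5b000005", message: "reply test", read: true }
  ]
}];

console.log(countUnread(sample, "5159a292c1a3c01d5b000005"));
```

Because both conditions are checked after the unwind, they apply to the same conversation entry, which is what the find() attempt in the question could not guarantee.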

Related

Bulk inserting documents into couchbase database with php - how to?

I am currently experimenting a little with Couchbase Server.
I have tried to read a MySQL database table, build a document from each data row, and then insert the document with an id which I generate with
uniqid('table_name');
via cURL, using the POST method.
This works pretty well until the script has inserted roughly 7050 documents. Then an exception is thrown: "No buffer space".
I have not been able to fix this so far, so I decided to collect a batch of rows (say 50), build one string from them with json_encode(), and POST that via cURL.
This worked as long as I didn't set the id, but I can't figure out how to set the ids of the inserted documents.
Currently I am trying to send my documents in a format like this:
{"docs": {
"_id": {
"geodata_de_54476f7e6adc57.14196038": {
"table": "geodata_de",
"country": "DE",
"postal_code": "01945",
"place_name": "Lindenau",
"state_name": "Brandenburg",
"state_code": "BB",
"province_name": "",
"province_code": "00",
"community_name": "Oberspreewald-Lausitz",
"community_code": "12066",
"lat": "51.4",
"lng": "13.7333",
"Xco": "3861.1",
"Yco": "943.614",
"Zco": "4979.07"
}
}, ...
}
}
but this inserts just ONE document containing the above object.
Maybe someone here can point me in the right direction.
I would use the Couchbase PHP SDK to insert these documents instead of using curl. http://docs.couchbase.com/developer/php-2.0/storing.html
Also, with Couchbase you do not have to set the ID in the document itself; whether you want it there depends on your use case. I would consider taking the ID from your example ("geodata_de_54476f7e6adc57.14196038") and using it as the key for the object in Couchbase. Then you do not necessarily need the _id. A key in Couchbase can be up to 250 bytes long, and you can make it meaningful to your application so that lookups by key are extremely fast.
Another option: if you write your docs to the filesystem, you could use the cbdocloader utility, which is built specifically for bulk loading docs. On Linux it is in /opt/couchbase/bin/tools/cbdocloader.

MongoDB: strategies for resolving a race condition

We have a scenario where we need to store multiple feeds under a site model as follows:
{
id: site_id
name: site_name
feeds: [
{
url: feed_url_1
date: feed_update_date_1
},
{
url: feed_url_2
date: feed_update_date_2
},
...
]
}
Since feeds is an array, we can update it with $set, $push or $addToSet.
Two different race conditions (write skew) may occur when our concurrent application (queue) tries to update the same site model.
If we pick $set and guard against duplicates on the client side, then when two queues write to the same site, one feed may be lost in the following sequence.
Given a WordPress site, extract 2 feeds (RSS and ATOM) and dispatch them to Q1 and Q2.
Q1: load existing feed, check RSS feed is new
Q2: load existing feed, check ATOM feed is new
Q1: $set feeds => [RSS]
Q2: $set feeds => [ATOM]
Now the RSS feed is lost.
If we pick $push or $addToSet, then the following may happen.
User A added a site, putting RSS feed to Q1
User B added the same site, putting the same RSS feed to Q2
Q1: load existing feed, check RSS feed is new
Q2: load existing feed, check RSS feed is new
Q1: $push RSS
Q2: $push RSS
Now the RSS feed has been duplicated.
If our data model were simply { url }, then $addToSet would safeguard against duplicate feeds. Unfortunately this is not the case: the date attribute may differ, so $addToSet is not much safer than $push.
We have thought of a few possible workarounds for this problem, but none are great given our tight schedule.
1. Decouple feeds from site into their own collection, safeguard on url alone, and change our model and repository accordingly.
2. Insert a partial { url } into the site model first, then update it with the additional information. This should make $addToSet usable, but may break other queues that require date to always be present (testing needed).
3. Let the race condition happen as-is, $push the feed first, and use a background queue to detect duplicates and remove them later.
(There might be a 4th solution if upsert worked with the positional query, but as far as I know MongoDB v2.4 doesn't support that yet.)
So I wonder whether there are better alternatives for resolving this kind of race condition, or whether there are some best practices for it.
You might want to have a look at TokuMX, a fork of MongoDB which supports transactions (besides a few other useful things).
You can use a guard on the update selector:
alice(mongod-2.4.8) test> db.foo.save({_id: 12 })
Updated 1 new record(s) in 1ms
alice(mongod-2.4.8) test> db.foo.update({ _id: 12, "feeds.url" : {$ne: "baz"} },
{ $push : { feeds : { url: "baz" } } } )
Updated 1 existing record(s) in 1ms
alice(mongod-2.4.8) test> db.foo.update({ _id: 12, "feeds.url" : {$ne: "baz"} },
{ $push : { feeds : { url: "baz" } } } )
Updated 0 record(s) in 1ms
alice(mongod-2.4.8) test> db.foo.find({_id: 12 })
{
"_id": 12,
"feeds": [
{
"url": "baz"
}
]
}
Fetched 1 record(s) in 1ms -- Index[_id_]
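The same guard generalizes to feeds that also carry a date. Below is a small sketch: guardedPushSpec builds the selector and update shown in the transcript, and applyGuardedPush is a plain JavaScript model of what a single atomic document update does with them. Both function names are hypothetical, and the in-memory model is only there so the idea can be run without a server.

```javascript
// Builds the selector/update pair for a guarded $push. The selector only
// matches while no feed with this url exists, so the duplicate check and
// the push happen inside one atomic document update.
function guardedPushSpec(siteId, feed) {
  return {
    selector: { _id: siteId, "feeds.url": { $ne: feed.url } },
    update: { $push: { feeds: feed } }
  };
}

// In-memory model of applying that update to one document (an assumption
// for illustration). Returns the number of documents modified: 0 or 1.
function applyGuardedPush(site, feed) {
  const feeds = site.feeds || [];
  if (feeds.some(f => f.url === feed.url)) {
    return 0;                        // selector matched nothing
  }
  site.feeds = feeds.concat([feed]); // $push
  return 1;
}

// Two queues pushing the same url with different dates.
const site = { _id: 12 };
console.log(applyGuardedPush(site, { url: "baz", date: "2013-01-01" }));
console.log(applyGuardedPush(site, { url: "baz", date: "2013-01-02" }));
console.log(site.feeds.length);
```

Since the guard compares only url, a differing date no longer produces a duplicate, which is the case where $addToSet falls short.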

Stored function used in insert through PHP runs multiple times

I'm trying to understand a strange behavior when using PHP with MongoDB 2.4.3 (win32).
I am trying to generate sequence ids on the server side.
When inserting documents with a stored function as one of the parameters, the stored function seems to be called several times for each insertion.
Let's say I have a counter initialized like this:
db.counters.insert( { _id: "uqid", seq: NumberLong(0) } );
I have a stored function named getUqid which is defined as
db.system.js.save(
{ _id: "getUqid",
value: function () {
var ret = db.counters.findAndModify(
{ query: { _id: "uqid" },
update: { $inc: { seq: NumberLong(1) } },
new: true
} );
return ret.seq;
}
} );
When I do three insertions like this:
$conn->test->ads->insert(['qid' => new MongoCode('getUqid()') , 'name' => "Sarah C."]);
I get something like this:
db.ads.find()
{ "_id" : ObjectId("51a34f8bf0774cac03000000"), "qid" : 17, "name" : "Sarah C." }
{ "_id" : ObjectId("51a34f8bf0774cac03000001"), "qid" : 20, "name" : "Michel D." }
{ "_id" : ObjectId("51a34f8bf0774cac03000002"), "qid" : 23, "name" : "Robert U." }
Any clue why qid is being stepped by 3? That would mean my stored function is called three times, right?
Thanks in advance for your help, Regards.
PS: secondary question: is NumberLong still required to be sure we have a 64-bit unsigned integer in internal MongoDB storage? Is there any command to cross-check that in the shell?
Cross-referencing this question with PHP-841. From the PHP side of things, you're actually storing a BSON code value in the qid field. You can likely verify that when fetching results back from the database or doing a database export with the mongodump command.
The issue is with the JS shell wrongfully evaluating the code type upon display, and that's the point where findAndModify is executed. This fix should be included in a subsequent server release.
In the meantime, Sammaye's suggestion to call findAndModify from PHP is the best option for this sort of functionality. Coincidentally, it is also what is done in Doctrine MongoDB ODM (see: IncrementGenerator). It does require an additional round trip to the server, but that is necessary since MongoDB has no facility for executing JS callbacks during a write operation.
If minimizing the round-trips to MongoDB is of utmost importance, you could insert the documents by executing server-side JS through PHP with MongoDB::execute() and do something like returning the generated ID(s) as the command response. Of course, that's generally not advisable and JS evaluation has its own caveats.
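To illustrate the "fetch the id first, then insert" pattern: the sketch below takes any object that exposes the shell's findAndModify({ query, update, new }) method, such as db.counters. The names nextSequence and makeStubCounters are hypothetical, and the stub is just an in-memory stand-in so the example runs without a server. In the real shell you would likely keep NumberLong(1) as in the question.

```javascript
// Asks the counters collection for the next value. `counters` is assumed to
// expose findAndModify({ query, update, new }) like a shell collection.
function nextSequence(counters, name) {
  const ret = counters.findAndModify({
    query: { _id: name },
    update: { $inc: { seq: 1 } },
    new: true                        // return the document after the increment
  });
  return ret.seq;
}

// Minimal in-memory stub of a counters collection (an assumption for
// illustration). It only understands the exact spec used above.
function makeStubCounters(initial) {
  const docs = Object.assign({}, initial);
  return {
    findAndModify(spec) {
      const id = spec.query._id;
      docs[id] = { _id: id, seq: docs[id].seq + spec.update.$inc.seq };
      return docs[id];
    }
  };
}

// The id is resolved before the insert, so the stored document holds a
// plain number rather than a BSON code value.
const counters = makeStubCounters({ uqid: { _id: "uqid", seq: 0 } });
const ad = { qid: nextSequence(counters, "uqid"), name: "Sarah C." };
console.log(ad);
```

This costs one extra round trip per insert, as noted above, but each call advances the counter exactly once.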

Doctrine is making duplicated document entries

I am trying to insert an embedded document into my document with the following code.
// Add states, for the joining player.
$state = new PlayerState();
$state->setReady(false);
$state->setPlayer($player->getId());
$game->addPlayerState($state);
// Save element.
$dm->persist($game);
$dm->flush();
The problem is that this generates two PlayerState documents, like this:
{ "_id" : ObjectId( "513f50a58ead0ee9ac00000f" ),
"ready" : false,
"player" : "513f509f8ead0e8bac00000b" },
{ "_id" : ObjectId( "513f50af8ead0ecdac000015" ),
"ready" : false,
"player" : "513f509f8ead0e8bac00000b" }
Am I saving this in an incorrect way? Let me know if you need more code.
This seemed to do the trick.
$state = new PlayerState();
$state->setReady(false);
$state->setPlayer($player->getId());
$dm->persist($state);
$dm->flush();
$game->addPlayerState($state);
// Save element.
$dm->flush();
This is hard to explain, but I will give it a try.
You need to persist the embedded document first. Otherwise Doctrine will first persist the parent document, creating an embedded doc that holds only the values you set, so that it acts like a simple data container.
$state->setReady(false);
$state->setPlayer($player->getId());
Afterwards, Doctrine will persist the embedded document once again, but this time based on the document object, assigning IDs, default values, etc.
Resulting in 2 entries.

How do you update a mongodb document while replacing the entire document?

If I have a document with two values:
{
"name" : "Bob",
"location" : "France"
}
And then I pass an array to the document that contains "name" and "country". How do I ensure that the entire document is updated, with "location" removed?
I will not be aware of the differences between the two documents, so I won't be able to unset a specific field. I am looking for a method to simply replace the document data with the array supplied.
If you supply update with a new document containing the fields you want, you will replace the document.
For example:
db.so.insert({"name":"Bob", "location": "France"})
db.so.update({"name":"Bob", "location": "France"}, {"name":"Roberto", "country":"Italy"})
db.so.find()
{ "_id" : ObjectId(), "name" : "Roberto", "country" : "Italy" }
Of course, if you had the _id of the document, that would be a better way of specifying the update (rather than passing back the document you just inserted, as above).
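A small sketch of the _id-based variant. replaceSpec builds the two arguments you would pass to db.so.update(selector, replacement), and applyReplacement is a plain JavaScript model of the result, included only so this can be run without a server. Both names are hypothetical. The replacement must contain plain fields only; if it contained operators such as $set, the update would modify fields instead of replacing the document.

```javascript
// Arguments for a replacement-style update that selects by _id.
function replaceSpec(id, replacement) {
  return { selector: { _id: id }, replacement: replacement };
}

// In-memory model of a replacement update (an assumption for illustration):
// the _id is kept, every other old field is dropped, and the new fields are
// stored as given. The replacement is assumed not to carry its own _id.
function applyReplacement(existing, replacement) {
  return Object.assign({ _id: existing._id }, replacement);
}

const before = { _id: 1, name: "Bob", location: "France" };
const after = applyReplacement(before, { name: "Roberto", country: "Italy" });
console.log(after);
```

Note that "location" disappears without any explicit $unset, so you don't need to know how the old and new documents differ.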
