Querying with $gt does not work as expected when the dates are the same; it behaves more like $gte.
But if I add 1 second to the query param, it works.
Here is the sample query.
I have a document whose creation_date is the timestamp 1367414837.
db.collection.find({creation_date : {'$gt' : new Date(1367414837000)}});
This query matches the document whose date is 1367414837.
If I increment the query timestamp by just one, to 1367414838, it works as expected.
I'm using the mongo console, but I have the same problem in PHP with MongoDate.
Edit: output of the query:
db.collection.findOne({creation_date : {'$gt' : new Date(1367414837000)}});
{
    "_id" : ObjectId("5181183543c51695ce000000"),
    "action" : {
        "type" : "comment",
        "comment" : {
            "id" : 74,
            "post_id" : "174",
            "owner_id" : "5",
            "text" : "ne diyeyim lae :D",
            "creation_date" : "2013-05-01 16:27:17"
        }
    },
    "creation_date" : ISODate("2013-05-01T13:27:17.336Z"),
    "owner" : {
        "id" : "5",
        "username" : "tylerdurden"
    }
}
Edit 2: the problem is the PHP Mongo extension. It is documented: "any precision beyond milliseconds will be lost when the document is sent to/from the database."
http://php.net/manual/en/class.mongodate.php
I incremented the query param by one second as a workaround.
Dates in BSON are UNIX dates stored as milliseconds since the epoch; they're accurate down to the millisecond. If the times you're inserting (and trying to match against) carry millisecond precision, the element you're matching may simply be a few milliseconds later than the timestamp you're querying, in which case $gt is working exactly as expected. (2013-05-01T13:27:17.001Z is indeed later than 2013-05-01T13:27:17Z, for example.)
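You can check this directly in the mongo shell with the timestamps from the question:
> new Date(1367414837000)
ISODate("2013-05-01T13:27:17Z")
> ISODate("2013-05-01T13:27:17.336Z") > new Date(1367414837000)
true
The stored value is 336 ms later than the whole-second timestamp, which is why $gt matches even though both read as 13:27:17 at second granularity.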
I restored from MongoDB server version 4.2.3 to MongoDB server version 4.2.7, and when saving data to the database again I got the ISODate problem below:
{ "_id" : ObjectId("5ed4b193ed6fab6d2272c5c4"), "id" : 1, "timestamp" : ISODate("2020-05-31T05:59:59Z") } #new data run after change db (it must disappear for unique)
{ "_id" : ObjectId("5ed33bef1e499012bf35e412"), "id" : 1, "timestamp" : ISODate("2020-05-31T04:59:59.999Z") } #old data
{ "_id" : ObjectId("5ed4b193ed6fab6d2272c5c3"), "id" : 1, "timestamp" : ISODate("2020-05-31T04:59:59Z") } #new data run after change db (it must disappear for unique)
{ "_id" : ObjectId("5ed32de165269b416f6c7362"), "id" : 1, "timestamp" : ISODate("2020-05-31T03:59:59.999Z") } #old data
{ "_id" : ObjectId("5ed4b193ed6fab6d2272c5c2"), "id" : 1, "timestamp" : ISODate("2020-05-31T03:59:59Z") } #new data run after change db (it must disappear for unique)
{ "_id" : ObjectId("5ed31fcff2a5076cc947bc02"), "id" : 1, "timestamp" : ISODate("2020-05-31T02:59:59.999Z") } #old data
{ "_id" : ObjectId("5ed311bfb0d88300f81e90d2"), "id" : 1, "timestamp" : ISODate("2020-05-31T01:59:59.999Z") } #old data
I have a unique index on id and timestamp, but because the old timestamps carry fractional seconds and the new ones do not, the values no longer collide, so the rows marked "it must disappear for unique" were inserted anyway. Please give me a solution to keep microseconds in an ISODate.
PS: my code did not change. I use PHP and always format dates with 'Y-m-d\TH:i:s.uP'.
MongoDB time resolution is 1 millisecond. Values with more precision will be truncated to millisecond precision.
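If you genuinely need microsecond uniqueness, one workaround (a sketch only; the events collection and micros field are made-up names, not from the question) is to store the sub-millisecond remainder in its own integer field and include it in the unique index:
// BSON dates stop at milliseconds, so keep the microsecond part separately
// and make it part of the unique index.
db.events.createIndex({ id: 1, timestamp: 1, micros: 1 }, { unique: true })
db.events.insertOne({
    id: 1,
    timestamp: ISODate("2020-05-31T04:59:59.999Z"), // millisecond precision only
    micros: 999123                                  // e.g. parsed from PHP's 'u' format
})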
I'm using the date_histogram API to get counts per interval (hour/day/week or month). I also have a feature I'm having trouble implementing: a user can filter the results by entering a startDate and endDate (textbox), to be matched against a timestamp field. How can I filter the results on that single field (TIMESTAMP) while still using the date_histogram API (or any other API) to get the result I want?
In SQL I would just use the BETWEEN operator, but from what I've read so far there is no BETWEEN operator in Elasticsearch (not sure).
I have this script so far:
curl 'http://anotherdomain.com:9200/myindex/_search?pretty=true' -d '{
    "query" : {
        "filtered" : {
            "filter" : {
                "exists" : {
                    "field" : "adid"
                }
            },
            "query" : {
                "query_string" : {
                    "fields" : [ "adid", "imp" ],
                    "query" : "525826 AND true"
                }
            }
        }
    },
    "facets" : {
        "histo1" : {
            "date_histogram" : {
                "field" : "timestamp",
                "interval" : "day"
            }
        }
    }
}'
In Elasticsearch you can use a range query or filter to achieve that.
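For example, sticking with the pre-1.0 DSL the question already uses, the exists filter can be combined with a range filter on timestamp through an and filter; the from/to dates below are placeholders for the user's startDate and endDate:
curl 'http://anotherdomain.com:9200/myindex/_search?pretty=true' -d '{
    "query" : {
        "filtered" : {
            "filter" : {
                "and" : [
                    { "exists" : { "field" : "adid" } },
                    { "range" : { "timestamp" : { "from" : "2013-01-01", "to" : "2013-01-31" } } }
                ]
            },
            "query" : {
                "query_string" : {
                    "fields" : [ "adid", "imp" ],
                    "query" : "525826 AND true"
                }
            }
        }
    },
    "facets" : {
        "histo1" : {
            "date_histogram" : {
                "field" : "timestamp",
                "interval" : "day"
            }
        }
    }
}'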
UPDATE: I posted an answer, as it's been confirmed to be an issue.
ORIGINAL:
First, I apologize -- I just started using MongoDB yesterday and I am still pretty new at this. I have a pretty simple query, and using PHP my findings are as follows.
Mongo version is 2.0.4, running on CentOS 6.2 (Final) x64
$start = microtime(true);
$totalactive = $db->people->count(array('items'=> array('$gt' => 1)));
$end = microtime(true);
printf("Query lasted %.2f seconds\n", $end - $start);
Without index, it returns:
Query lasted 0.15 seconds
I have 280,000 records in the people collection. Since I query this data a lot, I thought adding an index on "items" would help. But to my disbelief, after adding the index I get this:
Query lasted 0.25 seconds
Am I doing anything wrong?
Instead of count I used find to get the explain output:
> db.people.find({ 'items' : { '$gte' : 1 } }).explain();
{
    "cursor" : "BtreeCursor items_1",
    "nscanned" : 206396,
    "nscannedObjects" : 206396,
    "n" : 206396,
    "millis" : 269,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "isMultiKey" : false,
    "indexOnly" : false,
    "indexBounds" : {
        "items" : [
            [
                1,
                1.7976931348623157e+308
            ]
        ]
    }
}
If I change my query to "$ne" : 0, it takes 10 ms more!
Here are the collection stats:
> db.people.stats()
{
    "ns" : "stats.people",
    "count" : 281207,
    "size" : 23621416,
    "avgObjSize" : 84.00009957077881,
    "storageSize" : 33333248,
    "numExtents" : 8,
    "nindexes" : 2,
    "lastExtentSize" : 12083200,
    "paddingFactor" : 1,
    "flags" : 0,
    "totalIndexSize" : 21412944,
    "indexSizes" : {
        "_id_" : 14324352,
        "items_1" : 7088592
    },
    "ok" : 1
}
I have 1 GB of RAM free, so I believe the index fits in memory.
Here's the people index, as requested:
> db.people.getIndexes()
[
    {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "ns" : "stats.people",
        "name" : "_id_"
    },
    {
        "v" : 1,
        "key" : {
            "items" : 1
        },
        "ns" : "stats.people",
        "name" : "items_1"
    }
]
Having an index can be beneficial for two reasons:
when accessing only a small part of the collection (because of a restrictive filter that can be satisfied by the index). Rule of thumb is less than 10%.
when the collection does not need to be accessed at all (because all necessary data is in the index, both for the filtering, and for the result set). This will be indicated by "indexOnly = true".
For the "find" query, both of this is not true: You are accessing almost the whole collection (206396 out of 281207) and need all fields data. So you will go through the index first, and then through almost the whole collection anyway, defeating the purpose of the index. Just reading the whole collection would have been faster.
I would have expected the "count" query to perform better (because that can be satisfied by just going through the index). Can you get an explain for that, too?
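In the meantime, you can verify the covered-query case yourself: project only the indexed field and exclude _id, and the explain output should report "indexOnly" : true.
> db.people.find({ items: { $gte: 1 } }, { _id: 0, items: 1 }).explain()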
Look at this:
http://www.mongodb.org/display/DOCS/Indexing+Advice+and+FAQ#IndexingAdviceandFAQ-5.MongoDB%27s%24neor%24ninoperator%27saren%27tefficientwithindexes.
Which made me consider the following solution: take the total count and subtract the complement, so the index only has to scan the small items <= 1 range (this assumes every document has a numeric items field). Note that MongoDB 2.0 has no $eq operator, so the complement of $gt is expressed with $lte. How about this?
$totalactive = $db->people->count() - $db->people->count(array('items' => array('$lte' => 1)));
This was confirmed to be a bug, or at least something that needed optimization, in the MongoDB engine. I posted this on the mongo mailing list, and the response I received from Eliot Horowitz was:
"That's definitely a bug, or at least a path that could be way better optimized. Made a case: https://jira.mongodb.org/browse/SERVER-5607"
Priority: Major
Fix Version/s: 2.3 desired
Type: Bug
Thanks to those who helped confirm this was a bug =)
Can you please provide an example of an object in this collection? Is the "items" field an array? If so, I would recommend adding a new field "itemCount" and putting an index on that; doing $gt on that field will be extremely fast.
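A sketch of that idea in the shell (someId stands in for a real document _id, and keeping itemCount in sync is up to your application):
db.people.update({ _id: someId }, { $set: { itemCount: 3 } }) // update whenever items changes
db.people.ensureIndex({ itemCount: 1 })                       // 2.0-era index syntax
db.people.count({ itemCount: { $gt: 1 } })                    // answered from the index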
This is because your queries are near-full collection scans. The query optimizer is choosing the index when, for optimum performance, it should not. It's counterintuitive, yes, but it's because the cursor walks the index b-tree and fetches the documents the tree points to, which is slower than just walking the collection when almost the whole tree has to be scanned.
If you really need this kind of query, and you want to keep the index for other things like sorting, you can use .hint({$natural: 1}) to tell the query not to use the index.
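For example:
// Force a collection scan instead of walking the items index:
db.people.find({ items: { $gte: 1 } }).hint({ $natural: 1 }).explain()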
Coincidentally, I posted about a similar issue in a blog post recently: http://wes.skeweredrook.com/testing-with-mongodb-part-1/
In MongoDB, we can assign our own value to the _id field, and the "_id" value may be of any type, other than arrays, so long as it is unique -- from the docs.
But in my live database, I can see some records were duplicated, as follows:
db.memberrecords.find().limit(2).forEach(printjson)
{
    "_id" : "999783",
    "Memberid" : "999783",
    "Name" : "ASHEESH SHARMA",
    "Gender" : "M",
}
{
    "_id" : "999783",
    "Memberid" : "999783",
    "Name" : "Sanwal Singh Meena",
    "Gender" : "M",
}
In the records above, the same _id value was inserted twice. When I tested with a local database, it did not allow inserting the same _id and threw the following error:
E11000 duplicate key error index: mongoms.memberrecords.$_id_ dup key: { : "999783" }
Below are the indexes on my live memberrecords collection (for your reference):
db.memberrecords.getIndexes()
[
    {
        "name" : "_id_",
        "ns" : "mongoms.memberrecords",
        "key" : {
            "_id" : 1
        },
        "v" : 0
    },
    {
        "_id" : ObjectId("4f0bcdf2b1513267f4ac227c"),
        "ns" : "mongoms.memberrecords",
        "key" : {
            "Memberid" : 1
        },
        "name" : "Memberid_1",
        "unique" : true,
        "v" : 0
    }
]
Note: I have two shards for this collection.
Any suggestions on this, please?
Is your shard key the _id field? You can only have one unique index enforced across a cluster: the shard key (otherwise the server would have to check with every shard on every insert).
So: on a single shard, _id will be unique. However, if it isn't your shard key, all bets are off across multiple shards.
See http://www.mongodb.org/display/DOCS/Sharding+Limits#ShardingLimits-UniqueIndexesDOCS%3AIndexes%23UniqueIndexes.
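A sketch of what that looks like with the names from your output, assuming Memberid is meant to be the cluster-wide unique key (2.0-era command syntax):
// Shard on Memberid and let the cluster enforce uniqueness on the shard key.
db.adminCommand({
    shardCollection : "mongoms.memberrecords",
    key : { Memberid : 1 },
    unique : true
})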
Is it possible to sort data in a sub-array in a Mongo database?
{ "_id" : ObjectId("4e3f8c7de7c7914b87d2e0eb"),
"list" : [
{
"id" : ObjectId("4e3f8d0be62883f70c00031c"),
"datetime" : 1312787723,
"comments" :
{
"id" : ObjectId("4e3f8d0be62883f70c00031d")
"datetime": 1312787723,
},
{
"id" : ObjectId("4e3f8d0be62883f70c00031d")
"datetime": 1312787724,
},
{
"id" : ObjectId("4e3f8d0be62883f70c00031d")
"datetime": 1312787725,
},
}
],
"user_id" : "3" }
For example, I want to sort comments by the field "datetime". Thanks. Or is the only option to select all the data and sort it in PHP code? My query works with a limit in Mongo...
With MongoDB, you can sort the documents or select only some parts of them, but you can't reorder the contents of the documents returned by a search query.
If the current order of your comments can be changed, then the best solution would be to sort them inside the MongoDB documents (find(), then for each doc, sort its comments and update()). If you want to keep the current internal order of the comments, then you'll have to sort each document after each query.
In either case, the sort is done in PHP. Something like:
foreach ($doc['list'] as &$list) { // by reference, so the sort sticks
    // uses a lambda function, PHP 5.3 required
    usort($list['comments'], function ($a, $b) {
        return $a['datetime'] - $b['datetime']; // ascending by datetime
    });
}
unset($list); // break the reference after the loop
If you can't use PHP 5.3, replace the lambda function with a named function. See the usort() examples.
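If you go with the first option (storing the comments already sorted), a minimal sketch of writing the reordered list back, assuming $collection is the MongoCollection the document came from:
// Persist the sorted list so future reads come back in order.
$collection->update(
    array('_id' => $doc['_id']),
    array('$set' => array('list' => $doc['list']))
);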